Related
sometimes it would be convenient to have an easy way of doing the following:
Foo a = dosomething();
if (a != null){
if (a.isValid()){
...
}
}
My idea was to have some kind of static “default” methods for not initialized variables like this:
class Foo{
public boolean isValid(){
return true;
}
public static boolean isValid(){
return false;
}
}
And now I could do this…
Foo a = dosomething();
if (a.isValid()){
// In our example case -> variable is initialized and the "normal" method gets called
}else{
// In our example case -> variable is null
}
So, if a == null the static “default” methods from our class gets called, otherwise the method of our object gets called.
Is there either some keyword I’m missing to do exactly this or is there a reason why this is not already implemented in programming languages like java/c#?
Note: this example is not very breathtaking if this would work, however there are examples where this would be - indeed - very nice.
It's very slightly odd; ordinarily, x.foo() runs the foo() method as defined by the object that the x reference is pointing to. What you propose is a fallback mechanism where, if x is null (is referencing nothing) then we don't look at the object that x is pointing to (there's nothing its pointing at; hence, that is impossible), but that we look at the type of x, the variable itself, instead, and ask this type: Hey, can you give me the default impl of foo()?
The core problem is that you're assigning a definition to null that it just doesn't have. Your idea requires a redefinition of what null means which means the entire community needs to go back to school. I think the current definition of null in the java community is some nebulous ill defined cloud of confusion, so this is probably a good idea, but it is a huge commitment, and it is extremely easy for the OpenJDK team to dictate a direction and for the community to just ignore it. The OpenJDK team should be very hesitant in trying to 'solve' this problem by introducing a language feature, and they are.
Let's talk about the definitions of null that make sense, which definition of null your idea specifically is catering to (at the detriment of the other interpretations!), and how catering to that specific idea is already easy to do in current java, i.e. - what you propose sounds outright daft to me, in that it's just unneccessary and forces an opinion of what null means down everybody's throats for no reason.
Not applicable / undefined / unset
This definition of null is exactly how SQL defines it, and it has the following properties:
There is no default implementation available. By definition! How can one define what the size is of, say, an unset list? You can't say 0. You have no idea what the list is supposed to be. The very point is that interaction with an unset/not-applicable/unknown value should immediately lead to a result that represents either [A] the programmer messed up, the fact that they think they can interact with this value means they programmed a bug - they made an assumption about the state of the system which does not hold, or [B] that the unset nature is infectuous: The operation returns the notion 'unknown / unset / not applicable' as result.
SQL chose the B route: Any interaction with NULL in SQL land is infectuous. For example, even NULL = NULL in SQL is NULL, not FALSE. It also means that all booleans in SQL are tri-state, but this actually 'works', in that one can honestly fathom this notion. If I ask you: Hey, are the lights on?, then there are 3 reasonable answers: Yes, No, and I can't tell you right now; I don't know.
In my opinion, java as a language is meant for this definition as well, but has mostly chosen the [A] route: Throw an NPE to let everybody know: There is a bug, and to let the programmer get to the relevant line extremely quickly. NPEs are easy to solve, which is why I don't get why everybody hates NPEs. I love NPEs. So much better than some default behaviour that is usually but not always what I intended (objectively speaking, it is better to have 50 bugs that each takes 3 minutes to solve, than one bug that takes an an entire working day, by a large margin!) – this definition 'works' with the language:
Uninitialized fields, and uninitialized values in an array begin as null, and in the absence of further information, treating it as unset is correct.
They are, in fact, infectuously erroneous: Virtually all attempts to interact with them results in an exception, except ==, but that is intentional, for the same reason in SQL IS NULL will return TRUE or FALSE and not NULL: Now we're actually talking about the pointer nature of the object itself ("foo" == "foo" can be false if the 2 strings aren't the same ref: Clearly == in java between objects is about the references itself and not about the objects referenced).
A key aspect to this is that null has absolutely no semantic meaning, at all. Its lack of semantic meaning is the point. In other words, null doesn't mean that a value is short or long or blank or indicative of anything in particular. The only thing it does mean is that it means nothing. You can't derive any information from it. Hence, foo.size() is not 0 when foo is unset/unknown - the question 'what is the size of the object foo is pointing at' is unanswerable, in this definition, and thus NPE is exactly right.
Your idea would hurt this interpretation - it would confound matters by giving answers to unanswerable questions.
Sentinel / 'empty'
null is sometimes used as a value that does have semantic meaning. Something specific. For example, if you ever wrote this, you're using this interpretation:
if (x == null || x.isEmpty()) return false;
Here you've assigned a semantic meaning to null - the same meaning you assigned to an empty string. This is common in java and presumably stems from some bass ackwards notion of performance. For example, in the eclipse ecj java parser system, all empty arrays are done with null pointers. For example, the definition of a method has a field Argument[] arguments (for the method parameters; using argument is the slightly wrong word, but it is used to store the param definitions); however, for methods with zero parameters, the semantically correct choice is obviously new Argument[0]. However, that is NOT what ecj fills the Abstract Syntax Tree with, and if you are hacking around on the ecj code and assign new Argument[0] to this, other code will mess up as it just wasn't written to deal with this.
This is in my opinion bad use of null, but is quite common. And, in ecj's defense, it is about 4 times faster than javac, so I don't think it's fair to cast aspersions at their seemingly deplorably outdated code practices. If it's stupid and it works it isn't stupid, right? ecj also has a better track record than javac (going mostly by personal experience; I've found 3 bugs in ecj over the years and 12 in javac).
This kind of null does get a lot better if we implement your idea.
The better solution
What ecj should have done, get the best of both worlds: Make a public constant for it! new Argument[0], the object, is entirely immutable. You need to make a single instance, once, ever, for an entire JVM run. The JVM itself does this; try it: List.of() returns the 'singleton empty list'. So does Collections.emptyList() for the old timers in the crowd. All lists 'made' with Collections.emptyList() are actually just refs to the same singleton 'empty list' object. This works because the lists these methods make are entirely immutable.
The same can and generally should apply to you!
If you ever write this:
if (x == null || x.isEmpty())
then you messed up if we go by the first definition of null, and you're simply writing needlessly wordy, but correct, code if we go by the second
definition. You've come up with a solution to address this, but there's a much, much better one!
Find the place where x got its value, and address the boneheaded code that decided to return null instead of "". You should in fact emphatically NOT be adding null checks to your code, because it's far too easy to get into this mode where you almost always do it, and therefore you rarely actually have null refs, but it's just swiss cheese laid on top of each other: There may still be holes, and then you get NPEs. Better to never check so you get NPEs very quickly in the development process - somebody returned null where they should be returning "" instead.
Sometimes the code that made the bad null ref is out of your control. In that case, do the same thing you should always do when working with badly designed APIs: Fix it ASAP. Write a wrapper if you have to. But if you can commit a fix, do that instead. This may require making such an object.
Sentinels are awesome
Sometimes sentinel objects (objects that 'stand in' for this default / blank take, such as "" for strings, List.of() for lists, etc) can be a bit more fancy than this. For example, one can imagine using LocalDate.of(1800, 1, 1) as sentinel for a missing birthdate, but do note that this instance is not a great idea. It does crazy stuff. For example, if you write code to determine the age of a person, then it starts giving completely wrong answers (which is significantly worse than throwing an exception. With the exception you know you have a bug faster and you get a stacktrace that lets you find it in literally 500 milliseconds (just click the line, voila. That is the exact line you need to look at right now to fix the problem). It'll say someone is 212 years old all of a sudden.
But you could make a LocalDate object that does some things (such as: It CAN print itself; sentinel.toString() doesn't throw NPE but prints something like 'unset date'), but for other things it will throw an exception. For example, .getYear() would throw.
You can also make more than one sentinel. If you want a sentinel that means 'far future', that's trivially made (LocalDate.of(9999, 12, 31) is pretty good already), and you can also have one as 'for as long as anyone remembers', e.g. 'distant past'. That's cool, and not something your proposal could ever do!
You will have to deal with the consequences though. In some small ways the java ecosystem's definitions don't mesh with this, and null would perhaps have been a better standin. For example, the equals contract clearly states that a.equals(a) must always hold, and yet, just like in SQL NULL = NULL isn't TRUE, you probably don't want missingDate.equals(missingDate) to be true; that's conflating the meta with the value: You can't actually tell me that 2 missing dates are equal. By definition: The dates are missing. You do not know if they are equal or not. It is not an answerable question. And yet we can't implement the equals method of missingDate as return false; (or, better yet, as you also can't really know they aren't equal either, throw an exception) as that breaks contract (equals methods must have the identity property and must not throw, as per its own javadoc, so we can't do either of those things).
Dealing with null better
There are a few things that make dealing with null a lot easier:
Annotations: APIs can and should be very clear in communicating when their methods can return null and what that means. Annotations to turn that documentation into compiler-checked documentation is awesome. Your IDE can start warning you, as you type, that null may occur and what that means, and will say so in auto-complete dialogs too. And it's all entirely backwards compatible in all senses of the word: No need to start considering giant swaths of the java ecosystem as 'obsolete' (unlike Optional, which mostly sucks).
Optional, except this is a non-solution. The type isn't orthogonal (you can't write a method that takes a List<MaybeOptionalorNot<String>> that works on both List<String> and List<Optional<String>>, even though a method that checks the 'is it some or is it none?' state of all list members and doesn't add anything (except maybe shuffle things around) would work equally on both methods, and yet you just can't write it. This is bad, and it means all usages of optional must be 'unrolled' on the spot, and e.g. Optional<X> should show up pretty much never ever as a parameter type or field type. Only as return types and even that is dubious - I'd just stick to what Optional was made for: As return type of Stream terminal operations.
Adopting it also isn't backwards compatible. For example, hashMap.get(key) should, in all possible interpretations of what Optional is for, obviously return an Optional<V>, but it doesn't, and it never will, because java doesn't break backwards compatibility lightly and breaking that is obviously far too heavy an impact. The only real solution is to introduce java.util2 and a complete incompatible redesign of the collections API, which is splitting the java ecosystem in twain. Ask the python community (python2 vs. python3) how well that goes.
Use sentinels, use them heavily, make them available. If I were designing LocalDate, I'd have created LocalDate.FAR_FUTURE and LocalDate_DISTANT_PAST (but let it be clear that I think Stephen Colebourne, who designed JSR310, is perhaps the best API designer out there. But nothing is so perfect that it can't be complained about, right?)
Use API calls that allow defaulting. Map has this.
Do NOT write this code:
String phoneNr = phoneNumbers.get(userId);
if (phoneNr == null) return "Unknown phone number";
return phoneNr;
But DO write this:
return phoneNumbers.getOrDefault(userId, "Unknown phone number");
Don't write:
Map<Course, List<Student>> participants;
void enrollStudent(Student student) {
List<Student> participating = participants.get(econ101);
if (participating == null) {
participating = new ArrayList<Student>();
participants.put(econ101, participating);
}
participating.add(student);
}
instead write:
Map<Course, List<Student>> participants;
void enrollStudent(Student student) {
participants.computeIfAbsent(econ101,
k -> new ArrayList<Student>())
.add(student);
}
and, crucially, if you are writing APIs, ensure things like getOrDefault, computeIfAbsent, etc. are available so that the users of your API don't have to deal with null nearly as much.
You can write a static test() method like this:
static <T> boolean test(T object, Predicate<T> validation) {
return object != null && validation.test(object);
}
and
static class Foo {
public boolean isValid() {
return true;
}
}
static Foo dosomething() {
return new Foo();
}
public static void main(String[] args) {
Foo a = dosomething();
if (test(a, Foo::isValid))
System.out.println("OK");
else
System.out.println("NG");
}
output:
OK
If dosomething() returns null, it prints NG
Not exactly, but take a look at Optional:
Optional.ofNullable(dosomething())
.filter(Foo::isValid)
.ifPresent(a -> ...);
This question already has answers here:
Uses for Optional
(14 answers)
Closed 7 years ago.
In Java 8 you can return an Optional instead of a null. Java 8 documentation says that an Optional is "A container object which may or may not contain a non-null value. If a value is present, isPresent() will return true and get() will return the value."
In practice, why is this useful?
Also, is there any case where using null would be preferred? What about performance?
In practice, why is this useful?
For example let's say you have this stream of integers and you're doing a filtering:
int x = IntStream.of(1, -3, 5)
.filter(x -> x % 2 == 0)
.findFirst(); //hypothetical assuming that there's no Optional in the API
You don't know in advance that the filter operation will remove all the values in the Stream.
Assume that there would be no Optional in the API. In this case, what should findFirst return?
The only possible way would be to throw an exception such as NoSuchElementException, which is IMO rather annoying, as I don't think it should stop the execution of your program (or you'd have to catch the exception, not very convenient either) and the filtering criteria could be more complex than that.
With the use of Optional, it's up to the caller to check whether the Optional is empty or not (i.e if your computation resulted in a value or not).
With reference type, you could also return null (but null could be a possible value in the case you filter only null values; so we're back to the exception case).
Concerning non-stream usages, in addition to prevent NPE, I think it also helps to design a more explicit API saying that the value may be present or not. For example consider this class:
class Car {
RadioCar radioCar; //may be null or not
public Optional<RadioCar> getRadioCar() {
return Optional.ofNullable(radioCar);
}
}
Here you are clearly saying to the caller that the radio in the car is optional, it might be or not there.
When Java was first designed it was common practice to use a special value, usually called null to indicate special circumstances like I couldn't find what you were looking for. This practice was adopted by Java.
Since then it has been suggested that this practice should be considered an anti-pattern, especially for objects, because it means that you have to litter your code with null checks to achieve reliability and stability. It is also a pain when you want to put null into a collection for example.
The modern attitude is to use a special object that may or may not hold a value. This way you can safely create one and just not fill it with anything. Here you are seeing Java 8 encouraging this best-practice by providing an Optional object.
Optional helps you to handle variables as available or not available and avoid check null references.
When reading JDK source code, I find it common that the author will check the parameters if they are null and then throw new NullPointerException() manually.
Why do they do it? I think there's no need to do so since it will throw new NullPointerException() when it calls any method. (Here is some source code of HashMap, for instance :)
public V computeIfPresent(K key,
BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
if (remappingFunction == null)
throw new NullPointerException();
Node<K,V> e; V oldValue;
int hash = hash(key);
if ((e = getNode(hash, key)) != null &&
(oldValue = e.value) != null) {
V v = remappingFunction.apply(key, oldValue);
if (v != null) {
e.value = v;
afterNodeAccess(e);
return v;
}
else
removeNode(hash, key, null, false, true);
}
return null;
}
There are a number of reasons that come to mind, several being closely related:
Fail-fast: If it's going to fail, best to fail sooner rather than later. This allows problems to be caught closer to their source, making them easier to identify and recover from. It also avoids wasting CPU cycles on code that's bound to fail.
Intent: Throwing the exception explicitly makes it clear to maintainers that the error is there purposely and the author was aware of the consequences.
Consistency: If the error were allowed to happen naturally, it might not occur in every scenario. If no mapping is found, for example, remappingFunction would never be used and the exception wouldn't be thrown. Validating input in advance allows for more deterministic behavior and clearer documentation.
Stability: Code evolves over time. Code that encounters an exception naturally might, after a bit of refactoring, cease to do so, or do so under different circumstances. Throwing it explicitly makes it less likely for behavior to change inadvertently.
It is for clarity, consistency, and to prevent extra, unnecessary work from being performed.
Consider what would happen if there wasn't a guard clause at the top of the method. It would always call hash(key) and getNode(hash, key) even when null had been passed in for the remappingFunction before the NPE was thrown.
Even worse, if the if condition is false then we take the else branch, which doesn't use the remappingFunction at all, which means the method doesn't always throw NPE when a null is passed; whether it does depends on the state of the map.
Both scenarios are bad. If null is not a valid value for remappingFunction the method should consistently throw an exception regardless of the internal state of the object at the time of the call, and it should do so without doing unnecessary work that is pointless given that it is just going to throw. Finally, it is a good principle of clean, clear code to have the guard right up front so that anyone reviewing the source code can readily see that it will do so.
Even if the exception were currently thrown by every branch of code, it is possible that a future revision of the code would change that. Performing the check at the beginning ensures it will definitely be performed.
In addition to the reasons listed by #shmosel's excellent answer ...
Performance: There may be / have been performance benefits (on some JVMs) to throwing the NPE explicitly rather than letting the JVM do it.
It depends on the strategy that the Java interpreter and JIT compiler take to detecting the dereferencing of null pointers. One strategy is to not test for null, but instead trap the SIGSEGV that happens when an instruction tries to access address 0. This is the fastest approach in the case where the reference is always valid, but it is expensive in the NPE case.
An explicit test for null in the code would avoid the SIGSEGV performance hit in a scenario where NPEs were frequent.
(I doubt that this would be a worthwhile micro-optimization in a modern JVM, but it could have been in the past.)
Compatibility: The likely reason that there is no message in the exception is for compatibility with NPEs that are thrown by the JVM itself. In a compliant Java implementation, an NPE thrown by the JVM has a null message. (Android Java is different.)
Apart from what other people have pointed out, it's worth noting the role of convention here. In C#, for example, you also have the same convention of explicitly raising an exception in cases like this, but it's specifically an ArgumentNullException, which is somewhat more specific. (The C# convention is that NullReferenceException always represents a bug of some kind - quite simply, it shouldn't ever happen in production code; granted, ArgumentNullException usually does, too, but it could be a bug more along the line of "you don't understand how to use the library correctly" kind of bug).
So, basically, in C# NullReferenceException means that your program actually tried to use it, whereas ArgumentNullException it means that it recognized that the value was wrong and it didn't even bother to try to use it. The implications can actually be different (depending on the circumstances) because ArgumentNullException means that the method in question didn't have side effects yet (since it failed the method preconditions).
Incidentally, if you're raising something like ArgumentNullException or IllegalArgumentException, that's part of the point of doing the check: you want a different exception than you'd "normally" get.
Either way, explicitly raising the exception reinforces the good practice of being explicit about your method's pre-conditions and expected arguments, which makes the code easier to read, use, and maintain. If you didn't explicitly check for null, I don't know if it's because you thought that no one would ever pass a null argument, you're counting it to throw the exception anyway, or you just forgot to check for that.
It is so you will get the exception as soon as you perpetrate the error, rather than later on when you're using the map and won't understand why it happened.
It turns a seemingly erratic error condition into a clear contract violation: The function has some preconditions for working correctly, so it checks them beforehand, enforcing them to be met.
The effect is, that you won't have to debug computeIfPresent() when you get the exception out of it. Once you see that the exception comes from the precondition check, you know that you called the function with an illegal argument. If the check were not there, you would need to exclude the possibility that there is some bug within computeIfPresent() itself that leads to the exception being thrown.
Obviously, throwing the generic NullPointerException is a really bad choice, as it does not signal a contract violation in and of itself. IllegalArgumentException would be a better choice.
Sidenote:
I don't know whether Java allows this (I doubt it), but C/C++ programmers use an assert() in this case, which is significantly better for debugging: It tells the program to crash immediately and as hard as possible should the provided condition evaluate to false. So, if you ran
void MyClass_foo(MyClass* me, int (*someFunction)(int)) {
assert(me);
assert(someFunction);
...
}
under a debugger, and something passed NULL into either argument, the program would stop right at the line telling which argument was NULL, and you would be able to examine all local variables of the entire call stack at leisure.
It's because it's possible for it not to happen naturally. Let's see piece of code like this:
bool isUserAMoron(User user) {
Connection c = UnstableDatabase.getConnection();
if (user.name == "Moron") {
// In this case we don't need to connect to DB
return true;
} else {
return c.makeMoronishCheck(user.id);
}
}
(of course there is numerous problems in this sample about code quality. Sorry to lazy to imagine perfect sample)
Situation when c will not be actually used and NullPointerException will not be thrown even if c == null is possible.
In more complicated situations it's becomes very non-easy to hunt down such cases. This is why general check like if (c == null) throw new NullPointerException() is better.
It is intentional to protect further damage, or to getting into inconsistent state.
Apart from all other excellent answers here, I'd also like to add a few cases.
You can add a message if you create your own exception
If you throw your own NullPointerException you can add a message (which you definitely should!)
The default message is a null from new NullPointerException() and all methods that use it, for instance Objects.requireNonNull. If you print that null it can even translate to an empty string...
A bit short and uninformative...
The stack trace will give a lot of information, but for the user to know what was null they have to dig up the code and look at the exact row.
Now imagine that NPE being wrapped and sent over the net, e.g. as a message in a web service error, perhaps between different departments or even organizations. Worst case scenario, no one may figure out what null stands for...
Chained method calls will keep you guessing
An exception will only tell you on what row the exception occurred. Consider the following row:
repository.getService(someObject.someMethod());
If you get an NPE and it points at this row, which one of repository and someObject was null?
Instead, checking these variables when you get them will at least point to a row where they are hopefully the only variable being handled. And, as mentioned before, even better if your error message contains the name of the variable or similar.
Errors when processing lots of input should give identifying information
Imagine that your program is processing an input file with thousands of rows and suddenly there's a NullPointerException. You look at the place and realize some input was incorrect... what input? You'll need more information about the row number, perhaps the column or even the whole row text to understand what row in that file needs fixing.
This question already has answers here:
Uses for Optional
(14 answers)
Closed 8 years ago.
In Java 8 you can return an Optional instead of a null. Java 8 documentation says that an Optional is "A container object which may or may not contain a non-null value. If a value is present, isPresent() will return true and get() will return the value."
In practice, why is this useful?
Also, is there any case where using null would be preferred? What about performance?
In practice, why is this useful?
For example let's say you have this stream of integers and you're doing a filtering:
int x = IntStream.of(1, -3, 5)
.filter(x -> x % 2 == 0)
.findFirst(); //hypothetical assuming that there's no Optional in the API
You don't know in advance that the filter operation will remove all the values in the Stream.
Assume that there would be no Optional in the API. In this case, what should findFirst return?
The only possible way would be to throw an exception such as NoSuchElementException, which is IMO rather annoying, as I don't think it should stop the execution of your program (or you'd have to catch the exception, not very convenient either) and the filtering criteria could be more complex than that.
With the use of Optional, it's up to the caller to check whether the Optional is empty or not (i.e if your computation resulted in a value or not).
With reference type, you could also return null (but null could be a possible value in the case you filter only null values; so we're back to the exception case).
Concerning non-stream usages, in addition to prevent NPE, I think it also helps to design a more explicit API saying that the value may be present or not. For example consider this class:
class Car {
RadioCar radioCar; //may be null or not
public Optional<RadioCar> getRadioCar() {
return Optional.ofNullable(radioCar);
}
}
Here you are clearly saying to the caller that the radio in the car is optional, it might be or not there.
When Java was first designed it was common practice to use a special value, usually called null to indicate special circumstances like I couldn't find what you were looking for. This practice was adopted by Java.
Since then it has been suggested that this practice should be considered an anti-pattern, especially for objects, because it means that you have to litter your code with null checks to achieve reliability and stability. It is also a pain when you want to put null into a collection for example.
The modern attitude is to use a special object that may or may not hold a value. This way you can safely create one and just not fill it with anything. Here you are seeing Java 8 encouraging this best-practice by providing an Optional object.
Optional helps you to handle variables as available or not available and avoid check null references.
I am not able to understand the point of Option[T] class in Scala. I mean, I am not able to see any advanages of None over null.
For example, consider the code:
object Main{
class Person(name: String, var age: int){
def display = println(name+" "+age)
}
def getPerson1: Person = {
// returns a Person instance or null
}
def getPerson2: Option[Person] = {
// returns either Some[Person] or None
}
def main(argv: Array[String]): Unit = {
val p = getPerson1
if (p!=null) p.display
getPerson2 match{
case Some(person) => person.display
case None => /* Do nothing */
}
}
}
Now suppose, the method getPerson1 returns null, then the call made to display on first line of main is bound to fail with NPE. Similarly if getPerson2 returns None, the display call will again fail with some similar error.
If so, then why does Scala complicate things by introducing a new value wrapper (Option[T]) instead of following a simple approach used in Java?
UPDATE:
I have edited my code as per #Mitch's suggestion. I am still not able to see any particular advantage of Option[T]. I have to test for the exceptional null or None in both cases. :(
If I have understood correctly from #Michael's reply, is the only advantage of Option[T] is that it explicitly tells the programmer that this method could return None? Is this the only reason behind this design choice?
You'll get the point of Option better if you force yourself to never, ever, use get. That's because get is the equivalent of "ok, send me back to null-land".
So, take that example of yours. How would you call display without using get? Here are some alternatives:
getPerson2 foreach (_.display)
for (person <- getPerson2) person.display
getPerson2 match {
case Some(person) => person.display
case _ =>
}
getPerson2.getOrElse(Person("Unknown", 0)).display
None of this alternatives will let you call display on something that does not exist.
As for why get exists, Scala doesn't tell you how your code should be written. It may gently prod you, but if you want to fall back to no safety net, it's your choice.
You nailed it here:
is the only advantage of Option[T] is
that it explicitly tells the
programmer that this method could
return None?
Except for the "only". But let me restate that in another way: the main advantage of Option[T] over T is type safety. It ensures you won't be sending a T method to an object that may not exist, as the compiler won't let you.
You said you have to test for nullability in both cases, but if you forget -- or don't know -- you have to check for null, will the compiler tell you? Or will your users?
Of course, because of its interoperability with Java, Scala allows nulls just as Java does. So if you use Java libraries, if you use badly written Scala libraries, or if you use badly written personal Scala libraries, you'll still have to deal with null pointers.
Other two important advantages of Option I can think of are:
Documentation: a method type signature will tell you whether an object is always returned or not.
Monadic composability.
The latter one takes much longer to fully appreciate, and it's not well suited to simple examples, as it only shows its strength on complex code. So, I'll give an example below, but I'm well aware it will hardly mean anything except for the people who get it already.
for {
person <- getUsers
email <- person.getEmail // Assuming getEmail returns Option[String]
} yield (person, email)
Compare:
val p = getPerson1 // a potentially null Person
val favouriteColour = if (p == null) p.favouriteColour else null
with:
val p = getPerson2 // an Option[Person]
val favouriteColour = p.map(_.favouriteColour)
The monadic property bind, which appears in Scala as the map function, allows us to chain operations on objects without worrying about whether they are 'null' or not.
Take this simple example a little further. Say we wanted to find all the favourite colours of a list of people.
// list of (potentially null) Persons
for (person <- listOfPeople) yield if (person == null) null else person.favouriteColour
// list of Options[Person]
listOfPeople.map(_.map(_.favouriteColour))
listOfPeople.flatMap(_.map(_.favouriteColour)) // discards all None's
Or perhaps we would like to find the name of a person's father's mother's sister:
// with potential nulls
val father = if (person == null) null else person.father
val mother = if (father == null) null else father.mother
val sister = if (mother == null) null else mother.sister
// with options
val fathersMothersSister = getPerson2.flatMap(_.father).flatMap(_.mother).flatMap(_.sister)
I hope this sheds some light on how options can make life a little easier.
The difference is subtle. Keep in mind to be truly a function it must return a value - null is not really considered to be a "normal return value" in that sense, more a bottom type/nothing.
But, in a practical sense, when you call a function that optionally returns something, you would do:
getPerson2 match {
case Some(person) => //handle a person
case None => //handle nothing
}
Granted, you can do something similar with null - but this makes the semantics of calling getPerson2 obvious by virtue of the fact it returns Option[Person] (a nice practical thing, other than relying on someone reading the doc and getting an NPE because they don't read the doc).
I will try and dig up a functional programmer who can give a stricter answer than I can.
For me options are really interesting when handled with for comprehension syntax. Taking synesso preceding example:
// with potential nulls
val father = if (person == null) null else person.father
val mother = if (father == null) null else father.mother
val sister = if (mother == null) null else mother.sister
// with options
val fathersMothersSister = for {
father <- person.father
mother <- father.mother
sister <- mother.sister
} yield sister
If any of the assignation are None, the fathersMothersSister will be None but no NullPointerException will be raised. You can then safely pass fathersMothersSisterto a function taking Option parameters without worrying. so you don't check for null and you don't care of exceptions. Compare this to the java version presented in synesso example.
You have pretty powerful composition capabilities with Option:
def getURL : Option[URL]
def getDefaultURL : Option[URL]
val (host,port) = (getURL orElse getDefaultURL).map( url => (url.getHost,url.getPort) ).getOrElse( throw new IllegalStateException("No URL defined") )
Maybe someone else pointed this out, but I didn't see it:
One advantage of pattern-matching with Option[T] vs. null checking is that Option is a sealed class, so the Scala compiler will issue a warning if you neglect to code either the Some or the None case. There is a compiler flag to the compiler that will turn warnings into errors. So it's possible to prevent the failure to handle the "doesn't exist" case at compile time rather than at runtime. This is an enormous advantage over the use of the null value.
It's not there to help avoid a null check, it's there to force a null check. The point becomes clear when your class has 10 fields, two of which could be null. And your system has 50 other similar classes. In the Java world, you try to prevent NPEs on those fields using some combination of mental horesepower, naming convention, or maybe even annotations. And every Java dev fails at this to a significant degree. The Option class not only makes "nullable" values visually clear to any developers trying to understand the code, but allows the compiler to enforce this previously unspoken contract.
[ copied from this comment by Daniel Spiewak ]
If the only way to use Option were
to pattern match in order to get
values out, then yes, I agree that it
doesn’t improve at all over null.
However, you’re missing a *huge* class
of its functionality. The only
compelling reason to use Option is
if you’re using its higher-order
utility functions. Effectively, you
need to be using its monadic nature.
For example (assuming a certain amount
of API trimming):
val row: Option[Row] = database fetchRowById 42
val key: Option[String] = row flatMap { _ get “port_key” }
val value: Option[MyType] = key flatMap (myMap get)
val result: MyType = value getOrElse defaultValue
There, wasn’t that nifty? We can
actually do a lot better if we use
for-comprehensions:
val value = for {
row <- database fetchRowById 42
key <- row get "port_key"
value <- myMap get key
} yield value
val result = value getOrElse defaultValue
You’ll notice that we are *never*
checking explicitly for null, None or
any of its ilk. The whole point of
Option is to avoid any of that
checking. You just string computations
along and move down the line until you
*really* need to get a value out. At
that point, you can decide whether or
not you want to do explicit checking
(which you should never have to do),
provide a default value, throw an
exception, etc.
I never, ever do any explicit matching
against Option, and I know a lot of
other Scala developers who are in the
same boat. David Pollak mentioned to
me just the other day that he uses
such explicit matching on Option (or
Box, in the case of Lift) as a sign
that the developer who wrote the code
doesn’t fully understand the language
and its standard library.
I don’t mean to be a troll hammer, but
you really need to look at how
language features are *actually* used
in practice before you bash them as
useless. I absolutely agree that
Option is quite uncompelling as *you*
used it, but you’re not using it the
way it was designed.
One point that nobody else here seems to have raised is that while you can have a null reference, there is a distinction introduced by Option.
That is you can have Option[Option[A]], which would be inhabited by None, Some(None) and Some(Some(a)) where a is one of the usual inhabitants of A. This means that if you have some kind of container, and want to be able to store null pointers in it, and get them out, you need to pass back some extra boolean value to know if you actually got a value out. Warts like this abound in the java containers APIs and some lock-free variants can't even provide them.
null is a one-off construction, it doesn't compose with itself, it is only available for reference types, and it forces you to reason in a non-total fashion.
For instance, when you check
if (x == null) ...
else x.foo()
you have to carry around in your head throughout the else branch that x != null and that this has already been checked. However, when using something like option
x match {
case None => ...
case Some(y) => y.foo
}
you know y is not Noneby construction -- and you'd know it wasn't null either, if it weren't for Hoare's billion dollar mistake.
Option[T] is a monad, which is really useful when you using high-order functions to manipulate values.
I'll suggest you read articles listed below, they are really good articles that show you why Option[T] is useful and how can it be used in functional way.
Martians vs Monads: Null Considered Harmful
Monads are Elephants Part 1
Adding on to Randall's teaser of an answer, understanding why the potential absence of a value is represented by Option requires understanding what Option shares with many other types in Scala—specifically, types modeling monads. If one represents the absence of a value with null, that absence-presence distinction can't participate in the contracts shared by the other monadic types.
If you don't know what monads are, or if you don't notice how they're represented in Scala's library, you won't see what Option plays along with, and you can't see what you're missing out on. There are many benefits to using Option instead of null that would be noteworthy even in the absence of any monad concept (I discuss some of them in the "Cost of Option / Some vs null" scala-user mailing list thread here), but talking about it isolation is kind of like talking about a particular linked list implementation's iterator type, wondering why it's necessary, all the while missing out on the more general container/iterator/algorithm interface. There's a broader interface at work here too, and Option provides a presence-and-absence model of that interface.
I think the key is found in Synesso's answer: Option is not primarily useful as a cumbersome alias for null, but as a full-fledged object that can then help you out with your logic.
The problem with null is that it is the lack of an object. It has no methods that might help you deal with it (though as a language designer you can add increasingly long lists of features to your language that emulate an object if you really feel like it).
One thing Option can do, as you've demonstrated, is to emulate null; you then have to test for the extraordinary value "None" instead of the extraordinary value "null". If you forget, in either case, bad things will happen. Option does make it less likely to happen by accident, since you have to type "get" (which should remind you that it might be null, er, I mean None), but this is a small benefit in exchange for an extra wrapper object.
Where Option really starts to show its power is helping you deal with the concept of I-wanted-something-but-I-don't-actually-have-one.
Let's consider some things you might want to do with things that might be null.
Maybe you want to set a default value if you have a null. Let's compare Java and Scala:
String s = (input==null) ? "(undefined)" : input;
val s = input getOrElse "(undefined)"
In place of a somewhat cumbersome ?: construct we have a method that deals with the idea of "use a default value if I'm null". This cleans up your code a little bit.
Maybe you want to create a new object only if you have a real value. Compare:
File f = (filename==null) ? null : new File(filename);
val f = filename map (new File(_))
Scala is slightly shorter and again avoids sources of error. Then consider the cumulative benefit when you need to chain things together as shown in the examples by Synesso, Daniel, and paradigmatic.
It isn't a vast improvement, but if you add everything up, it's well worth it everywhere save very high-performance code (where you want to avoid even the tiny overhead of creating the Some(x) wrapper object).
The match usage isn't really that helpful on its own except as a device to alert you about the null/None case. When it is really helpful is when you start chaining it, e.g., if you have a list of options:
val a = List(Some("Hi"),None,Some("Bye"));
a match {
case List(Some(x),_*) => println("We started with " + x)
case _ => println("Nothing to start with.")
}
Now you get to fold the None cases and the List-is-empty cases all together in one handy statement that pulls out exactly the value you want.
Null return values are only present for compatibility with Java. You should not use them otherwise.
It is really a programming style question. Using Functional Java, or by writing your own helper methods, you could have your Option functionality but not abandon the Java language:
http://functionaljava.org/examples/#Option.bind
Just because Scala includes it by default doesn't make it special. Most aspects of functional languages are available in that library and it can coexist nicely with other Java code. Just as you can choose to program Scala with nulls you can choose to program Java without them.
Admitting in advance that it is a glib answer, Option is a monad.
Actually I share the doubt with you. About Option it really bothers me that 1) there is a performance overhead, as there is a lor of "Some" wrappers created everywehre. 2) I have to use a lot of Some and Option in my code.
So to see advantages and disadvantages of this language design decision we should take into consideration alternatives. As Java just ignores the problem of nullability, it's not an alternative. The actual alternative provides Fantom programming language. There are nullable and non-nullable types there and ?. ?: operators instead of Scala's map/flatMap/getOrElse. I see the following bullets in the comparison:
Option's advantage:
simpler language - no additional language constructs required
uniform with other monadic types
Nullable's advantage:
shorter syntax in typical cases
better performance (as you don't need to create new Option objects and lambdas for map, flatMap)
So there is no obvious winner here. And one more note. There is no principal syntactic advantage for using Option. You can define something like:
def nullableMap[T](value: T, f: T => T) = if (value == null) null else f(value)
Or use some implicit conversions to get pritty syntax with dots.
The real advantage of having explicit option types is that you are able to not use them in 98% of all places, and thus statically preclude null exceptions. (And in the other 2% the type system reminds you to check properly when you actually access them.)
Another situation where Option works, is in situations where types are not able to have a null value. It is not possible to store null in an Int, Float, Double, etc. value, but with an Option you can use the None.
In Java, you would need to use the boxed versions (Integer, ...) of those types.