In my Java book, it says that "an expression is a statement that can convey a return value." This is different than my traditional understanding. I thought an expression DOES return a value. Not CAN return a value.
This is from Sams Teach Yourself Java in 21 Days.
A mathematical expression always returns something, but a Java expression doesn't have to. The Java Specification defines what exactly is meant by the term expression in the Java language. Another difference is that expressions can, and often do, have side effects in Java. A side effect is pretty much anything that happens other than returning a value.
Quoting the Java Language Specification:
Much of the work in a program is done by evaluating expressions, either for their side effects, such as assignments to variables, or for their values, which can be used as arguments or operands in larger expressions, or to affect the execution sequence in statements, or both.
For example, System.out.println("Hello World"); doesn't return a value, but it does print Hello World to the output stream. This process of outputting data is a side effect of calling println. Functional languages, in contrast, attempt to minimize dependence on side effects and stick more closely to the mathematical definition of an expression.
Quoting from the JLS again, here is the BNF grammar for an expression:
Primary:
PrimaryNoNewArray
ArrayCreationExpression
PrimaryNoNewArray:
Literal
Type . class
void . class
this
ClassName.this
( Expression )
ClassInstanceCreationExpression
FieldAccess
MethodInvocation
ArrayAccess
You can see that a MethodInvocation is an expansion of PrimaryNoNewArray, which is an expansion of Primary (expression).
Your understanding is incomplete. In Java, an expression could return a value, and it could terminate due to an exception. Similar situations arise in other languages which support exceptions, and more generally. (For instance, in the C language, division by zero causes the current expression evaluation to terminate without returning a value.)
Another explanation is that (according to the JLS), a method invocation expression like System.err.println("hello") can deliver a notional void value to its context, and this really means that it is delivering no value.
I don't think this second explanation is sound. We start with an "expression" that is specified as delivering a void value. Then we argue that since the void value is in reality not a value, the expression is delivering nothing. Finally, we say it is an expression that delivers no value.
A simpler explanation for this example is that an "expression" that delivers "void" is not really an expression in the intuitive sense. Certainly, in Java you cannot use a void-delivering MethodInvocation expression where a non-void-delivering expression is required. And you can't use an arbitrary non-void-delivering expression, such as 1 + 1, as a Statement.
Alternatively, we can stick with the JLS treatment and say that the "void" value really is a value ... even though you can't ever do anything with it. By this argument, System.err.println("Hi") is returning a value after all.
Ouch... This doesn't seem like a very good book :(
An expression either has a type and can be evaluated to yield a value of that type, or is of void type and can be evaluated to yield nothing. The JLS also says an expression can evaluate to a variable, but the variable in turn has a type and a value. For example, 1 + 1 is an expression.
A statement, on the other hand, is composed of expressions but doesn't have a type or a value itself. For example, int x = 1 + 1; has no value. It wouldn't make sense in Java to say something like System.out.println(int x = 1 + 1;);
The method public void foo(); does not return anything, thus the expression foo() does not return anything. But if the method was public int foo(), then it would. Thus it could return something, but doesn't necessarily have to. From the Java spec:
An expression denotes nothing if and only if it is a method invocation
that invokes a method that does not return a value, that is, a method
declared void. Such an expression can be used only as an expression
statement, because every other context in which an expression can
appear requires the expression to denote something. An expression
statement that is a method invocation may also invoke a method that
produces a result; in this case the value returned by the method is
quietly discarded.
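The distinction in the quoted passage can be sketched as follows. The class and method names (ExpressionDemo, log, next) are made up for illustration and are not from the JLS or the book.

```java
// Hypothetical names, for illustration only.
class ExpressionDemo {
    static int counter = 0;

    // Declared void: an invocation of this method denotes nothing.
    static void log(String message) {
        counter++; // the side effect is the only reason to call it
    }

    // Declared int: an invocation of this method denotes a value.
    static int next() {
        return ++counter;
    }

    public static void main(String[] args) {
        log("hello");          // allowed: void invocation as an expression statement
        next();                // allowed: the returned value is quietly discarded
        int n = next();        // allowed: the value is used
        // int bad = log("x"); // does not compile: a void invocation denotes nothing
        System.out.println(n);
    }
}
```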
Say I perform a simple add/concatenation statement:
variable + newInput
Without setting the calculated value to a new variable, as in:
variable = variable + newInput
or
variable += newInput
Does Java have some sort of specifier to be able to use the computed sum or concatenated string?
Apparently in Python (in the interactive interpreter) it is automatically saved in the implicit global variable _, which can be used like
print(_)
Is there anything like this in Java?
No. It does not have anything like this. You have to assign the computed value to a variable, otherwise it will be lost and consequently collected by the garbage collector.
The best option is to use a shorthand (compound assignment) operator, so that you don't need an extra variable and the result is assigned back to an existing one:
variable += newInput
More than just not saving the result, Java will outright refuse to compile your program if it contains such a line, precisely because the result would be unsaved and unusable if it was allowed:
public class Main
{
    public static void main(String[] args)
    {
        1+2;
    }
}
Result:
Main.java:5: error: not a statement
1+2;
^
1 error
Java does not allow arbitrary expressions as statements, and addition expressions are not considered valid Java statements.
The expressions that are allowed as statements by themselves are listed in the JLS:
ExpressionStatement:
StatementExpression ;
StatementExpression:
Assignment
PreIncrementExpression
PreDecrementExpression
PostIncrementExpression
PostDecrementExpression
MethodInvocation
ClassInstanceCreationExpression
Assignment, increment, decrement, method calls, and new Whatever(), all things with side effects or potential side effects. Barring possible side effects of an implicit toString() call, + cannot have side effects, so to catch probable errors, Java forbids addition expressions from being statements.
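As a rough illustration of that grammar, the sketch below uses one statement of each permitted kind. The class name StatementExpressionDemo is invented for the example, and the rejected line is left commented out so that the rest compiles.

```java
// Hypothetical class name, for illustration only.
class StatementExpressionDemo {
    static int run() {
        int x;
        x = 1;                 // Assignment
        ++x;                   // PreIncrementExpression
        x--;                   // PostDecrementExpression
        Math.max(x, 10);       // MethodInvocation (result discarded)
        new StringBuilder();   // ClassInstanceCreationExpression (object discarded)
        // x + 1;              // rejected by javac: "not a statement"
        return x;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```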
You can for sure do:
variable + newInput
but the result of that operation must be assigned to a variable, if not, it will get lost...
the most you can get is
variable += newInput
which is similar to
variable = variable + newInput
The point is: the + operator in Java simply takes two operands and returns a result (either numerical, or as string concatenation).
Without assigning this result to something (like returning it from a method; or as shown in your example) ... it is like: the operation never takes place.
This operation doesn't have any side effects on its operands; and there is no way of accessing this result.
Beyond that, there is no operator overloading in Java. So it is also not possible to do some black magic that somehow stores the result of operation as side effect. You could theoretically add an agent to the JVM, that intercepts at runtime to do something upon an add operation, but that is more like: "technically possible", but nothing you would do in practical reality.
Other JVM languages, like Scala for example might use it implicitly - the last expression in a method is always returned, even when leaving out the return statement (in scala).
The expression you show would be evaluated and nothing done with the result: unless you bind a variable to the result, the evaluation occurs without effect.
Note, however, that javac does reject a bare variable + newInput; used as a statement, with the error "not a statement".
IntelliJ keeps proposing me to replace my lambda expressions with method references.
Is there any objective difference between both of them?
Let me offer some perspective on why we added this feature to the language, when clearly we didn't strictly need to (all method refs can be expressed as lambdas.)
Note that there is no right answer. Anyone who says "always use a method ref instead of a lambda" or "always use a lambda instead of a method ref" should be ignored.
This question is very similar in spirit to "when should I use a named class vs an anonymous class"? And the answer is the same: when you find it more readable. There are certainly cases that are definitely one or definitely the other but there's a host of grey in the middle, and judgment must be used.
The theory behind method refs is simple: names matter. If a method has a name, then referring to it by name, rather than by an imperative bag of code that ultimately just turns around and invokes it, is often (but not always!) more clear and readable.
The arguments about performance or about counting characters are mostly red herrings, and you should ignore them. The goal is writing code that is crystal clear what it does. Very often (but not always!) method refs win on this metric, so we included them as an option, to be used in those cases.
A key consideration about whether method refs clarify or obfuscate intent is whether it is obvious from context what is the shape of the function being represented. In some cases (e.g., map(Person::getLastName)), it's quite clear from the context that a function that maps one thing to another is required, and in cases like this, method references shine. In others, using a method ref requires the reader to wonder about what kind of function is being described; this is a warning sign that a lambda might be more readable, even if it is longer.
Finally, what we've found is that most people at first steer away from method refs because they feel even newer and weirder than lambdas, and so initially find them "less readable", but over time, when they get used to the syntax, generally change their behavior and gravitate towards method references when they can. So be aware that your own subjective initial "less readable" reaction almost certainly entails some aspect of familiarity bias, and you should give yourself a chance to get comfortable with both before rendering a stylistic opinion.
Long lambda expressions consisting of several statements may reduce the readability of your code. In such a case, extracting those statements in a method and referencing it may be a better choice.
The other reason may be re-usability. Instead of copy&pasting your lambda expression of few statements, you can construct a method and call it from different places of your code.
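A minimal sketch of that refactoring, assuming a made-up helper normalize and a made-up class MethodRefDemo: the multi-statement lambda body is moved into a named method, which can then be referenced from several places.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical names, for illustration only.
class MethodRefDemo {
    // The statements that used to live inside the lambda body.
    static String normalize(String raw) {
        String trimmed = raw.trim();
        return trimmed.isEmpty() ? "(blank)" : trimmed.toLowerCase();
    }

    // Before: the logic is written inline as a multi-statement lambda.
    static List<String> withLambda(List<String> input) {
        return input.stream()
                .map(raw -> {
                    String trimmed = raw.trim();
                    return trimmed.isEmpty() ? "(blank)" : trimmed.toLowerCase();
                })
                .collect(Collectors.toList());
    }

    // After: the same logic, referred to by name and reusable elsewhere.
    static List<String> withMethodRef(List<String> input) {
        return input.stream()
                .map(MethodRefDemo::normalize)
                .collect(Collectors.toList());
    }
}
```

Both variants produce the same list; which one reads better is the judgment call described above.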
As user stuchl4n3k wrote in the comments on the question, an exception may occur.
Let's say that some variable field is initially uninitialized (null); then:
field = null;
runThisLater(()->field.method());
field = new SomeObject();
will not crash, while
field = null;
runThisLater(field::method);
field = new SomeObject();
will crash with java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.Class java.lang.Object.getClass()', at a method reference statement line, at least on Android.
Today's IntelliJ notes "may change semantics" when suggesting this refactoring.
This happens when referencing an instance method of a particular object. Why?
Let's check the first two paragraphs of
15.13.3. Run-Time Evaluation of Method References:
At run time, evaluation of a method reference expression is similar to evaluation of a
class instance creation expression, insofar as normal completion produces a reference
to an object. Evaluation of a method reference expression is distinct from invocation
of the method itself.
First, if the method reference expression begins with an ExpressionName or a
Primary, this subexpression is evaluated. If the subexpression evaluates to null, a
NullPointerException is raised, and the method reference expression completes
abruptly. If the subexpression completes abruptly, the method reference expression
completes abruptly for the same reason.
For a lambda expression (I'm not certain of the details), the final type is derived at compile time from the method declaration. What follows is a simplification of what actually happens. Let's assume that the method runThisLater has been declared as, e.g.,
void runThisLater(SamType obj), where SamType is some functional interface; then runThisLater(()->field.method()); translates into something like:
runThisLater(new SamType() {
    @Override
    public void doSomething() {
        // field is read here, only when doSomething() is eventually called
        field.method();
    }
});
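The difference can be reproduced on a plain JVM with a small self-contained sketch. The class ReceiverEvaluationDemo and its String field are stand-ins invented for this example; they play the role of field and SomeObject above.

```java
import java.util.function.Supplier;

// Hypothetical names, for illustration only.
class ReceiverEvaluationDemo {
    static String field; // starts out null

    // The lambda body reads field only when get() is called.
    static int viaLambda() {
        field = null;
        Supplier<Integer> later = () -> field.length();
        field = "ready";
        return later.get(); // field is non-null by now
    }

    // The method reference evaluates field at the point where it is created.
    static boolean methodRefThrowsEarly() {
        field = null;
        try {
            Supplier<Integer> later = field::length; // field is null here
            field = "ready";
            later.get();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(viaLambda());
        System.out.println(methodRefThrowsEarly());
    }
}
```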
Additional info:
15.27.4. Run-Time Evaluation of Lambda Expressions
Translation of Lambda Expressions
State of the Lambda, version 3, where SAM was mentioned.
State of the Lambda, final.
While it is true that all method references can be expressed as lambdas, there is a potential difference in semantics when side effects are involved. #areacode's example throwing an NPE in one case but not in the other is very explicit regarding the involved side effect. However, there is a more subtle case you could run into when working with CompletableFuture:
Let's simulate a task that takes a while (2 seconds) to complete via the following helper function slow:
private static <T> Supplier<T> slow(T s) {
    return () -> {
        try {
            Thread.sleep(2000);
        } catch (InterruptedException e) {}
        return s;
    };
}
Then
var result =
CompletableFuture.supplyAsync(slow(Function.identity()))
.thenCompose(supplyAsync(slow("foo"))::thenApply);
Effectively runs both async tasks in parallel allowing the future to complete after roughly 2 seconds.
On the other hand if we refactor the ::thenApply method reference into a lambda, both async tasks would run sequentially one after each other and the future only completes after about 4 seconds.
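The timing difference comes from when the receiver of ::thenApply is evaluated. Here is a sketch that shows this without any sleeping, assuming a made-up helper startSecond() that stands in for supplyAsync(slow("foo")) and simply counts how often the second task has been started:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical names, for illustration only.
class EagerReceiverDemo {
    static final AtomicInteger started = new AtomicInteger();

    // Stand-in for supplyAsync(slow("foo")): records that the second task was started.
    static CompletableFuture<String> startSecond() {
        started.incrementAndGet();
        return CompletableFuture.completedFuture("foo");
    }

    // How many times the second task was started before the first one completed.
    static int startedEarlyWithMethodRef() {
        started.set(0);
        CompletableFuture<Function<String, String>> first = new CompletableFuture<>();
        CompletableFuture<String> result =
                first.<String>thenCompose(startSecond()::thenApply);
        int early = started.get(); // receiver was evaluated when the reference was created
        first.complete(Function.identity());
        result.join();
        return early;
    }

    static int startedEarlyWithLambda() {
        started.set(0);
        CompletableFuture<Function<String, String>> first = new CompletableFuture<>();
        CompletableFuture<String> result =
                first.<String>thenCompose(fn -> startSecond().thenApply(fn));
        int early = started.get(); // lambda body has not run yet
        first.complete(Function.identity());
        result.join();
        return early;
    }
}
```

With the method reference the second task has already been started while the first is still pending; with the lambda it is started only once the first completes.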
Side note: while the example seems contrived, it does come up when you try to regain the applicative instance hidden in the future.
In Java when passing arguments to a method and modifying the passed arguments during the method call is it guaranteed that the result is what is expected?
E.g.
a.method(++i); etc
Is it guaranteed, for instance, that inside method the variable i will have the updated value?
Or a.method(i++) Will method get the value of i after incrementing or before?
Also same for all similar cases.
I kind of remember this is forbidden in C++ as implementation specific but perhaps I remember wrong.
The java language specification for prefix/postfix increment/decrement operators:
Prefix: http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.15.1.
... the value 1 is added to the value of the variable and the sum is stored back into the variable ... The value of the prefix increment expression is the value of the
variable after the new value is stored.
Postfix: http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.14.2
... the value 1 is added to the value of the variable and the sum is stored back into the variable ... The value of the postfix increment expression is the value of the variable before the new value is stored.
I think it's pretty clear. The function will get the incremented value in the prefix case, and not in the postfix case.
The expression ++i is evaluated before the method is called.
From the Java Language Specification's section "Runtime evaluation of method invocation":
... Second, the argument expressions are evaluated. ... Fifth, a new activation frame is created, synchronization is performed if necessary, and control is transferred to the method code.
And from the Java Language Specification's section "Prefix increment operator":
The value of the prefix increment expression is the value of the variable after the new value is stored.
No problem in Java, method will receive the updated value.
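A quick way to check this yourself is a sketch like the following; the class IncrementArgDemo and the method echo are invented for the example.

```java
// Hypothetical names, for illustration only.
class IncrementArgDemo {
    // Returns exactly what it was passed.
    static int echo(int received) {
        return received;
    }

    // Returns { value seen via ++i, value seen via i++, final value of i }.
    static int[] run() {
        int i = 5;
        int viaPrefix = echo(++i);  // i is incremented first; the method receives the new value
        int viaPostfix = echo(i++); // the method receives the old value; i is incremented before the call starts
        return new int[] { viaPrefix, viaPostfix, i };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```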
I've seen -1 used in various APIs, most commonly when searching into a "collection" with zero-based indices, usually to indicate the "not found" index. This "works" because -1 is never a legal index to begin with. It seems that any negative number should work, but I think -1 is almost always used, as some sort of (unwritten?) convention.
I would like to limit the scope to Java at least for now. My questions are:
What are the official words from Sun regarding using -1 as a "special" return value like this?
What quotes are there regarding this issue, from e.g. James Gosling, Josh Bloch, or even other authoritative figures outside of Java?
What were some of the notable discussions regarding this issue in the past?
This is a common idiom in languages where the types do not include range checks. An "out of bounds" value is used to indicate one of several conditions. Here, the return value indicates two things: 1) was the character found, and 2) where was it found.
The use of -1 for not found and a non-negative index for found succinctly encodes both of these into one value, and the fact that not-found does not need to return an index.
In a language with strict range checking, such as Ada or Pascal, the method might be implemented as (pseudo code)
bool indexOf(c:char, position:out Positive);
Positive is a subtype of int, but restricted to non-negative values.
This separates the found/not-found flag from the position. The position is provided as an out parameter - essentially another return value. It could also be an in-out parameter, to start the search from a given position. Use of -1 to indicate not-found would not be allowed here since it violates range checks on the Positive type.
The alternatives in Java are:
throw an exception: this is not a good choice here, since not finding a character is not an exceptional condition.
split the result into several methods, e.g. boolean indexOf(char c); int lastFoundIndex();. This implies the object must hold on to state, which will not work in a concurrent program, unless the state is stored in thread-local storage, or synchronization is used - all considerable overheads.
return the position and found flag separately: such as boolean indexOf(char c, Position pos). Here, creating the position object may be seen as unnecessary overhead.
create a multi-value return type
such as
class FindIndex {
boolean found;
int position;
}
FindIndex indexOf(char c);
although it clearly separates the return values, it suffers object creation overhead. Some of that could be mitigated by passing the FindIndex as a parameter, e.g.
FindIndex indexOf(char c, FindIndex start);
Incidentally, multiple return values were going to be part of Java (Oak), but were axed prior to 1.0 to cut time to release. James Gosling says he wishes they had been included. It's still a wished-for feature.
My take is that the use of magic values is a practical way of encoding a multi-valued result (a flag and a value) in a single return value, without requiring excessive object creation overhead.
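For comparison, newer Java versions (8 and later) ship java.util.OptionalInt, which is one ready-made form of the "flag plus value" return type. The wrapper below is only a sketch; the class OptionalIndexDemo and the method find are made up, and the wrapper may still incur the object allocation that the magic value avoids.

```java
import java.util.OptionalInt;

// Hypothetical names, for illustration only.
class OptionalIndexDemo {
    // Wraps String.indexOf so that "not found" is an empty OptionalInt instead of -1.
    static OptionalInt find(String text, char c) {
        int index = text.indexOf(c);
        return index >= 0 ? OptionalInt.of(index) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(find("abc", 'c'));
        System.out.println(find("abc", 'z'));
    }
}
```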
However, if using magic values, it's much nicer to work with if they are consistent across related api calls. For example,
// get everything after the first c
int index = str.indexOf('c');
String afterC = str.substring(index);
Java falls short here, since the use of -1 in the call to substring will cause an IndexOutOfBoundsException. Instead, it might have been more consistent for substring to return "" when invoked with -1, if negative values are considered to start at the end of the string. Critics of magic values for error conditions say that the return value can be ignored (or assumed to be positive). A consistent api that handles these magic values in a useful way would reduce the need to check for -1 and allow for cleaner code.
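With the API as it stands, the caller has to do the check. A small sketch, with an invented class AfterFirstDemo and helper fromFirst that returns "" when the character is absent (one possible convention, not something the JDK provides):

```java
// Hypothetical names, for illustration only.
class AfterFirstDemo {
    // Everything from the first occurrence of c onwards, or "" if c does not occur.
    static String fromFirst(String str, char c) {
        int index = str.indexOf(c);
        return index >= 0 ? str.substring(index) : "";
    }

    // Shows what happens when -1 is passed straight on to substring.
    static boolean unguardedThrows(String str, char c) {
        try {
            str.substring(str.indexOf(c));
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(fromFirst("abcdef", 'c'));
        System.out.println(unguardedThrows("abc", 'z'));
    }
}
```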
Is -1 a magic number?
In this context, not really. There is nothing special about -1 ... apart from the fact that it is guaranteed to be an invalid index value by virtue of being negative.
An anti-pattern?
No. To qualify as an anti-pattern there would need to be something harmful about this idiom. I see nothing harmful in using -1 this way.
A code smell?
Ditto. (It is arguably better style to use a named constant rather than a bare -1 literal. But I don't think that is what you are asking about, and it wouldn't count as "code smell" anyway, IMO.)
Quotes and guidelines from authorities
Not that I'm aware of. However, I would observe that this "device" is used in various standard classes. For example, String.indexOf(...) returns -1 to say that the character or substring could not be found.
As far as I am concerned, this is simply an "algorithmic device" that is useful in some cases. I'm sure that if you looked back through the literature, you will see examples of using -1 (or 0 for languages with one-based arrays) this way going back to the 1960's and before.
The choice of -1 rather than some other negative number is simply a matter of personal taste, and (IMO) not worth analyzing in this context.
It may be a bad idea for a method to return -1 (or some other value) to indicate an error instead of throwing an exception. However, the problem here is not the value returned but the fact that the method is requiring the caller to explicitly test for errors.
The flip side is that if the "condition" represented by -1 (or whatever) is not an "error" / "exceptional condition", then returning the special value is both reasonable and proper.
Both Java and JavaScript use -1 when an index isn't found. Since the index is always 0-n it seems a pretty obvious choice.
//JavaScript
var url = 'example.com/foo?bar&admin=true';
if(url.indexOf('&admin') != -1){
alert('we likely have an insecure app!');
}
I find this approach (which I've used when extending Array-type elements to have a .indexOf() method) to be quite normal.
On the other hand, you can try the PHP approach e.g. strpos() but IMHO it gets confusing as there are multiple return types (it returns FALSE when not found)
-1 as a return value is slightly ugly but necessary. The alternatives to signal a "not found" condition are IMHO all much worse:
You could throw an Exception, but this isn't ideal because Exceptions are best used to signal unexpected conditions that require some form of recovery or propagated failure. Not finding an occurrence of a substring is actually pretty expected. Also Exception throwing has a significant performance penalty.
You could use a compound result object with (found,index) but this requires an object allocation and more complex code on the part of the caller to inspect the result.
You could separate out two separate function calls for contains and indexOf - however this is again quite cumbersome for the caller and also results in a performance hit as both calls would be O(n) and require a full traversal of the String.
Personally, I never like to refer to the -1 constant: my test for not-found is always something like:
int i = someString.indexOf("substring");
if (i>=0) {
// do stuff with found index
} else {
// handle not found case
}
It is good practice to define a final class variable for all constant values in your code.
But it is generally accepted to use 0, 1, -1, "" (empty string) without an explicit declaration.
This is an inheritance from C, where only a single primitive value could be returned. In Java you can also return a single object.
So for new code, return an object of a base type, with the subtype indicating the problem (to be tested with instanceof), or throw a "not found" exception.
For existing special values, make -1 a constant in your code, named accordingly (NOT_FOUND), so the reader can tell the meaning without having to check the Javadocs.
The same practice as with null applies to -1. It's been discussed many times.
e.g. Java api design - NULL or Exception
It's used because it's the first invalid value you encounter in 0-based arrays. As you know, not all types can hold null or nothing, so you need "something" to signify nothing.
I would say it's not official; it has just become an (unwritten) convention because it's very sensible for the situation. Personally, I wouldn't call it an issue either. API design is also down to the author, but guidelines can be found online.
As far as I know, such values are called sentinel values, although most common definitions differ slightly from this scenario.
Languages such as Java chose to not support passing by reference (which I think is a good idea), so while the values of individual arguments are mutable, the variables passed to a function remain unaffected. As a consequence of this, you can only have one return value of only one type. So what you do is choose an otherwise invalid value of a valid type, and return it to transport additional semantics, because the return value is not actually the return value of the operation but a special signal.
Now I guess, the cleanest approach would be to have a contains and an indexOf method, the second of which would throw an exception, if the element you're asking for is not in the collection. Why? Because one would expect the following to be true:
someCollection.objectAtIndex(someCollection.indexOf(someObject)) == someObject
What you're likely to get is an exception because -1 is out of bounds, while the actual reason why this plausible relation is not true is, that someObject is not an element of someCollection, and that is why the inner call should raise the exception.
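To make the contrast concrete, here is a sketch using java.util.List, with get standing in for objectAtIndex. The class StrictIndexDemo and the method strictIndexOf are hypothetical; NoSuchElementException is just one plausible choice of exception.

```java
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical names, for illustration only.
class StrictIndexDemo {
    // A strict variant of indexOf: absence is reported where it is detected.
    static <T> int strictIndexOf(List<T> list, T element) {
        int index = list.indexOf(element);
        if (index < 0) {
            throw new NoSuchElementException("not in collection: " + element);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b");
        try {
            items.get(items.indexOf("z")); // fails in the outer call, on index -1
        } catch (IndexOutOfBoundsException e) {
            System.out.println("outer call failed: " + e.getClass().getSimpleName());
        }
        try {
            items.get(strictIndexOf(items, "z")); // fails in the inner call, naming the real cause
        } catch (NoSuchElementException e) {
            System.out.println("inner call failed: " + e.getMessage());
        }
    }
}
```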
Now as clean and robust, as this may be, it has two key flaws:
Both operations would usually cost you O(n) (unless you have an inverse map within the collection), so you're better off if you do just one.
It is really quite verbose.
In the end, it's up to you to decide. This is a matter of philosophy. I'd call it a "semantic hack" to achieve both shortness & speed at the cost of robustness. Your call ;)
greetz
back2dos
It's like why 51% means everything among the shareholders of a company: it's the nearest value, and it makes more sense than -2 or -3 ...