Inspired by another question, I started wondering about the well-definedness of the following Java code:
public class Test {
List<Object> foo;
Object bar() {
foo = new ArrayList<Object>();
return(/* Anything */);
}
void frob() {
foo.add(bar());
}
}
In this case, does the Java specification specify a strict order of evaluation for the dot operator in frob? Is left-to-right evaluation guaranteed such that the add method is always executed on the list that was there before bar replaces the field with a new list, or can there legally be compilers that let bar execute before the foo field is fetched in frob, thus executing the add method on the new list?
I suspect that left-to-right evaluation is guaranteed, but is there wording in the specification to this effect?
Yes, this is specified in the JLS under Run-Time Evaluation of Method Invocation, Compute Target Reference:
If the form is Primary . [TypeArguments] Identifier involved, then:
If the invocation mode is static, then there is no target reference. The Primary expression is evaluated, but the result is then discarded.
Otherwise, the Primary expression is evaluated and the result is used as the target reference.
Then comes the evaluation of method arguments.
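A minimal sketch of the rule quoted above, adapted from the question's code. The class name TargetFirstDemo and the static field are assumptions made so the example is self-contained; the point is only that foo is read before bar() runs.

```java
import java.util.ArrayList;
import java.util.List;

class TargetFirstDemo {
    static List<Object> foo = new ArrayList<>();

    // Replaces the field with a fresh list, as bar() does in the question.
    static Object bar() {
        foo = new ArrayList<>();
        return "x";
    }

    // Returns { size of the list that was in foo before the call, size of the list in foo afterwards }.
    static int[] frob() {
        List<Object> before = foo;
        foo.add(bar()); // target reference (old list) is computed first, then bar() is evaluated
        return new int[] { before.size(), foo.size() };
    }

    public static void main(String[] args) {
        int[] sizes = frob();
        System.out.println(sizes[0] + " " + sizes[1]); // prints "1 0"
    }
}
```

The element ends up in the old list, and the list installed by bar() stays empty.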
As far as I know, when you define a method in a function, an object is instantiated:
myList.stream().map(x -> x.getName().replaceAll("a", "b")).toList();
Or the equivalent
Function<MyObject, String> myFunc = x -> { return x.getName().replaceAll("a", "b"); };
myList.stream().map(myFunc).toList();
x -> x.getName().replaceAll("a", "b") is created as a functional interface object (and requires memory allocation, a new somewhere/somehow, right?).
However, if I pass an already existing method as a parameter, is anything created?
class A {
public List<String> transform(List<String> input) {
return input.stream().filter(this::myFilter).filter(A::staticFilter).toList();
}
public boolean myFilter(String s) { return true; /* placeholder: whatever */ }
public static boolean staticFilter(String s) { return true; /* placeholder: whatever */ }
}
What happens here:
Is myFilter "wrapped" in a functional interface? (is it the same for a static method reference?)
Is there something specific that happens at bytecode level which is not "clear" on language level (like method pointer or something?).
From JavaDoc Api
Note that instances of functional interfaces can be created with
lambda expressions, method references, or constructor references.
As to whether the lambda expression will create an instance on the heap or not, you can follow this thread, where the top comment from @Brian Goetz might be helpful.
About lambda expressions:
Also as indicated here in Java Specifications for Run-Time Evaluation of Lambda Expressions
These rules are meant to offer flexibility to implementations of the
Java programming language, in that:
A new object need not be allocated on every evaluation.
Objects produced by different lambda expressions need not belong to
different classes (if the bodies are identical, for example).
Every object produced by evaluation need not belong to the same class
(captured local variables might be inlined, for example).
If an "existing instance" is available, it need not have been created
at a previous lambda evaluation (it might have been allocated during
the enclosing class's initialization, for example).
So to your question:
x -> x.getName().replaceAll("a", "b") is created as a functional interface object (and requires memory allocation, a new somewhere/somehow, right?).
The answer is: sometimes yes, sometimes no. It is not always the same.
About method reference expressions:
Evaluation of a method reference expression produces an instance of a
functional interface type (§9.8). Method reference evaluation does not
cause the execution of the corresponding method; instead, this may
occur at a later time when an appropriate method of the functional
interface is invoked.
Based on what is written here for Run-Time Evaluation of Method References
The timing of method reference expression evaluation is more complex
than that of lambda expressions (§15.27.4). When a method reference
expression has an expression (rather than a type) preceding the ::
separator, that subexpression is evaluated immediately. The result of
evaluation is stored until the method of the corresponding functional
interface type is invoked; at that point, the result is used as the
target reference for the invocation. This means the expression
preceding the :: separator is evaluated only when the program
encounters the method reference expression, and is not re-evaluated on
subsequent invocations on the functional interface type.
I would assume that an instance of a functional interface type is created, but not anew on each invocation. It may well be cached and optimized to minimize the number of evaluations.
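The timing rule quoted above can be sketched as follows. The class name MethodRefTimingDemo and the field current are hypothetical; the sketch contrasts a method reference, whose receiver expression is evaluated once, with a lambda, whose body reads the field on every call.

```java
import java.util.function.Supplier;

class MethodRefTimingDemo {
    static StringBuilder current = new StringBuilder("old");

    static String[] run() {
        current = new StringBuilder("old");                    // reset so run() can be called repeatedly
        Supplier<String> viaRef = current::toString;           // 'current' is evaluated here, once
        Supplier<String> viaLambda = () -> current.toString(); // 'current' is read each time get() is called
        current = new StringBuilder("new");
        return new String[] { viaRef.get(), viaLambda.get() };
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0] + " " + r[1]); // prints "old new"
    }
}
```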
Well, the compiler has a lot of leeway in how it actually implements the code you write, but generally .map() takes a Function Object so whatever expression you put in the parentheses will produce an object.
That does not mean, however, that a new Object is created every time. In your lambda example, the lambda function doesn't reference anything defined in an enclosing method scope, so a single Function object can be created and reused for all calls.
Similarly, the A::staticFilter reference only needs to produce one Predicate.
The object created by this::myFilter, however, needs to hold a reference to this (unless the compiler can determine that it doesn't!), so in practice you should expect a new Predicate to be created inside each call to transform, although the specification does not require it.
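One way to observe this on a particular JVM is to compare the identities of the objects produced by repeated evaluations. This is only a sketch: the class name MethodRefIdentityDemo and the filter bodies are assumptions, and the two identity results are deliberately not stated because the specification leaves them to the implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class MethodRefIdentityDemo {
    boolean myFilter(String s) { return !s.isEmpty(); }
    static boolean staticFilter(String s) { return s.length() < 4; }

    // Each call evaluates the method reference expression again.
    static Predicate<String> staticRef() { return MethodRefIdentityDemo::staticFilter; }
    Predicate<String> boundRef() { return this::myFilter; }

    // Both references behave as ordinary Predicate instances.
    static List<String> filtered() {
        MethodRefIdentityDemo a = new MethodRefIdentityDemo();
        return Arrays.asList("", "abc", "abcdef").stream()
                .filter(a.boundRef())
                .filter(staticRef())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        MethodRefIdentityDemo a = new MethodRefIdentityDemo();
        // Whether these comparisons yield true or false is implementation-specific.
        System.out.println("static reference reused: " + (staticRef() == staticRef()));
        System.out.println("bound reference reused:  " + (a.boundRef() == a.boundRef()));
        System.out.println(filtered()); // prints "[abc]"
    }
}
```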
Dealing with another SO question, I was wondering if the code below has an undefined behavior:
if (str.equals(str = getAnotherString())) {
// [...]
}
I tend to think the str reference from which the equals() call is made is evaluated before the further str assignment passed as argument. Is there a source about it?
This is clearly specified in the JLS Section 15.12.4:
At run time, method invocation requires five steps. First, a target reference may be computed. Second, the argument expressions are evaluated. [...]
What's a "target reference" you ask? This is specified in the next subsection:
15.12.4.1. Compute Target Reference (If Necessary)
...
If form is ExpressionName . [TypeArguments] Identifier, then:
If the invocation mode is static, then there is no target reference. The ExpressionName is evaluated, but the result is then discarded.
Otherwise, the target reference is the value denoted by ExpressionName.
So the "target reference" is the str bit in str.equals - the expression on which you are calling the method.
As the first quote says, the target reference is evaluated first, then the arguments. Therefore, str.equals(str = getAnotherString()) only evaluates to true if getAnotherString returns a string that has the same characters as str before the assignment expression.
So yes, what you tend to think is correct, and this is not "undefined behaviour": the order of evaluation is fully specified.
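A small sketch of that conclusion. The class name EqualsAssignDemo, the helper compare and the string values are hypothetical; getAnotherString stands in for the method from the question.

```java
class EqualsAssignDemo {
    static String getAnotherString() { return "new"; }

    // equals() is invoked on the value str had before the assignment in the argument.
    static boolean compare(String initial) {
        String str = initial;
        return str.equals(str = getAnotherString());
    }

    public static void main(String[] args) {
        System.out.println(compare("old")); // prints "false": "old".equals("new")
        System.out.println(compare("new")); // prints "true":  "new".equals("new")
    }
}
```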
Take a look at the code below:
class Foo{
public static int x = 1;
}
class Bar{
public static void main(String[] args) {
Foo foo;
System.out.println(foo.x); // Error: Variable 'foo' might not have been initialized
}
}
As you can see, trying to access the static field x via the uninitialized local variable Foo foo; fails: the expression foo.x generates the compilation error Variable 'foo' might not have been initialized.
It could seem like this error makes sense, but only until we realize that to access a static member the JVM doesn't actually use the value of a variable, but only its type.
For instance I can initialize foo with value null and this will let us access x without any problems:
Foo foo = null;
System.out.println(foo.x); //compiles and at runtime prints 1!!!
Such a scenario works because the compiler realizes that x is static and treats foo.x as if it were written Foo.x (at least that is what I thought until now).
So why does the compiler suddenly insist on foo having a value which it will NOT use at all?
Disclaimer: This is not code which would be used in a real application, but an interesting phenomenon for which I couldn't find an answer on Stack Overflow, so I decided to ask about it.
§15.11. Field Access Expressions:
If the field is static:
The Primary expression is evaluated, and the result is discarded. If evaluation of the Primary expression completes abruptly, the field access expression completes abruptly for the same reason.
Where earlier it states that field access is identified by Primary.Identifier.
This shows that even though the Primary's value seems to be unused, the Primary is still evaluated and its result is then discarded, which is why the variable needs to be initialized. This can make a difference when evaluation of the Primary completes abruptly, as stated in the quote.
EDIT:
Here is a short example just to demonstrate visually that the Primary is evaluated even though the result is discarded:
class Foo {
public static int x = 1;
public static Foo dummyFoo() throws InterruptedException {
Thread.sleep(5000);
return null;
}
public static void main(String[] args) throws InterruptedException {
System.out.println(dummyFoo().x);
System.out.println(Foo.x);
}
}
Here you can see that dummyFoo() is still evaluated because the print is delayed by the 5 second Thread.sleep() even though it always returns a null value which is discarded.
If the expression was not evaluated the print would appear instantly, which can be seen when the class Foo is used directly to access x with Foo.x.
Note: A method invocation is also considered a Primary, as shown in §15.8 Primary Expressions.
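The same effect can be shown without waiting, by counting calls instead of sleeping. This is a sketch with hypothetical names (StaticViaExpressionDemo, dummy, calls); it relies only on the rule quoted from §15.11.

```java
class StaticViaExpressionDemo {
    static int x = 1;
    static int calls = 0;

    // Has a visible side effect and always returns null.
    static StaticViaExpressionDemo dummy() {
        calls++;
        return null;
    }

    // Returns { value read through dummy().x, number of times dummy() ran }.
    static int[] run() {
        calls = 0;
        int value = dummy().x; // dummy() is evaluated, its null result is discarded, no NullPointerException
        return new int[] { value, calls };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]); // prints "1 1"
    }
}
```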
Chapter 16. Definite Assignment
Each local variable (§14.4) and every blank final field (§4.12.4, §8.3.1.2) must have a definitely assigned value when any access of its value occurs.
It doesn't really matter what you try to access via a local variable. The rule is that it should be definitely assigned before that.
To evaluate a field access expression foo.x, the primary part of it (foo) must be evaluated first. It means that access to foo will occur, which will result in a compile-time error.
For every access of a local variable or blank final field x, x must be definitely assigned before the access, or a compile-time error occurs.
There is value in keeping the rules as simple as possible, and “don’t use a variable that might not have been initialised” is as simple as it gets.
More to the point, there is an established way of accessing static members: always use the class name, not a variable.
System.out.println(Foo.x);
The variable “foo” is unwanted overhead that should be removed, and the compiler errors and warnings could be seen as nudging you towards that.
Other answers perfectly explain the mechanism behind what is happening. Maybe you also wanted the rationale behind Java's specification. Not being a Java expert, I cannot give you the original reasons, but let me point this out:
Every piece of code either has a meaning or it triggers a compilation error.
(For statics, because an instance is unnecessary, Foo.x is natural.)
Now, what shall we do with foo.x (access through instance variable)?
It could be a compilation error, as in C#, or
It has a meaning. Because Foo.x already means "simply access x", it is reasonable that the expression foo.x has a different meaning; that is, every part of the expression must be valid and is evaluated, and then x is accessed.
Let's hope someone knowledgeable can tell the real reason. :-)
I'm trying to understand how method references work in java.
At first sight it is pretty straightforward. But not when it comes to such things:
There is a method in Foo class:
public class Foo {
public Foo merge(Foo another) {
//some logic
}
}
And in another class Bar there is a method like this:
public class Bar {
public void function(BiFunction<Foo, Foo, Foo> biFunction) {
//some logic
}
}
And a method reference is used:
new Bar().function(Foo::merge);
It compiles and works, but I don't understand how it matches this:
Foo merge(Foo another)
to BiFunction method:
R apply(T t, U u);
???
There is an implicit this argument on instance methods. This is defined in §3.7 of the JVM specification:
The invocation is set up by first pushing a reference to the current instance, this, on to the operand stack. The method invocation's arguments, int values 12 and 13, are then pushed. When the frame for the addTwo method is created, the arguments passed to the method become the initial values of the new frame's local variables. That is, the reference for this and the two arguments, pushed onto the operand stack by the invoker, will become the initial values of local variables 0, 1, and 2 of the invoked method.
To understand why method invocation is done this way, we need to understand how the JVM stores code in memory. The code and the data of an object are separated. In fact, all methods of one class (static and non-static) are stored in the same place, the method area (§2.5.4 of the JVM spec). This allows each method to be stored only once instead of being stored again for every instance of the class. When a method like
someObject.doSomethingWith(someOtherObject);
is called, it actually gets compiled to something that looks more like
doSomethingWith(someObject, someOtherObject);
Most Java-programmers would agree that someObject.doSomethingWith(someOtherObject) has a "lower cognitive complexity": we do something with someObject that involves someOtherObject. The center of this action is someObject, where someOtherObject is just a means to an end.
With doSomethingWith(someObject, someOtherObject), you do not transport this semantics of someObject being the center of the action.
So in essence, we write the first version, but the computer prefers the second version.
As was pointed out by @FedericoPeraltaSchaffner, you can even write the implicit this parameter explicitly since Java 8. The exact definition is given in JLS, §8.4.1:
The receiver parameter is an optional syntactic device for an instance method or an inner class's constructor. For an instance method, the receiver parameter represents the object for which the method is invoked. For an inner class's constructor, the receiver parameter represents the immediately enclosing instance of the newly constructed object. Either way, the receiver parameter exists solely to allow the type of the represented object to be denoted in source code, so that the type may be annotated. The receiver parameter is not a formal parameter; more precisely, it is not a declaration of any kind of variable (§4.12.3), it is never bound to any value passed as an argument in a method invocation expression or qualified class instance creation expression, and it has no effect whatsoever at run time.
The receiver parameter must be of the type of the class and must be named this.
This means that
public String doSomethingWith(SomeOtherClass other) { ... }
and
public String doSomethingWith(SomeClass this, SomeOtherClass other) { ... }
will have the same semantic meaning, but the latter allows for e.g. annotations.
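As a sketch of the two declarations above side by side (the class name ReceiverParamDemo and the method names are hypothetical), note that both methods are called with a single argument, because the receiver parameter is not a formal parameter:

```java
class ReceiverParamDemo {
    // Ordinary instance method: the receiver is implicit.
    String plain(String other) {
        return "plain:" + other;
    }

    // Same shape with the receiver parameter written out (allowed since Java 8).
    String explicit(ReceiverParamDemo this, String other) {
        return "explicit:" + other;
    }

    public static void main(String[] args) {
        ReceiverParamDemo d = new ReceiverParamDemo();
        System.out.println(d.plain("x"));    // prints "plain:x"
        System.out.println(d.explicit("x")); // prints "explicit:x"
    }
}
```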
I find it easier to understand with different types:
public class A {
public void test(){
function(A::merge);
}
public void function(BiFunction<A, B, C> f){
}
public C merge(B i){
return null;
}
class B{}
class C{}
}
We can now see that a method reference of the form A::merge, on the type rather than on an instance, implicitly uses the first argument of the functional interface as this.
15.13.3. Run-Time Evaluation of Method References
If the form is ReferenceType :: [TypeArguments] Identifier
[...]
If the compile-time declaration is an instance method, then the target reference is the first formal parameter of the invocation method. Otherwise, there is no target reference.
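To make the quoted rule concrete, here is a sketch showing that Foo::merge and the lambda (a, b) -> a.merge(b) behave the same way. The class UnboundRefDemo, the field v and the body of merge are assumptions added for illustration; only the signature of merge comes from the question.

```java
import java.util.function.BiFunction;

class UnboundRefDemo {
    static class Foo {
        final String v;
        Foo(String v) { this.v = v; }
        Foo merge(Foo another) { return new Foo(v + "+" + another.v); }
    }

    static String demo() {
        // The first BiFunction argument becomes the receiver, the second becomes 'another'.
        BiFunction<Foo, Foo, Foo> viaRef = Foo::merge;
        BiFunction<Foo, Foo, Foo> viaLambda = (a, b) -> a.merge(b);
        Foo r1 = viaRef.apply(new Foo("a"), new Foo("b"));
        Foo r2 = viaLambda.apply(new Foo("a"), new Foo("b"));
        return r1.v + " " + r2.v;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "a+b a+b"
    }
}
```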
We can find an example that uses this behavior in the following:
JLS 15.13.1, Compile-Time Declaration of a Method Reference, mentions:
A method reference expression of the form ReferenceType::[TypeArguments] Identifier can be interpreted in different ways.
- If Identifier refers to an instance method, then the implicit lambda expression has an extra parameter [...]
- if Identifier refers to a static method. It is possible for ReferenceType to have both kinds of applicable methods, so the search algorithm described above identifies them separately, since there are different parameter types for each case.
It then shows a possible ambiguity with this behavior:
class C {
int size() { return 0; }
static int size(Object arg) { return 0; }
void test() {
Fun<C, Integer> f1 = C::size;
// Error: instance method size()
// or static method size(Object)?
}
}
The sample code is :
public class OverloadingTest {
public static void test(Object obj){
System.out.println("Object called");
}
public static void test(String obj){
System.out.println("String called");
}
public static void main(String[] args){
test(null);
System.out.println("10%2==0 is "+(10%2==0));
test((10%2==0)?null:new Object());
test((10%2==0)?null:null);
}
}
And the output is :
String called
10%2==0 is true
Object called
String called
The first call to test(null) invokes the method with the String argument, which is understandable according to The Java Language Specification.
1) Can anyone explain to me on what basis test() is invoked in the subsequent calls?
2) Again, when we use, say, an if condition:
if(10%2==0){
test(null);
}
else
{
test(new Object());
}
It always invokes the method with the String argument.
Will the compiler compute the expression (10%2) while compiling? I want to know whether expressions are computed at compile time or at run time. Thanks.
Java uses early binding. The most specific method is chosen at compile time. The most specific method is chosen by number of parameters and type of parameters. Number of parameters is not relevant in this case. This leaves us with the type of parameters.
What type do the parameters have? Both parameters are expressions, using the ternary conditional operator. The question reduces to: What type does the conditional ternary operator return? The type is computed at compile time.
Given are the two expressions:
(10%2==0)? null : new Object(); // A
(10%2==0)? null : null; // B
The rules of type evaluation are listed here. In B it is easy, both terms are exactly the same: null will be returned (whatever type that may be) (JLS: "If the second and third operands have the same type (which may be the null type), then that is the type of the conditional expression."). In A the second term is from a specific class. As this is more specific and null can be substituted for an object of class Object the type of the whole expression is Object (JLS: "If one of the second and third operands is of the null type and the type of the other is a reference type, then the type of the conditional expression is that reference type.").
After the type evaluation of the expressions the method selection is as expected.
The example with if you give is different: You call the methods with objects of two different types. The ternary conditional operator always is evaluated to one type at compile time that fits both terms.
JLS 15.25:
The type of a conditional expression is determined as follows:
[...]
If one of the second and third operands is of the null type and the type of the other
is a reference type, then the type of the conditional expression is that reference
type.
[...]
So the type of
10 % 2 == 0 ? null : new Object();
is Object.
test((10%2==0)?null:new Object());
Is the same as:
Object o;
if(10%2==0)
o=null;
else
o=new Object();
test(o);
Since type of o is Object (just like the type of (10%2==0)?null:new Object()) test(Object) will be always called. The value of o doesn't matter.
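The calls from the question can be sketched with overloads that return a label instead of printing, which makes the chosen method easy to check. The class name ConditionalTypeDemo and the returned labels are assumptions; the argument expressions are the ones from the question.

```java
class ConditionalTypeDemo {
    static String test(Object obj) { return "Object"; }
    static String test(String obj) { return "String"; }

    // The overload is selected at compile time from the type of each argument expression.
    static String[] run() {
        return new String[] {
            test(null),                                // null fits both; String is more specific
            test((10 % 2 == 0) ? null : new Object()), // one operand is an Object, so only test(Object) applies
            test((10 % 2 == 0) ? null : null)          // both operands are null, same as test(null)
        };
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", run())); // prints "String Object String"
    }
}
```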
Your answer is: compile time, not run time. The compiler picks the overload from the static type of the argument expression, so whether the value actually is an instance of String at run time plays no part in the choice.
This is a really nice question.
Let me try to clarify the code that you have written above.
In your first method call
test(null);
Here null is acceptable to both overloads, and test(String obj) is called because String is the more specific type; as per the JLS, you are already satisfied with this call.
In the second method call
test((10%2==0)?null:new Object());
Here the condition evaluates to true, but the overload is not selected from the value of the condition. It is selected from the static type of the whole conditional expression: one operand is null and the other is new Object(), so the expression has type Object. The call therefore binds to the overload with Object as its parameter, which is the following method
public static void test(Object obj)
For the sake of experiment, you can try the following combinations to get a clearer picture.
test((10 % 2 == 0) ? new Object() : "stringObj" );
test((10 % 2 == 0) ? new Object() : null );
test((10 % 2 == 0) ? "stringObj" : null );
Finally, consider the last call, made with the following code.
test((10%2==0)?null:null);
This time the condition again evaluates to true, but there is no new Object() operand in the ternary operator. Both operands are null, so the expression has the null type, and the call resolves to the same method as your first call.
Lastly, you asked what happens if the code is put in an if .. else statement. The compiler makes the right decision there as well.
if(10%2==0) {
test(null);
}
Here the if condition is always true, so test(null) is the call that runs. Therefore the first method, test(String obj), with String as its parameter, is called every time, as explained above.
I think your problem is that you are making the wrong assumption about your expressions:
test((10%2==0)?null:new Object());
and
test((10%2==0)?null:null);
Both always pass null at run time, but the overload is fixed at compile time from the type of the conditional expression: the first has type Object and so goes through test(Object), while the second has the null type and goes through test(String).
As @Banthar mentioned, the ?: expression behaves as if its value were first assigned to a variable of that type and test were then called with the variable.
On the other hand, the if condition you mentioned is always true, so the compiler will replace the whole if-else block with only the body of the if.
1) The test() method is determined by the type of the parameter at compile time:
test((Object) null);
test((Object)"String");
output :
Object called
Object called
2) The compiler is even smarter, the compiled code is equivalent to just :
test(null);
you can check the bytecode with javap -c:
0: aconst_null
1: invokestatic #6 // Method test:(Ljava/lang/String;)V
4: return
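Point 1) can be sketched as a runnable example in which the overloads return their label rather than print it. The class name CastSelectsOverloadDemo is hypothetical; the casts are the ones shown above, plus two String-typed calls for contrast.

```java
class CastSelectsOverloadDemo {
    static String test(Object obj) { return "Object called"; }
    static String test(String obj) { return "String called"; }

    // The cast changes the static type of the argument, and with it the chosen overload.
    static String[] run() {
        return new String[] {
            test((Object) null),
            test((Object) "String"),
            test((String) null),
            test("String")
        };
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
        // prints:
        // Object called
        // Object called
        // String called
        // String called
    }
}
```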
This is what the Java Language Specification says about the problem.
If more than one method declaration is both accessible and applicable
to a method invocation, it is necessary to choose one to provide the
descriptor for the run-time method dispatch. The Java programming
language uses the rule that the most specific method is chosen.
This is the test(String) method in your case.
And because of that, if you add...
public static void test(Integer obj){
System.out.println("Integer called");
}
the call test(null) will produce a compilation error: The method test(String) is ambiguous for the type OverloadingTest.
Just like JLS says:
It is possible that no method is the most specific, because there are two or more maximally specific methods. In this case:
- If all the maximally specific methods have the same signature, then:
  - If one of the maximally specific methods is not declared abstract, it is the most specific method.
  - Otherwise, all the maximally specific methods are necessarily declared abstract. The most specific method is chosen arbitrarily among the maximally specific methods. However, the most specific method is considered to throw a checked exception if and only if that exception is declared in the throws clauses of each of the maximally specific methods.
- Otherwise, we say that the method invocation is ambiguous, and a compile-time error occurs.
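A sketch of the ambiguous case and one way around it. The class name AmbiguousOverloadDemo is hypothetical, and the overloads return their label so the result can be checked; the ambiguous call is left commented out because it does not compile.

```java
class AmbiguousOverloadDemo {
    static String test(Object obj)  { return "Object called"; }
    static String test(String obj)  { return "String called"; }
    static String test(Integer obj) { return "Integer called"; }

    static String[] run() {
        // test(null); // does not compile: String and Integer are both maximally specific
        return new String[] {
            test((String) null),  // a cast gives the argument a static type and removes the ambiguity
            test((Integer) null),
            test((Object) null)
        };
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
        // prints:
        // String called
        // Integer called
        // Object called
    }
}
```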