Well, I really thought that this would work (inside a method):
var x, y = 1;
var x = 1, y = 2;
But it does not; it will not compile: "'var' is not allowed in a compound declaration".
I guess the reason for this is the usual trade-off: it is not a widely used feature and so was not implemented, though it could be, and perhaps might be in a future release...
Well, if you give it a manifest type:
int x, y = 1;
This declares two int variables, and initializes one of them. But local variable type inference requires an initializer to infer a type. So you're dead out of the gate.
But, suppose you meant to provide an initializer for both. It's "obvious" what to do when both initializers have the same type. So let's make it harder. Suppose you said:
var x = 1, y = 2.0;
What is this supposed to mean? Does this declare x as int and y as double? Or does it try to find some type that can be the type of both x and y? Whichever we decided, some people would think it should work the other way, and it would be fundamentally confusing.
And, for what benefit? The incremental syntactic cost of saying what you mean is trivial compared to the potential semantic confusion. And that's why we excluded this from the scope of type inference for locals.
You might say, then, "well, only make it work if they are the same type." We could do that, but now the boundary of when you can use inference and when not is even more complicated. And I'd be answering the same sort of "why don't you" question right now anyway ... The reality is that inference schemes always have limits; what you get to pick is the boundary. Better to pick clean, clear limits ("can use it in these contexts") than fuzzy ones.
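As a minimal sketch of what does compile under this rule (assuming Java 10 or later; the class and method names are made up for illustration and are not from the original answer):

```java
// Hypothetical demo class; all names here are illustrative only.
class CompoundVarDemo {
    static double sum() {
        // var x = 1, y = 2.0;   // does not compile: var cannot be used in a compound declaration
        var x = 1;     // inferred as int
        var y = 2.0;   // inferred as double
        return x + y;
    }

    static int explicitCompound() {
        int a = 1, b = 2;   // compound declarations still work with a manifest type
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(sum());              // 3.0
        System.out.println(explicitCompound()); // 3
    }
}
```

Writing one var declaration per line makes it unambiguous that each variable gets the type of its own initializer.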
Following JEP 286: Local-Variable Type Inference description
I am wondering, what the reason is for introducing such a restriction, as:
Main.java:199: error: cannot infer type for local variable k
var k = { 1 , 2 };
^
(array initializer needs an explicit target-type)
So for me logically it should be:
var k = {1, 2}; // Infers int[]
var l = {1, 2L, 3}; // Infers long[]
Because the Java compiler can already properly infer the type of an array:
void decide() {
arr(1, 2, 3); // call void arr(int ...arr)
arr(1, 2L, 3); // call void arr(long ...arr)
}
void arr(int ...arr) {
}
void arr(long ...arr) {
}
So what is the impediment?
Every time we improve the reach of type inference in Java, we get a spate of "but you could also infer this too, why don't you?" (Or sometimes, less politely.)
Some general observations on designing type inference schemes:
Inference schemes will always have limits; there are always cases at the margin where we cannot infer an answer, or end up inferring something surprising. The harder we try to infer everything, the more likely we will infer surprising things. This is not always the best tradeoff.
It's easy to cherry-pick examples of "but surely you can infer in this case." But if such cases are very similar to other cases that do not have an obvious answer, we've just moved the problem around -- "why does it work for X but not Y where X and Y are both Z?"
An inference scheme can always be made to handle incremental cases, but there is almost always collateral damage, either in the form of getting a worse result in other cases, increased instability (where seemingly unrelated changes can change the inferred type), or more complexity. You don't want to optimize just for number of cases you can infer; you want to optimize also for an educated user's ability to predict what will work and what will not. Drawing simpler lines (e.g., don't bother to try to infer the type of array initializers) often is a win here.
Given that there are always limits, it's often better to choose a smaller but better-defined target, because that simplifies the user model. (See related questions on "why can't I use type inference for the return type of private methods?" The answer is we could have done this, but the result would be a more complicated user model for small expressive benefit. We call this "poor return-on-complexity.")
From the mailing list platform-jep-discuss, message Reader Mail Bag for Thursday (Thu Mar 10 15:07:54 UTC 2016) by Brian Goetz:
Why is it not possible to use var when the initializer is an array initializer, as in:
var ints = { 1, 2, 3 }
The rule is: we derive the type of the variable by treating the initializer as a standalone expression, and deriving its type. However, array initializers, like lambdas and method refs, are poly expressions -- they need a target type in order to compute their type. So they are rejected.
Could we make this work? We probably could. But it would add a lot of complexity to the feature, for the benefit of a mostly corner case. We'd like for this to be a simple feature.
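The same target-type requirement can be seen with lambdas. The following is only an illustrative sketch (assuming Java 10 or later; the class and method names are invented for this example): a lambda works with var once a cast supplies the target type that the declaration no longer provides.

```java
import java.util.function.IntSupplier;

// Hypothetical demo class; names are illustrative only.
class PolyExpressionDemo {
    static int viaExplicitType() {
        IntSupplier s = () -> 42;   // the declared type is the lambda's target type
        return s.getAsInt();
    }

    static int viaCast() {
        // var s = () -> 42;              // does not compile: a lambda needs a target type
        var s = (IntSupplier) () -> 42;   // the cast supplies the target type
        return s.getAsInt();
    }

    public static void main(String[] args) {
        System.out.println(viaExplicitType()); // 42
        System.out.println(viaCast());         // 42
    }
}
```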
The short-hand array initializer takes its type information from the declaration, but as the declaration here is var, the array type must be specified explicitly.
You will need to choose between:
var k = new int[]{ 1 , 2 };
or
int[] k = { 1 , 2 };
Allowing var k = { 1 , 2 } would change the semantics of something that is already syntactic sugar. In the case of int[] n = { 1, 2 } the type is determined by the declaration. If you allow var n = { 1, 2 } the type is suddenly determined by the initializer itself. This might lead to (easier to create) compiler bugs or ambiguities.
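A small sketch of the two accepted forms, next to the overload example from the question (assuming Java 10 or later; the class and method names are hypothetical). Overload resolution only chooses between parameter types that are already declared, whereas a bare array initializer has no declared type to fall back on.

```java
// Hypothetical demo class; names are illustrative only.
class ArrayInitDemo {
    // Two varargs overloads, as in the question; each reports which one was chosen.
    static String arr(int... values)  { return "int[" + values.length + "]"; }
    static String arr(long... values) { return "long[" + values.length + "]"; }

    static int[] withVar() {
        // var k = {1, 2};          // does not compile: array initializer needs an explicit target type
        var k = new int[] {1, 2};   // an array creation expression has a standalone type: int[]
        return k;
    }

    static int[] withManifestType() {
        int[] k = {1, 2};           // the declaration supplies the type
        return k;
    }

    public static void main(String[] args) {
        System.out.println(withVar().length);          // 2
        System.out.println(withManifestType().length); // 2
        System.out.println(arr(1, 2, 3));              // int[3]
        System.out.println(arr(1, 2L, 3));             // long[3]
    }
}
```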
I would like to have a clear and precise understanding of the difference between the two.
Also, is the this keyword used to reference something implicitly or explicitly? This is another reason I want the distinction between the two clarified.
I assume that using the this keyword references something implicitly (something within the class), whilst an explicit reference is to something not belonging to the class itself, like a parameter variable being passed into a method.
Of course my assumptions could obviously be wrong which is why I'm here asking for clarification.
Explicit means done by the programmer.
Implicit means done by the compiler, the JVM, or some other tool, not by the programmer.
For example:
Java provides a default constructor implicitly. Even if the programmer did not write any constructor, the default constructor can still be called.
Explicit is the opposite: the programmer has to write it.
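To illustrate the constructor example, here is a minimal sketch (class and method names are made up for illustration): one class declares no constructor and relies on the implicitly supplied one, the other declares its constructor explicitly.

```java
// Hypothetical demo class; names are illustrative only.
class DefaultConstructorDemo {
    // No constructor is declared, so a no-argument default constructor is supplied implicitly.
    static class Implicit {
        int value;   // an int field defaults to 0
    }

    // Here the programmer writes the constructor explicitly.
    static class Explicit {
        final int value;
        Explicit(int value) { this.value = value; }
    }

    static int implicitValue() { return new Implicit().value; }
    static int explicitValue() { return new Explicit(7).value; }

    public static void main(String[] args) {
        System.out.println(implicitValue()); // 0
        System.out.println(explicitValue()); // 7
    }
}
```

Note that once a class declares any constructor of its own, the implicit default constructor is no longer provided.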
You already have your answer, but I would like to add a few more points.
Implicit: something that is already available in your programming language, such as built-in methods, classes, data types, etc.
- Implicit code removes difficulties for the programmer and saves development time.
- It provides optimised code, and so on.
Explicit: something created by the programmer (you) to meet your requirements, such as your app's classes and methods like getName(), setName(), etc.
Finally, put simply:
Predefined code that helps programmers build their apps and programs is known as implicit; code written by the programmer (you) to fulfil a requirement is known as explicit.
Implicit casting (widening conversion)
A value of a smaller data type (occupying less memory) is assigned to a variable of a larger data type. The compiler does this implicitly: the smaller type is widened to the larger one. This is also called automatic type conversion.
Examples:
int x = 10; // occupies 4 bytes
double y = x; // occupies 8 bytes
System.out.println(y); // prints 10.0
In the above code, a 4-byte int value is assigned to an 8-byte double variable.
Explicit casting (narrowing conversion)
A value of a larger data type (occupying more memory) cannot be assigned directly to a variable of a smaller data type. The compiler does not do this implicitly; the programmer must perform an explicit cast. The larger type is narrowed to the smaller one.
double x = 10.5; // 8 bytes
int y = x; // 4 bytes ; raises compilation error
In the above code, an 8-byte double value is being narrowed to a 4-byte int. This raises a compilation error, so let us cast it explicitly.
double x = 10.5;
int y = (int) x;
The double x is explicitly converted to int y. The rule of thumb is that the same data type should appear on both sides of the assignment.
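Both conversions can be put side by side in a short sketch (the class and method names are invented for illustration). The explicit cast from double to int discards the fractional part.

```java
// Hypothetical demo class; names are illustrative only.
class CastingDemo {
    // Implicit widening: an int is accepted wherever a double is expected.
    static double widen(int x) {
        return x;
    }

    // Explicit narrowing: the cast is required and truncates toward zero.
    static int narrow(double x) {
        return (int) x;
    }

    public static void main(String[] args) {
        System.out.println(widen(10));    // 10.0
        System.out.println(narrow(10.5)); // 10
    }
}
```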
I'll try to provide an example of a similar functionality across different programming languages to differentiate between implicit & explicit.
Implicit: When something is available as a feature or aspect of the programming language constructs being used, and you have to do nothing but call the respective functionality directly through the API or interface.
For example, garbage collection in Java happens implicitly. The JVM does it for us at an appropriate time.
Explicit: When user or programmer intervention is required to invoke a specific functionality, without which the desired action won't take place.
For example, in C++, memory must be freed explicitly (the manual counterpart of garbage collection) by calling the delete operator, or free() for memory obtained with malloc.
Hope this helps you understand the difference clearly.
This was way more complicated than I think it needed to be:
explicit = the label name of an index (label-based indexing)
example:
df['index label name']
vs
implicit = the integer position of an index (zero-based indexing)
example:
df[0]
Assume a scenario where we could either use a local variable or access an instance variable directly. Which should I choose?
e.g.:
inside foo.java
int x = bar.getX();
int y = x * 5;
The advantage is that we don't pay the performance cost of accessing the object's variable over and over.
or
inside foo.java
int y = bar.getX() * 5;
The advantage of the second method is that we don't need to create and maintain another variable.
Do whatever is more readable.
Usually the shorter code is easier to read, comprehend, and show correct, but if an expression becomes long and the intermediate expressions make sense on their own, then introduce a local variable to make the code clearer.
As far as performance goes, you don't know that there will be any performance hit one way or the other unless you profile the code. Optimization in Java happens at compile time but also at runtime, with modern JIT compilation. Compilers can do inlining, common subexpression elimination, and all kinds of algebraic transformations and strength reductions, so I doubt that your example will show any performance impact one way or the other.
And unless an expression like this is executed a billion times or more in a loop, readability wins.
The rule of thumb is to reduce the scope of a variable as much as you can.
If you need access to the variable across all the methods, use an instance member. Otherwise simply use a local variable.
If you are using your variable X across methods, define it as a member.
As for the performance impact: with int x = obj.getX(); you are simply creating a copy of the value and nothing else.
You'll see the difference once you use an object rather than a primitive; then the reference matters.
So in your case the first form, int x = obj.getX();, is a little redundant. The second is more readable.
Can someone explain to me why the first of the following two samples compiles, while the second doesn't? Notice the only difference is that the first one explicitly qualifies the reference to x with '.this', while the second doesn't. In both cases, the final field x is clearly attempted to be used before initialized.
I would have thought both samples would be treated completely equally, resulting in a compilation error for both.
1)
public class Foo {
private final int x;
private Foo() {
int y = 2 * this.x;
x = 5;
}
}
2)
public class Foo {
private final int x;
private Foo() {
int y = 2 * x;
x = 5;
}
}
After a bunch of spec-reading and thought, I've concluded that:
In a Java 5 or Java 6 compiler, this is correct behavior. Chapter 16 "Definite Assignment" of The Java Language Specification, Third Edition says:
Each local variable (§14.4) and every blank final (§4.12.4) field (§8.3.1.2) must have a definitely assigned value when any access of its value occurs. An access to its value consists of the simple name of the variable occurring anywhere in an expression except as the left-hand operand of the simple assignment operator =.
(emphasis mine). So in the expression 2 * this.x, the this.x part is not considered an "access of [x's] value" (and therefore is not subject to the rules of definite assignment), because this.x is not the simple name of the instance variable x. (N.B. the rule for when definite assignment occurs, in the paragraph after the above-quoted text, does allow something like this.x = 3, and considers x to be definitely assigned thereafter; it's only the rule for accesses that doesn't count this.x.) Note that the value of this.x in this case will be zero, per §17.5.2.
In a Java 7 compiler, this is a compiler bug, but an understandable one. Chapter 16 "Definite Assignment" of the Java Language Specification, Java SE 7 Edition says:
Each local variable (§14.4) and every blank final field (§4.12.4, §8.3.1.2) must have a definitely assigned value when any access of its value occurs.
An access to its value consists of the simple name of the variable (or, for a field, the simple name of the field qualified by this) occurring anywhere in an expression except as the left-hand operand of the simple assignment operator = (§15.26.1).
(emphasis mine). So in the expression 2 * this.x, the this.x part should be considered an "access to [x's] value", and should give a compile error.
But you didn't ask whether the first one should compile, you asked why it does compile (in some compilers). This is necessarily speculative, but I'll make two guesses:
Most Java 7 compilers were written by modifying Java 6 compilers. Some compiler-writers may not have noticed this change. Furthermore, many Java-7 compilers and IDEs still support Java 6, and some compiler-writers may not have felt motivated to specifically reject something in Java-7 mode that they accept in Java-6 mode.
The new Java 7 behavior is strangely inconsistent. Something like (false ? null : this).x is still allowed, and for that matter, even (this).x is still allowed; it's only the specific token-sequence this plus . plus the field-name that's affected by this change. Granted, such an inconsistency already existed on the left-hand side of an assignment statement (we can write this.x = 3, but not (this).x = 3), but that's more readily understandable: it's accepting this.x = 3 as a special permitted case of the otherwise forbidden construction obj.x = 3. It makes sense to allow that. But I don't think it makes sense to reject 2 * this.x as a special forbidden case of the otherwise permitted construction 2 * obj.x, given that (1) this special forbidden case is easily worked around by adding parentheses, that (2) this special forbidden case was allowed in previous versions of the language, and that (3) we still need the special rule whereby final fields have their default values (e.g. 0 for an int) until they're initialized, both because of cases like (this).x, and because of cases like this.foo() where foo() is a method that accesses x. So some compiler-writers may not have felt motivated to make this inconsistent change.
Either of these would be surprising — I assume that compiler-writers had detailed information about every single change to the spec, and in my experience Java compilers are usually pretty good about sticking to the spec exactly (unlike some languages, where every compiler has its own dialect) — but, well, something happened, and the above are my only two guesses.
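Point (3) above, that a final field still holds its default value until it is initialized, can be observed without relying on the disputed this.x form. The following is only an illustrative sketch (the class, field, and method names are hypothetical): the constructor reads x through an instance method before and after assigning it.

```java
// Hypothetical demo class; names are illustrative only.
class FinalFieldDemo {
    private final int x;
    final int before;   // value of x observed before the assignment
    final int after;    // value of x observed after the assignment

    FinalFieldDemo() {
        before = readX();   // x is not yet assigned, so the default value 0 is read
        x = 5;
        after = readX();    // now the assigned value 5 is read
    }

    // Definite assignment is not checked here: this is an ordinary method, not the constructor.
    private int readX() {
        return x;
    }

    public static void main(String[] args) {
        FinalFieldDemo d = new FinalFieldDemo();
        System.out.println(d.before); // 0
        System.out.println(d.after);  // 5
    }
}
```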
When you use this in the constructor, the compiler sees x as a member attribute of this object (default initialized). Since x is an int, it is default initialized to 0. This makes the compiler happy, and it works fine at run time too.
When you don't use this, the compiler uses the declaration of x directly during lexical analysis and hence complains about its initialization (a compile-time phenomenon).
So it is the definition of this that makes the compiler analyze x as a member variable of an object rather than as a direct attribute during lexical analysis, resulting in different compilation behavior.
When used as a primary expression, the keyword this denotes a value that is a reference to the object for which the instance method was invoked (§15.12), or to the object being constructed.
I think the compiler assumes that writing this.x implies 'this' exists, so a constructor has been called (and the final variable has been initialized).
But you should get a RuntimeException when trying to run it
I assume you are referring to the behaviour in Eclipse. (As stated in a comment, compiling with javac works.)
I think this is an Eclipse problem. It has its own compiler and its own set of rules. One of them is that you may not access a field which is not initialized, although the Java compiler would initialize variables for you.
When I find myself calling the same getter method multiple times, should this be considered a problem? Is it better to [always] assign to a local variable and call only once?
I'm sure the answer of course is "it depends".
I'm more concerned about the simpler case where the getter is simply a "pass-along-the-value-of-a-private-variable" type method. i.e. there's no expensive computation involved, no database connections being consumed, etc.
My question of "is it better" pertains to both code readability (style) and also performance. i.e. is it that much of a performance hit to have:
SomeMethod1(a, b, foo.getX(), c);
SomeMethod2(b, foo.getX(), c);
SomeMethod3(foo.getX());
vs:
X x = foo.getX();
SomeMethod1(a, b, x, c);
SomeMethod2(b, x, c);
SomeMethod3(x);
I realize this question is a bit nit-picky and gray. But I just realized, I have no consistent way of evaluating these trade-offs, at all. Am fishing for some criteria that are more than just completely whimsical.
Thanks.
The choice shouldn't really be about performance hit but about code readability.
When you create a variable you can give it the name it deserves in the current context. When you use the same value more than once, it surely has a real meaning, beyond a method name (or worse, a chain of methods).
And it's really better to read:
String username = user.getName();
SomeMethod1(a, b, username, c);
SomeMethod2(b, username, c);
SomeMethod3(username);
than
SomeMethod1(a, b, user.getName(), c);
SomeMethod2(b, user.getName(), c);
SomeMethod3(user.getName());
For plain getters - those that just return a value - HotSpot inlines them into the calling code, so it will be as fast as it can be.
I, however, have a principle about keeping a statement on a single line, which very often results in expressions like "foo.getBar()" being too long to fit. Then it is more readable - to me - to extract it to a local variable ("Bar bar = foo.getBar()").
They could be two different things.
If getX is non-deterministic, then the first one will give different results than the second.
Personally, I'd use the 2nd one. It's more obvious and less unnecessarily verbose.
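A contrived sketch of why the two styles are not always interchangeable (the Counter class and its getX() are invented for illustration and deliberately break the usual getter convention by changing state on every call):

```java
// Hypothetical demo class; names are illustrative only.
class GetterDemo {
    static class Counter {
        private int next = 0;

        // Not a plain getter: every call returns a different value.
        int getX() {
            return next++;
        }
    }

    // Calls the getter twice, so two different values are seen.
    static int callTwice() {
        Counter c = new Counter();
        return c.getX() + c.getX();   // 0 + 1
    }

    // Calls the getter once and reuses the local copy.
    static int callOnce() {
        Counter c = new Counter();
        int x = c.getX();
        return x + x;                 // 0 + 0
    }

    public static void main(String[] args) {
        System.out.println(callTwice()); // 1
        System.out.println(callOnce());  // 0
    }
}
```

For a getter that simply returns a field, both styles produce the same result, and the choice comes down to readability.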
I use the second style if it makes my code more readable or if I have to use the assigned value again. I never consider performance (on trivial things) unless I have to.
That depends on what getX() actually does. Consider this class:
public class Foo {
private X x;
public X getX() { return x; }
}
In this case, when you make a call to foo.getX(), JVM will optimize it all the way down to foo.x (as in direct reference to foo's private field, basically a memory pointer). However, if the class looks like this:
public class Foo {
private X x;
public X getX() { return cleanUpValue(x); }
private X cleanUpValue(X x) {
/* some modifications/sanitization to x such as null safety checks */
}
}
the JVM can't actually inline it as efficiently anymore since by Foo's constructional contract, it has to sanitize x before handing it out.
To summarize, if getX() doesn't really do anything beyond returning a field, then once the initial optimization passes have run there's no difference between calling the method just once or multiple times.
Most of the time I would use getX if it was only once, and create a var for it for all other cases. Often just to save typing.
With regards to performance, the compiler would probably be able to optimize away most of the overhead, but the possibility of side-effects could force the compiler into more work when doing multiple method-calls.
I generally store it locally if:
I will use it in a loop and I don't want or expect the value to change during the loop.
I'm about to use it in a long line of code or the function & parameters are very long.
I want to rename the variable to better correspond to the task at hand.
Testing indicates a significant performance boost.
Otherwise I like the ability to get current values and lower level of abstraction of method calls.
Two things have to be considered:
Does the call to getX() have any side effects? Following established coding patterns, a getter should not alter the object on which it is called, so in most cases there is no side effect. Therefore, it is semantically equivalent to call the getter once and store the value locally vs. calling the getter multiple times. (This concept is called idempotency - it does not matter whether you call a method once or multiple times; the effect on the data is exactly the same.)
If the getter has no side effect, the compiler can safely remove subsequent calls to the getter and create the temporary local storage on its own - thus, the code remains ultra-readable and you have all the speed advantage from calling the getter only once. This is all the more important if the getter does not simply return a value but has to fetch/compute the value or runs some validations.
Assuming your getter does not change the object on which it operates, it is probably more readable to have multiple calls to getX() - and thanks to the compiler you do not have to trade performance for readability and maintainability.