// Assume there is a Java class called Node, and node is an array of Node
for(Node i: node){
Node j = i;
}
Node j;
for(Node i: node){
j = i;
}
Can someone please explain what the essential difference between these two is?
Node j; on its own isn't 'code', in the sense that it results in zero bytecode - it does nothing on its own. It's a 'declaration'. It's an instruction to the compiler.
You're telling the compiler: There is this thing; I shall call it j, and it is a local variable. Please fail my compilation unless you can 100% ensure that this variable only ever holds either null or "a reference to an object whose actual type is either Node or some subtype of Node". I decline to assign something to this variable right now; please fail to compile if I attempt to access it before you can 100% guarantee that some code has already run that definitely assigned it first.
That's a ton of instructions for the compiler and absolutely none of it requires any code right then and there - it's just instructions for what to do in other lines of code - for example, in your second snippet, without that Node j; at the top, j = i; is not something that the compiler can do for you. It has no idea what you intend to do there; it does not know what j is, you did not inform the compiler about it, and in java, the language spec indicates that you can't just make up variable names without first telling the compiler about them.
Thus, bytecode wise, the exact difference between your 2 specific snippets here is nothing - local variables are a figment of javac's imagination: As a concept, 'local variable' doesn't exist, at all, at the JVM/bytecode level: It's just 'slots' in the local frame (i.e. local variables have a number, not a name), or on-stack space.
In other words, to figure out what the difference might be, it's entirely a matter of answering the question: How does javac interpret these 2 snippets? Because bytecode wise there is not going to be any difference, ever. Node j; simply cannot be bytecode - there is no such thing as a 'local variable' there, after all.
The only crucial difference is that any given local variable declaration (such as Node j;) declares its existence for a certain scope: It says: "There is this thing called j for these specific lines and not for any others". That scope runs from the declaration to the end of the nearest enclosing pair of braces. So, in:
void foo() {
Node j;
for(Node i: node){
j = i;
}
}
You have declared that there is one local variable named j, and this applies to the entire foo() method (given that those are the nearest pair of braces when you look 'outwards' from your Node j; declaration).
Whereas with:
void foo() {
for(Node i: node){
Node j = i;
}
}
The 'scope' of your j declaration is that for loop - after all, that's the 'nearest pair'. In other words, this is almost legal:
void foo() {
Node j;
for(Node i: node){
j = i;
}
System.out.println(j);
}
Except for one tiny little detail: The compiler cannot 100% guarantee that j is actually assigned when we get to the println line - after all, what if your node collection is actually empty? Then j = i; ran zero times. So if you try to compile the above, the compiler will actually refuse, saying j is not definitely assigned. But make that Node j = null; and the above code would compile and run. In contrast to:
void foo() {
for(Node i: node){
Node j = i;
}
System.out.println(j);
}
This one fails to compile, with an error along the lines of: j? What are you talking about? The j you declared here only applies to the lines within the {} that goes with the for loop. The mention of j in the println line therefore refers to an unknown thing, and the compiler will complain about it in those terms: What is j? (javac words it as "cannot find symbol".)
So what do I do?
Follow conventions: When in Rome, do as the Romans do. When programming Java, write it the way the vast majority of Java coders would. Which means: Declare local variables at the latest possible point, at the right scope level. Thus, something like:
void foo() {
System.out.println("Starting method");
Node lastVisited = null;
for (Node i : nodes) {
lastVisited = i;
}
System.out.println("Last visited: " + lastVisited);
}
Don't declare lastVisited at the top of the method.
Do declare it before the for loop as we need it outside of it.
Names are important.
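As a minimal, self-contained sketch of that advice: the Node class below is a hypothetical stand-in for the one in the question, and lastVisited is an illustrative helper name, not something from the original code. The variable is declared right before the loop and initialized to null, so it is definitely assigned even when the collection is empty.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LastVisitedDemo {
    // Hypothetical stand-in for the question's Node class.
    static final class Node {
        final String name;
        Node(String name) { this.name = name; }
    }

    static Node lastVisited(List<Node> nodes) {
        // Declared just before the loop because it is needed after the loop.
        // The null initializer makes it definitely assigned for an empty list too.
        Node last = null;
        for (Node current : nodes) {
            last = current;
        }
        return last;
    }

    public static void main(String[] args) {
        Node last = lastVisited(Arrays.asList(new Node("a"), new Node("b")));
        System.out.println(last.name); // prints "b"
        System.out.println(lastVisited(Collections.<Node>emptyList())); // prints "null"
    }
}
```

Removing the `= null` initializer would make the `return last;` line fail to compile, for the same reason as the println example above.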
Question related to frames of function and effective finality of variables.
int lenMin = list.get(0).length();
for (String s : list)
if (s.length() < lenMin)
lenMin = s.length();
list
.stream()
.filter(s -> lenMin == s.length())
.forEach(System.out::println);
st:1. here we create a variable.
st: 3-5 here we do something with it (change it).
st:9 the lenMin variable is underlined in red.
I understand that an effectively final variable cannot be changed, but why can't it even be used here? I know that the lambda has a new frame, but if we do this:
int len = lenMin;
list
.stream()
.filter(s -> len == s.length())
.forEach(System.out::println);
then there will be no error.
Please explain why this is happening?
Functions that can close over variables need a way to capture those variables.
If a variable can't change, then you can easily capture variables by copying their values.
Java does exactly that. Java lambdas and inner classes simply correspond to classes that have an extra field for each captured variable.
The JVM only understands flat classes; nested and inner classes are compiler fictions.
Runnable f() {
int answer = 42;
class C implements Runnable {
public void run() { System.out.println(answer); }
}
return new C();
}
compiles to something equivalent to
class C implements Runnable {
private final int answer; // Capturing field
C(int answer) { this.answer = answer; }
public void run() { System.out.println(this.answer); }
}
Runnable f() {
int answer = 42;
return new C(answer);
}
This change is called closure conversion.
As long as answer is only assigned once, javac can do simple closure conversion: copying answer into any classes and lambdas that need it.
But simple copying doesn't work if a local variable might change after being captured.
int answer = 42;
Runnable c = new C();
answer += 1;
c.run(); // Should this print 42 or 43?
Programmers familiar with languages that allow for reassignment of closed-over variables would get confused if the above didn't behave like the JavaScript below.
let answer = 42;
let c = () => console.log(answer);
answer += 1;
c(); // Logs 43
Java could allow for mutable closed-over variables by doing more complicated closure conversion. For example, instead of copying answer, it could store answer in a shared object and rewrite reads and writes of the variable to be reads and writes of a property or element of that shared object.
For example,
int[] answerBox = new int[1];
answerBox[0] = 42; // assignment
...
answerBox[0] = 43; // re-assignment
...
System.out.println(answerBox[0]);
...
Java's designers decided not to do that though; capturing by reference the way JavaScript does would introduce a host of subtle complications. JavaScript is single threaded, but the JVM is not, so imagine a closed-over long variable suddenly being assigned from more than one thread.
This is not hard to work around though as long as you don't need to mutate captured state after creating the capturing value.
Runnable f() {
int x = 1;
x += 1;
return () -> System.out.println(x); // ERROR
}
The above fails, but the below works; we simply store the state of the non-effectively-final x in an effectively final finalX that the lambda can capture.
Runnable f() {
int x = 1;
x += 1;
int finalX = x;
return () -> System.out.println(finalX); // OK
}
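If the captured state really does have to change after the lambda is created, the shared-box idea described above can be done by hand. The following is only a sketch under that assumption; the class and method names are made up for illustration. In both variants the local variable box is assigned once (so it is effectively final), and only the contents of the object it refers to change.

```java
import java.util.concurrent.atomic.AtomicInteger;

class MutableCaptureDemo {
    // Variant 1: a one-element array acts as the shared box.
    static int incrementViaArrayBox() {
        int[] box = {42};                 // 'box' itself is never reassigned
        Runnable increment = () -> box[0] += 1;
        increment.run();
        return box[0];
    }

    // Variant 2: AtomicInteger as the box, which is also safe across threads.
    static int incrementViaAtomic() {
        AtomicInteger box = new AtomicInteger(42);
        Runnable increment = box::incrementAndGet;
        increment.run();
        return box.get();
    }

    public static void main(String[] args) {
        System.out.println(incrementViaArrayBox()); // prints 43
        System.out.println(incrementViaAtomic());   // prints 43
    }
}
```

The array variant has no thread-safety guarantees, which is one of the complications the language avoids by not doing this conversion automatically.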
So the question is why one has to introduce a new copy just to get a variable that is effectively final (that is, one that could be declared final). Here that copy is len.
int len = lenMin;
list
.stream()
.filter(s -> len == s.length())
.forEach(System.out::println);
The reason is that the lambda (inside filter) is an instance of an anonymously implemented interface. Because len is a local variable, the lambda copies its value into a field of its own with the same name, len.
But now there are two variables (with the same name). As the lambda could in theory run in another thread, two threads could change the same-named variables differently. The language designers did not want to allow two lens with different values, and simply required that the variable is not changed, i.e. that it is effectively final.
Here it would not be dangerously misleading, but assume you later assign to len at the end or inside a lambda, and perhaps use .parallelStream(). Then suddenly a compile error must be given at an unchanged piece of code.
That is an additionally sufficient reason to keep the rule simple: a local variable used inside a lambda must be effectively final.
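One way to sidestep the extra copy in the question's example is to compute the minimum in a single expression, so that lenMin is assigned exactly once and is effectively final from the start. This is a sketch, not the asker's code: the class name, the method name shortest, and the choice of 0 as the fallback for an empty list are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ShortestStringsDemo {
    static List<String> shortest(List<String> list) {
        // Assigned once and never changed, so the lambda below may capture it.
        int lenMin = list.stream().mapToInt(String::length).min().orElse(0);
        return list.stream()
                   .filter(s -> s.length() == lenMin)
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(shortest(Arrays.asList("pear", "fig", "kiwi", "yam")));
        // prints [fig, yam]
    }
}
```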
When I try to compile this:
public static Rand searchCount (int[] x)
{
int a ;
int b ;
...
for (int l= 0; l<x.length; l++)
{
if (x[l] == 0)
a++ ;
else if (x[l] == 1)
b++ ;
}
...
}
I get these errors:
Rand.java:72: variable a might not have been initialized
a++ ;
^
Rand.java:74: variable b might not have been initialized
b++ ;
^
2 errors
It seems to me that I initialized them at the top of the method. What's going wrong?
You declared them, but you didn't initialize them. Initializing them is setting them equal to a value:
int a; // This is a declaration
a = 0; // This is an initialization
int b = 1; // This is a declaration and initialization
You get the error because you haven't initialized the variables, but you increment them (e.g., a++) in the for loop.
Java primitives have default values but as one user commented below
Their default value is zero when declared as class members. Local variables don't have default values
Local variables do not get default values. Their initial values are undefined until a value is assigned by some means. Before you can use local variables they must be initialized.
There is a big difference when you declare a variable at class level (as a member ie. as a field) and at method level.
If you declare a field at class level, it gets a default value according to its type. If you declare a variable at method level or inside a block (meaning any code inside {}), it does not get any value and remains undefined until it is assigned one.
If they were declared as fields of the class then they would be really initialized with 0.
You're a bit confused because if you write:
class Clazz {
int a;
int b;
Clazz () {
super ();
b = 0;
}
public void printA () {
System.out.println (a + b);
}
public static void main (String[] args) {
new Clazz ().printA ();
}
}
Then this code will print "0". That is because fields, unlike local variables, receive their default value (0 for an int) when a new instance of Clazz is created, before any constructor code runs. The constructor then calls super (), and finally the line b = 0 is executed; a is never assigned explicitly and simply keeps its default.
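To make the field defaults concrete, here is a small sketch; the class and field names are invented for the example. None of the fields has an initializer, yet all of them can be read, because fields start out with the default value of their type.

```java
class DefaultValuesDemo {
    int count;        // defaults to 0
    boolean flag;     // defaults to false
    double ratio;     // defaults to 0.0
    String name;      // defaults to null

    static String describe() {
        DefaultValuesDemo d = new DefaultValuesDemo();
        return d.count + " " + d.flag + " " + d.ratio + " " + d.name;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "0 false 0.0 null"
        // int local;
        // System.out.println(local);   // would not compile: local has no default
    }
}
```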
You declared them, but did not initialize them.
int a; // declaration, unknown value
a = 0; // initialization
int a = 0; // declaration with initialization
You declared them, but you didn't initialize them with a value. Add something like this:
int a = 0;
You declared them but did not provide them with an initial value, so they're uninitialized. Try something like:
public static Rand searchCount (int[] x)
{
int a = 0 ;
int b = 0 ;
and the errors should go away.
Since no other answer has cited the Java language standard, I have decided to write an answer of my own:
In Java, local variables are not initialized with a default value (unlike, for example, the fields of classes). The language specification (§4.12.5) says the following:
A local variable (§14.4, §14.14) must be explicitly given a value
before it is used, by either initialization (§14.4) or assignment
(§15.26), in a way that can be verified using the rules for definite
assignment (§16 (Definite Assignment)).
Therefore, since the variables a and b are not initialized :
for (int l= 0; l<x.length; l++)
{
if (x[l] == 0)
a++ ;
else if (x[l] == 1)
b++ ;
}
the operations a++; and b++; could not produce any meaningful results, anyway. So it is logical for the compiler to notify you about it:
Rand.java:72: variable a might not have been initialized
a++ ;
^
Rand.java:74: variable b might not have been initialized
b++ ;
^
However, one needs to understand that the fact that a++; and b++; could not produce any meaningful results is not the reason why the compiler displays an error. The reason is that the Java language specification explicitly requires that
A local variable (§14.4, §14.14) must be explicitly given a value (...)
To showcase the aforementioned point, let us change your code a bit to:
public static Rand searchCount (int[] x)
{
if(x == null || x.length == 0)
return null;
int a ;
int b ;
...
for (int l= 0; l<x.length; l++)
{
if(l == 0)
a = l;
if(l == 1)
b = l;
}
...
}
So even though it can be formally proven that a is always assigned in the code above (the early return guarantees at least one loop iteration, in which a is set to 0; b is set to 1 whenever the array has at least two elements), it is not the compiler's job to analyze your application's logic, and neither do the rules of local variable initialization rely on that. The compiler checks whether the variables a and b are initialized according to the local variable initialization rules, and reacts accordingly (e.g., by displaying a compilation error).
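For contrast, here is a sketch of a shape that the definite assignment rules do accept without an initializer: every path through the if/else chain assigns the variable before it is read. The class and method names are made up for the example.

```java
class DefiniteAssignmentDemo {
    static String classify(int value) {
        String label;              // declared, deliberately not initialized
        if (value < 0) {
            label = "negative";
        } else if (value == 0) {
            label = "zero";
        } else {
            label = "positive";
        }
        // Compiles: the compiler can see that every branch assigned label.
        return label;
    }

    public static void main(String[] args) {
        System.out.println(classify(-3)); // prints "negative"
        System.out.println(classify(0));  // prints "zero"
        System.out.println(classify(8));  // prints "positive"
    }
}
```

Dropping the final else branch would bring the "might not have been initialized" error back, because a path without an assignment would then exist. A loop body never counts, since the compiler assumes a loop may run zero times.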
You declared them at the start of the method, but you never initialized them. Initializing would be setting them equal to a value, such as:
int a = 0;
int b = 0;
Imagine what happens if x[l] is neither 0 nor 1 in the loop. In that case a and b will never be assigned to and have an undefined value.
You must initialize them both with some value, for example 0.
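A complete version of the counting loop with both counters initialized could look like the sketch below. The question's Rand return type and the elided parts of the method are unknown, so this example returns a plain int array instead; that return type and the names are assumptions for illustration only.

```java
import java.util.Arrays;

class CountDemo {
    // Returns {number of zeros, number of ones}; a stand-in for the unknown Rand type.
    static int[] countZerosAndOnes(int[] x) {
        int zeros = 0;   // initialized, so zeros++ is legal
        int ones = 0;    // initialized, so ones++ is legal
        for (int value : x) {
            if (value == 0) {
                zeros++;
            } else if (value == 1) {
                ones++;
            }
        }
        return new int[] {zeros, ones};
    }

    public static void main(String[] args) {
        int[] counts = countZerosAndOnes(new int[] {0, 1, 1, 2, 0, 1});
        System.out.println(Arrays.toString(counts)); // prints [2, 3]
    }
}
```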
It's good practice to initialize local variables inside the method block before using them. Here is a mistake that a beginner may make.
public static void main(String[] args){
int a;
int[] arr = {1,2,3,4,5};
for(int i=0; i<arr.length; i++){
a = arr[i];
}
System.out.println(a);
}
You may expect the console to show '5', but instead the compiler will report a 'variable a might not have been initialized' error. Though one may think variable a is 'initialized' inside the for loop, the compiler does not see it that way. What if arr.length is 0? The for loop will not run at all. Hence, the compiler reports variable a might not have been initialized to point out the potential danger and requires you to initialize the variable.
To prevent this kind of error, just initialize the variable when you declare it.
int a = 0;
You haven't initialised a and b, only declared them. There is a subtle difference.
int a = 0;
int b = 0;
At least this is for C++, I presume Java is the same concept.
Set variable "a" to some value like this,
a=0;
Declaring and initializing are two different things.
Good Luck
I have been learning about Java bytecode recently, and I understand most of it, but I am confused about how, for example, the local variable count is determined. I thought it would just be the total number of local variables, but this code generates 1 local variable when looking through the bytecode
public int testFail()
{
return 1;
}
But I thought it should be zero local variables, because no local variables are defined.
Additionally, this method also generates one local variable, even though it has more local variables than the previous example:
public static int testFail(int a)
{
return a;
}
Finally, this method
public static int testFail(int a, int b)
{
return a+b;
}
generates two local variables in the bytecode.
Non-static methods use a local variable slot for this. Another complication is that longs and doubles count as 2 each. Also, depending on your compiler and settings, you may not see a one-to-one mapping between local variables in the source code and local variables in the byte code. For example, if debug information is left out, the compiler may eliminate unnecessary local variables.
Edit:
I just remembered: compilers may also re-use local variable slots. For example, given this code:
public static void test() {
for(int i = 0; i < 100; i++) {
...
}
for(int j = 0; j < 100; j++) {
}
}
the same slot can be used for i and j because their scopes don't overlap.
The reason the first one has a local variable is because it is a nonstatic method, so there is an implicit this parameter.
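The slot rules can be summarized with a few tiny methods. The comments give the locals value that I would expect `javap -v` to report for each method when compiled with javac; other compilers or settings may differ, and the class and method names are invented for this sketch.

```java
class SlotCountDemo {
    // locals=1: instance method, slot 0 holds 'this'
    int instanceNoArgs() { return 1; }

    // locals=0: static, no parameters, no local variables
    static int staticNoArgs() { return 1; }

    // locals=1: static, one int parameter in slot 0
    static int staticOneInt(int a) { return a; }

    // locals=2: static, two int parameters in slots 0 and 1
    static int staticTwoInts(int a, int b) { return a + b; }

    // locals=2: a long (like a double) occupies two slots
    static long staticOneLong(long a) { return a; }

    public static void main(String[] args) {
        System.out.println(new SlotCountDemo().instanceNoArgs()); // prints 1
        System.out.println(staticTwoInts(2, 3));                  // prints 5
    }
}
```

Compiling this class and running `javap -v SlotCountDemo` shows a `locals=` figure per method, which is the count the question refers to.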
I have problem understanding the order in which initialization happens. this is the order I assumed:
*Once per class
1. Static variable declaration
2. Static block
*Once per object
3. variable declaration
4. initialization block
5. constructor
but according to this code I am obviously wrong:
class SomethingWrongWithMe
{
{
b=0; //no. no error here.
int a = b; //Error: Cannot reference a field before it is defined.
}
int b = 0;
}
And the error would disappear if I do this:
class SomethingWrongWithMe
{
int b = 0;
{
b=0;
int a = b; //The error is gone.
}
}
I can't figure out why isn't there an error on
b=0;
The Java Language Specification (section 8.3.2.3) says you can use a variable on the left hand side of an expression, i.e. assign to it, before it is declared, but you cannot use it on the right hand side.
All variables are initialized to their default values, then explicit initializers and anonymous blocks are run in the order they are found in the source file. Finally the constructor is called.
Statics are only run once on the first use of a class.
The compile error appears to be a rule of Java rather than something that necessarily makes sense in every case.
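The restriction only applies to reads through the simple name. A read through this.b is accepted, which makes it possible to observe the "default value first, then initializers in source order" behaviour directly. The following is a sketch with invented names:

```java
class ForwardReferenceDemo {
    int beforeInitializer = this.b;  // qualified read compiles; b still holds its default, 0
    int afterBlockWrite;
    {
        b = 1;                       // plain assignment before the declaration: allowed
        afterBlockWrite = this.b;    // sees the 1 written on the previous line
        // int a = b;                // simple-name read here is rejected by the compiler
    }
    int b = 5;                       // runs last, overwriting the 1

    public static void main(String[] args) {
        ForwardReferenceDemo d = new ForwardReferenceDemo();
        System.out.println(d.beforeInitializer + " " + d.afterBlockWrite + " " + d.b);
        // prints "0 1 5"
    }
}
```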
Variable definitions are not done "before" blocks. Field initializers and initializer blocks are run together, in the order in which they appear in the source:
class SomethingWrongWithMe {
{
b = debug("block 1");
}
int b = debug("define");
{
b = debug("block 2");
}
private int debug(String str) {
System.out.println(str);
return 0;
}
}
Output
block 1
define
block 2
First of all, your assumptions are more or less correct, except for the fact that declarations (with initialization, such as int b = 0) and instance initializer blocks are executed in the order they are written.
int b = 0; // executed first
{
b = 1; // executed second
}
int a = b; // executed third
Also note that the declaration i.e. int b is not executed. The declaration just declares the existence of the variable.
As for the error you got (or, rather the error you didn't get) I agree that it looks strange. I assume that the compiler deals with referencing a variable in an expression and assigning a value to it in different ways. When writing to a variable in an instance initializer, it just checks that the variable is there, while when reading from it, it requires it to be declared above the instance initializer block. I'll see if I can find a reference for that in the JLS.
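The overall order, including the static part, can be observed by recording each step in a list. This is only an illustrative sketch; the class name, the LOG list and the log helper are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    // Declared first so that it exists before the other static initializers run.
    static final List<String> LOG = new ArrayList<>();

    static int s = log("static field");
    static { log("static block"); }

    int a = log("instance field a");
    { log("instance block"); }
    int b = log("instance field b");

    InitOrderDemo() { log("constructor body"); }

    static int log(String event) {
        LOG.add(event);
        return 0;
    }

    public static void main(String[] args) {
        new InitOrderDemo();
        System.out.println(LOG);
        // prints [static field, static block, instance field a, instance block,
        //         instance field b, constructor body]
    }
}
```

Creating a second instance repeats only the last four entries; the static steps run once, when the class is initialized.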
The following method does not work because the inner block declares a variable of the same name as one in the outer block. Apparently variables belong to the method or class in which they are declared, not to the block in which they are declared, so I therefore can't write a short little temporary block for debugging that happens to push a variable in the outer scope off into shadow just for a moment:
void methodName() {
int i = 7;
for (int j = 0; j < 10; j++) {
int i = j * 2;
}
}
Almost every block-scoped language I've ever used supported this, including trivial little languages that I wrote interpreters and compilers for in school. Perl can do this, as can Scheme, and even C. Even PL/SQL supports this!
What's the rationale for this design decision for Java?
Edit: as somebody pointed out, Java does have block-scoping. What's the name for the concept I'm asking about? I wish I could remember more from those language-design classes. :)
Well, strictly speaking, Java does have block-scoped variable declarations; so this is an error:
void methodName() {
for (int j = 0; j < 10; j++) {
int i = j * 2;
}
System.out.println(i); // error
}
Because 'i' doesn't exist outside the for block.
The problem is that Java doesn't allow you to create a variable with the same name of another variable that was declared in an outer block of the same method. As other people have said, supposedly this was done to prevent bugs that are hard to identify.
Because it's not uncommon for writers to do this intentionally and then totally screw it up by forgetting that there are now two variables with the same name. They change the inner variable name, but leave code that uses the variable, which now unintentionally uses the previously-shadowed variable. This results in a program that still compiles, but executes buggily.
Similarly, it's not uncommon to accidentally shadow variables and change the program's behavior. Unknowingly shadowing an existing variable can change the program as easily as unshadowing a variable as I mentioned above.
There's so little benefit to allowing this shadowing that they ruled it out as too dangerous. Seriously, just call your new variable something else and the problem goes away.
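Two related things are still allowed, and they cover most cases where a short-lived temporary name is wanted: a local variable may shadow a field (reachable again through this), and sibling blocks may reuse a name because their scopes do not overlap. A small sketch with invented names:

```java
class ShadowingDemo {
    int i = 7; // field

    int fieldShadowedByLocal() {
        int i = 1;           // legal: a local variable may shadow a field
        return this.i + i;   // the field is still reachable via this.i
    }

    static int siblingBlocks() {
        int total = 0;
        {
            int tmp = 1;     // scope ends at the closing brace of this block
            total += tmp;
        }
        {
            int tmp = 2;     // legal: the earlier tmp is out of scope here
            total += tmp;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(new ShadowingDemo().fieldShadowedByLocal()); // prints 8
        System.out.println(siblingBlocks());                            // prints 3
    }
}
```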
I believe the rationale is that most of the time, that isn't intentional, it is a programming or logic flaw.
In an example as trivial as yours, it's obvious, but in a large block of code, accidentally redeclaring a variable may not be obvious.
ETA: it might also be related to exception handling in Java. I thought part of this question was discussed in a question related to why variables declared in a try section were not available in the catch/finally scopes.
It leads to bugs that are hard to spot, I guess. It's similar in C#.
Pascal does not support this, since you have to declare variables above the function body.
The underlying assumption in this question is wrong.
Java does have block-level scope. But it also has a hierarchy of scope, which is why you can reference i within the for loop, but not j outside of the for loop.
public void methodName() {
int i = 7;
for (int j = 0; j < 10; j++) {
i = j * 2;
}
//this would cause a compilation error!
j++;
}
I can't for the life of me figure out why you would want scoping to behave any other way. It'd be impossible to determine which i you were referring to inside the for loop, and I'd bet chances are 99.999% of the time you want to refer to the i inside the method.
Another reason: if this kind of variable declaration were allowed, people would want (need?) a way to access outer block variables. Maybe something like an "outer" keyword would have to be added:
void methodName() {
int i = 7;
for (int j = 0; j < 10; j++) {
int i = outer.i * 2;
if(i > 10) {
int i = outer.outer.i * 2 + outer.i;
}
}
}