JVM bug? Cached object field value causes ArrayIndexOutOfBoundsException

This is kind of strange, but code speaks more than words, so look at the test to see what I'm doing. In my current setup (Java 7 update 21 on 64-bit Windows), this test fails with an ArrayIndexOutOfBoundsException, but if I replace the test method body with the commented-out code, it works. I wonder whether any part of the Java specification explains why.
It seems to me, as "michael nesterenko" suggested, that the value of the array field is cached in the stack, before calling the method, and not updated on return from the call. I can't tell if it's a JVM bug or a documented "optimisation". No multi-threading or "magic" involved.
import java.util.Arrays;
import org.junit.Test;

public class TestAIOOB {
    private String[] array = new String[0];

    // Appends txt by replacing the field with a longer copy; returns the new index.
    private int grow(final String txt) {
        final int index = array.length;
        array = Arrays.copyOf(array, index + 1);
        array[index] = txt;
        return index;
    }

    @Test
    public void testGrow() {
        //final int index = grow("test");
        //System.out.println(array[index]);
        System.out.println(array[grow("test")]);
    }
}

This is well defined by the Java Language Specification: to evaluate x[y], first x is evaluated, and then y is evaluated. In your case, x evaluates to a String[] with zero elements. Then, y modifies a member variable, and evaluates to 0. Trying to access the 0th element of the already-returned array fails. The fact that the member array changes has no bearing on the array lookup, because we're looking at the String[] that array referenced at the time we evaluated it.

This behavior is mandated by the JLS. Per 15.13.1, "An array access expression is evaluated using the following procedure: First, the array reference expression is evaluated. If this evaluation completes abruptly, then the array access completes abruptly for the same reason and the index expression is not evaluated. Otherwise, the index expression is evaluated. [...]".
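To make the evaluation order concrete, here is a minimal sketch that reproduces both variants without JUnit. The class name EvalOrderDemo and the two helper methods are made up for this illustration; only grow mirrors the code from the question.

```java
import java.util.Arrays;

// Hypothetical demo class; only grow() mirrors the question's code.
class EvalOrderDemo {
    private String[] array = new String[0];

    // Replaces the field with a longer copy and returns the new element's index.
    private int grow(String txt) {
        int index = array.length;
        array = Arrays.copyOf(array, index + 1);
        array[index] = txt;
        return index;
    }

    // array[grow(...)]: the field is read first, so the OLD zero-length array is indexed.
    static boolean inlineAccessThrows() {
        EvalOrderDemo d = new EvalOrderDemo();
        try {
            String unused = d.array[d.grow("test")];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    // Calling grow first means the field is read only after it has been replaced.
    static String twoStepAccess() {
        EvalOrderDemo d = new EvalOrderDemo();
        int index = d.grow("test");
        return d.array[index];
    }

    public static void main(String[] args) {
        System.out.println(inlineAccessThrows()); // true
        System.out.println(twoStepAccess());      // test
    }
}
```

The only difference between the two helpers is when the field array is read relative to the call to grow, which is exactly what the JLS rule governs.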

Compare the compiled bytecode by using javap -c TestAIOOB
Uncommented code (the single-expression version):
public void testGrow();
Code:
0: getstatic #6; //Field java/lang/System.out:Ljava/io/PrintStream;
3: aload_0
4: getfield #3; //Field array:[Ljava/lang/String;
7: aload_0
8: ldc #7; //String test
10: invokespecial #8; //Method grow:(Ljava/lang/String;)I
13: aaload
14: invokevirtual #9; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
17: return
Commented code (the two-statement version):
public void testGrow();
Code:
0: aload_0
1: ldc #6; //String test
3: invokespecial #7; //Method grow:(Ljava/lang/String;)I
6: istore_1
7: getstatic #8; //Field java/lang/System.out:Ljava/io/PrintStream;
10: aload_0
11: getfield #3; //Field array:[Ljava/lang/String;
14: iload_1
15: aaload
16: invokevirtual #9; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
19: return
In the first listing, the getfield (which reads the array field) happens before the call to grow; in the second, it happens after.

Related

What function is more efficient?

I'm new to Java and I wanted to know if there is a difference between these two functions:
public static String function1(int x) {
String res = "";
if(x > 10)
res = "a";
else
res = "b";
return res;
}
and:
public static String function2(int x) {
if(x > 10)
return "a";
return "b";
}
I'm asking only about efficiency, not the length of the code.
The second version is in theory more efficient, disassembling to:
public static java.lang.String function2(int);
Code:
0: iload_0
1: bipush 10
3: if_icmple 9
6: ldc #3 // String a
8: areturn
9: ldc #4 // String b
11: areturn
whereas the version with the assignment decompiles to:
public static java.lang.String function1(int);
Code:
0: ldc #2 // String
2: astore_1
3: iload_0
4: bipush 10
6: if_icmple 15
9: ldc #3 // String a
11: astore_1
12: goto 18
15: ldc #4 // String b
17: astore_1
18: aload_1
19: areturn
Here the additional local variable is stored to and then loaded again before being returned.
In practice, however, the difference in actual runtime performance should be negligible. The JIT compiler would (hopefully) optimise away the useless variable, and in any case, unless your profiler shows that this code is on a hot path, worrying about it counts as premature optimisation.
Both versions end up returning either "a" or "b".
Version 2 is marginally better in terms of efficiency because it skips the redundant initial assignment of "" to res. Note that string literals are interned constants, so neither version allocates a new String object per call.
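As a quick sanity check of the point about literals, the sketch below wraps the two functions from the question in a hypothetical class (ReturnStyleDemo is a made-up name) and compares their results by reference. Because equal string literals are interned, both functions hand back the very same object, so no string is created per call in either style.

```java
// Hypothetical wrapper class around the two functions from the question.
class ReturnStyleDemo {
    static String function1(int x) {
        String res = "";
        if (x > 10)
            res = "a";
        else
            res = "b";
        return res;
    }

    static String function2(int x) {
        if (x > 10)
            return "a";
        return "b";
    }

    public static void main(String[] args) {
        // Reference comparison on purpose: equal literals are the same interned object.
        System.out.println(function1(11) == function2(11)); // true
        System.out.println(function1(3) == function2(3));   // true
    }
}
```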

Can we use the + sign to add a string literal in a StringBuffer?

StringBuffer sb = new StringBuffer();
sb.append("New "+"Delhi");
and the other is:
sb.append("New ").append("Delhi");
Both will print "New Delhi".
Which one is better, and why?
Sometimes, to save time, I use "+" instead of ".append".
sb.append("New "+"Delhi"):
public class Test {
public Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: new #2 // class java/lang/StringBuffer
3: dup
4: invokespecial #3 // Method java/lang/StringBuffer."<init>":()V
7: astore_1
8: aload_1
9: ldc #4 // String New Delhi
11: invokevirtual #5 // Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
14: pop
15: return
}
sb.append("New ").append("Delhi"):
Compiled from "Test.java"
public class Test {
public Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: new #2 // class java/lang/StringBuffer
3: dup
4: invokespecial #3 // Method java/lang/StringBuffer."<init>":()V
7: astore_1
8: aload_1
9: ldc #4 // String New
11: invokevirtual #5 // Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
14: ldc #6 // String Delhi
16: invokevirtual #5 // Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
19: pop
20: return
}
As the bytecode above shows, for constant strings:
when using "+", the javac compiler folds the two literals into the single constant "New Delhi" at compile time.
when using "append", the two literals stay separate and append is invoked twice at runtime.
So for constant strings, "+" is a good choice.
The "+" operator appends one string to the end of another. As for your question, StringBuffer.append() does not use "+" internally; it copies the characters of its argument into the buffer's own character storage.
Any string concatenation with non-constant operands is converted into StringBuilder calls internally (by javac before Java 9; when targeting Java 9 or later it emits an invokedynamic call instead). For example,
"The answer is: " + value
is converted into:
new StringBuilder("The answer is: ").append(value).toString()
If any expression being concatenated is not constant, .append is the better approach.
So in your case it doesn't matter performance-wise which way you write it; '+' simply makes your code more readable.
Concatenations of constant strings are performed at compile time.
You should use a StringBuilder/StringBuffer when you concatenate non-constant strings (e.g. variables), especially when you do the concatenation in a loop.
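The compile-time folding can also be observed without javap. The following is a small sketch (the class ConcatDemo and its method names are invented for illustration): a constant expression ends up as the same interned literal, a concatenation involving a non-final variable produces a new object at runtime, and a loop is where an explicit StringBuilder pays off.

```java
// Hypothetical demo class; all names here are made up for illustration.
class ConcatDemo {
    // "New " + "Delhi" is a constant expression, so javac stores one literal.
    static boolean constantIsFolded() {
        String folded = "New " + "Delhi";
        return folded == "New Delhi"; // reference comparison on purpose
    }

    // A non-final variable makes the expression non-constant: a new String
    // is built at runtime, so it is not the interned literal.
    static boolean variableIsFolded() {
        String prefix = "New ";
        String joined = prefix + "Delhi";
        return joined == "New Delhi";
    }

    // In a loop, a single StringBuilder avoids one intermediate String per step.
    static String joinWords(String[] words) {
        StringBuilder sb = new StringBuilder();
        for (String w : words) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(w);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(constantIsFolded());  // true
        System.out.println(variableIsFolded());  // false
        System.out.println(joinWords(new String[] {"New", "Delhi"})); // New Delhi
    }
}
```

The == comparisons are there only to expose object identity; normal code should compare strings with equals.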

How does the for-each loop work internally in Java?

I was trying to find out how the for-each loop works when its expression is a function call. Please see the following code:
public static int [] returnArr()
{
int [] a=new int [] {1,2,3,4,5};
return a;
}
public static void main(String[] args)
{
//Version 1
for(int a : returnArr())
{
System.out.println(a);
}
//Version 2
int [] myArr=returnArr();
for(int a : myArr)
{
System.out.println(a);
}
}
In version 1, I'm calling the returnArr() method in the for-each loop, and in version 2, I'm explicitly calling returnArr(), assigning the result to an array variable, and then iterating through it. The result is the same in both scenarios. I would like to know which is more efficient and why.
I thought version 2 would be more efficient, as I'm not calling the method in every iteration. But to my surprise, when I debugged the code using version 1, I saw that the method call happened only once!
Can anyone please explain how it actually works? Which is more efficient/better when I code for complex objects?
The Java Language Specification shows the underlying translation:
Let L1 ... Lm be the (possibly empty) sequence of labels immediately
preceding the enhanced for statement.
The enhanced for statement is equivalent to a basic for statement of
the form:
T[] #a = Expression;
L1: L2: ... Lm:
for (int #i = 0; #i < #a.length; #i++) {
{VariableModifier} TargetType Identifier = #a[#i];
Statement
}
where Expression is the right hand side of the : in an enhanced for statement (your returnArr()). In both cases, it gets evaluated only once: in version 1, as part of the enhanced for statement; in version 2, because its result is assigned to a variable which is then used in the enhanced for statement.
The compiler calls the method returnArr() only once, before the loop starts. This follows from how the language defines the enhanced for statement rather than being an optimization.
Bytecode:
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=6, args_size=1
*** case 1 start ***
0: invokestatic #20 // Method returnArr:()[I --> called only once.
3: dup
4: astore 4
6: arraylength
7: istore_3
8: iconst_0
9: istore_2
10: goto 28
13: aload 4 --> loop start
15: iload_2
16: iaload
17: istore_1
18: getstatic #22 // Field java/lang/System.out:Ljava/io/PrintStream;
21: iload_1
22: invokevirtual #28 // Method java/io/PrintStream.println:(I)V
25: iinc 2, 1
28: iload_2
29: iload_3
30: if_icmplt 13
*** case 2 start ***
33: invokestatic #20 // Method returnArr:()[I
36: astore_1
37: aload_1
38: dup
39: astore 5
41: arraylength
42: istore 4
44: iconst_0
45: istore_3
46: goto 64
49: aload 5 --> loop start case 2
51: iload_3
52: iaload
53: istore_2
54: getstatic #22 // Field java/lang/System.out:Ljava/io/PrintStream;
57: iload_2
58: invokevirtual #28 // Method java/io/PrintStream.println:(I)V
61: iinc 3, 1
64: iload_3
65: iload 4
67: if_icmplt 49
70: return
Note: I am using JDK 8.
I'm not going to copy paste from the Java Language Specification, like one of the previous answers did, but instead interpret the specification in a readable format.
Consider the following code:
for (T x : expr) {
// do something with x
}
If expr evaluates to an array type, as in your case, the language specification states that the loop is equivalent to:
T[] arr = expr;
for (int i = 0; i < arr.length; i++) {
T x = arr[i];
// do something with x
}
The only difference is that the variables arr and i will not be visible to your code or, unfortunately, to the debugger. That's why, for development, the second version might be more useful: you have the return value stored in a variable accessible by the debugger.
In your first version expr is simply the function call, while in the second version you declare another variable and assign the result of the function call to that, then use that variable as expr. I'd expect them to exhibit no measurable difference in performance, as that additional variable assignment in the second version should be optimized away by the JIT compiler, unless you also use it elsewhere.
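If you want to verify the "evaluated only once" claim without a debugger, a call counter is enough. This is only a sketch: the class ForEachOnceDemo, the calls counter, and the two helper methods are assumptions added for the demonstration, while returnArr() follows the question.

```java
// Hypothetical demo class; the counter and helper methods are illustrative only.
class ForEachOnceDemo {
    static int calls = 0;

    static int[] returnArr() {
        calls++; // count every invocation
        return new int[] {1, 2, 3, 4, 5};
    }

    // Version 1 from the question: the method call sits in the loop header.
    static int callsForInlineLoop() {
        calls = 0;
        int sum = 0;
        for (int a : returnArr()) {
            sum += a;
        }
        return calls;
    }

    // Hand-written equivalent of the translation described above.
    static int callsForDesugaredLoop() {
        calls = 0;
        int sum = 0;
        int[] arr = returnArr();
        for (int i = 0; i < arr.length; i++) {
            int a = arr[i];
            sum += a;
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsForInlineLoop());    // 1
        System.out.println(callsForDesugaredLoop()); // 1
    }
}
```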
For a collection such as a List, foreach internally uses the collection's iterator to traverse it, and yes, there is a difference between the two styles.
If you just want to traverse the list and do not have any intention of modifying it, then you should use foreach; otherwise use an explicit iterator.
for (String s : list) {
    System.out.println(s);
    list.remove(s); // ConcurrentModificationException when the loop fetches the next element
}
Iterator<String> it = list.iterator();
while (it.hasNext()) {
    System.out.println(it.next());
    it.remove(); // No exception
}
Also, if the list you pass to foreach is null, you will get a NullPointerException when the loop tries to call iterator() on it.
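Here is a self-contained version of that comparison, as a sketch only: the class RemoveWhileIteratingDemo and its method names are invented, and it assumes an ArrayList with three elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; assumes an ArrayList with three elements.
class RemoveWhileIteratingDemo {
    // Removing through the list inside a for-each loop invalidates the hidden iterator.
    static boolean forEachRemoveThrows() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : list) {
                list.remove(s);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removing through the iterator keeps its bookkeeping consistent.
    static int sizeAfterIteratorRemove() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        Iterator<String> it = list.iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(forEachRemoveThrows());     // true
        System.out.println(sizeAfterIteratorRemove()); // 0
    }
}
```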

String Concatenation and Autoboxing in Java

When you concatenate a String with a primitive such as an int, does Java autobox the value first?
For example:
String string = "Four" + 4;
How does it convert the value to a string in Java?
To see what the Java compiler produces it is always useful to use javap -c to show the actual bytecode produced:
For example the following Java code:
String s1 = "Four" + 4;
int i = 4;
String s2 = "Four" + i;
would produce the following bytecode:
0: ldc #2; //String Four4
2: astore_1
3: iconst_4
4: istore_2
5: new #3; //class java/lang/StringBuilder
8: dup
9: invokespecial #4; //Method java/lang/StringBuilder."<init>":()V
12: ldc #5; //String Four
14: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
17: iload_2
18: invokevirtual #7; //Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
21: invokevirtual #8; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
24: astore_3
25: return
From this we can see:
In the case of "Four" + 4, the Java compiler (I was using JDK 6) was clever enough to deduce that this is a constant, so there is no computational effort at runtime, as the string is concatenated at compile time
In the case of "Four" + i, the equivalent code is new StringBuilder().append("Four").append(i).toString()
Autoboxing is not involved here, as there is a StringBuilder.append(int) method which, according to the docs, behaves as if the int were converted to a string by String.valueOf(int) and that string were then appended.
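The equivalence described in the points above can be checked directly. A minimal sketch, with IntConcatDemo and its method names invented for the illustration:

```java
// Hypothetical demo class comparing three ways of building the same string.
class IntConcatDemo {
    // What the question wrote, with a variable so nothing is folded at compile time.
    static String viaPlus(int i) {
        return "Four" + i;
    }

    // The StringBuilder form that the bytecode above corresponds to.
    static String viaBuilder(int i) {
        return new StringBuilder().append("Four").append(i).toString();
    }

    // Explicit conversion with String.valueOf(int); no Integer object is involved.
    static String viaValueOf(int i) {
        return "Four".concat(String.valueOf(i));
    }

    public static void main(String[] args) {
        System.out.println(viaPlus(4));    // Four4
        System.out.println(viaBuilder(4)); // Four4
        System.out.println(viaValueOf(4)); // Four4
    }
}
```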
The Java compiler actually creates a StringBuilder (1) and invokes the append() method. It can be seen in the bytecode:
22 invokespecial java.lang.StringBuilder(java.lang.String) [40]
...
29 invokevirtual java.lang.StringBuilder.append(int) : java.lang.StringBuilder [47]
32 invokevirtual java.lang.StringBuilder.toString() : java.lang.String [51]
Nevertheless, the behavior is identical to boxing and then invoking toString(): "Four" + new Integer(4).toString(), which I believe is what the language designers had in mind.
(1) To be exact, the compiler already concatenates the string literal and the int literal into a single string literal "Four4". You can see it in the following line of the bytecode:
0 ldc <String "Four4"> [19]
According to http://jcp.org/aboutJava/communityprocess/jsr/tiger/autoboxing.html, autoboxing is done on the primitive type whenever a reference type is needed (such as the Integer class in this case).
So, conceptually, the int is converted into an Integer, then that Integer object's toString() method is called and its result is appended to the preceding string. (As the bytecode in the other answers shows, javac does not actually emit a boxing call here.)

Java: How does the == operator work when comparing int?

Given this Java code:
int fst = 5;
int snd = 6;
if(fst == snd)
do something;
I want to know how Java will compare equality for this case. Will it use an XOR operation to check equality?
Are you asking "what native machine code does this turn into?"? If so, the answer is "implementation-dependent".
However, if you want to know what JVM bytecode is used, just take a look at the resulting .class file (use e.g. javap to disassemble it).
In case you are asking about the JVM, use the javap program.
public class A {
public static void main(String[] args) {
int a = 5;
System.out.println(5 == a);
}
}
Here is the disassembly:
public class A {
public A();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: iconst_5
1: istore_1
2: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
5: iconst_5
6: iload_1
7: if_icmpne 14
10: iconst_1
11: goto 15
14: iconst_0
15: invokevirtual #3 // Method java/io/PrintStream.println:(Z)V
18: return
}
In this case it optimized the branching a bit and used if_icmpne. In most cases, it will use if_icmpne or if_icmpeq.
if_icmpeq : if ints are equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes (branchbyte1 << 8) | branchbyte2)
if_icmpne : if ints are not equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes (branchbyte1 << 8) | branchbyte2)
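To connect the bytecode back to source level, here is a rough sketch (IntEqualsDemo and its methods are made-up names). viaBranch spells out the compare-and-branch shape that if_icmpne gives the expression, and viaXor is included only because the question mentions XOR: it produces the same answers, but nothing in the disassembly above suggests javac emits an XOR, and what the JIT does in machine code is implementation-dependent.

```java
// Hypothetical demo class; three ways to express int equality.
class IntEqualsDemo {
    // The plain operator, as in the question.
    static boolean viaOperator(int a, int b) {
        return a == b;
    }

    // Mirrors the compiled shape: branch away when the ints differ.
    static boolean viaBranch(int a, int b) {
        if (a != b) {
            return false;
        }
        return true;
    }

    // XOR is zero exactly when all bits match; shown for comparison only.
    static boolean viaXor(int a, int b) {
        return (a ^ b) == 0;
    }

    public static void main(String[] args) {
        System.out.println(viaOperator(5, 6)); // false
        System.out.println(viaBranch(5, 5));   // true
        System.out.println(viaXor(5, 6));      // false
    }
}
```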
