I am a master's student and I am researching static analysis.
In one of my tests I came across a problem in marking lines in the java compiler.
I have the following java code:
226: String json = "/org/elasticsearch/index/analysis/commongrams/commongrams_query_mode.json";
227: Settings settings = Settings.settingsBuilder()
228: .loadFromStream(json, getClass().getResourceAsStream(json))
229: .put("path.home", createHome())
230: .build();
When compiling this code, and executing the command javap -p -v CLASSNAME, I get a table with the corresponding line of the source code for each instruction in the bytecode.
See the image below:
Bytecode table
The problem is that in the call to the .put (" path.home ", createHome ()) method, bytecode generates basically four instructions:
19: anewarray
24: ldc - String path.home
30: invokespecial - createHome
34: invokevirtual - put
Being the first two marked as line 228 (Wrong) and the last two as line 229 (correct).
See the image below:
Bytecode table
This is the original implementation of the .put("path.home", createHome()) method:
public Builder put(Object... settings) {
if (settings.length == 1) {
// support cases where the actual type gets lost down the road...
if (settings[0] instanceof Map) {
//noinspection unchecked
return put((Map) settings[0]);
} else if (settings[0] instanceof Settings) {
return put((Settings) settings[0]);
}
}
if ((settings.length % 2) != 0) {
throw new IllegalArgumentException("array settings of key + value order doesn't hold correct number of arguments (" + settings.length + ")");
}
for (int i = 0; i < settings.length; i++) {
put(settings[i++].toString(), settings[i].toString());
}
return this;
}
I have already tried to compile the code using Oracle-JDK v8 and Open-JDK v16 and in both results.
I also did a test by making a change to the put() method by removing its parameters. When compiling this code the problem in marking the lines did not occur.
I wonder why the bytecode instructions map the line 229: .put (" path.home ", createHome ()) on lines other than the original in the java source code? Does anyone know if this is done on purpose?
This is connected to the way, the line number association is stored in the class file and the history of the javac compiler.
The line number table only contains entries associating line numbers to a a code location marking its beginning. So all instructions after that location are assumed to belong to the same line up to the next location that has been explicitly mentioned in the table.
Since detailed information will take up space and the specification does not demand a particular precision for the line number table, compiler vendors made different decisions about which details to include.
In the past, i.e. up to Java 7, javac only generated line number table entries for the beginning of statements, so when I compile the following code with Java 7’s javac
String settings = new StringBuilder() // this is line 7 in my .java file
.append('a')
.append(
5
+
"".length())
.toString();
I get something like
stack=3, locals=2, args_size=1
0: new #2 // class java/lang/StringBuilder
3: dup
4: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
7: bipush 97
9: invokevirtual #4 // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
12: iconst_5
13: ldc #5 // String
15: invokevirtual #6 // Method java/lang/String.length:()I
18: iadd
19: invokevirtual #7 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
22: invokevirtual #8 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
25: astore_1
26: return
LineNumberTable:
line 7: 0
line 14: 26
which would cause all instructions belonging to the statement to be associated with line 7 only.
This has been considered to be too little, so starting with Java 8, javac generates additional entries for method invocations within an expression spanning multiple lines. So when I compile the same code with Java 8 or newer, I get
stack=3, locals=2, args_size=1
0: new #2 // class java/lang/StringBuilder
3: dup
4: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
7: bipush 97
9: invokevirtual #4 // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
12: iconst_5
13: ldc #5 // String
15: invokevirtual #6 // Method java/lang/String.length:()I
18: iadd
19: invokevirtual #7 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
22: invokevirtual #8 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
25: astore_1
26: return
LineNumberTable:
line 7: 0
line 8: 9
line 12: 15
line 9: 19
line 13: 22
line 14: 26
Note how each additional entry (compared to the Java 7 version) points to an invocation instruction, to ensure that the method invocations are associated with the correct line number. This greatly improves exception stack traces as well as step debugging.
The non-invocation instructions having no explicit entry will still get associated with their closest preceding code location that has an entry.
Therefore, the bipush 97 instruction corresponding to the 'a' constant gets associated with line 7 as only the subsequent append invocation consuming the constant has an explicit entry associating it with line 8.
The consequences for the next expression, 5 + "".length(), are even more dramatic.
The instructions for pusing the constants, iconst_5 and ldc [""], get associated to line 8, the location of the previous append invocation, whereas the iadd instruction, actually belonging to the + operator between the 5 and "" constants, gets associated with the line 12, as the most recent invocation instruction that got an explicit line number is the length() invocation.
For comparison, this is how Eclipse compiles the same code:
stack=3, locals=2, args_size=1
0: new #20 // class java/lang/StringBuilder
3: dup
4: invokespecial #22 // Method java/lang/StringBuilder."<init>":()V
7: bipush 97
9: invokevirtual #23 // Method java/lang/StringBuilder.append:(C)Ljava/lang/StringBuilder;
12: iconst_5
13: ldc #27 // String
15: invokevirtual #29 // Method java/lang/String.length:()I
18: iadd
19: invokevirtual #35 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
22: invokevirtual #38 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
25: astore_1
26: return
LineNumberTable:
line 6: 0
line 7: 7
line 9: 12
line 11: 13
line 9: 18
line 8: 19
line 12: 22
line 6: 25
line 13: 26
The Eclipse compiler doesn’t have javac’s history, but rather has been designed to produce line number entries for expressions in the first place. We can see that it associates the first instruction belonging to an invocation expression (not the invocation instruction) with the right line, i.e. the bipush 97 for append('a') and ldc [""] for "".length().
Further, it has additional entries for iconst_5, iadd, and astore_1, to associate them with the right lines. Of course, this higher precision also results in slightly bigger class files.
Related
This question already has an answer here:
How is concatenation of final strings done in Java?
(1 answer)
Closed 2 years ago.
I have the following code. I understand the concept of java string immutability and string constant pool. I don't understand why 'name1 == name2' results false and 'name2 == name3' results true in the following program. How are the string variables name1, name2, and name3 placed in the string constant pool?
public class Test {
public static void main(String[] args) {
final String firstName = "John";
String lastName = "Smith";
String name1 = firstName + lastName;
String name2 = firstName + "Smith";
String name3 = "John" + "Smith";
System.out.println(name1 == name2);
System.out.println(name2 == name3);
}
}
Output:
false
true
The answer is fairly simple: As a shortcut, java treats certain concepts as a so-called 'compile time constant' (CTC). The idea is to entirely inline a variable, at the compilation level (which is extraordinary; normally javac basically just bashes your java source file into a class file using very simple and easily understood transformations, and the fancypants optimizations occur at runtime during hotspot).
For example, if you do this:
Save to UserOfBatch.java:
class BatchOConstants {
public static final int HELLO = 5;
}
public class UserOfBatch {
public static void main(String[] args) {
System.out.println(BatchOConstants.HELLO);
}
}
Run on the command line:
> javac UserOfBatch.java
> java UserOfBatch
5
> javap -c UserOfBatch # javap prints bytecode
public static void main(java.lang.String[]);
Code:
0: getstatic #7 // Field java/lang/System.out:Ljava/io/PrintStream;
3: iconst_5
4: invokevirtual #15 // Method java/io/PrintStream.println:(I)V
7: return
Check out line 3 up there. iconst_5. That 5? It was hardcoded!!
No reference to BatchOConstants remains. Let's test that:
on command line:
> rm BatchOConstants.class
> java UserOfBatch
5
Wow. The code ran even though it's missing the very class file that should be providing that 5, thus proving, it was 'hardcoded' by the compiler itself and not the runtime.
Another way to 'observe' CTC-ness of any value is annoparams. An annotation parameter must be hardcoded into the class file by javac, so you can't pass expressions. Given:
public #interface Foo {
long value();
I can't write: #Foo(System.currentTimeMillis()), because System.cTM obviously isn't a compile time constant. But I can write #Foo(SomeClass.SOME_STATIC_FINAL_LONG_FIELD) assuming that the value assigned to S_S_F_L_F is a compile time constant. If it's not, that #Foo(...) code would not compile. If it is, it will compile: CTC-ness now determines whether your code compiles or not.
There are specific rules about when the compiler is allowed to construe something as a 'compile time constant' and go on an inlining spree. For example, null is not an inline constant, ever. Oversimplifying, but:
For fields, the field must be static and final and have a primitive or String type, initialized on the spot (in the same breath, not later in a static block), with a constant expression, that isn't null.
For local variables, the rules are very similar, except, the need for them to be static is obviously waived as they cannot be. Other than that, all the fixins apply: final, primitive-or-String, non-null, initialized on the spot, and with a constant expression.
Unfortunately, CTC-ness of a local is harder to directly observe. The code you wrote is a fine way to indirectly observe it though. You've proven with your prints that firstName is CTC and lastName is not.
We can observe a select few things though. So let's take your code, compile it, and toss it at javap to witness the results. Via Jonas Konrad's online javap tool, let's analyse:
public Main() {
final String a = "hello";
String b = "world";
String c = a + "!";
String d = b + "!";
System.out.println(c == "hello!");
System.out.println(d == "world!");
}
The relevant parts:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: ldc #7 // String hello
6: astore_1
start local 1 // java.lang.String a
7: ldc #9 // String world
9: astore_2
start local 2 // java.lang.String b
10: ldc #11 // String hello!
12: astore_3
start local 3 // java.lang.String c
13: aload_2
14: invokedynamic #13, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
19: astore 4
start local 4 // java.lang.String d
Note how 'start local 2' (which is c; javap starts counting at 0) shows just loading hello! as a complete constant, but 'start local 3' (which is d) shows loading 2 constants and invoking makeConcat to tie em together.
run javap -c Test after you compiled the code,
you will see that
Compiled from "Test.java"
public class Test {
public Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: ldc #2 // String Smith
2: astore_2
3: aload_2
4: invokedynamic #3, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
9: astore_3
10: ldc #4 // String JohnSmith
12: astore 4
14: ldc #4 // String JohnSmith
16: astore 5
18: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
21: aload_3
22: aload 4
24: if_acmpne 31
27: iconst_1
28: goto 32
31: iconst_0
32: invokevirtual #6 // Method java/io/PrintStream.println:(Z)V
35: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
38: aload 4
40: aload 5
42: if_acmpne 49
45: iconst_1
46: goto 50
49: iconst_0
50: invokevirtual #6 // Method java/io/PrintStream.println:(Z)V
53: return
}
As you can see
4: invokedynamic #3, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
in public static void main(java.lang.String[]);, this is the actual compiled byte code when running firstname + lastname. So generated String will not have the same hashcode as "JohnSmith" in constant pool.
where as on
10: ldc #4 // String JohnSmith
and
14: ldc #4 // String JohnSmith
this is the byte code generated by the compiler when doing both
firstname + "Smith" and "John" + "Smith" which means both actually reading from the constant pool.
This is the reason why when you comparing name1 with name2 using == it will return false. since name2 and name3 reference the same string from the constant pool. Hench it return true when compare with ==.
This is the reason why it is not a good idea to compare 2 string with ==. Please use String.equals() when doing String comparison.
since both
Let's look at the bytecode with final:
public class Test {
public Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: ldc #7 // String Smith
2: astore_1
3: aload_1
4: invokedynamic #9, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
9: astore_2
10: ldc #13 // String JohnSmith
12: astore_3
13: ldc #13 // String JohnSmith
15: astore 4
17: getstatic #15 // Field java/lang/System.out:Ljava/io/PrintStream;
20: aload_2
21: aload_3
22: if_acmpne 29
25: iconst_1
26: goto 30
29: iconst_0
30: invokevirtual #21 // Method java/io/PrintStream.println:(Z)V
33: getstatic #15 // Field java/lang/System.out:Ljava/io/PrintStream;
36: aload_3
37: aload 4
39: if_acmpne 46
42: iconst_1
43: goto 47
46: iconst_0
47: invokevirtual #21 // Method java/io/PrintStream.println:(Z)V
50: return
}
And the bytecode without final:
public class Test {
public Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: ldc #7 // String John
2: astore_1
3: ldc #9 // String Smith
5: astore_2
6: aload_1
7: aload_2
8: invokedynamic #11, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
13: astore_3
14: aload_1
15: invokedynamic #15, 0 // InvokeDynamic #1:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
20: astore 4
22: ldc #18 // String JohnSmith
24: astore 5
26: getstatic #20 // Field java/lang/System.out:Ljava/io/PrintStream;
29: aload_3
30: aload 4
32: if_acmpne 39
35: iconst_1
36: goto 40
39: iconst_0
40: invokevirtual #26 // Method java/io/PrintStream.println:(Z)V
43: getstatic #20 // Field java/lang/System.out:Ljava/io/PrintStream;
46: aload 4
48: aload 5
50: if_acmpne 57
53: iconst_1
54: goto 58
57: iconst_0
58: invokevirtual #26 // Method java/io/PrintStream.println:(Z)V
61: return
}
As you can see, with final, Java recognizes both the left and right hand sides of + to be constant, so it replaces the concatenation with the constant string "JohnSmith" at compile time. The only call to makeConcatWithConstants (to concatenate strings) is made for firstName + lastName, since lastName isn't final.
In the second example, there are two calls to makeConcatWithConstants, one for firstName + lastName and another for firstName + "Smith", since Java doesn't recognize firstName as a constant.
That's why name1 == name2 is false in your example: name2 is a constant "JohnSmith" in the string pool, whereas name1 is dynamically computed at runtime. However, name2 and name3 are both constants in the string pool, which is why name2 == name3 is true.
Based on the book Cracking the Coding Interview (page 90), the following algorithm requires O(xn²) time (where 'x' represents a length of the string and 'n' is amount of the strings). The code is in Java. Can anybody explain how we obtain such runtime ?
String joinWords(String[] words)
{
String sentence = "";
for(String w : words)
{
sentence = sentence + w;
}
return sentence;
}
For each string that is concatenated to sentence, a new StringBuilder is created, two strings are appended to it using the StringBuilder.append method, and then the resulting string is created using the StringBuilder.toString method. The complexity of this operation is O(n_1 + n_2) where n_1 and n_2 are the lengths of the strings.
In this code, the loop runs n times, and each time it runs, the string sentence of length O(xn) is concatenated with the string w of length x. Therefore the overall complexity is n * O(xn + x) = O(xn^2), as expected.
For the skeptics, here's the disassembled bytecode of the joinWords method; I compiled it using javac 10.0.1 (which is the version I have to hand at the moment). The StringBuilder is used from positions 25 to 41, which are inside the loop (see 48: goto 12).
java.lang.String joinWords(java.lang.String[]);
Code:
0: ldc #2 // String
2: astore_2
3: aload_1
4: astore_3
5: aload_3
6: arraylength
7: istore 4
9: iconst_0
10: istore 5
12: iload 5
14: iload 4
16: if_icmpge 51
19: aload_3
20: iload 5
22: aaload
23: astore 6
25: new #3 // class java/lang/StringBuilder
28: dup
29: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
32: aload_2
33: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
36: aload 6
38: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
41: invokevirtual #6 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
44: astore_2
45: iinc 5, 1
48: goto 12
51: aload_2
52: areturn
I was trying to find the working of for-each loop when I make a function call. Please see following code,
public static int [] returnArr()
{
int [] a=new int [] {1,2,3,4,5};
return a;
}
public static void main(String[] args)
{
//Version 1
for(int a : returnArr())
{
System.out.println(a);
}
//Version 2
int [] myArr=returnArr();
for(int a : myArr)
{
System.out.println(a);
}
}
In version 1, I'm calling returnArr() method in for-each loop and in version 2, I'm explicitly calling returnArr() method and assigning it to an array and then iterating through it. Result is same for both the scenarios. I would like to know which is more efficient and why.
I thought version 2 will be more efficient, as I'm not calling method in every iteration. But to my surprise, when I debugged the code using version 1, I saw the method call happened only once!
Can anyone please explain how does it actually work? Which is more efficient/better when I code for complex objects?
The Java Language Specification shows the underlying compilation
Let L1 ... Lm be the (possibly empty) sequence of labels immediately
preceding the enhanced for statement.
The enhanced for statement is equivalent to a basic for statement of
the form:
T[] #a = Expression;
L1: L2: ... Lm:
for (int #i = 0; #i < #a.length; #i++) {
{VariableModifier} TargetType Identifier = #a[#i];
Statement
}
where Expression is the right hand side of the : in an enhanced for statement (your returnArr()). In both cases, it gets evaluated only once: in version 1, as part of the enhanced for statement; in version 2, because its result is assigned to a variable which is then used in the enhanced for statement.
The compiler is calling the method returnArr() only once. compile time optimization :)
Byte code :
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=6, args_size=1
** case -1 start ***
0: invokestatic #20 // Method returnArr:()[I --> called only once.
3: dup
4: astore 4
6: arraylength
7: istore_3
8: iconst_0
9: istore_2
10: goto 28
13: aload 4 --> loop start
15: iload_2
16: iaload
17: istore_1
18: getstatic #22 // Field java/lang/System.out:Ljav
/io/PrintStream;
21: iload_1
22: invokevirtual #28 // Method java/io/PrintStream.prin
ln:(I)V
25: iinc 2, 1
28: iload_2
29: iload_3
30: if_icmplt 13
***case -2 start****
33: invokestatic #20 // Method returnArr:()[I
36: astore_1
37: aload_1
38: dup
39: astore 5
41: arraylength
42: istore 4
44: iconst_0
45: istore_3
46: goto 64
49: aload 5 --> loop start case 2
51: iload_3
52: iaload
53: istore_2
54: getstatic #22 // Field java/lang/System.out:Ljav
/io/PrintStream;
57: iload_2
58: invokevirtual #28 // Method java/io/PrintStream.prin
ln:(I)V
61: iinc 3, 1
64: iload_3
65: iload 4
67: if_icmplt 49
70: return
Note : I am using jdk 8.
I'm not going to copy paste from the Java Language Specification, like one of the previous answers did, but instead interpret the specification in a readable format.
Consider the following code:
for (T x : expr) {
// do something with x
}
If expr evaluates to an array type like in your case, the language specification states that the resulting bytecode will be the same as:
T[] arr = expr;
for (int i = 0; i < arr.length; i++) {
T x = arr[i];
// do something with x
}
The difference only is that the variables arr and i will not be visible to your code - or the debugger, unfortunately. That's why for development, the second version might be more useful: You have the return value stored in a variable accessible by the debugger.
In your first version expr is simply the function call, while in the second version you declare another variable and assign the result of the function call to that, then use that variable as expr. I'd expect them to exhibit no measurable difference in performance, as that additional variable assignment in the second version should be optimized away by the JIT compiler, unless you also use it elsewhere.
foreach internally uses list iterator to traverse through list and yes there is a difference between them.
If you just want to traverse the list and do not have any intension to modify it then you should use foreach else use list iterator.
for (String i : myList) {
System.out.println(i);
list.remove(i); // Exception here
}
Iterator it=list.iterator();
while (it.hasNext()){
System.out.println(it.next());
it.remove(); // No Exception
}
Also if using foreach you are passing a list which is null then you will get null pointer exception in java.util.ArrayList.iterator()
I understand that this will fail to compile:
int caseNum = 2;
switch(caseNum)
{
case 2:
System.out.println("Happy");
break;
case 2:
System.out.println("Birthday");
break;
case 2:
System.out.println("To the ground!");
break;
default:
System.out.println("<3");
break;
}
I know that the case statements are conflicting and that the compiler "doesn't know which 'case 2' I am talking about". A few peers of mine and myself were wondering behind the scenes what the conflict is and had heard that switch statements are converted into hash-maps. Is that the case, does a switch statement become a hash-map during compile time and the conflict in the mapping create the error?
So far I have looked around Stack Overflow and Google for an answer, and the information must be out there already, but I am unsure how to express the question correctly it would appear. Thanks in advance!
A hash map is just one way that a switch statement could be compiled, but in any case, you can imagine having duplicate cases as trying to have multiple values for the same key in a regular HashMap. The compiler doesn't know which one of the values corresponds to the key, and so emits an error.
switch statements could also be compiled into a jump table, in which case this is still ambiguous for a very similar reason -- you have multiple different possibilities for the same jump location.
switch statements could also be compiled into a binary search, in which case you still have the same problem -- multiple different results for the same key being searched for.
Just in case you were curious, I did a small test case to see what javac would compile the switch to. From this (slightly modified) source:
public static void main(final String[] args) {
final int caseNum = 2;
switch (caseNum) {
case 1:
System.out.println("Happy");
break;
case 2:
System.out.println("Birthday");
break;
case 3:
System.out.println("To the ground!");
break;
default:
System.out.println("<3");
break;
}
}
You get this bytecode:
public static void main(java.lang.String[]);
Code:
0: iconst_2
1: tableswitch { // 1 to 3
1: 28
2: 39
3: 50
default: 61
}
28: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
31: ldc #3 // String Happy
33: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
36: goto 69
39: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
42: ldc #5 // String Birthday
44: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
47: goto 69
50: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
53: ldc #6 // String To the ground!
55: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
58: goto 69
61: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
64: ldc #7 // String <3
66: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
69: return
So at least for this small switch, a jump table seems to be used. I have heard that the way switches are compiled depends partly on their size, but I don't know the exact point at which the implementation changes. 20 cases still seem to be implemented as a jump table...
Turns out String switch statements are implemented a bit differently. From this source:
public static void main(final String[] args) {
final String caseNum = "2";
switch (caseNum) {
case "1":
System.out.println("Happy");
break;
case "2":
System.out.println("Birthday");
break;
case "3":
System.out.println("To the ground!");
break;
default:
System.out.println("<3");
break;
}
}
You get this bytecode:
public static void main(java.lang.String[]);
Code:
0: ldc #2 // String 2
2: astore_2
3: iconst_m1
4: istore_3
5: aload_2
6: invokevirtual #3 // Method java/lang/String.hashCode:()I
9: tableswitch { // 49 to 51
49: 36
50: 50
51: 64
default: 75
}
36: aload_2
37: ldc #4 // String 1
39: invokevirtual #5 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
42: ifeq 75
45: iconst_0
46: istore_3
47: goto 75
50: aload_2
51: ldc #2 // String 2
53: invokevirtual #5 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
56: ifeq 75
59: iconst_1
60: istore_3
61: goto 75
64: aload_2
65: ldc #6 // String 3
67: invokevirtual #5 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
70: ifeq 75
73: iconst_2
74: istore_3
75: iload_3
76: tableswitch { // 0 to 2
0: 104
1: 115
2: 126
default: 137
}
104: getstatic #7 // Field java/lang/System.out:Ljava/io/PrintStream;
107: ldc #8 // String Happy
109: invokevirtual #9 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
112: goto 145
115: getstatic #7 // Field java/lang/System.out:Ljava/io/PrintStream;
118: ldc #10 // String Birthday
120: invokevirtual #9 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
123: goto 145
126: getstatic #7 // Field java/lang/System.out:Ljava/io/PrintStream;
129: ldc #11 // String To the ground!
131: invokevirtual #9 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
134: goto 145
137: getstatic #7 // Field java/lang/System.out:Ljava/io/PrintStream;
140: ldc #12 // String <3
142: invokevirtual #9 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
145: return
So you still have jump tables, but you have two. Bit of a roundabout process -- take the hash code, switch on that, based on the case load another constant, and from that do a "normal" switch (I think). But same basic process, I suppose?
Historically, switch statements were/could be implemented as jump tables (i.e. a mapping of value to destination address). Depending on the implementation of the table, it may be impossible to have two entries for the same value pointing to different addresses. Even if it were possible, how would you handle that duplicate value? Return from the first handler and then go to second handler? Never execute the second handler?
It makes no sense to allow this.
I was having a discussion about usage of Strings and StringBuffers in Java. How many objects are created in each of these two examples?
Ex 1:
String s = "a";
s = s + "b";
s = s + "c";
Ex 2:
StringBuilder sb = new StringBuilder("a");
sb.append("b");
sb.append("c");
In my opinion, Ex 1 will create 5 and Ex 2 will create 4 objects.
I've used a memory profiler to get the exact counts.
On my machine, the first example creates 8 objects:
String s = "a";
s = s + "b";
s = s + "c";
two objects of type String;
two objects of type StringBuilder;
four objects of type char[].
On the other hand, the second example:
StringBuffer sb = new StringBuffer("a");
sb.append("b");
sb.append("c");
creates 2 objects:
one object of type StringBuilder;
one object of type char[].
This is using JDK 1.6u30.
P.S. To the make the comparison fair, you probably ought to call sb.toString() at the end of the second example.
In terms of objects created:
Example 1 creates 8 objects:
String s = "a"; // No object created
s = s + "b"; // 1 StringBuilder/StringBuffer + 1 String + 2 char[] (1 for SB and 1 for String)
s = s + "c"; // 1 StringBuilder/StringBuffer + 1 String + 2 char[] (1 for SB and 1 for String)
Example 2 creates 2 object:
StringBuffer sb = new StringBuffer("a"); // 1 StringBuffer + 1 char[] (in SB)
sb.append("b"); // 0
sb.append("c"); // 0
To be fair, I did not know that new char[] actually created an Object in Java (but I knew they were created). Thanks to aix for pointing that out.
You can determine the answer by analyzing the java bytecode (use javap -c). Example 1 creates two StringBuilder objects (see line #4) and two String objects (see line #7), while example 2 creates one StringBuilder object (see line #2).
Note that you must also take the char[] objects into account (since arrays are objects in Java). String and StringBuilder objects are both implemented using an underlying char[]. Thus, example 1 creates eight objects and example 2 creates two objects.
Example 1:
public static void main(java.lang.String[]);
Code:
0: ldc #2; //String a
2: astore_1
3: new #3; //class java/lang/StringBuilder
6: dup
7: invokespecial #4; //Method java/lang/StringBuilder."<init>":()V
10: aload_1
11: invokevirtual #5; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
14: ldc #6; //String b
16: invokevirtual #5; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
19: invokevirtual #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
22: astore_1
23: new #3; //class java/lang/StringBuilder
26: dup
27: invokespecial #4; //Method java/lang/StringBuilder."<init>":()V
30: aload_1
31: invokevirtual #5; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
34: ldc #8; //String c
36: invokevirtual #5; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
39: invokevirtual #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
42: astore_1
43: return
}
Example 2:
public static void main(java.lang.String[]);
Code:
0: new #2; //class java/lang/StringBuilder
3: dup
4: ldc #3; //String a
6: invokespecial #4; //Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
9: astore_1
10: aload_1
11: ldc #5; //String b
13: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
16: pop
17: aload_1
18: ldc #7; //String c
20: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
23: pop
24: return
}
The answer is tied to specific implementations of the language (compiler and runtime libraries). Even to the presence of specific optimization options or not. And, of course, version of the implementation (and, implicitly, the JLS it is compliant with). So, it's better to speak in term of minima and maxima. In fact, this exercise gives a better
For Ex1, the minimum number of objects is 1 (the compiler realizes that there are only constants involved and produces only code for String s= "abc" ; ). The maximum could be just anything, depending on implementation, but a reasonable estimation is 8 (also given in another answer as the number produced by certain configuration).
For Ex2, the minimum number of objects is 2. The compiler has no way of knowing if we have replaced StringBuilder with a custom version with different semantics, so it will not optimize. The maximum could be around 6, for an extremely memory-conserving StringBuilder implementation that expands a backing char[] array one character at a time, but in most cases it will be 2 too.