Java 7 String switch decompiled: unexpected instruction - java

I have decompiled a very simple class that uses the new Java 7 String Switch feature.
The class:
public class StringSwitch {
    public static void main(String[] args) {
        final String color = "red";
        switch (color) {
            case "red":
                System.out.println("IS RED!");
                break;
            case "black":
                System.out.println("IS BLACK");
                break;
            case "blue":
                System.out.println("IS BLUE");
                break;
            case "green":
                System.out.println("IS GREEN");
                break;
        }
    }
}
Running the Java 7 "javap" against this class generates an interesting set of instructions (excerpt of the disassembled code):
public static void main(java.lang.String[]);
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=4, args_size=1
...
12: lookupswitch { // 4
112785: 56
3027034: 84
93818879: 70
98619139: 98
default: 109
}
56: aload_2
57: ldc #2 // String red
...
110: tableswitch { // 0 to 3
0: 140
1: 151
2: 162
3: 173
default: 181
}
140: getstatic #8 // Field java/lang/System.out:Ljava/io/PrintStream;
143: ldc #9 // String IS RED!
...
181: return
"LOOKUPSWITCH" is the instruction used when the case values of a switch are sparse; it takes the place of "TABLESWITCH", which is the default instruction for "switch" statements.
So, the question is, why are we seeing a "LOOKUPSWITCH" followed by a "TABLESWITCH"?
Thanks
Luciano

With strings in a switch, finding the correct case statement is a two-step process.
1. Compute the hash code of the switch string and look for a hash code match among the case statements. This is done via LOOKUPSWITCH. Note the large integers under LOOKUPSWITCH: these are the hash codes of the strings in the case statements.
2. Two different strings can have the same hash code, however unlikely that may be, so the actual string comparison must still take place. Once the hash code is matched, the switch string is compared with the string in the matched case statement. The instructions between LOOKUPSWITCH and TABLESWITCH do exactly this. Once the match is confirmed, the code to execute for the matched case statement is reached via TABLESWITCH.
Also note that it is useful to specify which compiler you used: javac or ECJ (Eclipse Compiler for Java). The two compilers may generate different bytecode.
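To make the two steps concrete, here is a rough hand-written equivalent of what the disassembly above seems to be doing. It is only a sketch: the class name, the method name and the index variable are made up for illustration, and the exact code a given compiler emits may differ. The hash constants are the ones shown under LOOKUPSWITCH, and the indices 0 to 3 follow the order of the cases in the source.

```java
// Hand-written approximation of a lowered String switch.
// StringSwitchDesugared, describe and index are illustrative names only.
class StringSwitchDesugared {
    static String describe(String color) {
        int index = -1;
        // Step 1: switch on the hash code (LOOKUPSWITCH), then confirm with
        // equals() because two different strings may share a hash code.
        switch (color.hashCode()) {
            case 112785:   // "red".hashCode()
                if (color.equals("red")) index = 0;
                break;
            case 93818879: // "black".hashCode()
                if (color.equals("black")) index = 1;
                break;
            case 3027034:  // "blue".hashCode()
                if (color.equals("blue")) index = 2;
                break;
            case 98619139: // "green".hashCode()
                if (color.equals("green")) index = 3;
                break;
            default:
                break;
        }
        // Step 2: switch on the small dense index (TABLESWITCH).
        switch (index) {
            case 0:  return "IS RED!";
            case 1:  return "IS BLACK";
            case 2:  return "IS BLUE";
            case 3:  return "IS GREEN";
            default: return "no match";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("red"));
    }
}
```

The second switch is over consecutive small integers, which is why a TABLESWITCH is a natural fit there, while the hash codes are far apart and need a LOOKUPSWITCH.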

Related

Unreachable statement: while true vs if true [duplicate]

How should I understand this Java compiler behaviour?
while (true) return;
System.out.println("I love Java");
// Err: unreachable statement
if (true) return;
System.out.println("I hate Java");
// OK.
Thanks.
EDIT:
I figured out the point after a few minutes:
In the first case the compiler reports an error because of the infinite loop. In both cases the compiler does not consider the code inside the statement body.
EDIT II:
What impresses me about javac now is:
if (true) return; // Correct
}
while (true) return; // Correct
}
It looks like javac knows what is inside both the loop body and the if body,
but when you add another statement after them (as in the first example) you get non-equivalent behaviour (as if javac forgot what is inside the loop/if).
public static final EDIT III:
As the result of this answer I may remark (hopefully correct):
Statements such as if (arg) { ...; return;} and while (arg) { ...; return;} are equivalent both semantically and syntactically (in bytecode) for Java iff arg is a non-constant (or effectively final type) expression. If arg is a constant expression, the bytecode (and behaviour) may differ.
Disclaimer
This question is not about unreachable statements but about the different handling of logically equivalent statements such as while true return and if true return.
There are quite strict rules for when statements are reachable in Java. These rules are designed to be easy to evaluate, not to be 100% accurate; they are meant to prevent basic programming errors. To reason about reachability in Java you are restricted to these rules; "common logic" does not apply.
So here are the rules from the Java Language Specification, section 14.21, Unreachable Statements:
An if-then statement can complete normally iff it is reachable.
So without an else, a statement following a reachable if-then is always reachable, whatever the condition is.
A while statement can complete normally iff at least one of the following is true:
The while statement is reachable and the condition expression is not a constant expression (§15.28) with value true.
There is a reachable break statement that exits the while statement.
In your example the condition is the constant expression true and there is no break, so the while statement cannot complete normally and the statement after it is unreachable.
According to the docs:
Except for the special treatment of while, do, and for statements whose condition expression has the constant value true, the values of expressions are not taken into account in the flow analysis.
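As a small illustration of these rules, the sketch below shows two variants that javac accepts; the rejected variant is left in a comment because it would not compile. The class, method and field names are invented for this example.

```java
// Illustrative names only; the point is which methods javac accepts.
class ReachabilityDemo {
    static boolean flag = true; // not a constant expression

    static String viaIf() {
        // The value of an if condition is ignored by the flow analysis,
        // so the statement after the if-then counts as reachable.
        if (true) return "returned from if";
        return "after if";
    }

    static String viaWhile() {
        // The condition is not a constant expression, so the while statement
        // "can complete normally" and the next statement is reachable.
        while (flag) return "returned from while";
        return "after while";
    }

    // Rejected with "unreachable statement":
    // static String rejected() {
    //     while (true) return "returned from while";
    //     return "after while";
    // }

    public static void main(String[] args) {
        System.out.println(viaIf());
        System.out.println(viaWhile());
    }
}
```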
If you change your code slightly (remove the constant expression) so that it doesn't trigger javac's reachability check, it will actually produce identical bytecode for both.
static boolean flag = true;
static void twhile(){
    while (flag) return;
    System.out.println("Java");
}
static void tif(){
    if (flag) return;
    System.out.println("Java");
}
The resulting bytecode:
static void twhile();
descriptor: ()V
flags: ACC_STATIC
Code:
stack=2, locals=0, args_size=0
StackMap locals:
StackMap stack:
0: getstatic #10 // Field flag:Z
3: ifeq 7
6: return
StackMap locals:
StackMap stack:
7: getstatic #20 // Field java/lang/System.out:Ljava/io/PrintStream;
10: ldc #26 // String Java
12: invokevirtual #28 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
15: return
LineNumberTable:
line 8: 0
line 9: 7
line 10: 15
LocalVariableTable:
Start Length Slot Name Signature
StackMapTable: number_of_entries = 1
frame_type = 7 /* same */
static void tif();
descriptor: ()V
flags: ACC_STATIC
Code:
stack=2, locals=0, args_size=0
StackMap locals:
StackMap stack:
0: getstatic #10 // Field flag:Z
3: ifeq 7
6: return
StackMap locals:
StackMap stack:
7: getstatic #20 // Field java/lang/System.out:Ljava/io/PrintStream;
10: ldc #26 // String Java
12: invokevirtual #28 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
15: return
LineNumberTable:
line 12: 0
line 13: 7
line 14: 15
LocalVariableTable:
Start Length Slot Name Signature
StackMapTable: number_of_entries = 1
frame_type = 7 /* same */

Short Circuiting vs Multiple if's

What are the differences between this:
if(a && b)
{
    //code
}
and this:
if(a)
{
    if(b)
    {
        //code
    }
}
From what I know b will only get evaluated in the first code block if a is true, and the second code block would be the same thing.
Are there any benefits of using one over the other? Code execution time? memory? etc.
They get compiled to the same bytecode. No performance difference.
Readability is the only difference. As a huge generalization, short-circuiting looks better but nesting is slightly clearer. It really boils down to the specific use case. I'd typically short-circuit.
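If you want to convince yourself that b is evaluated the same number of times in both forms, a counter is enough. This is just a sketch; the class, the helper b() and the counter are made-up names for the experiment.

```java
// Counts how often the second operand is evaluated in each form.
class ShortCircuitDemo {
    static int bCalls = 0;

    // Stand-in for the second condition; records every evaluation.
    static boolean b() {
        bCalls++;
        return true;
    }

    static boolean combined(boolean a) {
        if (a && b()) {
            return true;
        }
        return false;
    }

    static boolean nested(boolean a) {
        if (a) {
            if (b()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        combined(false);
        nested(false);
        System.out.println("calls with a == false: " + bCalls);
        combined(true);
        nested(true);
        System.out.println("calls after a == true: " + bCalls);
    }
}
```

With a false, neither form touches b(); with a true, each form calls it exactly once.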
I tried this out. Here's the code:
public class Test {
    public static void main(String[] args) {
        boolean a = 1>0;
        boolean b = 0>1;
        if (a && b)
            System.out.println(5);
        if (a)
            if (b)
                System.out.println(5);
    }
}
This compiles to:
0: iconst_1
1: istore_1
2: iconst_0
3: istore_2
4: iload_1
5: ifeq 19
8: iload_2
9: ifeq 19
12: getstatic #2
15: iconst_5
16: invokevirtual #3
19: iload_1
20: ifeq 34
23: iload_2
24: ifeq 34
27: getstatic #2
30: iconst_5
31: invokevirtual #3
34: return
Note how this block repeats twice:
4: iload_1
5: ifeq 19
8: iload_2
9: ifeq 19
12: getstatic #2
15: iconst_5
16: invokevirtual #3
Same bytecode both times.
It makes a difference if you have an else associated with each if.
if(a && b)
{
    //do something if both a and b evaluate to true
} else {
    //do something if either of a or b is false
}
and this:
if(a)
{
    if(b)
    {
        //do something if both a and b are true
    } else {
        //do something if only a is true
    }
} else {
    if(b)
    {
        //do something if only b is true
    } else {
        //do something if both a and b are false
    }
}
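To see the difference in practice, here is a small sketch that returns a label for the branch taken in each form. The class name and the labels are only illustrative.

```java
// Shows that the combined form has one "else" while the nested form has three.
class ElseBranchDemo {
    static String combined(boolean a, boolean b) {
        if (a && b) {
            return "both";
        } else {
            return "not both";
        }
    }

    static String nested(boolean a, boolean b) {
        if (a) {
            if (b) {
                return "both";
            } else {
                return "only a";
            }
        } else {
            if (b) {
                return "only b";
            } else {
                return "neither";
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(combined(false, true));
        System.out.println(nested(false, true));
    }
}
```

The combined form cannot tell "only a", "only b" and "neither" apart, so the nested form is the one to use when those cases need different handling.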
If there is nothing in between the two if statements in your second example, then the first one is definitely cleaner and more readable.
But if there is a piece of code that needs to go between the two if conditions, then the second example is the only way.
There shouldn't be a difference, but for readability I would prefer the first one, because it is less verbose and less indented.

Java: How does the == operator work when comparing int?

Given this Java code:
int fst = 5;
int snd = 6;
if(fst == snd)
do something;
I want to know how Java will compare equality for this case. Will it use an XOR operation to check equality?
Are you asking "what native machine code does this turn into?" If so, the answer is "implementation-dependent".
However, if you want to know what JVM bytecode is used, just take a look at the resulting .class file (use e.g. javap to disassemble it).
In case you are asking about the JVM, use the javap program.
public class A {
    public static void main(String[] args) {
        int a = 5;
        System.out.println(5 == a);
    }
}
Here is the disassembly:
public class A {
public A();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: iconst_5
1: istore_1
2: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
5: iconst_5
6: iload_1
7: if_icmpne 14
10: iconst_1
11: goto 15
14: iconst_0
15: invokevirtual #3 // Method java/io/PrintStream.println:(Z)V
18: return
}
In this case it optimized the branching a bit and used if_icmpne. In most cases, it will use if_icmpne or if_icmpeq.
if_icmpeq: if ints are equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes (branchbyte1 << 8) | branchbyte2)
if_icmpne: if ints are not equal, branch to instruction at branchoffset (signed short constructed from unsigned bytes (branchbyte1 << 8) | branchbyte2)
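As a sketch of how the two operand bytes turn into a jump target: javap prints the absolute target, which is the address of the branch instruction plus the signed offset. The class and method names below are invented, and the byte values 0x00 0x07 are inferred from "7: if_icmpne 14" in the listing above (14 - 7 = 7), not read from the class file.

```java
// Illustrative helper for decoding a 16-bit branch offset.
class BranchOffsetDemo {
    // Combine two unsigned operand bytes into a signed 16-bit offset.
    static int branchOffset(int branchbyte1, int branchbyte2) {
        return (short) (((branchbyte1 & 0xFF) << 8) | (branchbyte2 & 0xFF));
    }

    // Absolute target = address of the branch instruction + signed offset.
    static int target(int pc, int branchbyte1, int branchbyte2) {
        return pc + branchOffset(branchbyte1, branchbyte2);
    }

    public static void main(String[] args) {
        System.out.println(target(7, 0x00, 0x07));
    }
}
```

The cast to short is what makes offsets such as 0xFF 0xFE come out negative, which is how backward jumps are encoded.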

significance of && in if()

I would like to know whether there is any difference in performance between these two pieces of code.
String sample="hello";
if(sample!=null)
{
    if(!sample.equals(""))
    {
        // some code in here
    }
}
or
String sample="hello";
if(sample!=null && !sample.equals(""))
{
    // some code in here
}
As far as I understand, in the first piece of code the block is entered only if sample is not null, and the same is true of the second piece of code.
What I would like to know is whether there is a difference in performance or coding standards, and why.
If you're asking about performance you should always measure. But no, there shouldn't be a difference. Besides, if that is your only performance-problematic code then I envy you, seriously.
As for coding standards: less nesting is almost always nicer to read and follow, which means that putting both checks in a single if, especially since they are related, is preferable. The pattern
if (check_foo_for_null && compare_foo)
is very common and thus much less surprising than another nested if.
EDIT: To back it up:
I have the two little methods:
static boolean x(String a) {
    if (a != null && a.equals("Foo"))
        return true;
    else return false;
}
static boolean y(String a) {
    if (a != null) {
        if (a.equals("Foo")) {
            return true;
        } else return false;
    } else return false;
}
which produce the following code:
static boolean x(java.lang.String);
Code:
0: aload_0
1: ifnull 15
4: aload_0
5: ldc #16 // String Foo
7: invokevirtual #21 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
10: ifeq 15
13: iconst_1
14: ireturn
15: iconst_0
16: ireturn
static boolean y(java.lang.String);
Code:
0: aload_0
1: ifnull 17
4: aload_0
5: ldc #16 // String Foo
7: invokevirtual #21 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
10: ifeq 15
13: iconst_1
14: ireturn
15: iconst_0
16: ireturn
17: iconst_0
18: ireturn
So apart from an extraneous else jump target the code is identical. If you don't even have the else:
static boolean z(String a) {
    if (a != null)
        if (a.equals("Foo"))
            return true;
    return false;
}
then the result is really the same:
static boolean z(java.lang.String);
Code:
0: aload_0
1: ifnull 15
4: aload_0
5: ldc #16 // String Foo
7: invokevirtual #21 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
10: ifeq 15
13: iconst_1
14: ireturn
15: iconst_0
16: ireturn
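As everyone else said, there shouldn't be any difference in performance.
Small tip: equals implementations almost always start with an instanceof check, which is false for null, so equals(null) returns false instead of throwing.
So writing:
if( !"".equals(foo)) {...}
is null-safe: it never throws a NullPointerException. Be aware, though, that the condition is true when foo is null, so it is not a drop-in replacement for the null check combined with !foo.equals("").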
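A quick sketch comparing the condition from the question with the literal-first form shows where the two agree and where they don't. The class and method names are made up for the comparison.

```java
// Compares the explicit null check with the literal-first equals call.
class NullSafeEqualsDemo {
    // Condition from the question: non-null and non-empty.
    static boolean explicitCheck(String s) {
        return s != null && !s.equals("");
    }

    // Literal-first form: never throws, but is true for null.
    static boolean literalFirst(String s) {
        return !"".equals(s);
    }

    public static void main(String[] args) {
        System.out.println(explicitCheck(null) + " vs " + literalFirst(null));
    }
}
```

Both methods agree for "hello" and for the empty string; they only differ for null, where the literal-first form lets the block run.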
There is no difference between the two in terms of performance. In the first case the first condition is checked and, if it fails, the block is not entered. In the second case the JVM also checks the first condition and, if it is false, never evaluates the second one, because a logical && is always false when its first operand is false.
In terms of coding standards, I would choose the second option, as it has fewer lines of code.
Most likely the generated bytecode will be equivalent to if(sample!=null && !sample.equals("")) in both cases, since Java performs optimizations at compile time.
If you are talking about the code you actually write, it is better to have only one if, since the structure of two ifs is more complex for the compiler (with no optimization), although I have no empirical data to back this up.

Best method for string pattern matching in Java if performance is a concern

Greetings,
Let's say you wanted to test a string to see if it's an exact match, or, if it's a match with an _ and any number of characters appended following the _
Valid match examples:
MyTestString
MyTestString_
MyTestString_1234
If performance was a huge concern, which methods would you investigate? Currently I am doing the following:
if (String.equals(stringToMatch)) {
    // success
} else {
    if (stringToMatch.contains(stringToMatch + "_")) {
        // success
    }
    // fail
}
I tried replacing the String.contains check for _ with a java.util.regex.Pattern match on _*, but that performed much worse. Is my solution here ideal, or can you think of something more clever to improve performance a bit more?
Thanks for any thoughts
You can do something like
if(string.startsWith(testString)) {
    int len = testString.length();
    if(string.length() == len || string.charAt(len) == '_') {
        // success
    }
}
I assume you want the testString to appear even if you have a "_"?
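Wrapped into a method, the check above might look like the sketch below. The class name, method name and parameter names are placeholders; the three valid examples from the question are used as test inputs.

```java
// Exact match, or prefix followed by '_' and anything else.
class PrefixMatchDemo {
    static boolean matches(String input, String prefix) {
        if (!input.startsWith(prefix)) {
            return false;
        }
        int len = prefix.length();
        // Either nothing follows the prefix, or the next character is '_'.
        return input.length() == len || input.charAt(len) == '_';
    }

    public static void main(String[] args) {
        String prefix = "MyTestString";
        String[] inputs = {"MyTestString", "MyTestString_", "MyTestString_1234", "MyTestStringX"};
        for (String input : inputs) {
            System.out.println(input + " -> " + matches(input, prefix));
        }
    }
}
```

Because startsWith has already succeeded, input is at least as long as prefix, so charAt(len) is only reached when there is a character at that position.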
EDIT: On whether to use one long condition or nested if statements, there is no difference in code or performance.
public static void nestedIf(boolean a, boolean b) {
    if (a) {
        if (b) {
            System.out.println("a && b");
        }
    }
}
public static void logicalConditionIf(boolean a, boolean b) {
    if (a && b) {
        System.out.println("a && b");
    }
}
Both compile to the same code, as javap -c shows:
public static void nestedIf(boolean, boolean);
Code:
0: iload_0
1: ifeq 16
4: iload_1
5: ifeq 16
8: getstatic #7; //Field java/lang/System.out:Ljava/io/PrintStream;
11: ldc #8; //String a && b
13: invokevirtual #9; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
16: return
public static void logicalConditionIf(boolean, boolean);
Code:
0: iload_0
1: ifeq 16
4: iload_1
5: ifeq 16
8: getstatic #7; //Field java/lang/System.out:Ljava/io/PrintStream;
11: ldc #8; //String a && b
13: invokevirtual #9; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
16: return
The compiled code is identical.
You could use regular expressions to match patterns. You can use stringToMatch.matches(".*?_.*?"). This returns a boolean.
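Note that a pattern like ".*?_.*?" accepts any string that contains an underscore and does not check the prefix at all. If you do want to go the regex route, a sketch along these lines would express the rule from the question. The class and method names are invented, and no claim is made about speed; the question already reports that a regex was slower in the asker's tests.

```java
import java.util.regex.Pattern;

// Regex version of "exact match, or prefix + '_' + anything".
class RegexMatchDemo {
    // Compile once and reuse; Pattern.quote keeps the prefix literal.
    static Pattern forPrefix(String prefix) {
        return Pattern.compile(Pattern.quote(prefix) + "(_.*)?", Pattern.DOTALL);
    }

    static boolean matches(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        Pattern pattern = forPrefix("MyTestString");
        System.out.println(matches(pattern, "MyTestString_1234"));
        System.out.println(matches(pattern, "MyTestStringX"));
    }
}
```

Compiling the pattern once matters here, since String.matches compiles a new pattern on every call.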
I ran some benchmarks. This is the quickest I can get.
String a = "Test123";
String b = "Test123_321tseT_Test_rest";
int len1 = a.length();
int len2 = b.length();
if ((len1 == len2 || (len2 > len1 && (b.charAt(len1)) == '_'))
        && b.startsWith(a)) {
    System.out.println("success");
} else {
    System.out.println("Fail");
}
This will at least work correctly at reasonable performance.
Edit: I switched the _ check and the startsWith check, since startsWith will perform worse than the _ check.
Edit2: Fixed StringIndexOutOfBoundsException.
Edit3: Peter Lawrey is correct that making only 1 call to a.length() saves time: 2.2% in my case.
Latest benchmark shows I'm 88% faster than OP and 10% faster than Peter Lawrey's code.
Edit4: I replaced all str.length() calls with a local variable and ran a dozen more benchmarks. Now the results of the benchmarks are getting so random that it's impossible to say which code is faster. My latest version seems to win by a notch.
