Does Java compiler optimize stream filtering? - java

Let's have a case:
x.stream().filter(X::isFlag).filter(this::isOtherFlag).reduce(...)
Does it differ from this one?
x.stream().filter(predicate(X::isFlag).and(this::isOtherFlag)).reduce(...)

Functionally, the two statements are equivalent. However, consider the two following blocks of code and their respective bytecodes:
public static void main(String[] args) {
List<String> list = List.of("Seven", "Eight", "Nine");
list.stream().filter(s -> s.length() >= 5)
.filter(s -> s.contains("n"))
.forEach(System.out::println);
}
public static void main(java.lang.String[]);
Code:
0: ldc #16 // String Seven
2: ldc #18 // String Eight
4: ldc #20 // String Nine
6: invokestatic #22 // InterfaceMethod java/util/List.of:(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/util/List;
9: astore_1
10: aload_1
11: invokeinterface #28, 1 // InterfaceMethod java/util/List.stream:()Ljava/util/stream/Stream;
16: invokedynamic #35, 0 // InvokeDynamic #0:test:()Ljava/util/function/Predicate;
21: invokeinterface #36, 2 // InterfaceMethod java/util/stream/Stream.filter:(Ljava/util/function/Predicate;)Ljava/util/stream/Stream;
26: invokedynamic #42, 0 // InvokeDynamic #1:test:()Ljava/util/function/Predicate;
31: invokeinterface #36, 2 // InterfaceMethod java/util/stream/Stream.filter:(Ljava/util/function/Predicate;)Ljava/util/stream/Stream;
36: getstatic #43 // Field java/lang/System.out:Ljava/io/PrintStream;
39: invokedynamic #52, 0 // InvokeDynamic #2:accept:(Ljava/io/PrintStream;)Ljava/util/function/Consumer;
44: invokeinterface #53, 2 // InterfaceMethod java/util/stream/Stream.forEach:(Ljava/util/function/Consumer;)V
49: return
-
public static void main(String[] args) {
List<String> list = List.of("Seven", "Eight", "Nine");
list.stream().filter(s -> s.length() >= 5 && s.contains("n"))
.forEach(System.out::println);
}
public static void main(java.lang.String[]);
Code:
0: ldc #16 // String Seven
2: ldc #18 // String Eight
4: ldc #20 // String Nine
6: invokestatic #22 // InterfaceMethod java/util/List.of:(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/util/List;
9: astore_1
10: aload_1
11: invokeinterface #28, 1 // InterfaceMethod java/util/List.stream:()Ljava/util/stream/Stream;
16: invokedynamic #35, 0 // InvokeDynamic #0:test:()Ljava/util/function/Predicate;
21: invokeinterface #36, 2 // InterfaceMethod java/util/stream/Stream.filter:(Ljava/util/function/Predicate;)Ljava/util/stream/Stream;
26: getstatic #42 // Field java/lang/System.out:Ljava/io/PrintStream;
29: invokedynamic #51, 0 // InvokeDynamic #1:accept:(Ljava/io/PrintStream;)Ljava/util/function/Consumer;
34: invokeinterface #52, 2 // InterfaceMethod java/util/stream/Stream.forEach:(Ljava/util/function/Consumer;)V
39: return
We can see that, in the second example, one call to invokedynamic and invokeinterface are missing (which makes sense as we omitted a call to filter). I'm sure someone could assist with me with the static analysis of this bytecode (I can post verbose files if needed), but the Java compiler clearly treats the single call to filter as a single Predicate<String> rather than splitting it at the operator &&, shortening the bytecode slightly.

Related

How does final keyword work with string immutability? [duplicate]

This question already has an answer here:
How is concatenation of final strings done in Java?
(1 answer)
Closed 2 years ago.
I have the following code. I understand the concept of java string immutability and string constant pool. I don't understand why 'name1 == name2' results false and 'name2 == name3' results true in the following program. How are the string variables name1, name2, and name3 placed in the string constant pool?
public class Test {
public static void main(String[] args) {
final String firstName = "John";
String lastName = "Smith";
String name1 = firstName + lastName;
String name2 = firstName + "Smith";
String name3 = "John" + "Smith";
System.out.println(name1 == name2);
System.out.println(name2 == name3);
}
}
Output:
false
true
The answer is fairly simple: As a shortcut, java treats certain concepts as a so-called 'compile time constant' (CTC). The idea is to entirely inline a variable, at the compilation level (which is extraordinary; normally javac basically just bashes your java source file into a class file using very simple and easily understood transformations, and the fancypants optimizations occur at runtime during hotspot).
For example, if you do this:
Save to UserOfBatch.java:
class BatchOConstants {
public static final int HELLO = 5;
}
public class UserOfBatch {
public static void main(String[] args) {
System.out.println(BatchOConstants.HELLO);
}
}
Run on the command line:
> javac UserOfBatch.java
> java UserOfBatch
5
> javap -c UserOfBatch # javap prints bytecode
public static void main(java.lang.String[]);
Code:
0: getstatic #7 // Field java/lang/System.out:Ljava/io/PrintStream;
3: iconst_5
4: invokevirtual #15 // Method java/io/PrintStream.println:(I)V
7: return
Check out line 3 up there. iconst_5. That 5? It was hardcoded!!
No reference to BatchOConstants remains. Let's test that:
on command line:
> rm BatchOConstants.class
> java UserOfBatch
5
Wow. The code ran even though it's missing the very class file that should be providing that 5, thus proving, it was 'hardcoded' by the compiler itself and not the runtime.
Another way to 'observe' CTC-ness of any value is annoparams. An annotation parameter must be hardcoded into the class file by javac, so you can't pass expressions. Given:
public #interface Foo {
long value();
I can't write: #Foo(System.currentTimeMillis()), because System.cTM obviously isn't a compile time constant. But I can write #Foo(SomeClass.SOME_STATIC_FINAL_LONG_FIELD) assuming that the value assigned to S_S_F_L_F is a compile time constant. If it's not, that #Foo(...) code would not compile. If it is, it will compile: CTC-ness now determines whether your code compiles or not.
There are specific rules about when the compiler is allowed to construe something as a 'compile time constant' and go on an inlining spree. For example, null is not an inline constant, ever. Oversimplifying, but:
For fields, the field must be static and final and have a primitive or String type, initialized on the spot (in the same breath, not later in a static block), with a constant expression, that isn't null.
For local variables, the rules are very similar, except, the need for them to be static is obviously waived as they cannot be. Other than that, all the fixins apply: final, primitive-or-String, non-null, initialized on the spot, and with a constant expression.
Unfortunately, CTC-ness of a local is harder to directly observe. The code you wrote is a fine way to indirectly observe it though. You've proven with your prints that firstName is CTC and lastName is not.
We can observe a select few things though. So let's take your code, compile it, and toss it at javap to witness the results. Via Jonas Konrad's online javap tool, let's analyse:
public Main() {
final String a = "hello";
String b = "world";
String c = a + "!";
String d = b + "!";
System.out.println(c == "hello!");
System.out.println(d == "world!");
}
The relevant parts:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: ldc #7 // String hello
6: astore_1
start local 1 // java.lang.String a
7: ldc #9 // String world
9: astore_2
start local 2 // java.lang.String b
10: ldc #11 // String hello!
12: astore_3
start local 3 // java.lang.String c
13: aload_2
14: invokedynamic #13, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
19: astore 4
start local 4 // java.lang.String d
Note how 'start local 2' (which is c; javap starts counting at 0) shows just loading hello! as a complete constant, but 'start local 3' (which is d) shows loading 2 constants and invoking makeConcat to tie em together.
run javap -c Test after you compiled the code,
you will see that
Compiled from "Test.java"
public class Test {
public Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: ldc #2 // String Smith
2: astore_2
3: aload_2
4: invokedynamic #3, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
9: astore_3
10: ldc #4 // String JohnSmith
12: astore 4
14: ldc #4 // String JohnSmith
16: astore 5
18: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
21: aload_3
22: aload 4
24: if_acmpne 31
27: iconst_1
28: goto 32
31: iconst_0
32: invokevirtual #6 // Method java/io/PrintStream.println:(Z)V
35: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
38: aload 4
40: aload 5
42: if_acmpne 49
45: iconst_1
46: goto 50
49: iconst_0
50: invokevirtual #6 // Method java/io/PrintStream.println:(Z)V
53: return
}
As you can see
4: invokedynamic #3, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
in public static void main(java.lang.String[]);, this is the actual compiled byte code when running firstname + lastname. So generated String will not have the same hashcode as "JohnSmith" in constant pool.
where as on
10: ldc #4 // String JohnSmith
and
14: ldc #4 // String JohnSmith
this is the byte code generated by the compiler when doing both
firstname + "Smith" and "John" + "Smith" which means both actually reading from the constant pool.
This is the reason why when you comparing name1 with name2 using == it will return false. since name2 and name3 reference the same string from the constant pool. Hench it return true when compare with ==.
This is the reason why it is not a good idea to compare 2 string with ==. Please use String.equals() when doing String comparison.
since both
Let's look at the bytecode with final:
public class Test {
public Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: ldc #7 // String Smith
2: astore_1
3: aload_1
4: invokedynamic #9, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
9: astore_2
10: ldc #13 // String JohnSmith
12: astore_3
13: ldc #13 // String JohnSmith
15: astore 4
17: getstatic #15 // Field java/lang/System.out:Ljava/io/PrintStream;
20: aload_2
21: aload_3
22: if_acmpne 29
25: iconst_1
26: goto 30
29: iconst_0
30: invokevirtual #21 // Method java/io/PrintStream.println:(Z)V
33: getstatic #15 // Field java/lang/System.out:Ljava/io/PrintStream;
36: aload_3
37: aload 4
39: if_acmpne 46
42: iconst_1
43: goto 47
46: iconst_0
47: invokevirtual #21 // Method java/io/PrintStream.println:(Z)V
50: return
}
And the bytecode without final:
public class Test {
public Test();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: ldc #7 // String John
2: astore_1
3: ldc #9 // String Smith
5: astore_2
6: aload_1
7: aload_2
8: invokedynamic #11, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
13: astore_3
14: aload_1
15: invokedynamic #15, 0 // InvokeDynamic #1:makeConcatWithConstants:(Ljava/lang/String;)Ljava/lang/String;
20: astore 4
22: ldc #18 // String JohnSmith
24: astore 5
26: getstatic #20 // Field java/lang/System.out:Ljava/io/PrintStream;
29: aload_3
30: aload 4
32: if_acmpne 39
35: iconst_1
36: goto 40
39: iconst_0
40: invokevirtual #26 // Method java/io/PrintStream.println:(Z)V
43: getstatic #20 // Field java/lang/System.out:Ljava/io/PrintStream;
46: aload 4
48: aload 5
50: if_acmpne 57
53: iconst_1
54: goto 58
57: iconst_0
58: invokevirtual #26 // Method java/io/PrintStream.println:(Z)V
61: return
}
As you can see, with final, Java recognizes both the left and right hand sides of + to be constant, so it replaces the concatenation with the constant string "JohnSmith" at compile time. The only call to makeConcatWithConstants (to concatenate strings) is made for firstName + lastName, since lastName isn't final.
In the second example, there are two calls to makeConcatWithConstants, one for firstName + lastName and another for firstName + "Smith", since Java doesn't recognize firstName as a constant.
That's why name1 == name2 is false in your example: name2 is a constant "JohnSmith" in the string pool, whereas name1 is dynamically computed at runtime. However, name2 and name3 are both constants in the string pool, which is why name2 == name3 is true.

Why this string join algorithm takes so much steps?

Based on the book Cracking the Coding Interview (page 90), the following algorithm requires O(xn²) time (where 'x' represents a length of the string and 'n' is amount of the strings). The code is in Java. Can anybody explain how we obtain such runtime ?
String joinWords(String[] words)
{
String sentence = "";
for(String w : words)
{
sentence = sentence + w;
}
return sentence;
}
For each string that is concatenated to sentence, a new StringBuilder is created, two strings are appended to it using the StringBuilder.append method, and then the resulting string is created using the StringBuilder.toString method. The complexity of this operation is O(n_1 + n_2) where n_1 and n_2 are the lengths of the strings.
In this code, the loop runs n times, and each time it runs, the string sentence of length O(xn) is concatenated with the string w of length x. Therefore the overall complexity is n * O(xn + x) = O(xn^2), as expected.
For the skeptics, here's the disassembled bytecode of the joinWords method; I compiled it using javac 10.0.1 (which is the version I have to hand at the moment). The StringBuilder is used from positions 25 to 41, which are inside the loop (see 48: goto 12).
java.lang.String joinWords(java.lang.String[]);
Code:
0: ldc #2 // String
2: astore_2
3: aload_1
4: astore_3
5: aload_3
6: arraylength
7: istore 4
9: iconst_0
10: istore 5
12: iload 5
14: iload 4
16: if_icmpge 51
19: aload_3
20: iload 5
22: aaload
23: astore 6
25: new #3 // class java/lang/StringBuilder
28: dup
29: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V
32: aload_2
33: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
36: aload 6
38: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
41: invokevirtual #6 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
44: astore_2
45: iinc 5, 1
48: goto 12
51: aload_2
52: areturn

Can a java compiler optimize this code away?

Is a java compiler or runtime ( or any other language compiler ) smart enough to realize branch 3 can never happen , and optimize it away? I've seen this kind of "defensive programming" with many beginning developers, and wonder if this dead weight stays in the bytecode.
import java.util.Random;
class Example
{
public static void main(String[] args) {
int x = new Random().nextInt() % 10;
if ( x < 5 )
{
System.out.println("Case 1");
}
else
if ( x >= 5 )
{
System.out.println("Case 2");
}
else
{
System.out.println("Case 3");
}
}
}
or even this more blunt case
boolean bool = new Random().nextBoolean();
if ( bool )
{
System.out.println("Case 1");
}
else
if ( bool )
{
System.out.println("Case 2");
}
The Java 8 compiler I have doesn't seem to optimize it away. Using "javap -c" to examine the byte code after compiling:
public static void main(java.lang.String[]);
Code:
0: new #2 // class java/util/Random
3: dup
4: invokespecial #3 // Method java/util/Random."<init>":()V
7: invokevirtual #4 // Method java/util/Random.nextInt:()I
10: bipush 10
12: irem
13: istore_1
14: iload_1
15: iconst_5
16: if_icmpge 30
19: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
22: ldc #6 // String Case 1
24: invokevirtual #7 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
27: goto 54
30: iload_1
31: iconst_5
32: if_icmplt 46
35: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
38: ldc #8 // String Case 2
40: invokevirtual #7 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
43: goto 54
46: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
49: ldc #9 // String Case 3
51: invokevirtual #7 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
54: return
}
The string "Case 3" still exists in the byte code.

Which code snippet executes faster?

I can do same thing by two types of code snippet.
First Way:
String makeDate = Integer.toString(now.year) + Integer.toString(now.month) + Integer.toString(now.monthDay);
Or Second Way:
String makeDate = now.year + "" + now.month + "" + now.monthDay;
My question is:
Which method is preferable [First way or Second way]?
Which code snippet will execute faster?
The two snippits you show are nearly identical.
a String in Java is immutable; it can't be changed. When using the concatenation operator (+) the compiler actually generates code using a StringBuilder
For example your second snippit becomes:
String makeDate = new StringBuilder()
.append(now.year)
.append("")
.append(now.month)
.append("")
.append(now.monthDay)
.toString();
You can look at the generated bytecode to see this. Java comes with a program javap that allows you to look at your compiled .class.
I created a simple main() to provide minimal bytecode:
public static void main(String[] args)
{
String makeDate = Integer.toString(1) + Integer.toString(1) + Integer.toString(1);
System.out.println(makeDate);
}
Which produces:
public static void main(java.lang.String[]);
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=2, args_size=1
0: new #2 // class java/lang/StringBuilder
3: dup
4: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
7: iconst_1
8: invokestatic #4 // Method java/lang/Integer.toString:(I)Ljava/lang/String;
11: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
14: iconst_1
15: invokestatic #4 // Method java/lang/Integer.toString:(I)Ljava/lang/String;
18: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
21: iconst_1
22: invokestatic #4 // Method java/lang/Integer.toString:(I)Ljava/lang/String;
25: invokevirtual #5 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
28: invokevirtual #6 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
31: astore_1
32: getstatic #7 // Field java/lang/System.out:Ljava/io/PrintStream;
35: aload_1
36: invokevirtual #8 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
39: return
Versus:
public static void main(String[] args)
{
int i = 1;
String makeDate = i + "" + i + "" + i;
System.out.println(makeDate);
}
Produces:
public static void main(java.lang.String[]);
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=3, args_size=1
0: iconst_1
1: istore_1
2: new #2 // class java/lang/StringBuilder
5: dup
6: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
9: iload_1
10: invokevirtual #4 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
13: ldc #5 // String
15: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
18: iload_1
19: invokevirtual #4 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
22: ldc #5 // String
24: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
27: iload_1
28: invokevirtual #4 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
31: invokevirtual #7 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
34: astore_2
35: getstatic #8 // Field java/lang/System.out:Ljava/io/PrintStream;
38: aload_2
39: invokevirtual #9 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
42: return
Technically the latter is probably faster at some scale that is nearly immeasurable (< 1ns) but for all practical purposes it doesn't matter; use whichever you like.

In a java enhanced for loop, is it safe to assume the expression to be looped over will be evaluated only once?

In the following:
for (String deviceNetwork : deviceOrganizer.getNetworkTypes(deviceManufacturer)) {
// do something
}
Is it safe to assume that deviceOrganizer.getNetworkTypes(deviceManufacturer) will be called only once?
Yes, absolutely.
From section 14.14.2 of the spec:
If the type of Expression is a subtype of Iterable, then let I be the type of the
expression Expression.iterator(). The enhanced for statement is equivalent to a basic for
statement of the form:
for (I #i = Expression.iterator(); #i.hasNext(); ) {
VariableModifiersopt Type Identifier = #i.next();
Statement
}
(The alternative deals with arrays.)
Note how Expression is only mentioned in the first part of the for loop expression - so it's only evaluated once.
Yes, give it a try:
public class ForLoop {
public static void main( String [] args ) {
for( int i : testData() ){
System.out.println(i);
}
}
public static int[] testData() {
System.out.println("Test data invoked");
return new int[]{1,2,3,4};
}
}
Output:
$ java ForLoop
Test data invoked
1
2
3
4
To complement what's been said and verify that the spec is doing what it says, let's look at the generated bytecode for the following class, which implements the old and new style loops to loop over a list returned by a method call, getList():
public class Main {
static java.util.List getList() { return new java.util.ArrayList(); }
public static void main(String[] args) {
for (Object o : getList()) {
System.out.print(o);
}
for (java.util.Iterator itr = getList().iterator(); itr.hasNext(); ) {
Object o = itr.next(); System.out.print(o);
}
}
}
Relevant parts of the output:
0: invokestatic #4; //Method getList
3: invokeinterface #5, 1; //InterfaceMethod java/util/List.iterator
8: astore_1
9: aload_1
10: invokeinterface #6, 1; //InterfaceMethod java/util/Iterator.hasNext
15: ifeq 35
18: aload_1
19: invokeinterface #7, 1; //InterfaceMethod java/util/Iterator.next
24: astore_2
25: getstatic #8; //Field java/lang/System.out
28: aload_2
29: invokevirtual #9; //Method java/io/PrintStream.print
32: goto 9
35: invokestatic #4; //Method getList
38: invokeinterface #10, 1; //InterfaceMethod java/util/List.iterator
43: astore_1
44: aload_1
45: invokeinterface #6, 1; //InterfaceMethod java/util/Iterator.hasNext
50: ifeq 70
53: aload_1
54: invokeinterface #7, 1; //InterfaceMethod java/util/Iterator.next
59: astore_2
60: getstatic #8; //Field java/lang/System.out
63: aload_2
64: invokevirtual #9; //Method java/io/PrintStream.print
67: goto 44
70: return
This shows that the first loop (0 to 32) and the second (35-67) are identical.
The generated bytecode is exactly the same.

Categories