String object creation in java [duplicate] - java

Possible Duplicate:
Questions about Java’s String pool
I have a question about String object creation in Java.
String s1 = "Hello"+"world";
String s2 = s1+"Java";
In this program, how many String objects will be created, and how? Please explain.
Thanks.

The answer is 3
Two String objects will be created once per JVM start:
"Helloworld"
"Java"
Both will be interned, because they are constants (known at compile time).
They will be reused every time this code runs. A StringBuilder will be created to concatenate the two Strings above, and its toString() call produces the third String, "HelloworldJava". References to "Helloworld" and to that result will be assigned to s1 and s2.
Here's the bytecode for the code:
0: ldc #37; //String Helloworld
2: astore_1
3: new #39; //class java/lang/StringBuilder
6: dup
7: aload_1
8: invokestatic #41; //Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
11: invokespecial #47; //Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
14: ldc #50; //String Java
16: invokevirtual #52; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
19: invokevirtual #56; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
22: astore_2
23: return
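The difference between the compile-time constant and the run-time result can be observed with a reference comparison. This is a minimal sketch, not part of the original answer; the class name ConcatIdentityDemo and the helper check() are made up for illustration.

```java
class ConcatIdentityDemo {
    // Returns three checks: s1 is the interned constant, s2 is not, s2 has the same content.
    static boolean[] check() {
        String s1 = "Hello" + "world"; // folded by javac into the constant "Helloworld"
        String s2 = s1 + "Java";       // s1 is not final, so this is built at run time
        return new boolean[] {
            s1 == "Helloworld",           // same interned instance
            s2 == "HelloworldJava",       // newly created object, not the literal
            s2.equals("HelloworldJava")   // content is equal anyway
        };
    }

    public static void main(String[] args) {
        boolean[] r = check();
        System.out.println(r[0]); // true
        System.out.println(r[1]); // false
        System.out.println(r[2]); // true
    }
}
```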

You can't really say how many Strings are created, since there are several differences between JVM implementations.
As String is an immutable class, the naive answer is 5. But with some optimization (e.g. using a StringBuffer/StringBuilder) there would only be 2 Strings,
as the concatenations would be combined via append() calls.
Edit: As there are some different answers here, an explanation of why I said 5:
"Hello"
"world"
(s1) "Helloworld"
"Java"
(s2) "HelloworldJava"

If you look at the decompiled code, you can easily tell:
String s1 = "Helloworld";
String s2 = (new StringBuilder(String.valueOf(s1))).append("Java").toString();
We can't know accurately just by looking at the source code, as many optimizations are done by the compiler before execution.
Here we see that 1 String object is created for s1, and another String object for s2. There are 2 string literals in the string pool: "Helloworld" and "Java".

If you decompile your Program.class you will see the real code:
String s1 = "Helloworld";
String s2 = (new StringBuilder(String.valueOf(s1))).append("Java").toString();
It seems to be 10 objects, because inside each String there is a char[] value, which is a separate object, plus another char[] inside the StringBuilder.

The answer is 3.
You can view the disassembled result with:
javap -verbose YourClass
The Constant pool includes:
...
const #17 = Asciz Helloworld;
...
const #30 = Asciz Java;
...
This means the two strings ("Helloworld" and "Java") are compile-time constant expressions, which are interned into the constant pool automatically.
The code:
Code:
Stack=3, Locals=3, Args_size=1
0: ldc #16; //String Helloworld
2: astore_1
3: new #18; //class java/lang/StringBuilder
6: dup
7: aload_1
8: invokestatic #20; //Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
11: invokespecial #26; //Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
14: ldc #29; //String Java
16: invokevirtual #31; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
19: invokevirtual #35; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
22: astore_2
23: return
This indicates that s2 is created by StringBuilder.append() and toString().
To make this more interesting, javac can also apply constant folding. Try to guess how many strings the following code creates:
final String s1 = "Hello" + "world";
String s2 = s1 + "Java";
"final" makes s1 a constant, which lets javac compute the value of s2 at compile time and intern it. So the count of strings created here is 2.

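The effect of final on constant folding, as described in the answer above, can be checked with ==. The following is a small sketch under that assumption; the class FinalFoldingDemo and its method names are hypothetical and not from the answer.

```java
class FinalFoldingDemo {
    // With final, s1 is a constant variable, so s1 + "Java" is a constant expression.
    static boolean foldedWithFinal() {
        final String s1 = "Hello" + "world";
        String s2 = s1 + "Java";
        return s2 == "HelloworldJava"; // both refer to the same interned constant
    }

    // Without final, the concatenation happens at run time and yields a new object.
    static boolean foldedWithoutFinal() {
        String s1 = "Hello" + "world";
        String s2 = s1 + "Java";
        return s2 == "HelloworldJava";
    }

    public static void main(String[] args) {
        System.out.println(foldedWithFinal());    // true
        System.out.println(foldedWithoutFinal()); // false
    }
}
```
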
Yup, there will be five String objects created. Strings are immutable; these are the steps:
1. First "Hello"
2. then another object "Hello World"
3. then another object "Java"
4. then s1+"Java"
and then finally s2 would be created.

Actually, no String objects will be created, just two String literals.
When Strings are initialized the way you have done, they are literals, not objects. If you wanted to create String objects, you would do the following:
String a = new String("abcd");


If String is immutable and the String constant pool holds only one copy of a particular value, how does the following scenario give two different outputs? [duplicate]

String s1 = "A"+"B";
String s2 = "AB";
System.out.println(s1 == s2); // true
s1 and s2 should both refer to the String value "AB" in the String constant pool. There is one value "AB" and one reference, so the reference comparison gives true. (This matches the theory I am aware of.)
String st1 = "C D";
st1 += " E";
String str2 = "C D E";
System.out.println(st1 == str2); // false
st1 and str2 should both refer to the String "C D E" in the String constant pool (two identical values cannot be in the String pool). Then why does the reference comparison of st1 and str2 return false?
What am I missing here?
Thanks
The answer is very simple.
When you write st1 += " E" or st1 = st1 + " E", the compiler transforms it into StringBuilder calls, like st1 = new StringBuilder(st1).append(" E").toString().
When you look at StringBuilder.toString():
@Override
public String toString() {
// Create a copy, don't share the array
return new String(value, 0, count);
}
... you see new String(): it creates a new String instead of reusing a constant from the String pool.
Finally, if you replace st1 += " E" with st1 = (st1 + " E").intern(), you get true as the result.
P.S.
If I remember correctly, this has been the case since Java 6 or so. That's why you can now use string concatenation str = str1 + str2 (internally this is str = new StringBuilder(str1).append(str2).toString()). But it does not help when you use it inside a loop:
String str = "A";
for(int i = 1; i <= 3; i++)
str += i;
The compiler does not hoist the StringBuilder out of the loop: every iteration creates a new StringBuilder and a new temporary String, which is very slow.
In addition to Alex Salauyou's and oleg.cherednik's answers:
In the bytecode below you can see that string s1 was evaluated at compile time, as Alex Salauyou said, but the second string was constructed with StringBuilder, as oleg.cherednik explained.
Code:
0: ldc #2 // String AB <- s1 evaluated on compile time there
2: astore_1
3: ldc #2 // String AB
5: astore_2
6: getstatic #3 // Field java/lang/System.out:Ljava/io/PrintStream;
9: aload_1
10: aload_2
11: if_acmpne 18
14: iconst_1
15: goto 19
18: iconst_0
19: invokevirtual #4 // Method java/io/PrintStream.println:(Z)V
22: ldc #5 // String C D
24: astore_3
25: new #6 // class java/lang/StringBuilder
28: dup
29: invokespecial #7 // Method java/lang/StringBuilder."<init>":()V
32: aload_3
33: invokevirtual #8 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
36: ldc #9 // String E
38: invokevirtual #8 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
41: invokevirtual #10 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
44: astore_3
45: ldc #11 // String C D E
47: astore 4
49: getstatic #3 // Field java/lang/System.out:Ljava/io/PrintStream;
52: aload_3
53: aload 4
55: if_acmpne 62
58: iconst_1
59: goto 63
62: iconst_0
63: invokevirtual #4 // Method java/io/PrintStream.println:(Z)V
66: return
The rule you mentioned concerns string literals, i.e. pieces of code written in double quotes, like "AB". The result of an operation that can be evaluated at compile time (a "constant expression") is immediately replaced by the resulting literal:
String st1 = "A" + "B";
is the same as writing:
String st1 = "AB";
And equal literals always evaluate to the same String object, as stated in the specification (https://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.5):
Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned"
In your question, you assume that the constant pool cannot hold two different instances of the same value. That is right, but it does not imply that every String instance resides in the constant pool. In fact, nothing guarantees that equal strings created at run time are the same object, for example:
String st1 = "AB";
String st2 = new String(st1);
String st3 = new String(st1);
will result in three different String instances: the first is the result of evaluating the literal, taken from the constant pool; the others are newly constructed objects (new always creates a new object!). == on them will return false unless you first pass them through String#intern:
st1 == st2.intern(); // true
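The three-instance case can be turned into a small runnable check. This is only an illustrative sketch; the class name InternDemo and the helper check() are assumptions, not something from the answer.

```java
class InternDemo {
    // Compares a literal with two copies made through the String(String) constructor.
    static boolean[] check() {
        String st1 = "AB";            // literal, taken from the string pool
        String st2 = new String(st1); // new object with the same content
        String st3 = new String(st1); // another new object
        return new boolean[] {
            st1 == st2,                           // false: different objects
            st2 == st3,                           // false: different objects
            st1.equals(st2) && st2.equals(st3),   // true: same content
            st1 == st2.intern(),                  // true: intern() returns the pooled instance
            st2.intern() == st3.intern()          // true: one pooled instance per value
        };
    }

    public static void main(String[] args) {
        for (boolean b : check()) {
            System.out.println(b);
        }
    }
}
```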

How these two references point to the same Object in the heap? [closed]

Could anyone explain how these two references will point to same object in the heap?
String strA = new String("APPLES");
String strB = new String("APPLES");
Java will use the same heap space for the literal "APPLES" when it detects the same String constant.
You can clearly see this in the bytecode:
public static void main(java.lang.String[]);
Code:
0: new #21 // class java/lang/String
3: dup
4: ldc #23 // String APPLES
6: invokespecial #25 // Method java/lang/String."<init>":(Ljava/lang/String;)V
9: astore_1
10: new #21 // class java/lang/String
13: dup
14: ldc #23 // String APPLES
16: invokespecial #25 // Method java/lang/String."<init>":(Ljava/lang/String;)V
19: astore_2
20: getstatic #28 // Field java/lang/System.out:Ljava/io/PrintStream;
23: invokevirtual #34 // Method java/io/PrintStream.println:()V
26: return
As you can see, #23 is used for both objects, which means the literal "APPLES" passed to each constructor is the same constant pool entry.
Now let's look at a different string:
String strA = new String("APPLES");
String strB = new String("BANANA");
The byteCode for this is:
public static void main(java.lang.String[]);
Code:
0: new #21 // class java/lang/String
3: dup
4: ldc #23 // String APPLES
6: invokespecial #25 // Method java/lang/String."<init>":(Ljava/lang/String;)V
9: astore_1
10: new #21 // class java/lang/String
13: dup
14: ldc #28 // String BANANA
16: invokespecial #25 // Method java/lang/String."<init>":(Ljava/lang/String;)V
19: astore_2
20: return
Now BANANA will be located at a different memory location, as seen in the bytecode above. That is because Java now knows that it differs from APPLES, so it needs to be stored at a different memory location.
You are most likely experiencing string interning
3.10.5. String Literals
Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.
There's also the method String.intern for reference.
String strA = new String("APPLES");
String strB = new String("APPLES");
System.out.println(strA.equals(strB));
System.out.println(strA == strB);
Result:
true
false
The Strings are the same because all of their characters match, but they are not the same INSTANCE. They are totally different objects bearing the same value.
The two new String variables do not point to the same object on the heap. Since you are using the new operator, it will almost always allocate the object on the heap. Since you are using it twice, it will create two different objects. Rod_Algonquin and Patrick are partially correct in their answers in that the String literal "APPLES" does point to the same object, but that object is passed to the String constructor. The two variables do not point to the same object as the String literal "APPLES". All strings known at compile time are interned at run time to save space. But any string created using new will not point to the same object as a literal unless you use String.intern.
Edit: A very aggressive JVM could potentially optimize cases like this to make all references point to the same object, but to my knowledge no current JVM does this for any String created using new. It seems like a pointless optimization for a JVM, as it doesn't add much benefit and a programmer can detect such cases much more easily than a JVM can.

String Concatenation and Autoboxing in Java

When you concatenate a String with a primitive such as int, does it autobox the value first?
For example:
String string = "Four" + 4;
How does it convert the value to a string in Java?
To see what the Java compiler produces, it is always useful to run javap -c and look at the actual bytecode.
For example the following Java code:
String s1 = "Four" + 4;
int i = 4;
String s2 = "Four" + i;
would produce the following bytecode:
0: ldc #2; //String Four4
2: astore_1
3: iconst_4
4: istore_2
5: new #3; //class java/lang/StringBuilder
8: dup
9: invokespecial #4; //Method java/lang/StringBuilder."<init>":()V
12: ldc #5; //String Four
14: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
17: iload_2
18: invokevirtual #7; //Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
21: invokevirtual #8; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
24: astore_3
25: return
From this we can see:
In the case of "Four" + 4, the Java compiler (I was using JDK 6) was clever enough to deduce that this is a constant, so there is no computational effort at runtime, as the string is concatenated at compile time.
In the case of "Four" + i, the equivalent code is new StringBuilder().append("Four").append(i).toString().
Autoboxing is not involved here, as there is a StringBuilder.append(int) method which, according to the docs, uses String.valueOf(int) to create the string representation of the integer.
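The three observations above can be compared side by side in code. This is a hedged sketch: the class ConcatIntDemo and its method names are invented for illustration, and the StringBuilder variant mirrors the JDK 6 bytecode shown above (newer compilers may emit something different for the + form while producing the same result).

```java
class ConcatIntDemo {
    // What the source code says.
    static String viaPlus(int i) {
        return "Four" + i;
    }

    // What the JDK 6 bytecode above does: append(int), no Integer object involved.
    static String viaBuilder(int i) {
        return new StringBuilder().append("Four").append(i).toString();
    }

    // The conversion that append(int) is documented to be equivalent to.
    static String viaValueOf(int i) {
        return "Four" + String.valueOf(i);
    }

    // "Four" + 4 is a constant expression, so it is the same interned object as "Four4".
    static boolean literalIsFolded() {
        return ("Four" + 4) == "Four4";
    }

    public static void main(String[] args) {
        System.out.println(viaPlus(4));        // Four4
        System.out.println(viaBuilder(4));     // Four4
        System.out.println(viaValueOf(4));     // Four4
        System.out.println(literalIsFolded()); // true
    }
}
```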
The Java compiler actually creates a StringBuilder (1) and invokes the append() method. It can be seen in the bytecode:
22 invokespecial java.lang.StringBuilder(java.lang.String) [40]
...
29 invokevirtual java.lang.StringBuilder.append(int) : java.lang.StringBuilder [47]
32 invokevirtual java.lang.StringBuilder.toString() : java.lang.String [51]
Nevertheless, the behavior is identical to boxing and then invoking toString(): "Four" + new Integer(4).toString(), which I believe is what the language designers had in mind.
(1) To be exact, the compiler already concatenates the string literal and the int literal into a single string literal "Four4". You can see it in the following line of the bytecode:
0 ldc <String "Four4"> [19]
According to http://jcp.org/aboutJava/communityprocess/jsr/tiger/autoboxing.html, autoboxing is done on the primitive type whenever a reference type is needed (such as the Integer class in this case).
So the int will be converted into an Integer, then that Integer object's toString() method is called and its result is appended to the preceding string.

How many objects are created

I was having a discussion about usage of Strings and StringBuffers in Java. How many objects are created in each of these two examples?
Ex 1:
String s = "a";
s = s + "b";
s = s + "c";
Ex 2:
StringBuilder sb = new StringBuilder("a");
sb.append("b");
sb.append("c");
In my opinion, Ex 1 will create 5 and Ex 2 will create 4 objects.
I've used a memory profiler to get the exact counts.
On my machine, the first example creates 8 objects:
String s = "a";
s = s + "b";
s = s + "c";
two objects of type String;
two objects of type StringBuilder;
four objects of type char[].
On the other hand, the second example:
StringBuilder sb = new StringBuilder("a");
sb.append("b");
sb.append("c");
creates 2 objects:
one object of type StringBuilder;
one object of type char[].
This is using JDK 1.6u30.
P.S. To make the comparison fair, you probably ought to call sb.toString() at the end of the second example.
In terms of objects created:
Example 1 creates 8 objects:
String s = "a"; // No object created
s = s + "b"; // 1 StringBuilder/StringBuffer + 1 String + 2 char[] (1 for SB and 1 for String)
s = s + "c"; // 1 StringBuilder/StringBuffer + 1 String + 2 char[] (1 for SB and 1 for String)
Example 2 creates 2 objects:
StringBuffer sb = new StringBuffer("a"); // 1 StringBuffer + 1 char[] (in SB)
sb.append("b"); // 0
sb.append("c"); // 0
To be fair, I did not know that new char[] actually created an Object in Java (but I knew they were created). Thanks to aix for pointing that out.
You can determine the answer by analyzing the java bytecode (use javap -c). Example 1 creates two StringBuilder objects (see line #4) and two String objects (see line #7), while example 2 creates one StringBuilder object (see line #2).
Note that you must also take the char[] objects into account (since arrays are objects in Java). String and StringBuilder objects are both implemented using an underlying char[]. Thus, example 1 creates eight objects and example 2 creates two objects.
Example 1:
public static void main(java.lang.String[]);
Code:
0: ldc #2; //String a
2: astore_1
3: new #3; //class java/lang/StringBuilder
6: dup
7: invokespecial #4; //Method java/lang/StringBuilder."<init>":()V
10: aload_1
11: invokevirtual #5; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
14: ldc #6; //String b
16: invokevirtual #5; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
19: invokevirtual #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
22: astore_1
23: new #3; //class java/lang/StringBuilder
26: dup
27: invokespecial #4; //Method java/lang/StringBuilder."<init>":()V
30: aload_1
31: invokevirtual #5; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
34: ldc #8; //String c
36: invokevirtual #5; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
39: invokevirtual #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
42: astore_1
43: return
}
Example 2:
public static void main(java.lang.String[]);
Code:
0: new #2; //class java/lang/StringBuilder
3: dup
4: ldc #3; //String a
6: invokespecial #4; //Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
9: astore_1
10: aload_1
11: ldc #5; //String b
13: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
16: pop
17: aload_1
18: ldc #7; //String c
20: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
23: pop
24: return
}
The answer is tied to specific implementations of the language (compiler and runtime libraries). Even to the presence of specific optimization options or not. And, of course, version of the implementation (and, implicitly, the JLS it is compliant with). So, it's better to speak in term of minima and maxima. In fact, this exercise gives a better
For Ex1, the minimum number of objects is 1 (the compiler realizes that there are only constants involved and produces only code for String s = "abc";). The maximum could be just anything, depending on implementation, but a reasonable estimate is 8 (also reported in another answer as the number produced by a certain configuration).
For Ex2, the minimum number of objects is 2. The compiler has no way of knowing if we have replaced StringBuilder with a custom version with different semantics, so it will not optimize. The maximum could be around 6, for an extremely memory-conserving StringBuilder implementation that expands a backing char[] array one character at a time, but in most cases it will be 2 too.
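Why Ex2 usually stays at two objects can be checked indirectly through the builder's capacity. A minimal sketch, assuming capacity() is a fair proxy for the size of the backing array (the array itself is not reachable through the public API); the class name BuilderCapacityDemo is made up for this illustration.

```java
class BuilderCapacityDemo {
    // Returns {capacity after construction, capacity after both appends, final length}.
    static int[] capacities() {
        StringBuilder sb = new StringBuilder("a");
        int initial = sb.capacity(); // documented as 16 + length of the argument, i.e. 17
        sb.append("b");
        sb.append("c");
        int after = sb.capacity();   // unchanged: 3 characters fit, so no new array is needed
        return new int[] { initial, after, sb.length() };
    }

    public static void main(String[] args) {
        int[] c = capacities();
        System.out.println(c[0] == c[1]); // true: the appends did not force a reallocation
        System.out.println(c[2]);         // 3
    }
}
```

An implementation that grew its array one character at a time would report a different capacity after each append, which is the situation behind the higher maximum mentioned above.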

What could be the right test case for evaluating the performance of assignment operators vs. explicit operations?

Simply put, I am trying to figure out the fastest way to assign a value, like
somevar+=1;
or
somevar=somevar+1;
Some time ago, in situations with text instead of integers, I encountered a performance decrease when using text+="sometext" instead of text.append("sometext") or text=text+"sometext". The problem is that I can no longer find the source code where I noted my observations. So theoretically, what's the fastest way?
The code runs inside a fast loop, nearly real time.
If you have something like this:
Collection<String> strings = ...;
String all = "";
for (String s : strings) all += s;
... then it's equivalent to:
Collection<String> strings = ...;
String all = "";
for (String s : strings) all = new StringBuilder(all).append(s).toString();
Each iteration creates a new StringBuilder which is essentially a copy of all, appends a copy of s to it, and then copies the result of the concatenation to a new String. Obviously, using a single StringBuilder saves a lot of unnecessary allocations:
Collection<String> strings = ...;
StringBuilder sb = new StringBuilder();
for (String s : strings) sb.append(s);
String all = sb.toString();
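The two loop variants can be put into one runnable class to confirm that they build the same text. This is a sketch with a made-up input list and hypothetical names (JoinDemo, withPlusEquals, withOneBuilder); it shows equivalence of the results only and does not measure performance.

```java
import java.util.Arrays;
import java.util.List;

class JoinDemo {
    // Allocates a fresh builder (or equivalent) and a fresh String on every iteration.
    static String withPlusEquals(List<String> strings) {
        String all = "";
        for (String s : strings) {
            all += s;
        }
        return all;
    }

    // Reuses one builder and creates the final String once.
    static String withOneBuilder(List<String> strings) {
        StringBuilder sb = new StringBuilder();
        for (String s : strings) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> parts = Arrays.asList("a", "b", "c"); // sample data, chosen for the example
        System.out.println(withPlusEquals(parts)); // abc
        System.out.println(withOneBuilder(parts)); // abc
    }
}
```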
As for x += y versus x = x + y, they compile to the same thing.
class Concat {
public String concat1(String a, String b) {
a += b;
return a;
}
public String concat2(String a, String b) {
a = a + b;
return a;
}
}
Compile it with javac, and then disassemble it with javap:
$ javap -c Concat
Compiled from "Concat.java"
class Concat extends java.lang.Object{
Concat();
Code:
0: aload_0
1: invokespecial #1; //Method java/lang/Object."<init>":()V
4: return
public java.lang.String concat1(java.lang.String, java.lang.String);
Code:
0: new #2; //class java/lang/StringBuilder
3: dup
4: invokespecial #3; //Method java/lang/StringBuilder."<init>":()V
7: aload_1
8: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
11: aload_2
12: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
15: invokevirtual #5; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
18: astore_1
19: aload_1
20: areturn
public java.lang.String concat2(java.lang.String, java.lang.String);
Code:
0: new #2; //class java/lang/StringBuilder
3: dup
4: invokespecial #3; //Method java/lang/StringBuilder."<init>":()V
7: aload_1
8: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
11: aload_2
12: invokevirtual #4; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
15: invokevirtual #5; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
18: astore_1
19: aload_1
20: areturn
}
Personally, I'd favor += because with it, you make a clearer statement of intent: "I want to add the content of b to a". Any variations in performance between the two forms are, with 100% certainty, the result of something outside of your code (e.g., GC pauses, random cache misses or something like it).
Bad compilers might also have a slightly easier time optimizing the += form (which is irrelevant to you, since even if javac were crappy, HotSpot sure isn't).
Strings are immutable. Hence, concatenating 2 Strings causes a new String object to be created. Using StringBuilder will gain you a lot of performance.
However, primitive numeric types are not immutable and not heap allocated. Whatever you do with them tends to be really fast no matter what :)
The internals of String 'arithmetic' are rather complicated, simply because these operations happen quite often and the compiler tries to optimize wherever it can.
First of all, append("something") is not a String method. It is a method from the StringBuffer or StringBuilder class.
The other expressions are quite equivalent. I'm pretty sure one can't look at the pattern and decide that one is faster in general.
In general, and because Strings are immutable, a concatenation will create a new String for each String literal, a new String for each intermediate result and one new String for the result. So this
String s = "one" + "two" + "three";
will require five String objects ("one", "two", "three", "onetwo", "onetwothree").
The first compiler optimization is 'interning' the literals. In short: "one", "two" and "three" are not created but reused.
A second compiler optimization is replacing the String addition by StringBuilder operation (if it makes sense). So the concatenation could be replaced by
StringBuilder sb = new StringBuilder("one");
sb.append("two");
sb.append("three");
String s = sb.toString();
(just to illustrate: a real compiler might not use this strategy for this example code, it might concatenate the literals during compilation already...)
