String intern() method usage confusion - java

We use intern() method for the Strings which are created through new String() so that they will create an entry in String Pool and return the newly created String from the String Pool so that this created string is eligible for use with == operator (as per my understanding).
Then what is the use of creating a new String through constructor?
When should we use constructor for creating new String?

new String() can create at-most two Objects of String and at-least one. one in constant pool (if same is not present in constant pool) and another in heap memory.
constant pool entry is created by interning String object.
intern will create String Literal in Constant pool so that whenever you create String without new keyword it will not create string object.
suppose you have created as
String str= new String("abc");
two object will be created one in constant pool (if "abc" is not present in constant pool). jvm wil internally call intern for that object.
so next time if you are doing
String str1= "abc";
no entry will be added to constant pool because only entry can be possible in constant pool for same literal.

public native String intern(); java doc says,
A pool of strings, initially empty, is maintained privately by the class >{#code String}. When the intern method is invoked, if the pool already >contains a string equal to this {#code String} object as determined by >the {#link #equals(Object)} method, then the string from the pool is >returned. Otherwise, this {#code String} object is added to the pool and >a reference to this {#code String} object is returned. It follows that >for any two strings {#code s} and {#code t}, {#code s.intern() == >t.intern()} is {#code true} if and only if {#code s.equals(t)} is {#code >true}.
Let's consider an example:
String temp1 = new String("abcd"); this means, a new object "abcd" will be created in the heap memory and its reference will be linked to temp1.
String temp2 = new String("abcd").intern(); this means "abcd" literal will be checked in the String Pool. If already present, its reference will be linked to newly created temp2. Else a new entry will be created in String Pool and its reference will be linked to newly created temp2.
String temp3 = "abcd"; this means "abcd" literal will be checked in the String Pool. If already present, its reference will be linked to newly created temp2. Else a new entry will be created in String Pool and its reference will be linked to newly created temp3. This conclude that, Java automatically interns String literals.
Let's consider another example:
String s1 = "string-test";
String s2 = new String("string-test");
String s3 = new String("string-test").intern();
if ( s1 == s2 ){
System.out.println("s1 and s2 are same");
}
if ( s2 == s3 ){
System.out.println("s2 and s3 are same" );
}
if ( s1 == s3 ){
System.out.println("s1 and s3 are same" );
}
Output: s1 and s3 are same
Now when to create a String object using = " " or using new String(" ")?
One use case which I came across,
// imagine a multi-megabyte string here
String s = "0123456789012345678901234567890123456789";
String s2 = s.substring(0, 1);
s = null;
You'll now have a String s2 which, although it seems to be a one-character string, holds a reference to the gigantic char array created in the String s. This means the array won't be garbage collected, even though we've explicitly nulled out the String s!
The fix for this is to use String constructor like this:
String s2 = new String(s.substring(0, 1));

Related

How many objects will be created in string pool?

How many objects will be created in the following Java code:
String s = "abc";
s = "";
String s2 = new String("mno");
s2 = "pqr";
String s = "abc"; → one object, that goes into the string pool, as the literal "abc" is used;
s = ""; → one empty string ("") object, and again - allocated in the string pool;
String s2 = new String("mno"); → another object created with an explicit new keyword, and note, that it actually involves yet another literal object (again - created in the string pool) - "mno"; overall, two objects here;
s2 = "pqr"; → yet another object, being stored into the string pool.
So, there are 5 objects in total; 4 in the string pool (a.k.a. "intern pool"), and one in the ordinary heap.
Remember, that anytime you use "string literal", JVM first checks whether the same string object (according to String::equals..()) exists in the string pool, and it then does one of the following:
If corresponding string does not exist, JVM creates a string object and puts it in the string pool. That string object is a candidate to be reused, by JVM, anytime equal to it (again, according to String::equals(..)) string literal is referenced (without explicit new);
If corresponding string exists, its reference is just being returned, without creating anything new.

Count Distinct String Object Instances

How many distinct String object instances are created in the following code segment?
String s1 = new String("hello");
String s2 = "GoodBye";
String s3 = s1;
Not sure about all my reasoning here.
By using the keyword new that creates an instance from the String class, I am guessing that has to be an object. However, I am confused, is String after the new now considered a method because it has ( ) and then it is calling a String literal "hello" in it?
String s2 = "Goodbye";
I think this is a String literal, and since Strings are actually objects even the String literal is considered object. Not 100% sure if that is true.
String s3 = s1; Just refers back to s1. Therefore, it is not distinct.
So my answer is 2 distinct objects.
Please explain if I am a right or wrong.
The correct answer is 3.
String s1 = new String("hello");
String s2 = "GoodBye";
String s3 = s1;
The compiler will put both literals "hello" and "GoodBye" into a "constant pool" during compilation time, which then will be loaded by the classloader. So the JVM automatically interns all String literals used by this class, when it loads that class. More about this: When are Java Strings interned?. The String constant pool is then managed during runtime.
During runtime the JVM will create the third String object when it reaches the line String s1 = new String("hello").
So you would and up with three distinct String objects, where two of them contain the same word "hello". So s1.equals("hello") would be true, but s1 == "hello" would be false, because s1 references to a different String on the heap, than the literal "hello".
The line String s3 = s1 just creates a variable s3 with a copied reference to the String object of s1. It doesn't create a new object.
Also mind that you can "manually" add Strings into the String constant pool by using the method String#intern. So s1.intern() == "hello" is true, because the String reference returned from s1.intern() is the reference to the literal "hello" which was already in the constant pool.
If you like to get another and maybe more detailed explanation with some drawings about objects and their location, you can check this article on javaranch.

what happens with new String("") in String Constant pool

if i create a string object as
String s=new String("Stackoverflow");
will String object created only in heap, or it also makes a copy in String constant pool.
Thanks in advance.
You only get a string into the constant pool if you call intern or use a string literal, as far as I'm aware.
Any time you call new String(...) you just get a regular new String object, regardless of which constructor overload you call.
In your case you're also ensuring that there is a string with contents "Stackoverflow" in the constant pool, by the fact that you're using the string literal at all - but that won't add another one if it's already there. So to split it up:
String x = "Stackoverflow"; // May or may not introduce a new string to the pool
String y = new String(x); // Just creates a regular object
Additionally, the result of a call to new String(...) will always be a different reference to all previous references - unlike the use of a string literal. For example:
String a = "Stackoverflow";
String b = "Stackoverflow";
String x = new String(a);
String y = new String(a);
System.out.println(a == b); // true due to constant pooling
System.out.println(x == y); // false; different objects
Finally, the exact timing of when a string is added to the constant pool has never been clear to me, nor has it mattered to me. I would guess it might be on class load (all the string constants used by that class, loaded immediately) but it could potentially be on a per-method basis. It's possible to find out for one particular implementation using intern(), but it's never been terribly important to me :)
In this case you are constructing an entirely new String object and that object won't be shared in the constant pool. Note though that in order to construct your new String() object you actually passed into it a String constant. That String constant is in the constants pool, however the string you created through new does not point to the same one, even though they have the same value.
If you did String s = "Stackoverflow" then s would contain a reference to the instance in the pool, also there are methods to let you add Strings to the pool after they have been created.
The new String is created in the heap, and NOT in the string pool.
If you want a newly create String to be in the string pool you need to intern() it; i.e.
String s = new String("Stackoverflow").intern();
... except of course that will return the string literal object that you started with!!
String s1 = "Stackoverflow";
String s2 = new String(s1);
String s3 = s2.intern();
System.out.println("s1 == s2 is " + (s1 == s2));
System.out.println("s2 == s3 is " + (s2 == s3));
System.out.println("s1 == s3 is " + (s1 == s3));
should print
s1 == s2 is false
s2 == s3 is false
s1 == s3 is true
And to be pedantic, the String in s is not the String that was created by the new String("StackOverflow") expression. What intern() does is to lookup the its target object in the string pool. If there is already a String in the pool that is equal(Object) to the object being looked up, that is what is returned as the result. In this case, we can guarantee that there will already be an object in the string pool; i.e. the String object that represents the value of the literal.
A regular java object will be created in the heap, and will have a reference s type of String. And, there will be String literal in String Constant Pool. Both are two different things.
My answer is YES!
Check the following code first:
String s0 = "Stackoverflow";
String s1 = new String("Stackoverflow");
String s2 = s1.intern();
System.out.println(s0 == s1);
System.out.println(s1 == s2 );
System.out.println(s0 == s2);
//OUTPUT:
false
false
true
s0 hold a reference in the string pool, while new String(String original) will always construct a new instance. intern() method of String will return a reference in the string pool with the same value.
Now go back to your question:
Will String object created only in heap, or it also makes a copy in String constant pool?
Since you already wrote a string constant "Stackoverflow" and pass it to your String constructor, so in my opinion, it has the same semantic as:
String s0 = "Stackoverflow";
String s1 = new String(s0);
which means there will be a copy in String constant pool when the code is evaluated.
But, if you construct the String object with following code:
String s = new String("StackoverflowSOMETHINGELSE".toCharArray(),0,13);
there won't be a copy of "Stackoverflow" in constant pool.

Strings [= new String vs = ""]

So my question is in relation to declaring and assigning strings.
The way I usually declare strings is by doing the following:
String s1 = "Stackoverflow";
And then if I ever need to change the value of s1 I would do the following:
s1 = "new value";
Today I found another way of doing it and declaring a string would go like:
String s2 = new String("Stackoverflow");
And then changing the value would be:
s2 = new String("new value");
My question is what is the difference between the two or is it simply preferential. From looking at the code the fourth line
s2 = new String ("new value");
I'm assuming that doing that would create a new memory location and s2 would then point to it so I doubt it would be used to change the value but I can see it being used when declaring a string.
From the javadoc :
Initializes a newly created String object so that it represents the
same sequence of characters as the argument; in other words, the newly
created string is a copy of the argument string. Unless an explicit
copy of original is needed, use of this constructor is unnecessary
since Strings are immutable.
So no, you have no reason not to use the simple literal.
Simply do
String s1 = "Stackoverflow";
Historically, this constructor was mainly used to get a lighter copy of a string obtained by splitting a bigger one (see this question). Now, There's no normal reason to use it.
String s1 = "Stackoverflow"; // declaring & initializing s1
String s2 = new String("Stackoverflow"); // declaring & initializing s2
in above cases you are declaring & initializing String object.
The difference between
//first case
String s2 = new String("new String"); // declaring & initializing s2 with new memory
// second case
s2 = "new String" // overwriting previous value of s2
is in the first case you are creating a new object, i-e; allocating memory for a new object which will be refrenced by s2. The previous address to which s2 was pointing/referencing hasn't released the memory yet which will be gc when the program ends or when system needs to.
The good programming practice (second case) is to initialize an object once, and if you want to change its value either assign null to it and then allocate new memory to it or in case of string you can do like this
s2= "new String";
String s1 = "Stackoverflow";
This line will create a new object in the String pool if it does not exist already. That means first it will first try searching for "Stackoverflow" in the String pool, if found then s1 will start pointing to it, if not found then it will create a new object and s1 will refer to it.
String s2 = new String("Stackoverflow");
Will always create a new String object, no matter if the value already exists in the String pool. So the object in the heap would then have the value "Stackovrflow" and s2 will start pointing to it.
s2 = new String("new value");
This will again create a new Object and s2 will start pointing to it. The earlier object that s2 was pointing to is not open for garbage collection (gc).
Let me know if this helps.
The main difference is that the constructor always creates a totally new instance of String containing the same characters than the original String.
String s1 = "Stackoverflow";
String s2 = "Stackoverflow";
Then s1 == s2 will return true
String s1 = "Stackoverflow";
String s2 = new String("Stackoverflow");
Then s1 == s2 will return false
Just using the double quotes option is often better:
With a constructors you might create two instance of String.
Easier to read and less confusing
#user2612619 here i want to say is .. when ever ur creating object with "new" operator it always falls in heap memory . so wenever u have same content but different objects than also it will create new objects on heap by this u cant save memory ....
but in order to save memory java people brought concept of immutable where we can save memory .. if u r creating a different object with same content .. string will recognize dat difference and creates only one object with same content and points the two references to only one object ..
i can solve ur doubts from this figure ..
case 1:
String s = new String("stackoverflow");
String s1 = new String("stackoverflow");
as they are two different objects on heap memory with two different values of hashcode .. so s==s1 (is reference comparison) it is false .. but s.equals(s1) is content comparison ..so it is true
case 2:
String s = "stackoverflow";
String s1 = "stackoverflow";
here objects fall in scp(string constant pool memory)
same object for two different refernces .. so hashcode also same .. same reference value .. so s==s1 is true here [u can observe from figure clearly]
s.equals(s1) is true as content comparison.. this is very beautiful concept .. u will love it wen u solve some problems ... all d best
Creating String object Directly:
String s1 = "Hip Hop"
will create a string object but first JVM checks the String constant or literal pool and if the string does not exist, it creates a new String object “Hip jop” and a reference is maintained in the pool. The variable s1 also refers the same object. Now, if we put a statement after this:
String s2 = "Hip Hop"
JVM checks the String constant pool first and since the string already exists, a reference to the pooled instance is returned to s2.
System.out.println(s1==s2) // comparing reference and it will print true
java can make this optimization since strings are immutable and can be shared without fear of data corruption.
Creating String Object using new
String s3 = new String("Hip Hop")
For new keyword, a String object is created in the heap memory wither or not an equal string object already exists in the pool and s3 will refer to the newly created one.
System.out.println(s3==s2) // two reference is different, will print false
String objects created with the new operator do not refer to objects in the string pool but can be made to using String’s intern() method. The java.lang.String.intern() returns an interned String, that is, one that has an entry in the global String literal pool. If the String is not already in the global String literal pool, then it will be added to the pool.
String s4 = s3.intern();
Systen.out.println(s4 == s2); // will print `true` because s4 is interned,
//it now have the same reference to "Hip Hop" as s2 or s1
But try:
Systen.out.println(s4 == s3) // it will be false,
As the reference of s4, s2 and s1 is the reference to the pooled instance while s3 is referring to the created object in heap memory.
use of new for creting string:
prior to OpenJDK 7, Update 6, Java String.susbtring method had potential memory leak. substring method would build a new String object keeping a reference to the whole char array, to avoid copying it. You could thus inadvertently keep a reference to a very big character array with just a one character string. If we want to have minimal strings after substring, we use the constructor taking another string :
String s2 = new String(s1.substring(0,1));
The issue is resolved in JDK 7 update 6. So No need to create string with new any more for taking advantage provided by String literal pool mechanism.
Referance:
String literal pool
using new to prevent memory leak for using substring

Java:literal strings

class A {
String s4 = "abc";
static public void main(String[]args ) {
String s1 = "abc";
String s2 = "abc";
String s3 = new String("abc");
A o = new A();
String s5 = new String("def");
System.out.println("s1==s2 : " + (s1==s2));
System.out.println("s1==s1.intern : " + (s1==s1.intern()));
System.out.println("s1==s3 : " + (s1==s3));
System.out.println("s1.intern==s3.intern : " + (s1.intern()==s3.intern()));
System.out.println("s1==s4 : " + (s1==o.s4));
}
}
The output:
s1==s2 : true
s1==s1.intern : true
s1==s3 : false
s1.intern==s3.intern : true
s1==s4 : true
My questions:
1.What happens for "String s1 = "abc"? I guess the String object is added to the pool in class String as an interned string? Where is it placed on? The "permanent generation" or just the heap(as the data member of the String Class instance)?
2.What happens for "String s2 = "abc"? I guess no any object is created.But does this mean that the Java Intepreter needs to search all the interned strings? will this cause any performance issue?
3.Seems String s3 = new String("abc") does not use interned string.Why?
4.Will String s5 = new String("def") create any new interned string?
The compiler creates a String object for "abc" in the constant pool, and generates a reference to it in the bytecode for the assignment statement.
See (1). No searching; no performance issue.
This creates a new String object at runtime, because that is what the 'new' operator does: create new objects.
Yes, for "def", but because of (3) a new String is also created at runtime.
The String objects at 3-4 are not interned.
1.What happens for "String s1 = "abc"?
At compile time a representation of the literal is written to the "constant pool" part of the classfile for the class that contains this code.
When the class is loaded, the representation of the string literal in the classfile's constant pool is read, and a new String object is created from it. This string is then interned, and the reference to the interned string is then "embedded" in the code.
At runtime, the reference to the previously created / interned String is assigned to s1. (No string creation or interning happens when this statement is executed.)
I guess the String object is added to the pool in class String as an interned string?
Yes. But not when the code is executed.
Where is it placed on? The "permanent generation" or just the heap(as the data member of the String Class instance)?
It is stored in the permgen region of the heap. (The String class has no static fields. The JVM's string pool is implemented in native code.)
2.What happens for "String s2 = "abc"?
Nothing happens at load time. When the compiler created the classfile, it reused the same constant pool entry for the literal that was used for the first use of the literal. So the String reference uses by this statement is the same one as is used by the previous statement.
I guess no any object is created.
Correct.
But does this mean that the Java Intepreter needs to search all the interned strings? will this cause any performance issue?
No, and No. The Java interpretter (or JIT compiled code) uses the same reference as was created / embedded for the previous statement.
3.Seems String s3 = new String("abc") does not use interned string.Why?
It is more complicated than that. The constructor call uses the interned string, and then creates a new String, and copies the characters of the interned string to the new String's representation. The newly created string is assigned to s3.
Why? Because new is specified as always creating a new object (see JLS), and the String constructor is specified as copying the characters.
4.Will String s5 = new String("def") create any new interned string?
A new interned string is created at load time (for "def"), and then a new String object is created at runtime which is a copy of the interned string. (See previous text for more details.)
See this answer on SO. Also see this wikipedia article on String Interning.
String s1 = "abc"; creates a new String and interns it.
String s2 = "abc"; will drag the same Object used for s1 from the intern pool. The JVM does this to increase performance. It is quicker than creating a new String.
Calling new String() is redundant as it will return a new implicit String Object. Not retrieve it from the intern pool.
As Keyser says, == compares the Strings for Object equality, returning true if they are the same Object. When comparing String content you should use .equals()

Categories