Confused about String reference comparison (==) with intern() - java

I read "When should we use the intern method of String on String constants", but I am still not clear about comparing Strings with == in combination with intern(). I have a couple of examples. Can someone help me understand this better?
String s1 = "abc";
String s2 = "abc";
String s3 = "abcabc";
String s4 = s1 + s2;
System.out.println(s3 == s4); // 1. why false ?
System.out.println(s3 == s4.intern()); // 2. why true ?
System.out.println(s4 == s1 + s2); // 3. why false ?
System.out.println(s4 == (s1 + s2).intern()); // 4. why false ?
System.out.println(s4.intern() == (s1 + s2).intern()); // 5. why true ?

There are quite a lot of answers here which explain that, but let me give you another one.
A string is interned into the String literal pool in only two situations: when a class is loaded and the String is a literal or compile-time constant, or when you call .intern() on a String. In the second case the pool is searched for a string with equal content; if one exists it is returned, otherwise this string is added to the pool and returned. All other string creations are not interned. String concatenation (+) produces a new instance unless it is a compile-time constant expression*.
First of all: never ever use == to compare String content. If you do not understand interning, you should not rely on it. Use .equals(). Interning strings for the sake of comparison might be slower than you think and fills the pool's hash table unnecessarily, especially for strings with highly varied content.
1. s3 is a string literal from the constant pool and therefore interned. s4 is an expression that does not produce an interned constant.
2. When you intern s4, it has the same content as s3, so intern() returns the same instance as s3.
3. Same as s4: the expression is not a constant, so it produces a new instance.
4. If you intern s1 + s2 you get the instance of s3, but s4 is still not s3.
5. If you intern s4 you get the same instance as s3, and interning s1 + s2 gives that instance as well.
Some more questions:
System.out.println(s3 == s3.intern()); // is true
System.out.println(s4 == s4.intern()); // is false
System.out.println(s1 == "abc"); // is true
System.out.println(s1 == new String("abc")); // is false
* Compile time constants can be expressions with literals on both sides of the concatenation (like "a" + "bc") but also final String variables initialized from constants or literals:
final String a = "a";
final String b = "b";
final String ab = a + b;
final String ab2 = "a" + b;
final String ab3 = "a" + new String("b");
System.out.println("ab == ab2 should be true: " + (ab == ab2));
System.out.println("a+b == ab should be true: " + (a+b == ab));
System.out.println("ab == ab3 should be false: " + (ab == ab3));

One thing you have to know is that Strings are Objects in Java. The variables s1 - s4 do not hold the text you stored. Each is simply a pointer which says where to find the text in memory.
1. It is false because you compare the pointers, not the actual text. The text is the same, but s3 and s4 are two completely different objects, which means they have different pointers. Try printing s3 and s4 on the console: the text is identical even though the references are not.
2. It is true because Java keeps String literals in something called the "String Literal Pool". s3 is a literal, so it is already in the pool. The intern() method returns the reference to the pooled String with the same text, so s4.intern() returns the same reference as s3.
3. Same as 1. You compare two pointers, not the text content.
4. As far as I know, the results of concatenation are not stored in the pool, so s4 itself is not the pooled instance.
5. Same as 2. Both sides are interned and contain the same text, so both calls return the same pooled instance.

To start off with, s1, s2, and s3 are in the intern pool when they are declared, because they are declared by a literal. s4 is not in the intern pool to start off with. This is what the intern pool might look like to start off with:
"abc" (s1, s2)
"abcabc" (s3)
1. s4 does not match s3 because s3 is in the intern pool, but s4 is not.
2. intern() is called on s4, so it looks in the pool for a string equal to "abcabc" and returns that pooled object. Therefore, s3 and s4.intern() point to the same object.
3. Again, intern() is not called when adding two strings, so the result does not come from the intern pool.
4. s4 is not in the intern pool, so it is not the same object as (s1 + s2).intern().
5. These are both interned, so they both look in the intern pool and find the same object.
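The five cases above can be replayed in one small program. This is only a sketch: the class name InternWalkthrough and the compare() helper are made up for illustration, and the comments restate the pool reasoning from the answers.

```java
import java.util.Arrays;

class InternWalkthrough {
    // Returns the results of the five comparisons from the question, in order.
    static boolean[] compare() {
        String s1 = "abc";      // literal: pooled
        String s2 = "abc";      // same literal: same pooled object as s1
        String s3 = "abcabc";   // literal: pooled
        String s4 = s1 + s2;    // runtime concatenation: new object, not pooled
        return new boolean[] {
            s3 == s4,                           // 1. pooled vs. new object
            s3 == s4.intern(),                  // 2. intern() returns the pooled "abcabc"
            s4 == s1 + s2,                      // 3. two separate new objects
            s4 == (s1 + s2).intern(),           // 4. new object vs. pooled object
            s4.intern() == (s1 + s2).intern()   // 5. both sides are the pooled object
        };
    }

    public static void main(String[] args) {
        // prints [false, true, false, false, true]
        System.out.println(Arrays.toString(compare()));
    }
}
```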

Related

String intern for GC

I have read many articles regarding string interning.
If I create a String object
Method 1
String str= new String("test")
2 objects are created, one in the heap and the other in the string pool.
Method 2, if method 1 is not executed
String str= new String("test").intern()
It will copy the string from the heap to the string pool. How many objects will be created? I guess 3: one in the heap, another in the pool, and one for the "test" literal.
Which ones will be eligible for GC in both cases? I have seen articles that say 2 are created, but I am unable to understand why.
Method 3
String s= new String("test")
String s1=s.intern()
It does the same thing, except that s points to the heap object and s1 to the pool object, and neither of them is eligible for GC.
Is my understanding correct? I am quite confused about this concept.
If I create a String object
String str= new String("test")
2 objects are created, one in the heap and the other in the string pool.
A String consists of two objects: the String and its char[]. In some versions of Java it could be a byte[], or in fact a char[] which is later replaced by a byte[]. This means that 4, perhaps 5, objects could be created, unless the String for the string literal already exists, in which case it is 2 for Java 7 update 4+. Before that the char[] would be shared, so it could be three objects or only 1.
String str= new String("test").intern()
This is exactly the same, except that if this is called often enough the new String could be allocated on the stack, and you might find that only the char[] is created, since this cannot be placed on the stack at the moment. In future this might be optimised away as well.
Which ones will be eligible for GC in both cases? I have seen articles that say 2 are created, but I am unable to understand why.
The answer is anywhere from 1 to 4 depending on the situation. All of them are eligible for collection unless they are strongly referenced somewhere.
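Object counts are hard to observe from plain Java code, but the identity relationship between the heap copy and the pooled instance (Method 3 in the question) can be checked directly. A minimal sketch, with a made-up class name and helper methods:

```java
class HeapVersusPool {
    // new String("test") always creates a fresh heap object,
    // so it is never the pooled instance itself.
    static boolean heapCopyIsPooledInstance() {
        String s = new String("test");
        return s == s.intern();
    }

    // intern() on the heap copy returns the pooled object,
    // which is the one the "test" literal refers to.
    static boolean internReturnsLiteralInstance() {
        String s = new String("test");
        String s1 = s.intern();
        return s1 == "test";
    }

    public static void main(String[] args) {
        System.out.println(heapCopyIsPooledInstance());     // false
        System.out.println(internReturnsLiteralInstance()); // true
    }
}
```

Here s stays a separate heap object that becomes collectable once nothing references it, while s1 is just another reference to the literal's pooled object.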
String intern() method:
The most common methods for String comparison are the equals() and equalsIgnoreCase() methods. However, these methods compare the character content, which can be costly for long sequences of characters. The Java String intern() method can help improve the performance of comparisons between two Strings.
The intern() method, when applied to a String object, returns a reference to an object from the pool of Strings that Java maintains which has the same contents as the original object. Thus, if code calls intern() on several String objects with the same contents, the program can use significantly less memory, because it reuses a single object for all of them.
Keep in mind that Java automatically interns String literals. This means that the intern() method is meant for Strings that are constructed at runtime, for example with new String().
Example:
JavaStringIntern.java
package com.javacodegeeks.javabasics.string;

public class JavaStringIntern {
    public static void main(String[] args) {
        String str1 = "JavaCodeGeeks";
        String str2 = "JavaCodeGeeks";
        String str3 = "JavaCodeGeeks".intern();
        String str4 = new String("JavaCodeGeeks");
        String str5 = new String("JavaCodeGeeks").intern();
        System.out.println("Are str1 and str2 the same: " + (str1 == str2));
        System.out.println("Are str1 and str3 the same: " + (str1 == str3));
        System.out.println("Are str1 and str4 the same: " + (str1 == str4)); //this should be "false" because str4 is not interned
        System.out.println("Are str1 and str4.intern() the same: " + (str1 == str4.intern())); //this should be "true" now
        System.out.println("Are str1 and str5 the same: " + (str1 == str5));
    }
}
Output:
Are str1 and str2 the same: true
Are str1 and str3 the same: true
Are str1 and str4 the same: false
Are str1 and str4.intern() the same: true
Are str1 and str5 the same: true
Points to note:
1. Interning is automatic for String literals; the intern() method is to be used on Strings constructed with new String().
2. The Strings (more specifically, String objects) will be garbage collected if they ever become unreachable, like any other Java objects.
3. String literals typically are not candidates for garbage collection. There will be an implicit reference from the Object to that literal.
The reason for point 3 is that a literal used inside a method to build a String is reachable for as long as the method could be executed.
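To illustrate the memory argument, here is a rough sketch that counts distinct String objects by identity, with and without intern(). The class name, the helper names, and the "key" prefix are all invented for the example. It relies only on the fact that a runtime concatenation creates a new String each time.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class InternDedupSketch {
    // Counts distinct String objects by identity (==), not by content.
    static int distinctInstances(String[] values) {
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Collections.addAll(seen, values);
        return seen.size();
    }

    // Builds n strings at runtime with only 10 different contents.
    static String[] buildKeys(int n, boolean intern) {
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            String key = "key" + (i % 10);          // runtime concatenation: new object
            keys[i] = intern ? key.intern() : key;  // optionally swap in the pooled instance
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(buildKeys(1000, false))); // 1000
        System.out.println(distinctInstances(buildKeys(1000, true)));  // 10
    }
}
```

Whether this saving is worth the cost of the pool lookups depends on the workload, so treat it as an illustration rather than a recommendation.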

Difference in the following declarations [duplicate]

I am unable to see the difference between the following declarations of Strings in Java.
Suppose I have two strings
String str1="one";
String str2="two";
What is the difference between
String str3=new String(str1+str2);
and
String str3=str1+str2;
In both the above declarations, the content of str3 will be onetwo.
Suppose I create a new string
String str4="onetwo";
Then with neither of the above declarations is the body of this if executed:
if(str4==str3) {
System.out.println("This is not executed");
}
Why are str3 and str4 not referring to the same object?
str1 + str2 for strings that are not compile-time constants will be compiled into
new StringBuilder(str1).append(str2).toString(). This result will not be put into, or taken from, the string pool (where interned strings go).
It is a different story in the case of "foo"+"bar", where the compiler knows which values it works with, so it can concatenate this string once at compile time to avoid doing it at runtime. Such a string literal will also be interned.
So String str3 = str1+str2; is same as
String str3 = new StringBuilder(str1).append(str2).toString();
and String str3 = new String(str1+str2); is same as
String str3 = new String(new StringBuilder(str1).append(str2).toString());
Again, strings produced as result of method (like substring, replace, toString) are not interned.
This means you are comparing two different instances (which store same characters) and that is why == returns false.
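A small sketch of that equivalence, with made-up names (ConcatDesugarSketch, viaPlus, viaBuilder). Note that newer javac versions (JDK 9 and later) compile + through invokedynamic rather than an explicit StringBuilder, but the observable result is the same: a new, non-pooled String.

```java
class ConcatDesugarSketch {
    // a and b are parameters, so a + b is not a compile-time constant.
    static String viaPlus(String a, String b) {
        return a + b;
    }

    // Hand-written form of what the answer says the compiler generates.
    static String viaBuilder(String a, String b) {
        return new StringBuilder(a).append(b).toString();
    }

    public static void main(String[] args) {
        String str4 = "onetwo";                     // literal: pooled
        String plus = viaPlus("one", "two");
        String built = viaBuilder("one", "two");
        System.out.println(plus == str4);           // false: new object
        System.out.println(built == str4);          // false: new object
        System.out.println(plus.equals(built));     // true: same characters
        System.out.println(plus.intern() == str4);  // true: intern() returns the pooled object
    }
}
```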
Java does not have memory of "how this variable got the value", therefore it really does not matter which method you use, if the result is same.
About comparing: if you compare strings with ==, you are comparing the addresses of the objects in memory, not their values, because String is not a primitive data type. You have to use if(str4.equals(str3))
Because Strings in Java are immutable the compiler will optimize and reuse String literals. Thus
String s1 = "one";
String s2 = "one";
System.out.println(s1 == s2); //true because the compiler will reuse the same String (with the same memory address) for the same string literal
System.out.println(s1 == "o" + "ne"); //true because "Strings computed by constant expressions are computed at compile time and then treated as if they were literals"
String s3 = "o";
System.out.println(s1 == s3 + "ne"); //false because the second string is created at run time and is therefore newly created
For a reference, see http://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#jls-3.10.5
Strings are kind of tricky, because there is some effort to share their representation. Plus they're immutable.
The short answer is: unless you're really working at a low level, you should never compare strings using "==".
Even if it works for you, it will be a nightmare for your teammates to maintain.
For a longer answer and a bit of amusement, try the following:
String s1= "a" + "b";
String s2= "a" + "b";
String s3=new String("a"+"b");
System.out.println(s1==s2);
System.out.println(s3==s2);
You'll notice that s1==s2 due to the compiler's effort to share.
However s2 != s3 because you've explicitly asked for a new string.
You're not likely to do anything very smart with it, because it's immutable.

String Constant Pool mechanism

Can anyone explain this strange behavior of Strings?
Here's my code:
String s2 = "hello";
String s4 = "hello"+"hello";
String s6 = "hello"+ s2;
System.out.println("hellohello" == s4);
System.out.println("hellohello" == s6);
System.out.println(s4);
System.out.println(s6);
The output is:
true
false
hellohello
hellohello
You need to be aware of the difference between str.equals(other) and str == other. The former checks if two strings have the same content. The latter checks if they are the same object. "hello" + "hello" and "hellohello" can be optimised to be the same string at the compilation time. "hello" + s2 will be computed at runtime, and thus will be a new object distinct from "hellohello", even if its contents are the same.
EDIT: I just noticed your title - together with user3580294's comments, it seems you should already know that. If so, then the only question that might remain is why is one recognised as constant and the other isn't. As some commenters suggest, making s2 final will change the behaviour, since the compiler can then trust that s2 is constant in the same way "hello" is, and can resolve "hello" + s2 at compilation time.
"hello" + s2 works like this:
An instance of the StringBuilder class is created (behind the scenes)
The + operator actually invokes the StringBuilder#append(String s) method
When the appending is done, a StringBuilder.toString() method is invoked, which returns a brand new String object. This is why "hellohello" == s6 is actually false.
More info:
How do I compare Strings in Java?
How Java do the string concatenation using “+”?
String s2 = "hello";
String s4 = "hello" + "hello"; // both "hello"s are added and s4 is resolved to "hellohello" during compile time (and will be in String pool)
String s6 = "hello"+ s2; // s6 is resolved during runtime and will be on heap (don't apply Escape analysis here)
So,
System.out.println("hellohello" == s4); // true
System.out.println("hellohello" == s6); // false
String s4 is interned, so the reference to it and the reference to "hellohello" are the same.
Because of that you get true on the line:
System.out.println("hellohello" == s4);
String s6 is not interned, because it depends on the variable s2. The references to "hellohello" and to s6 are therefore not equal, and you get false on the line:
System.out.println("hellohello" == s6);
But if you declare s2 as final, which makes s2 a constant, you will get true instead of false on the line System.out.println("hellohello" == s6);, because the compiler can now evaluate "hello" + s2 at compile time and intern the result, as it depends only on constant values, so the references to "hellohello" and to s6 will be equal.
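The effect of final can be seen side by side in a sketch like the following (the class and method names are invented for the example). The only difference between the two methods is the final modifier on s2.

```java
class FinalConstantSketch {
    // s2 is an ordinary local variable, so "hello" + s2 is evaluated at runtime.
    static boolean withNonFinal() {
        String s2 = "hello";
        String s6 = "hello" + s2;
        return "hellohello" == s6;
    }

    // s2 is final and initialized from a literal, so it is a constant variable
    // and "hello" + s2 is folded into the literal "hellohello" by the compiler.
    static boolean withFinal() {
        final String s2 = "hello";
        String s6 = "hello" + s2;
        return "hellohello" == s6;
    }

    public static void main(String[] args) {
        System.out.println(withNonFinal()); // false
        System.out.println(withFinal());    // true
    }
}
```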

java equals and == confusion [duplicate]

1. I know that == checks whether two references point to the same memory location, and that the default definition of equals uses == to do the checking, which means both behave the same.
2. The String class overrides the equals method to check whether two strings have the same value.
Consider S1 = "test" and S2 = S1;
Now S1 and S2 are two different objects, so as per point 1 S1==S2 should be false and as per point 2 S1.equals(S2) should be true, but when I ran this small program in Eclipse both returned true. Is there anything special about String objects that makes S1 == S2 true as well?
Consider S1 = "test" and S2 = S1; Now S1 and S2 are two different objects
Nope. This is where your argument fails.
You created one string object, and both your variables refer to the same string object. Assignment does not make a new copy of the string.
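Both points from the question, and the aliasing described here, can be checked with a short sketch. PlainBox and ValueBox are hypothetical classes made up for the example: the first inherits Object.equals (identity), the second overrides it the way String does.

```java
class EqualsDefaultSketch {
    // No equals override: inherits Object.equals, which compares with ==.
    static final class PlainBox {
        final String value;
        PlainBox(String value) { this.value = value; }
    }

    // Overrides equals (and hashCode) to compare by content.
    static final class ValueBox {
        final String value;
        ValueBox(String value) { this.value = value; }

        @Override
        public boolean equals(Object other) {
            return other instanceof ValueBox && value.equals(((ValueBox) other).value);
        }

        @Override
        public int hashCode() {
            return value.hashCode();
        }
    }

    // Assignment copies the reference, not the object.
    static boolean aliasIsSameObject() {
        String s1 = "test";
        String s2 = s1;
        return s1 == s2;
    }

    public static void main(String[] args) {
        System.out.println(aliasIsSameObject());                                  // true
        System.out.println(new PlainBox("test").equals(new PlainBox("test")));    // false
        System.out.println(new ValueBox("test").equals(new ValueBox("test")));    // true
    }
}
```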
When you write
s1 = s2;
s1 and s2 are references to the same object, so s1 == s2 will always return true.
More confusing - if you write:
s1 = "test";
s2 = "test";
s3 = new String("test");
you will find that s1 == s2 is true but s1 == s3 is false. This is explained in more detail in this post.
Writing S1 = S2 results in them pointing to the same object.
Writing
String S1 = "test";
String S2 = "test";
will yield the same results as you have now.
This is due to compiler optimisation: the compiler knows that the String class is immutable, so it optimises the code so that both use the same instance. You can force it to make new strings by instantiating them with a constructor:
String s1 = new String("test");
String s2 = new String("test");
System.out.println(s1 == s2); // false
System.out.println(s1.equals(s2)); // true
When you assign
S2=S1
they both point to the same memory location.
Try
S1 = "test";
S2 = new String("test");
This will give you
S1==S2 //false

String Comparison in Java...? [duplicate]

Why does the first comparison ( s1 == s2 ) display equal, whereas the 2nd comparison ( s1 == s3 ) displays not equal?
public class StringComparison
{
    public static void main( String [] args)
    {
        String s1 = "Arsalan";
        String s2 = "Arsalan";
        String s3 = new String ("Arsalan");
        if ( s1 == s2 )
            System.out.println (" S1 and S2 Both are equal...");
        else
            System.out.println ("S1 and S2 not equal");
        if ( s1 == s3 )
            System.out.println (" S1 and S3 Both are equal...");
        else
            System.out.println (" S1 and S3 are not equal");
    }
}
This has to do with the fact that you cannot compare strings with == as well as compiler optimizations.
== in Java only compares if the two sides refer to the exact same instance of the same object. It does not compare the content. To compare the actual content of the strings, you need to use s1.equals(s2).
Now the reason why s1 == s2 is true and s1 == s3 is false is because the JVM decided to optimize the code so that s1 and s2 are the same object. (It's called, "String Pooling.")
Per JLS §3.10.5, pooling of string literals is actually mandated by the standard:
Moreover, a string literal always refers to the same instance of class
String. This is because string literals - or, more generally, strings
that are the values of constant expressions (§15.28) - are "interned"
so as to share unique instances, using the method String.intern.
Don't use == to compare strings, it tests reference equality (do two names refer to the same object). Try s1.equals(s2);, which actually tests the elements for equality.
String one = "Arsalan";
String two = "Arsalan";
one == two
// Returns true because in memory both Strings are pointing to the SAME object
one.equals(two)
// Will ALWAYS return true because the VALUES of the Strings are the same (would not matter if the objects were referenced differently).
The reason is string interning. It's complicated. The compiler is "smart" enough to use the same exact object for s1 and s2, even though you might think they are different. But s3, which uses new String("Arsalan"), doesn't intern.
Some guidelines:
You should almost always use equals(), not ==, to compare strings
You should almost never use String s = new String("foo").
Instead, use String s = "foo".
If "Arsalan" is not found in the pool of Strings, an "Arsalan" string will be created and s1 will refer to it. Since the "Arsalan" string then already exists in the pool of Strings, s2 will refer to the same object as s1. Because the new keyword is used for s3, Java will create a new String object in normal (non-pool) memory, and s3 will refer to it. This is the reason why s1 and s3 don't refer to the same object.
public class StringComparison
{
    public static void main( String [] args)
    {
        String s1 = "Arsalan";
        String s2 = new String("Arsalan");
        String s3 = new String ("Arsalan");
        if ( s1 == s2 )
            System.out.println (" S1 and S2 Both are equal...");
        else
            System.out.println ("S1 and S2 not equal");
        if ( s1 == s3 )
            System.out.println (" S1 and S3 Both are equal...");
        else
            System.out.println (" S1 and S3 are not equal");
        if ( s2 == s3 )
            System.out.println (" S2 and S3 Both are equal...");
        else
            System.out.println (" S2 and S3 are not equal");
    }
}
If you run this, you can see that S2 and S3 are also not equal. This is because s2, s3 are references to a String Object and hence they contain different address values.
Don't use ==, use s1.equals(s2) or s1.equals(s3) instead.
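One practical wrinkle with s1.equals(s2) is that it throws a NullPointerException when s1 is null. A possible way around this, sketched here with an invented helper name sameText, is java.util.Objects.equals, which handles null on either side and otherwise delegates to equals():

```java
import java.util.Objects;

class StringCompareSketch {
    // Compares by content; two nulls count as equal, null vs. non-null as different.
    static boolean sameText(String a, String b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        String s1 = "Arsalan";
        String s3 = new String("Arsalan");
        System.out.println(s1 == s3);            // false: different objects
        System.out.println(sameText(s1, s3));    // true: same content
        System.out.println(sameText(null, s3));  // false, and no exception
    }
}
```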
