String Constant Pool mechanism - java

Can anyone explain this strange behavior of Strings?
Here's my code:
String s2 = "hello";
String s4 = "hello"+"hello";
String s6 = "hello"+ s2;
System.out.println("hellohello" == s4);
System.out.println("hellohello" == s6);
System.out.println(s4);
System.out.println(s6);
The output is:
true
false
hellohello
hellohello

You need to be aware of the difference between str.equals(other) and str == other. The former checks whether two strings have the same content. The latter checks whether they are the same object. "hello" + "hello" and "hellohello" can be optimised to be the same string at compile time. "hello" + s2 will be computed at runtime, and thus will be a new object distinct from "hellohello", even if its contents are the same.
EDIT: I just noticed your title - together with user3580294's comments, it seems you should already know that. If so, then the only question that might remain is why is one recognised as constant and the other isn't. As some commenters suggest, making s2 final will change the behaviour, since the compiler can then trust that s2 is constant in the same way "hello" is, and can resolve "hello" + s2 at compilation time.
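The effect of final described above can be checked with a small sketch. The class and method names here are made up for illustration; only the two concatenations come from the question.

```java
final class FinalConcatDemo {
    // With a final operand initialized from a literal, "hello" + s2 is a
    // constant expression, so javac folds it to the literal "hellohello".
    static boolean finalOperandIsFolded() {
        final String s2 = "hello";
        String s6 = "hello" + s2;
        return "hellohello" == s6;
    }

    // Without final, the concatenation runs at run time and yields a new String.
    static boolean nonFinalOperandIsFolded() {
        String s2 = "hello";
        String s6 = "hello" + s2;
        return "hellohello" == s6;
    }

    public static void main(String[] args) {
        System.out.println(finalOperandIsFolded());    // true
        System.out.println(nonFinalOperandIsFolded()); // false
    }
}
```

Note that a variable that is only effectively final (never reassigned, but not declared final) does not count as a constant here.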

"hello" + s2 works like this:
An instance of the StringBuilder class is created (behind the scenes).
The + operator actually invokes the StringBuilder#append(String s) method.
When the appending is done, the StringBuilder#toString() method is invoked, which returns a brand new String object. This is why "hellohello" == s6 is actually false. (Since Java 9, javac compiles + to an invokedynamic call instead of StringBuilder code, but the result is still a new String object.)
More info:
How do I compare Strings in Java?
How Java do the string concatenation using “+”?
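As a rough sketch, the steps listed above can be written out by hand. This is an assumed equivalent of what the compiler generates for "hello" + s2, not the actual bytecode; the class and method names are hypothetical.

```java
final class ConcatDesugarDemo {
    // Hand-written equivalent of: "hello" + s2
    static String concat(String s2) {
        return new StringBuilder()
                .append("hello")
                .append(s2)
                .toString(); // toString() always returns a new String object
    }

    public static void main(String[] args) {
        String s6 = concat("hello");
        System.out.println(s6.equals("hellohello"));     // true: same content
        System.out.println(s6 == "hellohello");          // false: different object
        System.out.println(s6.intern() == "hellohello"); // true: intern() returns the pooled instance
    }
}
```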

String s2 = "hello";
String s4 = "hello" + "hello"; // both "hello"s are added and s4 is resolved to "hellohello" during compile time (and will be in String pool)
String s6 = "hello"+ s2; // s6 is resolved during runtime and will be on heap (don't apply Escape analysis here)
So,
System.out.println("hellohello" == s4); // true
System.out.println("hellohello" == s6); // false

String s4 is interned, so s4 and "hellohello" refer to the same object.
Because of that you get true on the line:
System.out.println("hellohello" == s4);
String s6 is not interned, because it depends on the variable s2. The references to "hellohello" and to s6 are therefore not equal. Because of that you get false on the line:
System.out.println("hellohello" == s6);
But if you declare s2 as final, which makes s2 a constant, you will get true instead of false on the line System.out.println("hellohello" == s6);, because the compiler can now intern the string s6, as it depends only on constant values, and the references to "hellohello" and to s6 will be equal.

Related

String + String vs String + String returned from method

Today, while working with Strings, I encountered a behavior I did not know about before. I'm not able to understand what's happening internally.
public String returnVal(){
return "5";
}
String s1 = "abcd5";
String s2 = "abcd"+"5";
String s3 = "abcd5";
String s4 = "abcd"+returnVal();
System.out.println(s1 == s2);
System.out.println(s1.equals(s2));
System.out.println(s3 == s4);
System.out.println(s3.equals(s4));
My expectation is printing "true" from all s.o.p's but s3 == s4 is false, why?
The compiler can do constant expression inlining. This means that
String s1 = "abcd5";
String s2 = "abcd"+"5";
final String five = "5"; // final reference
String sa = "abcd" + five;
are all the same (except five) and the compiler can simplify all these expressions to "abcd5"
However, if the compiler cannot optimise the expression, the operation is performed at runtime and a new String is created. This new String is not a constant that is placed in the String literal pool (as it is not a literal in the byte code).
String s4 = "abcd" + returnVal(); // not inlined by the compiler.
String f5 = "5"; // not a final reference.
String sb = "abcd" + f5; // evaluated at runtime
These create new strings every time they are run (as well as new StringBuilder and char[]s)
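A minimal sketch of the "new string every time" point: evaluating the same run-time concatenation twice gives two distinct objects with equal content. returnVal() is taken from the question; the class and the other method names are hypothetical.

```java
final class RuntimeConcatDemo {
    static String returnVal() {
        return "5";
    }

    // Each evaluation of the run-time concatenation creates its own String.
    static boolean sameInstanceTwice() {
        String first = "abcd" + returnVal();
        String second = "abcd" + returnVal();
        return first == second;
    }

    // The contents are still equal.
    static boolean sameContentTwice() {
        String first = "abcd" + returnVal();
        String second = "abcd" + returnVal();
        return first.equals(second);
    }

    public static void main(String[] args) {
        System.out.println(sameInstanceTwice()); // false
        System.out.println(sameContentTwice());  // true
    }
}
```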
You have stumbled across the intricacies of how the Java compiler optimizes String.
Suppose I have this program:
String a = "abc";
String b = "abc";
Here the compiler can initialize a and b to the same String instance. This entails that a == b and a.equals(b).
Here we also get the same behaviour:
String a = "abc";
String b = "ab" + "c";
This is because "ab" + "c" can be evaluated at compile-time to just "abc", which in turn can share an instance with a.
This technique is not possible with expressions that call functions:
String a = "abc";
String b = "ab" + functionThatReturnsC();
This is because functionThatReturnsC could have side-effects which cannot be resolved at compile-time.
Your case of returnVal is interesting. Since it is constant, it could be inlined, in which case the compile-time instance sharing could be applied. It seems the compiler implementer decided not to support this.
This issue exposes a weakness of Java. Since we cannot overload ==, programmers cannot implement custom value types. Therefore, you should always use equals or Objects.equals to ensure consistent behaviour.
Note that these optimizations may differ between compilers.
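To illustrate the advice to prefer equals or Objects.equals, here is a small sketch. functionThatReturnsC is the placeholder name used above; sameText and the class name are made up for this example. Objects.equals also handles null on either side without throwing.

```java
import java.util.Objects;

final class SafeEqualsDemo {
    static String functionThatReturnsC() {
        return "c";
    }

    // Content comparison that does not depend on compiler optimizations.
    static boolean sameText(String a, String b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        String a = "abc";
        String b = "ab" + functionThatReturnsC(); // built at run time
        System.out.println(a == b);               // false: different objects
        System.out.println(sameText(a, b));       // true: same content
        System.out.println(sameText(null, b));    // false, no NullPointerException
        System.out.println(sameText(null, null)); // true
    }
}
```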

Confuse about String reference comparison == with intern

I read this: "when should we use intern method of string on string constants", but I am still not very clear about String == comparison, especially together with intern(). I have a couple of examples. Can someone help me understand this better?
String s1 = "abc";
String s2 = "abc";
String s3 = "abcabc";
String s4 = s1 + s2;
System.out.println(s3 == s4); // 1. why false ?
System.out.println(s3 == s4.intern()); // 2. why true ?
System.out.println(s4 == s1 + s2); // 3. why false ?
System.out.println(s4 == (s1 + s2).intern()); // 4. why false ?
System.out.println(s4.intern() == (s1 + s2).intern()); // 5. why true ?
There are quite a lot of answers here which explain that, but let me give you another one.
A string is interned into the String literal pool only in two situations: when a class is loaded and the String was a literal or compile time constant. Otherwise only when you call .intern() on a String. Then a copy of this string is listed in the pool and returned. All other string creations will not be interned. String concatenation (+) is producing new instances as long as it is not a compile time constant expression*.
First of all: never ever use == (with or without intern()) to compare strings. If you do not understand it, you should not use it. Use .equals(). Interning strings for the sake of comparison might be slower than you think and fills the hashtable unnecessarily, especially for strings with highly varied content.
1. s3 is a string literal from the constant pool and therefore interned. s4 is an expression that does not produce an interned constant.
2. When you intern s4, it has the same content as s3, so intern() returns the same instance as s3.
3. Same as for s4: the expression is not a constant.
4. If you intern s1+s2 you get the instance of s3, but s4 is still not s3.
5. If you intern s4, it is the same instance as s3.
Some more questions:
System.out.println(s3 == s3.intern()); // is true
System.out.println(s4 == s4.intern()); // is false
System.out.println(s1 == "abc"); // is true
System.out.println(s1 == new String("abc")); // is false
* Compile time constants can be expressions with literals on both sides of the concatenation (like "a" + "bc") but also final String variables initialized from constants or literals:
final String a = "a";
final String b = "b";
final String ab = a + b;
final String ab2 = "a" + b;
final String ab3 = "a" + new String("b");
System.out.println("ab == ab2 should be true: " + (ab == ab2));
System.out.println("a+b == ab should be true: " + (a+b == ab));
System.out.println("ab == ab3 should be false: " + (ab == ab3));
One thing you have to know is that Strings are objects in Java. The variables s1 to s4 do not hold the text you stored directly. Each one is simply a pointer that says where to find the text in memory.
1. It is false because you compare the pointers, not the actual text. The text is the same, but s3 and s4 are two completely different objects, which means they have different pointers. Try printing s3 and s4 on the console and you will see that the text is the same.
2. It's true because intern() returns the reference to the string in the "String Literal Pool". s3 is a literal, so it is already in the pool. s4 has the same text, so s4.intern() returns that same pooled instance.
3. Same as 1. You compare two pointers, not the text content.
4. As far as I know, concatenated values do not get stored in the pool, so s4 is still not the pooled instance.
5. Same as 2. They contain the same text, so intern() returns the same slot in the String Literal Pool for both.
To start off with, s1, s2, and s3 are in the intern pool when they are declared, because they are declared by a literal. s4 is not in the intern pool to start off with. This is what the intern pool might look like to start off with:
"abc" (s1, s2)
"abcabc" (s3)
1. s4 does not match s3 because s3 is in the intern pool, but s4 is not.
2. intern() is called on s4, so it looks in the pool for a string equal to "abcabc" and returns that pooled object. Therefore, s3 and s4.intern() point to the same object.
3. Again, intern() is not called when adding two strings, so s1 + s2 is a new object that does not come from the intern pool.
4. s4 is not in the intern pool, so it is not the same object as (s1 + s2).intern().
5. These are both interned, so they both look in the intern pool and find the same object.
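For reference, the five comparisons from the question can be collected into one runnable sketch. The class name and the results() helper are hypothetical; the expressions themselves are the ones asked about.

```java
import java.util.Arrays;

final class InternCasesDemo {
    // Returns the results of comparisons 1 to 5 from the question, in order.
    static boolean[] results() {
        String s1 = "abc";
        String s2 = "abc";
        String s3 = "abcabc"; // literal: already in the pool
        String s4 = s1 + s2;  // run-time concatenation: new object, not pooled
        return new boolean[] {
            s3 == s4,                         // 1. pooled vs. new object
            s3 == s4.intern(),                // 2. intern() returns the pooled s3
            s4 == s1 + s2,                    // 3. two different new objects
            s4 == (s1 + s2).intern(),         // 4. s4 itself is still not pooled
            s4.intern() == (s1 + s2).intern() // 5. both sides are the pooled s3
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(results()));
        // [false, true, false, false, true]
    }
}
```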

java equals and == confusion [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
what is String pool in java?
1. I know that == checks whether two references point to the same memory location, and that the default implementation of equals uses == for its check, so by default both mean the same thing.
2. The String class overrides the equals method to check whether two strings have the same value.
Consider S1 = "test" and S2 = S1;
Now S1 and S2 are two different objects, so as per point 1 S1==S2 should be false and as per point 2 S1.equals(S2) should be true. But when I ran this small program in Eclipse, both returned true. Is there anything special about String objects that makes S1 == S2 true as well?
Consider S1 = "test" and S2 = S1; Now S1 and S2 are two different objects
Nope. This is where your argument fails.
You created one string object, and both your variables refer to the same string object. Assignment does not make a new copy of the string.
When you write
s1 = s2;
s1 and s2 are references to the same object, so s1 == s2 will always return true.
More confusing - if you write:
s1 = "test";
s2 = "test";
s3 = new String("test");
you will find out that s1 == s2 is true but s1 == s3 is false. This is explained in more details in this post.
Writing S1 = S2 results in them pointing to the same object.
Writing
String S1 = "test";
String S2 = "test";
will yield the same results as you have now.
This is due to a compiler optimisation: the compiler knows that the String class is immutable, so it makes both variables use the same instance. You can force it to create new strings by instantiating them with a constructor:
String s1 = new String("test");
String s2 = new String("test");
System.out.println(s1 == s2); // false
System.out.println(s1.equals(s2)); // true
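When you initialize
S2=S1
they both point to the same memory location.
Try
S1 = "test";
S2 = new String("test");
this will give you
S1==S2 //false
(With two "test" literals instead, S1==S2 would still be true, because both literals refer to the same pooled object.)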

String Comparison in Java...? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How do I compare strings in Java?
Why does the first comparison (s1 == s2) display equal, whereas the second comparison (s1 == s3) displays not equal?
public class StringComparison
{
    public static void main( String [] args)
    {
        String s1 = "Arsalan";
        String s2 = "Arsalan";
        String s3 = new String ("Arsalan");
        if ( s1 == s2 )
            System.out.println (" S1 and S2 Both are equal...");
        else
            System.out.println ("S1 and S2 not equal");
        if ( s1 == s3 )
            System.out.println (" S1 and S3 Both are equal...");
        else
            System.out.println (" S1 and S3 are not equal");
    }
}
This has to do with the fact that == does not compare the contents of strings, combined with compiler optimizations.
== in Java only compares if the two sides refer to the exact same instance of the same object. It does not compare the content. To compare the actual content of the strings, you need to use s1.equals(s2).
Now the reason why s1 == s2 is true and s1 == s3 is false is that the JVM decided to optimize the code so that s1 and s2 are the same object. (It's called "String Pooling.")
Per JLS §3.10.5, pooling of string literals is actually mandated by the standard:
Moreover, a string literal always refers to the same instance of class
String. This is because string literals - or, more generally, strings
that are the values of constant expressions (§15.28) - are "interned"
so as to share unique instances, using the method String.intern.
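The rule quoted above can be observed with a short sketch. PoolA and PoolB are hypothetical classes added only to show that the same literal in different classes still refers to one String instance.

```java
final class SharedLiteralDemo {
    // The same literal used in two different classes is one pooled instance.
    static boolean sameAcrossClasses() {
        return PoolA.name() == PoolB.name();
    }

    // new String(...) always creates a separate object outside the pool.
    static boolean newStringIsDistinct() {
        return new String("Arsalan") != PoolA.name();
    }

    // intern() maps the separate object back to the pooled instance.
    static boolean internReturnsPooled() {
        return new String("Arsalan").intern() == PoolA.name();
    }

    public static void main(String[] args) {
        System.out.println(sameAcrossClasses());   // true
        System.out.println(newStringIsDistinct()); // true
        System.out.println(internReturnsPooled()); // true
    }
}

final class PoolA {
    static String name() { return "Arsalan"; }
}

final class PoolB {
    static String name() { return "Arsalan"; }
}
```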
Don't use == to compare strings, it tests reference equality (do two names refer to the same object). Try s1.equals(s2);, which actually tests the elements for equality.
String one = "Arsalan";
String two = "Arsalan";
one == two
// Returns true because in memory both Strings are pointing to the SAME object
one.equals(two)
// Will ALWAYS return true because the VALUES of the Strings are the same (would not matter if the objects were referenced differently).
The reason is string interning. It's complicated. The compiler is "smart" enough to use the same exact object for s1 and s2, even though you might think they are different. But s3, which uses new String("Arsalan"), doesn't intern.
Some guidelines:
You should almost always use equals(), not ==, to compare strings
You should almost never use String s = new String("foo").
Instead, use String s = "foo".
If "Arsalan" is not found in the pool of Strings, an "Arsalan" string will be created and s1 will refer to it. Since the "Arsalan" string then already exists in the pool of Strings, s2 will refer to the same object as s1. Because the new keyword is used for s3, Java will create a new String object in normal (nonpool) memory, and s3 will refer to it. This is the reason why s1 and s3 don't refer to the same object.
public class StringComparison
{
    public static void main( String [] args)
    {
        String s1 = "Arsalan";
        String s2 = new String("Arsalan");
        String s3 = new String ("Arsalan");
        if ( s1 == s2 )
            System.out.println (" S1 and S2 Both are equal...");
        else
            System.out.println ("S1 and S2 not equal");
        if ( s1 == s3 )
            System.out.println (" S1 and S3 Both are equal...");
        else
            System.out.println (" S1 and S3 are not equal");
        if ( s2 == s3 )
            System.out.println (" S2 and S3 Both are equal...");
        else
            System.out.println (" S2 and S3 are not equal");
    }
}
If you run this, you can see that S2 and S3 are also not equal. This is because s2, s3 are references to a String Object and hence they contain different address values.
Don't use ==, use s1.equals(s2) or s1.equals(s3) instead.

A quiz taken in a data structure class

Wanted an explanation on the results of question 1.
1. What is the output of the following method?
public static void main(String[] args) {
Integer i1=new Integer(1);
Integer i2=new Integer(1);
String s1=new String("Today");
String s2=new String("Today");
System.out.println(i1==i2);
System.out.println(s1==s2);
System.out.println(s1.equals(s2));
System.out.println(s1!=s2);
System.out.println( (s1!=s2) || s1.equals(s2));
System.out.println( (s1==s2) && s1.equals(s2));
System.out.println( ! (s1.equals(s2)));
}
Answer:
false
false
true
true
true
false
false
Integer i1=new Integer(1);
Integer i2=new Integer(1);
String s1=new String("Today");
String s2=new String("Today");
// do i1 and i2 point at the same location in memory? No - they used "new"
System.out.println(i1==i2);
// do s1 and s2 point at the same location in memory? No - they used "new"
System.out.println(s1==s2);
// do s1 and s2 contain the same sequence of characters ("Today")? Yes.
System.out.println(s1.equals(s2));
// do s1 and s2 point at different locations in memory? Yes - they used "new"
System.out.println(s1!=s2);
// do s1 and s2 point to different locations in memory? Yes - they used "new".
// Do not check s1.equals(s2) because the first part of the || was true.
System.out.println( (s1!=s2) || s1.equals(s2));
// do s1 and s2 point at the same location in memory? No - they used "new".
// do not check s1.equals(s2) because the first part of the && was false.
System.out.println( (s1==s2) && s1.equals(s2));
// do s1 and s2 not contain the same sequence of characters ("Today")? No.
System.out.println( ! (s1.equals(s2)));
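The short-circuit behaviour mentioned in the comments above (equals is not checked when the first operand already decides the result) can be made visible with a call counter. This is only a sketch; countedEquals, the counter field, and the class name are hypothetical helpers.

```java
final class ShortCircuitDemo {
    static int equalsCalls = 0;

    // Wrapper around equals that counts how often it is actually evaluated.
    static boolean countedEquals(String a, String b) {
        equalsCalls++;
        return a.equals(b);
    }

    // (s1 != s2) is true, so || never evaluates the right-hand side.
    static int callsForOr() {
        equalsCalls = 0;
        String s1 = new String("Today");
        String s2 = new String("Today");
        boolean ignored = (s1 != s2) || countedEquals(s1, s2);
        return equalsCalls;
    }

    // (s1 == s2) is false, so && never evaluates the right-hand side.
    static int callsForAnd() {
        equalsCalls = 0;
        String s1 = new String("Today");
        String s2 = new String("Today");
        boolean ignored = (s1 == s2) && countedEquals(s1, s2);
        return equalsCalls;
    }

    // The non-short-circuit operator | always evaluates both sides.
    static int callsForEagerOr() {
        equalsCalls = 0;
        String s1 = new String("Today");
        String s2 = new String("Today");
        boolean ignored = (s1 != s2) | countedEquals(s1, s2);
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println(callsForOr());      // 0
        System.out.println(callsForAnd());     // 0
        System.out.println(callsForEagerOr()); // 1
    }
}
```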
I think the main point is that == compares two object references to see if they refer to the same instance, whereas equals compares the values.
For example, s1 and s2 are two different string instances so == returns false, but they both contain the value "Today" so equals returns true.
Keeping in mind that Integer and String are Objects, the == operator compares the memory addresses of those 2 pointers, not the actual Objects themselves. So the first 2 == are going to be false because i1 is not the same Object as i2. If the initialization was:
Integer i1=new Integer(1);
Integer i2=i1;
Then the first println() would have been true.
The s1.equals(s2) is the proper way to compare equality in Objects. The String.equals() method will check for string equality, so "Today" and "Today" are equal strings.
The s1!=s2 is true since s1 and s2 are different Objects, similar to the i1 and i2 issue with ==
The rest should be pretty straightforward boolean operations.
Which result(s) in particular? Does this help?
