Recently in a job interview I was asked this following question (for Java):
Given:
String s1 = "abc";
String s2 = "abc";
What is the return value of
(s1 == s2)
I answered with it would return false because they are two different objects and == is a memory address comparison rather than a value comparison, and that one would need to use .equals() to compare String objects. I was however told that although the .equals(0 methodology was right, the statement nonetheless returns true. I was wondering if someone could explain this to me as to why it is true but why we are still taught in school to use equals()?
String constants are interned by your JVM (this is required by the spec as per here):
All literal strings and string-valued constant expressions are interned. String literals are defined in §3.10.5 of the Java Language Specification
This means that the compiler has already created an object representing the string "abc", and sets both s1 and s2 to point to the same interned object.
java will intern both strings, since they both have the same value only one actual string instance will exist in memory - that's why == will return true - both references point to the same instance.
String interning is an optimization technique to minimize the number of string instances that have to be held in memory. String literals or strings that are the values of constant expressions, are interned so as to share unique instances. Think flyweight pattern.
Since you are not actually creating new instances of String objects for either one of these, they are sharing the same memory space. If it were
String s1 = new String("abc");
String s2 = new String("abc");
the result would be false.
The reason is strings are interned in Java. String interning is a method of storing only one copy of each distinct string value which is immutable. Interning string makes some string processing tasks more efficient. The distinct values are stored in a string intern pool.
(From wiki)
You're right that == uses memory address. However when the java compiler notices that you're using the same string literal multiple times in the same program, it won't create the same string multiple times in memory. Instead both s1 and s2 in your example will point to the same memory. This is called string interning.
So that's why == will return true in this case. However if you read s2 from a file or user input, the string will not automatically interned. So now it no longer points to the same memory. Therefor == would now return false, while equals returns true. And that's why you shouldn't use ==.
Quick and dirty answer: Java optimizes strings so if it encounters the same string twice it will reuse the same object (which is safe because String is immutable).
However, there is no guarantee.
What usually happens is that it works for a long time, and then you get a nasty bug that takes forever to figure out because someone changed the class loading context and your == no longer works.
You should continue to use equals() when testing string equality. Java makes no guarantees about identity testing for strings unless they are interned.
The reason the s1 == s2 in your example is because the compiler is simply optimizing 2 literal references in a scope it can predict.
Related
This question already has answers here:
How do I compare strings in Java?
(23 answers)
Closed 9 years ago.
I'm reading about the notion of interned strings:
String s1 = "Hi";
String s2 = "Hi";
String s3 = new String("Hi");
System.out.println(s1 == s2); // true
System.out.println(s1 == s3); // false
System.out.println(s1.equals(s3)); //true
String is an object in Java, but the compiler allows us to use a literal for assigning to a String reference variable, instead of requiring us to use the new operator, because of the ubiquity of strings in programming. The use of the literal for assignments creates an interned string. All interned strings with the same literal point to the same space in memory, thus the equivalent operator is permitted.
It got me wondering, however: would it not be more proper to use the equals() method to compare two string reference variables, given that in Java strings are objects (and not arrays like in other languages) and the contents of objects should be compared using equals(), as the == operator in regards to objects only tells us that they point to the same spot in memory?
Or is it not more proper because the very concept of interned strings makes it clear that one String can only be equivalent (==) to another string if they share the same literal?
If it is more proper, do people commonly use the equals() method regardless, or is it considered overkill?
Yes equals is the method used to compare the contents of string objects. Using == we can only compare whether the two references point to the same memory location or not. But using equals you can check whether two references hold the same string content, be them at the same memory location or not. Logically speaking, when you are trying to compare two strings, you must be looking for comparing the contents of those strings and not the memory locations of them.
As a practice or almost everywhere string equals method is used for string comparision and not ==
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
String object creation using new and its comparison with intern method
I was playing around with Strings to understand them more and I noticed something that I can't explain :
String str1 = "whatever";
String str2 = str1;
String str3 = "whatever";
System.out.println(str1==str2); //prints true...that's normal, they point to the same object
System.out.println(str1==str3); //gives true..how's that possible ?
How is the last line giving true ? this means that both str1 and str3 have the same address in memory.
Is this a compiler optimization that was smart enough to detect that both string literals are the same ("whatever") and thus assigned str1 and str3 to the same object ? Or am I missing something in the underlying mechanics of strings ?
Because Java has a pool of unique interned instances, and that String literals are stored in this pool. This means that the first "whatever" string literal is exactly the same String object as the third "whatever" literal.
As the Document Says:
public String intern()
Returns a canonical representation for the
string object. A pool of strings, initially empty, is maintained
privately by the class String.
When the intern method is invoked, if the pool already contains a
string equal to this String object as determined by the equals(Object)
method, then the string from the pool is returned. Otherwise, this
String object is added to the pool and a reference to this String
object is returned.
It follows that for any two strings s and t, s.intern() == t.intern()
is true if and only if s.equals(t) is true.
All literal strings and string-valued constant expressions are
interned. String literals are defined in §3.10.5 of the Java Language
Specification
Returns: a string that has the same contents as this string, but is
guaranteed to be from a pool of unique strings.
http://www.xyzws.com/Javafaq/what-is-string-literal-pool/3
As the post says:
String allocation, like all object allocation, proves costly in both time and memory. The JVM performs some trickery while instantiating string literals to increase performance and decrease memory overhead. To cut down the number of String objects created in the JVM, the String class keeps a pool of strings. Each time your code create a string literal, the JVM checks the string literal pool first. If the string already exists in the pool, a reference to the pooled instance returns. If the string does not exist in the pool, a new String object instantiates, then is placed in the pool.
If you do:
String str1 = new String("BlaBla"); //In the heap!
String str2 = new String("BlaBla"); //In the heap!
then you're explicitly creating a String object through new operator (and constructor).
In this case you'll have each object pointing to a different storage location.
But if you do:
String str1 = "BlaBla";
String str2 = "BlaBla";
then you've implicit construction.
Two strings literals share the same storage if they have the same values, this is because Java conserves the storage of the same strings! (Strings that have the same value)
The javac compiler combines String literals which are the same in a given class file.
However at runtime, String literals are combined using the same approach as String.intern() This means even Strings in different class in different applications (in the same JVM which use the same object.
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
String comparison and String interning in Java
I understand how String equals() method works but was surprised by some results I had with the String == operator.
I would have expected == to compare references as it does for other objects.
However distinct String objects (with the same content) == returns true and furthermore even for a Static String object (with the same content) which is obviously not the same memory address.
I guess == has been defined the same as equals to prevent its misuse
No, == does just compare references. However, I suspect you've been fooled by compile-time constants being interned - so two literals end up refererring to the same string object. For example:
String x = "xyz";
String y = "xyz";
System.out.println(x == y); // Guaranteed to print true
StringBuilder builder = new StringBuilder();
String z = builder.append("x").append("yz").toString();
System.out.printn(x == z); // Will print false
From section 3.10.5 of the Java language specification:
String literals-or, more generally, strings that are the values of constant expressions (§15.28)-are "interned" so as to share unique instances, using the method String.intern.
The reason it returns the same is because of memory optimizations (that aren't always guaranteed to occur) strings with the same content will point to the same memory area to save space. In the case of static objects they will always point to the same thing (as there is only one of it because of the static keyword). Again don't rely on the above and use Equals() instead.
One thing I should point out from Jon Skeet is that it is always guaranteed for compile time constants. But again just use equals() as it is clearer to read.
It is due to string intern pooling
See
whats-the-difference-between-equals-and
The == operator does always compares references in Java and never contents. What can happen is that once you declare a string literal, this object is sent to the JVM's string pool and if you reuse the same literal the same object is going to be placed in there. A simple test for this behavior can be seen in the following code snippet:
String a = "a string";
String b = "a string";
System.out.println( a == b ); // will print true
String c = "other string";
String d = new String( "other string" );
System.out.println( c == d ); // will print false
The second case prints false because the variable d was initialized with a directly created String object and not a literal, so it will not go to the String pool.
The string pool is not part of the java specification and trusting on it's behavior is not advised. You should always use equals to compare objects.
I guess == has been defined the same as equals to prevent its misuse
Wrong. What is happening here is that when the compiler sees that you are using the same string in two different places it only stores it in the program's data section once. Read in a string or create it from smaller strings and then compare them.
Edit: Note that when I say "same string" above, I'm referring only to string literals, which the compiler knows at runtime.
public static void main(String [] a)
{
String s = new String("Hai");
String s1=s;
String s2="Hai";
String s3="Hai";
System.out.println(s.hashCode());
System.out.println(s1.hashCode());
System.out.println(s2.hashCode());
System.out.println(s3.hashCode());
System.out.println(s==s2);
System.out.println(s2==s3);
}
From the above code can anyone explain what is going behind when JVM encounters this line (s==s2) ?
It compares references - i.e. are both variables referring to the exact same object (rather than just equal ones).
s and s2 refer to different objects, so the expression evaluates to false.
s and s1 refer to the same objects (as each other) because of the assignment.
s2 and s3 refer to the same objects (as each other) because of string interning.
If that doesn't help much, please ask for more details on a particular bit. Objects and references can be confusing to start with.
Note that only string literals are interned by default... so even though s and s2 refer to equal strings, they're still two separate objects. Similarly if you write:
String x = new String("foo");
String y = new String("foo");
then x == y will evaluate to false. You can force interning, which in this case would actually return the interned literal:
String x = new String("foo");
String y = new String("foo");
String z = "foo";
// Expressions and their values:
x == y: false
x == z: false
x.intern() == y.intern(): true
x.intern() == z: true
EDIT: A comment suggested that new String(String) is basically pointless. This isn't the case, in fact.
A String refers to a char[], with an offset and a length. If you take a substring, it will create a new String referring to the same char[], just with a different offset and length. If you need to keep a small substring of a long string for a long time, but the long string itself isn't needed, then it's useful to use the new String(String) constructor to create a copy of just the piece you need, allowing the larger char[] to be garbage collected.
An example of this is reading a dictionary file - lots of short words, one per line. If you use BufferedReader.readLine(), the allocated char array will be at least 80 chars (in the standard JDK, anyway). That means that even a short word like "and" takes a char array of 160 bytes + overheads... you can run out of space pretty quickly that way. Using new String(reader.readLine()) can save the day.
== compars objects not the content of an object. s and s2 are different objects. If you want to compare the content use s.equals(s2).
Think of it like this.
Identical twins look the same but they are made up differently.
If you want to know if they "look" the same use the compare.
If you want to know they are a clone of each other use the "=="
:)
== compares the memory (reference) location of the Objects. You should use .equals() to compare the contents of the object.
You can use == for ints and doubles because they are primitive data types
I suppose you know that when you test equality between variables using '==', you are in fact testing if the references in memory are the same. This is different from the equals() method that combines an algorithm and attributes to return a result stating that two Objects are considered as being the same. In this case, if the result is true, it normally means that both references are pointing to the same Object. This leaves me wondering why s2==s3 returns true and whether String instances (which are immutable) are pooled for reuse somewhere.
It should be an obvious false. JVM does a thing like using the strings that exist in the Memory . Hence s2,s3 point to the same String that has been instantiated once. If you do something like s5="Hai" even that will be equal to s3.
However new creates a new Object. Irrespective if the String is already exisitng or not. Hence s doesnot equal to s3,s4.
Now if you do s6= new String("Hai"), even that will not be equal to s2,s3 or s.
The literals s2 and s3 will point to the same string in memory as they are present at compile time. s is created at runtime and will point to a different instance of "Hai" in memory. If you want s to point to the same instance of "Hai" as s2 and s3 you can ask Java to do that for you by calling intern. So s.intern == s2 will be true.
Good article here.
You are using some '==' overload for String class...
String s1 = "BloodParrot is the man";
String s2 = "BloodParrot is the man";
String s3 = new String("BloodParrot is the man");
System.out.println(s1.equals(s2));
System.out.println(s1 == s2);
System.out.println(s1 == s3);
System.out.println(s1.equals(s3));
// output
true
true
false
true
Why don't all the strings have the same location in memory if all three have the same contents?
Java only automatically interns String literals. New String objects (created using the new keyword) are not interned by default. You can use the String.intern() method to intern an existing String object. Calling intern will check the existing String pool for a matching object and return it if one exists or add it if there was no match.
If you add the line
s3 = s3.intern();
to your code right after you create s3, you'll see the difference in your output.
See some more examples and a more detailed explanation.
This of course brings up the very important topic of when to use == and when to use the equals method in Java. You almost always want to use equals when dealing with object references. The == operator compares reference values, which is almost never what you mean to compare. Knowing the difference helps you decide when it's appropriate to use == or equals.
You explicitly call new for s3 and this leaves you with a new instance of the string.
Creating a String is actually a fast process. Trying to find any previously created String will be much slower.
Consider what you need to do to make all Strings interned. You need to search a set of all previously constructed Strings. Note this must be done in a thread-safe manner, which adds overhead. Because you don't want this to leak, you need to be using some form of (thread-safe) non-strong references.
Of course, you can't implement Java like this because the library unfortunately exposes constructors for Strings and most other immutable value objects.
The new keyword always makes a new string.
More detail here.
Why don't all the strings have the same location in memory if all three have the same contents?
Because they are different strings with same content!
There is a method called String.intern which, essentially, takes all Strings that are the same and puts them into a hash table (I am sort of lying, but for this it is the concept that matters, not the reality).
String s1 = "BloodParrot is the man";
String s2 = "BloodParrot is the man";
String s3 = new String("BloodParrot is the man").intern();
System.out.println(s1.equals(s2));
System.out.println(s1 == s2);
System.out.println(s1 == s3);
System.out.println(s1.equals(s3));
should have them all being "true". This is because (and I am lying a bit here again, but it still only matters for the concept not reality) String s1 = "BloodParrot is the man"; is done something like String s1 = "BloodParrot is the man".intern();