Immutable Types in Java vs. Python - java

String a = new String("Wow");
String b = new String("Wow");
String sameA = a;
boolean r1 = a == b; // This is false, since a and b are not the same object
boolean r2 = a.equals(b); // This is true, since a and b are logically equals
boolean r3 = a == sameA; // This is true, since a and sameA are really the same object
Why is this the case? Because strings are immutable, don't a and b point to the same string? For example, if I do the same thing in Python, this is what I get:
a = "wow"
b = "wow"
a is b #is True because (I thought) both variables point to the same immutable string
a == b #is True because they are logically equivalent
I'm assuming that Java's == is the same as Python's 'is', and Java's .equals is the same as Python's ==, in which case the two blocks of code contradict each other.
I also have a suspicion that this might have to do w/ the fact that Java doesn't treat Strings as primitives?

Because strings are immutable, don't a and b point to the same string?
No. In Java, they don't because you're explicitly asking it to create a new String object out of the literal, and then to create another new String out of the same literal, which guarantees that you get two different objects. (And in your Python test, you get the exact same results, although that isn't guaranteed by the language—see below for details.)
I'm assuming that Java's == is the same as Python's 'is', and Java's .equals is the same as Python's ==
Close enough to true for this question.
in which case the two blocks of code contradict each other.
No they don't. You translated two tests to Python, and got the exact same results as in the two corresponding Java tests. You didn't translate the third test at all, so there's nothing there to contradict with anything.
Let's repeat your Java:
String a = new String("Wow");
String b = new String("Wow");
String sameA = a;
boolean r1 = a == b; // This is false, since a and b are not the same object
boolean r2 = a.equals(b); // This is true, since a and b are logically equals
boolean r3 = a == sameA; // This is true, since a and sameA are really the same object
… and translate it more closely to Python:
a = "Wow"[:]
b = "Wow"[:]
sameA = a
r1 = a is b # This is False, since a and b are not the same object
r2 = a == b # This is True, since a and b are logically equals
r3 = a is sameA # This is True, since a and sameA are really the same object
The [:] are there to match the new in the original code. You're explicitly asking Java to create a new String object out of the literal "Wow". In Python, there's no way to explicitly ask for a new object, but you can always ask for a copy, which is close enough (as long as you don't care about the fact that you might be creating a couple of extra strings as garbage to be immediately collected).
However, if you remove the [:], you're often going to get the same results anyway—as you saw in your own tests. You got True for a == b and False for a is b too. Repeating the tests that way:
a = "Wow"
b = "Wow"
sameA = a
r1 = a is b # This is False, since a and b are not the same object
r2 = a == b # This is True, since a and b are logically equals
r3 = a is sameA # This is True, since a and sameA are really the same object
In your comment, you say that a is b was actually True, not False.
As explained above, that's also perfectly legal. While Java requires that new String create a new String object, nothing in Python requires that evaluating the same string literal twice has to create two separate string objects, or even that copying a string object has to create a new string object. See below for more.
I also have a suspicion that this might have to do w/ the fact that Java doesn't treat Strings as primitives?
Indirectly, it's sort of to do with the fact that Java doesn't treat Strings as primitives, and also to do with the fact that Python doesn't treat string literals the same way as some other kinds of literals.
In Java, because String is not a primitive, you have a choice to explicitly create a new String. And you do that. So, it's not going to be the same as any previous instance. The fact that they're both immutable instances is irrelevant; Java is not allowed to collapse separate objects into a single object if there's any way that it would be visible.
In Python, "Wow" is a literal, and you're not asking it to create a new string. And strings are immutable, and Python is allowed to collapse separate immutable built-in objects in ways that Java is not. So it can combine the two literals into one. Even with the [:], it's allowed to collapse the new copy into the original. And with small integers, it will generally do so—try the same test with 0 (use copy.copy(0) to test explicit copying) rather than "Wow" and see what happens. But the major Python implementations often happen to not do this for strings, so you end up with separate objects, so you can get the same result as in Java.
What this means in practice is that, with code like yours, either with or without the explicit copy, a is b and a is not b are both perfectly legal and reasonable things to happen, so your code should never rely on either one being true. Fortunately, there's very little reason to do so. (If you were thinking performance might be a good reason, try calling timeit on a is b vs. a == b, and you'll find something like 55.3ns vs. 61.0ns.)
One last thing to keep in mind: Python and Java are very different languages, so it shouldn't be all that surprising that similar-looking code sometimes acts very differently (or that very different-looking code sometimes acts similarly). The fact that English rules for when to use this vs. that are not the same as Japanese rules for kore vs. sore vs. are is not surprising, it's just something you have to learn when you learn Japanese (or when you learn English).

In Java, the new operator always returns a new object, which occupies its own place in memory and has a unique reference.

Immutability: an object is immutable if it is not subject to change.
To your question:
String a = new String("Wow");
String b = new String("Wow");
String sameA = a;
boolean r1 = a == b; // This is false, since a and b are not the same object
The reference variables a and b do not refer to the same object. When a String is created with the new operator, a completely new memory space is reserved on the heap, even though the same string literal is used.
Immutability is what makes the string pool safe. If pooled strings were not immutable, one string object in the pool could be referenced by many variables, and a change made through any one of them would automatically show up in all the others. For example:
String firstString= "xyzabc";
String secondString= "xyzabc";
Suppose secondString.toUpperCase() changed the shared object itself into "XYZABC"; firstString would then also be "XYZABC", which is not desirable.
boolean r2 = a.equals(b); // This is true, since a and b are logically equals
// Here the values (contents) of two different objects are compared.
boolean r3 = a == sameA; // This is true, since a and sameA are really the same object
== : for primitive types, compares the values.
== : for object types, checks whether both references point to the same allocated memory space on the heap.
In this case a and sameA refer to the same object.
Example:
String x = "singh";
x.concat("Saheb");
// Here a new object with the value "singhSaheb" is created, but nothing refers to it.
But x remains the same.
Strings, and even string literals, are treated as objects.
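A minimal sketch of the immutability point above, using only the standard String API. The class name ImmutabilityDemo and the helper withSuffix are made up for illustration. It suggests that toUpperCase() and concat() hand back new strings and leave the original untouched:

```java
// Illustrative sketch; ImmutabilityDemo and withSuffix are hypothetical names.
class ImmutabilityDemo {
    // Returns a new string; the argument 'base' itself is never modified.
    static String withSuffix(String base, String suffix) {
        return base.concat(suffix);
    }

    public static void main(String[] args) {
        String firstString = "xyzabc";
        String secondString = "xyzabc";             // same pooled literal as firstString

        String upper = secondString.toUpperCase();  // a new String, not a modified one
        System.out.println(firstString);            // xyzabc
        System.out.println(secondString);           // xyzabc
        System.out.println(upper);                  // XYZABC

        String x = "singh";
        x.concat("Saheb");                          // result is discarded; x is untouched
        System.out.println(x);                      // singh
        System.out.println(withSuffix(x, "Saheb")); // singhSaheb
    }
}
```

Because no method can alter the characters of an existing String, sharing one pooled object between firstString and secondString is safe.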

Related

Why DOES == sometimes work on Strings in Java? [duplicate]

This question already has answers here:
How do I compare strings in Java?
(23 answers)
Closed 7 years ago.
I have the following code:
Circle c1 = new Circle();
Circle c2 = new Circle();
System.out.println(c1 == c2);
Which outputs false, as expected. This is because c1 and c2 are reference types and "==" checks whether they refer to the same object (which they don't).
However, I recently tried this:
String a = "hello";
String b = "hello";
System.out.println(a == b);
Which for some reason outputs true. Why is this? String is a reference type and a and b refer to different memory locations. I was always taught that you need to use .equals() for this to work, which this does not!
See: https://ideone.com/CyjE49
UPDATE
THIS IS NOT A DUPLICATE!
I know the proper way to compare strings is using .equals()
UPDATE 2
This question may have an answer in: How do I compare strings in Java?, but the question there wasn't asking what I am asking and the answer just went in more detail than required.
Therefore searching with my same question (on Google or otherwise) means users won't be sent to that question or may dismiss it entirely due to the title of the question. Therefore it may be a good idea to keep this up for benefit of other users!
Because string literals are intern'd, identical literals refer to the same object. Therefore, checking for reference equality between them will necessarily return true.
From the Java Language Specification (3.10.5):
Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.
In practice, the compiler will pool the string literals "early" and store only one copy in the compiled .class file. However, identical string literals from separate class files will still compare equal using == since the literals are still intern'd when the classes are loaded.
If, on the other hand, we properly apply your example with Circle to String, we would have:
String a = new String("hello");
String b = new String("hello");
System.out.println(a == b); // will print false!
In this case we explicitly create new objects, so they cannot be equal references.
Any string constructed by means other than a literal or other constant string expression will also not necessarily compare reference-equal with an identical string. For example:
String a = "hello";
String b = "hello";
System.out.println(a.substring(0,3) == b.substring(0,3)); // may print false!
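The phrase "constant expressions" in the specification quote covers more than bare literals. The sketch below (class and method names are made up for illustration) contrasts a concatenation the compiler can fold with one that has to happen at run time:

```java
// Illustrative sketch; ConstantExpressionDemo and its method names are hypothetical.
class ConstantExpressionDemo {
    // "hel" + "lo" is a constant expression, so it is folded and interned.
    static boolean literalConcatenation() {
        String a = "hello";
        String b = "hel" + "lo";
        return a == b;
    }

    // 'part' is not final, so part + "lo" is evaluated at run time
    // and produces a newly created String.
    static boolean runtimeConcatenation() {
        String a = "hello";
        String part = "hel";
        String c = part + "lo";
        return a == c;
    }

    // A final variable initialized with a constant is itself a constant,
    // so the concatenation is folded again.
    static boolean finalVariableConcatenation() {
        String a = "hello";
        final String part = "hel";
        String d = part + "lo";
        return a == d;
    }

    public static void main(String[] args) {
        System.out.println(literalConcatenation());       // true
        System.out.println(runtimeConcatenation());       // false
        System.out.println(finalVariableConcatenation()); // true
    }
}
```

In all three cases equals() would return true, which is why it remains the right tool for comparing contents.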
String interning. Since both a and b are constants with the same value, the compiler takes advantage of String's immutability to have both variables refer to the same string and thus save space. Therefore == returns true, since they're effectively the same object.
In this case, "hello" is treated as a constant which is assigned to a and b. So here == actually returns true. If you do this:
String a = new String("hello");
String b = new String("hello");
System.out.println(a == b);
You will get false.

Comparing two strings with equal(==) operator [duplicate]

This question already has answers here:
String comparison and String interning in Java
(3 answers)
Closed 9 years ago.
I was telling someone that we must use the String.equals method to compare the values of two strings, and that we cannot simply use the == operator in Java, because == compares the String object references rather than the string values and so would return false.
I wrote this example to show him, but to my surprise it always prints true for the == operator.
Here is the code:
public void exampleFunc1(){
    String string1 = "ABC";
    String string2 = "ABC";
    if(string1 == string2)
        System.out.println("true");
    else{
        System.out.println("false");
    }
    System.out.println(" Are they equal "+(string1 == string2)); // this shouldn't print true but it does
    System.out.println(" Are they equal "+(string1.equals(string2)));
}
Output:-
Are they equal true
Are they equal true
So the question here is: in what circumstances can the == operator on objects return true, other than when both references point to the same instance?
String is one of a few special cases.
Class String keeps a special pool of "interned" Strings. Method myString.intern() looks up myString in this pool. If another String with the same contents already exists in the pool, a pointer to it is returned. If not, myString is added (and a pointer returned).
When you say myString = myString.intern();, you are effectively making myString refer to a shared copy, or making its underlying String available for future sharing (with no duplication). String literals are always interned this way; strings built at run time by library methods generally are not, unless you call intern() yourself.
Other cases of "interning" occur with wrapper types such as Integer and Long. Besides their constructors (deprecated since Java 9), they have static valueOf() methods that return pre-built, shared objects when they can (usually the 256 values closest to zero) and new objects when they cannot. The latter is not much of a problem because these types are more lightweight than Strings. Long, for example, has a payload of just 8 bytes. String contains a char[] that even when empty takes 16 bytes or so.
To answer your question, you can not count on any "interning" mechanisms. They have changed in the past, and they could change in the future (or even from one JVM to another), making your code unusable. Always use equals.
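The wrapper-type caching described above can be sketched as follows. The class and method names are made up for illustration. Integer.valueOf is documented to always cache the range -128 to 127; whether values outside that range are shared depends on the JVM and its settings, so the sketch makes no claim about that case:

```java
// Illustrative sketch; IntegerCacheDemo and its method names are hypothetical.
class IntegerCacheDemo {
    // Reference comparison of two boxed values obtained through valueOf().
    static boolean sameObject(int value) {
        Integer first = Integer.valueOf(value);
        Integer second = Integer.valueOf(value);
        return first == second;
    }

    // Content comparison, independent of any caching.
    static boolean sameValue(int value) {
        return Integer.valueOf(value).equals(Integer.valueOf(value));
    }

    public static void main(String[] args) {
        System.out.println(sameObject(127));  // true: inside the always-cached range
        System.out.println(sameObject(1000)); // depends on the JVM's cache size
        System.out.println(sameValue(1000));  // true
    }
}
```

As with strings, equals() gives the same answer whether or not a shared object happens to be returned.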
You should use
String string1 = new String("ABC");
String string2 = new String("ABC");
Then everything would behave the way you expect.
In your original code, "ABC" is just a reference to a constant string.
The compiler may be optimizing the assignments and only creating one String object. If you use the explicit String constructor, the == operation should behave as expected.

Using == when comparing objects

Recently in a job interview I was asked this following question (for Java):
Given:
String s1 = "abc";
String s2 = "abc";
What is the return value of
(s1 == s2)
I answered that it would return false because they are two different objects and == is a memory address comparison rather than a value comparison, and that one would need to use .equals() to compare String objects. I was however told that although the .equals() methodology was right, the statement nonetheless returns true. I was wondering if someone could explain to me why it is true, and why we are still taught in school to use equals()?
String constants are interned by your JVM (this is required by the spec as per here):
All literal strings and string-valued constant expressions are interned. String literals are defined in §3.10.5 of the Java Language Specification
This means that the compiler has already created an object representing the string "abc", and sets both s1 and s2 to point to the same interned object.
Java will intern both strings. Since they both have the same value, only one actual string instance will exist in memory. That's why == will return true: both references point to the same instance.
String interning is an optimization technique to minimize the number of string instances that have to be held in memory. String literals or strings that are the values of constant expressions, are interned so as to share unique instances. Think flyweight pattern.
Since you are not actually creating new instances of String objects for either one of these, they are sharing the same memory space. If it were
String s1 = new String("abc");
String s2 = new String("abc");
the result would be false.
The reason is strings are interned in Java. String interning is a method of storing only one copy of each distinct string value which is immutable. Interning string makes some string processing tasks more efficient. The distinct values are stored in a string intern pool.
(From wiki)
You're right that == uses memory address. However when the java compiler notices that you're using the same string literal multiple times in the same program, it won't create the same string multiple times in memory. Instead both s1 and s2 in your example will point to the same memory. This is called string interning.
So that's why == will return true in this case. However, if you read s2 from a file or user input, the string will not automatically be interned, so it no longer points to the same memory. Therefore == would now return false, while equals returns true. And that's why you shouldn't use ==.
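A small sketch of the "read from input" case, assuming a StringReader as a stand-in for a file or user input. The class name RuntimeStringDemo and the helper readFirstLine are made up for illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Illustrative sketch; RuntimeStringDemo and readFirstLine are hypothetical names.
class RuntimeStringDemo {
    // Reads the first line of 'text' the same way one would read a line from a file.
    static String readFirstLine(String text) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String s1 = "abc";                   // interned literal
        String s2 = readFirstLine("abc\n");  // built at run time, not interned

        System.out.println(s1 == s2);        // false
        System.out.println(s1.equals(s2));   // true
    }
}
```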
Quick and dirty answer: Java optimizes strings so if it encounters the same string twice it will reuse the same object (which is safe because String is immutable).
However, there is no guarantee.
What usually happens is that it works for a long time, and then you get a nasty bug that takes forever to figure out because someone changed the class loading context and your == no longer works.
You should continue to use equals() when testing string equality. Java makes no guarantees about identity testing for strings unless they are interned.
The reason s1 == s2 is true in your example is that the compiler is simply optimizing two literal references in a scope it can predict.

== operator does not compare references for String [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
String comparison and String interning in Java
I understand how String equals() method works but was surprised by some results I had with the String == operator.
I would have expected == to compare references as it does for other objects.
However, for distinct String objects (with the same content), == returns true, and furthermore it does so even for a static String object (with the same content), which is obviously not at the same memory address.
I guess == has been defined the same as equals to prevent its misuse
No, == does just compare references. However, I suspect you've been fooled by compile-time constants being interned - so two literals end up referring to the same string object. For example:
String x = "xyz";
String y = "xyz";
System.out.println(x == y); // Guaranteed to print true
StringBuilder builder = new StringBuilder();
String z = builder.append("x").append("yz").toString();
System.out.println(x == z); // Will print false
From section 3.10.5 of the Java language specification:
String literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.
The reason it returns true is memory optimization (which isn't always guaranteed to occur): strings with the same content will point to the same memory area to save space. In the case of static objects, they will always point to the same thing (as there is only one of it because of the static keyword). Again, don't rely on the above and use equals() instead.
One thing I should point out from Jon Skeet is that it is always guaranteed for compile time constants. But again just use equals() as it is clearer to read.
It is due to string intern pooling
See
whats-the-difference-between-equals-and
The == operator always compares references in Java and never contents. What can happen is that once you declare a string literal, this object is placed in the JVM's string pool, and if you reuse the same literal, the same pooled object is used again. A simple test for this behavior can be seen in the following code snippet:
String a = "a string";
String b = "a string";
System.out.println( a == b ); // will print true
String c = "other string";
String d = new String( "other string" );
System.out.println( c == d ); // will print false
The second case prints false because the variable d was initialized with a directly created String object and not a literal, so it will not go to the String pool.
Although the interning of string literals is specified by the Java Language Specification, relying on this behavior for comparisons is not advised. You should always use equals to compare objects.
I guess == has been defined the same as equals to prevent its misuse
Wrong. What is happening here is that when the compiler sees that you are using the same string in two different places it only stores it in the program's data section once. Read in a string or create it from smaller strings and then compare them.
Edit: Note that when I say "same string" above, I'm referring only to string literals, which the compiler knows at compile time.

How "==" works for objects?

public static void main(String [] a)
{
    String s = new String("Hai");
    String s1=s;
    String s2="Hai";
    String s3="Hai";
    System.out.println(s.hashCode());
    System.out.println(s1.hashCode());
    System.out.println(s2.hashCode());
    System.out.println(s3.hashCode());
    System.out.println(s==s2);
    System.out.println(s2==s3);
}
From the above code, can anyone explain what is going on behind the scenes when the JVM encounters this line (s==s2)?
It compares references - i.e. are both variables referring to the exact same object (rather than just equal ones).
s and s2 refer to different objects, so the expression evaluates to false.
s and s1 refer to the same objects (as each other) because of the assignment.
s2 and s3 refer to the same objects (as each other) because of string interning.
If that doesn't help much, please ask for more details on a particular bit. Objects and references can be confusing to start with.
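One detail in the question's code is worth a sketch: hashCode() on a String is computed from its characters, so equal hash codes say nothing about object identity. The class and method names below are made up for illustration, and System.identityHashCode is used only as a rough, non-unique indicator of identity:

```java
// Illustrative sketch; HashCodeDemo and contentHashesMatch are hypothetical names.
class HashCodeDemo {
    // String.hashCode() depends only on the characters, not on the object.
    static boolean contentHashesMatch(String a, String b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String s = new String("Hai");
        String s2 = "Hai";
        String s3 = "Hai";

        System.out.println(contentHashesMatch(s, s2)); // true: same characters
        System.out.println(s == s2);                   // false: different objects
        System.out.println(s2 == s3);                  // true: same interned literal

        // Identity hash codes usually differ for distinct objects,
        // but that is not guaranteed, so no expected output is claimed here.
        System.out.println(System.identityHashCode(s));
        System.out.println(System.identityHashCode(s2));
    }
}
```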
Note that only string literals are interned by default... so even though s and s2 refer to equal strings, they're still two separate objects. Similarly if you write:
String x = new String("foo");
String y = new String("foo");
then x == y will evaluate to false. You can force interning, which in this case would actually return the interned literal:
String x = new String("foo");
String y = new String("foo");
String z = "foo";
// Expressions and their values:
x == y: false
x == z: false
x.intern() == y.intern(): true
x.intern() == z: true
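The table of expressions above can be turned into a runnable check along these lines. The class name InternDemo and the helper sameAfterIntern are made up for illustration:

```java
// Illustrative sketch; InternDemo and sameAfterIntern are hypothetical names.
class InternDemo {
    // intern() returns the canonical pooled instance for the given contents.
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        String x = new String("foo");
        String y = new String("foo");
        String z = "foo";

        System.out.println(x == y);                // false
        System.out.println(x == z);                // false
        System.out.println(sameAfterIntern(x, y)); // true
        System.out.println(x.intern() == z);       // true
    }
}
```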
EDIT: A comment suggested that new String(String) is basically pointless. This isn't the case, in fact.
A String refers to a char[], with an offset and a length (this describes JDKs before Java 7 update 6; later versions copy the characters when taking a substring). If you take a substring, it will create a new String referring to the same char[], just with a different offset and length. If you need to keep a small substring of a long string for a long time, but the long string itself isn't needed, then it's useful to use the new String(String) constructor to create a copy of just the piece you need, allowing the larger char[] to be garbage collected.
An example of this is reading a dictionary file - lots of short words, one per line. If you use BufferedReader.readLine(), the allocated char array will be at least 80 chars (in the standard JDK, anyway). That means that even a short word like "and" takes a char array of 160 bytes + overheads... you can run out of space pretty quickly that way. Using new String(reader.readLine()) can save the day.
== compares object references, not the content of an object. s and s2 are different objects. If you want to compare the content, use s.equals(s2).
Think of it like this.
Identical twins look the same, but they are two different people.
If you want to know whether they "look" the same, use equals().
If you want to know whether they are one and the same person, use "==".
:)
== compares the memory (reference) location of the Objects. You should use .equals() to compare the contents of the object.
You can use == for ints and doubles because they are primitive data types
I suppose you know that when you test equality between variables using '==', you are in fact testing if the references in memory are the same. This is different from the equals() method that combines an algorithm and attributes to return a result stating that two Objects are considered as being the same. In this case, if the result is true, it normally means that both references are pointing to the same Object. This leaves me wondering why s2==s3 returns true and whether String instances (which are immutable) are pooled for reuse somewhere.
It may look like it should obviously be false, but the JVM reuses string literals that already exist in memory. Hence s2 and s3 point to the same String, which has been instantiated once. If you do something like s5="Hai", even that will be == to s3.
However, new creates a new object, irrespective of whether the String already exists or not. Hence s is not == to s2 or s3.
Now if you do s6= new String("Hai"), even that will not be == to s2, s3 or s.
The literals s2 and s3 will point to the same string in memory as they are present at compile time. s is created at runtime and will point to a different instance of "Hai" in memory. If you want s to point to the same instance of "Hai" as s2 and s3 you can ask Java to do that for you by calling intern. So s.intern() == s2 will be true.
Good article here.
You are using some '==' overload for String class...
