Understanding String and equals [duplicate]

Given String s1 = "java"; and String s2 = new String("java");
do these create different String objects? I know that if I write String s3 = "java" it uses the same object as s1, but does s2 also use the same object? If so, why does StringBuffer sb = new StringBuffer("java"); use a different object? When I call System.out.println(sb.equals(s1)); it prints false.
My understanding of the equals method is that it checks whether both references refer to the same object, unless equals has been overridden. Please let me know if my understanding is wrong.

Do both of these create different String objects?
There are two aspects here. The first is the interning of String literals. Of these two statements:
String s1 = "Hello";
String s2 = new String("Hello");
The first statement uses the literal "Hello" from the string constant pool. If no entry exists yet, one is created for "Hello" (in which case you can say that an object is created).
The second statement involves two String objects: the literal "Hello", which comes from the constant pool as before, and a second object created by the new keyword. new String(...) always creates a new object. So:
s1 == s2; // evaluates to false
because s1 and s2 are references to two different String objects.
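A minimal sketch of the two declarations above, in case you want to check the identity and content comparisons yourself. The class and helper names (StringIdentityDemo, sameObject, sameContent) are made up for illustration; intern() is the standard String method that returns the pooled instance.

```java
class StringIdentityDemo {
    // True only when both references point to the very same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    // True when the two strings hold the same character sequence.
    static boolean sameContent(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String s1 = "Hello";             // literal, taken from the string pool
        String s2 = new String("Hello"); // new always creates a separate object
        String s3 = "Hello";             // same pooled object as s1

        System.out.println(sameObject(s1, s3));          // true
        System.out.println(sameObject(s1, s2));          // false
        System.out.println(sameContent(s1, s2));         // true
        System.out.println(sameObject(s1, s2.intern())); // true
    }
}
```

The last line shows that intern() maps the separately created object back to the pooled literal.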
Now comes your 2nd case:
String s1 = "Hello";
StringBuffer sb = new StringBuffer("Hello");
The second statement creates a new StringBuffer object. Firstly, a StringBuffer is not a String. Secondly, the StringBuffer class does not override the equals() method, so when you invoke equals like this:
sb.equals(s1);
it invokes Object#equals(), which compares references. It returns false because sb and s1 point to two different instances.
However, if you compare them like this:
sb.toString().equals(s1);
then you get true, because String overrides equals() to compare contents.
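Here is a small sketch of the StringBuffer comparison described above. The class and method names are hypothetical. Besides toString().equals(...), it also shows String#contentEquals, a standard method that compares a String against any CharSequence.

```java
class StringBufferEqualsDemo {
    // StringBuffer inherits Object#equals, so this is an identity check.
    static boolean bufferEquals(StringBuffer sb, Object other) {
        return sb.equals(other);
    }

    // Compare the characters by converting the buffer to a String first.
    static boolean viaToString(StringBuffer sb, String s) {
        return sb.toString().equals(s);
    }

    // String#contentEquals accepts a StringBuffer (or any CharSequence) directly.
    static boolean viaContentEquals(StringBuffer sb, String s) {
        return s.contentEquals(sb);
    }

    public static void main(String[] args) {
        String s1 = "Hello";
        StringBuffer sb = new StringBuffer("Hello");

        System.out.println(bufferEquals(sb, s1));     // false
        System.out.println(viaToString(sb, s1));      // true
        System.out.println(viaContentEquals(sb, s1)); // true
    }
}
```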
See also:
What is the difference between String and StringBuffer in Java?
How do I compare strings in Java?

Your description matches the == operator (and the default Object#equals()), but not String#equals(), which is overridden.
The equals method of String is implemented to do this (quoted from the String class documentation):
"The result is true if and only if the argument is not null and is a String object that represents the same sequence of characters as this object."

String s1="java";
At this point, s1 points to a String object.
String s2 = new String("java");
String overrides the equals method to compare the contents of the objects, as per the documentation.
So s1.equals(s2) evaluates to true, because they have the same contents.
Object.equals(), by contrast, checks whether the two references are the same object.

String s1="java"; and String s2= new String("java"); do both of
these create different String objects?
new always creates a new object. That said, the character sequence they hold is the same. In this case == returns false but equals returns true.
If so, then why does StringBuffer sb = new StringBuffer("java"); use a
different object?
StringBuffer is not String. They are two entirely different classes. It is like comparing a String with an Integer. Maybe you meant System.out.println(sb.toString().equals(s1));?
My understanding of the equals method is that it compares whether both
references refer to the same object unless we have overridden the equals method.
You're right, but in this case String overrides equals() (and hashCode() as well), so the behavior is not that of Object#equals().

OK, let's use an analogy.
Suppose I write the same word, say 'HELLO', on two pieces of paper.
Then I bring in a panel of experts and ask them some questions.
Q. Expert one. Are those two things the same?
A. Yes they are the same, it says HELLO.
Q. Expert two, make me a paper airplane out of a piece of paper.
A. Ok, sure.
Q. Expert three, are these two things the same?
A. Of course not, one is a paper airplane.
Q. Expert four, get another sheet and write 'HELLO' on it. Now, are all these things the same?
A. Of course, they all say 'HELLO'
So, it depends what you mean by equals.
And computer languages have some non-intuitive ways of defining equals.
Sometimes equals means we care about the words on the paper; sometimes we care whether it is the exact same piece of paper, which often doesn't matter as long as both say 'HELLO'.

s1 and s3 refer to the same object, but s2 is a different object in memory. Check out http://www.journaldev.com/797/what-is-java-string-pool for an image and explanations that will clarify it more than words can.

Related

String interning and HashSet in java

I have read about string interning, in which String literals are reused, whereas String objects created using new aren't reused. This can be seen below when I print true and false for their equality. To be specific, (p1==p2)!=p3, so there are two objects, one pointed to by p1 and p2 and another by p3. However, when I add them to a HashSet, all are considered the same. I was expecting a.size() to return 2, but it returned 1. Why is this so?
package collections;
import java.util.HashSet;
public class Col {
    public static void main(String[] args) {
        method1();
    }
    public static void method1() {
        HashSet<String> a = new HashSet<>();
        String p1 = "Person1";
        String p2 = "Person1";
        String p3 = new String("Person1");
        // Identity comparison: p1 and p2 are the same pooled literal.
        if (p1 == p2)
            System.out.println(true);
        else
            System.out.println(false);
        // Identity comparison: p3 was created with new.
        if (p1 == p3)
            System.out.println(true);
        else
            System.out.println(false);
        a.add(p1);
        a.add(p2);
        a.add(p3);
        System.out.println(a.size());
    }
}
Output
true
false
1

HashSet uses equality, not identity, to keep a unique set of values (i.e., if two objects are equal to each other but not ==, a HashSet will keep only one of them).
You can implement a set that uses identity instead of equality by using the JDK's IdentityHashMap with a dummy value shared between all keys, in a similar way that HashSet is based on HashMap.
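One possible way to build such an identity-based set is sketched below. It uses Collections.newSetFromMap over an IdentityHashMap, which spares you from managing the dummy value yourself. The class and method names are made up for the example.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Set;

class IdentitySetDemo {
    // Number of elements kept by an ordinary HashSet (equals/hashCode based).
    static int sizeOfEqualitySet(String... values) {
        Set<String> set = new HashSet<>();
        Collections.addAll(set, values);
        return set.size();
    }

    // Number of elements kept by a set backed by IdentityHashMap (== based).
    static int sizeOfIdentitySet(String... values) {
        Set<String> set = Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        Collections.addAll(set, values);
        return set.size();
    }

    public static void main(String[] args) {
        String p1 = "Person1";
        String p2 = "Person1";
        String p3 = new String("Person1");

        System.out.println(sizeOfEqualitySet(p1, p2, p3)); // 1
        System.out.println(sizeOfIdentitySet(p1, p2, p3)); // 2
    }
}
```

The identity set keeps two entries because p1 and p2 are the same pooled object while p3 is a separate one.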

I have read about string interning, in which String literals are reused, whereas String object created using new aren't reused. This can be seen below when I print true and false for their equality. To be specific, (p1==p2)!=p3, So there are two objects, one pointed by p1 and p2 and another by p3. However, when I add them to HashSet, all considered same. I was expecting a.size() to return 2, but it returned 1.
This is right only if you compare Strings using ==; the result is different when comparing with the equals() method. (If in doubt, you can test it.)
When adding to a HashSet, the comparison method used is equals(), as is proper for objects. So p1, p2 and p3 are all equal.
If you test with equals() instead of ==, the program prints true, true, 1 instead of true, false, 1.

p1 and p2 are string literals, and they point to the same object because of the string pool. So when we compare them using ==, they match.
p3 is a String object created with new, so comparing with == compares references and gives false.
HashSet's add method calls HashMap's put method internally. HashMap's put method uses the hashCode and equals methods to place the key in the map. String implements hashCode and equals and returns the same hash code for the same value. A HashSet contains only unique values, so it stores just one.
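To see the effect of hashCode and equals on add, you can look at the boolean that Set#add returns. The sketch below records it for each value; the helper name addEach is hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetAddDemo {
    // Adds each value in order and records what Set#add reported.
    static boolean[] addEach(Set<String> set, String... values) {
        boolean[] added = new boolean[values.length];
        for (int i = 0; i < values.length; i++) {
            // add returns false when an equal element is already present
            added[i] = set.add(values[i]);
        }
        return added;
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>();
        boolean[] added = addEach(a, "Person1", "Person1", new String("Person1"));

        System.out.println(added[0]); // true
        System.out.println(added[1]); // false
        System.out.println(added[2]); // false
        System.out.println(a.size()); // 1
    }
}
```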

This is one of those cases where I would recommend learning how to use javap to understand how your code is compiled but let me try to explain what is going on under the hood.
When Java compiles that class, it creates instructions for building what is called the constant pool for that class. That constant pool holds a reference to a string with the value "Person1". The compiled code also says that p1 and p2 should both be set to the constant pool's reference to that string (the address in memory where it lives). p1 == p2 returns true because they hold the exact same reference. When you write String p3 = new String("Person1"); you are telling Java to create a new string in a different place in memory, which is merely a copy of the original, and to set p3 to a reference to that new object. So p1 == p3 returns false, because what you are asking is "does p1's location in memory equal p3's location in memory?"
As others have pointed out, p1.equals(p3) returns true because .equals compares the string values instead of the references. A HashSet sees them all as the same because it uses .hashCode, which like .equals is computed from the string value, to find the bucket and then .equals to confirm the match.
Hopefully that clears up some of the confusion!

What is the benefit of creating two objects using the new operator with String

What is the benefit of creating two objects using the new operator with String? Why are two objects created, and what is their importance?
String s=new String("abc");
// creates two objects
// why is the creation of two objects required?

If you perform the following test:
String a = "foo";
String b = new String(a);
System.out.println(a == b); // prints false
So a and b are not the same object (this is guaranteed, because the new operator always creates a new object).
This can be useful if you want to use == to check whether you are dealing with the very same String object rather than merely an equal one.
In this case, however, there is little use in doing so, especially because String objects are immutable, as the documentation of the constructor says:
Initializes a newly created String object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string. Unless an explicit copy of original is needed, use of this constructor is unnecessary since Strings are immutable.
The only situation where I think it could be useful is if you attach some kind of "message" identity to the String, for instance a message board that only accepts distinct String objects. If you then want to insert the same message twice, you need to make a copy of the String.

If I have 2 reference types, with the same value, does that mean only 1 object in memory

I'm new to Java and have read a conflicting statement to what I believe. Please consider the following code.
String str1 = "dave";
String str2 = "dave";
Both str1 and str2, although unique variables, reference the exact same value. So, how many unique objects are created in memory, 1 or 2, and can someone explain why?

In your example they refer to the same object, because the string literals are interned.
In general, usage of new creates new objects, thus using:
String str1 = new String("dave");
String str2 = new String("dave");
would create two different objects in the heap.

It's not so complicated. Except if you're talking about Strings ;-)
First, let's ignore Strings and assume this simple type:
public class ValueHolder {
    private final int value;
    public ValueHolder(int value) {
        this.value = value;
    }
    public int getValue() {
        return value;
    }
}
If you have two lines like this:
ValueHolder vh1 = new ValueHolder(1);
ValueHolder vh2 = new ValueHolder(1);
then you'll have created exactly 2 objects on the heap. Even though they behave exactly the same and have the exact same values stored in them (and can't be modified), you will have two objects.
So vh1 == vh2 will return false!
The same is true for String objects: two String objects with the same value can exist.
However, there is one specific thing about String: if you use a String literal(*) in your code, Java will re-use any earlier occurrence of it (via a process called interning).
So in your example code str1 and str2 will point to the same object.
(*) or more precisely: a compile-time constant expression of type String.
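The footnote about compile-time constant expressions can be checked with a sketch like the one below. The names (ConstantExpressionDemo, PREFIX and the three methods) are invented for the example. It relies on two rules as I understand them from the language specification: constant expressions of type String are interned, and a concatenation that is not a constant expression produces a newly created String.

```java
class ConstantExpressionDemo {
    static final String PREFIX = "da"; // a constant variable

    // PREFIX + "ve" is a compile-time constant, so it is the pooled "dave".
    static boolean constantConcatIsPooled() {
        String built = PREFIX + "ve";
        return built == "dave";
    }

    // prefix is an ordinary parameter, so the concatenation happens at run time.
    static boolean runtimeConcatIsPooled(String prefix) {
        String built = prefix + "ve";
        return built == "dave";
    }

    // intern() maps the run-time result back to the pooled instance.
    static boolean internedConcatIsPooled(String prefix) {
        return (prefix + "ve").intern() == "dave";
    }

    public static void main(String[] args) {
        System.out.println(constantConcatIsPooled());     // true
        System.out.println(runtimeConcatIsPooled("da"));  // false
        System.out.println(internedConcatIsPooled("da")); // true
    }
}
```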

You have one unique object and two references pointing to it. This is a result of String pooling (or interning). Given that both String literals have identical content, the only way to ensure that two separate objects are created would be to explicitly invoke one of the String constructors.

It depends. If you write a little test program, there's a very good chance that they will contain the same reference, because Java saves memory by reusing the String object for identical literals. If str2 came from user input, the two would most likely be different references. A good way to test is to use == in a comparison. If the two are ==, they reference the same memory location. If they are not, they are two different references. This throws off a lot of beginning programmers because when they first start writing code, they use ==, see that it works, and then down the road can't figure out why their comparisons aren't working.
I can't explain specifically when Java reuses references "behind the scenes", but it's related to how and when the values are created.

This is not a short-hand version of
String str1 = new String("dave");
String str2 = new String("dave");
With new, str1 and str2 would be two different objects. With the two literals, both variables refer to the same interned object, and since String is immutable neither can be modified through the other.
"dave", the literal itself, exists only once in memory, in the string pool.

Example of ==, equals and hashcode in java

Given this:
String s1= new String("abc");
String s2= new String("abc");
String s3 ="abc";
System.out.println(s1==s3);
System.out.println(s1==s2);
System.out.println(s1.equals(s2));
System.out.println(s1.equals(s3));
System.out.println(s1.hashCode());
System.out.println(s2.hashCode());
System.out.println(s3.hashCode());
Output is:
false
false
true
true
96354
96354
96354
Here == gives false for each comparison, but the hash code of each String object is the same. Why is that?

== checks the identity of objects (that is, whether both references point to the same object), not their content, whereas .equals compares content (at least for String).
String a = new String("aa");
String b = new String("aa");
a and b point to different objects.
Notice also that if objects are equal then their hash codes must be the same, but if hash codes are the same, it doesn't mean that the objects are equal.
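Both directions of that statement can be seen with plain strings. In the sketch below (class and helper names are made up), "Aa" and "BB" are a well-known pair of different strings that String#hashCode maps to the same value.

```java
class HashCollisionDemo {
    // True when both objects report the same hash code.
    static boolean sameHash(Object a, Object b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String a = new String("aa");
        String b = new String("aa");

        // Equal objects must have equal hash codes.
        System.out.println(a.equals(b));   // true
        System.out.println(sameHash(a, b)); // true

        // Equal hash codes do not imply equal objects.
        System.out.println(sameHash("Aa", "BB")); // true
        System.out.println("Aa".equals("BB"));    // false
    }
}
```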

The hashCode contract says that if o1.equals(o2), then o1.hashCode() == o2.hashCode(). It doesn't specify anything about the hash codes of unequal objects. You could have a method like
public int hashCode()
{
    return 42;
}
and it'd fulfill the contract. It's just expected that the hash code be related to the value of the object, in order to make hash tables work more efficiently.
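As a rough illustration of the constant hash code, here is a hypothetical Word class (not from the question) whose hashCode always returns 42. A HashSet still gets the right answer because equals() makes the final decision; every element simply lands in the same bucket, which is why it is inefficient.

```java
import java.util.HashSet;
import java.util.Set;

class ConstantHashDemo {
    // Hypothetical value class with a deliberately constant hash code.
    static final class Word {
        private final String text;

        Word(String text) {
            this.text = text;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Word && ((Word) o).text.equals(text);
        }

        @Override
        public int hashCode() {
            return 42; // legal: equal objects still share a hash code
        }
    }

    // Counts how many distinct Word values a HashSet keeps.
    static int distinctCount(String... texts) {
        Set<Word> set = new HashSet<>();
        for (String t : texts) {
            set.add(new Word(t));
        }
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctCount("abc", "abc", "xyz")); // 2
    }
}
```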
Now, as for why your == doesn't give true: == always compares references. That is, o1 == o2 is true only if o1 and o2 are the exact same object. That's rarely what you want; you usually want o1.equals(o2) instead.

When you use ==, you are comparing if two variables hold reference to the same Object. In other words s1 == s2 is like asking: are the s1 and s2 variables referring to the same String object? And that's not true, even when both String objects have the same "abc" value.
When you use equals(), you are comparing the value of both objects. Both objects may not be the same, but their value (in this case "abc") is the same, so it returns true.
How do you define whether an object is equal to another? That's up to you. In this case the String object already defines this for you, but for example if you define a Person object, how do you know if a person P1 is equal to P2? You do that by overriding equals() and hashCode().
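A minimal sketch of what overriding both methods for a Person might look like. The fields name and age are assumptions made for the example; which fields define equality is a design decision for your own class.

```java
import java.util.Objects;

class PersonEqualityDemo {
    // Hypothetical Person: two persons are equal when name and age match.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;               // same object
            if (!(o instanceof Person)) return false; // null or another type
            Person other = (Person) o;
            return age == other.age && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            // Built from the same fields that equals() compares.
            return Objects.hash(name, age);
        }
    }

    public static void main(String[] args) {
        Person p1 = new Person("Ann", 30);
        Person p2 = new Person("Ann", 30);

        System.out.println(p1 == p2);                       // false
        System.out.println(p1.equals(p2));                  // true
        System.out.println(p1.hashCode() == p2.hashCode()); // true
    }
}
```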

== tells you whether the two references point at the same object in memory, nothing more. For String, equals() and hashCode() both look at the contents of the object, each using its own algorithm.

Is Java's assertEquals method reliable?

I know that == has some issues when comparing two Strings. It seems that String.equals() is a better approach. Well, I'm doing JUnit testing and my inclination is to use assertEquals(str1, str2). Is this a reliable way to assert two Strings contain the same content? I would use assertTrue(str1.equals(str2)), but then you don't get the benefit of seeing what the expected and actual values are on failure.
On a related note, does anyone have a link to a page or thread that plainly explains the problems with str1 == str2?

You should always use .equals() when comparing Strings in Java.
JUnit calls the .equals() method to determine equality in the method assertEquals(Object o1, Object o2).
So, you are definitely safe using assertEquals(string1, string2). (Because Strings are Objects)
Here is a link to a great Stackoverflow question regarding some of the differences between == and .equals().

assertEquals uses the equals method for comparison. There is a different assert, assertSame, which uses the == operator.
To understand why == shouldn't be used with strings you need to understand what == does: it does an identity check. That is, a == b checks to see if a and b refer to the same object. It is built into the language, and its behavior cannot be changed by different classes. The equals method, on the other hand, can be overridden by classes. While its default behavior (in the Object class) is to do an identity check using the == operator, many classes, including String, override it to instead do an "equivalence" check. In the case of String, instead of checking if a and b refer to the same object, a.equals(b) checks to see if the objects they refer to are both strings that contain exactly the same characters.
Analogy time: imagine that each String object is a piece of paper with something written on it. Let's say I have two pieces of paper with "Foo" written on them, and another with "Bar" written on it. If I take the first two pieces of paper and use == to compare them it will return false because it's essentially asking "are these the same piece of paper?". It doesn't need to even look at what's written on the paper. The fact that I'm giving it two pieces of paper (rather than the same one twice) means it will return false. If I use equals, however, the equals method will read the two pieces of paper and see that they say the same thing ("Foo"), and so it'll return true.
The bit that gets confusing with Strings is that Java has a concept of "interning" Strings, and this is (effectively) automatically performed on any string literals in your code. This means that if you have two equivalent string literals in your code (even if they're in different classes) they'll actually both refer to the same String object. This makes the == operator return true more often than one might expect.

In a nutshell - you can have two String objects that contain the same characters but are different objects (in different memory locations). The == operator checks to see that two references are pointing to the same object (memory location), but the equals() method checks if the characters are the same.
Usually you are interested in checking if two Strings contain the same characters, not whether they point to the same memory location.

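import junit.framework.TestCase;
public class StringEqualityTest extends TestCase {
    public void testEquality() throws Exception {
        String a = "abcde";
        String b = new String(a);
        assertTrue(a.equals(b)); // same characters
        assertFalse(a == b);     // different objects
        assertEquals(a, b);      // passes, because it relies on equals()
    }
}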

The JUnit assertEquals(obj1, obj2) does indeed call obj1.equals(obj2).
There's also assertSame(obj1, obj2) which does obj1 == obj2 (i.e., verifies that obj1 and obj2 are referencing the same instance), which is what you're trying to avoid.
So you're fine.
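To make the difference concrete without pulling in JUnit, here is a sketch with two stand-in methods. They are simplified imitations written for this example, not JUnit's actual implementation; only the equals-versus-== distinction is taken from the answer above.

```java
import java.util.Objects;

class AssertSketch {
    // Stand-in for assertEquals: passes when the values are equal by equals().
    static void assertEquals(Object expected, Object actual) {
        if (!Objects.equals(expected, actual)) {
            throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    // Stand-in for assertSame: passes only when both are the same instance.
    static void assertSame(Object expected, Object actual) {
        if (expected != actual) {
            throw new AssertionError("expected same instance as:<" + expected + ">");
        }
    }

    // Helper: runs a check and reports whether it passed.
    static boolean passes(Runnable check) {
        try {
            check.run();
            return true;
        } catch (AssertionError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String a = "abcde";
        String b = new String(a);

        System.out.println(passes(() -> assertEquals(a, b))); // true
        System.out.println(passes(() -> assertSame(a, b)));   // false
    }
}
```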

Yes, it is used all the time for testing. It is very likely that the testing framework uses .equals() for comparisons such as these.
Below is a link explaining the "string equality mistake". Essentially, strings in Java are objects, and the == operator compares object references (memory addresses), not content. Two strings with identical content are often separate objects at different addresses, so == can report false even though they look the same when printed.
http://blog.enrii.com/2006/03/15/java-string-equality-common-mistake/

"The == operator checks to see if two Objects are exactly the same Object."
http://leepoint.net/notes-java/data/strings/12stringcomparison.html
String is an Object in java, so it falls into that category of comparison rules.
