equals or contentEquals when comparing strings

I can see that contentEquals is useful for comparing char sequences, but I can't find anything specifying which method is best to use when comparing two strings.
A related question mentions the differences between the two methods, but it doesn't explicitly say what to do with two strings.
I can see one advantage of using contentEquals: if the variable passed in has its type changed, a compilation error will be raised. A disadvantage could be the speed of execution.
Should I always use contentEquals when comparing strings, or only use it if there are different objects implementing CharSequence?

You should use String#equals when comparing the content of two Strings. Only use contentEquals if one of the objects is not of type String.
1) It is less confusing. Every Java developer should know what equals does, whereas contentEquals is a more specialised method and therefore less well known.
2) It is faster: as you can see in the implementation of contentEquals, it calls equals after checking whether the sequence is an AbstractStringBuilder, so you save the execution time of that check. But even if equals were slower, speed should not be the first thing you base your decision on. Go for readability first.

The advantage of contentEquals() is its support for objects that implement CharSequence. When you have a StringBuilder, it would be wasteful to call StringBuilder.toString() just so you can use equals() a moment later. In this case contentEquals() helps to avoid allocating a new String to do the comparison.
When comparing two String objects just use equals().
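The difference between the two methods can be sketched with a StringBuilder argument. This is only an illustrative example; the class name ContentEqualsDemo and the helper sameText are made up for this sketch and do not come from the question.

```java
class ContentEqualsDemo {
    // Hypothetical helper: compares a String with any CharSequence
    // without first copying the sequence into a new String.
    static boolean sameText(String expected, CharSequence actual) {
        return expected.contentEquals(actual);
    }

    public static void main(String[] args) {
        String expected = "hello";
        StringBuilder builder = new StringBuilder("hel").append("lo");

        // equals() is false: a StringBuilder is not a String, whatever it contains.
        System.out.println(expected.equals(builder));            // false
        // contentEquals() looks at the characters only.
        System.out.println(sameText(expected, builder));         // true
        // Also true, but allocates a temporary String just for the comparison.
        System.out.println(expected.equals(builder.toString())); // true
    }
}
```

When both operands are already declared as String, plain equals() stays the more readable choice, as both answers say.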

Related

Differentiate string and new string stored in different memory

I am new to Java. As per my understanding,
String s = "ABC" will be stored in the string pool, and String s = new String("ABC") will allocate new memory to store the value. If my understanding is correct, how can I prove this without using == or the equals() method?
Can we prove this using hashCode?
I generated the hash code for both and it returns the same value. Why is that?

... how to prove this without using == or equals() method?
The best way [1] to prove it in Java code is to use ==.
Certainly you can't prove it using hashCode on the strings, because they will have the same hash code. To understand why that is, read the javadoc for String.hashCode(). It explains how the hash code for a string is calculated.
[1] You could prove it by comparing the values returned by System.identityHashCode(Object). However, that's a roundabout approach, and the proof relies on knowledge of what the identity hash code actually means.
I generated the hashcode value for both, it returns the same value... why is that...
Read the javadoc ... then you will understand.

You cannot prove this using the hash code, because if two strings have equal values, they also have the same hash code.
You can prove it by comparing their references using the == operator.
Take a look at What's the difference between ".equals" and "=="? to understand the exact difference between the equals method and the == operator.
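A small sketch of what the answers describe: the content-based hash codes agree, while the reference comparison tells the two objects apart. The class name StringIdentityDemo and the helper sameObject are invented for the illustration.

```java
class StringIdentityDemo {
    // Hypothetical helper: true only if both references point to the same object.
    static boolean sameObject(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "ABC";            // taken from the string pool
        String copy = new String("ABC");   // a distinct object with the same characters

        // String.hashCode() is computed from the characters, so both values match.
        System.out.println(literal.hashCode() == copy.hashCode()); // true
        System.out.println(literal.equals(copy));                  // true
        // The references differ, which is what shows there are two objects.
        System.out.println(sameObject(literal, copy));             // false

        // Identity hash codes usually differ for distinct objects,
        // but that is not guaranteed, so == remains the direct proof.
        System.out.println(System.identityHashCode(literal));
        System.out.println(System.identityHashCode(copy));
    }
}
```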

Android: String equals and contains not matching

I'm testing out the JSON functionality for an Android application and have the following JSON object.
{"result":"fail"}
I then use the following code to get my value:
JSONObject jObject = new JSONObject(ReturnValue); //Return value is what's shown above
String r = jObject.getString("result");
Then using the following I don't get a match
if(r.trim() == "fail")
I wrote it out to the screen just to make sure with this:
et.setText("-" + r + "-");
That results in -fail-
I don't understand why this doesn't match. If I used r.Contains it returns true, but I can't use that for my checks.

Use .equals instead of ==. In Java, == compares the object references with each other. In the source code of String, the equals method is overridden so that it compares the characters instead.
You can't override operators in Java.
As a general rule, always use equals for object comparison unless you specifically want to check that the references you are comparing point to the same object on the heap.

Use
if(r.trim().equals("fail"))
to compare Strings.
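As a slightly more defensive sketch of the same check, the literal can go on the left so that a missing value does not cause a NullPointerException. The class ResultCheck and the method isFail are hypothetical names, and the null handling is an assumption about what the caller wants.

```java
class ResultCheck {
    // Hypothetical helper: true when the value, ignoring surrounding
    // whitespace, has exactly the content "fail".
    static boolean isFail(String result) {
        return result != null && "fail".equals(result.trim());
    }

    public static void main(String[] args) {
        // Built at runtime, like a value parsed out of JSON, so it is not the pooled literal.
        String r = new String("fail");

        System.out.println(r.trim() == "fail"); // false: compares references
        System.out.println(isFail(r));          // true: compares characters
        System.out.println(isFail(null));       // false instead of throwing
    }
}
```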

As others pointed out, in Java == means "exactly the same object", not "an identical object". You can have two, say, SimpleDateFormat objects that are identical, yet if they occupy different places on the heap, they are not the same object. FYI, C# behaves in quite a similar way, but manages to hide it from programmers most of the time.
BTW, since you are already writing Java code, it might be a good idea to, you know, study the language a bit. Saves an awful lot of problems later on. A lot of other surprises await people who try to write C# in Java (like non-static inner classes).

When to use CharSequence in an API

I'm designing a public interface (API) for a package. I wonder, should I use CharSequence generally instead of String. (I'm mainly talking about the public interfaces).
Are there any drawbacks of doing so? Is it considered a good practice?
What about using it for identifier-like purposes (when the value is matched against a set in a hash-based container)?

CharSequence is rarely used in general purpose libraries. It should usually be used when your main use case is string handling (manipulation, parsing, ...).
Generally speaking you can do anything with a CharSequence that you could do with a String (trivially, since you can convert every CharSequence into a String). But there's one important difference: A CharSequence is not guaranteed to be immutable! Whenever you handle a String and inspect it at two different points in time, you can be sure that it will have the same value every time.
But for a CharSequence that's not necessarily true. For example someone could pass a StringBuilder into your method and modify it while you do something with it, which can break a lot of sane code.
Consider this pseudo-code:
public Object frobnicate(CharSequence something) {
    Object o = getFromCache(something);
    if (o == null) {
        o = computeValue(something);
        putIntoCache(o, something);
    }
    return o;
}
This looks harmless enough, and if you had used String here it would mostly work (except maybe that the value might be calculated twice). But if something is a CharSequence, then its content could change between the getFromCache call and the computeValue call. Or worse: between the computeValue call and the putIntoCache call!
Therefore: only accept CharSequence if there are big advantages and you know the drawbacks.
If you accept CharSequence you should document how your API handles mutable CharSequence objects. For example: "Modifying an argument while the method executes results in undefined behaviour."
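One possible way to limit the problem described above is to take a single String snapshot of the argument and use only that snapshot afterwards. The sketch below is an assumption about how such a method could look, not the answer's own code: FrobnicateCache, computeValue and cachedEntries are hypothetical, and the "expensive" computation is just the key length.

```java
import java.util.HashMap;
import java.util.Map;

class FrobnicateCache {
    private final Map<String, Integer> cache = new HashMap<>();

    // Hypothetical stand-in for the expensive computation.
    private static Integer computeValue(String key) {
        return key.length();
    }

    public Integer frobnicate(CharSequence something) {
        // Copy once into an immutable String; the cache lookup, the computation
        // and the cache insert then all see exactly the same characters.
        String key = something.toString();
        return cache.computeIfAbsent(key, FrobnicateCache::computeValue);
    }

    public int cachedEntries() {
        return cache.size();
    }

    public static void main(String[] args) {
        FrobnicateCache frob = new FrobnicateCache();
        StringBuilder mutable = new StringBuilder("abc");

        System.out.println(frob.frobnicate(mutable)); // 3
        mutable.append("de");                         // later changes do not touch the cached key
        System.out.println(frob.frobnicate("abc"));   // 3, served from the cache
        System.out.println(frob.cachedEntries());     // 1
    }
}
```

The snapshot does not help if another thread modifies the argument while toString() is running, so the documentation advice above still applies.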

This does depend on what you need, I'd like to state two advantages of String, however.
From CharSequence's documentation:
Each object may be implemented by a different class, and there is no
guarantee that each class will be capable of testing its instances for
equality with those of the other. It is therefore inappropriate to use
arbitrary CharSequence instances as elements in a set or as keys in a
map.
Thus, whenever you need a Map or reliable equals/hashCode, you need to copy instances into a String (or whatever).
Moreover, I think CharSequence does not explicitly mention that implementations must be immutable. You may need to do defensive copying which may slow down your implementations.
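The map-key problem from the quoted documentation can be sketched as follows. StringBuilder does not override equals or hashCode, so a lookup with a StringBuilder misses an entry stored under a String with the same characters. The class CharSequenceKeyDemo and its two lookup helpers are hypothetical names for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

class CharSequenceKeyDemo {
    // Looks the key up as-is, relying on the key's own equals/hashCode.
    static Integer lookupRaw(Map<CharSequence, Integer> map, CharSequence key) {
        return map.get(key);
    }

    // Copies the key into a String first, so equality is based on content.
    static Integer lookupAsString(Map<String, Integer> map, CharSequence key) {
        return map.get(key.toString());
    }

    public static void main(String[] args) {
        Map<CharSequence, Integer> raw = new HashMap<>();
        raw.put("id", 1);
        Map<String, Integer> byString = new HashMap<>();
        byString.put("id", 1);

        CharSequence key = new StringBuilder("id");
        System.out.println(lookupRaw(raw, key));           // null: StringBuilder uses identity equality
        System.out.println(lookupAsString(byString, key)); // 1
    }
}
```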

Java's CharSequence is an interface. As the API documentation says, CharSequence is implemented by the CharBuffer, Segment, String, StringBuffer and StringBuilder classes. So if you want your API to accept instances of all these classes, then CharSequence is your choice. If not, then String is very good for a public API because it is very easy and everybody knows it. Remember that CharSequence only gives you 4 methods, so if you accept a CharSequence object in a method, your ability to manipulate the input will be limited.

If a parameter is conceptually a sequence of chars, use CharSequence.
A string is technically a sequence of chars, but most often we don't think of it like that; a string is more atomic / holistic, we don't usually care about individual chars.
Think about int - though an int is technically a sequence of bits, we don't usually care about individual bits. We manipulate ints as atomic things.
So if the main work you are going to do on a parameter is to iterate through its chars, use CharSequence. If you are going to manipulate the parameter as an atomic thing, use String.

You can implement CharSequence to hold your passwords, because the usage of String is discouraged for that purpose. The implementation should have a dispose method that wipes out the plain text data.
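A minimal sketch of such a class might look like the following. Everything here is hypothetical: the name SecretChars, the copyOf factory and the decision that toString() hides the characters (which deliberately departs from the usual CharSequence.toString() convention) are assumptions made for the illustration, not an established API.

```java
import java.util.Arrays;

final class SecretChars implements CharSequence {
    private final char[] chars;

    // Takes ownership of the array; callers go through copyOf().
    private SecretChars(char[] owned) {
        this.chars = owned;
    }

    // Copies the caller's array, so wiping one side does not affect the other.
    public static SecretChars copyOf(char[] source) {
        return new SecretChars(Arrays.copyOf(source, source.length));
    }

    @Override
    public int length() {
        return chars.length;
    }

    @Override
    public char charAt(int index) {
        return chars[index];
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new SecretChars(Arrays.copyOfRange(chars, start, end));
    }

    // Assumption: never build a String from the secret, so it cannot end up in logs.
    @Override
    public String toString() {
        return "SecretChars[length=" + chars.length + "]";
    }

    // Overwrites the plain text with zero characters.
    public void dispose() {
        Arrays.fill(chars, '\0');
    }
}
```

A value like this can still be checked against an expected String with expected.contentEquals(secret), without ever converting the secret to a String.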

No reverse method in String class in Java?

Why there is no reverse method in String class in Java? Instead, the reverse() method is provided in StringBuilder? Is there a reason for this? But String has split(), regionMatches(), etc., which are more complex than the reverse() method.
When they added these methods, why not add reverse()?

Since you have it in StringBuilder, there's no need for it in String, right? :-)
Seriously, when designing an API there's lots of things you could include. The interfaces are however intentionally kept small for simplicity and clarity. Google on "API design" and you'll find tons of pages agreeing on this.
Here's how you do it if you actually need it:
str = new StringBuilder(str).reverse().toString();
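Wrapped in a small helper, the one-liner above could look like this. The class ReverseDemo and the method reverse are hypothetical names chosen for the sketch.

```java
class ReverseDemo {
    // Returns a new String with the characters in reverse order;
    // the original String is left untouched because Strings are immutable.
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    public static void main(String[] args) {
        String original = "Hello";
        System.out.println(reverse(original)); // olleH
        System.out.println(original);          // Hello
    }
}
```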

Theoretically, String could offer it and just return the correct result as a new String. It's just a design choice, when you get down to it, on the part of the Java base libraries.

If you want a historical reason: Strings are immutable in Java, that is, you cannot change a given String without creating another String.
While this is not bad per se, initial versions of Java lacked classes like StringBuilder. Instead, String itself contained (and still contains) a lot of methods to "alter" the String, but since String is immutable, each of these methods actually creates and returns a NEW String object.
This caused simple expressions like:
String s = "a" + anotherString.substring(10, 15).trim().toLowerCase();
to actually create something like 5 strings in RAM, 4 of which are absolutely useless, with obvious performance problems (even though some optimizations of the underlying char[] arrays were made later).
To solve this, Sun introduced StringBuilder and other classes that ARE NOT immutable. These classes freely modify a single char[] array, so that calling methods does not need to produce many intermediate String instances.
They added "reverse" quite late, so they added it to StringBuilder instead of String, because that's now the preferred way to manipulate strings.

As a side note, in Scala you use the same java.lang.String class and you do get a reverse method (along with all kinds of other handy stuff). It does this with implicit conversions, so that your String gets automatically converted into a class that does have a reverse method. It's really quite clever, and removes the need to bloat the base class with hundreds of methods.

String is immutable, meaning it can't be changed.
When you reverse a String, each letter is moved on its own, which means a new object is created each time.
For example, Hello goes through the following steps:
elloH lloeH loleH olleH
and you end up with 4 new String objects on the heap.
Now imagine a string with thousands of letters or more: a huge number of objects would be created, which would be very expensive and occupy too much memory.
This is why the String class does not have a reverse() method.

Well I think it could be because it is an immutable class so if we had a reverse method it would actually create a new object.

reverse() acts on this, modifying the current object, and String objects are immutable - they can't be modified.
It's peculiarly efficient to do reverse() in situ - the size is known to be the same, so no allocation is necessary, there are half as many loop iterations as there would be in a copy, and, for large strings, memory locality is optimal. From looking at the code, one can see that a lot of care was taken to make it fast. I suspect the author(s) had a particular use case in mind that demanded high performance.

Is Java's assertEquals method reliable?

I know that == has some issues when comparing two Strings. It seems that String.equals() is a better approach. Well, I'm doing JUnit testing and my inclination is to use assertEquals(str1, str2). Is this a reliable way to assert two Strings contain the same content? I would use assertTrue(str1.equals(str2)), but then you don't get the benefit of seeing what the expected and actual values are on failure.
On a related note, does anyone have a link to a page or thread that plainly explains the problems with str1 == str2?

You should always use .equals() when comparing Strings in Java.
JUnit calls the .equals() method to determine equality in the method assertEquals(Object o1, Object o2).
So, you are definitely safe using assertEquals(string1, string2). (Because Strings are Objects)
Here is a link to a great Stackoverflow question regarding some of the differences between == and .equals().

assertEquals uses the equals method for comparison. There is a different assert, assertSame, which uses the == operator.
To understand why == shouldn't be used with strings you need to understand what == does: it does an identity check. That is, a == b checks to see if a and b refer to the same object. It is built into the language, and its behavior cannot be changed by different classes. The equals method, on the other hand, can be overridden by classes. While its default behavior (in the Object class) is to do an identity check using the == operator, many classes, including String, override it to instead do an "equivalence" check. In the case of String, instead of checking if a and b refer to the same object, a.equals(b) checks to see if the objects they refer to are both strings that contain exactly the same characters.
Analogy time: imagine that each String object is a piece of paper with something written on it. Let's say I have two pieces of paper with "Foo" written on them, and another with "Bar" written on it. If I take the first two pieces of paper and use == to compare them it will return false because it's essentially asking "are these the same piece of paper?". It doesn't need to even look at what's written on the paper. The fact that I'm giving it two pieces of paper (rather than the same one twice) means it will return false. If I use equals, however, the equals method will read the two pieces of paper and see that they say the same thing ("Foo"), and so it'll return true.
The bit that gets confusing with Strings is that Java has a concept of "interning" Strings, and this is (effectively) automatically performed on any string literals in your code. This means that if you have two equivalent string literals in your code (even if they're in different classes) they'll actually both refer to the same String object. This makes the == operator return true more often than one might expect.
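The interning behaviour can be sketched in a few lines. InternDemo and sameObject are hypothetical names used only for this example.

```java
class InternDemo {
    // Hypothetical helper: an identity check, the same thing == does.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String first = "Foo";
        String second = "Foo";             // same literal, so the same interned object
        String built = new String("Foo");  // explicitly a new object

        System.out.println(sameObject(first, second));         // true
        System.out.println(sameObject(first, built));          // false
        System.out.println(sameObject(first, built.intern())); // true: intern() returns the pooled object
        System.out.println(first.equals(built));               // true: same characters
    }
}
```

This is why == can appear to work in quick experiments with literals and then fail on strings built at runtime.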

In a nutshell - you can have two String objects that contain the same characters but are different objects (in different memory locations). The == operator checks to see that two references are pointing to the same object (memory location), but the equals() method checks if the characters are the same.
Usually you are interested in checking if two Strings contain the same characters, not whether they point to the same memory location.

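import junit.framework.TestCase;

public class StringEqualityTest extends TestCase {
    public void testEquality() throws Exception {
        String a = "abcde";
        String b = new String(a);   // same characters, different object
        assertTrue(a.equals(b));    // content comparison passes
        assertFalse(a == b);        // reference comparison fails
        assertEquals(a, b);         // assertEquals uses equals(), so it passes
    }
}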

The JUnit assertEquals(obj1, obj2) does indeed call obj1.equals(obj2).
There's also assertSame(obj1, obj2) which does obj1 == obj2 (i.e., verifies that obj1 and obj2 are referencing the same instance), which is what you're trying to avoid.
So you're fine.

Yes, it is used all the time for testing. It is very likely that the testing framework uses .equals() for comparisons such as these.
Below is a link explaining the "string equality mistake". Essentially, strings in Java are objects, and when you compare object equality, typically they are compared based on memory address, and not by content. Because of this, two strings won't occupy the same address, even if their content is identical, so they won't match correctly, even though they look the same when printed.
http://blog.enrii.com/2006/03/15/java-string-equality-common-mistake/

"The == operator checks to see if two Objects are exactly the same Object."
http://leepoint.net/notes-java/data/strings/12stringcomparison.html
String is an Object in java, so it falls into that category of comparison rules.
