What does compareTo() actually return? - java

The compareTo() method in Java returns a value greater than, equal to, or less than 0, and I know that. My question is about the value itself: what is the difference between compareTo() returning, say, 2 or 4? Look at the code below:
String s1="hello";
String s2="hello";
String s3="meklo";
String s4="hemlo";
System.out.println(s1.compareTo(s2)); // 0
System.out.println(s1.compareTo(s3)); // -5
System.out.println(s1.compareTo(s4)); // -1
Why do the last two calls return -5 and -1?

https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#compareTo(java.lang.String)
This is the definition of lexicographic ordering. If two strings are different, then either they have different characters at some index that is a valid index for both strings, or their lengths are different, or both. If they have different characters at one or more index positions, let k be the smallest such index; then the string whose character at position k has the smaller value, as determined by using the < operator, lexicographically precedes the other string. In this case, compareTo returns the difference of the two character values at position k in the two string -- that is, the value:
this.charAt(k)-anotherString.charAt(k)
If there is no index position at which they differ, then the shorter string lexicographically precedes the longer string. In this case, compareTo returns the difference of the lengths of the strings -- that is, the value:
this.length()-anotherString.length()
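Both documented cases can be checked with a few calls. The following is only a small sketch; the class name CompareToCases and the helper cmp are made up for illustration and simply delegate to String.compareTo.

```java
public class CompareToCases {
    // Hypothetical wrapper so the calls below read uniformly; it just delegates to String.compareTo.
    static int cmp(String a, String b) {
        return a.compareTo(b);
    }

    public static void main(String[] args) {
        // The strings differ at some index: the result is the char difference at the first such index.
        System.out.println(cmp("hello", "meklo"));       // 'h' (104) - 'm' (109) = -5
        System.out.println(cmp("hello", "hemlo"));       // 'l' (108) - 'm' (109) = -1

        // No differing index within the shorter string: the result is the length difference.
        System.out.println(cmp("hello", "hello world")); // 5 - 11 = -6
        System.out.println(cmp("hello", "he"));          // 5 - 2 = 3
    }
}
```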

compareTo() returns the difference between the first pair of characters that do not match in the two compared strings. If there is no mismatch and one string is shorter than the other, the length difference is returned.
"hello".compareTo("meklo") = 'h' - 'm' = -5
 ^                 ^
and
"hello".compareTo("hemlo") = 'l' - 'm' = -1
   ^                 ^
As a side note:
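The contract only gives meaning to the sign of the result, so compareTo can simply return the raw difference instead of normalizing it to 1 or -1 (a small optimisation). Note that Java, unlike C, does not treat non-zero integers as true inside conditional statements; callers compare the result with 0.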

If you take a closer look at the source code for String#compareTo(String), you can see that the result is not limited to -1, 0 and 1.
public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;
    int lim = Math.min(len1, len2);
    char v1[] = value;
    char v2[] = anotherString.value;
    int k = 0;
    while (k < lim) {
        char c1 = v1[k];
        char c2 = v2[k];
        if (c1 != c2) {
            return c1 - c2;
        }
        k++;
    }
    return len1 - len2;
}
In most cases (i.e. a difference in the characters of both strings) it will return the integer difference of the char values of the first differing characters. Otherwise it will return the difference of the lengths of both strings.
The interpretation of the return value beyond = 0, > 0 and < 0 should be of no concern in practice, since the implementation is allowed to change at any time if the contract of Comparable<T>#compareTo(T) is kept:
Compares this object with the specified object for order. Returns a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified object.
Source: https://docs.oracle.com/javase/8/docs/api/java/lang/Comparable.html#compareTo-T-

The exact value does not matter - all that the Comparable (as well as Comparator) interface cares about is whether the value is negative, zero or positive.
This is to make things simple for implementations of the interface. When implementing it, you may choose to return the basic -1, 0 or 1 (this is usual if the comparison relies on evaluating some conditions), or you may use any arbitrary negative or positive value if it suits you better - e.g. you can compare two integers by returning this.i - other.i, as long as the subtraction cannot overflow.
In your particular given example, my guess would be:
-1 is difference in the third letter's code point: 'l' - 'm' == -1
-5 is difference in the first letter's code point: 'h' - 'm' == -5
But the important part is that you shall not rely on it to be that way - it's an implementation detail, and according to Comparable's contract any negative value shall be treated the same ("less than").
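A short sketch of what "only the sign matters" means in code. The class SignOnly and the helper bySubtraction are hypothetical names used for illustration; Integer.signum and Integer.compare are standard library methods. It also shows why returning a plain difference of two ints is only safe when the subtraction cannot overflow.

```java
public class SignOnly {
    // Hypothetical subtraction-based comparison of two ints; it can overflow.
    static int bySubtraction(int a, int b) {
        return a - b;
    }

    public static void main(String[] args) {
        int raw = "hello".compareTo("meklo");
        // Callers should look only at the sign; signum collapses any int to -1, 0 or 1.
        System.out.println(Integer.signum(raw)); // -1

        int a = Integer.MIN_VALUE;
        int b = 1;
        // a is smaller than b, but the subtraction wraps around to a positive number.
        System.out.println(bySubtraction(a, b) > 0);   // true (wrong sign)
        // Integer.compare does not subtract, so it reports the correct sign.
        System.out.println(Integer.compare(a, b) < 0); // true
    }
}
```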

Related

Comparing Strings in Java (greater or lesser) [duplicate]

So, I'm trying to compare two Strings in Java right now, but the compareTo() method behaves strangely. Consider this example:
String one = "one";
String two = "this is muuch greater";
System.out.println(one.compareTo(two));
And if I try to compare them the method works just fine. It returns a negative value.
But if I try something different, for example: (1)
String one = "word";
String two = "hello world";
System.out.println(one.compareTo(two));
or something like this: (2)
String one = "key";
String two = "qw";
System.out.println(one.compareTo(two));
It behaves strangely. In case (1) the method returned a positive value, despite String one being shorter than String two.
In case (2) the method returned a negative value, despite the first string being longer than the other one.
I read that the compareTo() method returns a positive value if the string on which the method is called is longer than the string passed as the argument, returns 0 if their lengths are equal, and returns a negative value in every other case. What am I doing wrong?
The compareTo() method compares the given string with the current string lexicographically. It returns a positive number, a negative number, or 0.
It compares strings on the basis of the Unicode value of each character in the strings.
If the first string is lexicographically greater than the second string, it returns a positive number (the difference of the character values). If the first string is lexicographically less than the second string, it returns a negative number, and if the two strings are lexicographically equal, it returns 0.
If you look at the source code, you will see that compareTo in String compares the Unicode values of the characters one position at a time until one of the strings ends. At the first mismatch it stops and returns the difference between the char codes in the first and second strings.
If no mismatch is found, the method returns the difference between the lengths of the first and second strings.
So the code of 'o' is smaller than that of 't' (the difference is negative), the code of 'w' is greater than that of 'h' (the difference is positive), and the code of 'k' is smaller than that of 'q' (the difference is negative). For these letters the code grows in alphabetical order.
public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;
    int lim = Math.min(len1, len2);
    char v1[] = value;
    char v2[] = anotherString.value;
    int k = 0;
    while (k < lim) {
        char c1 = v1[k];
        char c2 = v2[k];
        if (c1 != c2) {
            return c1 - c2;
        }
        k++;
    }
    return len1 - len2;
}
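Applied to the three examples from the question, the first characters already differ, so the string lengths never come into play. A small sketch (the class name FirstMismatch and the helper firstCharDifference are made up for illustration):

```java
public class FirstMismatch {
    // Hypothetical helper: difference of the first characters only.
    // Assumes both strings are non-empty.
    static int firstCharDifference(String a, String b) {
        return a.charAt(0) - b.charAt(0);
    }

    public static void main(String[] args) {
        // 'o' (111) - 't' (116)
        System.out.println("one".compareTo("this is muuch greater")); // -5
        // 'w' (119) - 'h' (104): positive although "word" is shorter
        System.out.println("word".compareTo("hello world"));          // 15
        // 'k' (107) - 'q' (113): negative although "key" is longer
        System.out.println("key".compareTo("qw"));                    // -6
        // Same value as the compareTo call, because index 0 is the first mismatch.
        System.out.println(firstCharDifference("key", "qw"));         // -6
    }
}
```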

Why is compareTo() behaving like this

I am trying to compare two strings. I am using the compareTo method but am seeing some behavior I don't understand.
System.out.println("5".compareTo("10") > 0);
System.out.println("13".compareTo("10") > 0);
Why do both of these statements output true?
The natural ordering for Java strings is lexicographical, not numerical. (See the javadoc for what lexicographical means in the context of a Java string.)
For the first example, the character '5' is greater than the character '1', so "5" is larger than "10".
For the second example, the '1' is common to both strings. So we move on to the 2nd characters, and compare '3' with '0'. The former is larger, so "13" is larger than "10".
And:
Why is compareTo() behaving like this
Because the spec says it should; see link above. And because it makes sense.
(You would not want the String::compareTo() method to try to distinguish between words and numbers and order words alphabetically and numbers numerically ... and scratch its metaphorical head over strings that are neither one nor the other!)
When you provide a value in double quotes ("") it is treated as a String, and String comparison is different from number comparison.
Try the code below and you will see the difference.
Integer targetValue = 10;
Integer firstValue = 5;
Integer secondValue = 13;
System.out.println("5".compareTo("10") > 0);
System.out.println("13".compareTo("10") > 0);
System.out.println(firstValue.compareTo(targetValue) > 0);
System.out.println(secondValue.compareTo(targetValue) > 0);
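If the values arrive as strings but should be ordered as numbers, one option is to parse them inside the comparator. This is only a sketch: the class and method names are hypothetical, and it assumes every string is a valid int (Integer.parseInt throws NumberFormatException otherwise).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class NumericStringSort {
    // Natural String ordering: lexicographic, character by character.
    static List<String> sortLexicographically(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    // Orders the strings by their parsed int value instead.
    static List<String> sortNumerically(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(Comparator.comparingInt((String s) -> Integer.parseInt(s)));
        return copy;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("5", "13", "10");
        System.out.println(sortLexicographically(values)); // [10, 13, 5]
        System.out.println(sortNumerically(values));       // [5, 10, 13]
    }
}
```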

Why do we write A.charAt(i) but not A.charAt[i]? And why do we write " - 'A' "?

public static int get(String A) // it is a method
{
    int count = 1;
    for (int i = 0; i < A.length(); i++) // So A reads line (any word, for example "Car"), so I understand that length will be 3 and that java will check all the characters.
    {
        int num = (A.charAt(i) - 'A') + 1;
        count *= num;
    }
    return count;
}
You write A.charAt(i) because charAt is a function, not an array.
You write A.charAt(i) - 'A' to compute the difference between A's i:th character and the character 'A'.
The String class is immutable (a value object). It doesn't give you direct access to the characters which make up the string, mainly for performance reasons but also since it helps to avoid a whole class of bugs.
That's why you can't use array access via []. You could call A.toCharArray(), but that would create a copy of the underlying character array.
char is the code for a character. 'A' == 65, for example (see an ASCII table). If A.charAt(1) returns 'F' (or 70), then 'F' - 'A' gives you 5. +1 gives 6.
So the code above turns letters into numbers. A pattern which you'll see pretty often is charAt(i) - '0' to turn a digit character into its numeric value.
But the code above is odd in this respect, since count *= num produces a pretty random result for the input. To turn the letters into numbers, base 26, it should read count = count * 26 + num.
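A sketch of the two variants side by side. The class and method names are hypothetical, the input is assumed to consist only of the uppercase letters 'A' to 'Z', and the base-26 accumulator starts at 0 rather than 1, which is an assumption made here so that "A" maps to 1.

```java
public class LetterValue {
    // The product from the question: different words can collide.
    static int product(String word) {
        int count = 1;
        for (int i = 0; i < word.length(); i++) {
            count *= (word.charAt(i) - 'A') + 1;
        }
        return count;
    }

    // Positional value in base 26: each letter is a digit from 1 ('A') to 26 ('Z').
    static int toBase26(String word) {
        int value = 0;
        for (int i = 0; i < word.length(); i++) {
            int digit = (word.charAt(i) - 'A') + 1;
            value = value * 26 + digit;
        }
        return value;
    }

    public static void main(String[] args) {
        // Both products are 3 * 1 * 2 in some order, so they are equal.
        System.out.println(product("CAB") == product("BAC"));   // true
        // The positional values differ.
        System.out.println(toBase26("CAB") == toBase26("BAC")); // false
        System.out.println(toBase26("AA"));                     // 27
    }
}
```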
A.charAt(i) is a method call on a String. In Java you cannot write A[i] on a String; square brackets only work on arrays, for example on the char[] returned by A.toCharArray().
When you do an operation (+ or -) with chars, you get an int.
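The following sketch puts these points together: charAt is a method call, square brackets work only on a char[] such as the copy returned by toCharArray(), and subtracting two chars yields an int. The class name CharAccess and the helper offsetFromA are hypothetical.

```java
public class CharAccess {
    // char - char is promoted to int, so the return type is int.
    static int offsetFromA(char c) {
        return c - 'A';
    }

    public static void main(String[] args) {
        String a = "Car";

        char viaMethod = a.charAt(0);  // method call on the String
        char[] copy = a.toCharArray(); // a fresh array, not the String's internal storage
        char viaArray = copy[0];       // array indexing works on char[] only

        copy[0] = 'B';                 // changes the copy, never the String
        System.out.println(viaMethod == viaArray);    // true
        System.out.println(a);                        // Car
        System.out.println(offsetFromA(a.charAt(0))); // 'C' - 'A' = 2
    }
}
```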
Java API for charAt() function
charAt() is a java method, not an Array
returns the char value at the specified index.
Syntax:
Here is the syntax of this method:
public char charAt(int index);
Because charAt() is a method that returns a character from a given String, and not an array. Characters are written 'A'. Strings are written "A".
Because charAt is a method of String that accepts an index. String maintains a char array internally, but it is hidden from us, so you get a method rather than the array itself.
The reason for - 'A' is that the author wants to convert the character to an integer. For example, if the character is 'B', its ASCII value is 66, and the ASCII value of 'A' is 65, so:
num = 66 - 65 + 1
And do further processing.
Because charAt() is a method of String in Java and it returns a character. 'A' denotes a char, while "A" denotes a String.
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#charAt%28int%29

What does the String class's compareTo() method return

The Java API documentation on Oracle's website says that compareTo returns "the value 0 if the argument string is equal to this string; a value less than 0 if this string is lexicographically less than the string argument; and a value greater than 0 if this string is lexicographically greater than the string argument."
Here is an if statement:
String a = "abd";
String b = "abc";
if(a.compareTo(b) >= 1)
returns true
since string a is greater, lexicographically.
My question is, does the compareTo always return a 0, 1, or -1? or does it return the actual amount that the string is greater than or less than the string argument.
So in the above if statement, since "abd" is one greater than "abc" is it returning 1?
As far as you're concerned, only the sign of the compareTo return value is meaningful, not its magnitude. Many compareTo implementations return -1, 0, or 1, but the contract only says negative, zero, or positive (and String.compareTo does return other values), so you should write your code accordingly (e.g., using int compare = a.compareTo(b); if(compare > 0) {...} else...).
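A sketch of sign-based branching; the class name SignCheck and the helper describe are hypothetical. It also shows why testing for exactly 1 would be fragile with strings.

```java
public class SignCheck {
    // Branches on the sign of the result only, never on its magnitude.
    static String describe(String a, String b) {
        int compare = a.compareTo(b);
        if (compare > 0) {
            return "greater";
        } else if (compare < 0) {
            return "less";
        }
        return "equal";
    }

    public static void main(String[] args) {
        System.out.println("abd".compareTo("abc")); // 'd' - 'c' = 1
        System.out.println("abf".compareTo("abc")); // 'f' - 'c' = 3, so a check for == 1 would miss it
        System.out.println(describe("abf", "abc")); // greater
        System.out.println(describe("abc", "abd")); // less
        System.out.println(describe("abc", "abc")); // equal
    }
}
```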
According to http://docs.oracle.com/javase/6/docs/api/java/lang/String.html#compareTo%28java.lang.String%29
In this case, compareTo returns the difference of the two character values at position k in the two string -- that is, the value:
this.charAt(k)-anotherString.charAt(k)
If there is no index position at which they differ, then the shorter string lexicographically precedes the longer string. In this case, compareTo returns the difference of the lengths of the strings -- that is, the value:
this.length()-anotherString.length()
For the last case, the difference of the string lengths, the documentation makes it clear that the result can be something other than -1, 0, or 1.
Falmarri fully answered this question; as opposed to only indicating the conditions in which the return value would be positive, negative or zero.
"This is the definition of lexicographic ordering. If two strings are different, then either they have different characters at some index that is a valid index for both strings, or their lengths are different, or both. If they have different characters at one or more index positions, let k be the smallest such index; then the string whose character at position k has the smaller value, as determined by using the < operator, lexicographically precedes the other string. In this case, compareTo returns the difference of the two character values at position k in the two string -- that is, the value:
this.charAt(k)-anotherString.charAt(k)
If there is no index position at which they differ, then the shorter string lexicographically precedes the longer string. In this case, compareTo returns the difference of the lengths of the strings -- that is, the value:
this.length()-anotherString.length()"

java comparing integers as strings with a comparator - odd results

I'm attempting to compare some players with a comparator by the number of runs they have obtained.
System.out.println("Comparing: " + p2.getRuns() + " and " + p1.getRuns());
int newRESULT = intConvert(p2.getRuns()).compareTo(intConvert(p1.getRuns()));
System.out.println("Returns: " + newRESULT);
return newRESULT;
However this returns:
Comparing: 25 and 0,
Returns: 2
Comparing: 0 and 100,
Returns: -1
Comparing: 25 and 100,
Returns: 1
...and hence orders the players in the wrong order.
Should the first comparison not return 1, the second -1 and the last -1 as well?
intConvert:
private static String intConvert(int x)
{
return "" + x;
}
I assume intConvert(...) converts an int to a String, and thus you get lexical comparisons, which means "25" is greater than "100" because the first character is greater ('2' > '1').
If you want correct comparisons, stick to comparing ints, or, if you need to use a String, create strings of equal length by filling in the missing zeros at the front (e.g. 25 -> "025").
To compare numbers that are represented as Strings in a sensible way, you need to format them all consistently.
Strings use lexical comparisons. This means "5" will be > "20" because '5' is > '2'. To get the logically expected output you need to format the numbers with some kind of formatter to make them lexically comparable.
"05" is < "20"
The simplest thing would be to use String.format() and a pattern to format all the numbers to Strings so they will compare consistently lexically.
String.format("%04d", yourInteger);
This left-pads every int that is passed in with zeros to a width of 4 characters.
String.format("%04d", 5);
will produce 0005
Make the "%0Xd" where X is the number of digits you want it formatted to.
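A sketch of the effect of padding on the comparison. The class name PaddedCompare and the helper pad are hypothetical, and the approach assumes non-negative numbers with at most four digits; wider or negative values would need a different scheme.

```java
public class PaddedCompare {
    // Left-pads a non-negative int with zeros to a width of 4.
    static String pad(int n) {
        return String.format("%04d", n);
    }

    public static void main(String[] args) {
        // Unpadded: '2' > '1', so "25" sorts after "100".
        System.out.println("25".compareTo("100") > 0);       // true
        // Padded: "0025" vs "0100" differ at index 1, '0' < '1'.
        System.out.println(pad(25));                         // 0025
        System.out.println(pad(25).compareTo(pad(100)) < 0); // true
    }
}
```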
You don't have to convert the numbers to strings just to sort; you can do something like:
class ComparePlayersByRuns implements Comparator<Player> {
    @Override
    public int compare(Player p1, Player p2) {
        // getRuns() returns an int, so compare the values numerically
        return Integer.compare(p1.getRuns(), p2.getRuns());
    }
}
or in Java 8 and later all you need to create your comparator is:
Comparator.comparingInt(Player::getRuns);
And no, the compare isn't required to return 1, 0, or -1, the documentation says:
Returns:
a negative integer, zero, or a positive integer as the first argument is less than, equal to, or greater than the second.
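Putting it together, here is a sketch that sorts players by runs without any string conversion. The Player class below is a hypothetical stand-in (the question does not show the real one); it assumes getRuns() returns an int and that the player with the most runs should come first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class PlayerSort {
    // Hypothetical minimal Player; the real class is not shown in the question.
    static final class Player {
        private final String name;
        private final int runs;

        Player(String name, int runs) {
            this.name = name;
            this.runs = runs;
        }

        String getName() { return name; }
        int getRuns() { return runs; }
    }

    // Highest number of runs first; the ints are compared numerically.
    static List<Player> sortByRunsDescending(List<Player> players) {
        List<Player> copy = new ArrayList<>(players);
        copy.sort(Comparator.comparingInt(Player::getRuns).reversed());
        return copy;
    }

    public static void main(String[] args) {
        List<Player> players = Arrays.asList(
                new Player("a", 25), new Player("b", 0), new Player("c", 100));
        for (Player p : sortByRunsDescending(players)) {
            System.out.println(p.getName() + " " + p.getRuns());
        }
        // prints: c 100, a 25, b 0 (one per line)
    }
}
```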
