String intern's behaviour? - java

String string1="Hello Snehal";
String string2=new String("Hello Snehal");
String string3=string2.intern();
System.out.println("string1==string2 " + string1==string2); // false. OK.
System.out.println("string2==string3 " + string2==string3); // false. OK.
System.out.println("string1==string3 " + string1==string3); // false. why not TRUE?
I searched other questions for clarification, e.g. When should we use intern method of String on String constants, but I still can't work out the third case.

All of them are false because the + is applied before the ==, so what is actually being checked is whether:
"string1==string2 " + string1 refers to the same object as string2 (for the first statement)
"string2==string3 " + string2 refers to the same object as string3 (for the second one)
"string1==string3 " + string1 refers to the same object as string3 (for the last one).
You need to wrap the stringX == stringY pieces in parentheses, because otherwise the String concatenation takes place first: in Java, + has higher precedence than ==, and wrapping the comparison in parentheses () makes it be evaluated first.
So, having this:
System.out.println("string1==string2 " + (string1==string2));
System.out.println("string2==string3 " + (string2==string3));
System.out.println("string1==string3 " + (string1==string3));
should behave differently: the first two comparisons print false and the third prints true, because string3 is the pooled instance that the literal string1 already refers to.

String.intern() returns the canonical instance from the string pool, so all interned strings with the same contents share the same object. Refer to the Javadoc for more detail.
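A minimal sketch of both points above, assuming the same three strings as in the question. The class name InternDemo and the helper compare() are made up for illustration. It shows what the unparenthesized expression actually compares, and what the parenthesized comparisons yield.

```java
// Hypothetical demo class; the names are illustrative only.
class InternDemo {
    // Returns the three reference comparisons from the question, in order.
    static boolean[] compare() {
        String string1 = "Hello Snehal";             // literal: refers to the pooled instance
        String string2 = new String("Hello Snehal"); // a distinct object with the same contents
        String string3 = string2.intern();           // the pooled instance for those contents
        return new boolean[] {
            string1 == string2,
            string2 == string3,
            string1 == string3
        };
    }

    public static void main(String[] args) {
        String string1 = "Hello Snehal";
        String string2 = new String("Hello Snehal");

        // Without parentheses, + binds tighter than ==, so this compares
        // ("string1==string2 " + string1) with string2 and prints only a boolean.
        System.out.println("string1==string2 " + string1 == string2);

        boolean[] r = compare();
        System.out.println("string1==string2 " + r[0]);
        System.out.println("string2==string3 " + r[1]);
        System.out.println("string1==string3 " + r[2]);
    }
}
```

With the parentheses in place, only the last comparison comes out true.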

Related

String.equals comparison fails to match

I am using .equals for String comparison below, but x does not match "OU":
String memberOfValue="CN=cn,​OU=ou,​OU=roles,​OU=de,​OU=apps,​DC=meta,​DC=cc,​DC=com";
String[] pairs = memberOfValue.split(",");
for (int i=0;i<pairs.length;i++) {
    String pair = pairs[i];
    String[] keyValue = pair.split("=");
    System.out.println(keyValue[0]);
    String x = keyValue[0];
    String y = "OU";
    System.out.println(x);
    System.out.println(x.equals(y));
}
Where am I going wrong?
Adding these two lines of code shows the problem:
System.out.println("x: " + x + " - " + x.chars().boxed().collect(Collectors.toList()));
System.out.println("y: " + y + " - " + y.chars().boxed().collect(Collectors.toList()));
It gives
x: ​OU - [8203, 79, 85]
y: OU - [79, 85]
Which shows that you have some invisible char whose integer value is 8203 (zero width space, see What's HTML character code 8203?) in your string. Not sure how you got that.
As @JB Nizet says, you have non-printable characters in your memberOfValue variable. There are several kinds of such characters, for example:
control, format, private use, surrogate, unassigned, etc.
Here is the complete list: http://www.fileformat.info/info/unicode/category/index.htm
In these cases, you can remove all non-printable characters from your string using this regular expression: \\P{Print}
Note that in Java \p{Print} covers only printable US-ASCII characters by default, so this also strips any non-ASCII characters.
For example:
String x = keyValue[0].replaceAll("\\P{Print}", "");
When you compare the strings again, the result will be correct.
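As a hedged alternative sketch, the invisible character can be made visible by listing code points, and then removed by targeting only the Unicode "format" category (Cf), which is the category U+200B belongs to. The class and method names below are hypothetical, and the zero-width space is written as the escape \u200B so that it is visible in the source.

```java
import java.util.stream.Collectors;

// Hypothetical demo class; the names are illustrative only.
class InvisibleCharDemo {
    // Removes characters in the Unicode "format" category (Cf), e.g. U+200B.
    static String stripFormatChars(String s) {
        return s.replaceAll("\\p{Cf}", "");
    }

    // Lists the code points of a string so that invisible characters show up.
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        String x = "\u200BOU"; // zero-width space followed by "OU"
        String y = "OU";
        System.out.println(codePoints(x));                 // U+200B U+004F U+0055
        System.out.println(x.equals(y));                   // false
        System.out.println(stripFormatChars(x).equals(y)); // true
    }
}
```

Unlike \\P{Print}, this leaves ordinary non-ASCII letters untouched, which may or may not be what you want.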
There are two possible problems from what I'm seeing.
A.) If the strings are capitalized differently they will not return equal unless you use the method .equalsIgnoreCase() instead of .equals()
B.) You're not getting the right strings that you're expecting. Be sure to print out or debug which string is getting parsed through.

Why does the expression x+x not print the same result in the two places it appears? [duplicate]

Why does the expression x+x not print the same result in the two places it appears?
String s = args[0];
System.out.println("Hello "+s);
int x = 40;
System.out.println(x);
System.out.println(x+x);
System.out.println(s+" "+x+x);
Running it from the command line with java EG1 kaan gives:
Hello kaan
40
80
kaan 4040
Why does the last print statement display kaan 4040 and not kaan 80?
Because of automatic conversion to String.
On this line you "start printing" an integer, so adding another integer to it will again produce integer that is then converted to String and printed out:
System.out.println(x + x); // integer + integer
However on this line you "start printing" a String, so all other values you add to it are at first converted to String and then concatenated together:
System.out.println(s + " " + x + x); // String + String + integer + integer
If you want the two integers to be added together before the concatenation is done, you need to put brackets around it to give it a higher priority:
System.out.println(s + " " + (x + x)); // String + String + integer
In your last print statement, you are doing a string concatenation instead of an arithmetic addition.
Change System.out.println(s+" "+x+x) to System.out.println(s+" "+(x+x)).
Change System.out.println(s+" "+x+x); to System.out.println(s+" "+(x+x)); so that the values are added first and the sum is then concatenated to the string.
Because Java does some work with your code. When you do System.out.println(x+x);, it sees x+x as an expression with two ints and evaluates it arithmetically (which gives 80). When you do ""+x+x, it sees a String as the left operand, and thus evaluates the expression as String concatenation, left to right.
(By "it" I mean javac, and by "sees" I mean "reads".)
Or change the print statement to System.out.println(x + x + " " + s); which adds the two ints before any String is involved.
You are performing concatenation instead of addition.
Whenever you append anything to a String, the result is a String. You have appended x and then x again to s + " ", which puts 40 twice after the name. You can use System.out.println(s+" "+(x+x)) instead.
On the last print statement:
System.out.println(s+" "+x+x);
s is a String and is concatenated with " ". Evaluation proceeds from left to right: s + " " is concatenated with x, and then (s + " " + x) is concatenated with x, yielding kaan 4040.
If the + operator is used with:
2 Strings, concatenation occurs
1 String and 1 int, concatenation occurs
2 ints, arithmetic addition
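The three rules above can be sketched as small helper methods. The class and method names are hypothetical; s and x stand for the variables from the question.

```java
// Hypothetical demo class; the names are illustrative only.
class PlusOperatorDemo {
    // Evaluated as ((s + " ") + x) + x: every + has a String operand, so all are concatenations.
    static String leftToRight(String s, int x) {
        return s + " " + x + x;
    }

    // The parenthesized x + x has two int operands, so it is arithmetic addition.
    static String sumFirst(String s, int x) {
        return s + " " + (x + x);
    }

    // The leftmost + has two int operands, so the ints are added before any String appears.
    static String intsFirst(String s, int x) {
        return x + x + " " + s;
    }

    public static void main(String[] args) {
        System.out.println(leftToRight("kaan", 40)); // kaan 4040
        System.out.println(sumFirst("kaan", 40));    // kaan 80
        System.out.println(intsFirst("kaan", 40));   // 80 kaan
    }
}
```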
Consider the following scenario:
System.out.println(x + x + " " + "hello");
In this example 80 hello is printed: arithmetic addition occurs between x and x, then the resulting value (80) is concatenated with the space and hello.
Read from left to right.
int x = 40;
System.out.println(x);
System.out.println(x + x);
System.out.println("" + x + x);
40
80
4040
40 is the int 40
80 is int 40 + int 40 (arithmetic)
4040 is String 40 concatenated with String 40 (because the leading "" makes the expression a String concatenation)
String s = args[0];
System.out.println("Hello "+s);
int x = 40;
System.out.println(x); //1st statement
System.out.println(x+x); //2nd statement
System.out.println(s+" "+x+x); //3rd statement
The first statement simply converts x to a String and prints it.
The second statement adds the numbers: since no String is involved, the compiler treats the plus sign as addition of two numbers.
In the third statement there is a String, so the compiler treats it as:
print the value of s, append a space (" "), append the value of x (converted to a String), append the value of x (converted to a String) again.
Hence, kaan 4040.
If you want to print 80, you can do it in two ways:
int sum = x+x;
System.out.println(s+" "+sum); //more readable code
or
System.out.println(s+" "+ (x+x) ); //may confuse you
The compiler treats x+x as integer addition because there is no String inside the parentheses. I prefer the first one though. :)
why is the last result of the print displaying kaan4040 and not kaan80?
This is how the + operator behaves with a String; it has nothing to do with println itself, and its meaning depends on the operands.
It means String concatenation when one of its operands is a String.
In the example below, the 5, even though it is an integer, is implicitly converted to a String.
E.g:
System.out.println("Hello" + 5); //Output: Hello5
It becomes arithmetic addition when used inside a pair of parentheses whose operands are both numbers, because the parenthesized expression is evaluated first (add first) and the result is then converted to a String.
In the code below, the 1st + is concatenation and the 2nd + is addition.
E.g:
System.out.println("Hello" + (5+7)); //Output: Hello12
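If either operand of the + operator is a String, the expression is a String concatenation. For non-constant operands, older javac versions (before Java 9) typically compile it to StringBuilder calls; newer ones use a different mechanism, but the result is the same. For example:
String s = "a" + 3 + "c";
behaves like
String s = new StringBuilder("a").append(3).append("c").toString(); // conceptually; an all-literal expression like this one is folded to "a3c" at compile time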
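A small sketch of that equivalence, using method parameters so that the operands are not compile-time constants. The class and method names are hypothetical. It only shows that both forms produce the same value and makes no claim about the bytecode a particular javac version emits.

```java
// Hypothetical demo class; the names are illustrative only.
class ConcatEquivalenceDemo {
    // Concatenation with the + operator; the int is converted to its decimal String form.
    static String withPlus(String prefix, int n, String suffix) {
        return prefix + n + suffix;
    }

    // The same result built explicitly with a StringBuilder.
    static String withBuilder(String prefix, int n, String suffix) {
        return new StringBuilder().append(prefix).append(n).append(suffix).toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("a", 3, "c"));    // a3c
        System.out.println(withBuilder("a", 3, "c")); // a3c
    }
}
```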

Does concatenating strings in Java always lead to new strings being created in memory?

I have a long string that doesn't fit the width of the screen. For eg.
String longString = "This string is very long. It does not fit the width of the screen. So you have to scroll horizontally to read the whole string. This is very inconvenient indeed.";
To make it easier to read, I thought of writing it this way -
String longString = "This string is very long." +
"It does not fit the width of the screen." +
"So you have to scroll horizontally" +
"to read the whole string." +
"This is very inconvenient indeed.";
However, I realized that the second way uses string concatenation and will create 5 new strings in memory and this might lead to a performance hit. Is this the case? Or would the compiler be smart enough to figure out that all I need is really a single string? How could I avoid doing this?
I realized that the second way uses string concatenation and will create 5 new strings in memory and this might lead to a performance hit.
No it won't. Since these are string literals, they will be evaluated at compile time and only one string will be created. This is defined in the Java Language Specification #3.10.5:
A long string literal can always be broken up into shorter pieces and written as a (possibly parenthesized) expression using the string concatenation operator +
[...]
Moreover, a string literal always refers to the same instance of class String.
Strings computed by constant expressions (§15.28) are computed at compile time and then treated as if they were literals.
Strings computed by concatenation at run-time are newly created and therefore distinct.
Test:
public static void main(String[] args) throws Exception {
    String longString = "This string is very long.";
    String other = "This string" + " is " + "very long.";
    System.out.println(longString == other); //prints true
}
However, the situation below is different, because it uses a variable: now the concatenation happens at run time and a new string is created:
public static void main(String[] args) throws Exception {
    String longString = "This string is very long.";
    String is = " is ";
    String other = "This string" + is + "very long.";
    System.out.println(longString == other); //prints false
}
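To round off the two tests above, here is a hedged sketch of two related cases. Declaring the variable final makes it a constant variable, so the concatenation is a constant expression again. Calling intern() on a run-time result returns the pooled instance. The class and method names are hypothetical.

```java
// Hypothetical demo class; the names are illustrative only.
class ConstantFoldingDemo {
    // A final variable initialized with a literal is a constant variable,
    // so the whole expression is computed at compile time.
    static boolean finalVariableIsFolded() {
        String longString = "This string is very long.";
        final String is = " is ";
        String other = "This string" + is + "very long.";
        return longString == other;
    }

    // With a non-final variable the concatenation happens at run time
    // and yields a new, distinct String object.
    static boolean runtimeConcatIsDistinct() {
        String longString = "This string is very long.";
        String is = " is ";
        String other = "This string" + is + "very long.";
        return longString != other;
    }

    // intern() maps the run-time result back to the pooled instance used by the literal.
    static boolean internRestoresIdentity() {
        String longString = "This string is very long.";
        String is = " is ";
        String other = "This string" + is + "very long.";
        return longString == other.intern();
    }

    public static void main(String[] args) {
        System.out.println(finalVariableIsFolded());   // true
        System.out.println(runtimeConcatIsDistinct()); // true
        System.out.println(internRestoresIdentity());  // true
    }
}
```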
Does concatenating strings in Java always lead to new strings being created in memory?
No, it does not always do that.
If the concatenation is a compile-time constant expression, then it is performed by the compiler, and the resulting String is added to the compiled class's constant pool. At runtime, the value of the expression is the interned String that corresponds to the constant pool entry.
This will happen in the example in your question.
The snippet below, based on your example, checks this:
String longString = "This string is very long. It does not fit the width of the screen. So you have to scroll horizontally to read the whole string. This is very inconvenient indeed.";
String longStringOther = "This string is very long. " +
"It does not fit the width of the screen. " +
"So you have to scroll horizontally " +
"to read the whole string. " +
"This is very inconvenient indeed.";
System.out.println(" longString.equals(longStringOther) :"+ longString.equals(longStringOther));
System.out.println(" longString == longStringother : " + (longString == longStringOther ));
Output:
longString.equals(longStringOther) :true
longString == longStringother : true
1st case: both Strings are equal (they have the same content).
2nd case: both variables refer to the same object, so only one String exists after the compile-time concatenation.

Weird Java String comparison

I'm having a minor issue with Java String comparisons.
I've written a class which takes in a String and parses it into a custom tree type. I've written a toString method which then converts this tree back to a String again. As part of my unit tests I'm just checking that the String generated by the toString method is the same as the String that was parsed in the first place.
Here is my simple test with a few printouts so that we can see what's going on.
final String exp1 = "(a|b)";
final String exp2 = "((a|b)|c)";
final Node tree1 = Reader.parseExpression2(exp1);
final Node tree2 = Reader.parseExpression2(exp2);
final String t1 = tree1.toString();
final String t2 = tree2.toString();
System.out.println(":" + exp1 + ":" + t1 + ":");
System.out.println(":" + exp2 + ":" + t2 + ":");
System.out.println(exp1.compareToIgnoreCase(t1));
System.out.println(exp2.compareToIgnoreCase(t2));
System.out.println(exp1.equals(t1));
System.out.println(exp2.equals(t2));
It has the following output (NB: the ":" characters are used as delimiters so I can be sure there's no extra whitespace):
:(a|b):(a|b):
:((a|b)|c):((a|b)|c):
-1
-1
false
false
Based on manually comparing the strings exp1 and exp2 to t1 and t2 respectively, they are exactly the same. But for some reason Java is insisting they are different.
This isn't the obvious mistake of using == instead of .equals() but I'm stumped as to why two seemingly identical strings are different. Any help would be much appreciated :)
Does one of your strings have a null character within it? These might not be visible when you use System.out.println(...).
For example, consider this class:
public class StringComparison {
    public static void main(String[] args) {
        String s = "a|b";
        String t = "a|b\0";
        System.out.println(":" + s + ":" + t + ":");
        System.out.println(s.equals(t));
    }
}
When I ran this on Linux it gave me the following output:
:a|b:a|b:
false
(I also ran it on Windows, but the null character showed up as a space.)
Well, it certainly looks okay. What I would do would be to iterate over both strings using charAt to compare every single character with the equivalent in the other string. This will, at a minimum, hopefully tell you the offending character.
Also output everything else you can find out about both strings, such as the length.
It could be that one of the characters, while looking the same, may be some other Unicode doppelganger :-)
You may also want to capture that output and do a detailed binary dump on it, such as loading it up into gvim and using the hex conversion tool, or executing od -xcb (if available) on the captured output. There may be an obvious difference when you get down to the binary examination level.
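The charAt comparison suggested above could be sketched as follows. The class and method names are hypothetical, and the two sample strings are the ones from the null-character example rather than the asker's actual data.

```java
// Hypothetical demo class; the names are illustrative only.
class StringDiffDemo {
    // Returns the index of the first position where a and b differ,
    // or -1 if the two strings are equal.
    static int firstDifference(String a, String b) {
        int common = Math.min(a.length(), b.length());
        for (int i = 0; i < common; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        // Same prefix: they differ only if one string is longer than the other.
        return a.length() == b.length() ? -1 : common;
    }

    // Describes the character at index i as an int, or "<end>" past the end of s.
    static String describe(String s, int i) {
        return i < s.length() ? String.valueOf((int) s.charAt(i)) : "<end>";
    }

    public static void main(String[] args) {
        String s = "a|b";
        String t = "a|b\0"; // trailing null character, invisible on many terminals
        System.out.println("lengths: " + s.length() + " vs " + t.length());
        int i = firstDifference(s, t);
        if (i >= 0) {
            System.out.println("first difference at index " + i
                    + ": " + describe(s, i) + " vs " + describe(t, i));
        }
    }
}
```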
I have some suggestions
Copy each output and paste it into Notepad (or any similar editor), then copy them again from there and do something like this:
System.out.println("(a|b)".compareToIgnoreCase("(a|b)"));
Print out the integer representation of each character. If one of them is an unusual Unicode character, its int value will differ.
Also what version of JDK are you using?

Beginners Java Question (string output)

So I'm reading input from a file, which has say these lines:
NEO
You're the Oracle?
NEO
Yeah.
So I want to output his actual lines only, not where it says NEO. So I tried this:
if(line.trim()=="NEO")
    output=false;
if (output)
    TextIO.putln(name + ":" + "\"" + line.trim() + "\""); // Only print the line if 'output' is true
But that's not working out. It still prints NEO. How can I do this?
When comparing strings in Java you have to use the equals() method.
if ("NEO".equals(line.trim()))
I think you're looking for line.trim().equals("NEO") instead of line.trim() == "NEO"
That said, you can get rid of the output variable by instead doing
if(!line.trim().equals("NEO"))
{
    TextIO.putln(name + ":" + "\"" + line.trim() + "\""); // Only print the line if it isn't "NEO"
}
Strings are objects in Java. This means you can't rely on the == operator to compare them, since two objects can be different even if they both represent the same string. That's why the String class implements an equals() method, which compares the contents of the objects instead of just their references.
Reference
String.equals() docs
In Java, Strings are objects, and the == operator checks whether two references point to the same object.
In other terms
final String ans = line.trim();
final String neo = "NEO";
if (ans == neo) ...
implies you want to check that ans and neo are the same object. They are not, since Java allocated (instantiated) two objects.
As others said, you have to test for equality using a method of the String class that actually checks that the values are the same.
if (ans.equals(neo)) ...
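A minimal sketch of the difference, assuming a line with surrounding whitespace so that trim() has to build a new String. The class and method names are hypothetical.

```java
// Hypothetical demo class; the names are illustrative only.
class EqualsVsIdentityDemo {
    // Reference comparison: true only if trim() happens to return the very same object as the literal.
    static boolean sameReference(String line) {
        return line.trim() == "NEO";
    }

    // Content comparison: true whenever the trimmed text is exactly "NEO".
    static boolean sameContents(String line) {
        return "NEO".equals(line.trim());
    }

    public static void main(String[] args) {
        String line = "  NEO  "; // stands in for a line read from the file
        System.out.println(sameReference(line)); // false
        System.out.println(sameContents(line));  // true
    }
}
```

Putting the literal first, as in "NEO".equals(...), also avoids a NullPointerException if the other side is null.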
try the following:
if(line.trim().equals("NEO"))
