Difference between concatenation at run time and compile time in java

public class First
{
    public static void main(String[] args)
    {
        String str1 = "Hello ", str2 = "World", str3 = "Hello World";
        System.out.println(str3 == ("Hello " + "World")); // Prints true
        System.out.println(str3 == ("Hello " + str2)); // Prints false
    }
}
The reason for the above is given in the JLS:
• Strings computed by constant expressions (§15.28) are computed at
compile time and then treated as if they were literals.
• Strings computed by concatenation at run time are newly created and
therefore distinct.
What I wanted to ask is:
Why do strings computed at run time differ from those computed at compile time?
Is it because of memory allocation, with one allocated on the heap and the other in the String pool, or is there some other reason? Please clarify.

The compiler can't know what str2 contains, because it would have to execute the code to find out the contents of str2 when you concatenate it with "Hello ". (It could in principle optimize and inline the value, since str2 never changes, but it doesn't.)
Imagine a more complex scenario where str2 is something that a user typed in. Even if the user had typed "World", there is no way the compiler could have known that.
Therefore, it can't evaluate str3 == ("Hello " + str2) using the same "Hello World" from the constant pool that is assigned to str3 and used in the first comparison.
So the compiler generates code that performs the concatenation at run time (using StringBuilder, or since Java 9 an invokedynamic call to StringConcatFactory), which ends up creating another String with the value Hello World. The identity comparison fails because one object is the one from the constant pool and the other is the one that was just created.
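The behaviour described above can be reproduced with a small sketch. The class and method names below (ConcatIdentityDemo and so on) are made up for illustration and are not from the question:

```java
public class ConcatIdentityDemo {
    // "Hello " + "World" is a constant expression, so the compiler folds it
    // into the single literal "Hello World", which is interned.
    static boolean literalConcatIsSame() {
        String str3 = "Hello World";
        return str3 == ("Hello " + "World");
    }

    // str2 is an ordinary (non-final) variable, so the concatenation is
    // performed at run time and yields a newly created String.
    static boolean variableConcatIsSame() {
        String str2 = "World";
        String str3 = "Hello World";
        return str3 == ("Hello " + str2);
    }

    // The contents are equal even though the objects are distinct.
    static boolean variableConcatEquals() {
        String str2 = "World";
        String str3 = "Hello World";
        return str3.equals("Hello " + str2);
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsSame());   // true
        System.out.println(variableConcatIsSame());  // false
        System.out.println(variableConcatEquals());  // true
    }
}
```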

You should use equals when comparing Objects and not the == operator.

Strings are immutable in Java. So, when you concatenate two strings at run time, a third one is created to represent the concatenated value. Using == then returns false, because the two operands point to different String instances.
In the compile-time scenario, the compiler has already produced the concatenated string, so at run time both operands of == are the same instance. Hence, == returns true, as both operands point to the same instance (reference).

The compiler recognizes that constants won't change, and if you join them with the + operator it concatenates them in the compiled code.
That's why the first case executes as str3 == ("Hello World"). Since the "Hello World" literal is already present in the string pool, both operands point to the same location in the pool, and it prints true.
In the case of str3 == ("Hello " + str2), the compiler won't check that str2 contains World. It treats str2 as a variable that can hold any value, so at run time a new string is created that is a different "Hello World" object from the one str3 refers to in the string pool, and it prints false.

Related

In Java, when we print a string literal to the terminal, is this string literal also stored in the string pool?

I am aware that when we initialize a string literal to a variable this literal will be stored in the string pool by the JVM. Consider the piece of code below.
System.out.println("This is a string literal");
Is the string literal within the quotes also stored in the string pool even if I don't assign it to a variable?

I will preface this answer by saying that there is little practical use in gaining a deep understanding of the Java string pool. From a practical perspective, you just need to remember two things:
Don't use == to compare strings. Use equals, compareTo, or equivalent methods.
Don't use explicit String.intern calls in your code. If you want to avoid potential problems with duplicate strings, enable the string de-duplication feature that is available in modern Java GCs.
I am aware that when we initialize a string literal either using the 'new' keyword or not, this literal will be stored in the string pool by the JVM.
This is garbled.
Firstly, you don't "initialize" a string literal. You initialize a variable.
String hi = "hello"; // This initializes the variable `hi`.
Secondly you typically don't / shouldn't use a string literal with new.
String hi = new String("hello"); // This is bad. You should write this as above.
The normal use-case for creating a string using new is something like this:
String hi = new String(arrayOfCharacters, offset, count);
In fact, creation and interning of the String object that corresponds to a string literal happens either the first time that the literal is used in an expression or at some earlier time. The precise details (i.e. when it happens) are unspecified and (I understand) version dependent.
The first usage might be in a variable initialization, or it might be in something else; e.g. a method call.
So to your question:
Consider the piece of code below:
System.out.println("This is a string literal");
Is the string literal within the quotes also stored in the string pool even if I do not initialize it?
Yes, it does. If that was the first time the literal was used, the code above may be the trigger for this to happen. But it could have happened previously; e.g. if the above code was run earlier.
As a followup, you asked:
Why does the String Pool collect string literals which are not stored in a variable and just displayed in the console?
Because the JLS 3.10.5 requires that the String objects which correspond to string literals are interned:
"Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern (§12.5)."
And you asked:
The presence of the string pool helps optimize the program. By storing literals like this one (which is not really required, because it is only displayed in the console), doesn't it go against its whole purpose (which is optimization)?
The original idea for interning and the string pool was to save memory. That made sense 25 years ago when the Java language was designed and originally specified. These days even a low-end Android phone has 1GB of RAM, and interning of string literals to save a few thousand bytes is kind of pointless. Except that the JLS says that this must happen.
But the answer is No, it doesn't go against the (original) purpose. This statement:
System.out.println("This is a string literal");
could be executed many times. You don't want / need to create a new String object for the literal each time that you execute it. The thing is that the JVM doesn't know what is going to happen.
Anyway, the interning must happen because that is what the spec says.
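As a rough way to observe this, the sketch below checks that a literal used only as a method result or println argument is still the interned instance. The class and method names are hypothetical, chosen only for this illustration:

```java
public class LiteralInternDemo {
    // The literal is never assigned to a variable; it is only returned.
    static String message() {
        return "This is a string literal";
    }

    // Every evaluation of the same literal yields the same String instance.
    static boolean sameInstanceEachCall() {
        return message() == message();
    }

    // A copy built at run time is a different object, but interning it
    // returns the instance that the literal already refers to.
    static boolean runtimeCopyInternsToLiteral() {
        String copy = new String("This is a string literal".toCharArray());
        return copy != message() && copy.intern() == message();
    }

    public static void main(String[] args) {
        System.out.println(message());
        System.out.println(sameInstanceEachCall());        // true
        System.out.println(runtimeCopyInternsToLiteral()); // true
    }
}
```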

String concatenation: + operator with String literal

Why does s3==s4 return false while s2==s3 returns true, on lines 8 and 7 respectively?
1. String s="hello";
2. String s1="he"+"llo";
3. String s2="hello"+123;
4. String s3="hello123";
5. String s4=s+"123";
6. System.out.println(s==s1);//prints true
7. System.out.println(s2==s3);//prints true
8. System.out.println(s3==s4);//prints false

s + "123" is not compile-time evaluable, so it is not a candidate for string interning. (Note that if s were final then it would be.)
Therefore its reference will not be the same as s3, so the output is false.
The others all compare true due to string interning and the compile-time evaluability of the expressions.
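The remark about final can be checked with a short sketch. The class and method names are invented for this example; the only difference between the two methods is the final modifier on s:

```java
public class FinalConcatDemo {
    // s is a plain local variable: s + "123" is evaluated at run time,
    // producing a new String that is distinct from the interned literal.
    static boolean nonFinalOperand() {
        String s = "hello";
        String s3 = "hello123";
        String s4 = s + "123";
        return s3 == s4;
    }

    // s is final and initialized with a constant, so it is a constant
    // variable and s + "123" is a compile-time constant expression.
    static boolean finalOperand() {
        final String s = "hello";
        String s3 = "hello123";
        String s4 = s + "123";
        return s3 == s4;
    }

    public static void main(String[] args) {
        System.out.println(nonFinalOperand()); // false
        System.out.println(finalOperand());    // true
    }
}
```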

When you use the == operator to check the equality of Strings, it checks whether the two Strings are at the same location in memory.
In cases 2 and 4, the Strings "hello" and "hello123" will already be in the String constant pool (due to lines 1 and 3), so the expressions are recognized as equivalent to those Strings and use the same place in memory. In simple terms, one String object is created for each value and shared by every occurrence of "hello" and of "hello123".
When you do:
String s4=s+"123";
a new memory location is created for s4 at run time, since the JLS says that:
Strings computed by concatenation at run-time are newly created and therefore distinct.
So the memory locations are different, and hence the output is false.

Using == when comparing objects

Recently in a job interview I was asked this following question (for Java):
Given:
String s1 = "abc";
String s2 = "abc";
What is the return value of
(s1 == s2)
I answered that it would return false because they are two different objects and == is a memory address comparison rather than a value comparison, and that one would need to use .equals() to compare String objects. I was however told that although the .equals() methodology was right, the statement nonetheless returns true. Could someone explain why it is true, and why we are still taught in school to use equals()?

String constants are interned by your JVM (this is required by the spec):
All literal strings and string-valued constant expressions are interned. String literals are defined in §3.10.5 of the Java Language Specification
This means that the compiler has already created an object representing the string "abc", and sets both s1 and s2 to point to the same interned object.

Java will intern both strings. Since they both have the same value, only one actual string instance will exist in memory. That's why == will return true: both references point to the same instance.
String interning is an optimization technique to minimize the number of string instances that have to be held in memory. String literals or strings that are the values of constant expressions, are interned so as to share unique instances. Think flyweight pattern.

Since you are not actually creating new instances of String objects for either one of these, they are sharing the same memory space. If it were
String s1 = new String("abc");
String s2 = new String("abc");
the result would be false.

The reason is strings are interned in Java. String interning is a method of storing only one copy of each distinct string value which is immutable. Interning string makes some string processing tasks more efficient. The distinct values are stored in a string intern pool.
(From wiki)

You're right that == compares memory addresses. However, when the Java compiler notices that you're using the same string literal multiple times in the same program, it won't create the same string multiple times in memory. Instead, both s1 and s2 in your example will point to the same memory. This is called string interning.
So that's why == will return true in this case. However, if you read s2 from a file or user input, the string will not be interned automatically, so it no longer points to the same memory. Therefore == would now return false, while equals returns true. And that's why you shouldn't use ==.
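A minimal sketch of the input case, using an in-memory StringReader as a stand-in for a real file or console. The class name and the readLine helper are assumptions made for this illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class InputCompareDemo {
    // Stand-in for file or user input: reads one line from an in-memory source.
    static String readLine(String input) {
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String s1 = "abc";
        String s2 = readLine("abc\n");

        System.out.println(s1 == s2);          // false: s2 was created at run time
        System.out.println(s1.equals(s2));     // true: same characters
        System.out.println(s1 == s2.intern()); // true: intern() returns the pooled literal
    }
}
```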

Quick and dirty answer: Java optimizes strings so if it encounters the same string twice it will reuse the same object (which is safe because String is immutable).
However, there is no guarantee.
What usually happens is that it works for a long time, and then you get a nasty bug that takes forever to figure out because someone changed the class loading context and your == no longer works.

You should continue to use equals() when testing string equality. Java makes no guarantees about identity testing for strings unless they are interned.
The reason s1 == s2 is true in your example is that the compiler is simply optimizing two literal references in a scope it can predict.

How "==" works for objects?

public static void main(String[] a)
{
    String s = new String("Hai");
    String s1 = s;
    String s2 = "Hai";
    String s3 = "Hai";
    System.out.println(s.hashCode());
    System.out.println(s1.hashCode());
    System.out.println(s2.hashCode());
    System.out.println(s3.hashCode());
    System.out.println(s == s2);
    System.out.println(s2 == s3);
}
Given the above code, can anyone explain what goes on behind the scenes when the JVM encounters the expression (s==s2)?

It compares references - i.e. are both variables referring to the exact same object (rather than just equal ones).
s and s2 refer to different objects, so the expression evaluates to false.
s and s1 refer to the same object (as each other) because of the assignment.
s2 and s3 refer to the same object (as each other) because of string interning.
If that doesn't help much, please ask for more details on a particular bit. Objects and references can be confusing to start with.
Note that only string literals are interned by default... so even though s and s2 refer to equal strings, they're still two separate objects. Similarly if you write:
String x = new String("foo");
String y = new String("foo");
then x == y will evaluate to false. You can force interning, which in this case would actually return the interned literal:
String x = new String("foo");
String y = new String("foo");
String z = "foo";
// Expressions and their values:
x == y: false
x == z: false
x.intern() == y.intern(): true
x.intern() == z: true
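The table of expressions above can be turned into a runnable sketch along these lines (the class name InternDemo and the compare helper are made up for the example):

```java
public class InternDemo {
    // Returns the results of the four comparisons, in the order listed.
    static boolean[] compare() {
        String x = new String("foo"); // new object, not the pooled literal
        String y = new String("foo"); // another new object
        String z = "foo";             // the interned literal
        return new boolean[] {
            x == y,
            x == z,
            x.intern() == y.intern(),
            x.intern() == z
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println("x == y: " + r[0]);                   // false
        System.out.println("x == z: " + r[1]);                   // false
        System.out.println("x.intern() == y.intern(): " + r[2]); // true
        System.out.println("x.intern() == z: " + r[3]);          // true
    }
}
```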
EDIT: A comment suggested that new String(String) is basically pointless. This isn't the case, in fact.
In older JDKs (before Java 7 update 6), a String refers to a char[], with an offset and a length. If you take a substring, it will create a new String referring to the same char[], just with a different offset and length. If you need to keep a small substring of a long string for a long time, but the long string itself isn't needed, then it's useful to use the new String(String) constructor to create a copy of just the piece you need, allowing the larger char[] to be garbage collected. (Later versions copy the characters in substring, so this concern no longer applies there.)
An example of this is reading a dictionary file - lots of short words, one per line. If you use BufferedReader.readLine(), the allocated char array will be at least 80 chars (in the standard JDK, anyway). That means that even a short word like "and" takes a char array of 160 bytes + overheads... you can run out of space pretty quickly that way. Using new String(reader.readLine()) can save the day.

== compares object references, not the content of the objects. s and s2 are different objects. If you want to compare the content, use s.equals(s2).

Think of it like this.
Identical twins look the same but they are made up differently.
If you want to know if they "look" the same, use equals().
If you want to know whether they are the very same individual, use "==".
:)

== compares the memory (reference) location of the Objects. You should use .equals() to compare the contents of the object.
You can use == for ints and doubles because they are primitive data types

I suppose you know that when you test equality between variables using '==', you are in fact testing if the references in memory are the same. This is different from the equals() method that combines an algorithm and attributes to return a result stating that two Objects are considered as being the same. In this case, if the result is true, it normally means that both references are pointing to the same Object. This leaves me wondering why s2==s3 returns true and whether String instances (which are immutable) are pooled for reuse somewhere.

It should be an obvious false. The JVM reuses string literals that already exist in memory. Hence s2 and s3 point to the same String, which has been instantiated once. If you do something like s5="Hai", even that will be equal to s3.
However, new creates a new object, irrespective of whether the String already exists or not. Hence s is not equal to s2 or s3.
Now if you do s6 = new String("Hai"), even that will not be equal to s2, s3 or s.

The literals s2 and s3 will point to the same string in memory, as they are present at compile time. s is created at run time and will point to a different instance of "Hai" in memory. If you want s to point to the same instance of "Hai" as s2 and s3, you can ask Java to do that for you by calling intern. So s.intern() == s2 will be true.

You are using some '==' overload for String class...

Is there a difference between String concat and the + operator in Java? [duplicate]

I'm curious what is the difference between the two.
The way I understand the string pool is this:
This creates 3 string objects in the string pool, for 2 of those all references are lost.
String mystr = "str";
mystr += "end";
Doesn't this also create 3 objects in the string pool?
String mystr = "str";
mystr = mystr.concat("end");
I know StringBuilder and StringBuffer are much more efficient in terms of memory usage when there's a lot of concatenation to be done. I'm just curious if there's any difference between the + operator and concat in terms of memory usage.

There's no difference in this particular case; however, they're not the same in general.
str1 += str2 is equivalent to doing the following:
str1 = new StringBuilder().append(str1).append(str2).toString();
To prove this to yourself, just make a simple method that takes two strings and +='s the first string to the second, then examine the disassembled bytecode.
By contrast, str1.concat(str2) simply makes a new string that's the concatenation of str1 and str2, which is less expensive for a small number of concatenated strings (but will lose to the first approach with a larger number).
Additionally, if str1 is null, notice that str1.concat(str2) throws an NPE, but str1 += str2 will simply treat str1 as if it were the string "null", without throwing an exception. (That is, it yields "null" concatenated with the value of str2. If str2 were, say, "foo", you would wind up with "nullfoo".)
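The null behaviour is easy to check with a small sketch. The class and helper names here are hypothetical and exist only for this illustration:

```java
public class NullConcatDemo {
    // += converts a null reference to the four characters "null".
    static String plusEquals(String str1, String str2) {
        str1 += str2;
        return str1;
    }

    // concat is an instance method, so calling it on null throws.
    static boolean concatThrows(String str1, String str2) {
        try {
            str1.concat(str2);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(plusEquals(null, "foo"));   // nullfoo
        System.out.println(concatThrows(null, "foo")); // true
    }
}
```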
Update: See this StackOverflow question, which is almost identical.

The important difference between += and concat() is not performance, it's semantics. concat() will only accept a string argument, but + (or +=) will accept anything. If the non-string operand is an object, it will be converted to a string by calling toString() on it; a primitive will be converted as if by calling the appropriate method in the associated wrapper class, e.g., Integer.toString(theInt); and a null reference becomes the string "null".
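To illustrate the conversion rules just described, here is a small sketch. The describe method and the class name are invented for the example; the commented-out line shows the kind of call that concat() would reject at compile time:

```java
import java.util.Arrays;

public class PlusConversionDemo {
    // + accepts a primitive and an arbitrary object (or null) as operands.
    static String describe(int n, Object o) {
        return "n=" + n + ", o=" + o;
    }

    public static void main(String[] args) {
        System.out.println(describe(42, null));                // n=42, o=null
        System.out.println(describe(7, Arrays.asList(1, 2)));  // n=7, o=[1, 2]

        // "n=".concat(42);  // does not compile: concat only accepts a String
        System.out.println("n=".concat(Integer.toString(42))); // n=42
    }
}
```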
Actually, I don't know why concat() even exists. People see it listed in the API docs and assume it's there for a good reason--performance being the most obvious reason. But that's a red herring; if performance is really a concern, you should be using a StringBuilder, as discussed in the thread John linked to. Otherwise, + or += is much more convenient.
EDIT: As for the issue of "creating objects in the string pool," I think you're misunderstanding what the string pool is. At run-time, the actual character sequences, "str" and "end" will be stored in a dedicated data structure, and wherever you see the literals "str" and "end" in the source code, the bytecode will really contain references to the appropriate entries in that data structure.
In fact, the string pool is populated when the classes are loaded, not when the code containing the string literals is run. That means each of your snippets only creates one object: the result of the concatenation. (There's also some object creation behind the scenes, which is a little different for each of the techniques, but the performance impact is not worth worrying about.)

The way I understand the string pool
is this:
You seem to have a misconception concerning that term. There is no such thing as a "string pool" in the sense you're using it, where it looks like you just mean all String objects on the heap. There is a runtime constant pool which contains, among many other things, compile-time String constants and String instances returned from String.intern().

Unless the argument to concat is an empty string, then
String mystr = "str";
mystr = mystr.concat("end");
will also create 3 strings.
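The empty-string exception mentioned above follows from the documented behaviour of String.concat, which returns the receiver itself when the argument has length 0. A quick sketch (class and method names are made up for illustration):

```java
public class ConcatEmptyDemo {
    // With an empty argument, concat returns the same object it was called on.
    static boolean emptyArgReturnsSame() {
        String s = "str";
        return s.concat("") == s;
    }

    // With a non-empty argument, concat builds a new String at run time,
    // which is equal to but distinct from the interned literal "strend".
    static boolean nonEmptyCreatesNew() {
        String s = "str";
        String joined = s.concat("end");
        return joined != "strend" && joined.equals("strend");
    }

    public static void main(String[] args) {
        System.out.println(emptyArgReturnsSame()); // true
        System.out.println(nonEmptyCreatesNew());  // true
    }
}
```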
More info: https://docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html.
