My question is in regard to the way Java handles String literals. It's quite clear from the Java Language Specs (JLS) that String literals are being implicitly interned - in other words, objects that are created in the String constant pool part of the heap, in contrast to the heap-based objects created when calling new String("whatever").
What doesn't seem to line up with what the JLS says is that when creating a new String using String concatenation with a casted constant String type, which should be considered as a constant String as per the JLS, apparently the JVM is creating a new String object rather than interning it implicitly. I appreciate any explanation about this particular behaviour and whether or not this is a platform-specific behaviour. I am running on a Mac OSX Snow Leopard.
public class Test
{
public static void main(String args[])
{
/*
Create a String object on the String constant pool
using a String literal
*/
String hello = "hello";
final String lo = "lo"; // this will be created in the String pool as well
/*
Compare the hello variable to a String constant expression
, that should cause the JVM to implicitly call String.intern()
*/
System.out.println(hello == ("hel" + lo));// This should print true
/*
Here we need to create a String by casting an Object back
into a String, this will be used later to create a constant
expression to be compared with the hello variable
*/
Object object = "lo";
final String stringObject = (String) object;// as per the JLS, casted String types can be used to form constant expressions
/*
Compare with the hello variable
*/
System.out.println(hello == "hel" + stringObject);// This should print true, but it doesn't :(
}
}
Casting to Object is not allowed in a compile time constant expression. The only casts permitted are to String and primitives. JLS (Java SE 7 edition) section 15.28:
> - Casts to primitive types and casts to type String
(There's actually a second reason. object isn't final, so it cannot possibly be considered a constant variable. "A variable of primitive type or type String, that is final and initialized with a compile-time constant expression (§15.28), is called a constant variable." -- section 4.12.4.)
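To make the three cases concrete, here is a minimal sketch. The class and method names are made up for illustration; the expected results follow from the JLS rules quoted above rather than from any platform-specific behaviour.

```java
class ConstantCastDemo {
    // final and initialized with a literal: lo is a constant variable,
    // so "hel" + lo is folded to "hello" at compile time
    static boolean finalLiteral() {
        String hello = "hello";
        final String lo = "lo";
        return hello == ("hel" + lo);
    }

    // A cast to String applied directly to a literal is still a
    // compile-time constant expression
    static boolean castOfLiteral() {
        String hello = "hello";
        return hello == ("hel" + (String) "lo");
    }

    // The initializer reads a non-constant variable (object), so
    // stringObject is not a constant variable even though it is final;
    // the concatenation runs at runtime and yields a new String
    static boolean castOfObjectVariable() {
        String hello = "hello";
        Object object = "lo";
        final String stringObject = (String) object;
        return hello == ("hel" + stringObject);
    }

    public static void main(String[] args) {
        System.out.println(finalLiteral());         // true
        System.out.println(castOfLiteral());        // true
        System.out.println(castOfObjectVariable()); // false
    }
}
```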
It seems that because you reference an object here, final String stringObject = (String) object;, this is no longer a 'compile-time' constant but a 'run-time' constant. The first example from here alludes to it with the part:
String s = "lo";
String str7 = "Hel"+ s;
String str8 = "He" + "llo";
System.out.println("str7 is computed at runtime.");
System.out.println("str8 is created by using string constant expression.");
System.out.println(" str7 == str8 is " + (str7 == str8));
System.out.println(" str7.equals(str8) is " + str7.equals(str8));
The string str7 is computed at runtime, because it references another string that is not a literal. By that logic I assume that, despite the fact that you make stringObject final, it still references an object, so it cannot be computed at compile time.
And from the java lang spec here, it states:
"The string concatenation operator + (§15.18.1) implicitly creates a new String object when the result is not a compile-time constant expression (§15.28). "
I cannot find any examples where a cast can be used, except, for this terrible, terrible example:
System.out.println(hello == "hel" + ( String ) "lo");
Which hardly has any logical use, but maybe the part about a string cast was included because of the above case.
Related
I am trying to concatenate two strings, one string with some value and another with empty.
Example:
String string1="Great";
String string2="";
and concatenating these two string with concat function and + operator
Example:
String cat=string1.concat(string2);
String operator=string1+string2;
As per my understanding, while using empty string in concat function as the string2 is empty no new reference will be created. But while using + operator a new reference will be created in the string pool constant. But in the below code while using the + operator new reference is not created.
public class Main {
public static void main(String[] args) {
String string1="Great",string2="";
String cat=string1.concat(string2);
if(string1==cat)
{
System.out.println("Same");
}
else
{
System.out.println("Not same");
}
String operator=string1+string2;
if(operator==string1)
System.out.println("Same");
else
System.out.println("Not same");
}
}
Output:
string 1 :69066349
cat :69066349
Same
string1 :69066349
operator :69066349
Not same
From the above code, as it's using + operator, the reference for the variable : operator should refer to the new memory, but it's pointing to the string1 reference. Please explain the above code.
It is all in the documentation.
For String.concat, the javadoc states this:
If the length of the argument string is 0, then this String object is returned.
For the + operator, JLS 15.18.1 states:
The result of string concatenation is a reference to a String object that is the concatenation of the two operand strings. The characters of the left-hand operand precede the characters of the right-hand operand in the newly created string.
The String object is newly created (§12.5) unless the expression is a constant expression (§15.29).
As you can see, the results will be different for the case where the 2nd string has length zero and this is not a constant expression.
That is what happens in your example.
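A small sketch of that difference, with hypothetical helper names. Passing the strings as method parameters guarantees that the + expression is not a constant expression.

```java
class ConcatVersusPlus {
    // String.concat returns the receiver itself when the argument is empty
    static boolean concatReturnsSameObject(String left, String right) {
        return left.concat(right) == left;
    }

    // + on non-constant operands always produces a newly created String
    static boolean plusReturnsSameObject(String left, String right) {
        return (left + right) == left;
    }

    public static void main(String[] args) {
        String string1 = "Great";
        String string2 = "";
        System.out.println(concatReturnsSameObject(string1, string2)); // true
        System.out.println(plusReturnsSameObject(string1, string2));   // false
        System.out.println((string1 + string2).equals(string1));       // true
    }
}
```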
You also said:
But while using + operator a new reference will be created in the string pool constant.
This is not directly relevant to your question, but ... actually, no it won't be created there. It will create a reference to a regular (not interned) String object in the heap. (It would only be in the class file's constant pool ... and hence the string pool ... if it was a constant expression; see JLS 15.29)
Note that the string pool and the classfile constant pool are different things.
Can I add a couple of things:
You probably shouldn't be using String.concat. The + operator is more concise, and the JIT compiler should know how to optimize away the creation of unnecessary intermediate strings ... in the few cases where you might consider using concat for performance reasons.
It is a bad idea to exploit the fact that no new object is created so that you can use == rather than equals(Object). Your code will be fragile. Just use equals always for comparing String and the primitive wrapper types. It is simpler and safer.
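For illustration only, here is what the safer comparison looks like next to ==. The class name and the build() helper are invented for this sketch; Objects.equals is the standard null-safe helper from java.util.

```java
import java.util.Objects;

class SafeStringComparison {
    // Builds "Great" at runtime; StringBuilder.toString() allocates a new String
    static String build() {
        return new StringBuilder("Gre").append("at").toString();
    }

    public static void main(String[] args) {
        String literal = "Great";
        String built = build();

        System.out.println(literal == built);              // false: different objects
        System.out.println(literal.equals(built));         // true: same characters
        System.out.println(Objects.equals(null, literal)); // false, and no exception
    }
}
```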
In short, the fact that you are even asking this question suggests that you are going down a blind alley. Knowledge of this edge-case difference between concat and + is ... pointless ... unless you are planning to enter a quiz show for Java geeks.
public class Strings
{
public static void main(String ads[])
{
String a = "meow";
String ab = a + "deal";
String abc= "meowdeal";
System.out.println (ab==abc);
}
}
Why is the output false?
In this program ab is created first and then abc. Why don't ab and abc refer to the same object in the string constant pool, given that the pool is searched for the String meowdeal before abc is created?
Java only pools strings it knows about at compile time: string literals and constant string expressions. a is a local variable that is not declared final, so a + "deal" is not a constant expression and isn't evaluated until runtime (even though you, looking at it, can see that its value never changes). The Java compiler doesn't treat it as a constant expression, and doesn't put the result in the pool. It performs the string concatenation at runtime, resulting in a different object than any in the pool.
I'll explain what's happening:
public class Strings {
    public static void main(String ads[]) {
        String a = "meow"; // refers to the pooled literal "meow"
        String ab = a + "deal"; // a is not final, so this runs at runtime and creates a new String
        String abc = "meowdeal"; // refers to the pooled literal "meowdeal"
        System.out.println(ab == abc); // false: the values are the same but the references differ. For value equality, use .equals()
    }
}
Your question implies that you expect Java to check the result of every string concatenation to see if there is a matching string in the string constant pool - but this would be grossly inefficient. String concats are always new objects unless all the strings are compile-time constants.
If you really want to compare the strings using == you need to intern the constructed string like so:
ab=(a+"deal").intern();
However this would be for a very specific use case and very uncommon.
Note that this is a different case from when two constants are concatenated; given "ab"+"cd" the compiler is required to resolve the expression to "abcd" and pool the result. The same is true when one or both operands are constant variables (final variables of type String initialized with a compile-time constant, for example static final fields) instead of literals.
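Putting the three situations from this answer side by side, as a rough sketch (class and method names are hypothetical):

```java
class CompileTimeConstants {
    // a is not final: the concatenation happens at runtime
    static boolean nonFinalLocal() {
        String a = "meow";
        String ab = a + "deal";
        return ab == "meowdeal";
    }

    // a is a constant variable: the compiler folds a + "deal" into "meowdeal"
    static boolean finalLocal() {
        final String a = "meow";
        String ab = a + "deal";
        return ab == "meowdeal";
    }

    // intern() looks the runtime result up in the pool explicitly
    static boolean internedAtRuntime() {
        String a = "meow";
        String ab = (a + "deal").intern();
        return ab == "meowdeal";
    }

    public static void main(String[] args) {
        System.out.println(nonFinalLocal());     // false
        System.out.println(finalLocal());        // true
        System.out.println(internedAtRuntime()); // true
    }
}
```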
class A {
String s4 = "abc";
static public void main(String[]args ) {
String s1 = "abc";
String s2 = "abc";
String s3 = new String("abc");
A o = new A();
String s5 = new String("def");
System.out.println("s1==s2 : " + (s1==s2));
System.out.println("s1==s1.intern : " + (s1==s1.intern()));
System.out.println("s1==s3 : " + (s1==s3));
System.out.println("s1.intern==s3.intern : " + (s1.intern()==s3.intern()));
System.out.println("s1==s4 : " + (s1==o.s4));
}
}
The output:
s1==s2 : true
s1==s1.intern : true
s1==s3 : false
s1.intern==s3.intern : true
s1==s4 : true
My questions:
1.What happens for "String s1 = "abc"? I guess the String object is added to the pool in class String as an interned string? Where is it placed on? The "permanent generation" or just the heap(as the data member of the String Class instance)?
2.What happens for "String s2 = "abc"? I guess no object is created. But does this mean that the Java Interpreter needs to search all the interned strings? Will this cause any performance issue?
3.Seems String s3 = new String("abc") does not use interned string.Why?
4.Will String s5 = new String("def") create any new interned string?
1. The compiler creates a String object for "abc" in the constant pool, and generates a reference to it in the bytecode for the assignment statement.
2. See (1). No searching; no performance issue.
3. This creates a new String object at runtime, because that is what the 'new' operator does: create new objects.
4. Yes, for "def", but because of (3) a new String is also created at runtime.
The String objects at 3-4 are not interned.
1.What happens for "String s1 = "abc"?
At compile time a representation of the literal is written to the "constant pool" part of the classfile for the class that contains this code.
When the class is loaded, the representation of the string literal in the classfile's constant pool is read, and a new String object is created from it. This string is then interned, and the reference to the interned string is then "embedded" in the code.
At runtime, the reference to the previously created / interned String is assigned to s1. (No string creation or interning happens when this statement is executed.)
I guess the String object is added to the pool in class String as an interned string?
Yes. But not when the code is executed.
Where is it placed on? The "permanent generation" or just the heap(as the data member of the String Class instance)?
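In HotSpot JVMs before Java 7 it is stored in the permgen region of the heap; from Java 7 onward interned strings live in the main heap. (The pool is not a static field of the String class. The JVM's string pool is implemented in native code.)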
2.What happens for "String s2 = "abc"?
Nothing happens at load time. When the compiler created the classfile, it reused the same constant pool entry for the literal that was used for the first use of the literal. So the String reference used by this statement is the same one as is used by the previous statement.
I guess no object is created.
Correct.
But does this mean that the Java Interpreter needs to search all the interned strings? Will this cause any performance issue?
No, and No. The Java interpreter (or JIT compiled code) uses the same reference as was created / embedded for the previous statement.
3.Seems String s3 = new String("abc") does not use interned string.Why?
It is more complicated than that. The constructor call uses the interned string, and then creates a new String, and copies the characters of the interned string to the new String's representation. The newly created string is assigned to s3.
Why? Because new is specified as always creating a new object (see JLS), and the String constructor is specified as copying the characters.
4.Will String s5 = new String("def") create any new interned string?
A new interned string is created at load time (for "def"), and then a new String object is created at runtime which is a copy of the interned string. (See previous text for more details.)
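The observable part of this can be checked with a short sketch. The nested classes First and Second are invented here only to show that the same literal in two different classes resolves to one object.

```java
class LiteralSharingDemo {
    // Two separate classes, each with its own constant pool entry for "abc"
    static class First {
        static String literal() { return "abc"; }
    }

    static class Second {
        static String literal() { return "abc"; }
    }

    public static void main(String[] args) {
        String s1 = First.literal();
        String s2 = Second.literal();
        String s3 = new String("abc"); // new object created when this line runs

        System.out.println(s1 == s2);          // true: both resolve to the interned "abc"
        System.out.println(s1 == s3);          // false: new always creates a new object
        System.out.println(s1 == s3.intern()); // true: intern() returns the pooled instance
        System.out.println(s1.equals(s3));     // true: same characters
    }
}
```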
See this answer on SO. Also see this wikipedia article on String Interning.
String s1 = "abc"; creates a new String and interns it.
String s2 = "abc"; will drag the same Object used for s1 from the intern pool. The JVM does this to increase performance. It is quicker than creating a new String.
Calling new String("abc") is redundant: it always returns a new String object rather than retrieving one from the intern pool.
As Keyser says, == compares the Strings for Object equality, returning true if they are the same Object. When comparing String content you should use .equals()
Why does the following work? I would expect a NullPointerException to be thrown.
String s = null;
s = s + "hello";
System.out.println(s); // prints "nullhello"
Why must it work?
JLS 8 § 15.18.1 "String Concatenation Operator +", leading to JLS 8 § 5.1.11 "String Conversion", requires this operation to succeed without failure:
...Now only reference values need to be considered. If the reference is null, it is converted to the string "null" (four ASCII characters n, u, l, l). Otherwise, the conversion is performed as if by an invocation of the toString method of the referenced object with no arguments; but if the result of invoking the toString method is null, then the string "null" is used instead.
How does it work?
Let's look at the bytecode! The compiler takes your code:
String s = null;
s = s + "hello";
System.out.println(s); // prints "nullhello"
and compiles it into bytecode as if you had instead written this:
String s = null;
s = new StringBuilder(String.valueOf(s)).append("hello").toString();
System.out.println(s); // prints "nullhello"
(You can do so yourself by using javap -c)
The append methods of StringBuilder all handle null just fine. In this case because null is the first argument, String.valueOf() is invoked instead since StringBuilder does not have a constructor that takes any arbitrary reference type.
If you were to have done s = "hello" + s instead, the equivalent code would be:
s = new StringBuilder("hello").append(s).toString();
where in this case the append method takes the null and then delegates it to String.valueOf().
Note: String concatenation is actually one of the rare places where the compiler gets to decide which optimization(s) to perform. As such, the "exact equivalent" code may differ from compiler to compiler. This optimization is allowed by JLS, Section 15.18.1.2:
To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.
The compiler I used to determine the "equivalent code" above was Eclipse's compiler, ecj.
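As a rough check that the hand-written equivalents behave like the + operator, here is a sketch with made-up method names. It does not depend on which strategy your compiler picks; newer javac versions, for example, emit an invokedynamic call instead of StringBuilder code, with the same visible result.

```java
class NullConcatDemo {
    // What the source code says
    static String viaPlus(String s) {
        return s + "hello";
    }

    // StringBuilder.append(String) appends "null" for a null argument
    static String viaBuilder(String s) {
        return new StringBuilder().append(s).append("hello").toString();
    }

    // String.valueOf(Object) returns "null" for a null argument
    static String viaValueOf(String s) {
        return new StringBuilder(String.valueOf(s)).append("hello").toString();
    }

    // Calling a method on the null reference is what actually throws
    static boolean methodCallThrows(String s) {
        try {
            s.toString();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(viaPlus(null));          // nullhello
        System.out.println(viaBuilder(null));       // nullhello
        System.out.println(viaValueOf(null));       // nullhello
        System.out.println(methodCallThrows(null)); // true
    }
}
```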
See section 5.4 and 15.18 of the Java Language specification:
String conversion applies only to the operands of the binary + operator when one of the arguments is a String. In this single special case, the other argument to the + is converted to a String, and a new String which is the concatenation of the two strings is the result of the +. String conversion is specified in detail within the description of the string concatenation + operator.
and
If only one operand expression is of type String, then string conversion is performed on the other operand to produce a string at run time. The result is a reference to a String object (newly created, unless the expression is a compile-time constant expression (§15.28)) that is the concatenation of the two operand strings. The characters of the left-hand operand precede the characters of the right-hand operand in the newly created string. If an operand of type String is null, then the string "null" is used instead of that operand.
The second line is transformed to the following code:
s = (new StringBuilder()).append((String)null).append("hello").toString();
The append methods can handle null arguments.
You never call a method on the null reference, and therefore you don't get the exception. If you want the NullPointerException, just do
String s = null;
s = s.toString() + "hello";
And I think what you want to do is:
String s = "";
s = s + "hello";
This is behavior specified in the Java API's String.valueOf(Object) method. When you do concatenation, valueOf is used to get the String representation. There is a special case if the Object is null, in which case the string "null" is used.
public static String valueOf(Object obj)
Returns the string representation of the Object argument.
Parameters:
obj - an Object.
Returns:
if the argument is null, then a string equal to "null"; otherwise, the value of obj.toString() is returned.
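A short sketch of the null handling described here. NullToString is a hypothetical class added only to exercise the case where toString() itself returns null, which string conversion also turns into "null".

```java
class ValueOfDemo {
    // Hypothetical class whose toString() returns null
    static class NullToString {
        @Override
        public String toString() {
            return null;
        }
    }

    public static void main(String[] args) {
        Object nothing = null;

        System.out.println(String.valueOf(nothing));  // null (the four-character string)
        System.out.println("value: " + nothing);      // value: null
        System.out.println("" + new NullToString());  // null
    }
}
```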
public class Comparison {
public static void main(String[] args) {
String s = "prova";
String s2 = "prova";
System.out.println(s == s2);
System.out.println(s.equals(s2));
}
}
outputs:
true
true
on my machine. Why? Shouldn't == compare object references?
Because String instances are immutable, the Java language is able to make some optimizations whereby String literals (or more generally, String whose values are compile time constants) are interned and actually refer to the same (i.e. ==) object.
JLS 3.10.5 String Literals
Each string literal is a reference to an instance of class String. String objects have a constant value. String literals-or, more generally, strings that are the values of constant expressions -are "interned" so as to share unique instances, using the method String.intern.
This is why you get the following:
System.out.println("yes" == "yes"); // true
System.out.println(99 + "bottles" == "99bottles"); // true
System.out.println("7" + "11" == "" + '7' + '1' + (char) (50-1)); // true
System.out.println("trueLove" == (true + "Love")); // true
System.out.println("MGD64" == "MGD" + Long.SIZE); // true
That said, you should NOT rely on == for String comparison in general; use equals when comparing a non-null String. In particular, do not be tempted to intern() all your String just so you can use == without knowing how string interning works.
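To show how easily == stops working, here is a sketch with invented field names. BITS is a constant variable, while BOXED is final but of type Integer, so an expression using it is not a compile-time constant.

```java
class ConstantFoldingDemo {
    static final int BITS = 64;      // primitive, final, constant initializer: a constant variable
    static final Integer BOXED = 64; // final, but Integer is neither primitive nor String

    // Folded to "MGD64" by the compiler, so both sides are the same interned object
    static boolean withPrimitiveConstant() {
        return "MGD64" == ("MGD" + BITS);
    }

    // Long.SIZE is a static final int in the JDK, so this is folded as well
    static boolean withLongSize() {
        return "MGD64" == ("MGD" + Long.SIZE);
    }

    // Not a constant expression: concatenated at runtime into a new String
    static boolean withBoxedValue() {
        return "MGD64" == ("MGD" + BOXED);
    }

    public static void main(String[] args) {
        System.out.println(withPrimitiveConstant());       // true
        System.out.println(withLongSize());                // true
        System.out.println(withBoxedValue());              // false
        System.out.println("MGD64".equals("MGD" + BOXED)); // true
    }
}
```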
Related questions
Java String.equals versus ==
difference between string object and string literal
what is the advantage of string object as compared to string literal
Is it good practice to use java.lang.String.intern()?
On new String(...)
If for some peculiar reason you need to create two String objects (which are thus not == by definition), and yet be equals, then you can, among other things, use this constructor:
public String(String original) : Initializes a newly created String object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string. Unless an explicit copy of original is needed, use of this constructor is unnecessary since Strings are immutable.
Thus, you can have:
System.out.println("x" == new String("x")); // false
The new operator always creates a new object, thus the above is guaranteed to print false. That said, this is not generally something that you actually need to do. Whenever possible, you should just use string literals instead of explicitly creating a new String for it.
Related questions
Java Strings: "String s = new String("silly");"
What is the purpose of the expression “new String(…)” in Java?
JLS, 3.10.5 => It is guaranteed that a literal string object will be reused by any other code running in the same virtual machine that happens to contain the same string literal
If you explicitly create new objects, == returns false:
String s1 = new String("prova");
String s2 = new String("prova");
System.out.println(s1 == s2); // returns false.
Otherwise the JVM can use the same object, hence s1 == s2 will return true.
It does. But String literals are pooled, so "prova" returns the same instance.
String s = "prova";
String s2 = "prova";
s and s2 are literal strings which point to the same object in the String Pool of the JVM, so the comparison returns true.
Yes, "prova" is stored in Java's internal string pool, so it's the same reference.
Source code literals are part of a constant pool, so if the same literal appears multiple times, it will be the same object at runtime.
For string literals and constant expressions, the language guarantees that there is only one instance of the "equal" String in memory. In this case the == operator will return true. But don't count on it for strings created at runtime.
You must understand that "==" compares references and "equals" compares values. Both s and s2 are pointing to the same string literal, so their references are the same.
When you put a literal string in java code, the string is automatically interned by the compiler, that is one static global instance of it is created. Or more specifically, it is put into a table of interned strings. Any other quoted string that is exactly the same, content-wise, will reference the same interned string.
So in your code s and s2 are the same string
Ideally it should not happen ever. Because java specification guarantees this. So I think it may be the bug in JVM, you should report to the sun microsystems.