Check if a variable is in the string constant pool - java

In Java, when we use a string literal to create a String object, I know that a new object is created in the SCP (string constant pool).
Is there a way to check if a variable is in the SCP or in the heap?

First of all, the correct term is the "string pool", not the "string constant pool"; see String pool - do String always exist in constant pool?
Secondly, you are not checking a variable. You are checking the string that some variable contains / refers to. (A variable that contains a reference to the string cannot be "in the SCP". The variable is either on the stack, in the heap (not SCP), or in metaspace.)
Is there a way to check if a variable is in SCP or in heap ?
From Java 7 and later, the string pool is in the (normal) heap. So your question is moot if we interpret it literally.
Prior to Java 7, the way to check if a String is in the string pool was to do this:
if (str == str.intern()) {
    System.out.println("In the string pool");
}
However, this had the problem that if an equivalent str was not already in the pool, you would have added a copy of str to the pool.
From Java 7 onwards, the above test is no longer reliable. A str.intern() call no longer needs to copy a string to a separate region to add it to the string pool. Therefore the reference to an intern'd string is often identical to the reference to the original (non-interned) string.
char[] chars = new char[]{'a', 'b', 'c'};
String str = new String(chars);
String interned = str.intern();
System.out.println(str == interned);
In fact str == str.intern() can only detect the case where you have a non-interned string with the same content as a string literal.
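To make that last point concrete, here is a minimal sketch that runs the test on two strings. The class and method names are invented for this example, and it assumes that the random UUID text has never been interned before.

```java
import java.util.UUID;

class InternProbe {
    // Returns true when intern() hands back the very same object.
    static boolean sameObjectAfterIntern(String s) {
        return s == s.intern();
    }

    public static void main(String[] args) {
        // Built at run time; assumed not to match any string already in the pool.
        String fresh = "probe-" + UUID.randomUUID();
        // A distinct heap object with the same content as the literal "abc".
        String copyOfLiteral = new String("abc");

        System.out.println(sameObjectAfterIntern(fresh));
        System.out.println(sameObjectAfterIntern(copyOfLiteral));
    }
}
```

On a Java 7+ JVM I would expect the first line to be true (the string itself became the pooled instance, so the test tells you nothing about where it was before) and the second to be false (the pool already held the object for the literal).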
Or at least list all string instances in SCP ?
There is no way to do that.
As JB Nizet pointed out, there is really not a lot of point in asking these questions:
You shouldn't be writing code that depends on whether a string is in the string pool or not.
If you are concerned about storage to the level where you would contemplate calling intern for yourself, it is better to make use of the opportunistic string de-duplication mechanism provided by modern garbage collectors (available in G1 since Java 8u20).

Related

In Java, when we print a string literal to the terminal, is this string literal also stored in the string pool?

I am aware that when we initialize a string literal to a variable this literal will be stored in the string pool by the JVM. Consider the piece of code below.
System.out.println("This is a string literal");
Does the string literal within the quotes also be stored in the string pool even if I don't initialize it to a variable?
I will preface this answer by saying that there is little practical use in gaining a deep understanding of the Java string pool. From a practical perspective, you just need to remember two things:
Don't use == to compare strings. Use equals, compareTo, or equivalent methods.
Don't use explicit String.intern calls in your code. If you want to avoid potential problems with duplicate strings, enable the string de-duplication feature that is available in modern Java GCs.
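A small sketch of why the first rule matters. The helper names are made up for illustration; the only point is that two strings with the same characters need not be the same object.

```java
class CompareStrings {
    // Identity comparison: true only when both references point at one object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    // Content comparison: true when the character sequences match.
    static boolean sameText(String x, String y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        String a = "hello";
        String b = new String("hello"); // separate object with the same characters

        System.out.println(sameObject(a, b)); // false
        System.out.println(sameText(a, b));   // true
        System.out.println(a.compareTo(b));   // 0
    }
}
```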
I am aware that when we initialize a string literal either using the 'new' keyword or not, this literal will be stored in the string pool by the JVM.
This is garbled.
Firstly, you don't "initialize" a string literal. You initialize a variable.
String hi = "hello"; // This initializes the variable `hi`.
Secondly you typically don't / shouldn't use a string literal with new.
String hi = new String("hello"); // This is bad. You should write this as above.
The normal use-case for creating a string using new is something like this:
String hi = new String(arrayOfCharacters, offset, count);
In fact, creation and interning of the String object that corresponds to a string literal happens either the first time that the literal is used in an expression or at an earlier time. The precise details (i.e. when it happens) are unspecified and (I understand) version dependent.
The first usage might be in a variable initialization, or it might be in something else; e.g. a method call.
So to your question:
Consider the piece of code below:
System.out.println("This is a string literal");
Does the string literal within the quotes also be stored in the string pool even if I do not initialize it?
Yes, it does. If that was the first time the literal was used, the code above may be the trigger for this to happen. But it could have happened previously; e.g. if the above code was run earlier.
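If you want to convince yourself, something along these lines should do it. This is only a sketch with an invented class name: it builds an equal string at run time and checks that interning it yields the literal's own object.

```java
class PrintlnLiteral {
    static boolean literalIsPooled() {
        // The literal is never assigned to a variable here.
        System.out.println("This is a string literal");

        // Build an equal string at run time so that it is a separate object.
        String built = new String("This is a string literal".toCharArray());

        // intern() returns the pooled instance, which is the literal's object.
        return built != "This is a string literal"
                && built.intern() == "This is a string literal";
    }

    public static void main(String[] args) {
        System.out.println(literalIsPooled());
    }
}
```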
As a followup, you asked:
Why does the String Pool collect string literals which are not stored in a variable and just displayed in the console?
Because the JLS 3.10.5 requires that the String objects which correspond to string literals are interned:
"Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern (§12.5)."
And you asked:
The Presence of the String Pool help optimize the program. By storing literals as such (which is actually not required because it is just to be displayed in the console), isn't it the case that it goes against its whole purpose (which is optimization)?
The original idea for interning and the string pool was to save memory. That made sense 25 years ago when the Java language was designed and originally specified. These days even a low-end Android phone has 1GB of RAM, and interning of string literals to save a few thousand bytes is kind of pointless. Except that the JLS says that this must happen.
But the answer is No, it doesn't go against the (original) purpose. This statement:
System.out.println("This is a string literal");
could be executed many times. You don't want / need to create a new String object for the literal each time that you execute it. The thing is that the JVM doesn't know what is going to happen.
Anyway, the interning must happen because that is what the spec says.

Why concatenation of String object and string literal is created in heap? [duplicate]

I have below Strings
String str1 = "Abc";//created in constant pool
String str2 = "XYZ";//created in constant pool
String str3 = str1 + str2;//created in constant pool
String str4 = new String("PQR");//created in heap
String str5 = str1.concat(str4);//created in heap
String str6 = str1 + str4;//created in heap
Here I don't know why concatenation of Strings, one created in the constant pool, and the other in the heap, results in creating the new String in the heap. I don't know the reason, why does it happen?
There is a bunch of dubious information in the comments, so I will give this a proper answer.
There is actually no such thing as "the constant pool". You won't find this term in the Java Language Specification.
(Perhaps you are getting your terminology confused with the Constant Pool which is the section of a .class file, and the corresponding per-class Runtime Constant Pool ... which is not visible to application programs. These are "specification artifacts" defined by the JVM spec for the purpose of defining the execution model of bytecodes. The spec does not require that they physically exist, though they typically do; e.g. in an Oracle or OpenJDK implementation.)
There is a data structure in a running JVM called the string pool. The string pool is NOT mentioned by name in the JLS, but its existence is implied by string literal properties as specified by the JLS. The string pool is mentioned in the javadocs, and the JVM specification.
The string pool will contain the String objects that represent the values of any string-valued constant expression used in an application. This includes string literals.
The string pool has always been primarily a de-duping mechanism for strings. Applications are able to use this by calling the String.intern method.
The string values in the Constant Pool (see above) are used to create the String objects that the application sees:
A String object is created from the representation.
String.intern is called, returning the corresponding de-duped String object from the string pool.
That string becomes part of the class's Runtime Constant Pool; i.e. the Runtime Constant Pool for a class will include a reference to the String object in the string pool.
This process can happen eagerly or lazily depending on the Java implementation.
The string pool is and has always been stored in the (or a) heap.
Prior to Java 7, string objects in the string pool were allocated in a special heap called the PermGen heap. In the earliest versions of Java it wasn't GC'ed. Then it was GC'ed only occasionally.
In Java 7 (not 8!) the string pool stopped using the PermGen heap and used the regular heap instead.
In Java 8 the PermGen heap was replaced (for some purposes!) by a different storage management mechanism called the Metaspace. Apparently, Metaspace doesn't hold Java objects. Rather, it holds code segments, class descriptors and other JVM internal data structures.
In recent versions of Java (i.e. Java 8 u20 and later) the GC has another mechanism for de-duping strings that survive a given number of GC cycles.
The behavior of strings (i.e. which ones are interned and which ones are not) is determined by the relevant parts of the JLS and the javadocs for the String class.
All of the complexity is irrelevant if you follow one simple rule:
Never use == to compare strings. Always use equals.
Now to deal with your example code:
String str1 = "Abc"; // string pool
String str2 = "XYZ"; // string pool
String str3 = str1 + str2; // not string pool (!!)
String str3a = "Abc" + "XYZ"; // string pool
String str4 = new String("PQR"); // not string pool (but the "PQR" literal is)
String str5 = str1.concat(str4); // not string pool
String str6 = str1 + str4; // not string pool
String str7 = str6.intern(); // string pool
Why?
The values assigned to str1, str2 and str3a are all values of constant expressions; see below.
The value assigned to str3 is not the value of a constant expression according to the JLS.
str4 - the JLS says that the new operator always creates a new object, and new strings are not automatically interned.
str5 - string operations apart from intern do not create objects in the string pool.
str6 - ditto - equivalent to a concat call. The JLS also says that + produces a new string (except in the constant expression case).
str7 - the exception: see above. The intern call returns an object in the string pool.
Constant expressions include literals, concatenations involving literals, values of static final String constants, and a few other things. See JLS 15.28 for the complete list, but bear in mind that the string pool only holds string values.
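The difference between a constant expression and a run-time concatenation can be checked with a sketch like this one. The class, field and method names are mine, not anything from the JLS.

```java
class ConstantExpressions {
    static final String PREFIX = "Abc"; // final and initialized with a literal: a constant variable

    // PREFIX + "XYZ" is a constant expression, so its value is interned.
    static boolean constantConcatIsPooled() {
        return (PREFIX + "XYZ") == "AbcXYZ";
    }

    // prefix is not declared final, so the concatenation happens at run time.
    static boolean runtimeConcatIsPooled() {
        String prefix = "Abc";
        String joined = prefix + "XYZ";
        return joined == "AbcXYZ";
    }

    // An explicit intern() call maps the run-time result back to the pooled object.
    static boolean internedRuntimeConcatIsPooled() {
        String prefix = "Abc";
        return (prefix + "XYZ").intern() == "AbcXYZ";
    }

    public static void main(String[] args) {
        System.out.println(constantConcatIsPooled());        // true
        System.out.println(runtimeConcatIsPooled());         // false
        System.out.println(internedRuntimeConcatIsPooled()); // true
    }
}
```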
The precise behavior of intern depends on the Java version. Consider this example:
char[] chars = // create an array of random characters
String s1 = new String(chars);
String s2 = s1.intern();
Let us assume that the random characters do not correspond to any previously interned string.
For older JVMs where interned strings were allocated in PermGen, the intern call in the example will (must) produce a new String object.
For newer JVMs, the intern can add the existing String object to the string pool data structure without having to create a new String object.
In other words, the truth of s1 == s2 depends on the Java version.
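A runnable version of that example might look like the following. The randomString helper is hypothetical, and the sketch simply assumes that 32 random letters will not collide with anything already interned. It prints the Java version next to the comparison so you can try it on different JVMs.

```java
import java.util.Random;

class InternVersionCheck {
    // Hypothetical helper: builds a string of random lowercase letters.
    static String randomString(int length) {
        Random random = new Random();
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            chars[i] = (char) ('a' + random.nextInt(26));
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        String s1 = randomString(32);
        String s2 = s1.intern();

        System.out.println(System.getProperty("java.version"));
        System.out.println("s1 == s2: " + (s1 == s2));
        // Whatever the version, interning an already pooled string returns it unchanged.
        System.out.println("s2 == s2.intern(): " + (s2 == s2.intern()));
    }
}
```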

Java- Creating String object using new keyword

I know the difference between a String literal and a new String object, and I also know how it works internally. But my question goes a little further than this. When we create a String object using the new keyword as
String str = new String("test");
In this case, we are passing an argument of String type.
My question is where this string gets created: in the heap, in the String constant pool, or somewhere else?
To my knowledge, this argument is a string literal, so it should be in the String constant pool. If so, then what is the use of the intern method? Does it just link the variable str to the constant pool? After all, "test" would already be available there.
Please correct me if I have misunderstood the concept.
The statement String str = new String("test"); creates a string object which gets stored on the heap like any other object. The string literal "test" that is passed as an argument is stored in the string constant pool.
String#intern() checks whether an equal string is already in the string pool. If there is one, it returns that pooled string; otherwise it adds this string to the pool and returns it. See the Javadocs:
Returns a canonical representation for the string object.
A pool of strings, initially empty, is maintained privately by the class String.
When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.
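The last sentence of that Javadoc excerpt can be tried out directly. This is just an illustrative sketch; sameCanonical is a name I picked, not a library method.

```java
class CanonicalCheck {
    // True exactly when both strings map to the same pooled instance.
    static boolean sameCanonical(String s, String t) {
        return s.intern() == t.intern();
    }

    public static void main(String[] args) {
        String s = new String("test");
        String t = new StringBuilder("te").append("st").toString();

        System.out.println(s == t);                   // false: two separate objects
        System.out.println(sameCanonical(s, t));      // true: s.equals(t)
        System.out.println(sameCanonical(s, "Test")); // false: contents differ
    }
}
```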
Starting from JDK7, interned strings are stored on the heap. This is from the release notes of JDK7:
In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
Use of intern():
public static void main(String[] args) {
    String s = new String(new char[] { 'a', 'b', 'c' }); // "abc" will not be added to String constants pool.
    System.out.println(System.identityHashCode(s));
    s = s.intern(); // add s to String constants pool
    System.out.println(System.identityHashCode(s));
    String str1 = new String("hello");
    String str2 = "hello";
    String str3 = str1.intern();
    System.out.println(System.identityHashCode(str1));
    System.out.println(System.identityHashCode(str2));
    System.out.println(System.identityHashCode(str3));
}
Output:
1414159026
1569228633 --> OOPs String moved to String constants pool
778966024
1021653256
1021653256 --> "hello" already added to string pool. So intern does not add it again.

When does the String intern() method get called?

Case 1:
String str = "StackOverFlow";
String str1 = "StackOverFlow";
if (str == str1) {
    System.out.println("equal"); // prints equal
}
Case 2:
String str = "StackOverFlow";
String str1 = str.intern();
if (str == str1) {
    System.out.println("equal"); // prints equal
}
Follow-up questions:
1) I want to know whether for the first case JVM calls intern() internally and assign the reference of str to str1?
2) how two references equal in the first case?
3) Does the first case means whenever you declare a string like String str = "StackOverFlow"; it adds to the pool of string as same as that of intern() method?
4) Does String pool which is used by String str = "StackOverFlow"; and intern() is allocated outside of heap? if yes where exactly?
For question 4 the answer is as below:
In Java 6 and earlier, interned strings were also stored in the permanent generation. In Java 7, interned strings are stored in the main object heap.
Here is what documentation says:
In JDK 7, interned strings are no longer allocated in the permanent
generation of the Java heap, but are instead allocated in the main
part of the Java heap (known as the young and old generations), along
with the other objects created by the application. This change will
result in more data residing in the main Java heap, and less data in
the permanent generation, and thus may require heap sizes to be
adjusted. Most applications will see only relatively small differences
in heap usage due to this change, but larger applications that load
many classes or make heavy use of the String.intern() method will see
more significant differences.
Further details from here:
String.intern() in Java 6
In those good old days all interned strings were stored in the PermGen
– the fixed size part of heap mainly used for storing loaded classes
and string pool. Besides explicitly interned strings, PermGen string
pool also contained all literal strings earlier used in your program
(the important word here is used – if a class or method was never
loaded/called, any constants defined in it will not be loaded).
The biggest issue with such string pool in Java 6 was its location – the PermGen. PermGen has a fixed size and can not be expanded at
runtime. You can set it using -XX:MaxPermSize=96m option. As far as I
know, the default PermGen size varies between 32M and 96M depending on
the platform. You can increase its size, but its size will still be
fixed. Such limitation required very careful usage of String.intern –
you’d better not intern any uncontrolled user input using this method.
That’s why string pooling at times of Java 6 was mostly implemented in
the manually managed maps.
String.intern() in Java 7
Oracle engineers made an extremely important change to the string
pooling logic in Java 7 – the string pool was relocated to the heap.
It means that you are no longer limited by a separate fixed size
memory area. All strings are now located in the heap, as most of other
ordinary objects, which allows you to manage only the heap size while
tuning your application. Technically, this alone could be a sufficient
reason to reconsider using String.intern() in your Java 7 programs.
But there are other reasons.
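The "manually managed maps" mentioned for the Java 6 days can be sketched roughly as below. This is my own illustration of the idea, not code from the quoted article, and the class name is invented. The map lives in the ordinary heap, so strings pooled this way are not subject to the PermGen size limit.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ManualStringPool {
    private final ConcurrentMap<String, String> pool = new ConcurrentHashMap<>();

    // Returns the first instance seen with this content, registering s if none exists yet.
    String canonical(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        ManualStringPool strings = new ManualStringPool();
        String a = new String("user-input");
        String b = new String("user-input");

        System.out.println(strings.canonical(a) == a); // true: a becomes the canonical copy
        System.out.println(strings.canonical(b) == a); // true: b maps to the copy already stored
        System.out.println(strings.size());            // 1
    }
}
```

Unlike String.intern(), a pool like this keeps strong references to everything it has seen, so it has to be discarded or cleared explicitly if the set of values is unbounded.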
The references are equal because they are both String literals.
The intern() call on str is not needed because it is also a literal. An example of when you would need to use intern() (which, by the way, is a lot slower than equals(), so don't use it) would be when constructing a String with a byte or char array.
For example:
final String str1 = "I am a literal";
final String str2 = new String(str1.toCharArray());
final boolean check1 = str1 == str2; // false
final boolean check2 = str1 == str2.intern(); // true
1) I want to know whether for the first case JVM calls intern() internally and assign the reference of str to str1?
Well, yes and no.
Yes the intern() method is called internally. But the call doesn't happen when that code is run. In fact, it happens when that code is loaded. The loader then saves the reference to the interned String.
But in this case, the loading process only needs to do the interning once. The two literals (in this case) will actually be represented by a single "constant pool entry" in the class that is being loaded. (The Java compiler will have spotted the duplicate literals in the class ... at compile time ... and eliminated it.)
2) how two references equal in the first case?
Because the two strings have been interned.
3) Does the first case means whenever you declare a string like String str = "StackOverFlow"; it adds to the pool of string as same as that of intern() method?
Yes ... modulo that the interning doesn't happen at the point when the code containing the declaration is run.
4) Does String pool which is used by String str = "StackOverFlow"; and intern() is allocated outside of heap? if yes where exactly?
The answer is somewhat system dependent. In general, the string pool is in the heap. On some systems the heap is divided into regions or spaces which have different garbage collection policies, and the string pool is allocated in the so-called "permgen" space that is sized independently of the rest of the heap. But this is not always true.
I want to know whether for the first case JVM calls intern() internally
No.
and assign the reference of str to str1?
Yes, but because the value is a literal, not because of interning. There is only one instance of it in the .class file in the first place.
how two references equal in the first case?
That's not another question, just the same question re-stated.
Does the first case means whenever you declare a string like String str = "StackOverFlow"; it adds to the pool of string as same as that of intern() method?
Yes, but it's done by the compiler, not intern().
Does String pool which is used by String str = "StackOverFlow"; and intern() is allocated outside of heap?
No.
where exactly?
In the constant pool of the loaded class, which is in the heap.
There are largely 2 ways of creating string objects.
One is
String str = "test-String"; // This string is created in string pool or returned from string pool if already exists
Second one
String str = new String("test-string");// This string object will be created in heap memory and will be treated as any other object
When you call intern() on a String instance created using the new operator, an equal string is returned from the string pool if one exists; otherwise the string is added to the pool. In Java 6 and earlier, where the string pool lived in PermGen, this was effectively a mechanism to move the string from the heap to PermGen.

Why are equal java strings taking the same address? [duplicate]

I was playing around with Strings to understand them more and I noticed something that I can't explain :
String str1 = "whatever";
String str2 = str1;
String str3 = "whatever";
System.out.println(str1==str2); //prints true...that's normal, they point to the same object
System.out.println(str1==str3); //gives true..how's that possible ?
How is the last line giving true ? this means that both str1 and str3 have the same address in memory.
Is this a compiler optimization that was smart enough to detect that both string literals are the same ("whatever") and thus assigned str1 and str3 to the same object ? Or am I missing something in the underlying mechanics of strings ?
Because Java has a pool of unique interned instances, and String literals are stored in this pool. This means that the first "whatever" string literal is exactly the same String object as the second "whatever" literal (the one assigned to str3).
As the documentation says:
public String intern()
Returns a canonical representation for the
string object. A pool of strings, initially empty, is maintained
privately by the class String.
When the intern method is invoked, if the pool already contains a
string equal to this String object as determined by the equals(Object)
method, then the string from the pool is returned. Otherwise, this
String object is added to the pool and a reference to this String
object is returned.
It follows that for any two strings s and t, s.intern() == t.intern()
is true if and only if s.equals(t) is true.
All literal strings and string-valued constant expressions are
interned. String literals are defined in §3.10.5 of the Java Language
Specification
Returns: a string that has the same contents as this string, but is
guaranteed to be from a pool of unique strings.
http://www.xyzws.com/Javafaq/what-is-string-literal-pool/3
As the post says:
String allocation, like all object allocation, proves costly in both time and memory. The JVM performs some trickery while instantiating string literals to increase performance and decrease memory overhead. To cut down the number of String objects created in the JVM, the String class keeps a pool of strings. Each time your code create a string literal, the JVM checks the string literal pool first. If the string already exists in the pool, a reference to the pooled instance returns. If the string does not exist in the pool, a new String object instantiates, then is placed in the pool.
If you do:
String str1 = new String("BlaBla"); //In the heap!
String str2 = new String("BlaBla"); //In the heap!
then you're explicitly creating a String object through new operator (and constructor).
In this case you'll have each object pointing to a different storage location.
But if you do:
String str1 = "BlaBla";
String str2 = "BlaBla";
then you have implicit construction.
Two string literals share the same storage if they have the same value. This is because Java conserves storage for equal strings (strings that have the same value).
The javac compiler combines String literals which are the same in a given class file.
However, at runtime, String literals are combined using the same approach as String.intern(). This means that even equal String literals in different classes in different applications (in the same JVM) use the same object.
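A quick way to see the cross-class sharing is a sketch like the one below. The nested classes are compiled to separate class files, each with its own constant pool entry for the literal. The names are made up, and this only shows the "different classes" part, not separate applications.

```java
class SharedLiteral {
    static class First {
        static String value() {
            return "BlaBla";
        }
    }

    static class Second {
        static String value() {
            return "BlaBla";
        }
    }

    // Both literals resolve to the same pooled String object at run time.
    static boolean sameObjectAcrossClasses() {
        return First.value() == Second.value();
    }

    public static void main(String[] args) {
        System.out.println(sameObjectAcrossClasses()); // true
    }
}
```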
