When is the String intern() method called? - Java

Case 1:
String str = "StackOverFlow";
String str1 = "StackOverFlow";
if (str == str1) {
    System.out.println("equal"); // prints equal
}
Case 2:
String str = "StackOverFlow";
String str1 = str.intern();
if (str == str1) {
    System.out.println("equal"); // prints equal
}
Follow-up questions:
1) In the first case, does the JVM call intern() internally and assign the reference of str to str1?
2) How are the two references equal in the first case?
3) Does the first case mean that whenever you declare a string like String str = "StackOverFlow"; it is added to the string pool in the same way as by the intern() method?
4) Is the string pool used by String str = "StackOverFlow"; and by intern() allocated outside of the heap? If yes, where exactly?
For question 4 the answer is as below:
In Java 6 and earlier, interned strings were also stored in the permanent generation. In Java 7, interned strings are stored in the main object heap.
Here is what documentation says:
In JDK 7, interned strings are no longer allocated in the permanent
generation of the Java heap, but are instead allocated in the main
part of the Java heap (known as the young and old generations), along
with the other objects created by the application. This change will
result in more data residing in the main Java heap, and less data in
the permanent generation, and thus may require heap sizes to be
adjusted. Most applications will see only relatively small differences
in heap usage due to this change, but larger applications that load
many classes or make heavy use of the String.intern() method will see
more significant differences.
Further details from here:
String.intern() in Java 6
In those good old days all interned strings were stored in the PermGen
– the fixed size part of heap mainly used for storing loaded classes
and string pool. Besides explicitly interned strings, PermGen string
pool also contained all literal strings earlier used in your program
(the important word here is used – if a class or method was never
loaded/called, any constants defined in it will not be loaded).
The biggest issue with such string pool in Java 6 was its location – the PermGen. PermGen has a fixed size and can not be expanded at
runtime. You can set it using -XX:MaxPermSize=96m option. As far as I
know, the default PermGen size varies between 32M and 96M depending on
the platform. You can increase its size, but its size will still be
fixed. Such limitation required very careful usage of String.intern –
you’d better not intern any uncontrolled user input using this method.
That’s why string pooling at times of Java 6 was mostly implemented in
the manually managed maps.
String.intern() in Java 7
Oracle engineers made an extremely important change to the string
pooling logic in Java 7 – the string pool was relocated to the heap.
It means that you are no longer limited by a separate fixed size
memory area. All strings are now located in the heap, as most of other
ordinary objects, which allows you to manage only the heap size while
tuning your application. Technically, this alone could be a sufficient
reason to reconsider using String.intern() in your Java 7 programs.
But there are other reasons.
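The "manually managed maps" mentioned for the Java 6 era can be sketched roughly as below. This is only an illustration under assumptions: ManualStringPool and its canonical method are hypothetical names, not JDK classes, and the code uses modern syntax for brevity. The idea is that the pool lives in the ordinary heap, so it is never limited by the PermGen size.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper, not part of the JDK: an application-level string pool
// that lives in the ordinary heap instead of relying on String.intern().
class ManualStringPool {
    private final ConcurrentMap<String, String> pool = new ConcurrentHashMap<>();

    // Returns the canonical instance for this content, registering s
    // if no equal string has been seen before.
    String canonical(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        ManualStringPool pool = new ManualStringPool();
        String a = new String("user-input");
        String b = new String("user-input");
        System.out.println(a == b);                                 // false
        System.out.println(pool.canonical(a) == pool.canonical(b)); // true
        System.out.println(pool.size());                            // 1
    }
}
```

Unlike String.intern(), a map like this can be cleared or dropped as a whole when the application no longer needs it.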

The references are equal because they are both String literals.
The intern() call on str is not needed because it is also a literal. An example of when you would need to use intern() (which, by the way, is a lot slower than equals(), so don't use it) would be when constructing a String from a byte or char array.
For example:
final String str1 = "I am a literal";
final String str2 = new String(str1.toCharArray());
final boolean check1 = str1 == str2; // false
final boolean check2 = str1 == str2.intern(); // true

1) I want to know whether for the first case JVM calls intern() internally and assign the reference of str to str1?
Well, yes and no.
Yes the intern() method is called internally. But the call doesn't happen when that code is run. In fact, it happens when that code is loaded. The loader then saves the reference to the interned String.
But in this case, the loading process only needs to do the interning once. The two literals (in this case) will actually be represented by a single "constant pool entry" in the class that is being loaded. (The Java compiler will have spotted the duplicate literals in the class ... at compile time ... and eliminated it.)
2) how two references equal in the first case?
Because the two strings have been interned.
3) Does the first case means whenever you declare a string like String str = "StackOverFlow"; it adds to the pool of string as same as that of intern() method?
Yes ... modulo that the interning doesn't happen at the point when the code containing the declaration is run.
4) Does String pool which is used by String str = "StackOverFlow"; and intern() is allocated outside of heap? if yes where exactly?
The answer is somewhat system dependent. In general, the string pool is in the heap. On some systems the heap is divided into regions or spaces which have different garbage collection policies, and the string pool is allocated in the so-called "permgen" space that is sized independently of the rest of the heap. But this is not always true.
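The answers to questions 1 to 3 can be checked with a small sketch like the one below. The class and method names are made up for illustration. It relies only on behaviour the language specification guarantees: identical literals share one object, and a copy built at run time is a different object until it is interned.

```java
class LiteralIdentityDemo {
    // Case 1 from the question: two identical literals in one class.
    static boolean sameLiteral() {
        String str = "StackOverFlow";
        String str1 = "StackOverFlow";
        return str == str1;
    }

    // Case 2: intern() on a string that is already the pooled literal.
    static boolean internOfLiteral() {
        String str = "StackOverFlow";
        return str == str.intern();
    }

    // A copy built at run time is a different object until it is interned.
    static boolean copyNeedsIntern() {
        String str = "StackOverFlow";
        String copy = new String(str.toCharArray());
        return str != copy && str == copy.intern();
    }

    public static void main(String[] args) {
        System.out.println(sameLiteral());     // true
        System.out.println(internOfLiteral()); // true
        System.out.println(copyNeedsIntern()); // true
    }
}
```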

I want to know whether for the first case JVM calls intern() internally
No.
and assign the reference of str to str1?
Yes, but because the value is a literal, not because of interning. There is only one instance of it in the .class file in the first place.
how two references equal in the first case?
That's not another question, just the same question re-stated.
Does the first case means whenever you declare a string like String str = "StackOverFlow"; it adds to the pool of string as same as that of intern() method?
Yes, but it's done by the compiler, not intern().
Does String pool which is used by String str = "StackOverFlow"; and intern() is allocated outside of heap?
No.
where exactly?
In the constant pool of the loaded class, which is in the heap.

There are largely 2 ways of creating string objects.
One is
String str = "test-String"; // This string is created in the string pool, or returned from the string pool if it already exists
Second one
String str = new String("test-string");// This string object will be created in heap memory and will be treated as any other object
When you call intern() on a String instance created with the new operator, an equal string is returned from the pool if one exists; otherwise the string is added to the pool. In Java 6 and earlier, this was effectively a mechanism to move the string from the regular heap to PermGen, where the string pool lived at the time.

Related

Check if a variable is in the string constant pool

In Java when we use literal string when creating a string object, I know that a new object is created in SCP (string constant pool).
Is there a way to check if a variable is in the SCP or in the heap?
First of all, the correct term is the "string pool", not the "string constant pool"; see String pool - do String always exist in constant pool?
Secondly, you are not checking a variable. You are checking the string that some variable contains / refers to. (A variable that contains a reference to the string cannot be "in the SCP". The variable is either on the stack, in the heap (not SCP), or in metaspace.)
Is there a way to check if a variable is in SCP or in heap ?
From Java 7 and later, the string pool is in the (normal) heap. So your question is moot if we interpret it literally.
Prior to Java 7, the way to check if a String is in the string pool was to do this
if (str == str.intern()) {
System.out.println("In the string pool");
}
However, this had the problem that if an equivalent str was not already in the pool, you would have added a copy of str to the pool.
From Java 7 onwards, the above test is no longer reliable. A str.intern() call no longer needs to copy a string to a separate region to add it to the string pool. Therefore the reference to an intern'd string is often identical to the reference to the original (non-interned) string.
char[] chars = new char[]{'a', 'b', 'c'};
String str = new String(chars);
String interned = str.intern();
System.out.println(str == interned);
In fact str == str.intern() can only detect the case where you have a non-interned string with the same content as a string literal.
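To make the limits of that test concrete, here is a minimal sketch. The helper name isCanonical is hypothetical. Note the side effect mentioned above: calling it may itself add the string to the pool. Only the first two results are guaranteed; the third depends on the Java version and on what is already pooled, so no expected value is given for it.

```java
class InternCheck {
    // Hypothetical helper. Side effect: if no equal string is pooled yet,
    // this call adds one to the pool.
    static boolean isCanonical(String s) {
        return s == s.intern();
    }

    public static void main(String[] args) {
        String literal = "abc";
        String copy = new String(literal);

        System.out.println(isCanonical(literal)); // true: literals are always interned
        System.out.println(isCanonical(copy));    // false: same content as a literal, different object

        // No literal with this content exists in this class.
        String fresh = new String(new char[] {'q', 'z', '7', '#', 'k'});
        System.out.println(isCanonical(fresh));   // version dependent, so no expected value here
    }
}
```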
Or at least list all string instances in SCP ?
There is no way to do that.
As JB Nizet pointed out, there is really not a lot of point in asking these questions:
You shouldn't be writing code that depends on whether a string is in the string pool or not.
If you are concerned about storage to the level where you would contemplate calling intern for yourself, it is better to make use of the opportunistic string compaction mechanism provided by Java 9+ garbage collectors.

Why concatenation of String object and string literal is created in heap? [duplicate]

I have below Strings
String str1 = "Abc";//created in constant pool
String str2 = "XYZ";//created in constant pool
String str3 = str1 + str2;//created in constant pool
String str4 = new String("PQR");//created in heap
String str5 = str1.concat(str4);//created in heap
String str6 = str1 + str4;//created in heap
Here I don't know why concatenation of Strings, one created in the constant pool, and the other in the heap, results in creating the new String in the heap. I don't know the reason, why does it happen?
There is a bunch of dubious information in the comments, so I will give this a proper answer.
There is actually no such thing as "the constant pool". You won't find this term in the Java Language Specification.
(Perhaps you are getting your terminology confused with the Constant Pool which is the section of a .class file, and the corresponding per-class Runtime Constant Pool ... which is not visible to application programs. These are "specification artifacts" defined by the JVM spec for the purpose of defining the execution model of bytecodes. The spec does not require that they physically exist, though they typically do; e.g. in an Oracle or OpenJDK implementation.)
There is a data structure in a running JVM called the string pool. The string pool is NOT mentioned by name in the JLS, but its existence is implied by string literal properties as specified by the JLS. The string pool is mentioned in the javadocs, and the JVM specification.
The string pool will contain the String objects that represent the values of any string-valued constant expression used in an application. This includes string literals.
The string pool has always been primarily a de-duping mechanism for strings. Applications are able to use this by calling the String.intern method.
The string values in the Constant Pool (see above) are used to create the String objects that the application sees:
A String object is created from the representation.
String.intern is called, returning the corresponding de-duped String object from the string pool.
That string becomes part of the class's Runtime Constant Pool; i.e. the Runtime Constant Pool for a class will include a reference to the String object in the string pool.
This process can happen eagerly or lazily depending on the Java implementation.
The string pool is and has always been stored in the (or a) heap.
Prior to Java 7, string objects in the string pool were allocated in a special heap called the PermGen heap. In the earliest versions of Java it wasn't GC'ed. Then it was GC'ed only occasionally.
In Java 7 (not 8!) the string pool stopped using the PermGen heap and used the regular heap instead.
In Java 8 the PermGen heap was replaced (for some purposes!) by a different storage management mechanism called the Metaspace. Apparently, Metaspace doesn't hold Java objects. Rather, it holds code segments, class descriptors and other JVM internal data structures.
In recent versions of Java (i.e. Java 8 u20 and later) the GC has another mechanism for de-duping strings that survive a given number of GC cycles.
The behavior of strings (i.e. which ones are interned and which ones are not) is determined by the relevant parts of the JLS and the javadocs for the String class.
All of the complexity is irrelevant if you follow one simple rule:
Never use == to compare strings. Always use equals.
Now to deal with your example code:
String str1 = "Abc"; // string pool
String str2 = "XYZ"; // string pool
String str3 = str1 + str2; // not string pool (!!)
String str3a = "Abc" + "XYZ"; // string pool
String str4 = new String("PQR"); // not string pool (but the "PQR" literal is)
String str5 = str1.concat(str4); // not string pool
String str6 = str1 + str4; // not string pool
String str7 = str6.intern(); // string pool
Why?
The values assigned to str1, str2 and str3a are all values of constant expressions; see below.
The value assigned to str3 is not the value of a constant expression according to the JLS.
str4 - the JLS says that new operator always creates a new object and new strings are not automatically interned
str5 - string operations apart from intern do not create objects in the string pool
str6 - ditto - equivalent to a concat call. The JLS also says that + produces a new string (except in the constant expression case).
str7 - the exception: see above. The intern call returns an object in the string pool.
Constant expressions include literals, concatenations involving literals, values of static final String constants, and a few other things. See JLS 15.28 for the complete list, but bear in mind that the string pool only holds string values.
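A short sketch of the constant-expression rule, with made-up class and method names. It assumes only what the JLS states: a final String variable initialized with a constant expression is itself a constant, while a concatenation involving a non-final variable is evaluated at run time and yields a new String.

```java
class ConstantExpressionDemo {
    static final String PREFIX = "Abc"; // static final String constant

    // Both operands are constants, so the compiler folds the concatenation
    // and the result is the pooled "AbcXYZ".
    static boolean foldedAtCompileTime() {
        final String suffix = "XYZ"; // final local initialized with a literal
        String viaConstants = PREFIX + suffix;
        return viaConstants == "AbcXYZ";
    }

    // prefix is not final, so the concatenation happens at run time and
    // produces a new String that is equal to, but not the same object as, the literal.
    static boolean builtAtRunTime() {
        String prefix = "Abc";
        String viaVariable = prefix + "XYZ";
        return viaVariable != "AbcXYZ"
                && viaVariable.equals("AbcXYZ")
                && viaVariable.intern() == "AbcXYZ";
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime()); // true
        System.out.println(builtAtRunTime());      // true
    }
}
```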
The precise behavior of intern depends on the Java version. Consider this example:
char[] chars = // create an array of random characters
String s1 = new String(chars);
String s2 = s1.intern();
Let us assume that the random characters do not correspond to any previously interned string.
For older JVMs where interned strings were allocated in PermGen, the intern call in the example will (must) produce a new String object.
For newer JVMs, the intern can add the existing String object to the string pool data structure without having to create a new String object.
In other words, the truth of s1 == s2 depends on the Java version.
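A runnable version of that example might look like the sketch below. The randomString helper is an assumption added for illustration, and with 32 random letters it is only extremely unlikely (not impossible) that the content was interned before. The identity comparison is printed without an expected value because, as explained above, it depends on the Java version.

```java
import java.util.Random;

class FreshInternDemo {
    // Hypothetical helper: builds a string of random lowercase letters.
    static String randomString(int length) {
        Random random = new Random();
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            chars[i] = (char) ('a' + random.nextInt(26));
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        String s1 = randomString(32);
        String s2 = s1.intern();
        System.out.println(s1.equals(s2)); // true: intern() never changes the content
        System.out.println(s1 == s2);      // depends on the Java version
    }
}
```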

Java- Creating String object using new keyword

I know the difference between a String literal and a new String object, and I also know how each works internally. But my question goes a little beyond this. When we create a String object using the new keyword, as in
String str = new String("test");
we are passing an argument of type String.
My question is: where is this argument string created? In the heap, in the String constant pool, or somewhere else?
As far as I know, this argument is a string literal, so it should be in the String constant pool. If so, what is the use of the intern method? Does it just link the variable str to the constant pool, given that "test" is already available there?
Please correct me if I have misunderstood the concept.
The statement String str = new String("test"); creates a string object which gets stored on the heap like any other object. The string literal "test" that is passed as an argument is stored in the string constant pool.
String#intern() checks if a string constant is already available in the string pool. If there is one already it returns it, else it creates a new one and stores it in the pool. See the Javadocs:
Returns a canonical representation for the string object.
A pool of strings, initially empty, is maintained privately by the class String.
When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.
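The last sentence of the Javadoc can be tried out directly. The sketch below uses a hypothetical helper, holdsFor, that checks whether identity of the interned strings agrees with equals for a given pair. It is only a spot check on a few inputs, not a proof.

```java
class InternContract {
    // Hypothetical helper: true when s.intern() == t.intern()
    // gives the same answer as s.equals(t).
    static boolean holdsFor(String s, String t) {
        return (s.intern() == t.intern()) == s.equals(t);
    }

    public static void main(String[] args) {
        String a = new String("test");
        String b = new String("test");
        String c = "other";
        System.out.println(a == b);         // false: two separate objects
        System.out.println(holdsFor(a, b)); // true: equal content, same interned instance
        System.out.println(holdsFor(a, c)); // true: different content, different interned instances
    }
}
```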
Starting from JDK7, interned strings are stored on the heap. This is from the release notes of JDK7:
In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
Use of intern() :
public static void main(String[] args) {
    String s = new String(new char[] { 'a', 'b', 'c' }); // "abc" will not be added to String constants pool.
    System.out.println(System.identityHashCode(s));
    s = s.intern(); // add s to String constants pool
    System.out.println(System.identityHashCode(s));
    String str1 = new String("hello");
    String str2 = "hello";
    String str3 = str1.intern();
    System.out.println(System.identityHashCode(str1));
    System.out.println(System.identityHashCode(str2));
    System.out.println(System.identityHashCode(str3));
}
O/P :
1414159026
1569228633 --> OOPs String moved to String constants pool
778966024
1021653256
1021653256 --> "hello" already added to string pool. So intern does not add it again.

We know that Strings are stored in the SCP; are String[] and ArrayList<String> stored in the SCP in the same way?

We know that a String will be stored in the SCP (String Constant Pool) area:
1. Is a String[] also stored in the SCP in the same way? I mean, each array contains Strings, so will these be stored in the SCP again?
2. And what about ArrayList<String>? I mean, each ArrayList contains Strings, so will these be stored in the SCP again?
In our project we are facing an OutOfMemoryError. We have more than 1000 String[] arrays, and the values in each String[] are different every time. So, we guess, a huge number of objects are being created in the SCP. We want to change this to ArrayList<String> to reduce memory use. If each String in an ArrayList<String> is also stored in the SCP area, then there is no point in changing from String[] to ArrayList<String>.
Please explain in detail. Your response is valuable.
No. All Strings will not be stored in String Constant Pool. Only String literals and interned Strings will be stored there.
String s = "abc"; // stores "abc" in the SCP if it is not already present.
String s1 = "abc"; // reuses the pooled "abc"; nothing new is stored.
String s2 = "abc"; // doesn't store "abc" into the String pool as it is already present.
String s3=s1+s2; // "abcabc" goes on heap
String[] is just an array that holds references to String objects.
So, I think there is some other problem somewhere else.
Also, from Oracle doc :
In JDK 7, interned strings are no longer allocated in the permanent
generation of the Java heap, but are instead allocated in the main
part of the Java heap (known as the young and old generations), along
with the other objects created by the application. This change will
result in more data residing in the main Java heap, and less data in
the permanent generation, and thus may require heap sizes to be
adjusted. Most applications will see only relatively small differences
in heap usage due to this change, but larger applications that load
many classes or make heavy use of the String.intern() method will see
more significant differences.
It's the String literals that are stored in the pool. For all the other questions you have, it's just a reference. So a String[] holds references to String objects, which may include pooled objects as well. Similarly, an ArrayList will hold references.
Changing [] to ArrayList won't make a difference.
Rather, changing these Strings to StringBuffer/StringBuilder will, because basic operations like + (concatenation) then do not create a new object each time.
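As a rough illustration of both points (containers hold references, and a StringBuilder avoids intermediate strings), consider the sketch below. The class and method names are invented for the example, and it says nothing about where the actual memory problem in the project lies.

```java
import java.util.ArrayList;
import java.util.List;

class ReferenceHolders {
    // An array and a list holding the same String both point at one object;
    // neither container copies or pools the string.
    static boolean containersShareTheObject() {
        String value = new String("payload");
        String[] array = { value };
        List<String> list = new ArrayList<>();
        list.add(value);
        return array[0] == value && list.get(0) == value;
    }

    // Appending to one StringBuilder avoids creating an intermediate String
    // for every concatenation step; a single String is created at the end.
    static String join(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(containersShareTheObject());            // true
        System.out.println(join(new String[] { "a", "b", "c" }));  // abc
    }
}
```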

string literals and permanent generation memory area

When we say that interned strings are stored in the permanent generation area, does the same apply to string literals as well? Or is it only for strings interned by intern()?
Blog posts usually say that the string pool contains a reference to the string object, while the actual string object is somewhere in the heap. There is also a lot of confusion about whether the permanent generation is in the heap or outside of it. (I used jconsole, and it shows the permanent generation separately from the heap. Many posts say it is part of the heap, and many say it is different.)
Edit:
Also when I ran:
import java.util.ArrayList;
import java.util.List;

public class stringtest2 {
    public static void main(String args[]) {
        int i = 0;
        List<String> list = new ArrayList<String>();
        while (true) {
            String s = "hello" + i;
            String s1 = i + "hello";
            String s2 = i + "hello" + i;
            System.out.println(s);
            s.intern();
            s1.intern();
            s2.intern();
            list.add(s);
            list.add(s1);
            list.add(s2);
            i++;
        }
    }
}
I was expecting java.lang.OutOfMemoryError: PermGen space, but I got:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2760)
at java.util.Arrays.copyOf(Arrays.java:2734)
at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
at java.util.ArrayList.add(ArrayList.java:351)
at stringtest2.main(stringtest2.java:20)
Shouldn't it be Java.lang.OutOfMemoryError: PermGen space
When we say that interned strings are stored in permanent generation area then does the same applies for string literals also?
Literal strings are interned. So yes, in Java 6 and earlier.
From Java 7, interned strings are not stored in permanent generation any longer. They are stored in the main part of the heap like any other objects you would create.
Shouldn't it be Java.lang.OutOfMemoryError: PermGen space
The exception you get is caused by the creation of an array which lives on the heap. To try to get an "out of permgen memory" error, you could try to remove the list.add() lines. Note however that interned strings can be garbage collected so even doing that will still not cause the exception you expect.
Cf RFE 6962931:
In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
All String literals are interned automatically, as described in the String JavaDoc:
All literal strings and string-valued constant expressions are interned.
I would expect the behaviour to be consistent between strings you manually intern and any string literals.
s.intern() returns a reference to the interned string but you are not using it.
Try s = s.intern()
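The difference between discarding and keeping the result of intern() can be seen in a sketch like this one. The method names are hypothetical. It relies on "hello" + i being evaluated at run time because i is not a constant, so each concatenation creates a new String.

```java
class UseInternResult {
    // intern() is called but its return value is thrown away,
    // so s and t still refer to two separate objects.
    static boolean sameWhenDiscarded(int i) {
        String s = "hello" + i;
        String t = "hello" + i;
        s.intern();
        t.intern();
        return s == t;
    }

    // Keeping the return value makes both variables refer to
    // the single canonical instance from the pool.
    static boolean sameWhenReassigned(int i) {
        String s = "hello" + i;
        String t = "hello" + i;
        s = s.intern();
        t = t.intern();
        return s == t;
    }

    public static void main(String[] args) {
        System.out.println(sameWhenDiscarded(1));  // false
        System.out.println(sameWhenReassigned(1)); // true
    }
}
```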
