string literals and permanent generation memory area - java

When we say that interned strings are stored in the permanent generation area, does the same apply to string literals? Or is it only for strings interned by intern()?
Blog posts usually say that the string pool contains a reference to the string object while the actual string object is somewhere in the heap. There is also a lot of confusion about whether the permanent generation is in the heap or outside of it. (I used jconsole and it shows the permanent generation separately from the heap. Many posts say it is part of the heap and many say it is separate.)
Edit:
Also when I ran:
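import java.util.ArrayList;
import java.util.List;

public class stringtest2{
    public static void main(String args[]){
        int i=0;
        List<String> list=new ArrayList<String>();
        while(true){
            String s="hello"+i;
            String s1=i+"hello";
            String s2=i+"hello"+i;
            System.out.println(s);
            // the return values of intern() are discarded here
            s.intern();
            s1.intern();
            s2.intern();
            // the list keeps every string reachable, so it grows without bound
            list.add(s);
            list.add(s1);
            list.add(s2);
            i++;
        }
    }
}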
I was expecting java.lang.OutOfMemoryError: PermGen space, but I got:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2760)
at java.util.Arrays.copyOf(Arrays.java:2734)
at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
at java.util.ArrayList.add(ArrayList.java:351)
at stringtest2.main(stringtest2.java:20)
Shouldn't it be java.lang.OutOfMemoryError: PermGen space?

When we say that interned strings are stored in permanent generation area then does the same applies for string literals also?
Literal strings are interned. So yes, in Java 6 and earlier.
From Java 7, interned strings are not stored in permanent generation any longer. They are stored in the main part of the heap like any other objects you would create.
Shouldn't it be java.lang.OutOfMemoryError: PermGen space?
The exception you get is caused by the growth of the ArrayList's backing array, which lives on the heap. To try to get an "out of permgen memory" error, you could remove the list.add() lines. Note, however, that interned strings can be garbage collected, so even doing that may still not cause the exception you expect.
Cf RFE 6962931:
In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
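On the side question of whether the permanent generation counts as heap: one way to see how a particular JVM classifies its own memory areas is to list its memory pools through the standard java.lang.management API, which is the same data jconsole displays. The sketch below only prints what the running JVM reports; the pool names differ between JVM versions and collectors, and the class name MemoryPoolLister is made up for this illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolLister {
    // Returns one line per memory pool: the pool's type (heap or non-heap) and its name.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType() + " : " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        // The output depends entirely on the JVM version and the garbage collector in use.
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

Running this on the JVM in question shows whether that JVM reports a given area as heap or non-heap, which is a more reliable guide than blog posts written for other versions.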

All String literals are interned automatically, as described in the String JavaDoc:
All literal strings and string-valued constant expressions are interned.
I would expect the behaviour to be consistent between strings you manually intern and any string literals.
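A minimal sketch of that consistency, assuming nothing beyond the standard library: a string built at run time is a separate object from the literal, but interning it yields the very object the literal refers to. The class and method names are invented for the example.

```java
class LiteralInterningDemo {
    // Builds "hello" at run time so the compiler cannot fold it into a constant.
    static String buildAtRuntime() {
        return new StringBuilder("hel").append("lo").toString();
    }

    public static void main(String[] args) {
        String literal = "hello";
        String built = buildAtRuntime();

        System.out.println(literal == built);          // false: two distinct objects
        System.out.println(literal == built.intern()); // true: intern() returns the pooled literal
        System.out.println(literal.equals(built));     // true: same characters
    }
}
```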

s.intern() returns a reference to the interned string but you are not using it.
Try s = s.intern()
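To illustrate why the return value matters, here is a small sketch (class and helper names are hypothetical). The literal "hello" is evaluated first, so it is already in the pool when intern() is called.

```java
class InternReturnValueDemo {
    // Creates a new String object with the same characters as the argument.
    static String fresh(String value) {
        return new String(value.toCharArray());
    }

    public static void main(String[] args) {
        String s = fresh("hello");

        s.intern();                       // result discarded: s still refers to the original object
        System.out.println(s == "hello"); // false

        s = s.intern();                   // now s refers to the pooled instance
        System.out.println(s == "hello"); // true
    }
}
```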


Why concatenation of String object and string literal is created in heap? [duplicate]

I have below Strings
String str1 = "Abc";//created in constant pool
String str2 = "XYZ";//created in constant pool
String str3 = str1 + str2;//created in constant pool
String str4 = new String("PQR");//created in heap
String str5 = str1.concat(str4);//created in heap
String str6 = str1 + str4;//created in heap
Here I don't know why concatenation of Strings, one created in the constant pool, and the other in the heap, results in creating the new String in the heap. I don't know the reason, why does it happen?
There is a bunch of dubious information in the comments, so I will give this a proper answer.
There is actually no such thing as "the constant pool". You won't find this term in the Java Language Specification.
(Perhaps you are getting your terminology confused with the Constant Pool which is the section of a .class file, and the corresponding per-class Runtime Constant Pool ... which is not visible to application programs. These are "specification artifacts" defined by the JVM spec for the purpose of defining the execution model of bytecodes. The spec does not require that they physically exist, though they typically do; e.g. in an Oracle or OpenJDK implementation.)
There is a data structure in a running JVM called the string pool. The string pool is NOT mentioned by name in the JLS, but its existence is implied by string literal properties as specified by the JLS. The string pool is mentioned in the javadocs, and the JVM specification.
The string pool will contain the String objects that represent the values of any string-valued constant expression used in an application. This includes string literals.
The string pool has always been primarily a de-duping mechanism for strings. Applications are able to use this by calling the String.intern method.
The string values in the Constant Pool (see above) are used to create the String objects that the application sees:
1. A String object is created from the representation.
2. String.intern is called, returning the corresponding de-duped String object from the string pool.
3. That string becomes part of the class's Runtime Constant Pool; i.e. the Runtime Constant Pool for a class will include a reference to the String object in the string pool.
This process can happen eagerly or lazily depending on the Java implementation.
The string pool is and has always been stored in the (or a) heap.
Prior to Java 7, string objects in the string pool were allocated in a special heap called the PermGen heap. In the earliest versions of Java it wasn't GC'ed. Then it was GC'ed only occasionally.
In Java 7 (not 8!) the string pool stopped using the PermGen heap and used the regular heap instead.
In Java 8 the PermGen heap was replaced (for some purposes!) by a different storage management mechanism called the Metaspace. Apparently, Metaspace doesn't hold Java objects. Rather, it holds code segments, class descriptors and other JVM internal data structures.
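In recent versions of Java (i.e. Java 8u20 and later) the G1 garbage collector has another, optional mechanism for de-duping strings that survive a given number of GC cycles (enabled with -XX:+UseStringDeduplication). It is independent of the string pool.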
The behavior of strings (i.e. which ones are interned and which ones are not) is determined by the relevant parts of the JLS and the javadocs for the String class.
All of the complexity is irrelevant if you follow one simple rule:
Never use == to compare strings. Always use equals.
Now to deal with your example code:
String str1 = "Abc"; // string pool
String str2 = "XYZ"; // string pool
String str3 = str1 + str2; // not string pool (!!)
String str3a = "Abc" + "XYZ"; // string pool
String str4 = new String("PQR"); // not string pool (but the "PQR" literal is)
String str5 = str1.concat(str4); // not string pool
String str6 = str1 + str4; // not string pool
String str7 = str6.intern(); // string pool
Why?
The values assigned to str1, str2 and str3a are all values of constant expressions; see below.
The value assigned to str3 is not the value of a constant expression according to the JLS.
str4 - the JLS says that the new operator always creates a new object, and new strings are not automatically interned.
str5 - string operations apart from intern do not create objects in the string pool
str6 - ditto - equivalent to a concat call. The JLS also says that + produces a new string (except in the constant expression case).
str7 - the exception: see above. The intern call returns an object in the string pool.
Constant expressions include literals, concatenations involving literals, values of static final String constants, and a few other things. See JLS 15.28 for the complete list, but bear in mind that the string pool only holds string values.
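The difference between a constant expression and a run-time concatenation can be sketched as follows. The class and method names are made up; the only assumption is that a final variable initialised with a literal counts as a constant variable, which is what the JLS specifies.

```java
class ConstantExpressionDemo {
    static final String PREFIX = "Abc"; // constant variable: final and initialised with a literal

    // Both operands are constants, so the compiler folds this into the literal "AbcXYZ".
    static String foldedAtCompileTime() {
        final String suffix = "XYZ";
        return PREFIX + suffix;
    }

    // The parameter is not a constant, so the concatenation happens at run time.
    static String concatenatedAtRuntime(String suffix) {
        return PREFIX + suffix;
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime() == "AbcXYZ");              // true: pooled constant
        System.out.println(concatenatedAtRuntime("XYZ") == "AbcXYZ");       // false: new object
        System.out.println(concatenatedAtRuntime("XYZ").equals("AbcXYZ"));  // true: same characters
    }
}
```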
The precise behavior of intern depends on the Java version. Consider this example:
char[] chars = // create an array of random characters
String s1 = new String(chars);
String s2 = s1.intern();
Let us assume that the random characters do not correspond to any previously interned string.
For older JVMs where interned strings were allocated in PermGen, the intern call in the example will (must) produce a new String object.
For newer JVMs, the intern can add the existing String object to the string pool data structure without having to create a new String object.
In other words, the truth of s1 == s2 depends on the Java version.
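A runnable version of that experiment might look like the sketch below. It uses a random UUID as a stand-in for the "random characters", on the assumption that such a value has not been interned before; the class and method names are hypothetical. The identity result is printed rather than asserted, because it depends on the JVM.

```java
import java.util.UUID;

class InternIdentityDemo {
    // Returns { s1, s2 } where s1 is a freshly built string and s2 is the result of interning it.
    static String[] internFresh() {
        char[] chars = UUID.randomUUID().toString().toCharArray();
        String s1 = new String(chars);
        String s2 = s1.intern();
        return new String[] { s1, s2 };
    }

    public static void main(String[] args) {
        String[] pair = internFresh();
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("equals:       " + pair[0].equals(pair[1])); // always true
        System.out.println("same object:  " + (pair[0] == pair[1]));    // JVM dependent
    }
}
```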

We know that a String is stored in the SCP; is a String[] also stored in the SCP? And what about ArrayList<String>?

We know that a String will be stored in the SCP (String Constant Pool) area:
1. Is a String[] also stored in the SCP in the same way? I mean, each array contains Strings, so will these again be stored in the SCP?
2. What about ArrayList<String>? Each ArrayList contains Strings, so will these again be stored in the SCP?
In our project we are facing an OutOfMemoryError. We have more than 1000 String[] instances, and the values in each String[] are different every time, so (we are guessing) a huge number of objects is being created in the SCP. We want to change these to ArrayList<String> to reduce memory. But if each String in an ArrayList<String> is also stored in the SCP area, there is no point in changing from String[] to ArrayList<String>.
Please explain in detail. Your response is valuable.
No. Not all Strings are stored in the String Constant Pool. Only String literals and interned Strings are stored there.
String s = "abc"; // stores "abc" in the SCP if it is not already present.
String s1 = "abc"; // "abc" is already in the SCP, so the pooled object is reused.
String s2 = "abc"; // likewise reuses the pooled "abc"; nothing new is stored.
String s3 = s1 + s2; // "abcabc" is a new object on the heap, not in the SCP.
String[] is just an array that holds references to String objects.
So, I think there is some other problem somewhere else.
Also, from Oracle doc :
In JDK 7, interned strings are no longer allocated in the permanent
generation of the Java heap, but are instead allocated in the main
part of the Java heap (known as the young and old generations), along
with the other objects created by the application. This change will
result in more data residing in the main Java heap, and less data in
the permanent generation, and thus may require heap sizes to be
adjusted. Most applications will see only relatively small differences
in heap usage due to this change, but larger applications that load
many classes or make heavy use of the String.intern() method will see
more significant differences.
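The point that a String[] merely holds references can be checked with a short sketch (names are illustrative only). Copying an array into an ArrayList copies the references, not the String objects, so both containers point at the same objects whether or not those objects are pooled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReferenceHoldingDemo {
    // True if both containers hold references to exactly the same String objects, in order.
    static boolean sameObjects(String[] array, List<String> list) {
        if (array.length != list.size()) {
            return false;
        }
        for (int i = 0; i < array.length; i++) {
            if (array[i] != list.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // One pooled literal and one ordinary heap string created at run time.
        String[] array = { "literal", new String("runtime") };
        List<String> list = new ArrayList<>(Arrays.asList(array));

        System.out.println(sameObjects(array, list)); // true: no String was copied or pooled
    }
}
```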
It's the String literals that are stored in the pool. For everything else you ask about, only a reference is held. So a String[] holds references to String objects, which may include pooled objects as well. Similarly, an ArrayList holds references.
Changing [] to ArrayList won't make a difference.
Changing these Strings to StringBuffer/StringBuilder may help instead, because appending to them does not create a whole new String object for every + (concat) operation.
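As a rough illustration of the StringBuilder suggestion (a sketch with invented names, not a measurement): both methods below produce the same text, but the first creates a new intermediate String on every loop iteration while the second appends into one growing buffer.

```java
class ConcatenationDemo {
    // Each iteration allocates a new String holding everything joined so far.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result = result + part;
        }
        return result;
    }

    // Appends into a single buffer and creates one String at the end.
    static String joinWithBuilder(String[] parts) {
        StringBuilder builder = new StringBuilder();
        for (String part : parts) {
            builder.append(part);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String[] parts = { "a", "b", "c" };
        System.out.println(joinWithPlus(parts));    // abc
        System.out.println(joinWithBuilder(parts)); // abc
    }
}
```

Whether this helps with an OutOfMemoryError depends on where the memory actually goes; a heap dump is the way to confirm that.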

When does the String intern() method get called?

Case 1:
String str = "StackOverFlow";
String str1 = "StackOverFlow";
if(str==str1){
System.out.println("equal");//prints equal
}
Case 2:
String str = "StackOverFlow";
String str1=str.intern();
if(str==str1){
System.out.println("equal");//prints equal
}
Follow-up questions:
1. In the first case, does the JVM call intern() internally and assign the reference of str to str1?
2. How are the two references equal in the first case?
3. Does the first case mean that whenever you declare a string like String str = "StackOverFlow"; it is added to the string pool in the same way as with the intern() method?
4. Is the string pool used by String str = "StackOverFlow"; and intern() allocated outside of the heap? If yes, where exactly?
For question 4 the answer is as below:
In Java 6 and earlier, interned strings were also stored in the permanent generation. In Java 7, interned strings are stored in the main object heap.
Here is what documentation says:
In JDK 7, interned strings are no longer allocated in the permanent
generation of the Java heap, but are instead allocated in the main
part of the Java heap (known as the young and old generations), along
with the other objects created by the application. This change will
result in more data residing in the main Java heap, and less data in
the permanent generation, and thus may require heap sizes to be
adjusted. Most applications will see only relatively small differences
in heap usage due to this change, but larger applications that load
many classes or make heavy use of the String.intern() method will see
more significant differences.
Further details from here:
String.intern() in Java 6
In those good old days all interned strings were stored in the PermGen
– the fixed size part of heap mainly used for storing loaded classes
and string pool. Besides explicitly interned strings, PermGen string
pool also contained all literal strings earlier used in your program
(the important word here is used – if a class or method was never
loaded/called, any constants defined in it will not be loaded).
The biggest issue with such string pool in Java 6 was its location – the PermGen. PermGen has a fixed size and can not be expanded at
runtime. You can set it using -XX:MaxPermSize=96m option. As far as I
know, the default PermGen size varies between 32M and 96M depending on
the platform. You can increase its size, but its size will still be
fixed. Such limitation required very careful usage of String.intern –
you’d better not intern any uncontrolled user input using this method.
That’s why string pooling at times of Java 6 was mostly implemented in
the manually managed maps.
String.intern() in Java 7
Oracle engineers made an extremely important change to the string
pooling logic in Java 7 – the string pool was relocated to the heap.
It means that you are no longer limited by a separate fixed size
memory area. All strings are now located in the heap, as most of other
ordinary objects, which allows you to manage only the heap size while
tuning your application. Technically, this alone could be a sufficient
reason to reconsider using String.intern() in your Java 7 programs.
But there are other reasons.
The references are equal because they are both String literals.
The intern() call on str is not needed because it is also a literal. An example of when you would need to use intern() (which, by the way, is a lot slower than equals(), so don't use it) would be when constructing a String from a byte or char array.
For example:
final String str1 = "I am a literal";
final String str2 = new String(str1.toCharArray());
final boolean check1 = str1 == str2; // false
final boolean check2 = str1 == str2.intern(); // true
1) I want to know whether for the first case JVM calls intern() internally and assign the reference of str to str1?
Well, yes and no.
Yes the intern() method is called internally. But the call doesn't happen when that code is run. In fact, it happens when that code is loaded. The loader then saves the reference to the interned String.
But in this case, the loading process only needs to do the interning once. The two literals (in this case) will actually be represented by a single "constant pool entry" in the class that is being loaded. (The Java compiler will have spotted the duplicate literals in the class ... at compile time ... and eliminated it.)
2) how two references equal in the first case?
Because the two strings have been interned.
3) Does the first case means whenever you declare a string like String str = "StackOverFlow"; it adds to the pool of string as same as that of intern() method?
Yes ... modulo that the interning doesn't happen at the point when the code containing the declaration is run.
4) Does String pool which is used by String str = "StackOverFlow"; and intern() is allocated outside of heap? if yes where exactly?
The answer is somewhat system dependent. In general, the string pool is in the heap. On some systems the heap is divided into regions or spaces which have different garbage collection policies, and the string pool is allocated in the so-called "permgen" space that is sized independently of the rest of the heap. But this is not always true.
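The sharing of literals is not limited to a single class. As a small sketch (all class names are hypothetical), two unrelated classes that contain the same literal end up referring to the same pooled String object:

```java
class SharedLiteralDemo {
    public static void main(String[] args) {
        // Equal literals in different classes resolve to the same pooled object.
        System.out.println(FirstHolder.value() == SecondHolder.value()); // true
    }
}

class FirstHolder {
    static String value() {
        return "StackOverFlow";
    }
}

class SecondHolder {
    static String value() {
        return "StackOverFlow";
    }
}
```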
I want to know whether for the first case JVM calls intern() internally
No.
and assign the reference of str to str1?
Yes, but because the value is a literal, not because of interning. There is only one instance of it in the .class file in the first place.
how two references equal in the first case?
That's not another question, just the same question re-stated.
Does the first case means whenever you declare a string like String str = "StackOverFlow"; it adds to the pool of string as same as that of intern() method?
Yes, but it's done by the compiler, not intern().
Does String pool which is used by String str = "StackOverFlow"; and intern() is allocated outside of heap?
No.
where exactly?
In the constant pool of the loaded class, which is in the heap.
There are broadly two ways of creating string objects.
The first is
String str = "test-String"; // This string is created in the string pool, or returned from the string pool if it already exists
The second is
String str = new String("test-string"); // This string object is created in heap memory and is treated like any other object
When you call intern() on a string instance created with the new operator, the call returns the equal string from the string pool, adding it to the pool first if it is not already there. In Java 6 and earlier this effectively put a copy of the string in PermGen, where the string pool lived; from Java 7 onwards the pool is in the main heap.

How to clear the entry from String Literal Pool [duplicate]

Possible Duplicate:
Garbage collection behaviour for String.intern()
How does Java store Strings and how does substring work internally?
As far as I understand, setting a String reference to null doesn't delete the entry from the String literal pool, and I want to know how we can clear it.
String object="csk"; // creates an object in the Java heap and makes an entry in the String literal pool
object=null; // however, this only sets the reference to null;
// it doesn't delete the entry from the String literal pool (or does it? I doubt it)
String literals (held weakly, much like entries in a WeakHashMap) are also stored in heap memory, in the so-called "permgen" heap in Java 6 and earlier.
The JVM needs to be configured to find and collect dynamically loaded classes that are no longer needed; unloading such classes may cause their String literals to be garbage collected.
This can also happen when the JVM performs a full GC.
An excerpt from How does Java store Strings and how does substring work internally?:
Strings in the pool can be garbage collected (meaning that a string literal might be removed from the pool at some stage if it becomes full)

Does Immutability of Strings in Java cause Out Of Memory

I have written a simple Java program that reads a million rows from the Database and writes them to a File.
The max memory that this program can use is 512M.
I frequently notice that this program runs Out Of Memory for more than 500K rows.
Since the program is very simple, it is easy to verify that it doesn't have a memory leak. The program fetches a thousand rows from the database, writes them to a file using streams, and then fetches the next thousand rows. The size of each row varies, but none of the rows is huge. In a heap dump taken while the program is running, the older strings are easily seen on the heap. These strings are unreachable, which means they are waiting to be garbage collected. I also believe that the GC doesn't necessarily run during the execution of this program, which leaves strings in the heap longer than they should be.
I think the solution would be to use long char arrays (or StringBuffer) instead of String objects to store the lines returned by the DB. The assumption is that I can overwrite the contents of a char array, which means the same char array can be reused across multiple iterations without having to allocate new space each time.
Pseudocode :
Create an Array of Arrays using new char[1000][1000];
Fill the thousand rows from DB to the Array.
Write Array to File.
Use the same Array for next thousand rows
If the above pseudocode fixes my problem, then in reality the immutable nature of the String class hurts the Java programmer, as there is no direct way to reclaim the space used by a String even though the String is no longer in use.
Are there any better alternatives for this problem?
P.S.: I didn't do a static analysis alone. I used the YourKit profiler to examine a heap dump. The dump clearly says 96% of the Strings have no GC roots, which means they are waiting to be garbage collected. Also, I don't use substring in my code.
Immutability of the class String has absolutely nothing to do with OutOfMemoryError. Immutability means that it cannot ever change, only that.
If you run out of memory, it is simply because the garbage collector was unable to find any garbage to collect.
In practice, it is likely that you are holding references to way too many Strings in memory (for instance, do you have any kind of collection holding strings, such as List, Set, Map?). You must destroy these references to allow the garbage collector to do its job and free up some memory.
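A sketch of the "don't hold references" approach, under stated assumptions: the Iterator<String> below is a hypothetical stand-in for the database cursor, and the class and method names are invented. Each row is written as soon as it is read and is never added to a collection, so it becomes garbage immediately afterwards.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class RowStreamingDemo {
    // Writes each row straight to the file; no row is retained after it has been written.
    static long writeRows(Iterator<String> rows, Path target) throws IOException {
        long count = 0;
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            while (rows.hasNext()) {
                out.write(rows.next());
                out.newLine();
                count++;
            }
        }
        return count;
    }

    // Generates fake rows, writes them to a temp file, and returns the number of lines read back.
    static long roundTrip(int rowCount) {
        try {
            Path target = Files.createTempFile("rows", ".txt");
            try {
                Iterator<String> rows =
                        IntStream.range(0, rowCount).mapToObj(i -> "row " + i).iterator();
                writeRows(rows, target);
                try (Stream<String> lines = Files.lines(target, StandardCharsets.UTF_8)) {
                    return lines.count();
                }
            } finally {
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(1000) + " rows written and read back");
    }
}
```

With real JDBC code the same shape applies: iterate the result set, write each row, and avoid accumulating rows in a List or Map between batches.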
The simple answer to this question is 'no'. I suspect you're hanging onto references longer than you think.
Are you closing those streams properly? Are you intern()ing those strings? In Java 6 and earlier that results in a copy of the string being placed in the string pool if it doesn't exist already, taking up permgen space (which is collected only rarely). Are you taking substring() of a larger string? Before Java 7u6, strings created using substring() shared the character array of the original string (a flyweight-style optimisation), so a small substring could keep a large array alive. See here for more details.
You suggest that garbage collection isn't running. The option -verbose:gc will log the garbage collections and you can see immediately what's going on.
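If changing the JVM command line is inconvenient, collection activity can also be read from inside the program through the standard java.lang.management API. This is only a sketch with an invented class name, and it complements rather than replaces -verbose:gc; the collector names and counts it prints depend on the JVM and its configuration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcActivityDemo {
    // Sums the collection counts of all collectors; a count of -1 means "undefined" and is skipped.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
        System.out.println("total collections so far: " + totalCollections());
    }
}
```

Calling totalCollections() before and after a batch of rows gives a quick indication of whether the collector ran during that batch.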
The only thing about strings which can cause an OutOfMemoryError is if you retain small sections of a much larger string. If you are doing this it should be obvious from a heap dump.
When you take a heap dump, I suggest you only look at live objects; any retained objects you don't need are then most likely the result of a bug in your code.
