String pool - do String always exist in constant pool? - java

When string is created by using literal, it gets stored in pool. But when new operator is used to create String object, it stores the object in Heap.
But is the object in heap just a pointer to literal stored in pool or is it a simple String object stored in heap which is eligible for GC?

Terminology:
The Constant Pool is an area in (each) .class file that contains various constants, including strings. No runtime objects exist in the constant pool. It is a region of a file.
The String Pool is a runtime data structure used by the JVM to manage certain kinds of strings. (Specifically, String objects that correspond to literals, and String objects added to the pool by String::intern().)
Your question is actually talking about the String Pool, not the Constant Pool.
To answer your questions:
String pool - do String always exist in constant pool?
No. A string object created using new String() doesn't exist in either the string pool or the constant pool.
When string is created by using literal, it gets stored in pool.
The string (already!) exists in the constant pool and gets created as a Java String object in the string pool. (The actual creation can be at class load time, or when the literal is first used. This depends on the Java implementation.)
But when new operator is used to create String object, it stores the object in Heap.
Yes. But the string pool is also part of the Heap. Like I said, it is a data structure, not a region of storage.
(In the old days, the string pool lived in a special heap called the PermGen heap. But PermGen was replaced with something else (Metaspace), and the string pool doesn't use either ... anymore.)
But is the object in heap just a pointer to literal stored in pool or is it a simple String object stored in heap which is eligible for GC?
This is really confused.
All strings are represented as String objects in the (a) heap. That includes strings in the string pool. Even when the string pool was in PermGen.
All String objects that are unreachable are eligible for garbage collection. Even for strings in the string pool. Even for String objects that represent string literals.
But ... wait ... so can string literals be garbage collected?
Yes!! If a String object that represents a string literal becomes unreachable at runtime it is eligible for garbage collection, just like any other String object.
A string literal can become unreachable if the code object(s) that use the literal become unreachable. It can happen, when a classloader becomes unreachable.
And yes, PermGen was garbage collected. At least since JDK 1.2. (IIRC Java 1.0 and maybe 1.1 didn't implement GC for the PermGen heap. But that was fixed a long time ago.)
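The distinction between pooled and non-pooled String objects can be observed with reference comparisons. The following is only a minimal sketch; the class and method names are made up for illustration and are not part of the original answer.

```java
public class StringPoolIdentity {

    // Two occurrences of the same literal refer to one pooled String object.
    static boolean literalsShareOneObject() {
        String a = "abc";
        String b = "abc";
        return a == b;
    }

    // new String(...) always creates a separate object outside the pool,
    // even though its contents are equal to the literal.
    static boolean newStringIsSeparateObject() {
        String a = "abc";
        String c = new String("abc");
        return a != c && a.equals(c);
    }

    // intern() returns the pooled object with the same contents.
    static boolean internReturnsPooledObject() {
        String a = "abc";
        String c = new String("abc");
        return c.intern() == a;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());     // prints true
        System.out.println(newStringIsSeparateObject());  // prints true
        System.out.println(internReturnsPooledObject());  // prints true
    }
}
```

The object created with new is an ordinary heap object: once nothing refers to it, it can be collected regardless of what happens to the pooled "abc".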

Related

How can I destroy reference from String pool in Java?

Strings are immutable. When I declare:
String a = "abc";
String a1 = "abc";
Both objects refer to same location. So how can I destroy this "abc" reference from String pool?
My use case is that I am developing a hardware application with less memory and for this, I need to clear the references from String pool to save memory.
No, typically you can not "destroy reference from String pool in Java" manually.
I suppose the main reason you want this is to avoid out-of-memory errors. In the Java 6 days, all interned strings were stored in PermGen, the fixed-size part of the heap mainly used for storing loaded classes and the string pool. Besides explicitly interned strings, the PermGen string pool also contained all literal strings previously used in your program. The biggest issue with the string pool in Java 6 was its location: PermGen has a fixed size and cannot be expanded at runtime. You can set it using the -XX:MaxPermSize=N option. This could lead to memory leaks and out-of-memory errors.
In Java 7 – the string pool was relocated to the heap. It means that you are no longer limited by a separate fixed size memory area. All strings are now located in the heap, as most of other ordinary objects.
You may also increase the number of buckets in the string pool's hash table by configuring -XX:StringTableSize=N.
If you are not sure about the string pool usage, try -XX:+PrintStringTableStatistics JVM argument. It will print you the string pool usage when your program terminates.
In JDK, there is also a tool named jmap which can be used to find out number of interned strings in your application.
jmap -heap process_id
Eg:
jmap -heap 18974
Along with other output, this command also outputs number of interned strings and the space they occupy "xxxxxx interned Strings occupying xxxxxx bytes."
The rules for garbage collection of objects in the String pool are the same as for other Strings or any other object. However, the String objects that correspond to String literals are almost always reachable, since there is an implicit reference to the string object in the code of every method that uses the literal, and so typically they are not candidates for garbage collection.
However, this is not always the case. If the literal was defined in a class that was dynamically loaded (e.g. using Class.forName(...)), then it is possible to arrange that the class is unloaded. If that happens, then the String object for the literal will be unreachable, and will be reclaimed when the heap containing the interned String gets GC'ed.

java string literal and new operator [duplicate]

I know the concept of a constants pool and the String constant pool used by JVMs to handle String literals. But I don't know which type of memory is used by the JVM to store String constant literals. The stack or the heap? Since its a literal which is not associated with any instance I would assume that it will be stored in stack. But if it's not referred by any instance the literal has to be collected by GC run (correct me if I am wrong), so how is that handled if it is stored in the stack?
The answer is technically neither. According to the Java Virtual Machine Specification, the area for storing string literals is in the runtime constant pool. The runtime constant pool memory area is allocated on a per-class or per-interface basis, so it's not tied to any object instances at all. The runtime constant pool is a subset of the method area which "stores per-class structures such as the runtime constant pool, field and method data, and the code for methods and constructors, including the special methods used in class and instance initialization and interface type initialization". The VM spec says that although the method area is logically part of the heap, it doesn't dictate that memory allocated in the method area be subject to garbage collection or other behaviors that would be associated with normal data structures allocated to the heap.
As explained by this answer, the exact location of the string pool is not specified and can vary from one JVM implementation to another.
It is interesting to note that until Java 7, the pool was in the permgen space of the heap on hotspot JVM but it has been moved to the main part of the heap since Java 7:
Area: HotSpot
Synopsis: In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
RFE: 6962931
And in Java 8 Hotspot, Permanent Generation has been completely removed.
String literals are not stored on the stack. Never. In fact, no objects are stored on the stack.
String literals (or more accurately, the String objects that represent them) were historically stored in a heap called the "permgen" heap. (Permgen is short for permanent generation.)
Under normal circumstances, String literals and much of the other stuff in the permgen heap are "permanently" reachable, and are not garbage collected. (For instance, String literals are always reachable from the code objects that use them.) However, you can configure a JVM to attempt to find and collect dynamically loaded classes that are no longer needed, and this may cause String literals to be garbage collected.
CLARIFICATION #1 - I'm not saying that Permgen doesn't get GC'ed. It does, typically when the JVM decides to run a Full GC. My point is that String literals will be reachable as long as the code that uses them is reachable, and the code will be reachable as long as the code's classloader is reachable, and for the default classloaders, that means "for ever".
CLARIFICATION #2 - In fact, Java 7 and later uses the regular heap to hold the string pool. Thus, String objects that represent String literals and intern'd strings are actually in the regular heap. (See @assylias's answer for details.)
But I am still trying to find out thin line between storage of string literal and string created with new.
There is no "thin line". It is really very simple:
String objects that represent / correspond to string literals are held in the string pool.
String objects that were created by a String::intern call are held in the string pool.
All other String objects are NOT held in the string pool.
Then there is the separate question of where the string pool is "stored". Prior to Java 7 it was the permgen heap. From Java 7 onwards it is the main heap.
String pooling
String pooling (sometimes also called string canonicalisation) is a
process of replacing several String objects with equal value but
different identity with a single shared String object. You can achieve
this goal by keeping your own Map (with possibly soft
or weak references depending on your requirements) and using map
values as canonicalised values. Or you can use String.intern() method
which is provided to you by JDK.
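A manually managed canonicalising map of the kind described above might look roughly like this. It is a sketch under assumptions: the class name ManualStringPool and its methods are hypothetical, and a WeakHashMap with WeakReference values is just one possible choice of reference strength.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical hand-rolled string pool: equal strings are replaced by one
// shared instance, and entries disappear once that instance is unreachable.
public final class ManualStringPool {

    // Keys are held weakly by WeakHashMap; values are wrapped in a
    // WeakReference so the map does not keep its own keys alive.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    public synchronized String canonicalize(String s) {
        WeakReference<String> ref = pool.get(s);
        String cached = (ref == null) ? null : ref.get();
        if (cached != null) {
            return cached;               // reuse the shared instance
        }
        pool.put(s, new WeakReference<>(s));
        return s;                        // s becomes the canonical instance
    }

    public synchronized int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        ManualStringPool pool = new ManualStringPool();
        String first = new String("token");
        String second = new String("token");
        System.out.println(pool.canonicalize(first) == first);   // prints true
        System.out.println(pool.canonicalize(second) == first);  // prints true
    }
}
```

Unlike String.intern(), such a map is entirely under the application's control, so it can be cleared or dropped as a whole.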
In the days of Java 6, using String.intern() was forbidden by many
standards due to a high possibility of getting an OutOfMemoryError if
pooling went out of control. The Oracle Java 7 implementation of string
pooling was changed considerably. You can look for details in
http://bugs.sun.com/view_bug.do?bug_id=6962931 and
http://bugs.sun.com/view_bug.do?bug_id=6962930.
String.intern() in Java 6
In those good old days all interned strings were stored in the PermGen
– the fixed size part of heap mainly used for storing loaded classes
and string pool. Besides explicitly interned strings, PermGen string
pool also contained all literal strings earlier used in your program
(the important word here is used – if a class or method was never
loaded/called, any constants defined in it will not be loaded).
The biggest issue with such string pool in Java 6 was its location –
the PermGen. PermGen has a fixed size and can not be expanded at
runtime. You can set it using -XX:MaxPermSize=96m option. As far as I
know, the default PermGen size varies between 32M and 96M depending on
the platform. You can increase its size, but its size will still be
fixed. Such limitation required very careful usage of String.intern –
you’d better not intern any uncontrolled user input using this method.
That’s why string pooling in the days of Java 6 was mostly implemented
with manually managed maps.
String.intern() in Java 7
Oracle engineers made an extremely important change to the string
pooling logic in Java 7 – the string pool was relocated to the heap.
It means that you are no longer limited by a separate fixed size
memory area. All strings are now located in the heap, as most of other
ordinary objects, which allows you to manage only the heap size while
tuning your application. Technically, this alone could be a sufficient
reason to reconsider using String.intern() in your Java 7 programs.
But there are other reasons.
String pool values are garbage collected
Yes, all strings in the JVM string pool are eligible for garbage
collection if there are no references to them from your program roots.
It applies to all discussed versions of Java. It means that if your
interned string went out of scope and there are no other references to
it – it will be garbage collected from the JVM string pool.
Being eligible for garbage collection and residing in the heap, a JVM
string pool seems to be the right place for all your strings, doesn’t it?
In theory it is true – unused strings will be garbage collected from
the pool, and used strings will allow you to save memory in case you
get an equal string from the input. Seems to be a perfect memory
saving strategy? Nearly so. You must know how the string pool is
implemented before making any decisions.
source.
As other answers explain, memory in Java is divided into two portions:
1. Stack: One stack is created per thread. It stores stack frames, which in turn store local variables; if a variable is of a reference type, it refers to a memory location in the heap where the actual object lives.
2. Heap: All kinds of objects are created in the heap only.
Heap memory is again divided into 3 portions (this describes HotSpot before Java 8, when the Permanent Generation was removed):
1. Young Generation: Stores objects which have a short life. The Young Generation itself can be divided into two categories: Eden Space and Survivor Space.
2. Old Generation: Stores objects which have survived many garbage collection cycles and are still being referenced.
3. Permanent Generation: Stores metadata about the program, e.g. the runtime constant pool.
Before Java 7, the string constant pool belonged to the permanent generation area of heap memory; from Java 7 onwards it is in the main part of the heap.
We can see the runtime constant pool for our code in the bytecode by using javap -verbose class_name which will show us method references (#Methodref), Class objects ( #Class ), string literals ( #String )
You can read more about it on my article How Does JVM Handle Method Overloading and Overriding Internally.
To the great answers already included here I want to add something that is missing from my perspective: an illustration.
As you already know, the JVM divides the memory allocated to a Java program into two parts: one is the stack and the other is the heap. The stack is used for execution and the heap is used for storage. In that heap memory, the JVM allocates some memory specially meant for string literals. This part of the heap memory is called the string constant pool.
So for example, if you init the following objects:
String s1 = "abc";
String s2 = "123";
String obj1 = new String("abc");
String obj2 = new String("def");
String obj3 = new String("456");
The String literals assigned to s1 and s2 will go to the string constant pool, and the objects obj1, obj2, obj3 to the heap. All of them will be referenced from the stack.
Also, please note that "abc" will appear both in the heap and in the string constant pool. Why are String s1 = "abc" and String obj1 = new String("abc") created this way? It's because String obj1 = new String("abc") explicitly creates a new and referentially distinct instance of a String object, while String s1 = "abc" may reuse an instance from the string constant pool if one is available. For a more elaborate explanation: https://stackoverflow.com/a/3298542/2811258

Confusion on string immutability

I have following code:-
public class StaticImplementer {
private static String str= "ABC";
public static void main(String args[]) {
str = str + "XYZ";
}
}
Questions:-
Here String str is static, then where this object will be stored in memory?
Since String is immutable, where the object for "XYZ" will be stored?
Where will be final object will be Stored?
And how will garbage collection will be done?
1) Here String str is static, then where this object will be stored in
memory?
Those literals will be stored in the String pool memory, no matter if the variable is declared as static or not.
More info here: Where does Java's String constant pool live, the heap or the stack?
2) Since String is immutable, where the object for "XYZ" will be stored?
Similar to the first answer: a literal will be stored in the pool memory.
Immutability just allows the concept of shared pool memory.
3) Where will be final object will be Stored?
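According to the Java specification, a concatenation of literals is itself a compile-time constant, and its value is stored in the pool memory. Note, however, that str in the question is not final, so str + "XYZ" is not a constant expression: its result, "ABCXYZ", is created at run time as an ordinary heap object and is not in the pool unless intern() is called on it.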
Excerpt:
"This is a " + // actually a string-valued constant expression,
"two-line string" // formed from two string literals
4) And how will garbage collection will be done?
By the nature of the pool memory, pooled literals are normally not garbage collected, because they remain reachable from the class that uses them.
Indeed, if they were garbage collected immediately, the "shared" concept of the pool would fail.
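The difference between a constant expression and a run-time concatenation can be checked with reference comparisons. This is a minimal sketch; the class ConcatDemo and the field CONST are assumptions added for illustration, while str mirrors the non-final static field from the question.

```java
public class ConcatDemo {

    // Mirrors the question: a non-final static field, so not a constant.
    private static String str = "ABC";

    // A static final String initialised with a literal is a constant variable.
    private static final String CONST = "ABC";

    // Literal + literal is folded by the compiler into one pooled constant.
    static boolean literalConcatIsPooled() {
        return ("ABC" + "XYZ") == "ABCXYZ";
    }

    // A constant variable + literal is also a constant expression.
    static boolean constantVariableConcatIsPooled() {
        return (CONST + "XYZ") == "ABCXYZ";
    }

    // Non-final field + literal is evaluated at run time and yields a new,
    // non-pooled String with equal contents.
    static boolean runtimeConcatIsNewObject() {
        String result = str + "XYZ";
        return result != "ABCXYZ" && result.equals("ABCXYZ");
    }

    // Interning the run-time result gives back the pooled instance.
    static boolean internedRuntimeConcatIsPooled() {
        return (str + "XYZ").intern() == "ABCXYZ";
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsPooled());          // prints true
        System.out.println(constantVariableConcatIsPooled()); // prints true
        System.out.println(runtimeConcatIsNewObject());       // prints true
        System.out.println(internedRuntimeConcatIsPooled());  // prints true
    }
}
```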
Here String str is static, then where this object will be stored in
memory?
String str is not an object, it's a reference to an object. "ABC", "XYZ" & "ABCXYZ" are three distinct String objects. Thus, str points to a string. You can change what it points to, but not that which it points at.
Since String is immutable, where the object for "XYZ" will be stored?
As explained in above & also by Mik378, "XYZ" is just a String object which gets saved in the String pool memory and the reference to this memory is returned when "XYZ" is declared or assigned to any other object.
Where will be final object will be Stored?
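The final object, "ABCXYZ", is created at run time by the concatenation, because str is not a compile-time constant. It is an ordinary String object on the heap, and it is not added to the pool memory unless intern() is called on it. The reference to it is assigned back to str.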
And how will garbage collection will be done?
String literals are interned. As of Java 7, the HotSpot JVM puts interned Strings in the heap, not permgen. In earlier versions of Java, JVM placed interned Strings in permgen. However, Strings in permgen were garbage collected. Apparently, Class objects in permgen are also collectable, so everything in permgen is collectable, though permgen collection might not be enabled by default in some old JVMs.
String literals, being interned, would be a reference held by the declaring Class object to the String object in the intern pool. So the interned literal String would only be collected if the Class object that referred to it were also collected.
Shishir

Garbage collection and Strings

I have some doubts regarding Strings
Do they live on the heap or in the String pool?
And if on the heap, then will they be garbage collected if they are not reachable by any live thread?
And if in the String pool, then how will they be deleted or removed, because as we know garbage collection happens only on the heap?
String s = new String("abc");
The string object referred to by s will be on the heap and the string literal "abc" will be in the string pool. The objects in the string pool will typically not be garbage collected while the class that uses them stays loaded. They are there to be reused during the lifetime of the program, to improve performance.
They are all stored in the heap, but intern()ed strings (including string literals in the source) are referenced from a pool in the String class.
If they appear as literals in the source code, including constant string expressions (e.g. "a" + "b"), then they will also be referenced from the Class they appear in, which usually means they will last as long as the process runs.
Edit:
When you call intern() on a string in your code it is also added to this pool, but because it uses weak references the string can still be garbage collected if it is no longer in use.
See also:
interned Strings : Java Glossary
Quote from that article:
The collection of Strings registered in this HashMap is sometimes called the String pool. However, they are ordinary Objects and live on the heap just like any other (perhaps in an optimised way since interned Strings tend to be long lived).
String Alex goes into the literal pool, stays there as long as the process runs (or the web application remains loaded.) as told by finnw and is never garbage collected. String name2 doesn't allocate memory for "Alex" and reuses it from the literal pool.
PS: Literal pool is on the heap as well.
For string John two objects are created with reference name3 and name4 which are garbage collectible.
String name = "Alex";
String name2 = "Alex";
String name3 = new String("John");
String name4 = new String("John");
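To make the object count in this example concrete, the references can be collected into an identity-based set. This is only a sketch; the class DistinctStringObjects and its helper methods are hypothetical names introduced here.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class DistinctStringObjects {

    // Counts distinct objects by identity (==), not by equals().
    static int countDistinct(String... refs) {
        Set<String> seen =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        Collections.addAll(seen, refs);
        return seen.size();
    }

    // The four variables refer to three objects: one pooled "Alex"
    // and two separate heap copies of "John".
    static int objectsBehindVariables() {
        String name = "Alex";
        String name2 = "Alex";
        String name3 = new String("John");
        String name4 = new String("John");
        return countDistinct(name, name2, name3, name4);
    }

    // The literal "John" passed to the constructors is itself a fourth,
    // pooled object.
    static int objectsIncludingJohnLiteral() {
        String name = "Alex";
        String name2 = "Alex";
        String name3 = new String("John");
        String name4 = new String("John");
        return countDistinct(name, name2, name3, name4, "John");
    }

    public static void main(String[] args) {
        System.out.println(objectsBehindVariables());      // prints 3
        System.out.println(objectsIncludingJohnLiteral()); // prints 4
    }
}
```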
