Strings are immutable. When I declare:
String a = "abc";
String a1 = "abc";
Both objects refer to same location. So how can I destroy this "abc" reference from String pool?
My use case is that I am developing a hardware application with less memory and for this, I need to clear the references from String pool to save memory.
No, typically you can not "destroy reference from String pool in Java" manually.
The main reason I suppose why you are targeting it is to avoid out of memory errors. In Java 6 days all interned strings were stored in the PermGen – the fixed size part of heap mainly used for storing loaded classes and string pool. Besides explicitly interned strings, PermGen string pool also contained all literal strings earlier used in your program. The biggest issue with string pool in Java 6 was its location – the PermGen. PermGen has a fixed size and can not be expanded at runtime. You can set it using -XX:MaxPermSize=N option. This would lead to memory leaks and out of memory errors.
In Java 7 – the string pool was relocated to the heap. It means that you are no longer limited by a separate fixed size memory area. All strings are now located in the heap, as most of other ordinary objects.
You may also increase the string pool size by configuring -XX:StringTableSize=N.
If you are not sure about the string pool usage, try -XX:+PrintStringTableStatistics JVM argument. It will print you the string pool usage when your program terminates.
In JDK, there is also a tool named jmap which can be used to find out number of interned strings in your application.
jmap -heap process_id
Eg:
jmap -heap 18974
Along with other output, this command also outputs number of interned strings and the space they occupy "xxxxxx interned Strings occupying xxxxxx bytes."
The rules for garbage collection of objects in the String pool are the same as for other Strings or any other object. But the fact that the String objects that correspond to String literals mostly are always reachable since there is an implicit reference to the string object in the code of every method that uses the literal and so typically they are not candidates for garbage collection.
However, this is not always the case. If the literal was defined in a class that was dynamically loaded (e.g. using Class.forName(...)), then it is possible to arrange that the class is unloaded. If that happens, then the String object for the literal will be unreachable, and will be reclaimed when the heap containing the interned String gets GC'ed.
Related
When string is created by using literal, it gets stored in pool. But when new operator is used to create String object, it stores the object in Heap.
But is the object in heap just a pointer to literal stored in pool or is it a simple String object stored in heap which is eligible for GC?
Terminology:
The Constant Pool is an area in (each) .class file that contains various constants, including strings. No runtime objects exist in the constant pool. It is a region of a file.
The String Pool is a runtime data structure used by the JVM to manage certain kinds of strings. (Specifically, String objects that correspond to literals, and String objects added to the pool by String::intern().)
Your question is actually talking about the String Pool, not the Constant Pool.
To answer your questions:
String pool - do String always exist in constant pool?
No. A string object created using new String() doesn't exist in either the string pool or the constant pool.
When string is created by using literal, it gets stored in pool.
The string (already!) exists the constant pool and gets created as a Java String object in the string pool. (The actual creation can be at class load time, or when the literal is first used. This depends on the Java implementation.)
But when new operator is used to create String object, it stores the object in Heap.
Yes. But the string pool is also part of the Heap. Like I said, it is a data structure, not a region of storage.
(In the old days, the string pool lived in a special heap called the PermGen heap. But PermGen was replaced with something else (MetaSpace), and the string pool doesn't use either ... anymore.
But is the object in heap just a pointer to literal stored in pool or is it a simple String object stored in heap which is eligible for GC?
This is really confused.
All strings are represented as String objects in the (a) heap. That includes strings in the string pool. Even when the string pool was in PermGen.
All String objects that are unreachable are eligible for garbage collection. Even for strings in the string pool. Even for String objects that represent string literals.
But ... wait ... so can string literals be garbage collected?
Yes!! If a String object that represents a string literal becomes unreachable at runtime it is eligible for garbage collection, just like any other String object.
A string literal can become unreachable if the code object(s) that use the literal become unreachable. It can happen, when a classloader becomes unreachable.
And yes, PermGen was garbage collected. At least since JDK 1.2. (IIRC Java 1.0 and maybe 1.1 didn't implement GC for the PermGen heap. But that was fixed a long time ago.)
After reading these discussions - question 1, question 2, article
I have the below understanding of Java String Constant Pool (Please correct me, If I am wrong):
When the source code is compiled, compiler look for all the string literals (The ones put into double quotes) in our program and create distinct(No duplicates) objects in the heap area and maintain their references in a special memory area called String Constant Pool (An area inside method area). Any other string objects are created at run time.
Suppose our code has the following statements:
String a = "abc"; //Line 1
String b = "xyz"; //Line 2
String c = "abc"; //Line 3
String d = new String("abc"): //Line 4
When the above code is compiled,
Line 1: a String object "abc" is created in heap and this object is referenced by variable a and String Constant Pool.
Line 2: Compiler searches String Constant Pool for any existing reference to the object "xyz". But does not find one. So, it creates object "xyz" and puts its reference in String Constant Pool.
Line 3: This time compiler finds the object in String Constant Pool and does not make any additional entry in pool or heap. Variable c just refers to existing object which is also referred by a.
Line 4: The literal in Line 4 is present in String Constant Pool. So, no more entry is made in pool. At run time however another String object is created for "abc" and its reference is stored in variable d.
Now I have the following questions/doubts:
Is that what happens exactly which is described above?
How does the compiler creates object? As per my knowledge, objects
are created at Run time and heap is a Run time memory area. So, how
and where does String objects are created at the time of
compilation!
Source code can be compiled in one machine and run in a different
machine. Or, even in the same machine they can be compiled and run in
different time. Then how those objects (created in compile time) are
recovered?
What happens when we intern a String.
Is that what happens exactly which is described above?
Yes, conceptually, however, the constant pool and string pool are different things.
The constant pool is a part of a .class file that contains all constants used in this class.
The string pool is a runtime concept - interned strings and string literals are stored here.
Here's the JVM specification on the constant pool. It is part of the section on the .class format.
How does the compiler creates object? As per my knowledge, objects are created at Run time and heap is a Run time memory area. So, how and where does String objects are created at the time of compilation!
How/when exactly this happens, I believe, is a JVM implementation-specific detail (correct me if I am wrong), but the basic explanation is that whenever the JVM decides to load a class, any strings found in the constant pool are automatically placed into the runtime string pool, and any duplicates are made to refer to the same instance.
In one of the linked answers' comments, Paŭlo Ebermann says:
when the classes are loaded in the VM, the string constants will get copied to the heap, to a VM-wide string pool
so it seems this is at least how Sun's VM implemented the string pool.
Prior to JDK 7/HotSpot interned strings were stored in the permanent generation space - now they are stored in the main heap.
Source code can be compiled in one machine and run in a different machine. Or, even in the same machine they can be compiled and run in different time. Then how those objects (created in compile time) are recovered?
Constants are stored in the compiled files. Therefore they are retrievable whenever the JVM decides to load this class.
What happens when we intern a String.
This is answered here:
doing String.intern() on a series of strings will ensure that all strings having same contents share same memory
I know the concept of a constants pool and the String constant pool used by JVMs to handle String literals. But I don't know which type of memory is used by the JVM to store String constant literals. The stack or the heap? Since its a literal which is not associated with any instance I would assume that it will be stored in stack. But if it's not referred by any instance the literal has to be collected by GC run (correct me if I am wrong), so how is that handled if it is stored in the stack?
The answer is technically neither. According to the Java Virtual Machine Specification, the area for storing string literals is in the runtime constant pool. The runtime constant pool memory area is allocated on a per-class or per-interface basis, so it's not tied to any object instances at all. The runtime constant pool is a subset of the method area which "stores per-class structures such as the runtime constant pool, field and method data, and the code for methods and constructors, including the special methods used in class and instance initialization and interface type initialization". The VM spec says that although the method area is logically part of the heap, it doesn't dictate that memory allocated in the method area be subject to garbage collection or other behaviors that would be associated with normal data structures allocated to the heap.
As explained by this answer, the exact location of the string pool is not specified and can vary from one JVM implementation to another.
It is interesting to note that until Java 7, the pool was in the permgen space of the heap on hotspot JVM but it has been moved to the main part of the heap since Java 7:
Area: HotSpot
Synopsis: In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
RFE: 6962931
And in Java 8 Hotspot, Permanent Generation has been completely removed.
String literals are not stored on the stack. Never. In fact, no objects are stored on the stack.
String literals (or more accurately, the String objects that represent them) are were historically stored in a Heap called the "permgen" heap. (Permgen is short for permanent generation.)
Under normal circumstances, String literals and much of the other stuff in the permgen heap are "permanently" reachable, and are not garbage collected. (For instance, String literals are always reachable from the code objects that use them.) However, you can configure a JVM to attempt to find and collect dynamically loaded classes that are no longer needed, and this may cause String literals to be garbage collected.
CLARIFICATION #1 - I'm not saying that Permgen doesn't get GC'ed. It does, typically when the JVM decides to run a Full GC. My point is that String literals will be reachable as long as the code that uses them is reachable, and the code will be reachable as long as the code's classloader is reachable, and for the default classloaders, that means "for ever".
CLARIFICATION #2 - In fact, Java 7 and later uses the regular heap to hold the string pool. Thus, String objects that represent String literals and intern'd strings are actually in the regular heap. (See #assylias's Answer for details.)
But I am still trying to find out thin line between storage of string literal and string created with new.
There is no "thin line". It is really very simple:
String objects that represent / correspond to string literals are held in the string pool.
String objects that were created by a String::intern call are held in the string pool.
All other String objects are NOT held in the string pool.
Then there is the separate question of where the string pool is "stored". Prior to Java 7 it was the permgen heap. From Java 7 onwards it is the main heap.
String pooling
String pooling (sometimes also called as string canonicalisation) is a
process of replacing several String objects with equal value but
different identity with a single shared String object. You can achieve
this goal by keeping your own Map (with possibly soft
or weak references depending on your requirements) and using map
values as canonicalised values. Or you can use String.intern() method
which is provided to you by JDK.
At times of Java 6 using String.intern() was forbidden by many
standards due to a high possibility to get an OutOfMemoryException if
pooling went out of control. Oracle Java 7 implementation of string
pooling was changed considerably. You can look for details in
http://bugs.sun.com/view_bug.do?bug_id=6962931 and
http://bugs.sun.com/view_bug.do?bug_id=6962930.
String.intern() in Java 6
In those good old days all interned strings were stored in the PermGen
– the fixed size part of heap mainly used for storing loaded classes
and string pool. Besides explicitly interned strings, PermGen string
pool also contained all literal strings earlier used in your program
(the important word here is used – if a class or method was never
loaded/called, any constants defined in it will not be loaded).
The biggest issue with such string pool in Java 6 was its location –
the PermGen. PermGen has a fixed size and can not be expanded at
runtime. You can set it using -XX:MaxPermSize=96m option. As far as I
know, the default PermGen size varies between 32M and 96M depending on
the platform. You can increase its size, but its size will still be
fixed. Such limitation required very careful usage of String.intern –
you’d better not intern any uncontrolled user input using this method.
That’s why string pooling at times of Java 6 was mostly implemented in
the manually managed maps.
String.intern() in Java 7
Oracle engineers made an extremely important change to the string
pooling logic in Java 7 – the string pool was relocated to the heap.
It means that you are no longer limited by a separate fixed size
memory area. All strings are now located in the heap, as most of other
ordinary objects, which allows you to manage only the heap size while
tuning your application. Technically, this alone could be a sufficient
reason to reconsider using String.intern() in your Java 7 programs.
But there are other reasons.
String pool values are garbage collected
Yes, all strings in the JVM string pool are eligible for garbage
collection if there are no references to them from your program roots.
It applies to all discussed versions of Java. It means that if your
interned string went out of scope and there are no other references to
it – it will be garbage collected from the JVM string pool.
Being eligible for garbage collection and residing in the heap, a JVM
string pool seems to be a right place for all your strings, isn’t it?
In theory it is true – non-used strings will be garbage collected from
the pool, used strings will allow you to save memory in case then you
get an equal string from the input. Seems to be a perfect memory
saving strategy? Nearly so. You must know how the string pool is
implemented before making any decisions.
source.
As other answers explain Memory in Java is divided into two portions
1. Stack: One stack is created per thread and it stores stack frames which again stores local variables and if a variable is a reference type then that variable refers to a memory location in heap for the actual object.
2. Heap: All kinds of objects will be created in heap only.
Heap memory is again divided into 3 portions
1. Young Generation: Stores objects which have a short life, Young Generation itself can be divided into two categories Eden Space and Survivor Space.
2. Old Generation: Store objects which have survived many garbage collection cycles and still being referenced.
3. Permanent Generation: Stores metadata about the program e.g. runtime constant pool.
String constant pool belongs to the permanent generation area of Heap memory.
We can see the runtime constant pool for our code in the bytecode by using javap -verbose class_name which will show us method references (#Methodref), Class objects ( #Class ), string literals ( #String )
You can read more about it on my article How Does JVM Handle Method Overloading and Overriding Internally.
To the great answers that already included here I want to add something that missing in my perspective - an illustration.
As you already JVM divides the allocated memory to a Java program into two parts. one is stack and another one is heap. Stack is used for execution purpose and heap is used for storage purpose. In that heap memory, JVM allocates some memory specially meant for string literals. This part of the heap memory is called string constants pool.
So for example, if you init the following objects:
String s1 = "abc";
String s2 = "123";
String obj1 = new String("abc");
String obj2 = new String("def");
String obj3 = new String("456);
String literals s1 and s2 will go to string constant pool, objects obj1, obj2, obj3 to the heap. All of them, will be referenced from the Stack.
Also, please note that "abc" will appear in heap and in string constant pool. Why is String s1 = "abc" and String obj1 = new String("abc") will be created this way? It's because String obj1 = new String("abc") explicitly creates a new and referentially distinct instance of a String object and String s1 = "abc" may reuse an instance from the string constant pool if one is available. For a more elaborate explanation: https://stackoverflow.com/a/3298542/2811258
I am reading about Garbage collection and i am getting confusing search results when i search for String literal garbage collections.
I need clarification on following points:
If a string is defined as literal at compile time [e.g: String str = "java"] then will it be garbage collected?
If use intern method [e.g: String str = new String("java").intern()] then will it be garbage collected? Also will it be treated differently from String literal in point 1.
Some places it is mentioned that literals will be garbage collected only when String class will be unloaded? Does it make sense because I don't think String class will ever be unloaded.
If a string is defined as literal at compile time [e.g: String str = "java";] then will it be garbage collected?
Probably not. The code objects will contain one or more references to the String objects that represent the literals. So as long as the code objects are reachable, the String objects will be to.
It is possible for code objects to become unreachable, but only if they were dynamically loaded ... and their classloader is destroyed.
If I use the intern method [e.g: String str = new String("java").intern()] then will it be garbage collected?
The object returned by the intern call will be the same object that represents the "java" string literal. (The "java" literal is interned at class loading time. When you then intern the newly constructed String object in your code snippet, it will lookup and return the previously interned "java" string.)
However, interned strings that are not identical with string literals can be garbage collected once they become unreachable. The PermGen space is garbage collected on all recent HotSpot JVMs. (Prior to Java 8 ... which drops PermGen entirely.)
Also will it be treated differently from string literal in point 1.
No ... because it is the same object as the string literal.
And indeed, once you understand what is going on, it is clear that string literals are not treated specially either. It is just an application of the "reachability" rule ...
Some places it is mentioned that literals will be garbage collected only when String class will be unloaded? Does it make sense because I don't think the String class will ever be unloaded.
You are right. It doesn't make sense. The sources that said that are incorrect. (It would be helpful if you posted a URL so that we can read what they are saying for ourselves ...)
Under normal circumstances, string literals and classes are all allocated into the JVM's permanent generation ("PermGen"), and usually won't ever be collected. Strings that are interned (e.g. mystring.intern()) are stored in a memory pool owned by the String class in permgen, and it was once the case that aggressive interning could cause a space leak because the string pool itself held a reference to every string, even if no other references existed. Apparently this is no longer true, at least as of JDK 1.6 (see, e.g., here).
For more on permgen, this is a decent overview of the topic. (Note: that link goes to a blog associated with a product. I don't have any association with the blog, the company, or the product, but the blog entry is useful and doesn't have much to do with the product.)
The literal string will remain in memory as long as the program is in memory.
str will be garbage collected, but the literal it is created from will not.
That makes perfect sense, since the string class is unloaded when the program is unloaded.
intern() method checks the availability of the object in String pool. If the object/literal is available then reference of it will be returned. If the literal is not there in the pool then object is loaded in the perm area (String pool) and then reference to it will be return. We have to use intern() method judiciously.
I know the concept of a constants pool and the String constant pool used by JVMs to handle String literals. But I don't know which type of memory is used by the JVM to store String constant literals. The stack or the heap? Since its a literal which is not associated with any instance I would assume that it will be stored in stack. But if it's not referred by any instance the literal has to be collected by GC run (correct me if I am wrong), so how is that handled if it is stored in the stack?
The answer is technically neither. According to the Java Virtual Machine Specification, the area for storing string literals is in the runtime constant pool. The runtime constant pool memory area is allocated on a per-class or per-interface basis, so it's not tied to any object instances at all. The runtime constant pool is a subset of the method area which "stores per-class structures such as the runtime constant pool, field and method data, and the code for methods and constructors, including the special methods used in class and instance initialization and interface type initialization". The VM spec says that although the method area is logically part of the heap, it doesn't dictate that memory allocated in the method area be subject to garbage collection or other behaviors that would be associated with normal data structures allocated to the heap.
As explained by this answer, the exact location of the string pool is not specified and can vary from one JVM implementation to another.
It is interesting to note that until Java 7, the pool was in the permgen space of the heap on hotspot JVM but it has been moved to the main part of the heap since Java 7:
Area: HotSpot
Synopsis: In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
RFE: 6962931
And in Java 8 Hotspot, Permanent Generation has been completely removed.
String literals are not stored on the stack. Never. In fact, no objects are stored on the stack.
String literals (or more accurately, the String objects that represent them) are were historically stored in a Heap called the "permgen" heap. (Permgen is short for permanent generation.)
Under normal circumstances, String literals and much of the other stuff in the permgen heap are "permanently" reachable, and are not garbage collected. (For instance, String literals are always reachable from the code objects that use them.) However, you can configure a JVM to attempt to find and collect dynamically loaded classes that are no longer needed, and this may cause String literals to be garbage collected.
CLARIFICATION #1 - I'm not saying that Permgen doesn't get GC'ed. It does, typically when the JVM decides to run a Full GC. My point is that String literals will be reachable as long as the code that uses them is reachable, and the code will be reachable as long as the code's classloader is reachable, and for the default classloaders, that means "for ever".
CLARIFICATION #2 - In fact, Java 7 and later uses the regular heap to hold the string pool. Thus, String objects that represent String literals and intern'd strings are actually in the regular heap. (See #assylias's Answer for details.)
But I am still trying to find out thin line between storage of string literal and string created with new.
There is no "thin line". It is really very simple:
String objects that represent / correspond to string literals are held in the string pool.
String objects that were created by a String::intern call are held in the string pool.
All other String objects are NOT held in the string pool.
Then there is the separate question of where the string pool is "stored". Prior to Java 7 it was the permgen heap. From Java 7 onwards it is the main heap.
String pooling
String pooling (sometimes also called as string canonicalisation) is a
process of replacing several String objects with equal value but
different identity with a single shared String object. You can achieve
this goal by keeping your own Map (with possibly soft
or weak references depending on your requirements) and using map
values as canonicalised values. Or you can use String.intern() method
which is provided to you by JDK.
At times of Java 6 using String.intern() was forbidden by many
standards due to a high possibility to get an OutOfMemoryException if
pooling went out of control. Oracle Java 7 implementation of string
pooling was changed considerably. You can look for details in
http://bugs.sun.com/view_bug.do?bug_id=6962931 and
http://bugs.sun.com/view_bug.do?bug_id=6962930.
String.intern() in Java 6
In those good old days all interned strings were stored in the PermGen
– the fixed size part of heap mainly used for storing loaded classes
and string pool. Besides explicitly interned strings, PermGen string
pool also contained all literal strings earlier used in your program
(the important word here is used – if a class or method was never
loaded/called, any constants defined in it will not be loaded).
The biggest issue with such string pool in Java 6 was its location –
the PermGen. PermGen has a fixed size and can not be expanded at
runtime. You can set it using -XX:MaxPermSize=96m option. As far as I
know, the default PermGen size varies between 32M and 96M depending on
the platform. You can increase its size, but its size will still be
fixed. Such limitation required very careful usage of String.intern –
you’d better not intern any uncontrolled user input using this method.
That’s why string pooling at times of Java 6 was mostly implemented in
the manually managed maps.
String.intern() in Java 7
Oracle engineers made an extremely important change to the string
pooling logic in Java 7 – the string pool was relocated to the heap.
It means that you are no longer limited by a separate fixed size
memory area. All strings are now located in the heap, as most of other
ordinary objects, which allows you to manage only the heap size while
tuning your application. Technically, this alone could be a sufficient
reason to reconsider using String.intern() in your Java 7 programs.
But there are other reasons.
String pool values are garbage collected
Yes, all strings in the JVM string pool are eligible for garbage
collection if there are no references to them from your program roots.
It applies to all discussed versions of Java. It means that if your
interned string went out of scope and there are no other references to
it – it will be garbage collected from the JVM string pool.
Being eligible for garbage collection and residing in the heap, a JVM
string pool seems to be a right place for all your strings, isn’t it?
In theory it is true – non-used strings will be garbage collected from
the pool, used strings will allow you to save memory in case then you
get an equal string from the input. Seems to be a perfect memory
saving strategy? Nearly so. You must know how the string pool is
implemented before making any decisions.
source.
As other answers explain Memory in Java is divided into two portions
1. Stack: One stack is created per thread and it stores stack frames which again stores local variables and if a variable is a reference type then that variable refers to a memory location in heap for the actual object.
2. Heap: All kinds of objects will be created in heap only.
Heap memory is again divided into 3 portions
1. Young Generation: Stores objects which have a short life, Young Generation itself can be divided into two categories Eden Space and Survivor Space.
2. Old Generation: Store objects which have survived many garbage collection cycles and still being referenced.
3. Permanent Generation: Stores metadata about the program e.g. runtime constant pool.
String constant pool belongs to the permanent generation area of Heap memory.
We can see the runtime constant pool for our code in the bytecode by using javap -verbose class_name which will show us method references (#Methodref), Class objects ( #Class ), string literals ( #String )
You can read more about it on my article How Does JVM Handle Method Overloading and Overriding Internally.
To the great answers that already included here I want to add something that missing in my perspective - an illustration.
As you already JVM divides the allocated memory to a Java program into two parts. one is stack and another one is heap. Stack is used for execution purpose and heap is used for storage purpose. In that heap memory, JVM allocates some memory specially meant for string literals. This part of the heap memory is called string constants pool.
So for example, if you init the following objects:
String s1 = "abc";
String s2 = "123";
String obj1 = new String("abc");
String obj2 = new String("def");
String obj3 = new String("456);
String literals s1 and s2 will go to string constant pool, objects obj1, obj2, obj3 to the heap. All of them, will be referenced from the Stack.
Also, please note that "abc" will appear in heap and in string constant pool. Why is String s1 = "abc" and String obj1 = new String("abc") will be created this way? It's because String obj1 = new String("abc") explicitly creates a new and referentially distinct instance of a String object and String s1 = "abc" may reuse an instance from the string constant pool if one is available. For a more elaborate explanation: https://stackoverflow.com/a/3298542/2811258