Compiler behavior with String literals to Create String Constant Pool - java

After reading these discussions - question 1, question 2, article
I have the below understanding of Java String Constant Pool (Please correct me, If I am wrong):
When the source code is compiled, the compiler looks for all the string literals (the ones put in double quotes) in our program, creates distinct (no duplicates) objects in the heap area, and maintains their references in a special memory area called the String Constant Pool (an area inside the method area). Any other String objects are created at run time.
Suppose our code has the following statements:
String a = "abc"; //Line 1
String b = "xyz"; //Line 2
String c = "abc"; //Line 3
String d = new String("abc"); //Line 4
When the above code is compiled,
Line 1: a String object "abc" is created in heap and this object is referenced by variable a and String Constant Pool.
Line 2: Compiler searches String Constant Pool for any existing reference to the object "xyz". But does not find one. So, it creates object "xyz" and puts its reference in String Constant Pool.
Line 3: This time compiler finds the object in String Constant Pool and does not make any additional entry in pool or heap. Variable c just refers to existing object which is also referred by a.
Line 4: The literal in Line 4 is present in String Constant Pool. So, no more entry is made in pool. At run time however another String object is created for "abc" and its reference is stored in variable d.
Now I have the following questions/doubts:
Is that what happens exactly which is described above?
How does the compiler create objects? As far as I know, objects are created at run time, and the heap is a run-time memory area. So how and where are String objects created at compile time?
Source code can be compiled on one machine and run on a different machine. Or, even on the same machine, it can be compiled and run at different times. Then how are those objects (created at compile time) recovered?
What happens when we intern a String?

Is that what happens exactly which is described above?
Yes, conceptually. However, the constant pool and the string pool are different things.
The constant pool is a part of a .class file that contains all constants used in this class.
The string pool is a runtime concept - interned strings and string literals are stored here.
Here's the JVM specification on the constant pool. It is part of the section on the .class format.
How does the compiler create objects? As far as I know, objects are created at run time, and the heap is a run-time memory area. So how and where are String objects created at compile time?
How/when exactly this happens, I believe, is a JVM implementation-specific detail (correct me if I am wrong), but the basic explanation is that whenever the JVM decides to load a class, any strings found in the constant pool are automatically placed into the runtime string pool, and any duplicates are made to refer to the same instance.
In one of the linked answers' comments, Paŭlo Ebermann says:
when the classes are loaded in the VM, the string constants will get copied to the heap, to a VM-wide string pool
so it seems this is at least how Sun's VM implemented the string pool.
Prior to JDK 7, HotSpot stored interned strings in the permanent generation space; since then they have been stored in the main heap.
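The "VM-wide" nature of the string pool can be checked with a small sketch. The class names First, Second and SharedLiteralDemo are made up for illustration. Each of the two helper classes carries the literal "abc" in its own class-file constant pool, yet at run time both should hand back the same String object, because the Java Language Specification requires equal string literals to refer to one instance.

```java
class SharedLiteralDemo {
    // True when both classes hand back the very same String object.
    static boolean sameInstance() {
        return First.value() == Second.value();
    }

    public static void main(String[] args) {
        System.out.println(sameInstance()); // prints "true"
    }
}

// Two unrelated (hypothetical) classes; each has "abc" in its own constant pool.
class First {
    static String value() {
        return "abc";
    }
}

class Second {
    static String value() {
        return "abc";
    }
}
```

The comparison uses == on purpose: it tests object identity, which is the property the string pool provides. Ordinary code should still compare strings with equals().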
Source code can be compiled on one machine and run on a different machine. Or, even on the same machine, it can be compiled and run at different times. Then how are those objects (created at compile time) recovered?
Constants are stored in the compiled files. Therefore they are retrievable whenever the JVM decides to load this class.
What happens when we intern a String?
This is answered here:
doing String.intern() on a series of strings will ensure that all strings having same contents share same memory
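As a minimal sketch of that statement (InternDemo and fresh() are illustrative names, not from the linked answer), the strings below are built from a char array at run time, so neither is the literal's instance. They start out as two separate objects and end up as one canonical instance after intern().

```java
class InternDemo {
    // Builds a fresh String at run time, so the result is not a pooled literal.
    static String fresh() {
        return new String(new char[] {'a', 'b', 'c'});
    }

    public static void main(String[] args) {
        String x = fresh();
        String y = fresh();
        System.out.println(x == y);                   // false: two separate objects
        System.out.println(x.equals(y));              // true: same contents
        System.out.println(x.intern() == y.intern()); // true: one canonical instance
    }
}
```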

Related

Java String Pool with String constructor and the intern function

I learned about the Java String Pool recently, and there are a few things that I don't quite understand.
When using the assignment operator, a new String will be created in the String Pool if it doesn't exist there already.
String a = "foo"; // Creates a new string in the String Pool
String b = "foo"; // Refers to the already existing string in the String Pool
When using the String constructor, I understand that regardless of the String Pool's state, a new string will be created in the heap, outside of the String Pool.
String c = new String("foo"); // Creates a new string in the heap
I read somewhere that even when using the constructor, the String Pool is being used. It will insert the string into the String Pool and into the heap.
String d = new String("bar"); // Creates a new string in the String Pool and in the heap
I didn't find any further information about this, but I would like to know if that's true.
If that is indeed true, then - why? Why does java create this duplicate string? It seems completely redundant to me since the strings in java are immutable.
Another thing that I would like to know is how the .intern() function of the String class works: Does it just return a pointer to the string in the String Pool?
And finally, in the following code:
String s = new String("Hello");
s = s.intern();
Will the garbage collector delete the string that is outside the String Pool from the heap?
You wrote
String c = new String("foo"); // Creates a new string in the heap
I read somewhere that even when using the constructor, the String Pool is being used. It will insert the string into the String Pool and into the heap.
That’s somewhat correct, but you have to read the code correctly. Your code contains two String instances. First, you have the string literal "foo" that evaluates to a String instance, the one that will be inserted into the pool. Then, you are creating a new String instance explicitly, using new String(…) calling the String(String) constructor. Since the explicitly created object can’t have the same identity as an object that existed prior to its creation, two String instances must exist.
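A short sketch can make the two instances visible. The class and method names are chosen here for illustration only. literal() returns the pooled instance that the literal "foo" evaluates to, and copy() returns the object created explicitly by the constructor.

```java
class TwoInstancesDemo {
    // The instance the literal "foo" evaluates to (the pooled one).
    static String literal() {
        return "foo";
    }

    // A second, explicitly created instance with the same contents.
    static String copy() {
        return new String("foo");
    }

    public static void main(String[] args) {
        String pooled = literal();
        String c = copy();
        System.out.println(c == pooled);          // false: distinct identities
        System.out.println(c.equals(pooled));     // true: equal contents
        System.out.println(c.intern() == pooled); // true: intern() finds the literal's instance
    }
}
```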
Why does java create this duplicate string? It seems completely redundant to me since the strings in java are immutable.
Well it does so, because you told it so. In theory, this construction could get optimized, skipping the intermediate step that you can’t perceive anyway. But the first assumption for a program’s behavior should be that it does precisely what you have written.
You could ask why there’s a constructor that allows such a pointless operation. In fact, this has been asked before and this answer addresses this. In short, it’s mostly a historical design mistake, but this constructor has been used in practice for other technical reasons; some do not apply anymore. Still, it can’t be removed without breaking compatibility.
String s = new String("Hello");
s = s.intern();
Will the garbage collector delete the string that is outside the String Pool from the heap?
Since the intern() call will evaluate to the instance that had been created for "Hello" and is distinct from the instance created via new String(…), the latter will definitely be unreachable after the second assignment to s. Of course, this doesn’t say whether the garbage collector will reclaim the string’s memory, only that it is allowed to do so. But keep in mind that the majority of the heap occupation will be the array that holds the character data, which will be shared between the two string instances (unless you use a very outdated JVM). This array will still be in use as long as either of the two strings is in use. Recent JVMs even have the String Deduplication feature, which may cause other strings with the same contents in the JVM to use this array (allowing collection of their formerly used arrays). So the lifetime of the array is entirely unpredictable.
Q: I read somewhere that even when using the constructor, the String Pool is being used. It will insert the string into the String Pool and into the heap. [...] I didn't find any further information about this, but I would like to know if that's true.
It is NOT true. A string created with new is not placed in the string pool ... unless something explicitly calls intern() on it.
Q: Why does java create this duplicate string?
Because the JLS specifies that every new generates a new object. It would be counter-intuitive if it didn't (IMO).
The fact that it is nearly always a bad idea to use new String(String) is not a good reason to make new behave differently in this case. The real answer is that programmers should learn not to write that ... except in the extremely rare cases where it is necessary to do so.
Q: Another thing that I would like to know is how the intern() function of the String class works: Does it just return a pointer to the string in the String Pool?
The intern method always returns a pointer to a string in the string pool. That string may or may not be the string you called intern() on.
There have been different ways that the string pool was implemented.
In the original scheme, interned strings were held in a special heap called the PermGen heap. In that scheme, if the string you were interning was not already in the pool, then a new string would be allocated in PermGen space, and the intern method would return that.
In the current scheme, interned strings are held in the normal heap, and the string pool is just a (private) data structure. When the string being interned is not in the pool, it is simply linked into the data structure. A new string does not need to be allocated.
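The current scheme can be observed with the sketch below, assuming a HotSpot JVM from Java 7 onwards. The String.intern() contract only promises a canonical instance with equal contents, so the identity result here is an implementation behaviour, not a guarantee. The name InternIdentityDemo and the "pool-demo-" prefix are made up for the example.

```java
class InternIdentityDemo {
    // Returns true if interning a string that is not yet pooled hands back that same object.
    static boolean internReturnsSameObject() {
        // Built at run time from a changing value, so it is very unlikely to be pooled already.
        String s = "pool-demo-" + System.nanoTime();
        return s.intern() == s;
    }

    public static void main(String[] args) {
        // Expected to print true on HotSpot 7 and later; under the old PermGen
        // scheme described above, a separate copy would have been allocated instead.
        System.out.println(internReturnsSameObject());
    }
}
```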
Q: Will the garbage collector delete the string that is outside the String Pool from the heap?
The rule is the same for all Java objects, no matter how they were created, and irrespective of where (in which "space" or "heap" in the JVM) they reside.
If an object is not reachable from the running application, then it is eligible for deletion by the garbage collector.
That doesn't mean that an unreachable object will be garbage collected in any particular run of the GC. (Or indeed ever ... in some circumstances.)
The above rule equally applies to the String objects that correspond to string literals. If it ever becomes possible that a literal can never be used again, then it may be garbage collected.
That doesn't normally happen. The JVM keeps a hidden reference to each string literal object in a private data structure associated with the class that defined it. Since classes normally exist for the lifetime of the JVM, their string literal objects remain reachable. (Which makes sense ... since the application may need to use them.)
However, if a class is loaded using a dynamically created classloader, and that classloader becomes unreachable, then so will all of its classes. So it is actually possible for a string literal object to become unreachable. If it does, it may be garbage collected.

Runtime constant pool - is filled up by variables created in runtime?

I'm not sure about some properties of runtime constant pool.
Runtime constant pool, is filled up by the data from constant pool (from .class files, during class loading). But is it also filled up by variables created in runtime? Or are they converted during compilation to literals, and stored in constant pool?
For example:
Integer i = new Integer(127);
is treated like literal, because of conversion to:
Integer i = Integer.valueOf(127);
during compilation, and stored in constant pool?
If it's not working like that, is there any runtime mechanics for runtime constant pool?
And a second question: I have found this sentence in many articles: "every class has a runtime constant pool", but what does it mean? Is there a single RCP that contains all application objects of (for example) Integer type, or is there a separate RCP for every class that contains all constant objects that occurred in this class? (For example: Person has age = Integer(18) and isAdult = Boolean(true).)
First, there is no conversion of
Integer i = new Integer(127);
to
Integer i = Integer.valueOf(127);
These constructs are entirely different. new Integer(127) is guaranteed to produce a new instance every time it is evaluated, whereas Integer.valueOf(127) is guaranteed to produce the same instance on every evaluation, as Integer.valueOf guarantees for all values in the -128 … +127 range. This is handled by the implementation of Integer.valueOf(int) and is not related to constant pools in any way. Of course, it is implementation-specific, but the OpenJDK implementation handles this by simply filling an array with references to these 256 instances the first time this cache is accessed.
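The valueOf guarantee can be checked with a small sketch (IntegerCacheDemo and sameBoxed are illustrative names). It deliberately avoids the new Integer(int) constructor, which has been deprecated since Java 9, and only asserts identity inside the -128 … +127 range, because outside that range identity is unspecified.

```java
class IntegerCacheDemo {
    // True when two valueOf calls for the same number return the same object.
    static boolean sameBoxed(int n) {
        return Integer.valueOf(n) == Integer.valueOf(n);
    }

    public static void main(String[] args) {
        System.out.println(sameBoxed(127));  // true: inside the guaranteed cache range
        System.out.println(sameBoxed(-128)); // true: inside the guaranteed cache range
        // Outside the range identity is unspecified, so compare with equals() instead.
        System.out.println(Integer.valueOf(1000).equals(Integer.valueOf(1000))); // true
    }
}
```

No constant pool is involved in any of these results; the sharing comes entirely from the cache inside Integer.valueOf.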
While it is correct that every class has a constant pool in its class file, it might be misleading to say that every class will have a runtime constant pool (on its own). That’s again a JVM implementation detail. While it is possible to map each class constant pool 1:1 to a runtime constant pool, it obviously makes sense to merge the constant pools of classes living in the same resolve context (i.e. defined by the same class loader) into one pool, so that identical constants don’t need to be resolved multiple times. Conceptually, though, every class has a runtime representation of its pool, even if it does not materialize in this naive form. So the statement “every class has a runtime constant pool” is not wrong, but it doesn’t necessarily imply that there will be such a data structure for every class.
This affects classes, members, MethodType, MethodHandle and String instances, referenced by the constant pools of the classes, but not wrapper types like Integer or Boolean, as there are no such entries in a constant pool. Integer values in the pool are primitive values and boolean values do not exist at all.
This must not be confused with the global String pool, which references all String instances for literals and the results of intern() calls.
Problem 1 - Answer: No
Integral wrapper types are cached, not stored in the constant pool. They are just ordinary objects in the heap. Integer or Byte caching is a runtime optimization, not a VM optimization, nor a compile-time optimization. Objects are not magically replaced with the cached ones when their constructor is invoked to create a new one.
First, your translation from new Integer(127) to Integer.valueOf(127) is not correct at all, as explained in this post. If you do some runtime verification, like System.out.println(Integer.valueOf(127) == new Integer(127)); (prints false), you will quickly come to the conclusion that, no matter what object you are constructing, using the new operator always creates a new, uncached object. (Even Strings, which actually are in the runtime constant pool, need to be interned to get a reference to the canonical one.)
What the variable i holds is just a reference to an Integer object in the heap. That object comes from the cache if you use valueOf, and is a fresh, uncached object if you use new.
Problem 2 - Answer: There is a single RCP for every class, but they are all in the same memory region
The RCPs are all stored in method area. Personally I don't know how JVM is implemented, but JVMS has stated:
The Java Virtual Machine maintains a per-type constant pool (§2.5.5), a run-time data structure that serves many of the purposes of the symbol table of a conventional programming language implementation.
Nevertheless, this doesn't matter even from a performance tuning view, as long as you don't plan to apply for a job in Oracle.

How can I destroy reference from String pool in Java?

Strings are immutable. When I declare:
String a = "abc";
String a1 = "abc";
Both objects refer to same location. So how can I destroy this "abc" reference from String pool?
My use case is that I am developing a hardware application with less memory and for this, I need to clear the references from String pool to save memory.
No, typically you can not "destroy reference from String pool in Java" manually.
The main reason I suppose why you are targeting it is to avoid out of memory errors. In Java 6 days all interned strings were stored in the PermGen – the fixed size part of heap mainly used for storing loaded classes and string pool. Besides explicitly interned strings, PermGen string pool also contained all literal strings earlier used in your program. The biggest issue with string pool in Java 6 was its location – the PermGen. PermGen has a fixed size and can not be expanded at runtime. You can set it using -XX:MaxPermSize=N option. This would lead to memory leaks and out of memory errors.
In Java 7 – the string pool was relocated to the heap. It means that you are no longer limited by a separate fixed size memory area. All strings are now located in the heap, as most of other ordinary objects.
You may also increase the string pool size by configuring -XX:StringTableSize=N.
If you are not sure about the string pool usage, try -XX:+PrintStringTableStatistics JVM argument. It will print you the string pool usage when your program terminates.
In JDK, there is also a tool named jmap which can be used to find out number of interned strings in your application.
jmap -heap process_id
Eg:
jmap -heap 18974
Along with other output, this command also outputs number of interned strings and the space they occupy "xxxxxx interned Strings occupying xxxxxx bytes."
The rules for garbage collection of objects in the String pool are the same as for other Strings or any other object. But the String objects that correspond to String literals are almost always reachable, since there is an implicit reference to the string object in the code of every method that uses the literal, and so typically they are not candidates for garbage collection.
However, this is not always the case. If the literal was defined in a class that was dynamically loaded (e.g. using Class.forName(...)), then it is possible to arrange that the class is unloaded. If that happens, then the String object for the literal will be unreachable, and will be reclaimed when the heap containing the interned String gets GC'ed.

java string literal and new operator [duplicate]

I know the concept of a constants pool and the String constant pool used by JVMs to handle String literals. But I don't know which type of memory is used by the JVM to store String constant literals. The stack or the heap? Since its a literal which is not associated with any instance I would assume that it will be stored in stack. But if it's not referred by any instance the literal has to be collected by GC run (correct me if I am wrong), so how is that handled if it is stored in the stack?
The answer is technically neither. According to the Java Virtual Machine Specification, the area for storing string literals is in the runtime constant pool. The runtime constant pool memory area is allocated on a per-class or per-interface basis, so it's not tied to any object instances at all. The runtime constant pool is a subset of the method area which "stores per-class structures such as the runtime constant pool, field and method data, and the code for methods and constructors, including the special methods used in class and instance initialization and interface type initialization". The VM spec says that although the method area is logically part of the heap, it doesn't dictate that memory allocated in the method area be subject to garbage collection or other behaviors that would be associated with normal data structures allocated to the heap.
As explained by this answer, the exact location of the string pool is not specified and can vary from one JVM implementation to another.
It is interesting to note that until Java 7, the pool was in the permgen space of the heap on hotspot JVM but it has been moved to the main part of the heap since Java 7:
Area: HotSpot
Synopsis: In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
RFE: 6962931
And in Java 8 Hotspot, Permanent Generation has been completely removed.
String literals are not stored on the stack. Never. In fact, no objects are stored on the stack.
String literals (or more accurately, the String objects that represent them) were historically stored in a heap called the "permgen" heap. (Permgen is short for permanent generation.)
Under normal circumstances, String literals and much of the other stuff in the permgen heap are "permanently" reachable, and are not garbage collected. (For instance, String literals are always reachable from the code objects that use them.) However, you can configure a JVM to attempt to find and collect dynamically loaded classes that are no longer needed, and this may cause String literals to be garbage collected.
CLARIFICATION #1 - I'm not saying that Permgen doesn't get GC'ed. It does, typically when the JVM decides to run a Full GC. My point is that String literals will be reachable as long as the code that uses them is reachable, and the code will be reachable as long as the code's classloader is reachable, and for the default classloaders, that means "for ever".
CLARIFICATION #2 - In fact, Java 7 and later uses the regular heap to hold the string pool. Thus, String objects that represent String literals and intern'd strings are actually in the regular heap. (See @assylias's answer for details.)
But I am still trying to find out thin line between storage of string literal and string created with new.
There is no "thin line". It is really very simple:
String objects that represent / correspond to string literals are held in the string pool.
String objects that were created by a String::intern call are held in the string pool.
All other String objects are NOT held in the string pool.
Then there is the separate question of where the string pool is "stored". Prior to Java 7 it was the permgen heap. From Java 7 onwards it is the main heap.
String pooling
String pooling (sometimes also called as string canonicalisation) is a
process of replacing several String objects with equal value but
different identity with a single shared String object. You can achieve
this goal by keeping your own Map (with possibly soft
or weak references depending on your requirements) and using map
values as canonicalised values. Or you can use String.intern() method
which is provided to you by JDK.
At times of Java 6 using String.intern() was forbidden by many
standards due to a high possibility to get an OutOfMemoryError if
pooling went out of control. Oracle Java 7 implementation of string
pooling was changed considerably. You can look for details in
http://bugs.sun.com/view_bug.do?bug_id=6962931 and
http://bugs.sun.com/view_bug.do?bug_id=6962930.
String.intern() in Java 6
In those good old days all interned strings were stored in the PermGen
– the fixed size part of heap mainly used for storing loaded classes
and string pool. Besides explicitly interned strings, PermGen string
pool also contained all literal strings earlier used in your program
(the important word here is used – if a class or method was never
loaded/called, any constants defined in it will not be loaded).
The biggest issue with such string pool in Java 6 was its location –
the PermGen. PermGen has a fixed size and can not be expanded at
runtime. You can set it using -XX:MaxPermSize=96m option. As far as I
know, the default PermGen size varies between 32M and 96M depending on
the platform. You can increase its size, but its size will still be
fixed. Such limitation required very careful usage of String.intern –
you’d better not intern any uncontrolled user input using this method.
That’s why string pooling at times of Java 6 was mostly implemented in
the manually managed maps.
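A manually managed pool of the kind mentioned above could look roughly like this. It is a minimal sketch, not the code of any particular library: ManualStringPool, canonicalize and the "user-42" sample value are made up here, and the map holds strong references, so entries are never released. A production version would need the soft or weak references the text mentions.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ManualStringPool {
    // Maps each distinct contents to its canonical instance (strong references).
    private final ConcurrentMap<String, String> pool = new ConcurrentHashMap<>();

    // Returns the canonical instance for the given contents, registering it on first sight.
    String canonicalize(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        ManualStringPool pool = new ManualStringPool();
        String a = pool.canonicalize(new String("user-42"));
        String b = pool.canonicalize(new String("user-42"));
        System.out.println(a == b);      // true: both callers share one instance
        System.out.println(pool.size()); // 1
    }
}
```

Unlike String.intern(), such a pool lives entirely in the ordinary heap on every Java version and can be dropped as a whole by discarding the ManualStringPool object.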
String.intern() in Java 7
Oracle engineers made an extremely important change to the string
pooling logic in Java 7 – the string pool was relocated to the heap.
It means that you are no longer limited by a separate fixed size
memory area. All strings are now located in the heap, as most of other
ordinary objects, which allows you to manage only the heap size while
tuning your application. Technically, this alone could be a sufficient
reason to reconsider using String.intern() in your Java 7 programs.
But there are other reasons.
String pool values are garbage collected
Yes, all strings in the JVM string pool are eligible for garbage
collection if there are no references to them from your program roots.
It applies to all discussed versions of Java. It means that if your
interned string went out of scope and there are no other references to
it – it will be garbage collected from the JVM string pool.
Being eligible for garbage collection and residing in the heap, a JVM
string pool seems to be a right place for all your strings, isn’t it?
In theory it is true – non-used strings will be garbage collected from
the pool, used strings will allow you to save memory in case then you
get an equal string from the input. Seems to be a perfect memory
saving strategy? Nearly so. You must know how the string pool is
implemented before making any decisions.
source.
As other answers explain, memory in Java is divided into two portions:
1. Stack: One stack is created per thread and it stores stack frames which again stores local variables and if a variable is a reference type then that variable refers to a memory location in heap for the actual object.
2. Heap: All kinds of objects will be created in heap only.
Heap memory is again divided into 3 portions
1. Young Generation: Stores objects which have a short life, Young Generation itself can be divided into two categories Eden Space and Survivor Space.
2. Old Generation: Store objects which have survived many garbage collection cycles and still being referenced.
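3. Permanent Generation: Stores metadata about the program, e.g. the runtime constant pool. (This area existed in HotSpot up to Java 7 and was removed in Java 8.)
Before Java 7, the String constant pool belonged to the permanent generation area of heap memory; since Java 7 it has been held in the main heap.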
We can see the runtime constant pool for our code in the bytecode by using javap -verbose class_name which will show us method references (#Methodref), Class objects ( #Class ), string literals ( #String )
You can read more about it on my article How Does JVM Handle Method Overloading and Overriding Internally.
To the great answers already included here I want to add something that is missing from my perspective: an illustration.
As you already know, the JVM divides the memory allocated to a Java program into two parts: one is the stack and the other is the heap. The stack is used for execution and the heap is used for storage. In that heap memory, the JVM allocates some memory specially meant for string literals. This part of the heap memory is called the string constant pool.
So for example, if you init the following objects:
String s1 = "abc";
String s2 = "123";
String obj1 = new String("abc");
String obj2 = new String("def");
String obj3 = new String("456");
String literals s1 and s2 will go to the string constant pool; objects obj1, obj2 and obj3 go to the heap. All of them will be referenced from the stack.
Also, please note that "abc" will appear both in the heap and in the string constant pool. Why are String s1 = "abc" and String obj1 = new String("abc") created this way? It's because String obj1 = new String("abc") explicitly creates a new and referentially distinct instance of a String object, whereas String s1 = "abc" may reuse an instance from the string constant pool if one is available. For a more elaborate explanation: https://stackoverflow.com/a/3298542/2811258

Where does Java's String constant pool live, the heap or the stack?

I know the concept of a constants pool and the String constant pool used by JVMs to handle String literals. But I don't know which type of memory is used by the JVM to store String constant literals. The stack or the heap? Since its a literal which is not associated with any instance I would assume that it will be stored in stack. But if it's not referred by any instance the literal has to be collected by GC run (correct me if I am wrong), so how is that handled if it is stored in the stack?
The answer is technically neither. According to the Java Virtual Machine Specification, the area for storing string literals is in the runtime constant pool. The runtime constant pool memory area is allocated on a per-class or per-interface basis, so it's not tied to any object instances at all. The runtime constant pool is a subset of the method area which "stores per-class structures such as the runtime constant pool, field and method data, and the code for methods and constructors, including the special methods used in class and instance initialization and interface type initialization". The VM spec says that although the method area is logically part of the heap, it doesn't dictate that memory allocated in the method area be subject to garbage collection or other behaviors that would be associated with normal data structures allocated to the heap.
As explained by this answer, the exact location of the string pool is not specified and can vary from one JVM implementation to another.
It is interesting to note that until Java 7, the pool was in the permgen space of the heap on hotspot JVM but it has been moved to the main part of the heap since Java 7:
Area: HotSpot
Synopsis: In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
RFE: 6962931
And in Java 8 Hotspot, Permanent Generation has been completely removed.
String literals are not stored on the stack. Never. In fact, no objects are stored on the stack.
String literals (or more accurately, the String objects that represent them) are were historically stored in a Heap called the "permgen" heap. (Permgen is short for permanent generation.)
Under normal circumstances, String literals and much of the other stuff in the permgen heap are "permanently" reachable, and are not garbage collected. (For instance, String literals are always reachable from the code objects that use them.) However, you can configure a JVM to attempt to find and collect dynamically loaded classes that are no longer needed, and this may cause String literals to be garbage collected.
CLARIFICATION #1 - I'm not saying that Permgen doesn't get GC'ed. It does, typically when the JVM decides to run a Full GC. My point is that String literals will be reachable as long as the code that uses them is reachable, and the code will be reachable as long as the code's classloader is reachable, and for the default classloaders, that means "for ever".
CLARIFICATION #2 - In fact, Java 7 and later uses the regular heap to hold the string pool. Thus, String objects that represent String literals and intern'd strings are actually in the regular heap. (See #assylias's Answer for details.)
But I am still trying to find out thin line between storage of string literal and string created with new.
There is no "thin line". It is really very simple:
String objects that represent / correspond to string literals are held in the string pool.
String objects that were created by a String::intern call are held in the string pool.
All other String objects are NOT held in the string pool.
Then there is the separate question of where the string pool is "stored". Prior to Java 7 it was the permgen heap. From Java 7 onwards it is the main heap.
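The three rules above can be checked with reference comparisons. This is only a minimal sketch: the class name StringPoolRules and the helper methods sameObject and buildAtRuntime are made up for illustration, and the run-time string is built with a StringBuilder so that it is not a compile-time constant.

```java
public class StringPoolRules {
    // Returns true when both references point to the same String object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    // Builds "abc" at run time, so the result is a new object outside the pool.
    static String buildAtRuntime() {
        return new StringBuilder("ab").append('c').toString();
    }

    public static void main(String[] args) {
        String literal = "abc";           // rule 1: literal, held in the pool
        String sameLiteral = "abc";       // same literal, same pooled object
        String built = buildAtRuntime();  // rule 3: ordinary object, not pooled
        String interned = built.intern(); // rule 2: intern() returns the pooled object

        System.out.println(sameObject(literal, sameLiteral)); // true
        System.out.println(sameObject(literal, built));       // false
        System.out.println(sameObject(literal, interned));    // true
        System.out.println(literal.equals(built));            // true (same characters)
    }
}
```

Note that none of this tells you where the pool lives (permgen or main heap); it only shows which String objects are shared.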
String pooling
String pooling (sometimes also called string canonicalisation) is a
process of replacing several String objects with equal value but
different identity with a single shared String object. You can achieve
this goal by keeping your own Map<String, String> (with possibly soft
or weak references depending on your requirements) and using map
values as canonicalised values. Or you can use the String.intern() method
which is provided to you by the JDK.
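A hand-rolled pool of the kind described above might look roughly like the following. This is a sketch under assumptions, not the article's implementation: the class name StringCanonicalizer and the method canonicalize are hypothetical, and a WeakHashMap with WeakReference values is just one of the possible "soft or weak references" choices.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

public class StringCanonicalizer {
    // Keys are held weakly by WeakHashMap; values are wrapped in WeakReference
    // so the map itself does not keep the canonical strings alive.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    // Returns the shared instance for s, registering s if no equal string is pooled.
    public synchronized String canonicalize(String s) {
        WeakReference<String> ref = pool.get(s);
        String existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing;
        }
        pool.put(s, new WeakReference<>(s));
        return s;
    }

    public static void main(String[] args) {
        StringCanonicalizer c = new StringCanonicalizer();
        String first = new String("hello");
        String second = new String("hello");
        System.out.println(first == second);                                 // false
        System.out.println(c.canonicalize(first) == c.canonicalize(second)); // true
    }
}
```

Unlike String.intern(), such a map is fully under your control: you decide its scope, its reference strength, and when to throw it away.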
In the days of Java 6, using String.intern() was forbidden by many
standards due to a high possibility of getting an OutOfMemoryError if
pooling went out of control. The Oracle Java 7 implementation of string
pooling was changed considerably. You can look for details in
http://bugs.sun.com/view_bug.do?bug_id=6962931 and
http://bugs.sun.com/view_bug.do?bug_id=6962930.
String.intern() in Java 6
In those good old days all interned strings were stored in the PermGen
– the fixed size part of heap mainly used for storing loaded classes
and string pool. Besides explicitly interned strings, PermGen string
pool also contained all literal strings earlier used in your program
(the important word here is used – if a class or method was never
loaded/called, any constants defined in it will not be loaded).
The biggest issue with such string pool in Java 6 was its location –
the PermGen. PermGen has a fixed size and can not be expanded at
runtime. You can set it using -XX:MaxPermSize=96m option. As far as I
know, the default PermGen size varies between 32M and 96M depending on
the platform. You can increase its size, but its size will still be
fixed. Such limitation required very careful usage of String.intern –
you’d better not intern any uncontrolled user input using this method.
That’s why string pooling at times of Java 6 was mostly implemented in
the manually managed maps.
String.intern() in Java 7
Oracle engineers made an extremely important change to the string
pooling logic in Java 7 – the string pool was relocated to the heap.
It means that you are no longer limited by a separate fixed size
memory area. All strings are now located in the heap, like most other
ordinary objects, which allows you to manage only the heap size while
tuning your application. Technically, this alone could be a sufficient
reason to reconsider using String.intern() in your Java 7 programs.
But there are other reasons.
String pool values are garbage collected
Yes, all strings in the JVM string pool are eligible for garbage
collection if there are no references to them from your program roots.
It applies to all discussed versions of Java. It means that if your
interned string went out of scope and there are no other references to
it – it will be garbage collected from the JVM string pool.
Being eligible for garbage collection and residing in the heap, the JVM
string pool seems to be the right place for all your strings, doesn’t it?
In theory that is true: unused strings will be garbage collected from
the pool, and used strings will let you save memory when you
get an equal string from the input. Seems to be a perfect memory
saving strategy? Nearly so. You must know how the string pool is
implemented before making any decisions.
source.
As other answers explain, memory in Java is divided into two portions:
1. Stack: One stack is created per thread. It stores stack frames, which in turn store local variables; if a variable is of a reference type, it refers to a memory location in the heap where the actual object lives.
2. Heap: All objects are created in the heap.
In HotSpot up to Java 7, heap memory is again divided into 3 portions:
1. Young Generation: Stores objects which have a short life. The Young Generation itself is divided into Eden Space and Survivor Space.
2. Old Generation: Stores objects which have survived many garbage collection cycles and are still referenced.
3. Permanent Generation: Stores metadata about the program, e.g. the runtime constant pool. It was removed in Java 8.
Before Java 7, the string constant pool belonged to the permanent generation area of heap memory; as the other answers point out, from Java 7 onwards it is held in the main heap.
We can see the constant pool of a compiled class by using javap -verbose class_name, which will show us method references (#Methodref), Class objects (#Class) and string literals (#String).
You can read more about it in my article How Does JVM Handle Method Overloading and Overriding Internally.
To the great answers already included here I want to add something that is missing from my perspective: an illustration.
As you already know, the JVM divides the memory allocated to a Java program into two parts: one is the stack and the other is the heap. The stack is used for execution and the heap is used for storage. Within the heap, the JVM sets aside some memory specially meant for string literals. This part of the heap memory is called the string constant pool.
So for example, if you initialize the following objects:
String s1 = "abc";
String s2 = "123";
String obj1 = new String("abc");
String obj2 = new String("def");
String obj3 = new String("456");
The String objects for the literals referenced by s1 and s2 go to the string constant pool, while obj1, obj2 and obj3 are created as separate objects on the heap. All of them are referenced from the stack.
Also, please note that "abc" appears both in the heap and in the string constant pool. Why are String s1 = "abc" and String obj1 = new String("abc") created this way? Because String obj1 = new String("abc") explicitly creates a new and referentially distinct instance of a String object, whereas String s1 = "abc" may reuse an instance from the string constant pool if one is available. For a more elaborate explanation: https://stackoverflow.com/a/3298542/2811258
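To make the picture concrete, the following sketch counts how many distinct String objects are involved in the five declarations above. The class name CountStringObjects and the method countDistinctObjects are made up for this example; it uses an identity-based set so that strings are compared with == rather than equals().

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class CountStringObjects {
    static int countDistinctObjects() {
        String s1 = "abc";
        String s2 = "123";
        String obj1 = new String("abc");
        String obj2 = new String("def");
        String obj3 = new String("456");

        // A set that compares its elements by identity (==), not by equals().
        Set<String> distinct = Collections.newSetFromMap(new IdentityHashMap<>());
        // Add the five variables plus the literals passed to the constructors.
        Collections.addAll(distinct, s1, s2, obj1, obj2, obj3, "abc", "def", "456");
        return distinct.size();
    }

    public static void main(String[] args) {
        // Four pooled literals ("abc", "123", "def", "456")
        // plus three objects created with new.
        System.out.println(countDistinctObjects()); // 7
    }
}
```

The count is 7 rather than 8 because s1 and the literal "abc" passed to the constructor of obj1 are the same pooled object.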
