Garbage collection and Strings - java

I have some doubts regarding Strings.
Do they live on the heap or in the String pool?
If on the heap, then they will be garbage collected once they are no longer reachable from any live thread.
If in the String pool, how are they ever deleted or removed, given that garbage collection happens only on the heap?

String s = new String("abc");
The String object referred to by s will be on the heap, and the string literal "abc" will be in the string pool. Objects in the string pool are normally not garbage collected while the class that defines them stays loaded. They are kept for reuse during the lifetime of the program, to improve performance.

They are all stored in the heap, but intern()ed strings (including string literals in the source) are referenced from a pool in the String class.
If they appear as literals in the source code, including constant string expressions (e.g. "a" + "b"), then they will also be referenced from the class they appear in, which usually means they will last as long as the process runs.
Edit:
When you call intern() on a string in your code, it is also added to this pool, but because the pool uses weak references, the string can still be garbage collected once it is no longer in use.
See also:
interned Strings : Java Glossary
Quote from that article:
The collection of Strings registered in this HashMap is sometimes called the String pool. However, they are ordinary Objects and live on the heap just like any other (perhaps in an optimised way since interned Strings tend to be long lived).
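The point about constant string expressions can be checked with reference comparison. This is only a small sketch; the class and method names are made up for illustration, and it relies on the language rule that compile-time constant expressions are folded and interned.

```java
class ConstantExpressionDemo {
    // A constant variable: final and initialised with a literal.
    static final String PREFIX = "a";

    static boolean foldedLiteralIsPooled() {
        // "a" + "b" is a constant expression, so the compiler folds it to "ab".
        return ("a" + "b") == "ab";
    }

    static boolean constantVariableIsFolded() {
        // PREFIX is a constant variable, so PREFIX + "b" is folded as well.
        return (PREFIX + "b") == "ab";
    }

    public static void main(String[] args) {
        System.out.println(foldedLiteralIsPooled());    // true
        System.out.println(constantVariableIsFolded()); // true
    }
}
```

Using == here is deliberate, since the goal is to compare object identity; normal code should compare strings with equals().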

The string "Alex" goes into the literal pool and stays there as long as the process runs (or the web application remains loaded), as finnw explained, so in practice it is never garbage collected. name2 does not allocate memory for "Alex"; it reuses the object from the literal pool.
PS: The literal pool is on the heap as well.
For the string "John", two objects are created, referenced by name3 and name4, and both are garbage collectible.
String name = "Alex";
String name2 = "Alex";
String name3 = new String("John");
String name4 = new String("John");
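A quick way to see the difference between the four variables above is to compare references. The following is a minimal sketch with hypothetical class and method names, not part of the original answer.

```java
class LiteralVersusNewDemo {
    static boolean literalsShareOneObject() {
        String name = "Alex";
        String name2 = "Alex";
        // Both variables refer to the single pooled "Alex" object.
        return name == name2;
    }

    static boolean newCreatesDistinctObjects() {
        String name3 = new String("John");
        String name4 = new String("John");
        // Same contents, but two separate objects created by new.
        return name3 != name4 && name3.equals(name4);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());    // true
        System.out.println(newCreatesDistinctObjects()); // true
    }
}
```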

Related

Java String Pool with String constructor and the intern function

I learned about the Java String Pool recently, and there are a few things that I don't quite understand.
When using the assignment operator, a new String will be created in the String Pool if it doesn't exist there already.
String a = "foo"; // Creates a new string in the String Pool
String b = "foo"; // Refers to the already existing string in the String Pool
When using the String constructor, I understand that regardless of the String Pool's state, a new string will be created in the heap, outside of the String Pool.
String c = new String("foo"); // Creates a new string in the heap
I read somewhere that even when using the constructor, the String Pool is being used. It will insert the string into the String Pool and into the heap.
String d = new String("bar"); // Creates a new string in the String Pool and in the heap
I didn't find any further information about this, but I would like to know if that's true.
If that is indeed true, then - why? Why does java create this duplicate string? It seems completely redundant to me since the strings in java are immutable.
Another thing that I would like to know is how the .intern() function of the String class works: Does it just return a pointer to the string in the String Pool?
And finally, in the following code:
String s = new String("Hello");
s = s.intern();
Will the garbage collector delete the string that is outside the String Pool from the heap?
You wrote
String c = new String("foo"); // Creates a new string in the heap
I read somewhere that even when using the constructor, the String Pool is being used. It will insert the string into the String Pool and into the heap.
That’s somewhat correct, but you have to read the code correctly. Your code contains two String instances. First, you have the string literal "foo" that evaluates to a String instance, the one that will be inserted into the pool. Then, you are creating a new String instance explicitly, using new String(…) calling the String(String) constructor. Since the explicitly created object can’t have the same identity as an object that existed prior to its creation, two String instances must exist.
Why does java create this duplicate string? It seems completely redundant to me since the strings in java are immutable.
Well it does so, because you told it so. In theory, this construction could get optimized, skipping the intermediate step that you can’t perceive anyway. But the first assumption for a program’s behavior should be that it does precisely what you have written.
You could ask why there’s a constructor that allows such a pointless operation. In fact, this has been asked before and this answer addresses this. In short, it’s mostly a historical design mistake, but this constructor has been used in practice for other technical reasons; some do not apply anymore. Still, it can’t be removed without breaking compatibility.
String s = new String("Hello");
s = s.intern();
Will the garbage collector delete the string that is outside the String Pool from the heap?
Since the intern() call will evaluate to the instance that had been created for "Hello" and is distinct from the instance created via new String(…), the latter will definitely be unreachable after the second assignment to s. Of course, this doesn’t say whether the garbage collector will reclaim the string’s memory, only that it is allowed to do so. But keep in mind that the majority of the heap occupation will be the array that holds the character data, which will be shared between the two string instances (unless you use a very outdated JVM). This array will still be in use as long as either of the two strings is in use. Recent JVMs even have the String Deduplication feature, which may cause other strings with the same contents in the JVM to use this array (to allow collection of their formerly used array). So the lifetime of the array is entirely unpredictable.
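The identity relationships described above can be observed directly. This is an illustrative sketch only; the class and method names are invented for the example.

```java
class InternReassignDemo {
    static boolean internYieldsLiteralInstance() {
        String s = new String("Hello");
        // The explicitly constructed object is distinct from the pooled literal.
        boolean distinctBefore = s != "Hello";
        s = s.intern();
        // After reassignment, s refers to the pooled instance; the object
        // created by new is no longer referenced from here.
        return distinctBefore && s == "Hello";
    }

    public static void main(String[] args) {
        System.out.println(internYieldsLiteralInstance()); // true
    }
}
```

Whether and when the now-unreferenced object is actually reclaimed cannot be observed from this code; the sketch only shows which object each reference points to.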
Q: I read somewhere that even when using the constructor, the String Pool is being used. It will insert the string into the String Pool and into the heap. [...] I didn't find any further information about this, but I would like to know if that's true.
It is NOT true. A string created with new is not placed in the string pool ... unless something explicitly calls intern() on it.
Q: Why does java create this duplicate string?
Because the JLS specifies that every new generates a new object. It would be counter-intuitive if it didn't (IMO).
The fact that it is nearly always a bad idea to use new String(String) is not a good reason to make new behave differently in this case. The real answer is that programmers should learn not to write that ... except in the extremely rare cases where it is necessary.
Q: Another thing that I would like to know is how the intern() function of the String class works: Does it just return a pointer to the string in the String Pool?
The intern method always returns a pointer to a string in the string pool. That string may or may not be the string you called intern() on.
There have been different ways that the string pool was implemented.
In the original scheme, interned strings were held in a special heap called the PermGen heap. In that scheme, if the string you were interning was not already in the pool, then a new string would be allocated in PermGen space, and the intern method would return that.
In the current scheme, interned strings are held in the normal heap, and the string pool is just a (private) data structure. When the string being interned is not in the pool, it is simply linked into the data structure. A new string does not need to be allocated.
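The "current scheme" can be illustrated by interning a string whose contents do not occur as a literal anywhere. This is a sketch that assumes a Java 7 or later JVM; the class name and the "pool-demo-" prefix are made up for the example.

```java
class InternFirstSeenDemo {
    static boolean internLinksTheSameObject() {
        // Built at run time, so no equal string is in the pool yet.
        String fresh = new StringBuilder("pool-demo-")
                .append(System.nanoTime())
                .toString();
        // With nothing equal in the pool, intern() registers this very
        // object and returns it instead of allocating a copy.
        return fresh.intern() == fresh;
    }

    public static void main(String[] args) {
        System.out.println(internLinksTheSameObject()); // true
    }
}
```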
Q: Will the garbage collector delete the string that is outside the String Pool from the heap?
The rule is the same for all Java objects, no matter how they were created, and irrespective of where (in which "space" or "heap" in the JVM) they reside.
If an object is not reachable from the running application, then it is eligible for deletion by the garbage collector.
That doesn't mean that an unreachable object will be garbage collected in any particular run of the GC. (Or indeed ever ... in some circumstances.)
The above rule equally applies to the String objects that correspond to string literals. If it ever becomes possible that a literal can never be used again, then it may be garbage collected.
That doesn't normally happen. The JVM keeps hidden references to each string literal object in a private data structure associated with the class that defined it. Since classes normally exist for the lifetime of the JVM, their string literal objects remain reachable. (Which makes sense ... since the application may need to use them.)
However, if a class is loaded using a dynamically created classloader, and that classloader becomes unreachable, then so will all of its classes. So it is actually possible for a string literal object to become unreachable. If it does, it may be garbage collected.
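The reachability rule can be probed with a WeakReference, which does not keep its target alive. The sketch below uses invented names and only demonstrates the deterministic half of the rule: an object that is still strongly reachable is never collected. What happens to the unreachable one is deliberately left open, because System.gc() is only a hint.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    static boolean survivesWhileReachable() {
        String strong = new String("temporary");
        WeakReference<String> weak = new WeakReference<>(strong);
        System.gc(); // a request, not a guarantee
        // strong is still used here, so the object must not have been collected.
        return weak.get() == strong;
    }

    static WeakReference<String> weakOnly() {
        // No strong reference survives this method, so the String
        // becomes eligible for collection as soon as it returns.
        return new WeakReference<>(new String("temporary"));
    }

    public static void main(String[] args) {
        System.out.println(survivesWhileReachable());
        WeakReference<String> weak = weakOnly();
        System.gc();
        // May print null or "temporary": collection is allowed, not promised.
        System.out.println(weak.get());
    }
}
```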

String pool - do String always exist in constant pool?

When a string is created using a literal, it gets stored in the pool. But when the new operator is used to create a String object, the object is stored on the heap.
But is the object on the heap just a pointer to the literal stored in the pool, or is it a plain String object stored on the heap that is eligible for GC?
Terminology:
The Constant Pool is an area in (each) .class file that contains various constants, including strings. No runtime objects exist in the constant pool. It is a region of a file.
The String Pool is a runtime data structure used by the JVM to manage certain kinds of strings. (Specifically, String objects that correspond to literals, and String objects added to the pool by String::intern().)
Your question is actually talking about the String Pool, not the Constant Pool.
To answer your questions:
String pool - do String always exist in constant pool?
No. A string object created using new String() doesn't exist in either the string pool or the constant pool.
When a string is created using a literal, it gets stored in the pool.
The string (already!) exists in the constant pool and gets created as a Java String object in the string pool. (The actual creation can be at class load time, or when the literal is first used. This depends on the Java implementation.)
But when the new operator is used to create a String object, the object is stored on the heap.
Yes. But the string pool is also part of the Heap. Like I said, it is a data structure, not a region of storage.
(In the old days, the string pool lived in a special heap called the PermGen heap. But PermGen was replaced with something else (Metaspace), and the string pool doesn't use either ... anymore.)
But is the object on the heap just a pointer to the literal stored in the pool, or is it a plain String object stored on the heap that is eligible for GC?
This is really confused.
All strings are represented as String objects in the (a) heap. That includes strings in the string pool. Even when the string pool was in PermGen.
All String objects that are unreachable are eligible for garbage collection. Even for strings in the string pool. Even for String objects that represent string literals.
But ... wait ... so can string literals be garbage collected?
Yes!! If a String object that represents a string literal becomes unreachable at runtime it is eligible for garbage collection, just like any other String object.
A string literal can become unreachable if the code object(s) that use the literal become unreachable. This can happen when a classloader becomes unreachable.
And yes, PermGen was garbage collected. At least since JDK 1.2. (IIRC Java 1.0 and maybe 1.1 didn't implement GC for the PermGen heap. But that was fixed a long time ago.)

How can I destroy reference from String pool in Java?

Strings are immutable. When I declare:
String a = "abc";
String a1 = "abc";
Both variables refer to the same object. So how can I remove this "abc" reference from the String pool?
My use case is that I am developing an application for hardware with little memory, so I need to clear references from the String pool to save memory.
No, you typically cannot "destroy reference from String pool in Java" manually.
I suppose the main reason you are asking is to avoid out-of-memory errors. In Java 6, all interned strings were stored in PermGen, the fixed-size part of the heap mainly used for storing loaded classes and the string pool. Besides explicitly interned strings, the PermGen string pool also contained all literal strings used earlier in your program. The biggest issue with the string pool in Java 6 was its location: PermGen has a fixed size and cannot be expanded at runtime. You can set it using the -XX:MaxPermSize=N option. Heavy interning could therefore lead to memory leaks and out-of-memory errors.
In Java 7, the string pool was relocated to the heap. This means that you are no longer limited by a separate fixed-size memory area. All strings are now located in the heap, like most other ordinary objects.
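You may also tune the size of the string pool's hash table (the number of buckets) by configuring -XX:StringTableSize=N.
If you are not sure about the string pool usage, try the -XX:+PrintStringTableStatistics JVM argument. It prints the string pool usage when your program terminates.
The JDK also includes a tool named jmap, which can be used to find out the number of interned strings in your application. (This applies to JDK 8 and earlier; the -heap option was removed from jmap in JDK 9, where jhsdb jmap --heap --pid process_id takes its place.)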
jmap -heap process_id
Eg:
jmap -heap 18974
Along with other output, this command prints the number of interned strings and the space they occupy: "xxxxxx interned Strings occupying xxxxxx bytes."
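To have something to look at with the -XX:+PrintStringTableStatistics option mentioned above, a small program that interns many strings and keeps them reachable can help. This is only a sketch; the class name, the "entry-" prefix and the count are arbitrary choices for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class StringTableLoadDemo {
    // Interns `count` run-time strings and returns them so they stay reachable.
    static List<String> internMany(int count) {
        List<String> kept = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            kept.add(("entry-" + i).intern());
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> kept = internMany(100_000);
        System.out.println("Interned and kept " + kept.size() + " strings");
    }
}
```

After compiling, running it as java -XX:+PrintStringTableStatistics StringTableLoadDemo should print the string table statistics at exit; comparing runs with different counts shows how the number of entries grows.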
The rules for garbage collection of objects in the String pool are the same as for other Strings or any other object. However, the String objects that correspond to String literals are almost always reachable, since there is an implicit reference to the String object in the code of every method that uses the literal, so they are typically not candidates for garbage collection.
However, this is not always the case. If the literal was defined in a class that was dynamically loaded (e.g. using Class.forName(...)), then it is possible to arrange that the class is unloaded. If that happens, then the String object for the literal will be unreachable, and will be reclaimed when the heap containing the interned String gets GC'ed.

Confusion on string immutability

I have the following code:
public class StaticImplementer {
    private static String str = "ABC";

    public static void main(String[] args) {
        str = str + "XYZ";
    }
}
Questions:
Here String str is static, so where will this object be stored in memory?
Since String is immutable, where will the object for "XYZ" be stored?
Where will the final object be stored?
And how will garbage collection be done?
1) Here String str is static, so where will this object be stored in memory?
The literals will be stored in the String pool, regardless of whether the variable is declared static.
More info here: Where does Java's String constant pool live, the heap or the stack?
2) Since String is immutable, where will the object for "XYZ" be stored?
Similar to the first answer: a literal will be stored in the pool.
Immutability is what makes it safe to share pooled objects.
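3) Where will the final object be stored?
Because str is not a final constant, str + "XYZ" is evaluated at run time and produces an ordinary String on the heap, not a pooled one. Only a concatenation of literals, which the Java specification treats as a constant expression (its value is known at compile time), ends up in the pool like a literal.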
Excerpt:
"This is a " + // actually a string-valued constant expression,
"two-line string" // formed from two string literals
4) And how will garbage collection be done?
By the nature of the pool, pooled strings are not garbage collected by default.
Indeed, if they were collected immediately, the "shared" concept of the pool would fail.
Here String str is static, so where will this object be stored in memory?
String str is not an object, it's a reference to an object. "ABC", "XYZ" & "ABCXYZ" are three distinct String objects. Thus, str points to a string. You can change what it points to, but not that which it points at.
Since String is immutable, where will the object for "XYZ" be stored?
As explained above and also by Mik378, "XYZ" is just a String object that is kept in the String pool, and a reference to it is returned whenever "XYZ" is used or assigned to a variable.
Where will the final object be stored?
The final object, "ABCXYZ", is created at run time by the concatenation, because str is not a compile-time constant. It is an ordinary String on the heap and is not added to the pool unless intern() is called on it. The reference to it is then assigned to str.
And how will garbage collection will be done?
String literals are interned. As of Java 7, the HotSpot JVM puts interned Strings in the heap, not permgen. In earlier versions of Java, JVM placed interned Strings in permgen. However, Strings in permgen were garbage collected. Apparently, Class objects in permgen are also collectable, so everything in permgen is collectable, though permgen collection might not be enabled by default in some old JVMs.
For an interned string literal, the declaring Class object holds a reference to the String object in the intern pool. So the interned literal String would only be collected if the Class object that referred to it were also collected.
Shishir
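Where the result of str + "XYZ" ends up can be checked with reference comparison. The following is a minimal sketch with hypothetical names; it relies on the language rule that a concatenation which is not a constant expression creates a new String at run time.

```java
class RuntimeConcatDemo {
    static boolean runtimeConcatIsNotPooled() {
        String str = "ABC";           // not final, so not a constant variable
        String result = str + "XYZ";  // evaluated at run time
        // Equal contents, but not the pooled "ABCXYZ" object.
        return result != "ABCXYZ" && result.equals("ABCXYZ");
    }

    static boolean internMapsToPooledLiteral() {
        String str = "ABC";
        String result = str + "XYZ";
        // intern() returns the pooled instance with the same contents.
        return result.intern() == "ABCXYZ";
    }

    public static void main(String[] args) {
        System.out.println(runtimeConcatIsNotPooled());   // true
        System.out.println(internMapsToPooledLiteral());  // true
    }
}
```

Declaring str as final would turn str + "XYZ" into a constant expression, and the first check would then no longer hold.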

Garbage collection of String literals

I am reading about garbage collection and I am getting confusing search results when I search for garbage collection of String literals.
I need clarification on the following points:
If a string is defined as a literal at compile time [e.g. String str = "java"], will it be garbage collected?
If I use the intern method [e.g. String str = new String("java").intern()], will it be garbage collected? Also, will it be treated differently from the String literal in point 1?
In some places it is mentioned that literals will be garbage collected only when the String class is unloaded. Does that make sense? I don't think the String class will ever be unloaded.
If a string is defined as literal at compile time [e.g: String str = "java";] then will it be garbage collected?
Probably not. The code objects will contain one or more references to the String objects that represent the literals. So as long as the code objects are reachable, the String objects will be too.
It is possible for code objects to become unreachable, but only if they were dynamically loaded ... and their classloader is destroyed.
If I use the intern method [e.g: String str = new String("java").intern()] then will it be garbage collected?
The object returned by the intern call will be the same object that represents the "java" string literal. (The "java" literal is interned at class loading time. When you then intern the newly constructed String object in your code snippet, it will look up and return the previously interned "java" string.)
However, interned strings that are not identical with string literals can be garbage collected once they become unreachable. The PermGen space is garbage collected on all recent HotSpot JVMs. (Prior to Java 8 ... which drops PermGen entirely.)
Also will it be treated differently from string literal in point 1.
No ... because it is the same object as the string literal.
And indeed, once you understand what is going on, it is clear that string literals are not treated specially either. It is just an application of the "reachability" rule ...
Some places it is mentioned that literals will be garbage collected only when String class will be unloaded? Does it make sense because I don't think the String class will ever be unloaded.
You are right. It doesn't make sense. The sources that said that are incorrect. (It would be helpful if you posted a URL so that we can read what they are saying for ourselves ...)
Under normal circumstances, string literals and classes are all allocated into the JVM's permanent generation ("PermGen"), and usually won't ever be collected. Strings that are interned (e.g. mystring.intern()) are stored in a memory pool owned by the String class in permgen, and it was once the case that aggressive interning could cause a space leak because the string pool itself held a reference to every string, even if no other references existed. Apparently this is no longer true, at least as of JDK 1.6 (see, e.g., here).
For more on permgen, this is a decent overview of the topic. (Note: that link goes to a blog associated with a product. I don't have any association with the blog, the company, or the product, but the blog entry is useful and doesn't have much to do with the product.)
The literal string will remain in memory as long as the program is in memory.
str will be garbage collected, but the literal it is created from will not.
That makes perfect sense, since the string class is unloaded when the program is unloaded.
The intern() method checks whether an equal string is already in the String pool. If it is, a reference to the pooled object is returned. If it is not, the string is added to the pool (which lived in the perm area on old JVMs) and a reference to it is returned. The intern() method should be used judiciously.
