What if Size of String Pool Exceeds? - java

In Java, string literals in the string constant pool are not garbage collected,
because they are referenced from a table of references that the runtime creates in order to save space.
Since every string in the literal pool is referenced, none of them is eligible for GC.
So what happens if the string literal pool grows too large?
How is this handled by the JVM?

There is a long discussion with real code examples at JavaRanch.
The general conclusion is the following:
If a string is added to the string pool at RUNTIME using String.intern(), it can be garbage collected once it is no longer in use. Most probably, the string pool keeps only weak references to added strings, which allows them to be garbage collected (I can't be sure, because String.intern() is a native method).
If a string is a COMPILE time constant, it is added to the constant pool of the corresponding class. Therefore, it can be garbage collected only after the class is unloaded.
The answer to your question is: the only way you can get an OutOfMemoryError because of String constants is to load lots of classes with many string literals computed at compile time. Then you can eventually exceed the maximum size of the PermGen space. But this will happen at the time you load classes into memory (e.g., when you start your application, deploy a project to a web server, or dynamically load a new library).
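The difference between a compile-time constant and a string pooled at run time can be seen in a small sketch. The class name InternDemo and the helper runtimeCopy are made up for illustration; only String.intern() and reference comparison with == are standard Java.

```java
// Illustrative sketch; InternDemo and runtimeCopy are hypothetical names.
final class InternDemo {
    // Builds an equal String at run time, so it is a separate object from any literal.
    static String runtimeCopy(String s) {
        return new String(s.toCharArray());
    }

    public static void main(String[] args) {
        String literal = "pooled";           // compile-time constant, taken from the pool
        String copy = runtimeCopy(literal);  // distinct heap object with equal contents

        System.out.println(literal.equals(copy));      // true: same characters
        System.out.println(literal == copy);           // false: different objects
        System.out.println(literal == copy.intern());  // true: intern() returns the pooled instance
    }
}
```

Whether and when an interned string is later collected depends on the JVM and cannot be observed reliably from a short program, so the sketch only shows object identity.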

String literals can be collected when they are no longer needed. Usually this is not a problem if they appear in classes, because there are other limits you are likely to reach first if you attempt to load lots of classes, e.g. the maximum PermGen size.
Generally speaking, developers are smart enough not to overuse the string literal pool and instead use databases or files to load the bulk of their data if it is of non-trivial size.
You can introduce a problem if you use String.intern() a lot in an attempt to optimise the memory use of your system. String.intern() is not free and becomes increasingly expensive as you add a large number (millions) of strings to the pool. If this is a performance problem, it should be reasonably obvious to the developer when it is.

Related

Why String doesn't use scp only

We know that a String object is created in the heap or in the scp depending on the situation. But what if String used only the scp in every situation, so that we could save some memory?
Firstly, despite what you have heard or read, the Oracle documentation does not mention a thing called the "string constant pool". (Or "scp".) In fact, there are two distinct things:
The Constant Pool, which is part of the ".class" file format and represents the many kinds of constants emitted by the compiler.
The String Pool, which is a runtime data structure that is primarily used to implement certain properties of String objects that originated as the result of compile time constant expressions.
But while the latter holds "constants", it can also hold String objects that were placed there by calling String.intern. So in that respect it is not a string "constant" pool. Alternatively, all String objects are immutable (constant), so from that perspective the "constant" in string constant pool is redundant.
In addition you say:
We know that String create object in heap and scp based on situation.
In a modern JVM (Java 7 or later), the strings in the string pool are actually in the regular heap.
The only situations where the JVM puts a string into the string pool are:
when creating a String object corresponding to a String-valued constant-expression in a .class file, or
when application code calls the String.intern method.
No other string constructors or methods do it, and (AFAIK) none of the standard Java SE libraries ever use intern().
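As a rough illustration of the two rules above, the sketch below contrasts a constant expression, which the compiler folds and the JVM pools, with a concatenation done at run time, which produces a new unpooled String. The class and method names are hypothetical.

```java
// Illustrative sketch; PoolingRulesDemo and its method names are hypothetical.
final class PoolingRulesDemo {
    // A constant variable: final and initialized with a constant expression.
    static final String PREFIX = "ab";

    // PREFIX + "c" is a constant expression, so it refers to the pooled "abc".
    static boolean constantExpressionIsPooled() {
        String folded = PREFIX + "c";
        return folded == "abc";
    }

    // Here the left operand is not a constant, so the result is a newly created String.
    static boolean runtimeConcatenationIsPooled() {
        String prefix = String.valueOf(new char[] {'a', 'b'});
        String joined = prefix + "c";
        return joined == "abc";
    }

    public static void main(String[] args) {
        System.out.println(constantExpressionIsPooled());    // true
        System.out.println(runtimeConcatenationIsPooled());  // false
    }
}
```

Comparing strings with == is done here only to expose object identity; application code should compare contents with equals().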
So to answer your question:
Why String doesn't use scp only?
Because when the String is not a duplicate, putting it into the pool doesn't save memory. Rather it uses more memory.
Because a String in the string pool tends to live on beyond the point where it becomes unreachable. (This is certainly true for Java versions where the string pool was in the PermGen heap.) So you may end up using the memory for a pooled String for longer than if it hadn't been pooled. That can also mean that more memory is used overall.
Because searching the string pool each time you created a new string would be (relatively) expensive.
Because the string pool creates more work for the garbage collector. The pool is a native hash table data structure that contains references that are akin to Reference types. Searching or scanning the table to remove strings that are no longer reachable costs extra GC time.
Because ... frankly ... the percentage of memory used by (aka "wasted" on) duplicate strings is not significant in most applications.
Java tends to be memory hungry by virtue of being a garbage collected language. But it is memory hungry irrespective of string pooling. So if your application requirements include running in a minimal memory footprint, Java string pooling is not the solution. You should probably be using a different programming language.
In fact, since Java 8u20 there is a better way to save memory used by duplicate strings: enable the GC's string deduplication feature (-XX:+UseStringDeduplication, originally available with the G1 collector). It is more efficient than interning because the deduping is only done on strings that have survived a number of new-space garbage collections. This reduces the wasted effort on deduping strings that turn out to be short lived.

What are good Java coding practices to help Java GC? [duplicate]

Java's GC pause is a killer. Often the application doesn't even have a memory leak, yet at some point it may pause for ~1 second for every 1 GB of memory allocated.
What are good Java coding practices to help Java GC?
One example: an object becomes eligible for garbage collection once nothing references it, so it may be a good idea to explicitly set a reference to null, e.g. object = null.
In general, to help the GC you should avoid unreasonable memory usage. Some simple advice:
1) Do not create new objects where they are not needed. For example, do not use constructions like String test = new String("blabla");. If possible, reuse existing objects (this applies mainly to immutable objects).
2) Do not declare class fields for values that are used only inside methods; make them local variables instead.
3) Avoid object wrappers for primitive types, i.e. use int instead of Integer and boolean instead of Boolean, unless you really need to store null values. Likewise, where possible, save memory by using plain Java arrays of primitive types instead of ArrayList (not Integer[], but int[]).
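To make the third point concrete, here is a minimal sketch that computes the same sum over an int[] and over an ArrayList<Integer>. The class and method names are invented for illustration. Both methods return the same value; the boxed version stores a reference to an Integer object per element (small values may come from the Integer cache), while the primitive version stores the values directly in a single array.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; BoxingDemo and its method names are hypothetical.
final class BoxingDemo {
    // Stores n ints in one contiguous primitive array.
    static long sumPrimitive(int n) {
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = i;
        }
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Stores n Integer references; each add() autoboxes the int.
    static long sumBoxed(int n) {
        List<Integer> values = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            values.add(i);
        }
        long sum = 0;
        for (Integer v : values) {
            sum += v; // unboxing on every read
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumPrimitive(1000)); // 499500
        System.out.println(sumBoxed(1000));     // 499500
    }
}
```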
The single best thing that you can do to minimize GC pauses is to properly size your heap.
If you know that your program never uses more than 1 GB of live objects, then it's pointless to pass -Xms4096m. That will actually increase your GC pauses, because the JVM will leave garbage around until it absolutely has to clear it.
Similarly, if you know that you have very few long-lived objects, you can usually benefit by increasing the size of the young generation relative to the tenured generation (for Sun JVMs).
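Before tuning heap flags it helps to know what the running JVM actually has. The following is a small sketch, with a hypothetical class name, that reads the current limits through the standard Runtime API; it only reports numbers and does not change any setting.

```java
// Illustrative sketch; HeapInfo and its method names are hypothetical.
final class HeapInfo {
    // Upper bound the JVM will try to use for the heap (influenced by -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap currently in use: memory reserved so far minus the free part of it.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("max heap:  " + maxHeapBytes() / mib + " MiB");
        System.out.println("used heap: " + usedHeapBytes() / mib + " MiB");
    }
}
```

Running this with different -Xms and -Xmx values shows how the reported figures change; the printed values depend on the machine and JVM.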
The only thing that you can really do, coding-wise, is to move large objects off-heap. But that's unlikely to be useful for most applications.
Objects that are allocated and not released soon afterwards get moved into the tenured heap space. Tenured memory is the most expensive to collect. Avoiding churn in long-lived objects should help with GC pauses.

String Immutability memory Issue

Once a String object is created, we can't modify it; any operation on it makes the JVM create a new object. Creating these new objects consumes more memory, so I think this causes a memory issue. Is that right?
You are correct. It is definitely worth being aware of this issue, even if it doesn't affect you every time.
As you say, Strings cannot change after creation - they're immutable and they don't expose many ways to change them.
However, operations such as split() generate additional string objects in the background, and each of those strings has a memory overhead if you are holding onto references to them.
As the other posters note, the objects will be small and garbage collection will usually clean up the old ones after they have gone out of scope, so you generally won't have to worry about this.
However, if you're doing something specific and holding onto large amounts of string references then this could bite you.
Look at String interning depending on your use case, noting the warnings on the linked page.
Two things to note:
1) Hard coded String literals will be automatically interned by Java, reducing the impact of this.
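2) Within a single expression the + operator is reasonably efficient: the compiler implements it with a StringBuilder (or an equivalent mechanism) underneath. Repeated + in a loop, however, creates a new intermediate string on every iteration, so an explicit StringBuilder is the better choice there.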
No, it does not. If you do not hold strong references to String instances, they will eventually be collected by the garbage collector.
For example:
while (true) {
    new String("that is a string");
}
In this snippet you continuously create new object instances, but you will never get an OutOfMemoryError, because the created instances immediately become garbage (there are obviously no strong references to them).
Creating new objects does consume more memory, that's right. But that fact in itself does not create an issue, because the garbage collector reclaims unreachable memory. Of course you can turn it into an issue by keeping references to the newly created strings, but that would be an issue in your program, not in the JVM.
The biggest memory issue you have to know about is taking a small substring of a huge string. In Java versions before 7u6, that substring shares the original string's char array, so even if the original string gets gc'd, the huge char array is still referenced by the substring. The workaround is to use new String(hugeString.substring(i)). From Java 7u6 onward, substring() copies the characters it needs, so this problem no longer arises.
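The workaround can be wrapped in a small helper, sketched below with hypothetical names. On old JVMs where substring() shared the parent's array, the extra new String(...) forces a copy of just the needed characters; on JVMs where substring() already copies, the wrapper is harmless but redundant.

```java
// Illustrative sketch; SubstringCopyDemo and detachedPrefix are hypothetical names.
final class SubstringCopyDemo {
    // Returns the first `length` characters as a String that does not
    // depend on the (possibly huge) source string's backing array.
    static String detachedPrefix(String huge, int length) {
        return new String(huge.substring(0, length));
    }

    public static void main(String[] args) {
        String huge = "header:" + "x".repeat(1_000_000); // String.repeat needs Java 11+
        String small = detachedPrefix(huge, 6);
        System.out.println(small); // header
    }
}
```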
The issue that is generated is the fact that garbage is generated. This issue is resolved by the virtual machine by calling the garbage collector which frees the memory used by that garbage.
As soon as the old object is not used anymore, it can be removed by the garbage collector. (Which will be done far before any memory issue arises).
If you want to prevent the copying of the data, use a StringBuilder.
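A minimal sketch of that advice, with made-up class and method names: both methods build the same result, but the first creates a new intermediate String on every loop iteration, while the second appends into one growing buffer and creates a single String at the end.

```java
// Illustrative sketch; ConcatDemo and its method names are hypothetical.
final class ConcatDemo {
    // Each += copies everything accumulated so far into a new String.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // Appends into one mutable buffer; only toString() creates the final String.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(joinWithPlus(parts));    // abc
        System.out.println(joinWithBuilder(parts)); // abc
    }
}
```

For a handful of short strings the difference is negligible; it starts to matter when the loop runs many times or the strings are large.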
Unused objects are collected by the GC.
Immutability also has many benefits in Java.
Achieving as much immutability as possible is good practice in Java.
Immutable objects can also be used safely in the Collections framework.
As far as I know, StringBuilder (or StringBuffer, if you need thread safety) is useful for working with strings in a mutable form.
Changing a few characters in a huge string this way does not consume much extra memory.
It is also faster for concatenation.
Since a string instance is immutable, it can be reused by the JVM. The string literal pool is an application of the Flyweight design pattern, which shares equal instances to save memory.

Is there any way to "flush" interned strings?

I'm using an external library which uses String.intern() for performance reasons. That's fine, but I'm invoking that library a lot in a given run and so I run into the dreaded
java.lang.OutOfMemoryError: PermGen space
Obviously I can use the JVM command-line -XX:MaxPermSize modifier, but that solution isn't very scalable. Instead, is there any way to periodically (between two "batches" of library calls) "flush" the interned string pool, i.e. empty the static table of strings held by the String class?
No. Just size permgen appropriately. It's no different to having to size the heap appropriately. Don't be afraid!
Investigating further, I found this article, which seems to demonstrate that interned strings are still garbage collected. I guess that means that my problem here is a deeper one - the library I use must still hold a living reference to these strings :(

What is perm space?

While learning about java memory profiling, I keep seeing the term "perm space" in addition to "heap." I know what the heap is - what's perm space?
It stands for permanent generation:
The permanent generation is special because it holds meta-data describing user classes (classes that are not part of the Java language). Examples of such meta-data are objects describing classes and methods and they are stored in the Permanent Generation. Applications with large code-base can quickly fill up this segment of the heap which will cause java.lang.OutOfMemoryError: PermGen no matter how high your -Xmx and how much memory you have on the machine.
Perm space is used to keep information about loaded classes and a few other advanced features like the String pool (for highly optimized string equality testing), which is usually populated through String.intern().
As your application (that is, its number of classes) grows, this space fills up quickly. Since garbage collection in this space is not very effective, you soon get an OutOfMemoryError: PermGen space. After that, the application cannot run effectively, even if the rest of the JVM heap is largely empty.
Before starting your application, you should raise the limit with java -XX:MaxPermSize=<size> to get rid of this error.
Simple (and oversimplified) answer: it's where the jvm stores its own bookkeeping data, as opposed to your data.
Perm Gen stands for permanent generation which holds the meta-data information about the classes.
Suppose you create a class named A: its instance variables are stored in heap memory, while class A itself, together with its class loader metadata, is stored in the permanent generation.
Garbage collectors find it difficult to free the memory held in the permanent generation. Hence it is always recommended to keep the PermGen memory settings at an advisable limit.
Java 8 introduced Metaspace, which replaced PermGen, so PermGen settings are no longer needed on JDK 1.8 and later.
The permgen space is the area of heap that holds all the reflective data of the virtual machine itself, such as class and method objects.
It holds stuff like class definitions, string pool, etc. I guess you could call it meta-data.
PermGen space is also known as the method area. The classloader subsystem loads the class file (bytecode) into the method area (PermGen).
It contains all the class metadata, e.g. the fully qualified name of your class, the fully qualified name of the immediate parent class, variable info, constructor info, constant pool info, etc.
What exists under PermGen: the class area comes under PermGen. Static fields are created at class loading time, so they also live in PermGen. The constant pool area, which holds pooled immutable values such as Strings, is kept here. In addition, class data loaded by class loaders, object arrays, and internal objects used by the JVM are also located there.
PermGen space is the memory allocated for the permanent generation. It is used for loading classes into memory and, in older JVMs (before Java 7), for pooled strings, that is, strings created from literals or with String.intern(). Not all immutable objects live there; ordinary String objects are allocated on the heap. Pooling speeds up string equality checks, because two pooled strings with the same contents are the same object.
The JVM has an internal representation of Java objects, and those internal representations are stored in the heap (in the young generation or the tenured generation).
The JVM also has an internal representation of the Java classes, and those are stored in the permanent generation.
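To see which memory areas a particular JVM actually has, you can list its memory pools through the standard java.lang.management API. This is only a sketch with a hypothetical class name. The pool names are JVM-specific: a HotSpot JVM from Java 7 or earlier typically reports a permanent generation pool, while Java 8 and later report Metaspace instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; MemoryPoolsDemo and poolNames are hypothetical names.
final class MemoryPoolsDemo {
    // Collects "name (type)" for every memory pool the running JVM exposes.
    static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(pool.getName() + " (" + pool.getType() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        // Output differs between JVM versions and garbage collectors.
        for (String name : poolNames()) {
            System.out.println(name);
        }
    }
}
```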
