We know that String creates objects in the heap and in the SCP depending on the situation, but what if String used only the SCP in every situation, so that we could save some memory space?
Firstly, despite what you have heard or read, the Oracle documentation does not mention a thing called the "string constant pool". (Or "scp".) In fact, there are two distinct things:
The Constant Pool, which is part of the ".class" file format and represents many of the kinds of constants emitted by the compiler.
The String Pool is a runtime data structure that is primarily used to implement certain properties about String objects that originated as the result of compile time constant expressions.
But while the latter holds "constants", it can also hold String objects that were placed there by calling String.intern. So in that respect it is not a string "constant" pool. Alternatively, all String objects are immutable (constant), so from that perspective the "constant" in string constant pool is redundant.
In addition you say:
We know that String create object in heap and scp based on situation.
In a modern JVM (Java 7 or later), the strings in the string pool are actually in the regular heap.
The only situations where the JVM puts a string into the string pool are:
when creating a String object corresponding to a String-valued constant-expression in a .class file, or
when application code calls the String.intern method.
No other string constructors or methods do it, and (AFAIK) none of the standard Java SE libraries ever use intern().
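To make the two cases above concrete, here is a minimal sketch. The class and method names (StringPoolDemo, sameObject) are made up for illustration; the behaviour shown follows from the language rules for string literals and from the documented contract of String.intern.

```java
final class StringPoolDemo {
    // True when both references point at the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal1 = "hello";
        String literal2 = "hello";                 // same compile-time constant
        String constructed = new String("hello");  // always a fresh object
        String interned = constructed.intern();    // the pooled instance

        System.out.println(sameObject(literal1, literal2));    // true
        System.out.println(sameObject(literal1, constructed)); // false
        System.out.println(sameObject(literal1, interned));    // true
    }
}
```

Only the literal and the explicit intern() call go through the pool; the constructor call does not.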
So to answer your question:
Why String doesn't use scp only?
Because when the String is not a duplicate, putting it into the pool doesn't save memory. Rather it uses more memory.
Because a String in the string pool tends to stay in memory beyond the point where it becomes unreachable. (This is certainly true for Java versions where the string pool was in the PermGen heap.) So you may end up using the memory for a pooled String for longer than if it hadn't been pooled. That can also mean that more memory is used overall.
Because searching the string pool each time you created a new string would be (relatively) expensive.
Because the string pool creates more work for the garbage collector. The pool is a native hash table data structure that contains references that are akin to Reference types. Searching or scanning the table to remove strings that are no longer reachable costs extra GC time.
Because ... frankly ... the percentage of memory used by (aka "wasted" on) duplicate strings is not significant in most applications.
Java tends to be memory hungry by virtue of being a garbage collected language. But it is memory hungry irrespective of string pooling. So if your application requirements include running in a minimal memory footprint, Java string pooling is not the solution. You should probably be using a different programming language.
In fact, since Java 8u20 there is a better way to save memory used by duplicate strings: enable the GC's string deduplication feature (-XX:+UseStringDeduplication, originally available with the G1 collector only). It is more efficient than interning because the deduping is only done on strings that have survived a number of new-space garbage collections. This reduces the effort wasted on deduping strings that turn out to be short lived.
Related
I have read the Oracle documentation, but it says nothing about the method area and the string constant pool. Where do the method area and the string constant pool reside in memory in JDK 8 and later?
The Java Language Specification does not specify where this lives.
It also doesn't matter. These objects end up being created, and there is no way to directly access the memory they live in, so their location makes no difference to your code.
That's sort of how Java works: the spec says what you can and cannot rely on, which gives JVM implementations room to do whatever they want, so long as they fulfill the contract. "Where in memory..." is a question that doesn't matter in Java; you can't manipulate memory directly at all.
Go back to why you think you need to know and find another way; any answer to this question would be specific to some implementation of the JVM, and therefore your code wouldn't be portable. That is, any version update to the JVM, or some alternative JVM implementation such as OpenJ9 rolls along and your code just breaks, probably with a raw core dump. That doesn't sound like a good idea.
In Java 8 and later:
the method area is in metaspace
the string pool is in the regular heap.
This is an implementation detail for Oracle and OpenJDK JVMs. Other implementations may be different. But it really doesn't matter where strings and code are stored. Your application doesn't need to know.
By the way, it is called the "string pool", not the "string constant pool".
All strings are constant in the sense that they are immutable.
String variables that are declared as static final (and are constant in that sense) are not necessarily in the string pool.
Not all strings in the string pool are static final.
Not all strings in the string pool are string literals or other compile-time constant values.
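A small sketch of the last three points, assuming nothing beyond the documented contract of String.intern. The names PoolMembershipDemo, NOT_POOLED, isPooledInstance and buildAndIntern are hypothetical. Note that the probe has a side effect: intern() adds the string to the pool when no equal string is there yet.

```java
final class PoolMembershipDemo {
    // static final, but created with new, so it is not the pooled instance.
    static final String NOT_POOLED = new String("config-key");

    // A reference is the pooled instance exactly when intern() returns it.
    // Caveat: if no equal string is pooled yet, intern() pools s itself.
    static boolean isPooledInstance(String s) {
        return s.intern() == s;
    }

    // Builds a string at run time (not a literal) and interns it.
    static String buildAndIntern(String prefix, int n) {
        return (prefix + n).intern();
    }

    public static void main(String[] args) {
        System.out.println(isPooledInstance(NOT_POOLED));                  // false
        System.out.println(isPooledInstance("config-key"));                // true
        System.out.println(isPooledInstance(buildAndIntern("user-", 42))); // true
    }
}
```

The first line shows a static final string that is not in the pool; the last shows a pooled string that is neither static final nor a literal.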
Java strings are immutable, and instantiating multiple Strings with the same values returns the same object pointer. (Is there a term for this? "pooling" seems to fit, but that already refers to doing caching to save time by doing fewer instantiations.)
Does Java also do this (the thing without a term) with other (user-defined) classes that are immutable? Can Java even detect that a class is immutable, or is this something unique to the string class?
Wrt. Strings, the word you're looking for is interning.
Java won't do this for your own immutable objects. It does have cached versions of boxed primitives, though. See this article on wrapper class caching for more info.
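As a rough illustration of the boxed-primitive caching mentioned above (the class and method names are made up): Integer.valueOf is documented to cache values from -128 to 127, while anything outside that range may or may not be cached, so the sketch makes no claim about it.

```java
final class BoxedCacheDemo {
    // Boxes the same value twice and checks whether both boxes are one object.
    static boolean sameBoxedInstance(int value) {
        Integer a = Integer.valueOf(value);
        Integer b = Integer.valueOf(value);
        return a == b;
    }

    public static void main(String[] args) {
        // Inside the documented cache range: the same object is returned.
        System.out.println(sameBoxedInstance(127));
        // Outside the range the result depends on the JVM and its settings.
        System.out.println(sameBoxedInstance(1000));
        // equals() compares values and is what you should use in real code.
        System.out.println(Integer.valueOf(1000).equals(Integer.valueOf(1000)));
    }
}
```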
As others here have said this process with Strings is known as interning.
It's worth mentioning that where interned strings are stored changed in Java 7 (string literals with the same value still refer to the same object). From 7 onwards:
In JDK 7, interned strings are no longer allocated in the permanent generation of the Java heap, but are instead allocated in the main part of the Java heap (known as the young and old generations), along with the other objects created by the application. This change will result in more data residing in the main Java heap, and less data in the permanent generation, and thus may require heap sizes to be adjusted. Most applications will see only relatively small differences in heap usage due to this change, but larger applications that load many classes or make heavy use of the String.intern() method will see more significant differences.
Take a look at Java SE 7 RFE for the full details on this.
With regard to your own immutable objects, Java doesn't do anything special with them; it doesn't know that they're immutable. It may inline methods a little more than otherwise if it can detect that it's worthwhile and possible, but as far as the compiler and JVM are concerned they're just another object.
The term you are looking for is interning. Java optimizes strings "automatically" during compilation and gives the developer the possibility to do it at run time. (The details about what is optimized when depend on the JVM version.)
As for immutable objects in general, I do not think that Java supports any mechanism that would resolve equal objects to the same instance. The String type is no exception to this rule.
The reason is that you have to use the new operator to create an instance. If you use new to create string instances, you will always get different objects.
Interning is available only for the String type. But the concept is free: you can add such a method to your own immutable class and write code that does the same thing.
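For what it's worth, here is one way such a do-it-yourself interning method could look. This is only a sketch: the Point class and its of() factory are invented for the example, and the pool holds strong references, so entries are never released.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class Point {
    // Canonical instances, keyed by value. Strong references: never shrinks.
    private static final ConcurrentMap<Point, Point> POOL = new ConcurrentHashMap<>();

    private final int x;
    private final int y;

    // Private, so callers cannot bypass the pool with "new".
    private Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Returns the canonical instance for the given coordinates.
    static Point of(int x, int y) {
        Point candidate = new Point(x, y);
        Point existing = POOL.putIfAbsent(candidate, candidate);
        return existing != null ? existing : candidate;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }

    public static void main(String[] args) {
        System.out.println(Point.of(1, 2) == Point.of(1, 2)); // true
        System.out.println(Point.of(1, 2) == Point.of(2, 1)); // false
    }
}
```

Making the constructor private is the important design choice: as long as callers can use new directly, nothing guarantees a single instance per value.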
String interning. Wikipedia: String Interning
String interning is unique to the String class. I suppose that the JVM does not apply these rules to user-defined classes.
Once a String object is created, we can't modify it. But if we perform any operation on it, the JVM creates a new object. By creating new objects the JVM consumes more memory. So I think this causes a memory issue, right?
You are correct. It is definitely worth being aware of this issue, even if it doesn't affect you every time.
As you say, Strings cannot change after creation - they're immutable and they don't expose many ways to change them.
However, operations such as split() generate additional string objects in the background, and each of those strings has a memory overhead if you are holding onto references to them.
As the other posters note, the objects will be small and garbage collection will usually clean up the old ones after they have gone out of scope, so you generally won't have to worry about this.
However, if you're doing something specific and holding onto large amounts of string references then this could bite you.
Look at String interning depending on your use case, noting the warnings on the linked page.
Two things to note:
1) Hard coded String literals will be automatically interned by Java, reducing the impact of this.
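2) Within a single expression the + operator is reasonably efficient: the compiler translates it into StringBuilder calls (or, since Java 9, into an invokedynamic-based concatenation). Repeated concatenation in a loop still creates a new String on every iteration, though.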
No, it does not. If you do not hold strong references to String instances, they will eventually be collected by the garbage collector.
For example:
while (true) {
    new String("that is a string");
}
In this snippet you continuously create new object instances; however, you will never get an OutOfMemoryError, because the created instances immediately become garbage (there are obviously no strong references to them).
It consumes more memory for new objects, that's right. But that fact in itself does not create an issue, because the garbage collector reclaims memory that is no longer reachable. Of course you can turn it into an issue by keeping references to the newly created strings, but that would be an issue of your program, not of the JVM.
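The difference between the two situations can be sketched like this (GarbageDemo, churn and retain are made-up names). The first method never keeps a reference, so every string is collectable right after its iteration; the second keeps all of them reachable for as long as the returned list is alive.

```java
import java.util.ArrayList;
import java.util.List;

final class GarbageDemo {
    // Creates n short-lived strings; none is referenced after its iteration.
    static long churn(int n) {
        long totalLength = 0;
        for (int i = 0; i < n; i++) {
            String temp = new String("that is a string"); // fresh object each time
            totalLength += temp.length();
        }
        return totalLength;
    }

    // Creates n strings and keeps them all reachable through the list.
    static List<String> retain(int n) {
        List<String> kept = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            kept.add(new String("that is a string"));
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(churn(1_000_000));   // 16000000
        System.out.println(retain(10).size());  // 10
    }
}
```

Calling retain with a large enough n is how you would run out of heap; churn can run with any n.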
A memory issue you had to know about in older Java versions (before 7u6) was taking a small substring of a huge string. There the substring shared the original string's char array, so even if the original string got gc'd, the huge char array was still referenced by the substring. The workaround was to use new String(hugeString.substring(i)). Since Java 7u6, substring() copies the characters it needs, so this is no longer a concern.
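The workaround mentioned above, written out as a small helper (SubstringCopyDemo and detachedTail are hypothetical names). As far as I know, substring() itself copies the characters since Java 7u6, so the extra new String(...) only makes a difference on older JVMs; on newer ones it is harmless but redundant.

```java
final class SubstringCopyDemo {
    // Returns the tail of 'huge' as a string with its own backing storage.
    static String detachedTail(String huge, int from) {
        // new String(...) forces a copy, so the result cannot keep
        // the original (possibly huge) character array alive.
        return new String(huge.substring(from));
    }

    public static void main(String[] args) {
        String huge = "a very long document ... tail";
        String tail = detachedTail(huge, huge.length() - 4);
        System.out.println(tail); // tail
    }
}
```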
The only issue is that garbage is generated. The virtual machine resolves this by running the garbage collector, which frees the memory used by that garbage.
As soon as the old object is not used anymore, it can be removed by the garbage collector. (This happens long before any memory issue arises.)
If you want to prevent the copying of the data, use a StringBuilder.
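A minimal sketch of what that means in practice, with invented names (JoinDemo, joinWithBuilder, joinWithPlus). Both methods return the same text; the difference is how many intermediate String objects are created along the way.

```java
final class JoinDemo {
    // One growing buffer; a single String is created at the end.
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    // Each iteration copies everything so far into a brand new String.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result = result + part;
        }
        return result;
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(joinWithBuilder(parts)); // abc
        System.out.println(joinWithPlus(parts));    // abc
    }
}
```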
Unused objects are collected by the GC.
Immutability also has many benefits in Java.
In Java, achieving as much immutability as possible is good practice.
Immutable objects can also be used safely in the Collections framework.
As far as I know, StringBuilder (or StringBuffer if you need thread safety) is useful for working with string data in a mutable form.
Changing a few characters of a huge string through a StringBuilder does not 'eat' many bytes of memory.
It is also faster for concatenation.
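Roughly what in-place editing looks like, as a sketch with made-up names (InPlaceEditDemo, maskDigits). All edits happen in the one buffer owned by the StringBuilder; the only new String is the one returned at the end.

```java
final class InPlaceEditDemo {
    // Replaces every digit with '#', editing a single buffer in place.
    static String maskDigits(String text) {
        StringBuilder sb = new StringBuilder(text);
        for (int i = 0; i < sb.length(); i++) {
            if (Character.isDigit(sb.charAt(i))) {
                sb.setCharAt(i, '#'); // no new String is created here
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(maskDigits("card 1234")); // card ####
    }
}
```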
Since a string instance is immutable, it can be reused by the JVM. The String class is implemented with the Flyweight design pattern, which is used to avoid memory issues.
In Java, string literals in the String Constant Pool are not garbage collected, since they are referenced from a table of references which is created by an instance of the runtime in order to optimize space.
If the size of the string literal pool is exceeded, then, since each string in the pool is still referenced and therefore not eligible for GC, how is this handled by the JVM?
There is a long discussion with real code examples at JavaRanch.
The general conclusion is the following:
If a string is added to the pool at RUNTIME using String.intern(), it can be garbage collected after it is no longer in use. Most probably, the string pool keeps only weak references to added strings, thus allowing them to be garbage collected (can't be sure, because String.intern() is a native method).
If a string is a COMPILE time constant, it is added to the constant pool of the corresponding class. Therefore, it can be garbage collected only after the class is unloaded.
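To show where the line between "compile time" and "runtime" falls, here is a small sketch (ConstantFoldingDemo and runtimeConcat are hypothetical names). A concatenation of two literals is a constant expression and ends up in the class's constant pool as a single string; a concatenation of method parameters is computed at run time and produces a new object.

```java
final class ConstantFoldingDemo {
    // Parameters are not compile-time constants, so this runs at run time.
    static String runtimeConcat(String left, String right) {
        return left + right;
    }

    public static void main(String[] args) {
        String folded = "ab" + "cd";              // folded by the compiler
        String built = runtimeConcat("ab", "cd"); // new object at run time

        System.out.println(folded == "abcd");         // true
        System.out.println(built == "abcd");          // false
        System.out.println(built.intern() == "abcd"); // true
    }
}
```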
The answer to your question is: the only way you can get an OutOfMemoryError because of String constants is to load lots of classes with many string literals computed at compile time. Then you can eventually exceed the maximum size of PermGen space. But this will happen at the time you load classes into memory (e.g., start your application, deploy a project to a web server, dynamically load a new library, etc.)
String literals can be collected when they are no longer needed. Usually this is not a problem if they appear in classes, because there are other limits you are likely to reach first if you attempt to load lots of classes, e.g. the maximum PermGen size.
Generally speaking, developers are smart enough not to overuse the string literal pool and instead use databases or files to load the bulk of their data if it's of a non-trivial size.
You can introduce a problem if you use String.intern() a lot in an attempt to optimise the space of your system. String.intern() is not free and becomes increasingly expensive if you add a large number (millions) of strings to the pool. If this is a performance problem, it should be reasonably obvious to the developer when it is.
I'm using an external library which uses String.intern() for performance reasons. That's fine, but I'm invoking that library a lot in a given run and so I run into the dreaded
java.lang.OutOfMemoryError: PermGen space
Obviously I can use the JVM command-line -XX:MaxPermSize modifier, but that solution isn't very scalable. Instead, is there any way to periodically (between two "batches" of library calls) "flush" the interned string pool, i.e. empty the static table of strings held by the String class?
No. Just size permgen appropriately. It's no different to having to size the heap appropriately. Don't be afraid!
Investigating further, I found this article, which seems to demonstrate that interned strings are still garbage collected. I guess that means that my problem here is a deeper one - the library I use must still hold a living reference to these strings :(