Why String Deduplication when we have String Pool - java

String De-duplication:
Strings consume a lot of memory in any application.Whenever the garbage collector visits String objects it takes note of the char arrays. It takes their hash value and stores it alongside with a weak reference to the array. As soon as it finds another String which has the same hash code it compares them char by char.If they match as well, one String will be modified and point to the char array of the second String. The first char array then is no longer referenced anymore and can be garbage collected.
String Pool:
All strings used by the java program are stored here. If two variables are initialized to the same string value. Two strings are not created in the memory, there will be only one copy stored in memory and both will point to the same memory location.
So java already takes care of not creating duplicate strings in the heap by checking if the string exists in the string pool. Then what is the purpose of string de-duplication?
If there is a code as follows
String myString_1 = new String("Hello World");
String myString_2 = new String("Hello World");
two strings are created in memory even though they are same. I cannot think of any scenario other than this where string de-duplication is useful. Obviously I must be missing something. What I am I missing?
Thanks In Advance

The string pool applies only to strings added to it explicitly, or used as constants in the application. It does not apply to strings created dynamically during the lifetime of the application. String deduplication, however, applies to all strings.

Compile time vs run time
String pool refers to string constants that are known at compile time.
String deduplication would help you if you happen to retrieve (or construct) the same string a million times at run time, e.g. reading it from a file, a HTTP request or any other way.

String de-duplication enjoys the extra level of indirection built into String:
With a string pool, you are limited to returning the same object for two identical strings
String de-duplication lets you have multiple distinct String objects sharing the same content.
This translates into removing a limitation of de-duplicating on creation: your application could keep creating new String objects with identical content while using very little extra memory, because the content of the strings would be shared. This process can be done on a completely unrelated schedule - for example, in the background, while your application does not need much of the CPU resources. Since the identity of the String object does not change, de-duplication can be completely hidden form your application.

Just to add to the answers above, on older VM's the string pool is not garbage collected (this has changed now, but don't rely on that). It contains strings which are used as constants in the application, and so will always be needed. If you continually put all your strings in the string pool, you might quickly run out of memory. On top of that, de-duplication is a relatively expensive process, if you know you only need the string for a very short period of time, and you have enough memory.
For these reasons, strings are not put in the string pool automatically. You have to do it explicitly by calling string.intern().

From documentation :
"Initializes a newly created String object so that it represents
the same sequence of characters as the argument; in other words, the
newly created string is a copy of the argument string. Unless an
explicit copy of original is needed, use of this constructor is
unnecessary since Strings are immutable."
So my sense says, this constructor in String class is not needed normally like you have used above. I guess that constructor is provided merely for the sake of completeness or if you do not want to share that copy (kind of unnecessary now, refer here what I am talking about) but still other constructors are useful like getting an String object from char array and so on..

I cannot think of any scenario other than this where string de-duplication is useful.
Well one other (much more) frequent scenario is the use of StringBuilders. In the toString() method of the StringBuilder class, it clearly creates a new instance in memory:
public final class StringBuilder extends AbstractStringBuilder
implements java.io.Serializable, CharSequence
{
...
#Override
public String toString() {
// Create a copy, don't share the array
return new String(value, 0, count);
}
...
}
Same thing for its thread-safe version StringBuffer:
public final class StringBuffer extends AbstractStringBuilder
implements java.io.Serializable, CharSequence
{
...
#Override
public synchronized String toString() {
if (toStringCache == null) {
toStringCache = Arrays.copyOfRange(value, 0, count);
}
return new String(toStringCache, true);
}
...
}
In applications that rely heavily on this, string de-duplication may reduce memory usage.

Related

Java String Pool with String constructor and the intern function

I learned about the Java String Pool recently, and there's a few things that I don't quiet understand.
When using the assignment operator, a new String will be created in the String Pool if it doesn't exist there already.
String a = "foo"; // Creates a new string in the String Pool
String b = "foo"; // Refers to the already existing string in the String Pool
When using the String constructor, I understand that regardless of the String Pool's state, a new string will be created in the heap, outside of the String Pool.
String c = new String("foo"); // Creates a new string in the heap
I read somewhere that even when using the constructor, the String Pool is being used. It will insert the string into the String Pool and into the heap.
String d = new String("bar"); // Creates a new string in the String Pool and in the heap
I didn't find any further information about this, but I would like to know if that's true.
If that is indeed true, then - why? Why does java create this duplicate string? It seems completely redundant to me since the strings in java are immutable.
Another thing that I would like to know is how the .intern() function of the String class works: Does it just return a pointer to the string in the String Pool?
And finally, in the following code:
String s = new String("Hello");
s = s.intern();
Will the garbage collector delete the string that is outside the String Pool from the heap?
You wrote
String c = new String("foo"); // Creates a new string in the heap
I read somewhere that even when using the constructor, the String Pool is being used. It
will insert the string into the String Pool and into the heap.
That’s somewhat correct, but you have to read the code correctly. Your code contains two String instances. First, you have the string literal "foo" that evaluates to a String instance, the one that will be inserted into the pool. Then, you are creating a new String instance explicitly, using new String(…) calling the String(String) constructor. Since the explicitly created object can’t have the same identity as an object that existed prior to its creation, two String instances must exist.
Why does java create this duplicate string? It seems completely redundant to me since the strings in java are immutable.
Well it does so, because you told it so. In theory, this construction could get optimized, skipping the intermediate step that you can’t perceive anyway. But the first assumption for a program’s behavior should be that it does precisely what you have written.
You could ask why there’s a constructor that allows such a pointless operation. In fact, this has been asked before and this answer addresses this. In short, it’s mostly a historical design mistake, but this constructor has been used in practice for other technical reasons; some do not apply anymore. Still, it can’t be removed without breaking compatibility.
String s = new String("Hello");
s = s.intern();
Will the garbage collector delete the string that is outside the String Pool from the heap?
Since the intern() call will evaluate to the instance that had been created for "Hello" and is distinct from the instance created via new String(…), the latter will definitely be unreachable after the second assignment to s. Of course, this doesn’t say whether the garbage collector will reclaim the string’s memory only that it is allowed to do so. But keep in mind that the majority of the heap occupation will be the array that holds the character data, which will be shared between the two string instances (unless you use a very outdated JVM). This array will still be in use as long as either of the two strings is in use. Recent JVMs even have the String Deduplication feature that may cause other strings of the same contents in the JVM use this array (to allow collection of their formerly used array). So the lifetime of the array is entirely unpredictable.
Q: I read somewhere that even when using the constructor, the String Pool is being used. It will insert the string into the String Pool and into the heap. [] I didn't find any further information about this, but I would like to know if that's true.
It is NOT true. A string created with new is not placed in the string pool ... unless something explicitly calls intern() on it.
Q: Why does java create this duplicate string?
Because the JLS specifies that every new generates a new object. It would be counter-intuitive if it didn't (IMO).
The fact that it is nearly always a bad idea to use new String(String) is not a good reason to make new behave differently in this case. The real answer is that programmers should learn not to write that ... except in the extremely rare cases that that it is necessary to do that.
Q: Another thing that I would like to know is how the intern() function of the String class works: Does it just return a pointer to the string in the String Pool?
The intern method always returns a pointer to a string in the string pool. That string may or may not be the string you called intern() or.
There have been different ways that the string pool was implemented.
In the original scheme, interned strings were held in a special heap call the PermGen heap. In that scheme, if the string you were interning was not already in the pool, then a new string would be allocated in PermGen space, and the intern method would return that.
In the current scheme, interned strings are held in the normal heap, and the string pool is just a (private) data structure. When the string being interned a not in the pool, it is simply linked into the data structure. A new string does not need to be allocated.
Q: Will the garbage collector delete the string that is outside the String Pool from the heap?
The rule is the same for all Java objects, no matter how they were created, and irrespective of where (in which "space" or "heap" in the JVM) they reside.
If an object is not reachable from the running application, then it is eligible for deletion by the garbage collector.
That doesn't mean that an unreachable object will be be garbage collected in any particular run of the GC. (Or indeed ever ... in some circumstances.)
The above rule equally applies to the String objects that correspond to string literals. If it ever becomes possible that a literal can never be used again, then it may be garbage collected.
That doesn't normally happen. The JVM keeps a hidden references to each string literal object in a private data structure associated with the class that defined it. Since classes normally exists for the lifetime of the JVM, their string literal objects remain reachable. (Which makes sense ... since the application may need to use them.)
However, if a class is loaded using a dynamically created classloader, and that classloader becomes unreachable, then so will all of its classes. So it is actually possible for a string literal object to become unreachable. If it does, it may be garbage collected.

Java: How exactly String intern() and StringPool works?

According to Javadoc about String.intern():
When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
I have few questions about the same.
When a new String object (not using a string literal but using new() operator) is created like:
String str = new String("Test");
Question: I am aware that a new object will be created in heap. But will it also put String Test into the stringpool during object creation? If yes, then why the reference is not returned directly for the stringpool. If no, why not directly put the string in the pool as now the StringPool has been moved out of the PermGen and is in regular heap space (i.e. there is no space constraint apart from the heap space limit). There are some posts which state that the String is inserted in pool as soon as object is created whereas there are posts which contradicts this too.
Once we call String.intern() on a String object (as literals are already interned) what happens to the space allocated to the object? Is it reclaimed at the same moment or it waits for the next GC cycle?
Accepted answer to another question on SO, states that String intern should be used when you need speed since you can compare strings by reference (== is faster than equals).
Question: I am aware that when using String.intern() it returns reference to the string already present in the StringPool. But this requires a full scan lookup on the StringPool which can be an expensive operation in itself. So is this speed achieved during string comparison justifiable? If so, why?
I have looked at below sources:
JavaDoc
SO Question ques1, ques2, ques3
http://java-performance.info/string-intern-in-java-6-7-8/
And other misc sources from SO and outside world
All string literals are interned on compilation time. Using a string literal with the single argument constructor taking a string is a bit of an abuse of that constructor, hence you are likely to get two of them (but maybe there is a special compiler case for this, I can't say for sure). As of java 8 the implementation of the constructor (for openjdk) is this:
public String(String original) {
this.value = original.value;
this.hash = original.hash;
}
So no special treatment on this side. If you know the literal don't use this constructor.
I don't think there is any special GC semantics for Strings. It will get collected once it's unreachable and deemed collection worthy by the GC as any other object.
Don't ever use == for comparing strings, the first step in the default equals method for Strings is doing just that. If this is your dominant case (you know you are working with interned strings most of the time) you are only paying the overhead of a method call which is tiny, the potential for future bugs you add by doing something like that is just too big of a risk for a gain that is minuscule.

How many objects are created in StringBuilder?

Based on my understanding on String objects, that every string literal is being added/created on the String Constant Pool.
String a = new String("hello world");
two objects are being created, one is the "hello world" being added on the constant pool and the other object is the new instance of String. Does the same principle is being applied when it comes to StringBuilder?
StringBuilder b = new StringBuilder("java is fun");
b.append(" and good");
in the StringBuilder example does it mean that there are 3 objects being created? The first one is the new instance of StringBuilder and the other 2 objects are the string literals "java is fun" and " and good"?
Yes, your understanding is correct. (But see below.) String literals go in the constant pool, while the new String(...) and new StringBuilder(...) create additional objects. (The StringBuilder also creates an internal character array object, so that there are at least four objects involved in the second example.) Note that the call to b.append(...) may internally create another object and some garbage, but only if the internal character array used by b needs to expand.
EDIT: As #fdreger points out in a comment, the string objects corresponding to the string literals are created not at run time, but rather at the time the class is created during the creation and loading phase of the class life cycle (as described in the Java Language Specification; see also the section on the runtime constant pool).
Yes, you're right.
"java is fun" and " and good" are going to be stored in the constant pool. I would call them objects, but yes, it's more like it.
b in this case is going to be an object with a limited lifespan. It may actually store its own copy of both strings in its buffer to work with, so that originals won't be modified.
The strings are created when the class is loaded, not when the actual code is executed. You may think of them as part of your code, exactly as any other literal value.
So in your second example:
two strings are created when the code is loaded for the first time
no strings are created during the execution of the code.

Is "new String()" immutable as well?

I've been studying Java String for a while. The following questions are based on the below posts
Java String is special
Immutability of String in java
Immutability:
Now, going by the immutability, the String class has been designed so that the values in the common pool can be reused in other places/variables. This holds good if the String was created as
String a = "Hello World!";
However, if I create String like
String b = new String("Hello World!");
why is this immutable as well? (or is it?). Since this has a dedicated heap memory, I should be able to modify this without affecting any other variable. So by design, was there any other reason why String as a whole is considered immutable? Or is my above assumption wrong?
Second thing I wanted to ask was about the common string pool. If I create a string object as
String c = "";
is an empty entry created in the pool?
Is there any post already on these? If so, could someone share the link?
new String() is an expression that produces a String ... and a String is immutable, no matter how it is produced.
(Asking if new String() is mutable or not is nonsensical. It is program code, not a value. But I take it that that is not what you really meant.)
If I create a string object as String c = ""; is an empty entry created in the pool?
Yes; that is, an entry is created for the empty string. There is nothing special about an empty String.
(To be pedantic, the pool entry for "" gets created long before your code is executed. In fact, it is created when your code is loaded ... or possibly even earlier than that.)
So, I was wanted to know whether the new heap object is immutable as well, ...
Yes it is. But the immutability is a fundamental property of String objects. All String objects.
You see, the String API simply does not provide any methods for changing a String. So (apart from some dangerous and foolish1 tricks using reflection), you can't mutate a String.
and if so what was the purpose?.
The primary reason that Java String is designed as an immutable class is simplicity. It makes it easier to write correct programs, and read / reason about other people's code if the core string class provides an immutable interface.
An important second reason is that the immutability of String has fundamental implications for the Java security model. But I don't think this was a driver in the original language design ... in Java 1.0 and earlier.
Going by the answer, I gather that other references to the same variable is one of the reasons. Please let me know if I am right in understanding this.
No. It is more fundamental than that. Simply, all String objects are immutable. There is no complicated special case reasoning required to understand this. It just >>is<<.
For the record, if you want a mutable "string-like" object in Java, you can use StringBuilder or StringBuffer. But these are different types to String.
1 - The reason these tricks are (IMO) dangerous and foolish is that they affect the values of strings that are potentially shared by other parts of your application via the string pool. This can cause chaos ... in ways that the next guy maintaining your code has little chance of tracking down.
String is immutable irrespective of how it is instantiated
1) Short answer is yes, new String() is immutable too.
Because every possible mutable operation (like replace,toLowerCase etcetra) that you perform on String does not affect the original String instance and returns you a new instance.
You may check this in Javadoc for String. Each public method of String that is exposed returns a new String instance and does not alter the present instance on which you called the method.
This is very helpful in Multi-threaded environment as you don't have to think about mutability (someone will change the value) every time you pass or share the String around. String can easily be the most used data type, so the designers have blessed us all to not think about mutability everytime and saved us a lot of pain.
Immutability allowed String pool or caching
It is because of immutability property that the internal pool of string was possible, as when same String value is required at some other place then that immutable reference is returned. If String would have been mutable then it would not have been possible to share Strings like this to save memory.
String immutablity was not because of pooling, but immutability has more benefits attached to it.
String interning or pooling is an example of Flyweight Design pattern
2) Yes it will be interned like any other String as a blank String is also as much a String as other String instances.
References:
Immutability benefits of String
The Java libraries are heavily optimized around the constraint that any String object is immutable, regardless of how that object is constructed. Even if you create your b using new, other code that you pass that instance to will treat the value as immutable. This is an example of the Value Object pattern, and all of the advantages (thread-safety, no need to make private copies) apply.
The empty string "" is a legitimate String object just like anything else, it just happens to have no internal contents, and since all compile-time constant strings are interned, I'll virtually guarantee that some runtime library has already caused it to be added to the pool.
1) The immutable part isn't because of the pool; it just makes the pool possible in the first place. Strings are often passed as arguments to other functions or even shared with other threads; Making Strings immutable was a design decision to make reasoning in such situations easier. So yes - Strings in Java are always immutable, no matter how you create them (note that it's possible to have mutable strings in java - just not with the String class).
2) Yes. Probably. I'm not actually 100% sure, but that should be the case.
This isn't strictly an answer to your question, but if behind your question is a wish to have mutable strings that you can manipulate, you should check out the StringBuilder class, which implements many of the exact same methods that String has but also adds methods to change the current contents.
Once you've built your string in such a way that you're content with it, you simply call toString() on it in order to convert it to an ordinary String that you can pass to library routines and other functions that only take Strings.
Also, both StringBuilder and String implements the CharSequence interface, so if you want to write functions in your own code that can use both mutable and immutable strings, you can declare them to take any CharSequence object.
From the Java Oracle documentation:
Strings are constant; their values cannot be changed after they are
created.
And again:
String buffers support mutable strings. Because String
objects are immutable they can be shared.
Generally speaking: "all primitive" (or related) object are immutable (please, accept my lack of formalism).
Related post on Stack Overflow:
Immutability of Strings in Java
Is Java String immutable?
Is Java "pass-by-reference" or "pass-by-value"?
Is a Java string really immutable?
String is immutable. What exactly is the meaning?
what the reason behind making string immutable in java?
About the object pool: object pool is a java optimization which is NOT related to immutable as well.
String is immutable means that you cannot change the object itself no matter how you created it.And as for the second question: yes it will create an entry.
It's the other way around, actually.
[...] the String class has been designed so that the values in the common pool can be reused in other places/variables.
No, the String class is immutable so that you can safely reference an instance of it, without worrying about it being modified from another part of your program. This is why pooling is possible in the first place.
So, consider this:
// this string literal is interned and referenced by 'a'
String a = "Hello World!";
// creates a new instance by copying characters from 'a'
String b = new String(a);
Now, what happens if you simply create a reference to your newly created b variable?
// 'c' now points to the same instance as 'b'
String c = b;
Imagine that you pass c (or, more specifically, the object it is referencing) to a method on a different thread, and continue working with the same instance on your main thread. And now imagine what would happen if strings were mutable.
Why is this the case anyway?
If nothing else, it's because immutable objects make multithreading much simpler, and usually even faster. If you share a mutable object (that would be any stateful object, with mutable private/public fields or properties) between different threads, you need to take special care to ensure synchronized access (mutexes, semaphores). Even with this, you need special care to ensure atomicity in all your operations. Multithreading is hard.
Regarding the performance implications, note that quite often copying the entire string into a new instance, in order to change even a single character, is actually faster than inducing an expensive context switch due to synchronization constructs needed to ensure thread-safe access. And as you've mentioned, immutability also offers interning possibilities, which means it can actually help reduce memory usage.
It's generally a pretty good idea to make as much stuff immutable as your can.
1) Immutability: String will be immutable if you create it using new or other way for security reasons
2) Yes there will be a empty entry in the string pool.
You can better understand the concept using the code
String s1 = new String("Test");
String s2 = new String("Test");
String s3 = "Test";
String s4 = "Test";
System.out.println(s1==s2);//false
System.out.println(s1==s3);//false
System.out.println(s2==s3);//false
System.out.println(s4==s3);//true
Hope this will help your query. You can always check the source code for String class in case of better understanding Link : http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/lang/String.java
1-String is immutable.See this:
Is a Java string really immutable?
So you can create it with many way.
2-Short answer: Yes will be empty.
Strings created will always be immutable regardless of how they are created.
Answers to your questions:
The only difference is:
When string is created like -- {String a = "Hello World!";} then only one object gets created.
And when it is created like -- {String b = new String("Hello World!");} then two objects get created. The first one, because you have used 'new' keyword and the second one because of the String property.
Yes, for sure. There will be an empty entry created in the pool.
String is immutable because it does not provide you with a mean to modify it. It is design to avoid any tampering (it is final, the underlying array is not supposed to be touched ...).
Identically, Integer is immutable, because there is no way to modify it.
It does not matter how you create it.
Immutability is not a feature of new, it is a feature of the class String. It has no mutator methods, so it immutable.
String A = "Test"
String B = "Test"
Now String B called"Test".toUpperCase()which change the same object into"TEST", soAwill also be"TEST"` which is not desirable.
Note that in your example, the reference is changed and not the object it refers to, i.e. b as a reference may be changed and refer to a new object. But that new object is immutable, which means its contents will not be changed "after" calling the constructor.
You can change a string using b=b+"x"; or b=new String(b);, and the content of variable a seem to change, but don't confuse immutability of a reference (here variable b) and the object it is referring to (think of pointers in C). The object that the reference is pointing to will remain unchanged after its creation.
If you need to change a string by changing the contents of the object (instead of changing the reference), you can use StringBuffer, which is a mutable version of String.

What is meant by immutable?

What exactly does immutable mean - that is, what are the consequences of an object being mutable or immutable? In particular, why are Java's Strings immutable?
My understanding is that the StringBuilder type is something like a mutable equivalent to String. When would I use StringBuilder rather than String, and vice-versa?
Immutable means that once the constructor for an object has completed execution that instance can't be altered.
This is useful as it means you can pass references to the object around, without worrying that someone else is going to change its contents. Especially when dealing with concurrency, there are no locking issues with objects that never change
e.g.
class Foo
{
private final String myvar;
public Foo(final String initialValue)
{
this.myvar = initialValue;
}
public String getValue()
{
return this.myvar;
}
}
Foo doesn't have to worry that the caller to getValue() might change the text in the string.
If you imagine a similar class to Foo, but with a StringBuilder rather than a String as a member, you can see that a caller to getValue() would be able to alter the StringBuilder attribute of a Foo instance.
Also beware of the different kinds of immutability you might find: Eric Lippert wrote a blog article about this. Basically you can have objects whose interface is immutable but behind the scenes actual mutables private state (and therefore can't be shared safely between threads).
An immutable object is an object where the internal fields (or at least, all the internal fields that affect its external behavior) cannot be changed.
There are a lot of advantages to immutable strings:
Performance: Take the following operation:
String substring = fullstring.substring(x,y);
The underlying C for the substring() method is probably something like this:
// Assume string is stored like this:
struct String { char* characters; unsigned int length; };
// Passing pointers because Java is pass-by-reference
struct String* substring(struct String* in, unsigned int begin, unsigned int end)
{
struct String* out = malloc(sizeof(struct String));
out->characters = in->characters + begin;
out->length = end - begin;
return out;
}
Note that none of the characters have to be copied! If the String object were mutable (the characters could change later) then you would have to copy all the characters, otherwise changes to characters in the substring would be reflected in the other string later.
Concurrency: If the internal structure of an immutable object is valid, it will always be valid. There's no chance that different threads can create an invalid state within that object. Hence, immutable objects are Thread Safe.
Garbage collection: It's much easier for the garbage collector to make logical decisions about immutable objects.
However, there are also downsides to immutability:
Performance: Wait, I thought you said performance was an upside of immutability! Well, it is sometimes, but not always. Take the following code:
foo = foo.substring(0,4) + "a" + foo.substring(5); // foo is a String
bar.replace(4,5,"a"); // bar is a StringBuilder
The two lines both replace the fourth character with the letter "a". Not only is the second piece of code more readable, it's faster. Look at how you would have to do the underlying code for foo. The substrings are easy, but now because there's already a character at space five and something else might be referencing foo, you can't just change it; you have to copy the whole string (of course some of this functionality is abstracted into functions in the real underlying C, but the point here is to show the code that gets executed all in one place).
struct String* concatenate(struct String* first, struct String* second)
{
struct String* new = malloc(sizeof(struct String));
new->length = first->length + second->length;
new->characters = malloc(new->length);
int i;
for(i = 0; i < first->length; i++)
new->characters[i] = first->characters[i];
for(; i - first->length < second->length; i++)
new->characters[i] = second->characters[i - first->length];
return new;
}
// The code that executes
struct String* astring;
char a = 'a';
astring->characters = &a;
astring->length = 1;
foo = concatenate(concatenate(slice(foo,0,4),astring),slice(foo,5,foo->length));
Note that concatenate gets called twice meaning that the entire string has to be looped through! Compare this to the C code for the bar operation:
bar->characters[4] = 'a';
The mutable string operation is obviously much faster.
In Conclusion: In most cases, you want an immutable string. But if you need to do a lot of appending and inserting into a string, you need the mutability for speed. If you want the concurrency safety and garbage collection benefits with it the key is to keep your mutable objects local to a method:
// This will have awful performance if you don't use mutable strings
String join(String[] strings, String separator)
{
StringBuilder mutable;
boolean first = true;
for(int i = 0; i < strings.length; i++)
{
if(first) first = false;
else mutable.append(separator);
mutable.append(strings[i]);
}
return mutable.toString();
}
Since the mutable object is a local reference, you don't have to worry about concurrency safety (only one thread ever touches it). And since it isn't referenced anywhere else, it is only allocated on the stack, so it is deallocated as soon as the function call is finished (you don't have to worry about garbage collection). And you get all the performance benefits of both mutability and immutability.
Actually String is not immutable if you use the wikipedia definition suggested above.
String's state does change post construction. Take a look at the hashcode() method. String caches the hashcode value in a local field but does not calculate it until the first call of hashcode(). This lazy evaluation of hashcode places String in an interesting position as an immutable object whose state changes, but it cannot be observed to have changed without using reflection.
So maybe the definition of immutable should be an object that cannot be observed to have changed.
If the state changes in an immutable object after it has been created but no-one can see it (without reflection) is the object still immutable?
Immutable objects are objects that can't be changed programmatically. They're especially good for multi-threaded environments or other environments where more than one process is able to alter (mutate) the values in an object.
Just to clarify, however, StringBuilder is actually a mutable object, not an immutable one. A regular java String is immutable (meaning that once it's been created you cannot change the underlying string without changing the object).
For example, let's say that I have a class called ColoredString that has a String value and a String color:
public class ColoredString {
private String color;
private String string;
public ColoredString(String color, String string) {
this.color = color;
this.string = string;
}
public String getColor() { return this.color; }
public String getString() { return this.string; }
public void setColor(String newColor) {
this.color = newColor;
}
}
In this example, the ColoredString is said to be mutable because you can change (mutate) one of its key properties without creating a new ColoredString class. The reason why this may be bad is, for example, let's say you have a GUI application which has multiple threads and you are using ColoredStrings to print data to the window. If you have an instance of ColoredString which was created as
new ColoredString("Blue", "This is a blue string!");
Then you would expect the string to always be "Blue". If another thread, however, got ahold of this instance and called
blueString.setColor("Red");
You would suddenly, and probably unexpectedly, now have a "Red" string when you wanted a "Blue" one. Because of this, immutable objects are almost always preferred when passing instances of objects around. When you have a case where mutable objects are really necessary, then you would typically guard the objet by only passing copies out from your specific field of control.
To recap, in Java, java.lang.String is an immutable object (it cannot be changed once it's created) and java.lang.StringBuilder is a mutable object because it can be changed without creating a new instance.
In large applications its common for string literals to occupy large bits of memory. So to efficiently handle the memory, the JVM allocates an area called "String constant pool".(Note that in memory even an unreferenced String carries around a char[], an int for its length, and another for its hashCode. For a number, by contrast, a maximum of eight immediate bytes is required)
When complier comes across a String literal it checks the pool to see if there is an identical literal already present. And if one is found, the reference to the new literal is directed to the existing String, and no new 'String literal object' is created(the existing String simply gets an additional reference).
Hence : String mutability saves memory...
But when any of the variables change value, Actually - it's only their reference that's changed, not the value in memory(hence it will not affect the other variables referencing it) as seen below....
String s1 = "Old string";
//s1 variable, refers to string in memory
reference | MEMORY |
variables | |
[s1] --------------->| "Old String" |
String s2 = s1;
//s2 refers to same string as s1
| |
[s1] --------------->| "Old String" |
[s2] ------------------------^
s1 = "New String";
//s1 deletes reference to old string and points to the newly created one
[s1] -----|--------->| "New String" |
| | |
|~~~~~~~~~X| "Old String" |
[s2] ------------------------^
The original string 'in memory' didn't change, but the
reference variable was changed so that it refers to the new string.
And if we didn't have s2, "Old String" would still be in the memory but
we'll not be able to access it...
"immutable" means you cannot change value. If you have an instance of String class, any method you call which seems to modify the value, will actually create another String.
String foo = "Hello";
foo.substring(3);
<-- foo here still has the same value "Hello"
To preserve changes you should do something like this
foo = foo.sustring(3);
Immutable vs mutable can be funny when you work with collections. Think about what will happen if you use mutable object as a key for map and then change the value (tip: think about equals and hashCode).
java.time
It might be a bit late but in order to understand what an immutable object is, consider the following example from the new Java 8 Date and Time API (java.time). As you probably know all date objects from Java 8 are immutable so in the following example
LocalDate date = LocalDate.of(2014, 3, 18);
date.plusYears(2);
System.out.println(date);
Output:
2014-03-18
This prints the same year as the initial date because the plusYears(2) returns a new object so the old date is still unchanged because it's an immutable object. Once created you cannot further modify it and the date variable still points to it.
So, that code example should capture and use the new object instantiated and returned by that call to plusYears.
LocalDate date = LocalDate.of(2014, 3, 18);
LocalDate dateAfterTwoYears = date.plusYears(2);
date.toString()… 2014-03-18
dateAfterTwoYears.toString()… 2016-03-18
I really like the explaination from SCJP Sun Certified Programmer for Java 5 Study Guide.
To make Java more memory efficient, the JVM sets aside a special area of memory called the "String constant pool." When the compiler encounters a String literal, it checks the pool to see if an identical String already exists. If a match is found, the reference to the new literal is directed to the existing String, and no new String literal object is created.
Objects which are immutable can not have their state changed after they have been created.
There are three main reasons to use immutable objects whenever you can, all of which will help to reduce the number of bugs you introduce in your code:
It is much easier to reason about how your program works when you know that an object's state cannot be changed by another method
Immutable objects are automatically thread safe (assuming they are published safely) so will never be the cause of those hard-to-pin-down multithreading bugs
Immutable objects will always have the same Hash code, so they can be used as the keys in a HashMap (or similar). If the hash code of an element in a hash table was to change, the table entry would then effectively be lost, since attempts to find it in the table would end up looking in the wrong place. This is the main reason that String objects are immutable - they are frequently used as HashMap keys.
There are also some other optimisations you might be able to make in code when you know that the state of an object is immutable - caching the calculated hash, for example - but these are optimisations and therefore not nearly so interesting.
One meaning has to do with how the value is stored in the computer, For a .Net string for example, it means that the string in memory cannot be changed, When you think you're changing it, you are in fact creating a new string in memory and pointing the existing variable (which is just a pointer to the actual collection of characters somewhere else) to the new string.
String s1="Hi";
String s2=s1;
s1="Bye";
System.out.println(s2); //Hi (if String was mutable output would be: Bye)
System.out.println(s1); //Bye
s1="Hi" : an object s1 was created with "Hi" value in it.
s2=s1 : an object s2 is created with reference to s1 object.
s1="Bye" : the previous s1 object's value doesn't change because s1 has String type and String type is an immutable type, instead compiler create a new String object with "Bye" value and s1 referenced to it. here when we print s2 value, the result will be "Hi" not "Bye" because s2 referenced to previous s1 object which had "Hi" value.
Immutable means that once the object is created, non of its members will change. String is immutable since you can not change its content.
For example:
String s1 = " abc ";
String s2 = s1.trim();
In the code above, the string s1 did not change, another object (s2) was created using s1.
Immutable simply mean unchangeable or unmodifiable. Once string object is created its data or state can't be changed
Consider bellow example,
class Testimmutablestring{
public static void main(String args[]){
String s="Future";
s.concat(" World");//concat() method appends the string at the end
System.out.println(s);//will print Future because strings are immutable objects
}
}
Let's get idea considering bellow diagram,
In this diagram, you can see new object created as "Future World". But not change "Future".Because String is immutable. s, still refer to "Future". If you need to call "Future World",
String s="Future";
s=s.concat(" World");
System.out.println(s);//print Future World
Why are string objects immutable in java?
Because Java uses the concept of string literal. Suppose there are 5 reference variables, all refers to one object "Future".If one reference variable changes the value of the object, it will be affected to all the reference variables. That is why string objects are immutable in java.
Once instanciated, cannot be altered. Consider a class that an instance of might be used as the key for a hashtable or similar. Check out Java best practices.
As the accepted answer doesn't answer all the questions. I'm forced to give an answer after 11 years and 6 months.
Can somebody clarify what is meant by immutable?
Hope you meant immutable object (because we could think about immutable reference).
An object is immutable: iff once created, they always represent the same value (doesn't have any method that change the value).
Why is a String immutable?
Respect the above definition which could be checked by looking into the Sting.java source code.
What are the advantages/disadvantages of the immutable objects?
immutable types are :
safer from bugs.
easier to understand.
and more ready for change.
Why should a mutable object such as StringBuilder be preferred over String and vice-verse?
Narrowing the question Why do we need the mutable StringBuilder in programming?
A common use for it is to concatenate a large number of strings together, like this:
String s = "";
for (int i = 0; i < n; ++i) {
s = s + n;
}
Using immutable strings, this makes a lot of temporary copies — the first number of the string ("0") is actually copied n times in the course of building up the final string, the second number is copied n-1 times, and so on. It actually costs O(n2) time just to do all that copying, even though we only concatenated n elements.
StringBuilder is designed to minimize this copying. It uses a simple but clever internal data structure to avoid doing any copying at all until the very end, when you ask for the final String with a toString() call:
StringBuilder sb = new StringBuilder();
for (int i = 0; i < n; ++i) {
sb.append(String.valueOf(n));
}
String s = sb.toString();
Getting good performance is one reason why we use mutable objects. Another is convenient sharing: two parts of your program can communicate more conveniently by sharing a common mutable data structure.
More could be found here : https://web.mit.edu/6.005/www/fa15/classes/09-immutability/#useful_immutable_types
Immutable Objects
An object is considered immutable if its state cannot change after it is constructed. Maximum reliance on immutable objects is widely accepted as a sound strategy for creating simple, reliable code.
Immutable objects are particularly useful in concurrent applications. Since they cannot change state, they cannot be corrupted by thread interference or observed in an inconsistent state.
Programmers are often reluctant to employ immutable objects, because they worry about the cost of creating a new object as opposed to updating an object in place. The impact of object creation is often overestimated, and can be offset by some of the efficiencies associated with immutable objects. These include decreased overhead due to garbage collection, and the elimination of code needed to protect mutable objects from corruption.
The following subsections take a class whose instances are mutable and derives a class with immutable instances from it. In so doing, they give general rules for this kind of conversion and demonstrate some of the advantages of immutable objects.
Source
An immutable object is the one you cannot modify after you create it. A typical example are string literals.
A D programming language, which becomes increasingly popular, has a notion of "immutability" through "invariant" keyword. Check this Dr.Dobb's article about it - http://dobbscodetalk.com/index.php?option=com_myblog&show=Invariant-Strings.html&Itemid=29 . It explains the problem perfectly.

Categories