Now I know this question is asked a lot of times, and has been answered many times, but still someone new to Java will find the explanation tough to understand.
So what I have understood from these questions is as follows:
String a = "hi";
The above statement first checks whether the string is present in string pool. If not, it adds it in the pool and a reference of it is created in the pool. Basically the object is made in permanent generation area and string pool is used to have a reference of it.
However, with
String a = new String("hello");
In this case, it creates two objects: one in the permanent generation area and one in the normal heap memory. The variable a contains a reference to the heap memory object.
Now my question is whether this understanding is right or not. Is the string pool a pool of references or a pool of actual strings, and is my understanding of the permanent generation area correct? If it is wrong, please explain in layman's language. Please don't mark this as a duplicate; I know this has been answered many times, but none of the answers was in layman's language and easy to understand. Are two objects actually made? If yes, then how, and if no, then why? It would be really helpful.
The effect of what you say is basically correct. The problem with your formulation concerns when things happen. When you write
String a="hi";
or indeed, your Java file has the string literal "hi" anywhere in it, then this string literal is allocated only once: when the class is loaded, when your code starts running. Then the initialization of a just uses the existing String object. But when you have an explicit constructor call as in
String a=new String("hi");
then a new String is created. new means a new string object.
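A minimal sketch of that difference, assuming a standard JVM. The class name LiteralVsNew and the helper sameObject are made up for illustration and do not come from the posts above:

```java
// Hypothetical demo class: compares object identity of literals and 'new' strings.
class LiteralVsNew {
    // Returns true only when both references point at the very same object.
    static boolean sameObject(Object x, Object y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "hi";              // refers to the String object created for the literal
        String b = "hi";              // same literal, so the same object
        String c = new String("hi");  // 'new' always yields a distinct object

        System.out.println(sameObject(a, b)); // true
        System.out.println(sameObject(a, c)); // false
        System.out.println(a.equals(c));      // true: same characters
    }
}
```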
Yes, you understand it well. When you do:
String a = new String("hello");
Two Strings will be created: one in the pool, and one object outside the pool that contains a copy of the content stored in the pooled object.
You'll have something like that:
Pool
+-------+
|"hello" <-------- a
| |
+-------+
String a= new String("hello")
One object is created in the string pool, and another object is created in the heap area by the new operator. a holds the reference to the heap object, not to the string pool object, and the heap object contains a copy of hello.
Given a string
String x = "hello";
As far as I remember, x now is a constant with a reference but when in the middle of the code I change value to
x = "hey";
hello stays stuck in the memory and the GC frees it. Is that true? If not, could you explain more about how the GC works?
As far as I remember, x now is a constant
No. The reference is not a constant. It is a reference to an immutable, but the reference itself (the "pointer") is not (crediting Turing85 for the clarification). For x to be considered a constant, it needs to be declared with the keyword final (then the reference itself will be a constant). For example, final String x = "hello";. Because it is not a constant, you can change the reference later on in the code with x = "hey";. THAT SAID, because of how Java manages strings (and other immutables), "hello" and "hey" are indeed constants. They are "permanent" members of the String pool, meaning that their life cycle is the same as the life of your program. I think there are ways to change this, but by default I am pretty sure this is the case. So, if you create multiple objects with the same String value:
String a = "hello"; // will add "hello" to String pool if it doesn't exist there
String b = "hello"; // finds "hello" in the string pool and returns back the same reference to this variable
String c = "hello"; // same as `b` above
However, if you create a new String using the copy constructor:
String d = new String("hello"); // Will add a new "hello" to the heap even though it already exists in the string pool. (Thanks Holger!)
Explaining in detail how the GC works is not easy, especially in a single SO post. I don't even fully understand it myself. But, since the purpose of the pool is to eliminate the need to create "infinite" copies of the same thing over and over, "hello" and "hey" won't be garbage-collected. As for the variable x, it will be handled by the GC depending on many factors, one of which is scope. Also, there is no guarantee as to when (if ever) the GC will clean up an unused resource.
I think this is accurate, albeit somewhat ambiguous or incomplete. But at a high-level, this is how GC will work in this case to the best of my knowledge.
UPDATE: Thanks to @Holger for bringing this to my attention. When a String literal is assigned to a variable, the JVM first looks in the String Pool to see whether such a string exists. If it doesn't, it creates it and assigns it to the variable. HOWEVER, when a string is created with the new operator, the new string is added JUST to heap memory. That's the main difference between assigning string literals vs. new strings to a variable.
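The point that x is a reassignable reference rather than a constant can be sketched as follows. This is only an illustration; the class name ReassignDemo and the method run are hypothetical:

```java
// Hypothetical demo class: reassigning a variable does not touch the String it referred to.
class ReassignDemo {
    static String[] run() {
        String x = "hello";
        String alias = x;   // a second reference to the same "hello" object
        x = "hey";          // only the variable x changes; "hello" itself is untouched

        final String y = "hello";
        // y = "hey";       // would not compile: y is a final (constant) reference

        return new String[] { x, alias, y };
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0]);         // hey
        System.out.println(r[1]);         // hello
        System.out.println(r[1] == r[2]); // true: both refer to the object for the literal "hello"
    }
}
```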
I learned about the Java String Pool recently, and there are a few things that I don't quite understand.
When using the assignment operator, a new String will be created in the String Pool if it doesn't exist there already.
String a = "foo"; // Creates a new string in the String Pool
String b = "foo"; // Refers to the already existing string in the String Pool
When using the String constructor, I understand that regardless of the String Pool's state, a new string will be created in the heap, outside of the String Pool.
String c = new String("foo"); // Creates a new string in the heap
I read somewhere that even when using the constructor, the String Pool is being used. It will insert the string into the String Pool and into the heap.
String d = new String("bar"); // Creates a new string in the String Pool and in the heap
I didn't find any further information about this, but I would like to know if that's true.
If that is indeed true, then - why? Why does java create this duplicate string? It seems completely redundant to me since the strings in java are immutable.
Another thing that I would like to know is how the .intern() function of the String class works: Does it just return a pointer to the string in the String Pool?
And finally, in the following code:
String s = new String("Hello");
s = s.intern();
Will the garbage collector delete the string that is outside the String Pool from the heap?
You wrote
String c = new String("foo"); // Creates a new string in the heap
I read somewhere that even when using the constructor, the String Pool is being used. It will insert the string into the String Pool and into the heap.
That’s somewhat correct, but you have to read the code correctly. Your code contains two String instances. First, you have the string literal "foo" that evaluates to a String instance, the one that will be inserted into the pool. Then, you are creating a new String instance explicitly, using new String(…) calling the String(String) constructor. Since the explicitly created object can’t have the same identity as an object that existed prior to its creation, two String instances must exist.
Why does java create this duplicate string? It seems completely redundant to me since the strings in java are immutable.
Well it does so, because you told it so. In theory, this construction could get optimized, skipping the intermediate step that you can’t perceive anyway. But the first assumption for a program’s behavior should be that it does precisely what you have written.
You could ask why there’s a constructor that allows such a pointless operation. In fact, this has been asked before and this answer addresses this. In short, it’s mostly a historical design mistake, but this constructor has been used in practice for other technical reasons; some do not apply anymore. Still, it can’t be removed without breaking compatibility.
String s = new String("Hello");
s = s.intern();
Will the garbage collector delete the string that is outside the String Pool from the heap?
Since the intern() call will evaluate to the instance that had been created for "Hello" and is distinct from the instance created via new String(…), the latter will definitely be unreachable after the second assignment to s. Of course, this doesn’t say whether the garbage collector will reclaim the string’s memory, only that it is allowed to do so. But keep in mind that the majority of the heap occupation will be the array that holds the character data, which will be shared between the two string instances (unless you use a very outdated JVM). This array will still be in use as long as either of the two strings is in use. Recent JVMs even have the String Deduplication feature that may cause other strings of the same contents in the JVM to use this array (to allow collection of their formerly used array). So the lifetime of the array is entirely unpredictable.
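As a rough illustration of the two instances involved (the class name InternAfterNew and the extra variable before are assumptions added only to make the comparison possible):

```java
// Hypothetical demo class: intern() on a 'new' string returns the literal's instance.
class InternAfterNew {
    static boolean[] run() {
        String s = new String("Hello"); // distinct object, separate from the literal's instance
        String before = s;              // extra handle, kept only so we can compare below
        s = s.intern();                 // now refers to the instance created for "Hello"
        return new boolean[] { s == "Hello", s == before };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0]); // true: intern() returned the literal's instance
        System.out.println(r[1]); // false: the 'new' instance is a different object
    }
}
```

Note that in the question's code there is no variable like before, so the object created by new String(…) has no remaining references after the reassignment.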
Q: I read somewhere that even when using the constructor, the String Pool is being used. It will insert the string into the String Pool and into the heap. [...] I didn't find any further information about this, but I would like to know if that's true.
It is NOT true. A string created with new is not placed in the string pool ... unless something explicitly calls intern() on it.
Q: Why does java create this duplicate string?
Because the JLS specifies that every new generates a new object. It would be counter-intuitive if it didn't (IMO).
The fact that it is nearly always a bad idea to use new String(String) is not a good reason to make new behave differently in this case. The real answer is that programmers should learn not to write that ... except in the extremely rare cases where it is necessary.
Q: Another thing that I would like to know is how the intern() function of the String class works: Does it just return a pointer to the string in the String Pool?
The intern method always returns a pointer to a string in the string pool. That string may or may not be the string you called intern() on.
There have been different ways that the string pool was implemented.
In the original scheme, interned strings were held in a special heap called the PermGen heap. In that scheme, if the string you were interning was not already in the pool, then a new string would be allocated in PermGen space, and the intern method would return that.
In the current scheme, interned strings are held in the normal heap, and the string pool is just a (private) data structure. When the string being interned is not in the pool, it is simply linked into the data structure. A new string does not need to be allocated.
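A small sketch of intern() on a string built at run time, assuming a reasonably recent JVM. The class name InternRuntimeString and the helper build are hypothetical. Whether the last comparison prints true depends on the pool implementation described above, so no expected output is given for it:

```java
// Hypothetical demo class: intern() yields one canonical instance per string content.
class InternRuntimeString {
    // Builds a string at run time, so the result is not the object of any literal.
    static String build() {
        return new StringBuilder("runtime-").append(System.nanoTime()).toString();
    }

    public static void main(String[] args) {
        String s = build();
        String first = s.intern();
        String second = new String(s).intern(); // equal content, different starting object

        System.out.println(first == second);  // true: same pooled instance for equal content
        System.out.println(first.equals(s));  // true
        System.out.println(first == s);       // implementation-dependent (see the two schemes above)
    }
}
```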
Q: Will the garbage collector delete the string that is outside the String Pool from the heap?
The rule is the same for all Java objects, no matter how they were created, and irrespective of where (in which "space" or "heap" in the JVM) they reside.
If an object is not reachable from the running application, then it is eligible for deletion by the garbage collector.
That doesn't mean that an unreachable object will be garbage collected in any particular run of the GC. (Or indeed ever ... in some circumstances.)
The above rule equally applies to the String objects that correspond to string literals. If it ever becomes possible that a literal can never be used again, then it may be garbage collected.
That doesn't normally happen. The JVM keeps a hidden reference to each string literal object in a private data structure associated with the class that defined it. Since classes normally exist for the lifetime of the JVM, their string literal objects remain reachable. (Which makes sense ... since the application may need to use them.)
However, if a class is loaded using a dynamically created classloader, and that classloader becomes unreachable, then so will all of its classes. So it is actually possible for a string literal object to become unreachable. If it does, it may be garbage collected.
Based on my understanding of String objects, every string literal is added to (created in) the String Constant Pool.
String a = new String("hello world");
Two objects are being created: one is the "hello world" added to the constant pool, and the other is the new instance of String. Is the same principle applied when it comes to StringBuilder?
StringBuilder b = new StringBuilder("java is fun");
b.append(" and good");
In the StringBuilder example, does it mean that there are 3 objects being created? The first one is the new instance of StringBuilder and the other 2 objects are the string literals "java is fun" and " and good"?
Yes, your understanding is correct. (But see below.) String literals go in the constant pool, while the new String(...) and new StringBuilder(...) create additional objects. (The StringBuilder also creates an internal character array object, so that there are at least four objects involved in the second example.) Note that the call to b.append(...) may internally create another object and some garbage, but only if the internal character array used by b needs to expand.
EDIT: As @fdreger points out in a comment, the string objects corresponding to the string literals are created not at run time, but rather at the time the class is created during the creation and loading phase of the class life cycle (as described in the Java Language Specification; see also the section on the runtime constant pool).
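To make the object count a little more tangible, here is a hedged sketch (the class name BuilderDemo and the method build are made up). It shows that the builder produces a separate String object from toString(), distinct from the object belonging to an equal literal:

```java
// Hypothetical demo class: StringBuilder works on its own buffer and yields a new String.
class BuilderDemo {
    static String build() {
        StringBuilder b = new StringBuilder("java is fun"); // copies the literal's characters
        b.append(" and good");                               // appends into the builder's own buffer
        return b.toString();                                 // creates a new String object
    }

    public static void main(String[] args) {
        String result = build();
        String literal = "java is fun and good";

        System.out.println(result.equals(literal));     // true: same characters
        System.out.println(result == literal);          // false: toString() made a new object
        System.out.println(result.intern() == literal); // true: intern() returns the pooled instance
    }
}
```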
Yes, you're right.
"java is fun" and " and good" are going to be stored in the constant pool. I would call them objects, but yes, it's more like it.
b in this case is going to be an object with a limited lifespan. It may actually store its own copy of both strings in its buffer to work with, so that originals won't be modified.
The strings are created when the class is loaded, not when the actual code is executed. You may think of them as part of your code, exactly as any other literal value.
So in your second example:
two strings are created when the code is loaded for the first time
no strings are created during the execution of the code.
I looked it up in a book, which is usually more thorough in terms of explanations than a website.
Take this for ex.:
if (nickname == "Bob")
The condition will be true only if nickname is referring to the same String object.
Here is a sentence I found confusing, can anyone please explain to why this is the case:
For efficiency, Java makes only one string object for every string constant.
The book points out that the way of assembling the object "Bob" also affects whether the condition will be true or not, which confuses me the most.
For ex.:
String nickname = "Bob";
...
if (nickname == "Bob") //TRUE
But if the string is created by the .substring() method, the condition will be FALSE.
String name = "Robert";
String nickname = name.substring(0,3);
...
if (nickname == "Rob")//FALSE
Why is this so?
Edit: in the end of the book's explanation, I found a sentence which also confuses me a lot:
Because string objects are always constructed by the compiler, you never have an interest in whether two strings objects are shared.
Doesn't everything we write get constructed by the compiler?
You need to understand 2 things
1)
String a = "Bob";
String b = "Bob";
System.out.println(a.equals(b));
System.out.println(a == b);
What do you think the output is?
true
true
What is happening here? The first string is created in the string pool in permanent generation memory. The second gets the existing object from the pool.
String a = "Bob"; // create object in string pool(perm-gen)
String b = "Bob"; // getting existing object.
As you rightly noticed:
For efficiency, Java makes only one string object for every string constant.
2)
String nickname = name.substring(0,3);
Since String is an immutable object, name.substring(0,3) creates a new String "Rob" in heap memory, not in perm-gen.
Note :
Up to Java 6, the String pool was located in the PermGen area; garbage collection can occur in perm space, but that varies from JVM to JVM. Since JDK 7, the String pool has been moved to the main heap area where ordinary objects are created.
Read more here.
String literals are internally handled by the JVM so that for every unique String literal, it always refers to the same object if it has the same value. For example, a string literal "test" in class A will be the exact same object as a string literal "test" in class B.
Doesn't everything we write get constructed by the compiler?
The compiler simply adds the string literal to the class's constant pool upon compilation and loads it with a special instruction called LDC; the rest is handled by the JVM, which loads the string constant from a special string constant pool that never removes / garbage-collects any objects (previously permgen).
However, you can get the 'internal' version of any string (as if it was a string literal) using String#intern(), which would cause the == operator to work again.
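The substring case from the question can be sketched like this (the class name SubstringDemo and the helper nickname are hypothetical; the strings are the ones from the question):

```java
// Hypothetical demo class: a string computed at run time is not the literal's object.
class SubstringDemo {
    static String nickname() {
        String name = "Robert";
        return name.substring(0, 3); // computed at run time: a new String "Rob"
    }

    public static void main(String[] args) {
        String nickname = nickname();

        System.out.println(nickname == "Rob");          // false: different objects
        System.out.println(nickname.equals("Rob"));     // true: same characters
        System.out.println(nickname.intern() == "Rob"); // true: intern() returns the pooled instance
    }
}
```

In everyday code, equals() is the comparison to reach for; intern() is shown here only to illustrate why == "works again".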
It's about objects.
Since these aren't primitives == doesn't compare what they are. == compares where they are (in heap memory).
.equals() should (if implemented) compare what's contained in that memory.
This is a detail that is easily forgotten because small strings and boxed numbers often don't get new memory when created because it's more optimal to instead point you to cached version of the same thing. Thus you can ask for a new "Bob" over and over and just get handed a reference (memory address) to the same "Bob". This tempts us to compare them like primitives since that seems to work the same way. But not every object will have this happen to it so it's a bad habit to let yourself develop.
This trick works only when 1) a matching object already exists, 2) it's immutable so you can't surprise users of other "copies" by changing it.
To abuse an old metaphor, if two people have the same address it's a safe bet that they keep the same things at home, since it's the same home. However, just because two people have different addresses doesn't mean they don't keep exactly the same things at home.
Implementing .equals() is all about defining what we care about when comparing what is kept in these objects.
So only trust == to compare values of primitives. Use .equals() to ask an object what it thinks it's equal to.
Also, this isn't just a Java issue. Every object-oriented language that lets you directly handle primitives and object references/pointers/memory addresses will force you to deal with them differently, because a reference to an object is not the object itself.
An object's value is not the same as its identity. If it were, there would only ever be one copy of an object with the same contents. Since the language can't perfectly make that happen, you're stuck having to deal with these two concepts differently.
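To tie this to the address metaphor, here is a minimal sketch of a class that defines its own notion of equality. The Address class and its fields are invented for illustration and are not part of any library:

```java
// Hypothetical value class: equals() compares contents, == compares identity.
final class Address {
    private final String street;
    private final int number;

    Address(String street, int number) {
        this.street = street;
        this.number = number;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;                 // same object, trivially equal
        if (!(o instanceof Address)) return false;  // different type (or null)
        Address other = (Address) o;
        return number == other.number && java.util.Objects.equals(street, other.street);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals(): equal objects yield equal hash codes.
        return java.util.Objects.hash(street, number);
    }

    public static void main(String[] args) {
        Address a = new Address("Main St", 1);
        Address b = new Address("Main St", 1);

        System.out.println(a == b);      // false: two distinct objects
        System.out.println(a.equals(b)); // true: same contents
    }
}
```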
I read that the string constant pool is self-referenced. Also, in this link it is written that the creation of a String literal:
String s= "new";
will create a new String "new" in the heap if there is not one.
So does it mean that the object is always created in the heap, regardless of whether it is a literal or a new object created with the new keyword?
What I understood of intern is: it checks whether there is an object in the heap with the same name; if so, that object is referenced, else a new object is created in the heap.
Please correct me if I am wrong here.
Another doubt I have: does the constant pool contain the objects or just the references to the objects in the heap?
does it mean that the object is always created in the heap, regardless of whether it is a literal or a new object created with the new keyword?
Yes, in Java all Object-derived objects, including Strings, are created in the heap. The only difference is that identical String objects from the constant pool get reused with the help of the compiler, while String objects created with operator new require explicit code from the programmer in order to get reused.
Yes, it is on the heap.
And with respect to intern(): yes, you are right.