While reading the Oracle JVM specification:
https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-2.html
A run-time constant pool is a per-class or per-interface run-time
representation of the constant_pool table in a class file (§4.4).
I understand that each class has its own runtime constant pool (please correct me if I am wrong).
What confuses me is this: suppose I have two different classes A and B, and each class has a private String variable, say String value = "abc".
If I compare A.value with B.value using == rather than equals, I get true, which makes me think that "abc" in both A and B is in the same runtime constant pool. Could someone point out where I am wrong?
This is a preemptive optimization that the JLS imposes.
From JLS 7, §3.10.5 (formatting mine)
Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.
However, note that this is only true of String literals and constant expressions. Dynamically constructed strings (e.g. x + y for Strings x and y) are not automatically interned to share the same unique instances. As a result, you will still have to use .equals in general unless you can guarantee that your operands are constant expressions.
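To make the quoted rule concrete, here is a minimal sketch. The classes A and B stand in for the two classes from the question (static fields are used instead of private instance fields purely to keep the example short), and LiteralSharingDemo is a hypothetical name introduced for illustration.

```java
// Two separate classes; each class file has its own constant pool entry for "abc".
class A {
    static String value = "abc";
}

class B {
    static String value = "abc";
}

class LiteralSharingDemo {
    // Literals with equal contents resolve to the same interned String,
    // regardless of which class they appear in.
    static boolean literalsShared() {
        return A.value == B.value;
    }

    // A string concatenated at run time is a new object with equal contents.
    // 'prefix' is not final, so prefix + "c" is not a constant expression.
    static boolean runtimeStringShared() {
        String prefix = "ab";
        String built = prefix + "c";
        return built == A.value;
    }

    // Interning the run-time string explicitly yields the canonical instance.
    static boolean internedShared() {
        String prefix = "ab";
        return (prefix + "c").intern() == A.value;
    }

    public static void main(String[] args) {
        System.out.println(literalsShared());      // true
        System.out.println(runtimeStringShared()); // false
        System.out.println(internedShared());      // true
    }
}
```

The == results say nothing about constant pool locations; they only show which references point to the same String object.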
This is because '==' compares references. Objects of both A and B have different String value variables (and so each class's constant pool has a separate entry for the literal), but both are initialized to the same value. The compiler/JVM is most likely optimizing for space by having them both point to the same compile-time constant value in the bytecode. The '==' operator is NOT comparing constant pool locations.
Edit: to clear up some confusion, this does NOT mean that "==" can be used for string comparison. All I was saying is that it cannot be used to compare constant pool locations either. It is for one thing and one thing only: checking whether two references point to the same object. The situation in the question will SOMETIMES result in == returning true, but sometimes not. It depends on decisions the compiler and JVM make (or on what the JLS says, as an astute answerer has pointed out).
Related
I'm not sure about some properties of runtime constant pool.
The runtime constant pool is filled with data from the constant pool (from .class files, during class loading). But is it also filled with variables created at runtime? Or are they converted to literals during compilation and stored in the constant pool?
For example:
Integer i = new Integer(127);
Is this treated like a literal, because it is converted to:
Integer i = Integer.valueOf(127);
during compilation, and then stored in the constant pool?
If it doesn't work like that, are there any runtime mechanics for the runtime constant pool?
And a second question: I have found this sentence in many articles: "every class has a runtime constant pool", but what does it mean? Is there a single RCP that contains all application objects of (for example) Integer type, or is there one RCP for every class that contains all constant objects that occurred in this class? (For example: Person has age = Integer(18) and isAdult = Boolean(true).)
First, there is no conversion of
Integer i = new Integer(127);
to
Integer i = Integer.valueOf(127);
These constructs are entirely different. new Integer(127) is guaranteed to produce a new instance every time it is evaluated, whereas Integer.valueOf(127) is guaranteed to produce the same instance on every evaluation, as Integer.valueOf guarantees this for all values in the -128 … +127 range. This is handled by the implementation of Integer.valueOf(int) and is not related to constant pools in any way. Of course, it is implementation specific, but the OpenJDK implementation handles this by simply filling an array with references to these 256 instances the first time this cache is accessed.
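A short sketch of that difference follows. IntegerCacheDemo is a hypothetical class name. The Integer constructor is deprecated in recent JDKs, so the example suppresses the warning and uses the constructor only to show that it bypasses the cache.

```java
class IntegerCacheDemo {
    // valueOf is specified to return cached instances for -128..127,
    // so both calls hand back the same object.
    static boolean cachedValuesShared() {
        Integer a = Integer.valueOf(127);
        Integer b = Integer.valueOf(127);
        return a == b;
    }

    // The constructor always allocates a fresh object and never consults the cache.
    @SuppressWarnings({"deprecation", "removal"})
    static boolean constructorBypassesCache() {
        Integer fresh = new Integer(127);
        return fresh != Integer.valueOf(127);
    }

    // Boxing conversion (JLS 5.1.7) also guarantees identical references
    // for values in -128..127.
    static boolean autoboxingUsesCache() {
        Integer boxed = 127;
        return boxed == Integer.valueOf(127);
    }

    public static void main(String[] args) {
        System.out.println(cachedValuesShared());       // true
        System.out.println(constructorBypassesCache()); // true
        System.out.println(autoboxingUsesCache());      // true
    }
}
```

None of this involves a constant pool: the cache is an ordinary array of heap objects owned by the Integer class.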
While it is correct that every class has a constant pool in its class file, it might be misleading to say that every class will have a runtime constant pool of its own. That’s again a JVM implementation detail. While it is possible to map each class constant pool 1:1 to a runtime constant pool, it obviously makes sense to merge the constant pools of classes living in the same resolve context (i.e. defined by the same class loader) into one pool, so that identical constants don’t need to be resolved multiple times. Conceptually, though, every class has a runtime representation of its pool, even if it does not materialize in this naive form. So the statement “every class has a runtime constant pool” is not wrong, but it doesn’t necessarily imply that there will be such a data structure for every class.
This affects classes, members, MethodType, MethodHandle and String instances, referenced by the constant pools of the classes, but not wrapper types like Integer or Boolean, as there are no such entries in a constant pool. Integer values in the pool are primitive values and boolean values do not exist at all.
This must not be confused with the global String pool, which references all String instances for literals and the results of intern() calls.
Problem 1 - Answer: No
Integral wrapper types are cached, not stored in the constant pool. They are just ordinary objects in the heap. Integer or Byte caching is a runtime library optimization, not a VM optimization nor a compile-time optimization. Objects are not magically replaced with cached ones when a constructor is invoked to create a new one.
First, your translation from new Integer(127) to Integer.valueOf(127) is not correct at all, as explained in this post. If you do some runtime verification, like System.out.println(Integer.valueOf(127) == new Integer(127)); (prints false), you will quickly come to the conclusion that no matter what object you are constructing, using the new operator always creates a new, uncached object. (Even Strings, which actually are in the runtime constant pool, need to be interned to get a reference to the canonical one.)
What the variable i holds is just a reference pointing to an Integer object in the heap. The object will come from the cache if you use valueOf, and will not if you use new.
Problem 2 - Answer: There is a single RCP for every class, but they are all in the same memory region
The RCPs are all stored in the method area. Personally I don't know how the JVM is implemented, but the JVMS states:
The Java Virtual Machine maintains a per-type constant pool (§2.5.5), a run-time data structure that serves many of the purposes of the symbol table of a conventional programming language implementation.
Nevertheless, this doesn't matter even from a performance tuning view, as long as you don't plan to apply for a job in Oracle.
According to Javadoc about String.intern():
When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
I have a few questions about this.
When a new String object (not using a string literal but using the new operator) is created like:
String str = new String("Test");
Question: I am aware that a new object will be created on the heap. But will the String "Test" also be put into the string pool during object creation? If yes, why is the reference not returned directly from the string pool? If no, why not put the string directly in the pool, now that the string pool has been moved out of PermGen into the regular heap space (i.e. there is no space constraint apart from the heap size limit)? Some posts state that the String is inserted in the pool as soon as the object is created, whereas other posts contradict this.
Once we call String.intern() on a String object (as literals are already interned), what happens to the space allocated to the object? Is it reclaimed at the same moment, or does it wait for the next GC cycle?
The accepted answer to another question on SO states that String.intern() should be used when you need speed, since you can compare strings by reference (== is faster than equals).
Question: I am aware that String.intern() returns a reference to the string already present in the string pool. But this requires a full scan lookup of the string pool, which can be an expensive operation in itself. So is the speed gained during string comparison justifiable? If so, why?
I have looked at below sources:
JavaDoc
SO Question ques1, ques2, ques3
http://java-performance.info/string-intern-in-java-6-7-8/
And other misc sources from SO and outside world
All string literals are interned: the compiler records them in the class file's constant pool, and the JVM interns them at run time when it resolves the constant. Using a string literal with the single-argument constructor that takes a String is a bit of an abuse of that constructor, so you are likely to get two objects (maybe there is a special compiler case for this, but I can't say for sure). As of Java 8, the OpenJDK implementation of the constructor is this:
public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}
So there is no special treatment on this side. If you already have the literal, don't use this constructor.
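A small sketch of what this means for the question's new String("Test"). NewStringDemo is a hypothetical class name; the behavior shown follows from the constructor and String.intern() contracts quoted above.

```java
class NewStringDemo {
    // The literal "Test" is interned; the constructor then creates a second,
    // separate String object with the same contents.
    static boolean constructorCreatesCopy() {
        String literal = "Test";
        String copy = new String("Test");
        return copy != literal && copy.equals(literal);
    }

    // intern() does not move or change the copy; it returns the pooled instance,
    // which here is the one the literal already refers to.
    static boolean internReturnsPooledInstance() {
        String copy = new String("Test");
        return copy.intern() == "Test";
    }

    public static void main(String[] args) {
        System.out.println(constructorCreatesCopy());      // true
        System.out.println(internReturnsPooledInstance()); // true
    }
}
```

After intern() returns, the copy is an ordinary heap object; it becomes garbage once nothing references it.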
I don't think there is any special GC semantics for Strings. It will get collected once it's unreachable and deemed collection worthy by the GC as any other object.
Don't ever use == for comparing strings. The first step in the default equals method for Strings is that very reference check, so if this is your dominant case (you know you are working with interned strings most of the time), you only pay the overhead of a method call, which is tiny. The potential for future bugs that you add by using == is too big a risk for such a minuscule gain.
I looked it up in a book, which is usually more thorough in terms of explanations than a website.
Take this, for example:
if (nickname == "Bob")
The condition will be true only if nickname is referring to the same String object.
Here is a sentence I found confusing. Can anyone please explain why this is the case:
For efficiency, Java makes only one string object for every string constant.
The book points out that the way the string object is assembled also affects whether the condition will be true or not, which confuses me the most.
For example:
String nickname = "Bob";
...
if (nickname == "Bob") //TRUE
But if "Rob" is created by the .substring() method, the condition will be FALSE.
String name = "Robert";
String nickname = name.substring(0,3);
...
if (nickname == "Rob")//FALSE
Why is this so?
Edit: at the end of the book's explanation, I found a sentence which also confuses me a lot:
Because string objects are always constructed by the compiler, you never have an interest in whether two strings objects are shared.
Doesn't everything we write get constructed by the compiler?
You need to understand 2 things
1)
String a = "Bob";
String b = "Bob";
System.out.println(a.equals(b));
System.out.println(a == b);
What do you think the output is?
true
true
What is happening here? The first string is created in the string pool (in permanent generation memory on older JVMs). The second literal gets the existing object from the pool.
String a = "Bob"; // creates the object in the string pool
String b = "Bob"; // gets the existing object from the pool
As you rightly noticed:
For efficiency, Java makes only one string object for every string constant.
2)
String nickname = name.substring(0,3);
Because String is immutable, name.substring(0,3) creates a new String with the value "Rob" in ordinary heap memory, not in the string pool.
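The two cases can be sketched as follows, using the name and nickname values from the question. SubstringDemo and its method names are hypothetical.

```java
class SubstringDemo {
    // Builds "Rob" at run time; a proper substring is a new String object.
    static String nickname() {
        String name = "Robert";
        return name.substring(0, 3);
    }

    // Different objects, so the reference comparison fails.
    static boolean sameObjectAsLiteral() {
        return nickname() == "Rob";
    }

    // Same characters, so equals succeeds.
    static boolean equalToLiteral() {
        return nickname().equals("Rob");
    }

    // intern() maps the run-time string back to the pooled instance.
    static boolean internedSameAsLiteral() {
        return nickname().intern() == "Rob";
    }

    public static void main(String[] args) {
        System.out.println(sameObjectAsLiteral());   // false
        System.out.println(equalToLiteral());        // true
        System.out.println(internedSameAsLiteral()); // true
    }
}
```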
Note :
Up to Java 6, the String pool lived in the PermGen area; garbage collection can occur in perm space, but this depends on the JVM. From JDK 7 on, the String pool was moved to the main heap area where ordinary objects are created.
String literals are internally handled by the JVM so that for every unique String literal, it always refers to the same object if it has the same value. For example, a string literal "test" in class A will be the exact same object as a string literal "test" in class B.
Doesn't everything we write get constructed by the compiler?
The compiler simply adds the string literal to the class's constant pool upon compilation and loads it with a special instruction called LDC; the rest is handled by the JVM, which loads the string constant from a special string constant pool that never removes / garbage-collects any objects (previously permgen).
However, you can get the 'internal' version of any string (as if it were a string literal) using String#intern(), which makes the == operator work again.
It's about objects.
Since these aren't primitives == doesn't compare what they are. == compares where they are (in heap memory).
.equals() should (if implemented) compare what's contained in that memory.
This is a detail that is easily forgotten, because small strings and boxed numbers often don't get new memory when created; it's more efficient to point you to a cached version of the same thing instead. Thus you can ask for a new "Bob" over and over and just get handed a reference (memory address) to the same "Bob". This tempts us to compare them like primitives, since that seems to work the same way. But not every object will have this happen to it, so it's a bad habit to let yourself develop.
This trick works only when 1) a matching object already exists, 2) it's immutable so you can't surprise users of other "copies" by changing it.
To abuse an old metaphor, if two people have the same address it's a safe bet that they keep the same things at home, since it's the same home. However, just because two people have different addresses doesn't mean they don't keep exactly the same things at home.
Implementing .equals() is all about defining what we care about when comparing what is kept in these objects.
So only trust == to compare values of primitives. Use .equals() to ask an object what it thinks it's equal to.
Also, this isn't just a Java issue. Every object-oriented language that lets you directly handle primitives and object references/pointers/memory addresses will force you to deal with them differently, because a reference to an object is not the object itself.
An object's value is not the same as its identity. If it were, there would only ever be one copy of an object with the same contents. Since the language can't perfectly make that happen, you're stuck having to deal with these two concepts differently.
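As an illustration of value versus identity, here is a minimal sketch of a class that defines what equality means for it. Person, its fields, and EqualsDemo are hypothetical names chosen for the example.

```java
import java.util.Objects;

final class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Value equality: two Person objects are equal when their contents match.
    @Override
    public boolean equals(Object other) {
        if (this == other) return true;              // same identity implies same value
        if (!(other instanceof Person)) return false;
        Person that = (Person) other;
        return age == that.age && name.equals(that.name);
    }

    // Equal objects must report equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(name, age);
    }
}

class EqualsDemo {
    public static void main(String[] args) {
        Person first = new Person("Bob", 18);
        Person second = new Person("Bob", 18);
        System.out.println(first == second);      // false: two different addresses
        System.out.println(first.equals(second)); // true: same things kept at home
    }
}
```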
Let's look at the following code snippet:
String s1 = "Hello";
String s2 = "Hello";
Both variables refer to the same object due to interning. Since strings are immutable, only one object is created and both refer to the same object.
A constant pool, as I understand it, holds all the constants (integer, string, etc.) that are declared in a class. It is specific to each class.
System.out.println("Hello"); // I believe this Hello is different from above.
Questions:
Does string pool refer to the pool of a constant string object in the constant pool?
If yes, is String pool common throughout the whole application or specific to a class?
My questions are:
Does the string pool refer to the pool of constant string objects in the constant pool?
No.
"Constant pool" refers to a specially formatted collection of bytes in a class file that has meaning to the Java class loader. The "strings" in it are serialized, they are not Java objects. There are also many kinds of constants, not just strings in it.
See Chapter 4.4, The Constant Pool:
Java Virtual Machine instructions do not rely on the run-time layout of classes, interfaces, class instances, or arrays. Instead, instructions refer to symbolic information in the constant_pool table.
In contrast, the "String pool" is used at runtime (not just during class loading), contains only strings, and the "strings" in the string pool are java objects.
The "string pool" is a thread-safe weak-map from java.lang.String instances to java.lang.String instances used to intern strings.
Chapter 3.10.5. String Literals says
A string literal is a reference to an instance of class String (§4.3.1, §4.3.3).
Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.
There is only one string pool, and all string literals are automatically interned.
Also, there are other pools for autoboxing and such.
The constant pool is where those literals are put for the class.
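To illustrate the "weak map from String to String" description above, here is a rough sketch of a hand-written pool. SimpleStringPool is a hypothetical class built on WeakHashMap; it is not how any JVM implements String.intern(), only a model of the same idea.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

class SimpleStringPool {
    // Keys and values are both held weakly, so a pooled string can be
    // garbage-collected once nothing else references it.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    // Returns the canonical instance for the given contents, registering
    // the argument itself if no equal string is pooled yet.
    synchronized String intern(String s) {
        WeakReference<String> ref = pool.get(s);
        String canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            pool.put(s, new WeakReference<>(s));
            canonical = s;
        }
        return canonical;
    }
}

class SimpleStringPoolDemo {
    public static void main(String[] args) {
        SimpleStringPool pool = new SimpleStringPool();
        String first = new String("pooled");
        String second = new String("pooled");
        System.out.println(pool.intern(first) == first);  // true: first becomes canonical
        System.out.println(pool.intern(second) == first); // true: equal contents map to first
    }
}
```

Unlike the per-class constant_pool table, such a pool exists only at run time and holds real String objects.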
The constant_pool (all constants, including Strings) is a data structure in the class file (outside the JVM).
When a class file is loaded into the JVM, the constant_pool becomes the run-time constant pool (in the general sense). In HotSpot on Java SE 8:
Strings in the constant_pool are stored in the heap, and we call this the string pool; https://openjdk.org/jeps/122 https://wiki.openjdk.org/display/HotSpot/Caching+Java+Heap+Objects
The other data in the constant_pool is stored in native memory (Metaspace), and we call this the run-time constant pool (in the narrow sense).
Yesterday (April 5th, 2012) I tried comparing strings in the following environments:
computer 1
Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-11D50b)
OS X 10.7.3
computer 2
Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-11D50b)
Windows 7
computer 3
Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-11D50b)
Linux Ubuntu 11.10
This is the code I am trying:
public class TComp {
    public static void main(String[] args) {
        String a = "arif";
        String b = "arif";
        if (a == b) {
            System.out.println("match!");
        }
    }
}
As far as I know, to compare strings in Java we should use the .equals() method, and '==' relies on interning in this case. But with the same Java build on all three computers and only the OS differing, why does interning work fine on computer 1, while I get an error on computers 2 and 3?
Please correct me if I have worded anything wrong. Thank you.
In the same class, all string constants are folded into the .class file constant pool by the compiler (at compile time). This means the compiler will only store one copy of the string (because who needs two identical constants in the pool?).
This means that within a class, == comparison of strings often works; however, before you get too excited, there is a good reason you should never use == comparison of strings. There is no guarantee that the two strings you compare both came from the in-class constant pool.
So,
"foo" == new String("foo")
is entirely likely to fail, while
"foo" == "foo"
might work. That "might" depends heavily on the implementation, and if you code to the implementation instead of the specification, you could find yourself in for a very nasty surprise if the implementation changes because the specification doesn't actually require that implementation.
In short, use .equals(...) for Object comparison, every time. Reserve == for primitive comparison and "this is the same object instance" comparison only. Do this even if you think that the two Strings might be interned (or the same object), as you never know when you will be running under a different classloader, on a different JVM implementation, or on a machine that simply decided not to intern everything.
On one computer they were the same object; on the others they weren't. The rules for the language don't specify whether they're the same object or not, so it can happen either way.
String interning is entirely up to the compiler. == does not "intern" anything; it simply compares object identity. In some cases, a and b can point to the same object. In other cases, they don't. Both are legal, so you should indeed use .equals().
See also http://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.5 .
This is because Java does String interning whenever you create a compile-time constant string.
JLS 15.28. Constant Expressions
Compile-time constant expressions of type String are always "interned" so as to share unique instances, using the method String.intern.
That's why you get true when you use "==" to compare: they ARE actually the same object. String.valueOf() applied to a String returns its argument unchanged, so passing it a string constant gives you that same object.
String x = "a";
String y = "a";
System.out.println(x == y); // true
String w = new String("b");
String z = "b";
System.out.println(w == z); // false
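The rule covers constant expressions in general, not only single literals. The sketch below shows where the boundary lies; ConstantExpressionDemo and its method names are hypothetical.

```java
class ConstantExpressionDemo {
    // "a" + "b" is a constant expression, so the compiler folds it to "ab"
    // and the result is interned like any other literal.
    static boolean literalConcatenationInterned() {
        return ("a" + "b") == "ab";
    }

    // A final local initialized with a constant is a constant variable,
    // so left + "b" is still a constant expression.
    static boolean finalVariableInterned() {
        final String left = "a";
        return (left + "b") == "ab";
    }

    // Without 'final' the concatenation happens at run time and
    // produces a new, non-interned String.
    static boolean nonFinalVariableInterned() {
        String left = "a";
        return (left + "b") == "ab";
    }

    public static void main(String[] args) {
        System.out.println(literalConcatenationInterned()); // true
        System.out.println(finalVariableInterned());        // true
        System.out.println(nonFinalVariableInterned());     // false
    }
}
```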
The use of == to compare objects is simply not reliable. You should never use == to compare Objects for equality unless you are truly looking for the exact same instance.
The == operator determines whether the two object references refer to the same instance.
On the other hand, the .equals() method compares the actual characters within the object.
This behavior should not depend on which computer you are on.
The best thing is to always use .equals to compare objects.
But if, for some strange reason, you need to use the == operator with Strings, you must be sure to compare the results of the .intern method.
It always returns the interned value, and the documentation says that this value is unique. The documentation also says that all string constants are interned and unique.