I am trying to understand how a class file is loaded into the method area and executed. I am quite confused about the constant pool.
When is the constant pool initially created: while compiling the
class file, or when the class is loaded?
How is the bytecode organized in the method area? What does the method table
consist of?
Can anyone sketch a picture of the mapping in the
method area, for a clearer understanding?
Since the literal meaning of “constant pool” is just “pool of constants”, there are different things with that name, which are easy to confuse.
Each class file has a constant pool describing all constants used in that class, which includes constant values but also symbolic references needed for linkage. Some entries fulfill both roles, e.g. class entries may serve as owner declaration for a symbolic reference to a member, needed when accessing a field or invoking a method, but may also be used to get a Class instance, e.g. for a class literal appearing in source code. Since it’s part of the class file, its format is specified within The Java® Virtual Machine Specification, §4 The class File Format, in §4.4. The Constant Pool.
As said by other answers, you can use the command javap -v class.name to inspect the constant pool of a class.
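For a rough programmatic counterpart to javap, here is a minimal sketch that reads only the fixed-size header in front of the constant pool, following the layout in §4.1 of the JVM specification. The class name ClassFileHeader and its read method are made up for illustration, and reading String.class as a resource is just a convenient assumption for getting at a real class file.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: holds the values that precede the constant pool entries.
class ClassFileHeader {
    final int magic;
    final int majorVersion;
    final int constantPoolCount;

    ClassFileHeader(int magic, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    // Locates the .class resource of the given type and reads its header.
    static ClassFileHeader read(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IOException("class file not found: " + resource);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();            // u4 magic, 0xCAFEBABE
            in.readUnsignedShort();              // u2 minor_version (ignored)
            int major = in.readUnsignedShort();  // u2 major_version
            int count = in.readUnsignedShort();  // u2 constant_pool_count
            return new ClassFileHeader(magic, major, count);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader h = read(String.class);
        System.out.printf("magic=%08X major=%d constant_pool_count=%d%n",
                h.magic, h.majorVersion, h.constantPoolCount);
    }
}
```

Note that constant_pool_count is one larger than the number of usable indexes, because index 0 is reserved.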
There is a corresponding data structure at runtime, also known as run-time constant pool. Since certain values are represented as runtime objects (e.g. of type String, Class, MethodType, or MethodHandle), and symbolic references must be resolved to the runtime representation of the denoted classes and members, this structure is not the same as the byte sequence found in the class file. But these entries correspond, so that each time, an object is instantiated for a constant or a symbolic reference is resolved, the result can be remembered and reused the next time the same constant entry is accessed.
This doesn’t imply that an implementation must have a 1:1 representation of each class’ constant pool. It’s possible that a specific implementation maps a class’ pool to a shared pool used for all classes of the same class loading context, where each symbolic reference resolves to the same target.
There’s also the string pool, which can be seen as part of the runtime constant pool, holding references to all String instances associated with string constants, to allow resolving all identical string constants of all classes to the same String instance.
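To make the last point concrete, here is a small sketch with two separate classes that each contain the same literal in their own constant pool. The names SharedLiteralDemo, First and Second are invented for the example.

```java
// Two classes, two class files, two constant pools, one literal value.
class SharedLiteralDemo {
    static class First {
        static String greeting() { return "hello, pool"; }
    }

    static class Second {
        static String greeting() { return "hello, pool"; }
    }

    // Compares references, not contents.
    static boolean sameInstance() {
        return First.greeting() == Second.greeting();
    }

    public static void main(String[] args) {
        System.out.println(sameInstance()); // prints "true"
    }
}
```

Both methods hand out the same String instance, even though the constant entry exists once per class file.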
When a Java file is compiled, all references to variables and methods are stored in the class's constant pool as a symbolic reference.
Here is a link for your reference : What is the purpose of the Java Constant Pool?
javac creates a constant pool when you compile your source to a .class file. You can see it by running
javap -v MyClass
on your MyClass.class.
The Java Virtual Machine has a method area that is shared among all Java Virtual Machine threads.
You can see bytecode of your class file by
'javap -c -v Main'
The method area is logically a part of the heap; it is where the JVM keeps all the information about the class.
I am currently trying to dig deeper into the specification of the Java Virtual Machine. I have been reading the Inside the JVM book online and there is one confusing abstraction I can't seem to grasp: the Constant Pool. Here is the excerpt from the book:
For each type it loads, a Java virtual machine must store a constant pool. A constant pool is an ordered set of constants used by the type, including literals (string, integer, and floating point constants) and symbolic references to types, fields, and methods. Entries in the constant pool are referenced by index, much like the elements of an array. Because it holds symbolic references to all types, fields, and methods used by a type, the constant pool plays a central role in the dynamic linking of Java programs
I have several questions about the above and CP in general:
Is CP located in .class file for each type?
What does the author mean by "symbolic reference"?
What is the Constant Pool's purpose, in simple English?
Constant pool is a part of .class file (and its in-memory representation) that contains constants needed to run the code of that class.
These constants include literals specified by the programmer and symbolic references generated by compiler. Symbolic references are basically names of classes, methods and fields referenced from the code. These references are used by the JVM to link your code to other classes it depends on.
For example, the following code
System.out.println("Hello, world!");
produces the following bytecode (javap output)
0: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #3; //String Hello, world!
5: invokevirtual #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
#n here are references to the constant pool. #2 is a symbolic reference to System.out field, #3 is a Hello, world! string and #4 is a symbolic reference to PrintStream.println(String) method.
As you can see, symbolic references are not just names - for example, symbolic reference to the method also contains information about its parameters (Ljava/lang/String;) and return type (V means void).
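If you want to see what such a descriptor encodes, java.lang.invoke.MethodType can parse it. This is only a sketch for illustration; the class name DescriptorDemo and its parse method are made up, and the descriptor is the one from the javap output above.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    // Turns a JVM method descriptor into a MethodType with resolved classes.
    static MethodType parse(String descriptor) {
        return MethodType.fromMethodDescriptorString(
                descriptor, DescriptorDemo.class.getClassLoader());
    }

    public static void main(String[] args) {
        MethodType type = parse("(Ljava/lang/String;)V");
        System.out.println(type.parameterType(0).getName());  // java.lang.String
        System.out.println(type.returnType());                // void
        System.out.println(type.toMethodDescriptorString());  // (Ljava/lang/String;)V
    }
}
```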
You can inspect constant pool of a class by running javap -verbose for that class.
I think understanding how the frame is constructed using a diagram would help.
The frame is where the operands (operation instructions) reside, and that is where the dynamic linking occurs. It is a shorthand way, so to speak, of using the constant pool to keep track of the class and its members.
Each frame contains a reference to the runtime constant pool. The reference points to the constant pool for the class of the method being executed for that frame. This reference helps to support dynamic linking.
C/C++ code is typically compiled to an object file, then multiple object files are linked together to produce a usable artifact such as an executable or DLL. During the linking phase, symbolic references in each object file are replaced with an actual memory address relative to the final executable. In Java this linking phase is done dynamically at runtime.
When a Java file is compiled, all references to variables and methods are stored in the class's constant pool as a symbolic reference. A symbolic reference is a logical reference not a reference that actually points to a physical memory location.
Here is a link to James Bloom's JVM Internals for more details.
What is the Constant Pool's purpose, in simple English?
The CP is a memory area where unique constant values are stored to reduce redundancy:
System.err.println("Hello");
System.out.println("Hello");
In the CP there is only one "Hello" string, and the compiler makes both lines refer to that same entry. Your .class file contains the Hello string only once. (The same goes for other types.)
Is CP located in .class file for each type?
Yes, Look here: http://en.wikipedia.org/wiki/Java_class_file
Let me give an example first, to show what the String constant pool means:
public class StringConstantPool {
    public static void main(String[] args) {
        String s = "prasad";
        String s2 = "prasad";
        System.out.println(s.equals(s2));
        System.out.println(s == s2);
    }
}
the output will be
true
true
What happens here, step by step:
1- The class is loaded when the JVM is invoked.
2- The JVM looks for all the string literals in the program.
3- First, it finds the variable s, which refers to the literal “prasad”, and the String is created in memory.
4- A reference to the literal “prasad” is placed in the string constant pool.
5- Then it finds another variable, s2, which refers to the same string literal “prasad”.
Since the JVM has already found the string literal “prasad”, both variables s and s2 will refer to the same object, i.e. “prasad”.
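For contrast, here is a small sketch of what changes when a String is created explicitly with new instead of coming from a literal. The class and method names are invented for the example.

```java
class StringPoolContrast {
    // Two literals with the same content share one pooled instance.
    static boolean literalVsLiteral() {
        String s = "prasad";
        String s2 = "prasad";
        return s == s2;
    }

    // new String(...) always creates a separate object on the heap.
    static boolean literalVsNew() {
        String s = "prasad";
        String s3 = new String("prasad");
        return s == s3;
    }

    // intern() returns the pooled instance for that content.
    static boolean literalVsInterned() {
        String s = "prasad";
        String s3 = new String("prasad");
        return s == s3.intern();
    }

    public static void main(String[] args) {
        System.out.println(literalVsLiteral());   // true
        System.out.println(literalVsNew());       // false
        System.out.println(literalVsInterned());  // true
    }
}
```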
I hope this helps.
Read more:
http://www.journaldev.com/797/what-is-java-string-pool
It can be thought of as a browser's browsing history, reducing the need to find or build the same thing every time.
The class file format as described in http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html contains all references to other classes in the constant pool as entries of type CONSTANT_Utf8.
But these entries are not only references to classes but also class literals, names of methods, fields and what not.
In a first attempt I thought it would be sufficient to use the constant pool entries referenced by other constant_pool entries of type CONSTANT_Class, CONSTANT_NameAndType and CONSTANT_MethodType
But these don't seem to include type parameters and annotations. Further reading of the specification seems to suggest that I need to parse things like RuntimeVisibleAnnotations and similar constructs in order to identify the relevant constant pool entries. Which means I have to parse more or less the complete class file.
But the whole idea behind parsing the class file myself was that it would be simpler than using a library like ASM, because I thought it would be sufficient to interpret the constant pool.
My question is: Is there a way to reliably identify all classes referenced in a class file by just interpreting little more than the constant pool?
Annotation types that cannot be loaded by a class loader are ignored by this class loader and will simply appear to be invisible at runtime. I assume that this is the reason that types that are referenced by an annotation are not stored in the constant pool where the resolution of an unknown type would prohibit successful class loading. Annotations are code attributes, i.e. meta data and they should not be linked deeply into the class by avoiding a constant pool entry.
You are therefore required to also inspect RuntimeVisibleAnnotations, which live outside of the constant pool. However, if the constant pool does not contain the string RuntimeVisibleAnnotations, your approach works. ASM has very little overhead, however, so I would use it nevertheless.
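A minimal sketch of that check, assuming the constant pool layout from §4.4 of the JVM specification: it collects the names behind all CONSTANT_Class entries and reports whether the attribute name RuntimeVisibleAnnotations occurs as a CONSTANT_Utf8 entry. The class ConstantPoolScan and its members are hypothetical names, and the tag sizes are hard-coded from the specification rather than taken from any library.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ConstantPoolScan {
    final Set<String> classNames = new TreeSet<>();   // names behind CONSTANT_Class entries
    boolean mentionsRuntimeVisibleAnnotations;        // attribute name present as Utf8?

    // Reads the class file of an already loaded type as a resource.
    static ConstantPoolScan of(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IOException("class file not found: " + resource);
            }
            return parse(new DataInputStream(raw));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static ConstantPoolScan parse(DataInputStream in) throws IOException {
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort();                     // minor_version
        in.readUnsignedShort();                     // major_version
        int count = in.readUnsignedShort();         // constant_pool_count
        Map<Integer, String> utf8 = new HashMap<>();
        List<Integer> classNameIndexes = new ArrayList<>();
        for (int i = 1; i < count; i++) {           // index 0 is unused
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1:                             // Utf8: u2 length + bytes
                    utf8.put(i, in.readUTF());
                    break;
                case 7:                             // Class: u2 name_index
                    classNameIndexes.add(in.readUnsignedShort());
                    break;
                case 5: case 6:                     // Long, Double: 8 bytes, two slots
                    in.readLong();
                    i++;
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.readInt();                   // 4-byte payloads
                    break;
                case 15:                            // MethodHandle: u1 + u2
                    in.readUnsignedByte();
                    in.readUnsignedShort();
                    break;
                case 8: case 16: case 19: case 20:  // 2-byte payloads
                    in.readUnsignedShort();
                    break;
                default:
                    throw new IOException("unknown constant pool tag " + tag);
            }
        }
        ConstantPoolScan result = new ConstantPoolScan();
        for (int index : classNameIndexes) {
            result.classNames.add(utf8.get(index));
        }
        result.mentionsRuntimeVisibleAnnotations =
                utf8.containsValue("RuntimeVisibleAnnotations");
        return result;
    }

    public static void main(String[] args) {
        ConstantPoolScan scan = of(Deprecated.class);
        System.out.println(scan.classNames);
        System.out.println(scan.mentionsRuntimeVisibleAnnotations);
    }
}
```

When the flag comes back true, the annotation types still have to be dug out of the attribute itself, which is the point where a library such as ASM starts to pay off.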
As shown in the image, Permgen is further divided into several parts.
Runtime constant pool stores constants pertaining to each type that is loaded by class loader.
Method area stores method information such as method return type, method name. (correct me if I am wrong here.)
And Reserved area is the part which is reserved if more memory is required by permgen.
But what I don't understand is, what is code area in the image? Any code is stored in this space(seems vague to me)?
Any code is stored in this space(seems vague to me)?
Any specific reason for that ?
The possible answer could be : Code area stores the byte code of the classes loaded into your memory.
But then the question arises: why is the class not loaded directly into RAM?
Because we have a JVM to provide interoperability. Since the JVM is an intermediary between Java code and the machine, we need some place to store the code until the JVM is scheduled by the OS to execute it (to the OS, the JVM is just a process). So it loads the bytecode into the code area (if I am right) and, when scheduled, interprets the code (.class) into the underlying machine instructions.
The answer to me is "Code area holds the byte code of the classes".
To back up the idea mentioned above, here is a passage copied as-is from an Oracle blog:
So the Java classes are stored in the permanent generation. What all
does that entail? Besides the basic fields of a Java class there are:
Methods of a class (including the bytecodes)
Names of the classes (in the form of an object that points to a string also in the permanent generation)
Constant pool information (data read from the class file, see chapter 4 of the JVM specification for all the details).
Object arrays and type arrays associated with a class (e.g., an object array containing references to methods).
Internal objects created by the JVM (java/lang/Object or java/lang/exception for instance)
Information used for optimization by the compilers (JITs)
Hope it clears.
From an interesting article on the problems of PermGen: Will Java 8 Solve PermGen OutOfMemoryError?:
Jon Masamitsu, JVM developer at Oracle, explained 2006 in his blog the
purpose of the permanent generation: The permanent generation contains
information about classes, such as bytecode, names and JIT
information. It is stored in a separate space, because it is mostly
static and garbage collection can be much more optimized by separating
it.
Actually, PermGen stores all your static code. I think this explains why there is a code area in PermGen.
I will venture to guess, based on the following article, by Jon Masamitsu, from which the following quote is taken, that the figure above is a misrepresentation (or rephrased - a misleading representation):
So the Java classes are stored in the permanent generation. What all
does that entail? Besides the basic fields of a Java class there are
Methods of a class (including the bytecodes)
Names of the classes (in the form of an object that points to a string also in the permanent generation)
Constant pool information (data read from the class file, see chapter 4 of the JVM specification for all the details).
Object arrays and type arrays associated with a class (e.g., an object array containing references to methods).
Internal objects created by the JVM (java/lang/Object or java/lang/exception for instance)
Information used for optimization by the compilers (JITs)
The bytecode of all classes that have been resolved lives in permgen. Just because a library has 1.2MB of classes doesn't mean they will all be loaded by the JVM from the JAR. It is possible, even likely, that only a small fraction of those classes are used by a particular application.
You can run many large application servers whose sum total JAR size is >1GB using only 64MB permgen, because only a fraction of the classes are ever used.
Also take this example:
class A {
    // ... code
}
class B {
    void method1() {
        // something
    }
    void method2() {
        A a = new A();
    }
}
While these classes may reside in the same JAR, merely creating an instance of B does not cause class A to be loaded. If you never call method2(), class A will never be loaded by the JVM. Additionally, contrary to popular belief, permgen can be garbage collected, and if space gets low, and there are no instances on the heap referring to class A anymore, then class A can be removed from permgen.
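One way to observe this lazy behaviour is with a static initializer, as in the sketch below. Note the assumption: a static initializer shows when class A is initialized, which is only a proxy for when it is loaded, since a JVM is free to load a class earlier than it initializes it. The names LazyUseDemo and EVENTS are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class LazyUseDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class A {
        // Runs once, the first time A is actively used.
        static { EVENTS.add("A initialized"); }
    }

    static class B {
        void method1() {
            EVENTS.add("method1 called");
        }

        void method2() {
            A a = new A();   // first active use of A
            EVENTS.add("method2 called");
        }
    }

    // Records the order in which things happen and returns a copy.
    static List<String> run() {
        EVENTS.clear();
        B b = new B();
        b.method1();
        EVENTS.add("before method2");
        b.method2();
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On a fresh JVM the "A initialized" entry should only show up after "before method2", i.e. creating B and calling method1() did not touch A.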
I have constructed a tiny custom class loader in a dummy application in order to understand how dynamic class loading works. For this question, I don't need to go into details about what it does other than to mention that it instantiates two different instances of my class loader and has each one load different classes, in order that I can satisfy myself by confirming a "ClassNotFoundException" from one of the class loader instances when only the other has loaded a particular class.
However, I have a question that can be easily expressed by the following, hopefully self-explanatory line of code.
Class clazz = myClassLoader.loadClass(theClazz);
This line of code causes my custom class loader to LOAD the class bytes into memory, and to return an instance of a Class object for that class.
My question is this: Where are the physical bytes of memory for the loaded class located (i.e., the contents of the .class file)? Are they stored inside the ClassLoader object, or are they stored inside the Class object (whereupon the ClassLoader object merely contains an internal reference to this Class object) - or somewhere else entirely?
The classloader object has a Collection of all classes it has loaded.
If the same physical class is loaded by two different class loaders, the bytes of that class are in memory twice. The two classes behave like different types. They are not compatible with each other! Where the bytes are stored is not really relevant; I wonder why you want to know that. If you write your own ClassLoader you can "store" them wherever you want. However, at some point you will make a call like ClassLoader.defineClass(String, byte[], int, int). Then the relevant structures in memory inside the VM are created (MethodArea, ConstantPool etc.) as mentioned in other answers.
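Here is a rough sketch of the "same bytes, two loaders" situation. To stay self-contained it assembles a minimal class file by hand (a public class named Tiny that extends Object and has no members) instead of reading one from disk; Tiny, TwoLoadersDemo and BytesLoader are all made-up names, and the byte layout follows chapter 4 of the JVM specification.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class TwoLoadersDemo {
    // Builds the bytes of "public class Tiny extends java.lang.Object" with no members.
    static byte[] tinyClassBytes() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0xCAFEBABE);                             // magic
            out.writeShort(0);                                    // minor_version
            out.writeShort(52);                                   // major_version (Java 8)
            out.writeShort(5);                                    // constant_pool_count (4 entries + 1)
            out.writeByte(7); out.writeShort(2);                  // #1 Class -> #2
            out.writeByte(1); out.writeUTF("Tiny");               // #2 Utf8
            out.writeByte(7); out.writeShort(4);                  // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");   // #4 Utf8
            out.writeShort(0x0021);                               // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                    // this_class  = #1
            out.writeShort(3);                                    // super_class = #3
            out.writeShort(0);                                    // interfaces_count
            out.writeShort(0);                                    // fields_count
            out.writeShort(0);                                    // methods_count
            out.writeShort(0);                                    // attributes_count
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Minimal loader that just exposes the protected defineClass method.
    static final class BytesLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    // Defines the same bytes in two independent loaders.
    static Class<?>[] loadTwice() {
        byte[] bytes = tinyClassBytes();
        return new Class<?>[] {
            new BytesLoader().define("Tiny", bytes),
            new BytesLoader().define("Tiny", bytes)
        };
    }

    public static void main(String[] args) {
        Class<?>[] classes = loadTwice();
        System.out.println(classes[0].getName().equals(classes[1].getName())); // true
        System.out.println(classes[0] == classes[1]);                          // false
    }
}
```

Same name, same bytes, but two distinct Class objects, each with its own runtime structures inside the VM.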
From the source code for ClassLoader:
// The classes loaded by this class loader. The only purpose of this table
// is to keep the classes from being GC'ed until the loader is GC'ed.
private Vector classes = new Vector();
The source code for the Java classes is located in src.zip in your JDK directory.
Edit:
Was that what you asked about?
At the lowest level, the binary representation of the class is present in various runtime areas of the virtual machine, most notably in the Method Area and in the Runtime Constant Pool. In simpler terms, the Method Area is expected to contain information about the class, including the code for methods and constructors as evidenced by the following quote from the Virtual Machine Specification:
The Java virtual machine has a method
area that is shared among all Java
virtual machine threads. The method
area is analogous to the storage area
for compiled code of a conventional
language or analogous to the "text"
segment in a UNIX process. It stores
per-class structures such as the
runtime constant pool, field and
method data, and the code for methods
and constructors, including the
special methods (§3.9) used in class
and instance initialization and
interface type initialization.
"This line of code causes my custom class loader to LOAD the class bytes into memory, and to return an instance of a Class object for that class"
If I understand your question correctly, memory allocation for objects is done on the heap space of the Java process.
It depends on the JVM, seen for example here or here. Old versions of Mac OS used a pointer to pointer scheme, called a handle.
The class file and its internal, JVM-specific, representation are usually stored in the Permanent Generation - at least in the Sun/Oracle incarnation of the JVM.
See What does PermGen actually stand for? for more links.
Inspired by the comments on this question, I'm pretty sure that Java Strings are interned at runtime rather than at compile time. Surely the fact that classes can be compiled at different times, yet still point to the same reference at runtime, shows this.
I can't seem to find any evidence to back this up. Can anyone justify this?
The optimization happens (or at least can happen) in both places:
If two references to the same string constant appear in the same class, I'd expect the class file to only contain one constant pool entry. This isn't strictly required in order to ensure that there's only one String object created in the JVM, but it's an obvious optimization to make. This isn't actually interning as such - just constant optimization.
When classes are loaded, the string pool for the class is added to the intern pool. This is "real" interning.
(I have a vague recollection that one of the bits of work for Java 7 around "small jar files" included a single string pool for the whole jar file... but I could be very wrong.)
EDIT: Section 5.1 of the JVM spec, "The Runtime Constant Pool" goes into details of this:
To derive a string literal, the Java
virtual machine examines the sequence
of characters given by the
CONSTANT_String_info structure.
If the method String.intern has
previously been called on an instance
of class String containing a sequence
of Unicode characters identical to
that given by the CONSTANT_String_info
structure, then the result of string
literal derivation is a reference to
that same instance of class String.
Otherwise, a new instance of class
String is created containing the
sequence of Unicode characters given
by the CONSTANT_String_info structure;
that class instance is the result of
string literal derivation. Finally,
the intern method of the new String
instance is invoked.
Runtime.
JLS and JVM specifications specify javac compilation to class files which contain constant declarations (in the Constant Pool) and constant usage in code (which javac can inline as primitive / object reference values). For compile-time String constants, the compiler generates code to construct String instances and to call String.intern() for them, so that the JVM interns String constants automatically. This is a behavioural requirement from JLS:
http://docs.oracle.com/javase/specs/jls/se7/html/jls-15.html#jls-15.28
Compile-time constant expressions of type String are always "interned" so as to share unique instances, using the method String.intern.
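A small sketch of what that rule means in practice, contrasting a compile-time constant expression with a concatenation that is only evaluated at runtime. The class and method names are invented for the example.

```java
class ConstantExpressionDemo {
    static final String HELLO = "hello";

    // "hel" + "lo" is a compile-time constant expression, so it is folded
    // into the single constant "hello" and shares the interned instance.
    static boolean constantConcat() {
        return ("hel" + "lo") == HELLO;
    }

    // 'part' is not declared final, so the concatenation happens at runtime
    // and produces a new String object.
    static boolean runtimeConcat() {
        String part = "hel";
        return (part + "lo") == HELLO;
    }

    // Explicitly interning the runtime result yields the shared instance again.
    static boolean runtimeConcatInterned() {
        String part = "hel";
        return (part + "lo").intern() == HELLO;
    }

    public static void main(String[] args) {
        System.out.println(constantConcat());         // true
        System.out.println(runtimeConcat());          // false
        System.out.println(runtimeConcatInterned());  // true
    }
}
```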
But these specs have neither the concept nor the definition of any particular String intern pool structures/references/handles whether compile time or runtime. (Of course, in general, the JVM spec does not mandate any particular internal structure for objects: http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-2.html#jvms-2.7)
The reason that no intern pool structures are mentioned is because they're handled entirely with the String class. The intern pool is a private static/class-level structure of the String class (unspecified by JLS & JVM specs & javadoc).
Objects are added to the intern pool when String.intern() is called at runtime. The intern pool is leveraged privately by the String class: when code creates new String instances and calls String.intern(), the String class determines whether to reuse existing internal data. Optimisation can be carried out by the JIT compiler, at runtime.
There's no compile-time contribution here, bar the vanilla inlining of constant values.