How much space does a class take in the JVM? - java

I've searched various sites to no avail, because all the search results want to tell me how to compute the size of an object, not of a class.
When we define a class (with a .class file in a jar/war) and it gets loaded into the JVM, how much space does that take up? Obviously it depends on what is actually in the class: a class with more fields has more metadata to store. But say, for example, the class had 10 integer fields and 10 reference fields to other objects:
How much extra space would it take up in the JVM to have say, 1,000 of those classes (all extending the same base class)?
Would it change if they were anonymous classes instead of defined classes?

Well, every Class is itself an object. So one indicative size is what you get by calling Instrumentation.getObjectSize() on the Class object itself.
You can follow the tutorial here:
https://www.baeldung.com/java-size-of-object
And then if you want to check the size of MyClass you can do:
InstrumentationAgent.getObjectSize(MyClass.class);
This doesn't mean it is the only memory associated with this class, because the class loader, garbage collector and other internal workings of the JVM might keep other meta information about the class.
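To get a rough view of that other memory, one option is to read the JVM's non-heap memory pools through the standard java.lang.management API. The following is only a sketch: the class name ClassMetadataUsage is made up for illustration, and which pools exist (for example "Metaspace" on HotSpot 8 and later) is an implementation detail, so the code simply sums everything the JVM reports as non-heap. That total also includes things such as JIT-compiled code, so it overstates pure class metadata.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of any library.
class ClassMetadataUsage {

    // Sums the "used" bytes of every memory pool the JVM classifies as non-heap.
    static long nonHeapUsedBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            if (pool.getType() == MemoryType.NON_HEAP && usage != null) {
                total += usage.getUsed();
            }
        }
        return total;
    }

    // Number of classes currently loaded in this JVM.
    static int loadedClassCount() {
        return ManagementFactory.getClassLoadingMXBean().getLoadedClassCount();
    }

    public static void main(String[] args) {
        System.out.println("Loaded classes: " + loadedClassCount());
        System.out.println("Non-heap bytes in use: " + nonHeapUsedBytes());
    }
}
```

Taking one reading before and one after loading the 1,000 classes and dividing the difference by the change in class count gives an estimate per class for that particular JVM, not a universal number.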

The answer will most definitely depend on JVM implementation (including version of such implementation, platform etc). There is no one number you can get.
As for anonymous inner classes, these still have class names generated for them, so this should have zero impact.

Related

Memory Allocation for objects in java

I am a beginner in Java and have completed C++.
In C++, the memory for member functions is allocated when they are declared as part of a class, not when objects are created. When objects are created, memory is allocated only for the instance variables; that is, each object gets its own separate copy of the instance variables, and the member functions are shared commonly by all the objects.
For instance variables, I know the same happens in Java, but what happens in the case of member functions?
In C++:
For example, if we have 2 instance variables a and b, have created 3 objects x, y, z, and have a member function getData(),
then all 3 objects have a separate copy of the two instance variables a and b, but share a common copy of getData().
For instance variables the same is true in Java, but what about member functions?
In Java, the bytecode for the methods exists only once for a class; no copy of the method's bytecode is made for every object. That would be unnecessary and wasteful; the bytecode does not change while the program runs.
So it works the same as what you say for C++:
... the member functions are shared commonly by all the objects.
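A small illustration of that sharing, as a sketch: the class Data and its members mirror the a, b and getData() from the question and are otherwise made up. Each object carries its own a and b, while all three objects are described by one and the same Class object, which is where the method's bytecode belongs.

```java
// Hypothetical class mirroring the question: two instance variables, one method.
class Data {
    int a;
    int b;

    Data(int a, int b) {
        this.a = a;
        this.b = b;
    }

    int getData() {
        return a + b;
    }
}

class SharedMethodsDemo {

    // True when all three objects are described by the very same Class object.
    static boolean shareOneClass() {
        Data x = new Data(1, 2);
        Data y = new Data(3, 4);
        Data z = new Data(5, 6);
        return x.getClass() == y.getClass() && y.getClass() == z.getClass();
    }

    public static void main(String[] args) {
        System.out.println(shareOneClass()); // prints true
    }
}
```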
In Java, member functions are loaded on-demand by the classloader. Since the entire Java standard library is available alongside any Java program, and the program itself may contain any number of classes, this is necessary in order to keep program start times in the range of sanity.
So any time the program needs a class in any way (e.g. to access a static variable or method, or to create an instance of an object) that isn't already loaded, the Java classloader loads that class, which includes the class's member functions. Then, once that class is loaded, it doesn't need to be loaded again if it's needed in the future, because similarly to what you noted in your question, only one copy of the class bytecode is necessary at any given time.
To avoid accumulating more and more loaded classes as the program runs, Java can use garbage collection to unload classes that it knows it can safely unload, similarly to how it uses garbage collection for program data. Those classes can of course be reloaded later if they are needed again. There are, of course, situations in which unloading a class wouldn't work due to the risk that reloading it would cause static variables/code to be reinitialized/run; this is why a class only becomes eligible for unloading once its defining class loader is itself unreachable.
This ability to unload and reload classes can also be put to use in debugging: IntelliJ IDEA has a HotSwap feature that allows you to edit and recompile a class while the program is running, unload that class, then reload the new bytecode so you can quickly test ideas. (Wow, I just realized that totally sounded like an advertisement. I'm not affiliated with JetBrains, in case you're wondering.)

PermGen space of heap

As shown in the image, Permgen is further divided into several parts.
Runtime constant pool stores constants pertaining to each type that is loaded by class loader.
Method area stores method information such as method return type, method name. (correct me if I am wrong here.)
And Reserved area is the part which is reserved if more memory is required by permgen.
But what I don't understand is: what is the code area in the image? Is any code stored in this space (that seems vague to me)?
Any specific reason for that ?
A possible answer: the code area stores the bytecode of the classes loaded into memory.
But then the question arises: why is the class not loaded directly into RAM?
Because the JVM exists to provide interoperability. Since the JVM is an intermediary between Java code and the machine, we need some place to store the code until the OS schedules the JVM to execute it (to the OS, the JVM is just a process). So it loads the bytecode into the code area (if I am right) and, when scheduled, interprets that bytecode (.class) into the underlying machine instructions.
The answer, to me, is "the code area holds the bytecode of the classes".
To back up the idea above, here are some points copied as-is from an Oracle blog, which says:
So the Java classes are stored in the permanent generation. What all
does that entail? Besides the basic fields of a Java class there are:
Methods of a class (including the bytecodes)
Names of the classes (in the form of an object that points to a string also in the permanent generation)
Constant pool information (data read from the class file, see chapter 4 of the JVM specification for all the details).
Object arrays and type arrays associated with a class (e.g., an object array containing references to methods).
Internal objects created by the JVM (java/lang/Object or java/lang/exception for instance)
Information used for optimization by the compilers (JITs)
Hope that clears it up.
From an interesting article on the problems of PermGen: Will Java 8 Solve PermGen OutOfMemoryError?:
Jon Masamitsu, JVM developer at Oracle, explained 2006 in his blog the
purpose of the permanent generation: The permanent generation contains
information about classes, such as bytecode, names and JIT
information. It is stored in a separate space, because it is mostly
static and garbage collection can be much more optimized by separating
it.
Actually, PermGen stores all your static code. I think that explains why there is a code area in PermGen.
I will venture to guess, based on the following article, by Jon Masamitsu, from which the following quote is taken, that the figure above is a misrepresentation (or rephrased - a misleading representation):
So the Java classes are stored in the permanent generation. What all does that entail? Besides the basic fields of a Java class there are
Methods of a class (including the bytecodes)
Names of the classes (in the form of an object that points to a string also in the permanent generation)
Constant pool information (data read from the class file, see chapter 4 of the JVM specification for all the details).
Object arrays and type arrays associated with a class (e.g., an object array containing references to methods).
Internal objects created by the JVM (java/lang/Object or java/lang/exception for instance)
Information used for optimization by the compilers (JITs)
The bytecode of all classes that have been resolved lives in permgen. Just because a library has 1.2MB of classes doesn't mean they will all be loaded by the JVM from the JAR. It is possible, even likely, that only a small fraction of those classes are used by a particular application.
You can run many large application servers whose sum total JAR size is >1GB using only 64MB permgen, because only a fraction of the classes are ever used.
Also take this example:
class A {
    // ... code
}
class B {
    void method1() {
        // something
    }
    void method2() {
        A a = new A();
    }
}
While these classes may reside in the same JAR, merely creating an instance of B does not cause class A to be loaded. If you never call method2(), class A will never be loaded by the JVM. Additionally, contrary to popular belief, permgen can be garbage collected, and if space gets low, and there are no instances on the heap referring to class A anymore, then class A can be removed from permgen.
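The laziness described above can be observed with a small sketch. Loading itself is hard to watch from plain Java, so this example records class initialization (the running of static initializers) instead, which the language guarantees does not happen before first use. The helper InitLog and the class LazyInitDemo are made up for illustration; A and B follow the example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that records the order in which static initializers run.
class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

class A {
    static { InitLog.EVENTS.add("A"); }
}

class B {
    static { InitLog.EVENTS.add("B"); }

    void method1() {
        // does not touch A
    }

    void method2() {
        A a = new A(); // first use of A
    }
}

class LazyInitDemo {

    // Returns the log as seen before and after the first call to method2().
    static List<List<String>> run() {
        B b = new B();
        b.method1();
        List<String> before = new ArrayList<>(InitLog.EVENTS);
        b.method2();
        List<String> after = new ArrayList<>(InitLog.EVENTS);

        List<List<String>> result = new ArrayList<>();
        result.add(before);
        result.add(after);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [[B], [B, A]]
    }
}
```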

JVM bytecode limitations on class-class interactions

I was looking through the JVM bytecode instructions and was surprised to see that all the interactions between classes (e.g. casting, new, etc.) rely upon constant pool lookups for identity of the other classes.
Am I correct in inferring that this means that one class cannot know about the existence of more than 64k others, as it is impossible to refer to them? If one did need to refer to that many, what ought one do--delegate the work to multiple classes each of which could have their own <64k interactions?
(The reason this interests me is that I have a habit of writing code generators, sometimes producing thousands of distinct classes, and that some languages (e.g. Scala) create classes prolifically. So it seems that if true I have to be careful: if I have hundreds of methods in a class each using hundreds of (distinct) classes, I could exceed the constant pool space.)
Am I correct in inferring that this means that one class cannot know about the existence of more than 64k others, as it is impossible to refer to them?
I think you are correct. And don't forget that there are constant pool entries for other things; e.g. all of the class's method and field names, and all of its literal strings.
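The limit comes from the class file format: constant_pool_count is an unsigned 16-bit field that follows the magic number and the two version fields, so it can never exceed 65535. As a sketch, the following reads that field for a class whose .class file is reachable as a resource. The class name ConstantPoolCount is made up, and the resource lookup assumes a top-level class.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of any library.
class ConstantPoolCount {

    // Reads the class-file header of a top-level class and returns constant_pool_count.
    static int of(Class<?> type) {
        String resource = type.getSimpleName() + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            DataInputStream in = new DataInputStream(raw);
            if (in.readInt() != 0xCAFEBABE) {      // u4 magic
                throw new IllegalStateException("not a class file: " + resource);
            }
            in.readUnsignedShort();                // u2 minor_version
            in.readUnsignedShort();                // u2 major_version
            return in.readUnsignedShort();         // u2 constant_pool_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.lang.Object: " + of(Object.class));
    }
}
```

Running this against the output of a code generator is one way to see how close the generated classes come to the limit.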
If one did need to refer to that many, what ought one do--delegate the work to multiple classes each of which could have their own <64k interactions?
I guess so.
However, I'm not convinced that this concern would ever be realized in practice. It is hard to conceive of a class that needs to directly interact with that many other classes ... unless the code generator is ignoring the structure of its input source code.
It sounds like your problem could be solved via invokedynamic. This is basically a much faster form of reflection designed to ease the implementation of dynamic languages on the JVM.
If you really do have to deal with thousands of automatically generated classes, you probably don't want to statically link it all. Just use invokedynamic. This also has the advantage of letting you defer some code generation to runtime.
Note that you still need a constant pool entry for every dynamic method called by a class, but you no longer need to refer to the actual class and methods being called. In fact, you can create them on demand.
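The invokedynamic instruction cannot be written directly in Java source, but the java.lang.invoke API that backs it can be used from ordinary code. As a rough sketch of the "link at runtime instead of statically" idea, the following looks up a class and a no-argument method by name and calls it through a MethodHandle. The class DynamicCall and its method callNoArg are made up for illustration; in a real generator the names would typically come from data rather than from literals.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical helper, not part of any library.
class DynamicCall {

    // Resolves className.methodName() at runtime and invokes it on the receiver.
    static Object callNoArg(String className, String methodName,
                            Class<?> returnType, Object receiver) {
        try {
            Class<?> target = Class.forName(className);
            MethodHandle handle = MethodHandles.publicLookup()
                    .findVirtual(target, methodName, MethodType.methodType(returnType));
            // invoke() adapts argument and return types (casting and boxing) as needed
            return handle.invoke(receiver);
        } catch (Throwable t) {
            throw new IllegalStateException("dynamic call failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callNoArg("java.lang.String", "length", int.class, "hello")); // prints 5
    }
}
```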

Java: Where is the memory allocated for the physical bytes of a class when loaded by a ClassLoader?

I have constructed a tiny custom class loader in a dummy application in order to understand how dynamic class loading works. For this question, I don't need to go into details about what it does other than to mention that it instantiates two different instances of my class loader and has each one load different classes, in order that I can satisfy myself by confirming a "ClassNotFoundException" from one of the class loader instances when only the other has loaded a particular class.
However, I have a question that can be easily expressed by the following, hopefully self-explanatory line of code.
Class clazz = myClassLoader.loadClass(theClazz);
This line of code causes my custom class loader to LOAD the class bytes into memory, and to return an instance of a Class object for that class.
My question is this: Where are the physical bytes of memory for the loaded class located (i.e., the contents of the .class file)? Are they stored inside the ClassLoader object, or are they stored inside the Class object (whereupon the ClassLoader object merely contains an internal reference to this Class object) - or somewhere else entirely?
The classloader object has a Collection of all classes it has loaded.
If the same physical class is loaded by 2 different class loaders, the bytes of that class are in memory twice. The two classes behave like different types; they are not compatible with each other! Where the bytes are stored is not really relevant, and I wonder why you want to know that. If you write your own ClassLoader you can "store" them wherever you want. However, at some point you will make a call like ClassLoader.defineClass(String, byte[], int, int). Then the relevant structures in memory inside the VM are created (method area, constant pool, etc.), as mentioned in other answers.
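A sketch of that behaviour: the same bytes are passed to defineClass in two separate loaders, and the resulting Class objects have the same name but are not the same type. To stay self-contained, the example hand-assembles the class file of an empty class named Hello instead of reading one from disk. TwoLoadersDemo, ByteLoader and Hello are all made-up names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class TwoLoadersDemo {

    // A loader whose only job is to expose the protected defineClass method.
    static final class ByteLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    // Hand-assembles the class file of an empty "public class Hello" with no members.
    static byte[] helloClassBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(0xCAFEBABE);                         // magic
            out.writeShort(0);                                // minor_version
            out.writeShort(52);                               // major_version (Java 8)
            out.writeShort(5);                                // constant_pool_count (entries 1..4)
            out.writeByte(7); out.writeShort(2);              // #1 Class -> #2
            out.writeByte(1); out.writeUTF("Hello");          // #2 Utf8
            out.writeByte(7); out.writeShort(4);              // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8
            out.writeShort(0x0021);                           // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                // this_class
            out.writeShort(3);                                // super_class
            out.writeShort(0);                                // interfaces_count
            out.writeShort(0);                                // fields_count
            out.writeShort(0);                                // methods_count
            out.writeShort(0);                                // attributes_count
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Defines the same bytes in two independent loaders.
    static Class<?>[] defineTwice() {
        byte[] bytes = helloClassBytes();
        Class<?> first = new ByteLoader().define("Hello", bytes);
        Class<?> second = new ByteLoader().define("Hello", bytes);
        return new Class<?>[] { first, second };
    }

    public static void main(String[] args) {
        Class<?>[] pair = defineTwice();
        System.out.println(pair[0].getName() + " / " + pair[1].getName());
        System.out.println("same Class object? " + (pair[0] == pair[1]));
    }
}
```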
From the source code for ClassLoader:
// The classes loaded by this class loader. The only purpose of this table
// is to keep the classes from being GC'ed until the loader is GC'ed.
private Vector classes = new Vector();
The source code for the Java classes is located in src.zip in your JDK directory.
Edit:
Was that what you asked about?
At the lowest level, the binary representation of the class is present in various runtime areas of the virtual machine, most notably in the Method Area and in the Runtime Constant Pool. In simpler terms, the Method Area is expected to contain information about the class, including the code for methods and constructors as evidenced by the following quote from the Virtual Machine Specification:
The Java virtual machine has a method
area that is shared among all Java
virtual machine threads. The method
area is analogous to the storage area
for compiled code of a conventional
language or analogous to the "text"
segment in a UNIX process. It stores
per-class structures such as the
runtime constant pool, field and
method data, and the code for methods
and constructors, including the
special methods (§3.9) used in class
and instance initialization and
interface type initialization.
"This line of code causes my custom class loader to LOAD the class bytes into memory, and to return an instance of a Class object for that class"
If I understand your question correctly, memory allocation for objects is done in the heap space of the Java process.
It depends on the JVM, seen for example here or here. Old versions of Mac OS used a pointer to pointer scheme, called a handle.
The class file and its internal, JVM-specific, representation are usually stored in the Permanent Generation - at least in the Sun/Oracle incarnation of the JVM.
See What does PermGen actually stand for? for more links.

How to compare 2 classes which are loaded from 2 different classloader

Here is my case:
Classloader A loaded one class ("Class1");
then I changed Class1.java and compiled it.
Next I loaded Class1.class again with classloader B.
I want to compare these 2 classes to check whether the class meta data was changed by someone.
Is there any way to compare the definition data of 2 classes?
I am not entirely sure what you mean by "the class meta data" beyond what you can find through the reflection APIs. Here is an attempt to answer the question based on my best guess.
By definition data, do you mean their declared internal variables and method signatures? Because you can get those with reflection (getDeclaredMethods() and getDeclaredFields()). However, if the two classes are loaded by different class loaders, they will not be equal (see the Class javadocs on equality), even if they are loaded from the same compiled bytecode.
There is other information you can get from the Reflection APIs, including what class it inherits from, what interfaces it implements, and any Annotations that are compiled in with it (assuming 1.5 or higher of course).
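A minimal sketch of such a reflection-based comparison, assuming that "same definition" means the same declared fields and method signatures (modifiers, annotations and method bodies are ignored here). SignatureCompare and the three Version classes are made up for illustration; in the scenario from the question the two Class objects would come from the two class loaders.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-ins for two loads of "the same" class and one changed version.
class Version1 { int count; String name(int id) { return null; } }
class Version2 { int count; String name(int id) { return null; } }
class Version3 { long count; String name(int id) { return null; } }

class SignatureCompare {

    // Collects a sorted, class-name-independent description of declared members.
    static Set<String> signatures(Class<?> type) {
        Set<String> result = new TreeSet<>();
        for (Field f : type.getDeclaredFields()) {
            if (!f.isSynthetic()) {
                result.add("field " + f.getType().getName() + " " + f.getName());
            }
        }
        for (Method m : type.getDeclaredMethods()) {
            if (!m.isSynthetic()) {
                result.add("method " + m.getReturnType().getName() + " " + m.getName()
                        + Arrays.toString(m.getParameterTypes()));
            }
        }
        return result;
    }

    static boolean sameShape(Class<?> a, Class<?> b) {
        return signatures(a).equals(signatures(b));
    }

    public static void main(String[] args) {
        System.out.println(sameShape(Version1.class, Version2.class)); // prints true
        System.out.println(sameShape(Version1.class, Version3.class)); // prints false
    }
}
```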
You could also potentially do a hash of the Class files (finding them through the classloader is possible) and see if they are different - that would tell you if they had different code in them.
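The hashing idea could look roughly like the following sketch, which computes a SHA-256 digest over class file bytes. ClassFileHash is a made-up name, and classFileBytes assumes a top-level class whose .class file is still reachable as a resource. Note that this reads the file as it is now, which is not necessarily the bytes that were originally loaded; to compare two loads reliably, the bytes would have to be captured at the moment each class is defined.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of any library.
class ClassFileHash {

    // Hex-encoded SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }

    // Reads the .class resource of a top-level class into a byte array.
    static byte[] classFileBytes(Class<?> type) {
        String resource = type.getSimpleName() + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                out.write(chunk, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex(classFileBytes(Object.class)));
    }
}
```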
Hope that helps.
thanks!
Reflection could collect one class's meta data, but it's hard to check whether one class is changed.
I can locate that class file, but also it's hard to check whether one class is changed.
I assumed there should be a way to check loaded classes, whether they have the same data(from the same java file).
