Inheritance and Object creation, Theoretically and in Real - java

Let's say I have a class A.java.
When I execute a constructor of A, it creates memory space for the object xyz.
A xyz = new A();
The reference to memory may be something like,
[xyz] ---> '0x34524'
Those are the basics of OOP. Simple enough!
Now,
What happens if a class inherits from a chain of parent classes? How many objects will be created in memory?
Let's say we have four classes, where B extends A, C extends B, and D extends C,
and then we create an object of class D.java,
D omg = new D();
Here, as we know, D's constructor will call the constructor of C.java, and so on up to A.java. Does this mean that we have 4 different memory references, because we are instantiating all four objects (one directly and the other 3 indirectly)?
[omg] ---> '0x34525'
[C] ---> '0x34526'
[B] ---> '0x34527'
[A] ---> '0x34528'
Note:
This isn't a homework question; it is just a curiosity question.
I am aware that if A.java has an instance variable, then we create not only the A object but also other internal objects whenever we hit the new keyword.

First, a tidbit: calling the constructor of an object does not allocate it. In bytecode, the initialization new Object() is expressed as something to the effect of...
new java/lang/Object
invokespecial java/lang/Object <init>()V
The new instruction takes care of allocating the space and acquiring a reference to the yet uninitialized object, while the invokespecial handles calling the constructor itself (which is internally compiled to a void method named <init>, hence the descriptor <init>()V).
Moving on, the internals of object allocation and representation on the heap are entirely JVM specific. However, as far as I know, exactly one object is allocated per new expression, no matter how many superclasses its class has. The object itself in memory has space for the instance fields of both its own class and its superclasses. It must also give access to a virtual method table (typically through a pointer to its class in the object header) in order to do virtual dispatch when performing virtual method calls (e.g. via invokevirtual) for the object.
Internally, the Oracle HotSpot JVM manages things called oops, or ordinary object pointers. You can read more about the HotSpot memory layout here. Feel free to browse the HotSpot source repository.
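To make the "one object, fields of every superclass" point concrete, here is a minimal sketch. The class name SingleObjectDemo, the fields and the helper methods are made up for illustration; the idea is only that each level of the hierarchy records the identity of this, and all levels turn out to see the same object.

```java
public class SingleObjectDemo {
    // Hypothetical hierarchy mirroring A <- B <- C <- D from the question.
    static class A {
        int a = 1;
        final int idSeenByA = System.identityHashCode(this);
    }

    static class B extends A {
        int b = 2;
        final int idSeenByB = System.identityHashCode(this);
    }

    static class C extends B {
        int c = 3;
    }

    static class D extends C {
        int d = 4;
        final int idSeenByD = System.identityHashCode(this);
    }

    /** True if every level of the hierarchy observed the same object identity. */
    public static boolean sameObjectAtEveryLevel() {
        D omg = new D();
        return omg.idSeenByA == omg.idSeenByB
                && omg.idSeenByB == omg.idSeenByD
                && omg.idSeenByD == System.identityHashCode(omg);
    }

    /** The single D instance carries the fields declared by A, B, C and D. */
    public static int sumOfInheritedFields() {
        D omg = new D();
        return omg.a + omg.b + omg.c + omg.d;
    }

    public static void main(String[] args) {
        System.out.println(sameObjectAtEveryLevel()); // true
        System.out.println(sumOfInheritedFields());   // 10
    }
}
```

This says nothing about how a particular JVM lays the fields out; it only shows that there is a single identity behind omg.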

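The JVM allocates memory for only one object (here D), before any constructor runs.
Constructor invocation happens from bottom (here D) to top (Object): each constructor first calls its superclass constructor.
Initialization (running field initializers and constructor bodies) happens from top (Object) to bottom (here D).
Reference: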
http://www.artima.com/designtechniques/initialization.html
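The ordering described above can be observed with a small sketch. The names InitOrderDemo, LOG and constructionOrder are invented for this example; each constructor body simply appends its class name to a shared list.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrderDemo {
    // Shared log, filled in by the constructor bodies as they run.
    static final List<String> LOG = new ArrayList<>();

    static class A { A() { LOG.add("A"); } }
    static class B extends A { B() { LOG.add("B"); } }
    static class C extends B { C() { LOG.add("C"); } }
    static class D extends C { D() { LOG.add("D"); } }

    /** Creates one D and returns the order in which constructor bodies ran. */
    public static List<String> constructionOrder() {
        LOG.clear();
        new D(); // D() implicitly calls C(), which calls B(), which calls A()
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(constructionOrder()); // [A, B, C, D]
    }
}
```

The call chain runs from D up to A, but each body only executes after its superclass constructor has returned, which is why the log reads top to bottom.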

I have not read this anywhere, but it's my experience.
When you call new D(), the constructor chain begins. Conceptually, it first sets up a java.lang.Object and then extends it to an A: after creating the Object (which is the root of all objects), A is initialized on it by adding memory for A's members, including fields and methods (which are pointers to some code!). It then extends to B, and so on.
In the process of extension, if a method is overridden, the method pointer in the object will point to the new code.
There will be only one reference, to D.

Related

Java runtime memory model --

I'm looking for verification/correction on the following.
Assume the following inheritance hierarchy -
class A {
void m1() { System.out.print(System.currentTimeMillis()); }
void m2() { System.out.print(new Date()); } // current date
}
class B extends A {
int x;
void m1() { System.out.print(x); } // overriding
void m99() { System.out.print(++x); }
}
Also assume that, class B is instantiated at some point in the application, i.e. the following statement is executed -
B b = new B();
(Paragraph-A) When the application is built, both classes A and B are loaded into memory
by the static loader.
Both classes are in memory, along with the definitions of all their member methods.
When B is instantiated with the above statement, a memory space in the heap
is allocated for that object b.
The methods m1(), m2() and m99() all have their definitions in this space of b.
These method definitions are only references to the method "templates" existing
in the class definitions. In the class, these methods are sequences of operations--
parameterized operations if
the method is executing on any parameter and/or global variable.
When one of the methods, say b.m99() is invoked at runtime,
JRE goes to class definition of B to get that "template" (sequence of operations),
looks up the current values of the fields of b,
fills in that "template" with the current values of these field(s),
also pushes these current values to the stackspace and runs the methods by executing these operations it found on the class definition.
If the method is inherited from a superclass, Eg. m2() above, the definition of that
method in the class (the definition mentioned in Paragraph-A above) is itself a reference to the definition of m2() in class A.
At runtime, when b.m2() is executed, JRE goes directly to class A to find that "template" for the low-level operations to execute.
These references to method definitions are checked at compile time and put into the bytecode. Eg. in the bytecode for the above case, class B has, a direct reference to method m2() of class A for method m2() it's inheriting from A.
Is this all accurate? If not, where/why not?
Generally, execution environments for Java can get implemented in various ways and it is impossible to say what “Java” does in general.
When the application is built, both classes A and B are loaded into memory
by the static loader.
Both classes are in memory, along with the definitions of all their member methods.
The standard way of deployment is to compile Java source code to bytecode. When the application is executed, the classes will be loaded. There is no such thing as a “static loader”. There are different class loaders. When the class files are delivered on the class path, they will be loaded by the application class loader.
When B is instantiated with the above statement, a memory space in the heap
is allocated for that object b.
The methods m1(), m2() and m99() all have their definitions in this space of b.
As said by Andreas, the method definitions are part of the JVM’s class representation. The object only contains a reference (pointer) to the class.
These method definitions are only references to the method "templates" existing
in the class definitions. In the class, these methods are sequences of operations--
parameterized operations if the method is executing on any parameter and/or global variable.
The terms “definitions” and “templates”, and the way you use them, create unnecessary confusion. The instruction sequences are part of a method's definition. There is a reference from the object to these definitions, either indirectly via the already mentioned reference to the class, or directly via a table of method pointers known as a “vtable”, a widespread optimization.
When one of the methods, say b.m99() is invoked at runtime, JRE goes to class definition of B to get that "template" (sequence of operations), looks up the current values of the fields of b, fills in that "template" with the current values of these field(s), also pushes these current values to the stackspace and runs the methods by executing these operations it found on the class definition.
You should forget about that term “template”. The method definition contains a sequence of executable instructions and the JVM will execute these instructions. For instance methods, a pointer to the object data becomes the implicit first argument. No template will be filled with anything.
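One way to see the "implicit first argument" idea from the Java side is an unbound method reference, where the receiver becomes an ordinary parameter. This is only an illustration at the language level, not a view into the JVM; Counter and callThroughUnboundReference are hypothetical names.

```java
import java.util.function.ToIntFunction;

public class ImplicitReceiverDemo {
    // Stand-in for class B from the question: one field and one instance method.
    static class Counter {
        int x;
        int increment() { return ++x; }
    }

    /** Calls increment() twice, passing the receiver as an explicit argument. */
    public static int callThroughUnboundReference() {
        // Counter::increment takes the object to operate on as its argument.
        ToIntFunction<Counter> increment = Counter::increment;
        Counter b = new Counter();
        increment.applyAsInt(b);        // same effect as b.increment()
        return increment.applyAsInt(b); // the second call sees the updated field
    }

    public static void main(String[] args) {
        System.out.println(callThroughUnboundReference()); // 2
    }
}
```

The method body is the same code for every Counter; only the object passed in differs.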
If the method is inherited from a superclass, Eg. m2() above, the definition of that
method in the class (the definition mentioned in Paragraph-A above) is itself a reference to the definition of m2() in class A.
At runtime, when b.m2() is executed, JRE goes directly to class A to find that "template" for the low-level operations to execute.
This is an implementation detail, but hold your breath…
These references to method definitions are checked at compile time and put into the bytecode. Eg. in the bytecode for the above case, class B has, a direct reference to method m2() of class A for method m2() it's inheriting from A.
This is not how Java works. In Java, the compiler checks for the presence of the invoked method, which succeeds because B inherits the method from A and it's accessible. Then it records an invocation as written in the source code: m2() invoked on B.
The fact that B inherits the method, is an implementation detail of B and allowed to change. A future version of B may override the method. If that happens, A.m2() may even get removed. Or a class C may get introduced between A and B (C extends A and B extends C), which are all backwards compatible changes.
But back to the previous section: at runtime, an implementation may utilize its knowledge about the actual inheritance. A JVM could search the supertype hierarchy each time a method is invoked; that would be valid but not very efficient.
A different strategy is to have the “vtable” mentioned above. Such a table is created for each class when it is initialized, starting with a copy of all superclass methods, entries of overridden methods replaced, and newly declared methods at the end.
So when an invocation instruction is executed the first time, it gets linked by determining the associated index in the vtable. Then, every invocation only needs to fetch the method pointer from the vtable of the object’s actual class, without ever traversing the class hierarchy.
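The vtable construction described above can be modeled with a toy sketch. This is not how any real JVM stores its tables; buildVtable, indexOfMethod and the "Owner.method" strings are assumptions made purely to show the copy, replace and append steps for classes A and B from the question.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyVtableDemo {
    /**
     * Builds a toy vtable for a class: start with a copy of the parent's table,
     * replace the slots of overridden methods, append newly declared methods.
     * Entries are plain strings of the form "Owner.method".
     */
    static List<String> buildVtable(List<String> parentVtable, String owner, List<String> declared) {
        List<String> table = new ArrayList<>(parentVtable);
        for (String method : declared) {
            int slot = indexOfMethod(table, method);
            String entry = owner + "." + method;
            if (slot >= 0) {
                table.set(slot, entry); // override: same slot, new target
            } else {
                table.add(entry);       // new method: new slot at the end
            }
        }
        return table;
    }

    /** Returns the slot whose entry refers to the given method name, or -1. */
    static int indexOfMethod(List<String> table, String method) {
        for (int i = 0; i < table.size(); i++) {
            if (table.get(i).endsWith("." + method)) {
                return i;
            }
        }
        return -1;
    }

    /** Toy vtable of B, where B overrides m1, inherits m2 and adds m99. */
    public static List<String> vtableOfB() {
        List<String> vtableOfA = buildVtable(List.of(), "A", List.of("m1", "m2"));
        return buildVtable(vtableOfA, "B", List.of("m1", "m99"));
    }

    public static void main(String[] args) {
        System.out.println(vtableOfB()); // [B.m1, A.m2, B.m99]
    }
}
```

Because m2 keeps slot 1 in both tables, a call site linked to that index works for an A or a B without searching the hierarchy again.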
That’s still only how interpreted or less optimized execution works. When the JVM decides to optimize the invoking code further, it may predict the actual target of the method invocation. There are two ways in your example:
The JVM uses the knowledge that A.m2() has never been overridden (it would have to remove such an optimization when a new class is loaded which does override the method).
It analyzes the code path to determine that for B b = new B(); b.m2(); the target is fixed, as the result of new B() is always of type B, not a superclass and not a subclass.
When the target is predicted, it can be inlined. Then, the optimized code simply does System.out.print(new Date()); and when there’s no other use of the B instance, even the allocation may get eliminated.
So what the JVM does at runtime may be entirely different from what has been written in the source code. Only the perceivable result (the date is printed) will be the same.
The methods m1(), m2() and m99() all have their definitions in this space of b.
Incorrect. The space allocated for b (the instance of B) references the class itself, i.e. the space allocated for class B, where the method definitions are stored.
The space allocated for an object instance consists of an object header and the data of the instance, i.e. the values of the fields. See e.g. What is in java object header for more information about the object header.
Eg. in the bytecode for the above case, class B has, a direct reference to method m2() of class A for method m2() it's inheriting from A.
Incorrect. The bytecode for class B knows nothing about method m2().
Remember, class A may be compiled separately from class B, so you can remove method m2 without recompiling class B.
UPDATE
From comment:
How then is it known what to execute when b.m2() is run? I don't think JRE goes to the super-class of B, looks to see an m2() there, if no such method then goes to super-super class, ... Too inefficient in runtime. Must be a direct reference to m2(). m2() is a member of B -- even though inherited.
As already stated in the answer, m2() is NOT a member of B. If you run the Java Disassembler, i.e. run javap B.class on the command-line, you'll see:
class B extends A {
int x;
B();
void m1();
void m99();
}
As you can see, the compiler has added the default constructor for you, but has not added any m2() method.
Now create this class:
class C {
public static void main(String[] args) {
B b = new B();
b.m2();
}
}
Then disassemble it with the -c switch, i.e. javap -c C.class:
class C {
C();
Code:
0: aload_0
1: invokespecial #8 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: new #16 // class B
3: dup
4: invokespecial #18 // Method B."<init>":()V
7: astore_1
8: aload_1
9: invokevirtual #19 // Method B.m2:()V
12: return
}
As you can see, the compiler generates an instruction to call B.m2(), even though we already saw that B.class doesn't know about m2().
This means that the lookup you described is exactly what happens, i.e. the JVM needs to resolve the method to class A at runtime, by walking up the superclass chain.
If m2() is removed from class A and recompiled, without recompiling class C, you will get NoSuchMethodError: 'void B.m2()' when running the code.
It's all visible in the .class files if you use a disassembler like javap. The bytecode for class B does not contain method m2. Further reading about how Java handles inheritance here.
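If you would rather check this from inside a program than with javap, reflection gives a similar view. A minimal sketch, assuming simplified copies of A and B with empty method bodies; DeclaredMethodsDemo and declaredMethodNames are names invented here.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class DeclaredMethodsDemo {
    // Simplified copies of the classes from the question (bodies left empty).
    static class A {
        void m1() {}
        void m2() {}
    }

    static class B extends A {
        int x;
        @Override void m1() {}
        void m99() {}
    }

    /** Names of the methods declared directly by the given class, sorted. */
    public static List<String> declaredMethodNames(Class<?> type) {
        return Arrays.stream(type.getDeclaredMethods())
                .filter(m -> !m.isSynthetic()) // ignore compiler or tool generated methods
                .map(Method::getName)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(declaredMethodNames(B.class)); // [m1, m99]
        System.out.println(declaredMethodNames(A.class)); // [m1, m2]
    }
}
```

getDeclaredMethods() excludes inherited methods, so m2 shows up for A only, matching the javap listing above.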

Please confirm/correct this: C++ vs Java

When a user-defined class (A) has an object of another user-defined class (B) as one of its data members, then:
In Java: Actually, only a reference to that instance of B is the data member of A, not the instance of B itself.
In C++: The entire instance of B is the data member. No pointers or anything. Just the whole solid instance of B itself unless it's dynamically instantiated in which case, a pointer is the data member.
Is my deduction correct or incorrect? I'm not sure myself if this is 100% correct.
Now, I'm really intrigued by this whole thing. Can you take it one step further and tell me something I missed? I mean, what is the significance of this difference? Does it mean, that the class A in Java occupies less space than the same class A implementation in C++?
Yes, you are basically correct. The Java class
class C{
D dRef; // needs to be assigned a valid object reference.
}
The dRef field is then assigned a reference to an object created with new.
It is laid out similarly to the C++ class
class C2
{
D* dPointer; // needs to be assigned a valid pointer value.
};
The dPointer member can be assigned a pointer to an object created with new.
Meanwhile, the C++ class
class C3
{
D dValue;
};
will contain the entire layout of D, created when a C3 is created.
The C++ version will likely occupy less space than the Java version in both scenarios. For the Java version to be usable, e.g. cObject.dRef.dMethod(), a D object needs to be created with new D(). There will be overhead for both the C object and the D instance, such as garbage collection bookkeeping.
The C++ variant C2 needs to store only the pointer value (along with a D created with new). For C3, C3 can be the same size as D.
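On the Java side, the reference semantics can be shown with a short sketch. The value field and the helper methods are hypothetical additions to the C and D classes used above.

```java
public class ReferenceFieldDemo {
    static class D {
        int value;
    }

    static class C {
        D dRef; // holds a reference to a D, not an embedded D
    }

    /** A freshly created C has no D at all until one is assigned. */
    public static boolean fieldStartsNull() {
        return new C().dRef == null;
    }

    /** Two C objects can point at the very same D instance. */
    public static int sharedInstanceValue() {
        D shared = new D();
        C first = new C();
        C second = new C();
        first.dRef = shared;
        second.dRef = shared;
        first.dRef.value = 42;    // write through one reference
        return second.dRef.value; // read through the other
    }

    public static void main(String[] args) {
        System.out.println(fieldStartsNull());     // true
        System.out.println(sharedInstanceValue()); // 42
    }
}
```

This corresponds to the C2 variant with a pointer member; Java classes have no direct counterpart to C3, where the D is embedded by value.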

Java Class instantiating - What goes inside memory?

I have a basic question. Consider this simple code:
class A{
void someMethod(){
B b = new B(); // Line 3
B c = new B(); // Line 4
}
}
When Line 3 is executed, class B is loaded into memory (i.e. physical space is allocated for an object of type 'Class', let's say with an id classLoader1.B, containing the code for class B).
Question 1# What happens next? Is an instance of class B (representing the state of b) created (allocated physical memory) based on the fact that classLoader1.B actually contains B's information?
Question 2# Also, at Line 4, since classLoader1.B is already present in memory, is an object containing the state of c created in memory?
Well, your example and the description are a bit too vague to answer your question briefly.
You are referring to different classloaders, but you didn't include any example code showing when each class is loaded. In its current form the code won't even compile as the return value is missing - but let's continue with your question.
The heap is a memory area created by the JVM at startup, and it may grow and shrink dynamically at runtime. It is divided into different sections. YoungGen holds short-lived objects, OldGen holds objects that survived the YoungGen space, and finally there is the PermGen space which, as its name suggests, should contain permanent class metadata and descriptors. Therefore, the PermGen space is reserved for classes and things tied to classes (like static members), and you will have to deal with it if you handle application servers or plugin mechanisms that provide some kind of hot-deployment feature. (To be a bit more precise, in Sun's JVM the PermGen space is actually a separate part of memory and does not really belong to the heap, but different JVM vendors may have different definitions. Note that HotSpot removed PermGen in Java 8 and now keeps class metadata in Metaspace, which lives in native memory.)
Reference: Configuration and Setup of SAP JVM
Two cases may occur upon invoking someMethod():
B was already loaded by the application classloader on startup of your application
B is included within a class that got loaded by a child classloader
In the first case, the memory for the class definition is allocated at startup within the PermGen space and is only freed when the application shuts down. In the latter case, memory for that class is also stored in the PermGen space, but only upon invoking loadClass(...) on the class loader that should load the class. Here, the memory can be freed once no strong reference points to any class loaded by that class loader. Often, however, enums or singleton classes, which hold a strong reference to themselves, will prevent a correct unloading of those loaded bytes and thereby create a memory leak.
If you ever implement one of those application frameworks and debug it, you will see exactly what happens when. To load a class via a class loader, the loadClass(...) method is called. It first checks whether it has already loaded that class, then asks its parent whether it knows this class (the parent in turn checks whether it or its own parent has loaded that class, and so on). Only if the class was not loaded before (by either this class loader or any parent) does the current (child) class loader execute findClass(...), which in turn should call defineClass(), which actually turns the bytes from some input file or stream into a Class representation. That Class object contains the blueprint (the signatures of the methods, including the number and types of parameters, return values and thrown exceptions). When a class is loaded, its superclass and the interfaces it implements usually get loaded too (if not already known in the class loader tree), but the types of its members are not yet loaded! They will get loaded when the class is going to be instantiated.
Reference: How ClassLoader Works in Java
On creating a new instance, the new operator reserves memory for all members of that instance (in bytecode, a new instruction allocates the object and an invokespecial instruction then calls the constructor). So, if the type of a member is not yet known to the current class loader or its parents, it will be loaded before any values are assigned. Then the constructor of the class is executed (according to the constructor called with the new operation) and values get assigned to the memory occupied by the variables on the heap (often in Eden space). After the object is constructed in memory, a reference to the object is returned by the new operator and the object is ready to be used within your code.
c in your example is instantiated the same way as b: first the class loader has to check whether class B was already loaded. As it has loaded B before, it just grabs B from its local cache and returns the class. Next, a new object is instantiated from that class. Again, memory for the member variables is allocated on the heap (after the initial check whether the required classes have already been loaded), the constructor is executed, and the reference to the newly created and initialized object is returned.
If your class has static methods or static members they will get allocated onto the PermGen space, as they belong to the class and are shared across all instances.
One thing to note: If c should be loaded by a peer or peer's child class-loader (CL2) and b was defined by a sister class-loader (CL1) (so no parent has actually defined the class), the peer class-loader CL2 will load (and define) its own version of B which seems to be the same as the version of the sister's loader CL1, but they actually are different classes for Java as the class-loader which loaded that class is actually part of the class. This means CL1-B != CL2-B although both versions share the same methods and fields. Casting c to b's B will result in a ClassCastException therefore.
Just for completeness, although you didn't ask for this: when methods are called, a different kind of memory allocation occurs. The passed arguments are pushed onto the stack, of which every thread has its own instance, and popped from the stack (including the return value) when the method returns. Note that the JVM creates one stack frame per method invocation, not per block (the part between { and }); variables declared inside a block are invisible outside it because the compiler enforces their scope, and their local-variable slots in the frame may be reused once the block ends. More information available here
Reference: Understanding Stack and Heap-Tutorial
b and c are instances of the class B; consider them variables, so they will be stored in memory separately. Each one contains its own state for class B (the class is comparable to a structure). For example, you may have a class Person that has a variable name, and 2 instances, p1 and p2. Each one will have a different name and a different place in memory.
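The Person example can be written out as a small sketch. The constructor, the sample names and the helper methods are made up for illustration.

```java
public class TwoInstancesDemo {
    static class Person {
        String name;

        Person(String name) {
            this.name = name;
        }
    }

    /** Two new expressions always yield two distinct objects. */
    public static boolean distinctObjects() {
        Person p1 = new Person("Ann");
        Person p2 = new Person("Bob");
        return p1 != p2;
    }

    /** Both instances refer to the same, single Class object. */
    public static boolean sameClass() {
        Person p1 = new Person("Ann");
        Person p2 = new Person("Bob");
        return p1.getClass() == p2.getClass();
    }

    /** Changing one instance's state leaves the other untouched. */
    public static String independentState() {
        Person p1 = new Person("Ann");
        Person p2 = new Person("Bob");
        p1.name = "Carol";
        return p2.name;
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects());  // true
        System.out.println(sameClass());        // true
        System.out.println(independentState()); // Bob
    }
}
```

The class is loaded once; each new only adds another block of per-instance state.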

when is "this" assigned memory location and at what point can a method be called in java

class A
{
B b;
public A()
{
b = new B(this);
//initialization of class A variables
}
public void meth1()
{
}
}
class B
{
A a;
public B(A a)
{
this.a = a;
}
}
I know that the this reference shouldn't be passed in this way, but what happens if it is done?
Some other class calls the class A constructor. When is the "this" reference actually allocated memory? Would it be assigned memory as soon as A's constructor is called, even before super() is called?
Suppose class B is a thread. Since B has A's reference, can B call methods on A before A's constructor even returns, if the "this" reference is not allocated memory yet?
The memory for the object is allocated before any constructor is executed. Otherwise the constructor would have no place to write the values of the variables.
Therefore you can pass out a reference to the current object (a.k.a this) to other pieces of code inside the constructor.
As you noted, the object is not fully constructed at that time and it's a bad idea to actually do that, but "just" because the values of the object can be in an inconsistent state. The memory is already allocated and reserved for that object at this point in time.
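The "inconsistent state" can be made visible with a sketch along the lines of the question's code. The value field and valueSeenDuringConstruction are hypothetical additions; the point is only that B receives a valid reference to an A whose fields still hold their defaults.

```java
public class EscapingThisDemo {
    static class A {
        final B b;
        int value;

        A() {
            b = new B(this); // `this` escapes before value is assigned
            value = 42;
        }
    }

    static class B {
        final A a;
        final int valueSeenDuringConstruction;

        B(A a) {
            this.a = a; // the reference is valid: memory is already allocated
            this.valueSeenDuringConstruction = a.value; // but A is not initialized yet
        }
    }

    /** What B observed while A's constructor was still running. */
    public static int valueSeenEarly() {
        return new A().b.valueSeenDuringConstruction;
    }

    /** What the same field holds once A's constructor has returned. */
    public static int valueSeenLater() {
        A a = new A();
        return a.b.a.value;
    }

    public static void main(String[] args) {
        System.out.println(valueSeenEarly()); // 0
        System.out.println(valueSeenLater()); // 42
    }
}
```

This is single-threaded; if B were a thread started inside A's constructor, the visibility of A's fields would additionally depend on the memory model.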
this is just a reference to the "current object", which you could think of as just another parameter that any non-static method gets. In fact, that's actually how the JVM treats it. See JVMS §2.6.1 Local Variables:
On instance method invocation, local variable 0 is always used to pass a reference to the object on which the instance method is being invoked (this in the Java programming language).
So the direct answer to "when is this allocated" is effectively: Whenever you call a method on an object.
this refers to current object and any object is allocated memory using "new"
The memory is allocated when JVM is processing new instruction. If for example your code looks like:
A a = new A();
^
here the memory for A is allocated
It could indeed be a problem to pass this to B. The constructor of B can invoke an instance method of A before the constructor of A has finished. You should move that line to the end of A's constructor to avoid possible problems. Alternatively, you could manage the object lifecycle from outside using setters.
this is assigned before the constructor is called. In fact, an explicit super() call is not necessary, because the compiler inserts one implicitly. It only ensures that the creation stuff of the parent class is done, which doesn't matter if the parent class is Object. Also, A's methods are usable as soon as the object is created (even before the constructor is called), so if B got the reference to A in the constructor, it can use A's methods just like A itself can in the constructor. Just be sure to write A's methods so that they can be used when A is not fully initialized, or just create and start B after the initialization is complete.
As long as you don't modify A or call methods on A or its members in the constructor of B, it will work. (See other answers.)
If you call a method on a not completely initialized object (after construction), it's not defined what happens, especially if you use multiple threads (see memory barrier).
More on this topic:
How do JVM's implicit memory barriers behave when chaining constructors?
