Constructor bytecode - java

The ASM guide talks about constructors:
package pkg;
public class Bean {
private int f;
public int getF() {
return this.f;
}
public void setF(int f) {
this.f = f;
}
}
The Bean class also has a default public constructor which is
generated by the compiler, since no explicit constructor was defined
by the programmer. This default public constructor is generated as
Bean() { super(); }. The bytecode of this constructor is the
following:
ALOAD 0
INVOKESPECIAL java/lang/Object <init> ()V
RETURN
The first instruction pushes this on the operand stack. The second
instruction pops this value from the stack, and calls the <init>
method defined in the Object class. This corresponds to the super()
call, i.e. a call to the constructor of the super class, Object. You
can see here that constructors are named differently in compiled and
source classes: in compiled classes they are always named <init>,
while in source classes they have the name of the class in which they
are defined. Finally the last instruction returns to the caller.
How is the value of this already known to the JVM before the first instruction of the constructor?

At the JVM level, first the object is allocated, uninitialized, then the constructor is invoked on that object. The constructor is more-or-less an instance method executed on the uninitialized object.
Even in the Java language, this exists and has all its fields at the first line of the constructor.

The first thing to understand, is how object instantiation work on the bytecode level.
As explained in JVMS, §3.8. Working with Class Instances:
Java Virtual Machine class instances are created using the Java Virtual Machine's new instruction. Recall that at the level of the Java Virtual Machine, a constructor appears as a method with the compiler-supplied name <init>. This specially named method is known as the instance initialization method (§2.9). Multiple instance initialization methods, corresponding to multiple constructors, may exist for a given class. Once the class instance has been created and its instance variables, including those of the class and all of its superclasses, have been initialized to their default values, an instance initialization method of the new class instance is invoked. For example:
Object create() {
return new Object();
}
compiles to:
Method java.lang.Object create()
0 new #1 // Class java.lang.Object
3 dup
4 invokespecial #4 // Method java.lang.Object.<init>()V
7 areturn
So the constructor invocation via invokespecial shares the behavior of passing this as the first argument with invokevirtual.
However, it must be emphasized that a reference to an uninitialized reference is treated specially, as you are not allowed to use it before the constructor (or the super constructor when you’re inside the constructor) has been invoked. This is enforced by the verifier.
JVMS, §4.10.2.4. Instance Initialization Methods and Newly Created Objects:
… The instance initialization method (§2.9) for class myClass sees the new uninitialized object as its this argument in local variable 0. Before that method invokes another instance initialization method of myClass or its direct superclass on this, the only operation the method can perform on this is assigning fields declared within myClass.
When doing dataflow analysis on instance methods, the verifier initializes local variable 0 to contain an object of the current class, or, for instance initialization methods, local variable 0 contains a special type indicating an uninitialized object. After an appropriate instance initialization method is invoked (from the current class or its direct superclass) on this object, all occurrences of this special type on the verifier's model of the operand stack and in the local variable array are replaced by the current class type. The verifier rejects code that uses the new object before it has been initialized or that initializes the object more than once. In addition, it ensures that every normal return of the method has invoked an instance initialization method either in the class of this method or in the direct superclass.
Similarly, a special type is created and pushed on the verifier's model of the operand stack as the result of the Java Virtual Machine instruction new. The special type indicates the instruction by which the class instance was created and the type of the uninitialized class instance created. When an instance initialization method declared in the class of the uninitialized class instance is invoked on that class instance, all occurrences of the special type are replaced by the intended type of the class instance. This change in type may propagate to subsequent instructions as the dataflow analysis proceeds.
So code creating an object via the new instruction can’t use it in any way before the constructor has been called, whereas a constructor’s code can only assign fields before another (this(…) or super(…)) constructor has been called (an opportunity used by inner classes to initialize their outer instance reference as a first action), but still can’t do anything else with their uninitialized this.
It’s also not allowed for a constructor to return when this is still in the uninitialized state. Hence, the automatically generated constructor bears the required minimum, invoking the super constructor and returning (there is no implicit return on the byte code level).
It’s generally recommended to read The Java® Virtual Machine Specification (resp. its Java 11 version) alongside to any ASM specific documentation or tutorials.

Related

Java method reference resolving [duplicate]

This question already has answers here:
Java implicit "this" parameter in method?
(2 answers)
Closed 4 years ago.
I'm trying to understand how method references work in java.
At first sight it is pretty straightforward. But not when it comes to such things:
There is a method in Foo class:
public class Foo {
public Foo merge(Foo another) {
//some logic
}
}
And in another class Bar there is a method like this:
public class Bar {
public void function(BiFunction<Foo, Foo, Foo> biFunction) {
//some logic
}
}
And a method reference is used:
new Bar().function(Foo::merge);
It complies and works, but I don't understand how does it match this:
Foo merge(Foo another)
to BiFunction method:
R apply(T t, U u);
???
There is an implicit this argument on instance methods. This is defined §3.7 of the JVM specification:
The invocation is set up by first pushing a reference to the current instance, this, on to the operand stack. The method invocation's arguments, int values 12 and 13, are then pushed. When the frame for the addTwo method is created, the arguments passed to the method become the initial values of the new frame's local variables. That is, the reference for this and the two arguments, pushed onto the operand stack by the invoker, will become the initial values of local variables 0, 1, and 2 of the invoked method.
To understand why method invocation is done this way, we need to understand how the JVM stores code in memory. The code and the data of an object are separated. In fact, all methods of one class (static and non-static) are stored in the same place, the method area (§2.5.4 of JVM spec). This allows to store each method only once instead of re-storing them for each instance of a class over and over again. When a method like
someObject.doSomethingWith(someOtherObject);
is called, it gets actually compiled to something that looks more like
doSomething(someObject, someOtherObject);
Most Java-programmers would agree that someObject.doSomethingWith(someOtherObject) has a "lower cognitive complexity": we do something with someObject that involves someOtherObject. The center of this action is someObject, where someOtherObject is just a means to an end.
With doSomethingWith(someObject, someOtherObject), you do not transport this semantics of someObject being the center of the action.
So in essence, we write the first version, but the computer prefers the second version.
As was pointed out by #FedericoPeraltaSchaffner, you can even write the implicit this parameter explicitly since Java 8. The exact definition is given in JLS, §8.4.1:
The receiver parameter is an optional syntactic device for an instance method or an inner class's constructor. For an instance method, the receiver parameter represents the object for which the method is invoked. For an inner class's constructor, the receiver parameter represents the immediately enclosing instance of the newly constructed object. Either way, the receiver parameter exists solely to allow the type of the represented object to be denoted in source code, so that the type may be annotated. The receiver parameter is not a formal parameter; more precisely, it is not a declaration of any kind of variable (§4.12.3), it is never bound to any value passed as an argument in a method invocation expression or qualified class instance creation expression, and it has no effect whatsoever at run time.
The receiver parameter must be of the type of the class and must be named this.
This means that
public String doSomethingWith(SomeOtherClass other) { ... }
and
public String doSomethingWith(SomeClass this, SomeOtherClass other) { ... }
will have the same semantic meaning, but the latter allows for e.g. annotations.
I find it easier to understand with different types :
public class A {
public void test(){
function(A::merge);
}
public void function(BiFunction<A, B, C> f){
}
public C merge(B i){
return null;
}
class B{}
class C{}
}
We can know see that using a method reference Test::merge instead of a reference on an instance will implicitly use this as the first value.
15.13.3. Run-Time Evaluation of Method References
If the form is ReferenceType :: [TypeArguments] Identifier
[...]
If the compile-time declaration is an instance method, then the target reference is the first formal parameter of the invocation method. Otherwise, there is no target reference.
And we can find some example using this behavior on the following subject:
The JLS - 15.13.1. Compile-Time Declaration of a Method Reference mention:
A method reference expression of the form ReferenceType::[TypeArguments] Identifier can be interpreted in different ways.
- If Identifier refers to an instance method, then the implicit lambda expression has an extra parameter [...]
- if Identifier refers to a static method. It is possible for ReferenceType to have both kinds of applicable methods, so the search algorithm described above identifies them separately, since there are different parameter types for each case.
It then show some ambiguity possible with this behavior :
class C {
int size() { return 0; }
static int size(Object arg) { return 0; }
void test() {
Fun<C, Integer> f1 = C::size;
// Error: instance method size()
// or static method size(Object)?
}
}

How does constructor return if it doesn't have any return type?

This seems to be quite a confusing question. Per the definition, I understand that constructor is a special type of method used to initialize the state of an object and/or assign values to instance variables.
Also someone in Stack Overflow mentioned that constructor returns an object (instance) of a class, as opposed to what a normal method does/returns?
Despite going through lots of tutorials and reference materials, I couldn't find a concrete reason as to how constructor is able to return a value without the presence of a return statement.
I'm curious to find out the internal working of the entire process.
Constructors don't return anything. A constructor simply initializes an instance.
A new instance creation expression
new SomeExample();
produces a reference to a new instance of the specified class
A new class instance is explicitly created when evaluation of a class
instance creation expression (§15.9) causes a class to be
instantiated.
and invokes the corresponding constructor to initialize the created instance
Just before a reference to the newly created object is returned as the
result, the indicated constructor is processed to initialize the new
object using the following procedure:
Assign the arguments for the constructor to newly created parameter variables for this constructor invocation.
If this constructor begins with an explicit constructor invocation (§8.8.7.1) of another constructor in the same class (using this), then
evaluate the arguments and process that constructor invocation
recursively using these same five steps. If that constructor
invocation completes abruptly, then this procedure completes abruptly
for the same reason; otherwise, continue with step 5.
This constructor does not begin with an explicit constructor invocation of another constructor in the same class (using this). If
this constructor is for a class other than Object, then this
constructor will begin with an explicit or implicit invocation of a
superclass constructor (using super). Evaluate the arguments and
process that superclass constructor invocation recursively using these
same five steps. If that constructor invocation completes abruptly,
then this procedure completes abruptly for the same reason. Otherwise,
continue with step 4.
Execute the instance initializers and instance variable initializers for this class, assigning the values of instance variable
initializers to the corresponding instance variables, in the
left-to-right order in which they appear textually in the source code
for the class. If execution of any of these initializers results in an
exception, then no further initializers are processed and this
procedure completes abruptly with that same exception. Otherwise,
continue with step 5.
Execute the rest of the body of this constructor. If that execution completes abruptly, then this procedure completes abruptly for the
same reason. Otherwise, this procedure completes normally.
It gives the JVM the 'return' opcode:
'return' returns to the calling method:
http://en.wikipedia.org/wiki/Java_bytecode_instruction_listings
Code for a default constructor:
aload_0
invokespecial #1; //Method java/lang/Object
return
A Java constructor does not return anything. A constructor simply initializes a new instance of an object of a specific class. Sometimes constructors will have System.out.Println("text") which may lead you to think it returns something, but you can have that statement in any method that doesn't have a return type.
In bytecode
Test1 t1 = new Test1();
looks as follows
NEW test/Test1 //create an uninitized instance of Test1
DUP
NVOKESPECIAL test/Test1.<init> ()V // call construcctor
STORE 1 // save reference to created instance in local var
and this is constructor, void method in fact with special name <init>
public <init>()V //V means no return value, void
L0
LINENUMBER 3 L0
ALOAD 0
INVOKESPECIAL java/lang/Object.<init> ()V // call super constructor
RETURN

Why is a subclass' static initializer not invoked when a static method declared in its superclass is invoked on the subclass?

Given the following classes:
public abstract class Super {
protected static Object staticVar;
protected static void staticMethod() {
System.out.println( staticVar );
}
}
public class Sub extends Super {
static {
staticVar = new Object();
}
// Declaring a method with the same signature here,
// thus hiding Super.staticMethod(), avoids staticVar being null
/*
public static void staticMethod() {
Super.staticMethod();
}
*/
}
public class UserClass {
public static void main( String[] args ) {
new UserClass().method();
}
void method() {
Sub.staticMethod(); // prints "null"
}
}
I'm not targeting at answers like "Because it's specified like this in the JLS.". I know it is, since JLS, 12.4.1 When Initialization Occurs reads just:
A class or interface type T will be initialized immediately before the first occurrence of any one of the following:
...
T is a class and a static method declared by T is invoked.
...
I'm interested in whether there is a good reason why there is not a sentence like:
T is a subclass of S and a static method declared by S is invoked on T.
Be careful in your title, static fields and methods are NOT inherited. This means that when you comment staticMethod() in Sub , Sub.staticMethod() actually calls Super.staticMethod() then Sub static initializer is not executed.
However, the question is more interesting than I thought at the first sight : in my point of view, this shouldn't compile without a warning, just like when one calls a static method on an instance of the class.
EDIT: As #GeroldBroser pointed it, the first statement of this answer is wrong. Static methods are inherited as well but never overriden, simply hidden. I'm leaving the answer as is for history.
I think it has to do with this part of the jvm spec:
Each frame (§2.6) contains a reference to the run-time constant pool (§2.5.5) for the type of the current method to support dynamic linking of the method code. The class file code for a method refers to methods to be invoked and variables to be accessed via symbolic references. Dynamic linking translates these symbolic method references into concrete method references, loading classes as necessary to resolve as-yet-undefined symbols, and translates variable accesses into appropriate offsets in storage structures associated with the run-time location of these variables.
This late binding of the methods and variables makes changes in other classes that a method uses less likely to break this code.
In chapter 5 in the jvm spec they also mention:
A class or interface C may be initialized, among other things, as a result of:
The execution of any one of the Java Virtual Machine instructions new, getstatic, putstatic, or invokestatic that references C (§new, §getstatic, §putstatic, §invokestatic). These instructions reference a class or interface directly or indirectly through either a field reference or a method reference.
...
Upon execution of a getstatic, putstatic, or invokestatic instruction, the class or interface that declared the resolved field or method is initialized if it has not been initialized already.
It seems to me the first bit of documentation states that any symbolic reference is simply resolved and invoked without regard as to where it came from. This documentation about method resolution has the following to say about that:
[M]ethod resolution attempts to locate the referenced method in C and its superclasses:
If C declares exactly one method with the name specified by the method reference, and the declaration is a signature polymorphic method (§2.9), then method lookup succeeds. All the class names mentioned in the descriptor are resolved (§5.4.3.1).
The resolved method is the signature polymorphic method declaration. It is not necessary for C to declare a method with the descriptor specified by the method reference.
Otherwise, if C declares a method with the name and descriptor specified by the method reference, method lookup succeeds.
Otherwise, if C has a superclass, step 2 of method resolution is recursively invoked on the direct superclass of C.
So the fact that it's called from a subclass seems to simply be ignored. Why do it this way? In the documentation you provided they say:
The intent is that a class or interface type has a set of initializers that put it in a consistent state, and that this state is the first state that is observed by other classes.
In your example, you alter the state of Super when Sub is statically initialized. If initialization happened when you called Sub.staticMethod you would get different behavior for what the jvm considers the same method. This might be the inconsistency they were talking about avoiding.
Also, here's some of the decompiled class file code that executes staticMethod, showing use of invokestatic:
Constant pool:
...
#2 = Methodref #18.#19 // Sub.staticMethod:()V
...
Code:
stack=0, locals=1, args_size=1
0: invokestatic #2 // Method Sub.staticMethod:()V
3: return
The JLS is specifically allowing the JVM to avoid loading the Sub class, it's in the section quoted in the question:
A reference to a static field (§8.3.1.1) causes initialization of only the class or interface that actually declares it, even though it might be referred to through the name of a subclass, a subinterface, or a class that implements an interface.
The reason is to avoid having the JVM load classes unnecessarily. Initializing static variables is not an issue because they are not getting referenced anyway.
The reason is quite simple: for JVM not to do extra work prematurely (Java is lazy in its nature).
Whether you write Super.staticMethod() or Sub.staticMethod(), the same implementation is called. And this parent's implementation typically does not depend on subclasses. Static methods of Super are not supposed to access members of Sub, so what's the point in initializing Sub then?
Your example seems to be artificial and not well-designed.
Making subclass rewrite static fields of superclass does not sound like a good idea. In this case an outcome of Super's methods will depend on which class is touched first. This also makes hard to have multiple children of Super with their own behavior. To cut it short, static members are not for polymorphism - that's what OOP principles say.
According to this article, when you call static method or use static filed of a class, only that class will be initialized.
Here is the example screen shot.
for some reason jvm think that static block is no good, and its not executed
I believe, it is because you are not using any methods for subclass, so jvm sees no reason to "init" the class itself, the method call is statically bound to parent at compile time - there is late binding for static methods
http://ideone.com/pUyVj4
static {
System.out.println("init");
staticVar = new Object();
}
Add some other method, and call it before the sub
Sub.someOtherMethod();
new UsersClass().method();
or do explicit Class.forName("Sub");
Class.forName("Sub");
new UsersClass().method();
When static block is executed Static Initializers
A static initializer declared in a class is executed when the class is initialized
when you call Sub.staticMethod(); that means class in not initialized.Your are just refernce
When a class is initialized
When a Class is initialized in Java After class loading, initialization of class takes place which means initializing all static members of class. A Class is initialized in Java when :
1) an Instance of class is created using either new() keyword or using reflection using class.forName(), which may throw ClassNotFoundException in Java.
2) an static method of Class is invoked.
3) an static field of Class is assigned.
4) an static field of class is used which is not a constant variable.
5) if Class is a top level class and an assert statement lexically nested within class is executed.
When a class is loaded and initialized in JVM - Java
that's why your getting null(default value of instance variable).
public class Sub extends Super {
static {
staticVar = new Object();
}
public static void staticMethod() {
Super.staticMethod();
}
}
in this case class is initialize and you get hashcode of new object().If you do not override staticMethod() means your referring super class method
and Sub class is not initialized.

How exactly does the 'new' keyword and constructors work

Ok, so i'm study for my first java certification and i can't quite understand exactly what happens underneath the hood when an object is created, some explanations that iv'e read states that the constructor returns the reference to an object which is a source of confusion for me because from my understanding this is done by the new keyword. My questions are:
Where does the reference to an object actually come from, the 'new' keyword or the constructor?
When an object is created using the 'new' keyword, does it implicitly pass this object to the constructor?
When an object is created using the 'new' keyword, is it just a java Object until it is passed to the constructor eg. Person me = new Person(); Is this object associated with the class person as soon as it is created, or only after it has been passed to the constructor
The entire expression of new Constructor(arguments) is considered one instance-creation expression, and is not meaningful as a separate new keyword and constructor call.
The JLS, 15.9.4 describes the actual creation of the instance, which is comprised of three steps:
Qualifying the expression (not applicable when using new)
Allocating space for the instance in memory, and setting fields to default values.
Evaluating arguments for the constructor, and calling the constructor.
First, if the class instance creation expression is a qualified class
instance creation expression, the qualifying primary expression is
evaluated. If the qualifying expression evaluates to null, a
NullPointerException is raised, and the class instance creation
expression completes abruptly. If the qualifying expression completes
abruptly, the class instance creation expression completes abruptly
for the same reason.
Next, space is allocated for the new class instance. If there is
insufficient space to allocate the object, evaluation of the class
instance creation expression completes abruptly by throwing an
OutOfMemoryError.
The new object contains new instances of all the fields declared in
the specified class type and all its superclasses. As each new field
instance is created, it is initialized to its default value (§4.12.5).
Next, the actual arguments to the constructor are evaluated,
left-to-right. If any of the argument evaluations completes abruptly,
any argument expressions to its right are not evaluated, and the class
instance creation expression completes abruptly for the same reason.
Next, the selected constructor of the specified class type is invoked.
This results in invoking at least one constructor for each superclass
of the class type. This process can be directed by explicit
constructor invocation statements (§8.8) and is specified in detail in
§12.5.
The value of a class instance creation expression is a reference to
the newly created object of the specified class. Every time the
expression is evaluated, a fresh object is created.
Yes, as bytecode this all turns into one invokespecial call with the new instance at the bottom of the stack of parameters passed. Please see the bottom of this answer. Semantically, however, the constructor doesn't get the new instance "passed" to it, but that instance is made available as this in Java source code.
This question doesn't make semantic sense, as the object doesn't "exist" to the outside world until the constructor returns. If the construction terminates abruptly, the object is not actually created and available. However, in memory on an OpenJDK JVM at least, the type of the object as well as any types that it extends/implements are written to an in-memory header. This is not guaranteed for all implementations, however.
Disassembly:
I compiled and disassembled:
class Test{
public static void main(String args[]){
Integer s = new Integer(2);
}
}
This is the result:
public static void main(java.lang.String[]);
Code:
0: new #2 // class java/lang/Integer
3: dup
4: iconst_2
5: invokespecial #3 // Method java/lang/Integer."<init>":(I)V
8: astore_1
9: return
As you can see, first the object is allocated using new, which fills in all fields as defaults. It's duplicated, so the stack looks like:
our new Integer
our new Integer
The constant value that I passed to the constructor is then pushed:
2
our new Integer
our new Integer
invokespecial occurs, which passes the top two stack elements--the new instance, and the constructor's parameter.

Constructor is static or non static

As per standard book constructor is a special type of function which is used to initialize objects.As constructor is defined as a function and inside class function can have only two type either static or non static.My doubt is what constructor is ?
1.)As constructor is called without object so it must be static
Test test =new Test();//Test() is being called without object
so must be static
My doubt is if constructor is static method then how can we frequently used this inside
constructor
Test(){
System.out.println(this);
}
Does the output Test#12aw212 mean constructors are non-static?
Your second example hits the spot. this reference is available in the constructor, which means constructor is executed against some object - the one that is currently being created.
In principle when you create a new object (by using new operator), JVM will allocate some memory for it and then call a constructor on that newly created object. Also JVM makes sure that no other method is called before the constructor (that's what makes it special).
Actually, on machine level, constructor is a function with one special, implicit this parameter. This special parameter (passed by the runtime) makes the difference between object and static methods. In other words:
foo.bar(42);
is translated to:
bar(foo, 42);
where first parameter is named this. On the other hand static methods are called as-is:
Foo.bar(42);
translates to:
bar(42);
Foo here is just a namespace existing barely in the source code.
Constructors are non-static. Every method first parameter is implicit this (except static) and constructor is one of that.
Constructors are NOT static functions. When you do Test test =new Test(); a new Test object is created and then the constructor is called on that object (I mean this points to the newly created object).
The new keyword here is the trick. You're correct in noting that in general, if you're calling it without an object, a method is static. However in this special case (i.e., preceded by the new keyword) the compiler knows to call the constructor.
The new operator returns a reference to the object it created.
new Test(); // creates an instance.
The System.out.println(this); is called after the new operator has instantiated the object
Not static. Read about constructors http://www.javaworld.com/jw-10-2000/jw-1013-constructors.html.
Neither.
Methods can be divided into 2 types: static/non-static methods, aka class/instance methods.
But constructors are not methods.
When we talk about static class then it comes to our mind that methods are called with class name,But in case of constructor ,Constructor is initialized when object is created So this proves to be non-static.
Constructors are neither static (as called using class name) or non-static as executed while creating an object.
Static:
Temp t= new Temp();
The new operator creates memory in the heap area and passes it to the constructor as Temp(this) implicitly. It then initializes a non-static instance variable defined in a class called this to the local parameter variable this.
Below example is just for understanding the concept, if someone tries to compile it, it will give the compile-time error.
class Temp{
int a;
Temp this; //inserted by compiler.
Temp(Temp this){ //passed by compiler
this.this=this; // initialise this instance variable here.
this.a=10;//when we write only a=10; and all the non-static member access by this implicitly.
return this; // so that we can't return any value from constructor.
}
}
Constructor is static because:
It is helping to create object.
It is called without object.
Constructor is used to initialize the object and has the behavior of non-static methods,as non-static methods belong to objects so as constructor also and its invoked by the JVM to initialize the objects with the reference of object,created by new operator

Categories