I read somewhere that calling a constructor is the job of the JVM, so I created a class named Hello, left it completely empty, and compiled it. When I opened the bytecode afterwards, there was a constructor inside the class: the default constructor.
So is it the compiler's duty to add a default constructor?
I thought it's jvm who checks and calls constructor.
Ps: I haven't run that code.
I thought it's jvm who checks and calls constructor.
Wrong assumption. The JVM reads compiled class (.class) files. It doesn't modify them or add anything to them.
Of course, the JVM executes code, and thus calls/invokes methods and constructors.
But the Java compiler is responsible for "adding" such things as a default constructor; see here for more details.
Having said that, of course there is the JIT (just in time compiler) that is part of the JVM. But the JIT translates byte code into machine code, and its job is again, not to add things such as additional constructors.
I am trying to work out what you mean by:
I thought it's jvm who checks and calls constructor.
The "call" makes sense.
The "check" ... I am not sure about. If you mean that the JVM's classloader checks that the required constructors are present when it loads[1] the class, that is correct. But if the JVM finds that a (default or otherwise) constructor is missing, it doesn't just add one. Instead the JVM marks the class and its dependents as unusable, throws an Error, and typically exits.
(Note that the kinds of checks described above are done to deal with cases where there is a binary compatibility mismatch between the versions of classes used at compile time and at runtime. Typically, you compiled a class against one version of an API and then put an incompatible version on the runtime classpath.)
The checking that you are probably thinking about is done by the bytecode compiler.
If there is no constructor in the source code of a class, the compiler defines a default constructor and includes it in the .class file. This is in conformance with what the JLS says.
If the source code contains a new expression that uses a constructor that hasn't been defined, the compiler treats this as a compilation error.
By the time the JVM sees any bytecode file for a Java class, it will contain at least one constructor.
1 - I am deliberately leaving out some details here.
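One way to confirm this without reading the bytecode is to ask reflection what the compiled class contains. The following is only a minimal sketch: Hello is the empty class from the question, while DefaultCtorCheck and its helper method are made-up names for illustration.

```java
import java.lang.reflect.Constructor;

class DefaultCtorCheck {
    // Number of constructors recorded in the compiled class.
    static int constructorCount(Class<?> type) {
        return type.getDeclaredConstructors().length;
    }

    public static void main(String[] args) {
        for (Constructor<?> c : Hello.class.getDeclaredConstructors()) {
            // Lists the single no-argument constructor that javac emitted.
            System.out.println(c + " synthetic=" + c.isSynthetic());
        }
    }
}

// No constructor in the source: the compiler supplies the default one.
class Hello {
}
```

The constructor is already in the class when the JVM loads it; the reflection call only reads what the compiler wrote.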
A default constructor is automatically generated by the compiler if you do not explicitly define at least one constructor in your class. You've defined two, so your class does not have a default constructor.
A default constructor is created if you don't define any constructors in your class. It is simply a no-argument constructor which does nothing except call super():
public Module(){
}
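The rule above can be checked with a small reflective test. This is an illustrative sketch; the class names Plain and TwoExplicit and the helper hasNoArgConstructor are assumptions, not taken from the original question.

```java
class NoDefaultCtorCheck {
    // True if the compiled class contains a zero-argument constructor.
    static boolean hasNoArgConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasNoArgConstructor(Plain.class));       // true
        System.out.println(hasNoArgConstructor(TwoExplicit.class)); // false
    }
}

// No constructor declared: the compiler adds the default constructor.
class Plain {
}

// Two constructors declared: the compiler adds nothing.
class TwoExplicit {
    TwoExplicit(int id) { }
    TwoExplicit(String name) { }
}
```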
Based on the older Java (7) Language Specifications (13.1.7):
Any constructs introduced by a Java compiler that do not have a corresponding construct in the source code must be marked as synthetic, except for default constructors, the class initialization method, and the values and valueOf methods of the Enum class.
On newer ones (Java (17) Language Specifications (13.1.7): ) that wording changes to:
A construct emitted by a Java compiler must be marked as synthetic if it does not correspond to a construct declared explicitly or implicitly in source code, unless the emitted construct is a class initialization method (JVMS §2.9).
I wonder how this would apply to the accessor methods created for the components of Java records (JEP 395).
For example
record ARecord(int a){}
would have a method int a(), yet there is no code representing such a method. According to the wording of the older JLS, the method is added by the compiler, so I would expect it to be synthetic. But it's not, as can be confirmed by running the following 2 lines in JShell:
jshell
| Welcome to JShell -- Version 17.0.1
| For an introduction type: /help intro
jshell> record ARecord(int a){}
| created record ARecord
jshell> ARecord.class.getDeclaredMethod("a").isSynthetic();
$2 ==> false
jshell>
The reason I ask is that I would like to use reflection (or any other programmatic means at runtime) to determine which elements of the class have a matching code structure, i.e. which ones have source code representing them. Meaning:
For the following code
record ARecord(int a){
    public void someMethod() {}
}
that entity would have 2 methods (a and someMethod). a has no code representing it and someMethod does, and I need a way to differentiate them based on that criterion.
I wonder if it is because it is considered implicitly declared, its code being implicitly defined as part of the component.
This is exactly it. Note how the old spec only says that "synthetic" should be marked on constructs that
do not have a corresponding construct in the source code
with the exception of the implicitly declared Enum.values and Enum.valueOf. Back then, those were the only two implicitly declared (in the sense that the new spec uses the phrase) things, apparently. :D
On the other hand, the new spec says
does not correspond to a construct declared explicitly or implicitly in source code
Note that this wording automatically handles the Enum exceptions, but also handles the plethora of implicitly declared things that got added since. This includes record components.
From the Java 17 spec §8.10.3. Record Members,
Furthermore, for each record component, a record class has a method with the same name as the record component and an empty formal parameter list. This method, which is declared explicitly or implicitly, is known as an accessor method.
...
If a record class has a record component for which an accessor method is not declared explicitly, then an accessor method for that record component is declared implicitly [...]
The method a is implicitly declared in your component, therefore it is not synthetic.
Generally speaking (there might be exceptions to this that I don't know of), synthetic constructs are constructs that are not specified by the language spec, but are required for a particular implementation of a compiler to work. The spec is basically saying that such constructs must be marked as "synthetic" in the binary. See some examples here.
Any members of a type that have the synthetic flag on are ignored entirely by javac. Javac acts exactly as if those things don't exist at all.
As a consequence, obviously, the 'getters' you get for a record aren't synthetic. If they were, it would be impossible to call them from .java code - the only way to call them is to write a hacky javac clone that does compile access to synthetics, or to use bytecode manipulation to remove the synthetic flag, or to emit bytecode directly, or to use reflection.
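To see the flags for yourself, the sketch below lists a record's accessor methods through getRecordComponents() and reads the synthetic flag of individual methods. It assumes Java 16 or later. ARecord is from the question; RecordAccessorCheck and its helpers are hypothetical names. Note that it identifies accessors only: as far as I know, reflection does not reveal whether an accessor was written explicitly or declared implicitly.

```java
import java.lang.reflect.Method;
import java.lang.reflect.RecordComponent;
import java.util.Set;
import java.util.TreeSet;

class RecordAccessorCheck {
    record ARecord(int a) {
        public void someMethod() { }
    }

    // Names of the accessor methods, one per record component.
    // (Matching by name is a simplification that ignores overloads.)
    static Set<String> accessorNames(Class<?> recordType) {
        Set<String> names = new TreeSet<>();
        for (RecordComponent component : recordType.getRecordComponents()) {
            names.add(component.getAccessor().getName());
        }
        return names;
    }

    // Synthetic flag of the first declared method with the given name.
    static boolean isSynthetic(Class<?> type, String methodName) {
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return m.isSynthetic();
            }
        }
        throw new IllegalArgumentException("no such method: " + methodName);
    }

    public static void main(String[] args) {
        Set<String> accessors = accessorNames(ARecord.class);
        for (Method m : ARecord.class.getDeclaredMethods()) {
            System.out.println(m.getName()
                    + " accessor=" + accessors.contains(m.getName())
                    + " synthetic=" + m.isSynthetic());
        }
    }
}
```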
I have my own System class, with a Test class in the same package which tests the methods declared in System. I have also created a System constructor which takes 3 parameters. When I called that constructor from the test in my IDE, the program worked fine (I used java.util.System wherever I needed the platform's System. methods), and the IDE knew I was referring to my own class. However, when I try to run my test class from the command line, it won't even compile:
error: constructor System in class System cannot be applied to given types;
System sys = new System("String1", "String2", 20);
^
required: no arguments
found: String,String,int
reason: actual and formal argument lists differ in length
My guess is that instead of my constructor, the java.util.System constructor (with no parameters) is being invoked which causes the whole program to crash. Does anyone know how to fix it and why is it only happening in command line and not in IDE?
You mention java.util.System, but that's not where the platform's System lives; it lives in java.lang.
This is a problem. Java code acts as if import java.lang.*; is at the top of every file, even if you don't write it. The java language spec says so. So now you get into a fun dilemma:
Given a class named System in the same package, and you star-imported another package that also has a System class in it, which one is chosen if you use an unqualified type reference "System" someplace in the code?
The answer is presumably that whilst the spec is clear on this, few to no java coders care about the answer. They'd rather just.. not get into this bizarro situation. Thus, don't use star imports lightly, and don't name any classes the same as classes in the java.lang package.
If you must know, the order is as follows:
To resolve the type name System into which actual type it is referring to:
Check if there is a named (non-star) import for it: import java.lang.System;
Check if there is a class named System in this source file.
Check if there is a class named System in this package.
Check if there is a class named System in any star-imported package (and therefore in java.lang, as that is always star-imported).
Thus, given that it sounds like your System class is in the same package, that one 'wins'. However, if during a compilation run your non-test source files (such as your System.java file) are not on the classpath or sourcepath, the compiler does not tell you so directly. Instead, System resolves to java.lang.System and you get the error you are seeing.
So, you have 2 problems:
you are not compiling the test classes on the command line correctly. Use a build system.
Don't name classes the same as classes in the lang package; whilst you can make code that works, and the ordering is well defined, it's confusing (hey, it confused you - that's anecdotal evidence right there!) and not idiomatic java. Other folks will have a very hard time reading your code, and you're likely to run into bugs in IDEs and such, because when you're doing weird unique things, odds go way up you run into scenarios nobody thought of and nobody ran into before.
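As a small illustration of the shadowing involved, here is a single-file sketch. To keep it self-contained, a nested class stands in for the same-package System class, which is an assumption of the sketch; the fields and the summary() method are made up. Inside the enclosing class the simple name System no longer means java.lang.System, so the platform class has to be written with its fully qualified name.

```java
class ShadowingSketch {
    // Stand-in for the asker's own System class (fields are hypothetical).
    static class System {
        final String first;
        final String second;
        final int count;

        System(String first, String second, int count) {
            this.first = first;
            this.second = second;
            this.count = count;
        }

        String summary() {
            return first + "/" + second + "/" + count;
        }
    }

    static String build() {
        // The simple name resolves to the nested class, not java.lang.System.
        System sys = new System("String1", "String2", 20);
        return sys.summary();
    }

    public static void main(String[] args) {
        // The platform class is still reachable by its fully qualified name.
        java.lang.System.out.println(build()); // String1/String2/20
    }
}
```

Having to spell out java.lang.System everywhere is a good hint of why the naming advice above is worth following.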
Preface
I have been experimenting with ByteBuddy and ASM, but I am still a beginner in ASM and between beginner and advanced in ByteBuddy. This question is about ByteBuddy and about JVM bytecode limitations in general.
Situation
I had the idea of creating global mocks for testing by instrumenting constructors in such a way that instructions like these are inserted at the beginning of each constructor:
if (GlobalMockRegistry.isMock(getClass()))
return;
FYI, the GlobalMockRegistry basically wraps a Set<Class<?>>, and if that set contains a certain class, then isMock(Class<?> clazz) returns true. The advantage of that concept is that I can (de)activate global mocking for each class during runtime, because if multiple tests run in the same JVM process, one test might need a certain global mock and the next one might not.
What the if(...) return; instructions above want to achieve is that if mocking is active, the constructor should not do anything:
no this() or super() calls, → update: impossible
no field initialisations, → update: possible
no other side effects. → update: might be possible, see my update below
The result would be an object with uninitialised fields that did not create any (possibly expensive) side effects such as resource allocation (database connection, file creation, you name it). Why would I want that? Could I not just create an instance with Objenesis and be happy? Not if I want a global mock, i.e. mock objects I cannot inject because they are created somewhere inside methods or field initialisers I do not have control over. Please do not worry about what method calls on such an object would do if its instance fields are not properly initialised. Just assume I have instrumented the methods to return stub results, too. I know how to do that already; in the context of this question, the only problem is constructors.
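For readers who want something concrete, the registry described above might look roughly like this. Only the class name GlobalMockRegistry and the method isMock come from the description; activate, deactivate, and the choice of a concurrent set are assumptions made for the sketch.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class GlobalMockRegistry {
    // Concurrent set, on the assumption that tests may toggle mocks from several threads.
    private static final Set<Class<?>> MOCKED = ConcurrentHashMap.newKeySet();

    private GlobalMockRegistry() { }

    // Hypothetical helper: switch global mocking on for a class.
    static void activate(Class<?> type) {
        MOCKED.add(type);
    }

    // Hypothetical helper: switch global mocking off again.
    static void deactivate(Class<?> type) {
        MOCKED.remove(type);
    }

    static boolean isMock(Class<?> type) {
        return MOCKED.contains(type);
    }

    public static void main(String[] args) {
        activate(StringBuilder.class);
        System.out.println(isMock(StringBuilder.class)); // true
        deactivate(StringBuilder.class);
        System.out.println(isMock(StringBuilder.class)); // false
    }
}
```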
Questions / problems
Now if I try to simulate the desired result in Java source code, I meet the following limitations:
I cannot insert any code before this() or super(). I could mitigate that by also instrumenting the super class hierarchy with the same if(...) return;, but would like to know if I could in theory use ASM to insert my code before this() or super() using a method visitor. Or would the byte code of the instrumented class somehow be verified during loading or retransformation and then rejected because the byte code is "illegal"? I would like to know before I start learning ASM because I want to avoid wasting time for an idea which is not feasible.
If the class contains final instance fields, I also cannot enter a return before all of those fields have been initialised in the constructor. That might happen at the very end of a complex constructor which performs lots of side effects before actually initialising the last field. So the question is similar to the previous one: Can I use ASM to insert my if(...) return; before any fields (including final ones) are initialised and produce a valid class which I could not produce using javac and will not be rejected when loaded or retransformed?
BTW, if it is relevant, we are talking about Java 8+, i.e. at the time of writing this that would be Java versions 8 to 14.
If anything about this question is unclear, please do not hesitate to ask follow-up questions, so I can improve it.
Update after discussing Antimony's answer
I think this approach could work and avoid side effects, calling the constructor chain but avoiding any side effects and resulting in a newly initialised instance with all fields empty (null, 0, false):
In order to avoid calling this.getClass(), I need to hard-code the mock target's class name directly into all constructors up the parent chain. I.e. if two "global mock" target classes have the same parent class(es), multiple of the following if blocks would be woven into each corresponding parent class, one for each hard-coded child class name.
In order to avoid any side effects from objects being created or methods being called, I need to call a super constructor myself, using null/zero/false values for each argument. That would not matter because the next parent class up the chain would have a similar code block so that the arguments given do not matter anyway.
// Avoid accessing 'this.getClass()'
if (GlobalMockRegistry.isMock(Sub.class)) {
// Identify and call any parent class constructor, ideally a default constructor.
// If none exists, call another one using default values like null, 0, false.
// In the class derived from Object, just call 'Object.<init>'.
super(null, 0, false);
return;
}
// Here follows the original byte code, i.e. the normal super/this call and
// everything else the original constructor does.
Note to myself: Antimony's answer explains "uninitialised this" very nicely. Another related answer can be found here.
Next update after evaluating my new idea
I managed to validate my new idea with a proof of concept. As my JVM byte code knowledge is too limited and I am not used to the way of thinking it requires (stack frames, local variable tables, "reverse" logic of first pushing/popping variables, then applying an operation on them, not being able to easily debug), I just implemented it in Javassist instead of ASM, which in comparison was a breeze after failing miserably with ASM after hours of trial & error.
I can take it from here and I want to thank user Antimony for his very instructive answer + comments. I do know that theoretically the same solution could be implemented using ASM, but it would be exceedingly difficult in comparison because its API is too low level for the task at hand. ByteBuddy's API is too high level, Javassist was just right for me in order to get quick results (and easily maintainable Java code) in this case.
Yes and no. Java bytecode is much less restrictive than Java (source) in this regard. You can put any bytecode you want before the constructor call, as long as you don't actually access the uninitialized object. (The only operations allowed on an uninitialized this value are calling a constructor, setting private fields declared in the same class, and comparing it against null).
Bytecode is also more flexible in where and how you make the constructor call. For example, you can call one of two different constructors in an if statement, or you can wrap the super constructor call in a "try block", both things that are impossible at the Java language level.
Apart from not accessing the uninitialized this value, the only restriction* is that the object has to be definitely initialized along any path that returns from the constructor call. This means the only way to avoid initializing the object is to throw an exception. While being much laxer than Java itself, the rules for Java bytecode were still very deliberately constructed so it is impossible to observe uninitialized objects. In general, Java bytecode is still required to be memory safe and type safe, just with a much looser type system than Java itself. Historically, Java applets were designed to run untrusted code in the JVM, so any method of bypassing these restrictions was a security vulnerability.
* The above is talking about traditional bytecode verification, as that is what I am most familiar with. I believe stackmap verification behaves similarly though, barring implementation bugs in some versions of Java.
P.S. Technically, Java can have code execute before the constructor call. If you pass arguments to the constructor, those expressions are evaluated first, and hence the ability to place bytecode before the constructor call is required in order to compile Java code. Likewise, the ability to set private fields declared in the same class is used to set synthetic variables that arise from the compilation of nested classes.
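The point about argument expressions can be observed in plain Java. In this sketch (all names are made up for illustration), the argument to super(...) is evaluated, and logs its message, before the superclass constructor body runs.

```java
import java.util.ArrayList;
import java.util.List;

class CtorOrderSketch {
    static final List<String> LOG = new ArrayList<>();

    // Records a step and returns it, so it can be used as a constructor argument.
    static String note(String step) {
        LOG.add(step);
        return step;
    }

    static class Base {
        Base(String ignored) {
            LOG.add("Base body");
        }
    }

    static class Sub extends Base {
        Sub() {
            // The argument expression runs before Base's constructor is invoked.
            super(note("argument evaluated"));
            LOG.add("Sub body");
        }
    }

    static List<String> run() {
        LOG.clear();
        new Sub();
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [argument evaluated, Base body, Sub body]
    }
}
```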
If the class contains final instance fields, I also cannot enter a return before all of those fields have been initialised in the constructor.
This, however, is eminently possible. The only restriction is that you call some constructor or superconstructor on the uninitialized this value. (Since all constructors recursively have this restriction, this will ultimately result in java.lang.Object's constructor being called). However, the JVM doesn't care what happens after that. In particular, it only cares that the fields have some well typed value, even if it is the default value (null for objects, 0 for ints, etc.) So there is no need to execute the field initializers to give them a meaningful value.
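The default values mentioned above are visible even from ordinary Java source, without any bytecode manipulation. The following sketch (hypothetical class names) reads two final fields of a subclass while the superclass constructor is still running, i.e. before the subclass has assigned them. It does not instrument anything; it only shows that unassigned fields, final ones included, hold null or 0.

```java
class DefaultValueSketch {
    static class Base {
        final String seenDuringSuper;

        Base() {
            // Virtual call: runs Sub.describe() before Sub has assigned its fields.
            seenDuringSuper = describe();
        }

        String describe() {
            return "base";
        }
    }

    static class Sub extends Base {
        private final String name;
        private final int size;

        Sub(String name, int size) {
            super();
            this.name = name;
            this.size = size;
        }

        @Override
        String describe() {
            return name + ":" + size;
        }
    }

    public static void main(String[] args) {
        Sub sub = new Sub("widget", 3);
        System.out.println(sub.seenDuringSuper); // null:0
        System.out.println(sub.describe());      // widget:3
    }
}
```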
Is there any other way to get the type to be instantiated other than this.getClass() from a super class constructor?
Not as far as I am aware. There's no special opcode for magically getting the Class associated with a given value. Foo.class is just syntactic sugar which is handled by the Java compiler.
Since a static function call is translated into a static invocation bytecode regardless of how the definition exists... is there some way to force a caller of a static function to compile successfully even when the target function and class don't exist yet?
I want to be able to compile calls to functions that don't exist yet. I need to tell the compiler to trust me that at runtime, I'll have them properly defined and in the classpath so go ahead and compile it for now.
Is there a way to do this?
Reflectively yes, but not via a regular call.
The call requires an entry in the class file's constant pool that includes the method name and parameter types, so the compiler needs to be able to decide on a signature for the method.
invokestatic <method-spec>
<method-spec> is a method specification. It is a single token made up of three parts: a classname, a methodname and a descriptor. e.g.
java/lang/System/exit(I)V
is the method called "exit" in the class called "java.lang.System", and it has the descriptor "(I)V" (i.e. it takes an integer argument and returns no result).
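If you want to see what descriptor a given signature corresponds to, the JDK can produce it for you. This is a small sketch using java.lang.invoke.MethodType; the class name DescriptorSketch and the helper are illustrative only.

```java
import java.lang.invoke.MethodType;

class DescriptorSketch {
    // Builds the JVM descriptor string for a return type and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Same shape as System.exit(int): takes an int, returns nothing.
        System.out.println(descriptor(void.class, int.class));               // (I)V
        System.out.println(descriptor(int.class, String.class, long.class)); // (Ljava/lang/String;J)I
    }
}
```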
Consider
AClass.aStaticMethod(42)
Without knowing anything about AClass, it could be a call to any of
AClass.aStaticMethod(int)
AClass.aStaticMethod(int...)
AClass.aStaticMethod(long)
AClass.aStaticMethod(long...)
ditto for float and double
AClass.aStaticMethod(Integer)
AClass.aStaticMethod(Number)
AClass.aStaticMethod(Comparable<? extends Integer>)
AClass.aStaticMethod(Object)
AClass.aStaticMethod(Serializable)
and probably a few others that I've missed.
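To make the ambiguity concrete, here is a sketch in which the same call text binds to different signatures depending on what the target class declares. The holder classes are made up; aStaticMethod is the name used above.

```java
class OverloadSketch {
    static class OnlyLong {
        static String aStaticMethod(long value) { return "long"; }
    }

    static class OnlyObject {
        static String aStaticMethod(Object value) { return "Object"; }
    }

    static class Both {
        static String aStaticMethod(long value) { return "long"; }
        static String aStaticMethod(Object value) { return "Object"; }
    }

    public static void main(String[] args) {
        // Identical call text; the compiler picks the target from the declarations it can see.
        System.out.println(OnlyLong.aStaticMethod(42));   // long   (int widened to long)
        System.out.println(OnlyObject.aStaticMethod(42)); // Object (int boxed to Integer)
        System.out.println(Both.aStaticMethod(42));       // long   (widening is tried before boxing)
    }
}
```

Without the declarations, the compiler has no basis for choosing which descriptor to write into the class file.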
... is there some way to force a caller of a static function to compile successfully even when the target function and class don't exist yet?
No. When compiling a method call, the compiler needs to check the name, argument types, result type, exceptions and so on of the called method. Since you are asking about a static method, this information can only be defined in one place ... the class that declares the static method. There is no work-around for this if you want static type-safety.
I need to tell the compiler to trust me that at runtime ...
It is not that simple:
You haven't told the compiler what the method signature should be. The compiler needs to be told, because it is not possible to accurately infer the signature from the call.
The Java platform is designed to be robust, and "just trust me" could lead to catastrophic runtime failures.
If you are willing to sacrifice compile-time type safety and eschew the convenience / simplicity / readability of statically typed code, then reflection is an option. But I can't think of any other options that would work.
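A minimal sketch of the reflective route follows. The target class and method are named by strings, so the caller compiles even if the target does not exist at compile time. java.lang.Math is used here only so that the example runs as written; ReflectiveCallSketch and callStatic are hypothetical names.

```java
import java.lang.reflect.Method;

class ReflectiveCallSketch {
    // Looks up and calls a public static method by name at runtime.
    static Object callStatic(String className, String methodName,
                             Class<?>[] parameterTypes, Object... args) {
        try {
            Class<?> target = Class.forName(className);
            Method method = target.getMethod(methodName, parameterTypes);
            return method.invoke(null, args); // null receiver: the method is static
        } catch (ReflectiveOperationException e) {
            // Missing class, missing method, or a failure inside the call all surface only now.
            throw new IllegalStateException("Could not call " + className + "." + methodName, e);
        }
    }

    public static void main(String[] args) {
        Object result = callStatic("java.lang.Math", "max",
                new Class<?>[] {int.class, int.class}, 3, 7);
        System.out.println(result); // 7
    }
}
```

Note that the caller still has to state the parameter types explicitly, and every mistake the compiler would normally catch becomes a runtime failure.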
No, but you could declare interfaces that have the methods and code against them, then use the Abstract Factory pattern to provide implementations at runtime.
Dependency Injection uses this approach.
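A rough sketch of that idea: callers compile against an interface only, and a factory resolves the implementation by name at runtime. All names here (Calculator, CalculatorFactory, SimpleCalculator) are invented for illustration. In a real setup the class name would come from configuration rather than from SimpleCalculator.class, which is used only to keep the example in one file.

```java
class CalculatorFactory {
    // Resolves the implementation by name at runtime; callers depend only on Calculator.
    static Calculator create(String implementationClassName) {
        try {
            return Class.forName(implementationClassName)
                    .asSubclass(Calculator.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No usable implementation: " + implementationClassName, e);
        }
    }

    public static void main(String[] args) {
        Calculator calc = create(SimpleCalculator.class.getName());
        System.out.println(calc.add(2, 3)); // 5
    }
}

// The contract that calling code is compiled against.
interface Calculator {
    int add(int a, int b);
}

// One implementation; it could be supplied later, in a separate jar.
class SimpleCalculator implements Calculator {
    @Override
    public int add(int a, int b) {
        return a + b;
    }
}
```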
I want to know how the Java linker works. Specifically, in which order does it combine classes, interfaces, packages, methods, etc. into a JVM-executable format? I have found some information here, but it says little about linking order.
There is no such thing as a Java "linker". There is, however, the concept of a classloader which - given an array of java byte codes from "somewhere" - can create an internal representation of a Class which can then be used with new etc.
In this scenario interfaces are just special classes. Methods and fields are available when the class has been loaded.
First of all: methods are always part of a class. Interfaces are basically just special classes, and packages are just a part of the fully qualified name of a class with some impact on visibility and the physical organization of class files.
So the question comes down to: how does a JVM link class files? The JVM spec you linked to says:
The Java programming language allows an implementation flexibility as to when linking activities (and, because of recursion, loading) take place, provided that the semantics of the language are respected, that a class or interface is completely verified and prepared before it is initialized, and that errors detected during linkage are thrown at a point in the program where some action is taken by the program that might require linkage to the class or interface involved in the error.
For example, an implementation may choose to resolve each symbolic reference in a class or interface individually, only when it is used (lazy or late resolution), or to resolve them all at once, for example, while the class is being verified (static resolution). This means that the resolution process may continue, in some implementations, after a class or interface has been initialized.
Thus, the question can only be answered for a specific JVM implementation.
Furthermore, it should never make a difference in the behaviour of Java programs, except possibly for the exact point where linking errors result in runtime Error instances being thrown.
Java doesn't do linking the way C does. The principal unit is the class definition. A lot of the matching of a class reference to its definition happens at runtime. So you could compile a class against one version of a library, but provide another version at runtime. If the relevant signatures match, everything will be OK. There's some inlining of constants at compile time, but that's about it.
As noted previously, the Java compiler doesn't have a linker. However, the JVM has a linking phase, which is performed after class loading. The JVM spec defines it best:
Linking a class or interface involves verifying and preparing that class or interface, its direct superclass, its direct superinterfaces, and its element type (if it is an array type), if necessary. Resolution of symbolic references in the class or interface is an optional part of linking.
This specification allows an implementation flexibility as to when linking activities (and, because of recursion, loading) take place, provided that all of the following properties are maintained:
A class or interface is completely loaded before it is linked.
A class or interface is completely verified and prepared before it is initialized.
Errors detected during linkage are thrown at a point in the program where some action is taken by the program that might, directly or indirectly, require linkage to the class or interface involved in the error.
https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-5.html#jvms-5.4
Linking is one of the three activities performed by ClassLoaders. It includes verification, preparation, and (optionally) resolution.
Verification: It ensures the correctness of the .class file, i.e. it checks whether the file is properly formatted and was generated by a valid compiler. If verification fails, we get the run-time error java.lang.VerifyError.
Preparation: The JVM allocates memory for class variables and initializes the memory to default values.
Resolution: It is the process of replacing symbolic references from the type with direct references. It is done by searching the method area to locate the referenced entity.
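The effect of the preparation step can be observed from plain Java. In this sketch (names are made up), a static initializer reads another static field before that field's own initializer has run, and sees the default value that preparation assigned.

```java
class PreparationSketch {
    static class Config {
        // Static initializers run in textual order during initialization.
        static int early = peekLate(); // 'late' still holds its prepared default here
        static int late = 42;

        static int peekLate() {
            return late;
        }
    }

    public static void main(String[] args) {
        System.out.println(Config.early); // 0
        System.out.println(Config.late);  // 42
    }
}
```

The 0 comes from preparation; the 42 is assigned later, during initialization, which is a separate step that follows linking.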