Why is the resolution phase required in Java during class loading? - java

When do the symbolic references get replaced with memory references in the method area?

All symbolic references that have been loaded into the method area in the form of the runtime constant pool are resolved to actual types loaded by this JVM. If a symbolic reference can be resolved but results in a conflict of definitions, an IncompatibleClassChangeError is thrown. If a referenced class cannot be found, a NoClassDefFoundError is thrown, which wraps the ClassNotFoundException that was thrown by the class loader attempting to load the referenced class. If a class turns out to be its own superclass or superinterface, a ClassCircularityError is thrown. Resolution can happen in one of two flavors, and the choice is up to the implementors of the JVM:
Eager: All symbolic references to other fields, methods or classes are resolved right away.
Lazy: Resolution of a symbolic reference is postponed until its first use, for example the first use of a method. As a consequence, a class referring to a non-existent class may never throw an error if this reference never needs to be resolved.
Look at the beginning of Chapter 5.4.3, Resolution, where it is stated explicitly:
The Java Virtual Machine instructions anewarray, checkcast, getfield,
getstatic, instanceof, invokedynamic, invokeinterface, invokespecial,
invokestatic, invokevirtual, ldc, ldc_w, multianewarray, new,
putfield, and putstatic make symbolic references to the run-time
constant pool. Execution of any of these instructions requires
resolution of its symbolic reference.
There is the resolution of the direct superclass and the directly implemented interfaces (or superinterfaces in the case of an interface), which happens early, and there is the resolution of symbolic references for the purpose of the above bytecode instructions, which can be postponed.

The resolution phase is required so that the referenced classes or interfaces are resolved either the first time they are used or all at once based on the JVM implementation.
When do the symbolic references get replaced with memory references in the method area?
This happens as part of class file verification, which works in conjunction with the class loader and ensures that loaded class files have a proper internal structure and are consistent.
Class file verification happens in four distinct passes:
Pass 1: Structural checks on the class files
This happens as a class is loaded. The loaded class is checked for the internal structure to ensure it is safe to parse.
Pass 2: Semantic checks on the data types
Here the class file verifier makes sure that the semantics of the Java programming language are obeyed.
Pass 3: Bytecode verification
Here the data-flow analysis of the bytecodes representing the methods of the class is performed.
Pass 4: Verification of symbolic references
It is in pass 4 that dynamic linking happens, which is the process of resolving symbolic references into direct references.

Related

Does Java Class Linkage Resolution step OR Initialisation lead to loading of other resolved classes?

I was going through the JVM specification and the JLS to understand the class loading mechanism in Java.
Here is what I understand:
First, when the main class is asked to be loaded, the JVM checks whether the binary representation of the class has already been loaded; if not, the class loader loads the class file from disk.
Linkage steps: Verification, Preparation and Resolution.
Initialisation.
What I find confounding is this: if, during the Resolution and Initialisation steps, a class is referenced that has not yet been loaded from the source, what happens? Does the Resolution or Initialisation step pause so that the class can be loaded by its parent class loader?
Or are loading, linking and initialization deferred until the method or code using that reference is actually executed at runtime?
JVMS §5.4. Linking states:
Linking a class or interface involves verifying and preparing that class or interface, its direct superclass, its direct superinterfaces, and its element type (if it is an array type), if necessary. Resolution of symbolic references in the class or interface is an optional part of linking.
So when not talking about the direct supertypes of a class, the resolution is optional and may be deferred.
The same section also contains
For example, a Java Virtual Machine implementation may choose to resolve each symbolic reference in a class or interface individually when it is used ("lazy" or "late" resolution), or to resolve them all at once when the class is being verified ("eager" or "static" resolution). This means that the resolution process may continue, in some implementations, after a class or interface has been initialized.
So the process does not always strictly follow the graphic you’ve shown in the question. Instead, the resolution can be seen as an ongoing process.
In practice, in case of the HotSpot JVM, some classes have to get resolved immediately, like the superclasses. Other classes are resolved when verifying code of a method, which happens right before the first execution of a method for this JVM.
This does not affect all classes referenced by a method’s code but depends on the actual type use, e.g. HotSpot’s verifier will resolve types for checking the validity of assignments against the actual type hierarchy, but skip this step if a type is assigned to itself or to java.lang.Object, i.e. where the assignment is always valid. So some types may get resolved only at their first actual use, e.g. when they are instantiated via new or a static method declared by the type is invoked. But this depends on subtle aspects of the code. See also When is a Java Class loaded? or Does the JVM throw if an unused class is absent?
There might be types referenced only in reflective data like annotations or debug attributes which never get resolved during one run, but may be in another.
But since this implies that the resolution of a type is deferred to the point when it is actually needed, it also implies that right at this point, the operation will stop and wait for the completion of this process for the prerequisite classes. So, e.g. loading a class always implies resolving its direct superclass, loading it if not already loaded, which in turn implies resolving of the superclass of the superclass and so on. So it won’t return before the complete super class hierarchy has been resolved.
The JVMS also states in Section 5.3
If the Java Virtual Machine ever attempts to load a class C during verification
(§5.4.1) or resolution (§5.4.3) (but not initialization (§5.5)), and the class loader
that is used to initiate loading of C throws an instance of ClassNotFoundException ,
then the Java Virtual Machine must throw an instance of NoClassDefFoundError
whose cause is the instance of ClassNotFoundException .
(A subtlety here is that recursive class loading to load superclasses is performed
as part of resolution (§5.3.5, step 3). Therefore, a ClassNotFoundException that
results from a class loader failing to load a superclass must be wrapped in a
NoClassDefFoundError .)
So there is indeed a recursion going on in the resolution phase of the classloading.
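As a minimal sketch of the two failure types named in the quoted section: an explicit load of a missing class surfaces as a checked ClassNotFoundException, while a link-time failure surfaces as a NoClassDefFoundError carrying that exception as its cause. The class name MissingClassDemo, its helpers and com.example.NoSuchClass are made up for illustration, and the NoClassDefFoundError below is constructed by hand rather than produced by a real linking failure.

```java
class MissingClassDemo {
    // Explicit loading reports a missing class as a checked exception.
    static String tryLoad(String className) {
        try {
            Class.forName(className);
            return "loaded " + className;
        } catch (ClassNotFoundException e) {
            return "not found: " + className;
        }
    }

    // Inspects a failure the way a caller might after a linkage error:
    // the original ClassNotFoundException is available via getCause().
    static String describe(Throwable failure) {
        if (failure instanceof NoClassDefFoundError
                && failure.getCause() instanceof ClassNotFoundException) {
            return "linkage failed, loader reported: " + failure.getCause().getMessage();
        }
        return failure.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(tryLoad("java.lang.String"));
        System.out.println(tryLoad("com.example.NoSuchClass"));

        // Hand-built stand-in for what the JVM would throw during resolution.
        Throwable simulated = new NoClassDefFoundError("com/example/NoSuchClass")
                .initCause(new ClassNotFoundException("com.example.NoSuchClass"));
        System.out.println(describe(simulated));
    }
}
```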

JVM tries to load a class that isn't called

I am writing code that adds functions to a 'mod' if it exists in the classpath (referenced by pixelmonPresent).
PixelHammerTool extends ItemHammer, and ItemHammer only exists if pixelmon is present.
The issue I'm having is that if I do this in the class (same package):
if(Basemod.pixelmonPresent) {
rubyHammer = new PixelHammerTool(Basemod.RUBY, "pixelutilitys:RubyHammer", "rubyHammer");
}
it causes a class-not-found error on PixelHammerTool.
Why is this class being loaded if the if statement is false, and what can I do about it?
The why is fairly straightforward: when a class is loaded and linked, classes referenced by it may be loaded too, even if they appear only on a code path that never executes. (In fact they may be loaded first.)
Avoiding it isn't too complicated either, although the code won't look nice: you need to load the class with reflection, using Class.forName(), find the constructor you want from the array returned by Class.getConstructors() and then create an instance using Constructor.newInstance().
Note that this solution is fine if it only happens a few times in your code. If you find yourself doing this a lot, you should probably look for a dependency injection framework that will do the heavy lifting for you.
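The reflective approach described above could look roughly like the following sketch. It assumes the optional class has a public constructor taking a single String, and it uses Class.getConstructor(String.class) instead of scanning the array from getConstructors(). The class and method names are hypothetical, and java.lang.StringBuilder stands in for PixelHammerTool so that the example runs on its own.

```java
import java.lang.reflect.Constructor;

class OptionalDependencyDemo {
    // Returns a new instance of className, or null if the class (or one of
    // the classes it depends on) is not available at run time.
    static Object newInstanceOrNull(String className, String arg) {
        try {
            // The class is named only by a string, so the calling class has
            // no symbolic reference to it that would need to be resolved.
            Class<?> type = Class.forName(className);
            Constructor<?> ctor = type.getConstructor(String.class);
            return ctor.newInstance(arg);
        } catch (ReflectiveOperationException | LinkageError e) {
            // ClassNotFoundException, NoSuchMethodException, NoClassDefFoundError, ...
            return null;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the optional class; present in every JDK.
        System.out.println(newInstanceOrNull("java.lang.StringBuilder", "rubyHammer"));
        // Hypothetical class name that is not on the classpath.
        System.out.println(newInstanceOrNull("com.example.NotInstalled", "rubyHammer"));
    }
}
```

Catching LinkageError as well covers the case where the optional class itself is present but its superclass is missing.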
Under the Linking section in the specs, we see this:
For example, a Java Virtual Machine implementation may choose to resolve each symbolic reference in a class or interface individually when it is used ("lazy" or "late" resolution), or to resolve them all at once when the class is being verified ("eager" or "static" resolution). This means that the resolution process may continue, in some implementations, after a class or interface has been initialized. Whichever strategy is followed, any error detected during resolution must be thrown at a point in the program that (directly or indirectly) uses a symbolic reference to the class or interface.
So the point at which a symbolic reference is resolved is implementation-dependent. The behavior you're seeing is consistent with the "eager" resolution mentioned: when you reference PixelHammerTool in your code, even if it's on a runtime path that will never be hit, the JVM tries to link in its definition, which does not exist.
This strategy causes the JVM to start slower but execute faster at runtime, which is generally the strategy taken in all the implementations I'm familiar with. (The name "bootstrap class loader" should not be read as evidence either way: it denotes the built-in loader for the core platform classes, not a loading strategy.)
You can either instantiate the class via reflection, as biziclop suggested (the easier route), which forces linking at runtime, or find or create a class loader that instantiates classes lazily.

When a program written in Java is running, are all of its classes loaded into main memory?

When a program written in Java is running, will all of its classes be loaded into main memory? If so, isn't it a waste of RAM?
No it's fine, because of virtual address space and virtual memory. Read these:
http://en.wikipedia.org/wiki/Virtual_memory
http://en.wikipedia.org/wiki/Virtual_address_space
Virtual memory means that you can load a large amount into memory and the unused sections are saved to disc and are moved out of physical RAM.
Virtual address space means that each process (one example of a process is your Java program) has its own address space, so it does not 'steal' addresses from other processes.
Only classes that are referenced during a particular execution are loaded. Most large Java programs will frequently run with many of the classes not loaded as those classes serve various scenarios not exercised by that particular process.
Classes in the standard library are handled the same as application classes. For instance, if your application does not reference AWT, no classes in AWT packages will be loaded.
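One way to observe on-demand loading, sketched here under the assumption of a HotSpot-style JVM that does not load a class before its first use, is to compare the JVM's total loaded-class counter before and after a class is used for the first time. ClassLoadingMXBean is part of the standard java.lang.management API; the demo class and its nested RarelyNeeded class are made up for illustration, and other JVMs may legitimately report a different difference.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

class LoadOnDemandDemo {
    // A class that is not touched until useRarelyNeeded() is called.
    static class RarelyNeeded {
        @Override
        public String toString() {
            return "RarelyNeeded instance";
        }
    }

    static Object useRarelyNeeded() {
        return new RarelyNeeded();
    }

    // Returns how many classes the JVM loaded while RarelyNeeded was first used.
    static long classesLoadedByFirstUse() {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        long before = bean.getTotalLoadedClassCount();
        Object instance = useRarelyNeeded();
        long after = bean.getTotalLoadedClassCount();
        if (instance == null) {
            throw new IllegalStateException("instance was not created");
        }
        return after - before;
    }

    public static void main(String[] args) {
        System.out.println("classes loaded by first use: " + classesLoadedByFirstUse());
    }
}
```

Running the JVM with -verbose:class gives a more detailed view of the same behavior.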
The Java Language Specification contains wording that explicitly precludes eager initialization of classes.
JLS Section 12.4:
A class or interface type T will be initialized immediately before the first occurrence of any one of the following:
T is a class and an instance of T is created.
T is a class and a static method declared by T is invoked.
A static field declared by T is assigned.
A static field declared by T is used and the field is not a constant variable (§4.12.4).
T is a top-level class, and an assert statement (§14.10) lexically nested
Note my use of the term "initialization". In JLS terms, a class is initialized when its static initializers and static field initializers are executed, which is a separate step from parsing the binary data that defines the class.
There is nothing precluding a particular ClassLoader implementation from loading the binaries of all the classes that it sees into memory, but it cannot initialize those classes until they are requested without violating the JLS.
For a common ClassLoader implementation, see URLClassLoader.
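The initialization triggers from JLS 12.4 quoted above can be observed with static initializers that record when they run. The sketch below uses made-up class names and assumes a fresh JVM in which none of the nested classes has been used yet. It loads Child through Class.forName with initialize set to false, which should not run any static initializer, and then invokes a static method, which initializes the superclass first and then Child. NeverUsed is never initialized because the branch that uses it is not taken.

```java
import java.util.ArrayList;
import java.util.List;

class InitializationOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static { LOG.add("Parent"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child"); }
        static void touch() { }
    }

    static class NeverUsed {
        static { LOG.add("NeverUsed"); }
        static void touch() { }
    }

    // Loads Child without initializing it and returns a snapshot of the log.
    static List<String> loadOnly() {
        try {
            // A class literal and getName() do not trigger initialization;
            // 'false' asks forName to skip initialization as well.
            Class.forName(Child.class.getName(), false,
                    InitializationOrderDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(LOG);
    }

    // Invoking a static method declared by Child is an initialization trigger.
    static List<String> firstUse(boolean alsoUseOther) {
        Child.touch();
        if (alsoUseOther) {
            NeverUsed.touch();
        }
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(loadOnly());      // prints []
        System.out.println(firstUse(false)); // prints [Parent, Child]
    }
}
```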

JVM verification - when is it performed?

I would like to know in exactly which situations the JVM verifier kicks in and checks a class. I know one such instance is when the class is loaded, but sometimes a class is loaded and only verified later. That's why I want to know precisely when that happens.
The spec (§4.10) says the following:
A Java virtual machine implementation verifies that each class file
satisfies the necessary constraints at linking time (§5.4).
§5.4 defines what exactly "linking time" means:
Linking a class or interface involves verifying and preparing that
class or interface, its direct superclass, its direct superinterfaces,
and its element type (if it is an array type), if necessary.
Resolution of symbolic references in the class or interface is an
optional part of linking.
This specification allows an implementation flexibility as to when
linking activities (and, because of recursion, loading) take place,
provided that all of the following properties are maintained:
A class or interface is completely loaded before it is linked.
A class or interface is completely verified and prepared before it is initialized.
Errors detected during linkage are thrown at a point in the program where some action is taken by the program that might, directly
or indirectly, require linkage to the class or interface involved in
the error.
For example, a Java virtual machine implementation may choose to
resolve each symbolic reference in a class or interface individually
when it is used ("lazy" or "late" resolution), or to resolve them all
at once when the class is being verified ("eager" or "static"
resolution). This means that the resolution process may continue, in
some implementations, after a class or interface has been initialized.
Whichever strategy is followed, any error detected during resolution
must be thrown at a point in the program that (directly or indirectly)
uses a symbolic reference to the class or interface.
Note that, as a matter of fact, at least HotSpot performs lazy resolution as described (and I'd be extremely surprised if JRockit and co. did otherwise).
Source:
http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html#jvms-4.10
http://docs.oracle.com/javase/specs/jvms/se7/html/jvms-5.html#jvms-5.4

How does the Java linker work?

I want to know how the Java linker works. Specifically, in which order does it combine classes, interfaces, packages, methods, etc. into a JVM-executable format? I have found some information here, but there is not much information about linking order.
There is no such thing as a Java "linker". There is, however, the concept of a classloader which - given an array of java byte codes from "somewhere" - can create an internal representation of a Class which can then be used with new etc.
In this scenario interfaces are just special classes. Methods and fields are available when the class has been loaded.
First of all: methods are always part of a class. Interfaces are basically just special classes, and packages are just a part of the fully qualified name of a class with some impact on visibility and the physical organization of class files.
So the question comes down to: how does a JVM link class files? The JVM spec you linked to says:
The Java programming language allows
an implementation flexibility as to
when linking activities (and, because
of recursion, loading) take place,
provided that the semantics of the
language are respected, that a class
or interface is completely verified
and prepared before it is initialized,
and that errors detected during
linkage are thrown at a point in the
program where some action is taken by
the program that might require linkage
to the class or interface involved in
the error.
For example, an implementation may
choose to resolve each symbolic
reference in a class or interface
individually, only when it is used
(lazy or late resolution), or to
resolve them all at once, for example,
while the class is being verified
(static resolution). This means that
the resolution process may continue,
in some implementations, after a class
or interface has been initialized.
Thus, the question can only be answered for a specific JVM implementation.
Furthermore, it should never make a difference in the behaviour of Java programs, except possibly for the exact point where linking errors result in runtime Error instances being thrown.
Java doesn't do linking the way C does. The principal unit is the class definition. A lot of the matching of a class reference to its definition happens at runtime. So you could compile a class against one version of a library, but provide another version at runtime. If the relevant signatures match, everything will be ok. There's some inlining of constants at compile time, but that's about it.
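The compile-time inlining of constants mentioned above has an observable side effect: reading a compile-time constant does not touch the declaring class at run time, so it does not trigger that class's initialization, whereas reading an ordinary static field does. The following sketch uses made-up names (ConstantInliningDemo, Limits) and assumes a fresh JVM in which Limits has not been used before.

```java
class ConstantInliningDemo {
    static final StringBuilder LOG = new StringBuilder();

    static class Limits {
        static final int MAX = 10;   // compile-time constant, inlined by javac
        static int current = 3;      // ordinary static field, read at run time
        static { LOG.append("Limits initialized"); }
    }

    // The compiler copies the value 10 into this method's bytecode.
    static int readConstant() {
        return Limits.MAX;
    }

    // This is a real field access, so Limits is initialized first.
    static int readOrdinaryField() {
        return Limits.current;
    }

    public static void main(String[] args) {
        System.out.println(readConstant() + " -> log: '" + LOG + "'");      // prints 10 -> log: ''
        System.out.println(readOrdinaryField() + " -> log: '" + LOG + "'"); // prints 3 -> log: 'Limits initialized'
    }
}
```

This is also why a changed constant in a library is not picked up by client classes until they are recompiled.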
As noted previously, the Java compiler doesn't have a linker. However, the JVM has a linking phase, which is performed after class loading. The JVM spec defines it best:
Linking a class or interface involves verifying and preparing that
class or interface, its direct superclass, its direct superinterfaces,
and its element type (if it is an array type), if necessary.
Resolution of symbolic references in the class or interface is an
optional part of linking.
This specification allows an implementation flexibility as to when
linking activities (and, because of recursion, loading) take place,
provided that all of the following properties are maintained:
A class or interface is completely loaded before it is linked.
A class or interface is completely verified and prepared before it is
initialized.
Errors detected during linkage are thrown at a point in the program
where some action is taken by the program that might, directly or
indirectly, require linkage to the class or interface involved in the
error.
https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-5.html#jvms-5.4
Linking is one of the three activities in the class loading process (loading, linking and initialization). It includes verification, preparation, and (optionally) resolution.
Verification: It ensures the correctness of the .class file, i.e. it checks whether the file is properly formatted and was generated by a valid compiler. If verification fails, a java.lang.VerifyError is thrown at run time.
Preparation: The JVM allocates memory for class variables and initializes that memory to default values.
Resolution: It is the process of replacing symbolic references from the type with direct references. It is done by searching the method area to locate the referenced entity.
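The effect of the preparation step can be made visible with a small sketch (class and field names are made up): a static field that is read before its own initializer has run still holds the default value assigned during preparation.

```java
class PreparationDemo {
    // Initializers run in textual order, so peek() runs before 'answer = 42'.
    static int seenDuringInit = peek();
    static int answer = 42;

    // At the time of the first call, 'answer' still holds its default value.
    static int peek() {
        return answer;
    }

    public static void main(String[] args) {
        System.out.println(seenDuringInit); // prints 0
        System.out.println(answer);         // prints 42
    }
}
```

Note that this only works because answer is not a compile-time constant; a static final int with a constant initializer would be inlined and read as 42.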
