I was curious about which locations the JVM searches when executing a program. I'm mostly interested in where the JVM looks for class files and in what sequence: does it look in the Java libraries, the extension libraries, the classpath, or a directory such as the current directory from which java is invoked? I'm more interested in JVM behaviour than in how a class loader loads classes, which I know uses a parent-delegation mechanism up to the root.
If a class is executed from a directory where the compiled class is kept on the file system and also in a jar file in the same directory, would the JVM load both or just one, and which one?
Say you have a thread unsafe Vector and we compare its performance to ArrayList, which one would be better and why?
How classes are found.
Answer is here:
http://docs.oracle.com/javase/1.5.0/docs/tooldocs/findingclasses.html
Answer for point 2:
Order of finding classes is as follows:
classes or packages in current directory.
classes found from CLASSPATH environment variable. [overrides 1]
classes found from -classpath command line option. [overrides 1,2]
classes found from jar archives specified via -jar command line option [overrides 1,2,3]
So if you use -jar option while running, classes come from jarfile.
Only one class is loaded though.
Without using any additional classloader:
Search order for a JVM:
Runtime classes (basically, rt.jar in $JRE_HOME/lib)
Extension classes (some JARs in $JRE_HOME/lib/ext)
Classpath, in order. There are four possibilities for specifying classpath:
If -jar was specified, then that JAR is the classpath. Whatever is declared in the Class-Path attribute of its META-INF/MANIFEST.MF is also considered.
Else, if -cp was specified, that is the classpath.
Else, if $CLASSPATH is set, that is the classpath.
Else, the current directory from which java has been launched is the classpath.
So, if I specify -cp src/A.jar:src/B.jar, then A.jar will be searched first, then B.jar
The JVM loads only the class that is found first, according to the order in which the directories/JARs are declared in the classpath. This is important if you use -cp or $CLASSPATH.
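To check which classpath entry actually supplied a class, you can ask the class for its code source. This is only a minimal sketch and not part of the original answer: the names WhereFrom and locationOf are made up for illustration, and it assumes a standard JVM where core classes loaded by the bootstrap loader report no code source.

```java
import java.security.CodeSource;

class WhereFrom {
    // Returns the JAR or directory a class was loaded from, or "<bootstrap>"
    // when the JVM reports no code source (typical for core classes).
    static String locationOf(Class<?> c) {
        CodeSource src = c.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "<bootstrap>";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println("String    <- " + locationOf(String.class));
        System.out.println("WhereFrom <- " + locationOf(WhereFrom.class));
    }
}
```

Running it with different -cp orderings should show the reported location change when the same class exists in more than one entry.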
In single thread scenarios and with recent JVMs, Vector and ArrayList should have similar performance (ArrayList should perform slightly better as it is not synchronized, but locking is fast currently when there is no contention, so the difference should be small). Anyway, Vector is obsolete: don't use it in new code.
I believe Java looks at the class path, per the "-cp" VM argument, and falls back to the current directory when no class path is given. You can put any combination of folders of classes (the root of the package tree, e.g. /project/bin for a class com.putable.MyClass) and JAR files (e.g. /project/lib/MyJar.jar) on the class path; individual .class files cannot be listed as entries. Locations are separated by either a colon (Unix-based OSes) or semicolon (Windows-based OSes). So anything on the classpath is fair game for Java to look at when obtaining class definitions. With respect to sequence, classes are loaded lazily, so they only get loaded when your application first requires them. If your application doesn't require a certain class during its runtime, then that class will NEVER get loaded.
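The platform-specific separator mentioned above is exposed as File.pathSeparator, so the classpath of the running JVM can be listed entry by entry. A small sketch, with the class and method names (ClassPathEntries, split) invented for the example:

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class ClassPathEntries {
    // Splits a classpath string on the platform separator
    // (':' on Unix-like systems, ';' on Windows).
    static List<String> split(String classPath) {
        return Arrays.asList(classPath.split(Pattern.quote(File.pathSeparator)));
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the JVM was started with.
        for (String entry : split(System.getProperty("java.class.path"))) {
            System.out.println(entry);
        }
    }
}
```

The entries print in search order, which is the order that matters when two of them contain the same class.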
If you don't put anything on the class path, I think Java will load from the class file and not the Jar. If you specify one or the other on the classpath, then that's where Java will look. If you put both on the classpath, Java searches the entries in the order given and loads the class from whichever entry comes first.
Depends on what you want to do. Vectors are actually always thread safe, per the Java API, so if you don't require concurrent access, the ArrayList will be faster. Vectors and ArrayLists are both backed by arrays, but they increase capacity at different rates (Vector capacity doubles whenever the end is reached and more space is needed, but ArrayList increases by 50%). Depending on how often you have to grow or shrink, the answer will vary. Check out this link for more info:
http://www.javaworld.com/javaworld/javaqa/2001-06/03-qa-0622-vector.html
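The doubling behaviour of Vector can be observed directly, because Vector exposes its capacity. The sketch below is only an illustration (VectorGrowth and capacityAfterOverflow are made-up names), and it assumes the default capacityIncrement of 0. ArrayList has no public capacity accessor, so its 50% growth cannot be checked the same way.

```java
import java.util.Vector;

class VectorGrowth {
    // Fills a Vector one element past its initial capacity and
    // returns the capacity after the forced growth.
    static int capacityAfterOverflow(int initialCapacity) {
        Vector<Integer> v = new Vector<>(initialCapacity);
        for (int i = 0; i <= initialCapacity; i++) {
            v.add(i);
        }
        return v.capacity();
    }

    public static void main(String[] args) {
        // Starting at capacity 10, the 11th element triggers growth.
        System.out.println(capacityAfterOverflow(10));
    }
}
```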
I'm more interested in JVM behaviour and not how class loader load class
Sorry, but this is nonsensical.
Because the answer is that the JVM creates a class loader and lets this class loader load the classes.
So, in order to understand the "JVM behaviour" you need to understand the class loader behaviour.
But maybe your question was: how does the JVM create the system class loader?
The accepted answer is already correct but there is a more detailed and updated official spec in How Classes are Found.
Some caveats, such as:
A class file has a subpath name that reflects the class's fully-qualified name. For example, if the class com.mypackage.MyClass is stored under /myclasses, then /myclasses must be in the user class path and the full path to the class file must be /myclasses/com/mypackage/MyClass.class. If the class is stored in an archive named myclasses.jar, then myclasses.jar must be in the user class path, and the class file must be stored in the archive as com/mypackage/MyClass.class.
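The mapping from a fully-qualified name to the path searched under each classpath entry is mechanical. A tiny sketch (ClassFilePath and relativePath are hypothetical names, and nested classes, whose binary names use '$', are not handled):

```java
class ClassFilePath {
    // Converts a fully-qualified class name into the relative path
    // that is looked up inside a directory or JAR on the class path.
    static String relativePath(String fullyQualifiedName) {
        return fullyQualifiedName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        System.out.println(relativePath("com.mypackage.MyClass"));
    }
}
```

With /myclasses on the user class path, the file is then expected at /myclasses/ followed by that relative path.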
And the priorities in How the Java Launcher Finds User Classes
The default value, ".", meaning that user class files are all the class files in the current directory (or under it, if in a package).
The value of the CLASSPATH environment variable, which overrides the default value.
The value of the -cp or -classpath command line option, which overrides both the default value and the CLASSPATH value.
The JAR archive specified by the -jar option, which overrides all other values. If this option is used, all user classes must come from the specified archive.
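The override rules in that list amount to a simple "most specific setting wins" lookup. The sketch below only models the decision and is not the launcher's real code: UserClassPath and resolve are invented names, null stands for "not supplied", and the Class-Path manifest attribute of a JAR is deliberately left out.

```java
class UserClassPath {
    // Returns the user class path the launcher would use.
    // Each argument is null when that setting was not supplied.
    static String resolve(String jarOption, String cpOption, String classpathEnv) {
        if (jarOption != null) {
            return jarOption;      // -jar overrides everything else
        }
        if (cpOption != null) {
            return cpOption;       // -cp / -classpath overrides CLASSPATH and the default
        }
        if (classpathEnv != null) {
            return classpathEnv;   // CLASSPATH overrides the default
        }
        return ".";                // default: the current directory
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, null, null));
        System.out.println(resolve(null, "lib/a.jar", "/opt/classes"));
        System.out.println(resolve("myapp.jar", "lib/a.jar", "/opt/classes"));
    }
}
```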
In a standard JVM I can re-order my classpath to "hide" similar classes (move the desired implementation of a class/interface to the front of the classpath). How can I achieve this behaviour in the internal database JVM (e.g. 11g)? Is this the order of loading the classes with "loadjava"?
When you use the loadjava utility to load a class, there is only a single path on the class-path.
If you load a class that has the same name and class path as a previous class then it will ignore the newer class unless you specify the -force option:
-force Forces files to be loaded, even if they match digest table entries.
In which case it will overwrite the earlier entry (it will not load two copies of the class).
So, no, you cannot hide duplicate classes by re-ordering the class-path (as you can with external JVMs) as there is only a single path on the class-path within the internal JVM used by Oracle.
The Java standard/system libraries (java.*, javax.*, etc.) are stored in lib/rt.jar inside each JRE distribution.
Say I have an application that I have compiled and jarred into myapp.jar. This JAR only contains my app's class files, and merely references system classes like System, File, Runtime, Thread, String, Boolean, etc.
So when I run my app, say via java -jar myapp.jar, the JVM is obviously doing some last-minute linking (or something) where it is executing the bytecode of my class files (inside myapp.jar) and then "jumping" into lib/rt.jar to run bytecode located there. I would imagine the process is the same if myapp.jar depends on other JARs provided on the runtime classpath.
My question is: what is this "linking" process called, and how does it essentially work?
That rt.jar is part of the bootstrap classpath, a parent of the usual classpath you already know and that you configure when you use the -cp option (you can actually change the bootstrap classpath too using the -Xbootclasspath option to load, for example, a custom Java runtime).
See Oracle documentation for a detailed description of how classes are searched/loaded from the system defined classpaths hierarchy.
Now, the additional questions you seemed to have:
How is the archive actually found?
It's simply hardcoded. If the java binary is located in <common_root>/bin/java, rt.jar will be searched in <common_root>/lib/rt.jar.
How is the "linking" performed?
On the JVM there is no static linking step as in C/C++; classes are loaded dynamically through a hierarchy of ClassLoader instances, the software components that actually load and parse class files. When you try to load a class, the request starts at the application-facing default classloader (or a child classloader if you have defined one). By default each classloader first delegates the request to its parent, up to the bootstrap classloader, and tries to load the class itself only if its parent could not.
If the class is found, the .class file is loaded and parsed, and internal structures representing the class and its data are created. Once the class is loaded, a new instance can be created.
If no classloader in the chain can load the class, a user-visible ClassNotFoundException is thrown.
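The loader hierarchy can be inspected at run time by walking the parent links. This is just a sketch (LoaderChain and chainFrom are made-up names). The bootstrap loader has no Java object of its own and shows up as null, which is why core classes such as String report a null class loader.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    // Collects a class loader and all of its parents, child first.
    // The bootstrap loader is represented by null and ends the walk.
    static List<ClassLoader> chainFrom(ClassLoader start) {
        List<ClassLoader> chain = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            chain.add(cl);
        }
        return chain;
    }

    public static void main(String[] args) {
        for (ClassLoader cl : chainFrom(ClassLoader.getSystemClassLoader())) {
            System.out.println(cl);
        }
        // Prints null: String is loaded by the bootstrap loader.
        System.out.println(String.class.getClassLoader());
    }
}
```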
When we refer to a class className in a jar, how does it know whether it's defined or not when there are no header files (like in C/C++)?
Java works with classloaders. Classes are needed for compilation, since it will perform static type checking to ensure that you are using the correct signatures of every method.
After compiling them, though, they are not linked like you have in a C/C++ compiler, so basically every .class file is standalone. Of course this means that you will have to provide the compiled classes used by your program when you are going to execute it. So it's a little bit different from how C and C++ prepare executables. You don't actually have a linking phase at all, it is not needed.
The classloader will dynamically load them by adding them to the runtime base used by the JVM.
Actually there are many classloaders that are used by the JVM that have different permissions and properties, you can also invoke it explicitly to ask for a class to be loaded. What happens can also be a sort of "lazy" loading in which the compiled .class code is loaded just when needed (and this loading process can throw a ClassNotFoundException if the asked class is not inside the classpath)
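Asking for a class explicitly, and the ClassNotFoundException you get when it is not on the classpath, can be sketched as follows. LazyLookup, isLoadable and com.example.DoesNotExist are names invented for the example; Class.forName is the standard API call.

```java
class LazyLookup {
    // Asks the current class loader for a class by name and reports
    // whether it could be found.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // Thrown when no loader in the chain can locate the class.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isLoadable("java.util.ArrayList"));
        System.out.println(isLoadable("com.example.DoesNotExist"));
    }
}
```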
When you run the Java compiler or your application itself, you can specify a classpath which lists all the jars and directories you're loading classes from. A jar just contains a bunch of class files; these files have enough metadata in them that no extra header files are necessary.
The classes in the jar file contain all the required information (class names, method signatures etc) so header files are not needed.
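That metadata is also visible at run time through reflection, which reads the same information the compiler uses in place of headers. A short sketch, with ClassMetadata and describe as hypothetical names:

```java
import java.lang.reflect.Method;

class ClassMetadata {
    // Looks up a public method by name and parameter types and returns
    // its full signature as recorded in the class file.
    static String describe(Class<?> type, String methodName, Class<?>... paramTypes) {
        try {
            Method m = type.getMethod(methodName, paramTypes);
            return m.toString();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such public method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class, "length"));
        System.out.println(describe(String.class, "charAt", int.class));
    }
}
```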
When you compile multiple classes javac is clever enough to compile dependencies automatically so the system still works.
It looks at the classpath and tries to load the class from there to get its definition.
Java files are compiled into class files which are java bytecode. These class files reside in a file structure where the top level is pointed to by the classpath variable. Compiling in C/C++ creates object files which can be linked into executable binaries. Java only compiles into bytecode files which are pulled in by the JVM at runtime. The following provide more explanation.
http://en.wikipedia.org/wiki/Java_bytecode
http://en.wikipedia.org/wiki/Java_compiler
http://en.wikipedia.org/wiki/Java_Virtual_Machine
Consider a scenario that a java program imports the classes from jar files. If the same class resides in two or more jar files there could be a problem.
In such scenarios, which class is imported by the program? Is it the class with the older timestamp?
What are the practices we can follow to avoid such complications.
Edit : This is an example. I have 2 jar files my1.jar and my2.jar. Both the files contain com.mycompany.CrazyWriter
By default, classes are loaded by the ClassLoader using the classpath which is searched in order.
If you have two implementations of the same class, the one the class loader finds first will be loaded.
If the classes are not actually the same class (same names but different methods), you'll get an exception when you try to use it.
You can load two classes with the same names in a single VM by using multiple class loaders. The OSGi framework can manage lots of the complexities for you, making sure the correct version is loaded, etc.
First, I assume that you mean that the same class resides in two or more jar files...
Now, answering your questions:
Which class is imported is dependent on your classloader and JVM. You cannot guarantee which class it will be, but in the normal classloader it will be the class from the first jar file on your classpath.
Don't put the same class into multiple jar files, or if you are trying to override system classes, use -Xbootclasspath.
Edit: To address one of the comments on this answer. I originally thought that sealing the jar would make a difference, since in theory it should not load two classes from the same package from different jar files. However, after some experimentation, I see that this assumption does not hold true, at least with the default security provider.
The ClassLoader is responsible for loading the Classes.
It scans the ClassPath and loads the class that it finds first.
If you have the same Jar twice on the ClassPath or if you have two Jars that contain two different versions of the same Class (that is com.packagename.Classname), the one that is found first is loaded.
Try to avoid having the same jar on the classpath twice.
Not sure what you meant by "the same class resides in two more classes"
if you meant inner/nested classes, there should be no problem since they are in different namespaces.
If you meant in two more JARs, as already answered, the order in the classpath is used.
How to avoid?
A package should be in only one JAR to avoid duplicated classes. If two classes have the same simple name, like java.util.Date and java.sql.Date, but are in different packages, they actually are different classes. You must use the fully qualified name, at least for one of the classes, to distinguish them.
If you have a problem finding out which version of a class is being used, then jwhich might be of use:
http://www.fullspan.com/proj/jwhich/index.html
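A similar check can be done with the standard API alone by asking the system class loader for every copy of a class file it can see. This is a sketch of the idea, not how jwhich itself is implemented; DuplicateFinder and occurrences are made-up names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

class DuplicateFinder {
    // Lists every location visible to the system class loader that
    // contains the class file for the given fully-qualified name.
    static List<URL> occurrences(String className) {
        String resource = className.replace('.', '/') + ".class";
        try {
            return Collections.list(ClassLoader.getSystemResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // More than one line of output means the class is duplicated.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        for (URL url : occurrences(name)) {
            System.out.println(url);
        }
    }
}
```

If com.mycompany.CrazyWriter is in both my1.jar and my2.jar, two URLs should be listed, in the same order in which the classpath is searched.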
If the same class resides in two more jars there should be a problem.
What do you mean exactly? Why should this be a problem?
In such scenarios what is the class that imported by the program? (Class with older timestamp??)
If a class exists in two JARs, the class will be loaded from the first JAR on the class path where it is found. Quoting Setting the class path (the quoted part applies to archive files too):
The order in which you specify multiple class path entries is important. The Java interpreter will look for classes in the directories in the order they appear in the class path variable. In the example above, the Java interpreter will first look for a needed class in the directory C:\java\MyClasses. Only if it doesn't find a class with the proper name in that directory will the interpreter look in the C:\java\OtherClasses directory.
In other words, if a specific order is required then just enumerate the JAR files explicitly in the class path. This is something commonly used by application server vendors: to patch specific class(es) of a product, you put a JAR (e.g. CR1234.jar) containing patched class(es) on the class path before the main JAR (say weblogic.jar).
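The patch-JAR trick works because lookup stops at the first entry that contains the class. The sketch below simulates that rule with plain collections instead of real JAR files. FirstMatch, providerOf and the class names under com.example are invented for the illustration; CR1234.jar and weblogic.jar are the names from the example above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class FirstMatch {
    // classPath maps each entry (in search order) to the class names it contains.
    // Returns the first entry that provides the class, or null if none does.
    static String providerOf(LinkedHashMap<String, Set<String>> classPath, String className) {
        for (Map.Entry<String, Set<String>> entry : classPath.entrySet()) {
            if (entry.getValue().contains(className)) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Set<String>> classPath = new LinkedHashMap<>();
        // The patch JAR is listed first, so its copy hides the one in the main JAR.
        classPath.put("CR1234.jar", new HashSet<>(Arrays.asList("com.example.Patched")));
        classPath.put("weblogic.jar", new HashSet<>(Arrays.asList("com.example.Patched", "com.example.Other")));

        System.out.println(providerOf(classPath, "com.example.Patched"));
        System.out.println(providerOf(classPath, "com.example.Other"));
    }
}
```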
What are the practices we can follow to avoid such complications.
Well, the obvious answer is don't do it (or only on purpose like in the sample given above).
I'm confused about how the Java interpreter and the Java compiler find all the jar files they need from environment variables. I have only set the path variable for the JDK directory, and I have not set any variable for the class libraries that the JVM requires. How can it find those important jar files?
Which jar files are you talking about? Java already knows about the jar files it "owns" (such as rt.jar) - you don't have to tell it about them explicitly. This is known as the bootclasspath - you can override it, but usually you don't want to.
For better understanding of how classes are found and loaded by JVM read How Classes are Found.
CLASSPATH is an environment variable that works like the PATH variable (which helps Windows find executables). It lists all the places the JVM looks for classes. You can also give the classpath on the command line when starting the JVM or the Java compiler.