Safe class imports from JAR Files - java

Consider a scenario in which a Java program imports classes from jar files. If the same class resides in two or more jar files, there could be a problem.
In such scenarios, which class is imported by the program? Is it the class
with the older timestamp?
What practices can we follow to avoid such complications?
Edit: Here is an example. I have two jar files, my1.jar and my2.jar. Both files contain com.mycompany.CrazyWriter.

By default, classes are loaded by the ClassLoader using the classpath which is searched in order.
If you have two implementations of the same class, the one the class loader finds first will be loaded.
If the classes are not actually the same class (same name but different methods), you'll get an error such as NoSuchMethodError at runtime when your code uses a method that the loaded version lacks.
You can load two classes with the same name in a single VM by using multiple class loaders. The OSGi framework can manage a lot of the complexity for you, making sure the correct version is loaded, etc.

First, I assume that you mean that the same class resides in two more jar files...
Now, answering your questions:
Which class is imported is dependent on your classloader and JVM. You cannot guarantee which class it will be, but in the normal classloader it will be the class from the first jar file on your classpath.
Don't put the same class into multiple jar files, or, if you are trying to override system classes, use -Xbootclasspath.
Edit: To address one of the comments on this answer. I originally thought that sealing the jar would make a difference, since in theory it should not load two classes from the same package from different jar files. However, after some experimentation, I see that this assumption does not hold true, at least with the default security provider.

The ClassLoader is responsible for loading classes.
It scans the class path and loads the first matching class it finds.
If you have the same jar twice on the class path, or if you have two jars that contain two different versions of the same class (that is, com.packagename.Classname), the one that is found first is loaded.
Try to avoid having the same jar on the class path twice.

Not sure what you meant by "the same class resides in two more classes".
If you meant inner/nested classes, there should be no problem since they are in different namespaces.
If you meant in two or more JARs, as already answered, the order in the classpath is used.
How to avoid?
A package should be in only one JAR to avoid duplicated classes. If two classes have the same simple name, like java.util.Date and java.sql.Date, but are in different packages, they are actually different classes. You must use the fully qualified name for at least one of the classes to distinguish them.
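As a minimal sketch of that last point (the class name DateNames and its two helper methods are made up for illustration), one Date class can be imported while the other is written out in full:

```java
import java.util.Date;

public class DateNames {
    // "Date" on its own means java.util.Date because of the import above.
    public static Date utilDate(long millis) {
        return new Date(millis);
    }

    // The other class is spelled out in full to avoid the name clash.
    public static java.sql.Date sqlDate(long millis) {
        return new java.sql.Date(millis);
    }

    public static void main(String[] args) {
        System.out.println(utilDate(0L).getClass().getName()); // java.util.Date
        System.out.println(sqlDate(0L).getClass().getName());  // java.sql.Date
    }
}
```

Both classes can be used in the same source file this way; only the imported one gets the short name.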

If you have a problem finding out which version of a class is being used, then jwhich might be of use:
http://www.fullspan.com/proj/jwhich/index.html
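If you'd rather not use a separate tool, a rough equivalent can be sketched with the standard ProtectionDomain/CodeSource API. The class name WhichJar and the placeholder string are assumptions for illustration; note that JDK classes loaded by the bootstrap loader usually report no code source at all.

```java
import java.security.CodeSource;

public class WhichJar {
    // Returns the jar or directory a class was loaded from, as a string.
    // Bootstrap classes (e.g. java.lang.String) usually have no CodeSource,
    // so a placeholder is returned for them.
    public static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Replace WhichJar.class with the class you are unsure about.
        System.out.println(locationOf(WhichJar.class));
        System.out.println(locationOf(String.class));
    }
}
```

Calling locationOf on the class in question from inside the running application shows which of the competing jars actually supplied it.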

If the same class resides in two more jars there should be a problem.
What do you mean exactly? Why should this be a problem?
In such scenarios what is the class that imported by the program? (Class with older timestamp??)
If a class exists in two JARs, the class will be loaded from the first JAR on the class path where it is found. Quoting Setting the class path (the quoted part applies to archive files too):
The order in which you specify multiple class path entries is important. The Java interpreter will look for classes in the directories in the order they appear in the class path variable. In the example above, the Java interpreter will first look for a needed class in the directory C:\java\MyClasses. Only if it doesn't find a class with the proper name in that directory will the interpreter look in the C:\java\OtherClasses directory.
In other words, if a specific order is required then just enumerate the JAR files explicitly in the class path. This is something commonly used by application server vendors: to patch specific class(es) of a product, you put a JAR (e.g. CR1234.jar) containing patched class(es) on the class path before the main JAR (say weblogic.jar).
What are the practices we can follow to avoid such complications.
Well, the obvious answer is don't do it (or only on purpose like in the sample given above).
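To check whether you are "doing it" by accident, one option is to ask the class loader for every location that offers a given class. This is only a sketch: DuplicateFinder is a made-up name, and it relies on the standard ClassLoader.getResources method applied to the system class loader, so it only sees the normal class path.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class DuplicateFinder {
    // Converts "com.mycompany.CrazyWriter" to "com/mycompany/CrazyWriter.class".
    public static String resourceName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Lists every class path location that contains the given class.
    public static List<URL> locationsOf(String className) {
        try {
            ClassLoader loader = ClassLoader.getSystemClassLoader();
            return Collections.list(loader.getResources(resourceName(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // With my1.jar and my2.jar both on the class path, two URLs are expected.
        for (URL url : locationsOf("com.mycompany.CrazyWriter")) {
            System.out.println(url);
        }
    }
}
```

More than one URL in the output means the class is duplicated, and the class path order then decides which copy is loaded.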

Related

Determine loader jar of a dynamically loaded jar

How do I determine the name of the jar that is dynamically loading my jar? Is it possible? I have attempted many variants using ClassLoader but with no success.
Thanks in advance.
Let me explain why I need the name of the "loader jar". In our container we have the following lines:
URLClassLoader classLoader = new URLClassLoader(new URL[] { artifact.getFile().toURI().toURL() });
Method method = URLClassLoader.class.getDeclaredMethod("addURL", URL.class);
method.setAccessible(true);
method.invoke(classLoader, artifact.getArtifact().getFile().toURI().toURL());
Class<?> processorClass = classLoader.loadClass(className);
Object processorClassInstance = processorClass.newInstance();
When the loaded class is instantiated (newInstance above), its properties file is external to the jar in which the class resides. The configuration files are in a directory named after the container jar that contains the class that executed the lines of code above. So, if the deployed container jar is called RedcapTDP.jar, the configuration files are in "C:...\RedcapTDP". When RedcapTDP.jar is deployed, it dynamically loads the configured Maven artifact, which in turn reads its configuration file from the RedcapTDP directory.
I hope that makes it clear!
How do I determine the name of the jar that is dynamically loading my jar? Is it possible?
Taking you literally, no, it is clearly impossible. In fact, it doesn't even make sense. Java does not "load" Jars at all, though it may load one or more classes contained within a jar. When it does so, it is the VM loading the class, not any jar loading it.
Interpreting you a bit more liberally, perhaps you are asking "how do I determine the class whose dependency on one of the classes in my jar is causing my class to be loaded, and how do I determine from which jar file that other class was loaded?" Unless your control extends beyond the classes in your jar, however, this is again impossible.
Class loading is a separate step preceding class initialization, and class initialization is the first point at which there is any opportunity to execute any code contained in your class. Thus, class loading is no longer ongoing when your classes first get a chance to inquire about anything. Moreover, classes are not necessarily loaded from jars at all, and in any event they do not reliably carry information about the source from which they were loaded (a class's CodeSource, where available, can be null).
I could perhaps go further afield with speculations about what you may mean to ask, but I don't see any interpretation of the question that affords an answer different from "no, it is not possible."
JARs don't load JARs. A jar file is just a container that helps hold class files and other resources together. Classes are loaded by class loaders from the class path provided to the JVM. What you're trying to do is not possible. Maybe you can explain what you're trying to achieve, and perhaps there is a better way to do it.

Referring different classes of same name from packages of same name

We have a process that needs to refer to two different encryption classes with the same name at different times.
Both classes have the same name and the same package, com.abc.security.encryption, but they are present in different jar files.
Let's say ENCRYPTION.class (new logic) is present in Jar A and ENCRYPTION.class (old logic) is present in Jar B.
Now in my process, when we call a Jar B API that refers to ENCRYPTION.class, it resolves to ENCRYPTION.class (new logic) in Jar A instead of ENCRYPTION.class (old logic) in Jar B.
Unless I delete Jar A, which contains ENCRYPTION.class (new logic), the ENCRYPTION.class (old logic) in Jar B is never used.
Since the two encryption implementations come from different utility modules that are used by many other modules, I am not able to ask their owners to change the package name.
I need a way to make sure each implementation is used in the required places without changing anything in those modules.
Can anything be changed in the class path of my process or in the code, so that calling the Jar B API uses ENCRYPTION.class (old logic) from Jar B itself, while referring to ENCRYPTION.class directly uses ENCRYPTION.class (new logic) from Jar A?
I tried adding "." as the first class path entry for the process, but it did not solve the issue.
Your help is most appreciated.
Thanks,
Nvn
You should remove the problematic jar from the classpath. A classpath with multiple jars that contain the same fully qualified class names is a recipe for disaster.
If that's not an option, you might be able to create a custom class loader which does this swapping. But it probably won't be easy. There's a similar question about this which might get you started if you go down this road: Unloading classes in java?
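For what it's worth, here is a minimal sketch of the class loader route, assuming Java 9 or later and assuming Jar B is kept off the normal class path. IsolatedLoaders and the file name jarB.jar are made up for illustration. Using the platform class loader as parent means the application class path (where Jar A lives) is never consulted by this loader.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;

public class IsolatedLoaders {
    // Builds a loader that sees only the given jars plus the JDK itself.
    public static URLClassLoader isolatedLoader(Path... jars) {
        URL[] urls = new URL[jars.length];
        for (int i = 0; i < jars.length; i++) {
            try {
                urls[i] = jars[i].toUri().toURL();
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException(e);
            }
        }
        // Parent is the platform loader, not the application loader.
        return new URLClassLoader(urls, ClassLoader.getPlatformClassLoader());
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical path; point this at the real Jar B.
        try (URLClassLoader loaderB = isolatedLoader(Paths.get("jarB.jar"))) {
            Class<?> oldLogic = loaderB.loadClass("com.abc.security.encryption.ENCRYPTION");
            System.out.println(oldLogic.getClassLoader());
        }
    }
}
```

The catch is that the Jar B API classes must themselves be loaded through this isolated loader and then called via reflection or a shared interface. Code loaded by the normal application loader will still resolve ENCRYPTION from the ordinary class path.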

Classes or Jar, who wins?

Is it right that, in case of duplicate classes, the one in "classes" will be taken?
Consider a web application in which class A is available both directly in WEB-INF/classes and as part of a jar in WEB-INF/lib.
Will the one in classes always win?
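This depends on the class path: the classpath entry that comes first will initially win. Also, if there are multiple class loaders, the class loader that loaded the dependent class will be tried first. In general it does not depend on whether the classes are in a jar or not. For web applications specifically, though, the Servlet specification requires the web application class loader to load classes from WEB-INF/classes before the jars in WEB-INF/lib. Documented here.
In general, do not do this, as having multiple classes under the same name is a true recipe for disaster.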

Public class outside a jar file containing multiple packages

So, I have a Java project containing several packages (like com.myapp.a , com.myapp.b, com.myapp.c) for better readability and I want to build a jar to use as a library in another project.
But I just want to expose only some classes and interfaces from this jar. The problem is that if I don't declare these classes public then they can't be seen inside the jar file between the packages (for example I have a class A in com.myapp.a package that is used in com.myapp.b package).
So how can I expose just what I want outside of the jar when I have multiple packages defined inside?
Plain Java (before Java 9) does not address this problem directly.
OSGi addresses this problem by explicitly defining the exported package list.
The Java Platform Module System, originally hoped for in Java 8 and delivered in Java 9, addresses it as well by letting a module export only selected packages.
So one option is to use OSGi, but this option does not work if the jar file is used directly rather than as an OSGi bundle.
Another option is to use code obfuscation (like Proguard), to obfuscate the packages you do not want to expose.
Eclipse "solved" this problem by making all classes available, but classes that were not intended to be used by clients were placed in packages whose name contains "internal". For example, that might mean that you have packages named "com.myapp.b" and "com.myapp.internal.b". It's made clear to users of the classes that internal classes are not guaranteed to be upwardly compatible or even present in later releases.
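On Java 9 or later, the module route could look roughly like the descriptor below. This is a sketch only: the module name com.myapp is an assumption, and which package gets exported is up to you (com.myapp.a is picked here purely as an example).

```java
// module-info.java, placed at the root of the library's source tree.
// The module name com.myapp is an assumption for illustration.
module com.myapp {
    // Only this package is visible to code outside the module.
    exports com.myapp.a;

    // com.myapp.b and com.myapp.c are not exported: their public classes
    // stay usable across packages inside the module, but not from outside.
}
```

Keep in mind that this is only enforced when the jar is placed on the module path. If a client puts the jar on the plain class path, the descriptor is ignored and all public classes are visible again.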

How is import done in Java?

For instance,
import org.apache.nutch.plugin.Extension;
Although I've used it many times, I don't have much idea what it essentially does.
EDIT: Is org.apache.nutch.plugin essentially 4 directories or fewer than 4 like a directory named org.apache?
I think the question you might be trying to ask is, "What are packages in Java, and how does the import keyword relate to them?". Your confusion about directory structures might stem from the fact that some other languages have include directives that use file names to literally include the contents of the specified file in your source code at compile time. C/C++ are examples of languages that use this type of include directive. Java's import keyword does not work this way. As others have said, the import keyword is simply a shorthand way to reference one or more classes in a package. The real work is done by the Java Virtual Machine's class loader (details below).
Let's start with the definition of a "Java package", as described in the Wikipedia article:
A Java package is a mechanism for organizing Java classes into namespaces similar to the modules of Modula. Java packages can be stored in compressed files called JAR files, allowing classes to download faster as a group rather than one at a time. Programmers also typically use packages to organize classes belonging to the same category or providing similar functionality.
In Java, source code files for classes are in fact organized by directories, but the method by which the Java Virtual Machine (JVM) locates the classes is different from languages like C/C++.
Suppose in your source code you have a package named "com.foo.bar", and within that package you have a class named "MyClass". At compile time, the location of that class's source code in the file system must be {source}/com/foo/bar/MyClass.java, where {source} is the root of the source tree you are compiling.
One difference between Java and languages like C/C++ is the concept of a class loader. In fact, the concept of a class loader is a key part of the Java Virtual Machine's architecture. The job of the class loader is to locate and load any class files your program needs. The "primordial" or "default" Java class loader is usually provided by the JVM. It is a regular class of type ClassLoader, and contains a method called loadClass() with the following definition:
// Loads the class with the specified name.
// Example: loadClass("org.apache.nutch.plugin.Extension")
public Class<?> loadClass(String name) throws ClassNotFoundException
This loadClass() method will attempt to locate the class file for the class with given name, and it produces a Class object which has a newInstance() method capable of instantiating the class.
Where does the class loader search for the class file? In the JVM's class path. The class path is simply a list of locations where class files can be found. These locations can be directories containing class files. The class path can even contain jar files, which in turn contain more class files. The default class loader is capable of looking inside these jar files to search for class files. As a side note, you could implement your own class loader to, for example, allow network locations (or any other location) to be searched for class files.
So, now we know that whether or not "com.foo.bar.MyClass" is in a class file in your own source tree or a class file inside a jar file somewhere in your class path, the class loader will find it for you, if it exists. If it does not exist, you will get a ClassNotFoundException.
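To make that concrete, here is a small sketch that loads a class by name through the system class loader and instantiates it. LoadByName is a made-up class name, and wrapping the failure in IllegalStateException is just one possible choice for the example.

```java
public class LoadByName {
    // Asks the system class loader for a class by its fully qualified name
    // and creates an instance through its no-argument constructor.
    public static Object instantiate(String className) {
        try {
            Class<?> type = ClassLoader.getSystemClassLoader().loadClass(className);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // ClassNotFoundException ends up here when the class is not on the class path.
            throw new IllegalStateException("Could not load " + className, e);
        }
    }

    public static void main(String[] args) {
        Object list = instantiate("java.util.ArrayList");
        System.out.println(list.getClass().getName()); // java.util.ArrayList
    }
}
```

Passing a name that is not on the class path, such as com.foo.bar.MyClass when no such class exists, fails with a ClassNotFoundException as the cause.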
And now to address the import keyword: I will reference the following example:
import com.foo.bar.MyClass;
...
public void someFunction() {
    MyClass obj1 = new MyClass();
    org.blah.MyClass obj2 = new org.blah.MyClass("some string argument");
}
The first line is simply a way to tell the compiler "Whenever you see a variable declared simply as type MyClass, assume I mean com.foo.bar.MyClass". That is what's happening in the case of obj1. In the case of obj2, you are explicitly telling the compiler "I don't want the class com.foo.bar.MyClass, I actually want org.blah.MyClass". So the import keyword is just a simple way of cutting down on the amount of typing programmers have to do in order to use other classes. All of the interesting stuff is done in the JVM's class loader.
For more information about exactly what the class loader does, I recommend reading an article called The Basics of Java Class Loaders.
All it's doing is saving you typing. Instead of having to type "org.apache.nutch.plugin.Extension" every time you want to use it, the import allows you to refer to it by its short name, "Extension".
Don't be confused by the word "import": it's not loading the .class file or anything like that. The class loader will search for it on the CLASSPATH and load it into the JVM's class metadata area (perm gen on older JVMs, metaspace since Java 8) the first time your code requires it.
UPDATE: As a developer you have to know that packages are associated with directories. If you create a package "com.foo.bar.baz" in your .java file, it'll have to be stored in a directory com/foo/bar/baz.
But when you download a JAR file, like that Apache Nutch library, there are no directories involved from your point of view. The person who created the JAR had to zip up the proper directory structure, which you can see as the path to the .class file if you open the JAR using WinZip. You just have to put that JAR in the CLASSPATH for your app when you compile and run.
Imports are just hints to the compiler telling it how to figure out the full name of classes.
So if you have "import java.util.*;" and in your code you are doing something like "new ArrayList()", when the compiler processes this expression it first needs to find the fully qualified name of the type ArrayList. It does so by going through the list of imports and appending ArrayList to each import. Specifically, when it appends ArrayList to java.util it gets the FQN java.util.ArrayList. It then looks up this FQN in its class path. If it finds a class with such a name, then it knows that java.util.ArrayList is the correct name.
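The lookup described above can be imitated at runtime with a few lines of code. This is only a rough simulation under stated assumptions: ImportResolver is a made-up name, and the real compiler works on its own symbol tables rather than Class.forName and also reports an error when a name is ambiguous, which this sketch does not.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class ImportResolver {
    // Tries each imported package in turn and returns the first fully
    // qualified name for which a class actually exists.
    public static Optional<String> resolve(String simpleName, List<String> packages) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        for (String pkg : packages) {
            String candidate = pkg + "." + simpleName;
            try {
                Class.forName(candidate, false, loader); // false: do not initialize the class
                return Optional.of(candidate);
            } catch (ClassNotFoundException e) {
                // Not in this package; try the next import.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> imports = Arrays.asList("java.io", "java.util");
        System.out.println(resolve("ArrayList", imports).orElse("not found")); // java.util.ArrayList
    }
}
```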
is "org.apache.nutch.plugin" essentially 4 directories?
If you have a class whose name is org.apache.nutch.plugin.Extension, then it is stored somewhere in the classpath as a file org/apache/nutch/plugin/Extension.class. So the root directory contains four nested subdirectories ("org", "apache", "nutch", "plugin") which in turn contain the class file.
import org.apache.nutch.plugin.Extension is a compilation time shortcut that allows you to refer to the Extension class without using the class' fully qualified name. It has no meaning at runtime, it's only a compilation time trick to save typing.
By convention the .class file for this class will be located in folder org/apache/nutch/plugin either in the file system or in a jar file, either of which need to be in your classpath, both at compile time and runtime. If the .class file is in a jar file then that jar file needs to be in your classpath. If the .class file is in a folder, then the folder that is the parent of folder "org" needs to be in your classpath. For example, if the class was located in folder c:\myproject\bin\org\apache\nutch\plugin then folder c:\myproject\bin would need to be part of the classpath.
If you're interested in finding out where the class was loaded from when you run your program, use the -verbose:class java command line option. It should tell you in which folder or jar file the JVM found the class.
Basically when you make a class you can declare it to be part of a package. I personally don't have much experience with doing packages. However, afaik, that basically means that you are importing the Extension class from the org.apache.nutch.plugin package.
Building off of Thomas' answer, org.apache.nutch.plugin is a path to the class file(s) you want to import. I'm not sure about this particular package, but generally you'll have a .jar file that you add to your classpath, and your import statement points to the directory "./[classpath]/[jarfile]/org/apache/nutch/plugin".
You can't have a directory named org.apache as a package. The compiler won't understand that name and will look for the directory structure org/apache when you import any class from that package.
Also, do not confuse the Java import statement with the C #include preprocessor directive. The import statement is, as others have said, a shorthand that lets you type fewer characters when referring to a class name.
