Java Thread context classloader - how does it work?

I found the following class loader structure for OSGi on the internet:
bootstrap classloader (Java standard libraries from jre/lib/rt.jar etc)
   ^
extension classloader
   ^
system classloader (i.e. stuff on $CLASSPATH, OSGi core code)
   ^ (** limited access to types from parent classloader)
common OSGi classloader:
--|-- OSGi classloader for bundle1 -> (map of imported-package->classloader)
--|-- OSGi classloader for bundle2 -> (map of imported-package->classloader)
--|-- OSGi classloader for bundle3 -> (map of imported-package->classloader)
Here it says that
A context classloader set on the executing thread. By default it is
always set to System classloader or from the thread from where the new
thread instance was created.
From the structure above, the system class loader (which is also the context class loader) sits above the bundle class loaders, and as far as I know a parent class loader never asks its children.
So my question is: how does the current thread work with classes that live in the current bundle?

In OSGi the Thread Context ClassLoader (TCCL) is simply undefined. You cannot expect or assert that it will be anything in particular. In fact, a lot of the time it will be null.
TCCL is a hack that was added in Java 1.2 to support J2EE. Specifically it was needed to support things like Entity Beans; in a modern world it's used to support technologies like JPA, JAXB, Hibernate and so on.
The issue with parent delegation is that, while the application classes at the bottom have visibility of all classes in the parent classloaders, unfortunately the classes loaded by the parent classloaders do not have visibility of the application classes. In practical terms, this means that your application code can load (say) the classes that make up Hibernate, but Hibernate would not be able to load your domain classes because they are below it in the hierarchy.
So, TCCL was invented. In a J2EE application server, the TCCL is created as a thread-local variable, and it has visibility of all your application classes. Hibernate/JPA/JAXB etc can consult the TCCL in order to find the application classes. This was easy enough to do in J2EE because the app server controls all of the entry points: it controls the web server, it controls the RMI endpoints, and as an application developer you are not permitted to create your own threads.
However the programming environment for OSGi is far less constrained. Any bundle is permitted to create its own network endpoints, spin up its own threads, or pretty much do anything. Therefore, OSGi has no opportunity to intervene and impose a TCCL that has visibility of the application classes. Furthermore, the very concept of an "application" is fuzzy because we have this neat thing called modularity. An application consists of multiple bundles... but how to define which bundles may provide classes to the TCCL?
So OSGi basically punts on this issue. The TCCL is undefined, so you should never rely on it. Fortunately, most libraries that try to use it only do so as one of a series of places they try to load classes from.
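To make that last point concrete, here is a minimal sketch of the kind of lookup such a library might perform: try the TCCL if there is one, then fall back to the library's own loader and the system loader. The class name FallbackLoading and the method find are made up for illustration; real libraries differ in the order and number of loaders they consult.

```java
import java.util.Optional;

final class FallbackLoading {

    /** Tries several class loaders in turn; the TCCL is only the first candidate and may be null. */
    static Optional<Class<?>> find(String className) {
        ClassLoader[] candidates = {
            Thread.currentThread().getContextClassLoader(), // undefined in OSGi, often null
            FallbackLoading.class.getClassLoader(),         // the loader that defined this "library"
            ClassLoader.getSystemClassLoader()              // last resort
        };
        for (ClassLoader candidate : candidates) {
            if (candidate == null) {
                continue; // nothing to ask, move on to the next candidate
            }
            try {
                // 'false' means: locate the class but do not run its static initializers yet
                return Optional.<Class<?>>of(Class.forName(className, false, candidate));
            } catch (ClassNotFoundException | LinkageError e) {
                // this loader cannot supply the class; try the next one
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(find("java.util.ArrayList").isPresent());
        System.out.println(find("no.such.Type").isPresent());
    }
}
```

Because the TCCL is just one candidate, the lookup still works when the TCCL is null or points at a loader that cannot see the requested class.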

Related

Significance of class loader in loading resource bundle

I am having difficulty in understanding the significance of classLoader in ResourceBundle.getBundle(bundleName, locale, classLoader) API .
What could be the practical scenario where someone would want to provide custom loader for this API?

A Java application might have multiple class loaders. For example, a J2EE application running on Tomcat or Glassfish has multiple tiers of classloaders - some belonging to the J2EE server itself, some being specifically made for your webapp (otherwise your webapp would be able to access classes belonging to other webapps) and even custom classloaders that you might have instantiated yourself.
Standalone Java apps might also have multiple classloaders. For example, if your application supports plugins and each of these plugins is contained in its own JAR file (local or remote) then in order to load the plugin's classes at runtime you would have to create your own classloaders to do so.
Therefore, when you load a ResourceBundle you have to select the appropriate classloader to ensure that the resource is loaded from the correct source. Here's a simple example... imagine that your application contains a /version.properties file and your JVM also has a similar, yet different, /version.properties (e.g. IBM's Java has this properties file). Trying to load this resource file using the system's default classloader returns the version.properties that is included in the JVM and in order to load your own version of this file, you must use a custom classloader or one whose context is specific to your app.
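As a rough illustration of the plugin scenario, the sketch below writes a properties file into a temporary directory (standing in for a plugin location that is not on the application classpath) and loads it through a dedicated URLClassLoader. The bundle name plugin_messages, the key greeting and the class name PluginBundleDemo are invented for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.ResourceBundle;

final class PluginBundleDemo {

    /** Loads a bundle that only the plugin's class loader can see. */
    static String greetingFromPlugin() {
        try {
            // Hypothetical plugin directory; in a real application this would be a plugin JAR or folder.
            Path pluginDir = Files.createTempDirectory("plugin");
            Files.write(pluginDir.resolve("plugin_messages.properties"),
                    "greeting=hello from the plugin\n".getBytes(StandardCharsets.ISO_8859_1));

            try (URLClassLoader pluginLoader =
                         new URLClassLoader(new URL[] { pluginDir.toUri().toURL() })) {
                // The third argument decides where the bundle is searched for.
                ResourceBundle bundle =
                        ResourceBundle.getBundle("plugin_messages", Locale.ROOT, pluginLoader);
                return bundle.getString("greeting");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(greetingFromPlugin());
    }
}
```

Calling ResourceBundle.getBundle("plugin_messages") without the loader argument would search from the caller's class loader instead, which cannot see the temporary directory, so the lookup would fail with a MissingResourceException.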
There is an old but still very interesting explanation of how class loaders work and how hierarchies and loading contexts are useful in practice. For more info, check Internals of Java Class Loading.

Java EE and Java SE classloading

The difference that I read on the Internet between Java EE and Java SE classloading is that
In Java SE, a classloader delegates the classloading to its parent
classloader and then tries to load the class itself
However, In Java EE, a classloader first tries to load the class itself and then
delegate the classloading of that class to its parent classloader.
Kindly validate my understanding.
Also, why is it designed like that in Java EE (Any advantages of keeping it like this.)
This is the link where I heard this [http://www.youtube.com/watch?v=t8sQw3pGJzM]

Alright then,
A common application has 3 standard classloaders:
Bootstrap Classloader
Extensions Classloader
System-Classpath Classloader
So far, so good. Now, this works for a single application running alone and free.
But what happens when you say J2EE? You have multiple applications running on the same place, so you have to figure out a way to prevent them from stumbling on each other. That's where these extra classloaders come into play.
Think about a server instance. There's a JBoss with two deployed EARs. What would happen if there were to be conflicting classes between applications? They're ok on their own particular context but as a whole they're inconsistent.
These extra classloaders are introduced on a per-application basis to ensure isolation between applications. Classloaders below the System-Classpath Classloader recognize a class only if it is specified in the manifest file for one of their children.
In J2SE, the three basic classloaders work in a parent-child relationship based on three principles:
Delegation: If a class is not already loaded (cached), the request is delegated to the parent. This goes on up to the top of the hierarchy (the Bootstrap classloader), which loads the basic J2SE classes (e.g. Integer, ArrayList, amongst others). This is what you reference in your question: a classloader delegates the loading up to the top of the hierarchy, then each classloader tries to load the class itself if its parent couldn't find it, until one of them loads it. Otherwise: ClassNotFoundException.
Visibility: Classes loaded by a parent classloader are visible to its children, not the other way around.
Uniqueness: If a parent classloader loads a class, a child will never reload it.
In Java SE, a classloader delegates the classloading to its parent classloader and then tries to load the class itself.
True, due to the principles explained above.
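The parent chain that delegation walks can be inspected directly. This is only a small sketch (the class name LoaderChain is made up); the loader class names it prints depend on the Java version and on how the program is launched.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChain {

    /** Lists the loader that defined the given type, then each parent up to the bootstrap loader. */
    static List<String> chainOf(Class<?> type) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = type.getClassLoader(); cl != null; cl = cl.getParent()) {
            chain.add(cl.getClass().getName());
        }
        chain.add("bootstrap"); // the bootstrap loader is represented as null in the API
        return chain;
    }

    public static void main(String[] args) {
        System.out.println(chainOf(String.class));      // a core class: defined by the bootstrap loader
        System.out.println(chainOf(LoaderChain.class)); // an application class: a longer chain
    }
}
```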
There's no determined classloader structure in J2EE (a vendor has "poetic license" to implement it), but they kind of follow a hierarchy. In this case, the System-classpath classloader loads the main application: The server. The server libraries (its classes, more specifically) are available, then, to every application due to the visibility principle.
Down there, the applications have particular classloader structures, but as a whole they are different children of the System-classpath classloader. Each application loads its related and particular classes (both application and libraries).
The loading here is not propagated to the parents outside the application context. Why? because if the System-classpath classloader were to load the applications as usual, the class of every application would be visible to others due to the visibility principle, completely breaking the isolation between themselves. So:
However, In Java EE, a classloader first tries to load the class itself and then delegate the classloading of that class to its parent classloader.
This is partly true, but I'd rather limit this statement to the context of an application and leave out the core Java classes, which are indeed loaded by the top-level classloaders.
Long story short: It's not a straightforward process but I wouldn't go as far as to say J2EE handles the classloading the opposite way around of J2SE.

I think Java EE class loading standard will help you on your way. As far as I know there is no mandated way of classloading for standard Java. For WebApps (WARs) however, it is specified that the classloading is parent-last.
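For reference, a parent-last (child-first) loader can be sketched in a few lines on top of URLClassLoader. This is an illustrative sketch only, not the implementation used by any particular container; the class name ChildFirstClassLoader is made up, and real servers add more rules (for example, lists of packages that must always come from the parent).

```java
import java.net.URL;
import java.net.URLClassLoader;

final class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name); // already defined by this loader?
            if (c == null && !name.startsWith("java.")) {
                try {
                    c = findClass(name); // look in our own URLs first
                } catch (ClassNotFoundException e) {
                    // not in our URLs; fall back to normal delegation below
                }
            }
            if (c == null) {
                c = super.loadClass(name, false); // standard parent-first lookup
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        // With no URLs of its own, everything falls through to the parent.
        try (ChildFirstClassLoader loader =
                     new ChildFirstClassLoader(new URL[0], ClassLoader.getSystemClassLoader())) {
            System.out.println(loader.loadClass("java.util.ArrayList") == java.util.ArrayList.class);
        }
    }
}
```

Classes in java.* are deliberately left to the parent: a custom loader is not allowed to define them, and they must stay unique across the JVM.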

Would it be possible having Spring libraries in common/shared context?

We have a portal application with one main web app context and many minor web app contexts (plugins). Currently (very simplified) the main one has its own Spring libraries, and the plugins would need their own copies too if they wanted to use Spring. The common/shared Tomcat context contains just drivers and interfaces.
Would it work if the Spring libraries were moved to the common context, given the other libraries that Spring might indirectly use or that might use Spring? Hibernate, for example, because the apps use spring-tx etc. Would Hibernate have to move to the common/shared context too?
What do you think, and what other aspects should we consider? From the Spring application context point of view it would be much easier this way.

@RichW is correct in stating that placing Spring libraries in Tomcat's common classloader is bad practice. And there's a good chance it won't work.
Java uses a classloader hierarchy. When a class load is requested, the classloader will recursively request the class from its parent classloader before attempting to load the class using its own classpath. This process continues up to the root classloader (known as the bootstrap classloader). In this way, classes referenced from a parent classloader always get priority over classes referenced in classloaders further down the hierarchy.
It's important to note that in this process classes are never loaded from a child classloader. Therefore any classes required by Spring would also need to be loaded into the common classloader, including asm, log4j, commons-logging and cglib (all of which Spring depends on). This will lead to a whole host of problems: in particular, including commons-logging in the common classpath is a whole world of hurt.
If you actually managed to get Tomcat started, then you would experience problems with memory leaks when recycling applications. In Tomcat, applications are unloaded using conventional garbage collection, so if anything holds a reference to a class inside an application which has subsequently been restarted, that application will not be garbage collected. Spring and logging frameworks are prime candidates for holding references to classes, so you will probably suffer from OOM errors after a few application restarts.
The only way to do this safely would be to consider using a full blown application server (such as JBoss AS) and deploy your application as an EAR.

If you were able to move from Tomcat to a full-blown Java EE container then an option would be to package everything as an EAR using the Bundled Optional Classes mechanism.
You'd then move the common JARs out of the WARs & into the top level of the EAR.

Yes, I know it's tempting. Yes, it can work. But putting application-specific or framework-specific libraries in the shared libraries folder of an app server is considered by some to be a bad practice, and I agree.
In my opinion web-apps should contain their own dependencies (app jars, framework jars, etc.). Frameworks also have dependencies, often requiring multiple jars with particular versions. Sometimes these versions change, sometimes the dependencies change. Over time that shared library folder will become a kitchen sink for jars, and that will affect all your apps, perhaps in unpredictable ways.
Going the shared library folder route you gain some slight initial convenience, but what you lose is choice: the choice to only affect one web-app at a time. I recommend you keep your jars within your web-app, nicely contained and separate from the other web-apps. It will make them more reliable and you'll find framework upgrades easier to handle. You'll be happier in the long run, I promise you.

Tomcat classloader violates delegating policy

Question1:
As we know, when a classloader is about to load a class, it delegates the request to its parent classloader. However in Tomcat, it doesn’t: you could load your class to overwrite the same name class which is put in common lib directory. This means Tomcat WebappClassloader doesn’t follow delegating policy. Is it violation of convention?
Question2:
I wrote a class and put it in common lib directory, obviously the class is shared among web apps. For instance, every web app can read/write the static field of the class. Further, classes in JDK are loaded by Bootstrap classloader, then their static fields are shared by any web apps, is it dangerous?

This behavior is intentional and it allows you to override libraries provided in the Tomcat itself independently in every WAR. For instance you can override Log4J with different version per each application deployed to the container without introducing any issues or breaking other applications. From Tomcat documentation:
Like many server applications, Tomcat installs a variety of class loaders [...] to allow different portions of the container, and the web applications running on the container, to have access to different repositories of available classes and resources. This mechanism is used to provide the functionality defined in the Servlet Specification, version 2.4 — in particular, Sections 9.4 and 9.6.
It does violate the normal delegation algorithm, but this is how other application servers work as well (JBoss, for instance).
Regarding question 2: yes, it is dangerous. You have to remember about synchronization, and you have no control over who modifies this variable. I would avoid static fields altogether.
For instance, EhCache allows you to share a CacheManager. This is implemented via the net.sf.ehcache.CacheManager#singleton static volatile field. Now you get all sorts of problems: if you put ehcache.jar in Tomcat's /lib, it will work as expected. However, if each web application has its own copy of the JAR file, sharing will not work because each web app has its own copy of the CacheManager class. It gets even worse when only one application has its own ehcache.jar: all applications will share the same instance of CacheManager except the one that packages ehcache.jar itself. Such errors are very hard to track down...
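The effect is easy to reproduce outside a container. The sketch below is an assumption-laden stand-in: it compiles a tiny class named Counter (invented here, playing the role of the shared library class) into a temporary directory, then loads it once through two independent loaders (each "web app" has its own copy) and once through a common parent (the class sits in the shared lib directory). It needs a JDK at runtime because it invokes the compiler.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class StaticsPerLoader {

    /** Compiles a class with a single static field into a fresh temp directory. */
    static URL[] compileCounter() throws IOException {
        Path dir = Files.createTempDirectory("shared-lib");
        Path source = dir.resolve("Counter.java");
        Files.write(source,
                "public class Counter { public static int value; }".getBytes(StandardCharsets.UTF_8));
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("a JDK (not just a JRE) is required");
        }
        int status = compiler.run(null, null, null,
                "-proc:none", "-d", dir.toString(), source.toString());
        if (status != 0) {
            throw new IllegalStateException("compilation failed");
        }
        return new URL[] { dir.toUri().toURL() };
    }

    static void write(ClassLoader loader, int v) throws ReflectiveOperationException {
        loader.loadClass("Counter").getField("value").setInt(null, v);
    }

    static int read(ClassLoader loader) throws ReflectiveOperationException {
        return loader.loadClass("Counter").getField("value").getInt(null);
    }

    /** Returns { value app B sees with separate copies, value app B sees with a shared parent }. */
    static int[] demo() {
        try {
            URL[] lib = compileCounter();
            int separate;
            int shared;
            // Case 1: each "web app" loads its own copy of the class.
            try (URLClassLoader appA = new URLClassLoader(lib, null);
                 URLClassLoader appB = new URLClassLoader(lib, null)) {
                write(appA, 42);
                separate = read(appB); // B has its own Counter class, so its field is untouched
            }
            // Case 2: the class lives in a common parent loader.
            try (URLClassLoader common = new URLClassLoader(lib, null);
                 URLClassLoader appA = new URLClassLoader(new URL[0], common);
                 URLClassLoader appB = new URLClassLoader(new URL[0], common)) {
                write(appA, 42);
                shared = read(appB); // both apps resolve to the same Counter class
            }
            return new int[] { separate, shared };
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("separate copies: " + result[0] + ", shared parent: " + result[1]);
    }
}
```

A static field belongs to a class, and a class is identified by its name plus its defining loader, which is why the write made through appA is invisible to appB in the first case and visible in the second.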

Why is the setContextClassLoader() method placed on Thread?

Why is the setContextClassLoader() method placed on Thread?
Do different threads have different class loaders?
Suppose I extend ClassLoader and load some new classes with my custom class loader.
Now I want it to be the context class loader, so I call Thread.currentThread().setContextClassLoader(loader).
Are these new classes available only in the context of the current thread? Or how does it work?
Thanks

The Context class loader is the class loader that the thread will use to find classes. You primarily care about this when you are writing an application server or something similar. The idea is that you can start a thread from a class loaded in the application server's class loader, and yet pass it a child class loader that handles loading the classes of the deployed application.
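A minimal sketch of that idea, with made-up names (PerThreadContextLoader, appLoader): the worker thread is handed a child loader before it starts, while the thread that created it keeps its own context loader. An anonymous ClassLoader subclass stands in for the deployed application's loader.

```java
import java.util.concurrent.atomic.AtomicReference;

final class PerThreadContextLoader {

    /** Returns true if the worker saw the loader it was given and the caller's loader is unchanged. */
    static boolean workerSeesOnlyItsOwnLoader() {
        ClassLoader callerLoader = Thread.currentThread().getContextClassLoader();
        // Stand-in for the class loader of a deployed application (hypothetical).
        ClassLoader appLoader = new ClassLoader(callerLoader) { };

        AtomicReference<ClassLoader> seenByWorker = new AtomicReference<>();
        Thread worker = new Thread(
                () -> seenByWorker.set(Thread.currentThread().getContextClassLoader()));
        worker.setContextClassLoader(appLoader); // affects this one thread only
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seenByWorker.get() == appLoader
                && Thread.currentThread().getContextClassLoader() == callerLoader;
    }

    public static void main(String[] args) {
        System.out.println(workerSeesOnlyItsOwnLoader());
    }
}
```

Note that the context loader is only a reference stored on the thread. Code has to ask for it explicitly via Thread.getContextClassLoader(); ordinary class resolution in your own code still goes through the loader of the class doing the referencing.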

The thread context class loader is a little bit of a hack.
When you load a class with reflection, you use either an explicit class loader or the one of the immediate calling class. When you link to a class using normal Java code, the class requesting the linking is used as the source for the loader.
Thread.setContextClassLoader is used to set the class loader for Thread.getContextClassLoader. This is used by random APIs (notably through ServiceLoader) to pick up classes through reflection so that you can change implementation. Having implementations change out from under your code depending upon which thread it is running on at a crucial moment is a bad idea.

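Thread.setContextClassLoader is used to set the thread's contextClassLoader. If it is not set manually, a thread inherits the context class loader of the thread that created it; for the initial thread this is the system class loader, which in Java 8 and earlier is Launcher.AppClassLoader, as can be verified by checking the source code of Launcher.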
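The default can be observed with a short experiment. This sketch (the class name InheritedContextLoader is made up) temporarily changes the context loader of the current thread and checks what a newly created thread sees without anything being set on it.

```java
import java.util.concurrent.atomic.AtomicReference;

final class InheritedContextLoader {

    /** Returns true if a new thread picks up the context loader of the thread that created it. */
    static boolean childInheritsFromCreator() {
        Thread creator = Thread.currentThread();
        ClassLoader original = creator.getContextClassLoader();
        ClassLoader custom = new ClassLoader(original) { }; // placeholder loader for the experiment
        creator.setContextClassLoader(custom);
        try {
            AtomicReference<ClassLoader> seenByChild = new AtomicReference<>();
            // No setContextClassLoader call on the child: it uses whatever it inherited.
            Thread child = new Thread(
                    () -> seenByChild.set(Thread.currentThread().getContextClassLoader()));
            child.start();
            child.join();
            return seenByChild.get() == custom;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            creator.setContextClassLoader(original); // always restore the previous loader
        }
    }

    public static void main(String[] args) {
        System.out.println(childInheritsFromCreator());
    }
}
```

The try/finally restore is the usual pattern whenever the context loader is changed on a thread you do not own, such as a pooled thread.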
Then what is the use of contextClassLoader?
contextClassLoader provides a back door around the classloading delegation scheme.
Then this question becomes why do we need this back door?
From the JavaWorld article Find a way out of the ClassLoader maze
Take JNDI for instance: its guts are implemented by bootstrap classes in rt.jar (starting with J2SE 1.3), but these core JNDI classes may load JNDI providers implemented by independent vendors and potentially deployed in the application's -classpath. This scenario calls for a parent classloader (the primordial one in this case) to load a class visible to one of its child classloaders (the system one, for example). Normal J2SE delegation does not work, and the workaround is to make the core JNDI classes use thread context loaders, thus effectively "tunneling" through the classloader hierarchy in the direction opposite to the proper delegation.

Java class loaders can be classified into below categories
1) Bootstrap Class Loader
Load classes from JAVA_HOME/jre/lib/rt.jar
2) Extensions Class Loader
Load classes from JAVA_HOME/jre/lib/ext
3) System Class Loader
Classpath of the application
We can also create our own class loader that loads classes from a location we choose, and hand it to other code by setting it as the thread's context class loader.
I hope that gives an idea of why we need setContextClassLoader().
