Dynamically loading jar from arbitrary url


Recently AWS Lambda added support for Java.
While this is great news, it comes with a pretty severe limit on the size of the code (50 MB compressed). While this may be fine for other languages, Java uberjars can easily exceed that.
So I've been toying with the idea of having a small loader that pulls in, at runtime, a bigger jar from somewhere else (setting aside for a moment whether this is a good idea).
From my initial research, it seems that a custom class loader is the way to go. This is probably a no-go for AWS Lambda.
Is there any other creative way this could be achieved?

I think ClassLoader, and more precisely URLClassLoader, is the way to go, and I don't know of any other solution to load code at runtime.
The class loader does not even have to be custom. It works with just a few lines of code, as demonstrated in this post.
If the jar files you will load fulfill a particular service for your application, also consider the handy ServiceLoader. It works on the same principle (in fact, you can pass it a ClassLoader directly), but makes it transparent to instantiate objects from the dynamically loaded library. Otherwise, you would have to get your hands a bit dirty, using something like:
Object main = loader.loadClass("Main").getDeclaredConstructor().newInstance();
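As a rough sketch of the approach described above, a stock URLClassLoader can be pointed at one or more jar URLs and asked for a class by name. The class name RemoteJarLoader and the helper instantiateFrom are made up for illustration, and whether the Lambda sandbox permits this is not confirmed here.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class RemoteJarLoader {

    /**
     * Loads className through a URLClassLoader that searches the given URLs
     * (after delegating to the parent loader first) and calls the class's
     * public no-argument constructor.
     */
    public static Object instantiateFrom(URL[] jarUrls, String className) throws Exception {
        // Deliberately not closed here: classes referenced by the returned object
        // may still be loaded lazily from the same loader later on.
        URLClassLoader loader = new URLClassLoader(jarUrls, RemoteJarLoader.class.getClassLoader());
        Class<?> type = loader.loadClass(className);
        return type.getDeclaredConstructor().newInstance();
    }
}
```

A caller would pass the URL of the large jar (an http, https, or file URL) together with the name of a class inside it. The returned object still has to be cast to an interface that the small loader already knows about, or be driven through reflection.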

Related

Block instances of a class at the JVM level?

Is there a way to configure the JVM to block instances of a class being created?
I'd like to do this to ensure no service running in the JVM is allowed to create instances of a class that has been identified as a security risk in a CVE; let's call that class BadClass.
NOTE: I'm looking for a general solution, so the following is purely additional information. I would normally address this by switching the library out, or upgrading it to a version that doesn't have the exploit, but it's part of a larger library that won't be addressing the issue for some time. So I'm not even using BadClass anywhere, but want to completely block it.

I do not know of a JVM parameter for this, but here are some alternatives that might put you in a position to meet your requirements:
You can write a custom ClassLoader that gives you fine control over what gets loaded. Normal use cases would be plugin loading and the like. In your case this is more about security governance at the DevOps level.
If you have a CI/CD pipeline with integration tests, you could also start the JVM with the -verbose:class parameter and see which classes are loaded when running your tests. It seems a bit hacky, but maybe it suits your use case. I'm just throwing everything into the game; it's up to you to judge the best fit.
Depending on your build system (Maven?), you could restrict application builds to your private cached libraries. That way you have full control and can put a library-review layer in between. This would also share responsibility between the devs and the repository admins.
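The custom class loader idea from the first alternative could look roughly like the sketch below. The class name BlockingClassLoader is made up for illustration. It only refuses requests that pass through this loader or its children, so code loaded by a parent loader that resolves BadClass on its own is not affected.

```java
import java.util.Set;

public class BlockingClassLoader extends ClassLoader {
    private final Set<String> blockedClassNames;

    public BlockingClassLoader(ClassLoader parent, Set<String> blockedClassNames) {
        super(parent);
        this.blockedClassNames = blockedClassNames;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // Refuse blocked names before the usual parent-first delegation runs.
        if (blockedClassNames.contains(name)) {
            throw new ClassNotFoundException("Blocked by policy: " + name);
        }
        return super.loadClass(name, resolve);
    }
}
```

For this to have any effect, the application classes themselves would have to be loaded through such a loader, which is the part that takes real work in an existing deployment.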

A distinct non-answer: Do not even try!
What if that larger library that has this dependency wants to call that method? What should happen then?
In other words, what is your blocking supposed to do?
Throw some Error instance, that leads to a teardown of the JVM?
Return null, so that (maybe much later) other code runs into a NPE?
Remember: that class doesn't exist in a void. There is other code invoking it. That code isn't prepared for you coming in, and well, doing what again?!
I think there are no good answers to these questions.
So, if you really want to "manipulate" things:
Try sneaking a different version of that specific class into your classpath instead: either an official one that doesn't have the security issue, or something that complies with the required interface and does something less harmful. Or, if you dare to go down that path, do as the other answer suggests and get into the "my own classloader" business.
In any case, your first objective: get clean on your requirements here. What does blocking mean?!

Have you considered using a Java agent?
It can intercept class loading in any classloader and manipulate the class's content before the class is actually loaded. You can then either modify the class to remove or fix its bugs, or return a dummy class that throws an error in its static initializer.
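A minimal sketch of the interception point, using the standard java.lang.instrument API, is shown here. The class names BadClassAgent and Watcher and the target com/example/BadClass are hypothetical. The sketch only records that the class was seen; actually replacing the bytes would need a bytecode library, which is left out.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class BadClassAgent {
    // Hypothetical internal (slash-separated) name of the class to watch for.
    public static final String TARGET = "com/example/BadClass";
    public static final List<String> sightings = new CopyOnWriteArrayList<>();

    /** Entry point named by the Premain-Class attribute in the agent jar's manifest. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new Watcher());
    }

    public static class Watcher implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            if (TARGET.equals(className)) {
                sightings.add(className + " via " + loader);
                // A fuller agent would return replacement bytecode here instead.
            }
            return null; // null tells the JVM to keep the original bytes
        }
    }
}
```

The agent jar would be passed to the JVM with the -javaagent option. Note that throwing from transform does not block loading, since the JVM treats that like returning null.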

Using class loader to enable shared code between Java and Android

I am trying to build an application that runs under JavaSE and Android. Most of the code is the same between the two, but there are some specific functions that need to be separated. I use Eclipse. So I decided to put the shared code in a separate project, and then build one more project for Android and one for Java, which reference the shared project. I put all Java and Android specific functions in one class residing in the Java and Android specific projects. These classes are called UtilsJ (for Java) and UtilsA (for Android). The code in the shared project uses a factory to determine at runtime which version it needs to pick, and then calls the class loader to load the right class. Essentially: if property java.vm.name equals Dalvik, load UtilsA, else load UtilsJ (and of course cast to the Utils interface before returning).
My question is simply if this is a good idea or is something going to eventually break? I've never used class loader before. Any other suggestions how to implement this sharing would also be appreciated.
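For readers who want to picture the setup, the factory described above might look roughly like this. It is a sketch, not the asker's actual code: both implementations are nested in one file here only to keep the example self-contained, whereas in the question UtilsJ and UtilsA live in separate projects, and the platformName method is invented for illustration.

```java
public class UtilsFactory {

    /** Platform-neutral interface that the shared code programs against. */
    public interface Utils {
        String platformName();
    }

    /** Stand-in for the JavaSE implementation. */
    public static class UtilsJ implements Utils {
        @Override public String platformName() { return "JavaSE"; }
    }

    /** Stand-in for the Android implementation. */
    public static class UtilsA implements Utils {
        @Override public String platformName() { return "Android"; }
    }

    /** Picks the simple name of the implementation class from a java.vm.name value. */
    public static String implementationFor(String vmName) {
        return "Dalvik".equals(vmName) ? "UtilsA" : "UtilsJ";
    }

    /** Loads the chosen class by name so the shared code never links against it directly. */
    public static Utils create() throws ReflectiveOperationException {
        String simpleName = implementationFor(System.getProperty("java.vm.name"));
        // Nested classes have the binary name Outer$Inner.
        String binaryName = UtilsFactory.class.getName() + "$" + simpleName;
        Class<?> type = Class.forName(binaryName);
        return (Utils) type.getDeclaredConstructor().newInstance();
    }
}
```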

Generating an interface implementation dynamically is certainly a valid technique. For instance, having a data access interface that has multiple implementations; one each for flat files, MySQL and WebDAV. The program can pick an implementation at run time based on system/platform properties.
But this feels different. If I saw that I had a Java app and an Android app that had a lot of common code, my goal would be to create an Eclipse project that generates a jar file that I could just drop into the libraries of both projects. In that case, the jar file wouldn't contain any code that was incompatible with one platform or the other. So there wouldn't be any reason to have a platform-specific implementation.
Let's take your example of some code reading an initialization file. If it's common code, you have an input parameter which is a file. On Android, maybe it's "/data/data/com.whatever.blahblahblah" and on Java you're getting the "user.dir" system property for the top-level directory. But one way or another, it's a File, and you hand it to your common setup method. That's okay. But if your code for reading the initialization file needs, say, a Context to get a Resource to read the file on Android, then it's not common code. And it doesn't belong in a library jar for a JVM-hosted app.
So I think that in your case the platform-specific implementation classes are overkill. If it's common code, it's the same code — period.
Let's talk about another example in your comment. If you are using desktop Java, then you are probably using Swing or AWT, so you still have the same issue of running some network task off the UI thread, notifying when it completes, maybe even updating some progress indicator UI while it's processing. Same function, same operation, but the code is so different that I can't see how having it in the same library next to an AsyncTask version could be of any benefit.
And testing might get tricky. Obviously JUnit will work for everything, but some tests would need to run on a device or emulator.
I stated that it was a valid technique, and of course you may have other compelling reasons to choose the multi-platform option. You asked the question: is anything going to break? My answer is: probably not, but why risk dealing with some heartburn down the road? Speaking for myself, I wouldn't do it. If I had to support multiple MVC apps, my common library would have nothing but M.

Do I need to extend ClassLoader to redirect web application resource loading?

My question is two-fold. First, I'll explain the problem, and second, assuming the solution is to implement a class loader, how to go about doing that on a web application.
My problem is this: Our company is using a framework made by another company. It uses xml files to generate web pages and these xml files are located within another library (jar files). It wasn't meant to be dynamic because these libraries are generated often (weekly?), but they determine how many fields there are, what type of information it collects (datetime, combo box, etc.), and so on.
Now the question has been proposed by my company whether or not it would be possible to dynamically move these fields around (by dynamic, I mean ideally you could refresh the page and see the effects of changes made to the layout). I did a few preliminary tests and discovered that modifying the xml does give the desired effect on the web page, however, since these xml files are located in the jars, it means I have two possibilities:
Create a tool which modifies the jar outside of the scope of my web application, though this would obviously imply that it absolutely cannot be dynamic. Moreover, I'd have to create an interface aside from the web application in order to manage the tool. Moreover still, I can't seem to shake the impression that this is an incredibly hacky approach and that I should probably avoid this solution at any cost.
I implement a class loader (specifically getResourceAsStream) and when I see a call to load one such xml file, rather than do the default behavior, I generate the xml file based on the original, modifying information as I require, then returning the resource to the caller (which in this case would be the third-party framework).
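The second option (intercepting getResourceAsStream) might be sketched as follows. The class name XmlRewritingClassLoader and the rewrite function are made up for illustration, and the code assumes Java 9 or later for readAllBytes. Whether the third-party framework actually resolves its XML through such a loader depends on how the container wires class loaders, which is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

public class XmlRewritingClassLoader extends ClassLoader {
    private final UnaryOperator<String> rewrite;

    public XmlRewritingClassLoader(ClassLoader parent, UnaryOperator<String> rewrite) {
        super(parent);
        this.rewrite = rewrite;
    }

    @Override
    public InputStream getResourceAsStream(String name) {
        InputStream original = super.getResourceAsStream(name);
        // Pass through missing resources and anything that is not an XML file.
        if (original == null || !name.endsWith(".xml")) {
            return original;
        }
        try (InputStream in = original) {
            String xml = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            // Hand the caller the modified document instead of the one in the jar.
            return new ByteArrayInputStream(rewrite.apply(xml).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```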
First question is, is #2 my best option or do there exist other options (or should I stick to #1)?
My second question is, assuming I should implement my own class loader, how best can I do this on my web application? I'm using Tomcat 7, but if possible I would like the solution to be independent of which web container I'm using.
Any help would be greatly appreciated!

You could probably simply explode the jar to a directory that is on the classpath and update the XML files in place and on the fly. This won't account for any internal caching within the application, (if any, that's a different problem) but it's straightforward to implement and doesn't put you in the shenanigan filled ClassLoader business.

I'm not sure I understand your question, but I guess you could try the XStream API from ThoughtWorks. It can generate XML on the fly from a Java object, and from that point on you can treat the XML the way you do now to generate your web pages.
I know this answer trivialises the problem, but if it leads you to a new API that lets you generate the XML with minimum fuss, then I guess it will have served your purpose well.

Java ServiceLoader with multiple Classloaders

What are the best practices for using ServiceLoader in an environment with multiple ClassLoaders? The documentation recommends creating and saving a single service instance at initialization:
private static ServiceLoader<CodecSet> codecSetLoader = ServiceLoader.load(CodecSet.class);
This would initialize the ServiceLoader using the current context classloader. Now suppose this snippet is contained in a class loaded by a shared classloader in a web container, and multiple web applications want to define their own service implementations. These would not get picked up by the above code; it might even happen that the loader gets initialized with the first webapp's context classloader and provides the wrong implementation to other users.
Always creating a new ServiceLoader seems wasteful performance-wise, since it has to enumerate and parse the service files each time. Edit: This can even be a big performance problem, as shown in this answer regarding Java's XPath implementation.
How do other libraries handle this? Do they cache the implementations per classloader, do they reparse their configuration every time, or do they simply ignore this problem and only work for one classloader?

I personally do not like the ServiceLoader under any circumstances. It's slow and needlessly wasteful and there is little you can do to optimize it.
I also find it a bit limited -- you really have to go out of your way if you want to do more than search by type alone.
xbean-finder's ResourceFinder
ResourceFinder is a self-contained java file capable of replacing ServiceLoader usage. Copy/paste reuse is no problem. It's one java file and is ASL 2.0 licensed and available from Apache.
Before our attention spans get too short, here's how it can replace a ServiceLoader
ResourceFinder finder = new ResourceFinder("META-INF/services/");
List<Class<? extends Plugin>> impls = finder.findAllImplementations(Plugin.class);
This will find all of the META-INF/services/org.acme.Plugin implementations in your classpath.
Note it does not actually instantiate all the instances. Pick the one(s) you want and you're one newInstance() call away from having an instance.
Why is this nice?
How hard is it to call newInstance() with proper exception handling? Not hard.
Having the freedom to instantiate only the ones you want is nice.
Now you can support constructor args!
Narrowing search scope
If you want to just check specific URLs you can do so easily:
URL url = new File("some.jar").toURI().toURL();
ResourceFinder finder = new ResourceFinder("META-INF/services/", url);
Here, only the 'some.jar' will be searched on any usage of this ResourceFinder instance.
There's also a convenience class called UrlSet which can make selecting URLs from the classpath very easy.
ClassLoader webAppClassLoader = Thread.currentThread().getContextClassLoader();
UrlSet urlSet = new UrlSet(webAppClassLoader);
urlSet = urlSet.exclude(webAppClassLoader.getParent());
urlSet = urlSet.matching(".*acme-.*.jar");
List<URL> urls = urlSet.getUrls();
Alternate "service" styles
Say you wanted to apply the ServiceLoader type concept to redesign URL handling and find/load the java.net.URLStreamHandler for a specific protocol.
Here's how you might lay out the services in your classpath:
META-INF/java.net.URLStreamHandler/foo
META-INF/java.net.URLStreamHandler/bar
META-INF/java.net.URLStreamHandler/baz
Where foo is a plain text file that contains the name of the service implementation just as before. Now say someone creates a foo://... URL. We can find the implementation for that quickly, via:
ResourceFinder finder = new ResourceFinder("META-INF/");
Map<String, Class<? extends URLStreamHandler>> handlers = finder.mapAllImplementations(URLStreamHandler.class);
Class<? extends URLStreamHandler> fooHandler = handlers.get("foo");
Alternate "service" styles 2
Say you wanted to put some configuration information in your service file, so it contains more than just a classname. Here's an alternate style that resolves services to properties files. By convention one key would be the class names and the other keys would be injectable properties.
So here red is a properties file
META-INF/org.acme.Plugin/red
META-INF/org.acme.Plugin/blue
META-INF/org.acme.Plugin/green
You can look things up similarly as before.
ResourceFinder finder = new ResourceFinder("META-INF/");
Map<String,Properties> plugins = finder.mapAllProperties(Plugin.class.getName());
Properties redDefinition = plugins.get("red");
Here's how you could use those properties with xbean-reflect, another little library that can give you framework-free IoC. You just give it the class name and some name value pairs and it will construct and inject.
ObjectRecipe recipe = new ObjectRecipe(redDefinition.remove("className").toString());
recipe.setAllProperties(redDefinition);
Plugin red = (Plugin) recipe.create();
red.start();
Here's how that might look "spelled" out in long form:
ObjectRecipe recipe = new ObjectRecipe("com.example.plugins.RedPlugin");
recipe.setProperty("myDateField","2011-08-29");
recipe.setProperty("myIntField","100");
recipe.setProperty("myBooleanField","true");
recipe.setProperty("myUrlField","http://www.stackoverflow.com");
Plugin red = (Plugin) recipe.create();
red.start();
The xbean-reflect library is a step beyond the built-in JavaBeans API, without requiring you to go all the way to a full-on IoC framework like Guice or Spring. It supports factory methods, constructor args, and setter/field injection.
Why is the ServiceLoader so limited?
Deprecated code in the JVM damages the Java language itself. Many things are trimmed to the bone before being added to the JVM, because you cannot trim them after. The ServiceLoader is a prime example of that. The API is limited and OpenJDK implementation is somewhere around 500 lines including javadoc.
There's nothing fancy there and replacing it is easy. If it doesn't work for you, don't use it.
Classpath scope
APIs aside, in pure practicality, narrowing the scope of the URLs searched is the true solution to this problem. App servers have quite a lot of URLs all by themselves, not including the jars in your application. Tomcat 7 on OS X, for example, has about 40 URLs in the StandardClassLoader alone (this is the parent of all webapp classloaders).
The bigger your app server the longer even a simple search will take.
Caching doesn't help if you intend to search for more than one entry. As well, it can add some bad leaks. Can be a real lose-lose scenario.
Narrow the URLs down to the 5 or 12 that you really care about and you can do all sorts of service loading and never notice the hit.

Have you tried using the two-argument version so that you can specify which classloader to use? That is, java.util.ServiceLoader.load(Class, ClassLoader).
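A small sketch of that two-argument call is shown below. The wrapper class ServiceLookup and the Codec interface are hypothetical stand-ins (Codec plays the role of something like CodecSet). It looks providers up through the calling thread's context class loader on every call, so it pays the lookup cost discussed in the question each time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ServiceLookup {

    /** Hypothetical service interface used only for this example. */
    public interface Codec {
        String name();
    }

    /** Instantiates every provider of the given type that is visible from the given class loader. */
    public static <T> List<T> loadServices(Class<T> type, ClassLoader loader) {
        List<T> providers = new ArrayList<>();
        for (T provider : ServiceLoader.load(type, loader)) {
            providers.add(provider);
        }
        return providers;
    }

    /** Uses the calling thread's context class loader, which in a web container is normally the webapp's. */
    public static <T> List<T> loadForCurrentContext(Class<T> type) {
        return loadServices(type, Thread.currentThread().getContextClassLoader());
    }
}
```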

Mu.
In a 1x WebContainer <-> Nx WebApplication system, the ServiceLoader instantiated in the WebContainer will not pick up any classes defined in WebApplications, just those in the container. A ServiceLoader instantiated in a WebApplication will detect classes defined in the application in addition to those defined in the container.
Keep in mind that WebApplications need to be kept separate and are designed that way; things will break if you try to circumvent that. They are also not the mechanism for extending the container: if your library is a simple jar, just drop it into the appropriate extension folder of the container.

I really like Neil's answer in the link I added in my comment, because I had the same experience in a recent project.
"Another thing to bear in mind with ServiceLoader is to try to abstract the lookup mechanism. The publish mechanism is quite nice and clean and declarative. But the lookup (via java.util.ServiceLoader) is as ugly as hell, implemented as a classpath scanner that breaks horribly if you put the code into any environment (such as OSGi or Java EE) that does not have global visibility. If your code gets tangled up with that then you'll have a hard time running it on OSGi later. Better to write an abstraction that you can replace when the time comes."
I actually met this problem in an OSGi environment (in our project it's really just Eclipse), but luckily I fixed it in time. My workaround is to take one class from the plugin I want to load and get the classloader from it. That is a valid fix. I didn't use the standard ServiceLoader, but my process is quite similar: I use a properties file to define the plugin classes I need to load. I know there is another way to find each plugin's classloader, but at least I don't need to use it.
Honestly, I don't like the generics used in ServiceLoader, because they mean that one ServiceLoader can only handle classes for one interface. Is that really useful? My implementation doesn't impose this limitation: I use a single loader implementation to load all the plugin classes, and I don't see a reason to use two or more, since the consumer can learn the relationships between interfaces and implementations from the config files.

This question seems to be more complicated than I first anticipated. As I see it, there are 3 possible strategies for dealing with ServiceLoaders.
1. Use a static ServiceLoader instance and only support loading classes from the same classloader as the one holding the ServiceLoader reference. This would work when
   a. the service configuration and implementation are in a shared classloader and all child classloaders are using the same implementation. The example in the documentation is geared towards this use case. Or
   b. configuration and implementation are put into each child classloader and deployed along with each webapp in WEB-INF/lib.
   In this scenario it is not possible to deploy the service in a shared classloader and let each webapp choose its own service implementation.
2. Initialize the ServiceLoader on each access, passing the context classloader of the current thread as the second parameter. This approach is taken by the JAXP and JAXB APIs, although they use their own FactoryFinder implementation instead of ServiceLoader. So it is possible to bundle an XML parser with a webapp and have it picked up automatically, for example by DocumentBuilderFactory#newInstance. This lookup has a performance impact, but in the case of XML parsing the time to look up the implementation is small compared to the time needed to actually parse an XML document. In the library I'm envisioning, the factory itself is pretty simple, so the lookup time would dominate the performance.
3. Somehow cache the implementation class with the context classloader as the key. I'm not entirely sure if this is possible in all the above cases without causing any memory leaks.
In conclusion, I will probably be ignoring this problem and require that the library gets deployed inside each webapp, i.e. option 1b above.
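For illustration only, the caching strategy mentioned above (keying cached providers by the context classloader) could be sketched like this. The class name PerLoaderServiceCache is made up. The sketch also shows why the leak concern is real: weak keys alone do not help when the cached provider objects come from classes defined by the key loader, because they then hold a strong reference back to it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;
import java.util.WeakHashMap;

public class PerLoaderServiceCache<T> {
    private final Class<T> type;

    // Weak keys let an entry vanish once its class loader is otherwise unreachable.
    // Caveat: providers defined by the key loader keep that loader strongly reachable.
    private final Map<ClassLoader, List<T>> cache =
            Collections.synchronizedMap(new WeakHashMap<>());

    public PerLoaderServiceCache(Class<T> type) {
        this.type = type;
    }

    /** Returns the providers visible from the current context class loader, scanning only once per loader. */
    public List<T> providers() {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        return cache.computeIfAbsent(context, loader -> {
            List<T> found = new ArrayList<>();
            for (T provider : ServiceLoader.load(type, loader)) {
                found.add(provider);
            }
            return Collections.unmodifiableList(found);
        });
    }
}
```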

Running class in separate context

I'm trying to make a simple application that loads and runs some classes during runtime. For example, let's say I have this config:
module1.classpath=module1.jar,somelibs1.jar
module1.class=com.blabla.Module1
module2.classpath=module2.jar,somelibs2.jar
module2.class=com.blabla.Module2
Then I need to load libraries specified in module1.classpath and run the module1.class with that libraries loaded. Afterwards I need to load module2.classpath and run module2.class with those libraries.
How do I handle the case when somelibs1.jar and somelibs2.jar have the same classes inside? Basically I'd like to run module1.jar using exclusively somelibs1.jar and module2.jar using exclusively somelibs2.jar. How do I implement that?
I'm guessing I need to create a separate classloader for each of my classes and push the jars in that classloaders. However I'd appreciate some example or at least a confirmation that it is a right way to do that.

This seems to be a pretty good use case for OSGi. I would recommend using OSGi for this, as everything you need is provided by OSGi out of the box.
But if for some reason you can't use OSGi, then what you need is a classloader for each module. Load moduleX.class with a ClassLoaderX, and add the entries of moduleX.classpath to ClassLoaderX's path. You can use a set of simple URLClassLoaders for this.
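A minimal sketch of the one-loader-per-module idea follows. The class name ModuleLoaders and its methods are made up for illustration, and jar names are resolved against the working directory. The parent passed in must not itself contain the conflicting libraries; on Java 9 and later ClassLoader.getPlatformClassLoader() is one candidate, while null means the bootstrap loader only.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

public class ModuleLoaders {

    /** Builds a class loader that sees only the jars listed in a comma-separated classpath value. */
    public static URLClassLoader loaderFor(String classpath, ClassLoader parent)
            throws MalformedURLException {
        String[] entries = classpath.split(",");
        URL[] urls = new URL[entries.length];
        for (int i = 0; i < entries.length; i++) {
            urls[i] = Paths.get(entries[i].trim()).toUri().toURL();
        }
        return new URLClassLoader(urls, parent);
    }

    /** Loads the module's entry class through its own loader and instantiates it. */
    public static Object instantiate(String classpath, String className, ClassLoader parent)
            throws Exception {
        URLClassLoader loader = loaderFor(classpath, parent);
        return loader.loadClass(className).getDeclaredConstructor().newInstance();
    }
}
```

With the config from the question, module1 would be created from "module1.jar,somelibs1.jar" and module2 from "module2.jar,somelibs2.jar". Because each loader resolves classes from its own jars, identically named classes in somelibs1.jar and somelibs2.jar end up as distinct classes.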

Thanks for the question. Very interesting.
It seems to me that you can't use several versions of the same class in one JVM instance. I've never had this task, and I don't know how to implement it.
But let's play. I don't know what kind of exotic application you are developing. Maybe you can run several JVMs, each with its own exclusive CLASSPATH.
Write an application that starts another JVM (for example using Runtime.exec()) and talks to it via some channel (maybe the network).
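As a sketch of the separate-JVM idea, the launcher below builds a java command line per module. It uses ProcessBuilder rather than Runtime.exec(), and the class name ChildJvm is made up for illustration. The communication channel between the two processes is not shown.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class ChildJvm {

    /** Builds the command line for a new JVM with its own classpath and main class. */
    public static List<String> command(String commaSeparatedClasspath, String mainClass) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-cp");
        // The config uses commas; the JVM expects the platform's path separator.
        cmd.add(commaSeparatedClasspath.replace(",", File.pathSeparator));
        cmd.add(mainClass);
        return cmd;
    }

    /** Starts the child JVM, sharing this process's standard streams, and waits for it to exit. */
    public static int run(String commaSeparatedClasspath, String mainClass) throws Exception {
        Process child = new ProcessBuilder(command(commaSeparatedClasspath, mainClass))
                .inheritIO()
                .start();
        return child.waitFor();
    }
}
```

This assumes the module class has a main method, which the question does not state.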
