Generate minimized jar with only used classes - java

I need to create a minimal jar of a utils library for use on Android. I'm using some methods from Apache Commons libraries (such as IOUtils, StringUtils). However, each such usage makes me import the whole library (commons-lang, commons-io etc.), which is perfectly acceptable under Tomcat (WARs are mammoth-sized anyway), but absolutely unacceptable for an Android project.
So my aim is to pack all used classes from the dependencies into one jar, but only those classes that are needed. I remember once coming across a Maven plugin that did that task; unfortunately I can't remember its name, nor can I find it via Google.
So please, do you know a Maven plugin that will do such minimization of dependencies, or any stand-alone tool that will do the same?

The Maven plugin you can't remember is probably the Apache Maven Shade Plugin; it has a minimizeJar option. As Andreas_D noticed, this won't include classes loaded with Class.forName, so you will need to state explicitly in the configuration that you need them. Here is how I made Maven include the JDBC driver in my single jar:
<filter>
  <artifact>net.sourceforge.jtds:jtds</artifact>
  <includes>
    <include>**</include>
  </includes>
</filter>

Excuse me, maybe I haven't understood the question clearly. An obfuscator tool (e.g. ProGuard) could do that, couldn't it? It packs several JARs into one and strips unused classes. If you don't need obfuscation or optimization (or want to prevent unwanted side effects), you can disable them and leave only the "shrink" phase enabled.

In general it is not possible to automatically select all classes that are used by an application. Just think about what we can do with Class.forName(String name) or if we use a dependency injection container and declare types in external configuration files.
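To make the Class.forName point concrete, here is a minimal sketch (ReflectiveLoadingDemo and newListByName are hypothetical names, not part of any library). The class name only exists as a string at runtime, so a tool that follows static references in the bytecode has no way of knowing which class will be needed:

```java
import java.util.List;

class ReflectiveLoadingDemo {

    /** Instantiates a List implementation whose name is only known at runtime. */
    static List<?> newListByName(String className) {
        try {
            Class<?> type = Class.forName(className);
            return (List<?>) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load " + className, e);
        }
    }

    public static void main(String[] args) {
        // The name could just as well come from a properties file or a
        // dependency injection container's configuration.
        String className = args.length > 0 ? args[0] : "java.util." + "ArrayList";
        List<?> list = newListByName(className);
        System.out.println(list.getClass().getName());
    }
}
```

A minimizer would have to be told explicitly to keep such classes, which is what the include filter in the Shade Plugin answer does for the JDBC driver.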

I guess if you use Eclipse to JAR the project, it gives you some options to do that while JARing :)
Maybe it will be useful.
Also, you can collect the library classes you use under a custom library and include this user-created library in the project.

Related

Which JAR does Java pick for a certain import?

Dependency issues, we've all dealt with them, but I'm mostly used to C# and now working in Java so I have some questions.
Let's say I add a library to my project, called ExtLib.
ExtLib has a certain library included in its lib-folder, let's call it LogLib-1.0.
I'm using Eclipse and I've made a User Library for ExtLib, included its main jar file and all of the files in its lib-folder. So far so good.
But now I want to do some logging of my own, so I make another User Library and add the newer LogLib-1.1 to it, because it has some new features I want to use.
Can I ever be sure I'm not breaking ExtLib this way?
I know .NET uses the Global Assembly Cache and methods like that, but I have no clue how Java handles this. I tried Googling, but didn't find much, a few mentions of the Classloader here and there, but nothing helpful.
Can anyone tell me what a proper way to deal with this issue is? Or is it no issue at all?
In this specific case (LogLib-1.0 and LogLib-1.1) we're dealing with the same library, which is both a direct dependency of your application and a "transitive" dependency via ExtLib. In this situation, a dependency management tool can help.
It will probably reason that LogLib-1.1 is a backward-compatible release of LogLib-1.0, and decide that your application can run fine using only LogLib-1.1.
In the Java world, tools like Maven, Gradle or SBT exist to help you in this. Maven is the most widespread, and other tools often are compatible with Maven.
Usage
To solve this situation using Maven, you would add a file called pom.xml to your application, stating it depends on LogLib version 1.1. That might look like this (note that this example is pure fiction):
<dependency>
  <groupId>org.loglib</groupId>
  <artifactId>loglib</artifactId>
  <version>1.1</version>
</dependency>
The ExtLib you're using also has a pom.xml shipped with it, and it might state
<dependency>
  <groupId>org.loglib</groupId>
  <artifactId>loglib</artifactId>
  <version>1.0</version>
</dependency>
Maven (or any other tool) would decide that including LogLib-1.1 is sufficient to get your application running. When using Maven, mvn dependency:tree helps you visualise that.
Deployment
With respect to the packaging / deployment question: mvn package will package your application into a jar, war or ear archive, including only the dependencies you need (and not two versions of the same lib). This means you don't have to worry about the order in which your application server reads the jar files.
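As a complement to the dependency tree, the JVM itself can report where a class was actually loaded from. This helps when two versions of a library do end up on the classpath, since a class loader with a flat classpath normally serves the first match it finds. A minimal sketch using only the JDK; JarLocator and locate are hypothetical helper names:

```java
import java.security.CodeSource;

class JarLocator {

    /** Returns the location (jar or directory) a class was loaded from, as text. */
    static String locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes that ship with the JDK usually have no code source.
            return "(no code source; typically a JDK class)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, e.g. one from the library in question.
        Class<?> type = args.length > 0 ? Class.forName(args[0]) : JarLocator.class;
        System.out.println(type.getName() + " <- " + locate(type));
    }
}
```

Running it with the name of a class from the logging library would show which of the two jars won.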

Classloader to isolate a jar (class identity crisis)

I'm using jarX that has embedded dependencies that conflict with my own dependencies, so I'm creating a classloader to isolate jarX's dependencies from my main classloader.
jarX is outside my app's classpath, but my classes that use jarX's classes are in my classpath, so when I instantiate my classes loaded via the custom classloader, I run into the class identity crisis in the form of ClassCastException as the JVM's version of my classes are considered different from those loaded by my custom classloader.
I found this blog post where they solved a similar problem by only interacting with the custom classloader loaded classes via reflection, which seems to solve this problem.
It just feels like it should be easier than this. Does anyone know a better way to handle this problem?
The easiest way is to open jarX, remove the offending classes, and be done. It is bad practice to embed dependencies in a JAR unless that JAR is meant to be used only as a standalone runnable fat-jar. JARs that are meant to be used as libraries should not embed dependencies.
When you notice that people package third-party classes in their JARs, I'd recommend pointing out to them that this is generally not a good idea and to encourage them to refrain from doing so. If a project provides a runnable fat-jar including all dependencies, that is fine. But, it should not be the only JAR they provide. A plain JAR or set of JARs without any third-party code should also be offered. In the rare cases that third-party code was modified and must be included, it should be done under the package namespace of the provider, not of the original third-party.
Finally, for real solutions to building modular Java applications and handling classloader isolation, check out one of the several OSGi implementations or project Jigsaw.
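The class identity problem from the question can be reproduced in a few lines. The sketch below is an illustration only: ClassIdentityDemo, Widget and IsolatingLoader are hypothetical names, and it assumes the compiled classes are readable as resources from the application class loader. It also shows a common alternative to pure reflection: talk to the isolated object through a type that both loaders share (here the JDK's Runnable), because that type is defined only once.

```java
import java.io.IOException;
import java.io.InputStream;

class ClassIdentityDemo {

    /** Hypothetical stand-in for one of "my classes"; Runnable is shared by all loaders. */
    public static class Widget implements Runnable {
        public Widget() {}
        @Override public void run() {
            System.out.println("running, defined by " + getClass().getClassLoader());
        }
    }

    /** Defines its own copy of Widget instead of delegating to the parent loader. */
    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader(ClassLoader parent) { super(parent); }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(Widget.class.getName())) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) resolveClass(c);
                return c;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(resource)) {
                if (in == null) throw new ClassNotFoundException(name);
                return in.readAllBytes();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Creates a Widget whose class was defined by a separate IsolatingLoader. */
    static Object newIsolatedWidget() {
        try {
            ClassLoader parent = ClassIdentityDemo.class.getClassLoader();
            Class<?> copy = new IsolatingLoader(parent).loadClass(Widget.class.getName());
            return copy.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object isolated = newIsolatedWidget();
        // Same name, different defining loader: the JVM treats them as different classes.
        System.out.println("same name:     " + isolated.getClass().getName().equals(Widget.class.getName()));
        System.out.println("is a Widget:   " + Widget.class.isInstance(isolated));
        // A cast to Widget would throw ClassCastException; the shared interface works.
        ((Runnable) isolated).run();
    }
}
```

In a real setup the shared type would be an interface of your own that lives only in the parent class loader, with the jarX-dependent implementation loaded by the isolating loader.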
Can you post which jar it is and which classes overlap, together with the full stack trace? Have a look at this tool I wrote to generate a list of duplicate classes in the WAR; there is an option to exclude duplicates of the same size.
These are some measures that can be done to solve this:
Try to reduce the number of duplicates by doing a case by case analysis of why the overlap exists. Add maven exclusions for jars that are complete duplicates.
Check if there is a version of the same jar without the dependencies that you could use, which jar is it, xerces, etc?
If there is no jar without dependencies, you can exclude the other jar that overlaps jarX and see if the application still works. This means all components that need the jar have a compatible version of the jarX library.
Separate the application into two WARs each with the version of the library you need. This will reduce the number of libraries in which
These were the measures that are likely to be more maintainable long-term.
If the previous measures do not work:
Open the jar, delete the duplicate classes and publish it in the Maven repository under a different name, jarX-patched.
You can configure Nexus to serve a patched jar instead of an unpatched jar transparently.
If your container supports OSGi, that would be even better, but if you don't use an OSGi container for development as well, then the application would not work in development.

Is there a way to prevent developers to use a certain import?

I have an application that uses Jasper to generate reports. In order to encapsulate the complexity and provide a uniform interface to the Jasper API, I have created an "intermediate" interface that wraps the Jasper classes and delegates client calls to them. This will also make it easier to change the reporting engine in the future, to Crystal Reports for instance.
The thing is, since the Jasper classes are in the classpath, developers (including myself) can accidentally use some of its classes directly in the business code, and that may pass unnoticed for a long time. I would like to avoid that, or at least be notified when that happens.
The environment is basically eclipse, maven, git, sonar, bamboo ci.
I'm sure this is not an uncommon scenario, so, what is the best way to deal? Design patterns, eclipse/maven plugins, sonar alerts? Or maybe something dead simple that I'm just not seeing?
In Maven you can specify that a library is for runtime only. This allows you to not compile against that library at all. If you don't use Jasper from Maven, you could avoid including it at all. You can force this by adding an <exclusion> if it is a transitive dependency.
You should have two separate eclipse projects: One for the reporting library, one for the rest.
The reporting library project contains your interfaces, the Jasper jar files and the Jasper-specific implementation of the interfaces.
The other project depends on the reporting library project (you can set project dependencies in the projects properties dialog under "Java Build Path" -> "Projects").
As the reporting project only exports the source folder to the other project, the jasper classes are not visible to it at development time.
I haven't used it much myself, but if you ever need more control over your dependencies you could try DCL Suite, an Eclipse plugin. It lets you define constraints between modules, and you can declare a module to be a class, a set of classes, a package, etc.
That would only be possible if you handled classloading of Jasper and included it as a resource (a jar file) inside your own jar. Then no one would know it was available directly. Here's an example of how you can include jars inside your own jar file -> An embedded jar classloader in under 100 lines.
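None of the answers above shows code, so here is a rough sketch of the "at least be notified" idea: a check that flags import lines pointing into the Jasper packages. Everything in it is an assumption made for illustration (the ForbiddenImportCheck class, scanning source text line by line, and wiring it into a unit test or CI step); only the net.sf.jasperreports package prefix comes from JasperReports itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ForbiddenImportCheck {

    /** Package prefix that business code should not import directly. */
    static final String FORBIDDEN_PREFIX = "net.sf.jasperreports.";

    /** True if the line is an import (plain or static) of a forbidden package. */
    static boolean isForbiddenImport(String line) {
        String trimmed = line.trim();
        if (!trimmed.startsWith("import ")) {
            return false;
        }
        String target = trimmed.substring("import ".length()).trim();
        if (target.startsWith("static ")) {
            target = target.substring("static ".length()).trim();
        }
        return target.startsWith(FORBIDDEN_PREFIX);
    }

    /** Collects all offending import lines from the lines of one source file. */
    static List<String> findForbiddenImports(List<String> sourceLines) {
        List<String> hits = new ArrayList<>();
        for (String line : sourceLines) {
            if (isForbiddenImport(line)) {
                hits.add(line.trim());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "package com.example.business;",
                "import java.util.List;",
                "import net.sf.jasperreports.engine.JasperReport;");
        for (String hit : findForbiddenImports(sample)) {
            System.out.println("Forbidden import: " + hit);
        }
    }
}
```

A text scan like this misses fully qualified references in the code body, so the two-project split or the runtime scope described above remains the stronger guarantee.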

Changing package names before building in Bamboo

I recently discovered that BlackBerry treats all classes with the same fully-qualified name as identical--regardless of whether they are in entirely different apps or not--causing apps that use different versions of our shared libraries to break when they are installed on the same phone.
To solve this problem, we are planning on changing the package names to include a version number, then building. Can someone explain how, using Bamboo, I can insert a step in our build process that:
changes certain packages names
replaces all code references to the old package name with references to the new package name?
A great tool that is made especially for the task of changing the fully qualified names of Java classes in jar files is jarjar. It can be used easily from within Ant, or alternatively from a shell script.
I have never used Bamboo, but I assume it should work there, too. Of course, there may be some special restrictions in that environment (concerning bytecode manipulation) that I don't know about.
I'm not familiar with Bamboo and you did not include much information about your build system. If you are using maven, you could use the shade plugin:
This plugin provides the capability to package the artifact in an uber-jar, including its dependencies and to shade - i.e. rename - the packages of some of the dependencies.
The second example here shows how to configure package renaming. The resulting jar file would then have to be processed by rapc, as in Chris Lercher's comment to his answer. It should be possible to also integrate this in a Maven build using the exec plugin.

Maven Plugin - are plugins executable within plugins?

Is it possible to execute a plugin from a plugin? For instance, if I want to programmatically call another plugin from within a plugin, not via static XML.
Is this possible, how would I do that?
Thanks,
Walter
There are several ways to do this:
Use MavenInvoker to fork a new maven process.
This has pros and cons, especially since you're building the project twice, but a common pattern is to modify the Maven model, write it out to the file system as a temporary pom XML file, and point the invoker to this pom. Drawback: you're losing the original model and wasting resources. Pro: you can do anything you want to the (new) Maven model dynamically. This is very powerful.
Let your plugin either aggregate or extend the original plugin.
Extending is a lot simpler, as the configuration is automatically there (Google for "maven extend plugin"). By aggregation I mean calling the plugin programmatically, which means you will probably have to access the Plexus container to wire up the plugin configuration.
