Reduce jar file size - java

Is there any way to reduce jar file size?
I want a tool that removes unused dependencies.
I use Maven for dependency management.

A JAR file doesn't normally contain dependencies in the Maven sense. So you must be talking about:
a WAR or EAR or similar file,
a so-called UberJAR file produced by combining lots of JAR files; e.g. using the Maven shade plugin, or
dependencies at a finer granularity than Maven modules.
In the first two cases, you can keep out nominally dependent JARs by excluding them, either in the dependency specification, or in the war or shade plugin build descriptor. IIRC, the shade plugin also allows you to exclude specific packages and classes.
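As a rough sketch of the first option, an exclusion in the dependency specification might look like the following. The coordinates (com.example, some-library, unwanted-transitive) are placeholders for illustration, not artifacts from the question:

```xml
<dependency>
  <groupId>com.example</groupId>
  <artifactId>some-library</artifactId>
  <version>1.0.0</version>
  <exclusions>
    <!-- Keeps this transitive dependency of some-library out of the build.
         An exclusion names only groupId and artifactId, never a version. -->
    <exclusion>
      <groupId>com.example</groupId>
      <artifactId>unwanted-transitive</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Whether this is safe depends on some-library never touching the excluded classes at runtime, so it is worth testing the packaged artifact afterwards.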
The last may require a separate tool to post-process the JAR file. Getting rid of unused classes is the kind of thing that an obfuscator can do. However, you need to be careful not to eliminate classes or class names that are used reflectively; e.g. by a DI / IoC framework or an AOP framework.
(Generally speaking, this kind of tool tries to figure out what classes are used by analysing the dependencies implied by .class file external references. DI / IoC / AOP and so on introduce other kinds of dependency that are not apparent in the .class file structure.)
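One option that stays inside Maven, rather than a separate post-processing tool, is the shade plugin's minimizeJar setting, which drops classes the project does not appear to reference. A minimal sketch, assuming a hypothetical artifact com.example:reflectively-used-lib that is only loaded reflectively and therefore has to be kept whole through a filter:

```xml
<!-- Goes inside <build><plugins> of the POM; pin a plugin version in a real build. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <!-- Remove classes that static analysis considers unused. -->
        <minimizeJar>true</minimizeJar>
        <filters>
          <!-- Hypothetical artifact: explicitly included classes are kept
               even though nothing references them in the bytecode. -->
          <filter>
            <artifact>com.example:reflectively-used-lib</artifact>
            <includes>
              <include>**</include>
            </includes>
          </filter>
        </filters>
      </configuration>
    </execution>
  </executions>
</plugin>
```

The same caveat about reflection applies here: anything used only through DI, IoC or AOP needs an explicit include, or it may be stripped.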

If you want to know which dependencies your project uses, check the maven-dependency-plugin, which can analyze used and unused dependencies.
Check your dependencies via:
mvn dependency:analyze
or take a look at the dep tree like this:
mvn dependency:tree
Or you can look in your IDE (depending on which one you use). For example, Eclipse (Indigo) with the m2e plugin has a "Dependency Hierarchy" tab, which shows the dependency tree including the transitive dependencies.
In some situations you have to be careful about dependencies that are used only by DI frameworks, because neither the maven-dependency-plugin nor the IDE plugins can detect that kind of usage.

pack200 can drastically reduce the JAR size, but it's hard to use with Maven and impossible to use with an EE container. Note also that the pack200 tools were removed from the JDK in Java 14.
Why do you have unused dependencies?

Related

How to map local jars to artifactId, groupId, version triplet?

I have a huge number of local jar dependencies for a legacy Ant project. The names don't follow the artifactId-version-classifier.jar pattern. I want to replace these jars with artifacts from the central repository where possible.
Is there a way to do it?
That is usually a lot of work.
You can compute the checksums of the jars and look them up in the central repository's search, which supports queries by SHA-1. You can also look up the name and the version (if you know it).
Furthermore, the project probably uses only a small fraction of the jars as direct dependencies. The rest are transitive dependencies. So if you figure out the direct dependencies (by looking at the code), then resolving those would be enough. Maven will find the transitive ones automatically.

How to find if I need to exclude dependencies in a maven java project?

I use both IntelliJ IDEA (2018.3.5) and Eclipse, but I prefer IntelliJ. I have a Maven-based Java project with multiple POMs. I added some dependencies to one of the POM files. I need to find out whether there are any dependency conflicts that could break the application when it's deployed, and then exclude them. I tried the steps below to find conflicts that could cause problems. Are they enough, or do I need to do more?
Check for compile-time dependency conflicts with mvn clean install -DskipTests. The build was successful with no errors.
Check that IntelliJ shows no problems under File > Project Structure > Problems. There are no problems.
I also looked at the dependency tree with mvn dependency:tree -Dverbose. It has a lot of "omitted for duplicate" and "omitted for conflict with" items, but the build was successful and I don't see any errors. Does this mean that everything is okay, or do I have to do something more about these conflicts?
The best way to tell if everything is fine with your application is to have good tests.
However, one normally doesn't exclude transitive dependencies from a project's <dependency> libraries. Doing so can break the library in a subtle and hard-to-notice way. It's usually safer to remove the whole <dependency>.
There are a few scenarios in which one should use <exclusion>:
Dealing with incompatible transitive dependencies between different libraries, e.g. library A requires C-1.0 but library B requires C-2.0, and C-1.0 and C-2.0 can't coexist on the classpath.
Having transitive dependencies already provided by the system, e.g. deploying to Tomcat with additional JARs in the TOMCAT_HOME/lib directory.
If you decide to exclude a dependency, it's important to check the final artifact, because plugins sometimes do weird things. For example, some versions of maven-assembly-plugin were affected by a bug that resolved different dependencies during shaded JAR creation than the ones maven-dependency-plugin used for compilation.

Understanding Maven dependencies and assembly

I am not very experienced with Maven, and its compilation and packaging logic confuses me.
I have some dependencies declared as:
<dependency>
    <groupId>com.dependency_group</groupId>
    <artifactId>dependency_1</artifactId>
    <version>1.0.0</version>
</dependency>
<dependency>
    <groupId>com.dependency_group</groupId>
    <artifactId>dependency_2</artifactId>
    <version>1.0.0</version>
    <scope>provided</scope>
</dependency>
So as far as I understand, dependency_1 will be added to the classpath of my program as something that comes along with my jar, and dependency_2, on the other hand, will be added to the classpath as something that the system runtime will provide upon deployment.
Then I run the package goal of Maven and none of my dependencies are packed with my code (I am using the shade plugin, but even without it nothing changes).
I expected that a dependency with compile scope would be exported with my compiled code, since AFAICS there's no point in setting up the classpath as if a dependency will come along with my code when Maven just doesn't package that dependency with it. It looks to me as if Maven is not obeying its contract.
So:
1 - What's the logic behind this?
2 - Do I have to always use the Assembly plugin?
3 - Are there cases where people will define a dependency as compile and will not want it packaged within a jar?
Let me shed some light on the main point here. There are fundamentally two kinds of java artifacts:
Applications, i.e. ears, wars, executable jars
Libraries, i.e. jars that are meant to be used as dependencies for other artifacts.
For applications your reasoning makes perfect sense. WARs and EARs automatically package all their compile dependencies, and you need no assembly plugin for that. For libraries, you do not pack the dependencies into the library. Maven handles transitive dependency resolution and would be confused if you put a fat jar on the classpath.
The thing is that an artifact with jar packaging can be either a library or an application. If you want a standalone application, you need to tell Maven to package everything, e.g. by using the assembly plugin or the shade plugin.
You use compile scope when you want a dependency to come along with your code. For example, you want Jackson to be part of your application if you are using it for JSON serialization.
You use provided scope if you want a dependency to be on the classpath during compilation but not included in your application. It must be provided by the runtime environment. For example, you want Lombok as provided because it is a compile-only library, or the Servlet API as provided when you are writing a servlet application, because such an app runs in a servlet container and there is no need to pack the API into your application (it will be available in the container at runtime).
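For the standalone-application case, a minimal sketch with the assembly plugin's predefined jar-with-dependencies descriptor could look like this. The main class com.example.Main is a placeholder assumed for illustration:

```xml
<!-- Goes inside <build><plugins>; pin a plugin version in a real build. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <configuration>
    <descriptorRefs>
      <!-- Predefined descriptor: unpacks the runtime dependencies into one jar. -->
      <descriptorRef>jar-with-dependencies</descriptorRef>
    </descriptorRefs>
    <archive>
      <manifest>
        <!-- Hypothetical entry point; replace with the application's own. -->
        <mainClass>com.example.Main</mainClass>
      </manifest>
    </archive>
  </configuration>
  <executions>
    <execution>
      <id>make-assembly</id>
      <!-- Bind to the package phase so mvn package produces the fat jar. -->
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

With this in place, mvn package should produce an additional artifact with the -jar-with-dependencies suffix next to the plain jar. Dependencies with provided scope are left out of it.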
Do I have to always use the Assembly plugin
Nobody forces you to do so.

Finding unused (compile-scoped) jars in multiple projects using Maven?

I have a webapp that consists of multiple projects. We assemble using Ant and we suspect that some of the jars in /java directory are unneeded.
To find unneeded jars I ran
mvn dependency:analyze -DignoreNonCompile
to get a list of unused declared jars for each project. However it is possible that a jar unused by one project is still used by another. To check this, I ran
mvn dependency:tree
to get the dependency structure of all projects.
Using information from these commands, I will now use a script to check if a jar exists such that it is unused in all projects that declare it. Is this a reasonable approach for compile-scoped jars? What about jars in other scopes?
Thanks.
However it is possible that a jar unused by one project is still used by another.
I recommend declaring all needed dependencies as direct dependencies rather than relying on transitive dependencies, which might be removed in a newer version.
Define the versions of the dependencies in the <dependencyManagement> section of the common parent POM and omit the versions when declaring the dependencies in the modules. That way you can make sure you're using the same version of each dependency in all your projects.
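A short sketch of that layout, using the placeholder coordinates com.example:shared-library (assumed for illustration). The first fragment belongs in the parent POM, the second in each module that actually uses the library:

```xml
<!-- Parent POM: fixes the version once, but adds no dependency by itself. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>shared-library</artifactId>
      <version>1.2.3</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- Module POM: declares the dependency directly; the version is inherited. -->
<dependencies>
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>shared-library</artifactId>
  </dependency>
</dependencies>
```

Because each module still declares what it uses, mvn dependency:analyze keeps reporting per-module results, while the version stays consistent across the build.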

Using multiple source folders (as intermediate step in a conversion of a large Java project to Maven)

I am converting a large Java project to use Maven. I have a LOT of interdependencies to work out, but I would like to get it off the ground with Maven before I do the real cleanup work. I have broken it up into a few modules plus one giant module; let's call that module monolith.
Monolith has regular Java classes and some GWT classes (with interdependencies). I separated the two parts to have a directory structure like this:
./src/main/java/...
./src/client/gwt/...
So, I can easily get this to compile in Eclipse with m2eclipse, but then I can't seem to find how to get it to compile with Maven. I saw that the POM file has a build section where you can specify an alternate source and target, but I think it is not a repeatable element in the POM:
<build>
    <sourceDirectory>${basedir}/src/main/java</sourceDirectory>
</build>
In Eclipse, I can adjust the project's .classpath file (in the project properties) to add additional source directories (and output dirs) to accomplish what I am looking to do.
Is there any way to do this, or do I need to work out the dependencies first, and separate into separate modules?
If you go against the grain with Maven it will be an uphill battle all the way.
Maven doesn't lean towards multiple main source directories; in a Maven environment they would do better as separate modules.
I've looked at a number of Maven GWT projects and archetypes, and none of them seem to take the approach you've suggested.
Have a look at the source structure used by Hupa, and also see the archetypes from the Ham and Eggs blog:
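If a second source directory is needed as a stopgap anyway, one mechanism this answer does not mention is the build-helper-maven-plugin from org.codehaus.mojo, whose add-source goal registers extra source roots. A sketch, assuming the src/client/gwt directory from the question; the execution id is made up:

```xml
<!-- Goes inside <build><plugins>; pin a plugin version in a real build. -->
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>build-helper-maven-plugin</artifactId>
  <executions>
    <execution>
      <!-- Hypothetical id, any unique name works. -->
      <id>add-gwt-source</id>
      <!-- Runs before compilation so the compiler sees the extra root. -->
      <phase>generate-sources</phase>
      <goals>
        <goal>add-source</goal>
      </goals>
      <configuration>
        <sources>
          <source>${basedir}/src/client/gwt</source>
        </sources>
      </configuration>
    </execution>
  </executions>
</plugin>
```

This only adds the directory to the same compilation, so it does not untangle the interdependencies. Splitting into modules remains the longer-term fix.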
http://hamandeggs.wordpress.com/2010/01/26/how-to-gae-eclipse-maven/
http://hamandeggs.wordpress.com/2010/07/25/gae-eclipse-maven-update-for-helios/
These also cater for App Engine.
If you really need to separate your java server source from your gwt client source, then monolith needs to be split into more modules.
It is quite common to see GWT projects with a package structure as follows:
com.company.project
    .client
    .server
    .shared
You then specify the source paths in your gwt.xml to include client and shared.
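A minimal sketch of such a module descriptor, assuming it sits in the com.company.project package next to the client, server and shared subpackages (entry point and other settings left out):

```xml
<module>
  <!-- Core GWT library. -->
  <inherits name="com.google.gwt.user.User"/>

  <!-- Subpackages, relative to this file's package, that are compiled
       to JavaScript. The server package is deliberately not listed. -->
  <source path="client"/>
  <source path="shared"/>
</module>
```

Everything under server then stays ordinary JVM code, even though it lives in the same source tree.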
What you have is called a Maven multi-module project. Take a look at this tutorial in the Maven book.
So, I can easily get this to compile in eclipse with m2eclipse, but then I can't seem to find how to get it to compile with maven.
-- I am not sure what you meant by this. The M2Eclipse plugin uses Maven to build your modules. Perhaps you can clarify this part. I hope the tutorial link helps.
Try following this tutorial: http://maven.apache.org/plugins/maven-eclipse-plugin/reactor.html
The main idea is to start by creating an empty project with mvn archetype:create (archetype:generate in current versions of the archetype plugin) and then put your sources into the structure Maven created.
I can also strongly recommend checking your dependency tree and effective POM with the Eclipse plugin while you do this, to avoid duplicate dependencies and other problems.
