Find minimal necessary Java classpath

Is there a tool to detect unneeded jar-files?
For instance say that I have myapp.jar, which I can launch with a classpath containing hibernate.jar, junit.jar and easymock.jar. But actually it will work fine using only hibernate.jar, since the code that calls junit.jar is not reachable.
I realize that reflection might complicate things, but I could live with a tool that ignored reflection. Apart from that, it seems like a relatively simple problem to solve.
If there is no such tool, what are best practices for deciding which dependencies are needed? It seems to me that it must be a common problem.

This is not possible in a system that might use reflection.
That said, a static analysis tool could do a pretty good job if you don't use ANY reflection.

Have you taken a look at Dependency Finder?
http://depfind.sourceforge.net/
A handy list of most of the other available Java dependency tools is also available on that site.

I have used http://code.google.com/p/jarjar/ and found it to be pretty good.
Also, you will find out if you have broken any reflection easily if you have a good set of unit/acceptance tests :).

Something to add to Bill K's reply: you might not use reflection at all, but the JARs you are using might. I remember running into something like that with Xalan and Xerces, where a ClassNotFoundException was thrown at runtime.

JVM: most simple way to alter code of a dependency library?

Most of the time, I don't like JavaScript and would prefer strict, compiled languages like Scala, Java, Haskell...
However, one thing that is nice with JavaScript is being able to easily change the code of external dependencies. For example, if you have a bug and you think it's in one of your dependency libraries, you can easily hack around, swap a library method for your own override, and check if it's better. You can even add methods to the Array or String prototypes and things like that... One could even go into node_modules and alter the library code there temporarily.
In the JVM world this seems to me like a heavy process just to get started:
Clone the dependency sources
Hack it
Compile it
Publish it to some local maven/ivy repository
Integrate the fixed version in your project
This is a pain; I just don't want to do that more than once a year.
Today I was trying to fix a bug in my app, and the lib did not give me enough information. I would have loved to just put a logger on one line of that lib to get better insight into what was happening, but instead I tried to hack with the debugger, with no success (the bug was not reproducible on my computer anyway...).
Isn't there any simple alternative for rapidly altering the code of a dependency?
I would be interested in any solution for Scala, Java, Clojure or any other JVM language.
I'm not looking for a production-deployable solution, just a quick solution to use locally, and perhaps deployable to a test environment.
Edit: I'm talking about library internals that are not intended to be modified by the library author. Please assume that the class to change is final, not replaceable by library configuration, and not injectable by any way into the library.
In Clojure you can re-bind vars, also from other namespaces, by using intern. So as long as the code you want to alter is Clojure code, that's a possible way to monkeypatch.
(intern 'user 'inc dec)
(inc 1)
=> 0
This is not something to do lightly though, since it can and will lead to problems with other code not expecting this behavior. It can be handy to use during development to temporarily fix edge cases or bugs in other libraries, but don't use it in published libraries or production code.
Best to simply fork and fix these libraries, and send a pull request to have it fixed in the original library.
When you're writing a library yourself that you expect people need to extend or overload, implement it in Clojure protocols, where these changes can be restricted to the extending/overloading namespaces only.
I disagree that AspectJ is difficult to use; it, or another bytecode-manipulation library, is your only realistic alternative.
Load-time weaving is a definite way around this issue. Depending on how you're using the class in question you might even be able to use a mocking library to achieve the same results, but something like AspectJ, which is specifically designed for augmentation and manipulation, would likely be the easiest.
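For illustration, here's a minimal sketch of an annotation-style aspect, assuming load-time weaving is enabled (-javaagent:aspectjweaver.jar plus an aop.xml that includes this aspect and the library's packages); the library class and method names are hypothetical placeholders:

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

@Aspect
public class LibraryLoggingAspect {

    // Intercept a (hypothetical) library method to log its arguments and
    // return value without touching the library's source or jar.
    @Around("execution(* com.thirdparty.SomeFinalClass.someMethod(..))")
    public Object logSomeMethod(ProceedingJoinPoint pjp) throws Throwable {
        System.out.println("someMethod args: " + java.util.Arrays.toString(pjp.getArgs()));
        Object result = pjp.proceed(); // run the original library code
        System.out.println("someMethod returned: " + result);
        return result;
    }
}

You could just as well skip pjp.proceed() and return a canned value, which covers the "swap a library method for your own override" case from the question, all without recompiling or republishing the library.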

Is writing a library with a dependency on Groovy a good idea?

I am writing a Java library right now that I publish as a Maven artifact and use in a different Java/Groovy project. I was wondering whether in general it is a good idea to write a library that depends on a certain version of Groovy (e.g. has a dependency on groovy-all-2.x.y).
Writing the library in plain Java instead would not be too uncomfortable.
What do you think?
Should I better use a generous version range for the Groovy dependency? Should I rather write a plain Java library?
I guess it depends on how you want it to be used.
If it's not a utility and you don't think other projects will use it, then do whatever you want.
If it's a utility designed to be used in testing, I don't think a Groovy dependency on the test classpath is too bad. I'm sure some projects would still avoid your utility because of the Groovy dependency.
If it's a general utility that you want people to use everywhere, then I'd say a Groovy dependency is definitely a bad idea. I certainly wouldn't use it, and I'm sure many others would avoid it for the same reason.
If you want maximum adoption of your utility, keep the dependencies as few as possible. Groovy is a huge, bloated dependency that many projects will avoid.
I would say it depends on the intended use of this library. If you only plan on using it yourself and are perfectly fine with the Groovy dependency as it is, then leave it that way. If the library is meant to be used by others, then the easiest thing for them might be for you to write it all in Java, since there is then less that can go wrong when they try to use the library. It really comes down to whether the work required to switch it to Java is worth the benefit to you of having it all in Java.

How do I prevent use of beta classes from google guava library?

We have been using Google Collections in production for several months. We would like to start using Guava for additional functionality. However, I'm afraid to bring Guava into our product stack because some developers may start to use 'beta' classes.
We have various unit tests in our code, but at this point I prefer not to include 'beta' classes because they are subject to change in the future.
Is there any easy way to detect whether the project includes any 'beta' Guava classes?
Overstock.com recently released a FindBugs plugin that flags usage of @Beta classes, methods, or fields.
In your unit tests, set up an aspect to log and/or fail when any of the beta classes (or any unwelcome class) is used.
Apparently Google Guava has a @Beta annotation which marks the classes and methods you don't want to use.
Unfortunately this annotation has @Retention(CLASS), which means it is recorded in the .class files but not retained by the VM at runtime, so Class.getDeclaredAnnotations() will not see it; you would have to use ASM, CGLIB or a similar bytecode-level library to find it.
Given that, you might want to instrument your CI build, or add a checking classloader to your app, to detect usage of the beta API.
If you're using eclipse, access rules are one option. You'd get a compile-time error whenever you are importing or otherwise using a restricted class.
Here is a list of Guava's beta classes. You will have to tell other developers to check this link before using a Guava class.
I was thinking you could probably use reflection for that if you had a list of beta classes, which you can get from Gili's link. Then it gets pretty easy - just see this answer:
Can you find all classes in a package using reflection?
I'd probably just put that in a unit test and have the unit test fail if it sees a class you don't like.
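Something along these lines, as a rough sketch only: the beta-class names and the classes under test are hypothetical placeholders, and pure reflection can only catch beta types that show up in signatures, not ones used solely inside method bodies (for those you'd need bytecode analysis):

import static org.junit.Assert.fail;

import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.junit.Test;

public class NoGuavaBetaTest {

    // Taken from a maintained list of Guava beta classes (placeholder values).
    private static final Set<String> BETA_CLASSES = new HashSet<>(Arrays.asList(
            "com.google.common.collect.MinMaxPriorityQueue",
            "com.google.common.io.Closer"));

    // The project classes to check (placeholder; in practice, scan your packages).
    private static final Class<?>[] PROJECT_CLASSES = { com.example.MyService.class };

    @Test
    public void projectClassesDoNotReferenceBetaTypes() {
        for (Class<?> clazz : PROJECT_CLASSES) {
            for (Field f : clazz.getDeclaredFields()) {
                check(clazz, f.getType());
            }
            for (Method m : clazz.getDeclaredMethods()) {
                check(clazz, m.getReturnType());
                for (Class<?> param : m.getParameterTypes()) {
                    check(clazz, param);
                }
            }
        }
    }

    private static void check(Class<?> owner, Class<?> used) {
        if (BETA_CLASSES.contains(used.getName())) {
            fail(owner.getName() + " uses beta class " + used.getName());
        }
    }
}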

Statically checking a Java app for link errors

I have a scenario where I have code written against version 1 of a library but I want to ship version 2 of the library instead. The code has shipped and is therefore not changeable. I'm concerned that it might try to access classes or members of the library that existed in v1 but have been removed in v2.
I figured it would be possible to write a tool to do a simple check to see if the code will link against the newer version of the library. I appreciate that the code may still be very broken even if the code links. I am thinking about this from the other side - if the code won't link then I can be sure there is a problem.
As far as I can see, I need to run through the bytecode checking for references, method calls and field accesses to library classes then use reflection to check whether the class/member exists.
I have three-fold question:
(1) Does such a tool exist already?
(2) I have a niggling feeling it is much more complicated than I imagine and that I have missed something major - is that the case?
(3) Do you know of a handy library that would allow me to inspect the bytecode such that I can find the method calls, references etc.?
Thanks!
I think that Clirr - a binary compatibility checker - can help here:
Clirr is a tool that checks Java libraries for binary and source compatibility with older releases. Basically you give it two sets of jar files and Clirr dumps out a list of changes in the public api. The Clirr Ant task can be configured to break the build if it detects incompatible api changes. In a continuous integration process Clirr can automatically prevent accidental introduction of binary or source compatibility problems.
Changing the library in your IDE will result in all possible compile-time errors.
You don't need anything else, unless your code uses another library, which in turn uses the updated library.
Be especially wary of Spring configuration files. Class names are configured as text and don't show up as missing until runtime.
If you have access to the source code, you could just compile source against the new library. If it doesn't compile, you have definitely a problem. If it compiles you may still have a problem if the program uses reflection, some kind of IoC stuff like Spring etc.
If you have unit tests, then you have a better chance of catching any linking errors.
If you only have the .class files of the program, then I don't know of any tool that would help, besides decompiling the class files to source and compiling that source again against the new library, but that doesn't sound too healthy.
The checks you mentioned are done by the JVM/Java class loader, see e.g. Linking of Classes and Interfaces.
So "attempting to link" can be simply achieved by trying to run the application. Of course you could hoist the checks to run them yourself on your collection of .class/.jar files. I guess a bunch of 3rd party byte code manipulators like BCEL will also do similar checks for you.
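As a rough sketch of hoisting those checks yourself, here's what it could look like with the ASM library (the class name and path are placeholders): collect every class referenced from one of your .class files and try to load each one against the new library version. Extending it to individual methods and fields would mean parsing the descriptors and calling getDeclaredMethod/getDeclaredField as well.

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassVisitor;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;
import org.objectweb.asm.Type;

import java.io.InputStream;
import java.util.Set;
import java.util.TreeSet;

public class ReferenceChecker {

    public static void main(String[] args) throws Exception {
        final Set<String> referenced = new TreeSet<>();

        // Path is a placeholder for one of your shipped classes; in practice you
        // would loop over every .class entry in the shipped jar.
        try (InputStream in = ReferenceChecker.class
                .getResourceAsStream("/com/example/MyShippedClass.class")) {
            new ClassReader(in).accept(new ClassVisitor(Opcodes.ASM9) {
                @Override
                public MethodVisitor visitMethod(int access, String name,
                        String desc, String sig, String[] exceptions) {
                    return new MethodVisitor(Opcodes.ASM9) {
                        @Override
                        public void visitMethodInsn(int opcode, String owner,
                                String mname, String mdesc, boolean itf) {
                            record(referenced, owner);
                        }
                        @Override
                        public void visitFieldInsn(int opcode, String owner,
                                String fname, String fdesc) {
                            record(referenced, owner);
                        }
                    };
                }
            }, ClassReader.SKIP_DEBUG);
        }

        // Try to load each referenced class against the new library version.
        for (String className : referenced) {
            try {
                Class.forName(className, false, ReferenceChecker.class.getClassLoader());
            } catch (ClassNotFoundException | NoClassDefFoundError e) {
                System.out.println("Missing: " + className);
            }
        }
    }

    private static void record(Set<String> referenced, String owner) {
        if (!owner.startsWith("[")) { // ignore array owners
            referenced.add(Type.getObjectType(owner).getClassName());
        }
    }
}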
I notice that you mention reflection in the tags. If you load classes/invoke methods through reflection, there's no way to analyse this in general.
Good luck!

How to determine which classes are used by a Java program?

Is there any tool that lists which classes are effectively used by an app, and when, or, even better, automatically trims JAR libraries to provide only the classes that are both referenced and used?
Bear in mind that, as the halting problem shows, you can't definitively say whether a particular class is or isn't used, at least not in any moderately complex application. That's because classes aren't just bound at compile time but can also be loaded:
based on XML config (eg Spring);
loaded from properties files (eg JDBC driver name);
added dynamically with annotations;
loaded as a result of external input (eg user input, data from a database or remote procedure call);
etc.
So just looking at source code isn't enough. That being said, any reasonable IDE will provide you with dependency analysis tools. IntelliJ certainly does.
What you really need is runtime instrumentation of what your application is doing, but even that isn't guaranteed. After all, a particular code path might come up once in 10 million runs due to a weird combination of inputs, so you can't be sure you're covered.
Tools like this do have some value though. You might want to look at something like Emma. Profilers like Yourkit can give you a code dump that you can do an analysis on too (although that won't pick up transient objects terribly well).
Personally I find little value beyond what the IDE will tell you: removing unused JARs. Going more granular than that is just asking for trouble for little to no gain.
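As a small illustration of the runtime-instrumentation point, here is a minimal sketch of a Java agent (the class and jar names are hypothetical) that just logs every class as it is loaded; what it reports is of course only as good as the code paths you actually exercise:

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class ClassUsageAgent {

    // Invoked by the JVM before main() when started with -javaagent.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                    Class<?> classBeingRedefined, ProtectionDomain domain,
                    byte[] classfileBuffer) {
                // className is in internal form, e.g. com/example/Foo
                System.out.println("loaded: " + className + " from "
                        + (domain == null ? "?" : domain.getCodeSource()));
                return null; // returning null leaves the bytecode unchanged
            }
        });
    }
}

Package it in a jar whose manifest declares Premain-Class: ClassUsageAgent and run the app with java -javaagent:class-usage-agent.jar -jar myapp.jar; the output shows which classes were actually loaded during the run, and from which code source (i.e. which jar).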
Yes, you want ProGuard. It's a completely free Java code shrinker and obfuscator. It's easy to configure, fast and effective.
You might try JarJar http://code.google.com/p/jarjar/
It trims the jar dependencies.
For most cases, you can do it quite easily using just javac.
Delete your existing class files and call javac with the name of your entry classes. It will compile those classes that are necessary, but no more. Job done.
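For example (paths are hypothetical): rm -rf out && javac -d out -sourcepath src -cp lib/hibernate.jar src/com/myapp/Main.java, after which out contains only the classes javac had to compile to satisfy Main, which tells you which of your own sources are actually reachable from that entry point.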
