Statically checking a Java app for link errors - java

I have a scenario where I have code written against version 1 of a library but I want to ship version 2 of the library instead. The code has shipped and is therefore not changeable. I'm concerned that it might try to access classes or members of the library that existed in v1 but have been removed in v2.
I figured it would be possible to write a tool to do a simple check to see if the code will link against the newer version of the library. I appreciate that the code may still be very broken even if the code links. I am thinking about this from the other side - if the code won't link then I can be sure there is a problem.
As far as I can see, I need to run through the bytecode checking for references, method calls and field accesses to library classes then use reflection to check whether the class/member exists.
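A minimal sketch of that idea, using only the JDK: it reads the constant pool of a compiled class, collects every class name referenced there, and tries to load each one (without initializing it) through a given class loader. The class name LinkCheck and its methods are hypothetical. The sketch covers class references only; field and method references (constant pool tags 9 to 11) would need the same treatment plus name and descriptor matching.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;

public class LinkCheck {

    /** Collects the names of all classes referenced in a class file's constant pool. */
    static Set<String> referencedClasses(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort();
        String[] utf8 = new String[count];
        int[] classNameIndex = new int[count];
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                    // Utf8
                case 7: classNameIndex[i] = in.readUnsignedShort(); break; // Class
                case 8: case 16: case 19: case 20:                         // String, MethodType, Module, Package
                    in.readUnsignedShort(); break;
                case 15:                                                   // MethodHandle
                    in.readUnsignedByte(); in.readUnsignedShort(); break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.readInt(); break;                                   // 4-byte entries
                case 5: case 6:                                            // Long, Double use two slots
                    in.readLong(); i++; break;
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        Set<String> names = new TreeSet<>();
        for (int index : classNameIndex) {
            if (index == 0) continue;                 // slot is not a Class entry
            String name = utf8[index];
            if (name.startsWith("[")) continue;       // array types are skipped in this sketch
            names.add(name.replace('/', '.'));
        }
        return names;
    }

    /** Returns the subset of names that cannot be loaded through the given loader. */
    static Set<String> missing(Set<String> names, ClassLoader loader) {
        Set<String> notFound = new TreeSet<>();
        for (String name : names) {
            try {
                Class.forName(name, false, loader);   // false: do not run static initializers
            } catch (ClassNotFoundException | LinkageError e) {
                notFound.add(name);
            }
        }
        return notFound;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            Set<String> refs = referencedClasses(in);
            System.out.println("Unresolvable: " + missing(refs, LinkCheck.class.getClassLoader()));
        }
    }
}
```

Run it with the v2 library and the shipped code on the classpath and a path to one of the shipped .class files as the argument; any name it prints is a class the shipped code refers to that can no longer be loaded.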
I have a three-fold question:
(1) Does such a tool exist already?
(2) I have a niggling feeling it is much more complicated than I imagine and that I have missed something major - is that the case?
(3) Do you know of a handy library that would allow me to inspect the bytecode such that I can find the method calls, references etc.?
Thanks!

I think that Clirr - a binary compatibility checker - can help here:
Clirr is a tool that checks Java libraries for binary and source compatibility with older releases. Basically you give it two sets of jar files and Clirr dumps out a list of changes in the public api. The Clirr Ant task can be configured to break the build if it detects incompatible api changes. In a continuous integration process Clirr can automatically prevent accidental introduction of binary or source compatibility problems.

Changing the library in your IDE will result in all possible compile-time errors.
You don't need anything else, unless your code uses another library, which in turn uses the updated library.

Be especially wary of Spring configuration files. Class names are configured as text and don't show up as missing until runtime.

If you have access to the source code, you could just compile the source against the new library. If it doesn't compile, you definitely have a problem. If it compiles, you may still have a problem if the program uses reflection, some kind of IoC like Spring, etc.
If you have unit tests, then you have a better chance of catching any linking errors.
If you only have the .class files of the program, then I don't know of any tools that would help besides decompiling the class files to source and compiling that source again against the new library, but that doesn't sound too healthy.

The checks you mentioned are done by the JVM/Java class loader, see e.g. Linking of Classes and Interfaces.
So "attempting to link" can be simply achieved by trying to run the application. Of course you could hoist the checks to run them yourself on your collection of .class/.jar files. I guess a bunch of 3rd party byte code manipulators like BCEL will also do similar checks for you.
I notice that you mention reflection in the tags. If you load classes/invoke methods through reflection, there's no way to analyse this in general.
Good luck!

Related

JVM: most simple way to alter code of a dependency library?

Most of the time, I don't like Javascript and would prefer strict and compiled languages like Scala, Java, Haskell...
However, one thing that can be nice with Javascript is being able to easily change the code of external dependencies. For example, if you have a bug and you think it's in one of your dependency libraries, you can easily hack around, swap a library method for your own override and check whether that helps. You can even add methods to the Array or String prototypes and things like that... You could even go to node_modules and temporarily alter the library code there if you wanted to.
In the JVM world this seems to me like a heavy process just to get started:
Clone the dependency sources
Hack it
Compile it
Publish it to some local maven/ivy repository
Integrate the fixed version in your project
This is a pain; I just don't want to do that more than once a year.
Today I was trying to fix a bug in my app, and the lib did not give me enough information. I would have loved to just be able to put a Logger on one line of that lib to get better insight into what was happening, but instead I tried to hack with the debugger with no success (the bug was not reproducible on my computer anyway...)
Isn't there any simple alternative for rapidly altering the code of a dependency?
I would be interested in any solution for Scala, Java, Clojure or any other JVM language.
I'm not looking for a production-deployable solution, just a quick solution to use locally and eventually deployable on a test env.
Edit: I'm talking about library internals that are not intended to be modified by the library author. Please assume that the class to change is final, not replaceable by library configuration, and not injectable by any way into the library.
In Clojure you can re-bind vars, also from other namespaces, by using intern. So as long as the code you want to alter is Clojure code, that's a possible way to monkeypatch.
(intern 'user 'inc dec)
(inc 1)
=> 0
This is not something to do lightly though, since it can and will lead to problems with other code not expecting this behavior. It can be handy to use during development to temporarily fix edge cases or bugs in other libraries, but don't use it in published libraries or production code.
Best to simply fork and fix these libraries, and send a pull request to have it fixed in the original library.
When you're writing a library yourself that you expect people need to extend or overload, implement it in Clojure protocols, where these changes can be restricted to the extending/overloading namespaces only.
I disagree that AspectJ is difficult to use. It, or another bytecode manipulation library, is your only realistic alternative.
Load-time weaving is a definite way around this issue. Depending on how you're using the class in question you might even be able to use a mocking library to achieve the same results, but something like AspectJ, which is specifically designed for augmentation and manipulation, would likely be the easiest.

How to protect a java library from being used in other projects?

I have developed a java library which I consider valuable intellectual property. I want to protect it from being used by unauthorized software.
I shall say my library has a clean API, and I shall distribute the source code of the project which is using it (not the library) to my customer.
I mean I want to change the library somehow so that it only works properly in my company's projects, and no one else can use it in other projects.
What is the best solution to protect the library?
I must add that I can obfuscate the library (but not the customers' application).
2 possibilities:
You want to publish the source code of your app and allow clients to compile it by themselves and to modify the source. In this case, protecting your library is technically impossible, whatever the language.
You give the source only for information, and you don't want them to compile or modify it. In this case I see at least 2 levels of security you can implement:
You compile and obfuscate your application together with the source of your library. That way, all of your public API will be obfuscated and so almost unusable (unless someone really wants to unobfuscate it, good luck...). You can also repackage your classes so that all your library classes are in the same packages as the app, which makes it very hard to know which files are part of the library and what it is doing.
You implement a mechanism at compilation that computes a hash of your app jar and modifies your library source code to check at runtime that the app is really your app.
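The runtime half of that second option could look roughly like this. It is a sketch under assumptions, not a hardened implementation: the class name JarFingerprint and its methods are hypothetical, the expected hash is assumed to be baked into the library at build time, and SHA-256 is just one reasonable choice of digest.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class JarFingerprint {

    /** Computes the SHA-256 digest of a file and returns it as lowercase hex. */
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** True if the jar on disk matches the hash that was recorded at build time. */
    static boolean matches(Path appJar, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        return sha256Hex(appJar).equalsIgnoreCase(expectedHex);
    }
}
```

The library would call matches(...) on the jar it was started from and refuse to work on a mismatch. Anyone who can read the obfuscated library can also remove that call, which is why this only adds a second hurdle.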
I believe that obfuscating is enough. If someone succeeds in understanding your obfuscated code, they will crack solution 2 quite easily.
Except that, you cannot do anything, there is no mechanism for that.
For obfuscating I strongly recommend ProGuard.

Retrieving a list of classes from a package in an Android project

I'm aware that it isn't easily feasible to get all of the classes in a package using reflection, but I'm wondering if someone knows of a good solution/workaround, specifically for an Android project?
Given a package, I need to be able to retrieve all of the classes from it and process annotations from them using reflection.
Does anyone know of a way to do this? Are there any libraries available?
Scanning the filesystem as most solutions for non-Android Java do won't help on Android. Here's a (theoretical) solution that is android-specific: http://mindtherobot.com/blog/737/android-hacks-scan-android-classpath/
However, it remains a hack, since Java unfortunately does not directly support this.
Existing dependency injection solutions use reflection for processing the annotations, but still need the resources to be declared. See this example of DI using reflection.
If you are using Ant to build your artifacts, you could read the contents of your source directory using Bash or Java, and use this to regenerate the full hierarchy of classes automatically during each build. This might make things tricky if you rely heavily on the Eclipse IDE though, since the list might be out of date until you run another Ant build. (Note: according to Pyscho you can make Eclipse use Ant by altering the project configuration, see comments)
Another option might be to process the AndroidManifest file using the AssetManager, but you would be limited to the resources declared in that file. The compiled classes themselves are in-lined and optimised in the classes.dex file, and as such you're unlikely to get much useful information from it.
I think you might find the answer here https://stackoverflow.com/a/1457971/1199538
There is a java file attached so you can download it and try it.
A short snippet from the answer follows:
This method can only be used when:
You have a class that is in the same package you want to discover. This class is called a SeedClass. For example, if you want to list all classes in 'java.io', the seed class may be java.io.File.
Your classes are in a directory or in a JAR file that has source file information (not the source code file, just the source file location). As far as I've tried, it works almost 100% of the time, except for the JVM classes (those classes that come with the JVM).
Your program must have permission to access ProtectionDomain of those classes. If your program is loaded locally, there should be no problem.
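For plain (non-Android) Java, the seed-class idea can be sketched as follows. This is my own illustration, not the file attached to the linked answer: PackageLister and its methods are hypothetical names, only the directory case is handled (a JAR would be walked through its entries in the same way), and none of it applies to classes packed into classes.dex.

```java
import java.io.IOException;
import java.net.URISyntaxException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class PackageLister {

    /** Directory or JAR the seed class was loaded from, or null if unknown (e.g. JDK classes). */
    static Path codeSourceOf(Class<?> seed) throws URISyntaxException {
        CodeSource source = seed.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return null;
        }
        // Assumes a file: URL; other schemes would need their own handling.
        return Paths.get(source.getLocation().toURI());
    }

    /** Binary names of the classes directly inside pkg under a classes directory, sorted. */
    static List<String> classesIn(Path classesDir, String pkg) throws IOException {
        Path dir = classesDir.resolve(pkg.replace('.', '/'));
        List<String> names = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return names;
        }
        try (Stream<Path> files = Files.list(dir)) {
            files.map(p -> p.getFileName().toString())
                 .filter(n -> n.endsWith(".class"))
                 .sorted()
                 .forEach(n -> names.add(pkg + "." + n.substring(0, n.length() - ".class".length())));
        }
        return names;
    }
}
```

A caller would combine the two, for example classesIn(codeSourceOf(Seed.class), Seed.class.getPackage().getName()), and then load each name with Class.forName to inspect its annotations.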
You can do classpath scanning for Android at compiletime, before the JVM bytecodes have been converted to Dalvik bytecodes, e.g. using the ClassGraph library (I am the author):
https://github.com/classgraph/classgraph/wiki/Build-Time-Scanning

Plugging in to Java compilers

I have a post-compilation step that manipulates the Java bytecode of generated classes. I'd like to make life as painless as possible for library consumers, so I'm looking at ways I can make this process automatic and (if possible) compiler agnostic.
The Annotation Processing API provides many of the desired features (automatic service discovery; supported by Eclipse). Unfortunately, this is aimed at code generators and doesn't support manipulation of existing artefacts:
The initial inputs to the tool are considered to be created by the zeroth round; therefore, attempting to create a source or class file corresponding to one of those inputs will result in a FilerException.
The Decorator pattern recommended by the API is not an option.
I can see how to perform the step with a runtime agent/instrumentation, but this is a worse option than a manual build step as it would require anyone even peripherally touched by the API to configure their JVMs in a non-obvious manner.
Is there a way to plug into or wrap the compiler tool as invoked by javac? Has anyone successfully subverted the annotation processors to manipulate bytecode, no matter what the doc says?
The Groovy compiler is the only bytecode compiler that lets you hook into the compilation process (example: Generate bytecode to support the Singleton pattern)
The Annotation Processing API is not meant to change the code. As you have already found out, all you can do is install a classloader, examine the bytecode at runtime and manipulate it. It's braindead but it works. This follows the general "we're afraid that a developer could try something stupid" theme which you will find throughout Java. There is no way to extend javac. The relevant classes are either private, final or will change with the next version of Java.
Another option is to write annotated Java, for example you write a class "ExampleTpl.java". Then, you use a precompiler which expands the annotations in that file to get "Example.java". In the rest of the code, you use Example and ignore ExampleTpl.
For Eclipse, there is a bug report to automate this step. I'm not aware of any other work in this area.
It can be done.
Take a look at my blog post Roman Numerals, in our Java where an annotation processor is used to rewrite code. Limitation being that it works with Sun's javac only.

How to determine which classes are used by a Java program?

Is there any tool that lists which classes are actually used by an app, and when, or, even better, automatically trims JAR libraries to only provide classes that are both referenced and used?
Bear in mind that, as proven by the halting problem, you can't definitely say that a particular class is or isn't used. At least on any moderately complex application. That's because classes aren't just bound at compile-time but can be loaded:
based on XML config (eg Spring);
loaded from properties files (eg JDBC driver name);
added dynamically with annotations;
loaded as a result of external input (eg user input, data from a database or remote procedure call);
etc.
So just looking at source code isn't enough. That being said, any reasonable IDE will provide you with dependency analysis tools. IntelliJ certainly does.
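To make the problem concrete, here is a small sketch of the properties-file case. The class name DynamicLoad and the key list.impl are made up for illustration. Nothing in the compiled code names the class that ends up being loaded, so a tool that only follows references in source or bytecode cannot know it is needed.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class DynamicLoad {

    /** Instantiates whatever class the configuration names under the given key. */
    static Object instantiate(Properties config, String key) throws ReflectiveOperationException {
        String className = config.getProperty(key);
        // The class name exists only as data, so it is invisible to static reference analysis.
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws IOException, ReflectiveOperationException {
        Properties config = new Properties();
        config.load(new StringReader("list.impl=java.util.ArrayList"));
        Object list = instantiate(config, "list.impl");
        System.out.println(list.getClass().getName());
    }
}
```

Swap the value in the configuration and a different class is loaded, with no change to any .class file.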
What you really need is runtime instrumentation of what your application is doing, but even that isn't guaranteed. After all, a particular code path might come up once in 10 million runs due to a weird combination of inputs, so you can't be sure that you're covered.
Tools like this do have some value though. You might want to look at something like Emma. Profilers like Yourkit can give you a code dump that you can do an analysis on too (although that won't pick up transient objects terribly well).
Personally I find little value beyond what the IDE will tell you: removing unused JARs. Going more granular than that is just asking for trouble for little to no gain.
Yes, you want ProGuard. It's a completely free Java code shrinker and obfuscator. It's easy to configure, fast and effective.
You might try JarJar http://code.google.com/p/jarjar/
It trims the jar dependencies.
For most cases, you can do it quite easily using just javac.
Delete your existing class files. Call javac with the names of your entry classes. It will compile the classes that are necessary, but no more. Job done.
