We are going to obfuscate our project, but we don't want to lose the ability to do remote debugging and hot-swapping.
Is it possible? Which tools can handle this? I'd be happy with simple obfuscation - just renaming classes/methods/variables.
[Edited] We're using IntelliJ IDEA but weren't able to find any plugin for this task.
We have the same kind of needs (simple obfuscation, need to debug later)
and we use ProGuard. It's a Java app that can be integrated into an Ant task.
It can do a lot of things, but it's also fully tunable, so you can keep your obfuscation simple. One of the options is to generate a "symbol correspondence table" (a mapping file), which allows you to retrieve the non-obfuscated names from the obfuscated ones (it keeps track of the fact that the variable xyz in the class qksdnqd is in fact myCuteVarName in the class MeaningfulClassName).
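As a rough sketch, a minimal ProGuard configuration along those lines might look like the following (jar paths are placeholders; the options themselves are standard ProGuard flags):

    # Keep it to renaming only: no shrinking, no bytecode optimization
    -dontshrink
    -dontoptimize

    # (placeholder paths) input/output jars; add -libraryjars for the JDK runtime as usual
    -injars  build/myapp.jar
    -outjars build/myapp-obfuscated.jar

    # Write the symbol correspondence table: obfuscated name -> original name
    -printmapping build/mapping.txt

    # Keep line numbers and the source-file attribute so remote debugging and
    # stack traces still map back to your original sources
    -keepattributes SourceFile,LineNumberTable
    -renamesourcefileattribute SourceFile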
Edit: Obfuscation can be tricky. Some examples:
You can't change the name of your main method.
Do you use a classloader? Can it still retrieve the class after the obfuscation?
What about your ORM mappings, or your Spring context (if any)?
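With ProGuard, those cases are handled with explicit -keep rules: anything that is an entry point or gets loaded by name has to keep its original name. A sketch, with made-up class and package names:

    # Keep the main class and its entry point untouched
    -keep public class com.example.Main {
        public static void main(java.lang.String[]);
    }

    # Classes referenced by name (custom classloader, ORM mappings, Spring XML, ...)
    # must keep their names, and usually their members too
    -keep class com.example.plugins.** { *; }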
Edit2:
You can also see:
Do you obfuscate your commercial Java code?
See SD Java Obfuscator. It strips comments and whitespace, and renames all members/methods/class names that aren't public.
It also provides you with a map of how the code was obfuscated, e.g., for each symbol FOO obfuscated as XYZ, a map entry FOO->XYZ. This means that if you get a backtrace mentioning XYZ, you can easily determine the original symbol FOO. Of course, since only you (the person doing the obfuscation) have this map, only you can do this.
I want to modify the rule, or make it target only public interfaces (not public classes etc.). Is this possible? I'm using this rule on Java code, but it's too strict for my project, and I would love to know if there is a way to change it a little bit.
Link for rule: https://rules.sonarsource.com/java/RSPEC-1213
For an existing ruleset on SonarQube, talk to your sonar administrator to change the rules that are enforced on the code and remove that particular one from global enforcement.
There have been a few times I've gone to the admins of the tool for the install that I use and said "this rule isn't one that I care about or will enforce and only makes it confusing" and had them remove that rule from the globally run ruleset.
Is it possible to write your own rule?
Yes, it is possible. From SonarQube's docs on Adding coding rules, you have some options: either you can write a plugin for SonarQube and add that to your instance (docs), or you can write an external application that analyzes the code, whose report SonarQube then consumes.
If you don't have your own instance of SonarQube, or aren't up to writing the associated plugin or external tooling... you might want to look at PMD instead (site).
For PMD, writing a custom rule can be much simpler (docs). One of the ways that PMD works is by 'compiling' the Java code into an XML representation of its abstract syntax tree and then running XPath queries against that XML (tutorial).
The XPath rule can then be included in a project's configuration.
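As a rough illustration of the mechanics (not a working replacement for RSPEC-1213): the rule name, message, and XPath below are made up, and the exact AST node and attribute names depend on your PMD version, so check the PMD docs before relying on them.

    <?xml version="1.0"?>
    <ruleset name="custom-rules"
             xmlns="http://pmd.sourceforge.net/ruleset/2.0.0">
      <description>Project-specific rules</description>

      <!-- Hypothetical rule that only looks at public interfaces -->
      <rule name="PublicInterfaceConvention"
            language="java"
            message="Public interfaces should follow our declaration ordering"
            class="net.sourceforge.pmd.lang.rule.XPathRule">
        <properties>
          <property name="xpath">
            <value><![CDATA[
    //ClassOrInterfaceDeclaration[@Interface = true() and @Public = true()]
            ]]></value>
          </property>
        </properties>
      </rule>
    </ruleset>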
What about turning it off for the code that I'm working on?
If a specific rule is one that you don't want to invoke, you could suppress it with @SuppressWarnings("java:S106") (that particular suppression is for the System.out.println rule, but the same structure can be used for other warnings) or by adding a // NOSONAR comment on the line (any text after NOSONAR, such as "too strict", is just a note for future readers). There are spots where I have such comments because following the rule for a particular piece of code is problematic, and I suppress it for that line, method, or class - with a comment about why that is done.
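For example (the class is just illustrative; java:S106 is the System.out rule mentioned above):

    public class Banner {
        // Suppress one specific rule for this method, with a note on why
        @SuppressWarnings("java:S106")
        void print() {
            System.out.println("=== my tool ==="); // NOSONAR - console output is intentional here
        }
    }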
That particular rule... I'm gonna agree with the Java (and now Oracle) guidelines and follow it. The reason is that if anyone else works on the code, they'll expect it to follow that convention. Having a consistent understanding of where things should be in code, so that another developer doesn't need to dig through an entire file to find the constructor when it is expected to be at the top (under the field definitions), is a good thing. What's more, it limits the future cases where a developer goes through to make things consistent with conventions, which results in a lot of "style: update code to follow the style guide" commits later.
The question is whether the functionality I describe below already exists, or whether I need to make an attempt at creating it myself. I am aware that I am probably looking at a lot of work if it does not exist yet, and I am also sure that others have already tried. I am nevertheless grateful for comments such as "project A tried this, but..." or "dude D already failed because...". If somebody has an overall more elegant solution, that would of course be welcome as well.
I want to change the way I develop (private) Java code by introducing a multiplexing layer. What I mean by that is that I want to be able to create library-like, parameterizable AST snippets, which I want to insert into my code via some sort of placeholder (such as annotations). I am aware of the project https://projectlombok.org/ and have found that, while it is useful for small applications, it does not generally suit my requirements, as it does not seem possible to insert one's own snippets without forking the entire project and making major modifications. Also, Lombok only ever modifies a single file at a time, while I am looking for a solution that will need to 'know about' multiple files at a time.
I imagine a structure like this:
Source S: (Parameterizable) AST-snippets that can be included via some sort of reference in Source A.
Source A: Regular Java code, in which I can reference snippets from Source S. This code will not be compiled directly, as it is lacking the referenced snippets and would thus produce a lot of compile-time errors.
Source T: Target Source, which is an AST-equivalent copy of Source A, except that all references of AST-Snippets have been replaced by their respective Snippet from Source S. It needs to be mappable to the original Source A as well as the resolved snippets from Source S, where applicable, as most development will happen there.
I see several challenges with this concept, not the least of which are debuggability, source-mapping and compatibility with other frameworks/APIs. Also, it seems a challenge to work around the one-file-at-a-time limitation, memory wise.
The advantage over Lombok would be flexibility: Lombok only provides a fixed set of snippets for specific purposes, whereas this would enable developers to write their own snippets, or to modify getters, setters, etc. Also, Lombok hooks into the compilation step and does not output the 'fused' source, AFAIK.
I want to target at least javac and eclipse's ecj compilers.
So I have a Java application I will be releasing to one of my communities for a price. The app is just about complete and ready to be obfuscated, but the problem is:
I found that when I add the JAR to another project in Eclipse, you can instantiate classes externally and use my program as an external library to write scripts outside of my program. This is not what I want to achieve here... I'm self-taught, so I have grey areas in my knowledge since I haven't learned formally, but I'm still pretty experienced in Java. I've tried googling it and nothing's coming up; maybe I'm not phrasing it correctly. But if I could get some help, it would be appreciated.
Here is my structure of my packages:
src.com
Contains main class
src.com.scripts
Contains Abstract Script class
src.com.scripts.impl
Contains the actual scripts that extend the abstract Script class
What I've tried doing:
I removed the public modifier from the abstract Script class, but then it isn't visible to the main class that calls it, as that class is in the parent package. So how can I go about this when my project is organized into packages and they all need to access each other?
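For what it's worth, this is the visibility problem in miniature: a package-private (no modifier) type is only visible inside its own package, so the main class one package up can no longer see it. The package and class names follow the structure above.

    // file: com/scripts/Script.java
    package com.scripts;

    abstract class Script {            // package-private: invisible outside com.scripts
        abstract void run();
    }

    // file: com/Main.java
    package com;

    public class Main {
        public static void main(String[] args) {
            // com.scripts.Script s = ...;   // does not compile: Script is not visible here
        }
    }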
There is no solution.
If people want to reverse engineer your code, they will. There is nothing you can do to change that. public/private are essentially meaningless beyond helping you write good portable code.
That being said, Java is generally much easier to reverse engineer and make bindings to than other languages. Java doesn't inline functions and unless told otherwise, it will even leave all of your class and method names intact. If you had used a language like C, the optimized binary would be a bigger pain to work with, but the result would still be the same.
Just obfuscate the jar and call it a day. Manually changing how you write your code is more harmful to you than it is to them.
I have a scenario where I have code written against version 1 of a library but I want to ship version 2 of the library instead. The code has shipped and is therefore not changeable. I'm concerned that it might try to access classes or members of the library that existed in v1 but have been removed in v2.
I figured it would be possible to write a tool to do a simple check to see if the code will link against the newer version of the library. I appreciate that the code may still be very broken even if the code links. I am thinking about this from the other side - if the code won't link then I can be sure there is a problem.
As far as I can see, I need to run through the bytecode checking for references, method calls and field accesses to library classes then use reflection to check whether the class/member exists.
I have a three-fold question:
(1) Does such a tool exist already?
(2) I have a niggling feeling it is much more complicated than I imagine and that I have missed something major - is that the case?
(3) Do you know of a handy library that would allow me to inspect the bytecode such that I can find the method calls, references etc.?
Thanks!
I think that Clirr - a binary compatibility checker - can help here:
Clirr is a tool that checks Java libraries for binary and source compatibility with older releases. Basically you give it two sets of jar files and Clirr dumps out a list of changes in the public api. The Clirr Ant task can be configured to break the build if it detects incompatible api changes. In a continuous integration process Clirr can automatically prevent accidental introduction of binary or source compatibility problems.
Changing the library in your IDE will reveal all possible compile-time errors.
You don't need anything else, unless your code uses another library, which in turn uses the updated library.
Be especially wary of Spring configuration files. Class names are configured as text and don't show up as missing until runtime.
If you have access to the source code, you could just compile source against the new library. If it doesn't compile, you have definitely a problem. If it compiles you may still have a problem if the program uses reflection, some kind of IoC stuff like Spring etc.
If you have unit tests, then you have a better chance of catching any linking errors.
If you only have the .class files of the program, then I don't know of any tools that would help, other than decompiling the class files to source and compiling that source against the new library, but that doesn't sound too healthy.
The checks you mentioned are done by the JVM/Java class loader, see e.g. Linking of Classes and Interfaces.
So "attempting to link" can be simply achieved by trying to run the application. Of course you could hoist the checks to run them yourself on your collection of .class/.jar files. I guess a bunch of 3rd party byte code manipulators like BCEL will also do similar checks for you.
I notice that you mention reflection in the tags. If you load classes/invoke methods through reflection, there's no way to analyse this in general.
Good luck!
Is there any tool that lists which classes are effectively used by an app (and when), or, even better, automatically trims JAR libraries to provide only the classes that are both referenced and used?
Bear in mind that, as the halting problem implies, you can't definitively say that a particular class is or isn't used, at least not for any moderately complex application. That's because classes aren't just bound at compile time but can also be loaded:
based on XML config (eg Spring);
loaded from properties files (eg JDBC driver name);
added dynamically with annotations;
loaded as a result of external input (eg user input, data from a database or remote procedure call);
etc.
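For example, nothing in the following code names the class that actually gets loaded; it comes from a properties file at runtime (the file and property names are illustrative):

    import java.io.FileInputStream;
    import java.util.Properties;

    public class DriverLoader {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.load(new FileInputStream("app.properties"));   // e.g. jdbc.driver=org.postgresql.Driver
            // Static analysis of this source cannot tell which class ends up being loaded here
            Class<?> driver = Class.forName(props.getProperty("jdbc.driver"));
            System.out.println("Loaded " + driver.getName());
        }
    }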
So just looking at source code isn't enough. That being said, any reasonable IDE will provide you with dependency analysis tools. IntelliJ certainly does.
What you really need is runtime instrumentation of what your application is doing, but even that isn't guaranteed. After all, a particular code path might come up once in 10 million runs due to a weird combination of inputs, so you can't be guaranteed that you're covered.
Tools like this do have some value though. You might want to look at something like Emma. Profilers like YourKit can give you a code dump that you can analyze too (although that won't pick up transient objects terribly well).
Personally I find little value beyond what the IDE will tell you: removing unused JARs. Going more granular than that is just asking for trouble for little to no gain.
Yes, you want ProGuard. It's a completely free Java code shrinker and obfuscator. It's easy to configure, fast and effective.
You might try JarJar http://code.google.com/p/jarjar/
It trims the jar dependencies.
For most cases, you can do it quite easily using just javac.
Delete your existing class files. Call javac with the names of your entry classes. It will compile the classes that are necessary, but no more. Job done.
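For instance, assuming your sources live under src and com.example.Main is an entry class (paths are placeholders), javac follows the sourcepath and compiles only what Main transitively references:

    javac -d out -sourcepath src src/com/example/Main.java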