I am looking for a Java obfuscator with Maven plugin support. We tried ProGuard, but we ran into some runtime issues that don't occur when we don't obfuscate. Are there any alternative obfuscators?
Since ProGuard operates in a manner similar to most other Java obfuscators (at least the ones that I'm familiar with), it's pretty likely that you'll run into similar problems. (In fact, ProGuard goes out of its way to emit compliant bytecode, while some other obfuscators are rumored to be less vigilant about this.)
What sort of problems were you having? Typically the issues with obfuscation come from name mangling - other libraries being unable to locate public classes / methods / fields, or problems using reflection. This is often solvable by being very careful about which class and method names you allow to be mangled.
The last time I used obfuscation on a Java project, we were fairly conservative about what was obfuscated. We placed the classes we wanted obfuscated into a subpackage of their original package called 'internal', and we obfuscated only *.internal.*. We found this much more usable than trying to determine what not to obfuscate.
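For example, a hypothetical sketch of that setup (the com.example package name is made up, and you should double-check the negated class-name syntax against your ProGuard version):

    # Keep everything that is NOT in an 'internal' subpackage;
    # only classes matching *.internal.* end up with mangled names.
    -keep class !com.example.**.internal.**,com.example.** { *; }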
Another issue with obfuscators is their optimization. Although I've not seen bugs from optimization in ProGuard, it's certainly not impossible. Regardless, I turn this off for a few reasons: first, when you get an (obfuscated) stack trace from a customer, it's hard enough to unmangle the names to determine what went wrong. If your obfuscator has optimized anything, that stack trace is likely meaningless. Second, it's unnecessary: the JVM is very, very good at optimizing bytecode, and compile-time optimization interferes with that (thus potentially making things worse).
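For what it's worth, the relevant ProGuard options look like this (a minimal sketch; mapping.txt is just a name I chose). The mapping file lets you unmangle customer stack traces later with ProGuard's ReTrace tool:

    # Disable bytecode optimization; leave that job to the JVM's JIT.
    -dontoptimize
    # Write out the obfuscation name mapping, for de-obfuscating stack traces.
    -printmapping mapping.txt
    # Keep line numbers so retraced stack traces point at real source lines.
    -keepattributes SourceFile,LineNumberTable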
Stringer Java Obfuscation Toolkit has a great set of options for integration with the IDE and build system, including Maven.
Several years ago, I had a similar problem to yours.
If I remember correctly, ProGuard optimized a short private method incorrectly:
it discarded the effect of the method's "synchronized" keyword during inlining.
We fixed this problem by using ProGuard's -dontoptimize option.
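To illustrate the class of bug (a hypothetical reconstruction, not our actual code): if an optimizer inlines a synchronized method into its caller but drops the implicit monitor acquisition, the increment below is no longer atomic and concurrent callers can lose updates:

    public class Counter {
        private int count;

        // The synchronized keyword makes the read-increment-write atomic.
        // If inlining discards it, two threads can interleave here.
        private synchronized void increment() {
            count++;
        }

        public void incrementMany(int n) {
            for (int i = 0; i < n; i++) {
                increment();
            }
        }
    }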
Now, I know that...
Anything can be reverse engineered, given enough time and resources.
However, would compiling your Java code to native code with a tool like GCJ make it more difficult to decompile? I mean, given a few minutes, I can decompile a .jar using JD-GUI, and it is relatively accurate. Most of the "Java to EXE" converters are just .exe launchers for the JVM, and while there are many benefits to the JVM, I have been led to believe that security of the source code is not one of them.
Bottom line: Can you use something like GCJ to compile your Java source (or .class) files to native machine code, and if so, will that protect it from decompiling?
EDIT: Ideally, it would be something more than just obfuscation. The specific project is a commercial game, so what we are looking for is a way to make it more difficult to get to the source code to begin with, not just understand it. Also, I'm not sure that Steam accepts .jars, and we are planning on submitting it to the new Green Light project.
I wouldn't choose that approach just for source-security.
Check out some of the obfuscator tools out there, like ProGuard.
If you want to see what those tools do to your source code, just try reading the decompiled Minecraft jar if you have one on hand.
A downside to using this is that if your code depends on reflection, you'll have to configure the tool to ignore those functions/classes/whatever, as they will not be found at runtime otherwise.
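For instance, with ProGuard that configuration is a -keep rule (a sketch; com.example.plugins is a made-up package standing in for whatever you load reflectively):

    # Keep the names and members of classes that are looked up via reflection,
    # e.g. Class.forName("com.example.plugins.SomePlugin").
    -keep class com.example.plugins.** { *; }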
Technically, yes. Using something like GCJ will make it harder to decompile; however, keep in mind that you are losing some major benefits of using Java if you do this. Namely, you lose the ability to write cross-platform applications.
You could use an obfuscator to make the code harder to decompile AND still keep the benefits of using Java.
A source code obfuscator like
this, this, and this
makes your variables, functions, etc. unreadable by others (the names carry no logical meaning). You should read here too!
I was just messing around. I downloaded dex2jar (http://code.google.com/p/dex2jar/) and the Java Decompiler JD-GUI (http://java.decompiler.free.fr/?q=jdgui).
I got my own apk file (signed, sealed, and on Google Play) and used dex2jar to convert it into a jar.
command line (Windows users use .bat, everyone else .sh):
d2j-dex2jar.bat -f MyAwesomeApp.apk
I dragged and dropped the output into JD-GUI, and all the class files and the original code reappeared.
I was taken aback a bit. Is my java/Android code this exposed? How is ProGuard protecting my apk if it can be decompiled and regenerated so easily? It doesn't seem obfuscated at all...
Thanks in advance.
Obfuscators usually simply change class, method, and field names to names that have no meaning. So, if you have "ScoreCalculator.computeScore(Player p, Match m)" you end up with "A.zk(F f, R r)". This is similar to what UglifyJS or the Closure Compiler does for JavaScript, except that in JavaScript it is done to reduce source length.
It is possible to understand what the method does anyway, it is only harder.
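A hypothetical before/after of that ScoreCalculator example, as a decompiler might show it (the method bodies and mangled names are illustrative):

    // Original source:
    public class ScoreCalculator {
        public int computeScore(Player p, Match m) {
            return p.getBase() * m.getMultiplier();
        }
    }

    // Decompiled after name mangling: same structure, meaningless names.
    public class A {
        public int zk(F f, R r) {
            return f.a() * r.b();
        }
    }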
Also, Java uses late binding (as DLLs or SO files do). So calls that go outside your code (to the java.util, java.lang, etc. packages) cannot be obfuscated. Likewise, if your code needs to receive calls from outside (a typical example: registering a listener on a button), that code cannot be obfuscated. The same happens with a DLL, where you can clearly see the names of the methods that need to be called from outside the DLL, and its calls to other DLLs.
However, the mapping between a certain source code and the compiled code is not necessarily one-to-one. Older C compilers used to produce the same op code for a given source directive, so decompilers were very effective. Then C compilers added many optimizations to the resulting op code, and these optimizations made decompilers mostly ineffective [1].
Java never implemented (many) optimizations at compile time: because it has to run on different platforms (including different Android devices), Java applies serious optimizations later, at run time, based on the architecture and hardware properties of the running device (this is mostly what "HotSpot" is about [2]).
Good obfuscators usually also reorder bytecode instructions, or insert some useless ones, or apply some optimizations upfront to make decompilers unable (or less able) to derive source code so easily.
This technique is useless against people who can read bytecode, just as any C obfuscation is useless against a person who can read assembler code.
As many cracking tools demonstrate, reverse engineering is always possible, even with C or other languages, even on firmware (think of iPhone firmware), because the client your code runs on is always untrusted and can always be tampered with.
If you have very mission-critical code, something worth a lot of money that someone else may steal, I'd suggest running it server side, or validating it server side somehow.
I might also add that there is a modern alternative to this APKTool->dex2jar->JD-GUI route!
Just try the open-source APK and DEX decompiler called Jadx: https://sourceforge.net/projects/jadx/files/
It also has an online version here: http://www.javadecompilers.com/apk
I am considering runtime byte-code generation/modification for a Java project.
Two important, and still maintained, APIs are ASM and Javassist.
ASM is the fastest at generating the code, and probably the most powerful. But it's also a lot less user-friendly than Javassist.
In my case, I want to perform the bytecode manipulation upfront, so that it is complete at the end of the application setup phase. So the speed of manipulation/generation is not critical. What is critical, is the speed of the generated code, because it will be part of a real-time desktop game, not the typical web-app, where the network delays hide the costs of reflection completely.
So my question is, does Javassist introduce some unnecessary overhead in the byte-code, which would not be present when using ASM? Or, expressed another way, is working at the ASM level going to provide me with a speed boost in the generated code compared to working with Javassist?
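To make this concrete, here is the kind of generation I mean (a minimal Javassist sketch; "GeneratedAdder" and "add" are made-up names):

    import javassist.ClassPool;
    import javassist.CtClass;
    import javassist.CtNewMethod;

    public class JavassistDemo {
        public static void main(String[] args) throws Exception {
            ClassPool pool = ClassPool.getDefault();
            // Define a new class and compile a method body from a source string.
            // Javassist's embedded compiler turns the string into bytecode;
            // with ASM you would emit the equivalent instructions by hand.
            CtClass cc = pool.makeClass("GeneratedAdder");
            cc.addMethod(CtNewMethod.make(
                    "public int add(int a, int b) { return a + b; }", cc));
            Class<?> clazz = cc.toClass();
            Object adder = clazz.getDeclaredConstructor().newInstance();
            System.out.println(clazz.getMethod("add", int.class, int.class)
                    .invoke(adder, 2, 3)); // prints 5
        }
    }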
[EDIT] I'm interested in the newest version of both tools, and mostly interested to see if anyone tried them both on the same problem, and saw any significant difference in the speed of the resulting classes.
I don't think it would be possible to provide a simple objective answer to this. It would (I imagine) depend on the versions of the respective tools, the nature of the code you are generating, and (most importantly) whether you are using the respective tools as well as they can be used.
The other thing you should consider is whether going down the byte-code manipulation route is likely to give you enough performance benefit to compensate for the increased software development and maintenance pain. (And that can't be answered by anyone but yourself ...)
Anyway, my advice would be to only go down this route if you've already tried the "pure Java" approach and found it to give unacceptable performance.
Ok I'm kinda in a predicament right now.
I have a java program that I have split into a core/outside sections.
I collated my core into a java library that the outside code (which will be publicly released) can reference.
However, I do not want the contents of this java library to be decompilable.
So I went to find a good java obfuscator.
What complicates my situation is the fact that my java library isn't exactly modular - it references/changes state of outside code (yes it's terrible but whatever)
I've tried demos of all the premium obfuscators (ZKM, Allatori, etc.) and the free ones, but they either
have very weak control-flow obfuscation (which is the main thing I need), or
halt because of references to libraries/dependencies that are not in the jar itself but are still referenced.
Any advice?
Obfuscation does not prevent decompilation; code can always be decompiled. It just helps make your code less readable afterwards. Obfuscate only your logic; keep interfaces untouched.
If there are no obfuscators that have control flow obfuscation which meet your standards, then you will have to write your own obfuscator or submit a request to an existing vendor to improve their product.
Run both the external and internal parts of the program through the obfuscator, together at the same time. But write exclude rules for all of the external code. You should also write exclude rules for the public API of your internal code. If you don't have a public API layer on your internal code, then you are going to have a hard time, because your external code will have to refer to your internal code by obfuscated names, which will make for very unmaintainable and hard to read external source code.
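In ProGuard terms, the rules would look something like this (a sketch; the com.example package names are placeholders for your actual layout):

    # External (public) code: exclude it from obfuscation entirely.
    -keep class com.example.app.** { *; }
    # Internal library: keep its public API layer readable;
    # everything behind it gets obfuscated.
    -keep public class com.example.core.api.** {
        public *;
    }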
I second ahanin's comment.
But if you're looking for an obfuscator, a good one with a pretty robust rule set is ProGuard. It's used heavily in the Android space, where code needs to be made as minimal (small) as possible.
Link: ProGuard
The canonical JVM implementation from Sun applies some pretty sophisticated optimization to bytecode to obtain near-native execution speeds after the code has been run a few times.
The question is, why isn't this compiled code cached to disk for use during subsequent uses of the same function/class?
As it stands, every time a program is executed, the JIT compiler kicks in afresh, rather than using a pre-compiled version of the code. Wouldn't adding this feature add a significant boost to the initial run time of the program, when the bytecode is essentially being interpreted?
Without resorting to cut'n'paste of the link that @MYYN posted, I suspect this is because the optimisations that the JVM performs are not static, but rather dynamic, based on the data patterns as well as code patterns. It's likely that these data patterns will change during the application's lifetime, rendering the cached optimisations less than optimal.
So you'd need a mechanism to establish whether the saved optimisations were still optimal, at which point you might as well just re-optimise on the fly.
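A hypothetical Java illustration of why the optimisations are data-dependent: HotSpot can devirtualise and inline a call site while it has only ever seen one receiver type, but that decision becomes invalid as soon as a second type shows up, so the compiled code must be discardable:

    interface Shape {
        double area();
    }

    class Circle implements Shape {
        double r;
        public double area() { return Math.PI * r * r; }
    }

    class Square implements Shape {
        double s;
        public double area() { return s * s; }
    }

    class Totaler {
        // While only Circles have been passed in, HotSpot may compile this
        // call site as if it were always Circle.area() and inline it. If a
        // later run (or a later phase of this run) starts passing Squares,
        // that "optimal" machine code is wrong for the new data and must be
        // deoptimised - so caching it to disk would be of limited value.
        static double total(Shape[] shapes) {
            double sum = 0;
            for (Shape sh : shapes) {
                sum += sh.area();
            }
            return sum;
        }
    }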
Oracle's JVM is indeed documented to do so -- quoting Oracle:
"the compiler can take advantage of Oracle JVM's class resolution model to optionally persist compiled Java methods across database calls, sessions, or instances. Such persistence avoids the overhead of unnecessary recompilations across sessions or instances, when it is known that semantically the Java code has not changed."
I don't know why all sophisticated VM implementations don't offer similar options.
An update to the existing answers - Java 8 has a JEP dedicated to solving this:
=> JEP 145: Cache Compiled Code.
At a very high level, its stated goal is:
"Save and reuse compiled native code from previous runs in order to improve the startup time of large Java applications."
Hope this helps.
Excelsior JET has had a caching JIT compiler since version 2.0, released back in 2001. Moreover, its AOT compiler can recompile the cache into a single DLL/shared object using all optimizations.
I do not know the actual reasons, not being in any way involved in the JVM implementation, but I can think of some plausible ones:
The idea of Java is to be a write-once-run-anywhere language, and putting precompiled stuff into the class file kind of violates that (only "kind of", because of course the actual byte code would still be there)
It would increase the class file sizes because you would have the same code there multiple times, especially if you happen to run the same program under multiple different JVMs (which is not really uncommon, when you consider different versions to be different JVMs, which you really have to do)
The class files themselves might not be writable (though it would be pretty easy to check for that)
The JVM optimizations are partially based on run-time information and on other runs they might not be as applicable (though they should still provide some benefit)
But I really am guessing, and as you can see, I don't really think any of my reasons are actual show-stoppers. I figure Sun just doesn't consider this support a priority, and maybe my first reason is close to the truth, as doing this habitually might also lead people into thinking that Java class files really need a separate version for each VM instead of being cross-platform.
My preferred way would actually be to have a separate bytecode-to-native translator that you could use to do something like this explicitly beforehand, creating class files that are explicitly built for a specific VM, with possibly the original bytecode in them so that you can run with different VMs too. But that probably comes from my experience: I've been mostly doing Java ME, where it really hurts that the Java compiler isn't smarter about compilation.