Recover unaltered Java from APK

I recently lost my Android game code. When I tried to retrieve it from the APK on my test device, I was able to recover all assets, XML and a few Java files, namely the less complex ones (using a dex2java converter). However, the original code of the more complex Java files (like the renderer) couldn't be recovered properly. Lots of modifications had been made to the code (like additional while, break, continue, labels).
Please suggest how the actual, unaltered Java code can be recovered from the APK.
Thanks

The simple answer is that it cannot.
Source code is not included in the compiled version. The compiled classes only contain JVM instructions. From these instructions, you can often reconstruct the logic, but rarely if ever the original source.

Related

Decompiler Bytecode and Obfuscators

Can we completely reverse-engineer the source code from Java bytecode? Why is this feature allowed in Java, and how successful are Java decompilers against obfuscators?
I know this question is old, but I kept looking for a reliable answer and found nothing.
So in this post I summarize some of my efforts to obfuscate a J2EE JAR.
It seems that, as of 2014 (time of writing), there are not many options out there.
If you read this review later, things may have changed or been fixed.
When I think about why, I start to sense that the whole obfuscation effort gives a false sense of security. Don't get me wrong: it does add a level of security, but not as much as I would hope.
I will try to give a preview of what I found to explain myself. My recommendations are personal; others may disagree with them.
So to begin with: obfuscation in Java is the process of taking bytecode and making it less readable (when viewed through a decompiler, of course) while maintaining its original functionality. There is not much else we can do: Java, working as an interpreter, must keep its bytecode exposed. You run the obfuscator as a security measure in case the class files fall into the wrong hands. The result of the obfuscation is a reverse-mapping file and a JAR with the obfuscated classes. The reverse-mapping file is used, of course, to read stack traces (a.k.a. re-trace) or to revert the bytecode to its original shape. The runtime performance hit of an obfuscated class should not exceed 10% (but this really depends on what you do in your code).
But there is a big “but”. Obfuscation will scramble your code, but it won’t make it hacker-proof. Bear in mind that you only buy time, and a determined hacker will find a way to reverse-engineer your bytecode into its pure algorithm.
IMHO: the best way to hide a sensitive piece of code is to drown it in some huge pile of meaningless code.
Some hackers will try to modify your bytecode (by code injection) to help them achieve their goals. Some obfuscators offer an additional level of JAR hardening, making it harder to modify.
De-obfuscators and decompilers: my favourite Java decompiler is JD-GUI. However, when it comes to de-obfuscators I found the market pretty empty. Most of the tools ask you for a hint (which obfuscation tool was used to encrypt the source JAR), yet none of them really deliver results (some of them even crash when trying to decipher the JAR). They are open source projects with low maintenance. I couldn’t even find a paid application that does a decent de-obfuscation. So enlighten me if you know of something.
Free solutions
There are open source, free obfuscators which usually simply rename the class/method names, making them one letter long (i.e. from printUsage(String params) to a(String p)).
They might, as hinted here, even strip debugging information to make it a bit more difficult. (Debugging information is kept in attributes attached to each method's bytecode and contains line numbers, variable names, etc.)
It's a nice effort, but an experienced Java developer with a debugger can very easily deduce the purpose of each parameter while doing a few live runs.
One of the nice open source obfuscators is ProGuard, but there are several more tools.
Nevertheless, if you are truly a security fanatic you will probably want something stronger. Stronger demands more features (and more money), which leads us to the next bullet:
Paid solutions
While free products may only change class and method names, paid products will usually offer more features:
code/flow obfuscation: this will change the method code and inject empty loops/dead code/confusing switch tables and the like. Some of them may even scramble the exception table content. The obfuscation strength usually determines the output size.
Note: regarding code obfuscation, I deliberately avoided the details in my review. Some of the bytecode I saw and analyzed exposes the vendors' obfuscation methods, and I wish to protect their IP. I do have an opinion about who uses better algorithms; contact me if you wish to know.
classes/method renaming: well, this is the obvious one; we discussed it under the free obfuscators. Some of the products will rename the class and then recursively search for reflection usage of that class and fix those too. Paid products may even rename entries in Spring/Wink configuration files for the same purpose (renaming in reflection).
String encryption: for every string “like this” in the code, it will encrypt it to some level and keep the key somewhere (in the class constant table, static blocks, a new method or by any other means).
debug information: stripping or scrambling parts of it. Many of them will remove the line number info.
class hardening: all kinds of methods, like injecting some signing scheme into the beginning of the class/method, making sure an outsider won’t be able to easily modify the JAR and run it. Less important for Android or applets, as most of them are digitally signed anyhow. Some will combine hardening with watermarking to track pirated copies. But we all know software-based anti-piracy methods are doomed to be hacked. The game industry suffered from this for decades until network-based subscriptions arrived.
Since most products here deal with Java, some of them provide Android integration. It means they will not only obfuscate the Java (Dalvik) code, but also manipulate the Android manifest file and resources. Some offer anti-debugging: removing the debug flag in Android apps.
Nice GUI app to configure the various options and maybe do a re-trace on a given log file. The UI is usually used to generate a config file. With such a file you can later re-play the obfuscation many times, even from the command line.
Incremental build support: this is useful for large groups who release product updates/fixes frequently. You can tell the obfuscator to preserve the old “obfuscation” result and randomly obfuscate only “new” code flows. This way you can be sure of minimal impact on your method signatures. Without this flag, each obfuscation cycle on a JAR would yield a different output, as most good tools use some level of randomness in their algorithms.
CLI and distributed builds. When you work alone, running an obfuscator is not a big issue: you configure the obfuscator with the relevant options and run it. However, in an enterprise, when integrating an obfuscator into the build script, things are a bit different. There is another level of complexity: build engine tasks (like Ant/Maven) and license management. The good news is that all the obfuscators I tested have a command-line API. In a distributed build environment there is a cluster/pool of build machines to support concurrent demand for builds. The cluster is dynamic and virtual; machines go up or down depending on various conditions. Some obfuscation products are based on a license file tied to a CPU ID or hostname. This can create quite a challenge for the build teams to integrate. Some prefer a local floating license server. Some may require a public license server (but then, not all build farms have access to the public internet). Some offer a multi-site license (which in my opinion is the best).
Some offer code optimizations: algebraic equivalence and dropping of dead code. It's nice, but I believe that today's JDK does a good job of optimizing bytecode. It's true that dead code makes your download bigger, but with today's bandwidth it's less of a problem. I also want to believe that the 20:80 rule of thumb still applies in software today: in any application, 20% is probably dead code anyway.
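To make the string encryption item in the list above more concrete, here is a minimal sketch, assuming a toy XOR scheme with the key kept in a static field. Commercial obfuscators use their own schemes, which are not described here; the names StringCryptDemo, KEY, encrypt and decrypt are invented for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative only: a toy XOR scheme, not what any particular obfuscator does.
class StringCryptDemo {
    // Hypothetical key; a real tool might hide it in the constant table or a static block.
    private static final byte[] KEY = {0x5A, 0x13, 0x7C, 0x29};

    // Build time: the literal in the class file would be replaced by this blob.
    static String encrypt(String plain) {
        byte[] data = plain.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < data.length; i++) {
            data[i] ^= KEY[i % KEY.length];
        }
        return Base64.getEncoder().encodeToString(data);
    }

    // Run time: an injected helper restores the text just before it is used.
    static String decrypt(String blob) {
        byte[] data = Base64.getDecoder().decode(blob);
        for (int i = 0; i < data.length; i++) {
            data[i] ^= KEY[i % KEY.length];
        }
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String blob = encrypt("like this");
        System.out.println(blob);          // what a decompiler would show
        System.out.println(decrypt(blob)); // the original text at run time
    }
}
```

Because the key and the decrypt helper ship inside the same JAR, anyone with a debugger can call the helper and recover every string, which is why this kind of feature only raises the bar.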
So who are the players I tried ?
KlassMaster by Zelix.com - one of the oldest in the industry. Yet they deliver a solid product with 3-4 releases per year. This has been going on for a long time (since 1997). Zelix provides good email support and answered all my emails in a timely manner. They have a nice GUI client to either obfuscate a JAR or create a config file for future obfuscation. It's simple and slick; nothing special here. They provide easy-to-read online documentation for all their flags. They support both “exclude” and “include” regular expressions for what the engine should obfuscate. The thing I liked most about their process is that it also adds “noise” to the exception table. It makes the method's exception handling a bit more confusing. Their flow obfuscator strength is quite good and can be configured to one of 3 levels (light, medium and aggressive). Another feature I liked is the fine tuning they provide for debug info stripping (only line numbers, only local variables, or both). KlassMaster doesn’t provide any dedicated Android flags or anti-tamper methods. Their licensing model is quite simple: a text file to be placed near the KlassMaster main JAR. They also support incremental obfuscation.
JFuscator from secureTeam.net: While secureTeam also has a .NET tool, I focus on their Java tool's capabilities. Their (Swing-based) GUI tool seems nice, but it crashed when trying the simplest obfuscation task. The error was always the same: Error reading '/opt/sun-jdk1.7.0_55/jre\lib\rt.jar'. Reason: ''/opt/sun-jdk1.7.0_55/jre\lib\rt.jar': no such file or directory'. Now of course I have my Java installed in /opt/sun-jdk1.7.0_55/jre. You can imagine that they simply didn’t expect Linux, which uses forward slashes in paths. I contacted secureTeam.net support by email about the minor “path” problem. They asked if I was a Linux user, and after I replied that I was, they never answered my email. I also tried the online chat on their web site: no response. So there I stopped testing. Without further results, I couldn’t examine the quality of the obfuscated bytecode. From their web site it seems they have an anti-tamper method, string manipulation, method renaming and a few other features.
GuardIt4J (by Arxan.com): Arxan is a fairly solid player in the mobile environment, and as such they offer an Android obfuscator which of course works well for Java. They have one of the most flexible engines. They provide code obfuscation, string encryption and the like. You can define the complexity of the code obfuscation; it is simply an integer. The higher it is, the longer your method turns out. Of course, you must be careful not to exceed the JVM's 64KB limit per method. As I said before, one of the best strategies to hide sensitive code is not to encrypt it, but to inject it into a huge pile of garbage. This is exactly what GuardIt does. It can also blow up the method's exception table in the same way. I managed to create a method with 100 exceptions in its exception table (pre-obfuscation it was 5). What they miss: their re-trace program is not part of the supplied main JAR. Nevertheless, they were kind enough to send me a sample Java program that performs re-trace given the reverse-mapping file and the log. They don’t support incremental obfuscation and offer no flexibility regarding debug information: debug information stripping is either all or nothing. Looking at the output JAR, you will see tons of conditions and jumps that were injected. Bear in mind that blowing up the class size has its performance hit. In some methods I measured an almost 50% performance hit when applying long obfuscation (no I/O in those methods). So inflating the code comes with a price (from 400 opcodes I went up to 2200 opcodes after obfuscation). JD-GUI, my decompiler, failed to open such classes and crashed (IndexOutOfBoundException). They also supply complete class encryption, meaning the class is encrypted with some symmetric key, which demands a special (or custom-written) class loader to open it in memory. This is an anti-tamper mechanism as well as a way of hiding code. Just remember that a JVM can’t run that class without the class loader's help.
It's a nice feature, but the secret key and the bootstrap loader JAR are probably shipped alongside. If the hacker got the encrypted JAR, he will eventually get his hands on them and decrypt the classes. Yet this is another obstacle the common hacker will need to pass. What I didn’t like here is the license file policy: it is bound to a CPU ID, or you need to install a floating license server.
SecureIt (by Allatori.com): SecureIt offers all the general code obfuscation, string encryption, renaming and such. On top of the standard obfuscation methods they also offer some kind of watermarking, which is an anti-tamper/anti-piracy method. They support Android and JavaME (who uses ME these days?!). They support incremental obfuscation. The one thing to note about configuring SecureIt: it is all command line. No GUI tool this time. Personally, I don’t mind command-line tools as long as they come with good documentation. Luckily they have very good documentation and a rich API with many flags to tune if you wish. You can re-trace with their tool (also command line). They can’t obfuscate the exception table. I didn’t check their licensing mechanism.
DashO (by Preemptive.com): The DashO obfuscator will probably be remembered as the best UI tool you can get (to create your configuration). Like SecureIt, they lack exception table obfuscation, but they have all the rest of the required features (as well as CLI, Spring framework and Gradle/Ant integration, and even an Eclipse plugin). Well, they do document a try-catch obfuscator (which is the same as an exception table obfuscator), but it is only a recommendation to the engine. When I tried it, it had nil effect on the exception table. As I said, the GUI tool is superb and has a re-trace embedded into it. They also offer some kind of application signing and watermarking as an anti-tamper/anti-piracy mechanism. DashO provides superb Android integration and also includes in their product a door for analytics uploads. You can actually track your application by injecting crash log uploaders and reporting code into your JAR. Nevertheless, that’s not within the scope of obfuscation; that’s a whole different code injection product. They have very good support, both online and by phone. Their licensing scheme is based on a monthly subscription or a one-time purchase, a bit different from the others. They use a floating license server to support large environments.
I hope this helps a bit.
Can we completely reverse-engineer the source code from java bytecode ?
Not completely, because some aspects of source code, such as whitespace, local variable names, and comments, are not preserved in bytecode. Otherwise, yes -- while you can't get the exact same source code out, you can almost always get something that can at least be compiled back to the same bytecode.
Why this feature is allowed in Java
It's not so much "allowed" as it is "not prevented". And it's not prevented because doing so is impossible -- the code must be runnable to be useful; if the code is runnable, then it is analyzable; if it is analyzable, then with sufficient analysis it can be converted back to source.
How successful are java decompilers against obfuscators?
Not very. Most obfuscators I've seen (esp. ProGuard) are primarily effective in removing meaningful function and class names; obfuscating the logic itself is not typically attempted.
You can get source code from a binary these days. Although the source code obtained from Java bytecode is more readable, obfuscation will make it slightly less readable. It's not only Java that can be reverse-engineered to code. Even C/C++ these days (with the Hex-Rays plugin for IDA Pro) can be decompiled to source. Obfuscators will make it hard to read but not impossible. There is nothing that can save your program from an intelligent and capable reverse engineer. :)
Good luck.
Can we completely reverse-engineer the source code from java bytecode?
The java class file is based on a spec so anyone can read into it. A tool like JD-GUI will tear into your source code easily. It is not a 'feature' per se. While 100% reverse-engineering is not possible, most of your code can be reverse engineered.
How successful are java decompilers against obfuscators?
Depends. The point of the obfuscator is to remove any meaningful names and try to introduce confusion in the code without impacting performance. Most developers are great at obfuscating code themselves :) ProGuard is pretty good at obfuscation.

Android APK decompilation seems too easy

I was just messing around. I downloaded the dex2jar http://code.google.com/p/dex2jar/ and the Java Decompiler JD-GUI http://java.decompiler.free.fr/?q=jdgui
I got my own apk file (signed, sealed and on Google Play), used dex2jar to make it into a jar repository.
command line (Windows users use .bat, everyone else .sh):
d2j-dex2jar.bat -f MyAwesomeApp.apk
I dragged and dropped the output into JD-GUI, and all the class files, with the original code, reappeared.
I was taken aback a bit. Is my java/Android code this exposed? How is ProGuard protecting my apk if it can be decompiled and regenerated so easily? It doesn't seem obfuscated at all...
Thanks in advance.
Obfuscators usually simply change class, method and field names to names that have no meaning. So, if you have "ScoreCalculator.computeScore(Player p, Match m)" you end up with "A.zk(F f, R r)". This is similar to what Uglify or the Closure compiler do for JavaScript, except that in JavaScript the goal is to reduce source length.
It is possible to understand what the method does anyway, it is only harder.
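As a rough illustration of the renaming described above, here is a sketch with a made-up scoring rule. The Player, Match, F and R classes, their fields and the formula are all invented for this example; only the names ScoreCalculator.computeScore and A.zk come from the text.

```java
// Entry point: both versions compute the same value.
class RenameDemo {
    public static void main(String[] args) {
        System.out.println(ScoreCalculator.computeScore(new Player(3), new Match(5)));
        System.out.println(A.zk(new F(3), new R(5)));
    }
}

// Hypothetical original types.
class Player { int kills; Player(int kills) { this.kills = kills; } }
class Match  { int bonus; Match(int bonus)  { this.bonus = bonus; } }

class ScoreCalculator {
    static int computeScore(Player p, Match m) {
        return Math.max(0, p.kills * 10 + m.bonus);
    }
}

// Roughly what a decompiler might show after a renaming pass.
class F { int a; F(int a) { this.a = a; } }
class R { int a; R(int a) { this.a = a; } }

class A {
    static int zk(F f, R r) {
        // The call into java.lang keeps its real name, so it remains a readable clue.
        return Math.max(0, f.a * 10 + r.a);
    }
}
```

The logic is untouched; only the names that carried meaning for a human reader are gone.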
Also, Java uses late binding (like DLLs or SO files). So, calls that go outside your code (like to the java.util, java.lang etc. packages) cannot be obfuscated. Also, if your code needs to receive calls from outside (a typical example: registering a listener on a button), that code cannot be obfuscated. The same happens for a DLL, where you can clearly see the names of the methods that need to be called from outside the DLL and the calls to other DLLs.
However, the mapping between a certain source code and the compiled code is not necessarily one to one. Older C compilers used to produce the same op code for a given source directive, so decompilers were very effective. Then C compilers added many optimizations to the resulting op code, and these optimizations made decompilers mostly ineffective [1]
Java never implemented (a lot of) optimizations at compile time because, in order to run on different platforms (including different Android devices), Java decided to apply serious optimizations later, at run time, based on the architecture and hardware properties of the running device (this is what "HotSpot" is mostly about [2]).
Good obfuscators usually also reorder bytecode instructions, or insert some useless ones, or apply some optimizations upfront to make decompilers unable (or less able) to derive source code so easily.
This technique is useless when it comes to people who can read bytecode, as any possible C obfuscation is useless if a person can read assembler code.
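For a feel of what inserted and reordered instructions can look like once decompiled, here is a hand-written sketch of one common idea, control-flow flattening with a dead branch. It is an assumption-laden illustration at source level, not the output of any particular obfuscator; FlowDemo, sum and sumFlattened are invented names.

```java
class FlowDemo {
    // Original: sum of 1..n.
    static int sum(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Same logic rewritten as a dispatcher loop with a branch that can never run.
    static int sumFlattened(int n) {
        int total = 0, i = 1, state = 0;
        while (true) {
            switch (state) {
                case 0: // loop test
                    state = (i <= n) ? 1 : 3;
                    break;
                case 1: // loop body
                    total += i;
                    // i * (i + 1) is always even, so this never selects state 2
                    state = (i * (i + 1) % 2 != 0) ? 2 : 4;
                    break;
                case 2: // dead code, only there to confuse the reader
                    total = -1;
                    state = 3;
                    break;
                case 4: // loop increment
                    i++;
                    state = 0;
                    break;
                default: // state 3: exit
                    return total;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(10));
        System.out.println(sumFlattened(10));
    }
}
```

Someone reading the bytecode can still trace the state variable and recover the loop; it just takes longer.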
As many cracking tools demonstrate, reverse engineering is always possible, even with C or other languages, even on firmware (think about iPhone firmware), because the client your code is running on is always untrusted and can always be tampered with.
If you have very mission critical code, something worth a lot of money that someone else may steal, I'd suggest to run it server side, or validate it server side somehow.
I might also add that there is a modern alternative to this APKTool->dex2jar->JD-GUI route!
Just try the open-source APK and DEX decompiler called Jadx: https://sourceforge.net/projects/jadx/files/
It also has an online version here: http://www.javadecompilers.com/apk

In a pickle with obfuscating a java library

Ok I'm kinda in a predicament right now.
I have a java program that I have split into a core/outside sections.
I collated my core into a java library that the outside code (which will be publicly released) can reference.
However, I do not want the contents of this java library to be decompilable.
So I went to find a good java obfuscator.
What complicates my situation is the fact that my java library isn't exactly modular - it references/changes state of outside code (yes it's terrible but whatever)
I've tried demos of all the premium obfuscators (ZKM, Allatori, etc.) and free ones, but they either
have very weak control flow obfuscation, which is the feature I need, or
halt because of references to Java libraries/dependencies that are not in the jar itself but are still referenced.
Any advice?
Obfuscation does not prevent decompilation. The code can always be decompiled. It just helps make your code less readable afterwards. Obfuscate only your logic; keep interfaces untouched.
If there are no obfuscators that have control flow obfuscation which meet your standards, then you will have to write your own obfuscator or submit a request to an existing vendor to improve their product.
Run both the external and internal parts of the program through the obfuscator, together at the same time. But write exclude rules for all of the external code. You should also write exclude rules for the public API of your internal code. If you don't have a public API layer on your internal code, then you are going to have a hard time, because your external code will have to refer to your internal code by obfuscated names, which will make for very unmaintainable and hard to read external source code.
I second ahanin's comment.
But, if you're looking for an obfuscator, a good one that has a pretty robust rule set is ProGuard. It's used heavily in the Android space, where code needs to be made as small as possible.

Recover lost code from compiled apk

I have an issue here, and it's making me really nervous.
I was working on this game, and it was going great, so I took a copy of it on my laptop to do some work while away from my computer.
Long story short, a hard-drive failure plus poor backups led to me losing a very important class.
Is there a way to decompile the APK to retrieve the bit of code that was lost? It isn't overly complicated or sophisticated; it's just that it's impossible to rewrite it without reading every. single. line. of. code. in the entire application, since it initializes a LOT of classes and loads a bunch of stuff in a specific way.
With a quick google search I was able to find apktool, which decompiles it into a bunch of .smali files, which I don't think were designed for human reading.
All I need to recover is one very big method in the class. I found the smali file that contains it and I think I found the line where it starts. Something like:
.method public declared-synchronized load(Lcom/X/X/game/X;)I
Any help would be appreciated, since I would have to scrap the entire game without this method.
A quick Google search turned up a way to decompile APKs (decompile apk to java source). However, even though it results in Java code, you probably won't have any variable names (just default ones like param1), as those are unrecoverably removed when you compile the source code to bytecode. Also, depending on the decompiler, for/foreach loops will be while loops instead, and if/else blocks might not represent your original control flow due to compiler optimization.
As general advice: use some sort of source control. Your own server or a paid GitHub account, it doesn't matter which, but use source control, even if you are just one person developing a project. It helps with this situation, it helps with reverting to a previous version, and it helps with finding a bug you introduced. When the tools are available, use them.
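To show the kind of difference to expect, here is a small sketch of a for-each loop next to a plausible decompiled rendering of it. The exact output varies by decompiler; the names paramList, localIterator, i and j merely imitate typical generated names, and LoopDemo is invented for this example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class LoopDemo {
    // How the source might have looked.
    static int totalOriginal(List<Integer> values) {
        int total = 0;
        for (int v : values) {
            if (v < 0) continue; // skip negative entries
            total += v;
        }
        return total;
    }

    // How a decompiler might render the same logic: no names, a while loop,
    // and the continue folded into an inverted condition.
    static int totalDecompiled(List<Integer> paramList) {
        int i = 0;
        Iterator<Integer> localIterator = paramList.iterator();
        while (localIterator.hasNext()) {
            int j = localIterator.next().intValue();
            if (j >= 0) {
                i += j;
            }
        }
        return i;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(3, -1, 4);
        System.out.println(totalOriginal(data));
        System.out.println(totalDecompiled(data));
    }
}
```

Both methods behave identically, so the recovered code is usable as a starting point even though it no longer reads like what was written.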

How to deliver a java program to a client?

I wrote a software application in Java. Now I want to deliver it to my clients. But before that, I want to do a few things to the software, which are listed below. You can answer any or all of the questions below:
I want to:
Encrypt all the .class files so that no one can decompile it. How can I encrypt it?
After encryption I want to obfuscate that code to add extra safety. How can I do that?
Add some "serial-key" functionality so that the software works only after registering it with the key provided by me. This is very important so as to prevent multi-user usage of my software. How can I add that key functionality and how can I generate keys. And how can I restrict that software to work only on a single computer.
The jar file can be unzipped and the .class file can be seen. Is there any way to wrap jar file into something so that no one can unzip that file.
I don't want to tell the client to first install Java to run my application. So is there any way that, when anyone installs my software, Java automatically gets installed on his/her computer without informing him/her that Java is being installed? If it is possible, is it legal to use Java software in this way?
Change the icon of the jar file permanently.
Implement a code which checks my site for any available updates.
If you have any other suggestions to increase the security of the software, they are welcome too.
In no particular order:
2 - There are products that perform obfuscation. They typically rename classes / variables / methods to single letter names. This makes determining user reported errors rather difficult. Stack traces showing the exception occurs in a.b.c are not particularly helpful.
1,3,4 - You can't fully avoid this risk if you are distributing Java. Your code needs to be unpacked and loaded at some point. If someone replaces rt.jar in the JVM, then they can replace the top-level class loader and dump out your classes that way. Obfuscation makes this less useful for them, but see the above caveat.
5 - Distribute a "private JRE". Basically, you have a JRE in your program folder, and your launcher script runs it. This increases the size of your distribution, though.
6 - On Windows, this would be a file association issue. But that would also affect all other jar files, unless as part of 4 (however you manage that) you also use a different extension. Not sure about other operating systems.
7 - Use Java Web Start? Failing that, just have a file on your server listing the most recent version, fetch the file and compare with the installed version.
For 1,2,4 and 5 you could also look into compiling to native code using gcj or similar. Beware of compatibility issues if you do that though.
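The "compare with the installed version" part of item 7 could be sketched as below, assuming versions are purely numeric and dot-separated (for example 1.4.2). Fetching the version file from the server is left out; UpdateCheck and isNewer are invented names.

```java
class UpdateCheck {
    // Returns true when 'latest' is a higher dotted version than 'installed'.
    // Missing components count as 0, so "1.2" and "1.2.0" are equal.
    static boolean isNewer(String installed, String latest) {
        String[] a = installed.trim().split("\\.");
        String[] b = latest.trim().split("\\.");
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? Integer.parseInt(a[i]) : 0;
            int y = i < b.length ? Integer.parseInt(b[i]) : 0;
            if (x != y) {
                return y > x;
            }
        }
        return false; // identical versions
    }

    public static void main(String[] args) {
        // In a real application the second argument would be read from the file on the server.
        System.out.println(isNewer("1.2.0", "1.10"));
    }
}
```

Comparing component by component as integers avoids the trap of plain string comparison, where "1.10" would sort before "1.2".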
Encrypt all the .class files so that no one can decompile it. How can I encrypt it?
You can't. If no one can decompile it, how do you expect the target JVM to?
After encryption I want to obfuscate that code to add extra safety. How can I do that?
I want to add some "serial-key" functionality so that the software works only after registering it with the key provided by me. This is very important so as to prevent multi-user usage of my software. How can I add that key functionality and how can I generate keys. And how can I restrict that software to work only on a single computer.
There are a couple of ways to do this but a simple one is with public key cryptography:
Your software generates a random request ID or a request ID based on the machine attributes and your user submits this to you.
You sign the request ID with your private key and send it back to the user.
The user provides the signed request ID to the software which validates that it was signed by you.
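The three steps above can be sketched with the standard java.security API. This is a minimal sketch, assuming RSA with SHA256withRSA; the class LicenseDemo, its method names and the request ID format are invented for illustration, and in practice the key pair would be generated once and only the public key shipped with the software.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.SignatureException;

class LicenseDemo {
    // Vendor side, done once: create the key pair.
    static KeyPair newVendorKeyPair() {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            return gen.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Vendor side: sign the request ID the user submitted.
    static byte[] sign(String requestId, PrivateKey privateKey) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initSign(privateKey);
            s.update(requestId.getBytes(StandardCharsets.UTF_8));
            return s.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Client side: check the signed request ID against the embedded public key.
    static boolean verify(String requestId, byte[] signature, PublicKey publicKey) {
        try {
            Signature s = Signature.getInstance("SHA256withRSA");
            s.initVerify(publicKey);
            s.update(requestId.getBytes(StandardCharsets.UTF_8));
            return s.verify(signature);
        } catch (SignatureException e) {
            return false; // malformed signature data
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        KeyPair vendor = newVendorKeyPair();
        byte[] licence = sign("REQ-1234", vendor.getPrivate());
        System.out.println(verify("REQ-1234", licence, vendor.getPublic()));
    }
}
```

The private key never leaves the vendor, so users cannot mint their own keys; a cracker can still patch out the verify call in the class file, which is the limitation the other answers describe.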
The jar file can be unzipped and the .class file can be seen. Is there any way to wrap jar file into something so that no one can unzip that file.
No
I don't want to tell the client to first install java to run my application. So is there any way by which if anyone installs my software, the java automatically gets installed on his/her computer without informing him that java is being downloaded to his computer. If it is possible, then Is it legal to use Java software in this way.
Try building an NSIS installer for your application that detects/installs Java and your program.
Build a better trust relationship with your clients.
Then you can spend extra time ( not doing tasks 1-5 ) to make improvements, fix bugs, etc., which in turn improves relationship with your clients.
You can compile it with GCJ, which will compile your application to a normal Windows/Linux native executable (.exe). Then you can create an installation, using a program like InstallShield.
The company where I work actually ships unobfuscated jar files, with all debug information in place. That way, if an error occurs at a client's site, they can send us the full stacktrace which helps enormously in analyzing and localizing bugs in the code.
Trying to obfuscate your code will lead you into an arms race with potential crackers and consume huge amounts of time with little or no real benefit. Instead, I'd advise you to try and find other ways to make buying (and not pirating) your software worthwhile to your clients. For example, you could offer them free updates, or tech support, or something like that.
As for 6: You can use JSmooth or a similar tool to create an exe wrapper for your app. It will allow you to change the icon, and your clients will have an exe file that they can doubleclick without having to mess with file associations for jar files.
Note, however, that the generated exe won't contain Java or your jar files. It will, however, print a nice error message if Java isn't available.
Just adding on to the other answers here:
1 and 4: You could actually do this if you modify the JVM and pre-package it with your installation, but it's against Java's license agreement to distribute a modified JVM without paying Sun like a billion dollars.
Who is your client? Piratebay.org? Seriously, every major company in the US pays for software. The risk of a client quitting and calling them in is just too high. You need enough protection to make it easier for a programmer to get purchasing to pay for the product than to circumvent your copy protection.
