Java (Android): Find unused classes

In my Android application, I am using a lot of open-source Java libraries as source, which makes the application very large.
The number of classes comes to 6000+. I want to remove the unused classes. Does anyone have an idea how to do it? I found many tools, but those are for removing unused code. Thanks in advance.

Use ProGuard. It strips away unused classes and libraries. Link: http://developer.android.com/tools/help/proguard.html
EDIT:
The "GC overhead limit exceeded" error is not caused by ProGuard. It occurs because the memory Eclipse is allowed to use is too low. You can fix this by increasing the memory limit (https://www.simplified.guide/eclipse/fix-gc-overhead-limit-exceeded). Do this, run ProGuard, and your app size should shrink considerably.

Well, if you are using open-source Java libraries, you should first find out which licences those libraries are distributed under. Some licences do not allow you to repackage distributables; other licences only allow you to repackage if you make the new software open source (that includes your code). http://opensource.org/licenses
So, after you have checked the licences and/or contacted the rights holders:
You could write a tool that follows the dependency tree from your classes through all of your third-party code and produces a list of the classes that are not in that tree. I imagine most IDEs are not going to do what you want, because they will consider a library as either used or not.
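The tool suggested above boils down to a reachability walk over a class dependency graph. The following is only a minimal sketch of that idea, not an existing tool: the class names are invented, and the dependency map is assumed to have been extracted beforehand (for example by a bytecode analyser), which this sketch does not do.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class UnusedClassFinder {

    /** Returns every class in allClasses that cannot be reached from the root classes. */
    static Set<String> findUnused(Map<String, Set<String>> dependsOn,
                                  Set<String> roots,
                                  Set<String> allClasses) {
        Set<String> reachable = new HashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            // Follow each outgoing dependency edge exactly once.
            for (String dep : dependsOn.getOrDefault(current, Set.of())) {
                if (reachable.add(dep)) {
                    pending.push(dep);
                }
            }
        }
        Set<String> unused = new TreeSet<>(allClasses); // sorted for stable output
        unused.removeAll(reachable);
        return unused;
    }

    public static void main(String[] args) {
        // Hypothetical graph: the app uses only the JSON part of a library.
        Map<String, Set<String>> dependsOn = Map.of(
                "app.Main", Set.of("lib.Json"),
                "lib.Json", Set.of("lib.JsonReader"),
                "lib.Xml", Set.of("lib.XmlReader"));
        Set<String> all = Set.of(
                "app.Main", "lib.Json", "lib.JsonReader", "lib.Xml", "lib.XmlReader");
        System.out.println(findUnused(dependsOn, Set.of("app.Main"), all));
        // prints [lib.Xml, lib.XmlReader]
    }
}
```

Note that a static walk like this cannot see classes that are only loaded by name through reflection, so such classes would have to be added to the roots by hand.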

ProGuard does this for Java.
From what I'm seeing it's already part of the android stack - http://developer.android.com/tools/help/proguard.html .
Look at the link and try to find out why it isn't working for you (probably you aren't creating a Release build).
If it is working, and you still have a huge file, then you are probably using libraries that use a lot of files, and there's not much you can do about it.

Step 1
Generate usage.txt and mapping.txt with ProGuard or R8.
Add -printusage to your proguard.pro file, then run
./gradlew app:minifyReleaseWithProguard or ./gradlew app:minifyReleaseWithR8
Step 2
Find the class names that are in usage.txt but not in mapping.txt; those are the unused classes that were removed by ProGuard/R8.
It's not hard to write such an algorithm, but you can consider my approach, which uses Java's Set data structure.
More details here
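The linked write-up is not reproduced here. As a rough sketch of the same set-difference idea (not the answerer's actual code), the following assumes that class entries in usage.txt are the non-indented lines (with a trailing colon when only some members were removed) and that class entries in mapping.txt are non-indented lines of the form "original.Name -> obfuscated.Name:". The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

final class RemovedClasses {

    /** Class names listed in usage.txt: non-indented lines, minus any trailing ':'. */
    static Set<String> classesInUsage(Collection<String> usageLines) {
        Set<String> names = new TreeSet<>();
        for (String line : usageLines) {
            if (line.isEmpty() || Character.isWhitespace(line.charAt(0)) || line.startsWith("#")) {
                continue; // blank line, indented member entry, or comment
            }
            String name = line.trim();
            if (name.endsWith(":")) {
                name = name.substring(0, name.length() - 1);
            }
            names.add(name);
        }
        return names;
    }

    /** Original class names in mapping.txt: the text before " -> " on non-indented lines. */
    static Set<String> classesInMapping(Collection<String> mappingLines) {
        Set<String> names = new TreeSet<>();
        for (String line : mappingLines) {
            int arrow = line.indexOf(" -> ");
            if (arrow <= 0 || Character.isWhitespace(line.charAt(0)) || line.startsWith("#")) {
                continue; // member mapping, comment, or unrelated line
            }
            names.add(line.substring(0, arrow));
        }
        return names;
    }

    /** Classes mentioned in usage.txt that did not survive into mapping.txt. */
    static Set<String> removedClasses(Collection<String> usageLines,
                                      Collection<String> mappingLines) {
        Set<String> removed = classesInUsage(usageLines);
        removed.removeAll(classesInMapping(mappingLines));
        return removed;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: RemovedClasses <usage.txt> <mapping.txt>");
            return;
        }
        removedClasses(Files.readAllLines(Paths.get(args[0])),
                       Files.readAllLines(Paths.get(args[1])))
                .forEach(System.out::println);
    }
}
```

If your ProGuard or R8 version writes these files in a different layout, the two parsing methods are the part to adjust; the set difference itself stays the same.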

6000 classes? Well, this is why people pay around 2000 for a compiler that removes unused code. If you put your code in Eclipse, it will place a yellow line under imported libraries and variables that you are not using at all.
Hope this helps.

Related

Is there a way to use external libraries in IntelliJ without downloading their .jars?

I am trying to write a standalone Java application in IntelliJ using edu.stanford.nlp.trees.GrammaticalStructure. Therefore, I have imported the module:
import edu.stanford.nlp.trees.GrammaticalStructure;
Currently, IntelliJ doesn't recognize this or many of the other imported external libraries (it cannot resolve the symbols), and it is also unable to download/import them automatically.
Is there a way to use the GrammaticalStructure class without having to download the entire Stanford CoreNLP .jar and adding it to the project as a library? This question applies to other dependencies as well, since I want to use other external libraries but avoid including their .jar files as much as possible (to minimize the size of the final application, given that it will be standalone). Unfortunately, all the solutions I have found proposed exactly that.
Apologies if I have overlooked some basic setting or setup steps, it has been a while since I have worked with Java.
Any help is greatly appreciated.
If you want to use the libraries, that means you want to execute the code in them. How is the runtime supposed to execute code that it does not have? How is the compiler supposed to know how the code is defined (e.g. what the classes look like)? This is simply impossible. If you want to use the code, you have to provide it to the compiler as well as to the runtime.
If you just don't want to include all of that code in your application, you either need access to the sources so that you can pick just the classes you need, or you need some kind of JAR minimizer, as @CrazyCoder suggested.

How to ProGuard and Optimizer and Obfuscator in Java

I downloaded Eclipse from the Google bundle, but I don't know whether it optimizes the code when it is compiled. How do I enable optimization and obfuscation for my code in Eclipse for Java, or do I need a special plugin to do so? I want to make my files as small as possible so that they download more quickly for users.
If this is a bad question please do not -rep me, just tell me and I'll remove it
I've used ProGuard once or twice, never extensively, but my understanding is this: it is an external bundle of files that you must run (either from the command line or through its GUI) in order to use it. I have used the GUI, and it gives you a separate tab for each of the options (Optimizing, Shrinking, and Obfuscation). You can find their project page here with more information and detail on how to use it. As far as I know, there is no IDE integration for ProGuard.

How to modify the class file?

I was working on the project in eclipse in which I have added this maven dependency for PDFBOX
Maven dependency
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>1.6.0</version>
</dependency>
And I was getting the error on some pdf file as:
Parsing Error, Skipping Object
java.io.IOException: expected='endstream' actual='' org.apache.pdfbox.io.PushBackInputStream@1b8d77fe
at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:439)
at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:552)
at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:184)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1088)
at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1053)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:74)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
at org.apache.tika.Tika.parseToString(Tika.java:357)
at edu.uci.ics.crawler4j.crawler.BinaryParser.parse(BinaryParser.java:37)
at edu.uci.ics.crawler4j.crawler.WebCrawler.handleBinary(WebCrawler.java:223)
at edu.uci.ics.crawler4j.crawler.WebCrawler.processPage(WebCrawler.java:460)
at edu.uci.ics.crawler4j.crawler.WebCrawler.run(WebCrawler.java:129)
at java.lang.Thread.run(Thread.java:662)
So when I googled it, I found there was a bug in the BaseParser.java file, and a patch (https://issues.apache.org/jira/browse/PDFBOX-195) has been provided for this Java file only. So my question is: how can I modify just this Java file? I can see the BaseParser.class file in Eclipse, as I have attached the source for that PDFBOX issue. Any suggestions will be appreciated.
Given that BaseParser.java is an Apache file, there is absolutely no reason why you cannot download the source, make your changes and recompile it. I have done this with Apache code in the past. It was pretty straightforward and took me only a few minutes. Remember to submit your fix back to Apache so that it can be included in a release.
You can:
create a subclass manually (and use it, if that is possible)
download the source, fix it, recompile, and finally overwrite the class in the jar
create a subclass programmatically (using cglib or asm)
download only BaseParser, mock all of its dependencies (just create empty classes with the needed methods), recompile it and put it in the jar (or in the JVM's ./ext or ./endorsed dir, if you want)
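To illustrate the first option in the list above, here is a minimal sketch of patching behaviour through a subclass. LibraryParser, LenientParser and readStream are invented stand-ins, not the real PDFBox BaseParser API, and the "fix" is only a placeholder for whatever the real patch does.

```java
final class SubclassPatchDemo {
    public static void main(String[] args) {
        // Callers depend on the library type but receive the patched subclass.
        LibraryParser parser = new LenientParser();
        System.out.println(parser.readStream("abcendstream")); // prints abc
        System.out.println(parser.readStream("abc"));          // prints abc
    }
}

// Hypothetical stand-in for a library class that fails on malformed input.
class LibraryParser {
    String readStream(String data) {
        if (!data.endsWith("endstream")) {
            throw new IllegalStateException("expected='endstream'");
        }
        return data.substring(0, data.length() - "endstream".length());
    }
}

// The "patch": override the faulty method and tolerate the missing marker.
class LenientParser extends LibraryParser {
    @Override
    String readStream(String data) {
        return data.endsWith("endstream") ? super.readStream(data) : data;
    }
}
```

This only helps when the method is overridable and you control where the object is created; if the library instantiates the class internally, rebuilding from patched source is likely the more practical route.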
Generally, one doesn't modify a class file directly; one downloads the source code and then rebuilds the class file with javac. Yes, it is possible to modify class files without doing that, but patch files are generally source code patches, not binary patches.
Stefanglase has mentioned that the release you are working with should have the patch applied, but there is a small chance that a recent change reintroduced the issue. You might want to verify that you're not solving the wrong problem before you get too deep into it.
In the rare case that you really want to modify a binary, you open it with a hexadecimal editor, or hex editor for short. Basically, this allows you to set any byte in the file to any value, which means you must have a strong knowledge of the file's internal format, what is allowed and disallowed, and how to make allowable changes that actually implement your expected behavior. In short, you'll be doing a compiler's work by hand.
It can be done, but it is the sort of task that requires a lot of knowledge that few people already have, so the cost of acquiring that knowledge and successfully implementing the change is likely much higher than the cost of rebuilding from the available patched source. Even for someone who already knows the general principles and techniques, one cannot say with certainty that implementing the change this way costs less than rebuilding the entire library from patched source.
Good Luck.

Patching Java software

I'm trying to create a process to patch our current java application so users only need to download the diffs rather than the entire application. I don't think I need to go as low level as a binary diff since most of the jar files are small, so replacing an entire jar file wouldn't be that big of a deal (maybe 5MB at most).
Are there standard tools for determining which files changed and generating a patch for them? I've seen tools like xdelta and vpatch, but I think they work at a binary level.
I basically want to figure out - which files need to be added, replaced or removed. When I run the patch, it will check the current version of the software (from a registry setting) and ensure the patch is for the correct version. If it is, it will then make the necessary changes. It doesn't sound like this would be too difficult to implement on my own, but I was wondering if other people had already done this. I'm using NSIS as my installer if that makes any difference.
Thanks,
Jeff
Be careful when doing this--I recommend not doing it at all.
The biggest problem is public static final constants. Their values are actually compiled into the classes that use them, not referenced. This means that even if a java file doesn't change, its class must be recompiled or it will still refer to the old value.
You also want to be very careful of changing method signatures--you will get some very subtle bugs if you change a method signature and do not recompile all files that call that method--even if the calling java files don't actually need to change (for instance, change a parameter from an int to a long).
If you decide to go down this path, be ready for some really hard-to-debug errors at the customer site that you cannot duplicate (generally no traces or significant indications, just strange behavior like the number received not matching the one sent), and for a lot of pissed-off customers.
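The static-value problem comes from how javac treats compile-time constants: a static final field of primitive or String type that is initialised with a constant expression is copied into each class that reads it. The sketch below (the names Limits, MAX_INLINED and MAX_LOOKED_UP are invented) shows the two cases side by side. In a single run both print the same value; the difference only shows up when Limits is recompiled and shipped without its callers.

```java
final class ConstantInliningDemo {

    static final class Limits {
        // Compile-time constant: javac copies the literal 10 into every class
        // that reads this field, so callers keep the old value until recompiled.
        public static final int MAX_INLINED = 10;

        // Not a constant expression (it involves a method call), so callers
        // read the field at run time and see a new value without recompiling.
        public static final int MAX_LOOKED_UP = Integer.valueOf(10);
    }

    public static void main(String[] args) {
        System.out.println(Limits.MAX_INLINED + " " + Limits.MAX_LOOKED_UP); // prints 10 10
    }
}
```

One way to keep shared values patchable is therefore to expose them through a method or a non-constant initialiser rather than as a public compile-time constant.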
Edit (too long for comment):
A binary diff of the class files might work, but I'd assume that some kind of version number or date gets compiled in, so that they'd change a little on every compile for no reason. That could be easily tested, though.
You could adopt some strict development practices, such as not using public final statics (make them private) and never changing method signatures (deprecate instead), but I'm not convinced that I know all the possible problems; I just know the ones we encountered.
Also, binary diffs of the jar files would be useless; you'd have to diff the classes and re-integrate them into the jars (which doesn't sound easy to track).
Can you package your resources separately then minimize your code a bit? Pull out strings (Good for i18n)--I guess I'm just wondering if you could trim the class files enough to always do a full build/ship.
On the other hand, Sun seems to do an okay job of making class files that are completely compatible with the previous JRE release, so they must have guidelines somewhere.
You may want to see if Java WebStart can help you as it is designed to do exactly those things you want to do.
I know that the documentation describes how to create and do incremental updates, but we deploy the whole application as it changes very rarely. It is then an issue of updating the JNLP when ready.
How is it deployed?
On a local network I just leave everything as .class files in a folder. The startup script uses robocopy or rsync to copy from network share to local. If any .class file is different it is synced down. If not, it doesn't sync.
For non-local network I created my own updater. It downloads a text file of md5sums and compares to local files. If different it pulls file down from http.
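The checksum comparison at the heart of such an updater can be sketched as follows. This is not the answerer's updater: downloading the manifest and the files is left out, the two maps (file name to MD5 hex string) are assumed to have been built already, and the deletion step is an addition that the answer does not mention.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class Md5UpdateCheck {

    /** Lower-case hex MD5 of the given bytes, in the format md5sum prints. */
    static String md5Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Files in the remote manifest whose local checksum is missing or different. */
    static Set<String> filesToDownload(Map<String, String> remoteSums,
                                       Map<String, String> localSums) {
        Set<String> stale = new TreeSet<>();
        for (Map.Entry<String, String> entry : remoteSums.entrySet()) {
            // equalsIgnoreCase(null) is false, so missing local files count as stale.
            if (!entry.getValue().equalsIgnoreCase(localSums.get(entry.getKey()))) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }

    /** Local files that no longer appear in the remote manifest. */
    static Set<String> filesToDelete(Map<String, String> remoteSums,
                                     Map<String, String> localSums) {
        Set<String> obsolete = new TreeSet<>(localSums.keySet());
        obsolete.removeAll(remoteSums.keySet());
        return obsolete;
    }

    public static void main(String[] args) {
        // Hypothetical file names and contents.
        Map<String, String> remote = Map.of(
                "app.jar", md5Hex("v2".getBytes(StandardCharsets.UTF_8)),
                "lib.jar", md5Hex("same".getBytes(StandardCharsets.UTF_8)));
        Map<String, String> local = Map.of(
                "app.jar", md5Hex("v1".getBytes(StandardCharsets.UTF_8)),
                "lib.jar", md5Hex("same".getBytes(StandardCharsets.UTF_8)),
                "old.jar", md5Hex("gone".getBytes(StandardCharsets.UTF_8)));
        System.out.println(filesToDownload(remote, local)); // prints [app.jar]
        System.out.println(filesToDelete(remote, local));   // prints [old.jar]
    }
}
```

MD5 is adequate here for detecting changed files, but it should not be relied on to protect against deliberately tampered downloads.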
A long time ago, the way we solved this was to use the classpath and jar files. Our application was built into a jar file, and it had a launcher jar file. The launcher classpath had a patch.jar that was read into the classpath before the main application.jar. This meant that we could update the patch.jar to supersede any classes in the main application.
However, this was a long time ago. You may be better using something like the Java Web Start type of approach, which offers more seamless application updating.
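The patch.jar arrangement relies on class lookup order: the first jar on the search path that contains a class wins. The launcher below is a minimal sketch of that idea using URLClassLoader rather than whatever mechanism the original launcher used. The jar names come from the answer; the install directory argument and the main class name com.example.Main are placeholders.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;

final class PatchFirstLauncher {

    /** patch.jar comes first, so its classes shadow same-named ones in application.jar. */
    static URL[] searchPath(Path installDir) {
        try {
            return new URL[] {
                installDir.resolve("patch.jar").toUri().toURL(),
                installDir.resolve("application.jar").toUri().toURL()
            };
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad install directory: " + installDir, e);
        }
    }

    public static void main(String[] args) throws Exception {
        Path installDir = Paths.get(args.length > 0 ? args[0] : ".");
        String mainClass = args.length > 1 ? args[1] : "com.example.Main"; // placeholder name
        // The parent skips the application classpath, so application classes can only
        // come from the two jars. The loader stays open for the life of the application.
        URLClassLoader loader = new URLClassLoader(
                searchPath(installDir), ClassLoader.getSystemClassLoader().getParent());
        try {
            loader.loadClass(mainClass)
                  .getMethod("main", String[].class)
                  .invoke(null, (Object) new String[0]);
        } catch (ClassNotFoundException e) {
            System.out.println(mainClass + " not found in patch.jar or application.jar");
        }
    }
}
```

The same ordering effect can be had without any code by listing patch.jar before application.jar in the -cp argument of the java command.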

Is there a decent tool for comparing/diffing two Java packages?

I'm looking for a tool that will give me a high level view of which files are different between two fairly large Java packages. If I could then drill down into individual files then that would be good. I don't want to go file by file if possible.. any ideas?
thanks
Beyond Compare (and other diff tools) can do directory compares too ...
If you're more interested in API differences than content differences, check out JDiff.
For example the Google Guava project uses it to show changes between releases. Here is the r06 release diff: http://guava-libraries.googlecode.com/svn/tags/release06/javadoc/jdiff/changes.html
WinMerge is an excellent Windows standalone diff tool and I use it for almost all of my source files. It can navigate through folder structures (in your case, your Java packages).
If you do use version control, it integrates very well with TortoiseSVN (and perhaps others in the Tortoise family).
You can use pkgdiff tool to compare java archives:
pkgdiff A.jar B.jar
See sample report for args4j.
See also japi-compliance-checker for analysis of API changes in your java archives.
I use Kompare on Linux. Just a diff GUI front end, that can diff directories recursively. I believe there are many others (I'm sure I've seen a list somewhere).
Eclipse works well. Just select the two different packages (hold the Ctrl key, click on a package, click again on the other package), right click on one of the selected packages, go to the 'Compare With...' submenu, select 'Compare With Each Other'.
I have used Araxis Merge to do this too. It is also helpful for doing code merges. It is not free (about 80 bucks I think) but well worth it.
I always use Eclipse's Team Synchronize view (with the included CVS support; this requires one revision checked in and another on disk). It works the same way with the Subclipse plugin for Subversion.
If you are on Windows and don't have the code checked into a version control system, you could use WinMerge.
Last I checked kdiff3 worked both on *nix and windows.
