I need to remove unused classes from third-party JARs. What tools should I use?
I already tried ProGuard. However, it only removes unused classes from the project itself; the third-party library jars always remain unchanged.
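You can create an uber jar (a single jar containing both your classes and the library classes) and then run ProGuard on it. Splitting the shrunk library classes back into separate jars would be a challenge, but judging by the spirit of your question you will probably prefer the uber jar anyway.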
As other posters have commented, you still need to be careful about classes loaded through the so much abused and misunderstood reflection mechanism.
@Joonas Pulakka is right. But if you still really want to do this and be sure that your application will not fail with a ClassNotFoundException, run your application with the option -verbose:class, exercise every use case that exists, and keep the log, which lists all loaded classes. Then take the list of all classes in your third-party library and find the ones that were never loaded. Finally, create an alternative jar file that contains only the "needed" classes and pray :)
Good luck.
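The -verbose:class approach above can be partly automated. The following is only a sketch, not a definitive tool: the class name UnusedClassFinder and its method names are made up for illustration, and the two log patterns are assumptions based on the line formats HotSpot prints (the "[Loaded ... from ...]" form on Java 8 and earlier, the "[class,load] ... source: ..." form on Java 9 and later). Classes loaded only through reflection in code paths you did not exercise will still show up as "never loaded".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Enumeration;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnusedClassFinder {
    // Assumed Java 8 format: [Loaded com.example.Foo from file:/libs/lib.jar]
    private static final Pattern LEGACY = Pattern.compile("\\[Loaded (\\S+) from .*");
    // Assumed Java 9+ format: [0.012s][info][class,load] com.example.Foo source: file:/libs/lib.jar
    private static final Pattern UNIFIED =
            Pattern.compile(".*\\[class,load\\s*\\] (\\S+) source: .*");

    /** Extracts the names of all classes the log reports as loaded. */
    static Set<String> loadedClasses(List<String> logLines) {
        Set<String> loaded = new TreeSet<>();
        for (String line : logLines) {
            Matcher legacy = LEGACY.matcher(line);
            Matcher unified = UNIFIED.matcher(line);
            if (legacy.matches()) {
                loaded.add(legacy.group(1));
            } else if (unified.matches()) {
                loaded.add(unified.group(1));
            }
        }
        return loaded;
    }

    /** Returns the classes among the jar entries that never appear in the loaded set. */
    static Set<String> neverLoaded(Collection<String> jarEntryNames, Set<String> loaded) {
        Set<String> unused = new TreeSet<>();
        for (String entry : jarEntryNames) {
            if (!entry.endsWith(".class")) {
                continue; // skip manifests, resources, directories
            }
            // com/example/Foo.class -> com.example.Foo
            String className = entry
                    .substring(0, entry.length() - ".class".length())
                    .replace('/', '.');
            if (!loaded.contains(className)) {
                unused.add(className);
            }
        }
        return unused;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: UnusedClassFinder <verbose-class.log> <library.jar>");
            return;
        }
        Set<String> loaded = loadedClasses(Files.readAllLines(Paths.get(args[0])));
        List<String> entryNames = new ArrayList<>();
        try (JarFile jar = new JarFile(args[1])) {
            for (Enumeration<JarEntry> e = jar.entries(); e.hasMoreElements();) {
                entryNames.add(e.nextElement().getName());
            }
        }
        neverLoaded(entryNames, loaded).forEach(System.out::println);
    }
}
```

The output is a list of candidates for removal, not a guarantee; treat it as input for review before building the trimmed jar.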
It is a good thing to know what classes and libraries you are using, and even if there is risk (as pointed out by Peter) to removing unused stuff, there is cost in carrying any kind of excess baggage, and you shouldn't just keep accumulating. If you use reflection, then get a handle on what you are using it for, and systematically get rid of what you don't need. There are benefits to a leaner code-base that you understand better.
Java only loads classes as they are used. Removing classes can only cause you problems and won't help you at runtime. 36 MB of code isn't that much, given that only a portion of it will be loaded. How much memory do you have? Most PCs have at least 2000 MB these days. If you are downloading your applet or Java WebStart application over a slow link, I would imagine you are using pack200 (to make the jars smaller) and have included the minimum of libraries already.
I am working on a project in which code is taken from other sources/teams, and most of the Java classes are unused and only increase my war file size. Will removing these unused Java class files improve my web server's performance? Its load is always high and it responds slowly. I understand it is a silly question, but I would still like suggestions. Thanks.
No, it wouldn't. Classes are loaded as needed, or when manually loaded by a classloader.
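A small sketch of the "loaded as needed" behaviour. Strictly speaking it observes class initialization (which the language specification ties to first use) rather than loading itself; HotSpot typically also defers loading, which you can confirm separately with -verbose:class. The names LazyLoadingDemo and Heavy are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class LazyLoadingDemo {
    static final List<String> EVENTS = new ArrayList<>();

    // Stands in for an "unused" class shipped in the war file.
    static class Heavy {
        static {
            // Runs only when Heavy is first actively used.
            EVENTS.add("Heavy initialized");
        }

        static int answer() {
            return 42;
        }
    }

    /** Records what happens when Heavy is, or is not, touched. */
    static List<String> run(boolean useHeavy) {
        EVENTS.clear();
        EVENTS.add("start");
        if (useHeavy) {
            EVENTS.add("answer=" + Heavy.answer());
        }
        EVENTS.add("end");
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // Heavy is never touched on this path
        System.out.println(run(true));  // first use triggers the static initializer
    }
}
```

As long as no code path touches Heavy, its initializer never runs, which is why merely shipping unused classes costs disk space rather than runtime work.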
Separate Jars
When creating JAR files, I've always kept the source separate and offered it as an optional extra.
eg:
Foo.jar
Foo-source.jar
It seems to be the obvious way to do things and is very common. Advantages being:
Keeps binary jar small
Source may not be open / public
Faster for classloader? (I've no idea, just guessing)
Single Jar
I've started to doubt whether these advantages are always worth it. I'm working on a tiny component that is open-source. None of the advantages I've listed above were problems in this project anyway:
Classes + source still trivially small (and will remain that way)
Source is open
Class loading speed of this jar is irrelevant
Keeping the source with the classes does however bring new advantages:
Single dependency
No issues of version mismatch between source and classes
Developers using this jar will always have the source to hand (to debug or inspect)
Those new advantages are really attractive to me. Yes, I could just zip source, classes and even javadoc into a zip file and let clients of my component decide which they want to use (like Google do with the guava libraries) but is it really worth it?
I know it goes against conventional software engineering logic a little, but I think the advantages of a single jar file out-weigh the alternatives.
Am I wrong? Is there a better way?
Yes, I could just zip source, classes and even javadoc into a zip file and let clients of my component decide which they want to use (like Google do with the guava libraries) but is it really worth it?
Of course it is worth it! It takes about 2 seconds to do it, or just a few minutes to change your build scripts.
This is the way that most people who distribute sources and binaries handle this problem.
EDIT
It is not your perspective you need to consider. You have to think of this from the perspective of the people deploying / using your software.
They aren't going to use the source code on the deployment platform.
Therefore putting the source code in the binary JAR is a waste of disc space, slows down deployment and slows down application startup.
If they want to do something about it, they've got a problem. How do they rebuild the JAR file to get rid of the source code? How do they know what is safe to leave out?
From the deployer / user's perspectives, there are no positives, only negatives.
Finally, your point about people not being able to track source versus binary versions doesn't really hold water. Most people who would be interested in the source code are perfectly capable of doing this. Besides, there are some simple things you can do to address the issue, like using JAR filenames that include your software's version number, or putting the version number into the manifest.
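As a sketch of the manifest suggestion: the standard Implementation-Version main attribute can carry the version, and both the binary and the source jar can be stamped with the same value. The class name ManifestVersion and the sample manifest text below are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

final class ManifestVersion {
    /** Parses manifest text; in a real build this would come from META-INF/MANIFEST.MF. */
    static Manifest parse(String manifestText) {
        try {
            return new Manifest(
                    new ByteArrayInputStream(manifestText.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the Implementation-Version main attribute, or null if it is absent. */
    static String implementationVersion(Manifest manifest) {
        return manifest.getMainAttributes().getValue(Attributes.Name.IMPLEMENTATION_VERSION);
    }

    public static void main(String[] args) {
        // Hypothetical manifest for Foo.jar; Foo-source.jar would carry the same version.
        Manifest manifest = parse(
                "Manifest-Version: 1.0\r\n"
                + "Implementation-Title: Foo\r\n"
                + "Implementation-Version: 1.2.3\r\n"
                + "\r\n");
        System.out.println(implementationVersion(manifest));
    }
}
```

For a jar already on the classpath, Package.getImplementationVersion() reads the same attribute without any parsing code.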
I have just come across a potential pitfall for the java+classes in a single jar.
If you have java files in a jar and that jar is included in the classpath of a subsequent javac execution, you MUST make sure that the timestamp of each java file is older than the timestamp of its class file.
This scenario can happen when you copy/move the java or class files prior to packaging as a jar.
If the java file is newer than the class, then even though the java file is found on the classpath (rather than an argument to javac), javac will attempt to compile that java file and then potentially end up with duplicate class errors during the compilation stage.
For this reason I would recommend keeping the source in a separate jar to the class files.
Note that relevant flags in javac will not allow you to prefer class over source: http://docs.oracle.com/javase/7/docs/technotes/tools/windows/javac.html#searching
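If you do ship sources and classes in one jar, a check along these lines could flag the problematic entries before the jar is published. It is a sketch only: the class name StaleSourceCheck is made up, and it simply pairs each X.java entry with an X.class entry at the same path and compares the timestamps stored in the jar.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

final class StaleSourceCheck {
    /**
     * Takes jar entry names mapped to their modification times (milliseconds)
     * and returns the .java entries that are newer than their matching .class entry.
     */
    static List<String> sourcesNewerThanClasses(Map<String, Long> entryTimes) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, Long> entry : entryTimes.entrySet()) {
            String name = entry.getKey();
            if (!name.endsWith(".java")) {
                continue;
            }
            String classEntry = name.substring(0, name.length() - ".java".length()) + ".class";
            Long classTime = entryTimes.get(classEntry);
            // A source file without a compiled counterpart is ignored here.
            if (classTime != null && entry.getValue() > classTime) {
                offenders.add(name);
            }
        }
        Collections.sort(offenders);
        return offenders;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: StaleSourceCheck <combined.jar>");
            return;
        }
        Map<String, Long> times = new HashMap<>();
        try (JarFile jar = new JarFile(args[0])) {
            for (Enumeration<JarEntry> e = jar.entries(); e.hasMoreElements();) {
                JarEntry entry = e.nextElement();
                times.put(entry.getName(), entry.getTime());
            }
        }
        sourcesNewerThanClasses(times).forEach(System.out::println);
    }
}
```

An empty result means no source entry is newer than its class file, which is the condition described above.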
I prefer 'Separate Jars'.
The binary class jar is for running on the JVM; the source is not. Source should be carefully maintained by your source control system (e.g. SVN). If the source needs to be released, zip it into a separate jar. Many open source projects separate the class jar from the source jar.
If you want others to test and inspect/improve your code then you can have your source with the binaries. If not, keep the source away from the jar.
How small is small and why should your jar act differently from others?
Unless you have a very good reason why your jar should contain the sources (not simply debugging, but something specific to this one jar), then I'd say no; giving users the choice is best.
I say this because if your jar should not be different from others, then you have to work on the assumption that others should do the same as you. If so, the size of one jar no longer matters on its own, because the overhead is duplicated across all the "small" jars. Then my WAR is much bigger than needed, which admittedly is not a massive issue, but it is not something I would choose for production when I can download sources in DEV so easily.
I guess this is kind of a follow-on to question 1522329.
That question talked about getting a list of all classes used at runtime via the java -verbose:class option.
What I'm interested in is automating the build of a JAR file which contains my class(es), and all other classes they rely on. Typically, this would be where I am using code from some third party open source product's "client logic" but they haven't provided a clean set of client API objects. Their complete set of code goes server-side, but I only need the necessary client bits.
This would seem a common issue but I haven't seen anything (e.g. in Eclipse) which helps with this. Am I missing something?
Of course I can still do it manually by: biting the bullet and including all the third-party code in a massive JAR (offending my purist sensibilities) / source walkthrough / trial and error / -verbose:class type stuff (but the latter wouldn't work where, say, my code runs as part of a J2EE servlet, and thus I only want to see this for a given Tomcat webapp and, ideally, only for classes related to my classes therein).
I would recommend using a build system such as Ant or Maven. Maven is designed with Java in mind, and is what I use pretty much exclusively. You can even have Maven assemble (using the assembly plugin) all of the dependent classes into one large jar file, so you don't have to worry about dependencies.
http://maven.apache.org/
Edit:
Regarding the servlet, you can also define which dependencies you want packaged up with your jar, and if you are making a stand alone application you can have the jar tool make an executable jar.
note: yes, I am a bit of a Maven advocate, as it has made the project I work on much easier. No I do not work on the project personally. :)
Take a look at ProGuard.
ProGuard is a free Java class file shrinker, optimizer, obfuscator, and preverifier. It detects and removes unused classes, fields, methods, and attributes. It optimizes bytecode and removes unused instructions. It renames the remaining classes, fields, and methods using short meaningless names. Finally, it preverifies the processed code for Java 6 or for Java Micro Edition.
What you want is to include not only the classes you rely on, but also the classes that those classes rely on, and so on, and so forth.
So that's not really a build problem, but more a dependency one. To answer your question, you can either solve this with Maven (apparently) or Ant + Ivy.
I work with Ivy and I sometimes build an "ueber-jar" using the zipgroupfileset functionality of the Ant Jar task. Not very elegant, some would say, but it's done in 10 seconds :-)
We have a developer who is in the habit of committing non-java files (xsd, dtd etc) in the java packages under the src/java folder in our repository. Admittedly, these are relevant files to that package, but I just hate to see non-java files in the src folder.
Is this a common practice that I should get used to, or are we doing something strange by maintaining these files like this?
The problem with putting non-Java (or other language) files that are closely tied to the code in a different place than the code is knowing where to find them. It is possible to standardize the locations so that, theoretically, everyone knows where to go and what to do. But I find that in practice that does not happen.
Imagine your app still being maintained 5 or 10 years down the road by a team of junior - intermediate developers that do not work at the company now and will never talk to anyone who works on your project now. Putting files closely linked to the source in the source package structure could make their lives easier.
I am a big proponent of eliminating as many ambiguities as possible within reason.
It's very common and even recommended as long as it's justifiable. Generally it's justifiable when it's a static resource (DTD+XSLT for proprietary formats, premade scripts etc.), but not when the file is something that's likely to be updated by a third party, like an IP/geographic location database dump.
I think it gets easier if you think of 'src' as not specifically meaning 'source code'. Think of it as the source of resources that are things needed by your program at compile time and/or runtime.
Things that are a product of compile or build activities should not go here.
Admittedly, like most things, exceptions may apply :)
Update:
Personally, I like to break down src further with subdirectories for each resource type underneath it. Others may like that division at a higher level.
There are a lot of jar libraries that use the same practice.
I think it is acceptable and comfortable.
In Eclipse it works well for us to have a src folder containing Java classes, and a configuration folder (which is blessed as a source folder) containing property files etc. Then they all go in the output folder together and can be found on the classpath, while still being in separate folders inside Eclipse.
One of the advantages of keeping all the auxiliary files next to the source is that version consistency is maintained between these 3rd party libraries and your source code. If you ever need to go back and debug a specific version, you can pull the entire set of source+config and have it all be the same version.
That being said I'd put them in a $project/config/ directory, or some such, rather than in $project/src/java itself. They're not source, nor java, really, so it's misleading having them in that directory.
When you really get down to it, though, this is an issue of personal style. There's no "Right" answer and you should be talking with those team members and understanding why they made this decision. Using this thread as evidence to support a unilateral decision probably won't go over well. ;)
It's pretty common; you can find it in really popular frameworks, e.g. the xsd files for Spring's various schemas. Also, people usually place Hibernate mapping files in the same package as the model classes.
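One practical reason for keeping such files in the package: a resource name without a leading slash is resolved relative to the package of the class you ask, so the code that needs the file can find it without any configured path. A minimal sketch, assuming Java 9+ for InputStream.readAllBytes; the class name PackageResource and the file name schema.xsd are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class PackageResource {
    /**
     * Reads a resource that sits next to the given class on the classpath.
     * Returns an empty Optional if no such resource exists.
     */
    static Optional<String> read(Class<?> anchor, String name) {
        // A name without a leading '/' is looked up in the anchor class's package.
        try (InputStream in = anchor.getResourceAsStream(name)) {
            if (in == null) {
                return Optional.empty();
            }
            return Optional.of(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "schema.xsd" is a hypothetical file committed beside this class.
        Optional<String> schema = read(PackageResource.class, "schema.xsd");
        System.out.println(schema.isPresent() ? "found schema.xsd" : "schema.xsd not on classpath");
    }
}
```

The same lookup works whether the file came from the Java source folder or from a separate resources folder, as long as both end up in the same package on the classpath.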
I think this is common as long as the files are necessary. The problems arise when people start committing files that are not needed with the source, such as design specs or random text files.
It is surely common, but incredibly lazy and sloppy. My skin crawls when I see it.
Using a tool such as Maven to build your products enables you to easily, and clearly separate code from resources.
Eclipse bundles can be similarly separated.
A customer requires a preview of a new feature of our product. They asked to have that feature sent to them in a jar file (like a patch). There's no problem with including the new classes in said jar file. However, an existing class was modified, which is needed to integrate the new feature. They just want to add this new jar file without having to update the core classes of our product. So, the question is: is it possible to override an already existing class using a separate jar? If so, how?
Thanks in advance.
There's a chance it'll work if you put the new jar earlier in the classpath than the original jar. It's worth trying, although it still sounds like a recipe for disaster - or at least, really hard to debug issues if somehow both classes are loaded.
EDIT: I had planned to write this bit earlier, but got interrupted by the end of a train journey...
I would go back to the customer and explain that while what they're asking is possible, it may cause unexpected problems. Updating the jar file is a much safer fix, with much less risk. The phrases "unexpected problems" and "risk" are likely to ring alarm bells with the customer, so hopefully they'll let you do the right thing.
Yes and no, it depends on your environment.
If you use, for example, OSGi and have your versions under control, it's just a matter of installing a new bundle with the exported package at a higher version (assuming your version ranges are lenient enough).
If you use plain old Java with no fancy custom class loading, you should be good to go putting it earlier on your class path (as others already mentioned).
If you do have custom class loading, you'll need to make sure that all the classes that your 'patched' class needs, and indeed the entire transitive dependency hull, is visible from the class loader which is loading the patched version, which might mean you need to ship the entire application, worst case.
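For the plain-classpath case, it helps to know exactly which classes the patch jar shadows. The sketch below reports every class that appears in more than one classpath entry, listed in search order, so the first entry is the copy a plain classpath lookup finds. The class name ClasspathShadowReport and the jar and class names in main are made up for illustration; in practice the entry lists would be read from the real jars.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ClasspathShadowReport {
    /**
     * Input: classpath entries in search order, each mapped to the entry names it contains.
     * Output: every class found in two or more entries, mapped to those entries in
     * search order (the first one wins on a plain classpath).
     */
    static Map<String, List<String>> duplicates(LinkedHashMap<String, Set<String>> classpath) {
        Map<String, List<String>> owners = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> cpEntry : classpath.entrySet()) {
            for (String name : cpEntry.getValue()) {
                if (name.endsWith(".class")) {
                    owners.computeIfAbsent(name, k -> new ArrayList<>()).add(cpEntry.getKey());
                }
            }
        }
        // Keep only the classes that exist in more than one place.
        owners.values().removeIf(jars -> jars.size() < 2);
        return owners;
    }

    public static void main(String[] args) {
        // Hypothetical layout: the patch jar is listed before the core jar.
        LinkedHashMap<String, Set<String>> classpath = new LinkedHashMap<>();
        classpath.put("feature-patch.jar", new TreeSet<>(Arrays.asList(
                "com/acme/Core.class", "com/acme/NewFeature.class")));
        classpath.put("product-core.jar", new TreeSet<>(Arrays.asList(
                "com/acme/Core.class", "com/acme/Util.class")));

        duplicates(classpath).forEach((cls, jars) ->
                System.out.println(cls + " is loaded from " + jars.get(0)
                        + ", shadowing " + jars.subList(1, jars.size())));
    }
}
```

Any class in the report is one where classpath order decides which version runs, so it is worth listing those explicitly when handing the patch to the customer.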
All of the answers that stipulate putting the updated classes before the ones they are replacing in the classpath are correct, only provided the original JAR is not sealed or signed.
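Whether the original jar is sealed or signed can be checked before trying the classpath trick. This is a rough sketch, not a full verifier: it looks only at the Sealed attribute in the manifest's main section (packages can also be sealed individually in per-entry sections) and treats the presence of signature files directly under META-INF as a sign that the jar is signed. The class name JarPatchability is made up for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Locale;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

final class JarPatchability {
    /** True if the manifest's main section declares the whole jar sealed. */
    static boolean isSealed(Manifest manifest) {
        String sealed = manifest.getMainAttributes().getValue(Attributes.Name.SEALED);
        return "true".equalsIgnoreCase(sealed);
    }

    /** True if the entry list contains signature files directly under META-INF. */
    static boolean looksSigned(Collection<String> entryNames) {
        for (String name : entryNames) {
            String upper = name.toUpperCase(Locale.ROOT);
            boolean directlyInMetaInf =
                    upper.startsWith("META-INF/") && upper.indexOf('/', "META-INF/".length()) < 0;
            boolean signatureFile = upper.endsWith(".SF") || upper.endsWith(".RSA")
                    || upper.endsWith(".DSA") || upper.endsWith(".EC");
            if (directlyInMetaInf && signatureFile) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical jar contents, built in memory for the example.
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        manifest.getMainAttributes().put(Attributes.Name.SEALED, "true");
        System.out.println("sealed: " + isSealed(manifest));
        System.out.println("signed: " + looksSigned(Arrays.asList(
                "META-INF/MANIFEST.MF", "com/acme/Core.class")));
    }
}
```

If either check is positive, overriding a class in that jar's packages from a second jar is likely to fail at load time, typically with a SecurityException.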
Yes, it may be possible, by putting it earlier on the classpath than your original jar. However, relying on the ordering of your classpath is not always going to lead to happiness. I'm not sure if it is even documented in the Java Language Spec; if not, then it's going to break for different JVMs and even different versions of the same JVM.
Instead, consider quoting a realistic time frame to integrate the new feature into the current codebase. This is perhaps not the answer you're looking for.
Probably more than you need for this specific case, but in general, if you just want to tweak or augment an existing class, you can also use AspectJ with load-time weaving.