Jar configurations and their contents


While downloading Google Guice I noticed two main "types" of artifacts available on their downloads page:
guice-3.0.zip; and
guice-3.0-src.zip
Upon downloading them both and inspecting their contents, they seem to be two totally different "perspectives" of the Guice 3.0 release.
The guice-3.0.zip just contains the Guice jar and its dependencies. The guice-3.0-src.zip, however, did not contain the actual Guice jar, but it did contain all sorts of other goodness: javadocs, examples, etc.
So it got me thinking: there must be different "configurations" of jars that get released inside Java projects. Crossing this idea with what little I know from build tools like Ivy (which has the concept of artifact configurations) and Maven (which has the concept of artifact scopes), I am wondering what the relation is between artifact configuration/scope and the end deliverable (the jar).
Let's say I was making a utility jar called my-utils.jar. In its Ivy descriptor, I could cite log4j as a compile-time dependency, and junit as a test dependency. I could then specify which of these two "configurations" to resolve against at buildtime.
What I want to know is: what is the "mapping" between these configurations and the content of the jars that are produced in the end result?
For instance, I might have all of my compile configuration dependencies wind up in the main my-utils.jar, but would there ever be a reason to package my test dependencies into a my-utils-test.jar? And what kind of dependencies would go in the my-utils-src.jar?
I know these are a lot of tiny questions, so I guess you can sum everything up as follows:
For a major project, what are the typical varieties of jars that get released (such as guice-3.0.zip vs guice-3.0-src.zip, etc.), what are the typical contents of each, and how do they map back to the concept of Ivy configurations or Maven scopes?

The one you need to run is guice-3.0.zip. It has the .class files in the correct package structure.
The other JAR, guice-3.0-src.zip, has the .java source files and other things that you might find useful. A smart IDE, like IntelliJ, can use the source JAR to allow you to step into the Guice code with a debugger and see what's going on.
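If you want to check for yourself what a given download actually contains, a rough sketch like the one below tallies the entries of a zip or jar by file extension, using only the JDK's java.util.zip API. The class name ArchiveSummary and the "(none)" bucket are made up for illustration; nothing here is specific to Guice.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: counts archive entries per file extension.
public class ArchiveSummary {

    public static Map<String, Integer> countByExtension(Path archive) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (entry.isDirectory()) {
                    continue;
                }
                String name = entry.getName();
                int slash = name.lastIndexOf('/');
                int dot = name.lastIndexOf('.');
                // Only a dot inside the last path segment counts as an extension.
                String ext = dot > slash ? name.substring(dot) : "(none)";
                counts.merge(ext, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        for (String arg : args) {
            System.out.println(arg + " -> " + countByExtension(Path.of(arg)));
        }
    }
}
```

Run against a binary jar, this should report mostly .class entries, while a source archive should be dominated by .java files.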
You can also learn a lot by reading the Guice source code. It helps to see how developers who are smarter than you and me write code.
I'd say that the best example I've found is the Efficient Java Matrix Library at Google Code. That has an extensive JUnit test suite that's available along with the source, the docs, and everything else that you need. I think it's most impressive. I'd like to emulate it myself.

Related

How to identify potential Java dependency namespace conflicts in a Maven project?

I have several huge legacy applications that I am now working on. After months of testing, we finally reached deployment only to have a "failed to load webapplicationcontext" error, which foiled the whole endeavor. That specific failure was due to a namespace conflict between two transitive dependencies, i.e., both jars contained a class that loads as org.something.somethingelse.ClassName.
There are ~100 jars pulled in via maven for this single project. Several explicit, most transitive. Ideally, I would like to know every single jar I'm putting on my classpath. Practically, though, I don't have enough experience or time to look through every one of them for potential issues.
Is there a tool, technique, or eclipse/intelliJ feature that I can use to scan a set of jars for similar namespaces?

You can try the enforcer plugin. In a Maven project, it's very useful when you need to detect jar dependencies that pull in the same artifact with different versions.
You can read this post too.

So there were a couple of different solutions here. I ended up using jhades (http://jhades.github.io/) to identify conflicts within the war, and then tattletale (a utility provided by JBOSS support) to identify conflicts between the war and the container.
I added 'exclude *' tags to all the explicit dependencies to prevent any transitive dependencies from loading. I added explicit dependencies for anything that still wasn't present. After ensuring that all compiled dependencies played nicely, I set any libraries identified by tattletale to provided and added the necessary module to standalone.xml. These were things like Hibernate, Apache libs, servlet APIs, etc.
The other thing I discovered which made this so difficult to identify in the first place is that JBOSS's classloader indexes libraries according to how the hosting file system orders them. On Windows, which is where we do 90% of our development, they are always loaded alphabetically. On linux, where we do our production deployments, the order is pseudo random. Our production servers are built from the same images, so a RHEL 3.4 server will load in the same order as another 3.4, but a 3.5 will load entirely differently. Thus, we did not see a failure until the stars aligned and we deployed to a 3.6 server. In production.
Hope this helps someone.
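For a quick first pass before reaching for a dedicated tool, the scan itself can be sketched with the JDK's java.util.jar API: walk every jar, record which jars contain each class name, and keep only the names that show up more than once. This is just a minimal illustration, not how jhades or tattletale work internally. The class name DuplicateClassFinder is made up, and skipping META-INF and module-info entries is an assumption to avoid noise from multi-release jars.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: reports class names that occur in more than one jar.
public class DuplicateClassFinder {

    public static Map<String, List<Path>> findDuplicates(List<Path> jars) throws IOException {
        Map<String, List<Path>> owners = new TreeMap<>();
        for (Path jar : jars) {
            try (JarFile jarFile = new JarFile(jar.toFile())) {
                Enumeration<JarEntry> entries = jarFile.entries();
                while (entries.hasMoreElements()) {
                    String name = entries.nextElement().getName();
                    // Assumption: ignore META-INF content and module descriptors.
                    if (!name.endsWith(".class")
                            || name.startsWith("META-INF/")
                            || name.endsWith("module-info.class")) {
                        continue;
                    }
                    String className = name
                            .substring(0, name.length() - ".class".length())
                            .replace('/', '.');
                    owners.computeIfAbsent(className, k -> new ArrayList<>()).add(jar);
                }
            }
        }
        // Keep only classes that are provided by two or more jars.
        owners.values().removeIf(paths -> paths.size() < 2);
        return owners;
    }

    public static void main(String[] args) throws IOException {
        List<Path> jars = new ArrayList<>();
        for (String arg : args) {
            jars.add(Path.of(arg));
        }
        findDuplicates(jars).forEach((cls, paths) -> System.out.println(cls + " -> " + paths));
    }
}
```

Pass it the jars from WEB-INF/lib (or whatever ends up on the classpath) as arguments. Note that it only compares names, so identical copies of the same class are reported along with genuinely conflicting ones.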

The idiomatic structure for gradle project for the tests

Task: I have a large non-Gradle (make :-)) project that contains many subprojects, each one in its own subdirectory. I have to write functional tests for some of these subprojects. The subprojects produce independent results, but with the same structure, so there is a lot of common code for testing them, and I want to share it in some common location.
Restrictions:
as developers requested, the tests for subprojects should be in the directory of this subproject (to be precise, in the subdirectory, for example, func_tests).
I have some shared dependencies for my test projects that I usually use, for example Google Guava, TestNG and so on, and also some settings for the test run (excludeGroups 'slow'...). I would prefer these settings to be common, but that doesn't matter too much.
symbolic links are an acceptable approach, if that's good design :)
If it's possible, I want to have IntelliJ IDEA correctly handle this dependency.
My ideas:
symlink src/main of every test subproject to some common directory (src/test is "individual"). This will work well with the IDE, but it would lead to copying all the dependencies and preferences. Also, I'm very unsure whether that's the preferred way in Gradle.
create a common project, which will be imported by every subproject. This will save the dependencies (will it?), but I'm not sure IDEA will correctly handle this approach.
What is the idiomatic way to do this with Gradle?

Look at samples/java/withIntegrationTests in your Gradle installation. This will give you some idea how to add your tests (there are other ways too). You want to tweak that setup to make sure that IDEA handles your tests. This is done by customization of idea.module.scopes.
Shared code and shared libraries: you can create a map like https://github.com/gradle/gradle/blob/master/gradle/dependencies.gradle and use it in different subprojects. BTW: Gradle codebase has a lot of integration tests and you can check how their build is configured to see if you want to apply some ideas.

Including .jar files in Github for consistency

I am new to using GitHub and have been trying to figure out this question by looking at other people's repositories, but I cannot figure it out. When people fork/clone repositories on GitHub to their local computers to develop on the project, is it expected that the cloned project is complete (i.e., it has all of the files that it needs to run properly)? For example, if I were to use a third-party library in the form of a .jar file, should I include that .jar file in the repository so that my code is ready to run when someone clones it, or is it better to just make a note that you are using such-and-such third-party libraries and the user will need to download those libraries elsewhere before they begin work? I am just trying to figure out the best practices for my code commits.
Thanks!

Basically it is as Chris said.
You should use a build system that has a package manager. This way you specify which dependencies you need and it downloads them automatically. Personally I have worked with maven and ant. So, here is my experience:
Apache Maven:
First word about maven, it is not a package manager. It is a build system. It just includes a package manager, because for java folks downloading the dependencies is part of the build process.
Maven comes with a nice set of defaults. This means you just use the archetype plugin to create a project ("mvn archetype:create" on the cli; newer versions of the archetype plugin use "mvn archetype:generate" instead). Think of an archetype as a template for your project. You can choose whatever archetype suits your needs best. In case you use some framework, there is probably an archetype for it. Otherwise the simple-project archetype will be your choice. Afterwards your code goes to src/main/java, your test cases go to src/test/java and "mvn install" will build everything. Dependencies can be added to the pom in maven's dependency format. http://search.maven.org/ is the place to look for dependencies. If you find it there, you can simply copy the xml snippet to your pom.xml (which has been created by maven's archetype system for you).
In my experience, maven is the fastest way to get a project with dependencies and test execution set up. Also I never experienced that a maven build which worked on my machine failed somewhere else (except for computers which had year-old java versions). The charm is that maven's default lifecycle (or build cycle) covers all your needs. Also there are a lot of plugins for almost everything. However, you have a big problem if you want to do something that is not covered by maven's lifecycle. However, I only ever encountered that in mixed-language projects. As soon as you need anything but java, you're screwed.
Apache Ivy:
I've only ever used it together with Apache Ant. However, Ivy is a package manager, while Ant provides a build system. Ivy is integrated into Ant as a plugin. While Maven usually works out of the box, Ant requires you to write your build file manually. This allows for greater flexibility than Maven, but comes at the price of yet another file to write and maintain. Basically Ant files are as complicated as any source code, which means you should comment and document them. Otherwise you will not be able to maintain your build process later on.
Ivy itself is as easy as maven's dependency system. You have an xml file which defines your dependencies. As for maven, you can find the appropriate xml snippets on maven central http://search.maven.org/.
As a summary, I recommend Maven in case you have a simple Java Project. Ant is for cases where you need to do something special in your build.

Classloader to isolate a jar (class identity crisis)

I'm using jarX that has embedded dependencies that conflict with my own dependencies, so I'm creating a classloader to isolate jarX's dependencies from my main classloader.
jarX is outside my app's classpath, but my classes that use jarX's classes are in my classpath, so when I instantiate my classes loaded via the custom classloader, I run into the class identity crisis in the form of ClassCastException as the JVM's version of my classes are considered different from those loaded by my custom classloader.
I found this blog post where they solved a similar problem by only interacting with the custom classloader loaded classes via reflection, which seems to solve this problem.
It just feels like it should be easier than this. Does anyone know a better way to handle this problem?
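To make the identity problem and the reflection workaround concrete, here is a small self-contained sketch. It does not use a real jarX: the nested Greeter class stands in for a class from the isolated jar, and the hypothetical IsolatingLoader defines its own copy of that one class from the bytes found on the normal classpath (this assumes the compiled .class file is reachable as a classpath resource). A real setup would more likely point a URLClassLoader at the jar itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;

public class ClassIdentityDemo {

    /** Stand-in for a class that would normally live inside jarX. */
    public static class Greeter {
        public String greet(String name) {
            return "hello " + name;
        }
    }

    /** Hypothetical loader that defines one named class itself instead of delegating. */
    static class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(ClassLoader parent, String isolatedName) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    c = defineFromParentResource(name);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        private Class<?> defineFromParentResource(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Creates a Greeter whose class was defined by the isolating loader. */
    public static Object newIsolatedGreeter() throws ReflectiveOperationException {
        String name = Greeter.class.getName();
        ClassLoader loader = new IsolatingLoader(ClassIdentityDemo.class.getClassLoader(), name);
        return loader.loadClass(name).getDeclaredConstructor().newInstance();
    }

    /** Calls greet(String) reflectively, so no cast to the application's Greeter is needed. */
    public static String greetReflectively(Object greeter, String name) throws ReflectiveOperationException {
        Method greet = greeter.getClass().getMethod("greet", String.class);
        return (String) greet.invoke(greeter, name);
    }

    public static void main(String[] args) throws Exception {
        Object isolated = newIsolatedGreeter();
        // Same class name, different defining loader: not the same type to the JVM.
        System.out.println("instanceof Greeter: " + (isolated instanceof Greeter));
        System.out.println(greetReflectively(isolated, "world"));
    }
}
```

A cast such as (Greeter) isolated is what triggers the ClassCastException described above. Going through reflection, or through an interface that only the parent loader defines, avoids the cast.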

The easiest way is to open jarX, remove the offending classes, and be done. It is bad practice to embed dependencies in a JAR unless that JAR is meant to be used only as a standalone runnable fat-jar. JARs that are meant to be used as libraries should not embed dependencies.
When you notice that people package third-party classes in their JARs, I'd recommend pointing out to them that this is generally not a good idea and to encourage them to refrain from doing so. If a project provides a runnable fat-jar including all dependencies, that is fine. But, it should not be the only JAR they provide. A plain JAR or set of JARs without any third-party code should also be offered. In the rare cases that third-party code was modified and must be included, it should be done under the package namespace of the provider, not of the original third-party.
Finally, for real solutions to building modular Java applications and handling classloader isolation, check out one of the several OSGi implementations or project Jigsaw.

Can you post which jar it is and which classes it overlaps, with the full stacktrace? Have a look at this tool I wrote to generate a list of duplicate classes in the WAR; there is an option to exclude duplicates of the same size.
These are some measures that can be done to solve this:
Try to reduce the number of duplicates by doing a case by case analysis of why the overlap exists. Add maven exclusions for jars that are complete duplicates.
Check if there is a version of the same jar without the dependencies that you could use. Which jar is it: xerces, etc.?
If there is no jar without dependencies, you can exclude the other jar that overlaps jarX and see if the application still works. This means all components that need the jar have a compatible version of the jarX library.
Separate the application into two WARs each with the version of the library you need. This will reduce the number of libraries in which
These are measures that are likely to be more maintainable long-term.
If the previous measures do not work:
open the jar, delete the duplicate classes and publish in the maven repository with a different name jarX-patched
you can configure nexus to serve a patched jar instead of an unpatched jar transparently
If your container supports OSGi, that would be even better, but if you don't use an OSGi container for development as well, then the application would not work in development.

Is there any disadvantage to putting API code into a JAR along with the classes?

In Java, if you package the source code (.java) files into the jar along with the classes (.class), most IDEs like Eclipse will show the javadoc comments for code completion.
IIRC there are few open-source projects that do this like JMock.
Let's say I have cleanly separated my API code from implementation code so that I have something like myproject-api.jar and myproject-impl.jar. Is there any reason why I should not put the source code in my myproject-api.jar?
Because of Performance? Size?
Why don't other projects do this?
EDIT: Other than the Maven download problem will it hurt anything to put my sources into the classes jar to support as many developers as possible (maven or not)?

Generally for distribution reasons:
if you keep separate binaries and sources, you can download only what you need.
For instance:
myproject-api.jar and myproject-impl.jar
myproject-api-src.jar and myproject-impl-src.jar
myproject-api-docs.zip and myproject-impl-docs.zip
Now, m2eclipse - Maven for Eclipse can download sources automatically as well
mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true
Now, it can also generate the right pom to prevent distribution of the source or javadoc jar when anyone declares a dependency on your jar.
The OP comments:
also can't imagine download size being an issue (i mean it is 2010 a couple 100k should not be a problem).
Well, actually it (i.e. the size) is a problem.
Maven suffers already from the "downloading half the internet on first build" syndrome.
If that downloads also sources and/or javadocs, that begins to be really tiresome.
Plus, the "distribution" aspect includes the deployment: in a webapp server, there is no real advantage to deploy a jar with sources in it.
Finally, if you really need to associate sources with binaries, this SO question on Maven could help.

Using maven, attach the sources automatically like this:
http://maven.apache.org/plugins/maven-source-plugin/usage.html
and the javadocs like this:
http://maven.apache.org/plugins/maven-javadoc-plugin/jar-mojo.html
That way they will automatically be picked up by
mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true
or by m2eclipse
