I am very new to Java. I am running somebody else's program on my computer, and they have imports like:
import weka.classifiers.CostMatrix;
import weka.classifiers.Evaluation;
import weka.classifiers.meta.CostSensitiveClassifier;
import weka.core.*;
The program actually works for me, but I am surprised because weka is a pretty specialized program, so I doubt it is distributed with Java. I never installed weka using any package manager, and I have searched the program code and it doesn't contain any weka packages explicitly.
Do you have any tips for figuring out 1) where these packages are installed, and 2) how I "got" these packages on my local computer? I have read that Java doesn't have a centralized package manager like Python or Perl do, so that might make it harder. I am super new to Java so any basic tips about package management would also be appreciated.
These packages are dependencies of your project, so they have probably been downloaded automatically by a tool that manages dependencies.
There are several possible build tools that can do that. Since you are working with Java/JVM, the usual suspects are Maven and Ant or maybe (less likely) Gradle or SBT.
In your case, the most probable scenario is:
A Maven plugin somewhere in your IDE manages the dependencies and downloads the jars (mvn in console less likely: you would have noticed if you used it)
A pom.xml build definition file lists all the dependencies
A weka dependency is probably declared somewhere in the pom, it should look roughly like this:
<dependency>
    <groupId>nz.ac.waikato.cms.weka</groupId>
    <artifactId>weka-stable</artifactId>
    <version>3.8.0</version>
</dependency>
The JARs are stored in a hidden directory .m2 (or maybe .ivy2) in your home directory.
The idea is that you can simply get the source code files and the pom.xml, and let Maven (or a similar build tool) download all dependencies, get all the required compiler plugins (or test-coverage tools, or whatever), and build your project. If you tried to do without a build tool, you would have to pass around eternally long lists of dependencies with version numbers that have to be obtained somehow before your program can be compiled, and this would be just a huge mess.
Edit: It is probably downloaded from here: Maven Central: weka-stable
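If you want to confirm where a particular class actually comes from, you can ask the JVM directly. The following is a minimal sketch, not part of the original program: the class name WhereIsMyClass and the helper locationOf are made up for illustration, and it relies only on the standard ProtectionDomain/CodeSource API. Run it with the same classpath as the program and pass a class name such as weka.core.Instances.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper (not from the original program): reports which JAR or
// directory a class was loaded from.
final class WhereIsMyClass {

    /** Returns the JAR or directory a class was loaded from, or a note for core JDK classes. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Core JDK classes such as java.lang.String have no code source.
        if (source == null || source.getLocation() == null) {
            return "(part of the Java runtime itself)";
        }
        URL location = source.getLocation();
        return location.toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully-qualified class name as the first argument.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + locationOf(Class.forName(name)));
    }
}
```

For a class that came from Maven, the printed location would typically point somewhere below the .m2 directory mentioned above.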
It wouldn't run unless those packages were on the classpath, passed at runtime via
java -classpath
or unless you're running an uber JAR file that contains the libraries.
Common solutions for dependency management include a pom.xml (Maven), build.gradle (Gradle), or build.sbt (SBT).
Those aren't the only options: another possibility is that those JAR libraries were copied into your Java installation somehow.
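To see what is actually on the classpath of a running program, you can print the standard java.class.path system property. This is only a sketch: the class name ClasspathLister and the method entries are invented for illustration. Note that an IDE or build tool may assemble the classpath for you, so the list can be longer than you expect.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: splits a classpath string into its individual entries.
final class ClasspathLister {

    static List<String> entries(String classpath) {
        List<String> result = new ArrayList<>();
        // The separator is ':' on Unix-like systems and ';' on Windows.
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (!entry.isEmpty()) {
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints one classpath entry (JAR file or directory) per line.
        entries(System.getProperty("java.class.path")).forEach(System.out::println);
    }
}
```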
I am new to using GitHub and have been trying to answer this question by looking at other people's repositories, but I cannot figure it out. When people fork or clone repositories from GitHub to their local computers to develop on the project, is the cloned project expected to be complete (i.e., to contain all of the files it needs to run properly)? For example, if I use a third-party library in the form of a .jar file, should I include that .jar file in the repository so that my code is ready to run when someone clones it? Or is it better to just note that I am using such-and-such third-party libraries, which the user will need to download elsewhere before beginning work? I am just trying to figure out the best practices for my code commits.
Thanks!
Basically it is as Chris said.
You should use a build system that has a package manager. This way you specify which dependencies you need and it downloads them automatically. Personally I have worked with maven and ant. So, here is my experience:
Apache Maven:
First, a word about Maven: it is not a package manager. It is a build system. It just includes a package manager, because for Java folks downloading the dependencies is part of the build process.
Maven comes with a nice set of defaults. This means you just use the archetype plugin to create a project ("mvn archetype:generate" on the cli; older versions of the plugin used "mvn archetype:create"). Think of an archetype as a template for your project. You can choose whatever archetype suits your needs best. In case you use some framework, there is probably an archetype for it. Otherwise the simple-project archetype will be your choice. Afterwards your code goes to src/main/java, your test cases go to src/test/java and "mvn install" will build everything. Dependencies can be added to the pom in Maven's dependency format. http://search.maven.org/ is the place to look for dependencies. If you find it there, you can simply copy the xml snippet to your pom.xml (which has been created by Maven's archetype system for you).
In my experience, Maven is the fastest way to get a project with dependencies and test execution set up. Also, I have never had a Maven build that worked on my machine fail somewhere else (except on computers with years-old Java versions). The charm is that Maven's default lifecycle (or build cycle) covers all your needs. Also, there are a lot of plugins for almost everything. However, you have a big problem if you want to do something that is not covered by Maven's lifecycle. That said, I have only ever encountered that in mixed-language projects. As soon as you need anything but Java, you're screwed.
Apache Ivy:
I've only ever used it together with Apache Ant. Ivy is a package manager; Ant provides the build system. Ivy is integrated into Ant as a plugin. While Maven usually works out of the box, Ant requires you to write your build file manually. This allows for greater flexibility than Maven, but comes at the price of yet another file to write and maintain. Basically, Ant files are as complicated as any source code, which means you should comment and document them. Otherwise you will not be able to maintain your build process later on.
Ivy itself is as easy as Maven's dependency system. You have an xml file which defines your dependencies. As with Maven, you can find the appropriate xml snippets on Maven Central http://search.maven.org/.
As a summary, I recommend Maven in case you have a simple Java Project. Ant is for cases where you need to do something special in your build.
Dependency issues, we've all dealt with them, but I'm mostly used to C# and now working in Java so I have some questions.
Let's say I add a library to my project, called ExtLib.
ExtLib has a certain library included in its lib-folder, let's call it LogLib-1.0.
I'm using Eclipse and I've made a User Library for ExtLib, included its main jar file and all of the files in its lib-folder. So far so good.
But now I want to do some logging of my own, so I make another User Library and add the newer LogLib-1.1 to it, because it has some new features I want to use.
Can I ever be sure I'm not breaking ExtLib this way?
I know .NET uses the Global Assembly Cache and methods like that, but I have no clue how Java handles this. I tried Googling, but didn't find much, a few mentions of the Classloader here and there, but nothing helpful.
Can anyone tell me what a proper way to deal with this issue is? Or is it no issue at all?
In this specific case (LogLib-1.0 and LogLib-1.1) we're dealing with the same library that is both a direct dependency of your application and a "transitive" dependency via ExtLib. In this situation, dependency management can help.
It will probably reason that LogLib-1.1 is a backward compatible release of LogLib-1.0, and it will decide that your application can run fine using only LogLib-1.1.
In the Java world, tools like Maven, Gradle or SBT exist to help you in this. Maven is the most widespread, and other tools often are compatible with Maven.
Usage
To solve this situation using Maven, you would add a file called pom.xml to your application, stating it depends on LogLib version 1.1. That might look like this (note that this example is pure fiction):
<dependency>
    <groupId>org.loglib</groupId>
    <artifactId>loglib</artifactId>
    <version>1.1</version>
</dependency>
The ExtLib you're using also has a pom.xml shipped with it, and it might state
<dependency>
    <groupId>org.loglib</groupId>
    <artifactId>loglib</artifactId>
    <version>1.0</version>
</dependency>
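Maven (or any other tool) would decide that including LogLib-1.1 is sufficient to get your application running. With Maven this follows from its "nearest definition wins" rule: the version declared in your own pom.xml takes precedence over the one ExtLib pulls in transitively. When using Maven, mvn dependency:tree helps you visualise that.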
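To make the selection step concrete, here is a toy sketch of a "nearest definition wins" strategy, which is how Maven mediates between conflicting versions. It is not Maven's actual code: the class NearestWins, the Dep type and the ExtLib version number 2.0 are all invented for illustration. It walks the dependency tree breadth-first, so the declaration closest to your application is seen first and later declarations of the same artifact are ignored.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of "nearest definition wins"; all names here are hypothetical.
final class NearestWins {

    /** One node in a dependency tree: an artifact, its version, and its own dependencies. */
    static final class Dep {
        final String artifact;
        final String version;
        final List<Dep> children;

        Dep(String artifact, String version, Dep... children) {
            this.artifact = artifact;
            this.version = version;
            this.children = Arrays.asList(children);
        }
    }

    /** Picks one version per artifact: the shallowest declaration wins, ties go to the first declared. */
    static Map<String, String> resolve(Dep root) {
        Map<String, String> chosen = new LinkedHashMap<>();
        Deque<Dep> queue = new ArrayDeque<>(root.children);
        while (!queue.isEmpty()) {
            Dep dep = queue.poll();
            // Only the first (nearest) declaration of an artifact is kept;
            // a losing declaration is dropped together with its subtree.
            if (chosen.putIfAbsent(dep.artifact, dep.version) == null) {
                queue.addAll(dep.children);
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        Dep app = new Dep("app", "1.0",
                new Dep("extlib", "2.0", new Dep("loglib", "1.0")), // transitive LogLib 1.0
                new Dep("loglib", "1.1"));                          // direct LogLib 1.1
        System.out.println(resolve(app));
    }
}
```

Note that this rule does not check compatibility: if ExtLib really needed something that only LogLib-1.0 provides, you would find out at runtime, not at build time.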
Deployment
With respect to the packaging / deployment question: mvn package will package your application to a jar, war or ear archive, including only the dependencies you need (and not two versions of the same lib). This means you don't have to worry about the order in which your application server reads the jar files.
I'm new to Maven, using the m2e plugin for Eclipse. I'm still wrapping my head around Maven, but it seems like whenever I need to import a new library, like java.util.List, now I have to manually go through the hassle of finding the right repository for the jar and adding it to the dependencies in the POM. This seems like a major hassle, especially since some jars can't be found in public repositories, so they have to be uploaded into the local repository.
Am I missing something about Maven in Eclipse? Is there a way to automatically update the POM when Eclipse automatically imports a new library?
I'm trying to understand how using Maven saves time/effort...
You picked a bad example. Classes such as java.util.List are part of the Java standard library, which ships with the Java runtime, so they are available regardless of your Maven configuration.
With that in mind, if you wanted to add something external, say Log4j, then you would need to add a project dependency on Log4j. Maven would then take the dependency information and create a "signature" to search for, first in the local cache, and then in the external repositories.
Such a signature might look like
groupId:artifactId:version
or perhaps
groupId:artifactId:version:classifier
This identifies a maven "module" which will then be downloaded and configured into your system. Once in place it adds all of the classes within the module to your configured project.
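To illustrate how such a signature maps to a file in the local cache, here is a small sketch. The class MavenCoordinate and the method repositoryPath are invented names, it assumes the artifact is packaged as a .jar, and it follows the field order given above (groupId:artifactId:version, optionally followed by a classifier). The directory layout itself (dots in the groupId become directories, followed by artifactId and version) is the standard Maven repository layout.

```java
// Hypothetical helper: turns "groupId:artifactId:version[:classifier]" into the
// relative path of the JAR inside a Maven repository (assumes jar packaging).
final class MavenCoordinate {

    static String repositoryPath(String coordinate) {
        String[] parts = coordinate.split(":");
        if (parts.length < 3 || parts.length > 4) {
            throw new IllegalArgumentException(
                    "Expected groupId:artifactId:version[:classifier] but got: " + coordinate);
        }
        String groupId = parts[0];
        String artifactId = parts[1];
        String version = parts[2];
        // A classifier such as "sources" is appended to the file name.
        String classifier = parts.length == 4 ? "-" + parts[3] : "";
        String fileName = artifactId + "-" + version + classifier + ".jar";
        return String.join("/", groupId.replace('.', '/'), artifactId, version, fileName);
    }

    public static void main(String[] args) {
        // The local cache usually lives in .m2/repository under the home directory.
        String home = System.getProperty("user.home");
        System.out.println(home + "/.m2/repository/"
                + repositoryPath("nz.ac.waikato.cms.weka:weka-stable:3.8.0"));
    }
}
```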
Maven principally saves time in downloading and organizing JAR files in your build. By defining a "standard" project layout and a "standard" build order, Maven eliminates a lot of the guesswork in the "why isn't my project building" sweepstakes. Also, you can use neat commands like "mvn dependency:tree" to print out a list of all the JARs your project depends on, recursively.
Warning note: If you are using the M2E plugin and Eclipse, you may also run into problems with the plugin itself. The 1.0 version (hosted at eclipse.org) was much less friendly than the previous 0.12 version (hosted at Sonatype). You can get around this to some extent by downloading and installing the "standalone" version of Maven from apache (maven.apache.org) and running Maven from the command line. This is actually much more stable than trying to run Maven inside Eclipse (in my personal experience) and may save you some pain as you try to learn about Maven.
I recently discovered that BlackBerry treats all classes with the same fully-qualified name as identical--regardless of whether they are in entirely different apps or not--causing apps that use different versions of our shared libraries to break when they are installed on the same phone.
To solve this problem, we are planning on changing the package names to include a version number, then building. Can someone explain how, using Bamboo, I can insert a step in our build process that:
changes certain packages names
replaces all code references to the old package name with references to the new package name?
A great tool that is made especially for the task of changing the fully qualified names of Java classes in jar files is jarjar. It can be used easily from within Ant, or alternatively from a shell script.
I have never used Bamboo, but I assume it should work there, too. Of course, there may be some special restrictions in that environment (concerning bytecode manipulation) that I don't know about.
I'm not familiar with Bamboo and you did not include much information about your build system. If you are using maven, you could use the shade plugin:
This plugin provides the capability to package the artifact in an uber-jar, including its dependencies and to shade - i.e. rename - the packages of some of the dependencies.
The second example here shows how to configure package renaming. The resulting jar file would then have to be processed by rapc, as in Chris Lercher's comment to his answer. It should be possible to also integrate this in a Maven build using the exec plugin.
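For illustration only, the renaming itself can be sketched on plain source text. This is not how jarjar or the shade plugin work (they rewrite compiled class files), and the class PackageRenamer and the package names used here are hypothetical. The sketch replaces an old package prefix with a new one wherever it appears as a complete qualified-name prefix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Naive, text-based sketch of package renaming; real tools operate on bytecode.
final class PackageRenamer {

    static String rename(String source, String oldPackage, String newPackage) {
        // Match the old package only when it is not part of a longer name:
        // no word character or dot directly before it, no word character directly after it.
        Pattern pattern = Pattern.compile(
                "(?<![\\w.])" + Pattern.quote(oldPackage) + "(?!\\w)");
        return pattern.matcher(source).replaceAll(Matcher.quoteReplacement(newPackage));
    }

    public static void main(String[] args) {
        String source = "package com.acme.shared;\n"
                + "import com.acme.shared.util.Strings;\n"
                + "import com.acme.sharedstuff.Other;\n";
        System.out.print(rename(source, "com.acme.shared", "com.acme.shared.v2"));
    }
}
```

A text-based rewrite like this also touches comments and string literals, and running it twice appends the version segment twice, which is one more reason to prefer a dedicated tool in a real build.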
In Java, if you package the source code (.java) files into the jar along with the classes (.class), most IDEs like Eclipse will show the javadoc comments during code completion.
IIRC there are a few open-source projects that do this, like JMock.
Let's say I have cleanly separated my API code from my implementation code, so that I have something like myproject-api.jar and myproject-impl.jar. Is there any reason why I should not put the source code in myproject-api.jar?
Because of Performance? Size?
Why don't other projects do this?
EDIT: Other than the Maven download problem will it hurt anything to put my sources into the classes jar to support as many developers as possible (maven or not)?
Generally for distribution reasons:
if you keep separate binaries and sources, you can download only what you need.
For instance:
myproject-api.jar and myproject-impl.jar
myproject-api-src.jar and myproject-impl-src.jar
myproject-api-docs.zip and myproject-impl-docs.zip
Now, m2eclipse - Maven for Eclipse can download sources automatically as well
mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true
Now, it can also generate the right pom to prevent distribution of the source or javadoc jar when anyone declares a dependency on your jar.
The OP comments:
also can't imagine download size being an issue (i mean it is 2010 a couple 100k should not be a problem).
Well, actually it (i.e. the size) is a problem.
Maven suffers already from the "downloading half the internet on first build" syndrome.
If that downloads also sources and/or javadocs, that begins to be really tiresome.
Plus, the "distribution" aspect includes the deployment: in a webapp server, there is no real advantage to deploy a jar with sources in it.
Finally, if you really need to associate sources with binaries, this SO question on Maven could help.
Using maven, attach the sources automatically like this:
http://maven.apache.org/plugins/maven-source-plugin/usage.html
and the javadocs like this:
http://maven.apache.org/plugins/maven-javadoc-plugin/jar-mojo.html
That way they will automatically be picked up by
mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true
or by m2eclipse
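If you want to check whether a jar you already have bundles its sources next to the classes, a few lines with the standard java.util.jar API are enough. This is just a sketch: the class JarSourceCheck and its method names are invented for illustration, and it simply counts entries ending in .java.

```java
import java.io.IOException;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper: reports how many .java source files a jar contains.
final class JarSourceCheck {

    /** Counts the entries that look like Java source files. */
    static long countSources(List<String> entryNames) {
        return entryNames.stream().filter(name -> name.endsWith(".java")).count();
    }

    /** Reads the names of all entries in the given jar file. */
    static List<String> entryNames(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream().map(JarEntry::getName).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarSourceCheck <path-to-jar>");
            return;
        }
        List<String> names = entryNames(args[0]);
        System.out.println(countSources(names) + " source file(s) among " + names.size() + " entries");
    }
}
```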