Where to place java packages like htmlparser?

I am beginning to use Java packages like HTMLParser. I have downloaded it and found that it contains many files and directories.
I wonder where to place them on my Linux system. Is there a convention or a standard?

The quick and dirty answer is "anywhere on the classpath". The classpath can be set as a system-wide environment variable on the client machine (not recommended), as a temporary environment variable for the CLI session used to start the JVM (workable from a startup script), or as a command-line option to the JVM (usually the preferred choice).
The first and second options set the CLASSPATH environment variable; see the JDK or JRE documentation for the exact syntax, as well as the documentation for your operating system and/or shell. The third uses the -cp command-line option of the Java runtime and compiler; see their documentation for the exact syntax.
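Whichever of the three ways is used, the running JVM exposes the resulting classpath through the java.class.path system property. A minimal sketch for checking what a program actually sees (the class name ClasspathEntries is made up for this illustration):

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ClasspathEntries {

    // Splits a classpath string on the given separator and drops empty entries.
    static List<String> split(String classpath, String separator) {
        return Arrays.stream(classpath.split(Pattern.quote(separator)))
                .filter(entry -> !entry.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the JVM was started with.
        String classpath = System.getProperty("java.class.path", "");
        // File.pathSeparator is ":" on Linux and macOS, ";" on Windows.
        for (String entry : split(classpath, File.pathSeparator)) {
            System.out.println(entry);
        }
    }
}
```

Running it once with CLASSPATH set and once with -cp is a quick way to confirm which setting took effect.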
Where to place the files on the filesystem? For development purposes I typically use a central folder on my computer containing all such libraries and link to that from my IDE or other development environment. For deployment/packaging to end users, it is traditional to have a "lib" subfolder to the product folder that contains all distributable content, and put the jar files in that.

Java libraries are distributed in two forms: as source code (all the files and directories you mention) and packaged as jars. A common convention in Java projects is that the project has a lib directory that contains all the jars that the project depends on. These projects often use a shell script which adds all the jars to the Java classpath prior to executing the project code.
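What such a startup script does can be sketched in Java as well: collect every jar in the lib directory and join the paths with the platform's path separator. This is only an illustration; the class name LibClasspath and the default directory name "lib" are assumptions, not part of any particular project.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LibClasspath {

    // Joins every *.jar directly inside libDir into one classpath string,
    // sorted so the result is stable between runs.
    static String fromDirectory(Path libDir) throws IOException {
        try (Stream<Path> files = Files.list(libDir)) {
            return files
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .sorted()
                    .map(Path::toString)
                    .collect(Collectors.joining(File.pathSeparator));
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: jars live in ./lib unless a directory is passed as an argument.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        System.out.println(fromDirectory(libDir));
    }
}
```

The java launcher also accepts a wildcard entry such as -cp "lib/*", which picks up all jars in that directory without any script.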
However, many projects are switching from this way of handling dependencies to a build tool like Apache Maven, which manages dependencies automatically. Alternatives include Ivy and Gradle. For an introduction, see the 5 minute introduction to Maven or the Maven 3 tutorial.
With Maven you write a pom.xml (project object model file) which specifies which libraries (jars) your project uses. Maven then stores all the jars for your different projects in a .m2 directory in your home directory, keeping track of where it obtained them and their versioning information.
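To find a downloaded jar by hand, it helps to know that the default local repository is ~/.m2/repository and that each artifact is stored under its group ID (dots become directories), artifact ID and version. A small sketch of that layout; the class name LocalRepoPath and the coordinates org.example:demo-lib:1.0 are hypothetical and only serve as an illustration:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class LocalRepoPath {

    // Default layout:
    // <repo>/<groupId as directories>/<artifactId>/<version>/<artifactId>-<version>.jar
    static Path jarPath(Path repoRoot, String groupId, String artifactId, String version) {
        return repoRoot
                .resolve(groupId.replace('.', '/'))
                .resolve(artifactId)
                .resolve(version)
                .resolve(artifactId + "-" + version + ".jar");
    }

    public static void main(String[] args) {
        // Default location of the local repository, unless overridden in Maven's settings.
        Path repoRoot = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        // Hypothetical coordinates, used only for illustration.
        System.out.println(jarPath(repoRoot, "org.example", "demo-lib", "1.0"));
    }
}
```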
This makes developing much easier as you do not need to create the lib directory or manually manage dependencies. You also avoid a lot of the complexities of setting the classpath, as Maven automatically does this for you during common lifecycle stages such as compilation and test. Recent versions of Eclipse can read the Maven pom and automatically configure your classpath from it.
Once you have built the project, Maven can also help create "fat jars" that contain all the jars your project depends on, via the assembly plugin or the Shade plugin. This makes distributing the code easier when you are building an executable that you want someone to use. If you are distributing a jar, then your pom.xml describes the dependencies of your project, avoiding the need to distribute the jars it depends on.
For laying out files in general on a Linux system, consult the Filesystem Hierarchy Standard (FHS).

Related

How are java dependencies deployed on server machine

I did a lot of research but could not find a proper answer. My question is simple: I am building an executable jar file which has a few external dependencies such as Spring. Now I want to deploy my executable jar file to a server machine. Is there an easy and safe way of achieving this? A few options I am aware of:
Build an uber jar with all the dependencies bundled along with application code and deploy it
Deploy the source code executable jar and then manually add all the dependency jar files to the class path
Is there a better way? Are there any tools which can help here? How are dependency jar upgrades handled? Are they manually replaced on the server machine?

If you 'just' have an executable jar and some other jar files as dependencies (this is actually the most common case), you can follow common practice and create a zip file containing them all. Check how various open source projects offer their releases for download.
If you use a framework, it may also guide you on deployment. As an example, the servlet specification tells you how to create that zip file in chapter 10.
If you want an approach that fits better with the OS package manager, you could take a look at jpackage. It also bundles a Java runtime, so you have tight control not just over the jar dependencies but also over the runtime.

How can I package a maven jar and output multiple copies of the resulting JAR file to different folders?

Pretty much what the title says. I'm building Minecraft Spigot plugins for servers running under BungeeCord and running the mvn package plugin in IntelliJ results in the generated JAR file being located in the project's "target" folder. Instead, I need to output multiple copies of the generated JAR into multiple "plugin" folders for various servers. I'm not sure how or if this is possible to do with Maven, but I would like to know if there is a way to do that in pom.xml. Having to copy the JAR every time I build it slows down the development process. Any help would be greatly appreciated!

You should be able to achieve the same thing with symlinks (symbolic links). A symlink gives the file system a reference to another file location without making an explicit copy (kind of like a shortcut). This is also more in the spirit of Maven's philosophy that each project should build only one artifact.
How you make symlinks depends on your operating system. You would need to make a symlink for each separate location that you need to "copy" to.
Mac / Linux:
ln -s /project/target/my.jar /project/server/plugin1
Windows:
mklink \project\server\plugin1\my.jar \project\target\my.jar
Other info about symlinks (e.g. if you need to delete):
https://www.howtogeek.com/howto/16226/complete-guide-to-symbolic-links-symlinks-on-windows-or-linux/
https://www.howtogeek.com/297721/how-to-create-and-use-symbolic-links-aka-symlinks-on-a-mac/
Building multiple jars is fairly tricky in maven. See this post for some additional details - it suggests using a maven ant plugin if this is what you really want, but I would recommend against this approach. Symlinks should be easier to work with.
Ant to Maven - multiple build targets
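If creating the links by hand for every server becomes tedious, the symlink approach can also be scripted with the standard java.nio.file API. The following is a sketch under assumptions: the class name PluginLinks is made up, the paths are passed in by the caller, and on Windows creating symlinks may require elevated privileges.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class PluginLinks {

    // Creates (or replaces) a symlink named after the jar in each plugin folder.
    static void linkInto(Path jar, List<Path> pluginDirs) throws IOException {
        for (Path dir : pluginDirs) {
            Files.createDirectories(dir);
            Path link = dir.resolve(jar.getFileName());
            // Removes an old link or file of the same name; the link target is not touched.
            Files.deleteIfExists(link);
            Files.createSymbolicLink(link, jar.toAbsolutePath());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: PluginLinks <jar> <pluginDir>...");
            return;
        }
        List<Path> dirs = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            dirs.add(Paths.get(args[i]));
        }
        linkInto(Paths.get(args[0]), dirs);
    }
}
```

Because each link points at the path of the built jar, the links only need to be created once; later builds that overwrite the jar in the target folder are picked up automatically.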

How do I check installed JARs, external libraries, etc. on three different Java IDEs?

I've written programs in several languages and have tutored students in computer science, but I am just starting to learn Java on my MacBook. Regarding this question, I'd be happy with any answer that points me to available information or tutorials that address my question; I'm capable of understanding advanced things.
I've been searching for the right IDE for me as well as something I can use with my students, and I've tried IntelliJ, Eclipse, and VS Code. Along the way I've installed external JARs to provide extra capabilities, such as Apache Commons.
Things are getting confusing. I've lost track of how I got to the present state in each IDE. I'd like to understand better how to know the overall Java environment that any given project is using on each of these IDEs, including any external JARs and where they are located. And I'd like to know if they borrow from the Java system environment.
My goal is to understand how my own system got to the way it's currently configured, to update my configuration on a project-by-project basis, and to help my students get a matching configuration.
I'd also like advice on the right way, or simplest/cleanest way, to install external JARs.

Maven
Question: I'd also like advice on the right way, or simplest/cleanest way, to install external JARs.
If you really want to work in an organised way and focus completely on coding rather than on hunting for dependencies, try building your projects with Apache Maven. The heart of a Maven project is its pom.xml file, where the build and its dependencies are configured.
Maven is a build automation tool used primarily for Java projects. Maven addresses two aspects of building software:
Describes and manages how software is built.
Describes and manages dependencies (various libraries used by your code).
Why Maven:
De facto standard
Able to compile, test, package and distribute source code (via different goals)
Robust dependency management (most important from my point of view)
Extensible via plugins
Good community support and a large, enthusiastic user base.
The big 3 IDEs (IntelliJ, NetBeans, and Eclipse) all have good support for Maven, letting you use Maven as a substitute for their own proprietary project definition and build process.
Maven famously caches all of its dependencies in the ~/.m2 directory, which is sometimes called the local Maven repository.
The local Maven repository keeps all of your project's dependencies (library jars, plugin jars, etc.). When you run a Maven build, Maven automatically downloads all the dependency jars into the local repository. This avoids fetching dependencies from a remote machine every time a project is built.
You can simply deploy your project as a JAR, WAR, or EAR file and use it in different IDEs or standalone.

All IDEs need a way to know your project's dependencies. You can either tell them that yourself or let a build tool do that.
Manual dependency handling: by adding the jars to your project. This is probably the fastest way when working on a small project, with one developer, on a specific IDE, with few dependencies. Usually, when you tell the IDE that a .jar is a dependency of your project, the IDE stores that reference in a project-specific file (e.g. in Eclipse the .classpath file, which you can open in a text editor to see the dependencies yourself). However, it kind of locks your application to your IDE. Most IDEs support importing and migrating projects from other IDEs, but using two IDEs at the same time can be confusing when a dependency added to one has to be added again to the other. Furthermore, your dependencies have dependencies of their own. By adding your jars manually, you are responsible for finding and downloading those dependencies as well.
Use a build tool: There are 3 standard tools of this kind right now: Apache Ant with Ivy, Apache Maven and Gradle. All of them are supported by the major Java IDEs: IntelliJ IDEA, Eclipse and NetBeans. All of them use some extra build-tool specific files to store your project's configuration and subsequently configure your IDE and the IDE-specific files. That way, your project becomes IDE-agnostic: the IDE outsources the dependency handling to the build tool. These tools will download any direct or transitive dependencies of your project into a local directory, or you can have them collect the jars in a specified folder. Of those, Ant is the oldest (with Ivy adding dependency handling support), Maven was developed after that, and Gradle is the newest and probably the most flexible. In production, however, Maven is by far the most established one right now.
It would be also useful to look up the Standard Directory Layout. If you adhere to that, it will be easier to work/start with either Maven or Gradle.
Finally, you can search and find most of the free libraries in Maven-Central where conveniently their Ivy/Maven/Gradle script is added as well for you to use on your build-tool script. In many cases a .jar is provided as well if you prefer to manually add it as a dependency.
Regarding VS Code, I think it supports these tools through plugins but I'm not sure.

Why do a lot of projects only offer source and no jars for download?

I've seen a lot of projects, even from big companies like Elephant Bird (Twitter) and Akela (Mozilla) that offer source and ask you to compile it yourself instead of also offering jars. Is there some benefit to compiling in your own environment instead of just downloading a jar someone else has compiled?

Dependencies are not in the same location, and may not even have the same version, on every machine. It is simpler to detect where they are at compile-time.
If there is any native code (sometimes just for optimization) in a project, there are probably platform-dependent flags that need to be set at compile-time.

The short answer is dependency management. Most public OSS Java projects offer jars by publishing them to Maven Central. You are expected to use a build system like Gradle, Ivy, or Maven to manage your dependencies - these tools will automatically download the library you want along with any of its dependent libraries and be smart about it, caching it on your local filesystem so if a library is shared across multiple libraries it won't be downloaded twice.
As for the example projects you listed, Elephant Bird is available via Maven Central whereas Akela tells you exactly how to create your own jar (perhaps it's not quite far along enough to justify going through the rigmarole of publishing to Maven Central):
Building
To make a jar you can do:
mvn package
To make a Hadoop MapReduce job jar with no defined main class in the manifest:
mvn assembly:assembly

Without an automatic build system it's hard to maintain a current version of the jar file online. Including the jar file in the repository is generally not a good idea, as users who clone it don't need the compiled jar; they want the code. So unless the publisher explicitly adds a jar file to a download location outside of the source code repository and updates this file every time the application changes, you have to compile it yourself. Automatic build systems can help a publisher provide a current compiled jar to its users, but for smaller projects it's not always worth the trouble of setting one up.

What is the difference between setting the classpath and java build path in eclipse?

What are the different ways that Java programs gain access to external libraries? There is setting a classpath and modifying the build path, but I've seen other ways of adding jars.
Why do some libraries have to be added to the classpath while others do not? For example, I'm using JSF, WTP tools, and other extra libraries, but they are not in my build path when I view the build path of my project.

The classpath is used to find classes when executing a Java program. The build path is used when Eclipse is compiling a Java program.
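The runtime half of that distinction can be checked from code: a class that compiled fine in the IDE may still be missing when the program runs. A minimal sketch, assuming a made-up helper class ClassLocator, that reports whether a class is visible at runtime and where it was loaded from:

```java
import java.security.CodeSource;

class ClassLocator {

    // Reports where a class was loaded from, or that it is missing at runtime.
    static String locate(String className) {
        try {
            // initialize=false: only look the class up, do not run its static initializers.
            Class<?> type = Class.forName(className, false, ClassLocator.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "no code source (typically a core JDK class)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Pass fully qualified class names as arguments.
        for (String name : args) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

For a class that comes from a jar, the reported location is normally the path of that jar, which also helps when tracking down which copy of a library a project is really using.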

The Java Build Path is just an Eclipse thing. It's where Eclipse finds the classes needed to compile and run the classes of the project. It's thus both the compile and the run classpath.
In the case of a webapp, the webapp runs inside a Java EE web container. The web container gives access to standard Java EE classes (javax.servlet, etc.). Moreover, all the classes in WEB-INF/classes and all the jars in WEB-INF/lib are automatically included in the classpath of the web app. So Eclipse doesn't need you to specify them in the Java Build Path. They're included automatically.

At development time:
A build path is where you explicitly point to third-party software / jars.
By default, not all third-party libraries are added to your classpath, hence you may have to add them to your build path explicitly.
At runtime:
On the other hand, when you run your application from the command line, you use -cp to specify the third-party jars on the classpath.
For example, in web projects you would add them to the WEB-INF/lib directory when you deploy.

A classpath is simply an array of classpath entries (IClasspathEntry) that describe the types that are available. The classpath is an environment variable that tells where to look for class files and it is generally set to a directory or a JAR (java archive) file.
The Java build path is reflected in the structure of a Java project element. You can query a project for its package fragment roots (IPackageFragmentRoot). The build path is the classpath that is used for building a Java project (IJavaProject).
