Compile Hadoop 2.2.0 job? - java

It seems that all of the examples are constructed with older versions in mind.
How do I compile my java program on Ubuntu such that it will refer to hadoop-2.2.0 libraries?
Where are the jar files that I am supposed to include?
What is the command?
Is it like -
javac -classpath libraries wordcount.java
Thank you.

The simplest solution for Linux machines would be:
javac -classpath `yarn classpath` -d . WordCount.java
Or:
export CLASSPATH=`yarn classpath`
javac -classpath $CLASSPATH -d . WordCount.java

I found the following:
javac -classpath $HADOOP_HOME/share/hadoop/common/hadoop-common-2.2.0.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.2.0.jar:$HADOOP_HOME/share/hadoop/common/lib/commons-cli-1.2.jar -d wordcount_classes myWordCount.java
This allowed me to compile the Wordcount example (or in this case a copy of mine called myWordCount).

Hadoop has a command "hadoop classpath" that supplies you with the necessary classpath.
For example:
hadoop classpath
/etc/hadoop/conf:/usr/lib/hadoop/lib/:/usr/lib/hadoop/.//:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/:/usr/lib/hadoop-hdfs/.//:/usr/lib/hadoop-yarn/lib/:/usr/lib/hadoop-yarn/.//:/usr/lib/hadoop-mapreduce/lib/:/usr/lib/hadoop-mapreduce/.//
So if you want to compile, you can use it this way:
javac -classpath $(hadoop classpath) -d . WordCount.java

You have to install Cygwin; there you can run your Hadoop example, and you can also configure Hadoop with Eclipse.

Run the command: "yarn classpath" to see a list of directories. When I use this list as my -classpath option for javac, my Java program compiles.
I am running HortonWorks v2.0, Apache Hadoop 2.2.0.

I'm having a bumpy ride with the Hadoop example jars too. The information in many videos, tutorials, and blogs is based on older versions.
When we compile these examples or write our own MapReduce program in an IDE, we import the Hadoop jars (i.e. add references to external jars, akin to adding a reference to a .dll in MS Visual Studio), and the IDE takes care of calling javac correctly for each class.
To compile a class such as WordCount.java manually, we need to tell javac which jars the class depends on. The outdated videos I followed did share one useful piece of information: set a variable in .bashrc that references all the Hadoop-related jar files, then use it as javac -classpath $VARIABLE filename.java.
For example, I named the variable $HADOOP_CLASSPATH and gave it the value shown here (I'm on Mac OS X):
/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/:/usr/local/hadoop/share/hadoop/common/:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/:/usr/local/hadoop/share/hadoop/hdfs/:/usr/local/hadoop/share/hadoop/yarn/lib/:/usr/local/hadoop/share/hadoop/yarn/:/usr/local/hadoop/share/hadoop/mapreduce/lib/:/usr/local/hadoop/share/hadoop/mapreduce/:/contrib/capacity-scheduler/.jar:/usr/local/hadoop/share/hadoop/yarn/:/usr/local/hadoop/share/hadoop/yarn/lib/*
With this variable, I could compile the class successfully:
javac -classpath $HADOOP_CLASSPATH WordCount.java

Related

How to import external jars in command line or Atom?

I have been using IntelliJ to run my java programs that require some external jars. I also learned that if I want to compile and run my program from the command line I should do the following:
java -classpath someJar.jar YourMainClass
or for many libraries:
java -classpath someJar.jar;myJar.jar YourMainClass
However, when I run this from the src folder where my class is located, it doesn't seem to find my class.
I also like using the Atom text editor, but I don't know of any package that can import external libraries like an IDE does. So how do I do it in Atom or in cmd on Windows 10? I am kind of a newbie to Java dev outside my beloved IDE, so I would really appreciate some help.
If you're on a Windows system then try compiling (with the jar) using the following command in cmd:
javac -cp .;jar-name.jar *.java
To run the command with the jar use:
java -cp .;jar-name.jar JavaCodeName
If you're on a Unix system then you can try the following to compile in terminal:
javac -cp jar-name.jar:. *.java
And to run it use:
java -cp jar-name.jar:. JavaCodeName
I'm not too familiar with Atom so I don't know if there is an attachment to do this, but it should work for terminal / command prompt.

What does "build:" mean in the java -cp option?

I was learning about Frege and saw this command line:
$ java -Xss1m -cp build:fregec.jar examples.SimpleIO
I've never seen that build: before. What does that mean and what does it do?
More context: https://github.com/Frege/frege/issues/289
I don't see it documented in this official article or when I type java at the command line.
: is the separator, so it's including build and fregec.jar on the classpath.
Looking at Frege specifically, you first use it to compile some code and create some class files in the build directory. For example:
java -Xss1m -jar fregec.jar -d build SimpleIO.fr
Then to run the compiled code you need both Frege itself, and the class files you just created, on the classpath:
java -Xss1m -cp build:fregec.jar examples.SimpleIO
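To make the separator point concrete, here is a minimal Java sketch that splits a classpath string into its entries. The class name ClasspathEntries and the split helper are made up for illustration; the real launcher does this internally.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JDK or Frege.
class ClasspathEntries {
    // Splits a classpath string on the given separator
    // (":" on Unix-like systems, ";" on Windows).
    static List<String> split(String classpath, String separator) {
        return Arrays.asList(classpath.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        // The classpath from the Frege command line: two entries.
        System.out.println(split("build:fregec.jar", ":")); // prints [build, fregec.jar]

        // The classpath this JVM was actually started with.
        System.out.println(split(System.getProperty("java.class.path"), File.pathSeparator));
    }
}
```

So build is simply a directory entry (where the compiled class files live) and fregec.jar is a jar entry.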

Classpath to use for MapR/Hadoop/Hive

I'm trying to compile some java code for hadoop and need to know what classpath I need to specify. For cloudera I use this below but what do I use for a MapR installation? Surprisingly I could only find how to set the classpath in google, not what to set it to.
javac -classpath "/opt/cloudera/parcels/CDH-4.6.0-1.cdh4.6.0.p0.26/lib/hadoop/client/*" mr.java -d mr
Found the answer by trial and error. Oddly google is very silent on this and all the books and examples I've read appear to assume this is too obvious to bother printing.
mkdir MyClass
javac -classpath "/opt/mapr/hadoop/hadoop-0.20.2/lib/*" MyClass.java -d MyClass
jar -cvf MyClass.jar -C MyClass .
Additionally, if you want the hive libraries, eg for compiling a hive UDF:
javac -classpath "/opt/mapr/hadoop/hadoop-0.20.2/lib/*:/opt/mapr/hive/hive-0.12/lib/*" MyClass.java -d MyClass
EDIT: one thing I would add: make sure you put quotes around the path, otherwise the shell expands it on the command line, which is not what you want. The * in the path needs to be passed to java as is.

What are the common errors you see when you run 'java -cp ...' or 'java -classpath'? How do you set a directory of jars in classpath? [duplicate]

Is there a way to include all the jar files within a directory in the classpath?
I'm trying java -classpath lib/*.jar:. my.package.Program and it is not able to find class files that are certainly in those jars. Do I need to add each jar file to the classpath separately?
Using Java 6 or later, the classpath option supports wildcards. Note the following:
Use straight quotes (")
Use *, not *.jar
Windows
java -cp "Test.jar;lib/*" my.package.MainClass
Unix
java -cp "Test.jar:lib/*" my.package.MainClass
This is similar to Windows, but uses : instead of ;. If you cannot use wildcards, bash allows the following syntax (where lib is the directory containing all the Java archive files):
java -cp "$(printf %s: lib/*.jar)"
(Note that using a classpath is incompatible with the -jar option. See also: Execute jar file with multiple classpath libraries from command prompt)
Understanding Wildcards
From the Classpath document:
Class path entries can contain the basename wildcard character *, which is considered equivalent to specifying a list of all the files
in the directory with the extension .jar or .JAR. For example, the
class path entry foo/* specifies all JAR files in the directory named
foo. A classpath entry consisting simply of * expands to a list of all
the jar files in the current directory.
A class path entry that contains * will not match class files. To
match both classes and JAR files in a single directory foo, use either
foo;foo/* or foo/*;foo. The order chosen determines whether the
classes and resources in foo are loaded before JAR files in foo, or
vice versa.
Subdirectories are not searched recursively. For example, foo/* looks
for JAR files only in foo, not in foo/bar, foo/baz, etc.
The order in which the JAR files in a directory are enumerated in the
expanded class path is not specified and may vary from platform to
platform and even from moment to moment on the same machine. A
well-constructed application should not depend upon any particular
order. If a specific order is required then the JAR files can be
enumerated explicitly in the class path.
Expansion of wildcards is done early, prior to the invocation of a
program's main method, rather than late, during the class-loading
process itself. Each element of the input class path containing a
wildcard is replaced by the (possibly empty) sequence of elements
generated by enumerating the JAR files in the named directory. For
example, if the directory foo contains a.jar, b.jar, and c.jar, then
the class path foo/* is expanded into foo/a.jar;foo/b.jar;foo/c.jar,
and that string would be the value of the system property
java.class.path.
The CLASSPATH environment variable is not treated any differently from
the -classpath (or -cp) command-line option. That is, wildcards are
honored in all these cases. However, class path wildcards are not
honored in the Class-Path jar-manifest header.
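The rules quoted above can be sketched in Java roughly as follows. This is only an illustration of the documented behaviour, not the launcher's actual code; WildcardExpansion and expand are hypothetical names, and the sorting is added here just to make the output repeatable (the real order is unspecified).

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of classpath wildcard expansion.
class WildcardExpansion {
    static List<String> expand(String classpath, String separator) {
        List<String> expanded = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            // Only a bare "*" as the last path component counts as a wildcard.
            boolean wildcard = entry.equals("*")
                    || entry.endsWith("/*")
                    || entry.endsWith(File.separator + "*");
            if (!wildcard) {
                expanded.add(entry); // e.g. "lib/*.jar" is kept literally
                continue;
            }
            String prefix = entry.substring(0, entry.length() - 1); // "foo/" or ""
            File dir = prefix.isEmpty() ? new File(".") : new File(prefix);
            // Match .jar and .JAR files in this directory only (no recursion).
            String[] jars = dir.list((d, name) -> name.endsWith(".jar") || name.endsWith(".JAR"));
            if (jars == null) {
                continue; // not a readable directory: contributes nothing in this sketch
            }
            Arrays.sort(jars); // assumption: sorted only for repeatable output
            for (String jar : jars) {
                expanded.add(prefix + jar);
            }
        }
        return expanded;
    }

    public static void main(String[] args) throws IOException {
        Path foo = Files.createTempDirectory("foo");
        Files.createFile(foo.resolve("a.jar"));
        Files.createFile(foo.resolve("b.JAR"));
        Files.createFile(foo.resolve("notes.txt"));          // ignored: not a jar
        Files.createDirectory(foo.resolve("bar"));
        Files.createFile(foo.resolve("bar").resolve("nested.jar")); // ignored: subdirectory
        String cp = foo + "/*" + File.pathSeparator + "Test.jar";
        System.out.println(expand(cp, File.pathSeparator));
    }
}
```

Note how an entry such as lib/*.jar falls through unchanged, which is why it never matches anything.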
Note: due to a known bug in java 8, the windows examples must use a backslash preceding entries with a trailing asterisk: https://bugs.openjdk.java.net/browse/JDK-8131329
Under Windows this works:
java -cp "Test.jar;lib/*" my.package.MainClass
and this does not work:
java -cp "Test.jar;lib/*.jar" my.package.MainClass
Notice the *.jar, so the * wildcard should be used alone.
On Linux, the following works:
java -cp "Test.jar:lib/*" my.package.MainClass
The separators are colons instead of semicolons.
We get around this problem by deploying a main jar file myapp.jar which contains a manifest (Manifest.mf) file specifying a classpath with the other required jars, which are then deployed alongside it. In this case, you only need to declare java -jar myapp.jar when running the code.
So if you deploy the main jar into some directory, and then put the dependent jars into a lib folder beneath that, the manifest looks like:
Manifest-Version: 1.0
Implementation-Title: myapp
Implementation-Version: 1.0.1
Class-Path: lib/dep1.jar lib/dep2.jar
NB: this is platform-independent - we can use the same jars to launch on a UNIX server or on a Windows PC.
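If you would rather generate such a manifest from code than write it by hand, the java.util.jar API can do it. A minimal sketch, assuming a main class called com.example.Main (a placeholder; the manifest above does not show a Main-Class line, but java -jar needs one) and the two dependency jars from the example:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper that builds a manifest with a Class-Path header.
class ManifestSketch {
    static Manifest build(String mainClass, String... relativeJars) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be present, otherwise nothing is written out.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        attrs.put(Attributes.Name.MAIN_CLASS, mainClass);
        // Class-Path entries are separated by spaces and are relative to the jar's location.
        attrs.put(Attributes.Name.CLASS_PATH, String.join(" ", relativeJars));
        return manifest;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // com.example.Main is an assumed name, not from the original answer.
        build("com.example.Main", "lib/dep1.jar", "lib/dep2.jar").write(out);
        System.out.print(out.toString("UTF-8"));
    }
}
```

Remember that wildcards are not honored in Class-Path, so each jar has to be listed by name.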
My solution on Ubuntu 10.04 using java-sun 1.6.0_24 having all jars in "lib" directory:
java -cp .:lib/* my.main.Class
If this fails, the following command should work (prints out all *.jars in lib directory to the classpath param)
java -cp $(for i in lib/*.jar ; do echo -n $i: ; done). my.main.Class
Short answer: java -classpath lib/*:. my.package.Program
Oracle provides documentation on using wildcards in classpaths here for Java 6 and here for Java 7, under the section heading Understanding class path wildcards. (As I write this, the two pages contain the same information.) Here's a summary of the highlights:
In general, to include all of the JARs in a given directory, you can use the wildcard * (not *.jar).
The wildcard only matches JARs, not class files; to get all classes in a directory, just end the classpath entry at the directory name.
The above two options can be combined to include all JAR and class files in a directory, and the usual classpath precedence rules apply. E.g. -cp /classes;/jars/*
The wildcard will not search for JARs in subdirectories.
The above bullet points are true if you use the CLASSPATH environment variable or the -cp or -classpath command line flags. However, if you use the Class-Path JAR manifest header (as you might do with an ant build file), wildcards will not be honored.
Yes, my first link is the same one provided in the top-scoring answer (which I have no hope of overtaking), but that answer doesn't provide much explanation beyond the link. Since that sort of behavior is discouraged on Stack Overflow these days, I thought I'd expand on it.
Windows:
java -cp file.jar;dir/* my.app.ClassName
Linux:
java -cp file.jar:dir/* my.app.ClassName
Remember:
- Windows path separator is ;
- Linux path separator is :
- On Windows, if the cp argument does not contain white space, the quotes are optional
For me this works on Windows:
java -cp "/lib/*;" sample
On Linux:
java -cp "/lib/*:" sample
I am using Java 6
You can try java -Djava.ext.dirs=jarDirectory
http://docs.oracle.com/javase/6/docs/technotes/guides/extensions/spec.html
Directory for external jars when running java
Correct:
java -classpath "lib/*:." my.package.Program
Incorrect:
java -classpath "lib/a*.jar:." my.package.Program
java -classpath "lib/a*:." my.package.Program
java -classpath "lib/*.jar:." my.package.Program
java -classpath lib/*:. my.package.Program
If you are using Java 6, then you can use wildcards in the classpath.
Now it is possible to use wildcards in classpath definition:
javac -cp libs/* -verbose -encoding UTF-8 src/mypackage/*.java -d build/classes
Ref: http://www.rekk.de/bloggy/2008/add-all-jars-in-a-directory-to-classpath-with-java-se-6-using-wildcards/
If you really need to specify all the .jar files dynamically you could use shell scripts, or Apache Ant. There's a commons project called Commons Launcher which basically lets you specify your startup script as an ant build file (if you see what I mean).
Then, you can specify something like:
<path id="base.class.path">
<pathelement path="${resources.dir}"/>
<fileset dir="${extensions.dir}" includes="*.jar" />
<fileset dir="${lib.dir}" includes="*.jar"/>
</path>
In your launch build file, which will launch your application with the correct classpath.
Please note that wildcard expansion is broken for Java 7 on Windows.
Check out this StackOverflow issue for more information.
The workaround is to put a semicolon right after the wildcard. java -cp "somewhere/*;"
To whom it may concern,
I found this strange behaviour on Windows under an MSYS/MinGW shell.
Works:
$ javac -cp '.;c:\Programs\COMSOL44\plugins\*' Reclaim.java
Doesn't work:
$ javac -cp 'c:\Programs\COMSOL44\plugins\*' Reclaim.java
javac: invalid flag: c:\Programs\COMSOL44\plugins\com.comsol.aco_1.0.0.jar
Usage: javac <options> <source files>
use -help for a list of possible options
I am quite sure that the wildcard is not expanded by the shell, because e.g.
$ echo './*'
./*
(Tried it with another program too, rather than the built-in echo, with the same result.)
I believe that it's javac which is trying to expand it, and it behaves differently whether there is a semicolon in the argument or not. First, it may be trying to expand all arguments that look like paths. And only then it would parse them, with -cp taking only the following token. (Note that com.comsol.aco_1.0.0.jar is the second JAR in that directory.) That's all a guess.
This is
$ javac -version
javac 1.7.0
All the above solutions work great if you develop and run the Java application outside any IDE like Eclipse or Netbeans.
If you are on Windows 7 and use the Eclipse IDE for Java development, you might run into issues when using Command Prompt to run the class files built inside Eclipse.
E.g. your source code in Eclipse has the following package hierarchy:
edu.sjsu.myapp.Main.java
You have json.jar as an external dependency for the Main.java
When you try running Main.java from within Eclipse, it will run without any issues.
But when you try running this using Command Prompt after compiling Main.java in Eclipse, it will fail with errors such as NoClassDefFoundError.
I assume you are in the working directory of your source code!
Use the following syntax to run it from command prompt:
javac -cp ".;json.jar" Main.java
java -cp ".;json.jar" edu.sjsu.myapp.Main
[Don't miss the . above]
This is because you have placed the Main.java inside the package edu.sjsu.myapp and java.exe will look for the exact pattern.
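As a rough illustration of that "exact pattern": the JVM turns the fully qualified class name into a relative path and looks for it under each classpath entry. The sketch below only shows that name mapping; ClassFileLocation and relativePath are made-up names for this example.

```java
// Hypothetical helper showing how a class name maps to a .class file path.
class ClassFileLocation {
    // "edu.sjsu.myapp.Main" -> "edu/sjsu/myapp/Main.class"
    static String relativePath(String fullyQualifiedName) {
        return fullyQualifiedName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        // This path is resolved relative to each classpath entry (here, ".").
        System.out.println(relativePath("edu.sjsu.myapp.Main")); // prints edu/sjsu/myapp/Main.class
    }
}
```

So with . on the classpath, the compiled Main.class has to sit at edu/sjsu/myapp/Main.class below the directory you run java from.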
Hope it helps !!
macOS, current folder
For Java 13 on macOS Mojave…
If all your .jar files are in the same folder, use cd to make that your current working directory. Verify with pwd.
For the -classpath you must first list the JAR file for your app. Using a colon character : as a delimiter, append an asterisk * to get all other JAR files within the same folder. Lastly, pass the full package name of the class with your main method.
For example, for an app in a JAR file named my_app.jar with a main method in a class named App in a package named com.example, alongside some needed jars in the same folder:
java -classpath my_app.jar:* com.example.App
On Windows, quotes are required and ; should be used as the separator, e.g.:
java -cp "target\\*;target\\dependency\\*" my.package.Main
Short Form: If your main is within a jar, you'll probably need an additional '-jar pathTo/yourJar/YourJarsName.jar ' explicitly declared to get it working (even though 'YourJarsName.jar' was on the classpath)
(or, expressed to answer the original question that was asked 5 years ago: you don't need to redeclare each jar explicitly, but does seem, even with java6 you need to redeclare your own jar ...)
Long Form:
(I've made this explicit to the point that I hope even interlopers to java can make use of this)
Like many here I'm using eclipse to export jars: (File->Export-->'Runnable JAR File'). There are three options on 'Library handling' eclipse (Juno) offers:
opt1: "Extract required libraries into generated JAR"
opt2: "Package required libraries into generated JAR"
opt3: "Copy required libraries into a sub-folder next to the generated JAR"
Typically I'd use opt2 (and opt1 was definitely breaking), however native code in one of the jars I'm using I discovered breaks with the handy "jarinjar" trick that eclipse leverages when you choose that option. Even after realizing I needed opt3, and then finding this StackOverflow entry, it still took me some time to figure it out how to launch my main outside of eclipse, so here's what worked for me, as it's useful for others...
If you named your jar: "fooBarTheJarFile.jar"
and all is set to export to the dir: "/theFully/qualifiedPath/toYourChosenDir".
(meaning the 'Export destination' field will read: '/theFully/qualifiedPath/toYourChosenDir/fooBarTheJarFile.jar' )
After you hit finish, you'll find eclipse then puts all the libraries into a folder named 'fooBarTheJarFile_lib' within that export directory, giving you something like:
/theFully/qualifiedPath/toYourChosenDir/fooBarTheJarFile.jar
/theFully/qualifiedPath/toYourChosenDir/fooBarTheJarFile_lib/SomeOtherJar01.jar
/theFully/qualifiedPath/toYourChosenDir/fooBarTheJarFile_lib/SomeOtherJar02.jar
/theFully/qualifiedPath/toYourChosenDir/fooBarTheJarFile_lib/SomeOtherJar03.jar
/theFully/qualifiedPath/toYourChosenDir/fooBarTheJarFile_lib/SomeOtherJar04.jar
You can then launch from anywhere on your system with:
java -classpath "/theFully/qualifiedPath/toYourChosenDir/fooBarTheJarFile_lib/*" -jar /theFully/qualifiedPath/toYourChosenDir/fooBarTheJarFile.jar package.path_to.the_class_with.your_main.TheClassWithYourMain
(For Java Newbies: 'package.path_to.the_class_with.your_main' is the declared package-path that you'll find at the top of the 'TheClassWithYourMain.java' file that contains the 'main(String[] args){...}' that you wish to run from outside java)
The pitfall to notice: is that having 'fooBarTheJarFile.jar' within the list of jars on your declared classpath is not enough. You need to explicitly declare '-jar', and redeclare the location of that jar.
e.g. this breaks:
java -classpath "/theFully/qualifiedPath/toYourChosenDir/fooBarTheJarFile.jar;/theFully/qualifiedPath/toYourChosenDir/fooBarTheJarFile_lib/*" somepackages.inside.yourJar.leadingToTheMain.TheClassWithYourMain
restated with relative paths:
cd /theFully/qualifiedPath/toYourChosenDir/;
BREAKS: java -cp "fooBarTheJarFile_lib/*" package.path_to.the_class_with.your_main.TheClassWithYourMain
BREAKS: java -cp ".;fooBarTheJarFile_lib/*" package.path_to.the_class_with.your_main.TheClassWithYourMain
BREAKS: java -cp ".;fooBarTheJarFile_lib/*" -jar package.path_to.the_class_with.your_main.TheClassWithYourMain
WORKS: java -cp ".;fooBarTheJarFile_lib/*" -jar fooBarTheJarFile.jar package.path_to.the_class_with.your_main.TheClassWithYourMain
(using java version "1.6.0_27"; via OpenJDK 64-Bit Server VM on ubuntu 12.04)
You need to add them all separately. Alternatively, if you really need to just specify a directory, you can unjar everything into one dir and add that to your classpath. I don't recommend this approach, however, as you risk bizarre problems with classpath versioning and unmanageability.
The only way I know how is to do it individually, for example:
setenv CLASSPATH /User/username/newfolder/jarfile.jar:jarfile2.jar:jarfile3.jar:.
Hope that helps!
Running a class from a webapp:
> mvn clean install
> java -cp "webapp/target/webapp-1.17.0-SNAPSHOT/WEB-INF/lib/tool-jar-1.17.0-SNAPSHOT.jar;webapp/target/webapp-1.17.0-SNAPSHOT/WEB-INF/lib/*" com.xx.xx.util.EncryptorUtils param1 param2
Think of a jar file as the root of a directory structure. Yes, you need to add them all separately.
Not a direct solution to being able to set /* to -cp but I hope you could use the following script to ease the situation a bit for dynamic class-paths and lib directories.
libDir2Scan4jars="../test";cp=""; for j in `ls ${libDir2Scan4jars}/*.jar`; do if [ "$j" != "" ]; then cp=$cp:$j; fi; done; echo $cp| cut -c2-${#cp} > .tmpCP.tmp; export tmpCLASSPATH=`cat .tmpCP.tmp`; if [ "$tmpCLASSPATH" != "" ]; then echo .; echo "classpath set, you can now use ~> java -cp \$tmpCLASSPATH"; echo .; else echo .; echo "Error please check libDir2Scan4jars path"; echo .; fi;
Scripted for Linux, could have a similar one for windows too. If proper directory is provided as input to the "libDir2Scan4jars"; the script will scan all the jars and create a classpath string and export it to a env variable "tmpCLASSPATH".
Set the classpath in a way suitable for multiple jars and the current directory's class files.
CLASSPATH=${ORACLE_HOME}/jdbc/lib/ojdbc6.jar:${ORACLE_HOME}/jdbc/lib/ojdbc14.jar:${ORACLE_HOME}/jdbc/lib/nls_charset12.jar;
CLASSPATH=$CLASSPATH:/export/home/gs806e/tops/jconn2.jar:.;
export CLASSPATH
I have multiple jars in a folder. The command below worked for me in JDK 1.8 to include all jars present in the folder. Please note that you need to enclose the classpath in quotes if it contains a space.
Windows
Compiling: javac -classpath "C:\My Jars\sdk\lib\*" c:\programs\MyProgram.java
Running: java -classpath "C:\My Jars\sdk\lib\*;c:\programs" MyProgram
Linux
Compiling: javac -classpath "/home/guestuser/My Jars/sdk/lib/*" MyProgram.java
Running: java -classpath "/home/guestuser/My Jars/sdk/lib/*:/home/guestuser/programs" MyProgram
The order of arguments to the java command is also important:
c:\projects\CloudMirror>java Javaside -cp "jna-5.6.0.jar;.\"
Error: Unable to initialize main class Javaside
Caused by: java.lang.NoClassDefFoundError: com/sun/jna/Callback
versus
c:\projects\CloudMirror>java -cp "jna-5.6.0.jar;.\" Javaside
Exception in thread "main" java.lang.UnsatisfiedLinkError: Unable
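A rough sketch of why the order matters, assuming the simplified rule that the launcher reads options until the first token that is not an option, treats that token as the main class, and hands everything after it to the program unchanged. Only -cp and -classpath are modelled as options that take a value here; the real launcher knows many more. LauncherArgs and programArgs are made-up names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of how "java [options] MainClass [args]" is read.
class LauncherArgs {
    // Returns the tokens that would end up in main(String[] args).
    static List<String> programArgs(String... commandLine) {
        int i = 0;
        // Skip launcher options; -cp/-classpath consume the following token too.
        while (i < commandLine.length && commandLine[i].startsWith("-")) {
            boolean takesValue = commandLine[i].equals("-cp") || commandLine[i].equals("-classpath");
            i += takesValue ? 2 : 1;
        }
        // commandLine[i] is the main class; everything after it belongs to the program.
        int from = Math.min(i + 1, commandLine.length);
        return Arrays.asList(commandLine).subList(from, commandLine.length);
    }

    public static void main(String[] args) {
        // -cp after the class name: it is passed to the program, not used as the classpath.
        System.out.println(programArgs("Javaside", "-cp", "jna-5.6.0.jar;."));
        // -cp before the class name: consumed by the launcher, the program gets no args.
        System.out.println(programArgs("-cp", "jna-5.6.0.jar;.", "Javaside"));
    }
}
```

In the first command above the jar never reaches the classpath, which matches the NoClassDefFoundError for com/sun/jna/Callback.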

Execute Java from command line

I have a folder on my desktop titled "Stuff" and in that folder I have the following:
Hello.java
mail.jar
And Hello.java imports from mail.jar, so I need to tell Hello.java to look for mail.jar.
From a Windows command line and from a unix command line, how can I compile this and run this?
Compile:
javac -cp .;mail.jar Hello.java
where ; is for Windows; use : for *nix.
and run:
java -cp .;mail.jar Hello
where again, use ; for Windows and : for *nix.
-cp tells both javac and java what classpath to use, and as your files are in the local directory where you're executing the command, you can use . for the Hello part and the name of the jar for the paths inside the jar. Wikipedia has a decent article on classpaths.
Mind you, if you're going to be doing this on a regular basis, you may want to set your CLASSPATH environment variable rather than constantly using the -cp flag. Both java and javac use the CLASSPATH variable.
For my own development machine, I actually include . in my CLASSPATH variable, for convenience. It's not something I would do on a production or build/test box, but it's very handy for development purposes. You'd want to have your usual jars in it as well.
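On how -cp and the CLASSPATH variable interact: the -cp/-classpath option overrides the CLASSPATH environment variable, and if neither is set the class path defaults to the current directory. A minimal Java sketch of that precedence (EffectiveClasspath and resolve are hypothetical names, not a JDK API):

```java
// Hypothetical illustration of the precedence between -cp and CLASSPATH.
class EffectiveClasspath {
    // cpOption: value passed to -cp/-classpath, or null if the flag was not given.
    // classpathEnv: value of the CLASSPATH environment variable, or null if unset.
    static String resolve(String cpOption, String classpathEnv) {
        if (cpOption != null) {
            return cpOption;      // the command-line flag wins
        }
        if (classpathEnv != null) {
            return classpathEnv;  // otherwise fall back to the environment variable
        }
        return ".";               // default: current directory
    }

    public static void main(String[] args) {
        // What this sketch predicts if no -cp flag were given:
        System.out.println(resolve(null, System.getenv("CLASSPATH")));
        // What the running JVM actually uses:
        System.out.println(System.getProperty("java.class.path"));
    }
}
```

This is also why adding -cp mail.jar without . suddenly stops the JVM from finding Hello in the current directory: the flag replaces the default rather than extending it.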
Assuming Hello.java does not contain a package declaration, on Windows:
javac -cp mail.jar Hello.java
java -cp mail.jar;. Hello
The only difference on Unix platforms is that you separate the elements of the classpath with a colon instead of a semicolon:
java -cp mail.jar:. Hello
Follow this tutorial and you should be able to do it in no time:
Java Compilation
You also shouldn't have any problems with the classpath because your classes are in the same folder
