hadoop jar wordcount.jar wordcount exception - java

Whenever I execute the command: hadoop jar wc.jar WordCount text.txt output
it gives me this error:
Exception in thread "main" java.lang.ClassNotFoundException: WordCount
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.util.RunJar.run(RunJar.java:316)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
This happens even though everything seems to be fine.
Can anyone help me out with this?

First of all, you need to make sure you have a main method inside the WordCount class.
You can also look at this answer right here: How to run a jar file in hadoop?
As the answers there suggest, the problem is usually not in running the jar but in how the jar was created.
So for example, you need to run:
$JAVA_HOME/bin/jar cf WordCount.jar WordCount.class
and then run
hadoop jar WordCount.jar WordCount
Hope this helps.

Related

Word Count Program in Hadoop Exception in thread "main" java.lang.ClassNotFoundException

I am using Eclipse to export the jar file of a map-reduce program. When I run the jar using the command:
hadoop jar Mapreduce.jar Mapreduce siva/file1.txt
it shows these errors:
Exception in thread "main" java.lang.ClassNotFoundException: Mapreduce
at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
at java.net.URLClassLoader$1.run(URLClassLoader.java:201)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:200)
at java.lang.ClassLoader.loadClass(ClassLoader.java:325)
at java.lang.ClassLoader.loadClass(ClassLoader.java:270)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
I could not figure out where the problem is. Please help me.
You need to give the fully qualified class name, i.e. including the package name.
For example, if the class name is MapReduce and the package name is com.hadoop.ex,
then your command should be:
hadoop jar MapReduce.jar com.hadoop.ex.MapReduce inputFile outputfile

JUnit exception in main?

I am trying to run a simple JUnit test using OS X and Terminal. I have placed the JUnit jar file inside my Java folder. I was able to compile all of the files using:
javac -cp .:"/Library/java/junit.jar" *.java
It compiles just fine, with no errors. However, when I try to run the command:
java TestRunner
It gives the error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/junit/runner/JUnitCore
at TestRunner.main(TestRunner.java:7)
Caused by: java.lang.ClassNotFoundException: org.junit.runner.JUnitCore
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 1 more
I cannot seem to find what I am doing incorrectly. Any help solving this would be greatly appreciated.
As @ToddSewell said, the external libraries need to be on your classpath at execution time too. Try this:
java -cp .:"/Library/java/junit.jar" TestRunner
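One thing to watch for: JUnit 4.11 and later no longer bundle Hamcrest, so if the runner next fails with a NoClassDefFoundError for an org/hamcrest class, hamcrest-core must be added to the classpath as well (the jar path below is an assumption):

```shell
# hamcrest-core.jar location is hypothetical; adjust to where you keep it
java -cp .:"/Library/java/junit.jar":"/Library/java/hamcrest-core.jar" TestRunner
```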

Running Java classes in Hadoop

I am trying to run a Java class in Hadoop, just like in the book "Hadoop: The Definitive Guide", like this:
hadoop URLCat hdfs://localhost/user/hamza/sometext.txt
Where it is supposed to print the contents of sometext.txt to the terminal.
What happens is when I type that command it gives me this error:
Error: Could not find or load main class URLCat
URLCat is a Java class, but I have no idea why it's not working.
I tried converting it to a jar file using IntelliJ but then when I run it, like this:
hadoop jar Hadoop_example.jar URLCat hdfs://localhost/user/hamza/sometext.txt
Note: Hadoop_example.jar is the jar file name and the class that has main method is URLCat.
It gives me this error:
Exception in thread "main" java.lang.ClassNotFoundException: URLCat
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
So how can I run Java classes in Hadoop?
I found a similar question on Stack Overflow, but the answers were mostly to convert the class into a jar file, and that didn't work for me either.
hadoop jar Hadoop_example.jar URLCat hdfs://localhost/user/hamza/sometext.txt
This is the correct command line to start a program on Hadoop. Whether it starts depends on how you built the jar.
This post may be useful to you.
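As for the first command: "Hadoop: The Definitive Guide" runs bare classes (no jar) by pointing HADOOP_CLASSPATH at the directory containing the compiled .class files, so the "Could not find or load main class" variant usually means that variable isn't set. A sketch, assuming URLCat.class sits in the current directory:

```shell
export HADOOP_CLASSPATH=.       # directory containing URLCat.class
hadoop URLCat hdfs://localhost/user/hamza/sometext.txt
```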

ClassNotFoundException when running hadoop jar

I'm attempting to run a MapReduce job from a jar file and keep getting a ClassNotFoundException error. I'm running Hadoop 1.2.1 on a CentOS 6 virtual machine.
First I compiled the file exercise.java (and class) into a jar file exercise.jar using the following shell script compile.sh :
#!/bin/bash
javac -classpath /pathto/hadoop-common-1.2.1.jar:\
/pathto/hadoop-core-1.2.1.jar /pathto/exercise.java
jar cvf exercise.jar /pathto/*.class
This runs fine and the jar completes successfully. I then attempt to run the actual MapReduce job using shell script exec.sh:
#!/bin/bash
export CLASSPATH=$CLASSPATH:/pathto/hadoop-common-1.2.1.jar:\
/pathto/hadoop-core-1.2.1.jar:/pathto/exercise.class
hadoop jar exercise.jar exercise /data/input/inputfile.txt /data/output
This throws the ClassNotFoundException error:
Exception in thread "main" java.lang.ClassNotFoundException: exercise
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
I realize the explicit path names might not be necessary, but I've been a little desperate to double-check everything. I've confirmed that in my exercise.java file the class is set in the job configuration via job.setJarByClass(exercise.class), and that exercise.class is contained in exercise.jar. I can't seem to figure it out.
UPDATE
Here is the exec.sh script with the full path of exercise.class, which is stored in my Eclipse project directory:
#!/bin/bash
export CLASSPATH=$CLASSPATH:/pathto/hadoop-common-1.2.1.jar:\
/pathto/hadoop-core-1.2.1.jar:/home/username/workspace/MVN_Hadoop/src/main/java.com.amend.hadoop.MapReduce/*
hadoop jar \
exercise.jar \
/home/username/workspace/MVN_Hadoop/src/main/java.com.amend.hadoop.MapReduce/exercise \
/data/input/inputfile.txt \
/data/output
When I actually try to run the exec.sh script using the explicitly written-out path names, I get a completely different set of errors:
Exception in thread "main" java.lang.ClassNotFoundException: /home/hdadmin/workspace/MVN_Hadoop/src/main/java/come/amend/hadoop/MapReduce/exercise
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
I can see these possible errors:
In hadoop jar exercise.jar exercise /data/input/inputfile.txt /data/output, specify the fully qualified name of the exercise class, i.e. org.name.package.exercise if it exists. To cross-check, open the jar file and look at the location of exercise.class.
Also, Hadoop doesn't expect jars to be nested inside jars, since the Hadoop classpath is set globally.
NEW:
See, the following path is something weird: "/home/hdadmin/workspace/MVN_Hadoop/src/main/java/come/amend/hadoop/MapReduce/exercise"
If you are running from your jar, the class name cannot be a filesystem path like that. It could only be the dotted form, come.amend.hadoop.MapReduce.exercise.

Exception during wordcount in Hadoop

I have installed Hadoop successfully and now I want to run Wordcount.jar. As shown below, my source address is /user/amir/dft/pg5000.txt and the destination address to save the results is /user/amir/dft/output.txt.
I have downloaded the .jar file from this url.
Now I'm facing this error message when I run the command below. I followed the instructions found at this url, and my problem is at the "Run the MapReduce job" step. How can I overcome it?
amir#amir-Aspire-5820TG:/usr/local/hadoop$ bin/hadoop jar /usr/local/hadoop/wordcount.jar wordcount /user/amir/dft/pg5000.txt /user/amir/dft/output.txt
Exception in thread "main" java.lang.ClassNotFoundException: wordcount
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.util.RunJar.main(RunJar.java:205)
amir#amir-Aspire-5820TG:/usr/local/hadoop$
It means you have a typo or something wrong with the main class you're specifying. Did you mean org.apache.hadoop.examples.WordCount instead of wordcount?
You don't need to download a new .jar file. A wordcount jar is already included in the Hadoop examples. Just use the command:
bin/hadoop jar hadoop*examples*.jar wordcount /user/amir/dft /user/amir/dft-output
The input and output paths should be directories on HDFS, not files. This will run the wordcount program on all the files uploaded on HDFS under the /user/amir/dft/ path (including your pg5000.txt file).
EDIT: If you want to run this specific jar that you have downloaded, though, follow @samthebest's answer (keeping in mind that the input & output paths are directories).
EDIT2: Following the comments of this answer, it seems that the hadoop version used is newer than the one described in the tutorial. So the .jar for the wordcount program is located at the path hadoop_root/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar, as mentioned in this post.
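Running the examples jar with no program name makes its driver print the list of bundled programs, which is a quick way to confirm the exact name to use (the path below matches the Hadoop 2.2.0 layout mentioned above):

```shell
# Prints a list of valid program names (wordcount, grep, pi, ...)
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar
```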
