I have a jar file that was copied from a Windows machine onto a Unix machine. We usually convert all files copied from Windows using the dos2unix command.
When I convert the jar file to Unix format using dos2unix, I get the error below:
Exception in thread "main" java.io.IOException: Error opening job jar: hadoop-examples-2.0.0-mr1-cdh4.3.0.jar
at org.apache.hadoop.util.RunJar.main(RunJar.java:135)
Caused by: java.util.zip.ZipException: invalid END header (bad central directory offset)
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:127)
at java.util.jar.JarFile.<init>(JarFile.java:135)
at java.util.jar.JarFile.<init>(JarFile.java:72)
at org.apache.hadoop.util.RunJar.main(RunJar.java:133)
It ran successfully before I ran the dos2unix command on it.
Any idea why this happened?
Don't do that. A jar file is the same as a zip archive; it's binary. dos2unix is for converting line endings in text (ASCII) files, not binary files.
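If the jar has already been mangled by dos2unix, copy it over again using a binary-safe transfer (scp/sftp) and verify the archive before handing it to Hadoop. A minimal sketch, using the jar name from the question (adjust the host and paths to your setup):
# copy the jar again with a binary-safe transfer; do NOT run dos2unix on it
scp user@source-host:hadoop-examples-2.0.0-mr1-cdh4.3.0.jar .
# verify the zip structure is intact before running it
unzip -t hadoop-examples-2.0.0-mr1-cdh4.3.0.jar
# or, equivalently, list its contents with the jar tool
jar tf hadoop-examples-2.0.0-mr1-cdh4.3.0.jar > /dev/null && echo "jar looks OK"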
I wrote a basic MapReduce program on my MacBook utilizing Apache's resource here:
https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html
After I finished, I exported a jar of my project and transferred it to my EC2 instance over SSH.
After that, I ran this command through the terminal of my EC2 instance:
/usr/local/hadoop/bin/hadoop jar test.jar com.map.reduce games.tar.gz output
Here /usr/local/hadoop/bin/hadoop is where Hadoop is installed on the EC2 instance, test.jar is the transferred jar file, and com.map.reduce is the package name where all of my classes are hosted. games.tar.gz is the input I will be working with, and output is where I want to see my results.
But I am getting the exception:
Exception in thread "main" java.lang.ClassNotFoundException:
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:466)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:566)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:499)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:374)
at org.apache.hadoop.util.RunJar.run(RunJar.java:311)
at org.apache.hadoop.util.RunJar.main(RunJar.java:232)
I am wondering if this is an issue with the JARS I am using locally. Any help is appreciated.
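For what it's worth, the class argument to hadoop jar must be the fully qualified name of a driver class that is actually inside the jar, not just a package name. A quick way to check (the driver class name below is hypothetical):
# list the classes packaged in the jar and note the fully qualified driver name
jar tf test.jar | grep '\.class$'
# then pass that exact name to hadoop jar, for example:
/usr/local/hadoop/bin/hadoop jar test.jar com.map.reduce.MyDriver games.tar.gz output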
I'm attempting to run a MapReduce job from a jar file and keep getting a ClassNotFoundException error. I'm running Hadoop 1.2.1 on a CentOS 6 virtual machine.
First I compiled exercise.java into a jar file, exercise.jar, using the following shell script, compile.sh:
#!/bin/bash
javac -classpath /pathto/hadoop-common-1.2.1.jar:\
/pathto/hadoop-core-1.2.1.jar /pathto/exercise.java
jar cvf exercise.jar /pathto/*.class
This runs fine and the jar is created successfully. I then attempt to run the actual MapReduce job using the shell script exec.sh:
#!/bin/bash
export CLASSPATH=$CLASSPATH:/pathto/hadoop-common-1.2.1.jar:\
/pathto/hadoop-core-1.2.1.jar:/pathto/exercise.class
hadoop jar exercise.jar exercise /data/input/inputfile.txt /data/output
This throws the ClassNotFoundException error:
Exception in thread "main" java.lang.ClassNotFoundException: exercise
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
I realize the explicit path names might not be necessary, but I've been a little desperate to double-check everything. I've confirmed that in my exercise.java file the class is set in the job configuration via job.setJarByClass(exercise.class), and I've confirmed that exercise.class is contained in exercise.jar. I can't seem to figure it out.
UPDATE
Here is the exec.sh script with the full path of exercise.class, which is stored in my Eclipse project directory:
#!/bin/bash
export CLASSPATH=$CLASSPATH:/pathto/hadoop-common-1.2.1.jar:\
/pathto/hadoop-core-1.2.1.jar:/home/username/workspace/MVN_Hadoop/src/main/java.com.amend.hadoop.MapReduce/*
hadoop jar \
exercise.jar \
/home/username/workspace/MVN_Hadoop/src/main/java.com.amend.hadoop.MapReduce/exercise \
/data/input/inputfile.txt \
/data/output
When I actually try to run the exec.sh script using the explicitly written-out path names, I get a completely different set of errors:
Exception in thread "main" java.lang.ClassNotFoundException: /home/hdadmin/workspace/MVN_Hadoop/src/main/java/come/amend/hadoop/MapReduce/exercise
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
I can see a couple of possible errors.
In hadoop jar exercise.jar exercise /data/input/inputfile.txt /data/output, specify the fully qualified name of the exercise class, i.e. org.name.package.exercise if it is in a package. To cross-check, open the jar file and verify the location of exercise.class.
Also, Hadoop doesn't expect library jars to be bundled inside your jar, since the Hadoop classpath is set globally.
NEW:
The following path looks wrong: "/home/hdadmin/workspace/MVN_Hadoop/src/main/java/come/amend/hadoop/MapReduce/exercise"
If you are running from your jar, the class argument should never be a filesystem path like that; it could only be "come/amend/hadoop/MapReduce/exercise", i.e. the package-qualified class name.
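As a rough sketch of the packaging fix (paths and the package name are illustrative): compile into a build directory so the package structure is preserved, create the jar from the package root, and pass the dotted fully qualified class name to hadoop jar.
#!/bin/bash
# compile into a build directory so the package layout is preserved inside the jar
mkdir -p build
javac -classpath /pathto/hadoop-common-1.2.1.jar:/pathto/hadoop-core-1.2.1.jar \
      -d build /pathto/exercise.java
# create the jar from the package root rather than from an absolute path
jar cvf exercise.jar -C build .
# run with the dotted, fully qualified class name (package name here is illustrative)
hadoop jar exercise.jar com.amend.hadoop.MapReduce.exercise /data/input/inputfile.txt /data/output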
I am running a Sqoop installation script on AWS EMR (release 4.2.0), following this documentation.
After creating the cluster, I submitted my Sqoop script as an argument (at Steps), with s3://elasticmapreduce/libs/script-runner/script-runner.jar / command-runner.jar as the jar file, but I am getting the error below. Can you help me understand the cause of the problem?
Error:
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: Cannot run program "s3://bmsgcm/spark/install-sqoop.sh" (in directory "."): error=2, No such file or directory
at com.amazonaws.emr.command.runner.ProcessRunner.exec(ProcessRunner.java:139)
at com.amazonaws.emr.command.runner.CommandRunner.main(CommandRunner.java:13)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: Cannot run program "s3://bmsgcm/spark/install-sqoop.sh" (in directory "."): error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
at com.amazonaws.emr.command.runner.ProcessRunner.exec(ProcessRunner.java:92)
... 7 more
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
at java.lang.ProcessImpl.start(ProcessImpl.java:130)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
... 8 more
command-runner.jar can only read local files. You can add a bootstrap script to copy files from S3 to the local file system.
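For example, a minimal bootstrap script along these lines (the bucket and script name are taken from the question; the local destination path is an assumption) copies the installer onto the node and runs it:
#!/bin/bash
# copy the install script from S3 to the node's local file system
aws s3 cp s3://bmsgcm/spark/install-sqoop.sh /home/hadoop/install-sqoop.sh
chmod +x /home/hadoop/install-sqoop.sh
# run it locally
/home/hadoop/install-sqoop.sh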
Piggybox is correct. Unlike script-runner.jar, which was used on the 2.x and 3.x EMR AMIs, command-runner.jar can only run local commands. A bootstrap script is the best way to do this.
For instance, if you have a few spark drivers on S3, and you have a shell script (also on S3) to copy them to the master node for later use in a job flow step with spark-submit, then you might have had a step like this:
Steps=[
    {
        'Name': 'Install My Spark Drivers',
        'ActionOnFailure': 'TERMINATE_JOB_FLOW',
        'HadoopJarStep': {
            'Jar': 's3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar',
            'Args': [
                's3://my-bucket/spark-driver-install.sh',
            ]
        }
    },
    ...other steps...
]
As you've experienced, this will fail on EMR 4.x if you simply swap in command-runner.jar for script-runner.jar there.
Instead, make a bootstrap action to call the script, like:
BootstrapActions=[
    {
        'Name': 'Install My Spark Drivers',
        'ScriptBootstrapAction': {
            'Path': 's3://my-bucket/spark-driver-install.sh',
            'Args': []
        }
    }
]
The above example is expressed as boto3 run_job_flow kwargs. It's not immediately obvious to me how to accomplish the same thing in the web console, though.
For executing a script you can use script-runner. I was facing the same issue: my script had ^M (carriage return) characters, which caused this error. Removing them fixed it.
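If you suspect the same problem, stripping the Windows line endings from the shell script (a plain text file, so this is safe) is straightforward; for example:
# remove carriage returns left over from editing the script on Windows
dos2unix install-sqoop.sh
# or, if dos2unix is not available:
sed -i 's/\r$//' install-sqoop.sh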
I am trying to run their own LegacyLibrary example from http://code.google.com/p/javacpp/
When I try to compile the code with the following command mentioned on the site (basically compiling the code with javacpp.jar), I get the following exception:
D:\Java Workspace\POC\JavaCPP\bin>java -jar javacpp.jar LegacyLibrary
Generating source file: D:\Java Workspace\POC\JavaCPP\bin\jniLegacyLibrary.cpp
Building library file: D:\Java Workspace\POC\JavaCPP\bin\windows-x86\jniLegacyLibrary.dll
cl "/IC:\Program Files (x86)\Java\jdk1.6.0_03\include" "/IC:\Program Files (x86)\Java\jdk1.6.0_03\include\win32" "D:\Java Workspace\POC\JavaCPP\bin\jniLegacyLibrary.cpp" /W3 /Oi
/O2 /EHsc /Gy /GL /MD /LD /link "/OUT:D:\Java Workspace\POC\JavaCPP\bin\windows-x86\jniLegacyLibrary.dll"
Exception in thread "main" java.io.IOException: Cannot run program "cl": CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
at com.googlecode.javacpp.Builder.build(Builder.java:189)
at com.googlecode.javacpp.Builder.generateAndBuild(Builder.java:234)
at com.googlecode.javacpp.Builder.main(Builder.java:479)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessImpl.create(Native Method)
at java.lang.ProcessImpl.<init>(ProcessImpl.java:81)
at java.lang.ProcessImpl.start(ProcessImpl.java:30)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
... 3 more
What is the remedy for this?
The error message is pretty clear: it's not finding the cl executable, which is the name of the Visual Studio compiler.
You should run that command from a Visual Studio command prompt (there's usually an entry in the Start menu for that) to get the right environment variables loaded.
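Alternatively, you can load the Visual Studio build environment in an ordinary command prompt before invoking javacpp.jar. A sketch, assuming a 32-bit build; the exact vcvarsall.bat path depends on your Visual Studio version and edition:
REM put cl.exe on the PATH for this command prompt session
"C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\vcvarsall.bat" x86
cd "D:\Java Workspace\POC\JavaCPP\bin"
java -jar javacpp.jar LegacyLibrary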
I was trying to install Hadoop on a Mac. I got the following error. What could be the issue?
hadoop-0.20.203.0 administrator$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
Exception in thread "main" java.io.IOException: Error opening job jar: hadoop-*-examples.jar
at org.apache.hadoop.util.RunJar.main(RunJar.java:90)
Caused by: java.util.zip.ZipException: error in opening zip file
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.<init>(ZipFile.java:127)
at java.util.jar.JarFile.<init>(JarFile.java:135)
at java.util.jar.JarFile.<init>(JarFile.java:72)
at org.apache.hadoop.util.RunJar.main(RunJar.java:88)
-Anish-
I had a similar issue with an example from the book Hadoop In Action. It turns out either the book had it wrong or the example jars now have a different naming scheme. In any case, your command should now begin with:
bin/hadoop jar hadoop-examples-*.jar
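If you're not sure what the examples jar is actually called in your distribution, list it first and then run the grep example against it; for instance:
# find the examples jar that ships with your Hadoop distribution
ls hadoop-*examples*.jar
# then run the grep example against it
bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'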