Error running hadoop job from Eclipse - java

I am facing some issues while trying to launch a mapreduce job on our Hadoop cluster from eclipse. I have added a folder named "conf" as a class folder and under that folder, I have imported the "core-site.xml", "hdfs-site.xml", "mapred-site.xml" and "hbase-site.xml". My hadoop cluster runs Hadoop 0.20.205.0, HBase-0.94.1. We are able to successfully submit the jobs to the cluser using "hadoop jar" command. Since this is very cumbersome, I want to setup eclipse such that I could submit the Hadoop jobs to the cluster by just running the program.
After I added the required dependencies to the project, I am getting the following exception when I run the example "PiEstimator.java" (of Hadoop-0.20.205.0).
Number of Maps = 4
Samples per Map = 4
Exception in thread "main" org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.protocol.ClientProtocol.create(java.lang.String, org.apache.hadoop.fs.permission.FsPermission, java.lang.String, boolean, boolean, short, long)
at java.lang.Class.getMethod(Class.java:1632)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
at org.apache.hadoop.ipc.Client.call(Client.java:1066)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at com.sun.proxy.$Proxy1.create(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at com.sun.proxy.$Proxy1.create(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:3245)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:713)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:182)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:555)
at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:892)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:393)
at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:284)
at com.amazon.seo.mapreduce.examples.PiEstimator.estimate(PiEstimator.java:265)
at com.amazon.seo.mapreduce.examples.PiEstimator.run(PiEstimator.java:325)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at com.amazon.seo.mapreduce.examples.PiEstimator.main(PiEstimator.java:333)
Can you please help me understand what part of my setup is wrong and how to fix it?

Were you able to resolve this issue?
I've resolved similar error with ClassDefinition as follows:
Create Jar as a Runnable Java File (File >> Export >> Runnable Java File)
export HADOOP_CLASSPATH =
This will make hadoop to pickup the correct class from your jar file.

Unfortunately, I believe that you will have to upgrade Hadoop's version to at least 2.5.0

Related

Filepath on Mac for local Parquet files in Spark program in Java

I have a small Spark program in Java, that reads parquet files from a local directory on a Mac.
I have been trying to do this multiple ways, but nothing seems to be working.
Dataset<Row> dsuomcategoryconvfactor = spark.read().parquet(path + "file:///usr/local⁩/ParquetData/data1.parquet");
I think I am giving the path wrong for Spark to identify it, and it's throwing the below error.
20/01/06 10:58:29 INFO SharedState: Warehouse path is 'file:/usr/local/Cellar/apache-spark/2.4.4/libexec/work/driver-20200106105812-0006/spark-warehouse'.
20/01/06 10:58:29 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:65)
at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: org.apache.spark.sql.AnalysisException: Path does not exist: file:/usr/local⁩/ParquetData/data1.parquet;
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:558)
at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary$1.apply(DataSource.scala:545)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:355)
at org.apache.spark.sql.execution.datasources.DataSource.org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary(DataSource.scala:545)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:359)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:644)
at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:628)
This works fine when run from an IDE, but when I submit the job from shell using spark-submit, this error is being thrown.
Any help would be appreciated.
Thanks!

My cdh5.2 cluster get FileNotFoundException when running hbase MR jobs

My cdh5.2 cluster has a problem to run hbase MR jobs.
For example, I added the hbase classpath into the hadoop classpath:
vi /etc/hadoop/conf/hadoop-env.sh
add the line:
export HADOOP_CLASSPATH="/usr/lib/hbase/bin/hbase classpath:$HADOOP_CLASSPATH"
And when I am running:
hadoop jar /usr/lib/hbase/hbase-server-0.98.6-cdh5.2.1.jar rowcounter "mytable"
I get the following exception:
14/12/09 03:44:02 WARN security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: hdfs://clusterName/usr/lib/hbase/lib/hbase-client-0.98.6-cdh5.2.1.jar
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.hbase.mapreduce.Driver.main(Driver.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://clusterName/usr/lib/hbase/lib/hbase-client-0.98.6-cdh5.2.1.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1083)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1075)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1075)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:394)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1295)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1292)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1292)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1313)
at org.apache.hadoop.hbase.mapreduce.RowCounter.main(RowCounter.java:191)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
Even I have the same problem with CDH 5.2.0. as a work around,
I manually copied the jar file in to hdfs, then exceptions are not coming
So, The problem was environment issue:
When I added the below jars into /usr/lib/hadoop/lib. all worked fine
hbase-client-0.98.6-cdh5.2.1.jar
hbase-common-0.98.6-cdh5.2.1.jar
hbase-protocol-0.98.6-cdh5.2.1.jar
hbase-server-0.98.6-cdh5.2.1.jar
hbase-prefix-tree-0.98.6-cdh5.2.1.jar
hadoop-core-2.5.0-mr1-cdh5.2.1.jar
htrace-core-2.04.jar
My machine has the following rpms in it:
>> rpm -qa | grep cdh
zookeeper-3.4.5+cdh5.2.1+84-1.cdh5.2.1.p0.13.el6.x86_64
hadoop-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hadoop-0.20-mapreduce-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hbase-regionserver-0.98.6+cdh5.2.1+64-1.cdh5.2.1.p0.9.el6.x86_64
cloudera-cdh-5-0.x86_64
bigtop-utils-0.7.0+cdh5.2.1+0-1.cdh5.2.1.p0.13.el6.noarch
bigtop-jsvc-0.6.0+cdh5.2.1+578-1.cdh5.2.1.p0.13.el6.x86_64
parquet-1.5.0+cdh5.2.1+38-1.cdh5.2.1.p0.12.el6.noarch
hadoop-hdfs-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hadoop-mapreduce-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hadoop-0.20-mapreduce-tasktracker-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hbase-0.98.6+cdh5.2.1+64-1.cdh5.2.1.p0.9.el6.x86_64
avro-libs-1.7.6+cdh5.2.1+69-1.cdh5.2.1.p0.13.el6.noarch
parquet-format-2.1.0+cdh5.2.1+6-1.cdh5.2.1.p0.14.el6.noarch
hadoop-yarn-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
hadoop-hdfs-datanode-2.5.0+cdh5.2.1+578-1.cdh5.2.1.p0.14.el6.x86_64
I still wonder which rpm is missing.
Instead of adding the file manually to HDFS, you can add the hbase library path to the .bashrc file. Add the lib folder in hbase to the CLASSPATH.
Also, add the classpath of hbase to HADOOP_CLASSPATH.
Your .bashrc file should contain the following:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`${HBASE_HOME}/bin/hbase classpath`
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`${HBASE_HOME}/bin/hbase mapredcp`
export CLASSPATH=${HBASE_HOME}/lib/*
Note: The CLASSPATH should point to the lib folder of your hbase installation folder. Use the following to compile and run your java code.
javac Example.java
java -classpath $CLASSPATH:. Example

Graphbuilder with Hadoop

I am running graphbuilder with hadoop (single-node). I have followed the tutorial "https://01.org/graphbuilder/documentation/how-run-demo-application". However, when I run the command
bin/hadoop jar /usr/local/graphbuilder/target/graphbuilder-0.0.1-SNAPSHOT-hadoop-job.jar com.intel.hadoop.graphbuilder.demoapps.wikipedia.docwordgraph.TFIDFGraphEnd2End 1 /user/hduser/wiki-input /user/hduser/en-wiki-articles-output ingressCode*
I am getting error
INFO docwordgraph.CreateWordCountGraph: ========== Creating Graph ================
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.filecache.DistributedCache.addFileToClassPath(Lorg/apache/hadoop/fs/Path;Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/fs/FileSystem;)V
at com.intel.hadoop.graphbuilder.util.FsUtil.distributedTempClassToClassPath(FsUtil.java:46)
at com.intel.hadoop.graphbuilder.job.AbstractPreprocessJob.run(AbstractPreprocessJob.java:111)
at com.intel.hadoop.graphbuilder.demoapps.wikipedia.docwordgraph.CreateWordCountGraph.main(CreateWordCountGraph.java:77)
at com.intel.hadoop.graphbuilder.demoapps.wikipedia.docwordgraph.TFIDFGraphEnd2End.main(TFIDFGraphEnd2End.java:69)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)"
Can you please guide me how to solve this issue?
Thanks
Looks like you're using a back level version of hadoop. Check the version of hadoop that your version of graph builder needs and make sure that's the version you're running.

Java: Run Program without Eclipse

I made a game that uses external libraries and resource files. Now I want test it on another PC(with Ubuntu OS), I tried to run it without eclipse by exporting to jar files, but it doesn't work. This is the hierarchy. Is there any simple way to do this ?
Edit
When I export it as Runnable(Note that I'm trying to run it still on windows not ubuntu) jar file and try to run it java -jar gra.jar It throws:
`Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoa
der.java:58)
Caused by: java.lang.UnsatisfiedLinkError: no lwjgl in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860)
at java.lang.Runtime.loadLibrary0(Runtime.java:845)
at java.lang.System.loadLibrary(System.java:1084)
at org.lwjgl.Sys$1.run(Sys.java:73)
at java.security.AccessController.doPrivileged(Native Method)
at org.lwjgl.Sys.doLoadLibrary(Sys.java:66)
at org.lwjgl.Sys.loadLibrary(Sys.java:95)
at org.lwjgl.Sys.<clinit>(Sys.java:112)
at org.lwjgl.opengl.Display.<clinit>(Display.java:132)
at org.newdawn.slick.AppGameContainer$1.run(AppGameContainer.java:39)
at java.security.AccessController.doPrivileged(Native Method)
at org.newdawn.slick.AppGameContainer.<clinit>(AppGameContainer.java:36)
at MainWindow.main(MainWindow.java:14)
... 5 more`
Make sure you are getting your resource files correctly, not referencing them in your local file system.
To ensure your resources are put in the JAR add them to your classpath.
Caused by: java.lang.UnsatisfiedLinkError: no lwjgl in java.library.path
lwjgl is not found. It should be in one of the directories listed in java.library.path.
You most certainly added a directory to java.library.path with the native dlls of jwgl when you set up your project. Eclipse cannot export this. You will have to create a starter script (sh, bat) which adds the dlls.
Have a look at the last step here:
http://www.lwjgl.org/wiki/index.php?title=Setting_Up_LWJGL_with_Eclipse
How did you pack the jar? If you used solely eclipse, you should use jarsplice. That may fix the error

Using one-jar to build one jar file

I'm trying to use one-jar to generate one jar file that contains clojure jar file and java class file: Creating one jar file that for execution from Java/Clojure
Following the instruction, I could generate directories using one-jar-appgen-0.97.jar. As instructed, I replaced the java source, and added the ThingOne-1.0.0-SNAPSHOT-standalone.jar
Running ant, it builds jar file without an error, but I got error messages when I try to execute the jar file.
java -jar build/test-one-jar.jar
test_one_jar main entry point, args=[]
Hello from Java!
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.simontuffs.onejar.Boot.run(Boot.java:342)
at com.simontuffs.onejar.Boot.main(Boot.java:168)
Caused by: java.lang.ExceptionInInitializerError
at clojure.lang.Namespace.<init>(Namespace.java:34)
at clojure.lang.Namespace.findOrCreate(Namespace.java:176)
at clojure.lang.Var.internPrivate(Var.java:149)
at ThingOne.core.<clinit>(Unknown Source)
at onejar.main.TestOneJarMain.run(TestOneJarMain.java:27)
at onejar.main.TestOneJarMain.main(TestOneJarMain.java:20)
... 6 more
Caused by: java.lang.NullPointerException
at clojure.lang.RT.lastModified(RT.java:374)
at clojure.lang.RT.load(RT.java:408)
at clojure.lang.RT.load(RT.java:398)
at clojure.lang.RT.doInit(RT.java:434)
at clojure.lang.RT.<clinit>(RT.java:316)
... 12 more
What might be wrong?
Lines 168 and 342 in the Boot.java class of One-Jar indicate a problem with setting properties. This problem is happening when unit tests fail. My guess is it's related to bug 3090800 in the SourceForge One-Jar Bug Tracker.

Categories