I am trying to run MapReduce on Pentaho 5. For Pentaho 5, the Pentaho applications come pre-configured for Apache Hadoop 0.20.2, and the documentation says no further configuration is required for this version. I installed Hadoop 0.20.2 on Windows using Cygwin and everything works fine. I ran a simple job in Pentaho that copies files into HDFS; it finished successfully and the files were copied into HDFS. But as soon as I run a MapReduce job, Pentaho reports the job as finished, yet the MapReduce task fails, the result is missing from the output directory on HDFS, and the log file says:
Error: java.lang.ClassNotFoundException: org.pentaho.di.trans.step.RowListener
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:833)
at org.apache.hadoop.mapred.JobConf.getMapRunnerClass(JobConf.java:790)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Please help me out.
This may be a little old, but I thought it might help someone.
This can be caused by:
For Hadoop versions > 0.20: check that you set up your environment correctly; see Pentaho Support: Creating a New Hadoop Configuration.
Check on HDFS whether you have an /opt/pentaho/mapreduce/ directory, and check the folder permissions on HDFS for /opt/pentaho (are the kettle-*.jar files present in the lib folder?).
Check the classpath separator ("," on Windows, ":" on Linux). To change it, edit spoon.sh (or spoon.bat) and modify the OPT variable like this: OPT="$OPT -Dhadoop.cluster.path.separator=,"
I found the following project on GitHub, https://github.com/fbukevin/hadoop-cooccurrence, which implements a co-occurrence algorithm in Hadoop.
I'm using a virtualized Ubuntu 14.04 and managed to install Hadoop as a single-node cluster following these instructions: http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php. I'm new to Hadoop and these are my first attempts to run a program with YARN.
I can execute the yarn command on the command line, but I don't know how to run the co-occurrence algorithm on YARN. The project description says the program can be run with the following command:
$ yarn jar <hadoop>.jar [pairs | stripes] <input_file>
So I tried this:
$ yarn jar /home/vmiller/Downloads/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar pairs pg100.txt
Exception in thread "main" java.lang.ClassNotFoundException: pairs
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:278)
at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
This is definitely not correct, but I don't know how to run the command properly. Somehow I have to tell yarn to use the Cooccurrence class located in hadoop-cooccurrence/src/main/java/cooc/Cooccurrence.java, because this file seems to be the one that executes the co-occurrence algorithm. But how do I tell yarn to use this class with the pairs and stripes arguments on the input file?
You should give yarn jar the path to the jar that contains the Cooccurrence class.
The jar is in the target folder (cooc-1.0-SNAPSHOT.jar).
You don't need to indicate the class name, as it is set in the manifest file.
I actually managed to run the program. My approach wasn't that wrong; as tokiloutok mentioned, I had to point to the right jar file.
Before I could execute the command, I had to import pg100.txt into HDFS.
So I had to deactivate the NameNode's safe mode with
hdfs dfsadmin -safemode leave
and import the file with
hdfs dfs -put /home/vmiller/workspace/hadoop-cooccurrence/pg100.txt /user/hadoop/
so that I could finally run
yarn jar target/cooc-1.0-SNAPSHOT.jar pairs pg100.txt
without getting any errors.
I'm attempting to run a MapReduce job from a jar file and keep getting a ClassNotFoundException error. I'm running Hadoop 1.2.1 on a CentOS 6 virtual machine.
First I compiled exercise.java (and its class) into a jar file exercise.jar using the following shell script, compile.sh:
#!/bin/bash
javac -classpath /pathto/hadoop-common-1.2.1.jar:\
/pathto/hadoop-core-1.2.1.jar /pathto/exercise.java
jar cvf exercise.jar /pathto/*.class
This runs fine and the jar is created successfully. I then attempt to run the actual MapReduce job using the shell script exec.sh:
#!/bin/bash
export CLASSPATH=$CLASSPATH:/pathto/hadoop-common-1.2.1.jar:\
/pathto/hadoop-core-1.2.1.jar:/pathto/exercise.class
hadoop jar exercise.jar exercise /data/input/inputfile.txt /data/output
This throws the ClassNotFoundException error:
Exception in thread "main" java.lang.ClassNotFoundException: exercise
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
I realize the explicit path names might not be necessary, but I've been a little desperate to double-check everything. I've confirmed that in my exercise.java file the class is set in the job configuration via job.setJarByClass(exercise.class); and that exercise.class is contained in exercise.jar. I can't seem to figure it out.
UPDATE
Here is the exec.sh script with the full path of exercise.class, which is stored in my Eclipse project directory:
#!/bin/bash
export CLASSPATH=$CLASSPATH:/pathto/hadoop-common-1.2.1.jar:\
/pathto/hadoop-core-1.2.1.jar:/home/username/workspace/MVN_Hadoop/src/main/java.com.amend.hadoop.MapReduce/*
hadoop jar \
exercise.jar \
/home/username/workspace/MVN_Hadoop/src/main/java.com.amend.hadoop.MapReduce/exercise \
/data/input/inputfile.txt \
/data/output
When I actually try to run the exec.sh script using the explicitly written-out path names, I also get a completely different error:
Exception in thread "main" java.lang.ClassNotFoundException: /home/hdadmin/workspace/MVN_Hadoop/src/main/java/come/amend/hadoop/MapReduce/exercise
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
I can see these possible errors.
In hadoop jar exercise.jar exercise /data/input/inputfile.txt /data/output, specify the fully qualified name of the exercise class, i.e. org.name.package.exercise if it is in a package. To cross-check, open the jar file and check the location of exercise.class.
Also, Hadoop doesn't expect the Hadoop jars to be bundled inside your jar, since the Hadoop classpath is set globally.
NEW:
The following path looks odd: "/home/hdadmin/workspace/MVN_Hadoop/src/main/java/come/amend/hadoop/MapReduce/exercise"
If you are running from your jar, the class argument should not be a full filesystem path like that; at most it could be "come/amend/hadoop/MapReduce/exercise", i.e. the package-qualified class name.
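For illustration, here is a minimal driver sketch, assuming the class is declared in a hypothetical package com.amend.hadoop.MapReduce (use whatever package your source actually declares); it also shows the job.setJarByClass call mentioned in the question:

// Minimal driver sketch; the package name below is an assumption, not taken from the jar.
package com.amend.hadoop.MapReduce;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class exercise {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "exercise");
        // Tells Hadoop which jar to ship to the task JVMs by locating
        // the jar that contains this class.
        job.setJarByClass(exercise.class);
        // Mapper/Reducer setup omitted for brevity.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

With that package declaration, the job would be launched with the fully qualified class name rather than a path, for example: hadoop jar exercise.jar com.amend.hadoop.MapReduce.exercise /data/input/inputfile.txt /data/output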
When I try to uninstall WebLogic in console mode (the uninstallation failed in GUI mode before!) with this command
sh uninstall.sh -mode=console
Below is the exception that I got:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/Layout
at com.bea.plateng.common.util.logging.LogFactory.newLogInstance(LogFactory.java:102)
at com.bea.plateng.common.util.logging.LogFactory.getLog(LogFactory.java:87)
at com.bea.plateng.wizard.WizardController.setupWizardLog(WizardController.java:325)
at com.bea.plateng.wizard.WizardController.<init>(WizardController.java:168)
at com.bea.plateng.wizard.WizardHelper.invokeWizard(WizardHelper.java:161)
at com.bea.plateng.wizard.WizardHelper.invokeWizardAndWait(WizardHelper.java:42)
at com.bea.plateng.wizard.WizardController.main(WizardController.java:933)
Caused by: java.lang.ClassNotFoundException: org.apache.log4j.Layout
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
I have searched Google many times, but with no luck.
Many thanks for helping me solve this issue.
If your end goal is just to uninstall WebLogic, you may opt to manually delete the relevant directories. If you wish to stick with the script or the GUI, you might try to solve the log4j issue logically:
Place the log4j jar in $DOMAIN_NAME/lib.
If the uninstall script is looking for log4j, that means a Java process is probably being launched from some other nested script and is trying to use log4j to write logs. Try to provide the log4j jar via the -cp option to that process.
Are your startWebLogic and stopWebLogic scripts able to use log4j successfully? If yes, try to provide the log4j path to the uninstall script in the same way it is done in those scripts.
You mainly have to figure out why your uninstall.sh is looking for log4j.
It's hard to answer precisely, since these issues are very machine-specific. If you go with the manual delete option, search for the steps specific to your OS.
I am using Hadoop version 2.4.1. I am trying to run a MapReduce job that moves data from the local system to an HDFS cluster (the output directory). If I set the output directory to a local system path, the program runs fine. But when I set the output directory to a path in the HDFS cluster, I get the error below:
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/protobuf/ServiceException
at org.apache.hadoop.ipc.ProtobufRpcEngine.<clinit>(ProtobufRpcEngine.java:69)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1834)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1799)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
at org.apache.hadoop.ipc.RPC.getProtocolEngine(RPC.java:203)
at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:537)
at org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:328)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:235)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:139)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2397)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(FileOutputFormat.java:160)
at s1.run(s1.java:66)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at s1.main(s1.java:75)
Caused by: java.lang.ClassNotFoundException: com.google.protobuf.ServiceException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 25 more
I saw some posts stating that the issue could be related to the protobuf dependency.
Hadoop 2.2.0 mapreduce job not running after upgrading from hadoop 1.0.4
I am using the hadoop-common jar 2.5.2, which includes protobuf. Any help to solve this would be appreciated.
Made it work! I found that there were some 2.2-version jars that were incompatible with the current version. When I updated those, the program worked fine.
If you compile the *.java files, using the default Java CLASSPATH is OK.
Then edit hadoop-env.sh:
export HADOOP_CLASSPATH=${CLASSPATH}
and restart the Hadoop server.
NoClassDefFoundError is thrown by the JVM at runtime when a class is not present on the classpath.
Check your classpath.
Also check this answer; it could be useful once you have solved the NoClassDefFoundError.
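As a quick diagnostic, you can probe the runtime classpath by loading the suspect class reflectively. Here is a minimal sketch (ClasspathCheck is just an illustrative name; the class being looked up is the protobuf class from the stack trace above):

public class ClasspathCheck {
    public static void main(String[] args) {
        try {
            // Reflective lookup fails with ClassNotFoundException if the
            // class is not on the runtime classpath.
            Class.forName("com.google.protobuf.ServiceException");
            System.out.println("protobuf is on the classpath");
        } catch (ClassNotFoundException e) {
            // The same missing class surfaces as NoClassDefFoundError when
            // code compiled against it is executed at runtime.
            System.out.println("protobuf is missing from the classpath");
        }
    }
}

Running this with and without the protobuf jar on the classpath shows whether the dependency is actually visible to the JVM.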
I am trying to use JDBC to access the MySQL database on my computer. I get this error message:
java.lang.ClassNotFoundException: com.mysql.jdbc.Driver
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:191)
at FunctionClass.special_function(FunctionClass.java:72)
at FunctionClass.<init>(FunctionClass.java:25)
at EvaluateFunctions.main(EvaluateFunctions.java:12)
I've seen in other posts that this happens because the driver is not in the lib directory of the JDK. I tried that and it still doesn't work: I added the *.jar file to my "/usr/lib/default-java/lib/" folder on Xubuntu. I am not using any kind of server. Here is the code that connects to the database:
Class.forName("com.mysql.jdbc.Driver");
Does anyone know what I am missing here?
Thanks for all the help in advance.
If you are using Eclipse or some other IDE, you can add the jar files to your build path.
For Eclipse, it can be done as described in this link: https://stackoverflow.com/a/27085441/4083590
If you are using the command line, you should add the jar files to your CLASSPATH. It can be done as shown below.
Linux:
CLASSPATH=$CLASSPATH:{Current working directory}:{Direct path to .jar file}
CLASSPATH=$CLASSPATH:/home/users/xyz/workspace/:/home/users/xyz/workspace/xyz.jar
Note: . (or ./) can be used to signify the current working directory.
Windows:
set CLASSPATH=%CLASSPATH%;{Current working directory};{Direct path to .jar file}
set CLASSPATH=%CLASSPATH%;C:\users\xyz\workspace;C:\users\xyz\workspace\xyz.jar
After doing this you should be able to compile and run your program.
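Once the MySQL Connector/J jar is on the classpath, a minimal sketch of the connecting code could look like the following (the class name JdbcSketch, the URL, the database name and the credentials are placeholders, not taken from the question):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcSketch {
    public static void main(String[] args) throws Exception {
        // Loads and registers the driver; this is the line that fails with
        // ClassNotFoundException when the connector jar is missing.
        Class.forName("com.mysql.jdbc.Driver");

        // Placeholder URL and credentials.
        String url = "jdbc:mysql://localhost:3306/testdb";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        }
    }
}

Run it with the connector jar on the classpath, for example java -cp .:mysql-connector-java-x.y.z.jar JdbcSketch on Linux, where the jar file name is a placeholder for whichever version you downloaded.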