I have set up a single-node Hadoop cluster on Windows.
When I execute the command
./bin/hadoop jar Prefix.jar PrefixJob ip op
the job gets stuck. There is no exception or any other output; it just hangs.
How can I get it to run?
The correct command form is shown below, followed by the WordCount example that I tested yesterday (on HDInsight):
hadoop.cmd jar jar_file_name.jar class_name input_file_or_folder_name output_folder_name
c:\apps\dist\hadoop-1.1.0-SNAPSHOT\bin\hadoop.cmd jar ^
c:\apps\Jobs\templates\635000448534317551.hadoop-examples.jar wordcount ^
/user/admin/DaVinci.txt /user/admin/outcount
You can also look at the job log to understand what is happening. First open the Job Tracker on your cluster, then search for the job ID to see the information about the submitted job.
Related
I have a Hadoop project to do, and I installed Hadoop exactly as described here: https://www.codeproject.com/Articles/757934/Apache-Hadoop-for-Windows-Platform
I'm trying to run the same MapReduce job, Recipe.java, on the dataset recipeitems-latest.json.
I have created a .jar file from this Recipe.java code, and I've started YARN and DFS. I have also created the directory /in and copied recipeitems-latest.json to it.
Now, I start the job by calling:
hadoop jar c:\Hwork\Recipe.jar Recipe /in /out
The job starts and reports that it is running, but no progress is made, as you can see here: https://i.stack.imgur.com/QSifC.png
I also tried tracking the job by clicking on the given link. Its status is ACCEPTED, but the progress bar shows nothing.
I started using Hadoop only a day ago and I really don't know what is going wrong. Why is there no progress in the job I started?
The problem is resolved. Apparently the EOL characters in \sbin\start-yarn (as well as in \bin\hadoop.cmd) must be changed from '\n' to '\r\n'. After that, it worked like a charm!
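The line-ending fix described above can be scripted. The following is a minimal sketch, not the answerer's actual method: the helper names to_crlf and convert_file are hypothetical, and it assumes the scripts are plain text files that are safe to rewrite in place.

```python
from pathlib import Path


def to_crlf(data: bytes) -> bytes:
    """Return data with every line ending as CRLF.

    Existing CRLF pairs are normalized to LF first so they are not doubled.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def convert_file(path) -> bool:
    """Rewrite the file with CRLF line endings; return True if it changed."""
    p = Path(path)
    original = p.read_bytes()
    converted = to_crlf(original)
    if converted != original:
        p.write_bytes(converted)
    return converted != original
```

Calling convert_file on each affected script (for example the hadoop.cmd file under the bin folder of your Hadoop installation) would apply the change. Keeping a backup copy first is a sensible precaution.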
I have two machines named: ubuntu1 and ubuntu2.
On ubuntu1, I started the master node of a Spark Standalone cluster, and on ubuntu2 I started a worker (slave).
I am trying to execute the wordCount example available on GitHub.
When I submit the application, the worker sends an error message:
java.io.FileNotFoundException: File file:/home/ubuntu1/demo/test.txt does not exist.
My command line is
./spark-submit --master spark://ubuntu1-VirtualBox:7077 --deploy-mode cluster --class br.com.wordCount.App -v --name "Word Count" /home/ubuntu1/demo/wordCount.jar /home/ubuntu1/demo/test.txt
Is it enough for the file test.txt to exist on only one machine?
Note: The master and the worker are on different machines.
Thank you
I had the same problem while loading a JSON file. I realized that, by default, Windows may save a file as a text file regardless of the name you give it. Identify the actual file format, and then you can load the file easily.
For example, suppose you saved the file as test.JSON, but Windows appended .txt to it by default.
Check that and try running again.
I hope this resolves your problem.
Thank you.
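One way to check for such a hidden extension is to list the real file names from code instead of relying on Explorer. This is only a sketch under assumptions: the function name find_hidden_txt is hypothetical, and it looks specifically for names of the form something.json.txt.

```python
from pathlib import Path


def find_hidden_txt(directory, expected_suffix=".json"):
    """Return names in `directory` that end in <expected_suffix>.txt.

    The comparison is case-insensitive, so test.JSON.txt is reported too.
    """
    hits = []
    for p in Path(directory).iterdir():
        suffixes = [s.lower() for s in p.suffixes]
        if (len(suffixes) >= 2
                and suffixes[-1] == ".txt"
                and suffixes[-2] == expected_suffix.lower()):
            hits.append(p.name)
    return sorted(hits)
```

If the function reports a name such as test.JSON.txt, either rename the file or pass the full name, including .txt, to the job.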
You should put your file on HDFS by going to its folder and typing:
hdfs dfs -put <file>
Otherwise, each node must be able to access the file, which means the same path must exist on every machine.
Don't forget to change file:/ to hdfs:/ after you do that.
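The two steps (upload, then switch the scheme) can be sketched as a small helper that derives both the upload command and the path to hand to the job. The function name hdfs_put_plan and the target directory /user/ubuntu1 are assumptions for illustration. The hdfs:/// form also assumes that the cluster's default file system is configured, so no namenode host needs to be spelled out.

```python
import posixpath


def hdfs_put_plan(local_path, hdfs_dir):
    """Return (put_command, job_argument) for one local input file.

    put_command is an argv list for `hdfs dfs -put <localsrc> <dst>`.
    job_argument is the hdfs: URI to pass to the job instead of the file: path.
    """
    name = posixpath.basename(local_path)
    dest = posixpath.join(hdfs_dir, name)
    put_command = ["hdfs", "dfs", "-put", local_path, dest]
    # Empty authority: relies on the configured default file system (assumption).
    job_argument = "hdfs://" + dest
    return put_command, job_argument
```

For the question above, hdfs_put_plan("/home/ubuntu1/demo/test.txt", "/user/ubuntu1") yields the upload command and the argument hdfs:///user/ubuntu1/test.txt, which would replace the local path at the end of the spark-submit line.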
I added KafkaLog4JAppender functionality to my MR job.
Locally, the job runs and sends the formatted logs to my Kafka cluster.
When I try to run it from the YARN server, using:
jar [jar-name].jar [DriverClass].class [job-params] -Dlog4j.configuration=log4j.xml -libjars
I get the following exception:
log4j:ERROR Could not create an Appender. Reported error follows.
java.lang.ClassNotFoundException: kafka.producer.KafkaLog4jAppender
The KafkaLog4JAppender class is on the classpath.
Running
jar tvf [my-jar].jar | grep KafkaLog4J
finds the class.
I'm a bit lost and would appreciate any helpful input.
Thanks in advance!
If it works in local mode but not in YARN/distributed mode, the problem could be that the jar is not being distributed properly. You might want to check "Using third part jars and files in your MapReduce application (Distributed cache)" for details on how to distribute the jar containing KafkaLog4jAppender.class.
I have set up a master and a slave on the same machine.
The master monitors external jobs; the slave runs a batch script and notifies the master about it.
I followed the Windows instructions from:
https://wiki.jenkins-ci.org/display/JENKINS/Monitoring+external+jobs
and set up a build step on the slave as follows:
set JENKINS_HOME=http://localhost:8080/jenkins/
java -jar C:\apache-tomcat-7.0.56\webapps\jenkins\WEB-INF\lib\jenkins-core-1.624.jar "POC_Main_Ext_Job" jenkins_poc_test_1
On building the slave job, I get the following error message in the console output:
Building remotely on MySlave in workspace D:\Temp\Jenkins\workspace\Slave_FreeStyle_1
[Slave_FreeStyle_1] $ cmd /c call C:\Users\ROHIT~1.BIS\AppData\Local\Temp\hudson853358493293228093.bat
D:\Temp\Jenkins\workspace\Slave_FreeStyle_1>set JENKINS_HOME=http://localhost:8080/jenkins/
D:\Temp\Jenkins\workspace\Slave_FreeStyle_1>java -jar C:\apache-tomcat-7.0.56\webapps\jenkins\WEB-INF\lib\jenkins-core-1.624.jar "POC_Main_Ext_Job" jenkins_poc_test_1
http://localhost:8080/jenkins/job/POC_Main_Ext_Job/ is not a valid external job (404 Not Found)
D:\Temp\Jenkins\workspace\Slave_FreeStyle_1>exit -1
Build step 'Execute Windows batch command' marked build as failure
Finished: FAILURE
However, the above URL does work in Jenkins and takes me to the corresponding job. Any idea what I'm doing wrong?
As it turned out, I had left out the user credentials when setting JENKINS_HOME.
After adding them, it works smoothly.
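As a rough illustration of what "adding the credentials" can look like, the sketch below builds a URL with the user name and password in its userinfo part. It assumes the credentials belong in the URL itself (the user:password@host form). The function name with_credentials and the sample account are hypothetical, and whether Jenkins accepts percent-encoded special characters here is also an assumption.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def with_credentials(url, user, password):
    """Return `url` with user:password@ inserted before the host."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port:
        host += f":{parts.port}"
    # Percent-encode so characters such as '@' or ':' do not break the URL.
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return urlunsplit(
        (parts.scheme, f"{userinfo}@{host}", parts.path, parts.query, parts.fragment)
    )
```

With the URL from the question and a made-up account, with_credentials("http://localhost:8080/jenkins/", "alice", "s3cret") gives http://alice:s3cret@localhost:8080/jenkins/, which would then be the value set for JENKINS_HOME in the build step.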
I am setting up an Apache Storm system but am having problems getting the program to run consistently. I have set up Storm on three servers, but it only works consistently on one. I think the issue lies somewhere in the path of the command.
I have been using storm-starter to set up the program and have tested it locally with RollingTopWords. When I run the command $ storm jar storm-starter-*.jar storm.starter.RollingTopWords, the computer stalls for a second and then I get the following error:
Could not find or load main class storm.starter.RollingTopWords
The jar is stored in the directory /apache/storm/examples/storm-starter/target. Let me know if there is any other information I can provide that would help, because I'm feeling a little desperate at this point.
The following is the entire output for the program that doesn't work.
Running: /usr/lib/jvm/java-1.7.0-openjdk-amd64/bin/java -client -Dstorm.options= -Dstorm.home=/home/scix3/apache/storm -Dstorm.log.dir=/home/scix3/apache/storm/logs -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /home/scix3/apache/storm/lib/kryo-2.21.jar:/home/scix3/apache/storm/lib/core.incubator-0.1.0.jar:/home/scix3/apache/storm/lib/commons-fileupload-1.2.1.jar:/home/scix3/apache/storm/lib/ring-servlet-0.3.11.jar:/home/scix3/apache/storm/lib/clj-stacktrace-0.2.2.jar:/home/scix3/apache/storm/lib/jline-2.11.jar:/home/scix3/apache/storm/lib/servlet-api-2.5.jar:/home/scix3/apache/storm/lib/disruptor-2.10.1.jar:/home/scix3/apache/storm/lib/log4j-over-slf4j-1.6.6.jar:/home/scix3/apache/storm/lib/clojure-1.5.1.jar:/home/scix3/apache/storm/lib/commons-exec-1.1.jar:/home/scix3/apache/storm/lib/logback-core-1.0.13.jar:/home/scix3/apache/storm/lib/jetty-util-6.1.26.jar:/home/scix3/apache/storm/lib/slf4j-api-1.7.5.jar:/home/scix3/apache/storm/lib/carbonite-1.4.0.jar:/home/scix3/apache/storm/lib/compojure-1.1.3.jar:/home/scix3/apache/storm/lib/minlog-1.2.jar:/home/scix3/apache/storm/lib/commons-lang-2.5.jar:/home/scix3/apache/storm/lib/tools.macro-0.1.0.jar:/home/scix3/apache/storm/lib/reflectasm-1.07-shaded.jar:/home/scix3/apache/storm/lib/tools.cli-0.2.4.jar:/home/scix3/apache/storm/lib/math.numeric-tower-0.0.1.jar:/home/scix3/apache/storm/lib/logback-classic-1.0.13.jar:/home/scix3/apache/storm/lib/tools.logging-0.2.3.jar:/home/scix3/apache/storm/lib/asm-4.0.jar:/home/scix3/apache/storm/lib/jetty-6.1.26.jar:/home/scix3/apache/storm/lib/snakeyaml-1.11.jar:/home/scix3/apache/storm/lib/hiccup-0.3.6.jar:/home/scix3/apache/storm/lib/clj-time-0.4.1.jar:/home/scix3/apache/storm/lib/jgrapht-core-0.9.0.jar:/home/scix3/apache/storm/lib/clout-1.0.1.jar:/home/scix3/apache/storm/lib/chill-java-0.3.5.jar:/home/scix3/apache/storm/lib/commons-io-2.4.jar:/home/scix3/apache/storm/lib/joda-time-2.0.jar:/home/scix3/apache/storm/lib/storm-core-0.9.4.jar:/home/scix3/apache/storm/lib/objenesis-1.2.jar:/home/scix3/apache/storm/lib/commons-logging-1.1.3.jar:/home/scix3/apache/storm/lib/ring-core-1.1.5.jar:/home/scix3/apache/storm/lib/ring-jetty-adapter-0.3.11.jar:/home/scix3/apache/storm/lib/commons-codec-1.6.jar:/home/scix3/apache/storm/lib/json-simple-1.1.jar:/home/scix3/apache/storm/lib/ring-devel-0.3.11.jar:storm-starter-.jar:/home/scix3/apache/storm/conf:/home/scix3/apache/storm/bin -Dstorm.jar=storm-starter-.jar storm.starter.RollingTopWords
Error: Could not find or load main class storm.starter.RollingTopWords
The main causes of the error
Could not find or load main class storm.starter.RollingTopWords
could be the following.
Check the launch configuration used when building the jar.
You must be very careful while building the jar: it asks you to choose a destination folder and a launch configuration (the launch configuration should belong to the same project).
You might have left the main class out of your project.
Before using StormSubmitter on a remote cluster, check whether the topology works properly with LocalCluster.
To check whether the problem is that Storm cannot find the jar, you can try issuing
storm jar /fullpath/my-storm-jar.jar Classname
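The classpath printed in the question contains the relative entry storm-starter-.jar, which suggests the jar name was not resolved to a real file. A minimal sketch of resolving the pattern to one absolute path before building the command follows. The helper names resolve_jar and storm_jar_command are hypothetical, and the sketch only constructs the argument list rather than running Storm.

```python
import glob
import os


def resolve_jar(pattern):
    """Expand a glob pattern to exactly one absolute jar path."""
    matches = sorted(glob.glob(pattern))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one jar for {pattern!r}, found {len(matches)}"
        )
    return os.path.abspath(matches[0])


def storm_jar_command(pattern, main_class):
    """Build the argv list for `storm jar <absolute jar> <main class>`."""
    return ["storm", "jar", resolve_jar(pattern), main_class]
```

If resolve_jar raises for the pattern you pass, the jar is not where the command expects it, which would produce the same "Could not find or load main class" error.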
A few other things you can check:
The jar is compiled properly and contains the RollingTopWords class.
storm.yaml points to the correct nimbus. (This seems less probable, as the connection is being made and there is an attempt to load the topology.)
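The first check, whether the jar actually contains the class, can be done without unpacking anything, because a jar is a zip archive. This is a sketch using Python's standard zipfile module. The function name jar_has_class is hypothetical, and it assumes the class is stored at the usual package path inside the jar.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar holds the compiled class for `class_name`.

    A fully qualified name such as storm.starter.RollingTopWords maps to
    the entry storm/starter/RollingTopWords.class.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

A False result for storm.starter.RollingTopWords would point to a packaging problem rather than a cluster configuration problem.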