Spark-submit fails without an error - java

I used the following command to run the Spark Java word-count example:
time spark-submit --deploy-mode cluster --master spark://192.168.0.7:7077 --class org.apache.spark.examples.JavaWordCount /home/pi/Desktop/example/new/target/javaword.jar /books_500.txt
I have copied the same jar file to the same location on all nodes. (Copying it into HDFS didn't work for me.) When I run it, this is the output:
Running Spark using the REST application submission protocol.
16/07/14 16:32:18 INFO rest.RestSubmissionClient: Submitting a request to launch an application in spark://192.168.0.7:7077.
16/07/14 16:32:30 WARN rest.RestSubmissionClient: Unable to connect to server spark://192.168.0.7:7077.
Warning: Master endpoint spark://192.168.0.7:7077 was not a REST server. Falling back to legacy submission gateway instead.
16/07/14 16:32:30 WARN util.Utils: Your hostname, master02 resolves to a loopback address: 127.0.1.1; using 192.168.0.7 instead (on interface wlan0)
16/07/14 16:32:30 WARN util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/07/14 16:32:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
It just stops there, quits the job, and waits for the next command at the terminal. I can't make sense of a failure with no error message. Help needed, please!
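Two details in this output are worth noting (a hedged reading, not a confirmed diagnosis). First, the REST fallback warning appears because --deploy-mode cluster against a standalone master submits through the REST gateway, which listens on port 6066 by default, not 7077. Second, in cluster mode the submitting client returns as soon as the driver is handed off, so the application's output lands in the driver/worker logs rather than the terminal. The warning about master02 resolving to the loopback address 127.0.1.1 is a common Debian/Raspbian /etc/hosts artifact; a sketch of the corrected entry, assuming 192.168.0.7 is the machine's LAN address:

```
# /etc/hosts on master02 (sketch; adjust the IP and hostname to your setup)
127.0.0.1    localhost
192.168.0.7  master02
```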

Related

Launch spark master windows7

Using Windows 7 64-bit, JDK 8, Spark 1.6.2.
I have Spark set up, with winutils, HADOOP_HOME, etc.
The documentation notes: "The launch scripts do not currently support Windows. To run a Spark cluster on Windows, start the master and workers by hand." But it does not say how.
How do I launch the Spark master on Windows?
I tried running sh start-master.sh through Git Bash: it failed to launch org.apache.spark.deploy.master.Master, even though it prints out Master --ip Sam-Toshiba --port 7077 --webui-port 8080. I don't know what all this means.
But when I try spark-submit --class " " --master spark://Sam-Toshiba:7077 target/ .jar, I get errors:
WARN AbstractLifeCycle: FAILED SelectChannelConnector@0.0.0.0:4040: java.net.BindException: Address already in use: bind
java.net.BindException: Address already in use
WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
17/01/12 14:44:29 WARN AppClient$ClientEndpoint: Failed to connect to master Sam-Toshiba:7077
java.io.IOException: Failed to connect to Sam-Toshiba/192.168.137.1:7077
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:216)
I also tried spark://localhost:7077 - same errors.
On Windows you can launch the master with the command below. Open a command prompt, go to the Spark bin folder, and execute:
spark-class.cmd org.apache.spark.deploy.master.Master
The command above will print something like Starting Spark master at spark://192.168.99.1:7077 to the console, with the IP of your machine. You can check the web UI at http://192.168.99.1:8080/
Once your master is up, you can launch a worker with the command below. It will use all the available cores of your machine:
spark-class.cmd org.apache.spark.deploy.worker.Worker spark://192.168.99.1:7077
To use only 2 of your machine's 4 cores, run:
spark-class.cmd org.apache.spark.deploy.worker.Worker -c 2 spark://192.168.99.1:7077

Submit a spark application from Windows to a Linux cluster

I have a Spark Master and Worker running on a remote Ubuntu Linux machine.
I'm trying to run the JavaDirectKafkaWordCount example.
But when I submit from my Windows machine to the Linux cluster, I get:
C:/spark-1.6.0-bin-hadoop2.6/bin/spark-submit --master spark://10.123.45.67:6066 --deploy-mode cluster --class com.company.spark.app.JavaDirectKafkaWordCount "C:/Dev/spark-app/target/spark-app-1.0-SNAPSHOT.jar" kafka-server:9092 topic1
Running Spark using the REST application submission protocol.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/02/11 11:53:20 INFO RestSubmissionClient: Submitting a request to launch an application in spark://10.123.45.67:6066.
16/02/11 11:53:21 INFO RestSubmissionClient: Submission successfully created as driver-20160211115129-0009. Polling submission state...
16/02/11 11:53:21 INFO RestSubmissionClient: Submitting a request for the status of submission driver-20160211115129-0009 in spark://10.123.45.67:6066.
16/02/11 11:53:21 INFO RestSubmissionClient: State of driver driver-20160211115129-0009 is now ERROR.
16/02/11 11:53:21 INFO RestSubmissionClient: Driver is running on worker worker-20160211111114-172.18.0.8-59825 at 172.18.0.8:59825.
16/02/11 11:53:21 ERROR RestSubmissionClient: Exception from the cluster:
java.io.IOException: No FileSystem for scheme: C
org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1650)
org.apache.spark.util.Utils$.doFetchFile(Utils.scala:597)
org.apache.spark.util.Utils$.fetchFile(Utils.scala:394)
org.apache.spark.deploy.worker.DriverRunner.org$apache$spark$deploy$worker$DriverRunner$$downloadUserJar(DriverRunner.scala:150)
org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:79)
16/02/11 11:53:21 INFO RestSubmissionClient: Server responded with
CreateSubmissionResponse:
{
"action" : "CreateSubmissionResponse",
"message" : "Driver successfully submitted as driver-20160211115129-0009",
"serverSparkVersion" : "1.6.0",
"submissionId" : "driver-20160211115129-0009",
"success" : true
}
It looks like Spark is taking C: as the URI scheme. Try changing the command to:
C:/spark-1.6.0-bin-hadoop2.6/bin/spark-submit --master spark://10.123.45.67:6066 --deploy-mode cluster --class com.company.spark.app.JavaDirectKafkaWordCount "file:///C:/Dev/spark-app/target/spark-app-1.0-SNAPSHOT.jar" kafka-server:9092 topic1
By adding file:/// to the path, you tell Spark that the scheme is file, so it will not mistake C for the scheme. Jars submitted with the file scheme are served by Spark's file server so the cluster can fetch them.
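The scheme confusion can be reproduced outside Spark: Hadoop treats everything before the first colon as the filesystem scheme. A minimal shell sketch (paths taken from the question; the string slicing here merely mimics Hadoop's parsing, it is not Hadoop's code):

```shell
# A bare Windows path: Hadoop reads "C" as the filesystem scheme
jar_path="C:/Dev/spark-app/target/spark-app-1.0-SNAPSHOT.jar"
echo "${jar_path%%:*}"   # everything before the first ':'

# With an explicit file:// URL the scheme is unambiguous
jar_url="file:///C:/Dev/spark-app/target/spark-app-1.0-SNAPSHOT.jar"
echo "${jar_url%%:*}"
```

Running this prints C for the bare path and file for the URL, which is exactly the difference between the failing and the working spark-submit invocation.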
The application jar you pass to the submit command uses a Windows-local path, but according to the official Spark documentation:
application-jar: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an hdfs:// path or a file:// path that is present on all nodes.
So the file must exist on, or be accessible from, every cluster member; if you use the local file system, you must make sure the file exists on every node.
In my local environment, using file:///opt/spark-2.0.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.0.1.jar passes the test.

never ending job in mapreduce

I have set some MapReduce configuration in my main method as follows:
configuration.set("mapreduce.jobtracker.address", "localhost:54311");
configuration.set("mapreduce.framework.name", "yarn");
configuration.set("yarn.resourcemanager.address", "localhost:8032");
Now when I launch the MapReduce task, the job is tracked (I can see it in my cluster dashboard, the one listening on port 8088), but the process never finishes. It remains blocked at the following line:
15/06/30 15:56:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/06/30 15:56:17 INFO client.RMProxy: Connecting to ResourceManager at localhost/127.0.0.1:8032
15/06/30 15:56:18 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
15/06/30 15:56:18 INFO input.FileInputFormat: Total input paths to process : 1
15/06/30 15:56:18 INFO mapreduce.JobSubmitter: number of splits:1
15/06/30 15:56:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1435241671439_0008
15/06/30 15:56:19 INFO impl.YarnClientImpl: Submitted application application_1435241671439_0008
15/06/30 15:56:19 INFO mapreduce.Job: The url to track the job: http://10.0.0.10:8088/proxy/application_1435241671439_0008/
15/06/30 15:56:19 INFO mapreduce.Job: Running job: job_1435241671439_0008
Does anyone have an idea?
Edit: in my YARN NodeManager log, I have these messages:
org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1435241671439_0003_03_000001
2015-06-30 15:44:38,396 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_1435241671439_0002_04_000001
Edit 2:
The YARN manager log also contains an exception that occurred earlier (during a previous MapReduce call):
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [0.0.0.0:8040] java.net.BindException: Address already in use; For more details see:
Solution: I killed all the daemon processes and restarted Hadoop. In fact, when I ran jps, I was still seeing Hadoop daemons even though I had stopped them. This was caused by a mismatch of HADOOP_PID_DIR.
The default NodeManager localizer port in YARN is 8040. The error says that the port is already in use. Stop all the Hadoop processes; if you don't have data you care about, you may format the NameNode once and try running the job again. From both of your edits, the issue is clearly with the NodeManager.
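The HADOOP_PID_DIR mismatch is worth spelling out: the stop scripts look for pid files in the current HADOOP_PID_DIR, so if the daemons were started under a different value, no pid file is found and nothing gets stopped, which is why jps still shows the daemons. A minimal sketch of the failure mode (directory and pid-file names are made up for illustration, following Hadoop's hadoop-<user>-<daemon>.pid convention):

```shell
# Daemon started while HADOOP_PID_DIR pointed at one directory...
start_pid_dir=$(mktemp -d)
echo 12345 > "$start_pid_dir/hadoop-pi-namenode.pid"

# ...but the stop script later runs with a different HADOOP_PID_DIR
stop_pid_dir=$(mktemp -d)
if [ ! -f "$stop_pid_dir/hadoop-pi-namenode.pid" ]; then
  # the stop script finds no pid file, so the daemon is never signalled
  echo "no pid file found: daemon left running"
fi
```

Killing the stray daemons by hand (or pointing HADOOP_PID_DIR back at the original directory) and restarting resolves the stale-process situation.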

Spark on yarn jar upload problems

I am trying to run a simple map/reduce Java program using Spark over YARN (Cloudera Hadoop 5.2 on CentOS). I have tried this two different ways. The first way is the following:
YARN_CONF_DIR=/usr/lib/hadoop-yarn/etc/hadoop/;
/var/tmp/spark/spark-1.4.0-bin-hadoop2.4/bin/spark-submit --class MRContainer --master yarn-cluster --jars /var/tmp/spark/spark-1.4.0-bin-hadoop2.4/lib/spark-assembly-1.4.0-hadoop2.4.0.jar simplemr.jar
This method gives the following error:
diagnostics: Application application_1434177111261_0007 failed 2 times
due to AM Container for appattempt_1434177111261_0007_000002 exited
with exitCode: -1000 due to: Resource
hdfs://kc1ltcld29:9000/user/myuser/.sparkStaging/application_1434177111261_0007/spark-assembly-1.4.0-hadoop2.4.0.jar
changed on src filesystem (expected 1434549639128, was 1434549642191)
Then I tried without the --jars:
YARN_CONF_DIR=/usr/lib/hadoop-yarn/etc/hadoop/;
/var/tmp/spark/spark-1.4.0-bin-hadoop2.4/bin/spark-submit --class MRContainer --master yarn-cluster simplemr.jar
diagnostics: Application application_1434177111261_0008 failed 2 times
due to AM Container for appattempt_1434177111261_0008_000002 exited
with exitCode: -1000 due to: File does not exist:
hdfs://kc1ltcld29:9000/user/myuser/.sparkStaging/application_1434177111261_0008/spark-assembly-1.4.0-hadoop2.4.0.jar
.Failing this attempt.. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: root.myuser
start time: 1434549879649
final status: FAILED
tracking URL: http://kc1ltcld29:8088/cluster/app/application_1434177111261_0008
user: myuser Exception in thread "main" org.apache.spark.SparkException: Application
application_1434177111261_0008 finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:841)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:867)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 15/06/17 10:04:57 INFO util.Utils: Shutdown hook called 15/06/17
10:04:57 INFO util.Utils: Deleting directory
/tmp/spark-2aca3f35-abf1-4e21-a10e-4778a039d0f4
I tried deleting all the .jars from hdfs://users//.sparkStaging and resubmitting, but that didn't help.
The problem was solved by copying spark-assembly.jar into a directory on HDFS visible to every node and then passing it to spark-submit via the --conf spark.yarn.jar parameter. The commands are listed below:
hdfs dfs -copyFromLocal /var/tmp/spark/spark-1.4.0-bin-hadoop2.4/lib/spark-assembly-1.4.0-hadoop2.4.0.jar /user/spark/spark-assembly.jar
/var/tmp/spark/spark-1.4.0-bin-hadoop2.4/bin/spark-submit --class MRContainer --master yarn-cluster --conf spark.yarn.jar=hdfs:///user/spark/spark-assembly.jar simplemr.jar
If you are getting this error, it means you are uploading the assembly jar with the --jars option or manually copying it to HDFS on each node.
I have followed this approach and it works for me.
In yarn-cluster mode, Spark submit automatically uploads the assembly jar to a distributed cache that all executor containers read from, so there is no need to manually copy the assembly jar to all nodes (or pass it through --jars).
It seems there are two versions of the same jar in your HDFS.
Try removing all the old jars from your .sparkStaging directory and try again; it should work.
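The "changed on src filesystem" error from the first attempt is a timestamp check: YARN records the resource's modification time at submission and refuses to localize it if the file's mtime has changed since. A minimal local sketch of that comparison (the file and the two timestamps are made up for illustration):

```shell
# Stand-in for the staged assembly jar
jar=$(mktemp)
touch -d '2015-06-17 10:00:00' "$jar"
expected=$(stat -c %Y "$jar")            # mtime recorded at submission

# Jar gets rebuilt/recopied before localization completes
touch -d '2015-06-17 10:00:03' "$jar"
actual=$(stat -c %Y "$jar")

if [ "$expected" != "$actual" ]; then
  echo "Resource changed on src filesystem (expected $expected, was $actual)"
fi
```

This is why clearing .sparkStaging, or hosting a single stable copy via spark.yarn.jar as above, makes the error go away: the timestamps stop drifting between submission and localization.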

Unable to start Oryx with Hadoop

I am trying to run Oryx with Hadoop 2.4. Hadoop starts successfully, with a warning:
WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable.
Oryx also starts successfully. But when I ingest data into it, the following exception is thrown:
2014-08-22 14:35:05,835 ERROR [IPC Server handler 3 on 37788] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1408697508855_0002_m_000000_0 - exited : org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
2014-08-22 14:35:05,835 INFO [IPC Server handler 3 on 37788] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Diagnostics report from attempt_1408697508855_0002_m_000000_0: Error: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
2014-08-22 14:35:05,837 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics report from attempt_1408697508855_0002_m_000000_0: Error: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
2014-08-22 14:35:05,840 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1408697508855_0002_m_000000_0 TaskAttempt Transitioned from RUNNING to FAIL_CONTAINER_CLEANUP
Has anyone faced this kind of issue before? Any help would be appreciated!
I'm copying a few items from your thread on the mailing list:
This may be a problem with the installation's Snappy libraries, but that seems to have been resolved.
The YARN containers are being killed for running past virtual memory limits. See the FAQ -- this may be a Java issue you can work around by changing YARN config.
The final problem seems to be another YARN configuration issue, although it's not clear which. I suggest starting from a fresh configuration and/or a preconfigured distribution known to work, if possible.
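For the virtual-memory kills mentioned in the second item, a common workaround on Hadoop 2.x is to relax the NodeManager's vmem check in yarn-site.xml (a sketch only; whether disabling the check is appropriate depends on your cluster):

```xml
<!-- yarn-site.xml fragment: relax virtual-memory enforcement -->
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>4</value>
</property>
```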