I wrote a few Spark jobs in Java and submitted the jars with the submit script:
bin/spark-submit --class "com.company.spark.jobName.SparkMain" --master local[*] /tmp/spark-job-1.0.jar
A service will run on the same server, and it should stop the job when it receives a stop command.
The service has this information about the job:
SparkHome
AppName
AppResource
Master uri
app-id
status
Is there any way to stop a running Spark job from Java code?
Have you reviewed the REST server and the ability to use /submissions/kill/[submissionId]? That seems like it would work for your need.
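A minimal sketch of calling that endpoint from Java 11+, assuming the job was submitted in cluster mode to a standalone master whose REST server is enabled and reachable, and that the service holds the submission id (the driver-... value), which is not the same thing as the app-id. The class and method names are hypothetical. A job started with --master local[*] has no submission id, so this would not apply to it.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class SparkKillClient {
    // Builds the kill URL, e.g. http://master:6066/v1/submissions/kill/driver-...
    // restBase is an assumption: the scheme, host and port of the master's REST server.
    static URI killUri(String restBase, String submissionId) {
        String base = restBase.endsWith("/")
                ? restBase.substring(0, restBase.length() - 1)
                : restBase;
        return URI.create(base + "/v1/submissions/kill/" + submissionId);
    }

    // Sends an empty POST to the kill endpoint and returns the raw JSON response body.
    static String kill(String restBase, String submissionId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(killUri(restBase, submissionId))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

The service would call something like SparkKillClient.kill("http://master-host:6066", submissionId) when it receives the stop command; the host name here is a placeholder.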
I have a Spark Streaming application that uses Kinesis and runs on EMR 6.0.0.
It runs fine locally, but when deployed to AWS EMR it keeps failing with a
NoClassDefFoundError exception:
20/11/17 15:26:56 INFO Client:
client token: N/A
diagnostics: User class threw exception: java.lang.NoClassDefFoundError: com/fasterxml/jackson/dataformat/cbor/CBORFactory
at com.amazonaws.protocol.json.SdkJsonProtocolFactory.getSdkFactory(SdkJsonProtocolFactory.java:123)
at com.amazonaws.protocol.json.SdkJsonProtocolFactory.createGenerator(SdkJsonProtocolFactory.java:54)
at com.amazonaws.protocol.json.SdkJsonProtocolFactory.createGenerator(SdkJsonProtocolFactory.java:74)
at com.amazonaws.protocol.json.SdkJsonProtocolFactory.createProtocolMarshaller(SdkJsonProtocolFactory.java:64)
at com.amazonaws.services.kinesis.model.transform.DescribeStreamRequestProtocolMarshaller.marshall(DescribeStreamRequestProtocolMarshaller.java:52)
at com.amazonaws.services.kinesis.AmazonKinesisClient.executeDescribeStream(AmazonKinesisClient.java:861)
at com.amazonaws.services.kinesis.AmazonKinesisClient.describeStream(AmazonKinesisClient.java:846)
at com.amazonaws.services.kinesis.AmazonKinesisClient.describeStream(AmazonKinesisClient.java:887)
at com.gartner.tn.datafeed.application.PositionStreamApplicationV4.getJavaDStream(PositionStreamApplicationV4.java:240)
I had the exact same issue and solved it by removing the dependency on CBOR from Kinesis. I am not sure whether that is an option for you, but it worked for me.
There are a few ways to do this. When running in local mode, I put the following code at the beginning of the main class of my Spark Streaming application:
System.setProperty(SDKGlobalConfiguration.AWS_CBOR_DISABLE_SYSTEM_PROPERTY, "true");
When running in cluster mode, start your spark-submit as follows:
spark-submit --deploy-mode cluster \
--conf spark.driver.extraJavaOptions='-Dcom.amazonaws.sdk.disableCbor=true' \
--conf spark.executor.extraJavaOptions='-Dcom.amazonaws.sdk.disableCbor=true'
When running in client mode on the cluster, start it like this:
spark-submit --deploy-mode client \
--driver-java-options '-Dcom.amazonaws.sdk.disableCbor=true' \
--conf spark.executor.extraJavaOptions='-Dcom.amazonaws.sdk.disableCbor=true'
This question led me to the answer: "Getting an AmazonKinesisException Status Code: 502 when using LocalStack from Java".
I have a Spring Batch job (launched via Control-M on a Windows server) that crashed because of:
2019-10-23 11:50:44,699 ERROR [main] o.s.b.c.l.s.CommandLineJobRunner [CommandLineJobRunner.java:368] Job Terminated in error: A job execution for this job is already running: JobInstance: id=10, version=0, Job=[stockProjectionJob]
I found and killed the Java process with the following command:
wmic process where name="javaw.exe" get commandline,creationdate,processid|find /C "batch"
But the batch still won't run (same error). What can I do?
Make sure all job executions tied to the job instance have a non-null end time (tables BATCH_JOB_EXECUTION and BATCH_JOB_INSTANCE).
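One way to apply that check is sketched below with plain JDBC. It assumes the default Spring Batch schema (table BATCH_JOB_EXECUTION with columns JOB_INSTANCE_ID, STATUS, EXIT_CODE and END_TIME) and that marking the orphaned executions as FAILED is acceptable in your environment. The class and method names are hypothetical, and related rows in BATCH_STEP_EXECUTION may need the same treatment.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class StuckExecutionCleanup {
    // Closes executions of one job instance that never recorded an end time.
    // Table and column names assume the default Spring Batch metadata schema.
    static final String CLOSE_STUCK_EXECUTIONS =
            "UPDATE BATCH_JOB_EXECUTION "
          + "SET STATUS = 'FAILED', EXIT_CODE = 'FAILED', END_TIME = CURRENT_TIMESTAMP "
          + "WHERE JOB_INSTANCE_ID = ? AND END_TIME IS NULL";

    // Returns the number of rows updated; the caller owns the connection and the transaction.
    static int closeStuckExecutions(Connection connection, long jobInstanceId) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(CLOSE_STUCK_EXECUTIONS)) {
            statement.setLong(1, jobInstanceId);
            return statement.executeUpdate();
        }
    }
}
```

For the error above, the job instance id would be 10. It is worth taking a backup of the metadata tables before updating them by hand.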
Hello, my Spark configuration in Java is:
ss=SparkSession.builder()
.config("spark.driver.host", "192.168.0.103")
.config("spark.driver.port", "4040")
.config("spark.dynamicAllocation.enabled", "false")
.config("spark.cores.max","1")
.config("spark.executor.memory","471859200")
.config("spark.executor.cores","1")
//.master("local[*]")
.master("spark://kousik-pc:7077")
.appName("abc")
.getOrCreate();
Now when I submit any job from inside the code (not by submitting a jar), I get the warning:
TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
The spark UI is
The worker shown in the screenshot was started with the command:
~/spark/sbin/start-slave.sh
All four jobs in the waiting state were submitted from Java code. I have tried every solution I could find. Any ideas, please?
As I understand it, you want to run a Spark job using only one executor core. You don't have to specify spark.executor.cores for that;
spark.cores.max should handle assigning only one core to each job, as its value is 1.
It's always good practice to provide configuration details such as master and executor memory/cores in the spark-submit command, like below:
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master spark://xxx.xxx.xxx.xxx:7077 \
--executor-memory 20G \
--total-executor-cores 100 \
/path/to/examples.jar \
1000
If you want to explicitly cap the total number of cores each job may use, pass --total-executor-cores in your spark-submit command.
Check the documentation here
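If the submission has to be started from Java rather than from a shell, the same flags can be assembled into an argument list. This is only a sketch: the class name, the method name and the sparkHome parameter are hypothetical, and it uses only the options that appear in the command above.

```java
import java.util.ArrayList;
import java.util.List;

class SparkSubmitCommand {
    // Assembles a spark-submit command line equivalent to the shell example,
    // with the main class, master URL, executor memory and total core cap passed as flags.
    static List<String> build(String sparkHome, String mainClass, String master,
                              String executorMemory, int totalExecutorCores,
                              String appJar, String... appArgs) {
        List<String> command = new ArrayList<>();
        command.add(sparkHome + "/bin/spark-submit");
        command.add("--class");
        command.add(mainClass);
        command.add("--master");
        command.add(master);
        command.add("--executor-memory");
        command.add(executorMemory);
        command.add("--total-executor-cores");
        command.add(Integer.toString(totalExecutorCores));
        command.add(appJar);
        for (String arg : appArgs) {
            command.add(arg); // application arguments follow the jar path
        }
        return command;
    }
}
```

The resulting list can be handed to new ProcessBuilder(command).inheritIO().start() to launch the job as a child process.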
I am running Spark 1.6.0 on a small computing cluster and wish to kill a driver program. I've submitted a custom implementation of the out-of-the-box Spark Pi calculation example with the following options:
spark-submit --class JavaSparkPi --master spark://clusterIP:portNum --deploy-mode cluster /path/to/jarfile/JavaSparkPi.jar 10
Note: 10 is a command line argument and is irrelevant for this question.
I've tried many methods of killing the driver program that was started on the cluster:
1. ./bin/spark-class org.apache.spark.deploy.Client kill
2. spark-submit --master spark://node-1:6066 --kill $driverid
3. Issue the kill command from the Spark administrative interface (web UI): http://my-cluster-url:8080
Number 2 yields a success JSON response:
{
"action" : "KillSubmissionResponse",
"message" : "Kill request for driver-xxxxxxxxxxxxxx-xxxx submitted",
"serverSparkVersion" : "1.6.0",
"submissionId" : "driver-xxxxxxxxxxxxxx-xxxx",
"success" : true
}
Where 'driver-xxxxxxxxxxxxxx-xxxx' is the actual driver id.
But the web UI http://my-cluster-url:8080/ still shows the driver program as running.
Is there anything else I can try?
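For anyone scripting this, the kill response shown above can be inspected from Java. The sketch below uses regular expressions rather than a JSON library to stay dependency-free, and the class and method names are hypothetical. Note that, going by the message text, "success" : true only reports that the kill request was submitted, not that the driver has actually stopped.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KillResponse {
    private static final Pattern SUCCESS =
            Pattern.compile("\"success\"\\s*:\\s*(true|false)");
    private static final Pattern SUBMISSION_ID =
            Pattern.compile("\"submissionId\"\\s*:\\s*\"([^\"]*)\"");

    // True only when the response contains "success" : true.
    static boolean isSuccess(String json) {
        Matcher matcher = SUCCESS.matcher(json);
        return matcher.find() && Boolean.parseBoolean(matcher.group(1));
    }

    // Extracts the submissionId field, if present.
    static Optional<String> submissionId(String json) {
        Matcher matcher = SUBMISSION_ID.matcher(json);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```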
I've been trying to make this work with no luck thus far. I launch a cluster with
./spark-ec2 -k keyname -i ~/.keys/key.pem --region=us-east-1 -s 5 launch "my test cluster"
Then I submit a job with
bin/spark-submit --verbose --class com.company.jobs.AggregateCostDataWorkflow --master spark://ec2-54-157-122-49.compute-1.amazonaws.com:7077 --deploy-mode cluster --conf spark.executor.memory=5g /Users/my.name/scala-proj/target/scala-2.10/scala-proj-0.1.0.jar --outputPath,s3n://my-bucket/my-name/ec2-spark-test/
Where outputPath is an argument to the main method. After a while and some status output, I see an exception that looks like:
15/06/05 16:09:33 INFO StandaloneRestClient: Submitting a request to launch an application in spark://ec2-74-141-162-19.compute-1.amazonaws.com:7077.
Exception in thread "main" java.net.ConnectException: Operation timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at [java socket stuff elided for brevity] org.apache.spark.deploy.rest.StandaloneRestClient.postJson(StandaloneRestClient.scala:150)
at org.apache.spark.deploy.rest.StandaloneRestClient.createSubmission(StandaloneRestClient.scala:70)
at org.apache.spark.deploy.rest.StandaloneRestClient$.run(StandaloneRestClient.scala:317)
at org.apache.spark.deploy.rest.StandaloneRestClient$.main(StandaloneRestClient.scala:329)
at org.apache.spark.deploy.rest.StandaloneRestClient.main(StandaloneRestClient.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:178)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
This is Spark 1.3.1 (on my local machine). I can access the UI on the master machine and verify that the Spark processes are in fact up. I can also ssh into the master.
Any tips?
If you want to access ports on your EC2 Spark cluster from outside, you need to open them by editing the security policies. spark_ec2.py doesn't open ports 7077 and 6066 on the master for access from outside the cluster.
I do it another way: connect to the master machine of your Spark cluster with the command
./spark_ec2.py -k keyname -i ~/.keys/key.pem login "my test cluster"
Upload your job file (with scp, using the same key) and submit the job from there. This ensures that your driver has access to the cluster master and slaves.
See the "Running Applications" section of the Running Spark on EC2 documentation.