I have installed Hadoop on Windows and the 4 daemons are running. When I run a demo:
yarn jar %HADOOP_PREFIX%\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.5.0.jar wordcount /myfile.txt /out
This is my environment: JDK 1.8, hadoop-2.7.7.
I get the following error:
D:\ProgramFiles\hadoop\hadoop-2.7.7\share\hadoop\mapreduce>yarn jar hadoop-mapreduce-examples-2.7.7.jar wordcount /myfile.txt /input
19/03/02 10:05:11 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:19000/input already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:87)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
D:\ProgramFiles\hadoop\hadoop-2.7.7\share\hadoop\mapreduce>yarn jar hadoop-mapreduce-examples-2.7.7.jar wordcount /myfile.txt /input1
19/03/02 10:06:25 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
19/03/02 10:06:26 INFO input.FileInputFormat: Total input paths to process : 1
19/03/02 10:06:27 INFO mapreduce.JobSubmitter: number of splits:1
19/03/02 10:06:27 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1551490342050_0002
19/03/02 10:06:27 INFO impl.YarnClientImpl: Submitted application application_1551490342050_0002
19/03/02 10:06:27 INFO mapreduce.Job: The url to track the job: http://SF80002935M01:8088/proxy/application_1551490342050_0002/
19/03/02 10:06:27 INFO mapreduce.Job: Running job: job_1551490342050_0002
19/03/02 10:07:02 INFO mapreduce.Job: Job job_1551490342050_0002 running in uber mode : false
19/03/02 10:07:02 INFO mapreduce.Job: map 0% reduce 0%
19/03/02 10:07:02 INFO mapreduce.Job: Job job_1551490342050_0002 failed with state FAILED due to: Application application_1551490342050_0002 failed 2 times due to AM Container for appattempt_1551490342050_0002_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://SF80002935M01:8088/cluster/app/application_1551490342050_0002Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1551490342050_0002_02_000001
Exit code: 1
Exception message: CreateSymbolicLink error (5): Access is denied.
Stack trace: ExitCodeException exitCode=1: CreateSymbolicLink error (5): Access is denied.
at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
at org.apache.hadoop.util.Shell.run(Shell.java:482)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:744)
Shell output: 1 file(s) moved.
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
19/03/02 10:07:02 INFO mapreduce.Job: Counters: 0
D:\ProgramFiles\hadoop\hadoop-2.7.7\share\hadoop\mapreduce>hdfs dfs -cat /input1
cat: `/input1': No such file or directory
D:\ProgramFiles\hadoop\hadoop-2.7.7\share\hadoop\mapreduce>hdfs dfs -cat /input1 -r -00000
cat: `/input1': No such file or directory
cat: `-r': No such file or directory
cat: `-00000': No such file or directory
hdfs://localhost:19000/input already exists
It seems you gave /input as the output directory for the wordcount of /myfile.txt.
MapReduce jobs will not overwrite the contents of an output directory that already exists, and therefore the job fails.
As for the errors at the bottom: /input1 would be a directory, and you cannot cat a directory; -r and -00000 are then each treated as separate, nonexistent paths rather than parts of a file name.
Instead, do hdfs dfs -ls /input1, then copy one of the file names to cat, as shown below.
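For example, assuming the job wrote its output to /input1 (part-r-00000 is the conventional name of the first reducer's output file, used here as an illustration):
hdfs dfs -ls /input1
hdfs dfs -cat /input1/part-r-00000
And to re-run a job whose output directory already exists, remove the directory first:
hdfs dfs -rm -r /input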
Related
I am trying to run a Spark job using the gcloud command below.
gcloud dataproc jobs submit spark \
--cluster=clusterName \
--class=clazzName \
--jars=gs://abc/def/ghi.jar \
--region=us-central1 \
--files=gs://abc/def/jkl.json \
--properties=spark.driver.extraJavaOptions="-Dconfig.file=application_dev.json",spark.executor.extraJavaOptions="-Dconfig.file=application_dev.json",spark.executor.memory=6G,spark.driver.memory=4G,spark.executor.cores=3,spark.executor.instances=4
I am getting the error below.
ERROR org.apache.spark.SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File not found: gs://temp-bucket/f80f8cb3-0358-445e-8ec2-819e4282bfe4/spark-job-history
Full stack trace:
Waiting for job output...
22/09/02 05:30:47 INFO com.polaris.ihub.commons.utils.keymaker.KeymakerApi: ====== Reading App Context ======
22/09/02 05:30:47 INFO com.polaris.ihub.commons.utils.keymaker.KeymakerApi: File to be read from -> gs://abc/def/app_context.txt
22/09/02 05:30:49 INFO org.apache.spark.SparkEnv: Registering MapOutputTracker
22/09/02 05:30:49 INFO org.apache.spark.SparkEnv: Registering BlockManagerMaster
22/09/02 05:30:49 INFO org.apache.spark.SparkEnv: Registering BlockManagerMasterHeartbeat
22/09/02 05:30:49 INFO org.apache.spark.SparkEnv: Registering OutputCommitCoordinator
22/09/02 05:30:49 INFO org.sparkproject.jetty.util.log: Logging initialized #8633ms to org.sparkproject.jetty.util.log.Slf4jLog
22/09/02 05:30:50 INFO org.sparkproject.jetty.server.Server: jetty-9.4.40.v20210413; built: 2021-04-13T20:42:42.668Z; git: someAlphaNumeric1; jvm 1.8.0_322-b06
22/09/02 05:30:50 INFO org.sparkproject.jetty.server.Server: Started #8803ms
22/09/02 05:30:50 INFO org.sparkproject.jetty.server.AbstractConnector: Started ServerConnector#2db33feb{HTTP/1.1, (http/1.1)}{0.0.0.0:38111}
22/09/02 05:30:50 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at clusterName-m/someIp:8032
22/09/02 05:30:51 INFO org.apache.hadoop.yarn.client.AHSProxy: Connecting to Application History server at clusterName-m/someIp:10200
22/09/02 05:30:51 INFO org.apache.hadoop.conf.Configuration: resource-types.xml not found
22/09/02 05:30:51 INFO org.apache.hadoop.yarn.util.resource.ResourceUtils: Unable to find 'resource-types.xml'.
22/09/02 05:30:55 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Submitted application application_appID
22/09/02 05:30:56 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at clusterName-m/someIp:8030
22/09/02 05:30:57 ERROR org.apache.spark.SparkContext: Error initializing SparkContext.
java.io.FileNotFoundException: File not found: gs://temp-bucket/f80f8cb3-0358-445e-8ec2-819e4282bfe4/spark-job-history
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.getFileStatus(GoogleHadoopFileSystemBase.java:958)
at org.apache.spark.deploy.history.EventLogFileWriter.requireLogBaseDirAsDirectory(EventLogFileWriters.scala:77)
at org.apache.spark.deploy.history.SingleEventLogFileWriter.start(EventLogFileWriters.scala:221)
at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:83)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:612)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2680)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:945)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:939)
at claszzName2.spark.adaptor.utils.SparkUtils.newSparkSession(SparkUtils.java:42)
at claszzName2.spark.adaptor.bqtopubsub.BqToPubSubAdaptor.main(BqToPubSubAdaptor.java:30)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
22/09/02 05:30:57 INFO org.sparkproject.jetty.server.AbstractConnector: Stopped Spark#2db33feb{HTTP/1.1, (http/1.1)}{0.0.0.0:0}
22/09/02 05:30:58 ERROR clazzName: Spark Batch Application failed : {}
java.io.FileNotFoundException: File not found: gs://temp-bucket/f80f8cb3-0358-445e-8ec2-819e4282bfe4/spark-job-history
at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.getFileStatus(GoogleHadoopFileSystemBase.java:958)
at org.apache.spark.deploy.history.EventLogFileWriter.requireLogBaseDirAsDirectory(EventLogFileWriters.scala:77)
at org.apache.spark.deploy.history.SingleEventLogFileWriter.start(EventLogFileWriters.scala:221)
at org.apache.spark.scheduler.EventLoggingListener.start(EventLoggingListener.scala:83)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:612)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2680)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:945)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:939)
at claszzName2.spark.adaptor.utils.SparkUtils.newSparkSession(SparkUtils.java:42)
at claszzName2.spark.adaptor.bqtopubsub.BqToPubSubAdaptor.main(BqToPubSubAdaptor.java:30)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Exception in thread "main" java.lang.RuntimeException
at clazzName.main(BqToPubSubAdaptor.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:951)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
ERROR: (gcloud.dataproc.jobs.submit.spark) Job [d8c3e7e5e8004e5bba72b921d454bfeb] failed with error:
Google Cloud Dataproc Agent reports job failure. If logs are available, they can be found at:
It seems GCP Dataproc/Spark is looking for the default history server event-log directory, which may not exist at that specific location.
You can override two properties to change the log location for the history server:
spark.eventLog.dir
spark.history.fs.logDirectory
You can either pass these properties when submitting the Spark job, as in the example below, or set these two properties in your spark-defaults.conf file.
For example:
gcloud dataproc jobs submit spark \
--cluster=clusterName \
--class=clazzName \
--jars=gs://abc/def/ghi.jar \
--region=us-central1 \
--files=gs://abc/def/jkl.json \
--properties=spark.driver.extraJavaOptions="-Dconfig.file=application_dev.json",spark.executor.extraJavaOptions="-Dconfig.file=application_dev.json",spark.executor.memory=6G,spark.driver.memory=4G,spark.executor.cores=3,spark.executor.instances=4,spark.history.fs.logDirectory=gs://<bucket-name>/<folder-name>,spark.eventLog.dir=gs://<bucket-name>/<folder-name>
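If you go the spark-defaults.conf route instead, the equivalent entries would look like this (the bucket and folder names below are placeholders):
spark.eventLog.dir gs://<bucket-name>/<folder-name>
spark.history.fs.logDirectory gs://<bucket-name>/<folder-name>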
I am using EKS for the deployment of my Spring Boot App.
At startup, it crashes a couple of times, gets restarted by EKS, and finally starts serving requests.
Here are the logs:
2022-04-20 13:08:51.836 INFO 1 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'org.springframework.retry.annotation.RetryConfiguration' of type [org.springframework.retry.annotation.RetryConfiguration$$EnhancerBySpringCGLIB$$e8ec2216] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2022-04-20 13:08:51.848 INFO 1 --- [ main] trationDelegate$BeanPostProcessorChecker : Bean 'org.springframework.cloud.autoconfigure.ConfigurationPropertiesRebinderAutoConfiguration' of type [org.springframework.cloud.autoconfigure.ConfigurationPropertiesRebinderAutoConfiguration$$EnhancerBySpringCGLIB$$f428cee] is not eligible for getting processed by all BeanPostProcessors (for example: not eligible for auto-proxying)
2022-04-20 13:09:42.455 ERROR 1 --- [ main] o.s.boot.SpringApplication : Application run failed
java.lang.IllegalStateException: Logback configuration error detected:
ERROR in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy#1248276879 - Unexpected exception while waiting for compression job to finish java.lang.InterruptedException
ERROR in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy#828088650 - Unexpected exception while waiting for compression job to finish java.lang.InterruptedException
ERROR in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy#1248276879 - Timeout while waiting for clean-up job to finish java.util.concurrent.TimeoutException
ERROR in c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy#828088650 - Timeout while waiting for clean-up job to finish java.util.concurrent.TimeoutException
at org.springframework.boot.logging.logback.LogbackLoggingSystem.loadConfiguration(LogbackLoggingSystem.java:169)
at org.springframework.boot.logging.AbstractLoggingSystem.initializeWithConventions(AbstractLoggingSystem.java:82)
at org.springframework.boot.logging.AbstractLoggingSystem.initialize(AbstractLoggingSystem.java:60)
at org.springframework.boot.logging.logback.LogbackLoggingSystem.initialize(LogbackLoggingSystem.java:117)
at org.springframework.boot.context.logging.LoggingApplicationListener.initializeSystem(LoggingApplicationListener.java:290)
at org.springframework.boot.context.logging.LoggingApplicationListener.initialize(LoggingApplicationListener.java:263)
at org.springframework.boot.context.logging.LoggingApplicationListener.onApplicationEnvironmentPreparedEvent(LoggingApplicationListener.java:226)
at org.springframework.boot.context.logging.LoggingApplicationListener.onApplicationEvent(LoggingApplicationListener.java:199)
at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:172)
at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:165)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:139)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:127)
at org.springframework.boot.context.event.EventPublishingRunListener.environmentPrepared(EventPublishingRunListener.java:75)
at org.springframework.boot.SpringApplicationRunListeners.environmentPrepared(SpringApplicationRunListeners.java:54)
at org.springframework.boot.SpringApplication.prepareEnvironment(SpringApplication.java:347)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:306)
at com.dt.Application.main(Application.java:20)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:87)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:50)
at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.boot.loader.MainMethodRunner.run(MainMethodRunner.java:48)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:87)
at org.springframework.boot.loader.Launcher.launch(Launcher.java:50)
at org.springframework.boot.loader.JarLauncher.main(JarLauncher.java:51)
The first two lines are warnings, as per my research so far; they shouldn't cause the app to fail (let me know if I am wrong).
The application run failed and four Logback errors were printed:
In Logback's TimeBasedRollingPolicy, there are start() and stop() methods.
When the stop() method is called, two async jobs are run: a compression job and a clean-up job. The compression job threw InterruptedException and the clean-up job threw TimeoutException.
I have a few questions:
I am not sure whether these Logback errors caused the app to crash or vice versa.
What invoked the stop() method of TimeBasedRollingPolicy?
What interrupted the compression job?
If you have any idea why this is happening, please let me know.
Thanks :)
It says "Logback configuration error detected". I would check your log configuration file to see if you have a typo in it.
I am getting the following error when trying to submit a Spark job using spark-submit. I am unable to understand what exactly the error is asking for.
It is failing at the line below.
spark.read.format("com.mongodb.spark.sql.DefaultSource")
  .option("spark.mongodb.input.partitioner", "MongoSinglePartitioner")
  .option("uri", connectionString)
  .load()
Here is my spark submit command:
spark-submit \
--class com.sparkTutorial.input \
--deploy-mode cluster \
--master "spark://master-node:7077" \
--packages com.microsoft.sqlserver:mssql-jdbc:7.4.1.jre11 \
--packages org.mongodb.spark:mongo-spark-connector_2.12:3.0.1 \
target/scala-2.12/sql-mongo-validation-assembly-0.1.jar
Logs on a worker node.
Spark Executor Command: "/usr/local/openjdk-11/bin/java" "-cp" "/opt/spark/conf/:/opt/spark/jars/*" "-Xmx2048M" "-Dspark.driver.port=36881" "-Dspark.rpc.askTimeout=10s" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler#localhost:36881" "--executor-id" "36" "--hostname" "172.22.0.5" "--cores" "2" "--app-id" "app-20210907192950-0004" "--worker-url" "spark://Worker#172.22.0.5:46711"
========================================
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/09/07 19:30:30 INFO CoarseGrainedExecutorBackend: Started daemon with process name: 22087#bbc2066271a2
21/09/07 19:30:30 INFO SignalUtils: Registering signal handler for TERM
21/09/07 19:30:30 INFO SignalUtils: Registering signal handler for HUP
21/09/07 19:30:30 INFO SignalUtils: Registering signal handler for INT
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/opt/spark/jars/spark-unsafe_2.12-3.1.2.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
21/09/07 19:30:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
21/09/07 19:30:31 INFO SecurityManager: Changing view acls to: root
21/09/07 19:30:31 INFO SecurityManager: Changing modify acls to: root
21/09/07 19:30:31 INFO SecurityManager: Changing view acls groups to:
21/09/07 19:30:31 INFO SecurityManager: Changing modify acls groups to:
21/09/07 19:30:31 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1748)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:61)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:393)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:382)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:301)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.$anonfun$run$9(CoarseGrainedExecutorBackend.scala:413)
at scala.runtime.java8.JFunction1$mcVI$sp.apply(JFunction1$mcVI$sp.java:23)
at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)
at scala.collection.immutable.Range.foreach(Range.scala:158)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.$anonfun$run$7(CoarseGrainedExecutorBackend.scala:411)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:62)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:61)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
... 4 more
Caused by: java.io.IOException: Failed to connect to localhost/127.0.0.1:36881
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:287)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:218)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:230)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:204)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:202)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:198)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: localhost/127.0.0.1:36881
Caused by: java.net.ConnectException: Connection refused
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Unknown Source)
I am following this tutorial to run the Spark-Pi application using a kubectl command: https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/docs/quick-start-guide.md#running-the-examples
When I submit
kubectl apply -f spark-pi.yaml
and check the logs using kubectl logs spark-pi-driver -f, I see this exception.
20/03/20 01:47:45 INFO SparkEnv: Registering OutputCommitCoordinator
20/03/20 01:47:46 INFO Utils: Successfully started service 'SparkUI' on port 4040.
20/03/20 01:47:46 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://spark-pi-1584668857472-driver-svc.default.svc:4040
20/03/20 01:47:46 INFO SparkContext: Added JAR file:///opt/spark/examples/jars/spark-examples_2.11-2.4.3.jar at spark://spark-pi-1584668857472-driver-svc.default.svc:7078/jars/spark-examples_2.11-2.4.3.jar with timestamp 1584668866199
Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/datatype/jsr310/JavaTimeModule
at io.fabric8.kubernetes.client.dsl.base.OperationSupport.<clinit>(OperationSupport.java:59)
at io.fabric8.kubernetes.client.DefaultKubernetesClient.pods(DefaultKubernetesClient.java:204)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:55)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator$$anonfun$1.apply(ExecutorPodsAllocator.scala:55)
at scala.Option.map(Option.scala:146)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.<init>(ExecutorPodsAllocator.scala:55)
at org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:89)
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2788)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:493)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2520)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:935)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:926)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:926)
at org.apache.spark.examples.SparkPi$.main(SparkPi.scala:31)
at org.apache.spark.examples.SparkPi.main(SparkPi.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:849)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:924)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:933)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.datatype.jsr310.JavaTimeModule
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 28 more
20/03/20 01:47:47 INFO DiskBlockManager: Shutdown hook called
20/03/20 01:47:47 INFO ShutdownHookManager: Shutdown hook called
20/03/20 01:47:47 INFO ShutdownHookManager: Deleting directory /var/data/spark-97c7e689-9506-42a1-b3b1-578270832f75/spark-e99b532d-3c81-4a76-a05c-a4a753627db2/userFiles-7d18e8e4-74d6-4dbc-b967-31cd2c6d96d3
20/03/20 01:47:47 INFO ShutdownHookManager: Deleting directory /var/data/spark-97c7e689-9506-42a1-b3b1-578270832f75/spark-e99b532d-3c81-4a76-a05c-a4a753627db2
20/03/20 01:47:47 INFO ShutdownHookManager: Deleting directory /tmp/spark-72a95874-127a-4f0e-b32a-22d1aec74e1c
Any help on how to resolve this?
The jackson-annotations jar comes within the jars folder in spark-2.4.3-bin-hadoop2.7. Not sure why it is not picked up on the classpath. Any help would be appreciated.
As pointed out by @Andreas, ${SPARK_HOME}/jars doesn't contain jackson-datatype-jsr310.
You can try to modify spark-docker/Dockerfile and see how it works:
...
ADD https://repo1.maven.org/maven2/com/fasterxml/jackson/datatype/jackson-datatype-jsr310/2.9.10/jackson-datatype-jsr310-2.9.10.jar $SPARK_HOME/jars
...
It seems like a bug though, and if it helps, please raise the issue in the repo.
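After that change you would rebuild and push the image, then point the image field in spark-pi.yaml at the new tag; roughly like this (the registry and tag below are placeholders):
docker build -t <registry>/spark:v2.4.3-jsr310 spark-docker/
docker push <registry>/spark:v2.4.3-jsr310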
If you are using Spring Boot, you should know that Jackson is included by default, so remove your explicit Jackson dependency and it should work.
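For illustration, a hypothetical minimal pom fragment: spring-boot-starter-web already pulls in jackson-databind and jackson-datatype-jsr310 transitively (via spring-boot-starter-json), so a separate explicit Jackson dependency can simply be deleted:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>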
I am trying to create a SparkContext object with the code below from my Windows 10 desktop:
import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;

public class MyTest {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("");
        conf.setMaster("local");
        SparkContext sc = new SparkContext(conf);
    }
}
Using this Maven dependency:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.0-cdh5.8.3</version>
</dependency>
I have also set HADOOP_HOME=C:\winutil, where the 64-bit C:\winutil\bin\winutils.exe is present.
While running the above code, I get the error message below:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/Users/farooque/.m2/repository/org/slf4j/slf4j-log4j12/1.7.5/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/Users/farooque/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.4.1/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/12/01 10:44:01 INFO spark.SparkContext: Running Spark version 1.6.0
17/12/01 10:46:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/12/01 10:46:28 INFO spark.SecurityManager: Changing view acls to: farooque
17/12/01 10:46:28 INFO spark.SecurityManager: Changing modify acls to: farooque
17/12/01 10:46:28 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(farooque); users with modify permissions: Set(farooque)
17/12/01 10:47:36 INFO util.Utils: Successfully started service 'sparkDriver' on port 52602.
17/12/01 10:48:46 INFO slf4j.Slf4jLogger: Slf4jLogger started
17/12/01 10:48:57 INFO Remoting: Starting remoting
17/12/01 10:49:06 ERROR Remoting: Remoting error: [Startup timed out] [
akka.remote.RemoteTransportException: Startup timed out
at akka.remote.Remoting.akka$remote$Remoting$$notifyError(Remoting.scala:129)
at akka.remote.Remoting.start(Remoting.scala:191)
at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:579)
at akka.actor.ActorSystemImpl._start(ActorSystem.scala:577)
at akka.actor.ActorSystemImpl.start(ActorSystem.scala:588)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:52)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1989)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1980)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:55)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:289)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
at com.xyz.module.submodule.incoming.test.validator.MyTest.main(MyTest.java:11)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at akka.remote.Remoting.start(Remoting.scala:173)
... 18 more
]
17/12/01 10:49:06 ERROR spark.SparkContext: Error initializing SparkContext.
java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at akka.remote.Remoting.start(Remoting.scala:173)
at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:579)
at akka.actor.ActorSystemImpl._start(ActorSystem.scala:577)
at akka.actor.ActorSystemImpl.start(ActorSystem.scala:588)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:52)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1989)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1980)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:55)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:289)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
at com.xyz.module.submodule.incoming.test.validator.MyTest.main(MyTest.java:11)
17/12/01 10:49:08 INFO spark.SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at akka.remote.Remoting.start(Remoting.scala:173)
at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:579)
at akka.actor.ActorSystemImpl._start(ActorSystem.scala:577)
at akka.actor.ActorSystemImpl.start(ActorSystem.scala:588)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:52)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1989)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1980)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:55)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:289)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:462)
at com.xyz.module.submodule.incoming.test.validator.MyTest.main(MyTest.java:11)
I am not able to create the SparkContext object. The error is thrown at SparkContext sc = new SparkContext(conf);. Looking for help!
This issue got resolved!
Let me share the real cause and how it got resolved.
Initially the code was on a network drive (say Z:\workspace\...) but it was being executed locally on the machine. Creating a SparkContext from code read off the network drive led to the [Startup timed out] error, presumably because of slow network speed.
Solution:
I moved the code to a local drive (say C:\Users\farooque\workspace\...) and ran it again; this time the SparkContext was created successfully. Hence the issue got resolved!