I'm trying to get started with Storm. I set up a hosted cluster and followed all the steps listed here for getting started. Everything works until submitting:
storm jar target/storm-starter-*.jar org.apache.storm.starter.RollingTopWords production-topology
which fails with:
Running: java -client -Ddaemon.name= -Dstorm.options= -Dstorm.home=/usr/local/Cellar/storm/1.2.2/libexec -Dstorm.log.dir=/usr/local/Cellar/storm/1.2.2/libexec/logs -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /usr/local/Cellar/storm/1.2.2/libexec/*:/usr/local/Cellar/storm/1.2.2/libexec/lib/*:/usr/local/Cellar/storm/1.2.2/libexec/extlib/*:target/storm-starter-2.0.1-SNAPSHOT.jar:/usr/local/Cellar/storm/1.2.2/libexec/conf:/usr/local/Cellar/storm/1.2.2/libexec/bin -Dstorm.jar=target/storm-starter-2.0.1-SNAPSHOT.jar -Dstorm.dependency.jars= -Dstorm.dependency.artifacts={} org.apache.storm.starter.RollingTopWords production-topology
Error: Could not find or load main class org.apache.storm.starter.RollingTopWords
Caused by: java.lang.NoClassDefFoundError: org/apache/storm/topology/ConfigurableTopology
I'm not familiar with Java or Storm yet, and so far getting started has not gone smoothly.
ConfigurableTopology doesn't exist in Storm 1.2.2. Most likely you are trying to use a storm-starter jar built from Storm 2.x against a 1.2.2 cluster, and that will not work. Build storm-starter from the 1.x sources instead, and it should work.
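For reference, a minimal sketch of building a matching storm-starter from the 1.x line (the git branch name is an assumption about the apache/storm repository; downloading a 1.2.2 source release and building its examples/storm-starter module works just as well):

git clone https://github.com/apache/storm.git
cd storm
git checkout 1.x-branch
cd examples/storm-starter
mvn clean package -DskipTests=true
storm jar target/storm-starter-*.jar org.apache.storm.starter.RollingTopWords production-topology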
Related
My Spark Streaming job (Spark 1.6) stores data in a Redis cluster. It works fine when I run it locally, but when I deploy it on the cluster I get the stack trace below:
Caused by: java.lang.NoSuchMethodError: redis.clients.jedis.JedisPoolConfig.setFairness(Z)V
at com.xyz.utils.redis.RedisClient.<init>(RedisClient.java:87)
at com.xzy.redis.Cache.<init>(Cache.java:76)
at com.xyz.spark.broadcastvariables.RedisCacheWrapper$RedisCacheHolder.getDevelopmentInstance(RedisCacheWrapper.java:29)
at com.zyx.spark.broadcastvariables.RedisCacheWrapper$RedisCacheHolder.<clinit>(RedisCacheWrapper.java:18)
at com.zyx.spark.broadcastvariables.RedisCacheWrapper.getCache(RedisCacheWrapper.java:74)
at com.zyx.spark.sparkjob.ProcessArticles.lambda$null$0(ProcessArticles.java:380)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
I am using redis.clients:jedis 2.9.0, and that jar depends on org.apache.commons:commons-pool2 2.4.2, which does contain the setFairness() method.
I know this is a dependency conflict between my uber jar's dependencies and Spark's own, as I found that the Spark classpath ships a different version of commons-pool.
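(As a quick check, assuming the job is built with Maven, the dependency plugin can confirm which commons-pool2 version the build actually resolves:

mvn dependency:tree -Dincludes=org.apache.commons:commons-pool2

This prints every path through which commons-pool2 enters the build.)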
I've tried:
using the --jars option to add my commons-pool2 jar
setting spark.executor.extraClassPath=true
Both attempts failed.
Could anyone help?
Thanks
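One common mitigation worth trying here (a hedged sketch, not a confirmed fix): Spark's experimental userClassPathFirst settings make the driver and executors prefer classes from the application jar over Spark's own copies. The jar name below is hypothetical; the main class is taken from the stack trace above:

spark-submit \
  --class com.xyz.spark.sparkjob.ProcessArticles \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  my-uber-jar.jar

Relocating (shading) the commons-pool2 package inside the uber jar is the other standard approach. Note also that spark.executor.extraClassPath expects a classpath string, not a boolean, which may be why that attempt failed.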
I'm really new to Maven and Storm, so I'm trying to follow the instructions at https://github.com/apache/storm/tree/master/examples/storm-starter. My current path is /home/luc/theTest/storm/examples/storm-starter. Inside the target folder there is a storm-starter-2.0.0-SNAPSHOT.jar file. I'm getting stuck when running
storm jar target/storm-starter-*.jar org.apache.storm.starter.ExclamationTopology -local
I get this error
Running: /usr/lib/jvm/java-8-openjdk-amd64/bin/java -client -Ddaemon.name= -Dstorm.options= -Dstorm.home=/home/luc/stormTest/apache-storm-1.1.1 -Dstorm.log.dir=/home/luc/stormTest/apache-storm-1.1.1/logs -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /home/luc/stormTest/apache-storm-1.1.1/lib/servlet-api-2.5.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/slf4j-api-1.7.21.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/objenesis-2.1.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/kryo-3.0.3.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/log4j-core-2.8.2.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/log4j-over-slf4j-1.6.6.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/storm-core-1.1.1.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/log4j-slf4j-impl-2.8.2.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/minlog-1.3.0.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/log4j-api-2.8.2.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/clojure-1.7.0.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/ring-cors-0.1.5.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/asm-5.0.3.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/reflectasm-1.10.1.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/disruptor-3.3.2.jar:/home/luc/stormTest/apache-storm-1.1.1/lib/storm-rename-hack-1.1.1.jar:target/storm-starter-2.0.0-SNAPSHOT.jar:/home/luc/stormTest/apache-storm-1.1.1/conf:/home/luc/stormTest/apache-storm-1.1.1/bin -Dstorm.jar=target/storm-starter-2.0.0-SNAPSHOT.jar -Dstorm.dependency.jars= -Dstorm.dependency.artifacts={} org.apache.storm.starter.ExclamationTopology -local
Error: Could not find or load main class org.apache.storm.starter.ExclamationTopology
Am I doing something wrong? I'm also a bit confused about whether I have to run nimbus and the supervisor first. I tried with and without them and neither worked. I've been searching the web but nothing works, and I'm not sure what else to try.
This is usually caused by a version mismatch between the Storm distribution installed on the machine and the storm-starter jar. Try following these steps to get the example working.
download the latest release from http://storm.apache.org/downloads.html
in this example, we will use version 1.1.1
extract this to a folder; let's call it ${STORM_HOME}
cd into ${STORM_HOME}/examples/storm-starter
execute mvn package -DskipTests=true
this should build the storm-starter jar in the target folder:
${STORM_HOME}/examples/storm-starter/target/storm-starter-1.1.1.jar
run the example from ${STORM_HOME} directory:
./bin/storm jar examples/storm-starter/target/storm-starter-1.1.1.jar org.apache.storm.starter.ExclamationTopology
don't add the -local flag, since it seems ExclamationTopology is only deployed to a LocalCluster when no args are passed. You can check the source code here: ${STORM_HOME}/examples/storm-starter/src/jvm/org/apache/storm/starter/ExclamationTopology.java (a paraphrased sketch of the pattern follows below)
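A paraphrased sketch of that pattern (not the exact storm-starter source; the spout wiring uses storm-core's TestWordSpout just to make the example self-contained):

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.StormSubmitter;
import org.apache.storm.testing.TestWordSpout;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.utils.Utils;

public class ExampleTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("word", new TestWordSpout(), 1);
        Config conf = new Config();
        if (args != null && args.length > 0) {
            // an argument was passed: use it as the topology name and submit to the real cluster
            StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
            // no arguments: run inside an in-process LocalCluster instead
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("example", conf, builder.createTopology());
            Utils.sleep(10000); // let the topology run for ten seconds
            cluster.shutdown();
        }
    }
}

With -local passed as an argument, args.length > 0, so this pattern would try to submit a topology literally named "-local" to a remote cluster rather than run locally.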
I am working on real-time streaming data analysis using Apache Flink 1.1.3. My system consists of a Kafka cluster as the message queue and a Flink cluster that reads messages from the Kafka partitions and runs some analysis on them; finally, I want to dump the generated data into an Ignite cache. I am using the IgniteSink class to sink the data into the Ignite cache. The versions are as follows:
Flink 1.1.3,
Kafka 2.10,
Ignite 2.0.0
When I try to run the job on the Flink cluster, it gives me the following error:
Exception in thread "main" org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Job execution failed.
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:409)
at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:95)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:382)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:374)
at org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.executeRemotely(RemoteStreamEnvironment.java:209)
at org.apache.flink.streaming.api.environment.RemoteStreamEnvironment.execute(RemoteStreamEnvironment.java:173)
at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1429)
at flink_ignite_sink_remote.main(flink_ignite_sink_remote.java:77)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$8.apply$mcV$sp(JobManager.scala:822)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$8.apply(JobManager.scala:768)
at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$8.apply(JobManager.scala:768)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.ignite.sink.flink.IgniteSink$SinkContext$Holder
at org.apache.ignite.sink.flink.IgniteSink$SinkContext.getStreamer(IgniteSink.java:201)
at org.apache.ignite.sink.flink.IgniteSink$SinkContext.access$100(IgniteSink.java:175)
at org.apache.ignite.sink.flink.IgniteSink.invoke(IgniteSink.java:165)
at org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:39)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:373)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:358)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:346)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:329)
at org.apache.flink.streaming.api.operators.TimestampedCollector.collect(TimestampedCollector.java:51)
at flink_ignite_sink_remote$Splitter.flatMap(flink_ignite_sink_remote.java:177)
at flink_ignite_sink_remote$Splitter.flatMap(flink_ignite_sink_remote.java:1)
at org.apache.flink.streaming.api.operators.StreamFlatMap.processElement(StreamFlatMap.java:48)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:373)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:358)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:346)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:329)
at org.apache.flink.streaming.api.operators.StreamSource$NonTimestampContext.collect(StreamSource.java:161)
at org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.emitRecord(AbstractFetcher.java:225)
at org.apache.flink.streaming.connectors.kafka.internal.Kafka09Fetcher.run(Kafka09Fetcher.java:253)
at java.lang.Thread.run(Thread.java:745)
I have included all of the Ignite-Flink libraries in my project, yet I still get the java.lang.NoClassDefFoundError.
I suspect you are using a plain jar instead of an uber/fat jar.
If you are using Maven, try the maven-shade-plugin; for sbt, use sbt-assembly. You can also create your project as described in the quickstart guide.
This is also discussed on Apache Ignite user forum: http://apache-ignite-users.70518.x6.nabble.com/Flink-Streamer-td10650.html
When you work with Flink jobs that you deploy to a Flink cluster, you have two options:
Generate a jar file with all your dependencies bundled inside and deploy that jar.
Add all your dependencies to the classpath of your Flink server, so it can find those dependencies outside of the jar file.
It looks like you are not generating a jar file with all the dependencies inside it, and those dependencies aren't on the classpath of the Flink server either.
Try running the following mvn command to generate your jars:
mvn clean package -Pbuild-jar
It may generate multiple jars; pick the fat jar (the bigger one).
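Once the build finishes, a quick sanity check (the jar name here is a placeholder; substitute whatever the build produced) is to confirm the Ignite sink classes actually ended up inside the jar being submitted:

jar tf target/your-flink-job.jar | grep IgniteSink

If nothing is listed, the jar being submitted is not the fat one.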
I am setting up an Apache Storm system but am having problems getting the program to run consistently. I have set up Storm on three servers, but it only works consistently on one. I think the issue lies somewhere in the path of the command.
I have been using storm-starter to set up the program and have tested it locally with RollingTopWords. When I run the command $ storm jar storm-starter-*.jar storm.starter.RollingTopWords the computer stalls for a second, then I get the following error:
Could not find or load main class storm.starter.RollingTopWords
The jar is stored in the directory /apache/storm/examples/storm-starter/target. Let me know if there is any other information I can provide that would help; I'm feeling a little desperate at this point.
The following is the entire output for the program that doesn't work.
Running: /usr/lib/jvm/java-1.7.0-openjdk-amd64/bin/java -client -Dstorm.options= -Dstorm.home=/home/scix3/apache/storm -Dstorm.log.dir=/home/scix3/apache/storm/logs -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= -cp /home/scix3/apache/storm/lib/kryo-2.21.jar:/home/scix3/apache/storm/lib/core.incubator-0.1.0.jar:/home/scix3/apache/storm/lib/commons-fileupload-1.2.1.jar:/home/scix3/apache/storm/lib/ring-servlet-0.3.11.jar:/home/scix3/apache/storm/lib/clj-stacktrace-0.2.2.jar:/home/scix3/apache/storm/lib/jline-2.11.jar:/home/scix3/apache/storm/lib/servlet-api-2.5.jar:/home/scix3/apache/storm/lib/disruptor-2.10.1.jar:/home/scix3/apache/storm/lib/log4j-over-slf4j-1.6.6.jar:/home/scix3/apache/storm/lib/clojure-1.5.1.jar:/home/scix3/apache/storm/lib/commons-exec-1.1.jar:/home/scix3/apache/storm/lib/logback-core-1.0.13.jar:/home/scix3/apache/storm/lib/jetty-util-6.1.26.jar:/home/scix3/apache/storm/lib/slf4j-api-1.7.5.jar:/home/scix3/apache/storm/lib/carbonite-1.4.0.jar:/home/scix3/apache/storm/lib/compojure-1.1.3.jar:/home/scix3/apache/storm/lib/minlog-1.2.jar:/home/scix3/apache/storm/lib/commons-lang-2.5.jar:/home/scix3/apache/storm/lib/tools.macro-0.1.0.jar:/home/scix3/apache/storm/lib/reflectasm-1.07-shaded.jar:/home/scix3/apache/storm/lib/tools.cli-0.2.4.jar:/home/scix3/apache/storm/lib/math.numeric-tower-0.0.1.jar:/home/scix3/apache/storm/lib/logback-classic-1.0.13.jar:/home/scix3/apache/storm/lib/tools.logging-0.2.3.jar:/home/scix3/apache/storm/lib/asm-4.0.jar:/home/scix3/apache/storm/lib/jetty-6.1.26.jar:/home/scix3/apache/storm/lib/snakeyaml-1.11.jar:/home/scix3/apache/storm/lib/hiccup-0.3.6.jar:/home/scix3/apache/storm/lib/clj-time-0.4.1.jar:/home/scix3/apache/storm/lib/jgrapht-core-0.9.0.jar:/home/scix3/apache/storm/lib/clout-1.0.1.jar:/home/scix3/apache/storm/lib/chill-java-0.3.5.jar:/home/scix3/apache/storm/lib/commons-io-2.4.jar:/home/scix3/apache/storm/lib/joda-time-2.0.jar:/home/scix3/apache/storm/lib/storm-core-0.9.4.jar:/home/scix3/apache/storm/lib/objenesis-1.2.jar:/home/scix3/apache/storm/lib/commons-logging-1.1.3.jar:/home/scix3/apache/storm/lib/ring-core-1.1.5.jar:/home/scix3/apache/storm/lib/ring-jetty-adapter-0.3.11.jar:/home/scix3/apache/storm/lib/commons-codec-1.6.jar:/home/scix3/apache/storm/lib/json-simple-1.1.jar:/home/scix3/apache/storm/lib/ring-devel-0.3.11.jar:storm-starter-.jar:/home/scix3/apache/storm/conf:/home/scix3/apache/storm/bin -Dstorm.jar=storm-starter-.jar storm.starter.RollingTopWords
Error: Could not find or load main class storm.starter.RollingTopWords
The main issue for the error Could not find or load main class storm.starter.RollingTopWords could be:
Check the launch configuration while building the jar.
Be careful while building the jar: it asks you to choose a destination folder and a launch configuration (the launch configuration should belong to the same project).
You might have missed the main class in your project.
Before using StormSubmitter on a remote cluster, check whether it works properly on a LocalCluster.
To check if the problem is with storm unable to find the jar, you can try issuing
storm jar /fullpath/my-storm-jar.jar Classname
A few other things you can make sure of:
The jar is compiled properly and actually contains the RollingTopWords class (see the check below).
storm.yaml points to the correct nimbus (this seems less probable, as the connection is being made and there is an attempt to load the topology).
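For the first check, a hedged example (substitute the actual jar file name) that lists the jar's contents and greps for the class:

jar tf /apache/storm/examples/storm-starter/target/storm-starter-0.9.4.jar | grep RollingTopWords

For this 0.9.4 setup the listing should contain storm/starter/RollingTopWords.class; if it doesn't, the jar was built without the topology classes.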
SOLVED (the solution is in the comments)
I'm using Hadoop 2.2.0 (in pseudo-distributed mode) on Ubuntu 13.10 with Eclipse Kepler v4.3 to develop my Hadoop program and a Dynamic Web Project (without Maven).
My Hadoop jar project, called "WorkTest.jar", works correctly when I run the job from the command line with "hadoop jar WorkTest.jar", and I can see the job's progress in the terminal.
Hadoop project contains four elements:
DriverJob.java (class that configures and starts the job)
Mapper.java
Combiner.java
Reducer.java
Now I have written a new Dynamic Web Project with a ServletTest.java into which I copied the DriverJob class code; the other classes (Mapper.java, Combiner.java, Reducer.java) are placed in the same package as the servlet (the main package). The WebContent/lib folder contains all the necessary Hadoop jar dependencies.
I have successfully deployed my application on a WildFly 8 server with Eclipse, but when I try to run the MapReduce job (the job configuration runs successfully, and I managed to delete and write a folder on HDFS), it keeps failing with the following exception, visible in the Hadoop job log file:
FATAL [IPC Server handler 5 on 46834] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1396015900746_0023_m_000002_0 - exited : java.lang.RuntimeException: java.lang.ClassNotFoundException: Class Mapper not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1720)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:721)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.ClassNotFoundException: Class Mapper not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1718)
... 8 more
and from the WildFly log file:
WARN [org.apache.hadoop.mapreduce.JobSubmitter] Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
WARN [org.apache.hadoop.mapreduce.JobSubmitter] No job jar file set. User classes may not be found. See Job or Job#setJar(String).
But the WEB-INF/classes/ deploy folder on WildFly does contain Mapper.class, Combiner.class, and Reducer.class.
I also tried putting the class code of Mapper, Combiner, and Reducer inside the servlet, but it fails with the same error...
What am I doing wrong?
I believe you need to have your .class files in an archive (jar) that can be distributed to the nodes in the cluster.
WARN [org.apache.hadoop.mapreduce.JobSubmitter] No job jar file set. User classes may not be found. See Job or Job#setJar(String).
This error is the key. Generally you would use job.setJarByClass(DriverJob.class) to tell the MapReduce client which jar file has the Mapper/Reducer classes. You don't have a jar, so that mechanism for distributing the proper classes falls apart (a sketch of both options follows below).
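A minimal sketch of the two options in driver code (class and path names follow the question, but the exact job setup here is an assumption, not the asker's code):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class DriverJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "WorkTest");
        // Option 1: works when this class itself was loaded from a jar
        job.setJarByClass(DriverJob.class);
        // Option 2: when running unpackaged (e.g. from WEB-INF/classes in a servlet),
        // point Hadoop at a prebuilt jar containing Mapper/Combiner/Reducer
        // (hypothetical path):
        // job.setJar("/path/to/WorkTest.jar");
        // ... job.setMapperClass(...), setCombinerClass(...), setReducerClass(...) ...
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

From a servlet, Option 2 is usually the workable one: build WorkTest.jar as part of the deployment and reference it from the servlet code.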