ClassNotFoundException and NoClassDefFoundError in Flink app using Cassandra Driver - java

I am developing a Flink application that uses the Cassandra Driver to interact with a Cassandra DB. The driver is implemented as a singleton, and multiple Flink process functions interact with it to fetch data from Cassandra. I also attach a future callback to the ResultSetFuture returned by each Session.executeAsync call. The app runs on Kubernetes in Docker containers.
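Roughly, the pattern looks like this (a minimal sketch; the class layout, contact point, and callback bodies are placeholders, not the app's real code):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.MoreExecutors;

public final class CassandraDriver {
    private static volatile CassandraDriver instance;
    private final Session session;

    private CassandraDriver() {
        // contact point is a placeholder
        Cluster cluster = Cluster.builder().addContactPoint("cassandra-host").build();
        this.session = cluster.connect();
    }

    public static CassandraDriver getInstance() {
        if (instance == null) {
            synchronized (CassandraDriver.class) {
                if (instance == null) {
                    instance = new CassandraDriver();
                }
            }
        }
        return instance;
    }

    // called from multiple Flink process functions
    public void queryAsync(String cql) {
        ResultSetFuture future = session.executeAsync(cql);
        Futures.addCallback(future, new FutureCallback<ResultSet>() {
            @Override public void onSuccess(ResultSet rs) { /* consume rows */ }
            @Override public void onFailure(Throwable t) {
                // runs on a driver/netty thread, not on a Flink task thread
            }
        }, MoreExecutors.directExecutor());
    }
}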
The environment is:
Flink 1.10.0, using shaded netty, hadoop, guava and jackson.
cassandra-driver-mapping 3.9.0 and shaded cassandra-driver-core 3.9.0.
All dependencies are packaged in a single jar using Bazel. Before starting the Flink app, I check that all the required classes are in the jar and are correct and complete, and I use the shaded dependencies to avoid class loading conflicts in the JVM. But when I start and run the Flink app, I keep seeing the following ClassNotFoundException in the TaskManager logs:
java.lang.NoClassDefFoundError: com/datastax/driver/core/SessionManager$State
at com.datastax.driver.core.SessionManager.getState(SessionManager.java:211)
at io.uhana.cassandra.CassandraDriver.sessionNeedsReconnect(CassandraDriver.java:508)
at io.uhana.cassandra.CassandraDriver.access$000(CassandraDriver.java:61)
at io.uhana.cassandra.CassandraDriver$1.onFailure(CassandraDriver.java:518)
at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1387)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1015)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:868)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:713)
at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:230)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:235)
at com.datastax.driver.core.RequestHandler.access$2600(RequestHandler.java:61)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:1011)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:647)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1262)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1180)
at com.datastax.shaded.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at com.datastax.shaded.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at com.datastax.shaded.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at com.datastax.shaded.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
at com.datastax.shaded.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at com.datastax.shaded.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at com.datastax.driver.core.InboundTrafficMeter.channelRead(InboundTrafficMeter.java:38)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
at com.datastax.shaded.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1304)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
at com.datastax.shaded.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
at com.datastax.shaded.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:921)
at com.datastax.shaded.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:135)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:646)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:546)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:500)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:460)
at com.datastax.shaded.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at com.datastax.shaded.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.ClassNotFoundException: com.datastax.driver.core.SessionManager$State
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:471)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:588)
at org.apache.flink.util.ChildFirstClassLoader.loadClass(ChildFirstClassLoader.java:69)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 49 more
and
ConstantReconnectionPolicy$ConstantSchedule' [enable DEBUG level for full stacktrace] was thrown by a user handler's exceptionCaught() method while handling the following exception:
java.lang.NoClassDefFoundError: com/datastax/shaded/netty/handler/timeout/IdleState
at com.datastax.shaded.netty.handler.timeout.IdleStateHandler$ReaderIdleTimeoutTask.run(IdleStateHandler.java:493)
at com.datastax.shaded.netty.handler.timeout.IdleStateHandler$AbstractIdleTask.run(IdleStateHandler.java:466)
at com.datastax.shaded.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
at com.datastax.shaded.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:120)
at com.datastax.shaded.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
at com.datastax.shaded.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:464)
at com.datastax.shaded.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
at com.datastax.shaded.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.ClassNotFoundException: com.datastax.shaded.netty.handler.timeout.IdleState
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:471)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:588)
at org.apache.flink.util.ChildFirstClassLoader.loadClass(ChildFirstClassLoader.java:69)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 9 more
I also notice that these issues are easier to reproduce when giving more resources and parallelism to the Flink app and the process functions, and that they most often happen in the future callback.
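To narrow down where the lookup fails, one option is to log, from inside the callback, whether the classloader that loaded the driver classes can still resolve the missing class. A minimal diagnostic sketch (the class name comes from the trace above; the placement inside onFailure is an assumption):

// paste inside the future callback (e.g. onFailure); if getResource(...)
// returns null here, the classloader that loaded the driver can no longer
// resolve the class, even though it was present at startup
ClassLoader cl = CassandraDriver.class.getClassLoader();
String resource = "com/datastax/driver/core/SessionManager$State.class";
System.err.println("thread=" + Thread.currentThread().getName()
    + " loader=" + cl
    + " canResolve=" + (cl.getResource(resource) != null));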
Any help appreciated!

Related

DataProc - SparkException: Failed to register classes with Kryo - java.lang.ClassNotFoundException

I am receiving ClassNotFoundExceptions when trying to use the KryoSerializer for a dataframe on DataProc, for a class that is part of the main JAR sent to spark-submit. This only happens when using spark:spark.submit.deployMode "cluster" and a master setting of "yarn". It does not happen with a master setting like "local[*]", which leads me to believe that the executors' classpath is missing the JAR or something.
I've tried experimenting with the following settings to add the MAIN_JAR to the classpath, but I keep arriving at the same ClassNotFoundException. Everything I read suggests this shouldn't be necessary, however.
"spark:spark.executor.extraClassPath"
"spark:spark.driver.extraClassPath"
"spark:spark.driver.userClassPathFirst"
"spark:spark.executor.userClassPathFirst"
"spark:spark.yarn.dist.jars"
"spark:spark.yarn.jars"
"spark:spark.jars"
I'm using the latest 2.1 DataProc image. I'm getting this very frustrating ClassNotFoundException when trying to Kryo-serialize my dataframe. The thing is, the class exists in the main jar submitted to spark-submit.
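For reference, registration typically happens through SparkConf; a minimal sketch of that kind of setup (MyClass stands in for the redacted class, and the settings are assumptions, not the exact job config):

import org.apache.spark.SparkConf;
import org.apache.spark.sql.SparkSession;

public class KryoSetupSketch {
    // stand-in for the redacted class from the stack trace
    public static class MyClass implements java.io.Serializable {}

    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
            .setAppName("kryo-example")
            .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
            // fail fast on any class that was not registered
            .set("spark.kryo.registrationRequired", "true")
            .registerKryoClasses(new Class<?>[] { MyClass.class });

        SparkSession spark = SparkSession.builder().config(conf).getOrCreate();
        // the executors must also be able to load MyClass, which is exactly
        // what fails in cluster mode in the question above
        spark.stop();
    }
}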
Here is the Java code that submits the job with the main JAR. The JAR is on GCS, accessed via the GCS Cloud Storage Connector, but I know that part works, since this only happens in cluster mode:
SparkJob sparkJob = SparkJob.newBuilder().setMainJarFileUri(MAIN_JAR)
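For context, a hedged sketch of how that builder is typically completed and submitted with the google-cloud-dataproc client library (PROJECT_ID, REGION, CLUSTER_NAME and MAIN_JAR are placeholders, not values from the question):

import com.google.cloud.dataproc.v1.Job;
import com.google.cloud.dataproc.v1.JobControllerClient;
import com.google.cloud.dataproc.v1.JobControllerSettings;
import com.google.cloud.dataproc.v1.JobPlacement;
import com.google.cloud.dataproc.v1.SparkJob;

public class SubmitSketch {
    public static void main(String[] args) throws Exception {
        String PROJECT_ID = "my-project";            // placeholder
        String REGION = "us-central1";               // placeholder
        String CLUSTER_NAME = "my-cluster";          // placeholder
        String MAIN_JAR = "gs://my-bucket/app.jar";  // placeholder gs:// URI

        JobControllerSettings settings = JobControllerSettings.newBuilder()
            .setEndpoint(REGION + "-dataproc.googleapis.com:443")
            .build();
        try (JobControllerClient client = JobControllerClient.create(settings)) {
            SparkJob sparkJob = SparkJob.newBuilder()
                .setMainJarFileUri(MAIN_JAR)  // main class resolved from the jar manifest
                .build();
            Job job = Job.newBuilder()
                .setPlacement(JobPlacement.newBuilder().setClusterName(CLUSTER_NAME).build())
                .setSparkJob(sparkJob)
                .build();
            client.submitJob(PROJECT_ID, REGION, job);
        }
    }
}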
This is the specific image URI of the DataProc image I'm using. I've also tried some of the other existing 2.1 images as well, like the Debian ones:
https://www.googleapis.com/compute/v1/projects/cloud-dataproc/global/images/dataproc-2-1-ubu20-20221201-035100-rc01
Here's the gcloud command I'm using to list the stable 2.1 images and their URIs, for reference:
gcloud compute images list --uri --project cloud-dataproc --filter "labels.goog-dataproc-version ~ ^2.1.0" --sort-by=~creationTimestamp
Here's some of the stacktrace:
22/12/21 03:57:18 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) (<GCP project resource details redacted> executor 1): java.io.IOException: org.apache.spark.SparkException: Failed to register classes with Kryo
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1477)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:228)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:105)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:84)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.spark.SparkException: Failed to register classes with Kryo
at org.apache.spark.serializer.KryoSerializer.$anonfun$newKryo$5(KryoSerializer.scala:183)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:233)
at org.apache.spark.serializer.KryoSerializer.newKryo(KryoSerializer.scala:171)
at org.apache.spark.serializer.KryoSerializer$$anon$1.create(KryoSerializer.scala:102)
at com.esotericsoftware.kryo.pool.KryoPoolQueueImpl.borrow(KryoPoolQueueImpl.java:48)
at org.apache.spark.serializer.KryoSerializer$PoolWrapper.borrow(KryoSerializer.scala:109)
at org.apache.spark.serializer.KryoSerializerInstance.borrowKryo(KryoSerializer.scala:346)
at org.apache.spark.serializer.KryoDeserializationStream.<init>(KryoSerializer.scala:302)
at org.apache.spark.serializer.KryoSerializerInstance.deserializeStream(KryoSerializer.scala:436)
at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:336)
at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBroadcastBlock$4(TorrentBroadcast.scala:259)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBroadcastBlock$2(TorrentBroadcast.scala:233)
at org.apache.spark.util.KeyLock.withLock(KeyLock.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast.$anonfun$readBroadcastBlock$1(TorrentBroadcast.scala:228)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1470)
... 11 more
Caused by: java.lang.ClassNotFoundException: <<redacted>.MyClass In the Main JAR sent to spark-submit>
at java.base/java.lang.ClassLoader.findClass(ClassLoader.java:719)
at org.apache.spark.util.ParentClassLoader.findClass(ParentClassLoader.java:35)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
at org.apache.spark.util.ParentClassLoader.loadClass(ParentClassLoader.java:40)
at org.apache.spark.util.ChildFirstURLClassLoader.loadClass(ChildFirstURLClassLoader.java:48)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Class.java:398)
at org.apache.spark.util.Utils$.classForName(Utils.scala:220)
at org.apache.spark.serializer.KryoSerializer.$anonfun$newKryo$6(KryoSerializer.scala:174)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at org.apache.spark.serializer.KryoSerializer.$anonfun$newKryo$5(KryoSerializer.scala:173)
... 27 more
Also of note, this is my pom.xml. I use the provided scope for all the libraries already installed on the cluster:
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.12</artifactId>
  <version>3.3.0</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.12</artifactId>
  <version>3.3.0</version>
  <scope>provided</scope>
</dependency>
<dependency>
  <groupId>com.google.cloud.spark</groupId>
  <artifactId>spark-bigquery-with-dependencies_2.12</artifactId>
  <version>0.27.1</version>
  <scope>provided</scope>
</dependency>
I also tried experimenting with initialization actions to copy the JAR with gsutil to the class path of executors manually, but that seems unnecessary and yields the same result anyway.

ClassNotFoundException: org.apache.hadoop.hive.metastore.api.UnknownDBException

I have code (a Spark job) that runs on EMR 6.7.0, where it runs fine. Locally, the dependencies are a mess.
When trying to run locally, I'm getting:
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/UnknownDBException
...
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.metastore.api.UnknownDBException
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
So I figured I need to add the hive-metastore dependency, version 3.1.2, since this is the Hive version I see in the EMR 6.5.0 release notes.
Unfortunately, I don't see this class in hive-metastore 3.1.3.
Am I missing something? I'm sure there are problems where the local dependencies contradict some other dependencies, but overall, when adding the hive-metastore 3.1.3 jar, the required classes aren't there.
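When dependency trees are this tangled, it can help to ask the JVM directly which jar (if any) a class resolves from; a minimal diagnostic sketch, not specific to Hive:

public class WhichJar {
    public static void main(String[] args) throws Exception {
        String name = "org.apache.hadoop.hive.metastore.api.UnknownDBException";
        Class<?> c = Class.forName(name);
        // getCodeSource() can be null for classes from the bootstrap classloader
        System.out.println(name + " loaded from "
            + c.getProtectionDomain().getCodeSource().getLocation());
    }
}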

How to solve java.lang.ClassNotFoundException: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer

I am using TinkerPop + JanusGraph + Spark.
build.gradle
compile group: 'org.apache.tinkerpop', name: 'spark-gremlin', version: '3.1.0-incubating'
Below is some critical configuration that we have:
spark.serializer: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
In the logs, the corresponding entry shows that the jar containing the above class was added:
{"#timestamp":"2020-02-18T07:24:21.720+00:00","#version":1,"message":"Added JAR /opt/data/janusgraph/applib2/spark-gremlin-827a65ae26.jar at spark://gdp-identity-stage.target.com:38876/jars/spark-gremlin-827a65ae26.jar with timestamp 1582010661720","logger_name":"o.a.s.SparkContext","thread_name":"SparkGraphComputer-boss","level":"INFO","level_value":20000}
but the Spark job submitted by SparkGraphComputer fails; in the executor logs, we see:
Caused by: java.lang.ClassNotFoundException: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
Why is this exception thrown even though the corresponding jar is loaded?
Any suggestions appreciated.
As I mentioned, I see this exception in the Spark executor; when I opened one of the worker logs, the complete exception was:
Spark Executor Command: "/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.222.b10-0.el7_6.x86_64/bin/java" "-cp" "/opt/spark/spark-2.4.0/conf/:/opt/spark/spark-2.4.0/jars/*:/opt/hadoop/hadoop-3_1_1/etc/hadoop/" "-Xmx56320M" "-Dspark.driver.port=43137" "-XX:+UseG1GC" "-XX:+PrintGCDetails" "-XX:+PrintGCTimeStamps" "-Xloggc:/opt/spark/gc.log" "-Dtinkerpop.gremlin.io.kryoShimService=org.apache.tinkerpop.gremlin.hadoop.structure.io.HadoopPoolShimService" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "spark://CoarseGrainedScheduler@gdp-identity-stage.target.com:43137" "--executor-id" "43392" "--hostname" "192.168.192.10" "--cores" "6" "--app-id" "app-20200220094335-0001" "--worker-url" "spark://Worker@192.168.192.10:36845"
========================================
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1713)
at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:281)
at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:238)
at org.apache.spark.SparkEnv$.instantiateClass$1(SparkEnv.scala:259)
at org.apache.spark.SparkEnv$.instantiateClassFromConf$1(SparkEnv.scala:280)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:283)
at org.apache.spark.SparkEnv$.createExecutorEnv(SparkEnv.scala:200)
at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:221)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
... 4 more
When I set the spark.jars property on the graph, I pass this jar location as well.
The jar we created from the application is a fat jar, meaning it contains the actual code and all the required dependencies.
If you look at the logs, you see this:
java" "-cp" "/opt/spark/spark-2.4.0/conf/:/opt/spark/spark-2.4.0/jars/*:/opt/hadoop/hadoop-3_1_1/etc/hadoop/"
Unless you have the gremlin JARs in your /opt/spark/spark-2.4.0/jars/* folder on each Spark worker, the class you're using doesn't exist on the executors' classpath.
The recommended way to include it for your specific application would be the Gradle Shadow plugin rather than --packages or spark.jars.

NoClassDefFoundError in Hadoop

I am using Hadoop version 2.4.1. I am trying to run a MapReduce job which moves data from the local system to an HDFS cluster (the output directory). If I set the output directory to a local system path, the program runs fine. But when I set the output directory to a path in the HDFS cluster, I get the below error:
Exception in thread "main" java.lang.NoClassDefFoundError: com/google/protobuf/ServiceException
at org.apache.hadoop.ipc.ProtobufRpcEngine.<clinit>(ProtobufRpcEngine.java:69)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1834)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1799)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1893)
at org.apache.hadoop.ipc.RPC.getProtocolEngine(RPC.java:203)
at org.apache.hadoop.ipc.RPC.getProtocolProxy(RPC.java:537)
at org.apache.hadoop.hdfs.NameNodeProxies.createNNProxyWithClientProtocol(NameNodeProxies.java:328)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:235)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:139)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:510)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:453)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:136)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2397)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputPath(FileOutputFormat.java:160)
at s1.run(s1.java:66)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at s1.main(s1.java:75)
Caused by: java.lang.ClassNotFoundException: com.google.protobuf.ServiceException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 25 more
I saw some posts which stated the issue could be related to the protobuf dependency.
Hadoop 2.2.0 mapreduce job not running after upgrading from hadoop 1.0.4
I am using the hadoop-common jar 2.5.2, which has the protobuf dependency. Any help in solving this would be appreciated.
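For context, the shape of the driver described above, as a minimal sketch (paths, host, and class names are assumptions). Note from the trace that FileOutputFormat.setOutputPath resolves the output FileSystem immediately, so an HDFS output path pulls in the DFSClient and its protobuf-based RPC engine, while a local path never touches protobuf, which is why only the HDFS case fails:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class DriverSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020");  // assumed cluster address
        Job job = Job.getInstance(conf, "local-to-hdfs");
        job.setJarByClass(DriverSketch.class);
        FileInputFormat.addInputPath(job, new Path("file:///data/in"));  // assumed local input
        // resolves to HDFS via fs.defaultFS, triggering the DFSClient seen in the trace
        FileOutputFormat.setOutputPath(job, new Path("/data/out"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}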
Got it working! I found that there were some 2.2-version jars which were incompatible with the current version. When I updated those, the program worked fine.
If you compile the *.java files, using the default Java CLASSPATH is OK.
Edit hadoop-env.sh:
export HADOOP_CLASSPATH=${CLASSPATH}
and restart the Hadoop server.
NoClassDefFoundError is thrown by the JVM at runtime when a class is not present in the classpath.
Check your classpath.
Also check this answer; it could be useful if you've solved the NoClassDefFoundError: link
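To make the distinction concrete, a minimal sketch: Class.forName on a missing class throws the checked ClassNotFoundException, while NoClassDefFoundError is raised by the JVM when already-compiled code fails to resolve a class at runtime (usually with a ClassNotFoundException as its cause, as in the traces above).

public class ClasspathCheck {
    public static void main(String[] args) {
        try {
            // throws ClassNotFoundException if protobuf-java is absent
            Class.forName("com.google.protobuf.ServiceException");
            System.out.println("protobuf is on the classpath");
        } catch (ClassNotFoundException e) {
            System.out.println("protobuf is missing from the classpath");
        }
    }
}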

ERROR [main] master.HMasterCommandLine: Master exiting java.lang.RuntimeException: Failed construction of Master

I am getting the below error while trying to run HBase with Hadoop:
HBASE 0.98.x
HADOOP 2.4.0
ERROR [main] master.HMasterCommandLine: Master exiting
java.lang.RuntimeException: Failed construction of Master: class org.apache.had$
at org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMCl$
at org.apache.hadoop.hbase.LocalHBaseCluster.addMaster(LocalHBaseCluste$
at org.apache.hadoop.hbase.LocalHBaseCluster.<init>(LocalHBaseCluster.j$
at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMaste$
at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommand$
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandL$
at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2793)
Caused by: org.apache.hadoop.ipc.RemoteException: Server IPC version 9 cannot c$
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:245)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedF$
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:144$
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:124)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:247)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:895)
at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:458)
at org.apache.hadoop.hbase.master.HMasterCommandLine$LocalHMaster.<init$
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstruc$
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Delegating$
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hbase.util.JVMClusterUtil.createMasterThread(JVMCl$
... 7 more
Do I have to set a configuration property in hbase-site.xml? Thanks in advance.
Which version of HBase did you download? Be sure you get the build corresponding to the Hadoop version you have (version 2). In your case you should download
$ wget http://apache.rediris.es/hbase/stable/hbase-0.98.8-hadoop2-bin.tar.gz
instead of
$ wget http://apache.rediris.es/hbase/stable/hbase-0.98.8-hadoop1-bin.tar.gz
Make sure that you are editing the correct files under hbase/conf.
