I can't write an ORC file with Spark - Java

I'm trying to write a DataFrame to an ORC file, but to no avail. I'm using Spark 1.6 with Java.
I'm running on my local machine; I tried adding some dependencies, but without success.
My POM is this:
<properties>
<spark.version>1.6.0</spark.version>
<scala.short.version>2.10</scala.short.version>
<slf4j.version>1.7.25</slf4j.version>
<maven.compiler.source>1.7</maven.compiler.source>
<maven.compiler.target>1.7</maven.compiler.target>
</properties>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.scalatest/scalatest_${scala.short.version} -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>${slf4j.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>1.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>2.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.10</artifactId>
<version>0.9.0.0</version>
</dependency>
<dependency>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
<version>1.1.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.10</artifactId>
<version>1.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.10</artifactId>
<version>2.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>1.6.0</version>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-avro_2.10</artifactId>
<version>3.2.0</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.11.8</version>
<!--<scope>provided</scope>-->
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>1.6.0</version>
</dependency>
<dependency>
<groupId>com.typesafe</groupId>
<artifactId>config</artifactId>
<version>RELEASE</version>
</dependency>
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.11</version>
<!--<scope>provided</scope>-->
</dependency>
<!-- https://mvnrepository.com/artifact/com.typesafe.play/play-json -->
<dependency>
<groupId>com.typesafe.play</groupId>
<artifactId>play-json_2.11</artifactId>
<version>2.7.0-M1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-aws</artifactId>
<version>2.7.3</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-xml</artifactId>
<version>2.11.0-M4</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-parser-combinators</artifactId>
<version>2.11.0-M4</version>
</dependency>
</dependencies>
I have a Spark job that I want to write to an ORC file, but it returns this error:
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: orc. Please find packages at http://spark-packages.org
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:219)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:148)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:139)
at Confiaveis.main(Confiaveis.java:96)
Caused by: java.lang.ClassNotFoundException: orc.DefaultSource
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
at scala.util.Try.orElse(Try.scala:84)
at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
... 4 more
I used this command to write:
df.write().mode("append").format("orc").save("path");
Does anyone know how I can solve this?
From what little I know of Spark, I understand that there is a library it can't find, but I can't find anything that clarifies which library that would be.

Try adding the spark-hive dependency; in Spark 1.x, the ORC data source is provided by the Hive module:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_*your_scala_version*</artifactId>
<version>*your_spark_version*</version>
<scope>provided</scope>
</dependency>
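For reference, a minimal sketch of the full write path in Spark 1.6 with Java, assuming spark-hive is on the classpath; the input and output paths are illustrative:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.hive.HiveContext;

public class OrcWriteExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("orc-write").setMaster("local[2]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // In Spark 1.x the ORC data source is registered by the Hive module,
        // so a HiveContext is needed rather than a plain SQLContext.
        HiveContext sqlContext = new HiveContext(sc.sc());
        DataFrame df = sqlContext.read().json("src/test/data/input.json"); // illustrative input
        df.write().mode("append").format("orc").save("path"); // the same call as in the question
        sc.stop();
    }
}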

Related

Running Spring application returns ClassNotFoundException: ApplicationStartup

I am trying to run a simple email-sending application with Spring and the Javax Mail API using Maven; however, on launch it throws an exception.
Exception in thread "main" java.lang.NoClassDefFoundError: org/springframework/core/metrics/ApplicationStartup
at org.springframework.boot.SpringApplication.<init>(SpringApplication.java:233)
at org.springframework.boot.SpringApplication.<init>(SpringApplication.java:246)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1306)
at org.springframework.boot.SpringApplication.run(SpringApplication.java:1295)
at Email.Email.run(Email.java:19)
at Menu.Menu.menu(Menu.java:28)
at org.example.Main.main(Main.java:25)
Caused by: java.lang.ClassNotFoundException: org.springframework.core.metrics.ApplicationStartup
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 7 more
My pom.xml dependencies are like this:
<dependencies>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-api</artifactId>
<version>2.19.0</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>2.19.0</version>
</dependency>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-test</artifactId>
<version>5.2.9.RELEASE</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.sun.mail</groupId>
<artifactId>javax.mail</artifactId>
<version>1.6.2</version>
</dependency>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-context-support</artifactId>
<version>5.3.23</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-mail</artifactId>
<version>2.7.5</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<version>2.7.5</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.springframework.boot/spring-boot -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot</artifactId>
<version>2.7.5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.springframework/spring-context -->
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-context</artifactId>
<version>5.3.23</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.springframework/spring-webmvc -->
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-webmvc</artifactId>
<version>5.3.23</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.springframework.boot/spring-boot-autoconfigure -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-autoconfigure</artifactId>
<version>2.7.5</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>RELEASE</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter</artifactId>
<version>RELEASE</version>
<scope>test</scope>
</dependency>
</dependencies>
I tried downgrading to 2.3.3 and got a different exception: ClassNotFoundException for NativeDetector instead of ApplicationStartup.
Running mvn clean install changed nothing, and deleting and re-downloading the contents of the .m2 folder didn't help either.
I would be grateful for any advice. Thank you.
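One way to narrow this down is to check at runtime where the missing class resolves from, if at all. A small diagnostic sketch, with the class name taken from the stack trace above; note that ApplicationStartup was only introduced in Spring Framework 5.3, so a pre-5.3 spring-core on the classpath will produce exactly this error:
public class ClassLocator {
    public static void main(String[] args) throws Exception {
        // Resolve the class the stack trace reports as missing and print the jar it came from.
        // A ClassNotFoundException here means the spring-core on the classpath predates 5.3.
        Class<?> cls = Class.forName("org.springframework.core.metrics.ApplicationStartup");
        System.out.println(cls.getProtectionDomain().getCodeSource().getLocation());
    }
}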

Can't write data into a table with Apache Iceberg

I'm trying to write simple data into a table with Apache Iceberg 0.9.1, but error messages appear. I want to CRUD data through Hadoop directly.
I create a Hadoop table and try to read from it. After that, I try to write data into the table.
I prepare a JSON file containing one line. My code reads the JSON object and arranges the order of the fields, but the final step of writing the data always fails. I've changed the versions of some dependency packages, but other error messages appear.
Is something wrong with the package versions?
Please help me.
This is my source code:
public class IcebergTest {
    public static void main(String[] args) {
        testWithoutCatalog();
        readDataWithouCatalog();
        writeDataWithoutCatalog();
    }

    public static void testWithoutCatalog() {
        Schema bookSchema = new Schema(optional(1, "title", Types.StringType.get()),
                optional(2, "price", Types.LongType.get()),
                optional(3, "author", Types.StringType.get()),
                optional(4, "genre", Types.StringType.get()));
        PartitionSpec bookspec = PartitionSpec.builderFor(bookSchema).identity("title").build();
        Configuration conf = new Configuration();
        String warehousePath = "hdfs://hadoop01:9000/warehouse_path/xgfying/books3";
        HadoopTables tables = new HadoopTables(conf);
        Table table = tables.create(bookSchema, bookspec, warehousePath);
    }

    public static void readDataWithouCatalog() {
        .......
    }

    public static void writeDataWithoutCatalog() {
        SparkSession spark = SparkSession.builder().master("local[2]").getOrCreate();
        Dataset<Row> df = spark.read().json("src/test/data/books3.json");
        System.out.println(" this is the writing data : " + df.select("title", "price", "author", "genre")
                .first().toString());
        df.select("title", "price", "author", "genre")
                .write().format("iceberg").mode("append")
                .save("hdfs://hadoop01:9000/warehouse_path/xgfying/books3");
        // System.out.println(df.write().format("iceberg").mode("append").toString());
    }
}
These are the error messages:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/11/18 15:51:36 INFO SparkContext: Running Spark version 2.4.5
.......
file:///C:/tmp/icebergtest1/src/test/data/books3.json, range: 0-75, partition values: [empty row]
20/11/18 15:51:52 ERROR Utils: Aborting task
java.lang.ExceptionInInitializerError
at org.apache.iceberg.parquet.Parquet$WriteBuilder.build(Parquet.java:232)
at org.apache.iceberg.spark.source.SparkAppenderFactory.newAppender(SparkAppenderFactory.java:61)
at org.apache.iceberg.spark.source.BaseWriter.openCurrent(BaseWriter.java:105)
at org.apache.iceberg.spark.source.PartitionedWriter.write(PartitionedWriter.java:63)
at org.apache.iceberg.spark.source.Writer$Partitioned24Writer.write(Writer.java:271)
at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:118)
at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146)
at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67)
at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:123)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Cannot find constructor for interface org.apache.parquet.column.page.PageWriteStore
Missing org.apache.parquet.hadoop.ColumnChunkPageWriteStore(org.apache.parquet.hadoop.CodecFactory$BytesCompressor,org.apache.parquet.schema.MessageType,org.apache.parquet.bytes.ByteBufferAllocator,int) [java.lang.NoSuchMethodException: org.apache.parquet.hadoop.ColumnChunkPageWriteStore.<init>(org.apache.parquet.hadoop.CodecFactory$BytesCompressor, org.apache.parquet.schema.MessageType, org.apache.parquet.bytes.ByteBufferAllocator, int)]
at org.apache.iceberg.common.DynConstructors$Builder.build(DynConstructors.java:235)
at org.apache.iceberg.parquet.ParquetWriter.<clinit>(ParquetWriter.java:55)
... 19 more
20/11/18 15:51:52 ERROR DataWritingSparkTask: Aborting commit for partition 0 (task 2, attempt 0, stage 2.0)
20/11/18 15:51:52 ERROR DataWritingSparkTask: Aborted commit for partition 0 (task 2, attempt 0, stage 2.0)
This is my pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>icebergtest</groupId>
<artifactId>icebergtest1</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>icebergtest1</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<iceberg.version>0.9.1</iceberg.version>
<hadoop.version>2.7.0</hadoop.version>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<!-- org.apache.hadoop BEGIN-->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
<!-- Exclude the netty package -->
<exclusions>
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- Resolves the io.netty.buffer.PooledByteBufAllocator.defaultNumHeapArena()I exception -->
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-all</artifactId>
<version>4.1.18.Final</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-auth</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-jobclient</artifactId>
<version>${hadoop.version}</version>
</dependency>
<!-- org.apache.hadoop END-->
<!-- org.apache.iceberg BEGIN-->
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-core</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-api</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-parquet</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-common</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-orc</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-data</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-hive</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-arrow</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-spark</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-bundled-guava</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-spark-runtime</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-spark2</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-flink</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-pig</artifactId>
<version>${iceberg.version}</version>
</dependency>
<dependency>
<groupId>org.apache.iceberg</groupId>
<artifactId>iceberg-mr</artifactId>
<version>${iceberg.version}</version>
</dependency>
<!-- org.apache.iceberg END-->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.4.5</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.4.5</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
<version>2.4.5</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>2.4.5</version>
<exclusions>
<exclusion>
<groupId>org.codehaus.janino</groupId>
<artifactId>commons-compiler</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.11.0</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-compiler</artifactId>
<version>2.11.0</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-reflect</artifactId>
<version>2.11.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<!--<version>2.7.9</version>-->
<version>2.6.6</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<!--<version>2.7.9.4</version>-->
<version>2.6.5</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<!--<version>2.7.9</version>-->
<version>2.6.5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.alibaba/fastjson -->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.56</version>
</dependency>
<dependency>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-avro</artifactId>
<version>1.11.1</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>1.10.0</version>
</dependency>
<dependency>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-column</artifactId>
<version>1.11.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-hive -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.11</artifactId>
<version>2.4.0</version>
<scope>provided</scope>
</dependency>
</dependencies>
</project>
Missing
org.apache.parquet.hadoop.ColumnChunkPageWriteStore(org.apache.parquet.hadoop.CodecFactory$BytesCompressor,org.apache.parquet.schema.MessageType,org.apache.parquet.bytes.ByteBufferAllocator,int)
[java.lang.NoSuchMethodException:
org.apache.parquet.hadoop.ColumnChunkPageWriteStore.<init>(org.apache.parquet.hadoop.CodecFactory$BytesCompressor,
org.apache.parquet.schema.MessageType,
org.apache.parquet.bytes.ByteBufferAllocator, int)]
This means the code is looking for the constructor of ColumnChunkPageWriteStore that takes four parameters, of types (org.apache.parquet.hadoop.CodecFactory$BytesCompressor, org.apache.parquet.schema.MessageType, org.apache.parquet.bytes.ByteBufferAllocator, int).
It can't find that constructor on the classpath; that's why you get the NoSuchMethodException.
According to https://jar-download.com/artifacts/org.apache.parquet/parquet-hadoop/1.8.1/source-code/org/apache/parquet/hadoop/ColumnChunkPageWriteStore.java, you need version 1.8.1 of parquet-hadoop.
Change your Maven import to an older version; I looked at the 1.8.1 source code, and it has the constructor you need.
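As a quick way to confirm which parquet-hadoop version actually wins on the classpath, here is a small diagnostic sketch; ColumnChunkPageWriteStore is package-private, which is why both Iceberg and this probe go through reflection:
import java.lang.reflect.Constructor;

public class ParquetProbe {
    public static void main(String[] args) throws Exception {
        // Load the class reflectively (it is package-private) and print the jar it
        // was loaded from plus its declared constructors, to see whether the
        // 4-argument constructor Iceberg 0.9.1 expects is actually there.
        Class<?> cls = Class.forName("org.apache.parquet.hadoop.ColumnChunkPageWriteStore");
        System.out.println(cls.getProtectionDomain().getCodeSource().getLocation());
        for (Constructor<?> c : cls.getDeclaredConstructors()) {
            System.out.println(c);
        }
    }
}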

Apache Beam Maven Dependency Error

I am attempting to use Apache Beam from Java as a data pipeline of sorts. I have written a simple class that sources from Google Pubsub and sinks to Google Bigquery, but I cannot get it to build for the life of me. I am using Maven to build and have added every Beam package I could find, but I still get "class file not found" errors.
Specifically:
[ERROR] /X:/Work/pipeline/backup-pipeline/src/main/java/PassthroughPipeline.java:[28,16] cannot access org.apache.beam.sdk.options.GcpOptions
class file for org.apache.beam.sdk.options.GcpOptions not found
[ERROR] /X:/Work/pipeline/backup-pipeline/src/main/java/PassthroughPipeline.java:[29,16] cannot access org.apache.beam.sdk.options.BigQueryOptions
class file for org.apache.beam.sdk.options.BigQueryOptions not found
[ERROR] /X:/Work/pipeline/backup-pipeline/src/main/java/PassthroughPipeline.java:[31,16] cannot access org.apache.beam.sdk.options.GcsOptions
class file for org.apache.beam.sdk.options.GcsOptions not found
Does anyone know what packages I need to add to resolve these? Google has unfortunately been no help.
The POM file I have is based on the example POM that Apache provides for WordCount, but with extra dependencies added. Below are the dependencies I put in it. I can provide the full file if needed, but it is quite monolithic.
<dependencies>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-apex</artifactId>
<version>${beam.version}</version>
<scope>runtime</scope>
</dependency>
<!--
Apex depends on httpclient version 4.3.5, project has a transitive dependency to httpclient 4.0.1 from
google-http-client. Apex dependency version being specified explicitly so that it gets picked up. This
can be removed when the project no longer has a dependency on a different httpclient version.
-->
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.3.5</version>
<scope>runtime</scope>
<exclusions>
<exclusion>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
</profile>
<profile>
<id>dataflow-runner</id>
<!-- Makes the DataflowRunner available when running a pipeline. -->
<dependencies>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
<version>${beam.version}</version>
<scope>runtime</scope>
</dependency>
</dependencies>
</profile>
<profile>
<id>flink-runner</id>
<!-- Makes the FlinkRunner available when running a pipeline. -->
<dependencies>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-flink_2.10</artifactId>
<version>${beam.version}</version>
<scope>runtime</scope>
</dependency>
</dependencies>
</profile>
<profile>
<id>spark-runner</id>
<!-- Makes the SparkRunner available when running a pipeline. Additionally,
overrides some Spark dependencies to Beam-compatible versions. -->
<dependencies>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-spark</artifactId>
<version>${beam.version}</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-io-hadoop-file-system</artifactId>
<version>${beam.version}</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>${spark.version}</version>
<scope>runtime</scope>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>jul-to-slf4j</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-scala_2.10</artifactId>
<version>${jackson.version}</version>
<scope>runtime</scope>
</dependency>
</dependencies>
</profile>
</profiles>
<dependencies>
<!-- Adds a dependency on the Beam SDK. -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-core</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-io-google-cloud-platform -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-common-fn-api -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-common-fn-api</artifactId>
<version>2.2.0</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-extensions-google-cloud-platform-core -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-extensions-google-cloud-platform-core</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-io-common -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-io-common</artifactId>
<version>2.2.0</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-runners-parent -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-parent</artifactId>
<version>2.2.0</version>
<type>pom</type>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-runners-gcp-parent -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-gcp-parent</artifactId>
<version>2.2.0</version>
<type>pom</type>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-extensions-parent -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-extensions-parent</artifactId>
<version>2.2.0</version>
<type>pom</type>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-parent -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-parent</artifactId>
<version>2.2.0</version>
<type>pom</type>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-common-parent -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-common-parent</artifactId>
<version>2.2.0</version>
<type>pom</type>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.cloud.dataflow/google-cloud-dataflow-java-sdk-parent -->
<dependency>
<groupId>com.google.cloud.dataflow</groupId>
<artifactId>google-cloud-dataflow-java-sdk-parent</artifactId>
<version>2.2.0</version>
<type>pom</type>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-runners-reference -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-reference</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-parent -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-parent</artifactId>
<version>2.2.0</version>
<type>pom</type>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-build-tools -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-java-build-tools</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-runners-direct-java -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-direct-java</artifactId>
<version>2.2.0</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-runners-core-construction-java -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-core-construction-java</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>com.google.cloud.dataflow</groupId>
<artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
<version>[2.1.0, 2.99)</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-common-runner-api -->
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-sdks-common-runner-api</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.beam</groupId>
<artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
<version>0.4.0</version>
</dependency>
<dependency>
<groupId>com.google.api-client</groupId>
<artifactId>google-api-client</artifactId>
<version>${google-clients.version}</version>
<exclusions>
<!-- Exclude an old version of guava that is being pulled
in by a transitive dependency of google-api-client -->
<exclusion>
<groupId>com.google.guava</groupId>
<artifactId>guava-jdk5</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-bigquery</artifactId>
<version>${bigquery.version}</version>
<exclusions>
<!-- Exclude an old version of guava that is being pulled
in by a transitive dependency of google-api-client -->
<exclusion>
<groupId>com.google.guava</groupId>
<artifactId>guava-jdk5</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.google.http-client</groupId>
<artifactId>google-http-client</artifactId>
<version>${google-clients.version}</version>
<exclusions>
<!-- Exclude an old version of guava that is being pulled
in by a transitive dependency of google-api-client -->
<exclusion>
<groupId>com.google.guava</groupId>
<artifactId>guava-jdk5</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-pubsub</artifactId>
<version>${pubsub.version}</version>
<exclusions>
<!-- Exclude an old version of guava that is being pulled
in by a transitive dependency of google-api-client -->
<exclusion>
<groupId>com.google.guava</groupId>
<artifactId>guava-jdk5</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
<version>${joda.version}</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>${guava.version}</version>
</dependency>
<!-- Add slf4j API frontend binding with JUL backend -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>${slf4j.version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-jdk14</artifactId>
<version>${slf4j.version}</version>
<!-- When loaded at runtime this will wire up slf4j to the JUL backend -->
<scope>runtime</scope>
</dependency>
<!-- Hamcrest and JUnit are required dependencies of PAssert,
which is used in the main code of DebuggingWordCount example. -->
<dependency>
<groupId>org.hamcrest</groupId>
<artifactId>hamcrest-all</artifactId>
<version>${hamcrest.version}</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>${junit.version}</version>
</dependency>
</dependencies>
These classes:
org.apache.beam.sdk.options.GcpOptions
org.apache.beam.sdk.options.GcsOptions
org.apache.beam.sdk.options.BigQueryOptions
... exist at those locations only in an earlier version of Apache Beam.
Given the dependencies in your pom.xml (specifically, the dependency on v2.2.0 of Apache Beam), the correct imports are:
org.apache.beam.sdk.extensions.gcp.options.GcpOptions
org.apache.beam.sdk.extensions.gcp.options.GcsOptions
org.apache.beam.sdk.io.gcp.bigquery.BigQueryOptions
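For context, a minimal sketch of how these options are typically wired up against Beam 2.2.0; the project ID is illustrative:
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class OptionsExample {
    public static void main(String[] args) {
        // GcpOptions now lives under sdk.extensions.gcp.options; it is provided by
        // beam-sdks-java-extensions-google-cloud-platform-core, already in the POM above.
        GcpOptions options = PipelineOptionsFactory.fromArgs(args).as(GcpOptions.class);
        options.setProject("my-project-id"); // illustrative
        Pipeline p = Pipeline.create(options);
        p.run().waitUntilFinish();
    }
}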

Maven compiler error: Duplicate name in Manifest: Depends-On

I am trying to convert a legacy monolith project (MAIN) into Maven; it has over 125 dependencies, added over years of development.
During install I am getting the error below, which fails the build.
How do I find out which jar among the 125 is causing the problem?
The build is being prepared with the Eclipse IDE on JDK 1.8.
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project MyBatch: Compilation failure
[ERROR] Oct 08, 2017 5:[32,-1] 2 PM java.util.jar.Attributes read
[ERROR] WARNING: Duplicate name in Manifest: Depends-On.
[ERROR] Ensure that the manifest does not have duplicate entries, and
Ironically, my build started failing yesterday morning; prior to that, it was working completely fine. I added no additional dependencies, and no configuration parameters changed. I have even tried reverting my pom.xml to the previous version (which was working fine), but that also didn't help.
Any directions, please?
POM.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.apps</groupId>
<artifactId>dev-main</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>dev-main</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.6</maven.compiler.source>
<maven.compiler.target>1.6</maven.compiler.target>
</properties>
<build>
<sourceDirectory>${project.basedir}/src/javaroot</sourceDirectory>
<resources>
<resource>
<directory>${project.basedir}/src/resources/</directory>
</resource>
</resources>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
<encoding>Cp1252</encoding>
<fork>true</fork>
<meminitial>512m</meminitial>
<maxmem>1048m</maxmem>
<excludes>
<exclude>com/myorg/oasys/services/notes/**</exclude>
<exclude>com/myorg/corp/gdcms/**</exclude>
<exclude>com/myorg/ejb/**</exclude>
</excludes>
</configuration>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>itext</groupId>
<artifactId>itext</artifactId>
<version>2.1.7</version>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.15</version>
</dependency>
<dependency>
<groupId>jctableK</groupId>
<artifactId>jctableK</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>com.ibm</groupId>
<artifactId>mq</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>com.ibm</groupId>
<artifactId>mqjms</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>jms</groupId>
<artifactId>jms</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>jconn3</groupId>
<artifactId>jconn3</artifactId>
<version>3.0</version>
</dependency>
<dependency>
<groupId>com.bea.weblogic</groupId>
<artifactId>weblogic</artifactId>
<version>9.2</version>
</dependency>
<dependency>
<groupId>wljmsclient</groupId>
<artifactId>wljmsclient</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>xerces</groupId>
<artifactId>xercesImpl</artifactId>
<version>2.11.0</version>
</dependency>
<dependency>
<groupId>xerces</groupId>
<artifactId>xerces</artifactId>
<version>2.0.2</version>
</dependency>
<dependency>
<groupId>xml-apis</groupId>
<artifactId>xml-apis</artifactId>
<version>1.3.04</version>
</dependency>
<dependency>
<groupId>serializer</groupId>
<artifactId>serializer</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>resolver</groupId>
<artifactId>resolver</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>fox</groupId>
<artifactId>fox-common</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>OpenFX-security-client</groupId>
<artifactId>OpenFX-security-client</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>fox</groupId>
<artifactId>fox-security-client</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.3</version>
</dependency>
<dependency>
<groupId>commons-fileupload</groupId>
<artifactId>commons-fileupload</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>tibjms</groupId>
<artifactId>tibjms</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>jdom</groupId>
<artifactId>jdom</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>prodapi</groupId>
<artifactId>prodapi</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>eaeUtil</groupId>
<artifactId>eaeUtil</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>quantum</groupId>
<artifactId>quantum</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>jxl</groupId>
<artifactId>jxl</artifactId>
<version>2.6.9</version>
</dependency>
<dependency>
<groupId>tibrvj</groupId>
<artifactId>tibrvj</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>rdsMarsClient</groupId>
<artifactId>rdsMarsClient</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>jaxb</groupId>
<artifactId>jaxb-api</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>jaxb</groupId>
<artifactId>jaxb-impl</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>jaxb</groupId>
<artifactId>jaxb-xjc</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>sjsxp</groupId>
<artifactId>sjsxp</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>rsgfientities</groupId>
<artifactId>rsgfientities</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
<version>1.0.4</version>
</dependency>
<dependency>
<groupId>jackson</groupId>
<artifactId>jackson-core-asl</artifactId>
<version>1.7.6</version>
</dependency>
<dependency>
<groupId>jackson</groupId>
<artifactId>jackson-jaxrs</artifactId>
<version>1.7.6</version>
</dependency>
<dependency>
<groupId>jackson</groupId>
<artifactId>jackson-mapper-asl</artifactId>
<version>1.7.6</version>
</dependency>
<dependency>
<groupId>jackson</groupId>
<artifactId>jackson-mrbean</artifactId>
<version>1.7.6</version>
</dependency>
<dependency>
<groupId>jackson</groupId>
<artifactId>jackson-smile</artifactId>
<version>1.7.6</version>
</dependency>
<dependency>
<groupId>jackson</groupId>
<artifactId>jackson-xc</artifactId>
<version>1.7.6</version>
</dependency>
<dependency>
<groupId>jaxb-commons-lang-plugin</groupId>
<artifactId>jaxb-commons-lang-plugin</artifactId>
<version>2.0.2</version>
</dependency>
<dependency>
<groupId>commons-lang</groupId>
<artifactId>commons-lang</artifactId>
<version>2.4</version>
</dependency>
<dependency>
<groupId>dsp</groupId>
<artifactId>dspModelClient</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>dsp</groupId>
<artifactId>dspModelBase</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>gfi_gemfire</groupId>
<artifactId>gfi_gemfire</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>gemfire</groupId>
<artifactId>gemfire</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>httpclient</groupId>
<artifactId>httpclient</artifactId>
<version>4.1.1</version>
</dependency>
<dependency>
<groupId>httpcore</groupId>
<artifactId>httpcore</artifactId>
<version>4.1</version>
</dependency>
<dependency>
<groupId>xalan-j</groupId>
<artifactId>xalan</artifactId>
<version>2.7.1</version>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>1.3.1</version>
</dependency>
<dependency>
<groupId>fop</groupId>
<artifactId>fop</artifactId>
<version>0.93</version>
</dependency>
<dependency>
<groupId>gfinet</groupId>
<artifactId>gfinet_config</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>gfinet</groupId>
<artifactId>gfinet_messaging</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>rdsclient</groupId>
<artifactId>rdsclient</artifactId>
<version>1.0</version>
</dependency>
<!-- <dependency>
<groupId>aopalliance</groupId>
<artifactId>aopalliance</artifactId>
<version>1.0</version>
</dependency> -->
<dependency>
<groupId>commons-httpclient</groupId>
<artifactId>commons-httpclient</artifactId>
<version>3.1</version>
</dependency>
<!-- <dependency>
<groupId>antlr</groupId>
<artifactId>antlr</artifactId>
<version>2.7.7</version>
</dependency> -->
<dependency>
<groupId>FpML-JAXB</groupId>
<artifactId>FpML-JAXB</artifactId>
<version>2.0</version>
</dependency>
<dependency>
<groupId>poi</groupId>
<artifactId>poi</artifactId>
<version>3.7</version>
</dependency>
<dependency>
<groupId>xmlbeans</groupId>
<artifactId>xmlbeans</artifactId>
<version>2.3.0</version>
</dependency>
<dependency>
<groupId>T-ZeroAPI</groupId>
<artifactId>T-ZeroAPI</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>bouncycastle</groupId>
<artifactId>bouncycastle-jce-jdk13-112cx</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>jnlp</groupId>
<artifactId>jnlp</artifactId>
<version>1.2</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-core</artifactId>
<version>3.0.3</version>
</dependency>
<dependency>
<groupId>com.oracle</groupId>
<artifactId>ojdbc6</artifactId>
<version>11.2.0.3</version>
</dependency>
<dependency>
<groupId>commons-collections</groupId>
<artifactId>commons-collections</artifactId>
<version>3.1</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-smile</artifactId>
<version>2.1.4</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.1.4</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.1.4</version>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
<version>1.1.1</version>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-core</artifactId>
<version>1.1.1</version>
</dependency>
<dependency>
<groupId>commons-pool</groupId>
<artifactId>commons-pool</artifactId>
<version>1.5.5</version>
</dependency>
<dependency>
<groupId>com.thoughtworks.xstream</groupId>
<artifactId>xstream</artifactId>
<version>1.2.2</version>
</dependency>
<dependency>
<groupId>antlr</groupId>
<artifactId>antlr-runtime</artifactId>
<version>3.1.3</version>
</dependency>
<dependency>
<groupId>gson</groupId>
<artifactId>gson</artifactId>
<version>1.6</version>
</dependency>
<dependency>
<groupId>ALWrapper</groupId>
<artifactId>ALWrapper</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>AnalyticsLibrary</groupId>
<artifactId>AnalyticsLibrary</artifactId>
<version>1.0</version>
</dependency>
<dependency>
<groupId>poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>3.7</version>
</dependency>
<dependency>
<groupId>resteasy</groupId>
<artifactId>resteasy-jaxb-provider</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>resteasy</groupId>
<artifactId>resteasy-jaxrs</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>resteasy</groupId>
<artifactId>resteasy-jettison-provider</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>resteasy</groupId>
<artifactId>resteasy-spring</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>resteasy</groupId>
<artifactId>resteasy-cache-core</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-dbcp</artifactId>
<version>1.4</version>
</dependency>
<dependency>
<groupId>jaxrs-api</groupId>
<artifactId>jaxrs-api</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
</dependency>
<dependency>
<groupId>springframework</groupId>
<artifactId>spring-core</artifactId>
<version>4.0.0</version>
</dependency>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-support</artifactId>
<version>2.0.6</version>
</dependency>
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-release-with-junit-easymock-dependencies</artifactId>
<version>1.6.2</version>
<type>pom</type>
</dependency>
<dependency>
<groupId>org.hibernate</groupId>
<artifactId>hibernate-entitymanager</artifactId>
<version>4.1.9.Final</version>
</dependency>
<!--
<dependency>
<groupId>com.myorg.161090.rio_umb</groupId>
<artifactId>rio_umb</artifactId>
<version>2.0_G3</version>
</dependency>
<dependency>
<groupId>com.myorg.161090.rio_umb</groupId>
<artifactId>rio_registryservice</artifactId>
<version>2.0_G3</version>
</dependency>
-->
<dependency>
<groupId>com.sun.xml</groupId>
<artifactId>xml</artifactId>
<version>1.0</version>
</dependency>
<!-- this is for resolving sonar Class not found -->
<dependency>
<groupId>javax.xml.rpc</groupId>
<artifactId>jaxrpc</artifactId>
<version>1.1</version>
</dependency>
<dependency>
<groupId>com.aspose.aspose-words</groupId>
<artifactId>aspose-words</artifactId>
<version>14.8.0</version>
</dependency>
<!-- <dependency>
<groupId>javax.activation</groupId>
<artifactId>activation</artifactId>
<version>1.1</version>
</dependency>
-->
<dependency>
<groupId>tangosol</groupId>
<artifactId>tangosol</artifactId>
<version>3.3.1</version>
</dependency>
</dependencies>
</project>
Update 1: I managed to extract the MANIFEST.MF from all dependent jars, but none of them contains "Depends-On".
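One way to find the offender is to let java.util.jar parse every jar's manifest in turn: the warning is printed by java.util.jar.Attributes while reading, so whichever jar is being checked when the warning appears is the culprit. A sketch, assuming a flat directory of jars (the path is illustrative):
import java.io.File;
import java.util.jar.JarFile;

public class ManifestScan {
    public static void main(String[] args) throws Exception {
        // Point this at a directory containing the dependency jars (illustrative path).
        File dir = new File(args.length > 0 ? args[0] : "lib");
        for (File f : dir.listFiles((d, name) -> name.endsWith(".jar"))) {
            System.out.println("Checking " + f);
            try (JarFile jar = new JarFile(f)) {
                // Parsing the manifest triggers the "Duplicate name in Manifest"
                // warning, so it appears right after the offending jar's name.
                jar.getManifest();
            }
        }
    }
}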

NoClassDefFoundError: org/apache/spark/sql/DataFrame in spark-cassandra-connector

I'm trying to upgrade spark-cassandra-connector from 1.4 to 1.5.
Everything seems fine, but when I run the test cases, the process gets stuck and logs an error message saying:
Exception in thread "dag-scheduler-event-loop"
java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
My POM file looks like this:
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.10 -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.5.0</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>16.0.1</version>
</dependency>
<!-- Scala Library -->
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.10.5</version>
</dependency>
<!--Spark Cassandra Connector-->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.5.0</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<version>1.5.0</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>3.0.2</version>
</dependency>
<!--Spark-->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.5.0</version>
<exclusions>
<exclusion>
<groupId>net.java.dev.jets3t</groupId>
<artifactId>jets3t</artifactId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
</project>
Thank you in advance!
Can anyone please help me with this? If you need more info, please let me know!
Try adding this dependency:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
Also make sure that your spark-cassandra-connector version is compatible with the version of Spark you're using. I had the same error message, even with all the proper dependencies, when trying to use an older spark-cassandra-connector with a newer Spark version. Refer to this table: https://github.com/datastax/spark-cassandra-connector#version-compatibility
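For context, org.apache.spark.sql.DataFrame lives in the spark-sql artifact, and the connector's API surfaces it, which is why the job fails without that dependency. A minimal sketch of a read that exercises the class (host, keyspace, and table names are illustrative):
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class CassandraReadExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("cassandra-read")
                .setMaster("local[2]")
                .set("spark.cassandra.connection.host", "127.0.0.1"); // illustrative
        JavaSparkContext sc = new JavaSparkContext(conf);
        SQLContext sqlContext = new SQLContext(sc.sc());
        // DataFrame comes from spark-sql, so that artifact must be on the classpath
        // at both compile time and runtime.
        DataFrame df = sqlContext.read()
                .format("org.apache.spark.sql.cassandra")
                .option("keyspace", "ks") // illustrative
                .option("table", "tbl")   // illustrative
                .load();
        df.show();
        sc.stop();
    }
}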
