scala version mismatch in spark 2.1.0 - java

When I'm using spark 1.6.1, everything is alright. When I switch to Spark 2.1.0, I come across the problem below:
Task 33 in stage3.0 failed 4times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 33 in stage 3.0 failed 4 times, most recent failure: Lost taks 33.3 in stage 3.0 (TID 310, 192.168.1.5, executor 3): java.io.invalidclassexception scala.tuple2; local class incompatible; local class incompatible: stream classdesc serialVersionUID = -4864544146559264103, local class serialVersionUID = 3356420310891166197
I know -4864544146559264103 is correspond to scala 2.10, while 3356420310891166197 is correspond to scala 2.11. Although I changed my configuration to
EDIT: the entire pom file shows below.
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>test.spark</groupId>
<artifactId>spark</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
<name>spark</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<build>
<plugins>
<plugin>
<!-- solve the problem of : java.lang.ClassNotFoundException: kafka.producer.ProducerConfig -->
<artifactId>maven-assembly-plugin</artifactId>
<version>2.4</version>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.6.1</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-8_2.11</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>1.1.2</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>1.1.2</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-common</artifactId>
<version>1.1.2</version>
</dependency>
<dependency>
<groupId>io.fastjson</groupId>
<artifactId>boon</artifactId>
<version>0.33</version>
</dependency>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.7</version>
</dependency>
<dependency>
<groupId>com.googlecode.json-simple</groupId>
<artifactId>json-simple</artifactId>
<version>1.1.1</version>
</dependency>
</dependencies>
</project>
the problem is still exists. How to fix this problem? Any detail needed will be added. Thanks for any help!

Finally, I fixed this problem. It's my fault, the pom file is alright, and the project works well.
The problem is resulted from one detail, the code reads scala.Tuple2 object from HDFS, which is not mentioned in my question (I'm sorry to say this). The objects in HDFS are generated with scala 2.10 by another project, so the problem occurs.
Anyway, thanks for your help.

Related

AWS Java Lambda dependency JAR size management with Maven

I have a Lambda function that takes around ~10 seconds to start up using Java-11. I did some googling and came across a number of posts that suggest that lowering the JAR size of the package may help with quicker start times (less redundant libraries loaded etc...).
I also read in some posts that using below may help and tried to add
<scope>provided</scope>
in the AWS related dependencies thinking that well...AWS Lambda would have AWS specific libraries present? Turns out that is not the case! adding scope provided does not work when trying to execute the function.
My current pom.xml is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>REMOVED</groupId>
<artifactId>REMOVED</artifactId>
<version>REMOVED</version>
<properties>
<maven.compiler.source>11</maven.compiler.source>
<maven.compiler.target>11</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>30.1-jre</version>
</dependency>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>1.18.20</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-api</artifactId>
<version>5.7.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-engine</artifactId>
<version>5.7.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.json</groupId>
<artifactId>json</artifactId>
<version>20220320</version>
</dependency>
<dependency>
<groupId>software.amazon.kinesis</groupId>
<artifactId>amazon-kinesis-client</artifactId>
<version>2.4.1</version>
</dependency><dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-kinesis</artifactId>
<version>1.12.228</version>
</dependency>
<!-- <dependency><groupId>software.amazon.awssdk</groupId><artifactId>firehose</artifactId><version>2.17.198</version></dependency> -->
<!-- https://mvnrepository.com/artifact/software.amazon.awssdk/kinesis -->
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>kinesis</artifactId>
<version>2.17.201</version>
</dependency>
<!-- https://mvnrepository.com/artifact/software.amazon.awssdk/secretsmanager -->
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>secretsmanager</artifactId>
<version>2.17.204</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.22.0</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-failsafe-plugin</artifactId>
<version>2.22.0</version>
</plugin>
<plugin>
<artifactId>maven-clean-plugin</artifactId>
<version>3.1.0</version>
</plugin>
<!-- default lifecycle, jar packaging: see https://maven.apache.org/ref/current/maven-core/default-bindings.html#Plugin_bindings_for_jar_packaging -->
<plugin>
<artifactId>maven-resources-plugin</artifactId>
<version>3.0.2</version>
</plugin>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.8.0</version>
</plugin>
<plugin>
<artifactId>maven-jar-plugin</artifactId>
<version>3.0.2</version>
</plugin>
<plugin>
<artifactId>maven-install-plugin</artifactId>
<version>2.5.2</version>
</plugin>
<plugin>
<artifactId>maven-deploy-plugin</artifactId>
<version>2.8.2</version>
</plugin>
<!-- site lifecycle, see https://maven.apache.org/ref/current/maven-core/lifecycles.html#site_Lifecycle -->
<plugin>
<artifactId>maven-site-plugin</artifactId>
<version>3.7.1</version>
</plugin>
<plugin>
<artifactId>maven-project-info-reports-plugin</artifactId>
<version>3.0.0</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.2.4</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
AWS Kinesis usage is just to create the KinesisClient and use PutRecordRequest and PutRecordResponse, with AWS Secret Manager , just to access specific secrets.
I am not that versed in AWS API, from my understanding I am already including a limited amount of libraries required to perform those tasks based on the dependencies.
The JAR file is about 65 MB large. Would I be able to optimise dependency loads further? And I guess would lowering the JAR file increase start up time of the Lambda function?
Thank you,
I think you've accidentally included three copies of the same dependency. You just want the latest 2.x.x version of the Kinesis library.
You certainly don't need a 1.x.x SDK and a 2.x.x SDK.
This article does not reduce the size of your Lambda (which I am also struggling with), but it helps with cold starts without any code changes.
In their example it changed the start time from about 5 seconds to 2 seconds. It worked for me.
Simply add an lambda environment variable:
Key: JAVA_TOOL_OPTIONS
Value: -XX:+TieredCompilation -XX:TieredStopAtLevel=1
There is also an article that describes how to remove Netty and Apache clients in favour of the built in JDK client. This reduces the size.
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>url-connection-client</artifactId>
</dependency>
<dependency>
<groupId>software.amazon.awssdk</groupId>
<artifactId>s3</artifactId>
<exclusions>
<exclusion>
<groupId>software.amazon.awssdk</groupId>
<artifactId>netty-nio-client</artifactId>
</exclusion>
<exclusion>
<groupId>software.amazon.awssdk</groupId>
<artifactId>apache-client</artifactId>
</exclusion>
</exclusions>
</dependency>
Use the UrlConnectionHttpClient which wraps Java's HTTP client.
S3Client client = S3Client.builder()
.region(Region.US_WEST_2)
.credentialsProvider(EnvironmentVariableCredentialsProvider.create())
.httpClient(UrlConnectionHttpClient.builder().build())
.build();
Finally you can use Jackson-jr which is a +- 100kb version of Jackson.

How to fix maven-packaged jar reporting 'Error: Could not find or load main class .\FormulaTelemetryApp-1.0-SNAPSHOT.jar'

I am trying to package my JavaFX application using Maven, running it using exec:java is no problem and running maven - package yields no errors. But when I want to execute the runnable jar it returns the error 'Error: Could not find or load main class .\FormulaTelemetryApp-1.0-SNAPSHOT.jar' where the second part is the jar I want to execute.
Beneath I included my POM, I already went looking for solutions, this is where I stumbled upon the maven-jar-plugin, but including that didn't solve the problem. I also tried cleaning the project and compiling/packaging again but that didn't change anything either.
As you might notice, I am fairly new to Maven (trying to self-learn it) so there might be some strange things in the POM.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.formulaElectric</groupId>
<artifactId>FormulaTelemetryApp</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
<name>FormulaTelemetryApp</name>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.8.0</version>
<configuration>
<release>11</release>
</configuration>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<version>1.6.0</version>
<executions>
<execution>
<goals>
<goal>java</goal>
</goals>
</execution>
</executions>
<configuration>
<mainClass>org.feb.telemetry.application.TelemetryApplication</mainClass>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>3.1.1</version>
<configuration>
<archive>
<index>true</index>
<manifest>
<mainClass>org.feb.telemetry.application.TelemetryApplication</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
</plugins>
</build>
<distributionManagement>
<!-- use the following if you're not using a snapshot version. -->
<repository>
<id>localSnap</id>
<name>RepositoryProxyRel</name>
<url>http://127.0.0.1:8080/nexus/content/repositories/releases/</url>
</repository>
<!-- use the following if you ARE using a snapshot version. -->
<snapshotRepository>
<id>MylocalSnap</id>
<name>RepositoryProxySnap</name>
<url>http://127.0.0.1:8080/nexus/content/repositories/snapshots/</url>
</snapshotRepository>
</distributionManagement>
<dependencies>
<dependency>
<groupId>org.openjfx</groupId>
<artifactId>javafx-controls</artifactId>
<version>11</version>
</dependency>
<dependency>
<groupId>org.openjfx</groupId>
<artifactId>javafx-fxml</artifactId>
<version>11</version>
</dependency>
<dependency>
<groupId>org.controlsfx</groupId>
<artifactId>controlsfx</artifactId>
<version>9.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-collections4</artifactId>
<version>4.1</version>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>5.1.23</version>
</dependency>
<dependency>
<groupId>com.fazecast</groupId>
<artifactId>jSerialComm</artifactId>
<version>1.3.11</version>
</dependency>
<dependency>
<groupId>eu.hansolo</groupId>
<artifactId>tilesfx</artifactId>
<version>1.4.7</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>23.0-rc1</version>
</dependency>
<dependency>
<groupId>com.sun.mail</groupId>
<artifactId>javax.mail</artifactId>
<version>1.6.2</version>
</dependency>
<dependency>
<groupId>org.xerial</groupId>
<artifactId>sqlite-jdbc</artifactId>
<version>3.21.0</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.1</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>com.jcraft</groupId>
<artifactId>jsch</artifactId>
<version>0.1.55</version>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.2</version>
</dependency>
<dependency>
<groupId>org.jetbrains</groupId>
<artifactId>annotations</artifactId>
<version>RELEASE</version>
<scope>compile</scope>
</dependency>
</dependencies>
</project>
As stated in the comments, the problem was two-fold: on one hand a fat jar was needed, maven-shade-plugin was used for that. Besides that a launcher class (non-application extending class) was needed as main class, as is stated in the answer to this question.

Maven upgraded DROOLS version and now can no longer build

I am new to maven and java. I am working on a Java/DROOLS project. I was experiencing errors which seemed to be fixed with the next version of DROOLS. I changed the version in my pom.xml and then did mvn package.
The files were downloaded but every line of code that references DROOLS throws a compile error of package org.drools does not exist.
[ERROR] /C:/Dev/src/main/java/com/company/project/drools/client/DroolsRulesRunner.java:[11,18] package org.drools does not exist
[ERROR] /C:/Dev/src/main/java/com/company/project/drools/client/DroolsRulesRunner.java:[12,29] package org.drools.definition does not exist
Any help would be greatly appreciated.
Edit
Here is my pom.xml.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.bericotechnologies</groupId>
<artifactId>sbt-rules</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
<name>my-app</name>
<url>http://maven.apache.org</url>
<properties>
<drools.version>7.6.0.Final</drools.version>
<jbpm5.version>7.6.0.Final</jbpm5.version>
</properties>
<dependencies>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-jdk14</artifactId>
<version>1.7.12</version>
</dependency>
<dependency>
<groupId>org.jbpm</groupId>
<artifactId>jbpm-flow</artifactId>
<version>${jbpm5.version}</version>
</dependency>
<dependency>
<groupId>org.jbpm</groupId>
<artifactId>jbpm-bpmn2</artifactId>
<version>${jbpm5.version}</version>
</dependency>
<dependency>
<groupId>org.jbpm</groupId>
<artifactId>jbpm-flow-builder</artifactId>
<version>${jbpm5.version}</version>
</dependency>
<dependency>
<groupId>pl.maciejwalkowiak</groupId>
<artifactId>junit-drools</artifactId>
<version>1.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.hamcrest</groupId>
<artifactId>hamcrest-all</artifactId>
<version>1.3</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
<version>2.9</version>
</dependency>
<dependency>
<groupId>org.drools</groupId>
<artifactId>drools-compiler</artifactId>
<version>${drools.version}</version>
</dependency>
<dependency>
<groupId>org.eclipse.persistence</groupId>
<artifactId>org.eclipse.persistence.core</artifactId>
<version>2.6.3</version>
</dependency>
<dependency>
<groupId>com.microsoft.sqlserver</groupId>
<artifactId>sqljdbc4</artifactId>
<version>4.0</version>
</dependency>
<dependency>
<groupId>com.microsoft.sqlserver</groupId>
<artifactId>sqljdbc</artifactId>
<version>4.0</version>
</dependency>
<dependency>
<groupId>org.rzo.yajsw</groupId>
<artifactId>wrapper</artifactId>
<version>11.11</version>
</dependency>
<dependency>
<groupId>org.rzo.yajsw</groupId>
<artifactId>wrapperApp</artifactId>
<version>11.11</version>
</dependency>
</dependencies>
<repositories>
<repository>
<id>jboss</id>
<name>jboss</name>
<url>https://repository.jboss.org/nexus/content/groups/public/</url>
</repository>
<repository>
<id>maciejwalkowiak.pl</id>
<url>https://github.com/maciejwalkowiak/maven-repo/raw/releases/</url>
</repository>
</repositories>
<build>
<resources>
<resource>
<directory>src/main/rules</directory>
</resource>
</resources>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.3</version>
<configuration>
<source>1.7</source>
<target>1.7</target>
</configuration>
</plugin>
<plugin>
<groupId>com.google.code.sortpom</groupId>
<artifactId>maven-sortpom-plugin</artifactId>
<version>2.3.0</version>
<executions>
<execution>
<phase>verify</phase>
<goals>
<goal>sort</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<configuration>
<archive>
<manifest>
<mainClass>com.berico.psip.run.RunDroolsService</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
</plugins>
</build>
</project>
The only thing that I changed is
<properties>
<drools.version>5.6.0.Final</drools.version>
<jbpm5.version>5.5.0.Final</jbpm5.version>
</properties>
to
<properties>
<drools.version>7.6.0.Final</drools.version>
<jbpm5.version>7.6.0.Final</jbpm5.version>
</properties>
If I change it back everything works fine. I can see the 7.6.0.Final jars in the repository.
You have broken imports and you have to fix them.
There's changes in the package structure of Drools between versions 5.6.0.Final and 7.6.0.Final. The package "definitions" is now under org.drools.core.definitions (v7.6.0.Final) and not under org.drools.definitions (5.6.0.Final).
You need to remove all imports beginning with org.drools in your classes and re-import the missing Drools classes again with the new package locations - those from version 7.6.0.Final.

How to use Maven Shade plugin?

In my project I want to avoid version conflict of neo4j lucene indexer (which uses lucene version - 3.6.2) and apache lucene (lucene version - 5.3.0). For this I want to use Maven shade plugin. Actually, I added plugin to my projects 'pom.xml' file but problem wasn't solved. I get exception -
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.lucene.analysis.standard.StandardAnalyzer: method <init>()V not found
at com.sessa.col.spr.act.dictionary.DictionaryConfiguration.writerConfiguration(DictionaryConfiguration.java:124)
at com.sessa.col.spr.act.process_flow.Flow.startProcess(Flow.java:59)
at com.sessa.col.spr.act.process_flow.FlowHandler.main(FlowHandler.java:17)
It seems that it is caused by version conflict again. I guess, I don't use Maven Shade plugin in a correct way. How should it be used?
pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.sessa.col.spr.act</groupId>
<artifactId>Color-Spreading-Activation</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>Color-Spreading-Activation</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<neo4j-version>2.2.5</neo4j-version>
</properties>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.4</version>
</dependency>
<dependency>
<groupId>org.neo4j</groupId>
<artifactId>neo4j</artifactId>
<version>${neo4j-version}</version>
</dependency>
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.5.2</version>
</dependency>
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-parser</artifactId>
<version>3.5.2</version>
</dependency>
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.5.2</version>
<classifier>models</classifier>
</dependency>
<dependency>
<groupId>com.sparsity</groupId>
<artifactId>sparkseejava</artifactId>
<version>5.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.jena</groupId>
<artifactId>jena-tdb</artifactId>
<version>1.1.2</version>
</dependency>
<dependency>
<groupId>org.apache.opennlp</groupId>
<artifactId>opennlp-tools</artifactId>
<version>1.5.3</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-core</artifactId>
<version>5.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-analyzers-common</artifactId>
<version>5.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-queryparser</artifactId>
<version>5.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-queries</artifactId>
<version>5.3.0</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<createDependencyReducedPom>false</createDependencyReducedPom>
<relocations>
<relocation>
<pattern>org.apache.lucene</pattern>
<shadedPattern>shaded_lucene_3_6_2.org.apache.lucene</shadedPattern>
</relocation>
</relocations>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
<repositories>
<repository>
<id>neo4j-repo</id>
<name>Neo4j Repository</name>
<url>http://m2.neo4j.org/content/repositories/releases</url>
</repository>
</repositories>
I doubt that maven-shade will help you here. You basically want to have multiple versions of the same jar - lucene 3.6.2 and 5.x.y used at the same time.
The only solution I'm aware of here is using classloader separation.
However it might be worth refactoring the architecture to prevent that problem by separating Neo4j and your code into separate JVMs.
Add this project to eclipse -> https://github.com/lagodiuk/neo4j-uber-jar.
Use mvn-install and create ~SNAPSHOT.jar
Add that .jar to your project (which has conflict)
Remove neo4j maven dependency from that project.

Spark runtime error: spark.metrics.sink.MetricsServlet cannot be instantialized

I got invocation target exception when running project with spark 1.3 lib in maven in IntelliJ.
I got met this error only in IntelliJ IDE. After I deployed the jar and ran via spark-submit, the error went out.
Any one has met with the same problem before? I hope to fix this problem so as to do easy-debugging. otherwise I have to package the jar every time when I want to run the code.
details are as below:
2015-04-21 09:39:13 ERROR MetricsSystem:75 - Sink class org.apache.spark.metrics.sink.MetricsServlet cannot be instantialized
2015-04-21 09:39:13 ERROR TrainingSFERunner:144 - java.lang.reflect.InvocationTargetException
2015-04-20 16:08:44 INFO BlockManagerMaster:59 - Registered BlockManager
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:187)
at org.apache.spark.metrics.MetricsSystem$$anonfun$registerSinks$1.apply(MetricsSystem.scala:181)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:98)
at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:226)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:39)
at scala.collection.mutable.HashMap.foreach(HashMap.scala:98)
at org.apache.spark.metrics.MetricsSystem.registerSinks(MetricsSystem.scala:181)
at org.apache.spark.metrics.MetricsSystem.start(MetricsSystem.scala:98)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:390)
at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
at spark.mllibClassifier.JavaRandomForests.run(JavaRandomForests.java:105)
at spark.mllibClassifier.SparkMLlibMain.runMain(SparkMLlibMain.java:263)
at spark.mllibClassifier.JavaRandomForests.main(JavaRandomForests.java:221)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
Caused by: java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.module.SimpleSerializers.<init>(Ljava/util/List;)V
at com.codahale.metrics.json.MetricsModule.setupModule(MetricsModule.java:223)
at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:469)
at org.apache.spark.metrics.sink.MetricsServlet.<init>(MetricsServlet.scala:45)
My pom file is as follows.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>projects</groupId>
<artifactId>project1</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencies>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-core</artifactId>
<version>5.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-analyzers-common</artifactId>
<version>5.0.0</version>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId>
<version>1.3.0</version>
</dependency>
<dependency>
<groupId>colt</groupId>
<artifactId>colt</artifactId>
<version>1.2.0</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.2</version>
<configuration>
<source>1.7</source>
<target>1.7</target>
</configuration>
</plugin>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
</plugin>
</plugins>
</build>
</project>
Strangely, I found the error didn't come out any more when I move the spark related dependencies to the front.
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId>
<version>1.3.0</version>
</dependency>
//....the rest dependencies....
</dependencies>
So the sequence of the dependencies matters! Any one knows why?
I think the issue here is in jackson dependency. I had similar issue and problem was multiple jackson-core and jackson-databind versions. I think it is scala related issue. Anyway add this 2 jackson dependencies to pom with lower versions and it should work. Maybe you will not find right version from first try. This one works for me.
<jackson-core.version>2.4.4</jackson-core.version>
<spark.version>2.3.0</spark.version>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-mllib -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>${jackson-core.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>${jackson-core.version}</version>
</dependency>
Jackson dependency issue - I had spark3.0.3-hadoop2.7 version, and had 2 versions of jackson annotation and databind jars. Removed those, and this error got solved.

Categories