How to run my map-reduce program with AWS?

I'm trying to run a map-reduce program using the Hadoop framework.
I need to run the program on an Amazon Elastic MapReduce (EMR) instance, but I keep getting the following error:
Exception in thread "main" java.lang.NoSuchMethodError: com.amazonaws.transform.JsonErrorUnmarshaller: method (Ljava/lang/Class;)V not found
I tried to fix my pom file by adding the AWS SDK, changing its versions, and adding the core module separately.
My pom.xml file:
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.7.3</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.7.3</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-ec2</artifactId>
<version>1.10.2</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-s3</artifactId>
<version>1.10.5</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-emr</artifactId>
<version>1.9.0</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk</artifactId>
<version>1.11.5</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-core</artifactId>
<version>1.11.5</version>
</dependency>
This is how I try to create the instance:
AWSCredentials creds = new PropertiesCredentials(new FileInputStream(propertiesFilePath));
AmazonElasticMapReduce mapReduce = new AmazonElasticMapReduceClient(creds);
I expect the program to run on an AWS instance (and probably to hit a bunch more errors in my code that I can then debug).
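A NoSuchMethodError usually means that a class compiled against one version of a library is running against a different version of it. The pom above declares several different versions under com.amazonaws, and also mixes Hadoop 1.x and 2.x artifacts, so that is one plausible cause here. The sketch below is a hypothetical helper (MixedVersionCheck is not part of Maven or the AWS SDK) that takes the coordinates copied from the pom and reports every groupId declared with more than one version. Sharing a groupId does not always require sharing a version, so treat the output as a list of places to look, not as proof.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Maven or the AWS SDK.
class MixedVersionCheck {

    // Groups "groupId:artifactId:version" strings by groupId and keeps only
    // the groups that declare more than one distinct version.
    static Map<String, TreeSet<String>> mixedVersions(List<String> coordinates) {
        Map<String, TreeSet<String>> versionsByGroup = new LinkedHashMap<>();
        for (String coordinate : coordinates) {
            String[] parts = coordinate.split(":");
            if (parts.length != 3) {
                throw new IllegalArgumentException(
                        "Expected groupId:artifactId:version but got " + coordinate);
            }
            versionsByGroup.computeIfAbsent(parts[0], key -> new TreeSet<>()).add(parts[2]);
        }
        // Drop groups whose artifacts all share a single version.
        versionsByGroup.values().removeIf(versions -> versions.size() < 2);
        return versionsByGroup;
    }

    public static void main(String[] args) {
        // Coordinates copied from the pom.xml in the question.
        List<String> declared = Arrays.asList(
                "org.apache.hadoop:hadoop-core:1.2.1",
                "org.apache.hadoop:hadoop-common:2.7.3",
                "org.apache.hadoop:hadoop-client:2.7.3",
                "com.amazonaws:aws-java-sdk-ec2:1.10.2",
                "com.amazonaws:aws-java-sdk-s3:1.10.5",
                "com.amazonaws:aws-java-sdk-emr:1.9.0",
                "com.amazonaws:aws-java-sdk:1.11.5",
                "com.amazonaws:aws-java-sdk-core:1.11.5");
        mixedVersions(declared).forEach((group, versions) ->
                System.out.println(group + " is declared with several versions: " + versions));
    }
}
```

Running `mvn dependency:tree` on the real project shows which of the competing versions Maven actually resolves, including transitive ones that a check like this cannot see.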

Related

hadoop-aws compile dependencies on jackson library

I am trying to write a Spark program in Java to fetch records from an Oracle database and write them to an S3 bucket. To access the bucket using the s3a:// scheme, I have added hadoop-aws as a dependency in the pom.xml.
I have added the dependencies below to the pom.xml file:
<properties>
<java.version>1.8</java.version>
<scala.version>2.12</scala.version>
<spark.version>3.0.2</spark.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>com.github.noraui</groupId>
<artifactId>ojdbc7</artifactId>
<version>12.1.0.2</version>
</dependency>
<dependency>
<groupId>com.github.noraui</groupId>
<artifactId>ojdbc7</artifactId>
<version>12.1.0.2</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk</artifactId>
<version>1.11.690</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-aws</artifactId>
</dependency>
I tried to run the program but got a dependency error:
Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/databind/type/ReferenceType
at com.fasterxml.jackson.module.scala.modifiers.ScalaTypeModifierModule.$init$(ScalaTypeModifier.scala:35)
at com.fasterxml.jackson.module.scala.DefaultScalaModule.<init>(DefaultScalaModule.scala:18)
at com.fasterxml.jackson.module.scala.DefaultScalaModule$.<init>(DefaultScalaModule.scala:36)
at com.fasterxml.jackson.module.scala.DefaultScalaModule$.<clinit>(DefaultScalaModule.scala)
After a bit of research, I learned that hadoop-aws has a compile dependency on com.fasterxml.jackson.core, as mentioned here:
https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws/2.8.5
So I updated my pom.xml with the dependencies below:
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-databind -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-annotations -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
</dependency>
But I am still getting the same error. I am not sure what else I am missing here.
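When a Jackson class is reported missing even though Jackson is declared in the pom, it can help to see which jar actually supplies the Jackson classes at run time, since an older copy pulled in transitively may be the one on the classpath. The following is a minimal diagnostic sketch; WhichJar and locationOf are hypothetical names, and the default class name is only an example of a class that ships in jackson-databind.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic helper; run it with the same classpath as the failing job.
class WhichJar {

    // Returns the location a class was loaded from, or a note when the JVM
    // reports none (this is normal for JDK bootstrap classes).
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        URL location = source == null ? null : source.getLocation();
        return location == null ? "(no code source reported)" : location.toString();
    }

    public static void main(String[] args) {
        String className = args.length > 0
                ? args[0]
                : "com.fasterxml.jackson.databind.ObjectMapper";
        try {
            // initialize=false: locate the class without running its static initializers.
            Class<?> type = Class.forName(className, false, WhichJar.class.getClassLoader());
            System.out.println(className + " loaded from " + locationOf(type));
        } catch (ClassNotFoundException | LinkageError e) {
            System.out.println(className + " could not be loaded: " + e);
        }
    }
}
```

If the printed jar has a different version from the one declared in the pom, `mvn dependency:tree` should show which dependency brings it in.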

Invoke dataflow pipeline error in app engine

I want to schedule a Dataflow pipeline as described in "Scheduling Dataflow pipelines using App Engine Cron Service or Cloud Functions".
When run, my pipeline throws an exception:
java.lang.NoClassDefFoundError: Could not initialize class com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory
Maven project one (app engine):
<dependency>
<groupId>com.google.appengine</groupId>
<artifactId>appengine-api-1.0-sdk</artifactId>
<version>1.9.42</version>
</dependency>
<dependency>
project two
</dependency>
Maven project two:
<dependency>
<groupId>com.google.cloud.dataflow</groupId>
<artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
<version>[1.6.0, 2.0.0)</version>
</dependency>
<dependency>
<groupId>com.google.api-client</groupId>
<artifactId>google-api-client</artifactId>
<version>1.22.0</version>
<exclusions>
<exclusion>
<groupId>com.google.guava</groupId>
<artifactId>guava-jdk5</artifactId>
</exclusion>
</exclusions>
</dependency>
Do I need this dependency,
<dependency>
<groupId>com.google.cloud.dataflow</groupId>
<artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
<version>1.6.1</version>
</dependency>
or this one?
<dependency>
<groupId>com.google.cloud.dataflow</groupId>
<artifactId>google-cloud-dataflow-java-archetypes-starter</artifactId>
<version>1.6.0</version>
</dependency>
Thanks.
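The wording "Could not initialize class" is worth a note: the JVM uses it when the class was found but its static initialization already failed on an earlier access. The first failure is thrown as an ExceptionInInitializerError that carries the real cause, so the earlier log entries are usually the ones to read. The sketch below reproduces that behavior with a stand-in class; Fragile and its simulated failure are made up for illustration and have nothing to do with the Dataflow SDK itself.

```java
// Demonstrates why "Could not initialize class X" hides the original cause.
class InitFailureDemo {

    // Stand-in for a library class whose static initializer fails,
    // for example because one of its own dependencies is missing.
    static class Fragile {
        static final int VALUE = compute();

        static int compute() {
            throw new IllegalStateException("simulated failure in static initializer");
        }
    }

    // Touches Fragile and returns whatever was thrown, or null on success.
    static Throwable touch() {
        try {
            int ignored = Fragile.VALUE;
            return null;
        } catch (Throwable t) {
            return t;
        }
    }

    public static void main(String[] args) {
        Throwable first = touch();   // ExceptionInInitializerError with the real cause attached
        Throwable second = touch();  // NoClassDefFoundError: Could not initialize class ...
        System.out.println("first access:  " + first);
        System.out.println("  caused by:   " + first.getCause());
        System.out.println("second access: " + second);
    }
}
```

In the App Engine case this suggests searching the logs for the first ExceptionInInitializerError mentioning PipelineOptionsFactory, because that entry names the underlying problem.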

NoClassDefFoundError: org/apache/spark/sql/DataFrame in spark-cassandra-connector

I'm trying to upgrade spark-cassandra-connector from 1.4 to 1.5.
Everything seems fine, but when I run the test cases, the run gets stuck partway through and logs this error:
Exception in thread "dag-scheduler-event-loop"
java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
My pom file looks like:
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.10 -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.5.0</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>16.0.1</version>
</dependency>
<!-- Scala Library -->
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.10.5</version>
</dependency>
<!--Spark Cassandra Connector-->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.5.0</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<version>1.5.0</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>3.0.2</version>
</dependency>
<!--Spark-->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.5.0</version>
<exclusions>
<exclusion>
<groupId>net.java.dev.jets3t</groupId>
<artifactId>jets3t</artifactId>
</exclusion>
</exclusions>
</dependency>
</dependencies>
</project>
Thank you in advance!!
Can anyone please help me with this?
If you need more info please let me know!!
Try adding this dependency:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
Also make sure that your version of spark-cassandra-connector is compatible with the version of Spark you're using. I got the same error message, even with all the proper dependencies, when I tried to use an older spark-cassandra-connector with a newer Spark version. Refer to this table: https://github.com/datastax/spark-cassandra-connector#version-compatibility
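To confirm whether the missing class is really absent from the test classpath, a small probe can be run with the same classpath as the tests. This is only a sketch: ClasspathProbe and isPresent are hypothetical names, and the class names in main are the one from the error message plus one from spark-core for comparison.

```java
// Hypothetical probe; run it with the same classpath as the failing tests.
class ClasspathProbe {

    // Reports whether a class can be located, without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "org.apache.spark.sql.DataFrame",  // lives in the spark-sql module in Spark 1.x
            "org.apache.spark.SparkContext"    // lives in spark-core
        };
        for (String name : candidates) {
            System.out.println(name + " present: " + isPresent(name));
        }
    }
}
```

If the spark-core class is found but the DataFrame class is not, the spark-sql artifact is missing from that classpath, which matches the suggestion above.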

Could not initialize com.ibm.mq.MQEnvironment

I've upgraded my Maven dependencies for IBM MQ from these (version 6.0.2.5):
<dependency>
<groupId>com.ibm</groupId>
<artifactId>mq</artifactId>
<version>${ibm-mq-version}</version>
</dependency>
<dependency>
<groupId>com.ibm</groupId>
<artifactId>mqjms</artifactId>
<version>${ibm-mq-version}</version>
</dependency>
<dependency>
<groupId>com.ibm.disthub2</groupId>
<artifactId>dhbcore</artifactId>
<version>DH610-Gold</version>
</dependency>
<dependency>
<groupId>com.ibm</groupId>
<artifactId>mqetclient</artifactId>
<version>${ibm-mq-version}</version>
</dependency>
To this (version 7.5.0.5):
<dependency>
<groupId>com.ibm</groupId>
<artifactId>mq-jms-all</artifactId>
<version>${ibm-mq-version}</version>
</dependency>
Now, every time I try to run my project, I get the following error:
nested exception is java.lang.NoClassDefFoundError: Could not initialize class com.ibm.mq.MQEnvironment
The Maven dependency is imported correctly and is also visible in Eclipse under Maven Dependencies. I also see com.ibm.mq.jar on the classpath.
I've googled a lot, and the only real solution that worked for some people was to add connector.jar. But I'm already using that jar:
<dependency>
<groupId>javax.resource</groupId>
<artifactId>connector</artifactId>
<version>${connector-version}</version>
</dependency>
Am I missing something?
You upgraded IBM MQ from version 6.0.2.5 to version 7.5.0.5.
IBM moved MQException to the 'com.ibm.mq.jmqi.jar' file.
As per the MQ Knowledge Center, you need the following jar files for MQ JMS programming:
com.ibm.mq.commonservices.jar
com.ibm.mq.headers.jar
com.ibm.mq.pcf.jar
com.ibm.mq.jmqi.jar
connector.jar
jms.jar
dhbcore.jar
rmm.jar
jndi.jar
ldap.jar
fscontext.jar
providerutil.jar
CL3Export.jar
CL3Nonexport.jar
I had exactly the same problem, and this fixed it:
<dependency>
<groupId>javax.resource</groupId>
<artifactId>connector</artifactId>
<version>${connector-version}</version>
</dependency>
These are my dependencies.
<dependency>
<groupId>com.ibm.mq</groupId>
<artifactId>com.ibm.mq.commonservices</artifactId>
<version>7.0.1.4</version>
</dependency>
<dependency>
<groupId>com.ibm.mq</groupId>
<artifactId>com.ibm.mq.headers</artifactId>
<version>7.0.1.4</version>
</dependency>
<dependency>
<groupId>com.ibm.mq</groupId>
<artifactId>com.ibm.mq.jmqi</artifactId>
<version>7.0.1.4</version>
</dependency>
<dependency>
<groupId>com.ibm.mq</groupId>
<artifactId>com.ibm.mq.jms.Nojndi</artifactId>
<version>7.0.1.4</version>
</dependency>
<dependency>
<groupId>com.ibm.mq</groupId>
<artifactId>com.ibm.mqjms</artifactId>
<version>7.0.1.4</version>
</dependency>
<dependency>
<groupId>com.ibm.mq</groupId>
<artifactId>com.ibm.mq.soap</artifactId>
<version>7.0.1.4</version>
</dependency>
<dependency>
<groupId>com.ibm.mq</groupId>
<artifactId>com.ibm.mq</artifactId>
<version>7.0.1.4</version>
</dependency>
<dependency>
<groupId>com.ibm.mq</groupId>
<artifactId>com.ibm.mq.headers</artifactId>
<version>7.0.1.4</version>
</dependency>
<dependency>
<groupId>com.ibm.mq</groupId>
<artifactId>com.ibm.mq.pcf</artifactId>
<version>7.0.1.4</version>
</dependency>
<dependency>
<groupId>javax.resource</groupId>
<artifactId>connector</artifactId>
<version>1.5</version>
</dependency>
<dependency>
<groupId>com.ibm</groupId>
<artifactId>com.ibm.dhbcore</artifactId>
<version>7.0.1</version>
</dependency>
<dependency>
<groupId>com.ibm.mq</groupId>
<artifactId>CL3Nonexport</artifactId>
<version>${webspheremq.version}</version>
</dependency>
<dependency>
<groupId>com.ibm</groupId>
<artifactId>com.ibm.mqetclient</artifactId>
<version>7.0.1</version>
</dependency>
For Eclipse (Dynamic Web Project (Servlet)), you need to copy these files:
com.ibm.mq.commonservices.jar
com.ibm.mq.defaultconfig.jar
com.ibm.mq.headers.jar
com.ibm.mq.jar
com.ibm.mq.jmqi.jar
com.ibm.mq.jms.Nojndi.jar
com.ibm.mq.pcf.jar
com.ibm.mqetclient.jar
com.ibm.mqjms.jar
connector.jar
dhbcore.jar
fscontext.jar
jms.jar
into /WebContext/WEB-INF/lib, then add them to the project (Project -> Properties -> Java Build Path -> Add External JARs).
After that, go through these steps:
close project
close Eclipse
open Eclipse
open project.
Good Luck!

IllegalArgumentException on trying out 'Quickstart: Run a Gmail App in Java'

I created a project using Maven and followed the steps, but when I run the code I get the following error:
Exception in thread "main" java.lang.IllegalArgumentException
at com.google.api.client.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:76)
at com.google.api.client.util.Preconditions.checkArgument(Preconditions.java:37)
at com.google.api.client.googleapis.auth.oauth2.GoogleClientSecrets.getDetails(GoogleClientSecrets.java:80)
at com.google.api.client.googleapis.auth.oauth2.GoogleAuthorizationCodeFlow$Builder.<init>(GoogleAuthorizationCodeFlow.java:195)
at com.vertoanalytics.external.mailfetcher.GmailApiQuickstart.main(GmailApiQuickstart.java:41)
When I googled it, I found some suggestions that not having the latest packages can cause this problem. Here is the Maven dependency list for the project:
<dependency>
<groupId>com.google.api-client</groupId>
<artifactId>google-api-client</artifactId>
<version>1.18.0-rc</version>
</dependency>
<dependency>
<groupId>com.google.oauth-client</groupId>
<artifactId>google-oauth-client</artifactId>
<version>1.19.0</version>
</dependency>
<dependency>
<groupId>com.google.apis</groupId>
<artifactId>google-api-services-gmail</artifactId>
<version>v1-rev7-1.19.0</version>
</dependency>
<dependency>
<groupId>com.google.http-client</groupId>
<artifactId>google-http-client-jackson</artifactId>
<version>1.18.0-rc</version>
</dependency>
Am I missing anything?
