Map Reduce client jars for 2.4.1 hadoop in eclipse - java

When I run my Hadoop MapReduce word count jar from the Hadoop folder in a shell, it runs properly and the output is generated correctly.
I use YARN with Hadoop 2.4.1. When I run the MapReduce sample program from Eclipse, the map phase completes but the reduce phase fails.
It's clear that the problem is with the jar configuration.
Please find the jars I have added...
This is the error I got:
INFO: reduce task executor complete.
Nov 21, 2014 8:50:35 PM org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local1638918104_0001
java.lang.Exception: java.lang.NoSuchMethodError: org.apache.hadoop.mapred.ReduceTask.setLocalMapFiles(Ljava/util/Map;)V
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.mapred.ReduceTask.setLocalMapFiles(Ljava/util/Map;)V
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:309)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
Exception in thread "Thread-12" java.lang.NoClassDefFoundError: org/apache/commons/httpclient/HttpMethod
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:562)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.httpclient.HttpMethod
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 1 more

As per the screenshot, you are manually adding all the dependent jars to the classpath.
It's highly recommended to use Maven for this, which automates adding dependent jars to the classpath. You only need to declare the main dependencies.
I used the following dependencies in pom.xml, which let me run without any issues:
<properties>
<hadoop.version>2.5.2</hadoop.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-api</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-common</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-auth</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-server-nodemanager</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-server-resourcemanager</artifactId>
<version>${hadoop.version}</version>
</dependency>
</dependencies>
Coming to your problem:
I checked the classpath, and there are exactly 82 jar files available.
It would be a tedious job to find each jar like this.
You can add the jars grouped by function HERE.
Another workaround is to add all the jar files under the installed Hadoop directory, <hadoop-installed>/share/hadoop/, including all jars from every lib folder. That is the best thing you can do. Or:
Add only the avro-specific jars, because the exception is thrown by an avro class as per the screenshot. This could solve the avro jars issue, but you may face other dependency issues.
I also faced the same problem while working with Hadoop V1. I later realized this and now use Maven with Hadoop V2, so there are no worries about dependent jars.
Your focus will be on Hadoop and business needs. :)
Hope it helps you.
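A NoSuchMethodError like the one in the question usually points to mixed jar versions on the classpath. As a minimal sketch (the class name ClassOrigin and its locate helper are made up for illustration and are not part of Hadoop), the following prints which jar or directory a class was actually loaded from, using only the standard Class.getProtectionDomain() API:

```java
import java.security.CodeSource;

// Hypothetical helper: reports which jar or directory a class was loaded from.
final class ClassOrigin {

    // Returns the code source location of the class, or a placeholder for
    // classes that have no code source (e.g. JDK classes from the bootstrap loader).
    static String locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name as the first argument.
        // Defaults to this class so the sketch runs without Hadoop present.
        String name = args.length > 0 ? args[0] : ClassOrigin.class.getName();
        // 'false' loads the class without running its static initializers.
        Class<?> type = Class.forName(name, false, ClassOrigin.class.getClassLoader());
        System.out.println(name + " <- " + locate(type));
    }
}
```

Run it with the same classpath as the Eclipse launch configuration and pass org.apache.hadoop.mapred.ReduceTask as the argument. If the printed jar is not the version you expect, that jar is a likely candidate for the mismatch.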

Related

Apache Tomcat failed to start with SAXParserFactoryImpl not found

I'm trying to start a web application on Apache Tomcat in Eclipse, but it fails with the error below (posted in full in a gist).
Error (complete error link):
Caused by: java.lang.RuntimeException: Provider for class javax.xml.parsers.SAXParserFactory cannot be created
at javax.xml.parsers.FactoryFinder.findServiceProvider(FactoryFinder.java:308)
... 38 more
Caused by: java.util.ServiceConfigurationError: javax.xml.parsers.SAXParserFactory: Provider org.apache.xerces.jaxp.SAXParserFactoryImpl not found
at java.util.ServiceLoader.fail(ServiceLoader.java:239)
Below are the XML-related dependencies in my pom file:
<dependency>
<groupId>xerces</groupId>
<artifactId>xercesImpl</artifactId>
<version>2.4.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>ibm</groupId>
<artifactId>xml4j</artifactId>
<version>2.0.15</version>
</dependency>
I tried various solutions: pasting the xerces jar into the jre/lib/endorsed folder, marking it as provided, and refreshing the target directory. Nothing worked.
Since my comment helped you out, I'm posting it as an answer.
Use Apache Xerces Dependency:
<!-- https://mvnrepository.com/artifact/org.apache.xerces/xercesImpl -->
<dependency>
<groupId>org.apache.xerces</groupId>
<artifactId>xercesImpl</artifactId>
<version>2.9.1</version>
<scope>runtime</scope>
</dependency>
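To see which SAXParserFactory implementation the runtime actually resolves, a small check like the following may help. This is only a sketch: SaxProviderCheck is a made-up name, and it uses nothing beyond the standard javax.xml.parsers API. Note that inside Tomcat the webapp class loader may resolve a different provider than a standalone run does.

```java
import javax.xml.parsers.SAXParserFactory;

// Hypothetical helper: shows which SAXParserFactory provider is picked up.
final class SaxProviderCheck {

    // Returns the class name of the factory chosen by the JAXP lookup.
    // Throws FactoryConfigurationError if a configured provider cannot be created.
    static String providerClassName() {
        return SAXParserFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("SAXParserFactory provider: " + providerClassName());
    }
}
```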

StringUtils class not found

I wrote a console application that reads a text file into a String and then processes the file contents. I use Maven in my project, enabled auto-import, and added the proper dependencies, but when I process the String with the replace() method (which belongs to the org.apache.commons.lang3.StringUtils class) I get the error below. When I launch my application in IntelliJ, it works perfectly and everything seems fine. When I build the .jar file with Maven and then launch it from a terminal, it reports this error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/lang3/StringUtils
at com.company.Reader.process(Reader.java:47)
at com.company.App.main(App.java:9)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.lang3.StringUtils
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
Also, the dependencies in my pom.xml look like this:
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.5</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.commons/commons-lang3 -->
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.4</version>
</dependency>
<dependency>
<groupId>org.hamcrest</groupId>
<artifactId>hamcrest-library</artifactId>
<version>1.3</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-all</artifactId>
<version>1.8.4</version>
<scope>test</scope>
</dependency>
</dependencies>
I have no idea what might be wrong. I know one way to make my program work: download the commons-lang3 .jar manually and include it in my project, but that is not a satisfying solution for me. Does anyone know why I get this error?
Thanks in advance
You didn't post your entire pom.xml, but I guess that you forgot the dependency:
<!-- https://mvnrepository.com/artifact/org.apache.commons/commons-lang3 -->
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>3.4</version>
</dependency>
You said that your project works inside IntelliJ, so you already have commons-lang3 on your computer.
I think the problem lies in the Maven dependencies.
Try executing mvn clean install from the command line in the root folder of your project.
You need to put your commons-lang3-3.4.jar on the classpath.
Two links to help you:
https://docs.oracle.com/javase/7/docs/technotes/tools/windows/classpath.html
Run a JAR file from the command line and specify classpath
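To confirm whether a class is visible at runtime with a given classpath, a tiny probe can be run from the terminal the same way the application is launched. The sketch below is an illustration only; ClasspathCheck and isPresent are made-up names, and it uses just Class.forName from the JDK:

```java
// Hypothetical helper: checks whether a class can be found on the current classpath.
final class ClasspathCheck {

    // Returns true if the named class can be loaded, false otherwise.
    // The class is located but not initialized (second argument is 'false').
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults to the class from the stack trace in the question.
        String name = args.length > 0 ? args[0] : "org.apache.commons.lang3.StringUtils";
        System.out.println(name + (isPresent(name) ? " is" : " is NOT") + " on the classpath");
    }
}
```

Launching it once with only the application jar on the classpath and once with commons-lang3-3.4.jar added via -cp should show whether the dependency jar is what is missing at runtime.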

Hadoop 2.6.x and Amazon AWS SDK Library conflicts http-core conflict

My job writes each record to DynamoDB in the Hadoop map phase.
I cannot make it run with Hadoop 2.6, which ships httpclient-4.2.5.jar and httpcore-4.2.5.jar.
The AWS SDK I am using was built against httpclient-4.5.2.jar and httpcore-4.4.4.jar.
When I use the classpath to include the newer jar files, I get the following exception:
java.lang.Exception: java.lang.NoSuchFieldError: INSTANCE
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NoSuchFieldError: INSTANCE
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.<clinit>(SSLConnectionSocketFactory.java:144)
at com.amazonaws.http.apache.client.impl.ApacheConnectionManagerFactory.getPreferredSocketFactory(ApacheConnectionManagerFactory.java:87)
at com.amazonaws.http.apache.client.impl.ApacheConnectionManagerFactory.create(ApacheConnectionManagerFactory.java:65)
at com.amazonaws.http.apache.client.impl.ApacheConnectionManagerFactory.create(ApacheConnectionManagerFactory.java:58)
at com.amazonaws.http.apache.client.impl.ApacheHttpClientFactory.create(ApacheHttpClientFactory.java:49)
To me, it looks like Hadoop was built against the old libraries and something has changed in the API since then.
Is there a reasonable solution other than recompiling older sources of the AWS SDK?
As an update, I had to switch to Maven and play around with the versions a bit.
<!-- http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>2.6.0</version>
</dependency>
<!-- http://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk -->
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.3.4</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk-dynamodb</artifactId>
<version>1.9.13</version>
</dependency>
<dependency>
<groupId>org.netpreserve.commons</groupId>
<artifactId>webarchive-commons</artifactId>
<version>1.1.4</version>
</dependency>
Finally, it works.
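When two versions of the same library end up on the classpath, it can help to list every location that provides a given class. The following is a sketch under that assumption: DuplicateClassFinder and findAll are made-up names, and it relies only on the standard ClassLoader.getResources call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every classpath location that contains a class file.
final class DuplicateClassFinder {

    // Returns the URLs of all copies of the class file visible to the
    // system class loader. More than one entry suggests conflicting jars.
    static List<URL> findAll(String className) {
        String resource = className.replace('.', '/') + ".class";
        try {
            return Collections.list(ClassLoader.getSystemClassLoader().getResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace from the question.
        String name = args.length > 0
                ? args[0]
                : "org.apache.http.conn.ssl.SSLConnectionSocketFactory";
        List<URL> locations = findAll(name);
        System.out.println(locations.size() + " location(s) for " + name);
        for (URL url : locations) {
            System.out.println("  " + url);
        }
    }
}
```

The first location listed is normally the one the class loader uses, so an older httpclient jar appearing ahead of the newer one would be consistent with the NoSuchFieldError above.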

Maven builds are working, but junit is throwing a NoSuchMethodError

Just added PowerMock to my project's pom files so I can start mocking some static methods. I've verified all the versions are up to date, including JUnit, Javassist, and Mockito. The one line that is causing this problem is this:
@RunWith(PowerMockRunner.class)
When that line is commented out, my code runs fine through JUnit; with it, JUnit has a heart attack, but a Maven build works perfectly. Now I can get used to working with Maven builds, but I would love the ability to debug my tests through JUnit.
I have tried the following: cleaning eclipse, using maven installs/cleans/builds from both Terminal and Eclipse's UI, furiously pressing F5 while rocking myself in a dark corner for 2 hours. Any assistance on this problem would be greatly appreciated.
Below is the failure trace when I try to run as a junit test.
java.lang.NoSuchMethodError: javassist.CtMethod.hasAnnotation(Ljava/lang/Class;)Z
at org.powermock.core.transformers.impl.TestClassTransformer.removeTestAnnotationsForTestMethodsThatRunOnOtherClassLoader(TestClassTransformer.java:185)
at org.powermock.core.transformers.impl.TestClassTransformer.transform(TestClassTransformer.java:198)
at org.powermock.core.classloader.MockClassLoader.loadMockClass(MockClassLoader.java:251)
at org.powermock.core.classloader.MockClassLoader.loadModifiedClass(MockClassLoader.java:180)
at org.powermock.core.classloader.DeferSupportingClassLoader.loadClass(DeferSupportingClassLoader.java:68)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.powermock.modules.junit4.common.internal.impl.JUnit4TestSuiteChunkerImpl.createDelegatorFromClassloader(JUnit4TestSuiteChunkerImpl.java:145)
at org.powermock.modules.junit4.common.internal.impl.JUnit4TestSuiteChunkerImpl.createDelegatorFromClassloader(JUnit4TestSuiteChunkerImpl.java:40)
at org.powermock.tests.utils.impl.AbstractTestSuiteChunkerImpl.createTestDelegators(AbstractTestSuiteChunkerImpl.java:244)
at org.powermock.modules.junit4.common.internal.impl.JUnit4TestSuiteChunkerImpl.<init>(JUnit4TestSuiteChunkerImpl.java:61)
at org.powermock.modules.junit4.common.internal.impl.AbstractCommonPowerMockRunner.<init>(AbstractCommonPowerMockRunner.java:32)
at org.powermock.modules.junit4.PowerMockRunner.<init>(PowerMockRunner.java:34)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.junit.internal.builders.AnnotatedBuilder.buildRunner(AnnotatedBuilder.java:31)
at org.junit.internal.builders.AnnotatedBuilder.runnerForClass(AnnotatedBuilder.java:24)
at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:57)
at org.junit.internal.builders.AllDefaultPossibilitiesBuilder.runnerForClass(AllDefaultPossibilitiesBuilder.java:29)
at org.junit.runners.model.RunnerBuilder.safeRunnerForClass(RunnerBuilder.java:57)
at org.junit.internal.requests.ClassRequest.getRunner(ClassRequest.java:24)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.<init>(JUnit4TestReference.java:33)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestClassReference.<init>(JUnit4TestClassReference.java:25)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestLoader.createTest(JUnit4TestLoader.java:48)
at org.eclipse.jdt.internal.junit4.runner.JUnit4TestLoader.loadTests(JUnit4TestLoader.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:452)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-core</artifactId>
<version>1.10.19</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock</artifactId>
<version>1.6.2</version>
<scope>test</scope>
<type>pom</type>
</dependency>
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-core</artifactId>
<version>1.6.2</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-module-junit4</artifactId>
<version>1.6.2</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-api-mockito</artifactId>
<version>1.6.2</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>javassist</groupId>
<artifactId>javassist</artifactId>
<version>3.4.GA</version>
<scope>test</scope>
</dependency>
The information you provided is not enough to find out what could be the problem. I would recommend you to run
mvn dependency:tree
It will list all the dependencies that you have in your project. Try to find similar dependencies with different versions.
Another possible solution: Do you have the correct Javassist version (3.18.2) in the classpath?
So I'm using Raptor (I believe that's an in-house version of Eclipse), which forces certain dependency versions to be installed (even without a pom). Even though I only had the right versions in my pom files, Raptor was installing incompatible versions beforehand, which took precedence (at least that's what I believe was happening).
I had to do some forceful version control in my parent pom using dependency management, and that seemed to clear up the issue, though a few more problems followed with libraries not having the correct methods. I ended up resolving those by finding versions of the dependencies (allowed by the company's repository manager) that worked together, with a lot of tinkering to see which versions would play nice. Because of the repository manager I was unable to simply download the latest version of each dependency I needed.
If you are experiencing this problem, make sure your versions are compatible. If you can get the newest version of the dependencies, do that. If not, grab a beer, start with the latest versions offered by your repository manager, and try to find versions that work with each other.
I found the fix. I had to add
<dependency>
<groupId>com.ebay.raptor.core</groupId>
<artifactId>RaptorKernel</artifactId>
<exclusions>
<exclusion>
<artifactId>javassist</artifactId>
<groupId>javassist</groupId>
</exclusion>
</exclusions>
</dependency>
RaptorKernel could be replaced by any artifact that brings in a different javassist version (other than 3.18.2).
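To verify that the javassist version on the test classpath really has the method from the failure trace, a reflection probe can be used. This is a sketch, not part of PowerMock or javassist: MethodProbe and hasMethod are made-up names, and only Class.forName and Class.getMethod from the JDK are used.

```java
// Hypothetical helper: checks whether a public method exists on a class
// as seen by the current classpath.
final class MethodProbe {

    // Returns true if the class can be loaded and exposes a public method
    // with the given name and parameter types (inherited methods included).
    static boolean hasMethod(String className, String methodName, Class<?>... parameterTypes) {
        try {
            Class<?> type = Class.forName(className, false, MethodProbe.class.getClassLoader());
            type.getMethod(methodName, parameterTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The method named in the NoSuchMethodError from the question:
        // javassist.CtMethod.hasAnnotation(Ljava/lang/Class;)Z
        boolean found = hasMethod("javassist.CtMethod", "hasAnnotation", Class.class);
        System.out.println("CtMethod.hasAnnotation(Class) available: " + found);
    }
}
```

Running it with the IDE's test classpath and again with the Maven test classpath should show whether the two environments resolve different javassist versions.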

Errors in using Datanucleus Rest API

When I do an HTTP POST to a DataNucleus REST resource on my local App Engine development server, the server throws this error:
Caused by: java.lang.ClassNotFoundException: org.datanucleus.NucleusContext
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at com.google.appengine.tools.development.IsolatedAppClassLoader.loadClass(IsolatedAppClassLoader.java:176)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
... 37 more
From what it seems, using the DataNucleus REST API on App Engine is not permitted by the platform?
EDIT:
Datanucleus in the pom:
<dependency>
<groupId>com.google.appengine.orm</groupId>
<artifactId>datanucleus-appengine</artifactId>
<version>2.0.0</version>
</dependency>
<dependency>
<groupId>org.datanucleus</groupId>
<artifactId>datanucleus-core</artifactId>
<version>3.0.0-release</version>
</dependency>
<dependency>
<groupId>org.datanucleus</groupId>
<artifactId>datanucleus-api-jpa</artifactId>
<version>3.0.0-release</version>
</dependency>
<dependency>
<groupId>org.apache.geronimo.specs</groupId>
<artifactId>geronimo-jpa_2.0_spec</artifactId>
<version>1.0</version>
</dependency>
<!-- Required by Datanucleus REST API -->
<dependency>
<groupId>org.datanucleus</groupId>
<artifactId>datanucleus-rest</artifactId>
<version>2.0.0-release</version>
</dependency>
<dependency>
<groupId>net.sf.flexjson</groupId>
<artifactId>flexjson</artifactId>
<version>2.1</version>
</dependency>
<dependency>
<groupId>org.datanucleus</groupId>
<artifactId>datanucleus-json</artifactId>
<version>2.0.0-release</version>
</dependency>
Here are the DN-related jars in the WEB-INF/lib folder:
datanucleus-rest-2.0.0-release.jar
datanucleus-json-2.0.0-release.jar
datanucleus-core-3.0.0-release.jar
datanucleus-appengine-2.0.0.jar
datanucleus-api-jpa-3.0.0-release.jar
EDIT:
Fixed the initial problem by using v3.0 DN dependencies.
However, now when I try to access the REST resource from the DN servlet it throws this error:
Error : An error occurred trying to instantiate an instance of the API adapter "org.datanucleus.api.jdo.JDOAdapter" (perhaps you dont have the requisite datanucleus-api-XXX jar in the CLASSPATH?) : {1}
org.datanucleus.exceptions.NucleusUserException: Error : An error occurred trying to instantiate an instance of the API adapter "org.datanucleus.api.jdo.JDOAdapter" (perhaps you dont have the requisite datanucleus-api-XXX jar in the CLASSPATH?) : {1}
So you are using some version of datanucleus-api-rest (presumably 3.x) and you don't have the requisite version of datanucleus-core (also 3.x) present. That is normally what a ClassNotFoundException means.
