I am trying to put a jar file in hadoop but getting the hadoop namenode -format
enter image description here as you can see jar file is already created
[cloudera#quickstart ~]$ hdfs dfs -put AverageWordCount.jar /user/cloudera
21/10/14 01:27:08 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/cloudera/AverageWordCount.jar._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1541)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3286)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:667)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:212)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:483)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy14.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy15.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1544)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1361)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:600)
put: File /user/cloudera/AverageWordCount.jar._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
Need to first make sure the Namenode is in safe mode to check that below is the steps:
To know the status of Safemode, use command:
hadoop dfsadmin –safemode get
To enter Safemode, use command:
bin/hadoop dfsadmin –safemode enter
To come out of Safemode, use command:
hadoop dfsadmin -safemode leave
After that restart the hdfs services. And try again to Put the .jar file again in the Hadoop
Related
I have hadoop
hadoop#nodo1:/opt/hadoop$ hadoop version Hadoop 2.7.7 Subversion
Unknown -r c1aad84bd27cd79c3d1a7dd58202a8c3ee1ed3ac Compiled by stevel
on 2018-07-18T22:47Z Compiled with protoc 2.5.0 From source with
checksum 792e15d20b12c74bd6f19a1fb886490 This command was run using
/opt/hadoop/share/hadoop/common/hadoop-common-2.7.7.jar
And as learned in a course, I use
/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount
But when i run this, the next error is shown:
hadoop#nodo1:/opt/hadoop$ hadoop jar
/opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar
wordcount /libros /output3
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory
hdfs://nodo1:9000/output3 already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:266)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:139)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:87)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
at org.apache.hadoop.util.RunJar.main(RunJar.java:141)
In the path i have a book
hadoop#nodo1:/opt/hadoop$ hdfs dfs -ls /libros/ Found 1 items
-rw-r--r-- 1 hadoop supergroup 2198927 2018-11-02 10:22 /libros/quijote.txt
TNK from your help
First do
hdfs dfs -ls /output3
If there is a file then,
Either Delete, Output directory hdfs://nodo1:9000/output3 or,
Use different file name
# Change output3 to output4
hadoop jar /opt/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.7.jar wordcount /libros /output4
I am trying to import data from mysql table to hdfs. I am using the below sqoop import command
sqoop import --connect jdbc:mysql://localhost:3306/employee --username root --password *** --table Emp --m 1
I am getting the below error
16/05/07 20:01:18 ERROR tool.ImportTool: Encountered IOException running import job: java.io.FileNotFoundException: File does not exist: hdfs://localhost:54310/usr/lib/sqoop/lib/parquet-format-2.0.0.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:93)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:269)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:390)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:483)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1314)
at org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:196)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:169)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:266)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
I have the parquet-format-2.0.0.jar at usr/lib/sqoop folder, but even then it showing the error.
I tried to import all the sqoop lib to the hdfs but then i can do that its throwing the below error
16/05/07 18:40:11 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /usr/lib/sqoop/lib/xz-1.0.jar.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at
What can do now? I can't copy the jar files to the HDFS and also I can't import the data to HDFS form MySQL.
I tried this solution
sqoop import eror - File does not exist:
but can't proceed from the second step. I also cleared the cache and restarted the Hadoop file system.
Thanks
For datanode replication error, following may be the reasons:
Datanodes doesn’t have enough disk space to store the blocks
Namenode can not reach Datanode(s) or Datanode(s) could be down/unavailable
Make sure the connectivity between Namenode and Datanode(s) and also the Datanode(s) have sufficient space to store the new blocks.
In case, if there is enough disk space, you need to reformat your nameNode.
I have configured gridgain-hadoop-os-6.6.2.zip, and followed steps as mentioned in docs/hadoop_readme.pdf . started gridgain using bin/ggstart.sh command, now am running a simple wordcount code in gridgain with hadoop-2.2.0. using command
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/*-mapreduce-examples-*.jar wordcount /input /output
Steps I tried:
Step 1:
Extracted hadoop-2.2.0 and gridgain-hadoop-os-6.6.2.zip file in usr/local folder and changed name for gridgain folder as "gridgain".
Step 2:
Set the path for export GRIDGAIN_HOME=/usr/local/gridgain.. and path for hadoop-2.2.0 with JAVA_HOME as
# Set Hadoop-related environment variables
export HADOOP_PREFIX=/usr/local/hadoop-2.2.0
export HADOOP_HOME=/usr/local/hadoop-2.2.0
export HADOOP_MAPRED_HOME=/usr/local/hadoop-2.2.0
export HADOOP_COMMON_HOME=/usr/local/hadoop-2.2.0
export HADOOP_HDFS_HOME=/usr/local/hadoop-2.2.0
export YARN_HOME=/usr/local/hadoop-2.2.0
export HADOOP_CONF_DIR=/usr/local/hadoop-2.2.0/etc/hadoop
export GRIDGAIN_HADOOP_CLASSPATH='/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*'
Step 3:
now i run command as bin/setup-hadoop.sh ... answer Y to every prompt.
Step 4:
started gridgain using command
bin/ggstart.sh
Step 5:
now i created dir and uploaded file using :
hadoop fs -mkdir /input
hadoop fs -copyFromLocal $HADOOP_HOME/README.txt /input/WORD_COUNT_ME.
txt
Step 6:
Running this command gives me error:
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/*-mapreduce-examples-*.
jar wordcount /input /output
Getting following error:
15/02/22 12:49:13 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
15/02/22 12:49:13 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_091ebfbd-2993-475f-a506-28280dbbf891_0002
15/02/22 12:49:13 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hduser/.staging/job_091ebfbd-2993-475f-a506-28280dbbf891_0002
java.lang.NullPointerException
at org.gridgain.client.hadoop.GridHadoopClientProtocol.processStatus(GridHadoopClientProtocol.java:329)
at org.gridgain.client.hadoop.GridHadoopClientProtocol.submitJob(GridHadoopClientProtocol.java:115)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:430)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286)
at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
and gridgain console error as:
sLdrId=a0b8610bb41-091ebfbd-2993-475f-a506-28280dbbf891, userVer=0, loc=true, sampleClsName=java.lang.String, pendingUndeploy=false, undeployed=false, usage=0]], taskClsName=o.g.g.kernal.processors.hadoop.proto.GridHadoopProtocolSubmitJobTask, sesId=e129610bb41-091ebfbd-2993-475f-a506-28280dbbf891, startTime=1424589553332, endTime=9223372036854775807, taskNodeId=091ebfbd-2993-475f-a506-28280dbbf891, clsLdr=sun.misc.Launcher$AppClassLoader#1bdcbb2, closed=false, cpSpi=null, failSpi=null, loadSpi=null, usage=1, fullSup=false, subjId=091ebfbd-2993-475f-a506-28280dbbf891], jobId=f129610bb41-091ebfbd-2993-475f-a506-28280dbbf891]]
java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/JobContext
at java.lang.Class.getDeclaredConstructors0(Native Method)
at java.lang.Class.privateGetDeclaredConstructors(Class.java:2585)
at java.lang.Class.getConstructor0(Class.java:2885)
at java.lang.Class.getConstructor(Class.java:1723)
at org.gridgain.grid.hadoop.GridHadoopDefaultJobInfo.createJob(GridHadoopDefaultJobInfo.java:107)
at org.gridgain.grid.kernal.processors.hadoop.jobtracker.GridHadoopJobTracker.job(GridHadoopJobTracker.java:959)
at org.gridgain.grid.kernal.processors.hadoop.jobtracker.GridHadoopJobTracker.submit(GridHadoopJobTracker.java:222)
at org.gridgain.grid.kernal.processors.hadoop.GridHadoopProcessor.submit(GridHadoopProcessor.java:188)
at org.gridgain.grid.kernal.processors.hadoop.GridHadoopImpl.submit(GridHadoopImpl.java:73)
at org.gridgain.grid.kernal.processors.hadoop.proto.GridHadoopProtocolSubmitJobTask.run(GridHadoopProtocolSubmitJobTask.java:54)
at org.gridgain.grid.kernal.processors.hadoop.proto.GridHadoopProtocolSubmitJobTask.run(GridHadoopProtocolSubmitJobTask.java:37)
at org.gridgain.grid.kernal.processors.hadoop.proto.GridHadoopProtocolTaskAdapter$Job.execute(GridHadoopProtocolTaskAdapter.java:95)
at org.gridgain.grid.kernal.processors.job.GridJobWorker$2.call(GridJobWorker.java:484)
at org.gridgain.grid.util.GridUtils.wrapThreadLoader(GridUtils.java:6136)
at org.gridgain.grid.kernal.processors.job.GridJobWorker.execute0(GridJobWorker.java:478)
at org.gridgain.grid.kernal.processors.job.GridJobWorker.body(GridJobWorker.java:429)
at org.gridgain.grid.util.worker.GridWorker.run(GridWorker.java:151)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: Failed to load class: org.apache.hadoop.mapreduce.JobContext
at org.gridgain.grid.kernal.processors.hadoop.GridHadoopClassLoader.loadClass(GridHadoopClassLoader.java:125)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 20 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.JobContext
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at org.gridgain.grid.kernal.processors.hadoop.GridHadoopClassLoader.loadClassExplicitly(GridHadoopClassLoader.java:196)
at org.gridgain.grid.kernal.processors.hadoop.GridHadoopClassLoader.loadClass(GridHadoopClassLoader.java:106)
... 21 more
^[[B
Help here....
Edited Here :
raj#ubuntu:~$ hadoop classpath
/usr/local/hadoop-2.2.0/etc/hadoop:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/common/*:/usr/local/hadoop-2.2.0/share/hadoop/hdfs:/usr/local/hadoop-2.2.0/share/hadoop/hdfs/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/hdfs/*:/usr/local/hadoop-2.2.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/yarn/*:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce/*:/usr/local/hadoop-2.2.0/contrib/capacity-scheduler/*.jar
raj#ubuntu:~$ jps
3529 GridCommandLineStartup
3646 Jps
raj#ubuntu:~$ echo $GRIDGAIN_HOME
/usr/local/gridgain
raj#ubuntu:~$ echo $HADOOP_HOME
/usr/local/hadoop-2.2.0
raj#ubuntu:~$ hadoop version
Hadoop 2.2.0
Subversion https://svn.apache.org/repos/asf/hadoop/common -r 1529768
Compiled by hortonmu on 2013-10-07T06:28Z
Compiled with protoc 2.5.0
From source with checksum 79e53ce7994d1628b240f09af91e1af4
This command was run using /usr/local/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar
raj#ubuntu:~$ cd /usr/local/hadoop-2.2.0/share/hadoop/mapreduce
raj#ubuntu:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce$ ls
hadoop-mapreduce-client-app-2.2.0.jar hadoop-mapreduce-client-hs-2.2.0.jar hadoop-mapreduce-client-jobclient-2.2.0-tests.jar lib
hadoop-mapreduce-client-common-2.2.0.jar hadoop-mapreduce-client-hs-plugins-2.2.0.jar hadoop-mapreduce-client-shuffle-2.2.0.jar lib-examples
hadoop-mapreduce-client-core-2.2.0.jar hadoop-mapreduce-client-jobclient-2.2.0.jar hadoop-mapreduce-examples-2.2.0.jar sources
raj#ubuntu:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce$
I configured exactly the versions you mentioned (gridgain-hadoop-os-6.6.2.zip + hadoop-2.2.0) -- the "wordcount" sample works fine.
[UPD after question's author log analysis:]
Raju, thanks for the detailed logs.
The cause of the problem are incorrectly set env variables
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
You explicitly set all these variables to ${HADOOP_HOME} value, what is wrong. This causes GG to compose incorrect hadoop classpath, as seen from the below GG Node log:
+++ HADOOP_PREFIX=/usr/local/hadoop-2.2.0
+++ [[ -z /usr/local/hadoop-2.2.0 ]]
+++ '[' -z /usr/local/hadoop-2.2.0 ']'
+++ HADOOP_COMMON_HOME=/usr/local/hadoop-2.2.0
+++ HADOOP_HDFS_HOME=/usr/local/hadoop-2.2.0
+++ HADOOP_MAPRED_HOME=/usr/local/hadoop-2.2.0
+++ GRIDGAIN_HADOOP_CLASSPATH='/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*'
So, to fix the issue please don't set unnecessary env variables. JAVA_HOME and HADOOP_HOME is quite enough, nothing else is needed.
many thnaks to Ivan, thanks for your help and support, the solution you gave was good to get me out of the problem.
The issue was not to set other hadoop related environment variables. this is enough to set.
JAVA_HOME , HADOOP_HOME and GRIDGAIN_HOME
I have installed hadoop using following points.
Installed hadoop using tar file
created hdfs user and group and assigned them to hadoop folder
then created hdfs directories for namenode and datanode in /opt folder
Configuration files are also set.
But when i tried to run hadoop jar hadoop-examples-1.0.0.jar pi 4 100 I am getting this error.
2014-11-05 12:12:02,978 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
2014-11-05 12:12:02,978 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/tmp/hadoop-hdfs/mapred/system/jobtracker.info" - Aborting...
2014-11-05 12:12:02,979 WARN org.apache.hadoop.mapred.JobTracker: Writing to file hdfs://hostname:9000/tmp/hadoop-hdfs/mapred/system/jobtracker.info failed!
2014-11-05 12:12:02,979 WARN org.apache.hadoop.mapred.JobTracker: FileSystem is not ready yet!
2014-11-05 12:12:02,982 WARN org.apache.hadoop.mapred.JobTracker: Failed to initialize recovery manager.
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-hdfs/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1556)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1083)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
at org.apache.hadoop.ipc.Client.call(Client.java:1066)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at com.sun.proxy.$Proxy5.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at com.sun.proxy.$Proxy5.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3507)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3370)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2700(DFSClient.java:2586)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2826)
One thing here I want to mention is that I have set hdfs paths to /mnt direcotry but hdfs still pointing to /tmp/hadoop-hdfs
Please give some suggestions.
Check all the paths of the action node you are trying to run,this usually occurs due to wrong input/output paths provided.
Also if you are rerunning a workflow job,make sure all the events and properties provided in coordinator.xml(or job.xml) must be present in job.properties,because rerunning a workflow job doesn't refer to job.xml as against in the case of normal coordinator job run(scheduled running).
I would like to see the following code make a directory in my "/tmp" via hdfs.
I can, for instance, run
hadoop fs -mkdir hdfs://localhost:9000/tmp/newdir
and succeed.
jps lists that namenode, datanode are running.
Hadoop version 0.20.1+169.89.
public static void main(String[] args) throws IOException {
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9000");
FileSystem fs = FileSystem.get(conf);
fs.mkdirs(new Path("hdfs://localhost:9000/tmp/alex"));
}
I get the following error:
Exception in thread "main" java.io.IOException: Failed on local exception: java.io.EOFException; Host Details : local host is: "<my-machine-name>/192.168.2.6"; destination host is: "localhost":9000;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
at org.apache.hadoop.ipc.Client.call(Client.java:1351)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy9.mkdirs(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy9.mkdirs(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:467)
at org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2394)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2365)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:817)
at org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(DistributedFileSystem.java:813)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:813)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:806)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1933)
at com.twitter.amplify.core.dao.AccessHdfs.main(AccessHdfs.java:39)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:375)
at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:995)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:891)
You have a version mismatch - your questions notes a NameNode running version 0.20.1+169.89 (which i think is from Cloudera distro CDH2 - http://archive.cloudera.com/cdh/2/), and in IntelliJ you are using Apache hadoop version 2.2.0.
Update your IntelliJ classpath to use the jars compatible with your cluster version - namely:
hadoop-0.20.1+169.89-core.jar
I had same version of Hadoop(hadoop-2.2.0) installed on my master and slave nodes but still I was getting same exception. To get rid of it I have followed below steps:
1. from $HADOP_HOME execute sbin/stop-all.sh, to stop the cluster
2. delete the data directory from all problematic node. If you dont know where data directory is then open core-site.xml, find the value corresponding to hadoop.tmp.dir, go to that directory, then cd dfs there you will find a directory named data, delete that data directory from all problematic datanodes
3. format the master node
4. from $HADOP_HOME execute sbin/start-all.sh, to start the cluster