Hadoop example not working after installed - java

Hi I recently installed hadoop 2.7.2 in a distributed mode, with the namenode being hadoop and datanode being hadoop1 and hadoop2. When I do yarn jar /usr/local/hadoop/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar pi 2 1000 in bash, it gives me error message like:
Number of Maps = 2
Samples per Map = 1000
java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "benji/192.168.1.4"; destination host is: "hadoop":9000;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
at org.apache.hadoop.ipc.Client.call(Client.java:1479)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2108)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:278)
at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:99)
at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:498)
at com.google.protobuf.GeneratedMessage.parseUnknownField(GeneratedMessage.java:193)
at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.<init>(RpcHeaderProtos.java:2207)
at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.<init>(RpcHeaderProtos.java:2165)
at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$1.parsePartialFrom(RpcHeaderProtos.java:2295)
at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto$1.parsePartialFrom(RpcHeaderProtos.java:2290)
at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:200)
at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:3167)
at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1085)
at org.apache.hadoop.ipc.Client$Connection.run(Client.java:979)
And if I do hadoop jar /usr/local/hadoop/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar pi 2 1000, it gives error message like:
Number of Maps = 2
Samples per Map = 1000
java.io.IOException: Failed on local exception: com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.; Host Details : local host is: "hadoop/192.168.1.4"; destination host is: "hadoop":9000;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
... blabla ...
Notice the weird mysterious difference between the two error messages lies in local host name (one is benji/192.168.1.4 and the other is hadoop/192.168.1.4). I do start-dfs.sh, and start-yarn.sh before the yarn jar ..., all look well.
I will be very much appreciate if anyone can help to figure out the problem. Here are contents of some configuration files:
/etc/hosts file (benji is the non-hadoop account on the master computer):
192.168.1.4 hadoop benji
192.168.1.5 hadoop1
192.168.1.9 hadoop2
/etc/hostname file:
hadoop
~/.ssh/config file:
# hadoop1
Host hadoop1
HostName 192.168.1.5
User hadoop1
IdentityFile ~/.ssh/hadoopid
# hadoop2
Host hadoop2
HostName 192.168.1.9
User hadoop2
IdentityFile ~/.ssh/hadoopid
# hadoop localhost
Host localhost
HostName localhost
User hadoop
IdentityFile ~/.ssh/hadoopid
# hadoop
Host hadoop
HostName hadoop
User hadoop
IdentityFile ~/.ssh/hadoopid
core-site.xml file:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>file:///usr/local/hadoop/hadoopdata/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
</configuration>
hdfs-site.xml file:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/local/hadoop/hadoopdata/dfs/namenode</value>
</property>
<property>
<name>dfs.datanode.du.reserved</name>
<value>21474836480</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/local/hadoop/hadoopdata/dfs/datanode</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property> </configuration>
Could anyone help on this issue? Thanks!
UPDATE 1
I figured out part of the problem. I did jps and found datanode and namenode was not running. After netstat -an | grep 9000 and lsof -i :9000 I found that another process is listening the port 9000. The namenode was able to run after I changed fs.defaultFS from hdfs://hadoop:9000 to hdfs://hadoop:9001 in the core-site.xml file, and dfs.namenode.secondary.http-address from hadoop:9001 to hadoop:9002 in hdfs-site.xml. The protocol-buffer error message disappeared after this change. But the datanodes were still not running according to the result of jps.
The datanode log file shows something weird happening:
... blabla ...
2016-05-19 12:27:12,157 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop/192.168.
1.4:9000. Already tried 44 time(s); maxRetries=45
2016-05-19 12:27:32,158 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to se
rver: hadoop/192.168.1.4:9000
... blabla ...
2016-05-19 13:41:55,382 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: RECEIVED SIGNAL 15: SIGTERM
2016-05-19 13:41:55,387 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
... blabla ...
I do not understand why the datanode tries to connect to the namenode on port 9000.

You should have a configured Hadoop package installed in all your slaves, only changing the fs.defaultFS in namenode to hdfs://hadoop:9001 but not the datanodes will yield the datanodes try to connect to hdfs://hadoop:900 as it's stated in their core-site.xml.

Related

Kerberos and yarn nodemanager fail to connecting resourcemanager. Failed to specify server's Kerberos principal name

I`m trying to run my cluster with Kerberos. Before hdfs, yarn and spark worked correctly. After setting up kerberos, I can only run hdfs because yarn is going to crush after 15 minutes with an error. I have tried different configurations with no result.
Master node do not have any logs about slave node. Nodemanager run just for 15 minutes but do not show on yarn master list.
I do not understand why you can use kinit or hdfs run with no problem, but yarn seems to not connect the resource manager.
Log:
[...]
2020-12-02 22:23:07,532 INFO org.eclipse.jetty.server.AbstractConnector: Started ServerConnector#333cb916{HTTP/1.1,[http/1.1]}{0.0.0.0:8042}
2020-12-02 22:23:07,532 INFO org.eclipse.jetty.server.Server: Started #3804ms
2020-12-02 22:23:07,532 INFO org.apache.hadoop.yarn.webapp.WebApps: Web app node started at 8042
2020-12-02 22:23:07,534 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Node ID assigned is : node1:46673
2020-12-02 22:23:07,537 INFO org.apache.hadoop.util.JvmPauseMonitor: Starting JVM pause monitor
2020-12-02 22:23:07,541 INFO org.apache.hadoop.yarn.client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at master.ar.com/192.168.46.100:8031
2020-12-02 22:23:07,567 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Sending out 0 NM container statuses: []
2020-12-02 22:23:07,576 INFO org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Registering with RM using containers :[]
2020-12-02 22:33:05,974 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Cache Size Before Clean: 0, Total Deleted: 0, Public Deleted: 0, Private Deleted: 0
2020-12-02 22:38:07,867 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Unexpected error starting NodeStatusUpdater
java.io.IOException: DestHost:destPort master.ar.com:8031 , LocalHost:localPort node1.ar.com/192.168.46.101:0. Failed on local exception: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at java.base/jdk.internal.reflect.GeneratedConstructorAccessor23.newInstance(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:837)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:812)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1566)
at org.apache.hadoop.ipc.Client.call(Client.java:1508)
at org.apache.hadoop.ipc.Client.call(Client.java:1405)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:234)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:119)
at com.sun.proxy.$Proxy76.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:74)
at jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy77.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:416)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:274)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:122)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:963)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1042)
Caused by: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:884)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
at org.apache.hadoop.ipc.Client.call(Client.java:1452)
... 21 more
Caused by: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:333)
at org.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:240)
at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:166)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:392)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:622)
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:413)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:822)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:818)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1845)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:818)
... 24 more
2020-12-02 22:38:07,870 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in state STARTED
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: DestHost:destPort master.ar.com:8031 , LocalHost:localPort node1.ar.com/192.168.46.101:0. Failed on local exception: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:280)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:122)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:963)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1042)
Caused by: java.io.IOException: DestHost:destPort master.ar.com:8031 , LocalHost:localPort node1.ar.com/192.168.46.101:0. Failed on local exception: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at java.base/jdk.internal.reflect.GeneratedConstructorAccessor23.newInstance(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:837)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:812)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1566)
at org.apache.hadoop.ipc.Client.call(Client.java:1508)
at org.apache.hadoop.ipc.Client.call(Client.java:1405)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:234)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:119)
at com.sun.proxy.$Proxy76.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:74)
at jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy77.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:416)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:274)
... 5 more
Caused by: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:884)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
at org.apache.hadoop.ipc.Client.call(Client.java:1452)
... 21 more
Caused by: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:333)
at org.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:240)
at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:166)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:392)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:622)
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:413)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:822)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:818)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1845)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:818)
... 24 more
2020-12-02 22:38:07,871 INFO org.apache.hadoop.service.AbstractService: Service NodeManager failed in state STARTED
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: DestHost:destPort master.ar.com:8031 , LocalHost:localPort node1.ar.com/192.168.46.101:0. Failed on local exception: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:280)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:122)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:963)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1042)
Caused by: java.io.IOException: DestHost:destPort master.ar.com:8031 , LocalHost:localPort node1.ar.com/192.168.46.101:0. Failed on local exception: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at java.base/jdk.internal.reflect.GeneratedConstructorAccessor23.newInstance(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:837)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:812)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1566)
at org.apache.hadoop.ipc.Client.call(Client.java:1508)
at org.apache.hadoop.ipc.Client.call(Client.java:1405)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:234)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:119)
at com.sun.proxy.$Proxy76.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:74)
at jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy77.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:416)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:274)
... 5 more
Caused by: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:884)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
at org.apache.hadoop.ipc.Client.call(Client.java:1452)
... 21 more
Caused by: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:333)
at org.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:240)
at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:166)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:392)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:622)
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:413)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:822)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:818)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1845)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:818)
... 24 more
2020-12-02 22:38:07,877 INFO org.eclipse.jetty.server.handler.ContextHandler: Stopped o.e.j.w.WebAppContext#6fa02932{node,/,null,UNAVAILABLE}{jar:file:/home/hadoop/hadoop-3.3.0/share/hadoop/yarn/hadoop-yarn-common-3.3.0.jar!/webapps/node}
2020-12-02 22:38:07,880 INFO org.eclipse.jetty.server.AbstractConnector: Stopped ServerConnector#333cb916{HTTP/1.1,[http/1.1]}{0.0.0.0:8042}
2020-12-02 22:38:07,880 INFO org.eclipse.jetty.server.session: node0 Stopped scavenging
2020-12-02 22:38:07,881 INFO org.eclipse.jetty.server.handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler#aac3f4e{static,/static,jar:file:/home/hadoop/hadoop-3.3.0/share/hadoop/yarn/hadoop-yarn-common-3.3.0.jar!/webapps/static,UNAVAILABLE}
2020-12-02 22:38:07,881 INFO org.eclipse.jetty.server.handler.ContextHandler: Stopped o.e.j.s.ServletContextHandler#2b4786dd{logs,/logs,file:///home/hadoop/hadoop-3.3.0/logs/,UNAVAILABLE}
2020-12-02 22:38:07,884 INFO org.apache.hadoop.ipc.Server: Stopping server on 46673
2020-12-02 22:38:07,895 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 0
2020-12-02 22:38:07,895 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2020-12-02 22:38:07,896 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl is interrupted. Exiting.
2020-12-02 22:38:07,911 INFO org.apache.hadoop.ipc.Server: Stopping server on 8040
2020-12-02 22:38:07,912 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 8040
2020-12-02 22:38:07,915 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2020-12-02 22:38:07,916 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: Public cache exiting
2020-12-02 22:38:07,917 WARN org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl: org.apache.hadoop.yarn.server.nodemanager.NodeResourceMonitorImpl is interrupted. Exiting.
2020-12-02 22:38:07,920 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NodeManager metrics system...
2020-12-02 22:38:07,921 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system stopped.
2020-12-02 22:38:07,921 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NodeManager metrics system shutdown complete.
2020-12-02 22:38:07,921 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: DestHost:destPort master.ar.com:8031 , LocalHost:localPort node1.ar.com/192.168.46.101:0. Failed on local exception: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:280)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:122)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:963)
at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1042)
Caused by: java.io.IOException: DestHost:destPort master.ar.com:8031 , LocalHost:localPort node1.ar.com/192.168.46.101:0. Failed on local exception: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at java.base/jdk.internal.reflect.GeneratedConstructorAccessor23.newInstance(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:837)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:812)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1566)
at org.apache.hadoop.ipc.Client.call(Client.java:1508)
at org.apache.hadoop.ipc.Client.call(Client.java:1405)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:234)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:119)
at com.sun.proxy.$Proxy76.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:74)
at jdk.internal.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy77.registerNodeManager(Unknown Source)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:416)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:274)
... 5 more
Caused by: java.io.IOException: Couldn't set up IO streams: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:884)
at org.apache.hadoop.ipc.Client$Connection.access$3800(Client.java:413)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1636)
at org.apache.hadoop.ipc.Client.call(Client.java:1452)
... 21 more
Caused by: java.lang.IllegalArgumentException: Failed to specify server's Kerberos principal name
at org.apache.hadoop.security.SaslRpcClient.getServerPrincipal(SaslRpcClient.java:333)
at org.apache.hadoop.security.SaslRpcClient.createSaslClient(SaslRpcClient.java:240)
at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:166)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:392)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:622)
at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:413)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:822)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:818)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1845)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:818)
... 24 more
2020-12-02 22:38:07,926 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at node1.ar.com/192.168.46.101
************************************************************/
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master.ar.com:9000</value>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/hdfs/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/hdfs/datanode</value>
</property>
<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
</property>
<property>
<name>dfs.client.read.shortcircuit</name>
<value>true</value>
</property>
<property>
<name>dfs.datanode.data.dir.perm</name>
<value>700</value>
</property>
<property>
<name>dfs.datanode.address</name>
<value>0.0.0.0:1004</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:1006</value>
</property>
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>hdfs/node1.ar.com#AR.COM</value>
</property>
<property>
<name>dfs.datanode.kerberos.http.principal</name>
<value>HTTP/node1.ar.com#AR.COM</value>
</property>
<property>
<name>dfs.datanode.keytab.file</name>
<value>/etc/security/keytab/hdfs.service.keytab</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master.ar.com</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.principal</name>
<value>yarn/node1.ar.com#AR.COM</value>
</property>
<property>
<name>yarn.nodemanager.keytab</name>
<value>/etc/security/keytab/yarn.service.keytab</value>
</property>
<property>
<name>yarn.nodemanager.container-executor.class</name>
<value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
<name>yarn.nodemanager.linux-container-executor.group</name>
<value>yarn</value>
</property>
</configuration>
Looks like you are missing yarn.resourcemanager.principal. Try adding below configuration on NodeManager's yarn-site.xml.
<property>
<name>yarn.resourcemanager.principal</name>
<value>yarn/master.ar.com#AR.COM</value>
</property>

Hadoop: There are 0 datanodes running and no nodes & cannot connect to namenode

I'm having trouble setting up Hadoop. My setup consists of a nameNode VM and two seperate physical dataNodes that are connected to the same network.
IP configuration:
192.168.118.212 namenode-1
192.168.118.217 datanode-1
192.168.118.216 datanode-2
I keep getting the error that there are 0 datanodes running, but when I do JPS on my dataNode-1 machine or dataNode-2 machine, it shows up as running.
My nameNode log shows this:
File /user/hadoop/.bashrc_COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s)
are excluded in this operation.
The logs on my dataNode-1 machine tell me that it has trouble connecting to the nameNode.
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: namenode-1/192.168.118.212:9000
Only weird part is that it can't connect, though it can start it? I can also SSH between all of them with no problems.
So my best guess would be that I've configured the one of the config files incorrectly, though I checked other questions on here and they seem to be correct.
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://namenode-1:9000/</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/hadoop_data/hdfs/datanode</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/hadoop_data/hdfs/namenode</value>
<final>true</final>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.job.tracker</name>
<value>namenode-1:9001</value>
</property>
</configuration>
The problem could be the fs.default.name. Try the ip adress as fs.default.name. And check if your /etc/hosts configuration points to the correct ip address. Most likely this is correct, since your datanode figured out the ip address.
The problem could also be the port number! Try 8020 or 50070 instead of 9000 and look what happens.
The problem was the firewall.
You can stop it by running systemctl stop firewalld.service
I found the answer here:
https://stackoverflow.com/a/37994066/8789361

Hadoop namenode not starting, says file:/// has no athority

I am trying to run hadoop 2.6.0 for windows and I am following this guide
https://wiki.apache.org/hadoop/Hadoop2OnWindows
I have everything setup, all the values are set within the correct xml files.
However, I am running to this error when trying to start my namenode
Invalid URI for NameNode address (checkfs.defaultFS): file:/// has no authority.
This is what I have in my core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://0.0.0.0:19000</value>
</property>
</configuration>
What is going on??

eclipse hadoop connection refused

I'm running WordCount example of mapreduce on hadoop on my local.
When I run it with the command:
hadoop jar MRTest.jar example.WordCount /gutenberg /out333
It works.
in the command
MRTest.jar is the jar package which exported from my eclipse.
/gutenberg is the input path on my hdfs.
/out333 is the output path on my hdfs.
but when I run it on eclipse with arguments :
hdfs://localhost:9000/gutenberg hdfs://localhost:9000/output6
it throws the following exception:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Exception in thread "main" java.net.ConnectException: Call From SWD-LIUQIN00-D1/172.26.35.141 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
at org.apache.hadoop.ipc.Client.call(Client.java:1351)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at $Proxy9.getFileInfo(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at $Proxy9.getFileInfo(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:651)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:145)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:456)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:342)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286)
at example.WordCount.main(WordCount.java:81)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:547)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
at org.apache.hadoop.ipc.Client.call(Client.java:1318)
... 28 more
Here is some config on my local:
slaves:
hadoop-slave141
/etc/hosts:
127.0.0.1 hadoop-slave141 localhost
core-site.xml:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/data/hadoop/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000/</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-master:9000</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/data/hadoop/dfs/hadoop2/name</value>
<description>Path on the local filesystem where the NameNode stores the namespace and transactions logs persistently.</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/data/hadoop/dfs/hadoop2/data</value>
<description>Comma separated list of paths on the local filesystem of a DataNode where it should store its blocks.</description>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop-master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop-master:19888</value>
</property>
<property>
<name>mapred.job.tracker</name> //JobTracker的主机(或者IP)和端口。
<value>hdfs://hadoop-master:9001</value>
</property>
</configuration>
yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop-master:8031</value>
<description>host is the hostname of the resource manager and port is the port on which the NodeManagers contact the Resource Manager. </description>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop-master:8030</value>
<description>host is the hostname of the resourcemanager and port is the port on which the Applications in the cluster talk to the Resource Manager. </description>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
<description>In case you do not want to use the default scheduler</description>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>0.0.0.0:8032</value>
<description>the host is the hostname of the ResourceManager and the port is the port on which the clients can talk to the Resource Manager. </description>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>${hadoop.tmp.dir}/nodemanager/local</value>
<description>the local directories used by the nodemanager</description>
</property>
<property>
<name>yarn.nodemanager.address</name>
<value>0.0.0.0:8034</value>
<description>the nodemanagers bind to this port</description>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>${hadoop.tmp.dir}/nodemanager/remote</value>
<description>directory on hdfs where the application logs are moved to </description>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>${hadoop.tmp.dir}/nodemanager/logs</value>
<description>the directories used by Nodemanagers as log directories</description>
</property>
<!-- Use mapreduce_shuffle instead of mapreduce.suffle (YARN-1229)-->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
hadoop fs -ls / :
14/04/22 10:24:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 9 items
-rw-r--r-- 3 hadoop supergroup 1195054 2014-04-18 08:27 /gutenberg
drwxr-xr-x - hadoop supergroup 0 2014-04-21 20:38 /out222
drwxr-xr-x - hadoop supergroup 0 2014-04-22 09:55 /out333
drwxr-xr-x - hadoop supergroup 0 2014-04-18 08:45 /output
drwxr-xr-x - hadoop supergroup 0 2014-04-18 11:23 /output2
drwxr-xr-x - hadoop supergroup 0 2014-04-18 11:28 /output3
drwxr-xr-x - hadoop supergroup 0 2014-04-18 13:38 /output3jps
drwxr-xr-x - hadoop supergroup 0 2014-04-21 17:48 /outt
drwx------ - hadoop supergroup 0 2014-04-18 08:44 /tmp
jps:
11149 Jps
7762 ResourceManager
8074 NodeManager
6901 NameNode
7204 DataNode
7565 SecondaryNameNode
I tested both eclipse helios and also kelper.
hadoop version is 2.2.0.
I am also wondering what is the diff between running with the command line and in eclipse.
I use hadoop user which is the user of my hadoop from start to end.
And Eclipse is also launched by hadoop user.
Somebody please help me, I 've stucked on this for 3 days.
Actually, you can use the HDFS path as argument.
Try to change the value of fs.defalutFS in "core-site.xml" to localhost:
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
And then restart the Hadoop.

Cannot initialize cluster exception while running job on Hadoop 2

The question is linked to my previous question All the daemons are running, jps shows:
6663 JobHistoryServer
7213 ResourceManager
9235 Jps
6289 DataNode
6200 NameNode
7420 NodeManager
but the wordcount example keeps on failing with the following exception:
ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1238)
at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1234)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:1233)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1262)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286)
at WordCount.main(WordCount.java:80)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Since it says the problem is in configuration, I am posting the configuration files here. The intention is to create a single node cluster.
yarn-site.xml
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hduser/yarn/yarn_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hduser/yarn/yarn_data/hdfs/datanode</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>Yarn</value>
</property>
</configuration>
Please tell what is missing or what am I doing wrong.
I was having similar issue, but yarn was not the issue.
After adding following jars into my classpath issue got resolved:
hadoop-mapreduce-client-jobclient-2.2.0.2.0.6.0-76
hadoop-mapreduce-client-common-2.2.0.2.0.6.0-76
hadoop-mapreduce-client-shuffle-2.2.0.2.0.6.0-76
You have uppercased Yarn, which is probably why it can not resolve it. Try the lowercase version that is suggested in the official documentation.
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Looks like i had a lucky day and went with this exception through 'all' of those causes. Summary:
wrong mapreduce.framework.name (see above)
missing mapreduce job-client jars (see above)
wrong version (see Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses-submiting job2remoteClustr )
my configured 'yarn.ipc.client.factory.class' wasn't in the classpath of the yarn server's (just on the client)
In my case i was trying to use sqoop and ran into this error.
Turns out that i was pointing to the latest version of hadoop 2.0 available from CDH repo for which sqoop was not supported.
The version of cloudera was 2.0.0-cdh4.4.0 which had yarn support build in.
When i used 2.0.0-cdh4.4.0 under hadoop-0.20 the problem went away.
Hope this helps.
changing mapreduce_shuffle to mapreduce.shuffle made it work in my case

Categories