Hadoop 2.6 Connecting to ResourceManager at /0.0.0.0:8032 - java

I'm trying to run the following Spark example under Hadoop 2.6, but I get the following error:
INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
and the client enters a loop trying to connect. I'm running a cluster of two machines, one master and one slave.
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--num-executors 3 \
--driver-memory 2g \
--executor-memory 2g \
--executor-cores 1 \
--queue thequeue \
lib/spark-examples*.jar \
10
This is the error I get:
15/12/06 13:38:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/06 13:38:29 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
15/12/06 13:38:30 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/12/06 13:38:31 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/12/06 13:38:32 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/12/06 13:38:33 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/12/06 13:38:34 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
jps output:
hduser@master:/usr/local/spark$ jps
4930 ResourceManager
4781 SecondaryNameNode
5776 Jps
4608 DataNode
5058 NodeManager
4245 Worker
4045 Master
My /etc/hosts:
192.168.0.1 master
192.168.0.2 slave
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes

This error mainly occurs when the hostname is not configured correctly. Check that the hostname is configured correctly and matches the one you specified for the ResourceManager.

I faced the same problem and solved it with the following steps:
1. Start YARN with the command start-yarn.sh.
2. Check that the ResourceManager is running with the command jps.
3. Add the following property to the configuration (yarn-site.xml):
<property>
<name>yarn.resourcemanager.address</name>
<value>127.0.0.1:8032</value>
</property>

I also encountered this issue: I could not submit the Spark job with spark-submit.
The cause was a missing HADOOP_CONF_DIR when launching the Spark job. Whenever you submit the job, set HADOOP_CONF_DIR to the appropriate Hadoop configuration directory, for example:
export HADOOP_CONF_DIR=/etc/hadoop/conf
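To check this before submitting, a small sketch like the one below can help. It only tests whether HADOOP_CONF_DIR is set and whether a yarn-site.xml exists in it. ConfDirCheck and its check method are hypothetical names for illustration and are not part of Spark or Hadoop. When the client does not find a yarn-site.xml, it is left with the default ResourceManager address, which matches the 0.0.0.0:8032 in the log.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of Spark or Hadoop): reports whether a
// configuration directory looks usable by a YARN client.
final class ConfDirCheck {

    // Returns a short diagnosis for the given HADOOP_CONF_DIR value.
    static String check(String confDir) {
        if (confDir == null || confDir.isEmpty()) {
            return "HADOOP_CONF_DIR is not set";
        }
        Path yarnSite = Paths.get(confDir, "yarn-site.xml");
        if (!Files.isRegularFile(yarnSite)) {
            return "yarn-site.xml not found in " + confDir;
        }
        return "found " + yarnSite;
    }

    public static void main(String[] args) {
        // Reads the variable from the environment of the current shell.
        System.out.println(check(System.getenv("HADOOP_CONF_DIR")));
    }
}
```

Run it from the same shell you use for spark-submit, so that it sees the same environment.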

You need to make sure that yarn-site.xml is on the class path and also make sure that the relevant properties are marked with true element.

Similarly, export HADOOP_CONF_DIR=/etc/hadoop/conf solved it in my case, running Flink on YARN with ./bin/yarn-session.sh -n 2 -tm 2000.

yarn.resourcemanager.address is derived from yarn.resourcemanager.hostname, whose default value is 0.0.0.0, so you should configure the hostname explicitly.
From the base of the Hadoop installation, edit the etc/hadoop/yarn-site.xml file and add this property:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
Executing start-yarn.sh again will put your new settings into effect.
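The derivation described above can be sketched as follows. This is a simplified model and not Hadoop's own code: RmAddress and resolve are hypothetical names, and the map stands in for the loaded configuration. It assumes only the documented defaults (hostname 0.0.0.0, port 8032).

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of how the ResourceManager address is derived.
// This is not Hadoop's code; it only mirrors the documented defaults.
final class RmAddress {
    static final String DEFAULT_HOSTNAME = "0.0.0.0"; // default yarn.resourcemanager.hostname
    static final int DEFAULT_PORT = 8032;             // default ResourceManager client port

    static String resolve(Map<String, String> conf) {
        // An explicitly configured address wins.
        String explicit = conf.get("yarn.resourcemanager.address");
        if (explicit != null) {
            return explicit;
        }
        // Otherwise the address is built from the hostname property.
        String host = conf.getOrDefault("yarn.resourcemanager.hostname", DEFAULT_HOSTNAME);
        return host + ":" + DEFAULT_PORT;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(resolve(conf)); // prints 0.0.0.0:8032 (nothing configured)
        conf.put("yarn.resourcemanager.hostname", "master");
        System.out.println(resolve(conf)); // prints master:8032
    }
}
```

With nothing configured, the result is the 0.0.0.0:8032 seen in the client log, which is why setting the hostname alone is enough.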

I had the same problem. In my case the cause was that the clocks on the machines were not in sync, since my ResourceManager is not on the master machine. Even a one-second difference can cause YARN connection problems, and a difference of a few more seconds can prevent the NameNode and DataNode from starting. Use ntpd to configure time synchronization so that the clocks are exactly the same.

Related

HBase 1.2.1 standalone in Docker unable to connect

I want to connect to HBase running in standalone mode in a Docker container, using Java and the HBase API.
I use this code to connect:
Configuration config = HBaseConfiguration.create();
config.set("hbase.zookeeper.quorum", "163.172.142.199");
config.set("hbase.zookeeper.property.clientPort","2181");
HBaseAdmin.checkHBaseAvailable(config);
Here is my /etc/hosts file
127.0.0.1 localhost
XXX.XXX.XXX.XXX hbase-srv
Here is the /etc/hosts file from my Docker container (named hbase-srv)
XXX.XXX.XXX.XXX hbase-srv
With this configuration, I get a connection refused error :
INFO | Initiating client connection, connectString=163.172.142.199:2181 sessionTimeout=90000 watcher=hconnection-0x6aba2b860x0, quorum=163.172.142.199:2181, baseZNode=/hbase
INFO | Opening socket connection to server 163.172.142.199/163.172.142.199:2181. Will not attempt to authenticate using SASL (unknown error)
INFO | Socket connection established to 163.172.142.199/163.172.142.199:2181, initiating session
INFO | Session establishment complete on server 163.172.142.199/163.172.142.199:2181, sessionid = 0x15602f8d8dc0002, negotiated timeout = 40000
INFO | Closing zookeeper sessionid=0x15602f8d8dc0002
INFO | Session: 0x15602f8d8dc0002 closed
INFO | EventThread shut down
org.apache.hadoop.hbase.MasterNotRunningException: com.google.protobuf.ServiceException: java.net.ConnectException: Connection refused
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1560)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1580)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1737)
at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.isMasterRunning(ConnectionManager.java:948)
at org.apache.hadoop.hbase.client.HBaseAdmin.checkHBaseAvailable(HBaseAdmin.java:3159)
at hbase.Benchmark.main(Benchmark.java:26)
However, if I remove the XXX.XXX.XXX.XXX hbase-srv lines from both /etc/hosts files, I get the error unknown host: hbase-srv.
I have also checked that I can successfully telnet to my HBase Docker container on the client port.
In the container, all the ports used by HBase are open and bound to the same numbers (60000 to 60000, 2181 to 2181, etc.).
I should add that everything worked fine when I used this configuration on localhost.
If you can't answer my problem, could you at least give me a procedure to deploy a standalone HBase in Docker?
UPDATE: Here is my Dockerfile:
FROM java:openjdk-8
ADD hbase-1.2.1 /hbase-1.2.1
WORKDIR /hbase-1.2.1
# ZooKeeper
EXPOSE 2181
# HMaster
EXPOSE 60000
# HMaster Web
EXPOSE 60010
# RegionServer
EXPOSE 60020
# RegionServer Web
EXPOSE 60030
EXPOSE 16010
RUN chmod 755 /hbase-1.2.1/bin/start-hbase.sh
CMD ["/hbase-1.2.1/bin/start-hbase.sh"]
My HBase shell is working. I also tried to open the ports using iptables for TCP and UDP, but the problem remains.

There are two problems with your Dockerfile:
1. Use hbase master start instead of start-hbase.sh.
2. The RegionServer is actually not running on 60020.
The second problem is not so easy to solve. If you run HBase standalone with version >= 1.2.0 (not sure, I'm running 1.2.0), HBase uses an ephemeral port instead of the default port or the port you provide in hbase-site.xml, which makes it very hard to provide an HBase service in Docker using the original version.
I added a property named hbase.localcluster.port.ephemeral and managed to build a standalone HBase in Docker, which you can reference here.

Failed to run Spark job on Yarn cluster - Retrying connect to server

I set up my YARN cluster and my Spark cluster on the same machines, but now I need to run a Spark job on YARN in client mode.
Here is the sample configuration for my job:
SparkConf sparkConf = new SparkConf(true).setAppName("SparkQueryApp")
.setMaster("yarn-client")// "yarn-cluster" or "yarn-client"
.set("es.nodes", "10.0.0.207")
.set("es.nodes.discovery", "false")
.set("es.cluster", "wp-es-reporting-prod")
.set("es.scroll.size", "5000")
.setJars(JavaSparkContext.jarOfClass(Demo.class))
.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
.set("spark.default.parallelism", String.valueOf(cpus * 2))
.set("spark.executor.memory", "10g")
.set("spark.num.executors", "40")
.set("spark.dynamicAllocation.enabled", "true")
.set("spark.dynamicAllocation.minExecutors", "10")
.set("spark.dynamicAllocation.maxExecutors", "50")
.set("spark.logConf", "true");
This doesn't seem to work. When I tried to run my Spark job with
java -jar spark-test-job.jar
I got this exception:
405472 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
406473 [main] INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
...
Any help?

Hadoop cluster setup - java.net.ConnectException: Connection refused

I want to set up a Hadoop cluster in pseudo-distributed mode. I managed to perform all the setup steps, including starting a NameNode, DataNode, JobTracker and TaskTracker on my machine.
Then I tried to run some example programs and hit the java.net.ConnectException: Connection refused error. I went back to the very first steps, running some operations in standalone mode, and hit the same problem.
I even triple-checked all the installation steps and have no idea how to fix it. (I am new to Hadoop and a beginner Ubuntu user, so please take that into account when providing any guide or tip.)
This is the error output I keep receiving:
hduser@marta-komputer:/usr/local/hadoop$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'
15/02/22 18:23:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/22 18:23:04 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
java.net.ConnectException: Call From marta-komputer/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy9.delete(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:521)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy10.delete(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1929)
at org.apache.hadoop.hdfs.DistributedFileSystem$12.doCall(DistributedFileSystem.java:638)
at org.apache.hadoop.hdfs.DistributedFileSystem$12.doCall(DistributedFileSystem.java:634)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:634)
at org.apache.hadoop.examples.Grep.run(Grep.java:95)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.examples.Grep.main(Grep.java:101)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 32 more
etc/hadoop/hadoop-env.sh file:
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol. Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}
# Extra Java CLASSPATH elements. Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
if [ "$HADOOP_CLASSPATH" ]; then
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
else
export HADOOP_CLASSPATH=$f
fi
done
# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""
# Extra Java runtime options. Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"
# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"
# On secure datanodes, user to run the datanode as after dropping privileges.
# This **MUST** be uncommented to enable secure HDFS if using privileged ports
# to provide authentication of data transfer protocol. This **MUST NOT** be
# defined if SASL is configured for authentication of data transfer protocol
# using non-privileged ports.
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}
# Where log files are stored. $HADOOP_HOME/logs by default.
#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER
# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}
# HDFS Mover specific parameters
###
# Specify the JVM options to be used when starting the HDFS Mover.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# export HADOOP_MOVER_OPTS=""
###
# Advanced Users Only!
###
# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
# the user that will run the hadoop daemons. Otherwise there is the
# potential for a symlink attack.
export HADOOP_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER
.bashrc file Hadoop-related fragment:
# -- HADOOP ENVIRONMENT VARIABLES START -- #
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
# -- HADOOP ENVIRONMENT VARIABLES END -- #
/usr/local/hadoop/etc/hadoop/core-site.xml file:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop_tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
/usr/local/hadoop/etc/hadoop/hdfs-site.xml file:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
</property>
</configuration>
/usr/local/hadoop/etc/hadoop/yarn-site.xml file:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
/usr/local/hadoop/etc/hadoop/mapred-site.xml file:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Running bin/hdfs namenode -format results in the following output (I replaced some parts of it with (...)):
hduser@marta-komputer:/usr/local/hadoop$ bin/hdfs namenode -format
15/02/22 18:50:47 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = marta-komputer/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.6.0
STARTUP_MSG: classpath = /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/htrace-core-3.0.4.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-cli (...)2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.6.0.jar:/usr/local/hadoop/contrib/capacity-scheduler/*.jar
STARTUP_MSG: build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG: java = 1.8.0_31
************************************************************/
15/02/22 18:50:47 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
15/02/22 18:50:47 INFO namenode.NameNode: createNameNode [-format]
15/02/22 18:50:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-0b65621a-eab3-47a4-bfd0-62b5596a940c
15/02/22 18:50:48 INFO namenode.FSNamesystem: No KeyProvider found.
15/02/22 18:50:48 INFO namenode.FSNamesystem: fsLock is fair:true
15/02/22 18:50:48 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
15/02/22 18:50:48 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
15/02/22 18:50:48 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
15/02/22 18:50:48 INFO blockmanagement.BlockManager: The block deletion will start around 2015 Feb 22 18:50:48
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map BlocksMap
15/02/22 18:50:48 INFO util.GSet: VM type = 64-bit
15/02/22 18:50:48 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
15/02/22 18:50:48 INFO util.GSet: capacity = 2^21 = 2097152 entries
15/02/22 18:50:48 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
15/02/22 18:50:48 INFO blockmanagement.BlockManager: defaultReplication = 1
15/02/22 18:50:48 INFO blockmanagement.BlockManager: maxReplication = 512
15/02/22 18:50:48 INFO blockmanagement.BlockManager: minReplication = 1
15/02/22 18:50:48 INFO blockmanagement.BlockManager: maxReplicationStreams = 2
15/02/22 18:50:48 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks = false
15/02/22 18:50:48 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
15/02/22 18:50:48 INFO blockmanagement.BlockManager: encryptDataTransfer = false
15/02/22 18:50:48 INFO blockmanagement.BlockManager: maxNumBlocksToLog = 1000
15/02/22 18:50:48 INFO namenode.FSNamesystem: fsOwner = hduser (auth:SIMPLE)
15/02/22 18:50:48 INFO namenode.FSNamesystem: supergroup = supergroup
15/02/22 18:50:48 INFO namenode.FSNamesystem: isPermissionEnabled = true
15/02/22 18:50:48 INFO namenode.FSNamesystem: HA Enabled: false
15/02/22 18:50:48 INFO namenode.FSNamesystem: Append Enabled: true
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map INodeMap
15/02/22 18:50:48 INFO util.GSet: VM type = 64-bit
15/02/22 18:50:48 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
15/02/22 18:50:48 INFO util.GSet: capacity = 2^20 = 1048576 entries
15/02/22 18:50:48 INFO namenode.NameNode: Caching file names occuring more than 10 times
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map cachedBlocks
15/02/22 18:50:48 INFO util.GSet: VM type = 64-bit
15/02/22 18:50:48 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
15/02/22 18:50:48 INFO util.GSet: capacity = 2^18 = 262144 entries
15/02/22 18:50:48 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
15/02/22 18:50:48 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
15/02/22 18:50:48 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
15/02/22 18:50:48 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
15/02/22 18:50:48 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map NameNodeRetryCache
15/02/22 18:50:48 INFO util.GSet: VM type = 64-bit
15/02/22 18:50:48 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
15/02/22 18:50:48 INFO util.GSet: capacity = 2^15 = 32768 entries
15/02/22 18:50:48 INFO namenode.NNConf: ACLs enabled? false
15/02/22 18:50:48 INFO namenode.NNConf: XAttrs enabled? true
15/02/22 18:50:48 INFO namenode.NNConf: Maximum size of an xattr: 16384
Re-format filesystem in Storage Directory /usr/local/hadoop_tmp/hdfs/namenode ? (Y or N) Y
15/02/22 18:50:50 INFO namenode.FSImage: Allocated new BlockPoolId: BP-948369552-127.0.1.1-1424627450316
15/02/22 18:50:50 INFO common.Storage: Storage directory /usr/local/hadoop_tmp/hdfs/namenode has been successfully formatted.
15/02/22 18:50:50 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
15/02/22 18:50:50 INFO util.ExitUtil: Exiting with status 0
15/02/22 18:50:50 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at marta-komputer/127.0.1.1
************************************************************/
Starting dfs and yarn results in the following output:
hduser@marta-komputer:/usr/local/hadoop$ start-dfs.sh
15/02/22 18:53:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduser-namenode-marta-komputer.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduser-datanode-marta-komputer.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduser-secondarynamenode-marta-komputer.out
15/02/22 18:53:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hduser@marta-komputer:/usr/local/hadoop$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduser-resourcemanager-marta-komputer.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduser-nodemanager-marta-komputer.out
Calling jps shortly after that gives:
hduser@marta-komputer:/usr/local/hadoop$ jps
11696 ResourceManager
11842 NodeManager
11171 NameNode
11523 SecondaryNameNode
12167 Jps
netstat output:
hduser@marta-komputer:/usr/local/hadoop$ sudo netstat -lpten | grep java
tcp 0 0 0.0.0.0:8088 0.0.0.0:* LISTEN 1001 690283 11696/java
tcp 0 0 0.0.0.0:42745 0.0.0.0:* LISTEN 1001 684574 11842/java
tcp 0 0 0.0.0.0:13562 0.0.0.0:* LISTEN 1001 680955 11842/java
tcp 0 0 0.0.0.0:8030 0.0.0.0:* LISTEN 1001 684531 11696/java
tcp 0 0 0.0.0.0:8031 0.0.0.0:* LISTEN 1001 684524 11696/java
tcp 0 0 0.0.0.0:8032 0.0.0.0:* LISTEN 1001 680879 11696/java
tcp 0 0 0.0.0.0:8033 0.0.0.0:* LISTEN 1001 687392 11696/java
tcp 0 0 0.0.0.0:8040 0.0.0.0:* LISTEN 1001 680951 11842/java
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 1001 687242 11171/java
tcp 0 0 0.0.0.0:8042 0.0.0.0:* LISTEN 1001 680956 11842/java
tcp 0 0 0.0.0.0:50090 0.0.0.0:* LISTEN 1001 690252 11523/java
tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 1001 687239 11171/java
/etc/hosts file:
127.0.0.1 localhost
127.0.1.1 marta-komputer
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
====================================================
UPDATE 1.
I updated the core-site.xml and now I have:
<property>
<name>fs.default.name</name>
<value>hdfs://marta-komputer:9000</value>
</property>
but I keep receiving the error - now starting as:
15/03/01 00:59:34 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
java.net.ConnectException: Call From marta-komputer.home/192.168.1.8 to marta-komputer:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
I also notice that telnet localhost 9000 is not working:
hduser@marta-komputer:~$ telnet localhost 9000
Trying 127.0.0.1...
telnet: Unable to connect to remote host: Connection refused

These steps worked for me (note that formatting the NameNode erases the existing HDFS metadata):
stop-all.sh
hadoop namenode -format
start-all.sh

Edit your conf/core-site.xml and change localhost to 0.0.0.0, using the configuration below. That should work.
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://0.0.0.0:9000</value>
</property>
</configuration>

From the netstat output you can see that the process is listening on address 127.0.0.1:
tcp 0 0 127.0.0.1:9000 0.0.0.0:* ...
From the exception message you can see that the hostname marta-komputer resolves to address 127.0.1.1:
java.net.ConnectException: Call From marta-komputer/127.0.1.1 to localhost:9000 failed ...
Further on, the exception mentions:
For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
On that page you find:
Check that there isn't an entry for your hostname mapped to 127.0.0.1 or 127.0.1.1 in /etc/hosts (Ubuntu is notorious for this)
So the conclusion is to remove this line from your /etc/hosts:
127.0.1.1 marta-komputer
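To see what your own machine's hostname resolves to, a minimal sketch along these lines may help. LoopbackCheck and resolvesToLoopback are hypothetical names. The sketch uses only java.net.InetAddress from the standard library and prints a host/IP pair similar to the one in the exception message.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper: shows whether a name or IP literal is a loopback address.
final class LoopbackCheck {

    // True if the name or IP literal resolves to a loopback address
    // (for IPv4 that is anything in 127.0.0.0/8, so 127.0.1.1 counts too).
    static boolean resolvesToLoopback(String nameOrIp) throws UnknownHostException {
        return InetAddress.getByName(nameOrIp).isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // The local hostname and the address the resolver (e.g. /etc/hosts) maps it to.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println(local.getHostName() + "/" + local.getHostAddress()
                + (local.isLoopbackAddress() ? " (loopback)" : " (not loopback)"));
    }
}
```

If this reports a loopback address for your hostname, the /etc/hosts entry discussed above is the likely reason.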

I had a similar problem to the OP. As the terminal output suggested, I went to
http://wiki.apache.org/hadoop/ConnectionRefused
I tried to change my /etc/hosts file as suggested there, i.e. remove the 127.0.1.1 entry, but as the OP noted, that creates another error.
So in the end I left it as is. The following is my /etc/hosts:
127.0.0.1 localhost.localdomain localhost
127.0.1.1 linux
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
In the end, I found that my NameNode had not started correctly: when I ran sudo netstat -lpten | grep java in the terminal, no JVM process was listening on port 9000.
So I made two directories, one for the NameNode and one for the DataNode (if you have not done so already). You don't have to use the same location as I did; replace the paths based on your Hadoop directory:
mkdir -p /home/hadoopuser/hadoop-2.6.2/hdfs/namenode
mkdir -p /home/hadoopuser/hadoop-2.6.2/hdfs/datanode
I reconfigured my hdfs-site.xml.
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoopuser/hadoop-2.6.2/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoopuser/hadoop-2.6.2/hdfs/datanode</value>
</property>
</configuration>
In the terminal, stop HDFS and YARN with the scripts stop-dfs.sh and stop-yarn.sh. They are located in the sbin directory of your Hadoop installation; in my case, that is /home/hadoopuser/hadoop-2.6.2/sbin/.
Then start HDFS and YARN with the scripts start-dfs.sh and start-yarn.sh.
After they have started, type jps in your terminal to check that your JVM processes are running correctly. It should show the following:
15678 NodeManager
14982 NameNode
15347 SecondaryNameNode
23814 Jps
15119 DataNode
15548 ResourceManager
Then use netstat again to check that your NameNode is listening on port 9000:
sudo netstat -lpten | grep java
If you successfully set up the namenode, you should see the following in your terminal output.
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 1001 175157 14982/java
Then try the command hdfs dfs -mkdir /user/hadoopuser
If this command executes successfully, you can list the HDFS user directory with hdfs dfs -ls /user

Make sure HDFS is online. Start it with $HADOOP_HOME/sbin/start-dfs.sh
Once you do that, your test with telnet localhost 9000 should work.
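If telnet is not installed, the same test can be sketched in Java with a plain TCP connect. PortCheck and isOpen are hypothetical names, and the default port 9000 is taken from the question's fs.default.name. This only tells you whether something accepts connections on that port, not whether it is a healthy NameNode.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: a programmatic stand-in for "telnet host port".
final class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isOpen(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unresolvable host.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9000;
        System.out.println(host + ":" + port
                + (isOpen(host, port, 2000) ? " is open" : " is closed or unreachable"));
    }
}
```

Try it with both localhost and the machine's hostname, since the two can resolve to different addresses.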

In my case, the problem was that I could not cluster my ZooKeeper:
hdfs haadmin -getServiceState 1
active
hdfs haadmin -getServiceState 2
active
My hadoop-hdfs-zkfc-[hostname].log showed:
2017-04-14 11:46:55,351 WARN org.apache.hadoop.ha.HealthMonitor:
Transport-level exception trying to monitor health of NameNode at
HOST/192.168.1.55:9000: java.net.ConnectException: Connection refused
Call From HOST/192.168.1.55 to HOST:9000 failed on connection
exception: java.net.ConnectException: Connection refused; For more
details see: http://wiki.apache.org/hadoop/ConnectionRefused
solution:
hdfs-site.xml
<property>
<name>dfs.namenode.rpc-bind-host</name>
<value>0.0.0.0</value>
</property>
before
netstat -plunt
tcp 0 0 192.168.1.55:9000 0.0.0.0:* LISTEN 13133/java
nmap localhost -p 9000
Starting Nmap 6.40 ( http://nmap.org ) at 2017-04-14 12:15 EDT
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000047s latency).
Other addresses for localhost (not scanned): 127.0.0.1
PORT STATE SERVICE
9000/tcp closed cslistener
after
netstat -plunt
tcp 0 0 0.0.0.0:9000 0.0.0.0:* LISTEN 14372/java
nmap localhost -p 9000
Starting Nmap 6.40 ( http://nmap.org ) at 2017-04-14 12:28 EDT
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000039s latency).
Other addresses for localhost (not scanned): 127.0.0.1
PORT STATE SERVICE
9000/tcp open cslistener
In /etc/hosts:
Add this line:
your-ip-address your-host-name
Example: 192.168.1.8 master
Also in /etc/hosts:
Delete the line with 127.0.1.1 (otherwise your hostname resolves to a loopback address, and services bound to that hostname are not reachable from other machines).
In your core-site.xml, change localhost to your IP address or your hostname.
Now restart the cluster.
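To spot the problematic line quickly, a small Python sketch can list every host name that /etc/hosts maps to a loopback address. The helper name loopback_hostnames is made up for this example, and skipping names that contain "localhost" or "loopback" is my own assumption about which entries are harmless.

```python
import ipaddress


def loopback_hostnames(hosts_text):
    """Return {hostname: ip} for non-localhost names mapped to a loopback IP."""
    found = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            continue  # not a plain IP address, skip the line
        if not addr.is_loopback:
            continue
        for name in names:
            # assumption: localhost-style aliases on loopback are expected
            if "localhost" in name or "loopback" in name:
                continue
            found.setdefault(name, ip)  # keep the first mapping for a name
    return found


if __name__ == "__main__":
    try:
        with open("/etc/hosts") as fh:
            for name, ip in loopback_hostnames(fh.read()).items():
                print("warning:", name, "resolves to loopback address", ip)
    except OSError as exc:
        print("could not read /etc/hosts:", exc)
```

Any name it reports that is also used in core-site.xml or yarn-site.xml is a candidate for the fix described above.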
Check your firewall settings, and set the following:
<property>
<name>fs.default.name</name>
<value>hdfs://MachineName:9000</value>
</property>
Replace localhost with your machine name.
hduser@marta-komputer:/usr/local/hadoop$ jps
11696 ResourceManager
11842 NodeManager
11171 NameNode
11523 SecondaryNameNode
12167 Jps
Where is your DataNode? The connection refused problem might also be due to there being no active DataNode. Check the DataNode logs for issues.
UPDATED:
For this error:
15/03/01 00:59:34 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
java.net.ConnectException: Call From marta-komputer.home/192.168.1.8 to marta-komputer:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
Add these lines in yarn-site.xml:
<property>
<name>yarn.resourcemanager.address</name>
<value>192.168.1.8:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>192.168.1.8:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>192.168.1.8:8031</value>
</property>
Restart the hadoop processes.
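The 0.0.0.0:8032 in the log is what a client falls back to when yarn-site.xml sets neither yarn.resourcemanager.address nor yarn.resourcemanager.hostname (the stock default for the hostname is 0.0.0.0). A rough Python sketch to see which address a given yarn-site.xml would produce; the function names are hypothetical, and it deliberately ignores ${...} variable substitution and any other config files Hadoop merges in.

```python
import xml.etree.ElementTree as ET


def read_hadoop_conf(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def resourcemanager_address(props):
    """Return the host:port a client would use for the ResourceManager."""
    # An explicit address wins over the hostname-derived default.
    if "yarn.resourcemanager.address" in props:
        return props["yarn.resourcemanager.address"]
    # Default: ${yarn.resourcemanager.hostname}:8032, hostname defaults to 0.0.0.0
    host = props.get("yarn.resourcemanager.hostname", "0.0.0.0")
    return host + ":8032"


if __name__ == "__main__":
    sample = """<configuration>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>192.168.1.8:8032</value>
      </property>
    </configuration>"""
    print(resourcemanager_address(read_hadoop_conf(sample)))
```

If the result for the file on the submitting machine is still 0.0.0.0:8032, the client is probably not reading the yarn-site.xml you edited, so check HADOOP_CONF_DIR or YARN_CONF_DIR there.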
Your issue is a very interesting one. Hadoop setup can be frustrating at times because of the complexity of the system and the many moving parts involved. I think the issue you faced is a firewall one.
My Hadoop cluster has a similar setup. After adding a firewall rule with the command:
sudo iptables -A INPUT -p tcp --dport 9000 -j REJECT
I am able to reproduce the exact issue:
15/03/02 23:46:10 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
java.net.ConnectException: Call From mybox/127.0.1.1 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
You can verify your firewall settings with the command:
/usr/local/hadoop/etc$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
REJECT tcp -- anywhere anywhere tcp dpt:9000 reject-with icmp-port-unreachable
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Once the suspicious rule is identified, it can be deleted with a command like:
sudo iptables -D INPUT -p tcp --dport 9000 -j REJECT
Now, the connection should go through.
In my experience:
15/02/22 18:23:04 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
You may have a 64-bit OS and a 32-bit Hadoop installation.
java.net.ConnectException: Call From marta-komputer/127.0.1.1 to
localhost:9000 failed on connection exception: java.net.ConnectException:
connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused
This problem is related to your SSH public key authorization. Please provide details about your SSH setup.
Please refer to this link for the complete steps.
Also let us know whether
cat $HOME/.ssh/authorized_keys
returns any result.
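I resolved the same issue by adding this property to hdfs-site.xml (fs.default.name is normally set in core-site.xml, and newer Hadoop releases call it fs.defaultFS):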
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
Stop the cluster: stop-all.sh
Format the NameNode (this erases the existing HDFS metadata): hadoop namenode -format
Start it again: start-all.sh
I also faced the same issue on Hortonworks.
It was resolved after I restarted the Ambari agents and server:
systemctl stop ambari-agent
systemctl stop ambari-server
systemctl start ambari-agent
systemctl start ambari-server
I was getting the same issue and found that the OpenSSH service was not running, which was causing it. After starting the SSH service, it worked.
To check whether the SSH service is running:
ssh localhost
To start the service, if OpenSSH is already installed:
sudo /etc/init.d/ssh start
Go to $SPARK_HOME/conf, open the file spark-env.sh, and add:
SPARK_MASTER_HOST=your-IP
SPARK_LOCAL_IP=127.0.0.1

Not able to run the examples of HBase-The definitive guide

I've been trying to run the examples from HBase: The Definitive Guide, but I keep running into this error and can't get past it. I'm running in standalone mode, if that helps.
Exception in thread "main" org.apache.hadoop.hbase.MasterNotRunningException: 17136@ubuntulocalhost,32992,1373877731444
at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:615)
at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:94)
at util.HBaseHelper.<init>(HBaseHelper.java:29)
at util.HBaseHelper.getHelper(HBaseHelper.java:33)
at client.PutExample.main(PutExample.java:22)
But my HMaster process is running:
hduser@ubuntu:/home/ubuntu/hbase-book/ch03$ jps
17602 Jps
8709 NameNode
8929 DataNode
9472 TaskTracker
9252 JobTracker
9172 SecondaryNameNode
17136 HMaster
This is my hbase-site.xml file:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///usr/local/hbase/hbase-data/</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/usr/local/hbase/zookeeper-data/</value>
</property>
</configuration>
This is my /etc/hosts file:
127.0.0.1 localhost
127.0.1.1 ubuntu
127.0.0.1 ubuntu.ubuntu-domain ubuntu
Specifically, I'm trying to run the chapter 3 examples, and I don't understand why my setup is not working.
Any idea where I'm going wrong?
Edit: Here are the logs:
2013-07-15 03:56:32,663 INFO org.apache.zookeeper.server.NIOServerCnxnFactory: Accepted socket connection from /127.0.0.1:60119
2013-07-15 03:56:32,672 WARN org.apache.zookeeper.server.ZooKeeperServer: Connection request from old client /127.0.0.1:60119; will be dropped if server is in r-o mode
2013-07-15 03:56:32,672 INFO org.apache.zookeeper.server.ZooKeeperServer: Client attempting to establish new session at /127.0.0.1:60119
2013-07-15 03:56:32,674 INFO org.apache.zookeeper.server.ZooKeeperServer: Established session 0x13fe17e7f1d0006 with negotiated timeout 40000 for client /127.0.0.1:60119
2013-07-15 03:57:11,653 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Stats: total=1.17 MB, free=247.24 MB, max=248.41 MB, blocks=2, accesses=68, hits=55, hitRatio=80.88%, , cachingAccesses=61, cachingHits=53, cachingHitsRatio=86.88%, , evictions=0, evicted=6, evictedPerRun=Infinity
2013-07-15 03:57:14,333 WARN org.apache.zookeeper.server.NIOServerCnxn: caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x13fe17e7f1d0006, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:724)
2013-07-15 03:57:14,334 INFO org.apache.zookeeper.server.NIOServerCnxn: Closed socket connection for client /127.0.0.1:60119 which had sessionid 0x13fe17e7f1d0006
2013-07-15 03:57:24,551 INFO org.apache.hadoop.hbase.master.LoadBalancer: Skipping load balancing because balanced cluster; servers=1 regions=1 average=1.0 mostloaded=1 leastloaded=1
2013-07-15 03:57:24,568 DEBUG org.apache.hadoop.hbase.client.MetaScanner: Scanning .META. starting at row= for max=2147483647 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@189ddf
2013-07-15 03:57:24,578 DEBUG org.apache.hadoop.hbase.master.CatalogJanitor: Scanned 1 catalog row(s) and gc'd 0 unreferenced parent region(s)
It has nothing to do with standalone or distributed mode. Make sure your setup is working properly. I can see that the RegionServer and ZooKeeper are not running. Comment out the line 127.0.1.1 ubuntu in your /etc/hosts file and restart HBase. You might have to kill the running process first.
P.S.: Since you already have Hadoop configured and running fine, you can run HBase in a pseudo-distributed setup.

How do I get rid of connection refused error in hadoop?

When I try to run a hadoop command:
vinit@ubuntu:~/hadoop-1.0.4$ bin/hadoop dfs -ls
I get the following output:
13/04/17 06:26:37 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9010. Already tried 0 time(s).
13/04/17 06:26:38 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:9010. Already tried 1 time(s).
Bad connection to FS. command aborted. exception: Call to localhost/127.0.0.1:9010 failed on connection exception: java.net.ConnectException: Connection refused
I am new to Hadoop and even to Java. Please help!
Check whether your HDFS processes are running: run the jps command to list the running Java processes.
You should have at least the NameNode and DataNode processes running. Please check and let me know.
Cheers
Rags
I struggled for two days and the night in between to find the answer to this problem.
In my case (and I'm sure this is the problem in most cases), I had to create the Hadoop temporary folders by hand and add them to hdfs-site.xml:
<property>
<name>dfs.data.dir</name>
<value>/home/stefan/Downloads/hadoop-2.7.1/tmp/dfs/name/data</value>
<final>true</final>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/stefan/Downloads/hadoop-2.7.1/tmp/dfs/name</value>
<final>true</final>
</property>
I hope this saves you from going through the same hell I did.
Besides that, make sure your user owns the folders and has the right permissions:
chown user_name hadoop_folder hadoop_temp_folder
chmod 755 hadoop_folder hadoop_temp_folder
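Before restarting the daemons, you can confirm that the folders exist and are usable by the account that runs Hadoop. A minimal Python sketch, with check_storage_dir as a made-up helper name; run it as the same user that starts HDFS, and pass plain filesystem paths (without any file: prefix).

```python
import os


def check_storage_dir(path):
    """Return a list of problems that would stop a daemon from using path."""
    problems = []
    if not os.path.isdir(path):
        problems.append("missing directory: " + path)
    elif not os.access(path, os.R_OK | os.W_OK | os.X_OK):
        # os.access checks the permissions of the user running this script
        problems.append("current user cannot read/write/enter: " + path)
    return problems
```

For example, check_storage_dir("/home/stefan/Downloads/hadoop-2.7.1/tmp/dfs/name") returns an empty list when the folder from the config above is in place and accessible.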
