Couldn't start hadoop datanode normally - java

I had started the datanode successfully before, but when I tried today it printed the messages below. It sounds as if I never created the /home/hadoop/appdata/hadoopdata directory, but I have confirmed that the directory already exists on my machine. So what's the problem? Why can't I start the datanode normally?
For example, I've tried deleting /home/hadoop/appdata/ and creating it again, but it still doesn't work.
I've also deleted /home/hadoop/tmp/hadoop_tmp and recreated it, and it still doesn't work...
2014-03-04 09:30:30,106 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2014-03-04 09:30:30,349 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot access storage directory /home/hadoop/appdata/hadoopdata
2014-03-04 09:30:30,350 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/hadoop/appdata/hadoopdata does not exist
2014-03-04 09:30:30,453 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode:
java.io.IOException: All specified directories are not accessible or do not exist.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:139)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)

Stop all Hadoop services.
Delete dfs/namenode.
Delete dfs/datanode from both the slaves and the master.
Check the permissions of the Hadoop folder:
sudo chmod -R 755 /usr/local/hadoop
Restart Hadoop.
Check/verify the permissions of the data folder:
sudo chmod -R 755 /home/hadoop/appdata
If you still have the problem, check the log files. The whole sequence is sketched below.
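Put together, a rough sketch of those steps as shell commands. The dfs/namenode and dfs/datanode locations below are assumptions; use whatever dfs.namenode.name.dir and dfs.datanode.data.dir point to in your hdfs-site.xml, and note that deleting them destroys the HDFS data on those nodes:
$HADOOP_HOME/bin/stop-all.sh                        # stop all Hadoop services
rm -rf /home/hadoop/appdata/hadoopdata/namenode     # assumed namenode dir (on the master)
rm -rf /home/hadoop/appdata/hadoopdata/datanode     # assumed datanode dir (on every slave and the master)
sudo chmod -R 755 /usr/local/hadoop                 # permissions on the Hadoop install
sudo chmod -R 755 /home/hadoop/appdata              # permissions on the data folder
$HADOOP_HOME/bin/hadoop namenode -format            # needed after deleting the namenode dir; wipes HDFS metadata
$HADOOP_HOME/bin/start-all.sh                       # restart Hadoop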

Try to format your namenode:
hadoop namenode -format
or
hdfs namenode -format
You will get a clearer picture of what is not configured as expected.

Related

How to recover data after namenode -format command in Hadoop

I am using Hadoop version 1.2.1. For some unknown reason my namenode went down, and the following log information was obtained:
2017-07-28 15:04:47,422 INFO org.apache.hadoop.hdfs.server.common.Storage: Start loading image file /home/hpcnl/crawler/hadoop-1.2.1/tmp/dfs/name/current/fsimage
2017-07-28 15:04:47,423 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:881)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:834)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:378)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:104)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:427)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:395)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:299)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
2017-07-28 15:04:47,428 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:881)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:834)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:378)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:104)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:427)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:395)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:299)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:569)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1479)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1488)
Then I searched on the internet and found that you should stop the cluster and run the following command:
hadoop namenode -format
After this, when I restarted the cluster, the data no longer appeared in the respective folders in HDFS. Can I recover my data? How should I handle such situations in the future if my namenode goes down?
You can always back up your metadata by using these commands:
hdfs dfsadmin -safemode enter
hdfs dfsadmin -saveNamespace
These commands put your namenode in safe mode and merge the pending edits into the fsimage file. You can then copy the image out with:
hdfs dfsadmin -fetchImage /path/someFilename
or
cd /namenode/data/current/
tar -cvf /root/nn_backup_data.tar .
Now you can place this data in your namenode metadata directory and restart the namenode.
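A minimal restore sketch, assuming the tarball created above and that /namenode/data is the namenode metadata directory (both names are taken from the commands in this answer; substitute your dfs.name.dir):
bin/stop-all.sh                        # stop the cluster first (Hadoop 1.x layout assumed)
cd /namenode/data/current/
tar -xvf /root/nn_backup_data.tar      # unpack the backed-up fsimage/edits into current/
bin/start-all.sh                       # restart; the namenode loads the restored metadata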
Please note that you shouldn't use the command below unless you have no other option:
hadoop namenode -format

Initialization failed for Block pool <registering> (Datanode Uuid unassigned)

What is the source of this error and how could it be fixed?
2015-11-29 19:40:04,670 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to anmol-vm1-new/10.0.1.190:8020. Exiting.
java.io.IOException: All specified directories are not accessible or do not exist.
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:217)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:254)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:974)
at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:945)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:278)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
at java.lang.Thread.run(Thread.java:745)
2015-11-29 19:40:04,670 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to anmol-vm1-new/10.0.1.190:8020
2015-11-29 19:40:04,771 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned)
There are two possible solutions.
First:
Your namenode and datanode cluster IDs do not match; make sure they are the same (a sketch follows after this answer).
On the namenode, the cluster ID is in the file:
$ nano HADOOP_FILE_SYSTEM/namenode/current/VERSION
On the datanode, the cluster ID is stored in the file:
$ nano HADOOP_FILE_SYSTEM/datanode/current/VERSION
Second:
Format the namenode:
Hadoop 1.x: $ hadoop namenode -format
Hadoop 2.x: $ hdfs namenode -format
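A sketch of the first approach, assuming the HADOOP_FILE_SYSTEM layout used above and GNU sed (i.e. Linux); adjust the paths to your dfs.namenode.name.dir / dfs.datanode.data.dir:
# read the clusterID the namenode has recorded
NN_CID=$(grep clusterID HADOOP_FILE_SYSTEM/namenode/current/VERSION | cut -d= -f2)
# write the same clusterID into the datanode's VERSION file, then restart the datanode
sed -i "s/^clusterID=.*/clusterID=$NN_CID/" HADOOP_FILE_SYSTEM/datanode/current/VERSION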
I met the same problem and solved it by doing the following steps:
step 1. remove the hdfs directory (for me it was the default directory "/tmp/hadoop-root/")
rm -rf /tmp/hadoop-root/*
step 2. run
bin/hdfs namenode -format
to format the directory
The root cause of this is that the datanode's and namenode's clusterIDs differ. Make the datanode's clusterID match the namenode's, then restart Hadoop, and it should be resolved.
The issue arises because of a mismatch between the cluster IDs of the datanode and the namenode.
Follow these steps:
Go to Hadoop_home/data/namenode/current and copy the cluster ID from "VERSION".
Go to Hadoop_home/data/datanode/current and paste this cluster ID into "VERSION", replacing the one present there.
Then format the namenode.
Start the datanode and namenode again.
The issue arises because of a mismatch between the cluster IDs of the datanode and the namenode.
Follow these steps:
1. Go to Hadoop_home/ and delete the data folder.
2. Create a folder with another name, e.g. data123.
3. Inside it, create two folders: namenode and datanode.
4. Go to hdfs-site.xml and paste in your paths:
<property>
  <name>dfs.namenode.name.dir</name>
  <value>........../data123/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>............../data123/datanode</value>
</property>
This problem may occur when there are storage I/O errors. In that scenario the VERSION file is not readable, which surfaces as the error above.
You may need to exclude the storage locations on the bad drives in hdfs-site.xml, for example as sketched below.
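For example, if dfs.datanode.data.dir previously listed three mount points and /data/disk2 is the failing drive, the edited property might look like this (the paths are made up for illustration):
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- /data/disk2/hdfs removed because that drive is returning I/O errors (example paths) -->
  <value>/data/disk1/hdfs,/data/disk3/hdfs</value>
</property>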
For me, this worked:
Delete (or back up) the HADOOP_FILE_SYSTEM/namenode/current directory.
Restart the datanode service.
This should recreate the current directory with the correct clusterID in the VERSION file.
Source - https://community.pivotal.io/s/article/Cluster-Id-is-incompatible-error-reported-when-starting-datanode-service?language=en_US

How to add module to Wildfly using CLI

I'm trying to create a Wildfly docker image with a postgres datasource.
When I build the dockerfile it always fails with Permission Denied when I try to install the postgres module.
My Dockerfile looks like this:
FROM wildflyext/wildfly-camel
RUN /opt/jboss/wildfly/bin/add-user.sh admin admin --silent
ADD postgresql-9.4-1201.jdbc41.jar /tmp/
ADD config.sh /tmp/
ADD batch.cli /tmp/
RUN /tmp/config.sh
Which calls the following:
#!/bin/bash
JBOSS_HOME=/opt/jboss/wildfly
JBOSS_CLI=$JBOSS_HOME/bin/jboss-cli.sh
JBOSS_MODE=${1:-"standalone"}
JBOSS_CONFIG=${2:-"$JBOSS_MODE.xml"}
function wait_for_wildfly() {
until `$JBOSS_CLI -c "ls /deployment" &> /dev/null`; do
sleep 10
done
}
echo "==> Starting WildFly..."
$JBOSS_HOME/bin/$JBOSS_MODE.sh -c $JBOSS_CONFIG > /dev/null &
echo "==> Waiting..."
wait_for_wildfly
echo "==> Executing..."
$JBOSS_CLI -c --file=`dirname "$0"`/batch.cli --connect
echo "==> Shutting down WildFly..."
if [ "$JBOSS_MODE" = "standalone" ]; then
$JBOSS_CLI -c ":shutdown"
else
$JBOSS_CLI -c "/host=*:shutdown"
fi
And
batch
module add --name=org.postgresql --resources=/tmp/postgresql-9.4-1201.jdbc41.jar --dependencies=javax.api,javax.transaction.api
/subsystem=datasources/jdbc-driver=postgresql:add(driver-name=postgresql,driver-module-name=org.postgresql,driver-xa-datasource-class-name=org.postgresql.xa.PGXADataSource)
run-batch
The output when building is:
==> Starting WildFly...
==> Waiting...
==> Executing... Failed to locate the file on the filesystem copying /tmp/postgresql-9.4-1201.jdbc41.jar to
/opt/jboss/wildfly/modules/org/postgresql/main/postgresql-9.4-1201.jdbc41.jar:
/tmp/postgresql-9.4-1201.jdbc41.jar (Permission denied)
What permissions are required, and where do I set the permission(s)?
Thanks
It seems the JAR file is not readable by the jboss user (the user coming from the parent image). The postgresql-9.4-1201.jdbc41.jar is added under the root user; find details in this GitHub discussion.
You could
either add permissions to the JAR file before adding it to the image,
or add permissions to the JAR file in the image after adding it,
or change the ownership of the file in the image.
The simplest solution is the first one (see the sketch below). The other two solutions also require switching the user to root (USER root in the Dockerfile) and then back to jboss.
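For the first option, a minimal sketch run on the host in the directory containing the Dockerfile, before building (the image tag is just an example):
chmod 644 postgresql-9.4-1201.jdbc41.jar   # make the JAR world-readable so the jboss user can read it after ADD
docker build -t wildfly-postgres .         # rebuild the image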
Here is a piece of advice: create a CLI file like this:
connect
module add --name=sqlserver.jdbc --resources=#INSTALL_FOLDER#/libext/jtds-1.3.1.jar --dependencies=javax.api,javax.transaction.api
/subsystem=datasources/jdbc-driver=sqlserver:add(driver-module-name=sqlserver.jdbc,driver-name=sqlserver,driver-class-name=#JDBC_DRIVER#)
/subsystem=datasources/data-source=#DATASOURCENAME#:add(jndi-name=java:jboss/#JNDI_NAME#,enabled="true",use-java-context="true",driver-name=sqlserver,connection-url="#JDBC_URL#",user-name=#JDBC_USER#,password=#JDBC_PASSWORD#,validate-on-match=true,background-validation=true)
Replace each #VAR# placeholder with your own value, and it should work!
Be aware that JBoss/WildFly 10 resolves the --resources JAR path relatively by default, whereas WildFly 8 expects an absolute path; this difference can trip you up. ;-)
Cheers!
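For reference, such a file can then be run against a live server with something like this (the file location is just an example):
/opt/jboss/wildfly/bin/jboss-cli.sh --connect --file=/tmp/datasource.cli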

$ bin/hadoop namenode -format STARTUP_MSG: host = java.net.UnknownHostException:

I'm currently learning Hadoop from http://tecadmin.net/steps-to-install-hadoop-on-centosrhel-6/
In the 5th step, when I run the command $ bin/hadoop namenode -format, I get the following error.
I have also checked these links to resolve my problem:
"hadoop namenode -format" returns a java.net.UnknownHostException
java.net.UnknownHostException: Invalid hostname for server: local
I don't know where the domain name is in the configuration files so that I can replace it with localhost.
I also went to the /etc/hosts file and replaced the text with localhost, but I still haven't resolved the problem. Please, can someone help me?
The UnknownHostException can be resolved by the following steps:
Go to /etc/hosts.
Edit the hosts file so that it contains: 127.0.0.1 [space] HostName (e.g. static.98.35.ebonenet.com).
Save the file and try again.
With the help of Aadil's answer I resolved the UnknownHostException by the following steps:
Step 1: Go to /etc/hosts.
Step 2: Edit the hosts file so that it contains: 127.0.0.1 [space/tab] localhost [space/tab] HostName (e.g. static.98.35.ebonenet.com), as in the example below.
Step 3: Save the file and try again.
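Concretely, the relevant /etc/hosts line would look something like this (the hostname is only an example; use the name reported by the hostname command):
127.0.0.1   localhost   static.98.35.ebonenet.com
One way to append it from a shell, assuming hostname prints your machine's name:
echo "127.0.0.1 localhost $(hostname)" | sudo tee -a /etc/hosts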
Problem:
If anyone is facing:
SHUTDOWN_MSG: Shutting down NameNode at java.net.UnknownHostException:
ubuntu: ubuntu: unknown error
Solution:
Go to /etc/hosts.
Disable or remove any IPv6 configuration maintained in the /etc/hosts file (see the example lines below).
Save the file and try again.
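On a typical Ubuntu installation the IPv6 block in /etc/hosts looks roughly like the lines below; commenting them out (or deleting them) is what the step above refers to:
# ::1       ip6-localhost ip6-loopback
# fe00::0   ip6-localnet
# ff00::0   ip6-mcastprefix
# ff02::1   ip6-allnodes
# ff02::2   ip6-allrouters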

NameNode doesn't start - hadoop-1.0.3 ubuntu 13.10 single node cluster

I am trying to setup a hadoop single node cluster on my local machine.
I installed hadoop using the following instructions
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
after starting the cluster using
bin/start-all.sh
I get the following output from jps
19623 TaskTracker
19388 SecondaryNameNode
19670 Jps
19479 JobTracker
I can see the NameNode is not running. I pulled the logs from the /logs directory and they look like this:
2014-01-24 11:30:20,614 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /app/hadoop/tmp/dfs/name does not exist.
2014-01-24 11:30:20,617 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /app/hadoop/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)
2014-01-24 11:30:20,619 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /app/hadoop/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)
2014-01-24 11:30:20,620 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ishan-HP-Pavilion-dv6700-Notebook-PC/127.0.1.1
It complains about the directory path /app/hadoop/tmp/dfs/name. I tried creating this directory path for the hadoop user, but I got the same error again.
Can someone please help me fix this?
Please note: I have read similar posts on here, but none of them helped.
Thanks!
I would suggest you check the permissions granted to hduser on the directory "/app/hadoop/tmp/dfs/name". Alternatively, you could make sure none of the components (secondary namenode, etc.) are up and running, and then format the namenode using this command:
$HADOOP_INSTALL/hadoop/bin/hadoop namenode -format
Try to start your cluster again and see if it works.
I think "/app/hadoop/tmp/dfs/name" should automatically get created. This folder keeps current info(cache) of namenode. Similarly there must be another folder "namesecondary" with "name" folder. If these folder are not there check "conf/core-site.xml" again. Id's for each daemons is created during each run and these id's are stored in these folders(and also some other information(i don't know exaclty)).
and
if these folders are available
just remove all content of "name" folder.
format namenode.
instead of "start-all" do start-dfs.sh and start-mapred.sh separately
Expecting this should work.
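Put together, and assuming the /app/hadoop/tmp/dfs/name path from the log above and a Hadoop 1.x layout run from $HADOOP_HOME:
rm -rf /app/hadoop/tmp/dfs/name/*   # clear the "name" folder only
bin/hadoop namenode -format         # reformat the namenode (destroys existing HDFS metadata)
bin/start-dfs.sh                    # start the HDFS daemons
bin/start-mapred.sh                 # start the MapReduce daemons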
You do not have permission to create this directory. Try giving a path in your home directory instead; you should set this path in core-site.xml for the property hadoop.tmp.dir (a sketch follows below). I have described it at the following link:
http://lets-do-something-big.blogspot.in/2014/01/hadoop-series-single-node-installation.html
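A minimal core-site.xml sketch, with an example path inside the hadoop user's home directory (pick any directory that user can create and write to); the property goes inside the <configuration> element:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/tmp</value>   <!-- example path owned by the hadoop user -->
</property>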
