I am trying to run Hadoop 2.6.0 on Windows, following this guide:
https://wiki.apache.org/hadoop/Hadoop2OnWindows
I have everything set up, and all the values are set within the correct XML files.
However, I am running into this error when trying to start my namenode:
Invalid URI for NameNode address (check fs.defaultFS): file:/// has no authority.
This is what I have in my core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://0.0.0.0:19000</value>
</property>
</configuration>
What is going on??
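For reference, here is a minimal sketch of the same setting under the non-deprecated key fs.defaultFS (host and port copied from the file above). The deprecated fs.default.name is normally still honoured in Hadoop 2.x, so if the "no authority" error persists it usually means this core-site.xml is not being read at all:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://0.0.0.0:19000</value>
</property>
</configuration>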
I'm having trouble setting up Hadoop. My setup consists of a NameNode VM and two separate physical DataNodes connected to the same network.
IP configuration:
192.168.118.212 namenode-1
192.168.118.217 datanode-1
192.168.118.216 datanode-2
I keep getting the error that there are 0 DataNodes running, but when I run jps on my dataNode-1 or dataNode-2 machine, the DataNode process shows up as running.
My nameNode log shows this:
File /user/hadoop/.bashrc_COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s)
are excluded in this operation.
The logs on my dataNode-1 machine tell me that it has trouble connecting to the nameNode.
WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: namenode-1/192.168.118.212:9000
The only weird part is that it can't connect, even though the DataNode starts fine. I can also SSH between all of the machines with no problems.
So my best guess would be that I've configured one of the config files incorrectly, though I checked against other questions on here and they seem to be correct.
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://namenode-1:9000/</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/hadoop_data/hdfs/datanode</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/hadoop_data/hdfs/namenode</value>
<final>true</final>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.job.tracker</name>
<value>namenode-1:9001</value>
</property>
</configuration>
The problem could be fs.default.name. Try using the IP address as fs.default.name, and check whether your /etc/hosts configuration points to the correct IP address. Most likely it does, since your DataNode figured out the IP address.
The problem could also be the port number. Try 8020 or 50070 instead of 9000 and see what happens.
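For example, a hedged variant of core-site.xml that uses the NameNode's IP address directly (IP taken from the hosts listing in the question):
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.118.212:9000</value>
</property>
</configuration>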
The problem was the firewall.
You can stop it by running systemctl stop firewalld.service.
I found the answer here:
https://stackoverflow.com/a/37994066/8789361
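If stopping the firewall outright is too drastic, an alternative sketch on firewalld-based systems is to open only the NameNode port (9000 here, taken from the DataNode log above):
firewall-cmd --permanent --add-port=9000/tcp
firewall-cmd --reload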
I tried to run WordCount in the terminal with the command
hadoop jar ~/Study/Hadoop/Jars/WordCount.jar \
WordCount /input/input_wordcount/ /output
but it failed with the following error:
How can I solve this?
Are you running on VMware? Close the firewall first.
Try service iptables stop or chkconfig iptables off
Add this configuration to hdfs-site.xml:
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
I am trying to launch Hadoop on my computer, but when I execute any related command in CMD, such as hadoop version or hdfs namenode -format, I get the following error:
Error: Could not find or load main class Name
The OS is Windows 10.
Hadoop version 2.7.1.
JDK 1.8.0.131.
I have the following user variables:
HADOOP_HOME = C:\hadoop-2.7.1\bin
HAVA_HOME = C:\Progra~2\Java\jdk1.8.0_131
And within the system variable PATH there are two locations set:
%JAVA_HOME%\bin;C:\hadoop-2.7.1\bin
In hadoop-env.cmd there is the variable:
JAVA_HOME = %JAVA_HOME%
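For comparison, a typical hadoop-env.cmd entry spells the JDK path out explicitly rather than going through %JAVA_HOME% (path copied from the question, shown here only as an illustration):
set JAVA_HOME=C:\Progra~2\Java\jdk1.8.0_131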
Of core-site.xml, mapred-site.xml, hdfs-site.xml, and yarn-site.xml, paths to directories are set only in hdfs-site.xml. The full configuration block in that file is the following:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/c:/hadoop-2.7.1/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/c:/hadoop-2.7.1/data/datanode</value>
</property>
</configuration>
I cannot start the Hive Web Interface as described here. This is the output of hive --service hwi:
ls: cannot access /usr/local/hive/lib/hive-hwi-*.war: No such file or directory
14/09/09 13:07:59 INFO hwi.HWIServer: HWI is starting up
14/09/09 13:08:00 FATAL hwi.HWIServer: HWI WAR file not found at /usr/local/hive/lib/hive-hwi-0.13.1.war
It appears that there is no .war file under /usr/local/hive/lib. Am I supposed to generate the WAR file?
I've correctly set $ANT_LIB and $HIVE_HOME, and here is my hive-site.xml:
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>hdfs://hadoop-server/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.hwi.war.file</name>
<value>/lib/hive-hwi-0.13.1.war</value>
<description>This is the WAR file with the jsp content for Hive Web Interface</description>
</property>
</configuration>
My hive version is 0.13.1, and hadoop version is 2.5.0.
There is HIVE-7233.
You may need to change the version of Hive, or copy the WAR file from another version.
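If you go the build route instead, a rough sketch of the workaround discussed around HIVE-7233 is to package the WAR yourself from the Hive source tarball (the hwi/web path and version are assumptions; adjust to your tree and install location):
cd apache-hive-0.13.1-src/hwi/web
jar cfM /usr/local/hive/lib/hive-hwi-0.13.1.war *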
This question is linked to my previous question. All the daemons are running; jps shows:
6663 JobHistoryServer
7213 ResourceManager
9235 Jps
6289 DataNode
6200 NameNode
7420 NodeManager
but the wordcount example keeps failing with the following exception:
ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1238)
at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1234)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:1233)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1262)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286)
at WordCount.main(WordCount.java:80)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Since it says the problem is in the configuration, I am posting the configuration files here. The intention is to create a single-node cluster.
yarn-site.xml
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hduser/yarn/yarn_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hduser/yarn/yarn_data/hdfs/datanode</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>Yarn</value>
</property>
</configuration>
Please tell me what is missing or what I am doing wrong.
I was having a similar issue, but YARN was not the problem.
After adding the following jars to my classpath, the issue was resolved:
hadoop-mapreduce-client-jobclient-2.2.0.2.0.6.0-76
hadoop-mapreduce-client-common-2.2.0.2.0.6.0-76
hadoop-mapreduce-client-shuffle-2.2.0.2.0.6.0-76
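If the job is built with Maven, a hedged equivalent is to declare the client artifacts as dependencies (coordinates shown for the stock Apache releases; the version is an assumption and should match your cluster):
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-jobclient</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-common</artifactId>
<version>2.2.0</version>
</dependency>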
You have capitalized Yarn, which is probably why it cannot be resolved. Try the lowercase version suggested in the official documentation:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Looks like I had a lucky day and worked through 'all' of those causes of this exception. Summary:
wrong mapreduce.framework.name (see above)
missing mapreduce job-client jars (see above)
wrong version (see Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses-submiting job2remoteClustr )
my configured 'yarn.ipc.client.factory.class' wasn't on the classpath of the YARN server (only on the client)
In my case I was trying to use Sqoop and ran into this error.
It turns out that I was pointing to the latest version of Hadoop 2.0 available from the CDH repo, for which Sqoop was not supported.
The Cloudera version was 2.0.0-cdh4.4.0, which had YARN support built in.
When I used 2.0.0-cdh4.4.0 under hadoop-0.20, the problem went away.
Hope this helps.
Changing mapreduce_shuffle to mapreduce.shuffle made it work in my case.
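For context, that corresponds to a yarn-site.xml along these lines; which of the two spellings is correct depends on the Hadoop version (older 2.x/0.23 releases used the dotted name, while later releases require mapreduce_shuffle as in the question above):
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce.shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>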