Hadoop namenode rejecting connections!? What am I doing wrong? - java

My configuration:
Server-class machine cluster (4 machines), each with RHEL, 8GB RAM, quad core processors.
I set up machine 'B1' as the master and the rest as slaves (B2, B3, B4). Kicked off start-dfs.sh; the namenode came up on port 53410 on B1. The rest of the nodes are not able to connect to B1 on 53410!
Here's what I did so far:
Tried "telnet B1 53410" from B2, B3, B4 - Connection refused.
Tried ssh to B1 from B2, B3, B4 and vice versa - no problem, works fine.
Changed 53410 to 55410, restarted dfs, same issue - connection refused on this port too.
Disabled firewall (iptables stop) on B1 - tried connecting from B2,B3,B4 - fails on telnet.
Disabled firewall on all nodes, tried again, fails again to connect to 53410.
Checked that FTP was working from B2, B3, B4 to B1, stopped the FTP service (service vsftpd stop), tried bringing up dfs on the standard FTP port (21); the namenode comes up, but the rest of the nodes still fail. Can't even telnet to the FTP port from B2, B3, B4.
"telnet localhost 53410" works fine on B1.
All nodes are reachable from one another, and every /etc/hosts is set up with the correct IP address mappings. So I am pretty much clueless at this point. Why on earth would the namenode reject connections - is there a setting in the Hadoop conf that I should be aware of to allow external clients to connect remotely to the namenode port?

Previous answers were not clear to me.
Basically, each Hadoop server (datanode or namenode) creates a listening socket bound to the IP address associated with its lookup name.
Say you have 3 boxes (box1, box2, box3); the /etc/hosts file should look like this:
127.0.0.1 localhost
192.168.10.1 box1
192.168.10.2 box2
192.168.10.3 box3
Instead of :
127.0.0.1 localhost box1
192.168.10.2 box2
192.168.10.3 box3
# (this is incorrect: box1 will be listening exclusively on 127.0.0.1)
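If you want to check which address a given name resolves to on a box (and therefore which interface a Hadoop daemon will bind to), here is a minimal Java sketch; the class name and usage are only illustrative:
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ResolveCheck {
    public static void main(String[] args) throws UnknownHostException {
        // Use the name given on the command line, or fall back to this machine's own hostname.
        String host = (args.length > 0) ? args[0] : InetAddress.getLocalHost().getHostName();
        InetAddress addr = InetAddress.getByName(host);
        // If this prints 127.0.0.1 for box1 while running on box1,
        // daemons will listen on loopback only and remote nodes cannot connect.
        System.out.println(host + " resolves to " + addr.getHostAddress());
    }
}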

Fixed this. It was an incorrect entry in my /etc/hosts: all nodes were connecting to the master over loopback.
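For anyone verifying the same thing: a quick way to confirm which address the namenode actually bound to is a standard net-tools check on the master (the port being the one from the question):
netstat -tlnp | grep 53410
A LISTEN line showing 127.0.0.1:53410, rather than 0.0.0.0:53410 or the machine's LAN address, points to exactly this /etc/hosts problem.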

Try changing in conf/core-site.xml
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
</property>
from localhost to your machine name?
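For example, using the master host and port from the question above (purely illustrative, not the asker's actual file):
<property>
  <name>fs.default.name</name>
  <value>hdfs://B1:53410</value>
</property>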

Set the right file permissions on the DataNode directory:
chmod 755 /home/svenkata/hadoop/datanode/

Related

If I use "localhost", is it guaranteed that the client is local? (considering I can edit the system hosts file)

If I use this solution:
new ServerSocket(9090, 0, InetAddress.getByName("localhost"))
...and the user changes their system hosts file to access my website as "localhost", will this fail to prevent access from a non-local client?
(in response to the bounty call)
As always in computer security, the guarantee depends on the attacker's capabilities.
The attacker is lame and knows nothing. Then yes, localhost guarantees the locality of the client.
The attacker has login access to the system and can run SSH to the outer world. Then no guarantees - SSH can forward internal ports through tunnels:
ssh -R *:8080:localhost:9090 some.external.server
Executing this command on the box running your Java server establishes a tunnel. All requests addressed to some.external.server:8080 will be delivered to localhost:9090 of the target box.
A VPS nowadays costs almost nothing, so the attacker can easily rent such an external box and use it as a proxy between your localhost and the whole world.
You may try to protect your server by filtering out all requests where the Host header is not localhost, but that is easily countered by adding a header-rewriting proxy, such as nginx, to the forwarding chain.
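For illustration, a minimal hand-rolled sketch of that Host-header filtering on the same ServerSocket as above (assuming plain HTTP on the socket; the class name is made up):
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class LocalhostHostHeaderFilter {
    public static void main(String[] args) throws Exception {
        // Bind to the loopback interface only, as in the question.
        ServerSocket server = new ServerSocket(9090, 0, InetAddress.getByName("localhost"));
        while (true) {
            try (Socket s = server.accept();
                 BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()))) {
                boolean hostIsLocal = false;
                String line;
                // Read request headers until the blank line that terminates them.
                while ((line = in.readLine()) != null && !line.isEmpty()) {
                    if (line.toLowerCase().startsWith("host:")) {
                        String host = line.substring(5).trim();
                        hostIsLocal = host.equals("localhost") || host.startsWith("localhost:");
                    }
                }
                String status = hostIsLocal ? "200 OK" : "403 Forbidden";
                String body = hostIsLocal ? "hello local client" : "forbidden";
                s.getOutputStream().write(("HTTP/1.1 " + status
                        + "\r\nContent-Length: " + body.length()
                        + "\r\nConnection: close\r\n\r\n" + body).getBytes());
            }
        }
    }
}
A header-rewriting proxy in the forwarding chain simply sends Host: localhost, so this check passes and, as stated above, offers no real guarantee.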
Summary
As you can see, the guarantee holds only if users on the target box are severely limited: no forwarding software. That implies denying users access to system utilities like ssh, and preventing them from installing or running such tools with user privileges. This is highly unlikely unless the box is a set-top box without any user login or software reconfiguration.
Localhost address
The first comment to the question suggests a trick with localhost name resolution:
the user could probably override localhost so that it's no longer 127.0.0.1
The idea is to place a record to /etc/hosts or c:\Windows\System32\Drivers\etc\hosts that binds localhost name to another IP address.
If your box has an Ethernet connection with, say, address 1.2.3.4, then the line
1.2.3.4 localhost
might change the address of localhost. If this happens, then the line
new ServerSocket(9090, 0, InetAddress.getByName("localhost"))
will bind port 9090 on the external network interface, which is accessible from outside the box.
I tried this on Ubuntu 18.04, and it worked. I successfully connected to the app running on localhost in a box on the other side of the Pacific.
BUT
Once upon a time, MS Windows developers hardcoded localhost to be 127.0.0.1; there is a Medium post about that.
I checked with my Windows 10 box. Confirmed: localhost resolves to 127.0.0.1. The test program
package org.example;

import java.net.*;
import java.io.*;

public class TryLocalhost {
    public static void main(String[] args) throws IOException {
        System.out.println("localhost: " + InetAddress.getByName("localhost"));
    }
}
produces
localhost: localhost/127.0.0.1
while the hosts file tried to bind localhost to the LAN address
# localhost name resolution is handled within DNS itself.
# 127.0.0.1 localhost
# ::1 localhost
192.168.0.198 localhost
The comment is the original one from Microsoft.

Not able to connect to remote cassandra

I am trying to access Cassandra (2.1.0) installed on my machine from another machine using my IP address. Here is how I am trying to do it on the other machine:
Cluster cluster = Cluster.builder().addContactPoint("192.168.3.51").build();
Session session = cluster.connect("adaequare");
But I am not able to access it. Here are a few settings from the Cassandra installed on my machine:
listen_address: localhost
start_native_transport: true
native_transport_port: 9042
rpc_address: localhost
rpc_port: 9160
I tried changing localhost to my IP address, but it did not work either. Do I have to make any other changes in cassandra.yaml to get this done?
You need to post the error; saying "it didn't work" gives no clue at all.
Anyway, rpc_address in cassandra.yaml should point to the IP you configured. If it is 192.168.3.51, then that is the address that needs to go there.
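In other words, assuming 192.168.3.51 really is the address of the machine running Cassandra, the relevant lines of the cassandra.yaml quoted above would become:
rpc_address: 192.168.3.51
rpc_port: 9160
followed by a restart of Cassandra so the new bind address takes effect.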

how to configure cluster address in weblogic

I am configuring a cluster, which is a group of managed servers, in the console. I have 2 managed servers:
Managed_server_name   listen_address      listen_port   ip_address
m1                    slc001.us.xxx.com   7001          10.1.1.1
m2                    slc002.us.xxx.com   7002          10.1.1.2
So I created a cluster and added the two managed servers to it; the cluster messaging mode is unicast. Now how do I configure the cluster address? I have three options - which is the correct one?
slc001.us.xxx.com:7001,slc002.us.xxx.com:7002
10.1.1.1:7001,10.1.1.2:7002
10.1.1.1,10.1.1.2
Both options 1 and 2 will work. Ideally I would use hostnames in case the IPs ever change. As an example, on one of our clusters we use:
portal1:11602, portal2:11602
You must put the SSL port in there if you're using SSL; otherwise use the regular unsecured port.

How to connect to a Cluster programmatically

I successfully installed Cassandra, and when I test with "connect localhost/9160;" it works fine for me. I want to connect with a different IP address/port. I changed the listen_address in the cassandra.yaml file, restarted the server, and when testing it shows the error below.
Exception retrieving information about the cassandra node, check you have connected to the thrift port.
org.apache.thrift.transport.TTransportException: Read a negative frame size (-2113929216)!
    at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:133)
    at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:362)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:284)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:191)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_cluster_name(Cassandra.java:1206)
    at org.apache.cassandra.thrift.Cassandra$Client.describe_cluster_name(Cassandra.java:1194)
    at org.apache.cassandra.cli.CliMain.connect(CliMain.java:138)
    at org.apache.cassandra.cli.CliClient.executeConnect(CliClient.java:2393)
    at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:282)
    at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:201)
    at org.apache.cassandra.cli.CliMain.main(CliMain.java:331)
Any help would be really appreciated. Sorry for my bad English.
If the only parameter you changed is listen_address, then you still need to use port 9160 to connect with cassandra-cli. If you want to change that port as well, then you need to adjust rpc_port in cassandra.yaml accordingly. listen_address defines the address that two Cassandra nodes communicate over; it is independent of the port used for Thrift clients (like cassandra-cli).
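For example, with a made-up server address of 192.168.1.10 (purely illustrative), the CLI connection from the question would still use the Thrift port:
connect 192.168.1.10/9160;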

Elasticsearch server discovery configuration

I've installed an ElasticSearch server, which I'm running with:
$ ./elasticsearch -f
{0.18.2}[11698]: initializing ...
loaded [], sites []
{0.18.2}[11698]: initialized
{0.18.2}[11698]: starting ...
bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.1.106:9300]}
new_master [Stingray][ocw4qPdmSfWuD9pUxHoN1Q][inet[/192.168.1.106:9300]], reason: zen-disco-join (elected_as_master)
elasticsearch/ocw4qPdmSfWuD9pUxHoN1Q
recovered [0] indices into cluster_state
bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.1.106:9200]}
{0.18.2}[11698]: started
How can I configure a Java client to connect to this server?
I have just:
node.client=true
but after trying to connect I'm receiving:
org.elasticsearch.discovery.MasterNotDiscoveredException:
at org.elasticsearch.action.support.master.TransportMasterNodeOperationAction$3.onTimeout(TransportMasterNodeOperationAction.java:162)
If I configure the Java client with:
node.data=false
I'm getting following logs:
INFO main node:internalInfo:93 - [Stark, Tony] {0.18.2}[13008]: starting ...
INFO main transport:internalInfo:93 - [Stark, Tony] bound_address {inet[/0:0:0:0:0:0:0:0:9301]}, publish_address {inet[/192.168.1.106:9301]}
INFO elasticsearch[Stark, Tony]clusterService#updateTask-pool-13-thread-1 service:internalInfo:93 - [Stark, Tony] new_master [Stark, Tony][WkNn96hgTkWXRnsR0EOZjA][inet[/192.168.1.106:9301]]{data=false}, reason: zen-disco-join (elected_as_master)
As I understand it, this means the new node (supposed to be a client node) made itself a new master node, and I don't see from the log that it found or connected to any other node.
Both the server and the client are started on the same machine. 192.168.1.106:9200 is accessible from a browser.
I can't find any good documentation about discovery configuration. Where can I read more about ElasticSearch configuration, and how do I configure the Java client?
The most likely reason for this failure is a firewall on your machine blocking multicast discovery traffic on port 54328. Both client and master broadcast on this port during initial discovery and never hear back from each other. That's why, when you specify node.client=true, the client node (which cannot be a master) fails with MasterNotDiscoveredException, while the node with no data elects itself as a master.
I ran into the same problem, and using IP numbers in the config file resolved it for me.
In /config/elasticsearch.yml, uncomment and change the network.host setting to:
network.host: 127.0.0.1
You can also change this to your machine's IP address as shown by ifconfig.
I had the same issue. Eventually it turned out to be a firewall problem: my firewall (ufw, the default on Ubuntu) was blocking the ElasticSearch ports.
So, to open up the ports, I ran these commands on the terminal:
sudo ufw allow proto tcp to any port 9200:9400
sudo ufw allow proto tcp to any port 54328
My cluster runs locally on 9200, and all my clients open up on 9300+. So, I just opened the range 9200-9400 for them. 54328 is for the multicast broadcast.
Just to be complete: I also used the TransportClient, which works, but I hardcoded localhost as the address the TransportClient connects to. Not a good thing for production code :-)
Faced the same issue where the nodes were not able to elect a master after a node restart.
The problem lies in the communication of nodes among themselves.
Please check in your ElasticSearch logs whether, on node restart, they say
publish_address {127.0.0.1:9200}
or
publish_address {0.0.0.1:9200}
Either means the current node is not publishing its IP address to the other nodes, and hence those nodes won't recognise this node even though its IP might be present in discovery.zen.ping.unicast.hosts.
Solution
Make the following changes in elasticsearch.yml. Add
network.host: _non_loopback:ipv4_
and restart the node.
Ensure that the bound address now shows <IP address>:<port no> and not localhost.
This means your node is now discoverable. The second step to make it discoverable in the cluster is to add the node's IP address to the unicast host lists of all the master nodes, so that whenever there is a new master, the node is discoverable to it.
Add the node IP to the discovery.zen.ping.unicast.hosts list of all the masters to make it discoverable. A master pings all the nodes present in the unicast list.
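Putting the two steps together, the node's elasticsearch.yml would carry something like the following (the unicast host entries are placeholders, not values from this question):
network.host: _non_loopback:ipv4_
discovery.zen.ping.unicast.hosts: ["<master-1-host>", "<master-2-host>"]
The same unicast list on the master nodes should additionally contain this node's IP.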
Something like this should work:
Settings s = ImmutableSettings.settingsBuilder()
        .put(this.settings)
        .build();
TransportClient client = new TransportClient(s);
client.addTransportAddress(new InetSocketTransportAddress("localhost", 9300));
What tripped me up was that I originally tried connecting the client to 9200, not 9300. Guidance for the settings above can be found at http://www.elasticsearch.org/guide/reference/java-api/client.html
Configure network host to localhost:
network.host: 127.0.0.1
