I start an HBase cluster for my test class using this helper class:
HBaseClusterSingleton.java
and I use it like this:
private static final HBaseClusterSingleton cluster = HBaseClusterSingleton.build(1);
I retrieve the configuration object as follows:
cluster.getConf()
and I use it in Spark as follows:
sparkContext.newAPIHadoopRDD(conf, MyInputFormat.class, clazzK, clazzV);
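Putting the pieces together, the test wiring looks roughly like this (a sketch assuming a JavaSparkContext; the diagnostic prints are only there to show what the conf actually carries):
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.api.java.JavaPairRDD;

// Sketch of the test wiring; clazzK/clazzV are the key/value classes
// expected by MyInputFormat.
Configuration conf = cluster.getConf();

// The mini-cluster binds ZooKeeper to a port of its own choosing, so these
// keys should show something other than the localhost:2181 default:
System.out.println(conf.get("hbase.zookeeper.quorum"));
System.out.println(conf.get("hbase.zookeeper.property.clientPort"));

JavaPairRDD<?, ?> rdd =
        sparkContext.newAPIHadoopRDD(conf, MyInputFormat.class, clazzK, clazzV);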
When I run my test, there should be no need to start up a separate HBase cluster, because Spark should connect to my dummy cluster. However, when I run my test method, it throws an error:
2015-08-26 01:19:59,558 INFO [Executor task launch worker-0-SendThread(localhost:2181)] zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(966)) - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2015-08-26 01:19:59,559 WARN [Executor task launch worker-0-SendThread(localhost:2181)] zookeeper.ClientCnxn (ClientCnxn.java:run(1089)) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
The HBase tests that do not run on Spark work fine. When I check the logs, I see that both the cluster and Spark start up correctly:
2015-08-26 01:35:21,791 INFO [main] hdfs.MiniDFSCluster (MiniDFSCluster.java:waitActive(2055)) - Cluster is active
2015-08-26 01:35:40,334 INFO [main] util.Utils (Logging.scala:logInfo(59)) - Successfully started service 'sparkDriver' on port 56941.
I realized that when I start up an HBase cluster from the command line, my Spark test method connects to it!
So, does that mean Spark ignores the conf I passed to it? Any ideas on how to solve this?
Currently I'm struggling with Neo4j/GrapheneDB (Dev free plan) on the Heroku platform.
Launching my app locally via "heroku local" works fine; it connects (via Neo4j Java Driver 4) to Neo4j 3.5.18 (running from the Docker image "neo4j:3.5").
My app is built using the Micronaut framework and its Neo4j support. Launching the app on the Heroku platform succeeds; I'm using the Gradle Heroku plugin for this task.
But accessing the database with business operations (and health checks) fails with an exception like this:
INFO Driver - Direct driver instance 1523082263 created for server address hobby-[...]ldel.dbs.graphenedb.com:24787
WARN RetryLogic - Transaction failed and will be retried in 1032ms
org.neo4j.driver.exceptions.ServiceUnavailableException: Connection to the database terminated. Please ensure that your database is listening on the correct host and port and that you have compatible encryption settings both on Neo4j server and driver. Note that the default encryption setting has changed in Neo4j 4.0.
at org.neo4j.driver.internal.util.Futures.blockingGet(Futures.java:143)
at org.neo4j.driver.internal.InternalSession.beginTransaction(InternalSession.java:163)
at org.neo4j.driver.internal.InternalSession.lambda$transaction$4(InternalSession.java:147)
at org.neo4j.driver.internal.retry.ExponentialBackoffRetryLogic.retry(ExponentialBackoffRetryLogic.java:101)
at org.neo4j.driver.internal.InternalSession.transaction(InternalSession.java:146)
at org.neo4j.driver.internal.InternalSession.readTransaction(InternalSession.java:112)
at org.neo4j.driver.internal.InternalSession.readTransaction(InternalSession.java:106)
at PersonController.logInfoOf(PersonController.java:57)
at PersonController.<init>(PersonController.java:50)
at $PersonControllerDefinition.build(Unknown Source)
at io.micronaut.context.DefaultBeanContext.doCreateBean(DefaultBeanContext.java:1814)
[...]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:832)
Suppressed: org.neo4j.driver.internal.util.ErrorUtil$InternalExceptionCause: null
at org.neo4j.driver.internal.util.ErrorUtil.newConnectionTerminatedError(ErrorUtil.java:52)
at org.neo4j.driver.internal.async.connection.HandshakeHandler.channelInactive(HandshakeHandler.java:81)
[...]
at org.neo4j.driver.internal.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at org.neo4j.driver.internal.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 common frames omitted
I'm sure the login credentials from the OS environment variables GRAPHENEDB_BOLT_URL, GRAPHENEDB_BOLT_USER, and GRAPHENEDB_BOLT_PASSWORD are injected into the app correctly; I've verified it with some debug log statements:
State changed from starting to up
INFO io.micronaut.runtime.Micronaut - Startup completed in 2360ms. Server Running: http://localhost:7382
INFO Application - Neo4j Bolt URIs: [bolt://hobby-[...]ldel.dbs.graphenedb.com:24787]
INFO Application - Neo4j Bolt encrypted? false
INFO Application - Neo4j Bolt trust strategy: TRUST_SYSTEM_CA_SIGNED_CERTIFICATES
INFO Application - Changed trust strategy to: TRUST_ALL_CERTIFICATES
INFO Application - Env.: GRAPHENEDB_BOLT_URL='bolt://hobby-[...]ldel.dbs.graphenedb.com:24787'
INFO Application - Env.: GRAPHENEDB_BOLT_USER='app1[...]hdai'
INFO Application - Env.: GRAPHENEDB_BOLT_PASSWORD of length 31
I've also tried restarting the GrapheneDB instance via the Heroku plugin website, but with the same negative result.
What's going wrong here? Are there any ways to further nail down the root cause?
Thanks
Christian
I had a closer look at this, and it seems that you need driver encryption turned on for GrapheneDB instances. This can be configured in application.yml as below:
neo4j:
  encryption: true
For reference, here is a sample project: https://github.com/aldrinm/micronaut-neo4j-heroku
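If you build the Driver yourself rather than through Micronaut's configuration, the equivalent driver-side setting would look roughly like this (a sketch; it reuses the environment variables from the question and assumes Neo4j Java Driver 4.x):
import org.neo4j.driver.AuthTokens;
import org.neo4j.driver.Config;
import org.neo4j.driver.Driver;
import org.neo4j.driver.GraphDatabase;

// Sketch only: turn on encryption, the programmatic equivalent of
// "neo4j.encryption: true" in application.yml.
Config config = Config.builder()
        .withEncryption()
        .build();

Driver driver = GraphDatabase.driver(
        System.getenv("GRAPHENEDB_BOLT_URL"),
        AuthTokens.basic(System.getenv("GRAPHENEDB_BOLT_USER"),
                         System.getenv("GRAPHENEDB_BOLT_PASSWORD")),
        config);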
I have configured Flink in HA mode as mentioned here:
I wanted to test the fault tolerance, hence I did the following:
Set up a Flink cluster with 2 JobManagers and 1 TaskManager
Start a streaming job on the task manager
Kill the active job manager (to simulate a crash)
The leader election is happening as expected.
But the task manager is not reconnecting to the new job manager. It simply tries to reconnect to the previous leader every 10 seconds.
Pasting the task manager log here:
2018-07-25 19:46:08,508 INFO org.apache.flink.runtime.taskexecutor.TaskManagerConfiguration - Messages have a max timeout of 10000 ms
2018-07-25 19:46:08,515 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.taskexecutor.TaskExecutor at akka://flink/user/taskmanager_0 .
2018-07-25 19:46:08,524 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Starting ZooKeeperLeaderRetrievalService /leader/resource_manager_lock.
2018-07-25 19:46:08,525 INFO org.apache.flink.runtime.taskexecutor.JobLeaderService - Start job leader service.
2018-07-25 19:46:08,529 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Connecting to ResourceManager akka.tcp://flink@10.10.97.210:46477/user/resourcemanager(b91b9aeb3565be973c9bb47259414e0a).
2018-07-25 19:46:08,574 WARN akka.remote.transport.netty.NettyTransport - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /10.10.97.210:46477
2018-07-25 19:46:08,576 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@10.10.97.210:46477] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@10.10.97.210:46477]] Caused by: [Connection refused: /10.10.97.210:46477]
2018-07-25 19:46:08,579 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not resolve ResourceManager address akka.tcp://flink@10.10.97.210:46477/user/resourcemanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@10.10.97.210:46477/user/resourcemanager..
2018-07-25 19:46:18,606 WARN akka.remote.transport.netty.NettyTransport - Remote connection to [null] failed with java.net.ConnectException: Connection refused: /10.10.97.210:46477
2018-07-25 19:46:18,607 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://flink@10.10.97.210:46477] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink@10.10.97.210:46477]] Caused by: [Connection refused: /10.10.97.210:46477]
2018-07-25 19:46:18,607 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not resolve ResourceManager address akka.tcp://flink@10.10.97.210:46477/user/resourcemanager, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink@10.10.97.210:46477/user/resourcemanager..
Restarting the task manager doesn't help.
Restarting the cluster doesn't help.
Please guide me if anything is missing.
Looking at the logs:
Connection refused: /10.10.97.210:46477
Was port 46477 opened/excluded from the firewall?
Just check whether you have set the following in the Flink config:
jobmanager.rpc.port: 6123
blob.server.port: 50100-50200
And then unblock these ports.
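One more thing worth checking: in HA mode the JobManager's RPC port is chosen at random unless it is pinned, which would explain the random-looking port 46477 in the log. Assuming Flink 1.5+, it can be pinned to a range via the HA-specific option (and that range then needs to be unblocked as well):
high-availability.jobmanager.port: 50010-50025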
ERROR org.apache.activemq.broker.BrokerService - Failed to start Apache ActiveMQ ([localhost, null], java.io.IOException: org/apache/activemq/store/NoLocalSubscriptionAware)
INFO org.apache.activemq.broker.BrokerService - Apache ActiveMQ 5.9.1 (localhost, null) is shutting down
INFO org.apache.activemq.broker.TransportConnector - Connector tcp://localhost:61616 stopped
WARN org.apache.activemq.broker.jmx.ManagementContext - Failed to start JMX connector Cannot bind to URL [rmi://localhost:1099/jmxrmi]: javax.naming.NameAlreadyBoundException: jmxrmi [Root exception is java.rmi.AlreadyBoundException: jmxrmi]. Will restart management to re-create JMX connector, trying to remedy this issue.
The code I am trying to use is:
import java.net.URI;
import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.broker.TransportConnector;

// Start an embedded broker with a TCP connector on port 61616
BrokerService broker = new BrokerService();
TransportConnector connector = new TransportConnector();
connector.setUri(new URI("tcp://localhost:61616"));
broker.addConnector(connector);
broker.start();
I am getting the exception at the start() method. I am deploying this on a server, not on my own computer.
It's quite hard to say what's wrong given the limited information, but one thing I'd check is whether there is already a broker running on that server, as it looks like something is at least sitting on the JMX port. You could check the broker log to see if it records any additional information on the error.
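If the JMX clash turns out to be the culprit, one thing to try (a sketch, not a definitive fix) is disabling JMX for the embedded broker, since BrokerService registers a JMX connector by default:
import java.net.URI;
import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.broker.TransportConnector;

BrokerService broker = new BrokerService();
// Skip registering a JMX connector, avoiding the bind on rmi://localhost:1099
broker.setUseJmx(false);
TransportConnector connector = new TransportConnector();
connector.setUri(new URI("tcp://localhost:61616"));
broker.addConnector(connector);
broker.start();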
I want to test pulling data from Apache HBase with a Java application. The application will use SQL-like queries via JDBC to Apache Phoenix.
I've set up my Hadoop "cluster" on one machine using Ambari and the HortonWorks HDP 2.5 platform. I've also Kerberized the environment using Ambari's wizard, where my KDC is a separate machine running Windows Active Directory.
Ambari shows no errors, and I am able to use sqlline.py to successfully make SQL-like calls to HBase through Phoenix. I set up some example tables this way (cf. HortonWorks Phoenix & ODBC tutorial, although I had to kinit etc. first).
However, I am having problems creating a JDBC datasource to be used by the Java application. In my case, I am planning to host the webapp on WildFly 10.1 and I am developing with Eclipse JEE with the JBoss Tools plugin.
These are the steps I used to create the datasource:
Datasource Explorer > Database Connections > New...
Connection Profile: Generic JDBC
URL: jdbc:phoenix:hdfs.eaa.local:2181/hbase-secure:HTTP/hbase.eaa.local@EAA.LOCAL:jboss.server.temp.dir/spnego.service.keytab
Username: hbase (I'm unsure what to put here)
Driver: I've created a new driver of the type "Generic JDBC Driver", and I had to add JAR files for all of the dependencies of phoenix-core-[version].jar. The Driver Class is org.apache.phoenix.jdbc.PhoenixDriver.
I got the connection string from an existing post in the HortonWorks community, which is why it includes the Kerberos principal and keytab used for the connection.
When I try to test the datasource connection, it churns for about 5 minutes before spitting out an error message (after something like 35 attempts). The client returns Java exceptions saying the sockets are in a "closing state", and the ZooKeeper logs are less helpful:
INFO [SyncThread:0:ZooKeeperServer@617] - Established session 0x157aef451560217 with negotiated timeout 40000 for client /192.168.40.3:52674
INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.40.41:43860
INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@827] - Processing ruok command from /192.168.40.41:43860
INFO [Thread-1448:NIOServerCnxn@1007] - Closed socket connection for client /192.168.40.41:43860 (no session established for client)
INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.40.41:43922
INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@868] - Client attempting to establish new session at /192.168.40.41:43922
INFO [SyncThread:0:ZooKeeperServer@617] - Established session 0x157aef451560218 with negotiated timeout 40000 for client /192.168.40.41:43922
INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:SaslServerCallbackHandler@118] - Successfully authenticated client: authenticationID=hbase/hdfs.eaa.local@EAA.LOCAL; authorizationID=hbase/hdfs.eaa.local@EAA.LOCAL.
INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:SaslServerCallbackHandler@134] - Setting authorizedID: hbase
INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@964] - adding SASL authorization for authorizationID: hbase
INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor@494] - Processed session termination for sessionid: 0x157aef451560218
INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007] - Closed socket connection for client /192.168.40.41:43922 which had sessionid 0x157aef451560218
INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted socket connection from /192.168.40.41:44008
INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@827] - Processing ruok command from /192.168.40.41:44008
INFO [Thread-1449:NIOServerCnxn@1007] - Closed socket connection for client /192.168.40.41:44008 (no session established for client)
NB. 192.168.40.3 is the VPN server, which my host machine is using to tunnel into the environment with the Hadoop cluster. 192.168.40.41 is the machine running the cluster, hdfs.eaa.local.
There are plenty of accepted socket connections which are then immediately closed. Occasionally the client authenticates successfully (so I'm confident in my Kerberos settings), but then there is a session termination immediately afterward.
I've also tried to deploy the datasource directly in WildFly with jboss-cli and standalone.xml and module.xml modifications. But I get lots of problems with missing dependencies that I'm not sure how to resolve without creating a new module for each of the many JARs required by phoenix-core-[version].jar. I followed this guide.
What can I do to fix the issue or diagnose further? I've been pulling my hair out for a couple of days now.
You need to add hbase-site.xml and core-site.xml to your classpath.
See How to connect to a Kerberos-secured Apache Phoenix data source with WildFly? for more information.
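For what it's worth, a minimal connection sketch (the keytab path is a placeholder; the Phoenix URL format is quorum:port:znode:principal:keytab, and hbase-site.xml/core-site.xml must be on the classpath so the driver picks up the cluster's security settings):
import java.sql.Connection;
import java.sql.DriverManager;

// Sketch only; replace the keytab path with the real location.
Class.forName("org.apache.phoenix.jdbc.PhoenixDriver");
Connection conn = DriverManager.getConnection(
        "jdbc:phoenix:hdfs.eaa.local:2181:/hbase-secure:"
        + "HTTP/hbase.eaa.local@EAA.LOCAL:/path/to/spnego.service.keytab");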
I am having problems running an app I have developed on an EC2 instance. When I execute the .jar (java -jar app.jar), the Spring Boot app starts, but it fails when trying to connect to my MySQL RDS database. The thing is, when I run the app locally on my machine, it has no issues with the DB connection.
I have opened the port where the app is running (8090) and the MySQL port (3306) for inbound and outbound traffic.
This is the error I get:
2016-09-23 17:46:38.132 INFO 10161 --- [main] .t.TomcatEmbeddedServletContainerFactory : Server initialized with port: 8090
2016-09-23 17:46:38.604 INFO 10161 --- [main] o.apache.catalina.core.StandardService : Starting service Tomcat
2016-09-23 17:46:38.605 INFO 10161 --- [main] org.apache.catalina.core.StandardEngine : Starting Servlet Engine: Apache Tomcat/7.0.54
2016-09-23 17:46:38.724 INFO 10161 --- [ost startStop 1] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
2016-09-23 17:46:38.725 INFO 10161 --- [ost startStop 1] o.s.web.context.ContextLoader: Root WebApplicationContext: initialization completed in 5028 ms
2016-09-23 17:48:48.476 ERROR 10161 --- [ost startStop 1] o.a.tomcat.jdbc.pool.ConnectionPool: Unable to create initial connections of pool.
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
Any ideas on how I can solve this problem?
Thank you very much for your help.
Regards,
Andres
From your description and log file, it's likely that the network configuration is the cause here.
You might want to draw the network topology of your instances (region/availability zone, VPC, subnet, network ACL, security group). This will be very helpful when you do more complex development work.
Some good references: VPC Introduction, Security in Your VPC, and Scenarios for Accessing a DB Instance in a VPC.
I suggest the following actions for your troubleshooting:
Check the security group (SG) configuration of your EC2 instance and your RDS instance.
You can check this by going to the EC2 Dashboard/RDS Dashboard -> clicking on an instance and looking at the "Security Group" description, or by clicking on the Settings icon (Show/Hide columns) and ticking "Security Groups".
In the RDS instance's SG configuration, make sure you have enabled access from the EC2 instance's SG to port 3306. You can do this by putting the EC2 instance's SG ID into the Source field of the rule, as a "Custom IP" value. See the first scenario in the above reference for more detail.
Use the mysql command line client to test the connection between the EC2 instance and RDS, for example:
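A one-liner like the following (placeholders for the endpoint and user; reaching the password prompt at least proves the network path is open):
mysql -h <rds-endpoint> -P 3306 -u <user> -p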
Hope it helps.
You need to perform the following steps:
1) Go to your EC2 instance and find the security group you want to allow access from in RDS.
2) Now go to your RDS security group and select inbound rules.
Select ALL TCP and add your sg-xxx (security group ID) as the source.
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_VPC.Scenarios.html