Apache Flink: Standalone Cluster tries to connect with username "flink" - java

For my master's thesis I'm trying to set up a Flink standalone cluster on 4 nodes. I've worked through the documentation, which explains pretty neatly how to set it up. But when I start the cluster there is a warning, and when I try to run a job there is an error with the same message:
akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka.tcp://flink@MYHOSTNAME:6123/user/jobmanager#-818199108]] after [10000 ms]. Sender[null] sent message of type "org.apache.flink.runtime.messages.JobManagerMessages$LeaderSessionMessage"
Increasing the timeout didn't work. When I open the taskmanagers in the web UI, all of them have the following pattern:
akka.tcp://flink@MYHOSTNAME:33779/user/taskmanager
Does anyone have an idea how to solve this to get the cluster working? Thanks in advance!
One last thing: there isn't a user "flink" on the cluster and one won't be created, so any advice that doesn't involve creating that user would be much appreciated. Thanks!

Not sure if it is still relevant, but this is the way I did it (using Flink 1.5.3):
I set up an HA standalone cluster with 3 masters (JobManagers) and 20 slaves (TaskManagers) in the following way.
Define your conf/masters file (hostname:8081 per line)
Define your conf/slaves file (each taskmanager hostname per line)
Define in the flink-conf.yaml on each master machine its own jobmanager.rpc.address hostname
Define in the flink-conf.yaml on each slave machine the jobmanager.rpc.address as localhost
Once everything is set, execute bin/start-cluster.sh on any of the master hosts.
If you need HA, then you need to set up a ZooKeeper quorum and modify the corresponding HA properties (high-availability, high-availability.storageDir, high-availability.zookeeper.quorum), as sketched below.
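For reference, a minimal sketch of the files involved. The hostnames and the ZooKeeper quorum address below are placeholders, not values from the original setup:

conf/masters (same on every machine):
master-1:8081
master-2:8081
master-3:8081

conf/slaves (same on every machine, one taskmanager hostname per line):
slave-1
slave-2
slave-3

conf/flink-conf.yaml (HA entries; storageDir must be a path every node can reach):
high-availability: zookeeper
high-availability.storageDir: hdfs:///flink/ha/
high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181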

Related

Kafka + Spring locally: broker may not be available (Windows 10)

I'm having trouble configuring Kafka and Spring on a Windows 10 machine.
I did according to the guide, which I found on YouTube https://www.youtube.com/watch?v=IncG0_XSSBg&t=538s.
I can't connect locally in any way.
The Spring application is very simple, and its only task is to connect to the running server.
I have already spent a lot of time looking for a solution and nothing helps me.
I tried a lot. In server.properties I changed the listener to:
listeners=PLAINTEXT://127.0.0.1:9092
I also changed the Java version to JRE 8.241.
The Spring application still cannot connect to the broker.
Please help.
UPDATE
After typing the following to start the Kafka server:
bin/kafka-server-start.sh config/server.properties
I got the following error:
After you run ZooKeeper, open another terminal, change directory again to where you ran ZooKeeper, and then run the command bin/kafka-server-start.sh config/server.properties. This will start the Kafka server, and you will be able to reach port 9092.
For details, see the quick start doc.
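Putting both steps together: on Windows the .sh scripts won't run in cmd, so use the .bat equivalents that ship under bin\windows (the sequence below assumes you are in the Kafka installation directory):

bin\windows\zookeeper-server-start.bat config\zookeeper.properties
(then, in a second terminal)
bin\windows\kafka-server-start.bat config\server.properties

On the Spring side, assuming a Spring Boot application using spring-kafka, application.properties should point at the same address the broker listens on:

spring.kafka.bootstrap-servers=127.0.0.1:9092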

Google Kubernetes Engine - Redis Master to Slave replication does not happen

I have set up a cluster under Google Kubernetes Engine and tried the GuestBook Redis image (Java). I was able to put a key onto the Redis master, but failed to read the value from the slave. I tried to read it from the master itself and found the respective key and its value, so the read from the slave presumably fails because replication is not happening.
I tried the approach provided at https://cloud.google.com/kubernetes-engine/docs/tutorials/guestbook (using Java).
I suppose redis-slave-controller.yaml has the necessary configuration to set up the replication, but it still does not work. Could someone please point out what could be missing here?
I was using the latest redis4 image (launcher.gcr.io/google/redis4:latest) for both master and slave, and it seemed to be causing the replication issue. I could not find the right slave image for the latest version, so I switched to the images below, and it is working correctly now.
Redis Master image: gcr.io/google_containers/redis:latest
Redis Slave image: gcr.io/google_containers/redis-slave:v2
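To confirm that replication is actually established, one quick check is redis-cli's INFO replication on both pods (the pod names below are placeholders for whatever kubectl get pods shows):

kubectl exec -it <redis-master-pod> -- redis-cli INFO replication
kubectl exec -it <redis-slave-pod> -- redis-cli INFO replication

The master should report role:master with connected_slaves of at least 1, and each slave should report role:slave with master_link_status:up.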

How to configure Java client connecting to AWS EMR spark cluster

I'm trying to write a simple Spark application. When I run it locally, it works with the master set as
.master("local[2]")
But after configuring a Spark cluster on AWS (EMR), I can't connect to the master URL:
.master("spark://<master url>:7077")
Is this the way to do it? Am I missing something here?
The cluster is up and running, and when I tried adding my application as a step JAR so it would run directly on the cluster, it worked. But I want to be able to run it from a remote machine.
I would appreciate some help here.
Thanks
To run from a remote machine, you will need to open the appropriate ports in the Security Group assigned to your EMR master node. You will need to add at least 7077.
If by "remote" you mean one that isn't in your AWS environment, you will also need to setup a way to route traffic to it from the outside.

How to know the status of a Kafka broker in Java?

I am working on Apache Storm, which has a topology main class. This topology contains a KafkaSpout, which listens to a Kafka topic on a Kafka broker. Now, before I submit this topology, I want to check the status of the Kafka broker that holds the topic. But I haven't found any way to do it. How can a Kafka broker's status be known from the Storm topology class? Please help...
If you simply want a quick way to know if it is running or not you can just run the start command again with the same config:
bin/kafka-server-start.sh config/server.properties
If it's running then you should get an exception about the port already being in use.
Not foolproof, so a better option is to use ZooKeeper, as mentioned in the other answers:
Personally I use IntelliJ, which has a ZooKeeper plugin that helps you browse the brokers/topics registered in it. There is probably something similar for Eclipse or other IDEs.
(IntelliJ)
Go to File > Settings, type "zookeeper" in the search, then install the plugin and click OK (you may need to restart).
Go to File > Settings, type "zookeeper" in the search again. Click Enable, then put in the address where your ZooKeeper server is running and apply the changes. (Note: you may need to check that the port is correct too.)
You should now see your ZooKeeper server as a tab on the left side of the IDE.
This should show you your brokers, topics, consumers, etc.
Hope that helps!
If you have configured storm-ui, then that should give you brief information about the running cluster, including information such as currently running topologies, available free slots, supervisor info, etc.
Programmatically, you can write a Thrift client to retrieve that information from the Storm cluster. You can choose almost any language to develop your own client.
Check out this article for further reference
Depending on what kind of status you want, in most cases you would actually retrieve this from ZooKeeper. In ZooKeeper you can see registered brokers, topics, and other useful things, which might be what you're looking for.
Another solution would be to deploy a small regular consumer that performs those checks for you.
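As a sketch of the ZooKeeper approach: Kafka registers each live broker as an ephemeral znode under /brokers/ids, so listing that path tells you which brokers are up. The connect string and class name below are placeholders:

import java.util.List;
import java.util.concurrent.CountDownLatch;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class BrokerCheck { // hypothetical helper class
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        // "zkhost:2181" is a placeholder for your ZooKeeper quorum address
        ZooKeeper zk = new ZooKeeper("zkhost:2181", 3000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await(); // wait until the session is actually established
        // each child is the id of a live broker; an empty list means no broker is up
        List<String> brokerIds = zk.getChildren("/brokers/ids", false);
        System.out.println("Live brokers: " + brokerIds);
        zk.close();
    }
}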

Can connect to Cassandra cluster using OpsCenter but not DevCenter or via Java

Cassandra noob here. I've done the online training, which didn't need more than a localhost connection. Now I've pulled out some old computers and set them up as a cluster; however, I can't connect to them via DevCenter or the Java driver.
I used OpsCenter to set up the cluster hoping that I would not have to do any manual configuration, but it seems that some manual configuration will be required.
I used OpsCenter 4.0.3 to create a Community 2.0.3 cluster with four nodes. All four nodes are joined to the cluster. OpsCenter sees them all and shows them as Active. All four nodes are running Ubuntu Desktop 13.10. I have successfully added a keyspace using the OpsCenter Schema tab.
Nmap shows that none of the nodes has port 9042 open, so it seems to me that it's a problem with the client side agents not listening on the port.
At the suggestion of someone from DataStax I edited the cassandra.yaml file on one of the nodes (the seed node, as it happens) and set the rpc_address to the node's IP address (i.e. 192.168.0.123). I restarted the node from OpsCenter, but there was no effect.
I then edited cassandra.yaml and changed the listen_address to be the node address, and restarted the node from OpsCenter, again to no avail.
Clearly I have missed a step somewhere along the line. Does anyone who has successfully started a Cassandra cluster know what I'm overlooking?
Edit cassandra.yaml, find the line that has rpc_address, uncomment it, and set it to:
rpc_address: 0.0.0.0
If you used DataStax to install Cassandra, you can find cassandra.yaml in /etc/cassandra.
Check that the following settings are on (at least one of) your C* nodes:
start_native_transport: true
native_transport_port: 9042
rpc_address: IP -- where the IP is something you can ping from the machine running DevCenter.
Once you've restarted the node, make sure you can actually connect to it: telnet IP 9042. If you cannot, then most probably you haven't edited the right cassandra.yaml.
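And once telnet succeeds, a minimal connection test with the DataStax Java driver (2.x, which matches the Cassandra 2.0 line) looks like this; the contact point is the node IP from the question, and the class name is made up:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class ConnectTest { // hypothetical test class
    public static void main(String[] args) {
        // the driver connects over the native transport port, 9042 by default
        Cluster cluster = Cluster.builder()
                .addContactPoint("192.168.0.123") // node IP from the question
                .build();
        Session session = cluster.connect();
        System.out.println("Connected to cluster: " + cluster.getClusterName());
        session.close();
        cluster.close();
    }
}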
