I am working on an Apache Storm topology whose main class contains a KafkaSpout that listens to a Kafka topic on a Kafka broker. Before I submit this topology, I want to check the status of the Kafka broker that hosts the topic, but I haven't found any way to do it. How can a Kafka broker's status be checked from a Storm topology class? Please help...
If you simply want a quick way to know whether it is running or not, you can just run the start command again with the same config:
bin/kafka-server-start.sh config/server.properties
If it's running then you should get an exception about the port already being in use.
Not foolproof, so a better option is to use Zookeeper, as described below:
Personally I use IntelliJ, which has a Zookeeper plugin that lets you browse the brokers and topics registered in it. There is probably something similar for Eclipse or other IDEs.
(IntelliJ)
Go to File > Settings, type "zookeeper" in the search box, then install the plugin and click OK (you may need to restart).
Go back to File > Settings and search for "zookeeper" again. Click Enable, enter the address where your Zookeeper server is running, and apply the changes. (Note: you may need to check that the port is correct too.)
You should now see your Zookeeper server as a tab on the left side of the IDE.
This should show you your brokers, topics, consumers, etc.
Hope that helps!
If you have configured storm-ui, that should give you brief information about the running cluster, including the currently running topologies, available free slots, supervisor info, etc.
Programmatically, you can write a Thrift client to retrieve that information from the Storm cluster. You can use almost any language to develop your own client.
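For instance, here is a minimal Java sketch using the Thrift client that ships with Storm (class names assume a recent release where they live under org.apache.storm; older releases used backtype.storm instead):

    import java.util.Map;
    import org.apache.storm.generated.ClusterSummary;
    import org.apache.storm.generated.TopologySummary;
    import org.apache.storm.utils.NimbusClient;
    import org.apache.storm.utils.Utils;

    public class ClusterStatusCheck {
        public static void main(String[] args) throws Exception {
            // Reads storm.yaml from the classpath; the nimbus host/port come from there
            Map conf = Utils.readStormConfig();
            NimbusClient nimbus = NimbusClient.getConfiguredClient(conf);
            // getClusterInfo() is the same Thrift call the storm-ui daemon serves
            ClusterSummary summary = nimbus.getClient().getClusterInfo();
            System.out.println("Supervisors: " + summary.get_supervisors_size());
            for (TopologySummary t : summary.get_topologies()) {
                System.out.println(t.get_name() + " -> " + t.get_status());
            }
        }
    }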
Check out this article for further reference
Depending on what kind of status you want, in most cases you would actually retrieve it from Zookeeper. In Zookeeper you can see the registered brokers, topics, and other useful things, which might be what you're looking for.
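As a rough illustration, here is a minimal Java sketch using the plain ZooKeeper client (the connect string is a placeholder; /brokers/ids is where Kafka brokers register themselves as ephemeral znodes, so an entry there means the broker is currently alive):

    import java.util.List;
    import org.apache.zookeeper.ZooKeeper;

    public class KafkaBrokerCheck {
        public static void main(String[] args) throws Exception {
            // Connect to the same Zookeeper ensemble your Kafka brokers use
            ZooKeeper zk = new ZooKeeper("localhost:2181", 10000, event -> { });
            try {
                // Live brokers register ephemeral znodes under /brokers/ids,
                // which disappear automatically when a broker dies
                List<String> brokerIds = zk.getChildren("/brokers/ids", false);
                if (brokerIds.isEmpty()) {
                    System.out.println("No live Kafka brokers, don't submit the topology");
                } else {
                    for (String id : brokerIds) {
                        byte[] data = zk.getData("/brokers/ids/" + id, false, null);
                        System.out.println("Broker " + id + ": " + new String(data, "UTF-8"));
                    }
                }
            } finally {
                zk.close();
            }
        }
    }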
Another solution would be to deploy a small regular consumer that performs those checks for you.
I have two development machines, both running Ignite in server mode on the same network. I started the server on the first machine and then started the second machine. When the second machine starts, it is automatically added to the first one's topology.
Note:
When starting, I removed the work folder on both machines.
In the config, I never mentioned the IPs of any other machines.
Can anyone tell me what's wrong here? My intention is that each machine should have a separate topology.
As described in the discovery documentation, Apache Ignite employs multicast to find all nodes in a local network, forming a cluster. This is the default mode of operation.
Please note that we don't really recommend using this mode for either development or production deployment; use static discovery instead (see the same documentation).
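Here is a minimal Java sketch of static discovery that restricts a node to its own machine, so each box forms its own topology (the port range shown is Ignite's default local discovery range; adapt it as needed):

    import java.util.Collections;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.configuration.IgniteConfiguration;
    import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
    import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

    public class IsolatedNode {
        public static void main(String[] args) {
            // Static IP finder: the node only looks for cluster members at the
            // addresses listed here, so multicast is never used
            TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
            ipFinder.setAddresses(Collections.singletonList("127.0.0.1:47500..47509"));

            TcpDiscoverySpi discovery = new TcpDiscoverySpi();
            discovery.setIpFinder(ipFinder);

            IgniteConfiguration cfg = new IgniteConfiguration();
            cfg.setDiscoverySpi(discovery);

            // Listing only loopback addresses keeps each machine in its own topology
            Ignition.start(cfg);
        }
    }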
For my master's thesis I'm trying to set up a Flink standalone cluster on 4 nodes. I've worked through the documentation, which pretty neatly explains how to set it up. But when I start the cluster there is a warning, and when I try to run a job, there is an error with the same message:
akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka.tcp://flink@MYHOSTNAME:6123/user/jobmanager#-818199108]] after [10000 ms]. Sender[null] sent message of type "org.apache.flink.runtime.messages.JobManagerMessages$LeaderSessionMessage"
Increasing the timeout didn't work. When I open the taskmanagers in the web UI, all of them show the following pattern:
akka.tcp://flink@MYHOSTNAME:33779/user/taskmanager
Does anyone have an idea how to solve this to get the cluster working? Thanks in advance!
One last thing: there isn't a user "flink" on the cluster and one won't be created. So any advice that doesn't involve creating that user would be very much appreciated! Thanks!
Not sure if it is still relevant, but here is the way I did it (using Flink 1.5.3):
I set up an HA standalone cluster with 3 masters (JobManagers) and 20 slaves (TaskManagers) in the following way.
Define your conf/masters file (one hostname:8081 per line)
Define your conf/slaves file (one TaskManager hostname per line)
In flink-conf.yaml on each master machine, set jobmanager.rpc.address to that machine's own hostname
In flink-conf.yaml on each slave machine, set jobmanager.rpc.address to localhost
Once everything is set, execute bin/start-cluster.sh on any of the master hosts.
If you need HA, you also need to set up a ZooKeeper quorum and modify the corresponding HA properties (high-availability, high-availability.storageDir, high-availability.zookeeper.quorum); a sketch of these files is given below.
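For illustration, here is a minimal sketch of those files under the assumptions above (all hostnames and the ZooKeeper/storage addresses are placeholders):

    # conf/masters (same on every machine)
    master1:8081
    master2:8081
    master3:8081

    # conf/slaves (same on every machine)
    slave1
    slave2

    # conf/flink-conf.yaml on master1 (each slave would instead use
    # jobmanager.rpc.address: localhost)
    jobmanager.rpc.address: master1
    high-availability: zookeeper
    high-availability.storageDir: hdfs:///flink/ha/
    high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181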
I'm new-ish to networking, and I'm swimming (drowning) in semantics.
I have a VM which runs a Java application. Ideally, it would be fed inputs from the host through a RabbitMQ queue. The Java application would then place the results on another RabbitMQ queue on a different port, where they would be used by the host application. After researching it for a bit, it seems like RabbitMQ only exists in the localhost space with listeners on different ports; am I correct in this?
Do I need 2 RabbitMQ servers running in tandem, then, (one on the VM and other on Host) each listening to the same port? Or do I just need one RabbitMQ server running while both applications are pointed to the same IP Address/Port?
Also, I have also read that you cannot connect as 'guest/guest' unless it is on localhost, which I understand, but how is RabbitMQ supposed to be configured/reachable to anything besides localhost?
I've been researching for several hours, but the documentation does not point to a direct answer/how-to guide. Perhaps it is my lack of network experience. If anyone could elaborate on these questions or point me to some articles/helpful guides, I would be much obliged.
P.S. -- I don't even know what code to display to give context. Let me know and I'll edit the code into the post.
RabbitMQ listens on TCP port 5672 on all network interfaces out of the box. This includes the "loopback" interface (to allow fast connections to self) and interfaces visible to other remote hosts (including VMs).
For your use case, you probably need a single RabbitMQ instance for both directions. The application on the host will publish messages to one queue and the Java application in the VM will consume messages from that queue and push the result to a second queue. This second queue can be consumed by the application on the host.
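As a rough illustration, here is a minimal Java sketch of the host-side publisher under that layout, using the standard amqp-client library (the broker IP, credentials, and queue names are assumptions to adapt; the VM side would use the same ConnectionFactory settings to consume from "inputs" and publish to "results"):

    import java.nio.charset.StandardCharsets;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;

    public class HostPublisher {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            // Both applications point at the same single broker; the IP and
            // credentials below are placeholders for your environment
            factory.setHost("192.168.56.101");
            factory.setUsername("myuser");
            factory.setPassword("mypassword");

            try (Connection conn = factory.newConnection();
                 Channel channel = conn.createChannel()) {
                // One queue per direction, both on the same broker and port (5672)
                channel.queueDeclare("inputs", true, false, false, null);
                channel.queueDeclare("results", true, false, false, null);
                channel.basicPublish("", "inputs", null,
                        "some work".getBytes(StandardCharsets.UTF_8));
            }
        }
    }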
For the user, you need to create a new user with the appropriate rights. This is documented in the access control article. To create the user, you can do it from the management web UI (after you enabled the management plugin) or using the rabbitmqctl command line tool.
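For example, with rabbitmqctl (the user name, password, and permission patterns here are placeholders):

    rabbitmqctl add_user myuser mypassword
    rabbitmqctl set_permissions -p / myuser ".*" ".*" ".*"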
The last part is networking between the host and the VM. It really depends on the technology you use. It may work out-of-the-box or you may have to configure how VMs are connected to the network. Refer to the documentation of your hypervisor.
I'm new to MQ programming with Java.
For my Integration tests, I would like to clean up the destination queue before posting any messages to it. Is there any option in MQ-Java to delete all messages in a queue in one go?
You can use a WMQ PCF program to clear all messages from a queue in one go. The PCF classes provide an interface to administer WMQ programmatically. There is a sample, PCF_ClearQueue.java, that demonstrates clearing messages from a queue.
On Windows platforms the sample is located in the \tools\pcf\samples directory. More information on clearing queues can be found here.
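As a rough sketch of the same idea (the import package is com.ibm.mq.headers.pcf in recent MQ client jars, com.ibm.mq.pcf in older ones; the host, port, channel, and queue name are placeholders):

    import com.ibm.mq.constants.CMQC;
    import com.ibm.mq.constants.CMQCFC;
    import com.ibm.mq.headers.pcf.PCFMessage;
    import com.ibm.mq.headers.pcf.PCFMessageAgent;

    public class ClearQueue {
        public static void main(String[] args) throws Exception {
            // Client-mode connection; host, port and channel are placeholders
            PCFMessageAgent agent = new PCFMessageAgent("mqhost", 1414, "SYSTEM.DEF.SVRCONN");
            try {
                // PCF equivalent of the MQSC command CLEAR QLOCAL(TEST.QUEUE)
                PCFMessage request = new PCFMessage(CMQCFC.MQCMD_CLEAR_Q);
                request.addParameter(CMQC.MQCA_Q_NAME, "TEST.QUEUE");
                agent.send(request);
            } finally {
                agent.disconnect();
            }
        }
    }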
If you have access to runmqsc, then you can use the MQSC command CLEAR QLOCAL, for example:
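(The queue manager and queue names below are placeholders.)

    echo "CLEAR QLOCAL(TEST.QUEUE)" | runmqsc QMGR1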
Note: if an application has the queue open, that command will fail, and the PCF command will fail too. In that case, you will need to get the messages from the queue one at a time. You can download a Java program called EmptyQ from http://www.capitalware.com/mq_code_java.html that will do the trick.
I am a newbie to GridGain and I would like to know how to add remote nodes to a program. Is there some configuration file? I don't see a clear-cut example anywhere in the guides. (The worst guide I've ever seen.)
By default, GridGain uses multicast discovery, so if multicast is working in your network, nodes should find each other automatically.
You can configure an alternative multicast group address if the default settings do not work.
You can also configure TCP discovery with a list of IP addresses and ports where nodes can start. This gives you more control over the discovery process and is a good alternative if multicast discovery does not work; a sketch is given below.
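For example, a sketch of the relevant Spring XML (the class names follow GridGain 6.x and the addresses are placeholders; check the default-spring.xml shipped with your version for the exact names):

    <bean class="org.gridgain.grid.GridConfiguration">
        <property name="discoverySpi">
            <bean class="org.gridgain.grid.spi.discovery.tcp.GridTcpDiscoverySpi">
                <property name="ipFinder">
                    <bean class="org.gridgain.grid.spi.discovery.tcp.ipfinder.vm.GridTcpDiscoveryVmIpFinder">
                        <property name="addresses">
                            <list>
                                <!-- Replace with the host:port of each remote node -->
                                <value>10.0.0.1:47500</value>
                                <value>10.0.0.2:47500</value>
                            </list>
                        </property>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>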
There are also other means of discovery (including Shared FS and Amazon S3). Check GRIDGAIN_HOME/config/default-spring.xml for examples (search for "discovery"). Also, have a look at GRIDGAIN_HOME/examples/config/spring-cache.xml.
Please ensure you start all the nodes with the same configuration ("localHost" property may differ).
There is a work-in-progress book online: http://www.gridgain.com/book/book.html#_taste_of_gridgain
That should answer these basic questions.