How to share a Chronicle Queue between multiple microservices in AWS - Java

We have a microservice architecture for one of our systems, using Kafka as an event bus.
We had some latency problems and experimented with replacing the Kafka topics with a set of Chronicle Queues. When running locally on a developer machine the results were amazing: one of our most expensive workflows was processed ten to thirty times faster.
Given the good initial results we decided to take the experiment further and deploy our proof of concept in AWS, which is where our system runs. Our microservices run in Docker containers across a number of EC2 instances.
We created an EFS volume and mounted it in each Docker container. We verified that the volume was accessible from each microservice and that the right read/write permissions were granted.
Now the problem:
MS1 receives a message (an API call), does some processing, and emits an event into a Chronicle Queue. We can see on the EFS file system that the queue file is touched. MS2 is supposed to consume that event and do some further processing. This is not happening. Eventually restarting MS2 would trigger the message processing, but not always. Easy to imagine the disappointment.
The question:
Is our EFS approach wrong? If so, what would be the way to go?
Thank you in advance for your inputs.

Chronicle Queue cannot work on a network file system like EFS, as discussed in this previous question and as documented here: https://github.com/OpenHFT/Chronicle-Queue/#usage
To communicate between hosts you need Chronicle Queue Enterprise, which supports TCP/IP replication.
Please also note the documentation for running with Docker.

Related

Fail safe mechanism for Kafka

I am working on an application that writes to a Kafka queue which is read by another application. When I am unable to send a message to Kafka due to a network or other issue, I need to write the messages generated during the downtime somewhere else, e.g. Oracle or the local file system, so that I don't lose them. The problem with Oracle or any other DB is that it can go down too. Are there any recommendations on how I could achieve fail-safety during Kafka downtime?
Approximately 20-25 million messages are generated per day. For the messages stored during downtime I am planning to have a batch job re-send them to the destination application once it is up again.
Thank you
You can push those messages into a cloud-based messaging service like SQS. It supports around 3K messages per second.
There is also a connector that allows you to push the messages back into Kafka directly, with no other headaches.
If you can't export the data out of your local network, then maybe a cluster of RabbitMQ instances may help, although it wouldn't be a plug & play solution.
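The spool-and-replay plan the question describes (fall back to a local file during downtime, then batch re-send) can be sketched as follows. This is a minimal illustration, not a real client: the `Sender` interface stands in for a Kafka or SQS send call, and all class and method names here are made up for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

public class FallbackProducer {
    /** Stand-in for a Kafka/SQS send that may fail while the broker is down. */
    public interface Sender {
        void send(String msg) throws IOException;
    }

    private final Sender primary;
    private final Path spool; // local file used to store messages during downtime

    public FallbackProducer(Sender primary, Path spool) {
        this.primary = primary;
        this.spool = spool;
    }

    /** Try the primary first; on failure, append the message to the local spool file. */
    public boolean send(String msg) throws IOException {
        try {
            primary.send(msg);
            return true;
        } catch (IOException e) {
            Files.write(spool, List.of(msg),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            return false;
        }
    }

    /** Batch job: re-send spooled messages once the primary is back, then clear the spool. */
    public int replay() throws IOException {
        if (!Files.exists(spool)) {
            return 0;
        }
        int replayed = 0;
        for (String line : Files.readAllLines(spool)) {
            primary.send(line);
            replayed++;
        }
        Files.delete(spool);
        return replayed;
    }

    public static void main(String[] args) throws IOException {
        Path spool = Files.createTempFile("kafka-spool", ".log");
        Files.delete(spool); // start with no spool file
        boolean[] up = {false}; // simulate the broker being down, then back up
        Sender broker = msg -> {
            if (!up[0]) throw new IOException("broker unreachable");
        };
        FallbackProducer producer = new FallbackProducer(broker, spool);
        producer.send("m1"); // broker down: spooled locally
        producer.send("m2"); // broker down: spooled locally
        up[0] = true;
        System.out.println("replayed " + producer.replay() + " spooled messages");
        // prints: replayed 2 spooled messages
    }
}
```

Note that the spool file itself can be lost if the disk fails, which is why the answers above suggest a durable external service like SQS instead of purely local storage.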

Big GC pause caused by Kafka producer in microservice

One of my web services sends a log message (size: 10K) to Kafka (version 0.10.2.1) for each online request, and I found that the KafkaProducer consumes a lot of memory, which causes long GC pause times.
There is only one Kafka producer in my service, as officially recommended.
I am just wondering if anyone has suggestions on how to send messages to Kafka without any impact on the online services?
It sounds like the producer isn't able to keep up with the rate at which your service generates logs. (This is necessarily speculation, since you have given minimal details about your setup.)
Have you benchmarked your Kafka cluster? Is it able to sustain the kind of load you generate?
Another avenue would be to decouple the Kafka producer from your actual service. Since you're dealing with log messages, your application can simply write the logs to disk, and you can have a separate process read these log files and send them to Kafka. This way, producing the messages won't impact your main service.
You can even run the Kafka producer on a different VM/container entirely and read the logs via something like an NFS mount.
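A minimal sketch of that decoupling, assuming the service appends one log line per request: the shipper below just returns each new batch of lines, where a real implementation would hand them to a `KafkaProducer` running in its own process. All names here are illustrative.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

public class LogShipper {
    private int linesShipped; // how many lines of the log file have already been forwarded

    /** Read any lines appended since the last call; a real shipper would send each one to Kafka. */
    public List<String> shipNewLines(Path log) throws IOException {
        List<String> all = Files.readAllLines(log);
        List<String> batch = new ArrayList<>(all.subList(linesShipped, all.size()));
        linesShipped = all.size();
        return batch;
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("app", ".log");
        // The web service only appends to the log file; no Kafka client in-process.
        Files.write(log, List.of("req-1", "req-2"));
        LogShipper shipper = new LogShipper();
        System.out.println(shipper.shipNewLines(log)); // prints: [req-1, req-2]
        Files.write(log, List.of("req-3"), StandardOpenOption.APPEND);
        System.out.println(shipper.shipNewLines(log)); // prints: [req-3]
    }
}
```

In production you would persist the shipped offset and poll (or watch) the file, but the point stands: the GC cost of the producer moves out of the request-serving JVM.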

How to increase WebSocket throughput

I need to pull data from a lot of clients connecting to a Java server through a WebSocket.
There are a lot of WebSocket implementations, and I picked vert.x.
I made a simple demo where I listen for text frames of JSON, parse them with Jackson, and send a response back. The JSON parser does not significantly affect throughput.
I am getting an overall speed of 2.5k messages per second with 2 or 10 clients.
Then I tried buffering, so clients don't wait for every single response but send a batch of messages (30k - 90k) after a confirmation from the server - the speed increased to 8k per second.
I see that the Java process has a CPU bottleneck - one core is at 100%.
Meanwhile, the Node.js client's CPU consumption is only 5%.
Even one client causes the server to eat almost a whole core.
Do you think it's worth trying other WebSocket implementations like Jetty?
Is there a way to scale vert.x across multiple cores?
After I changed the log level from debug to info I got 70k. The debug level causes vert.x to print messages for every frame.
You can specify the number of verticle (thread) instances, e.g. by configuring DeploymentOptions: http://vertx.io/docs/vertx-core/java/#_specifying_number_of_verticle_instances
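A minimal sketch of that configuration, assuming Vert.x 3.x on the classpath (the verticle class name is a placeholder):

```java
import io.vertx.core.DeploymentOptions;
import io.vertx.core.Vertx;

public class ScaleOut {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        // Deploy one verticle instance per core; vert.x distributes
        // incoming connections across the instances.
        DeploymentOptions options = new DeploymentOptions()
                .setInstances(Runtime.getRuntime().availableProcessors());
        vertx.deployVerticle("com.example.WebSocketVerticle", options); // placeholder class name
    }
}
```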
You were able to create more than 60k connections on a single machine, so I assume the average lifetime of a connection was less than a second. Is that what you expect in production? To compare other solutions you can try running https://github.com/smallnest/C1000K-Servers
Something doesn't sound right. That's very low performance. Sounds like vert.x is not configured properly for parallelization. Are you using only one verticle (thread)?
The Kaazing Gateway is one of the first WS implementations. By default, it uses multiple cores and is further configurable with threads, etc. We have users using it for massive IoT deployments so your issue is not the JVM.
In case you're interested, here's the github repo: https://github.com/kaazing/gateway
Full disclosure: I work for Kaazing

Tib RV - listing all the processes that are publishing to a given topic

We have RV messaging systems publishing and receiving messages. Recently some underlying jars were upgraded - these are serialization jars used by all publishers and subscribers. However, it seems that some of the publishers are still referencing old versions of the serialization jars, and therefore the receivers fail when trying to deserialize received messages.
Obviously, restarting these publisher services should fix the problem. However, how do I identify all the publishers sending messages to a given topic? There must be some RV admin way of listing all the processes that are publishing to a given topic?
I just gave a similar answer on another question:
There is a really great tool for this called Rai Insight.
Basically, it can sit on a box and silently listen to all the multicast data, presenting statistics even in real time. We used it to monitor traffic flow spikes with just a few seconds' delay.
It can give you traffic statistics broken down by multicast group, service number, or even sending machine: traffic flow peak/average, retransmission rate peak/average - everything you can think of.
It will also give you per-service, per-topic information.

How to distribute Java long running process to remote servers

My PHP web server receives requests and needs to launch a Java program that runs for between 30 seconds and 5 minutes, or even more. That long-running process needs to be distributed across the available servers in my LAN.
What I need:
a job queue (that's done in a DB)
a DB watch: get notified of new or completed jobs (to start another job in the queue)
a way to start a Java process on a remote, available computer.
It seems that it needs to be a DB watch, since I need to evaluate which remote computer is available, and a DB stored procedure wouldn't accomplish that easily.
What is the best, or at least a good, way to achieve this in an OS-independent way using Java?
I guess I could use a FileWatch and manage the queue in a folder, but that seems prehistoric.
Thanks
I would use a JMS queue. You add tasks/messages to a queue and the next available process takes a task, performs it and sends back any result on another queue or topic. This supports transparent load balancing and you can restart tasks if a process fails. No polling is required.
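The pattern can be sketched with an in-memory `BlockingQueue` standing in for the JMS broker: producers put jobs on the queue, and the next free worker takes one. With a real broker (e.g. ActiveMQ) the queues live outside the JVM, so workers on other LAN machines can compete for jobs in exactly the same way. All names here are illustrative.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class JobQueueDemo {
    /** Distribute jobs across nWorkers competing consumers and collect the results. */
    public static Set<String> run(int nWorkers, List<String> jobList) throws InterruptedException {
        BlockingQueue<String> jobs = new LinkedBlockingQueue<>(jobList);
        BlockingQueue<String> results = new LinkedBlockingQueue<>();
        // The worker threads play the role of the "available servers" in the LAN.
        ExecutorService pool = Executors.newFixedThreadPool(nWorkers);
        for (int i = 0; i < nWorkers; i++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        String job = jobs.take();   // the next free worker wins the job
                        results.put("done:" + job); // the long-running work would happen here
                    }
                } catch (InterruptedException e) {
                    // pool shutting down
                }
            });
        }
        Set<String> out = new HashSet<>();
        for (int i = 0; i < jobList.size(); i++) {
            out.add(results.take()); // no polling: take() blocks until a result arrives
        }
        pool.shutdownNow();
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(3, List.of("job1", "job2", "job3")));
    }
}
```

Because each worker blocks on the queue, load balancing falls out for free: a busy worker simply doesn't ask for the next job until it finishes the current one.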
