Fail-safe mechanism for Kafka - Java

I am working on an application that writes to a Kafka queue which is read by another application. When I am unable to send a message to Kafka due to network or other issues, I need to write the messages generated during the Kafka downtime somewhere else, e.g. Oracle or the local file system, so that I don't lose them. The problem with Oracle or any other DB is that it, too, can go down. Are there any recommendations on how I could achieve fail-safety during Kafka downtime?
The number of messages generated is approximately 20-25 million per day. For messages stored during downtime, I am planning to have a batch job resend them to the destination application once the target application is up again.
Thank you

You can push those messages into a cloud-based messaging service like SQS. It supports around 3K messages per second.
There is also a connector that allows you to push the messages back into Kafka directly, with no other headaches.
If you can't export the data out of your local network, then a cluster of RabbitMQ instances may help, although it wouldn't be a plug-and-play solution.
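To make the fallback idea concrete, here is a minimal sketch, assuming the standard Kafka producer and the AWS SDK v2 SQS client (the topic name and queue URL are placeholders, and credentials/region are taken from the environment): the producer tries Kafka first and diverts to SQS only when the send fails.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

public class FailoverSender {
    // Placeholder names: adjust the topic and queue URL for your environment.
    private static final String TOPIC = "events";
    private static final String FALLBACK_QUEUE_URL =
            "https://sqs.us-east-1.amazonaws.com/123456789012/kafka-fallback";

    private final KafkaProducer<String, String> producer;
    private final SqsClient sqs = SqsClient.create(); // region/credentials from env

    public FailoverSender(KafkaProducer<String, String> producer) {
        this.producer = producer;
    }

    public void send(String key, String value) {
        producer.send(new ProducerRecord<>(TOPIC, key, value), (metadata, exception) -> {
            if (exception != null) {
                // Kafka is unreachable or the send timed out: divert to SQS
                // so the message can be replayed into Kafka later.
                sqs.sendMessage(SendMessageRequest.builder()
                        .queueUrl(FALLBACK_QUEUE_URL)
                        .messageBody(value)
                        .build());
            }
        });
    }
}
```

In a real service you would hand the fallback write off to a separate thread (the callback runs on the producer's I/O thread), and also catch the exception that `send()` itself can throw synchronously when the producer's buffer fills up.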

Related

AKHQ UI always shows a consumer lag of 1 for a JDBC sink connector even though all messages are consumed

We are using Kafka Streams (the 2.5.0 jar) in a Java application (with exactly-once semantics) and a JDBC sink connector (UPSERT mode) to write data to the DB.
Flow:
Java Kafka Streams app --------> DB sink connector
The AKHQ user interface always shows a lag of 1, even though all messages are valid and all of them are consumed. Is it because the connector has "isolation.level" set to "read_uncommitted" instead of "read_committed"? The lag is shown in the pic below. I have also seen a Kafka bug, https://issues.apache.org/jira/browse/KAFKA-10683 - is it related to this?
Sink connector consumer lag
Late answer, but I didn't know people asked questions about AKHQ on Stack Overflow.
This is a behaviour of Kafka transactions.
Kafka transactions are handled with a commit marker written to the partition, and that marker will never be delivered to a Kafka consumer.
The consumer simply can't know that the last offset is a commit marker.
So, since AKHQ is a plain Kafka consumer, it will always see this lag of 1.
You will see this for every application using Kafka transactions.
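For reference, this is roughly what the isolation setting mentioned in the question looks like on a plain Java consumer (a sketch with placeholder broker/group names). Note that read_committed hides records from aborted transactions but does not change the end offset, so the apparent lag of 1 caused by the commit marker remains either way:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ReadCommittedConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "jdbc-sink-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Skip records from aborted transactions; the commit marker still
        // counts toward the end offset, hence the apparent lag of 1.
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.close();
    }
}
```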

Restarting requests in case server crashes/restarts (Spring Boot)

I'm developing an application which processes asynchronous requests that take, on average, 10 minutes to finish. The server is written using Spring Boot, has 4 replicas, and sits behind a load balancer. In case one of these servers crashes while processing a number of requests, I want those failed requests to restart on the remaining servers in a load-balanced way.
Note: There's a common database in which we create a unique entry for every incoming request, and delete that entry when that request is processed successfully.
Constraints:
We can't wait for the server to restart.
There's no extra server to keep watch of these servers.
There's no leader/slave architecture among the servers.
Can someone please help me with this problem?
One solution would be to use a message queue to handle the requests. I would recommend using Apache Kafka (Spring for Apache Kafka) and propose the following solution:
Create 4 Kafka topics.
Whenever one of the 4 replicas receives a request, publish it to one of the 4 topics (chosen randomly) instead of simply handling it.
Each replica will connect to Kafka and consume from one topic. If all replicas subscribe within the same consumer group and you let Kafka manage the assignments, then whenever one replica crashes, one of the other 3 will pick up its topic and start consuming requests in its place (see the listener sketch after this answer).
When the crashed replica restarts and connects to Kafka, it can start consuming from its topic again (this rebalancing is already implemented in Kafka).
Another advantage of this solution is that you can, if you want, stop using the database to store requests, as Kafka can act as your datastore in this case.
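A minimal Spring for Apache Kafka sketch of such a listener (topic and group names are illustrative; all 4 replicas run the same code, and Kafka's consumer-group rebalancing reassigns a crashed replica's partitions to the survivors):

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class RequestWorker {

    // Every replica subscribes to all 4 topics in the same consumer group;
    // Kafka assigns each topic's partitions to exactly one live replica and
    // rebalances automatically when a replica crashes or restarts.
    @KafkaListener(
            topics = {"requests-1", "requests-2", "requests-3", "requests-4"},
            groupId = "request-workers")
    public void handle(String request) {
        // Long-running processing (~10 minutes). For work this long, raise
        // max.poll.interval.ms accordingly, or the broker will consider the
        // consumer dead and trigger an unwanted rebalance mid-processing.
        process(request);
    }

    private void process(String request) {
        // ... actual business logic ...
    }
}
```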

Big GC pause caused by Kafka producer in microservice

One of my web services sends a log message (size: 10 KB) to Kafka (version: 0.10.2.1) for each online request, and I found that the KafkaProducer consumes a lot of memory, which causes long GC pause times.
There is only one Kafka producer in my service, which is the officially recommended practice.
I am just wondering if anyone has suggestions on how to send messages to Kafka without any impact on the online services?
It sounds like the producer isn't able to keep up with the rate at which your service is generating logs. (This is necessarily speculation, since you have given minimal details about your setup.)
Have you benchmarked your Kafka cluster? Is it able to sustain the kind of load you generate?
Another avenue would be to decouple your Kafka producer from your actual service. Since you're dealing with log messages, your application can simply write the logs to disk, and you can have a separate process reading those log files and sending them to Kafka. This way, producing the messages won't impact your main service.
You can even have the Kafka producer running on a different VM/container entirely, and read the logs via something like an NFS mount.
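If you keep the in-process producer, its memory footprint is largely governed by a handful of settings. A hedged starting point (the values below are illustrative, not tuned for any particular workload):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;

public class BoundedMemoryProducer {
    public static KafkaProducer<String, String> create() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        // Hard cap on the buffer the producer allocates for unsent records;
        // this is usually the dominant contributor to producer heap usage.
        props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 16 * 1024 * 1024); // 16 MB
        // Batch records so fewer, larger requests go out per partition.
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 64 * 1024);           // 64 KB
        props.put(ProducerConfig.LINGER_MS_CONFIG, 50);
        // Compress batches to shrink buffered and on-the-wire payloads.
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");
        // Fail fast instead of blocking request threads when the buffer is full.
        props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 1000);
        return new KafkaProducer<>(props);
    }
}
```

A bounded buffer trades memory pressure for the possibility of dropped or delayed sends under bursts, which is usually the right trade for log data on a latency-sensitive service.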

How does JMS interact with the underlying database?

I understand JMS as depicted by the following diagram:
(source: techhive.com)
Is there any way for me to access the underlying database using JMS or something else? Furthermore, regarding the JDBC connections that the JMS server maintains: can I add new connections to it so as to access other databases as well and do CRUD operations on them? If yes, how?
Where did you get this from?
Normally, JMS is used to send messages to queues (or topics). You have message producers that push messages into the queue and message consumers that consume and process them.
In your example, it seems that you have multiple queues: one for the messages that need to be processed, and one per client for retrieving the results of processing its messages.
With a JMS server you don't necessarily have a database behind it. Everything can stay in memory, or can be written to files. You need a database server behind it only if you configure your JMS server to be persistent (to ensure that even if the server/application crashes, your messages won't be lost). But even in that case you never interact with the database yourself. Only the JMS server does; you interact with the JMS server by sending and consuming messages.
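To make the boundary concrete, here is a minimal JMS producer sketch (standard javax.jms API; the connection factory comes from outside and the queue name is a placeholder). Nothing here touches JDBC: whether the broker persists the message to a database or a file store is the broker's own configuration, and the client only influences it through the delivery mode:

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.DeliveryMode;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

public class JmsSendExample {
    // The application only ever sees the JMS API; how (or whether) the broker
    // persists messages to a database is configured on the broker itself.
    public static void send(ConnectionFactory factory, String text) throws Exception {
        Connection connection = factory.createConnection();
        try {
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue queue = session.createQueue("orders.incoming"); // placeholder name
            MessageProducer producer = session.createProducer(queue);
            // PERSISTENT asks the broker to store the message durably
            // (file store or database) before acknowledging the send.
            producer.setDeliveryMode(DeliveryMode.PERSISTENT);
            producer.send(session.createTextMessage(text));
        } finally {
            connection.close();
        }
    }
}
```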

SQL Server as persistent DB for ActiveMQ

When my ActiveMQ goes down, how can I store the messages that are on their way to ActiveMQ? If the answer is using a persistence DB, then how and when can I resend those messages that were stored in the DB back to the ActiveMQ queue (assuming it is up and working again)?
(To give you the complete background: whenever a row gets inserted into my DB, a DB trigger sends an HTTP request to my Java app. This app puts the changes from the DB as messages into ActiveMQ. We wrote it this way because we are not experts in the Java Spring framework.)
Any solutions or suggestions in this regard are much appreciated.
What you are looking for is indeed persistence:
Persistent messaging (ensures the messages are stored in a datastore until the broker receives the acknowledgement that they have been delivered successfully to all consumers)
This will ensure the messages are redelivered (automatically) once the broker is back up.
If you want redundancy, you should look into a master/slave topology.
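Note that broker persistence only protects messages that have already reached the broker. For messages that are "on their way" while the broker is down, the client-side failover transport is the usual complement: it transparently reconnects and retries, and with a master/slave pair it switches to the surviving broker. A sketch, assuming the standard ActiveMQ 5.x client with placeholder broker URLs and queue name:

```java
import javax.jms.Connection;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import org.apache.activemq.ActiveMQConnectionFactory;

public class FailoverSendExample {
    public static void main(String[] args) throws Exception {
        // failover: reconnects and retries sends automatically; with a
        // master/slave pair the client fails over to the surviving broker.
        // timeout bounds how long a send blocks while no broker is reachable.
        ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
                "failover:(tcp://broker1:61616,tcp://broker2:61616)?timeout=3000");
        Connection connection = factory.createConnection();
        try {
            connection.start();
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue queue = session.createQueue("db.changes"); // placeholder name
            MessageProducer producer = session.createProducer(queue);
            producer.send(session.createTextMessage("row inserted"));
        } finally {
            connection.close();
        }
    }
}
```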
