How to reconnect a Kafka producer once closed? - java

I have a multi-threaded app which uses a producer class to produce messages. Earlier I was using the code below to create a producer for each request, i.e. a new KafkaProducer was built with each request:
KafkaProducer<String, byte[]> producer = new KafkaProducer<String, byte[]>(prop);
ProducerRecord<String, byte[]> data = new ProducerRecord<String, byte[]>(topic, objBytes);
producer.send(data, new Callback() {
    @Override
    public void onCompletion(RecordMetadata metadata, Exception exception) {
        if (exception != null) {
            isValidMsg[0] = false;
            exception.printStackTrace();
            saveOrUpdateLog(msgBean, producerType, exception);
            logger.error("ERROR: Unable to produce message.", exception);
        }
    }
});
producer.close();
Then I read the Kafka docs on the producer and learned that we should use a single producer instance for good performance.
So I created a single instance of KafkaProducer inside a singleton class.
Now, when and where should we close the producer? Obviously, if we close the producer after the first send request, it won't find the producer to send subsequent messages, hence throwing:
java.lang.IllegalStateException: Cannot send after the producer is closed.
Or, how can we reconnect to the producer once it is closed?
The problem is: what if the program crashes or has exceptions?

Generally, calling close() on the KafkaProducer is sufficient to make sure all inflight records have completed:
/**
 * Close this producer. This method blocks until all previously sent requests complete.
 * This method is equivalent to <code>close(Long.MAX_VALUE, TimeUnit.MILLISECONDS)</code>.
 * <p>
 * <strong>If close() is called from {@link Callback}, a warning message will be logged and close(0, TimeUnit.MILLISECONDS)
 * will be called instead. We do this because the sender thread would otherwise try to join itself and
 * block forever.</strong>
 * <p>
 *
 * @throws InterruptException If the thread is interrupted while blocked
 */
If your producer is being used throughout the lifetime of your application, don't close it until you get a termination signal, then call close(). As said in the documentation, the producer is safe to use in a multi-threaded environment and hence you should re-use the same instance.
If you're sharing your KafkaProducer in multiple threads, you have two choices:
Call close() from a shutdown callback registered via Runtime.getRuntime().addShutdownHook in your main execution thread
Have your multi-threaded methods race to close it and allow only a single one to win.
A rough sketch of 2 would possibly look like this:
object KafkaOwner {
  private var producer: KafkaProducer = ???
  @volatile private var isClosed = false

  def close(): Unit = {
    if (!isClosed) {
      producer.close()
      isClosed = true
    }
  }

  def instance: KafkaProducer = {
    this.synchronized {
      if (isClosed) {
        producer = new KafkaProducer()
        isClosed = false
      }
      producer
    }
  }
}
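And for option 1, a minimal Java sketch might look like this, assuming a hypothetical ProducerHolder singleton and placeholder configuration values:
import org.apache.kafka.clients.producer.KafkaProducer;

import java.util.Properties;

public class ProducerHolder {
    // single shared producer instance, created once for the lifetime of the app
    private static final KafkaProducer<String, byte[]> PRODUCER = new KafkaProducer<>(buildProps());

    static {
        // close the producer only when the JVM receives a termination signal
        Runtime.getRuntime().addShutdownHook(new Thread(PRODUCER::close));
    }

    public static KafkaProducer<String, byte[]> instance() {
        return PRODUCER;
    }

    private static Properties buildProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        return props;
    }
}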

As described in javadoc for KafkaProducer:
public void close()
Close this producer. This method blocks until all previously sent requests complete.
This method is equivalent to close(Long.MAX_VALUE, TimeUnit.MILLISECONDS).
src: https://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html#close()
So you don't need to worry that your messages won't be sent, even if you call close immediately after send.
If you plan to use a KafkaProducer more than once, then close it only after you've finished using it. If you still want the guarantee that your message is actually sent before your method completes, rather than waiting in a buffer, then use KafkaProducer#flush(), which will block until the current buffer is sent. You can also block on Future#get() if you prefer.
There is also one caveat to be aware of if you don't plan to ever close your KafkaProducer (e.g. in short-lived apps, where you just send some data and the app immediately terminates after sending). The KafkaProducer IO thread is a daemon thread, which means the JVM will not wait until this thread finishes to terminate the VM. So, to ensure that your messages are actually sent use KafkaProducer#flush(), no-arg KafkaProducer#close() or block on Future#get().
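A minimal sketch of both options, assuming an existing producer, a topic name of your own, and the usual imports (error handling omitted):
// send() returns immediately; the record sits in the producer's buffer
Future<RecordMetadata> future = producer.send(new ProducerRecord<>("my-topic", objBytes));

// option 1: block until everything currently buffered has been sent
producer.flush();

// option 2: block until this particular record has been acknowledged
RecordMetadata metadata = future.get(10, TimeUnit.SECONDS);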

The Kafka producer is supposed to be thread-safe and frugal with its thread pool. You might want to use
producer.flush();
instead of
producer.close();
leaving the producer open until program termination, or until you're sure you won't need it any more.
If you still want to close the producer, then recreate it on demand.
producer = new KafkaProducer<String, byte[]>(prop);
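A minimal recreate-on-demand sketch, assuming a wrapper class of your own (the names here are purely illustrative):
public class ReconnectingProducer {
    private final Properties props;
    private KafkaProducer<String, byte[]> producer;

    public ReconnectingProducer(Properties props) {
        this.props = props;
        this.producer = new KafkaProducer<>(props);
    }

    // lazily rebuild the producer if it was closed earlier
    public synchronized KafkaProducer<String, byte[]> get() {
        if (producer == null) {
            producer = new KafkaProducer<>(props);
        }
        return producer;
    }

    public synchronized void close() {
        if (producer != null) {
            producer.close();
            producer = null;
        }
    }
}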

Related

Kafka message handling in consumer

I have a consumer which reads data from a topic and spawns a thread for processing. At a single point in time there can be multiple messages being processed in the server. The application encountered DB timeouts and all the messages being processed were lost. And since there were multiple threads polling for a DB connection, the application threw an out-of-memory exception and went down.
How can I improve the architecture to prevent data loss even if the consumer goes down without finishing processing?
You should do at-least-once processing by committing the offsets after you complete your processing.
i.e. do
consumer.commitSync();
after the thread completes successfully.
Note that you also need to configure the consumer to stop committing the offset automatically by setting 'enable.auto.commit' to false.
You need to be careful, though, that your consumer is idempotent, i.e. if it fails and reads and processes the same value again, it will not affect the outcome.
You should commit the offset after getting a successful response from DB.
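A minimal at-least-once sketch, assuming the usual imports; the broker address, group id, topic, and processing method are illustrative:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // placeholder
props.put("group.id", "my-group");                // placeholder
props.put("enable.auto.commit", "false");         // stop committing offsets automatically
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList("my-topic"));

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        processAndWriteToDb(record); // hypothetical processing method; must be idempotent
    }
    // commit only after every record from this poll was processed successfully
    consumer.commitSync();
}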
The issue is related to the available database connections and threads. The only way to handle this issue is to get a database connection and then pass that connection to the processing thread.
Thread Example
public class ConsumerThreadHandler implements Callable<Object> {
    private ConsumerRecord consumerRecord;
    private Connection dataBaseConnection;

    public ConsumerThreadHandler(ConsumerRecord consumerRecord, Connection dataBaseConnection) {
        this.consumerRecord = consumerRecord;
        this.dataBaseConnection = dataBaseConnection;
    }

    @Override
    public Object call() throws Exception {
        // Perform all the database-related work
        // and generate the proper response
        return null;
    }
}
Consumer Code
executor = new ThreadPoolExecutor(numberOfThreads, numberOfThreads, 0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<>(), new ThreadPoolExecutor.CallerRunsPolicy());
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (final ConsumerRecord record : records) {
        // Get a database connection. Wait until a connection is available,
        // or maintain a connection pool and move on based on availability.
        Future future = executor.submit(new ConsumerThreadHandler(record, dataBaseConnection));
        if (future.isDone()) {
            // Based on the proper response, commit the offset
        }
    }
}
You can go through the following simple example.
https://howtoprogram.xyz/2016/05/29/create-multi-threaded-apache-kafka-consumer/

RabbitMQ. Java client. Is it possible to acknowledge message not on the same thread it was received?

I want to fetch several messages, handle them, and ack them all together after that. So basically I receive a message, put it in some queue, and continue receiving messages from rabbit. A different thread will monitor this queue of received messages and process them when the amount is sufficient. All I've been able to find about ack contains examples for only one message, processed on the same thread it was received on. Like this (from the official docs):
channel.basicQos(1);
final Consumer consumer = new DefaultConsumer(channel) {
    @Override
    public void handleDelivery(String consumerTag, Envelope envelope, AMQP.BasicProperties properties, byte[] body) throws IOException {
        String message = new String(body, "UTF-8");
        System.out.println(" [x] Received '" + message + "'");
        try {
            doWork(message);
        } finally {
            System.out.println(" [x] Done");
            channel.basicAck(envelope.getDeliveryTag(), false);
        }
    }
};
And also documentation says this:
Channel instances must not be shared between threads. Applications
should prefer using a Channel per thread instead of sharing the same
Channel across multiple threads. While some operations on channels are
safe to invoke concurrently, some are not and will result in incorrect
frame interleaving on the wire.
So I'm confused here. If I'm acking some message and at the same time the channel is receiving another message from rabbit, is that considered to be two operations at the same time? It seems to me like yes.
I've tried to acknowledge a message on the same channel from a different thread and it seems to work, but the documentation says that I should not share channels between threads. So I've tried to do the acknowledgment on a different thread with a different channel, but it fails because the delivery tag is unknown to that channel.
Is it possible to acknowledge message not on the same thread it was received?
UPD
An example piece of code of what I want. It's in Scala, but I think it's straightforward.
case class AmqpMessage(envelope: Envelope, msgBody: String)

val queue = new ArrayBlockingQueue[AmqpMessage](100)

val consumeChannel = connection.createChannel()
consumeChannel.queueDeclare(queueName, true, false, true, null)
consumeChannel.basicConsume(queueName, false, new DefaultConsumer(consumeChannel) {
  override def handleDelivery(consumerTag: String,
                              envelope: Envelope,
                              properties: BasicProperties,
                              body: Array[Byte]): Unit = {
    queue.put(new AmqpMessage(envelope, new String(body)))
  }
})

Future {
  // this is a different thread
  val channel = connection.createChannel()
  while (true) {
    try {
      val amqpMessage = queue.take()
      channel.basicAck(amqpMessage.envelope.getDeliveryTag, false)        // doesn't work
      consumeChannel.basicAck(amqpMessage.envelope.getDeliveryTag, false) // works, but seems like not thread safe
    } catch {
      case e: Exception => e.printStackTrace()
    }
  }
}
Although the documentation is pretty restrictive, some operations on channels are safe to invoke concurrently.
You may ACK messages in the different thread as long as consuming and acking are the only actions you do on the channel.
See this SO question, which deals with the same thing:
RabbitMQ and channels Java thread safety
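A minimal sketch of that idea, assuming consuming and acking are the only operations performed on the channel; the pending-ack queue and queue name are illustrative, and error handling is kept simple:
final BlockingQueue<Long> pendingAcks = new LinkedBlockingQueue<>();

// consumer callback: only receives messages and records their delivery tags
channel.basicConsume(queueName, false, new DefaultConsumer(channel) {
    @Override
    public void handleDelivery(String consumerTag, Envelope envelope,
                               AMQP.BasicProperties properties, byte[] body) throws IOException {
        pendingAcks.add(envelope.getDeliveryTag());
    }
});

// separate ack thread: uses the same channel, but only to ack
new Thread(() -> {
    try {
        while (true) {
            long deliveryTag = pendingAcks.take();
            channel.basicAck(deliveryTag, false);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}).start();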
For me your solution is correct. You are not sharing channels across threads.
You never pass your channel object to another thread, you use it on the same thread that receives the messages.
It is not possible that you are
'acking some message and at the same time the channel is receiving another message from rabbit'
If you are in the handleDelivery method, that thread is blocked by your code and has no chance of receiving another message.
As you found out, you cannot acknowledge a message using a channel other than the one that was used to receive it.
You must acknowledge using the same channel, and you must do that on the same thread that was receiving the message. So you may pass the channel object to other methods and classes, but you must be careful not to pass it to another thread.
I use this solution in my project. It uses a RabbitMQ listener and Spring Integration. For every AMQP message, one org.springframework.integration.Message is created. That message has the AMQP message body as its payload, and the AMQP channel and delivery tag as headers of my org.springframework.integration.Message.
If you want to acknowledge several messages, and they were delivered on the same channel, you should use
channel.basicAck(envelope.getDeliveryTag(), true);
For multiple channels, an efficient algorithm is:
Let's say you have 100 messages, delivered using 10 channels.
You need to find the max deliveryTag for each channel.
Invoke channel.basicAck(maxDeliveryTagForThatChannel, true);
This way, you need 10 basicAck calls (network round trips), not 100.
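A rough sketch of that algorithm, assuming you have tracked which channel delivered each message; the PendingMessage holder is hypothetical and error handling is omitted:
// channel -> highest delivery tag seen for that channel
Map<Channel, Long> maxTagPerChannel = new HashMap<>();
for (PendingMessage msg : pendingMessages) { // hypothetical holder of (channel, deliveryTag) pairs
    maxTagPerChannel.merge(msg.getChannel(), msg.getDeliveryTag(), Math::max);
}

// one multiple-ack per channel instead of one ack per message
for (Map.Entry<Channel, Long> entry : maxTagPerChannel.entrySet()) {
    entry.getKey().basicAck(entry.getValue(), true);
}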
As the docs say: one channel per thread; the rest has no restrictions.
I would just like to say a few things about your example. What you are trying to do here is wrong. There is no need to ACK the message only after you take it from the ArrayBlockingQueue, because once you put it there, it stays there. ACKing it to RabbitMQ has nothing to do with the separate ArrayBlockingQueue.

How can I force AmazonSQSBufferedAsyncClient to flush messages?

I'm using the AWS SDK for Java and I'm using the buffered async SQS client to batch requests so that I reduce costs.
When my application shuts down, I want to ensure that no messages are waiting in the buffer, but there's no .flush() method I can see on the client.
Does AmazonSQSBufferedAsyncClient.shutdown() flush my messages when called? I looked at the source code and it's unclear. The method calls shutdown() on each QueueBuffer that it has, but inside QueueBuffer.shutdown() it says
public void shutdown() {
    //send buffer does not require shutdown, only
    //shut down receive buffer
    receiveBuffer.shutdown();
}
Further, the documentation for .shutdown() says:
Shuts down this client object, releasing any resources that might be
held open. This is an optional method, and callers are not expected
to call it, but can if they want to explicitly release any open
resources. Once a client has been shutdown, it should not be used to
make any more requests.
For this application, I need to ensure no messages get lost while being buffered. Do I need to handle this manually using the normal AmazonSQSClient instead of the buffering/async one?
With version 1.11.37 of the SDK, there is a configuration parameter just for this purpose in QueueBufferConfig.
AmazonSQSBufferedAsyncClient bufClient =
    new AmazonSQSBufferedAsyncClient(
        realAsyncClient,
        new QueueBufferConfig()
            .withFlushOnShutdown(true));
There is a method to explicitly call the flush, but it is not accessible, and actually I could not find any call to that method in the Amazon code. It seems like something is missing.
When you call shutdown on the async client it executes the following code:
public void shutdown() {
    for (QueueBuffer buffer : buffers.values()) {
        buffer.shutdown();
    }
    realSQS.shutdown();
}
And QueueBuffer#shutdown() looks like this:
/**
 * Shuts down the queue buffer. Once this method has been called, the
 * queue buffer is not operational and all subsequent calls to it may fail
 * */
public void shutdown() {
    //send buffer does not require shutdown, only
    //shut down receive buffer
    receiveBuffer.shutdown();
}
So it seems they are intentionally not calling sendBuffer.shutdown(), which is the method that would flush every message in the buffer that has not yet been sent.
Did you find a case where you shut down the SQS client and it lost messages? It looks like they are aware of that and that case should not happen, but if you want to be sure you can call that method via reflection, which is really nasty but will satisfy your needs.
AmazonSQSBufferedAsyncClient asyncSqsClient = <your initialization code of the client>;
Field buffersField = ReflectionUtils.findField(AmazonSQSBufferedAsyncClient.class, "buffers");
ReflectionUtils.makeAccessible(buffersField);
LinkedHashMap<String, Object> buffers = (LinkedHashMap<String, Object>) ReflectionUtils.getField(buffersField, asyncSqsClient);
Class<?> clazz = Class.forName("com.amazonaws.services.sqs.buffered.QueueBuffer");
for (Object buffer : buffers.values()) {
    SendQueueBuffer sendQueueBuffer = (SendQueueBuffer) ReflectionUtils.getField(ReflectionUtils.findField(clazz, "sendBuffer"), buffer);
    sendQueueBuffer.flush(); // finally
}
Something like that should work, I guess. Let me know!

Async NIO: Same client sending multiple messages to Server

Regarding Java NIO2.
Suppose we have the following to listen to client requests...
asyncServerSocketChannel.accept(null, new CompletionHandler<AsynchronousSocketChannel, Object>() {
    @Override
    public void completed(final AsynchronousSocketChannel asyncSocketChannel, Object attachment) {
        // Put the execution of the completion handler on another thread so that
        // we don't block another channel being accepted.
        executer.submit(new Runnable() {
            public void run() {
                handle(asyncSocketChannel);
            }
        });
        // Accept another connection.
        asyncServerSocketChannel.accept(null, this);
    }

    @Override
    public void failed(Throwable exc, Object attachment) {
        // TODO Auto-generated method stub
    }
});
This code will accept a client connection, process it, and then accept another.
To communicate with the server, the client opens up an AsyncSocketChannel and fires the message.
The CompletionHandler's completed() method is then invoked.
However, this means that if the client wants to send another message on the same AsyncSocket instance, it can't.
It has to create another AsyncSocket instance, which I believe means another TCP connection, which is a performance hit.
Any ideas how to get around this?
Or, to put the question another way, any ideas how to make the same asyncSocketChannel receive multiple CompletionHandler completed() events?
edit:
My handling code is like this...
public void handle(AsynchronousSocketChannel asyncSocketChannel) {
    ByteBuffer readBuffer = ByteBuffer.allocate(100);
    try {
        // read a message from the client, timeout after 10 seconds
        Future<Integer> futureReadResult = asyncSocketChannel.read(readBuffer);
        futureReadResult.get(10, TimeUnit.SECONDS);
        String receivedMessage = new String(readBuffer.array());
        // some logic based on the message here...
        // after the logic is a return message to the client
        ByteBuffer returnMessage = ByteBuffer.wrap((RESPONSE_FINISHED_REQUEST + " " + client
                + ", " + RESPONSE_COUNTER_EQUALS + value).getBytes());
        Future<Integer> futureWriteResult = asyncSocketChannel.write(returnMessage);
        futureWriteResult.get(10, TimeUnit.SECONDS);
    } ...
So that's it: my server reads a message from the async channel and returns an answer.
The client blocks until it gets the answer. But this is OK. I don't care if the client blocks.
When this is finished, the client tries to send another message on the same async channel and it doesn't work.
There are 2 phases of connection and 2 different kinds of completion handlers.
The first phase is to handle a connection request; this is what you have programmed (BTW, as Jonas said, there is no need to use another executor). The second phase (which can be repeated multiple times) is to issue an I/O request and to handle its completion. For this, you have to supply a memory buffer holding the data to read or write, and you did not show any code for this. When you do the second phase, you'll see that there is no such problem as you wrote: "if the client wants to send another message on the same AsyncSocket instance it can't".
One problem with NIO2 is that, on the one hand, the programmer has to avoid multiple async operations of the same kind (accept, read, or write) on the same channel (or else an error occurs), and on the other hand, the programmer has to avoid blocking waits in handlers. This problem is solved in the df4j-nio2 subproject of the df4j actor framework, where both AsyncServerSocketChannel and AsyncSocketChannel are represented as actors. (df4j is developed by me.)
First, you should not use an executor like you have in the completed-method. The completed-method is already handled in a new worker thread.
In your completed-method for .accept(...), you should call asyncSocketChannel.read(...) to read the data. The client can just send another message on the same socket. This message will be handled with a new call to the completed-method, perhaps by another worker thread on your server.
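A minimal sketch of that idea, where the completion handler re-issues a read on the same channel after each completed one (the message handling is illustrative):
void handle(final AsynchronousSocketChannel channel) {
    final ByteBuffer buffer = ByteBuffer.allocate(1024);
    channel.read(buffer, null, new CompletionHandler<Integer, Void>() {
        @Override
        public void completed(Integer bytesRead, Void attachment) {
            if (bytesRead < 0) {
                return; // client closed the connection
            }
            buffer.flip();
            String message = StandardCharsets.UTF_8.decode(buffer).toString();
            // ... your logic and reply to the client here ...
            buffer.clear();
            // re-issue the read so the same channel can receive the next message
            channel.read(buffer, null, this);
        }

        @Override
        public void failed(Throwable exc, Void attachment) {
            exc.printStackTrace();
        }
    });
}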

Producer Consumer pattern - handling producer failure

I have a producer/consumer pattern like the following
A fixed number of producer threads, each writing onto its own BlockingQueue, invoked via an Executor
A single consumer thread, reading from the producer queues
Each producer is running a database query and writing the results to its queue. The consumer polls all the producer queues. At the moment, if there is a database error, the producer thread dies and then the consumer gets stuck forever waiting for more results on the producer's queue.
How should I structure this to catch and handle errors correctly?
I once did a similar thing and decided to use a sentinel value that the dying producer thread would push into the queue from the catch-block. You can push the exception itself (this works in most scenarios), or have a special object for that. In any case it is great to push the exception to the consumer for debugging purposes.
Whatever class it is that you actually push onto the queue/s, it should contain success/fail/error members so that the consumer/s can check for fails.
Peter has already suggested using only one queue - I don't see how avoiding all that polling should be any particular problem - the objects on the queue can have members that identify which producer they came from, and any other metadata, if required.
It appears that the only option you have when a producer dies is to stop the consumer.
To do this you can use a poison pill. This is a special object which the producer adds when it stops, and the consumer knows to stop when it receives it. The poison pill can be added in a finally block so it is always added, no matter how the producer is killed or dies.
Given you have only one consumer, I would use one queue. This way your consumer will only block once all the producers have died.
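A minimal poison-pill sketch, assuming a single shared queue; Row, runDatabaseQuery(), process() and producerCount are illustrative names:
final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
final Object POISON_PILL = new Object(); // marker telling the consumer a producer has finished or died

// producer: always enqueue the pill, even if the query fails
Runnable producer = () -> {
    try {
        for (Row row : runDatabaseQuery()) { // hypothetical query method
            queue.put(row);
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        try {
            queue.put(POISON_PILL);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }
};

// consumer: stop once a pill has been seen from every producer
Runnable consumer = () -> {
    int pillsSeen = 0;
    try {
        while (pillsSeen < producerCount) {
            Object item = queue.take();
            if (item == POISON_PILL) {
                pillsSeen++;
            } else {
                process((Row) item); // hypothetical processing method
            }
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
};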
You might add some timeout to kill the consumer when there are no more elements in the queue(s) for a certain time.
Another approach might be to have the producers maintain an "alive" flag and signal that they are dying by setting it to false. If the producers run continuously but might not always get results from the database, the "alive" flag could instead be the time the producer last reported being alive; the consumer then uses a timeout to check whether the producer might have died (when the last report of being alive was too long ago).
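A small heartbeat sketch of that last idea, with illustrative names and an assumed timeout (uses java.util.concurrent.atomic.AtomicLong):
class ProducerHeartbeat {
    // last time the producer reported being alive, in milliseconds
    private final AtomicLong lastAlive = new AtomicLong(System.currentTimeMillis());
    private final long timeoutMillis;

    ProducerHeartbeat(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // called by the producer on every loop iteration
    void beat() {
        lastAlive.set(System.currentTimeMillis());
    }

    // called by the consumer to decide whether the producer has probably died
    boolean probablyDead() {
        return System.currentTimeMillis() - lastAlive.get() > timeoutMillis;
    }
}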
Answering my own question.
I used the following class. It takes a list of Runnables and executes them all in parallel; if one fails, it interrupts all the others. Then I have interrupt handling in my producers and consumers so that they die gracefully when interrupted.
This works nicely for my case.
Thanks for all the comments/answers as they gave me some ideas.
// Helper class that does the following:
//
// - if any thread has an exception then interrupt all the others with an eye to cancelling them
// - if the thread calling execute() is interrupted then interrupt all the child threads
public class LinkedExecutor
{
    private final Collection<Runnable> runnables;
    private final String name;

    public LinkedExecutor( String name, Collection<Runnable> runnables )
    {
        this.runnables = runnables;
        this.name = name;
    }

    public void execute()
    {
        ExecutorService executorService = Executors.newCachedThreadPool( ConfigurableThreadFactory.newWithPrefix( name ) );
        // use a completion service to poll the results
        CompletionService<Object> completionService = new ExecutorCompletionService<Object>( executorService );
        for ( Runnable runnable : runnables )
        {
            completionService.submit( runnable, null );
        }
        try
        {
            for ( int i = 0; i < runnables.size(); i++ )
            {
                Future<?> future = completionService.take();
                future.get();
            }
        }
        catch ( InterruptedException e )
        {
            // on an interruption of this thread interrupt all sub-threads in the executor
            executorService.shutdownNow();
            throw new RuntimeException( "Executor '" + name + "' interrupted", e );
        }
        catch ( ExecutionException e )
        {
            // on a failure of any of the sub-threads interrupt all the threads
            executorService.shutdownNow();
            throw new RuntimeException( "Execution exception in executor '" + name + "'", e );
        }
    }
}
