I have gone through multiple posts, but most of them cover handling bad messages, not exception handling while processing them.
I want to know how to handle messages that have been received by the stream application when an exception occurs while processing them. The exception could have multiple causes, such as a network failure, a RuntimeException, etc.
Could someone suggest the right way to handle this? Should I use
setUncaughtExceptionHandler, or is there a better way?
How should retries be handled?
It depends on what you want to do with exceptions on the producer side.
If an exception is thrown on the producer (e.g. due to a network failure or because the Kafka broker has died), the stream will die by default. With kafka-streams version 1.1.0 you can override the default behavior by implementing ProductionExceptionHandler, like the following:
public class CustomProductionExceptionHandler implements ProductionExceptionHandler {

    @Override
    public ProductionExceptionHandlerResponse handle(final ProducerRecord<byte[], byte[]> record,
                                                     final Exception exception) {
        log.error("Kafka message marked as processed although it failed. Message: [{}], destination topic: [{}]",
                new String(record.value()), record.topic(), exception);
        return ProductionExceptionHandlerResponse.CONTINUE;
    }

    @Override
    public void configure(final Map<String, ?> configs) {
    }
}
From the handle method you can return either CONTINUE if you don't want the stream to die on the exception, or FAIL if you want the stream to stop (FAIL is the default).
And you need to specify this class in the stream config:
default.production.exception.handler=com.example.CustomProductionExceptionHandler
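The same handler can also be set programmatically when building the streams configuration, roughly like this (a minimal sketch; the application id, bootstrap servers and topology are placeholders for your own):
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");                 // placeholder
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");      // placeholder
// register the custom handler instead of the properties-file entry above
props.put(StreamsConfig.DEFAULT_PRODUCTION_EXCEPTION_HANDLER_CLASS_CONFIG,
        CustomProductionExceptionHandler.class);
KafkaStreams streams = new KafkaStreams(topology, props);                 // 'topology' built elsewhere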
Also note that ProductionExceptionHandler handles only exceptions on the producer; it will not handle exceptions thrown while processing a message with stream methods such as mapValues(..), filter(..), branch(..) etc. You need to wrap the logic of those methods with try / catch blocks (put all your method logic into the try block to guarantee that you handle all exceptional cases):
.filter((key, value) -> { try {..} catch (Exception e) {..} })
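For example, a sketch of wrapping a mapValues transformation the same way (transform(..) and log are placeholders for your own business logic and logger):
KStream<String, String> output = input.mapValues(value -> {
    try {
        return transform(value);                      // hypothetical business logic
    } catch (Exception e) {
        log.error("Failed to process value [{}]", value, e);
        return "";                                    // or drop/route the record, depending on your use case
    }
});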
As far as I know, we don't need to handle exceptions on the consumer side explicitly, as Kafka Streams will retry consuming later automatically (the offset is not changed until the messages are consumed and processed). For example, if the Kafka broker is not reachable for some time, you will get exceptions from Kafka Streams, and when the broker is back up, Kafka Streams will consume all messages. So in this case we just have a delay and nothing is corrupted or lost.
With setUncaughtExceptionHandler you are not able to change the default behavior like you can with ProductionExceptionHandler; with it you can only log the error or send the message to a failure topic.
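For example, a minimal sketch of the pre-2.8 handler (it can only observe the failure, e.g. log it, after the stream thread has already died):
kafkaStreams.setUncaughtExceptionHandler((thread, throwable) ->
        log.error("Stream thread {} died", thread.getName(), throwable));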
Update since kafka-streams 2.8.0
Since kafka-streams 2.8.0 you have the ability to automatically replace a failed stream thread (caused by an uncaught exception) using the KafkaStreams method void setUncaughtExceptionHandler(StreamsUncaughtExceptionHandler eh); with StreamThreadExceptionResponse.REPLACE_THREAD. For more details please take a look at Kafka Streams Specific Uncaught Exception Handler.
kafkaStreams.setUncaughtExceptionHandler(ex -> {
log.error("Kafka-Streams uncaught exception occurred. Stream will be replaced with new thread", ex);
return StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse.REPLACE_THREAD;
});
For handling exceptions on the consumer side:
1) You can add a default deserialization exception handler with the following property (note that this applies to the consumer/deserialization side, not the producer):
"default.deserialization.exception.handler" = "org.apache.kafka.streams.errors.LogAndContinueExceptionHandler";
Basically Apache Kafka provides the following handler classes:
1) LogAndContinueExceptionHandler, which you can set as
props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
LogAndContinueExceptionHandler.class);
2) LogAndFailExceptionHandler
props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
LogAndFailExceptionHandler.class);
3) LogAndSkipOnInvalidTimestamp (note that this one is a TimestampExtractor rather than a deserialization exception handler, so it is configured via the timestamp extractor property)
props.put(StreamsConfig.DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG,
LogAndSkipOnInvalidTimestamp.class);
For custom exception handling:
1) You can implement the DeserializationExceptionHandler interface and override the handle() method.
2) Or you can extend the above-mentioned classes.
setUncaughtExceptionHandler doesn't help to handle the exception; it runs only after the stream has already terminated due to an exception that was not caught.
Kafka provides a few ways to handle exceptions. A simple try-catch would help catch exceptions in the processor code, but Kafka deserialization exceptions (which can be due to data issues) and production exceptions (which occur during communication with the broker) require a DeserializationExceptionHandler and a ProductionExceptionHandler respectively. By default, a Kafka application will fail if it encounters any of these.
You can find more details on this post.
In Spring Cloud Stream you configure your custom deserialization handler using the following:
spring.cloud.stream.kafka.streams.binder.configuration.default.deserialization.exception.handler=your-package-name.CustomLogAndContinueExceptionHandler
CustomLogAndContinueExceptionHandler extends LogAndContinueExceptionHandler or implements DeserializationExceptionHandler, and returns DeserializationHandlerResponse.CONTINUE or FAIL depending on your use case:
@Slf4j
public class CustomLogAndContinueExceptionHandler extends LogAndContinueExceptionHandler {

    @Override
    public DeserializationHandlerResponse handle(ProcessorContext context, ConsumerRecord<byte[], byte[]> record,
                                                 Exception exception) {
        // ... some business logic here ...
        log.error("Message failed: taskId: {}, topic: {}, partition: {}, offset: {}, detail error: {}",
                context.taskId(), record.topic(), record.partition(), record.offset(), exception.getMessage());
        return DeserializationHandlerResponse.CONTINUE;
    }
}
I am trying to consume batches in Kafka, and I found the documentation says retry is not supported, as follows:
Retry within the binder is not supported when using batch mode, so maxAttempts will be overridden to 1. You can configure a SeekToCurrentBatchErrorHandler (using a ListenerContainerCustomizer) to achieve similar functionality to retry in the binder. You can also use a manual AckMode and call Acknowledgment.nack(index, sleep) to commit the offsets for a partial batch and have the remaining records redelivered. Refer to the Spring for Apache Kafka documentation for more information about these techniques.
If I use Acknowledgment.nack(index, sleep), it retries infinitely when an error happens. Is there any way to retry within maxAttempts times using Acknowledgment.nack?
The code looks like this:
@StreamListener(Sink.INPUT)
public void consume(@Payload List<PayLoad> payloads, @Header(KafkaHeaders.ACKNOWLEDGMENT) Acknowledgment acknowledgment) {
    try {
        consume(payloads);
        acknowledgment.acknowledge();
    } catch (Exception e) {
        acknowledgment.nack(0, 50);
    }
}
There is not; you have to keep track of the retry count yourself.
However, since version 2.5 you can use a RecoveringBatchErrorHandler: you throw a specific exception to tell the handler which record failed, and it commits the offsets for the records before that one and applies retry logic to the failed record.
See https://docs.spring.io/spring-kafka/reference/html/#recovering-batch-eh
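A rough sketch of that approach with the Spring Cloud Stream Kafka binder (bean name and back-off values are made up):
// Registers a RecoveringBatchErrorHandler (Spring Kafka 2.5+) with a bounded back-off:
// offsets before the failed record are committed, the failed record is retried twice
// and then handed to the default (logging) recoverer.
@Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer<byte[], byte[]>> customizer() {
    return (container, destination, group) -> container.setBatchErrorHandler(
            new RecoveringBatchErrorHandler(new FixedBackOff(1000L, 2)));
}
In the listener itself you would then throw new BatchListenerFailedException("could not process", failedIndex) instead of calling nack, so the handler knows which record in the batch failed.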
I have an application with a main thread and a JMS thread which talk to each other through ActiveMQ 5.15.11. I am able to send messages just fine, however I would like a way to send back status or errors. I noticed that the MessageListener allows for onSuccess() and onException(ex) as two events to listen for, however I am finding that only onSuccess() is getting called.
Here are snippets of my code.
JMS Thread:
ConnectionFactory factory = super.getConnectionFactory();
Connection connection = factory.createConnection();
Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
Queue queue = session.createQueue(super.getQueue());
MessageConsumer consumer = session.createConsumer(queue);
consumer.setMessageListener(m -> {
try {
super.processRmbnConfigMsg(m);
} catch (JMSException | IOException e) {
LOG.error(e.getMessage(), e);
// I can only use RuntimeException.
// Also this exception is what I am expecting to get passed to the onException(..)
// call in the main thread.
throw new RuntimeException(e);
}
});
connection.start();
Main thread (sending messages to JMS):
sendMessage(xml, new AsyncCallback() {

    @Override
    public void onException(JMSException e) {
        // I am expecting this to be that RuntimeException from the JMS thread.
        LOG.error("Error", e);
        doSomethingWithException(e);
    }

    @Override
    public void onSuccess() {
        LOG.info("Success");
    }
});
What I am expecting is that the exception thrown via new RuntimeException(e) will get picked up by the onException(JMSException e) event listener in some way, even if the RuntimeException is wrapped.
Instead, I am always getting onSuccess() events. I suppose the onException(..) event happens during communication issues, but I would like a way to send back to the caller exceptions.
How do I accomplish that goal of collecting errors in the JMS thread and sending it back to my calling thread?
Your expectation is based on a fundamental misunderstanding of JMS.
One of the basic tenets of brokered messaging is that producers and consumers are logically disconnected from each other. In other words, a producer sends a message to a broker and doesn't necessarily care whether it is consumed successfully or not, and it certainly won't know who consumes it or have any guarantee of when it will be consumed. Likewise, a consumer doesn't necessarily know when or why the message was sent or who sent it. This provides great flexibility between producers and consumers. JMS adheres to this tenet of disconnected producers and consumers.
There is no direct way for a consumer to inform a producer about a problem with the consumption of the message it sent. That said, you can employ what's called a "request/response pattern" so that the consumer can provide some kind of feedback to the producer. You can find an explanation of this pattern along with example code here.
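As a rough sketch of that pattern (session, consumer and LOG are reused from the snippets above, producer is an assumed MessageProducer for the request queue, and the 5-second timeout is arbitrary), the producer sets a JMSReplyTo destination and the consumer reports its status, or the error, back on it:
// Producer side: set JMSReplyTo on the request and wait for the consumer's status.
Destination replyQueue = session.createTemporaryQueue();
TextMessage request = session.createTextMessage(xml);
request.setJMSReplyTo(replyQueue);
producer.send(request);

MessageConsumer replyConsumer = session.createConsumer(replyQueue);
Message reply = replyConsumer.receive(5000);   // "OK" or an error description from the consumer

// Consumer side (inside the message listener): send the outcome back on JMSReplyTo.
MessageProducer replyProducer = session.createProducer(null);   // anonymous producer
consumer.setMessageListener(m -> {
    try {
        String status;
        try {
            super.processRmbnConfigMsg(m);
            status = "OK";
        } catch (JMSException | IOException e) {
            LOG.error(e.getMessage(), e);
            status = "ERROR: " + e.getMessage();
        }
        replyProducer.send(m.getJMSReplyTo(), session.createTextMessage(status));
    } catch (JMSException replyFailure) {
        LOG.error("Could not send reply", replyFailure);
    }
});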
Also, the AsyncCallback class you're using is not part of JMS. I believe it's org.apache.activemq.AsyncCallback provided exclusively by ActiveMQ itself and it only provides callbacks for success or failure for the actual send operation (i.e. not for the consumption of the message).
Lastly, you should know that throwing a RuntimeException from the onMessage method of a javax.jms.MessageListener is considered a "programming error" by the JMS specification and should be avoided. Section 8.7 of the JMS 2 specification states:
It is possible for a listener to throw a RuntimeException; however, this is considered a client programming error. Well behaved listeners should catch such exceptions and attempt to divert messages causing them to some form of application-specific 'unprocessable message' destination.
The result of a listener throwing a RuntimeException depends on the session's acknowledgment mode.
AUTO_ACKNOWLEDGE or DUPS_OK_ACKNOWLEDGE - the message will be immediately redelivered. The number of times a JMS provider will redeliver the same message before giving up is provider-dependent. The JMSRedelivered message header field will be set, and the JMSXDeliveryCount message property incremented, for a message redelivered under these circumstances.
CLIENT_ACKNOWLEDGE - the next message for the listener is delivered. If a client wishes to have the previous unacknowledged message redelivered, it must manually recover the session.
Transacted Session - the next message for the listener is delivered. The client can either commit or roll back the session (in other words, a RuntimeException does not automatically rollback the session).
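Following that advice, a minimal sketch of a "well behaved" listener for the JMS thread above (the queue name is made up) catches the failure and diverts the message instead of throwing a RuntimeException:
Queue unprocessable = session.createQueue("RMBN.CONFIG.UNPROCESSABLE");   // hypothetical DLQ name
MessageProducer dlqProducer = session.createProducer(unprocessable);

consumer.setMessageListener(m -> {
    try {
        super.processRmbnConfigMsg(m);
    } catch (JMSException | IOException e) {
        LOG.error(e.getMessage(), e);
        try {
            dlqProducer.send(m);                        // divert rather than rethrow
        } catch (JMSException divertFailure) {
            LOG.error("Could not divert message", divertFailure);
        }
    }
});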
In Netty you have the concept of inbound and outbound handlers. A catch-all inbound exception handler is implemented simply by adding a channel handler at the end (the tail) of the pipeline and implementing an exceptionCaught override. The exception happening along the inbound pipeline will travel along the handlers until meeting the last one, if not handled along the way.
There isn't an exact opposite for outgoing handlers. Instead (according to Netty in Action, page 94) you need to either add a listener to the channel's Future or a listener to the Promise passed into the write method of your Handler.
As I am not sure where to insert the former, I thought I'd go for the latter, so I made the following ChannelOutboundHandler:
/**
* Catch and log errors happening in the outgoing direction
*
* @see <p>p94 in "Netty In Action"</p>
*/
private ChannelOutboundHandlerAdapter createOutgoingErrorHandler() {
return new ChannelOutboundHandlerAdapter() {
@Override
public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise) {
logger.info("howdy! (never gets this far)");
final ChannelFutureListener channelFutureListener = future -> {
if (!future.isSuccess()) {
future.cause().printStackTrace();
// ctx.writeAndFlush(serverErrorJSON("an error!"));
future.channel().writeAndFlush(serverErrorJSON("an error!"));
future.channel().close();
}
};
promise.addListener(channelFutureListener);
ctx.write(msg, promise);
}
    };
}
This is added to the head of the pipeline:
#Override
public void addHandlersToPipeline(final ChannelPipeline pipeline) {
pipeline.addLast(
createOutgoingErrorHandler(),
new HttpLoggerHandler(), // an error in this `write` should go "up"
authHandlerFactory.get(),
// etc
The problem is that the write method of my error handler is never called if I throw a runtime exception in the HttpLoggerHandler.write().
How would I make this work? An error in any of the outgoing handlers should "bubble up" to the one attached to the head.
An important thing to note is that I don't merely want to close the channel, I want to write an error message back to the client (as seen from serverErrorJSON('...')). During my trials of shuffling around the order of the handlers (also trying out stuff from this answer), I have gotten the listener activated, but I was unable to write anything. If I used ctx.write() in the listener, it seemed as if I got into a loop, while using future.channel().write... didn't do anything.
I found a very simple solution that allows both inbound and outbound exceptions to reach the same exception handler positioned as the last ChannelHandler in the pipeline.
My pipeline is setup as follows:
//Inbound propagation
socketChannel.pipeline()
.addLast(new Decoder())
.addLast(new ExceptionHandler());
//Outbound propagation
socketChannel.pipeline()
.addFirst(new OutboundExceptionRouter())
.addFirst(new Encoder());
This is the content of my ExceptionHandler, it logs caught exceptions:
public class ExceptionHandler extends ChannelInboundHandlerAdapter {
@Override
public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) throws Exception {
log.error("Exception caught on channel", cause);
}
}
Now the magic that allows even outbound exceptions to be handled by ExceptionHandler happens in the OutboundExceptionRouter:
public class OutboundExceptionRouter extends ChannelOutboundHandlerAdapter {
@Override
public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise) throws Exception {
promise.addListener(ChannelFutureListener.FIRE_EXCEPTION_ON_FAILURE);
super.write(ctx, msg, promise);
}
}
This is the first outbound handler invoked in my pipeline, what it does is add a listener to the outbound write promise which will execute future.channel().pipeline().fireExceptionCaught(future.cause()); when the promise fails. The fireExceptionCaught method propagates the exception through the pipeline in the inbound direction, eventually reaching the ExceptionHandler.
In case anyone is interested, as of Netty 4.1, the reason why we need to add a listener to get the exception is because after performing a writeAndFlush to the channel, the invokeWrite0 method is called in AbstractChannelHandlerContext.java which wraps the write operation in a try catch block. The catch block notifies the Promise instead of calling fireExceptionCaught like the invokeChannelRead method does for inbound messages.
Basically what you did is correct... The only thing that is not correct is the order of the handlers. Your ChannelOutboundHandlerAdapter must be placed "as last outbound handler" in the pipeline, which means it should be like this:
pipeline.addLast(
new HttpLoggerHandler(),
createOutgoingErrorHandler(),
authHandlerFactory.get());
The reason for this is that outbound events flow from the tail to the head of the pipeline, while inbound events flow from the head to the tail.
There does not seem to be a generalized concept of a catch-all exception handler for outbound handlers that will catch errors regardless of where they occur. This means that unless you registered a listener to catch a certain error, a runtime error will probably be "swallowed", leaving you scratching your head as to why nothing is being returned.
That said, maybe it doesn't make sense to have a handler/listener that always executes given an error (as it would need to be very general), but it does make logging errors a bit trickier than it needs to be.
After writing a bunch of learning tests (which I suggest checking out!) I ended up with these insights, which are basically the names of my JUnit tests (after some regex manipulation):
a listener can write to a channel after the parent write has completed
a write listener can remove listeners from the pipeline and write on an erroneous write
all listeners are invoked on success if the same promise is passed on
an error handler near the tail cannot catch an error from a handler nearer the head
netty does not invoke the next handlers write on runtime exception
netty invokes a write listener once on a normal write
netty invokes a write listener once on an erroneous write
netty invokes the next handlers write with its written message
promises can be used to listen for next handlers success or failure
promises can be used to listen for non immediate handlers outcome if the promise is passed on
promises cannot be used to listen for non immediate handlers outcome if a new promise is passed on
promises cannot be used to listen for non immediate handlers outcome if the promise is not passed on
only the listener added to the final write is invoked on error if the promise is not passed on
only the listener added to the final write is invoked on success if the promise is not passed on
write listeners are invoked from the tail
This insight means, given the example in the question, that if an error arises near the tail and authHandler does not pass the promise on, then the error handler near the head will never be invoked, as it is being supplied with a new promise (ctx.write(msg) is essentially ctx.write(msg, ctx.newPromise())).
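To make the distinction concrete, here is a minimal sketch (handler names are made up) of a handler that forwards the caller's promise versus one that silently drops it:
// Forwards the caller's promise: listeners added further up the pipeline see the outcome.
class ForwardingHandler extends ChannelOutboundHandlerAdapter {
    @Override
    public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise) {
        ctx.write(msg, promise);
    }
}

// Drops the caller's promise: ctx.write(msg) delegates to ctx.write(msg, ctx.newPromise()),
// so the original promise is never completed and upstream listeners never fire.
class DroppingHandler extends ChannelOutboundHandlerAdapter {
    @Override
    public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise) {
        ctx.write(msg);
    }
}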
In our situation we ended up solving this by injecting the same shareable error handler in between all the business logic handlers.
The handler looked like this:
@ChannelHandler.Sharable
class OutboundErrorHandler extends ChannelOutboundHandlerAdapter {
private final static Logger logger = LoggerFactory.getLogger(OutboundErrorHandler.class);
private Throwable handledCause = null;
@Override
public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise) {
ctx.write(msg, promise).addListener(writeResult -> handleWriteResult(ctx, writeResult));
}
private void handleWriteResult(ChannelHandlerContext ctx, Future<?> writeResult) {
if (!writeResult.isSuccess()) {
final Throwable cause = writeResult.cause();
if (cause instanceof ClosedChannelException) {
// no reason to close an already closed channel - just ignore
return;
}
// Since this handler is shared and added multiple times
// we need to avoid spamming the logs N number of times for the same error
if (handledCause == cause) return;
handledCause = cause;
logger.error("Uncaught exception on write!", cause);
// By checking on channel writability and closing the channel after writing the error message,
// only the first listener will signal the error to the client
final Channel channel = ctx.channel();
if (channel.isWritable()) {
ctx.writeAndFlush(serverErrorJSON(cause.getMessage()), channel.newPromise());
ctx.close();
}
}
}
}
Then in our pipeline setup we have this
// Prepend the error handler to every entry in the pipeline.
// The intention behind this is to have a catch-all
// outbound error handler and thereby avoiding the need to attach a
// listener to every ctx.write(...).
final OutboundErrorHandler outboundErrorHandler = new OutboundErrorHandler();
for (Map.Entry<String, ChannelHandler> entry : pipeline) {
pipeline.addBefore(entry.getKey(), entry.getKey() + "#OutboundErrorHandler", outboundErrorHandler);
}
I have a somewhat linear job set up in Spring Batch that consists of several steps. If, at any point, a single step fails, the job should fail.
The steps consist of a number of tasklets followed by a chunk-based step. I.e.:
Step 1
Tasklet 1
Step 2
Tasklet 2
Step 3
Reader
Processor
Writer
If something goes wrong, the obvious thing to do is throw an Exception. Spring Batch will handle this and log everything. This behaviour, particularly printing the stack trace, is undesirable, and it would be better if the Job could be ended gracefully with the Status set to FAILED.
The Tasklets currently set the ExitStatus directly on the StepContribution. They are also built using a flow (which wasn't ideal, but the steps continue unhindered otherwise). The issues can then be handled directly in the Tasklet.
However, we have no access to the StepContribution in the chunk-based approach. We only have the StepExecution. Using setExitStatus does nothing here.
We are using the builders (JobBuilderFactory and StepBuilderFactory), not the XML setup.
The potential solutions:
Tell or configure Batch how to handle exceptions (not to print a stack trace).
Catch the exception in a listener. Unfortunately, the exception has already been caught by Spring Batch by the time it gets to the #AfterStep.
Tell the step/job that we do not want to continue (e.g. setting a value in the execution context or an alternative to the StepContribution).
As far as I know, the only way to stop the job is to throw an exception. There is no other graceful way of telling Spring Batch "this job is done, it failed, go directly to FAILED, do not pass GO, etc."
Although not a direct solution to the original problem, one can use the .exceptionHandler() of a StepBuilder to gain more control over the exceptions you throw, e.g. logging them.
public class LogAndRethrowExceptionHandler implements ExceptionHandler {
private static final Logger LOGGER = Logger.getLogger(...);
@Override
public void handleException(RepeatContext repeatContext, Throwable throwable) throws Throwable {
LOGGER.error(throwable.getMessage());
throw throwable;
}
}
This way you may, in theory, hide the stack traces produced by Spring Batch but still show the error messages.
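A minimal sketch of wiring it in (assuming reader/processor/writer beans of your own; the step name and chunk size are placeholders):
Step step = stepBuilderFactory.get("myStep")
        .<POJO, POJO>chunk(1000)
        .reader(reader)
        .processor(processor)
        .writer(writer)
        .exceptionHandler(new LogAndRethrowExceptionHandler())   // logs, then rethrows so the step still fails
        .build();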
I think you can explore the two options below.
Option 1: You can use a noSkip exception.
This explicitly prevents certain exceptions (and their subclasses) from being skipped.
Throw a specific exception for which you want the job to fail.
This is how you can configure that:
stepBuilderFactory.get("myStep")
.<POJO, POJO> chunk(1000)
.reader(reader)
.processor(processor)
.writer(writer)
.faultTolerant()
.noSkip(YourCustomException.class)
.skip(Exception.class)
.skipLimit(100)
Option 2: You can set the exit status to FAILED for the error flow after the step has completed:
public class MyStepExecutionListener implements StepExecutionListener {

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        if (allIsGood()) {
            return ExitStatus.COMPLETED;
        }
        else if (someExceptionOrErrorCase()) {
            return ExitStatus.FAILED;
        }
        return stepExecution.getExitStatus();
    }
}
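And attach the listener when building the step, roughly like this (a sketch; the step name and beans are assumed):
stepBuilderFactory.get("myStep")
        .<POJO, POJO>chunk(1000)
        .reader(reader)
        .processor(processor)
        .writer(writer)
        .listener(new MyStepExecutionListener())
        .build();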
Hope this helps
In my case I ended up adding conditions and throwing JobInterruptedException in both the Reader and the Processor:
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobInterruptedException;
import org.springframework.batch.item.ItemReader;

public class CustomReader implements ItemReader<String> {

    @Override
    public String read() throws Exception {
        // ...
        // if ("condition to stop job") {
        throw new JobInterruptedException("Aborting Job", BatchStatus.ABANDONED);
        // }
    }
}
I want to handle a set of exceptions in one of the classes which is annotated with @Named.
I am using Akka actors for concurrency, and also Hibernate.
In my Akka actor class, I am invoking the Hibernate methods for insert operations.
I want to know how to handle exceptions in all the actor classes globally. I can do that using @ExceptionHandler and @ControllerAdvice, but I believe that works only in the @Controller layer.
Sample code is given below:
#Named("SaveDepartmentService")
public class SaveDepartmentService extends BaseCommonService {
#Autowired
IDepartmentRepository departmentRepository;
#Override
public PartialFunction<Object, BoxedUnit> receive(){
return ReceiveBuilder
.match(Department.class, c->createOrUpdateDepartment(c))
.build()
.orElse(matchAny());
}
private void createOrUpdateDepartment(Department dept){
// try{
departmentRepository.insertOrUpdate(dept);
// }catch (ConstraintViolationException e){
// System.out.println("-------------------------------");
// e.printStackTrace();
// wrap the exception and send as
// sender().tell("Constarint_violation",self());
//}
sender().tell(dept, self());
} }
I want to handle ConstraintViolationException across my actors and respond with a message (e.g. sender().tell("CONSTRAINT_VIOLATION", self());).
I can do the same thing by handling it in a catch block, but I would need to duplicate it in all actor classes.
Is there any way to handle it like @ControllerAdvice?
You could use the default exception handler.
By setting the default uncaught exception handler, an application can change the way in which uncaught exceptions are handled (such as logging to a specific device, or file) for those threads that would already accept whatever "default" behavior the system provided.
But it makes no difference between CDI beans and non-CDI beans. (By the way, I use it to automatically increase the logging level of all classes in the stack traces.)
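A minimal sketch of setting it (this is the plain java.lang.Thread API, not specific to Akka or CDI; the logger is a placeholder):
// Installs a JVM-wide handler for exceptions that no thread caught itself.
Thread.setDefaultUncaughtExceptionHandler((thread, throwable) -> {
    Logger log = LoggerFactory.getLogger("UncaughtExceptions");   // hypothetical logger
    log.error("Uncaught exception in thread {}", thread.getName(), throwable);
});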