Aborting a Batch Job during a chunk-based step - java

I have a somewhat linear job set up in Spring Batch that consists of several steps. If, at any point, a single step fails, the job should fail.
The steps consist of a number of tasklets followed by a chunk-based step. I.e.:
Step 1
  Tasklet 1
Step 2
  Tasklet 2
Step 3
  Reader
  Processor
  Writer
If something goes wrong, the obvious thing to do is throw an Exception. Spring Batch will handle this and log everything. This behaviour, particularly printing the stack trace, is undesirable, and it would be better if the Job could be ended gracefully with the Status set to FAILED.
The Tasklets currently set the ExitStatus directly on the StepContribution. They are also built using a flow (which wasn't ideal, but the steps continue unhindered otherwise). The issues can then be handled directly in the Tasklet.
However, we have no access to the StepContribution in the chunk-based approach. We only have the StepExecution. Using setExitStatus does nothing here.
We are using the builders (JobBuilderFactory and StepBuilderFactory), not the XML setup.
The potential solutions:
Tell or configure Batch how to handle exceptions (not to print a stack trace).
Catch the exception in a listener. Unfortunately, the exception has already been caught by Spring Batch by the time it gets to the @AfterStep.
Tell the step/job that we do not want to continue (e.g. setting a value in the execution context or an alternative for the StepContribution).

As far as I know, the only way to stop the job is to throw an exception. There is no other graceful way of telling Spring Batch "this job is done, it failed, go directly to FAILED, do not pass GO, etc."
Although not a direct solution to the original problem, one can use the .exceptionHandler() of a StepBuilder to gain more control over the exceptions you throw, e.g. logging them.
import org.springframework.batch.repeat.RepeatContext;
import org.springframework.batch.repeat.exception.ExceptionHandler;

public class LogAndRethrowExceptionHandler implements ExceptionHandler {

    private static final Logger LOGGER = Logger.getLogger(...);

    @Override
    public void handleException(RepeatContext repeatContext, Throwable throwable) throws Throwable {
        // Log only the message, then rethrow so the step still fails
        LOGGER.error(throwable.getMessage());
        throw throwable;
    }
}
This way you may, in theory, hide the stack traces produced by Spring Batch but still show the error messages.
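A minimal wiring sketch, assuming the builder style mentioned in the question (the item types, reader, processor and writer are placeholders), shows where such a handler can be registered on the chunk-based step:

Step step3 = stepBuilderFactory.get("step3")
        .<InputItem, OutputItem>chunk(100)
        .reader(reader)
        .processor(processor)
        .writer(writer)
        // register the handler from the snippet above
        .exceptionHandler(new LogAndRethrowExceptionHandler())
        .build();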

I think you can explore the two options below.
Option 1: You can use a noSkip exception.
This explicitly prevents certain exceptions (and their subclasses) from being skipped.
Throw a specific exception when you want the job to fail.
This is how you can configure that:
stepBuilderFactory.get("myStep")
.<POJO, POJO> chunk(1000)
.reader(reader)
.processor(processor)
.writer(writer)
.faultTolerant()
.noSkip(YourCustomException.class)
.skip(Exception.class)
.skipLimit(100)
Option 2: You can set the exit status to FAILED for the error flow after the step has completed.
public class MyStepExecutionListener implements StepExecutionListener {

    @Override
    public void beforeStep(StepExecution stepExecution) {
        // no-op (required by the interface in older Spring Batch versions)
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        if (allIsGood()) {
            return ExitStatus.COMPLETED;
        } else if (someExceptionOrErrorCase()) {
            return ExitStatus.FAILED;
        }
        // otherwise keep whatever exit status the step already has
        return stepExecution.getExitStatus();
    }
}
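As a usage sketch (assuming the same builder style as Option 1; names are placeholders), the listener is registered on the step:

stepBuilderFactory.get("myStep")
        .<POJO, POJO>chunk(1000)
        .reader(reader)
        .processor(processor)
        .writer(writer)
        .listener(new MyStepExecutionListener())
        .build();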
Hope this helps

In my case I ended up adding conditions and throwing a JobInterruptedException in both the Reader and the Processor.
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobInterruptedException;
import org.springframework.batch.item.ItemReader;

public class CustomReader implements ItemReader<String> {

    @Override
    public String read() throws Exception {
        // ...
        // if ("condition to stop job") {
        throw new JobInterruptedException("Aborting Job", BatchStatus.ABANDONED);
        // }
    }
}

Related

Spring Batch failure at the end of the day

Is there a solution that allows checking the job repository for a given job (JobInstance) for the presence of a completed execution during the current day? If there is no COMPLETED status in the BATCH_JOB_EXECUTION table for the current day, I must send a notification or an exit code saying that we got nothing today.
I plan to implement the solution in a class that extends JobExecutionListenerSupport, like this:
public class JobCompletionNotificationListener extends JobExecutionListenerSupport {

    private Logger logger = LoggerFactory.getLogger(JobCompletionNotificationListener.class);

    private JobRegistry jobRegistry;
    private JobRepository jobRepository;

    public JobCompletionNotificationListener(JobRegistry jobRegistry, JobRepository jobRepository) {
        this.jobRegistry = jobRegistry;
        this.jobRepository = jobRepository;
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        System.out.println("finishhhhh");
        // the logic if no job completed today
        if (noJobCompletedToDay) {
            Notify();
        }
        if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
            logger.info("!!! JOB FINISHED! -> example action execute after Job");
        }
    }
}
You can use JobExplorer#getLastJobExecution to get the last execution for your job instance and check if it's completed during the current day.
Depending on when you are going to do that check, you might also make sure there are no currently running jobs (JobExplorer#findRunningJobExecutions can help).
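A hedged sketch of that check, assuming a Spring Batch version where JobExplorer exposes getLastJobExecution(JobInstance) as mentioned above, with jobExplorer injected; "myJob" and Notify() are placeholders taken from the question, and getEndTime() is assumed to return java.util.Date (Spring Batch 4.x):

List<JobInstance> instances = jobExplorer.getJobInstances("myJob", 0, 1);
JobExecution lastExecution = instances.isEmpty()
        ? null : jobExplorer.getLastJobExecution(instances.get(0));

// completed today = COMPLETED status and an end time falling on the current date
boolean completedToday = lastExecution != null
        && lastExecution.getStatus() == BatchStatus.COMPLETED
        && lastExecution.getEndTime() != null
        && lastExecution.getEndTime().toInstant()
                .atZone(ZoneId.systemDefault()).toLocalDate()
                .isEqual(LocalDate.now());

// optionally make sure nothing is still running before alerting
boolean stillRunning = !jobExplorer.findRunningJobExecutions("myJob").isEmpty();

if (!completedToday && !stillRunning) {
    Notify(); // e.g. send a mail or set a non-zero exit code
}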
You can implement monitoring in multiple ways. Since version 4.2, Spring Batch provides support for metrics and monitoring based on Micrometer. There is a Spring Batch Grafana sample, with Prometheus and Grafana, on which you can rely to build a custom dashboard or launch alerts from these tools.
If you have several batch processes, this may be the best option; in addition, these tools will help you monitor services, applications, etc.
Built-in metrics:
Duration of job execution
Currently active jobs
Duration of step execution
Duration of item reading
Duration of item processing
Duration of chunk writing
You can also create your own custom metrics (e.g. execution failures), as in the sketch below.
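For illustration only (not from the original answer), a failure counter can be recorded from a JobExecutionListener using Micrometer; the meter name "batch.job.failures" and the constructor wiring are assumptions:

import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;

public class FailureCountingListener implements JobExecutionListener {

    private final MeterRegistry registry;

    public FailureCountingListener(MeterRegistry registry) {
        this.registry = registry;
    }

    @Override
    public void beforeJob(JobExecution jobExecution) {
        // nothing to record before the job starts
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() == BatchStatus.FAILED) {
            // increment a counter tagged with the job name
            registry.counter("batch.job.failures",
                    "job", jobExecution.getJobInstance().getJobName()).increment();
        }
    }
}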
Otherwise, you can implement the monitoring, for example, through another independent batch process which runs and sends a notification or mail, collecting the state of the process from, for example, the database, the application, or a filesystem shared between both processes.
You can also implement the check the way you describe it; there is an interesting thread describing how to throw an exception in one step and handle it in a following step that sends an alert (or not) as appropriate.

Handling exceptions in Kafka streams

I have gone through multiple posts, but most of them relate to handling bad messages, not to exception handling while processing them.
I want to know how to handle messages received by the stream application when there is an exception while processing them. The exception could be for multiple reasons, such as a network failure, a RuntimeException, etc.
Could someone suggest the right way to do this? Should I use setUncaughtExceptionHandler, or is there a better way?
How do I handle retries?
It depends on what you want to do with exceptions on the producer side.
If an exception is thrown on the producer side (e.g. due to a network failure or because the Kafka broker has died), the stream will die by default. As of kafka-streams version 1.1.0 you can override the default behavior by implementing ProductionExceptionHandler, like the following:
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.streams.errors.ProductionExceptionHandler;

public class CustomProductionExceptionHandler implements ProductionExceptionHandler {

    @Override
    public ProductionExceptionHandlerResponse handle(final ProducerRecord<byte[], byte[]> record,
                                                     final Exception exception) {
        log.error("Kafka message marked as processed although it failed. Message: [{}], destination topic: [{}]",
                new String(record.value()), record.topic(), exception);
        return ProductionExceptionHandlerResponse.CONTINUE;
    }

    @Override
    public void configure(final Map<String, ?> configs) {
    }
}
From the handle method you can return either CONTINUE if you don't want the stream to die on the exception, or FAIL if you want the stream to stop (FAIL is the default).
You also need to specify this class in the stream config:
default.production.exception.handler=com.example.CustomProductionExceptionHandler
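The equivalent when building the streams configuration programmatically (a sketch; it assumes the handler class shown above):

Properties props = new Properties();
// same setting as the property line above, using the StreamsConfig constant
props.put(StreamsConfig.DEFAULT_PRODUCTION_EXCEPTION_HANDLER_CLASS_CONFIG,
        CustomProductionExceptionHandler.class);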
Also pay attention that ProductionExceptionHandler handles only exceptions on the producer side; it will not handle exceptions thrown while processing a message with stream methods such as mapValues(..), filter(..), branch(..), etc. You need to wrap the logic of those methods with try / catch blocks (put all your method logic into the try block to guarantee that you handle all exceptional cases):
.filter((key, value) -> { try {..} catch (Exception e) {..} })
As far as I know, we don't need to handle exceptions on the consumer side explicitly, as Kafka Streams will automatically retry consuming later (the offset is not changed until the messages have been consumed and processed); e.g. if the Kafka broker is not reachable for some time, you will get exceptions from Kafka Streams, and when the broker is back up, Kafka Streams will consume all the messages. So in this case we just have a delay and nothing is corrupted or lost.
With setUncaughtExceptionHandler you are not able to change the default behavior as you can with ProductionExceptionHandler; with it you can only log the error or send a message to a failure topic.
Update since kafka-streams 2.8.0
Since kafka-streams 2.8.0, you have the ability to automatically replace a failed stream thread (caused by an uncaught exception) using the KafkaStreams method void setUncaughtExceptionHandler(StreamsUncaughtExceptionHandler eh) with StreamThreadExceptionResponse.REPLACE_THREAD. For more details, please take a look at Kafka Streams Specific Uncaught Exception Handler.
kafkaStreams.setUncaughtExceptionHandler(ex -> {
    log.error("Kafka-Streams uncaught exception occurred. Stream will be replaced with new thread", ex);
    return StreamsUncaughtExceptionHandler.StreamThreadExceptionResponse.REPLACE_THREAD;
});
For handling exceptions on the consumer side:
1) You can add a default deserialization exception handler with the following property:
"default.deserialization.exception.handler" = "org.apache.kafka.streams.errors.LogAndContinueExceptionHandler";
Basically, Apache Kafka provides three exception handler classes:
1) LogAndContinueExceptionHandler, which you can set as
props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
LogAndContinueExceptionHandler.class);
2) LogAndFailExceptionHandler
props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
LogAndFailExceptionHandler.class);
3) LogAndSkipOnInvalidTimestamp
props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
LogAndSkipOnInvalidTimestamp.class);
For custom exception handling:
1) You can implement the DeserializationExceptionHandler interface and override the handle() method.
2) Or you can extend the above-mentioned classes.
setUncaughtExceptionHandler doesn't help to handle the exception; it runs after the stream has already terminated due to an exception that was not caught.
Kafka provides a few ways to handle exceptions. A simple try-catch would help catch exceptions in the processor code, but Kafka deserialization exceptions (which can be due to data issues) and production exceptions (which occur during communication with the broker) require a DeserializationExceptionHandler and a ProductionExceptionHandler respectively. By default, a Kafka application will fail if it encounters any of these.
You can find more details in this post.
In Spring Cloud Stream you configure your custom deserialization handler using the following:
spring.cloud.stream.kafka.streams.binder.configuration.default.deserialization.exception.handler=your-package-name.CustomLogAndContinueExceptionHandler
CustomLogAndContinueExceptionHandler extends LogAndContinueExceptionHandler or implements DeserializationExceptionHandler, and returns DeserializationHandlerResponse.CONTINUE or FAIL depending on your use case.
@Slf4j
public class CustomLogAndContinueExceptionHandler extends LogAndContinueExceptionHandler {

    @Override
    public DeserializationHandlerResponse handle(ProcessorContext context, ConsumerRecord<byte[], byte[]> record,
                                                 Exception exception) {
        // .... some business logic here ....
        log.error("Message failed: taskId: {}, topic: {}, partition: {}, offset: {}, detail error: {}",
                context.taskId(), record.topic(), record.partition(), record.offset(), exception.getMessage());
        return DeserializationHandlerResponse.CONTINUE;
    }
}

Make a spring-batch job exit with non-zero code if an exception is thrown

I'm trying to fix a spring-batch job which is launched from a shell script. The script then checks the process exit code to determine whether the job has succeeded. Java, however, exits 0 even if the program ended with an exception, unless System.exit was specifically called with a different code, so the script always reports success.
Is there a way to make spring-batch return a non-zero code on failure? To be clear, I'm not talking about the ExitStatus or BatchStatus, but the actual exit code of the java process.
If there's no way to tell spring-batch to return non-zero, can I safely use System.exit in a Tasklet (or a Listener) without interfering with anything that spring-batch does under the hood after an exception?
Failing that, can anyone suggest a way to get the BatchStatus or some other indicator of failure back to the shell script?
As per this github issue it's recommended to pass the result of SpringApplication#exit to System#exit. You don't need to access an ExitCodeGenerator instance directly or manually close the context.
ConfigurableApplicationContext context = SpringApplication.run(Application.class, args);
int exitCode = SpringApplication.exit(context);
System.exit(exitCode);
I would not recommend calling System.exit() from a tasklet, since the job will NOT finish completely and hence some entries in the BATCH_ tables could be left in an inconsistent state.
If you use the class CommandLineJobRunner to start your batch, then the return code corresponding to the BatchStatus is returned (CommandLineJobRunner can be configured with a SystemExiter; by default it uses a JvmSystemExiter, which calls System.exit() at the very end). However, this solution circumvents Spring Boot.
Therefore, if you want to use Spring Boot, I'd recommend writing your own launch-wrapper main method. Something like:
public static void main(String[] args) throws Exception {
    // spring boot application startup
    SpringApplication springApp = new SpringApplication(.. your context ..);
    // calling the run method returns the context
    // pass your program arguments
    ConfigurableApplicationContext context = springApp.run(args);
    // spring boot uses exitCode generators to calculate the exit status
    // there should be one implementation of the ExitCodeGenerator interface
    // in your context. By default Spring Boot uses the JobExecutionExitCodeGenerator
    ExitCodeGenerator exitCodeGen = context.getBean(ExitCodeGenerator.class);
    int code = exitCodeGen.getExitCode();
    context.close();
    System.exit(code);
}
I solved this some time ago; apologies that I did not update sooner.
Spring Batch does exit non-zero on a failed job, but there's a subtle catch.
If a job step has no transitions explicitly defined, it will by default end the job on success and fail the job on failure (of the step).
If, however, the step has a transition defined for one result (e.g. to the next step on success), there is no default transition for any other result, and if the step fails the job won't be properly failed. So you have to explicitly define a "fail on FAILED" transition, typically for all steps except the last.
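A hedged sketch of those explicit transitions with the Java builders (step and job names are placeholders; the same idea applies to the XML flow configuration):

Job job = jobBuilderFactory.get("myJob")
        .start(step1)
            .on("FAILED").fail()           // explicitly fail the whole job
        .from(step1).on("*").to(step2)
        .from(step2)
            .on("FAILED").fail()
        .from(step2).on("*").to(step3)     // last step keeps the default behaviour
        .end()
        .build();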

Spring MDP - how to shut it down on bad message

I have a Spring MDP implemented using Spring's DefaultMessageListenerContainer, listening to an input queue on WebSphere MQ v7.1. If a bad message comes in (one that causes a RuntimeException), what currently happens is that the transaction is rolled back and the message is put back into the queue. However, the MDP then goes into an infinite loop.
Question 1: For my requirements I would like to be able to shut down the processing the moment it sees a bad message. No retries needed. Is it possible to shutdown the message listener gracefully in case it sees a bad message (as opposed to crude System.exit() or methods of that sort)? I definitely don't like it to go into an infinite loop.
Edit:
Question 2: Is there a way to stop or suspend the listener container to stop further processing of messages?
The usual way to handle this is to have an error queue and, when you see a bad message, to put it into the error queue.
Some systems, such as IBM MQ Series, handle this for you. You just need to configure the error queue and how many retries you want, and it will put the message there.
An administrator will then look through these queues and take the proper action on the messages in them (i.e. fix and resubmit them).
Actually, System.exit() is too brutal and... won't work. Retrying of failed messages is handled on the broker (WMQ) side so the message will be redelivered once you restart your application.
The problem you are describing is called a poison message and should be handled on the broker side. It seems to be described in the Handling poison messages section of the WMQ manual and in How WebSphere Application Server handles poison messages.
I solved the problem in the following manner; I'm not sure if this is the best way, but it works.
The MDP implements ApplicationContextAware; I also maintain a listener state (an enum with OPEN, CLOSE and ERROR values). MDP code fragment below:
// context
private ConfigurableApplicationContext applicationContext;

// listener state
private ListenerState listenerState = ListenerState.OPEN;

@Override
public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
    this.applicationContext = (ConfigurableApplicationContext) applicationContext;
}

// onMessage method
public void processMessages(....) {
    try {
        process(...);
    } catch (Throwable t) {
        listenerState = ListenerState.ERROR;
        throw new RuntimeException(...);
    }
}

@Override
public void stopContext() {
    applicationContext.stop();
}
In the Java main that loads the Spring context, I do this:
// check for errors and exit
Listener listener = (Listener) context.getBean("listener");
ListenerContainer listenerContainer =
        (ListenerContainer) context.getBean("listenerContainer");
try {
    while (true) {
        Thread.sleep(1000); // sleep for 1 sec
        if (!listener.getListenerState().equals(ListenerState.OPEN)) {
            listener.stopContext();
            listenerContainer.stop();
            System.exit(1);
        }
    }
} catch (InterruptedException e) {
    throw new RuntimeException(e);
}

My exception logging aspect is logging the same exception twice

I'm writing a stand alone application, that has to start up and be left running unattended for long periods of time. Rather than have exceptions bring it to a halt, it needs to log the exception with enough information for the support people to have an idea what happened, and carry on.
As a result each exception is wrapped in a runtime exception, then thrown to be logged by a different part of the application. I'm using aop:config tags to create an aspect to log the runtime exceptions thrown by the rest of the application. The exception would then carry on up the call stack to an UncaughtExceptionHandler to end the exception silently.
However, the same exception is being caught repeatedly, and logged (each exception is written by a separate thread, and goes to a separate log file). In the debugger, both exceptions have the same ID.
My applicationContext is basic for this :
&ltaop:config&gt
&ltaop:aspect ref="exceptionLoggingAspect"&gt
&ltaop:after-throwing method="logException"
pointcut="execution(* *.*(..))" throwing="exception" /&gt
&lt/aop:aspect&gt
&lt/aop:config&gt
The UncaughtExceptionHandler is equally basic, at least until I get it working:
private void setUncaughtExceptionHandler()
{
    final Handler handler = new Handler();
    Thread.setDefaultUncaughtExceptionHandler(handler);
}

class Handler implements Thread.UncaughtExceptionHandler
{
    @Override
    public void uncaughtException(Thread t, Throwable e)
    {
        System.out.println("Throwable: " + e.getMessage());
        System.out.println(t.toString());
    }
}
I have experimented by restricting the pointcut to a single package, and throwing an exception from that package (not the package the exception logging is in), but it is still logged twice.
Is there something fundamentally wrong with this idea? Advice appreciated.
