Spring batch trigger process after all jobs complete - java

I have a Spring Batch application with a series of jobs. I want to send an email after ALL the jobs have completed, but I'm not sure of the best way to do this. The options I am considering are:
Run the jobs in a certain order and amend the JobListener of the last job so that it sends the email. The downside is that this won't work if a further job is added to the end of the batch.
Add a new job which will send the email and order the jobs, making sure this additional job runs last.
Are there any built-in spring-batch constructs that will be triggered on completion of the entire batch?
The final option would be my preferred solution, so my question is: are there any spring-batch classes that listen for batch completion (similar to JobExecutionListenerSupport or a StepListener)?

No, I am not aware of any batch listener that listens for the whole batch's completion.
I have two alternatives for you. Both allow you to stick with Spring.
(1) If your application is designed to run perpetually (i.e. like a web server), you can inject a custom JobLauncher, grab its TaskExecutor and wait for completion (either through a simple counter that counts the callbacks from afterJob methods, or by waiting a fixed amount of time by which all jobs should have been submitted -- not necessarily started).
Add a configuration class like this:
@Configuration
class JobConfiguration implements InitializingBean {

    // keep a handle on the executor so we can wait for it later;
    // here, change to your liking -- a ThreadPoolTaskExecutor is used in this example
    // so that shutdown() and awaitTermination() are available
    private ThreadPoolTaskExecutor taskExecutor;

    @Bean
    public TaskExecutor taskExecutor() {
        taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setWaitForTasksToCompleteOnShutdown(true);
        taskExecutor.initialize();
        return taskExecutor;
    }

    @Bean
    public JobLauncher jobLauncher(@Autowired JobRepository jobRepository,
                                   @Autowired TaskExecutor taskExecutor) {
        SimpleJobLauncher launcher = new SimpleJobLauncher();
        launcher.setJobRepository(jobRepository);
        launcher.setTaskExecutor(taskExecutor);
        return launcher;
    }

    private List<Job> jobs;

    // I don't use this in this example,
    // however jobs.size() will help you with the countdown latch
    @Autowired
    public void setJobs(List<Job> jobs) {
        this.jobs = jobs;
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        // either count down until all jobs are submitted
        // or sleep a finite amount of time
        // in this example, I'll be lazy and just sleep
        Thread.sleep(1 * 3600 * 1000L); // wait 1 hour
        taskExecutor.shutdown();
        try {
            taskExecutor.getThreadPoolExecutor().awaitTermination(1, TimeUnit.HOURS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // send your e-mail here
        }
    }
}
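As an alternative to sleeping, the counter approach mentioned above can be sketched with a CountDownLatch. This is only an illustration of the idea (the bean names jobCompletionLatch and countingListener are made up here, and the listener still has to be registered on every job you launch):

@Bean
public CountDownLatch jobCompletionLatch(List<Job> jobs) {
    // one count per job known to the application context
    return new CountDownLatch(jobs.size());
}

@Bean
public JobExecutionListener countingListener(CountDownLatch jobCompletionLatch) {
    // register this listener on each job so that every afterJob callback counts the latch down
    return new JobExecutionListenerSupport() {
        @Override
        public void afterJob(JobExecution jobExecution) {
            jobCompletionLatch.countDown();
        }
    };
}

Then, instead of the Thread.sleep(...) above, afterPropertiesSet() would call jobCompletionLatch.await(), which returns once afterJob has fired for every job, and the e-mail would be sent after that.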
(2) If your application stops when all jobs are done, you can simply follow this to send out an e-mail.
I repeat a few lines of code for completeness:
public class TerminateBean {

    @PreDestroy
    public void onDestroy() throws Exception {
        // send out e-mail here
    }
}
We also have to add a bean of this type:
@Configuration
public class ShutdownConfig {

    @Bean
    public TerminateBean getTerminateBean() {
        return new TerminateBean();
    }
}

If I understand correctly, you have a job of jobs. In this case, you can define an enclosing job with a series of steps of type JobStep (each of which delegates to a sub-job). Then, you can register a JobExecutionListener on the enclosing job. This listener will be called once all steps (i.e. sub-jobs) have completed.
More details about the JobStep here: https://docs.spring.io/spring-batch/4.0.x/api/org/springframework/batch/core/step/job/JobStep.html
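For illustration, a minimal sketch of such an enclosing job (assuming Spring Batch 4.x, two existing sub-jobs jobA and jobB, and the usual jobBuilderFactory/stepBuilderFactory from @EnableBatchProcessing; the bean names here are mine, not from the question):

@Bean
public Step jobStepA(Job jobA, JobLauncher jobLauncher) {
    // a JobStep: a step that simply delegates to a sub-job
    return stepBuilderFactory.get("jobStepA")
            .job(jobA)
            .launcher(jobLauncher)
            .build();
}

@Bean
public Step jobStepB(Job jobB, JobLauncher jobLauncher) {
    return stepBuilderFactory.get("jobStepB")
            .job(jobB)
            .launcher(jobLauncher)
            .build();
}

@Bean
public Job enclosingJob(Step jobStepA, Step jobStepB) {
    return jobBuilderFactory.get("enclosingJob")
            .start(jobStepA)
            .next(jobStepB)
            .listener(new JobExecutionListenerSupport() {
                @Override
                public void afterJob(JobExecution jobExecution) {
                    // all sub-jobs have completed by the time this is called:
                    // send the e-mail here
                }
            })
            .build();
}

Adding a further sub-job later only requires another JobStep in the enclosing job; the listener does not need to change.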

Related

Strategies to implement a callback mechanism / notify when all the asynchronous Spring Integration flows/threads have completed

I have a Spring Integration flow that gets triggered once every day; it pulls all parties from the database and sends each party to an executorChannel.
The next flow pulls data for each party and then processes it in parallel by sending it to a different executor channel.
The challenge I'm facing is how to know when this entire process ends. Any ideas on how to achieve this?
Here's my pseudo code for the executor channels and integration flows.
@Bean
public IntegrationFlow fileListener() {
    return IntegrationFlows.from(Files.inboundAdapter(new File("pathtofile")))
            .channel("mychannel")
            .get();
}

@Bean
public IntegrationFlow flowOne() throws ParserConfigurationException {
    return IntegrationFlows.from("mychannel")
            .handle("serviceHandlerOne", "handle")
            .nullChannel();
}

@Bean
public IntegrationFlow parallelFlowOne() throws ParserConfigurationException {
    return IntegrationFlows.from("executorChannelOne")
            .handle("parallelServiceHandlerOne", "handle")
            .nullChannel();
}

@Bean
public IntegrationFlow parallelFlowTwo() throws ParserConfigurationException {
    return IntegrationFlows.from("executorChannelTwo")
            .handle("parallelServiceHandlerTwo", "handle")
            .nullChannel();
}

@Bean
public MessageChannel executorChannelOne() {
    return new ExecutorChannel(Executors.newFixedThreadPool(10));
}

@Bean
public MessageChannel executorChannelTwo() {
    return new ExecutorChannel(Executors.newFixedThreadPool(10));
}

@Component
@Scope("prototype")
public class ServiceHandlerOne {

    @Autowired
    MessageChannel executorChannelOne;

    @ServiceActivator
    public Message<?> handle(Message<?> message) {
        List<?> rowDatas = repository.findAll("parties");
        rowDatas.stream().forEach(data -> {
            Message<?> partyMessage = MessageBuilder.withPayload(data).build();
            executorChannelOne.send(partyMessage);
        });
        return message;
    }
}

@Component
@Scope("prototype")
public class ParallelServiceHandlerOne {

    @Autowired
    MessageChannel executorChannelTwo;

    @ServiceActivator
    public Message<?> handle(Message<?> message) {
        List<?> rowDatas = repository.findAll("party");
        rowDatas.stream().forEach(data -> {
            Message<?> partyMessage = MessageBuilder.withPayload(data).build();
            executorChannelTwo.send(partyMessage);
        });
        return message;
    }
}
First of all, there is no reason to declare your services as @Scope("prototype"): I don't see any state held in your services, so they are stateless and can simply be singletons. Second: since your flows end with nullChannel(), there is no point in returning anything from your service methods. Just make them void and the flow will end there naturally.
Another observation: you call executorChannelOne.send(message) directly in the code of your service method. The same thing can be achieved by simply returning that new message from your service method and making executorChannelOne the next .channel() in your flow definition after handle("serviceHandlerOne", "handle").
Since it looks like you do that in a loop, you might consider adding a .split() in between: the handler returns your List<?> rowDatas and the splitter takes care of iterating over that data, sending each item to that executorChannelOne.
Now about your original question.
There is really no easy way to tell that your executors are not busy any more. They might look idle at the moment you ask simply because the message for a task has not reached an executor channel yet.
We typically recommend using some async synchronizer for your data. The aggregator is a good way to correlate several in-flight messages: the aggregator collects a group and does not emit a reply until that group is complete.
The splitter I mentioned above adds sequence detail headers by default, so a subsequent aggregator can track a message group easily.
Since you have layers in your flow, it looks like you would need several aggregators: two for your executor channels after the splits, and one top-level aggregator for the file. The first two would reply to the top-level one for the final, per-file grouping.
You may also think about making those parties and party calls in parallel using a PublishSubscribeChannel, which can also be configured with applySequence=true. That information will then be used by the top-level aggregator for the per-file grouping. A rough sketch follows the doc links below.
See more in docs:
https://docs.spring.io/spring-integration/docs/current/reference/html/core.html#channel-implementations-publishsubscribechannel
https://docs.spring.io/spring-integration/docs/current/reference/html/message-routing.html#splitter
https://docs.spring.io/spring-integration/docs/current/reference/html/message-routing.html#aggregator
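Putting those suggestions together, a rough sketch of the split/aggregate idea (my own approximation under the assumptions above, not the poster's exact flow; it assumes serviceHandlerOne.handle() now returns the List<?> of parties instead of sending them in a loop):

@Bean
public IntegrationFlow flowOne() {
    return IntegrationFlows.from("mychannel")
            .handle("serviceHandlerOne", "handle")       // returns the List<?> of parties
            .split()                                     // one message per party, sequence headers added
            .channel("executorChannelOne")
            .handle("parallelServiceHandlerOne", "handle")
            .aggregate()                                 // releases one message once every party in the group is processed
            .handle(m -> System.out.println("All parties processed: " + m.getPayload()))
            .get();
}

The aggregate() step only emits its reply once every item produced by the split() has come back, which is the "entire process has ended" signal being asked about; with the layered flows the same split/aggregate pattern would be repeated per level.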

Running job stops if a new run of the same job is started - Spring Batch

Issue
I am a bit confused: when I start the execution of a Spring Batch job via an HTTP request and, while that job is executing, receive another HTTP request to start the same job but with different parameters, the job that is already running stops unfinished and processing of the new job starts.
Context
I've developed a REST API to load and process the content of Excel files. The web service exposes two endpoints, one to load, validate and store the content of Excel files in the database, and the other to start the processing of the records stored in the database.
How it works
POST /api/excel/upload
This endpoint receives the Excel files. When a request is received, each file is assigned a unique identifier and its content is validated. If the content is correct, it inserts it into a temporary table waiting to be processed.
GET /api/Excel/process?id=x
This endpoint receives the identifiers of the files to be processed. When a request is received, a Spring Batch job is started to process the records in the temporary table.
Some code
Controller
@PostMapping(produces = {APPLICATION_JSON_VALUE})
public ResponseEntity<Page<ExcelLoad>> post(@RequestParam("file") MultipartFile multipartFile)
{
    return super.getResponse().returnPage(service.upload(multipartFile));
}

@GetMapping(value = "/process", produces = APPLICATION_JSON_VALUE)
public DeferredResult<ResponseEntity<Void>> get(@RequestParam("id") Integer idCarga)
{
    DeferredResult<ResponseEntity<Void>> response = new DeferredResult<>(1000L);
    response.onTimeout(() -> response.setResult(super.getResponse().returnVoid()));
    ForkJoinPool.commonPool().submit(() -> service.startJob(idCarga));
    return response;
}
I use DeferredResult to send a response to the client after receiving the request without waiting for the job to finish
Service
public void startJob(int idCarga)
{
    JobParameters params = new JobParametersBuilder()
            .addString("mainJob", String.valueOf(System.currentTimeMillis()))
            .addString("idCarga", String.valueOf(idCarga))
            .toJobParameters();
    try
    {
        jobLauncher.run(job, params);
    }
    catch (JobExecutionException e)
    {
        log.error("---ERROR: {}", e.getMessage());
    }
}
Batch
@Bean
public Step mainStep(ReaderImpl reader, ProcessorImpl processor, WriterImpl writer)
{
    return stepBuilderFactory.get("step")
            .<List<ExcelLoad>, Invoice>chunk(10)
            .reader(reader)
            .processor(processor)
            .writer(writer)
            .faultTolerant().skipPolicy(new ExceptionSkipPolicy())
            .listener(stepSkipListener)
            .build();
}

@Bean
public Job mainJob(Step mainStep)
{
    return jobBuilderFactory.get("mainJob")
            .listener(mainJobExecutionListener)
            .incrementer(new RunIdIncrementer())
            .start(mainStep)
            .build();
}
Performing some tests I have observed the following behavior:
If I make a request to the endpoint /process to process each file at different times: in this case, all the records stored in the temporary table are processed:
Records processed file1: 3606 (expected 3606).
Records processed file2: 1776 (expected 1776).
If I make a request to the endpoint /process to first process file1, and before it finishes I make another request to process file2: in this case, not all the records stored in the temporary table are processed:
Records processed file1: 1080 (expected 3606)
Records processed file2: 1774 (expected 1776)
The JobLauncher does not stop job executions; it only launches them. The default job launcher provided by Spring Batch is the SimpleJobLauncher, which delegates job launching to a TaskExecutor. Now, depending on the task executor implementation you use and how it is configured to launch concurrent tasks, you can see different behaviours. For example, when you launch a new job execution and a new task is submitted to the task executor, the task executor can decide to reject this submission if all workers are busy, or put it in a waiting queue, or stop another task and submit the new one. Those strategies depend on several parameters (the TaskExecutor implementation, the type of queue used behind the scenes, the RejectedExecutionHandler implementation, etc.).
In your case, you seem to be using the following:
ForkJoinPool.commonPool().submit(() -> service.startJob(idCarga));
So you need to check the behaviour of this pool with regard to how it handles new task submissions (I guess this is what is stopping your jobs, but you need to confirm that). That said, I don't see why you need this. If your requirement is the following:
I use DeferredResult to send a response to the client after receiving the request without waiting for the job to finish
Then you can use an asynchronous task executor implementation (like the ThreadPoolTaskExecutor) in your job launcher, see Running Jobs from within a Web Container.
Thanks to the help from @Mahmoud Ben Hassine's answer, I was able to resolve the issue. To help with the implementation, in case someone comes across this question, I share the code that, in my case, solved the problem:
Controller
@Autowired
private JobLauncher jobLauncher;

@Autowired
private Job job;

@GetMapping(value = "/process", produces = APPLICATION_JSON_VALUE)
public void get(@RequestParam("id") Integer idCarga) throws JobExecutionException
{
    JobParameters params = new JobParametersBuilder()
            .addString("mainJob", String.valueOf(System.currentTimeMillis()))
            .addString("idCarga", String.valueOf(idCarga))
            .toJobParameters();
    jobLauncher.run(job, params);
}
Batch config, job and steps
@Configuration
@EnableBatchProcessing
public class BatchConfig extends DefaultBatchConfigurer
{
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private StepSkipListener stepSkipListener;

    @Autowired
    private MainJobExecutionListener mainJobExecutionListener;

    @Bean
    public TaskExecutor taskExecutor()
    {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setMaxPoolSize(10);
        taskExecutor.setThreadNamePrefix("batch-thread-");
        return taskExecutor;
    }

    @Bean
    public JobLauncher jobLauncher() throws Exception
    {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(getJobRepository());
        jobLauncher.setTaskExecutor(taskExecutor());
        jobLauncher.afterPropertiesSet();
        return jobLauncher;
    }

    @Bean
    public Step mainStep(ReaderImpl reader, ProcessorImpl processor, WriterImpl writer)
    {
        return stepBuilderFactory.get("step")
                .<List<ExcelLoad>, Invoice>chunk(10)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .faultTolerant().skipPolicy(new ExceptionSkipPolicy())
                .listener(stepSkipListener)
                .build();
    }

    @Bean
    public Job mainJob(Step mainStep)
    {
        return jobBuilderFactory.get("mainJob")
                .listener(mainJobExecutionListener)
                .incrementer(new RunIdIncrementer())
                .start(mainStep)
                .build();
    }
}
If, after applying this code, you also have problems inserting the records into the database (as happened to me), you can go through this question, where I also posted the code that works for me.

Spring Boot job using fixed delay does not work as expected

I've defined 3 jobs using fixedDelayString=300000 (5 minutes), and I set things up so that these 3 jobs are executed independently. For that reason, I created an @Async implementation. At first, each job worked fine, but over time they started to be delayed a lot.
Each execution takes about 5 seconds, but the next execution started to run after 10 minutes, and occasionally after 15 or 18 minutes.
An example could be:
@EnableScheduling
public class AppConfig {

    @Async("threadPoolTaskExecutor")
    @Scheduled(fixedDelayString = "15000")
    public void doSomething1() {
        // something that should run periodically
    }

    @Async("threadPoolTaskExecutor")
    @Scheduled(fixedDelayString = "300000")
    public void doSomething2() {
        // something that should run periodically
    }

    @Async("threadPoolTaskExecutor")
    @Scheduled(fixedDelayString = "300000")
    public void doSomething3() {
        // this job's interval grows larger with each execution
    }
}

@Configuration
@EnableAsync
public class AsyncConf {

    @Bean("threadPoolTaskExecutor")
    public TaskExecutor getAsyncExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(3);
        executor.setMaxPoolSize(1000);
        executor.setThreadNamePrefix("Async-");
        return executor;
    }
}
To specify the fixed delay as a numeric value, it should be fixedDelay instead of fixedDelayString. Check the code below:
@Async("threadPoolTaskExecutor")
@Scheduled(fixedDelay = 300000)
public void doSomething3() { ... }
You should also add the @EnableScheduling annotation to your configuration class.
Also note that fixedDelay means the next run starts the specified amount of time after the previous execution has completed. If you want to run your jobs at fixed intervals, you should try fixedRate instead of fixedDelay. Read more about scheduling here - https://www.baeldung.com/spring-scheduled-tasks
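For comparison, a fixedRate variant (an illustrative example, not taken from the question) schedules the next run a fixed amount of time after the previous run started, regardless of how long each execution takes:

@Async("threadPoolTaskExecutor")
@Scheduled(fixedRate = 300000)
public void doSomething3() {
    // runs every 5 minutes, measured start-to-start
}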

How to test proper working of Quartz job in Spring Boot application

I want to test whether my Quartz trigger is working as it is supposed to in practice.
My Quartz configuration looks like:
@Configuration
public class QuartzConfiguration {

    @Bean
    public JobDetail verificationTokenRemoverJobDetails() {
        return JobBuilder
                .newJob(VerificationTokenQuartzRemoverJob.class)
                .withIdentity("Job for verification token remover")
                .storeDurably()
                .build();
    }

    @Bean
    public Trigger verificationTokenRemoverJobTrigger(JobDetail jobADetails) {
        return TriggerBuilder
                .newTrigger()
                .forJob(jobADetails)
                .withIdentity("Trigger for verification token remover")
                .withSchedule(CronScheduleBuilder.cronSchedule("0 0 0/2 1/1 * ? *"))
                .build();
    }
}
and my Job class looks like:
@AllArgsConstructor
public class VerificationTokenQuartzRemoverJob implements Job {

    private VerificationTokenRepository verificationTokenRepository;

    @Override
    public void execute(JobExecutionContext context) {
        verificationTokenRepository.deleteAllByCreatedLessThan(LocalDateTime.now().minusMinutes(30));
    }
}
When I start my Spring Boot application, I can see in the logs that the job is running and triggered cyclically, but that's not enough to confirm it is working properly.
That's why I decided to create a JUnit test. I found a tutorial: click, but the author used a while(true) loop, which according to this topic: click is not a preferable option. So the question is: is there any other way to verify the Job class name and the identity of the trigger, and to check that the CRON expression and the concrete job are invoked as often as expected?
If possible, I would be grateful for suggestions on how to achieve the desired effect.
Please note that the other answer here, on using Spring Scheduling, has one big drawback: this works nicely if you just run a single instance of your application, but as soon as you scale up to multiple instances it becomes more complex.
You might want to run the job only once at a certain interval, but if two nodes run simultaneously the job might run on both nodes (so basically twice). Quartz can handle these kinds of situations because it can use a central database through which it coordinates whether a job has already been started.
With Spring scheduling, the job will run on both nodes.
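As an illustration of that point (this assumes a Spring Boot setup with spring-boot-starter-quartz, which is not shown in the question), the JDBC-backed, cluster-aware Quartz job store can be enabled with properties along these lines:

# application.properties (illustrative)
spring.quartz.job-store-type=jdbc
spring.quartz.properties.org.quartz.jobStore.isClustered=true
spring.quartz.properties.org.quartz.scheduler.instanceId=AUTO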
With Spring Boot, you could do this more easily as follows:
--- Option 1 ---
@Configuration
// This is important!
@EnableScheduling
public class SchedulingConfig {

    // TODO change to 0 0 0/2 1/1 * ? *
    @Scheduled(cron = "0 15 10 15 * ?")
    public void scheduleTaskUsingCronExpression() {
        long now = System.currentTimeMillis() / 1000;
        System.out.println("schedule tasks using cron jobs - " + now);
    }
}
Full Example:
https://www.baeldung.com/spring-scheduled-tasks
--- Option 2 -> Programmatically ---
@Configuration
@EnableScheduling
public class SchedulingConfig implements SchedulingConfigurer {

    @Bean(destroyMethod = "shutdown")
    public Executor taskExecutor() {
        return Executors.newScheduledThreadPool(100);
    }

    @Override
    public void configureTasks(ScheduledTaskRegistrar taskRegistrar) {
        CronTrigger cronTrigger = new CronTrigger("* * * * * *");
        taskRegistrar.setScheduler(taskExecutor());
        taskRegistrar.addTriggerTask(
                new Runnable() {
                    @Override
                    public void run() {
                        System.out.println("RUN!!!");
                    }
                },
                cronTrigger
        );
    }
}

How to trigger multiple "child" Jobs from a "mother" Job using Spring Batch?

I have a Job that looks into a configuration table and I would like it to trigger a Job per configuration entry using the values from the table.
The first Job consists of a single Tasklet that does the config table lookup and then triggers a subsequent Job per entry.
Mother Job code:
@Configuration
public class RunProcessesJobConfig {

    ... // steps, jobs factory init

    @Autowired
    private Tasklet runProcessTask;

    @Bean
    public Step runProcessStep() {
        return steps.get("runProcessStep")
                .tasklet(runProcessTask)
                .build();
    }

    @Bean
    public Job runProcessJob() {
        return jobs.get("runProcessJob")
                .start(runProcessStep())
                .build();
    }
}
The way I'm currently trying to implement it is by Autowiring a JobLauncher and the Job I need into the Tasklet and running the Job from there.
'RunProcessTask' - gets autowired into job config above:
@Autowired
Job myJob;

@Override
public RepeatStatus execute(StepContribution sc, ChunkContext cc) throws Exception {
    List<DeployCfg> deployCfgs = da.getJobCfgs();
    for (DeployCfg deployCfg : deployCfgs) {
        String cfg1 = deployCfg.getCfg1();
        String cfg2 = deployCfg.getCfg2();
        String cfg3 = deployCfg.getCfg3();
        // trigger a job per config entry
        JobParameters params = new JobParametersBuilder()
                .addString("cfg1", cfg1)
                .addString("cfg2", cfg2)
                .addString("cfg3", cfg3)
                .toJobParameters();
        final JobExecution jobExec = jobLauncher.run(myJob, params);
    }
    return RepeatStatus.FINISHED;
}
When I try executing the mother Job I get a TransactionSuspensionNotSupportedException: Transaction manager [org.springframework.batch.support.transaction.ResourcelessTransactionManager] does not support transaction suspension, raised on the jobLauncher.run(...) line.
I'm thinking that running a Job within another Job interferes with Spring Batch's transaction manager. Any ideas on how to do this?
Additional version info:
spring-boot-starter-parent version 1.5.9.RELEASE
spring-boot-starter-batch
