Passing parameters to itemreader through #StepScope or #JobScope - java

I´m trying to inject parameters from an outside context to an ItemReader using spring batch.
Below I have my code that trigger the job:
Date date = Date.from(advanceSystemDateEvent.getReferenceDate().atStartOfDay(ZoneId.systemDefault()).toInstant());
JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
jobParametersBuilder.addLong("uniqueness", System.nanoTime());
jobParametersBuilder.addDate("date", date);
jobLauncher.run(remuneradorJob, jobParametersBuilder.toJobParameters());
My job:
#EnableBatchProcessing
#Configuration
public class JobsConfig {
#Autowired
private JobBuilderFactory jobBuilderFactory;
#Bean
public Job remuneradorJob(Step remuneradorStep) {
return jobBuilderFactory
.get("remuneradorJob")
.start(remuneradorStep)
.incrementer(new RunIdIncrementer())
.build();
}
}
My step:
#Configuration
public class StepsConfig {
#Autowired
private StepBuilderFactory stepBuilderFactory;
#Bean
public Step remuneradorStep(ItemReader<Entity> remuneradorReader, ItemWriter<Entity> remuneradorWriter) {
return stepBuilderFactory
.get("remuneradorStep")
.<Entity, Entity>chunk(1)
.reader(remuneradorReader)
.writer(remuneradorWriter)
.build();
}
}
My itemreader:
#Configuration
public class RemuneradorReaderConfig {
#Autowired
private EntityManagerFactory entityManagerFactory;
#Bean
#StepScope
public JpaPagingItemReader<Entity> remuneradorReader(#Value("#{jobParameters}") Map jobParameters) {
//LocalDate localDate = date.toInstant().atZone(ZoneId.systemDefault()).toLocalDate();
LocalDate localDate = LocalDate.of(2011,5,16);
JpaPagingItemReader<Entity> databaseReader = new JpaPagingItemReader<>();
databaseReader.setEntityManagerFactory(entityManagerFactory);
databaseReader.setQueryString("...");
databaseReader.setPageSize(1000);
databaseReader.setParameterValues(Collections.<String, Object>singletonMap("limit", localDate));
return databaseReader;
}
}
I got the error:
Error creating bean with name 'scopedTarget.remuneradorReader': Scope 'step' is not active for the current thread; consider defining a scoped proxy for this bean if you intend to refer to it from a singleton; nested exception is java.lang.IllegalStateException: No context holder available for step scope
I tried replacing #StepScope for #JobScope, but I got the same error.
I saw this issue:
Spring batch scope issue while using spring boot
And finally, the application runs.
But now I´m facing another problem, according the code below:
#Bean
#StepScope
public JpaPagingItemReader<Entity> remuneradorReader(#Value("#{jobParameters}") Map jobParameters) {
JpaPagingItemReader<Entity> databaseReader = new JpaPagingItemReader<>();
databaseReader.setEntityManagerFactory(entityManagerFactory);
databaseReader.setQueryString("select o from Object o");
databaseReader.setPageSize(1000);
return databaseReader;
}
When I execute this reader, it gives me:
Deposit{Deposit_ID='null', Legacy_ID ='null', Valor_Depósito='10000', Saldo='10000'}
The idDeposit and idLegacy comes null.
But when I remove #StepScope and #Value("#{jobParameters}") Map jobParameters from ItemReader, like code below:
#Bean
public JpaPagingItemReader<Entity> remuneradorReader() {
JpaPagingItemReader<Entity> databaseReader = new JpaPagingItemReader<>();
databaseReader.setEntityManagerFactory(entityManagerFactory);
databaseReader.setQueryString("select o from Object o");
databaseReader.setPageSize(1000);
return databaseReader;
}
The reader gives me the correct response:
Deposit{Deposit_ID='98', Legacy_ID ='333', Valor_Depósito='10000', Saldo='10000'}
I think it´s missing something else.
Can anyone help me?

Related

Spring Batch throwing NullPointerException while saving data in database [duplicate]

This question already has answers here:
Why is my Spring #Autowired field null?
(21 answers)
Closed 1 year ago.
I tried implementing Spring batch. Here I'm trying to save data from text file to save in database. I'm getting NPE while processing.
#Configuration
public class JobConfig {
#Value("${inputFile}")
private Resource resource;
#Autowired
private ResourceLoader resourceLoader;
private JobBuilderFactory jobBuilderFactory;
private StepBuilderFactory stepBuilderFactory;
#Autowired
public JobConfig(JobBuilderFactory jobBuilderFactory, StepBuilderFactory stepBuilderFactory) {
this.jobBuilderFactory = jobBuilderFactory;
this.stepBuilderFactory = stepBuilderFactory;
}
#Bean
public Job job() {
return this.jobBuilderFactory.get("JOB-Load")
.start(fileReadingStep())
.build();
}
#Bean
public Step fileReadingStep() {
return stepBuilderFactory.get("File-Read-Step1")
.<Student,Student>chunk(5)
.reader(itemReader())
.processor(new Processer())
.writer(new Writer())
.build();
}
#Bean
public FlatFileItemReader<Student> itemReader() {
FlatFileItemReader<Student> flatFileItemReader = new FlatFileItemReader<>();
flatFileItemReader.setResource(resource);
flatFileItemReader.setName("File-Reader");
flatFileItemReader.setStrict(false);
flatFileItemReader.setLineMapper(LineMapper());
return flatFileItemReader;
}
#Bean
public LineMapper<Student> LineMapper() {
DefaultLineMapper<Student> defaultLineMapper = new DefaultLineMapper<Student>();
FixedLengthTokenizer fixedLengthTokenizer = new FixedLengthTokenizer();
fixedLengthTokenizer.setNames(new String[] { "name"});
fixedLengthTokenizer.setColumns(new Range[] { new Range(1, 5)});
fixedLengthTokenizer.setStrict(false);
defaultLineMapper.setLineTokenizer(fixedLengthTokenizer);
defaultLineMapper.setFieldSetMapper(new DefaultFieldSetMapper());
return defaultLineMapper;
}
}
Writer Class
#Component
public class Writer implements ItemWriter<Student> {
#Autowired
private StudentRepo repo;
#Override
public void write(List<? extends Student> items) throws Exception {
for(Student s:items) {
System.out.println(s.getName());
}
try {
repo.saveAll(items);
}catch (Exception e) {
e.printStackTrace();
}
}
}
here I'm used JPARepository to save my text file data into database in my custom writer class. Here StudentRepo is null.
Why it's null ?
I tried another method method used same StudentRepo to store in db manually there is no issue.
Only in writer class it's null.
In fileReadingStep, you do not use the Writer bean that gets created through the #Component annotation. You create a new Writer instance which the Spring context is unaware of wherefore the repo field is not set and remains null.
You need to inject the Writer bean like you inject the ResourceLoader or the JobBuilderFactory and then use the bean in fileReadingStep.

SpringBatch Sharing Large Amounts of Data Between Steps

I have a need to share relatively large amounts of data between job steps for a spring batch implementation. I am aware of StepExecutionContext and JobExecutionContext as mechanisms for this. However, I read that since these must be limited in size (less than 2500 characters). That is too small for my needs. In my novice one-step Spring Batch implementation, my single step job is as below:
#Configuration
#EnableBatchProcessing
public class BatchConfig {
#Autowired
public JobBuilderFactory jobBuilderFactory;
#Autowired
public StepBuilderFactory stepBuilderFactory;
private static final String GET_DATA =
" SELECT " +
"stuffA, " +
"stuffB, " +
"FROM STUFF_TABLE " +
"ORDER BY stuffA ASC";
#Bean
public ItemReader<StuffDto> databaseCursorItemReader(DataSource dataSource) {
return new JdbcCursorItemReaderBuilder<StuffDto>()
.name("cursorItemReader")
.dataSource(dataSource)
.sql(GET_DATA)
.rowMapper(new BeanPropertyRowMapper<>(StuffDto.class))
.build();
}
#Bean
ItemProcessor<StuffDto, StuffDto> databaseXmlItemProcessor() {
return new QueryLoggingProcessor();
}
#Bean
public ItemWriter<StuffDto> databaseCursorItemWriter() {
return new LoggingItemWriter();
}
#Bean
public Step databaseCursorStep(#Qualifier("databaseCursorItemReader") ItemReader<StuffDto> reader,
#Qualifier("databaseCursorItemWriter") ItemWriter<StuffDto> writer,
StepBuilderFactory stepBuilderFactory) {
return stepBuilderFactory.get("databaseCursorStep")
.<StuffDto, StuffDto>chunk(1)
.reader(reader)
.writer(writer)
.build();
}
#Bean
public Job databaseCursorJob(#Qualifier("databaseCursorStep") Step exampleJobStep,
JobBuilderFactory jobBuilderFactory) {
return jobBuilderFactory.get("databaseCursorJob")
.incrementer(new RunIdIncrementer())
.flow(exampleJobStep)
.end()
.build();
}
}
This works fine in the sense that I can successfully read from the database and write in the writer step to a loggingitemwriter like this:
public class LoggingItemWriter implements ItemWriter<StuffDto> {
private static final Logger LOGGER = LoggerFactory.getLogger(LoggingItemWriter.class);
#Override
public void write(List<? extends StuffDto> list) throws Exception {
LOGGER.info("Writing stuff: {}", list);
}
}
However, I need to be able to make available that StuffDto (or equivalent) and it's data to a second step that would be performing some processing against it rather than just logging it.
I would be grateful for any ideas on how that could be accomplished if you assume that the step and job contexts are too limited. Thanks.
If you do not want to write the data in the database or filesystem, one way to achieve the same is like below:
Create your own job context bean in your config class having the required properties and annotated it with #JobScope.
Implement org.springframework.batch.core.step.tasklet interface to your reader, processor and writer classes. If you want more control over steps you can also implement org.springframework.batch.core.StepExecutionListener with it.
Get your own context object using #Autowire and use the setter-getter method of it to store and retrieve the data.
Sample Code:
Config.java
#Autowired
private Processor processor;
#Autowired
private Reader reader;
#Autowired
private Writer writer;
#Autowired
private JobBuilderFactory jobBuilderFactory;
#Autowired
private StepBuilderFactory stepBuilderFactory;
#Bean
#JobScope
public JobContext getJobContexts() {
JobContext newJobContext = new JobContext();
return newJobContext;
}
#Bean
public Step reader() {
return stepBuilderFactory.get("reader")
.tasklet(reader)
.build();
}
#Bean
public Step processor() {
return stepBuilderFactory.get("processor")
.tasklet(processor)
.build();
}
#Bean
public Step writer() {
return stepBuilderFactory.get("writer")
.tasklet(writer)
.build();
}
public Job testJob() {
return jobBuilderFactory.get("testJob")
.start(reader())
.next(processor())
.next(writer())
.build();
}
//Below will start the job
#Scheduled(fixedRate = 1000)
public void starJob(){
Map<String, JobParameter> confMap = new HashMap<>();
confMap.put("time", new JobParameter(new Date()));
JobParameters jobParameters = new JobParameters(confMap);
monitorJobLauncher.run(testJob(), jobParameters);
}
JobContext.java
private List<StuffDto> dataToProcess = new ArrayList<>();
private List<StuffDto> dataToWrite = new ArrayList<>();
//getter
SampleReader.java
#Component
public class SampleReader implements Tasklet,StepExecutionListener{
#Autowired
private JobContext context;
#Override
public void beforeStep(StepExecution stepExecution) {
//logic that you need to perform before the execution of this step.
}
#Override
public void afterStep(StepExecution stepExecution) {
//logic that you need to perform after the execution of this step.
}
#Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext){
// Whatever code is here will get executed for reader.
// Fetch StuffDto object from database and add it to jobContext
//dataToProcess list.
return RepeatStatus.FINISHED;
}
}
SampleProcessor.java
#Component
public class SampleProcessor implements Tasklet{
#Autowired
private JobContext context;
#Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext){
// Whatever code is here will get executed for processor.
// context.getDataToProcessList();
// apply business logic and set the data to write.
return RepeatStatus.FINISHED;
}
Same ways for the writer class.
Note: Please note here that here you need to write database-related boilerplate code on your own. But this way you can have more control over your logic and nothing to worry about context size limit. All the data will be in memory so as soon as operation done those will be garbage collected. I hope you get the idea of what I willing to convey.
To read more about Tasklet vs Chunk read this.

Spring Batch (modular=true) creates job config beans twice

#Configuration
#EnableBatchProcessing(modular=true)
public class ModularJobConfiguration {
#Bean
public ApplicationContextFactory job() {
return new GenericApplicationContextFactory(FlatfileToDbJobConfiguration.class);
}
#Bean
public ApplicationContextFactory anotherJob() {
return new GenericApplicationContextFactory(FlatfileToDbJobAnotherConfiguration.class);
}
}
For example show only one config, another like this.
#Configuration
public class FlatfileToDbJobConfiguration {
#Autowired
private JobBuilderFactory jobBuilders;
#Autowired
private StepBuilderFactory stepBuilders;
#Bean
public Job flatfileToDbJob(){
return jobBuilders.get("flatfileToDbJob")
.listener(protocolListener())
.start(step())
.build();
}
#Bean
public Step step(){
return stepBuilders.get("step")
.<Partner,Partner>chunk(1)
.reader(reader())
.processor(processor())
.writer(writer())
.build();
}
// ...
// rest part of code
// ...
}
All works fine, but all beans methods in config files call twice. I think, first in common context and second in individual. Is it normal? I can't use post construct, because it will call twice. How to have modular=true and only one bean initialization? Thanks!
I find the answer. I need #Lazy initialization. Then config will initialize then individual context creates.
#Configuration
#Lazy
public class FlatfileToDbJobConfiguration {
#Autowired
private JobBuilderFactory jobBuilders;
#Autowired
private StepBuilderFactory stepBuilders;
#Bean
public Job flatfileToDbJob(){
return jobBuilders.get("flatfileToDbJob")
.listener(protocolListener())
.start(step())
.build();
}
#Bean
public Step step(){
return stepBuilders.get("step")
.<Partner,Partner>chunk(1)
.reader(reader())
.processor(processor())
.writer(writer())
.build();
}
// ...
// rest part of code
// ...
}

How to run multiple jobs in spring batch using annotations

I am using Spring Boot + Spring Batch (annotation) , have come across a scenario where I have to run 2 jobs.
I have Employee and Salary records which needs to updated using spring batch. I have configured BatchConiguration classes by following this tutorial spring-batch getting started tutorial for Employee and Salary objects, respectively named as BatchConfigurationEmployee & BatchConfigurationSalary.
I have Defined the ItemReader, ItemProcessor, ItemWriter and Job by following the tutorial which is mentioned above already.
When I start my Spring Boot application either of the Job runs, I want to run both the BatchConfigured classes. How can I achieve this
********* BatchConfigurationEmployee.java *************
#Configuration
#EnableBatchProcessing
public class BatchConfigurationEmployee {
public ItemReader<employee> reader() {
return new EmployeeItemReader();
}
#Bean
public ItemProcessor<Employee, Employee> processor() {
return new EmployeeItemProcessor();
}
#Bean
public Job Employee(JobBuilderFactory jobs, Step s1) {
return jobs.get("Employee")
.incrementer(new RunIdIncrementer())
.flow(s1)
.end()
.build();
}
#Bean
public Step step1(StepBuilderFactory stepBuilderFactory, ItemReader<Employee> reader,
ItemProcessor<Employee, Employee> processor) {
return stepBuilderFactory.get("step1")
.<Employee, Employee> chunk(1)
.reader(reader)
.processor(processor)
.build();
}
}
Salary Class is here
#Configuration
#EnableBatchProcessing
public class BatchConfigurationSalary {
public ItemReader<Salary> reader() {
return new SalaryItemReader();
}
#Bean
public ItemProcessor<Salary, Salary> processor() {
return new SalaryItemProcessor();
}
#Bean
public Job salary(JobBuilderFactory jobs, Step s1) {
return jobs.get("Salary")
.incrementer(new RunIdIncrementer())
.flow(s1)
.end()
.build();
}
#Bean
public Step step1(StepBuilderFactory stepBuilderFactory, ItemReader<Salary> reader,
ItemProcessor<Salary, Salary> processor) {
return stepBuilderFactory.get("step1")
.<Salary, Salary> chunk(1)
.reader(reader)
.processor(processor)
.build();
}
}
The names of the Beans have to be unique in the whole Spring Context.
In both jobs, you are instantiating the reader, writer and processor with the same methodname. The methodname is the name that is used to identifiy the bean in the context.
In both job-definitions, you have reader(), writer() and processor(). They will overwrite each other. Give them unique names like readerEmployee(), readerSalary() and so on.
That should solve your problem.
You jobs are not annotated with #Bean, so the spring-context doesn't know them.
Have a look at the class JobLauncherCommandLineRunner. All Beans in the SpringContext implementing the Job interface will be injected. All jobs that are found will be executed. (this happens inside the method executeLocalJobs in JobLauncherCommandLineRunner)
If, for some reason, you don't want to have them as beans in the context, then you have to register your jobs with the jobregistry.( the method execute registeredJobs of JobLauncherCommandLineRunner will take care of launching the registered jobs)
BTW, you can control with the property
spring.batch.job.names= # Comma-separated list of job names to execute on startup (For instance
`job1,job2`). By default, all Jobs found in the context are executed.
which jobs should be launched.
I feel that this also is a pretty good way to run mutiple Jobs.
I am making use of a Job Launcher to configure and execute the job and independent commandLineRunner implementation to run them. These are ordered to make sure they are executed sequentially in the required though
Apologies for the big post but I wanted to give a clear picture of what can be achieved using JobLauncher configurations with multiple command line runners
This is the current BeanConfiguration that I have
#Configuration
public class BeanConfiguration {
#Autowired
DataSource dataSource;
#Autowired
PlatformTransactionManager transactionManager;
#Bean(name="jobOperator")
public JobOperator jobOperator(JobExplorer jobExplorer,
JobRegistry jobRegistry) throws Exception {
SimpleJobOperator jobOperator = new SimpleJobOperator();
jobOperator.setJobExplorer(jobExplorer);
jobOperator.setJobRepository(createJobRepository());
jobOperator.setJobRegistry(jobRegistry);
jobOperator.setJobLauncher(jobLauncher());
return jobOperator;
}
/**
* Configure joblaucnher to set the execution to be done asycn
* Using the ThreadPoolTaskExecutor
* #return
* #throws Exception
*/
#Bean
public JobLauncher jobLauncher() throws Exception {
SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
jobLauncher.setJobRepository(createJobRepository());
jobLauncher.setTaskExecutor(taskExecutor());
jobLauncher.afterPropertiesSet();
return jobLauncher;
}
// Read the datasource and set in the job repo
protected JobRepository createJobRepository() throws Exception {
JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
factory.setDataSource(dataSource);
factory.setTransactionManager(transactionManager);
factory.setIsolationLevelForCreate("ISOLATION_SERIALIZABLE");
//factory.setTablePrefix("BATCH_");
factory.setMaxVarCharLength(10000);
return factory.getObject();
}
#Bean
public RestTemplateBuilder restTemplateBuilder() {
return new RestTemplateBuilder().additionalInterceptors(new CustomRestTemplateLoggerInterceptor());
}
#Bean(name=AppConstants.JOB_DECIDER_BEAN_NAME_EMAIL_INIT)
public JobExecutionDecider jobDecider() {
return new EmailInitJobExecutionDecider();
}
#Bean
public ThreadPoolTaskExecutor taskExecutor() {
ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
taskExecutor.setCorePoolSize(15);
taskExecutor.setMaxPoolSize(20);
taskExecutor.setQueueCapacity(30);
return taskExecutor;
}
}
I have setup the database to hold the job exectuion details in postgre and hence the DatabaseConfiguration looks like this (two different beans for two different profiles -env)
#Configuration
public class DatasourceConfiguration implements EnvironmentAware{
private Environment env;
#Bean
#Qualifier(AppConstants.DB_BEAN)
#Profile("dev")
public DataSource getDataSource() {
HikariDataSource ds = new HikariDataSource();
boolean isAutoCommitEnabled = env.getProperty("spring.datasource.hikari.auto-commit") != null ? Boolean.parseBoolean(env.getProperty("spring.datasource.hikari.auto-commit")):false;
ds.setAutoCommit(isAutoCommitEnabled);
// Connection test query is for legacy connections
//ds.setConnectionInitSql(env.getProperty("spring.datasource.hikari.connection-test-query"));
ds.setPoolName(env.getProperty("spring.datasource.hikari.pool-name"));
ds.setDriverClassName(env.getProperty("spring.datasource.driver-class-name"));
long timeout = env.getProperty("spring.datasource.hikari.idleTimeout") != null ? Long.parseLong(env.getProperty("spring.datasource.hikari.idleTimeout")): 40000;
ds.setIdleTimeout(timeout);
long maxLifeTime = env.getProperty("spring.datasource.hikari.maxLifetime") != null ? Long.parseLong(env.getProperty("spring.datasource.hikari.maxLifetime")): 1800000 ;
ds.setMaxLifetime(maxLifeTime);
ds.setJdbcUrl(env.getProperty("spring.datasource.url"));
ds.setPoolName(env.getProperty("spring.datasource.hikari.pool-name"));
ds.setUsername(env.getProperty("spring.datasource.username"));
ds.setPassword(env.getProperty("spring.datasource.password"));
int poolSize = env.getProperty("spring.datasource.hikari.maximum-pool-size") != null ? Integer.parseInt(env.getProperty("spring.datasource.hikari.maximum-pool-size")): 10;
ds.setMaximumPoolSize(poolSize);
return ds;
}
#Bean
#Qualifier(AppConstants.DB_PROD_BEAN)
#Profile("prod")
public DataSource getProdDatabase() {
HikariDataSource ds = new HikariDataSource();
boolean isAutoCommitEnabled = env.getProperty("spring.datasource.hikari.auto-commit") != null ? Boolean.parseBoolean(env.getProperty("spring.datasource.hikari.auto-commit")):false;
ds.setAutoCommit(isAutoCommitEnabled);
// Connection test query is for legacy connections
//ds.setConnectionInitSql(env.getProperty("spring.datasource.hikari.connection-test-query"));
ds.setPoolName(env.getProperty("spring.datasource.hikari.pool-name"));
ds.setDriverClassName(env.getProperty("spring.datasource.driver-class-name"));
long timeout = env.getProperty("spring.datasource.hikari.idleTimeout") != null ? Long.parseLong(env.getProperty("spring.datasource.hikari.idleTimeout")): 40000;
ds.setIdleTimeout(timeout);
long maxLifeTime = env.getProperty("spring.datasource.hikari.maxLifetime") != null ? Long.parseLong(env.getProperty("spring.datasource.hikari.maxLifetime")): 1800000 ;
ds.setMaxLifetime(maxLifeTime);
ds.setJdbcUrl(env.getProperty("spring.datasource.url"));
ds.setPoolName(env.getProperty("spring.datasource.hikari.pool-name"));
ds.setUsername(env.getProperty("spring.datasource.username"));
ds.setPassword(env.getProperty("spring.datasource.password"));
int poolSize = env.getProperty("spring.datasource.hikari.maximum-pool-size") != null ? Integer.parseInt(env.getProperty("spring.datasource.hikari.maximum-pool-size")): 10;
ds.setMaximumPoolSize(poolSize);
return ds;
}
public void setEnvironment(Environment environment) {
// TODO Auto-generated method stub
this.env = environment;
}
}
Make sure that the initial app launcher catches the app execution which will be returned once the job execution terminates (either gets failed or completed) so that you can gracefully shutdown the jvm. Else using joblauncher makes the jvm to be alive even after all jobs get completed
#SpringBootApplication
#ComponentScan(basePackages="com.XXXX.Feedback_File_Processing.*")
#EnableBatchProcessing
public class FeedbackFileProcessingApp
{
public static void main(String[] args) throws Exception {
ApplicationContext appContext = SpringApplication.run(FeedbackFileProcessingApp.class, args);
// The batch job has finished by this point because the
// ApplicationContext is not 'ready' until the job is finished
// Also, use System.exit to force the Java process to finish with the exit code returned from the Spring App
System.exit(SpringApplication.exit(appContext));
}
}
............. so on , you can configure your own decider , your own job/steps as you said above for two different configurations like below and use them seperately in commandline runners (since the post is getting bigger, I am giving the details of just the job and command line runner)
These are the two jobs
#Configuration
public class DefferalJobConfiguration {
#Autowired
JobLauncher joblauncher;
#Autowired
private JobBuilderFactory jobFactory;
#Autowired
private StepBuilderFactory stepFactory;
#Bean
#StepScope
public Tasklet newSampleTasklet() {
return ((stepExecution, chunkContext) -> {
System.out.println("execution of step after flow");
return RepeatStatus.FINISHED;
});
}
#Bean
public Step sampleStep() {
return stepFactory.get("sampleStep").listener(new CustomStepExecutionListener())
.tasklet(newSampleTasklet()).build();
}
#Autowired
#Qualifier(AppConstants.FLOW_BEAN_NAME_EMAIL_INITIATION)
private Flow emailInitFlow;
#Autowired
#Qualifier(AppConstants.JOB_DECIDER_BEAN_NAME_EMAIL_INIT)
private JobExecutionDecider jobDecider;
#Autowired
#Qualifier(AppConstants.STEP_BEAN_NAME_ITEMREADER_FETCH_DEFERRAL_CONFIG)
private Step deferralConfigStep;
#Bean(name=AppConstants.JOB_BEAN_NAME_DEFERRAL)
public Job deferralJob() {
return jobFactory.get(AppConstants.JOB_NAME_DEFERRAL)
.start(emailInitFlow)
.on("COMPLETED").to(sampleStep())
.next(jobDecider).on("COMPLETED").to(deferralConfigStep)
.on("FAILED").fail()
.end().build();
}
}
#Configuration
public class TestFlowJobConfiguration {
#Autowired
private JobBuilderFactory jobFactory;
#Autowired
#Qualifier("testFlow")
private Flow testFlow;
#Bean(name = "testFlowJob")
public Job testFlowJob() {
return jobFactory.get("testFlowJob").start(testFlow).end().build();
}
}
Here are the command line runners (I am making sure that the first job is completed before the second job is initialized but it is totally up to the user to execute them in parallel following a different stratergy)
#Component
#Order(1)
public class DeferralCommandLineRunner implements CommandLineRunner, EnvironmentAware{
// If the jobLauncher is not used, then by default jobs are launched using SimpleJobLauncher
// with default configuration(assumption)
// hence modified the jobLauncher with vales set in BeanConfig
// of spring batch
private Environment env;
#Autowired
JobLauncher jobLauncher;
#Autowired
#Qualifier(AppConstants.JOB_BEAN_NAME_DEFERRAL)
Job deferralJob;
#Override
public void run(String... args) throws Exception {
// TODO Auto-generated method stub
JobParameters jobparams = new JobParametersBuilder()
.addString("run.time", LocalDateTime.now().
format(DateTimeFormatter.ofPattern(AppConstants.JOB_DATE_FORMATTER_PATTERN)).toString())
.addString("instance.name",
(deferralJob.getName() != null) ?deferralJob.getName()+'-'+UUID.randomUUID().toString() :
UUID.randomUUID().toString())
.toJobParameters();
jobLauncher.run(deferralJob, jobparams);
}
#Override
public void setEnvironment(Environment environment) {
// TODO Auto-generated method stub
this.env = environment;
}
}
#Component
#Order(2)
public class TestJobCommandLineRunner implements CommandLineRunner {
#Autowired
JobLauncher jobLauncher;
#Autowired
#Qualifier("testFlowJob")
Job testjob;
#Autowired
#Qualifier("jobOperator")
JobOperator operator;
#Override
public void run(String... args) throws Exception {
// TODO Auto-generated method stub
JobParameters jobParam = new JobParametersBuilder().addString("name", UUID.randomUUID().toString())
.toJobParameters();
System.out.println(operator.getJobNames());
try {
Set<Long> deferralExecutionIds = operator.getRunningExecutions(AppConstants.JOB_NAME_DEFERRAL);
System.out.println("deferralExceutuibuds:" + deferralExecutionIds);
operator.stop(deferralExecutionIds.iterator().next());
} catch (NoSuchJobException | NoSuchJobExecutionException | JobExecutionNotRunningException e) {
// just add a logging here
System.out.println("exception caught:" + e.getMessage());
}
jobLauncher.run(testjob, jobParam);
}
}
Hope this gives a complete idea of how it can be done. I am using spring-boot-starter-batch:jar:2.0.0.RELEASE

Spring Batch Multiple Threads

I am writing a Spring Batch with idea of scaling it when required.
My ApplicationContext looks like this
#Configuration
#EnableBatchProcessing
#EnableTransactionManagement
#ComponentScan(basePackages = "in.springbatch")
#PropertySource(value = {"classpath:springbatch.properties"})
public class ApplicationConfig {
#Autowired
Environment environment;
#Autowired
private JobBuilderFactory jobs;
#Autowired
private StepBuilderFactory stepBuilderFactory;
#Bean
public Job job() throws Exception {
return jobs.get("spring_batch")
.flow(step()).end()
.build();
}
#Bean(name = "dataSource", destroyMethod = "close")
public DataSource dataSource() {
BasicDataSource basicDataSource = new BasicDataSource();
return basicDataSource;
}
#Bean
public JobRepository jobRepository() throws Exception {
JobRepositoryFactoryBean jobRepositoryFactoryBean = new JobRepositoryFactoryBean();
jobRepositoryFactoryBean.setTransactionManager(transactionManager());
jobRepositoryFactoryBean.setDataSource(dataSource());
return jobRepositoryFactoryBean.getObject();
}
#Bean(name = "batchstep")
public Step step() throws Exception {
return stepBuilderFactory.get("batchstep").allowStartIfComplete(true).
transactionManager(transactionManager()).
chunk(2).reader(batchReader()).processor(processor()).writer(writer()).build();
}
#Bean
ItemReader batchReader() throws Exception {
System.out.println(Thread.currentThread().getName()+"reader");
HibernateCursorItemReader<Source> hibernateCursorItemReader = new HibernateCursorItemReader<>();
hibernateCursorItemReader.setQueryString("from Source");
hibernateCursorItemReader.setFetchSize(2);
hibernateCursorItemReader.setSessionFactory(sessionFactory().getObject());
hibernateCursorItemReader.close();
return hibernateCursorItemReader;
}
#Bean
public ItemProcessor processor() {
return new BatchProcessor();
}
#Bean
public ItemWriter writer() {
return new BatchWriter();
}
public TaskExecutor taskExecutor(){
SimpleAsyncTaskExecutor asyncTaskExecutor=new SimpleAsyncTaskExecutor("spring_batch");
asyncTaskExecutor.setConcurrencyLimit(5);
return asyncTaskExecutor;
}
#Bean
public LocalSessionFactoryBean sessionFactory() {
LocalSessionFactoryBean sessionFactory = new LocalSessionFactoryBean();
sessionFactory.setDataSource(dataSource());
sessionFactory.setPackagesToScan(new String[]{"in.springbatch.entity"});
sessionFactory.setHibernateProperties(hibernateProperties());
return sessionFactory;
}
#Bean
public PersistenceExceptionTranslationPostProcessor exceptionTranslation() {
return new PersistenceExceptionTranslationPostProcessor();
}
#Bean
#Autowired
public HibernateTransactionManager transactionManager() {
HibernateTransactionManager txManager = new HibernateTransactionManager();
txManager.setSessionFactory(sessionFactory().getObject());
return txManager;
}
Properties hibernateProperties() {
return new Properties() {
{
setProperty("hibernate.hbm2ddl.auto", environment.getProperty("hibernate.hbm2ddl.auto"));
setProperty("hibernate.dialect", environment.getProperty("hibernate.dialect"));
setProperty("hibernate.globally_quoted_identifiers", "false");
}
};
}
}
With above configuration I am able to read from DB , process the data and write to DB.
I am using chunk size as 2 and reading 2 records from cursor using
HibernateCusrsorItem reader and my query to read from DB is based on
date to pick current date records.
So far I am able to achieve desired behavior as well as restart
ability with job only picking records which were not processed
due to failure in previous run.
Now my requirement is to make batch use multiple threads to process data and write to DB.
My Processor and writer looks like this
#Component
public class BatchProcessor implements ItemProcessor<Source,DestinationDto>{
#Override
public DestinationDto process(Source source) throws Exception {
System.out.println(Thread.currentThread().getName()+":"+source);
DestinationDto destination=new DestinationDto();
destination.setName(source.getName());
destination.setValue(source.getValue());
destination.setSourceId(source.getSourceId().toString());
return destination;
}
#Component
public class BatchWriter implements ItemWriter<DestinationDto>{
#Autowired
IBatchDao batchDao;
#Override
public void write(List<? extends DestinationDto> list) throws Exception {
System.out.println(Thread.currentThread().getName()+":"+list);
batchDao.saveToDestination((List<DestinationDto>)list);
}
I updated my step and added a ThreadPoolTaskExecutor as follows
#Bean(name = "batchstep")
public Step step() throws Exception {
return stepBuilderFactory.get("batchstep").allowStartIfComplete(true).
transactionManager(transactionManager()).chunk(1).reader(batchReader()).
processor(processor()).writer(writer()).taskExecutor(taskExecutor()).build();
}
After this my processor is getting called by multiple threads but with same source data.
Is there anything extra i need to do?
This is a big question
Your best bet at getting a good answers would be to look through the Scaling and Parallel Processing chapter in the Spring Batch Documentation (Here)
There might be some multi-threading samples in the spring batch examples (Here)
An easy way to thread the Spring batch job is to Create A Future Processor - you put all your Processing Logic in a Future Object and you spring-processor class only adds Objects to the future. You writer class then wait on the future to finish before performing the write process. Sorry I don't have a sample to point you too for this - but if you have specific questions I can try and answer!

Categories