I have created a simple single-step Spring Batch job that reads items from a DB, processes them, and writes the result to a CSV file. During runtime I end up with a
org.springframework.batch.item.WriterNotOpenException: Writer must be open before it can be written to
The relevant code:
@Configuration
@EnableBatchProcessing
@EnableAutoConfiguration
public class CleanEmailJob {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    public DataSource dataSource;

    @Bean
    public ResourcelessTransactionManager transactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Bean
    public MapJobRepositoryFactoryBean mapJobRepositoryFactory(ResourcelessTransactionManager txManager)
            throws Exception {
        MapJobRepositoryFactoryBean factory = new MapJobRepositoryFactoryBean(txManager);
        factory.afterPropertiesSet();
        return factory;
    }

    @Bean
    public JobRepository jobRepository(MapJobRepositoryFactoryBean factory) throws Exception {
        return factory.getObject();
    }

    @Bean
    public JobExplorer jobExplorer(MapJobRepositoryFactoryBean factory) {
        return new SimpleJobExplorer(factory.getJobInstanceDao(), factory.getJobExecutionDao(),
                factory.getStepExecutionDao(), factory.getExecutionContextDao());
    }

    @Bean
    public SimpleJobLauncher jobLauncher(JobRepository jobRepository) {
        SimpleJobLauncher launcher = new SimpleJobLauncher();
        launcher.setJobRepository(jobRepository);
        return launcher;
    }

    @Bean
    public Job cleanEmailAddressesJob() throws Exception {
        return jobBuilderFactory.get("cleanEmailAddresses")
                .incrementer(new RunIdIncrementer())
                .start(processEmailAddresses())
                .build();
    }

    @Bean
    public Step processEmailAddresses() throws UnexpectedInputException, ParseException, Exception {
        return stepBuilderFactory.get("processAffiliates")
                .<AffiliateEmailAddress, VerifiedAffiliateEmailAddress>chunk(10)
                .reader(reader())
                .processor(processor())
                .writer(report())
                .build();
    }

    @Bean
    public ItemWriter<VerifiedAffiliateEmailAddress> report() {
        FlatFileItemWriter<VerifiedAffiliateEmailAddress> reportWriter = new FlatFileItemWriter<>();
        reportWriter.setResource(new ClassPathResource("report.csv"));
        DelimitedLineAggregator<VerifiedAffiliateEmailAddress> delLineAgg = new DelimitedLineAggregator<>();
        delLineAgg.setDelimiter(",");
        BeanWrapperFieldExtractor<VerifiedAffiliateEmailAddress> fieldExtractor = new BeanWrapperFieldExtractor<>();
        fieldExtractor.setNames(new String[] {"uniekNr", "reason"});
        delLineAgg.setFieldExtractor(fieldExtractor);
        reportWriter.setLineAggregator(delLineAgg);
        reportWriter.setShouldDeleteIfExists(true);
        return reportWriter;
    }
}
As described in the documentation, I would expect the lifecycle events (open, close) to be taken care of automatically, since this is a single-threaded job with a single writer?
To elaborate on the comment left: Spring Batch will register any ItemStream implementations automatically when it finds them, so that they are automatically opened when the step begins. When using Java config, Spring only knows what the return type is. Since you are returning ItemWriter, we don't know that your implementation also implements ItemStream. When using Java config, I usually recommend returning the implementation if it's known (instead of the interface). That allows Spring to introspect it fully. So in this example, returning FlatFileItemWriter instead of ItemWriter will fix the issue.
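A minimal sketch of that change applied to the report() bean above (only the declared return type changes, the body stays the same):
@Bean
public FlatFileItemWriter<VerifiedAffiliateEmailAddress> report() {
    FlatFileItemWriter<VerifiedAffiliateEmailAddress> reportWriter = new FlatFileItemWriter<>();
    // ... same writer configuration as above ...
    return reportWriter;
}
With the concrete return type, Spring can see that the bean implements ItemStream and registers it with the step, so open() and close() are invoked automatically.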
Finally it turned out to be Spring Dev Tools interfering with my batch application. The report.csv is on my classpath; when writing to the report file, Spring Dev Tools detects a change on the classpath and triggers a reload of the application, causing the report.csv resource to be closed... Setting
spring.devtools.restart.enabled=false
in my application.properties made it work again.
Related
I am trying to schedule a Spring Batch job on top of a Spring Boot application. The following are my configurations; however, application startup fails with the following error:
Parameter 0 of method mapJobRepositoryFactory in ScheduleConfig required a bean of type 'org.springframework.batch.support.transaction.ResourcelessTransactionManager' that could not be found.
Can someone shed light on why this is occurring?
@Configuration
@EnableScheduling
public class ScheduleConfig {

    @Bean
    public ResourcelessTransactionManager transactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Bean
    public MapJobRepositoryFactoryBean mapJobRepositoryFactory(
            ResourcelessTransactionManager transactionManager) throws Exception {
        MapJobRepositoryFactoryBean factory = new MapJobRepositoryFactoryBean(transactionManager);
        factory.afterPropertiesSet();
        return factory;
    }

    @Bean
    public JobRepository jobRepository(MapJobRepositoryFactoryBean factory) throws Exception {
        return factory.getObject();
    }

    @Bean
    public SimpleJobLauncher jobLauncher(JobRepository jobRepository) {
        SimpleJobLauncher launcher = new SimpleJobLauncher();
        launcher.setJobRepository(jobRepository);
        return launcher;
    }
}
Add @EnableBatchProcessing on top of the class :)
Also, don't declare a bean method like
public PlatformTransactionManager transactionManager(...)
because it conflicts with the PlatformTransactionManager provided by DefaultBatchConfigurer.
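For reference, a minimal sketch of the fixed class header (the bean methods stay as posted):
@Configuration
@EnableScheduling
@EnableBatchProcessing
public class ScheduleConfig {
    // ... beans as above ...
}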
I am implementing a small REST service which runs a batch job. I am not going to persist any metadata and my job is not parallel, so I've chosen MapJobRepositoryFactoryBean. I also have H2 on the classpath for other purposes. But when I try to run the job, I get the following error:
org.springframework.dao.EmptyResultDataAccessException: Incorrect result size: expected 1, actual 0
I figured out that the JobRepository bean I get from MapJobRepositoryFactoryBean uses JdbcJobExecutionDao instead of MapJobExecutionDao, and because of that it fails on the query
SELECT VERSION FROM BATCH_JOB_EXECUTION WHERE JOB_EXECUTION_ID=?
Why does this happen? How should I properly configure it to use the Map-based DAOs?
Currently my configuration is the following:
public class ServiceConfiguration {

    @Bean
    public PlatformTransactionManager transactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Bean
    public MapJobRepositoryFactoryBean mapJobRepositoryFactoryBean(PlatformTransactionManager transactionManager) throws Exception {
        return new MapJobRepositoryFactoryBean(transactionManager);
    }

    @Bean(name = "inMemoryJobRepository")
    public JobRepository jobRepository(MapJobRepositoryFactoryBean mapJobRepositoryFactoryBean) throws Exception {
        return mapJobRepositoryFactoryBean.getObject();
    }

    @Bean(name = "syncJobLauncher")
    public JobLauncher jobLauncher(@Qualifier("inMemoryJobRepository") JobRepository jobRepository) throws Exception {
        SimpleJobLauncher simpleJobLauncher = new SimpleJobLauncher();
        simpleJobLauncher.setJobRepository(jobRepository);
        return simpleJobLauncher;
    }
}
UPDATE:
As Michael Minella mentioned, a bean for a custom BatchConfigurer should be created:
@Bean
public BatchConfigurer batchConfigurer(@Qualifier("inMemoryJobRepository") JobRepository jobRepository,
                                       @Qualifier("syncJobLauncher") JobLauncher jobLauncher,
                                       @Qualifier("resourcelessTransactionManager") PlatformTransactionManager transactionManager,
                                       @Qualifier("inMemoryJobExplorer") JobExplorer jobExplorer) {
    return new InMemoryBatchConfigurer(jobRepository, transactionManager, jobLauncher, jobExplorer);
}
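InMemoryBatchConfigurer itself is not shown in the post; as a minimal sketch, a class matching the bean definition above could look like this (the class name and constructor come from that definition, the rest is an assumption):
public class InMemoryBatchConfigurer implements BatchConfigurer {

    private final JobRepository jobRepository;
    private final PlatformTransactionManager transactionManager;
    private final JobLauncher jobLauncher;
    private final JobExplorer jobExplorer;

    public InMemoryBatchConfigurer(JobRepository jobRepository, PlatformTransactionManager transactionManager,
            JobLauncher jobLauncher, JobExplorer jobExplorer) {
        this.jobRepository = jobRepository;
        this.transactionManager = transactionManager;
        this.jobLauncher = jobLauncher;
        this.jobExplorer = jobExplorer;
    }

    @Override
    public JobRepository getJobRepository() {
        return jobRepository;
    }

    @Override
    public PlatformTransactionManager getTransactionManager() {
        return transactionManager;
    }

    @Override
    public JobLauncher getJobLauncher() {
        return jobLauncher;
    }

    @Override
    public JobExplorer getJobExplorer() {
        return jobExplorer;
    }
}
Once a BatchConfigurer bean is present, @EnableBatchProcessing uses these map-based components instead of auto-configuring a JDBC-backed repository against the H2 DataSource.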
I am using Spring Boot + Spring Batch (annotation-based) and have come across a scenario where I have to run 2 jobs.
I have Employee and Salary records which need to be updated using Spring Batch. I have configured the BatchConfiguration classes by following the spring-batch getting started tutorial for the Employee and Salary objects, respectively named BatchConfigurationEmployee and BatchConfigurationSalary.
I have defined the ItemReader, ItemProcessor, ItemWriter and Job by following the tutorial mentioned above.
When I start my Spring Boot application, only one of the jobs runs. I want to run both BatchConfiguration classes. How can I achieve this?
********* BatchConfigurationEmployee.java *************
@Configuration
@EnableBatchProcessing
public class BatchConfigurationEmployee {

    public ItemReader<Employee> reader() {
        return new EmployeeItemReader();
    }

    @Bean
    public ItemProcessor<Employee, Employee> processor() {
        return new EmployeeItemProcessor();
    }

    @Bean
    public Job Employee(JobBuilderFactory jobs, Step s1) {
        return jobs.get("Employee")
                .incrementer(new RunIdIncrementer())
                .flow(s1)
                .end()
                .build();
    }

    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory, ItemReader<Employee> reader,
                      ItemProcessor<Employee, Employee> processor) {
        return stepBuilderFactory.get("step1")
                .<Employee, Employee>chunk(1)
                .reader(reader)
                .processor(processor)
                .build();
    }
}
The Salary class is here:
@Configuration
@EnableBatchProcessing
public class BatchConfigurationSalary {

    public ItemReader<Salary> reader() {
        return new SalaryItemReader();
    }

    @Bean
    public ItemProcessor<Salary, Salary> processor() {
        return new SalaryItemProcessor();
    }

    @Bean
    public Job salary(JobBuilderFactory jobs, Step s1) {
        return jobs.get("Salary")
                .incrementer(new RunIdIncrementer())
                .flow(s1)
                .end()
                .build();
    }

    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory, ItemReader<Salary> reader,
                      ItemProcessor<Salary, Salary> processor) {
        return stepBuilderFactory.get("step1")
                .<Salary, Salary>chunk(1)
                .reader(reader)
                .processor(processor)
                .build();
    }
}
The names of the beans have to be unique in the whole Spring context.
In both jobs, you are instantiating the reader, writer and processor with the same method name. The method name is the name that is used to identify the bean in the context.
In both job definitions, you have reader(), writer() and processor(). They will overwrite each other. Give them unique names like readerEmployee(), readerSalary() and so on.
That should solve your problem.
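A minimal sketch of the renaming (the new method names are just examples):
@Bean
public ItemReader<Employee> readerEmployee() {
    return new EmployeeItemReader();
}

@Bean
public ItemReader<Salary> readerSalary() {
    return new SalaryItemReader();
}
Note that the two step1(...) bean methods collide in the same way and should also get unique names, e.g. stepEmployee(...) and stepSalary(...).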
Your jobs are not annotated with @Bean, so the Spring context doesn't know about them.
Have a look at the class JobLauncherCommandLineRunner. All beans in the Spring context implementing the Job interface will be injected, and all jobs that are found will be executed (this happens inside the method executeLocalJobs in JobLauncherCommandLineRunner).
If, for some reason, you don't want to have them as beans in the context, then you have to register your jobs with the JobRegistry (the method executeRegisteredJobs of JobLauncherCommandLineRunner takes care of launching the registered jobs).
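As a hedged sketch, manual registration could look like this (using ReferenceJobFactory from org.springframework.batch.core.configuration.support; the surrounding component is assumed):
@Autowired
private JobRegistry jobRegistry;

public void register(Job job) throws DuplicateJobException {
    // wrap the job in a factory and add it to the registry so that
    // executeRegisteredJobs can find and launch it
    jobRegistry.register(new ReferenceJobFactory(job));
}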
BTW, you can control which jobs should be launched with the property
spring.batch.job.names= # comma-separated list of job names to execute on startup (for instance job1,job2)
By default, all jobs found in the context are executed.
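For the two jobs above, that could look like this in application.properties (the job names come from the jobs.get(...) calls):
spring.batch.job.names=Employee,Salary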
I feel that this is also a pretty good way to run multiple jobs.
I am making use of a JobLauncher to configure and execute the jobs, and independent CommandLineRunner implementations to run them. These are ordered to make sure they are executed sequentially, in the required order.
Apologies for the big post, but I wanted to give a clear picture of what can be achieved using JobLauncher configurations with multiple command line runners.
This is the current BeanConfiguration that I have:
@Configuration
public class BeanConfiguration {

    @Autowired
    DataSource dataSource;

    @Autowired
    PlatformTransactionManager transactionManager;

    @Bean(name = "jobOperator")
    public JobOperator jobOperator(JobExplorer jobExplorer, JobRegistry jobRegistry) throws Exception {
        SimpleJobOperator jobOperator = new SimpleJobOperator();
        jobOperator.setJobExplorer(jobExplorer);
        jobOperator.setJobRepository(createJobRepository());
        jobOperator.setJobRegistry(jobRegistry);
        jobOperator.setJobLauncher(jobLauncher());
        return jobOperator;
    }

    /**
     * Configure the JobLauncher to execute jobs asynchronously
     * using the ThreadPoolTaskExecutor.
     * @return
     * @throws Exception
     */
    @Bean
    public JobLauncher jobLauncher() throws Exception {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(createJobRepository());
        jobLauncher.setTaskExecutor(taskExecutor());
        jobLauncher.afterPropertiesSet();
        return jobLauncher;
    }

    // Read the datasource and set it in the job repository
    protected JobRepository createJobRepository() throws Exception {
        JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
        factory.setDataSource(dataSource);
        factory.setTransactionManager(transactionManager);
        factory.setIsolationLevelForCreate("ISOLATION_SERIALIZABLE");
        //factory.setTablePrefix("BATCH_");
        factory.setMaxVarCharLength(10000);
        return factory.getObject();
    }

    @Bean
    public RestTemplateBuilder restTemplateBuilder() {
        return new RestTemplateBuilder().additionalInterceptors(new CustomRestTemplateLoggerInterceptor());
    }

    @Bean(name = AppConstants.JOB_DECIDER_BEAN_NAME_EMAIL_INIT)
    public JobExecutionDecider jobDecider() {
        return new EmailInitJobExecutionDecider();
    }

    @Bean
    public ThreadPoolTaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(15);
        taskExecutor.setMaxPoolSize(20);
        taskExecutor.setQueueCapacity(30);
        return taskExecutor;
    }
}
I have set up the database to hold the job execution details in Postgres, and hence the DatasourceConfiguration looks like this (two different beans for two different profiles/environments):
@Configuration
public class DatasourceConfiguration implements EnvironmentAware {

    private Environment env;

    @Bean
    @Qualifier(AppConstants.DB_BEAN)
    @Profile("dev")
    public DataSource getDataSource() {
        HikariDataSource ds = new HikariDataSource();
        boolean isAutoCommitEnabled = env.getProperty("spring.datasource.hikari.auto-commit") != null ? Boolean.parseBoolean(env.getProperty("spring.datasource.hikari.auto-commit")) : false;
        ds.setAutoCommit(isAutoCommitEnabled);
        // Connection test query is for legacy connections
        //ds.setConnectionInitSql(env.getProperty("spring.datasource.hikari.connection-test-query"));
        ds.setPoolName(env.getProperty("spring.datasource.hikari.pool-name"));
        ds.setDriverClassName(env.getProperty("spring.datasource.driver-class-name"));
        long timeout = env.getProperty("spring.datasource.hikari.idleTimeout") != null ? Long.parseLong(env.getProperty("spring.datasource.hikari.idleTimeout")) : 40000;
        ds.setIdleTimeout(timeout);
        long maxLifeTime = env.getProperty("spring.datasource.hikari.maxLifetime") != null ? Long.parseLong(env.getProperty("spring.datasource.hikari.maxLifetime")) : 1800000;
        ds.setMaxLifetime(maxLifeTime);
        ds.setJdbcUrl(env.getProperty("spring.datasource.url"));
        ds.setUsername(env.getProperty("spring.datasource.username"));
        ds.setPassword(env.getProperty("spring.datasource.password"));
        int poolSize = env.getProperty("spring.datasource.hikari.maximum-pool-size") != null ? Integer.parseInt(env.getProperty("spring.datasource.hikari.maximum-pool-size")) : 10;
        ds.setMaximumPoolSize(poolSize);
        return ds;
    }

    @Bean
    @Qualifier(AppConstants.DB_PROD_BEAN)
    @Profile("prod")
    public DataSource getProdDatabase() {
        HikariDataSource ds = new HikariDataSource();
        boolean isAutoCommitEnabled = env.getProperty("spring.datasource.hikari.auto-commit") != null ? Boolean.parseBoolean(env.getProperty("spring.datasource.hikari.auto-commit")) : false;
        ds.setAutoCommit(isAutoCommitEnabled);
        // Connection test query is for legacy connections
        //ds.setConnectionInitSql(env.getProperty("spring.datasource.hikari.connection-test-query"));
        ds.setPoolName(env.getProperty("spring.datasource.hikari.pool-name"));
        ds.setDriverClassName(env.getProperty("spring.datasource.driver-class-name"));
        long timeout = env.getProperty("spring.datasource.hikari.idleTimeout") != null ? Long.parseLong(env.getProperty("spring.datasource.hikari.idleTimeout")) : 40000;
        ds.setIdleTimeout(timeout);
        long maxLifeTime = env.getProperty("spring.datasource.hikari.maxLifetime") != null ? Long.parseLong(env.getProperty("spring.datasource.hikari.maxLifetime")) : 1800000;
        ds.setMaxLifetime(maxLifeTime);
        ds.setJdbcUrl(env.getProperty("spring.datasource.url"));
        ds.setUsername(env.getProperty("spring.datasource.username"));
        ds.setPassword(env.getProperty("spring.datasource.password"));
        int poolSize = env.getProperty("spring.datasource.hikari.maximum-pool-size") != null ? Integer.parseInt(env.getProperty("spring.datasource.hikari.maximum-pool-size")) : 10;
        ds.setMaximumPoolSize(poolSize);
        return ds;
    }

    public void setEnvironment(Environment environment) {
        this.env = environment;
    }
}
Make sure that the initial app launcher catches the exit code returned once the job execution terminates (either failed or completed) so that you can gracefully shut down the JVM. Otherwise, using the JobLauncher keeps the JVM alive even after all jobs have completed:
@SpringBootApplication
@ComponentScan(basePackages = "com.XXXX.Feedback_File_Processing.*")
@EnableBatchProcessing
public class FeedbackFileProcessingApp {

    public static void main(String[] args) throws Exception {
        ApplicationContext appContext = SpringApplication.run(FeedbackFileProcessingApp.class, args);
        // The batch job has finished by this point because the
        // ApplicationContext is not 'ready' until the job is finished.
        // Also, use System.exit to force the Java process to finish with
        // the exit code returned from the Spring application.
        System.exit(SpringApplication.exit(appContext));
    }
}
... and so on. You can configure your own decider and your own job/steps as said above for two different configurations, and use them separately in command line runners (since the post is getting bigger, I am giving the details of just the jobs and the command line runners).
These are the two jobs:
@Configuration
public class DefferalJobConfiguration {

    @Autowired
    JobLauncher joblauncher;

    @Autowired
    private JobBuilderFactory jobFactory;

    @Autowired
    private StepBuilderFactory stepFactory;

    @Bean
    @StepScope
    public Tasklet newSampleTasklet() {
        return ((stepExecution, chunkContext) -> {
            System.out.println("execution of step after flow");
            return RepeatStatus.FINISHED;
        });
    }

    @Bean
    public Step sampleStep() {
        return stepFactory.get("sampleStep").listener(new CustomStepExecutionListener())
                .tasklet(newSampleTasklet()).build();
    }

    @Autowired
    @Qualifier(AppConstants.FLOW_BEAN_NAME_EMAIL_INITIATION)
    private Flow emailInitFlow;

    @Autowired
    @Qualifier(AppConstants.JOB_DECIDER_BEAN_NAME_EMAIL_INIT)
    private JobExecutionDecider jobDecider;

    @Autowired
    @Qualifier(AppConstants.STEP_BEAN_NAME_ITEMREADER_FETCH_DEFERRAL_CONFIG)
    private Step deferralConfigStep;

    @Bean(name = AppConstants.JOB_BEAN_NAME_DEFERRAL)
    public Job deferralJob() {
        return jobFactory.get(AppConstants.JOB_NAME_DEFERRAL)
                .start(emailInitFlow)
                .on("COMPLETED").to(sampleStep())
                .next(jobDecider).on("COMPLETED").to(deferralConfigStep)
                .on("FAILED").fail()
                .end().build();
    }
}
@Configuration
public class TestFlowJobConfiguration {

    @Autowired
    private JobBuilderFactory jobFactory;

    @Autowired
    @Qualifier("testFlow")
    private Flow testFlow;

    @Bean(name = "testFlowJob")
    public Job testFlowJob() {
        return jobFactory.get("testFlowJob").start(testFlow).end().build();
    }
}
Here are the command line runners (I am making sure that the first job is completed before the second job is initialized, but it is totally up to the user to execute them in parallel following a different strategy):
@Component
@Order(1)
public class DeferralCommandLineRunner implements CommandLineRunner, EnvironmentAware {

    // If the jobLauncher is not used, jobs are launched by default using a
    // SimpleJobLauncher with default configuration (assumption); hence the
    // jobLauncher modified with the values set in the BeanConfiguration of
    // Spring Batch is used here.
    private Environment env;

    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    @Qualifier(AppConstants.JOB_BEAN_NAME_DEFERRAL)
    Job deferralJob;

    @Override
    public void run(String... args) throws Exception {
        JobParameters jobparams = new JobParametersBuilder()
                .addString("run.time", LocalDateTime.now()
                        .format(DateTimeFormatter.ofPattern(AppConstants.JOB_DATE_FORMATTER_PATTERN)).toString())
                .addString("instance.name",
                        (deferralJob.getName() != null) ? deferralJob.getName() + '-' + UUID.randomUUID().toString()
                                : UUID.randomUUID().toString())
                .toJobParameters();
        jobLauncher.run(deferralJob, jobparams);
    }

    @Override
    public void setEnvironment(Environment environment) {
        this.env = environment;
    }
}
@Component
@Order(2)
public class TestJobCommandLineRunner implements CommandLineRunner {

    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    @Qualifier("testFlowJob")
    Job testjob;

    @Autowired
    @Qualifier("jobOperator")
    JobOperator operator;

    @Override
    public void run(String... args) throws Exception {
        JobParameters jobParam = new JobParametersBuilder().addString("name", UUID.randomUUID().toString())
                .toJobParameters();
        System.out.println(operator.getJobNames());
        try {
            Set<Long> deferralExecutionIds = operator.getRunningExecutions(AppConstants.JOB_NAME_DEFERRAL);
            System.out.println("deferralExecutionIds:" + deferralExecutionIds);
            operator.stop(deferralExecutionIds.iterator().next());
        } catch (NoSuchJobException | NoSuchJobExecutionException | JobExecutionNotRunningException e) {
            // just add logging here
            System.out.println("exception caught:" + e.getMessage());
        }
        jobLauncher.run(testjob, jobParam);
    }
}
Hope this gives a complete idea of how it can be done. I am using spring-boot-starter-batch:jar:2.0.0.RELEASE
I am writing a Spring Batch job with the idea of scaling it when required.
My ApplicationContext looks like this:
@Configuration
@EnableBatchProcessing
@EnableTransactionManagement
@ComponentScan(basePackages = "in.springbatch")
@PropertySource(value = {"classpath:springbatch.properties"})
public class ApplicationConfig {

    @Autowired
    Environment environment;

    @Autowired
    private JobBuilderFactory jobs;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    public Job job() throws Exception {
        return jobs.get("spring_batch")
                .flow(step()).end()
                .build();
    }

    @Bean(name = "dataSource", destroyMethod = "close")
    public DataSource dataSource() {
        BasicDataSource basicDataSource = new BasicDataSource();
        return basicDataSource;
    }

    @Bean
    public JobRepository jobRepository() throws Exception {
        JobRepositoryFactoryBean jobRepositoryFactoryBean = new JobRepositoryFactoryBean();
        jobRepositoryFactoryBean.setTransactionManager(transactionManager());
        jobRepositoryFactoryBean.setDataSource(dataSource());
        return jobRepositoryFactoryBean.getObject();
    }

    @Bean(name = "batchstep")
    public Step step() throws Exception {
        return stepBuilderFactory.get("batchstep").allowStartIfComplete(true)
                .transactionManager(transactionManager())
                .chunk(2).reader(batchReader()).processor(processor()).writer(writer()).build();
    }

    @Bean
    ItemReader batchReader() throws Exception {
        System.out.println(Thread.currentThread().getName() + "reader");
        HibernateCursorItemReader<Source> hibernateCursorItemReader = new HibernateCursorItemReader<>();
        hibernateCursorItemReader.setQueryString("from Source");
        hibernateCursorItemReader.setFetchSize(2);
        hibernateCursorItemReader.setSessionFactory(sessionFactory().getObject());
        hibernateCursorItemReader.close();
        return hibernateCursorItemReader;
    }

    @Bean
    public ItemProcessor processor() {
        return new BatchProcessor();
    }

    @Bean
    public ItemWriter writer() {
        return new BatchWriter();
    }

    public TaskExecutor taskExecutor() {
        SimpleAsyncTaskExecutor asyncTaskExecutor = new SimpleAsyncTaskExecutor("spring_batch");
        asyncTaskExecutor.setConcurrencyLimit(5);
        return asyncTaskExecutor;
    }

    @Bean
    public LocalSessionFactoryBean sessionFactory() {
        LocalSessionFactoryBean sessionFactory = new LocalSessionFactoryBean();
        sessionFactory.setDataSource(dataSource());
        sessionFactory.setPackagesToScan(new String[]{"in.springbatch.entity"});
        sessionFactory.setHibernateProperties(hibernateProperties());
        return sessionFactory;
    }

    @Bean
    public PersistenceExceptionTranslationPostProcessor exceptionTranslation() {
        return new PersistenceExceptionTranslationPostProcessor();
    }

    @Bean
    @Autowired
    public HibernateTransactionManager transactionManager() {
        HibernateTransactionManager txManager = new HibernateTransactionManager();
        txManager.setSessionFactory(sessionFactory().getObject());
        return txManager;
    }

    Properties hibernateProperties() {
        return new Properties() {
            {
                setProperty("hibernate.hbm2ddl.auto", environment.getProperty("hibernate.hbm2ddl.auto"));
                setProperty("hibernate.dialect", environment.getProperty("hibernate.dialect"));
                setProperty("hibernate.globally_quoted_identifiers", "false");
            }
        };
    }
}
With the above configuration I am able to read from the DB, process the data and write to the DB. I am using a chunk size of 2 and reading 2 records from the cursor using HibernateCursorItemReader, and my query to read from the DB is based on date, to pick the current date's records.
So far I am able to achieve the desired behavior as well as restartability, with the job only picking records which were not processed due to a failure in the previous run.
Now my requirement is to make the batch use multiple threads to process data and write to the DB.
My processor and writer look like this:
@Component
public class BatchProcessor implements ItemProcessor<Source, DestinationDto> {

    @Override
    public DestinationDto process(Source source) throws Exception {
        System.out.println(Thread.currentThread().getName() + ":" + source);
        DestinationDto destination = new DestinationDto();
        destination.setName(source.getName());
        destination.setValue(source.getValue());
        destination.setSourceId(source.getSourceId().toString());
        return destination;
    }
}

@Component
public class BatchWriter implements ItemWriter<DestinationDto> {

    @Autowired
    IBatchDao batchDao;

    @Override
    public void write(List<? extends DestinationDto> list) throws Exception {
        System.out.println(Thread.currentThread().getName() + ":" + list);
        batchDao.saveToDestination((List<DestinationDto>) list);
    }
}
I updated my step and added a ThreadPoolTaskExecutor as follows
#Bean(name = "batchstep")
public Step step() throws Exception {
return stepBuilderFactory.get("batchstep").allowStartIfComplete(true).
transactionManager(transactionManager()).chunk(1).reader(batchReader()).
processor(processor()).writer(writer()).taskExecutor(taskExecutor()).build();
}
After this, my processor is getting called by multiple threads, but with the same source data. Is there anything extra I need to do?
This is a big question. Your best bet at getting a good answer would be to look through the Scaling and Parallel Processing chapter in the Spring Batch documentation (here). There might also be some multi-threading samples in the Spring Batch examples (here).
An easy way to thread a Spring Batch job is to create a "future processor": you put all your processing logic in a Future object, and your processor class only adds objects to the future. Your writer class then waits on the futures to finish before performing the write. Sorry, I don't have a sample to point you to for this, but if you have specific questions I can try and answer!
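As a hedged sketch of that pattern: the spring-batch-integration module ships AsyncItemProcessor and AsyncItemWriter, which implement exactly this future-based hand-off (the Source/DestinationDto types from the question are reused here for illustration; the bean names are assumptions):
@Bean
public AsyncItemProcessor<Source, DestinationDto> asyncProcessor(ItemProcessor<Source, DestinationDto> delegate,
        TaskExecutor taskExecutor) {
    // each item is processed on a pool thread; a Future is passed downstream
    AsyncItemProcessor<Source, DestinationDto> asyncProcessor = new AsyncItemProcessor<>();
    asyncProcessor.setDelegate(delegate);
    asyncProcessor.setTaskExecutor(taskExecutor);
    return asyncProcessor;
}

@Bean
public AsyncItemWriter<DestinationDto> asyncWriter(ItemWriter<DestinationDto> delegate) {
    // waits on each Future before delegating the actual write
    AsyncItemWriter<DestinationDto> asyncWriter = new AsyncItemWriter<>();
    asyncWriter.setDelegate(delegate);
    return asyncWriter;
}
When wired this way, the step's chunk types become <Source, Future<DestinationDto>>.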
I'd like to use JpaItemWriter to batch-persist entities. But when I use the following code to persist, I'm told:
Hibernate:
    select
        nextval ('hibernate_sequence')
[] 2014-03-19 15:46:02,237 ERROR : TransactionRequiredException: no transaction is in progress
How can I enable transactions on the following:
@Bean
public ItemWriter<T> writer() {
    JpaItemWriter<T> itemWriter = new JpaItemWriter<>();
    itemWriter.setEntityManagerFactory(emf);
    return itemWriter;
}
@Configuration
@EnableTransactionManagement
@EnableBatchProcessing
class Config {

    @Bean
    public LocalContainerEntityManagerFactoryBean emf() {
        LocalContainerEntityManagerFactoryBean emf = new LocalContainerEntityManagerFactoryBean();
        emf.setDataSource(dataSource());
        emf.setPackagesToScan("my.package");
        emf.setJpaVendorAdapter(jpaAdapter());
        emf.setJpaProperties(jpaProperties());
        return emf;
    }
}
Edit:
@Bean
public Job airlineJob(JobBuilderFactory jobs, Step step) {
    return jobs.get("job")
            .start(step)
            .build();
}

// Reader is a `FlatFileItemReader`, writer is `CustomItemWriter`.
@Bean
public Step step(StepBuilderFactory steps,
                 MultiResourceItemReader<T> rea,
                 ItemProcessor<T, T> pro,
                 ItemWriter<T> wr) {
    return steps.get("step")
            .reader(rea)
            .processor(pro)
            .writer(wr)
            .build();
}

// use the same datasource and tx manager as in the full web application
@Bean
public JobLauncher launcher(PlatformTransactionManager tx, DataSource ds) throws Exception {
    SimpleJobLauncher launcher = new SimpleJobLauncher();
    JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
    factory.setDataSource(ds);
    factory.setTransactionManager(tx);
    factory.afterPropertiesSet();
    launcher.setJobRepository(factory.getObject());
    return launcher;
}
Edit 2 as response to @Haim:
@Bean
public JpaItemWriter<T> jpaItemWriter(EntityManagerFactory emf) {
    JpaItemWriter<T> writer = new JpaItemWriter<T>();
    writer.setEntityManagerFactory(emf);
    return writer;
}
I agree with Michael Minella: the Spring Batch job repository does not like to share its transaction manager with others. The logic is simple: if you share your job transaction manager with your step transaction manager, then upon a step failure it will roll back both the step and the data written to the job repository. This means that you will not persist the data needed for a job restart.
In order to use two transaction managers you need to:
Delete @EnableTransactionManagement, in case you use it only for the @Transactional above.
Define an additional transaction manager:
@Bean
@Qualifier("jpaTrx")
public PlatformTransactionManager jpaTransactionManager() {
    return new JpaTransactionManager(emf());
}
Set the transaction manager on your step:
@Autowired
@Qualifier("jpaTrx")
PlatformTransactionManager jpaTransactionManager;

// Reader is a FlatFileItemReader, writer is CustomItemWriter.
@Bean
public Step step(StepBuilderFactory steps,
                 MultiResourceItemReader<T> rea,
                 ItemProcessor<T, T> pro,
                 ItemWriter<T> wr) {
    return steps.get("step")
            // attach the tx manager
            .transactionManager(jpaTransactionManager)
            .reader(rea)
            .processor(pro)
            .writer(wr)
            .build();
}
Write your own JpaTransactionManager bean for your datasource instead of using Spring's default transaction manager:
@Autowired
private DataSource dataSource;

@Bean
@Primary
public JpaTransactionManager jpaTransactionManager() {
    final JpaTransactionManager tm = new JpaTransactionManager();
    tm.setDataSource(dataSource);
    return tm;
}
I solved it by creating my own transactional JpaItemWriter:
@Component
public class CustomItemWriter<T> extends JpaItemWriter<T> {

    @Override
    @Transactional
    public void write(List<? extends T> items) {
        super.write(items);
    }
}