I am trying to implement Spring Batch, but I am facing a strange problem: our ItemReader class executes only once.
Here are the details.
Suppose we have 1000 rows in the DB.
Our ItemReader fetches those 1000 rows from the DB and passes the list to the ItemWriter.
The ItemWriter successfully deletes all items.
The ItemReader then tries to fetch data from the DB again, finds none, and returns null, so execution stops.
But we have configured the batch to be executed by the Quartz scheduler every minute.
Now if we insert, say, 1000 rows into the DB via a dump import, the batch job should pick up this data on the next execution, but it does not even execute, although the JobLauncher does run.
Configuration:
1. We have an ItemReader and ItemWriter with a commit interval of 1.
<batch:job id="csrfTokenBatchJob">
<batch:step id="step1">
<tasklet>
<chunk reader="csrfTokenReader" writer="csrfTokenWriter" commit-interval="1"></chunk>
</tasklet>
</batch:step>
</batch:job>
2. The job is scheduled to be triggered every minute.
<bean class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
<property name="triggers">
<bean id="cronTrigger" class="org.springframework.scheduling.quartz.CronTriggerBean">
<property name="jobDetail" ref="jobDetail" />
<property name="cronExpression" value="0 0/1 * * * ?" />
</bean>
</property>
</bean>
3. Job configuration:
<bean id="jobDetail" class="org.springframework.scheduling.quartz.JobDetailBean">
<property name="jobClass" value="com.tavant.oauth.batch.job.CSRFTokenJobLauncher" />
<property name="jobDataAsMap">
<map>
<entry key="jobName" value="csrfTokenCleanUpBatchJob" />
<entry key="jobLocator" value-ref="jobRegistry" />
<entry key="jobLauncher" value-ref="jobLauncher" />
</map>
</property>
</bean>
The first time it executes successfully, but afterwards it does not execute, even though I can see in the logs that the JobLauncher is running.
#Component("csrfTokenReader")
#Scope(value="step")
public class CSRFTokenReader implements ItemReader<List<CSRFToken>> {
private static final Logger logger = LoggerFactory.getLogger(CSRFTokenReader.class);
@Autowired
private CleanService cleanService;
@Override
public List<CSRFToken> read() {
List<CSRFToken> csrfTokenList = null;
try{
int keepUpto = Integer.valueOf(PropertiesContext.getInstance().getProperties().getProperty("token.keep", "1"));
Calendar calTime = Calendar.getInstance();
calTime.add(Calendar.HOUR, -keepUpto);
Date toKeep = calTime.getTime();
csrfTokenList = cleanService.getCSRFTokenByTime(toKeep);
}
catch(Throwable th){
logger.error("Exception in running job at " + new Date(), th);
}
if(CollectionUtils.isEmpty(csrfTokenList)){
return null;
}
return csrfTokenList;
}
}
EDIT:
public class CSRFTokenJobLauncher extends QuartzJobBean {
static final String JOB_NAME = "jobName";
private static final Logger log = LoggerFactory.getLogger(CSRFTokenJobLauncher.class);
private JobLocator jobLocator;
private JobLauncher jobLauncher;
public void setJobLocator(JobLocator jobLocator) {
this.jobLocator = jobLocator;
}
public void setJobLauncher(JobLauncher jobLauncher) {
this.jobLauncher = jobLauncher;
}
@Override
protected void executeInternal(JobExecutionContext context) {
Map<String, Object> jobDataMap = context.getMergedJobDataMap();
String jobName = (String) jobDataMap.get(JOB_NAME);
log.info("Quartz trigger firing with Spring Batch jobName="+jobName);
JobParameters jobParameters = getJobParametersFromJobMap(jobDataMap);
try {
jobLauncher.run(jobLocator.getJob(jobName), jobParameters);
}
catch (JobExecutionException e) {
log.error("Could not execute job.", e);
}
}
private JobParameters getJobParametersFromJobMap(Map<String, Object> jobDataMap) {
JobParametersBuilder builder = new JobParametersBuilder();
for (Entry<String, Object> entry : jobDataMap.entrySet()) {
String key = entry.getKey();
Object value = entry.getValue();
if (value instanceof String && !key.equals(JOB_NAME)) {
builder.addString(key, (String) value);
}
else if (value instanceof Float || value instanceof Double) {
builder.addDouble(key, ((Number) value).doubleValue());
}
else if (value instanceof Integer || value instanceof Long) {
builder.addLong(key, ((Number)value).longValue());
}
else if (value instanceof Date) {
builder.addDate(key, (Date) value);
}
}
return builder.toJobParameters();
}
}
After hours of wasted time, the problem now seems to be solved: I configured allow-start-if-complete="true" on the tasklet. Now the batch ItemReader executes as scheduled.
<batch:job id="csrfTokenBatchJob">
<batch:step id="step1">
<batch:tasklet allow-start-if-complete="true">
<batch:chunk reader="csrfTokenReader" writer="csrfTokenWriter" commit-interval="1"></batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
Spring Batch records every job execution in the database, which is why it needs to differentiate every job run. It checks whether a job instance with the same job parameters has already completed, and it will not start again unless some job parameter varies from the previous run or the allow-start-if-complete setting is enabled.
Option 1: As mentioned in the answer above, we can use allow-start-if-complete="true".
Option 2: Always pass a job parameter that is the current timestamp. This way the job parameter value is always unique.
JobExecution jobExecution = jobLauncher.run(reportJob, new JobParametersBuilder()
.addDate("now", new Date()).build());
Option 3: Use an incrementer, for example RunIdIncrementer, so we do not need to pass a unique job parameter every time.
@Bean
public Job job1(JobBuilderFactory jobs, Step s1) {
return jobs.get("job1")
.incrementer(new RunIdIncrementer())
.flow(s1)
.end()
.build();
}
Related
I am currently working on a batch that consumes data from a large SQL database with millions of rows.
It does some processing in the processor, which consists of grouping rows retrieved by the reader via a large SQL query with joins.
And the writer writes the result to another table.
The problem is that this batch has performance problems, because the SQL select queries take a lot of time and the steps are not executed with multiple threads.
So I'd like to run them multithreaded, but the problem is that the steps group the rows, for example by calculating a total amount over all rows of the same type.
So if I make it multithreaded, how can I do that grouping when each partition is processed in a different thread, given that there are millions of rows that I can't store in the context to retrieve after the step for the grouping?
And I can't save them in the database either, since there are millions of rows.
Do you have any idea how I can do this?
I hope I was able to explain my problem well.
Thanks in advance for your help.
I've had a task similar to yours; unluckily, we were using Java 1.7 and Spring 3.x. I can provide the configuration in XML, so you may be able to translate it into annotation configuration; I have not tried that.
<batch:job id="dualAgeRestrictionJob">
<!-- use a listener if you need one -->
<batch:listeners>
<batch:listener ref="dualAgeRestrictionJobListener" />
</batch:listeners>
<!-- master step, 10 threads (grid-size) -->
<batch:step id="dualMasterStep">
<partition step="dualSlaveStep"
partitioner="arInputRangePartitioner">
<handler grid-size="${AR_GRID_SIZE}" task-executor="taskExecutor" />
</partition>
</batch:step>
</batch:job>
<!-- here you define your reader, processor and writer, and the commit interval -->
<batch:step id="dualSlaveStep">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="arInputPagingItemReader"
writer="arOutputWriter" processor="arInputItemProcessor"
commit-interval="${AR_COMMIT_INTERVAL}" />
</batch:tasklet>
</batch:step>
<!-- The partitioner -->
<bean id="arInputRangePartitioner" class="com.example.ArInputRangePartitioner">
<property name="arInputDao" ref="arInputJDBCTemplate" />
<property name="statsForMail" ref="statsForMail" />
</bean>
<bean id="taskExecutor"
class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
<property name="corePoolSize" value="${AR_CORE_POOL_SIZE}" />
<property name="maxPoolSize" value="${AR_MAX_POOL_SIZE}" />
<property name="allowCoreThreadTimeOut" value="${AR_ALLOW_CORE_THREAD_TIME_OUT}" />
</bean>
<bean id="transactionManager"
class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
<property name="dataSource" ref="kvrDatasource" />
</bean>
The partitioner runs a query to count the rows and creates a range for each thread:
public class ArInputRangePartitioner implements Partitioner {
private static final Logger logger = LoggerFactory.getLogger(ArInputRangePartitioner.class);
private ArInputDao arInputDao;
private StatsForMail statsForMail;
@Override
public Map<String, ExecutionContext> partition(int gridSize) {
Map<String, ExecutionContext> result = new HashMap<String, ExecutionContext>();
// You can run a query and then divide the from/to range for each thread
Map<Integer,Integer> idMap = arInputDao.getOrderIdList();
Integer countRow = idMap.size();
statsForMail.setNumberOfRecords( countRow );
Integer range = countRow / gridSize;
Integer remains = countRow % gridSize;
int fromId = 1;
int toId = range;
for (int i = 1; i <= gridSize; i++) {
ExecutionContext value = new ExecutionContext();
if(i == gridSize) {
toId += remains;
}
logger.info("\nStarting : Thread {}", i);
logger.info("fromId : {}", idMap.get(fromId) );
logger.info("toId : {}", idMap.get(toId) );
value.putInt("fromId", idMap.get(fromId) );
value.putInt("toId", idMap.get(toId) );
value.putString("name", "Thread" + i);
result.put("partition" + i, value);
fromId = toId + 1;
toId += range;
}
return result;
}
public ArInputDao getArInputDao() {
return arInputDao;
}
public void setArInputDao(ArInputDao arInputDao) {
this.arInputDao = arInputDao;
}
public StatsForMail getStatsForMail() {
return statsForMail;
}
public void setStatsForMail(StatsForMail statsForMail) {
this.statsForMail = statsForMail;
}
}
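The DAO used by the partitioner is not shown above; below is a minimal sketch of what ArInputDao.getOrderIdList() might look like with a plain JdbcTemplate (as the arInputJDBCTemplate bean name suggests). The table and column names (AR_INPUT, ORDER_ID) are placeholders, not from the original configuration.
public class ArInputDao {
    private JdbcTemplate jdbcTemplate;

    public void setJdbcTemplate(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // Maps a sequential position (1..n) to the real order id, so the partitioner can
    // translate its fromId/toId positions into actual key values.
    public Map<Integer, Integer> getOrderIdList() {
        List<Integer> ids = jdbcTemplate.queryForList(
                "SELECT ORDER_ID FROM AR_INPUT ORDER BY ORDER_ID", Integer.class);
        Map<Integer, Integer> idMap = new LinkedHashMap<>();
        int position = 1;
        for (Integer id : ids) {
            idMap.put(position++, id);
        }
        return idMap;
    }
}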
This is the configuration for the reader and writer:
<bean id="arInputPagingItemReader" class="org.springframework.batch.item.database.JdbcPagingItemReader" scope="step" >
<property name="dataSource" ref="kvrDatasource" />
<property name="queryProvider">
<bean class="org.springframework.batch.item.database.support.SqlPagingQueryProviderFactoryBean" >
<property name="dataSource" ref="kvrDatasource" />
<property name="selectClause" value="${AR_INPUT_PAGING_ITEM_READER_SELECT}" />
<property name="fromClause" value="${AR_INPUT_PAGING_ITEM_READER_FROM}" /> <property name="whereClause" value="${AR_INPUT_PAGING_ITEM_READER_WHERE}" />
<property name="sortKey" value="${AR_INPUT_PAGING_ITEM_READER_SORT}" />
</bean>
</property>
<!-- Inject via the ExecutionContext in rangePartitioner -->
<property name="parameterValues">
<map>
<entry key="fromId" value="#{stepExecutionContext[fromId]}" />
<entry key="toId" value="#{stepExecutionContext[toId]}" />
</map>
</property>
<property name="pageSize" value="${AR_PAGE_SIZE}" />
<property name="rowMapper" ref="arOutInRowMapper" />
</bean>
<bean id="arOutputWriter"
class="org.springframework.batch.item.database.JdbcBatchItemWriter"
scope="step">
<property name="dataSource" ref="kvrDatasource" />
<property name="sql" value="${SQL_AR_OUTPUT_INSERT}"/>
<property name="itemSqlParameterSourceProvider">
<bean class="org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider" />
</property>
</bean>
Maybe someone knows how to convert this to modern Spring Batch / Spring Boot.
PS: Don't use too many threads, otherwise Spring Batch will lose a lot of time filling its own tables. You have to run some benchmarks to find the correct configuration.
I also suggest not using JPA/Hibernate with millions of rows; in my case I used JdbcTemplate.
EDIT: for an annotation configuration, see this question.
Here follows an example configuration with a partitioner:
@Configuration
@RequiredArgsConstructor
public class JobConfig {
private static final Logger log = LoggerFactory.getLogger(JobConfig.class);
private final JobBuilderFactory jobBuilderFactory;
private final StepBuilderFactory stepBuilderFactory;
@Value(value = "classpath:employees.csv")
private Resource resource;
#Bean("MyJob1")
public Job createJob(#Qualifier("MyStep1") Step stepMaster) {
return jobBuilderFactory.get("MyJob1")
.incrementer(new RunIdIncrementer())
.start(stepMaster)
.build();
}
#Bean("MyStep1")
public Step step(PartitionHandler partitionHandler, Partitioner partitioner) {
return stepBuilderFactory.get("MyStep1")
.partitioner("slaveStep", partitioner)
.partitionHandler(partitionHandler)
.build();
}
#Bean("slaveStep")
public Step slaveStep(FlatFileItemReader<Employee> reader) {
return stepBuilderFactory.get("slaveStep")
.<Employee, Employee>chunk(1)
.reader(reader)
.processor((ItemProcessor<Employee, Employee>) employee -> {
System.out.printf("Processed item %s%n", employee.getId());
return employee;
})
.writer(list -> {
for (Employee item : list) {
System.out.println(item);
}
})
.build();
}
@Bean
public Partitioner partitioner() {
return gridSize -> {
Map<String, ExecutionContext> result = new HashMap<>();
int lines = 0;
try(BufferedReader reader = new BufferedReader(new InputStreamReader(resource.getInputStream()))) {
while (reader.readLine() != null) lines++;
} catch (IOException e) {
throw new RuntimeException(e);
}
int range = lines / gridSize;
int remains = lines % gridSize;
int fromLine = 0;
int toLine = range;
for (int i = 1; i <= gridSize; i++) {
if(i == gridSize) {
toLine += remains;
}
ExecutionContext value = new ExecutionContext();
value.putInt("fromLine", fromLine);
value.putInt("toLine", toLine);
fromLine = toLine;
toLine += range;
result.put("partition" + i, value);
}
return result;
};
}
@StepScope
@Bean
public FlatFileItemReader<Employee> flatFileItemReader(@Value("#{stepExecutionContext['fromLine']}") int startLine, @Value("#{stepExecutionContext['toLine']}") int lastLine) {
FlatFileItemReader<Employee> reader = new FlatFileItemReader<>();
reader.setResource(resource);
DefaultLineMapper<Employee> lineMapper = new DefaultLineMapper<>();
lineMapper.setFieldSetMapper(fieldSet -> {
String[] values = fieldSet.getValues();
return Employee.builder()
.id(Integer.parseInt(values[0]))
.firstName(values[1])
.build();
});
lineMapper.setLineTokenizer(new DelimitedLineTokenizer(";"));
reader.setLineMapper(lineMapper);
reader.setCurrentItemCount(startLine);
reader.setMaxItemCount(lastLine);
return reader;
}
@Bean
public PartitionHandler partitionHandler(@Qualifier("slaveStep") Step step, TaskExecutor taskExecutor) {
TaskExecutorPartitionHandler taskExecutorPartitionHandler = new TaskExecutorPartitionHandler();
taskExecutorPartitionHandler.setTaskExecutor(taskExecutor);
taskExecutorPartitionHandler.setStep(step);
taskExecutorPartitionHandler.setGridSize(5);
return taskExecutorPartitionHandler;
}
@Bean
public TaskExecutor taskExecutor() {
ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
taskExecutor.setMaxPoolSize(5);
taskExecutor.setCorePoolSize(5);
taskExecutor.setQueueCapacity(5);
taskExecutor.afterPropertiesSet();
return taskExecutor;
}
}
We had a similar use case where I had to start by reading millions of records based on certain criteria, as input from a REST endpoint, and process them in parallel using 20-30 threads to meet extreme deadlines. A subsequent challenge was that the same complex queries were made to the database and then partitioned to be shared across the spawned threads.
Better solution:
We solved it by reading the data once, partitioning it internally, and passing it to the threads we started.
A typical batch process has the objective of reading, making some HTTP calls / manipulating the data, and writing it to a response log table.
Spring Batch provides the capability to keep track of the records processed, so that a restart can pick up the remaining lot to process. An alternative to this is a flag in your master table to mark the record as processed, so it does not need to be picked up again during a restart; a rough sketch of such a flag writer follows below.
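As an illustration of that alternative (not from the original solution), a second writer could flag processed rows; the table and column names below are hypothetical, and it could be combined with the response writer through a CompositeItemWriter.
@Bean
public JdbcBatchItemWriter<YourDataType> processedFlagWriter(DataSource dataSource) {
    JdbcBatchItemWriter<YourDataType> writer = new JdbcBatchItemWriter<>();
    writer.setDataSource(dataSource);
    // MASTER_TABLE, PROCESSED and ID are placeholders; :id is resolved from the item's "id" property.
    writer.setSql("UPDATE MASTER_TABLE SET PROCESSED = 'Y' WHERE ID = :id");
    writer.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>());
    return writer;
}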
Multiple challenges faced were:
support for joins in the reader's query
partitioning of the data
the same record being processed again
Coming to multiprocessing:
Let's say you have 10000 records and you need to process them in parallel across 5 threads.
Multiple creative solutions can be implemented, but the two most often used, which fit all use cases, would be:
partitioning the data by number of records
partitioning the data by the modulo of an index column's value, if numeric
Considering the memory the machine can serve, a suitable number of threads can be selected, e.g. 5: 10000 / 5 means each thread processes 2000 records.
Partitioning splits the data into ranges, allowing each step execution to pick one up in its own thread and run it. For the above step we need to split those ranges and pass them at query execution time, so that each query fetches only the records for its range and the process continues in a separate thread.
Thread 0 : 1–2000
Thread 1 : 2001–4000
Thread 2 : 4001–6000
Thread 3 : 6001–8000
Thread 4 : 8001–10000
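A minimal sketch of a partitioner for this range-based split (the total of 10000 records is hard-coded here for illustration; in practice it would come from a count query):
public class RangePartitioner implements Partitioner {

    private int totalRecords = 10000; // would normally come from a count query

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> partitions = new HashMap<>();
        int range = totalRecords / gridSize; // 10000 / 5 = 2000 records per thread
        int from = 1;
        for (int i = 0; i < gridSize; i++) {
            int to = (i == gridSize - 1) ? totalRecords : from + range - 1;
            ExecutionContext ctx = new ExecutionContext();
            ctx.putInt("fromId", from);
            ctx.putInt("toId", to);
            partitions.put("partition" + i, ctx); // Thread 0: 1-2000, Thread 1: 2001-4000, ...
            from = to + 1;
        }
        return partitions;
    }
}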
Another partitioning approach would be to assign the threads 0 to 4 and query based on the modulo of the key. One drawback of this could be that one particular partition receives more load than the others, whereas the previous approach ensures that everyone gets a fair share.
The split data is passed on to a separate thread, which starts processing it and writes data at the commit interval (chunk size) configured in the step.
Code :
READER
@Bean
@StepScope
public JdbcPagingItemReader<YourDataType> dataReaders(
#Value("#{jobParameters[param1]}") final String param1,
#Value("#{stepExecutionContext['modulo']}") Long modulo) throws Exception {
logger.info("Thread started reading for modulo index : " + modulo);
JdbcPagingItemReader<YourDataType> reader = new JdbcPagingItemReader <> ();
reader.setDataSource(getDataSource());
reader.setRowMapper(new YourDataTypeRowMapper());
reader.setQueryProvider(queryProvider(param1, modulo));
return reader;
}

public OraclePagingQueryProvider queryProvider(String param1, Long modulo) throws Exception {
    OraclePagingQueryProvider provider = new OraclePagingQueryProvider();
    provider.setSelectClause("your elements to query");
    provider.setFromClause("your tables / joined tables");
    provider.setWhereClause("your where clauses AND MOD(TO_NUMBER(yourKey), 5) = " + modulo);
    Map<String, Order> sortKeys = new HashMap<>();
    sortKeys.put("yourSortKey", Order.ASCENDING);
    provider.setSortKeys(sortKeys);
    return provider;
}
Sample data reader: param1 is any parameter the user wants to input; modulo is a step execution parameter, passed in from the Partitioner object.
The partitioner object, if used with modulo 5, would produce modulo values 0|1|2|3|4, and this would spawn 5 threads that each interact with the reader and fetch data for their divided set.
WRITER
@Bean
public JdbcBatchItemWriter<YourDataType> dataWriter() throws Exception {
logger.info("Initializing data writer");
JdbcBatchItemWriter<YourDataType> databaseItemWriter = new JdbcBatchItemWriter<>();
databaseItemWriter.setDataSource(injectyourdatasourcehere);
databaseItemWriter.setSql(INSERT_QUERY_HERE);
ItemPreparedStatementSetter<RespData> ps = new YourResponsePreparedStatement();
databaseItemWriter.setItemPreparedStatementSetter(ps);
return databaseItemWriter;
}
public class YourResponsePreparedStatement implements ItemPreparedStatementSetter<RespData> {
public void setValues(RespData respData, PreparedStatement preparedStatement) throws SQLException {
preparedStatement.setString(1, respData.getYourData());
}
}
The response writer logs the response to a table to keep tabs on the processed data for analytics or business reporting.
PROCESSOR
@Bean
public ItemProcessor<YourDataType,RespData> processor() {
return new YOURProcessor();
}
The processor is where the core logic for the data manipulation is written. The response returned is of the type expected by the data writer.
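For completeness, a minimal sketch of what such a processor could look like; YourDataType and RespData are the hypothetical types from the reader and writer above, and the mapping shown is only a placeholder:
public class YOURProcessor implements ItemProcessor<YourDataType, RespData> {

    @Override
    public RespData process(YourDataType item) throws Exception {
        // Core manipulation logic goes here: make the HTTP calls, transform fields, etc.
        RespData response = new RespData();
        response.setYourData("processed: " + item); // placeholder mapping, assuming such a setter exists
        return response;
    }
}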
If you wish to skip the auto-creation of the Spring Batch tables, overriding the batch configuration solves the issue.
@Configuration
@EnableAutoConfiguration
@EnableBatchProcessing
public class BatchConfiguration extends DefaultBatchConfigurer {
@Override
public void setDataSource(DataSource dataSource) {}
}
Otherwise an exception like this could be encountered:
at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: org.springframework.dao.CannotSerializeTransactionException:
PreparedStatementCallback; SQL [INSERT into BATCH_JOB_INSTANCE(JOB_INSTANCE_ID, JOB_NAME, JOB_KEY, VERSION) values (?, ?, ?, ?)];
ORA-08177: can't serialize access for this transaction;
nested exception is java.sql.SQLException: ORA-08177: can't serialize access for this transaction
A column range partitioner can be created as:
@Component
public class ColumnRangePartitioner implements Partitioner {
@Override
public Map<String, ExecutionContext> partition(int gridSize) {
Map<String, ExecutionContext> result = new HashMap<>();
int start = 0;
while (start < gridSize) {
ExecutionContext value = new ExecutionContext();
result.put("partition : " + start, value);
value.putInt("modulo", start);
start += 1;
}
return result;
}
}
Setting up the job and step:
Our job focuses on executing step1, which spawns threads based on the partitioner provided (here the column range partitioner) to process the step.
Grid size is the number of parallel threads (and the divisor used when calculating the modulo).
Every processStep execution reads the data for the modulo assigned to that specific thread, processes it, and then writes it.
@Bean
public ColumnRangePartitioner getParitioner () throws Exception {
ColumnRangePartitioner columnRangePartitioner = new ColumnRangePartitioner();
return columnRangePartitioner;
}
@Bean
public Step step1(@Qualifier("processStep") Step processStep,
StepBuilderFactory stepBuilderFactory) throws Exception {
return stepBuilderFactory.get("step1")
.listener(jobCompletionNotifier)
.partitioner(processStep.getName(),getParitioner())
.step(processStep)
.gridSize(parallelThreads)
.taskExecutor(taskExecutor())
.build();
}
@Bean
public Step processStep(
@Qualifier("DataReader") ItemReader<ReadType> reader,
@Qualifier("LogWRITE") ItemWriter<WriterType> writer,
StepBuilderFactory stepBuilderFactory) throws Exception {
return stepBuilderFactory.get("processStep")
.<ReadType,WriterType> chunk(1)
.reader(reader)
.processor(processor())
.writer(writer)
.faultTolerant()
.skip(Exception.class)
.skipLimit(exceptionLimit)
.build();
}
@Bean
public SimpleAsyncTaskExecutor taskExecutor() {
SimpleAsyncTaskExecutor asyncTaskExecutor = new SimpleAsyncTaskExecutor();
return asyncTaskExecutor;
}
@Bean
public Job ourJob(@Qualifier("step1") Step step1, JobBuilderFactory jobBuilderFactory) throws Exception {
return jobBuilderFactory.get("ourjob")
.start(step1)
.incrementer(new RunIdIncrementer())
.preventRestart()
.build();
}
This may be a fairly standard Spring Batch solution, but it is applicable to any migration requirement involving commonly used SQL DB / Java based solutions.
We added a few customizations to the application:
Avoid executing the join query again and then filtering: complex joins can impact database performance, so a better solution is to fetch the data once and split it internally. The memory used by the application will be large, and the map will hold all the data your query fetches, but Java is capable of handling that. The fetched data can then be passed to a ListItemReader so that each thread processes its own list of data in parallel, as sketched below.
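A hedged sketch of how the pre-fetched, internally split data could be handed to each partition through a step-scoped ListItemReader; partitionedData is assumed to be a map populated once up front when the full query result was fetched and split, and partitionKey an entry the partitioner put into each step execution context (both names are illustrative, not from the original code):
@Bean
@StepScope
public ListItemReader<YourDataType> listItemReader(
        @Value("#{stepExecutionContext['partitionKey']}") String partitionKey) {
    // Each thread reads only its own pre-split slice; no further database round trips.
    List<YourDataType> slice = partitionedData.get(partitionKey);
    return new ListItemReader<>(slice);
}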
For processing parallel requests (not threads, but parallel API calls to this application), a modification can be made to process a given query only once by holding a lock on it via a semaphore, so that other threads wait on it. Once the lock is released, the waiting threads find the data already present and the DB is not queried again. A rough sketch of the idea follows.
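This is a simplified illustration of the "query once, others wait" idea using one semaphore per query key, under the assumption of a hypothetical runExpensiveQuery DAO call; it is not the actual implementation:
private final ConcurrentHashMap<String, Semaphore> locks = new ConcurrentHashMap<>();
private final ConcurrentHashMap<String, List<YourDataType>> cache = new ConcurrentHashMap<>();

public List<YourDataType> fetchOnce(String queryKey) throws InterruptedException {
    Semaphore lock = locks.computeIfAbsent(queryKey, k -> new Semaphore(1));
    lock.acquire(); // the first caller proceeds, parallel callers for the same query wait here
    try {
        List<YourDataType> data = cache.get(queryKey);
        if (data == null) {
            data = runExpensiveQuery(queryKey); // hypothetical DAO call
            cache.put(queryKey, data);
        }
        return data; // waiters find the cached result and the DB is not queried again
    } finally {
        lock.release();
    }
}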
The full code for the above implementation would be too complex for the scope of this post. Feel free to ask if your application requires such a use case.
I would love to help with any issues regarding the same. Feel free to reach out to me (Akshay) at akshay.patell1702@gmail.com or my colleague (Sagar) at sagarnagdev61@gmail.com.
Is there any way to specify a scheduler for a specific Spring Batch job configured via XML, without a utility RunScheduler class like the one here: https://www.mkyong.com/spring-batch/spring-batch-and-spring-taskscheduler-example/?
So for now my config looks like this:
<batch:job id="testJob" job-repository="jobRepository" parent="jobParent">
<batch:step id="testStep" allow-start-if-complete="true">
<batch:tasklet>
<batch:chunk
reader="testReader"
processor="testProcessor"
writer="jmsWriter">
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
<task:scheduled-tasks>
<task:scheduled ref="testJobLauncher" method="runJob" cron="0 */5 * * * *"/>
</task:scheduled-tasks>
<bean id="testJobLauncher"
class="com.test.RunScheduler"
p:job-ref="testJob"
p:jobLauncher-ref="jobLauncher" />
@Component
public class RunScheduler {
private JobLauncher jobLauncher;
private Job job;
public void runJob() {
try {
String dateParam = new Date().toString();
JobParameters param = new JobParametersBuilder().addString("date", dateParam).toJobParameters();
JobExecution execution = jobLauncher.run(job, param);
} catch (Exception e) {
LOGGER.error("Can't start job", e);
throw new RuntimeException(e);
}
}
public Job getJob() {
return job;
}
public void setJob(Job job) {
this.job = job;
}
public JobLauncher getJobLauncher() {
return jobLauncher;
}
public void setJobLauncher(JobLauncher jobLauncher) {
this.jobLauncher = jobLauncher;
}
}
Is there a way to avoid the RunScheduler class and handle this purely through XML configuration?
You can use the capabilities of @EnableScheduling and CronSequenceGenerator for the scheduling and cron settings without depending on the utility classes.
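For example, a minimal sketch under the assumption that the testJob and jobLauncher beans from the configuration above are available in the context; the date parameter keeps each run's JobParameters unique:
@Configuration
@EnableScheduling
public class TestJobSchedulingConfig {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    @Qualifier("testJob")
    private Job testJob;

    // Same cron as the <task:scheduled> element above: every 5 minutes.
    @Scheduled(cron = "0 */5 * * * *")
    public void runTestJob() throws Exception {
        JobParameters params = new JobParametersBuilder()
                .addDate("date", new Date())
                .toJobParameters();
        jobLauncher.run(testJob, params);
    }
}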
I managed to configure and schedule a Quartz job using the JobStoreTX persistent store in Spring Boot (version 4.2.5). Here is how I scheduled the job.
First:
public class MyJob implements Job{
@Autowired
IService service;
@Override
public void execute(JobExecutionContext context) throws JobExecutionException {
service.doSomething();
}
}
@Autowired does not seem to work in a Quartz Job implementation because the job is not instantiated by Spring. Hence I'm facing the famous NullPointerException.
Second, in order to get hold of Spring-managed beans in a Quartz job, I used org.springframework.scheduling.quartz.SchedulerFactoryBean to manage the Quartz lifecycle:
public class MyJob implements Job{
@Override
public void execute(JobExecutionContext context) throws JobExecutionException {
try {
ApplicationContext applicationContext = (ApplicationContext) context.getScheduler().getContext().get("applicationContext");
IService service= applicationContext.getBean(IService.class);
service.getManualMaxConfig();
} catch (SchedulerException e) {
e.printStackTrace();
}
}
}
And then:
<bean id="scheduler"
class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
<property name="applicationContextSchedulerContextKey" value="applicationContext" />
</bean>
The sad news is that I'm still facing the NullPointerException.
I also tried these suggestions, in vain:
LINK
What's wrong with what I'm doing?
Update 1:
Before trying to inject the service, I tried to pass some params as @ritesh.garg suggests.
public class MyJob implements Job{
private String someParam;
private int someParam2;
public void setSomeParam(String someParam) {
this.someParam = someParam;
}
public void setSomeParam2(int someParam2) {
this.someParam2 = someParam2;
}
@Override
public void execute(JobExecutionContext context) throws JobExecutionException {
System.out.println("My job is running with "+someParam+' '+someParam2);
}
}
And my jobBean.xml looks like:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">
<bean id="scheduler"
class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
<property name="applicationContextSchedulerContextKey" value="applicationContext" />
</bean>
<bean id="myJob" class="org.springframework.scheduling.quartz.JobDetailFactoryBean">
<property name="jobClass" value="com.quartz.service.MyJob"/>
<property name="jobDataAsMap">
<map>
<entry key="someParam" value="some value"/>
<entry key="someParam2" value="1"/>
</map>
</property>
</bean>
</beans>
I don't know why, but the parameters aren't passed and it prints:
My job is running with null 0
PS: I imported jobBean.xml into Application.java, so I don't know what I am missing.
Update 2: Here is my detailed code:
@Component
public class JobScheduler{
Timer timer = new Timer();
@PostConstruct
public void distributeAutomaticConf(){
try {
timer.schedule(new ServiceImpl(), 10000);
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
Service implementation:
@Transactional
@Component
public class ServiceImpl extends TimerTask implements IService{
@Override
public void run() {
final SchedulerFactory factory = new StdSchedulerFactory();
Scheduler scheduler = null;
try {
scheduler = factory.getScheduler();
final JobDetailImpl jobDetail = new JobDetailImpl();
jobDetail.setName("My job executed only once.. ");
jobDetail.setJobClass(MyJob.class);
SimpleTrigger trigger = (SimpleTrigger) newTrigger()
.withIdentity("trigger_", "group_")
.build();
scheduler.start();
scheduler.scheduleJob(jobDetail, trigger);
System.in.read();
if (scheduler != null) {
scheduler.shutdown();
}
} catch (final SchedulerException e) {
e.printStackTrace();
} catch (final IOException e) {
e.printStackTrace();
}
}
}
MyJob:
public class MyJob extends QuartzJobBean{
@Autowired
IService service;
@Override
protected void executeInternal(JobExecutionContext arg0) throws JobExecutionException {
SpringBeanAutowiringSupport.processInjectionBasedOnCurrentContext(this);
service.doSomething();
}
}
jobBean.xml:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">
<bean id="scheduler"
class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
<property name="applicationContextSchedulerContextKey" value="applicationContext" />
</bean>
<bean id="myJob" class="org.springframework.scheduling.quartz.JobDetailFactoryBean">
<property name="jobClass" value="com.quartz.service.MyJob"/>
<property name="jobDataAsMap">
<map>
<entry key="someParam" value="some value"/>
<entry key="someParam2" value="1"/>
</map>
</property>
</bean>
</beans>
quartz.properties:
org.quartz.scheduler.instanceName = DefaultQuartzScheduler
org.quartz.scheduler.rmi.export = false
org.quartz.scheduler.rmi.proxy = false
org.quartz.scheduler.wrapJobExecutionInUserTransaction = false
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 10
org.quartz.threadPool.threadPriority = 5
org.quartz.threadPool.threadsInheritContextClassLoaderOfInitializingThread = true
org.quartz.jobStore.misfireThreshold = 60000
org.quartz.jobStore.class = org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
#org.quartz.jobStore.driverDelegateClass = org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.dataSource = myDS
org.quartz.jobStore.tablePrefix = QRTZ_
org.quartz.dataSource.myDS.driver = org.postgresql.Driver
org.quartz.dataSource.myDS.URL = jdbc:postgresql://localhost:5432/myDB
org.quartz.dataSource.myDS.user = admin
org.quartz.dataSource.myDS.password = admin
org.quartz.dataSource.myDS.maxConnections = 10
org.quartz.scheduler.skipUpdateCheck=true
Console output:
java.lang.NullPointerException: null
at com.quartz.service.MyJob.executeInternal(MyJob.java:27) ~[classes/:na]
at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:113) ~[spring-context-support-3.1.2.RELEASE.jar:3.1.2.RELEASE]
at org.quartz.core.JobRunShell.run(JobRunShell.java:202) ~[quartz-2.2.1.jar:na]
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) [quartz-2.2.1.jar:na]
2016-06-05 11:35:16.839 ERROR 25452 --- [eduler_Worker-1] org.quartz.core.ErrorLogger : Job (DEFAULT.My job executed only once.. threw an exception.
org.quartz.SchedulerException: Job threw an unhandled exception.
at org.quartz.core.JobRunShell.run(JobRunShell.java:213) ~[quartz-2.2.1.jar:na]
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:573) [quartz-2.2.1.jar:na]
Caused by: java.lang.NullPointerException: null
at com.quartz.service.MyJob.executeInternal(MyJob.java:27) ~[classes/:na]
at org.springframework.scheduling.quartz.QuartzJobBean.execute(QuartzJobBean.java:113) ~[spring-context-support-3.1.2.RELEASE.jar:3.1.2.RELEASE]
at org.quartz.core.JobRunShell.run(JobRunShell.java:202) ~[quartz-2.2.1.jar:na]
... 1 common frames omitted
I have experienced the same problem in the past. My understanding of this issue is that beans instantiated in the Spring context cannot be injected into the Quartz context simply by using the @Autowired annotation.
I managed to solve it by using setter-based dependency injection. But the same is mentioned in the "LINK" you added in the original post.
Pasting the relevant information from the link:
Update: Replaced implements Job with extends QuartzJobBean
public class MyJob extends QuartzJobBean {
private String someParam;
private int someParam2;
public void setSomeParam(String someParam) {
this.someParam = someParam;
}
public void setSomeParam2(int someParam2) {
this.someParam2 = someParam2;
}
@Override
public void execute(JobExecutionContext context) throws JobExecutionException {
System.out.println("My job is running with "+someParam+' '+someParam2);
}
}
Here, someParam and someParam2 are injected via setter dependency injection. The other part that makes this complete is to pass someParam and someParam2 in jobDataAsMap:
<bean id="myJob" class="org.springframework.scheduling.quartz.JobDetailFactoryBean">
<property name="jobClass" value="com.my.MyJob"/>
<property name="jobDataAsMap">
<map>
<entry key="someParam" value="some value"/>
<entry key="someParam2" value="1"/>
</map>
</property>
</bean>
In your case, it would be value-ref="IserviceBeanId" instead of value in the entry. I would be surprised, as well as curious, if this did not/does not work for you.
I fixed my problem by implementing InitializingBean in my job:
public class MyJob extends QuartzJobBean implements InitializingBean {
private String someParam;
private int someParam2;
public void setSomeParam(String someParam) {
this.someParam = someParam;
}
public void setSomeParam2(int someParam2) {
this.someParam2 = someParam2;
}
@Override
public void execute(JobExecutionContext context) throws JobExecutionException {
System.out.println("My job is running with "+someParam+' '+someParam2);
}
@Override
public void afterPropertiesSet() throws Exception {
}
}
The correct way, from most of the examples I've seen, is to make your Job interface implementation a @Component:
@Component
public class MyJob implements Job{
@Autowired IService service;
@Override
public void execute(JobExecutionContext context) throws JobExecutionException{
service.doSomething();
}
}
We can use a JobDataMap to pass the objects.
Example: here restTemplate is autowired.
JobDataMap newJobDataMap = new JobDataMap();
newJobDataMap.put("restTemplate", restTemplate);
JobDetail someJobDetail = JobBuilder
.newJob(QuartzJob.class)
.withIdentity(jobName, GROUP)
.usingJobData(newJobDataMap)
.build();
I have a MultiResourceItemReader with a custom ItemReader as a delegate. The problem I'm facing is that when I launch the job, the same file is read over and over again.
This is the delegate class:
public class AllegatiReader implements ResourceAwareItemReaderItemStream<Allegato> {
@PersistenceContext
protected EntityManager em;
private Resource resource;
@Override
public void close() throws ItemStreamException {
}
@Override
public void open(ExecutionContext arg0) throws ItemStreamException {
}
@Override
public void update(ExecutionContext arg0) throws ItemStreamException {
}
@Override
public Allegato read() throws Exception, UnexpectedInputException,
ParseException, NonTransientResourceException {
// DO SOMETHING ...
byte[] fileContent = new byte[(int) resource.getFile().length()];
resource.getInputStream().read(fileContent);
resource.getInputStream().close();
allegato.getFile().setFile(fileContent);
return allegato;
}
@Override
public void setResource(Resource arg0) {
this.resource = arg0;
}
}
Here is my Spring Batch XML configuration file:
<batch:job id="allegati" incrementer="jobParametersIncrementer">
<batch:step id="allegati-import">
<batch:tasklet>
<batch:chunk reader="allegati-reader" writer="allegati-writer" commit-interval="1"/>
</batch:tasklet>
</batch:step>
</batch:job>
<bean id="allegati-reader" class="org.springframework.batch.item.file.MultiResourceItemReader" scope="step">
<property name="resources" value="file:#{jobParameters['FILEPATH']}/*" />
<property name="delegate" ref="allegati-filereader" />
</bean>
<bean id="allegati-writer" class="org.springframework.batch.item.database.JpaItemWriter">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="allegati-filereader" class="it.infogroup.vertenze.porting.reader.AllegatiReader" />
How can I tell Spring Batch to move to the next file?
Your custom reader has to signal to Spring Batch when it is done with the current input; see http://docs.spring.io/spring-batch/trunk/apidocs/org/springframework/batch/item/ItemReader.html#read--
Reads a piece of input data and advance to the next one.
Implementations must return null at the end of the input data set.
In your case I would use a private attribute to save* the state of whether this reader instance's resource has already been processed; it could be the Allegato object itself, but that seems to be rather large.
*) Your reader is stateful by design, so another state attribute should be no problem.
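For example, a minimal sketch of the delegate with such a state flag; the flag is reset in open(), which the MultiResourceItemReader calls again for every new resource, so each file is read exactly once (the item-building part is kept as a placeholder for the original "DO SOMETHING" block):
public class AllegatiReader implements ResourceAwareItemReaderItemStream<Allegato> {

    @PersistenceContext
    protected EntityManager em;

    private Resource resource;
    private boolean consumed; // has the current resource already been returned?

    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        consumed = false; // new resource assigned by MultiResourceItemReader, not read yet
    }

    @Override
    public Allegato read() throws Exception {
        if (consumed) {
            return null; // tells Spring Batch this resource is exhausted, so it moves to the next file
        }
        consumed = true;
        Allegato allegato = new Allegato(); // build the item as in the original "DO SOMETHING" block
        byte[] fileContent = new byte[(int) resource.getFile().length()];
        try (InputStream in = resource.getInputStream()) {
            in.read(fileContent);
        }
        allegato.getFile().setFile(fileContent);
        return allegato;
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
    }

    @Override
    public void close() throws ItemStreamException {
    }

    @Override
    public void setResource(Resource resource) {
        this.resource = resource;
    }
}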
I am trying to cause a job not to have BatchStatus.FAILED if a certain exception occurs.
The docs talk about using skippable-exception-classes within <chunk>, but how can I do the same within a TaskletStep? The below code does not work:
<batch:step id="sendEmailStep">
<batch:tasklet>
<bean class="com.myproject.SendEmail" scope="step" autowire="byType">
<batch:skippable-exception-classes>
<batch:include class="org.springframework.mail.MailException" />
</batch:skippable-exception-classes>
</bean>
</batch:tasklet>
</batch:step>
I implemented this functionality in the Tasklet as Michael Minella suggested:
abstract class SkippableTasklet implements Tasklet {
//Exceptions that should not cause job status to be BatchStatus.FAILED
private List<Class<?>> skippableExceptions;
public void setSkippableExceptions(List<Class<?>> skippableExceptions) {
this.skippableExceptions = skippableExceptions;
}
private boolean isSkippable(Exception e) {
if (skippableExceptions == null) {
return false;
}
for (Class<?> c : skippableExceptions) {
if (c.isAssignableFrom(e.getClass())) {
return true;
}
}
return false;
}
protected abstract void run(JobParameters jobParameters) throws Exception;
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext)
throws Exception {
StepExecution stepExecution = chunkContext.getStepContext().getStepExecution();
JobExecution jobExecution = stepExecution.getJobExecution();
JobParameters jobParameters = jobExecution.getJobParameters();
try {
run(jobParameters);
} catch (Exception e) {
if (!isSkippable(e)) {
throw e;
} else {
jobExecution.addFailureException(e);
}
}
return RepeatStatus.FINISHED;
}
}
And the Spring XML configuration for an example SkippableTasklet:
<batch:tasklet>
<bean class="com.MySkippableTasklet" scope="step" autowire="byType">
<property name="skippableExceptions">
<list>
<value>org.springframework.mail.MailException</value>
</list>
</property>
</bean>
</batch:tasklet>
Within a Tasklet, the responsibility for exception handling rests on the implementation of the Tasklet. The skip logic available in chunk-oriented processing comes from the exception handling provided by the ChunkOrientedTasklet. If you want to skip exceptions in your own Tasklet, you need to write the code to do so within your own implementation.