Spring Batch local partitioning restart problems - Java

I am having issues with the restart of a locally partitioned batch job. I throw a RuntimeException on the 101st processed item. The job fails, but something is going wrong, because on restart the job continues from the 150th item (and not from the 100th item, as it should).
Here is the XML configuration:
<bean id="taskExecutor" class="org.springframework.scheduling.commonj.WorkManagerTaskExecutor" >
<property name="workManagerName" value="springWorkManagers" />
</bean>
<bean id="transactionManager" class="org.springframework.transaction.jta.WebSphereUowTransactionManager"/>
<batch:job id="LocalPartitioningJob">
    <batch:step id="masterStep">
        <batch:partition step="slaveStep" partitioner="splitPartitioner">
            <batch:handler grid-size="5" task-executor="taskExecutor" />
        </batch:partition>
    </batch:step>
</batch:job>
<batch:step id="slaveStep">
<batch:tasklet transaction-manager="transactionManager">
<batch:chunk reader="partitionReader" processor="compositeItemProcessor" writer="sqlWriter" commit-interval="50" />
<batch:transaction-attributes isolation="SERIALIZABLE" propagation="REQUIRE" timeout="600" />
<batch:listeners>
<batch:listener ref="Processor1" />
<batch:listener ref="Processor2" />
<batch:listener ref="Processor3" />
</batch:listeners>
</batch:tasklet>
</batch:step>
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
<property name="transactionManager" ref="transactionManager" />
<property name="tablePrefix" value="${sb.db.tableprefix}" />
<property name="dataSource" ref="ds" />
<property name="maxVarCharLength" value="1000"/>
</bean>
<bean id="transactionManager" class="org.springframework.transaction.jta.WebSphereUowTransactionManager"/>
<jee:jndi-lookup id="ds" jndi-name="${sb.db.jndi}" cache="true" expected-type="javax.sql.DataSource" />
The splitPartitioner implements Partitioner, splits the initial data, and saves it to the execution contexts as lists. The processors call remote EJBs to fetch additional data, and the sqlWriter is just a org.spring...JdbcBatchItemWriter. PartitionReader code below:
public class PartitionReader implements ItemStreamReader<TransferObjectTO> {

    private List<TransferObjectTO> partitionItems;

    public PartitionReader() {
    }

    public synchronized TransferObjectTO read() {
        if (partitionItems.size() > 0) {
            return partitionItems.remove(0);
        } else {
            return null;
        }
    }

    @SuppressWarnings("unchecked")
    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        partitionItems = (List<TransferObjectTO>) executionContext.get("partitionItems");
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        executionContext.put("partitionItems", partitionItems);
    }

    @Override
    public void close() throws ItemStreamException {
    }
}
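For reference, the splitPartitioner itself is not shown above. A minimal sketch of such a partitioner (only the "partitionItems" key is taken from the PartitionReader above; the class and field names are illustrative) might look like this:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

public class SplitPartitioner implements Partitioner {

    // TransferObjectTO is the question's transfer object; the initial
    // data is assumed to be loaded or injected elsewhere.
    private List<TransferObjectTO> initialData;

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> contexts = new HashMap<String, ExecutionContext>();
        for (int i = 0; i < gridSize; i++) {
            // Give each partition every gridSize-th item, stored under the
            // key that the PartitionReader's open() reads back.
            ArrayList<TransferObjectTO> slice = new ArrayList<TransferObjectTO>();
            for (int j = i; j < initialData.size(); j += gridSize) {
                slice.add(initialData.get(j));
            }
            ExecutionContext context = new ExecutionContext();
            context.put("partitionItems", slice);
            contexts.put("partition" + i, context);
        }
        return contexts;
    }

    public void setInitialData(List<TransferObjectTO> initialData) {
        this.initialData = initialData;
    }
}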

It seems that I had a few misunderstandings of Spring Batch, plus some buggy code of my own. The first misunderstanding was that I thought the readCount would be rolled back on a RuntimeException. Now I see that this is not the case: Spring Batch increments this value, and upon step failure the value is committed.
Related to the above, I thought that the update method on ItemStreamReader would always be called, and that the executionContext update in the database would simply be committed or rolled back. But it seems that update is called only if no errors occur, and the executionContext update is always committed.
The third misunderstanding was that the partitioning "master step" would not be re-executed on restart, only the slave steps. But actually the "master step" is re-executed if one of its slave steps fails. So I guess that master and slave steps are actually handled somehow as a single step.
And then there was my buggy code in the PartitionReader, which was supposed to save db-server disk space. Maybe the partitionItems list should not be mutated on read()? (Related to the statements above.) Anyhow, here is the code for the working PartitionReader:
public class PartitionReader implements ItemStreamReader<TransferObjectTO> {

    private List<TransferObjectTO> partitionItems;
    private int index;

    public PartitionReader() {
    }

    public synchronized TransferObjectTO read() {
        if (partitionItems.size() > index) {
            return partitionItems.get(index++);
        } else {
            return null;
        }
    }

    @SuppressWarnings("unchecked")
    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        partitionItems = (List<TransferObjectTO>) executionContext.get("partitionItems");
        index = executionContext.getInt("partitionIndex", 0);
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        executionContext.put("partitionIndex", index);
    }

    @Override
    public void close() throws ItemStreamException {
    }
}
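For completeness, the partitionReader bean referenced by the slaveStep is not shown in the question; presumably it is declared step-scoped so that each partition execution gets its own stateful reader instance, something like:

<bean id="partitionReader" class="xx.xx.PartitionReader" scope="step" />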

Related

cron scheduler getting called twice at the scheduled time

This is the bean configuration for the cron scheduler: the declarations for runMeTask and runMeJob.
<bean id="runMeTask" class="com.ascent.fieldomobify.cornScheduler.RunMeTask"/>
<bean name="runMeJob" class="org.springframework.scheduling.quartz.JobDetailBean">
<property name="jobClass" value="com.ascent.fieldomobify.cornScheduler.RunMeJob" />
<property name="jobDataAsMap">
<map>
<entry key="runMeTask" value-ref="runMeTask" />
</map>
</property>
</bean>
<bean id="cronTrigger" class="org.springframework.scheduling.quartz.CronTriggerBean">
<property name="jobDetail" ref="runMeJob"/>
<property name="cronExpression" value="0 0 13 * * ?" />
</bean>
<bean class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
<property name="triggers">
<list>
<ref bean="cronTrigger" />
</list>
</property>
</bean>
It gets called directly from the scheduler bean configuration.
The first class is RunMeJob:
public class RunMeJob extends QuartzJobBean {

    private RunMeTask runMeTask;

    public void setRunMeTask(RunMeTask runMeTask) {
        this.runMeTask = runMeTask;
    }

    protected void executeInternal(JobExecutionContext context)
            throws JobExecutionException {
        try {
            runMeTask.printMe();
        } catch (ParseException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
From here I call the controller's method, which contains the logic.
The second class is RunMeTask:
public class RunMeTask {

    @Autowired
    WorkOrderController workorderContoller;

    public void setWorkorderContoller(WorkOrderController workorderContoller) {
        this.workorderContoller = workorderContoller;
    }

    public void printMe() throws ParseException {
        workorderContoller.printSysOut();
    }
}
There are some scenarios where you can get that behavior; please check this thread: Java Spring @Scheduled tasks executing twice

Job in spring batch is getting executed multiple times and not stopping

I am trying to read data from Cassandra using Spring Batch, where I have implemented an ItemReader, ItemProcessor, and ItemWriter. I am able to read the data, process it, and write it back to the same table. I am creating an XML file to execute the job:
xml:
<job id="LoadStatusIndicator" job-repository="jobRepository" restartable="false">
<step id="LoadStatus" next="">
<tasklet>
<chunk reader="StatusReader" processor="ItemProcessor" writer="ItemWriter"
commit-interval="10" />
</tasklet>
</step>
</job>
<beans:bean id="ItemWriter" scope="step"
class="com.batch.writer.ItemWriter">
</beans:bean>
<beans:bean id="ItemProcessor" scope="step"
class="com.batch.processor.ItemProcessor">
</beans:bean>
<beans:bean id="Reader" scope="step"
class="com.reader.ItemReader">
<beans:property name="dataSource" ref="CassandraSource" />
</beans:bean>
applicationcontext.xml:
<beans:bean id="CassandraSource" parent="DataSourceParent">
<beans:property name="url" value="jdbc:cassandra://${cassandra.hostName}:${cassandra.port}/${cassandra.keyspace}" />
<beans:property name="driverClassName" value="org.apache.cassandra.cql.jdbc.CassandraDriver" />
</beans:bean>
reader class:
public static final String query = "SELECT * FROM test_1 allow filtering;";

@Override
public List<Item> read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
    List<Item> results = new ArrayList<Item>();
    try {
        results = cassandraTemplate.select(query, Item.class);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return results;
}
writer class:
@Override
public void write(List<? extends Item> item) throws Exception {
    try {
        cassandraTemplate.insert(item);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
But the problem is that the whole job is getting executed multiple times; in fact, it is not stopping at all. I have to force-stop the job execution. I have only 2 rows in the table.
I think it is because of the commit-interval defined in the XML, but with commit-interval = 10 the job executes more than 10 times.
According to my understanding, when I run the XML file I am running the job only once: it calls the reader once, keeps the data in run-time memory (the job repository), calls the item processor once (I use a list), and the whole list is inserted at once.
SOLVED
In the reader class I wrote:
if (results.size() != 0)
    return results;
else
    return null;
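Note that Spring Batch calls read() repeatedly until it returns null, so the check above only ends the job once the query stops returning rows. A more defensive variant (a sketch, not from the original answer) is a one-shot reader that returns the result list once and then null, assuming the same cassandraTemplate collaborator as in the question:

import java.util.List;
import org.springframework.batch.item.ItemReader;

public class OneShotStatusReader implements ItemReader<List<Item>> {

    private static final String QUERY = "SELECT * FROM test_1 allow filtering;";

    // The template type and its import depend on the Cassandra client in
    // use; the question's cassandraTemplate is assumed here.
    private final CassandraTemplate cassandraTemplate;
    private boolean consumed = false;

    public OneShotStatusReader(CassandraTemplate cassandraTemplate) {
        this.cassandraTemplate = cassandraTemplate;
    }

    @Override
    public List<Item> read() throws Exception {
        if (consumed) {
            return null; // end of input: the step can now complete
        }
        consumed = true;
        List<Item> results = cassandraTemplate.select(QUERY, Item.class);
        return results.isEmpty() ? null : results;
    }
}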

spring batch getting data from file and passing to procedure

For a Spring Batch project, I need to get the date from a file, pass this date to a procedure, and then run the procedure.
The result of the procedure must then be written to a CSV file.
I tried using listeners but couldn't get this to work.
Can anyone please tell me how this can be achieved, or, if possible, share any sample code on GitHub?
First of all, you will need to get the date from your file and store it in the JobExecutionContext. One of the simplest solutions is to create a custom Tasklet that reads the text file and stores the resulting String in the context via a StepExecutionListener.
This tasklet takes a file parameter and stores the result string under the key file.date:
import java.io.File;

import org.apache.commons.io.FileUtils;
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class CustomTasklet implements Tasklet, StepExecutionListener {

    private String date;
    private String file;

    @Override
    public void beforeStep(StepExecution stepExecution) {}

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        stepExecution.getJobExecution().getExecutionContext().put("file.date", date);
        return stepExecution.getExitStatus();
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // Read from file using FileUtils (Apache Commons IO)
        date = FileUtils.readFileToString(new File(file));
        return RepeatStatus.FINISHED;
    }

    public void setFile(String file) {
        this.file = file;
    }
}
Use it this way:
<batch:step>
    <batch:tasklet>
        <bean class="xx.xx.xx.CustomTasklet">
            <property name="file" value="${file.path}" />
        </bean>
    </batch:tasklet>
</batch:step>
Then you will use a chunk with late binding to retrieve the previously stored value (i.e. using #{jobExecutionContext['file.date']}).
The reader will be a StoredProcedureItemReader (note the step scope, which is required for the late binding to resolve):
<bean class="org.springframework.batch.item.database.StoredProcedureItemReader">
<property name="dataSource" ref="dataSource" />
<property name="procedureName" value="${procedureName}" />
<property name="fetchSize" value="50" />
<property name="parameters">
<list>
<bean class="org.springframework.jdbc.core.SqlParameter">
<constructor-arg index="0" value="#{jobExecutionContext['file.date']}" />
</bean>
</list>
</property>
<property name="rowMapper" ref="rowMapper" />
<property name="preparedStatementSetter" ref="preparedStatementSetter" />
</bean>
The writer will be a FlatFileItemWriter:
<bean class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
<property name="resource" value="${file.dest.path}" />
<property name="lineAggregator" ref="lineAggregator" />
</bean>
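The lineAggregator bean referenced above is not defined in this answer; a minimal sketch (the field names passed to the extractor are hypothetical placeholders for the columns your rowMapper produces) could be:

<!-- Sketch: writes each item as one comma-separated line -->
<bean id="lineAggregator" class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
    <property name="delimiter" value="," />
    <property name="fieldExtractor">
        <bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
            <property name="names" value="field1,field2" />
        </bean>
    </property>
</bean>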

How does one open a Reader when implementing ItemReader in a Spring Batch project?

In a Spring Batch project I need to compose a record out of multiple lines. I'm implementing ItemReader to accumulate multiple lines before returning an object. After working through several example projects I have pieced this together, but I am faced with a ReaderNotOpenException.
I have triple-checked that the path to the file is correct. When I debug, the delegate contains the resource and file path from my configuration file.
Any help appreciated.
Config file:
<bean id="cvsFileItemReader" class="com.mkyong.XYZFileRecordReader">
<property name="delegate">
<bean class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="classpath:ma/report-headeronly.psv" />
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean
class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value="|" />
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.PassThroughFieldSetMapper" />
</property>
</bean>
</property>
</bean>
</property>
</bean>
My Reader:
package com.mkyong;

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemStream;
import org.springframework.batch.item.ItemStreamException;
import org.springframework.batch.item.NonTransientResourceException;
import org.springframework.batch.item.ParseException;
import org.springframework.batch.item.UnexpectedInputException;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.transform.FieldSet;

import com.mkyong.model.XYZFileHeaderRecord;

public class XYZFileRecordReader implements ItemReader<XYZFileHeaderRecord>, ItemStream {

    private FlatFileItemReader<FieldSet> delegate;

    @Override
    public XYZFileHeaderRecord read() throws Exception,
            UnexpectedInputException, ParseException,
            NonTransientResourceException {
        XYZFileHeaderRecord maFileHeaderRecord = new XYZFileHeaderRecord();
        for (FieldSet line = null; (line = this.delegate.read()) != null;) {
            String firstToken = line.readString(0);
            if (firstToken.equals("File ID")) {
                maFileHeaderRecord.setFileName(line.readString(1));
            } else if (firstToken.equals("Date")) {
                maFileHeaderRecord.setDate(line.readString(1));
                return maFileHeaderRecord;
            }
        }
        return null;
    }

    @Override
    public void close() throws ItemStreamException {}

    @Override
    public void open(ExecutionContext arg0) throws ItemStreamException {}

    @Override
    public void update(ExecutionContext arg0) throws ItemStreamException {}

    public FlatFileItemReader<FieldSet> getDelegate() {
        return delegate;
    }

    public void setDelegate(FlatFileItemReader<FieldSet> delegate) {
        this.delegate = delegate;
    }
}
And my stacktrace:
SEVERE: Encountered an error executing the step
org.springframework.batch.item.ReaderNotOpenException: Reader must be open before it can be read.
at org.springframework.batch.item.file.FlatFileItemReader.readLine(FlatFileItemReader.java:195)
at org.springframework.batch.item.file.FlatFileItemReader.doRead(FlatFileItemReader.java:173)
at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.read(AbstractItemCountingItemStreamItemReader.java:83)
at com.mkyong.XYZFileRecordReader.read(XYZFileRecordReader.java:26)
at com.mkyong.XYZFileRecordReader.read(XYZFileRecordReader.java:1)
at org.springframework.batch.core.step.item.SimpleChunkProvider.doRead(SimpleChunkProvider.java:91)
at org.springframework.batch.core.step.item.SimpleChunkProvider.read(SimpleChunkProvider.java:155)
at org.springframework.batch.core.step.item.SimpleChunkProvider$1.doInIteration(SimpleChunkProvider.java:114)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:368)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:144)
at org.springframework.batch.core.step.item.SimpleChunkProvider.provide(SimpleChunkProvider.java:108)
at org.springframework.batch.core.step.item.ChunkOrientedTasklet.execute(ChunkOrientedTasklet.java:69)
at org.springframework.batch.core.step.tasklet.TaskletStep$ChunkTransactionCallback.doInTransaction(TaskletStep.java:395)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:131)
at org.springframework.batch.core.step.tasklet.TaskletStep$2.doInChunkContext(TaskletStep.java:267)
at org.springframework.batch.core.scope.context.StepContextRepeatCallback.doInIteration(StepContextRepeatCallback.java:77)
at org.springframework.batch.repeat.support.RepeatTemplate.getNextResult(RepeatTemplate.java:368)
at org.springframework.batch.repeat.support.RepeatTemplate.executeInternal(RepeatTemplate.java:215)
at org.springframework.batch.repeat.support.RepeatTemplate.iterate(RepeatTemplate.java:144)
at org.springframework.batch.core.step.tasklet.TaskletStep.doExecute(TaskletStep.java:253)
at org.springframework.batch.core.step.AbstractStep.execute(AbstractStep.java:195)
at org.springframework.batch.core.job.SimpleStepHandler.handleStep(SimpleStepHandler.java:137)
at org.springframework.batch.core.job.flow.JobFlowExecutor.executeStep(JobFlowExecutor.java:64)
at org.springframework.batch.core.job.flow.support.state.StepState.handle(StepState.java:60)
at org.springframework.batch.core.job.flow.support.SimpleFlow.resume(SimpleFlow.java:152)
at org.springframework.batch.core.job.flow.support.SimpleFlow.start(SimpleFlow.java:131)
at org.springframework.batch.core.job.flow.FlowJob.doExecute(FlowJob.java:135)
at org.springframework.batch.core.job.AbstractJob.execute(AbstractJob.java:301)
at org.springframework.batch.core.launch.support.SimpleJobLauncher$1.run(SimpleJobLauncher.java:134)
at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:49)
at org.springframework.batch.core.launch.support.SimpleJobLauncher.run(SimpleJobLauncher.java:127)
at com.mkyong.App.main(App.java:27)
Apr 25, 2014 5:35:56 PM org.springframework.batch.core.launch.support.SimpleJobLauncher$1 run
INFO: Job: [FlowJob: [name=reportJob]] completed with the following parameters: [{}] and the following status: [FAILED]
Exit Status : FAILED
Done
Your delegate isn't getting opened. The easiest way to address this is to update the open, close, and update methods to call the corresponding methods on the delegate as well. This also allows for restartability (which your current version does not, because the state of the delegate is not being saved):
@Override
public void close() throws ItemStreamException {
    delegate.close();
}

@Override
public void open(ExecutionContext arg0) throws ItemStreamException {
    delegate.open(arg0);
}

@Override
public void update(ExecutionContext arg0) throws ItemStreamException {
    delegate.update(arg0);
}
The alternative is to register your FlatFileItemReader as a stream in your step. You'll have to pull it out to a separate bean definition if you want to go that route.
You can read more about ItemStreams and how their lifecycle works and how it is impacted via delegation here: http://docs.spring.io/spring-batch/reference/html-single/index.html#itemStream
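If you go the registration route, the step configuration might look like this (a sketch: the step id, writer name, and the flatFileItemReader bean pulled out of the delegate property are all illustrative):

<batch:step id="reportStep">
    <batch:tasklet>
        <batch:chunk reader="cvsFileItemReader" writer="reportWriter" commit-interval="10">
            <!-- Registering the delegate as a stream makes Spring Batch call open/update/close on it -->
            <batch:streams>
                <batch:stream ref="flatFileItemReader" />
            </batch:streams>
        </batch:chunk>
    </batch:tasklet>
</batch:step>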
You have to call delegate.open() to open the real reader. Or you can register the delegate reader as a stream so that Spring Batch manages the delegate's stream lifecycle (see chapter 6.5 of the reference documentation).

Read/write to database from quartz jobs - transactions not working

I have two Quartz (1.8.3) jobs, configured via Spring (2.5.6); one of them writes (Send) to the database, and one reads from it (Check).
<bean id="scheduleFactory"
class="org.springframework.scheduling.quartz.SchedulerFactoryBean">
<property name="triggers">
<list>
<ref bean="Check"/>
<ref bean="Send"/>
</list>
</property>
</bean>
<bean id="Send" class="org.springframework.scheduling.quartz.CronTriggerBean">
<property name="jobDetail">
<bean class="org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean">
<property name="targetObject" ref="StatusMonitor" />
<property name="targetMethod" value="sendMessage" />
</bean>
</property>
<property name="cronExpression" value="0 0/1 * * * ?" />
</bean>
<bean id="Check" class="org.springframework.scheduling.quartz.CronTriggerBean">
<property name="jobDetail">
<bean class="org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean">
<property name="targetObject" ref="StatusMonitor" />
<property name="targetMethod" value="checkAndUpdateStatus" />
</bean>
</property>
<property name="cronExpression" value="30 0/1 * * * ?" />
</bean>
Transaction manager is set up:
<tx:annotation-driven transaction-manager="TransactionManager"/>
In both jobs I explicitly run read/write operations in transactions like this:
@Override
public synchronized void sendMessage() {
    try {
        TransactionTemplate tt = new TransactionTemplate(ptm);
        tt.execute(new TransactionCallbackWithoutResult() {
            @Override
            protected void doInTransactionWithoutResult(TransactionStatus status) {
                ...
                statusDAO.update(status);
                ...
            }
        });
        log.info("Status was updated");
    } catch (Exception e) {
        ...
    }
}
where ptm is a TransactionManager bean, injected via Spring.
I see "Status was updated" record in logs, but when I read this record from transactional read method it is outdated sometimes. Moreover, when I use an SQL editor to read this record it is outdated too.
I don't understand, why transactions dont work in this case, do you have any ideas?
Thanks.
For anyone who might be interested, this worked for me:
<bean name="applicationDataCollectorControllerJobDetail" class="org.springframework.scheduling.quartz.JobDetailBean">
<property name="jobClass" value="org.mypckage.controller.jobs.ApplicationDataCollectorController" />
<property name="jobDataAsMap">
<map>
<!--<entry key="timeout" value="1" />-->
<entry key="genericService" value-ref="genericService" />
<entry key="applicationDataCollectorService" value-ref="applicationDataCollectorService" />
<entry key="transactionManager" value-ref="transactionManager" />
</map>
</property>
</bean>
--- In the scheduler bean ---
@Override
protected void executeInternal(JobExecutionContext ctx) throws JobExecutionException {
    getApplicationDataCollectorService().collectData(transactionManager);
}
---- In the applicationDataCollectorService bean ----
public void collectData(org.springframework.transaction.jta.JtaTransactionManager transactionManager) {
    try {
        this.transactionManager = transactionManager;
        testTransactionalSave();
    } catch (Exception e) {
        BUSY = false;
        e.printStackTrace();
    }
}

private void testTransactionalSave() throws Exception {
    TransactionTemplate tt = new TransactionTemplate(transactionManager);
    tt.execute(new TransactionCallbackWithoutResult() {
        @Override
        protected void doInTransactionWithoutResult(TransactionStatus ts) {
            try {
                ApplicationParameter appPara = null;
                List<ApplicationParameter> appParaList = genericService.getListFromHQL("select o from ApplicationParameter as o order by o.id desc", false);
                if (appParaList != null) {
                    if (appParaList.size() > 0) {
                        appPara = (ApplicationParameter) appParaList.get(0);
                        appPara.setLastBankStatementMailNum(appPara.getLastBankStatementMailNum() + 10);
                        appPara = (ApplicationParameter) genericService.mergeObject(appPara);
                        System.out.println(" num is now = " + appPara.getLastBankStatementMailNum());
                    }
                }
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    });
}
Note: don't forget to declare transactionManager as a private property in both beans, with a public setter and getter, so that Spring can wire it up. Any questions? yemiosigbesan@gmail.com
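For completeness, the declaration the note refers to would look something like this in each bean (a sketch of what the note describes):

// Declared in both the Quartz job class and the service class so the
// jobDataAsMap entry above can inject the same transaction manager.
private JtaTransactionManager transactionManager;

public JtaTransactionManager getTransactionManager() {
    return transactionManager;
}

public void setTransactionManager(JtaTransactionManager transactionManager) {
    this.transactionManager = transactionManager;
}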
