Job in spring batch is getting executed multiple times and not stopping - java

I am trying to read data from Cassandra using Spring Batch, where I have implemented ItemReader, ItemProcessor, and ItemWriter. I am able to read the data, process it, and write it back to the same table. I am creating an XML file to configure the job:
xml:
<job id="LoadStatusIndicator" job-repository="jobRepository" restartable="false">
<step id="LoadStatus" next="">
<tasklet>
<chunk reader="StatusReader" processor="ItemProcessor" writer="ItemWriter"
commit-interval="10" />
</tasklet>
</step>
</job>
<beans:bean id="ItemWriter" scope="step"
class="com.batch.writer.ItemWriter">
</beans:bean>
<beans:bean id="ItemProcessor" scope="step"
class="com.batch.processor.ItemProcessor">
</beans:bean>
<beans:bean id="Reader" scope="step"
class="com.reader.ItemReader">
<beans:property name="dataSource" ref="CassandraSource" />
</beans:bean>
applicationcontext.xml:
<beans:bean id="CassandraSource" parent="DataSourceParent">
<beans:property name="url" value="jdbc:cassandra://${cassandra.hostName}:${cassandra.port}/${cassandra.keyspace}" />
<beans:property name="driverClassName" value="org.apache.cassandra.cql.jdbc.CassandraDriver" />
</beans:bean>
reader class:
public static final String query = "SELECT * FROM test_1 allow filtering;";
@Override
public List<Item> read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
    List<Item> results = new ArrayList<Item>();
    try {
        results = cassandraTemplate.select(query, Item.class);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return results;
}
writer class:
@Override
public void write(List<? extends Item> item) throws Exception {
    try {
        cassandraTemplate.insert(item);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
But the problem is that the whole job is getting executed multiple times; in fact it is not stopping at all. I have to force-stop the job execution. I have only 2 rows in the table.
I think it is because of the commit-interval defined in the XML, but even with commit-interval = 10 the job executes more than 10 times.
According to my understanding, when I run the job from the XML file I am running it only once: it calls the reader once and keeps the data in runtime memory (job repository), calls the item processor once (I use a list), and the whole list is inserted at once.

SOLVED
In reader class I wrote:
if (results.size() != 0)
    return results;
else
    return null;
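For reference, Spring Batch keeps calling ItemReader.read() until it returns null, so a reader that always returns a (possibly empty) list never signals end-of-input and the step loops forever. Below is a minimal sketch of a read-once reader in the spirit of the fix above; the Item class is the question's own, and the CassandraTemplate type is an assumption standing in for whatever cassandraTemplate the question actually uses:

import java.util.List;

import org.springframework.batch.item.ItemReader;
import org.springframework.data.cassandra.core.CassandraTemplate;

public class StatusReader implements ItemReader<List<Item>> {

    private static final String QUERY = "SELECT * FROM test_1 allow filtering;";

    private CassandraTemplate cassandraTemplate; // injected, standing in for the question's cassandraTemplate
    private boolean exhausted = false;           // remembers that the single bulk read already happened

    @Override
    public List<Item> read() throws Exception {
        if (exhausted) {
            return null;                         // null tells Spring Batch the input is finished
        }
        exhausted = true;
        List<Item> results = cassandraTemplate.select(QUERY, Item.class);
        return results.isEmpty() ? null : results;
    }
}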

Related

spring batch getting data from file and passing to procedure

For a Spring batch project, I need to get the date from a file and then I need to pass this date to a procedure and then run the procedure.
Then the result of the procedure must be written to a csv file.
I tried using listeners but couldn't do this.
Can anyone please tell how this can be achieved or if possible can you share any sample code in github.
First of all, you will need to get the date from your file and store it in the JobExecutionContext. One of the simplest solutions would be to create a custom Tasklet that reads the text file and stores the resulting String in the context via a StepExecutionListener.
This tasklet takes a file parameter and stores the result string with the key file.date:
public class CustomTasklet implements Tasklet, StepExecutionListener {

    private String date;
    private String file;

    @Override
    public void beforeStep(StepExecution stepExecution) {}

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        stepExecution.getJobExecution().getExecutionContext().put("file.date", date);
        return ExitStatus.COMPLETED;
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // Read from file using FileUtils (Apache Commons IO)
        date = FileUtils.readFileToString(new File(file));
        return RepeatStatus.FINISHED;
    }

    public void setFile(String file) {
        this.file = file;
    }
}
Use it this way:
<batch:step>
<batch:tasklet>
<bean class="xx.xx.xx.CustomTasklet">
<property name="file" value="${file.path}"></property>
</bean>
</batch:tasklet>
</batch:step>
Then you will use a Chunk with late binding to retrieve the previously stored value (i.e. using #{jobExecutionContext['file.date']}).
The reader will be a StoredProcedureItemReader (declared with scope="step" so the late-binding expression is resolved at step time):
<bean class="org.springframework.batch.item.database.StoredProcedureItemReader" scope="step">
<property name="dataSource" ref="dataSource" />
<property name="procedureName" value="${procedureName}" />
<property name="fetchSize" value="50" />
<property name="parameters">
<list>
<bean class="org.springframework.jdbc.core.SqlParameter">
<constructor-arg index="0" value="#{jobExecutionContext['file.date']}" />
</bean>
</list>
</property>
<property name="rowMapper" ref="rowMapper" />
<property name="preparedStatementSetter" ref="preparedStatementSetter" />
</bean>
The writer will be a FlatFileItemWriter:
<bean class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
<property name="resource" value="${file.dest.path}" />
<property name="lineAggregator" ref="lineAggregator" />
</bean>

Integration of SolrJ with Spring Batch

I am using Spring Batch. Following is the jobContext.xml file; JdbcCursorItemReader is reading data from a MySQL database.
<?xml version="1.0" encoding="UTF-8"?>
<beans>
<import resource="infrastructureContext.xml"/>
<batch:job id="readProcessWriteProducts">
<batch:step id="readWriteProducts">
<tasklet>
<chunk reader="reader" processor="processer" writer="writer" commit-interval="5"> </chunk>
</tasklet>
</batch:step>
</batch:job>
<bean id="reader" class="org.springframework.batch.item.database.JdbcCursorItemReader">
<property name="dataSource" ref="dataSource"></property>
<property name="sql" value="SELECT id, name, description, price FROM product"></property>
<property name="rowMapper" ref="productItemReader"></property>
</bean>
<bean id="productItemReader" class="com.itl.readprocesswrite.reader.ProductItemReader"></bean>
<bean id="processer" class="com.itl.readprocesswrite.processor.ProductItemProcessor">
<constructor-arg ref="jdbcTemplate"></constructor-arg>
</bean>
<bean id="writer" class="com.itl.readprocesswrite.writer.ProductJdbcItemWriter">
<constructor-arg ref="jdbcTemplate"></constructor-arg>
</bean>
</beans>
Now, I want to read data from Apache Solr.
I tried following code to read data from Apache Solr.
public class SolrJDrive {
public static void main(String[] args) throws MalformedURLException, SolrServerException {
System.out.println("SolrJDrive::main");
SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("qt", "/select");
params.set("q", "*:*");
params.set("spellcheck", "on");
params.set("spellcheck.build", "true");
QueryResponse response = solr.query(params);
SolrDocumentList results = response.getResults();
for (int i = 0; i < results.size(); ++i) {
System.out.println(results.get(i));
}
}//end of method main
}//end of class SolrJDrive
Now how do I integrate this with Spring Batch?
In addition to what I said on your other question (Is it possible to integrate Apache Solr with Spring Batch?), here's an example of a custom Solr ItemReader:
public class SolrReader implements ItemReader<SolrDocumentList> {

    private boolean read = false; // an ItemReader must eventually return null, otherwise the step never ends

    @Override
    public SolrDocumentList read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
        if (read) {
            return null;
        }
        read = true;
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("qt", "/select");
        params.set("q", "*:*");
        params.set("spellcheck", "on");
        params.set("spellcheck.build", "true");
        QueryResponse response = solr.query(params);
        SolrDocumentList results = response.getResults();
        return results;
    }
}
You would then need an ItemProcessor to convert your SolrDocumentList to something you can work with (i.e. a POJO).
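For illustration, a minimal sketch of such a processor; the Product POJO and the Solr field names (id, name) are hypothetical and would need to match your actual schema:

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.springframework.batch.item.ItemProcessor;

public class SolrDocumentProcessor implements ItemProcessor<SolrDocumentList, List<Product>> {

    @Override
    public List<Product> process(SolrDocumentList documents) throws Exception {
        List<Product> products = new ArrayList<Product>();
        for (SolrDocument doc : documents) {
            Product product = new Product();
            // Field names are placeholders; use the field names from your Solr schema.
            product.setId(String.valueOf(doc.getFieldValue("id")));
            product.setName(String.valueOf(doc.getFieldValue("name")));
            products.add(product);
        }
        return products;
    }
}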

Promoting header values from a file to job execution context

I have a file where the first line contains the field names as headers as below:
Id;ToyName;ToyType;ToyColor
1;abc;abc;red
2;pqr;pqr;blue
3;xyz;xyz;orange
My reader is as below:
<beans:bean id="MyFileItemReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
<beans:property name="linesToSkip" value="1"/>
<beans:property name="skippedLinesCallback" ref="headerSkipCallback" />
<beans:property name="resource" ref="MyInputFileResource" />
<beans:property name="lineMapper">
<beans:bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<beans:property name="lineTokenizer">
<beans:bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<beans:property name="delimiter" value=";"/>
<beans:property name="names" value="#{jobExecutionContext['columnsFromFileHeader']}" />
</beans:bean>
</beans:property>
<beans:property name="fieldSetMapper">
<beans:bean class="mypackage.MyFieldSetMapper">
</beans:bean>
</beans:property>
</beans:bean>
</beans:property>
</beans:bean>
Thus I have a header line callback implemented to read the skipped header line.
<beans:bean id="headerSkipCallback" class="mypackage.HeaderLineHandler" scope="step">
</beans:bean>
and the class as:
public class HeaderLineHandler implements LineCallbackHandler {
    public void handleLine(final String headerLine) {
        System.out.println(headerLine);
    }
}
This works correctly and the header line from the file gets printed.
Now I want to use these field names from the file header in the names property of the DelimitedLineTokenizer.
So to put the header line inside the context, I implemented a context injector class as below:
public class StepExecutionListenerCtxInjector {

    private ExecutionContext stepExecutionCtx;
    private ExecutionContext jobExecutionCtx;

    @BeforeStep
    public void beforeStep(final StepExecution stepExecution) {
        this.stepExecutionCtx = stepExecution.getExecutionContext();
        this.jobExecutionCtx = stepExecution.getJobExecution().getExecutionContext();
    }

    public ExecutionContext getStepExecutionCtx() {
        return this.stepExecutionCtx;
    }

    public ExecutionContext getJobExecutionCtx() {
        return this.jobExecutionCtx;
    }
}
And changed my header line handler to:
<beans:bean id="headerSkipCallback" class="mypackage.HeaderLineHandler" scope="step">
<beans:property name="stepExecutionListener" ref="stepExecutionListener" />
</beans:bean>
and the class as:
public class HeaderLineHandler implements LineCallbackHandler {

    private StepExecutionListenerCtxInjector stepExecutionListener;

    public void handleLine(final String headerLine) {
        this.stepExecutionListener.getJobExecutionCtx().put("columnsFromFileHeader", headerLine.replaceAll(";", ","));
    }

    // getter setters
}
Here I am saving the header line under the columnsFromFileHeader key inside the job execution context.
However, when I try to access it in the DelimitedLineTokenizer as:
<beans:property name="names" value="#{jobExecutionContext['columnsFromFileHeader']}" />
I am getting a NullPointerException:
org.springframework.batch.item.file.FlatFileParseException: Parsing error at line: 2 in resource=[file [C:\myFile.dat]], input=[1;abc;abc;red]
at org.springframework.batch.item.file.FlatFileItemReader.doRead(FlatFileItemReader.java:182)
Caused by:
java.lang.NullPointerException
at org.springframework.batch.item.file.transform.AbstractLineTokenizer.tokenize(AbstractLineTokenizer.java:113)
How can I use the header line from the file as the names property of the DelimitedLineTokenizer?
Make the DelimitedLineTokenizer a non-anonymous bean and inject it into your HeaderLineHandler; in HeaderLineHandler.handleLine(), set DelimitedLineTokenizer.names from the handleLine() input param.
I can't test it, but it should work.
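A minimal sketch of that idea, assuming the DelimitedLineTokenizer is pulled out of the reader definition into its own bean and wired into the handler (bean wiring not shown):

import org.springframework.batch.item.file.LineCallbackHandler;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;

public class HeaderLineHandler implements LineCallbackHandler {

    private DelimitedLineTokenizer tokenizer; // the same tokenizer bean the lineMapper uses

    @Override
    public void handleLine(final String headerLine) {
        // Use the skipped header line directly as the column names for the tokenizer.
        tokenizer.setNames(headerLine.split(";"));
    }

    public void setTokenizer(DelimitedLineTokenizer tokenizer) {
        this.tokenizer = tokenizer;
    }
}

Since the skipped-lines callback runs when the reader opens the file, before the first data line is tokenized, the column names should already be in place without going through the jobExecutionContext at all.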

Listing files from ftp using spring-integration

I want to list all of the files on an FTP server using spring-integration and, for example, print them on screen. I've done something like this:
context:
<int:channel id="toSplitter">
<int:interceptors>
<int:wire-tap channel="logger"/>
</int:interceptors>
</int:channel>
<int:logging-channel-adapter id="logger" log-full-message="true"/>
<int:splitter id="splitter" input-channel="toSplitter" output-channel="getFtpChannel"/>
<int-ftp:outbound-gateway id="gatewayLS"
session-factory="ftpClientFactory"
request-channel="inbound"
command="ls"
expression="payload"
reply-channel="toSplitter"/>
<int:channel id="getFtpChannel">
<int:queue/>
</int:channel>
<bean id="ftpClientFactory"
class="org.springframework.integration.ftp.session.DefaultFtpSessionFactory">
<property name="host" value="${host}"/>
<property name="username" value="${user}"/>
<property name="password" value="${password}"/>
<property name="clientMode" value="0"/>
<property name="fileType" value="2"/>
<property name="bufferSize" value="10000000"/>
</bean>
Java code:
ConfigurableApplicationContext context =
new FileSystemXmlApplicationContext("/src/citrus/resources/citrus-context.xml");
final FtpFlowGateway ftpFlow = context.getBean(FtpFlowGateway.class);
ftpFlow.lsFiles("/");
PollableChannel channel = context.getBean("getFtpChannel", PollableChannel.class);
variable("tt", channel.receive().toString());
echo("${tt}");
output:
11:09:17,169 INFO port.LoggingReporter| Test action <echo>
11:09:17,169 INFO actions.EchoAction| [Payload=FileInfo [isDirectory=false, isLink=false, Size=3607, ModifiedTime=Tue Jul 15 14:18:00 CEST 2014, Filename=Smoke03_angart30_st40.exi, RemoteDirectory=/, Permissions=-rw-r--r--]][Headers= {replyChannel=org.springframework.integration.core.MessagingTemplate$TemporaryReplyChannel#7829b776, sequenceNumber=1, errorChannel=org.springframework.integration.core.MessagingTemplate$TemporaryReplyChannel#7829b776, file_remoteDirectory=/, sequenceSize=1, correlationId=49b57f2d-4dbf-4a89-b5b8-0dfb15bca2be, id=0a58ad65-74b4-4aae-87be-aa6034a41776, timestamp=1405501757060}]
11:09:17,169 INFO port.LoggingReporter| Test action <echo> done
11:09:17,169 INFO port.LoggingReporter| TEST STEP 1/1 done
The output is fine, but what should I do to print this information when I don't know how many files are stored on the FTP server? (This code prints only one file.) I've tried checking if channel.receive() is null, but the test just freezes.
Since you send the result of LS to the <splitter>, your getFtpChannel will receive FileInfo<?> objects one by one.
To print them all you really should have an infinite loop:
while (true) {
variable("tt", channel.receive().toString());
echo("${tt}");
}
To stop the app you should provide some shutdown hook or listen for something from console input.
Another point: it is bad to block your app with an indefinite receive(). There is an overloaded method which accepts a timeout param; that one might be useful to determine the end of your loop:
while (true) {
Message<?> receive = channel.receive(10000);
if (receive == null) {
break;
}
variable("tt", receive.toString());
echo("${tt}");
}
Alternatively, you can list the files directly from the session factory using Java configuration:
@Configuration
public class FtpConfig {

    @Bean
    public DefaultFtpSessionFactory ftpSessionFactory() {
        DefaultFtpSessionFactory ftpSessionFactory = new DefaultFtpSessionFactory();
        ftpSessionFactory.setHost("localhost");
        ftpSessionFactory.setPort(21);
        ftpSessionFactory.setUsername("user");
        ftpSessionFactory.setPassword("pass");
        return ftpSessionFactory;
    }

    @Bean
    public ApplicationRunner runner(DefaultFtpSessionFactory sf) {
        return args -> {
            FTPFile[] list = sf.getSession().list(".");
            for (FTPFile file : list) {
                System.out.println("Result: " + file.getName());
            }
        };
    }
}

Pass parameters to stepJob from the parent job in spring batch?

I have a three-level hierarchy of jobs.
<job id="job1">
<step id="step1" >
<job ref="step1.job1.1" job-parameters-extractor="job1Parameters"/>
</step>
</job>
<job id="job1.1">
<step id="step1.1" >
<job ref="step1.1.job1.1.1"/>
</step>
</job>
<job id="job1.1.1">
<step id="step1.1.1" >
<tasklet ref="ste1.1.1Tasklet" />
</step>
</job>
I want to pass the param1=value1 parameter from the top-level job (job1) to job1.1, which should again pass it to job1.1.1.
How can this be done in Spring Batch? I was trying to use
<util:map id="job1Parameters"
map-class="org.springframework.beans.factory.config.MapFactoryBean">
<beans:entry key="param1" value="value1" />
</util:map>
<beans:bean id="otherComputeJobParametersExtractor"
class="org.springframework.batch.core.step.job.DefaultJobParametersExtractor"
p:keys-ref="job1Parameters" />
But it's not working.
I know I can pass it as a parameter to job1 and it will be automatically passed to the child jobs, but there are many parameters and many of them are only for specific child jobs, so I don't want to pass all parameters to job1.
Can we add a step listener which will add param1=value1 to the stepExecutionContext just before triggering the child job, so that the parameters are available to the child job via the stepExecutionContext?
I could do it by setting up a stepExecutionListener to put param1=value1 in the stepExecutionContext.
public class SetParam1StepListener implements StepExecutionListener {

    protected String param1;

    public String getParam1() {
        return param1;
    }

    public void setParam1(String param1) {
        this.param1 = param1;
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        return null; // leave the exit status unchanged
    }

    @Override
    public void beforeStep(StepExecution stepExecution) {
        stepExecution.getExecutionContext().put("param1", this.param1);
    }
}
<beans:bean id="value1.setParam1StepListener" class="com.my.SetParam1StepListener" p:param1="value1" />
Then by adding the param1 key to the jobParametersExtractor:
<beans:bean id="jobParametersExtractor"
class="org.springframework.batch.core.step.job.DefaultJobParametersExtractor">
<beans:property name="keys" value="param1" />
</beans:bean>
and then passing it to the step's job element:
<job id="job1">
<step id="step1" >
<job ref="step1.job1.1" job-parameters-extractor="jobParametersExtractor"/>
<listeners>
<listener ref="value1.setParam1StepListener" />
</listeners>
</step>
</job>
It works.
