Spring Batch: how to avoid IllegalStateException: Input resource must exist - java

I'm developing a batch application using Spring Batch with Java 11.
This is my reader() method:
@SuppressWarnings("unchecked")
@Bean
public FlatFileItemReader<MyClass> reader() {
    BeanWrapperFieldSetMapper beanWrapperMapper = new BeanWrapperFieldSetMapper<MyClass>();
    beanWrapperMapper.setTargetType(MyClass.class);
    return new FlatFileItemReaderBuilder<MyClass>()
            .name("MyClassReader")
            .resource(new FileSystemResource(inputFolder.concat(File.separator).concat("my-input-file.csv")))
            .delimited()
            .names("field1", "field2")
            .fieldSetMapper(beanWrapperMapper)
            .build();
}
I did several tests, and when the file my-input-file.csv is there, the batch works fine. However, I would like the following behavior: if the file my-input-file.csv is missing, I still want something written to the output file and no error to be raised.
Right now, if I run the batch but the file is not in the folder, this error comes up:
IllegalStateException: Input resource must exist (reader is in 'strict' mode): path [C:\Users\username\Desktop\src\test\resources\my-input-file.csv]
I am aware that the error is because the file could not be found. But I really would like to handle this case to generate a different output file (and I don't want the batch process to fail).
How can this be done?

Set the strict property to false so that a missing input resource does not raise an exception.
Check the readCount after the batch job has completed: if readCount == 0, no data was read, and you can put your fallback logic there.
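For the first point, the flag can be set directly on the FlatFileItemReaderBuilder from the question. A minimal sketch, keeping the rest of the configuration as in the question:
@Bean
public FlatFileItemReader<MyClass> reader() {
    BeanWrapperFieldSetMapper<MyClass> beanWrapperMapper = new BeanWrapperFieldSetMapper<>();
    beanWrapperMapper.setTargetType(MyClass.class);
    return new FlatFileItemReaderBuilder<MyClass>()
            .name("MyClassReader")
            .resource(new FileSystemResource(inputFolder.concat(File.separator).concat("my-input-file.csv")))
            .strict(false) // do not fail when the resource is missing; the step just reads 0 items
            .delimited()
            .names("field1", "field2")
            .fieldSetMapper(beanWrapperMapper)
            .build();
}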
Example for your case (implementing JobExecutionListener):
@Override
public void afterJob(JobExecution jobExecution) {
    StepExecution[] stepExecutions = jobExecution.getStepExecutions().toArray(new StepExecution[]{});
    StepExecution stepExecution = stepExecutions[0];
    long readCount = stepExecution.getReadCount();
    if (readCount == 0) {
        // your logic
    }
}
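Attaching the listener to the job could then look like this (the bean and method names here are illustrative):
@Bean
public Job job(Step step, JobExecutionListener myJobListener) {
    return jobBuilderFactory.get("job")
            .start(step)
            .listener(myJobListener) // runs the afterJob logic above when the job finishes
            .build();
}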

Related

How to skip a whole step if the file does not exist for that item reader or step

ISSUE: I have an item reader, processor, and writer for my Spring Batch job. When I run it, it fails with Failed to initialize the reader and also shows:
Caused by: java.lang.IllegalStateException: Input resource must exist (reader is in 'strict' mode): file
My requirement is: I don't want the item reader to run if the file does not exist.
Can anyone help me? I want to skip the whole step if the file does not exist for that item reader.
My step consists of an item reader, processor, and writer.
Just tell me how to skip the step if the file does not exist. Any help will be appreciated. Also, is there any way to set the reader to non-strict mode?
See here: you can set a skip condition (and limit) on a chunk-oriented step.
@Bean
public Step step1() {
    return this.stepBuilderFactory.get("step1")
            .<String, String>chunk(CHUNK_SIZE)
            .reader(flatFileItemReader())
            .writer(itemWriter())
            .faultTolerant()
            .skipLimit(0) // 0: don't retry
            .skip(FlatFileParseException.class) // thrown by FlatFileItemReader,
            // but other exception types are possible, e.g. IllegalStateException.class
            .build();
}
To set "non-strict" mode in FlatFileItemReader, just:
@Bean
public FlatFileItemReader<XY> flatFileItemReader() {
    FlatFileItemReader<XY> reader = new FlatFileItemReader<>();
    // the important part:
    reader.setStrict(false);
    // reader.setXYZ ...
    return reader;
}
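To actually skip the whole step when the file is missing, one option is a JobExecutionDecider that checks for the file and routes the flow around the step. A rough sketch (the bean names, status values, file path, and the jobBuilderFactory are illustrative assumptions, not from the question):
import java.nio.file.Files;
import java.nio.file.Paths;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.job.builder.FlowBuilder;
import org.springframework.batch.core.job.flow.Flow;
import org.springframework.batch.core.job.flow.FlowExecutionStatus;
import org.springframework.batch.core.job.flow.JobExecutionDecider;
import org.springframework.batch.core.job.flow.support.SimpleFlow;

@Bean
public JobExecutionDecider fileExistsDecider() {
    // route the flow based on whether the input file is present
    return (jobExecution, stepExecution) ->
            Files.exists(Paths.get("my-input-file.csv"))
                    ? new FlowExecutionStatus("FILE_FOUND")
                    : new FlowExecutionStatus("FILE_MISSING");
}

@Bean
public Job conditionalJob() {
    Flow flow = new FlowBuilder<SimpleFlow>("conditionalFlow")
            .start(fileExistsDecider())
            .on("FILE_MISSING").end()          // end the job without running the step
            .from(fileExistsDecider())
            .on("FILE_FOUND").to(step1())      // run the step only when the file exists
            .build();
    return jobBuilderFactory.get("conditionalJob")
            .start(flow)
            .end()
            .build();
}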

How to collect and write logs to a file in batches after a time interval using log4j2 or logback in Java

I am trying to make a custom log appender using log4j2. My problem is that I don't want to write the log to a file appender immediately, but after a delay. Ideally, my Spring Boot app should collect all the logs in some data structure and then trigger writing them to a file in batches after a delay of 3 minutes. (I should not use Spring Batch, since this is not a batch application but a simple Spring Boot starter.)
When I was writing my custom provider for Logback (as part of the Loki4j project), I came up with a concise implementation of a thread-safe buffer that can trigger the output operation by either batch size or timeout since the last output.
Usage pattern:
private static final LogRecord[] ZERO_EVENTS = new LogRecord[0];

private ConcurrentBatchBuffer<ILoggingEvent, LogRecord> buffer =
        new ConcurrentBatchBuffer<>(batchSize, LogRecord::create, (e, fl) -> eventFileLine(e, fl));

// somewhere in the code where a new log event arrives
var batch = buffer.add(event, ZERO_EVENTS);
if (batch.length > 0)
    handleBatch(batch);

// somewhere in a scheduled method that triggers every timeoutMs
var batch = buffer.drain(timeoutMs, ZERO_EVENTS);
if (batch.length > 0)
    handleBatch(batch);

// handling batches here
private void handleBatch(LogRecord[] lines) {
    // flush to file
}
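For illustration only (this is not the actual ConcurrentBatchBuffer), the same "flush on batch size or on timeout" idea can be sketched with plain JDK classes:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Simplified illustration: buffer log lines and flush either when the batch
// is full or when the scheduled timer fires (e.g. every 3 minutes).
class SimpleLogBuffer {
    private final List<String> buffer = new ArrayList<>();
    private final int batchSize;

    SimpleLogBuffer(int batchSize, long flushIntervalMs) {
        this.batchSize = batchSize;
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // time-based trigger: flush whatever has accumulated every flushIntervalMs
        scheduler.scheduleAtFixedRate(this::flush, flushIntervalMs, flushIntervalMs, TimeUnit.MILLISECONDS);
    }

    synchronized void add(String line) {
        buffer.add(line);
        if (buffer.size() >= batchSize) {
            flush(); // size-based trigger
        }
    }

    synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<String> batch = new ArrayList<>(buffer);
        buffer.clear();
        handleBatch(batch);
    }

    private void handleBatch(List<String> batch) {
        // append the whole batch to the log file in one go
    }
}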
Try looking at the appenders that come with log4j2. Most of them implement functionality similar to what you describe. E.g. RandomAccessFileAppender will write to the file only after it receives a complete batch of log events from the async appenders infrastructure. They all share an OutputStreamManager or FileManager that encapsulates this logic:
protected synchronized void write(final byte[] bytes, final int offset, final int length, final boolean immediateFlush) {
    if (immediateFlush && byteBuffer.position() == 0) {
        ...
Unfortunately, there seems to be no time-based solution for this.
I have written a log4j2 appender for Loki, and it has its own ring buffer and a thread that sends a batch when there is enough data or the user-specified timeout has passed:
if (exceededBatchSizeThreshold() || exceededWaitTimeThreshold(currentTimeMillis)) {
    try {
        httpClient.log(outputBuffer);
    } finally {
        outputBuffer.clear();
        timeoutDeadline = currentTimeMillis + batchWaitMillis;
    }
}

Spring Batch: Read from txt files and return all lines as a single String to the processor

I'm trying to read text from files in Spring Batch, but I need all lines to reach the processor as a single String, not line by line. I read about setRecordSeparatorPolicy in FlatFileItemReader. Is there any other way to achieve this? Maybe another kind of reader or something?
Any help would be really appreciated
public ItemStreamReader<String> stringReader() throws IOException {
    Resource[] resources = ...; // load files here
    MultiResourceItemReader<String> reader = new MultiResourceItemReader<String>();
    reader.setResources(resources);
    reader.setDelegate(flatFileItemReader()); // sets the FlatFileItemReader
    return reader;
}
@Bean
public TaskletStep managerStep() throws Exception {
    return managerStepBuilderFactory.get("managerStep")
            .<String, String>chunk(6)
            .reader(stringReader())
            .processor(myProcessor()) // all lines as one String per file here
            .writer(doSomething())
            .build();
}
The chunk processing model is designed for item-oriented use cases. If you have a single item, I don't see any added value in using a chunk-oriented step. A simple tasklet is more appropriate.
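A rough sketch of that tasklet approach (the stepBuilderFactory, the file paths, and the process() helper are placeholders, not from the question):
@Bean
public Step wholeFileStep() {
    return stepBuilderFactory.get("wholeFileStep")
            .tasklet((contribution, chunkContext) -> {
                // read the whole file as a single String
                String content = new String(
                        Files.readAllBytes(Paths.get("input/my-file.txt")),
                        StandardCharsets.UTF_8);
                String result = process(content); // your "processor" logic on the full content
                // write the result in one shot
                Files.write(Paths.get("output/result.txt"),
                        result.getBytes(StandardCharsets.UTF_8));
                return RepeatStatus.FINISHED;
            })
            .build();
}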

Is java.util.logging.FileHandler in Java 8 broken?

First, some simple test code:
package javaapplication23;

import java.io.IOException;
import java.util.logging.FileHandler;

public class JavaApplication23 {
    public static void main(String[] args) throws IOException {
        new FileHandler("./test_%u_%g.log", 10000, 100, true);
    }
}
With Java 7, this test code creates only one file, "test_0_0.log", no matter how often I run the program. This is the expected behaviour, because the append parameter in the constructor is set to true.
But if I run this sample with Java 8, every run creates a new file (test_0_0.log, test_0_1.log, test_0_2.log, ...). I think this is a bug.
Imho, the related change in Java is this one:
@@ -413,18 +428,18 @@
// object. Try again.
continue;
}
- FileChannel fc;
+
try {
- lockStream = new FileOutputStream(lockFileName);
- fc = lockStream.getChannel();
- } catch (IOException ix) {
- // We got an IOException while trying to open the file.
- // Try the next file.
+ lockFileChannel = FileChannel.open(Paths.get(lockFileName),
+ CREATE_NEW, WRITE);
+ } catch (FileAlreadyExistsException ix) {
+ // try the next lock file name in the sequence
continue;
}
+
boolean available;
try {
- available = fc.tryLock() != null;
+ available = lockFileChannel.tryLock() != null;
// We got the lock OK.
} catch (IOException ix) {
// We got an IOException while trying to get the lock.
@@ -440,7 +455,7 @@
}
// We failed to get the lock. Try next file.
- fc.close();
+ lockFileChannel.close();
}
}
(In full: OpenJDK changeset 6123:ac22a52a732c)
I know that normally the FileHandler gets closed by the LogManager, but this is not the case if the system or the application crashes or the process gets killed. This is why I do not have a "close" statement in the sample code above.
Now I have two questions:
1) What is your opinion? Is this a bug? (Almost answered in the following comments and answers)
2) Do you know a workaround to get the old Java 7 behavior in Java 8? (The more important question...)
Thanks for your answers.
Closing the FileHandler deletes the '.lck' file. If the lock file exists at all under a JDK 8 version earlier than update 40 (java.util.logging), the FileHandler is going to rotate. From the OpenJDK discussion, the decision was made to always rotate if the lck file exists, in addition to the case where the current process can't lock it. The reason given is that it is always safer to rotate when the lock file exists. So this gets really nasty if you use a rotating pattern with a mix of JDK versions, because the JDK 7 version will reuse the lock file but the JDK 8 version will leave it and rotate. Which is what you are doing with your test case.
Using JDK 8, if I purge all log and lck files from the working directory and then run:
public static void main(String[] args) throws IOException {
    System.out.println(System.getProperty("java.runtime.version"));
    new FileHandler("./test_%u.log", 10000, 100, true).close();
}
I always see a file named 'test_0.log.0'. I get the same result using JDK7.
Bottom line is that you have to ensure your FileHandlers are closed. If the handler is never garbage collected or removed from the logger tree, the LogManager will close it for you; otherwise you have to close it yourself. After that is fixed, purge all lock files before running your newly patched code. Then be aware that if the JVM process crashes or is killed, the lock file won't be deleted; likewise, if you hit an I/O error on close, the lock file won't be deleted. When the next process starts, the FileHandler will rotate.
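For illustration, one way to make sure the handler is closed on normal JVM termination is a shutdown hook; it obviously won't help on a crash or a kill -9:
import java.io.IOException;
import java.util.logging.FileHandler;
import java.util.logging.Logger;

public class JavaApplication23 {
    public static void main(String[] args) throws IOException {
        FileHandler handler = new FileHandler("./test_%u_%g.log", 10000, 100, true);
        Logger logger = Logger.getLogger("test");
        logger.addHandler(handler);
        // close the handler (which also deletes the .lck file) when the JVM shuts down normally
        Runtime.getRuntime().addShutdownHook(new Thread(handler::close));
        logger.info("application started");
    }
}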
As you point out, it is possible to use up all of the lock files on JDK8 if the above conditions occur over 100 runs. A simple test for this is to run the following code twice without deleting the log and lck files:
public static void main(String[] args) throws Exception {
    System.out.println(System.getProperty("java.runtime.version"));
    ReferenceQueue<FileHandler> q = new ReferenceQueue<>();
    for (int i = 0; i < 100; i++) {
        WeakReference<FileHandler> h = new WeakReference<>(
                new FileHandler("./test_%u.log", 10000, 2, true), q);
        while (q.poll() != h) {
            System.runFinalization();
            System.gc();
            System.runFinalization();
            Thread.yield();
        }
    }
}
However, the test case above won't work if JDK-6774110 is fixed correctly. The issue for this can be tracked on the OpenJDK site under RFR: 8048020 - Regression on java.util.logging.FileHandler and FileHandler webrev.

Why does my file not get read correctly when running through ANT

I have a suite of unit tests I run in Eclipse which all work fine. They depend on data loaded from a very large > 20MB file.
However, when I run the unit tests from ANT, they fail because some of the data is not loaded. My file reading mechanism does not read the entire file; it just stops, without giving any error, after reading about 10,000 of 900,000 lines.
Here is my file reading code:
private static void initializeListWithFileContents(
        TreeMap<String, List<String>> treeMap, String fileName)
{
    File file = new File(fileName);
    Scanner scanner = null;
    int count = 0;
    try
    {
        scanner = new Scanner(file);
        while (scanner.hasNextLine())
        {
            String line = scanner.nextLine().toLowerCase().trim();
            String[] tokens = line.split(" ");
            if (tokens.length == 3)
            {
                String key = tokens[0] + tokens[1];
                if (treeMap.containsKey(key))
                {
                    List<String> list = treeMap.get(key);
                    list.add(tokens[2]);
                }
                else
                {
                    List<String> list = new ArrayList<String>();
                    list.add(tokens[2]);
                    treeMap.put(key, list);
                }
                count++;
            }
        }
        scanner.close();
    }
    catch (IOException ioe)
    {
        ioe.printStackTrace();
    }
    System.out.println(count + " rows added");
}
This is part of a Web app. The web app also works fine, the entire file gets loaded to memory.
If the data my unit tests depend on is contained in the first 10,000 lines, then the unit tests pass OK with ANT.
The only thing I can think of is that it must be a memory issue, but then why is no exception thrown?
I run my ANT target from within Eclipse. It is configured with the same JVM args as my Eclipse JUnit runner, i.e. -Xms512m -Xmx1234m. I know it picks these up correctly, because if ANT launches with the default JVM parameters it fails with a heap error.
Any other ideas what I could check?
The Scanner type swallows I/O errors. You must check for errors explicitly using the ioException() method.
If the problem is an encoding error you need to pass the encoding of the file explicitly when you instantiate the scanner.
If the file is a corrupt text file, you may need to provide your own reader that does more fault-tolerant decoding. This should be avoided if possible as it is less correct.
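A sketch of both suggestions applied to the loop above (the "UTF-8" charset is an assumption; use whatever encoding the file was actually written with):
scanner = new Scanner(file, "UTF-8"); // pass the file's encoding explicitly
while (scanner.hasNextLine())
{
    String line = scanner.nextLine().toLowerCase().trim();
    // ... same tokenizing as above ...
}
// hasNextLine() simply returns false on an underlying I/O error instead of throwing,
// so check for a swallowed exception explicitly:
IOException swallowed = scanner.ioException();
if (swallowed != null)
{
    swallowed.printStackTrace(); // or rethrow, so a partial read no longer fails silently
}
scanner.close();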
