I have a Spring Batch job which uses a flow:
Flow productFlow = new FlowBuilder<Flow>("productFlow")
        .start(stageProductStep) // first step of the flow; name inferred from the stageProduct state in the log below
        .next(new MyDecider()).on("YES").to(anotherFlow)
        .build();
After I started using a decider which checks a value in the JobParameters from the job execution to decide whether to run the next flow or not, I am no longer getting COMPLETED as the overall job status in the JobExecution. It comes back as FAILED.
However, every step in the STEP_EXECUTION table is COMPLETED and none is FAILED.
Have I missed a trick somewhere?
My decider looks like this:
public class AnotherFlowDecider implements JobExecutionDecider {

    @Override
    public FlowExecutionStatus decide(final JobExecution jobExecution, final StepExecution stepExecution) {
        final JobParameters jobParameters = jobExecution.getJobParameters();
        final String name = jobParameters.getString("name");
        if (nonNull(name)) {
            switch (name) {
                case "A":
                    return new FlowExecutionStatus("YES");
                case "B":
                default:
                    return new FlowExecutionStatus("NO");
            }
        }
        throw new MyCustomException(FAULT, "name is not provided as a JobParameter");
    }
}
In debug mode I can see:
2020-12-11 11:10:58.145 DEBUG [cTaskExecutor-4] o.s.b.c.j.f.s.SimpleFlow [eId=, rId=] -- Completed state=productFlow.stageProduct with status=COMPLETED
2020-12-11 11:10:58.146 DEBUG [cTaskExecutor-4] o.s.b.c.j.f.s.SimpleFlow [eId=, rId=] -- Handling state=productFlow.decision0
2020-12-11 11:10:58.146 DEBUG [cTaskExecutor-4] o.s.b.c.j.f.s.SimpleFlow [eId=, rId=] -- Completed state=productFlow.decision0 with status=NO
2020-12-11 11:10:58.146 DEBUG [cTaskExecutor-4] o.s.b.c.j.f.s.SimpleFlow [eId=, rId=] -- Handling state=productFlow.FAILED
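A note on the FAILED status: the flow definition above only declares a transition for the decider's YES outcome. When the decider returns NO there is no matching transition, so the flow falls through to a FAILED end state, which matches the "Handling state=productFlow.FAILED" line in the log. A minimal sketch of wiring both outcomes (a suggestion, not the original definition; stageProductStep is inferred from the log) could look like this:
MyDecider decider = new MyDecider(); // shown as AnotherFlowDecider in the class listing above

Flow productFlow = new FlowBuilder<Flow>("productFlow")
        .start(stageProductStep)
        .next(decider).on("YES").to(anotherFlow)
        .from(decider).on("NO").end() // end the flow normally instead of falling through to FAILED
        .build();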
In our code, we create a Sinks.one whose value is emitted from the onSuccess/onError handlers of some Mono. The value being emitted is never null.
Contrary to our expectation, a Mono view of that sink sporadically turns out empty. It happens both on Windows 10 and on Ubuntu 20 machines, so it is likely not something platform-specific.
I would appreciate any hints as to why this happens.
Below is the shortest unit test I could come up with that reflects what we're doing in our production code. When this test is run from the IDE with the "repeat until failure" setting, it fails the assertion about the non-null Mono result quite consistently within 10-30 seconds after start.
I also added some diagnostic logging around tryEmitValue / tryEmitError, and it looks like it is never an error, never a null value being emitted, nor an emission failure.
The output from a failed run is:
23:48:12.635 [pool-1-thread-1] INFO sink-setter - | onSubscribe([Fuseable] Operators.MonoSubscriber)
23:48:12.635 [pool-1-thread-1] INFO sink-setter - | request(unbounded)
23:48:12.635 [boundedElastic-1] INFO sink-result-receiver - onSubscribe(SinkOneMulticast.NextInner)
23:48:12.635 [pool-1-thread-1] INFO sink-setter - | onNext(value)
23:48:12.635 [boundedElastic-1] INFO sink-result-receiver - request(unbounded)
23:48:12.635 [pool-1-thread-1] INFO sink-setter - | onComplete()
23:48:12.635 [boundedElastic-1] INFO sink-result-receiver - onComplete()
java.lang.AssertionError:
Expecting actual not to be null
The test code:
import java.util.concurrent.*;
import java.util.function.Consumer;
import lombok.extern.slf4j.Slf4j;
import org.junit.jupiter.api.Test;
import reactor.core.publisher.*;
import reactor.core.scheduler.Schedulers;
import reactor.util.Loggers;
import static org.assertj.core.api.AssertionsForClassTypes.assertThat;
@Slf4j
class MonoPlaygroundTest {
private static ExecutorService executor;
static {
executor = Executors.newCachedThreadPool();
Loggers.useSl4jLoggers();
}
@Test
void shouldConstructNonEmptyMono() {
var sink = Sinks.<String>one();
Mono
.fromSupplier(() -> "value")
.log(Loggers.getLogger("sink-setter"))
.subscribeOn(Schedulers.fromExecutorService(executor, "mono-executor"))
.doOnSuccess(tryEmitValueWithDebugLogging(sink))
.doOnError(tryEmitErrorWithDebugLogging(sink))
.subscribe();
var result = sink
.asMono()
.log(Loggers.getLogger("sink-result-receiver"))
// more map / flatMap here...
.subscribeOn(Schedulers.boundedElastic())
.block();
assertThat(result).isNotNull();
}
private static Consumer<Throwable> tryEmitErrorWithDebugLogging(Sinks.One<String> sink) {
return t -> {
sink.tryEmitError(t);
log.error("Was emitting an error: {}", t);
};
}
private static Consumer<String> tryEmitValueWithDebugLogging(Sinks.One<String> sink) {
return value -> {
var result = sink.tryEmitValue(value);
if (value == null) {
log.error("Tried to emit null value");
}
if (result.isFailure()) {
log.error("Failed to emit because of {}", result);
}
};
}
}
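For comparison, here is a sketch of the same hand-off without the intermediate sink. This is only an alternative construction, assuming the goal is simply to observe the supplier's result from a second pipeline; it is not an explanation of the behaviour above. cache() replays the single value (or error) to any later subscriber, so no manual re-emission is needed:
Mono<String> shared = Mono
        .fromSupplier(() -> "value")
        .log(Loggers.getLogger("sink-setter"))
        .subscribeOn(Schedulers.fromExecutorService(executor, "mono-executor"))
        .cache(); // replay the terminal signal to late subscribers

String result = shared
        .log(Loggers.getLogger("sink-result-receiver"))
        .subscribeOn(Schedulers.boundedElastic())
        .block();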
My need is to transfer some items from a non-reactive repository to a reactive repository (Firestore).
The procedure is triggered from a REST endpoint exposed with Netty.
The code below is what I've written after some trial and error.
The query from the non-reactive repo is not long (~20 sec), but it returns a lot of records and the overall execution time is usually ~60 min.
All records are always saved and all "Saving in progress... XXX" lines are printed, but about 50% of the time it will not print "Saved XXX records", and no errors are printed.
Things I've noticed:
more records -> higher probability of failure
it does not depend on the execution time (sometimes a longer process than the failed ones completes)
The app runs on a k8s pod with a 1500Mi RAM request and a 3000Mi limit; from the graphs it never approaches the limit.
What am I missing here?
@Slf4j
@RestController
@RequestMapping("/import")
public class ImportController {
@Autowired
private NotReactiveRepository notReactiveRepository;
@Autowired
private ReactiveRepository reactiveRepository;
private static final Scheduler queryScheduler = Schedulers.newBoundedElastic(1, 480, "query", 864000);// max 10 days processing time
@GetMapping("/start")
public Mono<String> start() {
log.info("Start");
return Mono.just("RECEIVED")
//fire and forget
.doOnNext(stringRouteResponse -> startProcess().subscribe());
}
private Mono<Long> startProcess() {
Mono<List<Items>> resultsBlockingMono = Mono
.fromCallable(() -> notReactiveRepository.findAll())
.subscribeOn(queryScheduler)
.retryWhen(Retry.backoff(5, Duration.of(2, ChronoUnit.SECONDS)));
return resultsBlockingMono
.doOnNext( records -> log.info("Records: {}", records.size()))
.flatMapMany(Flux::fromIterable)
.map(ItemConverter::convert)
// max 9000 save/sec
.delayElements(Duration.of(300, ChronoUnit.MICROS))
.flatMap(this::saveConvertedItem)
.zipWith(Flux.range(1, Integer.MAX_VALUE))
.doOnNext(savedAndIndex -> log.info("Saving in progress... {}", savedAndIndex.getT2()))
.count()
.doOnNext( numberOfSaved -> log.info("Saved {} records", numberOfSaved));
}
private Mono<ConvertedItem> saveConvertedItem(ConvertedItem convertedItem) {
return reactiveRepository.save(convertedItem)
.retryWhen(Retry.backoff(1000, Duration.of(2, ChronoUnit.MILLIS)))
.onErrorResume(throwable -> {
log.error("Resuming");
return Mono.empty();
})
.doOnError(throwable -> log.error("Error on save"));
}
}
Update:
As requested, this is the last output of the procedure, where "Saved 1131113 records" should have been printed, with .log() added before .count() (the output after the onNext always prints after the process, also on success):
"Saving... 1131113"
"| onNext([ConvertedItem(...),1131113])"
"Shutting down ExecutorService 'pubsubPublisherThreadPool'"
"Shutting down ExecutorService 'pubSubAcknowledgementExecutor'"
"Shutting down ExecutorService 'pubsubSubscriberThreadPool'"
"Closing JPA EntityManagerFactory for persistence unit 'default'"
"HikariPool-1 - Shutdown initiated..."
"HikariPool-1 - Shutdown completed."
I am using reactive MongoDB and trying to implement free text search based on weights, using
implementation("io.micronaut.mongodb:micronaut-mongo-reactive")
on the POJO below:
public class Product {
@BsonProperty("_id")
@BsonId
private ObjectId id;
private String name;
private float price;
private String description;
}
I tried this simple example:
public Flowable<List<Product>> findByFreeText(String text) {
LOG.info(String.format("Listener --> Listening value = %s", text));
Flowable.fromPublisher(this.repository.getCollection("product", List.class)
.find(new Document("$text", new Document("$search", text)
.append("$caseSensitive", false)
.append("$diacriticSensitive", false)))).subscribe(item -> {
System.out.println(item);
}, error -> {
System.out.println(error);
});
return Flowable.just(List.of(new Product()));
}
I don't think this is the correct way of implementing free text search.
First, you don't need a Flowable of List<Product>, because Flowable, unlike Single, can emit more than one value, so it is enough to have Flowable<Product>. Then you can simply return the Flowable instance from the find method.
Text search can then be implemented like this:
public Flowable<Product> findByFreeText(final String query) {
return Flowable.fromPublisher(repository.getCollection("product", Product.class)
.find(new Document("$text",
new Document("$search", query)
.append("$caseSensitive", false)
.append("$diacriticSensitive", false)
)));
}
Then it is up to the consumer of the method how it subscribes to the resulting Flowable. In a controller you can directly return the Flowable instance. If you need to consume it somewhere in your code, you can use subscribe() or blockingSubscribe() and so on.
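For illustration, a minimal controller sketch (the controller name and endpoint path are assumptions, not from the original answer; SomeService is the service exposing findByFreeText):
@Controller("/products")
public class ProductSearchController {

    private final SomeService service;

    public ProductSearchController(SomeService service) {
        this.service = service;
    }

    @Get("/search/{query}")
    public Flowable<Product> search(String query) {
        // Micronaut subscribes to the returned Flowable itself; no manual subscribe() needed
        return service.findByFreeText(query);
    }
}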
And you can of course test it with JUnit like this:
@MicronautTest
class SomeServiceTest {
@Inject
SomeService service;
@Test
void findByFreeText() {
service.findByFreeText("test")
.test()
.awaitCount(1)
.assertNoErrors()
.assertValue(p -> p.getName().contains("test"));
}
}
Update: you can debug the communication with MongoDB by setting this in the logback.xml logging config file (Micronaut uses Logback as the default logging framework):
<configuration>
....
<logger name="org.mongodb" level="debug"/>
</configuration>
Then you will see this in the log file:
16:20:21.257 [Thread-5] DEBUG org.mongodb.driver.protocol.command - Sending command '{"find": "product", "filter": {"$text": {"$search": "test", "$caseSensitive": false, "$diacriticSensitive": false}}, "batchSize": 2147483647, "$db": "some-database"}' with request id 6 to database some-database on connection [connectionId{localValue:3, serverValue:1634}] to server localhost:27017
16:20:21.258 [Thread-8] DEBUG org.mongodb.driver.protocol.command - 16:20:21.258 [Thread-7] DEBUG org.mongodb.driver.protocol.command - Execution of command with request id 6 completed successfully in 2.11 ms on connection [connectionId{localValue:3, serverValue:1634}] to server localhost:27017
Then you can copy the command from the log and try it in the MongoDB CLI, or you can install MongoDB Compass, where you can experiment with it further and see whether the command is correct or not.
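One more note: $text queries only run against a collection that has a text index, and the weighting mentioned in the question is configured on that index rather than on the query. A minimal sketch of creating such an index through the reactive driver (the field names and weight values here are assumptions):
public Flowable<String> createProductTextIndex() {
    // Weighted text index on name and description; matches on name score higher.
    return Flowable.fromPublisher(repository.getCollection("product", Product.class)
            .createIndex(
                    Indexes.compoundIndex(Indexes.text("name"), Indexes.text("description")),
                    new IndexOptions().weights(new Document("name", 10).append("description", 5))));
}
The equivalent in the mongo shell is db.product.createIndex({name: "text", description: "text"}, {weights: {name: 10, description: 5}}).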
I'm trying to create a PoC application in Java to figure out how to do transaction management in Spring Cloud Stream when using Kafka for message publishing. The use case I'm trying to simulate is a processor that receives a message. It then does some processing and generates two new messages destined for two separate topics. I want to be able to handle publishing both messages as a single transaction. So, if publishing the second message fails, I want to roll back (not commit) the first message. Does Spring Cloud Stream support such a use case?
I've set the @Transactional annotation and I can see a global transaction starting before the message is delivered to the consumer. However, when I try to publish a message via the MessageChannel.send() method, I can see that a new local transaction is started and completed in the KafkaProducerMessageHandler class's handleRequestMessage() method. This means that the sending of the message does not participate in the global transaction, so if there's an exception thrown after the publishing of the first message, that message will not be rolled back. The global transaction gets rolled back, but that doesn't do anything really, since the first message was already committed.
spring:
cloud:
stream:
kafka:
binder:
brokers: localhost:9092
transaction:
transaction-id-prefix: txn.
producer: # these apply to all producers that participate in the transaction
partition-key-extractor-name: partitionKeyExtractorStrategy
partition-selector-name: partitionSelectorStrategy
partition-count: 3
configuration:
acks: all
enable:
idempotence: true
retries: 10
bindings:
input-customer-data-change-topic:
consumer:
configuration:
isolation:
level: read_committed
enable-dlq: true
bindings:
input-customer-data-change-topic:
content-type: application/json
destination: com.fis.customer
group: com.fis.ec
consumer:
partitioned: true
max-attempts: 1
output-name-change-topic:
content-type: application/json
destination: com.fis.customer.name
output-email-change-topic:
content-type: application/json
destination: com.fis.customer.email
@SpringBootApplication
@EnableBinding(CustomerDataChangeStreams.class)
public class KafkaCloudStreamCustomerDemoApplication
{
public static void main(final String[] args)
{
SpringApplication.run(KafkaCloudStreamCustomerDemoApplication.class, args);
}
}
public interface CustomerDataChangeStreams
{
@Input("input-customer-data-change-topic")
SubscribableChannel inputCustomerDataChange();
@Output("output-email-change-topic")
MessageChannel outputEmailDataChange();
@Output("output-name-change-topic")
MessageChannel outputNameDataChange();
}
@Component
public class CustomerDataChangeListener
{
@Autowired
private CustomerDataChangeProcessor mService;
@StreamListener("input-customer-data-change-topic")
public void handleCustomerDataChangeMessages(
@Payload final ImmutableCustomerDetails customerDetails)
{
mService.processMessage(customerDetails);
}
}
@Component
public class CustomerDataChangeProcessor
{
private final CustomerDataChangeStreams mStreams;
@Value("${spring.cloud.stream.bindings.output-email-change-topic.destination}")
private String mEmailChangeTopic;
@Value("${spring.cloud.stream.bindings.output-name-change-topic.destination}")
private String mNameChangeTopic;
public CustomerDataChangeProcessor(final CustomerDataChangeStreams streams)
{
mStreams = streams;
}
public void processMessage(final CustomerDetails customerDetails)
{
try
{
sendNameMessage(customerDetails);
sendEmailMessage(customerDetails);
}
catch (final JSONException ex)
{
LOGGER.error("Failed to send messages.", ex);
}
}
public void sendNameMessage(final CustomerDetails customerDetails)
throws JSONException
{
final JSONObject nameChangeDetails = new JSONObject();
nameChangeDetails.put(KafkaConst.BANK_ID_KEY, customerDetails.bankId());
nameChangeDetails.put(KafkaConst.CUSTOMER_ID_KEY, customerDetails.customerId());
nameChangeDetails.put(KafkaConst.FIRST_NAME_KEY, customerDetails.firstName());
nameChangeDetails.put(KafkaConst.LAST_NAME_KEY, customerDetails.lastName());
final String action = customerDetails.action();
nameChangeDetails.put(KafkaConst.ACTION_KEY, action);
final MessageChannel nameChangeMessageChannel = mStreams.outputNameDataChange();
nameChangeMessageChannel.send(MessageBuilder.withPayload(nameChangeDetails.toString())
.setHeader(MessageHeaders.CONTENT_TYPE, MimeTypeUtils.APPLICATION_JSON)
.setHeader(KafkaHeaders.TOPIC, mNameChangeTopic).build());
if ("fail_name_illegal".equalsIgnoreCase(action))
{
throw new IllegalArgumentException("Customer name failure!");
}
}
public void sendEmailMessage(final CustomerDetails customerDetails) throws JSONException
{
final JSONObject emailChangeDetails = new JSONObject();
emailChangeDetails.put(KafkaConst.BANK_ID_KEY, customerDetails.bankId());
emailChangeDetails.put(KafkaConst.CUSTOMER_ID_KEY, customerDetails.customerId());
emailChangeDetails.put(KafkaConst.EMAIL_ADDRESS_KEY, customerDetails.email());
final String action = customerDetails.action();
emailChangeDetails.put(KafkaConst.ACTION_KEY, action);
final MessageChannel emailChangeMessageChannel = mStreams.outputEmailDataChange();
emailChangeMessageChannel.send(MessageBuilder.withPayload(emailChangeDetails.toString())
.setHeader(MessageHeaders.CONTENT_TYPE, MimeTypeUtils.APPLICATION_JSON)
.setHeader(KafkaHeaders.TOPIC, mEmailChangeTopic).build());
if ("fail_email_illegal".equalsIgnoreCase(action))
{
throw new IllegalArgumentException("E-mail address failure!");
}
}
}
EDIT
We are getting closer. The local transaction does not get created anymore. However, the global transaction still gets committed even if there was an exception. From what I can tell, the exception does not propagate to the TransactionTemplate.execute() method, and therefore the transaction gets committed. It seems that the MessageProducerSupport class "swallows" the exception in the catch clause of its sendMessage() method: if there is an error channel defined, a message is published to it and the exception is not rethrown. I tried turning the error channel off (spring.cloud.stream.kafka.binder.transaction.producer.error-channel-enabled = false) but that doesn't turn it off. So, just for a test, I simply set the error channel to null in the debugger to force the exception to be rethrown. That seems to do it. However, the original message keeps getting redelivered to the initial consumer even though I have max-attempts set to 1 for that consumer.
See the documentation.
spring.cloud.stream.kafka.binder.transaction.transactionIdPrefix
Enables transactions in the binder. See transaction.id in the Kafka documentation and Transactions in the spring-kafka documentation. When transactions are enabled, individual producer properties are ignored and all producers use the spring.cloud.stream.kafka.binder.transaction.producer.* properties.
Default null (no transactions)
spring.cloud.stream.kafka.binder.transaction.producer.*
Global producer properties for producers in a transactional binder. See spring.cloud.stream.kafka.binder.transaction.transactionIdPrefix and Kafka Producer Properties and the general producer properties supported by all binders.
Default: See individual producer properties.
You must configure the shared global producer.
Don't add @Transactional - the container will start the transaction and send the offset to the transaction before committing the transaction.
If the listener throws an exception, the transaction is rolled back and the DefaultAfterRollbackProcessor will re-seek the topics/partitions so that the record will be redelivered.
EDIT
There is a bug in the configuration of the binder's transaction manager that causes a new local transaction to be started by the output binding.
To work around it, reconfigure the TM with the following container customizer bean...
@Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer<byte[], byte[]>> customizer() {
return (container, dest, group) -> {
KafkaTransactionManager<?, ?> tm = (KafkaTransactionManager<?, ?>) container.getContainerProperties()
.getTransactionManager();
tm.setTransactionSynchronization(AbstractPlatformTransactionManager.SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
};
}
EDIT2
You can't use the binder's DLQ support because, from the container's perspective, the delivery was successful. We need to propagate the exception to the container to force a rollback. So, you need to move the dead-lettering to the AfterRollbackProcessor instead. Here is my complete test class:
@SpringBootApplication
@EnableBinding(Processor.class)
public class So57379575Application {
public static void main(String[] args) {
SpringApplication.run(So57379575Application.class, args);
}
@Autowired
private MessageChannel output;
@StreamListener(Processor.INPUT)
public void listen(String in) {
System.out.println("in:" + in);
this.output.send(new GenericMessage<>(in.toUpperCase()));
if (in.equals("two")) {
throw new RuntimeException("fail");
}
}
#KafkaListener(id = "so57379575", topics = "so57379575out")
public void listen2(String in) {
System.out.println("out:" + in);
}
#KafkaListener(id = "so57379575DLT", topics = "so57379575dlt")
public void listen3(String in) {
System.out.println("dlt:" + in);
}
@Bean
public ApplicationRunner runner(KafkaTemplate<byte[], byte[]> template) {
return args -> {
template.send("so57379575in", "one".getBytes());
template.send("so57379575in", "two".getBytes());
};
}
@Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer<byte[], byte[]>> customizer(
KafkaTemplate<Object, Object> template) {
return (container, dest, group) -> {
// enable transaction synchronization
KafkaTransactionManager<?, ?> tm = (KafkaTransactionManager<?, ?>) container.getContainerProperties()
.getTransactionManager();
tm.setTransactionSynchronization(AbstractPlatformTransactionManager.SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
// container dead-lettering
DefaultAfterRollbackProcessor<? super byte[], ? super byte[]> afterRollbackProcessor =
new DefaultAfterRollbackProcessor<>(new DeadLetterPublishingRecoverer(template,
(ex, tp) -> new TopicPartition("so57379575dlt", -1)), 0);
container.setAfterRollbackProcessor(afterRollbackProcessor);
};
}
}
and
spring:
kafka:
bootstrap-servers:
- 10.0.0.8:9092
- 10.0.0.8:9093
- 10.0.0.8:9094
consumer:
auto-offset-reset: earliest
enable-auto-commit: false
properties:
isolation.level: read_committed
cloud:
stream:
bindings:
input:
destination: so57379575in
group: so57379575in
consumer:
max-attempts: 1
output:
destination: so57379575out
kafka:
binder:
transaction:
transaction-id-prefix: so57379575tx.
producer:
configuration:
acks: all
retries: 10
#logging:
# level:
# org.springframework.kafka: trace
# org.springframework.transaction: trace
and
in:two
2019-08-07 12:43:33.457 ERROR 36532 --- [container-0-C-1] o.s.integration.handler.LoggingHandler : org.springframework.messaging.MessagingException: Exception thrown while
...
Caused by: java.lang.RuntimeException: fail
...
in:one
dlt:two
out:ONE
I am using Spring Batch 3 with a DB2 server. The issue is that on restarting a failed job using the Spring JobOperator, the value of newFilesStatus from the job context is null. The value is used to decide the next step.
if (BatchStatus.STARTED.equals(executions.getStatus())) {
executions.setEndTime(new Date());
executions.setStatus(BatchStatus.FAILED);
executions.setExitStatus(ExitStatus.FAILED);
jobRepository.update(executions);
logger.info("Restart Status : Restarted for Start Status ");
jobOperator.restart(executions.getId());
} else if (BatchStatus.FAILED.equals(executions.getStatus())) {
loadCache();
logger.info("Restart Status : Restarted for Failed Status ");
jobOperator.restart(executions.getId());
} else {// For COMPLETED AND UNKNOWN Status
//New Run
}
And in the tasklet, to get the value from the context after restart:
@Override
public ExitStatus afterStep(StepExecution stepExecution) {
    logger.info("Step Execution Status : " + stepExecution.getStatus());
    ExitStatus exitStatus = stepExecution.getExitStatus();
    if (stepExecution.getStatus().equals(BatchStatus.FAILED)) {
        exitStatus = ExitStatus.FAILED;
    } else if (stepExecution.getStatus().equals(BatchStatus.COMPLETED)) {
        ExecutionContext jobContext = stepExecution.getJobExecution().getExecutionContext();
        // null here after a restart, which breaks the unboxing below
        boolean newFilesStatus = (Boolean) jobContext.get("newFilesArrived");
        // based on newFilesStatus the next step will be determined
        if (newFilesStatus) {
            // return the exit status that routes to the new step
        } else {
            // skip the new step
        }
    }
    return exitStatus;
}
DB driver - StarSQL driver
Thanks
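For reference, the usual way to make a flag like this visible in the job ExecutionContext so that it survives a restart is to write it to the step ExecutionContext and promote it with an ExecutionContextPromotionListener. A minimal sketch, assuming the step that detects new files puts a "newFilesArrived" boolean into its step context (the step and tasklet names here are made up for illustration):
@Bean
public ExecutionContextPromotionListener newFilesPromotionListener() {
    ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
    // keys copied from the step ExecutionContext to the job ExecutionContext when the step ends
    listener.setKeys(new String[] { "newFilesArrived" });
    return listener;
}

@Bean
public Step checkNewFilesStep(StepBuilderFactory steps, Tasklet checkNewFilesTasklet) {
    return steps.get("checkNewFilesStep")
            .tasklet(checkNewFilesTasklet) // the tasklet writes "newFilesArrived" into the step context
            .listener(newFilesPromotionListener())
            .build();
}
Because the promoted value is persisted with the job execution context, it is carried over to the new execution when the job is restarted.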