Long Flux sometimes does not complete - java

I need to transfer items from a non-reactive repository to a reactive repository (Firestore).
The procedure is triggered from a REST endpoint exposed with Netty.
The code below is what I've written after some trial and error.
The query against the non-reactive repository is not long (~20 sec), but it returns a lot of records, and the whole execution usually takes ~60 min.
All records are always saved and every "Saving in progress... XXX" line is printed, but about 50% of the time "Saved XXX records" is never printed, and no errors are logged.
Things I've noticed:
more records -> higher probability of failure
it does not depend on the execution time (some runs that take longer than the failed ones complete)
The app runs on a k8s pod with a 1500Mi RAM request and a 3000Mi limit; according to the graphs it never approaches the limit.
What am I missing here?
@Slf4j
@RestController
@RequestMapping("/import")
public class ImportController {
    @Autowired
    private NotReactiveRepository notReactiveRepository;
    @Autowired
    private ReactiveRepository reactiveRepository;
    private static final Scheduler queryScheduler = Schedulers.newBoundedElastic(1, 480, "query", 864000); // max 10 days processing time
    @GetMapping("/start")
    public Mono<String> start() {
        log.info("Start");
        return Mono.just("RECEIVED")
                // fire and forget
                .doOnNext(stringRouteResponse -> startProcess().subscribe());
    }
    private Mono<Long> startProcess() {
        Mono<List<Items>> resultsBlockingMono = Mono
                .fromCallable(() -> notReactiveRepository.findAll())
                .subscribeOn(queryScheduler)
                .retryWhen(Retry.backoff(5, Duration.of(2, ChronoUnit.SECONDS)));
        return resultsBlockingMono
                .doOnNext(records -> log.info("Records: {}", records.size()))
                .flatMapMany(Flux::fromIterable)
                .map(ItemConverter::convert)
                // max 9000 save/sec
                .delayElements(Duration.of(300, ChronoUnit.MICROS))
                .flatMap(this::saveConvertedItem)
                .zipWith(Flux.range(1, Integer.MAX_VALUE))
                .doOnNext(savedAndIndex -> log.info("Saving in progress... {}", savedAndIndex.getT2()))
                .count()
                .doOnNext(numberOfSaved -> log.info("Saved {} records", numberOfSaved));
    }
    private Mono<ConvertedItem> saveConvertedItem(ConvertedItem convertedItem) {
        return reactiveRepository.save(convertedItem)
                .retryWhen(Retry.backoff(1000, Duration.of(2, ChronoUnit.MILLIS)))
                .onErrorResume(throwable -> {
                    log.error("Resuming");
                    return Mono.empty();
                })
                .doOnError(throwable -> log.error("Error on save"));
    }
}
Update:
As requested, this is the last output of the procedure, at the point where "Saved 1131113 records" should appear, with .log() added before .count() (the lines after the onNext are always printed once the process ends, on successful runs too):
"Saving... 1131113"
"| onNext([ConvertedItem(...),1131113])"
"Shutting down ExecutorService 'pubsubPublisherThreadPool'"
"Shutting down ExecutorService 'pubSubAcknowledgementExecutor'"
"Shutting down ExecutorService 'pubsubSubscriberThreadPool'"
"Closing JPA EntityManagerFactory for persistence unit 'default'"
"HikariPool-1 - Shutdown initiated..."
"HikariPool-1 - Shutdown completed."

Related

Spring Reactor Mono wait for the subscriber to complete its task before returning the data

I am trying to execute multiple HTTP calls with the Spring Reactor WebClient, each of which returns a Mono. I am using Mono.block() to wait for all Monos to complete, but this does not wait until the subscribe() callbacks have completed.
public class UserValidator {
    public Mono<User> getUserSummary(int userId) {
        User user = new User();
        // First Mono
        Mono<Address> address = WebClientUtil.mono(userId, Address.class);
        address.subscribe(adrs -> updateAddress(adrs, user));
        // Second Mono
        Mono<Education> education = WebClientUtil.mono(userId, Education.class);
        education.subscribe(edcn -> updateEducationDeatils(edcn, user));
        Mono.when(address, education).block(); // Blocking the monos to complete
        // Intermittently returning the incomplete data ******************
        return Mono.just(user);
    }
    private void updateAddress(Address adrs, User user) {
        // Do some validations
        // Validations takes 5 to 10 seconds
        user.setAddress(adrs);
    }
    private void updateEducationDeatils(Education education, User user) {
        // Do some validations
        // Validations takes 5 to 10 seconds
        user.setEducation(education);
    }
}
Can someone please help me fix this so that the updated User object is returned after subscribe() has completed?
You should not block a Mono if the return value is a Mono as well; use zip instead:
public Mono<User> getUserSummary(int userId) {
    Mono<Address> addressMono = WebClientUtil.mono(userId, Address.class);
    Mono<Education> educationMono = WebClientUtil.mono(userId, Education.class);
    // Combine both results once each Mono has emitted; nothing blocks here
    return Mono.zip(addressMono, educationMono, (address, education) -> {
        User user = new User();
        user.setAddress(address);
        user.setEducation(education);
        return user; // the combinator must return the combined value
    });
}
Also, make sure your validation does not block either.
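The same "combine instead of block" idea can be tried without any Reactor dependency. The sketch below uses the JDK's CompletableFuture.thenCombine as a stand-in for Mono.zip; it is not Reactor code. The names CombineSketch, fetch and the simplified User class are hypothetical and exist only for this illustration.

```java
import java.util.concurrent.CompletableFuture;

public class CombineSketch {

    // Hypothetical, simplified stand-in for the User type in the question.
    static final class User {
        final String address;
        final String education;

        User(String address, String education) {
            this.address = address;
            this.education = education;
        }
    }

    // Hypothetical stand-in for WebClientUtil.mono(...): an asynchronous lookup.
    static CompletableFuture<String> fetch(String kind, int userId) {
        return CompletableFuture.supplyAsync(() -> kind + "-" + userId);
    }

    // Starts both lookups and builds the User only when both results are
    // available; nothing blocks and no shared object is mutated from callbacks.
    static CompletableFuture<User> getUserSummary(int userId) {
        CompletableFuture<String> address = fetch("address", userId);
        CompletableFuture<String> education = fetch("education", userId);
        return address.thenCombine(education, User::new);
    }

    public static void main(String[] args) {
        // join() appears only here, at the outer edge of the program.
        User user = getUserSummary(7).join();
        System.out.println(user.address + ", " + user.education);
    }
}
```

The point of the design is that the User is created inside the combining function, so an incomplete object can never be observed by the caller.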

Spring Boot java.util.concurrent.ThreadPoolExecutor size

Currently I am testing my Spring Boot app, which is a REST service with a circuit breaker pattern.
I called my service with 20 threads at the same time and got the following log entry:
Task java.util.concurrent.FutureTask@127adac1[Not completed, task = java.util.concurrent.Executors$RunnableAdapter@74bf28cd[Wrapped task = null]] rejected from java.util.concurrent.ThreadPoolExecutor@19ae13b2[Running, pool size = 10, active threads = 10, queued tasks = 0, completed tasks = 16]
So my question is: is the maximum size of the thread pool really 10, and am I able to set it to something different?
I found the property server.tomcat.max-threads, and when I set it to 10 all requests pass.
Edit
I am calling another REST service with the Spring Boot RestTemplate; could this be causing the problem?
@HystrixCommand(
        commandKey = "callRestService", fallbackMethod = "failCallRestService", ignoreExceptions = DataNotFoundException.class)
public ResponseEntity<DataAtomsResponse> callRestService(String searchItem, String trackingId)
        throws RestServiceNotAvailableException
{
    ThreadContext.put("trackingID", trackingId);
    configureRestTemplate();
    LOGGER.info("Received request with trackingId: {}", trackingId);
    Map<String, String> restUrlParams = new HashMap<>();
    restUrlParams.put(REST_URL_PARAMETER, searchItem);
    HttpEntity<String> entity = getRestParameters(trackingId);
    DataAtomsResponse dataAtomsResponse;
    LOGGER.info("Request RestService trackingID: {}", trackingId);
    ResponseEntity<String> dataResponse =
            restTemplate.exchange(config.getRestServiceUrl(), HttpMethod.GET, entity, String.class, restUrlParams);
    if (dataResponse.getStatusCode() == HttpStatus.OK || dataResponse.getStatusCode() == HttpStatus.NOT_FOUND) {
        LOGGER.debug("Transform result from RestService to JSON trackingID: {}", trackingId);
        dataAtomsResponse = dataParser.parse(dataResponse.getBody(), searchItem, trackingId);
        return ResponseEntity.ok(dataAtomsResponse);
    }
    else {
        throw new RestServiceNotAvailableException(dataResponse.getStatusCode().getReasonPhrase());
    }
}
I have not implemented any ThreadPoolExecutor in any other class.
I found out what the problem was.
Hystrix uses the standard Java ThreadPoolExecutor, and its maximum number of threads is set to 10. This article (https://medium.com/@truongminhtriet96/playing-with-hystrix-thread-pool-c7eebb5b0ddc) helped me a lot. So I set these configs:
hystrix.threadpool.default.maximumSize=32
hystrix.threadpool.default.allowMaximumSizeToDivergeFromCoreSize=true
server.tomcat.max-threads=32
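The rejection in the log can be reproduced with a plain JDK ThreadPoolExecutor. The following is a minimal sketch, not Hystrix code: it assumes a pool with no task queue (matching "queued tasks = 0" in the log), and the class name PoolRejectionSketch and the method countRejections are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolRejectionSketch {

    // Submits `requests` long-running tasks to a pool with `maxThreads` threads
    // and no queue, and returns how many submissions were rejected.
    static int countRejections(int maxThreads, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 1, TimeUnit.MINUTES, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy, like a slow REST call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // every thread is busy and there is nowhere to queue
            }
        }
        release.countDown(); // let the accepted tasks finish
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("pool of 10: " + countRejections(10, 20) + " rejected");
        System.out.println("pool of 32: " + countRejections(32, 20) + " rejected");
    }
}
```

With 20 concurrent callers, a pool of 10 rejects the 10 submissions that arrive while all threads are busy, whereas a pool of 32 accepts them all, which is consistent with raising the maximum size to 32 in the configuration above.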

Manually acknowledge consumption of Kafka event A after producing event B

I have a case where I have to consume event A, do some processing, and then produce event B. My problem is: what happens if the processing crashes and the application cannot produce B even though it has already consumed A? My approach is to acknowledge only after successfully publishing B. Is that correct, or should I implement another solution for this case?
@KafkaListener(
        id = TOPIC_ID,
        topics = TOPIC_ID,
        groupId = GROUP_ID,
        containerFactory = LISTENER_CONTAINER_FACTORY
)
public void listen(List<Message<A>> messages, Acknowledgment acknowledgment) {
    try {
        final AEvent aEvent = messages.stream()
                .filter(message -> null != message.getPayload())
                .map(Message::getPayload)
                .findFirst()
                .get();
        processDao.doSomeProcessing() // returns a Mono<Example> by calling an external API
                .subscribe(
                        response -> {
                            ProducerRecord<String, BEvent> BEventRecord = new ProducerRecord<>(TOPIC_ID, null, BEvent);
                            ListenableFuture<SendResult<String, BEvent>> future = kafkaProducerTemplate.send(buildBEvent());
                            future.addCallback(new ListenableFutureCallback<SendResult<String, BEvent>>() {
                                @Override
                                public void onSuccess(SendResult<String, BEvent> BEventSendResult) {
                                    //TODO: do when event published successfully
                                }
                                @Override
                                public void onFailure(Throwable exception) {
                                    exception.printStackTrace();
                                    throw new ExampleException();
                                }
                            });
                        },
                        error -> {
                            error.printStackTrace();
                            throw new ExampleException();
                        }
                );
        acknowledgment.acknowledge(); // ??
    } catch (ExampleException exception) {
        exception.printStackTrace();
    }
}
You can't manage Kafka "acknowledgments" when using async code such as Reactor.
Kafka does not manage discrete acks for each topic/partition, just the last committed offset for the partition.
If you process two records asynchronously, you will have a race as to which offset will be committed first.
You need to perform the sends on the listener container thread to maintain proper ordering.
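The race described above can be illustrated with a toy model. This is a sketch under a simplifying assumption, not the Kafka client API: a partition is reduced to a single stored offset, and each commit simply overwrites it. The class OffsetCommitSketch and its methods are hypothetical.

```java
public class OffsetCommitSketch {

    // Toy model, not the Kafka client API: a partition remembers only one
    // number, the offset a restarted consumer would resume from.
    private long committed = 0;

    // Committing after record N stores N + 1, whatever was stored before.
    void commitAfter(long recordOffset) {
        committed = recordOffset + 1;
    }

    // Replays commits in the order the asynchronous work happened to finish
    // and returns the offset the consumer would resume from afterwards.
    static long resumeAfter(long... completionOrder) {
        OffsetCommitSketch partition = new OffsetCommitSketch();
        for (long offset : completionOrder) {
            partition.commitAfter(offset);
        }
        return partition.committed;
    }

    public static void main(String[] args) {
        // Record 1 finishes and commits; record 0 crashes before committing.
        System.out.println(resumeAfter(1));    // 2: record 0 is never redelivered
        // Record 1 commits first, record 0 commits later.
        System.out.println(resumeAfter(1, 0)); // 1: record 1 is delivered again
        // Commits arrive in offset order.
        System.out.println(resumeAfter(0, 1)); // 2: nothing lost or repeated
    }
}
```

Only the last case, where commits happen in offset order, behaves as intended, which is why the sends need to stay on the listener container thread.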

Combining many ReactiveX streams into one result stream

I am trying to understand ReactiveX using RxJava, but I can't grasp the whole reactive idea. My case is the following:
I have a Task class. It has a perform() method that executes an HTTP request and gets a response through the executeRequest() method. The request may be executed many times (a defined number of repetitions). I want to grab all the results of executeRequest() and combine them into a Flowable data stream that perform() can return. So in the end I want my method to return all results of the requests that my Task executed.
executeRequest() returns a Single because it executes only one request and may provide only one response, or none at all (in case of a timeout).
In perform() I create a Flowable range of numbers, one per repetition. In the subscription to this Flowable I execute one request per repetition. I additionally subscribe to each response Single for logging and for gathering responses into a collection for later. So now I have a set of Singles; how can I merge them into a Flowable to return from perform()? I tried to mess around with operators like merge(), but I don't understand its parameter types.
I've read some guides on the web, but they are all very general or don't provide examples that match my case.
public Flowable<HttpClientResponse> perform() {
    Long startTime = System.currentTimeMillis();
    List<HttpClientResponse> responses = new ArrayList<>();
    List<Long> failedRepetitionNumbers = new ArrayList<>();
    Flowable.rangeLong(0, repetitions)
            .subscribe(repetition -> {
                logger.debug("Performing repetition {} of {}", repetition + 1, repetitions);
                Long currentTime = System.currentTimeMillis();
                if (durationCap == 0 || currentTime - startTime < durationCap) {
                    Single<HttpClientResponse> response = executeRequest(method, url, headers, body);
                    response.subscribe(successResult -> {
                                logger.info("Received response with code {} in the {}. repetition.", successResult
                                        .statusCode(), repetition + 1);
                                responses.add(successResult);
                            },
                            error -> {
                                logger.error("Failed to receive response from {}.", url);
                                failedRepetitionNumbers.add(repetition);
                            });
                    waitInterval(minInterval, maxInterval);
                } else {
                    logger.info("Reached duration cap of {}ms for task {}.", durationCap, this);
                }
            });
    return Flowable.merge(???);
}
And executeRequest():
private Single<HttpClientResponse> executeRequest(HttpMethod method, String url, LinkedMultiValueMap<String, String>
        headers, JsonNode body) {
    CompletableFuture<HttpClientResponse> responseFuture = new CompletableFuture<>();
    HttpClient client = vertx.createHttpClient();
    HttpClientRequest request = client.request(method, url, responseFuture::complete);
    headers.forEach(request::putHeader);
    request.write(body.toString());
    request.setTimeout(timeout);
    request.end();
    return Single.fromFuture(responseFuture);
}
Instead of subscribing to each observable (each HTTP request) within your perform() method, just keep chaining the observables. Your code can be reduced to something like this:
public Flowable<HttpClientResponse> perform() {
    // Here return a flowable, which can emit n number of times (where n = your number of HTTP requests)
    return Flowable.rangeLong(0, repetitions) // start a counter
            .doOnNext(repetition -> logger.debug("Performing repetition {} of {}", repetition + 1, repetitions)) // print the current count
            .flatMap(count -> executeRequest(method, url, headers, body).toFlowable()) // get the executeRequest as Flowable
            .timeout(durationCap, TimeUnit.MILLISECONDS); // apply a timeout policy
}
Finally, you can subscribe to perform() at the place where you actually need to execute all this, as shown below:
perform()
        .subscribeWith(new DisposableSubscriber<HttpClientResponse>() {
            @Override
            public void onNext(HttpClientResponse httpClientResponse) {
                // onNext will be triggered each time, whenever a request has executed and ready with result
                // if you had 5 HTTP request, this can trigger 5 times with each "httpClientResponse" (if all calls were success)
            }
            @Override
            public void onError(Throwable t) {
                // any error during the execution of these request,
                // including a TimeoutException in case timeout happens in between
            }
            @Override
            public void onComplete() {
                // will be called finally if no errors happened and onNext delivered all the results
            }
        });

RxJava Observable Timeout not working at times

Below is my code snippet; sometimes the Observable does not time out:
Observable<A> AObservable = Observable.fromCallable(() ->
        //External Service Call
        ).timeout(800, TimeUnit.MILLISECONDS)
        .subscribeOn(Schedulers.io())
        .onErrorReturn(throwable -> {
            LOGGER.warn(format("Server did not respond within %s ms for id=%s", 800, id));
            return null;
        });
Observable<B> BObservable = Observable.fromCallable(() ->
        //External Service Call
        ).timeout(800, TimeUnit.MILLISECONDS)
        .subscribeOn(Schedulers.io())
        .onErrorReturn(throwable -> {
            LOGGER.warn(format("Service did not respond within %s ms for id=%s", 800, Id));
            return null;
        });
// Build Default response
Observable<C> CObservable = Observable.fromCallable(() ->
        // Build Default one
        ).subscribeOn(Schedulers.io());
return Observable.zip(AObservable, BObservable, CObservable,
        (AResponse, BResponse, CResponse) -> {
            // Handle response and combine them
        }).toBlocking().first();
It appears to me that at times, if the service takes more than 800 ms, the timeout does not happen. Am I missing an attribute here? Please advise.
