I am using Spring WebFlux with Spring Data JPA and PostgreSQL as the backend database.
I don't want to block the main threads while making DB calls like find and save.
To achieve this, I have a main scheduler used in the controller class and a jdbcScheduler used in the service classes.
They are defined as follows:
@Configuration
@EnableJpaAuditing
public class CommonConfig {

    @Value("${spring.datasource.hikari.maximum-pool-size}")
    int connectionPoolSize;

    @Bean
    public Scheduler scheduler() {
        return Schedulers.parallel();
    }

    @Bean
    public Scheduler jdbcScheduler() {
        return Schedulers.fromExecutor(Executors.newFixedThreadPool(connectionPoolSize));
    }

    @Bean
    public TransactionTemplate transactionTemplate(PlatformTransactionManager transactionManager) {
        return new TransactionTemplate(transactionManager);
    }
}
Now, while doing a get/save call in my service layer I do:
@Override
public Mono<Config> getConfigByKey(String key) {
    return Mono.defer(() -> Mono.justOrEmpty(configRepository.findByKey(key)))
               .subscribeOn(jdbcScheduler)
               .publishOn(scheduler);
}

@Override
public Flux<Config> getAllConfigsAfterAppVersion(int appVersion) {
    return Flux.fromIterable(configRepository.findAllByMinAppVersionIsGreaterThanEqual(appVersion))
               .subscribeOn(jdbcScheduler)
               .publishOn(scheduler);
}

@Override
public Flux<Config> addConfigs(List<Config> configList) {
    return Flux.fromIterable(configRepository.saveAll(configList))
               .subscribeOn(jdbcScheduler)
               .publishOn(scheduler);
}
And in the controller, I do:
@PostMapping
@ResponseStatus(HttpStatus.CREATED)
Mono<ResponseDto<List<Config>>> addConfigs(@Valid @RequestBody List<Config> configs) {
    return configService.addConfigs(configs)
                        .collectList()
                        .map(configList -> new ResponseDto<>(HttpStatus.CREATED.value(), configList, null))
                        .subscribeOn(scheduler);
}
Is this correct, and/or is there a better way to do it?
What I understand by:
.subscribeOn(jdbcScheduler)
.publishOn(scheduler);
is that the task will run on jdbcScheduler threads and the result will later be published on my main parallel scheduler. Is this understanding correct?
Your understanding is correct with regards to publishOn and subscribeOn (see reference documentation in the reactor project about those operators).
If you call blocking libraries without scheduling that work on a specific scheduler, those calls will block one of the few threads available (by default, the Netty event loop) and your application will only be able to serve a few requests concurrently.
Now I'm not sure what you're trying to achieve by doing that.
First, the parallel scheduler is designed for CPU-bound tasks, meaning it has only a few threads, as many as (or slightly more than) the number of CPU cores. In this case, it's like setting your thread pool size to the number of cores on a regular Servlet container. Your app won't be able to process a large number of concurrent requests.
Even if you choose a better alternative (like the elastic Scheduler), it will still not be as good as the Netty event loop, which is where request processing is natively scheduled in Spring WebFlux.
If your ultimate goal is performance and scalability, wrapping blocking calls in a reactive app is likely to perform worse than your regular Servlet container.
You could instead use Spring MVC and:
use usual blocking return types when you're dealing with a blocking library, like JPA
use Mono and Flux return types when you're not tied to such libraries
This won't be non-blocking, but it will still be asynchronous, and you'll be able to do more work in parallel without dealing with the complexity.
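As a hedged sketch of what that mix could look like in Spring MVC (the repository method is taken from the question and assumed to return a List; the WebClient-backed endpoint is purely illustrative and not from the original post):

import java.util.List;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;

@RestController
public class ConfigMvcController {

    private final ConfigRepository configRepository;         // blocking JPA repository
    private final WebClient webClient = WebClient.create();  // non-blocking HTTP client

    public ConfigMvcController(ConfigRepository configRepository) {
        this.configRepository = configRepository;
    }

    // Blocking library (JPA): keep the usual blocking return type; the MVC worker thread may block.
    @GetMapping("/configs")
    public List<Config> getConfigs(@RequestParam int appVersion) {
        return configRepository.findAllByMinAppVersionIsGreaterThanEqual(appVersion);
    }

    // Not tied to a blocking library: a reactive return type is fine; Spring MVC handles it asynchronously.
    @GetMapping("/remote-configs")
    public Flux<Config> getRemoteConfigs() {
        return webClient.get().uri("http://other-service/configs")
                        .retrieve()
                        .bodyToFlux(Config.class);
    }
}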
IMHO, there is a way to execute this operation that makes better use of the machine's resources. Following the documentation, you can wrap the blocking call so it runs on another thread, and your execution can continue.
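A minimal sketch of that idea, based on the getConfigByKey method from the question and assuming Reactor's Schedulers.boundedElastic() is available (it is intended for wrapping blocking work; the question's jdbcScheduler would work as well):

import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

@Override
public Mono<Config> getConfigByKey(String key) {
    // The blocking JPA call runs on a scheduler meant for blocking work,
    // so the Netty event loop threads are never tied up.
    return Mono.defer(() -> Mono.justOrEmpty(configRepository.findByKey(key)))
               .subscribeOn(Schedulers.boundedElastic());
}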
Related
I have such a controller and such a service class. Why am I getting this warning in IDEA: "Possibly blocking call in non-blocking context could lead to thread starvation"?
@PostMapping(value = {"/create"})
public Mono<ResponseEntity<ResponseDto>> create(@RequestBody RequestDto request) {
    ResponseDto result = service.create(request);
    return Mono.just(ResponseEntity.ok(result));
}

@Transactional
public ResponseDto create(RequestDto request) {
    taskRepository.save(request);
    return new ResponseDto("Ok");
}
This is apparently caused by the @Transactional annotation. When I remove it, the warning disappears. What is this problem and how can it be fixed?
P.S. This example is schematic; the real code is bigger.
Reactive processing works differently from the traditional model: you cannot use blocking calls here. Tomcat creates a separate thread for each request, so that thread is allowed to block. Reactive Netty does NOT create a new thread per request; it uses a small fixed pool.
Roughly speaking, if a task is waiting for a response, it hands the thread back so it can serve something else. If you block the thread, it cannot do that. This is why Netty, even with very few threads, can serve many parallel requests.
For the same reason, thread-local storage does not work reliably, because another task may run on the same thread and interfere with or modify it; the reactive Context is available instead.
There is an article about reactive transactions with Hibernate Reactive; I don't know whether it will be a solution for you:
https://itnext.io/integrating-hibernate-reactive-with-spring-5427440607fe
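Before reaching for reactive transactions, a hedged sketch of a more common workaround (my own illustration, not from the linked article): keep the blocking @Transactional service call, but schedule it on a scheduler intended for blocking work so that the Netty event loop is no longer the thread doing the blocking:

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

@PostMapping("/create")
public Mono<ResponseEntity<ResponseDto>> create(@RequestBody RequestDto request) {
    // service.create(...) is blocking (JPA + @Transactional), so run it on
    // boundedElastic instead of calling it directly on the Netty event loop.
    return Mono.fromCallable(() -> service.create(request))
               .subscribeOn(Schedulers.boundedElastic())
               .map(ResponseEntity::ok);
}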
I am currently on a project that builds microservices, and we are trying to move from the more traditional Spring Boot RestClient to the reactive stack, using Netty and WebClient as the HTTP client to connect to backend systems.
This is going well for backends with REST APIs; however, I'm still having some difficulties implementing WebClient for services that connect to SOAP backends and Oracle databases, which still use traditional JDBC.
I managed to find some workarounds online regarding JDBC calls that make use of parallel schedulers to publish the result of the blocking JDBC call:
// the method that is called by the @Service
@Override
public Mono<TransactionManagerModel> checkTransaction(String transactionId, String channel, String msisdn) {
    return asyncCallable(() -> checkTransactionDB(transactionId, channel, msisdn))
            .onErrorResume(error -> Mono.error(error));
}
...
// the actual JDBC call
private TransactionManagerModel checkTransactionDB(String transactionId, String channel, String msisdn) {
    ...
    List<TransactionManagerModel> result =
            jdbcTemplate.query(CHECK_TRANSACTION, paramMap, new BeanPropertyRowMapper<>(TransactionManagerModel.class));
    ...
}

// Generic async callable
private <T> Mono<T> asyncCallable(Callable<T> callable) {
    return Mono.fromCallable(callable).subscribeOn(Schedulers.parallel()).publishOn(transactionManagerJdbcScheduler);
}
and I think this works quite well.
For SOAP calls, what I did was encapsulate the SOAP call in a Mono, while the SOAP call itself uses a CloseableHttpClient, which is obviously a blocking HTTP client.
// The method that is being 'reactive'
public Mono<OfferRs> addOffer(String transactionId, String channel, String serviceId, OfferRq request) {
    ...
    OfferRs result = adapter.addOffer(transactionId, channel, generateRequest(request));
    ...
}

// The SOAP adapter that uses the blocking HTTP client
public OfferRs addOffer(String transactionId, String channel, JAXBElement<OfferRq> request) {
    ...
    response = (OfferRs) getWebServiceTemplate().marshalSendAndReceive(url, request, webServiceMessage -> {
        try {
            SoapHeader soapHeader = ((SoapMessage) webServiceMessage).getSoapHeader();
            ObjectFactory headerFactory = new ObjectFactory();
            AuthenticationHeader authHeader = headerFactory.createAuthenticationHeader();
            authHeader.setUserName(username);
            authHeader.setPassWord(password);
            JAXBContext headerContext = JAXBContext.newInstance(AuthenticationHeader.class);
            Marshaller marshaller = headerContext.createMarshaller();
            marshaller.marshal(authHeader, soapHeader.getResult());
        } catch (Exception ex) {
            log.error("Failed to marshall SOAP Header!", ex);
        }
    });
    return response;
    ...
}
My question is: is this implementation of the SOAP calls "reactive" enough that I won't have to worry about calls being blocked in some part of the microservice? I have already implemented the reactive stack: calling block() explicitly throws an exception, as it is not permitted when using Netty.
Or should I adopt the use of dedicated Schedulers for the SOAP calls as well?
After some discussions, I'll write an answer.
The Reactor documentation states that you should place blocking calls on their own schedulers. That's basically to keep the non-blocking part of Reactor going; when something blocking comes in, you fall back to traditional servlet-like behaviour, which means assigning one thread to each request.
Reactor has very good documentation about schedulers, their types, etc.
But in short:
subscribeOn
When someone subscribes, Reactor goes into what is called the assembly phase: starting from the subscribe point, it walks the operators backwards upstream until it finds a producer of data (for example a database, or another service, etc.). If it finds a subscribeOn operator anywhere during this phase, it places the entire chain on the scheduler that operator defines. One good thing to know is that the placement of subscribeOn does not really matter: as long as it is found during the assembly phase, the entire chain is affected.
Example usage could be:
We have blocking calls to a database, slow calls using a blocking REST client, reading a file from the file system in a blocking manner, etc.
publishOn
If you have publishOn somewhere in the chain, the chain switches from the current scheduler to the designated scheduler at that specific point. So publishOn placement DOES matter, as the switch happens exactly where it is placed. This operator is for when you want a specific part of the chain to run on a specific scheduler.
Example usage could be:
You are doing some heavy, CPU-bound calculations at a specific point: you could switch to Schedulers.parallel(), which places the calculations on separate cores to do the heavy CPU work, and when you are done you could switch back to the default scheduler.
The example above
Your SOAP calls should be placed on their own Scheduler if they are blocking, and I think subscribeOn with Schedulers.boundedElastic() will be enough to get traditional servlet-like behaviour. If you are worried about having every blocking call on the same Scheduler, you could pass the Scheduler into the asyncCallable function and split the calls across different Schedulers, as shown in the sketch below.
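A hedged sketch of that suggestion, reusing the names from the question (asyncCallable, adapter.addOffer, generateRequest); the scheduler choice is an assumption:

import java.util.concurrent.Callable;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Scheduler;
import reactor.core.scheduler.Schedulers;

// Generic async wrapper that lets each caller pick the scheduler for its blocking work.
private <T> Mono<T> asyncCallable(Callable<T> callable, Scheduler scheduler) {
    return Mono.fromCallable(callable).subscribeOn(scheduler);
}

// The blocking SOAP adapter call wrapped the same way as the JDBC call.
public Mono<OfferRs> addOffer(String transactionId, String channel, String serviceId, OfferRq request) {
    return asyncCallable(
            () -> adapter.addOffer(transactionId, channel, generateRequest(request)),
            Schedulers.boundedElastic());
}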
When using Spring WebFlux with a Mono or Flux return type, the HTTP handling thread is parked/released while the connection waits for the response; thus, the connection does not use up one of max-connections.
Question: how can I test/prove that the connection is really released while waiting for the response, and is not blocking max-connections?
I already enabled DEBUG logging, but that did not show anything regarding this question.
@RestController
public class MyServlet {

    @GetMapping("/")
    public Mono<String> /* or Flux<String> */ test() {
        return Mono.just(service.blocking());
    }
}

@Service
public class SlowService {

    public String blocking() {
        try {
            TimeUnit.SECONDS.sleep(10);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "OK";
    }
}
Or is that incorrect altogether, and would I have to use:
Mono.fromCallable(() -> service.blocking()).subscribeOn(Schedulers.elastic());
But still, how can I see from the logs that the connection gets parked correctly?
To test, I'm using server.tomcat.max-threads=5. I'm trying to rewrite my blocking service so that those threads are not blocked during the sleep, and thus more than 5 connections can reach my service concurrently.
There are two thread pools: the ForkJoin pool, which handles all the regular work, and the scheduler pool, which handles scheduled/offloaded tasks.
return Mono.just(service.blocking());
This will block one of the threads in the ForkJoinPool, so fewer events can be handled by WebFlux, slowing down your service.
Mono.fromCallable(() -> service.blocking()).subscribeOn(Schedulers.elastic());
This will assign the task and "offload" it to the scheduler pool of threads, so another thread pool handles it and it does not hog one of the ForkJoinPool threads.
How to test this? You need to load-test your service, or, as most people do, trust the framework to do what it is designed to do and trust that the Spring team has tested their side of things.
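If you do want to see it for yourself rather than trust the framework, a rough sketch of a manual check (my own illustration; the endpoint, port, and request count are assumptions) is to fire more concurrent requests than server.tomcat.max-threads allows and compare the total time against the 10-second sleep:

import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.CountDownLatch;
import org.springframework.web.reactive.function.client.WebClient;

public class ConcurrencyCheck {
    public static void main(String[] args) throws InterruptedException {
        WebClient client = WebClient.create("http://localhost:8080");
        int requests = 20; // more than max-threads=5
        CountDownLatch latch = new CountDownLatch(requests);
        Instant start = Instant.now();

        for (int i = 0; i < requests; i++) {
            client.get().uri("/").retrieve().bodyToMono(String.class)
                  .doFinally(signal -> latch.countDown())
                  .subscribe();
        }

        latch.await();
        System.out.println("Total: " + Duration.between(start, Instant.now()));
    }
}

If the request-handling threads are truly parked, the 20 requests should finish in a little over 10 seconds; if they block, the total time grows towards 40 seconds (20 requests / 5 threads * 10 s).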
Within my Spring Boot app, I make several database calls asynchronously, like so:
public List<Results> doWork() {
    List<Observable<Results>> observables = Lists.newArrayList();
    observables.add(
        Observable.fromCallable(() -> dbQueryMethod(param1)).subscribeOn(Schedulers.io()));
    observables.add(
        Observable.fromCallable(() -> dbQueryMethod(param2)).subscribeOn(Schedulers.io()));
    return Observable.merge(observables)
        .toList()
        .toBlocking()
        .single();
}

@Transactional(readOnly = true)
public List<Results> myServiceMethod() {
    return doWork();
}
Basically, the issue is that despite marking my service-layer method as transactional and read-only, the ThreadLocal state is not passed on to the new threads spawned by RxJava, causing the connections to go to my master DB instance instead of the read replica.
We are currently using configured ContextHandlers and a ContextAwareSchedulerHook in App.java, but what do I need to do so that the new threads created by RxJava inherit whatever ThreadLocal state is needed to keep them within the defined transaction?
Mark your dbQueryMethod with @Transactional(readOnly = true) and make sure you invoke it via a Spring autowired proxy (not directly from the same class). Then the io thread will start a new transaction, which will be read-only.
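A hedged sketch of that arrangement (QueryService and ResultsRepository are illustrative names, not from the original post); the important part is that dbQueryMethod lives in a different bean, so the io threads enter it through the transactional proxy:

import java.util.List;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class QueryService {

    private final ResultsRepository resultsRepository; // hypothetical repository, for illustration

    public QueryService(ResultsRepository resultsRepository) {
        this.resultsRepository = resultsRepository;
    }

    // Each RxJava io thread that calls this method goes through the Spring proxy,
    // so a new read-only transaction is started on that thread.
    @Transactional(readOnly = true)
    public List<Results> dbQueryMethod(String param) {
        return resultsRepository.findByParam(param); // hypothetical query method
    }
}

doWork() would then call queryService.dbQueryMethod(param1) inside Observable.fromCallable(...), exactly as before.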
TLDR: I have background processing going on in RxJava Observables. I am writing integration tests and would like to be able to independently wait for that processing to finish, to make sure that background processing started from one test does not interfere with another test.
Simplified, I have a @RequestMapping method that does the following:
insert data in database
launch an asynchronous processing of that data (http calls via Feign, db updates)
returns nothing (HttpStatus.NO_CONTENT)
This asynchronous processing was previously done with a ThreadPoolTaskExecutor. We're going to transition to RxJava and would like to remove this ThreadPoolTaskExecutor and do the background processing with RxJava.
So, quite naively, for the moment I tried to do this instead:
Observable
    .defer(() -> Observable.just(longMethod())) // call to the long blocking method
    .subscribeOn(Schedulers.io())
    .subscribe();
The end goal is of course to, one step at a time, go down into "call to long blocking method" and use Observable all the way.
Now, before that, I would like to make my integration tests work. I am testing this by making a RestTemplate call to the mapping. As most of the work is asynchronous, my call returns really fast. I would like to find a way to wait for the asynchronous processing to finish (to make sure it does not conflict with another test).
Before RxJava, I would just count the tasks in the ThreadPoolTaskExecutor and wait until that count reached 0.
How can I do that with RxJava ?
What I tried:
I tried to make all my Schedulers immediate with an RxJavaSchedulersHook: this causes some sort of blocking somewhere; code execution stops just before my Feign calls (Feign uses RxJava under the hood).
I tried to count the tasks with an RxJavaObservableExecutionHook: I tried retaining the subscriptions and removing them when isSubscribed == false, but this didn't work at all (lots of subscribers, and the count never goes down).
I tried to put an observeOn(immediate()) in the real production code. This seems to work, and I could inject the right scheduler for the runtime/test phases, but I am not really keen on putting code in my real production code just for testing purposes.
I'm probably terribly wrong, or overcomplicating things, so don't hesitate to correct my reasoning!
How do you return HttpStatus.NO_CONTENT?
@RequestMapping(value = "/")
public HttpStatus home() {
    Observable.defer(() -> Observable.just(longMethod()))
        .subscribeOn(Schedulers.io())
        .subscribe();
    return HttpStatus.NO_CONTENT;
}
In this form, you can't know when longMethod has finished.
If you want to know when all async jobs are completed, you can return HttpStatus.NO_CONTENT only once all jobs are completed, using Spring's DeferredResult, or use a TestSubscriber.
PS: you can use Observable.fromCallable(() -> longMethod()) instead of Observable.defer(() -> Observable.just(longMethod())) if you want.
Using DeferredResult
@RequestMapping(value = "/")
public DeferredResult<HttpStatus> index() {
    DeferredResult<HttpStatus> deferredResult = new DeferredResult<>();
    Observable.fromCallable(() -> longMethod())
        .subscribeOn(Schedulers.io())
        .subscribe(
            value -> {},
            e -> deferredResult.setErrorResult(e.getMessage()),
            () -> deferredResult.setResult(HttpStatus.NO_CONTENT));
    return deferredResult;
}
Like this, when you call your method, you'll get your result only when the Observable completes (that is, when longMethod has finished).
Using TestSubscriber
You'll have to inject a TestSubscriber and then ask it to wait for and check the completion of your Observable:
@RequestMapping(value = "/")
public HttpStatus home() {
    Observable.defer(() -> Observable.just(longMethod()))
        .subscribeOn(Schedulers.io())
        .subscribe(subscriber); // you'll have to inject this subscriber in your test
    return HttpStatus.NO_CONTENT;
}
and in your test:
TestSubscriber subscriber = new TestSubscriber(); // you'll have to inject it into your controller
// ....
controller.home();
subscriber.awaitTerminalEvent();
subscriber.assertCompleted(); // check that no error occurred
You could use an ExecutorServiceAdapter to bridge from the Spring ThreadPoolTaskExecutor to the ExecutorService in RxJava, and then do the same trick as before.
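A hedged sketch of that bridge (my own illustration; the class and method names are assumptions): ExecutorServiceAdapter wraps the existing Spring executor, Schedulers.from(...) turns it into an RxJava Scheduler, and the test keeps polling the executor's active count as before:

import java.util.concurrent.ExecutorService;
import org.springframework.core.task.support.ExecutorServiceAdapter;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import rx.Observable;
import rx.Scheduler;
import rx.schedulers.Schedulers;

public class BackgroundJobs {

    private final ThreadPoolTaskExecutor taskExecutor;  // the existing Spring bean
    private final Scheduler backgroundScheduler;

    public BackgroundJobs(ThreadPoolTaskExecutor taskExecutor) {
        this.taskExecutor = taskExecutor;
        // Bridge the Spring executor into an RxJava Scheduler.
        ExecutorService executorService = new ExecutorServiceAdapter(taskExecutor);
        this.backgroundScheduler = Schedulers.from(executorService);
    }

    public void launch() {
        Observable.fromCallable(() -> longMethod())
                  .subscribeOn(backgroundScheduler)
                  .subscribe();
    }

    // Tests can keep counting tasks exactly as before RxJava.
    public boolean idle() {
        return taskExecutor.getActiveCount() == 0;
    }

    private String longMethod() {
        return "done"; // stand-in for the real long blocking method
    }
}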
A few months later in the game, my advice is simply "don't do that". RxJava is not really suited to this kind of job. Without going into too much detail, having lots of "loose" Observables running in the background is not appropriate: depending on the volume of your requests you can easily run into queue and memory issues, and, more importantly, what happens to all the scheduled and running tasks if the web server crashes? How do you restart them?
Spring offers better alternatives IMHO: Spring Batch, Spring Cloud Task, or messaging with Spring Cloud Stream. So don't do as I did; use the right tool for the right job.
Now, if you really want to go down the bad route:
Either return an SseEmitter and consume only the first event from the SSE in the consumer service, consuming all events in your tests,
Or create an RxJava lift operator that wraps (in the call method) the Subscriber in a parent Subscriber that has a waitForCompletion method. How you do the waiting is up to you (with a CountDownLatch, for example). That subscriber would be added to a synchronized list (and removed from it once completed), and in your tests you could just iterate over the list and call waitForCompletion on each item. It's not that complicated and I got it to work, but please, don't do that!
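For completeness, a hedged sketch of that second idea, using a Phaser instead of the lift operator and synchronized list described above (same principle of counting in-flight work; all names are mine):

import java.util.concurrent.Phaser;
import rx.Observable;

// Wraps background Observables so integration tests can wait for all of them to terminate.
public class BackgroundWorkTracker {

    private final Phaser inFlight = new Phaser(1); // one party reserved for the waiting test

    public <T> Observable<T> track(Observable<T> source) {
        return source
                .doOnSubscribe(inFlight::register)                 // one more job in flight
                .doAfterTerminate(inFlight::arriveAndDeregister);  // job completed or errored
    }

    // Called from the test: blocks until every tracked Observable has terminated.
    public void awaitBackgroundWork() {
        inFlight.arriveAndAwaitAdvance();
    }
}

The production code would wrap its background work with tracker.track(observable).subscribeOn(Schedulers.io()).subscribe(), and the test would call tracker.awaitBackgroundWork() after its RestTemplate call.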