Within my Spring Boot app I make several database calls asynchronously, like so:
public List<Results> doWork() {
    List<Observable<Results>> observables = Lists.newArrayList();
    observables.add(
        Observable.fromCallable(() -> dbQueryMethod(param1)).subscribeOn(Schedulers.io()));
    observables.add(
        Observable.fromCallable(() -> dbQueryMethod(param2)).subscribeOn(Schedulers.io()));
    return Observable.merge(observables)
        .toList()
        .toBlocking()
        .single();
}
@Transactional(readOnly = true)
public List<Results> myServiceMethod() {
    return doWork();
}
The issue is that, although my service-layer method is marked as transactional with readOnly set to true, the ThreadLocal state is not passed to the new threads spawned by RxJava, so the connections go to my master DB instance instead of the read replica.
We currently use configured ContextHandlers and a ContextAwareSchedulerHook in App.java, but what do I need to do so that the new threads created by RxJava inherit whatever ThreadLocal state is needed to run them within the defined transaction?
Mark your dbQueryMethod with @Transactional(readOnly=true) and make sure you invoke it through the Spring-autowired proxy (not directly from the same class). The io thread will then start a new transaction, which will be read-only.
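To illustrate why the flag set on the calling thread is not seen by the worker, here is a minimal plain-JDK sketch. It does not use Spring or RxJava: READ_ONLY is a hypothetical stand-in for the per-thread transaction state, and the single-thread executor stands in for an io scheduler thread. Re-establishing the state inside the task is the plain-Java counterpart of calling a transactional method on the worker thread.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadLocalVisibilitySketch {
    // Hypothetical stand-in for the per-thread "read-only transaction" marker.
    static final ThreadLocal<Boolean> READ_ONLY = ThreadLocal.withInitial(() -> false);

    /** Runs a task on a worker thread and reports the flag value seen there. */
    static boolean flagSeenOnWorker(boolean rebindInsideTask) {
        READ_ONLY.set(true); // caller thread: "inside a read-only transaction"
        ExecutorService io = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> seen = io.submit(() -> {
                if (rebindInsideTask) {
                    READ_ONLY.set(true); // state established on the worker itself
                }
                try {
                    return READ_ONLY.get();
                } finally {
                    READ_ONLY.remove(); // do not leak state into a reused thread
                }
            });
            return seen.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            io.shutdown();
            READ_ONLY.remove();
        }
    }

    public static void main(String[] args) {
        System.out.println("without rebinding: " + flagSeenOnWorker(false));
        System.out.println("with rebinding:    " + flagSeenOnWorker(true));
    }
}
```

The first call shows the worker falling back to the default value; the second shows that state bound on the worker thread itself is visible to the code running there.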
Related
I am writing a service that exposes an endpoint which calls another service; if that call succeeds, I want to send the result back to the UI or calling app.
In parallel, before sending back the response, I want to submit a task that runs in the background, and my call should not depend on the success or failure of that task.
Before returning the response I want to do this:
executorService.execute(object);
This should not be a blocking call.
Any suggestions?
Spring async methods are the way to go here, as was suggested in the comments.
Some caveats:
Async methods can have different return types. It's true that they can return CompletableFuture, but that is for the case where you call them from some background process and would like to wait for or check their execution status, or perhaps execute something else when the future is ready. In your case it seems that you want "fire-and-forget" behavior, so you should use a void return type for your @Async-annotated method.
Make sure that you add @EnableAsync. Under the hood, it wraps the bean that has @Async methods in a proxy, and that proxy is what actually gets injected into your service; @EnableAsync turns on this proxy generation mechanism. You can usually verify this in the debugger by checking the actual type of the injected reference.
Consider customizing the task executor to make sure that your async methods run on an executor that matches your needs. For example, you probably won't want every invocation of an async method to spawn a new thread (and there is an executor that behaves like this). You can read about the various executors here, for example.
Update
Code-wise you should do something like this:
@Component // must be a Spring-managed bean so that the @Async proxy is created
public class MyAsyncHandler {
    @Async
    public void doAsyncJob(...) {
        ...
    }
}
@Service
public class MyService {
    @Autowired // or autowired constructor
    private MyAsyncHandler asyncHandler;
    public Result doMyMainJob(params) {
        dao.saveInDB();
        // do other synchronous stuff
        Result res = prepareResult();
        asyncHandler.doAsyncJob(); // this returns immediately
        return res;
    }
}
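The executor caveat above can be sketched with the plain JDK. This is only an illustration of a bounded pool and of fire-and-forget submission, not Spring's own configuration: the pool sizes, the queue capacity and the class name are assumptions, and in a Spring app an executor like this would typically be exposed as a bean for async methods to use.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class FireAndForgetSketch {
    /** Bounded pool: 2 core threads, at most 4 threads, 100 queued tasks (illustrative numbers). */
    static ThreadPoolExecutor boundedExecutor() {
        // With the default handler, submissions beyond capacity throw RejectedExecutionException.
        return new ThreadPoolExecutor(2, 4, 30L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(100));
    }

    /** Returns true if the caller had its result while the background job was still pending. */
    static boolean mainJobReturnsBeforeBackgroundJob() {
        ThreadPoolExecutor executor = boundedExecutor();
        CountDownLatch release = new CountDownLatch(1);  // holds the background job back
        CountDownLatch finished = new CountDownLatch(1); // signals that the job is done
        try {
            executor.execute(() -> {                     // returns immediately
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                finished.countDown();
            });
            String result = "main result";               // the synchronous part of the request
            boolean stillPending = finished.getCount() == 1;
            release.countDown();                         // now let the background job run
            boolean completed = finished.await(5, TimeUnit.SECONDS);
            return stillPending && completed && result.equals("main result");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(mainJobReturnsBeforeBackgroundJob());
    }
}
```

The latches are there only to make the ordering observable; real code would simply call execute and return.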
I am using Spring WebFlux with Spring Data JPA, with PostgreSQL as the backend DB.
I don't want to block the main thread while making DB calls like find and save.
To achieve this, I have a main scheduler in the controller class and a jdbcScheduler in the service classes.
They are defined as follows:
@Configuration
@EnableJpaAuditing
public class CommonConfig {
    @Value("${spring.datasource.hikari.maximum-pool-size}")
    int connectionPoolSize;
    @Bean
    public Scheduler scheduler() {
        return Schedulers.parallel();
    }
    @Bean
    public Scheduler jdbcScheduler() {
        return Schedulers.fromExecutor(Executors.newFixedThreadPool(connectionPoolSize));
    }
    @Bean
    public TransactionTemplate transactionTemplate(PlatformTransactionManager transactionManager) {
        return new TransactionTemplate(transactionManager);
    }
}
Now, while doing a get/save call in my service layer I do:
@Override
public Mono<Config> getConfigByKey(String key) {
    return Mono.defer(
            () -> Mono.justOrEmpty(configRepository.findByKey(key)))
        .subscribeOn(jdbcScheduler)
        .publishOn(scheduler);
}
@Override
public Flux<Config> getAllConfigsAfterAppVersion(int appVersion) {
    return Flux
        .fromIterable(configRepository.findAllByMinAppVersionIsGreaterThanEqual(appVersion))
        .subscribeOn(jdbcScheduler)
        .publishOn(scheduler);
}
@Override
public Flux<Config> addConfigs(List<Config> configList) {
    return Flux.fromIterable(configRepository.saveAll(configList))
        .subscribeOn(jdbcScheduler)
        .publishOn(scheduler);
}
And in controller, I do:
@PostMapping
@ResponseStatus(HttpStatus.CREATED)
Mono<ResponseDto<List<Config>>> addConfigs(@Valid @RequestBody List<Config> configs) {
    return configService.addConfigs(configs).collectList()
        .map(configList -> new ResponseDto<>(HttpStatus.CREATED.value(), configList, null))
        .subscribeOn(scheduler);
}
Is this correct, and/or is there a better way to do it?
My understanding of:
.subscribeOn(jdbcScheduler)
.publishOn(scheduler);
is that the task will run on the jdbcScheduler threads and the result will later be published on my main parallel scheduler. Is this understanding correct?
Your understanding is correct with regards to publishOn and subscribeOn (see reference documentation in the reactor project about those operators).
If you call blocking libraries without scheduling that work on a specific scheduler, those calls will block one of the few threads available (by default, the Netty event loop) and your application will only be able to serve a few requests concurrently.
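As a rough plain-JDK analogy (CompletableFuture rather than Reactor, with hypothetical pool names), the sketch below shows the hand-off described in the question: the blocking call runs on a dedicated pool and the follow-up step runs on another. It also shows a pitfall that seems to apply to snippets such as Flux.fromIterable(configRepository.saveAll(configList)): Java evaluates a method argument before the method is invoked, so a blocking call written as an argument runs on the calling thread, before any scheduler can take effect. In Reactor, wrapping the call in a lambda (for example with Mono.fromCallable or Flux.defer) defers it until subscription.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SchedulerHandOffSketch {
    /** Hypothetical blocking call; returns the name of the thread that ran it. */
    static String blockingQuery() {
        return Thread.currentThread().getName();
    }

    static ExecutorService namedPool(String name) {
        return Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true);
            return t;
        });
    }

    /** Returns {thread of deferred query, thread of mapping step, thread of eager query}. */
    static String[] run() {
        ExecutorService jdbc = namedPool("jdbc-1");
        ExecutorService cpu = namedPool("cpu-1");
        try {
            // Deferred: the query runs on the jdbc pool, the mapping step on the cpu pool.
            String[] deferred = CompletableFuture
                    .supplyAsync(SchedulerHandOffSketch::blockingQuery, jdbc)
                    .thenApplyAsync(q -> new String[] {q, Thread.currentThread().getName()}, cpu)
                    .join();
            // Eager: blockingQuery() is evaluated here, on the calling thread,
            // before the future (and any executor) is involved at all.
            String eager = CompletableFuture.completedFuture(blockingQuery())
                    .thenApplyAsync(q -> q, cpu)
                    .join();
            return new String[] {deferred[0], deferred[1], eager};
        } finally {
            jdbc.shutdown();
            cpu.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] threads = run();
        System.out.println("deferred query ran on: " + threads[0]);
        System.out.println("mapping step ran on:   " + threads[1]);
        System.out.println("eager query ran on:    " + threads[2]);
    }
}
```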
That said, I'm not sure what you're trying to achieve with this scheduler setup.
First, the parallel scheduler is designed for CPU-bound tasks, meaning it has few threads: as many as (or a few more than) there are CPU cores. In this case, it's like setting your thread pool size to the number of cores on a regular Servlet container. Your app won't be able to process a large number of concurrent requests.
Even if you choose a better alternative (like the elastic Scheduler), it will be still not as good as the Netty event loop, which is where request processing is scheduled natively in Spring WebFlux.
If your ultimate goal is performance and scalability, wrapping blocking calls in a reactive app is likely to perform worse than your regular Servlet container.
You could instead use Spring MVC and:
use usual blocking return types when you're dealing with a blocking library, like JPA
use Mono and Flux return types when you're not tied to such libraries
This won't be non-blocking, but it will still be asynchronous, and you'll be able to do more work in parallel without dealing with the complexity.
IMHO, there is a way to execute this operation that makes better use of the machine's resources. Following the documentation, you can wrap the call in another thread and continue your execution in the meantime.
In the latest versions of Play Framework, CompletionStage is used as the return type for controllers that execute asynchronously; in a nutshell, if you return a CompletionStage, the execution is asynchronous.
When we know that the work we submit to a CompletableFuture (CF) is a long-running IO operation, we need to pass a custom executor (otherwise it will be executed on the ForkJoinPool (FJP) by default).
Each controller execution has an HTTP context that holds all the request information; this context is also necessary to obtain your EntityManagers if you use JPA.
If we just create a custom ExecutorService and inject it into our controller for use in supplyAsync(), we won't have any of this context information.
Here is an example of a controller action returning a CompletionStage:
return supplyAsync(() -> {
    return doSomeWork(); // supplyAsync expects the lambda to return a value
}, executors.io); // this is a custom CachedThreadPool with daemon thread factory
If we then try to run something like this in doSomeWork():
Request request = request(); // getting request using Controller.request()
or use the pre-injected JPAApi jpa field in the controller:
jpa.withTransaction(
() -> jpa.em() // we will get an exception here although we are wrapped in a transaction
...
);
we get an exception like:
No EntityManager bound to this thread. Try wrapping this call in JPAApi.withTransaction, or ensure that the HTTP context is setup on this thread.
As you can see, the JPA code is wrapped in a transaction, but no context was found because this is a custom, pure Java thread pool.
What is the correct way to provide all the context information when using CompletableFuture and a custom executor?
I also tried defining custom executors in application.conf and looking them up from the actor system, but I end up with a MessageDispatcher which, although it is backed by an ExecutorService, is not compatible with CompletableFuture (maybe I'm wrong? If so, how do I use it with CF?).
You can use play.libs.concurrent.HttpExecution.fromThread method:
An ExecutionContext that executes work on the given ExecutionContext. The current thread's context ClassLoader and Http.Context are captured when this method is called and preserved for all executed tasks.
So, the code would be something like:
java.util.concurrent.Executor executor = getExecutorFromSomewhere();
return supplyAsync(() -> {
    return doSomeWork(); // supplyAsync expects the lambda to return a value
}, play.libs.concurrent.HttpExecution.fromThread(executor));
Or, if you are using a scala.concurrent.ExecutionContext:
scala.concurrent.ExecutionContext ec = getExecutorContext();
return supplyAsync(() -> {
    return doSomeWork(); // supplyAsync expects the lambda to return a value
}, play.libs.concurrent.HttpExecution.fromThread(ec));
But I'm not entirely sure that will preserve the EntityManager for JPA.
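The capture-and-restore idea behind such a wrapper can be sketched with the plain JDK. This is not Play's implementation: CONTEXT is a hypothetical ThreadLocal standing in for the HTTP context, and the fromThread method below merely borrows the name to show the pattern of capturing state on the calling thread and binding it around each task.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ContextCapturingExecutorSketch {
    // Hypothetical stand-in for a per-thread request context.
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    /** Wraps a delegate so that tasks see the context of the thread that called fromThread. */
    static Executor fromThread(Executor delegate) {
        String captured = CONTEXT.get(); // captured once, when the wrapper is created
        return task -> delegate.execute(() -> {
            String previous = CONTEXT.get();
            CONTEXT.set(captured);
            try {
                task.run();
            } finally {
                // Restore whatever the worker thread had before running the task.
                if (previous == null) {
                    CONTEXT.remove();
                } else {
                    CONTEXT.set(previous);
                }
            }
        });
    }

    /** Returns {context seen via the plain pool, context seen via the wrapped pool}. */
    static String[] run() {
        ExecutorService pool = Executors.newCachedThreadPool();
        CONTEXT.set("request-42");
        try {
            String plain = CompletableFuture.supplyAsync(CONTEXT::get, pool).join();
            String wrapped = CompletableFuture.supplyAsync(CONTEXT::get, fromThread(pool)).join();
            return new String[] {plain, wrapped};
        } finally {
            CONTEXT.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println("plain executor:   " + seen[0]);
        System.out.println("wrapped executor: " + seen[1]);
    }
}
```

Anything that is bound per thread but not captured by the wrapper (which may be the case for the EntityManager mentioned above) would still be missing on the worker thread.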
TL;DR: I have background processing going on in RxJava Observables. In my integration tests, I would like to be able to wait for that processing to finish, so that background processing started by one test does not interfere with another test.
Simplified, I have a @RequestMapping method that does the following:
insert data in database
launch an asynchronous processing of that data (http calls via Feign, db updates)
returns nothing (HttpStatus.NO_CONTENT)
This asynchronous processing was previously done with a ThreadPoolTaskExecutor. We're going to transition to RxJava and would like to remove this ThreadPoolTaskExecutor and do the background processing with RxJava.
So, quite naively for the moment, I tried this instead:
Observable
    .defer(() -> Observable.just(call to long blocking method))
    .subscribeOn(Schedulers.io())
    .subscribe();
The end goal is, of course, to go down into "call to long blocking method" one step at a time and use Observable all the way.
Before that, I would like to make my integration tests work. I am testing this by making a RestTemplate call to the mapping. As most of the work is asynchronous, my call returns really fast. Now I would like to find a way to wait for the asynchronous processing to finish (to make sure it does not conflict with another test).
Before RxJava I would just count the tasks in the ThreadPoolTaskExecutor and wait until the count reached 0.
How can I do that with RxJava?
What I tried:
I tried to make all my Schedulers immediate with an RxJavaSchedulersHook: this causes some sort of blocking somewhere; code execution stops just before my Feign calls (Feign uses RxJava under the hood).
I tried to count the tasks with an RxJavaObservableExecutionHook: I tried retaining the subscriptions and removing them when isSubscribed = false, but this didn't work at all (lots of subscribers, and the count never goes down).
I tried to put an observeOn(immediate()) in the real production code. This seems to work, and I could inject the right scheduler for the runtime and test phases, but I am not really keen on putting code in my production code just for testing purposes.
I'm probably terribly wrong, or overcomplicating things, so don't hesitate to correct my reasoning!
How do you return HttpStatus.NO_CONTENT?
@RequestMapping(value = "/")
public HttpStatus home() {
    Observable.defer(() -> Observable.just(longMethod()))
        .subscribeOn(Schedulers.io())
        .subscribe();
    return HttpStatus.NO_CONTENT;
}
In this form, you can't know when longMethod has finished.
If you want to know when all async jobs have completed, you can return HttpStatus.NO_CONTENT only once they have, using Spring's DeferredResult, or you can use a TestSubscriber.
PS: you can use Observable.fromCallable(() -> longMethod()); instead of Observable.defer(() -> Observable.just(longMethod())); if you want.
Using DeferredResult
@RequestMapping(value = "/")
public DeferredResult<HttpStatus> index() {
    DeferredResult<HttpStatus> deferredResult = new DeferredResult<HttpStatus>();
    Observable.fromCallable(() -> longMethod())
        .subscribeOn(Schedulers.io())
        .subscribe(value -> {}, e -> deferredResult.setErrorResult(e.getMessage()), () -> deferredResult.setResult(HttpStatus.NO_CONTENT));
    return deferredResult;
}
This way, when you call your method, you get the result only when your Observable completes (that is, when longMethod has finished).
Using TestSubscriber
You'll have to inject a TestSubscriber and then ask it to wait for and check the completion of your Observable:
@RequestMapping(value = "/")
public HttpStatus home() {
    Observable.defer(() -> Observable.just(longMethod()))
        .subscribeOn(Schedulers.io())
        .subscribe(subscriber); // you'll have to inject this subscriber in your test
    return HttpStatus.NO_CONTENT;
}
and in your test:
TestSubscriber subscriber = new TestSubscriber(); // you'll have to inject it into your controller
// ....
controller.home();
subscriber.awaitTerminalEvent();
subscriber.assertCompleted(); // check that no error occurred
You could use an ExecutorServiceAdapter to bridge from the Spring ThreadPoolTaskExecutor to the ExecutorService in Rx, and then do the same trick as before.
A few months later in the game, my advice is simply "don't do that". RxJava is not really suited to this kind of job. Without going into too much detail, having lots of "loose" Observables running in the background is not appropriate: depending on the volume of your requests, you can easily run into queue and memory issues and, more importantly, what happens to all the scheduled and running tasks if the web server crashes? How do you restart them?
Spring offers better alternatives, IMHO: Spring Batch, Spring Cloud Task, or messaging with Spring Cloud Stream. So don't do as I did, and just use the right tool for the job.
Now, if you really want to go the bad route:
Either return an SseEmitter, consume only the first event from the SSE in the consumer service, and consume all events in your tests.
Or create an RxJava lift operator that wraps (in the call method) the Subscriber in a parent Subscriber that has a waitForCompletion method. How you do the waiting is up to you (with a CountDownLatch, for example). That subscriber would be added to a synchronized list (and removed from it once completed), and in your tests you could just iterate over the list and call waitForCompletion on each item. It's not that complicated and I got it to work, but please, don't do that!
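The waiting mechanism from the second option can be sketched without RxJava. The class below is a hypothetical plain-JDK analogue, not a lift operator: it wraps Runnables instead of Subscribers, keeps a CountDownLatch per task in a synchronized list, and lets a test wait for everything that has been tracked so far.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CompletionTrackerSketch {
    private final List<CountDownLatch> pending = Collections.synchronizedList(new ArrayList<>());

    /** Wraps a task so that its completion (normal or exceptional) can be awaited. */
    Runnable track(Runnable task) {
        CountDownLatch done = new CountDownLatch(1);
        pending.add(done);
        return () -> {
            try {
                task.run();
            } finally {
                done.countDown();
                pending.remove(done); // completed tasks leave the list
            }
        };
    }

    /** Waits for every task tracked so far; the timeout applies per task. False on timeout. */
    boolean awaitAll(long timeout, TimeUnit unit) {
        List<CountDownLatch> snapshot;
        synchronized (pending) {  // iteration over a synchronized list needs manual locking
            snapshot = new ArrayList<>(pending);
        }
        try {
            for (CountDownLatch latch : snapshot) {
                if (!latch.await(timeout, unit)) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Submits ten tracked tasks and returns how many had finished once awaitAll returned. */
    static int demo() {
        CompletionTrackerSketch tracker = new CompletionTrackerSketch();
        ExecutorService io = Executors.newFixedThreadPool(4);
        AtomicInteger finished = new AtomicInteger();
        try {
            for (int i = 0; i < 10; i++) {
                io.execute(tracker.track(finished::incrementAndGet));
            }
            return tracker.awaitAll(5, TimeUnit.SECONDS) ? finished.get() : -1;
        } finally {
            io.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("finished tasks: " + demo());
    }
}
```

A test would call awaitAll before moving on, which plays the role of iterating over the list and calling waitForCompletion on each item.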
Question:
Is there a scope for a thread and its spawned threads that supports destruction as soon as all threads accessing it have terminated? If not, do I have to implement it myself, or am I getting the concept of scopes in Spring DI wrong?
Context:
I have a platform with a REST API through which processes can be started that then run on the server. Some of these processes start multiple threads, some of which only terminate on system shutdown (e.g. listening on a stream and processing the data received).
I want to use Spring for dependency injection and now need to manage beans in a suitable scope.
Problem:
I want to take parameters from the request and provide them at multiple other locations. My approach is to use a container bean that is populated in the request handler and then used everywhere else. A @Scope("request") bean is destroyed as soon as the response is sent, which happens instantly (since the handler only spawns a thread), so it's not applicable here.
I read about the ThreadScope implementation from springbyexample (http://www.springbyexample.org/examples/custom-thread-scope-module.html) and a way to modify the Spring SimpleThreadScope to support inheritance within the hierarchy of spawned threads (https://stackoverflow.com/a/14987371/4502203). Both solve only parts of my problem.
What I need is a scope that supports destruction callbacks (since I'm not keen on memory leaks) and is inherited by child threads.
Code Example:
@RequestMapping(value = "/myApi/{parameterA}/{parameterB}", method = RequestMethod.GET, produces = {MediaType.APPLICATION_JSON_VALUE})
public @ResponseBody void doFancyStuff(@PathVariable String parameterA, @PathVariable String parameterB) {
    new Thread(() -> {
        ParameterContainer parameterContainer = applicationContext.getBean(ParameterContainer.class);
        parameterContainer.setParA(parameterA);
        parameterContainer.setParB(parameterB);
        /*
         * spawn a couple of additional threads here which
         * need to get access to the ParameterContainer.
         */
    }).start();
}
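For the inheritance half of the requirement, a minimal plain-JDK sketch with InheritableThreadLocal may help frame the problem. It is not a Spring scope and it provides no destruction callbacks; PARAMETERS and the string value are hypothetical stand-ins for the ParameterContainer.

```java
import java.util.concurrent.atomic.AtomicReference;

class InheritedParametersSketch {
    // A child thread copies the parent's value at the moment the child is constructed.
    static final InheritableThreadLocal<String> PARAMETERS = new InheritableThreadLocal<>();

    static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Returns {value seen by a child of the handler thread, value seen by an unrelated thread}. */
    static String[] run() {
        AtomicReference<String> seenByChild = new AtomicReference<>();
        AtomicReference<String> seenByUnrelated = new AtomicReference<>();

        Thread handler = new Thread(() -> {
            PARAMETERS.set("parA/parB");
            // Created after set(), so it inherits the handler's value.
            Thread child = new Thread(() -> seenByChild.set(PARAMETERS.get()));
            child.start();
            joinQuietly(child);
            PARAMETERS.remove(); // clears only the handler thread's own copy
        });
        // Created by the calling thread, which has no value set, so it sees null.
        Thread unrelated = new Thread(() -> seenByUnrelated.set(PARAMETERS.get()));

        handler.start();
        unrelated.start();
        joinQuietly(handler);
        joinQuietly(unrelated);
        return new String[] {seenByChild.get(), seenByUnrelated.get()};
    }

    public static void main(String[] args) {
        String[] seen = run();
        System.out.println("child of handler: " + seen[0]);
        System.out.println("unrelated thread: " + seen[1]);
    }
}
```

Because the value is copied at thread construction, pooled threads created earlier would not see it, and every thread still has to clear its own copy, which is why cleanup stays an open part of the problem.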