reactor kafka receive(): Performance gain with publishOn()?

Small question regarding the Reactor Kafka consumer, please.
In many tutorials found online, we can see two different constructs for a reactive Kafka consumer.
Example 1:
public Flux<String> myConsumer1() {
    return kafkaReceiver.receive()
            .map(oneMessage -> doLogicThisIsTestedNonBlockingAllTheWay(oneMessage))
            .doOnNext(fakeConsumerDTO -> System.out.println("successfully consumed {}={}" + fakeConsumerDTO))
            .doOnError(throwable -> System.out.println("something bad happened while consuming : {}" + throwable.getMessage()));
}
Example 2:
public Flux<String> myConsumer2() {
    Scheduler readerScheduler = Schedulers.newBoundedElastic(60, 60, "readerThreads");
    return kafkaReceiver.receive()
            .publishOn(readerScheduler) // or this one .publishOn(Schedulers.boundedElastic())
            .map(oneMessage -> doLogicThisIsTestedNonBlockingAllTheWay(oneMessage))
            .doOnNext(fakeConsumerDTO -> System.out.println("successfully consumed {}={}" + fakeConsumerDTO))
            .doOnError(throwable -> System.out.println("something bad happened while consuming : {}" + throwable.getMessage()));
}
The main difference between example 1 and example 2 is where the message-processing logic runs: directly on the thread that drives the map operator (example 1), or on a thread from the Reactor Scheduler (example 2).
https://projectreactor.io/docs/kafka/snapshot/reference/index.html#kafka-source
Both constructs also appear in the official Reactor Kafka reference documentation linked above.
There, callout '4' annotates the line .publishOn(aBoundedElasticScheduler) with "Cannot block the receiver thread".
Suppose the message-processing method has been proven non-blocking (tested with BlockHound, etc.).
I am having a hard time understanding the difference between the two.
And most of all, I would like to ask if there is any performance gain from one over the other.
To emphasize, this is not an opinion based or style question.
This question is asking about a possible performance difference / performance gain between two solutions.
Thank you

I do not think there is much difference between those two examples, but the second one seems more flexible since it uses a Scheduler.
Using a Scheduler can also help mitigate issues with blocking code, as it lets you specify a separate thread pool for processing the messages. This helps prevent blocking of the receiver thread and can improve overall performance.
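To make the hand-off concrete, here is a minimal JDK-only analogy; it is not Reactor code. publishOn makes the operators after it run on a thread supplied by the given Scheduler, so each record crosses a thread boundary. The class, method, and pool names below are hypothetical and only illustrate that boundary. If the logic truly never blocks, that hop is extra work per record, so any gain would have to be measured rather than assumed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class HandOffSketch {
    // Analogue of example 1: the logic runs on whichever thread delivers the record.
    static String processInline(String record) {
        return record + " handled on " + Thread.currentThread().getName();
    }

    // Analogue of example 2: the record is handed to a thread from a separate pool,
    // which adds a queue hop per record.
    static String processOnPool(String record, ExecutorService pool) {
        // join() blocks the caller only to keep this demo deterministic.
        return CompletableFuture.supplyAsync(() -> processInline(record), pool).join();
    }

    // Hypothetical pool standing in for the "readerThreads" scheduler.
    static ExecutorService newReaderPool() {
        return Executors.newFixedThreadPool(2, task -> {
            Thread thread = new Thread(task, "readerThreads");
            thread.setDaemon(true);
            return thread;
        });
    }
}
```

Calling processInline reports the caller's own thread, while processOnPool reports a thread named readerThreads, which is the only behavioural difference the sketch is meant to show.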

Related

How to incorporate async calls into a reactive pipeline

I've just discovered the joys of RxJava and its 10000 public methods, but I am struggling to understand how (and if) we should incorporate async apis into reactive pipelines.
To give an example, let's say I'm building a pipeline that:
1. takes keys from some cold source (or hot, in which case let's say we already have a way of dealing with an overactive source)
2. fetches data for those keys using an asynchronous client (or just applies any kind of async processing)
3. batches the data and
4. saves it into storage.
If we had a blocking api for step #2, it might look something like this.
source.map((key) -> client.callBlocking(key))
    .buffer(500, TimeUnit.MILLISECONDS, 100)
    .subscribe(dataList -> storage.batchSave(dataList));
With a couple more calls, we could parallelise this, making it so that 100 threads are waiting on client.callBlocking at any given time.
But what if the api we have is already asynchronous and we want to make use of that? I imagine the same pipeline would look something like this
source.magicMethod(new Processor() {
        // When downstream requests more items
        public void request(int count) {
            upstream.request(count);
        }
        // When upstream delivers us an item
        public void onNext(Object key) {
            client.callAsync(key)
                .onResult((data) -> downstream.onNext(data));
        }
    })
    .buffer(500, TimeUnit.MILLISECONDS, 100)
    .subscribe(data -> storage.batchSave(data));
What I want to know is which method is magicMethod. Or perhaps this is a terrible idea to incorporate async calls into a pipeline and we should never ever. (There is also a question of pre-fetching, so that downstream code does not necessarily have to wait for data after requesting it, but let's put that aside for now)
Note that this is not a question about parallelism. The second version could run perfectly well in a single thread (plus whatever threads the client may or may not be using under the hood)
Also, while the question is about RxJava, I'd be just as happy to see an answer using Reactor.
Thanks for helping a poor old reactive noob :)
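In both RxJava and Reactor, the operator that usually fills the role of magicMethod is flatMap: it subscribes to one asynchronous inner source per item and merges the results downstream. The JDK-only sketch below shows the same shape with CompletableFuture instead of either library. callAsync is a hypothetical stand-in for the asynchronous client from the question, and the final join is there only to collect results for the demo.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

class AsyncStepSketch {
    // Hypothetical async client call: returns immediately with a future result.
    static CompletableFuture<String> callAsync(String key) {
        return CompletableFuture.supplyAsync(() -> "data-" + key);
    }

    // Starts every call first, then gathers the results in key order.
    static List<String> fetchAll(List<String> keys) {
        List<CompletableFuture<String>> inFlight = keys.stream()
                .map(AsyncStepSketch::callAsync)
                .collect(Collectors.toList());
        return inFlight.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
    }
}
```

All calls are started before any result is awaited, so no thread is parked per key while the client works.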

Spring WebFlux perform parallel HTTP requests and deserialize the response

I have a List<String> containing URLs and I would like to perform a GET request for each URL in that List.
Those requests should be made in parallel. After all the requests are done I would like to have a List<CustomModel> containing all the deserialized responses.
So I created a method to make the HTTP request
public Flux<JsonNode> performGetRequest(String url) {
    WebClient webClient = WebClient.create(String.format("%s%s", API_BASE_URL, url));
    return webClient.get()
            .retrieve()
            .bodyToFlux(JsonNode.class);
}
The above method is called this way
public List<CustomModel> fetch(List<String> urls) {
    return Flux.fromIterable(urls)
            .parallel()
            .runOn(Schedulers.boundedElastic())
            .flatMap(this::performGetRequest)
            .flatMap(jsonNode -> Flux.fromIterable(customDeserialize(jsonNode)))
            .sequential()
            .collectList()
            .flatMapMany(Flux::fromIterable)
            .collectList()
            .block();
}
For each response, I am using a custom method to deserialize the response
private List<CustomModel> customDeserialize(final JsonNode jsonNodeResponse) {
    List<CustomModel> customModelList = new ArrayList<>();
    for (JsonNode block : jsonNodeResponse) {
        // deserialize the response, create an instance of CustomModel class
        // and add it to customModelList
    }
    return customModelList;
}
The problem is that even though I use the parallel() method, the whole process is probably not running in parallel. The time it takes to complete indicates that I am doing something wrong.
Am I missing something?

"The problem is that even though I use the parallel() method, the whole process is probably not running in parallel. The time it takes to complete indicates that I am doing something wrong.
Am I missing something?"
Since you are calling block(), I am going to assume you are running an MVC servlet application that uses WebClient only for REST calls.
If you are not running a full WebFlux application, your application will start a single event loop that processes all scheduled events. If you are running a full WebFlux application, you will get as many event loops as there are cores on the machine.
On the usage of parallel, the Reactor documentation says:
To obtain a ParallelFlux, you can use the parallel() operator on any Flux. By itself, this method does not parallelize the work. Rather, it divides the workload into “rails” (by default, as many rails as there are CPU cores).
In order to tell the resulting ParallelFlux where to run each rail (and, by extension, to run rails in parallel) you have to use runOn(Scheduler). Note that there is a recommended dedicated Scheduler for parallel work: Schedulers.parallel().
You are using a boundedElastic scheduler, which is not optimised for parallel work.
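As a rough picture of what "rails" means, here is a JDK-only sketch that splits a list round-robin, which is how the Flux.parallel() Javadoc describes the distribution. The class and method names are hypothetical, and real rails are also subject to downstream demand, which this sketch ignores. Note that the split alone runs nothing concurrently; that is the part runOn adds.

```java
import java.util.ArrayList;
import java.util.List;

class RailsSketch {
    // Distributes items over railCount lists in round-robin order.
    // Nothing here runs in parallel: it only decides which rail gets which item.
    static <T> List<List<T>> splitIntoRails(List<T> items, int railCount) {
        List<List<T>> rails = new ArrayList<>();
        for (int i = 0; i < railCount; i++) {
            rails.add(new ArrayList<>());
        }
        for (int i = 0; i < items.size(); i++) {
            rails.get(i % railCount).add(items.get(i));
        }
        return rails;
    }
}
```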
But I want to point out something important: you are doing async I/O, not parallel work. You will most likely not gain any performance by running in parallel, since most of your I/O will fire off a request and then just wait for a response.
ParallelFlux will ensure that all CPU cores are used, but this comes with some penalties. There is a setup time to get all cores ready to do work, and the work itself is not CPU-intensive: the threads fire off, say, 1000 requests, then they are all done and have to wait for responses.
Workers need to be set up on the cores, the information needs to be sent to each core, retrieved, etc.
parallel gains most of its benefits when you have CPU-intensive work, where each event needs heavy computation spread over multiple cores. For async work, a regular Flux will most likely be enough.
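The point that waiting on I/O does not need many threads can be shown with a small JDK-only sketch. It does not use WebClient: fakeRequest is a hypothetical stand-in that completes a future after a delay, the way a non-blocking client completes one when a response arrives. A single timer thread serves every request, yet the total time stays close to one delay rather than the sum of all delays.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class AsyncIoSketch {
    // Names of the threads that delivered a simulated response.
    static final Set<String> completingThreads = ConcurrentHashMap.newKeySet();

    // Simulated non-blocking request: the "response" arrives after delayMillis.
    static CompletableFuture<Integer> fakeRequest(ScheduledExecutorService timer, int id, long delayMillis) {
        CompletableFuture<Integer> response = new CompletableFuture<>();
        timer.schedule(() -> {
            completingThreads.add(Thread.currentThread().getName());
            response.complete(id);
        }, delayMillis, TimeUnit.MILLISECONDS);
        return response;
    }

    // Fires all requests at once, waits for all responses, returns elapsed milliseconds.
    static long runAll(int requests, long delayMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        long start = System.nanoTime();
        List<CompletableFuture<Integer>> inFlight = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            inFlight.add(fakeRequest(timer, i, delayMillis));
        }
        CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
        timer.shutdown();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }
}
```

With 100 requests of 100 ms each, a sequential blocking loop would need about 10 seconds; here the requests overlap because nothing holds a thread while waiting.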
Here is what Simon Baslé, one of the Reactor developers, has to say about running I/O work in Reactor, parallel vs async
Also worth mentioning: a boundedElastic scheduler is tuned for blocking work, as a fallback to regular servlet behaviour in a pure WebFlux application.
You are running WebFlux in a servlet application, so the benefits you get may not be as large as in a full WebFlux application.

I'm not 100% sure if this is the issue here, but I noticed when working with WebClient and ParallelFlux that the WebClient only returns the Publisher for the response (bodyToMono / bodyToFlux), not for the actual request.
Consider wrapping the remote call with Flux.defer / Mono.defer so that the request itself is also covered by a Publisher, e.g. something like:
.flatMap(url -> Flux.defer(() -> performGetRequest(url)))

Reducing operation time by using parallel stream

In my Java 8 Spring Boot application, I have a list of 40000 records. For each record, I have to call an external API and save the result to the DB. How can I do this as fast as possible? Each of the API calls will take about 20 secs to complete. I used a parallel stream to reduce the time, but there was no considerable change.
if (!mainList.isEmpty()) {
    AtomicInteger counter = new AtomicInteger();
    List<List<PolicyAddressDto>> secondList =
        new ArrayList<List<PolicyAddressDto>>(
            mainList.stream()
                .collect(Collectors.groupingBy(it -> counter.getAndIncrement() / subArraySize))
                .values());
    for (List<PolicyAddressDto> listOfList : secondList) {
        listOfList.parallelStream()
            .forEach(t -> {
                callAtheniumData(t, listDomain1, listDomain2); // listDomain2 and listDomain1 declared
                                                               // globally
            });
        if (!listDomain1.isEmpty()) {
            listDomain1Repository.saveAll(listDomain1);
        }
        if (!listDomain2.isEmpty()) {
            listDomain2Repository.saveAll(listDomain2);
        }
    }
}

Solving a problem in parallel always involves performing more actual work than doing it sequentially. Overhead is involved in splitting the work among several threads and joining or merging the results. Problems like converting short strings to lower-case are small enough that they are in danger of being swamped by the parallel splitting overhead.
As far as I can see, the API call response is not being saved.
Also, all the API calls are independent of each other.
Can we try creating a new thread for each API call?
for (List<PolicyAddressDto> listOfList : secondList) {
    listOfList.parallelStream()
        .forEach(t -> {
            new Thread(() -> callAtheniumData(t, listDomain1, listDomain2)).start();
        });
}

That's because a parallel stream divides the work using, by default, one thread per core minus one. If every call to the external API takes 20 seconds and you have 4 cores, this means 3 concurrent requests, each waiting for 20 seconds.
You can increase the concurrency of your calls in this way https://stackoverflow.com/a/21172732/574147 but I think you're just moving the problem.
20 seconds is a really slow response time for an API. If this is a really complex, CPU-bound computation, how can that service respond to 10 concurrent requests while keeping the same performance? It probably can't.
Otherwise, if the computation is I/O-bound and takes 20 seconds, you probably need a service that can accept (and work with!) a list of elements.
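One way to raise the number of calls in flight, sketched here with a plain fixed thread pool rather than the custom ForkJoinPool approach from the linked answer, is to submit each blocking call to a dedicated executor sized for waiting rather than for CPU cores. callSlowApi is a hypothetical stand-in for the external call, and the pool size is an assumption to tune against what the remote service can bear. Results are returned from the futures instead of being appended to shared lists from several threads.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class BlockingCallsSketch {
    // Hypothetical stand-in for the slow, blocking external API call.
    static String callSlowApi(int record, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result-" + record;
    }

    // Runs up to `concurrency` blocking calls at the same time and returns results in input order.
    static List<String> callAll(List<Integer> records, int concurrency, long millis) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        try {
            List<CompletableFuture<String>> futures = records.stream()
                    .map(r -> CompletableFuture.supplyAsync(() -> callSlowApi(r, millis), pool))
                    .collect(Collectors.toList());
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }
}
```

This only moves the limit from the client to the remote service, as the answer above warns: the service still has to cope with that many concurrent requests.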

"Each of the API calls will take about 20 secs to complete."
Your external API is where you are being bottlenecked. There's really nothing your code can do to speed it up on the client side except to parallelize the process. You've already done that, so if the external API is within your organization, you need to look into performance improvements there. If not, you can do something like offloading the processing via Kafka to Apache NiFi or StreamSets, so that your Spring Boot API doesn't have to wait for hours to process the data.

RxJava instead of AsyncTask?

I came across several instances when people were trying to persuade me into using RxJava instead of Android's standard AsyncTask construct.
In my opinion RxJava offers a lot more features but loses in simplicity against AsyncTask.
Are there any use cases that suit one approach better than the other or even more general can RxJava even be considered superior?

The full power of RxJava is visible when you use it on Java 8, preferably with a library like Retrofit. It allows you to trivially chain operations together, with full control of error handling. For example, consider the following code given id: an int that specifies the order and apiClient: a Retrofit client for the order management microservice:
apiClient
    .getOrder(id)
    .subscribeOn(Schedulers.io())
    .flatMapIterable(Order::getLineItems)
    .flatMap(lineItem ->
        apiClient.getProduct(lineItem.getProductId())
            .subscribeOn(Schedulers.io())
            .map(product -> product.getCurrentPrice() * lineItem.getCount()),
        5)
    .reduce((a, b) -> a + b)
    .retry((count, e) -> count < 2 && (e instanceof RetrofitError))
    .onErrorReturn(e -> -1)
    .subscribe(System.out::println);
This will asynchronously calculate the total price of an order, with the following properties:
at most 5 requests against the API in flight at any one time (and you can tweak the IO scheduler to have a hard cap for all requests, not just for a single observable chain)
up to 2 retries in case of network errors
-1 in case of failure (an antipattern TBH, but that's another discussion)
Also, IMO the .subscribeOn(Schedulers.io()) after each network call should be implicit - you can do that by modifying how you create the Retrofit client. Not bad for 11+2 lines of code, even if it's more backend-ish than Android-ish.

RxBinding/RxAndroid by Jake Wharton provides some nice threading functionality that you can use to make async calls, but RxJava provides waaay more benefits/functionality than just dealing with async threading. That said, there is a pretty steep learning curve (IMO). Also, it should be noted that there is nothing wrong with using AsyncTasks; you can just write more elegant solutions with Rx (also, IMO).
TLDR you should make an effort to use it. Retrofit and RxJava work together nicely for your AsyncTask replacement purposes.

Using groovy actors to maximise throughput from database?

I'm playing around with the GPars library while working to improve the scalability of a matching system. I'd like to be able to query the database and then immediately query it again while the previous results are being processed concurrently. The bottleneck is reading from the database, so I would like to keep the database busy full time while processing the results asynchronously as they become available. I realise I may have some fundamental misunderstandings on how the actor framework works and I'd be happy to be corrected!
In pseudo code I'm trying to do the following:
Define two actors, one for running selects against the database and another for processing the records.
queryActor queries the database and sends results to processorActor
queryActor immediately queries the database again without waiting for processorActor to finish
I could probably achieve the simple use case without using actors but my end goal is to have an actor pool that is always working on new queries with potentially different datasources in order to increase the throughput of the system in general.
The processing Actor will always be much faster than the database query so I would like to query multiple replicas concurrently in future.
def processor = actor {
    loop {
        react { querySet ->
            println "processing recordset"
            if (querySet instanceof Object[]) {
                MatcherDataRowProcessor matcher = new MatcherDataRowProcessor(matchedRecords, matchedRecordSet);
                matchedRecords = matcher.processRecordset(querySet);
                reply matchedRecords
            }
            else {
                println 'processor fed nothing, halting processor actor'
                stop()
            }
        }
    }
}
def dbqueryer = actor {
    println "dbqueryer has started"
    while (batchNum.longValue() <= loopLimiter) {
        println "hitting db"
        Object[] querySet
        def thisRuleBatch = new MatchRuleBatch(targetuidFrom, targetuidTo)
        thisRuleBatch.targetuidFrom = batchNum * perBatch - perBatch
        thisRuleBatch.targetuidTo = thisRuleBatch.targetuidFrom + perBatch
        thisRuleBatch.targetName = targetName
        thisRuleBatch.whereClause = whereClause
        querySet = dao.getRecordSet(thisRuleBatch)
        processor.send querySet
        batchNum++
    }
    react { processedRecords ->
        processor.send false
    }
}

I would suggest taking a look at Dataflow Queues in the Dataflow Concurrency section of the user guide for GPars. You may find that Dataflows provide a better/cleaner abstraction for your problem at hand. Dataflows can also be used in conjunction with actors.
I think either actors or dataflows would work in this situation and feel that the decision comes down to which one provides the abstraction that more closely matches what you are trying to accomplish. For me, the concept of tasks, queues, dataflows seems to be a closer fit terminology-wise.

After some more research I have found that the Dataflow concurrency support in GPars is actually built on top of the Actor support. The DataflowOperatorTest in the GPars Java demo distribution (I need to do a Java implementation) seems to be a good match for what I need to do. The main thread waits for multiple stream inputs to be populated, which in my case are the parallel database queries.
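Since a Java implementation is the goal, the query-while-processing idea can also be sketched with nothing but the JDK. This is not GPars: a bounded BlockingQueue plays the role of the dataflow queue, queryBatch is a hypothetical stand-in for dao.getRecordSet, and an empty marker list signals the end of input. The querying thread keeps fetching the next batch while the processor thread works on earlier ones.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class QueryPipelineSketch {
    // End-of-input marker, compared by identity, never by content.
    private static final List<Integer> POISON = new ArrayList<>();

    // Hypothetical stand-in for dao.getRecordSet: returns perBatch fake row ids.
    static List<Integer> queryBatch(int batchNum, int perBatch) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < perBatch; i++) {
            rows.add(batchNum * perBatch + i);
        }
        return rows;
    }

    // Queries `batches` batches while a second thread processes them; returns rows processed.
    static int run(int batches, int perBatch) {
        BlockingQueue<List<Integer>> queue = new ArrayBlockingQueue<>(4);
        int[] processed = {0};
        Thread processor = new Thread(() -> {
            try {
                for (List<Integer> batch = queue.take(); batch != POISON; batch = queue.take()) {
                    processed[0] += batch.size(); // placeholder for the real matching work
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        processor.start();
        try {
            for (int b = 0; b < batches; b++) {
                queue.put(queryBatch(b, perBatch)); // blocks only when the processor falls 4 batches behind
            }
            queue.put(POISON);
            processor.join(); // join() makes the processor's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed[0];
    }
}
```

Querying several replicas concurrently would mean several producer threads feeding the same queue, with one marker sent only after all of them finish.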
