Basic questions about Reactor signals - java

I have some questions regarding the output of the following code:
Flux.just("a", "b", "c", "d")
    .log(null, Level.INFO, true) // line: 18
    .flatMap(value ->
        Mono.just(value.toUpperCase()).publishOn(Schedulers.elastic()), 2)
    .log(null, Level.INFO, true) // line: 21
    .take(3)
    .log(null, Level.INFO, true) // line: 23
    .subscribe(x ->
        System.out.println("Thread: " + Thread.currentThread().getName() + " , " + x));

Thread.sleep(1000 * 1000);
Output:
1. 11:29:11 [main] INFO - | onSubscribe([Synchronous Fuseable] FluxArray.ArraySubscription) Flux.log(App.java:18)
2. 11:29:11 [main] INFO - onSubscribe(FluxFlatMap.FlatMapMain) Flux.log(App.java:21)
3. 11:29:11 [main] INFO - onSubscribe(FluxTake.TakeSubscriber) Flux.log(App.java:23)
4. 11:29:11 [main] INFO - request(unbounded) Flux.log(App.java:23)
5. 11:29:11 [main] INFO - request(unbounded) Flux.log(App.java:21)
6. 11:29:11 [main] INFO - | request(2) Flux.log(App.java:18)
7. 11:29:11 [main] INFO - | onNext(a) Flux.log(App.java:18)
8. 11:29:11 [main] INFO - | onNext(b) Flux.log(App.java:18)
9. 11:29:11 [elastic-2] INFO - onNext(A) Flux.log(App.java:21)
10. 11:29:11 [elastic-2] INFO - onNext(A) Flux.log(App.java:23)
11. Thread: elastic-2 , A
12. 11:29:11 [elastic-2] INFO - | request(1) Flux.log(App.java:18)
13. 11:29:11 [main] INFO - | onNext(c) Flux.log(App.java:18)
14. 11:29:11 [elastic-3] INFO - onNext(B) Flux.log(App.java:21)
15. 11:29:11 [elastic-3] INFO - onNext(B) Flux.log(App.java:23)
16. Thread: elastic-3 , B
17. 11:29:11 [elastic-3] INFO - | request(1) Flux.log(App.java:18)
18. 11:29:11 [elastic-3] INFO - | onNext(d) Flux.log(App.java:18)
19. 11:29:11 [elastic-3] INFO - | onComplete() Flux.log(App.java:18)
20. 11:29:11 [elastic-3] INFO - onNext(C) Flux.log(App.java:21)
21. 11:29:11 [elastic-3] INFO - onNext(C) Flux.log(App.java:23)
22. Thread: elastic-3 , C
23. 11:29:11 [elastic-3] INFO - cancel() Flux.log(App.java:21)
24. 11:29:11 [elastic-3] INFO - onComplete() Flux.log(App.java:23)
25. 11:29:11 [elastic-3] INFO - | cancel() Flux.log(App.java:18)
Questions: Each question is about a specific line in the output (not a line in the code). I have also added my own answers to some of them, but I'm not sure they are correct.
When subscribing, the subscribe operation asks for an unbounded amount of elements. Why, then, does the request(unbounded) event go down the pipeline instead of up? My answer: the request for an unbounded amount goes up to take, and take then sends it down again.
flatMap sends the cancel signal. Why doesn't take send it instead?
Last question: there is more than one terminal signal in the output. Isn't that a violation of the Reactive Streams spec?

In that case, ONLY one terminal signal will be produced:
Flux.just("a", "b", "c", "d")
    .log(null, Level.INFO, true) // line: 18
    .flatMap(value ->
        Mono.just(value.toUpperCase()).publishOn(Schedulers.elastic()), 2)
    .log(null, Level.INFO, true) // line: 21
    .take(3)
    .log(null, Level.INFO, true) // line: 23
    .subscribe(x ->
        System.out.println("Thread: " + Thread.currentThread().getName() + " , " + x),
        t -> {},
        () -> System.out.println("Completed Only Once"));
The tricky part here is that each Reactor 3 operator has a life of its own, and they all play by the same rule: emit onComplete to notify the downstream operator that there is no more data.
Since you have a .log() operator at three different points, you will observe three independent onComplete signals: from .just, from .flatMap, and from .take(3).
First, you will see onComplete from .just. The default behavior of .flatMap is, roughly, "request the first concurrency elements and see how it goes". Since .just produces (in your case) only 4 elements, against the initial demand of 2 (the concurrency level in your example) it will emit 2 onNext signals, and after two subsequent request(1) calls you will see its onComplete. That onComplete, in turn, lets .flatMap know that once its 4 flattened inner streams have emitted their own onComplete signals, it is allowed to emit its own onComplete downstream.
Downstream of that is the .take(3) operator, which emits its own onComplete after the first three elements, without waiting for the upstream onComplete. Since there is a .log operator after .take, this signal is also recorded.
Finally, your flow contains 3 independent log operators, which record 3 independent onComplete signals from 3 independent operators. Despite that, the terminal .subscribe receives only one onComplete, from the nearest upstream operator in the flow.
Small update regarding .take behavior
The central idea of .take is to take elements until the remaining count has been satisfied. Since the upstream may produce more than was requested, we need a mechanism to stop it from sending more data. One of the mechanisms the Reactive Streams spec offers is collaboration over the Subscription. A Subscription has two primary methods: request, to express demand, and cancel, to signal that data is no longer needed even if the requested demand has not been satisfied.
In the case of the .take operator, the initial demand is Long.MAX_VALUE, which is treated as unbounded demand. Therefore, the only way to stop consuming a potentially infinite stream of data is to cancel the subscription, or in other words to unsubscribe.
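To make that collaboration concrete, here is a minimal, hypothetical sketch using Reactor's BaseSubscriber (not how FluxTake is actually implemented internally) that opens with an unbounded request and cancels the Subscription once three elements have been taken:

import org.reactivestreams.Subscription;
import reactor.core.publisher.BaseSubscriber;
import reactor.core.publisher.Flux;

public class TakeLikeExample {
    public static void main(String[] args) {
        Flux.just("a", "b", "c", "d")
            .subscribe(new BaseSubscriber<String>() {
                private int remaining = 3; // how many elements we still want

                @Override
                protected void hookOnSubscribe(Subscription subscription) {
                    // like .take, start with an unbounded demand
                    request(Long.MAX_VALUE);
                }

                @Override
                protected void hookOnNext(String value) {
                    System.out.println("took " + value);
                    if (--remaining == 0) {
                        // enough data has arrived: cancel even though the
                        // requested demand was never satisfied
                        cancel();
                    }
                }
            });
    }
}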
Hope it helps you :)

Related

Java reactor `subscribe` is sometimes blocking, sometimes not

I have been playing around with Reactor for some time, but there is still something I don't get.
This piece of code
Flux.range(1, 1000)
    .delayElements(Duration.ofNanos(1))
    .map(integer -> integer + 1)
    .subscribe(System.out::println);
System.out.println("after");
Returns:
after
2
3
4
which is expected, since the documentation of subscribe states that it will immediately return control to the calling thread.
Why, then, does this piece of code:
Flux.range(1, 1000)
    .map(integer -> integer + 1)
    .subscribe(System.out::println);
return
1
2
...
1000
1001
after
I can never figure out whether subscribe will block or not, and that's very annoying when writing batches.
If anyone has the answer, that would be amazing.
There is no blocking code in your snippet.
In the first example you use .delayElements(), which switches execution to another thread and releases your main thread. So you see your System.out.println("after"); executed on the main thread immediately, while the reactive chain is executed on the parallel-n threads.
Your first example:
18:49:29.195 [main] INFO com.example.demo.FluxTest - AFTER
18:49:29.199 [parallel-1] INFO com.example.demo.FluxTest - v: 2
18:49:29.201 [parallel-2] INFO com.example.demo.FluxTest - v: 3
18:49:29.202 [parallel-3] INFO com.example.demo.FluxTest - v: 4
18:49:29.203 [parallel-4] INFO com.example.demo.FluxTest - v: 5
18:49:29.205 [parallel-5] INFO com.example.demo.FluxTest - v: 6
But your second example does not switch the executing thread, so your reactive chain executes on the main thread. Only after it completes does the main thread go on to execute your System.out.println("after");
18:51:28.490 [main] INFO com.example.demo.FluxTest - v: 995
18:51:28.490 [main] INFO com.example.demo.FluxTest - v: 996
18:51:28.490 [main] INFO com.example.demo.FluxTest - v: 997
18:51:28.490 [main] INFO com.example.demo.FluxTest - v: 998
18:51:28.490 [main] INFO com.example.demo.FluxTest - v: 999
18:51:28.490 [main] INFO com.example.demo.FluxTest - v: 1000
18:51:28.490 [main] INFO com.example.demo.FluxTest - v: 1001
18:51:28.491 [main] INFO com.example.demo.FluxTest - AFTER
EDIT:
If you want to switch threads in your second snippet, you basically have two options (a sketch of both follows the list):
Add subscribeOn(<Scheduler>) anywhere in your reactive chain. The whole subscription process will then happen on a thread from the scheduler you provided.
Add publishOn(<Scheduler>), for example after Flux.range(). The emission itself will then happen on your calling thread, but the downstream will be executed on a thread from the scheduler you provided.
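For illustration, a rough sketch of both options applied to the second snippet; the concrete schedulers here (elastic and parallel) are just example choices:

import reactor.core.publisher.Flux;
import reactor.core.scheduler.Schedulers;

// Option 1: subscribeOn - the whole chain, including range(), runs on an elastic thread
Flux.range(1, 1000)
    .map(integer -> integer + 1)
    .subscribeOn(Schedulers.elastic())
    .subscribe(System.out::println);

// Option 2: publishOn - range() still emits on the calling thread,
// but everything below publishOn runs on a parallel thread
Flux.range(1, 1000)
    .publishOn(Schedulers.parallel())
    .map(integer -> integer + 1)
    .subscribe(System.out::println);

System.out.println("after"); // now printed without waiting for the 1000 elements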

KTable causes unsubscribe from topics

I'm writing a basic Kafka Streams app in Java which reads wikipedia events provided by a producer and attempts to count the number of created and recently changed events per user type (bot or human).
I created a custom serde for the wikipedia events and am able to successfully print both the created and modified events to the screen from my KStreams.
My next step was to create a KTable in which I count the created events per user type.
It seems that after the KTable is created, the rest of the code does not execute.
I don't get an error message and my app seems to be running, but nothing is printed and maybe nothing is even processed.
My code is as follows:
StreamsBuilder builder = new StreamsBuilder();
KStream<String, WikiEvent> allEvents =
builder.stream(topicList, Consumed.with(Serdes.String(), WikiEventSerdes.WikiEvent()));
KStream<String, WikiEvent> createEvents = allEvents.filter((key, value) -> value.getStream().equals("create"));
KStream<String, WikiEvent> changeEvents = allEvents.filter((key, value) -> value.getStream().equals("change"));
createEvents.foreach((k,v)->System.out.println("p2 Key= " + k + " Value=" + v.getStream()));
KTable<String, Long> createdPagesUserTypeTable = createEvents.groupBy((key, value) -> value.getUserType()).count();
KStream<String, Long> tableStream = createdPagesUserTypeTable.toStream();
tableStream.foreach((k,v)->System.out.println("Key= " + k + " Value=" + v));
The reason I suspect that nothing executes past the KTable is that the print of the createEvents stream never happens when the KTable definition is present.
Once I remove all lines from the KTable down, I get the prints.
What's gone wrong here?
Also, is there a log of some sort where I can see the execution of my code?
An update:
After looking at the server logs I see this when defining the KTable:
[2022-05-27 19:58:38,983] INFO [GroupCoordinator 0]: Dynamic member with unknown member id joins group streams-wiki in Empty state. Created a new member id streams-wiki-8ea96db7-0052-421a-b7c0-a56cedf9f43e-StreamThread-1-consumer-aa4e311e-2712-4054-ac59-9b56f13d2231 and request the member to rejoin with this id. (kafka.coordinator.group.GroupCoordinator)
[2022-05-27 19:58:38,995] INFO [GroupCoordinator 0]: Preparing to rebalance group streams-wiki in state PreparingRebalance with old generation 2 (__consumer_offsets-22) (reason: Adding new member streams-wiki-8ea96db7-0052-421a-b7c0-a56cedf9f43e-StreamThread-1-consumer-aa4e311e-2712-4054-ac59-9b56f13d2231 with group instance id None; client reason: rebalance failed due to 'The group member needs to have a valid member id before actually entering a consumer group.' (MemberIdRequiredException)) (kafka.coordinator.group.GroupCoordinator)
[2022-05-27 19:58:38,999] INFO [GroupCoordinator 0]: Stabilized group streams-wiki generation 3 (__consumer_offsets-22) with 1 members (kafka.coordinator.group.GroupCoordinator)
[2022-05-27 19:58:39,274] INFO [GroupCoordinator 0]: Assignment received from leader streams-wiki-8ea96db7-0052-421a-b7c0-a56cedf9f43e-StreamThread-1-consumer-aa4e311e-2712-4054-ac59-9b56f13d2231 for group streams-wiki for generation 3. The group has 1 members, 0 of which are static. (kafka.coordinator.group.GroupCoordinator)
[2022-05-27 19:58:39,934] INFO [GroupCoordinator 0]: Preparing to rebalance group streams-wiki in state PreparingRebalance with old generation 3 (__consumer_offsets-22) (reason: Removing member streams-wiki-8ea96db7-0052-421a-b7c0-a56cedf9f43e-StreamThread-1-consumer-aa4e311e-2712-4054-ac59-9b56f13d2231 on LeaveGroup; client reason: the consumer unsubscribed from all topics) (kafka.coordinator.group.GroupCoordinator)
[2022-05-27 19:58:39,934] INFO [GroupCoordinator 0]: Group streams-wiki with generation 4 is now empty (__consumer_offsets-22) (kafka.coordinator.group.GroupCoordinator)
[2022-05-27 19:58:39,938] INFO [GroupCoordinator 0]: Member MemberMetadata(memberId=streams-wiki-8ea96db7-0052-421a-b7c0-a56cedf9f43e-StreamThread-1-consumer-aa4e311e-2712-4054-ac59-9b56f13d2231, groupInstanceId=None, clientId=streams-wiki-8ea96db7-0052-421a-b7c0-a56cedf9f43e-StreamThread-1-consumer, clientHost=/127.0.0.1, sessionTimeoutMs=45000, rebalanceTimeoutMs=300000, supportedProtocols=List(stream)) has left group streams-wiki through explicit `LeaveGroup`; client reason: the consumer unsubscribed from all topics (kafka.coordinator.group.GroupCoordinator)
so it appears that my KTable has somehow caused an unsubscribe from all topics.
Any idea why this is happening?
In the end it turned out that my Java consumer was failing due to a missing StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG definition.
The way to discover this was to add the following lines of code after
KafkaStreams streams = new KafkaStreams(topology, props);
streams.setUncaughtExceptionHandler((Thread t, Throwable e) -> {
System.out.println(e);
});
This will print the exception to the command window and show additional logs for the consumer.
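For reference, a minimal sketch of how the missing default serde configuration and the exception handler might be wired together. The broker address and the String serdes are assumptions (the default value serde may instead need to be the custom WikiEventSerdes, depending on the topology), and builder is the StreamsBuilder from the question:

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-wiki");      // application id seen in the broker log
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker address
// the missing definition: default serdes used wherever none are passed explicitly,
// e.g. for the repartition topic created by groupBy().count()
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

KafkaStreams streams = new KafkaStreams(builder.build(), props);
// surfaces exceptions that would otherwise silently kill the stream thread
streams.setUncaughtExceptionHandler((Thread t, Throwable e) -> System.out.println(e));
streams.start();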

Understanding AWS FIFO Queue behaviour

I was playing around with an AWS SQS FIFO queue locally in localstack, using the AWS Java SDK v2 and Spring Boot.
I created one endpoint to send messages through one publisher, and three endpoints to receive/poll messages from the queue via three consumers, in Spring Boot controller classes.
I created the FIFO queue with the following properties -
RECEIVE_MESSAGE_WAIT_TIME_SECONDS = 20 seconds (long poll)
VISIBILITY_TIMEOUT = 60 seconds
FIFO_QUEUE = true
CONTENT_BASED_DEDUPLICATION = true
Each consumer could fetch at most 3 messages (at least 1 if available, up to 3) per poll request.
I published 5 messages to the queue (in order). They are -
Message Group Id | Deduplication Id
-----------------------------------
A | A1
A | A2
A | A3
A | A4
A | A5
From log -
2022-06-01 16:13:26.474 INFO 27918 --- [nio-9099-exec-1] c.p.sqs.service.SqsPublisherServiceImpl : sendMsgRequest SendMessageRequest(QueueUrl=http://localhost:4566/000000000000/dev-priyam-fifo-queue.fifo, MessageBody={"id":"A1"}, MessageDeduplicationId=A1, MessageGroupId=A)
2022-06-01 16:13:26.600 INFO 27918 --- [nio-9099-exec-2] c.p.sqs.service.SqsPublisherServiceImpl : sendMsgRequest SendMessageRequest(QueueUrl=http://localhost:4566/000000000000/dev-priyam-fifo-queue.fifo, MessageBody={"id":"A2"}, MessageDeduplicationId=A2, MessageGroupId=A)
2022-06-01 16:13:26.700 INFO 27918 --- [nio-9099-exec-3] c.p.sqs.service.SqsPublisherServiceImpl : sendMsgRequest SendMessageRequest(QueueUrl=http://localhost:4566/000000000000/dev-priyam-fifo-queue.fifo, MessageBody={"id":"A3"}, MessageDeduplicationId=A3, MessageGroupId=A)
2022-06-01 16:13:26.785 INFO 27918 --- [nio-9099-exec-4] c.p.sqs.service.SqsPublisherServiceImpl : sendMsgRequest SendMessageRequest(QueueUrl=http://localhost:4566/000000000000/dev-priyam-fifo-queue.fifo, MessageBody={"id":"A4"}, MessageDeduplicationId=A4, MessageGroupId=A)
2022-06-01 16:13:26.843 INFO 27918 --- [nio-9099-exec-5] c.p.sqs.service.SqsPublisherServiceImpl : sendMsgRequest SendMessageRequest(QueueUrl=http://localhost:4566/000000000000/dev-priyam-fifo-queue.fifo, MessageBody={"id":"A5"}, MessageDeduplicationId=A5, MessageGroupId=A)
I then started polling from the consumers randomly. My observations are stated below -
A1, A2 and A3 were polled. They were polled but not deleted (intentionally). So they went back to the queue after visibility timeout (60 seconds) was over.
In the next poll, A3 and A4 were polled. Again, they were polled but not deleted. So they went back to the queue after 60 seconds.
In the next poll, A4 and A5 were polled. Again, they were polled but not deleted. So they went back to the queue after 60 seconds.
In the next poll (and all following polls) A5 was polled. And I kept getting only A5 from here on.
Now I want to understand why I am getting this behaviour. The whole selling point of FIFO is getting ordered messages (per message group id). My expectation after step 1 was that I would get A1, A1 A2, or A1 A2 A3 in the next poll (step 2), but this didn't happen.
Can anyone explain what is happening here?
My github repo : https://github.com/tahniat-ashraf/java-aws-sqs-101
I believe this is a known issue in localstack when using both CONTENT_BASED_DEDUPLICATION=true and providing a MessageDeduplicationId.
SQS supports either content-based deduplication or manual deduplication via a deduplication ID. It does not support both.
Try running this on an actual SQS queue - or change your configuration as described in the localstack issue.
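For example, assuming the SDK v2 client from the question, one way to stop mixing the two mechanisms is to keep CONTENT_BASED_DEDUPLICATION=true and simply omit the explicit MessageDeduplicationId. This is a sketch, not the repository's actual code; the queue URL is copied from the log above:

import software.amazon.awssdk.services.sqs.SqsClient;
import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

SqsClient sqs = SqsClient.create();

SendMessageRequest request = SendMessageRequest.builder()
        .queueUrl("http://localhost:4566/000000000000/dev-priyam-fifo-queue.fifo")
        .messageBody("{\"id\":\"A1\"}")
        .messageGroupId("A")
        // no .messageDeduplicationId(...) here: SQS derives it from a SHA-256 of the body
        .build();

sqs.sendMessage(request);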

QFJ Passes Messages in the Wrong Order

I am using QFJ 2.1.1 and testing my application against a FIX simulator (also running QFJ 2.1.1).
There are two FIX sessions.
An initiator and an acceptor on both sides.
The problem workflow looks like this:
1. simulator/acceptor <--- New Order Single <--- application/initiator
2. simulator/acceptor ---> ACK ---> application/initiator
3. simulator/initiator ---> New Order Single ---> application/acceptor
4. simulator/initiator <--- ACK <--- application/acceptor
The order of FIX messages processed by QFJ in the simulator is 1, 2, 3, 4.
The order of FIX messages processed by QFJ in the application is 1, 3, 2, 4.
The application was called back with 3 (NewOrderSingle) before it was called back with 2 (ACK).
Here are the QFJ log snippets from the simulator showing 2 and 3 being sent on two sessions:
2
2019-12-12 10:23:12.820 [928630][QFJ Message Processor][INFO ] <-- OUTBOUND VENDOR: 8=FIX.4.2|9=180|35=8|34=2|
52=20191212-15:23:12.820|11=287:MACGREGOR-37392703:45037843|17=BYHWG|20=0|37=SIM:287:MACGREGOR-37392703:45037843|38=10000|39=0|54=2|55=MSFT|150=0|10=132|[:]
3
2019-12-12 10:23:12.820 [928630][QFJ Message Processor][INFO ] <-- OUTBOUND ATS: 8=FIX.4.2|9=208|35=D|34=2|52=20191212-15:23:12.820|11=GSET:287:MACGREGOR-37392703:45037843|18=M|21=1|38=10000|40=P|44=153.3
500|54=2|55=MSFT|60=20191212-15:23:09.205|110=0|8011=287:MACGREGOR-37392703:45037843|10=207|[:]
Here are the QFJ log snippets from the application showing 3 being received before 2 on the two sessions:
3
2019-12-12 10:23:12.824 [31181][QFIXManager][INFO ] FIX onAppReceived(): AQUA->GSET, message=[11=GSET:287:MACG
REGOR-37392703:45037843 35=D 18=M 44=153.3500] {11=GSET:287:MACGREGOR-37392703:45037843, 44=153.3500, 55=MSFT,
34=2, 56=AQUA, 35=D, 8011=287:MACGREGOR-37392703:45037843, 49=GSET, 38=10000, 18=M, 110=0, 8=FIX.4.2, 9=208,
60=20191212-15:23:09.205, 40=P, 52=20191212-15:23:12.820, 21=1, 54=2, 10=207} [:]
2
2019-12-12 10:23:12.827 [31184][QFIXManager][INFO ] FIX onAppReceived(): AQUABORG->GSETBORG, message=[11=287:M
ACGREGOR-37392703:45037843 37=SIM:287:MACGREGOR-37392703:45037843 35=8 39=0 150=0] {11=287:MACGREGOR-37392703:
45037843, 55=MSFT, 34=2, 56=AQUABORG, 35=8, 37=SIM:287:MACGREGOR-37392703:45037843, 49=GSETBORG, 38=10000, 17=
BYHWG, 39=0, 150=0, 8=FIX.4.2, 9=180, 52=20191212-15:23:12.820, 20=0, 54=2, 10=132}
As you can see, the application messages 2 and 3 are received out of order.
How can I prevent this in my QFJ application?

Project Reactor: Schedulers#parallel & Schedulers#elastic purpose

I am learning Project Reactor and exploring the Schedulers factory.
I tried the following code:
ExecutorService executorService = Executors.newFixedThreadPool(10);
Flux.range(1, 4)
    .map(i -> {
        logger.info(i + " [MAP] " + Thread.currentThread().getName());
        return 10 / i;
    })
    .publishOn(Schedulers.fromExecutorService(executorService)) // .publishOn(Schedulers.parallel())
    .subscribe(n -> {
        logger.info("START " + ((Long) (System.currentTimeMillis() % 10000000L)).toString());
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        logger.info(n.toString());
        logger.info("END " + ((Long) (System.currentTimeMillis() % 10000000L)).toString());
    });
executorService.shutdown();
I tried this code with Schedulers.parallel() and Schedulers.elastic() as well. I also tried the subscribeOn() operator and saw similar results.
The logs are:
02:07:30.142 [main] INFO - 1 [MAP] main
02:07:30.143 [main] INFO - 2 [MAP] main
02:07:30.143 [main] INFO - 3 [MAP] main
02:07:30.143 [main] INFO - 4 [MAP] main
02:07:30.143 [pool-1-thread-2] INFO - START 1050143
02:07:30.247 [pool-1-thread-2] INFO - 10
02:07:30.247 [pool-1-thread-2] INFO - END 1050247
02:07:30.247 [pool-1-thread-2] INFO - START 1050247
02:07:30.350 [pool-1-thread-2] INFO - 5
02:07:30.350 [pool-1-thread-2] INFO - END 1050350
02:07:30.350 [pool-1-thread-2] INFO - START 1050350
02:07:30.455 [pool-1-thread-2] INFO - 3
02:07:30.455 [pool-1-thread-2] INFO - END 1050455
02:07:30.455 [pool-1-thread-2] INFO - START 1050455
02:07:30.557 [pool-1-thread-2] INFO - 2
02:07:30.558 [pool-1-thread-2] INFO - END 1050558
Since the Flux's elements are ordered and operated upon in sequence (apparent from the logs above), having multiple threads for an operator (or operator chain) for one element does not make sense. I am sure I am either misinterpreting the Schedulers or lacking something in my basic understanding. Can someone point me in the right direction?
I understand the purpose of Schedulers is to make the processing asynchronous and free up the main thread. But why would anyone want to give multiple threads to the operator(s) when they operate on one element at a time?
Does it make sense only when we deal with the flatMap operator?
