Reactor, how to debug an OverflowException? - java

I'm trying to find a way to understand/debug why I randomly get this stack trace:
reactor.core.Exceptions$OverflowException: Could not emit buffer due to lack of requests
at reactor.core.Exceptions.failWithOverflow(Exceptions.java:215)
at reactor.core.publisher.FluxBufferPredicate$BufferPredicateSubscriber.emit(FluxBufferPredicate.java:292)
at reactor.core.publisher.FluxBufferPredicate$BufferPredicateSubscriber.onNextNewBuffer(FluxBufferPredicate.java:251)
at reactor.core.publisher.FluxBufferPredicate$BufferPredicateSubscriber.tryOnNext(FluxBufferPredicate.java:205)
at reactor.core.publisher.FluxBufferPredicate$BufferPredicateSubscriber.onNext(FluxBufferPredicate.java:180)
at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onNext(FluxMap.java:201)
at reactor.core.publisher.FluxConcatMap$ConcatMapImmediate.innerNext(FluxConcatMap.java:271)
at reactor.core.publisher.FluxConcatMap$ConcatMapInner.onNext(FluxConcatMap.java:803)
at reactor.core.publisher.FluxIterable$IterableSubscription.slowPath(FluxIterable.java:232)
at reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:190)
at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.set(Operators.java:1444)
at reactor.core.publisher.Operators$MultiSubscriptionSubscriber.onSubscribe(Operators.java:1318)
at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:128)
at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:61)
at reactor.core.publisher.Flux.subscribe(Flux.java:6873)
Does it mean that the producer is faster than the consumer? My pattern is probably not standard and looks like the following (simplified here):
Flux<Pair<Person, String>> auto = getPersons() // REST GET endpoint
    .map(p -> {
        // In my real-life example, the operation done here is quite expensive.
        Person newP = new Person(p.name, p.age + 10);
        return new Pair<>(newP, "The new age of " + newP.name + " is now " + newP.age);
    })
    .publish()
    .autoConnect(2);

Flux<Person> personsToSave = auto.map(e -> e.first);
Flux<String> auditToSave = auto.map(e -> e.second);

Mono.when(
        savePersons(personsToSave), // REST POST endpoint
        saveAudit(auditToSave))     // REST POST endpoint
    .doOnError(e -> System.err.println(e.getMessage()))
    .block();
Hooks.onOperatorDebug() and log() don't help me much. I don't have the problem if I remove the publish() and only save the persons OR the audit.
Can someone tell me how to investigate this more precisely? (or give me an idea to solve the issue)
Reactor 3.1.6
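A minimal sketch of one way to dig further, assuming the same auto flux as above: give each branch its own named log() and a checkpoint(), then compare the request(n) signals of the two subscribers in the output (the category names below are arbitrary):

Flux<Person> personsToSave = auto.map(e -> e.first)
    .log("branch.persons")        // logs request(n)/onNext/onError for this branch
    .checkpoint("personsToSave"); // tags this assembly site in error traces

Flux<String> auditToSave = auto.map(e -> e.second)
    .log("branch.audit")
    .checkpoint("auditToSave");

The branch whose request(n) lines stop appearing while the other keeps receiving onNext is the one whose lack of demand triggers the overflow.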

Related

Azure ServiceBusSessionReceiverAsyncClient - Mono instead of Flux

I have a Spring Boot app where I receive a single message from an Azure Service Bus queue session.
The code is:
@Autowired
ServiceBusSessionReceiverAsyncClient apiMessageQueueIntegrator;
.
.
.
Mono<ServiceBusReceiverAsyncClient> receiverMono = apiMessageQueueIntegrator.acceptSession(sessionid);
Disposable subscription = Flux.usingWhen(receiverMono,
        receiver -> receiver.receiveMessages(),
        receiver -> Mono.fromRunnable(() -> receiver.close()))
    .subscribe(message -> {
        // Process message.
        logger.info(String.format("Message received from queue. Session id: %s. Contents: %s%n",
            message.getSessionId(), message.getBody()));
        receivedMessage.setReceivedMessage(message);
        timeoutCheck.countDown();
    }, error -> {
        logger.info("Queue error occurred: " + error);
    });
As I am receiving only one message from the session, I use a CountDownLatch(1) to dispose of the subscription when I have received the message.
The documentation of the library says that it is possible to use Mono.usingWhen instead of Flux.usingWhen if I only expect one message, but I cannot find any examples of this anywhere, and I have not been able to figure out how to rewrite this code to do this.
How would the pasted code look if I were to use Mono.usingWhen instead?
Thank you conniey. Posting your suggestion as an answer to help other community members.
By default receiveMessages() is a Flux because we imagine the messages from a session to be "infinitely long". In your case, you only want the first message in the stream, so we use the next() operator.
The usage of the countdown latch is probably not the best approach. In the sample, we had one there so that the program didn't end before the messages were received. .subscribe is not a blocking call; it sets up the handlers and moves on to the next line of code.
Mono<ServiceBusReceiverAsyncClient> receiverMono = sessionReceiver.acceptSession("greetings-id");

Mono<ServiceBusReceivedMessage> singleMessageMono = Mono.usingWhen(receiverMono,
    receiver -> {
        // Anything you wish to do with the receiver.
        // In this case we only want to take the first message, so we use the "next" operator.
        // This returns a Mono.
        return receiver.receiveMessages().next();
    },
    receiver -> Mono.fromRunnable(() -> receiver.close()));

try {
    // Turns this into a blocking call. .block() waits indefinitely, so we have a timeout.
    ServiceBusReceivedMessage message = singleMessageMono.block(Duration.ofSeconds(30));
    if (message != null) {
        // Process message.
    }
} catch (Exception error) {
    System.err.println("Error occurred: " + error);
}
You can refer to the GitHub issue: ServiceBusSessionReceiverAsyncClient - Mono instead of Flux
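A fully non-blocking variant of the same idea is also possible. This is just a sketch (not from the original answer), reusing the receiverMono and logger from the question; the handler bodies are placeholders:

Mono<ServiceBusReceivedMessage> singleMessageMono = Mono.usingWhen(receiverMono,
    receiver -> receiver.receiveMessages().next(),
    receiver -> Mono.fromRunnable(() -> receiver.close()));

Disposable subscription = singleMessageMono
    .timeout(Duration.ofSeconds(30)) // give up if no message arrives in time
    .subscribe(
        message -> logger.info("Message received. Session id: " + message.getSessionId()),
        error -> logger.info("Queue error occurred: " + error));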

RxJava multiple consumers of one publisher

I'm writing some kind of middleware HTTP proxy with cache. The workflow is:
Client requests this proxy for a resource
If the resource exists in the cache, the proxy returns it
If the resource wasn't found, the proxy fetches the remote resource and returns it to the user. The proxy saves this resource to the cache as the data is loaded.
My interfaces expose a Publisher<ByteBuffer> stream for the remote resource, a cache which accepts a Publisher<ByteBuffer> to save, and the client's connection which accepts a Publisher<ByteBuffer> as a response:
// remote resource
interface Resource {
    Publisher<ByteBuffer> fetch();
}
// cache
interface Cache {
    Completable save(Publisher<ByteBuffer> data);
}
// client response connection
interface Connection {
    Completable send(Publisher<ByteBuffer> data);
}
My problem is that I need to lazily save this stream of byte buffers to the cache while sending the response to the client, so the client should be responsible for requesting ByteBuffer chunks from the remote resource, not the cache.
I tried to use the Publisher::cache method, but it's not a good choice for me because it keeps all received data in memory; that's not acceptable, since the cached data may be a few GB in size.
As a workaround, I created a Subject fed with the items received from the Resource:
private final Cache cache;
private final Connection out;

Completable proxy(Resource res) {
    Subject<ByteBuffer> mirror = PublishSubject.create();
    return Completable.mergeArray(
        out.send(res.fetch().doOnNext(mirror::onNext)),
        cache.save(mirror.toFlowable(BackpressureStrategy.BUFFER))
    );
}
Is it possible to reuse the same Publisher without caching items in memory, where only one subscriber is responsible for requesting items from the publisher?
I might be missing something (I added a comment about my version of the Publisher interface being different).
But here's how I would do something like this conceptually.
I'm going to simplify the interfaces to deal with Integers:
// remote resource
interface Resource {
    ConnectableObservable<Integer> fetch();
}
// cache
interface Cache {
    Completable save(Integer data);
}
// client response connection
interface Connection {
    Completable send(Integer data);
}
I'd use Observable::publish to create a ConnectableObservable and establish two subscriptions:
@Test
public void testProxy()
{
    // Override schedulers:
    TestScheduler s = new TestScheduler();
    RxJavaPlugins.setIoSchedulerHandler(
        scheduler -> s );
    RxJavaPlugins.setComputationSchedulerHandler(
        scheduler -> s );

    // Mock interfaces:
    Resource resource = () -> Observable.range( 1, 100 )
        .publish();
    Cache cache = data -> Completable.fromObservable( Observable.just( data )
        .delay( 100, TimeUnit.MILLISECONDS )
        .doOnNext( __ -> System.out.println( String.format( "Caching %d", data ))));
    Connection connection = data -> Completable.fromObservable( Observable.just( data )
        .delay( 500, TimeUnit.MILLISECONDS )
        .doOnNext( __ -> System.out.println( String.format( "Sending %d", data ))));

    // Subscribe to resource:
    ConnectableObservable<Integer> observable = resource.fetch();

    observable
        .observeOn( Schedulers.io() )
        .concatMapCompletable( data -> connection.send( data ))
        .subscribe();

    observable
        .observeOn( Schedulers.computation() )
        .concatMapCompletable( data -> cache.save( data ))
        .subscribe();

    observable.connect();

    // Simulate passage of time:
    s.advanceTimeBy( 10, TimeUnit.SECONDS );
}
Output:
Caching 1
Caching 2
Caching 3
Caching 4
Sending 1
Caching 5
Caching 6
Caching 7
Caching 8
Caching 9
Sending 2
Caching 10
. . .
Update
Based on your comments, it sounds like respecting backpressure is important in your case.
Let's say you have a Publisher somewhere that honors backpressure; you can transform it into a Flowable as follows:
Flowable<T> flowable = Flowable.fromPublisher( publisher );
Once you have a Flowable, you can allow for multiple subscribers without worrying about each subscriber having to request values from the Publisher individually (or about either subscriber missing events while the subscriptions are being established). You do that by calling flowable.publish() to create a ConnectableFlowable.
ConnectableFlowable<T> flowable = Flowable.fromPublisher( publisher ).publish();
out.send(flowable); // calls flowable.subscribe()
cache.save(flowable); // calls flowable.subscribe()
flowable.connect(); // begins emitting values
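Applied back to the ByteBuffer interfaces from the question, a minimal sketch could look like the following. It assumes out.send and cache.save each subscribe exactly once to the Publisher they are given, and it uses autoConnect(2) instead of a manual connect() so the upstream is only subscribed once both consumers are wired up; with publish(), upstream chunks are requested at the pace of the slower of the two subscribers:

Completable proxy(Resource res) {
    // One shared upstream subscription; it starts when the 2nd subscriber arrives.
    Flowable<ByteBuffer> data = Flowable.fromPublisher(res.fetch())
        .publish()
        .autoConnect(2);
    return Completable.mergeArray(
        out.send(data),   // subscriber #1: client connection drives the demand
        cache.save(data)  // subscriber #2: cache writer
    );
}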

Akka stream broadcast in java

I am trying to broadcast from one source to 2 sinks in Java. I got stuck in between; any pointer will be helpful.
public static void main(String[] args) {
    ActorSystem system = ActorSystem.create("GraphBasics");
    ActorMaterializer materializer = ActorMaterializer.create(system);

    final Source<Integer, NotUsed> source = Source.range(1, 1000);
    Sink<Integer, CompletionStage<Done>> firstSink = Sink.foreach(x -> System.out.println("first sink " + x));
    Sink<Integer, CompletionStage<Done>> secondsink = Sink.foreach(x -> System.out.println("second sink " + x));

    RunnableGraph.fromGraph(
        GraphDSL.create(
            b -> {
                UniformFanOutShape<Integer, Integer> bcast = b.add(Broadcast.create(2));
                b.from(b.add(source)).viaFanOut(bcast).to(b.add(firstSink)).to(b.add(secondsink));
                return ClosedShape.getInstance();
            }))
        .run(materializer);
}
I am not that familiar with the Java API for Akka Stream graphs, so I used the official doc. There are 2 errors in your snippet:
when you add the source to the graph builder, you need to get an Outlet from it. So instead of b.from(b.add(source)) there should be something like this: b.from(b.add(source).out()), according to the official doc
you can't just call two .to methods in a row, because .to expects something with a Sink shape, which is a kind of dead end. Instead, you need to attach the 2nd sink to the bcast directly, like this:
(...).viaFanOut(bcast).to(b.add(firstSink));
b.from(bcast).to(b.add(secondSink));
All in all, the code should look like this:
ActorSystem system = ActorSystem.create("GraphBasics");
ActorMaterializer materializer = ActorMaterializer.create(system);

final Source<Integer, NotUsed> source = Source.range(1, 1000);
Sink<Integer, CompletionStage<Done>> firstSink = Sink.foreach(x -> System.out.println("first sink " + x));
Sink<Integer, CompletionStage<Done>> secondSink = Sink.foreach(x -> System.out.println("second sink " + x));

RunnableGraph.fromGraph(
    GraphDSL.create(b -> {
            UniformFanOutShape<Integer, Integer> bcast = b.add(Broadcast.create(2));
            b.from(b.add(source).out()).viaFanOut(bcast).to(b.add(firstSink));
            b.from(bcast).to(b.add(secondSink));
            return ClosedShape.getInstance();
        }
    )
).run(materializer);
Final note: I would think twice about whether it makes sense to use the Graph API at all. If your case is as simple as this one (just 2 sinks), you might want to just use alsoTo or alsoToMat, as sketched below. They let you attach multiple sinks to the flow without needing to use graphs.
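For example, with the same source and sinks as in the question, the whole graph collapses to something like this (a sketch, not tied to a specific Akka version):

ActorSystem system = ActorSystem.create("GraphBasics");
ActorMaterializer materializer = ActorMaterializer.create(system);

Source<Integer, NotUsed> source = Source.range(1, 1000);
Sink<Integer, CompletionStage<Done>> firstSink = Sink.foreach(x -> System.out.println("first sink " + x));
Sink<Integer, CompletionStage<Done>> secondSink = Sink.foreach(x -> System.out.println("second sink " + x));

// alsoTo attaches firstSink as a side branch; runWith drives the stream into secondSink.
source.alsoTo(firstSink).runWith(secondSink, materializer);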

Kafka Streams application strange behavior in docker container

I am running a Kafka Streams application in a Docker container with docker-compose. However, the streams application is behaving strangely. I have a source topic (topicSource) and multiple destination topics (topicDestination1, topicDestination2 ... topicDestination10) that I am branching to based on certain predicates.
topicSource and topicDestination1 have a direct mapping, i.e. all the records simply go into the destination topic without any filtering.
Now all this works perfectly fine when I run the application locally or on a server without containers.
On the other hand, when I run the streams app in a container (using docker-compose and using Kubernetes), it doesn't forward all logs from topicSource to topicDestination1. In fact, only a small number of records are forwarded: for example, some 3000+ records on the source topic and only 6 records in the destination topic. All this is really strange.
This is my Dockerfile:
#FROM openjdk:8u151-jdk-alpine3.7
FROM openjdk:8-jdk
COPY /target/streams-examples-0.1.jar /streamsApp/
COPY /target/libs /streamsApp/libs
COPY log4j.properties /
CMD ["java", "-jar", "/streamsApp/streams-examples-0.1.jar"]
NOTE: I build the jar before creating the image so that I always have updated code. I have made sure that both versions of the code, the one running without a container and the one in a container, are the same.
Main.java:
Creating Source Stream from Source Topic:
KStream<String, String> source_stream = builder.stream("topicSource");
Branching based on predicates:
KStream<String, String>[] branches_source_topic = source_stream.branch(
    (key, value) -> (value.contains("Operation\":\"SharingSet") && value.contains("ItemType\":\"File")), // Sharing set by date
    (key, value) -> (value.contains("Operation\":\"AddedToSecureLink") && value.contains("ItemType\":\"File")), // Added to secure link
    (key, value) -> (value.contains("Operation\":\"AddedToGroup")), // Added to group
    (key, value) -> (value.contains("Operation\":\"Add member to role.") || value.contains("Operation\":\"Remove member from role.")), // Role update by date
    (key, value) -> (value.contains("Operation\":\"FileUploaded") || value.contains("Operation\":\"FileDeleted")
        || value.contains("Operation\":\"FileRenamed") || value.contains("Operation\":\"FileMoved")), // Upload file by date
    (key, value) -> (value.contains("Operation\":\"UserLoggedIn")), // User logged in by date
    (key, value) -> (value.contains("Operation\":\"Delete user.") || value.contains("Operation\":\"Add user.")
        && value.contains("ResultStatus\":\"success")), // Manage user by date
    (key, value) -> (value.contains("Operation\":\"DLPRuleMatch") && value.contains("Workload\":\"OneDrive")) // MS DLP
);
Sending logs to destination topics:
This is the direct mapping topic i.e. all the records are simply going into the destination topic without any filtering.
AppUtil.pushToTopic(source_stream, Constant.USER_ACTIVITY_BY_DATE, "topicDestination1");
Sending logs from branches to destination topics:
AppUtil.pushToTopic(branches_source_topic[0], Constant.SHARING_SET_BY_DATE, "topicDestination2");
AppUtil.pushToTopic(branches_source_topic[1], Constant.ADDED_TO_SECURE_LINK_BY_DATE, "topicDestination3");
AppUtil.pushToTopic(branches_source_topic[2], Constant.ADDED_TO_GROUP_BY_DATE, "topicDestination4");
AppUtil.pushToTopic(branches_source_topic[3], Constant.ROLE_UPDATE_BY_DATE, "topicDestination5");
AppUtil.pushToTopic(branches_source_topic[4], Constant.UPLOAD_FILE_BY_DATE, "topicDestination6");
AppUtil.pushToTopic(branches_source_topic[5], Constant.USER_LOGGED_IN_BY_DATE, "topicDestination7");
AppUtil.pushToTopic(branches_source_topic[6], Constant.MANAGE_USER_BY_DATE, "topicDestination8");
AppUtil.java:
public static void pushToTopic(KStream<String, String> sourceTopic, HashMap<String, String> hmap, String destTopicName) {
    sourceTopic.flatMapValues(new ValueMapper<String, Iterable<String>>() {
        @Override
        public Iterable<String> apply(String value) {
            ArrayList<String> keywords = new ArrayList<String>();
            try {
                JSONObject send = new JSONObject();
                JSONObject received = processJSON(new JSONObject(value), destTopicName);
                boolean valid_json = true;
                for (String key : hmap.keySet()) {
                    if (received.has(hmap.get(key))) {
                        send.put(key, received.get(hmap.get(key)));
                    } else {
                        valid_json = false;
                    }
                }
                if (valid_json) {
                    keywords.add(send.toString());
                }
            } catch (Exception e) {
                System.err.println("Unable to convert to json");
                e.printStackTrace();
            }
            return keywords;
        }
    }).to(destTopicName);
}
Where are the logs coming from:
The logs come from a continuous online stream. A Python job gets the logs, which are basically URLs, and sends them to a pre-source topic. Then in the streams app I create a stream from that topic and hit those URLs, which return JSON logs that I push to topicSource.
I have spent a lot of time trying to resolve this. I have no idea what is going wrong or why it is not processing all the logs. Kindly help me figure this out.
So after a lot of debugging I came to know that I was exploring in the wrong direction; it was a simple case of the consumer being slower than the producer. The producer kept writing new records to the topic, and since the messages were only consumed after stream processing, the consumer was obviously slower. Simply increasing the topic partitions and launching multiple application instances with the same application id did the trick.
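For reference, a minimal configuration sketch of that setup (broker address and application id are placeholders; builder is the StreamsBuilder from above). Every instance started with the same application.id joins the same consumer group and takes over a share of the source-topic partitions, so the topic needs at least as many partitions as there are instances:

Properties props = new Properties();
// All instances sharing this id form one consumer group and split the partitions.
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-examples");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");
// Optionally run several stream threads inside each instance as well.
props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 2);

KafkaStreams streams = new KafkaStreams(builder.build(), props);
streams.start();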

reduce() not working with lightcouch

I have written a program to manage TV series and I am stuck on an issue with LightCouch and a specific database query. This is what I have so far. To set up the database views I used the following lines:
MapReduce get_numberOfSeasonsMR = new MapReduce();
get_numberOfSeasonsMR.setMap(
    "function(doc) { "
    + " emit(doc.seriesName, doc.season)"
    + "}");
get_numberOfSeasonsMR.setReduce(
    "function (key, values, rereduce) {"
    + "return Math.max.apply({}, values)"
    + "}");
map.put("get_numberOfSeasons", get_numberOfSeasonsMR);
In Futon everything appears normal (see http://i.stack.imgur.com/1hgSJ.png).
However, when I try to execute the following line, I get an exception, instead of the results that appear in Futon.
int nr = client.view("design/get_numberOfSeasons").key("Arrow").queryForInt();
Exception:
org.lightcouch.NoDocumentException: Expecting exactly a single result of this view query, but was: 0
org.lightcouch.View.queryValue(View.java:246)
org.lightcouch.View.queryForInt(View.java:219)
....db.Server.getNumberOfSeasons(Server.java:237)
...
I tried to emit Strings in my map() function instead of ints, but it did not make any difference. What am I doing wrong? Or can someone post an example of a successful LightCouch map()+reduce() operation? The tutorials I found only used map() without reduce().
Thanks in advance ;)
Nothing seems wrong with your code; here is the full version:
CouchDbClient dbClient = new CouchDbClient();

DesignDocument designDocument = new DesignDocument();
designDocument.setId("_design/mydesign");
designDocument.setLanguage("javascript");

MapReduce get_numberOfSeasonsMR = new MapReduce();
get_numberOfSeasonsMR.setMap(
    "function(doc) { "
    + " emit(doc.seriesName, doc.season)"
    + "}");
get_numberOfSeasonsMR.setReduce(
    "function (key, values, rereduce) {"
    + "return Math.max.apply({}, values)"
    + "}");

Map<String, MapReduce> view = new HashMap<>();
view.put("get_numberOfSeasons", get_numberOfSeasonsMR);
designDocument.setViews(view);

dbClient.design().synchronizeWithDb(designDocument);

int count = dbClient.view("mydesign/get_numberOfSeasons").key("Arrow").queryForInt();