Akka Distributed Pub/Sub back-pressure - java

I am using Akka Distributed Pub/Sub and have a single publisher and a subscriber. My publisher is way faster than the subscriber. Is there a way to slow down the publisher after a certain point?
Publisher code:
public class Publisher extends AbstractActor {

    private ActorRef mediator;

    static public Props props() {
        return Props.create(Publisher.class, () -> new Publisher());
    }

    public Publisher() {
        this.mediator = DistributedPubSub.get(getContext().system()).mediator();
        this.self().tell(0, ActorRef.noSender());
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .match(Integer.class, msg -> {
                // Sending message to Subscriber
                mediator.tell(
                    new DistributedPubSubMediator.Send(
                        "/user/" + Subscriber.class.getName(),
                        msg.toString(),
                        false),
                    getSelf());
                getSelf().tell(++msg, ActorRef.noSender());
            })
            .build();
    }
}
Subscriber code:
public class Subscriber extends AbstractActor {

    static public Props props() {
        return Props.create(Subscriber.class, () -> new Subscriber());
    }

    public Subscriber() {
        ActorRef mediator = DistributedPubSub.get(getContext().system()).mediator();
        mediator.tell(new DistributedPubSubMediator.Put(getSelf()), getSelf());
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .match(String.class, msg -> {
                System.out.println("Subscriber message received: " + msg);
                Thread.sleep(10000);
            })
            .build();
    }
}

Unfortunately, as currently designed, I don't think there is a way to provide "back-pressure" to the original sender. Since you are using ActorRef.tell to send the message to the mediator, there is no way to get a signal that the downstream receiver is backing up: tell returns void.
Switch To Ask
If you switch your tell to an ask you can set an appropriate Timeout value that will at least let you know when you don't receive a response within a particular duration.
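A minimal sketch of that idea, assuming the Akka classic Java API (Patterns.ask with a java.time.Duration timeout); the 5-second timeout and the acknowledgement reply are illustrative, and the Subscriber would have to reply via getSender().tell(...) for the ask to complete:
// Inside Publisher's Integer handler, replacing the fire-and-forget tell (sketch only):
java.util.concurrent.CompletionStage<Object> ack =
    akka.pattern.Patterns.ask(
        mediator,
        new DistributedPubSubMediator.Send(
            "/user/" + Subscriber.class.getName(), msg.toString(), false),
        java.time.Duration.ofSeconds(5));
// Only schedule the next message once the subscriber has acknowledged the previous one.
ack.thenAccept(reply -> getSelf().tell(msg + 1, ActorRef.noSender()))
   .exceptionally(timeout -> { /* no reply in time: slow down, retry, or drop */ return null; });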
Switch To Streams
"Back-pressure" is a primary feature of akka streams. Therefore, by switching to a stream implementation you will be able to achieve your desired goal.
If it possible to create a stream Source from your original data, then you could use Sink.actorRef to create a Sink from the mediator and use Flow.throttle to control the rate of flow to the mediator.
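A rough sketch of that approach, assuming the Akka Streams 2.6 Java DSL (akka.stream.javadsl.Source and Sink); the rate of one element per second and the "done" completion message are placeholders:
Source.range(0, Integer.MAX_VALUE)
    // let at most one element per second through; anything faster is back-pressured upstream
    .throttle(1, java.time.Duration.ofSeconds(1))
    .map(i -> new DistributedPubSubMediator.Send(
        "/user/" + Subscriber.class.getName(), i.toString(), false))
    // hand each Send envelope to the mediator; "done" is sent when the stream completes
    .runWith(Sink.actorRef(mediator, "done"), system);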

Related

How to use Vertx EventBus to send messages between Verticles?

I am currently maintaining an application written in Java with the Vert.x framework.
I would like to implement sending messages between two application instances (primary and secondary) using the EventBus (over the network). Is it possible?
In the Vert.x documentation I do not see an example of how to achieve that: https://vertx.io/docs/vertx-core/java/#event_bus
I see that there are send(...) methods on EventBus that take an address, but the address can be any String. I would like to publish events to the other application instance (for example, from primary to secondary).
It is possible using a Vert.x cluster manager.
Choose one of the supported cluster managers in the classpath of your application.
In your main method, instead of creating a standalone Vertx instance, create a clustered one:
Vertx.clusteredVertx(new VertxOptions(), res -> {
    if (res.succeeded()) {
        Vertx vertx = res.result();
    } else {
        // failed!
    }
});
Deploy a receiver:
public class Receiver extends AbstractVerticle {

    @Override
    public void start() throws Exception {
        EventBus eb = vertx.eventBus();
        eb.consumer("ping-address", message -> {
            System.out.println("Received message: " + message.body());
            // Now send back reply
            message.reply("pong!");
        });
        System.out.println("Receiver ready!");
    }
}
In a separate JVM, deploy a sender:
public class Sender extends AbstractVerticle {

    @Override
    public void start() throws Exception {
        EventBus eb = vertx.eventBus();
        // Send a message every second
        vertx.setPeriodic(1000, v -> {
            eb.request("ping-address", "ping!", reply -> {
                if (reply.succeeded()) {
                    System.out.println("Received reply " + reply.result().body());
                } else {
                    System.out.println("No reply");
                }
            });
        });
    }
}
That's it for the basics. You may need to follow individual cluster manager configuration instructions in the docs.
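For instance, if you pick the Hazelcast cluster manager (the io.vertx:vertx-hazelcast artifact), a rough sketch of wiring it explicitly instead of relying on classpath discovery could look like this (deploying Receiver here is just illustrative):
// assumes io.vertx:vertx-hazelcast on the classpath
ClusterManager mgr = new HazelcastClusterManager();
VertxOptions options = new VertxOptions().setClusterManager(mgr);
Vertx.clusteredVertx(options, res -> {
    if (res.succeeded()) {
        Vertx vertx = res.result();
        vertx.deployVerticle(new Receiver());
    } else {
        // failed to form or join the cluster
    }
});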

Transpose from Consumer to CompletableFuture

I'm currently using an API which I unfortunately cannot change easily. This API has some methods in the style of this:
public void getOffers(Consumer<List<Offer>> offersConsumer) {
    final Call<List<Offer>> offers = auctionService.getOffers();
    handleGetOffers(offersConsumer, offers);
}
It's a web API using Retrofit, and it lets me process the response in a consumer, but I would much rather work with CompletableFutures.
I'm using the data I receive from this endpoint to compose an interface in a game, specifically an inventory that basically acts as a frontend to the API. What I want is for my composing method to wait for the consumer to finish and then provide the processed results. This is what I have so far, but I don't know how to do the step from the consumer to the CompletableFuture:
@Override
public CompletableFuture<Inventory> get(Player player) {
    return CompletableFuture.supplyAsync(() -> {
        auctionAPI.getOffers(offers -> {
            // process the offers, then return the result of the processing, in form of an "Inventory"-Object.
        });
    });
}
I now need to return the result of the processing after all the Items have been received and then processed. How can I achieve this?
Something along these lines should work:
@Override
public CompletableFuture<Inventory> get(Player player) {
    CompletableFuture<Inventory> result = new CompletableFuture<>();
    CompletableFuture.supplyAsync(() -> {
        auctionAPI.getOffers(offers -> {
            // process the offers into an "Inventory" object, then complete the future with it
            result.complete(inventory);
        });
        return null;
    });
    return result;
}
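If getOffers can also fail (for example if the underlying Retrofit call errors out), you would want to complete the future exceptionally so the caller isn't left waiting forever. A hedged sketch, where buildInventory is just a placeholder for your processing:
try {
    auctionAPI.getOffers(offers -> result.complete(buildInventory(offers)));
} catch (Exception e) {
    // surface the failure to whoever is waiting on the CompletableFuture
    result.completeExceptionally(e);
}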

RSocket Channel with Spring Boot - Clients miss their own first message

Suppose I have a simple RSocket and Spring Boot Server. The server broadcasts all incoming client messages to all connected clients (including the sender). Client and server look like this:
Server:
public RSocketController() {
    this.processor = DirectProcessor.<String>create().serialize();
    this.sink = this.processor.sink();
}

@MessageMapping("channel")
Flux<String> channel(final Flux<String> messages) {
    this.registerProducer(messages);
    // breakpoint here
    return processor
        .doOnSubscribe(subscription -> logger.info("sub"))
        .doOnNext(message -> logger.info("[Sent] " + message));
}

private Disposable registerProducer(Flux<String> flux) {
    return flux
        .doOnNext(message -> logger.info("[Received] " + message))
        .map(String::toUpperCase)
        // .delayElements(Duration.ofSeconds(1))
        .subscribe(this.sink::next);
}
Client:
#ShellMethod("Connect to the server")
public void connect(String name) {
this.name = name;
this.rsocketRequester = rsocketRequesterBuilder
.rsocketStrategies(rsocketStrategies)
.connectTcp("localhost", 7000)
.block();
}
#ShellMethod("Establish a channel")
public void channel() {
this.rsocketRequester
.route("channel")
.data(this.fluxProcessor.doOnNext(message -> logger.info("[Sent] {}", message)))
.retrieveFlux(String.class)
.subscribe(message -> logger.info("[Received] {}", message));
}
#ShellMethod("Send a lower case message")
public void send(String message) {
this.fluxSink.next(message.toLowerCase());
}
The problem is: the first message a client sends is processed by the server, but does not reach the sender again. All subsequent messages are delivered without any problems. All other clients already connected will receive all messages.
What I noticed so far while debugging:
When I call channel() in the client, retrieveFlux() and subscribe() are called, but on the server the breakpoint in the corresponding method is not triggered.
Only when the client sends the first message with send() is the breakpoint triggered on the server.
Using .delayElements() on the server seems to "solve" the problem.
What am I doing wrong here?
And why does it need the send() first to trigger the server's breakpoint?
Thanks in advance!
A DirectProcessor does not have a buffer. If it has no subscriber, incoming messages are dropped.
(Citing from its Javadoc: If there are no Subscribers, upstream items are dropped)
I think that when RSocketController.registerProducer() calls flux.[...].subscribe(), it immediately starts processing the incoming messages from flux and passing them to the sink of the processor, but the subscription to the processor has not happened yet. Thus the messages are dropped.
I guess that subscription to the processor is done by the framework after the RSocketController.channel(...) method returns. You should be able to set a breakpoint in your processor.doOnSubscribe(..) callback to see where it actually happens.
Thus maybe moving the registerProducer() call into a processor.doOnSubscribe() callback will solve your issue, like this:
#MessageMapping("channel")
Flux<String> channel(final Flux<String> messages) {
return processor
.doOnSubscribe(subscription -> this.registerProducer(messages))
.doOnSubscribe(subscription -> logger.info("sub"))
.doOnNext(message -> logger.info("[Sent] " + message));
}
But personally I think I would prefer to replace the DirectProcessor with UnicastProcessor.create().onBackpressureBuffer().publish(), so that broadcasting to multiple subscribers is moved into a separate operation, there can be a buffer between the sink and the subscribers, and late subscribers and backpressure can be handled in a better way.
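A rough sketch of that alternative wiring, assuming Reactor 3.x before the Sinks API; the field names and the use of autoConnect() to start the multicast are my own choices, not something prescribed by the original code:
private final UnicastProcessor<String> processor = UnicastProcessor.create();
private final FluxSink<String> sink = processor.sink();
// buffer between the single producer side and the multicast to RSocket subscribers
private final Flux<String> broadcast = processor.onBackpressureBuffer().publish().autoConnect();

@MessageMapping("channel")
Flux<String> channel(final Flux<String> messages) {
    return broadcast
        .doOnSubscribe(subscription -> this.registerProducer(messages));
}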

Kafka SpringBoot StreamListener - how to consume multiple topics in order?

I have multiple StreamListener-annotated methods consuming from different topics. Some of these topics need to be read from the "earliest" offset to populate an in-memory map (something like a state machine); only then should the other topics be consumed, since they may contain commands that should be executed against the "latest" state machine.
Current code looks something like:
@Component
@AllArgsConstructor
@EnableBinding({InputChannel.class, OutputChannel.class})
@Slf4j
public class KafkaListener {

    @StreamListener(target = InputChannel.EVENTS)
    public void event(Event event) {
        // do something with the event
    }

    @StreamListener(target = InputChannel.COMMANDS)
    public void command(Command command) {
        // do something with the command only after all events have been processed
    }
}
I tried to add some horrible code that gets the kafka topic offset metadata from the incoming event messages and then uses a semaphore to block the command until a certain percentage of the total offset is reached by the event. It kinda works but makes me sad, and it will be awful to maintain once we have 20 or so topics that all depend on one another.
Does SpringBoot / Spring Streams have any built-in mechanism to do this, or is there some common pattern that people use that I'm not aware of?
TL;DR: How do I process all messages from topic A before consuming any from topic B, without doing something dirty like sticking a Thread.sleep(60000) in the consumer for topic B?
See the Kafka consumer binding properties resetOffsets and startOffset:
resetOffsets
Whether to reset offsets on the consumer to the value provided by startOffset. Must be false if a KafkaRebalanceListener is provided; see Using a KafkaRebalanceListener.
Default: false.
startOffset
The starting offset for new groups. Allowed values: earliest and latest. If the consumer group is set explicitly for the consumer binding (through spring.cloud.stream.bindings.<channelName>.group), 'startOffset' is set to earliest. Otherwise, it is set to latest for the anonymous consumer group. Also see resetOffsets (earlier in this list).
Default: null (equivalent to earliest).
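For example, to force the binding that feeds the state machine to re-read its topic from the beginning on startup, something like the following should work (the binding name "events" and the group are illustrative):
spring.cloud.stream.bindings.events.group=state-machine
spring.cloud.stream.kafka.bindings.events.consumer.resetOffsets=true
spring.cloud.stream.kafka.bindings.events.consumer.startOffset=earliest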
You can also add a KafkaBindingRebalanceListener and perform seeks on the consumer.
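A minimal sketch of that, assuming the Kafka binder's KafkaBindingRebalanceListener contract; the binding name "events" is illustrative:
@Bean
public KafkaBindingRebalanceListener rebalanceListener() {
    return new KafkaBindingRebalanceListener() {
        @Override
        public void onPartitionsAssigned(String bindingName, Consumer<?, ?> consumer,
                Collection<TopicPartition> partitions, boolean initial) {
            // rewind only the "events" binding, and only on the initial assignment
            if (initial && "events".equals(bindingName)) {
                consumer.seekToBeginning(partitions);
            }
        }
    };
}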
EDIT
You can also set autoStartup to false on the second listener, and start the binding when you are ready. Here's an example:
@SpringBootApplication
@EnableBinding(Sink.class)
public class Gitter55Application {

    public static void main(String[] args) {
        SpringApplication.run(Gitter55Application.class, args);
    }

    @Bean
    public ConsumerEndpointCustomizer<KafkaMessageDrivenChannelAdapter<?, ?>> customizer() {
        return (endpoint, dest, group) -> {
            endpoint.setOnPartitionsAssignedSeekCallback((assignments, callback) -> {
                assignments.keySet().forEach(tp -> callback.seekToBeginning(tp.topic(), tp.partition()));
            });
        };
    }

    @StreamListener(Sink.INPUT)
    public void listen(String value, @Header(KafkaHeaders.RECEIVED_MESSAGE_KEY) byte[] key) {
        System.out.println(new String(key) + ":" + value);
    }

    @Bean
    public ApplicationRunner runner(KafkaTemplate<byte[], byte[]> template,
            BindingsEndpoint bindings) {
        return args -> {
            while (true) {
                template.send("gitter55", "foo".getBytes(), "bar".getBytes());
                System.out.println("Hit enter to start");
                System.in.read();
                bindings.changeState("input", State.STARTED);
            }
        };
    }
}
spring.cloud.stream.bindings.input.group=gitter55
spring.cloud.stream.bindings.input.destination=gitter55
spring.cloud.stream.bindings.input.content-type=text/plain
spring.cloud.stream.bindings.input.consumer.auto-startup=false

Observable.publish() doesn't call onCompleted() on observers that subscribe after the source Observable is done

I'm trying to get an Observable to share its emissions with all the subscribers, so that it would be subscribe()d to exactly once.
I tried using Observable.publish(), but it appears that subscribers to the published Observable don't receive any termination notifications (onCompleted() and possibly onError()) if they subscribe after the source Observable is done. Here is a piece of code to demonstrate that:
static <T> Observer<T> printObserver(String name) {
    return new Observer<T>() {
        @Override public void onCompleted() {
            System.out.println(name + ": onCompleted()");
        }
        @Override public void onError(Throwable e) {
            System.out.println(name + ": onError( " + e + " )");
        }
        @Override public void onNext(T value) {
            System.out.println(name + ": onNext( " + value + " )");
        }
    };
}
public void testRxPublishConnect() throws Exception {
    Observable<Integer> sourceObservable = Observable.range(1, 5);
    ConnectableObservable<Integer> sharedObservable = sourceObservable.publish();
    sharedObservable.subscribe(printObserver("Observer #1"));
    sharedObservable.connect();
    sharedObservable.subscribe(printObserver("Observer #2"));
}
This is what gets printed:
Observer #1: onNext( 1 )
Observer #1: onNext( 2 )
Observer #1: onNext( 3 )
Observer #1: onNext( 4 )
Observer #1: onNext( 5 )
Observer #1: onCompleted()
Note that Observer #2 doesn't receive onCompleted().
I don't think this is the desired behavior. Am I missing something?
I tried it in RxJava versions 1.0.8 and 1.0.14 with the same result.
Try .share() which is .publish().refCount().
This is by design. If you call connect() in this case, your subscriber will receive all events from the start. If a terminated publish would terminate its child subscribers immediately, you likely couldn't observe values because once connected, publish ticks away its source slowly if there are no subscribers to it.
I'm 99% sure this is the expected behavior. I'm not sure about RxJava specifically, but in most implementations of the publish/subscribe pattern that I know of, the default behavior for an observable is to publish events to its current subscribers and forget about them. This means that notifications are not 'retroactive' (i.e. subscribers don't get to know anything about events emitted in the past).
Also, from the Observable Contract (section 'Multiple Observers') of the RxJava documentation:
If a second observer subscribes to an Observable that is already emitting items to a first observer, it is up to the Observable whether it will thenceforth emit the same items to each observer ... There is no general guarantee that two observers of the same Observable will see the same sequence of items.
Publish works by building a list of all subscribers; once connect() is called, it starts producing data to all subscribers in its subscriber list. This means all the subscribers have to be known before calling connect(). Here's how you would use publish(), or, possibly preferably, the publish(Func1<Observable<T>, Observable<R>>) overload.
Known number of subscribers: Publish
Func closing over all subscriptions.
observableStream.publish(new Func1<Observable<Integer>, Observable<Integer>>() {
    @Override
    public Observable<Integer> call(Observable<Integer> subject) {
        Observable<Integer> o1 = subject.doOnNext(somework1());
        Observable<Integer> o2 = subject.doOnNext(somework2());
        return Observable.merge(o1, o2);
    }
});
Manual call to connect and subscribe:
ConnectableObservable<Integer> subject = observableStream.publish();
subject.subscribe(somework1());
subject.subscribe(somework2());
subject.connect();
If you don't know how many subscribers you'll have then you can window the inputs to manageable chunks and then publish your inputs over your collection of Transformers.
Unknown number of subscribers: Window
final Set<Transformer<Integer, String>> transformers = new HashSet<>();
observableStream
    .window(100, TimeUnit.MILLISECONDS, 1000)
    .flatMap(new Func1<Observable<Integer>, Observable<String>>() {
        @Override
        public Observable<String> call(Observable<Integer> window) {
            return window.publish(new Func1<Observable<Integer>, Observable<String>>() {
                @Override
                public Observable<String> call(Observable<Integer> publish) {
                    Observable<Observable<String>> workObservables = Observable.from(transformers)
                        .map(new Func1<Transformer<Integer, String>, Observable<String>>() {
                            @Override
                            public Observable<String> call(Transformer<Integer, String> transformer) {
                                return publish.compose(transformer);
                            }
                        });
                    return Observable.merge(workObservables);
                }
            });
        }
    })
    .subscribe();
There is a third option. You could use observable.cache(), but this will hold all input data from that observable stream in memory, so you want to be careful with how you use it. In that case you'll probably end up windowing anyway to control the bounds of your cached subject.
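Applied to the original test, a small illustration of the cache() option (RxJava 1.x): a late subscriber still receives every item plus the terminal onCompleted(), because cache() replays the whole sequence.
Observable<Integer> cached = Observable.range(1, 5).cache();
cached.subscribe(printObserver("Observer #1"));
// the source has already completed, but Observer #2 still sees 1..5 and onCompleted()
cached.subscribe(printObserver("Observer #2"));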
