Difference between a Future and a Mono - java

In (Java) reactive programming, what is the difference between a Future<T> and a (Project Reactor) Mono<T>? Both seem to be means for accessing the result of an asynchronous computation at a time in the future when the computation is complete. Why introduce the Mono interface if Future already does the job?

The greatest difference is that a Mono<T> can be fully lazy, whereas when you get hold of a Future<T>, the underlying processing has already started.
With a typical cold Mono, nothing happens until you subscribe() to it, which makes it possible to pass the Mono around in the application and enrich it with operators along the way, before even starting the processing.
It is also far easier to keep things asynchronous using a Mono compared to a Future (where the API tends to drive you to call the blocking get()).
Finally, compared to both Future and CompletableFuture, the composition aspect is improved in Mono with the extensive vocabulary of operators it offers.
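To make the laziness difference concrete, here is a minimal, illustrative sketch (assuming Project Reactor is on the classpath; expensiveCall is a stand-in for real work):

import java.util.concurrent.CompletableFuture;
import reactor.core.publisher.Mono;

public class LazyVsEager {
    public static void main(String[] args) {
        // CompletableFuture: the work starts as soon as the future is created.
        CompletableFuture<String> future =
                CompletableFuture.supplyAsync(() -> expensiveCall()); // already running

        // Mono: nothing runs yet; this merely describes the computation,
        // and more operators can still be attached before it starts.
        Mono<String> mono = Mono.fromCallable(() -> expensiveCall())
                                .map(String::toUpperCase);

        mono.subscribe(System.out::println); // only now does expensiveCall() execute
    }

    static String expensiveCall() {
        System.out.println("expensive call running");
        return "result";
    }
}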

Producer and consumer can communicate in two ways: synchronously and asynchronously.
In the synchronous (pull-based) style, the consumer is a thread and some intermediate communicator object is used, usually a blocking queue. In the special case where only a single value is passed during the whole producer-consumer exchange, a communicator implementing the Future interface can be used. This style is called synchronous because the consumer calls a communicating method like Future.get(), which waits until the value is available and then returns it. That is, requesting the value and receiving it are programmed in the same statement, though the two actions can be separated in time.
The drawback of synchronous communication is that while the consumer waits for the requested value, it wastes a considerable amount of memory on its thread stack. As a result, only a limited number of actions can wait for data at once - think of internet connections serving multiple clients. To raise that limit, we can represent the consumer not as a thread but as a relatively small object whose methods are called by the producer or communicator when a datum for the consumer becomes available. This asynchronous (push-based) style splits the communication into two actions: the request to the producer to pass data, and the delivery of that data to the consumer.
Now the reply to the question: a Future can act as a synchronous communicator only (with its get methods), while a Mono can be used both as a synchronous communicator (with its block methods) and as an asynchronous one (with its subscribe methods).
Note that java.util.concurrent.CompletableFuture can also act as both a synchronous and an asynchronous communicator. Why have similar means to do the same thing? This phenomenon is called "not invented here".
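A small sketch of the two styles side by side (illustrative only; the values are placeholders):

import java.util.concurrent.CompletableFuture;
import reactor.core.publisher.Mono;

public class SyncVsAsync {
    public static void main(String[] args) {
        Mono<Integer> mono = Mono.fromCallable(() -> 21 * 2);
        Integer blocked = mono.block();                  // synchronous: the caller waits
        mono.subscribe(v -> System.out.println(v));      // asynchronous: pushed to a callback

        CompletableFuture<Integer> cf = CompletableFuture.supplyAsync(() -> 21 * 2);
        Integer joined = cf.join();                      // synchronous (join() avoids checked exceptions)
        cf.thenAccept(v -> System.out.println(v));       // asynchronous
    }
}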

Related

Query on RxJava observeOn scheduler thread

I have to write into a file based on incoming requests. As multiple requests may arrive simultaneously, I don't want multiple threads trying to overwrite the file content together, which could lose some data.
Hence, I tried collecting all the requests' data using an instance variable of PublishSubject. I subscribe to the PublishSubject during init, and this subscription remains throughout the life-cycle of the application. I also observe the same instance on a separate thread (provided by the Vert.x event loop) which invokes the method responsible for writing to the file.
private PublishSubject<FileData> publishSubject = PublishSubject.create();

private void init() {
    publishSubject
        .observeOn(RxHelper.blockingScheduler(vertx))
        .subscribe(fileData -> writeData(fileData));
}
Later during request handling, I call onNext as below:
handleRequest() {
    //do some task
    publishSubject.onNext(fileData);
}
I understand that when I call onNext, the data will be queued up to be written into the file by the specific thread assigned by the observeOn operator. However, what I'm trying to understand is:
whether this thread stays blocked in the WAITING state just for this task, or
whether it will be used for other activities too when no file writing is happening.
I don't want to end up with one thread from the Vert.x event loop wasted in a waiting state by going with this approach. Also, please suggest a better approach if one is available.
Thanks in advance.
Actually RxJava will do it for you: by definition, onNext() emissions act in a serial fashion:
Observables must issue notifications to observers serially (not in parallel). They may issue these notifications from different threads, but there must be a formal happens-before relationship between the notifications. (Observable Contract)
So as long as you run the blocking calls inside onNext() at the subscriber (and do not fork work to a different thread manually), you will be fine, and no parallel writes will happen.
Actually, your worries should come from the opposite direction: backpressure.
You should choose your backpressure strategy here: if requests come in faster than you can process them (writing to the file), you might overflow the buffer and get into trouble. Consider using Flowable and choosing the backpressure strategy that fits your needs, as in the sketch below.
Regarding your questions, that depends on the Scheduler. You're using RxHelper.blockingScheduler(vertx), which seems to be your custom code, so I can't tell; if the scheduler uses a shared thread in a work-queue fashion, it will not stay idle.
Anyhow, Rx will not determine this for you; the scheduler's responsibility is to assign the work to some thread according to its own logic.
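For illustration, a rough RxJava 2 sketch of the Flowable-based variant (the buffer capacity, the FileData stub and the error handling are placeholders, not part of the original question): a serialized processor makes concurrent onNext() calls safe, and the bounded buffer makes the backpressure policy explicit.

import io.reactivex.processors.FlowableProcessor;
import io.reactivex.processors.PublishProcessor;
import io.reactivex.schedulers.Schedulers;

public class FileWritePipeline {
    static class FileData { final String payload; FileData(String p) { payload = p; } }

    // toSerialized() makes onNext() safe to call from multiple request threads
    private final FlowableProcessor<FileData> processor =
            PublishProcessor.<FileData>create().toSerialized();

    void init() {
        processor
            .onBackpressureBuffer(10000)        // bound the queue of pending writes
            .observeOn(Schedulers.single())     // one dedicated thread performs all writes
            .subscribe(this::writeData, Throwable::printStackTrace);
    }

    void handleRequest(FileData fileData) {
        processor.onNext(fileData);             // enqueue; never blocks the request thread
    }

    private void writeData(FileData fileData) {
        // append to the file here; runs serially on the single scheduler thread
    }
}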

Akka vs Java 7 Futures

I am trying to understand when to use Akka Futures and found this article to be a little bit more helpful than the main Akka docs. So it looks like Akka Futures do exactly the same thing as Java 7 Futures. So I ask:
Outside the context of an actor system, what benefits do Akka Futures have over Java Futures? When to use each?
Within the context of an actor system, why ever use an Akka Future? Aren't all actor-to-actor messages asynchronous, concurrent and non-blocking?
Akka Futures implement an asynchronous style of communication, while Java 7 Futures implement a synchronous one. Yes, they do the same thing - communication - but in quite different ways.
A producer-consumer pair can interact in two ways: synchronously and asynchronously. The synchronous way assumes the consumer has its own thread and performs a blocking operation to get the next produced message, e.g. BlockingQueue.take(). In the asynchronous approach, the consumer does not own a thread; it is just an object with at least two methods: one to store a message and one to process it. The producer calls the store method, just like it calls Queue.put(m) in the synchronous approach, but that method also initiates execution of the consumer's processing method on a common thread pool.
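A bare-bones Java sketch of the two interaction styles described above (names and values are illustrative; anonymous classes are used to stay Java 7-friendly):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class TwoStyles {
    public static void main(String[] args) throws Exception {
        // Synchronous: the consumer owns a thread and blocks in take().
        final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
        new Thread(new Runnable() {
            public void run() {
                try {
                    System.out.println("pulled: " + queue.take()); // parked until a value arrives
                } catch (InterruptedException ignored) { }
            }
        }).start();
        queue.put("hello");

        // Asynchronous: the consumer is just a callback; no thread waits for the value.
        ExecutorService pool = Executors.newCachedThreadPool();
        pool.execute(new Runnable() {
            public void run() { System.out.println("pushed: hello"); } // producer schedules the handler
        });
        pool.shutdown();
    }
}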
UPDATE
As for the 2nd question (why ever use an Akka Future):
Creating a Future looks (and is) simpler than creating an Actor; code for a chain of Futures is more compact and more readable than the equivalent with Actors.
Note, however, that a Future can pass only a single value (message), while an Actor can handle a sequence of messages. But sequences can be handled with Akka Streams. So the question arises: why ever use Akka Actors? I invite more experienced developers to answer this question. Generally, I think if your task can be solved with Futures, then use Futures; else, if it can be solved with Streams, use Streams; else, if it can be solved with Akka Actors, then use Actors; else, look for another framework.
For the first part of your question, I agree with Alexei Kaigorodov's answer.
For the second part of your question:
It is useful to use a Future internally when actor responses need to be combined in a very specific way. For example, say the Master actor needs to perform several blocking database queries and then aggregate their results, so Master sends each query to a Worker and then aggregates the responses. If the query results can be aggregated in any order (e.g. Master is just summing row counts), then it makes sense for each Worker to send its result to Master via a callback. However, if the results need to be combined in a very specific order, then it is easier for each Worker to immediately return a Future and for Master to manipulate these Futures in the correct order. This could be done via callbacks as well, but then Master would need to figure out which query result is which in order to put them in the correct order, and it would be much more difficult to optimize the code. For example, if the results of query1 can be immediately aggregated with the results of query2, then by using a Future this logic can go directly into the dispatch code, where the identities of all queries are already known; using a callback instead would require Master to identify each query result and also determine whether it can be aggregated with any other query results that have already returned.
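The answers above are in Akka/Scala terms, but the ordering argument is easy to see with plain CompletableFuture too. A hypothetical Java sketch (runQuery stands in for a blocking database call):

import java.util.concurrent.CompletableFuture;

public class OrderedAggregation {
    public static void main(String[] args) {
        CompletableFuture<Long> q1 = CompletableFuture.supplyAsync(() -> runQuery("query1"));
        CompletableFuture<Long> q2 = CompletableFuture.supplyAsync(() -> runQuery("query2"));

        // The dispatch site knows which future is which, so order-sensitive
        // aggregation needs no bookkeeping: combine q1 with q2, in that order.
        long total = q1.thenCombine(q2, (a, b) -> a + b).join();
        System.out.println(total);
    }

    static long runQuery(String name) {
        return name.length(); // stand-in for a blocking database call
    }
}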

Akka and its Error Kernel

I am reading the Akka (Java lib) docs and need clarification on some of their own proclaimed Akka/Actor Best Practices.
Actors should not block (i.e. passively wait while occupying a Thread) on some external entity... The blocking operations should be done in some special-cased thread which sends messages to the actors which shall act on them.
So what does a code example of this look like in Akka/Java? If an Actor isn't an appropriate place to put code that has to block, then what does satisfy the definition of "some special-cased thread"?
Do not pass mutable objects between actors. In order to ensure that, prefer immutable messages.
I'm familiar with how to make immutable classes (no public setters, no public fields, make the class final, etc.). But does Akka have its own definition of an "immutable class", and if so, what is it?
Top-level actors are the innermost part of your Error Kernel...
I don't even know what this means! I understand what they mean by "top-level" actors (highest in the actor/manager/supervisor hierarchy), but what's an "Error Kernel", and how does it relate to actors?
I am able to answer only the first question (and in the future, please put only one question in a post).
Consider, for example, a database connection, which is inherently blocking. To allow actors to connect to a database, the programmer should create a dedicated thread (or a thread pool) with a queue of database requests. A request contains a database statement and a reference to the actor which is to receive the result. The dedicated thread reads requests in a loop, accesses the database, sends the result to the referenced actor, and so on. The request queue is blocking: when there are no requests, the connection thread is blocked in the queue.take() operation.
So the access to the database is split between two actors - one places requests on the queue, and the other handles the results.
UPDATE: Java code sketch (I am not strong in Scala).
class Request {
    String query;
    ActorRef handler; // the actor that will receive the result
}

class DatabaseConnector implements Runnable {
    LinkedBlockingQueue<Request> queue = new LinkedBlockingQueue<Request>();
    Thread t = new Thread(this);
    { t.start(); } // start the dedicated connection thread on construction

    public void sendRequest(Request r) {
        try {
            queue.put(r);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public void run() {
        try {
            for (;;) {
                Request r = queue.take(); // blocks while the queue is empty
                ResultSet res = doBlockingCallToJdbc(r.query);
                r.handler.sendOneWay(res); // Akka 1.x API; use tell(...) in Akka 2.x
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // shut the connection thread down
        }
    }
}
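For completeness, the requesting side might look like this (hypothetical names; connector and replyActor are assumed to exist already):

Request r = new Request();
r.query = "SELECT count(*) FROM orders"; // hypothetical query
r.handler = replyActor;                  // the actor that will receive the ResultSet
connector.sendRequest(r);                // cheap: just enqueues, the caller is not blocked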
Here is the answer to your second question, straight from the Akka docs:
If one actor carries very important data (i.e. its state shall not be lost if avoidable), this actor should source out any possibly dangerous sub-tasks to children it supervises and handle failures of these children as appropriate. Depending on the nature of the requests, it may be best to create a new child for each request, which simplifies state management for collecting the replies. This is known as the “Error Kernel Pattern” from Erlang.
So the phrase you are asking about means that these actors are the "last line of defence" against errors in your supervision hierarchy, so they should be strong and powerful guys (commandos) rather than weak workers. And the fewer commandos you have, the easier it is to manage them and avoid a mess at the top level. More precisely, the number of commandos should be close to the number of business protocols you have (moving to superheroes - say, one for IronMan, one for Hulk, etc.).
This document also has a good explanation about how to manage blocking operations.
Speaking of which
If an Actor isn't an appropriate place to put code that has to block, then what does satisfy the definition of "some special-cased thread"?
An Actor definitely doesn't, because Akka guarantees only sequentiality: your messages may be processed on any thread (it just picks up a free thread from the pool), even for a single actor. Blocking operations are not recommended there (at least not in the same thread pool as normal operations) because they may lead to performance problems or even deadlocks. See the explanation for Spray (which is based on Akka), for instance: Spray.io: When (not) to use non-blocking route handling?
You may think of it as Akka requiring interaction only through asynchronous APIs. You may consider a Future for converting sync to async - just send the response from your database as a message to the actor. Example in Scala:
def receive = { // the actor's message handler (onReceive in the Java API)
  case query: Query =>    // the message, safely pattern-matched as a Query
    Future {
      // this block will be executed on a separate thread:
      doBlockingCallToJdbc(query)
    } pipeTo sender       // i.e. do `sender ! futureResult` after the future's completion
}
Other approaches are described in the same document (Akka docs).

Best practices with Akka in Scala and third-party Java libraries

I need to use the memcached Java API in my Scala/Akka code. This API gives you both synchronous and asynchronous methods. The asynchronous ones return java.util.concurrent.Future. There was already a question here about dealing with Java Futures in Scala: How do I wrap a java.util.concurrent.Future in an Akka Future?. However, in my case I have two options:
Using the synchronous API and wrapping the blocking code in a future marked as blocking:
Future {
  blocking {
    cache.get(key) // synchronous blocking call
  }
}
Using the asynchronous Java API and polling the Java Future every n ms to check whether it has completed (as described in one of the answers to the question linked above).
Which one is better? I am leaning towards the first option because polling can dramatically impact response times. Shouldn't the blocking { } block prevent blocking the whole pool?
I always go with the first option, but I am doing it in a slightly different way: I don't use the blocking feature. (Actually, I have not thought about it yet.) Instead, I provide a custom execution context to the Future that wraps the synchronous blocking call. So it looks basically like this:
val ecForBlockingMemcachedStuff = ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(100)) // whatever number you think is appropriate
// I create a separate ec for each blocking client/resource/api I use

Future {
  cache.get(key) // synchronous blocking call
}(ecForBlockingMemcachedStuff) // or mark the execution context implicit. I like to mention it explicitly.
So all the blocking calls use a dedicated execution context (= thread pool), which is separated from your main execution context responsible for the non-blocking stuff.
This approach is also explained in an online training video for Play/Akka provided by Typesafe. There is a video in lesson 4 about how to handle blocking calls. It is explained by Nilanjan Raychaudhuri (hope I spelled it correctly), who is a well-known author of Scala books.
Update: I had a discussion with Nilanjan on Twitter. He explained the difference between the blocking approach and a custom ExecutionContext. The blocking feature just creates a special ExecutionContext that takes a naive approach to the question of how many threads you will need: it spawns a new thread whenever all the existing threads in the pool are busy. So it is actually an uncontrolled ExecutionContext; it could create lots of threads and lead to problems such as an out-of-memory error. The solution with a custom execution context is actually better because it makes this problem obvious. Nilanjan also added that you need to consider circuit breaking for the case where this pool gets overloaded with requests.
TLDR: Yeah, blocking calls suck. Use a custom/dedicated ExecutionContext for blocking calls. Also consider circuit breaking.
The Akka documentation provides a few suggestions on how to deal with blocking calls:
In some cases it is unavoidable to do blocking operations, i.e. to put a thread to sleep for an indeterminate time, waiting for an external event to occur. Examples are legacy RDBMS drivers or messaging APIs, and the underlying reason is typically that (network) I/O occurs under the covers. When facing this, you may be tempted to just wrap the blocking call inside a Future and work with that instead, but this strategy is too simple: you are quite likely to find bottlenecks or run out of memory or threads when the application runs under increased load.
The non-exhaustive list of adequate solutions to the “blocking problem” includes the following suggestions:
Do the blocking call within an actor (or a set of actors managed by a router), making sure to configure a thread pool which is either dedicated for this purpose or sufficiently sized.
Do the blocking call within a Future, ensuring an upper bound on the number of such calls at any point in time (submitting an unbounded number of tasks of this nature will exhaust your memory or thread limits).
Do the blocking call within a Future, providing a thread pool with an upper limit on the number of threads which is appropriate for the hardware on which the application runs.
Dedicate a single thread to manage a set of blocking resources (e.g. a NIO selector driving multiple channels) and dispatch events as they occur as actor messages.
The first possibility is especially well-suited for resources which are single-threaded in nature, like database handles which traditionally can only execute one outstanding query at a time and use internal synchronization to ensure this. A common pattern is to create a router for N actors, each of which wraps a single DB connection and handles queries as sent to the router. The number N must then be tuned for maximum throughput, which will vary depending on which DBMS is deployed on what hardware.
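The second and third suggestions boil down to a dedicated, bounded pool. A minimal Java sketch (pool size and queue capacity are placeholders; doBlockingCallToJdbc is borrowed from the earlier sketch):

import java.sql.ResultSet;
import java.util.concurrent.*;

public class BoundedBlockingPool {
    // At most 16 blocking calls in flight, with a bounded queue so overload
    // is rejected fast instead of exhausting memory or threads.
    static final ExecutorService BLOCKING_POOL = new ThreadPoolExecutor(
            16, 16, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<Runnable>(1000),
            new ThreadPoolExecutor.AbortPolicy());

    static Future<ResultSet> query(final String sql) {
        return BLOCKING_POOL.submit(new Callable<ResultSet>() {
            public ResultSet call() throws Exception {
                return doBlockingCallToJdbc(sql); // the actual blocking JDBC call
            }
        });
    }

    static ResultSet doBlockingCallToJdbc(String sql) { return null; } // stand-in
}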

How to integrate LMAX within a real financial application

I am also thinking of integrating the Disruptor pattern into our application. I am a bit unsure about a few things before I start using the Disruptor:
I have 3 producers: mainly a FIX thread which de-serialises the requests, another thread which continuously modifies order prices as the market moves, and one more thread which is responsible for de-serialising the requests sent from a GUI application. All three threads currently write to a blocking queue (hence we see a lot of contention on the queue).
The Disruptor talks about the single writer principle, and from what I have read that approach scales best. Is there any way we could make the above three threads obey the single writer principle?
Also, in a typical request/response application, and especially in our case, we have contention on an in-memory cache, as we need to lock the cache when we update it with the response while a request might be arriving for the same order. How do we handle this through the Disruptor, i.e. how do I tie a response to a particular request? Can I eliminate the lock on the cache, and if yes, how?
Any suggestions/pointers would be highly appreciated. We are currently using Java 1.6.
I'm new to the Disruptor and am trying to understand as many use cases as possible. I have tried to answer your questions.
Yes, the Disruptor can be used to sequence calls from multiple producers. I understand that all 3 threads try to update the state of a shared object, and a single consumer takes the necessary action on that shared object. Internally, you can have the single consumer delegate calls to the appropriate single-threaded handler based on responsibility.
The Disruptor does exactly this: it sequences the calls such that the state is accessed by only one thread at a time. If there is a specific order in which the event handlers must be invoked, set up the memory barrier. The latest version of the Disruptor has a DSL that lets you set up the order easily.
The cache can be abstracted and accessed through the Disruptor. At any given time, only a reader or a writer gets access to the cache, since all calls to the cache are sequential.
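To make the first point concrete, a rough sketch against the Disruptor 3.x API (OrderEvent and the handler body are placeholders; anonymous classes are used since the question mentions Java 1.6):

import java.util.concurrent.Executors;
import com.lmax.disruptor.BlockingWaitStrategy;
import com.lmax.disruptor.EventFactory;
import com.lmax.disruptor.EventHandler;
import com.lmax.disruptor.RingBuffer;
import com.lmax.disruptor.dsl.Disruptor;
import com.lmax.disruptor.dsl.ProducerType;

public class OrderPipeline {
    static class OrderEvent { String payload; }

    public static void main(String[] args) {
        Disruptor<OrderEvent> disruptor = new Disruptor<OrderEvent>(
                new EventFactory<OrderEvent>() {
                    public OrderEvent newInstance() { return new OrderEvent(); }
                },
                1024,                        // ring size, must be a power of two
                Executors.newCachedThreadPool(),
                ProducerType.MULTI,          // FIX, pricer and GUI threads all publish
                new BlockingWaitStrategy());

        disruptor.handleEventsWith(new EventHandler<OrderEvent>() {
            public void onEvent(OrderEvent e, long seq, boolean endOfBatch) {
                // single consumer thread: state/cache updates need no lock here
            }
        });

        RingBuffer<OrderEvent> ring = disruptor.start();

        // Each producer thread publishes like this:
        long seq = ring.next();
        try {
            ring.get(seq).payload = "new order";
        } finally {
            ring.publish(seq);
        }
    }
}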
