Async API design client - java

Let's say I create an async REST API in Spring MVC with Java 8's CompletableFuture.
How is this called in the client? If it's non-blocking, does the endpoint return something before processing is finished? I.e.
#RequestMapping("/") //GET method
public CompletableFuture<String> meth(){
thread.sleep(10000);
String result = "lol";
return CompletableFuture.completedFuture(result);
}
How exactly does this work? (The code above is just an example I made up.)
When I send a GET request from, say, Google Chrome at localhost:3000/, what happens? I'm a newbie to async APIs and would like some help.

No, the client doesn't know it's asynchronous. It'll have to wait for the result normally. It's just the server side that benefits from freeing up a worker thread to handle other requests.

In this version it's pointless, because CompletableFuture.completedFuture() creates a completed Future immediately.
However in a more complex piece of code, you might return a Future that is not yet complete. Spring will not send the response body until some other thread calls complete() on this Future.
Why not just use a new thread? Well, you could - but in some situations it might be more efficient not to. For example you might put a task into an Executor to be handled by a small pool of threads.
Or you might fire off a JMS message asking for the request to be handled by a completely separate machine. A different part of your program will respond to incoming JMS messages, find the corresponding Future and complete it. There is no need for a thread dedicated to this HTTP request to be alive while the work is being done on another system.
Very simple example:
#RequestMapping("/employeenames/{id}")
public CompletableFuture<String> getName(#PathVariable String id){
CompletableFuture<String> future = new CompletableFuture<>();
database.asyncSelect(
name -> future.complete(name),
"select name from employees where id = ?",
id
);
return future;
}
I've invented a plausible-ish API for an asynchronous database client here: asyncSelect(Consumer<String> callback, String preparedStatement, String... parameters). The point is that it fires off the query, then does not block the thread waiting for the DB to respond. Instead it leaves a callback (name -> future.complete(name)) for the DB client to invoke when it can.
This is not about improving API response times -- we do not send an HTTP response until we have a payload to provide. This is about using the resources on the server more efficiently, so that while we're waiting for the database to respond it can do other things.
There is a related but different concept, asynchronous REST, in which the server responds with 202 Accepted and a header like Location: /queue/12345, allowing the client to poll for the result. But that isn't what the code you asked about does.
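For contrast, a minimal sketch of that 202-style flow in Spring MVC, where the /reports path, the /queue location, and the jobService that kicks off the work and returns an ID are all hypothetical:
@RequestMapping(value = "/reports", method = RequestMethod.POST)
public ResponseEntity<Void> startReport() {
    String jobId = jobService.start();                // hypothetical: start the work, return immediately
    return ResponseEntity.accepted()                  // 202 Accepted
            .location(URI.create("/queue/" + jobId))  // the client polls this URL for the result
            .build();
}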

CompletableFuture was introduced in Java 8 to make complex asynchronous programming easier to handle. It lets the programmer combine and cascade async calls, and it offers the static utility methods runAsync and supplyAsync to abstract away the manual creation of threads.
These methods dispatch tasks to Java's common ForkJoinPool by default, or to a custom thread pool if one is provided as an optional argument.
If a CompletableFuture is returned by an endpoint method and complete() is never called on it, the request will hang until it times out.
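As a rough sketch (not from the question), the original endpoint could be rewritten with supplyAsync and a custom pool, so the sleep runs off the request thread; the pool field and its size are just illustrative:
private final Executor pool = Executors.newFixedThreadPool(4); // custom pool instead of the common ForkJoinPool

@RequestMapping("/")
public CompletableFuture<String> meth() {
    return CompletableFuture.supplyAsync(() -> {
        try {
            Thread.sleep(10000); // simulate slow work without tying up a servlet thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "lol";
    }, pool);
}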

Related

How to handle asynchronous callbacks in a synchronous way in Java?

I have an architecture-related question. This is a language-independent question, but as I come from a Java background, it will be easier for me if someone guides me in the Java way.
Basically, the middleware I'm writing communicates with a SOAP-based third-party service. The calls are async in the sense that, when a service is called, it immediately returns a response of 01 - processing, meaning that the third party has successfully received the request. In the original SOAP request, a callback URL has to be submitted each time, and that is where the third party actually sends the result. So calling a particular service doesn't return the result immediately; the result arrives later at a separate HTTP endpoint in the middleware.
Now in our frontend, we don't want to complicate the user experience. We want our users to call a middleware function (via menu items/buttons) and get the result immediately, leaving the dirty work to the middleware.
Please note that the middleware function (let's say X()) which is invoked from the frontend and the middleware endpoint URL (let's call it Y) where the third party pushes the result are completely separate from each other. X() somehow has to wait for, and then fetch, the result received at Y and return it to the frontend.
How can I build a robust solution to achieve the above mentioned behavior?
The picture depicts my case perfectly. Any suggestions will be highly appreciated.
This question could be more about integration patterns than it is about multi-threading. But requests in the same application/JVM can be orchestrated using a combination of asynchronous invocation and the observer pattern:
This is better done using an example (exploiting your Java knowledge). Check the following simplistic components that try to replicate your scenario:
The third-party service: it exposes an operation that returns a correlation ID and starts the long-running execution
class ExternalService {
    public String send() {
        return UUID.randomUUID().toString();
    }
}
Your client-facing service: It receives a request, calls the third-party service and then waits for the response after registering with the result receiver:
class RequestProcessor {
    public Object submitRequest() {
        String correlationId = new ExternalService().send();
        // register for the result and block (join) until the receiver completes the future
        return new ResultReceiver().register(correlationId).join();
    }
}
The result receiver: It exposes an operation to the third-party service, and maintains an internal correlation registry:
class ResultReceiver {
    Map<String, CompletableFuture<Object>> subscribers = new ConcurrentHashMap<>();

    CompletableFuture<Object> register(String responseId) {
        CompletableFuture<Object> future = new CompletableFuture<Object>();
        this.subscribers.put(responseId, future);
        return future;
    }

    public void externalResponse(String responseId, Object result) {
        this.subscribers.get(responseId).complete(result);
    }
}
Futures, promises and callbacks are handy in this case. Synchronization is done by the initial request processor in order to force the execution to block for the client.
Now this can raise a number of issues that are not addressed in this simplistic class set. Some of these problems may be:
Race condition between new ExternalService().send() and new ResultReceiver().register(correlationId): this is something that can be solved in ResultReceiver if it understands that some responses can arrive very quickly (a 2-way wait, so to say)
Results that never come: results can take too long or simply run into errors. These future APIs typically offer timeouts to force cancellation of the request. For example:
new ResultReceiver().register(correlationId)
    .get(10000, TimeUnit.SECONDS);
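If that timeout fires, get() throws a TimeoutException that the request processor could translate into an error response. A rough sketch, where receiver is assumed to be a shared ResultReceiver instance and errorResponse() is a hypothetical helper:
try {
    return receiver.register(correlationId).get(30, TimeUnit.SECONDS);
} catch (TimeoutException e) {
    // the third party never pushed a result in time
    return errorResponse("timed out waiting for the external system");
} catch (InterruptedException | ExecutionException e) {
    return errorResponse(e.getMessage());
}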
Well, what exactly is the problem with doing that? You just create an API (middleware) which doesn't return a response until the third party returns the processed result. The frontend sends a request to X(), X() processes that request by sending a request to Y() and then keeps polling Y() to see when the result is ready; then X() takes the result from Y() and sends it back to the frontend. Like a facade.
There are some problems you should consider when using third-party services that you don't control. First of all, you need to implement some kind of circuit breaker or timeout, because the third-party service might hang and never process the results (or process them so slowly that it makes no sense to wait). You should also consider some meaningful way to keep the site running even if the third-party service is unavailable, has updated their API, or something else prevents you from using it.
And just one last thought to end with: why would you want to make something that is already implemented asynchronously synchronous? It is made like that probably because it might take time. Blocking the frontend for long periods of time while waiting for results makes the user experience unpleasant and the UI unresponsive. It is usually better to stick to asynchronous requests and show users that processing is happening, but let them do something else in the meantime.

Returning synchronous message from service, but then doing asynchronous processing - concern about hanging threads?

Essentially I've written a service in Java that will do initial synchronous processing (a couple simple calls to other web services). Then, after that processing is done, I return an acknowledgement message to the caller, saying I've verified their request and there is now downstream processing happening in the background asynchronously.
In a nutshell, what I'm concerned about is the complexity of the async processing. The sum of those async calls can take up to 2-3 minutes depending on certain parameters sent. My thought here is: what if there's a lot of traffic at once hitting my service, and there are a bunch of hanging threads in the background, doing a large chunk of processing. Will there be bad data as a result? (like one request getting mixed in with a previous request etc)
The code follows this structure:
1. Validation of headers and params in body
2. Synchronous processing
3. Return acknowledgement message to the caller
4. Asynchronous processing
For #4, I've simply created a new thread and called a method that does all the async processing within it. Like:
new Thread()
{
    @Override
    public void run()
    {
        try {
            makeDownstreamCalls(arg1, arg2, arg3, arg4);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}.start();
I'm basically wondering about unintended consequences of lots of traffic hitting my service. An example I'm thinking about: a thread executing downstream calls for request A, and then another request comes in, and a new thread has to be made to execute downstream calls for request B. How is request B handled in this situation, and what happens to request A, which is still in-progress? Will the async calls in request A just terminate in this case? Or can each distinct request, and thread, execute in parallel just fine and complete, without any strange consequences?
Well, the answer depends on your code, of which you posted a small part, so my answer contains some guesswork. I'll assume that we're talking about some sort of multi-threaded server which accepts client requests, and that those request come to some handleRequest() method which performs the 4 steps you've mentioned. I'll also assume that the requests aren't related in any way and don't affect each other (so for instance, the code doesn't do something like "if a thread already exists from a previous request then don't create a new thread" or anything like that).
If that's the case, then your handleRequest() method can be simultaneously invoked by different server threads concurrently. And each will execute the four steps you've outlined. If two requests happen simultaneously, then a server thread will execute your handler for request A, and a different one will execute it for B at the same time. If during the processing of a request, a new thread is created, then one will be created for A, another for B. That way, you'll end up with two threads performing makeDownstreamCalls(), one with A's parameters one with B's.
In practice, that's probably a pretty bad idea. The more threads your program creates, the more context-switching the OS has to do. You really don't want the number of requests to endlessly increase the number of threads in your application. Modern OSes are capable of handling hundreds or even thousands of threads (as long as they're bound by IO, not CPU), but it comes at a cost. You might want to consider using a Java executor with a limited number of threads to avoid crushing your process or even the OS.
If there's too much load on a server, you can't expect your application to handle all of it. Process what you can within the limits of the application and reject further requests; deliberately rejecting excess load rather than accepting every request and crashing (with none of them processed) is known as "load shedding".
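A minimal sketch of step 4 combining a bounded pool with load shedding; the pool sizes and the 503 idea are just illustrative:
// One shared, bounded pool for the whole service: at most 20 worker threads and 100 queued
// tasks, instead of an unbounded number of ad-hoc threads.
private final ExecutorService downstreamPool = new ThreadPoolExecutor(
        20, 20, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(100));

...

try {
    // step 4: hand the work to the pool after the acknowledgement has been sent
    downstreamPool.execute(() -> {
        try {
            makeDownstreamCalls(arg1, arg2, arg3, arg4);
        } catch (Exception e) {
            e.printStackTrace();
        }
    });
} catch (RejectedExecutionException e) {
    // pool and queue are full: shed the load, e.g. answer with HTTP 503 instead of queueing forever
}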

KafkaProducer: Difference between `callback` and returned `Future`?

The KafkaProducer send method both returns a Future and accepts a Callback.
Is there any fundamental difference between using one mechanism over the other to execute an action upon completion of the sending?
Looking at the documentation you linked to, it looks like the main difference between the Future and the Callback lies in who initiates the "request is finished, what now?" question.
Let's say we have a customer C and a baker B. And C is asking B to make him a nice cookie. Now there are 2 possible ways the baker can return the delicious cookie to the customer.
Future
The baker accepts the request and tells the customer: Ok, when I'm finished I'll place your cookie here on the counter. (This agreement is the Future.)
In this scenario, the customer is responsible for checking the counter (Future) to see if the baker has finished his cookie or not.
blocking
The customer stays near the counter and looks at it until the cookie is put there (Future.get()) or the baker puts an apology there instead (Error: Out of cookie dough).
non-blocking
The customer does some other work, and once in a while checks if the cookie is waiting for him on the counter (Future.isDone()). If the cookie is ready, the customer takes it (Future.get()).
Callback
In this scenario the customer, after ordering his cookie, tells the baker: When my cookie is ready please give it to my pet robot dog here, he'll know what to do with it (This robot is the Callback).
Now the baker, when the cookie is ready, gives the cookie to the dog and tells him to run back to its owner. The baker can continue baking the next cookie for another customer.
The dog runs back to the customer and starts wagging its artificial tail to make the customer aware that his cookie is ready.
Notice how the customer didn't have any idea when the cookie would be given to him, nor was he actively polling the baker to see if it was ready.
That's the main difference between the 2 scenarios: who is responsible for initiating the "your cookie is ready, what do you want to do with it?" question. With the Future, the customer is responsible for checking when it's ready, either by actively waiting or by polling every now and then. In the case of the callback, the baker will call back via the provided function.
I hope this answer gives you a better insight into what a Future and a Callback actually are. Once you've got the general idea, you could try to find out on which thread each specific thing is handled, when a thread is blocked, or in what order everything completes. Writing some simple programs that print statements like "main client thread: cookie received" could be a fun way to experiment with this.
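A rough sketch of such an experiment, assuming a broker at localhost:9092 and a topic named cookies, run from a main method declared to throw Exception; the printed thread names show who "receives the cookie":
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
    ProducerRecord<String, String> record = new ProducerRecord<>("cookies", "order-1", "chocolate chip");

    // Future style: the customer (main thread) waits at the counter
    RecordMetadata metadata = producer.send(record).get();
    System.out.println(Thread.currentThread().getName() + ": cookie received at offset " + metadata.offset());

    // Callback style: the baker's I/O thread hands the cookie to the robot dog
    producer.send(record, new Callback() {
        @Override
        public void onCompletion(RecordMetadata md, Exception ex) {
            System.out.println(Thread.currentThread().getName() + ": cookie received");
        }
    });
    producer.flush(); // make sure the callback fires before the producer is closed
}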
The asynchronous approach
producer.send(record, new Callback() {
    @Override
    public void onCompletion(RecordMetadata rm, Exception ex) { ... }
});
gives you better throughput compared to the synchronous approach
RecordMetadata rm = producer.send(record).get();
since you don't wait for acknowledgements in the first case.
Also, with the asynchronous approach ordering is not guaranteed, whereas with the synchronous one it is: the next message is sent only after the acknowledgement for the previous one has been received.
Another difference is that with a synchronous call you can stop sending messages straight away when an exception occurs, whereas with the asynchronous approach some messages will already have been sent before you discover that something is wrong and can react to it.
Also note that in the asynchronous approach the number of messages that are "in flight" is controlled by the max.in.flight.requests.per.connection parameter.
Apart from the synchronous and asynchronous approaches you can use the fire-and-forget approach, which looks almost the same as the synchronous call but without blocking on or processing the returned metadata: just send the message and hope that it reaches the broker (knowing that most likely it will, and that the producer will retry in case of recoverable errors). There is a chance, though, that some messages will be lost:
producer.send(record);
To summarize:
Fire and Forget - fastest one, but some messages could be lost;
Synchronous - slowest, use it if you cannot afford to lose messages;
Asynchronous - something in between.
The main difference is whether you want to block the calling thread waiting for the acknowledgment.
The following, using the Future.get() method, will block the current thread until the send has completed before performing some action.
producer.send(record).get();
// Do some action
When using a Callback to perform some action, the code will execute in the I/O thread so it's non-blocking for the calling thread.
producer.send(record,
    new Callback() {
        @Override
        public void onCompletion(RecordMetadata metadata, Exception exception) {
            // Do some action
        }
    });
Though the docs say callbacks will "generally" execute in the I/O thread of the producer:
Note that callbacks will generally execute in the I/O thread of the producer and so should be reasonably fast or they will delay the sending of messages from other threads. If you want to execute blocking or computationally expensive callbacks it is recommended to use your own Executor in the callback body to parallelize processing.
My observations based on The Kafka Producer documentation:
Future gives you access to synchronous processing
Future might not guarantee acknowledgement. My understanding is that a Callback will execute after acknowledgement
Callback gives you access to fully non-blocking asynchronous processing.
There are also guarantees on the ordering of execution for a callback on the same partition:
Callbacks for records being sent to the same partition are guaranteed to execute in order.
My other opinion is that the Future return object and the Callback "pattern" represent two different programming styles, and I think that this is the fundamental difference:
The Future represents Java's Concurrency Model Style.
The Callback represents Java's Lambda Programming Style (because Callback actually satisfies the requirement for a Functional Interface)
You can probably end up coding similar behaviour with both the Future and Callback styles, but in some use cases it looks like one style might be more advantageous than the other.
send() is the method used to start publishing a message to the Kafka cluster. It is an asynchronous call: it accumulates the message in a buffer and returns immediately. This can be combined with linger.ms to batch-publish messages for better performance. We can handle exceptions and control the flow either synchronously, by calling the get() method on the returned Future, or asynchronously with a callback.
Each method has its own pros and cons, and the choice can be made based on the use case.
Asynchronous send(Fire & Forget):
We call the send method as below to call publish a message without waiting for any success or error response.
producer.send(new ProducerRecord<String, String>("topic-name", "key", "value"));
This does not wait for the first message to complete before sending the next ones. In case of an error, the producer retries based on the retry config parameter, but if the message still fails after retrying, the producer never knows about it. We may lose some messages in this case, but if a small amount of message loss is acceptable, this gives high throughput and low latency.
Synchronous send
A simple way to send a message synchronously is to use the get() method:
RecordMetadata recMetadata = producer.send(new ProducerRecord<String, String>("topic-name", "key", "value")).get();
producer.send() returns a Future of RecordMetadata, and when we call the .get() method it waits for the reply from Kafka. We can catch an exception in case of error, or get the RecordMetadata in case of success; the RecordMetadata contains the offset, partition and timestamp, which can be logged. This is slow, but gives high reliability and a guarantee that the message was delivered.
Asynchronous send with callback
We can also call the send() method with a callback function, which is invoked once the send completes. This is useful if you want to send messages asynchronously, i.e. not wait for the job to complete, but still handle errors or update the status of message delivery.
producer.send(record, new Callback() {
    @Override
    public void onCompletion(RecordMetadata recordMetadata, Exception ex) { ... }
});
Note: please don't confuse acks & retries with the asynchronous send call. acks and retries apply to every send call, whether it is synchronous or asynchronous; the only difference is how you handle the returned metadata and the failure scenarios. For example, with an asynchronous send the acks and retries rules still apply, but they are handled on an independent thread without blocking other threads from sending records in parallel. The only challenge is that we will not be aware of failures, or of when a message completed successfully.

Returning value from function without completing function

My function sends an HTTP request to a server and gets a response. It looks like:
public Acknowledgement function() {
    Acknowledgement ack = new Acknowledgement("received");
    // send http request and get response
    return ack;
}
The purpose of the Acknowledgement is to inform the caller that function() has been successfully called and processing has been started.
I want to return the acknowledgement before sending the HTTP request. One way is to use a separate thread (implementing Runnable) for sending the request and getting the response. But threads have been around for a long time. What are the more modern alternatives to threads for achieving this?
you could try using a callback listener, like:
interface LifecycleListener {
    void onStarting();
}

public Acknowledgement function(LifecycleListener listener) {
    listener.onStarting();
    ...
    ...
}
Many things have existed in Java for almost 20 years. That doesn't mean we should stop using String or Integer, does it?
But you are correct insofar as there are (slightly) newer abstractions on top of bare-metal threads.
So, concepts to study would be ExecutorService and things like Future.
Using threads directly, especially in conjunction with low-level primitives such as wait/notify, is probably not the first choice in 2017; you would rather look into the aforementioned ExecutorService, or even go one step further and learn about frameworks like RxJava.
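A minimal sketch of that direction applied to the question, where the pool size and the sendRequestAndHandleResponse() helper are hypothetical:
private final ExecutorService executor = Executors.newFixedThreadPool(4);

public Acknowledgement function() {
    // hand the slow HTTP call to the pool and return the acknowledgement immediately
    CompletableFuture.runAsync(this::sendRequestAndHandleResponse, executor);
    return new Acknowledgement("received");
}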
Threads are not old; if you want to do it better, you can use a thread pool to manage them.

Solution for Asynchronous Servlets in versions prior to 3.0?

I have a long-running task (a report) which would exceed any TCP connection timeouts before it starts returning data. Asynchronous servlets (introduced in Servlet 3.0) are exactly what I need; however, I am limited to Servlet 2.4.
Are there any "roll-your-own" solutions? What I'm doing feels hacked - I kick off the task asynchronously in a thread and just return to the client immediately. The client then polls every few seconds (with ajax), and checks for a "ready" status for this task ID (a static list maintains their status and some handles to the objects processed by the thread). Once ready, I inject the output stream into the work object so the thread can write the results back to the client.
You can implement the reverse Ajax technique, which means that instead of polling many times to get the response, you get the response once the task has finished.
There is a quick way to implement the reverse Ajax technique using DWR here. But you should keep using the static List. If your background-task business logic is complicated, you can use an ESB or something more sophisticated.
