I have a Netty HTTP server that I use as an API server. Users send events to it, and it processes each event on other threads in the same server using executor services or a concurrency framework such as Akka. I have two options for the response: when I hand the event to another thread, I can either wait for the result and write it to the socket, or immediately write an acknowledgement message back to the socket.
When I wait for the result, the latency of the HTTP requests increases and the number of requests the server can handle decreases. When I just acknowledge, the number of HTTP requests the server can handle increases because latencies are quite low for almost all requests; on the other hand, there is no back-pressure control, so I don't know when the server will actually process the event and I can't inform the user that it has been processed.
public void channelRead(ChannelHandlerContext ctx, Object msg) {
    if (msg instanceof HttpRequest) {
        executor.submit(new Event(msg));
        // do not wait for the response
        ctx.write(new DefaultFullHttpResponse(HTTP_1_1, OK));
    }
}
public void channelRead(ChannelHandlerContext ctx, Object msg) {
    if (msg instanceof HttpRequest) {
        // A plain Future has no completion callback; CompletableFuture does.
        // (Assumes Event implements Runnable, as in the executor.submit() variant.)
        CompletableFuture.runAsync(new Event(msg), executor)
                .thenRun(() -> ctx.write(new DefaultFullHttpResponse(HTTP_1_1, OK)));
    }
}
Since it's an API server, I don't strictly have to wait for the result in order to inform the user that the event was processed, because it's simply a write request that doesn't return a response body. So which approach is more suitable for HTTP servers, and do you think the second option is worth the trade-off?
I think the answer depends on your requirements. In general, a good philosophy to follow is to keep things simple. That may mean doing the minimum amount of work required, and it may mean leaving room for extensibility. If it makes sense for your use case not to provide status back, then what are the reasons you would do this? However, there is typically a trade-off between keeping things simple and providing enough visibility into the system. It sounds like you are not sure about what is and is not required.
Ask yourself the following questions:
Can the clients use the return status to do something meaningful? What can your clients do without this status?
Can you foresee the client needs changing in the future to require responses post processing? (i.e. return status, return body, error codes, etc...)
Is your server required to track what is happening all the way from getting the client request to turning that into action on the server?
What is your visibility and ability to debug problems?
What is the additional complexity and overhead required to add the functionality? Is this acceptable?
These are just a few of the many questions you should be asking yourself in this situation. Based on the information you provided, I'm not sure it is possible to give you a "you should do X" or "you should do Y" answer. I think you would be best served by re-evaluating your requirements and asking yourself questions like the ones listed above.
Related
I have an architecture-related question. This is a language-independent question, but as I come from a Java background, it will be easier for me if someone guides me in the Java way.
Basically, the middleware I'm writing communicates with a SOAP-based third-party service. The calls are async in the sense that when a service is called, it immediately returns response 01 - processing, meaning that the third party has successfully received the request. In the original SOAP request, a callback URL has to be submitted each time; that is where the third party actually sends the result. So calling a particular service doesn't return the result immediately; the result is received at a separate HTTP endpoint in the middleware.
Now in our frontend, we don't want to complicate the user experience. We want our users to call a middleware function (via menu items/buttons), and get the result immediately; and leave the dirty work to the middleware.
Please note that the middleware function (let's say X()) that was invoked from the front end and the middleware endpoint URL (let's call it Y) where the third party pushes the result are completely separate from each other. X() somehow has to wait, then fetch the result grabbed in Y, and then return the result to the front end.
How can I build a robust solution to achieve the above mentioned behavior?
The picture depicts my case perfectly. Any suggestions will be highly appreciated.
This question could be more about integration patterns than it is about multi-threading. But requests in the same application/JVM can be orchestrated using a combination of asynchronous invocation and the observer pattern:
This is better done using an example (exploiting your Java knowledge). Check the following simplistic components that try to replicate your scenario:
The third-party service: it exposes an operation that returns a correlation ID and starts the long-running execution
class ExternalService {
    public String send() {
        return UUID.randomUUID().toString();
    }
}
Your client-facing service: It receives a request, calls the third-party service and then waits for the response after registering with the result receiver:
class RequestProcessor {
    public Object submitRequest() {
        String correlationId = new ExternalService().send();
        return new ResultReceiver().register(correlationId).join();
    }
}
The result receiver: It exposes an operation to the third-party service, and maintains an internal correlation registry:
class ResultReceiver {
    // ConcurrentHashMap, since registration and completion happen on different threads
    Map<String, CompletableFuture<Object>> subscribers = new ConcurrentHashMap<>();

    CompletableFuture<Object> register(String responseId) {
        CompletableFuture<Object> future = new CompletableFuture<Object>();
        this.subscribers.put(responseId, future);
        return future;
    }

    public void externalResponse(String responseId, Object result) {
        this.subscribers.get(responseId).complete(result);
    }
}
Futures, promises, and callbacks are handy in this case. Synchronization is done by the initial request processor in order to force the execution to block for the client.
Now this can raise a number of issues that are not addressed in this simplistic class set. Some of these problems may be:
race condition between new ExternalService().send() and new ResultReceiver().register(correlationId). This is something that can be solved in ResultReceiver if it understands that some responses can arrive very quickly (a two-way wait, so to speak)
Never-coming results: results can take too long or simply run into errors. These future APIs typically offer timeouts to force cancellation of the request. For example:
new ResultReceiver().register(correlationId)
        .get(10_000, TimeUnit.MILLISECONDS);
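One way to address both issues at once is to share a single receiver instance and register the future before firing the external call, so even an instantaneous callback finds its subscriber. A minimal sketch, with illustrative class and method names that are not from the original code:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

// A shared receiver whose registry is initialized eagerly; registration
// happens before the external call, closing the race described above.
class SharedResultReceiver {
    private final Map<String, CompletableFuture<Object>> subscribers =
            new ConcurrentHashMap<>();

    public CompletableFuture<Object> register(String correlationId) {
        CompletableFuture<Object> future = new CompletableFuture<>();
        subscribers.put(correlationId, future);
        return future;
    }

    public void externalResponse(String correlationId, Object result) {
        CompletableFuture<Object> future = subscribers.remove(correlationId);
        if (future != null) {
            future.complete(result); // wakes the thread blocked in get()
        }
    }
}

class SafeRequestProcessor {
    private final SharedResultReceiver receiver;

    SafeRequestProcessor(SharedResultReceiver receiver) {
        this.receiver = receiver;
    }

    public Object submitRequest() throws Exception {
        // Generate the correlation id locally and register first,
        // only then hand it to the external service.
        String correlationId = UUID.randomUUID().toString();
        CompletableFuture<Object> future = receiver.register(correlationId);
        // externalService.send(correlationId); // fire the real call here
        return future.get(10, TimeUnit.SECONDS); // bounded wait, not join()
    }
}
```

The bounded get() also covers the never-coming-result case, and remove() instead of get() in externalResponse keeps the registry from growing without bound.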
Well, what exactly is the problem with doing that? You just create an API (middleware) that doesn't return a response until the third party returns the processed result. The front end sends a request to X(), X() processes that request by sending a request to the third party and then keeps polling Y() to see when the result is ready; then X() takes the result from Y() and sends it back to the front end. Like a facade.
There are some problems you should consider when using third-party services you don't control. First of all, you need to implement some kind of circuit breaker or timeout, because the third-party service might hang and never process the results (or process them for so long that it makes no sense to wait). You should also consider some meaningful way to keep the site running even if the third-party service is unavailable, has updated its API, or something else prevents you from using it.
And one last thought. Why would you want to make something synchronous that was deliberately implemented asynchronously? It was probably built that way because it can take time. Blocking the front end for long periods of time while waiting for results makes the user experience unpleasant and the UI unresponsive. It is usually better to stick with asynchronous requests, show users that their request is being processed, and let them do something else in the meantime.
I have a REST API created in Java with the Spark framework, but right now a lot of work is being done on the request thread that is significantly slowing down requests.
I want to solve this by creating some kind of background worker/queue that will do all the needed work off of the request thread. The response from the server contains data that the client will need (data that will be displayed). In these examples the client is a web browser.
Here's what the current cycle looks like
API request from client to server
Server does blocking work; Response from server after several seconds/minutes
Client receives response. It has all the data it needs in the response
Here's what I would like
API request from client to server
Server does work off-thread
Client receives response from server almost instantly, but it doesn't have the data it needs. This response will contain some ID (Integer or UUID), which can be used to check the progress of the work being done
Client regularly checks the status of the work being done, the response will contain a status (like a percentage or time estimate). Once the work is done, the response will also contain the data we need
What I dislike about this approach is that it will significantly complicate my API. If I want to get any data, I will have to make two requests. One to initiate the blocking work, and another to check the status (and get the result of the blocking work). Not only will the API become more complicated, but the backend will too.
Is this efficient, or is there a better way to implement what I want to accomplish?
Neither way is more efficient than the other, since the same amount of work takes the same time in either case. In the first case the work is done on the request thread: the client gets no progress information, and the request takes as long as the task takes to run. The client waits on the reply.
In the second case you need to add complexity, but you get progress status and possibly other advantages depending on the task. The client polls for the reply.
You can use async processing to perform work on non-request threads, but that probably won't make much difference if most of your requests are long-running. So it's up to you to decide what you want; the client has to wait the same amount of time either way.
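To make the id-and-poll variant concrete, here is a minimal server-side sketch (names and structure are illustrative, not Spark API): work is submitted to a pool, an id is returned immediately, and the status endpoint reads the job's state from a shared map.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative job tracker backing the two endpoints: one to start work,
// one to poll its status and eventually fetch the result.
class JobTracker {
    enum State { RUNNING, DONE }

    static final class Job {
        volatile State state = State.RUNNING;
        volatile Object result;
    }

    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Map<String, Job> jobs = new ConcurrentHashMap<>();

    /** Start the work off-thread; return the id the client will poll with. */
    String submit(Callable<Object> work) {
        String id = UUID.randomUUID().toString();
        Job job = new Job();
        jobs.put(id, job);
        pool.submit(() -> {
            try {
                job.result = work.call();
            } catch (Exception e) {
                job.result = e; // surface the failure to the poller
            }
            job.state = State.DONE;
        });
        return id;
    }

    /** What the status endpoint would serialize and return. */
    Job poll(String id) {
        return jobs.get(id);
    }

    void shutdown() {
        pool.shutdown();
    }
}
```

A real implementation would also evict finished jobs after some retention period so the map doesn't grow forever.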
Let's say I create an async REST API in Spring MVC with Java 8's CompletableFuture.
How is this called by the client? If it's non-blocking, does the endpoint return something before processing? I.e.
@RequestMapping("/") // GET method
public CompletableFuture<String> meth() throws InterruptedException {
    Thread.sleep(10000);
    String result = "lol";
    return CompletableFuture.completedFuture(result);
}
How does this exactly work? (The code above is just something I made up on the spot.)
When I send a GET request from, say, Google Chrome to localhost:3000/, what happens? I'm a newbie to async APIs and would appreciate some help.
No, the client doesn't know it's asynchronous. It'll have to wait for the result normally. It's just the server side that benefits from freeing up a worker thread to handle other requests.
In this version it's pointless, because CompletableFuture.completedFuture() creates a completed Future immediately.
However in a more complex piece of code, you might return a Future that is not yet complete. Spring will not send the response body until some other thread calls complete() on this Future.
Why not just use a new thread? Well, you could - but in some situations it might be more efficient not to. For example you might put a task into an Executor to be handled by a small pool of threads.
Or you might fire off a JMS message asking for the request to be handled by a completely separate machine. A different part of your program will respond to incoming JMS messages, find the corresponding Future and complete it. There is no need for a thread dedicated to this HTTP request to be alive while the work is being done on another system.
Very simple example:
@RequestMapping("/employeenames/{id}")
public CompletableFuture<String> getName(@PathVariable String id) {
    CompletableFuture<String> future = new CompletableFuture<>();
    database.asyncSelect(
            name -> future.complete(name),
            "select name from employees where id = ?",
            id
    );
    return future;
}
I've invented a plausible-ish API for an asynchronous database client here: asyncSelect(Consumer<String> callback, String preparedStatement, String... parameters). The point is that it fires off the query, then does not block the thread waiting for the DB to respond. Instead it leaves a callback (name -> future.complete(name)) for the DB client to invoke when it can.
This is not about improving API response times -- we do not send an HTTP response until we have a payload to provide. This is about using the resources on the server more efficiently, so that while we're waiting for the database to respond it can do other things.
There is a related but different concept of async REST, in which the server responds with 202 Accepted and a header like Location: /queue/12345, allowing the client to poll for the result. But that isn't what the code you asked about does.
CompletableFuture was introduced in Java 8 to make complex asynchronous programming easier to handle. It lets the programmer combine and cascade async calls, and it offers the static utility methods runAsync and supplyAsync to abstract away the manual creation of threads.
These methods dispatch tasks to Java's common fork/join pool by default, or to a custom thread pool if one is provided as an optional argument.
If a CompletableFuture is returned by an endpoint method and complete() is never called on it, the request will hang until it times out.
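As a small illustration of that dispatch behavior (fetchGreeting is a made-up stand-in for real work): one task runs on the common ForkJoinPool, one on an explicitly supplied executor, and thenCombine cascades them without blocking until the final get().

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SupplyAsyncDemo {

    static String fetchGreeting() {
        return "hello"; // stand-in for a slow computation
    }

    public static void main(String[] args) throws Exception {
        ExecutorService custom = Executors.newFixedThreadPool(2);

        // Runs on the common ForkJoinPool (the default).
        CompletableFuture<String> onCommon =
                CompletableFuture.supplyAsync(SupplyAsyncDemo::fetchGreeting);

        // Runs on the caller-supplied pool instead.
        CompletableFuture<String> onCustom =
                CompletableFuture.supplyAsync(SupplyAsyncDemo::fetchGreeting, custom);

        // Cascade/combine; nothing blocks until the final get().
        String combined = onCommon
                .thenCombine(onCustom, (a, b) -> a + " " + b)
                .get();

        System.out.println(combined); // prints "hello hello"
        custom.shutdown();
    }
}
```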
My server requires sending emails pretty frequently. The emails are heavy; they have attachments as well as inline images in them.
My present code blocks until an email is sent (losing 5 to 6 seconds per email).
What is the best approach for handling the emails without blocking the main code flow?
If you are suggesting threads, please elaborate on how they can be handled efficiently.
There are multiple ways to achieve this functionality.
Synchronous Call
This is the one you are already using. The code (synchronously) invokes the Java Mail API and waits for the API to complete execution. The process may take time depending on the complexity of building the email message: fetching records from a database, reading images/documents (attachments), communicating with the mail server, etc.
Trade-offs
For individual requests (web/desktop), response latency will increase based on the time it takes to construct and send the email.
An exception while sending the email may require redoing the entire process (if retried).
Transactional data (e.g. in the DB) may be rolled back due to an exception while sending the email. This may not be the desired behavior.
Overall application latency will increase if similar functionality is invoked by multiple users concurrently.
Email retry functionality may not be possible if the email-sending code is tightly coupled with other functional code.
Multithreading Approach
Create a separate thread to asynchronously send an email. The calling code does not have to wait for the email send to complete and can execute the rest of the code. Ideally, make use of a thread pool instead of blindly creating new threads.
Trade-offs
Though request latency will go down, this is still not fully reliable. Any exception that occurs while constructing/sending the email may result in no email being sent to the user.
Email sending functionality can't be distributed across multiple machines.
Retry functionality is possible, since email code is separated into separate class. This class can be independently called, without a need of redoing other things.
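A minimal sketch of this thread-pool variant; sendBlocking is a hypothetical stand-in for the real Java Mail call, and here it just records the recipient so the behavior is observable:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Thread-pool mailer: callers get control back immediately while a
// pool thread does the slow send in the background.
class AsyncMailer {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    final List<String> sent = new CopyOnWriteArrayList<>(); // for illustration only

    /** Queue the email and return immediately; the returned Future can be used for retry logic. */
    public Future<?> sendAsync(String to, String body) {
        return pool.submit(() -> sendBlocking(to, body));
    }

    private void sendBlocking(String to, String body) {
        // ... the multi-second Java Mail call would go here ...
        sent.add(to);
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

Keeping sendBlocking in its own class is what makes the retry functionality mentioned above possible: a failed send can be resubmitted without redoing anything else.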
Asynchronous Processing
Create a class which accepts an email request and stores it in either a database or messaging infrastructure (e.g. JMS). Message listeners will process each task as it arrives and update the status of each task.
Trade-offs
Email requests can be processed in distributed mode.
Retry email possible, without having any side effects.
Complex implementation, as multiple components are involved in processing and persisting email requests.
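An in-JVM approximation of this third approach, using a BlockingQueue in place of JMS or a database table (a real deployment would persist the queue so requests survive restarts; all names here are illustrative):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingQueue;

// Producer enqueues email requests; one or more listener threads drain
// them and record a status per request, as described above.
class EmailQueue {
    record EmailRequest(String id, String to) {}

    private final BlockingQueue<EmailRequest> queue = new LinkedBlockingQueue<>();
    private final ConcurrentMap<String, String> status = new ConcurrentHashMap<>();

    /** Called by request-handling code; returns immediately. */
    void submit(EmailRequest req) {
        status.put(req.id(), "QUEUED");
        queue.add(req);
    }

    /** Listener loop body; run on dedicated threads (or other machines, with a real broker). */
    void drainOne() throws InterruptedException {
        EmailRequest req = queue.take();
        status.put(req.id(), "SENDING");
        // ... actually send the email here; on failure, re-enqueue to retry ...
        status.put(req.id(), "SENT");
    }

    String statusOf(String id) {
        return status.get(id);
    }
}
```

Because the listener is decoupled from the producer, retries and distribution across machines come down to how the queue itself is hosted.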
You can efficiently do this if you spawn a thread for every email you have to send.
One way to do it is as follows:
You would need a class that is just a pure extension of Thread:
public class MailSenderThread extends Thread {

    @Override
    public void run() {
        try {
            // Code to send email
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
And when you want to send email, you can do:
new MailSenderThread().start();
That is the shortest/easiest way I could think of.
You can refer to an example in my public repository as well. It's off-topic, but it illustrates the concept.
My function sends an HTTP request to a server and gets a response. It looks like:
public Acknowledgement function() {
    Acknowledgement ack = new Acknowledgement("received");
    // send http request and get response
    return ack;
}
The purpose of the Acknowledgement is to inform the caller that function() has been successfully called and processing has been started.
I want to return the acknowledgement before sending the HTTP request. One way is to use a separate thread (implementing Runnable) for sending the request and getting the response, but threads have been around for a long time. What are the more modern alternatives to threads for achieving this?
you could try using a callback listener, like:
interface LifecycleListener {
    void onStarting();
}

public Acknowledgement function(LifecycleListener listener) {
    listener.onStarting();
    ...
    ...
}
Many things have existed in Java for almost 20 years. That doesn't mean we should stop using String or Integer, does it?
But you are correct insofar as there are (slightly) newer abstractions on top of bare-metal threads.
So, concepts to study would be ExecutorService and things like Future.
Using threads directly, especially in conjunction with low-level primitives such as wait/notify, is probably not the first choice in 2017; you would rather look into the aforementioned ExecutorService, or go one step further and learn about frameworks like RxJava.
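A brief sketch of what that could look like with CompletableFuture on top of an ExecutorService; sendHttpRequest is a hypothetical placeholder for the real HTTP round trip:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Return the acknowledgement immediately; a pool thread carries out
// the HTTP call and a callback handles the eventual response.
class AckFirstCaller {
    private final ExecutorService pool = Executors.newCachedThreadPool();

    public String function() {
        // Kick off the slow work; we do not wait for it to finish.
        CompletableFuture
                .supplyAsync(this::sendHttpRequest, pool)
                .thenAccept(response -> { /* store or log the response */ });
        return "received"; // the acknowledgement goes back right away
    }

    private String sendHttpRequest() {
        return "response"; // placeholder for the real HTTP call
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

Compared with raw threads, the pool reuses workers across calls and the CompletableFuture gives you composition (thenAccept, exceptionally, timeouts) for free.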
Threads are not old. If you want to do it better, you can use a thread pool to manage them.