Long Polling in Spring - java

We have a somewhat unique case where we need to interface with an outside API that requires us to long poll their endpoint for what they call real time events.
The thing is we may have as many as 80,000 people/devices hitting this endpoint at any given time, listening for events, 1 connection per device/person.
When a client makes a request to our Spring service to long poll for events, our service in turn makes an async call to the outside API to long poll for events. The outside API defines that the long-poll timeout may be set to a minimum of 180 seconds.
So here we have a situation where a thread pool with a queue will not work: if we have a thread pool with something like (5 min, 10 max, 10 queue), the 10 threads being worked on may hog the spotlight, and the 10 in the queue will not get a chance until one of the current 10 is done.
We need serve-it-or-fail-it semantics (we will put load balancers etc. behind it), but we don't want to leave a client hanging without actual polling happening.
We have been looking into using DeferredResult for this, and returning that from the controller.
Something to the tune of
@RequestMapping(value = "test/deferredResult", method = RequestMethod.GET)
DeferredResult<ResponseEntity> testDeferredResult() {
    final DeferredResult<ResponseEntity> deferredResult = new DeferredResult<>();
    CompletableFuture.supplyAsync(() -> testService.test())
        .whenCompleteAsync((result, throwable) -> {
            // propagate failures instead of leaving the client hanging
            if (throwable != null) deferredResult.setErrorResult(throwable);
            else deferredResult.setResult(result);
        });
    return deferredResult;
}
I am questioning whether I am on the right path, and also whether I should provide an executor (and if so, what kind and with what configuration) to the CompletableFuture.supplyAsync() method to best accomplish our task.
I have read various articles, posts, and such and am wanting to see if anyone has any knowledge that might help our specific situation.

The problem you are describing does not sound like one that can be solved nicely if you are using blocking IO. So you are on the right path, because DeferredResult allows you to produce the result using any thread, without blocking the servlet-container thread.
With regard to calling a long-polling API upstream, you need an NIO solution as well. If you use a Netty-based client, you can manage several thousand sockets using a single thread. When the NIO selector in Netty detects data, you get a channel callback, which is eventually delegated to a thread in the Netty worker pool, where you can call deferredResult.setResult. If you don't do blocking IO, the worker pool is usually sized to the number of CPU cores; otherwise you may need more threads.
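For illustration, a minimal sketch of that pipeline, assuming the Netty-based AsyncHttpClient library (org.asynchttpclient) and a hypothetical upstream events URL; the timeouts are padded past the 180-second long poll:

import org.asynchttpclient.AsyncHttpClient;
import org.asynchttpclient.Dsl;
import org.springframework.http.ResponseEntity;
import org.springframework.web.context.request.async.DeferredResult;

public class EventsController {
    private final AsyncHttpClient client = Dsl.asyncHttpClient();

    public DeferredResult<ResponseEntity<String>> pollEvents() {
        // DeferredResult timeout padded past the upstream's 180s long poll
        DeferredResult<ResponseEntity<String>> deferred = new DeferredResult<>(200_000L);
        client.prepareGet("https://upstream.example.com/events") // hypothetical endpoint
              .setRequestTimeout(190_000)
              .execute()
              .toCompletableFuture()
              .whenComplete((response, error) -> {
                  if (error != null) {
                      deferred.setErrorResult(error);
                  } else {
                      // runs on a Netty worker thread; no servlet thread was parked
                      deferred.setResult(ResponseEntity.status(response.getStatusCode())
                                                       .body(response.getResponseBody()));
                  }
              });
        return deferred;
    }
}

With this setup the servlet container's threads are released immediately, and each listening client costs a socket plus a little heap rather than a dedicated thread.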
There are still a number of challenges.
You probably need more than one server (or network interface), since a single source address only offers about 65K ports for the outbound long-poll connections to the upstream API.
Sockets in Java do not have write timeouts, so if a client refuses to read data from the socket and you send more data than fits in your socket buffer, you would block the Netty worker thread(s), and then everything would stop (a reverse slow-loris attack). This is a classic problem in large async setups, and one of the reasons for using frameworks like Hystrix (by Netflix).

Related

Async HTTP request vs HTTP requests on new thread

I have 2 microservices (A and B).
A has an endpoint which accepts POST requests. When users make a POST request, this happens:
Service A takes the object from the POST request body and stores it in a database.
Service A converts the object to a different object. And the new object gets sent to service B via Jersey HTTP client.
Step 2 takes place on a Java thread pool I have created (Executors.newCachedThreadPool). By doing step 2 on a new thread, the response time of service A's endpoint is not affected.
However, if service B is taking long to respond, service A can potentially create too many threads when it is receiving many POST requests. To help fix this, I can use a fixed thread pool (Executors.newFixedThreadPool).
In addition to the fixed thread pool, should I also use an asynchronous non-blocking HTTP client? Such as the one here: https://hc.apache.org/httpcomponents-asyncclient-dev/. The Jersey HTTP client that I use is blocking.
It seems like it is right to use the async HTTP client. But if I switch to a fixed thread pool, I think the async HTTP client won't provide a significant benefit - am I wrong in thinking this?
Even if you use a fixed thread pool, all the threads in it will be blocked on step 2, meaning they won't do any meaningful work - just wait for your API to return a response, which is not pragmatic resource management. In this case, you will only be able to handle a limited number of incoming requests, since the threads in the pool will always be busy instead of handling new requests.
In the case of a non-blocking client, you are blocking just one single thread (let's call it the dispatcher thread), which is responsible for sending and waiting for all the requests/responses. It runs in a "while loop" (you could call it an event loop) and checks whether complete responses have been received, so they are ready for the worker threads to pick up.
In the latter scenario, you get a larger number of available threads ready to do meaningful work, so your throughput will be increased.
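As a minimal sketch of the non-blocking variant, assuming Apache HttpAsyncClient (the library the question links to) and a hypothetical service B URL:

import java.nio.charset.StandardCharsets;
import org.apache.http.HttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.concurrent.FutureCallback;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.nio.client.CloseableHttpAsyncClient;
import org.apache.http.impl.nio.client.HttpAsyncClients;

public class ServiceBForwarder {
    private final CloseableHttpAsyncClient client = HttpAsyncClients.createDefault();

    public ServiceBForwarder() {
        client.start(); // starts the internal IO reactor (the "event loop" above)
    }

    public void forward(String json) {
        HttpPost post = new HttpPost("http://service-b/objects"); // hypothetical URL
        post.setEntity(new StringEntity(json, StandardCharsets.UTF_8));
        // no thread parks here; the callback fires later on an IO-reactor thread
        client.execute(post, new FutureCallback<HttpResponse>() {
            public void completed(HttpResponse response) { /* record success */ }
            public void failed(Exception ex) { /* retry or record the failure */ }
            public void cancelled() { /* nothing to do */ }
        });
    }
}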
The difference is that with a sync client, the step A thread will open a connection to the step 2 endpoint and wait for a response. Making the step 2 implementation async and just returning 200 directly (or whatever) will help decrease waiting time, but a thread will still be making the connection and waiting for the response.
With a non-blocking client instead, the call itself is carried out by another thread, so everything is untied from the service A request thread. The system can also make use of that thread for other work until it gets a response from service B and needs to resume.
The idea is that your origin threads will not sit idle waiting for responses, but will instead be reused for other work in the meantime.
The reason to use a non-blocking HTTP client is to prevent too much CPU from being used on thread-switching. If you already solve that problem by limiting the number of background threads, then non-blocking IO won't provide any noticeable benefits.
There is another problem with your setup: it is very vulnerable to DDoS attacks (intentional or accidental). If someone calls your service very often, it will internally create a huge workload that keeps the service busy for a long time. You will definitely need to limit the background task queue (a bounded queue is a supported feature of ThreadPoolExecutor) and return 503 (or equivalent) if there are too many pending tasks.
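A sketch of that guard using a bounded ThreadPoolExecutor; sendToServiceB, convert, and requestBody are hypothetical stand-ins for step 2:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// bounded pool + bounded queue: rejections surface as RejectedExecutionException,
// which the endpoint can map to HTTP 503 instead of queueing unbounded work
ThreadPoolExecutor forwarder = new ThreadPoolExecutor(
        10, 10,                                // fixed number of worker threads (tune for your load)
        0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<Runnable>(100), // at most 100 pending forwards
        new ThreadPoolExecutor.AbortPolicy()); // reject rather than block the caller

// inside service A's POST handler (names below are hypothetical):
try {
    forwarder.execute(() -> sendToServiceB(convert(requestBody)));
} catch (RejectedExecutionException overloaded) {
    // too many pending tasks: fail fast with 503
}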

Best practices with Akka in Scala and third-party Java libraries

I need to use the memcached Java API in my Scala/Akka code. This API gives you both synchronous and asynchronous methods. The asynchronous ones return java.util.concurrent.Future. There was a question about dealing with Java Futures in Scala here: How do I wrap a java.util.concurrent.Future in an Akka Future?. However, in my case I have two options:
Using the synchronous API and wrapping the blocking code in a Future, marked as blocking:
Future {
  blocking {
    cache.get(key) // synchronous blocking call
  }
}
Using the asynchronous Java API and polling the Java Future every n ms to check whether it has completed (as described in one of the answers to the linked question above).
Which one is better? I am leaning towards the first option, because polling can dramatically impact response times. Shouldn't the blocking { } block prevent the whole pool from being blocked?
I always go with the first option, but I am doing it in a slightly different way. I don't use the blocking feature. (Actually, I have not thought about it yet.) Instead I provide a custom execution context to the Future that wraps the synchronous blocking call. So it looks basically like this:
val ecForBlockingMemcachedStuff =
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(100)) // whatever number you think is appropriate
// I create a separate ec for each blocking client/resource/api I use

Future {
  cache.get(key) // synchronous blocking call
}(ecForBlockingMemcachedStuff) // or mark the execution context implicit. I like to mention it explicitly.
So all the blocking calls use a dedicated execution context (= thread pool), keeping them separated from your main execution context, which is responsible for the non-blocking stuff.
This approach is also explained in an online training video for Play/Akka provided by Typesafe. There is a video in lesson 4 about how to handle blocking calls. It is explained by Nilanjan Raychaudhuri (hope I spelled it correctly), who is a well-known author of Scala books.
Update: I had a discussion with Nilanjan on Twitter. He explained the difference between the blocking approach and a custom ExecutionContext. The blocking feature just creates a special ExecutionContext that takes a naive approach to the question of how many threads you will need: it spawns a new thread whenever all the other threads in the pool are busy. So it is actually an uncontrolled ExecutionContext; it could create lots of threads and lead to problems like an out-of-memory error. The solution with the custom execution context is therefore better, because it makes this problem obvious. Nilanjan also added that you need to consider circuit breaking for the case where this pool gets overloaded with requests.
TL;DR: Yeah, blocking calls suck. Use a custom/dedicated ExecutionContext for blocking calls. Also consider circuit breaking.
The Akka documentation provides a few suggestions on how to deal with blocking calls:
In some cases it is unavoidable to do blocking operations, i.e. to put a thread to sleep for an indeterminate time, waiting for an external event to occur. Examples are legacy RDBMS drivers or messaging APIs, and the underlying reason is typically that (network) I/O occurs under the covers. When facing this, you may be tempted to just wrap the blocking call inside a Future and work with that instead, but this strategy is too simple: you are quite likely to find bottlenecks or run out of memory or threads when the application runs under increased load.

The non-exhaustive list of adequate solutions to the "blocking problem" includes the following suggestions:

- Do the blocking call within an actor (or a set of actors managed by a router), making sure to configure a thread pool which is either dedicated for this purpose or sufficiently sized.
- Do the blocking call within a Future, ensuring an upper bound on the number of such calls at any point in time (submitting an unbounded number of tasks of this nature will exhaust your memory or thread limits).
- Do the blocking call within a Future, providing a thread pool with an upper limit on the number of threads which is appropriate for the hardware on which the application runs.
- Dedicate a single thread to manage a set of blocking resources (e.g. a NIO selector driving multiple channels) and dispatch events as they occur as actor messages.

The first possibility is especially well-suited for resources which are single-threaded in nature, like database handles which traditionally can only execute one outstanding query at a time and use internal synchronization to ensure this. A common pattern is to create a router for N actors, each of which wraps a single DB connection and handles queries as sent to the router. The number N must then be tuned for maximum throughput, which will vary depending on which DBMS is deployed on what hardware.

Logic behind camel ServicePool when used with Netty

I have a Camel instance with a Netty endpoint that consolidates many incoming requests to send to a single receiver. More specifically, this is a web service whereby each incoming SOAP request results in a Producer.sendBody() into the Camel subsystem. The processing of each request involves different routes, but they all end up in the single Netty endpoint to send on to the next-level server. All is fine as long as I only have a handful of incoming requests at any one time. If I have more than 100 simultaneous requests, though, I get this exception:
java.lang.IllegalStateException: Queue full
at java.util.AbstractQueue.add(AbstractQueue.java:71) ~[na:1.6.0_24]
at java.util.concurrent.ArrayBlockingQueue.add(ArrayBlockingQueue.java:209) [na:1.6.0_24]
at org.apache.camel.impl.DefaultServicePool.release(DefaultServicePool.java:95) [camel-core-2.9.2.jar:2.9.2]
at org.apache.camel.impl.ProducerCache$1.done(ProducerCache.java:297) ~[camel-core-2.9.2.jar:2.9.2]
at org.apache.camel.processor.SendProcessor$2$1.done(SendProcessor.java:120) ~[camel-core-2.9.2.jar:2.9.2]
at org.apache.camel.component.netty.handlers.ClientChannelHandler.messageReceived(ClientChannelHandler.java:162) ~[camel-netty-2.9.2.jar:2.9.2]
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296) ~[netty-3.3.1.Final.jar:na]
This is coming from the DefaultServicePool that is used by the Netty component. The DefaultServicePool uses an ArrayBlockingQueue as the backing store for the pool and sets it to a default capacity of 100 Producers. It uses a service pool for performance reasons, to avoid repeatedly creating and destroying often-reused producers. Fair enough. Unfortunately, I'm not following the logic of how it is implemented.
This all starts in ProducerCache::doInAsyncProducer, which begins by calling doGetProducer. That method attempts to acquire a Producer from the pool and, if that fails, creates a new Producer using endpoint.getProducer(). It then makes sure the service pool exists using pool.addAndAcquire. That done, it returns to the calling function. doInAsyncProducer does its thing until it's finished, at which point it calls the done processor. At that point we're completely done processing the exchange, so it releases the Producer back to the pool using pool.release.
Here is where the rubber hits the road. The DefaultServicePool::release method inserts the Producer into the ArrayBlockingQueue backend using add. This is where my java.lang.IllegalStateException is coming from.
Why? Well, let's look through a use case. I have 101 simultaneous incoming requests. Each of them hits the Netty endpoint at roughly the same time. The very first creates the service pool with the capacity of 100 but it's empty to start. In fact, each of the 101 requests will create a new Producer from the endpoint.getProducer; each will verify that they don't exceed the capacity of the service pool (which is empty); and each will continue on to send to the server. After each finishes, it tries to do a pool.release. The first 100 will succeed, since the pool capacity hasn't been reached. The 101st request will attempt to add to the queue and will fail, since the queue is full!
Is that right? If I'm reading that correctly, then this code will always fail whenever there are more than 100 simultaneous requests. My service needs to support upwards of 10,000 simultaneous requests, so that's just not going to fly.
It seems like a more stable solution might be to:
Pre-allocate all 100 Producers on initialization
Block during acquire until a Producer is available
Absolutely do not create your own non-pool Producers if using a ServicePool
In the meantime, I'm thinking of throttling incoming requests.
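(For reference, a throttle on the incoming side could look roughly like this in the Camel Java DSL; the endpoint URIs below are hypothetical:)

import org.apache.camel.builder.RouteBuilder;

public class ThrottledRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("direct:incoming")              // hypothetical entry point for the SOAP requests
            .throttle(100)                   // at most 100 exchanges...
            .timePeriodMillis(1000)          // ...per one-second window
            .to("netty:tcp://backend:5150"); // hypothetical next-level server
    }
}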
What I'm hoping for with this question is to learn if I'm reading that logic correctly and to see if it can get changed. Or, am I using it wrong? Is there a better way to handle this type of thing?
Yes, the logic should IMHO be improved. I have logged a ticket to improve this.
https://issues.apache.org/jira/browse/CAMEL-5703

Solution for Asynchronous Servlets in versions prior to 3.0?

I have a long-running task (a report) which would exceed any TCP connection timeout before it starts returning data. Asynchronous servlets (introduced in Servlet 3.0) are exactly what I need; however, I am limited to Servlet 2.4.
Are there any "roll-your-own" solutions? What I'm doing feels hacked together - I kick off the task asynchronously in a thread and return to the client immediately. The client then polls every few seconds (with ajax) and checks for a "ready" status for this task ID (a static list maintains the statuses and some handles to the objects processed by the thread). Once ready, I inject the output stream into the work object so the thread can write the results back to the client.
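A minimal sketch of that roll-your-own pattern, written without lambdas to match the Servlet 2.4 era; ReportRequest and runReport are hypothetical stand-ins for the real report type and logic:

import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ReportTaskRegistry {
    private static final ExecutorService POOL = Executors.newFixedThreadPool(4);
    private static final Map<String, Future<byte[]>> TASKS =
            new ConcurrentHashMap<String, Future<byte[]>>();

    // hypothetical stand-ins for the real report type and logic
    public static class ReportRequest {}
    static byte[] runReport(ReportRequest r) { return new byte[0]; }

    public static String submit(final ReportRequest request) {
        String taskId = UUID.randomUUID().toString();
        TASKS.put(taskId, POOL.submit(new Callable<byte[]>() {
            public byte[] call() throws Exception {
                return runReport(request); // the long-running work
            }
        }));
        return taskId; // handed back to the client immediately
    }

    // the ajax poll hits a servlet that calls this until it returns true
    public static boolean isReady(String taskId) {
        Future<byte[]> f = TASKS.get(taskId);
        return f != null && f.isDone();
    }

    public static byte[] result(String taskId) throws Exception {
        return TASKS.remove(taskId).get(); // safe once isReady() returned true
    }
}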
You can implement the Reverse ajax technique which means that instead of polling many times to get the response you get the response once the task has finished.
There is a quick solution to implement reverse-ajax technique by using DWR here. But you should maintain the use of the static List. If your background task business logic is complicated you can use an ESB or something more sophisticated.

How to write a UDP server that will service n concurrent requests from different clients?

I am connecting 10 devices to a LAN; each of them has a UDP server that goes like:
while (true) {
    serverSocket.receive(receivePacket);
    dostuff(receivePacket);
}
serverSocket.close();
Now let's assume 9 of the devices try to initiate a connection to the 10th device simultaneously. How can I accept all 9, instead of just the first one, which would block the socket until the server completes its computation? Should I start a thread to take care of dostuff()? Will that let me handle all of the simultaneous requests?
A basic design would have one thread responsible for handling incoming requests (with your desired limit) and then handing them off to worker/request-handler threads. When each of these worker threads finishes, it would update a shared/global counter to let the main thread know that it can accept a new request. This requires a degree of synchronization, but it can be pretty fun.
Here's the idea:
serverThread:
    while true:
        serverLock.acquire()
        while numberOfRequests == MAX_REQUESTS:
            serverMonitor.wait(serverLock)
        packet = socket.receive()
        numberOfRequests++
        requestThread(packet).run()
        serverLock.release()

requestThread:
    // handle packet
    serverLock.acquire()
    numberOfRequests--        // always decrement, not only when at the maximum
    serverMonitor.pulse()     // wake the server thread if it was waiting for a free slot
    serverLock.release()
You'll want to make sure the synchronization is all correct; this is just to give you an idea of what to start with. Once you get the hang of it, you'll be able to make optimizations and enhancements. One particular enhancement, which also lends itself to limiting the number of requests, is called a ThreadPool.
Regardless, the basic structure is very much the same for most servers: a main thread responsible for handing off requests to worker threads. It's a neat and simple abstraction.
You can use threads to solve that problem. Since Java already has an API that handles threads, you can just create Runnable tasks and hand them to an executor; take a look at the Executor interface. Here is another useful link that could help: blocking queue.
Use a relatively large thread pool, since UDP doesn't require a response. The main method runs as a listener, and the thread pool does the rest of the heavy lifting, as sketched below.
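A minimal sketch of that listener-plus-pool design; the port is hypothetical, and dostuff() stands in for the handling from the question:

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class UdpServer {
    public static void main(String[] args) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(9); // one per expected concurrent sender
        try (DatagramSocket serverSocket = new DatagramSocket(4445)) { // hypothetical port
            while (true) {
                byte[] buf = new byte[1024];
                final DatagramPacket packet = new DatagramPacket(buf, buf.length);
                serverSocket.receive(packet);           // the listener thread only receives...
                workers.execute(() -> dostuff(packet)); // ...workers do the heavy lifting
            }
        }
    }

    static void dostuff(DatagramPacket packet) {
        // process the datagram (from the question)
    }
}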
