stopping/canceling disconnected GET request threads as soon as possible - java

I am using jetty, version 7.0.1 if that matters.
Sometimes I have some quite long-running tasks on a server which I would like to cancel/stop if the client disconnects (in the case of GET requests, not e.g. POST file uploads). It seems this is not the case, and that tasks continue to run to completion.
Perhaps I can use the ServletRequestListener.requestDestroyed listener to get notified about such tasks, but what is the recommended approach for stopping the request thread? What about releasing resources like database connections, file handles or running tasks (executor service)?
What is the recommended approach in stopping such tasks as soon as possible?

First, I would recommend updating to the latest version of jetty; we have fixed a ton since the 7.0 series.
Second, your best bet is to solve this problem by design: either use jetty-continuations to get async servlet support with the servlet 2.5 spec (which is jetty7) or update to servlet 3.0 (jetty 8), and do not rely on the get methods of the servlet api to block waiting for a response to send. Instead, accept the request, then spawn a thread or use an executor future to process the actions, calling back to the request when you have a payload or success message to return. The reason is that while you're in the servlet api blocking on the request, you are consuming resources and threads from your servlet thread pool. You'll be able to scale up much more cleanly by using continuations or the async servlets of 3.0.
Also you'll be able to design a proper mechanism for managing these threads and things like timeouts and the proper notification mechanism for exceptional conditions, and it will be testable outside of a servlet container that way.
imo at least :)
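A minimal sketch of the "executor future" idea above, using only java.util.concurrent and no servlet API. The class and method names are made up for illustration, and how the disconnect, timeout or error is detected is left out; the point is only that a task handed to an executor can be interrupted through its Future and can release its resources in a finally block.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper: runs a long task off the request thread and cancels it.
class CancellableTaskDemo {

    // Returns true if the worker saw the interrupt and ran its cleanup.
    static boolean runAndCancel() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch cleanedUp = new CountDownLatch(1);
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);

        Future<?> task = pool.submit(() -> {
            try {
                started.countDown();
                // Stand-in for long work. A CPU-bound task would instead poll
                // Thread.currentThread().isInterrupted() between steps.
                while (true) {
                    Thread.sleep(10);
                }
            } catch (InterruptedException e) {
                sawInterrupt.set(true);
                Thread.currentThread().interrupt(); // preserve interrupt status
            } finally {
                // Release database connections, file handles, etc. here.
                cleanedUp.countDown();
            }
        });

        try {
            started.await();
            // Assumption: this is called from wherever the application learns
            // that the request is no longer wanted (timeout, error callback).
            task.cancel(true); // true = interrupt the worker thread
            boolean done = cleanedUp.await(2, TimeUnit.SECONDS);
            return done && sawInterrupt.get() && task.isCancelled();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("cancelled cleanly: " + runAndCancel());
    }
}
```

Note that cancel(true) only requests an interrupt. Code that never blocks and never checks the interrupt flag will keep running, so the task itself has to cooperate.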

Related

Can I configure my servlet container's thread management?

I'm currently working on an app which is heavily connected to maps. To display a map, we are generating a bunch of tiles in many threads, store them and get them if a user wants to see a certain part of the map.
The problem is, I'm naming threads that generate tiles a certain way, but then, when I want to get tiles to show a map, my servlet container's taking random threads from the pool, so the thread named for generating a tile ends up getting it from the storage. Of course, I could just rename the thread after generating a tile back, but I wonder if there is an alternative.
I wonder if I somehow can configure my servlet container for it to maybe kill threads after some time being idle or to create a new thread where I want to or to allocate several threads to work with this part of the code?
All I could find in terms of configuring servlet container is setting its min and max thread pool size, which I think won't help me.
The container is 100% in control of its threading.
If you are attempting to manipulate the threading of the container then you are fighting a losing battle.
It is not possible to safely kill or stop threads on a running container, as this is incredibly unsafe, and will lead to many memory issues (leaks) and unclosed resources. The Thread.stop() method has been deprecated since Java 1.2.
Now that we have the negatives out of the way ...
Jetty is a 100% Async Java Web Server.
The classic assumption that 1 request uses 1 thread is wrong. (if you want this kind of behavior, then you should use Jetty 6 or older. Jetty versions older than 9.2 are now all EOL / End of Life)
When you use a Servlet call that is traditionally a blocking call, the Jetty server has to fake that blocking call to satisfy the API contract.
Even if using old school / traditional blocking Servlet APIs you'll still experience many situations where that 1 request has been handled by multiple threads over the lifetime of that 1 request.
If you want to work with the Servlet API and its container then the first thing you should do is start to use both the Servlet Async Processing APIs and Servlet Async I/O APIs combined. Make sure you read about the gotchas on both APIs!
Async Processing will allow you to handle more processing of requests on the server side, not use the container threads that heavily, allow more control over how the threading behaves, will grant you better control over request timeouts, and even get notified of request/response error cases that you will always deal with on a web server.
Async I/O will allow you to only use a thread if there is content from the request/connection to read or if the connection allows a write. That connection will not consume a thread unless I/O is possible. This means more connections/requests per server, and ill behaving clients (slow, dead, problematic, etc) will not impact the behavior of your other clients by consuming threads that are not doing anything productive for you.
If you don't want to work with the Servlet API and do things your own way, then you'll have to manage your own Executor / ThreadGroup / ThreadPool that the server is unaware of. But that still means you'll need to use the Servlet Async Processing APIs to allow the 2 to coexist in harmony (you'll need to use the AsyncContext to inform the container that you are now taking control over the processing of the request, and then later inform it via the AsyncContext that you are done and the request is complete).
The biggest gotcha with this approach is that you cannot safely write to the HttpServletResponse from a thread that the container wasn't in control over.
Meaning the container dispatched on a thread to your application, that thread is the only one that can safely use the HttpServletResponse to write the response. You can have a different thread do the processing, a different thread provide the data to the HttpServletResponse, even a different thread that pumps the dispatch thread with content. But that thread you were dispatched to needs to be used to write.
This is the mixed threading behavior gotcha in the servlet spec. (you are in servlet async mode, on a different thread to process, but not using async mode to read/write.) It's a terribly complex, and ill defined, behavior in the servlet spec that leads to many issues, and I advise you to not chase this path.
This gotcha goes away if you also use the Servlet Async I/O APIs, but at that point the difference in the two above choices is negligible.
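For the "manage your own Executor / ThreadPool" option, here is a rough sketch of an application-owned pool with named threads that are allowed to die when idle. It uses only java.util.concurrent; the class name, the thread name prefix and the sizes are assumptions for illustration, and the AsyncContext hand-off described above is not shown.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical application-owned pool for tile generation. The container's
// request threads are never renamed; only these workers carry the name.
class TilePool {

    static ThreadPoolExecutor create(int threads, long idleSeconds) {
        AtomicInteger counter = new AtomicInteger();
        ThreadFactory named = runnable -> {
            Thread t = new Thread(runnable, "tile-generator-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads,
                idleSeconds, TimeUnit.SECONDS, // must be > 0 for the call below
                new LinkedBlockingQueue<>(),
                named);
        // Let idle workers exit after idleSeconds instead of living forever.
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    // Submits one task and returns the name of the thread that ran it.
    static String nameOfWorkerThread() {
        ThreadPoolExecutor pool = create(4, 30);
        try {
            Future<String> name = pool.submit(() -> Thread.currentThread().getName());
            return name.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With a pool like this, tile generation always runs on a "tile-generator-N" thread, and whichever container thread serves the stored tile keeps its own name, so no renaming back and forth is needed.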

Long request processing with DropWizard

I have a simple DropWizard service and I'd like a REST API to start a long running processing task - both CPU and I/O bound. REST call will not wait for task completion, notification will happen by polling/long polling/web socket.
For now, I'd prefer if I can do this in Dropwizard and keep everything in single deployable JAR. What are my options?
UPDATE: I am interested in what my options are regarding running long running tasks in Dropwizard, deployed as single jar without external dependencies. Just spawn a new thread? Assuming there are just few such requests it would probably work but there should be better options.
You probably want to use a managed resource:
https://dropwizard.io/en/stable/manual/core.html#managed-objects
to setup a thread pool. Then, your initial request could push a message onto a queue. Your thread pool can pull messages off the queue and process them asynchronously.
You could maybe provide an additional endpoint so that clients can obtain the current state of the asynchronous process.
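A rough sketch of that shape in plain Java, with no Dropwizard imports so it stays self-contained. JobService and its method names are hypothetical; in a real application the class would implement Dropwizard's Managed interface so that start() and stop() follow the application lifecycle, and the two public methods would be called from resource endpoints.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical service holding the worker pool and the state of each job.
class JobService {
    enum Status { QUEUED, RUNNING, DONE, FAILED }

    private final Map<String, Status> statuses = new ConcurrentHashMap<>();
    private ExecutorService workers;

    public void start() {
        // The executor's internal queue plays the role of the message queue.
        workers = Executors.newFixedThreadPool(2);
    }

    public void stop() {
        workers.shutdown(); // already queued jobs still run to completion
    }

    // Called by the "start task" endpoint: returns at once with a job id.
    public String submit(Runnable work) {
        String id = UUID.randomUUID().toString();
        statuses.put(id, Status.QUEUED);
        workers.execute(() -> {
            statuses.put(id, Status.RUNNING);
            try {
                work.run();
                statuses.put(id, Status.DONE);
            } catch (RuntimeException e) {
                statuses.put(id, Status.FAILED);
            }
        });
        return id;
    }

    // Called by the polling endpoint; null means the id is unknown.
    public Status status(String id) {
        return statuses.get(id);
    }

    // Demo helper: after stop(), waits until all queued jobs have finished.
    public boolean awaitQuiet(long seconds) {
        try {
            return workers.awaitTermination(seconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The status map here grows without bound; a real service would presumably expire finished entries or keep them in a store.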

Starting long-running process from within Apache ServiceMix

I am looking for suggestions or ideas.
There is an external process (or even a browser) that needs to trigger a long-running process via simple web service call that ideally should run in the same container as that web service. We're using Apache ServiceMix. The web service itself shouldn't stay alive for the duration of the long-running process, besides it may just time-out anyway so we want it to return the response normally pretty much right away.
Originally, I was thinking of using ProcessBuilder() to launch the long-running process as just another app, but doing this introduces certain OS dependencies and seems like a less than ideal practice anyway. One of the options we considered is starting another thread from the request and just letting the request complete immediately with a response while the long-running thread keeps going as long as needed. I fear resource hijacking on the container, as well as for the long-running thread's health when its launcher/parent exits, losing any reference to that long-running child.
If anyone has any good ideas for how this can be solved in an elegant way, please let me know.
Thank you very much!
I'm guessing here, as you didn't provide the version of your servicemix. With Camel, which is included with servicemix, I'd have two routes: the first one providing the web service, the second one doing the long-running process. The second route should use the seda component. This will give you the async call.

What is the best approach to run a long process from a java servlet?

I would like to ask what is the best approach to run a long process from a java servlet. I have a webapp, and when the client makes a request it runs a servlet. This servlet should get some parameters from the request and then run a process. This process may take a long time, so I need to run it separately. When the process finishes, it sends an email with the results.
Thanks in advance.
Use a thread pool. Each time you receive a request, create a task and submit it to the thread pool. This will ensure too many requests don't bring the server to its knees, because you're in control of how many concurrent threads you can have, and how many tasks can wait in the thread pool's queue of waiting tasks.
See the javadoc for Executors and ThreadPoolExecutor.
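To illustrate the "you're in control" part, here is a small sketch of a ThreadPoolExecutor with a fixed number of threads and a bounded queue. The class name and the sizes are arbitrary choices for the example. Once every thread is busy and the queue is full, further submissions are rejected instead of piling up.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo of throttling with a bounded pool.
class BoundedPoolDemo {

    // Submits `attempts` blocking tasks and returns how many were rejected.
    static int countRejections(int threads, int queueSize, int attempts) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads,                     // fixed number of workers
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),  // bounded waiting queue
                new ThreadPoolExecutor.AbortPolicy()); // throw when saturated
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < attempts; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // stand-in for a long process
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    // Pool and queue are full; a servlet could report
                    // "busy, try again later" to the client at this point.
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + countRejections(2, 3, 10));
    }
}
```

With 2 threads and a queue of 3, five tasks are accepted (two running, three waiting) and the remaining five of the ten submissions are rejected.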
Though this sounds a bit dangerous that invocation of a servlet spawns a process (without proper throttling capabilities in place), you can spawn a process using Runtime.getRuntime().exec(). Much better would be to use ProcessBuilder to prepare the process arguments and spawn it.
Normally that kind of activity is delegated to another type of application module, like a message-driven bean, and that seems to be the cleanest and most standards-compliant solution to me. Although most servers won't complain if you create your own threads (which is forbidden by the standard but rarely enforced), the amount of management needed to set up your own job queue and pooled execution environment isn't really worth it in my opinion.
I see two possibilities to do this:
Create a separate thread for each task (thread pool approach). This is possible, but may create a performance problem.
Create a second application. For instance, you can save the parameters to a DB. The second application monitors this DB at some interval and acts on what it finds. Instead of a DB you can use a message queue manager like WebSphere MQ.
The second approach has an advantage: if the app is not able to process the request right now for some reason, it can return to it later.

Java Non-Blocking HTTP Server

I have written an application using embedded Jetty that makes network calls to other services.
I presume that the serving threads are idle whilst waiting for the network calls to complete.
Is there any way to have a worker thread that switches between requests to perform work that can be done at the current time and then when the network calls return also handle that? A request would be returned when all work has been completed for it.
I know this is a common paradigm, and I have used it for non-blocking TCP networking, but I'm unsure as to how to achieve this on a Java HTTP server whilst also waiting on external results.
Any links or explanations are appreciated.
Thanks
Update:
I'm using Membase and ElasticSearch (the only network calls). Membase returns "Future" objects and ElasticSearch returns "ListenableActionFuture". I'd like to be able to continue processing on a thread in response to these objects being returned.
You may take a look at Deft, which is a single-threaded, asynchronous, event-driven web server.
Netty is a java library that allows you to do asynchronous networking.
http://www.jboss.org/netty
Netty supports http, but it is a fairly low level library.
A higher level library is Finagle by Twitter,
http://twitter.github.com/finagle/
Finagle is built on top of netty, but supports connection pooling, load balancing, and has a lot of other features. Finagle supports http.
If you want to do work at the same time as IO, I suggest you add a thread pool to perform the work. It is possible to re-use the existing threads, but it's a lot of extra work for possibly too little benefit.
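A small sketch of the "separate thread pool for the work" suggestion, using the JDK's CompletableFuture. Everything here is a stand-in: the two fetch methods only simulate the remote calls, and this is not the Membase or ElasticSearch client API. A plain java.util.concurrent.Future can only be waited on or polled, so a real client's future would first have to be bridged into something that accepts callbacks.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo: combine two async results on a worker pool.
class NonBlockingCombineDemo {

    // Stand-in for a cache lookup; a real client would complete the future
    // from its own I/O callback.
    static CompletableFuture<String> fetchFromCache(ExecutorService io) {
        return CompletableFuture.supplyAsync(() -> "cached-user", io);
    }

    // Stand-in for a search call returning a hit count.
    static CompletableFuture<Integer> search(ExecutorService io) {
        return CompletableFuture.supplyAsync(() -> 3, io);
    }

    static String handleRequest() {
        ExecutorService io = Executors.newFixedThreadPool(2);
        ExecutorService work = Executors.newFixedThreadPool(2);
        try {
            // The combining step runs on the work pool once BOTH results have
            // arrived; no thread sits blocked waiting for either call.
            CompletableFuture<String> response = fetchFromCache(io)
                    .thenCombineAsync(search(io),
                            (user, hits) -> user + " has " + hits + " hits",
                            work);
            // join() is for the demo only; a server would write the response
            // from a further callback instead of blocking here.
            return response.join();
        } finally {
            io.shutdown();
            work.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(handleRequest());
    }
}
```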
