asynchronous http request handling with tomcat and spring - java

It's my first SO question so be patient with me :)
I'm trying to create a service that:
Receives HTTP GET requests containing a URL to query
For a single GET request the service extracts the URL
Queries a local DB about the URL
If a result is found in the DB, it is returned to the client; if not, the service needs to query some external services (which may take a relatively long time to respond)
Return the result of the URL to the client
I'm running this on a virtual machine and Tomcat7 with spring.
I'll apologize in advance and mention that I'm pretty new to Tomcat
Anyway, I'm expecting a lot of concurrent GET requests to this service (hundreds of thousands of simultaneous requests)
What I'm basically trying to achieve is to make this service as scalable as possible (and if that's not possible then at least a service that can handle hundreds of thousands of simultaneous requests)
I've been reading A LOT about asynchronous requests handling in services and especially in Tomcat but I have some things that are still unclear to me:
From the official tomcat website it seems that Tomcat contains number of acceptor threads and number of working threads.
If so, why should I use AsyncContext? What's the benefit of releasing one of Tomcat's working threads and occupying a different thread in my application to do the exact same actions? (There's still 1 active thread in the system.)
Somewhat similar to the first question but are there any benefits for creating the AsyncContext and using it with a different thread? (a thread from a thread pool created in my application)
Regarding the same issue, I've seen here that I can also return a Callable or a DeferredResult and process it with either one of Tomcat's threads or with one of my own threads. Are there any benefits for returning a Callable or using a DeferredResult over just processing the AsyncContext from the requests?
Also, if I decide to return a Callable, from what thread pool does Tomcat get the thread to process my Callable? Are the threads being used here the same working threads from Tomcat that I previously mentioned? If so, what benefits do I get from releasing one Tomcat working thread and using a different one instead?
I've seen from Oracle's documentation that I can pass AsyncContext a Runnable object that will be processed concurrently. Where do the threads used to execute this Runnable come from? Do I have any control over it? Also, are there any benefits to passing the AsyncContext a Runnable over just passing the AsyncContext to one of my threads?
I apologize for asking so many questions about the same things, but my colleagues and I have been arguing over them for more than a week without any concrete answer.
I have 1 more general question:
What do you think is the best way to make the service I described scalable (putting aside adding more machines at the moment)? Could you post any examples or references for the proposed solution?
I'd post more of the links I've been looking at but my current reputation doesn't allow it.
I'll be grateful for any understandable references or for concrete examples and I'll obviously be happy to clarify on any relevant issue
Cheers!

There are a lot of questions packed into this, but I'll try to address some of them.
Asynchronous I/O is a good thing, especially on servers that serve large volumes of requests - it allows you to use fewer threads to process more requests. In the case of a proxy such as you are writing, you really want your HTTP client (that makes the requests to foreign URLs) to be asynchronous as well, so that neither processing the request nor receiving the remote response involves blocking I/O.
That said, you may have a harder time doing this stuff with Tomcat or Java EE servers in general, which have had asynchronous I/O bolted onto them as an afterthought, than using a framework like Netty that is asynchronous from the ground up. As the author of a framework which builds on top of Netty, I'm a bit biased.
To demonstrate how little code you'd need to do what you describe, I wrote a small server that does what you describe here in 3 Java source files and put it on github - it builds a standalone JAR you can run with java -jar to try it out, and I tried to comment it clearly.
What it comes down to is that networked applications spend most of their time waiting for I/O to happen. In the case of a proxy in particular, with traditional, threaded I/O, you would get a request, and the thread that received the request would be responsible for answering it synchronously - that means, if it has to make a network request to another server, that thread is blocked waiting for the answer to come from the remote server, and it can't be used for anything else. So, if you have 10 threads, and all of them are waiting on responses, your server can't answer any more requests until one of them finishes and frees up a thread.
With asynchronous I/O, you get a callback when some I/O completes. In other words, instead of standing still until the OS flushes your data to the socket and out the network card, your code simply gets a friendly tap on the shoulder when there is something to do (like a response arriving from your proxy request). While your code is waiting for that HTTP request to complete, the thread that sent the proxy request is free to be used to handle another request. That means one thread can do a little work on one request, do a little work on another, and another, and eventually finish the first request. Since threads are a finite resource provided by your operating system, this allows you to do a lot more with a lot less hardware.
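To make the callback idea concrete, here is a minimal sketch using only the JDK. It is not Tomcat, Spring or Netty code: fetchRemote is a hypothetical stand-in for a non-blocking HTTP client, simulated with a timer, and the class name and URLs are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Sketch: a single thread serves many "slow" requests because nothing blocks while waiting.
final class CallbackDemo {

    // Hypothetical stand-in for a non-blocking remote call: the timer completes
    // the future after delayMillis, and no thread is parked in the meantime.
    static CompletableFuture<String> fetchRemote(ScheduledExecutorService timer,
                                                 String url, long delayMillis) {
        CompletableFuture<String> result = new CompletableFuture<>();
        timer.schedule(() -> { result.complete("result for " + url); },
                delayMillis, TimeUnit.MILLISECONDS);
        return result;
    }

    // Starts `requests` simulated proxy calls on ONE thread, then collects the responses.
    static List<String> serveAll(int requests, long delayMillis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(fetchRemote(timer, "http://example.invalid/" + i, delayMillis)
                        // The callback: runs only once the simulated I/O has completed.
                        .thenApply(body -> "200 OK: " + body));
            }
            List<String> responses = new ArrayList<>();
            for (CompletableFuture<String> f : pending) {
                responses.add(f.join()); // only this demo's caller waits; the timer thread never does
            }
            return responses;
        } finally {
            timer.shutdown();
        }
    }
}
```

Calling CallbackDemo.serveAll(50, 100) starts 50 simulated remote calls that overlap on one timer thread, so the waits run concurrently instead of back to back. With blocking I/O, the same single thread would have to wait out each call in turn.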
As to Callable vs. DeferredResult, using a Callable just moves when the work happens around (the Callable gets executed later, on some thread or other, but is still expected to return a result synchronously); DeferredResult sounds more like what you'd need, since that allows your code to go off and do whatever work it wants, and then set the result (triggering completion of the response) whenever it has something to set.
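The difference can be illustrated with plain JDK types. This is an analogy, not Spring's API: CompletableFuture plays the role of a DeferredResult here, and the class and method names are invented for the sketch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;

final class DeferredVsCallable {

    // Callable style: whichever thread runs call() has to stay with the work
    // until the value is ready, because the value IS the return value.
    static Callable<String> callableHandler(String url) {
        return () -> "looked up " + url;
    }

    // Deferred style: the handler hands back an empty placeholder right away...
    static CompletableFuture<String> deferredHandler() {
        return new CompletableFuture<>();
    }

    // ...and any thread (an I/O callback, a queue listener) fills it in later.
    static void onRemoteResponse(CompletableFuture<String> deferred, String body) {
        deferred.complete(body);
    }
}
```

With the deferred style no thread has to sit inside the handler while the external service is slow; the response is completed from whatever callback eventually has the data.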
Honestly, I think if you want to implement this really efficiently, you'd be better off staying away from the Java EE stack - so much of it has baked-in assumptions that I/O is synchronous that trying to do async stuff with it is swimming upstream (for example, JDBC has synchronous I/O baked into its bones - if you really want this to scale and you want to use an SQL database, you'd be better off with something like this).
For another example of using Netty for this sort of thing, see the tiny-maven-proxy project - the code is less pretty, but it shows an example of doing an HTTP proxy where the response body is fed to the client chunk-by-chunk, as it arrives - so you never actually pull the full response body into memory, meaning even requests with huge responses won't run the proxy out of memory. Tiny-maven-proxy also caches on the filesystem. I didn't do those things in the demo because it would have made the code more complicated.

Related

role of multithreading in web application

I have been using Java (Servlets, JSPs) for two years for web application development. In those two years I have never needed to use multithreading explicitly in any project (I know that servlet containers use threads to serve the same servlet to different requests).
But whenever I attend an interview for a Java web developer position, there are several questions related to threads in Java. I know the basics of Java threading, so answering the questions is not a problem. But sometimes I wonder whether I am missing something by not using multithreading when developing web applications.
So my question is: what is the role of multithreading in a web application? Any example where multithreading can be used in a web application will be appreciated.
Thanks in advance.
Multi-threading can be used in Web Apps mainly when you are interested in asynchronous calls.
Consider for example you have a Web application that activates a user's state on a GSM network (e.g activate 4G plan) and sends a confirmatory SMS or email message at the end.
Knowing that the Web call would take several minutes - especially if the GSM network is stressed - it does not make sense to call it directly from the Web thread.
So basically, when a user clicks "Activate", the Server returns something like "Thanks for activating the 4G plan. Your plan will be activated in a few minutes and you will receive a confirmation SMS/email".
In that case, your server has to start the task on a new thread, ideally taken from a thread pool, so that it runs asynchronously, and immediately return a response to the user.
Workflow:
1- User clicks "Activate" button
2- Servlet receives request and activates a new "Activate 4G Plan" task in a thread pool.
3- Servlet immediately returns an HTML response to the user without waiting for the task to be finalized.
4- End of Http transaction
.
.
.
Asynchronously, the 4G plan gets activated later and the user gets notified through SMS or email, etc...
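The workflow above can be sketched as follows. This is a simplified illustration with plain JDK classes rather than a real servlet; ActivationService, the pool size and the reply text are assumptions made for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ActivationService {
    // Shared pool, so each click reuses a worker instead of creating a thread.
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    // Runs on the request thread: queue the slow job and return at once.
    String handleActivateClick(String userId, Runnable slowActivation) {
        pool.submit(slowActivation); // steps 2 and 3: hand off, do not wait
        return "Thanks, " + userId + ". Your plan will be activated in a few minutes.";
    }

    // Call on application shutdown so queued jobs can finish and threads exit.
    void shutdown() {
        pool.shutdown();
    }
}
```

The Runnable passed in would perform the activation and send the SMS or email when it finishes; the HTTP response has long since been returned by then.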
Speaking about a real-world example, there are several reasons to use multi-threading, and I wouldn't hire a web developer who doesn't know about it. But in the end, the reasons to use multi-threading are the same for standard and web development: you either want something that takes a while (i.e. blocking) done in the background so you can give the user some response in the meantime, or you have a task that can be sped up by running it on several cores. When multi-threading is actually useful is, however, a different question.
Situation 1: A web server that does require some processing and has low hits/second
Here multi-threading (if applicable to the algorithm) is a good thing, as idle cores are utilized and threading can result in a faster response to the user.
Situation 2: A web server that does require some processing and has high hits/second
Here multi-threading is possible, but as cores are usually busy with other requests, there are no resources left to use it properly. Actually spreading out the task to several threads can even have a negative impact on the response time, as the task is now fragmented and all parts need to complete, but the order of execution with threads is undefined. So one client could immediately receive a response, while others might wait into time-out till their last fragment eventually gets processed.
Situation 3: A web server has to do some processing that takes a very long time
Here multi-threading is required, there is no way around it. A client cannot wait minutes or probably hours till it receives the response. In this case a callback system is usually implemented, so basically each task has an "API" that can be queried for the current state. Most online-shops are an example for this: you order something and later you can query your order status.
The alternative to threading is process-forking, as Apache does in its standard configuration. The benefit is that load is spread across cores (mostly applicable to situation 2), and the web code itself doesn't have to do anything to use all those cores, as the OS handles that automatically. However, if you have imbalanced load, some cores can be idle and resources are not used in an optimal way. A threading solution is almost always the better one, if it is done right. But the Apache/Tomcat standard configuration uses a very outdated threading model, spawning one thread for each request. Effectively, beyond a certain number of hits/second, the CPU is busier with threading than with actually processing those requests.
Well this is a nice question and I think most of the developers who work in web application development don't use multithreading explicitly.
The reason is quite obvious: since you are using an application server to deploy your application, the application server internally manages a thread pool for incoming requests.
Then why use multithreading explicitly? Why does a web application developer need to deal with multithreading at all?
When you work on a large-scale application where you have to serve many requests concurrently, it is difficult to serve every kind of request synchronously, because a particular kind of request may do a lot of processing, which could bring down the performance of your application.
Let's take an example where a web application, after serving a particular kind of request, has to notify users through email and SMS. Doing it synchronously on the request thread could bring down the performance of your web application. This is where multithreading comes in.
In such cases it is advisable to develop a standalone multithreaded application, reachable over the network, which is responsible only for sending email and SMS.
Multi-threading in a web application can be used when you are interested in parallel actions, e.g., fetching data from multiple addresses.
As I understand it, multi-threading is used in different situations from a thread pool, which can be used to handle requests from multiple clients.

Java Servlet Connection Timeout

I have a servlet that takes a couple of minutes to process and return its response. It is running in a somewhat restricted environment (Amazon Elastic Beanstalk). In this environment, there is a 60 second limit on request times and that is not configurable.
What are my options here? I thought of having the servlet start a thread and have the browser poll with AJAX, but I have seen so many people recommend against servlets starting threads for various reasons.
Another solution would be to have a thread start and end in the application's context listener, but I have many different servlets in the app that perform various functions, all of which have the same issue. A single thread running in the background would not really help.
Any suggestions?
Edit: With a little more research on SO, I found that an Executor is what I need.
See BalusC's answer here
See skaffman's answer here
Yes, it is not best practice to start threads programmatically in a servlet container. But this restriction is not so strict. IMHO you can do it if you really need to. But if you go with such a solution, implement it step by step.
First, just check whether this works: open a new thread to process your long request. While it is being processed, send some kind of "keep-alive" from the "main" thread of your servlet. When processing is done, send the response to the client.
A probably better and more scalable solution is to use messaging (e.g. JMS) for asynchronous processing of long requests. When a request is received, the servlet should just create a JMS message, enqueue it and immediately return. The other side (which implements MessageListener) should process the message and put the result into an outgoing queue. The client should request the result from this queue. This is the clean solution; it will work in clustered and multi-machine environments, but it requires more effort.
So, your choice should depend on your requirements, resources and time.
The best way to address this is using the Executor (see the update in my question). I used this in my project and it has worked seamlessly.

How to optimize number of database connections?

We have a Java (Spring) web application with Tomcat servlet container.
We have something like a blog.
But the blog must load its posts dynamically with Ajax.
The client's Ajax script checks for new posts every second.
I.e. Ajax must ask the server for new posts every second, and that will be very heavy for the database.
And what if we have hundreds of thousands of simultaneous connections?
I think that we must retrieve all posts with cron every second and after that save it somewhere. But where? The main idea is to unload the database.
Any ideas about architecture?
Thanks in advance!
There is another architecture for polling that could be more optimal, depending on the case:
Long polling
Long polling is a variation of the traditional polling technique and allows emulation of an information push from a server to a client. With long polling, the client requests information from the server in a similar way to a normal poll. However, if the server does not have any information available for the client, instead of sending an empty response, the server holds the request and waits for some information to be available. Once the information becomes available (or after a suitable timeout), a complete response is sent to the client. The client will normally then immediately re-request information from the server, so that the server will almost always have an available waiting request that it can use to deliver data in response to an event. In a web/AJAX context, long polling is also known as Comet programming.
Long Polling
Example of Implementations of this technology:
Push Server
You could also use the observer pattern to register the requests, and notify them when an update is done.
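As a rough sketch of that idea, parked long-poll requests can be modelled as futures that are completed when a post is published. The PostNotifier class and its method names are hypothetical, and a real implementation would also need the timeout mentioned in the long polling description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class PostNotifier {
    private final List<CompletableFuture<String>> waiting = new ArrayList<>();

    // A long-poll request registers itself and gets a future instead of an immediate answer.
    synchronized CompletableFuture<String> awaitNextPost() {
        CompletableFuture<String> parked = new CompletableFuture<>();
        waiting.add(parked);
        return parked;
    }

    // Publishing a post answers every parked request; returns how many were notified.
    int publish(String post) {
        List<CompletableFuture<String>> toNotify;
        synchronized (this) {
            toNotify = new ArrayList<>(waiting);
            waiting.clear();
        }
        // Complete outside the lock so callbacks cannot block new registrations.
        for (CompletableFuture<String> parked : toNotify) {
            parked.complete(post);
        }
        return toNotify.size();
    }
}
```

With this shape the database is touched when a post is written, not once per client per second.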
Hundreds of thousands of concurrent users all polling your site every second makes for a huge amount of traffic. If you truly expect this load you are going to have to design your platform accordingly, probably by clustering multiple web, application and DB servers.
Remember that with a database connection pool you don't need a DB connection for every user.
I'm not as familiar with Tomcat, but in WebSphere we can set up connection pools to prepare a certain number of connections.
Also, are you mainly worried about reads or the same number of writes?
Plus, you may also want to have the database "split" depending on region etc. This way there is no single heavy load across the entire database, but it can then be split and even load balanced.
There are also "NoSQL" databases to look into. Maybe something to consider. Just ideas to help out.

Best method of triggering a shell script from Java

I have a shell script which I'd like to trigger from a J2EE web app.
The script does lots of things - processing, FTPing, etc - it's a legacy thing.
It takes a long time to run.
I'm wondering what is the best approach to this. I want a user to be able to click on a link, trigger the script, and display a message to the user saying that the script has started. I'd like the HTTP request/response cycle to be instantaneous, irrespective of the fact that my script takes a long time to run.
I can think of three options:
Spawn a new thread during the processing of the user's click. However, I don't think this is compliant with the J2EE spec.
Send some output down the HTTP response stream and commit it before triggering the script. This gives the illusion that the HTTP request/response cycle has finished, but actually the thread processing the request is still sat there waiting for the shell script to finish. So I've basically hijacked the container's HTTP processing thread for my own purpose.
Create a wrapper script which starts my main script in the background. This would let the request/response cycle to finish normally in the container.
All the above would be using a servlet and Runtime.getRuntime().exec().
This is running on Solaris using Oracle's OC4J app server, on Java 1.4.2.
Please does anyone have any opinions on which is the least hacky solution and why?
Or does anyone have a better approach? We've got Quartz available, but we don't want to have to reimplement the shell script as a Java process.
Thanks.
You mentioned Quartz so let's go for an option #4 (which is IMO the best of course):
Use Quartz Scheduler and a org.quartz.jobs.NativeJob
PS: The biggest problem may be to find documentation and this is the best source I've been able to find: How to use NativeJob?
I'd go with option 3, especially if you don't actually need to know when the script finishes (or have some other way of finding out other than waiting for the process to end).
Option 1 wastes a thread that's just going to be sitting around waiting for the script to finish. Option 2 seems like a bad idea. I wouldn't hijack servlet container threads.
Is it necessary for your application to evaluate output from the script you are starting, or is this a simple fire-and-forget job? If it's not required, you can 'abuse' the fact that Runtime.getRuntime().exec() will return immediately with the process continuing to run in the background. If you actually wanted to wait for the script/process to finish, you would have to invoke waitFor() on the Process object returned by exec().
If the process you are starting writes anything to stdout or stderr, be sure to redirect these to either log files or /dev/null, otherwise the process will block after a while, since stdout and stderr are available as InputStreams with limited buffering capabilities through the Process object.
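One way to do the redirection from Java itself is ProcessBuilder, sketched below. Note the assumption: ProcessBuilder needs Java 5 and its redirect methods need Java 7, so this does not apply to the Java 1.4.2 setup in the question, where the redirection would have to happen inside a wrapper script. ScriptLauncher and the file names are made up for illustration.

```java
import java.io.File;
import java.util.List;

final class ScriptLauncher {
    // Builds a launcher whose stdout and stderr both go to logFile,
    // so the child can never stall on a full pipe buffer.
    static ProcessBuilder backgroundBuilder(List<String> command, File logFile) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout
        pb.redirectOutput(ProcessBuilder.Redirect.appendTo(logFile));
        return pb;
    }
}
```

Calling start() on the returned builder launches the script and returns immediately; waitFor() on the resulting Process is only needed if the caller wants to know when it ends.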
My approach to this would probably be something like the following:
Set up an ExecutorService within the servlet to perform the actual execution.
Create an implementation of Callable with an appropriate return type, that wraps the actual script execution (using Runtime.exec()) to translate Java input variables to shell script arguments, and the script output to an appropriate Java object.
When a request comes in, create an appropriate Callable object, submit it to the executor service and put the resulting Future somewhere persistent (e.g. user's session, or UID-keyed map returning the key to the user for later lookups, depending on requirements). Then immediately send an HTTP response to the user implying that the script was started OK (including the lookup key if required).
Add some mechanism for the user to poll the progress of their task, returning either a "still running" response, a "failed" response or a "succeeded + result" response depending on the state of the Future that you just looked up.
It's a bit handwavy but depending on how your webapp is structured you can probably fit these general components in somewhere.
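The steps above can be sketched as follows. This is one possible shape, not a definitive implementation: JobTracker, the UUID keys, the pool size and the status strings are all assumptions, and the Callable would wrap the actual script execution.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class JobTracker {
    private final ExecutorService executor = Executors.newFixedThreadPool(2);
    private final Map<String, Future<String>> jobs = new ConcurrentHashMap<>();

    // Submit the work and hand the caller a key for later lookups.
    String start(Callable<String> job) {
        String key = UUID.randomUUID().toString();
        jobs.put(key, executor.submit(job));
        return key;
    }

    // Non-blocking status check, suitable for a polling endpoint.
    String status(String key) {
        Future<String> future = jobs.get(key);
        if (future == null) return "unknown";
        if (!future.isDone()) return "still running";
        try {
            return "succeeded: " + future.get(); // returns at once because isDone() was true
        } catch (ExecutionException e) {
            return "failed: " + e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return "still running";
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

A request handler would call start(...) and return the key in the response; the polling request then calls status(key). Finished entries are never removed in this sketch, so a real version would need some clean-up.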
If your HTTP response / the user does not need to see the output of the script, or be aware of when the script completes, then your best option is to launch the thread in some sort of wrapper script as you mention so that it can run outside of the servlet container environment as a whole. This means you can absolve yourself from needing to manage threads within the container, or hijacking a thread as you mention, etc.
Only if the user needs to be informed of when the script completes and/or monitor the script's output would I consider options 1 or 2.
For the second option, you can use a servlet, and after you've responded to the HTTP request, you can use java.lang.Runtime.exec() to execute your script. I'd also recommend that you look here : http://www.javaworld.com/javaworld/jw-12-2000/jw-1229-traps.html
... for some of the problems and pitfalls of using it.
The most robust solution for asynchronous backend processes is using a message queue IMO. Recently I implemented this using a Spring-embedded ActiveMQ broker, and rigging up a producing and consuming bean. When a job needs to be started, my code calls the producer which puts a message on the queue. The consumer is subscribed to the queue and gets kicked into action by the message in a separate thread. This approach neatly separates the UI from the queueing mechanism (via the producer), and from the asynchronous process (handled by the consumer).
Note this was a Java 5, Spring-configured environment running on a Tomcat server on developer machines, and deployed to Weblogic on the test/production machines.
Your problem stems from the fact that you are trying to go against the 'single response per request' model in J2EE, and have the end-user's page dynamically update as the backend task executes.
Unless you want to go down the route of introducing an Ajax-based solution, you will have to force the rendered page in the user's browser to 'poll' the server for information periodically, until the back-end task completes.
This can be achieved by:
When the J2EE container receives the request, spawn a thread which takes a reference to the session object (which will be used to write the output of your script)
Initialize the response servlet to write an html page which will contain a Javascript function to reload the page from the server at regular intervals (every 10 seconds or so).
On each request, poll the session object to display the output stored by the spawned thread in step 1
[clean-up logic can be added to delete the stored content from the session once the thread completes if needed; you can also set additional flags in the session to mark state transitions in the execution of your script]
This is one way to achieve what you want - it isn't the most elegant of all approaches, but it is essentially due to needing to asynchronously update your page content from the server , with a request/response model.
There are other ways to achieve this, but it really depends on how inflexible your constraints are. I have heard of Direct Web Remoting (although I haven't played with it yet), might be worth taking a look at Developing Applications using Reverse-Ajax

Implementing client - masterserver/slaveserver application java

We have a string processing service (C++, uses stdin/stdout for input/output) that has different layouts. Each layout runs separately (eventually they will run on separate machines), and each layout takes time to load, which is why it must keep running after the first run.
I must implement a system with client that will ask the master server to connect it to a relevant slave server which actually runs the relevant layout service. The slave server will communicate the data passed from the client to the service, and when finished will become available on the master server for other clients.
The question is what is the best way to go about implementing the servers? Should I keep an open connection between slave/master until the process is complete to notify the master that the connection is over or keep some sort of var in a synchronized function to check that?
Any other important inputs (or other designs) I have overlooked are also very welcomed, Thanx!
Assuming you can't replace the C++ stuff, here is how I would do it off the top of my head.
I would setup one master server. That server would run a process that accepts requests (probably by HTTP, so it'd be a webservice) and I would have it read the request, parse out what it is, and then call the correct slave. Basically it acts as a proxy. Once it receives the response from the slave it forwards it back to the caller. The simplicity here means that if you start getting more of one type of request, you can set up additional servers for that and round-robin requests to them.
The slaves would be webservices that open the C++ program and forward input and retrieve output. That's all it would do.
I wouldn't bother keeping open connections (except between the slave and the C++ program based on your description). Just using a web request for this stuff will keep the connection between the master and the slave open during the process, but it shouldn't be a problem. This way you don't need to worry about this detail.
Now if I were you I would seriously look at reimplementing the C++ code in Java or calling it via JNI or something. If you can avoid it, I think avoiding the Java wrapper around C++ thing would be a good design goal. The Java could do whatever expensive process it is during start up once, and then hold things ready in memory like the C++ code does.
I hope this helps.
Depending on your scalability needs, you may want to take a look at the Java NIO package. This will give you a starting point to build a scalable, non-blocking server implementation.
