I'd like to multithread my GAE servlets so that the same servlet on the same instance can handle up to 10 (on frontend instance I believe the max # threads is 10) concurrent requests from different users at the same time, timeslicing between each of them.
public class MyServlet extends HttpServlet {
    private Executor executor;
    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) {
        if (executor == null) {
            ThreadFactory threadFactory = ThreadManager.currentRequestThreadFactory();
            executor = Executors.newCachedThreadPool(threadFactory);
        }
        MyResult result = executor.submit(new MyTask(request));
        writeResponseAndReturn(response, result);
    }
}
So basically when GAE starts up, the first time it gets a request to this servlet, an Executor is created and then saved. Then each new servlet request uses that executor to spawn a new thread. Obviously everything inside MyTask must be thread-safe.
What I'm concerned about is whether or not this truly does what I'm hoping it does. That is, does this code create a non-blocking servlet that can handle multiple requests from multiple users at the same time? If not, why and what do I need to do to fix it? And, in general, is there anything else that a GAE maestro can spot that is dead wrong? Thanks in advance.
I don't think your code would work.
The doGet method runs on threads managed by the servlet container. When a request comes in, a servlet thread is occupied, and it is not released until doGet returns. In your code, executor.submit would return a Future object. To get the actual result you need to call get() on that Future, which blocks until MyTask has finished. Only after that does doGet return, freeing the thread for new requests.
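To make the blocking visible, here is a minimal sketch that uses only java.util.concurrent and no servlet API. BlockingSubmitDemo and handleRequest are hypothetical names standing in for the servlet and its doGet method, and the 200 ms sleep is an assumed stand-in for whatever MyTask does.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the servlet scenario; no servlet API is used here.
final class BlockingSubmitDemo {

    // Plays the role of doGet(): hands the work to a pool, then has to wait for it.
    static String handleRequest(ExecutorService executor, String input) {
        Future<String> future = executor.submit(() -> {
            Thread.sleep(200); // simulated work done by "MyTask"
            return "result for " + input;
        });
        try {
            // get() blocks the calling (request) thread until the task has finished.
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newCachedThreadPool();
        long start = System.nanoTime();
        String result = handleRequest(executor, "user-1");
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(result + " after about " + elapsedMs + " ms");
        executor.shutdown();
    }
}
```

The calling thread is held for the full duration of the task, so the extra pool thread buys nothing in terms of concurrency between requests.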
I am not familiar with GAE, but according to their docs, you can declare your servlet as thread-safe and then the container will dispatch multiple requests to each web server in parallel:
<!-- in appengine-web.xml -->
<threadsafe>true</threadsafe>
You implicitly asked two questions, so let me answer both:
1. How can I get my AppEngine Instance to handle multiple concurrent requests?
You really only need to do two things:
Add the statement <threadsafe>true</threadsafe> to your appengine-web.xml file, which you can find in the war\WEB-INF folder.
Make sure that the code inside all your request handlers is actually thread-safe, i.e. use only local variables in your doGet(...), doPost(...), etc. methods or make sure you synchronize all access to class or global variables.
This will tell the AppEngine instance server framework that your code is thread-safe and that you are allowing it to call all of your request handlers multiple times in different threads to handle several requests at the same time. Note: AFAIK, it is not possible to set this on a per-servlet basis. So, ALL your servlets need to be thread-safe!
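The second requirement (local variables only, or synchronized access to shared fields) can be sketched without any App Engine or servlet classes. ThreadSafeHandlerDemo, handle and simulate are hypothetical names; the fixed pool of 10 threads merely imitates the container calling the handler concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical handler class standing in for a servlet; not App Engine API.
final class ThreadSafeHandlerDemo {
    private final Object lock = new Object();
    private long requestsServed; // shared state, guarded by 'lock'

    // Plays the role of doGet(): may be called by many threads at once.
    String handle(String user) {
        // Local variables live on the calling thread's stack and are never shared.
        StringBuilder body = new StringBuilder();
        body.append("Hello, ").append(user);
        synchronized (lock) { // every access to the shared field goes through the lock
            requestsServed++;
        }
        return body.toString();
    }

    long requestsServed() {
        synchronized (lock) {
            return requestsServed;
        }
    }

    // Imitates the container dispatching 'requests' calls on 10 threads.
    static long simulate(int requests) {
        ThreadSafeHandlerDemo handler = new ThreadSafeHandlerDemo();
        ExecutorService container = Executors.newFixedThreadPool(10);
        try {
            List<Callable<String>> calls = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                String user = "user-" + i;
                calls.add(() -> handler.handle(user));
            }
            for (Future<String> f : container.invokeAll(calls)) {
                f.get(); // propagate any failure from a handler call
            }
            return handler.requestsServed();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            container.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(simulate(1000)); // prints 1000
    }
}
```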
So, in essence, the executor-code you posted is already included in the server code of each AppEngine instance, and actually calls your doGet(...) method from inside the run method of a separate thread that AppEngine creates (or reuses) for each request. Basically doGet() already is your MyTask().
The relevant part of the Docs is here (although it doesn't really say much): https://developers.google.com/appengine/docs/java/config/appconfig#Using_Concurrent_Requests
2. Is the posted code useful for this (or any other) purpose?
AppEngine in its current form does not allow you to create and use your own threads to accept requests. It only allows you to create threads inside your doGet(...) handler, using the currentRequestThreadFactory() method you mentioned, but only to do parallel processing for this one request and not to accept a second one in parallel (this happens outside doGet()).
The name currentRequestThreadFactory() might be a little misleading here. It does not mean that it will return the current Factory of RequestThreads, i.e. threads that handle requests. It means that it returns a Factory that can create Threads inside the currentRequest. So, unfortunately it is actually not even allowed to use the returned ThreadFactory beyond the scope of the current doGet() execution, like you are suggesting by creating an Executor based on it and keeping it around in a class variable.
For frontend instances, any threads you create inside a doGet() call will get terminated immediately when your doGet() method returns. For backend instances, you are allowed to create threads that keep running, but since you are not allowed to open server sockets for accepting requests inside these threads, these will still not allow you to manage the request handling yourself.
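What the request-scoped factory is meant for can be sketched as follows: threads that split the work of one request and are finished before the handler returns. This is an illustration under assumptions, not App Engine code. PerRequestThreadsDemo and handleRequest are hypothetical, the squaring is placeholder work, and Executors.defaultThreadFactory() is only a local stand-in for ThreadManager.currentRequestThreadFactory().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;

final class PerRequestThreadsDemo {

    // Plays the role of doGet(). On App Engine the factory would come from
    // ThreadManager.currentRequestThreadFactory(); here it is passed in.
    static int handleRequest(ThreadFactory requestScopedFactory, List<Integer> parts) {
        // The executor is a local variable: created and shut down within this one call.
        ExecutorService executor = Executors.newCachedThreadPool(requestScopedFactory);
        try {
            List<Callable<Integer>> subtasks = new ArrayList<>();
            for (Integer part : parts) {
                subtasks.add(() -> part * part); // hypothetical per-part work
            }
            int total = 0;
            // invokeAll() returns only after every subtask has completed.
            for (Future<Integer> f : executor.invokeAll(subtasks)) {
                total += f.get();
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown(); // the pool is not kept around for later requests
        }
    }

    public static void main(String[] args) {
        // defaultThreadFactory() is only a stand-in for the App Engine factory.
        int total = handleRequest(Executors.defaultThreadFactory(), Arrays.asList(1, 2, 3, 4));
        System.out.println(total); // 1 + 4 + 9 + 16 = 30
    }
}
```

Nothing created from the factory is stored in a field, which is the difference from the executor kept in a class variable in the question.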
You can find more details on what you can and cannot do inside an appengine servlet here:
The Java Servlet Environment - The Sandbox (specifically the Threads section)
For completeness, let's see how your code can be made "legal":
The following should work, but it won't make a difference in terms of your code being able to handle multiple requests in parallel. That will be determined solely by the <threadsafe>true</threadsafe> setting in your appengine-web.xml. So, technically, this code is just really inefficient and splits an essentially linear program flow across two threads. But here it is anyway:
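public class MyServlet extends HttpServlet {
    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        ThreadFactory threadFactory = ThreadManager.currentRequestThreadFactory();
        ExecutorService executor = Executors.newCachedThreadPool(threadFactory); // ExecutorService, because plain Executor has no submit()
        try {
            Future<MyResult> result = executor.submit(new MyTask(request)); // Fires off request handling in a separate thread
            writeResponse(response, result.get()); // Waits for thread to complete and builds response. After that, doGet() returns
        } catch (InterruptedException | ExecutionException e) {
            throw new ServletException(e); // get() reports failures of MyTask as checked exceptions
        } finally {
            executor.shutdown(); // the executor must not outlive this request
        }
    }
}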
Since you are already inside a separate thread that is specific to the request you are currently handling, you should definitely save yourself the "thread inside a thread" and simply do this instead:
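public class MyServlet extends HttpServlet {
    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        try {
            writeResponse(response, new MyTask(request).call()); // Delegate request handling to MyTask object in current thread and write out returned response
        } catch (Exception e) {
            throw new ServletException(e); // Callable.call() is declared to throw Exception
        }
    }
}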
Or, even better, just move the code from MyTask.call() into the doGet() method. ;)
Aside - Regarding the limit of 10 simultaneous servlet threads you mentioned:
This is a (temporary) design-decision that allows Google to control the load on their servers more easily (specifically the memory use of servlets).
You can find more discussion on those issues here:
Issue 7927: Allow configurable limit of concurrent requests per instance
Dynamic Backend Instance Scaling
If your bill shoots up due to increased latency, you may not be refunded the charges incurred
This topic has been bugging the heck out of me, too, since I am a strong believer in ultra-lean servlet code, so my usual servlets could easily handle hundreds, if not thousands, of concurrent requests. Having to pay for more instances due to this arbitrary limit of 10 threads per instance is a little annoying to me to say the least. But reading over the links I posted above, it sounds like they are aware of this and are working on a better solution. So, let's see what announcements Google I/O 2013 will bring in May... :)
I second the assessments of ericson and Markus A.
If however, for some reason (or for some other scenario) you want to follow the path that uses your code snippet as a starting point, I'd suggest that you change your executor definition to:
private static Executor executor;
so that it becomes static, i.e. shared by every instance of the servlet class loaded in the same JVM.
If a server handles multiple requests by running a single servlet, where do we need to take care of synchronization?
From "How does a single servlet handle multiple requests from client side" I have understood how multiple requests are handled. But that raises another question: why do we need synchronization if all requests are handled separately?
Can you give a real-life example of how shared state works and how a servlet can depend on it? I am not so much interested in code as in an explanation with an example from some portal application, such as how a login page is accessed by n users concurrently.
From what I have read, when more than one request is handled, the server creates a thread pool of n threads to serve the requests, and I guess each thread has its own set of parameters to maintain the session. So is there any chance that two or more threads (that is, two or more requests) can collide with each other?
Synchronization is required when multiple threads modify a shared resource.
So, when all your servlets are independent of each other, you don't need to worry about the fact that they run in parallel.
But if they work on "shared state" somehow (for example by reading/writing values in some sort of centralized data store), then you have to make sure that things don't go wrong. Of course, at which layer and in what form you provide the necessary synchronization depends on your exact setup.
Yes, my answer is pretty generic; but so is your question.
Synchronization in Java is only needed if the shared object is mutable. If your shared object is read-only or immutable, you don't need synchronization, even with multiple threads running. The same applies to what the threads do with an object: if all threads only read its value, you don't need synchronization.
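As a small illustration of the immutable case, the sketch below shares one settings object between reader threads without any locking. SiteSettings and ImmutableSharingDemo are hypothetical names invented for this example, as are the title and language values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical immutable settings object shared by all request threads.
final class SiteSettings {
    private final String title;
    private final List<String> languages;

    SiteSettings(String title, List<String> languages) {
        this.title = title;
        // Defensive copy wrapped as unmodifiable: nothing can change after construction.
        this.languages = Collections.unmodifiableList(new ArrayList<>(languages));
    }

    String title() { return title; }
    List<String> languages() { return languages; }
}

final class ImmutableSharingDemo {
    // Published once through a static final field; afterwards threads only read it.
    private static final SiteSettings SETTINGS =
            new SiteSettings("Portal", Arrays.asList("en", "de"));

    // Plays the role of a request handler that only reads the shared object.
    static String render(String user) {
        return SETTINGS.title() + " / " + user + " / " + SETTINGS.languages().size();
    }

    // Runs 'readers' concurrent reads and checks that each one sees the same data.
    static boolean allReadsConsistent(int readers) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < readers; i++) {
                results.add(pool.submit(() -> render("u")));
            }
            for (Future<String> f : results) {
                if (!"Portal / u / 2".equals(f.get())) {
                    return false;
                }
            }
            return true;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```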
Basically, if your servlet application is multi-threaded, then data stored in the servlet's fields is not thread-safe. The common example given in many textbooks is a hit counter stored as a private variable, e.g.:
public class YourServlet extends GenericServlet {
    private int counter;
    @Override
    public void service(ServletRequest req, ServletResponse res) {
        // this is not thread safe
        counter++;
    }
}
This is because the service method of a single servlet instance is executed by multiple threads, one per incoming HTTP request. The increment operator first has to read the current value, add one, and then write the value back. Another thread doing the same operation concurrently may increment the value after the first thread has read it but before it has written it back, resulting in a lost update.
So in this case you should use synchronisation, or even better, the AtomicInteger class included as part of Java Concurrency from 1.5 onwards.
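A minimal sketch of the AtomicInteger variant, kept free of the servlet API so it can be run on its own. HitCounterDemo, recordHit and simulate are hypothetical names; recordHit plays the role of the service method.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class HitCounterDemo {
    private final AtomicInteger counter = new AtomicInteger();

    // Plays the role of service(): called concurrently by request threads.
    int recordHit() {
        return counter.incrementAndGet(); // read, add one and write back as one atomic step
    }

    int hits() {
        return counter.get();
    }

    // Hammers one shared instance from several threads and returns the final count.
    static int simulate(int threads, int hitsPerThread) {
        HitCounterDemo servlet = new HitCounterDemo();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    servlet.recordHit();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for workers");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return servlet.hits();
    }

    public static void main(String[] args) {
        System.out.println(simulate(8, 10_000)); // prints 80000
    }
}
```

With a plain int field and counter++ in place of the AtomicInteger, the same run could come out below 80000 because of lost updates.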
The scenario of my problem is:
In my servlet I get a large amount of data from somewhere (not relevant). I have to iterate over all this data and put it in an array, convert it to a JSON object and send it to the client side for viewing. If I do this in a single response it takes a very long time to display the results. Hence, I need to do multithreading.
The created thread needs to keep on adding data to the list while the main thread whenever it gets a request (requests for data keep on coming periodically) sends the present available list.
For instance on first request the response sent is : 1 2 3
Second request : 4 5 6 and so on.
Now I come to the actual problem: I don't know how to do multithreading in a servlet. I have looked through numerous resources and examples, but they have only confused me further. Some examples create threads right in doGet, which I think is very wrong. Some create them in the init() method, but I don't know how I can pass parameters to the thread and get results from it if it is declared in the init method (it cannot be a global variable). Then there are examples using ServletContextListener, but I haven't found anything useful or that makes sense.
Can anyone please guide me to a reliable source or just give me some sort of pseudocode for a solution to my problem? It would be extremely helpful if the answers were in the context of the aforementioned scenario.
Thanks
The created thread needs to keep on adding data to the list while the
main thread whenever it gets a request (requests for data keep on
coming periodically) sends the present available list.
If I understand you correctly, you want to fetch data in a background service and have it ready for clients when they request it (this sounds like data harvesting).
Creating threads in a web app, or generally in any managed environment, is different from doing so in a standalone program: threads created ad hoc, outside the container's control, can cause memory leaks.
One good solution is to use a thread pool (either obtained from the container via context/JNDI, or created manually).
AND it MUST be created in a manageable way, so that you can control it through the environment's lifecycle events.
A ServletContextListener is your friend. Write a context listener class like this:
public class dear_daemon implements ServletContextListener, Runnable {
    ExecutorService the_pool;
    Thread the_evil;
    volatile boolean eviling = true;
    /* invoked once when the context is initialized */
    public void contextInitialized(ServletContextEvent sce) {
        /* initialize the thread pool, then run the evil thread */
        the_pool = Executors.newFixedThreadPool(4); // pool size is an arbitrary example value
        the_evil = new Thread(this);
        the_evil.start();
    }
    /* invoked once when the context is being destroyed */
    public void contextDestroyed(ServletContextEvent sce) {
        /* stop the evil (this) thread first, then destroy the thread pool */
        eviling = false;
        the_evil.interrupt();
        the_pool.shutdownNow();
    }
    public void run() {
        while (eviling) {
            /* run Runnable instances which do the data fetching, using the thread pool */
        }
    }
}
And register the listener in web.xml:
<listener>
<listener-class>dudes.dear_daemon</listener-class>
</listener>
Have a Runnable class that does the data fetching, and have the evil thread invoke it, with each instance using one thread.
The ContextListener lets you handle the container's init and halt events and shut down correctly. Doing the same thing in the servlet's init method is possible, but make sure you do the corresponding shutdown in the servlet's destroy method.
If you go the multithreaded route, make sure everything is thread-safe, since you have a single structure (a list) that stores the data.
If any synchronization is needed (for example, to order the fetched data), make sure you do it right, or you will face deadlocks or slow code.
If any I/O is needed to get the data (which is likely), note that classic Java I/O is blocking, so set appropriate read and connection timeouts, or switch to NIO if you can handle its complexity.
If these changes make the environment too complex and you would like an alternative, you can simply move the data fetching out of the web application and run it as an external daemon service or application, which passes the fetched data to the server by calling one of your CGI scripts/servlets.
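The "1 2 3, then 4 5 6" behaviour from the question can be sketched with a thread-safe queue that a background thread fills and request threads drain. This is a servlet-free illustration. DataHarvester and its methods are hypothetical names, the integers stand in for the fetched records, and in a web app start() and stop() would be called from contextInitialized() and contextDestroyed().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical harvester: one background thread produces, request threads consume.
final class DataHarvester implements Runnable {
    private final BlockingQueue<Integer> ready = new LinkedBlockingQueue<>();
    private final int itemsToProduce;
    private volatile boolean running = true;
    private Thread worker;

    DataHarvester(int itemsToProduce) {
        this.itemsToProduce = itemsToProduce;
    }

    void start() {
        worker = new Thread(this, "harvester");
        worker.setDaemon(true);
        worker.start();
    }

    void stop() {
        running = false;
        if (worker != null) {
            worker.interrupt(); // would wake the worker if it were blocked in a fetch
        }
    }

    @Override
    public void run() {
        for (int i = 1; running && i <= itemsToProduce; i++) {
            ready.add(i); // stand-in for fetching and converting one record
        }
    }

    // Called by a request thread: returns what is ready now, each item at most once.
    List<Integer> takeAvailable() {
        List<Integer> batch = new ArrayList<>();
        ready.drainTo(batch);
        return batch;
    }

    // Demo helper: waits up to 'millis' for the worker to finish producing.
    void awaitDone(long millis) {
        try {
            if (worker != null) {
                worker.join(millis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Each call to takeAvailable() hands out only the items produced since the previous call, so successive requests receive successive slices of the data.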
What's the recommended way of starting a thread from a servlet?
Example: One user posts a new chat message to a game room. I want to send a push notification to all other players connected to the room, but it doesn't have to happen synchronously. Something like:
public class MyChatServlet extends HttpServlet {
protected void doPost(HttpServletRequest request,
HttpServletResponse response)
{
// Update the database with the new chat message.
final String msg = ...;
putMsgInDatabaseForGameroom(msg);
// Now spawn a thread which will deal with communicating
// with apple's apns service, this can be done async.
new Thread() {
public void run() {
talkToApple(msg);
someOtherUnimportantStuff(msg);
}
}.start();
// We can send a reply back to the caller now.
// ...
}
}
I'm using Jetty, but I don't know if the web container really matters in this case.
Thanks
What's the recommended way of starting a thread from a servlet?
You should be very careful when writing threading code in a servlet,
because mistakes (like memory leaks or missing synchronization) can cause bugs that are very hard to reproduce,
or bring down the whole server.
You can start a thread by calling its start() method.
As far as I know, the recommended approach is startAsync (Servlet 3.0).
but I don't know if the web container really matters in this case.
Yes, it matters. Most web servers (Java and otherwise, including JBoss) follow a "one thread per request" model, i.e. each HTTP request is fully processed by exactly one thread.
This thread will often spend most of the time waiting for things like DB requests. The web container will create new threads as necessary.
Hope it will help you.
I would use a ThreadPoolExecutor and submit the tasks to it. The executor can be configured with a fixed/varying number of threads, and with a work queue that can be bounded or not.
The advantages:
The total number of threads (as well as the queue size) can be bounded, so you have good control on resource consumption.
Threads are pooled, eliminating the overhead of thread starting per request
You can choose a task rejection policy (Occurs when the pool is at full capacity)
You can easily monitor the load on the pool
The executor mechanism supports convenient ways of tracking the asynchronous operation (using Future)
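A rough sketch of such a bounded pool follows. The sizes, queue capacity and choice of CallerRunsPolicy are illustrative assumptions rather than recommendations. NotificationPoolDemo and notifyPlayers are hypothetical names, and the lambda stands in for the talkToApple(msg) call from the question.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class NotificationPoolDemo {

    // All numbers below are example values chosen for the sketch.
    static ThreadPoolExecutor newBoundedPool() {
        return new ThreadPoolExecutor(
                2,                              // core threads
                4,                              // upper bound on threads
                30, TimeUnit.SECONDS,           // idle time before extra threads exit
                new ArrayBlockingQueue<>(100),  // bounded work queue
                new ThreadPoolExecutor.CallerRunsPolicy()); // rejection policy when saturated
    }

    // Plays the role of doPost(): queue the slow work and return right away.
    static Future<String> notifyPlayers(ThreadPoolExecutor pool, String msg) {
        return pool.submit(() -> "pushed: " + msg); // stand-in for talkToApple(msg)
    }

    public static void main(String[] args) throws Exception {
        ThreadPoolExecutor pool = newBoundedPool();
        Future<String> pending = notifyPlayers(pool, "hello room");
        System.out.println(pending.get(5, TimeUnit.SECONDS));
        // Simple load monitoring through the executor's own accessors.
        System.out.println("queued=" + pool.getQueue().size()
                + " active=" + pool.getActiveCount());
        pool.shutdown();
    }
}
```

In a web application the pool would typically be created once at startup and shut down when the application stops, rather than per request.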
In general, that is the way. You can start a thread anywhere in a servlet web application.
In particular, though, you should protect your JVM from starting too many threads in response to HTTP requests. Someone may send a very large number of requests, and at some point your JVM will probably stop with an out-of-memory error or something similar.
So a better choice is to use one of the queues found in the java.util.concurrent package.
One option would be to use ExecutorService and its implementations, such as ThreadPoolExecutor, to reuse pooled threads and thus reduce the creation overhead.
You can also use JMS to queue your tasks for later execution.
I have a web application where multiple servlets use a certain amount of identical logic for pre-initialization (setting up logging, session tracking, etc.). What I did was to introduce an intermediary level between javax.servlet.http.HttpServlet and my concrete servlet class:
public abstract class AbstractHttpServlet extends HttpServlet {
// ... some common things ...
}
and then:
public class MyServlet extends AbstractHttpServlet {
// ... specialized logic ...
}
One of the things I do in AbstractHttpServlet's default (and only) constructor is to set a few private member variables. In this case it is a UUID, which serves as a session identifier:
public abstract class AbstractHttpServlet extends HttpServlet {
private UUID sessionUuid;
public AbstractHttpServlet() {
super();
this.sessionUuid = UUID.randomUUID();
// ... there's more, but snipped for brevity ...
}
protected UUID getSessionUuid() {
return this.sessionUuid;
}
}
I then use getSessionUuid() in MyServlet to provide for session tracking within the request. This is very useful e.g. in logging, to be able to sift through a large log file and get all entries relating to a single HTTP request. In principle the session identifier could be anything; I just picked using a UUID because it is easy to generate a random one and there's no need to worry about collisions between different servers, seed issues, searching through the log file turning up a match as a portion of a longer string, etc etc.
I don't see any reason why multiple executions should get the same value in the sessionUuid member variable, but in practice, it appears that they do. It's as if the instance of the class is being reused for multiple requests even over a long period of time (seemingly until the server process is restarted).
In my case, class instantiation overhead is minor compared to the useful work done by the class, so ideally I'd like Tomcat to always create new class instances for each request and thus force it to execute the constructor separately each time. Is it possible to perhaps annotate the class to ensure that it is instantiated per request? Answers that don't require server configuration changes are much preferred.
Failing that, is there a way (other than doing so separately in each do*() method such as doGet(), doPost(), etc.) to ensure that some sort of initialization is done per HTTP request which results in execution of a particular servlet?
It's as if the instance of the class is being reused for multiple requests even over a long period of time (seemingly until the server process is restarted).
Yes, that's exactly what will be happening, and what you should expect.
A servlet isn't meant to be a session - it's just meant to be the handler.
If you want to do "something" on each request, no matter what the method, you can override the service method, take whatever action, and then call super.service(). However, you shouldn't change the state of the servlet itself - bear in mind that multiple requests may execute in the same servlet at the same time.
Basically, what you're asking for goes against the design of servlets - you should work with the design rather than against it. You could modify the request itself (using setAttribute) to store some information related to just this request - but I'd probably do that at a higher level than HTTP anyway. (I'd try to make the servlet itself very small, just delegating to non-servlet-aware classes as far as possible, which makes them easier to test.)
This code is not thread-safe. The servlet container will generally create one instance of the servlet, and all requests will use it. This means that sessionUuid is set once, in the constructor, and is then shared by all requests.
If you need to keep this value on a per request basis, consider using a ThreadLocal object and putting the UUID in there.
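A minimal sketch of the ThreadLocal idea, written without the servlet API. RequestIdDemo is a hypothetical class whose service method plays the role of the servlet's service(): it sets a fresh UUID for the current thread, runs the handler, and always clears the value afterwards.

```java
import java.util.UUID;

final class RequestIdDemo {
    // One slot per thread: each request thread sees only its own UUID.
    private static final ThreadLocal<UUID> REQUEST_ID = new ThreadLocal<>();

    // Plays the role of service(): set the ID, run the handler, always clean up.
    static String service(String path) {
        REQUEST_ID.set(UUID.randomUUID());
        try {
            return handle(path);
        } finally {
            // Container threads are pooled and reused, so never leave the old value behind.
            REQUEST_ID.remove();
        }
    }

    // Any code running on the request thread can read the ID, e.g. as a log prefix.
    private static String handle(String path) {
        return REQUEST_ID.get() + " " + path;
    }

    static boolean hasRequestId() {
        return REQUEST_ID.get() != null;
    }

    public static void main(String[] args) {
        System.out.println(service("/chat"));
        System.out.println(service("/chat")); // a different UUID for the second request
    }
}
```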
It's as if the instance of the class is being reused for multiple requests even over a long period of time.
There is always one instance of a servlet class per JVM at any given point in time. Hence instance variables are not thread-safe in a servlet. Each request to the servlet is processed by a thread. Local variables declared inside service(), doPost() and doGet() are thread-safe.
Hence you can move your logic to some other class, instantiate it inside the service methods and use it in a thread-safe fashion. You can even use ThreadLocal objects.
There is also the option of implementing SingleThreadModel, but it is deprecated, and using it is not only bad but ridiculous. Its documentation says:
Ensures that servlets handle only one request at a time. This interface has no methods.
If a servlet implements this interface, you are guaranteed that no two threads will execute concurrently in the servlet's service method. The servlet container can make this guarantee by synchronizing access to a single instance of the servlet, or by maintaining a pool of servlet instances and dispatching each new request to a free servlet.
Better to implement a ServletRequestListener and put the logic there.
I have a Java HttpServlet. This servlet contains a set of objects that make use of the observer pattern in order to return data through the servlet's Response object. Here is a simplified version of my doGet() method in the HttpServlet:
protected void doGet(final HttpServletRequest request, final HttpServletResponse response) {
    MyProcess process = new MyProcess();
    // This following method spawns a few threads, so I use a listener to receive a completion event.
    process.performAsynchronousMethod(request, new MyListener() {
        public void processComplete(data) {
            response.getWriter().print(data.toString());
        }
    });
}
As the example shows, I have a process that I execute, which spawns a variety of threads in order to produce a final dataset. This process can take anywhere from seconds to a minute. My problem is, it appears that as the doGet() method completes, the response object becomes null. When processComplete() is called, the response object will be null - thus preventing me from writing any data out.
It appears as if the servlet is closing the connection as soon as the asynchronous method is called.
Is there a better way to implement this type of servlet when using the observer pattern for asynchronous tasks? Should I do this in another way?
The servlet response is sent back to the client when the doGet method terminates; it won't wait for your asynchronous call to finish. You will need to find a way to block until all your asynchronous tasks have completed, and only then allow the doGet() method to return.
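One way to block until the listener has fired is a CountDownLatch, sketched here without the servlet API. WaitForListenerDemo and handleRequest are hypothetical names, MyListener is re-declared with an assumed String parameter, and performAsynchronousMethod is a simplified stand-in for the method in the question.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

final class WaitForListenerDemo {

    // Assumed shape of the listener; the real one takes some data object.
    interface MyListener {
        void processComplete(String data);
    }

    // Hypothetical stand-in for MyProcess.performAsynchronousMethod(...).
    static void performAsynchronousMethod(String input, MyListener listener) {
        new Thread(() -> listener.processComplete("data for " + input)).start();
    }

    // Plays the role of doGet(): must not return before the callback has fired.
    static String handleRequest(String input, long timeoutSeconds) {
        CountDownLatch done = new CountDownLatch(1);
        AtomicReference<String> result = new AtomicReference<>();
        performAsynchronousMethod(input, data -> {
            result.set(data);   // hand the data over to the request thread
            done.countDown();   // release the waiting request thread
        });
        try {
            if (!done.await(timeoutSeconds, TimeUnit.SECONDS)) {
                return "timed out";
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
        // Only the request thread touches the "response", after all work is done.
        return result.get();
    }

    public static void main(String[] args) {
        System.out.println(handleRequest("report", 5)); // prints "data for report"
    }
}
```

Because the worker threads only hand over data and the request thread does the writing, the response is never touched after the handler has returned.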
The answers to this question should point you in the right direction.
Something else to watch out for is that you have no guarantee that the threads will write to the response writer one after another. You may find that the various print operations overlap and the output is garbled (this may not matter to you, depending on what the data is and how it will be used).
You could try asynchronous servlets, available since version 3.0 of the servlet specification; not all web servers support them, only fairly modern ones. Note that the server will hold the socket connection open for the whole duration, so you should know how many clients may be connected simultaneously: not every hardware/operating system combination can handle a large number of open connections.
The web client will also be waiting and may time out. You should also consider that the socket connection may be dropped so that the client never gets the result (for example, some proxy servers break long-running connections). So you should support a "resume" operation.