In our application we want to achieve higher throughput, so I want to know how threading works in Spring MVC controllers.
Thanks in advance for your help.
This helped me
http://community.jaspersoft.com/wiki/how-increase-maximum-thread-count-tomcat-level
A web application is hosted in an application server (like Tomcat). Usually the application server manages a thread pool, and every request is handled by a thread.
The web application doesn't have to worry about this thread pool. The size of the thread pool is a parameter of the application server.
To achieve higher throughput you really need to identify the bottleneck.
(In my experience, the size of the application server's thread pool is rarely the root cause of a performance problem.)
Note that the "number of controller instances" is normally one. i.e. a controller is usually a singleton shared/used by all threads, and therefore a controller must be thread-safe.
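To make this concrete, here is a small plain-Java sketch (no Spring involved; the class and field names are made up for illustration) of why a singleton shared by many request threads must be thread-safe: a plain int field can lose increments under concurrent access, while an AtomicInteger cannot.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a singleton controller shared by all request threads.
class SharedHandler {
    int unsafeCount = 0;                                  // plain field: increments can be lost
    final AtomicInteger safeCount = new AtomicInteger();  // atomic: never loses an increment

    void handleRequest() {
        unsafeCount++;               // read-modify-write, not atomic
        safeCount.incrementAndGet(); // atomic read-modify-write
    }
}

class SingletonDemo {
    public static void main(String[] args) throws InterruptedException {
        SharedHandler handler = new SharedHandler();  // one instance, like a Spring singleton
        Thread[] threads = new Thread[8];             // eight "request" threads
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) handler.handleRequest();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        // safeCount always ends at 800000; unsafeCount is typically smaller
        System.out.println("safe=" + handler.safeCount.get() + " unsafe=" + handler.unsafeCount);
    }
}
```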
Let us specify the question a little more: an application of interest, implementing a REST controller, is deployed on a typical multithreaded application server (possibly running other things as well). Q: Is there concurrency in the handling of separate requests to the mapped methods of the controller?
I'm not authoritative on this subject, but it is of high importance (in particular: should single-threaded logic be applied to REST controller code?).
Edit: the answer below is WRONG. Concurrent calls to different methods of the same controller are handled concurrently, and so all shared resources they use (services, repositories, etc.) must ensure thread safety. For some reason, however, calls handled by the same method of the controller are serialized (or so it appears to me as of now).
The small test below shows that even though subsequent (and rapid) calls to the mapped methods of the controller are indeed handled by different threads, single-threaded logic applies (i.e. there is no concurrency "out of the box").
Let us take the controller:
AtomicInteger count = new AtomicInteger();

@RequestMapping(value = {"/xx/newproduct"})
@ResponseBody
public Answer newProduct() {
    Integer atCount = count.incrementAndGet();
    // Any delay/work would do here
    try {
        Thread.sleep(1000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    Answer ans = new Answer("thread:" + Thread.currentThread().getName() + " controller:" + this, atCount);
    count.decrementAndGet();
    return ans;
}
and launch 10 rapid (almost concurrent w.r.t. the 1000ms sleep time) REST requests, e.g. by the AngularJS code
$scope.newProd = function (cnt) {
    var url = $scope.M.dataSource + 'xx/newproduct';
    for (var i = 0; i < cnt; ++i) {
        $http.get(url).success(function (data) {
            console.log(data);
        });
    }
};
(the Answer just carries a String and an Integer; making count static would not change anything). What happens is that all requests become pending concurrently, but the responses come sequentially, exactly 1s apart, and none has atCount > 1. They do come from different threads, though.
More specifically, the console log shows each response coming from a different thread, one second apart, always with atCount equal to 1.
Edit: This shows that concurrent calls to the same method/route are serialized. However, by adding a second method to the controller we can easily verify that calls to that method are handled concurrently with calls to the first one; hence, multithreaded logic for handling requests is mandatory "out of the box".
To profit from multithreading, it therefore seems one should employ traditional explicit methods, such as launching any non-trivial work as a Runnable on an Executor.
Basically this has nothing to do with Spring. Usually each request is handled in a separate thread. So the usual thing to do here is to find the bottleneck.
However, there is a possibility that badly written beans, which share state across thread boundaries and therefore need to be synchronized, might have a very bad effect.
Related
Essentially I've written a service in Java that will do initial synchronous processing (a couple simple calls to other web services). Then, after that processing is done, I return an acknowledgement message to the caller, saying I've verified their request and there is now downstream processing happening in the background asynchronously.
In a nutshell, what I'm concerned about is the complexity of the async processing. The sum of those async calls can take up to 2-3 minutes depending on certain parameters sent. My thought here is: what if there's a lot of traffic at once hitting my service, and there are a bunch of hanging threads in the background, doing a large chunk of processing. Will there be bad data as a result? (like one request getting mixed in with a previous request etc)
The code follows this structure:
1. Validation of headers and params in body
2. Synchronous processing
3. Return acknowledgement message to the caller
4. Asynchronous processing
For #4, I've simply made a new thread and call a method that does all the async processing within it. Like:
new Thread()
{
    @Override
    public void run()
    {
        try {
            makeDownstreamCalls(arg1, arg2, arg3, arg4);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}.start();
I'm basically wondering about unintended consequences of lots of traffic hitting my service. An example I'm thinking about: a thread executing downstream calls for request A, and then another request comes in, and a new thread has to be made to execute downstream calls for request B. How is request B handled in this situation, and what happens to request A, which is still in-progress? Will the async calls in request A just terminate in this case? Or can each distinct request, and thread, execute in parallel just fine and complete, without any strange consequences?
Well, the answer depends on your code, of which you posted only a small part, so my answer contains some guesswork. I'll assume that we're talking about some sort of multi-threaded server which accepts client requests, and that those requests come to some handleRequest() method which performs the 4 steps you've mentioned. I'll also assume that the requests aren't related in any way and don't affect each other (so, for instance, the code doesn't do something like "if a thread already exists from a previous request then don't create a new thread").
If that's the case, then your handleRequest() method can be invoked by different server threads concurrently, and each will execute the four steps you've outlined. If two requests happen simultaneously, then one server thread will execute your handler for request A, and a different one will execute it for request B at the same time. If a new thread is created during the processing of a request, then one will be created for A and another for B. That way, you'll end up with two threads performing makeDownstreamCalls(), one with A's parameters and one with B's.
In practice, that's probably a pretty bad idea. The more threads your program will create, the more context-switching the OS has to do. You really don't want the number of requests to increase the number of threads in your application endlessly. Modern OSes are capable of handling hundreds or even thousands of threads (as long as they're bound by IO, not CPU), but it comes at a cost. You might want to consider using a Java executor with a limited number of threads to avoid crushing your process or even OS.
If there's too much load on a server, you can't expect your application to handle all of it. Process what you can within the limits of the application and reject further requests; this is known as "load shedding". Accepting more requests when you're fully loaded means that your application crashes and none of the requests get processed.
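A sketch of both points in code (the bounded pool and shedding excess load); all names here are illustrative, not from the original service. A ThreadPoolExecutor with a bounded queue and an AbortPolicy rejects work once it is saturated instead of growing without limit:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// A small fixed pool with a bounded queue: once both are full, further
// submissions are rejected instead of piling up threads or memory.
class BoundedPool {
    static final ExecutorService POOL = new ThreadPoolExecutor(
            2, 2,                          // exactly 2 worker threads
            0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(2),   // at most 2 tasks waiting
            new ThreadPoolExecutor.AbortPolicy()); // throw when saturated

    // Returns true if the task was accepted, false if it was shed.
    static boolean tryHandle(Runnable task) {
        try {
            POOL.execute(task);
            return true;
        } catch (RejectedExecutionException overloaded) {
            return false; // shed the load, e.g. answer with HTTP 503
        }
    }
}
```

A caller that gets false back would return an error response immediately rather than letting the request hang.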
If multiple requests are handled by a server running a single servlet, where do we need to take care of synchronization?
I got the answer to how multiple requests are handled from How does a single servlet handle multiple requests from client side. But then again there is the question: why do we need synchronization if all requests are handled separately?
Can you give a real-life example of how shared state works and how a servlet can depend on it? I am not much interested in code, but am looking for an explanation with an example from any portal application, like a login page being accessed by n users concurrently.
From what I read, the server makes a thread pool of n threads to serve the requests, and I guess each thread will have its own set of parameters to maintain the session... so is there any chance that two or more threads (i.e. two or more requests) can collide with each other?
Synchronization is required when multiple threads are modifying a shared resource.
So, when all your servlets are independent of each other, you don't worry about the fact that they run in parallel.
But, if they work on "shared state" somehow (for example by reading/writing values into some sort of centralized data store); then you have to make sure that things don't go wrong. Of course: the layer/form how to provide the necessary synchronization to your application depends on your exact setup.
Yes, my answer is pretty generic; but so is your question.
Synchronization in Java is only needed if the shared object is mutable. If your shared object is read-only or immutable, then you don't need synchronization, despite running multiple threads. The same is true of what the threads are doing with an object: if all the threads only read the value, then you don't require synchronization in Java.
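As a quick illustration of the immutable case (the class here is a made-up example): every field is final and "modification" returns a new object, so any number of threads can share an instance without synchronization.

```java
// Immutable value object: all fields final, no setters, so concurrent
// readers can never observe a partially updated or changing state.
final class ServerConfig {
    private final String host;
    private final int port;

    ServerConfig(String host, int port) {
        this.host = host;
        this.port = port;
    }

    String host() { return host; }
    int port()    { return port; }

    // "Changing" the port produces a new instance; shared state is never mutated.
    ServerConfig withPort(int newPort) {
        return new ServerConfig(host, newPort);
    }
}
```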
Read more
Basically, if your servlet application is multi-threaded, then data associated with the servlet will not be thread safe. The common example given in many textbooks is a hit counter, stored as a private variable, e.g.:
public class YourServlet implements Servlet {

    private int counter;

    public void service(ServletRequest req, ServletResponse res) {
        // this is not thread safe
        counter++;
    }
}
This is because the service method of the Servlet is invoked by multiple threads handling incoming HTTP requests. The unary increment operator has to first read the current value, add one, and then write the value back. Another thread doing the same operation concurrently may increment the value after the first thread has read it, but before it is written back, resulting in a lost write.
So in this case you should use synchronisation, or even better, the AtomicInteger class included as part of the Java concurrency utilities from Java 1.5 onwards.
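For reference, a sketch of the counter rewritten with AtomicInteger; the servlet plumbing is stripped away so only the plain-Java core of the fix remains:

```java
import java.util.concurrent.atomic.AtomicInteger;

// The thread-safe core of the hit counter: incrementAndGet() performs the
// read-add-write as one atomic step, so concurrent requests cannot lose writes.
class HitCounter {
    private final AtomicInteger counter = new AtomicInteger();

    // Called concurrently by many request threads, as service() would be.
    int recordHit() {
        return counter.incrementAndGet();
    }

    int total() {
        return counter.get();
    }
}
```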
I have a web application that retrieves a (large) list of results from the database, then needs to pare down the list by looking at each result, and throwing out "invalid" ones. The parameters that make a result "invalid" are dynamic, and we cannot pass the work on to the database.
So, one idea is to create a thread pool and ExecutorService and check these results concurrently. But I keep seeing people saying "Oh, the spec prohibits spawning threads in a servlet" or "that's just a bad idea".
So, my question: what am I supposed to do? I'm in a Servlet 2.5 container, so all the asynchronous goodies that are part of the 3.0 spec are unavailable to me. Writing a separate service that I communicate with via JMS seems like overkill.
Looking for expert advice here.
Jason
Nonsense.
The JEE spec has lots of "should nots" and "thou shalt nots". The Servlet spec, on the other hand, has none of that. The Servlet spec is much more wild west. It really doesn't dive into the actual operational aspects the way the JEE spec does.
I've yet to see a JEE container (either a pure servlet container ala Tomcat/Jetty, or full boat ala Glassfish/JBoss) that actually prevented me from firing off a thread on my own. WebSphere might, it's supposed to be rather notorious, but I've not used WebSphere.
If the concept of creating unruly, self-managed threads makes you itch, then the full JEE containers internally have a formal "WorkManager" that can be used to peel threads off of. They just all expose them in different ways. That's the more "by the book-ish" mechanism for getting a thread.
But, frankly, I wouldn't bother. You'll likely have more success using the Executors out of the standard class library. If you saturate your system with too many threads and everything gets out of hand, well, that's on you. Don't Do That(tm).
As to whether an async solution is even appropriate, I'll punt on that. It's not clear from your post whether it is or not. But your question was about threads and Servlets.
Just Do It. Be aware it "may not be portable", do it right (use an Executor), take responsibility for it, and the container won't be the wiser, nor care.
Doesn't look like concurrency will help you much here. Unless it's very expensive to check each entry, making that check concurrent won't speed things up. Your bottleneck is passing the result set through the database connection, and you couldn't multithread that even if you weren't working on a servlet.
There's nothing to stop you from hitting some ThreadPool from your Servlet; the challenge comes in getting the results. If the Servlet invocation is expecting some result from your submission of a Task to the ThreadPool, you will end up blocking, waiting for the ThreadPool work to finish so you can compose a response to the doGet/doPut invocation.
If, on the other hand, you devise your service such that a doPut, for example, submits a Task to a ThreadPool but gets back a "handle" or some other unique identifier of the Task, returning that to the client, then the client can "poll" the handle through some doGet API to see if the task is done. When the task is done, the client can get the results.
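A minimal sketch of that submit-then-poll idea (all names are illustrative and the servlet calls themselves are left out): doPut would call submit() and return the handle to the client, and doGet would call poll(handle) until a result appears.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical task registry: work runs in a pool, clients poll by handle.
class TaskRegistry {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);
    private final Map<String, Future<String>> tasks = new ConcurrentHashMap<>();

    // Accept work and return a unique handle immediately (doPut would call this).
    String submit(Callable<String> work) {
        String handle = UUID.randomUUID().toString();
        tasks.put(handle, pool.submit(work));
        return handle;
    }

    // Return the result if done, or null if still running (doGet would call this).
    String poll(String handle) throws Exception {
        Future<String> f = tasks.get(handle);
        if (f == null || !f.isDone()) return null;
        tasks.remove(handle); // handle is consumed once the result is retrieved
        return f.get();
    }

    void shutdown() { pool.shutdown(); }
}
```

This keeps each doGet/doPut short-lived while the real work proceeds in the pool.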
It's completely fine and appropriate. I have done countless work with Servlets that use thread pools on different containers without any problems whatsoever.
EJB containers (like JBoss) tend to warn against spawning threads, but this is because EJB guarantees that an instance of a Bean is only called by one thread, and some of the facilities rely on this and thus you could mess that up by using your own threads. In Servlet there is no such reliance and hence nothing you can mess up this way.
Even in EJB containers, you can use thread pools and be fine as long as you don't interact (like call) with EJB facilities from your own threads.
The thing to watch out for with servlet/threads is that member variables of the servlet need to be thread safe.
Technically nothing stops you from using a thread pool in your servlet to do some post processing but you could shoot yourself in the foot if you create a static thread pool with say 20 threads and 50 clients access your servlet concurrently because 30 clients will be waiting (depending on how long your post-processing takes).
Is a typical J2EE web application, or any web app built on top of Java, a multi-threaded application? Do I have to keep race conditions or concurrent modification in mind every time I write some code?
Is a typical J2EE web application or any web app built on top of Java a multi-threaded application?
Yes, it is. But the application server (Tomcat, JBoss, WebSphere, etc.) handles the threads and resources for you, so you usually don't need to worry about race conditions or concurrent modification.
When should you worry about concurrent modification? For example, if you happen to create a field in a Servlet and you update this field on every request (in the doPost or doGet method of the servlet), then two users on their PCs could perform a request to the same URL at the same time, and this field will have an unexpected value. This is covered here: How do servlets work? Instantiation, sessions, shared variables and multithreading (the Threadsafety section of the accepted answer). Note that a design like this is bad practice.
Another case may be firing new threads on your own, with resources shared among those threads. It is neither good nor bad practice; rather, you must understand the risk you're taking and accept the consequences. This means you can have a Servlet and fire threads on your own, but it's up to you to handle this the right way. Note that you should evaluate whether you really need to fire and handle threads in your Java EE application, or whether you could use another approach, such as firing JMS messages that will handle multiple requests in parallel and asynchronously.
@AndreiI noted in his/her answer that EJB prohibits using threads. This means that you cannot manage threads inside an EJB, neither by creating a new instance of Thread nor by using ExecutorService or any other means. In code:
@Stateless
public class FooEJB {

    public void bar() {
        // this is not allowed!
        Thread t = new Thread(new Runnable() {
            // implementation of runnable
        });
        t.start();
    }

    public void baz() {
        // this is not allowed either!
        final int numberOfThreads = ...;
        ExecutorService es = Executors.newFixedThreadPool(numberOfThreads);
        es.execute(new Runnable() { ... });
        es.shutdown();
    }
}
Like almost any framework in Java (server applications, including Web frameworks, or GUI applications based on AWT or Swing), Java EE is multi-threaded. But the answer to your question is no: you do not have to care about race conditions or concurrent modification in general. Of course you are not allowed to make certain errors (like sharing Servlet instance variables), but in a typical application you do not care about such things. For example, the EJB specification prohibits using threads, but it has a mechanism for asynchronous jobs. Excerpt from the EJB specification:
The enterprise bean must not attempt to manage threads. The enterprise
bean must not attempt to start, stop, suspend, or resume a thread, or
to change a thread’s priority or name. The enterprise bean must not
attempt to manage thread groups.
Also the most used interface in the JPA specification (EntityManager) is not thread safe, although others are.
In a Java EE application the container takes care of the threading for you. Typically it creates one thread per request. However, using Spring or EJB you can declare different scopes for your beans. So you should not have to directly manage threads in a Java EE application.
I'd like to multithread my GAE servlets so that the same servlet on the same instance can handle up to 10 (on frontend instance I believe the max # threads is 10) concurrent requests from different users at the same time, timeslicing between each of them.
public class MyServlet implements HttpServlet {

    private ExecutorService executor;

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) {
        if (executor == null) {
            ThreadFactory threadFactory = ThreadManager.currentRequestThreadFactory();
            executor = Executors.newCachedThreadPool(threadFactory);
        }
        Future<MyResult> result = executor.submit(new MyTask(request));
        writeResponseAndReturn(response, result);
    }
}
So basically when GAE starts up, the first time it gets a request to this servlet, an Executor is created and then saved. Then each new servlet request uses that executor to spawn a new thread. Obviously everything inside MyTask must be thread-safe.
What I'm concerned about is whether or not this truly does what I'm hoping it does. That is, does this code create a non-blocking servlet that can handle multiple requests from multiple users at the same time? If not, why and what do I need to do to fix it? And, in general, is there anything else that a GAE maestro can spot that is dead wrong? Thanks in advance.
I don't think your code would work.
The doGet method runs in threads managed by the servlet container. When a request comes in, a servlet thread is occupied, and it will not be released until the doGet method returns. In your code, executor.submit returns a Future object. To get the actual result you need to invoke the get method on that Future, and it blocks until MyTask finishes its work. Only after that does doGet return, and only then can new requests kick in.
I am not familiar with GAE, but according to their docs, you can declare your servlet as thread-safe and then the container will dispatch multiple requests to each web server in parallel:
<!-- in appengine-web.xml -->
<threadsafe>true</threadsafe>
You implicitly asked two questions, so let me answer both:
1. How can I get my AppEngine Instance to handle multiple concurrent requests?
You really only need to do two things:
Add the statement <threadsafe>true</threadsafe> to your appengine-web.xml file, which you can find in the war\WEB-INF folder.
Make sure that the code inside all your request handlers is actually thread-safe, i.e. use only local variables in your doGet(...), doPost(...), etc. methods or make sure you synchronize all access to class or global variables.
This will tell the AppEngine instance server framework that your code is thread-safe and that you are allowing it to call all of your request handlers multiple times in different threads to handle several requests at the same time. Note: AFAIK, it is not possible to set this on a per-servlet basis. So, ALL your servlets need to be thread-safe!
So, in essence, the executor-code you posted is already included in the server code of each AppEngine instance, and actually calls your doGet(...) method from inside the run method of a separate thread that AppEngine creates (or reuses) for each request. Basically doGet() already is your MyTask().
The relevant part of the Docs is here (although it doesn't really say much): https://developers.google.com/appengine/docs/java/config/appconfig#Using_Concurrent_Requests
2. Is the posted code useful for this (or any other) purpose?
AppEngine in its current form does not allow you to create and use your own threads to accept requests. It only allows you to create threads inside your doGet(...) handler, using the currentRequestThreadFactory() method you mentioned, but only to do parallel processing for this one request and not to accept a second one in parallel (this happens outside doGet()).
The name currentRequestThreadFactory() might be a little misleading here. It does not mean that it will return the current Factory of RequestThreads, i.e. threads that handle requests. It means that it returns a Factory that can create Threads inside the currentRequest. So, unfortunately it is actually not even allowed to use the returned ThreadFactory beyond the scope of the current doGet() execution, like you are suggesting by creating an Executor based on it and keeping it around in a class variable.
For frontend instances, any threads you create inside a doGet() call will get terminated immediately when your doGet() method returns. For backend instances, you are allowed to create threads that keep running, but since you are not allowed to open server sockets for accepting requests inside these threads, these will still not allow you to manage the request handling yourself.
You can find more details on what you can and cannot do inside an appengine servlet here:
The Java Servlet Environment - The Sandbox (specifically the Threads section)
For completeness, let's see how your code can be made "legal":
The following should work, but it won't make a difference in terms of your code being able to handle multiple requests in parallel. That is determined solely by the <threadsafe>true</threadsafe> setting in your appengine-web.xml. So, technically, this code is just really inefficient and splits an essentially linear program flow across two threads. But here it is anyway:
public class MyServlet implements HttpServlet {

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) throws Exception {
        ThreadFactory threadFactory = ThreadManager.currentRequestThreadFactory();
        ExecutorService executor = Executors.newCachedThreadPool(threadFactory);
        Future<MyResult> result = executor.submit(new MyTask(request)); // Fires off request handling in a separate thread
        writeResponse(response, result.get()); // Waits for the thread to complete and builds the response. After that, doGet() returns
    }
}
Since you are already inside a separate thread that is specific to the request you are currently handling, you should definitely save yourself the "thread inside a thread" and simply do this instead:
public class MyServlet implements HttpServlet {

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) {
        writeResponse(response, new MyTask(request).call()); // Delegate request handling to MyTask in the current thread and write out the returned response
    }
}
Or, even better, just move the code from MyTask.call() into the doGet() method. ;)
Aside - Regarding the limit of 10 simultaneous servlet threads you mentioned:
This is a (temporary) design decision that allows Google to control the load on their servers more easily (specifically, the memory use of servlets).
You can find more discussion on those issues here:
Issue 7927: Allow configurable limit of concurrent requests per instance
Dynamic Backend Instance Scaling
If your bill shoots up due to increased latency, you may not be refunded the charges incurred
This topic has been bugging the heck out of me, too, since I am a strong believer in ultra-lean servlet code, so my usual servlets could easily handle hundreds, if not thousands, of concurrent requests. Having to pay for more instances due to this arbitrary limit of 10 threads per instance is a little annoying to me to say the least. But reading over the links I posted above, it sounds like they are aware of this and are working on a better solution. So, let's see what announcements Google I/O 2013 will bring in May... :)
I second the assessments of ericson and Markus A.
If however, for some reason (or for some other scenario) you want to follow the path that uses your code snippet as a starting point, I'd suggest that you change your executor definition to:
private static Executor executor;
so that it becomes static across instances.