Concurrency : Handling multiple submits in a web application

Concurrency : Handling multiple submits in a web application - java

This is a recent interview question to my friend:
How would you handle a situation where users enter some data in the screen and let's say 5 of them clicked on the Submit button *the SAME time ?*
(By same time,the interviewer insisted that they are same to the level of nanoseconds)
My answer was just to make the method that handles the request synchronized and only one request can acquire the lock on the method at a given time.
But it looks like the interviewer kept insisting there was a "better way" to handle it .
One other approach to handle locking at the database level, but I don't think it is "better".
Are there any other approaches. This seems to be a fairly common problem.

If you have only one network card, you can only have one request coming down it at once. ;)
The answer he is probably looking for is something like
Make the servlet stateless so they can be executed concurrently.
Use components which allow thread safe concurrent access like Atomic* or Concurrent*
Use locks only where you obsolutely have to.
What I prefer to do is to make the service so fast it can respond before the next resquest can come in. ;) Though I don't have the overhead of Java EE or databases to worry about.

Does it matter that they click at the same time e.g. are they both updating the same record on a database?
A synchronized method will not cut it, especially if it's a webapp distributed amongst multiple JVMs. Also the synchronized method may block, but then the other threads would just fire after the first completes and you'd have lost writes.
So locking at database level seems to be the option here i.e. if the record has been updated, report an error back to the users whose updates were serviced after the first.

You do not have to worry about this as web server launches each request in isolated thread and manages it.
But if you have some shared resource like some file for logging then you need to achieve concurrency and put thread lock on it in request and inter requests

Related

Threading and Concurrency Within A Servlet

I have a web application that retrieves a (large) list of results from the database, then needs to pare down the list by looking at each result, and throwing out "invalid" ones. The parameters that make a result "invalid" are dynamic, and we cannot pass the work on to the database.
So, one idea is to create a thread pool and ExecutorService and check these results concurrently. But I keep seeing people saying "Oh, the spec prohibits spawning threads in a servlet" or "that's just a bad idea".
So, my question: what am I supposed to do? I'm in a servlet 2.5 container, so all the asynchrous goodies as part of the 3.0 spec are unavailable to me. Writing a separate service that I communicate with via JMS seems like overkill.
Looking for expert advice here.
Jason

Nonsense.
The JEE spec has lots of "should nots" and "thou shant's". The Servlet spec, on the other hand, has none of that. The Servlet spec is much more wild west. It really doesn't dive in to the actual operational aspects like the JEE spec does.
I've yet to see a JEE container (either a pure servlet container ala Tomcat/Jetty, or full boat ala Glassfish/JBoss) that actually prevented me from firing off a thread on my own. WebSphere might, it's supposed to be rather notorious, but I've not used WebSphere.
If the concept of creating unruly, self-managed threads makes you itch, then the full JEE containers internally have a formal "WorkManager" that can be used to peel threads off of. They just all expose them in different ways. That's the more "by the book-ish" mechanism for getting a thread.
But, frankly, I wouldn't bother. You'll likely have more success using the Executors out of the standard class library. If you saturate your system with too many threads and everything gets out of hand, well, that's on you. Don't Do That(tm).
As to whether an async solution is even appropriate, I'll punt on that. It's not clear from your post whether it is or not. But your question was about threads and Servlets.
Just Do It. Be aware it "may not be portable", do it right (use an Executor), take responsibility for it, and the container won't be the wiser, nor care.

Doesn't look like concurrency will help you much here. Unless it's very expensive to check each entry, making that check concurrent won't speed things up. Your bottleneck is passing the result set through the database connection, and you couldn't multithread that even if you weren't working on a servlet.

There's nothing to stop you from hitting some ThreadPool from your Servlet, the challenge comes in getting the results. If the Servlet invocation is expecting some result from your submission of a Task to the TreadPool you will end up blocking waiting for the TreadPool stuff to finish so you can compose a response to the doGet/doPut invocation.
If, on the other hand, you devise your service such that a doPut, for example, submits a Task to a ThreadPool but gets back a "handle" or some other unique identifier of the Task returning that to the client, then the client can "poll" the handle through some doGet API to see if the task is done. When the task is done, the client can get the results.

It's completely fine and appropriate. I have done countless work with Servlets that use thread pools on different containers without any problems whatsoever.
EJB containers (like JBoss) tend to warn against spawning threads, but this is because EJB guarantees that an instance of a Bean is only called by one thread, and some of the facilities rely on this and thus you could mess that up by using your own threads. In Servlet there is no such reliance and hence nothing you can mess up this way.
Even in EJB containers, you can use thread pools and be fine as long as you don't interact (like call) with EJB facilities from your own threads.

The thing to watch out for with servlet/threads is that member variables of the servlet need to be thread safe.
Technically nothing stops you from using a thread pool in your servlet to do some post processing but you could shoot yourself in the foot if you create a static thread pool with say 20 threads and 50 clients access your servlet concurrently because 30 clients will be waiting (depending on how long your post-processing takes).

Use threads as "sessions"

I am developing a text-based game, MUD. I have the base functions of the program ready, and now I would like to allow to connect more than one client at a time. I plan to use threads to accomplish that.
In my game I need to store information such as current position or health points for each player. I could hold it in the database, but as it will change very quick, sometimes every second, the use of database would be inefficient (am I right?).
My question is: can threads behave as "sessions", ie hold some data unique to each user?
If yes, could you direct me to some resources that I could use to help me understand how it works?
If no, what do you suggest? Is database a good option or would you recommend something else?
Cheers,
Eleeist

Yes, they can, but this is a mind-bogglingly stupid way to do things. For one thing, it permanently locks you into a "one thread per client" model. For another thing, it makes it difficult (maybe even impossible) to implement interactions between users, which I'm sure your MUD has.
Instead, have a collection of some kind that stores your users, with data on each user. Save persistent data to the database, but you don't need to update ephemeral data on every change.
One way to handle this is to have a "changed" boolean in each user. When you make a critical change to a user, write them to the database immediately. But if it's a routine, non-critical change, just set the "changed" flag. Then have a thread come along every once in a while and write out changed users to the database (and clear the "changed" flag).
Use appropriate synchronization, of course!

A Thread per connection / user session won't scale. You can only have N number of threads active where N is equal to the number of physical cores / processors your machine has. You are also limited by the amount of memory in your machine for how many threads you can create a time, some operating systems just put arbitrary limits as well.
There is nothing magical about Threads in handling multiple clients. They will just make your code more complicated and less deterministic and thus harder to reason about what is actually happening when you start hunting logic errors.
A Thread per connection / user session would be an anti-pattern!
Threads should be stateless workers that pull things off concurrent queues and process the data.
Look at concurrent maps for caching ( or use some appropriate caching solution ) and process them and then do something else. See java.util.concurrent for all the primitive classes you need to implement something correctly.

Instead of worrying about threads and thread-safety, I'd use an in-memory SQL database like HSQLDB to store session information. Among other benefits, if your MUD turns out to be the next Angry Birds, you could more easily scale the thing up.

Definitely you can use threads as sessions. But it's a bit off the mark.
The main point of threads is the ability of concurrent, asynchronous execution. Most probably, you don't want events received from your MUD clients to happen in an parallel, uncontrolled order.
To ensure consistency of the world I'd use an in-memory database to store the game world. I'd serialize updates to it, or at least some updates to it. Imagine two players in parallel hitting a monster with HP 100. Each deals 100 damage. If you don't serialize the updates, you could end up giving credit for 100 damage to both players. Imagine two players simultaneously taking loot from the monster. Without proper serialization they could end up each with their own copy of the loot.
Threads, on the other hand, are good for asynchronous communication with clients. Use threads for that, unless something else (like a web server) does that for you already.

ThreadLocal is your friend! :)
http://docs.oracle.com/javase/6/docs/api/java/lang/ThreadLocal.html
ThreadLocal provides storage on the Thread itself. So the exact same call from 2 different threads will return/store different data.
The biggest danger is having a leak between Threads. You would have to be absolutely sure that if a different user used a Thread that someone else used, you would reset/clear the data.

java methods and race condition in a jsp/servlets application

Suppose that I have a method called doSomething() and I want to use this method in a multithreaded application (each servlet inherits from HttpServlet).I'm wondering if it is possible that a race condition will occur in the following cases:
doSomething() is not staic method and it writes values to a database.
doSomething() is static method but it does not write values to a database.
what I have noticed that many methods in my application may lead to a race condition or dirty read/write. for example , I have a Poll System , and for each voting operation, a certain method will change a single cell value for that poll as the following:
[poll_id | poll_data ]
[1 | {choice_1 : 10, choice_2 : 20}]
will the JSP/Servlets app solve these issues by itself, or I have to solve all that by myself?
Thanks..

It depends on how doSomething() is implemented and what it actually does. I assume writing to the database uses JDBC connections, which are not threadsafe. The preferred way of doing that would be to create ThreadLocal JDBC connections.
As for the second case, it depends on what is going on in the method. If it doesn't access any shared, mutable state then there isn't a problem. If it does, you probably will need to lock appropriately, which may involve adding locks to every other access to those variables.
(Be aware that just marking these methods as synchronized does not fix any concurrency bugs. If doSomething() incremented a value on a shared object, then all accesses to that variable need to be synchronized since i++ is not an atomic operation. If it is something as simple as incrementing a counter, you could use AtomicInteger.incrementAndGet().)

The Servlet API certainly does not magically make concurrency a non-issue for you.
When writing to a database, it depends on the concurrency strategy in your persistence layer. Pessimistic locking, optimistic locking, last-in-wins? There's way more going on when you 'write to a database' that you need to decide how you're going to handle. What is it you want to have happen when two people click the button at the same time?
Making doSomething static doesn't seem to have too much bearing on the issue. What's happening in there is the relevant part. Is it modifying static variables? Then yes, there could be race conditions.

The servlet api will not do anything for you to make your concurrency problems disappear. Things like using the synchronized keyword on your servlets are a bad idea because you are basically forcing your threads to be processed one at a time and it ruins your ability to respond quickly to multiple users.
If you use Spring or EJB3, either one will provide threadlocal database connections and the ability to specify transactions. You should definitely check out one of those.

Case 1, your servlet uses some code that accesses a database. Databases have locking mechanisms that you should exploit. Two important reasons for this: the database itself might be used from other applications that read and write that data, it's not enough for your app to deal with contending with itself. And: your own application may be deployed to a scaled, clustered web container, where multiple copies of your code are executing on separate machines.
So, there are many standard patterns for dealing with locks in databases, you may need to read up on Pessimistic and Optimistic Locking.
The servlet API and JBC connection pooling gives you some helpful guarantees so that you can write your servlet code without using Java synchronisation provided your variables are in method scope, in concept you have
Start transaction (perhaps implicit, perhaps on entry to an ejb)
Get connection to DB ( Gets you a connection from pool, associated with your tran)
read/write/update code
Close connection (actually keeps it for your thread until your transaction commits)
Commit (again maybe implictly)
So your only real issue is dealing with any contentions in the DB. All of the above tends to be done rather more nicely using things such as JPA these days, but under the covers thats more or less what's happening.
Case 2: static method, this presumably implies that you now keep everything in a memory structure. This (barring remote invocation of some sort) impies a single JVM and you managing your own locking. Should your JVM or machine crash I guess you lose your data. If you care about your data then using a DB is probably better.
OR, how about a completely other approach: servlet simply records the "vote" by writing a message to a persistent JMS queue. Have some other processes pick up the votes from the queue and adds them up. You won't give immediate feedback to the voter this way, but you decouple the user's experience from the actual (in similar scenarios) quite complex processing .

I thing that the best solution for your problem is to use something like "synchronized" keyword and wait/notify!

Session management using Hibernate in a multi-threaded Swing application

I'm currently working on a (rather large) pet project of mine , a Swing application that by it's very nature needs to be multi-threaded. Almost all user interactions might fetch data from some remote servers over the internet , since I neither control these servers nor the internet itself, long response times are thus inevitable. A Swing UI obviously cannot repaint itself while the EDT is busy so all remote server calls need to be executed by background thread(s).
My problem:
Data fetched by the background threads gets 'enriched' with data from a local (in-memory) database (remote server returns IDs/references to data in the local database). This data later eventually gets passed to the EDT where it becomes part of the view model. Some entities are not completely initialized at this point (lazy-fetching enabled) so the user might trigger lazy-fetching by e.g. scrolling in a JTable. Since the hibernate session is already closed this will trigger a LazyInitializationException. I can't know when lazy-fetching might be triggered by the user so creating a session on demand/attaching the detached object will not work here.
I 'solved' this problem by:
using a single (synchronized , since Session instances are not thread-safe) Session for the whole application
disabling lazy-fetching completely
While this works, the application's performance has suffered greatly (sometimes being close to unusable). The slowdown is mainly caused by the large number of objects that are now fetched by each query.
I'm currently thinking about changing the application's design to 'Session-per-thread' and migrating all entities fetched by non-EDT threads to the EDT thread's Session (similar to this posting on the Hibernate forums).
Side-note: Any problems related to database updates do not apply since all database entities are read-only (reference data).
Any other ideas on how to use Hibernate with lazy-loading in this scenario ?

Don't expose the Session itself in your data API. You can still do it lazily, just make sure that the hydration is being done from the 'data' thread each time. You could use a block (runnable or some kind of command class is probably the best Java can do for you here unfortunately) that's wrapped by code that performs the load async from the 'data' thread. When you're in UI code, (on the UI thread of course) field some kind of a 'data is ready' event that is posted by the data service. You can then get the data from the event use in the UI.

You could look have a look at Ebean ORM. It is session-less and lazy loading just works. This doesn't answer your question but really proposes an alternative.
I know Ebean has built in support for asynchronous query execution which may also be interesting for your scenario.
Maybe worth a look.
Rob.

There are two distinct problems, that should get resolved seperately:
Handling of Hibernate Sessions in Swing Applications. Let me recommend my own article, regarding this problem: http://blog.schauderhaft.de/2008/09/28/hibernate-sessions-in-two-tier-rich-client-applications/
The basic idea is to have a session for every frame, excluding modal frames which use the session of the spawning frame. It is not easy but it works. Meaning, you won't get any LLEs anymore.
How to get your GUI thread separated from the back end.
I recommend to keep the hibernate objects strictly on the back end thread they originate from. Only give wrapper objects to the ETD. If these wrapper objects are asked for a value, they create a request which gets passed to the backend thread, which eventually will return the value.
I'd envision three kinds of wrapper Implementations:
Async: requests the value, and gets notified when the value is available. It would return immediately with some dummy value. On notification it will fire a PropertyChange event i.O. to inform the GUI about the 'changed' value (changed from unknown to a real value).
Sync: requests the value and waits for it to be available.
Timed: a mixture between the two, waiting for a short time (0.01) seconds, before returning. This would avoid plenty change events, compared to the async version.
As a basis for these wrappers a recommend the ValueModel of the JGoodies Binding library: http://www.jgoodies.com/downloads/libraries.html
Obviously You need to take care that any action is only performed on actually loaded values, but since you don't plan on doing updates this shouldn't be to much of an issue.
Let me end with a warning: I have thought about it a lot, but never actually tried it, so move with care.

Java Concurrency - Web Application

I think I found more bugs in my web application. Normally, I do not worry about concurrency issues, but when you get a ConcurrentModificationException, you begin to rethink your design.
I am using JBoss Seam in conjuction with Hibernate and EHCache on Jetty. Right now, it is a single application server with multiple cores.
I briefly looked over my code and found a few places that haven't thrown an exception yet, but I am fairly sure they can.
The first servlet filter I have basically checks if there are messages to notify the user of an event that occurred in the background (from a job, or another user). The filter simply adds messages to the page in a modal popup. The messages are stored on the session context, so it is possible another request could pull the same messages off the session context.
Right now, it works fine, but I am not hitting a page with many concurrent requests. I am thinking that I might need to write some JMeter tests to ensure this doesn't happen.
The second servlet filter logs all incoming requests along with the session. This permits me to know where the client is coming from, what browser they're running, etc. The problem I am seeing more recently is on image gallery pages (where there are many requests at about the same time), I end up getting a concurrent modification exception because I'm adding a request to the session.
The session contains a list of requests, this list appears to be being hit by multiple threads.
#Entity
public class HttpSession
{
protected List<HttpRequest> httpRequests;
#Fetch(FetchMode.SUBSELECT)
#OneToMany(mappedBy = "httpSession")
public List<HttpRequest> getHttpRequests()
{return(httpRequests);}
...
}
#Entity
public class HttpRequest
{
protected HttpSession httpSession;
#ManyToOne(optional = false)
#JoinColumn(nullable = false)
public HttpSession getHttpSession()
{return(httpSession);}
...
}
In that second servlet filter, I am doing something of the sort:
httpSession.getHttpRequests().add(httpRequest);
session.saveOrUpdate(httpSession);
The part that errors out is when I do some comparison to see what changed from request to request:
for(HttpRequest httpRequest:httpSession.getHttpRequests())
That line there blows up with a concurrent modification exception.
Things to walk away with:
1. Will JMeter tests be useful here?
2. What books do you recommend for writing web applications that scale under concurrent load?
3. I tried placing synchronized around where I think I need it, ie on the method that loops through the requests, but it still fails. What else might I need to do?
I added some comments:
I had thought about making the logging of the http requests a background task. I can easily spawn a background task to save that information. I am trying to remember why I didn't evaluate that too much. I think there is some information that I would like to have access to on the spot.
If I made it asynchronous, that would speed up the throughput quite a bit - well I'd have to use JMeter to measure those differences.
I would still have to deal with the concurrency issue there.
Thanks,
Walter

A ConcurrentModificationException occurrs when any collection is modified while iterating over it. You can do it in a single thread, e.g.:
for( Object o : someList ) {
someList.add( new Object() );
}
Wrap your list with Collections.synchronizedList or return an unmodifiable copy of the list.

I'm not sure about scaling web applications in particular, but Java Concurrency in Practice is a fantastic book on concurrency in general.
The list should be replaced with a version that is threadsafe, or all access to it has to be synchronized (readers and writers) on the same object. It is not enough to synchronize just the method that reads from the list.

It's been caused because the list has been modified by another request while you're still iterating over it in one request. Replacing the List by ConcurrentLinkedQueue (click link to see javadoc) should fix the particular problem.
As to your other questions:
1: Will JMeter tests be useful here?
Yes, it is certainly useful to stress-test webapplications and spot concurrency bugs.
2: What books do you recommend for writing web applications that scale under concurrent load?
Not specific tied to webapplications, but more to concurrency in general, the book Concurrency in Practice is the most recommended one in that area. You can perfectly apply the learned things on webapplications as well, they are a perfect real world example of "heavy concurrent" applications.
3: I tried placing synchronized around where I think I need it, ie on the method that loops through the requests, but it still fails. What else might I need to do?
You basically need to synchronize any access to the list on the same lock. But just replacing by ConcurrentLinkedQueue is easier.

You're getting an exception on the iterator, because another thread is altering the collection backing the iterator while you're in mid-iteration.
You could wrap access to the list in synchronized access (both adding and iterating) but there are problems with this, as it could take significantly longer to iterate through a list, along with the processing that goes along with it, and you'd be holding the lock to the list for all of that time.
Another option would be to copy the list and pass out the copy for iteration, which might be a better idea if the objects are small, as you'd only be holding the lock while you make the copy, rather than while you're iterating through the list.
Store your values in a ConcurrentHashMap, which uses lock striping to minimize lock contention. You could then have your get method return a copied list of the keys you want, rather than the complete objects, and access them one at at time directly from the map.
As is mentioned in another answer here, Java Concurrency in Practice is a great book.

The other posters are correct in stating that you need to be writing to a threadsafe data structure. In doing so, you may slow down your response time due to thread contention. Since this essentially a logging operation that is a side effect of the request itself (Or am I not understanding you correctly?) you could spawn a new thread responsible for writing to the threadsafe data structure. That allows you to proceed with the actual response instead of burning response time on a logging operation. It might be worth investigating setting up a threadpool to reduce the time required to use the logging threads.
Any concurrency book by Doug Lea is worth reading.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.