java methods and race condition in a jsp/servlets application - java

Suppose that I have a method called doSomething() and I want to use this method in a multithreaded application (each servlet inherits from HttpServlet).I'm wondering if it is possible that a race condition will occur in the following cases:
doSomething() is not staic method and it writes values to a database.
doSomething() is static method but it does not write values to a database.
what I have noticed that many methods in my application may lead to a race condition or dirty read/write. for example , I have a Poll System , and for each voting operation, a certain method will change a single cell value for that poll as the following:
[poll_id | poll_data ]
[1 | {choice_1 : 10, choice_2 : 20}]
will the JSP/Servlets app solve these issues by itself, or I have to solve all that by myself?
Thanks..

It depends on how doSomething() is implemented and what it actually does. I assume writing to the database uses JDBC connections, which are not threadsafe. The preferred way of doing that would be to create ThreadLocal JDBC connections.
As for the second case, it depends on what is going on in the method. If it doesn't access any shared, mutable state then there isn't a problem. If it does, you probably will need to lock appropriately, which may involve adding locks to every other access to those variables.
(Be aware that just marking these methods as synchronized does not fix any concurrency bugs. If doSomething() incremented a value on a shared object, then all accesses to that variable need to be synchronized since i++ is not an atomic operation. If it is something as simple as incrementing a counter, you could use AtomicInteger.incrementAndGet().)

The Servlet API certainly does not magically make concurrency a non-issue for you.
When writing to a database, it depends on the concurrency strategy in your persistence layer. Pessimistic locking, optimistic locking, last-in-wins? There's way more going on when you 'write to a database' that you need to decide how you're going to handle. What is it you want to have happen when two people click the button at the same time?
Making doSomething static doesn't seem to have too much bearing on the issue. What's happening in there is the relevant part. Is it modifying static variables? Then yes, there could be race conditions.

The servlet api will not do anything for you to make your concurrency problems disappear. Things like using the synchronized keyword on your servlets are a bad idea because you are basically forcing your threads to be processed one at a time and it ruins your ability to respond quickly to multiple users.
If you use Spring or EJB3, either one will provide threadlocal database connections and the ability to specify transactions. You should definitely check out one of those.

Case 1, your servlet uses some code that accesses a database. Databases have locking mechanisms that you should exploit. Two important reasons for this: the database itself might be used from other applications that read and write that data, it's not enough for your app to deal with contending with itself. And: your own application may be deployed to a scaled, clustered web container, where multiple copies of your code are executing on separate machines.
So, there are many standard patterns for dealing with locks in databases, you may need to read up on Pessimistic and Optimistic Locking.
The servlet API and JBC connection pooling gives you some helpful guarantees so that you can write your servlet code without using Java synchronisation provided your variables are in method scope, in concept you have
Start transaction (perhaps implicit, perhaps on entry to an ejb)
Get connection to DB ( Gets you a connection from pool, associated with your tran)
read/write/update code
Close connection (actually keeps it for your thread until your transaction commits)
Commit (again maybe implictly)
So your only real issue is dealing with any contentions in the DB. All of the above tends to be done rather more nicely using things such as JPA these days, but under the covers thats more or less what's happening.
Case 2: static method, this presumably implies that you now keep everything in a memory structure. This (barring remote invocation of some sort) impies a single JVM and you managing your own locking. Should your JVM or machine crash I guess you lose your data. If you care about your data then using a DB is probably better.
OR, how about a completely other approach: servlet simply records the "vote" by writing a message to a persistent JMS queue. Have some other processes pick up the votes from the queue and adds them up. You won't give immediate feedback to the voter this way, but you decouple the user's experience from the actual (in similar scenarios) quite complex processing .

I thing that the best solution for your problem is to use something like "synchronized" keyword and wait/notify!

Related

Is making a method synchronized will ensure that it is thread safe?

I have a method in which some database insert operations are happening using hibernate and i want them to be thread safe. The method is getting some data in parametres and its a possiblity that sometimes two calls are made with same data at same point of time.
I can't lock those tables because of performance degradation. Can anyone suggest making the method as synchronized will solve issue?
Synchronizing a method will ensure that it can only be accessed by one thread at a time. If this method is your only means of writing to the database, then yes, this will stop two threads from writing at the same time. However, you still have to deal with the fact that you have multiple insert operations with the same data.
You should let Hibernate handle the concurrency, that's what it is meant to do. Don't assume Hibernate will lock anything: it supports optimistic transactions for exactly this purpose. Quote from the above link:
The only approach that is consistent with high concurrency and high scalability, is optimistic concurrency control with versioning. Version checking uses version numbers, or timestamps, to detect conflicting updates and to prevent lost updates. Hibernate provides three possible approaches to writing application code that uses optimistic concurrency.
Database Concurrency is handled by transactions. Transactions have the Atomic Consistent Isolated Durable (ACID) properties. They provide isolation between programs accessing a database concurrently. In the Hibernate DAO template of spring framework there are single line methods for CRUD operations on the database. When used individually these don't need to be synchronized by method. Spring provides declarative (XML), programmatic and annotation meta-data driven transaction management if you need to declare "your method" as transactional with specific propagation settings, rollbackFor settings, isolation settings. So in "your method" you can do multiple save,update,deletes etc and the ORM will ensure that it is executed with the transaction settings you have given in the meta-data.
Another issue is that the thread has to have the lock on all the objects that are taking part in the transaction.Otherwise the transaction might fail or the ORM will persist stale data. In another situation it can result in a deadlock because of lock-ordering. I think this is what really answers your question.
Both objects a and b have an instance variable of the type Lock. A boolean flag can be used to indicate the success of the transaction. The client code can retry the same transaction if it fails.
if (a.lock.tryLock()) {
try {
if (b.lock.tryLock()) {
try {
// persist or update object a and b
} finally {
b.lock.unlock();
}
}
} finally {
a.lock.unlock();
}
}
The problem with using synchronized methods is that it locks up the entire Service or DAO class making other service methods unavailable to other threads. By using individual locks on objects we can gain the advantage of fine grained concurrency.
No. This method probably uses another methods and objects, which may be not thread safe. synchronized makes threads to use that's method's object monitor only once at a time, so it makes thread-safe a method with respect to the object.
If you are sure that all other threads use shared functionality only with this method, then making it synchronized may be sufficient.
Choosing the best strategy depends on the architecture, sometimes to increase performance seems to be easier to use the trick like method synchronization, but this is bad approach.
There's no doubts, you should use transactions, and if with that strategy you're facing performance issues you should optimize your db queries or db structure.
Please remember that "Synchronization" should be as much as possible atomic.

Concurrency : Handling multiple submits in a web application

This is a recent interview question to my friend:
How would you handle a situation where users enter some data in the screen and let's say 5 of them clicked on the Submit button *the SAME time ?*
(By same time,the interviewer insisted that they are same to the level of nanoseconds)
My answer was just to make the method that handles the request synchronized and only one request can acquire the lock on the method at a given time.
But it looks like the interviewer kept insisting there was a "better way" to handle it .
One other approach to handle locking at the database level, but I don't think it is "better".
Are there any other approaches. This seems to be a fairly common problem.
If you have only one network card, you can only have one request coming down it at once. ;)
The answer he is probably looking for is something like
Make the servlet stateless so they can be executed concurrently.
Use components which allow thread safe concurrent access like Atomic* or Concurrent*
Use locks only where you obsolutely have to.
What I prefer to do is to make the service so fast it can respond before the next resquest can come in. ;) Though I don't have the overhead of Java EE or databases to worry about.
Does it matter that they click at the same time e.g. are they both updating the same record on a database?
A synchronized method will not cut it, especially if it's a webapp distributed amongst multiple JVMs. Also the synchronized method may block, but then the other threads would just fire after the first completes and you'd have lost writes.
So locking at database level seems to be the option here i.e. if the record has been updated, report an error back to the users whose updates were serviced after the first.
You do not have to worry about this as web server launches each request in isolated thread and manages it.
But if you have some shared resource like some file for logging then you need to achieve concurrency and put thread lock on it in request and inter requests

Java logging across multiple threads

We have a system that uses threading so that it can concurrently handle different bits of functionality in parallel. We would like to find a way to tie all log entries for a particular "transaction" together. Normally, one might use 'threadName' to gather these together, but clearly that fails in a multithreaded situation.
Short of passing a 'transaction key' down through every method call, I can't see a way to tie these together. And passing a key into every single method is just ugly.
Also, we're kind of tied to Java logging, as our system is built on a modified version of it. So, I would be interested in other platforms for examples of what we might try, but switching platforms is highly unlikely.
Does anyone have any suggestions?
Thanks,
Peter
EDIT: Unfortunately, I don't have control over the creation of the threads as that's all handled by a workflow package. Otherwise, the idea of caching the ID once for each thread (on ThreadLocal maybe?) then setting that on the new threads as they are created is a good idea. I may try that anyway.
You could consider creating a globally-accessible Map that maps a Thread's name to its current transaction ID. Upon beginning a new task, generate a GUID for that transaction and have the Thread register itself in the Map. Do the same for any Threads it spawns to perform the same task. Then, when you need to log something, you can simply lookup the transaction ID from the global Map, based on the current Thread's name. (A bit kludgy, but should work)
This is a perfect example for AspectJ crosscuts. If you know the methods that are being called you can put interceptors on them and bind dynamically.
This article will give you several options http://www.ibm.com/developerworks/java/library/j-logging/
However you mentioned that your transaction spans more than one thread, take a look at how log4j cope with binding additional information to current thread with MDC and NDC classes. It uses ThreadLocal as you were advised before, but interesting thing is how log4j injects data into log messages.
//In the code:
MDC.put("RemoteAddress", req.getRemoteAddr());
//In the configuration file, add the following:
%X{RemoteAddress}
Details:
http://onjava.com/pub/a/onjava/2002/08/07/log4j.html?page=3
http://wiki.apache.org/logging-log4j/NDCvsMDC
How about naming your threads to include the transaction ID? Quick and Dirty, admittedly, but it should work (until you need the thread name for something else or you start reusing threads in a thread pool).
If you are logging, then you must have some kind of logger object. You should have a spearate instance in each thread.
add a method to it called setID(String id).
When it is initialized in your thread, set a unique ID using the method.
prepend the set iD to each log entry.
A couple people have suggested answers that have the newly spawned thread somehow knowing what the transaction ID is. Unless I'm missing something, in order to get this ID into the newly spawned thread, I would have to pass it all the way down the line into the method that spawns the thread, which I'd rather not do.
I don't think you need to pass it down, but rather the code responsible for handing work to these threads needs to have the transactionID to pass. Wouldn't the work-assigner have this already?

ThreadLocals hard to use

I'm using ThreadLocal variables (through Clojure's vars, but the following is the same for plain ThreadLocals in Java) and very often run into the issue that I can't be sure that a certain code path will be taken on the same thread or on another thread. For code under my control this is obviously not too big a problem, but for polymorphic third party code there's sometimes not even a way to statically determine whether it's safe to assume single threaded execution.
I tend to think this is a inherent issue with ThreadLocals, but I'd like to hear some advise on how to use them in a safe way.
Then don't use ThreadLocals! They are specifically for when you want a variable that's associated with a Thread, as if there were a Map<Thread,T>.
The typical use case (as far as I know) for a ThreadLocal is in a web application framework. An HTTP filter obtains a database connection on an incoming request, and stores the connection in a static ThreadLocal. All subsequent controllers needing the connection can easily obtain it from the framework using a static call. When the response is returned, the same filter releases the connection again.

Lock Ordering in C3p0

I am trying to log the creation and destruction of database connections in our application using c3p0's ConnectionCustomizer. In it, I have some code that looks like this:
log(C3P0Registry.getPooledDataSources())
I'm running into deadlocks. I'm discovering that c3p0 has at least a couple of objects in its library that use synchronized methods, and don't seem to specify their intended lock ordering. When I log the connections, I'm holding a lock on C3P0Registry and eventually PoolBackedDataSource (simply creating a list of the datasources is accessing the hashcode causing a lock).
Shutting down the connection provider (calling C3P0ConnectionProvider.close()) causes the locks to be called in the opposite order. But while the child datasources are being shut down, my logging is being triggered. The result is a deadlock.
It seems like both calls I am making into the c3p0 library are valid, expected calls:
C3P0ConnectionProvider.close()
C3P0Registry.getPooledDataSources()
It also seems like (unless explicitly stated in the documentation) it should be the library's responsibility to manage it's own locking strategy. (I don't say this to blame anyone.. just to confirm my understanding of best practices)
How should I deal with this issue? Since c3p0 uses synchronized methods rather than a more modern mechanism, I can't really test the locks.
From my DataSource closing code, I could first grab the C3P0Registry lock before closing the DataSource. I would be guessing at the correct lock order, which I don't know if I feel comfortable with.
I don't think I could reverse the lock order for the logging call. I need the C3P0Registry to get the list of DataSources, so I couldn't lock the DataSources without first locking C3P0Registry to get references to them.
Another solution, of course is to provide another, higher level lock above everything c3p0. In the case of a connection pool, that seems to defeat the point.
For now, I'm rolling back my logging. Thanks for any help.
I dont know how to fix the locking issue, but i think you should take a step back here and think about the original problem.
"I am trying to log the creation and destruction of database connections in our application ..."
I would recommend the following.
Create a class and make it implement javax.sql.DataSource.
Create a field of the same type and delegate all methods to it.
In the getConnection() method return your own Connection class wrapping around
java.sql.Connection and so on.
Then wrap this class around your original data source.
In your classes you can now simply create a logger and log all actions you want to see in your log.

Categories