Java selective synchronization

I'm maintaining a very old application, and recently I came across a multithreading bug.
In one method, before a value is inserted into the db, the code first checks whether the record already exists and inserts it only if it does not.
createSomething(params)
{
    ....
    ....
    if( !isPresentInDb(params) )
    {
        .....
        .....
        .....
        insertIntoDb(params);
    }
    ...
}
When multiple threads invoke this method, two or more threads with the same params may all pass the isPresentInDb check; one thread then inserts successfully and the others fail.
To solve this problem I enclosed both db interactions in a single synchronized(this) block. But is there a better way of doing this?
Edit: it is more like selective synchronization: only threads with the same params need to be synchronized. Is selective synchronization possible?

I'd say the better way to do this would be to let the database do it for you if at all possible. Assuming the row that you want to either update or insert has a unique constraint on it, my usual approach would be:
1. Unconditionally insert the row.
2. If an SQLException occurs, check whether it is due to a duplicate-key error on insert. If it is, do the update; otherwise rethrow the SQLException.
If you can wrap those statements in a database transaction, then you don't have to worry about two threads trampling on each other.
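
A minimal sketch of that insert-then-fall-back flow, in case it helps. The names InsertOrUpdate, SqlAction and insertOrUpdate are made up for illustration; the real insert and update would be your own JDBC statements. It treats SQLIntegrityConstraintViolationException, or any SQLState in class "23", as a duplicate key. That is an assumption: class 23 covers every integrity constraint violation, so real code may need the driver-specific error code to tell a duplicate key apart from, say, a NOT NULL failure.

```java
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

final class InsertOrUpdate {

    /** A database statement that may fail with an SQLException (hypothetical helper type). */
    interface SqlAction {
        void run() throws SQLException;
    }

    /**
     * Runs the insert; if it fails with a constraint violation, runs the update instead.
     * Returns true if the insert succeeded, false if the update ran.
     */
    static boolean insertOrUpdate(SqlAction insert, SqlAction update) throws SQLException {
        try {
            insert.run();
            return true;
        } catch (SQLException e) {
            if (!isConstraintViolation(e)) {
                throw e; // some other failure: let the caller see it
            }
            update.run();
            return false;
        }
    }

    /** Assumption: SQLState class "23" (integrity constraint violation) means duplicate key here. */
    static boolean isConstraintViolation(SQLException e) {
        return e instanceof SQLIntegrityConstraintViolationException
                || (e.getSQLState() != null && e.getSQLState().startsWith("23"));
    }
}
```

The caller passes the two statements as lambdas, for example insertOrUpdate(() -> insertIntoDb(params), () -> updateDb(params)), where updateDb is again a placeholder name.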

If the logic is really "create this if it doesn't already exist", it could be better still to push the logic down into the database. For example, MySQL has "INSERT IGNORE" syntax that will cause it to ignore the insert if it would violate a primary key constraint. It may not be possible for your code, but worth considering.

This approach only works if this object instance is the only one that inserts into the table. If it is not, two threads will synchronize on two different objects and the synchronization won't work. In short: the object should be a singleton, and no other object should insert into this table.
Even if a single object instance does all the inserting, the synchronization gives you no guarantee if any other application, or any other JVM, inserts into this table.
Doing this is better than nothing, but it doesn't guarantee that the insert will always succeed. If it fails, the transaction will roll back, (hopefully) because of a constraint violation. If you don't have a unique constraint to guarantee uniqueness in the database, and several applications insert in parallel, then you can't do anything to avoid duplicates.

Since you only want to forbid this method from running with the same params, you can use a ConcurrentMap instead and then call putIfAbsent and check its return value before proceeding. This will allow you to run the method concurrently for different arguments.
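
Roughly, that could look like the sketch below. InFlightGuard and runIfNotInFlight are names I made up, and it assumes the params object has proper equals() and hashCode() so it can be used as a map key. Note that a thread that loses the race does not wait here: it just gets false back and skips the work.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class InFlightGuard<K> {
    // Keys that some thread is currently working on.
    private final ConcurrentMap<K, Boolean> inFlight = new ConcurrentHashMap<>();

    /**
     * Runs the action only if no other thread is currently running one for the same key.
     * Returns true if the action ran, false if the key was already claimed.
     */
    boolean runIfNotInFlight(K key, Runnable action) {
        if (inFlight.putIfAbsent(key, Boolean.TRUE) != null) {
            return false; // another thread already holds this key
        }
        try {
            action.run();
            return true;
        } finally {
            inFlight.remove(key); // release the key even if the action throws
        }
    }
}
```

Calls with different keys never block each other. If the losing thread needs the row to exist before it continues, it would have to wait or retry, which this sketch does not do.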

Looks fine to me. You can use some of the java.util.concurrent aids, like a ReentrantLock.
It would be better to use some sort of optimistic transaction: try to insert, and catch the exception. If the record has just been inserted by another thread, simply do nothing.

In one word: no. There is no better way than this. To make a check-then-update operation atomic, you have to put the logic inside a synchronized block.

You could make the whole method synchronized. I tend to find that a good marker for "this method only gets run by one thread at a time". That's my personal preference though.

The downside of too coarse-grained locking is performance degradation. If the method is called often, it will become a performance bottleneck. Here are two other approaches:
Move your concurrent code into your database statement, if possible.
Use a non-blocking data structure such as ConcurrentMap and maintain a list of known entries (it must be warmed up on startup). This allows you to run the method with minimal locking, and without synchronizing the code. An atomic putIfAbsent() can be used to check whether the entry must be added or not.

As others have stated, your current approach is fine. Depending on your requirements, though, there are other things to consider:
Is this the only place in your application where you insert these records into the db? If not, the insert could still fail even with synchronisation.
How often does the operation fail? If the number of failures is small compared to the number of times you run the method, it may be beneficial to detect the failure by catching an appropriate exception instead, because of the overhead involved in synchronising threads.
What does your application need to do when it detects this kind of failure?

On first sight your solution seems ok, but if you want to change it here are two options:
use db transactions
use locks from java.util.concurrent.locks
Lock lock = new ReentrantLock();
.....
createSomething(params)
{
    ....
    ....
    lock.lock(); // acquire before the try, so finally only unlocks a lock we hold
    try {
        if( !isPresentInDb(params) )
        {
            .....
            .....
            .....
            insertIntoDb(params);
        }
    } finally {
        lock.unlock();
    }
}
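
For the "selective" part of the question, one possible variation is a lock per params value instead of one global lock. This is only a sketch: KeyedLocks and withLock are invented names, the key is assumed to implement equals() and hashCode(), and the map never removes locks, so it only suits a bounded set of keys.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

final class KeyedLocks<K> {
    // One lock per key; entries are never removed in this sketch.
    private final ConcurrentMap<K, Lock> locks = new ConcurrentHashMap<>();

    /** Runs the action while holding the lock that belongs to this key. */
    void withLock(K key, Runnable action) {
        Lock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            action.run();
        } finally {
            lock.unlock();
        }
    }
}
```

Threads with equal params wait for each other; threads with different params run in parallel. Like synchronized, this only protects against threads in the same JVM.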

Related

What exactly is meant by Spring transactions being atomic?

My understanding of an atomic operation is that it should not be possible for the steps of the operation to be interleaved with those of any other operation - that it should be executed as a single unit.
I have a method for creating a database record that will first of all check if a record with the same value, which also satisfies certain other parameters, already exists, and if so will not create the record.
In fakecode:
public class FooDao implements IFooDao {
    @Transactional
    public void createFoo(String fooValue) {
        if (!fooExists(fooValue)) {
            // DB call to create foo
        }
    }
    @Transactional
    public boolean fooExists(String fooValue) {
        // DB call to check if foo exists
    }
}
However I have seen that it is possible for two records with the same value to be created, suggesting that these operations have interleaved in some way. I am aware that with Spring's transactional proxies, self-invocation of a method within an object will not use the transactional logic, but if createFoo() is called from outside the object then I would expect fooExists() to still be included in the same transaction.
Are my expectations as to what transactional atomicity should enforce wrong? Do I need to be using a synchronized block to enforce this?

What a transaction really means for the database depends on the isolation level. The Wikipedia article on Isolation (database systems) explains it well.
Normally one uses a not-so-high isolation level, for example Read committed. This means that you can read data written by another transaction only after that transaction has committed.
In your case this is not enough, because it is the opposite of what you want. So the obvious solution would be to use a more restrictive and slower isolation level: Repeatable read, or Serializable if phantom rows must be ruled out as well.
But to be honest, I would go another way: make the relevant column unique (but do not remove your if (!fooExists(fooValue)) check). In 99% of cases your check will work. In the remaining 1% you will get an exception, because you tried to violate the unique constraint.

Transactional means all updates occur within the same transaction, ie all updates/inserts/delete succeed or all are rolled back (for example if you update multiple tables).
It doesn't guarantee anything about the behaviour of queries within the transaction, which depend on the RDBMS and its configuration (configuration of the isolation level on the database).

@Transactional does not by default make the code synchronized. Two separate threads can enter the same block at the same time and cause inserts to occur. Synchronizing the method isn't really a good answer either, since that can drastically affect application performance. If your issue is that two identical records are being created by two different threads, you may want to add indexes with a unique constraint on the database so that duplicate inserts will fail.

Will making a method synchronized ensure that it is thread safe?

I have a method in which some database insert operations happen using Hibernate, and I want them to be thread safe. The method receives some data as parameters, and it is possible that two calls are sometimes made with the same data at the same time.
I can't lock those tables because of the performance degradation. Can anyone tell me whether making the method synchronized will solve the issue?

Synchronizing a method will ensure that it can only be accessed by one thread at a time. If this method is your only means of writing to the database, then yes, this will stop two threads from writing at the same time. However, you still have to deal with the fact that you have multiple insert operations with the same data.

You should let Hibernate handle the concurrency, that's what it is meant to do. Don't assume Hibernate will lock anything: it supports optimistic transactions for exactly this purpose. Quote from the above link:
The only approach that is consistent with high concurrency and high scalability, is optimistic concurrency control with versioning. Version checking uses version numbers, or timestamps, to detect conflicting updates and to prevent lost updates. Hibernate provides three possible approaches to writing application code that uses optimistic concurrency.

Database Concurrency is handled by transactions. Transactions have the Atomic Consistent Isolated Durable (ACID) properties. They provide isolation between programs accessing a database concurrently. In the Hibernate DAO template of spring framework there are single line methods for CRUD operations on the database. When used individually these don't need to be synchronized by method. Spring provides declarative (XML), programmatic and annotation meta-data driven transaction management if you need to declare "your method" as transactional with specific propagation settings, rollbackFor settings, isolation settings. So in "your method" you can do multiple save,update,deletes etc and the ORM will ensure that it is executed with the transaction settings you have given in the meta-data.
Another issue is that the thread has to hold the lock on all the objects that take part in the transaction. Otherwise the transaction might fail or the ORM will persist stale data. In another situation it can result in a deadlock because of lock ordering. I think this is what really answers your question.
Both objects a and b have an instance variable of the type Lock. A boolean flag can be used to indicate the success of the transaction. The client code can retry the same transaction if it fails.
if (a.lock.tryLock()) {
    try {
        if (b.lock.tryLock()) {
            try {
                // persist or update object a and b
            } finally {
                b.lock.unlock();
            }
        }
    } finally {
        a.lock.unlock();
    }
}
The problem with using synchronized methods is that it locks up the entire Service or DAO class making other service methods unavailable to other threads. By using individual locks on objects we can gain the advantage of fine grained concurrency.
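
To make the boolean flag and the client-side retry concrete, here is a sketch under my own assumptions: TwoLockUpdate, tryWithBoth and retry are illustrative names, the Runnable stands in for the persist/update step, and the retry policy (a fixed number of attempts with Thread.yield() in between) is just one simple choice.

```java
import java.util.concurrent.locks.Lock;

final class TwoLockUpdate {

    /**
     * Tries to take both locks without blocking and runs the action if it gets them.
     * Returns true on success, false if either lock was busy.
     */
    static boolean tryWithBoth(Lock first, Lock second, Runnable action) {
        if (!first.tryLock()) {
            return false;
        }
        try {
            if (!second.tryLock()) {
                return false; // first lock is released by the outer finally
            }
            try {
                action.run();
                return true;
            } finally {
                second.unlock();
            }
        } finally {
            first.unlock();
        }
    }

    /** Retries up to maxAttempts times, yielding between attempts. */
    static boolean retry(Lock first, Lock second, Runnable action, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (tryWithBoth(first, second, action)) {
                return true;
            }
            Thread.yield();
        }
        return false;
    }
}
```

Because tryLock() never blocks, two threads taking the locks in opposite order cannot deadlock; the worst case is that both back off and try again.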

No. This method probably uses other methods and objects, which may not be thread safe. synchronized only lets one thread at a time hold the monitor of the method's object, so it makes the method thread safe only with respect to that object.
If you are sure that all other threads use shared functionality only with this method, then making it synchronized may be sufficient.

Choosing the best strategy depends on the architecture. Sometimes a trick like method synchronization seems the easier way to get things working, but it is a bad approach.
There's no doubt you should use transactions, and if you then face performance issues you should optimize your db queries or db structure.
Please remember that a synchronized section should be as small, as close to atomic, as possible.

Non blocking strategy for executing a pair of operations atomically in Java

Let's say I have a Set and a Queue. I want to check whether the set contains(element) and, if not, add(element) to the queue. I want to do the two steps atomically.
One obvious way is to use synchronized blocks or Lock.lock()/unlock() methods. Under thread contention, these will cause context switches. Is there any simple design strategy for achieving this in a non-blocking manner, maybe using some atomic constructs?

I don't think you can rely on any mechanism, except the ones you pointed out yourself, simply because you're operating on two structures.
There's decent support for concurrent/atomic operations on one data structure (like "put if not exists" in a ConcurrentHashMap), but for a sequence of operations, you're stuck with either a lock or a synchronized block.

For some operations you can employ what is called a "safe sequence", where concurrent operations may overlap without conflicting. For instance, you might be able to add a member to a set (in theory) without the need to synchronize, since two threads simultaneously adding the same member do not conceptually conflict with each other.
But to query one object and then conditionally operate on a different object is a much more complicated scenario. If your sequence was to query the set, then conditionally insert the member into the set and into the queue, the query and first insert could be replaced with a "compare and swap" operation that syncs without stalling (except perhaps at the memory access level), and then one could insert the member into the queue based on the success of the first operation, only needing to synchronize the queue insert itself. However, this sequence leaves the scenario where another thread could fail the insert and still not find the member in the queue.

Since the contention case is the relevant case you should look at "spin locks". They do not give away the CPU but spin on a flag expecting the flag to be free very soon.
Note however that real spin locks are seldom useful in Java because the normal Lock is quite good. See this blog where someone had first implemented a spinlock in Java only to find that after some corrections (i.e. after making the test correct) spin locks are on par with the standard stuff.
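
For reference, a bare-bones spin lock looks something like the sketch below. SpinLock is my own illustrative class, not something from the blog post, and Thread.onSpinWait() needs Java 9 or later. It is not reentrant and not fair, so treat it as a demonstration of the idea only.

```java
import java.util.concurrent.atomic.AtomicBoolean;

final class SpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    void lock() {
        // Busy-wait until this thread flips the flag from false to true.
        while (!held.compareAndSet(false, true)) {
            Thread.onSpinWait(); // hint to the runtime that we are spinning (Java 9+)
        }
    }

    void unlock() {
        held.set(false);
    }
}
```

Usage follows the same pattern as any Lock: call lock(), do the work in a try block, and call unlock() in finally.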

You can use java.util.concurrent.ConcurrentHashMap to get the semantics you want. It has a putIfAbsent method that does an atomic insert. You then essentially try to add the element to the map, and if that succeeds, you know that the thread that performed the insert is the only one that has done so, and it can then put the item in the queue safely. The other significant point here is that the operations on a ConcurrentMap ensure "happens-before" semantics.
ConcurrentMap<Element,Boolean> set = new ConcurrentHashMap<Element,Boolean>();
Queue<Element> queue = ...;
void maybeAddToQueue(Element e) {
    if (set.putIfAbsent(e, true) == null) {
        queue.offer(e);
    }
}
Note, the actual value type (Boolean) of the map is unimportant here.

Hibernate: A long read-only transaction will now require a small DB update in the middle

I have written quite a complicated engine of sorts which navigates up and down a large series of objects read in from the database.
So I have code that looks something like this:
public void go(long id) {
    try {
        beginTransaction();
        Foo foo = someDao.find(id);
        anotherObject.doSomething(foo);
        commitTransaction();
    } catch (Exception e) {
        rollbackTransaction();
    }
}
The code in doSomething(...) will call methods to get child objects of Foo and pass those child objects off to other classes and so on.
Prior to my problem, this used to be just a long read-only transaction. Now, however, somewhere in the middle of all of this, there needs to be an update to the database. It is important that this update is committed straight away. As Hibernate doesn't support nested transactions, how would I deal with a situation like this so that I can continue to pass my object around and still call getter methods to access children, while having that database update committed?
I thought of removing the long running transaction and having small transactions all over the place. Unfortunately, my code at the moment passes Foo and other child objects everywhere assuming it is still bound to the session. If this is my only solution, would that mean I would end up with ugly merge calls everywhere just to re-attach to the session so the getter methods work again? I'm sure there must be a more elegant solution.

Do the database update within your transaction, i.e. pass the required information to the thread performing your long transaction.
Alternatively, use entity listeners to signal what needs update, and then use the EntityManager.refresh method.
This will be getting a bit ugly with multi-threading and all, but note that you probably do not want the transaction to 'just update' at some random point in time, as that in many cases will yield unpredictable results, like breaking for-loops and such.
And if this is an n-level algorithm, is there any way of doing m levels at a time, saving the state (say, the ids of the current scope), and running the next iteration in a new transaction? For this you can use one method without a transaction, which calls EJB methods that are each confined to their own transaction and return state.

If you must stick with Hibernate (and cannot consider accessing the underlying JDBC driver, Spring Transactions or JTA), you can probably just spawn a thread to do the update and have the main thread wait until it's done (Thread.join()).

I've bitten the bullet and I believe splitting up the big transaction into smaller transactions to have more atomicity is best. This required some manual eager loading in the code however but my nested transaction issue is gone.

java methods and race condition in a jsp/servlets application

Suppose that I have a method called doSomething() and I want to use this method in a multithreaded application (each servlet inherits from HttpServlet). I'm wondering whether a race condition can occur in the following cases:
doSomething() is not a static method and it writes values to a database.
doSomething() is a static method but it does not write values to a database.
I have noticed that many methods in my application may lead to a race condition or a dirty read/write. For example, I have a poll system, and for each voting operation a certain method changes a single cell value for that poll, as follows:
[poll_id | poll_data ]
[1 | {choice_1 : 10, choice_2 : 20}]
Will the JSP/Servlets app solve these issues by itself, or do I have to solve all that myself?
Thanks..

It depends on how doSomething() is implemented and what it actually does. I assume writing to the database uses JDBC connections, which are not threadsafe. The preferred way of doing that would be to create ThreadLocal JDBC connections.
As for the second case, it depends on what is going on in the method. If it doesn't access any shared, mutable state then there isn't a problem. If it does, you probably will need to lock appropriately, which may involve adding locks to every other access to those variables.
(Be aware that just marking these methods as synchronized does not fix any concurrency bugs. If doSomething() incremented a value on a shared object, then all accesses to that variable need to be synchronized since i++ is not an atomic operation. If it is something as simple as incrementing a counter, you could use AtomicInteger.incrementAndGet().)
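
A small sketch of the AtomicInteger variant. VoteCounter is a made-up class for the poll example, and it only covers an in-memory counter in a single JVM, not the database cell from the question.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class VoteCounter {
    private final AtomicInteger votes = new AtomicInteger();

    /** Atomically adds one vote and returns the new total. */
    int vote() {
        return votes.incrementAndGet();
    }

    /** Returns the current total. */
    int total() {
        return votes.get();
    }
}
```

Unlike a plain int with votes++, no increment is lost when several request threads call vote() at the same time, and no synchronized block is needed.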

The Servlet API certainly does not magically make concurrency a non-issue for you.
When writing to a database, it depends on the concurrency strategy in your persistence layer. Pessimistic locking, optimistic locking, last-in-wins? There's way more going on when you 'write to a database' that you need to decide how you're going to handle. What is it you want to have happen when two people click the button at the same time?
Making doSomething static doesn't seem to have too much bearing on the issue. What's happening in there is the relevant part. Is it modifying static variables? Then yes, there could be race conditions.

The servlet api will not do anything for you to make your concurrency problems disappear. Things like using the synchronized keyword on your servlets are a bad idea because you are basically forcing your threads to be processed one at a time and it ruins your ability to respond quickly to multiple users.
If you use Spring or EJB3, either one will provide threadlocal database connections and the ability to specify transactions. You should definitely check out one of those.

Case 1, your servlet uses some code that accesses a database. Databases have locking mechanisms that you should exploit. Two important reasons for this: the database itself might be used from other applications that read and write that data, it's not enough for your app to deal with contending with itself. And: your own application may be deployed to a scaled, clustered web container, where multiple copies of your code are executing on separate machines.
So, there are many standard patterns for dealing with locks in databases, you may need to read up on Pessimistic and Optimistic Locking.
The servlet API and JDBC connection pooling give you some helpful guarantees, so that you can write your servlet code without using Java synchronisation provided your variables are in method scope. Conceptually you have:
Start transaction (perhaps implicit, perhaps on entry to an EJB)
Get connection to DB (gets you a connection from the pool, associated with your transaction)
read/write/update code
Close connection (actually keeps it for your thread until your transaction commits)
Commit (again maybe implicitly)
So your only real issue is dealing with any contention in the DB. All of the above tends to be done rather more nicely using things such as JPA these days, but under the covers that's more or less what's happening.
Case 2: static method. This presumably implies that you now keep everything in a memory structure. This (barring remote invocation of some sort) implies a single JVM and you managing your own locking. Should your JVM or machine crash, I guess you lose your data. If you care about your data then using a DB is probably better.
OR, how about a completely other approach: servlet simply records the "vote" by writing a message to a persistent JMS queue. Have some other processes pick up the votes from the queue and adds them up. You won't give immediate feedback to the voter this way, but you decouple the user's experience from the actual (in similar scenarios) quite complex processing .

I think that the best solution for your problem is to use something like the "synchronized" keyword and wait/notify!
