I'm trying to understand if there is a thread-safety issue inside of Struts2 ScopeInterceptor class (/org/apache/struts2/interceptor/ScopeInterceptor.java), here's the code in question:
private static Map locks = new IdentityHashMap();
static final void lock(Object o, ActionInvocation invocation) throws Exception {
synchronized (o) {
int count = 3;
Object previous = null;
while ((previous = locks.get(o)) != null) {
if (previous == invocation) {
return;
}
if (count-- <= 0) {
locks.remove(o);
o.notify();
throw new StrutsException("Deadlock in session lock");
}
o.wait(10000);
}
;
locks.put(o, invocation);
}
}
static final void unlock(Object o) {
synchronized (o) {
locks.remove(o);
o.notify();
}
}
I have a Websphere application showing 45 stalled threads, high cpu usage. 33 threads are stalled at "locks.remove(o)" inside of "unlock" method. The other 12 threads are stalled inside of "locks.get(o)" inside of "lock" method.
It seems to me that the usage of IdentityHashMap is thread-unsafe. Could simply wrapping IdentityHashMap with Collections.synchronizedMap() solve this problem?:
private static Map locks = Collections.synchronizedMap(new IdentityHashMap());
static final void lock(Object o, ActionInvocation invocation) throws Exception {
synchronized (o) {
int count = 3;
Object previous = null;
while ((previous = locks.get(o)) != null) {
if (previous == invocation) {
return;
}
if (count-- <= 0) {
locks.remove(o);
o.notify();
throw new StrutsException("Deadlock in session lock");
}
o.wait(10000);
}
;
locks.put(o, invocation);
}
}
static final void unlock(Object o) {
synchronized (o) {
locks.remove(o);
o.notify();
}
}
It seems to me that the author tried to "fix" IdentityHashMap's synchronization problem by using synchronized code blocks, however that doesn't protect against multiple threads if the Object "o" is a thread-specific object. And, since the code blocks within lock and unlock are separate, then IdentityHashMap will (and does!) get called simultaneously by more than one thread (as per our Java core evidence).
Is the Collections.synchronizedMap() wrapper the correct fix, or am I missing something?
I believe you are right, and there appears to be a thread safety issue. The developer is attempting to be thread safe by synchronizing on the object "o", but it looks like this object is actually the session object rather than something that is scoped more widely. I believe the change needs to be to synchronize on the locks object.
No real answers, but hopefully some useful information:
The IdentityHashMap documentation (http://docs.oracle.com/javase/7/docs/api/java/util/IdentityHashMap.html) states:
Note that this implementation is not synchronized. If multiple threads access an identity hash map concurrently, and at least one of
the threads modifies the map structurally, it must be synchronized
externally. (A structural modification is any operation that adds or
deletes one or more mappings; merely changing the value associated
with a key that an instance already contains is not a structural
modification.) This is typically accomplished by synchronizing on some
object that naturally encapsulates the map. If no such object exists,
the map should be "wrapped" using the Collections.synchronizedMap
method. This is best done at creation time, to prevent accidental
unsynchronized access to the map:
Map m = Collections.synchronizedMap(new IdentityHashMap(...));
So the Collections.synchronizedMap strategy sounds right, but this page (http://docs.oracle.com/javase/tutorial/essential/concurrency/locksync.html) makes me wonder about whether it will work since the methods are static:
Locks In Synchronized Methods
When a thread invokes a synchronized method, it automatically acquires
the intrinsic lock for that method's object and releases it when the
method returns. The lock release occurs even if the return was caused
by an uncaught exception.
You might wonder what happens when a static synchronized method is
invoked, since a static method is associated with a class, not an
object. In this case, the thread acquires the intrinsic lock for the
Class object associated with the class. Thus access to class's static
fields is controlled by a lock that's distinct from the lock for any
instance of the class.
Since those are static methods (even though the fields are not static), it's hard to tell if the Collections.synchronizedMap wrapper will really work to prevent a deadlock... My answer is that I have no answer!
Yes I think so.
If you accessing lock(Object o, ActionInvocation invocation) with different os you are modify the IdentityHashMap simultaneously with different monitors for the different threads. This makes it possible for different threads to call IdentityHashMap simultaneously.
This can be solved by synchronizing the IdentityHashMap.
Related
Let's say I have a field that can be accessed by two separate threads. I'm using an Object for the synchronization lock. Can I check for null outside of the synchronization block? In other words, is this thread-safe:
private Object sharedObject() = new Object();
private final Object sharedObjectLock() = new Object();
private void awesomeMethod() {
if(sharedObject != null) {
synchronized(sharedObjectLock) {
//code the uses sharedObject
}
}
}
No. The assignment of the variable could occur after you check but before you get the lock.
There was, at one time, a school of thought that synchronized locks were expensive, but that's not true anymore. Just grab the lock.
No, it's not thread-safe. If another thread nullifies sharedObject field, your thread is not guaranteed to see that change.
Let's say I have a HashMap declared as follows:
#GuardedBy("pendingRequests")
private final Map<UInt32, PendingRequest> pendingRequests = new HashMap<UInt32, PendingRequest>();
Access to the map is multi-threaded, and all access is guarded by synchronizing on this final instance of the map, e.g.:
synchronized (pendingRequests) {
pendingRequests.put(reqId, request);
}
Is this enough? Should the map be created using Collections.synchronizedMap()? Should I be locking on a dedicated lock object instead of the map instance? Or maybe both?
External synchronization (in addition to possibly using Collections.synchronizedMap()) is needed in a couple areas where multiple calls on the map must be atomic.
Synchronizing on the map itself is essentially what the Map returned by Collection.synchronizedMap() would do. For your situation it is a reasonable approach, and there is not much to recommend using a separate lock object other than personal preference (or if you wish to have more fine grained control and use a ReentrantReadWriteLock to allow concurrent reading of the map).
E.g.
private Map<Integer,Object> myMap;
private ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
public void myReadMethod()
{
rwl.readLock().lock();
try
{
myMap.get(...);
...
} finally
{
rwl.readLock().unlock();
}
}
public void myWriteMethod()
{
// may want / need to call rwl.readLock().unlock() here,
// since if you are holding the readLock here already then
// you cannot get the writeLock (so be careful on how your
// methods lock/unlock and call each other).
rwl.writeLock().lock();
try
{
myMap.put(key1,item1);
myMap.put(key2,item2);
} finally
{
rwl.writeLock().unlock();
}
}
All calls to the map need to be synchronized, and Collections.synchronizedMap() gives you that.
However, there is also an aspect of compound logic. If you need the integrity of the compound logic, synchronization of individual calls is not enough. For example, consider the following code:
Object value = yourMap.get(key); // synchronized
if (value == null) {
// do more action
yourMap.put(key, newValue); // synchronized
}
Although individual calls (get() and put()) are synchronized, your logic will not be safe against concurrent access.
Another interesting case is when you iterate. For an iteration to be safe, you'd need to synchronize for the entire duration of the iteration, or you will get ConcurrentModificationExceptions.
I have a web application and I am using Oracle database and I have a method basically like this:
public static void saveSomethingImportantToDataBase(Object theObjectIwantToSave) {
if (!methodThatChecksThatObjectAlreadyExists) {
storemyObject() //pseudo code
}
// Have to do a lot other saving stuff, because it either saves everything or nothing
commit() // pseudo code to actually commit all my changes to the database.
}
Right now there is no synchronization of any kind so n threads can of course access this method freely, the problem arises when 2 threads enter this method both check and of course there is nothing just yet, and then they can both commit the transaction, creating a duplicate object.
I do not want to solve this with a unique key identifier in my Database, because I don't think I should be catching that SQLException.
I also cannot check right before the commit, because there are several checks not only 1, which would take a considerable amount of time.
My experience with locks and threads is limited, but my idea is basically to lock this code on the object that it is receiving. I don't know if for example say I receive an Integer Object, and I lock on my Integer with value 1, would that only prevent threads with another Integer with value 1 from entering, and all the other threads with value != 1 can enter freely?, is this how it works?.
Also if this is how it works, how is the lock object compared? how is it determined that they are in fact the same object?. A good article on this would also be appreciated.
How would you solve this?.
Your idea is a good one. This is the simplistic/naive version, but it's unlikely to work:
public static void saveSomethingImportantToDataBase(Object theObjectIwantToSave) {
synchronized (theObjectIwantToSave) {
if (!methodThatChecksThatObjectAlreadyExists) {
storemyObject() //pseudo code
}
// Have to do a lot other saving stuff, because it either saves everything or nothing
commit() // pseudo code to actually commit all my changes to the database.
}
}
This code uses the object itself as the lock. But it has to be the same object (ie objectInThreadA == objectInThreadB) if it's to work. If two threads are operating on an object that is a copy of each other - ie has the same "id" for example, then you'll need to either synchronize the whole method:
public static synchronized void saveSomethingImportantToDataBase(Object theObjectIwantToSave) ...
which will of course greatly reduce concurrency (throughput will drop to one thread at a time using the method - to be avoided).
Or find a way to get the same lock object based on the save object, like this approach:
private static final ConcurrentHashMap<Object, Object> LOCKS = new ConcurrentHashMap<Object, Object>();
public static void saveSomethingImportantToDataBase(Object theObjectIwantToSave) {
synchronized (LOCKS.putIfAbsent(theObjectIwantToSave.getId(), new Object())) {
....
}
LOCKS.remove(theObjectIwantToSave.getId()); // Clean up lock object to stop memory leak
}
This last version it the recommended one: It will ensure that two save objects that share the same "id" are locked with the same lock object - the method ConcurrentHashMap.putIfAbsent() is threadsafe, so "this will work", and it requires only that objectInThreadA.getId().equals(objectInThreadB.getId()) to work properly. Also, the datatype of getId() can be anything, including primitives (eg int) due to java's autoboxing.
If you override equals() and hashcode() for your object, then you could use the object itself instead of object.getId(), and that would be an improvement (Thanks #TheCapn for pointing this out)
This solution will only work with in one JVM. If your servers are clustered, that a whole different ball game and java's locking mechanism will not help you. You'll have to use a clustered locking solution, which is beyond the scope of this answer.
Here is an option adapted from And360's comment on Bohemian's answer, that tries to avoid race conditions, etc. Though I prefer my other answer to this question over this one, slightly:
import java.util.HashMap;
import java.util.concurrent.atomic.AtomicInteger;
// it is no advantage of using ConcurrentHashMap, since we synchronize access to it
// (we need to in order to "get" the lock and increment/decrement it safely)
// AtomicInteger is just a mutable int value holder
// we don't actually need it to be atomic
static final HashMap<Object, AtomicInteger> locks = new HashMap<Integer, AtomicInteger>();
public static void saveSomethingImportantToDataBase(Object objectToSave) {
AtomicInteger lock;
synchronized (locks) {
lock = locks.get(objectToSave.getId());
if (lock == null) {
lock = new AtomicInteger(1);
locks.put(objectToSave.getId(), lock);
}
else
lock.incrementAndGet();
}
try {
synchronized (lock) {
// do synchronized work here (synchronized by objectToSave's id)
}
} finally {
synchronized (locks) {
lock.decrementAndGet();
if (lock.get() == 0)
locks.remove(id);
}
}
}
You could split these out into helper methods "get lock object" and "release lock" or what not, as well, to cleanup the code. This way feels a little more kludgey than my other answer.
Bohemian's answer seems to have race condition problems if one thread is in the synchronized section while another thread removes the synchro-object from the Map, etc. So here is an alternative that leverages WeakRef's.
// there is no synchronized weak hash map, apparently
// and Collections.synchronizedMap has no putIfAbsent method, so we use synchronized(locks) down below
WeakHashMap<Integer, Integer> locks = new WeakHashMap<>();
public void saveSomethingImportantToDataBase(DatabaseObject objectToSave) {
Integer lock;
synchronized (locks) {
lock = locks.get(objectToSave.getId());
if (lock == null) {
lock = new Integer(objectToSave.getId());
locks.put(lock, lock);
}
}
synchronized (lock) {
// synchronized work here (synchronized by objectToSave's id)
}
// no releasing needed, weakref does that for us, we're done!
}
And a more concrete example of how to use the above style system:
static WeakHashMap<Integer, Integer> locks = new WeakHashMap<>();
static Object getSyncObjectForId(int id) {
synchronized (locks) {
Integer lock = locks.get(id);
if (lock == null) {
lock = new Integer(id);
locks.put(lock, lock);
}
return lock;
}
}
Then use it elsewhere like this:
...
synchronized (getSyncObjectForId(id)) {
// synchronized work here
}
...
The reason this works is basically that if two objects with matching keys enter the critical block, the second will retrieve the lock the first is already using (or the one that is left behind and hasn't been GC'ed yet). However if it is unused, both will have left the method behind and removed their references to the lock object, so it is safely collected.
If you have a limited "known size" of synchronization points you want to use (one that doesn't have to decrease in size eventually), you could probably avoid using a HashMap and use a ConcurrentHashMap instead, with its putIfAbsent method which might be easier to understand.
My opinion is you are not struggling with a real threading problem.
You would be better off letting the DBMS automatically assign a non conflicting row id.
If you need to work with existing row ids store them as thread local variables.
If there is no need for shared data do not share data between threads.
http://download.oracle.com/javase/6/docs/api/java/lang/ThreadLocal.html
An Oracle dbms is much better in keeping the data consistent when an application server or a web container.
"Many database systems automatically generate a unique key field when a row is inserted. Oracle Database provides the same functionality with the help of sequences and triggers. JDBC 3.0 introduces the retrieval of auto-generated keys feature that enables you to retrieve such generated values. In JDBC 3.0, the following interfaces are enhanced to support the retrieval of auto-generated keys feature ...."
http://download.oracle.com/docs/cd/B19306_01/java.102/b14355/jdbcvers.htm#CHDEGDHJ
If you can live with occasional over-synchronization (ie. work done sequentially when not needed) try this:
Create a table with lock objects. The bigger table, the fewer chances for over-synchronizaton.
Apply some hashing function to your id to compute table index. If your id is numeric, you can just use a remainder (modulo) function, if it is a String, use hashCode() and a remainder.
Get a lock from the table and synchronize on it.
An IdLock class:
public class IdLock {
private Object[] locks = new Object[10000];
public IdLock() {
for (int i = 0; i < locks.length; i++) {
locks[i] = new Object();
}
}
public Object getLock(int id) {
int index = id % locks.length;
return locks[index];
}
}
and its use:
private idLock = new IdLock();
public void saveSomethingImportantToDataBase(Object theObjectIwantToSave) {
synchronized (idLock.getLock(theObjectIwantToSave.getId())) {
// synchronized work here
}
}
public static void saveSomethingImportantToDataBase(Object theObjectIwantToSave) {
synchronized (theObjectIwantToSave) {
if (!methodThatChecksThatObjectAlreadyExists) {
storemyObject() //pseudo code
}
// Have to do a lot other saving stuff, because it either saves everything or nothing
commit() // pseudo code to actually commit all my changes to the database.
}
}
The synchronized keyword locks the object you want so that no other method could access it.
I don't think you have any choice but to take one of the solutions that you do not seem to want to do.
In your case, I don't think any type of synchronization on the objectYouWantToSave is going to work since they are based on web requests. Therefore each request (on its own thread) is most likely going to have it's own instance of the object. Even though they might be considered logically equal, that doesn't matter for synchronization.
synchronized keyword (or another sync operation) is must but is not enough for your problem. You should use a data structure to store which integer values are used. In our example HashSet is used. Do not forget clean too old record from hashset.
private static HashSet <Integer>isUsed= new HashSet <Integer>();
public synchronized static void saveSomethingImportantToDataBase(Object theObjectIwantToSave) {
if(isUsed.contains(theObjectIwantToSave.your_integer_value) != null) {
if (!methodThatChecksThatObjectAlreadyExists) {
storemyObject() //pseudo code
}
// Have to do a lot other saving stuff, because it either saves everything or nothing
commit() // pseudo code to actually commit all my changes to the database.
isUsed.add(theObjectIwantToSave.your_integer_value);
}
}
To answer your question about locking the Integer, the short answer is NO - it won't prevent threads with another Integer instance with the same value from entering. The long answer: depends on how you obtain the Integer - by constructor, by reusing some instances or by valueOf (that uses some caching). Anyway, I wouldn't rely on it.
A working solution that will work is to make the method synchronized:
public static synchronized void saveSomethingImportantToDataBase(Object theObjectIwantToSave) {
if (!methodThatChecksThatObjectAlreadyExists) {
storemyObject() //pseudo code
}
// Have to do a lot other saving stuff, because it either saves everything or nothing
commit() // pseudo code to actually commit all my changes to the database.
}
This is probably not the best solution performance-wise, but it is guaranteed to work (note, if you are not in a clustered environment) until you find a better solution.
private static final Set<Object> lockedObjects = new HashSet<>();
private void lockObject(Object dbObject) throws InterruptedException {
synchronized (lockedObjects) {
while (!lockedObjects.add(dbObject)) {
lockedObjects.wait();
}
}
}
private void unlockObject(Object dbObject) {
synchronized (lockedObjects) {
lockedObjects.remove(dbObject);
lockedObjects.notifyAll();
}
}
public void saveSomethingImportantToDatabase(Object theObjectIwantToSave) throws InterruptedException {
try {
lockObject(theObjectIwantToSave);
if (!methodThatChecksThatObjectAlreadyExists(theObjectIwantToSave)) {
storeMyObject(theObjectIwantToSave);
}
commit();
} finally {
unlockObject(theObjectIwantToSave);
}
}
You must correctly override methods 'equals' and 'hashCode' for your objects' classes. If you have unique id (String or Number) inside your object then you can just check this id instead of the whole object and no need to override 'equals' and 'hashCode'.
try-finally - is very important - you must guarantee to unlock waiting threads after your operation even if your operation threw exception.
This approach will not work if your back-end is distributed across multiple servers.
This is an implementation of readers writers, i.e. many readers can read but only one writer can write at any one time. Does this work as expected?
public class ReadersWriters extends Thread{
static int num_readers = 0;
static int writing = 0;
public void read_start() throws InterruptedException {
synchronized(this.getClass()) {
while(writing == 1) wait();
num_readers++;
}
}
public void read_end() {
synchronized(this.getClass()) {
if(--num_readers == 0) notifyAll();
}
}
public void write_start() throws InterruptedException{
synchronized(this.getClass()) {
while(num_readers > 0) wait();
writing = 1;
}
}
public void write_end() {
this.getClass().notifyAll();
}
}
Also is this implementation any different from declaring each method
public static synchronized read_start()
for example?
Thanks
No - you're implicitly calling this.wait(), despite not having synchronized on this, but instead on the class. Likewise you're calling this.notifyAll() in read_end. My suggestions:
Don't extend Thread - you're not specializing the thread at all.
Don't use static variables like this from instance members; it makes it look like there's state on a per-object basis, but actually there isn't. Personally I'd just use instance variables.
Don't use underscores in names - the conventional Java names would be numReaders, readEnd (or better, endRead) etc.
Don't synchronize on either this or the class if you can help it. Personally I prefer to have a private final Object variable to lock on (and wait etc). That way you know that only your code can be synchronizing on it, which makes it easier to reason about.
You never set writing to 0. Any reason for using an integer instead of a boolean in the first place?
Of course, it's better to use the classes in the framework for this if at all possible - but I'm hoping you're really writing this to understand threading better.
You can achieve your goal in much simpler way by using
java.util.concurrent.locks.ReentrantReadWriteLock
Just grab java.util.concurrent.locks.ReentrantReadWriteLock.ReadLock when you start reading and java.util.concurrent.locks.ReentrantReadWriteLock.WriteLock when you start writing.
This class is intended exactly for that - allow multiple readers that are mutually exclusive with single writer.
Your particular implementation of read_start is not equivalent to simply declaring the method synchronized. As was noted by J. Skeed, you need to call notify (and wait) on the object you are synchronize-ing with. You cannot use an unrelated object (here: the class) for this. And using the synchronized modified on a method does not make the method implicitly call wait or anything like that.
There is, BTW., an implementation of read/write locks, which ships with the core JDK: java.util.concurrent.locks.ReentrantReadWriteLock. Using that one, your code might look like the following instead:
class Resource {
private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
private final Lock rlock = lock.readLock();
private final Lock wlock = lock.writeLock();
void read() { ... /* caller has to hold the read lock */ ... }
void write() { ... /* caller has to hold the write lock */ ... }
Lock readLock() { return rlock; }
Lock writeLock() { return wlock; }
}
Usage
final Resource r = ...;
r.readLock().lock();
try {
r.read();
} finally {
r.unlock();
}
and similar for the write operation.
The example code synchronizes on this.getClass(), which will return the same Class object for multiple instances of ReadersWriters in the same class loader. If multiple instances of ReadersWriters exist, even though you have multiple threads, there will be contention for this shared lock. This would be similar to adding the static keyword to a private lock field (as Jon Skeet suggested) and would likely lead to worse performance than synchronizing on this or a private lock object. More specifically, one thread which is reading would be blocking another thread which is writing, and this is likely undesirable.
I understood synchronization of a block of code means that particular code will be accessed by only one thread at time even many thread are waiting to access that.
when we are writing thread class in run method we starting synchronized block by giving object.
for example
class MyThread extends Thread{
String sa;
public MyThread(String s){
sa=s;
}
public void run(){
synchronized(sa){
if(sa.equals("notdone"){
//do some thing on object
}
}
}
}
here we gave sa object to synchronized block what is the need of that.any how we are going to provide synchronization for that particular block of code
I would suggest
extend Runnable rather than Thread.
don't lock in the Runnable on an external. Instead you should be calling a method which may use an internal lock.
String is not a good choice as a lock. It means that "hi" and "hi" will share a lock but new String("hi") will not.
if you are locking all other threads for the life of the thread, why are you using multiple threads?
The parameter object of the synchronized block is the object on which the block locks.
Thus all synchronized blocks with the same object are excluding each other's (and all synchronized methods' of this same object) simultaneous execution.
So if you have this example
class ExampleA extends Thread() {
public ExampleA(Object l) {
this.x = l;
}
private Object x;
public void run() {
synchronized(x) { // <-- synchronized-block A
// do something
}
}
}
class ExampleB extends Thread() {
public ExampleB(Object l) {
this.x = l;
}
private Object x;
public void run() {
synchronized(x) { // <-- synchronized-block B
// do something else
}
}
}
Object o1 = new Object();
Object o2 = new Object();
Thread eA1 = new ExampleA(o1);
Thread eA2 = new ExampleA(o2);
Thread eB1 = new ExampleB(o1);
Thread eB2 = new ExampleB(o2);
eA1.start(); eA2.start(); eB1.start(); eB2.start();
Now we have two synchronized blocks (A and B, in classes ExampleA and ExampleB), and we have two lock objects (o1 and o2).
If we now look at the simultaneous execution, we can see that:
A1 can be executed in parallel to A2 and B2, but not to B1.
A2 can be executed in parallel to A1 and B1, but not to B2.
B1 can be executed in parallel to A2 and B2, but not to A1.
B2 can be executed in parallel to A1 and B1, but not to A2.
Thus, the synchronization depends only on the parameter object, not on the choice of synchronization block.
In your example, you are using this:
synchronized(sa){
if(sa.equals("notdone"){
//do some thing on object
}
}
This looks like you try to avoid that someone changes your instance variable sa to another string while you are comparing it and working - but it does not avoid this.
Synchronization does not work on a variable, it works on an object - and the object in question should usually be either some object which contains the variable (the current MyThread object in your case, reachable by this), or a special object used just for synchronization, and which is not changed.
As Peter Lawrey said, String objects usually are bad choices for synchronization locks, since all equal String literals are the same object (i.e. would exclude each other's synchronized blocks), while a equal non-literal string (e.g. created at runtime) is not the same object, and thus would not exclude synchronized blocks by other such objects or literals, which often leads to subtle bugs.
All threads synchronized on this objects will wait until current thread finishes its work. This is useful for example if you have read/write operation to collection that your wish to synchronized. So you can write sychronized block in methods set and get. In this case if one thread is reading information not all other threads that want to either read or write will wait.
So the question is what is the function of the object that a block synchronizes on?
All instances of Object have what is called a monitor. In normal execution this monitor is unowned.
A thread wishing to enter a synchronized block must take possession of the object monitor. Only one thread can posses the monitor at a time, however. So, if the monitor is currently unowned, the thread takes possession and executes the synchronized block of code. The thread releases the monitor when it leaves the synchronized block.
If the monitor is currently owned, then the thread needing to enter the synchronized block must wait for the monitor to be freed so it can take ownership and enter the block. More than one thread can be waiting and if so, then only one will be given ownership of the monitor. The rest will go back to waiting.