While going through the book Java Concurrency in Practice, I came across this piece of code where the "fromAccount" and "toAccount" objects are locked one after the other to prevent dynamic lock order deadlock.
public void transferMoney(Account fromAccount, Account toAccount) {
    synchronized (fromAccount) {
        synchronized (toAccount) {
            ........
        }
    }
}
I am confused as to why this lock ordering is needed at all. If we just wanted to make sure that both objects are locked at the same time, wouldn't we get the same effect from a single regular synchronized block inside which the fromAccount and toAccount objects are accessed? I am sure that I am missing some fundamental concept here. Thank you for your help.
public void transferMoney(Account fromAccount, Account toAccount) {
    synchronized (this) {
        fromAccount.someMethod();
        toAccount.someMethod();
    }
}
Your alternative to the lock-ordering example is what you want to avoid: having a central lock that everything is using, because then you don't get concurrent transfers. Everything waits for that one lock and only one transfer can proceed at a time. It's not clear what "this" is or what its scope can possibly be, but if there are multiple instances of this transfer service then locking on it doesn't do any good, because one transfer involving one account can go through one instance while another transfer involving that same account can go through another. Therefore it seems like there can only be one instance, which reduces your concurrency to one transfer at a time. You won't deadlock, but you won't process a lot of transfers quickly either.
The idea behind this toy example (which you shouldn't mistake for anything like how anybody would actually transfer money) is to get better concurrency by locking only the individual accounts involved in the transfer. For a lot of transfers, the accounts involved aren't part of any other concurrent transfer, so you'd like to process those transfers concurrently, maximizing concurrency by narrowing the scope of the locking to the individual accounts. But this scheme runs into trouble if some account is involved in multiple concurrent transfers and the locks are acquired in a different order for some transfers.
First, it should be noted that the example you have brought (based on your comment, it's page 208, listing 10.2) is a bad example - one that ends in a deadlock. The objects are not locked one after the other to prevent dynamic lock order deadlock, they are an example of where dynamic lock order will happen!
Now, you are suggesting locking on "this", but what is "this" anyway, and what is the scope of the lock?
It's clear that the same object has to be used for all operations - withdraw, deposit, transfer. If separate objects are used for them, then one thread could do a deposit on account A, while another thread transfers from account A to account B, and they won't be using the same lock so the balance will be compromised. So the lock object for all accesses to the same account should be the same one.
As Nathan Hughes explained, one needs to localize the locking. We can't use one central lock object for all the accounts, or we'll have them all waiting for each other despite not actually working on the same resources. So using a central locking object is also out of the question.
So it appears that we need to localize the locks so that each account's balance will have its own lock, so as to allow parallel operations between unrelated accounts, but that this lock has to be used for all operations - withdraw, deposit and transfer.
And here comes the problem - when it's just withdraw or deposit, you are operating on just one account, and so you need to just lock that account. But when you transfer, you have two objects involved. So you need to have both their balances locked in case there are other threads that want to operate on either.
Any object that holds a single lock for two or more accounts will break one of the two points above. Either it won't be used for all operations, or it will not be localized enough.
This is why they are attempting to lock the two locks one after another. Their solution was to make the Account object itself the lock for the account - which fulfils both the "all operations" condition and the "locality" condition. But still we need to make sure we have the locks for both accounts before we can transfer the money.
But again, this source is an example of a deadlock prone code. This is because one thread may want to transfer from account A to account B, while another will want to transfer from account B to account A. In that case, the first one locks the A account, the second locks the B account, and then they are deadlocked because they have performed the locking in opposite order.
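For completeness, here is a minimal sketch of one way to make the acquisition order consistent, so that an A-to-B transfer and a B-to-A transfer both take the two monitors in the same order. It orders by System.identityHashCode and falls back to a tie-breaking lock when the hashes collide. HashAccount, HashOrderedTransfer and TIE_LOCK are hypothetical names for illustration, not the book's listing.

```java
// Sketch only: HashAccount and HashOrderedTransfer are hypothetical names.
final class HashAccount {
    private long balance; // guarded by this account's own monitor

    HashAccount(long balance) { this.balance = balance; }

    synchronized long balance() { return balance; }
    synchronized void debit(long amount) { balance -= amount; }
    synchronized void credit(long amount) { balance += amount; }
}

final class HashOrderedTransfer {
    // Taken only when both identity hashes are equal, to serialize that rare case.
    private static final Object TIE_LOCK = new Object();

    static void transferMoney(HashAccount from, HashAccount to, long amount) {
        int fromHash = System.identityHashCode(from);
        int toHash = System.identityHashCode(to);
        if (fromHash < toHash) {
            synchronized (from) { synchronized (to) { move(from, to, amount); } }
        } else if (fromHash > toHash) {
            synchronized (to) { synchronized (from) { move(from, to, amount); } }
        } else {
            synchronized (TIE_LOCK) {
                synchronized (from) { synchronized (to) { move(from, to, amount); } }
            }
        }
    }

    private static void move(HashAccount from, HashAccount to, long amount) {
        from.debit(amount);  // monitors are reentrant; both are already held here
        to.credit(amount);
    }
}
```

Whichever direction the money flows, the account with the smaller identity hash is always locked first, so two opposite transfers can no longer each hold one lock while waiting for the other.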
The fundamental point here is to avoid race conditions. In your case, if a method in some other class also transfers money to toAccount, then an incorrect amount may be written to toAccount. For example, suppose there are 2 classes that perform money transfers.
One class has a method:
public void transferMoney(Account fromAccount, Account toAccount) {
    synchronized (this) {
        fromAccount.someMethod();
        toAccount.someMethod();
    }
}
and the other class contains:
public void transferMoneyNow(Account fromAccount1, Account toAccount) {
    synchronized (this) {
        fromAccount1.someMethod();
        toAccount.someMethod();
    }
}
If both methods run at the same time, each one synchronizes on a different "this", so they do not exclude each other, and due to the race condition an incorrect amount may be written to toAccount.
After reading a little bit about the java memory model and synchronization, a few questions came up:
Even if Thread 1 synchronizes the writes, then although the effect of the writes will be flushed to main memory, Thread 2 will still not see them because the read came from level 1 cache. So synchronizing writes only prevents collisions on writes. (Java thread-safe write-only hashmap)
Second, when a synchronized method exits, it automatically establishes a happens-before relationship with any subsequent invocation of a synchronized method for the same object. This guarantees that changes to the state of the object are visible to all threads. (https://docs.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html)
A third website (I can't find it again, sorry) said that every change to any object - it doesn't care where the reference comes from - will be flushed to memory when the method leaves the synchronized block and establishes a happens-before situation.
My questions are:
What is really flushed back to memory by exiting the synchronized block? (As some websites also said that only the object whose lock has been acquired will be flushed back.)
What does the happens-before relationship mean in this case? And what will be re-read from memory on entering the block, and what not?
How does a lock achieve this functionality (from https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/Lock.html):
All Lock implementations must enforce the same memory synchronization semantics as provided by the built-in monitor lock, as described in section 17.4 of The Java™ Language Specification:
A successful lock operation has the same memory synchronization effects as a successful Lock action.
A successful unlock operation has the same memory synchronization effects as a successful Unlock action.
Unsuccessful locking and unlocking operations, and reentrant locking/unlocking operations, do not require any memory synchronization effects.
If my assumption that everything will be re-read and flushed is correct, this is achieved by using a synchronized block in the lock and unlock functions (which are mostly also necessary), right? And if it's wrong, how can this functionality be achieved?
Thank you in advance!
The happens-before relationship is the fundamental thing you have to understand, as the formal specification operates in terms of it. Terms like “flushing” are technical details that may help you understand it, or mislead you in the worst case.
If a thread performs action A within a synchronized(object1) { … }, followed by a thread performing action B within a synchronized(object1) { … }, assuming that object1 refers to the same object, there is a happens-before-relationship between A and B and these actions are safe regarding accessing shared mutable data (assuming, no one else modifies this data).
But this is a directed relationship, i.e. B can safely access the data modified by A. But when seeing two synchronized(object1) { … } blocks, being sure that object1 is the same object, you still need to know whether A was executed before B or B was executed before A, to know the direction of the happens-before-relationship. For ordinary object oriented code, this usually works naturally, as each action will operate on whatever previous state of the object it finds.
Speaking of flushing, leaving a synchronized block causes flushing of all written data and entering a synchronized block causes rereading of all mutable data, but without the mutual exclusion guarantee of a synchronized on the same instance, there is no control over which happens before the other. Even worse, you cannot use the shared data to detect the situation, as without blocking the other thread, it can still inconsistently modify the data you’re operating on.
Since synchronizing on different objects can’t establish a valid happens-before relationship, the JVM’s optimizer is not required to maintain the global flush effect. Most notably, today’s JVMs will remove synchronization, if Escape Analysis has proven that the object is never seen by other threads.
So you can synchronize on an object to guard access to data stored somewhere else, i.e. not in that object, but this still requires consistently synchronizing on the same object instance for every access to the same shared data, which complicates the program logic compared to simply synchronizing on the object that contains the guarded data.
volatile variables, as used by Locks internally, also have a global flush effect, if threads are reading and writing the same volatile variable and use the value to form a correct program logic. This is trickier than with synchronized blocks, as there is no mutual exclusion of code execution, or, well, you could see it as having a mutual exclusion limited to a single read, write, or CAS operation.
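To make the "same object instance for all access" point concrete, here is a minimal sketch. GuardedCounter and its fields are hypothetical names; the only thing it is meant to show is that every read and every write of the guarded fields goes through the same lock object, which is what lets a later reader rely on the happens-before edge from an earlier writer.

```java
// Sketch only: a hypothetical class whose state is guarded by one private lock object.
final class GuardedCounter {
    private final Object lock = new Object(); // the single lock every access must use
    private int count;                        // guarded by lock
    private String lastWriter = "";           // guarded by lock

    void increment(String who) {
        synchronized (lock) {                 // writer: releases the lock on exit
            count++;
            lastWriter = who;
        }
    }

    String snapshot() {
        synchronized (lock) {                 // reader: acquires the same lock
            return count + ":" + lastWriter;  // sees both fields in a consistent state
        }
    }
}
```

If snapshot() synchronized on a different object, or on nothing at all, it would have no happens-before relationship with increment() and could observe a stale or half-updated pair of values.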
There is no flush per-se, it's just easier to think that way (easier to draw too); that's why there are lots of resources online that refer to flush to main memory (RAM assuming), but in reality it does not happen that often. What really happens is that a drain is performed of the load and/or store buffers to L1 cache (L2 in case of IBM) and it's up to the cache coherence protocol to sync data from there; or to put it differently caches are smart enough to talk to each other (via a BUS) and not fetch data from main memory all the time.
This is a complicated subject (disclaimer: even though I try to do a lot of reading on this, and a lot of tests when I have time, I absolutely do not understand it in its full glory). It's about potential compiler/CPU/etc. re-orderings (program order is not guaranteed to be respected), about flushes of the buffers, about memory barriers, release/acquire semantics... I don't think that your question is answerable without a PhD report; that's why there are higher layers in the JLS, called "happens-before".
If you understood at least a small portion of the above, you would see that your questions (at least the first two) make very little sense.
What is really flushed back to memory by exiting the synchronized block
Probably nothing at all - caches "talk" to each other to sync data. I can only think of two other cases: when you read some data for the first time, and when a thread dies, in which case all written data will be flushed to main memory (but I'm not sure).
What does the happens-before relationship mean in this case? And what will be re-read from memory on entering the block, and what not?
Really, the same sentence as above.
How does a lock achieve this functionality
Usually by introducing memory barriers; just like volatiles do.
Assume we have to undertake a transfer between any 2 accounts (among hundreds out there) as part of a transaction.
And there would be multiple similar transactions running concurrently in a typical multi-threaded environment.
The usual convention would be as below (maintaining the lock order as per a pre-designed convention):
lock account A
lock account B
transfer(A,B)
release B
release A
Is there any way to attempt the locks and release as an atomic operation?
Yes there is: you need to lock the locks under a lock. In other words, you need to create a lock hierarchy. But this solution is not very efficient because it decreases lock granularity.
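A rough sketch of what "locking the locks under a lock" could look like is below. GuardedLocking and GUARD are hypothetical names, and the sketch assumes that every thread that needs two account locks goes through lockBoth and does not acquire further locks while holding the pair.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch only: GuardedLocking and GUARD are hypothetical names.
final class GuardedLocking {
    // Held only while the pair is being acquired, not during the transfer itself.
    private static final ReentrantLock GUARD = new ReentrantLock();

    static void lockBoth(ReentrantLock a, ReentrantLock b) {
        GUARD.lock();
        try {
            a.lock(); // only the thread holding GUARD ever waits for an account lock,
            b.lock(); // so two threads cannot wait on each other in a cycle
        } finally {
            GUARD.unlock();
        }
    }

    static void unlockBoth(ReentrantLock a, ReentrantLock b) {
        b.unlock();
        a.unlock();
    }
}
```

The cost is the one mentioned above: while one thread is waiting inside lockBoth, every other thread that wants any pair of locks has to wait for GUARD, even if its accounts are unrelated.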
It looks like in your case it would be sufficient to always take locks in the same order. For example always lock user with lesser ID first.
A transaction is atomic by the ACID definition (A for atomicity). Isolation (at least READ_COMMITTED) guarantees that another transaction touching account A at the same time will wait until the previously started transaction has finished. So you don't actually need to lock the accounts explicitly: they will be locked by the underlying implementation (a database, for example), and those locks will be more efficient because they can use optimistic locking techniques.
But this is only true if they all participate in one transactional context (as in a JTA environment, for example). In such an environment you could just start a transaction at the beginning of the transfer method, with no need to lock account A and account B.
If they are not in the same transactional context, you can introduce another locking object, but this will significantly reduce performance, as threads will block each other even when one is working with accounts A and B and another with accounts C and D. There are techniques to avoid this situation (see ConcurrentHashMap, for example, where locks are on buckets and not on the whole object).
But for your particular example the answer can only be some general thoughts, as the example is too short to examine further. I think the variant of locking account A and account B in a particular order is normal for the given situation (but be very careful with it, as it can lead to deadlocks, and assuming transfer is not the only method that works with these accounts, it is quite risky).
You can try the following code.
Note: it only works for two locks, and I'm unsure how to make it scale to more locks.
The idea is that you take the first lock and try to take the second one.
If that fails, we know that one lock is free right now but the other is busy.
Thus we release the first lock and swap them, so we block on the one that was busy and then try to take the one that (WAS!) free, if it is still free.
Rinse and repeat.
It is statistically next to impossible for this code to end in a StackOverflowError;
I think handling it and raising an error is better than making it loop, since it would be a signal that something somewhere is going very wrong.
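import java.util.concurrent.locks.ReentrantLock;

final class TwoLocks { // wrapper class name is only illustrative
    public static void takeBoth(ReentrantLock l1, ReentrantLock l2) {
        l1.lock();                      // wait for the first lock
        if (l2.tryLock()) { return; }   // second one was free: we hold both
        l1.unlock();                    // second one is busy: give the first back
        try { takeBoth(l2, l1); }       // retry with the locks swapped
        catch (StackOverflowError e) { throw new Error("??"); }
    }

    public static void releaseBoth(ReentrantLock l1, ReentrantLock l2) {
        // if l1 is not held, this unlock fails with IllegalMonitorStateException before l2 is touched
        if (!l1.isHeldByCurrentThread()) { l1.unlock(); }
        l2.unlock(); // this may fail; in that case we did not touch l1
        l1.unlock();
    }
}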
I am learning multithreading, and I have a little question.
When I am sharing some variable between threads (an ArrayList, or something else like a double or float), should it be locked by the same object for reads and writes? I mean, when one thread is setting the variable's value, can another read it at the same time without any problems? Or should it be locked by the same object, forcing the reading thread to wait until the other thread has finished changing it?
All access to shared state must be guarded by the same lock, both reads and writes. A read operation must wait for the write operation to release the lock.
As a special case, if all you would do inside your synchronized blocks amounts to exactly one read or write operation, then you may dispense with the synchronized block and mark the variable as volatile.
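A small sketch of that special case, with a hypothetical StopFlag class: each method does exactly one read or one write of the shared variable, so volatile is enough and no lock is needed.

```java
// Sketch only: StopFlag is a hypothetical example class.
final class StopFlag {
    private volatile boolean stopped; // written by one thread, read by others

    void stop() {
        stopped = true;               // a single write
    }

    boolean isStopped() {
        return stopped;               // a single read
    }
}
```

This does not extend to compound actions: something like count++ on a volatile int is a read followed by a write, so it would still need a lock or an atomic class.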
Short: It depends.
Longer:
There are many "correct answers", one for each different scenario (and that makes programming fun).
Does the value being read have to be the latest?
Does a written value have to be made known to all readers?
Do I need to take care of a race condition if two threads write?
Will there be any issue if an old/previous value is read?
What is the correct behaviour?
Does it really need to be correct? (Yes, sometimes you really don't care.)
tl;dr
For example, not all threaded programming needs to be "always correct":
sometimes you trade off correctness for performance (e.g. a log or progress counter)
sometimes reading an old value is just fine
sometimes you need eventual correctness (e.g. in map-reduce, nothing is right or in sync until everything is done)
in some cases, being correct is mandatory at every moment (e.g. your bank account balance)
in write-once, read-only cases it doesn't matter
sometimes threads work in groups with complex cases
sometimes many small, independent locks run faster, but sometimes a flat global lock is faster
and many, many other possible cases
Here is my suggestion: if you are learning, you should think "why do I need a lock?" and "why can a lock help in DIFFERENT cases?" (not just the given sample from the textbook), and "will it fail, or what could happen, if a lock is missing?"
If all threads are reading, you do not need to synchronize.
If one or more threads are reading and one or more are writing, you will need to synchronize somehow. If the collection is small you can use synchronized: either add a synchronized block around the accesses to the collection, make the methods that access the collection synchronized, or use a thread-safe collection (for example, Vector, or one of the java.util.concurrent collections).
If you have a large collection and you want to allow shared reading but exclusive writing you need to use a ReadWriteLock. See here for the JavaDoc and an exact description of what you want with examples:
ReentrantReadWriteLock
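As a minimal sketch of the shared-reading, exclusive-writing setup, assuming a hypothetical SharedList wrapper around a plain ArrayList:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch only: SharedList is a hypothetical wrapper, not a library class.
final class SharedList {
    private final ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
    private final List<String> items = new ArrayList<>(); // guarded by rw

    void add(String s) {
        rw.writeLock().lock();        // exclusive: blocks readers and other writers
        try {
            items.add(s);
        } finally {
            rw.writeLock().unlock();
        }
    }

    boolean contains(String s) {
        rw.readLock().lock();         // shared: many readers may hold this at once
        try {
            return items.contains(s);
        } finally {
            rw.readLock().unlock();
        }
    }

    int size() {
        rw.readLock().lock();
        try {
            return items.size();
        } finally {
            rw.readLock().unlock();
        }
    }
}
```

The unlock calls sit in finally blocks so that an exception inside the guarded section cannot leave the lock held.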
Note that this question is pretty common and there are plenty of similar examples on this site.
Here is the situation: I have 3 instances. One is a manager, which assigns jobs, and two are workers, which do the jobs. Let's say the user needs to withdraw something; the workflow is as follows:
1. Request sent
2. Manager assigns the job, depending on the workers' load
3. Worker does the work (reduces the number in the db)
4. Worker tells the manager instance the job is finished
Everything works, but two instances may handle two withdrawals on the same account at once, which can cause problems such as a negative balance. You could say: add an execution channel or queue or something, so that only one instance executes the database write function.
But the problem is that as I add more and more instances, having only one instance for writing may reduce productivity. Any recommendations? Thanks.
How do you do it?
By carefully designing the operations to be atomic, and doing all of the relevant checks, accesses and updates as part of the atomic action.
Now you mention that you have a database as part of the implementation technology. Assuming that it is a transactional database, you should be mapping each of these atomic operations to a transaction. So, to build on your description:
1. Request received
2. Manager assigns job to a worker
3. Worker does the work of the withdrawal as follows:
   3.1. Start database transaction
   3.2. Check that account exists.
   3.3. Check that withdrawal > 0.
   3.4. Check that balance - withdrawal >= 0
   3.5. Update the database balance.
   3.6. Commit the transaction, or roll it back if there were errors.
4. Worker reports outcome to Manager
5. Manager responds to request.
On the other hand, if there was no database involved, and you were simply updating in-memory objects, then you'd create an Account class that had a synchronized method for doing a withdrawal that did steps 3.3 through 3.5 of the above, and have the worker call the method with the relevant parameters.
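For the in-memory case, a minimal sketch could look like this. SimpleAccount is a hypothetical class, balances are kept as whole units in a long for simplicity, and the boolean return value is just one possible way to report the outcome to the worker.

```java
// Sketch only: SimpleAccount is a hypothetical in-memory account.
final class SimpleAccount {
    private long balance; // guarded by this

    SimpleAccount(long initialBalance) {
        this.balance = initialBalance;
    }

    /** Checks and update happen under one lock, so they form a single atomic action. */
    synchronized boolean withdraw(long amount) {
        if (amount <= 0) {
            return false;             // check that withdrawal > 0
        }
        if (balance - amount < 0) {
            return false;             // check that balance - withdrawal >= 0
        }
        balance -= amount;            // update the balance
        return true;
    }

    synchronized long balance() {
        return balance;
    }
}
```

Because the check and the update are inside the same synchronized method, two workers withdrawing from the same account cannot both pass the balance check against the old value.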
I suggest you read about synchronized blocks/methods or ReentrantLock. Basically you need to put a lock around both the read and the write in your withdraw function, to make sure two threads cannot read/write the same piece of data simultaneously.
Suppose we have a class called AccountService that manages the state of accounts.
AccountService is defined as
interface AccountService {
    public void debit(Account account);
    public void credit(Account account);
    public void transfer(Account account, Account account1);
}
Given this definition, what is the best way to implement transfer() so that you can guarantee that transfer is an atomic operation?
I'm interested in answers that reference Java 1.4 code as well as answers that might use resources from java.util.concurrent in Java 5
Synchronize on both Account objects and do the transfer. Make sure you always synchronize in the same order. In order to do so, make the Accounts implement Comparable, sort the two accounts, and synchronize in that order.
If you don't order the accounts, you run the possibility of deadlock if one thread transfers from A to B and another transfers from B to A.
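A minimal sketch of the Comparable approach is below. It assumes each account carries a unique id to compare on, which the question does not state, and ComparableAccount and ComparableTransfer are hypothetical names. It uses generics and Long.compare, so it targets a current Java rather than 1.4.

```java
// Sketch only: assumes a unique id per account; all names are hypothetical.
final class ComparableAccount implements Comparable<ComparableAccount> {
    private final long id;
    private long balance; // guarded by this

    ComparableAccount(long id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    @Override
    public int compareTo(ComparableAccount other) {
        return Long.compare(id, other.id);
    }

    synchronized long balance() { return balance; }
    synchronized void debit(long amount) { balance -= amount; }
    synchronized void credit(long amount) { balance += amount; }
}

final class ComparableTransfer {
    static void transfer(ComparableAccount from, ComparableAccount to, long amount) {
        // "Sort" the two accounts so every thread locks them in the same order.
        ComparableAccount first = from.compareTo(to) <= 0 ? from : to;
        ComparableAccount second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.debit(amount);
                to.credit(amount);
            }
        }
    }
}
```

A transfer from A to B and a transfer from B to A both lock the account with the smaller id first, so neither can end up holding one lock while waiting for the other.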
This exact example is discussed on page 207 of Java Concurrency in Practice, a critical book for anybody doing multi-threaded Java development. The example code is available from the publisher's website:
Dynamic lock-ordering deadlock. (bad)
Inducing a lock ordering to avoid deadlock.
A classic example very well explained here - http://www.javaworld.com/javaworld/jw-10-2001/jw-1012-deadlock.html?page=4
You probably need full transaction support (if it's a real application, of course).
The difficulty of the solution depends heavily on your environment. Describe your system in detail and we'll try to help you (what kind of application? does it use a web server? which web server? what is used to store data? and so on)
If you can guarantee that all accesses are made through the transfer method, then probably the easiest approach is just to make transfer a synchronized method. This will be thread-safe because this guarantees that only one thread will be running the transfer method at any one time.
If other methods may also access the AccountService, then you might decide to have them all use a single global lock. An easy way of doing this is to surround all code that accesses the AccountService in a synchronized (X) {...} block where X is some shared / singleton object instance (that could be the AccountService instance itself). This will be thread safe because only one thread will be accessing the AccountService at any one time, even if they are in different methods.
If that still isn't sufficient, then you'll need to use more sophisticated locking approaches. One common approach would be to lock the accounts individually before you modify them... but then you must be very careful to take the locks in a consistent order (e.g. by account ID) otherwise you will run into deadlocks.
Finally if AccountService is a remote service then you are into distributed locking territory.... unless you have a PhD in computer science and years of research budget to burn you should probably avoid going there.
Couldn't you avoid having to synchronize using an AtomicReference<Double> for the account balance, along with get() and set()?
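To illustrate what an atomic variable can and cannot give you here, below is a hedged sketch. It swaps AtomicReference<Double> for an AtomicLong holding whole units, because AtomicReference.compareAndSet compares references rather than values, which is easy to get wrong with boxed Doubles. AtomicBalance is a hypothetical name.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch only: a lock-free balance for ONE account, using a compare-and-set loop.
final class AtomicBalance {
    private final AtomicLong units;

    AtomicBalance(long initial) {
        this.units = new AtomicLong(initial);
    }

    boolean withdraw(long amount) {
        while (true) {
            long current = units.get();
            if (amount <= 0 || current < amount) {
                return false;                        // invalid amount or insufficient funds
            }
            if (units.compareAndSet(current, current - amount)) {
                return true;                         // nobody changed it in between
            }
            // another thread updated the balance first: re-read and retry
        }
    }

    void deposit(long amount) {
        units.addAndGet(amount);
    }

    long get() {
        return units.get();
    }
}
```

Note that a plain get() followed by set() would not be atomic, which is why the sketch uses compareAndSet. Each account's update is atomic on its own, but a withdraw on one account followed by a deposit on another is still two separate steps, so this alone does not make transfer() atomic across both accounts.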