Out-of-order writes for Double-checked locking - java

In the examples given for out-of-order writes in double-checked locking scenarios (ref: IBM article & Wikipedia article), I could not understand the simple reason why Thread 1 would come out of the synchronized block before the constructor has fully initialized the object. As per my understanding, allocating with "new" and calling the constructor should execute in sequence, and the synchronized lock should not be released until all the work is completed.
Please let me know what I am missing here.

The constructor can have completed - but that doesn't mean that all the writes involved within that constructor have been made visible to other threads. The nasty situation is when the reference becomes visible to other threads (so they start using it) before the contents of the object become visible.
You might find Bill Pugh's article on it helps shed a little light, too.
Personally I just avoid double-checked locking like the plague, rather than trying to make it all work.
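One common way to avoid double-checked locking altogether is the initialization-on-demand holder idiom, which relies on the JVM's class-initialization guarantees instead of explicit locking. A minimal sketch, where HolderSingleton and Holder are illustrative names that do not come from the articles above:

```java
// Sketch only: HolderSingleton and Holder are illustrative names.
final class HolderSingleton {
    private HolderSingleton() { }

    // The nested class is not initialized until getInstance() first touches it.
    // The JVM initializes a class at most once and makes the result visible
    // to every thread that uses the class afterwards.
    private static final class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```

Every call to HolderSingleton.getInstance() returns the same object, and no caller ever takes a lock explicitly.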

The code in question is here:
public static Singleton getInstance()
{
if (instance == null)
{
synchronized(Singleton.class) { //1
if (instance == null) //2
instance = new Singleton(); //3
}
}
return instance;
}
Now the problem with this cannot be understood as long as you keep thinking that the code executes in the order it is written. Even if it does, there is the issue of cache synchronization across multiple processors (or cores) in a Symmetrical Multiprocessing architecture, which is the mainstream today.
Thread1 could for example publish the instance reference to the main memory, but fail to publish any other data inside the Singleton object that was created. Thread2 will observe the object in an inconsistent state.
As long as Thread2 doesn't enter the synchronized block, the cache synchronization doesn't have to happen, so Thread2 can go on indefinitely without ever observing the Singleton in a consistent state.

Thread 2 checks to see if the instance is null when Thread 1 is at //3 .
public static Singleton getInstance()
{
if (instance == null)
{
synchronized(Singleton.class) { //1
if (instance == null) //2
instance = new Singleton(); //3
}
}
return instance;//4
}
At this point the memory for the Singleton has been allocated from the heap and a pointer to it has been stored in the instance reference, although the constructor has not necessarily finished, so the condition instance == null evaluated by Thread 2 is false.
Note that because instance is not null when Thread2 checks it, thread 2 does not enter the synchronized block and instead returns a reference to a " fully constructed, but partially initialized, Singleton object."

There's a general problem with code not being executed in the order it's written. In Java, a thread is only obligated to be consistent with itself. An instance created on one line with new has to be ready to go on the next. There's no such obligation to other threads. For instance, if fieldA is 1 and fieldB is 2 going into this code on thread 1:
fieldA = 5;
fieldB = 10;
and thread 2 runs this code:
int y = fieldB;
int x = fieldA;
x y values of 1 2, 5 2, and 5 10 are all to be expected from simple interleaving. But 1 10, where thread 2 sees the new fieldB and then the old fieldA, as if fieldB had been set before fieldA, is perfectly legal as well. So double-checked locking is a special case of a more general problem, and if you work with multiple threads you need to be aware of it, particularly if they all access the same fields.
One simple solution from Java 1.5 that should be mentioned: a write to a volatile field is guaranteed to be visible to any thread that subsequently reads that field, together with everything the writing thread did before the write, and accesses to volatile fields are not reordered with each other. If fieldA and fieldB above were declared volatile, an x y value of 1 10 would not be possible. If instance is volatile, double-checked locking works. There's a cost to using volatile fields, but it's usually less than synchronizing on every call, so double-checked locking becomes a pretty good idea. It's an even better idea because it avoids having a bunch of threads waiting to synch while CPU cores are sitting idle.
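A minimal sketch of double-checked locking with a volatile field, assuming Java 5 or later. VolatileSingleton and its value field are made-up names for illustration; the //1, //2, //3 markers match the ones used in the earlier listings.

```java
// Sketch only: VolatileSingleton and value are illustrative names.
final class VolatileSingleton {
    // volatile: the write at //3 happens-before any later read that sees it,
    // so no thread can observe a non-null reference to a half-built object.
    private static volatile VolatileSingleton instance;

    private final int value;

    private VolatileSingleton() {
        value = 42; // state that every caller must be able to see
    }

    static VolatileSingleton getInstance() {
        VolatileSingleton local = instance;          // one volatile read on the fast path
        if (local == null) {
            synchronized (VolatileSingleton.class) { //1
                local = instance;
                if (local == null) {                 //2
                    local = new VolatileSingleton(); //3
                    instance = local;
                }
            }
        }
        return local;
    }

    int value() {
        return value;
    }
}
```

The local variable is optional; it only saves a second volatile read once the instance exists.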
But you do want to understand this (if you can't be talked out of multithreading). On the one hand you need to avoid timing problems and on the other avoid bringing your program to a halt with all the threads waiting to get into synch blocks. And it's very difficult to understand.

Related

If synchronized creates a happen-before relationship and prevents reordering why is volatile needed for DCL

I'm trying to understand the need for volatile in double-checked locking (I'm aware there are better ways than DCL though) I've read a few SO questions similar to mine, but none seem to explain what I'm looking for. I've even found some upvoted answers on SO that have said volatile is not needed (even when the object is mutable) however, everything I've read says otherwise.
What I want to know is why volatile is necessary in DCL if synchronized creates a happens-before relationship and prevents reordering?
Here is my understanding of how DCL works and an example:
// Does not work
class Foo {
private Helper helper = null; // 1
public Helper getHelper() { // 2
if (helper == null) { // 3
synchronized(this) { // 4
if (helper == null) { // 5
helper = new Helper(); // 6
} // 7
} // 8
} // 9
return helper; // 10
}
}
This does not work because the Helper object is not immutable and the helper field is not volatile. We know that volatile causes every write to be flushed to memory and every read to come from memory. This is important so that no thread sees a stale object.
So in the example I listed, it's possible for Thread A to begin initializing a new Helper object at Line 6. Then Thread B comes along and see a half initialized object at line 3. Thread B then jumps to line 10 and returns a half initialized Helper object.
Adding volatile fixes this with a happens before relationship and no reordering can be done by the JIT compiler. So the Helper object cannot be written to the helper reference until it is fully constructed (?, at least this is what I think it is telling me...).
However, after reading JSR-133 documentation, I became a bit confused. It states
Synchronization ensures that memory writes by a thread before or
during a synchronized block are made visible in a predictable manner
to other threads which synchronize on the same monitor. After we exit
a synchronized block, we release the monitor, which has the effect of
flushing the cache to main memory, so that writes made by this thread
can be visible to other threads. Before we can enter a synchronized
block, we acquire the monitor, which has the effect of invalidating
the local processor cache so that variables will be reloaded from main
memory. We will then be able to see all of the writes made visible by
the previous release.
So synchronized in Java creates a memory barrier and a happens before relationship.
So the actions are being flushed to memory, so it makes me question why volatile is needed on the variable.
The documentation also states
This means that any memory operations which were visible to a thread
before exiting a synchronized block are visible to any thread after it
enters a synchronized block protected by the same monitor, since all
the memory operations happen before the release, and the release
happens before the acquire.
My guess as to why we need the volatile keyword and why synchronize is not enough, is because the memory operations are not visible to other threads until Thread A exits the synchronized block and Thread B enters the same block on the same lock.
It's possible that Thread A is initializing the object at line 6 and Thread B comes along at Line 3 before there is a flush by Thread A at Line 8.
However, this SO answer seems to contradict that as the synchronized block prevents reordering "from inside a synchronized block, to outside it"
If helper is not null, what ensures that the code will see all the effects of the construction of the helper? Without volatile, nothing would do so.
Consider:
synchronized(this) { // 4
if (helper == null) { // 5
helper = new Helper(); // 6
} // 7
Suppose internally this is implemented as first setting helper to a non-null value and then calling the constructor to create a valid Helper object. No rule prevents this.
Another thread may see helper as non-null but the constructor hasn't even run yet, much less made its effects visible to another thread.
It is vital not to permit any other thread to see helper set to a non-null value until we can guarantee that all consequences of the constructor are visible to that thread.
By the way, getting code like this correct is extremely difficult. Worse, it can appear to work fine 100% of the time and then suddenly break on a different JVM, CPU, library, platform, or whatever. It is generally advised that writing this kind of code be avoided unless proven to be needed to meet performance requirements. This kind of code is hard to write, hard to understand, hard to maintain, and hard to get right.
@David Schwartz's answer is pretty good but there is one thing that I'm not sure is stated well.
My guess as to why we need the volatile keyword and why synchronize is not enough, is because the memory operations are not visible to other threads until Thread A exits the synchronized block and Thread B enters the same block on the same lock.
Actually, in practice it need not be the same lock, because locks come with memory barriers, although the Java Memory Model itself only guarantees visibility between threads that synchronize on the same monitor. volatile is not about locking but about crossing memory barriers, while synchronized blocks are both locks and memory barriers. You need the volatile because even though Thread A has properly initialized the Helper instance and published it to the helper field, Thread B also needs to cross a memory barrier to ensure that it sees all of the updates to Helper.
So in the example I listed, it's possible for Thread A to begin initializing a new Helper object at Line 6. Then Thread B comes along and see a half initialized object at line 3. Thread B then jumps to line 10 and returns a half initialized Helper object.
Right. It is possible that Thread A might initialize the Helper and publish it before it hits the end of the synchronized block. There is nothing stopping it from happening. And because the JVM is allowed to reorder the instructions from the Helper constructor until later, the reference could be published to the helper field before the object is fully initialized. And even if Thread A does reach the end of the synchronized block and Helper then gets fully initialized, there is still nothing that ensures that Thread B sees all of the updated memory.
However, this SO answer seems to contradict that as the synchronized block prevents reordering "from inside a synchronized block, to outside it"
No, that answer is not contradictory. You are confusing what happens with just Thread A and what happens to other threads. In terms of Thread A (and central memory), exiting the synchronized block makes sure that Helper's constructor has fully finished and published to the helper field. But this means nothing until Thread B (or other threads) also cross a memory barrier. Then they too will invalidate the local memory cache and see all of the updates.
That's why the volatile is necessary.
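For comparison, the simplest arrangement in which Thread B always crosses a barrier on the same monitor as Thread A is to synchronize the whole accessor, at the cost of taking the lock on every call. A sketch, where the body of Helper is a made-up placeholder rather than anything from the question:

```java
// Sketch only: the contents of Helper are a placeholder.
class Helper {
    final int[] data = {1, 2, 3};
}

class Foo {
    private Helper helper; // no volatile needed: every access holds the lock

    // The writer and every reader acquire the monitor of this Foo, so the
    // writer's release happens-before each later acquire by a reader.
    public synchronized Helper getHelper() {
        if (helper == null) {
            helper = new Helper();
        }
        return helper;
    }
}
```

Double-checked locking with a volatile field is an attempt to keep this guarantee while skipping the lock once helper is set.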

Questions about how the synchronized keyword works with locks and thread starvation

In this java tutorial there's some code that shows an example to explain the use of the synchronized keyword. My point is, why I shouldn't write something like this:
public class MsLunch {
private long c1 = 0;
private long c2 = 0;
//private Object lock1 = new Object();
//private Object lock2 = new Object();
public void inc1() {
synchronized(c1) {
c1++;
}
}
public void inc2() {
synchronized(c2) {
c2++;
}
}
}
Without bothering create lock objects? Also, why bother instantiate that lock objects? Can't I just pass a null reference? I think I'm missing out something here.
Also, assume that I've two public synchronized methods in the same class accessed by several thread. Is it true that the two methods will never be executed at the same time? If the answer is yes, is there a built-in mechanism that prevents one method from starvation (never been executed or been executed too few times compared to the other method)?
As @11thdimension has replied, you cannot synchronize on a primitive type (e.g., long). It must be an object reference.
So, you might be tempted to do something like the following:
Long c1 = 0L;
public void incC1() {
synchronized(c1) {
c1++;
}
}
This will not work properly, as "c1++" is a shortcut for "c1 = c1 + 1", which actually assigns a new object to c1, and as such, two threads might end up in the same block of synchronized code.
For the lock to work properly, the object being synchronized upon should not be reassigned. (Well, maybe in some rare circumstances where you really know what you are doing.)
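The reassignment can be observed directly by comparing references before and after the increment. A small sketch; BoxedLockDemo and its method are made-up names for illustration:

```java
// Sketch only: BoxedLockDemo is an illustrative name.
class BoxedLockDemo {
    private Long c1 = 0L;

    // Returns true if c1 refers to a different object after the increment,
    // meaning a thread that locked on the old object no longer excludes
    // a thread that locks on the new one.
    boolean incrementReplacesLockObject() {
        Long before = c1;
        c1++;                // unboxes, adds 1, boxes the result into another Long
        return before != c1; // reference comparison on purpose
    }
}
```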
You cannot pass a null reference to the synchronized(...) statement; doing so throws a NullPointerException. Java associates a monitor (intrinsic lock) with every object and uses it to prevent more than one thread from accessing the same protected resource at a time.
You do not always need a separate lock object, as in the case of a synchronized method. In this case, the object instance itself is used as the lock, as if you used 'this' in the method itself:
public void incC1() {
synchronized(this) {
c1++;
}
}
First, you cannot pass a primitive variable to synchronized; it requires a reference. Second, that tutorial is just an example showing a guarded block. It's not c1 and c2 themselves that it's trying to protect; it's trying to protect all the code inside the synchronized block.
The JVM relies on the operating system's scheduling algorithm (see "What is the JVM Scheduling algorithm?").
So it's not the JVM's responsibility to see if threads are starved. You can, however, assign thread priorities to prefer one thread over another.
Every thread has a priority. Threads with higher priority are executed in preference to threads with lower priority. Each thread may or may not also be marked as a daemon. When code running in some thread creates a new Thread object, the new thread has its priority initially set equal to the priority of the creating thread, and is a daemon thread if and only if the creating thread is a daemon.
From: https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html
If you're concerned about this scenario then you have to implement it yourself, for example by maintaining a thread that checks for starving threads and, as time passes, raises the priority of threads that have been waiting longer than others.
Yes, it's true that two synchronized methods will never be executed on the same instance simultaneously.
Why bother instantiate that lock objects? Can't I just pass a null reference?
As others have mentioned, you cannot lock on long c1 because it is a primitive. Java locks on the monitor associated with an object instance. This is why you also can't lock on null.
The thread tutorial is trying to demonstrate a good pattern which is to create private final lock objects to precisely control the mutex locations that you are trying to protect. Calling synchronized on this or other public objects can cause external callers to block your methods which may not be what you want.
The tutorial explains this:
All updates of these fields must be synchronized, but there's no reason to prevent an update of c1 from being interleaved with an update of c2 — and doing so reduces concurrency by creating unnecessary blocking. Instead of using synchronized methods or otherwise using the lock associated with this, we create two objects solely to provide locks.
So they are also trying to allow updates to c1 and updates to c2 to happen concurrently ("interleaved") and not block each other while at the same time making sure that the updates are protected.
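A sketch in the spirit of the tutorial's version, with one private lock per counter. The getters and the hammer helper are additions for illustration and are not part of the tutorial.

```java
// Sketch only: getC1, getC2 and hammer are illustrative additions.
class MsLunch {
    private long c1 = 0;
    private long c2 = 0;
    private final Object lock1 = new Object(); // guards c1 only
    private final Object lock2 = new Object(); // guards c2 only

    public void inc1() {
        synchronized (lock1) { c1++; }
    }

    public void inc2() {
        synchronized (lock2) { c2++; }
    }

    public long getC1() {
        synchronized (lock1) { return c1; }
    }

    public long getC2() {
        synchronized (lock2) { return c2; }
    }

    // Increments both counters from several threads and returns the totals.
    static long[] hammer(int threads, int perThread) {
        MsLunch lunch = new MsLunch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    lunch.inc1();
                    lunch.inc2();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new long[] { lunch.getC1(), lunch.getC2() };
    }
}
```

A thread inside inc1 never blocks a thread inside inc2, yet no increment of either counter is lost.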
Assume that I've two public synchronized methods in the same class accessed by several thread. Is it true that the two methods will never be executed at the same time?
If one thread is working in a synchronized method of an object, another thread will be blocked if it tries the same or another synchronized method of the same object. Threads can run methods on different objects concurrently.
If the answer is yes, is there a built-in mechanism that prevents one method from starvation (never been executed or been executed too few times compared to the other method)?
As mentioned, this is handled by the native thread constructs from the operating system. All modern OSes handle thread starvation, which is especially important if the threads have different priorities.
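Intrinsic locks (synchronized) themselves make no fairness promise. If starvation between callers is a real concern, one option from java.util.concurrent is a ReentrantLock created with its fairness flag set, which favours the longest-waiting thread. A sketch, with FairCounters as a made-up name:

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch only: FairCounters is an illustrative name.
class FairCounters {
    // true selects the fair ordering policy: waiting threads acquire the
    // lock roughly in arrival order, usually at some cost in throughput.
    private final ReentrantLock lock = new ReentrantLock(true);
    private long c1;
    private long c2;

    void inc1() {
        lock.lock();
        try {
            c1++;
        } finally {
            lock.unlock();
        }
    }

    void inc2() {
        lock.lock();
        try {
            c2++;
        } finally {
            lock.unlock();
        }
    }

    long total() {
        lock.lock();
        try {
            return c1 + c2;
        } finally {
            lock.unlock();
        }
    }

    boolean isFair() {
        return lock.isFair();
    }
}
```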

When Singleton is not Singleton?

I was reading a blog post that discussed When Singleton is not Singleton.
In one of the cases, the author tries to show how double-checked locking can also fail when implemented on a Singleton.
// Double-checked locking -- don't use
public static MySingleton getInstance() {
if (_instance==null) {
synchronized (MySingleton.class) {
if (_instance==null) {
_instance = new MySingleton();
}
}
}
}
For the above code block Author says:
"In this situation, we intend to avoid the expense of grabbing the lock of the singleton class every time the method is called. The lock is grabbed only if the singleton instance does not exist, and then the existence of the instance is checked again in case another thread passed the first check an instant before the current thread."
Can someone help me explaining what exactly this means?
I'll try to talk it through.
The synchronized block takes time to enter as it requires cross-thread coordination. We'll try to avoid entering it unless needed.
Now, if we are working with multiple threads and the object already exists, let's just return it, as its methods will synchronize themselves against race conditions internally. We can do this before entering a synchronized block because if it was created, it was created. Note that this is only safe from Java 5 onward and only if _instance is declared volatile; otherwise another thread may see a non-null reference to a partially constructed object.
If the singleton object doesn't exist yet, we need to create one. But what if while we were checking another thread created it? We'll use synchronized to ensure no other threads hold it. Now, once we enter, we check again. If the singleton was created by another thread, let's return it since it exists already. If we didn't do this, a thread could get its singleton and do something to it and we'd just steamroller over its changes and effects.
If not, let's lock it and return a new one. By holding the lock, we now protect the singleton from the other side. Another thread waits for the lock, and noticing it's been created(as per the inner null comparison) returns the existing one. If we didn't acquire the lock, threads would both steamroller over changes, and find their changes destroyed as well.
Please note that the code block in your post is incomplete. It would need to return _instance if any of the null checks returned false, using else blocks.
Now, if we were in a single-threaded environment this would not have been important. We could just use:
public static MySingleton getInstance() {
    if (_instance == null) {
        _instance = new MySingleton();
    }
    return _instance;
}
With newer versions, java uses this behavior in many cases, as part of its libraries, checking if a lock is needed before taking time to acquire it. Before, it either failed to acquire the lock(bad, data loss) or acquired it immediately(bad, more potential for slowdown and deadlock).
You should still implement this yourself in your own classes for thread safety.
He doesn't explain how it can fail in that quote. He is just explaining double-checked locking. He probably refers elsewhere to the fact that double-checked locking itself didn't work prior to Java 1.5. But that's a long time ago.
I have found on Wikipedia the best explanation of the different Singleton implementations, their flaws, and which one is best. Follow this link:
http://en.wikipedia.org/wiki/Singleton_pattern
Hope it helps!

Partially constructed objects in non thread-safe Singleton

In a multi-threaded environment, how can a thread possibly see a 'partially constructed object'? I understood that it is not thread-safe since multiple threads can create multiple instances.
class LazyInit {
    private static Resource resource = null;

    public static Resource getInstance() {
        if (resource == null) {
            resource = new Resource();
        }
        return resource;
    }
}
Because of out-of-order writes.
If your constructor writes to non-final members, they don't have to be committed to memory right away, and they may even be committed after the singleton variable is. Java guarantees that the thread performing the assignments sees them in order, but not that other threads will, unless you put a memory barrier in place.
See this question and this page of the Java specification for more information.
It might be beside the point but in your example, it is entirely possible that two threads see different singletons. Suppose one thread tests the nullity of the variable, enters the if and gets preempted before it gets a chance to construct the object. The new thread that gets the CPU now tests the yet-null object, constructs the singleton. When the old thread starts running again it will happily finish constructing the object and overwrite the singleton variable.
Another, more frightening issue, arises if the constructor of Resource calls for a method that will ultimately result in another call to this getInstance. Even if the state of the program results in no infinite loop, you will create several singleton instances.

Avoid synchronized(this) in Java?

Whenever a question pops up on SO about Java synchronization, some people are very eager to point out that synchronized(this) should be avoided. Instead, they claim, a lock on a private reference is to be preferred.
Some of the given reasons are:
some evil code may steal your lock (very popular this one, also has an "accidentally" variant)
all synchronized methods within the same class use the exact same lock, which reduces throughput
you are (unnecessarily) exposing too much information
Other people, including me, argue that synchronized(this) is an idiom that is used a lot (also in Java libraries), is safe and well understood. It should not be avoided because you have a bug and you don't have a clue of what is going on in your multithreaded program. In other words: if it is applicable, then use it.
I am interested in seeing some real-world examples (no foobar stuff) where avoiding a lock on this is preferable when synchronized(this) would also do the job.
Therefore: should you always avoid synchronized(this) and replace it with a lock on a private reference?
Some further info (updated as answers are given):
we are talking about instance synchronization
both implicit (synchronized methods) and explicit form of synchronized(this) are considered
if you quote Bloch or other authorities on the subject, don't leave out the parts you don't like (e.g. Effective Java, item on Thread Safety: Typically it is the lock on the instance itself, but there are exceptions.)
if you need granularity in your locking other than synchronized(this) provides, then synchronized(this) is not applicable so that's not the issue
I'll cover each point separately.
Some evil code may steal your lock (very popular this one, also has an
"accidentally" variant)
I'm more worried about accidentally. What it amounts to is that this use of this is part of your class' exposed interface, and should be documented. Sometimes the ability of other code to use your lock is desired. This is true of things like Collections.synchronizedMap (see the javadoc).
All synchronized methods within the same class use the exact same
lock, which reduces throughput
This is overly simplistic thinking; just getting rid of synchronized(this) won't solve the problem. Proper synchronization for throughput will take more thought.
You are (unnecessarily) exposing too much information
This is a variant of #1. Use of synchronized(this) is part of your interface. If you don't want/need this exposed, don't do it.
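The Collections.synchronizedMap case mentioned above is one where the exposed lock is part of the contract: its javadoc requires callers to synchronize on the returned map while iterating over its collection views. A sketch, with SynchronizedMapDemo as a made-up name:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Sketch only: SynchronizedMapDemo is an illustrative name.
class SynchronizedMapDemo {
    // Sums the values while holding the map's own monitor, which is the lock
    // the synchronized wrapper uses for its individual operations.
    static int sumValues(Map<String, Integer> syncMap) {
        int sum = 0;
        synchronized (syncMap) {
            for (int value : syncMap.values()) {
                sum += value;
            }
        }
        return sum;
    }

    static int demo() {
        Map<String, Integer> map = Collections.synchronizedMap(new HashMap<>());
        map.put("a", 1);
        map.put("b", 2);
        return sumValues(map);
    }
}
```

With a private lock inside the wrapper, callers would have no way to make the iteration atomic.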
Well, firstly it should be pointed out that:
public void blah() {
synchronized (this) {
// do stuff
}
}
is semantically equivalent to:
public synchronized void blah() {
// do stuff
}
which is one reason not to use synchronized(this). You might argue that you can do stuff around the synchronized(this) block. The usual reason is to try and avoid having to do the synchronized check at all, which leads to all sorts of concurrency problems, specifically the double checked-locking problem, which just goes to show how difficult it can be to make a relatively simple check threadsafe.
A private lock is a defensive mechanism, which is never a bad idea.
Also, as you alluded to, private locks can control granularity. One set of operations on an object might be totally unrelated to another but synchronized(this) will mutually exclude access to all of them.
synchronized(this) just really doesn't give you anything.
While you are using synchronized(this) you are using the class instance as a lock itself. This means that while lock is acquired by thread 1, the thread 2 should wait.
Suppose the following code:
public void method1() {
// do something ...
synchronized(this) {
a ++;
}
// ................
}
public void method2() {
// do something ...
synchronized(this) {
b ++;
}
// ................
}
Method 1 modifies the variable a and method 2 modifies the variable b. Concurrent modification of the same variable by two threads should be avoided, and it is. BUT thread1 modifying a while thread2 modifies b could proceed without any race condition.
Unfortunately, the above code will not allow this since we are using the same reference for both locks. This means that threads must wait even when they are not in a race condition, so the code sacrifices concurrency.
The solution is to use 2 different locks for two different variables:
public class Test {
private final Object lockA = new Object();
private final Object lockB = new Object();
private int a;
private int b;
public void method1() {
// do something ...
synchronized(lockA) {
a ++;
}
// ................
}
public void method2() {
// do something ...
synchronized(lockB) {
b ++;
}
// ................
}
}
The above example uses more fine-grained locks (two locks instead of one: lockA and lockB for variables a and b respectively) and as a result allows better concurrency. On the other hand, it is more complex than the first example ...
While I agree about not adhering blindly to dogmatic rules, does the "lock stealing" scenario seem so eccentric to you? A thread could indeed acquire the lock on your object "externally"(synchronized(theObject) {...}), blocking other threads waiting on synchronized instance methods.
If you don't believe in malicious code, consider that this code could come from third parties (for instance if you develop some sort of application server).
The "accidental" version seems less likely, but as they say, "make something idiot-proof and someone will invent a better idiot".
So I agree with the it-depends-on-what-the-class-does school of thought.
Edit following eljenso's first 3 comments:
I've never experienced the lock stealing problem but here is an imaginary scenario:
Let's say your system is a servlet container, and the object we're considering is the ServletContext implementation. Its getAttribute method must be thread-safe, as context attributes are shared data; so you declare it as synchronized. Let's also imagine that you provide a public hosting service based on your container implementation.
I'm your customer and deploy my "good" servlet on your site. It happens that my code contains a call to getAttribute.
A hacker, disguised as another customer, deploys his malicious servlet on your site. It contains the following code in the init method:
synchronized (this.getServletConfig().getServletContext()) {
while (true) {}
}
Assuming we share the same servlet context (allowed by the spec as long as the two servlets are on the same virtual host), my call on getAttribute is locked forever. The hacker has achieved a DoS on my servlet.
This attack is not possible if getAttribute is synchronized on a private lock, because 3rd-party code cannot acquire this lock.
I admit that the example is contrived and an oversimplistic view of how a servlet container works, but IMHO it proves the point.
So I would make my design choice based on security consideration: will I have complete control over the code that has access to the instances? What would be the consequence of a thread's holding a lock on an instance indefinitely?
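The difference a private lock makes in that scenario can be sketched as follows. PrivateLockContext is a hypothetical stand-in, not a real ServletContext implementation, and survivesExternalLock is a helper added only to demonstrate the effect.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: PrivateLockContext is a hypothetical stand-in class.
class PrivateLockContext {
    private final Object lock = new Object(); // unreachable from outside code
    private final Map<String, Object> attributes = new HashMap<>();

    Object getAttribute(String name) {
        synchronized (lock) {
            return attributes.get(name);
        }
    }

    void setAttribute(String name, Object value) {
        synchronized (lock) {
            attributes.put(name, value);
        }
    }

    // Returns true if getAttribute completes on another thread while the
    // calling thread holds the monitor of the context object itself.
    static boolean survivesExternalLock() {
        PrivateLockContext ctx = new PrivateLockContext();
        ctx.setAttribute("k", "v");
        Object[] result = new Object[1];
        Thread reader = new Thread(() -> result[0] = ctx.getAttribute("k"));
        synchronized (ctx) { // the "stolen" lock from the scenario above
            reader.start();
            try {
                reader.join(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            return !reader.isAlive() && "v".equals(result[0]);
        }
    }
}
```

Had getAttribute been declared synchronized, the reader thread would still be blocked when the join timed out.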
It depends on the situation: whether there is only one shared entity or more than one.
A small introduction.
Threads and shareable entities
It is possible for multiple threads to access the same entity, e.g. multiple connectionThreads sharing a single messageQueue. Since the threads run concurrently, one thread may overwrite another's data, leaving things in a messed-up state.
So we need some way to ensure that the shareable entity is accessed by only one thread at a time (concurrency control).
Synchronized block
A synchronized() block is a way to ensure mutually exclusive access to a shareable entity.
First, a small analogy
Suppose there are two persons P1, P2 (threads), a washbasin (shareable entity) inside a washroom, and a door (lock).
Now we want only one person to use the washbasin at a time.
One approach is for P1 to lock the door; while the door is locked, P2 waits until P1 completes his work.
P1 unlocks the door;
only then can P2 use the washbasin.
syntax.
synchronized(this)
{
SHARED_ENTITY.....
}
"this" provides the intrinsic lock associated with the current object (the Java designers built the Object class in such a way that every object can work as a monitor).
The above approach works fine when there is only one shared entity and multiple threads (1:N).
N shareable entities-M threads
Now think of a situation where there are two washbasins (B1, B2) inside a washroom and only one door. With the previous approach, only P1 can use a washbasin at a time while P2 waits outside. That wastes a resource, as no one is using B2.
A wiser approach would be to create smaller rooms inside the washroom, with one door per washbasin. In this way, P1 can access B1 while P2 accesses B2, and vice versa.
washbasin1;
washbasin2;
Object lock1=new Object();
Object lock2=new Object();
synchronized(lock1)
{
washbasin1;
}
synchronized(lock2)
{
washbasin2;
}
There seems to be a different consensus in the C# and Java camps on this. The majority of Java code I have seen uses:
// apply mutex to this instance
synchronized(this) {
// do work here
}
whereas the majority of C# code opts for the arguably safer:
// instance level lock object
private readonly object _syncObj = new object();
...
// apply mutex to private instance level field (a System.Object usually)
lock(_syncObj)
{
// do work here
}
The C# idiom is certainly safer. As mentioned previously, no malicious / accidental access to the lock can be made from outside the instance. Java code has this risk too, but it seems that the Java community has gravitated over time to the slightly less safe, but slightly more terse version.
That's not meant as a dig against Java, just a reflection of my experience working on both languages.
Make your data immutable if it is possible ( final variables)
If you can't avoid mutation of shared data across multiple threads, use high level programming constructs [e.g. granular Lock API ]
A Lock provides exclusive access to a shared resource: only one thread at a time can acquire the lock and all access to the shared resource requires that the lock be acquired first.
Sample code to use ReentrantLock which implements Lock interface
import java.util.concurrent.locks.ReentrantLock;

class X {
    private final ReentrantLock lock = new ReentrantLock();
    // ...
    public void m() {
        lock.lock(); // block until condition holds
        try {
            // ... method body
        } finally {
            lock.unlock();
        }
    }
}
Advantages of Lock over Synchronized(this)
The use of synchronized methods or statements forces all lock acquisition and release to occur in a block-structured way.
Lock implementations provide additional functionality over the use of synchronized methods and statements by providing
A non-blocking attempt to acquire a lock (tryLock())
An attempt to acquire the lock that can be interrupted (lockInterruptibly())
An attempt to acquire the lock that can timeout (tryLock(long, TimeUnit)).
A Lock class can also provide behavior and semantics that are quite different from those of the implicit monitor lock, such as
guaranteed ordering
non-reentrant usage
deadlock detection
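As an illustration of the timed variant listed above, here is a small sketch using `tryLock(long, TimeUnit)`. The `TimedWorker` class and its method names are assumptions made for this example only.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example: give up instead of blocking forever.
class TimedWorker {
    private final ReentrantLock lock = new ReentrantLock();
    private int completed;

    // Returns true if the work ran, false if the lock was not obtained
    // within the timeout or the thread was interrupted while waiting.
    public boolean tryWork(long timeoutMillis) {
        try {
            if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false; // another thread kept the lock too long
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        }
        try {
            completed++;
            return true;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    public int completed() {
        lock.lock();
        try {
            return completed;
        } finally {
            lock.unlock();
        }
    }
}
```

A `synchronized` block has no equivalent of the `false` branch: a thread that cannot enter simply waits.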
Have a look at this SE question regarding various type of Locks:
Synchronization vs Lock
You can achieve thread safety by using the advanced concurrency API instead of synchronized blocks. This documentation page provides good programming constructs to achieve thread safety.
Lock Objects support locking idioms that simplify many concurrent applications.
Executors define a high-level API for launching and managing threads. Executor implementations provided by java.util.concurrent provide thread pool management suitable for large-scale applications.
Concurrent Collections make it easier to manage large collections of data, and can greatly reduce the need for synchronization.
Atomic Variables have features that minimize synchronization and help avoid memory consistency errors.
ThreadLocalRandom (in JDK 7) provides efficient generation of pseudorandom numbers from multiple threads.
Refer to java.util.concurrent and java.util.concurrent.atomic packages too for other programming constructs.
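A rough sketch of how these pieces can combine: an executor runs the tasks, a concurrent collection holds the shared state, and atomic variables do the counting. The `WordTally` class and its `demo` method are hypothetical and exist only for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example: count words from several tasks without
// writing a single synchronized block.
class WordTally {
    private final ConcurrentHashMap<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    public void record(String word) {
        // computeIfAbsent is atomic on ConcurrentHashMap, and
        // incrementAndGet is an atomic read-modify-write.
        counts.computeIfAbsent(word, k -> new AtomicInteger()).incrementAndGet();
    }

    public int count(String word) {
        AtomicInteger n = counts.get(word);
        return n == null ? 0 : n.get();
    }

    // Runs `tasks` tasks on a small pool; each records `word` `perTask` times.
    public static int demo(String word, int tasks, int perTask) {
        WordTally tally = new WordTally();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int t = 0; t < tasks; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perTask; i++) {
                    tally.record(word);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return tally.count(word);
    }
}
```

No increments are lost here, so `WordTally.demo("a", 8, 1000)` should return 8000.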
The java.util.concurrent package has vastly reduced the complexity of my thread safe code. I only have anecdotal evidence to go on, but most work I have seen with synchronized(x) appears to be re-implementing a Lock, Semaphore, or Latch, but using the lower-level monitors.
With this in mind, synchronizing using any of these mechanisms is analogous to synchronizing on an internal object, rather than leaking a lock. This is beneficial in that you have absolute certainty that you control the entry into the monitor by two or more threads.
If you've decided that:
the thing you need to do is lock on the current object; and
you want to lock it with granularity smaller than a whole method;
then I don't see a taboo over synchronized(this).
Some people deliberately use synchronized(this) (instead of marking the method synchronized) inside the whole contents of a method because they think it's "clearer to the reader" which object is actually being synchronized on. So long as people are making an informed choice (e.g. understand that by doing so they're actually inserting extra bytecodes into the method and this could have a knock-on effect on potential optimisations), I don't particularly see a problem with this. You should always document the concurrent behaviour of your program, so I don't see the "'synchronized' publishes the behaviour" argument as being so compelling.
As to the question of which object's lock you should use, I think there's nothing wrong with synchronizing on the current object if this would be expected by the logic of what you're doing and how your class would typically be used. For example, with a collection, the object that you would logically expect to lock is generally the collection itself.
I think there is a good explanation of why each of these is a vital technique to have under your belt in the book Java Concurrency in Practice by Brian Goetz. He makes one point very clear: you must use the same lock EVERYWHERE to protect the state of your object. Synchronised methods and synchronising on an object often go hand in hand. For example, Vector synchronises all its methods. If you have a handle to a Vector and want to do "put if absent", then Vector merely synchronising its own individual methods isn't going to protect you from corrupting its state. You need to synchronise using synchronized(vectorHandle). This results in the SAME lock being acquired by every thread that has a handle to the vector, and protects the overall state of the vector. This is called client-side locking. We know for a fact that Vector synchronises on this (all its methods are synchronized), so synchronising on vectorHandle properly protects the Vector's state. It's foolish to believe that you are thread safe just because you are using a thread-safe collection. This is precisely why ConcurrentHashMap provides an explicit putIfAbsent method: to make such operations atomic.
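The "put if absent" case can be sketched as below. `VectorHelper` is a hypothetical helper class; the only thing it relies on is that Vector's own methods synchronize on the Vector instance.

```java
import java.util.Vector;

// Hypothetical helper illustrating client-side locking on a Vector.
class VectorHelper {
    // Adds the element only if it is not already present.
    // Holding the Vector's own monitor makes the check and the add one
    // atomic step with respect to the Vector's synchronized methods.
    static <E> boolean putIfAbsent(Vector<E> vector, E element) {
        synchronized (vector) {
            if (vector.contains(element)) {
                return false;
            }
            vector.add(element);
            return true;
        }
    }
}
```

Without the outer `synchronized (vector)`, two threads could both pass the `contains` check and both add the element.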
In summary
Synchronising at method level allows client side locking.
If you have a private lock object, client-side locking is impossible. This is fine if you know that your class doesn't have "put if absent" type functionality.
If you are designing a library, then synchronising on this or synchronising the method is often wiser, because you are rarely in a position to decide how your class is going to be used.
Had Vector used a private lock object - it would have been impossible to get "put if absent" right. The client code will never gain a handle to the private lock thus breaking the fundamental rule of using the EXACT SAME LOCK to protect its state.
Synchronising on this or synchronised methods do have a problem as others have pointed out - someone could get a lock and never release it. All other threads would keep waiting for the lock to be released.
So know what you are doing and adopt the one that's correct.
Someone argued that having a private lock object gives you better granularity, e.g. if two operations are unrelated, they could be guarded by different locks, resulting in better throughput. But this, I think, is a design smell and not a code smell: if two operations are completely unrelated, why are they part of the SAME class? Why should a class club unrelated functionality together at all? Maybe a utility class? Hmmmm, some util providing string manipulation and calendar date formatting through the same instance? That doesn't make any sense, to me at least!
No, you shouldn't always. However, I tend to avoid it when there are multiple concerns on a particular object that only need to be threadsafe in respect to themselves. For example, you might have a mutable data object that has "label" and "parent" fields; these need to be threadsafe, but changing one need not block the other from being written/read. (In practice I would avoid this by declaring the fields volatile and/or using java.util.concurrent's AtomicFoo wrappers).
Synchronization in general is a bit clumsy, as it slaps a big lock down rather than thinking exactly how threads might be allowed to work around each other. Using synchronized(this) is even clumsier and anti-social, as it's saying "no-one may change anything on this class while I hold the lock". How often do you actually need to do that?
I would much rather have more granular locks; even if you do want to stop everything from changing (perhaps you're serialising the object), you can just acquire all of the locks to achieve the same thing, plus it's more explicit that way. When you use synchronized(this), it's not clear exactly why you're synchronizing, or what the side effects might be. If you use synchronized(labelMonitor), or even better labelLock.writeLock().lock(), it's clear what you are doing and what the effects of your critical section are limited to.
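A minimal sketch of the "label" and "parent" case with one read/write lock per field. The `LabeledNode` class is a hypothetical name; `ReentrantReadWriteLock` with `readLock()` and `writeLock()` is the standard java.util.concurrent.locks API.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical data object: "label" and "parent" are each guarded by
// their own lock, so updating one never blocks access to the other.
class LabeledNode {
    private final ReadWriteLock labelLock = new ReentrantReadWriteLock();
    private final ReadWriteLock parentLock = new ReentrantReadWriteLock();
    private String label;
    private LabeledNode parent;

    void setLabel(String newLabel) {
        labelLock.writeLock().lock();
        try {
            label = newLabel;
        } finally {
            labelLock.writeLock().unlock();
        }
    }

    String getLabel() {
        labelLock.readLock().lock(); // many readers may hold this at once
        try {
            return label;
        } finally {
            labelLock.readLock().unlock();
        }
    }

    void setParent(LabeledNode newParent) {
        parentLock.writeLock().lock();
        try {
            parent = newParent;
        } finally {
            parentLock.writeLock().unlock();
        }
    }

    LabeledNode getParent() {
        parentLock.readLock().lock();
        try {
            return parent;
        } finally {
            parentLock.readLock().unlock();
        }
    }
}
```

For two plain fields like these, volatile fields or atomic references would usually be simpler; explicit locks pay off once a critical section covers more than one step.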
Short answer: You have to understand the difference and make choice depending on the code.
Long answer: In general I would rather try to avoid synchronized(this) to reduce contention, but private locks add complexity you have to be aware of. So use the right synchronization for the right job. If you are not so experienced with multi-threaded programming, I would rather stick to instance locking and read up on this topic. (That said, just using synchronized(this) does not automatically make your class fully thread-safe.) This is not an easy topic, but once you get used to it, the answer to whether to use synchronized(this) or not comes naturally.
A lock is used either for visibility or for protecting some data from concurrent modification, which may lead to a race.
When you just need operations on a primitive type to be atomic, there are options available like AtomicInteger and the like.
But suppose you have two integers that are related to each other, like x and y coordinates, and should be changed atomically. Then you would protect them using the same lock.
A lock should only protect state that is related. No less and no more. If you use synchronized(this) in each method, then all the threads will face contention even when they update unrelated state.
class Point{
private int x;
private int y;
public Point(int x, int y){
this.x = x;
this.y = y;
}
//mutating methods should be guarded by same lock
public synchronized void changeCoordinates(int x, int y){
this.x = x;
this.y = y;
}
}
In the above example I have only one method that mutates both x and y, rather than two separate methods, because x and y are related: with two separate methods for mutating x and y, the pair could not be changed atomically, so it would not have been thread safe.
This example is just a demonstration and not necessarily the way it should be implemented. The best way to do it would be to make it IMMUTABLE.
Now, in contrast to the Point example, there is the TwoCounters example already provided by @Andreas, where the state is protected by two different locks because the two pieces of state are unrelated to each other.
The process of using different locks to protect unrelated state is called lock splitting (lock striping is a related technique that spreads the locking over a set of independent objects, such as hash buckets).
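The TwoCounters example itself is not reproduced in this answer. As a stand-in, here is a hypothetical sketch of the same idea: a `SplitCounters` class whose names are invented for illustration, with each counter guarded by its own lock.

```java
// Hypothetical sketch of lock splitting: two unrelated counters,
// each guarded by its own private lock.
class SplitCounters {
    private final Object hitsLock = new Object();
    private final Object missesLock = new Object();
    private long hits;
    private long misses;

    public void recordHit() {
        synchronized (hitsLock) {
            hits++;
        }
    }

    public void recordMiss() {
        // Never contends with recordHit(): a different monitor is used.
        synchronized (missesLock) {
            misses++;
        }
    }

    public long hits() {
        synchronized (hitsLock) {
            return hits;
        }
    }

    public long misses() {
        synchronized (missesLock) {
            return misses;
        }
    }
}
```

Contrast this with Point: x and y form one invariant and share one lock, while hits and misses are independent and do not.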
The reason not to synchronize on this is that sometimes you need more than one lock (the second lock often gets removed after some additional thinking, but you still need it in the intermediate state). If you lock on this, you always have to remember which one of the two locks is this; if you lock on a private Object, the variable name tells you that.
From the reader's viewpoint, if you see locking on this, you always have to answer the two questions:
what kind of access is protected by this?
is one lock really enough, didn't someone introduce a bug?
An example:
class BadObject {
    private Something mStuff;
    synchronized void setStuff(Something stuff) {
        mStuff = stuff;
    }
    synchronized Something getStuff() {
        return mStuff;
    }
    private MyListener myListener = new MyListener() {
        public void onMyEvent(...) {
            setStuff(...);
        }
    };
    synchronized void longOperation(MyListener l) {
        ...
        l.onMyEvent(...);
        ...
    }
}
If two threads begin longOperation() on two different instances of BadObject, they acquire their locks; when it's time to invoke l.onMyEvent(...), we have a deadlock because neither of the threads may acquire the other object's lock.
In this example we may eliminate the deadlock by using two locks, one for short operations and one for long ones.
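One way to read "two locks" is sketched below. Everything here is hypothetical: `SplitLockObject`, its `Listener` interface, and the use of a String in place of Something. It is a reduced illustration, not the original author's code.

```java
// Hypothetical rework of BadObject with two locks instead of one.
// All names here are illustrative, not from the original answer.
class SplitLockObject {
    interface Listener {
        void onEvent(String value);
    }

    private final Object stuffLock = new Object(); // guards short operations
    private final Object longLock = new Object();  // guards longOperation
    private String stuff;

    // Callback that other instances may invoke from their longOperation.
    final Listener listener = value -> setStuff(value);

    void setStuff(String value) {
        synchronized (stuffLock) {
            stuff = value;
        }
    }

    String getStuff() {
        synchronized (stuffLock) {
            return stuff;
        }
    }

    void longOperation(Listener l) {
        synchronized (longLock) {
            // The callback only needs the other object's stuffLock,
            // which is never held while waiting for any other lock.
            l.onEvent("done");
        }
    }

    // Runs the cross-instance scenario; returns true if both threads finish.
    static boolean demo() {
        SplitLockObject a = new SplitLockObject();
        SplitLockObject b = new SplitLockObject();
        Thread t1 = new Thread(() -> a.longOperation(b.listener));
        Thread t2 = new Thread(() -> b.longOperation(a.listener));
        t1.start();
        t2.start();
        try {
            t1.join(5000);
            t2.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t1.isAlive() && !t2.isAlive()
                && "done".equals(a.getStuff()) && "done".equals(b.getStuff());
    }
}
```

The cycle disappears because a thread holding a `longLock` only ever waits for a `stuffLock`, and a `stuffLock` holder never waits for anything.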
As already said here, a synchronized block can use a user-defined variable as its lock object, whereas a synchronized method locks only on "this". And of course, with a block you can choose which parts of your method are synchronized.
But everyone says that there is no difference between a synchronized method and a block that covers the whole method using "this" as the lock object. That is not true: the difference is in the bytecode generated in the two cases. With a synchronized block, a local variable is allocated to hold the reference to "this", so the method ends up a little larger (not relevant if you have only a few methods).
You can find a more detailed explanation of the difference here:
http://www.artima.com/insidejvm/ed2/threadsynchP.html
Also, using a synchronized block is not ideal for the following reason:
The synchronized keyword is very limited in one area: when exiting a synchronized block, all threads that are waiting for that lock must be unblocked, but only one of those threads gets to take the lock; all the others see that the lock is taken and go back to the blocked state. That's not just a lot of wasted processing cycles: often the context switch to unblock a thread also involves paging memory off the disk, and that's very, very, expensive.
For more details in this area I would recommend you read this article:
http://java.dzone.com/articles/synchronized-considered
This is really just supplementary to the other answers, but if your main objection to using private objects for locking is that it clutters your class with fields that are not related to the business logic, then Project Lombok has @Synchronized to generate the boilerplate at compile-time:
@Synchronized
public int foo() {
return 0;
}
compiles to
private final Object $lock = new Object[0];
public int foo() {
synchronized($lock) {
return 0;
}
}
A good example of using synchronized(this):
// add listener
public final synchronized void addListener(IListener l) {listeners.add(l);}
// remove listener
public final synchronized void removeListener(IListener l) {listeners.remove(l);}
// routine that raise events
public void run() {
// some code here...
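Set<IListener> ls;
synchronized(this) {
    // copy under the lock, so the loop below iterates over a stable snapshot
    ls = new HashSet<IListener>(listeners);
}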
for (IListener l : ls) { l.processEvent(event); }
// some code here...
}
As you can see here, we synchronize on this so that the lengthy run method (possibly an infinite loop) cooperates easily with the synchronized methods.
Of course, it can very easily be rewritten to synchronize on a private field. But sometimes, when we already have a design with synchronized methods (e.g. a legacy class we derive from), synchronized(this) can be the only solution.
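For reference, the private-field rewrite mentioned above might look roughly like this. `EventSource`, `fire`, and the String event type are assumptions for this sketch, and `IListener` is redeclared here only so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Stand-in for the listener interface used in the answer above.
interface IListener {
    void processEvent(String event);
}

// Hypothetical event source that guards its listener set with a private lock.
class EventSource {
    private final Object listenersLock = new Object();
    private final Set<IListener> listeners = new HashSet<>();

    public void addListener(IListener l) {
        synchronized (listenersLock) {
            listeners.add(l);
        }
    }

    public void removeListener(IListener l) {
        synchronized (listenersLock) {
            listeners.remove(l);
        }
    }

    public void fire(String event) {
        List<IListener> snapshot;
        synchronized (listenersLock) {
            snapshot = new ArrayList<>(listeners); // copy under the lock
        }
        for (IListener l : snapshot) {
            l.processEvent(event); // callbacks run without holding the lock
        }
    }
}
```

This only works when every method touching `listeners` can be changed to use the private lock, which is exactly what a legacy superclass with synchronized methods rules out.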
It depends on the task you want to do, but I wouldn't use it. Also, check whether the thread-safety you want to accomplish couldn't be achieved with synchronized(this) in the first place. There are also some nice locks in the API that might help you :)
I only want to mention a possible solution for unique private references in atomic parts of code without dependencies. You can use a static HashMap of locks and a simple static method named atomic() that creates the required references automatically using stack information (full class name and line number). Then you can use this method in synchronized statements without writing a new lock object.
// Synchronization objects (locks)
private static final HashMap<String, Object> locks = new HashMap<String, Object>();
// Simple method; synchronized because HashMap itself is not thread-safe
private static synchronized Object atomic() {
StackTraceElement [] stack = Thread.currentThread().getStackTrace(); // get execution point
StackTraceElement exepoint = stack[2];
// creates unique key from class name and line number using execution point
String key = String.format("%s#%d", exepoint.getClassName(), exepoint.getLineNumber());
Object lock = locks.get(key); // use old or create new lock
if (lock == null) {
lock = new Object();
locks.put(key, lock);
}
return lock; // return reference to lock
}
// Synchronized code
void dosomething1() {
// start commands
synchronized (atomic()) {
// atomic commands 1
...
}
// other command
}
// Synchronized code
void dosomething2() {
// start commands
synchronized (atomic()) {
// atomic commands 2
...
}
// other command
}
Avoid using synchronized(this) as a locking mechanism: it locks on the whole class instance and can cause deadlocks. In such cases, refactor the code to lock only on a specific object, so that the whole instance doesn't get locked. synchronized can be used inside a method.
Instead of using synchronized(this), the code below shows how you could lock just part of a method.
public void foo() {
if (operation == null) {
synchronized(foo) {
if (operation == null) {
// enter your code that this method has to handle...
}
}
}
}
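The check-lock-check shape above is double-checked locking. It is only safe under the Java 5+ memory model if the checked field is volatile. Below is a hedged sketch under that assumption; `LazyOperation`, the nested `Operation` type, and the private `lock` field are all hypothetical names.

```java
// Hypothetical lazily initialized holder; Operation stands in for
// whatever expensive object the method needs.
class LazyOperation {
    static final class Operation { }

    private final Object lock = new Object();
    // volatile: a thread that reads a non-null reference is guaranteed
    // to also see the fully constructed Operation.
    private volatile Operation operation;

    Operation get() {
        Operation local = operation; // single volatile read on the fast path
        if (local == null) {
            synchronized (lock) {
                local = operation;   // re-check under the lock
                if (local == null) {
                    local = new Operation();
                    operation = local;
                }
            }
        }
        return local;
    }
}
```

Without `volatile`, another thread may observe the reference before the object's fields are visible, which is the out-of-order write problem this page opened with.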
My two cents in 2019 even though this question could have been settled already.
Locking on 'this' is not bad if you know what you are doing, but locking on 'this' behind the scenes is (which is unfortunately what the synchronized keyword in a method definition allows).
If you actually want users of your class to be able to 'steal' your lock (i.e. prevent other threads from dealing with it), you actually want all the synchronized methods to wait while another sync method is running and so on.
It should be intentional and well thought out (and hence documented to help your users understand it).
To further elaborate, in the reverse you must know what you are 'gaining' (or 'losing' out on) if you lock on a non accessible lock (nobody can 'steal' your lock, you are in total control and so on...).
The problem for me is that synchronized keyword in the method definition signature makes it just too easy for programmers not to think about what to lock on which is a mighty important thing to think about if you don't want to run into problems in a multi-threaded program.
One can't argue that 'typically' you don't want users of your class to be able to do these things, or that 'typically' you do. It depends on what functionality you are coding. You can't make a rule of thumb, as you can't predict all the use cases.
Consider, for example, PrintWriter, which uses an internal lock; people then struggle to use it from multiple threads if they don't want their output to interleave.
Whether your lock should be accessible outside of the class or not is your decision as a programmer, based on what functionality the class has. It is part of the API. For instance, you can't move from synchronized(this) to synchronized(privateObject) without risking breaking changes in the code using it.
Note 1: I know you can achieve whatever synchronized(this) 'achieves' by using an explicit lock object and exposing it, but I think that is unnecessary if your behaviour is well documented and you actually know what locking on 'this' means.
Note 2: I don't concur with the argument that if some code is accidentally stealing your lock, it's a bug and you have to solve it. In a way, this is the same argument as saying I can make all my methods public even if they are not meant to be public: if someone 'accidentally' calls a method I intended to be private, it's a bug. Why enable this accident in the first place? If the ability to steal your lock is a problem for your class, don't allow it. As simple as that.
Let me put the conclusion first: locking on private fields does not work for slightly more complicated multi-threaded programs. This is because multi-threading is a global problem. It is impossible to localize synchronization unless you write in a very defensive way (e.g. copy everything on passing to other threads).
Here is the long explanation:
Synchronization includes 3 parts: Atomicity, Visibility and Ordering
A synchronized block is a very coarse level of synchronization. It enforces visibility and ordering just as you would expect. But for atomicity, it does not provide much protection. Atomicity requires global knowledge of the program rather than local knowledge. (And that makes multi-threaded programming very hard.)
Let's say we have a class Account having method deposit and withdraw. They are both synchronized based on a private lock like this:
class Account {
private Object lock = new Object();
void withdraw(int amount) {
synchronized(lock) {
// ...
}
}
void deposit(int amount) {
synchronized(lock) {
// ...
}
}
}
Suppose we need to implement a higher-level class that handles transfers, like this:
class AccountManager {
void transfer(Account fromAcc, Account toAcc, int amount) {
if (fromAcc.getBalance() > amount) {
fromAcc.setBalance(fromAcc.getBalance() - amount);
toAcc.setBalance(toAcc.getBalance() + amount);
}
}
}
Assuming we have 2 accounts now,
Account john;
Account marry;
If Account.deposit() and Account.withdraw() are locked with an internal lock only, that will cause a problem when we have 2 threads working:
// Some thread
void threadA() {
john.withdraw(500);
}
// Another thread
void threadB() {
accountManager.transfer(john, marry, 100);
}
It is possible for threadA and threadB to run at the same time: thread B finishes the conditional check, thread A withdraws, and then thread B withdraws again. This means we can withdraw $100 from John even if his account does not have enough money. This breaks atomicity.
You may propose: why not add withdraw() and deposit() to AccountManager then? But under this proposal, we need to create a thread-safe Map that maps accounts to their locks. We need to delete the lock after execution (otherwise it will leak memory). And we also need to ensure that no one else accesses Account.withdraw() directly. This will introduce a lot of subtle bugs.
The correct and most idiomatic way is to expose the lock in the Account and let the AccountManager use it. But in that case, why not just use the object itself?
class Account {
synchronized void withdraw(int amount) {
// ...
}
synchronized void deposit(int amount) {
// ...
}
}
class AccountManager {
void transfer(Account fromAcc, Account toAcc, int amount) {
// Ensure locking order to prevent deadlock
Account firstLock = fromAcc.hashCode() < toAcc.hashCode() ? fromAcc : toAcc;
Account secondLock = fromAcc.hashCode() < toAcc.hashCode() ? toAcc : fromAcc;
synchronized(firstLock) {
synchronized(secondLock) {
if (fromAcc.getBalance() > amount) {
fromAcc.setBalance(fromAcc.getBalance() - amount);
toAcc.setBalance(toAcc.getBalance() + amount);
}
}
}
}
}
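A self-contained variant of the transfer can be sketched as follows. `BankAccount` and `TransferService` are hypothetical names. Ordering by `System.identityHashCode` with a tie-breaker lock is a known refinement of the hashCode() ordering above, swapped in because two distinct accounts can share a hash code.

```java
// Hypothetical, self-contained version of the Account example.
class BankAccount {
    private int balance;

    BankAccount(int initial) {
        balance = initial;
    }

    synchronized int getBalance() {
        return balance;
    }

    synchronized void withdraw(int amount) {
        if (amount > balance) {
            throw new IllegalStateException("insufficient funds");
        }
        balance -= amount;
    }

    synchronized void deposit(int amount) {
        balance += amount;
    }
}

class TransferService {
    // Taken first in the rare case where both identity hashes are equal.
    private static final Object tieLock = new Object();

    static boolean transfer(BankAccount from, BankAccount to, int amount) {
        int fromHash = System.identityHashCode(from);
        int toHash = System.identityHashCode(to);
        if (fromHash < toHash) {
            synchronized (from) {
                synchronized (to) {
                    return move(from, to, amount);
                }
            }
        } else if (fromHash > toHash) {
            synchronized (to) {
                synchronized (from) {
                    return move(from, to, amount);
                }
            }
        } else {
            synchronized (tieLock) {
                synchronized (from) {
                    synchronized (to) {
                        return move(from, to, amount);
                    }
                }
            }
        }
    }

    // Caller holds both account monitors; intrinsic locks are reentrant,
    // so the synchronized calls below do not block.
    private static boolean move(BankAccount from, BankAccount to, int amount) {
        if (from.getBalance() < amount) {
            return false;
        }
        from.withdraw(amount);
        to.deposit(amount);
        return true;
    }
}
```

A concurrent `withdraw` on the source account now has to wait until the whole check-and-move has finished, because it needs the same monitor.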
To conclude in simple English, a private lock does not work for slightly more complicated multi-threaded programs.
(Reposted from https://stackoverflow.com/a/67877650/474197)
I think points one (somebody else using your lock) and two (all methods using the same lock needlessly) can happen in any fairly large application. Especially when there's no good communication between developers.
It's not cast in stone, it's mostly an issue of good practice and preventing errors.
