One of the articles I read mentions an issue with "double-checked locking". Please see the example below:
public class MyBrokenFactory {

    private static MyBrokenFactory instance;
    private int field1, field2 ...

    public static MyBrokenFactory getFactory() {
        // This is incorrect: don't do it!
        if (instance == null) {
            synchronized (MyBrokenFactory.class) {
                if (instance == null)
                    instance = new MyBrokenFactory();
            }
        }
        return instance;
    }

    private MyBrokenFactory() {
        field1 = ...
        field2 = ...
    }
}
Reason (note the order of execution indicated by the numbering):
Thread 1: gets in first and starts creating the instance.
1. Is instance null? Yes.
2. Synchronize on the class.
3. Memory is allocated for the instance.
4. The pointer to that memory is saved into instance.
   [Thread 2 runs steps 5 and 6 here]
7. The values for field1 and field2 are written to the memory allocated for the object.

Thread 2: gets in just as Thread 1 has written the object reference to memory, but before it has written all the fields.
5. Is instance null? No.
6. instance is non-null, but field1 and field2 haven't been set yet! This thread sees invalid values for field1 and field2.
Question:
As the creation of the new instance (new MyBrokenFactory()) is done inside the synchronized block, will the lock be released before the initialization is complete, i.e. before private MyBrokenFactory() has finished executing?
Reference - https://www.javamex.com/tutorials/double_checked_locking.shtml
Please explain.
The problem is here:
Thread 2: gets in just as Thread 1 has written the object reference to memory, but before it has written all the fields.
Is instance null? No.
Without synchronization, thread 2 might see instance as null even though thread 1 has written it. Notice that the first check of instance is outside of the synchronized block:
if (instance == null) {
    synchronized (MyBrokenFactory.class) {
Since that first check is done outside of the block there's no guarantee that thread 2 will see the correct value of instance.
I have no idea what you're trying to do with field1 and field2; you never even write them.
Re. Your edit:
As the creation of the new instance(new MyBrokenFactory()) is done from the synchronized block
I think what you're asking is if the two instance fields, field1 and field2 are guaranteed to be visible. The answer is no, and the problem is the same as with instance. Because you don't read instance from within a synchronized block, there's no guarantee that those instance fields will be read correctly. If instance is non-null, you never enter the synchronized block, so no synchronization occurs.
Here is the answer to my question. I got it by looking into another similar question here.
synchronized guarantees that only one thread can enter a block of code, but it doesn't guarantee that variable modifications done within the synchronized section will be visible to other threads. Only a thread that itself enters the synchronized block is guaranteed to see the changes. This is why double-checked locking is broken: it is not synchronized on the reader's side. The reading thread may see that the singleton is not null while the singleton's data is not yet fully initialized (visible).
Ordering is provided by volatile. For instance, a write to a volatile static singleton field guarantees that the writes to the singleton object itself are finished before the write to the volatile field. It doesn't prevent creating two singleton objects; that is what synchronized provides.
Final static class fields don't need to be volatile; in Java, the JVM takes care of this problem.
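To illustrate that last point, here is a minimal sketch of the initialization-on-demand holder idiom (the class names are illustrative, not taken from the question). It relies on the JVM's thread-safe class initialization of a final static field, so it needs neither volatile nor explicit locking:

public class MyFactory {

    private MyFactory() {
        // expensive initialization would happen here
    }

    // Holder is not loaded until getFactory() is first called, and the JVM
    // guarantees that class initialization itself is thread-safe.
    private static class Holder {
        static final MyFactory INSTANCE = new MyFactory();
    }

    public static MyFactory getFactory() {
        return Holder.INSTANCE;
    }
}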
Related
I have a few doubts about synchronized blocks.
Before my questions, I would like to share an answer from another related post (Link for Answer to related question). I quote Peter Lawrey from that answer:
synchronized ensures you have a consistent view of the data. This means you will read the latest value and other caches will get the latest value. Caches are smart enough to talk to each other via a special bus (not something required by the JLS, but allowed). This bus means that it doesn't have to touch main memory to get a consistent view.
If you only use synchronized, you wouldn't need volatile. Volatile is useful if you have a very simple operation for which synchronized would be overkill.
With reference to the above, I have three questions:
Q1. Suppose that in a multi-threaded application there is an object or a primitive instance field that is only read in a synchronized block (the write may happen in some other method without synchronization), and the synchronized block locks on some other object. Does declaring the field volatile (even though it is read only inside the synchronized block) make any sense?
Q2. I understand that the state of the object on which synchronization is done is consistent. I am not sure about the state of other objects and the primitive fields read inside the synchronized block. Suppose changes are made without obtaining the lock but reads are done while holding it. Will the state of all objects and the values of all primitive fields read inside the synchronized block always be consistent?
Q3. [Update]: Will all fields being read in a synchronized block be read from main memory regardless of what we lock on? [answered by CKing]
I have prepared reference code for my questions above:
public class Test {

    private SomeClass someObj;
    private boolean isSomeFlag;
    private Object lock = new Object();

    public SomeClass getObject() {
        return someObj;
    }

    public void setObject(SomeClass someObj) {
        this.someObj = someObj;
    }

    public void executeSomeProcess() {
        // some process...
    }

    // the synchronized block is on a private lock object.
    // inside the lock, do the value of isSomeFlag and the state of someObj remain consistent?
    public void someMethod() {
        synchronized (lock) {
            while (isSomeFlag) {
                executeSomeProcess();
            }
            if (someObj.isLogicToBePerformed()) {
                someObj.performSomeLogic();
            }
        }
    }

    // this is a method without synchronization.
    public void setSomeFlag(boolean isSomeFlag) {
        this.isSomeFlag = isSomeFlag;
    }
}
The first thing you need to understand is that there is a subtle difference between the scenario discussed in the linked answer and the scenario you are talking about: you speak about modifying a value without synchronization, whereas in the linked answer all values are modified within a synchronized context. With this understanding in mind, let's address your questions:
Q1. Suppose that in a multi-threaded application there is an object or a primitive instance field that is only read in a synchronized block (the write may happen in some other method without synchronization), and the synchronized block locks on some other object. Does declaring the field volatile (even though it is read only inside the synchronized block) make any sense?
Yes it does make sense to declare the field as volatile. Since the write is not happening in a synchronized context, there is no guarantee that the writing thread will flush the newly updated value to main memory. The reading thread may still see inconsistent values because of this.
Suppose changes are made without obtaining the lock but reads are done while holding it. Will the state of all objects and the values of all primitive fields read inside the synchronized block always be consistent?
The answer is still no. The reasoning is the same as above.
Bottom line: modifying values outside a synchronized context will not ensure that these values get flushed to main memory (the reader thread may enter the synchronized block before the writer thread does). Threads that read these values in a synchronized context may still end up reading stale values, even though they get them from main memory.
Note that this question talks about primitives, so it is also important to understand that Java provides out-of-thin-air safety for 32-bit primitives (all primitives except long and double), which means that you are at least assured of seeing a valid value (if not a consistent one).
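As a rough sketch of the fix being described (my own illustration, not code from the question): declaring the fields that are written outside the lock as volatile restores the visibility guarantee for the reading thread, while mutual exclusion is still provided only by the synchronized block.

public class Test {

    // Minimal stub so the sketch compiles; the real SomeClass comes from the question.
    static class SomeClass {
        boolean isLogicToBePerformed() { return false; }
        void performSomeLogic() { }
    }

    // volatile is the addition here: the unsynchronized writes in the setters below
    // are now guaranteed to become visible to someMethod().
    private volatile SomeClass someObj;
    private volatile boolean isSomeFlag;
    private final Object lock = new Object();

    public void setSomeFlag(boolean isSomeFlag) {
        this.isSomeFlag = isSomeFlag;   // plain setter, but now a volatile write
    }

    public void setObject(SomeClass someObj) {
        this.someObj = someObj;         // volatile write publishes the object safely
    }

    public void executeSomeProcess() {
        // some process...
    }

    public void someMethod() {
        synchronized (lock) {
            while (isSomeFlag) {        // volatile read: sees the latest flag value
                executeSomeProcess();
            }
            if (someObj != null && someObj.isLogicToBePerformed()) {
                someObj.performSomeLogic();
            }
        }
    }
}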
All synchronized does is acquire the lock of the object that it is synchronized on; if the lock is already held, it will wait for its release. It does not in any way assert that that object's internal fields won't change. For that, there is volatile.
When you synchronize on an object monitor A, it is guaranteed that another thread synchronizing on the same monitor A afterwards will see any changes made by the first thread to any object. That's the visibility guarantee provided by synchronized, nothing more.
A volatile variable guarantees visibility (for the variable only, a volatile HashMap doesn't mean the contents of the map would be visible) between threads regardless of any synchronized blocks.
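To sketch that caveat concretely (my own illustration; SharedConfig and its field are hypothetical names):

import java.util.HashMap;
import java.util.Map;

class SharedConfig {
    // volatile covers the assignment of this reference,
    // not later mutations made through it.
    volatile Map<String, String> settings = new HashMap<>();
}

// Thread A, after the SharedConfig instance has been shared:
//     config.settings.put("ready", "true");   // plain write into the map
//
// Thread B:
//     config.settings.get("ready");           // may still return null
//
// The volatile read of 'settings' only synchronizes with the volatile write that
// assigned the reference; the later put() has no happens-before edge to the get().
// For the map's contents you need a ConcurrentHashMap or additional locking.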
There is an article about using volatile on IBM developerWorks, and its explanation confused me. Below is a sample from the article and its explanation:
public class BackgroundFloobleLoader {
    public volatile Flooble theFlooble;

    public void initInBackground() {
        // do lots of stuff
        theFlooble = new Flooble();  // this is the only write to theFlooble
    }
}

public class SomeOtherClass {
    public void doWork() {
        while (true) {
            // do some stuff...
            // use the Flooble, but only if it is ready
            if (floobleLoader.theFlooble != null)
                doSomething(floobleLoader.theFlooble);
        }
    }
}
Without the theFlooble reference being volatile, the code in doWork() would be at risk for seeing a partially constructed Flooble as it dereferences the theFlooble reference.
How should I understand this? Why, without volatile, might we use a partially constructed Flooble object? Thanks!
Without the volatile you could see a partially constructed object. E.g. consider this Flooble object.
public class Flooble {
    public int x;
    public int y;

    public Flooble() {
        x = 5;
        y = 1;
    }
}

public class SomeOtherClass {
    public void doWork() {
        while (true) {
            // do some stuff...
            // use the Flooble, but only if it is ready
            if (floobleLoader.theFlooble != null)
                doSomething(floobleLoader.theFlooble);
        }
    }

    public void doSomething(Flooble flooble) {
        System.out.println(flooble.x / flooble.y);
    }
}
Without volatile the method doSomething is not guaranteed to see the values 5 and 1 for x and y. It could see for instance x == 5 but y == 0, leading to division by zero.
When you execute the statement theFlooble = new Flooble(), three writes occur:
1. tmpFlooble.x = 5
2. tmpFlooble.y = 1
3. theFlooble = tmpFlooble
If these writes happen in this order everything is ok. But without the volatile the compiler is free to reorder these writes and perform them as it wishes. E.g. first point 3 and then points 1 and 2.
This actually happens all the time. The compiler really does reorder the writes. This is done to increase performance.
The error can easily happen in the following way:
1. Thread A executes the initInBackground() method of BackgroundFloobleLoader. The compiler reorders the writes, so before executing the body of Flooble() (where x and y are set), thread A first performs theFlooble = new Flooble(). Now theFlooble points to a Flooble instance whose x and y are 0.
2. Before thread A continues, another thread B executes the doWork() method of SomeOtherClass. This method calls doSomething(floobleLoader.theFlooble) with the current value of theFlooble, and inside it theFlooble.x is divided by theFlooble.y, resulting in division by zero.
3. Thread B dies due to the uncaught exception. Thread A continues and sets theFlooble.x = 5 and theFlooble.y = 1.
This scenario of course won't happen on every run, but according to the rules of Java, it can happen.
When different threads access your code, any thread can perform modifications on the state of your object, which means that when other threads access it, the state may not be as it should.
From the Oracle documentation:
The Java programming language allows threads to access shared variables. As a rule, to ensure that shared variables are consistently and reliably updated, a thread should ensure that it has exclusive use of such variables by obtaining a lock that, conventionally, enforces mutual exclusion for those shared variables.
The Java programming language provides a second mechanism, volatile fields, that is more convenient than locking for some purposes. A field may be declared volatile, in which case the Java Memory Model ensures that all threads see a consistent value for the variable.
source
This means the value of this variable will never be cached thread-locally: all reads and writes will go straight to "main memory".
For example, picture thread1 and thread2 accessing the object:
Thread1 accesses the object and stores it in its local cache.
Thread2 modifies the object.
Thread1 accesses the object again, but since it is still in its cache, it doesn't see the state updated by thread2.
Look at it from the point of view of the code that does this:
if (floobleLoader.theFlooble != null)
    doSomething(floobleLoader.theFlooble);
Clearly, you need a guarantee that all of the writes performed by new Flooble() are visible to this code before theFlooble could possibly test as != null. Nothing in the code without volatile provides this guarantee. So you need a guarantee you don't have. Fail.
Java provides several ways to get the guarantee you need. One is by use of a volatile variable:
... any write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads. What's more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up to the change. -- Docs
So putting a write to a volatile in one thread and a read to a volatile in the other establishes precisely the happens-before relationship we need.
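For example, a minimal sketch of that writer/reader pairing (the names are illustrative, not from the quoted docs):

class Publisher {
    private int payload;              // plain, non-volatile data
    private volatile boolean ready;   // the volatile variable

    void writerThread() {
        payload = 42;                 // 1. plain write
        ready = true;                 // 2. volatile write
    }

    void readerThread() {
        if (ready) {                  // volatile read
            // Since this read saw the volatile write above, the earlier write to
            // 'payload' happens-before it as well: this prints 42, never 0.
            System.out.println(payload);
        }
    }
}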
I doubt there is such a thing as partially constructed objects in Java. volatile guarantees that every thread will see a constructed object. Since volatile works like a tiny synchronized block on the referenced object, you would end up with an NPE if theFlooble == null. Maybe that is what they mean.
Objects encapsulate a lot of things: variables, methods, etc., and these take time to come into existence inside a computer. In Java, if a variable is declared volatile then all reads and writes to it are atomic. So if a variable referencing an object is declared volatile, then access to its members is allowed only once it has fully loaded in your system (how do you read or write to something that isn't there at all?).
As I understand, this is a correct implementation of the double-checked locking pattern in Java (since Java 5):
class Foo {
    private volatile Bar _barInstance;

    public Bar getBar() {
        if (_barInstance == null) {
            synchronized (this) { // or synchronized(someLock)
                if (_barInstance == null) {
                    Bar newInstance = new Bar();
                    // possible additional initialization
                    _barInstance = newInstance;
                }
            }
        }
        return _barInstance;
    }
}
I wonder whether the absence of volatile is a serious error or just a slight imperfection with a possible performance drawback, assuming _barInstance is accessed only through getBar.
My idea is the following: synchronized introduces a happens-before relation. The thread that initializes _barInstance writes its value to main memory when leaving the synchronized block. So there will be no double initialization of _barInstance even when it isn't volatile: other threads may have null in their local copies of _barInstance (so the first check is true), but they have to read the new value from main memory in the second check after entering the synchronized block (so the second check is false and no re-initialization happens). The only problem is an excessive one-per-thread lock acquisition.
As I understand, it's correct in CLR and I believe it's also correct in JVM. Am I right?
Thank you.
Not using volatile may result in errors in the following case:
Thread 1 enters getBar() and finds _barInstance to be null
Thread 1 attempts to create a Bar object and update the reference to _barInstance. Due to certain compiler optimisations, these operations may be done out of order.
Meanwhile, thread 2 enters getBar() and sees a non-null _barInstance but might see default values in member fields of the _barInstance object. It essentially sees a partially constructed object but the reference is not null.
The volatile modifier prohibits reordering of a write to or a read of the variable _barInstance with respect to any previous reads or writes. Hence, it makes sure that thread 2 will not see a partially constructed object.
For more details: http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
I'm working my way through item 71, "Use lazy initialization judiciously", of Effective Java (second edition). It suggests the use of the double-check idiom for lazy initialization of instance fields using this code (pg 283):
private volatile FieldType field;

FieldType getField() {
    FieldType result = field;
    if (result == null) {              // First check (no locking)
        synchronized (this) {
            result = field;
            if (result == null)        // Second check (with locking)
                field = result = computeFieldValue();
        }
    }
    return result;
}
So, I actually have several questions:
Why is the volatile modifier required on field given that initialization takes place in a synchronized block? The book offers this supporting text: "Because there is no locking if the field is already initialized, it is critical that the field be declared volatile". So, once the field is initialized, is volatile the only guarantee that multiple threads get a consistent view of field, given the lack of other synchronization? If so, why not simply synchronize getField(), or does the above code offer better performance?
The text suggests that the not-required local variable, result, is used to "ensure that field is read only once in the common case where it's already initialized", thereby improving performance. If result was removed, how would field be read multiple times in the common case where it was already initialized?
Why is the volatile modifier required on field given that initialization takes place in a synchronized block?
The volatile is necessary because of the possible reordering of instructions around the construction of objects. The Java memory model allows the just-in-time compiler to reorder instructions so that the initialization of an object's fields can effectively be moved outside of the object's constructor, after the reference has been assigned.
This means that thread 1 can initialize the field inside a synchronized block but thread 2 may still see the object as not fully initialized: non-final fields do not have to be initialized before the object has been assigned to the field. The volatile keyword ensures that field has been fully initialized before it is accessed.
This is an example of the famous "double check locking" bug.
If result was removed, how would field be read multiple times in the common case where it was already initialized?
Anytime you access a volatile field, it causes a memory-barrier to be crossed. This can be expensive compared to accessing a normal field. Copying a volatile field into a local variable is a common pattern if it is to be accessed in any way multiple times in the same method.
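A rough sketch of the difference (my own illustration, reusing the field and computeFieldValue() from the book's example above): without the local variable, the common already-initialized path performs two volatile reads instead of one.

// Without the local copy: two volatile reads of 'field' on the common path.
FieldType getFieldNoLocal() {
    if (field == null) {              // volatile read #1
        synchronized (this) {
            if (field == null)
                field = computeFieldValue();
        }
    }
    return field;                     // volatile read #2
}

// With the local copy (as in the book): a single volatile read on the common path.
FieldType getField() {
    FieldType result = field;         // the only volatile read when already initialized
    if (result == null) {
        synchronized (this) {
            result = field;
            if (result == null)
                field = result = computeFieldValue();
        }
    }
    return result;
}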
See my answer here for more examples of the perils of sharing an object without memory-barriers between threads:
About reference to object before object's constructor is finished
This is fairly complicated, but it is related to how the compiler can rearrange things.
Basically the Double Checked Locking pattern does not work in Java unless the variable is volatile.
This is because, in some cases, the compiler can assign the variable something other than null first and then finish the initialisation of the object it refers to. Another thread would see that the variable is not null and attempt to read it; this can cause all sorts of very odd outcomes.
Take a look at this other SO question on the topic.
Good questions.
Why is the volatile modifier required on field given that initialization takes place in a synchronized block?
If you have no synchronization and you assign to that shared global field, there is no promise that all the writes that occur during construction of that object will be seen. For instance, imagine FieldType looks like this:
public class FieldType {
    Object obj = new Object();
    Object obj2 = new Object();

    public Object getObject() { return obj; }
    public Object getObject2() { return obj2; }
}
It is possible for getField() to return a non-null instance whose getObject() and getObject2() methods nevertheless return null. This is because, without synchronization, the writes to those fields can race with the construction of the object.
How is this fixed with volatile? All writes that occur prior to a volatile write are visible after that volatile write occurs.
If result was removed, how would field be read multiple times in the common case where it was already initialized?
Storing the field into a local variable once and reading that local throughout the method means a single read of the shared field and purely thread-local reads after that. You can argue premature optimization here, but I like this style because you won't run into the strange reordering problems that can occur if you don't.
In the examples mentioned for out-of-order writes in double-checked locking scenarios (ref: IBM article & Wikipedia article),
I could not understand the simple reason why Thread 1 would come out of the synchronized block before the object is fully initialized. As per my understanding, "new" and the constructor call should execute in sequence, and the synchronized lock should not be released until all of that work is completed.
Please let me know what I am missing here.
The constructor can have completed - but that doesn't mean that all the writes involved within that constructor have been made visible to other threads. The nasty situation is when the reference becomes visible to other threads (so they start using it) before the contents of the object become visible.
You might find Bill Pugh's article on it helps shed a little light, too.
Personally I just avoid double-checked locking like the plague, rather than trying to make it all work.
The code in question is here:
public static Singleton getInstance()
{
    if (instance == null)
    {
        synchronized (Singleton.class) {        //1
            if (instance == null)               //2
                instance = new Singleton();     //3
        }
    }
    return instance;
}
Now the problem with this cannot be understood as long as you keep thinking that the code executes in the order it is written. Even if it does, there is the issue of cache synchronization across multiple processors (or cores) in a Symmetrical Multiprocessing architecture, which is the mainstream today.
Thread1 could for example publish the instance reference to the main memory, but fail to publish any other data inside the Singleton object that was created. Thread2 will observe the object in an inconsistent state.
As long as Thread2 doesn't enter the synchronized block, the cache synchronization doesn't have to happen, so Thread2 can go on indefinitely without ever observing the Singleton in a consistent state.
Thread 2 checks to see if the instance is null when Thread 1 is at //3 .
public static Singleton getInstance()
{
    if (instance == null)
    {
        synchronized (Singleton.class) {        //1
            if (instance == null)               //2
                instance = new Singleton();     //3
        }
    }
    return instance;                            //4
}
At this point the memory for instance has been allocated from the heap and the pointer to it is stored in the instance reference, so the "if statement" executed by Thread 2 returns "false".
Note that because instance is not null when Thread 2 checks it, Thread 2 does not enter the synchronized block and instead returns a reference to a "fully constructed, but partially initialized, Singleton object."
There's a general problem with code not being executed in the order it's written. In Java, a thread is only obligated to be consistent with itself. An instance created on one line with new has to be ready to go on the next. There's no such obligation to other threads. For instance, if fieldA is 1 and fieldB is 2 going into this code on thread 1:
fieldA = 5;
fieldB = 10;
and thread 2 runs this code:
int x = fieldA;
int y = fieldB;
x y values of 1 2, 5 2, and 5 10 are all to be expected, but 1 10--fieldB was set and/or picked up before fieldA--is perfectly legal, and likely, as well. So double-checked locking is a special case of a more general problem, and if you work with multiple threads you need to be aware of it, particularly if they all access the same fields.
One simple solution from Java 1.5 that should be mentioned: fields marked volatile are guaranteed to be read from main memory immediately before being referenced and written immediately after. If fieldA and fieldB above were declared volatile, an x y value of 1 10 would not be possible. If instance is volatile, double-checked locking works. There's a cost to using volatile fields, but it's less than synchronizing, so the double-checked locking becomes a pretty good idea. It's an even better idea because it avoids having a bunch of threads waiting to synch while CPU cores are sitting idle.
But you do want to understand this (if you can't be talked out of multithreading). On the one hand you need to avoid timing problems and on the other avoid bringing your program to a halt with all the threads waiting to get into synch blocks. And it's very difficult to understand.