volatile in double-checked locking in Java [duplicate]

As I understand, this is a correct implementation of the double-checked locking pattern in Java (since Java 5):
class Foo {
    private volatile Bar _barInstance;

    public Bar getBar() {
        if (_barInstance == null) {
            synchronized (this) { // or synchronized(someLock)
                if (_barInstance == null) {
                    Bar newInstance = new Bar();
                    // possible additional initialization
                    _barInstance = newInstance;
                }
            }
        }
        return _barInstance;
    }
}
I wonder if absence of volatile is a serious error or just a slight imperfection with possible performance drawback assuming _barInstance accessed only through getBar.
My idea is the following: synchronized introduces a happens-before relation. The thread that initializes _barInstance writes its value to main memory when leaving the synchronized block. So there will be no double initialization of _barInstance even when it isn't volatile: other threads have null in their local copies of _barInstance (and get true in the first check), but have to read the new value from main memory in the second check after entering the synchronized block (they get false and do no re-initialization). So the only problem is an excessive one-per-thread lock acquisition.
As I understand, it's correct in CLR and I believe it's also correct in JVM. Am I right?
Thank you.

Not using volatile may result in errors in the following case:
Thread 1 enters getBar() and finds _barInstance to be null
Thread 1 attempts to create a Bar object and update the reference to _barInstance. Due to certain compiler optimisations, these operations may be done out of order.
Meanwhile, thread 2 enters getBar() and sees a non-null _barInstance but might see default values in member fields of the _barInstance object. It essentially sees a partially constructed object but the reference is not null.
The volatile modifier prohibits reordering a write or read of the variable _barInstance with any preceding read or write. Hence, it makes sure that thread 2 will not see a partially constructed object.
For more details: http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html

Related

Issue with Double Check Locking in Java [duplicate]

An article mentions an issue with "Double Check Locking". Please see the example below:
public class MyBrokenFactory {
    private static MyBrokenFactory instance;
    private int field1, field2 ...

    public static MyBrokenFactory getFactory() {
        // This is incorrect: don't do it!
        if (instance == null) {
            synchronized (MyBrokenFactory.class) {
                if (instance == null)
                    instance = new MyBrokenFactory();
            }
        }
        return instance;
    }

    private MyBrokenFactory() {
        field1 = ...
        field2 = ...
    }
}
Reason:- (Please note the order of execution by the numbering)
Thread 1: 'gets in first' and starts creating instance.
1. Is instance null? Yes.
2. Synchronize on class.
3. Memory is allocated for instance.
4. Pointer to memory saved into instance.
[[Thread 2]]
7. Values for field1 and field2 are written
to memory allocated for object.
.....................
Thread 2: gets in just as Thread 1 has written the object reference
to memory, but before it has written all the fields.
5. Is instance null? No.
6. instance is non-null, but field1 and field2 haven't yet been set!
This thread sees invalid values for field1 and field2!
Question :
As the creation of the new instance(new MyBrokenFactory()) is done from the synchronized block, will the lock be released before the entire initialization is completed (private MyBrokenFactory() is completely executed) ?
Reference - https://www.javamex.com/tutorials/double_checked_locking.shtml
Please explain.

The problem is here:
Thread 2: gets in just as Thread 1 has written the object reference to memory, but before it has written all the fields.
Is instance null? No.
Without synchronization, thread 2 might see instance as null, even though thread 1 has written it. Notice that the first check of instance is outside of the synchronized block:
if (instance == null) {
synchronized (MyBrokenFactory.class) {
Since that first check is done outside of the block there's no guarantee that thread 2 will see the correct value of instance.
I have no idea what you're trying to do with field1 and field2, you never even write them.
Re. Your edit:
As the creation of the new instance(new MyBrokenFactory()) is done from the synchronized block
I think what you're asking is if the two instance fields, field1 and field2 are guaranteed to be visible. The answer is no, and the problem is the same as with instance. Because you don't read instance from within a synchronized block, there's no guarantee that those instance fields will be read correctly. If instance is non-null, you never enter the synchronized block, so no synchronization occurs.
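For comparison, here is a minimal sketch of the usual repair (valid since Java 5): declare the static field volatile, so a reader that sees a non-null reference also sees the constructor's writes. The class name MyFixedFactory, the sum() helper and the field values 1 and 2 are made up for illustration; they are not from the article.

```java
// Hypothetical repaired variant of the factory discussed above.
final class MyFixedFactory {
    // volatile: a thread that reads a non-null reference here is guaranteed
    // to also see the field writes made by the constructor.
    private static volatile MyFixedFactory instance;

    private int field1, field2;

    private MyFixedFactory() {
        field1 = 1; // placeholder values, assumed for the example
        field2 = 2;
    }

    static MyFixedFactory getFactory() {
        MyFixedFactory result = instance;       // one volatile read on the fast path
        if (result == null) {
            synchronized (MyFixedFactory.class) {
                result = instance;              // re-check while holding the lock
                if (result == null) {
                    instance = result = new MyFixedFactory();
                }
            }
        }
        return result;
    }

    int sum() {
        return field1 + field2;
    }
}
```

Every caller of MyFixedFactory.getFactory() gets the same instance, and the unsynchronized fast path is now covered by the volatile read.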

Here is the answer to my own question. I found it by looking into another, similar question here.
Synchronized guarantees that only one thread at a time can enter a block of code. But it doesn't guarantee that variable modifications made within the synchronized section will be visible to other threads. Only threads that enter a block synchronized on the same lock are guaranteed to see the changes. This is the reason why double-checked locking is broken: it is not synchronized on the reader's side. The reading thread may see that the singleton is not null, but the singleton's data may not be fully initialized (visible).
Ordering is provided by volatile. volatile guarantees ordering: for instance, a write to a volatile singleton static field guarantees that the writes to the singleton object are finished before the write to the volatile static field. It doesn't prevent two singleton objects from being created; that is provided by synchronized.
Static final class fields don't need to be volatile. In Java, the JVM takes care of this problem.
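The point about static final fields can be sketched with the initialization-on-demand holder idiom, which relies on the JVM's class initialization rules instead of volatile. The names LazyFactory and Holder are illustrative assumptions, not taken from the question.

```java
// Hypothetical lazily initialized singleton without volatile or explicit locking.
final class LazyFactory {
    private LazyFactory() { }

    // The nested class is initialized by the JVM on first use of Holder.INSTANCE.
    // Class initialization is serialized by the JVM, and the static final field
    // is safely published to every thread that reads it afterwards.
    private static final class Holder {
        static final LazyFactory INSTANCE = new LazyFactory();
    }

    static LazyFactory getFactory() {
        return Holder.INSTANCE;
    }
}
```

This only fits static singletons; for lazily initialized instance fields the volatile-based double check is still the usual choice.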

Why the code would be at risk for seeing a partially constructed object?

There is an IBM article about using volatile, and its explanation confused me. Below is a sample from the article, followed by its explanation:
public class BackgroundFloobleLoader {
    public volatile Flooble theFlooble;

    public void initInBackground() {
        // do lots of stuff
        theFlooble = new Flooble(); // this is the only write to theFlooble
    }
}

public class SomeOtherClass {
    public void doWork() {
        while (true) {
            // do some stuff...
            // use the Flooble, but only if it is ready
            if (floobleLoader.theFlooble != null)
                doSomething(floobleLoader.theFlooble);
        }
    }
}
Without the theFlooble reference being volatile, the code in doWork() would be at risk for seeing a partially constructed Flooble as it dereferences the theFlooble reference.
How should I understand this? Why might we use a partially constructed Flooble object without volatile? Thanks!

Without the volatile you could see a partially constructed object. E.g. consider this Flooble object.
public class Flooble {
    public int x;
    public int y;

    public Flooble() {
        x = 5;
        y = 1;
    }
}

public class SomeOtherClass {
    public void doWork() {
        while (true) {
            // do some stuff...
            // use the Flooble, but only if it is ready
            if (floobleLoader.theFlooble != null)
                doSomething(floobleLoader.theFlooble);
        }
    }

    public void doSomething(Flooble flooble) {
        System.out.println(flooble.x / flooble.y);
    }
}
Without volatile the method doSomething is not guaranteed to see the values 5 and 1 for x and y. It could see for instance x == 5 but y == 0, leading to division by zero.
When you execute this operation theFlooble = new Flooble(), three writes occur:
1. tmpFlooble.x = 5
2. tmpFlooble.y = 1
3. theFlooble = tmpFlooble
If these writes happen in this order everything is ok. But without the volatile the compiler is free to reorder these writes and perform them as it wishes. E.g. first point 3 and then points 1 and 2.
This actually happens all the time. The compiler really does reorder the writes. This is done to increase performance.
The error can easily happen in the following way:
Thread A executes initInBackground() method from class BackgroundFloobleLoader. The compiler reorders the writes so before executing the body of Flooble() (where x and y are set), the thread A first executes theFlooble = new Flooble(). Now, theFlooble points to a flooble instance, whose x and y are 0. Before thread A continues, some other thread B executes method doWork() of class SomeOtherClass. This method calls method doSomething(floobleLoader.theFlooble) with the current value of theFlooble. In this method theFlooble.x is divided by theFlooble.y resulting in division by zero. Thread B finishes due to uncaught exception. Thread A continues and sets theFlooble.x = 5 and theFlooble.y = 1.
This scenario of course won't happen on every run, but according to the rules of Java, can happen.
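To see the guaranteed behaviour rather than the failure (which is hard to reproduce on demand), here is a small runnable sketch of the volatile version. The FloobleDemo class and its run() method are hypothetical scaffolding; Flooble is the class from this answer with package-private fields.

```java
final class Flooble {
    int x;
    int y;

    Flooble() {
        x = 5;
        y = 1;
    }
}

// Hypothetical harness: one thread publishes a Flooble, another waits for it.
final class FloobleDemo {
    // volatile publication: the constructor's writes happen-before any read
    // that observes the non-null reference.
    static volatile Flooble theFlooble;

    // Returns x / y as observed by the reader thread.
    static int run() {
        theFlooble = null;
        final int[] observed = new int[1];
        Thread reader = new Thread(() -> {
            Flooble f;
            while ((f = theFlooble) == null) {
                Thread.yield(); // wait until the writer has published
            }
            observed[0] = f.x / f.y; // sees x == 5 and y == 1
        });
        Thread writer = new Thread(() -> theFlooble = new Flooble());
        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 5
    }
}
```

Removing volatile from theFlooble makes the result formally unspecified: the reader may spin forever or divide by zero, even if most runs still print 5.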

When different threads access your code, any thread can perform modifications on the state of your object, which means that when other threads access it, the state may not be as it should.
From the oracle documentation:
The Java programming language allows threads to access shared
variables. As a rule, to ensure that shared variables are
consistently and reliably updated, a thread should ensure that it has
exclusive use of such variables by obtaining a lock that,
conventionally, enforces mutual exclusion for those shared variables.
The Java programming language provides a second mechanism, volatile
fields, that is more convenient than locking for some purposes.
A field may be declared volatile, in which case the Java Memory Model
ensures that all threads see a consistent value for the variable.
This means the value of the variable is never cached thread-locally: conceptually, all reads and writes go straight to "main memory".
For example, picture thread1 and thread2 accessing the object without volatile:
Thread1 accesses the object and stores it in its local cache.
Thread2 modifies the object.
Thread1 accesses the object again, but since it is still in its cache, it doesn't see the state updated by thread2.

Look at it from the point of view of the code that does this:
if (floobleLoader.theFlooble != null)
doSomething(floobleLoader.theFlooble);
Clearly, you need a guarantee that all of the writes performed by new Flooble() are visible to this code before theFlooble could possibly test as != null. Nothing in the code without volatile provides this guarantee. So you need a guarantee you don't have. Fail.
Java provides several ways to get the guarantee you need. One is by use of a volatile variable:
... any write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads. What's more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up the change. -- Docs
So putting a write to a volatile in one thread and a read to a volatile in the other establishes precisely the happens-before relationship we need.
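The "side effects of the code that led up to the change" part can be sketched with a plain field guarded by a volatile flag. ReadyFlagDemo, payload and ready are names invented for this example.

```java
// Hypothetical demo: a plain field made visible through a volatile flag.
final class ReadyFlagDemo {
    static int payload;            // ordinary, non-volatile field
    static volatile boolean ready; // volatile flag that publishes the payload

    static int run() {
        payload = 0;
        ready = false;
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {       // volatile read
                Thread.yield();
            }
            seen[0] = payload;     // guaranteed to be 42 once ready is true
        });
        reader.start();
        payload = 42;              // ordinary write ...
        ready = true;              // ... ordered before this volatile write
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 42
    }
}
```

The write to payload happens-before the write to ready, which happens-before the read of ready that returns true, so the reader's read of payload must see 42.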

I doubt there is such a thing as partially constructed objects in Java. Volatile guarantees that every thread will see a constructed object. Since volatile works like a tiny synchronized block on the referenced object, you would end up with an NPE if theFlooble == null. Maybe that is what they mean.

Objects encapsulate a lot of things: variables, methods, etc., and these take time to come into existence inside a computer. In Java, if a variable is declared volatile then all reads and writes to it are atomic. So if a variable referencing an object is declared volatile then access to its members is allowed only once it is fully loaded in your system (how do you read or write to something that isn't there at all?)

'Effective Java' conundrum: Why is volatile required in this concurrent code? [duplicate]

I'm working my way through item 71, "Use lazy initialization judiciously", of Effective Java (second edition). It suggests the use of the double-check idiom for lazy initialization of instance fields using this code (pg 283):
private volatile FieldType field;

FieldType getField() {
    FieldType result = field;
    if (result == null) { // First check (no locking)
        synchronized (this) {
            result = field;
            if (result == null) // Second check (with locking)
                field = result = computeFieldValue();
        }
    }
    return result;
}
So, I actually have several questions:
Why is the volatile modifier required on field given that initialization takes place in a synchronized block? The book offers this supporting text: "Because there is no locking if the field is already initialized, it is critical that the field be declared volatile". Therefore, is it the case that once the field is initialized, volatile is the only guarantee of multiple thread consistent views on field given the lack of other synchronization? If so, why not synchronize getField() or is it the case that the above code offers better performance?
The text suggests that the not-required local variable, result, is used to "ensure that field is read only once in the common case where it's already initialized", thereby improving performance. If result was removed, how would field be read multiple times in the common case where it was already initialized?

Why is the volatile modifier required on field given that initialization takes place in a synchronized block?
The volatile is necessary because of the possible reordering of instructions around the construction of objects. The Java memory model allows the just-in-time (JIT) compiler to reorder instructions so that field initialization moves outside of the object constructor.
This means that thread-1 can initialize the field inside of a synchronized block but thread-2 may see the object not fully initialized. Non-final fields do not have to be initialized before the object has been assigned to the field. The volatile keyword ensures that the field has been fully initialized before it is accessed.
This is an example of the famous "double check locking" bug.
If result was removed, how would field be read multiple times in the common case where it was already initialized?
Anytime you access a volatile field, it causes a memory-barrier to be crossed. This can be expensive compared to accessing a normal field. Copying a volatile field into a local variable is a common pattern if it is to be accessed in any way multiple times in the same method.
See my answer here for more examples of the perils of sharing an object without memory-barriers between threads:
About reference to object before object's constructor is finished

This is fairly complicated, but it is related to how the compiler can rearrange things.
Basically the Double Checked Locking pattern does not work in Java unless the variable is volatile.
This is because, in some cases, the compiler can assign the variable to something other than null, then do the initialisation of the variable and reassign it. Another thread would see that the variable is not null and attempt to read it - this can cause all sorts of very special outcomes.
Take a look at this other SO question on the topic.

Good questions.
Why is the volatile modifier required on field given that initialization takes place in a synchronized block?
If you have no synchronization, and you assign to that shared global field, there is no promise that all writes that occur during construction of that object will be seen. For instance, imagine FieldType looks like this:
public class FieldType {
    Object obj = new Object();
    Object obj2 = new Object();

    public Object getObject() { return obj; }
    public Object getObject2() { return obj2; }
}
It is possible that getField() returns a non-null instance but that instance's getObject() and getObject2() methods return null values. This is because without synchronization the writes to those fields can race with the construction of the object.
How is this fixed with volatile? All writes that occur prior to a volatile write are visible after that volatile write occurs.
If result was removed, how would field be read multiple times in the common case where it was already initialized?
Storing locally once and reading throughout the method ensures one thread/process local store and all thread local reads. You can argue premature optimization in those regards but I like this style because you won't run yourself into strange reordering problems that can occur if you don't.

Java: Caching of non-volatile variables by different threads

The situation is the following:
I have an object with lots of setters and getters.
An instance of this object is created in one particular thread, where all values are set. Initially I create an "empty" object using a new statement, and only then do I call some setter methods based on some complicated legacy logic.
Only then does this object become available to all other threads, which use only getters.
The question: Do I have to make all variables of this class volatile or not?
Concerns:
Creation of a new instance of the object and setting all its values are separated in time.
But all other threads have no idea about this new instance until all values are set. So other threads should not have a cached copy of a not fully initialized object. Is that right?
Note: I am aware about builder pattern, but I cannot apply it there for several other reasons :(
EDITED:
As I feel two answers from Mathias and axtavt do not match very well, I would like to add an example:
Let's say we have a foo class:
class Foo {
    public int x = 0;
}
and two threads are using it as described above:
// Thread 1 init the value:
Foo f = new Foo();
f.x = 5;
values.add(f); // Publication via thread-safe collection like Vector or Collections.synchronizedList(new ArrayList(...)) or ConcurrentHashMap?.
// Thread 2
if (values.size() > 0) {
    System.out.println(values.get(0).x); // always 5 ?
}
As I understood Mathias, it can print out 0 on some JVM according to JLS. As I understood axtavt it will always print 5.
What is your opinion?
--
Regards,
Dmitriy

In this case you need to use safe publication idioms when making your object available to other threads, namely (from Java Concurrency in Practice):
Initializing an object reference from a static initializer;
Storing a reference to it into a volatile field or AtomicReference;
Storing a reference to it into a final field of a properly constructed object; or
Storing a reference to it into a field that is properly guarded by a lock.
If you use safe publication, you don't need to declare fields volatile.
However, if you don't use it, declaring fields volatile (theoretically) won't help, because the memory barriers incurred by volatile are one-sided: a volatile write can be reordered with non-volatile actions after it.
So, volatile ensures correctness in the following case:
class Foo {
    public int x;
}
volatile Foo foo;
// Thread 1
Foo f = new Foo();
f.x = 42;
foo = f; // Safe publication via volatile reference
// Thread 2
if (foo != null)
    System.out.println(foo.x); // Guaranteed to see 42
but doesn't work in this case:
class Foo {
    public volatile int x;
}
Foo foo;
// Thread 1
Foo f = new Foo();
// Volatile doesn't prevent reordering of the following actions!!!
f.x = 42;
foo = f;
// Thread 2
if (foo != null)
    System.out.println(foo.x); // NOT guaranteed to see 42,
                               // since f.x = 42 can happen after foo = f
From the theoretical point of view, in the first sample there is a transitive happens-before relationship
f.x = 42 happens before foo = f happens before read of foo.x
In the second example f.x = 42 and read of foo.x are not linked by happens-before relationship, therefore they can be executed in any order.
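The other idioms in the list above give the same transitive happens-before chain. As a rough sketch, here is publication through an AtomicReference; Config, timeout and PublicationDemo are hypothetical names chosen for the example.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical mutable object with a plain (non-volatile, non-final) field.
final class Config {
    int timeout;
}

final class PublicationDemo {
    // AtomicReference.set/get have the memory effects of a volatile write/read.
    static final AtomicReference<Config> PUBLISHED = new AtomicReference<>();

    // Called by the initializing thread: fill in the object, then publish it.
    static void publish(int timeout) {
        Config c = new Config();
        c.timeout = timeout; // ordinary write, ordered before set()
        PUBLISHED.set(c);    // safe publication
    }

    // Called by any other thread: returns the fallback until something is published.
    static int readTimeoutOrDefault(int fallback) {
        Config c = PUBLISHED.get();
        return c == null ? fallback : c.timeout;
    }
}
```

As in the volatile example, this only holds as long as the Config object is not modified after it has been published.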

You do not need to declare your field volatile if its value is set before the start method is called on the threads that read the field.
The reason is that in that case the setting is in a happens-before relation (as defined in the Java Language Specification) with the read in the other thread.
The relevant rules from the JLS are:
Each action in a thread happens-before every action in that thread that comes later in the program's order
A call to start on a thread happens-before any action in the started thread.
However, if you start the other threads before setting the field, then you must declare the field volatile. The JLS does not allow you to assume that the thread will not cache the value before it reads it for the first time, even if that may be the case on a particular version of the JVM.
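The two JLS rules can be seen together in a short sketch: a plain field is written before start() and read inside the started thread. StartDemo and its members are hypothetical names.

```java
// Hypothetical demo of the Thread.start() happens-before rule.
final class StartDemo {
    static int value; // deliberately neither volatile nor guarded by a lock

    static int run() {
        final int[] seen = new int[1];
        value = 7;                                    // written before start()
        Thread t = new Thread(() -> seen[0] = value); // read in the started thread
        t.start();                                    // start() happens-before the thread's actions
        try {
            t.join();                                 // join() makes seen[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 7
    }
}
```

If value were assigned after t.start() instead, nothing would order the write before the read and the thread could legally observe 0.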

In order to fully understand what's going on I have been reading about the Java Memory Model (JMM). A useful introduction to the JMM can be found in Java Concurrency in Practice.
I think the answer to the question is: no, in the example given it is not necessary to make the members of the object volatile. However, this implementation is rather brittle, as the guarantee depends on the exact order in which things are done and on the thread safety of the container. A builder pattern would be a much better option.
Why is it guaranteed:
The thread 1 does all the assignment BEFORE putting the value into the thread safe container.
The add method of the thread safe container must use some synchronization construct like volatile read / write, lock or synchronized(). This guarantees two things:
Instructions that come before the synchronization in thread 1 will actually be executed before it. That is, the JVM is not allowed to reorder instructions across the synchronization instruction for optimization purposes. This is called a happens-before guarantee.
All writes which happen before the synchronization in thread 1 will afterwards be visible to all other threads.
The objects are NEVER modified after publication.
However, if the container were not thread safe, or the order of things were changed by somebody not aware of the pattern, or the objects were accidentally changed after publication, then there would be no guarantees anymore. So following the Builder pattern, as can be generated by Google AutoValue or FreeBuilder, is much safer.
This article on the topic is also quite good:
http://tutorials.jenkov.com/java-concurrency/volatile.html
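The reasoning above can be tried out with a small sketch of the scenario from the question, using Collections.synchronizedList as the thread-safe container. Item and ContainerPublicationDemo are hypothetical names (Item plays the role of Foo).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the question's Foo class.
final class Item {
    int x = 0;
}

final class ContainerPublicationDemo {
    static int run() {
        // add(), isEmpty() and get() all lock the same mutex inside the wrapper.
        final List<Item> values = Collections.synchronizedList(new ArrayList<>());
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (values.isEmpty()) {
                Thread.yield();        // wait for publication
            }
            seen[0] = values.get(0).x; // sees the value set before add()
        });
        reader.start();
        Item item = new Item();
        item.x = 5;                    // all setters run BEFORE publication
        values.add(item);              // publication through the container
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 5
    }
}
```

Swapping the two lines so that values.add(item) runs before item.x = 5 removes the guarantee, which is exactly the brittleness described above.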

Java concurrency - why doesn't synchronizing a setter (but not a getter) make a class thread-safe? [duplicate]

I'm reading Java Concurrency in Practice, and I've come to an example that puzzles me.
The authors state that this class is not thread-safe:
public class MutableInteger {
    private int number;

    public int getInt() {
        return number;
    }

    public void setInt(int val) {
        number = val;
    }
}
And they also state that synchronizing only one method (the setter for example) would not do; you have to synchronize both.
My question is: Why? Wouldn't synchronizing the setter just do?

Java has a happens before/happens after memory model. There needs to be some common concurrent construct (e.g. synchronized block/method, lock, volatile, atomic) on both the write path and the read path to trigger this behaviour.
If you synchronize both methods you are creating a lock on the whole object that will be shared by both the read and write threads. The JVM will ensure that any changes that occur on the writing thread that occur before leaving the (synchronized) setInt method will be visible to any reading threads after they enter the (synchronized) getInt method. The JVM will insert the necessary memory barriers to ensure that this will happen.
If only the write method is synchronized then changes to the object may not be visible to any reading thread. This is because there is no point on the read path that the JVM can use to ensure that the reading thread's visible memory (caches etc.) is in line with the writing thread's. Making the getInt method synchronized would provide that.
Note: specifically in this case making the field 'number' volatile would give the correct behaviour as volatile read/write also provides the same memory visibility behaviour in the JVM and the action inside of the setInt method is only an assignment.
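Both fixes mentioned in this answer can be sketched side by side. SynchronizedInteger follows the naming used in the book; VolatileInteger is a name made up here for the volatile alternative.

```java
// Both accessors lock the same monitor (this), so a get that follows a set
// is guaranteed to see the value written by that set.
final class SynchronizedInteger {
    private int number;

    public synchronized int getInt() {
        return number;
    }

    public synchronized void setInt(int val) {
        number = val;
    }
}

// Hypothetical alternative: sufficient here because setInt is a single
// assignment that does not depend on the previous value.
final class VolatileInteger {
    private volatile int number;

    public int getInt() {
        return number;
    }

    public void setInt(int val) {
        number = val;
    }
}
```

The volatile version would stop being enough for a compound operation such as an increment, where the read and the write have to happen atomically.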

It's explained in the book before the sample (page 35):
"Synchronizing only the setter would not be sufficient: threads calling get would still be able to see stale values."
Stale data: When the reader thread examines ready, it may see an out-of-date value. Unless synchronization is used every time a variable is accessed, it is possible to see a stale value for that variable. Worse, staleness is not all-or-nothing: a thread can see an up-to-date value of one variable but a stale value of another variable that was written first.

If you synchronize only the setter method, you can only guarantee that the attribute is not modified incorrectly; you cannot be sure that the value you read is not stale.

Because number is not volatile and getInt() is not synchronized, getInt() may return stale values. For more information, read about the Java memory model.
