Why is DCL without volatile valid for primitives? - java

Disclaimer: I don't use DCL in real production code - I have an academic interest only.
I've read the following famous article: The "Double-Checked Locking is Broken" Declaration
The problem statement (as I understand it):
// Correct multithreaded version
class Foo {
private Helper helper = null;
public synchronized Helper getHelper() {
if (helper == null)
helper = new Helper();
return helper;
}
// other functions and members...
}
Let's imagine that thread_1 has executed the line helper = new Helper();
Another thread (thread_2) might see that the helper reference is not null even though the object is not initialized yet. This happens because, from thread_2's view, the constructor invocation might be reordered with the assignment to helper.
But the article mentions that this approach works properly for 32-bit primitives:
Although the double-checked locking idiom cannot be used for
references to objects, it can work for 32-bit primitive values (e.g.,
int's or float's). Note that it does not work for long's or double's,
since unsynchronized reads/writes of 64-bit primitives are not
guaranteed to be atomic.
// Correct Double-Checked Locking for 32-bit primitives
class Foo {
private int cachedHashCode = 0;
public int hashCode() {
int h = cachedHashCode;
if (h == 0)
synchronized(this) {
if (cachedHashCode != 0) return cachedHashCode;
h = computeHashCode();
cachedHashCode = h;
}
return h;
}
// other functions and members...
}
Please explain why this works. I know that a 32-bit write is atomic.
What is the reason for the local variable here?

The essence of the "DCL is broken" trope is that, using DCL to initialize a singleton object, a thread could see the reference to the object before it sees the object in a fully initialized state. DCL adequately synchronizes the effectively final global variable that refers to the singleton, but it fails to synchronize the singleton object to which the global refers.
In your example, there's only just the global variable. There is no "object to which it refers."
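To make the role of the local variable concrete, here is a minimal sketch of the same pattern. The class name CachedHashDemo and the body of computeHashCode() are made up for illustration. The field is read once into h, so the null-style test and the return statement use the same value, and because the published thing is the int itself, there is no separate object whose fields another thread could observe half-initialized.

```java
// Hypothetical example class; computeHashCode() stands in for any
// deterministic computation that never yields 0.
final class CachedHashDemo {
    private final String name;
    private int cachedHashCode = 0; // plain int: reads and writes are atomic

    CachedHashDemo(String name) {
        this.name = name;
    }

    @Override
    public int hashCode() {
        int h = cachedHashCode;          // single unsynchronized read of the field
        if (h == 0) {
            synchronized (this) {
                h = cachedHashCode;      // re-check under the lock
                if (h == 0) {
                    h = computeHashCode();
                    cachedHashCode = h;
                }
            }
        }
        return h;                        // returns the local, never re-reads the field
    }

    private int computeHashCode() {
        int h = 31 * name.hashCode() + 17;
        return h == 0 ? 1 : h;           // 0 is reserved for "not computed yet"
    }
}
```

A thread that reads a stale 0 simply takes the slow path and recomputes or re-reads the value under the lock, so the worst case is extra work, not a wrong result.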

Related

Java concurrency in practice - safe publication, immutable object and volatile

I'm reading "Java concurrency in practice" and one thing is confusing me.
class OneValueCache {
private final BigInteger lastNumber;
private final BigInteger[] lastFactors;
public OneValueCache(BigInteger lastNumber, BigInteger[] lastFactors) {
this.lastNumber = lastNumber;
this.lastFactors = Arrays.copyOf(lastFactors, lastFactors.length);
}
public BigInteger[] getFactors(BigInteger i) {
if (lastNumber == null || !lastNumber.equals(i)) {
return null;
}
return Arrays.copyOf(lastFactors, lastFactors.length);
}
}
class VolatileCachedFactorized implements Servlet {
private volatile OneValueCache cache = new OneValueCache(null, null);
public void service(ServletRequest req, ServletResponse resp) {
BigInteger i = extractFromRequest(req);
BigInteger[] factors = cache.getFactors(i);
if (factors == null) {
factors = factor(i);
cache = new OneValueCache(i, factors);
}
encodeIntoResponse(resp, factors);
}
}
In the code above the author uses volatile with a reference to the immutable OneValueCache, but a few pages later he writes:
Immutable objects can be used safely by any thread without additional synchronization, even when synchronization is not used to publish them.
So... is volatile not necessary in the code above?
There are two levels of thread safety being applied here. One is at the reference level (done using volatile). Think of an example where one thread reads the reference as null while another thread sees some other reference value (changed in between). Volatile guarantees that the publication by one thread is visible to another. But another level of thread safety is required to safeguard the internal members themselves, which have the potential to be changed. Just having a volatile has no impact on the data within the cache (like lastNumber, lastFactors). Immutability helps in that case.
As a general rule (referred here) of good thread-safe programming practice:
Do not assume that declaring a reference volatile guarantees safe
publication of the members of the referenced object
This is the same reason why putting a volatile keyword in front of a HashMap variable does not make it thread-safe.
cache is not a cache, it is a reference to a cache. The reference needs to be volatile in order that the switch of cache is visible to all threads.
Even after assignment to cache, other threads may be using the old cache, which they can safely do. But if you want the new cache to be seen as soon as it is switched, volatile is needed. There is still a window where threads might be using the old cache, but volatile guarantees that subsequent accessors will see the new cache. Do not confuse 'safety' with 'timeliness'.
Another way to look at this is to note that immutability is a property of the cache object, and cannot affect the use of any reference to that object. (And obviously the reference is not immutable, since we assign to it).

A rare usage of WeakReference?

I have a class whose instances are initialized and used by the underlying platform.
class MyAttributeConverter implements AttributeConverter<XX, YY> {
public YY convertToDatabaseColumn(XX attribute) { return null; }
public XX convertToEntityAttribute(YY dbData) { return null; }
}
Nothing's wrong, and I thought I needed to add some static methods to be used as method references.
private static MyAttributeConverter instance;
// just a lazy-initialization;
// no synchronization is required;
// multiple instantiation is not a problem;
private static MyAttributeConverter instance() {
if (instance == null) {
instance = new MyAttributeConverter();
}
return instance;
}
// do as MyAttributeConverter::toDatabaseColumn(xx)
public static YY toDatabaseColumn(XX attribute) {
return instance().convertToDatabaseColumn(attribute);
}
public static XX toEntityAttribute(YY dbData) {
return instance().convertToEntityAttribute(dbData);
}
Still nothing seems wrong (I believe), but I don't like the instance being kept alive with the class, which is why I'm trying to do this.
private static WeakReference<MyAttributeConverter> reference;
public static <R> R applyInstance(Function<? super MyAttributeConverter, ? extends R> function) {
MyAttributeConverter referent;
if (reference == null) {
referent = new MyAttributeConverter();
reference = new WeakReference<>(referent);
return applyInstance(function);
}
referent = reference.get();
if (referent == null) {
referent = new MyAttributeConverter();
reference = new WeakReference<>(referent);
return applyInstance(function);
}
return function.apply(referent); // ##?
}
I basically don't even know how to test this code. And I'm sorry for my questions which each might be somewhat vague.
Is this a (right/wrong) approach?
Is there any chance that reference.get() inside the function.apply idiom may be null?
Is there any chance that there may be some problems such as memory-leak?
Should I rely on SoftReference rather than WeakReference?
Thank you.
Note that a method like
// multiple instantiation is not a problem;
private static MyAttributeConverter instance() {
if (instance == null) {
instance = new MyAttributeConverter();
}
return instance;
}
is not thread safe, as it bears two reads of the instance field; each of them may perceive updates made by other threads or not. This implies that the first read in instance == null may perceive a newer value written by another thread whereas the second in return instance; could evaluate to the previous value, i.e. null. So this method could return null when more than one thread is executing it concurrently. This is a rare corner case, still, this method is not safe. You’d need a local variable to ensure that the test and the return statement use the same value.
// multiple instantiation is not a problem;
private static MyAttributeConverter instance() {
MyAttributeConverter current = instance;
if (current == null) {
instance = current = new MyAttributeConverter();
}
return current;
}
This still is only safe when MyAttributeConverter is immutable using only final fields. Otherwise, a thread may return an instance created by another thread in an incompletely constructed state.
You can use the simple way to make it safe without those constraints:
private static final MyAttributeConverter instance = new MyAttributeConverter();
private static MyAttributeConverter instance() {
return instance;
}
This still is lazy as class initialization only happens on one of the specified triggers, i.e. the first invocation of the method instance().
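A small sketch can make that laziness visible. The names LazyInitDemo, Holder and created are hypothetical, and the instance is placed in a nested class here only so that a caller can observe the moment of initialization from outside; the mechanism (initialization on first use of the class that owns the static final field) is the one described above.

```java
final class LazyInitDemo {
    // Set by Holder's static initializer so a caller can observe when it ran.
    static volatile boolean created = false;

    private static final class Holder {
        // Initialized exactly once, when Holder itself is initialized.
        static final Object INSTANCE = create();

        private static Object create() {
            created = true;
            return new Object();
        }
    }

    static Object instance() {
        return Holder.INSTANCE; // first access triggers initialization of Holder
    }
}
```

Reading LazyInitDemo.created before the first call to instance() yields false, because loading the outer class does not initialize Holder; after the first call it is true and every call returns the same object.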
Your usage of WeakReference is subject to the same problems. Further, it’s not clear why you resort to a recursive invocation of your method at two points where you already have the required argument in a local variable.
A correct implementation can be far simpler:
private static WeakReference<MyAttributeConverter> reference;
public static <R> R applyInstance(
Function<? super MyAttributeConverter, ? extends R> function) {
WeakReference<MyAttributeConverter> r = reference;
MyAttributeConverter referent = r != null? r.get(): null;
if (referent == null) {
referent = new MyAttributeConverter();
reference = new WeakReference<>(referent);
}
return function.apply(referent);
}
But before you are going to use it, you should reconsider whether the complicated code is worth the effort. The fact that you are accepting the need to reconstruct the object when it has been garbage collected, even potentially constructing multiple instances on concurrent invocations, suggests that you know that the construction will be cheap. When the construction is cheap, you probably don’t need to cache an instance of it at all.
Just consider
public static <R> R applyInstance(
Function<? super MyAttributeConverter, ? extends R> function) {
return function.apply(new MyAttributeConverter());
}
It’s at least worth trying, measuring the application’s performance and comparing it with the other approaches.
On the other hand, it doesn’t look like the instance occupies a significant amount of memory or holds non-memory resources. Otherwise, you would be more worried about the possibility of multiple instances flying around. So the other variant worth trying and comparing is the one shown above, using a static final field with lazy class initialization and no opportunity to garbage collect that small object.
One last clarification. You asked
Is there any chance that reference.get() inside the function.apply idiom may be null?
Since there is no reference.get() invocation inside the evaluation of function.apply, there is no chance that such an invocation may evaluate to null at this point. The function receives a strong reference and since the calling code ensured that this strong reference is not null, it will never become null during the invocation of the apply method.
Generally, the garbage collector will never alter the application state in a way that code using strong references will notice a difference (letting the availability of more memory aside).
But since you asked specifically about reference.get(), a garbage collector may collect an object after its last use, regardless of method executions or local scopes. So the referent could get collected during the execution of the apply method when this method does not use the object anymore. Runtime optimizations may allow this to happen earlier than you might guess by looking at the source code, because what may look like an object use (e.g. a field read) may not use the object at runtime (e.g. because that value is already held in a CPU register, eliminating the need to access the object’s memory). As said, all without altering the method’s behavior.
So a hypothetical reference.get() during the execution of the apply method could in principle evaluate to null, but there is no reason for concern, as said, the behavior of the apply method does not change. The JVM will retain the object’s memory as long as needed for ensuring this correct method execution.
But that explanation was just for completeness. As said, you should not use weak nor soft references for objects not holding expensive resources.

Visibility effects of synchronization in Java

This article says:
In this noncompliant code example, the Helper class is made immutable
by declaring its fields final. The JMM guarantees that immutable
objects are fully constructed before they become visible to any other
thread. The block synchronization in the getHelper() method guarantees
that all threads that can see a non-null value of the helper field
will also see the fully initialized Helper object.
public final class Helper {
private final int n;
public Helper(int n) {
this.n = n;
}
// Other fields and methods, all fields are final
}
final class Foo {
private Helper helper = null;
public Helper getHelper() {
if (helper == null) { // First read of helper
synchronized (this) {
if (helper == null) { // Second read of helper
helper = new Helper(42);
}
}
}
return helper; // Third read of helper
}
}
However, this code is not guaranteed to succeed on all Java Virtual
Machine platforms because there is no happens-before relationship
between the first read and third read of helper. Consequently, it is
possible for the third read of helper to obtain a stale null value
(perhaps because its value was cached or reordered by the compiler),
causing the getHelper() method to return a null pointer.
I don't know what to make of it. I can agree that there is no happens-before relationship between the first and third read, at least no immediate relationship. But isn't there a transitive happens-before relationship, in the sense that the first read must happen before the second, and the second read has to happen before the third, and therefore the first read has to happen before the third?
Could someone elaborate on this?
No, there is no transitive relationship.
The idea behind the JMM is to define rules that a JVM must respect. Provided the JVM follows these rules, it is allowed to reorder and execute code as it wants.
In your example, the 2nd read and the 3rd read are not related: no memory barrier, such as one introduced by synchronized or volatile, sits between them. Thus, the JVM is allowed to execute it as follows:
public Helper getHelper() {
final Helper toReturn = helper; // "3rd" read, reading null
if (helper == null) { // First read of helper
synchronized (this) {
if (helper == null) { // Second read of helper
helper = new Helper(42);
}
}
}
return toReturn; // Returning null
}
Your call would then return a null value, even though a singleton value would have been created. Subsequent calls may still get a null value.
As suggested, using volatile would introduce a new memory barrier. Another common solution is to capture the read value and return it.
public Helper getHelper() {
Helper singleton = helper;
if (singleton == null) {
synchronized (this) {
singleton = helper;
if (singleton == null) {
singleton = new Helper(42);
helper = singleton;
}
}
}
return singleton;
}
As you rely on a local variable, there is nothing to reorder. Everything happens in the same thread.
No, there isn't any transitive relationship between those reads. synchronized only guarantees visibility of changes that were made within synchronized blocks on the same lock. In this case, not all reads happen inside synchronized blocks on the same lock, hence this is flawed and visibility is not guaranteed.
Because there is no locking once the field is initialized, it is critical that the field be declared volatile. This will ensure the visibility.
private volatile Helper helper = null;
It's all explained here: https://shipilev.net/blog/2014/safe-public-construction/#_singletons_and_singleton_factories. The issue is simple.
... Notice that we do several reads of instance in this code, and at
least "read 1" and "read 3" are the reads without any
synchronization ... Specification-wise, as mentioned in happens-before
consistency rules, a read action can observe the unordered write via
the race. This is decided for each read action, regardless what other
actions have already read the same location. In our example, that
means that even though "read 1" could read non-null instance, the code
then moves on to returning it, then it does another racy read, and it
can read a null instance, which would be returned!

Publishing and reading of non-volatile field

public class Factory {
private Singleton instance;
public Singleton getInstance() {
Singleton res = instance;
if (res == null) {
synchronized (this) {
res = instance;
if (res == null) {
res = new Singleton();
instance = res;
}
}
}
return res;
}
}
It is an almost correct implementation of a thread-safe Singleton. The only problem I see is:
Thread #1, which is initializing the instance field, can publish it before it is completely initialized. Now, the second thread can read instance in a inconsistent state.
But, for my eye it is only problem here. Is it only problem here?
(And we can make instance volatile).
Your example is explained by Shipilev in Safe Publication and Safe Initialization in Java. I highly recommend reading the whole article, but to sum it up, look at the UnsafeLocalDCLFactory section there:
public class UnsafeLocalDCLFactory implements Factory {
private Singleton instance; // deliberately non-volatile
@Override
public Singleton getInstance() {
Singleton res = instance;
if (res == null) {
synchronized (this) {
res = instance;
if (res == null) {
res = new Singleton();
instance = res;
}
}
}
return res;
}
}
The code above has the following problem:
The introduction of local variable here is a correctness fix, but only partial: there still no happens-before between publishing the Singleton instance, and reading of any of its fields. We are only protecting ourselves from returning "null" instead of Singleton instance. The same trick can also be regarded as a performance optimization for SafeDCLFactory, i.e. doing only a single volatile read, yielding:
Shipilev suggests fixing it as follows, by marking instance volatile:
public class SafeLocalDCLFactory implements Factory {
private volatile Singleton instance;
@Override
public Singleton getInstance() {
Singleton res = instance;
if (res == null) {
synchronized (this) {
res = instance;
if (res == null) {
res = new Singleton();
instance = res;
}
}
}
return res;
}
}
There are no other problems with this example.
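As a rough sanity check, the volatile variant can be exercised from several threads at once. The harness below is only a sketch: SafeDclCheck, constructed and distinctInstancesSeen are hypothetical names, and the factory is copied in as a nested class so that the example is self-contained. A passing run does not prove correctness (memory-model bugs rarely show up on demand); it only shows that the lock keeps construction to a single instance.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class SafeDclCheck {
    // Counts constructor calls across all threads.
    static final AtomicInteger constructed = new AtomicInteger();

    static final class Singleton {
        Singleton() { constructed.incrementAndGet(); }
    }

    static final class SafeLocalDCLFactory {
        private volatile Singleton instance;

        Singleton getInstance() {
            Singleton res = instance;           // single volatile read on the fast path
            if (res == null) {
                synchronized (this) {
                    res = instance;
                    if (res == null) {
                        res = new Singleton();
                        instance = res;         // volatile write publishes the object
                    }
                }
            }
            return res;
        }
    }

    // Releases `threads` threads at once and returns how many distinct
    // Singleton instances they observed.
    static int distinctInstancesSeen(int threads) {
        SafeLocalDCLFactory factory = new SafeLocalDCLFactory();
        Set<Singleton> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen.add(factory.getInstance());
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }
}
```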
Normally I would never use a double checked locking mechanism anymore. To create a thread safe singleton you should let the compiler do this:
public class Factory {
private static Singleton instance = new Singleton();
public static Singleton getInstance() {
return instance;
}
}
You mention making the instance volatile. I don't think this is necessary with this solution, as the JVM handles the synchronization between threads during class initialization, which is when the object is constructed. But if you want to make it volatile, you can.
Finally I would make the getInstance() and the instance static. Then you can reference Factory.getInstance() directly without constructing the Factory class. Also: you will get the same instance across all threads in your application. Else every new Factory() will give you a new instance.
You can also look at Wikipedia. They have a clean solution if you need a lazy solution:
https://en.wikipedia.org/wiki/Double-checked_locking#Usage_in_Java
// Correct lazy initialization in Java
class Foo {
private static class HelperHolder {
public static final Helper helper = new Helper();
}
public static Helper getHelper() {
return HelperHolder.helper;
}
}
EDIT I've written one more answer here that should clear all the confusion.
This is a good question, and I'll try to summarize my understanding here.
Suppose Thread1 is currently initializing Singleton instance and publishes the reference (unsafely obviously). Thread2 can see this un-safe published reference (meaning it sees a non-null reference), but that does not mean that the fields that it sees via that reference (Singleton fields that are initialized via the constructor) are initialized correctly too.
As far as I can see, this happens because there could be re-ordering of the stores of the fields happening inside the constructor. Since there is no "happens-before" rules (these are plain variables), this could be entirely possible.
But that is not the only problem here. Notice that you do two reads here:
if (res == null) { // read 1
return res // read 2
These reads have no synchronization protection, thus these are racy reads. AFAIK this means that read 1 is allowed to read a non-null reference, while read 2 is allowed to read a null reference.
This, by the way, is the same thing that the almighty Shipilev explains (even though I reread this article about once every half year, I still find something new every time).
Indeed making instance volatile would fix things. When you make it volatile, this happens:
instance = res; // volatile write, thus [LoadStore][StoreStore] barriers
All "other" actions (stores from within the constructor) can not pass this fence, there will be no re-orderings. It also means that when you read the volatile variable and see a non-null value, it means that every "write" that was done before writing the volatile itself has occurred for sure. This excellent post has the exact meaning of it
This also solves the second problem, since these operations can not be re-ordered, you are guaranteed to see the same value from read 1 and read 2.
No matter how much I read and try to understand these things are constantly complicated to me, there are very few people that I know that can write code like this and reason correctly about it too. When you can (I do!) please stick to the known and working examples of double check locking :)
I do that like this:
public class Factory {
private static Factory factor;
public static Factory getInstance() {
return factor==null ? factor = new Factory() : factor;
}
}
Just simply
After some time (yeah it took 2 years, I know), I think I have the proper answer. To take it literally, the answer to this:
But, for my eye it is only problem here. Is it only problem here?
Would be yes. The way you have it right now, callers of getInstance will never see a null. But if Singleton would have fields, there is no guarantee that those fields will be correctly initialized.
Let's take this slow, since the example is beautiful, IMHO. That code you showed does a single (racy) volatile read :
public class Factory {
private Singleton instance;
public Singleton getInstance() {
Singleton res = instance; // <-- volatile RACY read
if (res == null) {
synchronized (this) {
res = instance; // <-- volatile read under a lock, thus NOT racy
if (res == null) {
res = new Singleton();
instance = res;
}
}
}
return res;
}
}
Usually, the classical "double check locking" has two racy reads of volatile, for example:
public class SafeDCLFactory {
private volatile Singleton instance;
public Singleton get() {
if (instance == null) { // <-- RACY read 1
synchronized(this) {
if (instance == null) { // <-- non-racy read
instance = new Singleton();
}
}
}
return instance; // <-- RACY read 2
}
}
Because those two reads are racy, without volatile, this pattern is broken. You can read about how it can break here, for example.
In your case, there is an optimization, that does one less reading of a volatile field. On some platforms this matters, afaik.
The other part of the question is more interesting. What if Singleton has some fields that we need to set?
static class Singleton {
//setter and getter also
private Object obj;
}
And a factory, where Singleton is volatile:
static class Factory {
private volatile Singleton instance;
public Singleton get(Object obj) {
if (instance == null) {
synchronized (this) {
if (instance == null) {
instance = new Singleton();
instance.setObj(obj);
}
}
}
return instance;
}
}
We have a volatile field, we are safe, right? Wrong. The assignment of obj happens after the volatile write, so there are no guarantees about it. In plain English: this should help you a lot.
The correct way to fix this is to do the volatile write with an already built instance (fully built):
if (instance == null) {
Singleton local = new Singleton();
local.setObj(obj);
instance = local;
}
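Put together as a complete factory, the fragment above might look like the following sketch. PublishDemo is a hypothetical wrapper class used only to keep the example self-contained, and the local-variable read is borrowed from the earlier factory examples; the essential point is that setObj runs before the volatile write to instance.

```java
final class PublishDemo {
    static final class Singleton {
        private Object obj;              // deliberately a plain, non-final field

        Object getObj() { return obj; }
        void setObj(Object obj) { this.obj = obj; }
    }

    static final class Factory {
        private volatile Singleton instance;

        Singleton get(Object obj) {
            Singleton res = instance;
            if (res == null) {
                synchronized (this) {
                    res = instance;
                    if (res == null) {
                        Singleton local = new Singleton();
                        local.setObj(obj);   // finish all setup on the unpublished object
                        instance = local;    // volatile write publishes the finished object
                        res = local;
                    }
                }
            }
            return res;
        }
    }
}
```

Any thread that later reads a non-null instance through the volatile field is ordered after that write, so it also sees the obj assignment that preceded it.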
Now, the second thread can read instance in a inconsistent state.
I'm pretty sure that really is the only issue in that code. The way I understand it, as soon as the line
instance = res;
is executed, another thread could read instance, see it as non-null, and thus skip the synchronized block. This means there is no happens-before relation between those two threads, because such relations only exist if both threads synchronize on the same object or access the same volatile fields.
The other answers already linked to Safe Publication and Safe Initialization in Java, which offers the following ways to solve the unsafe publication:
Making the instance field volatile. All threads have to read the same volatile variable, which establishes a happens-before relation
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
Wrapping the singleton into a wrapper which stores the singleton in a final field. The rules for final fields are not as formally specified as the happens-before relations; the best explanation I could find is in final Field Semantics
An object is considered to be completely initialized when its constructor finishes. A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object's final fields.
(Note the emphasis and restriction to the final fields; other fields might be seen in an inconsistent state, at least theoretically.)
Making sure the singleton itself only contains final fields. The explanation would be the same as the one above.
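The second option in the list above, the final-field wrapper, can be sketched as follows. The names FinalWrapperDemo, FinalWrapper and Factory are illustrative and not taken from the question; the sketch assumes the final-field semantics quoted above, under which a thread that sees the wrapper reference also sees the Singleton as it was when the wrapper's constructor finished.

```java
final class FinalWrapperDemo {
    static final class Singleton {
        int value = 42;                  // deliberately non-final
    }

    // The only job of this class is to hold the singleton in a final field.
    private static final class FinalWrapper {
        final Singleton instance;

        FinalWrapper(Singleton instance) {
            this.instance = instance;
        }
    }

    static final class Factory {
        private FinalWrapper wrapper;    // deliberately non-volatile

        Singleton getInstance() {
            FinalWrapper w = wrapper;    // single racy read, kept in a local
            if (w == null) {
                synchronized (this) {
                    w = wrapper;
                    if (w == null) {
                        w = new FinalWrapper(new Singleton());
                        wrapper = w;
                    }
                }
            }
            return w.instance;           // read through the final field
        }
    }
}
```

The racy read may still return a stale null, which only sends the caller through the synchronized slow path; it never returns a wrapper whose final field is unset.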
The problem with the code mentioned in the question is that reordering can happen and a thread can get a partially constructed object of the singleton class.
When I say reordering, I mean the following:
public static Singleton getInstance() {
if (instance == null) {
synchronized (Singleton.class) {
if (instance == null) {
instance = new Singleton();
/* The above line executes the following steps:
1) memory allocation for the Singleton object
2) constructor call (it may do some I/O, like reading a property file, etc.)
3) assignment (it depends only on the memory allocation, which has already happened in step 1)
If the compiler changes the order, it might assign the allocated memory to the instance variable before the constructor has run.
What may happen is that a half-initialized object is returned to a different thread.
*/
}
}
}
return instance;
}
Declaring the instance variable volatile ensures a happens-before/ordered relationship on the above mentioned 3 steps:
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
From Wikipedia's Double-checked locking:
As of J2SE 5.0, this problem has been fixed. The volatile keyword now ensures that multiple threads handle the singleton instance correctly. This new idiom is described in The "Double-Checked Locking is Broken" Declaration:
// Works with acquire/release semantics for volatile
// Broken under Java 1.4 and earlier semantics for volatile
class Foo {
private volatile Helper helper = null;
public Helper getHelper() {
Helper result = helper;
if (result == null) {
synchronized(this) {
result = helper;
if (result == null) {
helper = result = new Helper();
}
}
}
return result;
}
// other functions and members...
}

Java double checked locking - Strings

Given that strings contain a final field, does it mean that in the context of double-checked locking it is not necessary to declare them volatile? E.g.
class SomeClass{
private String val;
String getVal(){
if(val == null){
synchronized(this){
if(val ==null)
val = new String("foo");
}
}
return val;
}
}
I used a string as an example, but it should work with other objects that declare some final field, correct?
For strings you're right. A string which is declared final cannot be modified, and therefore you do not need to synchronize when using it.
That's not true for other objects. Take this little class for example:
public class BankAccount {
private int balance = 0;
public void addMoney(int money) {
balance+=money;
}
}
When you've got a final Object of this class it doesn't mean that nobody can change the fields inside the object. You just can't assign something else to the final variable!
Conclusion: When accessing final String you don't need to synchronize, when accessing final Objects you might have to, depending on the Object itself.
No, you still have to declare val as volatile here. The problem is that while String is immutable and thread safe, val is not. You still have a visibility problem with val itself.
To address your point about "given that String contains a final field," note that the JLS specifically says that visibility is not transitive when dealing with final fields.
Given a write w, a freeze f, an action a (that is not a read of a final field), a read r1 of the final field frozen by f, and a read r2 such that hb(w, f), hb(f, a), mc(a, r1), and dereferences(r1, r2), then when determining which values can be seen by r2, we consider hb(w, r2). (This happens-before ordering does not transitively close with other happens-before orderings.)
https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.5
Where a "freeze f" is how the JLS refers to the thread-safe part of final field semantics, i.e., the part that actually makes the object referenced by the field visible.
(There are cases where you can rely on transitivity with synchronizes-with and happens-before. Brian Goetz calls this 'piggy-backing' and talks about it in Java Concurrency in Practice. But it's pretty much experts only and I don't recommend it until you are an expert with the Java memory model.)
In short, declare val volatile and don't worry about saving two nanoseconds by skipping synchronization. The extra rigmarole in the code isn't worth it, and it doesn't work anyway.
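For reference, a minimal sketch of the question's class with val declared volatile. The class name SomeClassFixed and the local variable v are additions for illustration; the local avoids reading the volatile field a second time on the fast path.

```java
final class SomeClassFixed {
    private volatile String val;         // volatile makes the write visible to readers

    String getVal() {
        String v = val;                  // one volatile read on the fast path
        if (v == null) {
            synchronized (this) {
                v = val;                 // re-check under the lock
                if (v == null) {
                    v = "foo";
                    val = v;
                }
            }
        }
        return v;
    }
}
```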
