Understanding unsafe publication - java

In JCIP 16.2 B.Goetz mentioned that
If you do not ensure that publishing the shared reference
happens-before another thread loads that shared reference, then the
write of the reference to the new object can be reordered (from the
perspective of the thread consumign the object) with writes to its
fields.
So I would guess that it means that publishing even NotThreadSafe objects with synchronization is enough. Consider the following shared object
public ObjectHolder{
private int a = 1;
private Object o = new Object();
//Not synchronizaed GET, SET
}
//Assume that the SharedObjectHolder published
//with enough level of synchronization
public class SharedObjectHolder{
private ObjectHolder oh;
private final Lock lock = new ReentrantLock();
public SharedObjectHolder(){
lock.lock();
try{
oh = new ObjectHolder();
} finally {
lock.unlock();
}
}
public ObjectHolder get(){
lock.lock();
try{
return oh;
} finally {
lock.unlock();
}
}
}
Now we have happens-before between writng to oh and returning oh from the method get(). It guarantees that any caller thread observes up-to-date value of oh.
But, writing to oh fields (private int a, private Object o) during construction is not happens-before with wiritng to oh. JMM does not guarantee that. If I'm wrong, please provide a proof-reference to JMM. Therefore even with such publishing, a thread reading oh may observe a partually-constructed object.
So, what did he mean by saying that I provided in a quote? Can you clarify?

If you only read or write oh per the methods above, then the lock aquired by get() will ensure you see all actions up to the release of the lock in SharedObjectHolder's constructor -- including any writes to oh's fields. The happens-before edge you're relying on has nothing to do with the write to oh, and everything to do with writes (including to oh's fields) happening before a lock is released, which happens before that lock is acquired, which happens before reads.
It is possible to see a partially-constructed oh, if you have a thread that reorders get() to happen before the constructor and the write to oh to happen before both of them. That's why the SharedObjectHolder instance needs to be published safely.
(That said, if you can publish SharedObjectHolder safely, I don't see why you couldn't just publish the original oh reference safely.)

Since you specifically asked for a disprove of your statement: “But, writing to oh fields (private int a, private Object o) during construction is not happens-before with writing to oh. JMM does not guarantee that”, have a look at JLS §17.4.5. Happens-before Order, right the first bullet:
If we have two actions x and y, we write hb(x, y) to indicate that x happens-before y.
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
…
This, together with the transitivity of happens-before relationships, is the most important guaranty of the JMM as it implies that we can have threads performing a sequence of actions without synchronization and only synchronizing when needed. But note that it isn’t relevant to establish a happens-before relationship between the writing of the fields of ObjectHolder and the write to SharedObjectHolder.oh as that all happens within a single thread.
The important consequence of the citation above is that there is a happens-before relationship between all three writes and the release of the Lock due to the program order. Since there also is a happens-before relationship between the release of the Lock and the subsequent acquisition of the Lock by another thread within SharedObjectHolder.get(), the transitivity establishes a happens-before relationship between all three writes and the acquisition of the Lock. It doesn’t matter in which order these three writes were actually performed, the only thing that matters is that all three are completed by the time the Lock is acquired.
As a side note, you wrote in a code comment “Assume that the SharedObjectHolder published with enough level of synchronization”. If we assume that, the entire Lock becomes obsolete as the “enough level of synchronization” used to properly publish the SharedObjectHolder instance is also enough for the publication of the embedded ObjectHolder and its fields, as all their initialization happens-before that publication of SharedObjectHolder due to the program order.

We have:
Write of ObjectHolder values
Write of oh
Unlock of lock
Lock of lock
Read of oh and ObjectHolder values.
There are happens-before relations between 1, 2, 3 and 4, 5 because they are in program order and in the same thread.
There is a happens-before relation between 3 and 4 because of the lock.
So there is a happens-before relation between the writes of ObjectHolder values and the reads in the other thread because of transitivity.

Related

What is the extent of variable visibility effect of synchronized/volatile in Java

According to "Java Concurrency in Practice":
everything A did in or prior to a synchronized block is visible to B when it executes a synchronized block guarded by the same lock
and
The visibility effects of volatile variables extend beyond the value of the volatile variable itself. When thread A writes to a volatile variable and subsequently thread B reads that same variable, the values of all variables that were visible to A prior to writing to the volatile variable become visible to B after reading the volatile variable
what I'm not clear about is what dose it mean by everything and all variables? Dose it mean everything literally? If we have a class like this:
class MyClassA{
int a;
int[] array = new int[10];
MyClassB myClass; // a class with similar properties
void notSyncronizedMethod(){
// do something with a, array[3], myClass.a, myClass.array[3]
}
syncronized void syncronizedMethodA(){
// update value of a, array[3], myClass.a, myClass.array[3]
}
syncronized void syncronizedMethodB(){
// do something with a, array[3], myClass.a, myClass.array[3]
}
}
if we call syncronizedMethodA() in one thread and then call syncronizedMethodB() or notSyncronizedMethod() in another thread, assume the time order is stritly garanteed, will call of syncronizedMethodB() and notSyncronizedMethod() use the latest variable value set by syncronizedMethodA(). I'm sure value of a is OK for syncronizedMethodB(), but what about elements of reference types like array[3], myClass.a or even myClass.myClass.array[3]? What about notSyncronizedMethod() with value updated by an syncronized method?
In order to figure out what visibility guarantees are provided, you need to understand the Java Memory Model a little better, and more specifically, what happens-before means in the context of the JMM. The JMM describes things that happen as actions, for example, normal reads and writes, volatile reads and writes, lock, unlock, etc.
There are a handful of rules in the JMM that establish when one action happens-before another action. The rules relevant in your case are the following:
The single thread rule: in a given thread, action A happens-before action B if A precedes B in program order.
The monitor lock rule (synchronized): An unlock of given monitor happens-before a subsequent lock on the same monitor.
It's important to know that happens-before is transitive, i.e. if hb(a, b) and hb(b, c), then hb(a, c).
In your example, one thread releases the monitor when exiting syncronizedMethodA(), and another thread subsequently acquires the monitor when entering syncronizedMethodB(). That's one happens-before relation. And since HB is transitive, actions performed in syncronizedMethodA() become visible for any thread that subsequently enters syncronizedMethodB().
On the other hand, no happens-before relation exists between the release of the monitor in syncronizedMethodA() and subsequent actions performed in notSynchronizedMethod() by another thread. Therefore, there are no guarantees that the writes in syncronizedMethodA() are made visible to another thread's reads in notSynchronizedMethod().

Why is `synchronized (new Object()) {}` a no-op?

In the following code:
class A {
private int number;
public void a() {
number = 5;
}
public void b() {
while(number == 0) {
// ...
}
}
}
If method b is called and then a new thread is started which fires method a, then method b is not guaranteed to ever see the change of number and thus b may never terminate.
Of course we could make number volatile to resolve this. However for academic reasons let's assume that volatile is not an option:
The JSR-133 FAQs tells us:
After we exit a synchronized block, we release the monitor, which has the effect of flushing the cache to main memory, so that writes made by this thread can be visible to other threads. Before we can enter a synchronized block, we acquire the monitor, which has the effect of invalidating the local processor cache so that variables will be reloaded from main memory.
This sounds like I just need both a and b to enter and exit any synchronized-Block at all, no matter what monitor they use. More precisely it sounds like this...:
class A {
private int number;
public void a() {
number = 5;
synchronized(new Object()) {}
}
public void b() {
while(number == 0) {
// ...
synchronized(new Object()) {}
}
}
}
...would eliminate the problem and will guarantee that b will see the change to a and thus will also eventually terminate.
However the FAQs also clearly state:
Another implication is that the following pattern, which some people
use to force a memory barrier, doesn't work:
synchronized (new Object()) {}
This is actually a no-op, and your compiler can remove it entirely,
because the compiler knows that no other thread will synchronize on
the same monitor. You have to set up a happens-before relationship for
one thread to see the results of another.
Now that is confusing. I thought that the synchronized-Statement will cause caches to flush. It surely can't flush a cache to main memory in way that the changes in the main memory can only be seen by threads which synchronize on the same monitor, especially since for volatile which basically does the same thing we don't even need a monitor, or am I mistaken there? So why is this a no-op and does not cause b to terminate by guarantee?
The FAQ is not the authority on the matter; the JLS is. Section 17.4.4 specifies synchronizes-with relationships, which feed into happens-before relationships (17.4.5). The relevant bullet point is:
An unlock action on monitor m synchronizes-with all subsequent lock actions on m (where "subsequent" is defined according to the synchronization order).
Since m here is the reference to the new Object(), and it's never stored or published to any other thread, we can be sure that no other thread will acquire a lock on m after the lock in this block is released. Furthermore, since m is a new object, we can be sure that there is no action that previously unlocked on it. Therefore, we can be sure that no action formally synchronizes-with this action.
Technically, you don't even need to do a full cache flush to be up to the JLS spec; it's more than the JLS requires. A typical implementation does that, because it's the easiest thing the hardware lets you do, but it's going "above and beyond" so to speak. In cases where escape analysis tells an optimizing compiler that we need even less, the compiler can perform less. In your example, escape analysis can could tell the compiler that the action has no effect (due to the reasoning above) and can be optimized out entirely.
the following pattern, which some people use to force a memory barrier, doesn't work:
It's not guaranteed to be a no-op, but the spec permits it to be a no-op. The spec only requires synchronization to establish a happens-before relationship between two threads when the two threads synchronize on the same object, but it actually would be easier to implement a JVM where the identity of the object did not matter.
I thought that the synchronized-Statement will cause caches to flush
There is no "cache" in the Java Language Specification. That's a concept that only exists in the details of some (well, O.K., virtually all) hardware platforms and JVM implementations.

Possible synchronization issue due to code rearrangement by compiler

Consider the following code sample:
private Object lock = new Object();
private volatile boolean doWait = true;
public void conditionalWait() throws Exception {
synchronized (lock) {
if (doWait) {
lock.wait();
}
}
}
public void cancelWait() throws Exception {
doWait = false;
synchronized (lock) {
lock.notifyAll();
}
}
If I understand the Java Memory Model correctly, then above code is not Thread-safe. It might very well block because the compiler might decide to rearrange the code as follows:
public void cancelWait() throws Exception {
synchronized (lock) {
lock.notifyAll();
}
doWait = false;
}
In this case it might happen that thread T1 calls the cancelWait() method, aquire the lock, call notifyAll() and release lock. After this a parallel thread T2 could call conditionalWait() and aquire the now available lock. The variable doWait still has value true, thus thread T2 executes lock.wait() and blocks.
Is my understanding correct? If not, then please provide according references from the Java Specification which disprove above scenario.
Is there a solution that resolves this issue that does not require pulling doWait into the synchronized block?
The question you are asking is actually
Can a monitor enter be re-ordered above a volatile store?
No, your transformation cannot happen. Take a look at the grid linked at the top of http://gee.cs.oswego.edu/dl/jmm/cookbook.html.
First Operation: Volatile Store
Second Operation: Monitor Enter
Result: No
So the compiler cannot re-order as you suggest.
Your code is broken, but not because of reordering or visibility issues. Reordering problems occur in the absence of sufficient synchronization, which is not the case here. You have done everything possible, in terms of marking things volatile or synchronized, to let the JVM know to make the right things visible across threads.
Your problem here is that you're making several false assumptions:
You're thinking wait can never return until it gets a notification (this may not happen frequently, but it can happen, this is called a "spurious wakeup").
You're assuming that another thread can't barge in between the time the notification happens and the time that the waiting thread can reacquire the monitor. (Object#wait releases the monitor, and upon reacquiring it the thread needs to re-check what the current state is, instead of proceeding based on possibly outdated assumptions.)
You're assuming you can predict that the notify will happen after the wait (can't say whether that's true in this case since you didn't post a complete working example, but in general this is not something you want to assume).
There are lots of toy examples (thinking of the even-odd assignment) that get away with this because they are limited to only 2 threads, the race condition that causes spurious wakeups doesn't happen often on PC JVMs, and the program forces the two threads to act in lock-step so the order in which things happen is predictable. But those aren't realistic assumptions for the real world.
The fix for these bad assumptions is to wait in a loop using a condition variable to decide when you're done waiting (see this Oracle tutorial):
private final Object lock = new Object(); // final to emphasize this shouldn't change
private volatile boolean doWait = true;
public void conditionalWait() throws InterruptedException {
synchronized (lock) {
while (doWait) {
lock.wait();
}
}
}
public void cancelWait() {
doWait = false;
synchronized (lock) {
lock.notifyAll();
}
}
(I narrowed the exceptions thrown, the only thing thrown by notifyAll is IllegalMonitorStateException, which is unchecked and won't happen as long as you're using the right locks, it's only thrown as a result of programmer error.
Object#wait throws InterruptedException as well as IllegalMonitorStateException, it's ok to let it be thrown here.)
It would be just as well here to move the references to the doWait variable into the synchronized blocks, if all references to it are made while holding a lock then you don't need to make it volatile. But this isn't required.
The Java memory model guarantees sequential consistency when your program is correctly synchronized. Since your code above is correctly synchronized, then the reordering doesn't happen.
Happens Before Order
A program is correctly synchronized if and only if all sequentially consistent executions are free of data races.
If a program is correctly synchronized, then all executions of the program will appear to be sequentially consistent (§17.4.3).
This is an extremely strong guarantee for programmers. Programmers do not need to reason about reorderings to determine that their code contains data races. Therefore they do not need to reason about reorderings when determining whether their code is correctly synchronized. Once the determination that the code is correctly synchronized is made, the programmer does not need to worry that reorderings will affect his or her code.
This can be confusing since sequential consistency is defined previously in the inter-thread section of the spec (dealing with only a single thread).
Programs and Program Order
A set of actions is sequentially consistent if all actions occur in a total order (the execution order) that is consistent with program order, and furthermore, each read r of a variable v sees the value written by the write w to v such that:
w comes before r in the execution order, and
there is no other write w' such that w comes before w' and w' comes before r in the execution order.
Sequential consistency is a very strong guarantee that is made about visibility and ordering in an execution of a program. Within a sequentially consistent execution, there is a total order over all individual actions (such as reads and writes) which is consistent with the order of the program, and each individual action is atomic and is immediately visible to every thread.
If a program has no data races, then all executions of the program will appear to be sequentially consistent.
So what sequential consistency boils down to is that your program, when correctly synchronized, must appear to work like each read and write was completed exactly in the order specified in your program. No reordering are allowed (or allowed to be visible).
Normally when you talk about a write being reordered you're talking about a p-threads memory model, used by C++ (I think), which specifies when writes can and cannot be re-ordered past a memory barrier. It's a popular memory model and a lot of people know it.
Java doesn't have the concept of memory barriers. Java is similar, but not the same as, the p-thread spec, so don't get the two confused. In Java, either you have a program that works exactly in the order you specify in your program, or you have no guarantees at all if you don't synchronize. It's one or the other, and your case the write to the volatile has to appear in program order.
Re. your question in your comment below: I don't think it's that hard to find happens-before in the spec. Synchronization Order says:
Every execution has a synchronization order. A synchronization order is a total order over all of the synchronization actions of an execution. For each thread t, the synchronization order of the synchronization actions (§17.4.2) in t is consistent with the program order (§17.4.3) of t.
Synchronization actions induce the synchronized-with relation on actions, defined as follows:
An unlock action on monitor m synchronizes-with all subsequent lock actions on m (where "subsequent" is defined according to the synchronization order).
And back to some definitions in Happens Before Order:
Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second.
If we have two actions x and y, we write hb(x, y) to indicate that x happens-before y.
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
If an action x synchronizes-with a following action y, then we also have hb(x, y).
So, the unlock of your monitor synchronized (lock) in cancelWait() synchronizes-with the lock acquire action in conditionalWait(). Synchroizes-with creates a happens-before relationship (see the very last line of that quote directly above). Therefore the assignment of doWait=false; must be visible when it is read in conditionalWait().
(Happens Before Order also says:
If hb(x, y) and hb(y, z), then hb(x, z).
so we know that if the volatile is assigned before the lock release, and a new lock acquire then happens after the lock release, it must be that the volatile assignment happens-before the lock acquire and is therefore visible.)
According JSL specification it's impossible
http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4.5
also you could look into
Java memory model : compiler rearranging code lines
http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile

Java's happens-before and synchronization

I'm having a little disagreement on Java's happens-before and synchronization.
Imagine the following scenario:
Main Thread
MyObject o = new MyObject(); // (0)
synchronized (sharedMonitor) {
// (1) add the object to a shared collection
}
// (2) spawn other threads
Other Threads
MyObject o;
synchronized (sharedMonitor) {
// (3) retrieve the previously added object
}
// (4) actions to modify the object
Note that the instance variables of MyObject aren't neither volatile, nor final.
The methods of MyObject do not use synchronization.
It is my understanding that:
1 happens-before 3, since there's synchronization on the same monitor, and the other threads are spawned only at 2, which is executed after 1.
Actions on 4 have no guarantees of being later visible to the main thread, unless there's further synchronization for all threads, and the main thread somehow synchronizes after these actions.
Q: Is there any guarantee of the actions at 0 being visible, happening-before, concurrent access on 3, or must I declare the variables as volatile?
Consider now the following scenario:
Main Thread
MyObject o = new MyObject(); // (0)
synchronized (sharedMonitor) {
// (1) add the object to a shared collection
}
// (2) spawn other threads, and wait for their termination
// (5) access the data stored in my object.
Other Threads
MyObject o;
synchronized (sharedMonitor) {
// (3) retrieve the previously added object
}
o.lock(); // using ReentrantLock
try {
// (4) actions to modify the object
} finally { o.unlock(); }
It is my understanding that:
1 happens-before 3, just as before.
Actions on 4 are visible between the other threads, due to synchronization on the ReentrantLock held by MyObject.
Actions on 4 logically happen after 3, but there's no happens-before relation from 3 to 4, as consequence of synchronizing on a different monitor.
The point above would remain true, even if there was synchronization on sharedMonitor after the unlock of 4.
Actions on 4 do not happen-before the access on 5, even though the main thread awaits for the other tasks to terminate. This is due to the access on 5 not being synchronized with o.lock(), and so the main thread may still see outdated data.
Q: Is my understanding correct?
Q: Is there any guarantee of the actions at 0 being visible, happening-before, concurrent access on 3, or must I declare the variables as volatile?
Yes there is a guarantee. You do not need the have the synchronized block in the main thread because there is a happens-before relationship when the threads are started. From JLS 17.4.5: "A call to start() on a thread happens-before any actions in the started thread."
This also means that if you pass your o into the thread constructor you wouldn't need the synchronized block around (3) either.
Actions on (4) logically happen after (3), but there's no happens-before relation from (3) to (4), as consequence of synchronizing on a different monitor.
Yes and no. The logical order means that in the same thread there is certainly a happens-before relationship even though it is different monitor. The compiler is not able to reorder 3 past 4 even though they are dealing with different monitors. The same would be true with an access to a volatile field.
With multiple threads, since (3) is only reading the object then there is not a race condition. However, if (3) was making modifications to the object (as opposed to just reading it), then in another thread those modifications may not be seen at (4). As you quote and #StephenC reiterates, the JLS says that the happens-before relationship is only guaranteed on the same monitor. JLS 17.4.5: "An unlock on a monitor happens-before every subsequent lock on that monitor."
The point above would remain true, even if there was synchronization on sharedMonitor after the unlock of (4).
See above.
Actions on (4) do not happen-before the access on (5), even though the main thread awaits for the other tasks to terminate
No. Once the main thread calls thread.join() and it returns without getting interrupted then the main thread is synchronized fully with the memory of the thread it joined with. There is a happens-before relationship between the thread being joined with and the thread doing the joining. JLS 17.4.5: "All actions in a thread happen-before any other thread successfully returns from a join() on that thread."

Volatile piggyback. Is this enough for visiblity?

This is about volatile piggyback.
Purpose: I want to reach a lightweight vars visibilty. Consistency of a_b_c is not important. I have a bunch of vars and I don't want to make them all volatile.
Is this code threadsafe?
class A {
public int a, b, c;
volatile int sync;
public void setup() {
a = 2;
b = 3;
c = 4;
}
public void sync() {
sync++;
}
}
final static A aaa = new A();
Thread0:
aaa.setup();
end
Thread1:
for(;;) {aaa.sync(); logic with aaa.a, aaa.b, aaa.c}
Thread2:
for(;;) {aaa.sync(); logic with aaa.a, aaa.b, aaa.c}
Java Memory Model defines the happens-before relationship which has the following properties (amongst others):
"Each action in a thread happens-before every action in that thread that comes later in the program order" (program order rule)
"A write to a volatile field happens-before every subsequent read of that same volatile" (volatile variable rule)
These two properties together with transitivity of the happens-before relationship imply the visibility guarantees that OP seeks in the following manner:
A write to a in thread 1 happens-before a write to sync in a call to sync() in thread 1 (program order rule).
The write to sync in the call to sync() in thread 1 happens-before a read to sync in a call to sync in thread 2 (volatile variable rule).
The read from sync in the call to sync() in thread 2 happens-before a read from a in thread 2 (program order rule).
This implies that the answer to the question is yes, i.e. the call to sync() in each iteration in threads 1 and 2 ensures visibility of changes to a, b and c to the other thread(s). Note that this ensures visibility only. No mutual exclusion guarantees exist and hence all invariants binding a, b and c may be violated.
See also Java theory and practice: Fixing the Java Memory Model, Part 2. In particular the section "New guarantees for volatile" which says
Under the new memory model, when thread A writes to a volatile
variable V, and thread B reads from V, any variable values that were
visible to A at the time that V was written are guaranteed now to be
visible to B.
Incrementing a value between threads is never thread-safe with just volatile. This only ensures that each thread gets an up to date value, not that the increment is atomic, because at the assembler level your ++ is actually several instructions that can be interleaved.
You should use AtomicInteger for a fast atomic increment.
Edit: Reading again what you need is actually a memory fence. Java has no memory fence instruction, but you can use a lock for the memory fence "side-effect". In that case declare the sync method synchronized to introduce an implicit fence:
void synchronized sync() {
sync++;
}
The pattern is usually like this.
public void setup() {
a = 2;
b = 3;
c = 4;
sync();
}
However, while this guarantees the other threads will see this change, the other threads can see an incomplete change. e.g. the Thread2 might see a = 2, b = 3, c = 0. or even possibly a = 2, b = 0, c = 4;
Using the sync() on the reader doesn't help much.
From javadoc:
An unlock (synchronized block or method exit) of a monitor
happens-before every subsequent lock (synchronized block or method
entry) of that same monitor. And because the happens-before relation
is transitive, all actions of a thread prior to unlocking
happen-before all actions subsequent to any thread locking that
monitor.
A write to a volatile field happens-before every subsequent read of
that same field. Writes and reads of volatile fields have similar
memory consistency effects as entering and exiting monitors, but do
not entail mutual exclusion locking.
So I think that writing to volatile var is not an equivalent to syncronization in this case and it doesn't guarantee happens-before order and visibility of changes in Thread1 to Thread2
You don't really have to manually synchronize at all, just use an automatically synchronized data structure, like java.util.concurrent.atomic.AtomicInteger.
You could alternatively make the sync() method synchronized.

Categories