I'm having a little disagreement on Java's happens-before and synchronization.
Imagine the following scenario:
Main Thread
MyObject o = new MyObject(); // (0)
synchronized (sharedMonitor) {
    // (1) add the object to a shared collection
}
// (2) spawn other threads
Other Threads
MyObject o;
synchronized (sharedMonitor) {
    // (3) retrieve the previously added object
}
// (4) actions to modify the object
Note that the instance variables of MyObject are neither volatile nor final, and the methods of MyObject do not use synchronization.
It is my understanding that:
1 happens-before 3, since there's synchronization on the same monitor, and the other threads are spawned only at 2, which is executed after 1.
Actions on 4 have no guarantees of being later visible to the main thread, unless there's further synchronization for all threads, and the main thread somehow synchronizes after these actions.
Q: Is there any guarantee of the actions at 0 being visible, happening-before, concurrent access on 3, or must I declare the variables as volatile?
Consider now the following scenario:
Main Thread
MyObject o = new MyObject(); // (0)
synchronized (sharedMonitor) {
    // (1) add the object to a shared collection
}
// (2) spawn other threads, and wait for their termination
// (5) access the data stored in my object
Other Threads
MyObject o;
synchronized (sharedMonitor) {
    // (3) retrieve the previously added object
}
o.lock(); // using ReentrantLock
try {
    // (4) actions to modify the object
} finally {
    o.unlock();
}
It is my understanding that:
1 happens-before 3, just as before.
Actions on 4 are visible between the other threads, due to synchronization on the ReentrantLock held by MyObject.
Actions on 4 logically happen after 3, but there's no happens-before relation from 3 to 4, as consequence of synchronizing on a different monitor.
The point above would remain true, even if there was synchronization on sharedMonitor after the unlock of 4.
Actions on 4 do not happen-before the access on 5, even though the main thread waits for the other tasks to terminate. This is because the access at 5 is not synchronized with o.lock(), so the main thread may still see outdated data.
Q: Is my understanding correct?
Q: Is there any guarantee of the actions at 0 being visible, happening-before, concurrent access on 3, or must I declare the variables as volatile?
Yes, there is a guarantee. You do not need to have the synchronized block in the main thread, because there is a happens-before relationship when the threads are started. From JLS 17.4.5: "A call to start() on a thread happens-before any actions in the started thread."
This also means that if you pass your o into the thread constructor, you wouldn't need the synchronized block around (3) either.
Actions on (4) logically happen after (3), but there's no happens-before relation from (3) to (4), as consequence of synchronizing on a different monitor.
Yes and no. The logical order means that within the same thread there is certainly a happens-before relationship, even though a different monitor is involved. The compiler is not able to reorder (3) past (4) even though they deal with different monitors. The same would be true for an access to a volatile field.
With multiple threads, since (3) is only reading the object, there is no race condition. However, if (3) were making modifications to the object (as opposed to just reading it), then those modifications might not be seen at (4) in another thread. As you quote and @StephenC reiterates, the JLS says that the happens-before relationship is only guaranteed on the same monitor. JLS 17.4.5: "An unlock on a monitor happens-before every subsequent lock on that monitor."
The point above would remain true, even if there was synchronization on sharedMonitor after the unlock of (4).
See above.
Actions on (4) do not happen-before the access on (5), even though the main thread waits for the other tasks to terminate
No. Once the main thread calls thread.join() and it returns without getting interrupted then the main thread is synchronized fully with the memory of the thread it joined with. There is a happens-before relationship between the thread being joined with and the thread doing the joining. JLS 17.4.5: "All actions in a thread happen-before any other thread successfully returns from a join() on that thread."
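A minimal runnable sketch of both edges described above (start() into the worker, join() back out), using a stand-in class with a plain, non-volatile field; names here are illustrative:

```java
public class StartJoinVisibility {
    static class MyObject {
        int value;                    // neither volatile nor final
    }

    static int demo() {
        MyObject o = new MyObject();
        o.value = 42;                 // (0) write before start()

        Thread worker = new Thread(() -> {
            // start() happens-before any action in the started thread,
            // so this read is guaranteed to see 42.
            if (o.value != 42) throw new AssertionError("stale read");
            o.value = 99;             // (4) modification in the worker
        });

        worker.start();
        try {
            worker.join();            // worker's actions happen-before this return
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return o.value;               // guaranteed to be 99
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints 99
    }
}
```

No synchronized block or volatile field is needed here: the start() and join() edges alone carry the visibility.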
According to "Java Concurrency in Practice":
everything A did in or prior to a synchronized block is visible to B when it executes a synchronized block guarded by the same lock
and
The visibility effects of volatile variables extend beyond the value of the volatile variable itself. When thread A writes to a volatile variable and subsequently thread B reads that same variable, the values of all variables that were visible to A prior to writing to the volatile variable become visible to B after reading the volatile variable
What I'm not clear about is what it means by everything and all variables. Does it mean everything literally? If we have a class like this:
class MyClassA {
    int a;
    int[] array = new int[10];
    MyClassB myClass; // a class with similar properties

    void notSyncronizedMethod() {
        // do something with a, array[3], myClass.a, myClass.array[3]
    }

    synchronized void syncronizedMethodA() {
        // update value of a, array[3], myClass.a, myClass.array[3]
    }

    synchronized void syncronizedMethodB() {
        // do something with a, array[3], myClass.a, myClass.array[3]
    }
}
If we call syncronizedMethodA() in one thread and then call syncronizedMethodB() or notSyncronizedMethod() in another thread (assume the time order is strictly guaranteed), will the call of syncronizedMethodB() or notSyncronizedMethod() use the latest variable values set by syncronizedMethodA()? I'm sure the value of a is OK for syncronizedMethodB(), but what about elements of reference types like array[3], myClass.a, or even myClass.myClass.array[3]? And what about notSyncronizedMethod() with a value updated by a synchronized method?
In order to figure out what visibility guarantees are provided, you need to understand the Java Memory Model a little better, and more specifically, what happens-before means in the context of the JMM. The JMM describes things that happen as actions, for example, normal reads and writes, volatile reads and writes, lock, unlock, etc.
There are a handful of rules in the JMM that establish when one action happens-before another action. The rules relevant in your case are the following:
The single thread rule: in a given thread, action A happens-before action B if A precedes B in program order.
The monitor lock rule (synchronized): An unlock of given monitor happens-before a subsequent lock on the same monitor.
It's important to know that happens-before is transitive, i.e. if hb(a, b) and hb(b, c), then hb(a, c).
In your example, one thread releases the monitor when exiting syncronizedMethodA(), and another thread subsequently acquires the monitor when entering syncronizedMethodB(). That's one happens-before relation. And since HB is transitive, actions performed in syncronizedMethodA() become visible for any thread that subsequently enters syncronizedMethodB().
On the other hand, no happens-before relation exists between the release of the monitor in syncronizedMethodA() and subsequent actions performed in notSynchronizedMethod() by another thread. Therefore, there are no guarantees that the writes in syncronizedMethodA() are made visible to another thread's reads in notSynchronizedMethod().
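A trimmed sketch of the class from the question (method names kept as spelled there), annotating where the happens-before edges do and do not apply:

```java
class MyClassA {
    int a;
    int[] array = new int[10];

    synchronized void syncronizedMethodA() {
        a = 1;
        array[3] = 7;   // an element write is an ordinary write; the unlock
    }                   // at the end of this method publishes it like any other

    synchronized void syncronizedMethodB() {
        // If the unlock ending syncronizedMethodA() happens-before the lock
        // starting this method, transitivity makes both writes above visible
        // here, including the write to array[3] reached through a reference.
        System.out.println(a + " " + array[3]);
    }

    void notSyncronizedMethod() {
        // No lock acquisition: no happens-before edge from the unlock in
        // syncronizedMethodA(), so these reads may observe stale values.
        System.out.println(a + " " + array[3]);
    }
}
```

So "everything" really does mean everything reachable: the visibility guarantee covers all writes made before the unlock, however deep in the object graph, not just fields of the class holding the lock.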
A synchronized statement establishes a happens-before relation, but I'm not sure about the details.
In http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/package-summary.html one can read
An unlock (synchronized block or method exit) of a monitor happens-before every subsequent lock
(synchronized block or method entry) of that same monitor
I want to know if I understood that correctly. Therefore, have a look at the following example.
Let's assume that there are two threads T1 and T2 sharing the same instance data of a class Data and an object object of the class Object.
Now the following code gets executed in the given threads and order:
(1)T1: data.setValue("newValue");
(2)T1: synchronized(object){}
(3)T2: synchronized(object){}
(4)T2: String str=data.getValue();
Because (1) and (2) are executed in the same thread, one has hb(1,2), and analogously hb(3,4). In (2) there is an unlock of the monitor and in (3) a lock of the same monitor, thus hb(2,3), therefore hb(1,4), and str should be equal to "newValue". Is that correct? If not, then hb(2,3) must be wrong, but why?
Edit
Because the details of the class Data are needed to answer the question:
public class Data {
    private String value;
    public void setValue(String newValue) {
        value = newValue;
    }
    public String getValue() {
        return value;
    }
}
Edit 2
It's clear that one cannot guarantee the order of execution. If one instead has
(1*)T1: synchronized(object){data.setValue("newValue");}
(2*)T2: synchronized(object){String str=data.getValue();}
one still has no guarantee that (1*) is executed before (2*), but if I'm right, one has the guarantee that after (2*), str = "newValue" if (1*) was executed before (2*). I want to know if the same holds for the first example.
Because (1) and (2) are executed in the same thread, one has hb(1,2), and analogously hb(3,4). In (2) there is an unlock of the monitor and in (3) a lock of the same monitor, thus hb(2,3), therefore hb(1,4), and str should be equal to "newValue". Is that correct?
Yes, your logic is correct for this specific scenario: if (and only if) (2) executes before (3), then hb(2, 3). To understand why this must be so, imagine a thread executing the following:
localState *= 2;
synchronized(object) {
    sharedState = localState;
}
Although localState is computed outside the synchronized block, other threads must be able to see that computation in order to also see the correct value of sharedState.
However, it's important to understand that there is no reason to expect the order you've asked about as the outcome. For example it could just as easily happen to execute this way:
(1)T1: data.setValue("newValue");
(3)T2: synchronized(object){}
(4)T2: String str=data.getValue();
(2)T1: synchronized(object){}
This is bad because now T1 is writing to a location in memory without synchronization while T2 is about to read it. (T2 could even read at the same time the write is occurring!)
To understand what happens-before is all about, instead imagine these threads are running concurrently (as threads do) and execute under the following timeline:
| T1 | T2
-------------------------------------------------------------
1 | synchronized(object){} |
2 | data.setValue("newValue"); | String str=data.getValue();
3 | | synchronized(object){}
Notice how I've aligned these hypothetical actions.
At point 1, T1 acquires the lock and releases it.
At point 2, T1 executes a write while simultaneously T2 executes a read.
At point 3, T2 acquires the lock and releases it.
But which actually happens first at point 2? T1's write or T2's read?
Synchronization doesn't guarantee the order that threads actually execute with respect to one another. Instead, it is about memory consistency between threads.
At point 2, because there is no synchronization, even if T1 actually makes the write before T2 reads it, T2 is free to see the old value in memory. Therefore it can appear that T2(2) happened before T1(2).
Conceptually, what this means is that outside of synchronization a thread is free to read/write a CPU cache or register instead of main memory; synchronization forces the reads/writes to go through main memory.
Now with the second concurrent timeline:
T1 | T2
------------------------------------------------------------
synchronized(object){ | synchronized(object){
data.setValue("newValue"); | String str=data.getValue();
} | }
Although we do not have a guarantee about which thread acquires the lock first, we do have a guarantee that the memory access will be consistent. We also have a guarantee that their actions will not overlap, which was possible in the first timeline.
If T1 acquires the lock first, it is guaranteed that T1's synchronized actions will appear as if happening before T2's actions. (T1 will definitely write before T2 reads.)
If T2 acquires the lock first, it is guaranteed that T2's synchronized actions will appear as if happening before T1's actions. (T1 will definitely write after T2 reads.)
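The second timeline can be sketched as runnable code. Note that the CountDownLatch below exists only to force (1*) to run first so the demo is deterministic; the latch introduces a happens-before edge of its own, so in this sketch the synchronized blocks are not the only thing providing visibility:

```java
import java.util.concurrent.CountDownLatch;

public class SyncExample {
    static class Data {
        private String value;
        void setValue(String v) { value = v; }
        String getValue() { return value; }
    }

    static String demo() {
        Data data = new Data();
        Object object = new Object();
        CountDownLatch written = new CountDownLatch(1);
        String[] str = new String[1];

        Thread t1 = new Thread(() -> {
            synchronized (object) { data.setValue("newValue"); } // (1*)
            written.countDown();                                 // demo-only ordering
        });
        Thread t2 = new Thread(() -> {
            try { written.await(); } catch (InterruptedException ignored) { }
            synchronized (object) { str[0] = data.getValue(); }  // (2*)
        });

        t1.start(); t2.start();
        try { t1.join(); t2.join(); } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return str[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints newValue
    }
}
```

In real code you would have no such latch, and either thread could acquire the monitor first; the guarantee is only that whichever order occurs, the reads inside the later synchronized block see the writes made before the earlier unlock.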
No. It is not necessary that statement 2 will always execute or happen before statement 3. It can happen that thread 2 acquires the monitor on object first, and hence statement 3 will happen before statement 2.
You have no control over which thread actually gets the monitor of object, and you can't predict it.
It is not quite that simple. It also depends on what data.setValue and data.getValue actually do under the covers. Are those methods safe for concurrent (unsynchronized) calls?
In one contrived example, if data were backed by a HashMap and multiple threads call various set methods concurrently, it could lead to an infinite loop.
In short, you are only able to guarantee the order of execution. You get some limited memory-visibility guarantees between the set and the get, but no protection against concurrent calls to set or get that have side effects.
In JCIP 16.2 B.Goetz mentioned that
If you do not ensure that publishing the shared reference
happens-before another thread loads that shared reference, then the
write of the reference to the new object can be reordered (from the
perspective of the thread consuming the object) with writes to its
fields.
So I would guess that it means that publishing even NotThreadSafe objects with synchronization is enough. Consider the following shared object
public class ObjectHolder {
    private int a = 1;
    private Object o = new Object();
    // Not synchronized GET, SET
}
// Assume that the SharedObjectHolder is published
// with an adequate level of synchronization
public class SharedObjectHolder {
    private ObjectHolder oh;
    private final Lock lock = new ReentrantLock();

    public SharedObjectHolder() {
        lock.lock();
        try {
            oh = new ObjectHolder();
        } finally {
            lock.unlock();
        }
    }

    public ObjectHolder get() {
        lock.lock();
        try {
            return oh;
        } finally {
            lock.unlock();
        }
    }
}
Now we have a happens-before between the write to oh and returning oh from the method get(). It guarantees that any caller thread observes the up-to-date value of oh.
But the writes to oh's fields (private int a, private Object o) during construction do not happen-before the write to oh. The JMM does not guarantee that. If I'm wrong, please provide a proof-reference to the JMM. Therefore, even with such publishing, a thread reading oh may observe a partially-constructed object.
So, what did he mean by saying that I provided in a quote? Can you clarify?
If you only read or write oh per the methods above, then the lock acquired by get() will ensure you see all actions up to the release of the lock in SharedObjectHolder's constructor -- including any writes to oh's fields. The happens-before edge you're relying on has nothing to do with the write to oh, and everything to do with writes (including to oh's fields) happening before a lock is released, which happens before that lock is acquired, which happens before reads.
It is possible to see a partially-constructed oh, if you have a thread that reorders get() to happen before the constructor and the write to oh to happen before both of them. That's why the SharedObjectHolder instance needs to be published safely.
(That said, if you can publish SharedObjectHolder safely, I don't see why you couldn't just publish the original oh reference safely.)
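For example, one hypothetical way to "publish safely" without the internal Lock is to hold the instance in a static final field; class-initialization and final-field semantics then guarantee every thread sees the fully constructed object graph:

```java
class ObjectHolder {
    int a = 1;
    Object o = new Object();   // no synchronization inside ObjectHolder
}

class Holders {
    // final-field / class-initialization publication: any thread reading
    // HOLDER is guaranteed to see the writes made in ObjectHolder's
    // constructor, with no lock required on the read side.
    static final ObjectHolder HOLDER = new ObjectHolder();
}
```

This is the point of the parenthetical above: whatever mechanism safely publishes the holder also safely publishes everything written before the publication.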
Since you specifically asked for a disproof of your statement: “But, writing to oh fields (private int a, private Object o) during construction is not happens-before with writing to oh. JMM does not guarantee that”, have a look at JLS §17.4.5, Happens-before Order, right at the first bullet:
If we have two actions x and y, we write hb(x, y) to indicate that x happens-before y.
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
…
This, together with the transitivity of happens-before relationships, is the most important guarantee of the JMM, as it implies that we can have threads performing a sequence of actions without synchronization and only synchronizing when needed. But note that it isn’t relevant to establish a happens-before relationship between the writing of the fields of ObjectHolder and the write to SharedObjectHolder.oh, as that all happens within a single thread.
The important consequence of the citation above is that there is a happens-before relationship between all three writes and the release of the Lock due to the program order. Since there also is a happens-before relationship between the release of the Lock and the subsequent acquisition of the Lock by another thread within SharedObjectHolder.get(), the transitivity establishes a happens-before relationship between all three writes and the acquisition of the Lock. It doesn’t matter in which order these three writes were actually performed, the only thing that matters is that all three are completed by the time the Lock is acquired.
As a side note, you wrote in a code comment “Assume that the SharedObjectHolder published with enough level of synchronization”. If we assume that, the entire Lock becomes obsolete as the “enough level of synchronization” used to properly publish the SharedObjectHolder instance is also enough for the publication of the embedded ObjectHolder and its fields, as all their initialization happens-before that publication of SharedObjectHolder due to the program order.
We have:
1. Write of the ObjectHolder values
2. Write of oh
3. Unlock of lock
4. Lock of lock
5. Read of oh and the ObjectHolder values
There are happens-before relations between 1, 2, 3 and 4, 5 because they are in program order and in the same thread.
There is a happens-before relation between 3 and 4 because of the lock.
So there is a happens-before relation between the writes of ObjectHolder values and the reads in the other thread because of transitivity.
Let's say I have two threads A and B, and inside both threads I have a synchronized block in which an int variable is modified continuously.
For example, thread A enters the synchronized block, modifies the int variable, and then calls these two methods:
notifyAll(); // to wake thread B, which is in the waiting state
wait();
After that, thread B acquires the lock and does the same steps as thread A, and the process keeps repeating. All changes to the int variable happen inside the synchronized blocks of both threads.
My question is: do I need to make the int variable volatile? Do threads flush to main memory before they go into the waiting state, and reload data into registers when a thread acquires the lock again as a result of the notifyAll() call?
If A and B run alternatively rather than concurrently, and if they switch off via wait() and notifyAll() invocations on the same Object, and if no other threads access the variable in question, then thread safety does not require the variable to be volatile.
Note that o.wait() and o.notifyAll() must be invoked inside a method or block synchronized on o -- that synchronization is sufficient to ensure that the two threads see all each others' writes to any variable before the switch-off.
Do be careful to ensure that the two threads are synchronizing on the same object, which is not clear from your question. You have no effective synchronization at all if, say, the two threads are waiting on and notifying different instances of the same class.
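A minimal sketch of the hand-off described in the question: both threads synchronize on the same monitor, and the int stays plain (non-volatile). The class and field names are illustrative:

```java
public class PingPong {
    final Object monitor = new Object();
    int counter = 0;                  // plain int: no volatile needed
    boolean evensTurn = true;

    void run(boolean even, int rounds) throws InterruptedException {
        for (int i = 0; i < rounds; i++) {
            synchronized (monitor) {
                while (evensTurn != even) {
                    monitor.wait();   // releases the monitor while waiting
                }
                counter++;            // modified only inside synchronized
                evensTurn = !even;
                monitor.notifyAll();  // wakes the other thread
            }
        }
    }

    static int demo() {
        PingPong p = new PingPong();
        Thread a = new Thread(() -> { try { p.run(true, 5); } catch (InterruptedException ignored) { } });
        Thread b = new Thread(() -> { try { p.run(false, 5); } catch (InterruptedException ignored) { } });
        a.start(); b.start();
        try { a.join(); b.join(); } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return p.counter;             // deterministic: 5 + 5 alternating increments
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints 10
    }
}
```

The wait() in a loop guards against spurious wakeups, and the single shared monitor is what carries the visibility of counter between the threads.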
The answer is no, you do not need to make the variable volatile. The reasoning: writes that occur to a variable within a synchronized block will be visible to subsequent threads entering a synchronized block on the same object.
So it has the same memory semantics as a volatile read and write.
Not sure about Java, but in C: https://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt
If shared_data were declared volatile, the locking would still be
necessary. But the compiler would also be prevented from optimizing access
to shared_data within the critical section, when we know that nobody else
can be working with it. While the lock is held, shared_data is not
volatile. When dealing with shared data, proper locking makes volatile
unnecessary - and potentially harmful.
So let's say that I have a static variable, which is an array of size 5.
And let's say I have two threads, T1 and T2, both trying to change the element at index 0 of that array, and then use that element.
In this case, I should lock the array until T1 is finished using the element, right?
Another question: let's say T1 and T2 are already running. T1 accesses the element at index 0 first, then locks it. But right after, T2 tries to access the element at index 0 while T1 hasn't unlocked it yet. In this case, what should T2 do in order to access the element at index 0? Should T2 use a callback function invoked after T1 unlocks index 0 of the array?
Synchronization in Java is (technically) not about refusing other threads access to an object; it's about ensuring unique usage of it (at one time) between threads using synchronization locks. So T2 can access the object while T1 holds the synchronization lock, but it will be unable to obtain the synchronization lock until T1 releases it.
You synchronize (lock) when you're going to have multiple threads accessing something.
The second thread is going to block until the first thread releases the lock (exits the synchronized block)
More fine-grained control can be had by using java.util.concurrent.locks and using non-blocking checks if you don't want threads to block.
1) Basically, yes. You needn't necessarily lock the array, you could lock at a higher level of granularity (say, the enclosing class if it were a private variable). The important thing is that no part of the code tries to modify or read from the array without holding the same lock. If this condition is violated, undefined behaviour could result (including, but not limited to, seeing old values, seeing garbage values that never existed, throwing exceptions, and going into infinite loops).
2) This depends partly on the synchronization scheme you're using, and your desired semantics. With the standard synchronized keyword, T2 would block indefinitely until the monitor is released by T1, at which point T2 will acquire the monitor and continue with the logic inside the synchronized block.
If you want finer-grained control over the behaviour when a lock is contended, you could use explicit Lock objects. These offer tryLock methods (both with a timeout, and returning immediately) which return true or false according to whether the lock could be obtained. Thus you could then test the return value and take whatever action you like if the lock isn't immediately obtained (such as registering a callback function, incrementing a counter and giving feedback to a user before trying again, etc.).
However, this custom reaction is seldom necessary, and notably increases the complexity of your locking code, not to mention the large possibility of mistakes if you forget to always release the lock in a finally block if and only if it was acquired successfully, etc. As a general rule, just go with synchronized unless/until you can show that it's providing a significant bottleneck to your application's required throughput.
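A sketch of the tryLock pattern described above (class and method names are illustrative; the timed overload tryLock(long, TimeUnit) works the same way with a deadline):

```java
import java.util.concurrent.locks.ReentrantLock;

public class TryLockDemo {
    private static final ReentrantLock lock = new ReentrantLock();
    private static final Object[] array = new Object[5];

    static boolean updateIndexZero(Object newValue) {
        if (lock.tryLock()) {          // returns false instead of blocking
            try {
                array[0] = newValue;
                return true;
            } finally {
                lock.unlock();         // release if and only if acquired
            }
        }
        // Lock was contended: the caller decides what to do next
        // (retry, register a callback, report to the user, ...).
        return false;
    }

    public static void main(String[] args) {
        System.out.println(updateIndexZero("hello")); // prints true (uncontended)
    }
}
```

Note the unlock sits in a finally block that is only reached when tryLock succeeded, which is exactly the release discipline the paragraph above warns is easy to get wrong.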
I should lock the array until T1 is finished using the element right?
Yes, to avoid race conditions that would be a good idea.
what should T2 do
Lock the array, then read the value. At that point you know no one else can modify it. When using locks such as monitors, a queue is automatically kept by the system. Hence, if T2 tries to access an object locked by T1, it will block (hang) until T1 releases the lock.
Sample code:
private Object[] array;
private static final Object lockObject = new Object();

public void modifyObject() {
    synchronized (lockObject) {
        // read or modify the objects
    }
}
Technically you could also synchronize on the array itself.
You don't lock a variable; you lock a mutex, which protects
a specific range of code. And the rule is simple: if any thread
modifies an object, and more than one thread accesses it (for
any reason), all accesses must be fully synchronized. The usual
solution is to define a mutex to protect the variable, request
a lock on it, and free the lock once the access has finished.
When a thread requests a lock, it is suspended until that lock
has been freed.
In C++, it is usual to use RAII to ensure that the lock is
freed, regardless of how the block is exited. In Java,
a synchronized block will acquire the lock at the start
(waiting until it is available), and leave the lock when the
program leaves the block (for whatever reasons).
Have you considered using AtomicReferenceArray? http://download.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/atomic/AtomicReferenceArray.html It provides a getAndSet method, which gives a thread-safe, atomic way to update indexes.
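A small sketch of the getAndSet idea: the method atomically installs a new element and returns the previous one, so no explicit lock is needed for a swap:

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

public class AtomicArrayDemo {
    static String demo() {
        AtomicReferenceArray<String> array = new AtomicReferenceArray<>(5);
        array.set(0, "initial");

        // Atomically install the new value and receive the old one;
        // two threads racing here would each get a distinct previous value.
        String previous = array.getAndSet(0, "updated");
        return previous + "/" + array.get(0);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints initial/updated
    }
}
```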
T1 access element at index 0 first, then lock it.
First lock on a static final mutex variable, then access your static variable.
static final Object lock = new Object();

synchronized (lock) {
    // access static reference
}
or, better, synchronize on the class reference:
synchronized (YourClassName.class) {
    // access static reference
}
}