I am trying to understand why this example is a correctly synchronized program:
a - volatile
Thread1:
x=a
Thread2:
a=5
There are conflicting accesses (a write to and a read of a), so in every sequentially consistent execution there must be a happens-before relation between those accesses.
Suppose one of the sequentially consistent executions:
1. x=a
2. a=5
Is 1 happens-before 2, why?
Is 1 happens-before 2, why?
I'm not 100% sure I understand your question.
If you have a volatile variable a and one thread is reading from it and another is writing to it, those accesses can happen in either order. It is a race condition. What is guaranteed by the JVM and the Java Memory Model (JMM) depends on which operation happens first.
The write could have just happened and the read sees the updated value. Or the write could happen after the read. So x could be either 5 or the previous value of a.
every sequential consistency execution must be happens-before relation between that accesses
I'm not sure what this means so I'll try to be specific. The "happens before relation" with volatile means that all memory writes to a volatile variable made prior to a read of the same variable are guaranteed to have finished. But this guarantee in no way explains the timing between the two volatile operations, which is subject to the race condition. The reader is guaranteed to see the write, but only if the write happened before the read.
You might think this is a pretty weak guarantee, but threads gain a dramatic performance improvement from local CPU caches, so reading the value of a field might come from a cached memory segment instead of central memory. The guarantee is critical to ensure that the local thread memory is invalidated and updated when a volatile read occurs, so that threads can share data appropriately.
Again, the JVM and the JMM guarantee that if you are reading from a volatile field a, then any writes to the same field that have happened before the read, will be seen by it -- the value written will be properly published and visible to the reading thread. However, this guarantee in no way determines the ordering. It doesn't say that the write has to happen before the read.
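A minimal sketch of that race, with the two accesses wrapped in a hypothetical class (class and method names are invented for illustration):

class RaceExample {
    volatile int a = 0;
    int x;

    void thread1() {
        x = a;      // may see 0 or 5, depending on which access happens first
    }

    void thread2() {
        a = 5;      // if this volatile write comes first in synchronization order, the read above sees 5
    }
}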
No, a volatile read before (in synchronization order) a volatile write of the same variable does not necessarily happens-before the volatile write.
This means they can be in a "data race", because they are "conflicting accesses not ordered by a happens-before relationship". If that's true pretty much all programs contain data races:) But it's probably a spec bug. A volatile read and write should never be considered a data race. If all variables in a program are volatile, all executions are trivially sequentially consistent. see http://cs.oswego.edu/pipermail/concurrency-interest/2012-January/008927.html
Sorry, but you cannot reliably say how the JVM will optimize the code based on the JVM's 'memory model'. You have to use the high-level tools of Java to define what you want.
So volatile means only that there will be no "inter-thread cache" used for the variables.
If you want a stricter order, you have to use synchronized blocks.
http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html
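For example, a sketch of the stricter ordering with a synchronized block (the lock object and method names are invented for illustration):

class StricterOrder {
    private final Object lock = new Object();
    private int a;

    void writer() {
        synchronized (lock) {
            a = 5;
        }   // releasing the lock happens-before a later acquire of the same lock
    }

    int reader() {
        synchronized (lock) {
            return a;   // sees 5 if the writer released the lock before this acquire
        }
    }
}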
Volatile and happens-before are only useful when the read of the field drives some condition. For example:
volatile int a;
int b = 0;
Thread-1:
b = 5;
a = 10;
Thread-2:
c = b + a;
In this case there is no useful happens-before: a can be either 10 or 0 and b can be either 5 or 0, so as a result c could be 0, 5, 10 or 15. If the read of a drives some other condition, then the happens-before is established, for instance:
int b = 0;
volatile int a = 0;
Thread-1:
b = 5;
a = 10;
Thread 2:
if(a == 10){
c = b + a;
}
In this case you will ensure c = 15, because the read of a == 10 implies that the write of b = 5 happens-before the write of a = 10, and the write of a = 10 happens-before the read that saw 10.
Edit: updated the addition order, as Gray noted the inconsistency.
The tutorial http://tutorials.jenkov.com/java-concurrency/volatile.html says
Reads from and writes to other variables cannot be reordered to occur
after a write to a volatile variable, if the reads / writes originally
occurred before the write to the volatile variable. The reads / writes
before a write to a volatile variable are guaranteed to "happen
before" the write to the volatile variable.
What is meant by "before the write to the volatile variable"? Does it mean previous read/writes in the same method where we are writing to the volatile variable? Or is it a larger scope (also in methods higher up the call stack)?
The JVM can reorder operations. For example, if we have variables i and j and the code
i = 1;
j = 2;
the JVM can run this in a reordered manner:
j = 2;
i = 1;
But if the variable j is marked as volatile, then the JVM runs the operations only as
i = 1;
j = 2;
The write to i "happens before the write to the volatile variable" j.
The JVM ensures that a write to a volatile variable happens-before any subsequent read of it. Take two threads. It's guaranteed that for a single thread, the execution follows as-if-serial semantics. Basically you can assume that there is an implicit happens-before relationship b/w two actions in the same thread (the compiler is still free to reorder instructions). So a single thread trivially has a total order b/w its instructions governed by the happens-before relationship.
A multi-threaded program has many such partial orders (every thread has a total order in the local instruction set but there is no order globally across threads) but not a total order b/w the global instruction set. Synchronisation is all about giving your program as much total order as possible.
Coming back to volatile variables, when a thread reads from one, the JVM ensures that all writes to it happened before the read. Now because of this order, everything the writing thread did before it wrote to the variable becomes visible to the thread reading from it. So yes, to answer your question, even variables up in the call stack should be visible to the reading thread.
I'll try to draw a visual picture. The two threads can be imagined as two parallel rails, and a write to a volatile variable can be one of the sleepers b/w them. You basically get a
A -----
|
|
------- B
shaped total order b/w the two threads of execution. Everything in A before the sleeper should be visible to B after the sleeper because of this total order.
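Here is a sketch of that picture in code (field and method names invented): everything rail A does before the volatile write, the "sleeper", is visible to rail B once it has read the volatile.

class Rails {
    int plainData = 0;               // written somewhere up the call stack of thread A
    volatile boolean sleeper = false;

    void railA() {                   // thread A
        plainData = 42;
        sleeper = true;              // the "sleeper": volatile write
    }

    void railB() {                   // thread B
        if (sleeper) {               // volatile read crosses the sleeper
            System.out.println(plainData);  // guaranteed to print 42
        }
    }
}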
The JMM is defined in terms of the happens-before relation, which we'll call ->. If a->b, then b should see everything a did. This means that there are constraints on reordering loads/stores.
If a is a volatile write and b is a subsequent volatile read of the same variable, then a->b. This is called the volatile variable rule.
If a occurs before b in the code, then a->b. This is called the program order rule.
If a->b and b->c, then a->c. This is called the transitivity rule.
So let's apply this to a simple example:
int a;
volatile int b;
thread1(){
a=1;
b=1;
}
thread2(){
int rb=b;
int ra=a;
if(rb==1 && ra==0) print("violation");
}
So the question is: if thread2 sees rb=1, will it see ra=1?
a=1->b=1 due to program order rule.
b=1->rb=b (since we see the value 1) due to the volatile variable rule.
rb=b->ra=a due to program order rule.
Now we can apply the transitivity rule twice and conclude that a=1->ra=a. And therefore ra needs to be 1.
This means that:
a=1 and b=1 can't be reordered.
rb=b and ra=a can't be reordered
otherwise we could end up with an rb=1 and ra=0.
Writes and reads to a volatile field prevent reordering of reads/writes before and after the volatile field respectively. Variable reads/writes before a write to a volatile variable cannot be reordered to happen after it, and reads/writes after a read from a volatile variable cannot be reordered to happen before it. But what is the scope of this prohibition? As I understand it, a volatile variable prevents reordering only inside the block where it is used, am I right?
Let me give a concrete example for clarity. Let's say we have such code:
int i,j,k;
volatile int l;
boolean flag = true;
void someMethod() {
i = 1;
if (flag) {
j = 2;
}
if (flag) {
k = 3;
l = 4;
}
}
Obviously, the write to l will prevent the write to k from being reordered, but will it prevent reordering of the writes to i and j with respect to l? In other words, can the writes to i and j happen after the write to l?
UPDATE 1
Thanks guys for taking your time and answering my question - I appreciate this. The problem is you're answering the wrong question. My question is about scope, not about the basic concept. The question is basically how far in the code the compiler guarantees the "happens before" relation to the volatile field.
Obviously the compiler can guarantee that inside the same code block, but what about enclosing blocks and peer blocks? That's what my question is about. @Stephen C said that volatile guarantees happens-before behavior inside the whole method's body, even in the enclosing block, but I cannot find any confirmation of that. Is he right? Is there confirmation somewhere?
Let me give yet another concrete example about scoping to clarify things:
void setVolatile() {
l = 5;
}
void callTheSet() {
i = 6;
setVolatile();
}
Will the compiler prohibit reordering of the write to i in this case? Or maybe the compiler cannot, or is not programmed to, track what happens in other methods in the case of volatile, and the write to i can be reordered around the call to setVolatile()? Or maybe the compiler doesn't reorder method calls at all?
I mean, there has got to be a point somewhere where the compiler will not be able to track whether some code should happen before some volatile field write. Otherwise one volatile field write/read might affect the ordering of half of a program, if not more. This is a rare case, but it is possible.
Moreover, look at this quote
Under the new memory model, it is still true that volatile variables cannot be reordered with each other. The difference is that it is now no longer so easy to reorder normal field accesses around them.
"Around them". This phrase implies, that there is a scope where volatile field can prevent reordering.
Obviously, write to l will prevent write to k from reordering, but will it prevent reordering of writes to i and j?
It is not entirely clear what you mean by reordering; see my comments above.
However, in the Java 5+ memory model, we can say that the writes to i and j that happened before the write to l will be visible to another thread after it has read l ... provided that nothing writes to i and j after the write to l.
This does have the effect of constraining any reordering of the instructions that write to i and j. Specifically, they can't be moved to after the memory write barrier following the write to l, because that could lead them to not being visible to the second thread.
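To illustrate the visibility side, here is a hypothetical reader method for the question's fields (it assumes flag was true, so j and k were written, and that nothing writes to i, j or k after the write to l):

void readerThread() {
    if (l == 4) {                                 // volatile read of l
        // the writes i = 1, j = 2 and k = 3 made before l = 4 are visible here
        System.out.println(i + " " + j + " " + k);
    }
}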
But what is the scope of this prohibition?
There isn't a prohibition per se.
You need to understand that instructions, reordering and memory barriers are just details of a specific way of implementing the Java memory model. The model is actually defined in terms of what is guaranteed to be visible in any "well-formed execution".
As I understand volatile prevents reordering inside the block where it is used, am I right?
Actually, no. The blocks don't come into the consideration. What matters is the (program source code) order of the statements within the method.
@Stephen C said that volatile guarantees happens-before behavior inside the whole method's body, even in the enclosing block, but I cannot find any confirmation of that.
The confirmation is JLS 17.4.3. It states the following:
Among all the inter-thread actions performed by each thread t, the program order of t is a total order that reflects the order in which these actions would be performed according to the intra-thread semantics of t.
A set of actions is sequentially consistent if all actions occur in a total order (the execution order) that is consistent with program order, and furthermore, each read r of a variable v sees the value written by the write w to v such that:
w comes before r in the execution order, and
there is no other write w' such that w comes before w' and w' comes before r in the execution order.
Sequential consistency is a very strong guarantee that is made about visibility and ordering in an execution of a program. Within a sequentially consistent execution, there is a total order over all individual actions (such as reads and writes) which is consistent with the order of the program, and each individual action is atomic and is immediately visible to every thread.
If a program has no data races, then all executions of the program will appear to be sequentially consistent.
Notice that there is NO mention of blocks or scopes in this definition.
EDIT 2
volatile ONLY guarantees the happens-before relation.
Why reordering happens in a single thread
Consider that we have two fields:
int i = 0;
int j = 0;
We have a method to write to them:
void write() {
i = 1;
j = 2;
}
As you know, the compiler may reorder them. That is because the compiler thinks it does not matter which is accessed first; in a single thread, they 'happen together'.
Why it can't reorder with multiple threads
But now we have another method to read them in another thread:
void read() {
if(j==2) {
assert i==1;
}
}
If the compiler still reorders them, this assert may fail. That would mean j is already 2, but i unexpectedly is not 1 yet; it looks as if i=1 happens after assert i==1.
What volatile does
volatile only guarantees the happens-before relation.
Now we add volatile
volatile int j = 0;
When we observe that j==2 is true, that means j=2 has happened, and since i=1 comes before it, i=1 must have happened too. So the assert will never fail now.
'Preventing reordering' is just an approach the compiler uses to provide that guarantee.
Conclusion
The only thing you should know is happens-before. Please refer to the Java specification link below. Whether reordering happens or not is just a side effect of this guarantee.
Answer to your question
Since l is volatile, the accesses to i and j always come before the access to l in someMethod. In fact, everything before the line l=4 will happen-before it.
EDIT 1
Since the post has been edited, here is a further explanation.
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
happens-before means:
If one action happens-before another, then the first is visible to and ordered before the second.
So the access to i and j happen-before access to l.
reference: https://docs.oracle.com/javase/specs/jls/se10/html/jls-17.html#jls-17.4.5
Origin answer
No, volatile only protects itself, though it is not easy to reorder field accesses near a volatile.
Under the new memory model, it is still true that volatile variables cannot be reordered with each other. The difference is that it is now no longer so easy to reorder normal field accesses around them. Writing to a volatile field has the same memory effect as a monitor release, and reading from a volatile field has the same memory effect as a monitor acquire. In effect, because the new memory model places stricter constraints on reordering of volatile field accesses with other field accesses, volatile or not, anything that was visible to thread A when it writes to volatile field f becomes visible to thread B when it reads f.
The volatile keyword only guarantees that:
A write to a volatile field happens before every subsequent read of that same volatile.
reference: http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile
I am curious to know how volatile variable affects OTHER fields
Volatile variables do affect the other fields. The JIT compiler can reorder instructions if it determines that the reordering will not have any impact on the execution output. So if you have 6 independent variable stores, the JIT can reorder them.
However, if you make a variable volatile, i.e. in your case the variable l, then the JIT will not reorder any earlier variable STORES to occur after the volatile STORE. And I think that makes sense, because in a multithreaded program, if I see the value of variable l as 4, then I should see i as 1, because in my program i was written before l, which is program order semantics (if I am not wrong).
Note that volatile variables do two things:
The compiler will not reorder stores that come before a volatile store to after it, and will not reorder reads that come after a volatile read to before it.
Flushes the load/store buffer so that all the processors can see the changes.
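A sketch of that store-ordering point, with invented field names:

class StoreOrder {
    int a, b, c;        // plain fields: the JIT may reorder these stores among themselves
    volatile int l;

    void writer() {
        a = 1;
        b = 2;
        c = 3;
        l = 4;          // none of the three stores above may be moved below this volatile store
    }
}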
EDIT:
Good blog here: http://jpbempel.blogspot.com/2013/05/volatile-and-memory-barriers.html
Maybe I know the "real scope" you are in doubt about.
Two types of reordering are the main reasons for out-of-order instruction results:
1. Compiler optimization
2. CPU/processor reordering (mainly caused by cache and main memory synchronization)
The volatile keyword first has to ensure the flushing of the volatile variable; at the same time, other variables are also flushed to main memory. But because of compiler reordering, some write instructions placed before the volatile variable could be reordered to after it, and the reader might then read stale values of the other variables that come before the volatile variable in program order. That is why the rule "write instructions before the volatile variable are forced to run before the volatile" exists. This is handled by the Java compiler or the JIT.
The main point is the compiler's optimization of instructions, like dead-code elimination and instruction reordering; the range of instructions considered is usually a "basic block" (apart from some other optimizations such as constant propagation, etc.). A basic block is a set of instructions with no jump instruction inside it. So in my opinion, the reordering operation is confined to the range of a basic block.
The basic block in source code is usually a block or the body of a method.
And also, because Java does not have inline functions and a method call uses a dynamic invoke instruction, the reordering operation should not cross two methods.
So, the scope will not be larger than a "method body", or maybe only the body of a "for" loop; it's the basic block range.
This is all my own thinking; I'm not sure if it is right. Someone can help to make it more accurate.
Are volatile writes reordered with non-volatile writes?
For example:
I have two threads T1 and T2:
T1:
i = 10;
volatile boolean result = true;
T2:
while(!result){
}
System.out.println(i);
Does T2 always see the updated value of i (10), or the old value?
Yes. There is a happens-before relationship for a volatile statement:
Please consider this stackoverflow question: Does Java volatile variables impose a happens-before relationship before it is read?
A write to a volatile field happens-before every subsequent read of
that same field. Writes and reads of volatile fields have similar
memory consistency effects as entering and exiting monitors, but do
not entail mutual exclusion locking.
Also you can read section 3.1.3 (Locking and visibility) in the great book called "Java Concurrency in Practice". There is a relevant explanation there about a similar visibility issue and the outline is this:
Locking is not just about mutual exclusion; it is also about memory visibility. To ensure that all threads see the most up-to-date values of shared mutable variables, the reading and writing threads must synchronize on a common lock.
In your code the lock is the volatile variable
As far as I understand it, this is correctly synchronized, so no races occur and 10 is always printed.
The important parts are that within a thread, things occur in program order, and that writes to a volatile variable happen before reads that see that value. Together with the transitive closure rule, this means that the assignment to i happens before the print statement.
i = 10 happens before result = true. result = true happens before result is read as true in thread 2. result is read as true happens before System.out.println(i);. Therefore, i = 10 happens before System.out.println(i);.
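A runnable sketch of the two threads from the question, wrapped in one class for illustration (the main thread plays the role of T1):

class VolatileFlag {
    static int i = 0;
    static volatile boolean result = false;

    public static void main(String[] args) {
        Thread t2 = new Thread(() -> {
            while (!result) { }       // T2: spin until the volatile write is observed
            System.out.println(i);    // always prints 10, by the happens-before chain above
        });
        t2.start();
        i = 10;                       // T1: plain write
        result = true;                // T1: volatile write publishes i = 10 as well
    }
}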
I read the following somewhere:
The Java volatile keyword doesn't mean atomic; it's a common misconception
that after declaring volatile, the ++ operation will be atomic. To make
the operation atomic you still need to ensure exclusive access using a
synchronized method or block in Java.
So what will happen if two threads attack a volatile primitive variable at the same time?
Does this mean that whichever thread takes the lock on it will be setting its value first? And if in the meantime some other thread comes up and reads the old value while the first thread is changing its value, won't the new thread read the old value?
What is the difference between Atomic and volatile keyword?
The effect of the volatile keyword is approximately that each individual read or write operation on that variable is made atomically visible to all threads.
Notably, however, an operation that requires more than one read/write -- such as i++, which is equivalent to i = i + 1, which does one read and one write -- is not atomic, since another thread may write to i between the read and the write.
The Atomic classes, like AtomicInteger and AtomicReference, provide a wider variety of operations atomically, specifically including increment for AtomicInteger.
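A sketch of that difference (class and method names invented):

import java.util.concurrent.atomic.AtomicInteger;

class Counters {
    volatile int v = 0;
    final AtomicInteger a = new AtomicInteger(0);

    void racyIncrement()   { v++; }                  // read + write: two threads can lose an update
    void atomicIncrement() { a.incrementAndGet(); }  // one atomic read-modify-write
}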
Volatile and Atomic are two different concepts. Volatile ensures that a certain expected (memory) state is true across different threads, while Atomics ensure that operations on variables are performed atomically.
Take the following example of two threads in Java:
Thread A:
value = 1;
done = true;
Thread B:
if (done)
System.out.println(value);
Starting with value = 0 and done = false, the rules of threading tell us that it is undefined whether or not Thread B will print value. Furthermore, value is undefined at that point as well! To explain this you need to know a bit about Java memory management (which can be complex). In short: threads may create local copies of variables, and the JVM can reorder code to optimize it, so there is no guarantee that the above code runs in exactly that order. Setting done to true and then setting value to 1 could be a possible outcome of the JIT optimizations.
volatile only ensures that at the moment such a variable is accessed, the new value will be immediately visible to all other threads, and the order of execution ensures that the code is in the state you would expect it to be. So in the case of the code above, defining done as volatile will ensure that whenever Thread B checks the variable, it is either false or true, and if it is true, then value has been set to 1 as well.
As a side effect of volatile, the value of such a variable is set atomically across threads (at a very minor cost in execution speed). This is however only important on 32-bit systems that e.g. use long (64-bit) variables (or similar); in most other cases setting/reading a variable is atomic anyway. But there is an important difference between an atomic access and an atomic operation. volatile only ensures that the access is atomic, while Atomics ensure that the operation is atomic.
Take the following example:
i = i + 1;
No matter how you define i, a different thread reading the value just as the above line is executed might get i, or i + 1, because the operation is not atomic. If the other thread sets i to a different value, in the worst case i could be set back to whatever it was before by thread A, because A was just in the middle of calculating i + 1 based on the old value, and then sets i again to that old value + 1. Explanation:
Assume i = 0
Thread A reads i, calculates i+1, which is 1
Thread B sets i to 1000 and returns
Thread A now sets i to the result of the operation, which is i = 1
Atomics like AtomicInteger ensure that such operations happen atomically. So the above issue cannot happen; i would either be 1000 or 1001 once both threads are finished.
There are two important concepts in multithreading environment:
atomicity
visibility
The volatile keyword eradicates visibility problems, but it does not deal with atomicity. volatile will prevent the compiler from reordering instructions which involve a write and a subsequent read of a volatile variable; e.g. k++.
Here, k++ is not a single machine instruction, but three:
copy the value to a register;
increment the value;
place it back.
So, even if you declare a variable as volatile, this will not make this operation atomic; this means another thread can see an intermediate result, which is a stale or unwanted value for the other thread.
On the other hand, AtomicInteger, AtomicReference are based on the Compare and swap instruction. CAS has three operands: a memory location V on which to operate, the expected old value A, and the new value B. CAS atomically updates V to the new value B, but only if the value in V matches the expected old value A; otherwise, it does nothing. In either case, it returns the value currently in V. The compareAndSet() methods of AtomicInteger and AtomicReference take advantage of this functionality, if it is supported by the underlying processor; if it is not, then the JVM implements it via spin lock.
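A sketch of the typical retry loop built on compareAndSet (the class and the addOne method are invented for illustration):

import java.util.concurrent.atomic.AtomicInteger;

class CasCounter {
    private final AtomicInteger i = new AtomicInteger(0);

    void addOne() {
        int old;
        do {
            old = i.get();                          // read the expected old value
        } while (!i.compareAndSet(old, old + 1));   // retry if another thread changed i in between
    }
}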
As Trying indicated, volatile deals only with visibility.
Consider this snippet in a concurrent environment:
boolean isStopped = false;
:
:
while (!isStopped) {
// do some kind of work
}
The idea here is that some thread could change the value of isStopped from false to true in order to indicate to the subsequent loop that it is time to stop looping.
Intuitively, there is no problem. Logically if another thread makes isStopped equal to true, then the loop must terminate. The reality is that the loop will likely never terminate even if another thread makes isStopped equal to true.
The reason for this is not intuitive, but consider that modern processors have multiple cores and that each core has multiple registers and multiple levels of cache memory that are not accessible to other processors. In other words, values that are cached in one processor's local memory are not visible to threads executing on a different processor. Herein lies one of the central problems with concurrency: visibility.
The Java Memory Model makes no guarantees whatsoever about when changes that are made to a variable in one thread may become visible to other threads. In order to guarantee that updates are visible as soon as they are made, you must synchronize.
The volatile keyword is a weak form of synchronization. While it does nothing for mutual exclusion or atomicity, it does provide a guarantee that changes made to a variable in one thread will become visible to other threads as soon as they are made. Because individual reads and writes to variables other than long and double are atomic in Java, declaring variables volatile provides an easy mechanism for providing visibility in situations where there are no other atomicity or mutual exclusion requirements.
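Applied to the snippet above, the fix is simply to declare the flag volatile (the stop() method is invented for illustration):

class Worker {
    private volatile boolean isStopped = false;  // volatile: the write below becomes visible to the loop

    void run() {
        while (!isStopped) {
            // do some kind of work
        }
    }

    void stop() {
        isStopped = true;                        // called from another thread
    }
}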
The volatile keyword is used:
to make otherwise non-atomic 64-bit accesses atomic: long and double (all other primitive accesses are already guaranteed to be atomic!)
to guarantee that variable updates are seen by other threads + visibility effects: after writing to a volatile variable, all the variables that were visible before writing that variable become visible to another thread after it reads the same volatile variable (happens-before ordering).
The java.util.concurrent.atomic.* classes are, according to the java docs:
A small toolkit of classes that support lock-free thread-safe
programming on single variables. In essence, the classes in this
package extend the notion of volatile values, fields, and array
elements to those that also provide an atomic conditional update
operation of the form:
boolean compareAndSet(expectedValue, updateValue);
The atomic classes are built around the atomic compareAndSet(...) function that maps to an atomic CPU instruction. The atomic classes introduce the happens-before ordering just as volatile variables do (with one exception: weakCompareAndSet(...)).
From the java docs:
When a thread sees an update to an atomic variable caused by a
weakCompareAndSet, it does not necessarily see updates to any other
variables that occurred before the weakCompareAndSet.
To your question:
Does this mean that whichever thread takes the lock on it will be setting
its value first? And if in the meantime some other thread comes up and
reads the old value while the first thread is changing its value, won't
the new thread read the old value?
You don't lock anything; what you are describing is a typical race condition that will happen eventually if threads access shared data without proper synchronization. As already mentioned, declaring a variable volatile in this case will only ensure that other threads will see the change to the variable (the value will not be cached in a register or some cache that is only seen by one thread).
What is the difference between AtomicInteger and volatile int?
AtomicInteger provides atomic operations on an int with proper synchronization (e.g. incrementAndGet(), getAndAdd(...), ...), while volatile int will just ensure the visibility of the int to other threads.
So what will happen if two threads attack a volatile primitive variable at the same time?
Usually each one will increment the value. However, sometimes both will update the value at the same time, and instead of the total incrementing by 2, both threads increment by 1 and only 1 is added.
Does this mean that whichever thread takes the lock on it will be setting its value first?
There is no lock. That is what synchronized is for.
And if in the meantime some other thread comes up and reads the old value while the first thread is changing its value, won't the new thread read the old value?
Yes.
What is the difference between Atomic and volatile keyword?
AtomicXxxx wraps a volatile, so they are basically the same; the difference is that it provides higher-level operations such as compareAndSwap, which is used to implement increment.
AtomicXxxx also supports lazySet. This is like a volatile set, but doesn't stall the pipeline waiting for the write to complete. It can mean that if you read a value you just wrote you might see the old value, but you shouldn't be doing that anyway. The difference is that setting a volatile takes about 5 ns, but lazySet takes about 0.5 ns.
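A sketch of set vs. lazySet on AtomicInteger (the class is invented; the timings quoted above are the answerer's rough figures, not guarantees):

import java.util.concurrent.atomic.AtomicInteger;

class LazySetDemo {
    private final AtomicInteger counter = new AtomicInteger();

    void publishStrict() { counter.set(1); }      // full volatile-write semantics
    void publishCheap()  { counter.lazySet(2); }  // weaker ordering, does not stall waiting for the write
}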
Chapter 17 of the JLS introduces a concept: happens-before consistency.
A set of actions A is happens-before consistent if for all reads r in A, where W(r) is the write action seen by r, it is not the case that either hb(r, W(r)) or that there exists a write w in A such that w.v = r.v and hb(W(r), w) and hb(w, r)"
In my understanding, it equals to following words:
..., it is the case that neither ... nor ...
So my first two questions are:
is my understanding right?
what does "w.v = r.v" mean?
It also gives an Example: 17.4.5-1
Thread 1 Thread 2
B = 1; A = 2;
r2 = A; r1 = B;
In first execution order:
1: B = 1;
3: A = 2;
2: r2 = A; // sees initial write of 0
4: r1 = B; // sees initial write of 0
The order itself already tells us that the two threads are executed alternately, so my third question is: what does the number on the left mean?
In my understanding, the reason both r2 and r1 can see the initial write of 0 is that neither A nor B is a volatile field. So my fourth question is: is my understanding right?
In second execution order:
1: r2 = A; // sees write of A = 2
3: r1 = B; // sees write of B = 1
2: B = 1;
4: A = 2;
According to the definition of happens-before consistency, it is not difficult to understand that this execution order is happens-before consistent (if my first understanding is correct).
So my fifth and sixth questions are: does this situation (reads seeing writes that occur later) exist in the real world? If it does, could you give me a real example?
Each thread can be on a different core with its own private registers, which Java can use to hold values of variables, unless you force access to coherent shared memory. This means that one thread can write a value into a register, and this value is not visible to another thread for some time, like the duration of a loop or a whole function (milliseconds is not uncommon).
A more extreme example is that the reading thread's code is optimised with the assumption that since it never changes the value, it doesn't need to read it from memory. In this case the optimised code never sees the change performed by another thread.
In both cases, the use of volatile ensures that reads and writes occur in a consistent order and both threads see the same value. This is sometimes described as always reading from main memory, though it doesn't have to be the case because the caches can talk to each other directly. (So the performance hit is much smaller than you might expect.)
On normal CPUs, caches are "coherent" (can't hold stale / conflicting values) and transparent, not managed manually. Making data visible between threads just means doing an actual load or store instruction in asm to access memory (through the data caches), and optionally waiting for the store buffer to drain to give ordering wrt. other later operations.
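A sketch of the hoisted-read problem described above (field and method names invented):

class Hoisting {
    boolean running = true;    // without volatile, the read below may be hoisted out of the loop

    void worker() {
        while (running) {      // the JIT may assume this never changes and never re-read it
            // do work
        }
    }
    // declaring the field as: volatile boolean running = true;
    // forces every iteration to observe a write made by another thread
}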
happens-before
Let's take a look at definitions in concurrency theory:
Atomicity - a property of an operation that is executed completely, as a single transaction, and cannot be executed partially. For example, atomic operations [Example]
Visibility - if one thread made changes, they are visible to other threads. (volatile provided only visibility before Java 5; since Java 5 it also comes with happens-before.)
Ordering - the compiler is able to change the order of operations/instructions in the source code to make some optimisations.
For example, happens-before is a kind of memory barrier which helps to solve the visibility and ordering issues. Good examples of happens-before are volatile [About] and the synchronized monitor [About].
A good example of atomicity is the compare-and-swap (CAS) realization of the check-then-act (CTA) pattern, which should be atomic and allows changing a variable in a multithreading environment. You can write your own implementation of CTA using:
volatile + synchronized
java.util.concurrent.atomic, with sun.misc.Unsafe (memory allocation, instantiating without a constructor call...), available since Java 5, which uses JNI and CPU capabilities.
The CAS algorithm has three parameters: A (address), O (old value), N (new value).
If the value at A (address) == O (old value), then put N (new value) into A (address);
otherwise set O (old value) = the value from A (address) and repeat these actions again.
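A hand-rolled sketch of that check-then-act idea using volatile + synchronized, as suggested in the list above (the class name is invented; callers retry on false):

class MyCas {
    private volatile int value;

    synchronized boolean compareAndSet(int expected, int newValue) {
        if (value == expected) {   // check: current value matches the old value O
            value = newValue;      // act: install the new value N
            return true;
        }
        return false;              // somebody else changed it; the caller may retry
    }
}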
Happens-before
Official doc
Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second.
volatile[About] as an example
A write to a volatile field happens-before every subsequent read of that field.
Let's take a look at the example:
// Definitions
int a = 1;
int b = 2;
volatile boolean myVolatile = false;
// Thread A. Program order
{
a = 5;
b = 6;
myVolatile = true; // <-- write
}
//Thread B. Program order
{
//Thread.sleep(1000); //just to show that writing into `myVolatile`(Thread A) was executed before
System.out.println(myVolatile); // <-- read
System.out.println(a); //prints 5, not 1
System.out.println(b); //prints 6, not 2
}
Visibility - when Thread A changes/writes a volatile variable, it also pushes all previous changes into RAM (main memory); as a result, all non-volatile variables will be up to date and visible to other threads.
Ordering:
All operations before the write to the volatile variable in Thread A will be performed before it. The JVM is able to reorder them among themselves, but guarantees that no operation before the write to the volatile variable in Thread A will be performed after it.
All operations after the read of the volatile variable in Thread B will be performed after it. The JVM is able to reorder them among themselves, but guarantees that no operation after the read of the volatile variable in Thread B will be performed before it.
[Concurrency vs Parallelism]
The Java Memory Model defines a partial ordering over all the actions of your program, which is called happens-before.
To guarantee that a thread Y is able to see the side effects of action X (regardless of whether X occurred in a different thread or not), a happens-before relationship must be defined between X and Y.
If such a relationship is not present the JVM may re-order the operations of the program.
Now, if a variable is shared and accessed by many threads, and written by (at least) one thread, and the reads and writes are not ordered by the happens-before relationship, then you have a data race.
In a correct program there are no data races.
An example is 2 threads A and B synchronized on lock X.
Thread A acquires lock (now Thread B is blocked) and does the write operations and then releases lock X. Now Thread B acquires lock X and since all the actions of Thread A were done before releasing the lock X, they are ordered before the actions of Thread B which acquired the lock X after thread A (and also visible to Thread B).
Note that this occurs only for actions synchronized on the same lock. There is no happens-before relationship among threads synchronized on different locks.
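A sketch of that lock example (the lock object and the shared field are invented):

class LockExample {
    private final Object lockX = new Object();
    private int shared;

    void threadA() {
        synchronized (lockX) {       // Thread B blocks here while A holds lock X
            shared = 42;             // the write operations
        }                            // releasing lock X happens-before B's later acquire of lock X
    }

    void threadB() {
        synchronized (lockX) {
            System.out.println(shared);  // sees 42 if B acquired lock X after A released it
        }
    }
}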
In substance that is correct. The main thing to take away from this is: unless you use some form of synchronization, there is no guarantee that a read that comes after a write in your program order sees the effect of that write, as the statements might have been reordered.
does this situation (reads seeing writes that occur later) exist in the real world? If it does, could you give me a real example?
From a wall clock's perspective, obviously, a read can't see the effect of a write that has not happened yet.
From a program order perspective, because statements can be reordered when there isn't proper synchronization (a happens-before relationship), a read that comes before a write in your program could see the effect of that write during execution, because it was actually executed after the write by the JVM.
Q1: is my understanding right?
A: Yes
Q2: what does "w.v = r.v" mean?
A: It means that w and r are actions on the same variable v, i.e. the variable written by w is the variable read by r.
Q3: What does left number mean?
A: I think it is a statement ID, as shown in "Table 17.4-A. Surprising results caused by statement reordering - original code". But you can ignore it, because it does not apply to the content of "Another execution order that is happens-before consistent is:". So the left number means little here; do not get stuck on it.
Q4: In my understanding, the reason of both r2 and r1 can see initial write of 0 is both A and B are not volatile field. So my fourth quesiton is: whether my understanding is right?
A: That is one reason. Reordering can also cause it. "A program must be correctly synchronized to avoid the kinds of counterintuitive behaviors that can be observed when code is reordered."
Q5&6: In the second execution order ... So my fifth and sixth questions are: does this situation (reads seeing writes that occur later) exist in the real world? If it does, could you give me a real example?
A: Yes. With no synchronization in the code, each thread's read can see either the write of the initial value or the write by the other thread.
time 1: Thread 2: A=2
time 2: Thread 1: B=1 // Without synchronization, B=1 of Thread 1 can be interleaved here
time 3: Thread 2: r1=B // r1 value is 1
time 4: Thread 1: r2=A // r2 value is 2
Note "An execution is happens-before consistent if its set of actions is happens-before consistent"