When does Java refresh the thread cache from the actual copy?

I have read a few articles on volatile and the thread cache, but they were either too brief or lacked examples, which makes them very difficult for a beginner to understand.
Please help me understand the program below:
public class Test {
int a = 0;
public static void main(String[] args) {
final Test t = new Test();
new Thread(new Runnable(){
public void run() {
try {
Thread.sleep(3000);
} catch (Exception e) {}
t.a = 10;
System.out.println("now t.a == 10");
}
}).start();
new Thread(new Runnable(){
public void run() {
while(t.a == 0) {}
System.out.println("Loop done: " + t.a);
}
}).start();
}
}
When I declare the variable a as volatile and run the program, it stops after a few seconds. When I remove volatile, the program keeps running and never stops.
What I understood about volatile is: "when a variable is declared volatile, the thread will read/write the variable's memory directly instead of reading/writing a thread-local cache.
If it is not declared volatile, one can see a delay in updates to the actual value."
Based on my understanding of how the cached copy is refreshed, I expected the program to stop after some time. Why does it keep running without ever seeing the update?
So when does a thread that is using its local cache start referring to the main copy, or refresh its cached value from the main copy?
Please correct me if my understanding is wrong.
Please explain with a small code snippet or a link.

when a variable is declared volatile, the thread will read/write the variable's memory directly instead of reading/writing a thread-local cache. If it is not declared volatile, one can see a delay in updates to the actual value.
To begin with, the above statements are false. There are many more phenomena going on at the level of machine code which have nothing to do with any "thread-local variable caches". In fact, this concept is hardly applicable at all.
To give you something specific to focus on, the JIT compiler will be allowed to transform your code
while(t.a == 0) {}
into
if (t.a == 0) while (true) {}
whenever t.a is not volatile. The Java Memory Model allows any variable accessed in a data race to be treated as if the accessing thread was the only thread in existence. Since obviously this thread is not modifying t.a, its value can be considered a loop invariant and the check doesn't have to be repeated... ever.
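As a minimal sketch of the fix, the program from the question can be reduced to a spinning reader and a delayed writer with the field declared volatile. The class name VolatileLoopDemo, the helper readerTerminates and the timeout values are assumptions made for this illustration and are not part of the original code.

```java
public class VolatileLoopDemo {
    // volatile: each evaluation of the loop condition must perform a real read,
    // so the JIT may not hoist the check out of the loop.
    private volatile int a = 0;

    // Returns true if the spinning reader thread finished within timeoutMillis.
    static boolean readerTerminates(long writerDelayMillis, long timeoutMillis)
            throws InterruptedException {
        final VolatileLoopDemo t = new VolatileLoopDemo();

        Thread reader = new Thread(() -> {
            while (t.a == 0) { /* spin until the write becomes visible */ }
        });
        reader.setDaemon(true); // a stuck reader must not keep the JVM alive

        Thread writer = new Thread(() -> {
            try {
                Thread.sleep(writerDelayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            t.a = 10; // volatile write, visible to the subsequent volatile read
        });

        reader.start();
        writer.start();
        reader.join(timeoutMillis); // wait for the reader, but not forever
        return !reader.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reader terminated: " + readerTerminates(200, 5000));
    }
}
```

Removing volatile from the field makes the result unpredictable: the reader may still terminate on some JVMs and spin forever on others, which is exactly the freedom the memory model gives the compiler.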

This may or may not be exactly what is happening, but volatile also prevents certain transformations by the compiler.
For instance your code here
while(t.a == 0) {}
System.out.println("Loop done: " + t.a);
can be transformed into
if(t.a == 0){
while(true){
}
}
System.out.println("Loop done: " + t.a);
This is called hoisting and is perfectly legal when t.a is not volatile. Declaring t.a volatile prevents this sort of transformation.

If a variable is not declared volatile, whether it will be read from the cache or from the main copy is not predictable.
The cache has a limited size and the variable can get evicted from it for various reasons, like other variables occupying the cache. When this happens the main copy is read.

Marking a variable volatile ensures that changes to it are visible across threads.
In your code, the first thread changes the value of a, and the other thread sees this change and breaks out of the loop.
An important point about a volatile variable is that its value can change between the last access and the current access, even if the compiler sees no code in the current thread that changes it.
(as @Marko Topolnik said)
This stops the JIT compiler from performing optimizations such as turning
while(t.a == 0) {}
into
if (t.a == 0) while (true) {}
on the assumption that a cannot change.
This talk offers a very good explanation of these things: Jeremy Manson's talk on the Java Memory Model @ Google

Related

Visibility of a variable written after a volatile variable write

import java.util.concurrent.TimeUnit;

public class Test {
private static volatile boolean flag = false;
private static int i = 1;
public static void main(String[] args) {
new Thread(() -> {
try {
TimeUnit.SECONDS.sleep(1);
} catch (InterruptedException e) {
e.printStackTrace();
}
flag = true;
i += 1;
}).start();
new Thread(() -> {
while (!flag) {
if (i != 1) {
System.out.println(i);
}
}
System.out.println(flag);
System.out.println(i);
}).start();
}
}
Variable i is written after the volatile variable flag, yet the code outputs true and 2. It seems the modification of i by the first thread is visible to the second thread.
According to my understanding, variable i would have to be written before flag for the second thread to be aware of the change.

According to the language standard (§17.4):
A field may be declared volatile, in which case the Java Memory Model
ensures that all threads see a consistent value for the variable
So informally, all threads will have a view of the most up-to-date value of that variable.
However, volatile not only guarantees visibility of the target variable itself but also provides the full volatile visibility guarantee, namely:
Actually, the visibility guarantee of Java volatile goes beyond the
volatile variable itself. The visibility guarantee is as follows:
If Thread A writes to a volatile variable and Thread B subsequently
reads the same volatile variable, then all variables visible to Thread
A before writing the volatile variable, will also be visible to Thread
B after it has read the volatile variable.
If Thread A reads a
volatile variable, then all variables visible to Thread A when
reading the volatile variable will also be re-read from main memory.
According to my understanding, variable I should be written before
flag, then the second thread can be aware of the change.
Note that the phrase "all variables visible to Thread A before writing the volatile variable" does not refer to the operations performed on those variables.

Your code suffers from a data-race.
A data race is when there are 2 memory actions to the same address which are not ordered by a happens before relation, and at least one of these actions is a write.
In this case the write to i is the problem.
The write to i comes after the write to the volatile variable flag, and hence there is no happens-before relation between the write of i and the read of i.
If you would write i before you write to the flag, there would be the following happens before relation:
write of i happens before write of flag due to the program order rule
write of flag happens before read of flag due to volatile variable rule (on the hardware level this is a task for cache coherence).
the read of flag happens before the read of i due to program order rule.
Because the happens-before relation is transitive, the write of i happens before the read of i.
So like you already indicated, if you move the write of i in front of the write of the flag; the data race is gone.
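The happens-before chain above can be sketched as follows, with the write of i moved in front of the write of flag. The class name PublishViaVolatile and the helper observe are assumptions introduced for this illustration.

```java
public class PublishViaVolatile {
    private static volatile boolean flag = false;
    private static int i = 1; // deliberately not volatile

    // Returns the value of i that the reader sees after it has observed flag == true.
    static int observe() throws InterruptedException {
        flag = false;
        i = 1;
        final int[] seen = new int[1];

        Thread writer = new Thread(() -> {
            i += 1;      // (1) plain write, before the volatile write in program order
            flag = true; // (2) volatile write
        });
        Thread reader = new Thread(() -> {
            while (!flag) { /* (3) spin on the volatile read */ }
            seen[0] = i; // (4) ordered after (1) through (2) and (3)
        });

        reader.start();
        writer.start();
        writer.join();
        reader.join();   // join() makes seen[0] visible to the calling thread
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observe());
    }
}
```

With this ordering the reader is guaranteed to see the incremented value. With the two writes swapped, as in the question, it may still see it, but nothing in the memory model promises that.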

The memory model defines guarantees; however, anything may happen on top of them.
On x86, all writes have release semantics, and once you write to a variable, its updated value becomes visible to other threads as soon as possible.
So the fact that actions before a write to a volatile variable happen-before actions after a read of it does not prevent actions performed after the write from also becoming visible after the read.

Program not terminating after loop completion

In the following scenario, the boolean done gets set to true, which should end the program. Instead, the program keeps running even though the condition while (!done) no longer holds, so the loop should have exited. If I add a Thread.sleep call, even with zero sleep time, the program terminates as expected. Why is that?
public class Sample {
private static boolean done;
public static void main(String[] args) throws InterruptedException {
done = false;
new Thread(() -> {
System.out.println("Running...");
int count = 0;
while (!done) {
count++;
try {
Thread.sleep(0); // program only ends if I add this line.
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}).start();
Thread.sleep(2000);
done = true; // this is set to true after 2 seconds so program should end.
System.out.println("Done!"); // this gets printed after 2 seconds
}
}
EDIT: I am looking to understand why the above needs Thread.sleep(0) to terminate. I do not want to use volatile keyword unless it is an absolute must and I do understand that would work by exposing my value to all threads which is not my intention to expose.

Each thread has a different cached version of done, created for performance. Your counter thread is so busy computing count that it never gets a chance to reload done.
volatile ensures that every read/write goes to main memory and that the CPU cache copy is always updated.
Thread.sleep always pauses the current thread, so even with an argument of 0 your counter thread is paused for some time under 1 ms, which is enough for the thread to be notified of the change to done.

I am no Java expert man, I don't even program in java, but let me try.
A thread on stackoverflow explains the Java Memory model: Are static variables shared between threads?
Important part: https://docs.oracle.com/javase/6/docs/api/java/util/concurrent/package-summary.html#MemoryVisibility
Chapter 17 of the Java Language Specification defines the
happens-before relation on memory operations such as reads and writes
of shared variables. The results of a write by one thread are
guaranteed to be visible to a read by another thread only if the write
operation happens-before the read operation. The synchronized and
volatile constructs, as well as the Thread.start() and Thread.join()
methods, can form happens-before relationships.
If you go through the thread, it mentions the "Happens before" logic when executing threads that share a variable. So my guess is when you call Thread.sleep(0), the main thread is able to set the done variable properly making sure that it "Happens first". Though, in a multi-threaded environment even that is not guaranteed. But since the code-piece is so small it makes it work in this case.
To sum it up, I just ran your program with a minor change to the variable "done" and the program worked as expected:
private static volatile boolean done;
Thank you. Maybe someone else can give you a better explanation :P

Core Java Threading and Volatile keyword Usage

I have recently started studying multithreading in core Java, and I am studying the volatile keyword. As per my understanding, threads are allowed to keep the values of variables locally. If the variables drive any logic (e.g. a loop), then changes to them will not be visible to the threads, which will keep using the outdated cached values unless the variable is declared volatile.
I wrote a small program to demonstrate this; however, my observation is different. The changes made by one thread (the main thread) are perfectly visible to the other thread even though the shared variable done is not volatile. Could anyone please help me understand this behavior?
package com.test;
public class App implements Runnable {
private boolean done= true;
private static App a = new App();
public static void main( String[] args ) throws InterruptedException {
Thread t = new Thread(a);
t.start();
System.out.println("Main Thread:"+ Thread.currentThread().getName());
Thread.sleep(1);
a.done=false;
System.out.println("Value of done in main method is:"+ a.done);
}
public void run() {
while(done) {
System.out.println("Second Thread:"+ Thread.currentThread().getName());
System.out.println("Still running");
}
}
}
Output of the above code is as follows
Main Thread:main
Second Thread:Thread-0
Still running
Second Thread:Thread-0
Still running
Second Thread:Thread-0
Still running
Second Thread:Thread-0
Still running
Second Thread:Thread-0
Still running
Second Thread:Thread-0
Still running
Value of done in main method is:false

The keyword volatile guarantees the behavior you described. The value of a volatile field is visible to all readers after a write operation. This means that no cached values are used.
If you are not using the volatile keyword this does not automatically mean though that you are always using old, cached variables. It's just the possibility that this could happen.
Takeaway:
You use the volatile keyword to guarantee memory visibility. It does not mean that not using it will break your code as soon as you use multithreading.
Also note that volatile does not guarantee atomic interaction with your variable. This means that a volatile field is not automatically thread safe regarding race conditions and other potential trouble.
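To illustrate the last point, here is a minimal sketch that increments a volatile counter and a lock-protected counter from several threads. The class name VolatileNotAtomic and the helper run are assumptions made for this example.

```java
public class VolatileNotAtomic {
    private static volatile int volatileCounter = 0;
    private static int lockedCounter = 0;
    private static final Object LOCK = new Object();

    // Starts 'threads' workers, each incrementing both counters 'perThread' times.
    // Returns { volatileCounter, lockedCounter } after all workers have finished.
    static int[] run(int threads, int perThread) throws InterruptedException {
        volatileCounter = 0;
        lockedCounter = 0;
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    volatileCounter++;       // read-modify-write: updates can be lost
                    synchronized (LOCK) {
                        lockedCounter++;     // guarded by the lock: no update is lost
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return new int[] { volatileCounter, lockedCounter };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = run(4, 100_000);
        System.out.println("volatile: " + result[0] + ", locked: " + result[1]);
    }
}
```

The lock-protected counter always ends at threads * perThread. The volatile counter can end lower, because two threads may read the same old value and both write back old + 1.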

Is volatile not needed for objects' members but only on primitive members?

My Code is
package threadrelated;
import java.util.concurrent.TimeUnit;
import threadrelated.lockrelated.MyNonBlockingQueue;
public class VolatileTester extends Thread {
MyNonBlockingQueue mbq ;
public static void main(String[] args) throws InterruptedException {
VolatileTester vt = new VolatileTester();
vt.mbq = new MyNonBlockingQueue(10);
System.out.println(Thread.currentThread().getName()+" "+vt.mbq);
Thread t1 = new Thread(vt,"First");
Thread t2 = new Thread(vt,"Secondz");
t1.start();
t2.start();
t1.join();
t2.join();
System.out.println(Thread.currentThread().getName()+" "+vt.mbq);
}
@Override
public void run() {
System.out.println(Thread.currentThread().getName()+" before "+mbq);
mbq = new MyNonBlockingQueue(20);
try {
Thread.sleep(TimeUnit.SECONDS.toMillis(10));
} catch (InterruptedException e) {
e.printStackTrace();
}
System.out.println(Thread.currentThread().getName()+" after "+mbq);
}
}
Output is
main threadrelated.lockrelated.MyNonBlockingQueue@72fcb1f4
Secondz before threadrelated.lockrelated.MyNonBlockingQueue@72fcb1f4
First before threadrelated.lockrelated.MyNonBlockingQueue@72fcb1f4
Secondz after threadrelated.lockrelated.MyNonBlockingQueue@7100650c
First after threadrelated.lockrelated.MyNonBlockingQueue@7100650c
main threadrelated.lockrelated.MyNonBlockingQueue@7100650c
This shows that when the First thread assigns a new object to the member variable, the change is visible to the other thread, even though mbq is not declared volatile.
I used breakpoints to try different sequences of operations, but my observation is that one thread can immediately see the effect of the other thread.
Is volatile not needed for class members that are objects? Are they always synchronized to main memory? Is volatile needed only for primitive member variables (int, long, boolean, etc.)?

It's just as necessary for references as it is for primitives. The fact that your output doesn't show a visibility problem doesn't prove one doesn't exist. In general, it's very difficult to prove non-existence of a concurrency bug. But here's a simple counterproof showing the necessity of volatile:
public class Test {
static volatile Object ref;
public static void main(String[] args) {
// spin until ref is updated
new Thread(() -> {
while (ref == null);
System.out.println("done");
}).start();
// wait a second, then update ref
new Thread(() -> {
try { Thread.sleep(1000); } catch (Exception e) {}
ref = new Object();
}).start();
}
}
This program runs for a second, then prints "done". Remove volatile and it won't terminate because the first thread never sees the updated ref value. (Disclaimer: As with any concurrency test, results may vary.)

Your code is not a useful test of volatile. It will work with or without volatile, not by accident but according to spec.
Shmosel's answer includes code that is a much better test of the volatile keyword because there is a consequence to whether the field is volatile or not. If you take that code, making the field non-volatile, and insert a println within the loop, then you should see the field's value set from the other thread be visible. This is because the println synchronizes on the print stream, inserting a memory barrier.
There are two other things in your example that insert these barriers, causing updates to be visible across threads.
The Java Language Specification lists these happens-before relationships:
A call to start() on a thread happens-before any actions in the started thread.
All actions in a thread happen-before any other thread successfully returns from a join() on that thread.
This means volatile is not needed in your posted code. The newly started threads can see the queue passed in from main, and main can see the reference to the queue once the threads have completed. There is a window, between the time the threads start and the time a println is executed, where the contents of the field could be stale, but nothing in the code is testing it.
But no, it's not accurate to say volatile isn't needed for references. There's a happens-before relationship for volatile:
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
The spec doesn't distinguish between fields that contain references and fields that contain primitives, the rule applies to both. This comes back to Java being call-by-value, references are values.
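The start() and join() rules quoted above can be sketched with a plain, non-volatile field. The class name StartJoinVisibility and the helper roundTrip are assumptions made for this illustration.

```java
public class StartJoinVisibility {
    private int value; // plain field, intentionally not volatile

    static int roundTrip() throws InterruptedException {
        StartJoinVisibility box = new StartJoinVisibility();
        box.value = 1; // written before start()

        Thread worker = new Thread(() -> {
            // start() happens-before every action in the started thread,
            // so this read is guaranteed to see 1.
            box.value = box.value + 41;
        });
        worker.start();
        worker.join(); // everything the worker did happens-before join() returns
        return box.value;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(roundTrip());
    }
}
```

No volatile is needed here because every cross-thread access is ordered by start() or join(). A read made by the main thread between those two calls would have no such guarantee.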

Generally, the fact that you do not see something happening at this point in time does not mean it will not happen later. This is especially true for concurrent code. There is the jcstress library, which you can experiment with; it will try to show you what might be wrong with your code.
A volatile variable differs from other variables because it introduces memory barriers at the CPU level. Without these, there is no guarantee of when, or which, thread sees the updates from another. In simple terms, these barriers are called StoreLoad, StoreStore, LoadLoad and LoadStore.
So using volatile guarantees visibility effects; in fact, it is the only thing you can rely on for visibility effects (besides using Unsafe and locks or the synchronized keyword). You also have to take into account that you are testing this on a particular CPU, most likely x86. On other CPUs (ARM, for example) things would break much sooner.

When to use volatile and synchronized

I know there are many questions about this, but I still don't quite understand. I know what both of these keywords do, but I can't determine which to use in certain scenarios. Here are a couple of examples that I'm trying to determine which is the best to use.
Example 1:
import java.net.ServerSocket;
public class Something extends Thread {
private ServerSocket serverSocket;
public void run() {
while (true) {
if (serverSocket.isClosed()) {
...
} else { //Should this block use synchronized (serverSocket)?
//Do stuff with serverSocket
}
}
}
public ServerSocket getServerSocket() {
return serverSocket;
}
}
public class SomethingElse {
Something something = new Something();
public void doSomething() {
something.getServerSocket().close();
}
}
Example 2:
public class Server {
private int port;//Should it be volatile or the threads accessing it use synchronized (server)?
//getPort() and setPort(int) are accessed from multiple threads
public int getPort() {
return port;
}
public void setPort(int port) {
this.port = port;
}
}
Any help is greatly appreciated.

A simple answer is as follows:
synchronized can always be used to give you a thread-safe / correct solution,
volatile will probably be faster, but can only be used to give you a thread-safe / correct solution in limited situations.
If in doubt, use synchronized. Correctness is more important than performance.
Characterizing the situations under which volatile can be used safely involves determining whether each update operation can be performed as a single atomic update to a single volatile variable. If the operation involves accessing other (non-final) state or updating more than one shared variable, it cannot be done safely with just volatile. You also need to remember that:
updates to a non-volatile long or double may not be atomic, and
Java operators like ++ and += are not atomic.
Terminology: an operation is "atomic" if the operation either happens entirely, or it does not happen at all. The term "indivisible" is a synonym.
When we talk about atomicity, we usually mean atomicity from the perspective of an outside observer, e.g. a different thread from the one that is performing the operation. For instance, ++ is not atomic from the perspective of another thread, because that thread may be able to observe the state of the field being incremented in the middle of the operation. Indeed, if the field is a long or a double, it may even be possible to observe a state that is neither the initial state nor the final state!

The synchronized keyword
synchronized indicates that a variable will be shared among several threads. It's used to ensure consistency by "locking" access to the variable, so that one thread can't modify it while another is using it.
Classic Example: updating a global variable that indicates the current time
The incrementSeconds() function must be able to complete uninterrupted because, as it runs, it creates temporary inconsistencies in the value of the global variable time. Without synchronization, another function might see a time of "12:60:00" or, at the comment marked with >>>, it would see "11:00:00" when the time is really "12:00:00" because the hours haven't incremented yet.
void incrementSeconds() {
if (++time.seconds > 59) { // time might be 1:00:60
time.seconds = 0; // time is invalid here: minutes are wrong
if (++time.minutes > 59) { // time might be 1:60:00
time.minutes = 0; // >>> time is invalid here: hours are wrong
if (++time.hours > 23) { // time might be 24:00:00
time.hours = 0;
}
}
}
}
The volatile keyword
volatile simply tells the compiler not to make assumptions about the constant-ness of a variable, because it may change when the compiler wouldn't normally expect it. For example, the software in a digital thermostat might have a variable that indicates the temperature, and whose value is updated directly by the hardware. It may change in places that a normal variable wouldn't.
If degreesCelsius is not declared to be volatile, the compiler is free to optimize this:
void controlHeater() {
while ((degreesCelsius * 9.0/5.0 + 32) < COMFY_TEMP_IN_FAHRENHEIT) {
setHeater(ON);
sleep(10);
}
}
into this:
void controlHeater() {
float tempInFahrenheit = degreesCelsius * 9.0/5.0 + 32;
while (tempInFahrenheit < COMFY_TEMP_IN_FAHRENHEIT) {
setHeater(ON);
sleep(10);
}
}
By declaring degreesCelsius to be volatile, you're telling the compiler that it has to check its value each time it runs through the loop.
Summary
In short, synchronized lets you control access to a variable, so you can guarantee that updates are atomic (that is, a set of changes will be applied as a unit; no other thread can access the variable when it's half-updated). You can use it to ensure consistency of your data. On the other hand, volatile is an admission that the contents of a variable are beyond your control, so the code must assume it can change at any time.
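A minimal Java sketch of the time example with synchronized applied is shown below. The class name SafeClock, the snapshot method and the helper tickFromThreads are assumptions added for illustration; the original snippet only shows incrementSeconds operating on a global time variable.

```java
public class SafeClock {
    private int hours, minutes, seconds;

    // The whole carry chain runs while holding the lock on this object,
    // so no other synchronized method can run in the middle of it.
    public synchronized void incrementSeconds() {
        if (++seconds > 59) {
            seconds = 0;
            if (++minutes > 59) {
                minutes = 0;
                if (++hours > 23) {
                    hours = 0;
                }
            }
        }
    }

    // Readers take the same lock, so they never observe a half-updated time.
    public synchronized String snapshot() {
        return String.format("%02d:%02d:%02d", hours, minutes, seconds);
    }

    // Ticks one shared clock from several threads and returns the final time.
    static String tickFromThreads(int threads, int ticksPerThread)
            throws InterruptedException {
        SafeClock clock = new SafeClock();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < ticksPerThread; n++) {
                    clock.incrementSeconds();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return clock.snapshot();
    }

    public static void main(String[] args) throws InterruptedException {
        // 4 threads * 900 ticks = 3600 seconds, i.e. exactly one hour
        System.out.println(tickFromThreads(4, 900));
    }
}
```

Both the writer and the reader must synchronize on the same object. Locking only incrementSeconds would still let an unsynchronized reader see an intermediate state such as 00:59:00 with the seconds already reset.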

There is insufficient information in your post to determine what is going on, which is why all the advice you are getting is general information about volatile and synchronized.
So, here's my general advice:
During the cycle of writing-compiling-running a program, there are two optimization points:
at compile time, when the compiler might try to reorder instructions or optimize data caching.
at runtime, when the CPU has its own optimizations, like caching and out-of-order execution.
All this means that instructions will most likely not execute in the order that you wrote them, regardless if this order must be maintained in order to ensure program correctness in a multithreaded environment. A classic example you will often find in the literature is this:
class ThreadTask implements Runnable {
private boolean stop = false;
private boolean work;
public void run() {
while(!stop) {
work = !work; // simulate some work
}
}
public void stopWork() {
stop = true; // signal thread to stop
}
public static void main(String[] args) throws InterruptedException {
ThreadTask task = new ThreadTask();
Thread t = new Thread(task);
t.start();
Thread.sleep(1000);
task.stopWork();
t.join();
}
}
Depending on compiler optimizations and CPU architecture, the above code may never terminate on a multi-processor system. This is because the value of stop may be cached in a register of the CPU running thread t, such that the thread never again reads the value from main memory, even though the main thread has updated it in the meantime.
To combat this kind of situation, memory fences were introduced. These are special instructions that do not allow regular instructions before the fence to be reordered with instructions after the fence. One such mechanism is the volatile keyword. Variables marked volatile are not optimized by the compiler/CPU and will always be written/read directly to/from main memory. In short, volatile ensures visibility of a variable's value across CPU cores.
Visibility is important, but should not be confused with atomicity. Two threads incrementing the same shared variable may produce inconsistent results even though the variable is declared volatile. This is due to the fact that on some systems the increment is actually translated into a sequence of assembler instructions that can be interrupted at any point. For such cases, critical sections such as the synchronized keyword need to be used. This means that only a single thread can access the code enclosed in the synchronized block. Other common uses of critical sections are atomic updates to a shared collection, when usually iterating over a collection while another thread is adding/removing items will cause an exception to be thrown.
Finally two interesting points:
synchronized and a few other constructs such as Thread.join will introduce memory fences implicitly. Hence, incrementing a variable inside a synchronized block does not require the variable to also be volatile, assuming that's the only place it's being read/written.
For simple updates such as value swap, increment, decrement, you can use non-blocking atomic methods like the ones found in AtomicInteger, AtomicLong, etc. These are much faster than synchronized because they do not trigger a context switch in case the lock is already taken by another thread. They also introduce memory fences when used.
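The second point can be sketched with java.util.concurrent.atomic.AtomicInteger. The class name AtomicCounterDemo and the helpers count and doubleIt are assumptions made for this example; incrementAndGet and compareAndSet are standard AtomicInteger methods.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounterDemo {
    // Increments one shared AtomicInteger from several threads without any lock.
    static int count(int threads, int perThread) throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    counter.incrementAndGet(); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return counter.get();
    }

    // Compare-and-set retry loop: doubles the value atomically and
    // returns the value this call installed.
    static int doubleIt(AtomicInteger value) {
        int current;
        int updated;
        do {
            current = value.get();
            updated = current * 2;
        } while (!value.compareAndSet(current, updated)); // retry if another thread won
        return updated;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(count(4, 100_000));
        System.out.println(doubleIt(new AtomicInteger(21)));
    }
}
```

The retry loop is the general pattern for any update that AtomicInteger does not offer as a single method: read, compute, and install the result only if nobody changed the value in between.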

Note: In your first example, the field serverSocket is actually never initialized in the code you show.
Regarding synchronization, it depends on whether or not the ServerSocket class is thread safe. (I assume it is, but I have never used it.) If it is, you don't need to synchronize around it.
In the second example, int variables can be atomically updated so volatile may suffice.

volatile solves the “visibility” problem across CPU cores: the value in local registers is flushed and synced with RAM. However, if we need a consistent value and atomic operations, we need a mechanism to protect the critical data. That can be achieved with either a synchronized block or an explicit lock.
