Core Java Threading and Volatile keyword Usage

I have recently started studying multithreading in core Java, and I am currently studying the volatile keyword. As per my understanding, threads are allowed to keep the values of variables locally. If such variables drive any logic (e.g. a loop), then changes to them will not be visible to other threads, which will keep using the outdated cached values unless the variable is declared 'volatile'.
I wrote a small program to demonstrate this; however, my observation is different. The changes made by one thread (the main thread) are clearly visible to the other thread even though the shared variable 'done' is not volatile. Could anyone please help me understand this behavior?
package com.test;
public class App implements Runnable {
    private boolean done = true;
    private static App a = new App();
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(a);
        t.start();
        System.out.println("Main Thread:" + Thread.currentThread().getName());
        Thread.sleep(1);
        a.done = false;
        System.out.println("Value of done in main method is:" + a.done);
    }
    public void run() {
        while (done) {
            System.out.println("Second Thread:" + Thread.currentThread().getName());
            System.out.println("Still running");
        }
    }
}
Output of the above code is as follows
Main Thread:main
Second Thread:Thread-0
Still running
Second Thread:Thread-0
Still running
Second Thread:Thread-0
Still running
Second Thread:Thread-0
Still running
Second Thread:Thread-0
Still running
Second Thread:Thread-0
Still running
Value of done in main method is:false

The keyword volatile guarantees the behavior you described. A write to a volatile field is visible to every thread that subsequently reads that field, so readers never act on a stale cached value.
Not using the volatile keyword does not automatically mean, though, that you are always reading old, cached values. It only means that this could happen.
Takeaway:
You use the volatile keyword to guarantee memory visibility. It does not mean that not using it will break your code as soon as you use multithreading.
Also note that volatile does not guarantee atomic interaction with your variable. This means that a volatile field is not automatically thread safe regarding race conditions and other potential trouble.
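To illustrate the point about atomicity, here is a minimal sketch (the class and field names are made up for this example and are not from the question). Several threads increment a volatile int and an AtomicInteger side by side. The volatile counter may lose updates because counter++ is a read-modify-write sequence, while the AtomicInteger should not.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; none of these names come from the original post.
class VolatileVsAtomic {
    // volatile gives visibility, but ++ is still a separate read, add and write.
    static volatile int volatileCounter = 0;
    // AtomicInteger performs the whole increment as one atomic operation.
    static final AtomicInteger atomicCounter = new AtomicInteger();

    // Starts `threads` workers that each increment both counters `perThread` times.
    static void run(int threads, int perThread) {
        volatileCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    volatileCounter++;               // not atomic: updates can be lost
                    atomicCounter.incrementAndGet(); // atomic
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        run(4, 100_000);
        System.out.println("atomic:   " + atomicCounter.get());
        System.out.println("volatile: " + volatileCounter); // may be lower than the atomic count
    }
}
```

Whether the volatile counter actually falls short depends on timing and hardware, so a run that happens to print two equal numbers does not prove the code is safe.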

Related

Multiple thread accessing same data but getting latest data?

I wrote this program:
package com.example.threads;
import java.util.concurrent.ConcurrentHashMap;
public class ConcurrentHashMapBehaviour {
    private static ConcurrentHashMap<String, String> chm = new ConcurrentHashMap<>();
    private static Object _lock = new Object();
    public static void main(String[] args) {
        Thread t = new Thread(new MyThread());
        t.start();
        int counter = 0;
        while (true) {
            String val = "FirstVal" + counter;
            counter++;
            String currentVal = null;
            synchronized (_lock) {
                chm.put("first", val);
                currentVal = chm.get("first");
            }
            System.out.println("In Main thread, current value is : " + currentVal);
        }
    }
    static class MyThread implements Runnable {
        @Override
        public void run() {
            String val = null;
            while (true) {
                synchronized (_lock) {
                    val = chm.get("first");
                }
                System.out.println("Value seen in MyThread is " + val);
            }
        }
    }
}
I am sharing common data between these threads, namely chm (a ConcurrentHashMap). I ran this in debug mode so that the main thread ran more often than MyThread; both are controlled by _lock.
So, for instance, I let the main thread run twice, so the value of the "first" key would be "FirstVal1". Then I halted the main thread and let MyThread proceed; it was able to get the latest value, even though the main thread had run multiple times.
How is this possible? I was under the impression that this variable needs to be volatile in order for MyThread to get the latest values.
I don't understand this behaviour. Can anyone explain what I am missing?

First, you're using a ConcurrentHashMap, which is safe to use in a multi-threaded environment, so if a thread puts a value into it, other threads will be able to see that value.
Second, you are synchronizing access to the map. That ensures only one thread at a time accesses the map.
Each such explicit synchronization also includes a memory barrier, which causes any results waiting in a cache to be written to main memory, making it possible for other threads to see them. That is also what a volatile variable access provides: accesses to volatile values have memory visibility guarantees.
If you want to see data races in your program, remove all synchronization primitives and try again. That does not guarantee that you'll observe a race all the time, but you should be able to see unexpected values every now and then.

There are three misconceptions here:
Writing to a volatile variable guarantees that all changes made by the writing thread are published, i.e. can be seen by other threads that subsequently read the variable. See Chapter 17 (Threads and Locks) of The Java Language Specification for all the details. This does not mean that the absence of the volatile modifier forbids publication. JVM implementations may be (and actually are) much more forgiving. This is one of the reasons concurrency problems are so hard to trace.
"A hash table supporting full concurrency of retrievals and high expected concurrency for updates." is the first sentence of the API Documentation on the ConcurrentHashMap class. And that pretty much sums it up. The concurrent hashmap guarantees that when calling get any thread gets the latest value. That's exactly the purpose of this class. If you look at its source code you can by the way see that they use volatile fields internally.
You're additionally using synchronized blocks to access your data. These do not only guarantee exclusive access, they also guarantee that all changes made before leaving such a block are visible to all threads that synchronize on the same lock object.
To summarize it: By using the concurrent hashmap implementation and using synchronization blocks you publish the changes and make the latest changes visible to other threads. One of the two would have already been sufficient.
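As a rough sketch of the "one of the two would have been sufficient" point, the following uses only the ConcurrentHashMap, with no synchronized block and no volatile field in user code. The class and method names are invented for illustration. The reader spins on get until the writer's put becomes visible, which the map's documented memory-consistency guarantees are meant to ensure.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; names are assumptions, not taken from the question.
class MapPublication {
    static final ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();

    // Spins until a value is present for the key, then returns it.
    static String awaitValue(String key) {
        String v;
        while ((v = map.get(key)) == null) {
            Thread.yield(); // be polite while waiting
        }
        return v;
    }

    // Starts a writer thread and returns the value the calling thread observes.
    static String demo() {
        map.clear();
        Thread writer = new Thread(() -> map.put("first", "FirstVal0"));
        writer.start();
        return awaitValue("first");
    }

    public static void main(String[] args) {
        System.out.println("Reader saw: " + demo());
    }
}
```

The same loop over a plain HashMap would carry no such guarantee, which is the difference the answer is pointing at.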

Safe multithreading in java

I am new to multi threading in java.
I have gone through some online references but can't get clarity on how to properly implement thread concurrency and address resource access conflicts
(like where to use synchronized and volatile, and how to design code that doesn't even need them).
Can somebody suggest some guidelines or provide any valuable online references you have come across for implementing a safer multi threading project?
Thanks in advance.

Didn't go through your code, but here's something important to know before you begin using the synchronized and volatile keywords.
Essentially, volatile is used to indicate that a variable's value will be modified by different threads.
Declaring a volatile Java variable means:
The value of this variable will never be cached thread-locally: all reads and writes will go straight to "main memory"; This means that threads are making changes directly to a (volatile)variable where other threads also have a hold on. Everyone(every thread) has control and they can make changes which are reflected globally.
Here is an excellent example to understand more about volatile variables
If a variable is not declared volatile : The problem with threads not seeing the latest value of a variable because it has not yet been written back to main memory by another thread, is called a "visibility" problem. The updates of one thread are not visible to other threads
Using the synchronized keyword means:
Synchronized blocks in Java are marked with the synchronized keyword and are synchronized on some object. All synchronized blocks synchronized on the same object can only have one thread executing inside them at the same time. All other threads attempting to enter the synchronized block are blocked until the thread inside the synchronized block exits the block.
Usage:
If you want the latest value of a count variable written by one thread to be visible to other threads, make it volatile.
public class SharedObject {
    public volatile int counter = 0;
}
However, volatile alone does not make an increment atomic. If several threads update the counter, also make the update synchronized (one thread at a time).
public synchronized void add(int value) {
    this.counter += value;
}
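Putting the two fragments above together, a self-contained version might look like the sketch below. The class name SharedCounter, the get method and the runDemo helper are assumptions added for illustration. Both the update and the read go through synchronized methods on the same object, so the field itself does not need to be volatile here.

```java
// Hypothetical class; a minimal sketch of a lock-guarded counter.
class SharedCounter {
    private int counter = 0; // guarded by the intrinsic lock on `this`

    // Only one thread at a time can run this read-modify-write.
    public synchronized void add(int value) {
        this.counter += value;
    }

    // Reading under the same lock guarantees the latest value is seen.
    public synchronized int get() {
        return counter;
    }

    // Runs `threads` workers that each add 1 `perThread` times; returns the total.
    static int runDemo(int threads, int perThread) {
        SharedCounter shared = new SharedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    shared.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println("Total: " + runDemo(4, 100_000));
    }
}
```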

Is volatile not needed for objects' members but only on primitive members?

My Code is
package threadrelated;
import java.util.concurrent.TimeUnit;
import threadrelated.lockrelated.MyNonBlockingQueue;
public class VolatileTester extends Thread {
    MyNonBlockingQueue mbq;
    public static void main(String[] args) throws InterruptedException {
        VolatileTester vt = new VolatileTester();
        vt.mbq = new MyNonBlockingQueue(10);
        System.out.println(Thread.currentThread().getName() + " " + vt.mbq);
        Thread t1 = new Thread(vt, "First");
        Thread t2 = new Thread(vt, "Secondz");
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(Thread.currentThread().getName() + " " + vt.mbq);
    }
    @Override
    public void run() {
        System.out.println(Thread.currentThread().getName() + " before " + mbq);
        mbq = new MyNonBlockingQueue(20);
        try {
            Thread.sleep(TimeUnit.SECONDS.toMillis(10));
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println(Thread.currentThread().getName() + " after " + mbq);
    }
}
Output is
main threadrelated.lockrelated.MyNonBlockingQueue@72fcb1f4
Secondz before threadrelated.lockrelated.MyNonBlockingQueue@72fcb1f4
First before threadrelated.lockrelated.MyNonBlockingQueue@72fcb1f4
Secondz after threadrelated.lockrelated.MyNonBlockingQueue@7100650c
First after threadrelated.lockrelated.MyNonBlockingQueue@7100650c
main threadrelated.lockrelated.MyNonBlockingQueue@7100650c
It shows that when the First thread assigns a new object to the member variable, the change is visible to the other thread, even though "mbq" is not declared volatile.
I used breakpoints to try different sequences of operations, but my observation is that one thread can immediately see the effect of the other thread.
Is volatile not needed for class members that are objects? Are they always synchronized to main memory? Is volatile needed only for primitive member variables (int, long, boolean, etc.)?

It's just as necessary for references as it is for primitives. The fact that your output doesn't show a visibility problem doesn't prove one doesn't exist. In general, it's very difficult to prove non-existence of a concurrency bug. But here's a simple counterproof showing the necessity of volatile:
public class Test {
    static volatile Object ref;
    public static void main(String[] args) {
        // spin until ref is updated
        new Thread(() -> {
            while (ref == null);
            System.out.println("done");
        }).start();
        // wait a second, then update ref
        new Thread(() -> {
            try { Thread.sleep(1000); } catch (Exception e) {}
            ref = new Object();
        }).start();
    }
}
This program runs for a second, then prints "done". Remove volatile and it won't terminate because the first thread never sees the updated ref value. (Disclaimer: As with any concurrency test, results may vary.)

Your code is not a useful test of volatile. It will work with or without volatile, not by accident but according to spec.
Shmosel's answer includes code that is a much better test of the volatile keyword because there is a consequence to whether the field is volatile or not. If you take that code, making the field non-volatile, and insert a println within the loop, then you should see the field's value set from the other thread be visible. This is because the println synchronizes on the print stream, inserting a memory barrier.
There are two other things in your example that insert these barriers, causing updates to be visible across threads.
The Java Language Specification lists these happens-before relationships:
A call to start() on a thread happens-before any actions in the started thread.
All actions in a thread happen-before any other thread successfully returns from a join() on that thread.
This means volatile is not needed in your posted code. The newly started threads can see the queue passed in from main, and main can see the reference to the queue once the threads have completed. There is a window, between the time the threads start and the time a println is executed, where the contents of the field could be stale, but nothing in the code is testing it.
But no, it's not accurate to say volatile isn't needed for references. There's a happens-before relationship for volatile:
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
The spec doesn't distinguish between fields that contain references and fields that contain primitives; the rule applies to both. This comes back to Java being call-by-value: references are values.
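The start() and join() rules quoted above can be seen in a small sketch like this one (the class, field and method names are hypothetical). Neither field is volatile and there is no lock, yet the hand-off in both directions is covered by the two happens-before edges.

```java
// Hypothetical demo class; a minimal sketch of the start()/join() happens-before rules.
class StartJoinVisibility {
    static int input;  // plain field, deliberately not volatile
    static int result; // plain field, deliberately not volatile

    static int compute(int value) {
        input = value; // written before start()
        Thread worker = new Thread(() -> result = input * 2);
        worker.start(); // start() happens-before every action in the worker
        try {
            worker.join(); // the worker's actions happen-before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result; // guaranteed to see the worker's write
    }

    public static void main(String[] args) {
        System.out.println(compute(21)); // prints 42
    }
}
```

If main read result without calling join() first, nothing would order that read after the worker's write, and that is the window the answer describes.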

Generally, the fact that you do not see something happening at this point in time does not mean it will not happen later. This is especially true for concurrent code. There is the jcstress library that you can play with; it will try to show you what might be wrong with your code.
A volatile variable is different from other variables because it introduces memory barriers at the CPU level. Without these there is no guarantee of when or which thread sees the updates from another. In simplistic terms, these barriers are called StoreLoad, StoreStore, LoadLoad and LoadStore.
So using volatile guarantees visibility effects; actually it is the only thing you can rely on for visibility effects (besides using Unsafe and locks/the synchronized keyword). You also have to take into consideration that you are testing this on a particular CPU, most likely x86. On different CPUs (like ARM, let's say) things would break much faster.

Visibility issue in java concurrent programming

I came across following example in book 'Java Concurrency in Practice'.
public class NoVisibility {
    private static boolean ready;
    private static int number;
    private static class ReaderThread extends Thread {
        public void run() {
            while (!ready)
                Thread.yield();
            System.out.println(number);
        }
    }
    public static void main(String[] args) {
        new ReaderThread().start();
        number = 42;
        ready = true;
    }
}
It is further stated:
NoVisibility could loop forever because the value of ready might never become
visible to the reader thread. Even more strangely, NoVisibility could print
zero because the write to ready might be made visible to the reader thread before
the write to number, a phenomenon known as reordering.
I can understand the reordering issue, but I am not able to comprehend the visibility issue. Why might the value of ready never become visible to the reader thread? Once the main thread writes a value to ready, sooner or later the reader thread would get its chance to run and could read the value of ready. Why is it that the change made by the main thread to ready might not be visible to the reader thread?

ReaderThread's run() method may never see the latest value of ready because it's free to assume and optimize that the value will not change outside of its thread. This assumption can be taken away by using the relevant concurrency features of the language, like adding the keyword volatile to ready's declaration.
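A minimal sketch of that fix, assuming a renamed class (SafeVisibility) and a helper method that returns what the reader saw instead of printing it, could look like this. Making ready volatile removes both problems from the book's example: the loop must observe the write, and the earlier write to number is ordered before it.

```java
// Hypothetical variant of the book's NoVisibility example; names are assumptions.
class SafeVisibility {
    private static volatile boolean ready; // volatile: the write below must become visible
    private static int number;             // plain field, published via the volatile write

    // Runs the reader/writer pair once and returns the number the reader observed.
    static int runOnce() {
        ready = false;
        number = 0;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                Thread.yield();
            }
            seen[0] = number; // ordered after the volatile read of ready
        });
        reader.start();
        number = 42;  // ordinary write...
        ready = true; // ...made visible by the volatile write that follows it
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```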

I believe this is a new problem that started happening with multi-core CPUs and separate CPU caches.
There would be no need to worry if you were actually reading and modifying memory, and even with multiple CPUs you'd be safe, except that each CPU now has its own cache. The memory location would be cached, and the other thread would never see the change because it would be operating exclusively out of its cache.
When you make it volatile it forces both threads to go directly to memory every time--so it slows things down quite a bit but it's thread safe.

When Java refresh Thread Cache to actual copy

I read a few articles on volatile and the thread cache and found them either too brief or lacking examples, so they are very difficult for a beginner to understand.
Please help me understand the program below:
public class Test {
    int a = 0;
    public static void main(String[] args) {
        final Test t = new Test();
        new Thread(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(3000);
                } catch (Exception e) {}
                t.a = 10;
                System.out.println("now t.a == 10");
            }
        }).start();
        new Thread(new Runnable() {
            public void run() {
                while (t.a == 0) {}
                System.out.println("Loop done: " + t.a);
            }
        }).start();
    }
}
When I make the variable a volatile and run my program, it stops after some time, but when I remove volatile from the variable, it goes on and my program does not stop.
What I knew about volatile is: "when a variable is declared volatile, a thread will read/write directly to the variable's memory instead of reading/writing from the local thread cache;
if it is not declared volatile, one may see a delay in the update of the actual value."
Also, as per my understanding of refreshing the cached copy, I thought the program would stop after some time, so why does the program above continue to run without seeing the update?
So when does a thread that is referring to its local cache start referring to the main copy, or refresh its value with the main copy's value?
Please correct me if my understanding is wrong.
Please explain with a small code snippet or a link.

"when variable is declared as volatile then thread will directly read/write to variable memory instead of read/write from local thread cache. if not declared volatile then one can see delay in updation of actual value."
To begin with, the above statements are false. There are many more phenomena going on at the level of machine code which have nothing to do with any "thread-local variable caches". In fact, this concept is hardly applicable at all.
To give you something specific to focus on, the JIT compiler will be allowed to transform your code
while(t.a == 0) {}
into
if (t.a == 0) while (true) {}
whenever t.a is not volatile. The Java Memory Model allows any variable accessed in a data race to be treated as if the accessing thread was the only thread in existence. Since obviously this thread is not modifying t.a, its value can be considered a loop invariant and the check doesn't have to be repeated... ever.
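For contrast, here is a sketch of the same spin loop with the field declared volatile, in which case the read has to be repeated on every iteration and the loop should end shortly after the write. The Holder class, the loopTerminates method and its timeout parameters are invented for this example.

```java
// Hypothetical demo class; a minimal sketch of a spin loop on a volatile field.
class SpinOnVolatile {
    static class Holder {
        volatile int a = 0; // volatile: the JIT may not treat the read as loop-invariant
    }

    // Returns true if the spinning thread left its loop within timeoutMillis
    // after another thread wrote to the field.
    static boolean loopTerminates(long delayMillis, long timeoutMillis) {
        Holder t = new Holder();
        Thread spinner = new Thread(() -> {
            while (t.a == 0) { } // re-reads t.a on every iteration
        });
        spinner.setDaemon(true); // do not keep the JVM alive if something goes wrong
        spinner.start();
        try {
            Thread.sleep(delayMillis);
            t.a = 10; // volatile write, visible to the spinner's next read
            spinner.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !spinner.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Loop finished: " + loopTerminates(100, 5000));
    }
}
```

Dropping the volatile modifier from Holder.a gives the JIT permission to hoist the check as described above, so the method might then return false, although that outcome is not guaranteed either.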

This may or may not be necessarily what is happening, but volatile also prevents certain reordering by the compiler.
For instance your code here
while(t.a == 0) {}
System.out.println("Loop done: " + t.a);
Can be reordered to
if(t.a == 0){
while(true){
}
}
System.out.println("Loop done: " + t.a);
This is called hoisting and is perfectly legal. Declaring the field volatile will prevent this sort of reordering.

If a variable is not declared volatile, whether it will be read from the cache or from the main copy is not predictable.
The cache has a limited size and the variable can get evicted from it for various reasons, like other variables occupying the cache. When this happens the main copy is read.

Marking a variable volatile makes sure changes to it are visible across threads.
In your code, the first thread changes the value of a, and the other thread sees this change and breaks out of the loop.
An important point to note about a volatile variable is that its value can change between the last access and the current access, even if the compiler believes that its value is not being changed.
(as @Marko Topolnik said)
This stops the JIT compiler from doing optimizations like turning
while(t.a == 0) {}
into
if (t.a == 0) while (true) {}
on the assumption that a cannot change.
This talk offers a very good explanation of these things: Jeremy Manson's talk on the Java Memory Model @ Google
