I would like to understand what happens from a cache point of view (MESI protocol) if a programmer forgets to add the volatile keyword to a variable used for synchronization.
The following snippet of code simply stops after some iterations. I know it's because one of the two threads (let's assume THREAD 0) does not see the update that THREAD 1 makes to the current variable via getNext(), and therefore it keeps looping forever.
However, I don't understand why that is the case. THREAD 0 is looping on current and should see at some point that the cache line was updated by THREAD 1 (the cache line switched to the Modified state), then issue a "Read" message on the memory bus to fetch it into its local cache. Adding the volatile modifier to the current variable makes everything work fine.
What is happening that prevents THREAD 0 from continuing its execution?
Reference: Memory Barriers: a Hardware View for Software Hackers
public class Volatile {
    public static final int THREAD_0 = 0;
    public static final int THREAD_1 = 1;

    public static int current;
    public static int counter = 0;

    public static void main(String[] args) {
        current = 0;

        /** Thread 0 **/
        Thread thread0 = new Thread(() -> {
            while (true) { /** THREAD_0 */
                while (current != THREAD_0);
                counter++;
                System.out.println("Thread0:" + counter);
                current = getNext(THREAD_0);
            }
        });

        /** Thread 1 **/
        Thread thread1 = new Thread(() -> {
            while (true) { /** THREAD_1 */
                while (current != THREAD_1);
                counter++;
                System.out.println("Thread1:" + counter);
                current = getNext(THREAD_1);
            }
        });

        thread0.start();
        thread1.start();
    }

    public static int getNext(int threadId) {
        return threadId == THREAD_0 ? THREAD_1 : THREAD_0;
    }
}
Different processor architectures can have different degrees of cache coherence. Some may be pretty minimal, because the goal is to maximize throughput, and syncing caches up with memory unnecessarily drags that down. Java imposes its own memory barriers when it detects that coordination is required, but that depends on the programmer following the rules.
If the programmer doesn't add indications (like the synchronized or volatile keywords) that a variable can be shared across threads, the JIT is allowed to assume that no other thread is modifying that variable, and it can optimize accordingly. The bytecode that tests that variable can get reordered drastically; in particular, the read can be hoisted out of the loop entirely, so it no longer matters when the hardware detects a modification and fetches the new value.
Quoting from Java Concurrency in Practice 3.1:
In the absence of synchronization, the compiler, processor, and runtime can do some downright weird things to the order in which operations appear to execute. Attempts to reason about the order in which memory actions "must" happen in insufficiently synchronized multithreaded programs will almost certainly be incorrect.
(There are several places in the JCIP book making the case that reasoning about insufficiently-synchronized code is futile.)
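To connect this back to the original snippet: declaring current volatile restores the visibility guarantee, and because each counter++ is ordered by a volatile write/read pair (the threads strictly alternate), the counter updates become safe as well. Here is a bounded sketch of the fixed program; the class name, helper method, and loop bound are mine, added so the program terminates:

```java
public class VolatileFix {
    static final int THREAD_0 = 0;
    static final int THREAD_1 = 1;

    // volatile: each thread is guaranteed to see the other's latest write
    static volatile int current = 0;
    // safe without volatile here: only the thread holding the "turn" writes it,
    // and the volatile write/read of current orders those writes
    static int counter = 0;

    static int getNext(int threadId) {
        return threadId == THREAD_0 ? THREAD_1 : THREAD_0;
    }

    static Thread worker(int id, int rounds) {
        return new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                while (current != id);     // spin until it is our turn
                counter++;
                current = getNext(id);     // volatile write publishes counter too
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t0 = worker(THREAD_0, 1000);
        Thread t1 = worker(THREAD_1, 1000);
        t0.start();
        t1.start();
        t0.join();
        t1.join();
        System.out.println(counter); // 2000: strict alternation, no lost updates
    }
}
```

Without the volatile modifier, the same program may hang exactly as described in the question, because the spin loop is free to cache the first value of current it reads.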
Related
I understand that reading and writing data from multiple threads needs a good locking mechanism to avoid data races. However, consider this situation: if multiple threads try to write a single, identical value to the same variable, can that be a problem?
For example, here my sample code:
public class Main {
    public static void main(String[] args) {
        final int[] a = {1};
        while (true) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    a[0] = 1;
                    assert a[0] == 1;
                }
            }).start();
        }
    }
}
I have run this program for a long time, and it looks like everything is fine. If this code can cause a problem, how can I reproduce it?
Your test case does not cover the actual problem. You test the variable's value in the same thread, but that thread already copied the initial state of the variable, and when it changes within the thread, the changes are visible to that thread, just as in any single-threaded application. The real issue with write operations is how and when the updated value is used in the other threads.
For example, if you were to write a counter, where each thread increments the value, you would run into issues. Another problem is that your test operation takes far less time than creating a thread, so the execution is pretty much linear. If you had longer code in the threads, it would be possible for multiple threads to access the variable at the same time. I wrote this test using Thread.sleep(), which is known to be unreliable (which is exactly what we need here):
int[] a = new int[]{0};
for (int i = 0; i < 100; i++) {
    final int k = i;
    new Thread(new Runnable() {
        @Override
        public void run() {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            a[0]++;
            System.out.println(a[0]);
        }
    }).start();
}
If you execute this code, you will see how unreliable the output is. The order of the numbers changes (they are not in ascending order), and there are duplicates as well as missing numbers. This is because the variable is copied into each CPU's cache (once per thread) and written back to shared RAM only after the operation is complete. (This write-back does not necessarily happen right away, to save time in case the value is needed again.)
There also might be some other mechanics in the JVM that copy the values within the RAM for threads, but I'm unaware of them.
The thing is, even locking only prevents these issues when it is used consistently. It keeps threads from accessing the variable at the same time, and in Java, releasing and then acquiring the same lock also makes the updated value visible to the next thread; but that guarantee only holds if every access to the variable goes through the same lock.
The Java multi-threading tutorial gives an example of memory consistency errors, but I cannot reproduce it. Is there any other way to simulate memory consistency errors?
The example provided in the tutorial:
Suppose a simple int field is defined and initialized:
int counter = 0;
The counter field is shared between two threads, A and B. Suppose thread A increments counter:
counter++;
Then, shortly afterwards, thread B prints out counter:
System.out.println(counter);
If the two statements had been executed in the same thread, it would be safe to assume that the value printed out would be "1". But if the two statements are executed in separate threads, the value printed out might well be "0", because there's no guarantee that thread A's change to counter will be visible to thread B — unless the programmer has established a happens-before relationship between these two statements.
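One way to establish the happens-before relationship the tutorial mentions is Thread.join(): everything the joined thread did before terminating is visible to the thread that called join(). A minimal sketch (class and field names are mine, not from the tutorial):

```java
public class HappensBefore {
    static int counter = 0; // deliberately not volatile

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> counter++); // thread A increments counter
        a.start();
        a.join(); // join() establishes happens-before:
                  // all of A's writes are visible after join() returns
        System.out.println(counter); // guaranteed to print 1
    }
}
```

Without the join() (or some other synchronization such as volatile or a common lock), printing 0 would be a legal outcome.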
I answered a question a while ago about a bug in Java 5. Why doesn't volatile in java 5+ ensure visibility from another thread?
Given this piece of code:
public class Test {
    volatile static private int a;
    static private int b;

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 100; i++) {
            new Thread() {
                @Override
                public void run() {
                    int tt = b; // makes the jvm cache the value of b
                    while (a == 0) {
                    }
                    if (b == 0) {
                        System.out.println("error");
                    }
                }
            }.start();
        }
        b = 1;
        a = 1;
    }
}
The volatile store of a happens after the normal store of b. So when a thread runs and sees a != 0, the rules defined in the JMM guarantee that it must also see b == 1.
The bug in the JRE allowed a thread to reach the error line; it was subsequently fixed. This would definitely be allowed to fail if a were not declared volatile.
This might reproduce the problem, at least on my computer, I can reproduce it after some loops.
Suppose you have a Holder class:
class Holder {
    boolean flag = false;
    long modifyTime = Long.MAX_VALUE;
}
Let thread_A set flag to true, and save the time into modifyTime.
Let another thread, say thread_B, read the Holder's flag. If thread_B still gets false even at a time later than modifyTime, then we can say we have reproduced the problem.
Example code
class Holder {
    boolean flag = false;
    long modifyTime = Long.MAX_VALUE;
}

public class App {
    public static void main(String[] args) {
        while (!test());
    }

    private static boolean test() {
        final Holder holder = new Holder();

        new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    Thread.sleep(10);
                    holder.flag = true;
                    holder.modifyTime = System.currentTimeMillis();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }).start();

        long lastCheckStartTime = 0L;
        long lastCheckFailTime = 0L;

        while (true) {
            lastCheckStartTime = System.currentTimeMillis();
            if (holder.flag) {
                break;
            } else {
                lastCheckFailTime = System.currentTimeMillis();
                System.out.println(lastCheckFailTime);
            }
        }

        if (lastCheckFailTime > holder.modifyTime
                && lastCheckStartTime > holder.modifyTime) {
            System.out.println("last check fail time " + lastCheckFailTime);
            System.out.println("modify time " + holder.modifyTime);
            return true;
        } else {
            return false;
        }
    }
}
Result
last check fail time 1565285999497
modify time 1565285999494
This means thread_B got false from the Holder's flag field at time 1565285999497, even though thread_A had set it to true at time 1565285999494 (3 milliseconds earlier).
The example used is poorly suited to demonstrating the memory consistency issue. Making it work requires brittle reasoning and complicated coding, and even then you may not be able to see the results. Multi-threading issues occur due to unlucky timing, so if we want to increase the chances of observing the issue, we need to increase the chances of unlucky timing.
The following program achieves that.
public class ConsistencyIssue {
    static int counter = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread thread1 = new Thread(new Increment(), "Thread-1");
        Thread thread2 = new Thread(new Increment(), "Thread-2");
        thread1.start();
        thread2.start();
        thread1.join();
        thread2.join();
        System.out.println(counter);
    }

    private static class Increment implements Runnable {
        @Override
        public void run() {
            for (int i = 1; i <= 10000; i++)
                counter++;
        }
    }
}
Execution 1 output: 10963,
Execution 2 output: 14552
The final count should have been 20000, but it is less than that. The reason is that counter++ is a multi-step operation:
1. read counter
2. increment counter
3. store it
Two threads may both read, say, counter == 1 at the same time, increment it to 2, and write out 2, whereas a serial execution would have gone 1 -> 2 -> 3. We need a way to make all three steps atomic, i.e., executed by only one thread at a time.
Solution 1: Synchronized
Surround the increment with synchronized. Since counter is a static variable, you need to use class-level synchronization:
@Override
public void run() {
    for (int i = 1; i <= 10000; i++) {
        synchronized (ConsistencyIssue.class) {
            counter++;
        }
    }
}
Now it outputs: 20000
Solution 2: AtomicInteger
import java.util.concurrent.atomic.AtomicInteger;

public class ConsistencyIssue {
    static AtomicInteger counter = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Thread thread1 = new Thread(new Increment(), "Thread-1");
        Thread thread2 = new Thread(new Increment(), "Thread-2");
        thread1.start();
        thread2.start();
        thread1.join();
        thread2.join();
        System.out.println(counter.get());
    }

    private static class Increment implements Runnable {
        @Override
        public void run() {
            for (int i = 1; i <= 10000; i++)
                counter.incrementAndGet();
        }
    }
}
We could do this with semaphores or explicit locking too, but for this simple code AtomicInteger is enough.
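For completeness, here is a sketch of the explicit-locking variant mentioned above, using java.util.concurrent.locks.ReentrantLock (the class name is mine): the lock/unlock pair provides both mutual exclusion around the increment and the memory visibility that the unsynchronized version lacks.

```java
import java.util.concurrent.locks.ReentrantLock;

public class LockCounter {
    static int counter = 0;
    static final ReentrantLock lock = new ReentrantLock();

    public static void main(String[] args) throws InterruptedException {
        Runnable increment = () -> {
            for (int i = 0; i < 10000; i++) {
                lock.lock();        // only one thread inside at a time
                try {
                    counter++;
                } finally {
                    lock.unlock();  // always release, even on exception
                }
            }
        };
        Thread t1 = new Thread(increment);
        Thread t2 = new Thread(increment);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(counter); // 20000
    }
}
```

The lock()/unlock() in a try/finally is the standard idiom; forgetting the finally can leave the lock held forever if the critical section throws.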
Sometimes when I try to reproduce some real concurrency problems, I use the debugger.
Make a breakpoint on the print and a breakpoint on the increment and run the whole thing.
Releasing the breakpoints in different sequences gives different results.
Maybe too simple, but it worked for me.
Please have another look at how the example is introduced in your source.
The key to avoiding memory consistency errors is understanding the happens-before relationship. This relationship is simply a guarantee that memory writes by one specific statement are visible to another specific statement. To see this, consider the following example.
This example illustrates the fact that multi-threading is not deterministic, in the sense that you get no guarantee about the order in which operations of different threads will be executed, which might result in different observations across several runs. But it does not illustrate a memory consistency error!
To understand what a memory consistency error is, you need to first get an insight about memory consistency. The simplest model of memory consistency has been introduced by Lamport in 1979. Here is the original definition.
The result of any execution is the same as if the operations of all the processes were executed in some sequential order and the operations of each individual process appear in this sequence in the order specified by its program
Now, to see what a real memory consistency error might look like, consider the example multi-threaded program illustrated in a more recent research paper about sequential consistency (Chen et al., listed in the sources below).
To finally answer your question, please note the following points:
A memory consistency error always depends on the underlying memory model (a particular programming language may allow more behaviours for optimization purposes). What the best memory model is remains an open research question.
The example given above shows a sequential consistency violation, but there is no guarantee that you can observe it in your favorite programming language, for two reasons: it depends on the language's exact memory model, and due to nondeterminism, you have no way to force a particular incorrect execution.
Memory models are a wide topic. To get more information, you can for example have a look at Torsten Hoefler and Markus Püschel's course at ETH Zürich, from which I understood most of these concepts.
Sources
Leslie Lamport. How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs, 1979
Wei-Yu Chen, Arvind Krishnamurthy, Katherine Yelick, Polynomial-Time Algorithms for Enforcing Sequential Consistency in SPMD Programs with Arrays, 2003
Design of Parallel and High-Performance Computing course, ETH Zürich
I have a class something like this:
public class Outer {
    public static final TaskUpdater TASK_UPDATER = new TaskUpdater() {
        public void doSomething(Task task) {
            // uses and modifies task and some other logic
        }
    };

    public void taskRelatedMethod() {
        // some logic
        TASK_UPDATER.doSomething(new Task());
        // some other logic
    }
}
I've noticed some strange behaviour when running this in a multi-threaded environment that I can't reproduce locally, and I suspect it's a threading issue. Is it possible for two instances of Outer to somehow interfere with each other by both calling doSomething on TASK_UPDATER? Each will be passing a different instance of Task into the doSomething method.
Is it possible for two instances of Outer to somehow interfere with each other by both calling doSomething on TASK_UPDATER?
The answer is "it depends". Any time you have multiple threads sharing the same object instances, you may have concurrency issues. In your case, you have multiple instances of Outer sharing the same static TaskUpdater instance. This in itself is not a problem. However, if TaskUpdater has any fields, they will be shared by the threads; if the threads modify those fields in any way, then data synchronization is needed, and possibly blocking around critical code sections. If the TaskUpdater only reads and operates on the Task argument, which seems to be per Outer instance, then there is no problem.
For example, you could have a task updater like:
public static final TaskUpdater TASK_UPDATER = new TaskUpdater() {
    public void doSomething(Task task) {
        int total = 0;
        for (Job job : task.getJobs()) {
            total += job.getSize();
        }
        task.setTotalSize(total);
    }
};
In this case, the updater is only changing the Task instance passed in. It can use local variables without a problem because those live on the stack and are not shared between threads. This is thread safe.
However consider this updater:
public static final TaskUpdater TASK_UPDATER = new TaskUpdater() {
    private long total = 0;

    public void doSomething(Task task) {
        for (Job job : task.getJobs()) {
            // race condition and memory synchronization issues here
            total += job.getSize();
        }
    }

    public long getTotal() {
        return total;
    }
};
In this case, both threads will be updating the same total field on the shared TaskUpdater. This is not thread safe, since you have race conditions around the += (which is really three operations: get, add, set) as well as memory synchronization issues. One thread may have a cached version of total which is 5 and increment it to 6, while another thread has already incremented its own cached version of total to 10.
When threads share common fields, you need to protect those operations and worry about synchronization in terms of mutex access and memory publishing. In this case, making total an AtomicLong is in order.
private AtomicLong total = new AtomicLong(0);
...
total.addAndGet(job.getSize());
AtomicLong wraps a volatile long so the memory is published appropriately to all threads and it has code that does atomic test/set operations which removes the race conditions.
When I create a new object in a thread, where that object is an attribute of an object I am passing to the thread, it stays null in the main function (but only without the System.out). I wrote a simple example of my problem, which has the same result:
public class T1 {
    public T2 t2;
}

public class T2 {
    public String s;

    /**
     * @param args
     */
    public static void main(String[] args) {
        T1 t1 = new T1();
        T3 thread = new T3(t1);
        thread.start();
        while (t1.t2 == null) {
            // System.out.println("null");
        }
        System.exit(0);
    }
}

public class T3 extends Thread {
    public T1 t1;

    public T3(T1 t1) {
        this.t1 = t1;
    }

    @Override
    public void run() {
        try {
            Thread.sleep(1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        t1.t2 = new T2();
        while (true) {
            System.out.println(t1.t2);
        }
    }
}
So without System.out.println("null") it results in an infinite loop, but when I add the System.out it behaves as I expect. I even get the same result or problem if I use static variables.
Is there some sort of optimization or something else I don't understand? Why is t1.t2 always == null without System.out.println("null")? I thought the T1 object and its attributes (in this case the object t2) would be created on the heap, which is shared between all threads, and only the t1 reference variable would be stored on the stack. So hopefully someone can explain to me why it stays null without the System.out. The problem only occurs if the thread performs its write after the while loop has started, which is why there is a sleep(1000).
So without System.out.println("null") it results in an infinite loop, but when I add the System.out it behaves as I expect. I even get the same result or problem if I use static variables.
If a thread is updating a value that another thread is reading, there must be some sort of memory synchronization. When you add the System.out.println(...), it uses the underlying PrintStream, which is a synchronized class, so the call to println(...) is what synchronizes the memory between the threads.
Here's some good information around memory synchronization from Oracle.
You should add volatile to the T2 t2; field to make the updates to t2 visible between threads.
The real problem here is that with modern multi-CPU (and multi-core) hardware, each CPU has its own high-speed memory caches. Modern OS and JVM software makes use of these physical (and virtual) CPUs to schedule threads to run in parallel simultaneously. These caches are a critical part of threading performance: if every read and every write had to go to central storage, your application would run orders of magnitude slower. Memory synchronization flushes the caches so that local writes get written to central storage, and locally cached reads are marked dirty so they have to be re-read from central storage when necessary.
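A minimal, self-contained sketch of the volatile fix (the class and field names are mine, not from the question): with the field declared volatile, the spinning loop is guaranteed to see the worker thread's write and terminate.

```java
public class VisibilityFix {
    static class Box {
        // volatile: the writer thread's store becomes visible to the spinning reader
        volatile Object value;
    }

    public static void main(String[] args) {
        final Box box = new Box();
        new Thread(() -> box.value = new Object()).start();
        // guaranteed to terminate: each iteration re-reads the volatile field
        while (box.value == null) { }
        System.out.println("done");
    }
}
```

Without volatile, the JIT may hoist the read of box.value out of the loop, producing exactly the infinite loop described in the question.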
I have read an article concerning atomic operations in Java but still have some doubts that need to be clarified:
volatile int num;

public void doSomething() {
    num = 10; // write operation
    System.out.println(num); // read
    num = 20; // write
    System.out.println(num); // read
}
So I have done w-r-w-r, 4 operations in 1 method. Are they atomic operations? What will happen if multiple threads invoke the doSomething() method simultaneously?
An operation is atomic if no thread will see an intermediary state, i.e. the operation will either have completed fully, or not at all.
Reading an int field is an atomic operation, i.e. all 32 bits are read at once. Writing an int field is also atomic, the field will either have been written fully, or not at all.
However, the method doSomething() is not atomic; a thread may yield the CPU to another thread while the method is being executed, and that thread may see that some, but not all, of the operations have been executed.
That is, if threads T1 and T2 both execute doSomething(), the following may happen:
T1: num = 10;
T2: num = 10;
T1: System.out.println(num); // prints 10
T1: num = 20;
T1: System.out.println(num); // prints 20
T2: System.out.println(num); // prints 20
T2: num = 20;
T2: System.out.println(num); // prints 20
If doSomething() were synchronized, its atomicity would be guaranteed, and the above scenario impossible.
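A sketch of that synchronized variant (the harness code is mine): because the whole w-r-w-r sequence holds the class lock, each thread's four operations run as one indivisible unit, so the interleaving shown above becomes impossible.

```java
public class AtomicMethod {
    static volatile int num;

    // synchronized: the whole w-r-w-r sequence runs as one unit per thread
    public static synchronized void doSomething() {
        num = 10;
        System.out.println(num);
        num = 20;
        System.out.println(num);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(AtomicMethod::doSomething);
        Thread t2 = new Thread(AtomicMethod::doSomething);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // each thread prints 10 then 20, never interleaved
    }
}
```

Whichever thread acquires the lock second simply waits; the output is always two complete 10/20 pairs, though the order in which the threads go is still nondeterministic.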
volatile ensures that if you have a thread A and a thread B, any change to that variable will be seen by both. So if thread A changes the value at some point, thread B can see that change later.
Atomic operations ensure that the execution of the operation happens "in one step." This is somewhat confusing, because looking at the code, x = 10; may appear to be "one step", but it actually requires several steps on the CPU. An atomic operation can be formed in a variety of ways, one of which is locking using synchronized:
The lock of an object (or of the Class in the case of static methods) is acquired, and no two threads can hold it at the same time.
As you asked in a comment earlier, even if you had three separate atomic steps that thread A was executing at some point, there's a chance that thread B could begin executing in the middle of those three steps. To ensure the thread safety of the object, all three steps would have to be grouped together to act like a single step. This is part of the reason locks are used.
A very important thing to note is that if you want to ensure that your object can never be accessed by two threads at the same time, all of your methods must be synchronized. A non-synchronized method that accesses the values stored in the object would compromise the thread safety of the class.
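For example, a fully synchronized counter might look like this (a sketch; the class is mine): both the write path and the read path go through the same lock, so no caller can observe a stale or half-updated value.

```java
public class SafeCounter {
    private int count = 0;

    public synchronized void increment() {
        count++; // read-add-store runs under the lock as one unit
    }

    // the read must be synchronized too, or a caller may see a stale value
    public synchronized int get() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        SafeCounter c = new SafeCounter();
        Runnable work = () -> {
            for (int i = 0; i < 10000; i++) c.increment();
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(c.get()); // 20000
    }
}
```

Leaving get() unsynchronized would be exactly the kind of hole described above: the increments would still be mutually exclusive, but a reader could see an out-of-date count.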
You may be interested in the java.util.concurrent.atomic library. I'm also no expert on these matters, so I would suggest a book that was recommended to me: Java Concurrency in Practice
Each individual read and write of a volatile variable is atomic. This means a thread won't see the value of num changing while reading it, but it can still change between statements. So a thread running doSomething while other threads are doing the same will print a 10 or 20 followed by another 10 or 20. After all threads have finished calling doSomething, the value of num will be 20.
My answer modified according to Brian Roach's comment.
It's atomic because it is an int in this case.
volatile can only guarantee visibility among threads, not atomicity. volatile lets you see the change to the integer, but cannot guarantee the integrity of compound changes.
For example, non-volatile long and double can expose unexpected intermediate states, because a 64-bit write is allowed to be split into two 32-bit halves.
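This can be sketched as follows (the harness is mine): a writer alternates a 64-bit value between all-zero and all-one bits while a reader checks for a torn (half-updated) value. With value declared volatile, the JLS forbids word tearing, so no torn read should ever appear; without volatile, a 32-bit JVM would be permitted to split the write and the reader could observe a mixed value.

```java
public class LongAtomicity {
    // volatile guarantees the 64-bit read/write is atomic (JLS §17.7)
    static volatile long value = 0L;

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                value = (i % 2 == 0) ? 0L : -1L; // all-zero bits vs. all-one bits
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < 1_000_000; i++) {
                long v = value;
                if (v != 0L && v != -1L) {
                    System.out.println("torn read: " + v); // half of one write, half of another
                }
            }
        });
        writer.start();
        reader.start();
        writer.join();
        reader.join();
        System.out.println("no torn reads observed");
    }
}
```

On common 64-bit JVMs even the non-volatile version happens not to tear, which is why this behaviour is easy to overlook; the volatile modifier turns the accident into a guarantee.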
Atomic Operations and Synchronization:
Atomic executions are performed as a single unit of work, without interference from other executions. Atomic operations are required in a multi-threaded environment to avoid data inconsistency.
Reading or writing an int value is an atomic operation. But if it happens inside a method that is not synchronized, many threads can run that method concurrently, which can lead to inconsistent values. In particular, int++ is not an atomic operation, so by the time one thread reads the value and increments it by one, another thread may have read the older value, leading to a wrong result.
To solve this data inconsistency, we have to make sure that the increment operation on count is atomic. We can do that using synchronization, but since Java 5, java.util.concurrent.atomic provides wrapper classes for int and long that achieve this atomically without synchronization.
Using a plain int might create data inconsistencies, as shown below:
public class AtomicClass {
    public static void main(String[] args) throws InterruptedException {
        ThreadProcessing pt = new ThreadProcessing();
        Thread thread_1 = new Thread(pt, "thread_1");
        thread_1.start();
        Thread thread_2 = new Thread(pt, "thread_2");
        thread_2.start();
        thread_1.join();
        thread_2.join();
        System.out.println("Processing count=" + pt.getCount());
    }
}

class ThreadProcessing implements Runnable {
    private int count;

    @Override
    public void run() {
        for (int i = 1; i < 5; i++) {
            processSomething(i);
            count++;
        }
    }

    public int getCount() {
        return this.count;
    }

    private void processSomething(int i) {
        // processing some job
        try {
            Thread.sleep(i * 1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
OUTPUT: the count value varies among 5, 6, 7 and 8.
We can resolve this using java.util.concurrent.atomic, which will always output the count value as 8, because the AtomicInteger method incrementAndGet() atomically increments the current value by one, as shown below:
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicClass {
    public static void main(String[] args) throws InterruptedException {
        ThreadProcessing pt = new ThreadProcessing();
        Thread thread_1 = new Thread(pt, "thread_1");
        thread_1.start();
        Thread thread_2 = new Thread(pt, "thread_2");
        thread_2.start();
        thread_1.join();
        thread_2.join();
        System.out.println("Processing count=" + pt.getCount());
    }
}

class ThreadProcessing implements Runnable {
    private AtomicInteger count = new AtomicInteger();

    @Override
    public void run() {
        for (int i = 1; i < 5; i++) {
            processSomething(i);
            count.incrementAndGet();
        }
    }

    public int getCount() {
        return this.count.get();
    }

    private void processSomething(int i) {
        // processing some job
        try {
            Thread.sleep(i * 1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
Source: Atomic Operations in java