How to Avoid Data Races - Two Examples

I was told that the following code example has a data race condition (assuming multiple threads, of course):
class C {
    private int x = 0;
    private int y = 0;
    void f() {
        x = 1;
        y = 1;
    }
    void g() {
        int a = y;
        int b = x;
        assert(b >= a);
    }
}
Yet, I am told that the following "fix" does not have data races:
class C {
    private int x = 0;
    private int y = 0;
    void f() {
        synchronized(this) { x = 1; }
        synchronized(this) { y = 1; }
    }
    void g() {
        int a, b;
        synchronized(this) { a = y; }
        synchronized(this) { b = x; }
        assert(b >= a);
    }
}
Understandably, there are other problems with the above examples, but I just want to know why the second code block has no race conditions. How does synchronizing each assignment statement eliminate the data race condition? What is the significance of synchronizing only a single assignment statement at a time?
Just to clarify, data race is defined as such:
Data races: Simultaneous read/write or write/write of the same
memory location

In the first example the data race condition will be noticed by having the assert fail.
So how is this possible? y > x should always be false, as y is written after x and read before x.
Even if you consider every interleaving of
Thread 1        Thread 2
----------------------------------
read y
read x
                write x 1
                write y 1
you should always have x >= y.
But under safe semantics, if a read of v overlaps the execution of a write to v, there is no guarantee on the value read.
v is 0
T1 write 1: wwwwwwwww
T2 read   :     rrrrr
T3 read   :              rrrrr
In this case the value read by T2 can be anything, like 42. Meanwhile, the value read by T3 is guaranteed to be 1.
In the first case a and b can be anything, so the assertion may fail.
The "fix" offers the guarantee that the data race (a concurrent read/write) will never occur, and that a and b will always be either 0 or 1.

Whoever told you this was wrong; the race condition (changing x and y before the assert; actually, just assert (x >= y); has the same problem) is still present if you synchronize separately.
A JIT JVM might very well perform lock coarsening and move both pairs of assignments into a single synchronized block, but that's not guaranteed by the language semantics.
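If the goal is for each pair of assignments to be seen as one unit, a minimal sketch is to put each pair in a single synchronized region yourself rather than rely on lock coarsening. The class name PairedState and the boolean return from g() are my own choices for illustration, not part of the question's code:

```java
// Hypothetical class: one lock scope per logical operation.
class PairedState {
    private int x = 0;
    private int y = 0;

    // Both writes happen under one lock acquisition, so a reader
    // holding the same lock never observes the pair mid-update.
    synchronized void f() {
        x = 1;
        y = 1;
    }

    // Both reads happen under one lock acquisition, so they see a
    // consistent snapshot: either (0, 0) or (1, 1).
    synchronized boolean g() {
        int a = y;
        int b = x;
        return b >= a;
    }

    public static void main(String[] args) {
        PairedState s = new PairedState();
        Thread writer = new Thread(s::f);
        writer.start();
        boolean ok = s.g();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        System.out.println("invariant held: " + ok);
    }
}
```

With both methods locking on the same object, g() runs entirely before or entirely after f(), which is the language-level guarantee that coarsening would only give you by luck.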

The synchronized keyword is all about different threads reading and writing to the same variables, objects and resources. This is not a trivial topic in Java, but here is a quote from Sun:
Synchronized methods enable a simple strategy for preventing thread interference and memory consistency errors: if an object is visible to more than one thread, all reads or writes to that object's variables are done through synchronized methods.
In a very, very small nutshell: When you have two threads that are reading and writing to the same 'resource', say a variable named foo, you need to ensure that these threads access the variable in an atomic way. Without the synchronized keyword, your thread 1 may not see the change thread 2 made to foo, or worse, it may only be half changed. This would not be what you logically expect.

Related

Multithreading issue in Java, different results at runtime

Whenever I run this program it gives me different result. Can someone explain to me, or give me some topics where I could find answer in order to understand what happens in the code?
class IntCell {
    private int n = 0;
    public int getN() {return n;}
    public void setN(int n) {this.n = n;}
}
public class Count extends Thread {
    static IntCell n = new IntCell();
    public void run() {
        int temp;
        for (int i = 0; i < 200000; i++) {
            temp = n.getN();
            n.setN(temp + 1);
        }
    }
    public static void main(String[] args) {
        Count p = new Count();
        Count q = new Count();
        p.start();
        q.start();
        try { p.join(); q.join(); }
        catch (InterruptedException e) { }
        System.out.println("The value of n is " + n.getN());
    }
}

The reason is simple: you do not read and modify your counter atomically, so your code is prone to race conditions.
Here is an example that illustrates the problem:
Thread #1 calls n.getN() and gets 0
Thread #2 calls n.getN() and gets 0
Thread #1 calls n.setN(1) to set n to 1
Thread #2 is not aware that thread #1 has already set n to 1, so it still calls n.setN(1), setting n to 1 instead of the 2 you would expect. This is called a race condition.
Your final result then depends on how many times this race occurred while executing your code, which is unpredictable, so it changes from one run to another.
One way to fix it is to get and set your counter inside a synchronized block so that the two steps happen atomically, as shown next. This forces each thread to acquire an exclusive lock on the instance of IntCell assigned to n before it can execute this section of code.
synchronized (n) {
    temp = n.getN();
    n.setN(temp + 1);
}
Output:
The value of n is 400000
You could also consider using AtomicInteger instead of int for your counter in order to rely on methods of type addAndGet(int delta) or incrementAndGet() to increment your counter atomically.
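As a rough sketch of the AtomicInteger route, here is the same two-thread count with the IntCell replaced by an AtomicInteger. The class name AtomicCount and the helper runCounters are made up for this example:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical rewrite of the Count example using AtomicInteger.
class AtomicCount {
    // Starts two threads that each increment a shared counter
    // perThread times, then returns the final value.
    static int runCounters(int perThread) {
        AtomicInteger n = new AtomicInteger(0);
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                n.incrementAndGet(); // atomic read-modify-write
            }
        };
        Thread p = new Thread(work);
        Thread q = new Thread(work);
        p.start();
        q.start();
        try {
            p.join();
            q.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return n.get();
    }

    public static void main(String[] args) {
        System.out.println("The value of n is " + runCounters(200000));
    }
}
```

No increment is lost because incrementAndGet() performs the read, add, and write as one indivisible step, so no explicit lock is needed for this single counter.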

Your two threads access the static IntCell n variable concurrently:
static IntCell n = new IntCell();
public void run() {
    int temp;
    for (int i = 0; i < 200000; i++) {
        temp = n.getN();
        n.setN(temp + 1);
    }
}
A race condition means you cannot predict the result when n.setN(temp + 1); is performed, because it depends on which thread last wrote n before this thread called temp = n.getN();.
If it was the current thread, temp holds the value this thread wrote; otherwise it holds the last value written by the other thread.
You can add a synchronization mechanism to avoid this unpredictable behavior.

You are running 2 threads in parallel, and both update a shared variable without coordination; that is why your answer differs between runs. It is not good practice to update a shared variable like this.
To understand this, you should first learn the basics of multithreading, and then wait and notify, starting with simple cases.

You modify the same number n from two concurrent threads. If Thread1 reads n = 2 and then Thread2 reads n = 2 before Thread1 has written its increment, Thread1 will increment n to 3, but Thread2 will not increment further; it will write another 3 to n. If Thread1 finishes its increment before Thread2 reads, both increments take effect.
Both threads run concurrently, and you can never tell which one will get which CPU cycle. This depends on what else runs on your machine. So you will lose a different number of increments on each run through the overwriting described above.
To solve it, make the increment itself atomic. Note that n++ on a plain int field is not enough: it is still a separate read, add, and write, so it can lose updates in the same way. Use a synchronized block around the read and write, or an AtomicInteger with incrementAndGet().

Understanding happens-before and synchronization [duplicate]

I'm trying to understand the Java happens-before order concept, and there are a few things that seem very confusing. As far as I can tell, happens-before is just an order on the set of actions and does not provide any guarantees about real-time execution order. Actually (emphasis mine):
It should be noted that the presence of a happens-before relationship
between two actions does not necessarily imply that they have to take
place in that order in an implementation. If the reordering produces
results consistent with a legal execution, it is not illegal.
So, all it says is that if there are two actions w (write) and r (read) such that hb(w, r), then r might actually happen before w in an execution, but there's no guarantee that it will. Also, the write w is observed by the read r.
How can I determine that two actions are performed one after the other at run time? For instance:
public volatile int v;
public int c;
Actions:
Thread A
v = 3; //w
Thread B
c = v; //r
Here we have hb(w, r) but that doesn't mean that c will contain value 3 after assignment. How do I enforce that c is assigned with 3? Does synchronization order provide such guarantees?

When the JLS says that some event X in thread A establishes a happens before relationship with event Y in thread B, it does not mean that X will happen before Y.
It means that IF X happens before Y, then both threads will agree that X happened before Y. That is to say, both threads will see the program's memory in a state that is consistent with X happening before Y.
It's all about memory. Threads communicate through shared memory, but when there are multiple CPUs in a system, all trying to access the same memory system, then the memory system becomes a bottleneck. Therefore, the CPUs in a typical multi-CPU computer are allowed to delay, re-order, and cache memory operations in order to speed things up.
That works great when threads are not interacting with one another, but it causes problems when they actually do want to interact: If thread A stores a value into an ordinary variable, Java makes no guarantee about when (or even if) thread B will see the value change.
In order to overcome that problem when it's important, Java gives you certain means of synchronizing threads. That is, getting the threads to agree on the state of the program's memory. The volatile keyword and the synchronized keyword are two means of establishing synchronization between threads.
I think the reason they called it "happens before" is to emphasize the transitive nature of the relationship: If you can prove that A happens before B, and you can prove that B happens before C, then according to the rules specified in the JLS, you have proved that A happens before C.

I would like to associate the above statement with some sample code flow.
To understand this, let us take the class below, which has two fields, counter and isActive.
class StateHolder {
    private int counter = 100;
    private boolean isActive = false;
    public synchronized void resetCounter() {
        counter = 0;
        isActive = true;
    }
    public synchronized void printStateWithLock() {
        System.out.println("Counter : " + counter);
        System.out.println("IsActive : " + isActive);
    }
    public void printStateWithNoLock() {
        System.out.println("Counter : " + counter);
        System.out.println("IsActive : " + isActive);
    }
}
And assume that there are three threads T1, T2, and T3 calling the following methods on the same StateHolder object:
T1 calls resetCounter() and T2 calls printStateWithLock() at the same time, and T1 gets the lock
T3 calls printStateWithNoLock() after T1 has completed its execution
Recall the statement quoted in the question:
It should be noted that the presence of a happens-before relationship between two actions does not necessarily imply that they have to take place in that order in an implementation. If the reordering produces results consistent with a legal execution, it is not illegal.
As per this statement, the JVM, OS, or underlying hardware has the flexibility to reorder the statements within the resetCounter() method. So when T1 runs, it could execute the statements in the order below.
public synchronized void resetCounter() {
    isActive = true;
    counter = 0;
}
This is in line with the statement "does not necessarily imply that they have to take place in that order in an implementation".
Now, from T2's perspective, this reordering has no negative impact: both T1 and T2 synchronize on the same object, so T2 is guaranteed to see the changes to both fields, whether or not the reordering happened, because there is a happens-before relationship. So the output will always be:
Counter : 0
IsActive : true
This is as per the statement "If the reordering produces results consistent with a legal execution, it is not illegal".
But from T3's perspective, with this reordering it is possible that T3 will see the updated value of isActive as true but still see the counter value as 100, although T1 has completed its execution.
Counter : 100
IsActive : true
The next point in the above link further clarifies the statement and says that:
More specifically, if two actions share a happens-before relationship, they do not necessarily have to appear to have happened in that order to any code with which they do not share a happens-before relationship. Writes in one thread that are in a data race with reads in another thread may, for example, appear to occur out of order to those reads.
In this example T3 has encountered this problem because it doesn't have any happens-before relationship with T1 or T2. This is in line with "do not necessarily have to appear to have happened in that order to any code with which they do not share a happens-before relationship".
NOTE: To simplify the case, we have single thread T1 modifying the state and T2 and T3 reading the state. It is possible to have
T1 updates counter to 0, later
T2 modifies isActive to true and sees counter is 0, after sometime
T3 that prints the state could still see only isActive as true but counter is 100, although both T1 and T2 have completed the execution.
As to the last question:
we have hb(w, r) but that doesn't mean that c will contain value 3 after assignment. How do I enforce that c is assigned with 3?
public volatile int v;
public int c;
Thread A
v = 3; //w
Thread B
c = v; //r
Since v is volatile, the Happens-before Order rules say:
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
So if Thread B's read of v comes after Thread A's write in the synchronization order, it is guaranteed to read the updated value, and c will be assigned 3. If the read comes first, c gets the earlier value of v; volatile alone does not force Thread B to wait for the write.

Interpreting @James' answer to my liking:
// Definition: Some variables
private int first = 1;
private int second = 2;
private int third = 3;
private volatile boolean hasValue = false;
// Thread A
first = 5;
second = 6;
third = 7;
hasValue = true;
// Thread B
System.out.println("Flag is set to : " + hasValue);
System.out.println("First: " + first); // prints 5 if hasValue was read as true
System.out.println("Second: " + second); // prints 6 if hasValue was read as true
System.out.println("Third: " + third); // prints 7 if hasValue was read as true
If you want the state of memory (main memory and CPU caches) as seen by one thread at the time it writes a variable to be visible to every subsequent read of that same variable by another thread, mark that variable volatile.
Here, the state of memory seen at the write hasValue = true in Thread A is: first is 5, second is 6, third is 7.
Why "every subsequent" read, when Thread B reads only once in this example? Because there may be a Thread C doing exactly the same as Thread B.
If X (hasValue = true) in Thread A happens before Y (the println of hasValue) in Thread B, the behaviour should be as if X happened before Y in the same thread: the memory values seen at X are the same ones seen from Y onward.

Here we have hb(w, r) but that doesn't mean that c will contain value 3 after assignment. How do I enforce that c is assigned with 3? Does synchronization order provide such guarantees?
And your example
public volatile int v;
public int c;
Actions:
Thread A
v = 3; //w
Thread B
c = v; //r
You don't need volatile for v in your example. Let's take a look at a similar example
int v = 0;
int c = 0;
volatile boolean assigned = false;
Actions:
Thread A
v = 3;
assigned = true;
Thread B
while(!assigned);
c = v;
The assigned field is volatile.
Thread B executes c = v only after assigned becomes true (the while(!assigned) loop is responsible for that).
Because assigned is volatile, we have a happens-before relationship.
Happens-before means that if we see assigned == true, we also see everything that happened before the statement assigned = true, so we will see v = 3.
So when assigned == true, we have v = 3.
As a result, c = 3.
What will happen without volatile
int v = 0;
int c = 0;
boolean assigned = false;
Actions:
Thread A
v = 3;
assigned = true;
Thread B
while(!assigned);
c = v;
Now assigned is not volatile.
In this situation the value of c in Thread B can be either 0 or 3, so there is no guarantee that c == 3. Thread B may also never observe assigned == true and spin in the loop forever.
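For reference, here is a small runnable sketch of the volatile variant described above. The class name VolatileHandoff and its method names are invented for the example; the fields v and assigned follow the snippet:

```java
// Hypothetical wrapper around the v / assigned example.
class VolatileHandoff {
    private int v = 0;                          // plain field
    private volatile boolean assigned = false;  // volatile flag

    // Thread A: the write to v comes before the volatile write.
    void publish() {
        v = 3;
        assigned = true;
    }

    // Thread B: spin until the volatile read sees true, then read v.
    int awaitAndRead() {
        while (!assigned) {
            // busy-wait; each iteration re-reads the volatile flag
        }
        return v; // guaranteed to be 3 by happens-before
    }

    static int run() {
        VolatileHandoff h = new VolatileHandoff();
        Thread a = new Thread(h::publish);
        a.start();
        int c = h.awaitAndRead();
        try {
            a.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return c;
    }

    public static void main(String[] args) {
        System.out.println("c = " + run());
    }
}
```

The plain write to v is ordered before the volatile write to assigned, and the volatile read that sees true is ordered before the read of v, so the chain carries the value 3 across threads.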

Volatile and atomic operation in java

I have read an article about atomic operations in Java but still have some doubts that need clarifying:
volatile int num;
public void doSomething() {
    num = 10; // write operation
    System.out.println(num); // read
    num = 20; // write
    System.out.println(num); // read
}
So I have done four operations (w-r-w-r) in one method. Are they atomic operations? What will happen if multiple threads invoke the doSomething() method simultaneously?

An operation is atomic if no thread will see an intermediary state, i.e. the operation will either have completed fully, or not at all.
Reading an int field is an atomic operation, i.e. all 32 bits are read at once. Writing an int field is also atomic, the field will either have been written fully, or not at all.
However, the method doSomething() is not atomic; a thread may yield the CPU to another thread while the method is executing, and that thread may see that some, but not all, of the operations have been executed.
That is, if threads T1 and T2 both execute doSomething(), the following may happen:
T1: num = 10;
T2: num = 10;
T1: System.out.println(num); // prints 10
T1: num = 20;
T1: System.out.println(num); // prints 20
T2: System.out.println(num); // prints 20
T2: num = 20;
T2: System.out.println(num); // prints 20
If doSomething() were synchronized, its atomicity would be guaranteed, and the above scenario impossible.
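A sketch of that synchronized variant follows, assuming we return the two observed values instead of printing them so the result can be checked. SyncedSteps and allConsistent are names I made up:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical synchronized version of doSomething().
class SyncedSteps {
    private int num; // guarded by the lock on "this"

    // All four steps run under one lock, so no other caller of
    // doSomething() can change num between the writes and reads.
    synchronized int[] doSomething() {
        num = 10;
        int first = num;
        num = 20;
        int second = num;
        return new int[] { first, second };
    }

    // Runs several threads and reports whether every call
    // observed 10 followed by 20.
    static boolean allConsistent(int threads, int callsPerThread) {
        SyncedSteps shared = new SyncedSteps();
        AtomicBoolean ok = new AtomicBoolean(true);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < callsPerThread; k++) {
                    int[] seen = shared.doSomething();
                    if (seen[0] != 10 || seen[1] != 20) {
                        ok.set(false);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return ok.get();
    }

    public static void main(String[] args) {
        System.out.println("every call saw 10 then 20: " + allConsistent(4, 10000));
    }
}
```

Because every access to num goes through the same lock, the field does not need to be volatile here.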

volatile ensures that if you have a thread A and a thread B, any change to that variable will be seen by both. So if at some point thread A changes this value, thread B will see the change when it later reads the variable.
Atomic operations ensure that the execution of the said operation happens "in one step." This is somewhat confusing because the code 'x = 10;' may appear to be "one step" but actually requires several steps on the CPU. An atomic operation can be formed in a variety of ways, one of which is locking with synchronized, which gives you:
1. What the volatile keyword promises.
2. The lock of an object (or the Class in the case of static methods) is acquired, and no two threads can hold it at the same time.
As you asked in a comment earlier, even if you had three separate atomic steps that thread A was executing at some point, there's a chance that thread B could begin executing in the middle of those three steps. To ensure the thread safety of the object, all three steps would have to be grouped together to act like a single step. This is part of the reason locks are used.
A very important thing to note is that if you want to ensure that your object can never be accessed by two threads at the same time, all of your methods must be synchronized. You could create a non-synchronized method on the object that would access the values stored in the object, but that would compromise the thread safety of the class.
You may be interested in the java.util.concurrent.atomic library. I'm also no expert on these matters, so I would suggest a book that was recommended to me: Java Concurrency in Practice

Each individual read and write to a volatile variable is atomic. This means that a thread won't see the value of num changing while it's reading it, but it can still change in between each statement. So a thread running doSomething while other threads are doing the same, will print a 10 or 20 followed by another 10 or 20. After all threads have finished calling doSomething, the value of num will be 20.

My answer, modified according to Brian Roach's comment.
Each access is atomic because num is an int in this case.
volatile only guarantees visibility among threads, not atomicity of compound actions. It lets you see changes to the integer, but it cannot guarantee the integrity of a sequence of changes.
For example, non-volatile long and double fields can expose an unexpected intermediate state, because a 64-bit write may be split into two 32-bit writes.

Atomic Operations and Synchronization:
An atomic operation executes as a single unit of work, unaffected by other executions. Atomic operations are required in a multi-threaded environment to avoid data inconsistency.
Reading or writing an int value is an atomic operation. But in general, if the access is inside a method that is not synchronized, many threads can run it at once, which can lead to inconsistent values. Moreover, int++ is not an atomic operation: by the time one thread has read the value and incremented it by one, another thread may have read the older value, leading to a wrong result.
To solve the data inconsistency, we have to make sure that the increment operation on count is atomic. We can do that using synchronization, but since Java 5 the java.util.concurrent.atomic package provides wrapper classes for int and long that achieve this atomically without synchronization.
Using a plain int might create data inconsistencies, as shown below:
public class AtomicClass {
    public static void main(String[] args) throws InterruptedException {
        ThreardProcesing pt = new ThreardProcesing();
        Thread thread_1 = new Thread(pt, "thread_1");
        thread_1.start();
        Thread thread_2 = new Thread(pt, "thread_2");
        thread_2.start();
        thread_1.join();
        thread_2.join();
        System.out.println("Processing count=" + pt.getCount());
    }
}
class ThreardProcesing implements Runnable {
    private int count;
    @Override
    public void run() {
        for (int i = 1; i < 5; i++) {
            processSomething(i);
            count++;
        }
    }
    public int getCount() {
        return this.count;
    }
    private void processSomething(int i) {
        // processing some job
        try {
            Thread.sleep(i * 1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
OUTPUT: the count value varies between 5, 6, 7, and 8
We can resolve this using java.util.concurrent.atomic, which will always output a count value of 8, because the AtomicInteger method incrementAndGet() atomically increments the current value by one, as shown below:
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicClass {
    public static void main(String[] args) throws InterruptedException {
        ThreardProcesing pt = new ThreardProcesing();
        Thread thread_1 = new Thread(pt, "thread_1");
        thread_1.start();
        Thread thread_2 = new Thread(pt, "thread_2");
        thread_2.start();
        thread_1.join();
        thread_2.join();
        System.out.println("Processing count=" + pt.getCount());
    }
}
class ThreardProcesing implements Runnable {
    private AtomicInteger count = new AtomicInteger();
    @Override
    public void run() {
        for (int i = 1; i < 5; i++) {
            processSomething(i);
            count.incrementAndGet();
        }
    }
    public int getCount() {
        return this.count.get();
    }
    private void processSomething(int i) {
        // processing some job
        try {
            Thread.sleep(i * 1000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
Source: Atomic Operations in java

Is it safe to use conditional operators with volatile primitives in multithreaded applications

In the code listing below, are Statement 1 and Statement 2 thread safe or not? They use the VolatileIntWrapper.
If they are not thread safe, which statements need to be wrapped in a synchronized block?
public class Demo {
    public static void main(String[] args) {
        VolatileIntWrapper volatileIntWrapper = new VolatileIntWrapper() ;
        for(int i = 1 ; i <= 5 ; ++i){
            new ModifyWrapperIntValue(volatileIntWrapper).start() ;
        }
    }
}
class VolatileIntWrapper{
    public volatile int value = 0 ;
}
class ModifyWrapperIntValue extends Thread{
    private VolatileIntWrapper wrapper ;
    private int counter = 0 ;
    public ModifyWrapperIntValue(VolatileIntWrapper viw) {
        this.wrapper = viw ;
    }
    @Override
    public void run() {
        //randomly increments or decrements VolatileIntWrapper primitive int value
        //we can use below statement also, if value in VolatileIntWrapper is private
        // wrapper.getValue() instead of wrapper.value
        //but, as per my understanding, it will add more complexity to logic(might be requires additional synchronized statements),
        //so, for simplicity, we declared it public
        //Statement 1
        while(wrapper.value > -1500 && wrapper.value < 1500){
            ++counter ;
            int randomValue = (int) (Math.random() * 2) ;
            //Statement 2
            wrapper.value += (randomValue == 0) ? 1 : -1 ;
        }
        System.out.println("Executed " + counter + " times...");
    }
}

The volatile keyword provides a memory barrier for both reading and writing a field. That means that multiple threads can access the field and be guaranteed to read the most current value and their writes are guaranteed to be seen by other threads.
What volatile does not do is provide any guarantees around the order of operations -- especially when you have multiple read and write statements. In your code you are accessing the volatile int a couple of places in your loop:
while(wrapper.value > -1500 && wrapper.value < 1500){
...
wrapper.value += (randomValue == 0) ? 1 : -1 ;
}
There are no guarantees as to the order of operations here. Immediately after thread A tests the value > -1500, another thread might change it before thread A can test value < 1500. Or thread A might do both tests, then thread B might do both tests, then thread A would assign the value, and then thread B would assign the value. That is the nature of multithreading race conditions.
The while loop is the section of code that I suspect would be considered to have a bug unless you synchronize around it. You should do something like the following. Once you synchronize that section, the synchronized keyword provides the memory barrier itself, so the volatile keyword is unnecessary.
synchronized (wrapper) {
    while (...) {
        ...
    }
}
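As an alternative to holding a lock, the bounds check and the update can be fused into one atomic step with a compare-and-set loop. This is only a sketch of a different technique than the one above, and BoundedWalk, step, and run are names I invented for it:

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical lock-free variant of the bounded random walk.
class BoundedWalk {
    static final int LIMIT = 1500;
    private final AtomicInteger value = new AtomicInteger(0);

    // Applies delta only while the value is strictly inside
    // (-LIMIT, LIMIT). Returns false once a bound is reached.
    boolean step(int delta) {
        while (true) {
            int current = value.get();
            if (current <= -LIMIT || current >= LIMIT) {
                return false;
            }
            // Succeeds only if no other thread changed the value
            // since we read it; otherwise retry with a fresh read.
            if (value.compareAndSet(current, current + delta)) {
                return true;
            }
        }
    }

    int get() {
        return value.get();
    }

    // Runs the walk on several threads and returns the final value.
    static int run(int threads) {
        BoundedWalk walk = new BoundedWalk();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                while (walk.step(ThreadLocalRandom.current().nextBoolean() ? 1 : -1)) {
                    // keep stepping until a bound is hit
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return walk.get();
    }

    public static void main(String[] args) {
        System.out.println("final value: " + run(5));
    }
}
```

Since the check and the write are validated against the same read, the value can never step past either bound, and the threads do not serialize on a lock for the whole loop.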

It is safe to use a volatile field once and only once. (A read and a write counts as twice)
You are using the field a total of four times so you have three places for a race condition.
The problem with this example is it is faster and simpler to be performed single threaded, so anything you do with it in a multi-threaded way will appear unnatural and inefficient.

The question needs the following interpretation:
The threads you are using are safe, and you are reading your primitive value as intended.
There is a specific way to use synchronized blocks with the primitive field; you need to do the following:
Use a getter and a setter for your field.
Put synchronized on both accessors, and voilà.

Java Concurrency in Practice says the following criteria need to be met to use volatile variables:
1. Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value;
2. The variable does not participate in invariants with other state variables; and
3. Locking is not required for any other reason while the variable is being accessed.
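A status flag is the usual case that meets all three criteria. Here is a minimal sketch, with the class name StoppableWorker and the helper stopsPromptly made up for illustration:

```java
// Hypothetical worker controlled by a volatile stop flag.
class StoppableWorker implements Runnable {
    // The write (false) never depends on the current value, the flag
    // takes part in no invariant with other state, and no lock is
    // needed while it is accessed.
    private volatile boolean running = true;

    @Override
    public void run() {
        while (running) {
            Thread.yield(); // stand-in for a unit of real work
        }
    }

    void stop() {
        running = false;
    }

    // Starts a worker, asks it to stop, and reports whether the
    // thread finished within the timeout.
    static boolean stopsPromptly() {
        StoppableWorker worker = new StoppableWorker();
        Thread t = new Thread(worker);
        t.start();
        worker.stop();
        try {
            t.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsPromptly());
    }
}
```

The counter in the question fails the first criterion, since += reads the current value before writing, which is why volatile alone is not enough there.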

Does making a field `volatile` prevent all memory visibility issues in a concurrent situation?

Does making a class field volatile prevent all memory visibility issues with it in a concurrent situation? Is it possible that, for the class below, a thread that gets a reference to a Test object sees x as 0 first (the default value of int) and then 10? I think this is possible if and only if the constructor of Test gives away the this reference without completing (improper publication). Can someone validate or correct me?
class Test {
volatile int x = 10;
}
Second question: what if it was final int x=10; ?

You are actually not guaranteed to see x = 10 according to the JMM.
For example, if you have
Test test = null;
Thread 1 -> test = new Test();
Thread 2 -> test.x == // even though test != null, x can be seen as 0 if the
                      // write of x hasn't yet occurred
Now if you had
class Test {
    int y = 3;
    volatile int x = 10;
}
If thread-2 reads x == 10, thread-2 is guaranteed to read y == 3.
To answer your second question:
Having a final field will issue a StoreStore barrier after the constructor and before publishing, so making the field final will actually ensure you see x = 10.
Edit: As yshavit noted, with final fields you lose the happens-before relationship I mention in the example above. That is, as yshavit put it, if thread-2 reads x == 10 it may not read y == 3 where x is a final field.

Even in a single-threaded implementation, you are not guaranteed to see x = 10 if you leak this in the constructor. So the issue you can experience here is not directly a concurrency issue, but an order-of-execution issue (depending on when you leak this). E.g. if you leak this in a parent constructor, for instance:
public class TestParent
{
    public TestParent()
    {
        if (this instanceof TestChild)
        {
            TestChild child = (TestChild) this;
            System.out.println(child.field); // will print 0 when TestChild is instantiated.
        }
    }
}
public class TestChild extends TestParent
{
    volatile int field = 10;
}
public static void main(String[] args)
{
    TestChild child = new TestChild();
    System.out.println(child.field);
    // The above results in 0 (from TestParent constructor) then 10 being printed.
}
Final fields, on the other hand, are guaranteed to have the assigned initial value so long as that assignment is done on the declaring line (if you make the field final but initialize it in the constructor, then you can still leak this beforehand and expose the uninitialized value).