When to use volatile vs synchronization in multithreading in Java?

When to use volatile keyword vs synchronization in multithreading?

Use volatile to guarantee that each read access to a variable will see the latest value written to that variable. Use synchronized whenever you need values to be stable for multiple instructions. Note that this does not necessarily mean multiple statements; the single statement
var++; // NOT thread safe!
is not thread-safe even if var is declared volatile, because it is a read followed by a separate write. You need to do this:
synchronized(LOCK_OBJECT){var++;}
See here for a nice summary of this issue.

volatile only ensures that a read always returns the latest value written by any thread. It provides no mutual exclusion, so two threads can interleave their updates to the volatile variable in any order, and it does not make a sequence of operations on the variable atomic.
A synchronized block, however, ensures both the latest state and write safety: access to and update of the variable are atomic inside the block.
This holds only if every access and update to the variable in question uses the same lock object, so that at no time can multiple threads access the variable at once.

That's a pretty broad question. The best answer I can give is to use synchronized when performing multiple actions that must be seen by other threads as occurring atomically—either all or none of the steps have occurred.
For a single action, volatile may be sufficient; it acts as a memory barrier to ensure visibility of the change to other threads.
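As a rough sketch of the "multiple actions" case, assume a hypothetical Range class (not part of the question) whose two fields must always change together. Making both methods synchronized means no reader can observe one field updated and the other not:

```java
// Hypothetical example: two fields that must always be updated together.
class Range {
    private int lower = 0;
    private int upper = 0;

    // Both writes happen under the object's monitor, so no other
    // synchronized method can observe lower updated but upper not.
    public synchronized void set(int newLower, int newUpper) {
        if (newLower > newUpper) {
            throw new IllegalArgumentException("lower must not exceed upper");
        }
        lower = newLower;
        upper = newUpper;
    }

    // Reads both fields under the same monitor and returns a consistent pair.
    public synchronized int[] snapshot() {
        return new int[] { lower, upper };
    }

    public static void main(String[] args) {
        Range r = new Range();
        r.set(3, 7);
        int[] s = r.snapshot();
        System.out.println(s[0] + ".." + s[1]);
    }
}
```

Declaring lower and upper volatile instead would make each write visible, but a reader could still see the new lower paired with the old upper.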

Related

Why Volatile variable isn't used for Atomicity

From Javadocs
Using volatile variables reduces the risk of memory consistency
errors, because any write to a volatile variable establishes a
happens-before relationship with subsequent reads of that same
variable. This means that changes to a volatile variable are always
visible to other threads.
If changes made to a volatile variable are always visible to every other thread, why can't a volatile variable be used when multiple threads write to it? Why is volatile only used when one thread writes to the variable while the other threads only read it?
If changes are always visible to other threads, then suppose thread B wants to write to that variable: it will see the new value (updated by thread A) and update it. And when thread A wants to write again, it will see the value updated by thread B and write to it. Where is the problem in that?
In short, I am not able to understand this:
if two threads are both reading and writing to a shared variable, then
using the volatile keyword for that is not enough. You need to use
synchronization in that case to guarantee that the reading and writing
of the variable is atomic.

There are plenty of purposes that volatile works fine for — but also plenty of purposes that it doesn't. For example, imagine that you have a field like this:
private volatile int i;
and two threads that both run ++this.i: reading this.i and then writing to it.
The problem is that ++this.i is a volatile read followed by a completely separate volatile write. Any number of things could have happened between the read and the write; in particular, you could get a situation where both threads read i before either thread writes to it. The net effect is that the value of i increases by only 1, even though two separate threads both incremented it.
AtomicInteger (and the other atomics) address this sort of problem by allowing you to simultaneously read and write in a single atomic (≈ volatile) step. (They do this by using a compare-and-swap instruction that performs the write only if the value that was read is still the current value. The increment-and-get method just runs a loop that retries this until the write actually succeeds.)
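The retry loop described above can be sketched by hand. This is only an illustration of the idea, not the JDK's actual implementation; the class and method names (CasIncrementDemo, casIncrement, run) are made up for the example, and real code would simply call incrementAndGet():

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasIncrementDemo {
    // Hand-written equivalent of an atomic increment, for illustration only.
    static int casIncrement(AtomicInteger counter) {
        while (true) {
            int current = counter.get();   // volatile read
            int next = current + 1;
            // Writes next only if the value is still current; otherwise retry.
            if (counter.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    // Starts the given number of threads, each incrementing perThread times,
    // and returns the final counter value.
    static int run(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    casIncrement(counter);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(2, 10_000)); // no increments are lost
    }
}
```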

Think about what "atomicity" means. It means that two or more operations that happen in one thread appear to happen as an atomic unit as far as other threads can tell.
So if I declare some volatile int foobar, and I write code to perform some operations on it, how would the compiler know which of those operations are supposed to be the atomic unit?
When you write a synchronized block, the atomic unit is whatever you put inside the block.

Volatile Vs Atomic [duplicate]

I read the following somewhere:
Java volatile keyword doesn't means atomic, its common misconception
that after declaring volatile, ++ operation will be atomic, to make
the operation atomic you still need to ensure exclusive access using
synchronized method or block in Java.
So what will happen if two threads attack a volatile primitive variable at same time?
Does this mean that whosoever takes lock on it, that will be setting its value first. And if in meantime, some other thread comes up and read old value while first thread was changing its value, then doesn't new thread will read its old value?
What is the difference between Atomic and volatile keyword?

The effect of the volatile keyword is approximately that each individual read or write operation on that variable is made atomically visible to all threads.
Notably, however, an operation that requires more than one read/write -- such as i++, which is equivalent to i = i + 1, which does one read and one write -- is not atomic, since another thread may write to i between the read and the write.
The Atomic classes, like AtomicInteger and AtomicReference, provide a wider variety of operations atomically, specifically including increment for AtomicInteger.

Volatile and Atomic are two different concepts. Volatile ensures, that a certain, expected (memory) state is true across different threads, while Atomics ensure that operation on variables are performed atomically.
Take the following example of two threads in Java:
Thread A:
value = 1;
done = true;
Thread B:
if (done)
System.out.println(value);
Starting with value = 0 and done = false, the Java memory model makes no guarantee about whether Thread B will print anything, nor about which value it prints if it does: it may print 0 or 1. To explain this you need to know a bit about the Java memory model (which can be complex). In short: threads may work with local copies of variables, and the JVM can reorder code to optimize it, so there is no guarantee that the above code runs in exactly that order. Setting done to true and then setting value to 1 is a possible outcome of the JIT optimizations.
volatile ensures that a write to such a variable becomes visible to all other threads that subsequently read it, and that operations are not reordered across the access, so the code is in the state you would expect. In the code above, declaring done as volatile ensures that whenever Thread B checks the variable it is either false or true, and if it is true, then value has been set to 1 as well.
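A minimal sketch of that guarantee, assuming a made-up PublishDemo class: value stays a plain field and only done is volatile. Thread B spins until it sees done == true, and the happens-before edge from the volatile write means it must then also see value == 1:

```java
class PublishDemo {
    private static int value = 0;                  // plain field
    private static volatile boolean done = false;  // volatile guard

    // Returns the value Thread B observes once it has seen done == true.
    static int run() {
        value = 0;
        done = false;
        int[] seen = new int[1];
        Thread b = new Thread(() -> {
            while (!done) {
                Thread.yield();    // busy-wait until the volatile flag flips
            }
            seen[0] = value;       // ordered after the volatile read of done
        });
        Thread a = new Thread(() -> {
            value = 1;             // plain write ...
            done = true;           // ... published by the volatile write
        });
        b.start();
        a.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Without volatile on done, the loop in Thread B would not be guaranteed to terminate, and even if it did, value could still be read as 0.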
As a side effect of volatile, reads and writes of such a variable are atomic (at a very minor cost in execution speed). This only matters for 64-bit variables (long and double), whose non-volatile accesses may be split into two 32-bit halves, typically on 32-bit systems; in most other cases setting or reading a variable is atomic anyway. But there is an important difference between an atomic access and an atomic operation: volatile only ensures that the access is atomic, while Atomics ensure that the operation is atomic.
Take the following example:
i = i + 1;
No matter how you define i, a different thread reading the value just when the above line is executed might get i or i + 1, because the operation is not atomic. If the other thread sets i to a different value, in the worst case i could be set back by thread A to the old value + 1, because thread A was in the middle of calculating i + 1 based on the old value. Explanation:
Assume i = 0
Thread A reads i, calculates i+1, which is 1
Thread B sets i to 1000 and returns
Thread A now sets i to the result of the operation, which is i = 1
Atomics like AtomicInteger ensure that such operations happen atomically. So the above issue cannot happen: i would be either 1000 or 1001 once both threads are finished.

There are two important concepts in multithreading environment:
atomicity
visibility
The volatile keyword eradicates visibility problems, but it does not deal with atomicity. volatile also prevents the compiler and runtime from reordering other memory operations across a read or write of the volatile variable, but it does not turn a compound operation such as k++ into a single step.
Here, k++ is not a single machine instruction, but three:
copy the value to a register;
increment the value;
place it back.
So, even if you declare a variable as volatile, this will not make this operation atomic; another thread can run between the read and the write, so one thread may act on a stale value and overwrite the other's update.
On the other hand, AtomicInteger and AtomicReference are based on the compare-and-swap (CAS) instruction. CAS has three operands: a memory location V on which to operate, the expected old value A, and the new value B. CAS atomically updates V to the new value B, but only if the value in V matches the expected old value A; otherwise, it does nothing. In either case, it returns the value currently in V. (Java's compareAndSet() reports the outcome as a boolean instead.) The compareAndSet() methods of AtomicInteger and AtomicReference take advantage of this functionality if it is supported by the underlying processor; if it is not, then the JVM implements it via a spin lock.
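To make the three operands concrete, here is a small single-threaded sketch using AtomicInteger.compareAndSet, which returns true or false rather than the value in V. The class and method names are invented for the illustration:

```java
import java.util.concurrent.atomic.AtomicInteger;

class CompareAndSetDemo {
    // Shows one successful and one failed CAS on the same variable.
    static String demo() {
        AtomicInteger v = new AtomicInteger(5);      // V starts at 5
        boolean first = v.compareAndSet(5, 6);       // A = 5 matches: V becomes 6
        boolean second = v.compareAndSet(5, 7);      // A = 5 is stale: V stays 6
        return first + " " + second + " " + v.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "true false 6"
    }
}
```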

As Trying has indicated, volatile deals only with visibility.
Consider this snippet in a concurrent environment:
boolean isStopped = false;
:
:
while (!isStopped) {
// do some kind of work
}
The idea here is that some thread could change the value of isStopped from false to true in order to indicate to the subsequent loop that it is time to stop looping.
Intuitively, there is no problem. Logically, if another thread makes isStopped equal to true, then the loop must terminate. The reality is that the loop may never terminate even if another thread makes isStopped equal to true.
The reason for this is not intuitive, but consider that modern processors have multiple cores and that each core has multiple registers and multiple levels of cache memory that are not accessible to other processors. In other words, values that are cached in one processor's local memory are not visible to threads executing on a different processor. Herein lies one of the central problems with concurrency: visibility.
The Java Memory Model makes no guarantees whatsoever about when changes made to a variable in one thread become visible to other threads. To guarantee that updates are visible as soon as they are made, you must synchronize.
The volatile keyword is a weak form of synchronization. While it does nothing for mutual exclusion or atomicity, it does guarantee that a change made to a variable in one thread becomes visible to other threads as soon as it is made. Because individual reads and writes of variables that are not 8 bytes wide (that is, everything except long and double) are atomic in Java, declaring variables volatile is an easy way to provide visibility in situations where there are no other atomicity or mutual exclusion requirements.
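A minimal sketch of the fixed loop, assuming a made-up StopFlagDemo wrapper around the isStopped flag from the snippet above. With the flag declared volatile, the worker is guaranteed to see the update and leave the loop:

```java
class StopFlagDemo {
    private static volatile boolean isStopped = false;

    // Starts a worker that loops until isStopped is set, then reports
    // whether the worker terminated within the timeout.
    static boolean run() {
        isStopped = false;
        Thread worker = new Thread(() -> {
            while (!isStopped) {
                Thread.yield(); // stand-in for "do some kind of work"
            }
        });
        worker.start();
        isStopped = true;       // volatile write, visible to the worker
        try {
            worker.join(5_000); // wait up to five seconds for the loop to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```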

The volatile keyword is used:
to make non-atomic 64-bit operations atomic: long and double. (All other primitive accesses are already guaranteed to be atomic!)
to guarantee that variable updates are seen by other threads, plus visibility effects: after a thread writes to a volatile variable, all the variables that were visible to it before that write become visible to another thread once it reads the same volatile variable (happens-before ordering).
The java.util.concurrent.atomic.* classes are, according to the java docs:
A small toolkit of classes that support lock-free thread-safe
programming on single variables. In essence, the classes in this
package extend the notion of volatile values, fields, and array
elements to those that also provide an atomic conditional update
operation of the form:
boolean compareAndSet(expectedValue, updateValue);
The atomic classes are built around the atomic compareAndSet(...) function that maps to an atomic CPU instruction. The atomic classes introduce the happen-before ordering as the volatile variables do. (with one exception: weakCompareAndSet(...)).
From the java docs:
When a thread sees an update to an atomic variable caused by a
weakCompareAndSet, it does not necessarily see updates to any other
variables that occurred before the weakCompareAndSet.
To your question:
Does this mean that whosoever takes lock on it, that will be setting
its value first. And in if meantime, some other thread comes up and
read old value while first thread was changing its value, then doesn't
new thread will read its old value?
You don't lock anything. What you are describing is a typical race condition that will happen eventually if threads access shared data without proper synchronization. As already mentioned, declaring a variable volatile in this case will only ensure that other threads see the change to the variable (the value will not be cached in a register or in some cache that is only seen by one thread).
What is the difference between AtomicInteger and volatile int?
AtomicInteger provides atomic operations on an int with proper synchronization (eg. incrementAndGet(), getAndAdd(...), ...), volatile int will just ensure the visibility of the int to other threads.

So what will happen if two threads attack a volatile primitive variable at same time?
Usually each one can increment the value. However, sometimes both will update the value at the same time, and instead of the total increasing by 2, both threads increment from the same value and only 1 is added.
Does this mean that whosoever takes lock on it, that will be setting its value first.
There is no lock. That is what synchronized is for.
And in if meantime, some other thread comes up and read old value while first thread was changing its value, then doesn't new thread will read its old value?
Yes.
What is the difference between Atomic and volatile keyword?
AtomicXxxx wraps a volatile, so they are basically the same; the difference is that it provides higher-level operations such as compare-and-swap, which is used to implement increment.
AtomicXxxx also supports lazySet. This is like a volatile set, but doesn't stall the pipeline waiting for the write to complete. It can mean that if you read a value you just wrote you might see the old value, but you shouldn't be doing that anyway. The difference is that setting a volatile takes about 5 ns, but lazySet takes about 0.5 ns.

Multithreaded access and variable cache of threads

I could find the answer if I read a complete chapter/book about multithreading, but I'd like a quicker answer. (I know this stackoverflow question is similar, but not sufficiently.)
Assume there is this class:
public class TestClass {
    private int someValue;
    public int getSomeValue() { return someValue; }
    public void setSomeValue(int value) { someValue = value; }
}
There are two threads (A and B) that access the instance of this class. Consider the following sequence:
A: getSomeValue()
B: setSomeValue()
A: getSomeValue()
If I'm right, someValue must be volatile, otherwise the 3rd step might not return the up-to-date value (because A may have a cached value). Is this correct?
Second scenario:
B: setSomeValue()
A: getSomeValue()
In this case, A will always get the correct value, because this is its first access, so it can't have a cached value yet. Is this right?
If a class is accessed only in the second way, there is no need for volatile/synchronization, or is it?
Note that this example was simplified, and actually I'm wondering about particular member variables and methods in a complex class, and not about whole classes (i.e. which variables should be volatile or have synced access). The main point is: if more threads access certain data, is synchronized access needed by all means, or does it depend on the way (e.g. order) they access it?
After reading the comments, I try to present the source of my confusion with another example:
1. From UI thread: threadA.start()
2. threadA calls getSomeValue(), and informs the UI thread
3. UI thread gets the message (in its message queue), so it calls: threadB.start()
4. threadB calls setSomeValue(), and informs the UI thread
5. UI thread gets the message, and informs threadA (in some way, e.g. message queue)
6. threadA calls getSomeValue()
This is a totally synchronized structure, but why does this imply that threadA will get the most up-to-date value in step 6? (if someValue is not volatile, or not put into a monitor when accessed from anywhere)

If two threads are calling the same methods, you can't make any guarantees about the order that said methods are called. Consequently, your original premise, which depends on calling order, is invalid.
It's not about the order in which the methods are called; it's about synchronization. It's about using some mechanism to make one thread wait while the other fully completes its write operation. Once you've made the decision to have more than one thread, you must provide that synchronization mechanism to avoid data corruption.

As we all know, it is the crucial state of the data that we need to protect, and the statements that govern that state must be synchronized so that they execute atomically.
I had an example where I used volatile with two threads, each incrementing a counter by 1 up to 10000 times, so the total should have been 20000. To my surprise, that did not always happen.
Then I used the synchronized keyword to make it work.
Synchronization makes sure that while a thread is executing a synchronized method, no other thread can enter this or any other synchronized method of that object, so the data cannot be corrupted.
A thread-safe class is one that maintains its correctness under any scheduling and interleaving by the underlying runtime environment, without any additional synchronization on the part of the client code that accesses the class.
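The experiment described above might look roughly like the following. This is a reconstruction under assumptions, not the original code; the class name SynchronizedCounter and its methods are invented:

```java
class SynchronizedCounter {
    private int count = 0;

    // Only one thread at a time can run a synchronized method on this object,
    // so the read-increment-write sequence cannot interleave.
    public synchronized void increment() {
        count++;
    }

    public synchronized int get() {
        return count;
    }

    // Two threads each increment 10000 times; returns the final total.
    static int run() {
        SynchronizedCounter c = new SynchronizedCounter();
        Runnable task = () -> {
            for (int i = 0; i < 10_000; i++) {
                c.increment();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return c.get();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Replacing the synchronized methods with a volatile int field and an unguarded count++ would bring back the lost updates the answer describes.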

Let's look at the book.
A field may be declared volatile, in which case the Java memory model (§17) ensures that all threads see a consistent value for the variable.
So volatile is a guarantee that the declared variable won't be copied into thread local storage, which is otherwise allowed. It's further explained that this is an intentional alternative to locking for very simple kinds of synchronized access to shared storage.
Also see this earlier article, which explains that int access is necessarily atomic (but not double or long).
These together mean that if your int field is declared volatile then no locks are necessary to guarantee atomicity: you will always see a value that was last written to the memory location, not some confused value resulting from a half-complete write (as is possible with double or long).
However, you seem to imply that your getters and setters themselves are atomic. This is not guaranteed. The JVM can interrupt execution at intermediate points during the call or return sequence. In this example, this has no consequences. But if the calls had side effects, e.g. setSomeValue(++val), then you would have a different story.

The issue is that Java is simply a specification. There are many JVM implementations and many physical operating environments. On any given combination, an action may be safe or unsafe. For instance, on single-processor systems the volatile keyword in your example is probably completely unnecessary. Since the writers of the memory and language specifications can't reasonably account for all possible sets of operating conditions, they choose to white-list certain patterns that are guaranteed to work on all compliant implementations. Adhering to these guidelines ensures both that your code will work on your target system and that it will be reasonably portable.
In this case "caching" typically refers to activity that is going on at the hardware level. Certain events in Java cause the cores of a multiprocessor system to "synchronize" their caches. Accesses to volatile variables are one example of this; synchronized blocks are another. Imagine a scenario where two threads, X and Y, are scheduled to run on different processors.
X starts and is scheduled on proc 1
y starts and is scheduled on proc 2
.. now you have two threads executing simultaneously
to speed things up the processors check local caches
before going to main memory because its expensive.
x calls setSomeValue('x-value') //assuming proc 1's cache is empty the cache is set
//this value is dropped on the bus to be flushed
//to main memory
//now all get's will retrieve from cache instead
//of engaging the memory bus to go to main memory
y calls setSomeValue('y-value') //same thing happens for proc 2
//Now in this situation, depending on the order in which things are scheduled and
//what thread you are calling from, calls to getSomeValue() may return 'x-value' or
//'y-value'. The results are completely unpredictable.
The point is that volatile(on compliant implementations) ensures that ordered writes will always be flushed to main memory and that other processor's caches will be flagged as 'dirty' before the next access regardless of the thread from which that access occurs.
disclaimer: volatile DOES NOT LOCK. This is important especially in the following case:
volatile int counter;
public void incrementSomeValue() {
    counter++; // Bad thread juju - this is at least three instructions
               // read - increment - write
               // there is no guarantee that this operation is atomic
}
This could be relevant to your question if your intent is that setSomeValue must always be called before getSomeValue.
If the intent is that getSomeValue() must always reflect the most recent call to setSomeValue(), then this is a good place to use the volatile keyword. Just remember that without it there is no guarantee that getSomeValue() will reflect the most recent call to setSomeValue(), even if setSomeValue() was scheduled first.

If I'm right, someValue must be volatile, otherwise the 3rd step might not return the up-to-date value (because A may have a cached
value). Is this correct?
If thread B calls setSomeValue(), you need some sort of synchronization to ensure that thread A can read that value. volatile won't accomplish this on its own, and neither will making the methods synchronized. The code that does this is ultimately whatever synchronization code you added that made sure that A: getSomeValue() happens after B: setSomeValue(). If, as you suggest, you used a message queue to synchronize threads, this happens because the memory changes made by thread B became visible to thread A once thread A acquired the lock on your message queue.
If a class is accessed only in the second way, there is no need for
volatile/synchronization, or is it?
If you are really doing your own synchronization, then it doesn't sound like you care whether these classes are thread-safe. Be sure that you aren't accessing them from more than one thread at the same time, though; otherwise, any methods that aren't atomic (assigning an int is) may leave you in an unpredictable state. One common pattern is to put the shared state into an immutable object so that you are sure that the receiving thread isn't calling any setters.
If you do have a class that you want to be updated and read from multiple threads, I'd probably do the simplest thing to start, which is often to synchronize all public methods. If you really believe this to be a bottleneck, you could look into some of the more complex locking mechanisms in Java.
So what does volatile guarantee?
For the exact semantics, you might have to go read tutorials, but one way to summarize it is that 1) any memory changes made by the last thread to access the volatile will be visible to the current thread accessing the volatile, and 2) that accessing the volatile is atomic (it won't be a partially constructed object, or a partially assigned double or long).
Synchronized blocks have analogous properties: 1) any memory changes made by the last thread to hold the lock will be visible to this thread, and 2) the changes made within the block are performed atomically with respect to other synchronized blocks.
(1) means any memory changes, not just changes to the volatile (we're talking post JDK 1.5) or within the synchronized block. This is what people mean when they refer to ordering, and this is accomplished in different ways on different chip architectures, often by using memory barriers.
Also, in the case of synchronized blocks, (2) only guarantees that you won't see inconsistent values if you are within another block synchronized on the same lock. It's usually a good idea to synchronize all access to shared variables, unless you really know what you are doing.
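The message-queue hand-off from the question can be sketched with a BlockingQueue standing in for the UI message queue (an assumption; the question does not say which queue is used). The java.util.concurrent documentation states that actions before placing an element into a concurrent collection happen-before actions after its removal in another thread, so the plain someValue field is safely visible to thread A here:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class HandOffDemo {
    static class TestClass {
        private int someValue; // deliberately neither volatile nor synchronized
        int getSomeValue() { return someValue; }
        void setSomeValue(int value) { someValue = value; }
    }

    // Thread B writes, then signals through the queue; thread A waits for
    // the signal, then reads. Returns the value thread A observed.
    static int run() {
        TestClass shared = new TestClass();
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        int[] seen = new int[1];
        Thread b = new Thread(() -> {
            shared.setSomeValue(42);
            try {
                queue.put("updated");   // publishes the earlier write
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread a = new Thread(() -> {
            try {
                queue.take();           // blocks until B has signalled
                seen[0] = shared.getSomeValue();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The guarantee comes from the queue, not from TestClass; any read of someValue that bypasses the hand-off has no such protection.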

is a volatile variable synchronized? (java)

Say that I have a private variable and I have a setVariable() method for it which is synchronized, isn't it exactly the same as using volatile modifier?

No. Volatile means the variable isn't cached in any per-thread cache, and its value is always retrieved from main memory when needed. Synchronization means that those per-thread caches will be kept in sync at certain points. In theory, using a volatile variable can come with a great speed penalty if many threads need to read the value of the variable, but it is changed only rarely.

No, calling a synchronized getXXX/setXXX method is not the same as reading/writing to a volatile variable.
Multiple threads can concurrently read from or write to a volatile variable. But only one thread at a time can read from or write to a variable that is guarded by a synchronized block.

volatile variables are not synchronized (at least, not in the way synchronized stuff is synchronized). What volatile does is ensure that a variable is retrieved each time it's used (ie: it prevents certain kinds of optimization), and IIRC that it's read and written in the correct order. This could conceivably emulate some kinds of synchronization, but it can't work the same if your setter has to set more than one thing. (If you set two volatile variables, for example, there will be a point where one is set and the other isn't.)

Actually No.
volatile is actually a weaker form of synchronization. When a field is declared volatile, the compiler and runtime understand that this variable is shared and that operations on it shouldn't be reordered with other memory operations. Volatile variables aren't cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.

Just an example:
First thread runs:
while (!stopped) {
    ... do something
}
Second thread runs:
stopped = true;
It is useful to declare stopped as a volatile boolean so that the first thread sees a fresh value of it.

There is no relation.
Basically:
Volatile => a read always retrieves the variable's latest value
Synchronized => it serves only 1 thread at a time

java threads synchronization

In the class below, is the method getIt() thread safe and why?
public class X {
    private long myVar;
    public void setIt(long var) {
        myVar = var;
    }
    public long getIt() {
        return myVar;
    }
}

It is not thread-safe. A write to a non-volatile long or double in Java may be treated as two separate 32-bit writes. One thread could be writing and have written half the value when another thread reads both halves. In this situation, the reader would see a value that was never supposed to exist.
To make this thread-safe you can either declare myVar as volatile (Java 1.5 or later) or make both setIt and getIt synchronized.
Note that even if myVar was a 32-bit int you could still run into threading issues where one thread could be reading an out of date value that another thread has changed. This could occur because the value has been cached by the CPU. To resolve this, you again need to declare myVar as volatile (Java 1.5 or later) or make both setIt and getIt synchronized.
It's also worth noting that if you are using the result of getIt in a subsequent setIt call, e.g. x.setIt(x.getIt() * 2), then you probably want to synchronize across both calls:
synchronized(x)
{
    x.setIt(x.getIt() * 2);
}
Without the extra synchronization, another thread could change the value in between the getIt and setIt calls causing the other thread's value to be lost.
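A small sketch of that compound update, assuming setIt and getIt have been made synchronized as suggested earlier in this answer. The DoubleItDemo wrapper is made up for the illustration. Because synchronized methods lock on the instance itself, the outer synchronized(x) block uses the same monitor and Java's locks are reentrant:

```java
class DoubleItDemo {
    // The class from the question, with both accessors synchronized.
    static class X {
        private long myVar;
        public synchronized void setIt(long var) { myVar = var; }
        public synchronized long getIt() { return myVar; }
    }

    // Two threads each double the value five times, starting from 1.
    static long run() {
        X x = new X();
        x.setIt(1);
        Runnable task = () -> {
            for (int i = 0; i < 5; i++) {
                synchronized (x) {            // holds x's monitor across both calls
                    x.setIt(x.getIt() * 2);
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return x.getIt();
    }

    public static void main(String[] args) {
        System.out.println(run()); // ten doublings of 1
    }
}
```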

This is not thread-safe. Even if your platform guarantees atomic writes of long, without synchronized it is possible that one thread calls setIt() and, even after that call has finished, another thread calls getIt() and gets the old value of myVar.
The synchronized keyword does more than an exclusive access of one thread to a block or a method. It also guarantees that the second thread is informed about a change of a variable.
So you either have to mark both methods as synchronized or mark the member myVar as volatile.
There's a very good explanation about synchronization here:
Atomic actions cannot be interleaved, so they can be used without fear of thread interference. However, this does not eliminate all need to synchronize atomic actions, because memory consistency errors are still possible. Using volatile variables reduces the risk of memory consistency errors, because any write to a volatile variable establishes a happens-before relationship with subsequent reads of that same variable. This means that changes to a volatile variable are always visible to other threads. What's more, it also means that when a thread reads a volatile variable, it sees not just the latest change to the volatile, but also the side effects of the code that led up the change.

No, it's not. At least, not on platforms that lack atomic 64-bit memory accesses.
Suppose that Thread A calls setIt, copies 32 bits into memory where the backing value is, and is then pre-empted before it can copy the other 32 bits.
Then Thread B calls getIt.

No, it is not, because accesses to a non-volatile long are not guaranteed to be atomic in Java, so one thread could have written 32 bits of the long in the setIt method, then getIt could read the value, and then setIt could set the other 32 bits.
So the end result is that getIt returns a value that was never valid.

It ought to be, and generally is, but is not guaranteed to be thread safe. There could be issues with different cores having different versions in CPU cache, or the store/retrieve not being atomic for all architectures. Use the AtomicLong class.

The getter is not thread safe because it’s not guarded by any mechanism that guarantees the most up-to-date visibility. Your choices are:
making myVar final (but then you can’t mutate it)
making myVar volatile
using synchronized when accessing myVar

AFAIK, Modern JVMs no longer split long and double operations. I don't know of any reference which states this is still a problem. For example, see AtomicLong which doesn't use synchronization in Sun's JVM.
Assuming you want to be sure it is not a problem, you can synchronize both get() and set(). However, if you are performing an operation like add, i.e. set(get()+1), then this synchronization doesn't buy you much; you still have to synchronize on the object for the whole operation. (A better way around this is to use a single synchronized operation for add(n).)
However, a better solution is to use an AtomicLong. This supports atomic operations like get, set and add and DOESN'T use synchronization.
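As a sketch of that suggestion, the class from the question could be rewritten around an AtomicLong. The class name AtomicX and the add method are additions for the example, not part of the original code:

```java
import java.util.concurrent.atomic.AtomicLong;

class AtomicX {
    private final AtomicLong myVar = new AtomicLong();

    public void setIt(long var) {
        myVar.set(var);             // atomic, visible 64-bit write
    }

    public long getIt() {
        return myVar.get();         // atomic 64-bit read of the latest value
    }

    // Single atomic read-modify-write; no external lock needed.
    public long add(long n) {
        return myVar.addAndGet(n);
    }

    public static void main(String[] args) {
        AtomicX x = new AtomicX();
        x.setIt(40);
        System.out.println(x.add(2));
    }
}
```

Note that a caller who writes x.setIt(x.getIt() + 1) instead of x.add(1) still has a race, because that is two separate operations.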

Since it is a read-only method, you should synchronize the set method.
EDIT : I see why the get method needs to be synchronized as well. Good job explaining Phil Ross.
