Using volatile keyword with mutable object - java

In Java, I understand that the volatile keyword provides visibility for variables. The question is: if a variable is a reference to a mutable object, does volatile also provide visibility for the members inside that object?
In the example below, does it work correctly if multiple threads are accessing volatile Mutable m and changing the value?
example
class Mutable {
    private int value;
    public int get()
    {
        return value;
    }
    public void set(int value)
    {
        this.value = value;
    }
}
class Test {
public volatile Mutable m;
}

This is sort of a side note explaining some of the details of volatile. I am writing it here because it is too much for a comment. I want to give some examples which show how volatile affects visibility, and how that changed in JDK 1.5.
Given the following example code:
public class MyClass
{
private int _n;
private volatile int _volN;
public void setN(int i) {
_n = i;
}
public void setVolN(int i) {
_volN = i;
}
public int getN() {
return _n;
}
public int getVolN() {
return _volN;
}
public static void main(String[] args) {
final MyClass mc = new MyClass();
Thread t1 = new Thread() {
public void run() {
mc.setN(5);
mc.setVolN(5);
}
};
Thread t2 = new Thread() {
public void run() {
int volN = mc.getVolN();
int n = mc.getN();
System.out.println("Read: " + volN + ", " + n);
}
};
t1.start();
t2.start();
}
}
The behavior of this test code is well defined in jdk1.5+, but is not well defined pre-jdk1.5.
In the pre-jdk1.5 world, there was no defined relationship between volatile accesses and non-volatile accesses. Therefore, the output of this program could be any of:
Read: 0, 0
Read: 0, 5
Read: 5, 0
Read: 5, 5
In the jdk1.5+ world, the semantics of volatile were changed so that volatile accesses affect non-volatile accesses in exactly the same way as synchronization. Therefore, only certain outputs are possible in the jdk1.5+ world:
Read: 0, 0
Read: 0, 5
Read: 5, 0 <- not possible
Read: 5, 5
Output 3. is not possible because the reading of "5" from the volatile _volN establishes a synchronization point between the 2 threads, which means all actions from t1 taken before the assignment to _volN must be visible to t2.
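The guarantee described above can be sketched as a small self-checking program. This is only an illustration under the Java 5+ memory model; the class name VolatilePublication and its methods are made up for this sketch and are not part of the original answer. The reader spins until it sees the volatile write, after which the earlier plain write must also be visible.

```java
// Hypothetical demo class; names are illustrative only.
class VolatilePublication {
    private int n;              // plain field, written first
    private volatile int volN;  // volatile field, written last

    void write() {
        n = 5;     // ordinary write...
        volN = 5;  // ...published by the volatile write that follows it
    }

    // Spins until the volatile write is observed, then reads the plain field.
    int readAfterPublish() {
        while (volN != 5) {
            Thread.yield();
        }
        // Reading 5 from volN happens-after the writer's volatile write,
        // so the earlier write to n is guaranteed to be visible here.
        return n;
    }

    // Runs one writer thread and reads from the calling thread.
    static int demo() {
        VolatilePublication p = new VolatilePublication();
        Thread writer = new Thread(p::write);
        writer.start();
        int seen = p.readAfterPublish();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }
}
```

Calling VolatilePublication.demo() corresponds to the "Read: 5, 5" case: once the reader has seen 5 in the volatile field, it cannot see 0 in the plain one.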
Further reading:
Fixing the java memory model, part 1
Fixing the java memory model, part 2

In your example the volatile keyword only guarantees that the last reference written, by any thread, to 'm' will be visible to any thread reading 'm' subsequently.
It doesn't guarantee anything about your get().
So using the following sequence:
Thread-1: get() returns 2
Thread-2: set(3)
Thread-1: get()
it is totally legitimate for you to get back 2 and not 3. Declaring m volatile doesn't change anything about that.
But if you change your Mutable class to this:
class Mutable {
    private volatile int value;
    public int get()
    {
        return value;
    }
    public void set(int value)
    {
        this.value = value;
    }
}
Then it is guaranteed that the second get() from Thread-1 shall return 3.
Note, however, that volatile typically isn't the best synchronization method.
In your simple get/set example (I know it's just an example), a class like AtomicInteger, which uses proper synchronization and actually provides useful methods, would be better.
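As a rough sketch of that suggestion, the holder below keeps its state in an AtomicInteger, so visibility of the value no longer depends on how the reference to the object is declared. The class names AtomicMutable and AtomicTest are invented for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical replacement for the question's Mutable class.
class AtomicMutable {
    private final AtomicInteger value = new AtomicInteger();

    int get() {
        return value.get();              // has volatile-read semantics
    }

    void set(int newValue) {
        value.set(newValue);             // has volatile-write semantics
    }

    int incrementAndGet() {
        return value.incrementAndGet();  // atomic read-modify-write
    }
}

// Hypothetical counterpart of the question's Test class.
class AtomicTest {
    // volatile still only covers the reference; the state inside is
    // covered by the AtomicInteger itself.
    volatile AtomicMutable m = new AtomicMutable();
}
```

Besides get and set, AtomicInteger offers compound operations such as incrementAndGet, which a plain volatile int cannot perform atomically.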

volatile only provides guarantees about the reference to the Object that is declared so. The members of that instance don't get synchronized.
According to Wikipedia, you have:
(In all versions of Java) There is a global ordering on the reads and writes to a volatile variable. This implies that every thread accessing a volatile field will read its current value before continuing, instead of (potentially) using a cached value. (However, there is no guarantee about the relative ordering of volatile reads and writes with regular reads and writes, meaning that it's generally not a useful threading construct.)
(In Java 5 or later) Volatile reads and writes establish a happens-before relationship, much like acquiring and releasing a mutex.
So basically what you have is that, by declaring the field volatile, interacting with it creates a "point of synchronization", after which any change will be visible in other threads. But after that, using get() or set() is unsynchronized. The Java Spec has a more thorough explanation.

Use of volatile rather than a fully synchronized value is essentially an optimization. The optimization comes from the weaker guarantees provided for a volatile value compared with a synchronized access. Premature optimization is the root of all evil; in this case, the evil could be hard to track down because it would be in the form of race conditions and the like. So if you need to ask, you probably ought not to use it.

volatile does not "provide visibility". Its only effect is to prevent processor caching of the variable, thus providing a happens-before relation on concurrent reads and writes. It does not affect the members of an object, nor does it provide any synchronized locking.
As you haven't told us what the "correct" behaviour of your code is, the question cannot be answered.

Related

is synchronized needed in getValue() ? & volatile needed?

I have a class in a multithreaded application:
public class A {
private volatile int value = 0; // is volatile needed here?
synchronized public void increment() {
value++; // Atomic is better, agree
}
public int getValue() { // synchronized needed ?
return value;
}
}
The keyword volatile gives you the visibility aspects and without that you may read some stale value. A volatile read adds a memory barrier such that the compiler, hardware or the JVM can't reorder the memory operations in ways that would violate the visibility guarantees provided by the memory model. According to the memory model, a write to a volatile field happens-before every subsequent read of that same field, thus you are guaranteed to read the latest value.
The keyword synchronized is also needed since you are performing a compound action, value++, which has to be done atomically. You read the value, increment it in the CPU and then write it back. All these actions have to be done atomically. However, you don't need to synchronize the read path since the keyword volatile guarantees visibility. In fact, using both volatile and synchronized on the read path would be confusing and would offer no performance or safety benefit.
The use of atomic variables is generally encouraged, since they use non-blocking synchronization built on CAS instructions in the CPU, which yields low lock contention and higher throughput. If it were written using the java.util.concurrent.atomic classes (LongAdder here), it would be something like this.
public class A {
private final LongAdder value = new LongAdder();
public void increment() {
value.add(1);
}
public int getValue() {
return value.intValue();
}
}
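For comparison, here is a sketch of the same class built on AtomicInteger, which fits naturally when the value is an int. The class name AtomicA and the hammer helper are made up for this example; the helper just runs several incrementing threads and returns the final count.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical AtomicInteger-based variant of class A.
class AtomicA {
    private final AtomicInteger value = new AtomicInteger(0);

    void increment() {
        value.incrementAndGet(); // atomic read-modify-write, no lock needed
    }

    int getValue() {
        return value.get();      // always sees the latest completed update
    }

    // Starts `threads` workers that each increment `perThread` times.
    static int hammer(int threads, int perThread) {
        AtomicA a = new AtomicA();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    a.increment();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return a.getValue();
    }
}
```

No increments are lost, so AtomicA.hammer(4, 10000) returns 40000. LongAdder tends to scale better under heavy write contention, while AtomicInteger gives an exact value on every read.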

Does synchronizing on the static field that you are modifying make your code thread safe?

class A {
private static BigInteger staticCode = BigInteger.ZERO;
private BigInteger code;
public A() {
synchronized(staticCode) {
staticCode = staticCode.add(BigInteger.ONE);
code = staticCode;
}
}
}
I'm not an expert in concurrency by any means. Could someone explain to me why the class provided above isn't thread safe?
What are the situations that can cause a race condition? My thought process is that if we create 10 instances of this class, every instance will synchronize on a different value of staticCode and that's why it's thread safe, but I was told that it wasn't. But why?
I know that we can synchronize on .class and it will definitely be thread safe, but I still want to understand this particular situation.
Does synchronizing on the static field that you are modifying make your code thread safe?
No, because you're reassigning it. (*)
As soon as that reassignment has taken place, you've effectively lost the mutual exclusion on access to the staticCode field.
Any thread which is already waiting at the synchronized block before the assignment will continue to wait.
Any thread which arrives at the synchronized block after the reassignment but before the reassigning thread has left the block will attempt to synchronize on the new value of staticCode.
A more subtle point than the fact you don't have mutual exclusion is that you also lose the happens-before between the end of the synchronized block and the start of the next execution. This means that you don't have guaranteed visibility of the updated value, so you can potentially generate multiple instances of A with the same code.
It's a bad idea to synchronize on a non-final member. If you don't want to synchronize on A.class, you can define an auxiliary member on which to synchronize:
class A {
private static final Object lock = new Object();
private static BigInteger staticCode = BigInteger.ZERO;
public A() {
synchronized (lock) {
staticCode = ...
}
}
}
This preserves the mutability of staticCode, but allows correct mutual exclusion.
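A complete version of this idea might look like the sketch below. The body of the synchronized block is an assumption, taken from the increment in the question; the class name LockedCode and the distinctCodes helper are invented for illustration. The helper creates instances from several threads and counts how many distinct codes were handed out.

```java
import java.math.BigInteger;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical runnable variant of class A using a dedicated final lock.
class LockedCode {
    private static final Object LOCK = new Object();
    private static BigInteger staticCode = BigInteger.ZERO;
    private final BigInteger code;

    LockedCode() {
        // Every thread locks the same object, so the increment and the
        // read of the new value form one indivisible step.
        synchronized (LOCK) {
            staticCode = staticCode.add(BigInteger.ONE);
            code = staticCode;
        }
    }

    BigInteger code() {
        return code;
    }

    // Creates threads * perThread instances and counts distinct codes.
    static int distinctCodes(int threads, int perThread) {
        Set<BigInteger> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    seen.add(new LockedCode().code());
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }
}
```

Because the lock object never changes, every instance gets a unique code, so distinctCodes(4, 1000) returns 4000.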
However, an Atomic* class would be far easier because you avoid the need to synchronize (e.g. AtomicInteger or AtomicLong - but if you really think you're going to have more than 2^63 things, you can use an AtomicReference<BigInteger>):
import java.math.BigInteger;
import java.util.concurrent.atomic.AtomicReference;

class A {
    private static final AtomicReference<BigInteger> staticCode = new AtomicReference<>(BigInteger.ZERO);
    private final BigInteger code;
    public A() {
        BigInteger current;
        BigInteger next;
        do {
            // Retry until no other thread changed staticCode between get and compareAndSet.
            current = staticCode.get();
            next = current.add(BigInteger.ONE);
        } while (!staticCode.compareAndSet(current, next));
        this.code = next;
        // Even easier with AtomicInteger/Long:
        // this.code = BigInteger.valueOf(staticCode.incrementAndGet());
    }
}
(*) But anyway, dispense with the notion that synchronizing automatically makes something thread safe. For one thing, you need to define precisely what you mean by "thread safe"; but then, you need to understand what synchronization actually provides for you, in order to evaluate whether those things satisfy your thread safety requirements.
I guess the main point I was missing here is that we synchronize on objects, not references to objects.
Consider a situation where I synchronize on BigInteger.ZERO, and then enter the synchronized block.
When the value of staticCode has changed and become BigInteger.ONE, this block still continues to be synchronized on BigInteger.ZERO. Meanwhile, another thread is already synchronized on BigInteger.ONE, before we even had a chance to assign BigInteger.ONE to code. That second thread could bump staticCode to the value of 2, and now both threads are before the second assignment, but the value of staticCode is 2, so they can both assign the same value of staticCode to 2 different instances of the class.

Synchronized Get Methods in Java

This may seem like pedantry but is really me questioning my fundamental assumptions.. :)
In the java documentation on synchronised methods, there is the following example:
public class SynchronizedCounter {
private int c = 0;
public synchronized void increment() {
c++;
}
public synchronized void decrement() {
c--;
}
public synchronized int value() {
return c;
}
}
Is the synchronized keyword really required on the value method? Surely this is atomic and whether the value is retrieved before or after any calls to related methods on other threads makes little difference? Would the following suffice:
public class SynchronizedCounter {
private int c = 0;
public synchronized void increment() {
c++;
}
public synchronized void decrement() {
c--;
}
public int value() {
return c;
}
}
I understand that in a more complex case, where multiple private variables were being accessed then yes, it would be essential - but in this simple case, is it safe to assume that this can be simplified?
Also, I suppose that there is a risk that future modifications may require the value method to be synchronised and this could be forgotten, leading to bugs, etc, so perhaps this counts somewhat as defensive programming, but I am ignoring that aspect here.. :)
Yes, synchronized is really required on value(). Otherwise a thread can call value() and get a stale answer.
Surely this is atomic
For ints I believe so, but if value was a long or double, it's not. It is even possible to see only some of the bits in the field updated!
value is retrieved before or after any calls to related methods on other threads makes little difference?
Depends on your use case. Often it does matter.
Some static analysis software such as FindBugs will flag this code as not having correct synchronization if value() isn't also synchronized.
synchronized is required for both reading and writing a variable that is shared with another thread. This guarantees two things:
that values are written back from caches or registers to main memory (granted, this matters for the writing side rather than the reading side);
that a happens-before ordering is established, so a write made before releasing the lock is seen by a read made after acquiring it. Without it, the compiler and JIT are free to reorder operations for optimization.
Check Effective Java Item 66 for a more detailed analysis
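The visibility point can be illustrated with a sketch in which one thread increments while another polls value(). The class name PolledCounter and the pollUntil helper are invented for this example. Because value() is synchronized, each poll is guaranteed to see the increments made under the same lock, so the loop must terminate; with an unsynchronized value() the memory model would allow the poller to keep seeing a stale value.

```java
// Hypothetical counter where both the writer and the reader take the lock.
class PolledCounter {
    private int c = 0;

    synchronized void increment() {
        c++;
    }

    synchronized int value() {
        return c; // acquiring the lock makes earlier increments visible
    }

    // Starts a writer that increments `target` times, then polls from the
    // calling thread until the target is observed.
    static int pollUntil(int target) {
        PolledCounter counter = new PolledCounter();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < target; i++) {
                counter.increment();
            }
        });
        writer.start();
        int seen;
        while ((seen = counter.value()) < target) {
            Thread.yield();
        }
        return seen;
    }
}
```

The writer stops at the target, so pollUntil(1000) returns exactly 1000.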

Volatile keyword in Java - Clarification [duplicate]

I am really confused about what I read about the applications of volatile keyword in java.
Is the following statement correct?
"a write to a volatile field happens before every subsequent read of the same field"
Ideally, when should the volatile keyword be used?
What is the difference between:
class TestClass
{ private int x;
synchronized int get(){return x;}
synchronized void set(int x){this.x = x;}
}
and
class TestClass
{ private volatile int x;
int get(){return x;}
void set(int x){this.x = x;}
}
volatile is a field modifier, while synchronized modifies code blocks and methods. So we can specify three variations of a simple accessor using those two keywords:
int i1;
int geti1() {return i1;}
volatile int i2;
int geti2() {return i2;}
int i3;
synchronized int geti3() {return i3;}
geti1() accesses the value currently stored in i1 in the current thread.
Threads can have local copies of variables, and the data does not have to be the same as the data held in other threads. In particular, another thread may have updated i1 in its thread, but the value in the current thread could be different from that updated value. In fact Java has the idea of a "main" memory, and this is the memory that holds the current "correct" value for variables. Threads can have their own copy of data for variables, and the thread copy can be different from the "main" memory. So in fact, it is possible for the "main" memory to have a value of 1 for i1, for thread1 to have a value of 2 for i1 and for thread2 to have a value of 3 for i1, if thread1 and thread2 have both updated i1 but those updated values have not yet been propagated to "main" memory or other threads.
On the other hand, geti2() effectively accesses the value of i2 from "main" memory. A volatile variable is not allowed to have a local copy that is different from the value currently held in "main" memory. Effectively, a variable declared volatile must have its data synchronized across all threads, so that whenever you access or update the variable in any thread, all other threads immediately see the same value. Generally volatile variables have a higher access and update overhead than "plain" variables. Threads are generally allowed to have their own copy of data for better efficiency.
There are two differences between volatile and synchronized.
Firstly synchronized obtains and releases locks on monitors which can force only one thread at a time to execute a code block. That's the fairly well known aspect to synchronized. But synchronized also synchronizes memory. In fact synchronized synchronizes the whole of thread memory with "main" memory. So executing geti3() does the following:
1. The thread acquires the lock on the monitor for object this.
2. The thread memory flushes all its variables, i.e. it has all of its variables effectively read from "main" memory.
3. The code block is executed (in this case setting the return value to the current value of i3, which may have just been reset from "main" memory).
4. (Any changes to variables would normally now be written out to "main" memory, but for geti3() we have no changes.)
5. The thread releases the lock on the monitor for object this.
So where volatile only synchronizes the value of one variable between thread memory and "main" memory, synchronized synchronizes the value of all variables between thread memory and "main" memory, and locks and releases a monitor to boot. Clearly synchronized is likely to have more overhead than volatile.
http://javaexp.blogspot.com/2007/12/difference-between-volatile-and.html
volatile guarantees that reads from the variable always reflect the most up-to-date value. The runtime can achieve this in various ways, including not caching or refreshing the cache when the value has changed.
bwawok alluded to it, but the volatile keyword isn't only for memory visibility. Before Java 1.5 was released, the volatile keyword declared that the field would get the most recent value of the object by hitting main memory each time for reads and flushing for writes.
Today's volatile keyword says two very important things:
Don't worry about how, but know that when reading a volatile field you will always have the most up-to-date value.
A compiler cannot reorder a volatile read/write, so as to maintain program order.
From a client point of view, a private volatile field is hidden from the public interface while synchronized methods are more visible.
To answer part 3 of your question, and partly part 2.
There is no functional difference between synchronized and volatile samples.
However, each has its own drawbacks in terms of performance. In some cases volatile performance may be really worse than just using synchronized or other primitives from java.util.concurrent. For discussion of this see -> Why aren't variables in Java volatile by default?.
Answer by Kerem Baydoğan is completely right. I just want to give a practical example of what the volatile keyword offers us.
First, we have a counter, something like
public class Counter {
private int x;
public int getX() { return x; }
public void increment() { x++; }
}
And some Runnable tasks which increment the value of x
@Override
public void run() {
    // N is the number of iterations; it is not defined in this fragment.
    for (int i = 0; i < N; i++) {
        int oldValue = counter.getX();
        counter.increment();
        int newValue = counter.getX();
    }
}
With no synchronization, the threads will interfere with each other and the counter simply will not work correctly.
The simplest way to solve this:
public class Counter {
private int x;
public synchronized int getX() { return x; }
public synchronized void increment() { x++; }
}
Actually, in order to force the system to break, I do a Thread.sleep before reading and writing x; just imagine it is a database call or some other long-running task.
Now, what is volatile useful for? There are a lot of good articles out there: volatile article or this question
volatile is not the answer for synchronizing access to a shared resource, but it is a good choice for holding a flag that stops threads.
In our previous example, imagine we want to increment the variable up to 100. A simple way could be a volatile boolean flag. Example:
private volatile boolean stop;
@Override
public void run() {
    while (!stop) {
        int oldValue = counter.getX();
        if (oldValue == 100) {
            stop = true;
        } else {
            counter.increment();
            int newValue = counter.getX();
        }
    }
}
This works fine, but if you remove the volatile keyword from the flag, it's possible to come across an infinite loop.
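The stop-flag pattern can be sketched as a self-contained class. The names StoppableWorker, requestStop and stopsWithin are invented for this illustration. The flag is set from a different thread than the one that loops on it, which is the case where volatile matters.

```java
// Hypothetical worker that runs until another thread asks it to stop.
class StoppableWorker implements Runnable {
    private volatile boolean stop; // written by the controller, read by the worker
    private long iterations;       // only touched by the worker thread

    @Override
    public void run() {
        while (!stop) {            // volatile read on every iteration
            iterations++;
        }
    }

    void requestStop() {
        stop = true;               // volatile write, visible to the worker
    }

    // Starts a worker, requests a stop, and reports whether the worker
    // terminated within the timeout.
    static boolean stopsWithin(long timeoutMillis) {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker);
        thread.start();
        worker.requestStop();
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !thread.isAlive();
    }
}
```

Without volatile, the JIT would be allowed to hoist the read of stop out of the loop, which is how the infinite loop mentioned above can arise.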

How do you ensure multiple threads can safely access a class field?

When a class field is accessed via a getter method by multiple threads, how do you maintain thread safety? Is the synchronized keyword sufficient?
Is this safe:
public class SomeClass {
private int val;
public synchronized int getVal() {
return val;
}
private void setVal(int val) {
this.val = val;
}
}
or does the setter introduce further complications?
If you use 'synchronized' on the setter here too, this code is threadsafe. However it may not be sufficiently granular; if you have 20 getters and setters and they're all synchronized, you may be creating a synchronization bottleneck.
In this specific instance, with a single int variable, then eliminating the 'synchronized' and marking the int field 'volatile' will also ensure visibility (each thread will see the latest value of 'val' when calling the getter) but it may not be synchronized enough for your needs. For example, expecting
int old = someThing.getVal();
if (old == 1) {
someThing.setVal(2);
}
to set val to 2 if and only if it's already 1 is incorrect. For this you need an external lock, or some atomic compare-and-set method.
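One way to get an atomic compare-and-set is AtomicInteger.compareAndSet. The sketch below is only an illustration; the class name CasVal and the method setIfEquals are made up for it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder that exposes an atomic check-then-act operation.
class CasVal {
    private final AtomicInteger val = new AtomicInteger();

    int getVal() {
        return val.get();
    }

    void setVal(int newVal) {
        val.set(newVal);
    }

    // Sets val to `update` only if it currently equals `expected`.
    // The check and the write happen as one atomic step.
    boolean setIfEquals(int expected, int update) {
        return val.compareAndSet(expected, update);
    }
}
```

With this, someThing.setIfEquals(1, 2) replaces the separate getVal() and setVal(2) calls, and its return value tells the caller whether the update actually took place.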
I strongly suggest you read Java Concurrency In Practice by Brian Goetz et al, it has the best coverage of Java's concurrency constructs.
In addition to Cowan's comment, you could do the following for a compare and store:
synchronized(someThing) {
int old = someThing.getVal();
if (old == 1) {
someThing.setVal(2);
}
}
This works because the lock defined via a synchronized method is implicitly the same as the object's lock (see java language spec).
From my understanding you should use synchronized on both the getter and the setter methods, and that is sufficient.
Edit: Here is a link to some more information on synchronization and what not.
If your class contains just one variable, then another way of achieving thread-safety is to use the existing AtomicInteger object.
public class ThreadSafeSomeClass {
private final AtomicInteger value = new AtomicInteger(0);
public void setValue(int x){
value.set(x);
}
public int getValue(){
return value.get();
}
}
However, if you add additional variables such that they are dependent (state of one variable depends upon the state of another), then AtomicInteger won't work.
Echoing the suggestion to read "Java Concurrency in Practice".
For simple objects this may suffice. In most cases you should avoid synchronized methods, which lock on the object itself, because outside code can lock on the same object and you may run into a deadlock.
Example:
public class SomeClass {
    // Private lock object that outside code cannot synchronize on.
    private final Object mutex = new Object();
    private int val = -1; // TODO: Adjust initialization to a reasonable start value
    public int getVal() {
        synchronized (mutex) {
            return val;
        }
    }
    private void setVal(int val) {
        synchronized (mutex) {
            this.val = val;
        }
    }
}
This ensures that only one thread at a time reads or writes the instance member.
Read the book "Concurrent Programming in Java(tm): Design Principles and Patterns (Java (Addison-Wesley))", maybe http://java.sun.com/docs/books/tutorial/essential/concurrency/index.html is also helpful...
Synchronization exists to protect against thread interference and memory consistency errors. By synchronizing on getVal(), the code is guaranteeing that other synchronized methods on SomeClass do not also execute at the same time. Since there are no other synchronized methods, it isn't providing much value. Also note that reads and writes of most primitives (all types except long and double) are atomic. That means that, with careful programming, one doesn't need to synchronize access to the field, although atomicity alone does not guarantee that another thread sees the latest write.
Read Synchronization.
Not really sure why this was dropped to -3. I'm simply summarizing what the Synchronization tutorial from Sun says (as well as my own experience).
Using simple atomic variable access is more efficient than accessing these variables through synchronized code, but requires more care by the programmer to avoid memory consistency errors. Whether the extra effort is worthwhile depends on the size and complexity of the application.
