Isn't this a bad example for explaining Final in Java? - java

Section 17.5 of the Java Language Specification uses the following code to illustrate the semantics of final fields in the Java Memory Model (in comparison to normal fields):
class FinalFieldExample {
    final int x;
    int y;
    static FinalFieldExample f;

    public FinalFieldExample() {
        x = 3;
        y = 4;
    }

    static void writer() {
        f = new FinalFieldExample();
    }

    static void reader() {
        if (f != null) {
            int i = f.x; // guaranteed to see 3
            int j = f.y; // could see 0
        }
    }
}
The spec goes on to say:
"The class FinalFieldExample has a final int field x and a non-final int field y.
One thread might execute the method writer and another might execute the method reader. Because the writer method writes f after the object's constructor finishes, the reader method will be guaranteed to see the properly initialized value for f.x: it will read the value 3.
However, f.y is not final; the reader method is therefore not guaranteed to see the value 4 for it."
My question is: isn't this a lame (or at least a badly contrived) example?
Or am I missing something here?
My reasoning to term the example as 'lame' is:
If an object of the FinalFieldExample class is to be shared by threads in a multi-threaded scenario, shouldn't it follow the basic tenet of multi-threading, which is to use some form of synchronization? If they had used synchronization, the issue mentioned would not exist.
The example seems to advocate final fields as an alternative to (or a partial substitute for) proper synchronization techniques. In my understanding, final fields are useful even when used on top of proper synchronization, and should never be used to gain the advantage mentioned in the example (in the absence of synchronization).
So one could ask:
Isn't there a decent example (with synchronization) to explain the advantage of final fields over normal fields? I guess immutability is one.

You are confusing synchronization and concurrency.
If a field is a constant then it can be safely shared between multiple threads without any need for locking.
If a field is a variable then it needs to be synchronized or otherwise locked.
You can have a concurrent program in which multiple threads read the same constant field; this doesn't block any threads.
Any code that uses synchronized blocks does so at a huge cost. Synchronization is a very expensive process and should be avoided wherever possible, not to mention the problems of resource starvation, deadlock, livelock, etc.
If you can use final instead of synchronized, you should do so.
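As a rough illustration of that point, here is a minimal sketch (class and field names are my own, not from the question) of a constant object shared by several reader threads with no locking at all; once the constructor has finished, every thread sees the final field's value:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class Config {
    final int timeoutMillis; // a constant: safe to share without locking

    Config(int timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }
}

class SharedConstantDemo {
    // published once, before the reader threads are started,
    // so no further synchronization is needed to read it
    static final Config CONFIG = new Config(5000);

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> System.out.println(CONFIG.timeoutMillis)); // always prints 5000
        }
        pool.shutdown();
    }
}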

EDIT: I missed the point in this answer. The issue is not that the value can be changed. See bmorris591's answer instead.
One of the advantages of immutable objects is that you don't need synchronization.
But this example is not about synchronization; it's about the value that the reader thread is guaranteed to see. Even with synchronization, the value of y could change, while the value of x is always guaranteed to be 3.

The spec you refer to just describes how things (should) behave. Based on this spec you can decide how to code properly. The example in no way tries to represent a real use case; it just illustrates the behaviour in a few lines. And if your JVM implementation does not behave like that, then it is a bug.

Related

Initialization safety in java

Just to make sure I understand the concepts presented in java concurrency in practice.
Lets say I have the following program:
public class Stuff {
    private int x;

    public Stuff(int x) {
        this.x = x;
    }

    public int getX() {
        return x;
    }
}

public class UseStuff {
    private Stuff s;

    public void makeStuff(int x) {
        s = new Stuff(x);
    }

    public int useStuff() {
        return s.getX();
    }
}
If I let multiple threads play with this code, then I'm in trouble not only because s might end up pointing to different instances when two or more threads enter makeStuff, but also because even if just one thread creates a new Stuff, another thread that has just entered useStuff can see either 0 (the default int value) or the value assigned to x by the constructor.
That all depends on whether the constructor has finished initializing x.
So at this point, to make it thread safe I must do one thing and then I can choose between two different ways.
First I must make makeStuff() atomic, so that s points to one object at a time.
Then I either make useStuff synchronized as well, which ensures that I get back the Stuff object's x only after its constructor has finished building it, or I make Stuff's x final, and by this the JMM makes sure that x's value will only be visible after it has been initialized.
Do I understand the importance of final fields in the context of concurrency and JMM?
Do I understand the importance of final fields in the context of concurrency and JMM?
Not quite. The spec writes:
final fields also allow programmers to implement thread-safe immutable objects without synchronization. A thread-safe immutable object is seen as immutable by all threads, even if a data race is used to pass references to the immutable object between threads. This can provide safety guarantees against misuse of an immutable class by incorrect or malicious code
If you make x final, this guarantees that every thread that obtains a reference to a Stuff instance will observe x to have been assigned. It does not guarantee that any thread will obtain such a reference.
That is, in the absence of synchronization action in useStuff(), the runtime is permitted to satisfy a read of s from a register, which might return a stale value.
The cheapest correctly synchronized variant of this code is declaring s volatile, which ensures that writes to s happen-before (and are therefore visible to) subsequent reads of s. If you do that, you need not even make x final (because the write to x happens-before the write of s, the read of s happens-before the read of x, and happens-before is transitive).
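As a rough sketch of that cheapest variant (my own illustration, reusing the Stuff class from the question), declaring s volatile is enough to make a fully constructed Stuff visible to readers, even without making x final:

public class UseStuff {
    // volatile: the write of s happens-before any later read of s,
    // so a reader that sees the new Stuff also sees the x written in its constructor
    private volatile Stuff s;

    public void makeStuff(int x) {
        s = new Stuff(x);
    }

    public Integer useStuff() {
        Stuff local = s; // read the reference once
        return (local == null) ? null : local.getX();
    }
}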
Some answers claim that s can only refer to one object at a time. This is wrong; because there is no memory barrier, different threads can have their own notion about the value of s. In order for all threads to see a consistent value assigned to s, you need to declare s as volatile, or use some other memory barrier.
If you do this, you won't need to declare x as final for the correct value to be visible to all threads (but you might still want to; fields shouldn't be mutable without a reason). That's because the initialization of x happens-before the assignment of s in "source code order," and the write of the volatile field s happens-before other thread reads that value from s. If you subsequently modified the value of a non-final field x, however, you could run into trouble because the modification isn't guaranteed to be visible to other threads. Making Stuff immutable would eliminate that possibility.
Of course, there's nothing to stop threads from clobbering the value assigned to s, so different threads could still see different values for x. This isn't really a threading issue though. Even a single thread could write and then read different values of x over time. But preventing this behavior in a multi-threaded environment requires atomicity, that is, checking to see whether s has a value and assigning one if not should appear as one indivisible action to other threads. An AtomicReference would be the best solution, but the synchronized keyword would work too.
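For completeness, here is a minimal sketch (my own illustration, with a hypothetical StuffHolder class, assuming the Stuff class from the question) of the atomic check-then-set described above, using AtomicReference.compareAndSet:

import java.util.concurrent.atomic.AtomicReference;

public class StuffHolder {
    private final AtomicReference<Stuff> ref = new AtomicReference<>();

    // assign a Stuff only if none has been published yet; the check and the
    // assignment appear as one indivisible action to other threads
    public void makeStuffOnce(int x) {
        ref.compareAndSet(null, new Stuff(x));
    }

    public Integer useStuff() {
        Stuff s = ref.get();
        return (s == null) ? null : s.getX();
    }
}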
What are you trying to protect by making things synchronized? Are you concerned that thread A will call makeStuff and then thread B will call useStuff afterwards and the value won't be there? I'm not sure how synchronizing any of this will help that. Depending on what problem you are trying to avoid, it might be as simple as marking s as volatile.
I'm not sure what you're doing there. Why are you trying to create an object and then assign it to a field? Why save it if it can be overwritten by another call to makeStuff? It seems like you use UseStuff both as a proxy and as a factory for your actual Stuff model object. You'd better separate the two:
public class StuffFactory {
    public static Stuff createStuff(int value) {
        return new StuffProxy(value);
    }
}

public class StuffProxy extends Stuff {
    public StuffProxy(int value) {
        super(value);
    }

    // Replacement for useStuff from your original UseStuff class
    @Override
    public int getX() {
        // Put custom logic here
        return super.getX();
    }
}
The logic here is that each thread is responsible for the creation of its own Stuff objects (using the factory), so concurrent access is no longer an issue.

Are final fields really useful regarding thread-safety?

I have been working on a daily basis with the Java Memory Model for some years now. I think I have a good understanding about the concept of data races and the different ways to avoid them (e.g, synchronized blocks, volatile variables, etc). However, there's still something that I don't think I fully understand about the memory model, which is the way that final fields of classes are supposed to be thread safe without any further synchronization.
So according to the specification, if an object is properly initialized (that is, no reference to the object escapes in its constructor in such a way that the reference can be seen by another thread), then, after construction, any thread that sees the object will be guaranteed to see the references to all the final fields of the object (in the state they were when constructed), without any further synchronization.
In particular, the standard (http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.4) says:
The usage model for final fields is a simple one: Set the final fields
for an object in that object's constructor; and do not write a
reference to the object being constructed in a place where another
thread can see it before the object's constructor is finished. If this
is followed, then when the object is seen by another thread, that
thread will always see the correctly constructed version of that
object's final fields. It will also see versions of any object or
array referenced by those final fields that are at least as up-to-date
as the final fields are.
They even give the following example:
class FinalFieldExample {
    final int x;
    int y;
    static FinalFieldExample f;

    public FinalFieldExample() {
        x = 3;
        y = 4;
    }

    static void writer() {
        f = new FinalFieldExample();
    }

    static void reader() {
        if (f != null) {
            int i = f.x; // guaranteed to see 3
            int j = f.y; // could see 0
        }
    }
}
In which a thread A is supposed to run "reader()", and a thread B is supposed to run "writer()".
So far, so good, apparently.
My main concern has to do with... is this really useful in practice? As far as I know, in order to make thread A (which is running "reader()") see the reference to "f", we must use some synchronization mechanism, such as making f volatile, or using locks to synchronize access to f. If we don't do so, we are not even guaranteed that "reader()" will be able to see an initialized "f", that is, since we have not synchronized access to "f", the reader will potentially see "null" instead of the object that was constructed by the writer thread. This issue is stated in http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#finalWrong , which is one of the main references for the Java Memory Model [bold emphasis mine]:
Now, having said all of this, if, after a thread constructs an
immutable object (that is, an object that only contains final fields),
you want to ensure that it is seen correctly by all of the other
thread, you still typically need to use synchronization. There is no
other way to ensure, for example, that the reference to the immutable
object will be seen by the second thread. The guarantees the program
gets from final fields should be carefully tempered with a deep and
careful understanding of how concurrency is managed in your code.
So if we are not even guaranteed to see the reference to "f", and we must therefore use typical synchronization mechanisms (volatile, locks, etc.), and these mechanisms do already cause data races to go away, the need for final is something I would not even consider. I mean, if in order to make "f" visible to other threads we still need to use volatile or synchronized blocks, and they already make internal fields be visible to the other threads... what's the point (in thread safety terms) in making a field final in the first place?
I think that you are misunderstanding what the JLS example is intended to show:
static void reader() {
    if (f != null) {
        int i = f.x; // guaranteed to see 3
        int j = f.y; // could see 0
    }
}
This code does not guarantee that the latest value of f will be seen by the thread that calls reader(). But what it is saying is that if you do see f as non-null, then f.x is guaranteed to be 3 ... despite the fact that we didn't actually do any explicit synchronizing.
Well is this implicit synchronization for finals in constructors useful? Certainly it is ... IMO. It means that we don't need to do any extra synchronization each time we accessed an immutable object's state. That is a good thing, because synchronization typically entails cache read-through or write-through, and that slows your program down.
But what Pugh is saying is that you will typically need to synchronize to get hold of the reference to the immutable object in the first place. He is making the point that using immutable objects (implemented using final) does not excuse you from the need to synchronize ... or from the need to understand the concurrency / synchronization implementation of your application.
The problem is that we still need to be sure that reader will se a non-null "f", and that's only possible if we use other synchronization mechanism that will already provide the semantics of allowing us to see 3 for f.x. And if that's the case, why bother using final for thread safety stuff?
There is a difference between synchronizing to get the reference and synchronizing to use the reference. The first one I may need to do only once. The second one I may need to do lots of times ... with the same reference. And even if it is one-to-one, I have still halved the number of synchronizing operations ... if I (hypothetically) implement the immutable object as thread-safe.
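A sketch of that distinction (my own example, not from the answer): one volatile read to get hold of the immutable object, then any number of field accesses with no further synchronization:

final class Point {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class Tracker {
    private volatile Point latest; // synchronize (once) to get the reference

    void update(int x, int y) {
        latest = new Point(x, y);
    }

    long sumOfSquares() {
        Point p = latest; // a single volatile read
        if (p == null) {
            return 0;
        }
        // many uses of the reference, no locking needed: Point is immutable
        return (long) p.x * p.x + (long) p.y * p.y;
    }
}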
TL;DR: Most software developers should ignore the special rules regarding final variables in the Java Memory Model. They should adhere to the general rule: if a program is free of data races, all executions will appear to be sequentially consistent. In most cases, final variables cannot be used to improve the performance of concurrent code, because the special rule in the Java Memory Model creates some additional costs for final variables, which makes volatile superior to final variables for almost all use cases.
The special rule about final variables prevents, in some cases, a final variable from appearing to have different values. However, performance-wise the rule is irrelevant.
Having said that, here is a more detailed answer. But I have to warn you: the following description might contain some precarious information that most software developers should never care about, and it's better if they don't know about it.
The special rule about final variables in the Java Memory Model implies that it makes a difference to the Java VM and the JIT compiler whether a member variable is final or not.
public class Int {
    public /* final */ int value;

    public Int(int value) {
        this.value = value;
    }
}
If you take a look at the Hotspot source code, you will see that the compiler checks if the constructor of a class writes at least one final variable. If it does so, the compiler will emit additional code for the constructor, more precisely a memory release barrier. You will also find the following comment in the source code:
This method (which must be a constructor by the rules of Java)
wrote a final. The effects of all initializations must be
committed to memory before any code after the constructor
publishes the reference to the newly constructed object.
Rather than wait for the publication, we simply block the
writes here. Rather than put a barrier on only those writes
which are required to complete, we force all writes to complete.
That means the initialization of a final variable is similar to a write of a volatile variable. It implies some kind of memory release barrier. However, as can be seen from the quoted comment, final variables might be even more expensive. And what's even worse, you pay these additional costs for final variables regardless of whether they are used in concurrent code or not.
That's awful, because we want software developers to use final variables in order to increase the readability and maintainability of source code. Unfortunately, using final variables can significantly impact the performance of a program.
The question remains: Are there any use cases where the special rule regarding final variables helps to improve the performance of concurrent code?
That's hard to tell, because it depends on the actual implementation of the Java VM and the memory architecture of the machine. I haven't seen any such use case so far. A quick glance at the source code of the java.util.concurrent package has also revealed nothing.
The problem is: the initialization of a final variable is about as expensive as a write of a volatile or atomic variable. If you use a volatile variable for the reference to the newly created object, you get the same behaviour and costs, with the exception that the reference will also be published immediately. So, there is basically no benefit in using final variables for concurrent programming.
You are right, since locking makes stronger guarantees, the guarantee about availability of finals is not particularly useful in the presence of locking. However, locking is not always necessary to ensure reliable concurrent access.
As far as I know, in order to make thread A (which is running "reader()") see the reference to "f", we must use some synchronization mechanism, such as making f volatile, or using locks to synchronize access to f.
Making f volatile is not a synchronization mechanism; it forces threads to read the memory each time the variable is accessed, but it does not synchronize access to a memory location. Locking is a way to synchronize access, but it is not necessary in practice to guarantee that two threads share data reliably. For example, you could use a ConcurrentLinkedQueue<E>, which is a lock-free concurrent collection*, to pass data from one thread to another without synchronization. You could also use AtomicReference<T> to ensure reliable concurrent access to an object without locking.
It is when you use lock-free concurrency that the guarantee about the visibility of final fields comes in handy. If you make a lock-free collection and use it to store immutable objects, your threads will be able to access the content of the objects without additional locking.
* ConcurrentLinkedQueue<E> is not only lock-free, but also a wait-free collection (i.e. a lock-free collection with additional guarantees not relevant to this discussion).
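A minimal sketch of that lock-free usage (my own illustration): immutable messages handed from one thread to another through a ConcurrentLinkedQueue, with no locking on either side:

import java.util.concurrent.ConcurrentLinkedQueue;

final class Message {
    final String text; // final: fully visible to any thread once constructed

    Message(String text) {
        this.text = text;
    }
}

class QueueDemo {
    private final ConcurrentLinkedQueue<Message> queue = new ConcurrentLinkedQueue<>();

    void producer() {
        queue.offer(new Message("hello")); // lock-free enqueue
    }

    void consumer() {
        Message m = queue.poll(); // lock-free dequeue
        if (m != null) {
            System.out.println(m.text); // safe: Message is immutable
        }
    }
}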
Yes, final fields are useful in terms of thread-safety. They may not be useful in your example, but if you look at the old ConcurrentHashMap implementation, the get method doesn't apply any locking while it searches for the value, even though there is a risk that the list might change while the lookup is happening (think of ConcurrentModificationException). However, CHM uses a list whose 'next' field is final, guaranteeing the consistency of the list (the items already in the chain will not grow or shrink). So the advantage is that thread-safety is established without synchronization.
From the article
Exploiting immutability
One significant source of inconsistency is avoided by making the Entry
elements nearly immutable -- all fields are final, except for the
value field, which is volatile. This means that elements cannot be
added to or removed from the middle or end of the hash chain --
elements can only be added at the beginning, and removal involves
cloning all or part of the chain and updating the list head pointer.
So once you have a reference into a hash chain, while you may not know
whether you have a reference to the head of the list, you do know that
the rest of the list will not change its structure. Also, since the
value field is volatile, you will be able to see updates to the value
field immediately, greatly simplifying the process of writing a Map
implementation that can deal with a potentially stale view of memory.
While the new JMM provides initialization safety for final variables,
the old JMM does not, which means that it is possible for another
thread to see the default value for a final field, rather than the
value placed there by the object's constructor. The implementation
must be prepared to detect this as well, which it does by ensuring
that the default value for each field of Entry is not a valid value.
The list is constructed such that if any of the Entry fields appear to
have their default value (zero or null), the search will fail,
prompting the get() implementation to synchronize and traverse the
chain again.
Article link: https://www.ibm.com/developerworks/library/j-jtp08223/
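Roughly, the entry structure the article describes looks like the sketch below (a simplified illustration based on the quoted description, not the actual ConcurrentHashMap source): the key, hash, and next pointer are final, and only value is volatile. Because next is final, entries can only be prepended to the head of a chain, which is why a reader holding a reference into the chain never sees its structure change.

final class Entry<K, V> {
    final K key;
    final int hash;
    final Entry<K, V> next; // final: the tail of the chain can never change under a reader
    volatile V value;       // volatile: updates to the value are immediately visible

    Entry(K key, int hash, Entry<K, V> next, V value) {
        this.key = key;
        this.hash = hash;
        this.next = next;
        this.value = value;
    }
}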

Effectively Immutable Object

I want to make sure that I correctly understand the 'Effectively Immutable Objects' behavior according to Java Memory Model.
Let's say we have a mutable class which we want to publish as an effectively immutable:
class Outworld {
    // This MAY be accessed by multiple threads
    public static volatile MutableLong published;
}

// This class is mutable
class MutableLong {
    private long value;

    public MutableLong(long value) {
        this.value = value;
    }

    public void increment() {
        value++;
    }

    public long get() {
        return value;
    }
}
We do the following:
// Create a mutable object and modify it
MutableLong val = new MutableLong(1);
val.increment();
val.increment();
// No more modifications
// UPDATED: Let's say for this example we are completely sure
// that no one will ever call increment() since now
// Publish it safely and consider Effectively Immutable
Outworld.published = val;
The question is:
Does Java Memory Model guarantee that all threads MUST have Outworld.published.get() == 3 ?
According to Java Concurrency In Practice this should be true, but please correct me if I'm wrong.
3.5.3. Safe Publication Idioms
To publish an object safely, both the reference to the object and the
object's state must be made visible to other threads at the same time.
A properly constructed object can be safely published by:
- Initializing an object reference from a static initializer;
- Storing a reference to it into a volatile field or AtomicReference;
- Storing a reference to it into a final field of a properly constructed object; or
- Storing a reference to it into a field that is properly guarded by a lock.
3.5.4. Effectively Immutable Objects
Safely published effectively immutable objects can be used safely by
any thread without additional synchronization.
Yes. The writes to the MutableLong happen-before the volatile write of Outworld.published, which in turn happens-before any subsequent read of it.
(It is possible that a thread reads Outworld.published and passes it on to another thread unsafely. In theory, that could see earlier state. In practice, I don't see it happening.)
There is a couple of conditions which must be met for the Java Memory Model to guarantee that Outworld.published.get() == 3:
the snippet of code you posted which creates and increments the MutableLong, then sets the Outworld.published field, must happen with visibility between the steps. One way to achieve this trivially is to have all that code running in a single thread - guaranteeing "as-if-serial semantics". I assume that's what you intended, but thought it worth pointing out.
reads of Outworld.published must have happens-after semantics from the assignment. An example of this could be having the same thread execute Outworld.published = val; and then launch the other threads which could read the value. This would guarantee "as if serial" semantics, preventing re-ordering of the reads before the assignment.
If you are able to provide those guarantees, then the JMM will guarantee all threads see Outworld.published.get() == 3.
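A small sketch of those two conditions together (my own example, assuming the Outworld and MutableLong classes from the question): a single thread builds and publishes the value, and only then starts the readers, so every reader observes 3:

class PublishDemo {
    public static void main(String[] args) {
        // condition 1: build and publish in a single thread, in program order
        MutableLong val = new MutableLong(1);
        val.increment();
        val.increment();
        Outworld.published = val; // volatile write

        // condition 2: readers start after the assignment,
        // so their reads happen-after the volatile write
        for (int i = 0; i < 4; i++) {
            new Thread(() -> System.out.println(Outworld.published.get())).start(); // prints 3
        }
    }
}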
However, if you're interested in general program design advice in this area, read on.
For the guarantee that no other threads ever see a different value for Outworld.published.get(), you (the developer) have to guarantee that your program does not modify the value in any way, either by subsequently executing Outworld.published = differentVal; or Outworld.published.increment();. While that is possible to guarantee, it can be so much easier if you design your code to avoid both the mutable object and using a static non-final field as a global point of access for multiple threads:
instead of publishing MutableLong, copy the relevant values into a new instance of a different class whose state cannot be modified. E.g.: introduce a class ImmutableLong, which assigns value to a final field on construction and doesn't have an increment() method (see the sketch after this list).
instead of multiple threads accessing a static non-final field, pass the object as a parameter to your Callable/Runnable implementations. This will prevent the possibility of one rogue thread from reassigning the value and interfering with the others, and is easier to reason about than static field reassignment. (Admittedly, if you're dealing with legacy code, this is easier said than done).
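A minimal sketch of such an ImmutableLong (the class name comes from the answer; the code itself is my own illustration):

public final class ImmutableLong {
    private final long value; // set once in the constructor, never changed

    public ImmutableLong(long value) {
        this.value = value;
    }

    public long get() {
        return value;
    }

    // instead of mutating, return a new instance
    public ImmutableLong increment() {
        return new ImmutableLong(value + 1);
    }
}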
The question is: Does Java Memory Model guarantee that all threads
MUST have Outworld.published.get() == 3 ?
The short answer is no, because other threads might read Outworld.published before the assignment has happened and see null.
After the moment when Outworld.published = val; has been performed - on the condition that no other modifications are made to val - yes, it will always be 3.
But if any thread performs val.increment(), then its value might be different for other threads.

Thread Safe Copying of Objects in Java

I have a static array of classes similar to the following:
public class Entry {
    private String sharedvariable1 = "";
    private String sharedvariable2 = "";
    private int sharedvariable3 = -1;
    private int mutablevariable1 = -1;
    private int mutablevariable2 = -2;

    public Entry(String sharedvariable1,
                 String sharedvariable2,
                 int sharedvariable3) {
        this.sharedvariable1 = sharedvariable1;
        this.sharedvariable2 = sharedvariable2;
        this.sharedvariable3 = sharedvariable3;
    }

    public Entry(Entry entry) { // copy constructor.
        this(entry.getSharedvariable1(),
             entry.getSharedvariable2(),
             entry.getSharedvariable3());
    }
    ....
    /* other methods including getters and setters */
}
At some point in my program I access an instance of this object and make a copy of it using the copy constructor above. I then change the value of the two mutable variables above. This program is running in a multithreaded environment. Please note: ALL VARIABLES ARE SET WITH THEIR INITIAL VALUES PRIOR TO THREADING. Only after the program is threaded and a copy is made are the variables changed. I believe that it is thread safe because I am only reading the static object, not writing to it (even sharedvariable3, although an int and mutable, is only read), and I am only making changes to the copy of the static object (and the copy is being made within a thread). But I want to confirm that my thinking is correct here.
Can someone please evaluate what I am doing?
It is not thread-safe. You need to wrap anything that modifies the sharedvariables thusly:
synchronized (this) {
    this.sharedvariable1 = newValue;
}
For setters, you can do this instead:
public synchronized void setSharedvariable1(String sharedvariable1) {
    this.sharedvariable1 = sharedvariable1;
}
Then in your copy constructor, you'll do similarly:
public Entry(Entry entry) {
    synchronized (entry) {
        this.setSharedvariable1(entry.getSharedvariable1());
        this.setSharedvariable2(entry.getSharedvariable2());
        this.setSharedvariable3(entry.getSharedvariable3());
    }
}
This ensures that if modifications are being made to an instance, the copy operation will wait until the modifications are done.
It is not thread-safe, you should synchronize in your copy constructor. You are reading each of the three variables from the original object in your copy constructor. These operations are not atomic together. So it could be that while you are reading the first value the third value gets changed by another thread. In this case you have a "copied" object in an inconsistent state.
It's not thread safe. And I mean that it does not guarantee thread safety for multiple threads that use the same Entry instance.
The problem I see here is as follows:
Thread 1 starts constructing an Entry instance. It does not keep that instance hidden from other threads' access.
Thread 2 accesses that instance, using its copy constructor, while it is still in the middle of construction.
Considering the initial value of Entry's field private int sharedvariable3 = -1;, the result might be that the new "copied" instance created by Thread 2 will have its sharedvariable3 field set to 0 (the default for int fields in Java).
That's the problem.
If it bothers you, you've got to either synchronize the read/write operations, or take care of Entry instances publication. Meaning, don't allow access of other threads to an Entry instance that is in the middle of construction.
I don't really get why you consider private instance variables to be shared. Usually shared fields are static, not private; I recommend that you not share private instance variables. For thread-safety you should synchronize the operations that mutate the variables' values.
You can use the synchronized keyword for that but choose the correct monitor object (I think the entry itself should do). Another alternative is to use some lock implementation from java.util.concurrent. Usually locks offer higher throughput and better granularity (for example multiple parallel reads but only one write at any given time).
Another thing you have to think about is what is called the memory barrier. Have a look at this interesting article http://java.dzone.com/articles/java-memory-model-programer%E2%80%99s
You can enforce happens-before semantics with the volatile keyword. Explicit synchronization (locks or synchronized code) also crosses the memory barrier and enforces happens-before semantics.
Finally a general piece of advice: You should avoid shared mutable state at all costs. Synchronization is a pain in the ass (performance and maintenance wise). Bugs that result from incorrect synchronization are incredibly hard to detect. It is better to design for immutability or isolated mutability (e.g. actors).
The answer is that it is thread safe under the conditions outlined since I am only reading from the variables in their static state and only changing the copies.

Memory barriers and coding style over a Java VM

Suppose I have a static complex object that gets periodically updated by a pool of threads, and read more or less continually in a long-running thread. The object itself is always immutable and reflects the most recent state of something.
class Foo {
    int a, b;
}

static Foo theFoo;

void updateFoo(int newA, int newB) {
    Foo f = new Foo();
    f.a = newA;
    f.b = newB;
    // HERE
    theFoo = f;
}

void readFoo() {
    Foo f = theFoo;
    // use f...
}
I do not care in the least whether my reader sees the old or the new Foo, however I need to see a fully initialized object. IIUC, The Java spec says that without a memory barrier in HERE, I may see an object with f.b initialized but f.a not yet committed to memory. My program is a real-world program that will sooner or later commit stuff to memory, so I don't need to actually commit the new value of theFoo to memory right away (though it wouldn't hurt).
What do you think is the most readable way to implement the memory barrier? I am willing to pay a little performance price for the sake of readability if need be. I think I can just synchronize the assignment to theFoo and that would work, but I'm not sure it's very obvious to someone reading the code why I do that. I could also synchronize the whole initialization of the new Foo, but that would introduce more locking than is actually needed.
How would you write it so that it's as readable as possible ?
Bonus kudos for a Scala version :)
Short Answers to the Original Question
If Foo is immutable, simply making the fields final will ensure complete initialization and consistent visibility of fields to all threads irrespective of synchronization.
Whether or not Foo is immutable, publication via volatile theFoo or AtomicReference<Foo> theFoo is sufficient to ensure that writes to its fields are visible to any thread reading via theFoo reference
Using a plain assignment to theFoo, reader threads are never guaranteed to see any update
In my opinion, and based on JCiP, the "most readable way to implement the memory barrier" is AtomicReference<Foo>, with explicit synchronization coming in second, and use of volatile coming in third
Sadly, I have nothing to offer in Scala
You can use volatile
I blame you. Now I'm hooked, I've broken out JCiP, and now I'm wondering if any code I've ever written is correct. The code snippet above is, in fact, potentially inconsistent. (Edit: see the section below on Safe publication via volatile.) The reading thread could also see stale values (in this case, whatever the default values for a and b were) for an unbounded time. You can do one of the following to introduce a happens-before edge:
Publish via volatile, which creates a happens-before edge equivalent to a monitorenter (read side) or monitorexit (write side)
Use final fields and initialize the values in a constructor before publication
Introduce a synchronized block when writing the new values to theFoo object
Use AtomicInteger fields
These get the write ordering solved (and solve the visibility issues). Then you need to address visibility of the new theFoo reference. Here, volatile is appropriate -- JCiP says in section 3.1.4 "Volatile variables" (and here, the variable is theFoo):
You can use volatile variables only when all the following criteria are met:
Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value;
The variable does not participate in invariants with other state variables; and
Locking is not required for any other reason while the variable is being accessed
If you do the following, you're golden:
class Foo {
    // it turns out these fields need not be final; with the volatile publish,
    // the values will be seen under the new JMM
    final int a, b;

    Foo(final int a, final int b) {
        this.a = a;
        this.b = b;
    }
}

// without volatile here, separate threads A' calling readFoo()
// may never see the new theFoo value written by thread A
static volatile Foo theFoo;

void updateFoo(int newA, int newB) {
    Foo f = new Foo(newA, newB);
    theFoo = f;
}

void readFoo() {
    final Foo f = theFoo;
    // use f...
}
Straightforward and Readable
Several folks on this and other threads (thanks @John V) note that the authorities on these issues emphasize the importance of documentation of synchronization behavior and assumptions. JCiP talks in detail about this, provides a set of annotations that can be used for documentation and static checking, and you can also look at the JMM Cookbook for indicators about specific behaviors that would require documentation and links to the appropriate references. Doug Lea has also prepared a list of issues to consider when documenting concurrency behavior. Documentation is appropriate particularly because of the concern, skepticism, and confusion surrounding concurrency issues (on SO: "Has java concurrency cynicism gone too far?"). Also, tools like FindBugs are now providing static checking rules to notice violations of JCiP annotation semantics, like "Inconsistent Synchronization: IS_FIELD_NOT_GUARDED".
Until you think you have a reason to do otherwise, it's probably best to proceed with the most readable solution, something like this (thanks, @Burleigh Bear), using the @Immutable and @GuardedBy annotations.
@Immutable
class Foo {
    final int a, b;

    Foo(final int a, final int b) {
        this.a = a;
        this.b = b;
    }
}

static final Object theFooSync = new Object();

@GuardedBy("theFooSync")
static Foo theFoo;

void updateFoo(final int newA, final int newB) {
    Foo f = new Foo(newA, newB);
    synchronized (theFooSync) {
        theFoo = f;
    }
}

void readFoo() {
    final Foo f;
    synchronized (theFooSync) {
        f = theFoo;
    }
    // use f...
}
or, possibly, since it's cleaner:
static final AtomicReference<Foo> theFoo = new AtomicReference<>();
void updateFoo(final int newA, final int newB) {
    theFoo.set(new Foo(newA, newB));
}
void readFoo() { Foo f = theFoo.get(); /* ... */ }
When is it appropriate to use volatile
First, note that this question pertains to the question here, but has been addressed many, many times on SO:
When exactly do you use volatile?
Do you ever use the volatile keyword in Java
For what is used "volatile"
Using volatile keyword
Java volatile boolean vs. AtomicBoolean
In fact, a google search: "site:stackoverflow.com +java +volatile +keyword" returns 355 distinct results. Use of volatile is, at best, a volatile decision. When is it appropriate? The JCiP gives some abstract guidance (cited above). I'll collect some more practical guidelines here:
I like this answer: "volatile can be used to safely publish immutable objects", which neatly encapsulates most of the range of use one might expect from an application programmer.
@mdma's answer here: "volatile is most useful in lock-free algorithms" summarizes another class of uses - special purpose, lock-free algorithms which are sufficiently performance sensitive to merit careful analysis and validation by an expert.
Safe Publication via volatile
Following up on @Jed Wesley-Smith, it appears that volatile now provides stronger guarantees (since JSR-133), and the earlier assertion "You can use volatile provided the object published is immutable" is sufficient but perhaps not necessary.
Looking at the JMM FAQ, the two entries How do final fields work under the new JMM? and What does volatile do? aren't really dealt with together, but I think the second gives us what we need:
The difference is that it is now no
longer so easy to reorder normal field
accesses around them. Writing to a
volatile field has the same memory
effect as a monitor release, and
reading from a volatile field has the
same memory effect as a monitor
acquire. In effect, because the new
memory model places stricter
constraints on reordering of volatile
field accesses with other field
accesses, volatile or not, anything
that was visible to thread A when it
writes to volatile field f becomes
visible to thread B when it reads f.
I'll note that, despite several rereadings of JCiP, the relevant text there didn't leap out to me until Jed pointed it out. It's on p. 38, section 3.1.4, and it says more or less the same thing as this preceding quote -- the published object need only be effectively immutable, no final fields required, QED.
Older stuff, kept for accountability
One comment: Any reason why newA and newB can't be arguments to the constructor? Then you can rely on publication rules for constructors...
Also, using an AtomicReference likely clears up any uncertainty (and may buy you other benefits depending on what you need to get done in the rest of the class...) Also, someone smarter than me can tell you if volatile would solve this, but it always seems cryptic to me...
In further review, I believe that the comment from @Burleigh Bear above is correct --- (EDIT: see below) you actually don't have to worry about out-of-sequence ordering here, since you are publishing a new object to theFoo. While another thread could conceivably see inconsistent values for newA and newB as described in JLS 17.11, that can't happen here because they will be committed to memory before the other thread gets hold of a reference to the new f = new Foo() instance you've created... this is safe one-time publication. On the other hand, if you wrote:
void updateFoo(int newA, int newB) {
    Foo f = new Foo();
    theFoo = f;    // published before the fields are written (unsafe)
    f.a = newA;
    f.b = newB;
}
But in that case the synchronization issues are fairly transparent, and ordering is the least of your worries. For some useful guidance on volatile, take a look at this developerWorks article.
However, you may have an issue where separate reader threads can see the old value for theFoo for unbounded amounts of time. In practice, this seldom happens. However, the JVM may be allowed to cache away the value of the theFoo reference in another thread's context. I'm quite sure marking theFoo as volatile will address this, as will any kind of synchronizer or AtomicReference.
Having an immutable Foo with final a and b fields solves the visibility issues with the default values, but so does making theFoo volatile.
Personally I like having immutable value classes anyway, as they are much harder to misuse.
