Could you explain how f.y could be seen as 0 instead of 4?
Would that be because another thread updates the value from 4 to 0?
This example is taken from the JLS: https://docs.oracle.com/javase/specs/jls/se8/html/jls-17.html#jls-17.5
class FinalFieldExample {
final int x;
int y;
static FinalFieldExample f;
public FinalFieldExample() {
x = 3;
y = 4;
}
static void writer() {
f = new FinalFieldExample();
}
static void reader() {
if (f != null) {
int i = f.x; // guaranteed to see 3
int j = f.y; // could see 0
}
}
}
Assuming that we have two threads started, like this:
new Thread(FinalFieldExample::writer).start(); // Thread #1
new Thread(FinalFieldExample::reader).start(); // Thread #2
We might observe our program's actual order of operations to be the following:
Thread #1 writes x = 3.
Thread #1 writes f = ....
Thread #2 reads f and finds that it is not null.
Thread #2 reads f.x and sees 3.
Thread #2 reads f.y and sees 0, because y does not appear to be written yet.
Thread #1 writes y = 4.
In other words, Threads #1 and #2 are able to have their operations interleave in a way such that Thread #2 reads f.y before Thread #1 writes it.
Note also that the write to the static field f was allowed to be reordered so that it appears to happen before the write to f.y. This is just another consequence of the absence of any kind of synchronization. If we also declared f as volatile, this reordering would be prevented.
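To make the volatile variant concrete, here is a minimal sketch. The class name VolatilePublication and the int[] return value of reader() are my own choices for illustration and are not part of the JLS example. The volatile write in writer() happens-before any volatile read that sees it, so a reader that observes a non-null reference should also observe both constructor writes.

```java
// Sketch: same shape as FinalFieldExample, but the shared reference is volatile.
class VolatilePublication {
    final int x;
    int y; // still non-final

    // A volatile write happens-before every volatile read that sees that write.
    static volatile VolatilePublication f;

    VolatilePublication() {
        x = 3;
        y = 4;
    }

    static void writer() {
        f = new VolatilePublication();
    }

    // Returns {x, y} as observed by the caller, or null if nothing is published yet.
    static int[] reader() {
        VolatilePublication local = f; // one volatile read of the shared reference
        if (local == null) {
            return null;
        }
        // Both constructor writes are ordered before the volatile write that was read.
        return new int[] { local.x, local.y };
    }
}
```

Reading f once into a local also avoids relying on two separate reads of the shared field.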
There's some talk in the comments about writing to final fields with reflection, which is true. This is discussed in §17.5.3:
In some cases, such as deserialization, the system will need to change the final fields of an object after construction. final fields can be changed via reflection and other implementation-dependent means.
It's therefore possible in the general case for Thread #2 to see any value when it reads f.x.
There's also a more conventional way to see the default value of a final field, by simply leaking this before the assignment:
class Example {
final int x;
Example() {
leak(this);
x = 5;
}
static void leak(Example e) { System.out.println(e.x); }
public static void main(String[] args) { new Example(); }
}
I think that if FinalFieldExample's constructor were like this:
static FinalFieldExample f;
public FinalFieldExample() {
f = this;
x = 3;
y = 4;
}
Thread #2 would be able to read f.x as 0 as well.
This is from §17.5:
An object is considered to be completely initialized when its constructor finishes. A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object's final fields.
The more technical sections of specification for final contain wording like that as well.
Could you explain how the value of f.y could be seen 0 instead of 4?
In Java, one of the important optimizations performed by the compiler/JVM is the reordering of instructions. As long as it doesn't violate the language specifications, the compiler is free to reorder all instructions for efficiency reasons. During object construction, it is possible for an object to be instantiated, the constructor to finish, and its reference published before all of the fields in the object have been properly initialized.
However, the Java language specification says that if a field is marked as final then it must be properly initialized by the time the constructor finishes. To quote from the section of the spec you reference:
An object is considered to be completely initialized when its constructor finishes. A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object's final fields.
So by the time the FinalFieldExample is constructed and assigned to f, the x field must be properly initialized to 3; the y field, however, may or may not have been initialized. So if thread1 calls writer() and thread2 then calls reader() and sees f as not null, y could be 0 (not yet initialized) or 4 (initialized).
The code sample below is taken from JLS 17.5 "final Field Semantics":
class FinalFieldExample {
final int x;
int y;
static FinalFieldExample f;
public FinalFieldExample() {
x = 3;
y = 4;
}
static void writer() {
f = new FinalFieldExample();
}
static void reader() {
if (f != null) {
int i = f.x; // guaranteed to see 3
int j = f.y; // could see 0
}
}
}
Since the instance of FinalFieldExample is published through a data race, is it possible that the f != null check evaluates successfully, yet subsequent f.x dereference sees f as null?
In other words, is it possible to get a NullPointerException on the line commented with "guaranteed to see 3"?
Okay, here is my own take on it, based on quite a detailed talk (in Russian) on final semantics given by Vladimir Sitnikov, and subsequent revisit of JLS 17.5.1.
Final field semantics
The specification states:
Given a write w, a freeze f, an action a (that is not a read of a final field), a read r1 of the final field frozen by f, and a read r2 such that hb(w, f), hb(f, a), mc(a, r1), and dereferences(r1, r2), then when determining which values can be seen by r2, we consider hb(w, r2).
In other words, we are guaranteed to see the write to a final field if the following chain of relations can be built:
hb(w, f) -> hb(f, a) -> mc(a, r1) -> dereferences(r1, r2)
1. hb(w, f)
w is the write to the final field: x = 3
f is the "freeze" action (exiting FinalFieldExample constructor):
Let o be an object, and c be a constructor for o in which a final
field f is written. A freeze action on final field f of o takes place
when c exits, either normally or abruptly.
As the field write comes before finishing the constructor in program order, we can assume that hb(w, f):
If x and y are actions of the same thread and x comes before y in program order, then hb(x, y)
2. hb(f, a)
The definition of a given in the specification is really vague ("action, that is not a read of a final field").
We can assume that a is the publication of a reference to the object (f = new FinalFieldExample()), since this assumption does not contradict the spec (it is an action, and it is not a read of a final field).
Since finishing the constructor comes before writing the reference in program order, these two operations are ordered by a happens-before relationship: hb(f, a)
3. mc(a, r1)
In our case r1 is a "read of the final field frozen by f" (f.x)
And this is where it starts to get interesting. mc (Memory Chain) is one of the two additional partial orders introduced in "Semantics of final Fields" section:
There are several constraints on the memory chain ordering:
If r is a read that sees a write w, then it must be the case that mc(w, r).
If r and a are actions such that dereferences(r, a), then it must be the case that mc(r, a).
If w is a write of the address of an object o by a thread t that did not initialize o, then there must exist some read r by thread t that sees the address of o such that mc(r, w).
For the simple example given in question we're really only interested in the first point, as the other two are needed to reason about more complicated cases.
Below is the part that actually explains why it is possible to get an NPE:
notice the first constraint in the spec quote ("If r is a read that sees a write w"): the mc(a, r1) relation only exists if the read of the field sees the write to the shared reference
f != null and f.x are two distinct read operations from the JMM standpoint
there is nothing in the spec that says that mc relations are transitive with respect to program-order or happens-before
therefore if f != null sees the write done by another thread, there are no guarantees that f.x sees it too
I won't go into the details of the Dereference Chain constraints, as they are needed only to reason about longer reference chains (e.g. when a final field refers to an object, which in turn refers to another object).
For our simple example it suffices to say that JLS states that "dereferences order is reflexive, and r1 can be the same as r2" (which is exactly our case).
Safe way of dealing with unsafe publication
Below is the modified version of the code that is guaranteed to not throw an NPE:
class FinalFieldExample {
final int x;
int y;
static FinalFieldExample f;
public FinalFieldExample() {
x = 3;
y = 4;
}
static void writer() {
f = new FinalFieldExample();
}
static void reader() {
FinalFieldExample local = f;
if (local != null) {
int i = local.x; // guaranteed to see 3
int j = local.y; // could see 0
}
}
}
The important difference here is reading the shared reference into a local variable.
As stated by JLS:
Local variables ... are never shared between threads and are unaffected by the memory model.
Therefore, there is only one read from shared state from the JMM standpoint.
If that read happens to see the write done by another thread, it would imply the two operations are connected with a memory chain (mc) relationship.
Furthermore, local = f and i = local.x are connected with dereference chain relationship, which gives us the whole chain mentioned in the beginning:
hb(w, f) -> hb(f, a) -> mc(a, r1) -> dereferences(r1, r2)
Your analysis is beautiful (+1); if I could upvote twice, I would. There is one more discussion of the same problem with "independent reads", for example.
I have also tried to approach this problem in a different answer.
I think if we introduce the same concept here, things could be provable, too. Let's take that method and slightly change it:
static void reader() {
FinalFieldExample instance1 = f;
if (instance1 != null) {
FinalFieldExample instance2 = f;
int i = instance2.x;
FinalFieldExample instance3 = f;
int j = instance3.y;
}
}
And a compiler can now do some eager reads (move those reads before the if statement):
static void reader() {
FinalFieldExample instance1 = f;
FinalFieldExample instance2 = f;
FinalFieldExample instance3 = f;
if (instance1 != null) {
int i = instance2.x;
int j = instance3.y;
}
}
Those reads can be further re-ordered between them:
static void reader() {
FinalFieldExample instance2 = f;
FinalFieldExample instance1 = f;
FinalFieldExample instance3 = f;
if (instance1 != null) {
int i = instance2.x;
int j = instance3.y;
}
}
Things should be trivial from here: ThreadA executes FinalFieldExample instance2 = f; and reads null. Before it does the next read, some ThreadB calls writer (so f != null), and the read:
FinalFieldExample instance1 = f;
is resolved to non-null. The check instance1 != null then passes, and instance2.x throws a NullPointerException.
Let's say I have the following class:
private final int[] ints = new int[5];
public void set(int index, int value) {
ints[index] = value;
}
public int get(int index) {
return ints[index];
}
Thread A runs the following:
set(1, 100);
(very) shortly after Thread B runs the following:
get(1)
My understanding is that there is no guarantee that Thread B will see the change that Thread A has made, as the change could still be sitting in a CPU cache or register, or there could be instruction reordering. Is this correct?
Moving further on, what happens if I have the following class:
public class ImmutableArray {
public final int[] ints;
public ImmutableArray(int[] ints) {
this.ints = ints;
}
}
With the following field (a local variable cannot be declared volatile):
volatile ImmutableArray arr;
and Thread A runs the following:
int[] ints = new int[5];
ints[0] = 1;
arr = new ImmutableArray(ints);
(very) shortly after Thread B runs the following:
int i = arr.ints[0];
Is Thread B guaranteed to get the value 1 because of final and the happens-before relationship, even though the value in the array was set outside the constructor?
EDIT: In the second example, the array is never changed, hence the name "ImmutableArray"
Second EDIT:
So what I understand by the answer is that given this:
int a = 0;
volatile int b = 0;
If Thread A does the following:
a = 1;
b = 1;
Then Thread B does the following:
a == 1; // true
b == 1; // true
So volatile acts as a sort of barrier, but at what point does the barrier end and allow reordering of instructions again?
You are correct on both counts. In fact, you would be right even if you relaxed your problem statement.
In the first example, no matter how much later thread B evaluates get(1), there will be no guarantee that it will ever observe the value written by thread A's call to set(1, 100).
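If the first example does need that guarantee, one option (my suggestion, not something from the question) is java.util.concurrent.atomic.AtomicIntegerArray, whose get and set have volatile read and write semantics per element. A minimal sketch, with SharedInts as a made-up class name:

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Sketch: the first example rewritten so that element writes are visible across threads.
class SharedInts {
    // set() behaves like a volatile write, get() like a volatile read, per element.
    private final AtomicIntegerArray ints = new AtomicIntegerArray(5);

    public void set(int index, int value) {
        ints.set(index, value);
    }

    public int get(int index) {
        return ints.get(index);
    }
}
```

A get(1) that runs after set(1, 100) in synchronization order then returns 100. The SharedInts instance itself still has to reach the other thread somehow; the final field at least guarantees that a thread which sees the instance also sees the array.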
In the second example, either final on ints or volatile on arr would be enough on its own for you to have the guarantee of observing ints[0] == 1. Without volatile you wouldn't be guaranteed to ever observe the ImmutableArray instance itself, but whenever you did observe it, you would be guaranteed to observe it in the fully constructed state. More specifically, the guarantee pertains to the complete state reachable from the object's final field as of the moment when the constructor returns.
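Here is the second example written out as a small sketch. ArrayHolder, publish() and read() are names I made up to have something to call; the question only shows the field and the two code fragments. The array element is written before the constructor runs, so it is part of the state reachable from the final field when the constructor returns.

```java
// Sketch of the second example: final field plus volatile publication.
class ImmutableArray {
    public final int[] ints;

    public ImmutableArray(int[] ints) {
        this.ints = ints; // frozen when the constructor exits
    }
}

class ArrayHolder {
    static volatile ImmutableArray arr;

    static void publish() {
        int[] ints = new int[5];
        ints[0] = 1;                    // written before the constructor runs
        arr = new ImmutableArray(ints); // reference is stored only after construction
    }

    // Returns ints[0] as seen by the caller, or -1 if nothing has been published yet.
    static int read() {
        ImmutableArray local = arr;     // single read of the shared reference
        return local == null ? -1 : local.ints[0];
    }
}
```

Writes to the array made after the constructor has returned are not covered by the final-field guarantee and would need their own synchronization.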
Every one of you knows about this feature of the JMM: sometimes a reference to an object can receive a value before the object's constructor has finished.
In JLS 7, §17.5 "final Field Semantics", we can also read:
The usage model for final fields is a simple one: Set the final fields
for an object in that object's constructor; and do not write a
reference to the object being constructed in a place where another
thread can see it before the object's constructor is finished. If this
is followed, then when the object is seen by another thread, that
thread will always see the correctly constructed version of that
object's final fields. (1)
And just after that in the JLS an example follows, which demonstrates how a non-final field is not guaranteed to be initialized (Example 17.5-1) (2):
class FinalFieldExample {
final int x;
int y;
static FinalFieldExample f;
public FinalFieldExample() {
x = 3;
y = 4;
}
static void writer() {
f = new FinalFieldExample();
}
static void reader() {
if (f != null) {
int i = f.x; // guaranteed to see 3
int j = f.y; // could see 0
}
}
}
Also, in this question-answer Mr. Gray wrote:
If you mark the field as final then the constructor is guaranteed to
finish initialization as part of the constructor. Otherwise you will
have to synchronize on a lock before using it. (3)
So, the question is:
1) According to statement (1), we should avoid sharing a reference to an immutable object before its constructor is finished.
2) According to the JLS's example (2) and conclusion (3), it seems that we can safely share a reference to an immutable object before its constructor is finished, i.e. when all its fields are final.
Isn't there some contradiction?
EDIT-1: What exactly I mean. If we modify the class in the example so that field y is also final (2):
class FinalFieldExample {
final int x;
final int y;
...
then in the reader() method it will be guaranteed that:
if (f != null) {
int i = f.x; // guaranteed to see 3
int j = f.y; // guaranteed to see 4, isn't it???
If so, why should we avoid writing a reference to object f before its constructor is finished (according to (1)), when all fields of f are final?
Isn't there some contradiction [in the JLS around constructors and object publishing]?
I believe these are slightly different issues that are not contradictory.
The JLS reference is talking about storing an object reference in a place where other threads can see it before the constructor is finished. For example, in a constructor, you should not put an object into a static field that is used by other threads, nor should you fork a thread.
public class FinalFieldExample {
public FinalFieldExample() {
...
// very bad idea because the constructor may not have finished
FinalFieldExample.f = this;
...
}
}
You shouldn't start the thread in a constructor either:
// (ideally this class would implement Runnable rather than extend Thread)
public class MyThread extends Thread {
public MyThread() {
...
// very bad idea because the constructor may not have finished
this.start();
}
}
Even if all of your fields are final in a class, sharing the reference to the object to another thread before the constructor finishes cannot guarantee that the fields have been set by the time the other threads start using the object.
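A common way around starting a thread in the constructor is a private constructor plus a static factory that starts the thread only after the constructor has returned. The sketch below is only an illustration; Worker, startNew and awaitResult are hypothetical names, and the "work" is a placeholder multiplication.

```java
// Sketch: construct first, start the thread afterwards.
final class Worker implements Runnable {
    private final int seed;
    private volatile int result;
    private Thread thread; // assigned by the factory, after construction

    private Worker(int seed) {
        this.seed = seed; // nothing escapes and nothing is started in here
    }

    // Hypothetical factory: the constructor has fully returned before start().
    static Worker startNew(int seed) {
        Worker w = new Worker(seed);
        w.thread = new Thread(w);
        w.thread.start(); // start() happens-before every action in run()
        return w;
    }

    @Override
    public void run() {
        result = seed * 2; // placeholder work
    }

    // Waits for the thread to finish and returns what it computed.
    int awaitResult() {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }
}
```

Usage would be something like Worker.startNew(21).awaitResult(). Because the thread starts after construction, run() can never observe a half-built Worker.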
My answer was talking about using an object without synchronization after the constructor had finished. It's a slightly different question although similar with regards to constructors, lack of synchronization, and reordering of operations by the compiler.
In JLS 17.5-1 they don't assign a static field inside of the constructor. They assign the static field in another static method:
static void writer() {
f = new FinalFieldExample();
}
This is the critical difference.
In the full example
class FinalFieldExample {
final int x;
int y;
static FinalFieldExample f;
public FinalFieldExample() {
x = 3;
y = 4;
}
static void writer() {
f = new FinalFieldExample();
}
static void reader() {
if (f != null) {
int i = f.x; // guaranteed to see 3
int j = f.y; // could see 0
}
}
}
As you can see, f is not set until after the constructor returns. This means f.x is safe because it is final AND the constructor has returned.
In the following example, neither value is guaranteed to be set.
class FinalFieldExample {
final int x;
int y;
static FinalFieldExample f;
public FinalFieldExample() {
x = 3;
y = 4;
f = this; // assign before finished.
}
static void writer() {
new FinalFieldExample();
}
static void reader() {
if (f != null) {
int i = f.x; // not guaranteed to see 3
int j = f.y; // could see 0
}
}
}
According to statement (1) we should avoid sharing reference to immutable object before its constructor is finished
You should not allow a reference to an object to escape before it is constructed, for a number of reasons (immutable or otherwise); e.g. the constructor might throw an exception after you have stored the reference.
According to JLS's given example (2) and conclusion (3) it seems, that we can safely share reference to immutable object, i.e. when all its fields are final.
You can safely share a reference to an immutable object between threads after the object has been constructed.
Note: you can see the value of an immutable field before it is set in a method called by a constructor.
Constructor exit plays an important role here; the JLS says "A freeze action on final field f of o takes place when c exits". Publishing the reference before constructor exit and publishing it after are very different.
Informally
1 constructor enter{
2 assign final field
3 publish this
4 }constructor exit
5 publish the newly constructed object
[2] cannot be reordered beyond constructor exit, so [2] cannot be reordered after [5].
But [2] can be reordered after [3].
Statement 1) does not say what you think it does. If anything, I would rephrase your statement:
1) According to statement (1) we should avoid sharing reference to
immutable object before its constructor is finished
to read
1) According to statement (1) we should avoid sharing reference to
mutable object before its constructor is finished
where what I mean by mutable is an object that has ANY non-final fields or final references to mutable objects. (I have to admit I'm not 100% sure that you need to worry about final references to mutable objects, but I think I'm right...)
To put it another way, you should distinguish between:
final fields (immutable parts of a possibly immutable object)
non-final fields that have to be initialized before anyone interacts with this object
non-final fields that do not have to be initialized before anyone interacts with this object
The second one is the problem spot.
So, you can share references to immutable objects (all fields are final), but you need to use caution with objects that have non-final fields that HAVE to be initialized before the object can be used by anyone.
In other words, for the edited JLS example you posted where both fields are final, int j = f.y; is guaranteed to see 4. But what that means is that you do NOT need to avoid writing the reference to object f, because it'll always be in a correctly initialized state before anyone could see it. You do not need to worry about it, the JVM does.
I read in "Java Concurrency In Practice" that "publishing objects before they are fully constructed can compromise thread safety".
Could someone explain this?
Consider this code:
public class World{
public static Point _point;
public static void main(String[] args){
new PointMaker().start();
System.out.println(_point);
}
}
public class Point{
private final int _x, _y;
public Point(int x, int y){
_x = x;
World._point = this;//BAD: publish myself before I'm fully constructed
//some long computation here
_y = y;
}
public String toString(){
return _x + "," + _y;
}
}
public class PointMaker extends Thread{
public void run(){
new Point(1, 1);
}
}
Because Point publishes itself before setting the value of _y, the call to println may yield "1,0" instead of the expected "1,1".
(Note that it may also yield "null" if PointMaker + Point.<init> don't get far enough to set the World._point field before the call to println executes.)
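For contrast, here is a sketch of the same idea with the publication moved out of the constructor. SafePoint, SafeWorld and makePoint are names I picked so they do not clash with the classes above. Since both fields are final and the reference is stored only after the constructor returns, a thread that sees a non-null point should print "1,1".

```java
// Sketch: the point is published only after its constructor has returned.
final class SafePoint {
    private final int x, y;

    SafePoint(int x, int y) {
        this.x = x;
        this.y = y; // no reference to `this` escapes the constructor
    }

    @Override
    public String toString() {
        return x + "," + y;
    }
}

class SafeWorld {
    // volatile is for the reference itself; the final fields already cover x and y.
    static volatile SafePoint point;

    static void makePoint() {
        point = new SafePoint(1, 1); // stored after construction is complete
    }
}
```

A reader may still see null if it looks before makePoint() has run, but it can no longer see "1,0".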
The new operator is allowed to return a value before the constructor of the class finishes. So a variable might not read null but instead contain an uninitialized class instance. This happens due to instruction reordering.
Some clarification:
From a single-thread perspective the JVM is allowed to reorder some instructions. When creating an instance, traditionally you would think it goes like this:
allocate memory
run initialization (constructor)
assign reference to var
While in fact the JVM might do something like:
allocate memory
assign reference to var
run initialization (constructor)
This has performance advantages since addresses don't need to be looked up again. From a single-thread perspective this doesn't change the order of the logic. Your program works fine. But this poses a problem in multithreaded code: the reference can be published before the constructor has run. Therefore you need a 'happens-before' rule to make sure the instance is fully initialized. Declaring variables volatile does enforce such happens-before rules.
More on reordering:
http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#reordering
I am reading the book Effective Java.
In an item Minimize Mutability , Joshua Bloch talks about making a class immutable.
Don’t provide any methods that modify the object’s state -- this is fine.
Ensure that the class can’t be extended. - Do we really need to do this?
Make all fields final - Do we really need to do this?
For example let's assume I have an immutable class,
class A{
private int a;
public A(int a){
this.a =a ;
}
public int getA(){
return a;
}
}
How can a class that extends A compromise A's immutability?
Like this:
public class B extends A {
private int b;
public B() {
super(0);
}
@Override
public int getA() {
return b++;
}
}
Technically, you're not modifying the fields inherited from A, but in an immutable object, repeated invocations of the same getter are of course expected to produce the same number, which is not the case here.
Of course, if you stick to rule #1, you're not allowed to create this override. However, you cannot be certain that other people will obey that rule. If one of your methods takes an A as a parameter and calls getA() on it, someone else may create the class B as above and pass an instance of it to your method; then, your method will, without knowing it, modify the object.
The Liskov substitution principle says that sub-classes can be used anywhere that a super class is. From the point of view of clients, the child IS-A parent.
So if you override a method in a child and make it mutable you're violating the contract with any client of the parent that expects it to be immutable.
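Putting the book's rules together, a version of A that clients can rely on might look like the sketch below. ImmutableA is just a name I chose to keep it apart from the A in the question.

```java
// Sketch: final class, final field, no mutators.
final class ImmutableA {          // final: no subclass can override getA()
    private final int a;          // final: assigned exactly once, in the constructor

    ImmutableA(int a) {
        this.a = a;
    }

    public int getA() {
        return a;                 // always the value passed to the constructor
    }
}
```

With the class declared final, an override like the one in B no longer compiles, so a method that accepts an ImmutableA knows getA() will keep returning the same value.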
If you declare a field final, there's more to it than making it a compile-time error to try to modify the field or leave it uninitialized.
In multithreaded code, if you share instances of your class A with data races (that is, without any kind of synchronization, e.g. by storing it in a globally available location such as a static field), it is possible that some threads will see the value of getA() change!
Final fields are guaranteed (by the Java Language Specification) to have their values visible to all threads after the constructor finishes, even without synchronization.
Consider these two classes:
final class A {
private final int x;
A(int x) { this.x = x; }
public int getX() { return x; }
}
final class B {
private int x;
B(int x) { this.x = x; }
public int getX() { return x; }
}
Both A and B are immutable, in the sense that you cannot modify the value of the field x after initialization (let's forget about reflection). The only difference is that the field x is marked final in A. You will soon realize the huge implications of this tiny difference.
Now consider the following code:
class Main {
static A a = null;
static B b = null;
public static void main(String[] args) {
new Thread(new Runnable() { public void run() { try {
while (a == null) Thread.sleep(50);
System.out.println(a.getX()); } catch (Throwable t) {}
}}).start();
new Thread(new Runnable() { public void run() { try {
while (b == null) Thread.sleep(50);
System.out.println(b.getX()); } catch (Throwable t) {}
}}).start();
a = new A(1); b = new B(1);
}
}
Suppose both threads happen to see that the fields they are watching are not null after the main thread has set them (note that, although this supposition might look trivial, it is not guaranteed by the JVM!).
In this case, we can be sure that the thread that watches a will print the value 1, because its x field is final -- so, after the constructor has finished, it is guaranteed that all threads that see the object will see the correct values for x.
However, we cannot be sure about what the other thread will do. The specs can only guarantee that it will print either 0 or 1. Since the field is not final, and we did not use any kind of synchronization (synchronized or volatile), the thread might see the field uninitialized and print 0! The other possibility is that it actually sees the field initialized, and prints 1. It cannot print any other value.
Also, what might happen is that, if you keep reading and printing the value of getX() of b, it could start printing 1 after a while of printing 0! In this case, it is clear why immutable objects must have their fields final: from the point of view of the second thread, b has changed, even though it is supposed to be immutable because it provides no setters!
If you want to guarantee that the second thread will see the correct value for x without making the field final, you could declare the field that holds the instance of B volatile:
class Main {
// ...
volatile static B b;
// ...
}
The other possibility is to synchronize when setting and when reading the field, either by modifying the class B:
final class B {
private int x;
private synchronized void setX(int x) { this.x = x; }
public synchronized int getX() { return x; }
B(int x) { setX(x); }
}
or by modifying the code of Main, adding synchronization to when the field b is read and when it is written -- note that both operations must synchronize on the same object!
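For the second option, here is a rough sketch of what synchronizing in Main could look like. PlainB, SyncMain, LOCK, publish and read are all names I invented for the illustration. The write and the read of the shared field both happen while holding the same lock, so the unlock in publish() happens-before a later lock in read().

```java
// Sketch: publishing a non-final object through a common lock.
final class PlainB {
    private int x; // non-final, as in class B

    PlainB(int x) {
        this.x = x;
    }

    public int getX() {
        return x;
    }
}

class SyncMain {
    private static final Object LOCK = new Object(); // the one lock both sides use
    private static PlainB b;

    static void publish(PlainB value) {
        synchronized (LOCK) {
            b = value;
        }
    }

    static PlainB read() {
        synchronized (LOCK) {
            return b; // may still be null if publish() has not run yet
        }
    }
}
```

The constructor's write to x comes before publish() in program order, so a thread that gets a non-null result from read() also sees x as initialized. If the two sides locked on different objects, there would be no such ordering.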
As you can see, the most elegant, reliable and performant solution is to make the field x final.
As a final note, it is not absolutely necessary for immutable, thread-safe classes to have all their fields final. However, these classes (thread-safe, immutable, containing non-final fields) must be designed with extreme care, and should be left for experts.
An example of this is the class java.lang.String. It has a private int hash; field, which is not final, and is used as a cache for the hashCode():
private int hash;
public int hashCode() {
int h = hash;
int len = count;
if (h == 0 && len > 0) {
int off = offset;
char val[] = value;
for (int i = 0; i < len; i++)
h = 31*h + val[off++];
hash = h;
}
return h;
}
As you can see, the hashCode() method first reads the (non-final) field hash. If it is uninitialized (ie, if it is 0), it will recalculate its value, and set it. For the thread that has calculated the hash code and written to the field, it will keep that value forever.
However, other threads might still see 0 for the field, even after a thread has set it to something else. In this case, these other threads will recalculate the hash, and obtain exactly the same value, then set it.
Here, what justifies the immutability and thread-safety of the class is that every thread will obtain exactly the same value for hashCode(), even if it is cached in a non-final field, because it will get recalculated and the exact same value will be obtained.
All this reasoning is very subtle, and this is why it is recommended that all fields are marked final on immutable, thread-safe classes.
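The same caching idiom can be sketched on a small class of your own. PersonName and its fields are hypothetical. The two details that matter are that hash is read exactly once into a local, and that the computed value depends only on final state, so every thread that recomputes it gets the same result.

```java
// Sketch: a lazily cached hash code in a non-final field.
final class PersonName {
    private final String first;
    private final String last;
    private int hash; // non-final cache; 0 means "not computed yet"

    PersonName(String first, String last) {
        this.first = first;
        this.last = last;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof PersonName)) return false;
        PersonName other = (PersonName) o;
        return first.equals(other.first) && last.equals(other.last);
    }

    @Override
    public int hashCode() {
        int h = hash;  // read the shared field exactly once
        if (h == 0) {
            h = 31 * first.hashCode() + last.hashCode();
            hash = h;  // racy write, but every thread writes the same value
        }
        return h;      // return the local, never re-read the field
    }
}
```

Returning the field instead of the local would introduce a second read, which could see 0 even after the first read saw a non-zero value.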
If the class is extended then the derived class may not be immutable.
If your class is immutable, then all fields will not be modified after creation. The final keyword will enforce this and make it obvious to future maintainers.
Adding this answer to point to the exact section of the Java Language Specification (§17.5) that explains why member variables need to be final in order to be thread-safe in an immutable class. Here's the example used in the spec, which I think is very clear:
class FinalFieldExample {
final int x;
int y;
static FinalFieldExample f;
public FinalFieldExample() {
x = 3;
y = 4;
}
static void writer() {
f = new FinalFieldExample();
}
static void reader() {
if (f != null) {
int i = f.x; // guaranteed to see 3
int j = f.y; // could see 0
}
}
}
Again, from the spec:
The class FinalFieldExample has a final int field x and a non-final int field y. One thread might execute the method writer and another might execute the method reader.
Because the writer method writes f after the object's constructor finishes, the reader method will be guaranteed to see the properly initialized value for f.x: it will read the value 3. However, f.y is not final; the reader method is therefore not guaranteed to see the value 4 for it.