Safe publication through final - java

Even after going through this, I am still not clear about how the use of final causes safe publication in the code below. Can someone give an easy-to-understand explanation?
public class SafeListener
{
    private final EventListener listener;
    private SafeListener()
    {
        listener = new EventListener()
        {
            public void onEvent(Event e)
            { doSomething(e); }
        };
    }
    public static SafeListener newInstance(EventSource source)
    {
        SafeListener safe = new SafeListener();
        source.registerListener(safe.listener);
        return safe;
    }
}

Edited to add: Interesting perspective on the origins of Java and JSR-133's final behavior.
Canonical reference for how final works in the new JMM, for safe publication: http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#finalRight
On simple review, I think your code represents "safe" publication to the EventSource source object, which presumably will be fielding event callbacks to listener in a different thread. You are guaranteed that threads operating on the safe.listener reference passed will see a fully-initialized listener field. This does not make any further guarantees about other synchronization issues associated with calls to onEvent or other interactions with the object's state.
What is guaranteed by your code is that, when SafeListener's constructor returns a reference inside the static method, the listener field will not be seen in an unwritten state (even if there is no explicit synchronization). For example: Suppose a thread A calls newInstance(), resulting in an assignment to the listener field. Suppose that a thread B is able to dereference the listener field. Then, even absent any other synchronization, thread B is guaranteed to see the write listener = new EventListener().... If the field were not final, you would not receive that guarantee. There are several (other) ways of providing the guarantee (explicit synchronization, use of an atomic reference, use of volatile) of varying performance and readability.
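To make those alternatives concrete, here is a minimal sketch of the four approaches this paragraph lists. The `Listener` interface and the holder class names are hypothetical stand-ins, not part of the question's code.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the listener type used in the question.
interface Listener { void onEvent(String event); }

// 1. final field: once the constructor has finished (and 'this' did not
//    escape), any thread that sees the holder also sees the assigned field.
final class FinalHolder {
    private final Listener listener;
    FinalHolder(Listener listener) { this.listener = listener; }
    Listener get() { return listener; }
}

// 2. volatile field: a write happens-before every later read of the field.
final class VolatileHolder {
    private volatile Listener listener;
    void set(Listener l) { listener = l; }
    Listener get() { return listener; }
}

// 3. AtomicReference: get/set have volatile read/write memory effects.
final class AtomicHolder {
    private final AtomicReference<Listener> listener = new AtomicReference<>();
    void set(Listener l) { listener.set(l); }
    Listener get() { return listener.get(); }
}

// 4. Explicit synchronization: both accessors lock the same monitor.
final class LockedHolder {
    private Listener listener; // guarded by 'this'
    synchronized void set(Listener l) { listener = l; }
    synchronized Listener get() { return listener; }
}
```

Only the final variant works without any write after construction; the other three also allow the reference to be replaced later, at the cost of a volatile access or a lock on every read.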
Not everything that's legal is advisable. Suggest you take a look at JCiP and perhaps this article on safe publication techniques.
A recent, related question is here: "Memory barriers and coding...", "Java multi-threading & Safe Publication".

In a nutshell, the specification for final (see @andersoj's answer) guarantees that when the constructor returns, the final field will have been properly initialized (as visible from all threads).
There is no such guarantee for non-final fields (which means that if another thread gets the freshly constructed object, the field may not have been set yet).
That this works is guaranteed by the Java Language Specification (section 17.5).
How it works would be a JVM implementation detail.

You can refer to the JLS.
The initial read of a final field, or of an object reachable through a final reference, can't be reordered with the initial load of the reference to the object that contains the field. The final field is visible to all other threads after construction completes.

Related

Java thread safety of setting a reference to an object

I'm wondering if the following class is thread safe:
class Example {
    private Thing thing;
    public void setThing(Thing thing) {
        this.thing = thing;
    }
    public void use() {
        thing.function();
    }
}
Specifically, what happens if one thread calls setThing while another thread is in Thing::function via Example::use?
For example:
Example example = new Example();
example.setThing(new Thing());
createThread(example); // create first thread
createThread(example); // create second thread
// Thread 1
while (true) {
    example.use();
}
// Thread 2
while (true) {
    Thread.sleep(3600000); // yes, I know to use a scheduled thread executor
    example.setThing(new Thing());
}
Specifically, I want to know, when setThing is called while use() is executing, will it continue with the old object successfully, or could updating the reference to the object somehow cause a problem.
There are two points to consider when reasoning about the thread safety of a particular class:
1. Visibility of shared state between threads.
2. Safety (preserving class invariants) when an object of the class is used by multiple threads through its methods.
The shared state of the Example class consists of only one Thing object.
The class isn't thread safe from the visibility perspective. The result of setThing in one thread is not guaranteed to be seen by other threads, so they can work with stale data. A NullPointerException is also possible, because the initial value of thing is null until setThing has been called.
It's not possible to say whether it's safe to access the Thing class through the use method without seeing its source code. However, Example calls function without any synchronization, so Thing must itself be thread safe; otherwise Example isn't thread safe.
As a result, Example isn't thread safe. To fix point 1 you can either add volatile to the thing field, if you really need the setter, or mark it as final and initialize it in the constructor. The easiest way to ensure that point 2 is met is to mark use as synchronized. If you mark setThing as synchronized as well, you don't need volatile anymore. However, there are lots of other, more sophisticated techniques to meet point 2. This great book describes everything written here in more detail.
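A minimal sketch of the volatile variant suggested above. The `Thing` class here is a hypothetical placeholder (the question does not show it), and `function()` is assumed to return a value only so that the effect is observable.

```java
import java.util.Objects;

// Hypothetical placeholder: the real Thing is not shown in the question.
final class Thing {
    private final String name;
    Thing(String name) { this.name = name; }
    String function() { return name; }
}

final class Example {
    // volatile: a write in setThing happens-before any later read in use().
    private volatile Thing thing;

    // Requiring an initial value rules out the NullPointerException case.
    Example(Thing initial) {
        this.thing = Objects.requireNonNull(initial);
    }

    void setThing(Thing thing) {
        this.thing = Objects.requireNonNull(thing);
    }

    String use() {
        // Read the field exactly once. A concurrent setThing cannot change
        // which object this particular call operates on.
        Thing current = thing;
        return current.function();
    }
}
```

This also addresses the original worry: a call to use() that is already running keeps working with the object it read, and the old Thing stays reachable (and therefore is not garbage collected) until that call finishes. Whether Thing.function() itself is safe to call concurrently is a separate matter that this sketch does not address.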
If the method shares resources and the threads are not synchronized, then they will collide, and several scenarios can occur, including overwriting data computed by another thread and stored in a shared variable.
If the method has only local variables, then you can call it from multiple threads without worrying about races. However, non-helper classes usually manipulate member variables in their methods, so it's recommended to make the methods synchronized or, if you know exactly where the problem might occur, to lock (also called synchronize) a sub-scope of a method with a final lock object.
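As an illustration of locking only a sub-scope with a final lock object, here is a small sketch. The `Counter` class and its `hammer` helper are invented for this example.

```java
final class Counter {
    // A private final lock object: no outside code can synchronize on it.
    private final Object lock = new Object();
    private long hits; // guarded by 'lock'

    void record() {
        // Work that touches only local variables could go here, unlocked.
        synchronized (lock) {
            hits++; // the read-modify-write is the only part that needs the lock
        }
    }

    long hits() {
        synchronized (lock) {
            return hits;
        }
    }

    // Runs 'threads' threads that each call record() 'perThread' times.
    static long hammer(int threads, int perThread) {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.record();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.hits();
    }
}
```

Because every access to hits goes through the same lock, `Counter.hammer(4, 10_000)` returns 40000; without the synchronized blocks, increments could be lost.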

Java Memory Model: Is it safe to create a cyclical reference graph of final instance fields, all assigned within the same thread?

Can somebody who understand the Java Memory Model better than me confirm my understanding that the following code is correctly synchronized?
class Foo {
private final Bar bar;
Foo() {
this.bar = new Bar(this);
}
}
class Bar {
private final Foo foo;
Bar(Foo foo) {
this.foo = foo;
}
}
I understand that this code is correct but I haven't worked through the whole happens-before math. I did find two informal quotations that suggest this is lawful, though I'm a bit wary of completely relying on them:
The usage model for final fields is a simple one: Set the final fields for an object in that object's constructor; and do not write a reference to the object being constructed in a place where another thread can see it before the object's constructor is finished. If this is followed, then when the object is seen by another thread, that thread will always see the correctly constructed version of that object's final fields. It will also see versions of any object or array referenced by those final fields that are at least as up-to-date as the final fields are. [The Java® Language Specification: Java SE 7 Edition, section 17.5]
Another reference:
What does it mean for an object to be properly constructed? It simply means that no reference to the object being constructed is allowed to "escape" during construction. (See Safe Construction Techniques for examples.) In other words, do not place a reference to the object being constructed anywhere where another thread might be able to see it; do not assign it to a static field, do not register it as a listener with any other object, and so on. These tasks should be done after the constructor completes, not in the constructor. [JSR 133 (Java Memory Model) FAQ, "How do final fields work under the new JMM?"]
Yes, it is safe. Your code does not introduce a data race. Hence, it is synchronized correctly. All objects of both classes will always be visible in their fully initialized state to any thread that is accessing the objects.
For your example, this is quite straight-forward to derive formally:
For the thread that is constructing the objects, all observed field values need to be consistent with program order. For this intra-thread consistency, when constructing Bar, the handed Foo value is observed correctly and never null. (This might seem trivial but a memory model also regulates "single threaded" memory orderings.)
For any thread that is getting hold of a Foo instance, its referenced Bar value can only be read via the final field. This introduces a dereference ordering between reading of the address of the Foo object and the dereferencing of the object's field pointing to the Bar instance.
If another thread is therefore capable of observing the Foo instance altogether (in formal terms, there exists a memory chain), this thread is guaranteed to observe this Foo fully constructed, meaning that its Bar field contains a fully initialized value.
Note that it does not even matter that the Bar instance's field is itself final if the instance can only be read via Foo. Adding the modifier does not hurt and better documents the intentions, so you should add it. But, memory-model-wise, you would be okay even without it.
Note that the JSR-133 cookbook that you quoted is only describing an implementation of the memory model rather than the memory model itself. In many points, it is too strict. One day, the OpenJDK might no longer align with this implementation and rather implement a less strict model that still fulfills the formal requirements. Never code against an implementation, always code against the specification! For example, do not rely on a memory barrier being placed after the constructor, which is how HotSpot more or less implements it. These things are not guaranteed to stay and might even differ for different hardware architectures.
The quoted rule that you should never let a this reference escape from a constructor is also too narrow a view on the problem. You should not let it escape to another thread. If you would, for example, hand it to a virtually dispatched method, you could no longer control where the instance would end up. This is therefore a very bad practice! However, constructors are not dispatched virtually and you can safely create circular references in the manner you depicted. (I assume that you are in control of Bar and its future changes. In a shared code base, you should document tightly that the constructor of Bar must not let the reference slip out.)
Immutable objects (with only final fields) are only "thread safe" after they are properly constructed, meaning their constructor has completed. (The VM probably accomplishes this with a memory barrier after the constructor of such objects.)
Let's see how to make your example definitely unsafe:
If the Bar constructor stored a this reference where another thread could see it, this would be unsafe because Bar isn't constructed yet.
If the Bar constructor stored a foo reference where another thread could see it, this would be unsafe because foo isn't constructed yet.
If the Bar constructor read some of foo's fields, then (depending on the order of initialization inside the Foo constructor) those fields could still be uninitialized. That's not a thread-safety problem, just an effect of the order of initialization. (Calling a virtual method inside a constructor has the same issue.)
References to immutable objects (only final fields) that are created by a new expression are always safe to access (no uninitialized fields visible). But the objects referenced by these final fields may show uninitialized values if those references were obtained from a constructor giving away its this reference.
As Assylias already wrote: because the constructors in your example store no references where another thread could see them, your example is "thread safe". The created Foo object can safely be given to other threads.
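The initialization-order case from the list above (Bar's constructor reading a field of the Foo that is still being constructed) can be reproduced on a single thread. In this sketch the `size` field and the accessor methods are additions made for illustration; they are not in the question's code.

```java
final class Foo {
    private final Bar bar;
    private final int size; // hypothetical extra field, assigned AFTER bar

    Foo() {
        this.bar = new Bar(this); // Bar's constructor runs while size is still 0
        this.size = 42;
    }

    Bar bar() { return bar; }
    int size() { return size; }
}

final class Bar {
    private final Foo foo;
    private final int sizeSeenDuringConstruction;

    Bar(Foo foo) {
        this.foo = foo;
        // Foo's constructor has not assigned 'size' yet, so this reads the
        // default value 0 even though the field is final.
        this.sizeSeenDuringConstruction = foo.size();
    }

    Foo foo() { return foo; }
    int sizeSeenDuringConstruction() { return sizeSeenDuringConstruction; }
}
```

After `Foo foo = new Foo()`, `foo.size()` is 42 while `foo.bar().sizeSeenDuringConstruction()` is 0. Storing the reference, as the original Bar does, is harmless; reading through it during construction is what exposes the default value.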

Do different threads see the same version of a object referenced by a local variable?

Are multiple threads guaranteed to see the same version of a shared object to which they have a reference? Here is a code sample:
public static void main(String[] args) {
final AtomicBoolean flag = new AtomicBoolean(false);
new Thread(){
public void run() { possibly read and mutate flag }
}.start();
new Thread(){
public void run() { possibly read and mutate flag }
}.start();
while (!flag.get()) {
Thread.yield();
}
}
To be clear, I am wondering whether writes by the child threads to the shared object are seen by the parent and sibling threads.
Are multiple threads guaranteed to see the same version of a shared local variable in their scope?
In general, it depends on what you mean by "the same version". It also depends on the nature of the variable (e.g. how it is declared and initialized) ... and on how the threads use it.
(In general, Java doesn't do "versions" of variables. A thread accessing a shared variable or object will either see the latest state, or it won't. If it sees a state that isn't the latest state, then there are no guarantees as to what it will see. In particular, it may see something that doesn't directly correspond to any notional version of the object ... due to word-tearing and other cache-related memory artefacts.)
In your example you are using a final local variable within an inner class (in this case you have two anonymous inner classes). When you do that, the compiler creates a corresponding synthetic variable in the inner class that is initialized with the value of the variable in the method scope. The compiled inner class then refers to the value of the synthetic variable instead of the original variable.
In your example, it is guaranteed that the inner classes (e.g. your threads) will see the same (reference) value as in the original variable. Furthermore, it is guaranteed that they will (at least initially) see a consistent snapshot of whatever object it is that it references. (And since it is an AtomicXxxx class, it will always be consistent for all threads that can access it. Guaranteed.)
OK, so what about other cases:
If flag was a static or instance field that was also final, then we wouldn't have synthetic variables, and each nested class would be referencing the same shared variable. But it would all still work.
If flag was a static or instance field and it wasn't final, but nothing changed the field (after the threads were created), then it would still be OK. (Though you could argue that this is fragile ... because something could change the field.)
If flag was a static or instance field and it wasn't final or volatile, then the threads would initially see the same state as the parent thread. But if either the original thread or any of the other threads changed the variable (etcetera), then the others are not guaranteed to see the new state ... unless the respective threads synchronize properly.
I would like to know if changes to flag made in one thread are seen immediately by the other two threads.
As I said above, it depends ...
In your example, the answer is "yes", because you use a final reference to AtomicBoolean.
If you had declared flag as a boolean and marked it as volatile, then the answer would be "yes".
If you had declared flag as a boolean and non-volatile, then the answer would be "no".
If flag was a final reference to an ordinary object with a mutable non-volatile boolean field, then the answer would also be "no". (The threads would all see the same object, but they wouldn't consistently see the latest state. The solution would be to use synchronized getters and setters, or equivalent.)
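Two of the "yes" routes above can be sketched as follows: the AtomicBoolean from the question, and the synchronized-accessor fix for the last case. `SyncFlag` and `FlagDemo` are hypothetical names introduced for this example.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// The fix for the last case: an ordinary object with a plain boolean field
// becomes visible across threads once every access is synchronized.
final class SyncFlag {
    private boolean value; // guarded by 'this'
    synchronized void set(boolean v) { value = v; }
    synchronized boolean get() { return value; }
}

final class FlagDemo {
    // The child's set(true) has volatile-write semantics, so this loop is
    // guaranteed to observe it eventually and terminate.
    static boolean atomicWriteSeen() {
        final AtomicBoolean flag = new AtomicBoolean(false);
        new Thread(() -> flag.set(true)).start();
        while (!flag.get()) {
            Thread.yield();
        }
        return true;
    }

    // Same shape, but visibility comes from locking the same monitor
    // in both the setter and the getter.
    static boolean synchronizedWriteSeen() {
        final SyncFlag flag = new SyncFlag();
        new Thread(() -> flag.set(true)).start();
        while (!flag.get()) {
            Thread.yield();
        }
        return true;
    }
}
```

With a plain non-volatile boolean field and unsynchronized access, the same loop would be allowed to spin forever, because nothing would oblige the reading thread to ever observe the write.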
Yes, the two threads share the same final AtomicBoolean, which is a class that holds a truth value. The variable flag itself can't be reassigned because it is final, but you can call methods on it to set its value. Likewise, a final int[] can't be reassigned to a different array, but you can change the values inside it.
final AtomicBoolean flag = new AtomicBoolean(false);
new Thread(){
public void run(){
try {
Thread.sleep(100);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
flag.set(true);
}
}.start();
new Thread(){
public void run(){
flag.set(false);
}
}.start();
Thread.sleep(200);// comment this line, you see different results
System.out.println(flag);
In this case, yes. The Java Language Specification says that calling Thread.start() synchronizes-with all previous actions on the calling thread:
An action that starts a thread synchronizes-with the first action in the thread it starts.
This creates a happens-before relationship: all writes made on your main thread before the call to start() (including any writes the constructor of the AtomicBoolean made to initialize itself) are made visible to the thread your main thread started.
A call to start() on a thread happens-before any actions in the started thread.
So basically you are good to go. Your AtomicBoolean object is visible to both threads, and they both see the same object.
This pattern is called Safe Publication, btw. You use it to safely publish an object you create (like your AtomicBoolean) so that other threads can see it. (And yes, Thread.start() isn't on the list there of ways to safely publish an object because Thread.start() isn't general enough. But it's the same idea, and works the same way.)
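The start() rule can be seen with a deliberately plain (non-final, non-volatile) field. This is a sketch with invented names (`Box`, `StartDemo`); it relies only on the happens-before edges of Thread.start() and Thread.join().

```java
// A holder with a plain field: no final, no volatile, no locking.
final class Box {
    int value;
}

final class StartDemo {
    static int readInChild() {
        Box box = new Box();
        box.value = 7; // plain write, made before start()

        int[] seen = new int[1];
        Thread child = new Thread(() -> {
            // start() happens-before this action, so the write of 7 is visible.
            seen[0] = box.value;
        });
        child.start();
        try {
            // All actions in the child happen-before join() returns,
            // so the main thread sees the child's write to seen[0].
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

`StartDemo.readInChild()` returns 7. If the main thread wrote to `box.value` again after calling start(), that later write would no longer be covered by the guarantee.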
The local variable is not shared [1], and being final means that there would be no worry of it changing even if it was the case. (A question about member variables would result in a different response although, excluding constructor leakage, a final member would provide the same guarantees.)
The same object is shared across threads; it will be the same object and will adhere to the defined AtomicBoolean contract.
A boolean value that may be updated atomically. See the java.util.concurrent.atomic package specification for description of the properties of atomic variables.
In short the package documentation specifies the following which in turn guarantees happens-before relationships.
get has the memory effects of reading a volatile variable.
set has the memory effects of writing (assigning) a volatile variable.
There are many questions relating to the thread-safety of volatile (and AtomicXYZ objects), e.g. see Is a volatile int in Java thread-safe? and Is AtomicBoolean needed to create a cancellable thread?
[1] Anonymous types, including Java 8 lambdas, do not create closures/lexical bindings to variables in scope and as such are not capable of sharing local variables; rather variables are synthesized with the value of the final (or effectively final) variable from the enclosing scope which is bound when the anonymous type is instantiated.

Java Thread Safety of Initialized Objects

Consider the following class:
public class MyClass
{
private MyObject obj;
public MyClass()
{
obj = new MyObject();
}
public void methodCalledByOtherThreads()
{
obj.doStuff();
}
}
Since obj was created on one thread and accessed from another, could obj be null when methodCalledByOtherThreads is called? If so, would declaring obj as volatile be the best way to fix this issue? Would declaring obj as final make any difference?
Edit:
For clarity, I think my main question is:
Can other threads see that obj has been initialized by some main thread or could obj be stale (null)?
For the methodCalledByOtherThreads to be called by another thread and cause problems, that thread would have to get a reference to a MyClass object whose obj field is not initialized, ie. where the constructor has not yet returned.
This would be possible if you leaked the this reference from the constructor. For example
public MyClass()
{
SomeClass.leak(this);
obj = new MyObject();
}
If the SomeClass.leak() method starts a separate thread that calls methodCalledByOtherThreads() on the this reference, then you would have problems, but this is true regardless of the volatile.
Since you don't have what I'm describing above, your code is fine.
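The effect of such a leak can be observed even without a second thread. In this sketch `LeakyClass`, the `sawNull` flag, and `objIsNull()` are invented for illustration; `SomeClass.leak` simply inspects the object it is handed.

```java
final class MyObject {
    void doStuff() { }
}

final class SomeClass {
    static boolean sawNull;

    // Receives a reference to an object whose constructor is still running.
    static void leak(LeakyClass partiallyBuilt) {
        sawNull = partiallyBuilt.objIsNull();
    }
}

final class LeakyClass {
    private MyObject obj;

    LeakyClass() {
        SomeClass.leak(this);  // 'this' escapes before obj is assigned
        obj = new MyObject();
    }

    boolean objIsNull() { return obj == null; }
}
```

After `new LeakyClass()`, `SomeClass.sawNull` is true: the leaked reference pointed at an object whose obj field had not been assigned yet. If leak() handed the reference to another thread instead, that thread could hit the same null at any later moment.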
It depends on whether the reference is published "unsafely". A reference is "published" by being written to a shared variable; another thread reads the variable to get the reference. If there is no relationship of happens-before(write, read), the publication is called unsafe. An example of unsafe publication is through a non-volatile static field.
@chrylis's interpretation of "unsafe publication" is not accurate. Leaking this before constructor exit is orthogonal to the concept of unsafe publication.
Through unsafe publication, another thread may observe the object in an uncertain state (hence the name); in your case, field obj may appear to be null to another thread. Unless, obj is final, then it cannot appear to be null even if the host object is published unsafely.
This is all too technical and it requires further readings to understand. The good news is, you don't need to master "unsafe publication", because it is a discouraged practice anyway. The best practice is simply: never do unsafe publication; i.e. never do data race; i.e. always read/write shared data through proper synchronization, by using synchronized, volatile or java.util.concurrent.
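For readers who want to see what the two kinds of publication look like in code, here is a minimal sketch. `Publisher` and its fields are hypothetical; the example shows only the shape of the code, since the failure of the unsafe variant depends on timing and cannot be demonstrated reliably.

```java
final class MyObject { }

final class MyClass {
    private MyObject obj; // deliberately non-final, as in the question

    MyClass() { obj = new MyObject(); }

    boolean initialized() { return obj != null; }
}

final class Publisher {
    // Unsafe publication: a plain static field. A thread reading it has no
    // happens-before edge to the write, so it may see null, or a MyClass
    // whose obj field still appears to be null.
    static MyClass unsafeInstance;

    // Safe publication: the volatile write happens-before every later
    // volatile read, so a reader that sees the reference also sees obj.
    static volatile MyClass safeInstance;

    static void publish() {
        unsafeInstance = new MyClass();
        safeInstance = new MyClass();
    }
}
```

A reader that obtains a non-null `Publisher.safeInstance` is guaranteed that `initialized()` returns true. For `unsafeInstance` that guarantee would only hold if obj were final.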
If we always avoid unsafe publication, do we still need final fields? The answer is no. Then why are some objects (e.g. String) designed to be "thread safe immutable" by using final fields? Because it's assumed that they can be used in malicious code that tries to create uncertain state through deliberate unsafe publication. I think this is an overblown concern. It doesn't make much sense in server environments: if an application embeds malicious code, the server is compromised, period. It probably makes a bit of sense in an applet environment where the JVM runs untrusted code from unknown sources. Even then, this is an improbable attack vector; there's no precedent for this kind of attack, and there are apparently a lot of other, more easily exploitable security holes.
This code is fine because the reference to the instance of MyClass can't be visible to any other threads before the constructor returns.
Specifically, the happens-before relation requires that the visible effects of actions occur in the same order as they're listed in the program code, so that in the thread where the MyClass is constructed, obj must be definitely assigned before the constructor returns, and the instantiating thread goes directly from the state of not having a reference to the MyClass object to having a reference to a fully-constructed MyClass object.
That thread can then pass a reference to that object to another thread, but all of the construction will have transitively happened-before the second thread can call any methods on it. This might happen through the constructing thread's launching the second thread, a synchronized method, a volatile field, or the other concurrency mechanisms, but all of them will ensure that all of the actions that took place in the instantiating thread are finished before the memory barrier is passed.
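One of the "other concurrency mechanisms" is a java.util.concurrent queue: actions in a thread before placing an object into a concurrent collection happen-before actions after its removal in another thread. A sketch with invented names (`Payload`, `HandOff`):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Deliberately non-final field: visibility comes from the queue, not from final.
final class Payload {
    int value;
    Payload(int value) { this.value = value; }
}

final class HandOff {
    static int transfer() {
        BlockingQueue<Payload> queue = new ArrayBlockingQueue<>(1);
        int[] seen = new int[1];

        Thread consumer = new Thread(() -> {
            try {
                // take() returns only after put(); the construction of the
                // Payload happens-before this read of its field.
                seen[0] = queue.take().value;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            queue.put(new Payload(5));
            consumer.join(); // makes the consumer's write to seen[0] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

`HandOff.transfer()` returns 5 even though Payload.value is neither final nor volatile, because the hand-off itself supplies the happens-before edge.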
Note that if a reference to this gets passed out of the class inside the constructor somewhere, that reference might go floating around and get used before the constructor is finished. That's what's known as unsafe publishing of the object, but code such as yours that doesn't call non-final methods from the constructor (or directly pass out references to this) is fine.
Your other thread could see a null object. A volatile object could possibly help, but an explicit lock mechanism (or a Builder) would likely be a better solution.
Have a look at Java Concurrency in Practice - Sample 14.12
This class (if taken as is) is NOT thread safe. In short: Java permits instruction reordering (see "Instruction reordering & happens-before relationship in java"), and when your code instantiates MyClass, under some circumstances you may get the following sequence of instructions:
1. Allocate memory for the new instance of MyClass.
2. Return a reference to this block of memory.
3. The reference to this not fully initialized MyClass is available to other threads; they can call methodCalledByOtherThreads() and get a NullPointerException.
4. Initialize the internals of MyClass.
To prevent this and make MyClass really thread safe, you have to add either final or volatile to the obj field. In that case Java's memory model (from Java 5 on) will guarantee that, during initialization of MyClass, the reference to the memory allocated for it is returned only when all internals are initialized.
For more details I would strongly recommend reading the book "Java Concurrency in Practice". Exactly your case is described on pages 50-51 (section 3.5.1). I would even say that you just can't write correct multithreaded code without reading that book! :)
The originally picked answer by @Sotirios Delimanolis is wrong. @ZhongYu's answer is correct.
The concern here is visibility. If MyClass is published unsafely, anything could happen.
Someone in the comments asked for evidence. One can check Listing 3.15 in the book Java Concurrency in Practice:
public class Holder {
private int n;
// Initialize in thread A
public Holder(int n) { this.n = n; }
// Called in thread B
public void assertSanity() {
if (n != n) throw new AssertionError("This statement is false.");
}
}
Someone came up with an example to verify this piece of code:
coding a proof for potential concurrency issue
As to the specific example of this post:
public class MyClass{
private MyObject obj;
// Initialize in thread A
public MyClass(){
obj = new MyObject();
}
// Called in thread B
public void methodCalledByOtherThreads(){
obj.doStuff();
}
}
If MyClass is initialized in thread A, there is no guarantee that thread B will see this initialization (because the change might stay in the cache of the CPU that thread A runs on and might not yet have propagated to main memory).
Just as @ZhongYu has pointed out, because the write and the read happen in two independent threads, there is no happens-before(write, read) relation.
To fix this, as the original author has mentioned, we can declare private MyObject obj as volatile, which will ensure that the reference itself will be visible to other threads in a timely manner
(https://www.logicbig.com/tutorials/core-java-tutorial/java-multi-threading/volatile-ref-object.html).

How do JVM's implicit memory barriers behave when chaining constructors?

Referring to my earlier question on incompletely constructed objects, I have a second question. As Jon Skeet pointed out, there's an implicit memory barrier at the end of a constructor that makes sure that final fields are visible to all threads. But what if a constructor calls another constructor: is there such a memory barrier at the end of each of them, or only at the end of the one that got called in the first place? That is, when the "wrong" solution is:
public class ThisEscape {
    public ThisEscape(EventSource source) {
        source.registerListener(
            new EventListener() {
                public void onEvent(Event e) {
                    doSomething(e);
                }
            });
    }
}
And the correct one would be a factory method version:
public class SafeListener {
    private final EventListener listener;
    private SafeListener() {
        listener = new EventListener() {
            public void onEvent(Event e) {
                doSomething(e);
            }
        };
    }
    public static SafeListener newInstance(EventSource source) {
        SafeListener safe = new SafeListener();
        source.registerListener(safe.listener);
        return safe;
    }
}
Would the following work too, or not?
public class MyListener {
    private final EventListener listener;
    private MyListener() {
        listener = new EventListener() {
            public void onEvent(Event e) {
                doSomething(e);
            }
        };
    }
    public MyListener(EventSource source) {
        this();
        source.register(listener);
    }
}
Update: The essential question is: is this() guaranteed to actually call the private constructor above (in which case there would be a barrier where intended and everything would be safe), or is it possible that the private constructor gets inlined into the public one as an optimization to save one memory barrier (in which case there wouldn't be a barrier until the end of the public constructor)?
Are the rules of this() defined precisely somewhere? If not, then I think we must assume that inlining chained constructors is allowed, and probably some JVMs or maybe even javacs are doing it.
I think it is safe as java memory model states that:
Let o be an object, and c be a constructor for o in which a final
field f is written. A freeze action on final field f of o takes place
when c exits, either normally or abruptly. Note that if one
constructor invokes another constructor, and the invoked constructor
sets a final field, the freeze for the final field takes place at the
end of the invoked constructor.
An object is considered to be completely initialized when its constructor finishes.
This applies also for chained constructors.
If you have to register in the constructor define the listener as a static inner class. This is safe.
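A sketch of that suggestion follows. `Event`, `EventListener`, and `EventSource` are minimal hypothetical stand-ins for the types in the question, and `StaticListenerOwner` is an invented name. The point is that a static nested class carries no hidden reference to the enclosing instance, so registering it from the constructor does not expose this.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the question's types.
interface Event { }

interface EventListener {
    void onEvent(Event e);
}

final class EventSource {
    private final List<EventListener> listeners = new ArrayList<>();

    synchronized void registerListener(EventListener listener) {
        listeners.add(listener);
    }

    synchronized int listenerCount() {
        return listeners.size();
    }
}

final class StaticListenerOwner {
    // Static nested class: unlike an anonymous or inner class, it holds no
    // implicit reference to a StaticListenerOwner instance.
    private static final class Handler implements EventListener {
        @Override
        public void onEvent(Event e) {
            // Can only use its own state; the half-built owner is unreachable.
        }
    }

    StaticListenerOwner(EventSource source) {
        source.registerListener(new Handler()); // 'this' does not escape
    }
}
```

The trade-off is that the handler cannot call instance methods such as doSomething(e) on the owner. If it needs to, the factory-method version from the question remains the safer design.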
Your second version is not correct, because it is allowing the 'this' reference to escape from the construction process. Having 'this' escape invalidates the initialization safety guarantees that give final fields their safety.
To address the implicit question, the barrier at the end of construction only happens at the very end of object construction. The intuition one reader offered about inlining is a useful one; from the perspective of the Java Memory Model, method boundaries do not exist.
EDIT After the comment that suggested the compiler inlining the private constructor (I had not thought of that optimization), chances are that the code will be unsafe. And the worst part of unsafe multithreaded code is that it seems to work, so you are better off avoiding it completely. If you want to play different tricks (if you really do want to avoid the factory for some reason), consider adding a wrapper to guarantee the coherence of data in the internal implementation object and register in the external object.
My guess is that it will be fragile but ok. The compiler cannot know whether the internal constructor will be called only from within other constructors or not, so it has to make sure that the result would be correct for code calling only the internal constructor, so whatever mechanism it uses (memory barrier?) has to be in place there.
I would guess that the compiler would add the memory barrier at the end of each and every constructor. The problem is still there: you are passing the this reference to other code (possibly other threads) before it is fully constructed, which is bad, but if the only 'construction' that is left is registering the listener, then the object state is as stable as it will ever be.
The solution is fragile in that some other day, you or some other programmer may need to add another member to the object, may forget that the chained constructors are a concurrency trick, and may decide to initialize the field in the public constructor. Doing so would add a hard-to-detect potential data race to your application, so I would try to avoid that construct.
BTW: The guessed safety may be wrong. I don't know how complex/smart the compiler is, and whether the memory barrier (or the like) is something it could try to optimize away... since the constructor is private the compiler does have enough information to know that it is only called from other constructors, and that is enough information to determine that the synchronization mechanism is not necessary in the internal constructor...
Letting an object reference escape in a constructor can publish an incompletely constructed object. This is true even if the publication is the last statement in the constructor.
Your SafeListener might not behave correctly in a concurrent environment, even if constructor inlining is performed (which I think it is not: think about creating objects using reflection by accessing the private constructor).
