Partial constructed objects in the Java Memory Model


I came across the following code in an article somewhere on the Internet:
public class MyInt {
private int x;
public MyInt(int y) {
this.x = y;
}
public int getValue() {
return this.x;
}
}
The article states that
Constructors are not treated special by the compiler (JIT, CPU etc) so it is allowed to reorder instructions from the constructor and instructions that come after the constructor.
Also, this JSR-133 article about the Java Memory Model states that
A thread that can only see a reference to an object after that object has been completely initialized is guaranteed to see the correctly initialized values for that object’s final fields.
The abovementioned MyInt instance seems immutable (except that the class is not marked final) and thread-safe, but the articles state it is not. They state that it's not guaranteed that x always has the correct value upon read.
But I thought that
only the thread that creates an object should have access to it while it is being constructed
and the Java Tutorials seem to support that.
My question is: does it mean that, with the current JMM, a thread can have access to a partially constructed object due to instruction reordering? And if yes, how? And does that mean that the statement from the Java Tutorials is simply not true?

That article is saying that if you have code like
foo = new MyInt(7);
in a class that has a field
MyInt foo;
then the instructions that amount to
(reference to new object).x = 7;
foo = (reference to new object);
could be swapped over as some kind of optimisation. This will never change the behaviour of the thread that's running this code, but it's possible that some other thread will be able to read foo after the line
foo = (reference to new object);
but before the line
(reference to new object).x = 7;
in which case it would see foo.x as 0, not 7. That is to say, that other thread could run
int bar = someObject.getFoo().getValue();
and end up with bar equal to 0.
I've never seen anything like this happen in the wild, but the author seems to know what he's talking about.
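One way to rule out the stale read described above is to make the field final. The following is only a minimal sketch: FinalMyInt and Publisher are hypothetical names that do not appear in the article, and the guarantee relies on the this reference not escaping the constructor.

```java
// Hypothetical variant of MyInt: the only change is that x is final.
final class FinalMyInt {
    private final int x;

    FinalMyInt(int y) {
        this.x = y; // final field: frozen when the constructor completes
    }

    int getValue() {
        return this.x;
    }
}

// Hypothetical owner of the shared field "foo" from the discussion above.
class Publisher {
    FinalMyInt foo; // deliberately a plain field: the reference itself is still racy

    void publish() {
        foo = new FinalMyInt(7);
    }

    // Returns -1 if no instance is visible yet; otherwise the final field is seen as 7.
    int readOrMinusOne() {
        FinalMyInt local = foo; // read the shared field exactly once
        return (local == null) ? -1 : local.getValue();
    }
}
```

Another thread may still fail to see the new reference for a while (it can observe null), but once it does see the reference, the final field semantics of the Java 5+ memory model mean it cannot observe x as 0.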

Instruction reordering alone can not lead to another thread seeing a partially constructed object. By definition, the JVM is only allowed to reorder things if they don't affect a correctly synchronized program's behaviour.
It's unsafe publishing of the object reference that enables bad things to happen. Here's a particularly poor attempt at a singleton for example:
public class BadSingleton {
public static BadSingleton theInstance;
private int foo;
public BadSingleton() {
this.foo = 42;
if (theInstance == null) {
theInstance = this;
}
}
}
Here you accidentally publish the reference to the object being constructed in a static field. This would not necessarily be a problem until the JVM decides to reorder things and places this.foo = 42 after the assignment to theInstance. So the two things together conspire to break your invariants and allow another thread to see a BadSingleton.theInstance with its foo field uninitialised.
Another frequent source of accidental publication is calling overridable methods from the constructor. This does not always lead to accidental publication, but the potential is there, so it should be avoided.
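The hazard with overridable methods can be observed even without a second thread, because the overriding method runs against an object whose subclass part has not been initialized yet. A small sketch, with Shape and Square as made-up names for illustration:

```java
class Shape {
    final String description;

    Shape() {
        // Calls an overridable method before any subclass constructor body has run.
        this.description = describe();
    }

    String describe() {
        return "shape";
    }
}

class Square extends Shape {
    private final int side;

    Square(int side) {
        super();          // Shape() runs first and calls the overridden describe()
        this.side = side; // assigned only after Shape() has returned
    }

    @Override
    String describe() {
        return "square with side " + side; // sees the default 0 during construction
    }
}
```

Here new Square(5).description holds "square with side 0", while calling describe() on the finished object reports a side of 5. If describe() handed this to another thread instead, that thread would be looking at the same half-built object.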
only the thread that creates an object should have access to it while it is being constructed
And does that mean that the statement from the Java Tutorials is simply not true?
Yes and no. It depends on how we interpret the word should. There is no guarantee that in every possible case another thread won't see a partially constructed object. But it's true in the sense that you should write code that doesn't allow it to happen.
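For contrast with BadSingleton, here is a sketch of a singleton in which the reference never leaves the constructor and publication goes through class initialization. LazySingleton and InstanceHolder are hypothetical names; this is the commonly used initialization-on-demand holder idiom, not something taken from the article.

```java
final class LazySingleton {
    private final int foo;

    private LazySingleton() {
        this.foo = 42; // "this" is not stored anywhere inside the constructor
    }

    // The nested class is initialized on the first call to getInstance();
    // the JVM performs class initialization under a lock, which publishes INSTANCE safely.
    private static final class InstanceHolder {
        static final LazySingleton INSTANCE = new LazySingleton();
    }

    static LazySingleton getInstance() {
        return InstanceHolder.INSTANCE;
    }

    int getFoo() {
        return foo;
    }
}
```

Every caller of getInstance() gets the same object, and because the assignment to INSTANCE happens during class initialization, no caller can observe foo as 0.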

Related

Can modern JVMs optimize different instances of the same class differently?

Say I have 2 instances of the same class, but they behave differently (follow different code paths) based on a final boolean field set at construction time. Something like:
public class Foo {
private final boolean flag;
public Foo(boolean flagValue) {
this.flag = flagValue;
}
public void f() {
if (flag) {
doSomething();
} else {
doSomethingElse();
}
}
}
2 instances of Foo with different values for flag could in theory be backed by 2 different assemblies, thereby eliminating the cost of the if (sorry for the contrived example, it's the simplest one I could come up with).
So my question is: do any JVMs actually do this? Or is a single class always backed by a single assembly?

Yes, JVMs do this form of optimization. In your case, it would be a result of inlining combined with adaptive optimization for a value that is observed to always be true. Consider the following code:
Foo foo = new Foo(true);
foo.f();
It is trivial for HotSpot to prove that foo is always an actual instance of Foo at the call site of f, which allows the VM to simply copy-paste the code of the method, thus eliminating the virtual dispatch. After inlining, the example is reduced to:
Foo foo = new Foo(true);
if (foo.flag) {
doSomething();
} else {
doSomethingElse();
}
This, in turn, allows the code to be reduced to:
Foo foo = new Foo(true);
foo.doSomething();
Whether the optimization can be applied therefore depends on the monomorphism of the call site of foo.f() and the stability of flag at this call site. (The VM profiles your methods for such patterns.) The less the VM is able to predict the outcome of your program, the less optimization is applied.
If the example were as trivial as the above code, the JIT would probably also erase the object allocation and simply call doSomething. Also, for the trivial case where the value of the field can be proven to be true, the VM does not even need to optimize adaptively but can simply apply the above optimization. There is a great tool named JITWatch that lets you look into how your code gets optimized.
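To watch this yourself, a small harness that makes the call site hot is enough. The sketch below restates Foo from the question with two counters added (an assumption of this example, so that the calls have an observable effect); HotLoopDemo is a made-up name.

```java
class Foo {
    private final boolean flag;
    long somethingCalls;     // counters exist only so the work has a visible result
    long somethingElseCalls;

    Foo(boolean flagValue) {
        this.flag = flagValue;
    }

    void f() {
        if (flag) {
            doSomething();
        } else {
            doSomethingElse();
        }
    }

    void doSomething() { somethingCalls++; }

    void doSomethingElse() { somethingElseCalls++; }
}

class HotLoopDemo {
    // Calls f() often enough for the JIT to consider the call site hot.
    static long run(int iterations) {
        Foo foo = new Foo(true);
        for (int i = 0; i < iterations; i++) {
            foo.f();
        }
        return foo.somethingCalls;
    }

    public static void main(String[] args) {
        System.out.println(run(1_000_000));
    }
}
```

Running this on HotSpot with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining should list the inlining decisions for f and doSomething; the exact output format differs between JVM versions, and JITWatch can present the same information graphically.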

The following applies to HotSpot; other JVMs may apply different optimizations.
If those instances are in turn assigned to static final fields and then referred to by other code, and the VM is started with -XX:+TrustFinalNonStaticFields, then those instances can participate in constant folding, and inlining CONSTANT.f() can result in different branches being eliminated.
Another approach available to privileged code is creating anonymous classes instead of instances via sun.misc.Unsafe.defineAnonymousClass(Class<?>, byte[], Object[]) and patching a class constant for each class, but ultimately that also has to be referenced through a class constant to have any effect on optimizations.

Java Memory Visibility In Constructors

For the following simplified class:
public class MutableInteger {
private int value;
public MutableInteger(int initial) {
synchronized(this) { // is this necessary for memory visibility?
this.value = initial;
}
}
public synchronized int get() {
return this.value;
}
public synchronized void increment() {
this.value++;
}
...
}
I guess the general question is: for mutable variables guarded by synchronization, is it necessary to synchronize when setting the initial value in the constructor?

You're right, without the synchronized block in the constructor there is no visibility guarantee for non-final fields, as can be seen in this example.
However in practice I would rather use volatile fields or the Atomic* classes in situations like this.
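As an illustration of the Atomic* alternative, the class could be sketched as follows. AtomicMutableInteger is a hypothetical name; the example assumes the same get/increment API as the question.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class AtomicMutableInteger {
    // The reference is final, and AtomicInteger handles visibility and atomicity itself.
    private final AtomicInteger value;

    AtomicMutableInteger(int initial) {
        this.value = new AtomicInteger(initial);
    }

    int get() {
        return value.get();
    }

    void increment() {
        value.incrementAndGet(); // atomic read-modify-write, no lock needed
    }
}
```

With this shape neither the constructor nor the methods need a synchronized block.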
Update: It is also important to mention here that in order for your program to be correctly synchronized (as defined by the JLS), you will need to publish the reference to your object in a safe manner. The cited example doesn't do that, hence why you may see the wrong value in non-final fields. But if you publish the object reference correctly (i.e. by assigning it to a final field of another object, or by creating it before calling Thread.start()), it is guaranteed that your object will be seen at least as up-to-date as the time of publishing, therefore making the synchronized block in the constructor unnecessary.
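The Thread.start() case mentioned above can be sketched like this. Counter and StartPublication are made-up names, and the field is intentionally a plain int to show that the ordering comes from start() and join() rather than from the field declaration.

```java
class Counter {
    private int value; // plain field: not final, not volatile

    Counter(int initial) {
        this.value = initial;
    }

    int get() {
        return value;
    }
}

class StartPublication {
    // Returns the value that the worker thread observed.
    static int observeFromWorker(int initial) {
        Counter counter = new Counter(initial); // fully constructed before the thread exists
        int[] seen = new int[1];
        Thread worker = new Thread(() -> seen[0] = counter.get());
        worker.start(); // start() happens-before every action in the worker
        try {
            worker.join(); // the worker's actions happen-before join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Because the object is created before start() is called, the worker is guaranteed to see the constructor's write even though nothing in Counter is synchronized.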

Though you've accepted an answer, let me add my two cents.
Based on what I've read, neither synchronization nor making the field volatile would guarantee visibility in the following scenario.
A thread T1 may see a non-null reference to the object, but unless you've made the field value final, there's a good chance of T1 seeing the default value of value.
The field could be volatile or be accessed within synchronized blocks (monitor acquire and release); either way, provided that the correct execution order is followed, there's a happens-before edge from the write to the read of value. There's no argument about that.
But it's not that happens-before edge that we have to consider here; it's the correct publication of the object itself (MutableInteger).
Creating an object is a two-step process: the JVM first allocates heap space and then starts initializing the fields. A thread may see a non-null reference to an object but an uninitialized field of it, as long as the said field is not final (assuming the reference has not been safely published).

Does immutability guarantee thread safety?

Well, consider the immutable class Immutable as given below:
public final class Immutable
{
final int x;
final int y;
public Immutable(int x,int y)
{
this.x = x;
this.y = y;
}
// Getters
public int getX()
{
return this.x;
}
public int getY()
{
return this.y;
}
}
Now I am creating an object of Immutable in a class Sharable whose object is going to be shared by multiple threads:
public class Sharable
{
private static Immutable obj;
public static Immutable getImmutableObject()
{
if (obj == null) // (1)
{
synchronized (Sharable.class) // a static method has no "this" to lock on
{
if (obj == null)
{
obj = new Immutable(20, 30); // (2)
}
}
}
return obj; // (3)
}
}
Thread A sees obj as null, moves into the synchronized block and creates the object. Now, the Java Memory Model (JMM) allows multiple threads to observe the object after its initialization has begun but before it has concluded. So Thread B could see the write to obj as occurring before the writes to the fields of the Immutable. Thread B could thus see a partially constructed Immutable that may well be in an invalid state and whose state may unexpectedly change later.
Doesn't that make Immutable non-thread-safe?
EDIT
OK, after a lot of searching on SO and going through some comments, I learned that you can safely share a reference to an immutable object between threads after the object has been constructed. Also, as mentioned by @Makoto, it is usually required to declare the fields containing such references volatile to ensure visibility. Also, as stated by @PeterLawrey, declaring the reference to the immutable object as final makes the field thread-safe.

So, Thread B could see the write to obj as occurring before the writes to the fields of the Immutable. Hence Thread B could thus see a partially constructed Immutable that may well be in an invalid state and whose state may unexpectedly change later.
In Java 1.4, this was true. In Java 5.0 and above, final fields are thread safe after construction.

What you're describing here are two different things. First, Immutable is thread safe as far as operations on an instance of it are concerned.
Thread safety is, in part, about ensuring that memory isn't accidentally overwritten by another thread. With Immutable, you can never overwrite any data contained in an instance of it, so you can feel confident that, in a concurrent environment, an Immutable object will be the same when the threads are using it as it was when you constructed it.
What you've got right there is a broken implementation of double-checked locking.
You're right that Thread A and Thread B may race on obj before it's set, which makes the immutability of the Immutable object beside the point.
I believe that the way to fix this would be to use the volatile keyword for your obj field, so that Java (1.5 and later) gives the double-checked locking its intended meaning: a thread that reads a non-null obj is guaranteed to see the fully constructed object.
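A sketch of the volatile variant, under the assumption that the lazily created singleton is really what is wanted. ImmutablePair and LazyPair are hypothetical stand-ins for Immutable and Sharable.

```java
final class ImmutablePair {
    final int x;
    final int y;

    ImmutablePair(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class LazyPair {
    // volatile: the write inside the lock happens-before any later read that sees it
    private static volatile ImmutablePair obj;

    private LazyPair() {
    }

    static ImmutablePair getImmutableObject() {
        ImmutablePair local = obj;          // one volatile read on the fast path
        if (local == null) {
            synchronized (LazyPair.class) { // static method: lock on the class object
                local = obj;                // re-check under the lock
                if (local == null) {
                    local = new ImmutablePair(20, 30);
                    obj = local;
                }
            }
        }
        return local;
    }
}
```

The local variable is a common refinement that avoids reading the volatile field twice once the instance exists; it is not required for correctness.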
Now, having read a bit closer, it seems to be a bit wonky that you'd have an immutable singleton that required two pieces of static data for it to exist. It seems more like this would be suited towards a factory instead.
public class Sharable {
private Sharable() {
}
public static Immutable getImmutableInstance(int a, int b) {
return new Immutable(a, b);
}
}
Every instance of Immutable you get will truly be immutable - creating a new Immutable has no impact on the others, and using an instance of Immutable has no impact on any others as well.

Java Thread Safety of Initialized Objects

Consider the following class:
public class MyClass
{
private MyObject obj;
public MyClass()
{
obj = new MyObject();
}
public void methodCalledByOtherThreads()
{
obj.doStuff();
}
}
Since obj was created on one thread and accessed from another, could obj be null when methodCalledByOtherThreads is called? If so, would declaring obj as volatile be the best way to fix this issue? Would declaring obj as final make any difference?
Edit:
For clarity, I think my main question is:
Can other threads see that obj has been initialized by some main thread or could obj be stale (null)?

For methodCalledByOtherThreads to be called by another thread and cause problems, that thread would have to get a reference to a MyClass object whose obj field is not initialized, i.e. where the constructor has not yet returned.
This would be possible if you leaked the this reference from the constructor. For example
public MyClass()
{
SomeClass.leak(this);
obj = new MyObject();
}
If the SomeClass.leak() method starts a separate thread that calls methodCalledByOtherThreads() on the this reference, then you would have problems, but this is true regardless of the volatile.
Since you don't have what I'm describing above, your code is fine.
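If an object really does need to register itself somewhere, a common way to avoid the leak shown above is a private constructor plus a static factory that registers only after construction has finished. This is a sketch with invented names: Registry plays the role of SomeClass and Worker the role of MyClass.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class Registry {
    // Thread-safe list standing in for whatever SomeClass.leak() would store.
    static final List<Worker> REGISTERED = new CopyOnWriteArrayList<>();

    static void register(Worker w) {
        REGISTERED.add(w);
    }
}

final class Worker {
    private final Object obj;

    private Worker() {
        this.obj = new Object(); // no reference to "this" leaves the constructor
    }

    static Worker create() {
        Worker w = new Worker(); // construction finishes first
        Registry.register(w);    // only then is the reference handed out
        return w;
    }

    boolean isInitialized() {
        return obj != null;
    }
}
```

Callers use Worker.create() instead of new, so nobody can obtain a Worker whose constructor is still running.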

It depends on whether the reference is published "unsafely". A reference is "published" by being written to a shared variable; another thread reads the variable to get the reference. If there is no relationship of happens-before(write, read), the publication is called unsafe. An example of unsafe publication is through a non-volatile static field.
@chrylis's interpretation of "unsafe publication" is not accurate. Leaking this before constructor exit is orthogonal to the concept of unsafe publication.
Through unsafe publication, another thread may observe the object in an uncertain state (hence the name); in your case, field obj may appear to be null to another thread. The exception is when obj is final: then it cannot appear to be null even if the host object is published unsafely.
This is all quite technical and requires further reading to understand. The good news is, you don't need to master "unsafe publication", because it is a discouraged practice anyway. The best practice is simply: never do unsafe publication; i.e. never have a data race; i.e. always read/write shared data through proper synchronization, by using synchronized, volatile or java.util.concurrent.
If we always avoid unsafe publication, do we still need final fields? The answer is no. Then why are some objects (e.g. String) designed to be "thread safe immutable" by using final fields? Because it's assumed that they can be used in malicious code that tries to create uncertain state through deliberate unsafe publication. I think this is an overblown concern. It doesn't make much sense in server environments: if an application embeds malicious code, the server is compromised, period. It probably makes a bit of sense in an applet environment where the JVM runs untrusted code from unknown sources. Even then, this is an improbable attack vector; there's no precedent for this kind of attack, and there are apparently a lot of other, more easily exploitable security holes.
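The advice to hand objects over only through synchronized, volatile or java.util.concurrent can be sketched with a BlockingQueue. Message and QueueHandoff are hypothetical names; the payload field is deliberately neither final nor volatile.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class Message {
    private int payload; // plain field on purpose

    Message(int payload) {
        this.payload = payload;
    }

    int getPayload() {
        return payload;
    }
}

class QueueHandoff {
    // Builds a Message on a producer thread and reads it on the calling thread.
    static int sendAndReceive(int payload) {
        BlockingQueue<Message> queue = new ArrayBlockingQueue<>(1);
        Thread producer = new Thread(() -> {
            try {
                queue.put(new Message(payload)); // actions before put() happen-before the take()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            int result = queue.take().getPayload();
            producer.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The queue supplies the happens-before edge between the write in the constructor and the read in the consumer, so the Message class itself needs no synchronization for this handoff.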

This code is fine because the reference to the instance of MyClass can't be visible to any other threads before the constructor returns.
Specifically, the happens-before relation requires that the visible effects of actions occur in the same order as they're listed in the program code, so that in the thread where the MyClass is constructed, obj must be definitely assigned before the constructor returns, and the instantiating thread goes directly from the state of not having a reference to the MyClass object to having a reference to a fully-constructed MyClass object.
That thread can then pass a reference to that object to another thread, but all of the construction will have transitively happened-before the second thread can call any methods on it. This might happen through the constructing thread's launching the second thread, a synchronized method, a volatile field, or the other concurrency mechanisms, but all of them will ensure that all of the actions that took place in the instantiating thread are finished before the memory barrier is passed.
Note that if a reference to this gets passed out of the class inside the constructor somewhere, that reference might go floating around and get used before the constructor is finished. That's what's known as unsafe publishing of the object, but code such as yours that doesn't call non-final methods from the constructor (or directly pass out references to this) is fine.

Your other thread could see a null object. A volatile object could possibly help, but an explicit lock mechanism (or a Builder) would likely be a better solution.
Have a look at Java Concurrency in Practice - Sample 14.12

This class (if taken as is) is NOT thread safe. In two words: there is reordering of instructions in java (Instruction reordering & happens-before relationship in java) and when in your code you're instantiating MyClass, under some circumstances you may get following set of instructions:
Allocate memory for new instance of MyClass;
Return link to this block of memory;
Link to this not fully initialized MyClass is available for other threads, they can call "methodCalledByOtherThreads()" and get NullPointerException;
Initialize internals of MyClass.
To prevent this and make your MyClass really thread safe, you have to add either "final" or "volatile" to the "obj" field. In this case Java's memory model (from Java 5 on) will guarantee that during initialization of MyClass, the reference to the block of memory allocated for it will be returned only when all internals are initialized.
For more details I would strongly recommend that you read the nice book "Java Concurrency in Practice". Exactly your case is described on pages 50-51 (section 3.5.1). I would even say you just can't write correct multithreaded code without reading that book! :)

The originally picked answer by @Sotirios Delimanolis is wrong. @ZhongYu's answer is correct.
Visibility is the concern here. So if MyClass is published unsafely, anything could happen.
Someone in the comment asked for evidence - one can check Listing 3.15 in the book Java Concurrency in Practice:
public class Holder {
private int n;
// Initialize in thread A
public Holder(int n) { this.n = n; }
// Called in thread B
public void assertSanity() {
if (n != n) throw new AssertionError("This statement is false.");
}
}
Someone came up with an example to verify this piece of code:
coding a proof for potential concurrency issue
As to the specific example of this post:
public class MyClass{
private MyObject obj;
// Initialize in thread A
public MyClass(){
obj = new MyObject();
}
// Called in thread B
public void methodCalledByOtherThreads(){
obj.doStuff();
}
}
If MyClass is initialized in thread A, there is no guarantee that thread B will see this initialization (for example, because the change might still sit in the cache of the CPU that thread A runs on and not yet have propagated to main memory).
Just as @ZhongYu has pointed out, because the write and the read happen in 2 independent threads with no synchronization between them, there is no happens-before(write, read) relation.
To fix this, as the original author has mentioned, we can declare private MyObject obj as volatile, which will ensure that the reference itself becomes visible to other threads in a timely manner
(https://www.logicbig.com/tutorials/core-java-tutorial/java-multi-threading/volatile-ref-object.html) .

Object creation (state initialisation) and thread safety

I am looking into the book "Java Concurrency in Practice" and find the statement quoted below really hard to believe (but unfortunately it makes sense).
http://www.informit.com/store/java-concurrency-in-practice-9780321349606
Just wanted to get clear about this 100%
public class Holder {
private int n;
public Holder(int n) { this.n = n; }
public void assertSanity() {
if (n != n)
throw new AssertionError("This statement is false.");
}
}
While it may seem that field values set in a constructor are the first
values written to those fields and therefore that there are no "older"
values to see as stale values, the Object constructor first
writes the default values to all fields before subclass
constructors run. It is therefore Possible to see the default value
for a field as a stale value
Regarding the bolded statement above,
I am aware of that behaviour, BUT now it is clear that this calling hierarchy of constructors is NOT guaranteed to be ATOMIC (as it would be if the super constructors were called in a single synchronized block guarded by a lock). What would be the solution? Imagine a class hierarchy that has more than one level (even if that is not recommended, let's assume it is possible). The above code snippet is a kind of prototype that we see every day in most projects.

You misread the book. It explicitly says:
The problem here is not the Holder class itself, but that the Holder is not properly published.
So the above construct is fine. What's not fine is to improperly publish such an object to other threads. The book explains that in detail.
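Two ways of publishing a Holder properly can be sketched as follows. Holder is the class from the question with a value() accessor added for the example; HolderPublisher is a made-up name, and the sketch assumes the HolderPublisher instance itself is shared safely.

```java
import java.util.concurrent.atomic.AtomicReference;

class Holder {
    private int n;

    Holder(int n) {
        this.n = n;
    }

    void assertSanity() {
        if (n != n) throw new AssertionError("This statement is false.");
    }

    int value() { // accessor added for this example only
        return n;
    }
}

class HolderPublisher {
    // Option 1: an AtomicReference gives the stored reference volatile semantics.
    private final AtomicReference<Holder> atomicHolder = new AtomicReference<>();

    // Option 2: a plain field that is only ever touched while holding the same lock.
    private Holder guardedHolder;

    void publish(int n) {
        atomicHolder.set(new Holder(n));
        synchronized (this) {
            guardedHolder = new Holder(n);
        }
    }

    Holder viaAtomic() {
        return atomicHolder.get();
    }

    synchronized Holder viaLock() {
        return guardedHolder;
    }
}
```

In both cases a thread that obtains a non-null Holder also sees the value written by its constructor, so assertSanity() cannot fail, and Holder itself stays unchanged.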

When creating a new object things happen sequentially. I don't know the precise order, but it's something like: allocate the space and initialize it to zeroes, then set the fields that get constant values, then set the fields that get calculated values, then run the constructor code. And, of course, it's got to initialize the subclasses in there somewhere.
So if you try to work with an object that is still being constructed, you can see odd, invalid values in the fields. This doesn't usually happen, but ways to do it:
Reference a field that doesn't yet have a value during an assignment to another field.
Reference a value in the constructor that doesn't get assigned till later in the constructor.
Reference a field in an object in a field in an object that was just read from an ObjectInputStream. (OIS often takes a long time to put values in objects it's read.)
Before Java 5, something like:
public volatile MyClass myObject;
...
myObject = new MyClass( 10 );
could make trouble because another thread could grab the reference to myObject before the MyClass constructor was finished and it would see bad values (zero instead of 10, in this case) inside the object. With Java 5, the JVM is not allowed to make myObject non-null until the constructor is finished.
And today you can still set myObject to this within the constructor and accomplish the same thing.
If you're clever, you can also get hold of Class fields before they've been initialized.
In your code example, (n != n) would be true if something changed the value between the two reads of n. I guess the point is that n starts as zero, gets set to something else by the constructor, and assertSanity is called during the construction. In this case, n is not volatile so I don't think the assert will ever be triggered. Make it volatile and it will happen once every million times or so if you time everything precisely right. In real life this kind of problem happens just often enough to wreak havoc but rarely enough that you can't reproduce it.

I guess theoretically it is possible. It is similar to double checked locking problem.
public class Test {
static Holder holder;
static void test() {
if (holder == null) {
holder = new Holder(1);
}
holder.assertSanity();
}
...
If test() is called by 2 threads, thread-2 might see the holder in a state when initialization is still in progress so n != n may happen to be true. Here is bytecode for n != n:
ALOAD 0
GETFIELD x/Holder.n : I
ALOAD 0
GETFIELD x/Holder.n : I
IF_ICMPEQ L1
As you can see, the JVM loads field n onto the operand stack twice. So it may happen that the first load gets the value before initialization and the second gets the value after it.

The comment:
the Object constructor first writes the default values to all fields
before subclass constructors run
seems wrong. My prior experience is that the initial values of a class's fields are set just before its constructor body is run. That is, a class will see its initialized variables set before its own constructor body runs and does things. This was the root of a bug a friend looked at, where a base class was calling a method during construction that the subclass implemented, and that method set a reference that was declared with an initializer of null in the subclass. The item would be there until entry to the subclass constructor, at which time the initializer set it back to null.
References to the object are not available to another thread (assuming none are handed out in the constructor) until it completes construction and the object reference is returned.
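The bug described above can be reproduced in a few lines. Base, Derived and DerivedNoInitializer are invented names for this sketch.

```java
class Base {
    Base() {
        init(); // runs the subclass override before the subclass initializers
    }

    void init() {
    }
}

class Derived extends Base {
    Object item = null; // explicit initializer: executes after Base() returns

    @Override
    void init() {
        item = "assigned in init()";
    }
}

class DerivedNoInitializer extends Base {
    Object item; // no initializer: nothing overwrites the value set in init()

    @Override
    void init() {
        item = "assigned in init()";
    }
}
```

After construction, new Derived().item is null because the field initializer ran after init(), while new DerivedNoInitializer().item still holds the string. This is ordinary single-threaded initialization order and is independent of the memory model questions discussed in the other answers.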
