Is reference update thread safe?

Is reference update thread safe? - java

public class Test{
private MyObj myobj = new MyObj(); //it is not volatile
public class Updater extends Thred{
myobje = getNewObjFromDb() ; //not am setting new object
}
public MyObj getData(){
//getting stale date is fine for
return myobj;
}
}
Updated regularly updates myobj
Other classes fetch data using getData
IS this code thread safe without using volatile keyword?
I think yes. Can someone confirm?

No, this is not thread safe. (What makes you think it is?)
If you are updating a variable in one thread and reading it from another, you must establish a happens-before relationship between the write and the subsequent read.
In short, this basically means making both the read and write synchronized (on the same monitor), or making the reference volatile.
Without that, there are no guarantees that the reading thread will see the update - and it wouldn't even be as simple as "well, it would either see the old value or the new value". Your reader threads could see some very odd behaviour with the data corruption that would ensue. Look at how lack of synchronization can cause infinite loops, for example (the comments to that article, especially Brian Goetz', are well worth reading):
The moral of the story: whenever mutable data is shared across threads, if you don’t use synchronization properly (which means using a common lock to guard every access to the shared variables, read or write), your program is broken, and broken in ways you probably can’t even enumerate.

No, it isn't.
Without volatile, calling getData() from a different thread may return a stale cached value.
volatile forces assignments from one thread to be visible on all other threads immediately.
Note that if the object itself is not immutable, you are likely to have other problems.

You may get a stale reference. You may not get an invalid reference.
The reference you get is the value of the reference to an object that the variable points to or pointed to or will point to.
Note that there are no guarantees how much stale the reference may be, but it's still a reference to some object and that object still exists. In other words, writing a reference is atomic (nothing can happen during the write) but not synchronized (it is subject to instruction reordering, thread-local cache et al.).
If you declare the reference as volatile, you create a synchronization point around the variable. Simply speaking, that means that all cache of the accessing thread is flushed (writes are written and reads are forgotten).
The only types that don't get atomic reads/writes are long and double because they are larger than 32-bits on 32-bit machines.

If MyObj is immutable (all fields are final), you don't need volatile.

The big problem with this sort of code is the lazy initialization. Without volatile or synchronized keywords, you could assign a new value to myobj that had not been fully initialized. The Java memory model allows for part of an object construction to be executed after the object constructor has returned. This re-ordering of memory operations is why the memory-barrier is so critical in multi-threaded situations.
Without a memory-barrier limitation, there is no happens-before guarantee so you do not know if the MyObj has been fully constructed. This means that another thread could be using a partially initialized object with unexpected results.
Here are some more details around constructor synchronization:
Constructor synchronization in Java

Volatile would work for boolean variables but not for references. Myobj seems to perform like a cached object it could work with an AtomicReference. Since your code extracts the value from the DB I'll let the code stay as is and add the AtomicReference to it.
import java.util.concurrent.atomic.AtomicReference;
public class AtomicReferenceTest {
private AtomicReference<MyObj> myobj = new AtomicReference<MyObj>();
public class Updater extends Thread {
public void run() {
MyObj newMyobj = getNewObjFromDb();
updateMyObj(newMyobj);
}
public void updateMyObj(MyObj newMyobj) {
myobj.compareAndSet(myobj.get(), newMyobj);
}
}
public MyObj getData() {
return myobj.get();
}
}
class MyObj {
}

Related

Do we need to synchronize writes if we are synchronizing reads?

I have few doubts about synchronized blocks.
Before my questions I would like to share the answers from another related post Link for Answer to related question. I quote Peter Lawrey from the same answer.
synchronized ensures you have a consistent view of the data. This means you will read the latest value and other caches will get the
latest value. Caches are smart enough to talk to each other via a
special bus (not something required by the JLS, but allowed) This
bus means that it doesn't have to touch main memory to get a
consistent view.
If you only use synchronized, you wouldn't need volatile. Volatile is useful if you have a very simple operation for which synchronized
would be overkill.
In reference to above I have three questions below :
Q1. Suppose in a multi threaded application there is an object or a primitive instance field being only read in a synchronized block (write may be happening in some other method without synchronization). Also Synchronized block is defined upon some other Object. Does declaring it volatile (even if it is read inside Synchronized block only) makes any sense ?
Q2. I understand the value of the states of the object upon which Synchronization has been done is consistent. I am not sure for the state of other objects and primitive fields being read in side the Synchronized block. Suppose changes are made without obtaining a lock but reading is done by obtaining a lock. Does state of all the objects and value of all primitive fields inside a Synchronized block will have consistent view always. ?
Q3. [Update] : Will all fields being read in a synchronized block will be read from main memory regardless of what we lock on ? [answered by CKing]
I have a prepared a reference code for my questions above.
public class Test {
private SomeClass someObj;
private boolean isSomeFlag;
private Object lock = new Object();
public SomeClass getObject() {
return someObj;
}
public void setObject(SomeClass someObj) {
this.someObj = someObj;
}
public void executeSomeProcess(){
//some process...
}
// synchronized block is on a private someObj lock.
// inside the lock method does the value of isSomeFlag and state of someObj remain consistent?
public void someMethod(){
synchronized (lock) {
while(isSomeFlag){
executeSomeProcess();
}
if(someObj.isLogicToBePerformed()){
someObj.performSomeLogic();
}
}
}
// this is method without synchronization.
public void setSomeFlag(boolean isSomeFlag) {
this.isSomeFlag = isSomeFlag;
}
}

The first thing you need to understand is that there is a subtle difference between the scenario being discussed in the linked answer and the scenario you are talking about. You speak about modifying a value without synchronization whereas all values are modified within a synchronized context in the linked answer. With this understanding in mind, let's address your questions :
Q1. Suppose in a multi threaded application there is an object or a primitive instance field being only read in a synchronized block (write may be happening in some other method without synchronization). Also Synchronized block is defined upon some other Object. Does declaring it volatile (even if it is read inside Synchronized block only) makes any sense ?
Yes it does make sense to declare the field as volatile. Since the write is not happening in a synchronized context, there is no guarantee that the writing thread will flush the newly updated value to main memory. The reading thread may still see inconsistent values because of this.
Suppose changes are made without obtaining a lock but reading is done by obtaining a lock. Does state of all the objects and value of all primitive fields inside a Synchronized block will have consistent view always. ?
The answer is still no. The reasoning is the same as above.
Bottom line : Modifying values outside synchronized context will not ensure that these values get flushed to main memory. (as the reader thread may enter the synchronized block before the writer thread does) Threads that read these values in a synchronized context may still end up reading older values even if they get these values from the main memory.
Note that this question talks about primitives so it is also important to understand that Java provides Out-of-thin-air safety for 32-bit primitives (all primitives except long and double) which means that you can be assured that you will atleast see a valid value (if not consistent).

All synchronized does is capture the lock of the object that it is synchronized on. If the lock is already captured, it will wait for its release. It does not in any way assert that that object's internal fields won't change. For that, there is volatile

When you synchronize on an object monitor A, it is guaranteed that another thread synchronizing on the same monitor A afterwards will see any changes made by the first thread to any object. That's the visibility guarantee provided by synchronized, nothing more.
A volatile variable guarantees visibility (for the variable only, a volatile HashMap doesn't mean the contents of the map would be visible) between threads regardless of any synchronized blocks.

do I need a lock?

MyObject myObj ...
public void updateObj(){
MyObject newObj = getNewMyObject();
myObj = newObj;
}
public int getSomething(){
//O(n^2) operation performed in getSomething method
int something = myObj.getSomething();
return something;
}
Suppose the main thread periodically calls updateObj() and a child thread calling getSomething() method pretty often.
Do I need a lock (or declare methods as synchronized) before myObj = newObj; and int something = myObj.getSomething();
Someone argued I don't need a lock here because in Java, assignment operations (e.g myObj = newObj;) is atomic. But what I don't get it is that myObj.getSomething(); this is not an atomic operation but its O(n^2) so I think a locking is still needed. Is this correct?

You need to declare myObject volatile, otherwise getSomething() method may not see updated object. Other than that I cannot see any synchronization issues in the above code

Yes, you must synchronize your access to the shared variable properly. According to the Java Memory Model, your code doesn't guarantee happens-before relationship between the read and the write in all possible executions.
Proper synchronization doesn't always mean using locks; In your case, declaring myObj as volatile can do the job (assuming that you don't mutate it after construction).

The write to a reference variable is indeed atomic.
But without the lock (or some other kind of synchronization) the other threads are not guaranteed to see the updated value of myObj to the extreme that they will see only its initial value (null, if you did not assign it in the constructor).
You will be hard-pressed to write a program in such a way that it exhibits exactly this extreme behavior, but without synchronization you will definitely get inconsistent results when a thread calls updateObj and than some other threads call getSomething and use the obsolete myObj instance.
The simplest way to ensure that getSomething uses the latest myObj is to declare the myObj as volatile.

yes you need some locking over here. because I can see is both different function is called by different threads and you are using myObj in getSomething() method.
Now assume when your child thread is executing getSomething() and at the same time your main thread have change myObj then your program will be victim of the race condition and you may not get the desired output.
but though it depends on the overall program you write. hope this helps

The updateObj() needs to be synchronized so that the shared mutable myObj is written to atomically.
MyObjects.getSomething() needs to be a synchronized method if it uses shared mutable values.

Effectively Immutable Object

I want to make sure that I correctly understand the 'Effectively Immutable Objects' behavior according to Java Memory Model.
Let's say we have a mutable class which we want to publish as an effectively immutable:
class Outworld {
// This MAY be accessed by multiple threads
public static volatile MutableLong published;
}
// This class is mutable
class MutableLong {
private long value;
public MutableLong(long value) {
this.value = value;
}
public void increment() {
value++;
}
public long get() {
return value;
}
}
We do the following:
// Create a mutable object and modify it
MutableLong val = new MutableLong(1);
val.increment();
val.increment();
// No more modifications
// UPDATED: Let's say for this example we are completely sure
// that no one will ever call increment() since now
// Publish it safely and consider Effectively Immutable
Outworld.published = val;
The question is:
Does Java Memory Model guarantee that all threads MUST have Outworld.published.get() == 3 ?
According to Java Concurrency In Practice this should be true, but please correct me if I'm wrong.
3.5.3. Safe Publication Idioms
To publish an object safely, both the reference to the object and the
object's state must be made visible to other threads at the same time.
A properly constructed object can be safely published by:
- Initializing an object reference from a static initializer;
- Storing a reference to it into a volatile field or AtomicReference;
- Storing a reference to it into a final field of a properly constructed object; or
- Storing a reference to it into a field that is properly guarded by a lock.
3.5.4. Effectively Immutable Objects
Safely published effectively immutable objects can be used safely by
any thread without additional synchronization.

Yes. The write operations on the MutableLong are followed by a happens-before relationship (on the volatile) before the read.
(It is possible that a thread reads Outworld.published and passes it on to another thread unsafely. In theory, that could see earlier state. In practice, I don't see it happening.)

There is a couple of conditions which must be met for the Java Memory Model to guarantee that Outworld.published.get() == 3:
the snippet of code you posted which creates and increments the MutableLong, then sets the Outworld.published field, must happen with visibility between the steps. One way to achieve this trivially is to have all that code running in a single thread - guaranteeing "as-if-serial semantics". I assume that's what you intended, but thought it worth pointing out.
reads of Outworld.published must have happens-after semantics from the assignment. An example of this could be having the same thread execute Outworld.published = val; then launch other the threads which could read the value. This would guarantee "as if serial" semantics, preventing re-ordering of the reads before the assignment.
If you are able to provide those guarantees, then the JMM will guarantee all threads see Outworld.published.get() == 3.
However, if you're interested in general program design advice in this area, read on.
For the guarantee that no other threads ever see a different value for Outworld.published.get(), you (the developer) have to guarantee that your program does not modify the value in any way. Either by subsequently executing Outworld.published = differentVal; or Outworld.published.increment();. While that is possible to guarantee, it can be so much easier if you design your code to avoid both the mutable object, and using a static non-final field as a global point of access for multiple threads:
instead of publishing MutableLong, copy the relevant values into a new instance of a different class, whose state cannot be modified. E.g.: introduce the class ImmutableLong, which assigns value to a final field on construction, and doesn't have an increment() method.
instead of multiple threads accessing a static non-final field, pass the object as a parameter to your Callable/Runnable implementations. This will prevent the possibility of one rogue thread from reassigning the value and interfering with the others, and is easier to reason about than static field reassignment. (Admittedly, if you're dealing with legacy code, this is easier said than done).

The question is: Does Java Memory Model guarantee that all threads
MUST have Outworld.published.get() == 3 ?
The short answer is no. Because other threads might access Outworld.published before it has been read.
After the moment when Outworld.published = val; had been performed, under condition that no other modifications done with the val - yes - it always be 3.
But if any thread performs val.increment then its value might be different for other threads.

In Java, is it safe to change a reference to a HashMap read concurrently

I hope this isn't too silly a question...
I have code similar to the following in my project:
public class ConfigStore {
public static class Config {
public final String setting1;
public final String setting2;
public final String setting3;
public Config(String setting1, String setting2, String setting3) {
this.setting1 = setting1;
this.setting2 = setting2;
this.setting3 = setting3;
}
}
private volatile HashMap<String, Config> store = new HashMap<String, Config>();
public void swapConfigs(HashMap<String, Config> newConfigs) {
this.store = newConfigs;
}
public Config getConfig(String name) {
return this.store.get(name);
}
}
As requests are processed, each thread will request a config to use from the store using the getConfig() function. However, periodically (every few days most likely), the configs are updated and swapped out using the swapConfigs() function. The code that calls swapConfigs() does not keep a reference to the Map it passes in as it is simply the result of parsing a configuration file.
In this case, is the volatile keyword still needed on the store instance variable?
Will the volatile keyword introduce any potential performance bottlenecks that I should be aware of or can avoid given that the rate of reads greatly exceeds the rate of writes?
Thanks very much,

Since changing references is an atomic operation, you won't end up with one thread modifying the reference, and the other seeing a garbage reference, even if you drop volatile. However, the new map may not get instantly visible for some threads, which may consequently keep reading configuration from the old map for an indefinite time (or forever). So keep volatile.
Update
As #BeeOnRope pointed out in a comment below, there is an even stronger reason to use volatile:
"non-volatile writes [...] don't establish a happens-before relationship between the write and subsequent reads that see the written value. This means that a thread can see a new map published through the instance variable, but this new map hasn't been fully constructed yet. This is not intuitive, but it's a consequence of the memory model, and it happens in the real word. For an object to be safely published, it must be written to a volatile, or use a handful of other techniques.
Since you change the value very rarely, I don't think volatile would cause any noticeable performance difference. But at any rate, correct behaviour trumps performance.

No, this is not thread safe without volatile, even apart from the issues of seeing stale values. Even though there are no writes to the map itself, and reference assignment is atomic, the new Map<> has not been safely published.
For an object to be safely published, it must be communicated to other threads using some mechanism that either establishes a happens-before relationship between the object construction, the reference publication and the reference read, or it must use a handful of narrower methods which are guaranteed to be safe for publishing:
Initializing an object reference from a static initializer.
Storing a reference to it into a final field.
Neither of those two publication specific ways applies to you, so you'll need volatile to establish happens-before.
Here is a longer version of this reasoning, including links to the JLS and some examples of real-world things that can happen if you don't publish safely.
More details on safe publication can be found in JCIP (highly recommended), or here.

Your code is fine. You need volatile, otherwise your code would be 100% thread-safe (updating a reference is atomic), however the change might not be visible to all the threads. It means some threads will still see the old value of store.
That being said volatile is obligatory in your example. You might consider AtomicReference, but it won't give you anything more in your case.
You cannot trade correctness for performance so your second question is not really valid. It will have some performance impact, but probably only during update, which happens very rarely as you said. Basically JVM will ensure the change is visible to all the threads by "flushing" it, but after that it will be accessible as any other local variable (up until next update).
BTW I like Config class being immutable, please also consider immutable Map implementation just in case.

Would it work for you to use a ConcurrentHashMap and instead of swapping the entire config update the affected values in the hash map?

Use of Volatile variables for safe publication of Immutable objects

I came across this statement:
In properly constructed objects, all
threads will see correct values of
final fields, regardless of how the
object is published.
Then why a volatile variable is used to safely
publishing an Immutable object?
I'm really confused. Can anybody make it clear with a suitable example?

In this case, the volatility would only ensure visibility of the new object; any other threads that happened to get hold of your object via a non-volatile field would indeed see the correct values of final fields as per JSR-133's initialization safety guarantees.
Still, making the variable volatile doesn't hurt; is correct from a memory management perspective anyway; and would be necessary for non-final fields initialised in a constructor (although there shouldn't be any of these in an immutable object). If you wish to share variables between threads, you'll need to ensure adequate synchronization to give visibility anyway; though in this case you're right, that there's no danger to the atomicity of the constructor.
Thanks to Tom Hawtin for pointing out I'd completely overlooked the JMM guarantees on final fields; previous incorrect answer is given below.
The reason for the volatile variable is that is establishes a happens-before relationship (according to the Java Memory Model) between the construction of the object, and the assignment of the variable. This achieves two things:
Subsequent reads of that variable from different threads are guaranteed to see the new value. Without marking the variable as volatile, these threads could see stale values of the reference.
The happens-before relationship places limits on what reorderings the compiler can do. Without a volatile variable, the assignment to the variable could happen before the object's constructor runs - hence other threads could get a reference to the object before it was fully constructed.
Since one of the fundamental rules of immutable objects is that you don't publish references during the constructor, it's this second point that is likely being referenced here. In a multithreaded environment without proper concurrent handling, it is possible for a reference to the object to be "published" before that object has been constructed. Thus another thread could get that object, see that one of its fields is null, and then later see that this "immutable" object has changed.
Note that you don't have to use volatile fields to achieve this if you have other appropriate synchronization primitives - for example, if the assignment (and all later reads) are done in a synchronized block on a given monitor - but in a "standalone" sense, marking the variable as volatile is the easiest way to tell the JVM "this might be read by multiple threads, please make the assignment safe in that context."

A volatile reference to an immutable object could be useful. This would allow you to swap one object for another to make the new data available to other threads.
I would suggets you look at using AtomicReference first however.
If you need final volatile fields you have a problem. All fields, including final ones are available to other threads as soon as the constructor returns. So if you pass an object to another thread in the constructor, it is possible for the other thread to see an inconsistent state. IMHO you should consider a different solution so you don't have to do this.

You cant really see the difference in Immutable class.see the below example.in Myclass.class
public static Foo getInstance(){
if(INSTANCE == null){
INSTANCE = new Foo();
}
return INSTANCE;
}
in the above code if Foo is declared final(final Foo INSTANCE;) it guarantees that it won't publish references during the constructor call.partial object construction is not possible
consider this...if this Myclass is Immutable, its state is not gonna change after object construction, making Volatile(volatile final Foo INSTANCE;) keyword redundant.but if this class allows its object state to be changed(Not immutable) multiple threads CAN actually update the object and some updates are not visible to other threads, hence volatile keyword ensures safety publication of objects in non-Immutable class.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.