Concurrent access to unmodifiableMap - java

@Singleton
@LocalBean
@Startup
@ConcurrencyManagement(ConcurrencyManagementType.BEAN)
public class DeliverersHolderSingleton {

    private volatile Map<String, Deliverer> deliverers;

    @PostConstruct
    private void init() {
        Map<String, Deliverer> deliverersMod = new HashMap<>();
        for (String delivererName : delivererNames) {
            /* getting deliverer by name */
            deliverersMod.put(delivererName, deliverer);
        }
        deliverers = Collections.unmodifiableMap(deliverersMod);
    }

    public Deliverer getDeliverer(String delivererName) {
        return deliverers.get(delivererName);
    }

    @Schedule(minute = "*", hour = "*")
    public void maintenance() {
        init();
    }
}
Singleton is used for storing data. Data is updated once per minute.
Is it possible that reads from the unmodifiableMap will have synchronization problems? Is it possible that reordering occurs in the init method, so that the reference to the collection is published before the collection is completely filled?

The Java Memory Model guarantees that there is a happens-before relationship between a write and a subsequent read to a volatile variable. In other words, if you write to a volatile variable and subsequently read that same variable, you have the guarantee that the write operation will be visible, even if multiple threads are involved:
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
It goes further and guarantees that any operation that happened before the write operation will also be visible at the reading point (thanks to the program order rule and the fact that the happens-before relationship is transitive).
Your getDeliverer method reads from the volatile variable, so it will see the latest write performed on the line deliverers = Collections.unmodifiableMap(deliverersMod); as well as the preceding operations that populate the map.
So your code is thread safe, and your getDeliverer method will return a result based on the latest version of your map.
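That pattern (populate first, then publish through a single volatile write) can be reduced to a framework-free sketch; the class and method names here are illustrative, not from the original bean:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SafePublication {
    // A volatile write publishes the fully built map; readers see a complete snapshot.
    private volatile Map<String, String> snapshot = Collections.emptyMap();

    public void rebuild(Map<String, String> source) {
        Map<String, String> fresh = new HashMap<>(source); // populate first...
        snapshot = Collections.unmodifiableMap(fresh);     // ...then publish via the volatile write
    }

    public String get(String key) {
        return snapshot.get(key); // volatile read sees the latest published snapshot
    }
}
```

Readers either see the old snapshot or the new one, never a partially populated map.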

Thread safety issues here:
multiple reads from the HashMap - thread safe, because concurrent reads are allowed as long as there are no modifications to the collection, and writes to the HashMap will not happen because the map is an unmodifiableMap()
read/write on deliverers - thread safe, because all Java reference assignments are atomic
I can see no thread-unsafe operations here.
I would like to note that the name of the init() method is misleading; it suggests that it is called once during initialization. I'd suggest calling it rebuild() or recreate().

According to the reordering grid found in the JSR-133 Cookbook (http://g.oswego.edu/dl/jmm/cookbook.html), a first operation that is a Normal Store cannot be reordered with a second operation that is a Volatile Store, so in your case, as long as the unmodifiable map is not null, there won't be any reordering problems.
Also, all writes that occur prior to a volatile store will be visible, so you will not see any publishing issues.


Do we need to synchronize writes if we are synchronizing reads?

I have a few doubts about synchronized blocks.
Before my questions I would like to share the answer from another related post (linked as "Answer to related question"). I quote Peter Lawrey from that answer:
synchronized ensures you have a consistent view of the data. This means you will read the latest value and other caches will get the latest value. Caches are smart enough to talk to each other via a special bus (not something required by the JLS, but allowed). This bus means that it doesn't have to touch main memory to get a consistent view.
If you only use synchronized, you wouldn't need volatile. Volatile is useful if you have a very simple operation for which synchronized would be overkill.
With reference to the above, I have three questions:
Q1. Suppose that in a multithreaded application an object or a primitive instance field is only read in a synchronized block (writes may happen in some other method without synchronization), and the synchronized block locks on some other object. Does declaring the field volatile (even if it is only read inside the synchronized block) make any sense?
Q2. I understand that the state of the object on which synchronization is done is consistent. I am not sure about the state of other objects and primitive fields read inside the synchronized block. Suppose changes are made without obtaining a lock but reading is done by obtaining a lock. Will the state of all objects and the values of all primitive fields inside a synchronized block always have a consistent view?
Q3. [Update]: Will all fields read in a synchronized block be read from main memory, regardless of what we lock on? [answered by CKing]
I have prepared reference code for my questions above.
public class Test {

    private SomeClass someObj;
    private boolean isSomeFlag;
    private Object lock = new Object();

    public SomeClass getObject() {
        return someObj;
    }

    public void setObject(SomeClass someObj) {
        this.someObj = someObj;
    }

    public void executeSomeProcess() {
        // some process...
    }

    // the synchronized block is on a private lock object.
    // inside the lock, do the value of isSomeFlag and the state of someObj remain consistent?
    public void someMethod() {
        synchronized (lock) {
            while (isSomeFlag) {
                executeSomeProcess();
            }
            if (someObj.isLogicToBePerformed()) {
                someObj.performSomeLogic();
            }
        }
    }

    // this is a method without synchronization.
    public void setSomeFlag(boolean isSomeFlag) {
        this.isSomeFlag = isSomeFlag;
    }
}
The first thing you need to understand is that there is a subtle difference between the scenario being discussed in the linked answer and the scenario you are talking about. You speak about modifying a value without synchronization whereas all values are modified within a synchronized context in the linked answer. With this understanding in mind, let's address your questions :
Q1. Suppose that in a multithreaded application an object or a primitive instance field is only read in a synchronized block (writes may happen in some other method without synchronization), and the synchronized block locks on some other object. Does declaring the field volatile (even if it is only read inside the synchronized block) make any sense?
Yes, it does make sense to declare the field volatile. Since the write does not happen in a synchronized context, there is no guarantee that the writing thread will flush the newly updated value to main memory. The reading thread may still see inconsistent values because of this.
Suppose changes are made without obtaining a lock but reading is done by obtaining a lock. Will the state of all objects and the values of all primitive fields inside a synchronized block always have a consistent view?
The answer is still no. The reasoning is the same as above.
Bottom line: modifying values outside a synchronized context will not ensure that these values get flushed to main memory (the reader thread may enter the synchronized block before the writer thread does). Threads that read these values in a synchronized context may still end up reading older values, even if they get them from main memory.
Note that this question talks about primitives, so it is also important to understand that Java provides out-of-thin-air safety for 32-bit primitives (all primitives except long and double), which means you are at least assured of seeing a valid value (if not a consistent one).
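The volatile-flag fix described in the answer to Q1 can be sketched in isolation. The names below are illustrative, and Thread.onSpinWait requires Java 9+:

```java
public class StopFlagDemo {
    // volatile: the writer thread's update is guaranteed visible to the spinning reader,
    // even though stop() is not synchronized
    private volatile boolean running = true;

    public void stop() {
        running = false; // unsynchronized write, but still safely published
    }

    public void spinUntilStopped() {
        while (running) {
            Thread.onSpinWait(); // hint to the runtime that we are busy-waiting
        }
    }
}
```

Without volatile (and without any synchronization on both sides), the reader thread is permitted to loop forever on a stale cached value of the flag.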
All synchronized does is capture the lock of the object that it is synchronized on. If the lock is already taken, it will wait for its release. It does not in any way assert that that object's internal fields won't change; for that, there is volatile.
When you synchronize on an object monitor A, it is guaranteed that another thread synchronizing on the same monitor A afterwards will see any changes made by the first thread to any object. That's the visibility guarantee provided by synchronized, nothing more.
A volatile variable guarantees visibility (for the variable only, a volatile HashMap doesn't mean the contents of the map would be visible) between threads regardless of any synchronized blocks.

How to guarantee get() of ConcurrentHashMap to always return the latest actual value?

Introduction
Suppose I have a ConcurrentHashMap singleton:
public class RecordsMapSingleton {

    private static final ConcurrentHashMap<String, Record> payments = new ConcurrentHashMap<>();

    public static ConcurrentHashMap<String, Record> getInstance() {
        return payments;
    }
}
Then I have three subsequent requests (all processed by different threads) from different sources.
The first service makes a request that gets the singleton, creates a Record instance, generates a unique ID, places it into the Map, and then sends this ID to another service.
Then the second service makes another request with that ID. It gets the singleton, finds the Record instance, and modifies it.
Finally (probably after half an hour) the second service makes another request, in order to modify the Record further.
Problem
In some really rare cases I'm experiencing a heisenbug. In the logs I can see that the first request successfully placed the Record into the Map, the second request found it by ID and modified it, and then the third request tried to find the Record by ID but found nothing (get() returned null).
The only thing I found about ConcurrentHashMap's guarantees is:
Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.
from here. If I understand it correctly, it literally means that get() could return any value that was actually in the Map at some point, as long as it doesn't violate the happens-before relationships between actions in different threads.
In my case it applies like this: if the third request doesn't care about what happened during the processing of the first and second, then it could read null from the Map.
That doesn't suit me, because I really need to get the latest actual Record from the Map.
What have I tried
So I started to think about how to form a happens-before relationship between subsequent Map modifications, and came up with an idea. The JLS says (in 17.4.4) that:
A write to a volatile variable v (§8.3.1.4) synchronizes-with all subsequent reads of v by any thread (where "subsequent" is defined according to the synchronization order).
So, let's suppose, I'll modify my singleton like this:
public class RecordsMapSingleton {

    private static final ConcurrentHashMap<String, Record> payments = new ConcurrentHashMap<>();
    private static volatile long revision = 0;

    public static ConcurrentHashMap<String, Record> getInstance() {
        return payments;
    }

    public static void incrementRevision() {
        revision++;
    }

    public static long getRevision() {
        return revision;
    }
}
Then, after each modification of the Map or of a Record inside it, I'll call incrementRevision(), and before any read from the Map I'll call getRevision().
Question
Due to the nature of heisenbugs, no amount of testing is enough to show that this solution is correct, and I'm not an expert in concurrency, so I couldn't verify it formally.
Can someone confirm that following this approach guarantees that I will always get the latest actual value from the ConcurrentHashMap? If this approach is incorrect or inefficient, could you recommend something else?
Your approach will not work, as you are actually repeating the same mistake again. ConcurrentHashMap.put and ConcurrentHashMap.get create a happens-before relationship but no time-ordering guarantee, and exactly the same applies to your reads and writes of the volatile variable: they form a happens-before relationship but no time-ordering guarantee. If one thread happens to call get before the other performs put, then likewise the volatile read will happen before the volatile write. Besides that, you are adding another error, as applying the ++ operator to a volatile variable is not atomic.
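As an aside, the non-atomic revision++ could be made atomic with AtomicLong. This is only a sketch with illustrative names, and note that it still would not add the time-ordering guarantee the question hopes for:

```java
import java.util.concurrent.atomic.AtomicLong;

public class RevisionCounter {
    private final AtomicLong revision = new AtomicLong();

    public long increment() {
        // atomic read-modify-write; volatile++ is a separate read and write
        // and can lose updates under contention
        return revision.incrementAndGet();
    }

    public long get() {
        return revision.get();
    }
}
```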
The guarantees made for volatile variables are not stronger than those made for a ConcurrentHashMap. Its documentation explicitly states:
Retrievals reflect the results of the most recently completed update operations holding upon their onset.
The JLS states that external actions are inter-thread actions regarding the program order:
An inter-thread action is an action performed by one thread that can be detected or directly influenced by another thread. There are several kinds of inter-thread action that a program may perform:
…
External Actions. An external action is an action that may be observable outside of an execution, and has a result based on an environment external to the execution.
Simply said, if one thread puts into a ConcurrentHashMap and then sends a message to an external entity, and a second thread gets from the same ConcurrentHashMap after receiving a message from an external entity that depends on the previously sent message, there can't be a memory-visibility issue.
It might be that these actions aren't programmed that way, or that the external entity doesn't have the assumed dependency; the error may lie in a completely different area, but we can't tell, as you didn't post the relevant code (e.g. the key doesn't match or the printing code is wrong). Whatever it is, it won't be fixed by the volatile variable.
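The happens-before chain described in this answer can be sketched with a SynchronousQueue standing in for the external message bus; all names here are illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.SynchronousQueue;

public class HandOffDemo {
    static final Map<String, String> records = new ConcurrentHashMap<>();
    // stands in for the external messaging between the two services
    static final SynchronousQueue<String> channel = new SynchronousQueue<>();

    public static String run() throws InterruptedException {
        Thread writer = new Thread(() -> {
            records.put("id-42", "created");  // the put happens-before...
            try {
                channel.put("id-42");         // ...sending the message that hands off the key
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
        String id = channel.take();           // receiving the key...
        writer.join();
        return records.get(id);               // ...guarantees the earlier put is visible
    }
}
```

Because the get is causally dependent on the message, which is causally dependent on the put, the reader cannot observe null here.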

Avoiding concurrent structures by manually triggering memory barriers

Background
I have a class whose instances are used to collect and publish data (uses Guava's HashMultimap):
public class DataCollector {

    private final SetMultimap<String, String> valueSetsByLabel
            = HashMultimap.create();

    public void addLabelValue(String label, String value) {
        valueSetsByLabel.put(label, value);
    }

    public Set<String> getLabels() {
        return valueSetsByLabel.keySet();
    }

    public Set<String> getLabelValues(String label) {
        return valueSetsByLabel.get(label);
    }
}
Instances of this class will now be passed between threads, so I need to modify it for thread-safety. Since Guava's Multimap implementations aren't thread-safe, I used a LoadingCache that lazily creates concurrent hash sets instead (see the CacheBuilder and MapMaker javadocs for details):
public class ThreadSafeDataCollector {

    private final LoadingCache<String, Set<String>> valueSetsByLabel
            = CacheBuilder.newBuilder()
            .concurrencyLevel(1)
            .build(new CacheLoader<String, Set<String>>() {
                @Override
                public Set<String> load(String label) {
                    // make and return a concurrent hash set
                    final ConcurrentMap<String, Boolean> map = new MapMaker()
                            .concurrencyLevel(1)
                            .makeMap();
                    return Collections.newSetFromMap(map);
                }
            });

    public void addLabelValue(String label, String value) {
        valueSetsByLabel.getUnchecked(label).add(value);
    }

    public Set<String> getLabels() {
        return valueSetsByLabel.asMap().keySet();
    }

    public Set<String> getLabelValues(String label) {
        return valueSetsByLabel.getUnchecked(label);
    }
}
You'll notice I'm setting the concurrency level for both the loading cache and nested concurrent hash sets to 1 (meaning they each only read from and write to one underlying table). This is because I only expect one thread at a time to read from and write to these objects.
(To quote the concurrencyLevel javadoc, "A value of one permits only one thread to modify the map at a time, but since read operations can proceed concurrently, this still yields higher concurrency than full synchronization.")
Problem
Because I can assume there will only be a single reader/writer at a time, I feel that using many concurrent hash maps per object is heavy-handed. Such structures are meant to handle concurrent reads and writes, and guarantee atomicity of concurrent writes. But in my case atomicity is unimportant - I only need to make sure each thread sees the last thread's changes.
In my search for a more optimal solution I came across this answer by erickson, which says:
Any data that is shared between threads needs a "memory barrier" to ensure its visibility.
[...]
Changes to any member that is declared volatile are visible to all threads. In effect, the write is "flushed" from any cache to main memory, where it can be seen by any thread that accesses main memory.
Now it gets a bit trickier. Any writes made by a thread before that thread writes to a volatile variable are also flushed. Likewise, when a thread reads a volatile variable, its cache is cleared, and subsequent reads may repopulate it from main memory.
[...]
One way to make this work is to have the thread that is populating your shared data structure assign the result to a volatile variable. [...]
When other threads access that variable, not only are they guaranteed to get the most recent value for that variable, but also any changes made to the data structure by the thread before it assigned the value to the variable.
(See this InfoQ article for a further explanation of memory barriers.)
The problem erickson is addressing is slightly different: the data structure in question is fully populated and then assigned to a variable that he suggests be made volatile, whereas my structures are assigned to final variables and gradually populated across multiple threads. But his answer suggests I could use a volatile dummy variable to manually trigger memory barriers:
public class ThreadVisibleDataCollector {

    private final SetMultimap<String, String> valueSetsByLabel
            = HashMultimap.create();

    private volatile boolean dummy;

    private void readMainMemory() {
        if (dummy) { }
    }

    private void writeMainMemory() {
        dummy = false;
    }

    public void addLabelValue(String label, String value) {
        readMainMemory();
        valueSetsByLabel.put(label, value);
        writeMainMemory();
    }

    public Set<String> getLabels() {
        readMainMemory();
        return valueSetsByLabel.keySet();
    }

    public Set<String> getLabelValues(String label) {
        readMainMemory();
        return valueSetsByLabel.get(label);
    }
}
Theoretically, I could take this a step further and leave it to the calling code to trigger memory barriers, in order to avoid unnecessary volatile reads and writes between calls on the same thread (potentially by using Unsafe.loadFence and Unsafe.storeFence, which were added in Java 8). But that seems too extreme and hard to maintain.
Question
Have I drawn the correct conclusions from my reading of erickson's answer (and the JMM) and implemented ThreadVisibleDataCollector correctly? I wasn't able to find examples of using a volatile dummy variable to trigger memory barriers, so I want to verify that this code will behave as expected across architectures.
The thing you are trying to do is called "premature optimization". You don't have a real performance problem, but you are trying to make your entire program very complicated and possibly error-prone, without any gain.
The reason why you will never see any (notable) gain lies in the way a lock works. You can learn a lot by studying the documentation of the class AbstractQueuedSynchronizer.
A lock is built around a simple int value with volatile semantics and atomic updates. In the simplest case, i.e. without contention, locking and unlocking consist of a single atomic update of this int variable. Since you claim you can be sure that only one thread will access the data at a given time, there will be no contention, and the lock-state update has performance characteristics similar to your volatile boolean attempts, but with the difference that the Lock code works reliably and is heavily tested.
The ConcurrentMap approach goes a step further and allows a lock-free read that has the potential to be even more efficient than your volatile read (depending on the actual implementation).
So you are creating a potentially slower and possibly error-prone program just because you "feel that using many concurrent hash maps per object is heavy-handed". The only answer can be: don't feel. Measure. Or just leave it as is, as long as there is no real performance problem.
A write of some value to a volatile variable happens-before any subsequent read of that value from it. As a consequence, the visibility guarantees you want are achieved by reading and writing it, so the answer is yes, this solves the visibility issues.
Besides the problems mentioned by Darren Gilroy in his answer, I'd like to point out that in Java 8 there are explicit memory-barrier methods in the Unsafe class:
/**
* Ensures lack of reordering of loads before the fence
* with loads or stores after the fence.
*/
void loadFence();
/**
* Ensures lack of reordering of stores before the fence
* with loads or stores after the fence.
*/
void storeFence();
/**
* Ensures lack of reordering of loads or stores before the fence
* with loads or stores after the fence.
*/
void fullFence();
Although Unsafe is not a public API, I'd still recommend at least considering it if you're using Java 8.
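Since Java 9, equivalent fences are available without Unsafe, as static methods on java.lang.invoke.VarHandle. A sketch (the fields and class name are illustrative):

```java
import java.lang.invoke.VarHandle;

public class FenceDemo {
    private int data;            // plain, non-volatile field
    private boolean published;   // plain flag

    public void write(int value) {
        data = value;
        VarHandle.releaseFence(); // stores before the fence may not be reordered past it
        published = true;
    }

    public Integer read() {
        boolean ready = published;
        VarHandle.acquireFence(); // loads after the fence may not be reordered before it
        return ready ? data : null;
    }
}
```

VarHandle also offers fullFence(), loadLoadFence(), and storeStoreFence() for the other barrier shapes.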
One more solution comes to mind. You have set your concurrencyLevel to 1, which means that only one thread at a time can modify the collection. IMO standard Java synchronized or ReentrantLock (for cases of high contention) will also fit your task and provide visibility guarantees. If you want a one-writer, many-readers access pattern, consider using ReentrantReadWriteLock.
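A sketch of that one-writer/many-readers variant, with the Guava multimap replaced by a plain Map of Sets so the example stays self-contained (names are illustrative):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockedDataCollector {
    private final Map<String, Set<String>> valueSetsByLabel = new HashMap<>();
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    public void addLabelValue(String label, String value) {
        lock.writeLock().lock(); // exclusive: blocks readers and other writers
        try {
            valueSetsByLabel.computeIfAbsent(label, k -> new HashSet<>()).add(value);
        } finally {
            lock.writeLock().unlock();
        }
    }

    public Set<String> getLabelValues(String label) {
        lock.readLock().lock(); // shared: multiple readers may proceed concurrently
        try {
            // defensive copy so callers never hold a view of the guarded map
            return new HashSet<>(valueSetsByLabel.getOrDefault(label, Set.of()));
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Lock release happens-before the next acquisition, so readers always see the writer's completed updates.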
Well, that's still not particularly safe, because it depends a lot on the underlying implementation of the HashMultimap.
You might take a look at the following blog post for a discussion: http://mailinator.blogspot.com/2009/06/beautiful-race-condition.html
For this type of thing, a common pattern is to load a "most recent version" into a volatile variable and have your readers read immutable versions through that. This is how CopyOnWriteArrayList is implemented.
Something like ...
class Collector {

    private volatile SetMultimap<String, String> values = HashMultimap.create();

    public void add(String k, String v) {
        SetMultimap<String, String> t = HashMultimap.create(values);
        t.put(k, v);
        this.values = t; // this volatile write invokes a memory barrier
    }

    public Set<String> get(String k) {
        return values.get(k); // this volatile read is a memory barrier
    }
}
However, both your solution and mine still have a bit of a problem: we are both returning mutable views of the underlying data structure. I might change the HashMultimap to an ImmutableMultimap to fix the mutability issue. Beware also that callers retain a reference to the full internal map (not just the returned Set) as a side effect of things being a view.
Creating a new copy can seem somewhat wasteful, but I suspect that if you have only one thread writing, then you have an understanding of the rate of change and can decide whether that's reasonable. For example, if you wanted to return Set<String> instances that update dynamically as things change, then the solution based on MapMaker doesn't seem heavy-handed.

Characteristics of a volatile hashmap

I am trying to get a firm handle on how a variable declared as
private volatile HashMap<Object, ArrayList<String>> data;
would behave in a multi-threaded environment.
What I understand is that volatile means reading from main memory and not from the thread cache. That means that if the variable is being updated, I will not see the new value until the update is complete, and I will not block; rather, what I see is the last updated value. (This is exactly what I want, BTW.)
My question is: when I retrieve the ArrayList<String> and add or remove strings in thread A while thread B is reading, what exactly is affected by the volatile keyword? The HashMap only, or does the effect extend to the contents (K and V) of the HashMap as well? That is, when thread B gets an ArrayList<String> that is currently being modified in thread A, is what is actually returned the last value of the ArrayList<String> that existed before the update began?
Just to be clear, let's say the update adds 2 strings. One string has already been added in thread A when thread B gets the list. Does thread B get the list as it was before the first string was added?
That means that if a variable is being updated I will not see the new values until the update is complete and I will not block, rather what I see is the last updated value
This is the source of your confusion. What volatile does is make sure that reads and writes to that field are atomic, so no other thread can ever see a partially written value.
A non-atomic long field (which occupies two 32-bit words on a 32-bit machine) could be read incorrectly if a write operation was preempted after writing the first word and before writing the second.
Note that the atomicity of reads/writes to a field has nothing to do with updating the inner state of a HashMap. Updating the inner state of a HashMap entails multiple instructions, which are not atomic as a whole. That's why you'd use locks to synchronize access to the HashMap.
Also, since read/write operations on references are always atomic, even if the field is not marked volatile, there is no difference between a volatile and a non-volatile HashMap regarding atomicity. In that case, all volatile does is give you acquire-release semantics. This means that, even though the processor and the compiler are still allowed to slightly reorder your instructions, no instruction may ever be moved above a volatile read or below a volatile write.
The volatile keyword here applies only to the HashMap reference, not to the data stored within it, in this case the ArrayLists.
As stated in HashMap documentation:
Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the map. If no such object exists, the map should be "wrapped" using the Collections.synchronizedMap method. This is best done at creation time, to prevent accidental unsynchronized access to the map:
Map m = Collections.synchronizedMap(new HashMap(...));
The volatile keyword neither affects operations on the HashMap (e.g. put, get) nor operations on the ArrayLists within the HashMap. It only affects reads and writes of this particular reference to the HashMap. Again, there can be further references to the same HashMap, which are not affected.
If you want to synchronise all operations on:
- the reference
- the HashMap
- and the ArrayList,
then use an additional Lock object for synchronisation as in the following code.
private final Object lock = new Object();
private Map<Object, List<String>> map = new HashMap<>();

// access the reference
synchronized (lock) {
    map = new HashMap<>();
}

// access the reference and the HashMap
synchronized (lock) {
    return map.containsKey(42);
}

// access the reference, the HashMap and the ArrayList
synchronized (lock) {
    map.get(42).add("foobar");
}
If the reference never changes, you can use the HashMap itself for synchronization (instead of the lock object).

Volatile HashMap Not Updating Outside of Thread

So I have a HashMap that is declared at class level like so:
private static volatile HashMap<String, ArrayList<String>> map =
        new HashMap<String, ArrayList<String>>();
I have several threads updating the same map, and the threads are declared at class level like so:
private class UpdateThread extends Thread {
    @Override
    public void run() {
        // update map here
        // map actually gets updated here
    }
}
But after the threads exit:
for (UpdateThread thread : listOfThreads) {
    thread.start();
}
for (UpdateThread thread : listOfThreads) {
    try {
        thread.join();
        // map not updated anymore :-[
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}
Why are the map changes that occur inside the thread not persisting after the thread is done? I've declared the map static and volatile already...
Thanks in advance
Why are the map changes that are occurring inside the thread not persisting after the thread is done? I've declared the map static and volatile already...
It depends highly on how you are updating the map.
// update map here -- what's happening here?
As @Louis points out, if multiple threads are updating the same map instance, volatile won't help you, and you should be using a ConcurrentHashMap. As @Gerhard points out, volatile only protects the updating of the HashMap reference, not the innards of the map itself. You need to fully lock the map if the threads are updating it in parallel, or use a concurrent map.
However, if each thread is replacing the map with a new map, then the volatile field would work. Then again, each thread may be overwriting the central map because of race conditions.
If you show us your update code, we should be able to explain it better.
The keyword volatile only makes the reference to the HashMap visible to all threads.
If you want to access a HashMap from several threads, you need to use a synchronized map. The easiest choices are using java.util.Hashtable or Collections.synchronizedMap(map). The volatile declaration is useless in your case, since your variable is initialized once at the beginning.
The semantics of volatile apply only to the variable you are declaring.
In your case, the variable that holds your reference to map is volatile, and so the JVM will go to great lengths to ensure that changes you make to the reference contained by map are visible to other threads.
However, the object referred to by map is not covered by any such guarantee, and in order for changes to any object or object graph to be seen by other threads, you need to establish a happens-before relationship. With mutable state objects, this usually means synchronizing on a lock or using a thread-safe object designed for concurrency. Happily, in your case, a high-performance Map implementation designed for concurrent access is part of the Java library: ConcurrentHashMap.
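A sketch of that suggestion, pairing ConcurrentHashMap with a thread-safe list so the inner collections are safe as well (names are illustrative, not from the original code):

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class SkuCollector {
    // ConcurrentHashMap makes updates performed by worker threads visible
    // to the thread that calls join() on them
    private static final ConcurrentMap<String, List<String>> map = new ConcurrentHashMap<>();

    public static void add(String key, String value) {
        // computeIfAbsent atomically creates the list on first use for a key
        map.computeIfAbsent(key, k -> new CopyOnWriteArrayList<>()).add(value);
    }

    public static List<String> get(String key) {
        return map.get(key);
    }
}
```

CopyOnWriteArrayList keeps the inner add calls thread safe; for write-heavy inner lists, a Collections.synchronizedList-wrapped ArrayList would be another option.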
