I want to track getVariableAndLogAccess(RequestInfo requestInfo) using the code below. Will it be thread safe if only these two methods access variable?
What is the standard way to make it thread safe?
import java.util.HashMap;
import java.util.Map;

public class MyAccessLog {
    private int recordIndex = 0;
    private int variableWithAccessTracking = 42;
    private final Map<Integer, RequestInfo> requestsLog = new HashMap<>();

    public int getVariableAndLogAccess(RequestInfo requestInfo) {
        Integer myID = recordIndex++;
        int variableValue = variableWithAccessTracking;
        requestInfo.saveValue(variableValue);
        requestsLog.put(myID, requestInfo);
        return variableValue;
    }

    public void setValueAndLog(RequestInfo requestInfo, int newValue) {
        Integer myID = recordIndex++;
        variableWithAccessTracking = newValue;
        requestInfo.saveValue(newValue);
        requestsLog.put(myID, requestInfo);
    }
    /*other methods*/
}
Will it be thread safe if only these two methods access variable?
No.
For instance, if two threads call setValueAndLog, they might end up with the same myID value.
What is the standard way to make it thread safe?
You should either replace your int with an AtomicInteger, or use a lock or a synchronized block to prevent concurrent modifications.
As a rule of thumb, using an atomic variable such as the previously mentioned AtomicInteger is better than using locks, since contended locks can involve the operating system. Calling the operating system is like bringing in the lawyers - both are best avoided for things you can solve yourself.
Note that if you use locks or synchronized blocks, both the setter and getter need to use the same lock. Otherwise the getter could be accessed while the setter is still updating the variable, leading to concurrency errors.
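As a rough sketch of the "same lock for getter and setter" advice, the class could look like the following. RequestInfo is a hypothetical stand-in here (the question does not show it), and the private lock object and logSize() helper are my own additions for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the question's RequestInfo type.
final class RequestInfo {
    private int savedValue;

    public void saveValue(int value) {
        savedValue = value;
    }

    public int savedValue() {
        return savedValue;
    }
}

final class SynchronizedAccessLog {
    // One lock guards the counter, the tracked variable and the map.
    private final Object lock = new Object();
    private int recordIndex = 0;
    private int variableWithAccessTracking = 42;
    private final Map<Integer, RequestInfo> requestsLog = new HashMap<>();

    public int getVariableAndLogAccess(RequestInfo requestInfo) {
        synchronized (lock) {
            int value = variableWithAccessTracking;
            requestInfo.saveValue(value);
            requestsLog.put(recordIndex++, requestInfo);
            return value;
        }
    }

    public void setValueAndLog(RequestInfo requestInfo, int newValue) {
        synchronized (lock) {
            variableWithAccessTracking = newValue;
            requestInfo.saveValue(newValue);
            requestsLog.put(recordIndex++, requestInfo);
        }
    }

    // Illustrative helper: number of logged accesses.
    public int logSize() {
        synchronized (lock) {
            return requestsLog.size();
        }
    }
}
```

Because every read and write of the shared state goes through the same lock, the plain int counter and the HashMap are fine here; swapping only the counter for an AtomicInteger would still leave the HashMap unprotected.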
Will it be thread safe if only these two methods access variable?
Nope.
Intuitively, there are two reasons:
An increment consists of a read followed by a write. The JLS does not guarantee that the two will be performed as an atomic operation, and indeed Java implementations do not do that.
Modern multi-core systems implement memory access with fast local memory caches and slower main memory. This means that one thread is not guaranteed to see the results of another thread's memory writes ... unless there are appropriate "memory barrier" instructions to force main-memory writes / reads.
Java will only insert these instructions if the memory model says it is necessary. (Because ... they slow the code down!)
Technically, the JLS has a whole chapter describing the Java Memory Model, and it provides a set of rules that allow you to reason about whether memory is being used correctly. For the higher level stuff, you can reason based on the guarantees provided by AtomicInteger, etcetera.
What is the standard way to make it thread safe?
In this case, you could use an AtomicInteger instance, or you could synchronize using primitive object locking (i.e., the synchronized keyword) or a Lock object.
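To illustrate the AtomicInteger option, here is a small sketch that hands out IDs from several threads and counts how many distinct values were seen. The class and method names are made up for the example; the IDs are collected in a concurrent set because a plain HashMap would not be safe to share:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class UniqueIdDemo {
    // Returns how many distinct IDs were handed out across all threads.
    static int countDistinctIds(int threads, int idsPerThread) {
        AtomicInteger recordIndex = new AtomicInteger();
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    // Atomic read-modify-write: no two callers get the same value.
                    seen.add(recordIndex.getAndIncrement());
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }
}
```

With getAndIncrement() every call returns a different value, so the number of distinct IDs equals threads * idsPerThread. With a plain `recordIndex++` on an int field that would not be guaranteed.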
@Malt is right. Your code is not even close to being thread safe.
You can use AtomicInteger for your counter, but LongAdder would be more suitable for your case, as it is optimized for cases where you count things and read the result less often than you update it. LongAdder also has the same thread-safety assurance as AtomicInteger.
From the Javadoc on LongAdder:
This class is usually preferable to AtomicLong when multiple threads update a common sum that is used for purposes such as collecting statistics, not for fine-grained synchronization control. Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption.
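A minimal sketch of LongAdder used as a statistics counter, with made-up class and method names. One caveat that is my own observation about the API, not something the quoted documentation says: increment() and sum() are separate calls and there is no atomic increment-and-get, so LongAdder fits counting events better than handing out unique record IDs.

```java
import java.util.concurrent.atomic.LongAdder;

final class LongAdderDemo {
    // Counts events from several threads and returns the total after all have finished.
    static long countEvents(int threads, int eventsPerThread) {
        LongAdder counter = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < eventsPerThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // All updates are complete here, so sum() is exact.
        return counter.sum();
    }
}
```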
This is a common approach to log in a thread safe way:
For counter use AtomicInteger counter with counter.addAndGet(1) method.
Add data using public synchronized void putRecord(Data data){ /**/}
If you only use recordIndex as a handler for the record you can replace a map with a synchronized list: List list = Collections.synchronizedList(new LinkedList());
Introduction
Suppose I have a ConcurrentHashMap singleton:
public class RecordsMapSingleton {
private static final ConcurrentHashMap<String,Record> payments = new ConcurrentHashMap<>();
public static ConcurrentHashMap<String, Record> getInstance() {
return payments;
}
}
Then I have three subsequent requests (all processed by different threads) from different sources.
The first service makes a request, that gets the singleton, creates Record instance, generates unique ID and places it into Map, then sends this ID to another service.
Then the second service makes another request, with that ID. It gets the singleton, finds Record instance and modifies it.
Finally (probably after half an hour) the second service makes another request, in order to modify Record further.
Problem
In some really rare cases, I'm experiencing a heisenbug. In the logs I can see that the first request successfully placed the Record into the Map, the second request found it by ID and modified it, and then the third request tried to find the Record by ID but found nothing (get() returned null).
The single thing that I found about ConcurrentHashMap guarantees, is:
Actions in a thread prior to placing an object into any concurrent
collection happen-before actions subsequent to the access or removal
of that element from the collection in another thread.
from here. If I understand it correctly, this literally means that get() could return any value that was in the Map at some point, as long as it doesn't break the happens-before relationship between actions in different threads.
In my case it applies like this: if third request doesn't care about what happened during processing of first and second, then it could read null from Map.
It doesn't suit me, because I really need to get from Map the latest actual Record.
What have I tried
So I started to think, how to form happens-before relationship between subsequent Map modifications; and came with idea. JLS says (in 17.4.4) that:
A write to a volatile variable v (§8.3.1.4) synchronizes-with all
subsequent reads of v by any thread (where "subsequent" is defined
according to the synchronization order).
So, let's suppose, I'll modify my singleton like this:
public class RecordsMapSingleton {
private static final ConcurrentHashMap<String,Record> payments = new ConcurrentHashMap<>();
private static volatile long revision = 0;
public static ConcurrentHashMap<String, Record> getInstance() {
return payments;
}
public static void incrementRevision() {
revision++;
}
public static long getRevision() {
return revision;
}
}
Then, after each modification of Map or Record inside, I'll call incrementRevision() and before any read from Map I'll call getRevision().
Question
Due to nature of heisenbugs no amount of tests is enough to tell that this solution is correct. And I'm not an expert in concurrency, so couldn't verify it formally.
Can someone confirm that following this approach guarantees that I'm always going to get the latest actual value from ConcurrentHashMap? If this approach is incorrect or inefficient, could you recommend something else?
Your approach will not work, as you are actually repeating the same mistake. ConcurrentHashMap.put and ConcurrentHashMap.get create a happens-before relationship but no time-ordering guarantee, and exactly the same applies to your reads and writes of the volatile variable: if one thread happens to call get before the other has performed put, then its volatile read will likewise happen before the volatile write. Besides that, you are adding another error, as applying the ++ operator to a volatile variable is not atomic.
The guarantees made for volatile variables are not stronger than those made for a ConcurrentHashMap. Its documentation explicitly states:
Retrievals reflect the results of the most recently completed update operations holding upon their onset.
The JLS states that external actions are inter-thread actions regarding the program order:
An inter-thread action is an action performed by one thread that can be detected or directly influenced by another thread. There are several kinds of inter-thread action that a program may perform:
…
External Actions. An external action is an action that may be observable outside of an execution, and has a result based on an environment external to the execution.
Simply said, if one thread puts into a ConcurrentHashMap and sends a message to an external entity and a second thread gets from the same ConcurrentHashMap after receiving a message from an external entity depending on the previously sent message, there can’t be a memory visibility issue.
It might be the case that these actions aren’t programmed that way or that the external entity doesn’t have the assumed dependency. It might also be that the error lies in a completely different area, e.g. the key doesn’t match or the printing code is wrong, but we can’t tell as you didn’t post the relevant code. Whatever it is, it won’t be fixed by the volatile variable.
Hashtable and Collections.synchronizedMap are thread safe, but compound operations like
    if (!map_obj.containsKey(key)) {
        map_obj.put(key, value);
    }
still need external synchronization:
    synchronized(map_obj) {
        if (!map_obj.containsKey(key)) {
            map_obj.put(key, value);
        }
    }
Suppose we have ConcurrentHashMap(CHM) instead of Hashtable or HashMap. CHM provides an alternative putIfAbsent() method for the above compound operation, thus removing the need of external synchronization.
But suppose there is no putIfAbsent() provided by CHM. Then can we write following code:
synchronized(concurrenthashmap_obj) {
if (!concurrenthashmap_obj.containsKey(key)) {
concurrenthashmap_obj.put(key, value);
}
}
I mean, can we use external synchronization on the CHM object? Will it work?
For the above compound operation there is the putIfAbsent() method in CHM, but how can we achieve thread safety for other compound operations when using CHM? Can we use external synchronization on the CHM object?
No, you cannot use external synchronization to ensure atomicity of compound operations over ConcurrentHashMap.
To be precise, you can use external synchronization to ensure atomicity of compound operations, but only if all operations with ConcurrentHashMap are synchronized over the same lock as well (though use of ConcurrentHashMap won't make sense in this case - you can replace it with regular HashMap).
Approach with external synchronization works with Hashtable and Collections.synchronizedMap() only because they guarantee that their primitive operations are synchronized over these objects as well. Since ConcurrentHashMap doesn't provide such a guarantee, primitive operations may interfere with execution of your compound operations, breaking their atomicity.
However, ConcurrentHashMap provides a number of methods that can be used to implement compound operations in an optimistic manner:
putIfAbsent(key, value)
remove(key, value)
replace(key, value)
replace(key, oldValue, newValue)
You can use these operations to implement certain compound operations without explicit synchronization, the same way as you would do for AtomicReference, etc.
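As an illustration of the optimistic style, here is a sketch of a per-key counter built only from get, putIfAbsent(key, value) and replace(key, oldValue, newValue), retried in a loop much like a compare-and-set on an AtomicReference. The class, the "hits" key and the helper method are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class OptimisticCounter {
    private final ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();

    // Increments the counter for key and returns the new value.
    public int increment(String key) {
        while (true) {
            Integer current = counts.get(key);
            if (current == null) {
                // Succeeds only if no other thread inserted the key first.
                if (counts.putIfAbsent(key, 1) == null) {
                    return 1;
                }
            } else if (counts.replace(key, current, current + 1)) {
                // Succeeds only if the value is still the one we read.
                return current + 1;
            }
            // Another thread won the race: re-read and retry.
        }
    }

    public int get(String key) {
        return counts.getOrDefault(key, 0);
    }

    // Illustrative helper: hammer one key from several threads and return the final count.
    static int incrementConcurrently(int threads, int incrementsPerThread) {
        OptimisticCounter counter = new OptimisticCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.increment("hits");
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get("hits");
    }
}
```

No update is lost because a failed putIfAbsent or replace simply means the loop runs again against the fresh value.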
There isn't any reason why you can't. Traditional synchronization works with everything; there are no special exceptions. ConcurrentHashMap simply uses more optimized thread-safety mechanisms. If you wish to do something more complex, falling back to traditional synchronization (or using locks) may actually be your only option.
You can always use a synchronized block. The fancy collections in java.util.concurrent do not prohibit it, they just make it redundant for most common use cases. If you are performing a compound operation (e.g. - you want to insert two keys which must always have the same value), not only can you use external synchronization - you must.
E.g.:
String key1 = getKeyFromSomewhere();
String key2 = getKeyFromSomewhereElse();
String value = getValue();
// We want to put two pairs in the map - [key1, value] and [key2, value]
// and be sure that in any point in time both key1 and key2 have the same
// value
synchronized(concurrenthashmap_obj) {
concurrenthashmap_obj.put(key1, value);
// without external synchronization, key1's value may have already been
// overwritten from a different thread!
concurrenthashmap_obj.put(key2, value);
}
As the ConcurrentHashMap implements the Map Interface, it does support all features every basic Map does as well. So yes: you can use it like any other map and ignore all the extra features. But then you will essentially have a slower HashMap.
The main difference between a synchronized Map and a concurrent Map is - as the name says - concurrency. Imagine you have 100 threads wanting to read from the Map, if you synchronize you block out 99 threads and 1 can do the work. If you use concurrency 100 threads can work at the same time.
Now if you think about the actual reason why you use threads, you soon come to the conclusion that you should get rid of every possible synchronized block that you can.
It all depends on what you mean by "other compound operation" and by "working". Synchronization works with ConcurrentHashMap exactly the same way as it works with any other object.
So, if you want some complex shared state change to be seen as an atomic change, then all accesses to this shared state must be synchronized, on the same lock. This lock could be the Map itself, or it could be another object.
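A sketch of what "all accesses on the same lock" might look like for the two-keys-one-value case. The class and method names are invented for the example, and a plain HashMap is used because the lock already provides all the protection:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

final class PairedEntries {
    private final Object lock = new Object();
    private final Map<String, String> map = new HashMap<>();

    // Writes both keys as one atomic change.
    public void putBoth(String key1, String key2, String value) {
        synchronized (lock) {
            map.put(key1, value);
            map.put(key2, value);
        }
    }

    // Readers take the same lock, so they never observe only one of the two puts.
    public boolean haveSameValue(String key1, String key2) {
        synchronized (lock) {
            return Objects.equals(map.get(key1), map.get(key2));
        }
    }
}
```

If readers bypassed the lock and called get() directly on a shared ConcurrentHashMap, they could still see key1 updated and key2 not yet updated, even though the writer used a synchronized block.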
About java.util.concurrent.ConcurrentHashMap
"is fully interoperable with Hashtable in programs that rely on its thread safety but not on its synchronization details: they do not throw ConcurrentModificationException."
"allows concurrency among update operations"
About java memory
Generally speaking the reads are safe from a synchronization standpoint but not a memory standpoint.
See also "http://www.ibm.com/developerworks/java/library/j-jtp03304/".
So synchronization and volatile should be used to manage concurrent reading (vs. writing).
About putIfAbsent
putIfAbsent is your friend:
If the specified key is not already associated with a value, associate
it with the given
value. This is equivalent to
if (!map.containsKey(key))
return map.put(key, value);
else
return map.get(key);
except that the action is performed !!!atomically!!!.
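A tiny sketch of the return-value behaviour described in that Javadoc excerpt (class and method names are just for the example):

```java
import java.util.concurrent.ConcurrentHashMap;

final class PutIfAbsentDemo {
    // Returns { result of first call, result of second call, final mapped value }.
    static String[] run() {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        // No previous mapping: the value is stored and null is returned.
        String first = map.putIfAbsent("key", "first");
        // A mapping exists: it is left untouched and the existing value is returned.
        String second = map.putIfAbsent("key", "second");
        return new String[] { first, second, map.get("key") };
    }
}
```

A null return therefore tells the caller that its own value was the one that got stored.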
@Singleton
@LocalBean
@Startup
@ConcurrencyManagement(ConcurrencyManagementType.BEAN)
public class DeliverersHolderSingleton {
    private volatile Map<String, Deliverer> deliverers;

    @PostConstruct
    private void init() {
        Map<String, Deliverer> deliverersMod = new HashMap<>();
        for (String delivererName : delivererNames) {
            /*getting deliverer by name*/
            deliverersMod.put(delivererName, deliverer);
        }
        deliverers = Collections.unmodifiableMap(deliverersMod);
    }

    public Deliverer getDeliverer(String delivererName) {
        return deliverers.get(delivererName);
    }

    @Schedule(minute="*", hour="*")
    public void maintenance() {
        init();
    }
}
Singleton is used for storing data. Data is updated once per minute.
Is it possible that reading from the unmodifiableMap will cause a synchronization problem? Is it possible that a reordering occurs in the init method, so that the reference to the collection is published before the collection is completely filled?
The Java Memory Model guarantees that there is a happens-before relationship between a write and a subsequent read to a volatile variable. In other words, if you write to a volatile variable and subsequently read that same variable, you have the guarantee that the write operation will be visible, even if multiple threads are involved:
A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.
It goes further and guarantees that any operation that happened before the write operation will also be visible at the reading point (thanks to the program order rule and the fact that the happens-before relationship is transitive).
Your getDeliverer method reads from the volatile variable, so it will see the latest write performed on the line deliverers = Collections.unmodifiableMap(deliverersMod); as well as the preceding operations where the map is populated.
So your code is thread safe and your getDeliverer method will return a result based on the latest version of your map.
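Stripped of the EJB parts, the pattern is "populate a private map, then publish it with a single volatile write". A minimal sketch, with a made-up class name and String values standing in for Deliverer:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class SnapshotHolder {
    private volatile Map<String, String> snapshot = Collections.emptyMap();

    // Builds a fresh map privately, then publishes it in one volatile write.
    public void rebuild(Map<String, String> source) {
        Map<String, String> copy = new HashMap<>(source);
        snapshot = Collections.unmodifiableMap(copy);
    }

    // The volatile read sees a fully populated map, never a half-built one.
    public String lookup(String key) {
        return snapshot.get(key);
    }
}
```

Readers that call lookup() during a rebuild simply keep using the previous snapshot until the new reference is written.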
Thread safety issues here:
multiple reads from the HashMap - is thread safe, because multiple reads are allowed as long as there are no modifications to the collection and writes to the HashMap will not happen, because the map is an unmodifiableMap()
read/write on deliverers - is thread safe, because all java reference assignments are atomic
I can see no thread-unsafe operations here.
I would like to note that the name of the init() method is misleading: it suggests that it is called once during initialization. I'd suggest calling it rebuild() or recreate().
According to the Reordering Grid found here http://g.oswego.edu/dl/jmm/cookbook.html, the 1st operation being Normal Store cannot be reordered with the second operation being Volatile Store, so in your case, as long as the immutable map is not null, there wouldn't be any reordering problems.
Also, all writes that occur prior to a volatile store will be visible, so you will not see any publishing issues.
There is a case where a map will be constructed, and once it is initialized, it will never be modified again. It will however, be accessed (via get(key) only) from multiple threads. Is it safe to use a java.util.HashMap in this way?
(Currently, I'm happily using a java.util.concurrent.ConcurrentHashMap, and have no measured need to improve performance, but am simply curious if a simple HashMap would suffice. Hence, this question is not "Which one should I use?" nor is it a performance question. Rather, the question is "Would it be safe?")
Jeremy Manson, the god when it comes to the Java Memory Model, has a three part blog on this topic - because in essence you are asking the question "Is it safe to access an immutable HashMap" - the answer to that is yes. But you must answer the predicate to that question which is - "Is my HashMap immutable". The answer might surprise you - Java has a relatively complicated set of rules to determine immutability.
For more info on the topic, read Jeremy's blog posts:
Part 1 on Immutability in Java:
http://jeremymanson.blogspot.com/2008/04/immutability-in-java.html
Part 2 on Immutability in Java:
http://jeremymanson.blogspot.com/2008/07/immutability-in-java-part-2.html
Part 3 on Immutability in Java:
http://jeremymanson.blogspot.com/2008/07/immutability-in-java-part-3.html
Your idiom is safe if and only if the reference to the HashMap is safely published. Rather than anything relating the internals of HashMap itself, safe publication deals with how the constructing thread makes the reference to the map visible to other threads.
Basically, the only possible race here is between the construction of the HashMap and any reading threads that may access it before it is fully constructed. Most of the discussion is about what happens to the state of the map object, but this is irrelevant since you never modify it - so the only interesting part is how the HashMap reference is published.
For example, imagine you publish the map like this:
class SomeClass {
    public static HashMap<Object, Object> MAP;

    public synchronized static void setMap(HashMap<Object, Object> m) {
        MAP = m;
    }
}
... and at some point setMap() is called with a map, and other threads are using SomeClass.MAP to access the map, and check for null like this:
HashMap<Object,Object> map = SomeClass.MAP;
if (map != null) {
.. use the map
} else {
.. some default behavior
}
This is not safe even though it probably appears as though it is. The problem is that there is no happens-before relationship between the write to SomeClass.MAP and the subsequent read on another thread, so the reading thread is free to see a partially constructed map. This can pretty much do anything, and even in practice it does things like put the reading thread into an infinite loop.
To safely publish the map, you need to establish a happens-before relationship between the writing of the reference to the HashMap (i.e., the publication) and the subsequent readers of that reference (i.e., the consumption). Conveniently, there are only a few easy-to-remember ways to accomplish that[1]:
1. Exchange the reference through a properly locked field (JLS 17.4.5)
2. Use static initializer to do the initializing stores (JLS 12.4)
3. Exchange the reference via a volatile field (JLS 17.4.5), or as the consequence of this rule, via the AtomicX classes
4. Initialize the value into a final field (JLS 17.5).
The ones most interesting for your scenario are (2), (3) and (4). In particular, (3) applies directly to the code I have above: if you transform the declaration of MAP to:
public static volatile HashMap<Object, Object> MAP;
then everything is kosher: readers who see a non-null value necessarily have a happens-before relationship with the store to MAP and hence see all the stores associated with the map initialization.
The other methods change the semantics of your method, since both (2) (using the static initializer) and (4) (using final) imply that you cannot set MAP dynamically at runtime. If you don't need to do that, then just declare MAP as a static final HashMap<> and you are guaranteed safe publication.
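For the "AtomicX classes" variant of the volatile rule, a sketch could look like this. MapPublisher and lookup() are hypothetical names, and String keys and values are used only to keep the example short:

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

final class MapPublisher {
    // AtomicReference.set/get have the memory effects of a volatile write/read.
    private static final AtomicReference<Map<String, String>> MAP = new AtomicReference<>();

    // The map must be fully populated before it is passed in and never modified afterwards.
    static void setMap(Map<String, String> m) {
        MAP.set(m);
    }

    // Readers that see a non-null reference also see the map's contents.
    static String lookup(String key, String fallback) {
        Map<String, String> map = MAP.get();
        if (map == null) {
            return fallback; // nothing published yet: default behavior
        }
        return map.getOrDefault(key, fallback);
    }
}
```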
In practice, the rules are simple for safe access to "never-modified objects":
If you are publishing an object which is not inherently immutable (as in all fields declared final) and:
You can already create the object that will be assigned at the moment of declaration[a]: just use a final field (including static final for static members).
You want to assign the object later, after the reference is already visible: use a volatile field[b].
That's it!
In practice, it is very efficient. The use of a static final field, for example, allows the JVM to assume the value is unchanged for the life of the program and optimize it heavily. The use of a final member field allows most architectures to read the field in a way equivalent to a normal field read and doesn't inhibit further optimizations[c].
Finally, the use of volatile does have some impact: no hardware barrier is needed on many architectures (such as x86, specifically those that don't allow reads to pass reads), but some optimization and reordering may not occur at compile time - but this effect is generally small. In exchange, you actually get more than what you asked for - not only can you safely publish one HashMap, you can store as many more not-modified HashMaps as you want to the same reference and be assured that all readers will see a safely published map.
For more gory details, refer to Shipilev or this FAQ by Manson and Goetz.
[1] Directly quoting from shipilev.
[a] That sounds complicated, but what I mean is that you can assign the reference at construction time - either at the declaration point or in the constructor (member fields) or static initializer (static fields).
[b] Optionally, you can use a synchronized method to get/set, or an AtomicReference or something, but we're talking about the minimum work you can do.
[c] Some architectures with very weak memory models (I'm looking at you, Alpha) may require some type of read barrier before a final read - but these are very rare today.
The reads are safe from a synchronization standpoint but not a memory standpoint. This is something that is widely misunderstood among Java developers including here on Stackoverflow. (Observe the rating of this answer for proof.)
If you have other threads running, they may not see an updated copy of the HashMap if there is no memory write out of the current thread. Memory writes occur through the use of the synchronized or volatile keywords, or through uses of some java concurrency constructs.
See Brian Goetz's article on the new Java Memory Model for details.
After a bit more looking, I found this in the java doc (emphasis mine):
Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.)
This seems to imply that it will be safe, assuming the converse of the statement there is true.
One note is that under some circumstances, a get() from an unsynchronized HashMap can cause an infinite loop. This can occur if a concurrent put() causes a rehash of the Map.
http://lightbody.net/blog/2005/07/hashmapget_can_cause_an_infini.html
There is an important twist though. It's safe to access the map, but in general it's not guaranteed that all threads will see exactly the same state (and thus values) of the HashMap. This might happen on multiprocessor systems where the modifications to the HashMap done by one thread (e.g., the one that populated it) can sit in that CPU's cache and won't be seen by threads running on other CPUs, until a memory fence operation is performed ensuring cache coherence. The Java Language Specification is explicit on this one: the solution is to acquire a lock (synchronized (...)) which emits a memory fence operation. So, if you are sure that after populating the HashMap each of the threads acquires ANY lock, then it's OK from that point on to access the HashMap from any thread until the HashMap is modified again.
According to http://www.ibm.com/developerworks/java/library/j-jtp03304/ # Initialization safety you can make your HashMap a final field and after the constructor finishes it would be safely published.
...
Under the new memory model, there is something similar to a happens-before relationship between the write of a final field in a constructor and the initial load of a shared reference to that object in another thread.
...
This question is addressed in Brian Goetz's "Java Concurrency in Practice" book (Listing 16.8, page 350):
@ThreadSafe
public class SafeStates {
    private final Map<String, String> states;

    public SafeStates() {
        states = new HashMap<String, String>();
        states.put("alaska", "AK");
        states.put("alabama", "AL");
        ...
        states.put("wyoming", "WY");
    }

    public String getAbbreviation(String s) {
        return states.get(s);
    }
}
Since states is declared as final and its initialization is accomplished within the owner's class constructor, any thread who later reads this map is guaranteed to see it as of the time the constructor finishes, provided no other thread will try to modify the contents of the map.
So the scenario you described is that you need to put a bunch of data into a Map, then when you're done populating it you treat it as immutable. One approach that is "safe" (meaning you're enforcing that it really is treated as immutable) is to replace the reference with Collections.unmodifiableMap(originalMap) when you're ready to make it immutable.
For an example of how badly maps can fail if used concurrently, and the suggested workaround I mentioned, check out this bug parade entry: bug_id=6423457
Be warned that even in single-threaded code, replacing a ConcurrentHashMap with a HashMap may not be safe. ConcurrentHashMap forbids null as a key or value. HashMap does not forbid them (don't ask).
So in the unlikely situation that your existing code might add a null to the collection during setup (presumably in a failure case of some kind), replacing the collection as described will change the functional behaviour.
That said, provided you do nothing else concurrent reads from a HashMap are safe.
[Edit: by "concurrent reads", I mean that there are not also concurrent modifications.
Other answers explain how to ensure this. One way is to make the map immutable, but it's not necessary. For example, the JSR133 memory model explicitly defines starting a thread to be a synchronised action, meaning that changes made in thread A before it starts thread B are visible in thread B.
My intent is not to contradict those more detailed answers about the Java Memory Model. This answer is intended to point out that even aside from concurrency issues, there is at least one API difference between ConcurrentHashMap and HashMap, which could scupper even a single-threaded program which replaced one with the other.]
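The null-handling difference is easy to check with a small sketch (the class and helper names are made up for the example):

```java
import java.util.Map;

final class NullHandlingDemo {
    // Returns true if the map accepted a null value, false if it threw NullPointerException.
    static boolean acceptsNullValue(Map<String, String> map) {
        try {
            map.put("key", null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }
}
```

Passing a new HashMap gives true, while passing a new ConcurrentHashMap gives false, so code that relied on one behaviour can break when the map type is swapped.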
http://www.docjar.com/html/api/java/util/HashMap.java.html
here is the source for HashMap. As you can tell, there is absolutely no locking / mutex code there.
This means that while it's okay to read from a HashMap in a multithreaded situation, I'd definitely use a ConcurrentHashMap if there were multiple writes.
What's interesting is that both the .NET HashTable and Dictionary<K,V> have built in synchronization code.
If the initialization and every put is synchronized, you are safe.
The following code is safe because the classloader will take care of the synchronization:
public static final HashMap<String, String> map = new HashMap<>();
static {
map.put("A","A");
}
The following code is safe because the write to the volatile field will take care of the synchronization.
class Foo {
volatile HashMap<String, String> map;
public void init() {
final HashMap<String, String> tmp = new HashMap<>();
tmp.put("A","A");
// writing to volatile has to be after the modification of the map
this.map = tmp;
}
}
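This will also work if the member variable is final and the map is populated and assigned in the constructor. A final field is not volatile, but final fields have their own initialization-safety guarantee (JLS 17.5), provided the object's reference does not escape during construction.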