Following up on this question (Java thread safety - multiple atomic operations?), I don't want to add more questions to it, but now I have this doubt:
private final Map<String, Set<String>> data = Maps.newConcurrentMap();
... then in a method ...
if (data.containsKey("A")) {
data.get("A").add("B");
}
It should be something like this:
synchronized(data) {
if (data.containsKey("A")) {
data.get("A").add("B");
}
}
In order to be thread-safe. Is that correct?
So operations are atomic, but combining them would require synchronisation, is that right? At that point, would it make sense to just use a simple HashMap instead of a concurrent one, as we're manually handling sync?
Is there any method in CHM to make this work atomically?
In your specific case, you might want to use the computeIfPresent method of ConcurrentHashMap:
data.computeIfPresent("A", (k, v) -> { v.add("B"); return v; } );
From the javadocs:
If the value for the specified key is present, attempts to compute a new mapping given the key and its current mapped value. The entire method invocation is performed atomically.
So there's no need for explicit synchronization.
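A minimal sketch of that approach, assuming the sets are only ever modified inside computeIfPresent while other threads are running. The class and method names are made up for illustration. ConcurrentHashMap runs the remapping function atomically per key, so even a plain HashSet works under that assumption:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; the names are illustrative only.
final class ComputeIfPresentDemo {

    /** Lets several threads add distinct strings to the set stored under "A". */
    static int addConcurrently(int threads, int perThread) {
        Map<String, Set<String>> data = new ConcurrentHashMap<>();
        data.put("A", new HashSet<>());

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    String value = id + "-" + i;
                    // The presence check and the update run as one atomic step.
                    data.computeIfPresent("A", (k, v) -> { v.add(value); return v; });
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        // join() makes the workers' writes visible to this thread.
        return data.get("A").size();
    }
}
```

If other code reads the set through data.get("A") while writers are still active, the HashSet itself would need to be replaced by a thread-safe set.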
synchronized(data) {
if (data.containsKey("A")) {
data.get("A").add("B");
}
}
You probably need to show more code.
Looking only at this, the only possible issue is that someone removes the Set found at "A" after your if check. If you don't ever remove map entries you need no synchronization at all.
If you do remove map entries concurrently, you could use computeIfPresent to arrive at the updated map.
You could also do
Set<String> set = data.get("A");
if (set != null) set.add("B");
Since you are not actually producing a new Set, I find this more idiomatic than computeIfPresent (which should compute a new value).
Note that you need to make all these Sets thread-safe as well.
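As a rough sketch of the last two points combined, assuming entries are never removed from the map: a single get with a null check, plus sets created with ConcurrentHashMap.newKeySet() so that they are thread-safe themselves. The class and its methods are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical wrapper; assumes keys are registered but never removed.
final class ConcurrentSetValues {
    private final Map<String, Set<String>> data = new ConcurrentHashMap<>();

    /** Registers a key with an empty, thread-safe set if it is not present yet. */
    void register(String key) {
        data.putIfAbsent(key, ConcurrentHashMap.<String>newKeySet());
    }

    /** Adds the value only if the key is registered; returns true if it was newly added. */
    boolean addIfPresent(String key, String value) {
        Set<String> set = data.get(key); // one lookup instead of containsKey + get
        return set != null && set.add(value);
    }

    /** Number of values stored under the key, or 0 if the key is unknown. */
    int size(String key) {
        Set<String> set = data.get(key);
        return set == null ? 0 : set.size();
    }
}
```

If entries can be removed concurrently, an add may still land in a set that was just removed from the map, which is where computeIfPresent becomes the safer choice.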
I have a special need to create a thread monitor based on a string value.
Ex:
Map<String, String> values = new HashMap<>(); // instance variable
values.put("1", "one");values.put("2", "two");values.put("3", "three");
void someMethod(String value) {
synchronized(values.get(value) == null ? value : values.get(value)) {
System.out.println("I'm done");
}
}
The catch here is that the synchronized block uses a ternary operator. Is that allowed? I don't get any compile-time or runtime errors.
I'm not sure whether the above code is really thread-safe. Only one thread at a time should be able to obtain the monitor for a given string value.
Please share your thoughts on this. Is this good practice, or is there a better way?
There are fundamental problems with this approach. You’re accessing a HashMap, which is not thread safe, before ever entering the synchronized block. If there are updates to the map after its construction, this approach is broken.
It’s crucial to use the same object instance for synchronizing when accessing the same data.
So even if you used a thread-safe map here, values.get(value) == null ? value : values.get(value) means synchronizing on changing objects. When the map is updated, the code sometimes locks on the key and sometimes on the mapped value, depending on whether a mapping is present. Even when the key is always present, it may lock on different mapped values.
It’s also an instance of the check-then-act anti-pattern: you check values.get(value) == null first and call values.get(value) again afterwards, when the condition could already have changed.
You should never use strings for synchronization, as different string objects may be equal, so they map to the same data when using them as key to a Map, whereas synchronization fails due to different object identity. On the other hand, strings may get shared freely in a JVM and they are in case of string literals, so unrelated code performing synchronization on strings could block each other.
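Both halves of that argument can be seen in a tiny sketch (the class name and the "lock-1" value are arbitrary): equal strings need not be the same object, while identical literals always are.

```java
// Hypothetical demo of why String makes a poor monitor object.
final class StringMonitorPitfall {

    /** Equal content but different objects: each would guard a different monitor. */
    static boolean equalButDistinct() {
        String a = new String("lock-1");
        String b = new String("lock-1");
        return a.equals(b) && a != b;
    }

    /** Identical literals are interned, so unrelated code ends up sharing one monitor. */
    static boolean literalsShared() {
        String mine = "lock-1";
        String theirs = "lock-1"; // imagine this line living in an unrelated class
        return mine == theirs;
    }
}
```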
There’s a simple solution using a tool designed for this purpose. When using
ConcurrentMap<String, String> values = new ConcurrentHashMap<>();
void someMethod(String string) {
values.compute(string, (key,value) -> {
if(value == null) value = key.toUpperCase(); // construct when not present
// update value
return value;
});
}
the string’s equality determines the mutual exclusion while not serving as the synchronization key itself. So equal keys provide the desired blocking, while unrelated code, e.g. using a different ConcurrentHashMap with similar or even the same key values, is not affected by these operations.
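To illustrate, here is a small sketch that hammers one key from several threads, deliberately passing a fresh String object each time. The update rule (append "!") is only a placeholder for the real per-key work, and the class is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo: per-key mutual exclusion via compute().
final class PerKeyCompute {
    private final ConcurrentMap<String, String> values = new ConcurrentHashMap<>();

    /** Creates the value on first use, otherwise appends "!" to it, atomically per key. */
    void touch(String key) {
        values.compute(key, (k, v) -> v == null ? k.toUpperCase() : v + "!");
    }

    String get(String key) {
        return values.get(key);
    }

    /** Touches the key "a" threads * perThread times using distinct but equal String objects. */
    static String runDemo(int threads, int perThread) {
        PerKeyCompute demo = new PerKeyCompute();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    demo.touch(new String("a")); // equality, not identity, selects the entry
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return demo.get("a");
    }
}
```

Every call contributes exactly one character to the stored value, so a lost update would show up as a shorter string.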
I have the following defined
private ConcurrentMap<Integer, AtomicInteger> staffValues = new ConcurrentHashMap<Integer, AtomicInteger>();
private void add() {
staffValues.replace(100, staffValues.get(100), new AtomicInteger(staffValues.get(100).addAndGet(200)));
}
After testing, the values I am getting are not expected, and I think there is a race condition here. Does anyone know if this would be considered threadsafe by wrapping the get call in the replace function?
A good way to handle situations like this is using the computeIfAbsent method (not the compute method that #the8472 recommends)
computeIfAbsent accepts 2 arguments: the key, and a Function<K, V> that will only be called if the existing value is missing. Since an AtomicInteger is thread safe to increment from multiple threads, you can use it easily in the following manner:
staffValues.computeIfAbsent(100, k -> new AtomicInteger(0)).addAndGet(200);
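Put into a small self-contained class, this could look roughly as follows. StaffCounters and runDemo are hypothetical names; only the computeIfAbsent line comes from the answer.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper around the one-liner above.
final class StaffCounters {
    private final ConcurrentMap<Integer, AtomicInteger> staffValues = new ConcurrentHashMap<>();

    /** Creates the counter on first use, then increments it in place. */
    int add(int key, int delta) {
        return staffValues.computeIfAbsent(key, k -> new AtomicInteger(0)).addAndGet(delta);
    }

    int get(int key) {
        AtomicInteger counter = staffValues.get(key);
        return counter == null ? 0 : counter.get();
    }

    /** Adds 200 to key 100 from several threads and returns the final total. */
    static int runDemo(int threads, int perThread) {
        StaffCounters counters = new StaffCounters();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counters.add(100, 200);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counters.get(100);
    }
}
```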
There are a few issues with your code. The biggest is that you're ignoring the return-value of ConcurrentHashMap.replace: if the replacement doesn't happen (due to another thread having made a replacement in parallel), you simply proceed as if it happened. This is the main reason you're getting wrong results.
I also think it's a design mistake to mutate an AtomicInteger and then immediately replace it with a different AtomicInteger; even if you can get this working, there's simply no reason for it.
Lastly, I don't think you should call staffValues.get(100) twice. I don't think that causes a bug in the current code — your correctness depends only on the second call returning a "newer" result than the first, which I think is actually guaranteed by ConcurrentHashMap — but it's fragile and subtle and confusing. In general, when you call ConcurrentHashMap.replace, its third argument should be something you computed using the second.
Overall, you can simplify your code either by not using AtomicInteger:
private ConcurrentMap<Integer, Integer> staffValues = new ConcurrentHashMap<>();
private void add() {
final Integer prevValue = staffValues.get(100);
staffValues.replace(100, prevValue, prevValue + 200);
}
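Note that the snippet above still ignores the return value of replace and would throw if the key were missing. A hedged sketch of a retry loop that handles both cases, using a hypothetical ReplaceLoop class:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo: optimistic update with replace() and a retry loop.
final class ReplaceLoop {
    private final ConcurrentMap<Integer, Integer> staffValues = new ConcurrentHashMap<>();

    /** Adds 200 to the value under key 100, retrying until the update actually lands. */
    void add() {
        while (true) {
            Integer prev = staffValues.get(100);
            if (prev == null) {
                // No mapping yet: try to install the first value.
                if (staffValues.putIfAbsent(100, 200) == null) {
                    return;
                }
            } else if (staffValues.replace(100, prev, prev + 200)) {
                return; // nobody changed the value in between
            }
            // Another thread won the race: re-read and try again.
        }
    }

    int get() {
        Integer value = staffValues.get(100);
        return value == null ? 0 : value;
    }

    static int runDemo(int threads, int perThread) {
        ReplaceLoop loop = new ReplaceLoop();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    loop.add();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return loop.get();
    }
}
```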
or by not using replace (and perhaps not even ConcurrentMap, depending how else you're touching this map):
private Map<Integer, AtomicInteger> staffValues = new HashMap<>();
private void add() {
staffValues.get(100).addAndGet(200);
}
You don't need to use replace(). AtomicInteger is a mutable value that does not need to be substituted whenever you want to increment it. In fact addAndGet already increments it in place.
Instead use compute to put a default value (presumably 0) into the map when none is present and otherwise get the pre-existing value and increment that.
If, on the other hand, you want to use immutable values put Integer instances instead of AtomicInteger into the map and update them with the atomic compute/replace/merge operations.
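For the immutable variant, a minimal sketch with Integer values might look like this. MergeCounter is a made-up name; merge and compute are the standard ConcurrentMap methods.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo: immutable Integer values updated with atomic map operations.
final class MergeCounter {
    private final ConcurrentMap<Integer, Integer> staffValues = new ConcurrentHashMap<>();

    /** Stores delta if the key is absent, otherwise stores old + delta. */
    void addWithMerge(int key, int delta) {
        staffValues.merge(key, delta, Integer::sum);
    }

    /** Same effect, written with compute and an explicit null check. */
    void addWithCompute(int key, int delta) {
        staffValues.compute(key, (k, v) -> v == null ? delta : v + delta);
    }

    int get(int key) {
        Integer value = staffValues.get(key);
        return value == null ? 0 : value;
    }

    static int runDemo(int threads, int perThread) {
        MergeCounter counter = new MergeCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final boolean useMerge = t % 2 == 0;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    if (useMerge) {
                        counter.addWithMerge(100, 200);
                    } else {
                        counter.addWithCompute(100, 200);
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get(100);
    }
}
```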
I am a bit confused regarding one pattern I have seen in some legacy code of ours.
The controller uses a map as a cache, with an approach that should be thread safe, however I am still not confident it indeed is. We have a map, which is properly synchronized during addition and retrieval, however, there is a bit of logic outside of the synchronized block, that does some additional filtering.
(the map itself and the lists are never accessed outside of this method, so concurrent modification is not an issue; the map holds some stable parameters, which basically never change, but are used often).
The code looks like the following sample:
public class FooBarController {
private final Map<String, List<FooBar>> fooBarMap =
new HashMap<String, List<FooBar>>();
public FooBar getFooBar(String key, String foo, String bar) {
List<FooBar> foobarList;
synchronized (fooBarMap) {
if (fooBarMap.get(key) == null) {
foobarList = queryDbByKey(key);
fooBarMap.put(key, foobarList);
} else {
foobarList = fooBarMap.get(key);
}
}
for(FooBar fooBar : foobarList) {
if(foo.equals(fooBar.getFoo()) && bar.equals(fooBar.getBar()))
return fooBar;
}
return null;
}
private List<FooBar> queryDbByKey(String key) {
// ... (simple Hibernate-query)
}
// ...
}
Based on what I know about the JVM memory model, this should be fine, since if one thread populates a list, another one can only retrieve it from the map with proper synchronization in place, ensuring that the entries of the list are visible. (putting the list happens-before getting it)
However, we keep seeing cases, where an entry expected to be in the map is not found, combined with the typical notorious symptoms of concurrency issues (e.g. intermittent failures in production, which I cannot reproduce in my development environment; different threads can properly retrieve the value etc.)
I am wondering if iterating through the elements of the List like this is thread-safe?
The code you provided is correct in terms of concurrency. Here are the guarantees:
Only one thread at a time adds values to the map, because of the synchronization on the map object.
Values added by one thread become visible to all other threads that enter the synchronized block.
Given that, you can be sure that all threads that iterate a list see the same elements. The issues you described are indeed strange but I doubt they're related to the code you provided.
It can only be thread safe if all accesses to fooBarMap are synchronized. A little out of scope, but it may be safer to use a ConcurrentHashMap.
There is a great article on how hashmaps can be synchronized here.
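A possible shape for the ConcurrentHashMap variant, as a sketch only: String rows stand in for FooBar, and queryDbByKey is a placeholder for the real Hibernate query. computeIfAbsent loads each key at most once, and the list is wrapped so that callers cannot modify the cached copy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical cache; String rows stand in for the FooBar entities.
final class FooBarCache {
    private final ConcurrentMap<String, List<String>> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loads = new AtomicInteger();

    /** Returns the cached list for the key, loading it on first access. */
    List<String> get(String key) {
        return cache.computeIfAbsent(key, k -> Collections.unmodifiableList(queryDbByKey(k)));
    }

    /** Placeholder for the real database query. */
    private List<String> queryDbByKey(String key) {
        loads.incrementAndGet();
        List<String> rows = new ArrayList<>();
        rows.add(key + "-1");
        rows.add(key + "-2");
        return rows;
    }

    /** How often the placeholder query ran; useful for checking the cache works. */
    int loadCount() {
        return loads.get();
    }
}
```

One trade-off to be aware of: the loader runs while the map holds an internal lock for that key's bin, so a slow query can delay other updates that hash to the same bin.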
In a situation like this, the best option is to use ConcurrentHashMap.
Verify that all updates and reads are properly ordered.
As I understand from your question, there is a fixed set of params which never changes. One approach I prefer in a situation like this is:
I. Create the map cache during startup and keep only one instance of it.
II. Read the map instance anytime, anywhere in the application.
In the for loop you are returning a reference to a FooBar object held in foobarList.
So the method calling getFooBar() has access to the contents of the map through this reference.
Try cloning the fooBar before returning it from getFooBar().
A data structure that I use commonly in multi-threaded applications is a ConcurrentHashMap where I want to save a group of items that all share the same key. The problem occurs when installing the first item for a particular key value.
The pattern that I have been using is:
final ConcurrentMap<KEYTYPE, Set<VALUETYPE>> hashMap = new ConcurrentHashMap<KEYTYPE, Set<VALUETYPE>>();
// ...
Set<VALUETYPE> newSet = new HashSet<VALUETYPE>();
final Set<VALUETYPE> set = hashMap.putIfAbsent(key, newSet);
if (set != null) {
newSet = set;
}
synchronized (newSet) {
if (!newSet.contains(value)) {
newSet.add(value);
}
}
Is there a better pattern for doing this operation? Is this even thread-safe? Is there a better class to use for the inner Set than java.util.HashSet?
I strongly recommend using the Google Guava libraries for this, specifically an implementation of Multimap. The HashMultimap would be your best bet, though if you need concurrent update operations you would need to wrap it in a delegate using Multimaps.synchronizedSetMultimap().
Another option is to use a ComputingMap (also from Guava), which is a map that, if the Value returned from a call to get(Key) does not exist, it is instantiated there and then. ComputingMaps are created using MapMaker.
The code from your question would be roughly:
ConcurrentMap<KEYTYPE, Set<VALUETYPE>> hashMap = new MapMaker()
.makeComputingMap(
new Function<KEYTYPE, Set<VALUETYPE>>() {
public Set<VALUETYPE> apply(KEYTYPE key) {
return new HashSet<VALUETYPE>();
}
});
The Function would only be called when a call to get() for a specific key would otherwise return null. This means that you can then do this:
hashMap.get(key).add(value);
safely knowing that the HashSet<VALUETYPE> is created if it doesn't already exist.
MapMaker is also relevant because of the control it gives you over the tuning of the returned Map, letting you specify, for example, the concurrency level using the method concurrencyLevel(). You may find that useful:
Guides the allowed concurrency among update operations. Used as a hint for internal sizing. The table is internally partitioned to try to permit the indicated number of concurrent updates without contention. Because assignment of entries to these partitions is not necessarily uniform, the actual concurrency observed may vary.
I think using java.util.concurrent.ConcurrentSkipListMap and java.util.concurrent.ConcurrentSkipListSet could help you resolve the concurrency concerns.
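A JDK-only sketch along those lines, assuming Java 8 or later and values with a natural ordering (which ConcurrentSkipListSet requires). GroupedValues is a hypothetical name. computeIfAbsent installs the inner set atomically, and the set handles concurrent adds itself, so no synchronized block is needed:

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical multi-value map built from JDK classes only.
final class GroupedValues<K, V extends Comparable<V>> {
    private final ConcurrentMap<K, Set<V>> map = new ConcurrentHashMap<>();

    /** Adds the value to the key's set, creating the set atomically on first use. */
    boolean add(K key, V value) {
        return map.computeIfAbsent(key, k -> new ConcurrentSkipListSet<V>()).add(value);
    }

    /** Read-only view of the values for the key; empty if the key is unknown. */
    Set<V> get(K key) {
        Set<V> set = map.get(key);
        if (set == null) {
            return Collections.emptySet();
        }
        return Collections.unmodifiableSet(set);
    }

    /** Every thread adds the same range of integers; duplicates must collapse. */
    static int runDemo(int threads, int perThread) {
        GroupedValues<String, Integer> grouped = new GroupedValues<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    grouped.add("key", i);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return grouped.get("key").size();
    }
}
```

If ordering is not needed, ConcurrentHashMap.newKeySet() would be a drop-in alternative for the inner set.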
I'm in the following situation.
At web application startup I need to load a Map which is thereafter used by multiple incoming threads. That is, requests come in and the Map is used to find out whether it contains a particular key, and if so, the value (the object) is retrieved and associated with another object.
Now, at times the content of the Map changes. I don't want to restart my application to reload the new situation. Instead I want to do this dynamically.
However, at the time the Map is re-loading (removing all items and replacing them with the new ones), concurrent read requests on that Map still arrive.
What should I do to prevent all read threads from accessing that Map while it's being reloaded? How can I do this in the most performant way, given that I only need this while the Map is reloading, which will only occur sporadically (once every x weeks)?
If the above (blocking) is not an option, how can I make sure that my read requests won't suffer from unexpected exceptions while reloading (because a key is no longer there, or a value is no longer present or is being reloaded)?
I was given the advice that a ReadWriteLock might help me out. Can someone provide an example of how I should use this ReadWriteLock with my readers and my writer?
Thanks,
E
I suggest handling this as follows:
Have your map accessible at a central place (could be a Spring singleton, a static ...).
When starting to reload, leave the existing instance as is and work on a different Map instance.
When that new map is filled, replace the old map with this new one (that's an atomic operation).
Sample code:
static volatile Map<U, V> map = ....;
// **************************
Map<U, V> tempMap = new ...;
load(tempMap);
map = tempMap;
Concurrency effects :
volatile helps with visibility of the variable to other threads.
While reloading the map, all other threads see the old value undisturbed, so they suffer no penalty whatsoever.
Any thread that retrieves the map the instant before it is changed will work with the old values.
It can ask several gets to the same old map instance, which is great for data consistency (not loading the first value from the older map, and others from the newer).
It will finish processing its request with the old map, but the next request will ask the map again, and will receive the newer values.
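The effects listed above can be sketched in a small holder class. ReloadableConfig and its methods are hypothetical, and String keys and values are an assumption made for brevity. The new map is fully built before the volatile reference is swapped, and each reader works on one local snapshot:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for a map that is replaced wholesale on reload.
final class ReloadableConfig {
    private volatile Map<String, String> map = Collections.emptyMap();

    /** Builds the new map off to the side, then publishes it with one volatile write. */
    void reload(Map<String, String> freshData) {
        Map<String, String> temp = new HashMap<>(freshData);
        map = Collections.unmodifiableMap(temp);
    }

    /** Returns the current map; the returned instance never changes afterwards. */
    Map<String, String> snapshot() {
        return map;
    }

    /** Reads the reference once so that both calls see the same map instance. */
    String lookup(String key) {
        Map<String, String> local = map;
        return local.containsKey(key) ? local.get(key) : null;
    }
}
```

A request that grabbed snapshot() before a reload keeps seeing the old, consistent contents, while the next lookup() sees the new ones.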
If the client threads do not modify the map, i.e. the contents of the map is solely dependent on the source from where it is loaded, you can simply load a new map and replace the reference to the map your client threads are using once the new map is loaded.
Other than using twice the memory for a short time, no performance penalty is incurred.
In case the map uses too much memory to have 2 of them, you can use the same tactic per object in the map; iterate over the map, construct a new mapped-to object and replace the original mapping once the object is loaded.
Note that changing the reference as suggested by others could cause problems if you rely on the map being unchanged for a while (e.g. if (map.containsKey(key)) {V value = map.get(key); ...}). If you need that, you should keep a local reference to the map:
static Map<U,V> map = ...;
void doWork() {
Map<U,V> local = map;
if (local.containsKey(key)) {
V value = local.get(key);
...
}
}
EDIT:
The assumption is that you don't want costly synchronization for your client threads. As a trade-off, you allow client threads to finish work that they had already begun before your map changed, ignoring any changes to the map that happen while they are running. This way, you can safely make some assumptions about your map, e.g. that a key is present and always mapped to the same value for the duration of a single request. In the example above, if your reloading thread changed the map just after a client called map.containsKey(key), the client might get null from map.get(key), and you'd almost certainly end this request with a NullPointerException. So if you're doing multiple reads on the map and need to make assumptions like the one mentioned before, it's easiest to keep a local reference to the (maybe obsolete) map.
The volatile keyword isn't strictly necessary here. It would just make sure that the new map is used by other threads as soon as you change the reference (map = newMap). Without volatile, a subsequent read (local = map) could still return the old reference for some time (we're talking about less than a nanosecond though), especially on multicore systems if I remember correctly. I wouldn't care about it, but if you feel a need for that extra bit of multi-threading beauty, you're free to use it of course ;)
I like the volatile Map solution from KLE a lot and would go with that. Another idea that someone might find interesting is to use the map equivalent of a CopyOnWriteArrayList, basically a CopyOnWriteMap. We built one of these internally and it is non-trivial but you might be able to find a COWMap out in the wild:
http://old.nabble.com/CopyOnWriteMap-implementation-td13018855.html
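To give an idea of the concept, here is a deliberately minimal copy-on-write map. It is not the implementation from the link above, just a sketch with made-up names: reads go through a volatile reference without locking, and every write copies the whole map, which only pays off when writes are rare.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal sketch; a production version would implement the Map interface.
final class CopyOnWriteMapSketch<K, V> {
    private volatile Map<K, V> current = Collections.emptyMap();

    /** Lock-free read against the currently published map. */
    V get(K key) {
        return current.get(key);
    }

    /** Copies the map, applies the change, then publishes the copy. */
    synchronized V put(K key, V value) {
        Map<K, V> copy = new HashMap<>(current);
        V old = copy.put(key, value);
        current = copy;
        return old;
    }

    /** Same copy-and-publish scheme for removals. */
    synchronized V remove(K key) {
        Map<K, V> copy = new HashMap<>(current);
        V old = copy.remove(key);
        current = copy;
        return old;
    }

    /** Read-only view of the map as published right now; later writes do not affect it. */
    Map<K, V> snapshot() {
        return Collections.unmodifiableMap(current);
    }
}
```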
This is the answer from the JDK javadocs for ReentrantReadWriteLock implementation of ReadWriteLock. A few years late but still valid, especially if you don't want to rely only on volatile
class RWDictionary {
private final Map<String, Data> m = new TreeMap<String, Data>();
private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
private final Lock r = rwl.readLock();
private final Lock w = rwl.writeLock();
public Data get(String key) {
r.lock();
try { return m.get(key); }
finally { r.unlock(); }
}
public String[] allKeys() {
r.lock();
try { return m.keySet().toArray(new String[0]); }
finally { r.unlock(); }
}
public Data put(String key, Data value) {
w.lock();
try { return m.put(key, value); }
finally { w.unlock(); }
}
public void clear() {
w.lock();
try { m.clear(); }
finally { w.unlock(); }
}
}
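Applied to the reload scenario from the question, the same pattern could look roughly like this. ReloadableRegistry and its String types are assumptions for illustration. Readers share the read lock, and the reload takes the write lock so that no reader ever sees a half-filled map:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical registry that is cleared and refilled under the write lock.
final class ReloadableRegistry {
    private final Map<String, String> m = new HashMap<>();
    private final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();

    /** Many readers may hold the read lock at the same time. */
    String get(String key) {
        rwl.readLock().lock();
        try {
            return m.get(key);
        } finally {
            rwl.readLock().unlock();
        }
    }

    /** Replaces the whole content; readers wait until the new content is complete. */
    void reload(Map<String, String> fresh) {
        rwl.writeLock().lock();
        try {
            m.clear();
            m.putAll(fresh);
        } finally {
            rwl.writeLock().unlock();
        }
    }
}
```

Readers only block for the duration of clear() and putAll(), so it helps to prepare the fresh data before calling reload().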