Using ConcurrentHashMap, when is synchronizing necessary?


I have a ConcurrentHashMap where I do the following:
sequences = new ConcurrentHashMap<Class<?>, AtomicLong>();
if(!sequences.containsKey(table)) {
synchronized (sequences) {
if(!sequences.containsKey(table))
initializeHashMapKeyValue(table);
}
}
My question is: is it unnecessary to make the extra
if(!sequences.containsKey(table))
check inside the synchronized block, so that other threads won't initialize the same map value?
Maybe the check is necessary and I am doing it wrong? What I'm doing seems a bit silly, but I think it is necessary.

All operations on a ConcurrentHashMap are thread-safe, but thread-safe operations do not compose into an atomic whole. You are trying to make a pair of operations atomic: checking whether something is in the map and, if it is not, putting something there (I assume). So the answer to your question is yes, you need to check again, and your code looks OK.
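To make the point about the second check concrete, here is a minimal sketch of the pattern from the question. The class name DoubleCheckedInit, the method sequenceFor, and the initCount field are invented for this illustration; the original initializeHashMapKeyValue is assumed to simply put a fresh AtomicLong into the map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

class DoubleCheckedInit {
    private final ConcurrentHashMap<Class<?>, AtomicLong> sequences = new ConcurrentHashMap<>();
    // Demonstration only: counts how often the slow-path initialization ran.
    private final AtomicInteger initCount = new AtomicInteger();

    AtomicLong sequenceFor(Class<?> table) {
        if (!sequences.containsKey(table)) {              // fast path, no lock
            synchronized (sequences) {
                if (!sequences.containsKey(table)) {      // re-check while holding the lock
                    initCount.incrementAndGet();
                    sequences.put(table, new AtomicLong());
                }
            }
        }
        return sequences.get(table);
    }

    // Starts several threads that all ask for the same key and
    // returns how many times the initialization actually ran.
    static int run() {
        DoubleCheckedInit demo = new DoubleCheckedInit();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> demo.sequenceFor(String.class).incrementAndGet());
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return demo.initCount.get();
    }

    public static void main(String[] args) {
        System.out.println("initializations: " + run());
    }
}
```

Without the inner containsKey check, two threads that both pass the outer check could each run the initialization, one after the other, and the second would overwrite the first thread's AtomicLong.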

You should be using the putIfAbsent method of ConcurrentMap.
ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong> ();
public long addTo(String key, long value) {
// The final value it became.
long result = value;
// Make a new one to put in the map.
AtomicLong newValue = new AtomicLong(value);
// Insert my new one or get me the old one.
AtomicLong oldValue = map.putIfAbsent(key, newValue);
// Was it already there? putIfAbsent returns null if the key was absent.
if ( oldValue != null ) {
// Update it.
result = oldValue.addAndGet(value);
}
return result;
}
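For the functional purists amongst us, the above can be simplified (or perhaps complexified) to:
public long addTo(String key, long value) {
// putIfAbsent returns null when the key was absent, so read the mapped value back
// (this assumes entries are never removed from the map).
map.putIfAbsent(key, new AtomicLong());
return map.get(key).addAndGet(value);
}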
And in Java 8 we can avoid the unnecessary creation of an AtomicLong:
public long addTo8(String key, long value) {
return map.computeIfAbsent(key, k -> new AtomicLong()).addAndGet(value);
}
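A detail that is easy to miss with putIfAbsent: it returns the previous value associated with the key, or null if there was none. The sketch below handles that null explicitly; the class name PutIfAbsentDemo is made up for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

class PutIfAbsentDemo {
    static final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<>();

    static long addTo(String key, long value) {
        AtomicLong fresh = new AtomicLong(value);
        // Returns the existing AtomicLong, or null if 'fresh' was just inserted.
        AtomicLong existing = map.putIfAbsent(key, fresh);
        if (existing == null) {
            return value;                    // our new entry already holds 'value'
        }
        return existing.addAndGet(value);    // somebody else's entry: add to it
    }

    public static void main(String[] args) {
        System.out.println(addTo("a", 5));   // first call inserts, prints 5
        System.out.println(addTo("a", 3));   // second call adds, prints 8
    }
}
```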

You can't get an exclusive lock with ConcurrentHashMap. If you need one, you are better off using a synchronized HashMap (for example, Collections.synchronizedMap).
There is already an atomic method for putting a value into a ConcurrentHashMap only if the key is not already there: putIfAbsent.

I see what you did there ;-) The question is, do you see it yourself?
First of all, you used something called the "double-checked locking" pattern. It has a fast path (the first containsKey), which needs no synchronization when the key is already present, and a slow path, which must be synchronized because it performs a compound operation. Your operation consists of checking whether something is in the map and then putting something there / initializing it. It does not matter that ConcurrentHashMap is thread-safe for a single operation, because you perform two simple operations that must be treated as a unit. So yes, this synchronized block is correct, and it could actually synchronize on any other object, for example this, as long as every thread uses the same one.

In Java 8 you should be able to replace the double checked lock with .computeIfAbsent:
sequences.computeIfAbsent(table, k -> initializeHashMapKeyValue(k));
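Note that computeIfAbsent expects the function to return the value to store, so this only works if the initializer returns the new AtomicLong rather than putting it into the map itself. A minimal sketch under that assumption (SequenceRegistry and nextId are hypothetical names, not from the question):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class SequenceRegistry {
    private final ConcurrentHashMap<Class<?>, AtomicLong> sequences = new ConcurrentHashMap<>();

    long nextId(Class<?> table) {
        // The check and the insert happen atomically inside computeIfAbsent,
        // so no explicit synchronized block is needed.
        return sequences.computeIfAbsent(table, k -> new AtomicLong()).incrementAndGet();
    }

    // 8 threads draw 1000 ids each; the next id drawn afterwards is returned.
    static long run() {
        SequenceRegistry registry = new SequenceRegistry();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    registry.nextId(String.class);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return registry.nextId(String.class);
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints 8001
    }
}
```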

Create a file named dictionary.txt with the following contents:
a
as
an
b
bat
ball
Here we have:
Count of words starting with "a": 3
Count of words starting with "b": 3
Total word count: 6
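Now execute the following program as: java WordCount dictionary.txt 10
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.Instant;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;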
public class WordCount {
String fileName;
public WordCount(String fileName) {
this.fileName = fileName;
}
public void process() throws Exception {
long start = Instant.now().toEpochMilli();
LongAdder totalWords = new LongAdder();
//Map<Character, LongAdder> wordCounts = Collections.synchronizedMap(new HashMap<Character, LongAdder>());
ConcurrentHashMap<Character, LongAdder> wordCounts = new ConcurrentHashMap<Character, LongAdder>();
Files.readAllLines(Paths.get(fileName))
.parallelStream()
.map(line -> line.split("\\s+"))
.flatMap(Arrays::stream)
.parallel()
.map(String::toLowerCase)
.forEach(word -> {
totalWords.increment();
char c = word.charAt(0);
if (!wordCounts.containsKey(c)) {
wordCounts.put(c, new LongAdder());
}
wordCounts.get(c).increment();
});
System.out.println(wordCounts);
System.out.println("Total word count: " + totalWords);
long end = Instant.now().toEpochMilli();
System.out.println(String.format("Completed in %d milliseconds", (end - start)));
}
public static void main(String[] args) throws Exception {
for (int r = 0; r < Integer.parseInt(args[1]); r++) {
new WordCount(args[0]).process();
}
}
}
You would see counts vary as shown below:
{a=2, b=3}
Total word count: 6
Completed in 77 milliseconds
{a=3, b=3}
Total word count: 6
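Now comment out the ConcurrentHashMap declaration, uncomment the Collections.synchronizedMap line above it, and run the program again.
You would likely see stable counts, although the containsKey/put/get sequence is still not atomic as a whole.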
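One possible way to remove the race in the forEach body is to let the map perform the check and the insert in a single call. A minimal sketch, assuming the words are already in memory instead of being read from a file (the class and method names are invented for this example):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class WordCountFixed {
    static Map<Character, Long> countByFirstLetter(List<String> words) {
        ConcurrentHashMap<Character, LongAdder> counts = new ConcurrentHashMap<>();
        words.parallelStream()
             .map(String::toLowerCase)
             // computeIfAbsent creates the adder at most once per key, atomically.
             .forEach(w -> counts.computeIfAbsent(w.charAt(0), c -> new LongAdder()).increment());
        // Copy into a sorted map of plain numbers for stable printing.
        Map<Character, Long> result = new TreeMap<>();
        counts.forEach((c, adder) -> result.put(c, adder.sum()));
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "as", "an", "b", "bat", "ball");
        System.out.println(countByFirstLetter(words));   // prints {a=3, b=3}
    }
}
```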

Related

Atomic Integer Thread Clarification

I have the following code,
private final Map<String, AtomicInteger> wordCounter = new ConcurrentHashMap<>();
AtomicInteger count = wordCounter.get(word);
if (count == null) {
if ((count = wordCounter.putIfAbsent(word, new AtomicInteger(1))) == null) {
continue;
}
}
count.incrementAndGet();
I'm checking count == null in the if condition. As far as I know, operations on AtomicInteger are thread-safe. Is it necessary to lock the count instance using one of the locking mechanisms?

The above code works without any additional locking, but it can be simplified to the following idiomatic form
// If word doesn't exist, create a new atomic integer, otherwise return the existing
wordCounter.computeIfAbsent(word, k -> new AtomicInteger(0))
.incrementAndGet(); // increment it
Your code looks a bit like double checked locking, in that putIfAbsent() is used after the null-check to avoid overwriting a value that was possibly put there by another thread. However that path creates an extra AtomicInteger which doesn't happen with DCL. The extra object probably wouldn't matter much, but it does make the solution a little less "pure".

Necessity of the locks while working with concurrent hash map

Here is the code in one of my classes:
class SomeClass {
private Map<Integer, Integer> map = new ConcurrentHashMap<>();
private volatile int counter = 0;
final AtomicInteger sum = new AtomicInteger(0); // will be used in other classes/threads too
private ReentrantLock l = new ReentrantLock();
public void put(String some) {
l.lock();
try {
int tmp = Integer.parseInt(some);
map.put(counter++, tmp);
sum.getAndAdd(tmp);
} finally {
l.unlock();
}
}
public Double get() {
l.lock();
try {
//... perform some map resizing operation ...
// some calculations including sum field ...
} finally {
l.unlock();
}
}
}
You can assume that this class will be used in concurrent environment.
The question is: do you think the locks are necessary? How does this code smell? :)

Let's look at the operations inside public void put(String some).
map.put(counter++, tmp);
sum.getAndAdd(tmp);
Now let's look at the individual parts.
counter is a volatile variable. So it only provides memory visibility but not atomicity. Since counter++ is a compound operation, you need a lock to achieve atomicity.
map.put(key, value) is atomic since it is a ConcurrentHashMap.
sum.getAndAdd(tmp) is atomic since it is an AtomicInteger.
As you can see, except counter++ every other operation is atomic. However, you are trying to achieve some function by combining all these operations. To achieve atomicity at the functionality level, you need a lock. This will help you to avoid surprising side effects when the threads interleave between the individual atomic operations.
So you need a lock because counter++ is not atomic and you want to combine a few atomic operations to achieve some functionality (assuming you want this to be atomic).
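If the only reason for the lock were the counter++ on a volatile field, an AtomicInteger could take over that part. The sketch below works under that assumption only: it does not make put and get atomic with respect to each other, so a lock may still be needed if get must see a consistent snapshot. KeyCounterDemo is a hypothetical name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class KeyCounterDemo {
    private final Map<Integer, Integer> map = new ConcurrentHashMap<>();
    // Replaces 'volatile int counter': getAndIncrement is a single atomic step.
    private final AtomicInteger counter = new AtomicInteger();

    void put(int value) {
        map.put(counter.getAndIncrement(), value);
    }

    // 4 threads insert 1000 values each; returns the resulting map size.
    static int run() {
        KeyCounterDemo demo = new KeyCounterDemo();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    demo.put(n);
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return demo.map.size();
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints 4000: every put got a distinct key
    }
}
```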

Since you always increment counter when you use it as a key to put into this map:
map.put(counter++, tmp);
when you come to read it again:
return sum / map.get(counter);
map.get(counter) will be null, so this results in a NPE (unless you put more than 2^32 things into the map, ofc). (I'm assuming you mean sum.get(), otherwise it won't compile).
As such, you can have equivalent functionality without any locks:
class SomeClass {
public void put(String some) { /* do nothing */ }
public Double get() {
throw new NullPointerException();
}
}
You've not really fixed the problem with your edit. divisor will still be null, so the equivalent functionality without locks would be:
class SomeClass {
private final AtomicInteger sum = new AtomicInteger(0);
public void put(String some) {
sum.getAndAdd(Integer.parseInt(some));
}
public Double get() {
return (double) sum.get(); // an int cannot be boxed to Double without a cast
}
}

Is an assignment inside ConcurrentHashMap.computeIfAbsent threadsafe?

Consider the following implementation of some kind of fixed size cache, that allows lookup by an integer handle:
static class HandleCache {
private final AtomicInteger counter = new AtomicInteger();
private final Map<Data, Integer> handles = new ConcurrentHashMap<>();
private final Data[] array = new Data[100_000];
int getHandle(Data data) {
return handles.computeIfAbsent(data, k -> {
int i = counter.getAndIncrement();
if (i >= array.length) {
throw new IllegalStateException("array overflow");
}
array[i] = data;
return i;
});
}
Data getData(int handle) {
return array[handle];
}
}
There is an array store inside the compute function, which is not synchronized in any way. Would it be allowed by the java memory model for other threads to read a null value from this array later on?
PS: Would the outcome change if the id returned from getHandle was stored in a final field and only accessed through this field from other threads?

The read access isn't thread-safe. You could make it thread-safe indirectly, but that is likely to be brittle. I would implement it in a much simpler way and only optimise it later should it prove to be a performance problem, e.g. because you see it in a profiler for a realistic test.
static class HandleCache {
private final Map<Data, Integer> handles = new HashMap<>();
private final List<Data> dataByIndex = new ArrayList<>();
synchronized int getHandle(Data data) {
Integer id = handles.get(data);
if (id == null) {
id = handles.size();
handles.put(data, id);
dataByIndex.add(data);
}
return id;
}
synchronized Data getData(int handle) {
return dataByIndex.get(handle);
}
}

Assuming that you determine the index for the array read from the value of counter, then yes, you may get a null read.
The simplest example (there are others) is as follows:
T1 calls getHandle(data) and is suspended just after int i = counter.getAndIncrement();
T2 reads array[counter.get() - 1] and gets null.
You should be able to easily verify this with a strategically placed sleep and two threads.

From the documentation of ConcurrentHashMap#computeIfAbsent:
The entire method invocation is performed atomically, so the function is applied at most once per key. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.
The documentation's reference to blocking refers only to update operations on the Map, so if any other thread attempts to access array directly (rather than through an update operation on the Map), there can be race conditions and null can be read.

How to fix this race condition error? [duplicate]

I have such simple code:
class B {
//....
}
public class A {
private ConcurrentSkipListMap<Long, B> map = new ConcurrentSkipListMap<>();
public void add(B b) {
long key = LocalDateTime.now().toEpochSecond(ZoneOffset.UTC) / 60;
//this area has bug
if (map.containsKey(key)) {
B oldB = map.get(key);
// work with oldB
} else {
map.put(key, b);
}
//end this area
}
}
So two threads can compute the same key. The first thread takes the else path, but before it has added its value, the second thread runs the same check and also finds the key missing.

Wrap the area you have marked as "this area has a bug" in a synchronized block:
synchronized (map) {
if (map.containsKey(key)) {
B oldB = map.get(key);
// work with oldB
} else {
map.put(key, b);
}
}
This prevents two threads with the same key value from accessing the map at the same time, but only if all other accesses to the map are also synchronized on map (e.g. you don't have an unsynchronized map.get elsewhere in the class).
Note that this prevents all concurrent updates to the map, which might create an unacceptable bottleneck. Whilst you can use Long.valueOf(key) to obtain an instance on which you can synchronize, there is no range of inputs that is guaranteed to be cached.
Instead, you could perhaps map the long into the range of values cached by Integer.valueOf (i.e. -128 to 127), which would give you a more granular lock, e.g.
// Assuming that your clock isn't stuck in the 1960s...
Integer intKey = Integer.valueOf((int)( (longKey % 255) - 128));
synchronized (intKey) {
// ...
}
(Or, of course, you could maintain your own cache of keys).
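Another option, sketched here as an assumption about what "work with oldB" needs, is to avoid the explicit lock and rely on putIfAbsent, which ConcurrentSkipListMap performs atomically. In this illustration String stands in for B, and MinuteBuckets is a made-up class name.

```java
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;

class MinuteBuckets {
    private final ConcurrentSkipListMap<Long, String> map = new ConcurrentSkipListMap<>();

    // Returns true if this call inserted the value, false if an entry already existed.
    boolean add(long key, String value) {
        String old = map.putIfAbsent(key, value);   // atomic check-and-insert
        if (old != null) {
            // work with old here
            return false;
        }
        return true;
    }

    // 8 threads race to insert under the same key; returns how many of them won.
    static int run() {
        MinuteBuckets buckets = new MinuteBuckets();
        AtomicInteger inserted = new AtomicInteger();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                if (buckets.add(42L, "from-" + id)) {
                    inserted.incrementAndGet();
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return inserted.get();
    }

    public static void main(String[] args) {
        System.out.println("inserting threads: " + run());   // exactly one thread inserts
    }
}
```

Whether this is enough depends on what happens to the existing value: if the work on it mutates shared state, that state needs its own thread safety.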

Java ConcurrentHashMap actions atomicity

This may be a duplicate question, but I've found this part of code in a book about concurrency. This is said to be thread-safe:
ConcurrentHashMap<String, Integer> counts = new ...;
private void countThing(String thing) {
while (true) {
Integer currentCount = counts.get(thing);
if (currentCount == null) {
if (counts.putIfAbsent(thing, 1) == null)
break;
} else if (counts.replace(thing, currentCount, currentCount + 1)) {
break;
}
}
}
From my (concurrency beginner's) point of view, thread t1 and thread t2 could both read currentCount = 1. Then both threads could change the map's value to 2. Can someone please explain to me whether the code is okay or not?

The trick is that replace(K key, V oldValue, V newValue) provides the atomicity for you. From the docs (emphasis mine):
Replaces the entry for a key only if currently mapped to a given value. ... the action is performed atomically.
The key word is "atomically." Within replace, the "check if the old value is what we expect, and only if it is, replace it" happens as a single chunk of work, with no other threads able to interleave with it. It's up to the implementation to do whatever synchronization it needs to make sure that it provides this atomicity.
So, it can't be that both threads see currentCount == 1 from within the replace function. One of them will see it as 1, and thus its invocation of replace will return true. The other will see it as 2 (because of the first call), and thus return false, then loop back to try again, this time with the new value of currentCount == 2.
Of course, it could be that a third thread has updated the count to 3 in the meantime, in which case that second thread will just keep trying until it's lucky enough to not have anyone jump ahead of it.
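The retry behaviour described above can be checked with a small stress run. This is only a sketch: the loop is the one from the question, with the map passed in as a parameter, and ReplaceLoopDemo and run are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;

class ReplaceLoopDemo {
    static void countThing(ConcurrentHashMap<String, Integer> counts, String thing) {
        while (true) {
            Integer current = counts.get(thing);
            if (current == null) {
                // Only succeeds if nobody inserted the key in the meantime.
                if (counts.putIfAbsent(thing, 1) == null) {
                    break;
                }
            } else if (counts.replace(thing, current, current + 1)) {
                // Only succeeds if the value is still the one we read.
                break;
            }
            // Otherwise another thread got in first: re-read and retry.
        }
    }

    // 4 threads count the same key 1000 times each; returns the final count.
    static int run() {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    countThing(counts, "x");
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts.get("x");
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints 4000: no increment is lost
    }
}
```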

Can someone please explain me if the code is okay or not?
In addition to yshavit's answer, you can avoid writing your own loop by using compute which was added in Java 8.
ConcurrentMap<String, Integer> counts = new ...;
private void countThing(String thing) {
counts.compute(thing, (k, prev) -> prev == null ? 1 : 1 + prev);
}
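For a plain counter, merge (also added in Java 8) is a close relative of compute and reads slightly shorter. A minimal sketch, with MergeCountDemo as an invented class name:

```java
import java.util.concurrent.ConcurrentHashMap;

class MergeCountDemo {
    static void countThing(ConcurrentHashMap<String, Integer> counts, String thing) {
        // Inserts 1 if the key is absent, otherwise stores Integer.sum(oldValue, 1);
        // ConcurrentHashMap performs the whole call atomically.
        counts.merge(thing, 1, Integer::sum);
    }

    // 4 threads count the same key 1000 times each; returns the final count.
    static int run() {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    countThing(counts, "x");
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counts.get("x");
    }

    public static void main(String[] args) {
        System.out.println(run());   // prints 4000
    }
}
```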

With put you can also replace the value.
if (currentCount == null) {
counts.put(thing, 2);
}
