Necessity of the locks while working with concurrent hash map

Here is the code in one of my classes:
class SomeClass {
private Map<Integer, Integer> map = new ConcurrentHashMap<>();
private volatile int counter = 0;
final AtomicInteger sum = new AtomicInteger(0); // will be used in other classes/threads too
private ReentrantLock l = new ReentrantLock();
public void put(String some) {
l.lock();
try {
int tmp = Integer.parseInt(some);
map.put(counter++, tmp);
sum.getAndAdd(tmp);
} finally {
l.unlock();
}
}
public Double get() {
l.lock();
try {
//... perform some map resizing operation ...
// some calculations including sum field ...
} finally {
l.unlock();
}
}
}
You can assume that this class will be used in a concurrent environment.
The question is: do you think the locks are necessary? How does this code smell? :)

Let's look at the operations inside public void put(String some).
map.put(counter++, tmp);
sum.getAndAdd(tmp);
Now let's look at the individual parts.
counter is a volatile variable, so it provides memory visibility but not atomicity. Since counter++ is a compound read-modify-write operation, you need a lock to make it atomic.
map.put(key, value) is atomic since it is a ConcurrentHashMap.
sum.getAndAdd(tmp) is atomic since it is an AtomicInteger.
As you can see, except counter++ every other operation is atomic. However, you are trying to achieve some function by combining all these operations. To achieve atomicity at the functionality level, you need a lock. This will help you to avoid surprising side effects when the threads interleave between the individual atomic operations.
So you need a lock because counter++ is not atomic and you want to combine a few atomic operations to achieve some functionality (assuming you want this to be atomic).
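If put does not have to be atomic as a whole, the non-atomic counter++ can be removed without a lock. The following is a minimal sketch, assuming readers can tolerate briefly seeing a map entry whose value is not yet reflected in sum. The class name LockFreePut and the size() helper are hypothetical and not part of the original code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical lock-free variant of put(): every step is atomic on its own,
// but the three steps together are NOT one atomic action.
class LockFreePut {
    private final Map<Integer, Integer> map = new ConcurrentHashMap<>();
    // AtomicInteger replaces "volatile int counter" so the increment is atomic.
    private final AtomicInteger counter = new AtomicInteger();
    final AtomicInteger sum = new AtomicInteger();

    void put(String some) {
        int tmp = Integer.parseInt(some);
        // getAndIncrement() hands each caller a unique key, even under contention.
        map.put(counter.getAndIncrement(), tmp);
        // Another thread may observe the new map entry before this line runs.
        sum.getAndAdd(tmp);
    }

    // Helper added only for illustration.
    int size() {
        return map.size();
    }
}
```

If a reader must always see the map and sum in a mutually consistent state, this is not enough and the lock from the question is still required.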

Since you always increment counter when you use it as a key to put into this map:
map.put(counter++, tmp);
when you come to read it again:
return sum / map.get(counter);
map.get(counter) will be null, so this results in a NullPointerException (unless you put more than 2^32 things into the map, of course). (I'm assuming you mean sum.get(), otherwise it won't compile.)
As such, you can have equivalent functionality without any locks:
class SomeClass {
public void put(String some) { /* do nothing */ }
public Double get() {
throw new NullPointerException();
}
}
You've not really fixed the problem with your edit. divisor will still be null, so the equivalent functionality without locks would be:
class SomeClass {
private final AtomicInteger sum = new AtomicInteger(0);
public void put(String some) {
sum.getAndAdd(Integer.parseInt(some));
}
public Double get() {
return (double) sum.get(); // the cast is needed: an int cannot be boxed to Double directly
}
}

Related

ConcurrentHashMap and ReentrantReadWriteLock

I have one thread that updates data in Map and several threads that read this data. Now my code looks like this:
public class Updater {
private ConcurrentMap<String, Integer> valuesMap = new ConcurrentHashMap<>();
private ReadWriteLock reentrantReadWriteLock = new ReentrantReadWriteLock();
public void update(Settings settings) {
reentrantReadWriteLock.writeLock().lock();
try {
for (Map.Entry<String, Integer> entry : valuesMap.entrySet()) {
valuesMap.put(entry.getKey(),
entry.getValue() + settings.getUpdateValue());
}
} finally {
reentrantReadWriteLock.writeLock().unlock();
}
}
public Integer getValue(String key) {
reentrantReadWriteLock.readLock().lock();
try {
return valuesMap.get(key);
} finally {
reentrantReadWriteLock.readLock().unlock();
}
}
}
But I think I overdid it. Can I use only ConcurrentHashMap in this situation?

Can I use only ConcurrentHashMap in this situation?
It depends on whether you want the update operation to be atomic; i.e. whether you want a reader to never see the state of the map when only some of the put operations have been performed.
If update doesn't need to be atomic, then locking is unnecessary. In fact, it is an unwanted concurrency bottleneck.
If update needs to be atomic, then the lock is necessary. But if you are controlling access with locks, then you don't need to use a ConcurrentHashMap. A simple HashMap would be better.
I don't think that ConcurrentHashMap has a way to implement multiple put operations as an atomic action.
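For the case where update does not need to be atomic across the whole map, a minimal sketch without any explicit lock could look like the following. The class name LockFreeUpdater, the set helper, and the plain int delta parameter (standing in for settings.getUpdateValue()) are assumptions made for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical lock-free updater: each entry is replaced atomically,
// but a reader may see some entries updated and others not yet.
class LockFreeUpdater {
    private final ConcurrentMap<String, Integer> valuesMap = new ConcurrentHashMap<>();

    // Helper added only so the example has data to work with.
    void set(String key, int value) {
        valuesMap.put(key, value);
    }

    void update(int delta) {
        // replaceAll applies the function entry by entry; it is not a
        // single atomic action over the whole map.
        valuesMap.replaceAll((key, value) -> value + delta);
    }

    Integer getValue(String key) {
        return valuesMap.get(key);
    }
}
```

If readers must never observe a half-applied update, this sketch is not suitable and the lock-based approach with a plain HashMap described above applies instead.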

How to implement a thread-safe class with a setter and a getter function where the frequency of using setter is way higher than getter

Let's say we want a thread-safe class called Adder, and there is an add function and a get function
class Adder {
int counter;
public void add(int a) { counter += a; }
public int get() { return counter; }
}
Right now it is obviously not thread safe. Suppose add() accounts for 80% of the calls and get() for 20%. We also allow get() to return the counter with some lag, so get() does not need to be thread safe. How should we implement this? Adding the synchronized keyword to add() doesn't seem like a good solution, since too many threads will try to obtain the lock and there will be heavy contention.
Also, add() should always succeed (meaning that you cannot use non-blocking locking).

If you are willing to allow imperfections in get, and the frequency of additions is much higher than of queries, use LongAdder.
This class is usually preferable to AtomicLong when multiple threads update a common sum that is used for purposes such as collecting statistics, not for fine-grained synchronization control. Under low update contention, the two classes have similar characteristics. But under high contention, expected throughput of this class is significantly higher, at the expense of higher space consumption.
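A minimal sketch of the Adder from the question built on LongAdder might look like this. The class name StripedAdder is hypothetical; add and sum are the standard LongAdder methods.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical Adder backed by LongAdder: add() never blocks and never fails,
// while get() may miss updates that are still in flight.
class StripedAdder {
    private final LongAdder counter = new LongAdder();

    void add(int a) {
        // Under contention, updates are spread over several internal cells.
        counter.add(a);
    }

    long get() {
        // sum() is not an atomic snapshot: concurrent adds may or may not
        // be included, which matches the "lag is allowed" requirement.
        return counter.sum();
    }
}
```

Once all writers have finished, get() returns the exact total; while they are still running, it returns a value that may lag behind.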

You can solve this problem with AtomicInteger in Java:
import java.util.concurrent.atomic.AtomicInteger;
public class Adder {
private final AtomicInteger counter = new AtomicInteger(0);
public int add(int delta) {
return counter.addAndGet(delta);
}
public int get() {
return counter.get();
}
}

Using the synchronized keyword should not cause any problems. The whole point of synchronized is to add thread safety.
If you want a more specific solution, I would try using either AtomicInteger or ReadWriteLock.
ReadWriteLock solution:
class ThreadSafeAdder {
private ReadWriteLock lock;
private int counter;
public ThreadSafeAdder() {
lock = new ReentrantReadWriteLock();
counter = 0;
}
public int get() {
// acquire the lock before entering try, so finally never unlocks a lock that was not acquired
lock.readLock().lock();
try {
return counter;
} finally {
lock.readLock().unlock();
}
}
public void set(int a) {
lock.writeLock().lock();
try {
counter = a;
} finally {
lock.writeLock().unlock();
}
}
}
AtomicInteger does the same thing as the above solution; it just hides all the thread-safety code (as seen in @kamaci's answer).

Is an assignment inside ConcurrentHashMap.computeIfAbsent threadsafe?

Consider the following implementation of some kind of fixed size cache, that allows lookup by an integer handle:
static class HandleCache {
private final AtomicInteger counter = new AtomicInteger();
private final Map<Data, Integer> handles = new ConcurrentHashMap<>();
private final Data[] array = new Data[100_000];
int getHandle(Data data) {
return handles.computeIfAbsent(data, k -> {
int i = counter.getAndIncrement();
if (i >= array.length) {
throw new IllegalStateException("array overflow");
}
array[i] = data;
return i;
});
}
Data getData(int handle) {
return array[handle];
}
}
There is an array store inside the compute function, which is not synchronized in any way. Would it be allowed by the java memory model for other threads to read a null value from this array later on?
PS: Would the outcome change if the id returned from getHandle was stored in a final field and only accessed through this field from other threads?

The read access isn't thread safe. You could make it thread safe indirectly; however, it's likely to be brittle. I would implement it in a much simpler way and only optimise it later should it prove to be a performance problem, e.g. because you see it in a profiler for a realistic test.
static class HandleCache {
private final Map<Data, Integer> handles = new HashMap<>();
private final List<Data> dataByIndex = new ArrayList<>();
synchronized int getHandle(Data data) {
Integer id = handles.get(data);
if (id == null) {
id = handles.size();
handles.put(data, id);
dataByIndex.add(data); // store the data itself; its list index equals the new id
}
return id;
}
synchronized Data getData(int handle) {
return dataByIndex.get(handle);
}
}

Assuming that you determine the index for the array read from the value of counter, then yes: you may read null.
The simplest example (there are others) is as follows:
T1 calls getHandle(data) and is suspended just after int i = counter.getAndIncrement();
T2 reads array[counter.get() - 1] and gets null, because T1 has reserved that slot but has not yet stored into it.
You should be able to easily verify this with a strategically placed sleep and two threads.

From the documentation of ConcurrentHashMap#computeIfAbsent:
The entire method invocation is performed atomically, so the function is applied at most once per key. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.
The documentation's reference to blocking refers only to update operations on the Map, so if any other thread attempts to access array directly (rather than through an update operation on the Map), there can be race conditions and null can be read.
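One way to give the array accesses defined visibility is to replace the plain array with an AtomicReferenceArray, whose get and set have volatile semantics. The following is a minimal sketch under that assumption. The class name SafeHandleCache, the generic parameter D (standing in for Data), and the capacity constructor are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical variant of HandleCache: array slots are written and read
// with volatile semantics instead of plain array accesses.
class SafeHandleCache<D> {
    private final AtomicInteger counter = new AtomicInteger();
    private final Map<D, Integer> handles = new ConcurrentHashMap<>();
    private final AtomicReferenceArray<D> array;

    SafeHandleCache(int capacity) {
        array = new AtomicReferenceArray<>(capacity);
    }

    int getHandle(D data) {
        return handles.computeIfAbsent(data, k -> {
            int i = counter.getAndIncrement();
            if (i >= array.length()) {
                throw new IllegalStateException("array overflow");
            }
            array.set(i, data); // volatile write
            return i;
        });
    }

    D getData(int handle) {
        // Volatile read. It can still return null for a slot that has been
        // reserved but not yet filled, e.g. if the index was guessed from counter.
        return array.get(handle);
    }
}
```

This does not remove the window described above in which a slot is reserved but still empty. It only ensures that once the store has happened, a later getData call on any thread sees it.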

AtomicLongMap read thread safety

I have the following class:
public class Storage {
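// a static final field must be initialized at its declaration (or in a static block), not in a constructor
protected static final AtomicLongMap<String> MAP = AtomicLongMap.create();
protected Storage() {
}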
public void decrement(String id) {
long num = MAP.get(id);
if (num != 0) {
MAP.decrementAndGet(id);
}
}
public void putIntoActiveAgents(String id, Integer num) {
MAP.put(id, num);
}
public void remove(String id) {
MAP.remove(id);
}
public Long get(String id) {
return MAP.get(id);
}
}
In my case I have, let's say, 6 threads that all do the same thing:
Each thread checks whether the long in the map equals 1; if not, it calls decrement, and if so, it calls remove.
Everywhere I read that AtomicLongMap is thread safe. I'm sure it is when somebody is incrementing or decrementing the number, but I'm not sure it is still thread safe when other threads are reading values from the map. My scenario:
1. Thread A reads the value from the map. It's 2, so it decrements the value.
2. Thread B reads the value before the counter has been decremented. It still returns 2, so B also decrements the value.
3. As a result, nobody sees the value set to 1.
My question is: in such a case, do I need to make MAP synchronized?

If you are using Java 8, I suggest you use ConcurrentHashMap. The Map interface in Java 8 has been updated with new functions such as computeIfPresent(), so your decrement(String id) function would look like this:
public class Storage {
protected static final Map<String, Long> MAP = new ConcurrentHashMap<>();
public void decrement(String id) {
// the lambda runs atomically for this key; its parameter must not reuse the name "id"
MAP.computeIfPresent(id, (key, currentValue) -> currentValue - 1);
}
public void putIntoActiveAgents(String id, Integer num) {
MAP.put(id, num.longValue());
}
public void remove(String id) {
MAP.remove(id);
}
public Long get(String id) {
return MAP.get(id);
}
}

If you have a look at the source, you'll see that com.google.common.util.concurrent.AtomicLongMap (I assume you're referring to that class) internally uses a ConcurrentHashMap<K, AtomicLong>, so reading depends on the properties of ConcurrentHashMap, whose JavaDoc states:
Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove). Retrievals reflect the results of the most recently completed update operations holding upon their onset.
However, since you're circumventing the map's atomic operations here (you read, check, and then decrement in separate steps), your code is not thread-safe. Thus you might have to synchronize your methods or use an atomic check-and-update mechanism (you could try something like AtomicLong#compareAndSet, but AtomicLongMap doesn't seem to provide access to that).
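The compareAndSet idea can be sketched with a plain ConcurrentHashMap<String, AtomicLong> instead of AtomicLongMap. This is only an illustration under that assumption; the class name CasCounters and the -1 return convention are hypothetical, and the sketch does not handle a concurrent remove or put that swaps out the AtomicLong for a key.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter store where "check, then decrement" is one atomic step.
class CasCounters {
    private final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<>();

    void put(String id, long value) {
        map.put(id, new AtomicLong(value));
    }

    /** Returns the new value after decrementing, or -1 if nothing was decremented. */
    long decrementIfPositive(String id) {
        AtomicLong counter = map.get(id);
        if (counter == null) {
            return -1;
        }
        while (true) {
            long current = counter.get();
            if (current <= 0) {
                return -1;
            }
            // Succeeds only if no other thread changed the value in between;
            // otherwise the loop re-reads and retries.
            if (counter.compareAndSet(current, current - 1)) {
                return current - 1;
            }
        }
    }
}
```

Because each successful call returns the value it produced, exactly one thread observes the transition to 1 (or to 0), which is the event the scenario in the question loses.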

Delegating thread-safety to ConcurrentMap and AtomicInteger

I need to provide thread-safe implementation of the following container:
public interface ParameterMetaData<ValueType> {
public String getName();
}
public interface Parameters {
public <M> M getValue(ParameterMetaData<M> pmd);
public <M> void put(ParameterMetaData<M> p, M value);
public int size();
}
The thing is, the size method should return the accurate number of parameters currently contained in a Parameters instance. So my first attempt was to try delegating thread safety as follows:
public final class ConcurrentParameters implements Parameters{
private final ConcurrentMap<ParameterMetaData<?>, Object> parameters =
new ConcurrentHashMap<>();
//Should represent the ACCURATE size of the internal map
private final AtomicInteger size = new AtomicInteger();
@Override
public <M> M getValue(ParameterMetaData<M> pmd) {
@SuppressWarnings("unchecked")
M value = (M) parameters.get(pmd);
return value;
}
@Override
public <M> void put(ParameterMetaData<M> p, M value) {
if (value == null)
return;
//The problem is in the code below
@SuppressWarnings("unchecked")
M previous = (M) parameters.putIfAbsent(p, value);
if (previous != null) {
//throw an exception indicating that the parameter already exists
//(the exception type used here is an assumption)
throw new IllegalStateException("parameter already exists: " + p.getName());
}
size.incrementAndGet();
}
@Override
public int size() {
return size.intValue();
}
}
The problem is that I can't just call parameters.size() on the ConcurrentHashMap instance to return the actual size, because that operation traverses the map without locking and there's no guarantee that it returns the actual size. That isn't acceptable in my case, so I decided to maintain a field containing the size.
QUESTION: Is it possible somehow to delegate thread safety and preserve the invariants?

The outcome you want to achieve is not atomic. You want to modify the map and then get a count of elements that is consistent within the scope of a single thread. The only way to achieve that is to make this flow an atomic operation by synchronizing access to the map. That is the only way to ensure that the count does not change due to modifications made in another thread.
Synchronize the modify-and-count access to the map via synchronized or a Semaphore, so that only a single thread can modify the map and count its elements at a time.
Using an additional field as a counter does not guarantee thread safety here: after the map modification and before the counter update, another thread can modify the map, and the counter value will no longer be valid.
This is the reason why the map does not keep its size internally but has to traverse its elements: to give the most accurate result at a given point in time.
EDIT:
To be 100% clear, this is the most convenient way to achieve this:
synchronized(yourMap){
doSomethingWithTheMap();
yourMap.size();
}
So if you change every map operation to such a block, you guarantee that size() returns an accurate count of elements. The only condition is that all data manipulation is done inside such a synchronized block.
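Applied to the container from the question, the synchronized approach might be sketched as follows. To keep the example self-contained it uses String keys instead of ParameterMetaData; the class name SynchronizedParameters and the choice of IllegalStateException are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical container in which every operation holds the same monitor,
// so size() is always consistent with completed put() calls.
class SynchronizedParameters {
    // A plain HashMap is enough because all access is synchronized.
    private final Map<String, Object> parameters = new HashMap<>();

    synchronized void put(String name, Object value) {
        if (value == null) {
            return;
        }
        if (parameters.putIfAbsent(name, value) != null) {
            throw new IllegalStateException("parameter already exists: " + name);
        }
    }

    synchronized Object getValue(String name) {
        return parameters.get(name);
    }

    synchronized int size() {
        // No separate counter field is needed: the map's own size is exact
        // while the monitor is held.
        return parameters.size();
    }
}
```

The cost is that readers and writers serialize on one lock, which is the trade-off for an exact size.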
