AtomicLongMap read thread safety - java

I have the following class:
public class Storage {
// A static final field must be initialized at its declaration, not in the constructor.
protected static final AtomicLongMap<String> MAP = AtomicLongMap.create();
protected Storage () {
}
public void decrement(String id) {
long num = MAP.get(id);
if (num != 0) {
MAP.decrementAndGet(id);
}
}
public void putIntoActiveAgents(String id, Integer num) {
MAP.put(id, num);
}
public void remove(String id) {
MAP.remove(id);
}
public Long get(String id) {
return MAP.get(id);
}
}
In my case I have, let's say, 6 threads that all do similar things:
Each thread checks whether the long in the map is equal to 1; if not, it calls decrement, and if so, it calls remove.
Everywhere I read that AtomicLongMap is thread safe. I'm sure it is when somebody is incrementing / decrementing the long value, but I'm not sure whether it is still thread safe when other threads are reading values from that map. My scenario:
1. Thread A reads value from the map - it's 2 (so it decrements the value)
2. Thread B reads the value before the counter has been decremented - it's still returning 2, so it also decrements value.
3. As a result, nobody sees the value set to 1.
My question is: in such a case, do I need to make MAP synchronized?

If you are using Java 8, then looking at your code I suggest you use a ConcurrentHashMap. The Map interface in Java 8 has been extended with new methods such as computeIfPresent(), so your decrement(String id) method would look like this:
public class Storage {
protected static final Map<String, Long> MAP = new ConcurrentHashMap<>();
public void decrement(String id) {
// The lambda parameter must not reuse the name of the method parameter id.
MAP.computeIfPresent(id, (key, currentValue) -> currentValue - 1);
}
public void putIntoActiveAgents(String id, Integer num) {
MAP.put(id, num.longValue());
}
public void remove(String id) {
MAP.remove(id);
}
public Long get(String id) {
return MAP.get(id);
}
}

If you have a look at the source you'll see that com.google.common.util.concurrent.AtomicLongMap (I assume you're referring to that class) internally uses a ConcurrentHashMap<K, AtomicLong>, so reading depends on the properties of ConcurrentHashMap, whose JavaDoc states:
Retrieval operations (including get) generally do not block, so may overlap with update operations (including put and remove). Retrievals reflect the results of the most recently completed update operations holding upon their onset.
However, since you're circumventing the map's atomic operations here (you're reading, checking and then decrementing), your code is not thread-safe. Thus you might have to synchronize your methods or use some other mechanism (you could try something like AtomicLong#compareAndSet, but AtomicLongMap doesn't seem to provide access to that).
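One way to close the read-check-decrement gap is to fold all three steps into a single atomic call on the map. The following is only a minimal sketch: it assumes a plain ConcurrentHashMap<String, Long> instead of Guava's AtomicLongMap, and the names AtomicStorage and decrementOrRemove are hypothetical, not part of the original code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical replacement for Storage, built on ConcurrentHashMap (not AtomicLongMap).
final class AtomicStorage {
    private final ConcurrentMap<String, Long> map = new ConcurrentHashMap<>();

    void put(String id, long num) {
        map.put(id, num);
    }

    // Decrements the counter for id, or removes the entry when the counter is 1 or less.
    // The read, the check and the update run atomically for that key.
    // Returns the new value, or null if the entry was removed or never existed.
    Long decrementOrRemove(String id) {
        // Returning null from the remapping function removes the mapping.
        return map.computeIfPresent(id, (key, current) -> current <= 1 ? null : current - 1);
    }

    Long get(String id) {
        return map.get(id);
    }
}
```

With this shape, two threads that both call decrementOrRemove on a counter of 2 cannot both observe 2: one of them decrements to 1 and the other removes the entry.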

Related

Necessity of the locks while working with concurrent hash map

Here is the code in one of my classes:
class SomeClass {
private Map<Integer, Integer> map = new ConcurrentHashMap<>();
private volatile int counter = 0;
final AtomicInteger sum = new AtomicInteger(0); // will be used in other classes/threads too
private ReentrantLock l = new ReentrantLock();
public void put(String some) {
l.lock();
try {
int tmp = Integer.parseInt(some);
map.put(counter++, tmp);
sum.getAndAdd(tmp);
} finally {
l.unlock();
}
}
public Double get() {
l.lock();
try {
//... perform some map resizing operation ...
// some calculations including sum field ...
} finally {
l.unlock();
}
}
}
You can assume that this class will be used in a concurrent environment.
The question is: do you think the locks are necessary? Does this code smell? :)
Let's look at the operations inside public void put(String some).
map.put(counter++, tmp);
sum.getAndAdd(tmp);
Now let's look at the individual parts.
counter is a volatile variable. So it only provides memory visibility but not atomicity. Since counter++ is a compound operation, you need a lock to achieve atomicity.
map.put(key, value) is atomic since map is a ConcurrentHashMap.
sum.getAndAdd(tmp) is atomic since sum is an AtomicInteger.
As you can see, except counter++ every other operation is atomic. However, you are trying to achieve some function by combining all these operations. To achieve atomicity at the functionality level, you need a lock. This will help you to avoid surprising side effects when the threads interleave between the individual atomic operations.
So you need a lock because counter++ is not atomic and you want to combine a few atomic operations to achieve some functionality (assuming you want this to be atomic).
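To illustrate the point about counter++: a volatile int only gives visibility, whereas AtomicInteger.getAndIncrement() performs the read-modify-write as one step. The sketch below assumes that unique keys are all that is required; it does not make the map update and a separate sum update atomic as a pair, which would still need a lock. The class name KeyedStore is hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: generating unique keys without a lock.
final class KeyedStore {
    private final Map<Integer, Integer> map = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    // getAndIncrement() is atomic, so no two callers ever receive the same key.
    void put(int value) {
        map.put(counter.getAndIncrement(), value);
    }

    int size() {
        return map.size();
    }
}
```

If several threads call put concurrently, every call gets a distinct key, so no entry overwrites another.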
Since you always increment counter when you use it as a key to put into this map:
map.put(counter++, tmp);
when you come to read it again:
return sum / map.get(counter);
map.get(counter) will be null, so this results in an NPE (unless you put more than 2^32 things into the map, of course). (I'm assuming you mean sum.get(), otherwise it won't compile.)
As such, you can have equivalent functionality without any locks:
class SomeClass {
public void put(String some) { /* do nothing */ }
public Double get() {
throw new NullPointerException();
}
}
You've not really fixed the problem with your edit. divisor will still be null, so the equivalent functionality without locks would be:
class SomeClass {
private final AtomicInteger sum = new AtomicInteger(0);
public void put(String some) {
sum.getAndAdd(Integer.parseInt(some));
}
public Double get() {
return (double) sum.get(); // an int cannot be boxed to Double without a cast
}
}

Is an assignment inside ConcurrentHashMap.computeIfAbsent threadsafe?

Consider the following implementation of some kind of fixed size cache, that allows lookup by an integer handle:
static class HandleCache {
private final AtomicInteger counter = new AtomicInteger();
private final Map<Data, Integer> handles = new ConcurrentHashMap<>();
private final Data[] array = new Data[100_000];
int getHandle(Data data) {
return handles.computeIfAbsent(data, k -> {
int i = counter.getAndIncrement();
if (i >= array.length) {
throw new IllegalStateException("array overflow");
}
array[i] = data;
return i;
});
}
Data getData(int handle) {
return array[handle];
}
}
There is an array store inside the compute function, which is not synchronized in any way. Would the Java memory model allow other threads to read a null value from this array later on?
PS: Would the outcome change if the id returned from getHandle was stored in a final field and only accessed through this field from other threads?
The read access isn't thread safe. You could make it thread safe indirectly, but that is likely to be brittle. I would implement it in a much simpler way and only optimise it later should it prove to be a performance problem, e.g. because you see it in a profiler for a realistic test.
static class HandleCache {
private final Map<Data, Integer> handles = new HashMap<>();
private final List<Data> dataByIndex = new ArrayList<>();
synchronized int getHandle(Data data) {
Integer id = handles.get(data);
if (id == null) {
id = handles.size();
handles.put(data, id);
dataByIndex.add(data); // store the data itself; its list index equals id
}
return id;
}
synchronized Data getData(int handle) {
return dataByIndex.get(handle);
}
}
Assuming that you determine the index for the array read from the value of counter, then yes, you may get a null read.
The simplest example (there are others) is as follows:
T1 calls getHandle(data) and is suspended just after int i = counter.getAndIncrement();
T2 reads array[counter.get() - 1] and sees null, because T1 has not yet stored data in that slot.
You should be able to easily verify this with a strategically placed sleep and two threads.
From the documentation of ConcurrentHashMap#computeIfAbsent:
The entire method invocation is performed atomically, so the function is applied at most once per key. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this map.
The documentation's reference to blocking refers only to update operations on the Map, so if any other thread attempts to access array directly (rather than through an update operation on the Map), there can be race conditions and null can be read.
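If the array has to be readable without going through the map, one option is to give each slot volatile read/write semantics. The sketch below swaps the plain Data[] for an AtomicReferenceArray; this is a substitution, not the original code, and it uses String in place of the unspecified Data type. The class name VolatileHandleCache is hypothetical. A reader that guesses an index before the slot has been written can still see null, so this only addresses visibility of completed writes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Hypothetical variant of HandleCache; String stands in for the unspecified Data type.
final class VolatileHandleCache {
    private final AtomicInteger counter = new AtomicInteger();
    private final Map<String, Integer> handles = new ConcurrentHashMap<>();
    // Each element is accessed with volatile read/write semantics.
    private final AtomicReferenceArray<String> slots;

    VolatileHandleCache(int capacity) {
        slots = new AtomicReferenceArray<>(capacity);
    }

    int getHandle(String data) {
        return handles.computeIfAbsent(data, k -> {
            int i = counter.getAndIncrement();
            if (i >= slots.length()) {
                throw new IllegalStateException("array overflow");
            }
            slots.set(i, data); // volatile write of the slot
            return i;
        });
    }

    String getData(int handle) {
        return slots.get(handle); // volatile read of the slot
    }
}
```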

Delegating thread-safety to ConcurrentMap and AtomicInteger

I need to provide thread-safe implementation of the following container:
public interface ParameterMetaData<ValueType> {
public String getName();
}
public interface Parameters {
public <M> M getValue(ParameterMetaData<M> pmd);
public <M> void put(ParameterMetaData<M> p, M value);
public int size();
}
The thing is that the size method should return the accurate number of parameters currently contained in a Parameters instance. So my first attempt was to try delegating thread safety as follows:
public final class ConcurrentParameters implements Parameters{
private final ConcurrentMap<ParameterMetaData<?>, Object> parameters =
new ConcurrentHashMap<>();
//Should represent the ACCURATE size of the internal map
private final AtomicInteger size = new AtomicInteger();
@Override
public <M> M getValue(ParameterMetaData<M> pmd) {
@SuppressWarnings("unchecked")
M value = (M) parameters.get(pmd);
return value;
}
@Override
public <M> void put(ParameterMetaData<M> p, M value){
if(value == null)
return;
//The problem is in the code below
@SuppressWarnings("unchecked")
M previous = (M) parameters.putIfAbsent(p, value);
if(previous != null)
//the parameter already exists (exception type chosen for illustration)
throw new IllegalArgumentException("Parameter already exists: " + p.getName());
size.incrementAndGet();
}
@Override
public int size() {
return size.intValue();
}
}
The problem is that I can't just call parameters.size() on the ConcurrentHashMap instance to return the actual size, as that operation performs a traversal without locking and there's no guarantee that it will retrieve the actual size. That isn't acceptable in my case, so I decided to maintain a field containing the size.
QUESTION: Is it possible somehow to delegate thread safety and preserve the invariants?
The outcome you want is not atomic as written. You want to modify the map and then read an element count that is consistent with that modification. The only way to achieve that is to make the whole sequence an atomic operation by synchronizing access to the map, so that the count cannot change because of a modification made in another thread.
Synchronize the modify-then-count access to the map with synchronized or a Semaphore, so that only a single thread at a time can modify the map and count its elements.
An additional counter field does not guarantee thread safety here: after the map is modified and before the counter is updated, another thread can modify the map, and the counter value will be stale.
This is the reason why the map does not keep its size internally but has to traverse its elements: to give the most accurate result at a given point in time.
EDIT:
To be 100% clear, this is the most convenient way to achieve this:
synchronized(yourMap){
doSomethingWithTheMap();
yourMap.size();
}
So if you wrap every map operation in such a block, size() is guaranteed to return an accurate count of elements. The only condition is that all data manipulation is done inside such synchronized blocks.
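Applied to a container that must report an exact size, that advice might look like the following. It is a minimal sketch that assumes String keys rather than the ParameterMetaData interface from the question; the class name SynchronizedParameters and the exception type are hypothetical choices.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: every access goes through the same monitor, so size() is exact.
final class SynchronizedParameters {
    private final Map<String, Object> parameters = new HashMap<>();

    synchronized Object getValue(String name) {
        return parameters.get(name);
    }

    // Ignores null values and rejects duplicate names, mirroring the intent of the question's put.
    synchronized void put(String name, Object value) {
        if (value == null) {
            return;
        }
        if (parameters.putIfAbsent(name, value) != null) {
            throw new IllegalStateException("Parameter already exists: " + name);
        }
    }

    synchronized int size() {
        return parameters.size();
    }
}
```

Because the map is only ever touched while holding the instance monitor, a plain HashMap is sufficient and no separate counter field is needed.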

Letting multiple Threads operate on a data set while one Thread sums it up

I am trying to implement a banking system where I have a set of accounts. There are multiple threads trying to transfer money between accounts, while one thread continuously (or rather, at random times) tries to sum up the total money in the bank (the sum of all account balances).
The way to solve that sounded obvious at first: using a ReentrantReadWriteLock, with the readLock for the threads doing the transactions and the writeLock for the thread doing the summation. However, after implementing it that way (see code below), I saw a huge drop in performance / "transaction throughput", even compared with doing transactions with only one thread.
Code for above mentioned implementation:
public class Account implements Comparable<Account>{
private int id;
private int balance;
public Account(int id){
this.id = id;
this.balance = 0;
}
public synchronized int getBalance(){ return balance; }
public synchronized void setBalance(int balance){
if(balance < 0){ throw new IllegalArgumentException("Negative balance"); }
this.balance = balance;
}
public int getId(){ return this.id; }
// To sort a collection of Accounts.
public int compareTo(Account other){
return (id < other.getId() ? -1 : (id == other.getId() ? 0 : 1));
}
}
public class BankingSystem {
protected List<Account> accounts;
protected ReadWriteLock lock = new ReentrantReadWriteLock(); // !!
public boolean transfer(Account from, Account to, int amount){
if(from.getId() != to.getId()){
synchronized(from){
if(from.getBalance() < amount) return false;
lock.readLock().lock(); // !!
from.setBalance(from.getBalance() - amount);
}
synchronized(to){
to.setBalance(to.getBalance() + amount);
lock.readLock().unlock(); // !!
}
}
return true;
}
// Rest of class..
}
Note that this is without even using the summation method yet, so no writeLock is ever acquired. If I only delete the lines marked with // !! and also do not call the summation method, the "transfer throughput" using multiple threads is suddenly a lot higher than when using a single thread, as is the goal.
My question now is: why does the simple introduction of a ReadWriteLock slow down the entire thing that much if I never try to acquire a writeLock, and what did I do wrong here? I can't manage to find the problem.
Sidenote:
I had already asked a question regarding this problem here, but managed to ask the wrong question. I did, however, get an amazing answer to the question I asked. To avoid reducing that question's quality, and to keep that great answer alive for people who need help with that matter, I decided not to edit that question (yet again). Instead I opened this question, in the strong belief that it is NOT a duplicate but an entirely different matter.
You would generally use either a lock or synchronized; it is unusual to use both at once.
To manage your scenario you would normally use a fine-grained lock on each account rather than a coarse one as you have. You would also implement the totalling mechanism using a listener.
public interface Listener {
public void changed(int oldValue, int newValue);
}
public class Account {
private int id;
private int balance;
protected ReadWriteLock lock = new ReentrantReadWriteLock();
List<Listener> accountListeners = new ArrayList<>();
public Account(int id) {
this.id = id;
this.balance = 0;
}
public int getBalance() {
int localBalance;
lock.readLock().lock();
try {
localBalance = this.balance;
} finally {
lock.readLock().unlock();
}
return localBalance;
}
public void setBalance(int balance) {
if (balance < 0) {
throw new IllegalArgumentException("Negative balance");
}
// Keep track of the old balance for the listener; read it under the lock.
int oldValue;
lock.writeLock().lock();
try {
oldValue = this.balance;
this.balance = balance;
} finally {
lock.writeLock().unlock();
}
if (balance != oldValue) {
// Inform all listeners of any change.
accountListeners.forEach(l -> l.changed(oldValue, balance));
}
}
public boolean lock() throws InterruptedException {
return lock.writeLock().tryLock(1, TimeUnit.SECONDS);
}
public void unlock() {
lock.writeLock().unlock();
}
public void addListener(Listener l) {
accountListeners.add(l);
}
public int getId() {
return this.id;
}
}
public class BankingSystem {
protected List<Account> accounts;
public boolean transfer(Account from, Account to, int amount) throws InterruptedException {
if (from.getId() != to.getId()) {
if (from.lock()) {
try {
if (from.getBalance() < amount) {
return false;
}
if (to.lock()) {
try {
// We have write locks on both accounts.
from.setBalance(from.getBalance() - amount);
to.setBalance(to.getBalance() + amount);
} finally {
to.unlock();
}
} else {
// Not sure what to do - failed to lock the account.
}
} finally {
from.unlock();
}
} else {
// Not sure what to do - failed to lock the account.
}
}
return true;
}
// Rest of class..
}
Note that you can take a write lock in the same thread twice - the second one is also allowed. Locks only exclude access from other threads.
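The reentrancy mentioned in that note can be checked directly. The following small sketch uses ReentrantReadWriteLock.getWriteHoldCount(); the class name ReentrancyDemo and the method nestedHoldCount are hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class ReentrancyDemo {
    // Acquires the write lock twice on the calling thread and reports the hold count
    // observed while both acquisitions are active.
    static int nestedHoldCount(ReentrantReadWriteLock lock) {
        lock.writeLock().lock();
        try {
            lock.writeLock().lock(); // second acquisition by the same thread does not block
            try {
                return lock.getWriteHoldCount();
            } finally {
                lock.writeLock().unlock();
            }
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Every lock() needs a matching unlock(); the lock is only released to other threads once the hold count drops back to zero.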
First of all, it’s correct to put the update into its own synchronized block, even though the getter and setter are each synchronized on their own; that way you avoid the check-then-act anti-pattern.
However, from a performance standpoint it’s not optimal as you acquire the same lock three times (four times for the from account). The JVM, or the HotSpot optimizer, knows the synchronization primitives and is able to optimize such patterns of nested synchronization, but (now we have to guess a bit) if you acquire another lock in-between, it might prevent these optimizations.
As already suggested at the other question, you may turn to a lock-free update, but of course you have to fully understand it. Lock-free updates are centered around one special operation, compareAndSet, which performs an update only if the variable still has the expected old value, in other words, only if no concurrent update has been performed in between. Checking and updating are performed as one atomic operation, and that operation is not implemented using synchronized but by utilizing a dedicated CPU instruction directly.
The usage pattern is always:
1. read the current value
2. calculate the new value (or reject the update)
3. attempt to perform an update, which will succeed if the current value is still the same
The drawback is that the update might fail, which requires repeating the three steps, but that is acceptable if the computation is not too heavy. Since a failed update indicates that another thread must have succeeded with its update in between, there will always be progress.
This led to the example code for an account:
static void safeWithdraw(AtomicInteger account, int amount) {
for(;;) { // a loop as we might have to repeat the steps
int current=account.get(); // 1. read the current value
if(amount>current) throw new IllegalStateException();// 2. possibly reject
int newValue=current-amount; // 2. calculate new value
// 3. update if current value didn’t change
if(account.compareAndSet(current, newValue))
return; // exit on success
}
}
So, for supporting lock-free access it’s never sufficient to provide getBalance and setBalance operations as every attempt to compose an operation out of get and set operations without locking will fail.
You have three options:
1. Provide every supported update operation as a dedicated method, like the safeWithdraw method.
2. Provide a compareAndSet method to allow callers to compose their own update operations using that method.
3. Provide an update method that takes an update function as a parameter, like AtomicInteger does in Java 8. Of course, this comes in especially handy when using Java 8, where you can use lambda expressions to implement the actual update function.
Note that AtomicInteger itself uses all options. There are dedicated update methods for common operations like increment and there’s the compareAndSet method allowing composing arbitrary update operations.
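The update-function option can be sketched with AtomicInteger.updateAndGet, which exists since Java 8. The class name AccountOps and the helper withdraw are hypothetical. The update function should be free of side effects, because it may be re-applied when a concurrent update wins the race.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class AccountOps {
    // Withdraws amount using an update function; updateAndGet re-applies the function
    // internally if another thread changed the value in between.
    // Returns the balance after the withdrawal.
    static int withdraw(AtomicInteger account, int amount) {
        return account.updateAndGet(current -> {
            if (amount > current) {
                // Rejecting the update: the exception propagates and the value stays unchanged.
                throw new IllegalStateException("insufficient funds");
            }
            return current - amount;
        });
    }
}
```

This expresses the same read, compute, compare-and-set loop as safeWithdraw, with the retry loop supplied by the library.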
Locking is expensive, but in your case I assume there might be some kind of "almost deadlock" when you run the tests: if some thread is in the synchronized(from){} block of your code and another thread wants to unlock the from instance in its synchronized(to){} block, then it won't be able to. The first synchronized will prevent thread #2 from entering the synchronized(to){} block, and hence the lock won't be released very quickly.
That could lead to a lot of threads hanging in the queue of the lock which makes it slow to get/release locks.
Some more notes: Your code will cause problems when the second part (to.setBalance(to.getBalance() + amount);) isn't executed for some reason (exception, deadlock). You need to find out a way to create a transaction around the two operations to make sure they are either executed both or none are executed.
A good way to do this is to create a Balance value object. In your code, you can create two new ones, update both balances and then just call the two setters - since setters can't fail, either both balances will be updated or the code will fail before any setters could be called.

Synchronizing a Map of Sets/Lists

I would like to implement a variation on the "Map of Sets" collection that will be constantly accessed by multiple threads. I am wondering whether the synchronization I am doing is sufficient to guarantee that no issues will manifest.
So given the following code, where Map, HashMap, and Set are the Java implementations, and Key and Value are some arbitrary Objects:
public class MapOfSets {
private Map<Key, Set<Value>> map;
public MapOfSets() {
map = Collections.synchronizedMap(new HashMap<Key, Set<Value>>());
}
//adds value to the set mapped to key
public void add(Key key, Value value) {
Set<Value> old = map.get(key);
//if no previous set exists on this key, create it and add value to it
if(old == null) {
old = new HashSet<Value>();
old.add(value);
map.put(key, old);
}
//otherwise simply insert the value to the existing set
else {
old.add(value);
}
}
//similar to add
public void remove(Key key, Value value) {...}
//perform some operation on all elements in the set mapped to key
public void foo(Key key) {
Set<Value> set = map.get(key);
for(Value v : set)
v.bar();
}
}
The idea here is that because I've synchronized the Map itself, the get() and put() method should be atomic right? So there should be no need to do additional synchronization on the Map or the Sets contained in it. So will this work?
Alternatively, would the above code be advantageous over another possible synchronization solution:
public class MapOfSets {
private Map<Key, Set<Value>> map;
public MapOfSets() {
map = new HashMap<Key, Set<Value>>();
}
public synchronized void add(Key key, Value value) {
Set<Value> old = map.get(key);
//if no previous set exists on this key, create it and add value to it
if(old == null) {
old = new HashSet<Value>();
old.add(value);
map.put(key, old);
}
//otherwise simply insert the value to the existing set
else {
old.add(value);
}
}
//similar to add
public synchronized void remove(Key key, Value value) {...}
//perform some operation on all elements in the set mapped to key
public synchronized void foo(Key key) {
Set<Value> set = map.get(key);
for(Value v : set)
v.bar();
}
}
Where I leave the data structures unsynchronized but synchronize all the possible public methods instead. So which ones will work, and which one is better?
The first implementation you posted is not thread safe. Consider what happens when the add method is accessed by two concurrent threads with the same key:
thread A executes line 1 of the method, and gets a null reference because no item with the given key is present
thread B executes line 1 of the method, and gets a null reference because no item with the given key is present — this will happen after A returns from the first call, as the map is synchronized
thread A evaluates the if condition (old == null) to true
thread B evaluates the if condition (old == null) to true
From that point on, both threads execute the true branch of the if statement, each creates its own set and puts it into the map, and you lose one of the two value objects.
The second variant of the method you posted looks safer.
However, if you can use third-party libraries, I would suggest checking out Google Guava, which offers concurrent multimaps.
The second one is correct, but the first one isn't.
Think about it a minute, and suppose two threads are calling add() in parallel. Here's what could occur:
Thread 1 calls add("foo", "bar");
Thread 2 calls add("foo", "baz");
Thread 1 gets the set for "foo" : null
Thread 2 gets the set for "foo" : null
Thread 1 creates a new set and adds "bar" in it
Thread 2 creates a new set and adds "baz" in it
Thread 1 puts its set in the map
Thread 2 puts its set in the map
At the end of the story, the map contains one value for "foo" instead of two.
Synchronizing the map makes sure that its internal state is coherent and that each method you call on the map is thread-safe, but it doesn't make the get-then-put operation atomic.
Consider using one of Guava's SetMultimap implementations, which does everything for you. Wrap it in a call to Multimaps.synchronizedSetMultimap(SetMultimap) to make it thread-safe.
Your second implementation will work, but it holds locks for longer than it needs to (an inevitable problem with using synchronized methods rather than synchronized blocks), which will reduce concurrency. If you find that the limit on concurrency here is a bottleneck, you could shrink the locked regions a bit.
Alternatively, you could use some of the lock-free collections provided by java.util.concurrent. Here's my attempt at that; it isn't tested, and it requires Key to be comparable (and Value too, since the values go into a ConcurrentSkipListSet), but it should never perform any locking:
public class MapOfSets {
private final ConcurrentMap<Key, Set<Value>> map;
public MapOfSets() {
map = new ConcurrentSkipListMap<Key, Set<Value>>();
}
private static ThreadLocal<Set<Value>> freshSets = new ThreadLocal<Set<Value>>() {
@Override
protected Set<Value> initialValue() {
return new ConcurrentSkipListSet<Value>();
}
};
public void add(Key key, Value value) {
Set<Value> freshSet = freshSets.get();
Set<Value> set = map.putIfAbsent(key, freshSet);
if (set == null) {
set = freshSet;
freshSets.remove();
}
set.add(value);
}
public void remove(Key key, Value value) {
Set<Value> set = map.get(key);
if (set != null) {
set.remove(value);
}
}
//perform some operation on all elements in the set mapped to key
public void foo(Key key) {
Set<Value> set = map.get(key);
if (set != null) {
for (Value v: set) {
v.bar();
}
}
}
}
For your Map implementation you could just use a ConcurrentHashMap. You wouldn't have to worry about ensuring thread safety for individual accesses, whether insertion or retrieval, as the implementation takes care of that for you.
And if you really want to use a Set, you could create a concurrent one backed by a ConcurrentHashMap by calling
Collections.newSetFromMap(new ConcurrentHashMap<Object,Boolean>())
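Combining a ConcurrentHashMap with concurrent sets still leaves the "create the set if it is missing" step, which has to be atomic as well. A minimal sketch, assuming Java 8 or later, uses computeIfAbsent together with ConcurrentHashMap.newKeySet(); the generic class ConcurrentMapOfSets is hypothetical and stands in for MapOfSets with its Key and Value types.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical generic variant of MapOfSets.
final class ConcurrentMapOfSets<K, V> {
    private final ConcurrentMap<K, Set<V>> map = new ConcurrentHashMap<>();

    // computeIfAbsent creates the set at most once per key, so concurrent adds
    // for a new key all end up in the same set.
    void add(K key, V value) {
        map.computeIfAbsent(key, k -> ConcurrentHashMap.newKeySet()).add(value);
    }

    void remove(K key, V value) {
        Set<V> set = map.get(key);
        if (set != null) {
            set.remove(value);
        }
    }

    // Returns an unmodifiable view of the current values for key (empty if none).
    Set<V> get(K key) {
        Set<V> set = map.get(key);
        return set == null ? Collections.<V>emptySet() : Collections.unmodifiableSet(set);
    }
}
```

This sketch never removes a set once it becomes empty; doing so safely would need an extra atomic step such as computeIfPresent.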
