Improve the performance of non-duplicate concurrent ArrayList - java

I came across the performance issue when implementing a data structure of non-duplicate concurrent ArrayList(or ConcurrentLinkedQueue).
public class NonDuplicateList implements Outputable {
private Map<Term, Integer> map;
private List<Term> terms;
public NonDuplicateList() {
this.map = new HashMap<>();
this.terms = new ArrayList<>();
}
public synchronized int addTerm(Term term) { //bad performance :(
Integer index = map.get(term);
if (index == null) {
index = terms.size();
terms.add(term);
map.put(term, index);
}
return index;
}
#Override
public void output(DataOutputStream out) throws IOException {
out.writeInt(terms.size());
for (Term term : terms) {
term.output(out);
}
}
}
Note that Term and NonDuplicateList both implement Outputable interface to output.
In order to keep NonDuplicateList thread-safe, I use synchronized to guard the method addTerm(Term) and the performance is as bad as expected, when currently invoking addTerm.
It seems that ConcurrentHashMap isn't suitable for this case, since it doesn't keep strong data consistency. Any idea how to improve the performance of addTerm without losing its thread-safety?
EDIT:
output method, i.e. iteration through NonDuplicateList, might not be thread-safe since only one thread will access this method after concurrently invoking addTerm, but addTerm must return the index value immediately as soon as a term is added into the NonDuplicateList.

There is a possibility to reuse ConcurrentHashMap in your implementation if you can sacrifice addTerm return type. Instead of returning actual index you can return boolean which indicates whether addition was successful or produced duplicate. This will also allow you to remove method synchronization and improve performance:
private ConcurrentMap<Term, Boolean> map;
private List<Term> terms;
public boolean addTerm(Term term) {
Boolean previousValue = map.putIfAbsent(term, Boolean.TRUE);
if (previousValue == null) {
terms.add(term);
return true;
}
return false;
}

I am afraid you will not get much faster solution here. The point is to avoid synchronization when you don't need it. If you don't mind weak consistency, using ConcurrentHashMap iterator can be significantly cheaper than either preventing other threads from adding items while you're iterating or taking a consistent snapshot when the iterator is created.
On the other hand, when you need synchronization and a consistent iterator, you'll need an alternative for ConcurrentHashMap. One that comes to my mind is java.util.Collections#synchronizedMap, but it's using synchronization at Object level, so every read/write operation needs to acquire lock, which is a performance overhead.
Take a look at ConcurrentSkipListMap, which guarantees average O(log(n)) performance on a wide variety of operations. It also has a number of operations that ConcurrentHashMap doesn't: ceilingEntry/Key, floorEntry/Key, etc. It also maintains a sort order, which would otherwise have to be calculated (at notable expense) if you were using a ConcurrentHashMap. Maybe it would be possible to get rid of list+map and use ConcurrentSkipListMap instead. Index of element might be computed using ConcurrentSkipListMap api.

Related

Is repeatedly trying to get locks a good solution to prevent deadlocks?

my question is about synchronisation and preventing deadlocks when using threads. In this example an object simply holds an integer variable and multiple threads call swapValue on those objects.
public class Data {
private long value;
public Data(long value) {
this.value = value;
}
public synchronized long getValue() {
return value;
}
public synchronized void setValue(long value) {
this.value = value;
}
public void swapValue(Data other) {
long temp = getValue();
long newValue = other.getValue();
setValue(newValue);
other.setValue(temp);
}
}
The swapValue method should be thread safe and should not skip swapping the values if the resources are not available. Simply using the synchronized keyword on the method signature will result in a deadlock. I came up with this (apparently) working solution, which is only based on the probability that one thread unlocks its resource and the other tries to claim it while the resource is still unlocked.
private Lock lock = new ReentrantLock();
...
public void swapValue(Data other) {
lock.lock();
while(!other.lock.tryLock())
{
lock.unlock();
lock.lock();
}
long temp = getValue();
long newValue = other.getValue();
setValue(newValue);
other.setValue(temp);
other.lock.unlock();
lock.unlock();
}
To me this looks like a hack. Is this a common solution for these kind of problems? Are there solutions that are "more deterministic" in their behaviour and also applicable in practice?
There are two issues at play here:
First, mixing Data.lock with the built-in lock used by the synchronized keyword
Second, inconsistent locking order among four (!) locks - this.lock, other.lock, the built-in lock of this, and the built-in lock of other
Even without synchronized, a.swapValue(b) and b.swapValue(a) can deadlock unless you use your approach to try to spin while locking and unlocking, which is inefficient.
One approach that you could take is add a field with some kind of final unique ID to each Data object - when swapping data of two objects, lock the one with a lower ID before the one with the higher ID, regardless of which is this and which is other. Note that System.identityHashCode is unfortunately not unique so it can't be easily used here.
The unlock ordering isn't critical here, but unlocking in the reverse order of locking is generally a good practice to follow where possible.
#Nanofarad has the right idea: Give every Data instance a unique, permanent numeric ID, and then use those IDs to decide which object to lock first. Here's what that might look like in practice:
private static void lockBoth(Data a, Data b) {
Lock first = a.lock;
Lock second = b.lock;
if (a.getID() < b.getID()) {
first = b.lock;
second = a.lock;
}
first.lock();
second.lock();
}
private static void unlockBoth(Data a, Data b) {
a.lock.unlock();
b.lock.unlock();
// Note: #Queeg suggests in comments below that in the general case,
// it would be good practice to make this routine always unlock the
// two locks in the order opposite to which `lockBoth()` locked them.
// See https://stackoverflow.com/a/8949355/801894 for an explanation.
}
public void swapValue(Data other) {
lockBoth(this, other);
...swap 'em...
unlockBoth(this, other);
}
In your case, just use AtomicInteger or AtomicLong instead inventing the wheel again. About the synchronization and deadlocks part of your question in general - DO NOT RELY ON PROBABILITY -- it is way too tricky and too easy to get it wrong, unless you're experienced mathematician knowing exactly what youre doing - but even then it is risky. One example when probability is used is UUID, but if computers will get fast enough then the code that shouldn't reasonably break till the end of universe can break in matter of milliseconds or faster, it is better to write code that do not rely on probability, especially concurrent code.

Missing updates with locks and ConcurrentHashMap

I have a scenario where I have to maintain a Map which can be populated by multiple threads, each modifying their respective List (unique identifier/key being the thread name), and when the list size for a thread exceeds a fixed batch size, we have to persist the records to the database.
Aggregator class
private volatile ConcurrentHashMap<String, List<T>> instrumentMap = new ConcurrentHashMap<String, List<T>>();
private ReentrantLock lock ;
public void addAll(List<T> entityList, String threadName) {
try {
lock.lock();
List<T> instrumentList = instrumentMap.get(threadName);
if(instrumentList == null) {
instrumentList = new ArrayList<T>(batchSize);
instrumentMap.put(threadName, instrumentList);
}
if(instrumentList.size() >= batchSize -1){
instrumentList.addAll(entityList);
recordSaver.persist(instrumentList);
instrumentList.clear();
} else {
instrumentList.addAll(entityList);
}
} finally {
lock.unlock();
}
}
There is one more separate thread running after every 2 minutes (using the same lock) to persist all the records in Map (to make sure we have something persisted after every 2 minutes and the map size does not gets too big)
if(//Some condition) {
Thread.sleep(//2 minutes);
aggregator.getLock().lock();
List<T> instrumentList = instrumentMap.values().stream().flatMap(x->x.stream()).collect(Collectors.toList());
if(instrumentList.size() > 0) {
saver.persist(instrumentList);
instrumentMap .values().parallelStream().forEach(x -> x.clear());
aggregator.getLock().unlock();
}
}
This solution is working fine in almost for every scenario that we tested, except sometimes we see some of the records went missing, i.e. they are not persisted at all, although they were added fine to the Map.
My questions are:
What is the problem with this code?
Is ConcurrentHashMap not the best solution here?
Does the List that is used with the ConcurrentHashMap have an issue?
Should I use the compute method of ConcurrentHashMap here (no need I think, as ReentrantLock is already doing the same job)?
The answer provided by #Slaw in the comments did the trick. We were letting the instrumentList instance escape in non-synchronized way i.e. access/operations are happening over list without any synchonization. Fixing the same by passing the copy to further methods did the trick.
Following line of code is the one where this issue was happening
recordSaver.persist(instrumentList);
instrumentList.clear();
Here we are allowing the instrumentList instance to escape in non-synchronized way i.e. it is passed to another class (recordSaver.persist) where it was to be actioned on but we are also clearing the list in very next line(in Aggregator class) and all of this is happening in non-synchronized way. List state can't be predicted in record saver... a really stupid mistake.
We fixed the issue by passing a cloned copy of instrumentList to recordSaver.persist(...) method. In this way instrumentList.clear() has no affect on list available in recordSaver for further operations.
I see, that you are using ConcurrentHashMap's parallelStream within a lock. I am not knowledgeable about Java 8+ stream support, but quick searching shows, that
ConcurrentHashMap is a complex data structure, that used to have concurrency bugs in past
Parallel streams must abide to complex and poorly documented usage restrictions
You are modifying your data within a parallel stream
Based on that information (and my gut-driven concurrency bugs detector™), I wager a guess, that removing the call to parallelStream might improve robustness of your code. In addition, as mentioned by #Slaw, you should use ordinary HashMap in place of ConcurrentHashMap if all instrumentMap usage is already guarded by lock.
Of course, since you don't post the code of recordSaver, it is possible, that it too has bugs (and not necessarily concurrency-related ones). In particular, you should make sure, that the code that reads records from persistent storage — the one, that you are using to detect loss of records — is safe, correct, and properly synchronized with rest of your system (preferably by using a robust, industry-standard SQL database).
It looks like this was an attempt at optimization where it was not needed. In that case, less is more and simpler is better. In the code below, only two concepts for concurrency are used: synchronized to ensure a shared list is properly updated and final to ensure all threads see the same value.
import java.util.ArrayList;
import java.util.List;
public class Aggregator<T> implements Runnable {
private final List<T> instruments = new ArrayList<>();
private final RecordSaver recordSaver;
private final int batchSize;
public Aggregator(RecordSaver recordSaver, int batchSize) {
super();
this.recordSaver = recordSaver;
this.batchSize = batchSize;
}
public synchronized void addAll(List<T> moreInstruments) {
instruments.addAll(moreInstruments);
if (instruments.size() >= batchSize) {
storeInstruments();
}
}
public synchronized void storeInstruments() {
if (instruments.size() > 0) {
// in case recordSaver works async
// recordSaver.persist(new ArrayList<T>(instruments));
// else just:
recordSaver.persist(instruments);
instruments.clear();
}
}
#Override
public void run() {
while (true) {
try { Thread.sleep(1L); } catch (Exception ignored) {
break;
}
storeInstruments();
}
}
class RecordSaver {
void persist(List<?> l) {}
}
}

Delegating thread-safety to ConcurrentMap and AtomicInteger

I need to provide thread-safe implementation of the following container:
public interface ParameterMetaData<ValueType> {
public String getName();
}
public interface Parameters {
public <M> M getValue(ParameterMetaData<M> pmd);
public <M> void put(ParameterMetaData<M> p, M value);
public int size();
}
The thing is the size method should return the accurate number of paramters currently contained in a Parameters instance. So, my first attempt was to try delegating thread-safety as follows:
public final class ConcurrentParameters implements Parameters{
private final ConcurrentMap<ParameterMetaData<?>, Object> parameters =
new ConcurrentHashMap<>();
//Should represent the ACCURATE size of the internal map
private final AtomicInteger size = new AtomicInteger();
#Override
public <M> M getValue(ParameterMetaData<M> pmd) {
#SuppressWarnings("unchecked")
M value = (M) parameters.get(pmd);
return value;
}
#Override
public <M> void put(ParameterMetaData<M> p, M value){
if(value == null)
return;
//The problem is in the code below
M previous = (M) parameters.putIfAbsent(p, value);
if(previous != null)
//throw an exception indicating that the parameter already exists
size.incrementAndGet();
}
#Override
public int size() {
return size.intValue();
}
The problem is that I can't just call parameters.size() on the ConcurrentHashMap instance to return the actual size, as that the operation performs traversal without locking and there's no guaratee that it will retrieve the actual size. It isn't acceptable in my case. So, I decided to maintain the field containing the size.
QUESTION: Is it possible somehow to delegate thread safety and preserve the invariatns?
The outcome you want to achieve is non-atomic. You want to modify map and then get count of elements that would be consistent in a scope of single thread. The only way to achieve that is to make this flow "atomic operation" by synchronizing access to the map. This is the only way to assure that count will not change due to modifications made in another thread.
Synchronize modify-count access to the map via synchronized or Semaphore to allow only single thread to modify map and count elements at the time.
Using additional field as a counter does not guarantee thread safety here, as after map modification and before counter manipulation, other thread can in fact modify map, and the counter value will not be valid.
This is the reason why map does not keeps its size internally but has to traversal over elements - to give most accurate results at given point in time.
EDIT:
To be 100% clear, this is the most convinient way to achieve this:
synchronized(yourMap){
doSomethingWithTheMap();
yourMap.size();
}
so if you will change every map operation to such block, you will guarantee that size() will return accurate count of elements. The only condition is that all data manipulations are done using such synchronized block.

Write an ArrayList with integers which will be accessed concurrently

The requirement is that, I need to write an ArrayList of integers. I need thread-safe access of the different integers (write, read, increase, decrease), and also need to allow maximum concurrency.
The operation with each integer is also special, like this:
Mostly frequent operation is to read
Secondly frequent operation is to decrease by one only if the value is greater than zero. Or, to increase by one (unconditionally)
Adding/removing elements is rare, but still needed.
I thought about AtomicInteger. However this becomes unavailable, because the atomic operation I want is to compare if not zero, then decrease. However the atomic operation provided by AtomicInteger, is compare if equal, and set. If you know how to apply AtomicInteger in this case, please raise it here.
What I am thinking is to synchronized the access to each integer like this:
ArrayList <Integer> list;
... ...
// Compare if greater than zero, and decrease
MutableInt n = list.get(index);
boolean success = false;
synchronized (n) {
if (n.intValue()>0) { n.decrement(); success=true; }
}
// To add one
MutableInt n = list.get(index);
synchronized (n) {
n.increment();
}
// To just read, I am thinking no need synchronization at all.
int n = list.get(index).intValue();
With my solution, is there any side-effect? Is it efficient to maintain hundreds or even thousands of synchronized integers?
Update: I am also thinking that allowing concurrent access to every element is not practical and not beneficial, as the actual concurrent access is limited by the number of processors. Maybe I just use several synchronization objects to guard different portions of the List, then it is enough?
Then another thing is to implement the operation of add/delete, that it is thread-safe, but do not impact much of the concurrency of the other operations. I am thinking ReadWriteLock, for add/delete, need to acquire the write lock, for other operations (change the value of one integer), acquire the read lock. Is this a right approach?
I think you're right to use read lock for accessing the list and write lock for add/remove on the list.
You can still use AtomicInteger for the values:
// Increase value
value.incrementAndGet()
// Decrease value, lower bound is 0
do {
int num = value.get();
if (num == 0)
break;
} while (! value.compareAndSet(num, num - 1)); // try again if concurrently updated
I think, if you can live with a fixed size list, using a single AtomicIntegerArray is a better choice than using multiple AtomicIntegers:
public class AtomicIntList extends AbstractList<Integer> {
private final AtomicIntegerArray array;
public AtomicIntList(int size) {
array=new AtomicIntegerArray(size);
}
public int size() {
return array.length();
}
public Integer get(int index) {
return array.get(index);
}
// for code accessing this class directly rather than using the List interface
public int getAsInt(int index) {
return array.get(index);
}
public Integer set(int index, Integer element) {
return array.getAndSet(index, element);
}
// for code accessing this class directly rather than using the List interface
public int setAsInt(int index, int element) {
return array.getAndSet(index, element);
}
public boolean decrementIfPositive(int index) {
for(;;) {
int old=array.get(index);
if(old<=0) return false;
if(array.compareAndSet(index, old, old-1)) return true;
}
}
public int incrementAndGet(int index) {
return array.incrementAndGet(index);
}
}
Code accessing this class directly rather than via the List<Integer> interface may use the methods getAsInt and setAsInt to avoid boxing conversions. This is a common pattern. Since the methods decrementIfPositive and incrementAndGet are not part of the List interface anyway, they always use int values.
As an update of this question... I found out that the simplest solution, just synchronizing the entire code-block for all possible conflict methods, turns out to be the best, even from performance's point of view. Synchronizing the code block solves both issues - accessing each counters, and also add/delete elements to the counter list.
This is because ReentrantReadWriteLock has a really high overhead, even when only the read lock is applied. Comparing to the overhead of read/write lock, the cost of the operation itself is so tiny, that any additional locking is not worthy doing that.
The statement in the API doc of ReentrantReadWriteLock shall be highly put attention to: "ReentrantReadWriteLocks... is typically worthwhile only when ... and entail operations with overhead that outweighs synchronization overhead".

ThreadLocal HashMap vs ConcurrentHashMap for thread-safe unbound caches

I'm creating a memoization cache with the following characteristics:
a cache miss will result in computing and storing an entry
this computation is very expensive
this computation is idempotent
unbounded (entries never removed) since:
the inputs would result in at most 500 entries
each stored entry is very small
cache is relatively shorted-lived (typically less than an hour)
overall, memory usage isn't an issue
there will be thousands of reads - over the cache's lifetime, I expect 99.9%+ cache hits
must be thread-safe
What would have superior performance, or under what conditions would one solution be favored over the other?
ThreadLocal HashMap:
class MyCache {
private static class LocalMyCache {
final Map<K,V> map = new HashMap<K,V>();
V get(K key) {
V val = map.get(key);
if (val == null) {
val = computeVal(key);
map.put(key, val);
}
return val;
}
}
private final ThreadLocal<LocalMyCache> localCaches = new ThreadLocal<LocalMyCache>() {
protected LocalMyCache initialValue() {
return new LocalMyCache();
}
};
public V get(K key) {
return localCaches.get().get(key);
}
}
ConcurrentHashMap:
class MyCache {
private final ConcurrentHashMap<K,V> map = new ConcurrentHashMap<K,V>();
public V get(K key) {
V val = map.get(key);
if (val == null) {
val = computeVal(key);
map.put(key, val);
}
return val;
}
}
I figure the ThreadLocal solution would initially be slower if there a lot of threads because of all the cache misses per thread, but over thousands of reads, the amortized cost would be lower than the ConcurrentHashMap solution. Is my intuition correct?
Or is there an even better solution?
use ThreadLocal as cache is a not good practice
In most containers, threads are reused via thread pools and thus are never gc. this would lead something wired
use ConcurrentHashMap you have to manage it in order to prevent mem leak
if you insist, i suggest using week or soft ref and evict after rich maxsize
if you are finding a in mem cache solution ( do not reinventing the wheel )
try guava cache
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/cache/CacheBuilder.html
this computation is very expensive
I assume this is the reason you created the cache and this should be your primary concern.
While the speed of the solutions might be slightly different << 100 ns, I suspect it is more important that you be able to share results between threads. i.e. ConcurrentHashMap is likely to be the best for your application is it is likely to save you more CPU time in the long run.
In short, the speed of you solution is likely to be tiny compared to the cost of computing the same thing multiple times (for multiple threads)
Note that your ConcurrentHashMap implementation is not thread safe and could lead to one item being computed twice. It is actually quite complicated to get it right if you store the results directly without using explicit locking, which you certainly want to avoid if performance is a concern.
It is worth noting that ConcurrentHashMap is highly scalable and works well under high contention. I don't know if ThreadLocal would perform better.
Apart from using a library, you could take some inspiration from Java Concurrency in Practice Listing 5.19. The idea is to save a Future<V> in your map instead of a V. That helps a lot in making the whole method thread safe while staying efficient (lock-free). I paste the implementation below for reference but the chapter is worth reading to understand that every detail counts.
public interface Computable<K, V> {
V compute(K arg) throws InterruptedException;
}
public class Memoizer<K, V> implements Computable<K, V> {
private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<K, Future<V>>();
private final Computable<K, V> c;
public Memoizer(Computable<K, V> c) {
this.c = c;
}
public V compute(final K arg) throws InterruptedException {
while (true) {
Future<V> f = cache.get(arg);
if (f == null) {
Callable<V> eval = new Callable<V>() {
public V call() throws InterruptedException {
return c.compute(arg);
}
};
FutureTask<V> ft = new FutureTask<V>(eval);
f = cache.putIfAbsent(arg, ft);
if (f == null) {
f = ft;
ft.run();
}
}
try {
return f.get();
} catch (CancellationException e) {
cache.remove(arg, f);
} catch (ExecutionException e) {
throw new RuntimeException(e.getCause());
}
}
}
}
Given that it's relatively easy to implement both of these, I would suggest you try them both and test at steady state load to see which one performs the best for your application.
My guess is that the the ConcurrentHashMap will be a little faster since it does not have to make native calls to Thread.currentThread() like a ThreadLocal does. However, this may depend on the objects you are storing and how efficient their hash coding is.
I may also be worthwhile trying to tune the concurrent map's concurrencyLevel to the number of threads you need. It defaults to 16.
The lookup speed is probably similar in both solutions. If there are no other concerns, I'd prefer ThreadLocal, since the best solution to multi-threading problems is single-threading.
However, your main problem is you don't want concurrent calculations for the same key; so there should be a lock per key; such locks can usually be implemented by ConcurrentHashMap.
So my solution would be
class LazyValue
{
K key;
volatile V value;
V getValue() { lazy calculation, doubled-checked locking }
}
static ConcurrentHashMap<K, LazyValue> centralMap = ...;
static
{
for every key
centralMap.put( key, new LazyValue(key) );
}
static V lookup(K key)
{
V value = localMap.get(key);
if(value==null)
localMap.put(key, value=centralMap.get(key).getValue())
return value;
}
The performance question is irrelevant, as the solutions are not equivalent.
The ThreadLocal hash map isn't shared between threads, so the question of thread safety doesn't even arise, but it also doesn't meet your specification, which doesn't say anything about each thread having its own cache.
The requirement for thread safety implies that a single cache is shared among all threads, which rules out ThreadLocal completely.

Categories