The requirement is that I need an ArrayList of integers. I need thread-safe access to the individual integers (write, read, increase, decrease), and I also need to allow maximum concurrency.
The operations on each integer also have a particular profile:
The most frequent operation is to read.
The second most frequent operation is to decrease by one, but only if the value is greater than zero, or to increase by one (unconditionally).
Adding/removing elements is rare, but still needed.
I thought about AtomicInteger. However, it seems unusable here, because the atomic operation I want is "compare if not zero, then decrease", whereas the atomic operation AtomicInteger provides is "compare if equal, then set". If you know how to apply AtomicInteger in this case, please point it out here.
What I am thinking of is synchronizing the access to each integer, like this:
ArrayList<MutableInt> list;
... ...

// Compare if greater than zero, and decrease
MutableInt n = list.get(index);
boolean success = false;
synchronized (n) {
    if (n.intValue() > 0) { n.decrement(); success = true; }
}

// To add one
MutableInt n = list.get(index);
synchronized (n) {
    n.increment();
}

// To just read, I am thinking there is no need for synchronization at all.
int n = list.get(index).intValue();
With my solution, are there any side effects? Is it efficient to maintain hundreds or even thousands of synchronized integers?
Update: I am also thinking that allowing concurrent access to every single element is neither practical nor beneficial, since the actual concurrency is limited by the number of processors. Maybe using several synchronization objects to guard different portions of the List would be enough?
The other thing is to implement add/delete so that it is thread-safe but does not impact the concurrency of the other operations much. I am thinking of a ReadWriteLock: for add/delete, acquire the write lock; for the other operations (changing the value of one integer), acquire the read lock, as sketched below. Is this the right approach?
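Roughly what I have in mind (just a sketch; I am assuming MutableInt is the Apache Commons Lang class, and the field and method names are only for illustration):

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import org.apache.commons.lang3.mutable.MutableInt;

private final List<MutableInt> list = new ArrayList<>();
private final ReadWriteLock listLock = new ReentrantReadWriteLock();

void addCounter() {
    listLock.writeLock().lock();              // structural change: exclusive
    try { list.add(new MutableInt()); }
    finally { listLock.writeLock().unlock(); }
}

boolean decrementIfPositive(int index) {
    listLock.readLock().lock();               // value change: shared with other value changes
    try {
        MutableInt n = list.get(index);
        synchronized (n) {                    // per-element lock for the actual update
            if (n.intValue() > 0) { n.decrement(); return true; }
            return false;
        }
    } finally { listLock.readLock().unlock(); }
}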
I think you're right to use read lock for accessing the list and write lock for add/remove on the list.
You can still use AtomicInteger for the values:
// Increase value
value.incrementAndGet();

// Decrease value, lower bound is 0
int num;
do {
    num = value.get();
    if (num == 0)
        break;
} while (!value.compareAndSet(num, num - 1)); // try again if concurrently updated
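If you are on Java 8 or later, the same bounded decrement can be expressed more compactly with getAndUpdate (just a sketch; the variable names are illustrative):

// Apply the bounded decrement atomically; getAndUpdate returns the previous
// value, so we can tell whether a decrement actually happened.
int previous = value.getAndUpdate(n -> n > 0 ? n - 1 : n);
boolean decremented = previous > 0;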
I think that if you can live with a fixed-size list, using a single AtomicIntegerArray is a better choice than using multiple AtomicIntegers:
public class AtomicIntList extends AbstractList<Integer> {
    private final AtomicIntegerArray array;

    public AtomicIntList(int size) {
        array = new AtomicIntegerArray(size);
    }

    public int size() {
        return array.length();
    }

    public Integer get(int index) {
        return array.get(index);
    }

    // for code accessing this class directly rather than using the List interface
    public int getAsInt(int index) {
        return array.get(index);
    }

    public Integer set(int index, Integer element) {
        return array.getAndSet(index, element);
    }

    // for code accessing this class directly rather than using the List interface
    public int setAsInt(int index, int element) {
        return array.getAndSet(index, element);
    }

    public boolean decrementIfPositive(int index) {
        for (;;) {
            int old = array.get(index);
            if (old <= 0) return false;
            if (array.compareAndSet(index, old, old - 1)) return true;
        }
    }

    public int incrementAndGet(int index) {
        return array.incrementAndGet(index);
    }
}
Code accessing this class directly rather than via the List<Integer> interface may use the methods getAsInt and setAsInt to avoid boxing conversions. This is a common pattern. Since the methods decrementIfPositive and incrementAndGet are not part of the List interface anyway, they always use int values.
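For illustration, calling code might use it like this (a hypothetical usage sketch):

AtomicIntList counters = new AtomicIntList(1000);        // fixed size, all elements start at 0
counters.incrementAndGet(42);                            // unconditional increment
boolean decremented = counters.decrementIfPositive(42);  // true, value is back to 0
int current = counters.getAsInt(42);                     // read without boxing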
As an update to this question... I found out that the simplest solution, just synchronizing the entire code block in every method that can conflict, turns out to be the best, even from a performance point of view. Synchronizing the code block solves both issues: accessing each counter, and adding/deleting elements in the counter list.
This is because ReentrantReadWriteLock has a really high overhead, even when only the read lock is acquired. Compared to the overhead of the read/write lock, the cost of the operation itself is so tiny that the extra locking is not worth it.
The statement in the API doc of ReentrantReadWriteLock deserves close attention: "ReentrantReadWriteLocks... is typically worthwhile only when ... and entail operations with overhead that outweighs synchronization overhead".
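For reference, the "synchronize everything" variant I ended up with looks roughly like this (a sketch; the class and method names are just for illustration):

import java.util.ArrayList;

class CounterList {
    private final ArrayList<Integer> counts = new ArrayList<>();

    synchronized int get(int index) { return counts.get(index); }

    synchronized void increment(int index) { counts.set(index, counts.get(index) + 1); }

    synchronized boolean decrementIfPositive(int index) {
        int v = counts.get(index);
        if (v <= 0) return false;
        counts.set(index, v - 1);
        return true;
    }

    synchronized void add(int initialValue) { counts.add(initialValue); }

    synchronized void removeAt(int index) { counts.remove(index); }
}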
Related
My question is about synchronisation and preventing deadlocks when using threads. In this example an object simply holds a numeric value, and multiple threads call swapValue on those objects.
public class Data {
    private long value;

    public Data(long value) {
        this.value = value;
    }

    public synchronized long getValue() {
        return value;
    }

    public synchronized void setValue(long value) {
        this.value = value;
    }

    public void swapValue(Data other) {
        long temp = getValue();
        long newValue = other.getValue();
        setValue(newValue);
        other.setValue(temp);
    }
}
The swapValue method should be thread safe and should not skip swapping the values if the resources are not available. Simply using the synchronized keyword on the method signature will result in a deadlock. I came up with this (apparently) working solution, which is only based on the probability that one thread unlocks its resource and the other tries to claim it while the resource is still unlocked.
private Lock lock = new ReentrantLock();
...

public void swapValue(Data other) {
    lock.lock();
    while (!other.lock.tryLock()) {
        lock.unlock();
        lock.lock();
    }

    long temp = getValue();
    long newValue = other.getValue();
    setValue(newValue);
    other.setValue(temp);

    other.lock.unlock();
    lock.unlock();
}
To me this looks like a hack. Is this a common solution for these kind of problems? Are there solutions that are "more deterministic" in their behaviour and also applicable in practice?
There are two issues at play here:
First, mixing Data.lock with the built-in lock used by the synchronized keyword
Second, inconsistent locking order among four (!) locks - this.lock, other.lock, the built-in lock of this, and the built-in lock of other
Even without synchronized, a.swapValue(b) and b.swapValue(a) can deadlock unless you use your approach to try to spin while locking and unlocking, which is inefficient.
One approach that you could take is to add a field with some kind of final unique ID to each Data object; when swapping the data of two objects, lock the one with the lower ID before the one with the higher ID, regardless of which is this and which is other. Note that System.identityHashCode is unfortunately not unique, so it can't easily be used here.
The unlock ordering isn't critical here, but unlocking in the reverse order of locking is generally a good practice to follow where possible.
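A minimal sketch of how such an ID could be minted (the static AtomicLong counter and field names here are just an illustration):

import java.util.concurrent.atomic.AtomicLong;

public class Data {
    private static final AtomicLong NEXT_ID = new AtomicLong();
    private final long id = NEXT_ID.getAndIncrement(); // unique and final per instance

    public long getID() { return id; }

    // ... value, lock, getValue/setValue and swapValue as before ...
}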
@Nanofarad has the right idea: Give every Data instance a unique, permanent numeric ID, and then use those IDs to decide which object to lock first. Here's what that might look like in practice:
private static void lockBoth(Data a, Data b) {
    Lock first = a.lock;
    Lock second = b.lock;
    if (a.getID() < b.getID()) {
        first = b.lock;
        second = a.lock;
    }
    first.lock();
    second.lock();
}

private static void unlockBoth(Data a, Data b) {
    a.lock.unlock();
    b.lock.unlock();
    // Note: @Queeg suggests in the comments below that in the general case,
    // it would be good practice to make this routine always unlock the
    // two locks in the order opposite to which `lockBoth()` locked them.
    // See https://stackoverflow.com/a/8949355/801894 for an explanation.
}

public void swapValue(Data other) {
    lockBoth(this, other);
    ...swap 'em...
    unlockBoth(this, other);
}
In your case, just use AtomicInteger or AtomicLong instead of reinventing the wheel. As for the synchronization and deadlock part of your question in general: DO NOT RELY ON PROBABILITY. It is way too tricky and too easy to get wrong, unless you're an experienced mathematician who knows exactly what you're doing, and even then it is risky. One example where probability is used is UUIDs, but if computers get fast enough, code that shouldn't reasonably break until the end of the universe can break in a matter of milliseconds. It is better to write code that does not rely on probability, especially concurrent code.
I have been asked to implement fine-grained locking on a hash list. I have done this using synchronized, but the question tells me to use Lock instead.
I have created the hash list of buckets in the constructor:
private LinkedList<E> data[];
private Lock dataLock[];
private Lock lockR = new ReentrantLock();

// The constructors ensure that data and dataLock are the same size
@SuppressWarnings("unchecked")
public ConcurrentHashList(int n) {
    if (n > 1000) {
        data = (LinkedList<E>[]) (new LinkedList[n / 10]);
        dataLock = new Lock[n / 10];
    }
    else {
        data = (LinkedList<E>[]) (new LinkedList[100]);
        dataLock = new Lock[100];
    }
    for (int j = 0; j < data.length; j++) {
        data[j] = new LinkedList<E>();
        dataLock[j] = new ReentrantLock(); // Adding a lock to each bucket index
    }
}
The original method
public void add(E x) {
    if (x != null) {
        lockR.lock();
        try {
            int index = hashC(x);
            if (!data[index].contains(x))
                data[index].add(x);
        } finally { lockR.unlock(); }
    }
}
Using synchronized to grab a handle on the per-bucket object, allowing multiple threads to work on different indexes concurrently:
public void add(E x) {
    if (x != null) {
        int index = hashC(x);
        synchronized (dataLock[index]) { // Getting handle before adding
            if (!data[index].contains(x))
                data[index].add(x);
        }
    }
}
I do not know how to implement it using Lock, though. I cannot lock a single element of an array, only the whole method, which makes it coarse-grained rather than fine-grained.
Using an array of ReentrantLock
public void add(E x) {
    if (x != null) {
        int index = hashC(x);
        dataLock[index].lock();
        try {
            // Getting handle before adding
            if (!data[index].contains(x))
                data[index].add(x);
        } finally { dataLock[index].unlock(); }
    }
}
The hash function
private int hashC(E x) {
    int k = x.hashCode();
    int h = Math.abs(k % data.length);
    return h;
}
Presumably, hashC() is a function that is highly likely to produce unique numbers. As in, you have no guarantee that the hashes are unique, but the incidence of non-unique hashes is extremely low. For a data structure with a few million entries, you have a literal handful of collisions, and any given collision always consists of only a pair or maybe 3 conflicts (2 to 3 objects in your data structure have the same hash, but not 'thousands').
Also, assumption: the hash for a given object is constant. hashC(x) will produce the same value no matter how many times you call it, assuming you provide the same x.
Then, you get some fun conclusions:
The 'bucket' (The LinkedList instance found at array slot hashC(x) in data) that your object should go into, is always the same - you know which one it should be based solely on the result of hashC.
Calculating hashC does not require a lock of any sort. It has no side effects whatsoever.
Thus, knowing which bucket you need for a given operation on a single value (Be it add, remove, or check-if-in-collection) can be done without locking anything.
Now, once you know which bucket you need to look at / mutate, okay, now locking is involved.
So, just have 1 lock for each bucket. Not a List<Object> locks[] - that's a whole list's worth of locks per bucket. Just Object[] locks is all you need, or ReentrantLock[] locks if you prefer to use lock/unlock instead of synchronized (locks[bucketIdx]) { ... }.
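A sketch of the synchronized-on-Object variant, reusing data and hashC from the question (the locks field name is mine, and its initialization is assumed to happen in the constructor):

private Object[] locks;   // one plain Object per bucket

// in the constructor, after data is created:
//   locks = new Object[data.length];
//   for (int j = 0; j < locks.length; j++) locks[j] = new Object();

public void add(E x) {
    if (x != null) {
        int index = hashC(x);          // computing the bucket needs no lock
        synchronized (locks[index]) {  // lock only this bucket
            if (!data[index].contains(x))
                data[index].add(x);
        }
    }
}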
This is effectively fine-grained: After all, the odds that one operation needs to twiddle its thumbs because another thread is doing something, even though that other thread is operating on a different object, is very low; it would require the two different objects to have a colliding hash, which is possible, but extremely rare - as per assumption #1.
NB: Note that the global lockR can therefore go away entirely; you don't need it, unless you want to build in the ability for the code to completely re-design its bucket structure. For example, 1000 buckets feels a bit meh if you end up with a billion objects. I don't think 'rebucket everything' is part of the task here, though.
I came across a performance issue when implementing a non-duplicate concurrent list data structure (backed by an ArrayList or a ConcurrentLinkedQueue).
public class NonDuplicateList implements Outputable {
    private Map<Term, Integer> map;
    private List<Term> terms;

    public NonDuplicateList() {
        this.map = new HashMap<>();
        this.terms = new ArrayList<>();
    }

    public synchronized int addTerm(Term term) { // bad performance :(
        Integer index = map.get(term);
        if (index == null) {
            index = terms.size();
            terms.add(term);
            map.put(term, index);
        }
        return index;
    }

    @Override
    public void output(DataOutputStream out) throws IOException {
        out.writeInt(terms.size());
        for (Term term : terms) {
            term.output(out);
        }
    }
}
Note that Term and NonDuplicateList both implement Outputable interface to output.
In order to keep NonDuplicateList thread-safe, I use synchronized to guard the method addTerm(Term), and the performance is as bad as expected when concurrently invoking addTerm.
It seems that ConcurrentHashMap isn't suitable for this case, since it doesn't keep strong data consistency. Any idea how to improve the performance of addTerm without losing its thread-safety?
EDIT:
The output method, i.e. iterating through the NonDuplicateList, does not need to be thread-safe, since only one thread will access it after all the concurrent addTerm calls have finished; but addTerm must return the index value immediately, as soon as a term is added to the NonDuplicateList.
There is a possibility to use ConcurrentHashMap in your implementation if you can sacrifice the addTerm return type. Instead of returning the actual index you can return a boolean which indicates whether the addition was successful or produced a duplicate. This also allows you to remove the method synchronization and improve performance:
private ConcurrentMap<Term, Boolean> map;
private List<Term> terms;

public boolean addTerm(Term term) {
    Boolean previousValue = map.putIfAbsent(term, Boolean.TRUE);
    if (previousValue == null) {
        terms.add(term);
        return true;
    }
    return false;
}
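One caveat: once the method-level synchronized is removed, terms itself must tolerate concurrent add calls. A minimal sketch of field declarations that would make the snippet above self-consistent (assuming a ConcurrentLinkedQueue is acceptable; it preserves insertion order):

import java.util.Collection;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentMap;

// Assumed declarations: both collections are safe to mutate concurrently,
// so addTerm no longer needs any explicit locking.
private final ConcurrentMap<Term, Boolean> map = new ConcurrentHashMap<>();
private final Collection<Term> terms = new ConcurrentLinkedQueue<>();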
I am afraid you will not get a much faster solution here. The point is to avoid synchronization when you don't need it. If you don't mind weak consistency, using a ConcurrentHashMap iterator can be significantly cheaper than either preventing other threads from adding items while you're iterating or taking a consistent snapshot when the iterator is created.
On the other hand, when you need synchronization and a consistent iterator, you'll need an alternative to ConcurrentHashMap. One that comes to mind is java.util.Collections#synchronizedMap, but it uses synchronization at the object level, so every read/write operation needs to acquire the lock, which is a performance overhead.
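For reference, a small sketch of the Collections.synchronizedMap route; note that the javadoc requires you to synchronize on the returned map manually while iterating:

Map<Term, Integer> map = Collections.synchronizedMap(new HashMap<>());
// every get/put acquires the map's monitor; iteration must hold it explicitly:
synchronized (map) {
    for (Map.Entry<Term, Integer> e : map.entrySet()) {
        // consistent view while iterating
    }
}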
Take a look at ConcurrentSkipListMap, which guarantees average O(log(n)) performance for a wide variety of operations. It also has a number of operations that ConcurrentHashMap doesn't: ceilingEntry/Key, floorEntry/Key, etc. It also maintains a sort order, which would otherwise have to be calculated (at notable expense) if you were using a ConcurrentHashMap. Maybe it would be possible to get rid of the list + map and use a ConcurrentSkipListMap instead. The index of an element might be computed using the ConcurrentSkipListMap API.
I have read that in ConcurrentHashMap in Java, simultaneous insertions are possible because the map is divided into segments and a separate lock is taken for each segment.
But if two insertions are going to happen on the same segment, then they cannot run simultaneously.
My question is: what will happen in that case? Will the second insertion wait until the first one completes?
In general you don't need to be too concerned with how ConcurrentHashMap is implemented. It simply complies with the contract of ConcurrentMap, which ensures that concurrent modifications are possible.
But to answer your question: yes, one insertion may wait for the completion of the other. Internally, it uses locks which ensure that one thread waits until the other releases the lock. The class Segment used internally actually inherits from ReentrantLock. Here is a shortened version of Segment.put():
final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    HashEntry<K,V> node = tryLock() ? null : scanAndLockForPut(key, hash, value);
    V oldValue;
    try {
        // modifications
    } finally {
        unlock();
    }
    return oldValue;
}

private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
    // ...
    int retries = -1; // negative while locating node
    while (!tryLock()) {
        if (retries < 0) {
            // ...
        }
        else if (++retries > MAX_SCAN_RETRIES) {
            lock();
            break;
        }
        else if ((retries & 1) == 0 && (f = entryForHash(this, hash)) != first) {
            e = first = f; // re-traverse if entry changed
            retries = -1;
        }
    }
    return node;
}
This could give you an idea.
ConcurrentHashMap does not block when performing retrieval operations, and there is no locking for the usual operations.
The heuristic with most Concurrent Data Structures is that there's a backing data structure that gets modified first, with a front-facing data structure that's visible to outside methods. Then, when the modification is complete, the backing data structure is made the public data structure and the public data structure is pushed to the back. There's way more to it than that, but that's the typical contract.
If 2 updates try to happen on the same segment they will go into contention with each other and one of them will have to wait. You can optimise this by choosing a concurrencyLevel value which takes into account the number of threads which will be concurrently updating the hashmap.
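For example, a sketch of passing a concurrency level at construction time (the capacity, load factor, and level below are just illustrative values):

// 16 is the expected number of concurrently writing threads here; initial
// capacity 64 and load factor 0.75f are the usual defaults.
ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>(64, 0.75f, 16);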
You can find all the details in the javadoc for the class.
ConcurrentHashMap contains an array of Segments, which in turn hold arrays of HashEntry. Each HashEntry holds a key, a value, and a pointer to its next adjacent entry.
But it acquires the lock at the segment level. Hence you are correct: the second insertion waits until the first one completes.
Take a look at the javadoc for ConcurrentMap. It describes the extra methods available to deal with concurrent map mutations.
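For illustration, a few of the atomic operations ConcurrentMap adds over plain Map (a sketch with made-up keys and values):

ConcurrentMap<String, Integer> counts = new ConcurrentHashMap<>();
counts.putIfAbsent("key", 0);   // insert only if no mapping exists yet
counts.replace("key", 0, 1);    // replace only if currently mapped to 0
counts.remove("key", 1);        // remove only if currently mapped to 1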
I'm writing a modified Kademlia P2P system here, but the problem I'm describing is very similar to the implementation of the original one.
So, what's the most efficient way of implementing k-Buckets? What matters for me are access time, parallelism (read & write) and memory consumption.
I thought about doing it with a ConcurrentLinkedQueue and a ConcurrentHashMap, but that's pretty redundant and nasty, isn't it?
At the moment I'm simply synchronizing a LinkedList.
Here is my code:
import java.util.LinkedList;

class Bucket {

    private final LinkedList<Neighbour> neighbours;
    private final Object lock;

    Bucket() {
        neighbours = new LinkedList<>();
        lock = new Object();
    }

    void sync(Neighbour n) {
        synchronized (lock) {
            int index = neighbours.indexOf(n);
            if (index == -1) {
                neighbours.add(n);
                n.updateLastSeen();
            } else {
                Neighbour old = neighbours.remove(index);
                neighbours.add(old);
                old.updateLastSeen();
            }
        }
    }

    void remove(Neighbour n) {
        synchronized (lock) {
            neighbours.remove(n);
        }
    }

    Neighbour resolve(Node n) throws ResolveException {
        Neighbour nextHop;
        synchronized (lock) {
            int index = neighbours.indexOf(n);
            if (index == -1) {
                nextHop = neighbours.poll();
                neighbours.add(nextHop);
                return nextHop;
            } else {
                return neighbours.get(index);
            }
        }
    }
}
Please don't be surprised: I have implemented a separate neighbour eviction process.
So, what's the most efficient way of implementing k-Buckets?
That depends. If you want to do an implementation with bells and whistles (e.g. bucket splitting, multi-homing) then you need a flexible list or a tree.
In my experience a copy on write array + binary search works well for the routing table because you rarely modify the total number of buckets, just the content of buckets.
With CoW semantics you need less locking since you can just fetch a current copy of the array, retrieve the bucket of interest and then lock on the bucket. Or use an atomic array inside each bucket.
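A rough sketch of that copy-on-write idea, with placeholder types and a hypothetical findBucketIndex helper:

class RoutingTable {
    // The bucket array is replaced wholesale on the rare split/merge, so
    // readers can take an unlocked snapshot of it.
    private volatile Bucket[] buckets = new Bucket[] { new Bucket() };

    Bucket bucketFor(byte[] nodeId) {
        Bucket[] snapshot = buckets;                  // consistent view, no lock
        int idx = findBucketIndex(snapshot, nodeId);  // e.g. binary search by ID prefix
        return snapshot[idx];                         // caller then locks just this bucket
    }

    synchronized void replaceBuckets(Bucket[] newBuckets) {
        buckets = newBuckets;                         // writers serialize; readers never block
    }

    private static int findBucketIndex(Bucket[] buckets, byte[] nodeId) {
        // hypothetical helper: binary search over the buckets' ID-prefix ranges
        return 0;
    }
}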
But of course such optimizations are only necessary if you expect high throughput; most DHT nodes see very little traffic, a few packets per second at most, so there's no need to involve multiple threads unless you implement a specialized node with so much throughput that multiple threads are needed to process the data.
CoW works less well for routing-table-like lookup caches or the ephemeral visited-node/target node sets built during a lookup since those get rapidly modified. ConcurrentSkipListMaps can be a better choice if you are expecting high load.
If you want a simplified, approximate implementation then just use a fixed-size array of 160 elements where the array index is the shared-prefix-bits count relative to your node ID. This performs reasonably well but doesn't allow for some of the optimizations proposed by the full kademlia paper.