I have a key-value map accessed by multiple threads:
private final ConcurrentMap<Key, VersionValue> key_vval_map = new ConcurrentHashMap<Key, VersionValue>();
My custom get() and put() methods follow the typical check-then-act pattern. Therefore, synchronization is necessary to ensure atomicity. To avoid locking the whole ConcurrentHashMap, I define:
private final Object[] locks = new Object[10];
{
for(int i = 0; i < locks.length; i++)
locks[i] = new Object();
}
And the get() method goes (it calls the get() method of ConcurrentHashMap):
public VersionValue get(Key key)
{
final int hash = key.hashCode() & 0x7FFFFFFF;
synchronized (locks[hash % locks.length]) // I am not sure whether this synchronization is necessary.
{
VersionValue vval = this.key_vval_map.get(key);
if (vval == null)
return VersionValue.RESERVED_VERSIONVALUE; // RESERVED_VERSIONVALUE is defined elsewhere
return vval;
}
}
The put() method goes (it calls the get() method above):
public void put(Key key, VersionValue vval)
{
final int hash = key.hashCode() & 0x7FFFFFFF;
synchronized (locks[hash % locks.length]) // allowing concurrent writers
{
VersionValue current_vval = this.get(key); // call the get() method above
if (current_vval.compareTo(vval) < 0) // it is a newer VersionValue
this.key_vval_map.put(key, vval);
}
}
The above code works. But, as you know, working is far from being correct in multi-threaded programming.
My questions are:
Is this synchronization mechanism (especially synchronized (locks[hash % locks.length])) necessary and correct in my code?
In Javadoc on Interface Lock, it says
Lock implementations provide more extensive locking operations than
can be obtained using synchronized methods and statements.
Then is it feasible to replace synchronization by Lock in my code?
Edit: If you are using Java 8, see the answer by @nosid.
ConcurrentMap allows you to use optimistic locking instead of explicit synchronization:
VersionValue current_vval = null;
VersionValue new_vval = null;
do {
current_vval = key_vval_map.get(key);
VersionValue effectiveVval = current_vval == null ? VersionValue.RESERVED_VERSIONVALUE : current_vval;
if (effectiveVval.compareTo(vval) < 0) {
new_vval = vval;
} else {
break;
}
} while (!replace(key, current_vval, new_vval));
...
private boolean replace(Key key, VersionValue current, VersionValue newValue) {
if (current == null) {
return key_vval_map.putIfAbsent(key, newValue) == null;
} else {
return key_vval_map.replace(key, current, newValue);
}
}
It will probably have better performance under low contention.
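For reference, here is the same retry loop as a small self-contained sketch. To keep it short I assume versions are plain long values and that Long.MIN_VALUE plays the role of RESERVED_VERSIONVALUE; the class name VersionedStore and the method putIfNewer are made up for this example and are not part of the question's code.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class VersionedStore {
    // Hypothetical stand-in for RESERVED_VERSIONVALUE: older than any real version.
    static final long RESERVED = Long.MIN_VALUE;

    private final ConcurrentMap<String, Long> map = new ConcurrentHashMap<>();

    long get(String key) {
        Long current = map.get(key);
        return current == null ? RESERVED : current;
    }

    /** Stores candidate only if it is newer than the current value; returns true if it was stored. */
    boolean putIfNewer(String key, long candidate) {
        while (true) {
            Long current = map.get(key);
            long effective = current == null ? RESERVED : current;
            if (effective >= candidate) {
                return false; // the stored value is already at least as new
            }
            // Compare-and-set: succeeds only if the entry is unchanged since we read it.
            boolean swapped = (current == null)
                    ? map.putIfAbsent(key, candidate) == null
                    : map.replace(key, current, candidate);
            if (swapped) {
                return true;
            }
            // Another thread changed the entry in between; re-read and retry.
        }
    }
}
```

Note that no lock is held at any point. A writer that loses the race simply re-reads the entry and decides again.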
Regarding your questions:
If you use Guava, take a look at Striped
No, you don't need the additional functionality of Lock here
If you are using Java-8, you can use the method ConcurrentHashMap::merge instead of reading and updating the value in two steps.
public VersionValue get(Key key) {
return key_vval_map.getOrDefault(key, VersionValue.RESERVED_VERSIONVALUE);
}
public void put(Key key, VersionValue vval) {
key_vval_map.merge(key, vval,
(lhs, rhs) -> lhs.compareTo(rhs) >= 0 ? lhs : rhs);
}
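A minimal runnable sketch of the same idea, assuming versions are modelled as plain integers and -1 stands in for RESERVED_VERSIONVALUE (the class name NewestWins and both simplifications are mine, not from the question):

```java
import java.util.concurrent.ConcurrentHashMap;

class NewestWins {
    // Hypothetical placeholder for RESERVED_VERSIONVALUE.
    static final int RESERVED = -1;

    private final ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();

    int get(String key) {
        return map.getOrDefault(key, RESERVED);
    }

    void put(String key, int version) {
        // merge() inserts the value if the key is absent, otherwise keeps the larger one.
        // ConcurrentHashMap performs the whole step atomically for the key.
        map.merge(key, version, Integer::max);
    }
}
```

Putting 5, then 3, then 7 under the same key leaves 7 in the map. The stale 3 is silently dropped.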
I have implemented my own hashmap for study purposes. The key has a string and the value has an object of the class I created. By the way, I want to know if my hashcode method is appropriate, and how to not calculate the hashcode every time a value is inserted.
I saved the hash value once calculated as a member variable of object. However, when the get method is called, only the key value is received, so the hashcode must be obtained. How can I recycle the once calculated hash value?
Finally, is my hash generation method appropriate?
class IHashMap {
private class Node {
int hash;
String key;
int data;
Node right;
public Node(String key, int data) {
this.key = key;
this.data = data;
this.right = null;
this.hash = 0;
}
}
private Node[] table;
private int tbSize;
private int n;
public IHashMap(int tbSize) {
this.table = new Node[tbSize];
this.tbSize = tbSize;
this.n = 0;
}
//...Omit irrelevant code...
public void put(String key, int value) {
int hash = hashCode(key);
Node node = new Node(key, value);
node.hash = hash;
if(this.table[hash] != null) {
Node entry = this.table[hash];
while(entry.right != null && !entry.key.equals(key))
entry = entry.right;
if(entry.key.equals(key)) {
entry.data++;
}
else {
entry.right = node;
this.n++;
}
}
else {
this.table[hash] = node;
this.n++;
}
}
public int get(String key) {
int hash = hashCode(key);
if(this.table[hash] != null) {
if(this.table[hash].key.equals(key))
return this.table[hash].data;
Node entry = this.table[hash];
while(entry != null && !entry.key.equals(key))
entry = entry.right;
if(entry == null)
return -1;
return entry.data;
}
return -1;
}
private int hash(String key) {
int h = 0;
if(key.length() > 0) {
char[] var = strToCharArray(key);
for(int i = 0; i < var.length; i++)
h = 31 * h + var[i];
}
return h;
}
private int hashCode(String key) {
return (hash(key) & 0x7fffffff) % this.tbSize;
}
//...Omit irrelevant code...
}
I would really appreciate it if you could answer me.
So, the hashcode is the hashcode of the thing that is being inserted.
The way to prevent this from being too much of a hassle is to slip a few lines into the stored item's hashcode method that look like
int hashcode() {
if (I have a cached hashcode) {
return cached_hashcode;
}
(calculate hashcode)
cached_hashcode = hashcode;
return hashcode;
}
This way, for each object, you only go through the hashcode computation once.
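The caching idea above could look roughly like this in real Java. It is only a sketch: the wrapper class CachedHashKey and its computations counter are invented for illustration, and the approach is only safe because the wrapped text never changes after construction.

```java
final class CachedHashKey {
    private final String text;
    private int cachedHash;   // 0 means "not computed yet"
    int computations;         // illustration only: counts how often we really computed

    CachedHashKey(String text) {
        this.text = text;
    }

    @Override
    public int hashCode() {
        int h = cachedHash;
        if (h == 0 && !text.isEmpty()) {
            computations++;
            for (int i = 0; i < text.length(); i++) {
                h = 31 * h + text.charAt(i);
            }
            cachedHash = h; // a text whose hash really is 0 is simply recomputed each time
        }
        return h;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof CachedHashKey
                && text.equals(((CachedHashKey) other).text);
    }
}
```

If the same key object is passed to both put and get, the second call reuses the stored value instead of walking the characters again.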
Now, keep in mind that computers have progressed a lot. They mostly wait on the RAM subsystem to respond, and can do on the order of hundreds to thousands of math operations in the time a single uncached RAM fetch takes. This means that "preserving CPU cycles" at the cost of memory lookups can actually slow down your program.
Benchmark wisely, and don't be afraid to use a little CPU if it means reducing your RAM footprint.
For those who are curious, if your program is small enough to fit into the L1 cache, the delay is small, but as you spill over into the other cache levels the delays become noticeable. This is why "caching" is not always a great solution: if you cache too heavily, your program gets larger and spills out of the CPU cache more often.
Modern CPUs try to compensate, mostly by pre-fetching the needed RAM before it is requested (looking ahead in the processing stream). That leads to better runtime in many cases, but also creates new issues (like preloading stuff you might not use because you chose the "other" path through the code).
The best bet is to not overly-cache stuff that is simple, unless it's expensive to reconstruct. With the JVM a method call (at the very low levels) is more expensive than you might think, so the JVM has special optimizations for Strings and their hash codes.
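On the String point specifically: java.lang.String already stores its hash code in a private field after the first call to hashCode(), and it uses the same 31 * h + c polynomial as the hash method in the question. So for String keys one option is to call key.hashCode() directly, as in the sketch below. The class BucketIndex and its method names are made up for this example, and Math.floorMod needs Java 8.

```java
class BucketIndex {
    // Same polynomial as the question's hash(String) method.
    static int manualHash(String key) {
        int h = 0;
        for (int i = 0; i < key.length(); i++) {
            h = 31 * h + key.charAt(i);
        }
        return h;
    }

    // Maps a key to a bucket in [0, tableSize), reusing the hash that String caches internally.
    static int indexFor(String key, int tableSize) {
        // floorMod never returns a negative result for a positive tableSize,
        // so no sign masking is needed.
        return Math.floorMod(key.hashCode(), tableSize);
    }
}
```

For negative hash codes floorMod picks a different bucket than (hash & 0x7fffffff) % tbSize does, but either is fine as long as put and get use the same formula.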
Is this valid code to write if I wish to avoid an unnecessary contains call?
I wish to avoid a contains call on every invocation, as this is highly time-sensitive code.
cancelretryCountMap.putIfAbsent(tag,new AtomicInteger(0));
count = cancelretryCountMap.get(tag).incrementAndGet();
if(count > 10){
///abort after x retries
....
}
I am using JDK 7
Usually, you would use putIfAbsent like this:
final AtomicInteger present = map.get(tag);
int count;
if (present != null) {
count = present.incrementAndGet();
} else {
final AtomicInteger instance = new AtomicInteger(0);
final AtomicInteger marker = map.putIfAbsent(tag, instance);
if (marker == null) {
count = instance.incrementAndGet();
} else {
count = marker.incrementAndGet();
}
}
The reason for the explicit get is that you want to avoid allocating the default value in the "happy" path (i.e., when there is already an entry with the given key).
If there is no matching entry, you have to use the return value of putIfAbsent in order to distinguish between
the entry was still missing (and the default value has been added due to the call), in which case the method returns null, and
some other thread has won the race and inserted the new entry after the call to get (in which case the method returns the current value associated with the given key)
You can abstract this sequence by introducing a helper method, e.g.,
interface Supplier<T> {
T get();
}
static <K, T> T computeIfAbsent(ConcurrentMap<K, T> map, K key, Supplier<? extends T> producer) {
final T present = map.get(key);
if (present != null) {
return present;
} else {
final T fallback = producer.get();
final T marker = map.putIfAbsent(key, fallback);
if (marker == null) {
return fallback;
} else {
return marker;
}
}
}
You could use this in your example:
static final Supplier<AtomicInteger> newAtomicInteger = new Supplier<AtomicInteger>() {
public AtomicInteger get() { return new AtomicInteger(0); }
};
void yourMethodWhatever(Object tag) {
final AtomicInteger counter = computeIfAbsent(cancelretryCountMap, tag, newAtomicInteger);
if (counter.incrementAndGet() > 10) {
... whatever ...
}
}
Note that this is already provided in JDK 8 as the default method computeIfAbsent on Map, but since you are still on JDK 7, you have to roll your own, as is done here.
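For readers who are already on JDK 8, the whole sequence could shrink to something like the following sketch. The class RetryCounter, the method tryAgain and the maxRetries parameter are invented for illustration; only computeIfAbsent and AtomicInteger come from the JDK.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class RetryCounter {
    private final ConcurrentMap<Object, AtomicInteger> counts = new ConcurrentHashMap<>();
    private final int maxRetries;

    RetryCounter(int maxRetries) {
        this.maxRetries = maxRetries;
    }

    /** Counts one more attempt for the tag; returns true while it is within the retry budget. */
    boolean tryAgain(Object tag) {
        // The AtomicInteger is only allocated when the tag has no entry yet.
        int count = counts.computeIfAbsent(tag, k -> new AtomicInteger()).incrementAndGet();
        return count <= maxRetries;
    }
}
```

With new RetryCounter(10), the eleventh call for the same tag returns false, which corresponds to the abort branch in the question.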
I have a HashMap
ConcurrentHashMap<String, Integer> count =new ConcurrentHashMap<String, Integer>();
I will use like this:
private Integer somefunction(){
Integer order;
synchronized (this) {
if (count.containsKey(key)) {
order = count.get(key);
count.put(key, order + 1);
} else {
order = 0;
count.put(key, order + 1);
}
}
return order;
}
But as you can see, this may not be ideal for concurrency, since only values under the same key can interfere with each other. Different keys don't interfere with each other, so it's not necessary to synchronize all operations. I want to synchronize only when the key is the same.
Can I do something that can achieve better performance on concurrency?
(I know ConcurrentHashMap and synchronized are kind of redundant here, but let's focus on whether we can synchronize only when the key is the same.)
The whole point of ConcurrentHashMap is to facilitate concurrent operations. Here's how you can do an atomic update with no need for explicit synchronization:
private Integer somefunction() {
Integer oldOrder;
// Insert key if it isn't already present.
oldOrder = count.putIfAbsent(key, 1);
if (oldOrder == null) {
return 0;
}
// If we get here, oldOrder holds the previous value.
// Atomically update it.
while (!count.replace(key, oldOrder, oldOrder + 1)) {
oldOrder = count.get(key);
}
return oldOrder;
}
See the Javadocs for putIfAbsent() and replace() for details.
As Tagir Valeev points out in his answer, you can use merge() instead if you're on Java 8, which would shorten the code above to:
private Integer somefunction() {
return count.merge(key, 1, Integer::sum) - 1;
}
Another option would be to let the values be AtomicInteger instead. See hemant1900's answer for how to do so.
I think this might be better and simpler -
private final ConcurrentHashMap<String, AtomicInteger> count = new ConcurrentHashMap<String, AtomicInteger>();
private Integer someFunction(String key){
AtomicInteger order = count.get(key);
if (order == null) {
final AtomicInteger value = new AtomicInteger(0);
order = count.putIfAbsent(key, value);
if (order == null) {
order = value;
}
}
return order.getAndIncrement();
}
It's very easy if you can use Java-8:
return count.merge(key, 1, Integer::sum)-1;
No additional synchronization is necessary. The merge method is guaranteed to be executed atomically.
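If you want to convince yourself, a small stress test along these lines can help. It is only a sketch: the class OrderCounter and the helper hammer are names I made up, and the thread and call counts are arbitrary.

```java
import java.util.concurrent.ConcurrentHashMap;

class OrderCounter {
    private final ConcurrentHashMap<String, Integer> count = new ConcurrentHashMap<>();

    /** Returns the value before the increment, like somefunction() in the question. */
    int next(String key) {
        return count.merge(key, 1, Integer::sum) - 1;
    }

    int current(String key) {
        return count.getOrDefault(key, 0);
    }

    /** Increments one key from several threads and returns the final count. */
    static int hammer(int threads, int callsPerThread) {
        OrderCounter counter = new OrderCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    counter.next("k");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.current("k");
    }
}
```

Because every increment goes through merge, hammer(4, 1000) should end at exactly 4000 with no lost updates.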
First of all, where does key even come from?
Secondly, if key will never be the same for two threads running that function at any one time you don't need to synchronize any part of the function.
If, however, two threads could have the same key at the same time then you only need:
synchronized(count) {
count.put(key, order + 1);
}
The reason for this is that only concurrent mutation of an object's variables needs to be synchronized. But the fact that you are using a ConcurrentHashMap should eliminate this problem (double check me on this), thus no synchronization is needed.
Here is how I do this,
private Integer somefunction(){
Integer order = count.compute(key, (String k, Integer v) -> {
if (v == null)
return 1;
else {
return v + 1;
}
});
return order-1;
}
This avoids repeatedly retrying replace(key, oldValue, newValue).
Will this be better for concurrency?
The problem is that a lot of environments don't support JDK 8 yet.
From Usage_in_Java and Effective Java 2, I understand that I need the volatile keyword if the lazy initialization target is a single variable.
However, what if I perform multiple lazy initializations and store the results in a ConcurrentHashMap? Does the Map need to be volatile too?
Here's the code example.
public class Utils {
private static final Map<Integer, RemoteViews> remoteViewsMap = new ConcurrentHashMap<Integer, RemoteViews>();
public static RemoteViews getRemoteViews(int resourceId) {
RemoteViews remoteViews = remoteViewsMap.get(resourceId);
if (remoteViews == null) {
synchronized(remoteViewsMap){
remoteViews = remoteViewsMap.get(resourceId);
if (remoteViews == null) {
remoteViews = new RemoteViews("org.yccheok.gui", resourceId);
remoteViewsMap.put(resourceId, remoteViews);
}
}
}
return remoteViews;
}
}
Is the above code correct and thread safe?
There's no need for the volatile keyword, since ConcurrentHashMap, being an implementation of ConcurrentMap, provides the following memory consistency effect:
actions in a thread prior to placing an object into a ConcurrentMap as a key or value happen-before actions subsequent to the access or removal of that object from the ConcurrentMap in another thread
However, that's not how you usually want to work with a concurrent map. The general pattern is as follows:
Object existing = concurrentMap.get(key);
// check if this key is already present
if (existing == null) {
Object newObject = new Object();
existing = concurrentMap.putIfAbsent(key, newObject); // atomic operation
// another thread might have already stored mapping for key
if (existing == null) {
return newObject;
}
}
return existing;
Note that it doesn't protect you from two threads simultaneously calling new Object() (which can be an issue if creating the new object is expensive), but it allows you to avoid explicit synchronization altogether.
Update: as for double-checked locking, in your case, it should look as follows:
Object value = concurrentMap.get(key);
if (value == null) {
synchronized (lock) {
value = concurrentMap.get(key);
if (value == null) {
value = new Object();
concurrentMap.put(key, value);
}
}
}
return value;
In Java, I want to do something like this:
Object r = map.get(t);
if (r == null) {
r = create(); // creating r is an expensive operation.
map.put(t, r);
}
Now that snippet of code can be executed in a multithreaded environment.
map can be a ConcurrentHashMap.
But how do I make that logic atomic?
Please don't give me trivial solution like a 'synchronized' block.
I'd expect this problem can be solved neatly once and for all.
It's been solved neatly by Guava.
Use CacheBuilder and call build with a CacheLoader. This will return a LoadingCache object. If you really need a Map implementation, you can call asMap().
There's also the older MapMaker with its makeComputingMap, but that's deprecated in favor of the CacheBuilder approach.
Of course you can also implement it manually, but doing that correctly is nontrivial. Several aspects to consider are:
you want to avoid calling create twice with the same input
you want to wait for a current thread to finish creating but don't want to do that with an idle loop
you want to avoid synchronizing in the good case (i.e. element is already in the map).
if two create calls happen at the same time you want each caller to only wait for the one relevant to him.
try
value = map.get(key);
if(value == null) {
map.putIfAbsent(key, new Value());
value = map.get(key);
}
return value;
Since Java 8, method ConcurrentMap.computeIfAbsent is what you are looking for:
equivalent to the following steps for this map, but atomic:
V oldValue = map.get(key);
if (oldValue == null) {
V newValue = mappingFunction.apply(key);
if (newValue != null) {
return map.putIfAbsent(key, newValue);
} else {
return null;
}
} else {
return oldValue;
}
The most common usage is to construct a new object serving as an initial mapped value or memoized result, which is what you are looking for, I think, as in:
Value v = map.computeIfAbsent(key, k -> new Value(f(k)));
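A slightly fuller sketch of that usage, with a counter added so you can see how often the expensive creation runs. The class LazyRegistry, its create method and the creations counter are illustrative names, and create only stands in for the real expensive work.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class LazyRegistry {
    private final ConcurrentHashMap<String, Object> map = new ConcurrentHashMap<>();
    final AtomicInteger creations = new AtomicInteger(); // illustration only

    // Stands in for the expensive create() from the question.
    private Object create(String key) {
        creations.incrementAndGet();
        return new Object();
    }

    Object get(String key) {
        // With ConcurrentHashMap the mapping function is applied at most once per key.
        return map.computeIfAbsent(key, this::create);
    }
}
```

One caveat from the ConcurrentHashMap documentation: other threads updating the map may be blocked while the function runs, so it should be short and must not try to update other mappings of the same map.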
I know this maybe isn't what you're looking for, but I'll include it for sake of argument.
public Object ensureExistsInMap(Map map, Object t) {
Object r = map.get(t);
if (r != null) return r; // we know for sure it exists
synchronized (creationLock) {
// multiple threads might have come this far if r was null
// outside the synchronized block
r = map.get(t);
if (r != null) return r;
r = create();
map.put(t, r);
return r;
}
}
What you describe is basically the Multiton pattern with lazy initialization.
Here is an example using double-checked locking with modern Java locks:
private static Map<Object, Object> instances = new ConcurrentHashMap<Object, Object>();
private static Lock createLock = new ReentrantLock();
private Multitone() {}
public static Object getInstance(Object key) {
Object instance = instances.get(key);
if (instance == null) {
createLock.lock();
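try {
instance = instances.get(key); // re-read under the lock: another thread may have created it meanwhile
if (instance == null) {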
instance = createInstance();
instances.put(key, instance);
}
} finally {
createLock.unlock();
}
}
return instance;
}
I think the solution is documented in Java Concurrency in Practice.
The trick is to use a Future instead of R as the object in the map.
I dislike this answer, though, because it looks far too complex.
Here is the code:
public class Memorizer<A, V> implements Computable<A, V> {
private final ConcurrentMap<A, Future<V>> cache = new ConcurrentHashMap<A, Future<V>>();
private final Computable<A, V> c;
public Memorizer(Computable<A, V> c) { this.c = c; }
public V compute(final A arg) throws InterruptedException {
while (true) {
Future<V> f = cache.get(arg);
if (f == null) {
Callable<V> eval = new Callable<V>() {
public V call() throws InterruptedException {
return c.compute(arg);
}
};
FutureTask<V> ft = new FutureTask<V>(eval);
f = cache.putIfAbsent(arg, ft);
if (f == null) { f = ft; ft.run(); }
}
try {
return f.get();
} catch (CancellationException e) {
cache.remove(arg, f);
} catch (ExecutionException e) {
throw launderThrowable(e.getCause());
}
}
}