How to handle concurrency in this case? - java

I have a ConcurrentHashMap:
ConcurrentHashMap<String, Integer> count =new ConcurrentHashMap<String, Integer>();
I will use like this:
private Integer somefunction(){
Integer order;
synchronized (this) {
if (count.containsKey(key)) {
order = count.get(key);
count.put(key, order + 1);
} else {
order = 0;
count.put(key, order + 1);
}
}
return order;
}
But as you can see, this is not ideal for concurrency, since only values under the same key can interfere with each other. Different keys don't interfere with each other, so it's not necessary to synchronize all operations. I want to synchronize only when the key is the same.
Can I do something that achieves better concurrent performance?
(I know ConcurrentHashMap and synchronized are kind of redundant here, but let's focus on whether we can synchronize only when the key is the same.)

The whole point of ConcurrentHashMap is to facilitate concurrent operations. Here's how you can do an atomic update with no need for explicit synchronization:
private Integer somefunction() {
Integer oldOrder;
// Insert key if it isn't already present.
oldOrder = count.putIfAbsent(key, 1);
if (oldOrder == null) {
return 0;
}
// If we get here, oldOrder holds the previous value.
// Atomically update it.
while (!count.replace(key, oldOrder, oldOrder + 1)) {
oldOrder = count.get(key);
}
return oldOrder;
}
See the Javadocs for putIfAbsent() and replace() for details.
As Tagir Valeev points out in his answer, you can use merge() instead if you're on Java 8, which would shorten the code above to:
private Integer somefunction() {
return count.merge(key, 1, Integer::sum) - 1;
}
Another option would be to let the values be AtomicInteger instead. See hemant1900's answer for how to do so.

I think this might be better and simpler -
private final ConcurrentHashMap<String, AtomicInteger> count = new ConcurrentHashMap<String, AtomicInteger>();
private Integer someFunction(String key){
AtomicInteger order = count.get(key);
if (order == null) {
final AtomicInteger value = new AtomicInteger(0);
order = count.putIfAbsent(key, value);
if (order == null) {
order = value;
}
}
return order.getAndIncrement();
}

It's very easy if you can use Java-8:
return count.merge(key, 1, Integer::sum)-1;
No additional synchronization is necessary. The merge method is guaranteed to be executed atomically.
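A minimal sketch of how the merge() approach could be exercised from several threads. The class name OrderCounter, the method nextOrder, and the handOut harness are hypothetical names introduced only for illustration; the single line inside nextOrder is the one from the answer.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class OrderCounter {
    private final ConcurrentMap<String, Integer> count = new ConcurrentHashMap<>();

    // Returns the number of earlier calls made with the same key (0 for the first call).
    int nextOrder(String key) {
        // merge() stores 1 if the key is absent, otherwise old + 1, atomically per key.
        return count.merge(key, 1, Integer::sum) - 1;
    }

    // Hypothetical harness: 'threads' workers each take 'perThread' order numbers
    // for the same key; returns the set of distinct numbers that were handed out.
    static Set<Integer> handOut(int threads, int perThread) {
        OrderCounter counter = new OrderCounter();
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < perThread; i++) {
                    seen.add(counter.nextOrder("a"));
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

If the increments are atomic, every call receives a different order number, so the returned set should contain exactly threads * perThread elements.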

First of all, where does key even come from?
Secondly, if key will never be the same for two threads running that function at any one time you don't need to synchronize any part of the function.
If, however, two threads could have the same key at the same time then you only need:
synchronized(count) {
count.put(key, order + 1);
}
The reason is that only concurrent mutation of an object's variables needs to be synchronized. But the fact that you are using a ConcurrentHashMap should eliminate this problem (double check me on this), so no synchronization is needed.
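Taking up the "double check" invitation: ConcurrentHashMap makes each individual call thread-safe, but a get() followed by a put() is still two separate calls, so another thread can slip in between them. The sketch below replays that interleaving deterministically on a single thread instead of relying on a real race; the class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class LostUpdateDemo {
    // Replays, on one thread, the interleaving two threads could produce
    // with an unsynchronized get-then-put: both read before either writes.
    static int unsafeInterleaving() {
        ConcurrentMap<String, Integer> count = new ConcurrentHashMap<>();
        count.put("k", 0);
        Integer readByA = count.get("k");   // "thread A" reads 0
        Integer readByB = count.get("k");   // "thread B" reads 0 before A writes
        count.put("k", readByA + 1);        // A writes 1
        count.put("k", readByB + 1);        // B also writes 1: A's increment is lost
        return count.get("k");
    }

    // The same two increments expressed as atomic read-modify-write calls.
    static int atomicIncrements() {
        ConcurrentMap<String, Integer> count = new ConcurrentHashMap<>();
        count.put("k", 0);
        count.merge("k", 1, Integer::sum);
        count.merge("k", 1, Integer::sum);
        return count.get("k");
    }
}
```

The first method ends with 1 even though two increments ran, while the second ends with 2. Some form of atomic update (merge, compute, replace in a loop, or a lock) is therefore still needed when two threads can share a key.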

Here is how I do this,
private Integer somefunction(){
Integer order = count.compute(key, (String k, Integer v) -> {
if (v == null)
return 1;
else {
return v + 1;
}
});
return order-1;
}
This avoids repeatedly retrying replace(key, oldValue, newValue).
Will this be better for concurrency?
The problem is that a lot of environments don't support JDK 8 yet.

Related

Data Structure that performs set(), get(), setAll() in O(1) [duplicate]

This question already has answers here:
Interview question: data structure to set all values in O(1)
(18 answers)
Closed 9 months ago.
I'm trying to write a data structure that is capable to set all the Values in O(1).
My code:
public class myData {
boolean setAllStatus = false;
HashMap<Integer, Integer> hasMap = new HashMap<>();
int setAllValue = 0;
int count = 0;
public void set(int key, int value) {
hasMap.put(key, value);
}
public int get(int key) {
if (setAllStatus) {
if (hasMap.get(key) != null) {
if (count == hasMap.size()) {
return setAllValue;
} else {
// do something
}
} else {
throw new NullPointerException();
}
} else {
if (hasMap.get(key) == null) {
throw new NullPointerException();
} else {
return hasMap.get(key);
}
}
}
public void setAll(int value) {
setAllStatus = true;
setAllValue = value;
count = hasMap.size();
}
public static void main(String[] args) {
myData m = new myData();
m.set(1, 4);
m.set(4, 5);
System.out.println(m.get(4)); // 5
m.setAll(6);
System.out.println(m.get(4)); // 6
m.set(8, 7);
System.out.println(m.get(8)); // 7
}
}
When I set variables for the first time and then set all the values to a specific variable it works, but when I try to put a new variable after setting all the variables I'm a bit confused.
What kind of solution can I use to make it work?
If you want to enhance your knowledge of data structures, I suggest implementing your own version of a hash table from the ground up (define an array of buckets, learn how to store elements in a bucket, how to resolve collisions and so on...) instead of decorating HashMap.
Your current code is very contrived:
By its nature, get() should not do anything apart from retrieving a value associated with a key because that's the only responsibility of this method (have a look at the implementation of get() in the HashMap class). Get familiar with the Single responsibility principle.
The idea of throwing an exception when the given key isn't present in the map is strange. And NullPointerException is not the right type of exception to describe this case, NoSuchElementException would be more intuitive.
You might also be interested in learning What does it mean to "program to an interface"?
And the main point is that you've picked the wrong starting point (see the advice at the very beginning). Learn more about data structures starting from the simplest, like a dynamic array, try to implement them from scratch, and gradually learn more about class design and language features.
Time complexity
Regarding the time complexity, since your class decorates a HashMap, the methods set() and get() would run in constant time O(1).
If you need to change all the values in a HashMap, that can be done only in linear time O(n). Assuming that all existing values are represented by objects that are distinct from one another, it's inherently impossible to perform this operation in constant time because we need to make this change for every node in each bucket.
The only situation when all values can be set to a new value in a constant time is the following very contrived example where each and every key would be associated with the same object, and we need to maintain a reference to this object (i.e. it would always retrieve the same value for every key that is present, which doesn't seem to be particularly useful):
public class SingleValueMap<K, V> {
private Map<K, V> map = new HashMap<>();
private V commonValue;
public void setAll(V newValue) {
this.commonValue = newValue;
}
public void add(K key) {
map.put(key, commonValue);
}
public void add(K key, V newValue) {
setAll(newValue);
map.put(key, commonValue);
}
public V get(K key) {
if (!map.containsKey(key)) throw new NoSuchElementException();
return commonValue;
}
}
And since we are no longer using the actual HashMap's functionality for storing the values, HashMap can be replaced with HashSet:
public class SingleValueMap<K, V> {
private Set<K> set = new HashSet<>();
private V commonValue;
public void setAll(V newValue) {
this.commonValue = newValue;
}
public void add(K key) {
set.add(key);
}
public void add(K key, V newValue) {
setAll(newValue);
set.add(key);
}
public V get(K key) {
if (!set.contains(key)) throw new NoSuchElementException();
return commonValue;
}
}
If I'm understanding the problem here correctly, every time setAll is called, we effectively forget about all the values of the HashMap and track only its keys basically as if it were a HashSet, where get uses the value passed into setAll. Additionally, any new set calls should still track both the key and the value until setAll is called some time later.
In other words, you need to track the set of keys before setAll, and the set of key-and-values after setAll separately in order to be able to distinguish them.
See if you can find a way, either amortized or through constant-time operations, to keep track of which keys are and are not associated with the latest setAll operation.
Given that this looks like a homework problem, I am hesitating to help further (as per these SO guidelines), but if this is not homework, let me know and I can delve further into this topic.
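For readers who are not bound by the homework concern, here is a minimal sketch of one way to track which keys were written before the latest setAll: stamp every entry with a generation counter that setAll increments. The class name SetAllMap, the inner Entry class, and the field names are hypothetical, and the O(1) bounds are the usual expected-time bounds of HashMap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class SetAllMap {
    // Each entry remembers its value and the setAll "generation" in which it was written.
    private static final class Entry {
        final int value;
        final long generation;

        Entry(int value, long generation) {
            this.value = value;
            this.generation = generation;
        }
    }

    private final Map<Integer, Entry> entries = new HashMap<>();
    private long generation = 0;   // incremented by every setAll call
    private int setAllValue;       // value passed to the latest setAll

    void set(int key, int value) {
        entries.put(key, new Entry(value, generation));
    }

    void setAll(int value) {
        setAllValue = value;
        generation++;              // marks every existing entry as stale without touching it
    }

    int get(int key) {
        Entry e = entries.get(key);
        if (e == null) {
            throw new NoSuchElementException("no such key: " + key);
        }
        // An entry written before the latest setAll is overridden by it.
        return e.generation < generation ? setAllValue : e.value;
    }
}
```

With the calls from the question's main method, get(4) yields 5 before setAll(6) and 6 afterwards, and a later set(8, 7) yields 7 from get(8) because that entry carries the current generation.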

Do I need client side locking while checking if a key exist or not in ConcurrentHashMap?

I know I could use the putIfAbsent of ConcurrentHashMap. However, I need to make a webservice call to fetch a value for the given key if it does not exist and then store it (a kind of caching) so I don't need to call the webservice the next time the same key is used. Which of the following is correct? I think the second version, with synchronized, is necessary.
Update 1: I cannot use any functional interface.
Update 2: Updating the code snippet based on the reply from Costi Ciudatu
private static final Map<String, String> KEY_VALUE_MAP = new ConcurrentHashMap<>();
public String getValueVersion1(String key) {
String value = KEY_VALUE_MAP.get(key);
if (value != null) {
return value;
}
// Else fetch the value for the key from the webservice.
value = doRestCall(key);
KEY_VALUE_MAP.putIfAbsent(key, value);
return value;
} // Version 1 Ends here.
public synchronized String getValueVersion2(String key) {
String value = KEY_VALUE_MAP.get(key);
if (value != null) {
return value;
}
// Else fetch the value for the key from the webservice.
value = doRestCall(key);
KEY_VALUE_MAP.put(key, value);
return value;
} // Version 2 ends here.
You should have a look at ConcurrentMap#computeIfAbsent which does that for you atomically:
return KEY_VALUE_MAP.computeIfAbsent(key, this::doRestCall);
Edit (to address your "no functional interface" constraints):
You only need "client side locking" if you want to make sure that you only invoke doRestCall once for any given key. Otherwise, this code would work just fine:
final String value = KEY_VALUE_MAP.get(key);
if (value == null) {
// multiple threads may call this in parallel
final String candidate = doRestCall(key);
// but only the first result will end up in the map
final String winner = KEY_VALUE_MAP.putIfAbsent(key, candidate);
// local computation result gets lost if another thread made it there first
// otherwise, our "candidate" is the "winner"
return winner != null ? winner : candidate;
}
return value;
However, if you do want to enforce that doRestCall is invoked only once for any given key (my guess is you don't really need this), you will need some sort of synchronization. But try to be a bit more creative than the "all-or-nothing" approach in your examples:
final String value = KEY_VALUE_MAP.get(key);
if (value != null) {
return value;
}
synchronized(KEY_VALUE_MAP) {
final String existing = KEY_VALUE_MAP.get(key);
if (existing != null) { // double-check
return existing;
}
final String result = doRestCall(key);
KEY_VALUE_MAP.put(key, result); // no need for putIfAbsent
return result;
}
If you want to use this second (paranoid) approach, you can also consider using the key itself for locking (to narrow the scope to the minimum). But this would probably require you to manage your own pool of keys, as synchronized (key.intern()) is not good practice.
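A minimal sketch of what such a pool could look like, assuming one dedicated lock object per key stored in a second ConcurrentHashMap. The class PerKeyCache, the locks map, lockFor, and the remoteCalls counter are hypothetical names; doRestCall is a stand-in for the real web service call and is assumed never to return null. No functional interfaces are used, in line with the constraint in the question.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

class PerKeyCache {
    private final ConcurrentMap<String, String> values = new ConcurrentHashMap<>();
    // Hypothetical "pool of keys": one dedicated lock object per key.
    private final ConcurrentMap<String, Object> locks = new ConcurrentHashMap<>();
    // Counts stand-in remote calls so the behaviour can be observed.
    final AtomicInteger remoteCalls = new AtomicInteger();

    String getValue(String key) {
        String value = values.get(key);
        if (value != null) {
            return value;
        }
        synchronized (lockFor(key)) {
            value = values.get(key);          // double-check under the per-key lock
            if (value == null) {
                value = doRestCall(key);
                values.put(key, value);
            }
            return value;
        }
    }

    // Returns the single lock object associated with this key, creating it if needed.
    private Object lockFor(String key) {
        Object fresh = new Object();
        Object existing = locks.putIfAbsent(key, fresh);
        return existing != null ? existing : fresh;
    }

    // Stand-in for the real web service call; assumed never to return null.
    String doRestCall(String key) {
        remoteCalls.incrementAndGet();
        return "value-for-" + key;
    }
}
```

Threads asking for different keys no longer block each other. Note that in this sketch the locks map grows by one entry per distinct key and is never cleaned up, which may or may not be acceptable depending on how many keys there are.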
This all relies on the fact that your doRestCall() method never returns null. Otherwise, you'll have to wrap the map values within an Optional (or some home-made pre-java8 alternative).
As a (final) side note, in your code samples you inverted the use of put() and putIfAbsent() (the latter is the one to use with no external synchronization) and you read the value twice for null-checking.

Thread safe of operation in one entry

I want to do operations like
class A {
}
ConcurrentHashMap<A, Integer> map = new ConcurrentHashMap<>();
public void fun() {
Integer count = map.get(Object);
if (count != null) {
map.put(Object, count+1);
}
}
public void add() {
// increase Object count by 1
}
public void remove() {
// deduct Object count by 1
}
How can I make fun() thread safe ?
I know a way to do this is to add synchronized block
public void fun() {
synchronized("") {
Integer count = map.get(Object);
if (count != null) {
map.put(Object, count+1);
}
}
}
But are there any other ways to do it ?
Or are there any libraries to do it ?
like thread safe entry processor ?
I also want to implement something like
public void remove() {
int count = map.get(Object);
count -= 5;
if (count <= 0) {
map.remove(Object);
} else {
map.put(Object, count + 2);
}
}
Any ways to do this ?
Thank you
Use AtomicInteger and ConcurrentHashMap.putIfAbsent().
Also look at
ConcurrentHashMap.remove(key, value), which removes the key only if it is mapped to the given value.
I am not sure whether it is possible to implement the exact logic (which is not very well defined in the question above), but those methods could be very useful for doing something similar.
More hints (which may or may not be useful):
You can (probably!) use the methods computeIfAbsent, computeIfPresent (or replace), and remove(key, value).
The ConcurrentHashMap could be defined with Integer values.
It would be a very dirty solution, and I do not recommend using it, but as something to think about, it could be very challenging.
Let me know if you need more hints.
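As one possible reading of these hints, and assuming Java 8 is available, the per-entry operations could be sketched with merge and computeIfPresent on a map with Integer values. The class EntryCounter and its method names are hypothetical, and subtract deliberately simplifies the arithmetic of the question's remove() to "subtract, and drop the entry when the count reaches zero or less".

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class EntryCounter<K> {
    private final ConcurrentMap<K, Integer> map = new ConcurrentHashMap<>();

    // Adds one to the count, creating the entry at 1 if it is absent.
    void add(K key) {
        map.merge(key, 1, Integer::sum);
    }

    // Mirrors fun() from the question: increments only if the key is present.
    void incrementIfPresent(K key) {
        map.computeIfPresent(key, (k, count) -> count + 1);
    }

    // Subtracts 'amount'; returning null from the remapping function removes the entry.
    void subtract(K key, int amount) {
        map.computeIfPresent(key, (k, count) -> count - amount <= 0 ? null : count - amount);
    }

    // Returns the current count, or null if the key is not present.
    Integer get(K key) {
        return map.get(key);
    }
}
```

Each of these calls is applied atomically to a single entry by ConcurrentHashMap, so no synchronized block is needed around them.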

Is this synchronization on ConcurrentHashMap correct?

I have a key-value map accessed by multiple threads:
private final ConcurrentMap<Key, VersionValue> key_vval_map = new ConcurrentHashMap<Key, VersionValue>();
My custom get() and put() methods follow the typical check-then-act pattern. Therefore, synchronization is necessary to ensure atomicity. To avoid locking the whole ConcurrentHashMap, I define:
private final Object[] locks = new Object[10];
{
for(int i = 0; i < locks.length; i++)
locks[i] = new Object();
}
And the get() method goes (it calls the get() method of ConcurrentHashMap):
public VersionValue get(Key key)
{
final int hash = key.hashCode() & 0x7FFFFFFF;
synchronized (locks[hash % locks.length]) // I am not sure whether this synchronization is necessary.
{
VersionValue vval = this.key_vval_map.get(key);
if (vval == null)
return VersionValue.RESERVED_VERSIONVALUE; // RESERVED_VERSIONVALUE is defined elsewhere
return vval;
}
}
The put() method goes (it calls the get() method above):
public void put(Key key, VersionValue vval)
{
final int hash = key.hashCode() & 0x7FFFFFFF;
synchronized (locks[hash % locks.length]) // allowing concurrent writers
{
VersionValue current_vval = this.get(key); // call the get() method above
if (current_vval.compareTo(vval) < 0) // it is an newer VersionValue
this.key_vval_map.put(key, vval);
}
}
The above code works. But, as you know, working is far from being correct in multi-threaded programming.
My questions are :
Is this synchronization mechanism (especially synchronized (locks[hash % locks.length])) necessary and correct in my code?
In Javadoc on Interface Lock, it says
Lock implementations provide more extensive locking operations than
can be obtained using synchronized methods and statements.
Then is it feasible to replace synchronization by Lock in my code?
Edit: If you are using Java-8, don't hesitate to refer to the answer by @nosid.
ConcurrentMap allows you to use optimistic locking instead of explicit synchronization:
VersionValue current_vval = null;
VersionValue new_vval = null;
do {
current_vval = key_vval_map.get(key);
VersionValue effectiveVval = current_vval == null ? VersionValue.RESERVED_VERSIONVALUE : current_vval;
if (effectiveVval.compareTo(vval) < 0) {
new_vval = vval;
} else {
break;
}
} while (!replace(key, current_vval, new_vval));
...
private boolean replace(Key key, VersionValue current, VersionValue newValue) {
if (current == null) {
return key_vval_map.putIfAbsent(key, newValue) == null;
} else {
return key_vval_map.replace(key, current, newValue);
}
}
It will probably have better performance under low contention.
Regarding your questions:
1. If you use Guava, take a look at Striped.
2. No, you don't need the additional functionality of Lock here.
If you are using Java-8, you can use the method ConcurrentHashMap::merge instead of reading and updating the value in two steps.
public VersionValue get(Key key) {
return key_vval_map.getOrDefault(key, VersionValue.RESERVED_VERSIONVALUE);
}
public void put(Key key, VersionValue vval) {
key_vval_map.merge(key, vval,
(lhs, rhs) -> lhs.compareTo(rhs) >= 0 ? lhs : rhs);
}
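To make the merge-based version concrete, here is a self-contained sketch in which Versioned is a hypothetical stand-in for the question's VersionValue (ordered by a version number only) and String stands in for Key. Only getOrDefault and merge are taken from the answer.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class VersionedStore {
    // Hypothetical stand-in for VersionValue: ordered by version number only.
    static final class Versioned implements Comparable<Versioned> {
        final long version;
        final String payload;

        Versioned(long version, String payload) {
            this.version = version;
            this.payload = payload;
        }

        @Override
        public int compareTo(Versioned other) {
            return Long.compare(version, other.version);
        }
    }

    // Hypothetical counterpart of VersionValue.RESERVED_VERSIONVALUE.
    static final Versioned RESERVED = new Versioned(-1, "");

    private final ConcurrentMap<String, Versioned> map = new ConcurrentHashMap<>();

    Versioned get(String key) {
        return map.getOrDefault(key, RESERVED);
    }

    void put(String key, Versioned candidate) {
        // lhs is the value already in the map, rhs is the candidate passed to merge().
        map.merge(key, candidate, (lhs, rhs) -> lhs.compareTo(rhs) >= 0 ? lhs : rhs);
    }
}
```

One difference from the question's put() is worth keeping in mind: merge() stores the candidate unconditionally when the key is absent, whereas the original code first compared the candidate against the reserved value.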

How to implement efficient hash cons with java HashSet

I am trying to implement a hash cons in java, comparable to what String.intern does for strings. I.e., I want a class to store all distinct values of a data type T in a set and provide an T intern(T t) method that checks whether t is already in the set. If so, the instance in the set is returned, otherwise t is added to the set and returned. The reason is that the resulting values can be compared using reference equality since two equal values returned from intern will for sure also be the same instance.
Of course, the most obvious candidate data structure for a hash cons is java.util.HashSet<T>. However, it seems that its interface is flawed and does not allow efficient insertion, because there is no method to retrieve an element that is already in the set or insert one if it is not in there.
An algorithm using HashSet would look like this:
class HashCons<T>{
HashSet<T> set = new HashSet<>();
public T intern(T t){
if(set.contains(t)) {
return ???; // <----- PROBLEM
} else {
set.add(t); // <--- Inefficient, second hash lookup
return t;
}
}
}
As you see, the problem is twofold:
This solution would be inefficient since I would access the hash table twice, once for contains and once for add. But okay, this may not be a too big performance hit since the correct bucket will be in the cache after the contains, so add will not trigger a cache miss and thus be quite fast.
I cannot retrieve an element already in the set (see line flagged PROBLEM). There is just no method to retrieve the element in the set. So it is just not possible to implement this.
Am I missing something here? Or is it really impossible to build a usual hash cons with java.util.HashSet?
I don't think it's possible using HashSet. You could use some kind of Map instead and use your value as both key and value. The java.util.concurrent.ConcurrentMap also happens to possess the quite convenient method
putIfAbsent(K key, V value)
that returns the value if it is already existent. However, I don't know about the performance of this method (compared to checking "manually" on non-concurrent implementations of Map).
Here is how you would do it using a HashMap:
class HashCons<T>{
Map<T,T> map = new HashMap<T,T>();
public T intern(T t){
if (!map.containsKey(t))
map.put(t,t);
return map.get(t);
}
}
I think the reason why it is not possible with HashSet is quite simple: to the set, if contains(t) is fulfilled, it means that the given t equals one of the t' in the set. There is no reason to be able to return it (as you already have it).
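As a variation on the HashMap version above, and assuming Java 8 or later, Map.putIfAbsent lets intern be written as a single map call instead of containsKey, put, and get. This is a sketch, not a measured improvement.

```java
import java.util.HashMap;
import java.util.Map;

class HashCons<T> {
    private final Map<T, T> map = new HashMap<>();

    // Returns the canonical instance equal to t, registering t if none exists yet.
    T intern(T t) {
        // putIfAbsent returns the existing value, or null if t was just inserted.
        T existing = map.putIfAbsent(t, t);
        return existing != null ? existing : t;
    }
}
```

Two equal but distinct objects passed to intern both come back as the first one, so the results can be compared with ==. Like HashMap itself, this sketch is not thread-safe.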
Well, HashSet is implemented as a HashMap wrapper in OpenJDK, so you won't save any memory compared to the solution suggested by aRestless.
10-min sketch
class HashCons<T> {
T[] table;
int size;
int sizeLimit;
HashCons(int expectedSize) {
init(Math.max(Integer.highestOneBit(expectedSize * 2) * 2, 16));
}
private void init(int capacity) {
table = (T[]) new Object[capacity];
size = 0;
sizeLimit = (int) (capacity * 2L / 3);
}
T cons(@Nonnull T key) {
int mask = table.length - 1;
int i = key.hashCode() & mask;
do {
if (table[i] == null) break;
if (key.equals(table[i])) return table[i];
i = (i + 1) & mask;
} while (true);
table[i] = key;
if (++size > sizeLimit) rehash();
return key;
}
private void rehash() {
T[] table = this.table;
if (table.length == (1 << 30))
throw new IllegalStateException("HashCons is full");
init(table.length << 1);
for (T key : table) {
if (key != null) cons(key);
}
}
}
