Multi threading with a ConcurrentHashMap

I'm trying to create a method backed by a ConcurrentHashMap with the following behavior:
Reads take no lock.
Writes take a lock.
Prior to writing, read to see if the record exists.
If it still doesn't exist, save it to the database and add the record to the map.
If the record exists from a previous write, just return the record.
My thoughts:
private Object lock1 = new Object();
private ConcurrentHashMap<String, Object> productMap;
private Object getProductMap(String name) {
if (productMap.isEmpty()) {
productMap = new ConcurrentHashMap<>();
}
if (productMap.containsKey(name)) {
return productMap.get(name);
}
synchronized (lock1) {
if (productMap.containsKey(name)) {
return productMap.get(name);
} else {
Product product = new Product(name);
session.save(product);
productMap.putIfAbsent(name, product);
}
}
}
Could someone help me to understand if this is a correct approach?

There are several bugs here.
If productMap isn't guaranteed to be initialized, you will get an NPE in the first statement of this method.
The method isn't guaranteed to return anything if the map is empty.
The method doesn't return on all paths.
The method is both poorly named and unnecessary; you're trying to emulate putIfAbsent, which accomplishes half of your goal.
You also don't need to do any synchronization; ConcurrentHashMap is thread safe for your purposes.
If I were to rewrite this, I'd do a few things differently:
Eagerly instantiate the ConcurrentHashMap
Bind it to ConcurrentMap instead of the concrete class (so ConcurrentMap<String, Product> productMap = new ConcurrentHashMap<>();)
Rename the method to putIfMissing and delegate to putIfAbsent, with some logic to return the same record I want to add if the result is null. The above absolutely depends on Product having a well-defined equals and hashCode method, such that new Product(name) will produce objects with the same values for equals and hashCode if provided the same name.
Use an Optional to avoid any NPEs with the result of putIfAbsent, and to provide easier to digest code.
A snippet of the above:
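public Product putIfMissing(String key) {
    Product product = new Product(key);
    // putIfAbsent returns the existing value, or null if this call inserted product
    Optional<Product> existing =
        Optional.ofNullable(productMap.putIfAbsent(key, product));
    if (!existing.isPresent()) {
        // save only the record that this call actually added to the map
        session.save(product);
    }
    return existing.orElse(product);
}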

Related

Is it safe to use hashmap value reference when it may be updated in another thread

Is it safe to call getParameter while another thread may be calling update?
I can tolerate a value that is not the latest,
as long as I get the latest value of Parameter the next time I read it.
The code looks like this:
public class ParameterManager {
private volatile Map<String, Parameter> scenarioParameterMap = Maps.newHashMap();
public ParameterManager(String appName) throws DarwinClientException {
}
public Parameter getParameter(String scenario) {
return scenarioParameterMap.get(scenario);
}
public void update(String scenario, Map<String, String> parameters) {
if (scenarioParameterMap.containsKey(scenario)) {
Parameter parameter = scenarioParameterMap.get(scenario);
parameter.update(parameters);
} else {
scenarioParameterMap.put(scenario, new Parameter(scenario, parameters));
}
}
}
Or should update simply do this?
scenarioParameterMap.put(scenario, new Parameter(scenario, parameters));

volatile does not help here at all. It only protects the reference held in scenarioParameterMap, not the contents of that map. Since you're not reassigning it to point to a different map at any point, volatile is extraneous.
This code is not threadsafe. You need to use proper synchronization, be that via synchronized, or using a concurrent map, or other equivalent method.
Since I can tolerate the value is not latest.
Thread non-safety can be more dangerous than that. It could give you wrong results. It could crash. You can't get by thinking that the worst case is stale data. That's not the case.
Imagine that Map.put() is in the middle of updating the map and has the internal data in some temporarily invalid state. If Map.get() runs at the same time who knows what might go wrong. Sometimes adding an entry to a hash map will cause the whole thing to be reallocated and re-bucketed. Another thread reading the map at that time would be very confused.
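One way to apply this advice is sketched below. It assumes Parameter can be made immutable; the Parameter class shown is a hypothetical stand-in, not the asker's real class. The map becomes a ConcurrentHashMap held in a final field, and update replaces the entry instead of mutating a shared object, so a reader sees either the old Parameter or the new one, never a half-updated state.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical immutable stand-in for the question's Parameter class.
final class Parameter {
    private final String scenario;
    private final Map<String, String> values;

    Parameter(String scenario, Map<String, String> values) {
        this.scenario = scenario;
        // defensive copy, so later changes to the caller's map are not visible here
        this.values = Collections.unmodifiableMap(new HashMap<>(values));
    }

    String getScenario() {
        return scenario;
    }

    Map<String, String> getValues() {
        return values;
    }
}

class ParameterManager {
    // final reference to a thread-safe map; volatile is not needed
    private final ConcurrentMap<String, Parameter> scenarioParameterMap =
            new ConcurrentHashMap<>();

    Parameter getParameter(String scenario) {
        return scenarioParameterMap.get(scenario);
    }

    void update(String scenario, Map<String, String> parameters) {
        // replace the whole value instead of mutating a shared Parameter in place
        scenarioParameterMap.put(scenario, new Parameter(scenario, parameters));
    }
}
```

With this shape, a Parameter obtained from getParameter may be stale after a later update, but it is always internally consistent.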

Behavior of entrySet().removeIf in ConcurrentHashMap

I would like to use ConcurrentHashMap to let one thread delete some items from the map periodically and other threads to put and get items from the map at the same time.
I'm using map.entrySet().removeIf(lambda) in the removing thread, and I'm wondering what assumptions I can make about its behavior. I can see that the removeIf method uses an iterator to go through the elements in the map, checks the given condition, and then removes matching elements using iterator.remove().
Documentation gives some info about ConcurrentHashMap iterators behavior:
Similarly, Iterators, Spliterators and Enumerations return elements
reflecting the state of the hash table at some point at or since the
creation of the iterator/enumeration. They do not throw ConcurrentModificationException. However, iterators are designed to be used by only one thread at a time.
As the whole removeIf call happens in one thread, I can be sure that the iterator is not used by more than one thread at a time. Still, I'm wondering if the course of events described below is possible:
Map contains mapping: 'A'->0
Deleting Thread starts executing map.entrySet().removeIf(entry->entry.getValue()==0)
Deleting Thread calls .iterator() inside the removeIf call and gets the iterator reflecting the current state of the collection
Another thread executes map.put('A', 1)
Deleting thread still sees 'A'->0 mapping (iterator reflects the old state) and because 0==0 is true it decides to remove A key from the map.
The map now contains 'A'->1 but deleting thread saw the old value of 0 and the 'A' ->1 entry is removed even though it shouldn't be. The map is empty.
I can imagine that the behavior may be prevented by the implementation in many ways. For example: maybe iterators are not reflecting put/remove operations but are always reflecting value updates or maybe the remove method of the iterator checks if the whole mapping (both key and value) is still present in the map before calling remove on the key. I couldn't find info about any of those things happening and I'm wondering if there's something which makes that use case safe.

I also managed to reproduce such a case on my machine.
I think the problem is that EntrySetView (which is returned by ConcurrentHashMap.entrySet()) inherits its removeIf implementation from Collection, which looks like this:
default boolean removeIf(Predicate<? super E> filter) {
Objects.requireNonNull(filter);
boolean removed = false;
final Iterator<E> each = iterator();
while (each.hasNext()) {
// `test` returns `true` for some entry
if (filter.test(each.next())) {
// entry has been just changed, `test` would return `false` now
each.remove(); // ...but we still remove
removed = true;
}
}
return removed;
}
In my humble opinion, this cannot be considered as a correct implementation for ConcurrentHashMap.

After discussion with user Zielu in comments below Zielu's answer I have gone deeper into the ConcurrentHashMap code and found out that:
ConcurrentHashMap implementation provides remove(key, value) method which calls replaceNode(key, null, value)
replaceNode checks if both key and value are still present in the map before removing so using it should be fine. Documentation says that it
Replaces node value with v, conditional upon match of cv if non-null.
In the case mentioned in the question ConcurrentHashMap's .entrySet() is called which returns EntrySetView class. Then removeIf method calls .iterator() which returns EntryIterator.
EntryIterator extends BaseIterator and inherits remove implementation that calls map.replaceNode(p.key, null, null) which disables conditional removal and just always removes the key.
The negative course of events could be still prevented if iterators always iterated over 'current' values and never returned old ones if some value is modified. I still don't know if that happens or not, but the test case mentioned below seems to verify the whole thing.
I think I have created a test case which shows that the behavior described in my question can really happen. Please correct me if there are any mistakes in the code.
The code starts two threads. One of them (DELETING_THREAD) removes all entries mapped to 'false' boolean value. Another one (ADDING_THREAD) randomly puts (1, true) or (1,false) values into the map. If it puts true in the value it expects that the entry will still be there when checked and throws an exception if it is not. It throws an exception quickly when I run it locally.
package test;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;
public class MainClass {
private static final Random RANDOM = new Random();
private static final ConcurrentHashMap<Integer, Boolean> MAP = new ConcurrentHashMap<Integer, Boolean>();
private static final Integer KEY = 1;
private static final Thread DELETING_THREAD = new Thread() {
@Override
public void run() {
while (true) {
MAP.entrySet().removeIf(entry -> entry.getValue() == false);
}
}
};
private static final Thread ADDING_THREAD = new Thread() {
@Override
public void run() {
while (true) {
boolean val = RANDOM.nextBoolean();
MAP.put(KEY, val);
if (val == true && !MAP.containsKey(KEY)) {
throw new RuntimeException("TRUE value was removed");
}
}
}
};
public static void main(String[] args) throws InterruptedException {
DELETING_THREAD.setDaemon(true);
ADDING_THREAD.start();
DELETING_THREAD.start();
ADDING_THREAD.join();
}
}
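A possible workaround, sketched here rather than taken from the thread, is to skip iterator.remove() and call the map's conditional remove(key, value) for each entry that passes the test. The helper name removeMatching is made up for this illustration. Because remove(key, value) only removes the key if it is still mapped to the given value, a concurrent put of a new value makes the removal a no-op.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Predicate;

class ConditionalRemoval {
    // Removes every entry whose value matches the filter, but only if the key
    // is still mapped to the value that was actually tested.
    static <K, V> boolean removeMatching(ConcurrentMap<K, V> map,
                                         Predicate<? super V> filter) {
        boolean removed = false;
        for (Map.Entry<K, V> entry : map.entrySet()) {
            K key = entry.getKey();
            V seen = entry.getValue();
            // remove(key, value) is atomic: it does nothing if another thread
            // has replaced the value since it was read above
            if (filter.test(seen) && map.remove(key, seen)) {
                removed = true;
            }
        }
        return removed;
    }
}
```

In the test case above, the deleting thread would call ConditionalRemoval.removeMatching(MAP, value -> !value) instead of MAP.entrySet().removeIf(...). Note that the comparison relies on the value type's equals method.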

Is java dynamic synchronization a good idea or allowed?

Basically, what is needed is to synchronize requests to each of the records.
The code I have in mind looks like this:
//member variable
ConcurrentHashMap<Long, Object> lockMap = new ConcurrentHashMap<Long, Object>();
//one method
private void maintainLockObjects(long id){
lockMap.putIfAbsent(id, new Object());
}
//the request method
void bar(long id){
maintainLockObjects(id);
synchronized(lockMap.get(id)){
//logic here
}
}

Have a look at ClassLoader.getClassLoadingLock:
Returns the lock object for class loading operations. For backward compatibility, the default implementation of this method behaves as follows. If this ClassLoader object is registered as parallel capable, the method returns a dedicated object associated with the specified class name. Otherwise, the method returns this ClassLoader object.
Its implementation code may look familiar to you:
protected Object getClassLoadingLock(String className) {
Object lock = this;
if (parallelLockMap != null) {
Object newLock = new Object();
lock = parallelLockMap.putIfAbsent(className, newLock);
if (lock == null) {
lock = newLock;
}
}
return lock;
}
The first null check is only for the mentioned backwards compatibility. So besides that, the only difference between this heavily used code and your approach is that this code avoids calling get afterwards, as putIfAbsent already returns the old object if there is one.
So the simple answer is: it works, and this pattern has proven itself in a really crucial part of Oracle’s JRE implementation.
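Applied to the per-record case, the same pattern might look like the sketch below. The class name RecordLocks and the methods lockFor and withLock are invented for this illustration; only the putIfAbsent idiom comes from the code quoted above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class RecordLocks {
    private final ConcurrentMap<Long, Object> lockMap = new ConcurrentHashMap<>();

    // Returns the single lock object associated with the given record id.
    Object lockFor(long id) {
        Object newLock = new Object();
        // putIfAbsent returns the previously stored lock, or null if newLock was stored
        Object existing = lockMap.putIfAbsent(id, newLock);
        return existing != null ? existing : newLock;
    }

    // Runs the action while holding the lock for this record id.
    void withLock(long id, Runnable action) {
        synchronized (lockFor(id)) {
            action.run();
        }
    }
}
```

One design consequence to keep in mind: nothing in this sketch ever removes a lock object, so the map grows with the number of distinct ids seen.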

Looking for concurrent map with functors

If I look at ConcurrentHashMap in Java, and specifically the putIfAbsent method, a typical usage of this method would be:
ConcurrentMap<String,Person> map = new ConcurrentHashMap<>();
map.putIfAbsent("John",new Person("John"));
The problem is that the Person object is always constructed, even when the key is already present.
Is there some helper collection (maybe some Java framework provides this)
that will give me behavior similar to ConcurrentHashMap, and that will work with a functor or any other means of constructing the value object,
such that the construction code (i.e. functor.execute()) is called only if the map does not contain a value for the given key?

The only way to do this is to use locking. You can minimise the impact of this by checking first.
if(!map.containsKey("John"))
synchronized(map) {
if(!map.containsKey("John"))
map.put("John", new Person("John"));
}
The reason you need locking is that you need to hold the map while you create the Person to prevent other threads trying to add the same object at the same time. ConcurrentMap doesn't support blocking operations like this directly.
If you need to minimise locking to a specific key you can do the following.
ConcurrentMap<String, AtomicReference<Person>> map = new ConcurrentHashMap<String, AtomicReference<Person>>();
String name = "John";
AtomicReference<Person> personRef = map.get(name);
if (personRef == null)
map.putIfAbsent(name, new AtomicReference<Person>());
personRef = map.get(name);
if (personRef.get() == null)
synchronized (personRef) {
if (personRef.get() == null)
// can take a long time without blocking use of other keys.
personRef.set(new Person(name));
}
Person person = personRef.get();
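Since Java 8, ConcurrentHashMap also offers computeIfAbsent, which takes the kind of functor the question asks for and, per its documentation, applies it at most once per key. The sketch below assumes Java 8 or later; the Person class is a hypothetical stand-in that counts constructions, and LazyPeople is a name made up for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the question's Person class; counts constructions.
final class Person {
    static final AtomicInteger CREATED = new AtomicInteger();
    final String name;

    Person(String name) {
        this.name = name;
        CREATED.incrementAndGet();
    }
}

class LazyPeople {
    private final ConcurrentMap<String, Person> map = new ConcurrentHashMap<>();

    Person get(String name) {
        // the constructor reference is invoked only when the key has no value yet
        return map.computeIfAbsent(name, Person::new);
    }
}
```

Other updates to the map may be blocked while the function runs, so this fits best when construction is short and does not touch other mappings of the same map.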

On using Enum based Singleton to cache large objects (Java)

Is there any better way to cache up some very large objects, that can only be created once, and therefore need to be cached ? Currently, I have the following:
public enum LargeObjectCache {
INSTANCE;
private Map<String, LargeObject> map = new HashMap<...>();
public LargeObject get(String s) {
if (!map.containsKey(s)) {
map.put(s, new LargeObject(s));
}
return map.get(s);
}
}
There are several classes that can use the LargeObjects, which is why I decided to use a singleton for the cache, instead of passing LargeObjects to every class that uses it.
Also, the map doesn't contain many keys (one or two, but the key can vary in different runs of the program) so, is there another, more efficient map to use in this case ?

You may need thread-safety to ensure you don't have two instances with the same name.
It doesn't matter much for small maps, but you can avoid one call, which can make it faster.
public LargeObject get(String s) {
synchronized(map) {
LargeObject ret = map.get(s);
if (ret == null)
map.put(s, ret = new LargeObject(s));
return ret;
}
}

As it has been pointed out, you need to address thread-safety. Simply using Collections.synchronizedMap() doesn't make it completely correct, as the code entails compound operations. Synchronizing the entire block is one solution. However, using ConcurrentHashMap will result in a much more concurrent and scalable behavior if it is critical.
public enum LargeObjectCache {
INSTANCE;
private final ConcurrentMap<String, LargeObject> map = new ConcurrentHashMap<...>();
public LargeObject get(String s) {
LargeObject value = map.get(s);
if (value == null) {
value = new LargeObject(s);
LargeObject old = map.putIfAbsent(s, value);
if (old != null) {
value = old;
}
}
return value;
}
}
You'll need to use it exactly in this form to have the correct and the most efficient behavior.
If you must ensure only one thread gets to even instantiate the value for a given key, then it becomes necessary to turn to something like the computing map in Google Collections or the memoizer example in Brian Goetz's book "Java Concurrency in Practice".
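A minimal sketch in the spirit of that memoizer idea follows. It is not the book's code: the class shape, the Function-based factory, and the error handling are assumptions made for illustration. The map stores a Future per key, so only the thread that wins putIfAbsent runs the factory, and other threads wait on the Future instead of building their own instance.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.function.Function;

class Memoizer<K, V> {
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> factory;

    Memoizer(Function<K, V> factory) {
        this.factory = factory;
    }

    V get(K key) {
        Future<V> future = cache.get(key);
        if (future == null) {
            Callable<V> call = () -> factory.apply(key);
            FutureTask<V> task = new FutureTask<>(call);
            future = cache.putIfAbsent(key, task);
            if (future == null) {
                // this thread won the race, so it alone runs the factory
                future = task;
                task.run();
            }
        }
        try {
            // threads that lost the race block here until the value is ready
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            // drop the failed entry so a later call can retry
            cache.remove(key, future);
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

For the cache above, usage would be along the lines of new Memoizer<String, LargeObject>(LargeObject::new) with get(s) as the lookup.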
