Should I use a ConcurrentHashMap?

A quick question about ConcurrentHashMap:
public Map<String, String> getA() {
    // get something from the db into a HashMap, let's call it x
    // ...
    // do some operations on x
    // ...
    // put the result in a ConcurrentHashMap, let's call it A
    // ...
    return A;
}
Does it make sense to have a ConcurrentHashMap or should I go with a HashMap?
1. HashMap
2. ConcurrentHashMap

If the map will be accessed from different threads, or the data will otherwise be operated on concurrently (a multithreaded delegate or the like), then yes, use ConcurrentHashMap. Otherwise, HashMap should do (given the information you've provided).
Based on reading your pseudo code, I get the impression that you are not working on different threads and therefore HashMap should suffice.

You might do better wrapping it in Collections.unmodifiableMap() if you don't want to worry about the clients of this method getting into race conditions when modifying/reading the map.
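
As a rough sketch of that suggestion, assuming no other thread touches the map while it is being built: fill a plain HashMap and hand callers a read-only view. The class name ReadOnlyResult and the sample data are hypothetical stand-ins for the database call in the question.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class ReadOnlyResult {
    // Hypothetical stand-in for the getA() method in the question.
    static Map<String, String> getA() {
        Map<String, String> x = new HashMap<>();   // pretend this came from the db
        x.put("id", " 42 ");

        Map<String, String> a = new HashMap<>();   // local to this call, not shared yet
        for (Map.Entry<String, String> e : x.entrySet()) {
            a.put(e.getKey(), e.getValue().trim());
        }
        // Callers get a read-only view; put/remove throw UnsupportedOperationException.
        return Collections.unmodifiableMap(a);
    }
}
```

Because the map is only ever written inside the method, a HashMap is enough here; the wrapper just stops callers from modifying it afterwards.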

Related

thread safe LinkedHashMap without Collections.synchronized

I am using a LinkedHashMap in a multithreaded environment, so this structure needs to be thread safe. During specific events I need to read the entire map, push it to the db, and clear it.
Most of the time only writes happen to this map. The map has a limit of 50 entries.
I am using Oracle MAF and it does not have Collections.synchronizedMap available. So, what do I need to put in synchronized blocks to make sure writing and reading don't hit a ConcurrentModificationException, etc.?
A few requirements:
I need it to behave like a circular queue, so I am overriding the removeEldestEntry method of the LinkedHashMap.
I need to preserve the order.

So, what are things I need to put in synchronized blocks to make sure writing and reading doesn't hit me concurrentModificationException etc
Every method call should be in a synchronized block.
The tricky one being the use of an Iterator, as you have to hold the lock for the life of the Iterator. e.g.
// pre Java 5.0 code
synchronized (map) { // the lock has to be held for the whole loop.
    for (Iterator iter = map.entrySet().iterator(); iter.hasNext(); ) {
        Map.Entry entry = (Map.Entry) iter.next(); // a raw Iterator returns Object
        String key = (String) entry.getKey();
        MyType value = (MyType) entry.getValue();
        // do something with key and value.
    }
}
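
A minimal sketch of how the pieces in the question could fit together without Collections.synchronizedMap, assuming every access goes through one wrapper object. BoundedBuffer and drain are hypothetical names; the limit of 50 comes from the question.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BoundedBuffer<K, V> {
    private static final int LIMIT = 50;   // limit taken from the question

    // Insertion-ordered map that evicts its eldest entry once LIMIT is exceeded.
    private final Map<K, V> map = new LinkedHashMap<K, V>() {
        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > LIMIT;
        }
    };

    synchronized void put(K key, V value) {
        map.put(key, value);
    }

    // Copy and clear under one lock so no write can slip in between the two steps.
    synchronized List<Map.Entry<K, V>> drain() {
        List<Map.Entry<K, V>> snapshot = new ArrayList<>();
        for (Map.Entry<K, V> e : map.entrySet()) {
            snapshot.add(new AbstractMap.SimpleImmutableEntry<K, V>(e.getKey(), e.getValue()));
        }
        map.clear();
        return snapshot;
    }
}
```

The caller can push the returned list to the database outside the lock, so writers are blocked only for the copy itself.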

If you are using Java 1.5 or newer, you can use java.util.concurrent.ConcurrentHashMap.
It is generally the most efficient general-purpose Map implementation for a multithreaded environment.
It also adds methods such as putIfAbsent, which are very useful for atomic operations on the map.
From the Javadoc:
Retrieval operations (including get) generally do not block, so may
overlap with update operations (including put and remove). Retrievals
reflect the results of the most recently completed update operations
holding upon their onset. For aggregate operations such as putAll and
clear, concurrent retrievals may reflect insertion or removal of only
some entries
So verify that this is the behaviour you expect from your class.
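
To illustrate the atomic operations mentioned above, here is a small sketch using putIfAbsent. The class and method names (PutIfAbsentDemo, claim, ownerOf) are made up for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class PutIfAbsentDemo {
    private final ConcurrentMap<String, String> owners = new ConcurrentHashMap<>();

    // Returns true only for the single caller that claimed the key first.
    boolean claim(String resource, String owner) {
        // putIfAbsent returns the previous value, or null if there was none.
        return owners.putIfAbsent(resource, owner) == null;
    }

    String ownerOf(String resource) {
        return owners.get(resource);
    }
}
```

A separate containsKey check followed by put would leave a window between the two calls; putIfAbsent performs the check and the insert as one step.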
If your map has only 50 records and needs to be used as a circular queue, why use a Map? Wouldn't one of the Queue implementations be a better fit?
If you need to use a LinkedHashMap, use the following:
Map m = Collections.synchronizedMap(new LinkedHashMap());
From javadoc of LinkedHashMap:
Note that this implementation is not synchronized. If multiple threads
access a linked hash map concurrently, and at least one of the threads
modifies the map structurally, it must be synchronized externally.
This is typically accomplished by synchronizing on some object that
naturally encapsulates the map. If no such object exists, the map
should be "wrapped" using the Collections.synchronizedMap method. This
is best done at creation time, to prevent accidental unsynchronized
access to the map:
Map m = Collections.synchronizedMap(new LinkedHashMap(...));
https://docs.oracle.com/javase/7/docs/api/java/util/LinkedHashMap.html

Most LinkedHashMap operations require synchronization in a multi-threaded environment, even the ones that look pure like get(key): in an access-ordered LinkedHashMap, get(key) actually relinks some internal nodes. The easiest thing to do is to use Collections.synchronizedMap.
Map<K,V> map = Collections.synchronizedMap(new LinkedHashMap<>());
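If it is not available, you can easily add it yourself, as it is just a simple decorator around a map that synchronizes every operation.
class SyncMap<T,U> implements Map<T,U> {
    private final Map<T,U> map;
    SyncMap(LinkedHashMap<T,U> map) {
        this.map = map;
    }
    public synchronized U get(Object key) {
        return map.get(key);
    }
    // ... delegate the remaining Map methods in the same way
}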
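
To illustrate the remark about get(key): in a LinkedHashMap constructed with accessOrder set to true, a lookup reorders the internal list, which is why even reads need the lock there. In the default insertion-ordered mode the order stays put. AccessOrderDemo is a hypothetical name used only for this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AccessOrderDemo {
    // Returns the key order after a single get("a") on a map holding a, b, c.
    static List<String> keysAfterGet(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a");   // in access-order mode this moves "a" to the end of the list
        return new ArrayList<>(map.keySet());
    }
}
```

With accessOrder true the keys come back as b, c, a; with false they stay a, b, c.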

Synchronizing LinkedHashmap externally

What is the best way to synchronize a LinkedHashMap externally, without using Collections.synchronizedMap?
When Collections.synchronizedMap is used, the entire data structure is locked, so performance suffers badly.
What is the best way to lock only the required part of the data structure? E.g. if a thread is accessing key K1, it should lock only the key (K1) and value (V1) part of the data structure.

You can't get a fine-grained-locking, FIFO-eviction concurrent map from the built-in Java implementations.
Check out Guava's Cache or the open-source ConcurrentLinkedHashMap project.

I think you may want to synchronize the subsequent operation you do, just on the value coming from the map:
Object value = map.get(key);
synchronized(value) {
doSomethingWith(value);
}
Synchronizing on the values obtained from the Map makes sense, since they can be shared and accessed concurrently; the example above should be enough for what you need.
By the way, you can also synchronize on the key, using two nested synchronized blocks:
synchronized(key) {
Object value = map.get(key);
synchronized(value) {
doSomethingWith(value);
}
}
The key is -usually- just used to access the object (by hashing). Keys are matched by hash value, so it doesn't make full sense to me to synchronize over the key.
Or, maybe you can subclass ConcurrentHashMap adding what is missing from LinkedHashMap.

Louis Wasserman's suggestion is probably the best because it gives you a lot of useful functionality. However, even if you lock on the entire map, you have to be hitting it really, really hard to make that a bottleneck (as in, your code is mostly doing read/write on the map). If you don't need the additional functionality of Guava's Cache, a synchronized map could be simpler & better. You could also use a ReadWriteLock if you mostly read from the map.
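
A minimal sketch of the ReadWriteLock option, assuming an insertion-ordered LinkedHashMap (so get does not modify the map and readers can share the lock). RwLockedMap is a hypothetical wrapper that exposes only get and put.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class RwLockedMap<K, V> {
    // Insertion-ordered, so get() leaves the structure untouched.
    private final Map<K, V> map = new LinkedHashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    V get(K key) {
        lock.readLock().lock();      // many readers may hold this at once
        try {
            return map.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    V put(K key, V value) {
        lock.writeLock().lock();     // writers get exclusive access
        try {
            return map.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

This would not be safe for an access-ordered LinkedHashMap, where get itself changes the map and would need the write lock.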

The best option would be to use java.util.concurrent.ConcurrentHashMap.
I can't see how it would be possible to externally lock only parts of your Map, since you cannot control which shared data structures are accessed internally by a call to any of the map's methods.

If you don't need a LinkedHashMap, use a ConcurrentHashMap from the java.util.concurrent package.
It is specifically designed for both speed and thread safety, and it keeps locking to a minimum to achieve that thread safety.

An insertion in a HashMap, or LinkedHashMap, can cause a rehash because it increases the ratio between the size and the number of buckets. Having two or more threads rehash simultaneously would be a disaster.
Even if you are only doing a get, another thread may be removing an entry from the same bucket, so you are scanning a linked list that is being modified under you. You could also have two or more threads appending to the main linked list at the same time.
If you can do without the linking, use java.util.concurrent.ConcurrentHashMap, as already suggested.

Sort ConcurrentHashMap with thread safety

I am using a ConcurrentHashMap in my multithreaded application. I was able to sort it as described here, but since I am converting the map to a list I am a bit worried about thread safety. My ConcurrentHashMap is a static variable, hence I can guarantee there will be only one instance of it. But when I sort it, I convert it to a list, sort that, and then put the result back into a new ConcurrentHashMap.
Is this good practice in a multithreaded environment?
Please let me know your thoughts and suggestions.
Thank you in advance.

You should use a ConcurrentSkipListMap. It is thread-safe, fast and maintains ordering according to the object's comparable implementation.
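
A short sketch of that suggestion, assuming String keys and their natural ordering; SortedScores and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

final class SortedScores {
    // Keys are kept sorted at all times, so no separate sort step is needed.
    private final ConcurrentNavigableMap<String, Integer> scores = new ConcurrentSkipListMap<>();

    void record(String name, int score) {
        scores.put(name, score);
    }

    List<String> namesInOrder() {
        return new ArrayList<>(scores.keySet());
    }

    String firstName() {
        return scores.firstKey();
    }
}
```

Note that the ordering is by key. If the map has to be ordered by value, this structure does not help directly.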

If you don't change it a lot and all you want is to have it sorted, you should use a TreeMap wrapped by a Collections.synchronizedMap() call.
Your code would be something like this:
public class YourClass {
public static final Map<Something,Something> MAP = Collections.synchronizedMap( new TreeMap<Something,Something>() );
}

My 'ConcurrentHashMap' is a static variable hence i can guarantee there will be only one instance of it. but when i am going to sort it i convert it to a list, and sort then put it back to a new concurrentHashMap.
This is not a simple problem.
I can tell you for a fact, that using a ConcurrentHashMap won't make this thread-safe. Nor will using a synchronizedMap wrapper. The problem is that sorting is not supported as a single atomic operation. Rather it involves a sequence of Map API operations, probably with significant time gaps in between them.
I can think of two approaches to solving this:
1. Avoid the need for sorting in the first place by using a Map that keeps the keys in order; e.g. use ConcurrentSkipListMap.
2. Wrap the Map class in a custom synchronized wrapper class with a synchronized sort method. The problem with this approach is that you are likely to reintroduce the concurrency bottleneck that you avoided by using ConcurrentHashMap.
And it is worth pointing out that it doesn't make any sense to sort a HashMap or a ConcurrentHashMap because these maps will not preserve the order into which you sort the elements. You could use a LinkedHashMap, which preserves the entry insertion order.
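
If a sorted view is only needed occasionally, one possible compromise (not one of the two approaches above) is to sort a snapshot and keep the result in a LinkedHashMap. This is a sketch under the assumption that a slightly stale result is acceptable: the snapshot is not atomic with respect to concurrent writers. SnapshotSort is a hypothetical name.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class SnapshotSort {
    // Returns a new insertion-ordered map holding a by-value sorted snapshot of 'live'.
    static Map<String, Integer> sortedByValue(ConcurrentHashMap<String, Integer> live) {
        // The iterator is weakly consistent: it never throws
        // ConcurrentModificationException, but it may miss concurrent updates.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>();
        for (Map.Entry<String, Integer> e : live.entrySet()) {
            entries.add(new AbstractMap.SimpleImmutableEntry<String, Integer>(e.getKey(), e.getValue()));
        }
        entries.sort(Map.Entry.comparingByValue());

        Map<String, Integer> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            sorted.put(e.getKey(), e.getValue());
        }
        return sorted;
    }
}
```

The live ConcurrentHashMap is never replaced, so other threads keep writing to the same instance while readers work from the sorted copy.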

Java: How can I populate a map if I use callables?

I want to use a Map as a form of small database "cache" in my application.
I thought that it would be better to use something like:
ConcurrentHashMap<K,Callable<V>>
So that I have a single cache for many kinds of database objects (and not one for each kind, i.e. ConcurrentHashMap<K,V> where V would be some specific object).
My problem now (assuming all the above thoughts are reasonable) is: how would I pre-load this cache on start-up from the DB?
I mean, with a callable, if I need something that is not in the cache, the callable would fetch it the first time and have it ready for the next get.
But how can I pre-load the cache if I use callables?
Note: I am not interested in using a library since my needs are small.

You might have better luck with ConcurrentHashMap<K, Future<V>>, since Future better matches the concept of "something in the process of being computed, or possibly already computed." You could just initialize some elements of the cache with a Future that's already computed.
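
A minimal sketch of the Future-based idea, assuming the database lookup can be expressed as a Function (a hypothetical stand-in here). For brevity the loader runs inline on the calling thread; a real cache might hand it to an executor instead.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.Function;

final class FutureCache<K, V> {
    private final ConcurrentMap<K, Future<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // hypothetical stand-in for the database lookup

    FutureCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Pre-load: store a Future that is already complete.
    void preload(K key, V value) {
        cache.put(key, CompletableFuture.completedFuture(value));
    }

    // Lazy path: the first caller for a key runs the loader, later callers reuse the result.
    V get(K key) {
        Future<V> f = cache.computeIfAbsent(key,
                k -> CompletableFuture.completedFuture(loader.apply(k)));
        try {
            return f.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

At start-up the application would call preload for every row it wants cached up front; anything else is fetched on first use.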

Couldn't you just do something simple like this?
for (Callable<V> c : map.values()) {
    c.call(); // call() throws Exception, so the enclosing method must declare or handle it
}

You probably should use interfaces on your objects:
public interface Cacheable {}
public class MyObject implements Cacheable {...}
ConcurrentHashMap<K, Cacheable> cache = ...

Do I have to use a thread-safe Map implementation when only reading from it?

Suppose I do the following:
1. Create a HashMap (in a final field)
2. Populate the HashMap
3. Wrap the HashMap with an unmodifiable wrapper Map
4. Start other threads which will access but not modify the Map
As I understand it the Map has been "safely published" because the other threads were started after the Map was fully populated so I think it is ok to access the Map from multiple threads as it cannot be modified after this point.
Is this right?

This is perfectly fine concerning the map itself. But you need to realize that making the map unmodifiable will only make the map itself unmodifiable, not its keys and values. So if you have, for example, a Map<String, SomeMutableObject> such as Map<String, List<String>>, then threads will still be able to alter the value by, for example, map.get("foo").add("bar");. To avoid this, you'd want to make the keys/values immutable/unmodifiable as well.
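
A small sketch of the difference, using the Map<String, List<String>> case from above. ShallowVsDeep and its two methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ShallowVsDeep {
    // Only the map is read-only; the lists inside can still be changed by any thread.
    static Map<String, List<String>> shallow(Map<String, List<String>> source) {
        return Collections.unmodifiableMap(new HashMap<>(source));
    }

    // Copies each list and wraps it too, so neither level can be modified.
    static Map<String, List<String>> deep(Map<String, List<String>> source) {
        Map<String, List<String>> copy = new HashMap<>();
        for (Map.Entry<String, List<String>> e : source.entrySet()) {
            copy.put(e.getKey(), Collections.unmodifiableList(new ArrayList<>(e.getValue())));
        }
        return Collections.unmodifiableMap(copy);
    }
}
```

With the shallow version, map.get("foo").add("bar") still succeeds; with the deep version it throws UnsupportedOperationException.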

As I understand it the Map has been "safely published" because the other threads were started after the Map was fully populated so I think it is ok to access the Map from multiple threads as it cannot be modified after this point.
Yes. Just make sure that the other threads are started in a synchronized manner, i.e. make sure you have a happens-before relation between publishing the map, and starting the threads.
This is discussed in this blog post:
[...] This is how Collections.unmodifiableMap() works.
[...]
Because of the special meaning of the keyword "final", instances of this class can be shared with multiple threads without using any additional synchronization; when another thread calls get() on the instance, it is guaranteed to get the object you put into the map, without doing any additional synchronization. You should probably use something that is thread-safe to perform the handoff between threads (like LinkedBlockingQueue or something), but if you forget to do this, then you still have the guarantee.

In short, no, you don't need the map to be thread-safe if the reads are non-destructive and the map reference is safely published to the client.
In the example, two important happens-before relationships are established: the final-field publication (if and only if the population is done inside the constructor and the reference doesn't leak outside the constructor) and the calls to start the threads.
Anything that modifies the map after these calls is not safely published with respect to the client reading from the map.
We have for example a CopyOnWriteMap that has a non-threadsafe map underlying that is copied on each write. This is as fast as possible in situations where there are many more reads than writes (caching configuration data is a good example).
That said, if the intention really is to not change the map, setting an immutable version of the map into the field is always the best way to go as it guarantees the client will see the correct thing.
Lastly, there are some Map implementations that have destructive reads such as a LinkedHashMap with access ordering, or a WeakHashMap where entries can disappear. These types of maps must be accessed serially.
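
The final-field publication described above might look like the following sketch: the map is populated and wrapped inside the constructor, and `this` does not escape before the constructor returns. PublishedConfig is a hypothetical class name.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class PublishedConfig {
    // final: readers that obtain this object see the fully populated map
    private final Map<String, String> settings;

    PublishedConfig(Map<String, String> source) {
        // Copy first so later changes to 'source' cannot reach the published map.
        this.settings = Collections.unmodifiableMap(new HashMap<>(source));
    }

    String get(String key) {
        return settings.get(key);   // non-destructive read on a plain HashMap
    }
}
```

Reader threads started after construction can call get without further synchronization, as long as nothing ever modifies the wrapped map.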

You are correct. There is no need to ensure exclusive access to the data structure by different threads using mutexes or otherwise, since it's immutable. This usually greatly increases performance.
Also note that if you only wrap the original Map rather than creating a copy, i.e. the unmodifiable Map delegates method calls to the inner HashMap, then modifying the underlying Map may introduce race conditions.
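
The wrap-versus-copy distinction can be seen in a few lines. ViewVsCopy is a hypothetical name; the sketch is single-threaded and only shows visibility of a later write, not an actual race.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class ViewVsCopy {
    // Wrapping only: the unmodifiable map is a live view of 'inner'.
    static boolean viewSeesLaterWrite() {
        Map<String, String> inner = new HashMap<>();
        Map<String, String> view = Collections.unmodifiableMap(inner);
        inner.put("k", "v");
        return view.containsKey("k");
    }

    // Copy, then wrap: later writes to 'inner' do not reach the published map.
    static boolean copySeesLaterWrite() {
        Map<String, String> inner = new HashMap<>();
        Map<String, String> copy = Collections.unmodifiableMap(new HashMap<>(inner));
        inner.put("k", "v");
        return copy.containsKey("k");
    }
}
```

The first method returns true and the second false, which is why the copy is the safer thing to hand to other threads.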

An immutable map is inherently thread-safe. You could use Guava's ImmutableMap.
