I need a data structure that is a LinkedHashMap and is thread safe.
How can I do that?
You can wrap the map in a Collections.synchronizedMap to get a synchronized hashmap that maintains insertion order. This is not as efficient as a ConcurrentHashMap (and doesn't implement the extra interface methods of ConcurrentMap) but it does get you the (somewhat) thread safe behavior.
Even the mighty Google Collections doesn't appear to have solved this particular problem yet. However, there is one project that does try to tackle the problem.
I say somewhat on the synchronization, because iteration is still not thread safe in the sense that concurrent modification exceptions can happen.
There's a number of different approaches to this problem. You could use:
Collections.synchronizedMap(new LinkedHashMap());
as the other responses have suggested, but this has several gotchas you'll need to be aware of. Most notably, you will often need to hold the collection's synchronization lock while iterating over it, which in turn prevents other threads from accessing the collection until you've finished iterating. (See Java theory and practice: Concurrent collections classes.) For example:
synchronized (map) {
    // A Map is not Iterable, so iterate over one of its views.
    for (Object entry : map.entrySet()) {
        // Do work here
    }
}
Using
new ConcurrentHashMap();
is probably a better choice as you won't need to lock the collection to iterate over it, although it does not preserve insertion order.
Finally, you might want to consider a more functional programming approach. That is, you could treat the map as essentially immutable. Instead of adding to an existing Map, you would create a new one that contains the contents of the old map plus the new addition. This sounds pretty bizarre at first, but it is actually the way Scala deals with concurrency and collections.
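A minimal sketch of that copy-on-write idea in Java, assuming writes are rare enough that copying the whole map on each update is acceptable. The class name CopyOnWriteLinkedMap and its methods are hypothetical, not part of any library:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper: a copy-on-write map that keeps insertion order.
class CopyOnWriteLinkedMap<K, V> {
    // Readers always see a complete snapshot that is never mutated.
    private final AtomicReference<Map<K, V>> current =
            new AtomicReference<>(Collections.<K, V>emptyMap());

    public V get(K key) {
        return current.get().get(key);
    }

    public void put(K key, V value) {
        // Copy the old map, add the entry, then publish the copy atomically.
        // The function may be retried under contention, so it has no side effects.
        current.updateAndGet(old -> {
            Map<K, V> copy = new LinkedHashMap<>(old);
            copy.put(key, value);
            return Collections.unmodifiableMap(copy);
        });
    }

    // Safe to iterate without locking: the returned snapshot never changes.
    public Map<K, V> snapshot() {
        return current.get();
    }
}
```

Iterating over snapshot() needs no lock, at the cost of an O(n) copy on every put.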
There is one implementation available under Google code. A quote from their site:
A high performance version of java.util.LinkedHashMap for use as a software cache.
Design
A concurrent linked list runs through a ConcurrentHashMap to provide eviction ordering.
Supports insertion and access ordered eviction policies (FIFO, LRU, and Second Chance).
You can use a ConcurrentSkipListMap, only available in Java SE/EE 6 or later. It is order preserving in that keys are sorted according to their natural ordering. You need to have a Comparator or make the keys Comparable objects. In order to mimic linked hash map behavior (iteration order is the order in time in which entries were added), I implemented my key objects to always compare as greater than a given other object unless they are equal (whatever that means for your object).
A wrapped synchronized linked hash map did not suffice because as stated in
http://www.ibm.com/developerworks/java/library/j-jtp07233.html: "The synchronized collections wrappers, synchronizedMap and synchronizedList, are sometimes called conditionally thread-safe -- all individual operations are thread-safe, but sequences of operations where the control flow depends on the results of previous operations may be subject to data races. The first snippet in Listing 1 shows the common put-if-absent idiom -- if an entry does not already exist in the Map, add it. Unfortunately, as written, it is possible for another thread to insert a value with the same key between the time the containsKey() method returns and the time the put() method is called. If you want to ensure exactly-once insertion, you need to wrap the pair of statements with a synchronized block that synchronizes on the Map m."
So the only thing that helps is a ConcurrentSkipListMap, which is 3-5 times slower than a normal ConcurrentHashMap.
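The put-if-absent race described in the quoted passage can be sketched as follows. This is only an illustration under assumed names (PutIfAbsentSketch and its methods are hypothetical); the second method wraps the pair of calls in a synchronized block on the map, as the quote recommends:

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class illustrating the check-then-act problem.
class PutIfAbsentSketch {
    private final Map<String, Integer> m =
            Collections.synchronizedMap(new LinkedHashMap<String, Integer>());

    // Racy: another thread can insert the key between containsKey() and put().
    boolean racyPutIfAbsent(String key, Integer value) {
        if (!m.containsKey(key)) {
            m.put(key, value);
            return true;
        }
        return false;
    }

    // Safe: the check and the insert run as one step while holding the
    // same monitor the synchronized wrapper uses (the wrapper itself).
    boolean safePutIfAbsent(String key, Integer value) {
        synchronized (m) {
            if (!m.containsKey(key)) {
                m.put(key, value);
                return true;
            }
            return false;
        }
    }

    Integer get(String key) {
        return m.get(key);
    }
}
```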
Collections.synchronizedMap(new LinkedHashMap())
Since ConcurrentHashMap offers a few important extra methods that were not in the original Map interface, simply wrapping a LinkedHashMap with a synchronizedMap won't give you the same functionality. In particular, before Java 8 it won't give you anything like the putIfAbsent(), replace(key, oldValue, newValue) and remove(key, oldValue) methods which make ConcurrentHashMap so useful; from Java 8 on, these are default methods of Map that the synchronized wrapper executes under its lock.
Unless there's some apache library that has implemented what you want, you'll probably have to use a LinkedHashMap and provide suitable synchronized{} blocks of your own.
I just tried a synchronized, bounded cache map that evicts the oldest entry by insertion order (LinkedConcurrentHashMap below), with a Read/Write Lock for synchronization.
So when you are using an iterator, you have to acquire the WriteLock yourself to avoid a ConcurrentModificationException. This is better than Collections.synchronizedMap because concurrent reads do not block each other.
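import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LinkedConcurrentHashMap<K, V> {
    // Must be an insertion-ordered LinkedHashMap: in access-order mode get()
    // would modify the map and could not run under the read lock.
    private final LinkedHashMap<K, V> linkedHashMap;
    private final int cacheSize;
    private final ReadWriteLock readWriteLock = new ReentrantReadWriteLock();

    public LinkedConcurrentHashMap(LinkedHashMap<K, V> psCacheMap, int size) {
        this.linkedHashMap = psCacheMap;
        this.cacheSize = size;
    }

    public void put(K key, V value) {
        Lock writeLock = readWriteLock.writeLock();
        // Acquire before the try block so finally never unlocks an unheld lock.
        writeLock.lock();
        try {
            // Evict the eldest entry only when a new key would exceed the bound.
            if (cacheSize > 0 && linkedHashMap.size() >= cacheSize
                    && !linkedHashMap.containsKey(key)) {
                K oldAgedKey = linkedHashMap.keySet().iterator().next();
                remove(oldAgedKey); // fine: the write lock is reentrant
            }
            linkedHashMap.put(key, value);
        } finally {
            writeLock.unlock();
        }
    }

    public V get(K key) {
        Lock readLock = readWriteLock.readLock();
        readLock.lock();
        try {
            return linkedHashMap.get(key);
        } finally {
            readLock.unlock();
        }
    }

    public boolean containsKey(K key) {
        Lock readLock = readWriteLock.readLock();
        readLock.lock();
        try {
            return linkedHashMap.containsKey(key);
        } finally {
            readLock.unlock();
        }
    }

    public V remove(K key) {
        Lock writeLock = readWriteLock.writeLock();
        writeLock.lock();
        try {
            return linkedHashMap.remove(key);
        } finally {
            writeLock.unlock();
        }
    }

    public ReadWriteLock getLock() {
        return readWriteLock;
    }

    // Callers must hold the lock from getLock() while iterating over this view.
    public Set<Map.Entry<K, V>> entrySet() {
        return linkedHashMap.entrySet();
    }
}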
The answer is pretty much no: there's nothing equivalent to a ConcurrentHashMap that is ordered (like the LinkedHashMap). As other people pointed out, you can wrap your collection using Collections.synchronizedMap(-yourmap-); however, this will not give you the same level of fine-grained locking. It will simply lock the entire map on every operation.
Your best bet is to either use synchronized around any access to the map (where it matters, of course. You may not care about dirty reads, for example) or to write a wrapper around the map that determines when it should or should not lock.
How about this.
Take your favourite open-source concurrent HashMap implementation. Sadly it can't be Java's ConcurrentHashMap, as it's basically impossible to copy and modify that due to the huge amount of package-private code it depends on. (Why do the Java authors always do that?)
Add a ConcurrentLinkedDeque field.
Modify all of the put methods so that if an insertion is successful the Entry is added to the end of the deque. Modify all of the remove methods so that any removed entries are also removed from the deque. Where a put method replaces the existing value, we don't have to do anything to the deque.
Change all iterator/spliterator methods so that they delegate to the deque.
There's no guarantee that the deque and the map have exactly the same contents at all times, but concurrent hash maps don't make those sort of promises anyway.
Removal won't be super fast (have to scan the deque). But most maps are never (or very rarely) asked to remove entries anyway.
You could also achieve this by extending ConcurrentHashMap, or decorating it (decorator pattern).
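The decorator variant of the steps above can be sketched roughly like this. It is only an outline under assumptions: the class name InsertionOrderedConcurrentMap is hypothetical, it tracks keys rather than entries in the deque, and, as noted above, the map and the deque can briefly disagree under concurrent updates:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedDeque;

// Hypothetical decorator: a ConcurrentHashMap plus a deque that records
// the order in which keys were first inserted.
class InsertionOrderedConcurrentMap<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final ConcurrentLinkedDeque<K> order = new ConcurrentLinkedDeque<>();

    public V put(K key, V value) {
        V previous = map.put(key, value);
        if (previous == null) {
            order.addLast(key); // new key: remember its position
        }
        // Replacing an existing value leaves the deque untouched.
        return previous;
    }

    public V get(K key) {
        return map.get(key);
    }

    public V remove(K key) {
        V removed = map.remove(key);
        if (removed != null) {
            order.remove(key); // linear scan of the deque
        }
        return removed;
    }

    // Weakly consistent view of the keys in insertion order.
    public List<K> keysInInsertionOrder() {
        List<K> keys = new ArrayList<>();
        for (K key : order) {
            if (map.containsKey(key)) {
                keys.add(key);
            }
        }
        return keys;
    }
}
```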
Related
I have this code:
private ConcurrentMap<String, Integer> myMap = new ConcurrentHashMap<>();
@Scheduled(fixedDelay = 600_000)
public void foo(){
myMap.values().stream().
filter(predicate()).
forEach(this::remove);
}
public void insert(String str, Integer value){
myMap.put(str, value);
}
What would happen if, while iterating over this map, someone puts a new value in it or removes an existing value from it?
The documentation for ConcurrentHashMap has some details about the behavior. First we look at what ConcurrentHashMap.values() does:
Returns a Collection view of the values contained in this map...
The view's iterators and spliterators are weakly consistent.
The view's spliterator reports Spliterator.CONCURRENT and Spliterator.NONNULL.
What's interesting are the terms "weakly consistent" and Spliterator.CONCURRENT, where the former is described as:
Most concurrent Collection implementations (including most Queues) also differ from the usual java.util conventions in that their Iterators and Spliterators provide weakly consistent rather than fast-fail traversal:
they may proceed concurrently with other operations
they will never throw ConcurrentModificationException
they are guaranteed to traverse elements as they existed upon construction exactly once, and may (but are not guaranteed to) reflect any modifications subsequent to construction.
and Spliterator.CONCURRENT is described as:
Characteristic value signifying that the element source may be safely concurrently modified (allowing additions, replacements, and/or removals) by multiple threads without external synchronization. If so, the Spliterator is expected to have a documented policy concerning the impact of modifications during traversal.
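From this documentation, and consistent with the concurrency model of ConcurrentHashMap, it follows that the stream pipeline is safe to run while other threads modify the map: it will never throw ConcurrentModificationException, it traverses the elements as they existed when the traversal started, and it may (but is not guaranteed to) reflect insertions or removals made afterwards.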
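A small sketch of that behaviour, assuming a single thread for simplicity: removing entries from a ConcurrentHashMap while iterating over it does not throw, whereas the same loop over a plain HashMap would typically fail with ConcurrentModificationException. The class and method names are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentMap;

// Hypothetical demo of weakly consistent iteration.
class WeaklyConsistentSketch {
    // Removes every entry with an even value while iterating over the map
    // and returns how many entries were removed.
    static int removeEvenValues(ConcurrentMap<String, Integer> map) {
        int removed = 0;
        for (Map.Entry<String, Integer> e : map.entrySet()) {
            // remove(key, value) only removes if the mapping is unchanged.
            if (e.getValue() % 2 == 0 && map.remove(e.getKey(), e.getValue())) {
                removed++;
            }
        }
        return removed;
    }
}
```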
In this post:
Can we use Synchronized for each entry instead of ConcurrentHashMap?
I asked if we can use a synchronized block to lock only entries of a HashMap, which I learnt we cannot. Now, my question is: if we have a ConcurrentHashMap (not a HashMap) with values of type ArrayList or TreeMap, can I use that approach (using synchronized)? Here is what I mean:
ConcurrentHashMap<String, ArrayList<String>> map = new ConcurrentHashMap<>();
synchronized (map.get("key")) {
//do something with the array thread-safely,
}
Is it safe? The reason I am asking is that I don't know how to check for this kind of issue by testing.
As long as you use the putIfAbsent operation, then it will be thread-safe. You will always be synchronizing (blocking) on the same object reference.
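A minimal sketch of that pattern, assuming mappings are only ever added through putIfAbsent and never replaced or removed (otherwise two threads could end up locking different list instances). The class name PerKeyLists and its methods are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper: one list per key, each guarded by its own monitor.
class PerKeyLists {
    private final ConcurrentMap<String, List<String>> map = new ConcurrentHashMap<>();

    // putIfAbsent guarantees every thread gets the same list for a key.
    private List<String> listFor(String key) {
        List<String> fresh = new ArrayList<>();
        List<String> existing = map.putIfAbsent(key, fresh);
        return existing != null ? existing : fresh;
    }

    void add(String key, String item) {
        List<String> list = listFor(key);
        synchronized (list) { // lock only this key's list
            list.add(item);
        }
    }

    List<String> copyOf(String key) {
        List<String> list = listFor(key);
        synchronized (list) {
            return new ArrayList<>(list);
        }
    }
}
```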
I am using a LinkedHashMap and the environment is multi-threaded, so this structure needs to be thread safe. During specific events I need to read the entire map, push it to the DB, and clear it.
Most of the time only writes happen to this map. The map has a limit of 50 entries.
I am using Oracle MAF and it does not have Collections.synchronizedMap available. So, what are the things I need to put in synchronized blocks to make sure writing and reading don't hit a ConcurrentModificationException etc.?
A few requirements:
I need it to behave like a circular queue, so I am overriding the removeEldestEntry method of the LinkedHashMap.
I need to preserve the order.
So, what are the things I need to put in synchronized blocks to make sure writing and reading don't hit a ConcurrentModificationException etc.?
Every method call should be in a synchronized block.
The tricky one being the use of an Iterator, as you have to hold the lock for the life of the Iterator. e.g.
// pre Java 5.0 code
synchronized(map) { // the lock has to be held for the whole loop.
for(Iterator iter = map.entrySet().iterator(); iter.hasNext(); ) {
Map.Entry entry = (Map.Entry) iter.next(); // raw Iterator returns Object
String key = (String) entry.getKey();
MyType value = (MyType) entry.getValue();
// do something with key and value.
}
}
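Putting those pieces together for the bounded, ordered map described in the question, one possible sketch looks like this. It assumes generics and annotations are available (Java 5 or later) and does not use Collections.synchronizedMap; the class name BoundedSyncMap and the drain method are hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical wrapper: a size-limited, insertion-ordered map whose
// operations all synchronize on the wrapper itself.
class BoundedSyncMap<K, V> {
    private final LinkedHashMap<K, V> map;

    BoundedSyncMap(final int limit) {
        this.map = new LinkedHashMap<K, V>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Drop the oldest entry once the limit is exceeded.
                return size() > limit;
            }
        };
    }

    public synchronized V put(K key, V value) {
        return map.put(key, value);
    }

    public synchronized V get(K key) {
        return map.get(key);
    }

    // Copies all entries in insertion order and clears the map while
    // holding the lock, so no write can slip in between the two steps.
    public synchronized Map<K, V> drain() {
        Map<K, V> copy = new LinkedHashMap<K, V>(map);
        map.clear();
        return copy;
    }
}
```

The caller can then push the map returned by drain() to the database without holding any lock.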
If you are using Java 1.5 or newer, you can use java.util.concurrent.ConcurrentHashMap.
This is generally the most efficient Map implementation to use in a multithreaded environment.
It also adds some methods, such as putIfAbsent, that are very useful for atomic operations on the map.
From java doc:
Retrieval operations (including get) generally do not block, so may
overlap with update operations (including put and remove). Retrievals
reflect the results of the most recently completed update operations
holding upon their onset. For aggregate operations such as putAll and
clear, concurrent retrievals may reflect insertion or removal of only
some entries
So verify that this is the behaviour you expect from your class.
If your map has only 50 records and needs to be used as a circular queue, why use a Map? Wouldn't it be better to use one of the Queue implementations?
If you need to use a LinkedHashMap use the following:
Map m = Collections.synchronizedMap(new LinkedHashMap());
From javadoc of LinkedHashMap:
Note that this implementation is not synchronized. If multiple threads
access a linked hash map concurrently, and at least one of the threads
modifies the map structurally, it must be synchronized externally.
This is typically accomplished by synchronizing on some object that
naturally encapsulates the map. If no such object exists, the map
should be "wrapped" using the Collections.synchronizedMap method. This
is best done at creation time, to prevent accidental unsynchronized
access to the map:
Map m = Collections.synchronizedMap(new LinkedHashMap(...));
https://docs.oracle.com/javase/7/docs/api/java/util/LinkedHashMap.html
Most LinkedHashMap operations require synchronization in a multi-threaded environment, even ones that look like pure reads: in an access-ordered LinkedHashMap, get(key) actually relinks internal nodes. The easiest thing to do is to use Collections.synchronizedMap.
Map<K,V> map = Collections.synchronizedMap(new LinkedHashMap<>());
Now, if it is not available, you can easily add it yourself, as it is just a simple decorator around the map that synchronizes every operation.
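class SyncMap<T,U> implements Map<T,U> {
    private final Map<T,U> map;
    SyncMap(LinkedHashMap<T,U> map) {
        this.map = map;
    }
    // Map.get takes an Object key, so the override must as well.
    public synchronized U get(Object key) {
        return map.get(key);
    }
    // ... the remaining Map methods delegate to map in the same synchronized way
}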
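A small sketch of why even get(key) can count as a write: a LinkedHashMap constructed in access-order mode (the three-argument constructor) moves an entry to the end of its internal list every time it is read. The class and method names here are hypothetical:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;

// Hypothetical demo: get() reorders an access-ordered LinkedHashMap.
class AccessOrderSketch {
    static List<String> keysAfterGet(boolean accessOrder) {
        LinkedHashMap<String, Integer> map =
                new LinkedHashMap<String, Integer>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // in access-order mode this moves "a" to the end
        return new ArrayList<String>(map.keySet());
    }
}
```

With accessOrder set to true the keys come back as b, c, a; with false they stay a, b, c.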
As is well known, the ConcurrentHashMap class allows us to use iterators safely. As far as I understand from the sources of the Map, this is achieved by storing the current Map state in the iterator itself. Here is the inner class representing the iterator (a subclass of it is instantiated when iterator() is called):
abstract class HashIterator {
int nextSegmentIndex;
int nextTableIndex;
HashEntry<K,V>[] currentTable;
HashEntry<K, V> nextEntry;
HashEntry<K, V> lastReturned;
//Methods and ctor
}
But what if some thread writes to the Map something during construction of the iterator? Do we get a non-deterministic state of the map then?
The thing is, none of the methods of the Map are synchronized. There's a ReentrantLock for the put method, but that's it (as far as I could find). So I don't understand how the iterator can support a correct state even if some thread writes to the map during its construction.
The Iterator offers a weakly consistent state. It doesn't offer a transactional view of the data. It only offers that you will see all the keys/values if it is not altered and if it is, you may or may not see that change, but you won't get an error.
From the java doc of ConcurrentHashMap:
Retrieval operations (including get) generally do not block, so may
overlap with update operations (including put and remove). Retrievals
reflect the results of the most recently completed update operations
holding upon their onset. For aggregate operations such as putAll and
clear, concurrent retrievals may reflect insertion or removal of only
some entries. Similarly, Iterators and Enumerations return elements
reflecting the state of the hash table at some point at or since the
creation of the iterator/enumeration. They do not throw
ConcurrentModificationException. However, iterators are designed to be
used by only one thread at a time.
Now answering the questions.
But what if some thread writes to the Map something during
construction of the iterator?
As mentioned, an iterator represents the state at some point of time. So it may not be the most recent state.
how the iterator can support a correct state even if some thread
writes to the map during its construction?
The guarantee is that things will not break if you put/remove during iteration. However, there is no guarantee that one thread will see the changes to the map that the other thread performs (without obtaining a new iterator from the map). The iterator reflects the state of the map at some point at or since its creation. Further changes may be reflected in the iterator, but they do not have to be.
What is the best way to implement synchronization of a LinkedHashMap externally, without using Collections.synchronizedMap?
When Collections.synchronizedMap is used, the entire data structure is locked, so performance suffers badly.
What is the best way to lock only the required part of the data structure? E.g. if a thread is accessing key K1, it should lock only the key (K1) and value (V1) part of the data structure.
You can't get a fine-grained-locking, FIFO-eviction concurrent map from the built-in Java implementations.
Check out Guava's Cache or the open-source ConcurrentLinkedHashMap project.
I think you may want to synchronize the subsequent operation you do, just on the value coming from the map:
Object value = map.get(key);
synchronized(value) {
doSomethingWith(value);
}
Synchronizing on the values obtained from the Map makes sense, since they can be shared and accessed concurrently; the example I posted above should do what you need. That should be enough.
By the way you can also synchronize on the key doing two nested synchronized blocks:
synchronized(key) {
Object value = map.get(key);
synchronized(value) {
doSomethingWith(value);
}
}
The key is usually just used to look up the object (by hash code and equals), and a different but equal key instance would be a different lock, so synchronizing on the key doesn't make much sense to me.
Or, maybe you can subclass ConcurrentHashMap adding what is missing from LinkedHashMap.
Louis Wasserman's suggestion is probably the best because it gives you a lot of useful functionality. However, even if you lock on the entire map, you have to be hitting it really, really hard to make that a bottleneck (as in, your code is mostly doing read/write on the map). If you don't need the additional functionality of Guava's Cache, a synchronized map could be simpler & better. You could also use a ReadWriteLock if you mostly read from the map.
Best option would be to use java.util.concurrent.ConcurrentHashMap .
I can't see how it would be possible to externally lock only parts of your Map, since you cannot control which shared data structures are accessed internally by a call to any of the map's methods.
If you don't need a LinkedHashMap, use a ConcurrentHashMap from the java.util.concurrent package.
It is specifically designed for both speed and thread safety. It uses the minimal possible locking to achieve its thread safety.
An insertion in a HashMap, or LinkedHashMap, can cause a rehash because it increases the ratio between the size and the number of buckets. Having two or more threads rehash simultaneously would be a disaster.
Even if you are only doing a get, another thread may be removing an entry from the same bucket, so you are scanning a linked list that is being modified under you. You could also have two or more threads appending to the main linked list at the same time.
If you can do without the linking, use java.util.concurrent.ConcurrentHashMap, as already suggested.