thread safe LinkedHashMap without Collections.synchronized

thread safe LinkedHashMap without Collections.synchronized - java

I am using a LinkedHashMap and the environment is multi threaded so this structure needs to be thread safe. During specific events I need to read the entire map push to db and clear all.
Most of time only writes happen to this map. This map has a limit 50 entries.
I am using Oracle MAF and it does not have Collections.syncronizedMap available. So, what are things I need to put in synchronized blocks to make sure writing and reading doesn't hit me concurrentModificationException etc
Few requirements:
I need to behave it like a circular queue so Overriding removeEldestEntry method of the LinkedHashMap.
I need to preserve the order

So, what are things I need to put in synchronized blocks to make sure writing and reading doesn't hit me concurrentModificationException etc
Everything method call should be in a synchronized block.
The tricky one being the use of an Iterator, as you have to hold the lock for the life of the Iterator. e.g.
// pre Java 5.0 code
synchronized(map) { // the lock has to be held for the whole loop.
for(Iterator iter = map.entrySet().iterator(); iter.hashNext(); ) {
Map.Entry entry = iter.next();
String key = (String) entry.getKey();
MyType value = (MyType) entry.getValue();
// do something with key and value.
}
}

If you are using a java version 1.5 or newer you can use java.util.concurrent.ConcurrentHashMap.
This is the most efficient implementation of a Map to use in a multithread environment.
It adds also some method like putIfAbsent very useful for atomic operations on the map.
From java doc:
Retrieval operations (including get) generally do not block, so may
overlap with update operations (including put and remove). Retrievals
reflect the results of the most recently completed update operations
holding upon their onset. For aggregate operations such as putAll and
clear, concurrent retrievals may reflect insertion or removal of only
some entries
So verify is this is the behaviour you expect from your class.
If your map has only 50 records and needs to be used as a circular Queue why you use a Map? Is not better to use one of the Queue implementations?
If you need to use a LinkedHashMap use the following:
Map m = Collections.synchronizedMap(new LinkedHashMap());
From javadoc of LinkedHashMap:
Note that this implementation is not synchronized. If multiple threads
access a linked hash map concurrently, and at least one of the threads
modifies the map structurally, it must be synchronized externally.
This is typically accomplished by synchronizing on some object that
naturally encapsulates the map. If no such object exists, the map
should be "wrapped" using the Collections.synchronizedMap method. This
is best done at creation time, to prevent accidental unsynchronized
access to the map:
Map m = Collections.synchronizedMap(new LinkedHashMap(...));
https://docs.oracle.com/javase/7/docs/api/java/util/LinkedHashMap.html

Most LinkedHashMap operations require synchronization in a multi-threaded environment, even the ones that look pure like get(key), get(key) actually mutates some internal nodes. The easiest you could do is using Collections.synchronizedMap.
Map<K,V> map = Collections.synchronizedMap(new LinkedHashMap<>());
Now if it is not available, you can easily add it, as it is just a simple decorator around map that synchronize all operation.
class SyncMap<T,U> implements Map<T,U>{
SyncMap<T,U>(LinkedHashMap<T,U> map){
..
}
public synchronized U get(T t){
..
}
}

Related

Thread-safety of a non-concurrent data structure in a concurrent data structure

I have a data structure like this in java:
ConcurrentHashMap<String, Set<String>> objects;
Set(HashSet) is not a concurrent data structure.
Multiple threads can safely mutate the ConcurrentHashMap but what about the Set in it? Is the Set objects in the Map are thread-safe? Or the ConcurrentHashMap provides thread-safety for itself only?
Thanks

Set object in the Map is not thread safe. The Map does not make it thread safe only because it contains a reference to the set.
ConcurrentHashMap implementation provides thread safety only for its own operations, such as putting, retrieval, removal, content checks, etc.
If so happens that the same Set is modified simultaneously by several threads, the result of these modifications is unpredictable.
If you need synchronization of the Set object, you may consider using a Set wrapper for ConcurrentHashMap:
Set<String> set = ConcurrentHashMap.newKeySet();
or, simply:
Set<String> set = ...;
set = Collections.synchronizedSet(set);

Understanding concurrentHashMap

As known, the ConcurrenthashMap class allows us to use iterators safely. As far as I understood from the sources of the Map it's achieved by storing the current Map state into the iterator itself. Here is the inner class representing the iterator (There's a child that is created when iterator()'s called):
abstract class HashIterator {
int nextSegmentIndex;
int nextTableIndex;
HashEntry<K,V>[] currentTable;
HashEntry<K, V> nextEntry;
HashEntry<K, V> lastReturned;
//Methods and ctor
}
But what if some thread writes to the Map something during construction of the iterator? Do we get non-determenistic state of the map then?
The thing is neither of the methods of the Map are synchronized. There's a ReentrantLock for put method, but that's it (as far as I could find). So, I don't understand how the iterator can support a correct state even if some thread writes to the map during its construction?.

The Iterator offers a weakly consistent state. It doesn't offer a transactional view of the data. It only offers that you will see all the keys/values if it is not altered and if it is, you may or may not see that change, but you won't get an error.

From the java doc of ConcurrentHashMap:
Retrieval operations (including get) generally do not block, so may
overlap with update operations (including put and remove). Retrievals
reflect the results of the most recently completed update operations
holding upon their onset. For aggregate operations such as putAll and
clear, concurrent retrievals may reflect insertion or removal of only
some entries. Similarly, Iterators and Enumerations return elements
reflecting the state of the hash table at some point at or since the
creation of the iterator/enumeration. They do not throw
ConcurrentModificationException. However, iterators are designed to be
used by only one thread at a time.
Now answering the questions.
But what if some thread writes to the Map something during
construction of the iterator?
As mentioned, an iterator represents the state at some point of time. So it may not be the most recent state.
how the iterator can support a correct state even if some thread
writes to the map during its construction?
The guarantee is that things will not break if you put/remove during iteration. However, there is no guarantee that one thread will see the changes to the map that the other thread performs (without obtaining a new iterator from the map). The iterator is guaranteed to reflect the state of the map at the time of it's creation. Futher changes may be reflected in the iterator, but they do not have to be.

Synchronizing LinkedHashmap externally

What is the best way to implement synchronization of a linkedhashmap externally, without using Collections.synchronizedMap
When Collections.synchronizedMap is used entire datastructure is locked, so performance is hugely impacted in a bad way.
What is the best way to lock only required part of datastructure. e.g. If thread is accessing key (K1), it should lock only Key(K1) and Value(v1) part of the datastructure

You can't get a fine-grained-locking, FIFO-eviction concurrent map from the built-in Java implementations.
Check out Guava's Cache or the open-source ConcurrentLinkedHashMap project.

I think you may want to synchronize the subsequent operation you do, just on the value coming from the map:
Object value = map.get(key);
synchronized(value) {
doSomethingWith(value);
}
Synchronizing to values get from the Map, makes sense, since they can be shared and accessed concurrently; the example I posted above should do what you need. That should be enough.
By the way you can also synchronize on the key doing two nested synchronized blocks:
synchronized(key) {
Object value = map.get(key);
synchronized(value) {
doSomethingWith(value);
}
}
The key is -usually- just used to access the object (by hashing). Keys are matched by hash value, so it doesn't make full sense to me to synchronize over the key.
Or, maybe you can subclass ConcurrentHashMap adding what is missing from LinkedHashMap.

Louis Wasserman's suggestion is probably the best because it gives you a lot of useful functionality. However, even if you lock on the entire map, you have to be hitting it really, really hard to make that a bottleneck (as in, your code is mostly doing read/write on the map). If you don't need the additional functionality of Guava's Cache, a synchronized map could be simpler & better. You could also use a ReadWriteLock if you mostly read from the map.

Best option would be to use java.util.concurrent.ConcurrentHashMap .
I can't see how it would be possible to externally lock only parts of zour Map, since you cannot control what shared datastructures are accessed internally by a call to any of the maps function.

If you don't need a LinkedHaspMap, use a ConcurrentHashMap from the java.util.concurrent package.
It is specifically designed for both speed and thread safety. It uses the minimal possible locking to achieve its thread safety.

An insertion in a HashMap, or LinkedHashMap, can cause a rehash because it increases the ratio between the size and the number of buckets. Having two or more threads rehash simultaneously would be a disaster.
Even if you are only doing a get, another thread may be removing an entry from the same bucket, so you are scanning a linked list that is being modified under you. You could also have two or more threads appending to the main linked list at the same time.
If you can do without the linking, use java.util.concurrent.ConcurrentHashMap, as already suggested.

ConcurrentHashMap in Java?

What is the use of ConcurrentHashMap in Java? What are its benefits? How does it work?
Sample code would be useful too.

The point is to provide an implementation of HashMap that is threadsafe. Multiple threads can read from and write to it without the chance of receiving out-of-date or corrupted data. ConcurrentHashMap provides its own synchronization, so you do not have to synchronize accesses to it explicitly.
Another feature of ConcurrentHashMap is that it provides the putIfAbsent method, which will atomically add a mapping if the specified key does not exist. Consider the following code:
ConcurrentHashMap<String, Integer> myMap = new ConcurrentHashMap<String, Integer>();
// some stuff
if (!myMap.contains("key")) {
myMap.put("key", 3);
}
This code is not threadsafe, because another thread could add a mapping for "key" between the call to contains and the call to put. The correct implementation would be:
myMap.putIfAbsent("key", 3);

ConcurrentHashMap allow concurrent access to the map. HashTables too offers synchronized access to map, but your entire map is locked to perform any operation.
The logic behind ConcurrentHashMap is that your entire table is not getting locked, but only the portion[segments]. Each segments manages its own HashTable. Locking is applied only for updates. In case of of retrievals, it allows full concurrency.
Let's take four threads are concurrently working on a map whose capacity is 32, the table is partitioned into four segments where each segments manages a hash table of capacity. The collection maintains a list of 16 segments by default, each of which is used to guard (or lock on) a single bucket of the map.
This effectively means that 16 threads can modify the collection at a single time. This level of concurrency can be increased using the optional concurrencyLevel constructor argument.
public ConcurrentHashMap(int initialCapacity,
float loadFactor, int concurrencyLevel)
As the other answer stated, the ConcurrentHashMap offers new method putIfAbsent() which is similar to put except the value will not be overridden if the key exists.
private static Map<String,String> aMap =new ConcurrentHashMap<String,String>();
if(!aMap.contains("key"))
aMap.put("key","value");
The new method is also faster as it avoids double traversing as above. contains method has to locate the segment and iterate the table to find the key and again the method put has to traverse the bucket and put the key.

Really the big functional difference is it doesn't throw an exception and/or end up corrupt when someone else changes it while you're using it.
With regular collections, if another thread adds or removes an element while you're access it (via the iterator) it will throw an exception. ConcurrentHashMap lets them make the change and doesn't stop your thread.
Mind you it does not make any kind of synchronization guarantees or promises about the point-in-time visibility of the change from one thread to the other. (It's sort of like a read-committed database isolation, rather than a synchronized map which behaves more like a serializable database isolation. (old school row-locking SQL serializable, not Oracle-ish multiversion serializable :) )
The most common use I know of is in caching immutable derived information in App Server environments where many threads may be accessing the same thing, and it doesn't really matter if two happen to calculate the same cache value and put it twice because they interleave, etc. (e.g., it's used extensively inside the Spring WebMVC framework for holding runtime-derived config like mappings from URLs to Handler Methods.)

It can be used for memoization:
import java.util.concurrent.ConcurrentHashMap;
public static Function<Integer, Integer> fib = (n) -> {
Map<Integer, Integer> cache = new ConcurrentHashMap<>();
if (n == 0 || n == 1) return n;
return cache.computeIfAbsent(n, (key) -> HelloWorld.fib.apply(n - 2) + HelloWorld.fib.apply(n - 1));
};

1.ConcurrentHashMap is thread-safe that is the code can be accessed by single thread at a time .
2.ConcurrentHashMap synchronizes or locks on the certain portion of the Map . To optimize the performance of ConcurrentHashMap , Map is divided into different partitions depending upon the Concurrency level . So that we do not need to synchronize the whole Map Object.
3.Default concurrency level is 16, accordingly map is divided into 16 part and each part is governed with a different lock that means 16 thread can operate.
4.ConcurrentHashMap does not allow NULL values . So the key can not be null in ConcurrentHashMap .

Hello guys today we discussed the ConcurrentHashMap.
What is ConcurrentHashMap?
ConcurrentHashMap is a class it introduce in java 1.5 which implements the ConcurrentMap as well as the Serializable interface. ConcurrentHashMap is enhance the HashMap when it dealing with multiple Theading.
As we know when the application has multiple threading HashMap is not a good choice because performance issue occurred.
There are the some key point of ConcurrentHashMap.
Underling data structure for ConcurrentHashMap is HashTable.
ConcurrentHashMap is a class, That class is thread safe, it means multiple thread can access on a single thread object without any complication.
ConcurretnHashMap object is divided into number of segment according to the concurrency level.
The Default Concurrency-level of ConcurrentHashMap is 16.
In ConcurrentHashMap any number of Thread can perform the retrieval operation,But for updation in object Thread must lock the particular Segment in which thread want to operate.
This type of locking mechanism is known as Segment-Locking OR Bucket-Locking.
In ConcurrentHashMap the 16 updation operation perform at a time.
Null insertion is not possible in ConcurrentHashMap.
Here are the ConcurrentHashMap construction.
ConcurrentHashMap m=new ConcurrentHashMap();:Creates a new, empty map with a default initial capacity (16), load factor (0.75) and concurrencyLevel (16).
ConcurrentHashMap m=new ConcurrentHashMap(int initialCapacity);:Creates a new, empty map with the specified initial capacity, and with default load factor (0.75) and concurrencyLevel (16).
ConcurrentHashMap m=new ConcurrentHashMap(int initialCapacity, float loadFactor);:
Creates a new, empty map with the specified initial capacity and load factor and with the default concurrencyLevel (16).
ConcurrentHashMap m=new ConcurrentHashMap(int initialCapacity, float loadFactor, int concurrencyLevel);:Creates a new, empty map with the specified initial capacity, load factor and concurrency level.
ConcurrentHashMap m=new ConcurrentHashMap(Map m);:Creates a new map with the same mappings as the given map.
ConcurretHashMap has one method named is putIfAbsent(); That method is prevent to store the duplicate key please refer the below example.
import java.util.concurrent.*;
class ConcurrentHashMapDemo {
public static void main(String[] args)
{
ConcurrentHashMap m = new ConcurrentHashMap();
m.put(1, "Hello");
m.put(2, "Vala");
m.put(3, "Sarakar");
// Here we cant add Hello because 1 key
// is already present in ConcurrentHashMap object
m.putIfAbsent(1, "Hello");
// We can remove entry because 2 key
// is associated with For value
m.remove(2, "Vala");
// Now we can add Vala
m.putIfAbsent(4, "Vala");
System.out.println(m);
}
}

Does java have a "LinkedConcurrentHashMap" data structure?

I need a data structure that is a LinkedHashMap and is thread safe.
How can I do that ?

You can wrap the map in a Collections.synchronizedMap to get a synchronized hashmap that maintains insertion order. This is not as efficient as a ConcurrentHashMap (and doesn't implement the extra interface methods of ConcurrentMap) but it does get you the (somewhat) thread safe behavior.
Even the mighty Google Collections doesn't appear to have solved this particular problem yet. However, there is one project that does try to tackle the problem.
I say somewhat on the synchronization, because iteration is still not thread safe in the sense that concurrent modification exceptions can happen.

There's a number of different approaches to this problem. You could use:
Collections.synchronizedMap(new LinkedHashMap());
as the other responses have suggested but this has several gotchas you'll need to be aware of. Most notably is that you will often need to hold the collections synchronized lock when iterating over the collection, which in turn prevents other threads from accessing the collection until you've completed iterating over it. (See Java theory and practice: Concurrent collections classes). For example:
synchronized(map) {
for (Object obj: map) {
// Do work here
}
}
Using
new ConcurrentHashMap();
is probably a better choice as you won't need to lock the collection to iterate over it.
Finally, you might want to consider a more functional programming approach. That is you could consider the map as essentially immutable. Instead of adding to an existing Map, you would create a new one that contains the contents of the old map plus the new addition. This sounds pretty bizarre at first, but it is actually the way Scala deals with concurrency and collections

There is one implementation available under Google code. A quote from their site:
A high performance version of java.util.LinkedHashMap for use as a software cache.
Design
A concurrent linked list runs through a ConcurrentHashMap to provide eviction ordering.
Supports insertion and access ordered eviction policies (FIFO, LRU, and Second Chance).

You can use a ConcurrentSkipListMap, only available in Java SE/EE 6 or later. It is order presevering in that keys are sorted according to their natural ordering. You need to have a Comparator or make the keys Comparable objects. In order to mimik a linked hash map behavior (iteration order is the order in time in which entries were added) I implemented my key objects to always compare to be greater than a given other object unless it is equal (whatever that is for your object).
A wrapped synchronized linked hash map did not suffice because as stated in
http://www.ibm.com/developerworks/java/library/j-jtp07233.html: "The synchronized collections wrappers, synchronizedMap and synchronizedList, are sometimes called conditionally thread-safe -- all individual operations are thread-safe, but sequences of operations where the control flow depends on the results of previous operations may be subject to data races. The first snippet in Listing 1 shows the common put-if-absent idiom -- if an entry does not already exist in the Map, add it. Unfortunately, as written, it is possible for another thread to insert a value with the same key between the time the containsKey() method returns and the time the put() method is called. If you want to ensure exactly-once insertion, you need to wrap the pair of statements with a synchronized block that synchronizes on the Map m."
So what only helps is a ConcurrentSkipListMap which is 3-5 times slower than a normal ConcurrentHashMap.

Collections.synchronizedMap(new LinkedHashMap())

Since the ConcurrentHashMap offers a few important extra methods that are not in the Map interface, simply wrapping a LinkedHashMap with a synchronizedMap won't give you the same functionality, in particular, they won't give you anything like the putIfAbsent(), replace(key, oldValue, newValue) and remove(key, oldValue) methods which make the ConcurrentHashMap so useful.
Unless there's some apache library that has implemented what you want, you'll probably have to use a LinkedHashMap and provide suitable synchronized{} blocks of your own.

I just tried synchronized bounded LRU Map based on insertion order LinkedConcurrentHashMap; with Read/Write Lock for synchronization.
So when you are using iterator; you have to acquire WriteLock to avoid ConcurrentModificationException. This is better than Collections.synchronizedMap.
public class LinkedConcurrentHashMap<K, V> {
private LinkedHashMap<K, V> linkedHashMap = null;
private final int cacheSize;
private ReadWriteLock readWriteLock = null;
public LinkedConcurrentHashMap(LinkedHashMap<K, V> psCacheMap, int size) {
this.linkedHashMap = psCacheMap;
cacheSize = size;
readWriteLock=new ReentrantReadWriteLock();
}
public void put(K key, V value) throws SQLException{
Lock writeLock=readWriteLock.writeLock();
try{
writeLock.lock();
if(linkedHashMap.size() >= cacheSize && cacheSize > 0){
K oldAgedKey = linkedHashMap.keySet().iterator().next();
remove(oldAgedKey);
}
linkedHashMap.put(key, value);
}finally{
writeLock.unlock();
}
}
public V get(K key){
Lock readLock=readWriteLock.readLock();
try{
readLock.lock();
return linkedHashMap.get(key);
}finally{
readLock.unlock();
}
}
public boolean containsKey(K key){
Lock readLock=readWriteLock.readLock();
try{
readLock.lock();
return linkedHashMap.containsKey(key);
}finally{
readLock.unlock();
}
}
public V remove(K key){
Lock writeLock=readWriteLock.writeLock();
try{
writeLock.lock();
return linkedHashMap.remove(key);
}finally{
writeLock.unlock();
}
}
public ReadWriteLock getLock(){
return readWriteLock;
}
public Set<Map.Entry<K, V>> entrySet(){
return linkedHashMap.entrySet();
}
}

The answer is pretty much no, there's nothing equivalent to a ConcurrentHashMap that is sorted (like the LinkedHashMap). As other people pointed out, you can wrap your collection using Collections.synchronizedMap(-yourmap-) however this will not give you the same level of fine grained locking. It will simply block the entire map on every operation.
Your best bet is to either use synchronized around any access to the map (where it matters, of course. You may not care about dirty reads, for example) or to write a wrapper around the map that determines when it should or should not lock.

How about this.
Take your favourite open-source concurrent HashMap implementation. Sadly it can't be Java's ConcurrentHashMap as it's basically impossible to copy and modify that due to huge numbers of package-private stuff. (Why do the Java authors always do that?)
Add a ConcurrentLinkedDeque field.
Modify all of the put methods so that if an insertion is successful the Entry is added to the end of the deque. Modify all of the remove methods so that any removed entries are also removed from the deque. Where a put method replaces the existing value, we don't have to do anything to the deque.
Change all iterator/spliterator methods so that they delegate to the deque.
There's no guarantee that the deque and the map have exactly the same contents at all times, but concurrent hash maps don't make those sort of promises anyway.
Removal won't be super fast (have to scan the deque). But most maps are never (or very rarely) asked to remove entries anyway.
You could also achieve this by extending ConcurrentHashMap, or decorating it (decorator pattern).

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.