I have one thread that updates data in a Map and several threads that read this data. Right now my code looks like this:
public class Updater {
    private ConcurrentMap<String, Integer> valuesMap = new ConcurrentHashMap<>();
    private ReadWriteLock reentrantReadWriteLock = new ReentrantReadWriteLock();

    public void update(Settings settings) {
        reentrantReadWriteLock.writeLock().lock();
        try {
            for (Map.Entry<String, Integer> entry : valuesMap.entrySet()) {
                valuesMap.put(entry.getKey(),
                        entry.getValue() + settings.getUpdateValue());
            }
        } finally {
            reentrantReadWriteLock.writeLock().unlock();
        }
    }

    public Integer getValue(String key) {
        reentrantReadWriteLock.readLock().lock();
        try {
            return valuesMap.get(key);
        } finally {
            reentrantReadWriteLock.readLock().unlock();
        }
    }
}
But I think I overdid it. Can I use only ConcurrentHashMap in this situation?
Can I use only ConcurrentHashMap in this situation?
It depends on whether you want the update operation to be atomic; i.e. whether you want a reader to never see the state of the map when only some of the put operations have been performed.
If update doesn't need to be atomic, then locking is unnecessary. In fact, it is an unwanted concurrency bottleneck.
If update needs to be atomic, then the lock is necessary. But if you are controlling access with locks, then you don't need to use a ConcurrentHashMap. A simple HashMap would be better.
I don't think that ConcurrentHashMap has a way to implement multiple put operations as an atomic action.
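For what it's worth, if per-entry atomicity is enough (a reader may see the map mid-update, but each individual entry is replaced atomically), a lock-free sketch along these lines would do; Settings and getUpdateValue() are taken from the question:
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class Updater {
    private final ConcurrentMap<String, Integer> valuesMap = new ConcurrentHashMap<>();

    // replaceAll updates each entry atomically, but the pass over the map as a
    // whole is NOT atomic: a reader may see some entries updated and others not.
    public void update(Settings settings) {
        valuesMap.replaceAll((key, value) -> value + settings.getUpdateValue());
    }

    public Integer getValue(String key) {
        return valuesMap.get(key);
    }
}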
Related
I have a List<List<File>> which I divide into chunks. For each chunk I run a thread, and each thread builds a TreeMap<String, List<String>>. I want to merge all of the TreeMap<String, List<String>>s into a single map (they can contain the same keys, so some comparison needs to be done), or alternatively use some thread-safe or concurrent collection, but I don't know which one (it needs to be sortable). The keys can be the same in each thread, but their values need to be merged.
List<File> filesAndFoldersList = Arrays.asList(filesAndFoldersArray);
List<List<File>> filesArrays = divideArrayIntoChunks(filesAndFoldersList, 100);
for (int i = 0; i < filesArrays.size(); i++) {
    List<File> filesList = filesArrays.get(i);
    BuildNamesAndFileNamesMap buildNamesAndFileNamesMap = new BuildNamesAndFileNamesMap(... , i);
    Thread thread = new Thread(buildNamesAndFileNamesMap, "Thread " + i);
    thread.start();
    if (buildNamesAndFileNamesMap.hasFinished()) {
        namesAndFileNamesPartMap = buildNamesAndFileNamesMap.getNamesAndFileNamesMap();
        ... // join to main TreeMap or?
    }
}
This is the BuildNamesAndFileNamesMap class:
private static class BuildNamesAndFileNamesMap implements Runnable {
    ....
    private final TreeMap<String, List<String>> namesAndFileNamesMap;
    private int threadNumber;
    private boolean hasFinished = false;

    public BuildNamesAndFileNamesMap(..., int threadNumber) {
        ...
    }

    public TreeMap<String, List<String>> getNamesAndFileNamesMap() {
        return namesAndFileNamesMap;
    }

    public boolean hasFinished() {
        return hasFinished;
    }

    @Override
    public void run() {
        ...
        namesAndFileNamesMap.put(...);
        hasFinished = true;
    }
}
I take it that what you posted is pseudocode (otherwise, how would the synchronous buildNamesAndFileNamesMap.hasFinished() check work)?
The simplest approach would be to use a synchronization mechanism like CountDownLatch and await() termination of all threads on the main thread. You can then merge all the results into a plain (non-concurrent) Map using Map.merge, for example, thus eliminating the need for a concurrent collection (for a cleaner design, you might want to look into CompletableFuture.allOf(...)).
If you want more parallelism than that (i.e. the child threads collect the intermediate results themselves, rather than in a post-processing step), you would need to make the BuildNamesAndFileNamesMap instances share a concurrent collection. Since it sounds like you will need Map.merge for combining results from different threads, a ConcurrentHashMap will be most appropriate; otherwise, if no merging is needed, a ConcurrentSkipListMap would be enough (and keeps the keys sorted).
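Here is a minimal sketch of the latch-based approach. It assumes a constructor that also receives the latch and a run() method that calls latch.countDown() as its last statement (neither is in the posted code); InterruptedException handling is omitted:
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CountDownLatch;

// Start one worker per chunk, wait for all of them, then merge on the main thread.
CountDownLatch latch = new CountDownLatch(filesArrays.size());
List<BuildNamesAndFileNamesMap> workers = new ArrayList<>();

for (int i = 0; i < filesArrays.size(); i++) {
    // assumed constructor: chunk, thread number, latch (the worker calls latch.countDown() at the end of run())
    BuildNamesAndFileNamesMap worker = new BuildNamesAndFileNamesMap(filesArrays.get(i), i, latch);
    workers.add(worker);
    new Thread(worker, "Thread " + i).start();
}

latch.await(); // block until every worker has finished (declare or handle InterruptedException)

Map<String, List<String>> merged = new TreeMap<>(); // a plain TreeMap is fine: only the main thread touches it now
for (BuildNamesAndFileNamesMap worker : workers) {
    for (Map.Entry<String, List<String>> e : worker.getNamesAndFileNamesMap().entrySet()) {
        // Map.merge concatenates the value lists when the same key appears in several workers
        merged.merge(e.getKey(), e.getValue(), (a, b) -> { a.addAll(b); return a; });
    }
}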
Here is the code in one of my classes:
class SomeClass {
    private Map<Integer, Integer> map = new ConcurrentHashMap<>();
    private volatile int counter = 0;
    final AtomicInteger sum = new AtomicInteger(0); // will be used in other classes/threads too
    private ReentrantLock l = new ReentrantLock();

    public void put(String some) {
        l.lock();
        try {
            int tmp = Integer.parseInt(some);
            map.put(counter++, tmp);
            sum.getAndAdd(tmp);
        } finally {
            l.unlock();
        }
    }

    public Double get() {
        l.lock();
        try {
            //... perform some map resizing operation ...
            // some calculations including sum field ...
        } finally {
            l.unlock();
        }
    }
}
You can assume that this class will be used in a concurrent environment.
The question is: do you think the locks are necessary? How does this code smell? :)
Let's look at the operations inside public void put(String some).
map.put(counter++, tmp);
sum.getAndAdd(tmp);
Now let's look at the individual parts.
counter is a volatile variable. So it only provides memory visibility but not atomicity. Since counter++ is a compound operation, you need a lock to achieve atomicity.
map.put(key, value) is atomic since it is a ConcurrentHashMap.
sum.getAndAdd(tmp) is atomic since it is an AtomicInteger.
As you can see, except counter++ every other operation is atomic. However, you are trying to achieve some function by combining all these operations. To achieve atomicity at the functionality level, you need a lock. This will help you to avoid surprising side effects when the threads interleave between the individual atomic operations.
So you need a lock because counter++ is not atomic and you want to combine a few atomic operations to achieve some functionality (assuming you want this to be atomic).
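To make the point concrete, here is a sketch (not the poster's code) in which counter is an AtomicInteger, so the increment by itself no longer needs the lock; the lock remains only because the map update and the sum update are supposed to happen together as one atomic step:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

class SomeClass {
    private final Map<Integer, Integer> map = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger(0); // atomic increment, no lock needed for this alone
    final AtomicInteger sum = new AtomicInteger(0);
    private final ReentrantLock l = new ReentrantLock();

    public void put(String some) {
        l.lock(); // still required so that the map and sum are never observed "out of sync"
        try {
            int tmp = Integer.parseInt(some);
            map.put(counter.getAndIncrement(), tmp);
            sum.getAndAdd(tmp);
        } finally {
            l.unlock();
        }
    }
}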
Since you always increment counter when you use it as a key to put into this map:
map.put(counter++, tmp);
when you come to read it again:
return sum / map.get(counter);
map.get(counter) will be null, so this results in an NPE (unless you put more than 2^32 things into the map, of course). (I'm assuming you mean sum.get(), otherwise it won't compile.)
As such, you can have equivalent functionality without any locks:
class SomeClass {
    public void put(String some) { /* do nothing */ }

    public Double get() {
        throw new NullPointerException();
    }
}
You've not really fixed the problem with your edit. divisor will still be null, so the equivalent functionality without locks would be:
class SomeClass {
    private final AtomicInteger sum = new AtomicInteger(0);

    public void put(String some) {
        sum.getAndAdd(Integer.parseInt(some));
    }

    public Double get() {
        return (double) sum.get();
    }
}
I have a static HashMap and three threads which try to access it simultaneously from their corresponding classes.
Each thread's task is to get the list value for a specified key, perform some operations on the list (modify it), and put the processed list back into the HashMap.
I want the other threads that try to access the HashMap to wait until the current thread finishes processing and modifying the HashMap.
In some situations, the flow is like this:
Thread A retrieves the HashMap, and while Thread A is still processing the list, Thread B retrieves the HashMap and starts its own processing.
The actual behaviour has to be like this:
Thread A -> retrieves HashMap -> process -> put value in HashMap.
Thread B -> retrieves HashMap -> process -> put value in HashMap.
Thread C -> retrieves HashMap -> process -> put value in HashMap.
logic :
apply lock on HashMap
retrieve.
process.
put into HashMap.
release lock.
Help me convert this logic to code; any suggestions are accepted with a smile.
You can really make use of ReentrantReadWriteLock. Here is the link for that:
Javadoc for ReentrantReadWriteLock
I would implement the feature as something like this:
public class Test {
    private Map<Object, Object> map = new HashMap<>();
    private ReentrantReadWriteLock reentrantReadWriteLock = new ReentrantReadWriteLock();

    public void process() {
        methodThatModifiesMap();
        methodThatJustReadsmap();
    }

    private void methodThatModifiesMap() {
        // If the code modifies the structure of the map, e.g. put() or remove(), acquire the write lock
        reentrantReadWriteLock.writeLock().lock();
        try {
            // Do your thing and put() into or remove() from the map
        } finally {
            // Don't forget to unlock
            reentrantReadWriteLock.writeLock().unlock();
        }
    }

    private void methodThatJustReadsmap() {
        // If all you are doing is reading, i.e. get(), take the read lock.
        // It does not block reads from other threads, as long as no thread holds the write lock.
        reentrantReadWriteLock.readLock().lock();
        try {
        } finally {
            reentrantReadWriteLock.readLock().unlock();
        }
    }
}
Not only is your map thread-safe, the throughput is better too.
You can use ConcurrentHashMap instead of HashMap. ConcurrentHashMap gives better performance and reduces the overhead of locking the whole HashMap while another thread is accessing it.
You can find more details on this page as well - http://crunchify.com/hashmap-vs-concurrenthashmap-vs-synchronizedmap-how-a-hashmap-can-be-synchronized-in-java/
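Purely as an illustration (not taken from that page), the retrieve-process-put sequence from the question can be made atomic per key with ConcurrentHashMap.compute; processList below is a hypothetical stand-in for your list processing:
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ListHolder {
    // shared map: one list of values per key
    private static final Map<String, List<String>> MAP = new ConcurrentHashMap<>();

    // compute() runs the whole get-modify-put for one key atomically;
    // other threads touching the same key wait until it finishes, while
    // threads working on other keys are not blocked.
    static void process(String key) {
        MAP.compute(key, (k, list) -> processList(list)); // processList is hypothetical
    }

    private static List<String> processList(List<String> list) {
        // ... modify the list here (placeholder for the actual processing); list is null if the key was absent ...
        return list;
    }
}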
You can either use ConcurrentHashMap as suggested above, or use class-level locks. What I mean by that is using the synchronized keyword on a static method, e.g.:
public class SynchronizedExample extends Thread {
    static HashMap map = new HashMap();

    public synchronized static void execute() {
        // Modify and read the HashMap
    }

    public void run() {
        execute();
    }
}
Also, as others mentioned, synchronized methods can become a performance bottleneck, depending on how coarse-grained you make the synchronized sections.
You can also compare class-level locks vs object-level locks (they are almost the same, but do check the difference).
I was going to implement some code which needs a synchronized data structure. I came up with Hashtable and Collections.synchronizedMap(HashMap). I wouldn't be needing ConcurrentHashMap for this. I was wondering which of the two would be better.
PS: I will be calling a lot of getters on this object and they would not be called at the same time, so there should be no concurrency issue there either.
ConcurrentHashMap is much more scaleable: http://www.javamex.com/tutorials/concurrenthashmap_scalability.shtml
Hashtable and Collections.synchronizedMap(HashMap) provide roughly the same performance, but they are only conditionally thread-safe (i.e. they are not fully thread-safe; compound operations still need external synchronization).
If there are a lot of read operations, I would recommend wrapping it with read-write locks:
public class MyHashMap<K, V> extends HashMap<K, V> {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    @Override
    public V put(K key, V value) {
        final Lock w = lock.writeLock();
        w.lock();
        try {
            return super.put(key, value);
        } finally {
            w.unlock();
        }
    }

    @Override
    public V get(Object key) {
        final Lock r = lock.readLock();
        r.lock();
        try {
            return super.get(key);
        } finally {
            r.unlock();
        }
    }

    .... // the same approach, distinguishing read and write operations
}
UPDATE:
I will be calling a lot of getter of this object and they would not be at the same time
It doesn't guarantee that you don't need synchronization.
Unless you need to acquire a lock on the whole map for some reason (unlikely) you should go for ConcurrentHashMap which gives much better scalability.
Hashtable and the synchronized wrapper (Collections.synchronizedMap(HashMap)) use a single lock, whereas ConcurrentHashMap partitions the map into 16 segments by default, each with its own lock, which gives much better concurrent access.
Although Hashtable is thread-safe, that doesn't guarantee it makes your whole code thread-safe. Hashtable also has some performance issues. So you should use HashMap, but then you have to manage all the thread-safety yourself.
Let's say I have a HashMap declared as follows:
@GuardedBy("pendingRequests")
private final Map<UInt32, PendingRequest> pendingRequests = new HashMap<UInt32, PendingRequest>();
Access to the map is multi-threaded, and all access is guarded by synchronizing on this final instance of the map, e.g.:
synchronized (pendingRequests) {
pendingRequests.put(reqId, request);
}
Is this enough? Should the map be created using Collections.synchronizedMap()? Should I be locking on a dedicated lock object instead of the map instance? Or maybe both?
External synchronization (in addition to possibly using Collections.synchronizedMap()) is needed in a couple areas where multiple calls on the map must be atomic.
Synchronizing on the map itself is essentially what the Map returned by Collections.synchronizedMap() would do. For your situation it is a reasonable approach, and there is not much to recommend using a separate lock object other than personal preference (or if you wish to have more fine-grained control and use a ReentrantReadWriteLock to allow concurrent reading of the map).
E.g.
private Map<Integer, Object> myMap;
private ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();

public void myReadMethod()
{
    rwl.readLock().lock();
    try
    {
        myMap.get(...);
        ...
    } finally
    {
        rwl.readLock().unlock();
    }
}

public void myWriteMethod()
{
    // may want / need to call rwl.readLock().unlock() here,
    // since if you are holding the readLock here already then
    // you cannot get the writeLock (so be careful on how your
    // methods lock/unlock and call each other).
    rwl.writeLock().lock();
    try
    {
        myMap.put(key1, item1);
        myMap.put(key2, item2);
    } finally
    {
        rwl.writeLock().unlock();
    }
}
All calls to the map need to be synchronized, and Collections.synchronizedMap() gives you that.
However, there is also an aspect of compound logic. If you need the integrity of the compound logic, synchronization of individual calls is not enough. For example, consider the following code:
Object value = yourMap.get(key); // synchronized
if (value == null) {
    // do more action
    yourMap.put(key, newValue); // synchronized
}
Although individual calls (get() and put()) are synchronized, your logic will not be safe against concurrent access.
Another interesting case is when you iterate. For an iteration to be safe, you'd need to synchronize for the entire duration of the iteration, or you will get ConcurrentModificationExceptions.
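For example, here is a minimal sketch of the iteration case with a Collections.synchronizedMap wrapper; holding the map's monitor for the whole loop is exactly the manual synchronization its documentation asks for:
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

Map<String, Integer> yourMap = Collections.synchronizedMap(new HashMap<>());

// Individual get()/put() calls are synchronized by the wrapper, but iteration
// spans many calls, so hold the map's lock for the entire loop.
synchronized (yourMap) {
    for (Map.Entry<String, Integer> entry : yourMap.entrySet()) {
        // other threads cannot modify the map while this loop runs
        System.out.println(entry.getKey() + " = " + entry.getValue());
    }
}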