HashMap and concurrency- different keys - java

Say I have a hash map and multiple threads. If I have a synchronized method that adds to the hash map, how would I make it possible that two different threads can put different keys simultaneously (concurrently) into the hash map?
My current implementation is a synchronized method. Would this allow two different threads to put two different keys simultaneously into the hash map?
I am using a regular hash map, not Java's concurrent hash map. I am not allowed to use a concurrent hash map.
EDIT:
I think I found a solution! I think I may have miswrote this post. Let's say that the hash map is initialized as a Integer as its key and a LinkedList as its value. In order to put a totally new key, I realize that the whole hash map has to be synchronized (i.e. locked). However, if I am trying to add another String into an already contained key's corresponding LinkedList, I can just synchronize the hash map's get method. I think this will allow multiple threads to simultaneously (concurrently) add to the LinkedLists of different, already contained keys. Please let me know if I'm wrong.
Here's a concrete example. I have a hash map hashMap that uses an Integer as its key and a LinkedList as its value. The keys 5, and 10 are already in the hash map. The key 5 contains a LinkedList of Joey, Joe, Kerry. The key 10 contains the LinkedList of Jerry, Mary, Tim. I have two threads t1 and t2. t1 wants to add Moe to the LinkedList corresponding to key 5. t2 wants to add Harry to the LinkedList corresponding to key 10. Both will be concurrently added to the hash map, since the hash map's value is only locked.

My current implementation is a synchronized method. Would this allow two different threads to put two different keys simultaneously into the hash map?
No. Only a ConcurrentHashMap or a specifically designed concurrent map would support this safely, so it's impossible for you to put two keys simultaneously into the same map from two threads.
how would I make it possible that two different threads can put different keys simultaneously (concurrently) into the hash map?
You cannot, without using ConcurrentHashMap, another ConcurrentMap implementation, or implementing your own.

The easiest answer is to use a ConcurrentHashMap, which does exactly what you're looking for.
If you can't do that (I see you edited your post after I answered), then you'll have to duplicate the same thing that ConcurrentHashMap does. No, simply synchronizing the method of a HashMap will not allow two threads to add a key-value pair at the same time, they have to wait and go one at a time.

The answer to your question is NO. Because synchronized block forces every tread to wait in the single queue. With ConcurrentHashMap you have more chances for simultaneously add, because it locks only basket where element will be inserted instead of locking whole HashMap.

However, if I am trying to add another String into an already contained key's corresponding LinkedList, I can just synchronize the hash map's get method. I think this will allow multiple threads to simultaneously (concurrently) add to the LinkedLists of different, already contained keys.
Read-only access to a HashMap is safe: you can have multiple threads call the get method with no synchronization at all and nothing breaks. If the linked lists are not shared between threads you don't need synchronization either. To be sure the threads never share a list the map key should be something specific to the thread, like an object created locally or the thread ID.
What's not safe is to let another thread modify a map or a list concurrently with a read or write. This is the use case for a read-write lock: it allows multiple concurrent reads but writes have to be exclusive.

Related

Does a thread put a new value in its own segment in ConcurrentHashMap

I have a ConcurrentHashMap and i try to put a key value pair using putIfAbsent().
Now since each thread working on a concurrentHashMap has its own segment which consists of a set of key value pairs . Will a new key value pair be placed in it own segment or there is nothing mandatory like this ?
Thanks
You have a fundamental misunderstanding. Threads do not have their own segment. The mappings of a ConcurrentHashMap are distributed over the segments depending on the hash codes of their keys and in the best case, threads accessing different keys end up at different segments and hence, may work independently.
Threads accessing the same key can never end up at different segments. The principle of a Map that all keys are unique doesn’t change.
But this is a description of an outdated technology anyway. Since Java 8, ConcurrentHashMap does not use segments anymore. Depending on the capacity and the hash code distribution, all keys may get updated concurrently.

What is the difference between Segment of ConcurrentHashMap and buckets of HashMap theoretically?

I understand that in HashMap, the entries (Key, Value) are placed in buckets based on hash(Key.hashCode)--> The index that denotes the bucket location. In case an entry is already placed at that location, there is a linked list created and the new entry (if it has a different key --> via equals() method) is placed at the beginning of the linked list.
Can I co-relate this concept with that of ConcurrentHashMap, but instead of Buckets, there are Segments upon which individual threads have a lock. And instead of Entries, there are HashEntry(ies). In similar fashion, a linked list is created and if the Key-Value pair being inserted is different, based on equals() of the key, it is placed at end of the linked list.
Am I correct when I say that:
put of CHM is not synchronized, thus any thread can access this method, this put method calculates hash value of the key passed to it and gets the segment index (kind of like buckets). Then for that segment only, it calls the put method. Now under Segment, the put method specifies that there will be a lock(), so that only one thread can alter data in a particular segment, thus concluding that if the concurrency level is 16 there shall be 16 threads and thus these threads will be able to PUT values only one segment at a time.
A bucket is an individual slot in the map's array. This is the same with both HashMap and ConcurrentHashMap. Conceptually, the latter has its array broken into segments (each segment is an array of references), but that's it. Note that the CHM in Java 8 no longer has segments, it's all a single array.
Yes, it's the scheme known as segmented locking. It reduces inter-thread contention, but does not eliminate it.

Thread safe container for <key, value> pair

I need a container that contains [key, Value] pair.
Here, key = Integer, Value = User Defined class object.
Mutiple threads are trying to add [key, Value] pair in above container.
If key already present in the container, I want to update the value by checking some condition.
At the end I want container in sorted order, according to Key.
My efforts -
I used this synchronizedSortedMap and Sorted Map for above task.
SortedMap<Integer, USER_DEFINED_OBJECT> m = Collections.synchronizedSortedMap(new TreeMap<Integer, USER_DEFINED_OBJECT>());
This helps me to add pairs concurrently on above container.
And, yes If key already present, then I check some condition, then proceed.
Is my approach always thread safe ? If not, please correct me.
Updated
USER_DEFINED_OBJECT has some field index.
At the time of adding, I am checking if key is already present, then compare current USER_DEFINED_OBJECT with already present USER_DEFINED_OBJECT on the basis of above mentioned(in point 1) filed "index". If currect "index" is greater than update.
Use ConcurrentHashMap from java.util package, read the API ConcurrentHashMap
java.util.concurrent.ConcurrentSkipListMap
A scalable concurrent ConcurrentNavigableMap implementation. The map is sorted according to the natural ordering of its keys, or by a Comparator provided at map creation time, depending on which constructor is used.
This class implements a concurrent variant of SkipLists providing expected average log(n) time cost for the containsKey, get, put and remove operations and their variants. Insertion, removal, update, and access operations safely execute concurrently by multiple threads. Iterators are weakly consistent, returning elements reflecting the state of the map at some point at or since the creation of the iterator. They do not throw ConcurrentModificationException, and may proceed concurrently with other operations. Ascending key ordered views and their iterators are faster than descending ones.
The concurrent collections let you call methods like put, remove etc. in kind of transaction. Therefore it's thread safe.
From what I understood your scenario for adding new [key, value] pair is as follows:
Check whether the mapping already exists
If not, just add it
If yes, update the existing value in the mapping based on some check
I doubt there is an implementation in place which does this for you in thread-safe way. In the case I understood your use-case correctly you will need to add some manual synchronization on your own to make the update steps transactional.

Atomic way to reorder keys in a ConcurrentSkipListMap / ConcurrentSkipListSet?

Summary of this post: I have an set of ordered items whose order may change over time. I need to be able to iterate through this set from multiple threads, each of which may also want to update the order of the items.
For example, multiple threads need to access String keys in some arbitrary sorted order. They strings are not sorted according to their natural ordering, but by some values that may change (hence, a custom Comparator). My original implementation was to use a TreeSet and synchronize on it. If any of the keys needed to be reordered, a thread would remove the key from the map, update the comparison value, and reinsert the key. To implement this, the keys are native Strings, but the comparator has access to the values. This is a weird arrangement where the order of keys may change over time, but since a changed key is always removed and reinserted when it changes, it seems to work. (I suppose it could also work if the Strings were wrapped inside another object.)
I recently became aware of the ConcurrentSkipListSet/ConcurrentSkipListMap implementations which are basically thread-safe sorted sets (resp. maps.) It seems like I can now iterate through the keys without having to lock the entire data structure. However, is there a way I can use them to atomically remove a key and replace it with another, like the operation I was doing above, so that other iterating threads don't miss the item, and without having to use synchronize blocks?
If anyone can suggest a better data structure for this type of operation, I'm all ears, too!
is there a way I can use them to atomically remove a key and replace it with another, like the operation I was doing above, so that other iterating threads don't miss the item, and without having to use synchronize blocks?
The short answer is no. If you need to remove and reinsert, there is no atomic way to do this with any collection that I know of.
That said, one possibility would be for you to reinsert the item before deleting it from the skip list. This would cause a duplicate but may be easier to handle then a missing entry. You would reinsert it after you changed the object so it would sort differently. This assumes that the object would then be non-equal as well. But if the other threads that are processing the lists can't handle the duplicates then I think you are SOL.

Do I have to use a thread-safe Map implementation when only reading from it?

If I do the following.
Create a HashMap (in a final field)
Populate HashMap
Wrap HashMap with unmodifiable wrapper Map
Start other threads which will access but not modify the Map
As I understand it the Map has been "safely published" because the other threads were started after the Map was fully populated so I think it is ok to access the Map from multiple threads as it cannot be modified after this point.
Is this right?
This is perfectly fine concerning the map itself. But you need to realize the making the map unmodifiable will only make the map itself unmodifiable and not its keys and values. So if you have for example a Map<String, SomeMutableObject> such as Map<String, List<String>>, then threads will still be able to alter the value by for example map.get("foo").add("bar");. To avoid this, you'd like to make the keys/values immutable/unmodifiable as well.
As I understand it the Map has been "safely published" because the other threads were started after the Map was fully populated so I think it is ok to access the Map from multiple threads as it cannot be modified after this point.
Yes. Just make sure that the other threads are started in a synchronized manner, i.e. make sure you have a happens-before relation between publishing the map, and starting the threads.
This is discussed in this blog post:
[...] This is how Collections.unmodifiableMap() works.
[...]
Because of the special meaning of the keyword "final", instances of this class can be shared with multiple threads without using any additional synchronization; when another thread calls get() on the instance, it is guaranteed to get the object you put into the map, without doing any additional synchronization. You should probably use something that is thread-safe to perform the handoff between threads (like LinkedBlockingQueue or something), but if you forget to do this, then you still have the guarantee.
In short, no you don't need the map to be thread-safe if the reads are non-destructive and the map reference is safely published to the client.
In the example there are two important happens-before relationships established here. The final-field publication (if and only if the population is done inside the constructor and the reference doesn't leak outside the constructor) and the calls to start the threads.
Anything that modifies the map after these calls wrt the client reading from the map is not safely published.
We have for example a CopyOnWriteMap that has a non-threadsafe map underlying that is copied on each write. This is as fast as possible in situations where there are many more reads than writes (caching configuration data is a good example).
That said, if the intention really is to not change the map, setting an immutable version of the map into the field is always the best way to go as it guarantees the client will see the correct thing.
Lastly, there are some Map implementations that have destructive reads such as a LinkedHashMap with access ordering, or a WeakHashMap where entries can disappear. These types of maps must be accessed serially.
You are correct. There is no need to ensure exclusive access to the data structure by different threads by using mutex'es or otherwise since it's immutable. This usually greatly increases performance.
Also note that if you only wrap the original Map rather than creating a copy, ie the unmodifiable Map delegates method calls further to the inner HashMap, modifying the underlying Map may introduce race condition problems.
Immutable map is born to thread-safe. You could use ImmutableMap of Guava.

Categories