I'm new to threading in Java and I need to access a data structure from a few active threads. I've heard that java.util.concurrent.ConcurrentHashMap is threading-friendly. Do I need to use synchronized(map){} while accessing ConcurrentHashMap, or will it handle locks itself?
It handles the locks itself, and in fact you have no access to them (there is no other option)
You can use synchronized in special cases for writes, but it is very rare that you should need to do this. e.g. if you need to implement your own putIfAbsent because the cost of creating an object is high.
Using synchronized for reads would defeat the purpose of using the concurrent collection.
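For the expensive-object case mentioned above, Java 8 and later provide computeIfAbsent, which often removes the need for a hand-written putIfAbsent. The following is only a sketch: the class name ExpensiveCache, the load method and the creations counter are made up for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ExpensiveCache {
    // Counts how many times the (hypothetical) expensive factory actually ran.
    static final AtomicInteger creations = new AtomicInteger();
    static final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

    // Hypothetical stand-in for an object that is costly to build.
    static String load(String key) {
        creations.incrementAndGet();
        return "value-for-" + key;
    }

    static String get(String key) {
        // The mapping function is applied atomically, at most once per absent key.
        return cache.computeIfAbsent(key, ExpensiveCache::load);
    }
}
```

Calling ExpensiveCache.get("k") twice should build the value only once and return the same instance both times.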
ConcurrentHashMap is suited only to the cases where you don't need any more atomicity than it provides out of the box. If, for example, you need to get a value, do something with it, and then set a new value, all as one atomic operation, separate get and put calls cannot achieve this; without an atomic method that covers the whole sequence you need external locking.
In such cases, if every access ends up guarded by explicit locks in your code anyway, using this implementation instead of the basic HashMap is nothing but waste.
Short answer: no you don't need to use synchronized(map).
Long answer:
all the operations provided by ConcurrentHashMap are thread safe and you can call them without worrying about locking
however, if you need some operations to be atomic in your code, you will still need some sort of locking at the client side
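To illustrate the difference between the two points above, here is a minimal sketch. The Accounts class, its method names and the transferLock field are hypothetical. A single-key update can rely on the map's own atomic merge, while an update that must change two keys together needs a lock supplied by the client code.

```java
import java.util.concurrent.ConcurrentHashMap;

class Accounts {
    private final ConcurrentHashMap<String, Integer> balances = new ConcurrentHashMap<>();
    // Client-side lock guarding operations that span more than one key.
    private final Object transferLock = new Object();

    void deposit(String account, int amount) {
        // Single-key update: merge is atomic, no extra locking needed.
        balances.merge(account, amount, Integer::sum);
    }

    void transfer(String from, String to, int amount) {
        // Two keys must change together, so the caller's code supplies the lock.
        synchronized (transferLock) {
            balances.merge(from, -amount, Integer::sum);
            balances.merge(to, amount, Integer::sum);
        }
    }

    int total() {
        // Holding the same lock means no transfer is half-applied while summing.
        synchronized (transferLock) {
            return balances.values().stream().mapToInt(Integer::intValue).sum();
        }
    }

    int balance(String account) {
        return balances.getOrDefault(account, 0);
    }
}
```

Note that deposit does not take the lock, so total is only guaranteed consistent with respect to transfers; whether that is enough depends on the application.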
No, you don't need to, but if you need to depend on the map's internal synchronization (that is, on locking the map object itself), you should use Collections.synchronizedMap instead. From the javadoc of ConcurrentHashMap:
This class is fully interoperable with Hashtable in programs that rely on its thread safety but not on its synchronization details.
Actually it won't synchronize on the whole data structure but on subparts (some buckets) of it.
This implies that ConcurrentHashMap's iterators are weakly consistent and the size of the map can be inaccurate. (But on the other hand its put and get operations are still consistent and the throughput is higher.)
There is one more important feature of ConcurrentHashMap to note besides the concurrency it provides: its fail-safe iterator. Some developers use ConcurrentHashMap just because they want to modify the entry set (put/remove) while iterating.
Collections.synchronizedMap(Map) is the other option, but a ConcurrentModificationException may occur in the above case.
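The difference in iterator behavior can be sketched as follows. IterationDemo and removeWhileIterating are hypothetical names, and the ConcurrentModificationException from a HashMap-backed map is a best-effort check rather than a guarantee.

```java
import java.util.ConcurrentModificationException;
import java.util.Map;

class IterationDemo {
    // Fills the map, then removes entries while iterating over its key set.
    // Returns true if the loop finished, false if the iterator failed fast.
    static boolean removeWhileIterating(Map<String, Integer> map) {
        for (int i = 0; i < 10; i++) {
            map.put("k" + i, i);
        }
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural modification during iteration
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }
}
```

Passing a new ConcurrentHashMap should let the loop run to completion, whereas passing Collections.synchronizedMap(new HashMap<>()) is expected to fail on the second iteration step.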
Why do we need a thread-safe collection if we can easily convert a non-thread-safe collection into a thread-safe one?
Ex: we can create a synchronized ArrayList by using the Collections.synchronizedList() method.
synchronizedList just wraps all methods with exclusive locks. That may be too strict for you. For example, you may very well want to allow any number of concurrent read operations to proceed at the same time (and only serialize writes). A specialized implementation can offer that.
synchronizedList is only thread-safe in the sense that its internal state does not get corrupted. That may not be enough for your application. For example if (list.isEmpty()) list.add(1); is not thread-safe even on a synchronized list. Nor is for (String x: list) giving you a snapshot iteration. Specialized implementations can add higher-level atomic operations.
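As a sketch of what the caller has to do to make those two examples safe on a synchronized list: hold the list's own lock across the whole compound operation. SyncListDemo and its method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SyncListDemo {
    static final List<Integer> list = Collections.synchronizedList(new ArrayList<>());

    // Check-then-act made atomic by holding the list's own lock across both calls.
    static boolean addIfEmpty(int value) {
        synchronized (list) {
            if (list.isEmpty()) {
                list.add(value);
                return true;
            }
            return false;
        }
    }

    // Iteration must also be guarded manually; copying under the lock yields a snapshot.
    static List<Integer> snapshot() {
        synchronized (list) {
            return new ArrayList<>(list);
        }
    }
}
```

This works because the wrapper returned by Collections.synchronizedList uses itself as the lock, which is also why its javadoc asks callers to synchronize on the list while iterating.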
Why we need a thread-safe collection...
You don't need them, because, as you have pointed out,
we can create Synchronized ArrayList by using Collections.synchronizedList() method.
So why does the library provide "concurrent" collection classes? It's because some of those classes can be implemented using thread-safe algorithms, and especially, non-blocking algorithms that may be more efficient or safer than using a mutex-protected algorithm.
Of course, as others have pointed out, simply protecting a collection might not always be enough for your application. You might need a mutex anyway to protect some other data that is related to the collection.
But, if the lock-free versions are helpful to you, then the good news is that they are there; and if they are not helpful, then the good news is that you don't have to use them.
In JDK 1.8, is it necessary to use ConcurrentHashMap on all occasions that require synchronization, and never use Collections.synchronizedMap(Map)?
No, it is not necessary to replace use of Collections.synchronizedMap with ConcurrentHashMap when used by multiple threads.
The Concurrent Collections section in the javadoc for package java.util.concurrent says:
"Synchronized" classes can be useful when you need to prevent all access to a collection via a single lock, at the expense of poorer scalability. In other cases in which multiple threads are expected to access a common collection, "concurrent" versions are normally preferable.
As you can see, both Collections.synchronizedMap and ConcurrentHashMap can have their uses.
See also "What's the difference between ConcurrentHashMap and Collections.synchronizedMap(Map)?".
I understand that ConcurrentHashMap allows only a single thread at a time to perform an update/write operation on each "segment". However, multiple threads are allowed to read values from the map at the same time.
For my project, I want to extend this functionality such that while getting a value from a particular segment, no update/write operations should take place in that segment until read is completed.
Any ideas to achieve this?
Just to elaborate on the problem I'm facing right now: after reading a value from the map, I perform certain update operations which are strongly dependent on that read value. Thus, if a separate thread updates a key's value and another thread's get() fails to see the most recently updated value, this will lead to a big mess. So in this case, would extending be a good idea?
My gut says no. Extending ConcurrentHashMap does not sound like a good idea.
One of the most valuable design principles to which you can adhere is called "Separation of Concerns." The main "concern" of a HashMap is to store key/value pairs. Sounds like maintaining consistent relationships between certain data in your program is another concern.
Don't try to address both concerns with a single class. I would create a higher-level class to take care of maintaining the consistent relationships (maybe by using Lock objects), and I would use a plain HashMap or ConcurrentHashMap to store the key/value pairs.
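A minimal sketch of such a higher-level class, assuming a made-up Inventory example: the map only stores key/value pairs, while the wrapper owns a Lock and keeps the read and the dependent update together.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

class Inventory {
    // Plain HashMap: every access goes through the lock below.
    private final Map<String, Integer> stock = new HashMap<>();
    private final ReentrantLock lock = new ReentrantLock();

    void add(String item, int quantity) {
        lock.lock();
        try {
            stock.merge(item, quantity, Integer::sum);
        } finally {
            lock.unlock();
        }
    }

    // Read, decide and write happen under one lock, so no other thread
    // can update the value between the read and the dependent write.
    boolean reserve(String item, int quantity) {
        lock.lock();
        try {
            int available = stock.getOrDefault(item, 0);
            if (available < quantity) {
                return false;
            }
            stock.put(item, available - quantity);
            return true;
        } finally {
            lock.unlock();
        }
    }

    int available(String item) {
        lock.lock();
        try {
            return stock.getOrDefault(item, 0);
        } finally {
            lock.unlock();
        }
    }
}
```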
Extend the ConcurrentHashMap class, and implement the getValue() method by including a synchronized block, so that no access is allowed to other threads until the read operation is completed.
Informally, you can think of a Map as a set of "variables", where each "variable" is addressed by a key (instead of the static name of an ordinary variable).
(An array is formally a list of variables, each addressed by an integer index.)
In HashMap, these "variables" are like "plain" variables; if you access a "variable" concurrently, things may go wrong (just like ordinary non-volatile variables)
In ConcurrentHashMap, these "variables" have volatile semantics. Therefore it is "more" safe to use concurrently. For example, a write will be visible to the "subsequent" read.
Of course, volatile is not enough sometimes; for example, we know we cannot use a volatile int for atomic increments (without locking). We need new devices, like AtomicInteger, for atomic operations.
Fortunately, in Java 8, new atomic methods are added to ConcurrentHashMap, so that now we can operate on these "variables" atomically. See if the compute() method may fit your use case.
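A small sketch of compute() used for a read-modify-write on one key. ComputeDemo, record and hammer are hypothetical names; hammer simply runs several threads to show that no increments are lost.

```java
import java.util.concurrent.ConcurrentHashMap;

class ComputeDemo {
    static final ConcurrentHashMap<String, Integer> hits = new ConcurrentHashMap<>();

    static int record(String page) {
        // The remapping function sees the current value (null if absent) and
        // its result is stored atomically for that key.
        return hits.compute(page, (key, old) -> old == null ? 1 : old + 1);
    }

    // Starts the given number of threads, each recording perThread hits on one key,
    // waits for them, and returns the final count.
    static int hammer(int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    record("home");
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return hits.get("home");
    }
}
```

The function passed to compute should be short and must not update other mappings of the same map.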
I have a data structure as:-
Map<String,Map<String,List<CustomPOJO>>>
The frequency of read operations on this data structure will be too high and write operations will also be there but not many.
As far as read is concerned I guess there is no issue by using simple java.util.HashMap API.
For write operations there can be two approaches:-
Put entire data in ConcurrentHashMap and use it to write data into it.
Perform all write operations in a synchronized block/method and use simple java.util.HashMap API.
Please suggest which one would be better for write operation and also suggest whether there can be any loophole in read operation.
Firstly, how predictable is the outer Map's key string value? If that's all predictable at design time, I would rather turn that into an Enum and use an EnumMap to hold the outer Map. The same applies for the inner map, too. In that case your question turns into
EnumMap<Enum, EnumMap<Enum, List<POJO>>>
and is perfectly resolved.
Secondly, since you are using a map-of-maps structure in an environment where performance matters, I would assume the number of keys in the outer map << the total number of POJOs inside the entire structure. That is to say, the chance that you add a new submap to the whole structure is very small. In this case a ReadWriteLock is best on the outer Map; for the inner map you could consider either a ReadWriteLock or a ConcurrentHashMap.
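A simplified sketch of the ReadWriteLock approach. For brevity it guards the whole nested structure with a single lock rather than separate locks for the outer and inner maps, and it uses String in place of the question's CustomPOJO. NestedStore and its methods are made-up names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class NestedStore {
    private final Map<String, Map<String, List<String>>> data = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void add(String outer, String inner, String item) {
        // Writers are exclusive: no reader or other writer runs concurrently.
        lock.writeLock().lock();
        try {
            data.computeIfAbsent(outer, k -> new HashMap<>())
                .computeIfAbsent(inner, k -> new ArrayList<>())
                .add(item);
        } finally {
            lock.writeLock().unlock();
        }
    }

    List<String> get(String outer, String inner) {
        // Any number of readers may hold the read lock at the same time.
        lock.readLock().lock();
        try {
            Map<String, List<String>> innerMap = data.get(outer);
            List<String> items = innerMap == null ? null : innerMap.get(inner);
            // Return a copy so callers never touch the guarded list outside the lock.
            return items == null ? new ArrayList<>() : new ArrayList<>(items);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Reads must go through the read lock as well; reading the plain HashMap without any synchronization while another thread writes is not safe.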
There are 3 major design considerations for the ConcurrentHashMap:
1. That it generates a lot of temporary objects. So if your application is GC-sensitive you want to limit its usage.
2. That it allows at most 16 threads to update it concurrently by default (the default concurrencyLevel in versions before Java 8), but this is unlikely to be a concern.
3. That its size() isn't constant time.
I would usually apply the ReadWriteLock pattern or even an atomic-variable-based implementation mainly when point 1 turns out to be a problem. Otherwise I think ConcurrentHashMap does fine in most circumstances.
Also, note that in recent JDK implementations the read/write priority of ReadWriteLock has changed. If my memory is correct, it seems to favor read operations over writes, so if you have too many reads your write thread might starve. In that case you might want your own read/write lock implementation.
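Before writing a custom lock, one option that may be worth trying is the fair mode of ReentrantReadWriteLock, which grants the lock in approximately arrival order. Whether it resolves writer starvation for a given workload is an assumption to verify by measurement. FairLockDemo is a made-up name.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

class FairLockDemo {
    // Passing true requests the fair (approximately arrival-order) policy.
    static final ReentrantReadWriteLock fair = new ReentrantReadWriteLock(true);
    // The no-argument constructor uses the non-fair default.
    static final ReentrantReadWriteLock nonFair = new ReentrantReadWriteLock();
}
```

Fair locks usually have lower throughput than non-fair ones, so this is a trade-off rather than a free fix.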
I think this article best suits your question; it has an explanation. Personally I think you should use ConcurrentHashMap because it allows readers to read without blocking the whole map.
http://javarevisited.blogspot.com/2011/04/difference-between-concurrenthashmap.html
ConcurrentHashMap was introduced in 1.5 as part of the java.util.concurrent package. Before that, the only way to have a thread-safe map was to use Hashtable or Collections.synchronizedMap(Map).
For all practical purposes (in a multithreaded environment), ConcurrentHashMap is sufficient to address the needs, except for one case in which a thread needs a uniform view of the map.
My question is: apart from having a uniform view of the map, are there any other scenarios in which ConcurrentHashMap is not an option?
The usage of Hashtable has been discouraged since Java 1.2 and the utility of synchronizedMap is quite limited and almost always ends up being insufficient due to the too-fine granularity of locking. However, when you do have a scenario where individual updates are the grain size you need, ConcurrentHashMap is a no-brainer better choice over synchronizedMap. It has better concurrency, thread-safe iterators (no, synchronizedMap doesn't have those—this is due to its design as a wrapper around a non-thread-safe map), better overall performance, and very little extra memory weight to pay for it all.
This is a stretch but I will give it as a use case.
Suppose you need a thread-safe Map implementation on which you can perform some extra compound operation that isn't available via ConcurrentMap. Let's say you want to ensure two other objects don't exist before adding a third.
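import java.util.Hashtable;

Hashtable<Object, Object> t = new Hashtable<>();
// Hold the table's own monitor so the checks and the put form one atomic step.
synchronized (t) {
    // containsKey checks keys; the original contains() checks values, which
    // is equivalent here only because each object is stored as both key and value.
    if (!t.containsKey(object1) && !t.containsKey(object2)) {
        t.put(object3, object3);
    }
}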
Again this is a stretch, but you would not be able to achieve this with a CHM while ensuring atomicity and thread-safety. Because all operations of a Hashtable and its synchronizedMap counterpart synchronize on the instance of the Map, this ensures thread-safety.
At the end of the day I would seldom, if ever, use a synchronizedMap/Hashtable and I suggest you should do the same.
As far as I understand, ConcurrentMap is a replacement for Hashtable and Collections.synchronizedMap() for thread-safe purposes. Use of those older classes is discouraged. Thus, the answer to your question is "no, there are no other scenarios".
See also: What's the difference between ConcurrentHashMap and Collections.synchronizedMap(Map)?