Is collection synchronizing (via Collections.synchronizedX) necessary when access methods are synchronized? - java

Many discussions of synchronization in Java recommend calling Collections.synchronized{Collection, List, Map, Set, SortedMap, SortedSet} instead of using a plain Collection, List, etc. to get thread-safe access in multithreaded code.
Imagine that several threads exist and all of them access the collection only through methods that have a synchronized block in their bodies.
Is it then necessary to use:
Collection collection = Collections.synchronizedCollection(new ArrayList<String>());
or is
Collection collection = new ArrayList<String>();
enough?
Can you show an example where the second approach, used instead of the first, causes evidently incorrect behaviour?

On the contrary, Collections.synchronizedCollection() is generally not sufficient because many operations (iterating, check-then-add, etc.) need additional, explicit synchronization.
If every access to the collection already goes through properly synchronized methods, then wrapping the collection again in a synchronized proxy is useless.

No, if your access methods are synchronized there is no need to also use a synchronized collection.
Collection collection = new ArrayList<String>();
will do just fine in that scenario.

If you have already arranged for proper synchronization of your code, you definitely do not need another layer of synchronization on the lower level of granularity.
Just make sure when you say
all of them need to access collection via methods that have synchronized block in their bodies.
that all these blocks use the same lock. It is not enough to just involve some synchronized block.
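A minimal sketch of the "same lock" point, assuming a hypothetical class GuardedNames that owns a plain ArrayList. Every method that touches the list synchronizes on one shared lock object, so no synchronized wrapper is needed. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: a plain ArrayList guarded by one shared lock.
class GuardedNames {
    private final Object lock = new Object();            // the single lock used by every access
    private final List<String> names = new ArrayList<>(); // not a synchronized wrapper

    void add(String name) {
        synchronized (lock) {
            names.add(name);
        }
    }

    int size() {
        synchronized (lock) {
            return names.size();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedNames guarded = new GuardedNames();
        Runnable writer = () -> {
            for (int i = 0; i < 1000; i++) {
                guarded.add("n" + i);
            }
        };
        Thread t1 = new Thread(writer);
        Thread t2 = new Thread(writer);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Both writers went through the same lock, so no add is lost.
        System.out.println(guarded.size()); // prints 2000
    }
}
```

If add() synchronized on one object and size() on another, each block would still be "synchronized", but the two could run at the same time and the list would be unprotected.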

Related

EnumMap with concurrent put/get

I am considering using EnumMap in a concurrent environment. However, the environment is atypical, here's why:
EnumMap is always full: there are no unmapped keys when the map is exposed to the concurrent environment
Only put() and get() operations will be used (no iterating over, no remove(), etc.)
It is completely acceptable if a call to get() does not reflect a call to put() immediately or in order.
From what I could gather, including relevant method source code, this seems to be a safe scenario (unlike if iterations were allowed). Is there anything I might have overlooked?
In general, using non-thread-safe classes across threads is fraught with many problems. In your particular case, assuming safe publication after all keys have had values assigned (such that map.size() == TheEnum.values().length), the only problem I can see from a quickish glance of EnumMap's code in Java 1.6 is that a put may not ever get reflected in another thread. But that's only true because of the internals of EnumMap's implementation, which could change in the future. In other words, future changes could break the use case in more dangerous, subtle ways.
It's possible to write correct code that still contains data races -- but it's tricky. Why not just wrap the instance in a Collections.synchronizedMap?
Straight from the JavaDoc:
Like most collection implementations EnumMap is not synchronized. If multiple threads access an enum map concurrently, and at least one of the threads modifies the map, it should be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the enum map. If no such object exists, the map should be "wrapped" using the Collections.synchronizedMap(java.util.Map<K, V>) method. This is best done at creation time, to prevent accidental unsynchronized access:
Map<EnumKey, V> m = Collections.synchronizedMap(new EnumMap<EnumKey, V>(...));
The problem you have is that threads may never see a change made by another thread, or they may see partially made changes. It's the same reason double-checked locking was broken before Java 5 revised the memory model and strengthened the semantics of volatile.
It might work if you made the EnumMap reference volatile, but I'm not 100% sure even then. You might need the internal references inside the EnumMap to be volatile, and obviously you can't do that without writing your own version of EnumMap.

How to avoid ConcurrentModificationException in multi-threaded code

With the java.util collection classes, if one thread changes a collection while another thread is traversing it with an iterator, the iterator is likely to throw ConcurrentModificationException on its next call to iterator.next() (the check is best-effort, not guaranteed). Even the synchronized collection wrapper classes SynchronizedMap and SynchronizedList are only conditionally thread-safe, which means all individual operations are thread-safe but compound operations where flow of control depends on the results of previous operations may be subject to threading issues. The question is: how can this problem be avoided without affecting performance? Note: I am aware of CopyOnWriteArrayList.
You can use CopyOnWriteArrayList or ConcurrentHashMap etc. as you mentioned above, or you can use the Atomic* classes, which work with CAS (compare-and-swap).
If you weren't aware of the Atomic* classes, they are definitely worth a look! You may check out this question.
So to answer your question: you have to choose the right tool for the task. Since you do not share the context with us, I can only guess. In some situations CAS will perform better; in others the concurrent collections will.
If something isn't clear you can always check out the official Oracle Trails: Lesson: Concurrency
I think you raised an interesting question.
I tried to think about whether ConcurrentHashMap, for example, as suggested by others, can help, but I'm not sure, as its locking is segment-based.
What I would do in this case, and I do hope I understood your question well, is lock access to your collection using a ReadWriteLock (for example ReentrantReadWriteLock).
The reason I chose this lock is that I do feel this needs locking (as you explained, iteration is composed of several operations),
and because I do not want reader threads to wait on the lock if no writer thread is working on the collection.
Thanks to @Adam Arold I noticed that you suggested the "synchronized decorator", but I feel this decorator is "too strong" for your needs, as it uses synchronized and will not differentiate between cases of N readers and M writers.
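The read-write lock idea above could look roughly like this. It is a sketch under assumptions: ReadWriteGuardedList is a hypothetical class, and it uses java.util.concurrent.locks.ReentrantReadWriteLock as the lock implementation. Readers share the read lock, so they only block while a writer holds the write lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical example class: many readers may iterate at once, writers are exclusive.
class ReadWriteGuardedList {
    private final ReadWriteLock rw = new ReentrantReadWriteLock();
    private final List<String> items = new ArrayList<>();

    void add(String item) {
        rw.writeLock().lock();           // exclusive: waits for readers and other writers
        try {
            items.add(item);
        } finally {
            rw.writeLock().unlock();
        }
    }

    // The whole iteration runs under the read lock, so no writer can modify
    // the list halfway through and trigger ConcurrentModificationException.
    int totalLength() {
        rw.readLock().lock();            // shared: other readers may hold it too
        try {
            int total = 0;
            for (String s : items) {
                total += s.length();
            }
            return total;
        } finally {
            rw.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        ReadWriteGuardedList list = new ReadWriteGuardedList();
        list.add("ab");
        list.add("cde");
        System.out.println(list.totalLength()); // prints 5
    }
}
```

Whether this outperforms a plain synchronized block depends on the read/write ratio and on how long each iteration takes, so it is worth measuring before committing to it.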
This is because the "standard" Java collections are not thread safe as they are not synchronized. When working with multiple threads accessing your collections, you should look at the java.util.concurrent packages.
Without this package, before Java 5, one had to perform a manual synchronization :
synchronized (list) {
    Iterator i = list.iterator(); // Must be in synchronized block
    while (i.hasNext())
        foo(i.next());
}
or using
Collections.synchronizedList(arrayList);
but neither could really offer a complete thread safety feature.
With this package, individual operations on the concurrent collections are thread-safe, and some classes provide a snapshot of the state of the list when the iterator was constructed (see CopyOnWriteArrayList). The CopyOnWriteArrayList is fast on reads, but if you are performing many writes, this might affect performance.
Thus, if CopyOnWriteArrayList is not desired, take a look at ConcurrentLinkedQueue, which offers a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator. This one is efficient in all respects, unless you have to access elements at specific indexes more often than you traverse the entire collection.
Another option would be ConcurrentSkipListSet, which provides expected average log(n) time cost for the contains, add, and remove operations and their variants. Insertion, removal, and access operations safely execute concurrently by multiple threads, and iterators are weakly consistent as well.
Which concurrent (thread-safe) collection to use depends on which operations you perform most. And since they are all part of the Java Collections Framework, you can swap in whichever one you need.
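To illustrate the snapshot iterator of CopyOnWriteArrayList mentioned above, here is a small single-threaded sketch. SnapshotIteration and visitWhileAdding are hypothetical names. The loop adds to the list while iterating over it, which a plain ArrayList would typically reject with ConcurrentModificationException.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical example class showing snapshot iteration.
class SnapshotIteration {
    // Adds one element per visited element while the iteration is still running.
    static int visitWhileAdding(List<String> list) {
        int visited = 0;
        for (String s : list) {      // iterator works on the array as it was at creation
            list.add(s + "-copy");   // each add copies the backing array; the iterator is unaffected
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        int visited = visitWhileAdding(list);
        System.out.println(visited);     // prints 2: only the snapshot was traversed
        System.out.println(list.size()); // prints 4: the additions did reach the list
    }
}
```

The price is that every write copies the whole backing array, which is why this collection suits read-mostly workloads.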
If you have an unencapsulated Collection object of a non thread-safe class, it is impossible to prevent misuse of the Collection, and thus the possibility of a ConcurrentModificationException.
Other answers have suggested use of a thread-safe Collection class, such as those provided by java.util.concurrent. You should however consider encapsulating the Collection object: have the object be private, and have your class provide higher level abstractions (such as addPerson and removePerson) that manipulate the Collection on behalf of the callers, and do not have any getter methods that return references to the Collection. It is then fairly easy to enforce invariants on the encapsulated data (such as "every person has a non empty name") and provide thread-safety using synchronized.
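The encapsulation approach described above might be sketched like this. PersonRegistry is a hypothetical class, and people are represented by their names only to keep the example short. The list never escapes the class, every method is synchronized on the same object, and the "non-empty name" invariant is checked in one place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: the collection is private and never handed out.
class PersonRegistry {
    private final List<String> names = new ArrayList<>();

    synchronized void addPerson(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("every person needs a non-empty name");
        }
        names.add(name);
    }

    synchronized boolean removePerson(String name) {
        return names.remove(name);
    }

    // Returns a copy, so callers can iterate without holding the lock
    // and cannot modify the internal list.
    synchronized List<String> snapshot() {
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        PersonRegistry registry = new PersonRegistry();
        registry.addPerson("Ann");
        registry.addPerson("Bob");
        registry.removePerson("Ann");
        System.out.println(registry.snapshot()); // prints [Bob]
    }
}
```

Returning a copy from snapshot() rather than the list itself is the design choice that keeps callers from iterating over shared state.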

About unsynchronized & synchronized access in Java Collections Framework?

Can anyone explain what is unsynchronized & synchronized access in Java Collections Framework?
Synchronized vs unsynchronized access doesn't have to do with the Java Collections Framework per se.
Synchronized access means that you have some sort of locking for accessing the data. This can be introduced by using the synchronized keyword or by using some of the higher level constructs from the java.util.concurrent package.
Unsynchronized access means that you don't have any locking involved when accessing the data.
If you're using a collection in several threads, you had better make sure that you're accessing it in a synchronized way, or that the collection itself is thread safe, i.e., takes care of such locking internally.
To make sure every access to some collection coll is synchronized, you can either
...surround accesses with synchronized (coll) { ... }
public void someMethod() {
    synchronized (coll) {
        // do work...
    }
}
...or wrap it using Collections.synchronizedCollection
coll = Collections.synchronizedCollection(coll);
In the former approach, you need to make sure that every access to the collection is covered by synchronized. In the latter approach, you need to make sure that every reference points at the synchronized version of the collection.
As pointed out by @Fatal, however, you should understand that the latter approach only transforms a thread-unsafe collection into a thread-safe collection. This is most often not sufficient for making sure that the class you are writing is thread safe. For an example, see @Fatal's comment.
Synchronized access means it is thread-safe. So different threads can access the collection concurrently without any problems, but it is probably a little bit slower depending on what you are doing.
Unsynchronized is the opposite. Not thread-safe, but a little bit faster.
The synchronized access in Java Collection Framework is normally done by wrapping with Collections.synchronizedCollection(...) etc. and only access through this wrapper.
There are some exceptions already synchronized like Hashtable and Vector.
But keep in mind:
Synchronization is done on the collection instance itself and covers only a single method call. So another thread may run between two subsequent calls.
Example:
You first call the isEmpty() method and learn that the collection is not empty, and after that you want to retrieve an element from that collection. But this second method call may fail, because the collection may be empty by now due to actions another thread performed between your calls.
So even with synchronized collections you have to take care of synchronization, and it may be necessary to synchronize yourself outside the collection!
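A sketch of how the isEmpty()-then-retrieve sequence can be made atomic. CheckThenAct and pollFirst are hypothetical names. The code relies on the documented behaviour that the wrappers returned by Collections.synchronizedList lock on the wrapper object itself, so a caller can hold that same monitor across both calls.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical example class: a compound operation on a synchronized list.
class CheckThenAct {
    // Removes and returns the first element, or returns null if the list is empty.
    // The check and the removal happen under one lock, so no other thread
    // can empty the list in between.
    static String pollFirst(List<String> syncList) {
        synchronized (syncList) {        // same monitor the wrapper uses internally
            if (syncList.isEmpty()) {
                return null;
            }
            return syncList.remove(0);
        }
    }

    public static void main(String[] args) {
        List<String> list = Collections.synchronizedList(new ArrayList<>());
        list.add("first");
        System.out.println(pollFirst(list)); // prints first
        System.out.println(pollFirst(list)); // prints null
    }
}
```

This only works if the argument is the synchronized wrapper; locking on the wrapper does nothing for threads that still use the unwrapped ArrayList directly.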

Are concurrent classes provided by the JDK required to use their instance's own intrinsic lock for synchronization?

The JDK provides a set of thread-safe classes like ConcurrentHashMap, ConcurrentLinkedQueue and AtomicInteger.
Are these classes required to synchronize on this to implement their thread-safe behavior?
If they do, can we implement our own synchronized operations on these objects and mix them with the built-in ones?
In other words, is it safe to do:
ConcurrentMap<Integer, Account> accounts
    = new ConcurrentHashMap<Integer, Account>();
// Add an account atomically
synchronized(accounts) {
    if (!accounts.containsKey(id)) {
        Account account = new Account();
        accounts.put(id, account);
    }
}
And in another thread
// Access the object expecting it to synchronize(this){…} internally
accounts.get(id);
Note that the simple synchronized block above could probably be replaced by putIfAbsent() but I can see other cases where synchronizing on the object could be useful.
Are these classes required to synchronize on this to implement their thread-safe behavior.
No and, not only that, the various code inspection tools will warn you if you do try to use the object lock.
In the case of the put method above, note the javadoc:
A hash table supporting full concurrency of retrievals and adjustable expected concurrency for updates. This class obeys the same functional specification as Hashtable, and includes versions of methods corresponding to each method of Hashtable. However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access. This class is fully interoperable with Hashtable in programs that rely on its thread safety but not on its synchronization details.
This means that the operations are thread safe and there isn't a way to do what you're trying to do above (lock the whole table). Furthermore, for the operations that you use (put and get), neither of them will require such locking.
I particularly like this quote from the javadoc from the values() method:
The view's iterator is a "weakly consistent" iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.
So, if you use this method, you'll get a reasonable list: it will have the data as of the request time and might or might not have any later updates. The assurance that you won't have to worry about the ConcurrentModificationExceptions is a huge one: you can write simple code without the synchronized block that you show above and know that things will just work.
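For the "add an account if it is missing" case from the question, the map's own atomic operation can replace the external synchronized block. The following is a sketch: AccountRegistry and getOrCreate are hypothetical names, and Account is an empty stand-in for the question's class. It uses ConcurrentMap.putIfAbsent, which returns the previous value or null if there was none.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical example class: atomic "create if missing" without external locking.
class AccountRegistry {
    static final class Account { }   // stand-in for the Account class in the question

    private final ConcurrentMap<Integer, Account> accounts = new ConcurrentHashMap<>();

    Account getOrCreate(int id) {
        Account fresh = new Account();
        // putIfAbsent is atomic: exactly one thread's account wins for a given id.
        Account existing = accounts.putIfAbsent(id, fresh);
        return existing != null ? existing : fresh;
    }

    public static void main(String[] args) {
        AccountRegistry registry = new AccountRegistry();
        Account a = registry.getOrCreate(1);
        Account b = registry.getOrCreate(1);
        System.out.println(a == b); // prints true: the second call found the first account
    }
}
```

The cost of this form is that a throwaway Account is constructed even when one already exists; on Java 8 and later, computeIfAbsent avoids that.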

How to clone a synchronized Collection?

Imagine a synchronized Collection:
Set s = Collections.synchronizedSet(new HashSet());
What's the best approach to clone this Collection?
It's preferred that the cloning doesn't need any synchronization on the original Collection, but it is required that iterating over the cloned Collection does not need any synchronization on the original Collection.
Use a copy-constructor inside a synchronized block:
Set newSet;
synchronized (s) {
    newSet = new HashSet(s); // preferably use generics
}
If you need the copy to be synchronized as well, then use Collections.synchronizedSet(..) again.
As per Peter's comment - you'll need to do this in a synchronized block on the original set. The documentation of synchronizedSet is explicit about this:
It is imperative that the user manually synchronize on the returned set when iterating over it
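A generic version of the copy-under-lock approach might look like the following sketch. SetSnapshots and copyUnderLock are hypothetical names, and the argument is assumed to be a wrapper returned by Collections.synchronizedSet. The copy constructor iterates over the source, which is why the lock on the original must be held while copying. The returned HashSet is independent and needs no lock on the original afterwards.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical example class: take an independent copy of a synchronized set.
class SetSnapshots {
    static <T> Set<T> copyUnderLock(Set<T> syncSet) {
        synchronized (syncSet) {            // the wrapper locks on itself
            return new HashSet<>(syncSet);  // copying iterates, so it must hold the lock
        }
    }

    public static void main(String[] args) {
        Set<Integer> s = Collections.synchronizedSet(new HashSet<>());
        s.add(1);
        s.add(2);
        Set<Integer> copy = copyUnderLock(s);
        s.add(3);                           // later changes do not affect the copy
        System.out.println(copy.size());    // prints 2
    }
}
```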
When using synchronized sets, do understand that you will incur synchronization overhead on every access to the set. Collections.synchronizedSet() merely wraps your set with a shell that forces every method to be synchronized. Probably not what you really intended. A ConcurrentSkipListSet will give you better performance in a multithreaded environment where multiple threads will be writing to the set.
The ConcurrentSkipListSet will allow you to perform the following (s must be declared as a ConcurrentSkipListSet, since the Set interface does not expose clone()):
Set newSet = s.clone(); // preferably use generics
It's not uncommon to use a clone of a set for snapshot processing. If that's what you are after, you might add a little code to handle the case where the item is already processed. The overhead involved with the occasional object included in more than one copy set is usually less than the consistent overhead of using Collections.synchronizedSet().
EDIT: I just noticed that ConcurrentSkipListSet is Cloneable and provides a threadsafe clone() method. I changed my answer because I really believe this is the best option, instead of losing scalability and performance to Collections.synchronizedSet().
You can avoid synchronizing the set by doing the following which avoids exposing an Iterator on the original set.
Set newSet = new HashSet(Arrays.asList(s.toArray()));
EDIT From Collections.SynchronizedCollection
public Object[] toArray() {
    synchronized(mutex) {return c.toArray();}
}
As you can see, the lock is held for the entire time the operation is performed. As such a safe copy of the data is taken. It doesn't matter if an Iterator is used internally. The array returned can be used in a thread safe manner as only the local thread has a reference to it.
NOTE: If you want to avoid these issues I suggest you use a Set from the concurrency library added in Java 5.0 in 2004. I also suggest you use generics as this can make your collections more type safe.
