How to synchronize unmodifiable collections - java

I want to return an unmodifiable view of a collection that my class maintains to outside clients.
To protect against concurrent access, I wrap the collection in a synchronized wrapper first, then put an unmodifiable wrapper around the version I return to outside threads.
I wrote the following code, but unfortunately it throws a ConcurrentModificationException:
import java.util.*;
public class Test {
    public static void main(String[] args) {
        // assume col1 is private, nicely encapsulated in some class
        final Collection col1 = Collections.synchronizedCollection(new ArrayList());
        // this unmodifiable version is public
        final Collection unmodcol1 = Collections.unmodifiableCollection(col1);
        col1.add("a");
        col1.add("b");
        new Thread(new Runnable() {
            public void run() {
                while (true) {
                    // no way to synchronize on col1!
                    for (Iterator it = unmodcol1.iterator(); it.hasNext(); it.next())
                        ;
                }
            }
        }).start();
        while (true) {
            col1.add("c");
            col1.remove("c");
        }
    }
}
So my question is: how do I synchronize unmodifiable collections?
To add more detail: when a client that received the collection wants to iterate over its elements,
1) it doesn't necessarily know that it's a synchronized collection, and
2) even if it does, it can't correctly synchronize on the synchronization wrapper's mutex to iterate over its elements. The penalty, as described in the Javadoc of Collections.synchronizedCollection, is non-deterministic behaviour.
As I understand it, putting an unmodifiable wrapper on a synchronized collection leaves no access to the mutex that must be held to iterate correctly.

If you can ensure that read-only clients of the collection synchronize on the view they are given, synchronize on that same view in your producer:
/* In the producer... */
Collection<Object> collection = new ArrayList<>();
Collection<Object> tmp = Collections.unmodifiableCollection(collection);
Collection<Object> view = Collections.synchronizedCollection(tmp);
synchronized (view) {
collection.add("a");
collection.add("b");
}
/* Give clients access only to "view" ... */
/* Meanwhile, in the client: */
synchronized (view) {
for (Object o : view) {
/* Do something with o */
}
}

You need to decide on a few things first.
A. Are users of the returned collection supposed to see updates to it automatically, and when? If so, you need to take care not to accidentally lock it against updates for long periods (or decide that this is acceptable). For example, if you use synchronized and synchronize on the returned collection, you effectively allow the user of the returned collection to block updates to it.
B. Or should they call again to get a fresh collection?
Besides, Collections.synchronizedX protects only individual reads and writes, not iteration. So the client would have to guarantee that it holds the lock during all explicit and implicit iterations. That sounds bad in general, but it depends.
Possible solutions:
1. Return a copy; you don't even need to wrap it in an unmodifiable wrapper. Just lock the collection while creating the copy: synchronized (collection) { return new ArrayList(collection); } No further synchronization is needed. This is an example implementation of option B above.
2. Like 1, but done automatically by the data structure itself: use CopyOnWriteArrayList and return it (wrapped in an unmodifiable wrapper). Note that this makes writes to the collection expensive, while reads are not. On the other hand, even iterating over it is thread safe, with no synchronization needed. Supports option A above.
3. Depending on the properties you need from the data structure, you could go for a non-RandomAccess collection such as ConcurrentLinkedQueue or ConcurrentLinkedDeque; both allow iterating over the data structure without any extra synchronization. Again, wrap it in an unmodifiable wrapper. Supports option A above.
I would go for option B-1 for the general case and to get started. But it depends, as usual.
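Option B-1 could be sketched roughly as follows. The class name ItemRegistry and its methods are made up for illustration, and the copy is additionally wrapped in an unmodifiable list purely to signal "read-only" to callers; the answer above notes that this wrapper is optional.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder class; the name and methods are illustrative only.
class ItemRegistry {
    // The synchronized wrapper stays private; it is also the lock object.
    private final List<String> items =
            Collections.synchronizedList(new ArrayList<>());

    void add(String item) {
        items.add(item);
    }

    void remove(String item) {
        items.remove(item);
    }

    // Copy while holding the wrapper's mutex, so no writer can interleave.
    List<String> snapshot() {
        synchronized (items) {
            return Collections.unmodifiableList(new ArrayList<>(items));
        }
    }
}
```

Clients iterate over the returned snapshot without any locking; the price is one copy per call, and a snapshot never sees later updates.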

You're asking "How to synchronize unmodifiable collections", but that's not what your code does: you made a synchronized collection unmodifiable. If you first make your collection unmodifiable and then synchronize it, you'll get what you want.
// you'll need to create the list
final ArrayList list = new ArrayList();
// and add items to it while it's still modifiable:
list.add("a");
list.add("b");
final Collection unmodcol1 = Collections.unmodifiableCollection(list);
final Collection col1 = Collections.synchronizedCollection(unmodcol1);
However, the add and remove calls inside the while loop will still fail for the same reason.
On the other hand, if you create your list and make it unmodifiable, you might not need to synchronize it at all.

You may want to use one of the concurrent collections. That will give you what you need, if I read your question correctly. Just be prepared to pay the cost.
List<T> myCollection = new CopyOnWriteArrayList<T>();
List<T> clientsCollection = Collections.unmodifiableList(myCollection);
This way you will not get a ConcurrentModificationException, because the client always gets an unmodifiable collection and does not interfere with your writes. However, the price is rather high: every write copies the underlying array.
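To make the snapshot behaviour concrete, here is a small single-threaded sketch. The class CowViewDemo and its method are hypothetical names; the write happens between creating the iterator and consuming it, which is exactly the pattern that fails with a plain ArrayList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class; names are illustrative only.
class CowViewDemo {
    // Returns what a client iterator sees when the owner writes mid-iteration.
    static List<String> iterateWhileWriting() {
        List<String> owner = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        List<String> clientView = Collections.unmodifiableList(owner);

        List<String> seen = new ArrayList<>();
        Iterator<String> it = clientView.iterator(); // bound to the array as it is now
        owner.add("c");                              // write while the iterator is open
        while (it.hasNext()) {
            seen.add(it.next());                     // no ConcurrentModificationException
        }
        return seen;
    }
}
```

The iterator keeps working on the array that existed when it was created, so the client sees "a" and "b" only; a new iterator obtained afterwards would also see "c".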

Related

Can CopyOnWriteArrayList help when an item is removed from a different thread while the list is being iterated?

I have a map that holds lists of event listeners, keyed by event type.
func_1() gets the listener list for one type from the map and iterates over it, letting every listener handle the event.
When a listener finishes its handling, it asks to be removed from the listener list in the map.
Since the list is being iterated at that point, removing the listener from the original list causes a java.util.ConcurrentModificationException in iterator.previous() when fetching the next listener.
The question is: if I use a CopyOnWriteArrayList to copy the listener list and then iterate over the copy, will it still throw when a listener is removed from another thread?
Does it make any difference to simply copy into a normal list, instead of a CopyOnWriteArrayList, and iterate over that?
func_1(Event event) {
List<WeakReference<EventListener<Event>>> listenerlist = mEventMap.get(event.eventType);
/* instead of directly iterator on the listenerlist
ListIterator<WeakReference<EventListener<Event>>> listenerIterator =
listenerlist.listIterator(listenerlist.size());
but making a CopyOnWriteArrayList first:
*/
List<WeakReference<EventListener<Event>>> listeners =
new CopyOnWriteArrayList<>(listenerlist);
ListIterator<WeakReference<EventListener<Event>>> listenerIterator =
listeners.listIterator(listeners.size());
while(listenerIterator.hasPrevious()){
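WeakReference<EventListener<Event>> listenerItem =
listenerIterator.previous();
// the referent may already have been garbage collected
EventListener<Event> listener = listenerItem.get();
if (listener != null) {
listener.func_2(event);
}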
}
}
EventListener::func_2(Event event){
//do something
//remove the type in the map
funct_3(EventListener.this);
}
funct_3(EventListener listener) {
List<WeakReference<EventListener<Event>>> listeners =
mEventMap.get(listener.eventType);
if (listeners != null) {
Iterator<WeakReference<EventListener<Event>>> listenerIterator =
listeners.iterator();
while (listenerIterator.hasNext()) {
WeakReference<EventListener<Event>> listenerItem = listenerIterator.next();
if (listenerItem.get() != null && listenerItem.get() == listener) {
listenerIterator.remove();
break;
}
}
}
}
I ran a test and it does not throw, because func_1 iterates over a copy of the list while the removal happens on the original list.
The drawback is that copying might be costly if events arrive very often.
From https://www.ibm.com/developerworks/library/j-5things4/:
"2. CopyOnWriteArrayList
Making a fresh copy of an array is too expensive an operation, in terms of both time and memory overhead, to consider for ordinary use; developers often resort to using a synchronized ArrayList instead. That's also a costly option, however, because every time you iterate across the contents of the collection, you have to synchronize all operations, including read and write, to ensure consistency.
This puts the cost structure backward for scenarios where numerous readers are reading the ArrayList but few are modifying it.
CopyOnWriteArrayList is the amazing little jewel that solves this problem. Its Javadoc defines CopyOnWriteArrayList as a "thread-safe variant of ArrayList in which all mutative operations (add, set, and so on) are implemented by making a fresh copy of the array."
The collection internally copies its contents over to a new array upon any modification, so readers accessing the contents of the array incur no synchronization costs (because they're never operating on mutable data).
Essentially, CopyOnWriteArrayList is ideal for the exact scenario where ArrayList fails us: read-often, write-rarely collections such as the Listeners for a JavaBean event."
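As a rough sketch of the listener scenario, the listener list itself can be a CopyOnWriteArrayList, so a listener may unregister itself while the event is being dispatched. OneShotBus, Listener and the method names below are hypothetical and much simpler than the map of WeakReference lists in the question.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical minimal event bus; names are illustrative only.
class OneShotBus {
    interface Listener {
        void onEvent(String event);
    }

    private final List<Listener> listeners = new CopyOnWriteArrayList<>();
    int handled = 0;

    void register(Listener l) { listeners.add(l); }
    void unregister(Listener l) { listeners.remove(l); }
    int listenerCount() { return listeners.size(); }

    void dispatch(String event) {
        // The for-each iterator works on the array as it was when the loop started,
        // so removals made by the listeners do not disturb it.
        for (Listener l : listeners) {
            l.onEvent(event);
        }
    }

    // Registers a listener that removes itself after handling one event.
    void registerOneShot() {
        register(new Listener() {
            @Override
            public void onEvent(String event) {
                handled++;
                unregister(this);
            }
        });
    }
}
```

With this layout no extra copy is made per event; the copying cost moves to register and unregister, which fits the "read often, write rarely" profile described in the quote.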

Should Iterator or Iterable be used when exposing internal collection items?

I have a class with a private mutable list of data.
I need to expose the list items under the following conditions:
The list must not be modifiable from outside;
It should be clear to developers who use the getter that the list they get cannot be modified.
Which getter should be marked as the recommended approach? Or can you offer a better solution?
class DataProcessor {
private final ArrayList<String> simpleData = new ArrayList<>();
private final CopyOnWriteArrayList<String> copyData = new CopyOnWriteArrayList<>();
public void modifyData() {
...
}
public Iterable<String> getUnmodifiableIterable() {
return Collections.unmodifiableCollection(simpleData);
}
public Iterator<String> getUnmodifiableIterator() {
return Collections.unmodifiableCollection(simpleData).iterator();
}
public Iterable<String> getCopyIterable() {
return copyData;
}
public Iterator<String> getCopyIterator() {
return copyData.iterator();
}
}
UPD: this question is from a real code-review discussion on the best practice for list getter implementation
The "best" solution actually depends on the intended application patterns (and not so much on "opinions", as suggested by a close-voter). Each possible solution has pros and cons that can be judged objectively (and have to be judged by the developer).
Edit: There already was a question, "Should I return a Collection or a Stream?", with an elaborate answer by Brian Goetz. You should consult that answer as well before making any decision. My answer does not refer to streams, but only to different ways of exposing the data as a collection, pointing out the pros, cons and implications of the different approaches.
Returning an iterator
Returning only an Iterator is inconvenient, regardless of further details such as whether it allows modifications or not. An Iterator alone cannot be used in a for-each loop. So clients would have to write
Iterator<String> it = data.getUnmodifiableIterator();
while (it.hasNext()) {
String s = it.next();
process(s);
}
whereas basically all other solutions would allow them to just write
for (String s : data.getUnmodifiableIterable()) {
process(s);
}
Exposing a Collections.unmodifiable... view on the internal data:
You could expose the internal data structure, wrapped into the corresponding Collections.unmodifiable... collection. Any attempt to modify the returned collection will cause an UnsupportedOperationException to be thrown, clearly stating that the client should not modify the data.
One degree of freedom in the design space here is whether or not you hide additional information: When you have a List, you could offer a method
private List<String> internalData;
List<String> getData() {
return Collections.unmodifiableList(internalData);
}
Alternatively, you could be less specific about the type of the internal data:
If the caller should not be able to do indexed access with the List#get(int index) method, then you could change the return type of this method to Collection<String>.
If the caller additionally should not be able to obtain the size of the returned sequence by calling Collection#size(), then you could return an Iterable<String>.
Also consider that, when exposing the less specific interfaces, you later have the choice to change the type of the internal data to be a Set<String>, for example. If you had guaranteed to return a List<String>, then changing this later may cause some headaches.
Exposing a copy of the internal data:
A very simple solution is to just return a copy of the list:
private List<String> internalData;
List<String> getData() {
return new ArrayList<String>(internalData);
}
This may have the drawback of (potentially large and frequent) memory copies, and thus should only be considered when the collection is "small".
Additionally, the caller will be able to modify the list, and he might expect the changes to be reflected in the internal state (which is not the case). This problem could be alleviated by additionally wrapping the new list into a Collections.unmodifiableList.
Exposing a CopyOnWriteArrayList
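Exposing a CopyOnWriteArrayList via its Iterator or as an Iterable is probably not a good idea. Its iterator does not support Iterator#remove (it throws an UnsupportedOperationException), but a caller who receives the list itself as an Iterable can still cast it back to a List and modify it, and you explicitly wanted to avoid this.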
The solution of exposing a CopyOnWriteArrayList which is wrapped into a Collections.unmodifiableList may be an option. It may look like a superfluously thick firewall at the first glance, but it definitely could be justified - see the next paragraph.
General considerations
In any case, you should document the behavior religiously. Particularly, you should document that the caller is not supposed to change the returned data in any way (regardless of whether it is possible without causing an exception).
Beyond that, there is an uncomfortable trade-off: You can either be precise in the documentation, or avoid exposing implementation details in the documentation.
Consider the following case:
/**
* Returns the data. The returned list is unmodifiable.
*/
List<String> getData() {
return Collections.unmodifiableList(internalData);
}
The documentation here should in fact also state that...
/* ...
* The returned list is a VIEW on the internal data.
* Changes in the internal data will be visible in
* the returned list.
*/
This may be important information, considering thread safety and the behavior during iteration. Consider a loop that iterates over the unmodifiable view on the internal data, and suppose that inside this loop someone calls a function that modifies the internal data:
for (String s : data.getData()) {
...
data.changeInternalData();
}
This loop will break with a ConcurrentModificationException, because the internal data is modified while it is being iterated over.
The trade-off regarding the documentation here refers to the fact that, once a certain behavior is specified, clients will rely on this behavior. Imagine the client does this:
List<String> list = data.getData();
int oldSize = list.size();
data.insertElementToInternalData();
// Here, the client relies on the fact that he received
// a VIEW on the internal data:
int newSize = list.size();
assertTrue(newSize == oldSize+1);
Things like the ConcurrentModificationException could have been avoided if a true copy of the internal data had been returned, or by using a CopyOnWriteArrayList (each wrapped into a Collections.unmodifiableList). This would be the "safest" solution, in this regard:
The caller can not modify the returned list
The caller can not modify the internal state directly
If the caller modifies the internal state indirectly, then the iteration still works
But one has to think about whether so much "safety" is really required for the respective application case, and how this can be documented in a way that still allows changes to the internal implementation details.
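The difference between a view and a copy discussed above can be sketched like this. DataHolder and its two getters are hypothetical names chosen for the illustration; only the Collections and ArrayList calls are standard API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class; method names are illustrative only.
class DataHolder {
    private final List<String> internalData = new ArrayList<>();

    void add(String s) {
        internalData.add(s);
    }

    /** Unmodifiable live VIEW: later changes to the internal data show through. */
    List<String> getDataView() {
        return Collections.unmodifiableList(internalData);
    }

    /** Unmodifiable SNAPSHOT: a copy that is unaffected by later changes. */
    List<String> getDataSnapshot() {
        return Collections.unmodifiableList(new ArrayList<>(internalData));
    }
}
```

Iterating over getDataView() while calling add(...) inside the loop fails with a ConcurrentModificationException, whereas the same loop over getDataSnapshot() completes, at the cost of one copy per call.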
Typically, Iterator is used only together with Iterable, for the purpose of the for-each loop. It would be pretty odd to see a non-Iterable type with a method returning an Iterator, and it may be upsetting to the user that it cannot be used in a for-each loop.
So I suggest Iterable in this case. You could even have your class implement Iterable if that makes sense.
If you want to jump on the Java 8 wagon, returning a Stream is probably a more "modern" approach.
By the rules of encapsulation you should always return an unmodifiable list; in your case it is a design rule. So return Collections.unmodifiableCollection, and don't name the method getUnmodifiable...; use the usual getter naming convention and use Javadoc to tell other developers what kind of list you return and why. Careless users will be alerted by an exception!

Is it necessary to use synchronizedList instead of List if iteration is already synchronized?

The task is simple. I have a dozen threads and one "global" list (of some objects).
Each thread periodically iterates through the whole list to find the desired object and change it (or add it if not present), and then iterates again to save all objects to a file.
The JavaDoc says: "It is imperative that the user manually synchronize on the returned list when iterating over it"
So now everything I do with my list, I do inside a synchronized block.
As I understand it, the implementation of synchronizedList also contains some synchronization, which (I suppose) adds undesirable delay.
What if I use a simple List?
I think that, if everything I do with the list is already in a critical section, what would I lose if I changed
private static final List<JobSummary> SyncJS = Collections.synchronizedList(new ArrayList<JobSummary>());
to
private static final List<JobSummary> SyncJS = new ArrayList<>();
Or am I missing something?
The point of the synchronized wrappers is to avoid manually synchronizing. If you are already inside a synchronized block, you don't need them; they only add overhead.
The operation "find the desired object and change it" is not atomic, hence you need to synchronize on the list before performing any non-atomic actions.
If you synchronize on the list for all of your operations, then you do not need Collections.synchronizedList(new ArrayList<JobSummary>()). new ArrayList<>() would work well.
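A minimal sketch of that idea, assuming every access to the list goes through one class: the list is a plain ArrayList and each compound "find, then change or add" step runs under a single lock. The fields of JobSummary and the JobRegistry class are assumptions made up for this example, not the asker's real types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the question's JobSummary type.
class JobSummary {
    final String name;
    int runs;

    JobSummary(String name) {
        this.name = name;
    }
}

// Hypothetical owner of the list; all access goes through synchronized methods.
class JobRegistry {
    // Plain ArrayList: no synchronized wrapper, the methods below provide the lock.
    private final List<JobSummary> jobs = new ArrayList<>();

    // Find-then-update-or-add is a compound action, so one lock covers all of it.
    synchronized void recordRun(String name) {
        for (JobSummary js : jobs) {
            if (js.name.equals(name)) {
                js.runs++;
                return;
            }
        }
        JobSummary created = new JobSummary(name);
        created.runs = 1;
        jobs.add(created);
    }

    synchronized int runsOf(String name) {
        for (JobSummary js : jobs) {
            if (js.name.equals(name)) {
                return js.runs;
            }
        }
        return 0;
    }

    synchronized int size() {
        return jobs.size();
    }
}
```

Because the lookup and the update happen under the same lock, two threads can never both decide that a job is missing and add it twice, which a synchronizedList alone would not prevent.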

LinkedList.addAll throws NullPointerException in multithreaded application

I have a LinkedList that is being modified by a single thread. But there are many threads reading it.
protected volatile LinkedList<V> list = new LinkedList<V>();
I need to retrieve this list at some point. So, when I do,
List<V> retList = new LinkedList<V>();
retList.addAll(list);
return Collections.unmodifiableList(list);
I receive the following exception
java.lang.NullPointerException
at java.util.LinkedList.toArray(LinkedList.java:866)
at java.util.LinkedList.addAll(LinkedList.java:269)
at java.util.LinkedList.addAll(LinkedList.java:247)
I have checked that neither the list nor the values within it are null. What is the correct approach when you have a single writer and multiple readers? The aim is to avoid a synchronized list.
Marking the field volatile doesn't help here: it only affects the reference, whereas the problem is the thread safety of the mutable list object itself. If you do not want to implement proper memory barriers yourself, then you need to use a different class, for example:
Instead of:
protected volatile LinkedList<V> list = new LinkedList<V>();
Try using:
protected Deque<V> list = new LinkedBlockingDeque<V>();
I assume you're using LinkedList instead of ArrayList because you want the Deque methods rather than the List methods.
If you really need List methods instead of Deque methods, you can use CopyOnWriteArrayList.
Anyway, you need something that provides a weakly consistent iterator rather than one of the standard java.util collections, which all state in their Javadocs that they are not inherently thread-safe with any mix of readers and writers and are thus inappropriate for your use case without proper memory barriers.
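A sketch of the single-writer, multiple-reader case with LinkedBlockingDeque might look like this. SingleWriterStore and its methods are hypothetical names; note also that LinkedBlockingDeque rejects null elements, which a LinkedList would accept.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical container; the class and method names are illustrative only.
class SingleWriterStore<V> {
    // final instead of volatile: the reference never changes, the deque is thread-safe.
    private final Deque<V> list = new LinkedBlockingDeque<>();

    // Called by the single writer thread.
    void append(V value) {
        list.addLast(value);
    }

    // Safe to call from any reader thread while the writer is appending.
    List<V> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(list));
    }
}
```

Readers get an independent copy, so they can iterate for as long as they like without blocking the writer or seeing a half-updated list.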

Synchronizing an ArrayList in a different Class

So I'm aware that in order to synchronize an ArrayList you need to use
Collections.synchronizedList(new ArrayList());
But what if the synchronized ArrayList is in one class and I want to have a reference to it in several other classes, which contain the threads that will add to it? Would I do something like
List referenceToList = OtherClass.mainList;
// inside OtherClass would be List<String> mainList
// = Collections.synchronizedList(new ArrayList<String>());
Or would the proper way be
List referenceToList = Collections.synchronizedList(OtherClass.mainList);
Also, is there any difference in the way I would iterate over the list, or is it the same as if all the adding and reading were contained in one class?
It doesn't matter which class the list is contained in - the synchronization is for controlling access to reads and writes to that list from multiple threads (again, regardless of the class it's contained in). Once you've wrapped it in a call to Collections.synchronizedList, there's no point in doing it again.
For clarity (based on your question), your code would look like this:
class OtherClass {
public static List mainList = Collections.synchronizedList(new ArrayList());
}
class RandomClass {
public static List referenceToList = OtherClass.mainList;
}
Here, referenceToList is just a pointer to the same list that mainList points to, which has read/write access synchronized.
As a note, there are other List implementations that are designed for concurrent access situations, such as CopyOnWriteArrayList.
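A slightly fuller sketch of the same idea, including the iteration part of the question: single calls such as add are already synchronized by the wrapper, but iterating still requires holding the wrapper's lock manually, regardless of which class does it. SharedListOwner, SharedListWorker and countWithPrefix are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical owner class: the list is wrapped exactly once, where it is created.
class SharedListOwner {
    static final List<String> MAIN_LIST =
            Collections.synchronizedList(new ArrayList<>());
}

// Hypothetical worker that lives in another class and adds from its own thread.
class SharedListWorker implements Runnable {
    // Same object as SharedListOwner.MAIN_LIST; no second wrapper needed.
    private final List<String> referenceToList = SharedListOwner.MAIN_LIST;
    private final String tag;

    SharedListWorker(String tag) {
        this.tag = tag;
    }

    @Override
    public void run() {
        for (int i = 0; i < 1000; i++) {
            referenceToList.add(tag + i); // single calls are already synchronized
        }
    }

    static int countWithPrefix(String prefix) {
        int n = 0;
        synchronized (SharedListOwner.MAIN_LIST) { // iteration needs the manual lock
            for (String s : SharedListOwner.MAIN_LIST) {
                if (s.startsWith(prefix)) {
                    n++;
                }
            }
        }
        return n;
    }
}
```

Every class that iterates must lock on the same wrapper object; wrapping the list a second time would create a different lock object and break that guarantee.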
