I have a LinkedList that is being modified by a single thread. But there are many threads reading it.
protected volatile LinkedList<V> list = new LinkedList<V>();
I need to retrieve this list at some point. So, when I do,
List<V> retList = new LinkedList<V>();
retList.addAll(list);
return Collections.unmodifiableList(list);
I receive the following exception
java.lang.NullPointerException
at java.util.LinkedList.toArray(LinkedList.java:866)
at java.util.LinkedList.addAll(LinkedList.java:269)
at java.util.LinkedList.addAll(LinkedList.java:247)
I have checked that neither the list nor the values within it are null. What is the correct approach when you have a single writer and multiple readers? The aim is to avoid a synchronized list.
Your volatile modifier doesn't help here: it only applies to the reference to the list, whereas the problem is the thread safety of the mutable list object itself. If you do not want to implement proper memory barriers yourself, then you need to use a different class, for example:
Instead of:
protected volatile LinkedList<V> list = new LinkedList<V>();
Try using:
protected Deque<V> list = new LinkedBlockingDeque<V>();
I assume you're using LinkedList instead of ArrayList because you want the Deque methods rather than the List methods.
If you really need List methods instead of Deque methods, you can use CopyOnWriteArrayList.
Either way, you need something that hands out a weakly consistent iterator rather than one of the standard java.util collections, all of which state in their Javadocs that they are not thread-safe under any mix of concurrent readers and writers and are thus inappropriate for your use case without proper memory barriers.
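As a rough illustration of the CopyOnWriteArrayList suggestion, here is a minimal sketch of a single-writer, multiple-reader holder. The class name SingleWriterList and its methods are made up for this example; only the java.util classes are real.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical holder class; the name and methods are illustrative only.
class SingleWriterList<V> {
    // CopyOnWriteArrayList publishes a fresh internal array on every write,
    // so readers never observe a half-updated structure.
    private final List<V> list = new CopyOnWriteArrayList<>();

    // Called by the single writer thread.
    void add(V value) {
        list.add(value);
    }

    // Called by any reader thread: copies a consistent snapshot and
    // returns it as a read-only list.
    List<V> snapshot() {
        return Collections.unmodifiableList(new ArrayList<>(list));
    }
}
```

Every write copies the whole backing array, so this is only a reasonable trade-off when reads greatly outnumber writes.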
I have a class with a private mutable list of data.
I need to expose list items given following conditions:
List should not be modifiable outside;
It should be clear for developers who use getter function that a list they get can not be modified.
Which getter function should be marked as recommended approach? Or can you offer a better solution?
class DataProcessor {
private final ArrayList<String> simpleData = new ArrayList<>();
private final CopyOnWriteArrayList<String> copyData = new CopyOnWriteArrayList<>();
public void modifyData() {
...
}
public Iterable<String> getUnmodifiableIterable() {
return Collections.unmodifiableCollection(simpleData);
}
public Iterator<String> getUnmodifiableIterator() {
return Collections.unmodifiableCollection(simpleData).iterator();
}
public Iterable<String> getCopyIterable() {
return copyData;
}
public Iterator<String> getCopyIterator() {
return copyData.iterator();
}
}
UPD: this question is from a real code-review discussion on the best practice for list getter implementation
The "best" solution actually depends on the intended application patterns (and not so much on "opinions", as suggested by a close-voter). Each possible solution has pros and cons that can be judged objectively (and have to be judged by the developer).
Edit: There already was a question "Should I return a Collection or a Stream?", with an elaborate answer by Brian Goetz. You should consult that answer as well before making any decision. My answer does not refer to streams, but only to different ways of exposing the data as a collection, pointing out the pros, cons and implications of the different approaches.
Returning an iterator
Returning only an Iterator is inconvenient, regardless of further details, e.g. whether it will allow modifications or not. An Iterator alone cannot be used in a for-each loop, so clients would have to write
Iterator<String> it = data.getUnmodifiableIterator();
while (it.hasNext()) {
String s = it.next();
process(s);
}
whereas basically all other solutions would allow them to just write
for (String s : data.getUnmodifiableIterable()) {
process(s);
}
Exposing a Collections.unmodifiable... view on the internal data:
You could expose the internal data structure, wrapped into the corresponding Collections.unmodifiable... collection. Any attempt to modify the returned collection will cause an UnsupportedOperationException to be thrown, clearly stating that the client should not modify the data.
One degree of freedom in the design space here is whether or not you hide additional information: When you have a List, you could offer a method
private List<String> internalData;
List<String> getData() {
return Collections.unmodifiableList(internalData);
}
Alternatively, you could be less specific about the type of the internal data:
If the caller should not be able to do indexed access with the List#get(int index) method, then you could change the return type of this method to Collection<String>.
If the caller additionally should not be able to obtain the size of the returned sequence by calling Collection#size(), then you could return an Iterable<String>.
Also consider that, when exposing the less specific interfaces, you later have the choice to change the type of the internal data to be a Set<String>, for example. If you had guaranteed to return a List<String>, then changing this later may cause some headaches.
Exposing a copy of the internal data:
A very simple solution is to just return a copy of the list:
private List<String> internalData;
List<String> getData() {
return new ArrayList<String>(internalData);
}
This may have the drawback of (potentially large and frequent) memory copies, and thus should only be considered when the collection is "small".
Additionally, the caller will be able to modify the list, and he might expect the changes to be reflected in the internal state (which is not the case). This problem could be alleviated by additionally wrapping the new list into a Collections.unmodifiableList.
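A minimal sketch of the "copy, then wrap" variant described above might look like this. The class name CopyingHolder and the add method are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class illustrating the "return a copy" approach.
class CopyingHolder {
    private final List<String> internalData = new ArrayList<>();

    void add(String s) {
        internalData.add(s);
    }

    // Returns an unmodifiable snapshot: later changes to internalData are
    // not reflected in lists that were returned earlier, and attempts to
    // modify the returned list throw UnsupportedOperationException.
    List<String> getData() {
        return Collections.unmodifiableList(new ArrayList<>(internalData));
    }
}
```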
Exposing a CopyOnWriteArrayList
Exposing a CopyOnWriteArrayList directly as an Iterable is probably not a good idea: the caller can cast it back to a List and modify it, and you explicitly wanted to avoid this. (Its iterators, by contrast, do not support Iterator#remove and throw an UnsupportedOperationException.)
The solution of exposing a CopyOnWriteArrayList which is wrapped into a Collections.unmodifiableList may be an option. It may look like a superfluously thick firewall at the first glance, but it definitely could be justified - see the next paragraph.
General considerations
In any case, you should document the behavior religiously. Particularly, you should document that the caller is not supposed to change the returned data in any way (regardless of whether it is possible without causing an exception).
Beyond that, there is an uncomfortable trade-off: You can either be precise in the documentation, or avoid exposing implementation details in the documentation.
Consider the following case:
/**
* Returns the data. The returned list is unmodifiable.
*/
List<String> getData() {
return Collections.unmodifiableList(internalData);
}
The documentation here should in fact also state that...
/* ...
* The returned list is a VIEW on the internal data.
* Changes in the internal data will be visible in
* the returned list.
*/
This may be important information, considering thread safety and the behavior during iteration. Consider a loop that iterates over the unmodifiable view on the internal data. And consider that in this loop, someone calls a function that causes a modification of the internal data:
for (String s : data.getData()) {
...
data.changeInternalData();
}
This loop will break with a ConcurrentModificationException, because the internal data is modified while it is being iterated over.
The trade-off regarding the documentation here refers to the fact that, once a certain behavior is specified, clients will rely on this behavior. Imagine the client does this:
List<String> list = data.getList();
int oldSize = list.size();
data.insertElementToInternalData();
// Here, the client relies on the fact that he received
// a VIEW on the internal data:
int newSize = list.size();
assertTrue(newSize == oldSize+1);
Things like the ConcurrentModificationException could have been avoided if a true copy of the internal data had been returned, or by using a CopyOnWriteArrayList (each wrapped into a Collections.unmodifiableList). This would be the "safest" solution, in this regard:
The caller can not modify the returned list
The caller can not modify the internal state directly
If the caller modifies the internal state indirectly, then the iteration still works
But one has to think about whether so much "safety" is really required for the respective application case, and how this can be documented in a way that still allows changes to the internal implementation details.
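To make the difference concrete, here is a small sketch that iterates over an unmodifiable view while the backing list is modified inside the loop. The class and method names are hypothetical; the helper reports whether the iteration failed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class; not part of the original question.
class ViewVersusCopyOnWrite {
    // Iterates over an unmodifiable view of 'backing' and adds an element
    // to 'backing' inside the loop. Returns true if the iteration failed
    // with a ConcurrentModificationException.
    static boolean failsWhenModifiedDuringIteration(List<String> backing) {
        backing.add("a");
        backing.add("b");
        List<String> view = Collections.unmodifiableList(backing);
        try {
            for (String s : view) {
                backing.add(s + "-copy"); // indirect change of the internal data
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

Passing a new ArrayList makes the helper return true, while passing a new CopyOnWriteArrayList makes it return false, because its iterator works on the snapshot taken when the loop started.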
Typically, Iterator is used only with Iterable, for the purpose of the for-each loop. It would be pretty odd to see a non-Iterable type contain a method returning an Iterator, and it may be upsetting to the user that it cannot be used in a for-each loop.
So I suggest Iterable in this case. You could even have your class implement Iterable if that makes sense.
If you want to jump on the Java 8 wagon, returning a Stream probably is a more "modern" approach.
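The suggestion of having the class itself implement Iterable could be sketched roughly as follows. IterableDataProcessor is a made-up name, and wrapping the list in an unmodifiable view is one possible way to keep Iterator#remove from working.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical variant of the DataProcessor class from the question.
class IterableDataProcessor implements Iterable<String> {
    private final List<String> data = new ArrayList<>();

    void add(String s) {
        data.add(s);
    }

    // The iterator comes from an unmodifiable view, so calling remove()
    // on it throws UnsupportedOperationException.
    @Override
    public Iterator<String> iterator() {
        return Collections.unmodifiableList(data).iterator();
    }
}
```

Clients can then write for (String s : processor) { ... } without any getter at all.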
By the rules of encapsulation you should always return an unmodifiable list; in your case it is also a design rule, so return Collections.unmodifiableCollection(...). You don't need to name the method getUnmodifiable...: use the usual getter naming convention and use Javadoc to tell other developers what kind of list you return and why. Careless users will be alerted by an exception!
The task is simple. I have a dozen threads and one "global" list (of some objects).
Each thread periodically iterates through the whole list to find the desired object and change it (or add it if it is not present), and then iterates again and saves all objects to a file.
The JavaDoc says: "It is imperative that the user manually synchronize on the returned list when iterating over it"
So now everything I do with my list, I do inside a synchronized block.
As I understand it, the implementation of synchronizedList also contains some synchronization, which (I suppose) adds undesirable delay.
What if I use a plain List?
I think that if everything I do with the list is already in a critical section, what will I lose if I change
private static final List<JobSummary> SyncJS = Collections.synchronizedList(new ArrayList<JobSummary>());
to
private static final List<JobSummary> SyncJS = new ArrayList<>();
Or am I missing something?
The point of the synchronized wrappers is to avoid synchronizing manually. If you are already inside a synchronized block, you don't need them; they only add overhead.
The operation "find the desired object and change it" is not atomic, hence you need to synchronize on the list before performing any non-atomic actions.
If you synchronize the list for all of your operations, then you do not need Collections.synchronizedList(new ArrayList<JobSummary>()). new ArrayList<>() would work well.
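As a sketch of that advice, the following uses a plain ArrayList and performs "find, then add if absent" under one lock. JobRegistry, its methods, and the use of plain Strings instead of JobSummary objects are assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the JobSummary list; job names are Strings here.
class JobRegistry {
    private final List<String> jobs = new ArrayList<>(); // plain, unsynchronized list

    // "Find, then add if absent" as one atomic step under a single lock.
    boolean addIfAbsent(String name) {
        synchronized (jobs) {
            if (jobs.contains(name)) {
                return false;
            }
            jobs.add(name);
            return true;
        }
    }

    int size() {
        synchronized (jobs) {
            return jobs.size();
        }
    }

    // Starts several threads that all try to register the same 100 names
    // and returns the number of entries once they have finished.
    static int runDemo() {
        JobRegistry registry = new JobRegistry();
        Thread[] workers = new Thread[8];
        for (int t = 0; t < workers.length; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 100; i++) {
                    registry.addIfAbsent("job-" + i);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return registry.size();
    }
}
```

Because every access goes through the same lock, each name ends up in the list exactly once, and no synchronized wrapper is involved.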
I only call addAll and clear on the List, but I need it to be thread-safe. Is there an existing List for this? Thanks
List implementations such as ArrayList are not synchronized, so they are not thread-safe.
If you want a thread-safe list, you can wrap an existing one using
Collections.synchronizedList(List list)
A list created using Collections.synchronizedList(List list) will satisfy those requirements, provided that the synchronized list is the target object in the addAll(...) call, and never the parameter.
If the synchronized list (created as above) is the argument, then the problem is that addAll(list) iterates the argument list, and iterating a synchronized list is not atomic. If another thread updates list while it is being added, then you are liable to get a ConcurrentModificationException.
If you need to do the addAll(list) in a thread-safe fashion in the face of concurrent updates to list, then you need to make list a CopyOnWriteArrayList.
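An alternative to CopyOnWriteArrayList, which this answer does not mention but which the Collections.synchronizedList Javadoc describes for iteration, is to hold the source list's lock for the duration of the copy. The sketch below assumes that the source was created by Collections.synchronizedList and that all other threads access it only through that wrapper. SafeAddAll and copyInto are hypothetical names.

```java
import java.util.List;

// Hypothetical helper; assumes 'source' is a Collections.synchronizedList wrapper.
class SafeAddAll {
    // Copies 'source' into 'target' while holding the lock of 'source', so
    // no other thread can modify 'source' through the wrapper during the copy.
    static <T> void copyInto(List<T> target, List<T> source) {
        synchronized (source) {
            target.addAll(source);
        }
    }
}
```

If two threads copy between the same pair of synchronized lists in opposite directions, the nested locking could deadlock, so this only suits code with a fixed copy direction.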
There is a concurrent list implementation in java.util.concurrent. CopyOnWriteArrayList in particular.
If you want to use an existing list as a synchronized one, go for Collections.synchronizedList(list); if you are creating the target list yourself, you can go for CopyOnWriteArrayList.
CopyOnWriteArrayList is a concurrent replacement for synchronizedList that offers better concurrency in some common situations & eliminates the need to lock or copy the collection during iteration.
The copy-on-write collections derive their thread safety from the fact that as long as an effectively immutable object is properly published, no further synchronization is required when accessing it. They implement mutability by creating and republishing a new copy of the collection every time it is modified. Their iterators do not throw ConcurrentModificationException regardless of subsequent modifications.
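The snapshot behavior of the iterator can be seen in a short sketch like the following. The class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo of copy-on-write snapshot iteration.
class SnapshotIteration {
    // Returns the elements seen by an iterator that was created
    // before the list was modified.
    static List<String> elementsSeenByOldIterator() {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        Iterator<String> it = list.iterator(); // bound to the current array
        list.add("c");                         // installs a new internal array
        List<String> seen = new ArrayList<>();
        while (it.hasNext()) {
            seen.add(it.next());               // still walks the old array
        }
        return seen;
    }
}
```

The old iterator sees only "a" and "b"; an iterator obtained after the add would see all three elements.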
I have a list of ListIterator<PointF> as a class field. I fill it in the method grow(). When I try to use iterators from this list, I get a ConcurrentModificationException.
ListIterator<ListIterator<PointF>> i = mPoints.listIterator();
while (i.hasNext()) {
ListIterator<PointF> j = i.next();
if (j.hasNext()) {
PointF tmp = j.next(); // Exception here
}
}
I have no idea why this code causes the exception in any method besides grow().
If the underlying list changes, the iterator that was obtained before that throws ConcurrentModificationException. So don't store iterators in instance fields.
What we can say for sure is that a ConcurrentModificationException means that the underlying iterable has been modified at some point after your call to get the iterator.
This does not always mean concurrent as in multi-threaded; one can easily trigger this exception by iterating through a list and deleting elements during the loop. So, if there are no other threads potentially modifying this, then we can say that the current thread has modified an iterator's underlying data structure at some point.
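A single-threaded sketch of this situation, together with the usual fix of removing through the iterator, might look as follows. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo; no second thread is involved anywhere.
class RemoveDuringIteration {
    // Removes elements through the list while a for-each loop is running.
    // Returns true if this triggered a ConcurrentModificationException.
    static boolean naiveRemovalFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        try {
            for (Integer n : list) {
                if (n % 2 == 0) {
                    list.remove(n); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes elements through the iterator itself, which is allowed.
    static List<Integer> removeEvens() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }
        return list;
    }
}
```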
There's not enough code here to be sure, but your practice of storing iterators is a little suspicious. When did you add the (inner) iterators to mPoints? If the collection they refer to changes at any time after the iterator was created, it will throw this exception when invoked. Hence as soon as you add an iterator to the mPoints collection, the iterator's data structure is effectively locked for changes, and yet this won't be very clear in the code at all.
So I suspect this is the root cause of your problem. Unless it's for a very short term (and usually within a single lexical scope, e.g. a single method invocation) it's probably a bad idea to store iterators for the reason you're seeing. It might be better to store a reference to the underlying collections themselves, and then create the iterators during the code block above, something like:
ListIterator<Iterable<PointF>> i = mPoints.listIterator();
while (i.hasNext()) {
Iterator<PointF> j = i.next().iterator();
if (j.hasNext()) {
PointF tmp = j.next();
}
}
Then again the exact solution depends on the general architecture of your method. The main thing to bear in mind is don't store iterators long-term, because it's almost impossible to make this work reliably. Even if it does work right now, it creates a kind of invisible dependency between different parts of your code that will almost invariably be broken by someone implementing what should be a trivial change.
I need to make an ArrayList of ArrayLists thread safe. I also cannot have the client making changes to the collection. Will the unmodifiable wrapper make it thread safe or do I need two wrappers on the collection?
It depends. The wrapper will only prevent changes to the collection it wraps, not to the objects in the collection. If you have an ArrayList of ArrayLists, the global List as well as each of its element Lists need to be wrapped separately, and you may also have to do something for the contents of those lists. Finally, you have to make sure that the original list objects are not changed, since the wrapper only prevents changes through the wrapper reference, not to the original object.
You do NOT need the synchronized wrapper in this case.
On a related topic - I've seen several replies suggesting using synchronized collection in order to achieve thread safety.
Using the synchronized version of a collection doesn't make it "thread safe": although each operation (insert, count etc.) is protected by a mutex, when two operations are combined there is no guarantee that they will execute atomically.
For example the following code is not thread safe (even with a synchronized queue):
if(queue.Count > 0)
{
queue.Add(...);
}
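A Java rendering of the same check-then-act problem, and one way to make the compound action atomic, could look like this. CompoundAction and its methods are hypothetical. The atomic variant locks on the wrapper itself, which is the lock the Collections.synchronizedList Javadoc tells clients to use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo of a compound action on a synchronized list.
class CompoundAction {
    private final List<String> queue = Collections.synchronizedList(new ArrayList<>());

    void add(String s) {
        queue.add(s);
    }

    int size() {
        return queue.size();
    }

    // Not atomic: another thread may change the list between isEmpty() and add().
    void addIfNotEmptyRacy(String s) {
        if (!queue.isEmpty()) {
            queue.add(s);
        }
    }

    // Atomic: check and update happen under the wrapper's own lock.
    void addIfNotEmpty(String s) {
        synchronized (queue) {
            if (!queue.isEmpty()) {
                queue.add(s);
            }
        }
    }
}
```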
The unmodifiable wrapper only prevents changes to the structure of the list that it applies to. If this list contains other lists and you have threads trying to modify these nested lists, then you are not protected against concurrent modification risks.
From looking at the Collections source, it looks like Unmodifiable does not make it synchronized.
static class UnmodifiableSet<E> extends UnmodifiableCollection<E>
implements Set<E>, Serializable;
static class UnmodifiableCollection<E> implements Collection<E>, Serializable;
The synchronized class wrappers have a mutex object in them to do the synchronized parts, so it looks like you need to use both wrappers to get both properties. Or roll your own!
I believe that because the UnmodifiableList wrapper stores the ArrayList to a final field, any read methods on the wrapper will see the list as it was when the wrapper was constructed as long as the list isn't modified after the wrapper is created, and as long as the mutable ArrayLists inside the wrapper aren't modified (which the wrapper can't protect against).
It will be thread-safe if the unmodifiable view is safely published, and the modifiable original is never ever modified (including all objects recursively contained in the collection!) after publication of the unmodifiable view.
If you want to keep modifying the original, then you can either create a defensive copy of the object graph of your collection and return an unmodifiable view of that, or use an inherently thread-safe list to begin with, and return an unmodifiable view of that.
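The "defensive copy of the object graph" option might be sketched as below for a list of lists. DeepUnmodifiable and deepUnmodifiableCopy are made-up names, and the sketch assumes that the element type T is itself immutable and that no other thread modifies the source while the copy is being made.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; assumes immutable elements and no concurrent
// modification of 'source' while the copy is taken.
class DeepUnmodifiable {
    static <T> List<List<T>> deepUnmodifiableCopy(List<List<T>> source) {
        List<List<T>> outer = new ArrayList<>(source.size());
        for (List<T> inner : source) {
            // Copy each inner list, then make the copy read-only.
            outer.add(Collections.unmodifiableList(new ArrayList<>(inner)));
        }
        // Make the outer list read-only as well.
        return Collections.unmodifiableList(outer);
    }
}
```

Since nothing else holds a reference to the copied lists, the result can be shared between threads once it is safely published.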
You cannot return an unmodifiableList(synchronizedList(theList)) if you still intend to access theList unsynchronized afterwards; if mutable state is shared between multiple threads, then all threads must synchronize on the same locks when they access that state.
An immutable object is by definition thread safe (assuming no-one retains references to the original collections), so synchronization is not necessary.
Wrapping the outer ArrayList using Collections.unmodifiableList() prevents the client from changing its contents (and thus makes it thread safe), but the inner ArrayLists are still mutable.
Wrapping the inner ArrayLists using Collections.unmodifiableList() too prevents the client from changing their contents (and thus makes them thread safe), which is what you need.
Let us know if this solution causes problems (overhead, memory usage etc); other solutions may be applicable to your problem. :)
EDIT: Of course, if the lists are modified they are NOT thread safe. I assumed no further edits were to be made.
Not sure if I understood what you are trying to do, but I'd say the answer in most cases is "No".
If you set up an ArrayList of ArrayLists, and neither the outer nor the inner lists can ever be changed after creation (and during creation only one thread has access to the inner and outer lists), they are probably thread-safe behind a wrapper (provided both the outer and inner lists are wrapped in such a way that modifying them is impossible). All read-only operations on ArrayLists are most likely thread-safe. However, Sun does not guarantee them to be thread-safe (not even for read-only operations), so even though it might work right now, it could break in the future (for example, if Sun adds some internal caching of data for quicker access).
The synchronized wrapper is necessary if:
There is still a reference to the original modifiable list.
The list will possibly be accessed through an iterator.
If you intend to read from the ArrayList by index only, you could assume this is thread-safe.
When in doubt, choose the synchronized wrapper.