disable mutability in a Java List


It would be nice to turn objects of Java's List interface into an immutable equivalent at the point in time that mutation is no longer required. That is, the client can call a freeze method that makes the List immutable. The immediate benefit to me is thread-safety without the memory expense of deep copying. (Edit: People would be correct if they assume that one extra immutable copy, to be used by all threads, is affordable.)
Is there a third-party interface or class that provides such a feature?

How about Collections.unmodifiableList(List list)?
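A minimal sketch of what this wrapper does and does not give you. The class and method names (UnmodifiableViewDemo, demo) are made up for illustration. The wrapper refuses writes, but it is only a view, so a write through the original reference still shows through:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of any library.
class UnmodifiableViewDemo {
    // Returns { size seen through the view after a write to the backing list,
    //           1 if the view refused a write, otherwise 0 }.
    static int[] demo() {
        List<String> original = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> view = Collections.unmodifiableList(original);

        boolean rejected = false;
        try {
            view.add("c");                  // writes through the view are refused
        } catch (UnsupportedOperationException e) {
            rejected = true;
        }

        original.add("c");                  // the backing list is still mutable
        return new int[] { view.size(), rejected ? 1 : 0 };
    }
}
```

So this only "freezes" the list if nobody keeps the original reference around.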

There is an ImmutableList class as part of the Guava libraries. You can use the copyOf method to create an ImmutableList from an existing Iterable, e.g.:
List<String> immutableList = ImmutableList.copyOf(list);
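If adding Guava is not an option, the JDK offers a similar snapshot operation since Java 10, List.copyOf. Note that this swaps in a JDK call for the Guava one above, and that List.copyOf rejects null elements. The class and method names below (CopyOfDemo, frozenCopy) are hypothetical:

```java
import java.util.List;

// Hypothetical helper, assuming Java 10 or later.
class CopyOfDemo {
    // Returns an unmodifiable snapshot; later changes to source are not visible in it.
    static List<String> frozenCopy(List<String> source) {
        return List.copyOf(source);
    }
}
```

This costs the one extra copy mentioned in the question, after which the snapshot can be shared between threads.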

Try using CopyOnWriteArrayList.
CopyOnWriteArrayList behaves much like ArrayList, except that when the list is modified, a new array is created instead of modifying the underlying array, and the old array is discarded. This means that when a caller gets an iterator (i.e. copyOnWriteArrayListRef.iterator()), the iterator holds a reference to the array that backed the list at that moment. That array is never modified afterwards, so it can be traversed without synchronizing on copyOnWriteArrayListRef and without clone()-ing the list before traversal: there is no risk of a ConcurrentModificationException. Reads are cheap as a result, although every write pays for a full copy of the array.
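The snapshot behaviour described above can be sketched as follows. CowSnapshotDemo and iterateWhileModifying are illustrative names only:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class.
class CowSnapshotDemo {
    // Collects what an iterator sees when the list is modified after the
    // iterator was obtained.
    static List<String> iterateWhileModifying() {
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));
        List<String> seen = new ArrayList<>();

        Iterator<String> it = list.iterator();  // bound to the current array
        list.add("c");                          // installs a new array

        while (it.hasNext()) {                  // no ConcurrentModificationException
            seen.add(it.next());
        }
        return seen;                            // the iterator saw the old contents only
    }
}
```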

Using Collections.unmodifiableList directly is not enough if the client still has a reference to the original mutable list.
I would create a delegating list implementation that holds an internal reference to the original mutable list (the delegate) and forwards all method calls to it. It's a PITA to write such code by hand, but Eclipse, for example, can generate it automatically for you.
Then, upon calling the freeze method, I would wrap the original list with Collections.unmodifiableList, which ensures that all future method calls on the FreezingList reach the original delegate only through the unmodifiable view.
To make things more secure, but less flexible, you can change the following constructor: instead of passing the original list to it (which can still leave a reference to the original mutable list with the client), instantiate the list internally (for example as an ArrayList).
public class FreezingList<E> implements List<E> {
    // the original list you delegate to (the delegate)
    private List<E> list;
    private boolean frozen = false;

    public FreezingList(List<E> list) {
        this.list = list;
    }

    public void freeze() {
        if (!frozen) {
            list = Collections.unmodifiableList(list);
            frozen = true;
        }
    }

    // all the delegating methods follow:
    public int size() {
        return list.size();
    }

    public E get(int index) {
        return list.get(index);
    }
    // etc. etc.
}
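For a compact variant that compiles without writing every delegating method, one option is to extend java.util.AbstractList, which derives the rest of the List interface from a handful of methods. This is only a sketch under that assumption, not the answer's implementation. The name FreezableList is hypothetical, and the list is created internally as suggested above:

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a list that can be switched to read-only once.
class FreezableList<E> extends AbstractList<E> {
    // Created internally so no outside reference to the mutable list exists.
    private final List<E> delegate = new ArrayList<>();
    // Not synchronized: publishing the frozen list safely to other threads
    // is still the caller's responsibility.
    private boolean frozen = false;

    public void freeze() { frozen = true; }

    private void checkMutable() {
        if (frozen) throw new UnsupportedOperationException("list is frozen");
    }

    @Override public E get(int index) { return delegate.get(index); }
    @Override public int size() { return delegate.size(); }

    // AbstractList routes add(E), addAll, clear, iterator removal etc.
    // through the three mutators below.
    @Override public E set(int index, E element) {
        checkMutable();
        return delegate.set(index, element);
    }
    @Override public void add(int index, E element) {
        checkMutable();
        modCount++;                       // keeps iterators fail-fast
        delegate.add(index, element);
    }
    @Override public E remove(int index) {
        checkMutable();
        modCount++;
        return delegate.remove(index);
    }
}
```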

Related

Java: Curious ImmutableList add

I don't understand this method implementation:
public static <T> List<T> add(List<T> list, T element) {
    final int size = list.size();
    if (size == 0) {
        return ImmutableList.of(element);
    } else if (list instanceof ImmutableList) {
        if (size == 1) {
            final T val = list.get(0);
            list = Lists.newArrayList();
            list.add(val);
        } else {
            list = Lists.newArrayList(list);
        }
    }
    list.add(element);
    return list;
}
Why not a straightforward list.add(element)?

The code is implementing adding to the list that's given. If the input list is an ImmutableList, it first creates a mutable list (since otherwise it can't add to it) and copies the elements to it. If it's not, it just uses the existing list.
It's a bit odd that it returns an ImmutableList if the list passed in is empty, but a (mutable) ArrayList if it's given a non-empty ImmutableList to add to, but perhaps that makes sense in the broader context of where and how it's used. But that inconsistency is definitely something I'd query in a code review.

Addition for ImmutableList
Why not a straightforward list.add(element)?
You can't call that method if the given list is immutable. Actually you can, but such a method will then usually throw an UnsupportedOperationException. The documentation of Guava's ImmutableList#add says
Deprecated. Unsupported operation.
Guaranteed to throw an exception and leave the list unmodified.
However the goal of the method seems to be to also support addition for ImmutableList by creating a mutable clone. So a straightforward implementation would be:
public static <T> List<T> add(List<T> list, T element) {
    if (list instanceof ImmutableList) {
        // Create mutable clone, ArrayList is mutable
        list = Lists.newArrayList(list);
    }
    list.add(element);
    return list;
}
Other stuff
Note that the type may change. While the input may be an ImmutableList, the output definitely is not.
You could keep the type by creating a temporary clone, adding to it (as shown) and then again wrap some ImmutableList around. However that doesn't seem to be a goal of this method.
Also note that the method in some cases adds something to the given list and in others creates a new instance instead. So the caller must be aware that the method sometimes changes its argument and sometimes does not. To me this is very odd behavior; it definitely must be highlighted in the documentation, but I would not recommend doing things like that.
It seems that another goal of the method is to keep the list immutable if it was empty at method call. This is a bit strange but probably highlighted in its documentation. Therefore they add this call:
if (size == 0) {
return ImmutableList.of(element);
}
Besides that they do some minor stuff by calling
Lists.newArrayList();
instead of
Lists.newArrayList(list);
if the list is currently of size 1. However I'm not sure why they do this step. In my opinion they could just leave it the way it was.
So all in all I would probably implement such a method as
/**
 * Creates a new list with the contents of the given list
 * and the given element added to the end.
 *
 * @param <T> The type of the list's elements
 * @param list The list to use elements of, the list will not be changed
 * @param element The element to add to the end of the resulting list
 *
 * @return A new list with the contents of the given list and
 * the given element added to the end. If the given list was
 * of type {@link ImmutableList} the resulting list will
 * also be of type {@link ImmutableList}.
 */
public static <T> List<T> add(List<T> list, T element) {
    List<T> result;
    // Create a Stream of all elements for the result
    Stream<T> elements = Stream.concat(list.stream(), Stream.of(element));
    // If the list was immutable, make the result also immutable
    if (list instanceof ImmutableList) {
        // toArray(T[]::new) does not compile (generic array creation),
        // so collect directly; toImmutableList() requires Guava 21+
        result = elements.collect(ImmutableList.toImmutableList());
    } else {
        result = elements.collect(Collectors.toList());
    }
    return result;
}
That way you never change the argument list, and the result is an ImmutableList whenever the input was one. Using the Stream#concat method makes things a bit more efficient here (it is a lazy method); otherwise we would need to create temporary clones in between.
However, we do not know which goals your method has, so what it does probably makes more sense in the context of your specific code.
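The same non-mutating idea can be sketched without Guava. ListAppend.appended is a hypothetical helper, and unlike the version above it always returns an unmodifiable list, whatever the input type, because JDK immutable lists cannot be detected with instanceof:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper using only the JDK.
class ListAppend {
    // Returns a new unmodifiable list: the contents of list followed by element.
    // The argument is never modified.
    static <T> List<T> appended(List<? extends T> list, T element) {
        List<T> copy = new ArrayList<>(list.size() + 1);
        copy.addAll(list);
        copy.add(element);
        return Collections.unmodifiableList(copy);
    }
}
```

Because the result is always a fresh list, callers have to use the return value, e.g. myList = ListAppend.appended(myList, newElement).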

This method is an anti-pattern, and should not be used. It munges mutable and immutable data structures, providing the worst of both implementations.
If you're using an immutable data structure you should make that clear in your types: casting to List loses that important context. See the "Interfaces, not implementations" section of the ImmutableCollection documentation.
If you're using a mutable data structure you should avoid doing linear-time copies and instead take advantage of the data structure's mutability (carefully).
It generally does not make sense to use the two types interchangeably - if you want to add things to an existing collection use a mutable collection you own. If you intend for the collection to be immutable, don't try to add things to it. This method discards that intent and will lead to runtime errors and/or reduced performance.

The reason that this method is not "a straightforward list.add(element)" is because this method is designed to be able to add elements to ImmutableLists. Those are, quite obviously, immutable (if you look, their native add method throws an UnsupportedOperationException) and so the only way to "add" to them is to create a new list.
The fact that the newly returned list is now mutable is a strange design decision and only a wider context or input from the code's author would help solve that one.
The special case where an empty input list will return an immutable list is another strange design decision. The function would work fine without that conditional branch.
Because this method returns a copy of the list, you should be careful to assign the result to something, probably the original variable:
myList = TheClass.add(myList, newElement);
and note that the following usage will effectively do nothing:
TheClass.add(myList, newElement);

Why trimToSize/ensureCapacity methods provide 'public' level access?

Any user who wants to use java.util.ArrayList has to obey the usage contract defined by java.util.List.
But one can easily step outside this contract and call the methods trimToSize
public void trimToSize() { ... }
and ensureCapacity
public void ensureCapacity(int minCapacity) { ... }
because these two methods are neither overridden from java.util.AbstractList nor implemented from java.util.List.
So why do these two methods have public access and break the abstraction provided by java.util.List?

There are cases where efficiency matters more than data abstraction. ensureCapacity may be used to preallocate the internal buffer once when you are about to add a known number of elements. trimToSize may be used to free wasted memory when you are not going to add more elements. Neither method applies to other List implementations, so they are added to ArrayList only.
Note that a list is usually created and initially populated by one method, which knows what implementation is used. So this does not break abstraction. For example, consider this code:
public List<String> createList() {
    ArrayList<String> list = new ArrayList<>();
    // populate the list
    list.trimToSize();
    return list;
}
This way you can save memory and still return the List interface.
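The ensureCapacity case looks much the same. In this sketch the class name SquaresFactory and the method squares are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical factory: the creating method knows it uses an ArrayList,
// callers only ever see a List.
class SquaresFactory {
    static List<Integer> squares(int count) {
        ArrayList<Integer> list = new ArrayList<>();
        list.ensureCapacity(count);   // grow the backing array once, up front
        for (int i = 0; i < count; i++) {
            list.add(i * i);
        }
        return list;
    }
}
```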

Nothing here breaks List's contract. ArrayList is a specific implementation of List, and if your program has specific performance needs that are important enough for you to forgo the abstraction of using a List, there's no reason not to allow you to use specific methods to tune ArrayList's behavior.

Should Iterator or Iterable be used when exposing internal collection items?

I have a class with a private mutable list of data.
I need to expose list items given following conditions:
List should not be modifiable outside;
It should be clear for developers who use getter function that a list they get can not be modified.
Which getter function should be marked as recommended approach? Or can you offer a better solution?
class DataProcessor {
    private final ArrayList<String> simpleData = new ArrayList<>();
    private final CopyOnWriteArrayList<String> copyData = new CopyOnWriteArrayList<>();

    public void modifyData() {
        ...
    }

    public Iterable<String> getUnmodifiableIterable() {
        return Collections.unmodifiableCollection(simpleData);
    }

    public Iterator<String> getUnmodifiableIterator() {
        return Collections.unmodifiableCollection(simpleData).iterator();
    }

    public Iterable<String> getCopyIterable() {
        return copyData;
    }

    public Iterator<String> getCopyIterator() {
        return copyData.iterator();
    }
}
UPD: this question is from a real code-review discussion on the best practice for list getter implementation

The "best" solution actually depends on the intended application patterns (and not so much on "opinions", as suggested by a close-voter). Each possible solution has pros and cons that can be judged objectively (and have to be judged by the developer).
Edit: There already was a question "Should I return a Collection or a Stream?", with an elaborate answer by Brian Goetz. You should consult that answer as well before making any decision. My answer does not refer to streams, but only to different ways of exposing the data as a collection, pointing out the pros, cons and implications of the different approaches.
Returning an iterator
Returning only an Iterator is inconvenient, regardless of further details, e.g. whether it will allow modifications or not. An Iterator alone can not be used in the foreach loop. So clients would have to write
Iterator<String> it = data.getUnmodifiableIterator();
while (it.hasNext()) {
String s = it.next();
process(s);
}
whereas basically all other solutions would allow them to just write
for (String s : data.getUnmodifiableIterable()) {
process(s);
}
Exposing a Collections.unmodifiable... view on the internal data:
You could expose the internal data structure, wrapped into the corresponding Collections.unmodifiable... collection. Any attempt to modify the returned collection will cause an UnsupportedOperationException to be thrown, clearly stating that the client should not modify the data.
One degree of freedom in the design space here is whether or not you hide additional information: When you have a List, you could offer a method
private List<String> internalData;
List<String> getData() {
return Collections.unmodifiableList(internalData);
}
Alternatively, you could be less specific about the type of the internal data:
If the caller should not be able to do indexed access with the List#get(int index) method, then you could change the return type of this method to Collection<String>.
If the caller additionally should not be able to obtain the size of the returned sequence by calling Collection#size(), then you could return an Iterable<String>.
Also consider that, when exposing the less specific interfaces, you later have the choice to change the type of the internal data to be a Set<String>, for example. If you had guaranteed to return a List<String>, then changing this later may cause some headaches.
Exposing a copy of the internal data:
A very simple solution is to just return a copy of the list:
private List<String> internalData;
List<String> getData() {
return new ArrayList<String>(internalData);
}
This may have the drawback of (potentially large and frequent) memory copies, and thus should only be considered when the collection is "small".
Additionally, the caller will be able to modify the list, and he might expect the changes to be reflected in the internal state (which is not the case). This problem could be alleviated by additionally wrapping the new list into a Collections.unmodifiableList.
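A small sketch of that combination, a copy wrapped in an unmodifiable view. SnapshotHolder and addInternal are hypothetical names:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder class.
class SnapshotHolder {
    private final List<String> internalData = new ArrayList<>();

    void addInternal(String s) {
        internalData.add(s);
    }

    // Returns an unmodifiable snapshot: the caller cannot change it, and
    // later changes to internalData do not show up in it.
    List<String> getData() {
        return Collections.unmodifiableList(new ArrayList<>(internalData));
    }
}
```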
Exposing a CopyOnWriteArrayList
Exposing a CopyOnWriteArrayList directly as an Iterable is probably not a good idea: its iterators do not support Iterator#remove (they throw an UnsupportedOperationException), but the caller can still cast the returned object back to a List and modify it, and you explicitly wanted to avoid this.
The solution of exposing a CopyOnWriteArrayList which is wrapped into a Collections.unmodifiableList may be an option. It may look like a superfluously thick firewall at the first glance, but it definitely could be justified - see the next paragraph.
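One detail worth checking before choosing between these options: according to the CopyOnWriteArrayList Javadoc, its iterators do not support remove, set or add and throw UnsupportedOperationException. What a bare Iterable return type does not prevent is a cast back to List. The sketch below shows both. CowExposureDemo and its methods are illustrative names only:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical class exposing a CopyOnWriteArrayList without a wrapper.
class CowExposureDemo {
    private final CopyOnWriteArrayList<String> copyData =
            new CopyOnWriteArrayList<>(Arrays.asList("a", "b"));

    Iterable<String> getCopyIterable() {
        return copyData;              // the list itself, typed as Iterable
    }

    int size() {
        return copyData.size();
    }

    // Returns true if Iterator#remove was refused.
    static boolean iteratorRemoveRefused(Iterable<String> data) {
        Iterator<String> it = data.iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }
}
```

A caller can still write ((List<String>) demo.getCopyIterable()).add("c"), which is the reason for the additional Collections.unmodifiableList wrapper.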
General considerations
In any case, you should document the behavior religiously. Particularly, you should document that the caller is not supposed to change the returned data in any way (regardless of whether it is possible without causing an exception).
Beyond that, there is an uncomfortable trade-off: You can either be precise in the documentation, or avoid exposing implementation details in the documentation.
Consider the following case:
/**
* Returns the data. The returned list is unmodifiable.
*/
List<String> getData() {
return Collections.unmodifiableList(internalData);
}
The documentation here should in fact also state that...
/* ...
* The returned list is a VIEW on the internal data.
* Changes in the internal data will be visible in
* the returned list.
*/
This may be important information, considering thread safety and the behavior during iteration. Consider a loop that iterates over the unmodifiable view on the internal data. And consider that in this loop, someone calls a function that causes a modification of the internal data:
for (String s : data.getData()) {
...
data.changeInternalData();
}
This loop will break with a ConcurrentModificationException, because the internal data is modified while it is being iterated over.
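A self-contained version of this failure, assuming an ArrayList as the internal data. ViewIterationDemo and iterationFails are hypothetical names; getData and changeInternalData mirror the snippet above:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class.
class ViewIterationDemo {
    private final List<String> internalData =
            new ArrayList<>(Arrays.asList("a", "b", "c"));

    List<String> getData() {
        return Collections.unmodifiableList(internalData);   // a view, not a copy
    }

    void changeInternalData() {
        internalData.add("x");
    }

    // Returns true if iterating the view failed because the backing
    // list changed during the loop.
    static boolean iterationFails() {
        ViewIterationDemo data = new ViewIterationDemo();
        try {
            for (String s : data.getData()) {
                data.changeInternalData();
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```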
The trade-off regarding the documentation here refers to the fact that, once a certain behavior is specified, clients will rely on this behavior. Imagine the client does this:
List<String> list = data.getList();
int oldSize = list.size();
data.insertElementToInternalData();
// Here, the client relies on the fact that he received
// a VIEW on the internal data:
int newSize = list.size();
assertTrue(newSize == oldSize+1);
Things like the ConcurrentModificationException could have been avoided if a true copy of the internal data had been returned, or by using a CopyOnWriteArrayList (each wrapped into a Collections.unmodifiableList). This would be the "safest" solution, in this regard:
The caller can not modify the returned list
The caller can not modify the internal state directly
If the caller modifies the internal state indirectly, then the iteration still works
But one has to think about whether so much "safety" is really required for the respective application case, and how this can be documented in a way that still allows changes to the internal implementation details.

Typically, Iterator is used only with Iterable, for the purpose of the for-each loop. It would be pretty odd to see a non-Iterable type contain a method returning an Iterator, and it may be upsetting to the user that it cannot be used in a for-each loop.
So I suggest Iterable in this case. You could even have your class implement Iterable if that makes sense.
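Having the class itself implement Iterable might look like the sketch below. FeatureBag is a made-up name, and the iterator comes from an unmodifiable view so that Iterator#remove is refused:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical class that is directly usable in a for-each loop.
class FeatureBag implements Iterable<String> {
    private final List<String> items = new ArrayList<>();

    void add(String item) {
        items.add(item);
    }

    @Override
    public Iterator<String> iterator() {
        // The unmodifiable view makes Iterator#remove throw
        // UnsupportedOperationException.
        return Collections.unmodifiableList(items).iterator();
    }
}
```

Clients can then write for (String s : bag) { ... } without any getter at all.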
If you want to jump on the Java 8 wagon, returning a Stream probably is a more "modern" approach.

By the rules of encapsulation you should always return an unmodifiable list; in your case that is a design rule, so return Collections.unmodifiableCollection. You don't need to name the method getUnmodifiable...: use the usual getter naming convention and use Javadoc to tell other developers what kind of list you return and why. Careless users will be alerted with an exception!

Return list or modify by reference

In java, I have a method which is modifying the contents of a list. Is it better to use:
public List modifyList(List originalList) { // note - my real method uses generics
// iterate over originalList and modify elements
return originalList;
}
Or is it better to do the following:
public void modifyList(List originalList) {
// iterate over originalList and modify elements
// since java objects are handled by reference, the originalList will be modified
// even though the originalList is not explicitly returned by the method
}
Note - The only difference between the two methods is the return type (one function returns void and the other returns a List).

It all depends on how you are using your List: if you are implementing some kind of list and this is a non-static method of your List class, then you should write
public List modifyList() // returning list
or
public int modifyList() // number of elements changed
If it's method outside this class
About performing operations on the List or on a copy of it: consider the desired behaviour and your expectations, most importantly: do I need a copy of the "old" list? Deep copying a list adds some overhead. A shallow copy will not let you change the elements themselves (i.e. their attributes, if they are objects) without affecting the "old" list.
About returning void: it's good practice to return the changed list (or at least the number of changed elements), which allows you to chain method invocations; if that's not needed, you can always ignore the return value.

If you are just manipulating the list, it entirely depends on temperament. Some people (including me) would argue that code using the first option is easier to read (and it allows for chaining, as pointed out by Adam, if you want that sort of thing).
However, keep in mind that the list is not really passed by reference. What is passed is a copy of the reference (a pointer, really). Hence, if you reassign originalList for some reason, as in putting
originalList = new ArrayList();
in your method body, this will not affect the list you actually passed into the method.
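A short sketch of the difference. ReassignDemo and its two methods are illustrative names:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class.
class ReassignDemo {
    // Rebinds the local parameter only; the caller's list is untouched.
    static void reassign(List<String> originalList) {
        originalList = new ArrayList<>();
        originalList.add("lost");
    }

    // Mutates the object the caller's reference points to.
    static void mutate(List<String> originalList) {
        originalList.add("kept");
    }
}
```

After calling reassign(list) the caller's list is unchanged, whereas after mutate(list) it contains "kept".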

In my opinion you should only encourage method chaining with immutable classes.
If your function mutates an object it is too easy to do it accidentally if in a chain of methods.

One possible benefit of Option 1 is that it can accept a null List. For example, if you are collecting Foos, and generally create a brand new List, but want the option to add to an existing list. e.g. (note name of method as well)
public List<Foo> appendFoos(List<Foo> in) {
    if (in == null)
        in = new ArrayList<Foo>();
    // now go do it, e.g.
    in.add(someFooIFound);
    return in;
}
and, if you wish, add an explicit no-arg "get" method as well
public List<Foo> getFoos() {
return appendFoos(null);
}
Now, in Option #2, you could do this by having the user create a new, empty ArrayList and passing that in, but Option #1 is more convenient. i.e.
Option 1 Usage:
List<Foo> theFoos = getFoos();
Option 2 Usage:
List<Foo> theFoos = new ArrayList<Foo>();
appendFoos(theFoos);

Since List is mutable, the second method is better. You don't need to return the modified List.

How do I get the underlying static array out of a CopyOnWriteArrayList in Java?

I have a class which maintains a list of features of the class. These features change infrequently compared to the reads. The reads are almost always iterations through the feature list. Because of this, I'm using a CopyOnWriteArrayList.
I want to have a function like this:
function Feature[] getFeatures() {
.. implementation goes here ..
}
I admit, the reason may be a bit of laziness. I'd like to write code like this:
for (Feature f: object.getFeatures()) {
.. do something interesting ..
}
rather than this:
Iterator<Feature> iter = object.getFeatureIterator();
while (iter.hasNext()) {
Feature f = iter.next();
.. do something interesting ..
}
The main question is - am I being lazy here? I'm going to follow this pattern a lot, and I think the first chunk of code is far easier to maintain. Obviously, I would never change the underlying array, and I would put this in the documentation.
What is the proper way to handle this situation?

Just call the toArray method on the list:
public Feature[] getFeatures() {
return this.featureList.toArray(new Feature[this.featureList.size()]);
}
Note that the foreach syntax can be used with all the Iterable objects, and List is Iterable, so you could just have
public List<Feature> getFeatures() {
return this.features;
}
and use the same foreach loop. If you don't want the callers to modify the internal list, return an unmodifiable view of the list:
public List<Feature> getFeatures() {
return Collections.unmodifiableList(this.features);
}

I don't understand your reason: return a List<Feature>, use your CopyOnWriteArrayList or an unmodifiable copy, and use the foreach. Why do you specifically want an array?

Class CopyOnWriteArrayList implements Iterable, which is all you need to use the sugared for loop syntax. You don't need to get hold of an Iterator explicitly in the case you describe above.
Did you find that it doesn't compile?

You can clone the list:
public List<Feature> getFeatures() {
    return (List<Feature>) this.features.clone();
}
Cloning a CopyOnWriteArrayList doesn't copy the underlying array.

The copy-on-write (COW) nature of a CopyOnWriteArrayList object means that at every modification of the list you get a new instance of its underlying array, with all entries but the modification copied from the previous instance.
To give you a coherent view while iterating over the list when other threads keep changing its content, a call to the iterator() method ties the iterator to the array, not the list. When a modification changes the list, a new array holds the new content, but the iterator continues to run through the old array that was available when iterator() was called. This means that you walk through a coherent snapshot of the list.
(In contrast, a loop such as for (int i = 0; i < list.size(); ++i) doSomethingWith(list.get(i)); does not protect you from modifications from other threads, and you may easily run off the end of the list if between a call to list.size() and the corresponding list.get(i), some elements have been deleted!)
Since the for each style for-loop uses iterators under the covers, you get this coherence guarantee. You also get it when using the forEach() method. This iterates (using array indexing) within the array available at the time forEach() is invoked.
Finally, the clone() method essentially takes a similar snapshot. If your code is the only place you use the cloned version, or you make it unmodifiable, you'll be consulting the original snapshot. The COW nature of the original shields your copy from changes to the original. If you don't want to rely on clone(), which has issues, you can copy out the list data, but that involves at least copying the array with another allocation.
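A sketch of the clone-as-snapshot idea, with the clone made unmodifiable as suggested. CowCloneDemo and snapshot are hypothetical names. CopyOnWriteArrayList#clone returns Object, hence the unchecked cast:

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical helper.
class CowCloneDemo {
    // Returns a read-only snapshot. The clone shares the source's current
    // array; the source replaces its own array on its next write, so the
    // snapshot never sees later changes.
    @SuppressWarnings("unchecked")
    static List<String> snapshot(CopyOnWriteArrayList<String> source) {
        return Collections.unmodifiableList((List<String>) source.clone());
    }
}
```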
