I have a method:
public List getDocuments(Long orgClientId, long userId) {
// do stuff
List list = new ArrayList();
// do stuff to fill the list. The list will be of hashmaps.
return list;
}
I don't want my list to be modified outside this method. How should I return it then? Cloning it? Or just returning a new instance like
return new ArrayList<>(list);
would do it?
You can make it superficially unmodifiable thus:
return Collections.unmodifiableList(list);
This wraps the list so that invoking any of the mutation methods will result in an UnsupportedOperationException.
However, does it really matter to you if somebody modifies the list? You are giving back a new list instance each time this method is invoked, so there is no issue with two callers getting the same instance and having to deal with interactions between the mutations.
Whatever step you take to attempt to make it unmodifiable, I can just copy the list into a modifiable container myself.
It's also quite inconvenient to me as a caller if you give me back something which I can't detect whether it is mutable or not - it looks like a List; the only way to see if I can mutate it is by calling a mutation method. That's a runtime failure, which makes it a PITA.
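To make the runtime failure concrete, here is a minimal sketch using only the JDK. The class name DocumentService and the map contents are made up for illustration; only Collections.unmodifiableList and the copy into a new ArrayList come from the discussion above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the class that owns getDocuments.
class DocumentService {
    public List<Map<String, Object>> getDocuments(Long orgClientId, long userId) {
        List<Map<String, Object>> list = new ArrayList<>();
        Map<String, Object> doc = new HashMap<>();
        doc.put("owner", userId); // illustrative content only
        list.add(doc);
        // Wrap the list so that its mutation methods throw at runtime.
        return Collections.unmodifiableList(list);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = new DocumentService().getDocuments(1L, 42L);
        try {
            docs.add(new HashMap<>());
        } catch (UnsupportedOperationException e) {
            System.out.println("add() rejected at runtime");
        }
        // The caller can still copy the contents into a modifiable list.
        List<Map<String, Object>> mine = new ArrayList<>(docs);
        mine.add(new HashMap<>());
        System.out.println(mine.size()); // prints 2
    }
}
```

Note that the wrapper is shallow: the maps inside the list remain modifiable either way.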
Update, to add an alternative.
An alternative to Collections.unmodifiableList is something like Guava's ImmutableList: this will likely be faster than Collections.unmodifiableList since it does not simply provide an unmodifiable view of existing list by wrapping, but rather provides a custom implementation of List which forbids mutation (there is less indirection).
The advantage of this is that you can return ImmutableList rather than just the List returned by Collections.unmodifiableList: this allows you to give the caller a bit more type information that they can't mutate it. ImmutableList (and its siblings like ImmutableSet) mark the mutation methods @Deprecated so your IDE can warn you that you might be doing something iffy.
The caller might choose to disregard that information (they assign the result to a List, rather than an ImmutableList), but that's their choice.
The idea is that the object returned by unmodifiableCollection can't be changed directly, but it can still change through other means (effectively by changing the underlying collection directly).
List<String> list = new ArrayList<String>();
list.add("One");
list.add("Two");
list.add("Three");
List<String> unmodifiableList = Collections.unmodifiableList(list);
// this doesn't throw an exception since it's using the add
// method of the original List reference, which is no problem
list.add("Four");
System.out.println(unmodifiableList);
// this, however, throws an exception
unmodifiableList.add("Five");
unmodifiableList returns an unmodifiable view of the specified list.
This method allows modules to provide users with "read-only" access to
internal lists. Query operations on the returned list "read through"
to the specified list, and attempts to modify the returned list,
whether direct or via its iterator, result in an
UnsupportedOperationException. The returned list will be serializable
if the specified list is serializable.
I don't understand this method implementation. It's
public static <T> List<T> add(List<T> list, T element) {
final int size = list.size();
if (size == 0) {
return ImmutableList.of(element);
} else if (list instanceof ImmutableList) {
if (size == 1) {
final T val = list.get(0);
list = Lists.newArrayList();
list.add(val);
} else {
list = Lists.newArrayList(list);
}
}
list.add(element);
return list;
}
Why not a straightforward list.add(element)?
The code is implementing adding to the list that's given. If the input list is an ImmutableList, it first creates a mutable list (since otherwise it can't add to it) and copies the elements to it. If it's not, it just uses the existing list.
It's a bit odd that it returns an ImmutableList if the list passed in is empty, but a (mutable) ArrayList if it's given a non-empty ImmutableList to add to, but perhaps that makes sense in the broader context of where and how it's used. But that inconsistency is definitely something I'd query in a code review.
Addition for ImmutableList
Why not a straightforward list.add(element)?
You can't usefully call that method if the given list is immutable. You can call it, but such a method will then usually throw an UnsupportedOperationException. The documentation of Guava's ImmutableList#add says
Deprecated. Unsupported operation.
Guaranteed to throw an exception and leave the list unmodified.
However the goal of the method seems to be to also support addition for ImmutableList by creating a mutable clone. So a straightforward implementation would be:
public static <T> List<T> add(List<T> list, T element) {
if (list instanceof ImmutableList) {
// Create mutable clone, ArrayList is mutable
list = Lists.newArrayList(list);
}
list.add(element);
return list;
}
Other stuff
Note that the type may change. While the input may be an ImmutableList, the output definitely is not.
You could keep the type by creating a temporary clone, adding to it (as shown) and then again wrap some ImmutableList around. However that doesn't seem to be a goal of this method.
Also note that the method in some cases adds something to the given list and in others creates a new instance instead. So the caller of the method must be aware that the method sometimes changes its argument and sometimes does not. To me this is very odd behavior; it definitely must be highlighted in the documentation, but I would not recommend doing stuff like that.
It seems that another goal of the method is to keep the list immutable if it was empty at method call. This is a bit strange but probably highlighted in its documentation. Therefore they add this call:
if (size == 0) {
return ImmutableList.of(element);
}
Besides that they do some minor stuff by calling
Lists.newArrayList();
instead of
Lists.newArrayList(list);
if the list is currently of size 1. However I'm not sure why they do this step. In my opinion they could just leave it the way it was.
So all in all I would probably implement such a method as
import com.google.common.collect.ImmutableList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/**
 * Creates a new list with the contents of the given list
 * and the given element added to the end.
 *
 * @param <T> The type of the list's elements
 * @param list The list to take elements from; the list will not be changed
 * @param element The element to add to the end of the resulting list
 *
 * @return A new list with the contents of the given list and
 * the given element added to the end. If the given list was
 * of type {@link ImmutableList} the resulting list will
 * also be of type {@link ImmutableList}.
 */
public static <T> List<T> add(List<T> list, T element) {
    List<T> result;
    // Create a Stream of all elements for the result
    Stream<T> elements = Stream.concat(list.stream(), Stream.of(element));
    // If the list was immutable, make the result also immutable
    if (list instanceof ImmutableList) {
        // copyOf(Iterator) avoids a generic array (T[]::new does not compile)
        result = ImmutableList.copyOf(elements.iterator());
    } else {
        result = elements.collect(Collectors.toList());
    }
    return result;
}
This way you never change the argument list, and the result stays an ImmutableList if the input was one. Using the Stream#concat method makes things a bit more efficient here (it is lazy); otherwise we would need to create temporary clones in between.
However, we do not know the goals of your method, so in its specific context the original behavior may well make more sense.
This method is an anti-pattern, and should not be used. It munges mutable and immutable data structures, providing the worst of both implementations.
If you're using an immutable data structure you should make that clear in your types - casting to List loses that important context. See the "Interfaces, not implementations" section of the ImmutableCollection documentation.
If you're using a mutable data structure you should avoid doing linear-time copies and instead take advantage of the data structure's mutability (carefully).
It generally does not make sense to use the two types interchangeably - if you want to add things to an existing collection use a mutable collection you own. If you intend for the collection to be immutable, don't try to add things to it. This method discards that intent and will lead to runtime errors and/or reduced performance.
The reason that this method is not "a straightforward list.add(element)" is because this method is designed to be able to add elements to ImmutableLists. Those are, quite obviously, immutable (if you look, their native add method throws an UnsupportedOperationException) and so the only way to "add" to them is to create a new list.
The fact that the newly returned list is now mutable is a strange design decision and only a wider context or input from the code's author would help solve that one.
The special case where an empty input list will return an immutable list is another strange design decision. The function would work fine without that conditional branch.
Because this method may return a new list rather than the one passed in (it does so whenever the input is empty or an ImmutableList), you should be careful to assign the result to something, probably the original variable:
myList = TheClass.add(myList, newElement);
and note that in those cases the following usage will effectively do nothing:
TheClass.add(myList, newElement);
I have a class with a private mutable list of data.
I need to expose list items given following conditions:
List should not be modifiable outside;
It should be clear for developers who use getter function that a list they get can not be modified.
Which getter function should be marked as recommended approach? Or can you offer a better solution?
class DataProcessor {
private final ArrayList<String> simpleData = new ArrayList<>();
private final CopyOnWriteArrayList<String> copyData = new CopyOnWriteArrayList<>();
public void modifyData() {
...
}
public Iterable<String> getUnmodifiableIterable() {
return Collections.unmodifiableCollection(simpleData);
}
public Iterator<String> getUnmodifiableIterator() {
return Collections.unmodifiableCollection(simpleData).iterator();
}
public Iterable<String> getCopyIterable() {
return copyData;
}
public Iterator<String> getCopyIterator() {
return copyData.iterator();
}
}
UPD: this question is from a real code-review discussion on the best practice for list getter implementation
The "best" solution actually depends on the intended application patterns (and not so much on "opinions", as suggested by a close-voter). Each possible solution has pros and cons that can be judged objectively (and have to be judged by the developer).
Edit: There already was a question "Should I return a Collection or a Stream?", with an elaborate answer by Brian Goetz. You should consult that answer as well before making any decision. My answer does not refer to streams, but only to different ways of exposing the data as a collection, pointing out the pros, cons and implications of the different approaches.
Returning an iterator
Returning only an Iterator is inconvenient, regardless of further details, e.g. whether it will allow modifications or not. An Iterator alone can not be used in the foreach loop. So clients would have to write
Iterator<String> it = data.getUnmodifiableIterator();
while (it.hasNext()) {
String s = it.next();
process(s);
}
whereas basically all other solutions would allow them to just write
for (String s : data.getUnmodifiableIterable()) {
process(s);
}
Exposing a Collections.unmodifiable... view on the internal data:
You could expose the internal data structure, wrapped into the corresponding Collections.unmodifiable... collection. Any attempt to modify the returned collection will cause an UnsupportedOperationException to be thrown, clearly stating that the client should not modify the data.
One degree of freedom in the design space here is whether or not you hide additional information: When you have a List, you could offer a method
private List<String> internalData;
List<String> getData() {
return Collections.unmodifiableList(internalData);
}
Alternatively, you could be less specific about the type of the internal data:
If the caller should not be able to do indexed access with the List#get(int index) method, then you could change the return type of this method to Collection<String>.
If the caller additionally should not be able to obtain the size of the returned sequence by calling Collection#size(), then you could return an Iterable<String>.
Also consider that, when exposing the less specific interfaces, you later have the choice to change the type of the internal data to be a Set<String>, for example. If you had guaranteed to return a List<String>, then changing this later may cause some headaches.
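The three levels of specificity described above could be sketched as follows. The class DataHolder and its method names are hypothetical; in a real class you would pick one of the getters rather than offer all three.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical holder class; the names are illustrative only.
class DataHolder {
    private final List<String> internalData = new ArrayList<>();

    void addInternally(String s) {
        internalData.add(s);
    }

    // Most specific: callers get indexed access via get(int).
    List<String> getDataAsList() {
        return Collections.unmodifiableList(internalData);
    }

    // Hides indexed access, but still exposes size() and contains().
    Collection<String> getDataAsCollection() {
        return Collections.unmodifiableCollection(internalData);
    }

    // Least specific: the declared type only allows iteration.
    Iterable<String> getDataAsIterable() {
        return Collections.unmodifiableCollection(internalData);
    }
}
```

Narrowing the declared return type only hides methods at compile time. In this sketch the object behind the Iterable is still a Collection at runtime, so a determined caller could downcast it.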
Exposing a copy of the internal data:
A very simple solution is to just return a copy of the list:
private List<String> internalData;
List<String> getData() {
return new ArrayList<String>(internalData);
}
This may have the drawback of (potentially large and frequent) memory copies, and thus should only be considered when the collection is "small".
Additionally, the caller will be able to modify the list, and he might expect the changes to be reflected in the internal state (which is not the case). This problem could be alleviated by additionally wrapping the new list into a Collections.unmodifiableList.
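A minimal sketch of that combination (copy first, then wrap), assuming a hypothetical SnapshotHolder class with some illustrative initial content:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical holder class used only for illustration.
class SnapshotHolder {
    private final List<String> internalData = new ArrayList<>(Arrays.asList("a", "b"));

    void addInternally(String s) {
        internalData.add(s);
    }

    // Returns an unmodifiable snapshot; this costs an O(n) copy on every call.
    List<String> getData() {
        return Collections.unmodifiableList(new ArrayList<>(internalData));
    }
}
```

A list obtained from getData() neither accepts modifications nor reflects later changes to the internal state.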
Exposing a CopyOnWriteArrayList
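Exposing a CopyOnWriteArrayList directly as an Iterable is probably not a good idea: the returned object is still the list itself, so the caller can cast it back to a List and modify it, which you explicitly wanted to avoid. Exposing only its Iterator is safe in this respect, because the iterators of a CopyOnWriteArrayList work on a snapshot and throw an UnsupportedOperationException on remove, but it has the usability drawback described above.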
The solution of exposing a CopyOnWriteArrayList which is wrapped into a Collections.unmodifiableList may be an option. It may look like a superfluously thick firewall at the first glance, but it definitely could be justified - see the next paragraph.
General considerations
In any case, you should document the behavior religiously. Particularly, you should document that the caller is not supposed to change the returned data in any way (regardless of whether it is possible without causing an exception).
Beyond that, there is an uncomfortable trade-off: You can either be precise in the documentation, or avoid exposing implementation details in the documentation.
Consider the following case:
/**
* Returns the data. The returned list is unmodifiable.
*/
List<String> getData() {
return Collections.unmodifiableList(internalData);
}
The documentation here should in fact also state that...
/* ...
* The returned list is a VIEW on the internal data.
* Changes in the internal data will be visible in
* the returned list.
*/
This may be important information, considering thread safety and the behavior during iteration. Consider a loop that iterates over the unmodifiable view on the internal data. And consider that in this loop, someone calls a function that causes a modification of the internal data:
for (String s : data.getData()) {
...
data.changeInternalData();
}
This loop will break with a ConcurrentModificationException, because the internal data is modified while it is being iterated over.
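The failure can be reproduced in a single thread. The following sketch assumes an ArrayList as the internal data; the class name ViewIterationDemo and the helper iterationFails are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical class mirroring the getData()/changeInternalData() names used above.
class ViewIterationDemo {
    private final List<String> internalData = new ArrayList<>(Arrays.asList("a", "b", "c"));

    List<String> getData() {
        return Collections.unmodifiableList(internalData);
    }

    void changeInternalData() {
        internalData.add("x");
    }

    // Returns true if iterating the view while modifying the backing list fails.
    static boolean iterationFails() {
        ViewIterationDemo data = new ViewIterationDemo();
        try {
            for (String s : data.getData()) {
                data.changeInternalData();
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(iterationFails()); // prints true
    }
}
```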
The trade-off regarding the documentation here refers to the fact that, once a certain behavior is specified, clients will rely on this behavior. Imagine the client does this:
List<String> list = data.getData();
int oldSize = list.size();
data.insertElementToInternalData();
// Here, the client relies on the fact that he received
// a VIEW on the internal data:
int newSize = list.size();
assertTrue(newSize == oldSize+1);
Things like the ConcurrentModificationException could have been avoided if a true copy of the internal data had been returned, or by using a CopyOnWriteArrayList (each wrapped into a Collections.unmodifiableList). This would be the "safest" solution, in this regard:
The caller can not modify the returned list
The caller can not modify the internal state directly
If the caller modifies the internal state indirectly, then the iteration still works
But one has to think about whether so much "safety" is really required for the respective application case, and how this can be documented in a way that still allows changes to the internal implementation details.
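As a rough sketch of the "safest" variant, here is a CopyOnWriteArrayList wrapped into Collections.unmodifiableList. The class CopyOnWriteHolder and the helper iterateWhileModifying are hypothetical names.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical holder class used only for illustration.
class CopyOnWriteHolder {
    private final List<String> internalData = new CopyOnWriteArrayList<>();

    void insertElementToInternalData(String s) {
        internalData.add(s);
    }

    List<String> getData() {
        return Collections.unmodifiableList(internalData);
    }

    // Iterates the view while adding elements; returns how many elements the loop saw.
    static int iterateWhileModifying() {
        CopyOnWriteHolder data = new CopyOnWriteHolder();
        data.insertElementToInternalData("a");
        data.insertElementToInternalData("b");
        int seen = 0;
        for (String s : data.getData()) {
            // The iterator works on a snapshot, so this does not disturb the loop.
            data.insertElementToInternalData(s + "'");
            seen++;
        }
        return seen;
    }
}
```

The loop sees only the two elements that existed when iteration started, and no ConcurrentModificationException is thrown. The price is that every write copies the underlying array.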
Typically, Iterator is used only together with Iterable, for the purpose of the for-each loop. It would be pretty odd to see a non-Iterable type contain a method returning an Iterator, and it may be upsetting to the user that it cannot be used in a for-each loop.
So I suggest Iterable in this case. You could even have your class implement Iterable if that makes sense.
If you want to jump on the Java 8 wagon, returning a Stream probably is a more "modern" approach.
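Both suggestions could be sketched like this. The class name IterableDataProcessor and the methods addData and stream are assumptions made for the example, not part of the question's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical variant of DataProcessor that is itself Iterable.
class IterableDataProcessor implements Iterable<String> {
    private final List<String> simpleData = new ArrayList<>();

    public void addData(String s) {
        simpleData.add(s);
    }

    @Override
    public Iterator<String> iterator() {
        // The iterator of the unmodifiable wrapper rejects remove().
        return Collections.unmodifiableList(simpleData).iterator();
    }

    // Java 8 alternative: expose the elements as a Stream.
    public Stream<String> stream() {
        return simpleData.stream();
    }
}
```

Clients can then write for (String s : processor) { ... } directly.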
By the rules of encapsulation you should always return an unmodifiable list; in your case it is also a design rule. So return Collections.unmodifiableCollection, and don't name the method getUnmodifiable...: use the usual getter naming convention and use Javadoc to tell other developers what kind of list you return and why. Careless users will be alerted by an exception!
Consider the following code:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class Test {
public static void main(String[] args) {
List<Integer> intList1=new ArrayList<Integer>();
List<Integer> intList2;
intList1.add(1);
intList1.add(2);
intList1.add(3);
intList2=Collections.unmodifiableList(intList1);
intList1.add(4);
for(int i=0;i<4;i++)
{
System.out.println(intList2.get(i));
}
}
}
The result of the above code is
1
2
3
4
In the above code we create an unmodifiable List intList2 from the contents of the List intList1. But when I make a change to intList1 after the Collections.unmodifiableList statement, that change is reflected in intList2. How is this possible?
You need to read the Javadoc for Collections.unmodifiableList
Returns an unmodifiable view of the specified list.
This means that the returned view is unmodifiable. If you have the original reference you can change the collection. If you change the collection then changes will be reflected in the view.
This has the advantage of being very fast, i.e. you don't need to copy the collection, but the disadvantage that you noted - the resulting collection is a view.
In order to create a truly unmodifiable collection you would need to copy then wrap:
intList2=Collections.unmodifiableList(new ArrayList<>(intList1));
This copies the contents of intList1 into another collection then wraps that collection in the unmodifiable variable. No reference to the wrapped collection exists.
This is expensive - the entire underlying datastore (an array in this case) needs to be duplicated, which (generally) takes O(n) time.
Google Guava provides immutable collections which solve some of the problems of making defensive copies:
If a collection is already immutable it is not copied again
Provide an interface which can be used to explicitly state that a collection is immutable
Provide numerous static factory methods to generate immutable collections
But speed is still the key concern when using immutable copies of collections rather than unmodifiable views.
It should be noted that the usual use for Collections.unmodifiableXXX is to return from a method, for example a getter:
public Collection<Thing> getThings() {
return Collections.unmodifiableCollection(things);
}
In this case there are two things to note:
The user of getThings cannot access things so the unmodifiability cannot be broken.
It would be very expensive to copy things each time a getter were called.
In summary the answer to your question is a little more complex than you might have expected and there are a number of aspects to consider when passing collections around in your application.
From the Javadoc of Collections.unmodifiableList:
Returns an unmodifiable view of the specified list. This method allows modules to provide users with "read-only" access to internal lists.
It prevents the returned list from being modified, but the original list itself can still be modified.
In your code, the line
intList2=Collections.unmodifiableList(intList1);
creates an unmodifiable list in intList2. So you are free to make changes to intList1, but you are not allowed to make any changes to intList2.
Try this:
intList2.add(4);
and you will get the following exception:
java.lang.UnsupportedOperationException
at java.util.Collections$UnmodifiableCollection.add(Unknown Source)
Collections.unmodifiableList returns a "read-only" view of the internal list. While the object that was returned is not modifiable, the original list that it references can be modified. Both references ultimately point to the same list in memory, so the view reflects any changes made to it.
Here is a good explanation of what is happening.
That happens because the unmodifiable list is internally backed by the first list. If you really want it to be unmodifiable, you shouldn't use the first List any more.
In java, I have a method which is modifying the contents of a list. Is it better to use:
public List modifyList(List originalList) { // note - my real method uses generics
// iterate over originalList and modify elements
return originalList;
}
Or is it better to do the following:
public void modifyList(List originalList) {
// iterate over originalList and modify elements
// since java objects are handled by reference, the originalList will be modified
// even though the originalList is not explicitly returned by the method
}
Note - The only difference between the two methods is the return type (one function returns void and the other returns a List).
It all depends on how you are using your List - if you are implementing some kind of list and this is a non-static method of your List class, then you should write
public List modifyList() // returning list
or
public int modifyList() // number of elements changed
If it's a method outside this class:
About performing operations on the List or on its copy: you should consider the desired behaviour and your expectations, most importantly: do I need a copy of the "old" list? Deep-copying a list can add some overhead. A shallow copy will not let you perform operations on the elements themselves (i.e. changing their attributes, if they are objects) without affecting the "old" list.
About returning void: it's good practice to return the changed list (or at least the number of changed elements), which allows you to chain method invocations; if that is not needed, you can always ignore the return value.
If you are just manipulating the list, it entirely depends on temperament. Some people (including me) would argue that it is easier to read code using the first option (and it allows for chaining, as pointed out by Adam, if you want that sort of thing).
However, keep in mind that the list is not really passed by reference: what is passed in is a copy of the reference, much like a pointer. Hence, if you reassign originalList for some reason, as in putting
originalList = new ArrayList();
in your method body, this will not affect the list you actually passed into the method.
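The difference between mutating the passed-in list and reassigning the parameter can be seen in a small sketch. The class ReassignDemo and its two methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only the parameter name originalList comes from the question.
class ReassignDemo {
    // Mutates the list object that the caller passed in.
    static void mutate(List<String> originalList) {
        originalList.add("added");
    }

    // Only rebinds the local parameter; the caller's list is untouched.
    static void reassign(List<String> originalList) {
        originalList = new ArrayList<>();
        originalList.add("lost");
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        mutate(list);
        reassign(list);
        System.out.println(list); // prints [added]
    }
}
```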
In my opinion you should only encourage method chaining with immutable classes.
If your function mutates an object, it is too easy to do so accidentally in the middle of a chain of methods.
One possible benefit of Option 1 is that it can accept a null List. For example, if you are collecting Foos, and generally create a brand new List, but want the option to add to an existing list. e.g. (note name of method as well)
public List<Foo> appendFoos(List<Foo> in) {
if (in == null)
in = new ArrayList<Foo>();
// now go do it, e.g.
in.add(someFooIFound);
return in;
}
and, if you wish, add an explicit no-arg "get" method as well
public List<Foo> getFoos() {
return appendFoos(null);
}
Now, in Option #2, you could do this by having the user create a new, empty ArrayList and passing that in, but Option #1 is more convenient. i.e.
Option 1 Usage:
List<Foo> theFoos = getFoos();
Option 2 Usage:
List<Foo> theFoos = new ArrayList<Foo>();
appendFoos(theFoos);
Since List is mutable, the second method is better. You don't need to return the modified List.
private List list;
If we use Collections.unmodifiableCollection(list), will this return a copy of the collection, or is it faster than creating a copy? We could do other.addAll(list), but we have a list of 600,000 objects, so addAll is not a good option.
Caller just needs a read-only collection.
Collections.unmodifiableList just returns an unmodifiable wrapper; it does not copy the contents of the input list.
Its Javadoc states this fairly clearly:
Returns an unmodifiable view of the specified list. This method allows modules to provide users with "read-only" access to internal lists. Query operations on the returned list "read through" to the specified list, and attempts to modify the returned list, whether direct or via its iterator, result in an UnsupportedOperationException.
As Matt Ball mentioned, if you don't need the internal List to be mutable, you may want to just store a Guava ImmutableList internally... you can safely give that to callers directly since it can never change.
Does Collections.unmodifiableCollection(list) copy the collection?
The other answers are correct (+1s all around): the answer is no.
Instead of Collections.unmodifiableList() you can use Guava's ImmutableList.copyOf() to create an immutable (not modifiable) list copy.
Collections.unmodifiableCollection(..) simply wraps the original collection, disabling methods for modification. It doesn't copy it.
If you change the original list, the "unmodifiable" collection will also change. But the client having only the unmodifiable collection can't change it.
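The contrast between the wrapper and a real copy can be checked with a short sketch. The class ViewVersusCopyDemo and the helper sizesAfterChange are hypothetical, and the tiny list stands in for the 600,000 objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class used only for illustration.
class ViewVersusCopyDemo {
    // Returns { size of the view, size of the copy } after the original list changed.
    static int[] sizesAfterChange() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b"));

        // Wrapping is constant-time: no elements are copied.
        Collection<String> view = Collections.unmodifiableCollection(list);
        // Copying is linear in the size of the list.
        List<String> copy = new ArrayList<>(list);

        list.add("c");
        return new int[] { view.size(), copy.size() };
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterChange();
        System.out.println(sizes[0]); // prints 3
        System.out.println(sizes[1]); // prints 2
    }
}
```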