I've come across the term view several times while using Guava collections and reading their documentation.
I've looked for an explanation of what a view is in this context and whether it's a term used outside of Guava. It's quite often used here. This type from Guava has view in its name.
My guess is that a view of a collection is another collection with the same data but structured differently; for instance when I add the entries from a java.util.HashSet to a java.util.LinkedHashSet the latter would be a view of the former. Is that correct?
Can somebody hook me up with a link to an accepted definition of view, if there is one?
Thanks.
A view of another object doesn't contain its own data at all. All of its operations are implemented in terms of operations on the other object.
For example, the keySet() view of a Map might have an implementation that looks something like this:
class KeySet implements Set<K> {
    private final Map<K, V> map;

    public boolean contains(Object o) {
        return map.containsKey(o);
    }
    ...
}
In particular, whenever you modify the backing object of your view -- here, the Map backs the keySet() -- the view reflects the same changes. For example, if you call map.remove(key), then keySet.contains(key) will return false without you having to do anything else.
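The read-through behaviour described above can be checked with a small, self-contained sketch. The class name KeySetViewDemo and the sample keys are made up for illustration; only HashMap.keySet() itself comes from the JDK.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class KeySetViewDemo {
    // Returns what the key-set view reports before and after the map changes.
    static boolean[] observe() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        Set<String> keys = map.keySet();       // a view, not a copy
        boolean before = keys.contains("a");   // true

        map.remove("a");                       // modify the backing map only
        boolean after = keys.contains("a");    // false: the view reads through

        keys.remove("b");                      // removing via the view updates the map too
        boolean mapStillHasB = map.containsKey("b"); // false

        return new boolean[] {before, after, mapStillHasB};
    }

    public static void main(String[] args) {
        boolean[] r = observe();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // true false false
    }
}
```

Note that the relationship works in both directions here: HashMap's key set supports removal, so removing a key through the view removes the entry from the map.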
Alternately, Arrays.asList(array) provides a List view of that array.
String[] strings = {"a", "b", "c"};
List<String> list = Arrays.asList(strings);
System.out.println(list.get(0)); // "a"
strings[0] = "d";
System.out.println(list.get(0)); // "d"
list.set(0, "e");
System.out.println(strings[0]); // "e"
A view is just another way of looking at the data in the original backing object -- Arrays.asList lets you use the List API to access a normal array; Map.keySet() lets you access the keys of a Map as if it were a perfectly ordinary Set -- all without copying the data or creating another data structure.
Generally, the advantage of using a view instead of making a copy is the efficiency. For example, if you have an array and you need to get it to a method that takes a List, you're not creating a new ArrayList and a whole copy of the data -- the Arrays.asList view takes only constant extra memory, and just implements all the List methods by accessing the original array.
A view in this context is a collection backed by another collection (or array) that itself uses a constant amount of memory (i.e. the memory does not depend on the size of the backing collection). Operations applied to the view are delegated to the backing collection (or array). Of course it's possible to expand this definition beyond just collections, but your question seems to pertain specifically to them.
For example, Arrays.asList() returns "a list view of the specified array". It does not copy the elements to a new list but rather creates a list that contains a reference to the array and operates based on that.
Another example is Collections.unmodifiableList() which returns "an unmodifiable view of the specified list". In other words, it returns a list containing a reference to the specified list to which all operations are delegated. In this case, the list returned does not permit you to modify it in any way, and so instead of delegating methods responsible for mutating the list, it throws an exception when such methods are called instead.
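As a rough illustration of the delegation described above, here is a minimal sketch of a read-only list view. ReadOnlyListView is a hypothetical class written for this example, not the JDK's actual implementation of Collections.unmodifiableList(); it relies on the fact that AbstractList's default mutators throw UnsupportedOperationException.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper for illustration; not the JDK's real implementation.
class ReadOnlyListView<E> extends AbstractList<E> {
    private final List<E> backing; // a reference only; no element is copied

    ReadOnlyListView(List<E> backing) {
        this.backing = backing;
    }

    // Read operations are delegated to the backing list.
    @Override
    public E get(int index) {
        return backing.get(index);
    }

    @Override
    public int size() {
        return backing.size();
    }

    // add/set/remove are inherited from AbstractList, whose defaults throw
    // UnsupportedOperationException, so mutation through the view is rejected.

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(List.of("a", "b"));
        List<String> view = new ReadOnlyListView<>(source);

        source.add("c");            // change the backing list
        System.out.println(view);   // [a, b, c]

        try {
            view.add("d");          // mutation via the view
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
    }
}
```

The view holds one field regardless of how many elements the backing list contains, which is the "constant amount of memory" property mentioned above.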
Related
I have a method:
public List getDocuments(Long orgClientId, long userId) {
    // do stuff
    List list = new ArrayList();
    // do stuff to fill the list. The list will be of hashmaps.
    return list;
}
I don't want my list to be modified outside this method. How should I return it then? Cloning it? Or just returning a new instance like
return new ArrayList<>(list);
would do it?
You can make it superficially unmodifiable thus:
return Collections.unmodifiableList(list);
This wraps the list in order that invoking any of the mutation methods will result in an exception.
However, does it really matter to you if somebody modifies the list? You are giving back a new list instance each time this method is invoked, so there is no issue with two callers getting the same instance and having to deal with interactions between the mutations.
Whatever step you take to attempt to make it unmodifiable, I can just copy the list into a modifiable container myself.
It's also quite inconvenient to me as a caller if you give me back something which I can't detect whether it is mutable or not - it looks like a List; the only way to see if I can mutate it is by calling a mutation method. That's a runtime failure, which makes it a PITA.
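Both caller-side points can be seen in a short sketch. The getDocuments() below is a simplified stand-in for the method in the question; its String element type and sample values are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CallerSideDemo {
    // Simplified stand-in for the question's getDocuments(); element type is assumed.
    static List<String> getDocuments() {
        List<String> list = new ArrayList<>();
        list.add("doc-1");
        list.add("doc-2");
        return Collections.unmodifiableList(list);
    }

    public static void main(String[] args) {
        List<String> docs = getDocuments();

        // The static type is plain List, so this compiles and only fails when run.
        try {
            docs.add("doc-3");
        } catch (UnsupportedOperationException e) {
            System.out.println("cannot mutate the returned list");
        }

        // A caller who wants a mutable list can simply copy it.
        List<String> mine = new ArrayList<>(docs);
        mine.add("doc-3");
        System.out.println(mine.size()); // 3
        System.out.println(docs.size()); // 2
    }
}
```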
Update, to add an alternative.
An alternative to Collections.unmodifiableList is something like Guava's ImmutableList: this will likely be faster than Collections.unmodifiableList since it does not simply provide an unmodifiable view of an existing list by wrapping it, but rather provides a custom implementation of List which forbids mutation (there is less indirection).
The advantage of this is that you can return ImmutableList rather than just the List returned by Collections.unmodifiableList: this allows you to give the caller a bit more type information that they can't mutate it. ImmutableList (and its siblings like ImmutableSet) mark the mutation methods @Deprecated so your IDE can warn you that you might be doing something iffy.
The caller might choose to disregard that information (they assign the result to a List, rather than an ImmutableList), but that's their choice.
The idea is that the object returned by unmodifiableCollection can't be changed directly, but could change through other means (effectively by changing the internal collection directly).
List<String> list = new ArrayList<String>();
list.add("One");
list.add("Two");
list.add("Three");
List<String> unmodifiableList = Collections.unmodifiableList(list);
// this doesn't throw an exception since it's using the add
// method of the original List reference, which is no problem
list.add("Four");
System.out.println(unmodifiableList);
// this, however, throws an exception
unmodifiableList.add("Five");
unmodifiableList returns an unmodifiable view of the specified list.
This method allows modules to provide users with "read-only" access to
internal lists. Query operations on the returned list "read through"
to the specified list, and attempts to modify the returned list,
whether direct or via its iterator, result in an
UnsupportedOperationException. The returned list will be serializable
if the specified list is serializable.
Consider the following code:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class Test {
    public static void main(String[] args) {
        List<Integer> intList1 = new ArrayList<Integer>();
        List<Integer> intList2;
        intList1.add(1);
        intList1.add(2);
        intList1.add(3);
        intList2 = Collections.unmodifiableList(intList1);
        intList1.add(4);
        for (int i = 0; i < 4; i++) {
            System.out.println(intList2.get(i));
        }
    }
}
The result of the above code is
1
2
3
4
In the above code we create an unmodifiable List intList2 from the contents of the List intList1. But after the Collections.unmodifiableList call, when I make a change to intList1, that change is reflected in intList2. How is this possible?
You need to read the Javadoc for Collections.unmodifiableList
Returns an unmodifiable view of the specified list.
This means that the returned view is unmodifiable. If you have the original reference you can change the collection. If you change the collection then changes will be reflected in the view.
This has the advantage of being very fast, i.e. you don't need to copy the collection, but the disadvantage that you noted - the resulting collection is a view.
In order to create a truly unmodifiable collection you would need to copy then wrap:
intList2=Collections.unmodifiableList(new ArrayList<>(intList1));
This copies the contents of intList1 into another collection then wraps that collection in the unmodifiable variable. No reference to the wrapped collection exists.
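The difference between the plain view and the copy-then-wrap variant can be made visible side by side. This is a small sketch; the class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SnapshotVsViewDemo {
    // Returns {size seen by the view, size seen by the snapshot} after the source grows.
    static int[] sizesAfterAdd() {
        List<Integer> source = new ArrayList<>(List.of(1, 2, 3));

        // View: wraps the source itself, so later changes show through.
        List<Integer> view = Collections.unmodifiableList(source);

        // Snapshot: wraps a private copy that nobody else holds a reference to.
        List<Integer> snapshot = Collections.unmodifiableList(new ArrayList<>(source));

        source.add(4);
        return new int[] {view.size(), snapshot.size()};
    }

    public static void main(String[] args) {
        int[] sizes = sizesAfterAdd();
        System.out.println(sizes[0] + " " + sizes[1]); // 4 3
    }
}
```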
This is expensive: the entire underlying data store (an array in this case) needs to be duplicated, which (generally) takes O(n) time.
Google Guava provides immutable collections which solve some of the problems of making defensive copies:
If a collection is already immutable it is not copied again
Provide an interface which can be used to explicitly state that a collection is immutable
Provide numerous static factory methods to generate immutable collections
But speed is still the key concern when using immutable copies of collections rather than unmodifiable views.
It should be noted that the usual use for Collections.unmodifiableXXX is to return from a method, for example a getter:
public Collection<Thing> getThings() {
    return Collections.unmodifiableCollection(things);
}
In this case there are two things to note:
The user of getThings cannot access things so the unmodifiability cannot be broken.
It would be very expensive to copy things each time a getter were called.
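A minimal sketch of that getter pattern follows. ThingRegistry and addThing are hypothetical names, and String stands in for the Thing type, which the answer does not define.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

class ThingRegistry {
    // String stands in for the answer's Thing type (an assumption).
    private final List<String> things = new ArrayList<>();

    void addThing(String thing) {
        things.add(thing);
    }

    // Constant-time: wraps the private list instead of copying it.
    Collection<String> getThings() {
        return Collections.unmodifiableCollection(things);
    }

    public static void main(String[] args) {
        ThingRegistry registry = new ThingRegistry();
        registry.addThing("hammer");

        Collection<String> seen = registry.getThings();
        registry.addThing("saw");          // the owner can still mutate
        System.out.println(seen.size());   // 2: the view tracks the owner's changes

        try {
            seen.add("drill");             // callers only hold the view
        } catch (UnsupportedOperationException e) {
            System.out.println("callers cannot mutate");
        }
    }
}
```

Because the field is private, the only reference callers ever obtain is the unmodifiable wrapper, which is why the first point above holds.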
In summary the answer to your question is a little more complex than you might have expected and there are a number of aspects to consider when passing collections around in your application.
From the Javadoc of Collections.unmodifiableList:
Returns an unmodifiable view of the specified list. This method allows modules to provide users with "read-only" access to internal lists.
It prevents the returned list from being modified, but the original list itself can still be modified.
In your code,
intList2=Collections.unmodifiableList(intList1);
creates an unmodifiable list in intList2. You are still free to make changes to intList1, but you are not allowed to make any changes to intList2.
Try this:
intList2.add(4);
and you will get the following exception:
java.lang.UnsupportedOperationException
at java.util.Collections$UnmodifiableCollection.add(Unknown Source)
Collections.unmodifiableList returns a "read-only" view of the internal list. While the returned object is not modifiable, the original list that it references can still be modified. The view holds a reference to that same list object in memory, so it reflects any changes made to it.
Here is a good explanation of what is happening.
That happens because the unmodifiable list is internally backed by the first list; if you really want it to be unmodifiable, you shouldn't use the first list any more.
Suppose there is this code:
List<String> modifiableList = new ArrayList<String>(Arrays.asList("1","2","3"));
List<String> unmodifiableList = Collections.unmodifiableList(modifiableList);
System.out.println(unmodifiableList);
modifiableList.remove("1");
modifiableList.remove("3");
System.out.println(unmodifiableList);
it prints
[1, 2, 3]
[2]
If I change the second line to
List<String> unmodifiableList = Collections.unmodifiableList(
new ArrayList<String>(modifiableList));
it works as expected.
The question is: why doesn't the UnmodifiableList inner class of Collections (and all the other unmodifiable classes there) create a fresh copy of the original list, as the constructor of ArrayList does, in order to really make it unmodifiable?
Edit: I understand what the code does; my question is why it was implemented this way. Why doesn't the constructor of UnmodifiableList (an inner class of Collections) behave like the constructor of ArrayList and create a fresh copy of the underlying array? Why does a modifiable collection (ArrayList) copy the whole content while an unmodifiable collection doesn't?
Well the purpose of the methods is to create an unmodifiable view on an existing collection. That's the documented behaviour, and in many cases it's exactly what you want - it's much more efficient than copying all the data, for example... or you want to hand callers collections which will reflect any changes you want to make, but without allowing them to make changes.
If you want an immutable copy of the data (or at least the references...) then just create a copy and then create an immutable view over the top of it - just as you are.
So basically, you can easily create either a view or a copy-and-view, based on Collections.unmodifiable* themselves performing the view operation. So we have two orthogonal operations:
Create a copy (e.g. via the constructor)
Create a view (via Collections.unmodifiable*)
Those operations can be composed very easily. If Collections.unmodifiable* actually performed a "copy only" then we'd require other operations in order to just make a view. If you accept that both options are useful in different situations, making them composable gives lots of flexibility.
The reason is simple efficiency. Copying all of the elements of a collection could be very time-consuming, particularly if the collection being wrapped has some sort of magic going on like JPA lazy-loading, and requires extra memory. Wrapping the underlying collection as-is is a trivial operation that imposes no additional overhead. In the case where the developer really does want a separate copy (unmodifiable or not), it's very easy to create it explicitly. (I tend to use Guava Immutable* for this.)
Please note that unmodifiableList returns an "unmodifiable view" of the provided list. The list itself stays as it is (it can still be modified); only its "unmodifiable view" is unmodifiable. You can think of it in terms of SQL tables and views: you can run DML scripts on the tables, and the changes are reflected in the related views. As for ArrayList: it is backed by an array, so copying the elements from the provided source list (which doesn't have to be backed by an array) is a feature of its implementation. Does that answer your question?
I know that the Collection framework allows for the creation of "views", that is lightweight "wrappers" for a Collection object.
What I am especially interested in is, given a List, to return a view for only a subset of elements matching some conditions.
Basically, what I want to emulate is the functionality of the subList() method, only not based on start and end indexes, but on some parameters of the elements.
The first approach I thought about was simply to create another List, go through the first List and check each element...
While this wouldn't actually copy any MyObject, only their references, I would still create a new List object, with its overhead. Isn't that right?
Is there any lightweight method of doing what I need?
N.B. My original List is a really big collection...
Thank you all
You can do this easily in Java using the Guava collections (Collections2 has a filter method http://docs.guava-libraries.googlecode.com/git-history/v11.0.1/javadoc/index.html).
You can also do this in groovy using the findAll method, for example
myList.findAll { it.contains("aValue") }
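Groovy's findAll creates a new collection under the hood, so it is just doing the work of iterating over the elements and checking them for you. The overhead of creating the new list is small: one new list object plus a reference per matching element. Guava's Collections2.filter, by contrast, returns a live view of the original collection and copies nothing, at the price of re-applying the predicate every time the view is traversed.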
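For comparison, a lightweight filtered view in the sense the question asks for might be sketched with the JDK alone. FilteredView is a hypothetical, read-only class written for this example; it is not Guava's implementation, and the sample words are made up.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical read-only filtered view; illustrative, not Guava's implementation.
class FilteredView<E> extends AbstractCollection<E> {
    private final List<E> source;              // reference to the original list
    private final Predicate<? super E> keep;   // condition an element must match

    FilteredView(List<E> source, Predicate<? super E> keep) {
        this.source = source;
        this.keep = keep;
    }

    @Override
    public Iterator<E> iterator() {
        // Matching elements are produced lazily; nothing is stored in the view.
        return source.stream().filter(keep).iterator();
    }

    @Override
    public int size() {
        // O(n): the view re-scans the source on every call.
        return (int) source.stream().filter(keep).count();
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(List.of("apple", "bob", "avocado"));
        FilteredView<String> aWords = new FilteredView<String>(words, w -> w.startsWith("a"));

        System.out.println(aWords);          // [apple, avocado]
        words.add("apricot");                // change the original list
        System.out.println(aWords.size());   // 3
    }
}
```

The trade-off is the usual one for views: no per-element memory, but size() and contains() cost a full scan of the original list each time, which matters when that list is really big.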
I would anyways create a new List object, with its overhead
I don't understand what your concern is here. Looking at the source of the ArrayList class, even the subList(int fromIndex, int toIndex) method creates an instance of an inner class (which implements List). Your method would do essentially the same thing, i.e. create a new List instance, and then copy the references of the matching elements into it. Apart from that copying of references, the custom method will have more or less the same overhead as subList.
If I have a class that needs to return an array of strings of variable dimension (and that dimension could only be determined upon running some method of the class), how do I declare the dynamic array in my class' constructor?
If the question wasn't clear enough,
In PHP we could simply declare an array of strings as $my_string_array = array();
and add elements to it with $my_string_array[] = "New value";
What is the Java equivalent of the above code?
You will want to look into the java.util package, specifically the ArrayList class. It has methods such as .add(), .remove(), .indexOf(), .contains(), .toArray(), and more.
Plain Java arrays (i.e. String[] strings) cannot be resized dynamically; when you're out of room but still want to add elements to your array, you need to create a bigger one and copy the existing array into its first n positions.
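The create-a-bigger-array-and-copy step might look like the sketch below. GrowArrayDemo and append are names invented for the example, and growing by exactly one slot is a simplification chosen to keep it short.

```java
import java.util.Arrays;

class GrowArrayDemo {
    // Returns a new array with value appended; the original array is left untouched.
    static String[] append(String[] array, String value) {
        // Allocate a longer array and copy the existing elements into its first n slots.
        String[] bigger = Arrays.copyOf(array, array.length + 1);
        bigger[array.length] = value;
        return bigger;
    }

    public static void main(String[] args) {
        String[] strings = {"a", "b"};
        strings = append(strings, "c");
        System.out.println(Arrays.toString(strings)); // [a, b, c]
    }
}
```

Copying the whole array on every append is wasteful, so a real implementation would grow in larger steps so that most appends need no copy.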
Fortunately, there are java.util.List implementations that do this work for you. Both java.util.ArrayList and java.util.Vector are implemented using arrays.
But then, do you really care if the strings happen to be stored internally in an array, or do you just need a collection that will let you keep adding items without worrying about running out of room? If the latter, then you can pick any of the several general purpose List implementations out there. Most of the time the choices are:
ArrayList - basic array based implementation, not synchronized
Vector - synchronized, array based implementation
LinkedList - Doubly linked list implementation, faster for inserting items in the middle of a list
Do you expect your list to have duplicate items? If duplicate items should never exist for your use case, then you should prefer a java.util.Set. Sets are guaranteed to not contain duplicate items. A good general-purpose set implementation is java.util.HashSet.
Answer to follow-up question
To access strings using an index similar to $my_string_array["property"], you need to put them in a Map<String, String>, also in the java.util package. A good general-purpose map implementation is HashMap.
Once you've created your map,
Use map.put("key", "string") to add strings
Use map.get("key") to access a string by its key.
Note that java.util.Map cannot contain duplicate keys. If you call put consecutively with the same key, only the value set in the latest call will remain, the earlier ones will be lost. But I'd guess this is also the behavior for PHP associative arrays, so it shouldn't be a surprise.
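A short sketch of put and get, including the overwrite behaviour for a repeated key; the class name MapDemo and the keys and values are illustrative.

```java
import java.util.HashMap;
import java.util.Map;

class MapDemo {
    static Map<String, String> build() {
        Map<String, String> map = new HashMap<>();
        map.put("property", "first");
        map.put("property", "second"); // same key: the earlier value is replaced
        map.put("other", "value");
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> map = build();
        System.out.println(map.get("property")); // second
        System.out.println(map.size());          // 2
    }
}
```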
Create a List instead.
List<String> l = new LinkedList<String>();
l.add("foo");
l.add("bar");
There is no dynamic array in Java; the length of an array is fixed.
A similar structure is ArrayList, which is backed by a real array internally.
Hence the name ArrayList :)