Purpose of the iterator-pattern. Have I understood it right? - java

I have recently tried to understand the so-called Iterator pattern.
I think I have understood its purpose, but I'm still not sure. So please correct me on this:
The purpose of the iterator pattern is to abstract away the underlying structure in which the data
is kept. The data structure can be an array, a tree, a list ...
Its important methods are next() (returns an object), hasNext() (returns a boolean) and remove().
The methods are implemented in a way that is appropriate for the data structure used. So the developer
who uses an iterator-implementing class doesn't have to care and just uses the provided methods, which are
the same for every iterator-implementing class.
Have I got it right?

What you have summarized is correct. The Iterator design pattern hides the underlying complexity of a collection/aggregate by providing an iterator interface between the collection and the code that retrieves the data.
Next, we can take this one level higher by defining an abstraction for the iterator. This means we can write iterators that traverse the same collection in multiple ways. For example, if we have a binary tree collection, we can write three iterators for inorder, postorder and preorder traversals.
Lastly, the Iterator pattern allows an abstraction for the collection being iterated as well. This implies that one can implement a family of iterators for a family of collections.
One more important thing to note is that an iterator knows enough about the inner structure of the collection to be able to iterate over it, and that it is the responsibility of the collection instance to create the correct iterator (out of the possible family of iterators) for itself and return it to the client.
If you are interested in reading more about the iterator pattern, I have explained the above points in depth in a writeup on my blog: http://www.javabrahman.com/design-patterns/iterator-design-pattern-in-java/
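To make the points above concrete, here is a minimal sketch of a collection that creates its own iterator. The class IntTree and its helper toList are hypothetical names invented for this illustration; only the java.util types are real. The tree hands out an in-order iterator, and the client code never sees the nodes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical minimal binary search tree, written only for this example.
class IntTree implements Iterable<Integer> {
    private static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    private Node root;

    void insert(int value) {
        root = insert(root, value);
    }

    private static Node insert(Node node, int value) {
        if (node == null) return new Node(value);
        if (value < node.value) node.left = insert(node.left, value);
        else if (value > node.value) node.right = insert(node.right, value);
        return node; // duplicates are ignored
    }

    // The collection decides which iterator fits its own structure.
    @Override
    public Iterator<Integer> iterator() {
        return new InOrderIterator(root);
    }

    // In-order traversal with an explicit stack; remove() is left unsupported.
    private static final class InOrderIterator implements Iterator<Integer> {
        private final Deque<Node> stack = new ArrayDeque<>();

        InOrderIterator(Node root) { pushLeftSpine(root); }

        private void pushLeftSpine(Node node) {
            for (; node != null; node = node.left) stack.push(node);
        }

        @Override
        public boolean hasNext() { return !stack.isEmpty(); }

        @Override
        public Integer next() {
            if (stack.isEmpty()) throw new NoSuchElementException();
            Node node = stack.pop();
            pushLeftSpine(node.right);
            return node.value;
        }
    }

    // Client-side helper: depends only on Iterable, not on the tree.
    static List<Integer> toList(Iterable<Integer> source) {
        List<Integer> out = new ArrayList<>();
        for (int v : source) out.add(v);
        return out;
    }
}
```

A preorder or postorder iterator could be added as another nested class without changing any client code that only uses hasNext() and next().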

Yes, your understanding covers most of it.
For the same in a more technical sense, refer to the Iterator design pattern at https://sourcemaking.com/design_patterns/iterator

Related

Why doesn't LinkedHashSet have addFirst method?

As the documentation of LinkedHashSet states, it is
Hash table and linked list implementation of the Set interface, with
predictable iteration order. This implementation differs from HashSet
in that it maintains a doubly-linked list running through all of its
entries.
So it's essentially a HashSet with a FIFO queue of keys implemented by a linked list. Considering that LinkedList is a Deque and permits, in particular, insertion at the beginning, I wonder why LinkedHashSet doesn't have an addFirst(E e) method in addition to the methods present in the Set interface. It does not seem hard to implement.
As Eliott Frisch said, the answer is in the next sentence of the paragraph you quoted:
… This linked list defines the iteration ordering, which is the order
in which elements were inserted into the set (insertion-order). …
An addFirst method would break the insertion order and thereby the design idea of LinkedHashSet.
If I may add a bit of guesswork too, other possible reasons might include:
It’s not as simple to implement as it appears, since a LinkedHashSet is really implemented as a LinkedHashMap where the values mapped to are not used. At least you would have to change that class too (which in turn would also break its insertion order and thereby its design idea).
As that other guy may have intended in a comment, they didn’t find it useful.
That said, you are asking the question the wrong way around. They designed a class with a functionality for which they saw a need. They moved on to implement it using a hash table and a linked list. You are starting out from the implementation and using it as a basis for a design discussion. While that may occasionally add something useful, generally it’s not the way to good designs.
While I can in theory follow your point that there might be a situation where you want a double-ended queue with set property (duplicates are ignored/eliminated), I have a hard time imagining when a Deque would not fulfil your needs in this case (Eliott Frisch mentioned the under-used ArrayDeque). You need pretty large amounts of data and/or pretty strict performance requirements before the linear complexity of contains and remove would be prohibitive. And in that case you may already be better off custom designing your own data structure.
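As a rough sketch of the Deque-based alternative mentioned above: the helper class UniqueDeque and its method names are made up for this illustration, while Deque and ArrayDeque are the real JDK types. The duplicate check uses contains, which is a linear scan, so this is only reasonable for modest sizes, as the answer notes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper; only Deque and ArrayDeque come from the JDK.
class UniqueDeque {
    // Adds at the front unless the element is already present (linear-time check).
    static <E> boolean addFirstIfAbsent(Deque<E> deque, E element) {
        if (deque.contains(element)) return false;
        deque.addFirst(element);
        return true;
    }

    // Adds at the back unless the element is already present (linear-time check).
    static <E> boolean addLastIfAbsent(Deque<E> deque, E element) {
        if (deque.contains(element)) return false;
        deque.addLast(element);
        return true;
    }

    public static void main(String[] args) {
        Deque<String> deque = new ArrayDeque<>();
        addLastIfAbsent(deque, "b");
        addFirstIfAbsent(deque, "a");
        addFirstIfAbsent(deque, "b"); // rejected, "b" is already in the deque
        System.out.println(deque);
    }
}
```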

What is the benefit of using Iterator instead of ListIterator for ArrayList?

From what I understand, ListIterator provides much more functionality than Iterator for lists. If so, why use Iterator at all? Is there an optimization or performance factor to it as well?
Generally the less your code knows about other parts of code and data structures the better. The reason is simple: this simplifies the modification.
Indeed, Collection is more powerful than Iterator, List is more powerful than Collection.
But an Iterator can be obtained from a lot of data structures, so if you are using an Iterator that you get from (for example) a list, you can then change your code and get the iterator from (for example) a database without changing the code that works with the iterator.
Use the ListIterator to express the requirement to iterate in a specific order. Use the Iterator to express that order does not matter.
Generally I think that the use of iterators makes code difficult to read. I don't know your use case but in most cases it makes sense to use the enhanced for loop or a functional approach like Java 8 streams instead.
If you need functionality of ListIterator then you can use it. At the same time, you restrict yourself to operating on Lists at that specific place.
Now, if you implement some really List-based operation, you actually want to operate on Lists only, so maybe that's no restriction for you.
If, however, you don't really need the special functionality of ListIterator and an Iterator is sufficient, then prefer the plain Iterator; this enables you to operate not only on Lists but on any data structure which implements Iterable, including all kinds of Collections. This makes your code more generic and flexible to use.
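The trade-off can be sketched as follows. The class IteratorChoice and its two methods are hypothetical names for this example; the first method needs only forward traversal and therefore accepts any Iterable, while the second needs backward traversal and is therefore tied to List.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;

// Hypothetical example class; method names are illustrative only.
class IteratorChoice {
    // Forward traversal is enough, so any Iterable (List, Set, ...) works.
    static int totalLength(Iterable<String> words) {
        int total = 0;
        Iterator<String> it = words.iterator();
        while (it.hasNext()) {
            total += it.next().length();
        }
        return total;
    }

    // Backward traversal needs ListIterator, so the parameter must be a List.
    static List<String> reversedCopy(List<String> words) {
        List<String> out = new ArrayList<>();
        ListIterator<String> it = words.listIterator(words.size());
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }
}
```

Passing a LinkedHashSet to totalLength compiles and works; passing it to reversedCopy does not compile, which is exactly the restriction described above.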

Is it bad practice to return an iterable in a method?

I have often read in many places that one should avoid returning an iterable and return a collection instead. For example -
public Iterable<Maze> Portals() {
// a list of some maze configurations
List<Maze> mazes = createMazes();
...
return Collections.unmodifiableList(mazes);
}
Returning an Iterable seems useful only for a foreach loop, while a Collection already provides an iterator and offers much more control. Could you please tell me when it is beneficial to specifically return an Iterable from a method? Or should we always return a collection instead?
Note : This question is not about Guava library
Returning an Iterable would be beneficial when we need to lazily load a collection that contains a lot of elements.
The following quote from Google Collections FAQ seems to support the idea of lazy loading:
Why so much emphasis on Iterators and Iterables?
In general, our methods do not require a Collection to be passed in
when an Iterable or Iterator would suffice. This distinction is
important to us, as sometimes at Google we work with very large
quantities of data, which may be too large to fit in memory, but which
can be traversed from beginning to end in the course of some
computation. Such data structures can be implemented as collections,
but most of their methods would have to either throw an exception,
return a wrong answer, or perform abysmally. For these situations,
Collection is a very poor fit; a square peg in a round hole.
An Iterator represents a one-way scrollable "stream" of elements, and
an Iterable is anything which can spawn independent iterators. A
Collection is much, much more than this, so we only require it when we
need to.
I can see advantages and disadvantages:
One advantage is that Iterable is a simpler interface than Collection. If you have a non-standard collection type, it may be easier to make it Iterable than Collection. Indeed, there are some kinds of collection for which some of the Collection methods are problematic to implement. For example, lazy collections types and collections where you don't want to rely on the standard equals(Object) method to determine membership.
One disadvantage is that Iterable is functionality poor. If you have a concrete type that implements Collection, and you return it as an Iterable, you are removing the possibility that the code can (directly) call a variety of useful collection methods.
There are some cases where neither Iterable nor Collection is a good fit; e.g. specialist collections of primitive types ... where you need to avoid the overheads of using the primitive wrapper types.
You can't really say whether it is good or bad practice to return an Iterable. It depends on the circumstances; e.g. the purpose of the API you are designing, and the requirements or constraints that you want / need to place on it.
The problem is that if the underlying collection changes, you will be in trouble.
If you are using a collection whose iterator throws ConcurrentModificationException, then you have to take care of that as well, but with a collection there are no such issues.
Return the most specific type that makes sense for the use in question. If you have a method that's creating a new collection, for example, or you can easily wrap the collection in an unmodifiable wrapper, returning the collection as a Collection, or even a List or Set, makes the client developer's life a little easier.
Returning Iterable makes sense for code where the values may be generated on-the-fly; you could imagine a Fibonacci generator, for example, that created an Iterator that calculated the next number instead of trying to store some lookup table. If you're writing framework or interface code where such a "streaming" sort of API might be useful (Guava and its functional classes do a good bit of this), then specifying Iterable instead of a collection type might be worth the loss of flexibility on the consumer side.
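The Fibonacci idea could look roughly like this. The class name Fibonacci is an assumption for this sketch; values are computed on demand and nothing is stored beyond the two most recent numbers. Because the sequence is unbounded, hasNext() always returns true and the caller must stop the loop.

```java
import java.math.BigInteger;
import java.util.Iterator;

// Hypothetical lazy sequence; each call to iterator() starts again from 0.
class Fibonacci implements Iterable<BigInteger> {
    @Override
    public Iterator<BigInteger> iterator() {
        return new Iterator<BigInteger>() {
            private BigInteger current = BigInteger.ZERO;
            private BigInteger following = BigInteger.ONE;

            @Override
            public boolean hasNext() {
                return true; // unbounded: the consumer decides when to stop
            }

            @Override
            public BigInteger next() {
                BigInteger result = current;
                current = following;
                following = result.add(following);
                return result;
            }
        };
    }

    public static void main(String[] args) {
        int count = 0;
        for (BigInteger f : new Fibonacci()) {
            if (count++ == 10) break; // stop after ten values
            System.out.println(f);
        }
    }
}
```

Such a type could not sensibly implement Collection, since size() and contains() have no reasonable answer for an infinite sequence.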

Why is exposing an iterators underlying representation bad?

Iterator Pattern Definition: Provides a way to access the elements of an aggregate object sequentially without exposing its underlying representation. Wiki
What are the consequences of exposing the underlying representation?
To provide a more detailed answer: How is the iterator pattern preventing this?
As per: http://www.oodesign.com/iterator-pattern.html
The idea of the iterator pattern is to take the responsibility of accessing and passing trough the objects of the collection and put it in the iterator object. The iterator object will maintain the state of the iteration, keeping track of the current item and having a way of identifying what elements are next to be iterated.
Few benefits that you can get from this pattern:
Using the Iterator pattern, the code designer can decide whether to allow one-way iteration (using next() only) or allow reverse iteration as well (using previous() as in ListIterator).
Whether to allow object removal or not, if yes then how.
Maintain internal housekeeping when object is removed.
It allows you to expose common mechanism of traversing a collection rather than expecting your clients to understand underlying collections.
If the underlying representation were exposed, client code could couple to it. Then:
If the representation changes, it may be necessary to change all the code coupling to it.
If you want to iterate over a different type of container, it may be necessary to change the code coupling to the old container.
Data abstraction makes code more resilient to a change in the representation.
In short: all the code relying on the underlying representation will have to be changed if you decide to change the representation.
E.g., you decided to use TreeMap at first, but then you don't need ordering anymore (in most cases), so you change to HashMap. Somebody is looping through your map expecting an increasing list, and their code breaks!
Using iterator pattern, you could always give the user the ability to loop through something with a certain logic (or just random, which is a kind of logic) without knowing what it is under the hood.
Now, if you use HashMap instead of TreeMap, you could expose a sorted view to the user. If you provide this SortedIterator and tell user "using this will guarantee the result to be sorted, but I can't tell you anything about what's underneath", you can change the representation to be whatever you like, as long as the contract of this SortedIterator is maintained by you.
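A minimal sketch of that idea follows. ScoreBoard and sortedNames are hypothetical names invented here; the point is only that the HashMap is private and the sorted-iteration contract is the one thing callers can rely on.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical class; the storage could be swapped for a TreeMap
// without any caller noticing.
class ScoreBoard {
    private final Map<String, Integer> scores = new HashMap<>();

    void put(String name, int score) {
        scores.put(name, score);
    }

    // Contract: names come back in ascending order, whatever the storage is.
    // The iterator works on a sorted snapshot and does not support remove().
    Iterator<String> sortedNames() {
        List<String> names = new ArrayList<>(scores.keySet());
        Collections.sort(names);
        return Collections.unmodifiableList(names).iterator();
    }
}
```

Sorting a snapshot on every call is a simplification chosen for brevity; a real implementation might keep a sorted structure instead, and callers would not be able to tell the difference.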

Is there a way to create a List/Set which keeps insertion order and does not allow duplicates in Java?

What is the most efficient way of maintaining a list that does not allow duplicates, but maintains insertion order and also allows the retrieval of the last inserted element in Java?
Try LinkedHashSet, which keeps the order of input.
Note that, according to the LinkedHashSet documentation, re-inserting an element that is already present does not affect the insertion order, so you do not need to check whether the element is already contained in the set before adding it.
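A small sketch of how LinkedHashSet behaves here; per the JDK documentation, insertion order is not affected if an element is re-inserted. The class InsertionOrderDemo and its methods are made-up names for this example, and the last helper simply walks the whole set, which takes linear time.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical demo class; only LinkedHashSet and Set come from the JDK.
class InsertionOrderDemo {
    // Returns the last element in iteration order, or null if there is none.
    static <E> E last(Iterable<E> elements) {
        E result = null;
        for (E e : elements) {
            result = e;
        }
        return result;
    }

    static Set<String> sample() {
        Set<String> set = new LinkedHashSet<>();
        set.add("b");
        set.add("a");
        set.add("c");
        set.add("b"); // duplicate: ignored, "b" keeps its original position
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = sample();
        System.out.println(set);       // iteration follows insertion order
        System.out.println(last(set)); // most recently inserted new element
    }
}
```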
Edit:
You could also try the Apache commons collections class ListOrderedSet which according to the JavaDoc (if I didn't misread anything again :) ) would decorate a set in order to keep insertion order and provides a get(index) method.
Thus, it seems you can get what you want by using new ListOrderedSet(new HashSet());
Unfortunately this class doesn't provide a generic parameter, but it might get you started.
Edit 2:
Here's a project that seems to represent commons collections with generics, i.e. it has a ListOrderedSet<E> and thus you could for example call new ListOrderedSet<String>(new HashSet<String>());
I don't think there's anything in the JDK which does this.
However, LinkedHashMap, which is used as the basis for LinkedHashSet, comes close: it maintains a circular doubly-linked list of the entries in the map. It only tracks the head of the list not the tail, but because the list is circular, header.before is the tail (the most recently inserted element).
You could therefore implement what you need on top of this. LinkedHashMap has not been designed for extension, so this is somewhat awkward. You could copy the code into your own class and add a suitable last() method (be aware of licensing issues here), or you could extend the existing class, and add a method which uses reflection to get at the private header and before fields.
That would get you a Map, rather than a Set. However, HashSet is already a wrapper which makes a Map look like a Set. Again, it is not designed for general extension, but you could write a subclass whose constructor calls the super constructor, then uses more reflection to replace the superclass's value of map with an instance of your new map. From there on, the class should do exactly what you want.
As an aside, the library classes here were all written by Josh Bloch and Neal Gafter. Those guys are two of the giants of Java. And yet the code in there is largely horrible. Never meet your heroes.
Just use a TreeSet.
