Java: why can't iterate over an iterator?

I read Why is Java's Iterator not an Iterable? and Why aren't Enumerations Iterable?, but I still don't understand why this:
void foo(Iterator<X> it) {
for (X x : it) {
bar(x);
baz(x);
}
}
was not made possible. In other words, unless I'm missing something, the above could have been nice and valid syntactic sugar for:
void foo(Iterator<X> it) {
for (X x; it.hasNext();) {
x = it.next();
bar(x);
baz(x);
}
}

Most likely the reason is that iterators are not reusable: you need to get a fresh Iterator from the Iterable collection each time you want to iterate over the elements. However, as a quick fix:
private static <T> Iterable<T> iterable(final Iterator<T> it){
return new Iterable<T>(){ public Iterator<T> iterator(){ return it; } };
}
//....
{
// ...
// Now we can use:
for ( X x : iterable(it) ){
// do something with x
}
// ...
}
//....
That said, the best thing to do is simply to pass around the Iterable<T> interface instead of Iterator<T>.

but I still don't understand why this [...] was not made possible.
I can see several reasons:
1. Iterators are not reusable, so a for/each would consume the iterator - not incorrect behavior, perhaps, but unintuitive to those who don't know how the for/each is desugared.
2. Iterators don't appear "naked" in code all that often so it would be complicating the JLS with little gain (the for/each construct is bad enough as it is, working on both Iterables and arrays).
3. There's an easy workaround. It may seem a little wasteful to allocate a new object just for this, but allocation is cheap as it is and escape analysis would rid you even of that small cost in most cases. (Why they didn't include this workaround in an Iterables utility class, analogous to Collections and Arrays, is beyond me, though.)
4. (Probably not true - see the comments.) I seem to recall that the JLS can only reference things in java.lang[citation needed], so they'd have to create an Iterator interface in java.lang which java.util.Iterator extends without adding anything to. Now we have two functionally equivalent iterator interfaces. 50% of the new code using naked iterators will choose the java.lang version, the rest use the one in java.util. Chaos ensues, compatibility problems abound, etc.
I think points 1-3 are very much in line with how the Java language design philosophy seems to go: Don't surprise newcomers, don't complicate the spec if it doesn't have a clear gain that overshadows the costs, and don't do with a language feature what can be done with a library.
The same arguments would explain why java.util.Enumeration isn't Iterable, too.
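To make the first point concrete, here is a minimal sketch of why an Iterable that merely wraps an existing iterator is single-use. The names SingleUseDemo, iterable and countElements are made up for this illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SingleUseDemo {
    // A wrapper that hands back the SAME iterator on every call to iterator().
    static <T> Iterable<T> iterable(final Iterator<T> it) {
        return () -> it;
    }

    // Counts how many elements a for-each loop sees.
    static <T> int countElements(Iterable<T> source) {
        int n = 0;
        for (T ignored : source) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("a", "b", "c");
        Iterable<String> wrapped = iterable(list.iterator());

        System.out.println(countElements(wrapped)); // 3: the first pass sees everything
        System.out.println(countElements(wrapped)); // 0: the iterator is already exhausted
        System.out.println(countElements(list));    // 3: a real Iterable can be traversed again
    }
}
```

The second loop silently does nothing, which is the kind of surprise a for/each over a bare Iterator would also produce.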

The for(Type t : iterable) syntax is only valid for classes that implement Iterable<Type>.
An Iterator does not implement Iterable.
You can iterate over things like Collection<T>, List<T>, or Set<T> because they implement Iterable.
The following code is equivalent:
for (Type t: list) {
// do something with t
}
and
Iterator<Type> iter = list.iterator();
while (iter.hasNext()) {
Type t = iter.next();
// do something with t
}
The reason this was not made possible is that the for-each syntax was added to the language to abstract away the Iterator. Making the for-each loop work with iterators would not accomplish what the for-each loop was created for.

Actually, you can.
There is a very short workaround available in Java 8:
for (X item : (Iterable<X>) () -> iterator)
See How to iterate with foreach loop over java 8 stream for the detailed explanation of the trick.
Some explanation of why this is not natively supported can be found in the related question:
Why does Stream<T> not implement Iterable<T>?
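As a rough, self-contained sketch of the cast trick, the following applies it once to a plain Iterator and once to a Stream through a method reference. The class and method names (LambdaCastDemo, drainIterator, drainStream) are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

class LambdaCastDemo {
    // Drains an iterator with for-each by casting a lambda to Iterable.
    static <X> List<X> drainIterator(Iterator<X> iterator) {
        List<X> seen = new ArrayList<>();
        for (X item : (Iterable<X>) () -> iterator) {
            seen.add(item);
        }
        return seen;
    }

    // The same trick applied to a Stream, using a method reference.
    static <X> List<X> drainStream(Stream<X> stream) {
        List<X> seen = new ArrayList<>();
        for (X item : (Iterable<X>) stream::iterator) {
            seen.add(item);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(drainIterator(Arrays.asList(1, 2, 3).iterator())); // [1, 2, 3]
        System.out.println(drainStream(Stream.of("x", "y")));                 // [x, y]
    }
}
```

Both loops consume their source, so the resulting Iterable should be treated as good for one pass only.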

Iterators are not meant to be reused (i.e., used in more than one iteration loop). In particular, Iterator.hasNext() guarantees that you can safely call Iterator.next() and indeed get the next value from the underlying collection.
When the same iterator is used in two concurrently running iterations (let's assume a multi-threading scenario), this promise can no longer be kept:
while (iter.hasNext()) {
// Now a context switch happens, another thread is performing
// iter.hasNext(); x = iter.next();
String s = iter.next();
// A runtime exception is thrown because the iterator was
// exhausted by the other thread
}
Such scenarios completely break the protocol offered by Iterator. Actually, they can occur even in a single threaded program: an iteration loop calls another method which uses the same iterator to perform its own iteration. When this method returns, the caller is issuing an Iterator.next() call which, again, fails.
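The single-threaded case can be sketched roughly as follows. The helper countRemaining is a hypothetical stand-in for "another method which uses the same iterator".

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

class SharedIteratorDemo {
    // A helper that runs its own loop over the iterator it is given.
    static int countRemaining(Iterator<String> it) {
        int n = 0;
        while (it.hasNext()) {
            it.next();
            n++;
        }
        return n;
    }

    // Returns true if the caller's next() failed after the helper drained the iterator.
    static boolean outerLoopBreaks() {
        Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
        try {
            while (it.hasNext()) {
                countRemaining(it);   // consumes everything that was left
                String s = it.next(); // hasNext() was true a moment ago, but no longer is
                System.out.println(s);
            }
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(outerLoopBreaks()); // true
    }
}
```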

Because the for-each is designed to read as something like:
for each element of [some collection of elements]
An Iterator is not [some collection of elements]. An array or an Iterable is.

Related

Can I use many listIterators sequentially to mutate or remove list elements from an ArrayList in Java?

I am relying on list iterators to move through a list of characters. This is a single-threaded program and I use listIterator objects sequentially in 4 different methods. Each method has the same setup:
private void myMethod(ArrayList<Integer> input) {
ListIterator<Integer> i = input.listIterator();
while (i.hasNext()) {
Integer in = i.next();
if (in < 10)
i.remove();
else
i.set(in*in); // because its lucky
}
}
With this pattern, on the second iterator the following Exception is thrown:
java.util.ConcurrentModificationException
However, looking at the javadocs I don't see this Exception in the Exceptions thrown nor do I see a method to close the iterator after I am done. Am I using the listIterator incorrectly? I have to iterate over the same ArrayList multiple times, each time conditionally removing or mutating each element. Maybe there is a better way to iterate over the ArrayList and this use-case is not best solved by a ListIterator.
java docs for ListIterator

This is explained in the ArrayList Javadoc. You are modifying the list with remove() and set() while using an Iterator:
The iterators returned by this class's iterator and listIterator methods are fail-fast: if the list is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove or add methods, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
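A small sketch of the fail-fast behavior quoted above, contrasting removal through the list with removal through the iterator. The class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Removes through the list while a for-each iterator is active.
    static boolean removeThroughList() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 20, 3, 40));
        try {
            for (Integer value : list) {
                if (value < 10) {
                    list.remove(value); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // the iterator noticed on its next call
        }
    }

    // Removes through the iterator itself: this is the permitted way.
    static List<Integer> removeThroughIterator() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 20, 3, 40));
        for (Iterator<Integer> it = list.iterator(); it.hasNext();) {
            if (it.next() < 10) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeThroughList());     // true
        System.out.println(removeThroughIterator()); // [20, 40]
    }
}
```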

It’s hard to give a diagnosis for a problem when the shown code clearly isn’t the code that produced the exception, as it doesn’t even compile. The remove method of Iterator doesn’t take arguments and the set method is defined on ListIterator, but your code declares the variable i only as Iterator.
A fixed version
private void myMethod(ArrayList<Integer> input) {
ListIterator<Integer> i = input.listIterator();
while (i.hasNext()) {
Integer in = i.next();
if (in < 10)
i.remove();
else
i.set(in*in);
}
}
would run without problems. The answer to your general question is that each modification invalidates all existing iterators, except the one used to make the modification, provided the modification went through an iterator and not through the collection interface directly.
But in your code, there is only one iterator, which is only created and used for this one operation. As long as there is no overlapping use of iterators to the same collection, there is no problem with the invalidation. Iterators existing from previous operations are abandoned anyway and the iterators used in subsequent operations do not exist yet.
Still, it’s easier to use
private void myMethod(ArrayList<Integer> input) {
input.removeIf(in -> in < 10);
input.replaceAll(in -> in*in);
}
instead. Unlike the original code, this does two iterations, but as explained in this answer, removeIf will actually be faster than iterator-based removal in those cases where performance really matters.
But still, the problem persists. The shown code can’t cause a ConcurrentModificationException, so your actual problem is somewhere else and may still be present, regardless of how this one method has been implemented.
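The invalidation rule described in this answer can be illustrated with a short sketch: two iterators that are alive at the same time interfere with each other, while iterators used strictly one after another do not. All names here are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.ListIterator;

class OverlappingIteratorsDemo {
    // Two iterators alive at once: a removal through one invalidates the other.
    static boolean overlappingUseFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(5, 12, 7, 30));
        ListIterator<Integer> first = list.listIterator();
        ListIterator<Integer> second = list.listIterator();
        first.next();
        first.remove(); // fine for 'first', but 'second' is now stale
        try {
            second.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // One iterator per pass, each created after the previous pass finished.
    static List<Integer> sequentialUseWorks() {
        List<Integer> list = new ArrayList<>(Arrays.asList(5, 12, 7, 30));
        for (int pass = 0; pass < 2; pass++) {
            ListIterator<Integer> it = list.listIterator();
            while (it.hasNext()) {
                int value = it.next();
                if (value < 10) {
                    it.remove();
                } else {
                    it.set(value + 1);
                }
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(overlappingUseFails()); // true
        System.out.println(sequentialUseWorks());  // [14, 32]
    }
}
```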

I am not knowledgeable enough about Java ListIterators to answer the question, but it appears I have run into the XY problem here. The problem seems to be better solved with Java Streams: filter out the unwanted elements and map the remaining ones into a new ArrayList by applying a function to each element of the original ArrayList.
private ArrayList<Integer> myMethod(ArrayList<Integer> input) {
// Keep only the elements that are at least 10, then square them.
return input.stream()
.filter(in -> in >= 10)
.map(in -> in * in)
.collect(Collectors.toCollection(ArrayList::new));
}

Unconventional use of Iterator to iterate over a collection

I am aware of the conventional iterator creation-usage for a List<String> list as below:
//Conventional-style
Iterator<String> iterator = list.iterator();
while(iterator.hasNext()){
String string = iterator.next();
//...further code goes here
}
However, in the accepted answer of Iterating through a Collection, avoiding ConcurrentModificationException when removing in loop, I came across this unusual for loop usage with Iterator:
//Unconventional for loop style
for (Iterator<String> iterator = list.iterator(); iterator.hasNext();) {
String string = iterator.next();
//...further code goes here
}
Now, I'd like to know:
Does this unconventional style create the iterator on the collection for each iteration over and over again? Or is it somehow a special kind of intelligent for-loop, which creates the iterator once and reuses it?
If it creates an iterator each time, shouldn't it be a performance concern?
Can we replace the while loop line in the conventional style with for(;iterator.hasNext();), if I were to use a for loop only?
PS: I am well aware of the enhanced for loop use on a collection. I am looking at this with the intention of 'safe' removal of elements, without causing a ConcurrentModificationException.

The idiom you call "unconventional" is actually the recommended one because it restricts the scope of the iterator variable to the loop where it is used.
The iterator is created once, before the loop begins. This follows from the general semantics of the for loop, which I warmly advise you get acquainted with.
You can, but it is not recommended. Such an idiom would be a pointless obfuscation of the while idiom.
Finally, note that for 99% of use cases all of the above is moot because you really should be using either the enhanced for loop or Java 8 forEach.

Java is derived from C, and thus for (A; B; C) { P; } has the same semantics as A; while (B) { P; C; }. The only difference is the scope of the variables. In particular, the A part is only executed once. So your two code examples do exactly the same, but in the for-variant the scope of the variable is restricted.
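One way to convince yourself that the A part runs only once is to count how often the iterator is requested. This is a small sketch; countingIterator is a hypothetical helper written for this purpose.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ForInitDemo {
    static int iteratorsCreated = 0;

    // Hypothetical helper that counts how often an iterator is requested.
    static Iterator<String> countingIterator(List<String> list) {
        iteratorsCreated++;
        return list.iterator();
    }

    // Runs a three-element loop and reports how many iterators were created.
    static int runLoop() {
        iteratorsCreated = 0;
        List<String> list = Arrays.asList("a", "b", "c");
        for (Iterator<String> it = countingIterator(list); it.hasNext();) {
            it.next(); // the loop body runs three times
        }
        return iteratorsCreated;
    }

    public static void main(String[] args) {
        System.out.println(runLoop()); // 1
    }
}
```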
The more modern way of iterating through a collection is the enhanced for loop:
for (String string : list) {
...
}
However, if you want to delete or change items while iterating through it, you still need the iterator version. For example:
for (Iterator<String> it = list.iterator(); it.hasNext();) {
String string = it.next();
if (someFunction(string)) {
it.remove();
}
}
has no enhanced for-loop equivalent.

1.
No, it does not create an iterator over and over again. This was a perfectly fine style before Java included the Iterable<T> interface.
If you want to remove an item while iterating over the collection, you have to use the iterator.remove() method, if it is provided; otherwise a ConcurrentModificationException will be thrown.
If you do not want to remove an Item while iterating over the collection then you should just use the for each concept, which is provided by every collection that implements the Iterable<T> interface. (link in the end for more information)
for (String s : yourList) {
... // do something with the string
}
2.
Yes!! Use the for loop idiom. But as I said, if you do not want to use the iterator.remove() operation, but just want to iterate over the collection, you should use the provided for each concept.
You can find a lot of information on the downsides of the iterator.next() approach here and why the newly integrated for:each concept is better:
https://docs.oracle.com/javase/1.5.0/docs/guide/language/foreach.html

Why does Iterable<T> not provide stream() and parallelStream() methods?

I am wondering why the Iterable interface does not provide the stream() and parallelStream() methods. Consider the following class:
public class Hand implements Iterable<Card> {
private final List<Card> list = new ArrayList<>();
private final int capacity;
//...
@Override
public Iterator<Card> iterator() {
return list.iterator();
}
}
It is an implementation of a Hand as you can have cards in your hand while playing a Trading Card Game.
Essentially it wraps a List<Card>, ensures a maximum capacity and offers some other useful features. This is better than implementing it directly as a List<Card>.
Now, for convenience I thought it would be nice to implement Iterable<Card>, such that you can use enhanced for-loops if you want to loop over it. (My Hand class also provides a get(int index) method, hence the Iterable<Card> is justified in my opinion.)
The Iterable interface provides the following (left out javadoc):
public interface Iterable<T> {
Iterator<T> iterator();
default void forEach(Consumer<? super T> action) {
Objects.requireNonNull(action);
for (T t : this) {
action.accept(t);
}
}
default Spliterator<T> spliterator() {
return Spliterators.spliteratorUnknownSize(iterator(), 0);
}
}
Now you can obtain a stream with:
Stream<Card> stream = StreamSupport.stream(hand.spliterator(), false);
So onto the real question:
Why does Iterable<T> not provide default methods that implement stream() and parallelStream()? I see nothing that would make this impossible or unwanted.
A related question I found is the following though: Why does Stream<T> not implement Iterable<T>?
Which is oddly enough suggesting it to do it somewhat the other way around.

This was not an omission; there was detailed discussion on the EG list in June of 2013.
The definitive discussion of the Expert Group is rooted at this thread.
While it seemed "obvious" (even to the Expert Group, initially) that stream() seemed to make sense on Iterable, the fact that Iterable was so general became a problem, because the obvious signature:
Stream<T> stream()
was not always what you were going to want. Some things that were Iterable<Integer> would rather have their stream method return an IntStream, for example. But putting the stream() method this high up in the hierarchy would make that impossible. So instead, we made it really easy to make a Stream from an Iterable, by providing a spliterator() method. The implementation of stream() in Collection is just:
default Stream<E> stream() {
return StreamSupport.stream(spliterator(), false);
}
Any client can get the stream they want from an Iterable with:
Stream s = StreamSupport.stream(iter.spliterator(), false);
In the end we concluded that adding stream() to Iterable would be a mistake.
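To illustrate the IntStream point, here is a sketch of an Iterable<Integer> that exposes its own primitive stream. IntRange is a hypothetical class, not part of the JDK.

```java
import java.util.Iterator;
import java.util.stream.IntStream;

// Hypothetical class: a half-open range [from, to) of ints.
class IntRange implements Iterable<Integer> {
    private final int from;
    private final int to;

    IntRange(int from, int to) {
        this.from = from;
        this.to = to;
    }

    @Override
    public Iterator<Integer> iterator() {
        // PrimitiveIterator.OfInt is also an Iterator<Integer>.
        return IntStream.range(from, to).iterator();
    }

    // Legal today. If Iterable declared "Stream<T> stream()", this return type
    // would clash with the inherited signature, because IntStream is not a Stream<Integer>.
    public IntStream stream() {
        return IntStream.range(from, to);
    }

    public static void main(String[] args) {
        IntRange range = new IntRange(1, 4);
        System.out.println(range.stream().sum()); // 6
        for (int i : range) {
            System.out.println(i); // 1, then 2, then 3
        }
    }
}
```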

I did an investigation in several of the project lambda mailing lists and I think I found a few interesting discussions.
I have not found a satisfactory explanation so far. After reading all this I concluded it was just an omission. But you can see here that it was discussed several times over the years during the design of the API.
Lambda Libs Spec Experts
I found a discussion about this in the Lambda Libs Spec Experts mailing list:
Under Iterable/Iterator.stream() Sam Pullara said:
I was working with Brian on seeing how limit/substream
functionality[1] might be implemented and he suggested conversion to
Iterator was the right way to go about it. I had thought about that
solution but didn't find any obvious way to take an iterator and turn
it into a stream. It turns out it is in there, you just need to first
convert the iterator to a spliterator and then convert the spliterator
to a stream. So this brings me to revisit the whether we should have
these hanging off one of Iterable/Iterator directly or both.
My suggestion is to at least have it on Iterator so you can move
cleanly between the two worlds and it would also be easily
discoverable rather than having to do:
Streams.stream(Spliterators.spliteratorUnknownSize(iterator,
Spliterator.ORDERED))
And then Brian Goetz responded:
I think Sam's point was that there are plenty of library classes that
give you an Iterator but don't let you necessarily write your own
spliterator. So all you can do is call
stream(spliteratorUnknownSize(iterator)). Sam is suggesting that we
define Iterator.stream() to do that for you.
I would like to keep the stream() and spliterator() methods as being
for library writers / advanced users.
And later
"Given that writing a Spliterator is easier than writing an Iterator,
I would prefer to just write a Spliterator instead of an Iterator (Iterator is so 90s :)"
You're missing the point, though. There are zillions of classes out
there that already hand you an Iterator. And many of them are not
spliterator-ready.
Previous Discussions in Lambda Mailing List
This may not be the answer you are looking for but in the Project Lambda mailing list this was briefly discussed. Perhaps this helps to foster a broader discussion on the subject.
In the words of Brian Goetz under Streams from Iterable:
Stepping back...
There are lots of ways to create a Stream. The more information you
have about how to describe the elements, the more functionality and
performance the streams library can give you. In order of least to
most information, they are:
Iterator
Iterator + size
Spliterator
Spliterator that knows its size
Spliterator that knows its size, and further knows that all sub-splits
know their size.
(Some may be surprised to find that we can extract parallelism even
from a dumb iterator in cases where Q (work per element) is
nontrivial.)
If Iterable had a stream() method, it would just wrap an Iterator with
a Spliterator, with no size information. But, most things that are
Iterable do have size information. Which means we're serving up
deficient streams. That's not so good.
One downside of the API practice outlined by Stephen here, of
accepting Iterable instead of Collection, is that you are forcing
things through a "small pipe" and therefore discarding size
information when it might be useful. That's fine if all you're doing
to do is forEach it, but if you want to do more, its better if you
can preserve all the information you want.
The default provided by Iterable would be a crappy one indeed -- it
would discard size even though the vast majority of Iterables do know
that information.
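The loss of size information can be observed directly through the SIZED characteristic of the spliterator. The following is a sketch with made-up names; it compares an ArrayList with the same data seen only through the default Iterable.spliterator().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class SizedDemo {
    // True if the source's spliterator reports that it knows its exact size.
    static boolean knowsSize(Iterable<?> source) {
        return source.spliterator().hasCharacteristics(Spliterator.SIZED);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));

        // Same data, but reachable only through the default Iterable.spliterator().
        Iterable<String> plain = list::iterator;

        System.out.println(knowsSize(list));  // true: ArrayList supplies its own spliterator
        System.out.println(knowsSize(plain)); // false: built from a bare iterator
    }
}
```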
Contradiction?
Although, it looks like the discussion is based on the changes that the Expert Group did to the initial design of Streams which was initially based on iterators.
Even so, it is interesting to notice that in an interface like Collection, the stream method is defined as:
default Stream<E> stream() {
return StreamSupport.stream(spliterator(), false);
}
Which could be exactly the same code being used in the Iterable interface.
So, this is why I said this answer is probably not satisfactory, but still interesting for the discussion.
Evidence of Refactoring
Continuing with the analysis in the mailing list, it looks like the splitIterator method was originally in the Collection interface, and at some point in 2013 they moved it up to Iterable.
Pull splitIterator up from Collection to Iterable.
Conclusion/Theories?
Then chances are that the lack of the method in Iterable is just an omission, since it looks like they should have moved the stream method as well when they moved the splitIterator up from Collection to Iterable.
If there are other reasons, they are not evident. Does anybody else have other theories?

If you know the size you could use java.util.Collection which provides the stream() method:
public class Hand extends AbstractCollection<Card> {
private final List<Card> list = new ArrayList<>();
private final int capacity;
//...
@Override
public Iterator<Card> iterator() {
return list.iterator();
}
@Override
public int size() {
return list.size();
}
}
And then:
new Hand().stream().map(...)
I faced the same problem and was surprised that my Iterable implementation could be very easily extended to an AbstractCollection implementation by simply adding the size() method (luckily I had the size of the collection :-)
You should also consider overriding Spliterator<E> spliterator().

What is the Iterable interface used for?

I am a beginner and I cannot understand the real effect of the Iterable interface.

Besides what Jeremy said, its main benefit is that it has its own bit of syntactic sugar: the enhanced for-loop. If you have, say, an Iterable<String>, you can do:
for (String str : myIterable) {
...
}
Nice and easy, isn't it? All the dirty work of creating the Iterator<String>, checking if it hasNext(), and calling str = next() is handled behind the scenes by the compiler.
And since most collections either implement Iterable or have a view that returns one (such as Map's keySet() or values()), this makes working with collections much easier.
The Iterable Javadoc gives a full list of classes that implement Iterable.

If you have a complicated data set, like a tree or a helical queue (yes, I just made that up), but you don't care how it's structured internally, you just want to get all elements one by one, you get it to return an iterator.
The complex object in question, be it a tree or a queue or a WombleBasket implements Iterable, and can return an iterator object that you can query using the Iterator methods.
That way, you can just ask it if it hasNext(), and if it does, you get the next() item, without worrying where to get it from the tree or wherever.

It returns a java.util.Iterator. It is mainly used to be able to use the implementing type in the enhanced for loop:
List<Item> list = ...
for (Item i:list) {
// use i
}
Under the hood the compiler calls the list.iterator() and iterates it giving you the i inside the for loop.

An interface is at its heart a list of methods that a class should implement. The Iterable interface is very simple -- there is only one method to implement: iterator(). When a class implements the Iterable interface, it is telling other classes that you can get an Iterator object to use to iterate over (i.e., traverse) the data in the object.
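As a minimal sketch of what implementing the interface looks like, consider a made-up class Countdown that can be used directly in an enhanced for loop.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical example class: counts down from 'start' to 1.
class Countdown implements Iterable<Integer> {
    private final int start;

    Countdown(int start) {
        this.start = start;
    }

    @Override
    public Iterator<Integer> iterator() {
        // Each call returns a fresh iterator with its own position.
        return new Iterator<Integer>() {
            private int current = start;

            @Override
            public boolean hasNext() {
                return current > 0;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current--;
            }
        };
    }

    public static void main(String[] args) {
        for (int n : new Countdown(3)) {
            System.out.println(n); // 3, then 2, then 1
        }
    }
}
```

Because iterator() builds a new object every time, the same Countdown can be looped over as often as needed.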

Iterators basically allow for iteration over any Collection.
It's also what is required to use Java's for-each control statement.

Iterable is defined as a generic type: Iterable<T>, where the type parameter T represents the type of elements returned by the iterator.
An object that implements this interface allows it to be the target of the “foreach” statement. The for-each loop is used for iterating over arrays, collections.
read more -: https://examples.javacodegeeks.com/iterable-java-example-java-lang-iterable-interface/

Why aren't Enumerations Iterable?

In Java 5 and above you have the foreach loop, which works magically on anything that implements Iterable:
for (Object o : list) {
doStuff(o);
}
However, Enumeration still does not implement Iterable, meaning that to iterate over an Enumeration you must do the following:
for(; e.hasMoreElements() ;) {
doStuff(e.nextElement());
}
Does anyone know if there is a reason why Enumeration still does not implement Iterable?
Edit: As a clarification, I'm not talking about the language concept of an enum, I'm talking a Java-specific class in the Java API called 'Enumeration'.

As an easy and clean way of using an Enumeration with the enhanced for loop, convert to an ArrayList with java.util.Collections.list.
for (TableColumn col : Collections.list(columnModel.getColumns())) {
(javax.swing.table.TableColumnModel.getColumns returns Enumeration.)
Note, this may be very slightly less efficient.
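A self-contained sketch of the same idea without Swing, using Vector.elements() as the source of the Enumeration. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.Vector;

class EnumerationCopyDemo {
    // Concatenates all elements of an Enumeration using an enhanced for loop.
    static String join(Enumeration<String> e) {
        StringBuilder sb = new StringBuilder();
        // Collections.list copies the remaining elements into a new ArrayList.
        for (String s : Collections.list(e)) {
            sb.append(s);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Vector<String> v = new Vector<>(Arrays.asList("a", "b", "c"));
        System.out.println(join(v.elements())); // abc
    }
}
```

The copy is where the small extra cost mentioned above comes from.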

It doesn't make sense for Enumeration to implement Iterable. Iterable is a factory for Iterator. Enumeration is analogous to Iterator, and only maintains state for a single enumeration.
So, be careful trying to wrap an Enumeration as an Iterable. If someone passes me an Iterable, I will assume that I can call iterator() on it repeatedly, creating as many Iterator instances as I want, and iterating independently on each. A wrapped Enumeration will not fulfill this contract; don't let your wrapped Enumeration escape from your own code. (As an aside, I noticed that Java 7's DirectoryStream violates expectations in just this way, and shouldn't be allowed to "escape" either.)
Enumeration is like an Iterator, not an Iterable. A Collection is Iterable. An Iterator is not.
You can't do this:
Vector<X> list = …
Iterator<X> i = list.iterator();
for (X x : i) {
x.doStuff();
}
So it wouldn't make sense to do this:
Vector<X> list = …
Enumeration<X> i = list.enumeration();
for (X x : i) {
x.doStuff();
}
There is no Enumerable equivalent to Iterable. It could be added without breaking anything to work in for loops, but what would be the point? If you are able to implement this new Enumerable interface, why not just implement Iterable instead?

Enumeration hasn't been modified to support Iterable because it's an interface, not a concrete class (like Vector, which was modified to support the Collections interface).
If Enumeration was changed to support Iterable it would break a bunch of people's code.

AFAIK Enumeration is kinda "deprecated":
Iterator takes the place of Enumeration in the Java collections framework
I hope they'll change the Servlet API with JSR 315 to use Iterator instead of Enumeration.

If you would just like it to be syntactically a little cleaner, you can use:
while(e.hasMoreElements()) {
doStuff(e.nextElement());
}

It is possible to create an Iterable from any object with a method that returns an Enumeration, using a lambda as an adapter. In Java 8, use Guava's static Iterators.forEnumeration method, and in Java 9+ use the Enumeration instance method asIterator.
Consider the Servlet API's HttpSession.getAttributeNames(), which returns an Enumeration<String> rather than an Iterator<String>.
Java 8 using Guava
Iterable<String> iterable = () -> Iterators.forEnumeration(session.getAttributeNames());
Java 9+
Iterable<String> iterable = () -> session.getAttributeNames().asIterator();
Note that these lambdas are truly Iterable; they return a fresh Iterator each time they are invoked. You can use them exactly like any other Iterable in an enhanced for loop, StreamSupport.stream(iterable.spliterator(), false), and iterable.forEach().
The same trick works on classes that provide an Iterator but don't implement Iterable. Iterable<Something> iterable = notIterable::createIterator;
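The difference between a lambda that asks for a new Enumeration each time and one that captures a single Enumeration can be sketched as follows. This assumes Java 9+ for asIterator; the Vector stands in for any API that returns an Enumeration, and the names are invented.

```java
import java.util.Arrays;
import java.util.Enumeration;
import java.util.Vector;

class EnumerationAdapterDemo {
    // Counts how many elements a for-each loop sees.
    static int count(Iterable<?> source) {
        int n = 0;
        for (Object ignored : source) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        Vector<String> names = new Vector<>(Arrays.asList("a", "b", "c"));

        // Requests a new Enumeration on every call to iterator(): reusable.
        Iterable<String> fresh = () -> names.elements().asIterator();

        // Captures one Enumeration: every iterator() call shares its state.
        Enumeration<String> single = names.elements();
        Iterable<String> oneShot = () -> single.asIterator();

        System.out.println(count(fresh) + " " + count(fresh));     // 3 3
        System.out.println(count(oneShot) + " " + count(oneShot)); // 3 0
    }
}
```

Only the first form honors the expectation that iterator() can be called repeatedly.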
