Unconventional use of Iterator to iterate over a collection - java

I am aware of the conventional iterator creation-usage for a List<String> list as below:
//Conventional-style
Iterator<String> iterator = list.iterator();
while(iterator.hasNext()){
String string = iterator.next();
//...further code goes here
}
However, in the accepted answer of Iterating through a Collection, avoiding ConcurrentModificationException when removing in loop, I came across this unusual for loop usage with Iterator:
//Unconventional for loop style
for (Iterator<String> iterator = list.iterator(); iterator.hasNext();) {
String string = iterator.next();
//...further code goes here
}
Now, I'd like to know:
Does this unconventional style create the iterator on the collection for each iteration over and over again? Or is it somehow a special kind of intelligent for-loop, which creates the iterator once and reuses it?
If it creates an iterator each time, shouldn't it be a performance concern?
Can we replace the while loop line in the conventional style with
for(;iterator.hasNext();), if I were to use a for loop only?
PS: I am well aware of the enhanced for loop use on a collection. I am looking at this with the intention of 'safe' removal of elements, without causing a ConcurrentModificationException.

The idiom you call "unconventional" is actually the recommended one because it restricts the scope of the iterator variable to the loop where it is used.
The iterator is created once, before the loop begins. This follows from the general semantics of the for loop, which I warmly advise you get acquainted with.
You can, but it is not recommended. Such an idiom would be a pointless obfuscation of the while idiom.
Finally, note that for 99% of use cases all of the above is moot because you really should be using either the enhanced for loop or Java 8 forEach.

Java is derived from C, and thus for (A; B; C) { P; } has the same semantics as A; while (B) { P; C; }. The only difference is the scope of the variables. In particular, the A part is only executed once. So your two code examples do exactly the same, but in the for-variant the scope of the variable is restricted.
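To see for yourself that the initialization part runs only once, here is a minimal sketch. CountingIterable and ForLoopInitDemo are hypothetical names made up for this illustration (they are not from the question); the wrapper simply counts how often iterator() is called while the for-loop idiom runs.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper for illustration: wraps a list and counts iterator() calls.
class CountingIterable<T> implements Iterable<T> {
    private final List<T> items;
    int iteratorCalls = 0;

    CountingIterable(List<T> items) {
        this.items = items;
    }

    @Override
    public Iterator<T> iterator() {
        iteratorCalls++;
        return items.iterator();
    }
}

class ForLoopInitDemo {
    // Runs the for-loop idiom from the question and reports how many
    // times iterator() was invoked on the source.
    static int countIteratorCalls(List<String> data) {
        CountingIterable<String> source = new CountingIterable<>(data);
        for (Iterator<String> it = source.iterator(); it.hasNext();) {
            it.next(); // the body runs once per element
        }
        return source.iteratorCalls;
    }

    public static void main(String[] args) {
        System.out.println(countIteratorCalls(Arrays.asList("a", "b", "c")));
    }
}
```

The count stays at one regardless of the list size, because only the condition (and the update part, if there is one) is re-evaluated on each pass.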
The more modern way of iterating through a collection is the enhanced for loop:
for (String string : list) {
...
}
However, if you want to delete or change items while iterating through it, you still need the iterator version. For example:
for (Iterator<String> it = list.iterator(); it.hasNext();) {
String string = it.next();
if (someFunction(string)) {
it.remove();
}
}
has no enhanced for-loop equivalent.
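As a rough sketch of the difference, the two hypothetical methods below (the class and method names are invented for this example) perform the same removal once through the iterator and once through List.remove inside an enhanced for loop. Both work on a private ArrayList copy of the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class RemoveWhileIterating {
    // Removes every empty string through the iterator, which is the safe way.
    static List<String> removeEmptyWithIterator(List<String> input) {
        List<String> list = new ArrayList<>(input);
        for (Iterator<String> it = list.iterator(); it.hasNext();) {
            if (it.next().isEmpty()) {
                it.remove();
            }
        }
        return list;
    }

    // Attempts the same removal with List.remove inside an enhanced for loop.
    // Returns true if a ConcurrentModificationException was thrown.
    static boolean enhancedForThrows(List<String> input) {
        List<String> list = new ArrayList<>(input);
        try {
            for (String s : list) {
                if (s.isEmpty()) {
                    list.remove(s); // modifies the list behind the hidden iterator
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "", "b", "c");
        System.out.println(removeEmptyWithIterator(data));
        System.out.println(enhancedForThrows(data));
    }
}
```

Since Java 8 the same removal can also be written as list.removeIf(String::isEmpty), which avoids the explicit iterator altogether.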

1.
No, it does not create an iterator over and over again. This was a perfectly fine style before Java included the interface Iterable<T>.
If you want to remove an item while iterating over the collection, you have to use the iterator.remove() method, if it is provided, because otherwise a ConcurrentModificationException will be thrown.
If you do not want to remove an item while iterating over the collection, then you should just use the for-each construct, which works with every collection that implements the Iterable<T> interface. (Link at the end for more information.)
for (String s : yourList) {
... // do something with the string
}
2.
Yes!! Use the for loop idiom. But as I said, if you do not want to use the iterator.remove() operation, but just want to iterate over the collection, you should use the provided for each concept.
You can find a lot of information on the downsides of the iterator.next() approach here and why the newly integrated for:each concept is better:
https://docs.oracle.com/javase/1.5.0/docs/guide/language/foreach.html


Can I use many listIterators sequentially to mutate or remove list elements from an ArrayList in Java?

I am relying on list iterators to move through a list of characters. This is a single-threaded program and I use listIterator objects sequentially in 4 different methods. Each method has the same setup:
private void myMethod(ArrayList<Integer> input) {
ListIterator<Integer> i = input.listIterator();
while (i.hasNext()) {
Integer in = i.next();
if (in < 10)
i.remove();
else
i.set(in*in); // because its lucky
}
}
With this pattern, on the second iterator the following Exception is thrown:
java.util.ConcurrentModificationException
However, looking at the javadocs I don't see this Exception in the Exceptions thrown nor do I see a method to close the iterator after I am done. Am I using the listIterator incorrectly? I have to iterate over the same ArrayList multiple times, each time conditionally removing or mutating each element. Maybe there is a better way to iterate over the ArrayList and this use-case is not best solved by a ListIterator.
java docs for ListIterator
This is explained in the ArrayList javadoc, you are modifying the list with remove() and set() while using an Iterator:
The iterators returned by this class's iterator and listIterator methods are fail-fast: if the list is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove or add methods, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
It’s hard to diagnose a problem when the shown code clearly isn’t the code that produced the exception, as it doesn’t even compile. The remove method of Iterator doesn’t take arguments and the set method is defined on ListIterator, but your code declares the variable i only as Iterator.
A fixed version
private void myMethod(ArrayList<Integer> input) {
ListIterator<Integer> i = input.listIterator();
while (i.hasNext()) {
Integer in = i.next();
if (in < 10)
i.remove();
else
i.set(in*in);
}
}
would run without problems. The answer to your general question is that each structural modification invalidates all existing iterators, except the one used to make the modification (provided you made the modification through an iterator and not through the collection interface directly).
But in your code, there is only one iterator, which is only created and used for this one operation. As long as there is no overlapping use of iterators to the same collection, there is no problem with the invalidation. Iterators existing from previous operations are abandoned anyway and the iterators used in subsequent operations do not exist yet.
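To illustrate what "overlapping use" means here, a small sketch follows. The class name and the sample values are made up for this example. Two list iterators are open on the same ArrayList at the same time, and a removal through the first one invalidates the second.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.ListIterator;

class OverlappingIterators {
    // Returns true if the second iterator fails after the first one
    // has structurally modified the list.
    static boolean secondIteratorFails() {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 20, 30));
        ListIterator<Integer> first = list.listIterator();
        ListIterator<Integer> second = list.listIterator(); // created before the removal

        first.next();
        first.remove(); // structural modification made through 'first'
        first.next();   // 'first' itself remains usable

        try {
            second.next(); // 'second' still expects the old modification count
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondIteratorFails());
    }
}
```

When the iterators are used strictly one after another, as in the question, this situation cannot arise, which is why the exception must come from somewhere else.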
Still, it’s easier to use
private void myMethod(ArrayList<Integer> input) {
input.removeIf(in -> in < 10);
input.replaceAll(in -> in*in);
}
instead. Unlike the original code, this does two iterations, but as explained in this answer, removeIf will be actually faster than iterator based removal in those cases, where performance really matters.
But still, the problem persists. The shown code can’t cause a ConcurrentModificationException, so your actual problem is somewhere else and may still be present, regardless of how this one method has been implemented.
I am not knowledgeable enough about Java ListIterators to answer the question, but it appears I have run into the XY problem here. The problem seems to be better solved with Java Streams, filtering out the unwanted elements and mapping the remaining ones into a new ArrayList by applying a function to each element of the original ArrayList.
private ArrayList<Integer> myMethod(ArrayList<Integer> input) {
    return input.stream()
            .filter(in -> in >= 10) // keep only what the original loop would not remove
            .map(in -> in * in)     // square the remaining elements
            .collect(Collectors.toCollection(ArrayList::new));
}

how to remove data from ArrayList [duplicate]

In Java, is it legal to call remove on a collection when iterating through the collection using a foreach loop? For instance:
List<String> names = ....
for (String name : names) {
// Do something
names.remove(name);
}
As an addendum, is it legal to remove items that have not been iterated over yet? For instance,
//Assume that the names list has duplicate entries
List<String> names = ....
for (String name : names) {
// Do something
while (names.remove(name));
}
To safely remove from a collection while iterating over it you should use an Iterator.
For example:
List<String> names = ....
Iterator<String> i = names.iterator();
while (i.hasNext()) {
String s = i.next(); // must be called before you can call i.remove()
// Do something
i.remove();
}
From the Java Documentation :
The iterators returned by this class's iterator and listIterator
methods are fail-fast: if the list is structurally modified at any
time after the iterator is created, in any way except through the
iterator's own remove or add methods, the iterator will throw a
ConcurrentModificationException. Thus, in the face of concurrent
modification, the iterator fails quickly and cleanly, rather than
risking arbitrary, non-deterministic behavior at an undetermined time
in the future.
Perhaps what is unclear to many novices is the fact that iterating over a list using the for/foreach constructs implicitly creates an iterator which is necessarily inaccessible. This info can be found here
You don't want to do that. It can cause undefined behavior depending on the collection. You want to use an Iterator directly. Although the for each construct is syntactic sugar and is really using an iterator, it hides it from your code so you can't access it to call Iterator.remove.
The behavior of an iterator is
unspecified if the underlying
collection is modified while the
iteration is in progress in any way
other than by calling this method.
Instead write your code:
List<String> names = ....
Iterator<String> it = names.iterator();
while (it.hasNext()) {
String name = it.next();
// Do something
it.remove();
}
Note that the code calls Iterator.remove, not List.remove.
Addendum:
Even if you are removing an element that has not been iterated over yet, you still don't want to modify the collection and then use the Iterator. It might modify the collection in a way that is surprising and affects future operations on the Iterator.
for (String name : new ArrayList<String>(names)) {
// Do something
names.remove(nameToRemove);
}
You clone the list names and iterate through the clone while you remove from the original list. A bit cleaner than the top answer.
The java design of the "enhanced for loop" was to not expose the iterator to code, but the only way to safely remove an item is to access the iterator. So in this case you have to do it old school:
for(Iterator<String> i = names.iterator(); i.hasNext();) {
String name = i.next();
//Do Something
i.remove();
}
If in the real code the enhanced for loop is really worth it, then you could add the items to a temporary collection and call removeAll on the list after the loop.
EDIT (re addendum): No, changing the list in any way outside the iterator.remove() method while iterating will cause problems. The only way around this is to use a CopyOnWriteArrayList, but that is really intended for concurrency issues.
The cheapest (in terms of lines of code) way to remove duplicates is to dump the list into a LinkedHashSet (and then back into a List if you need). This preserves insertion order while removing duplicates.
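A minimal sketch of that round trip, assuming the elements implement equals and hashCode sensibly. The class and method names are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class Dedupe {
    // Copies the list into a LinkedHashSet (drops duplicates, keeps first-seen
    // order) and then back into a new ArrayList.
    static <T> List<T> withoutDuplicates(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        System.out.println(withoutDuplicates(Arrays.asList("b", "a", "b", "c", "a")));
    }
}
```

Note that this returns a new list rather than modifying the original one in place.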
I didn't know about iterators, however here's what I was doing until today to remove elements from a list inside a loop:
List<String> names = ....
for (int i = names.size() - 1; i >= 0; i--) {
// Do something
names.remove(i);
}
This always works, and it can be used in other languages or with structures that do not support iterators.
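The reason the backward loop is safe is that removing index i only shifts the elements after i, and those have already been visited. Here is a sketch with a condition added. The threshold parameter and the names are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReverseIndexRemoval {
    // Removes all values below the threshold, walking from the end so that
    // the shift caused by a removal never affects an unvisited index.
    static List<Integer> removeSmall(List<Integer> input, int threshold) {
        List<Integer> list = new ArrayList<>(input);
        for (int i = list.size() - 1; i >= 0; i--) {
            if (list.get(i) < threshold) {
                list.remove(i); // i is an int, so this is remove(int index)
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeSmall(Arrays.asList(5, 12, 3, 3, 40), 10));
    }
}
```

With a List<Integer>, be careful that the argument really is an int: passing an Integer object would select remove(Object) instead.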
Yes, you can use the for-each loop.
To do that, you have to maintain a separate list that holds the items to remove, and then remove that list from the names list using the removeAll() method:
List<String> names = ....
// introduce a separate list to hold removing items
List<String> toRemove= new ArrayList<String>();
for (String name : names) {
// Do something: perform conditional checks
toRemove.add(name);
}
names.removeAll(toRemove);
// now names list holds expected values
Make sure this is not code smell. Is it possible to reverse the logic and be 'inclusive' rather than 'exclusive'?
List<String> names = ....
List<String> reducedNames = ....
for (String name : names) {
// Do something
if (conditionToIncludeMet)
reducedNames.add(name);
}
return reducedNames;
The situation that led me to this page involved old code that looped through a List using indices to remove elements from the List. I wanted to refactor it to use the foreach style.
It looped through an entire list of elements to verify which ones the user had permission to access, and removed the ones that didn't have permission from the list.
List<Service> services = ...
for (int i=0; i<services.size(); i++) {
if (!isServicePermitted(user, services.get(i)))
services.remove(i);
}
To reverse this and not use the remove:
List<Service> services = ...
List<Service> permittedServices = ...
for (Service service:services) {
if (isServicePermitted(user, service))
permittedServices.add(service);
}
return permittedServices;
When would "remove" be preferred? One consideration is a large list or an expensive "add", combined with only a few removals relative to the list size. It might be more efficient to do only a few removes rather than a great many adds. But in my case the situation did not merit such an optimization.
Those saying that you can't safely remove an item from a collection except through the Iterator aren't quite correct, you can do it safely using one of the concurrent collections such as ConcurrentHashMap.
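A small sketch of that claim, using invented names and sample data. The iterators of ConcurrentHashMap are documented as weakly consistent and do not throw ConcurrentModificationException, so removing entries inside the loop is permitted.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConcurrentMapRemoval {
    // Copies the input into a ConcurrentHashMap and removes every entry with
    // a negative value while iterating over the key set.
    static Map<String, Integer> dropNegative(Map<String, Integer> input) {
        Map<String, Integer> map = new ConcurrentHashMap<>(input);
        for (String key : map.keySet()) {
            if (map.get(key) < 0) {
                map.remove(key); // allowed: no fail-fast check on this iterator
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> data = new HashMap<>();
        data.put("a", 1);
        data.put("b", -2);
        data.put("c", 3);
        System.out.println(dropNegative(data));
    }
}
```

This only shows that the removal does not fail. Whether a concurrent map is the right container for a given program is a separate design question.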
Try this too, then change the condition to "WINTER", and you may be surprised:
public static void main(String[] args) {
List<String> Season = new ArrayList<>(); // declaration missing from the original snippet
Season.add("Frühling");
Season.add("Sommer");
Season.add("Herbst");
Season.add("WINTER");
for (String s : Season) {
if(!s.equals("Sommer")) {
System.out.println(s);
continue;
}
Season.remove("Frühling");
}
}
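The experiment above can be made a bit more systematic. The sketch below uses English season names and a hypothetical run method (both are assumptions of this example). It removes the first element when the loop reaches a chosen trigger element and reports whether a ConcurrentModificationException occurred. The outcome depends on how the ArrayList iterator in current OpenJDK versions implements hasNext(), so treat it as an observation about that implementation, not a guarantee of the API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

class SecondToLastQuirk {
    // Removes "Spring" through List.remove when the loop reaches 'trigger'
    // and reports which elements were visited.
    static String run(String trigger) {
        List<String> seasons = new ArrayList<>(Arrays.asList("Spring", "Summer", "Autumn", "Winter"));
        List<String> visited = new ArrayList<>();
        try {
            for (String s : seasons) {
                visited.add(s);
                if (s.equals(trigger)) {
                    seasons.remove("Spring");
                }
            }
            return "visited " + visited;
        } catch (ConcurrentModificationException e) {
            return "CME after " + visited;
        }
    }

    public static void main(String[] args) {
        for (String trigger : Arrays.asList("Summer", "Autumn", "Winter")) {
            System.out.println(trigger + ": " + run(trigger));
        }
    }
}
```

When the trigger is the second-to-last element, the removal makes the iterator's position equal to the new size, so hasNext() returns false and the loop ends quietly without visiting the last element. That silent skip is arguably worse than the exception, and one more reason not to rely on this pattern.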
It's better to use an Iterator when you want to remove an element from a list,
because the source code of remove is
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null;
So if you remove an element from the list, the list is restructured and the indices of the other elements change, which can lead to results you did not intend.
Use the remove() method of Iterator, or use CopyOnWriteArrayList.
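For the CopyOnWriteArrayList option, a minimal sketch could look like this. The class name, method name and the length-based condition are assumptions chosen for illustration. Its iterator works on a snapshot of the backing array, so calling remove on the list during the loop does not throw. Keep in mind that the iterator of this class does not support its own remove() method, and that every modification copies the whole array.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIteration {
    // Removes all strings shorter than minLength while iterating with for-each.
    static List<String> removeShort(List<String> input, int minLength) {
        List<String> list = new CopyOnWriteArrayList<>(input);
        for (String s : list) {      // iterates over a snapshot of the array
            if (s.length() < minLength) {
                list.remove(s);      // copies the backing array; the snapshot is unaffected
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeShort(Arrays.asList("a", "bbb", "cc", "dddd"), 3));
    }
}
```

Because of the copying cost, this is mostly attractive for lists that are read far more often than they are changed.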

Iterator vs for

I was asked in an interview what is the advantage of using iterator over for loop or what is the advantage of using for loop over iterator?
Can any body please answer this?
First of all, there are 2 kinds of for loops, which behave very differently. One uses indices:
for (int i = 0; i < list.size(); i++) {
Thing t = list.get(i);
...
}
This kind of loop isn't always possible. For example, Lists have indices, but Sets don't, because they're unordered collections.
The other one, the foreach loop uses an Iterator behind the scenes:
for (Thing thing : list) {
...
}
This works with every kind of Iterable collection (or array)
And finally, you can use an Iterator, which also works with any Iterable:
for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
Thing t = it.next();
...
}
So you in fact have 3 loops to compare.
You can compare them in different terms: performance, readability, error-proneness, capability.
An Iterator can do things that a foreach loop can't. For example, you can remove elements while you're iterating, if the iterator supports it:
for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
Thing t = it.next();
if (shouldBeDeleted(t)) {
it.remove();
}
}
Lists also offer iterators that can iterate in both directions. A foreach loop only iterates from the beginning to the end.
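For example, a backward traversal with ListIterator might be sketched as follows (the class and method names are made up for this example). listIterator(int) positions the iterator at the given index, so starting at size() puts it just past the last element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class BackwardIteration {
    // Collects the elements of the list from last to first.
    static <T> List<T> reversedCopy(List<T> input) {
        List<T> out = new ArrayList<>();
        ListIterator<T> it = input.listIterator(input.size()); // start after the last element
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(reversedCopy(Arrays.asList(1, 2, 3)));
    }
}
```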
But an Iterator is more dangerous and less readable. When a foreach loop is all you need, it's the most readable solution. With an iterator, you could do the following, which would be a bug:
for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
System.out.println(it.next().getFoo());
System.out.println(it.next().getBar());
}
A foreach loop doesn't allow for such a bug to happen.
Using indices to access elements is slightly more efficient with collections backed by an array. But if you change your mind and use a LinkedList instead of an ArrayList, suddenly the performance will be awful, because each time you access list.get(i), the linked list will have to loop through all its elements until the ith one. An Iterator (and thus the foreach loop) doesn't have this problem. It always uses the best possible way to iterate through elements of the given collection, because the collection itself has its own Iterator implementation.
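If some code really wants an index loop but must accept any List, one possible approach is to check for the java.util.RandomAccess marker interface, which ArrayList implements and LinkedList does not. The sketch below uses hypothetical names and is only meant to show the idea, not a recommendation over the plain foreach loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class LoopStrategy {
    // True if get(i) is expected to be cheap for this list implementation.
    static boolean indexLoopIsCheap(List<?> list) {
        return list instanceof RandomAccess;
    }

    // Sums the list, using indices only when random access is advertised.
    static int sum(List<Integer> list) {
        int total = 0;
        if (list instanceof RandomAccess) {
            for (int i = 0; i < list.size(); i++) {
                total += list.get(i);
            }
        } else {
            for (int value : list) { // iterator-based, one step per element
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linkedList = new LinkedList<>(arrayList);
        System.out.println(indexLoopIsCheap(arrayList) + " " + sum(arrayList));
        System.out.println(indexLoopIsCheap(linkedList) + " " + sum(linkedList));
    }
}
```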
My general rule of thumb is: use the foreach loop, unless you really need capabilities of an Iterator. I would only use for loop with indices with arrays, when I need access to the index inside the loop.
Iterator Advantage:
Ability to remove elements from Collections.
Ability to move forward and backward using next() and previous().
Ability to check whether there are more elements by using hasNext().
The for-each loop was designed only to iterate over a Collection, so if you just want to iterate over a Collection, it's better to use a loop such as for-each; if you want more than that, you can use an Iterator.
The main difference between Iterator and the classic for loop, apart from the obvious one of having or not having access to the index of the item you're iterating, is that using Iterator abstracts the client code from the underlying collection implementation. Allow me to elaborate.
When your code uses an iterator, either in this form
for(Item element : myCollection) { ... }
this form
Iterator<Item> iterator = myCollection.iterator();
while(iterator.hasNext()) {
Item element = iterator.next();
...
}
or this form
for(Iterator<Item> iterator = myCollection.iterator(); iterator.hasNext(); ) {
Item element = iterator.next();
...
}
What your code is saying is "I don't care about the type of collection and its implementation, I just care that I can iterate through its elements". Which is usually the better approach, since it makes your code more decoupled.
On the other hand, if you're using the classic for loop, as in
for(int i = 0; i < myCollection.size(); i++) {
Item element = myCollection.get(i);
...
}
Your code is saying, I need to know the type of collection, because I need to iterate through its elements in a specific way, I'm also possibly going to check for nulls or compute some result based on the order of iteration. Which makes your code more fragile, because if at any point the type of collection you receive changes, it will impact the way your code works.
Summing it up, the difference is not so much about speed or memory usage; it is more about decoupling your code so that it is more flexible and copes better with change.
If you access data by index (e.g. "i"), it is fast with an array, because the element is reached directly.
With other data structures (e.g. a tree or a linked list), it takes more time, because the lookup starts at the first element and walks to the target element. With a linked list, each access takes O(n) time, so the loop is slow.
If you use an iterator, the iterator knows where it is, so moving to the next element takes O(1)
(because it starts from the current position).
Finally, if you only use arrays or data structures that support direct access (e.g. ArrayList in Java), "a[i]" is fine. With other data structures, an iterator is more efficient.
Unlike the other answers, I want to point out some other things.
If you need to perform the iteration in more than one place in your code, you will likely end up duplicating the logic. This clearly isn’t a very extensible approach. Instead, what’s needed is a way to separate the logic for selecting the data from the code that actually processes it.
An iterator solves these problems by providing a generic interface for looping over a set of data so that the underlying data structure or storage mechanism (such as an array) is hidden.
Iterator is a concept not an implementation.
An iterator provides a number of operations for traversing and accessing data.
An iterator may wrap any datastructure like array.
One of the more interesting and useful advantages of using iterators is the capability to wrap or decorate another iterator to filter the return values
An iterator may be thread safe, while a for loop alone cannot be, as it accesses elements directly. The best-known collection with a thread-safe iterator is CopyOnWriteArrayList, which is well known and used often, so it is worth mentioning.
This is from the book Beginning Algorithms: https://www.amazon.com/Beginning-Algorithms-Simon-Harris/dp/0764596748
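The point about wrapping or decorating another iterator to filter its values can be sketched like this. FilteringIterator is a hypothetical class written for this illustration (it is not taken from the book or from the JDK). It looks one element ahead so that hasNext() can answer truthfully.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical decorator: passes through only the elements accepted by 'keep'.
class FilteringIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<? super T> keep;
    private T nextItem;
    private boolean hasNextItem;

    FilteringIterator(Iterator<T> source, Predicate<? super T> keep) {
        this.source = source;
        this.keep = keep;
        advance();
    }

    // Moves the wrapped iterator forward to the next accepted element, if any.
    private void advance() {
        hasNextItem = false;
        nextItem = null;
        while (source.hasNext()) {
            T candidate = source.next();
            if (keep.test(candidate)) {
                nextItem = candidate;
                hasNextItem = true;
                return;
            }
        }
    }

    @Override
    public boolean hasNext() {
        return hasNextItem;
    }

    @Override
    public T next() {
        if (!hasNextItem) {
            throw new NoSuchElementException();
        }
        T result = nextItem;
        advance();
        return result;
    }

    // Convenience helper for the demo: drains a filtered view into a list.
    static <T> List<T> filtered(Iterable<T> items, Predicate<? super T> keep) {
        List<T> out = new ArrayList<>();
        Iterator<T> it = new FilteringIterator<>(items.iterator(), keep);
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(filtered(Arrays.asList(1, 2, 3, 4, 5, 6), n -> n % 2 == 0));
    }
}
```

The client code only sees an Iterator, so it does not need to know whether it is reading from a list, a set, or a filtered view of either. The sketch does not override remove(), so that call falls back to the default, which throws UnsupportedOperationException.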
I stumbled on this question. The answer lies in the problems Iterator tries to solve:
access and traverse the elements of an aggregate object without exposing its representation
define traversal operations for an aggregate object without changing its interface

When to use for (int i =0; ; ) / for-each / iterator to go through a List

Is it a matter of preference to use the traditional for loop, the for-each loop or an iterator to go through a List?
1) for(MyClass mc : al){ // do something on mc }
or
2) iter = arrayList.iterator();
while(iter.hasNext()){MyClass mc = iter.next();}
For most iterations you should use the regular loop:
for (Object o : list) { /* */ }
It is much more readable, intent is clear, and potential bugs are kept to a minimum.
Use an iterator when you need explicit control over the iteration, for example, when you might want to start iteration all over again.
You can use iterators to avoid ConcurrentModificationExceptions.
iter = arrayList.iterator();
while(iter.hasNext()) {
MyClass mc = iter.next();
if(shouldItBeRemoved(mc)) {
iter.remove(); // Will not throw ConcurrentModificationException
// arrayList.remove(mc); // Will throw CME
}
}
That said, I find the for-each loop more readable, so use it whenever you do not modify the list in the loop.
My preference is
1) If I need to move forward through the list without any modification to the List object, for readable and clean code, I will use:
for(MyClass mc : list){
/* code without modification to list */
}
2) If I need modification to the List object, no doubt I will use:
iter = list.iterator();
while(iter.hasNext()) {
MyClass mc = iter.next();
/* code without modification to list */
/* code with modification to list */
}
Additional Information:
Iterator will be useful if you need to create a utility method that can traverse multiple type of collection (e.g. ArrayList, LinkedList, HashSet, TreeSet, LinkedHashSet)
public class Example {
public static void iterateAndDoSomething(Iterator<MyClass> iter) {
while(iter.hasNext()) {
MyClass mc = iter.next();
/* code without modification to list */
/* code with modification to list */
}
}
public static void main(String[] args) {
ArrayList<MyClass> als = new ArrayList<MyClass>();
TreeSet<MyClass> tss = new TreeSet<MyClass>();
iterateAndDoSomething(als.iterator());
iterateAndDoSomething(tss.iterator());
}
}
Some classes don't have an iterator() method (such as NodeList); then you have to use the indexed for loop. Other than that, it's a matter of preference I think.
I use for (i=0; ...) if I need the index during the loop, and an iterator() if it's the only thing possible or I need to remove() elements while iterating. For all other cases I use the shortened for loop for(MyClass mc : al) because of its readability.
Traditional loops (indexed based) are useful where you need the index to manipulate the array.
If you don't care about the indexes and your interest is only in getting the values out of the array, the for-each loop is the best fit.
Some Collection objects don't provide a way to get the values by index; in that case iterator() is the only option.
Based on the code snippet you provided, it should be obvious that for-each like iteration produces cleaner code.
With the presence of both, you do have the flexibility on what you would like to choose.
The for-each-loop isn't available on Java 1.4, so this might be eliminated if you need to support Java <1.5.
The other two choices are a matter of use case and style I think. Iterators usually look cleaner, but you might also need to have a counter variable, so a for-loop might fit your needs better. Additionally some list implementations do not provide iterators, so you will have to use an index.
I have read that iterators are very helpful in Swing especially if you iterate through collection in paintComponent(Graphics g) method.
The benefit of iterators is that you can iterate through the same collection from multiple threads and you even remove an element of that collection using the method remove() while the collection is accessed concurrently elsewhere.
The behavior of an iterator is unspecified if the underlying
collection is modified while the iteration is in progress in any way
other than by calling this method.
THIS means that if you modify the same collection concurrently then behavior of method remove is not defined. BUT method remove works well EVEN IF you traverse through the same collection concurrently while calling iterator.remove()!!! I have used this in my GUI.
According to my experience it is necessary to use an iterator in the method paintComponent rather than cycle for or for-each!

Enhanced for (or 'for each') loop iterating to element it just removed - throws error

I could not find anything on this and am wondering if anyone knows about it or a possible workaround. I am using JDOM and working with an XML schema.
I have created a List of Element objects, which are just XML tags. The algorithm's aim is to iterate through the List of elements and remove the element if a condition is met (in this case if it starts with a certain string). See below:
for (Element appinfo : appinfos) {
if (appinfo.getText().startsWith(
PARAMETER_DESCRIPTION_APPINFO)) {
removeAppInfoElement(appinfo, name, appinfo.getText());
}
}
However, the loop appears to be attempting to iterate to the element it just removed.
Does anyone see anything wrong with this? Do I need to abandon the enhanced for loop or dig deeper for cause of problem?
I suppose you're talking about ConcurrentModificationException. Try to use iterator instead.
Yes, that won't work.
Add all the items you want to remove to a new collection and then do a removeAll with those elements on the original collection.
You cannot remove elements from a Collection directly as you iterate over it - this causes issues because the Iterator has no idea that the element has been removed.
Instead of the enhanced for-loop, use the Iterator directly and call the remove() function, for example:
for (Iterator<Element> it = appinfos.iterator(); it.hasNext();) {
Element appinfo = it.next();
if (someCondition) {
it.remove();
}
}
willcodejavaforfood's answer is one way of doing this.
An alternative, which may be better or worse depending on style and what else you want to do in the loop, is to get the Iterator explicitly and use its remove method:
final Iterator<Element> iter = appinfos.iterator();
while (iter.hasNext()) {
if (iter.next().getText().startsWith(
PARAMETER_DESCRIPTION_APPINFO)) {
iter.remove();
}
}
This of course only works if a simple removal from the collection is what you want to do. When invoking potentially complex methods that will directly remove from the underlying collection, the best approach is to take a copy of the collection initially, then iterate over this copy.
In all cases, modifying a collection while you are iterating over it will generally cause Bad Things to happen.
