Remove elements from collection while iterating - java

AFAIK, there are two approaches:
Iterate over a copy of the collection
Use the iterator of the actual collection
For instance,
List<Foo> fooListCopy = new ArrayList<Foo>(fooList);
for(Foo foo : fooListCopy){
// modify actual fooList
}
and
Iterator<Foo> itr = fooList.iterator();
while(itr.hasNext()){
// modify actual fooList using itr.remove()
}
Are there any reasons to prefer one approach over the other (e.g. preferring the first approach for the simple reason of readability)?

Let me give a few examples with some alternatives to avoid a ConcurrentModificationException.
Suppose we have the following collection of books
List<Book> books = new ArrayList<Book>();
books.add(new Book(new ISBN("0-201-63361-2")));
books.add(new Book(new ISBN("0-201-63361-3")));
books.add(new Book(new ISBN("0-201-63361-4")));
Collect and Remove
The first technique consists in collecting all the objects that we want to delete (e.g. using an enhanced for loop) and after we finish iterating, we remove all found objects.
ISBN isbn = new ISBN("0-201-63361-2");
List<Book> found = new ArrayList<Book>();
for(Book book : books){
if(book.getIsbn().equals(isbn)){
found.add(book);
}
}
books.removeAll(found);
This is supposing that the operation you want to do is "delete".
If you want to "add" this approach would also work, but I would assume you would iterate over a different collection to determine what elements you want to add to a second collection and then issue an addAll method at the end.
Using ListIterator
If you are working with lists, another technique consists in using a ListIterator which has support for removal and addition of items during the iteration itself.
ListIterator<Book> iter = books.listIterator();
while(iter.hasNext()){
if(iter.next().getIsbn().equals(isbn)){
iter.remove();
}
}
Again, I used the "remove" method in the example above which is what your question seemed to imply, but you may also use its add method to add new elements during iteration.
Using JDK >= 8
For those working with Java 8 or superior versions, there are a couple of other techniques you could use to take advantage of it.
You could use the new removeIf method in the Collection base class:
ISBN other = new ISBN("0-201-63361-2");
books.removeIf(b -> b.getIsbn().equals(other));
Or use the new stream API:
ISBN other = new ISBN("0-201-63361-2");
List<Book> filtered = books.stream()
.filter(b -> b.getIsbn().equals(other))
.collect(Collectors.toList());
In this last case, to filter elements out of a collection, you reassign the original reference to the filtered collection (i.e. books = filtered) or used the filtered collection to removeAll the found elements from the original collection (i.e. books.removeAll(filtered)).
Use Sublist or Subset
There are other alternatives as well. If the list is sorted, and you want to remove consecutive elements you can create a sublist and then clear it:
books.subList(0,5).clear();
Since the sublist is backed by the original list this would be an efficient way of removing this subcollection of elements.
Something similar could be achieved with sorted sets using NavigableSet.subSet method, or any of the slicing methods offered there.
Considerations:
What method you use might depend on what you are intending to do
The collect and removeAl technique works with any Collection (Collection, List, Set, etc).
The ListIterator technique obviously only works with lists, provided that their given ListIterator implementation offers support for add and remove operations.
The Iterator approach would work with any type of collection, but it only supports remove operations.
With the ListIterator/Iterator approach the obvious advantage is not having to copy anything since we remove as we iterate. So, this is very efficient.
The JDK 8 streams example don't actually removed anything, but looked for the desired elements, and then we replaced the original collection reference with the new one, and let the old one be garbage collected. So, we iterate only once over the collection and that would be efficient.
In the collect and removeAll approach the disadvantage is that we have to iterate twice. First we iterate in the foor-loop looking for an object that matches our removal criteria, and once we have found it, we ask to remove it from the original collection, which would imply a second iteration work to look for this item in order to remove it.
I think it is worth mentioning that the remove method of the Iterator interface is marked as "optional" in Javadocs, which means that there could be Iterator implementations that throw UnsupportedOperationException if we invoke the remove method. As such, I'd say this approach is less safe than others if we cannot guarantee the iterator support for removal of elements.

Old Timer Favorite (it still works):
List<String> list;
for(int i = list.size() - 1; i >= 0; --i)
{
if(list.get(i).contains("bad"))
{
list.remove(i);
}
}
Benefits:
It only iterates over the list once
No extra objects created, or other unneeded complexity
No problems with trying to use the index of a removed item, because... well, think about it!

In Java 8, there is another approach. Collection#removeIf
eg:
List<Integer> list = new ArrayList<>();
list.add(1);
list.add(2);
list.add(3);
list.removeIf(i -> i > 2);

Are there any reasons to prefer one approach over the other
The first approach will work, but has the obvious overhead of copying the list.
The second approach will not work because many containers don't permit modification during iteration. This includes ArrayList.
If the only modification is to remove the current element, you can make the second approach work by using itr.remove() (that is, use the iterator's remove() method, not the container's). This would be my preferred method for iterators that support remove().

Only second approach will work. You can modify collection during iteration using iterator.remove() only. All other attempts will cause ConcurrentModificationException.

You can't do the second, because even if you use the remove() method on Iterator, you'll get an Exception thrown.
Personally, I would prefer the first for all Collection instances, despite the additional overheard of creating the new Collection, I find it less prone to error during edit by other developers. On some Collection implementations, the Iterator remove() is supported, on other it isn't. You can read more in the docs for Iterator.
The third alternative, is to create a new Collection, iterate over the original, and add all the members of the first Collection to the second Collection that are not up for deletion. Depending on the size of the Collection and the number of deletes, this could significantly save on memory, when compared to the first approach.

I would choose the second as you don't have to do a copy of the memory and the Iterator works faster. So you save memory and time.

You can see this sample; If we think remove odd value from a list:
public static void main(String[] args) {
Predicate<Integer> isOdd = v -> v % 2 == 0;
List<Integer> listArr = Arrays.asList(5, 7, 90, 11, 55, 60);
listArr = listArr.stream().filter(isOdd).collect(Collectors.toList());
listArr.forEach(System.out::println);
}

use Iterator to remove object from collection other wise get

why not this?
for( int i = 0; i < Foo.size(); i++ )
{
if( Foo.get(i).equals( some test ) )
{
Foo.remove(i);
}
}
And if it's a map, not a list, you can use keyset()

Related

how to remove data from ArrayList [duplicate]

This question already has answers here:
Iterating through a Collection, avoiding ConcurrentModificationException when removing objects in a loop
(31 answers)
Closed 8 years ago.
In Java, is it legal to call remove on a collection when iterating through the collection using a foreach loop? For instance:
List<String> names = ....
for (String name : names) {
// Do something
names.remove(name).
}
As an addendum, is it legal to remove items that have not been iterated over yet? For instance,
//Assume that the names list as duplicate entries
List<String> names = ....
for (String name : names) {
// Do something
while (names.remove(name));
}
To safely remove from a collection while iterating over it you should use an Iterator.
For example:
List<String> names = ....
Iterator<String> i = names.iterator();
while (i.hasNext()) {
String s = i.next(); // must be called before you can call i.remove()
// Do something
i.remove();
}
From the Java Documentation :
The iterators returned by this class's iterator and listIterator
methods are fail-fast: if the list is structurally modified at any
time after the iterator is created, in any way except through the
iterator's own remove or add methods, the iterator will throw a
ConcurrentModificationException. Thus, in the face of concurrent
modification, the iterator fails quickly and cleanly, rather than
risking arbitrary, non-deterministic behavior at an undetermined time
in the future.
Perhaps what is unclear to many novices is the fact that iterating over a list using the for/foreach constructs implicitly creates an iterator which is necessarily inaccessible. This info can be found here
You don't want to do that. It can cause undefined behavior depending on the collection. You want to use an Iterator directly. Although the for each construct is syntactic sugar and is really using an iterator, it hides it from your code so you can't access it to call Iterator.remove.
The behavior of an iterator is
unspecified if the underlying
collection is modified while the
iteration is in progress in any way
other than by calling this method.
Instead write your code:
List<String> names = ....
Iterator<String> it = names.iterator();
while (it.hasNext()) {
String name = it.next();
// Do something
it.remove();
}
Note that the code calls Iterator.remove, not List.remove.
Addendum:
Even if you are removing an element that has not been iterated over yet, you still don't want to modify the collection and then use the Iterator. It might modify the collection in a way that is surprising and affects future operations on the Iterator.
for (String name : new ArrayList<String>(names)) {
// Do something
names.remove(nameToRemove);
}
You clone the list names and iterate through the clone while you remove from the original list. A bit cleaner than the top answer.
The java design of the "enhanced for loop" was to not expose the iterator to code, but the only way to safely remove an item is to access the iterator. So in this case you have to do it old school:
for(Iterator<String> i = names.iterator(); i.hasNext();) {
String name = i.next();
//Do Something
i.remove();
}
If in the real code the enhanced for loop is really worth it, then you could add the items to a temporary collection and call removeAll on the list after the loop.
EDIT (re addendum): No, changing the list in any way outside the iterator.remove() method while iterating will cause problems. The only way around this is to use a CopyOnWriteArrayList, but that is really intended for concurrency issues.
The cheapest (in terms of lines of code) way to remove duplicates is to dump the list into a LinkedHashSet (and then back into a List if you need). This preserves insertion order while removing duplicates.
I didn't know about iterators, however here's what I was doing until today to remove elements from a list inside a loop:
List<String> names = ....
for (i=names.size()-1;i>=0;i--) {
// Do something
names.remove(i);
}
This is always working, and could be used in other languages or structs not supporting iterators.
Yes you can use the for-each loop,
To do that you have to maintain a separate list to hold removing items and then remove that list from names list using removeAll() method,
List<String> names = ....
// introduce a separate list to hold removing items
List<String> toRemove= new ArrayList<String>();
for (String name : names) {
// Do something: perform conditional checks
toRemove.add(name);
}
names.removeAll(toRemove);
// now names list holds expected values
Make sure this is not code smell. Is it possible to reverse the logic and be 'inclusive' rather than 'exclusive'?
List<String> names = ....
List<String> reducedNames = ....
for (String name : names) {
// Do something
if (conditionToIncludeMet)
reducedNames.add(name);
}
return reducedNames;
The situation that led me to this page involved old code that looped through a List using indecies to remove elements from the List. I wanted to refactor it to use the foreach style.
It looped through an entire list of elements to verify which ones the user had permission to access, and removed the ones that didn't have permission from the list.
List<Service> services = ...
for (int i=0; i<services.size(); i++) {
if (!isServicePermitted(user, services.get(i)))
services.remove(i);
}
To reverse this and not use the remove:
List<Service> services = ...
List<Service> permittedServices = ...
for (Service service:services) {
if (isServicePermitted(user, service))
permittedServices.add(service);
}
return permittedServices;
When would "remove" be preferred? One consideration is if gien a large list or expensive "add", combined with only a few removed compared to the list size. It might be more efficient to only do a few removes rather than a great many adds. But in my case the situation did not merit such an optimization.
Those saying that you can't safely remove an item from a collection except through the Iterator aren't quite correct, you can do it safely using one of the concurrent collections such as ConcurrentHashMap.
Try this 2. and change the condition to "WINTER" and you will wonder:
public static void main(String[] args) {
Season.add("Frühling");
Season.add("Sommer");
Season.add("Herbst");
Season.add("WINTER");
for (String s : Season) {
if(!s.equals("Sommer")) {
System.out.println(s);
continue;
}
Season.remove("Frühling");
}
}
It's better to use an Iterator when you want to remove element from a list
because the source code of remove is
if (numMoved > 0)
System.arraycopy(elementData, index+1, elementData, index,
numMoved);
elementData[--size] = null;
so ,if you remove an element from the list, the list will be restructure ,the other element's index will be changed, this can result something that you want to happened.
Use
.remove() of Interator or
Use
CopyOnWriteArrayList

Iterator vs for

I was asked in an interview what is the advantage of using iterator over for loop or what is the advantage of using for loop over iterator?
Can any body please answer this?
First of all, there are 2 kinds of for loops, which behave very differently. One uses indices:
for (int i = 0; i < list.size(); i++) {
Thing t = list.get(i);
...
}
This kind of loop isn't always possible. For example, Lists have indices, but Sets don't, because they're unordered collections.
The other one, the foreach loop uses an Iterator behind the scenes:
for (Thing thing : list) {
...
}
This works with every kind of Iterable collection (or array)
And finally, you can use an Iterator, which also works with any Iterable:
for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
Thing t = it.next();
...
}
So you in fact have 3 loops to compare.
You can compare them in different terms: performance, readability, error-proneness, capability.
An Iterator can do things that a foreach loop can't. For example, you can remove elements while you're iterating, if the iterator supports it:
for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
Thing t = it.next();
if (shouldBeDeleted(thing) {
it.remove();
}
}
Lists also offer iterators that can iterate in both directions. A foreach loop only iterates from the beginning to an end.
But an Iterator is more dangerous and less readable. When a foreach loop is all you need, it's the most readable solution. With an iterator, you could do the following, which would be a bug:
for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
System.out.println(it.next().getFoo());
System.out.println(it.next().getBar());
}
A foreach loop doesn't allow for such a bug to happen.
Using indices to access elements is slightly more efficient with collections backed by an array. But if you change your mind and use a LinkedList instead of an ArrayList, suddenly the performance will be awful, because each time you access list.get(i), the linked list will have to loop though all its elements until the ith one. An Iterator (and thus the foreach loop) doesn't have this problem. It always uses the best possible way to iterate through elements of the given collection, because the collection itself has its own Iterator implementation.
My general rule of thumb is: use the foreach loop, unless you really need capabilities of an Iterator. I would only use for loop with indices with arrays, when I need access to the index inside the loop.
Iterator Advantage:
Ability to remove elements from Collections.
Ability to move forward and backward using next() and previous().
Ability to check if there more elements or not by using hasNext().
Loop was designed only to iterate over a Collection, so if you want just to iterate over a Collection, its better to use loop such as for-Each, but if you want more that that you could use Iterator.
The main difference between Iterator and the classic for loop, apart from the obvious one of having or not having access to the index of the item you're iterating, is that using Iterator abstracts the client code from the underlying collection implementation, allow me to elaborate.
When your code uses an iterator, either in this form
for(Item element : myCollection) { ... }
this form
Iterator<Item> iterator = myCollection.iterator();
while(iterator.hasNext()) {
Item element = iterator.next();
...
}
or this form
for(Iterator iterator = myCollection.iterator(); iterator.hasNext(); ) {
Item element = iterator.next();
...
}
What your code is saying is "I don't care about the type of collection and its implementation, I just care that I can iterate through its elements". Which is usually the better approach, since it makes your code more decoupled.
On the other hand, if you're using the classic for loop, as in
for(int i = 0; i < myCollection.size(); i++) {
Item element = myCollection.get(i);
...
}
Your code is saying, I need to know the type of collection, because I need to iterate through its elements in a specific way, I'm also possibly going to check for nulls or compute some result based on the order of iteration. Which makes your code more fragile, because if at any point the type of collection you receive changes, it will impact the way your code works.
Summing it up, the difference is not so much about speed, or memory usage, is more about decoupling your code so that is more flexible to cope with change.
if you access to data by number (e.g. "i"), it is fast when you use array. because it goes to element directly
But, other data structure (e.g. tree, list), it needs more time, because it start from first element to target element. when you use list. It needs time O(n). so, it is to be slow.
if you use iterator, compiler knows that where you are. so It needs O(1)
(because, it start from current position)
finally, if you use only array or data structure that support direct access(e.g. arraylist at java). "a[i]" is good. but, when you use other data structure, iterator is more efficient
Unlike other answers, I want to point another things;
if you need to perform the iteration in more than one place in your code, you will likely end up duplicating the logic. This clearly isn’t a very extensible approach. Instead, what’s needed is a way to separate the logic for selecting the data from the code that actually processes it.
An iterator solves these problems by providing a generic interface for looping over a set of data so that the underlying data structure or storage mechanism — such as an array- is hidden.
Iterator is a concept not an implementation.
An iterator provides a number of operations for traversing and accessing data.
An iterator may wrap any datastructure like array.
One of the more interesting and useful advantages of using iterators is the capability to wrap or decorate another iterator to filter the return values
An iterator may be thread safe while a for loop alone cannot be as it is accessing elements directly. The only popular thread-safety iterator is CopyOnWriteArrayList but it is well known and used often so worth mentioning.
This is from the book that it is https://www.amazon.com/Beginning-Algorithms-Simon-Harris/dp/0764596748
I stumbled on this question. The answer lies to the problems Iterator tries to solve:
access and traverse the elements of an aggregate object without exposing its representation
define traversal operations for an aggregate object without changing its interface

Filtering elements out of a List

Say you want to filter elements out of a List. Running this code triggers something like a concurrency exception as the list is being modified during the loop:
List<String> aList = new ArrayList<String>();
// Add instances to list here ...
for( String s : aList ){
if( "apple".equals( s ) ){ // Criteria
aList.remove(s);
}
}
What is the conventional way of doing this and what other approaches do you know of?
best way is to use iterator.remove()
For your case, you can simply use a solution without any iteration manually done (removeAll takes care about this):
aList.removeAll("apple");
It removes all "apple" elements from the list.
Agree with the above two if you're iterating or have a collection of simple elements. If your condition or objects are more involved, consider Apache Commons CollectionUtils' filter method, which allows you to define a predicate encapsulating your condition and which is then applied to each element of the collection. Given your collection aList and a predicate applePredicate, the method of invocation would then be:
org.apache.commons.collections.CollectionUtils.filter(aList, applePredicate);
http://commons.apache.org/collections/apidocs/org/apache/commons/collections/CollectionUtils.html#filter(java.util.Collection, org.apache.commons.collections.Predicate)

Iteration over a list (ConcurrentModificationException)

The following code throws a ConcurrentModificationException:
for (String word : choices) {
List<String> choicesCopy = choices;
chosen.add(word);
choicesCopy.remove(word);
subsets(choicesCopy, chosen, alreadyPrinted);
}
What's going on? The original list (choices) isn't modified at all.
You made a reference copy not object copy in here
List<String> choicesCopy = choices;
So obviously you are modifying the same list and you are bound to get the ConcurrentModificationException
Use Collections.copy() to properly make a copy of your list.
EDIT:
As suggested below you can also use constructor for copying.
The reason is because you cannot modify anything inside a foreach loop. Try using a for loop. Or you have take all the contents of list and add them 1 at a time to the other list. because its done by reference
Edit: You need to make a deep copy of the list and remove from that copy. Then you can assign the reference of the original list to point to the new one that has the modifications. You cannot remove from the list you're currently iterating through even if it's being referenced by another variable.
Change the code like this:
for (Iterator<String> it = choices.iterator(); it.hasnext();) {
String word = it.next();
chosen.add(word);
it.remove();
subsets(choicesCopy, chosen, alreadyPrinted);
}
Explanation: foreach loops use an iterator internally, but don't expose it to the user. So if you want to remove items you have to simulate the foreach loop and keep a reference to the iterator yourself.
While iterating, any other means of removing data from a collection will result in a ConcurrentModificationException.
I think the universal solution is:
List<E> obj = Collections.synchronizedList(new ArrayList<E>());
You'll need to copy the list properly e.g. Collections.copy and then remove from the copy, or use Iterator.remove, which will remove the Object from the underlying collection. Iterators are fail fast, so you can't change the underlying Collection without using the API of the Iterator.
I suspect chosen should be a copy as well. Otherwise chosen will accumulates all the words by the time the loop has finished. i.e. I suspect the chosen and choices shouldn't have any words in common.
I also suspect the collections should be sets (unordered collections without duplicates) instead of lists.
Perhaps the code should be.
Set<String> choices =
Set<String> chosen =
for (String word : choices) {
Set<String> choicesCopy = new LinkedHashSet<String>(choices);
choicesCopy.remove(word);
Set<String> chosenCopy = new LinkedHashSet<String>(chosen);
chosenCopy.add(word);
subsets(choicesCopy, chosenCopy, alreadyPrinted);
}

remove elements from CopyOnWriteArrayList

I am getting an exception when I try to remove elements from CopyOnWriteArrayList using an iterator.
I have noticed that it is documented
Element-changing operations on iterators themselves (remove, set, and add) are not supported. These methods throw UnsupportedOperationException.
(from http://download.oracle.com/javase/6/docs/api/java/util/concurrent/CopyOnWriteArrayList.html)
Now, surprisingly i can iterate it with foreach and use the remove() function . But then I get the famous bug - when trying to remove an item from a list using a for loop - you skip the element next to the removed element.
any suggestions then?
Iterate over the collection choosing all the elements you want to delete and putting those in a temporary collection. After you finish iteration remove all found elements from the original collection using method removeAll.
Would that work out for you? I mean, not sure if deletion logic is more complicated than that in your algorithm.
EDIT: I'm an idiot. I missed the fact that this is a copy-on-write list so every removal means a new copy. So my suggestions below are likely to be suboptimal if there's more than one removal.
Same as for any other list whose iterator doesn't support remove, or anything where you're not using an iterator. There are three basic techniques that come to mind to avoid this bug:
Decrement the index after removing something (being careful not to do anything with the index until the next iteration). For this you'll obviously have to use a for(int i=0; i < ... style of for loop, so that you can manipulate the index.
Somehow repeat what the inside of the loop is doing, without literally going back to the top of the loop. Bit of a hack - I would avoid this technique.
Iterate over the list in reverse (from end to start, instead of from start to end). I prefer this approach as it's the simplest.
Since this is a CopyOnWriteArrayList it is totally safe to remove elements while iterating with forEach. No need for fancy algorithms.
list.forEach(e -> {
if (shouldRemove(e))
list.remove(e);
});
EDIT: Well of course that works if you want to delete elements by reference, not by position.
Ususlly you would iterate first gathering elemenet to be deleted in a separate list then delete them outside the for each loop (which is disguised iterator based loop anyway)
Something like this:
int pos = 0;
while(pos < lst.size() ) {
Foo foo = lst.get(pos);
if( hasToBeRemoved(foo) ) {
lst.remove(pos);
// do not move position
} else {
pos++;
}
}
You could use Queue instead of List.
private Queue<Something> queue = new ConcurrentLinkedQueue<Something>();
It's thread safe and supports iterator.remove(). Be aware of the thread-safe behavior of Queue iterators, though (check the javadoc).
If you want to delete all use just clear(). If you want to keep elements put them in a temporary ArrayList and get them back from there.
List<Object> tKeepThese= new ArrayList<>();
for(ListIterator<Object> tIter = theCopyOnWriteArrayList; tIter.hasNext();)
{
tObject = tIter.next();
if(condition to keep element)
tKeepThese.add(tObject);
}
theCopyOnWriteArrayList.clear();
theCopyOnWriteArrayList.addAll(tKeepThese);
the shortest and most efficient way:
List<String> list = new CopyOnWriteArrayList<>();
list.removeIf(s -> s.length() < 1);
internally it creates an temporary array with the same length and copies all elements where the predicate returns true.
keep in mind that if you use this method to actually iterate over the elements to perform some action, these actions cannot be performed in paralell anymore since the removeIf-call is atomic and will lock the traversal for other threads
Below works fine with CopyOnWriteArrayList
for(String key : list) {
if (<some condition>) {
list.remove(key);
}
}

Categories