remove elements from CopyOnWriteArrayList - java

I am getting an exception when I try to remove elements from CopyOnWriteArrayList using an iterator.
I have noticed that this is documented:
Element-changing operations on iterators themselves (remove, set, and add) are not supported. These methods throw UnsupportedOperationException.
(from http://download.oracle.com/javase/6/docs/api/java/util/concurrent/CopyOnWriteArrayList.html)
Surprisingly, I can iterate over it with a for-each loop and call the list's remove() method. But then I hit the well-known bug: when you remove an item from a list inside a for loop, you skip the element right after the removed one.
Any suggestions?

Iterate over the collection, pick all the elements you want to delete, and put them in a temporary collection. After the iteration finishes, remove all the collected elements from the original collection with the removeAll method.
Would that work out for you? I mean, not sure if deletion logic is more complicated than that in your algorithm.
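The collect-then-removeAll idea above could be sketched roughly as follows. The class name CollectThenRemove and the helper removeMatching are made up for illustration. For a CopyOnWriteArrayList the single removeAll call means the backing array is copied once rather than once per removed element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Predicate;

class CollectThenRemove {
    // Hypothetical helper: gathers the matching elements first,
    // then removes them from the original list in one bulk call.
    static <T> void removeMatching(List<T> list, Predicate<? super T> shouldRemove) {
        List<T> toRemove = new ArrayList<>();
        for (T element : list) {              // for a CopyOnWriteArrayList this walks a snapshot
            if (shouldRemove.test(element)) {
                toRemove.add(element);
            }
        }
        list.removeAll(toRemove);             // one structural change instead of many
    }

    public static void main(String[] args) {
        List<String> list = new CopyOnWriteArrayList<>(Arrays.asList("a", "", "b", "", "c"));
        removeMatching(list, String::isEmpty);
        System.out.println(list);             // prints [a, b, c]
    }
}
```

Note that removeAll works by equals(), so it also removes equal elements that another thread added after the scan.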

EDIT: I'm an idiot. I missed the fact that this is a copy-on-write list so every removal means a new copy. So my suggestions below are likely to be suboptimal if there's more than one removal.
Same as for any other list whose iterator doesn't support remove, or anything where you're not using an iterator. There are three basic techniques that come to mind to avoid this bug:
Decrement the index after removing something (being careful not to do anything with the index until the next iteration). For this you'll obviously have to use a for(int i=0; i < ... style of for loop, so that you can manipulate the index.
Somehow repeat what the inside of the loop is doing, without literally going back to the top of the loop. Bit of a hack - I would avoid this technique.
Iterate over the list in reverse (from end to start, instead of from start to end). I prefer this approach as it's the simplest.
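As a rough sketch of the third technique (iterating in reverse), assuming a plain index-based loop: removing at index i only shifts elements that were already visited, so nothing is skipped. The names ReverseRemoval and removeMatching are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class ReverseRemoval {
    // Walks from the last index down to 0; a removal shifts only
    // elements at higher indices, which have already been checked.
    static <T> void removeMatching(List<T> list, Predicate<? super T> shouldRemove) {
        for (int i = list.size() - 1; i >= 0; i--) {
            if (shouldRemove.test(list.get(i))) {
                list.remove(i);               // remove(int index), not remove(Object)
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 2, 3, 4, 4));
        removeMatching(numbers, n -> n % 2 == 0);
        System.out.println(numbers);          // prints [1, 3]
    }
}
```

As the EDIT above points out, on a CopyOnWriteArrayList every remove(i) still copies the array, so this is best suited to lists with few removals.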

Since this is a CopyOnWriteArrayList it is totally safe to remove elements while iterating with forEach. No need for fancy algorithms.
list.forEach(e -> {
if (shouldRemove(e))
list.remove(e);
});
EDIT: Well of course that works if you want to delete elements by reference, not by position.

Usually you would iterate first, gathering the elements to be deleted in a separate list, and then delete them outside the for-each loop (which is an iterator-based loop in disguise anyway).

Something like this:
int pos = 0;
while(pos < lst.size() ) {
Foo foo = lst.get(pos);
if( hasToBeRemoved(foo) ) {
lst.remove(pos);
// do not move position
} else {
pos++;
}
}

You could use Queue instead of List.
private Queue<Something> queue = new ConcurrentLinkedQueue<Something>();
It's thread safe and supports iterator.remove(). Be aware of the thread-safe behavior of Queue iterators, though (check the javadoc).
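A minimal sketch of the queue-based alternative, assuming the elements are Strings and that blank strings are the ones to drop (both are assumptions for illustration, as are the names QueueRemoval and removeBlank):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class QueueRemoval {
    // Unlike CopyOnWriteArrayList, the iterator of ConcurrentLinkedQueue supports remove().
    static void removeBlank(Queue<String> queue) {
        for (Iterator<String> it = queue.iterator(); it.hasNext(); ) {
            if (it.next().isEmpty()) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        Queue<String> queue = new ConcurrentLinkedQueue<>(Arrays.asList("a", "", "b"));
        removeBlank(queue);
        System.out.println(new ArrayList<>(queue)); // prints [a, b]
    }
}
```

The iterator is weakly consistent, so elements added by other threads during the traversal may or may not be seen. ConcurrentLinkedQueue also rejects null elements and has no index-based access.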

If you want to delete all use just clear(). If you want to keep elements put them in a temporary ArrayList and get them back from there.
List<Object> tKeepThese= new ArrayList<>();
for(ListIterator<Object> tIter = theCopyOnWriteArrayList.listIterator(); tIter.hasNext();)
{
Object tObject = tIter.next();
if(condition to keep element)
tKeepThese.add(tObject);
}
theCopyOnWriteArrayList.clear();
theCopyOnWriteArrayList.addAll(tKeepThese);

the shortest and most efficient way:
List<String> list = new CopyOnWriteArrayList<>();
list.removeIf(s -> s.length() < 1);
Internally it creates a temporary array of the same length and copies over all elements for which the predicate returns false, i.e. the elements to keep.
Keep in mind that if you use this method to actually iterate over the elements and perform some action, those actions can no longer run in parallel: the removeIf call is atomic and holds the list's lock, so other threads that want to modify the list are blocked until it finishes.

Below works fine with CopyOnWriteArrayList
for(String key : list) {
if (<some condition>) {
list.remove(key);
}
}

Related

Java - Add elements to an ArrayList while browsing it

The problem is that I have this method createNode() that creates a node in a tree, and then if it's a leave node it adds it into an ArrayList<Tree> treeLeaves, and I make the call of this method while browsing the treeLeaves ArrayList like this :
Iterator<Tree> iter = treeLeaves.iterator();
while (iter.hasNext()) {
iter.next().createNode();
}
Or like this :
for (Tree cursor : treeLeaves) {
cursor.createNode();
}
But I keep having this exception :
Exception in thread "main" java.util.ConcurrentModificationException
Even when I put the code above in a synchronized(treeLeaves){} block.
P.S.: I don't know if this is useful or not, but it's an n-ary tree.
You need a concurrent list...
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/CopyOnWriteArrayList.html
Read: Is there a concurrent List in Java's JDK?
Also, you can't change an ArrayList while browsing it. Instead you can use a buffer in temporary memory to edit the current list if you like.
For an ArrayList, you could avoid the iterator by just using an index:
for (int i = 0; i < treeLeaves.size(); i++) {
Tree current = treeLeaves.get(i);
// your code
}
As long as the only thing you do is append to the end of the array, and you don't insert in the middle or at the beginning or delete any items, this will work. treeLeaves.size() will be recomputed every time you go through the loop, which means that if you append to the end, the size will be recomputed and i will get to the new items. Yes, using an old-fashioned loop isn't as "cool" as an iterator, but it works.
I don't recommend using this for any kind of List other than an ArrayList, because in general, get(i) will have to start from the beginning of the list and step through each element (unless the runtime optimizes the case where you're using get(i+1) after get(i), which wouldn't be too hard, but I don't know whether the implementations do that). For an ArrayList, however, get(i) should take constant time.
This is because in Java, once an Iterator has been created, you cannot modify the underlying data structure while you are still using it. The enhanced for loop "for (Tree cursor : treeLeaves)" uses an Iterator.
As Ya stated, "Also, you can't change an ArrayList while browsing it. Instead you can use a buffer in temporary memory to edit the current list if you like."
In brief, in most cases, if you change the structure of a collection, all still-open iterators become invalid. (There are quite a few exceptions, such as the concurrent collections or modifications made through the iterator itself.)
Your problem can be demonstrated easily by:
List<Node> treeLeaves = new ArrayList<>();
//... add something to treeLeaves
for (Node leaf : treeLeaves) {
treeLeaves.add(new Node());
}
For your case, it can be easily solved by creating a new collection to iterate:
List<Node> treeLeaves = new ArrayList<>();
//... add something to treeLeaves
List<Node> tempLeaves = new ArrayList<>(treeLeaves);
for (Node leaf : tempLeaves) {
treeLeaves.add(new Node());
}

What is the best way to iterate over list

I have worked quite a lot with collections, but I still have a few doubts.
I am aware that we can iterate over a list with an iterator.
Another way is to go through it as below:
for(int i=0; i<list.size(); i++){
list.get(i);
}
Here I think there is a problem: each iteration calls list.size(), which I assume walks the whole structure and hurts performance.
I thought of another solution as well:
int s = list.size();
for(int i=0; i<s; i++){
list.get(i);
}
I think this can solve the problem. I don't have much experience with threads, so I am not sure whether this is the right approach.
Another way I thought of is:
for (Object obj : list){
}
With this newer for loop, I think the compiler again checks the size of the list.
Please suggest the best of these solutions, or an alternative approach that performs well. Thank you for your help.
Calling size() at each iteration is not really a problem. This operation is O(1) for all the collections I know of: size() simply returns the value of a field of the list, holding its size.
The main problem with the first way is the repeated call to get(i). This operation is O(1) for an ArrayList, but O(n) for a LinkedList, making the whole iteration O(n^2) instead of O(n): get(i) forces the list to start from the first element (or the last one) and step from node to node until it reaches the i-th element.
Using an iterator, or using a foreach loop (which internally uses an iterator), guarantees that the most appropriate way of iterating is used, because the iterator knows about how the list is implemented and how best go from one element to the next.
BTW, this is also the only way to iterate through non-indexed collections, like Sets. So you'd better get used to that kind of loop.
For your example, the best way is:
for (Object obj: list){
}
It is the same as this in Java versions before 1.5:
for (Iterator it = hs.iterator() ; it.hasNext() ; ){}
It uses the collection's iterator, so you don't actually need the size of the collection. The .size() method does not build any tree, but .get() may have to loop up to the requested element. How .get() and .size() behave depends on the List implementation; in an ArrayList, .get() is O(1), not O(n).
UPDATE
In java 8 you can use:
myList.forEach(elem -> {
//do something
});
The best way to iterate over the list in terms of performance is to use iterators (your last approach, the for-each loop).
If you are using list.get(i), its performance depends on the implementation of the list. For ArrayList, list.get(i) is O(1), whereas it is O(n) for LinkedList.
Also, list.size() is O(1) and should not have any impact over the performance.
for (Object obj: list){
}
Above code for me is the best way, it is clean and can be read easily.
The forEach in Java 8 is nice too.

Fastest way to remove the null elements in a list

Is there a quick way of removing the null elements from a linked list?
The only way I know is to iterate over the elements and remove the null ones.
I don't see how multithreading would help-- linked list doesn't have direct access to its members.
////////////////////////////////
EDIT: One way I could think of is putting the elements into a set and then back into the list again. This wouldn't retain the order (and would also drop duplicates), but would it work otherwise?
Unless you have an iterator pointing to the middle of the list, multithreading is not going to help* . A simple iteration with a ListIterator<T> should do the trick:
ListIterator<String> iter = list.listIterator();
while (iter.hasNext()) {
if (iter.next() == null) {
iter.remove();
}
}
* That is before we take into consideration that the linked list is not thread-safe without external synchronization.
What do you mean by fastest? The least number of operations is probably just going through one-by-one. If you use multithreading you would likely have to either chop up the list and put it back together or get the indexes and then go through the list in reverse and remove each one manually.
while(list.remove(null)); would work, but might be slow on a large list, because every call scans the list from the start again.
Edit: In Java 8 you can do list.parallelStream().filter(e -> e != null). If you need the result as a linked list again, you can do new LinkedList<>(Arrays.asList(stream.toArray())).
This is a pretty fast way. In the worst case it will be slower than just traversing once, but in most cases it will be faster. Note that sorting changes the order of the elements.
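For comparison, here is a rough Java 8 sketch of two single-pass alternatives. Neither comes from the answers above: removeIf(Objects::isNull) removes in place, and Collectors.toCollection builds a new LinkedList directly from a stream. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class DropNulls {
    // In-place variant: one pass over the list, removing nulls as it goes.
    static <T> void removeNullsInPlace(List<T> list) {
        list.removeIf(Objects::isNull);
    }

    // Copying variant: leaves the input untouched and returns a new LinkedList.
    static <T> LinkedList<T> withoutNulls(List<T> list) {
        return list.stream()
                   .filter(Objects::nonNull)
                   .collect(Collectors.toCollection(LinkedList::new));
    }

    public static void main(String[] args) {
        List<String> list = new LinkedList<>(Arrays.asList("a", null, "b", null));
        System.out.println(withoutNulls(list)); // prints [a, b]
        removeNullsInPlace(list);
        System.out.println(list);               // prints [a, b]
    }
}
```

Both keep the original order of the remaining elements.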
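List<Integer> a = new ArrayList<>();
// nullsLast moves every null to the end; a plain Collections.sort(a) throws NullPointerException on null elements
a.sort(Comparator.nullsLast(Comparator.naturalOrder()));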
for(int i=a.size()-1; i>-1; i--)
{
if(a.get(i)==null)
a.remove(i);
else
break;
}

Iterator vs for

I was asked in an interview what the advantage of using an iterator over a for loop is, and what the advantage of using a for loop over an iterator is.
Can anybody please answer this?
First of all, there are 2 kinds of for loops, which behave very differently. One uses indices:
for (int i = 0; i < list.size(); i++) {
Thing t = list.get(i);
...
}
This kind of loop isn't always possible. For example, Lists have indices, but Sets don't, because they're unordered collections.
The other one, the foreach loop uses an Iterator behind the scenes:
for (Thing thing : list) {
...
}
This works with every kind of Iterable collection (or array)
And finally, you can use an Iterator, which also works with any Iterable:
for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
Thing t = it.next();
...
}
So you in fact have 3 loops to compare.
You can compare them in different terms: performance, readability, error-proneness, capability.
An Iterator can do things that a foreach loop can't. For example, you can remove elements while you're iterating, if the iterator supports it:
for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
Thing t = it.next();
if (shouldBeDeleted(t)) {
it.remove();
}
}
Lists also offer iterators that can iterate in both directions. A foreach loop only iterates from the beginning to an end.
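A small sketch of the two-directional iteration mentioned above, using ListIterator positioned at the end of the list and walking backwards. The class name BothDirections and the method reversedCopy are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

class BothDirections {
    // Starts the ListIterator after the last element and walks backwards with previous().
    static <T> List<T> reversedCopy(List<T> list) {
        List<T> result = new ArrayList<>();
        ListIterator<T> it = list.listIterator(list.size());
        while (it.hasPrevious()) {
            result.add(it.previous());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(reversedCopy(Arrays.asList(1, 2, 3))); // prints [3, 2, 1]
    }
}
```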
But an Iterator is more dangerous and less readable. When a foreach loop is all you need, it's the most readable solution. With an iterator, you could do the following, which would be a bug:
for (Iterator<Thing> it = list.iterator(); it.hasNext(); ) {
System.out.println(it.next().getFoo());
System.out.println(it.next().getBar());
}
A foreach loop doesn't allow for such a bug to happen.
Using indices to access elements is slightly more efficient with collections backed by an array. But if you change your mind and use a LinkedList instead of an ArrayList, suddenly the performance will be awful, because each time you call list.get(i), the linked list has to walk through its elements until it reaches the i-th one. An Iterator (and thus the foreach loop) doesn't have this problem. It always uses the best possible way to iterate through the elements of the given collection, because the collection itself provides its own Iterator implementation.
My general rule of thumb is: use the foreach loop, unless you really need capabilities of an Iterator. I would only use for loop with indices with arrays, when I need access to the index inside the loop.
Iterator Advantage:
Ability to remove elements from Collections.
Ability to move forward and backward using next() and previous().
Ability to check whether there are more elements by using hasNext().
A loop is designed only to iterate over a Collection, so if you just want to iterate over a Collection, it's better to use a loop such as for-each. If you want more than that, you can use an Iterator.
The main difference between an Iterator and the classic for loop, apart from the obvious one of having or not having access to the index of the item you're iterating over, is that using an Iterator abstracts the client code from the underlying collection implementation. Allow me to elaborate.
When your code uses an iterator, either in this form
for(Item element : myCollection) { ... }
this form
Iterator<Item> iterator = myCollection.iterator();
while(iterator.hasNext()) {
Item element = iterator.next();
...
}
or this form
for(Iterator<Item> iterator = myCollection.iterator(); iterator.hasNext(); ) {
Item element = iterator.next();
...
}
What your code is saying is "I don't care about the type of collection and its implementation, I just care that I can iterate through its elements". Which is usually the better approach, since it makes your code more decoupled.
On the other hand, if you're using the classic for loop, as in
for(int i = 0; i < myCollection.size(); i++) {
Item element = myCollection.get(i);
...
}
Your code is saying, I need to know the type of collection, because I need to iterate through its elements in a specific way, I'm also possibly going to check for nulls or compute some result based on the order of iteration. Which makes your code more fragile, because if at any point the type of collection you receive changes, it will impact the way your code works.
Summing up, the difference is not so much about speed or memory usage; it is more about decoupling your code so that it copes with change more flexibly.
If you access data by index (e.g. "i"), it is fast with an array, because it goes to the element directly.
But with other data structures (e.g. a tree or a linked list) it takes more time, because the lookup starts from the first element and walks to the target element. For a linked list that takes O(n) time, so it is slow.
If you use an iterator, the iterator knows where you are, so moving to the next element takes O(1)
(because it starts from the current position).
Finally, if you only use arrays or data structures that support direct access (e.g. ArrayList in Java), "a[i]" is good. With other data structures, an iterator is more efficient.
Unlike the other answers, I want to point out something else:
If you need to perform the iteration in more than one place in your code, you will likely end up duplicating the logic. This clearly isn’t a very extensible approach. Instead, what’s needed is a way to separate the logic for selecting the data from the code that actually processes it.
An iterator solves these problems by providing a generic interface for looping over a set of data, so that the underlying data structure or storage mechanism (such as an array) is hidden.
Iterator is a concept not an implementation.
An iterator provides a number of operations for traversing and accessing data.
An iterator may wrap any datastructure like array.
One of the more interesting and useful advantages of using iterators is the capability to wrap or decorate another iterator to filter the return values
An iterator may be thread safe, while a for loop alone cannot be, as it accesses elements directly. The only popular thread-safe iterator is the one provided by CopyOnWriteArrayList, but it is well known and often used, so it is worth mentioning.
This is from the book Beginning Algorithms: https://www.amazon.com/Beginning-Algorithms-Simon-Harris/dp/0764596748
I stumbled on this question. The answer lies in the problems Iterator tries to solve:
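The point above about wrapping or decorating another iterator to filter its values could look roughly like this. It is a minimal sketch, not code from the book; the class FilteringIterator and its fields are made up for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Decorates any Iterator and only hands out the elements accepted by the predicate.
class FilteringIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<? super T> accept;
    private T nextElement;
    private boolean hasNextElement;

    FilteringIterator(Iterator<T> source, Predicate<? super T> accept) {
        this.source = source;
        this.accept = accept;
        advance();
    }

    // Looks ahead in the wrapped iterator for the next accepted element.
    private void advance() {
        hasNextElement = false;
        nextElement = null;
        while (source.hasNext()) {
            T candidate = source.next();
            if (accept.test(candidate)) {
                nextElement = candidate;
                hasNextElement = true;
                return;
            }
        }
    }

    @Override
    public boolean hasNext() {
        return hasNextElement;
    }

    @Override
    public T next() {
        if (!hasNextElement) {
            throw new NoSuchElementException();
        }
        T result = nextElement;
        advance();
        return result;
    }

    public static void main(String[] args) {
        Iterator<Integer> evens =
                new FilteringIterator<Integer>(Arrays.asList(1, 2, 3, 4, 5, 6).iterator(), n -> n % 2 == 0);
        while (evens.hasNext()) {
            System.out.println(evens.next()); // prints 2, 4 and 6 on separate lines
        }
    }
}
```

The calling code only sees an Iterator, so it does not need to know that filtering happens or what the underlying collection is.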
access and traverse the elements of an aggregate object without exposing its representation
define traversal operations for an aggregate object without changing its interface

removing duplicates linkedlist concurrent modification exception

I have a linkedlist where each element has key and value(ArrayList<dataStructure>). I want to merge the elements having same key.
Iterator<CElem> oItr = linkedList.iterator();
{
while (oItr.hasNext())
{
CElem outer = oItr.next();
Iterator<CElem> iItr = linkedList.iterator();
{
while (iItr.hasNext())
{
CElem inner = iItr.next();
if (outer.equals(inner))
continue;
if (outer.getKey().equals(inner.getKey()))
{
outer.getValues().addAll(inner.getValues());
iItr.remove();
}
}
}
}
}
Even though I am using the iterator's remove method, I am getting a java.util.ConcurrentModificationException. What should I change to get rid of it?
You remove an element with one of your iterators, so the other one does not know about this removal and throws ConcurrentModificationException.
BTW:
you should consider using a multimap instead of a list of key-value pairs
Add the elements you want to remove in another List and then loop in that list at the end to remove these elements.
Alternatively, use a Map/Set.
Both your iterators are traversing the linked list
Iterator<CElem> oItr = linkedList.iterator();
....
Iterator<CElem> iItr = linkedList.iterator();
probably iItr should be for the inner array list?
UPDATE: Scratch the above answer, I misread the question. The challenge though is that you have two iterators traversing the list, so while you use one iterator's remove() method, the other still detects the concurrent modification.
Normally, to remove duplicates from a list, you can just run them through a Set (e.g. a HashSet) but that won't work for you as it's only key duplication, not the entire member of the list.
I'd take an approach where I try to find and capture the duplicated keys and their values in a separate list and then merge in and remove the duplicates as a separate step.
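One way to sketch the "merge as a separate step" idea is to group elements by key in a LinkedHashMap, which keeps first-seen order and avoids nested iterators entirely. The CElem class below is a hypothetical stand-in for the question's type (the real one is not shown), with String keys and values assumed for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

class MergeByKey {
    // Hypothetical stand-in for the question's CElem: a key plus a list of values.
    static class CElem {
        private final String key;
        private final List<String> values;

        CElem(String key, List<String> values) {
            this.key = key;
            this.values = new ArrayList<>(values);
        }

        String getKey() { return key; }
        List<String> getValues() { return values; }
    }

    // Keeps the first element seen for each key and appends the values of later duplicates to it.
    static LinkedList<CElem> merge(List<CElem> input) {
        Map<String, CElem> byKey = new LinkedHashMap<>();
        for (CElem elem : input) {
            CElem first = byKey.putIfAbsent(elem.getKey(), elem);
            if (first != null) {
                first.getValues().addAll(elem.getValues());
            }
        }
        return new LinkedList<>(byKey.values());
    }

    public static void main(String[] args) {
        List<CElem> input = Arrays.asList(
                new CElem("a", Arrays.asList("1")),
                new CElem("b", Arrays.asList("2")),
                new CElem("a", Arrays.asList("3")));
        for (CElem e : merge(input)) {
            System.out.println(e.getKey() + " -> " + e.getValues());
        }
        // prints:
        // a -> [1, 3]
        // b -> [2]
    }
}
```

This runs in a single pass instead of the quadratic nested loop, and the input list is never modified while it is being iterated.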
The problem is that when you use the iItr.remove() it modifies the list, which iItr is happy with because it knows what changed, but oItr isn't. There are three possible solutions to this that I can see:
Switch to a concurrent list (e.g. ConcurrentLinkedQueue, but see the answers at Lock-Free Concurrent Linked List in Java for warnings about this)
Switch to a set structure, e.g. TreeSet, which will keep your items unique automatically (but won't preserve their insertion order)
Make sure you don't use the other iterator after removing from one of them -- you could do this by switching which element you are removing, i.e. change iItr.remove() to:
oItr.remove();
break;
This would cause the first instance of each key to be removed, rather than subsequent ones, which might not be the behaviour you want -- in that case you could try iterating over the lists backwards.
Will this work?
Iterator<CElem> oItr = linkedList.iterator();
{
while (oItr.hasNext())
{
CElem outer = oItr.next();
Iterator<CElem> iItr = linkedList.iterator();
{
while (iItr.hasNext())
{
CElem inner = iItr.next();
if (outer.equals(inner))
continue;
if (outer.getKey().equals(inner.getKey()))
{
inner.getValues().addAll(outer.getValues());
oItr.remove();
break;
}
}
}
}
}
