How does Java's iterator.remove() translate to Python?

I have a python set that I need to iterate over, and for each element, check if it satisfies a constraint, and if so, remove it and add it to a different, possibly already nonempty set. I could just add it to a buffer and then, after the loop is complete, iterate through the buffer and remove its contents from the set, but there must be a better way to do this.
Here is what I'm trying to do:
for elem in S:
    if P(elem):
        S.remove(elem)
        T.add(elem)
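This doesn't work: removing from a set while iterating over it raises "RuntimeError: Set changed size during iteration".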
Here's the fix that would work but is unclean:
B = set()
for elem in S:
    if P(elem):
        B.add(elem)
        T.add(elem)
for elem in B:
    S.remove(elem)
EDIT:
The best solution seems to be:
for elem in S.copy():
    if P(elem):
        S.remove(elem)
        T.add(elem)

For a list, create a copy before iterating:
for elem in S[:]:
where the [:] slice notation creates a copy of the full list. Without the copy, removing elements from a list while iterating over it causes the loop to skip elements.

One option is to use a list comprehension to create T and then remove all elements in T from S.
T = set([elem for elem in S if P(elem)])
S = S - T

If iterating over the list twice and calling P() twice for each item is not an issue, this would be the most readable solution:
T = [x for x in S if P(x)]
S = [x for x in S if not P(x)]
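For comparison, here is a minimal Java sketch of the Iterator.remove() idiom that the question title refers to. The class name MoveMatching and the method moveMatching are made up for this illustration, and the sketch assumes source and target are different sets.

```java
import java.util.Iterator;
import java.util.Set;
import java.util.function.Predicate;

class MoveMatching {
    // Hypothetical helper: moves every element of source that satisfies p into target.
    // Assumes source and target are distinct sets.
    static <E> void moveMatching(Set<E> source, Set<E> target, Predicate<? super E> p) {
        Iterator<E> it = source.iterator();
        while (it.hasNext()) {
            E elem = it.next();
            if (p.test(elem)) {
                it.remove();       // removal goes through the iterator, so iteration stays valid
                target.add(elem);
            }
        }
    }
}
```

Python iterators have no counterpart to remove(), which is why the Python answers above iterate over a copy or build new collections instead.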

Related

How to add element to a HashSet while iterating this HashSet?

I have a use case like following:
SET is a Set of Integer with size N
for i in SET (I mean only iterate the Set of size N at start point):
    if i + 7 not in SET:
        SET.add(i + 7)
return SET
How to implement this using Java HashSet except using an auxiliary list/set to store the element which needs to be inserted?
It is impossible to add something to the Set instance while iterating over its contents. When using the foreach loop (for( var e : set ) notation), no modification is allowed, while when using an explicit iterator (for( var i = set.iterator(); i.hasNext(); ) … notation), you can call i.remove() to get rid of the current element. But adding new elements still does not work in this case.
This behaviour is shared by all Java Collection classes, although List has a special iterator class, ListIterator, that also allows adding entries (notation for( var i = list.listIterator(); i.hasNext(); ) …) by calling i.add(). Thanks to @lucasvw for reminding me of that.
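A small sketch of the ListIterator case, for lists only. The class and method names are invented for the example. ListIterator.add() inserts the new entry just before the cursor, so a later next() call never returns it and the loop terminates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

class ListIteratorAdd {
    // Hypothetical helper: returns a copy of input with value + 7 inserted after each element.
    static List<Integer> withShifted(List<Integer> input) {
        List<Integer> list = new ArrayList<>(input);
        ListIterator<Integer> it = list.listIterator();
        while (it.hasNext()) {
            int value = it.next();
            it.add(value + 7); // inserted before the cursor, so next() skips over it
        }
        return list;
    }
}
```

For the input [1, 2] this yields [1, 8, 2, 9]. Note that a list, unlike a set, does not reject duplicates.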
@lucasvw alluded to the underlying issue here: you need to somehow differentiate between the original values and the values you've added, otherwise the loop will run indefinitely (or, at least, until the values overflow enough that they start repeating themselves).
The best way to do this is to indeed have an auxiliary set to hold all the values you want to add:
Set<Integer> aux = original.stream().map(i -> i + 7).collect(Collectors.toSet());
original.addAll(aux);
If you do not want to make a copy yourself, Java can do that for you: https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/CopyOnWriteArraySet.html. It will not be faster or anything magical, but you can iterate "the original" and modify at the same time.
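A minimal sketch of the CopyOnWriteArraySet variant, with invented class and method names. Its iterator works on a snapshot taken when iteration starts, so elements added inside the loop are not visited and no ConcurrentModificationException is thrown.

```java
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

class CopyOnWriteAdd {
    // Hypothetical helper: returns a set holding input plus i + 7 for every original i.
    static Set<Integer> addShifted(Set<Integer> input) {
        Set<Integer> set = new CopyOnWriteArraySet<>(input);
        for (Integer i : set) {   // iterates over a snapshot of the original contents
            set.add(i + 7);       // add() is a no-op if the value is already present
        }
        return set;
    }
}
```

Every add() copies the backing array, so this is convenient rather than fast, as the answer above already warns.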
However if you want something efficient, that is presumably to create another Set with the new elements, and addAll() them at the end. Depending on the size of the set it may be faster to skip the containment-check, and leave it for the merging.
BitSet and its or() operation may be also something to look at if your numbers are nonnegative and are of low magnitude.
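A rough sketch of the BitSet idea, assuming all values are nonnegative and small enough that value + shift stays well inside the int range. BitSet has no built-in shift operation, so the shifted copy is built by walking the set bits. The names BitSetShift and addShifted are hypothetical.

```java
import java.util.BitSet;

class BitSetShift {
    // Hypothetical helper: returns a copy of original that also contains i + shift
    // for every i in original. Assumes small, nonnegative values.
    static BitSet addShifted(BitSet original, int shift) {
        BitSet shifted = new BitSet();
        for (int i = original.nextSetBit(0); i >= 0; i = original.nextSetBit(i + 1)) {
            shifted.set(i + shift);
        }
        BitSet result = (BitSet) original.clone();
        result.or(shifted); // union of the original and the shifted values
        return result;
    }
}
```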

how to remove object from stream in foreach method?

I have two lists, arrA and arrB. They hold objects of different types, and the add function converts objects of type A to objects of type B. I want to add each object from arrA to arrB and remove that object from arrA. I'm trying to do this with a stream:
arrA.stream().forEach(c -> {arrB.add(c); arrA.remove(c);});
When I execute this, two things happen:
Not all objects are passed from arrA to arrB.
After a few iterations, a NullPointerException is thrown.
I guess it's because the length of the array is decreased after each remove() call and the counter of iterations is increased (only objects at odd indexes are passed to arrB).
Now, I could solve this by copying the list in one stream call and then removing the objects in a second stream call, but this doesn't seem correct to me.
What would be the proper solution to this problem?
EDIT.
Additional information:
In the real implementation, this list is filtered first:
arrA.stream().filter(some condition).forEach(c -> {arrB.add(c); arrA.remove(c);});
and it's called a few times to add elements meeting different conditions to different lists (arrC, arrD, etc.), but each object can be on only one list.
Streams are designed to be used in a more functional way, preferably treating your collections as immutable.
The non-streams way would be:
arrB.addAll(arrA);
arrA.clear();
However you might be using Streams so you can filter the input so it's more like:
arrB.addAll(arrA.stream().filter(x -> whatever).toList())
then remove from arrA (thanks to @Holgar for the comment).
arrA.removeIf(x -> whatever)
If your predicate is expensive, then you could partition:
Map<Boolean, List<XXX>> lists = arrA.stream()
    .collect(Collectors.partitioningBy(x -> whatever));
arrA = lists.get(false);
arrB = lists.get(true);
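A self-contained sketch of the partitioning approach. The String element type, the length-based predicate, and the names PartitionDemo and splitByLength are stand-ins chosen for the example. The predicate runs once per element, and both result lists keep encounter order for a sequential stream.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionDemo {
    // Hypothetical helper: splits words into those with length >= minLength (key true)
    // and the rest (key false), evaluating the predicate once per element.
    static Map<Boolean, List<String>> splitByLength(List<String> words, int minLength) {
        return words.stream()
                .collect(Collectors.partitioningBy(w -> w.length() >= minLength));
    }
}
```

With the words "a", "bbb", "cc", "dddd" and minLength 3, the true key maps to [bbb, dddd] and the false key to [a, cc]. Both keys are always present, even when one list is empty.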
or make a list of the changes:
List<XXX> toMove = arrA.stream().filter(x->whatever).toList();
arrA.removeAll(toMove);
arrB.addAll(toMove);
As the others have mentioned, this is not possible with forEach, just as it is impossible to remove elements inside a for (A a : arrA) loop.
In my opinion, the cleanest solution is a plain while loop with an iterator: iterators allow you to remove elements while iterating (as long as the collection supports it).
Iterator<A> it = arrA.iterator();
while (it.hasNext()) {
    A a = it.next();
    if (!check(a))
        continue;
    arrB.add(a);
    it.remove(); // removes the element last returned by next()
}
This also saves you from copying/cloning arrA.
I don't think you can remove from arrA while you iterate over it.
You can get around this by iterating over a copy, made with new ArrayList<>(arrA):
new ArrayList<>(arrA).stream().forEach(c -> {arrB.add(c); arrA.remove(c);});
i guess it's because length of array is decreased after each remove() call and the counter of iterations is increased
Right. The for-each loop is just like a normal for loop, but easier to write and read. You can think of it as syntactic sugar. Internally it uses either an Iterator or array indices. The forEach method of streams is a fancier version of it that allows parallel execution and a functional coding style, but it has its own drawbacks.
As with any indexed loop, removing an element while looping breaks the loop. Consider three elements with indices 0, 1, and 2. When you remove element 0 in the first iteration, the remaining items shift down by one, so in the next iteration the list holds elements 0 (previously 1) and 1 (previously 2). Your loop variable now points to 1, so it skips the item that should have come next. By the time it reaches index 2, the list has only one item left (you removed two), so the access throws an error because the index is out of bounds.
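The skipping effect can be reproduced with a plain indexed loop. This sketch uses made-up names and re-checks source.size() on every pass, so it stops early instead of throwing. With a loop bound fixed before the loop, the same code would fail with an IndexOutOfBoundsException.

```java
import java.util.ArrayList;
import java.util.List;

class SkipDemo {
    // Deliberately flawed: tries to move every element, but skips every second one.
    static List<String> moveAllNaively(List<String> source) {
        List<String> moved = new ArrayList<>();
        for (int i = 0; i < source.size(); i++) {
            moved.add(source.get(i));
            source.remove(i); // shifts later elements down by one; i still advances
        }
        return moved;
    }
}
```

Starting from [a, b, c, d], the loop moves only [a, c] and leaves [b, d] behind.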
Possible solutions:
Use the List methods for cloning and clearing lists.
Do it with two loops if you really need to call the methods on each single item.
You could just call arrB.addAll(arrA). Then, when that's finished, call clear() on arrA.

Extract first k elements from a Set efficiently

Problem
I'm writing a simple Java program in which I have a TreeSet which contains Comparable elements (it's a class that I've written myself). In a specific moment I need to take only the first k elements from it.
What I've done
Currently I've found two different solutions for my problem:
Using a simple method written by me; It copies the first k elements from the initial TreeSet;
Use Google Guava greatestOf method.
For the second option you need to call the method in this way:
Ordering.natural().greatestOf(mySet, 80)
But I think that it's useless to use this kind of invocation because the elements are already sorted. Am I wrong?
Question
I want to ask here which is a correct and, at the same time, efficient method to obtain a Collection derived class which contains the first k elements of a TreeSet?
Additional information
Java version: >= 7
You could use Guava's Iterables#limit:
ImmutableList.copyOf(Iterables.limit(yourSet, 7))
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/Iterables.html#limit(java.lang.Iterable, int)
I would suggest using a TreeSet<YourComparableClass> collection; it seems to be the solution you are looking for.
A TreeSet can give you an iterator, and you can simply iterate k times, storing the objects the iterator returns: the elements are returned in order.
Moreover, a TreeSet always keeps its elements sorted: whenever you add or remove elements, the structure remains ordered.
Here is a possible example:
public static ArrayList<YourComparableClass> getFirstK(TreeSet<YourComparableClass> set, int k) {
    Iterator<YourComparableClass> iterator = set.iterator();
    ArrayList<YourComparableClass> result = new ArrayList<>(k); // to store the first k items
    // the iterator returns items in order; hasNext() guards against k > set.size()
    for (int i = 0; i < k && iterator.hasNext(); i++) {
        result.add(iterator.next());
    }
    return result;
}
The descendingIterator() method of java.util.TreeSet yields elements from greatest to least, so you can just step it however many times, inserting the elements into a collection. The running time is O(log n + k) where k is the number of elements returned, which is surely fast enough.
If you're using a HashSet, on the other hand, then the elements in fact are not sorted, so you need to use the linear-time selection method that you indicated.
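A short sketch of the descendingIterator() approach for the k greatest elements. LargestK and largestK are hypothetical names. The hasNext() check covers the case where the set holds fewer than k elements.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

class LargestK {
    // Hypothetical helper: returns up to k elements of set, greatest first.
    static <E> List<E> largestK(TreeSet<E> set, int k) {
        List<E> result = new ArrayList<>();
        Iterator<E> it = set.descendingIterator();
        while (result.size() < k && it.hasNext()) {
            result.add(it.next());
        }
        return result;
    }
}
```

For the first (smallest) k elements, the same loop works with set.iterator() in place of descendingIterator().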

Java, multiple iterators on a set, removing proper subsets and ConcurrentModificationException

I have a set A = {(1,2), (1,2,3), (2,3,4), (3,4), (1)}
I want to turn it into A={(1,2,3), (2,3,4)}, remove proper subsets from this set.
I'm using a HashSet to implement the set, 2 iterator to run through the set and check all pairs for proper subset condition using containsAll(c), and the remove() method to remove proper subsets.
the code looks something like this:
HashSet<Integer> hs....
Set<Integer> c=hs.values();
Iterator<Integer> it= c.iterator();
while(it.hasNext())
{
    p=it.next();
    Iterator<Integer> it2= c.iterator();
    while(it2.hasNext())
    {
        q=it2.next();
        if q is a subset of p
            it2.remove();
        else if p is a subset of q
        {
            it.remove();
            break;
        }
    }
}
I get a ConcurrentModificationException the first time I come out of the inner while loop and do a
p=it.next();
The exception is for when modifying the Collection while iterating over it. But that's what .remove() is for.
I have used remove() when using just 1 iterator and encountered no problems there.
If the exception is because I'm removing an element from 'c' or 'hs' while iterating over it, then the exception should be thrown when it encounters the very next it2.next() command, but I don't see it then. I see it when it encounters the it.next() command.
I used the debugger, and the collections and iterators are in perfect order after the element has been removed. They contain and point to the proper updated set and element. it.next() contains the next element to be analyzed, it's not a deleted element.
Any ideas on how I can do what I'm trying to do without making a copy of the HashSet itself and using it as an intermediate before I commit updates?
Thank you
You can't modify the collection with it2 and continue iterating it with it. Just as the exception says, it's concurrent modification, and it's not supported.
I'm afraid you're stuck with an intermediate collection.
Edit
Actually, your code doesn't seem to make sense: are you sure it's a collection of Integer and not of Set<Integer>? In your code p and q are Integers, so "if q is a subset of p" doesn't seem to make too much sense.
One obvious way to make this a little smarter: sort your sets by size first, as you go from largest to smallest, add the ones you want to keep to a new list. You only have to check each set against the keep list, not the whole original collection.
The idea behind the ConcurrentModificationException is to maintain the internal state of the iterators. When you add or delete things from a set of items, it will throw an exception even if nothing appears wrong. This is to save you from coding errors that would end up throwing a NullPointerException in otherwise mundane code. Unless you have very tight space constraints or have an extremely large collection, you should just make a working copy that you can add and delete from without worry.
How about creating another set subsetNeedRemoved containing all subsets you are going to remove? For each subset, if there is a proper superset, add the subset to subsetNeedRemoved. At the end, you can loop over subsetNeedRemoved and remove corresponding subsets in the original set.
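A minimal sketch of that idea, assuming the collection is really a Set<Set<E>> and that the inner sets are not mutated while they are stored. The names ProperSubsetFilter, removeProperSubsets, and toRemove are invented for the example. It compares all pairs, so it is quadratic in the number of sets.

```java
import java.util.HashSet;
import java.util.Set;

class ProperSubsetFilter {
    // Hypothetical helper: removes every element of sets that is a proper subset
    // of some other element. Nothing is removed until both loops have finished.
    static <E> void removeProperSubsets(Set<Set<E>> sets) {
        Set<Set<E>> toRemove = new HashSet<>();
        for (Set<E> candidate : sets) {
            for (Set<E> other : sets) {
                // proper subset: strictly smaller and fully contained
                if (candidate.size() < other.size() && other.containsAll(candidate)) {
                    toRemove.add(candidate);
                    break;
                }
            }
        }
        sets.removeAll(toRemove);
    }
}
```

Applied to the set A from the question, this leaves (1,2,3) and (2,3,4).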
I'd write something like this...
PriorityQueue<Set<Integer>> queue = new PriorityQueue<Set<Integer>>(16,
        new Comparator<Set<Integer>>() {
            public int compare(Set<Integer> a, Set<Integer> b) {
                return b.size() - a.size(); // overflow-safe!
            }
        });
queue.addAll(sets); // we'll extract them in order from largest to smallest
List<Set<Integer>> result = new ArrayList<>();
while(!queue.isEmpty()) {
    Set<Integer> largest = queue.poll();
    result.add(largest);
    Iterator<Set<Integer>> rest = queue.iterator();
    while(rest.hasNext()) {
        if(largest.containsAll(rest.next())) {
            rest.remove();
        }
    }
}
Yeah, it consumes some extra memory, but it's idiomatic, straightforward, and possibly faster than another approach.

Java: How to get n elements from a set

I was trying to find the most elegant way to get the n elements from a set starting from x. What I concluded was using streams:
Set<T> s;
Set<T> subS = s.stream().skip(x).limit(n).collect(Collectors.toSet());
Is this the best way to do it? Are there any drawbacks?
Similar to Steve Kuo's answer but also skipping the first x elements:
Iterables.limit(Iterables.skip(s, x), n);
Guava Iterables
Use Guava, Iterables.limit(s, 20).
Your code doesn’t work.
Set<T,C> s;
Set<T,C> subS = s.stream().skip(x).limit(n).collect(Collectors.toSet());
What is Set<T,C>? A Set contains elements of a given type so what are the two type parameters supposed to mean?
Further, if you have a Set<T>, you don’t have a defined order. “the n elements from a set starting from x” makes no sense in the context of a Set. There are some specialized Set implementations which have an order, e.g. are sorted or do retain insertion order, but since your code doesn’t declare such prerequisite but seems to be supposed to work on an arbitrary Set, it must be considered broken.
If you want to process a fraction of the Set according to an order, you have to freeze the order first:
Set<T> s;
List<T> frozenOrder=new ArrayList<>(s);
The list will have an order which will be the order of the Set, if there is any, or an arbitrary order, fixed at the creation time of the ArrayList, which will not change afterwards.
Then, extracting a fragment of it, is easy:
List<T> sub=frozenOrder.subList(x, Math.min(s.size(), x+n));
You may also convert it back to a Set, if you wish:
Set<T> subSet=new HashSet<>(sub);
That said, it’s rather unusual to process a part of a Set given by positional numbers.
The use of Stream is fine. The one drawback I can see is that not all implementations of Set are ordered, e.g. HashSet is not ordered but LinkedHashSet is. So you might get a different resulting set on different runs.
You can just iterate over set and collect first n elements:
int n = 0;
Iterator<T> iter = set.iterator();
while (n < 8 && iter.hasNext()) {
    T t = iter.next();
    list.add(t);
    n++;
}
The benefit is that it should be faster than more generic solutions.
The drawback is that it's more verbose than the one that you suggested.
A set - in its original manner - is not intended to have ordered elements, so you can not start from element x. SortedSet may be the "set" you want to use.
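If "starting from x" means "starting at the element x" rather than at position x (an assumption, since the question leaves this open), a sorted set gives a direct answer. This sketch uses NavigableSet.tailSet, with the invented names SortedSlice and nFrom.

```java
import java.util.List;
import java.util.NavigableSet;
import java.util.stream.Collectors;

class SortedSlice {
    // Hypothetical helper: returns up to n elements of set that are >= from, in sorted order.
    static <E> List<E> nFrom(NavigableSet<E> set, E from, int n) {
        return set.tailSet(from, true)   // view of all elements >= from
                .stream()
                .limit(n)
                .collect(Collectors.toList());
    }
}
```

For a TreeSet holding 10, 20, 30, 40, 50, asking for 3 elements from 20 gives [20, 30, 40]. Asking from 25, which is not in the set, gives [30, 40, 50].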
I'd convert it to a List first, like
new ArrayList(s).subList(<index of x>, <index of x + n>);
but it may have a very bad impact on performance. In this case the ArrayList would have to be stored in order to retrieve the next subList, because there is no explicit order, and the implicit order may change the next time new ArrayList(s) is called.
First, a set is not made for getting specific elements from it; you should use a SortedSet or an ArrayList instead.
But if you have to get elements from the set, you can use the following code to iterate over it:
int c = 0;
int n = 50; // number of elements to get
Iterator<T> iter = set.iterator();
while (c < n && iter.hasNext()) {
    T t = iter.next();
    list.add(t);
    c++;
}
