Merge sets when two elements in common - java

This is the follow up of compare sets
I have
Set<Set<Node>> NestedSet = new HashSet<Set<Node>>();
[[Node[0], Node[1], Node[2]], [Node[0], Node[2], Node[6]], [Node[3], Node[4], Node[5]], [Node[2], Node[6], Node[7]]]
I want to merge sets that have two elements in common. For example, [0,1,2] and [0,2,6] have two elements in common, so they are merged to form [0,1,2,6].
Then [0,1,2,6] and [2,6,7] have 2 and 6 in common, so they are merged to give [0,1,2,6,7].
The final output should be :
[ [Node[0], Node[1], Node[2], Node[6], Node[7]], [Node[3], Node[4], Node[5]] ]
I tried like this :
for (Set<Node> s1 : NestedSet ) {
Optional<Set<Node>> findFirst = result.stream().filter(p -> { HashSet<Node> temp = new HashSet<>(s1);
temp.retainAll(p);
return temp.size() == 2; }).findFirst();
if (findFirst.isPresent()){
findFirst.get().addAll(s1);
}
else {
result.add(s1);
}
}
But the result I got was :
[[Node[0], Node[1], Node[2], Node[6], Node[7]], [Node[3], Node[4], Node[5]], [Node[0], Node[2], Node[6], Node[7]]]
Any idea ? Is there any way to get the desired output?

Some considerations:
Each time you apply a merge, you have to restart the procedure and iterate over the modified collection. Because of this, the iteration order of the input set matters. If you want your code to be deterministic, use collections that guarantee their iteration order (e.g. LinkedHashSet rather than HashSet, or a List).
Your current code has side effects because it modifies the supplied sets when merging. In general, it helps to avoid side effects whenever possible.
The following code does what you want:
static <T> List<Set<T>> mergeSets(Collection<? extends Set<T>> unmergedSets) {
final List<Set<T>> mergedSets = new ArrayList<>(unmergedSets);
List<Integer> mergeCandidate = Collections.emptyList();
do {
mergeCandidate = findMergeCandidate(mergedSets);
// apply the merge
if (!mergeCandidate.isEmpty()) {
// gather the sets to merge
final Set<T> mergedSet = Sets.union(
mergedSets.get(mergeCandidate.get(0)),
mergedSets.get(mergeCandidate.get(1)));
// removes both sets using their index, starts with the highest index
mergedSets.remove(mergeCandidate.get(0).intValue());
mergedSets.remove(mergeCandidate.get(1).intValue());
// add the mergedSet
mergedSets.add(mergedSet);
}
} while (!mergeCandidate.isEmpty());
return mergedSets;
}
// O(n^2/2)
static <T> List<Integer> findMergeCandidate(List<Set<T>> sets) {
for (int i = 0; i < sets.size(); i++) {
for (int j = i + 1; j < sets.size(); j++) {
if (Sets.intersection(sets.get(i), sets.get(j)).size() == 2) {
return Arrays.asList(j, i);
}
}
}
return Collections.emptyList();
}
For testing this method I created two helper methods:
static Set<Integer> set(int... ints) {
return new LinkedHashSet<>(Ints.asList(ints));
}
@SafeVarargs
static <T> Set<Set<T>> sets(Set<T>... sets) {
return new LinkedHashSet<>(Arrays.asList(sets));
}
These helper methods make it possible to write very readable tests, for example (using the numbers from the question):
public static void main(String[] args) {
// prints [[2, 6, 7, 0, 1]]
System.out.println(mergeSets(sets(set(0, 1, 2, 6), set(2, 6, 7))));
// prints [[3, 4, 5], [0, 2, 6, 1, 7]]
System.out.println(
mergeSets(sets(set(0, 1, 2), set(0, 2, 6), set(3, 4, 5), set(2, 6, 7))));
}
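The answer above depends on Guava (Sets.union, Sets.intersection, Ints.asList). As a rough sketch only, the same restart-after-every-merge loop can be written with java.util alone. The names MergeSetsSketch, commonCount and minCommon are made up for this illustration, and the sketch assumes "two in common" means "at least minCommon in common", whereas the answer above tests for exactly two.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MergeSetsSketch {

    // Counts the elements the two sets share, without modifying either set.
    static <T> int commonCount(Set<T> a, Set<T> b) {
        int count = 0;
        for (T element : a) {
            if (b.contains(element)) {
                count++;
            }
        }
        return count;
    }

    // Merges sets until no two of them share at least minCommon elements (assumed reading of the rule).
    static <T> List<Set<T>> mergeSets(Collection<? extends Set<T>> input, int minCommon) {
        // Copy every set so the caller's sets are never mutated.
        List<Set<T>> sets = new ArrayList<>();
        for (Set<T> s : input) {
            sets.add(new LinkedHashSet<>(s));
        }
        boolean mergedSomething = true;
        while (mergedSomething) {
            mergedSomething = false;
            search:
            for (int i = 0; i < sets.size(); i++) {
                for (int j = i + 1; j < sets.size(); j++) {
                    if (commonCount(sets.get(i), sets.get(j)) >= minCommon) {
                        // Fold set j into set i and drop j from the list.
                        sets.get(i).addAll(sets.remove(j));
                        mergedSomething = true;
                        break search; // restart the scan on the modified list
                    }
                }
            }
        }
        return sets;
    }
}
```

Called with the four sets from the question and minCommon = 2, this returns a list holding the sets {0, 1, 2, 6, 7} and {3, 4, 5}. Passing a List (or a LinkedHashSet) as input keeps the merge order deterministic.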

I'm not sure why you are getting that result, but I do see another problem with this code: It is order-dependent. For example, even if the code worked as intended, it would matter whether [Node[0], Node[1], Node[2]] is compared first to [Node[0], Node[2], Node[6]] or [Node[2], Node[6], Node[7]]. But Sets don't have a defined order, so the result is either non-deterministic or implementation-dependent, depending on how you look at it.
If you really want deterministic order-dependent operations here, you should be using List<Set<Node>>, rather than Set<Set<Node>>.

Here's a clean approach using recursion:
public static <T> Set<Set<T>> mergeIntersectingSets(Collection<? extends Set<T>> unmergedSets) {
boolean edited = false;
Set<Set<T>> mergedSets = new HashSet<>();
for (Set<T> subset1 : unmergedSets) {
boolean merged = false;
// if at least one element is contained in another subset, then merge the subsets
for (Set<T> subset2 : mergedSets) {
if (!Collections.disjoint(subset1, subset2)) {
subset2.addAll(subset1);
merged = true;
edited = true;
}
}
// otherwise, add the current subset as a new subset
if (!merged) mergedSets.add(subset1);
}
if (edited) return mergeIntersectingSets(mergedSets); // continue merging until we reach a fixpoint
else return mergedSets;
}

Related

How to collect element of the stream into groups of fixed size in accordance with their order

I am trying to do the following with a Stream<BigDecimal> using Java 8 but am stuck at step 2.
Remove null and negative values.
Create groups with a size of 3 elements. Retain groups with an average of less than 30, otherwise discard.
Example. Let's assume the following:
Stream<BigDecimal> input = {4,5,6,61,3,9,3,1,null,-4,7,2,-8,6,-3,null}; // not valid Java, but assume these values
I was able to solve step 1 as below:
Stream<BigDecimal> newInList = input.filter(bd -> (bd != null && bd.signum() > 0));
I'm not able to do the step 2 - create groups of 3 elements.
The expected result for step2: {4,5,6},{61,3,9},{3,1,7}.
I'm looking for a solution with Java 8 streams.
So you need to extract groups with the size of 3 elements from the stream in accordance with their order.
It can be done using Stream API by implementing a custom collector that implements the Collector interface.
The group size has to be provided when the GroupCollector is created (this makes the collector more flexible and avoids hard-coding the value 3 inside the class).
Deque<List<T>> is used as a mutable container because the Deque interface provides convenient access to the last element.
The combiner() method defines how results produced by different threads are merged. For an ordered stream, collect() guarantees that the partial results are combined in encounter order, so this solution can be parallelized.
Combining the two deques produced by different threads involves the following concerns:
all groups (except possibly the last one) must have exactly groupSize elements. Therefore we can't simply add all the contents of the second deque to the first deque. Instead, every group of the second deque has to be processed one by one.
lists that are already created should be reused.
The finisher() function discards the last list in the deque if its size is less than groupSize (a requirement stated by the OP in a comment).
As an example, I've used the sequence of numbers from the question.
public static void main(String[] args) {
Stream<BigDecimal> source =
IntStream.of(4, 5, 6, 61, 3, 9, 3, 1, 7, 2, 6)
.mapToObj(BigDecimal::valueOf);
System.out.println(createGroups(source)
.flatMap(List::stream)
.collect(Collectors.toList())); // collecting to list for demonstration purposes
}
Method createGroups()
public static Stream<List<BigDecimal>> createGroups(Stream<BigDecimal> source) {
return source
.collect(new GroupCollector<BigDecimal>(3))
.stream()
.filter(list -> averageIsLessThen(list, 30));
}
Collector
public class GroupCollector<T> implements Collector<T, Deque<List<T>>, Deque<List<T>>> {
private final int groupSize;
public GroupCollector(int groupSize) {
this.groupSize = groupSize;
}
@Override
public Supplier<Deque<List<T>>> supplier() {
return ArrayDeque::new;
}
@Override
public BiConsumer<Deque<List<T>>, T> accumulator() {
return (deque, next) -> {
if (deque.isEmpty() || deque.getLast().size() == groupSize) {
List<T> group = new ArrayList<>();
group.add(next);
deque.addLast(group);
} else {
deque.getLast().add(next);
}
};
}
@Override
public BinaryOperator<Deque<List<T>>> combiner() {
return (deque1, deque2) -> {
if (deque1.isEmpty()) {
return deque2;
} else if (deque1.getLast().size() == groupSize) {
deque1.addAll(deque2);
return deque1;
}
// last group in the deque1 has a size less than groupSize
List<T> curGroup = deque1.pollLast();
List<T> nextGroup;
for (List<T> nextItem: deque2) {
nextGroup = nextItem;
Iterator<T> iter = nextItem.iterator();
while (iter.hasNext() && curGroup.size() < groupSize) {
curGroup.add(iter.next());
iter.remove();
}
deque1.add(curGroup);
curGroup = nextGroup;
}
if (curGroup.size() != 0) {
deque1.add(curGroup);
}
return deque1;
};
}
@Override
public Function<Deque<List<T>>, Deque<List<T>>> finisher() {
return deque -> {
if (deque.peekLast() != null && deque.peekLast().size() < groupSize) {
deque.pollLast();
}
return deque;
};
}
@Override
public Set<Characteristics> characteristics() {
return Collections.emptySet();
}
}
The auxiliary method below validates a group of elements based on its average value. A RoundingMode is passed to divide() because BigDecimal.divide without one throws an ArithmeticException when the exact quotient has a non-terminating decimal expansion.
private static boolean averageIsLessThen(List<BigDecimal> list, double target) {
BigDecimal average = list.stream()
.reduce(BigDecimal.ZERO, BigDecimal::add)
.divide(BigDecimal.valueOf(list.size()), RoundingMode.HALF_UP);
return average.compareTo(BigDecimal.valueOf(target)) < 0;
}
Output (expected result: { 4, 5, 6, 61, 3, 9, 3, 1, 7 }, provided by the OP)
[4, 5, 6, 61, 3, 9, 3, 1, 7]
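If parallel execution is not needed and the filtered values fit in memory, a shorter route is possible. The sketch below swaps the custom collector for plain index-based subList partitioning, and compares the group sum with limit × size instead of dividing. FixedSizeGroupsSketch, partition, averageBelow and groupsBelow are illustrative names, not part of the answer above.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class FixedSizeGroupsSketch {

    // Splits the list into consecutive groups of groupSize; an incomplete last group is dropped.
    static <T> List<List<T>> partition(List<T> values, int groupSize) {
        return IntStream.range(0, values.size() / groupSize)
                .mapToObj(i -> values.subList(i * groupSize, (i + 1) * groupSize))
                .collect(Collectors.toList());
    }

    // average < limit is checked as sum < limit * size, which avoids division and rounding.
    static boolean averageBelow(List<BigDecimal> group, BigDecimal limit) {
        BigDecimal sum = group.stream().reduce(BigDecimal.ZERO, BigDecimal::add);
        return sum.compareTo(limit.multiply(BigDecimal.valueOf(group.size()))) < 0;
    }

    // Step 1: drop nulls and non-positive values. Step 2: group and keep groups with a low average.
    static List<List<BigDecimal>> groupsBelow(List<BigDecimal> input, int groupSize, BigDecimal limit) {
        List<BigDecimal> cleaned = input.stream()
                .filter(bd -> bd != null && bd.signum() > 0)
                .collect(Collectors.toList());
        return partition(cleaned, groupSize).stream()
                .filter(group -> averageBelow(group, limit))
                .collect(Collectors.toList());
    }
}
```

The groups returned by partition are subList views of the cleaned list, so no elements are copied. The trade-off against the collector is that the whole filtered stream is materialized first.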

Looping over All Possible combinations of ArrayList

I want to loop over the same list to process possible combinations of that list. For example : From a list consisting [1,2,3] I want to get an ArrayList which looks like this: [[1,2], [1,3], [2,3]]
I am processing a list of nodes instead of integers. For now I am trying something like the following:
ArrayList<ArrayList<Node>> saveList = new ArrayList<ArrayList<Node>>();
for (Node n1 : nodes)
ArrayList<Node> saveList2 = new ArrayList<Node>();
for (Node n2 : nodes)
if n2.name == n1.name
continue;
saveList2.add(n1).add(n2);
if (!saveList.containsAll(saveList2))
then process graph;
else continue;
I don't process the same node and avoid the combination already processed. Is there a better solution ?
Using a combinatorics library may be a bit overkill in your case. Your task is indeed finding combinations of size 2, but the fact that the size is two simplifies it drastically.
A good old index-based for-loop does the trick here, with no check for duplicates necessary. Notice how the second loop starts from i + 1. Go over the algorithm in a scratchpad and you will see how this avoids duplicates.
List<List<Node>> pairs = new ArrayList<>();
for (int i = 0; i < nodes.size(); i++) {
for (int j = i + 1; j < nodes.size(); j++) {
pairs.add(Arrays.asList(nodes.get(i), nodes.get(j)));
}
}
If the task is not of academic nature or does not consist of implementing an algorithm, I would use a library and focus on the core of the task the application is supposed to solve. Such a library would be for example combinatoricslib3. Google guava or Apache commons certainly have similar methods. With combinatoricslib3 the solution to your issue above would be a one-liner:
Generator.combination(1,2,3)
.simple(2)
.stream()
.forEach(System.out::println);
Output:
[1, 2]
[1, 3]
[2, 3]
or something like:
List<List<String>> result = Generator.combination("FOO", "BAR", "BAZ")
.simple(2)
.stream()
.collect(Collectors.toList());
System.out.println(result);
to get
[[FOO, BAR], [FOO, BAZ], [BAR, BAZ]]
It works not only for simple values like ints or strings as shown above; you can also pass a list of your own custom objects. Assuming you have a Node class:
public class Node {
String name;
// getter, setter, toString ...
}
List<Node> nodeList = List.of(new Node("node1"), new Node("node2"), new Node("node3"));
Generator.combination(nodeList)
.simple(2)
.stream()
.forEach(System.out::println);
Output:
[Node(name=node1), Node(name=node2)]
[Node(name=node1), Node(name=node3)]
[Node(name=node2), Node(name=node3)]
To use the library, add the dependency to your pom.xml, or download the jar and add it to the classpath. Maven dependency:
<dependency>
<groupId>com.github.dpaukov</groupId>
<artifactId>combinatoricslib3</artifactId>
<version>3.3.2</version>
</dependency>
Try this.
static <T> List<List<T>> combinations(List<T> list, int n) {
int length = list.size();
List<List<T>> result = new ArrayList<>();
T[] selections = (T[])new Object[n];
new Object() {
void select(int start, int index) {
if (index >= n)
result.add(List.of(selections));
else if (start < length){
selections[index] = list.get(start);
select(start + 1, index + 1);
select(start + 1, index);
}
}
}.select(0, 0);
return result;
}
public static void main(String[] args) {
List<Integer> list = List.of(1, 2, 3);
System.out.println(combinations(list, 2));
}
output:
[[1, 2], [1, 3], [2, 3]]

isPalindrome - Collection and List Reversal

This is a homework lab for school. I am trying to reverse a LinkedList, and check if it is a palindrome (the same backwards and forwards). I saw similar questions online, but not many that help me with this. I have made programs that check for palindromes before, but none that check an array or list. So, first, here is my isPalindrome method:
public static <E> boolean isPalindrome(Collection<E> c) {
Collection<E> tmp = c;
System.out.println(tmp);
Collections.reverse((List<E>) c);
System.out.println(c);
if(tmp == c) { return true; } else { return false; }
}
My professor wants us to set the method up to accept all collections which is why I used Collection and cast it as a list for the reverse method, but I'm not sure if that is done correctly. I know that it does reverse the list. Here is my main method:
public static void main(String...strings) {
Integer[] arr2 = {1,3,1,1,2};
LinkedList<Integer> ll2 = new LinkedList<Integer>(Arrays.asList(arr2));
if(isPalindrome(ll2)) { System.out.println("Successful!"); }
}
The problem is, I am testing this with an array that is not a palindrome, meaning it is not the same backwards as it is forwards. I already tested it using the array {1,3,1} and it works fine because that is a palindrome. Using {1,3,1,1,2} still returns true for palindrome, though it is clearly not. Here is my output using the {1,3,1,1,2} array:
[1, 3, 1, 1, 2]
[2, 1, 1, 3, 1]
Successful!
So, it seems to be properly reversing the List, but when it compares them, it assumes they are equal? I believe there is an issue with the tmp == c and how it checks whether they are equal. I assume it just checks if it contains the same elements, but I'm not sure. I also tried tmp.equals(c), but it returned the same results. I'm just curious if there is another method that I can use, or do I have to write a method to compare tmp and c?
Thank you in advance!
Tommy
In your code c and tmp are references to the same collection, so tmp == c will always be true (and tmp.equals(c) as well). You must copy your collection into a new instance, for example: List<E> tmp = new ArrayList<>(c);.
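The difference between a second reference and a real copy can be seen in isolation. This is only a small sketch; AliasVersusCopy and reversedCopy are names invented for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

class AliasVersusCopy {

    // Returns a new list holding the elements of c in reverse iteration order; c is not modified.
    static <E> List<E> reversedCopy(Collection<E> c) {
        List<E> copy = new ArrayList<>(c); // a separate object with the same elements
        Collections.reverse(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> c = new ArrayList<>(Arrays.asList(1, 3, 1, 1, 2));
        List<Integer> alias = c;                  // second name for the same list
        List<Integer> reversed = reversedCopy(c); // independent list

        System.out.println(alias == c);           // true: one object, two names
        System.out.println(c.equals(reversed));   // false: [1, 3, 1, 1, 2] vs [2, 1, 1, 3, 1]
    }
}
```

Assigning a collection to another variable never copies it, so any comparison between the two names compares the list with itself.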
Many small points
public static <E> boolean isPalindrome(Collection<E> c) {
List<E> list = new ArrayList<>(c);
System.out.println(list);
Collections.reverse(list);
System.out.println(list);
return list.equals(new ArrayList<E>(c));
}
Collections.reverse only works on a List, which has a defined order.
The method makes a copy of the collection instead of reversing the argument.
It uses equals to compare the two lists.
public static void main(String...strings) {
// Integer[] rather than int[]: Arrays.asList(int[]) would produce a List<int[]> with a single element
Integer[] arr2 = {1, 3, 1, 1, 2};
//List<Integer> ll2 = new LinkedList<>(Arrays.asList(arr2));
List<Integer> ll2 = Arrays.asList(arr2);
if (isPalindrome(ll2)) { System.out.println("Successful!"); }
}
You need to copy the Collection to a List / array. This has to be done, since the only ordering defined for a Collection is the one of the iterator.
Object[] asArray = c.toArray();
You can then apply the algorithm of your choice for checking whether this array is a palindrome, which tells you whether the Collection is a palindrome.
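One possible "algorithm of your choice" is a two-index scan that walks inward from both ends of the array. This is a sketch, not the only option; ArrayPalindrome is an illustrative name.

```java
import java.util.Collection;
import java.util.Objects;

class ArrayPalindrome {

    // Compares the element at each end and moves both indices toward the middle.
    static boolean isPalindrome(Collection<?> c) {
        Object[] asArray = c.toArray(); // snapshot in iteration order; c itself is untouched
        for (int left = 0, right = asArray.length - 1; left < right; left++, right--) {
            // Objects.equals also copes with null elements
            if (!Objects.equals(asArray[left], asArray[right])) {
                return false;
            }
        }
        return true;
    }
}
```

The scan stops at the first mismatch and needs at most half as many comparisons as there are elements.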
Alternatively, with a LinkedList it is more efficient to check whether the list is a palindrome without creating a new reversed List:
public static <E> boolean isPalindrome(Collection<E> c) {
List<E> list = new LinkedList<>(c);
Iterator<E> startIterator = list.iterator();
ListIterator<E> endIterator = list.listIterator(list.size());
for (int i = list.size() / 2; i > 0; i--) {
if (!Objects.equals(startIterator.next(), endIterator.previous())) {
return false;
}
}
return true;
}

Combine values with Java8 stream

If I have a list of integers, is there a way to construct another list in which an integer is added to the last element of the new list whenever the difference between the two is below a threshold? I would like to solve this using Java 8 streams. It should work similarly to the Scan operator of RxJava.
Example: 5, 2, 2, 5, 13
Threshold: 2
Result: 5, 9, 13
Intermediate results:
5
5, 2
5, 4 (2 and 2 summed)
5, 9 (4 and 5 summed)
5, 9, 13
Sequential Stream solution may look like this:
List<Integer> result = Stream.of(5, 2, 2, 5, 13).collect(ArrayList::new, (list, n) -> {
if(!list.isEmpty() && Math.abs(list.get(list.size()-1)-n) < 2)
list.set(list.size()-1, list.get(list.size()-1)+n);
else
list.add(n);
}, (l1, l2) -> {throw new UnsupportedOperationException();});
System.out.println(result);
Though it does not look much better than the good old loop:
List<Integer> input = Arrays.asList(5, 2, 2, 5, 13);
List<Integer> list = new ArrayList<>();
for(Integer n : input) {
if(!list.isEmpty() && Math.abs(list.get(list.size()-1)-n) < 2)
list.set(list.size()-1, list.get(list.size()-1)+n);
else
list.add(n);
}
System.out.println(list);
Seems that your problem is not associative, so it cannot be easily parallelized. For example, if you split the input into two groups like this (5, 2), (2, 5, 13), you cannot say whether the first two items of the second group should be merged until the first group is processed. Thus I cannot specify the proper combiner function.
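The split described above can be checked directly. The sketch below runs the rule once over the whole input and once over the two halves separately, then naively concatenates the partial results. NonAssociativeDemo and combineAll are names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NonAssociativeDemo {

    // Applies the rule left to right: n is added to the last entry when they differ by less than threshold.
    static List<Integer> combineAll(List<Integer> input, int threshold) {
        List<Integer> out = new ArrayList<>();
        for (int n : input) {
            int last = out.size() - 1;
            if (!out.isEmpty() && Math.abs(out.get(last) - n) < threshold) {
                out.set(last, out.get(last) + n);
            } else {
                out.add(n);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> whole = combineAll(Arrays.asList(5, 2, 2, 5, 13), 2);

        // Process the two halves independently, then concatenate the partial results.
        List<Integer> split = new ArrayList<>(combineAll(Arrays.asList(5, 2), 2));
        split.addAll(combineAll(Arrays.asList(2, 5, 13), 2));

        System.out.println(whole); // [5, 9, 13]
        System.out.println(split); // [5, 2, 2, 5, 13]
    }
}
```

The second half cannot know that its leading 2 should have been absorbed into the trailing 2 of the first half, and once that is repaired the following 5 would have to be reconsidered as well.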
As Tagir Valeev observed, (+1) the combining function is not associative, so reduce() won't work, and it's not possible to write a combiner function for a Collector. Instead, this combining function needs to be applied left-to-right, with the previous partial result being fed into the next operation. This is called a fold-left operation, and unfortunately Java streams don't have such an operation.
(Should they? Let me know.)
It's possible to sort-of write your own fold-left operation using forEachOrdered while capturing and mutating an object to hold partial state. First, let's extract the combining function into its own method:
// extracted from Tagir Valeev's answer
void combine(List<Integer> list, int n) {
if (!list.isEmpty() && Math.abs(list.get(list.size()-1)-n) < 2)
list.set(list.size()-1, list.get(list.size()-1)+n);
else
list.add(n);
}
Then, create the initial result list and call the combining function from within forEachOrdered:
List<Integer> result = new ArrayList<>();
IntStream.of(5, 2, 2, 5, 13)
.forEachOrdered(n -> combine(result, n));
This gives the desired result of
[5, 9, 13]
In principle this can be done on a parallel stream, but performance will probably degrade to sequential given the semantics of forEachOrdered. Also note that the forEachOrdered operations are performed one at a time, so we needn't worry about thread safety of the data we're mutating.
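For comparison, a fold-left helper can also be sketched on top of the stream's iterator instead of forEachOrdered. foldLeft, sumClose and the threshold parameter are hypothetical names introduced here; nothing like foldLeft exists in the Java 8 Stream API itself.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;
import java.util.stream.Stream;

class FoldLeftSketch {

    // Feeds each element, in encounter order, into step together with the previous partial result.
    static <T, R> R foldLeft(Stream<T> stream, R initial, BiFunction<R, T, R> step) {
        R acc = initial;
        Iterator<T> it = stream.iterator();
        while (it.hasNext()) {
            acc = step.apply(acc, it.next());
        }
        return acc;
    }

    // The combining function from the answers above, with the threshold made a parameter.
    static List<Integer> combine(List<Integer> list, int n, int threshold) {
        int last = list.size() - 1;
        if (!list.isEmpty() && Math.abs(list.get(last) - n) < threshold) {
            list.set(last, list.get(last) + n);
        } else {
            list.add(n);
        }
        return list;
    }

    static List<Integer> sumClose(List<Integer> input, int threshold) {
        List<Integer> start = new ArrayList<>();
        return foldLeft(input.stream(), start, (list, n) -> combine(list, n, threshold));
    }
}
```

Pulling elements through the iterator is strictly sequential, so this sketch gives up parallelism entirely in exchange for a reusable left-to-right fold.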
I know that the Stream's masters "Tagir Valeev" and "Stuart Marks" already pointed out that reduce() will not work because the combining function is not associative, and I'm risking a couple of downvotes here. Anyway:
What about if we force the stream to be sequential? Wouldn't we be able then to use reduce? Isn't the associativity property only needed when using parallelism?
Stream<Integer> s = Stream.of(5, 2, 2, 5, 13);
LinkedList<Integer> result = s.sequential().reduce(new LinkedList<Integer>(),
(list, el) -> {
if (list.isEmpty() || Math.abs(list.getLast() - el) >= 2) {
list.add(el);
} else {
list.set(list.size() - 1, list.getLast() + el);
}
return list;
}, (list1, list2) -> {
// not really needed, as the stream is sequential
list1.addAll(list2); return list1;
});
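A Java 8 way is to define a custom Spliterator.OfInt, here by extending Spliterators.AbstractIntSpliterator. Note that this version adds up the values below the threshold and attaches the sum to the next larger value, rather than comparing each value with the previous result: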
static class IntThreasholdSpliterator extends Spliterators.AbstractIntSpliterator {
private PrimitiveIterator.OfInt it;
private int threashold;
private int sum;
public IntThreasholdSpliterator(int threashold, IntStream stream, long est) {
super(est, ORDERED );
this.it = stream.iterator();
this.threashold = threashold;
}
@Override
public boolean tryAdvance(IntConsumer action) {
if(!it.hasNext()){
return false;
}
int next = it.nextInt();
if(next<threashold){
sum += next;
}else {
action.accept(next + sum);
sum = 0;
}
return true;
}
}
public static void main( String[] args )
{
IntThreasholdSpliterator s = new IntThreasholdSpliterator(3, IntStream.of(5, 2, 2, 5, 13), 5);
List<Integer> rs= StreamSupport.intStream(s, false).mapToObj(Integer::valueOf).collect(toList());
System.out.println(rs);
}
You can also hack it as follows:
List<Integer> list = Arrays.asList(5, 2, 2, 5, 13);
int[] sum = {0};
list = list.stream().filter(s -> {
if(s<=2) sum[0]+=s;
return s>2;
}).map(s -> {
int rs = s + sum[0];
sum[0] = 0;
return rs;
}).collect(toList());
System.out.println(list);
But I am not sure that this hack is a good idea for production code.

Sequential Searching

Ok I am relatively new to Java Programming, but have previous experience in C++. I want to search an array for a specific item, but what if there are more than one of the same specific item? Would it be best to use a temporary array to store all found items in the array and return the temporary array?
Note: I'm trying to find the best way of doing this with memory management and speed. And it's not for Home work:)
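Use the Apache Commons Collections library, which solves many of these problems. Use this if you want to filter a collection in place by a predicate:
// filter() removes the rejected elements, so it needs a mutable list (Arrays.asList alone is fixed-size)
List<Integer> values = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
CollectionUtils.filter(
values,
new Predicate() {
public boolean evaluate(final Object object) {
return ((Integer) object) > 2;
}
}
);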
If you would rather select the matching item(s) into a new collection, use
CollectionUtils.select(Collection inputCollection, Predicate predicate)
Or use the standard Java way: NavigableSet and NavigableMap
NavigableSet<E> subSet(E fromElement, boolean fromInclusive,
E toElement, boolean toInclusive);
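A minimal sketch of that subSet call on a TreeSet follows; SubSetSketch and between are illustrative names. Keep in mind that a set drops duplicates, so this finds a range of distinct values rather than every occurrence of one value.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

class SubSetSketch {

    // Returns the distinct values lying in [from, to], both bounds inclusive, in ascending order.
    static NavigableSet<Integer> between(Integer[] values, int from, int to) {
        NavigableSet<Integer> sorted = new TreeSet<>(Arrays.asList(values));
        return sorted.subSet(from, true, to, true);
    }
}
```

For the array {4, 6, 8, 9, 4, 2, 4, 2} and the range 4 to 8, the result contains 4, 6 and 8, each once.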
If you are able to skip Java, it is much easier in Scala:
scala> val a = Array(4, 6, 8, 9, 4, 2, 4, 2)
a: Array[Int] = Array(4, 6, 8, 9, 4, 2, 4, 2)
scala> a.filter(_ == 4)
res0: Array[Int] = Array(4, 4, 4)
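For readers staying on Java 8 or later, roughly the same filter can be sketched with a primitive stream. FilterSketch and allEqualTo are names chosen for this example only.

```java
import java.util.Arrays;

class FilterSketch {

    // Returns a new array holding every element of values that equals target.
    static int[] allEqualTo(int[] values, int target) {
        return Arrays.stream(values)
                .filter(v -> v == target)
                .toArray();
    }
}
```

Arrays.stream(int[]) yields an IntStream, so no boxing takes place and the result is again an int[].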
Just use the Guava library as the simplest solution:
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/Iterables.html
or
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/Collections2.html
Just use an ArrayList. Example:
/** Returns all strings starting with the letter a.*/
public static List<String> getStartsWithA(String[] strs) {
List<String> ret = new ArrayList<String>();
for (String s: strs) {
if (s.startsWith("a") || s.startsWith("A")) {
ret.add(s);
}
}
return ret;
}
ArrayList's internal array will dynamically grow as more space is needed.
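Applied to the original question (every occurrence of one specific item in an array), the same pattern might look like the sketch below, which records positions rather than the items themselves. SequentialSearch and indicesOf are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;

class SequentialSearch {

    // Returns the index of every element equal to target, in ascending order.
    static List<Integer> indicesOf(int[] values, int target) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            if (values[i] == target) {
                hits.add(i);
            }
        }
        return hits;
    }
}
```

This is a single pass over the array, and the only extra memory is one list entry per match.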
I would use a "ready to use" implementation like a HashMap. You say "search", so I believe that you have a search key (in my proposal the String) under which you can store your data (for example an Integer).
Map<String, List<Integer>> map = new HashMap<String, List<Integer>>();
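void storeValue(final String key, final Integer value) {
List<Integer> l = this.map.get(key);
if (l == null) {
synchronized (this.map) {
// re-read under the lock: another thread may have stored a list for this key in the meantime
l = this.map.get(key);
if (l == null) {
l = new Vector<Integer>();
this.map.put(key, l);
}
}
}
l.add(value);
}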
List<Integer> searchByKey(final String key) {
return this.map.get(key);
}
With this, you can store multiple Integers under one key. Of course, you can store objects other than Integers.
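On Java 8 or later, the get-or-create step can be sketched more compactly with Map.computeIfAbsent. The version below assumes single-threaded use (plain HashMap and ArrayList, no locking), and MultiValueIndex is a name invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MultiValueIndex {

    private final Map<String, List<Integer>> map = new HashMap<>();

    // Creates the list for a key on first use, then appends the value to it.
    void storeValue(String key, Integer value) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Returns all values stored under key, or an empty list when the key is unknown.
    List<Integer> searchByKey(String key) {
        return map.getOrDefault(key, Collections.emptyList());
    }
}
```

Returning an empty list for a missing key spares callers a null check, which is a design choice of this sketch rather than something the answer above does.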
