Combine values with Java 8 streams

If I have a list of integers, is there a way to construct another list in which integers are summed if the difference to the head of the new list is below a threshold? I would like to solve this using Java 8 streams. It should work similarly to the Scan operator of RxJava.
Example: 5, 2, 2, 5, 13
Threshold: 2
Result: 5, 9, 13
Intermediate results:
5
5, 2
5, 4 (2 and 2 summed)
5, 9 (4 and 5 summed)
5, 9, 13

Sequential Stream solution may look like this:
List<Integer> result = Stream.of(5, 2, 2, 5, 13).collect(ArrayList::new, (list, n) -> {
    if (!list.isEmpty() && Math.abs(list.get(list.size() - 1) - n) < 2)
        list.set(list.size() - 1, list.get(list.size() - 1) + n);
    else
        list.add(n);
}, (l1, l2) -> { throw new UnsupportedOperationException(); });
System.out.println(result);
Though it does not look much better than the good old loop:
List<Integer> input = Arrays.asList(5, 2, 2, 5, 13);
List<Integer> list = new ArrayList<>();
for (Integer n : input) {
    if (!list.isEmpty() && Math.abs(list.get(list.size() - 1) - n) < 2)
        list.set(list.size() - 1, list.get(list.size() - 1) + n);
    else
        list.add(n);
}
System.out.println(list);
It seems that your operation is not associative, so it cannot be easily parallelized. For example, if you split the input into two groups like this: (5, 2), (2, 5, 13), you cannot say whether the first two items of the second group should be merged until the first group is processed. Thus I cannot specify a proper combiner function.
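To make the non-associativity concrete, here is a small sketch. The class and method names are made up for illustration, and foldInParts stands in for a hypothetical naive combiner that processes each part on its own and concatenates the results. It assumes the same rule as above (merge when the difference is below 2).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AssociativityDemo {
    // Same rule as above: add n to the last element if they differ by less than 2
    static List<Integer> fold(List<Integer> input) {
        List<Integer> out = new ArrayList<>();
        for (int n : input) {
            int last = out.size() - 1;
            if (!out.isEmpty() && Math.abs(out.get(last) - n) < 2) {
                out.set(last, out.get(last) + n);
            } else {
                out.add(n);
            }
        }
        return out;
    }

    // Hypothetical naive combiner: fold each part separately, then concatenate
    static List<Integer> foldInParts(List<Integer> left, List<Integer> right) {
        List<Integer> out = new ArrayList<>(fold(left));
        out.addAll(fold(right));
        return out;
    }

    public static void main(String[] args) {
        // Whole input in one pass
        System.out.println(fold(Arrays.asList(5, 2, 2, 5, 13)));                       // [5, 9, 13]
        // Split as (5, 2), (2, 5, 13): the second part cannot know about the trailing 2
        System.out.println(foldInParts(Arrays.asList(5, 2), Arrays.asList(2, 5, 13))); // [5, 2, 2, 5, 13]
    }
}
```

The two results differ, which is why a combiner cannot simply glue partial results together here.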

As Tagir Valeev observed, (+1) the combining function is not associative, so reduce() won't work, and it's not possible to write a combiner function for a Collector. Instead, this combining function needs to be applied left-to-right, with the previous partial result being fed into the next operation. This is called a fold-left operation, and unfortunately Java streams don't have such an operation.
(Should they? Let me know.)
It's possible to sort-of write your own fold-left operation using forEachOrdered while capturing and mutating an object to hold partial state. First, let's extract the combining function into its own method:
// extracted from Tagir Valeev's answer
void combine(List<Integer> list, int n) {
    if (!list.isEmpty() && Math.abs(list.get(list.size() - 1) - n) < 2)
        list.set(list.size() - 1, list.get(list.size() - 1) + n);
    else
        list.add(n);
}
Then, create the initial result list and call the combining function from within forEachOrdered:
List<Integer> result = new ArrayList<>();
IntStream.of(5, 2, 2, 5, 13)
    .forEachOrdered(n -> combine(result, n));
This gives the desired result of
[5, 9, 13]
In principle this can be done on a parallel stream, but performance will probably degrade to sequential given the semantics of forEachOrdered. Also note that the forEachOrdered operations are performed one at a time, so we needn't worry about thread safety of the data we're mutating.
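The fold-left idea can also be wrapped in a small reusable helper. The following is only a sketch: foldLeft is a hypothetical name and not part of java.util.stream, and it walks the stream's iterator, so it is strictly sequential. The combine method is the one extracted above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;
import java.util.stream.Stream;

class FoldLeftSketch {
    // Hypothetical helper (not a JDK method): applies f strictly left to right,
    // feeding each partial result into the next step.
    static <T, R> R foldLeft(Stream<T> stream, R seed, BiFunction<R, ? super T, R> f) {
        R acc = seed;
        Iterator<T> it = stream.iterator();
        while (it.hasNext()) {
            acc = f.apply(acc, it.next());
        }
        return acc;
    }

    // Combining function from the answer above
    static void combine(List<Integer> list, int n) {
        if (!list.isEmpty() && Math.abs(list.get(list.size() - 1) - n) < 2) {
            list.set(list.size() - 1, list.get(list.size() - 1) + n);
        } else {
            list.add(n);
        }
    }

    static List<Integer> demo() {
        List<Integer> seed = new ArrayList<>();
        return foldLeft(Stream.of(5, 2, 2, 5, 13), seed, (list, n) -> {
            combine(list, n);
            return list;
        });
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [5, 9, 13]
    }
}
```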

I know that the Stream's masters "Tagir Valeev" and "Stuart Marks" already pointed out that reduce() will not work because the combining function is not associative, and I'm risking a couple of downvotes here. Anyway:
What about if we force the stream to be sequential? Wouldn't we be able then to use reduce? Isn't the associativity property only needed when using parallelism?
Stream<Integer> s = Stream.of(5, 2, 2, 5, 13);
LinkedList<Integer> result = s.sequential().reduce(new LinkedList<Integer>(),
        (list, el) -> {
            if (list.isEmpty() || Math.abs(list.getLast() - el) >= 2) {
                list.add(el);
            } else {
                list.set(list.size() - 1, list.getLast() + el);
            }
            return list;
        }, (list1, list2) -> {
            // not really needed, as the stream is sequential
            list1.addAll(list2);
            return list1;
        });

A Java 8 way is to define a custom Spliterator.OfInt implementation:
static class IntThreasholdSpliterator extends Spliterators.AbstractIntSpliterator {
    private PrimitiveIterator.OfInt it;
    private int threashold;
    private int sum;

    public IntThreasholdSpliterator(int threashold, IntStream stream, long est) {
        super(est, ORDERED);
        this.it = stream.iterator();
        this.threashold = threashold;
    }

    @Override
    public boolean tryAdvance(IntConsumer action) {
        if (!it.hasNext()) {
            return false;
        }
        int next = it.nextInt();
        if (next < threashold) {
            // small value: keep it in the running sum
            sum += next;
        } else {
            // large value: emit it together with the accumulated sum
            action.accept(next + sum);
            sum = 0;
        }
        return true;
    }
}

public static void main(String[] args) {
    IntThreasholdSpliterator s = new IntThreasholdSpliterator(3, IntStream.of(5, 2, 2, 5, 13), 5);
    List<Integer> rs = StreamSupport.intStream(s, false).mapToObj(Integer::valueOf).collect(toList());
    System.out.println(rs);
}
You can also hack it like this:
List<Integer> list = Arrays.asList(5, 2, 2, 5, 13);
int[] sum = {0};
list = list.stream().filter(s -> {
    if (s <= 2) sum[0] += s;
    return s > 2;
}).map(s -> {
    int rs = s + sum[0];
    sum[0] = 0;
    return rs;
}).collect(toList());
System.out.println(list);
But I am not sure that this hack is a good idea for production code.

Related

How to collect element of the stream into groups of fixed size in accordance with their order

I am trying to do the following with a Stream<BigDecimal> using Java 8 but am stuck at step 2.
Remove null and negative values.
Create groups with a size of 3 elements. Retain groups with an average of less than 30, otherwise discard.
Example. Let's assume the following:
Stream<BigDecimal> input = {4,5,6,61,3,9,3,1,null,-4,7,2,-8,6,-3,null}; // technically incorrect syntax, but just assume
I was able to solve step 1 as below:
Stream<BigDecimal> newInList = input.filter(bd -> (bd != null && bd.signum() > 0));
I'm not able to do the step 2 - create groups of 3 elements.
The expected result for step2: {4,5,6},{61,3,9},{3,1,7}.
I'm looking for a solution with Java 8 streams.
So you need to extract groups with the size of 3 elements from the stream in accordance with their order.
It can be done using Stream API by implementing a custom collector that implements the Collector interface.
When initializing the GroupCollector, the size of the group has to be provided (this was done to make the collector more flexible and to avoid hard-coding the value of 3 inside the class).
Deque<List<T>> is used as a mutable container because the Deque interface provides convenient access to the last element.
The combiner() method defines how to combine the partial results produced by different threads. For the collect() operation, a parallel stream guarantees that the encounter order of the stream is preserved: partial results are joined in the order in which their tasks were assigned. Therefore this solution can be parallelized.
Combining the two deques produced by different threads involves the following concerns:
Make sure that all groups (except possibly the last one) have exactly 3 elements. Therefore we can't simply add all the contents of the second deque to the first deque. Instead, every group of the second deque has to be processed one by one.
Lists that are already created should be reused.
The finisher() function discards the last list in the deque if its size is less than groupSize (a requirement provided by the OP in a comment).
As an example, I've used the sequence of numbers from the question.
public static void main(String[] args) {
Stream<BigDecimal> source =
IntStream.of(4, 5, 6, 61, 3, 9, 3, 1, 7, 2, 6)
.mapToObj(BigDecimal::valueOf);
System.out.println(createGroups(source)
.flatMap(List::stream)
.collect(Collectors.toList())); // collecting to list for demonstration purposes
}
Method createGroups()
public static Stream<List<BigDecimal>> createGroups(Stream<BigDecimal> source) {
return source
.collect(new GroupCollector<BigDecimal>(3))
.stream()
.filter(list -> averageIsLessThen(list, 30));
}
Collector
public class GroupCollector<T> implements Collector<T, Deque<List<T>>, Deque<List<T>>> {
private final int groupSize;
public GroupCollector(int groupSize) {
this.groupSize = groupSize;
}
@Override
public Supplier<Deque<List<T>>> supplier() {
return ArrayDeque::new;
}
@Override
public BiConsumer<Deque<List<T>>, T> accumulator() {
return (deque, next) -> {
if (deque.isEmpty() || deque.getLast().size() == groupSize) {
List<T> group = new ArrayList<>();
group.add(next);
deque.addLast(group);
} else {
deque.getLast().add(next);
}
};
}
@Override
public BinaryOperator<Deque<List<T>>> combiner() {
return (deque1, deque2) -> {
if (deque1.isEmpty()) {
return deque2;
} else if (deque1.getLast().size() == groupSize) {
deque1.addAll(deque2);
return deque1;
}
// last group in the deque1 has a size less than groupSize
List<T> curGroup = deque1.pollLast();
List<T> nextGroup;
for (List<T> nextItem: deque2) {
nextGroup = nextItem;
Iterator<T> iter = nextItem.iterator();
while (iter.hasNext() && curGroup.size() < groupSize) {
curGroup.add(iter.next());
iter.remove();
}
deque1.add(curGroup);
curGroup = nextGroup;
}
if (curGroup.size() != 0) {
deque1.add(curGroup);
}
return deque1;
};
}
@Override
public Function<Deque<List<T>>, Deque<List<T>>> finisher() {
return deque -> {
if (deque.peekLast() != null && deque.peekLast().size() < groupSize) {
deque.pollLast();
}
return deque;
};
}
@Override
public Set<Characteristics> characteristics() {
return Collections.emptySet();
}
}
The auxiliary method that is used to validate a group of elements based on its average value (in case you are wondering what RoundingMode is meant for, then read this answer).
private static boolean averageIsLessThen(List<BigDecimal> list, double target) {
BigDecimal average = list.stream()
.reduce(BigDecimal.ZERO, BigDecimal::add)
.divide(BigDecimal.valueOf(list.size()), RoundingMode.HALF_UP);
return average.compareTo(BigDecimal.valueOf(target)) < 0;
}
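On the RoundingMode point: BigDecimal.divide without a rounding mode throws an ArithmeticException when the exact quotient has a non-terminating decimal expansion, which is why the method above passes one. A small sketch follows (the class and method names are illustrative only). Note that the two-argument divide keeps the scale of the dividend, so with integer-valued inputs the average is rounded to a whole number.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DivideSketch {
    // 1 / 3 = 0.333... cannot be represented exactly, so divide() without rounding throws
    static boolean throwsWithoutRounding() {
        try {
            BigDecimal.ONE.divide(BigDecimal.valueOf(3));
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // Sum of the group (61, 3, 9) is 73; the result keeps the dividend's scale of 0
    static BigDecimal roundedAverage() {
        return BigDecimal.valueOf(73).divide(BigDecimal.valueOf(3), RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        System.out.println(throwsWithoutRounding()); // true
        System.out.println(roundedAverage());        // 24
    }
}
```

If fractional digits matter for the comparison, the three-argument overload divide(divisor, scale, roundingMode) lets you pick the scale explicitly.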
Output (expected result: { 4, 5, 6, 61, 3, 9, 3, 1, 7 }, provided by the OP):
[4, 5, 6, 61, 3, 9, 3, 1, 7]
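For comparison, if the filtered values are already in a List rather than a Stream, the grouping step alone can be sketched with plain index arithmetic instead of a custom collector. This is a different, simpler technique than the GroupCollector above; chunks is a hypothetical helper name, and the sketch assumes a random-access list and drops an incomplete trailing group.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ChunkSketch {
    // Splits the list into consecutive groups of `size` elements;
    // integer division drops an incomplete trailing group.
    static <T> List<List<T>> chunks(List<T> list, int size) {
        return IntStream.range(0, list.size() / size)
                .mapToObj(i -> list.subList(i * size, (i + 1) * size))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(4, 5, 6, 61, 3, 9, 3, 1, 7, 2, 6);
        System.out.println(chunks(values, 3)); // [[4, 5, 6], [61, 3, 9], [3, 1, 7]]
    }
}
```

The groups returned by subList are views of the original list, so copy them if the source list may change later.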

Looping over All Possible combinations of ArrayList

I want to loop over the same list to process possible combinations of that list. For example: from a list consisting of [1,2,3] I want to get an ArrayList which looks like this: [[1,2], [1,3], [2,3]]
I am processing a list of nodes instead of integers. For now I am trying something like the following:
ArrayList<ArrayList<Node>> saveList = new ArrayList<ArrayList<Node>>();
for (Node n1 : nodes)
ArrayList<Node> saveList2 = new ArrayList<Node>();
for (Node n2 : nodes)
if n2.name == n1.name
continue;
saveList2.add(n1).add(n2);
if (!saveList.containsAll(saveList2))
then process graph;
else continue;
I skip pairing a node with itself and avoid combinations that were already processed. Is there a better solution?
Using a combinatorics library may be a bit overkill in your case. Your task is indeed finding combinations of size 2, but the fact that the size is two simplifies it drastically.
A good old index-based for-loop does the trick here, with no check for duplicates necessary. Notice how the second loop starts from i + 1. Go over the algorithm in a scratchpad and you will see how this avoids duplicates.
List<List<Node>> pairs = new ArrayList<>();
for (int i = 0; i < nodes.size(); i++) {
for (int j = i + 1; j < nodes.size(); j++) {
pairs.add(Arrays.asList(nodes.get(i), nodes.get(j)));
}
}
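If you prefer streams, the same nested index loop can be sketched as follows. The class and method names are made up for illustration; the inner range starts at i + 1 exactly as in the loop above, so each unordered pair appears once.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PairsSketch {
    // Stream-based equivalent of the nested index loop: j always starts at i + 1,
    // so (a, b) is produced but (b, a) and (a, a) are not.
    static <T> List<List<T>> pairs(List<T> items) {
        return IntStream.range(0, items.size())
                .boxed()
                .flatMap(i -> IntStream.range(i + 1, items.size())
                        .<List<T>>mapToObj(j -> Arrays.asList(items.get(i), items.get(j))))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(pairs(Arrays.asList(1, 2, 3))); // [[1, 2], [1, 3], [2, 3]]
    }
}
```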
If the task is not of an academic nature and does not consist of implementing an algorithm, I would use a library and focus on the core of the task the application is supposed to solve. One such library is combinatoricslib3. Google Guava and Apache Commons have similar methods. With combinatoricslib3 the solution to your issue above would be a one-liner:
Generator.combination(1,2,3)
.simple(2)
.stream()
.forEach(System.out::println);
Output:
[1, 2]
[1, 3]
[2, 3]
or something like:
List<List<String>> result = Generator.combination("FOO", "BAR", "BAZ")
.simple(2)
.stream()
.collect(Collectors.toList());
System.out.println(result);
to get
[[FOO, BAR], [FOO, BAZ], [BAR, BAZ]]
It works not only for simple types like integers or strings as shown above; you can also use your own custom objects and pass a list of them as the parameter. Assuming you have a Node class:
public class Node {
String name;
// getter, setter, toString ...
}
List<Node> nodeList = List.of(new Node("node1"), new Node("node2"), new Node("node3"));
Generator.combination(nodeList)
.simple(2)
.stream()
.forEach(System.out::println);
Output:
[Node(name=node1), Node(name=node2)]
[Node(name=node1), Node(name=node3)]
[Node(name=node2), Node(name=node3)]
To use the library, add the dependency to your pom.xml, or download the jar and add it to the classpath. Maven dependency:
<dependency>
<groupId>com.github.dpaukov</groupId>
<artifactId>combinatoricslib3</artifactId>
<version>3.3.2</version>
</dependency>
Try this.
static <T> List<List<T>> combinations(List<T> list, int n) {
int length = list.size();
List<List<T>> result = new ArrayList<>();
T[] selections = (T[])new Object[n];
new Object() {
void select(int start, int index) {
if (index >= n)
result.add(List.of(selections));
else if (start < length){
selections[index] = list.get(start);
select(start + 1, index + 1);
select(start + 1, index);
}
}
}.select(0, 0);
return result;
}
public static void main(String[] args) {
List<Integer> list = List.of(1, 2, 3);
System.out.println(combinations(list, 2));
}
output:
[[1, 2], [1, 3], [2, 3]]

Stateful filter for ordered stream

I have a problem and I wonder if there is a solution using Streams.
Imagine you have an ordered stream of Objects; let's assume a stream of Integers.
Stream<Integer> stream = Stream.of(2,20,18,17,4,11,13,6,3,19,4,10,13....)
Now I want to filter all values where the difference of a value and the previous number before this value is greater than n.
stream.filter(magicalVoodoo(5))
// 2, 20, 4, 11, 3, 19, 4, 10 ...
Is there any possibility to do this?
Yes, this is possible, but you will need a stateful predicate that keeps track of the previous value for doing the comparison. This does mean it can only be used for sequential streams: with parallel streams you'd run into race conditions.
Luckily, most streams default to sequential, but if you need to do this on streams from an unknown source, you may want to check using isParallel() and either throw an exception, or convert it to a sequential stream using sequential().
An example:
public class DistanceFilter implements IntPredicate {
private final int distance;
private int previousValue;
public DistanceFilter(int distance) {
this(distance, 0);
}
public DistanceFilter(int distance, int startValue) {
this.distance = distance;
this.previousValue = startValue;
}
@Override
public boolean test(int value) {
if (Math.abs(previousValue - value) > distance) {
previousValue = value;
return true;
}
return false;
}
// Just for simple demonstration
public static void main(String[] args) {
int[] ints = IntStream.of(2, 20, 18, 17, 4, 11, 13, 6, 3, 19, 4, 10, 13)
.filter(new DistanceFilter(5))
.toArray();
System.out.println(Arrays.toString(ints));
}
}
I used IntStream here, because it is a better type for this, but the concept would be similar for Stream<Integer> (or other object types).
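Two details from the discussion above can be sketched together: forcing a possibly parallel stream to be sequential before applying a stateful predicate, and always accepting the first element so that the result matches the expected output in the question (2, 20, 4, 11, ...). This is a variant for illustration, not the class above; FirstAwareDistanceFilter and its filter helper are hypothetical names.

```java
import java.util.Arrays;
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

class FirstAwareDistanceFilter implements IntPredicate {
    private final int distance;
    private boolean seenFirst = false;
    private int previousValue;

    FirstAwareDistanceFilter(int distance) {
        this.distance = distance;
    }

    @Override
    public boolean test(int value) {
        // Always accept the first element; afterwards compare with the last accepted value
        if (!seenFirst || Math.abs(previousValue - value) > distance) {
            seenFirst = true;
            previousValue = value;
            return true;
        }
        return false;
    }

    // The predicate is stateful, so make sure the stream runs sequentially
    static int[] filter(IntStream stream, int distance) {
        IntStream safe = stream.isParallel() ? stream.sequential() : stream;
        return safe.filter(new FirstAwareDistanceFilter(distance)).toArray();
    }

    public static void main(String[] args) {
        int[] kept = filter(IntStream.of(2, 20, 18, 17, 4, 11, 13, 6, 3, 19, 4, 10, 13), 5);
        System.out.println(Arrays.toString(kept)); // [2, 20, 4, 11, 3, 19, 4, 10]
    }
}
```

A new predicate instance is created per call, so state never leaks from one stream to the next.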
Streams are not designed for this kind of task. I would use a different way to accomplish this, which doesn't use streams. But, if you really must use streams, the solution has to circumvent certain restrictions due to the design of streams and lambdas, and therefore looks quite hacky:
int[] previous = new int[1];
previous[0] = firstElement;
... = stream.filter(n -> {
boolean isAllowed = (abs(n - previous[0]) > 5);
if (isAllowed)
previous[0] = n;
return isAllowed;})
Notice that the variable previous is a one-element array. That's a hack needed because a lambda can only capture local variables that are effectively final: it cannot reassign previous, but it can modify an element of the array that previous refers to.

java streams - How to filter a collection to two new collections

I tried to create two lists - odds and evens as follows:
public static void main(String[] args) {
List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 5, 8, 13, 21));
List<Integer> odds = new ArrayList<>();
List<Integer> evens = new ArrayList<>();
numbers.stream().forEach(x -> x % 2 == 0 ? evens.add(x) : odds.add(x));
}
But it gave me an incompatible types error (bad return type in lambda expression; missing return value).
What is the best way to filter a collection to two new collections?
Since the other answers explain why it doesn't compile, I would simply use the partitioningBy collector in your case and fetch the resulting lists:
import static java.util.stream.Collectors.partitioningBy;
...
List<Integer> numbers = Arrays.asList(1, 2, 3, 5, 9, 13, 21);
Map<Boolean, List<Integer>> partition =
numbers.stream().collect(partitioningBy(x -> x % 2 == 0));
List<Integer> odds = partition.get(false);
List<Integer> evens = partition.get(true);
Well, you can make your existing code compile with a trivial modification:
public static void main(String[] args) {
List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 5, 8, 13, 21));
List<Integer> odds = new ArrayList<>();
List<Integer> evens = new ArrayList<>();
numbers.stream().forEach(x -> (x % 2 == 0 ? evens : odds).add(x));
}
The conditional ?: operator is an expression, which isn't a valid statement on its own. My modified code changes the use of the conditional operator to just select which list to add to, and then calls add on it - and that method invocation expression is a valid statement.
An alternative would be to collect using Collectors.partitioningBy - although in this particular case that would probably be more confusing code than what you've got.
A ternary operator is not a statement. If you're using a forEach block, you'd need a valid Java statement, or a complete block:
numbers.stream().forEach(x -> {
    if (x % 2 == 0) {
        evens.add(x);
    } else {
        odds.add(x);
    }
});

Merge sets when two elements in common

This is a follow-up to the question "compare sets".
I have
Set<Set<Node>> NestedSet = new HashSet<Set<Node>>();
[[Node[0], Node[1], Node[2]], [Node[0], Node[2], Node[6]], [Node[3], Node[4], Node[5]], [Node[2], Node[6], Node[7]]]
I want to merge the sets when they have two elements in common. For example, [0,1,2] and [0,2,6] have two elements in common, so they merge to form [0,1,2,6].
Likewise, [0,1,2,6] and [2,6,7] have 2 and 6 in common, so they merge to give [0,1,2,6,7].
The final output should be :
[ [Node[0], Node[1], Node[2], Node[6], Node[7]], [Node[3], Node[4], Node[5]] ]
I tried like this :
for (Set<Node> s1 : NestedSet ) {
Optional<Set<Node>> findFirst = result.stream().filter(p -> { HashSet<Node> temp = new HashSet<>(s1);
temp.retainAll(p);
return temp.size() == 2; }).findFirst();
if (findFirst.isPresent()){
findFirst.get().addAll(s1);
}
else {
result.add(s1);
}
}
But the result I got was :
[[Node[0], Node[1], Node[2], Node[6], Node[7]], [Node[3], Node[4], Node[5]], [Node[0], Node[2], Node[6], Node[7]]]
Any ideas? Is there any way to get the desired output?
Some considerations:
Each time you apply a merge, you have to restart the procedure and iterate over the modified collection. Because of this, the iteration order of the input matters: if you want your code to be deterministic, use collections that guarantee an iteration order (e.g. LinkedHashSet rather than HashSet, or a List).
Your current code has side effects as it modifies the supplied sets when merging. In general I think it helps to abstain from creating side effects whenever possible.
The following code (which uses Guava's Sets and Ints utilities) does what you want:
static <T> List<Set<T>> mergeSets(Collection<? extends Set<T>> unmergedSets) {
final List<Set<T>> mergedSets = new ArrayList<>(unmergedSets);
List<Integer> mergeCandidate = Collections.emptyList();
do {
mergeCandidate = findMergeCandidate(mergedSets);
// apply the merge
if (!mergeCandidate.isEmpty()) {
// gather the sets to merge
final Set<T> mergedSet = Sets.union(
mergedSets.get(mergeCandidate.get(0)),
mergedSets.get(mergeCandidate.get(1)));
// removes both sets using their index, starts with the highest index
mergedSets.remove(mergeCandidate.get(0).intValue());
mergedSets.remove(mergeCandidate.get(1).intValue());
// add the mergedSet
mergedSets.add(mergedSet);
}
} while (!mergeCandidate.isEmpty());
return mergedSets;
}
// O(n^2/2)
static <T> List<Integer> findMergeCandidate(List<Set<T>> sets) {
for (int i = 0; i < sets.size(); i++) {
for (int j = i + 1; j < sets.size(); j++) {
if (Sets.intersection(sets.get(i), sets.get(j)).size() == 2) {
return Arrays.asList(j, i);
}
}
}
return Collections.emptyList();
}
For testing this method I created two helper methods:
static Set<Integer> set(int... ints) {
return new LinkedHashSet<>(Ints.asList(ints));
}
@SafeVarargs
static <T> Set<Set<T>> sets(Set<T>... sets) {
return new LinkedHashSet<>(Arrays.asList(sets));
}
These helper methods make it possible to write very readable tests, for example (using the numbers from the question):
public static void main(String[] args) {
// prints [[2, 6, 7, 0, 1]]
System.out.println(mergeSets(sets(set(0, 1, 2, 6), set(2, 6, 7))));
// prints [[3, 4, 5], [0, 2, 6, 1, 7]]
System.out.println(
mergeSets(sets(set(0, 1, 2), set(0, 2, 6), set(3, 4, 5), set(2, 6, 7))));
}
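The code above depends on Guava. As a rough JDK-only sketch of the same restart-after-each-merge loop, assuming "merge" means exactly two shared elements as in the question: the class name and the setOf and demo helpers are made up for illustration, and the input sets are copied first so the caller's data is not modified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MergeSetsSketch {
    static <T> List<Set<T>> mergeSets(Collection<? extends Set<T>> input) {
        // Defensive copies, so the caller's sets are never modified
        List<Set<T>> sets = new ArrayList<>();
        for (Set<T> s : input) {
            sets.add(new LinkedHashSet<>(s));
        }
        boolean mergedSomething = true;
        while (mergedSomething) {
            mergedSomething = false;
            search:
            for (int i = 0; i < sets.size(); i++) {
                for (int j = i + 1; j < sets.size(); j++) {
                    Set<T> common = new HashSet<>(sets.get(i));
                    common.retainAll(sets.get(j));
                    if (common.size() == 2) {
                        // Fold set j into set i, then restart the scan on the modified list
                        sets.get(i).addAll(sets.remove(j));
                        mergedSomething = true;
                        break search;
                    }
                }
            }
        }
        return sets;
    }

    // Illustrative helper for building test data
    static Set<Integer> setOf(Integer... values) {
        return new LinkedHashSet<>(Arrays.asList(values));
    }

    static List<Set<Integer>> demo() {
        return mergeSets(Arrays.asList(setOf(0, 1, 2), setOf(0, 2, 6), setOf(3, 4, 5), setOf(2, 6, 7)));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [[0, 1, 2, 6, 7], [3, 4, 5]]
    }
}
```

Because the input here is a List and the copies are LinkedHashSets, the result is deterministic for a given input order.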
I'm not sure why you are getting that result, but I do see another problem with this code: It is order-dependent. For example, even if the code worked as intended, it would matter whether [Node[0], Node[1], Node[2]] is compared first to [Node[0], Node[2], Node[6]] or [Node[2], Node[6], Node[7]]. But Sets don't have a defined order, so the result is either non-deterministic or implementation-dependent, depending on how you look at it.
If you really want deterministic order-dependent operations here, you should be using List<Set<Node>>, rather than Set<Set<Node>>.
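Here's an approach using recursion. Note that it merges sets that share at least one element, rather than exactly two, and that it modifies the supplied sets: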
public static <T> Set<Set<T>> mergeIntersectingSets(Collection<? extends Set<T>> unmergedSets) {
boolean edited = false;
Set<Set<T>> mergedSets = new HashSet<>();
for (Set<T> subset1 : unmergedSets) {
boolean merged = false;
// if at least one element is contained in another subset, then merge the subsets
for (Set<T> subset2 : mergedSets) {
if (!Collections.disjoint(subset1, subset2)) {
subset2.addAll(subset1);
merged = true;
edited = true;
}
}
// otherwise, add the current subset as a new subset
if (!merged) mergedSets.add(subset1);
}
if (edited) return mergeIntersectingSets(mergedSets); // continue merging until we reach a fixpoint
else return mergedSets;
}
