Stop Java stream computations based on previous computation results

How can I break a stream computation based on previous results? If it is already obvious that stream.filter(...).count() will be less than some number, how can I stop the stream computation?
I have the following code which checks if some sampleData passes the predicate test:
// sampleData.size() may be greater than 10.000.000
Set<String> sampleData = downloadFromWeb();
return sampleData.stream().filter(predicate::test).count() > sampleData.size() * coefficient;
I could have thousands of sampleData sets. The problem is that this code is inefficient. For example, if coefficient equals 0.5, sampleData.size() = 10_000_000, and the first 5_000_000 elements fail predicate::test, there is no reason to validate the last 5_000_000 elements (count() can never be greater than 5_000_000).
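For reference, this is the short-circuiting behaviour the question asks for, written as a plain loop; a sketch only, with a made-up method name:
import java.util.Set;
import java.util.function.Predicate;

// Sketch only (not from the question): the early-exit behaviour written
// as a plain loop, to make explicit when further checking is pointless.
final class EarlyExitSketch {
    static boolean passes(Set<String> sampleData, Predicate<String> predicate, double coefficient) {
        long needed = (long) (sampleData.size() * coefficient); // matches must exceed this
        long remaining = sampleData.size();
        long matches = 0;
        for (String s : sampleData) {
            if (predicate.test(s)) {
                matches++;
            }
            remaining--;
            if (matches > needed) {
                return true;                    // already enough matches
            }
            if (matches + remaining <= needed) {
                return false;                   // the target can no longer be reached
            }
        }
        return false;
    }
}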

ZhekaKozlov’s answer is heading in the right direction, but it lacks the negation. For the number of matches to be larger than a certain threshold, the number of non-matching elements must be smaller than “size - threshold”. If we test whether the non-matching elements stay below that bound, we can apply a limit to stop once they exceed it:
Set<String> sampleData = downloadFromWeb();
final long threshold = sampleData.size() - (long) (sampleData.size() * coefficient);
return sampleData.stream()
                 .filter(predicate.negate()).limit(threshold + 1).count() < threshold;
There is, by the way, no reason to create a method reference to the test method of an existing Predicate like with predicate::test. Just pass the Predicate to the filter method. The code above also uses predicate.negate() instead of predicate.negate()::test…
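A small self-contained sketch of that idea, with an illustrative method name and toy data (not part of the original answer):
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical demo of the negate-and-limit idea on a small data set;
// the names hasEnoughMatches and sample are illustrative only.
public class NegateLimitDemo {
    static boolean hasEnoughMatches(Set<String> data, Predicate<String> predicate, double coefficient) {
        long threshold = data.size() - (long) (data.size() * coefficient);
        return data.stream()
                   .filter(predicate.negate())
                   .limit(threshold + 1)
                   .count() < threshold;
    }

    public static void main(String[] args) {
        Set<String> sample = Set.of("a", "bb", "ccc", "dddd", "eeeee");
        // 4 of 5 elements are longer than 1 character; with coefficient 0.5
        // that is more than 2.5 matches, so the check should print true
        System.out.println(hasEnoughMatches(sample, s -> s.length() > 1, 0.5));
    }
}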

To be honest I am not quite sure this would be correct, I hope someone will come along and review this, but here is my idea of using a custom spliterator:
static class CustomSpl<T> extends AbstractSpliterator<T> {

    private Spliterator<T> source;
    private int howMany;
    private int coefficient;
    private Predicate<T> predicate;
    private T current;
    private long initialSize;

    private void setT(T t) {
        this.current = t;
    }

    public CustomSpl(Spliterator<T> source, int howMany, int coefficient, Predicate<T> predicate, long initialSize) {
        super(source.estimateSize(), source.characteristics());
        this.source = source;
        this.howMany = howMany;
        this.coefficient = coefficient;
        this.predicate = predicate;
        this.initialSize = initialSize;
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> action) {
        boolean hasMore = source.tryAdvance(this::setT);
        System.out.println(current);
        if (!hasMore) {
            return false;
        }
        if (predicate.test(current)) {
            ++howMany;
        }
        if (initialSize - howMany <= coefficient) {
            return false;
        }
        action.accept(current);
        return true;
    }
}
For example, this will produce only 4 elements, since we said we only care about a coefficient of 5:
Spliterator<Integer> sp = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10).stream().spliterator();
long count = StreamSupport.stream(new CustomSpl<>(sp, 0, 5, x -> x > 3, sp.getExactSizeIfKnown()), false)
.count();
Also this is possible for spliterators with known size only.

Set<String> sampleData = downloadFromWeb();
int size = (int) (sampleData.size() * coefficient);
return sampleData.stream().filter(predicate::test).limit(size + 1).count() > size;

Related

Calculate the sum of dices based on target accurately

I am trying to sum up the score for a little dice game I made. The dice have the values 1-6, and the player can select a target: LOW (n < 4, i.e. 1, 2, 3) or one of the remaining targets 4, 5, 6, 7, 8, 9, 10, 11 and 12.
When a throw is done, sum up the total of the dice based on the target value. This means that if LOW is selected, everything below 4 is summed up. Otherwise, the process must sum dice until it reaches the target sum, and continue.
If a throw is done, I selected 6 as the target, and I get the following set: {1, 1, 4, 2, 5, 6}, then we have a 6, 5+1=6, and 4+2=6; we are left with one 1, which is not counted.
Constraints:
Dice values are 1-6.
Everything below 4 (the LOW target) is summed up.
Targets 4, 5, 6, 7, 8, 9, 10, 11 and 12 are processed differently.
Six dice can each produce any number between 1 and 6; this means [6,6,6,6,6,6], [1,1,3,5,4,2], or some other set.
The only important thing is the calculated sum, as long as it matches the dice input.
For example:
If the target is 12 and a list of numbers is [6, 6, 6, 6, 6, 6] then return value should be 36.
If we receive a list of numbers [1, 3, 4, 5, 6, 6] and the target is 12, then 5+1+6=12 and also 5+4+3=12; however, numbers can only be used once and not reused, therefore only one of the combinations can contribute to the result.
Below is a method which counts the occurrences of each die value.
public static TreeMap<Integer, Integer> countOccurrences(List<Integer> numbers) {
    TreeMap<Integer, Integer> map = new TreeMap<Integer, Integer>();
    int i = 0;
    for (int n : numbers) {
        if (map.containsKey(n)) {
            i = map.get(n);
            map.put(n, ++i);
        } else {
            map.put(n, 1);
        }
    }
    return map;
}
Results:
Occurrences: {1=2, 2=1, 4=1, 5=1, 6=1}
Sample code
public static void main(String[] args) {
    System.out.println(sum(combinationSum(List.of(1, 2, 4, 6, 6, 6), 12)));
    System.out.println(combinationSum(List.of(1, 3, 3, 6, 5, 6), 12));
    System.out.println(combinationSum(List.of(1, 2, 1, 4, 5, 6), 6));
}

public static int sum(List<List<Integer>> numbers) {
    int sum = 0;
    int n = 0;
    while (n < numbers.size()) {
        sum += numbers.get(n).stream().mapToInt(Integer::intValue).sum();
        ++n;
    }
    return sum;
}

public static List<List<Integer>> combinationSum(List<Integer> candidates, int target) {
    List<List<Integer>> res = new ArrayList<>();
    List<Integer> ds = new ArrayList<Integer>();
    findCombinations(res, ds, target, candidates, 0);
    return res;
}

private static void findCombinations(List<List<Integer>> res, List<Integer> ds, int target, List<Integer> arr, int index) {
    if (target == 0) {
        res.add(new ArrayList<>(ds));
        return;
    }
    for (int i = index; i < arr.size(); i++) {
        if (i > index && arr.get(i) == arr.get(i - 1)) continue;
        if (arr.get(i) > target) break;
        ds.add(arr.get(i));
        findCombinations(res, ds, target - arr.get(i), arr, i + 1);
        ds.remove(ds.size() - 1);
    }
}
Produces:
24
[[1, 6, 5], [1, 5, 6], [3, 3, 6], [3, 3, 6], [6, 6]]
[[1, 1, 4], [1, 5], [2, 4], [1, 5], [6]]
Live running: https://www.jdoodle.com/ia/sHY
Update
In order to find the maximum possible score of the list, we maximize the number of non-overlapping combinations that we can construct from it.
For that we need to take the following steps:
Find all the possible combinations which produce the target sum. We cannot reject any combinations on the fly, because we can't know in advance which of them can be used in the optimal solution. For instance, if the target is 6 and the list of numbers is [3,3,2,2,1,1], there will be the following combinations: [3,3], [2,2,1,1] and [3,2,1] appearing two times. If we pick [3,3], which is the shortest combination, we will not be able to construct any combinations and the resulting score will be wrong (6). Instead, we need to choose the two [3,2,1] combinations, which will give the result 12. That proves that we can't rely on the combination size; we need to explore all the combinations and then choose the optimal set of combinations.
Generate all possible groups of combinations that fit the given list of numbers and find a group having the maximum number of combinations in it.
In the code below both steps are implemented recursively.
A link to Online Demo
Explanation:
The entry point is the method that kicks off the process.
public static int sumRecursively(List<Integer> numbers, int target) {
    Deque<List<Integer>> combinations = new ArrayDeque<>();
    generateCombinations(numbers, combinations, new ArrayList<>(), target);
    return processCombinations(combinations, numbers, target);
}
Step 1. Generating all the combinations.
Recursive method responsible for generating the set of combinations (implemented as void to avoid wrapping a combination with an additional collection and then throwing this wrapper away):
private static void generateCombinations(List<Integer> numbers,
                                         Queue<List<Integer>> combinations,
                                         List<Integer> combination,
                                         int currentSum) {
    if (currentSum == 0) { // base case - a combination has been found
        combinations.add(combination);
        return;
    }
    // recursive case
    // the loop is needed only to discard elements that exceed the currentSum -
    // note the condition below and the break at the end of the loop
    // (we break out of the loop after the first element that fits the current sum)
    for (Integer next : numbers) {
        if (next > currentSum) continue;
        List<Integer> newNumbers = new ArrayList<>(numbers);
        newNumbers.remove(next);
        // there are two possibilities for each number: use it, or ignore it
        // use next number
        List<Integer> newCombination = new ArrayList<>(combination);
        newCombination.add(next);
        generateCombinations(newNumbers, combinations, newCombination, currentSum - next);
        // ignore next number
        generateCombinations(newNumbers, combinations, new ArrayList<>(combination), currentSum);
        break;
    }
}
Step 2. Generating the groups of combinations.
The method below is responsible for choosing the group that fits the given list (i.e. we can construct all the combinations in the group from list elements) and has the maximum number of combinations in it.
All the functionality related to processing a group of combinations (represented as a List<List<Integer>>) is encapsulated in a class CombinationGroup to make the code cleaner.
public static int processCombinations(Deque<List<Integer>> combinations,
                                      List<Integer> numbers,
                                      int target) {
    List<CombinationGroup> combinationGroups = new ArrayList<>();
    generateGroups(combinationGroups, combinations, new CombinationGroup(numbers.size()));
    return combinationGroups.stream()
            .filter(group -> group.canConstruct(numbers))
            .mapToInt(group -> group.getCombCount() * target)
            .max()
            .orElse(0);
}
The following method is responsible for creating all possible groups of previously discovered combinations. There's also a small optimization: the total number of elements in each group should not exceed the number of elements in the source list:
public static void generateGroups(List<CombinationGroup> groups,
                                  Deque<List<Integer>> combinations,
                                  CombinationGroup group) {
    if (combinations.isEmpty()) {
        groups.add(group);
        return;
    }
    Deque<List<Integer>> combinationsCopy = new ArrayDeque<>(combinations);
    List<Integer> comb = null;
    while (!combinationsCopy.isEmpty() && (comb == null || !group.canAdd(comb))) {
        comb = combinationsCopy.removeLast();
    }
    // adding the combination
    if (comb != null) {
        CombinationGroup groupWithNewComb = group.copy();
        groupWithNewComb.addCombination(comb);
        generateGroups(groups, combinationsCopy, groupWithNewComb);
    }
    // ignoring the combination
    generateGroups(groups, combinationsCopy, group);
}
Class CombinationGroup used in the methods above:
public class CombinationGroup {
    private List<List<Integer>> combinations = new ArrayList<>();
    private int combCount;  // number of combinations
    private int size;       // total number of elements in the list of combinations
    private int sizeLimit;

    public CombinationGroup(int sizeLimit) {
        this.sizeLimit = sizeLimit;
    }

    public boolean canAdd(List<Integer> combination) {
        return size + combination.size() <= sizeLimit;
    }

    public boolean addCombination(List<Integer> combination) {
        if (!canAdd(combination)) return false;
        combinations.add(combination);
        size += combination.size();
        combCount++;
        return true;
    }

    public CombinationGroup copy() {
        CombinationGroup copy = new CombinationGroup(this.sizeLimit);
        for (List<Integer> comb : combinations) {
            copy.addCombination(comb);
        }
        return copy;
    }

    public boolean canConstruct(List<Integer> numbers) {
        if (numbers.size() < size) return false;
        Map<Integer, Long> frequencyByValueNumb = getFrequencies(numbers.stream());
        Map<Integer, Long> frequencyByValueComb = getFrequencies();
        // every value used by this CombinationGroup has to appear in the given list
        // of numbers at least as many times - only then can all these combinations
        // be constructed from the given list
        return frequencyByValueNumb.keySet().stream()
                .allMatch(key -> frequencyByValueNumb.get(key) >= frequencyByValueComb.getOrDefault(key, 0L));
    }

    public Map<Integer, Long> getFrequencies() {
        return getFrequencies(combinations.stream().flatMap(List::stream));
    }

    public Map<Integer, Long> getFrequencies(Stream<Integer> stream) {
        return stream.collect(Collectors.groupingBy(
                Function.identity(),
                Collectors.counting()
        ));
    }

    public int getCombCount() {
        return combCount;
    }

    @Override
    public String toString() {
        return "CombinationGroup{" +
                "combinations=" + combinations +
                '}';
    }
}
main()
public static void main(String[] args) {
    System.out.println(sumRecursively(List.of(1, 3, 4, 5, 6, 6), 12));
    System.out.println(sumRecursively(List.of(1, 3, 3, 6, 5), 12));
    System.out.println(sumRecursively(List.of(1, 2, 1, 4, 5, 6), 6));
}
Output:
24
12
18
Simplified algorithm
(does not maximize the number of combinations)
In order to ensure that all the elements in each combination are unique, we need to track indices that have already been used. I.e. each time we find a combination which sums up to the target number, we should prohibit the usage of the elements used in this combination, but not earlier (because there can be many combinations which are not able to produce the target, and therefore any element should remain eligible until we have constructed a complete combination that gives the target using this element).
To track the elements that are taken, we need an object that is visible in every recursive branch. We are already passing a list of numbers with every recursive call, so what if we modified it each time we found a combination that produces the target number, removing the elements that have been used in this combination? If we took this road, things would become complicated after the first combination, because we would no longer be able to rely on the indices (they can change in an unpredictable way) while constructing a single combination - and it's crucial to ensure that each element belonging to a particular combination is used only once within that combination. Since element values might be identical, we have to use the iteration order to construct each combination properly, but every removal of elements would create a mess. So, is there a better way?
We can maintain an array of boolean values, where each element in this array indicates whether the number at a particular index already belongs to a combination that gives the target or not.
Instead of cluttering the recursive method with the code that manipulates this boolean array, I've encapsulated it within a class with simple and self-explanatory methods, and sumRecursively() makes use of an instance of this class.
public class CombinationTracker {
    private boolean[] isUsed;

    public CombinationTracker(int size) {
        this.isUsed = new boolean[size];
    }

    public boolean indexIsUsed(int ind) {
        return isUsed[ind];
    }

    public boolean allNotUsed(List<Integer> indices) {
        // return indices.stream().noneMatch(i -> isUsed[i]); // this line does the same as the loop below
        boolean result = true;
        for (int idx : indices) {
            if (isUsed[idx]) {
                result = false;
                break;
            }
        }
        return result;
    }

    public void setIsUsed(List<Integer> indices) {
        for (int ind : indices)
            setIsUsed(ind);
    }

    public void setIsUsed(int ind) {
        isUsed[ind] = true;
    }
}
Using this approach, we are able to construct combinations from numbers that are not used yet, and iterate over the list of numbers starting from a particular position by passing the index with each recursive call. We can be sure that none of the elements residing prior to the current position will be added to the current combination.
Now, a quick recap on recursion.
Every recursive implementation consists of two parts:
Base case - that represents an edge-case (or a set of edge-cases) for which the outcome is known in advance. For this problem, there are two edge-cases:
we've managed to find a combination that gives the target number, i.e. currentSum == target, and the result would be equal to target;
the end of the list is reached (and the combination doesn't add up to the target); the result would be 0 (this edge case resolves automatically via the termination condition of the for loop in the recursive case, so there is no need to treat it separately).
Recursive case - a part of the solution where recursive calls are made and where the main logic resides. In the recursive case we iterate over the list of numbers, and at each iteration step (if the index is not yet used) we make one or two recursive calls depending on the value of the current element (depending on whether we exceed the target or not). In general, we have two options: either take the current element, or ignore it. The results of these recursive calls are added together and produce the return value of the recursive case.
Since we need a couple of additional parameters, it's a good practice to create an auxiliary overloaded method (that will be used in the client code) which expects only a list of numbers and a target value and delegates the actual work to the recursive method.
This is how it might look.
public static int sumRecursively(List<Integer> numbers, int target) {
    return sumRecursively(new ArrayList<>(numbers),
                          new ArrayList<>(),
                          new CombinationTracker(numbers.size()),
                          0, 0, target);
}
The actual recursive method:
private static int sumRecursively(List<Integer> numbers,
                                  List<Integer> combination,
                                  CombinationTracker tracker,
                                  int currentIndex,
                                  int currentSum, int target) {
    if (currentSum == target && tracker.allNotUsed(combination)) { // base case - a combination has been found
        tracker.setIsUsed(combination);
        return target;
    }
    // recursive case
    int result = 0;
    for (int i = currentIndex; i < numbers.size(); i++) {
        int next = numbers.get(i);
        if (tracker.indexIsUsed(i)) continue;
        if (currentSum + next > target) continue;
        // there are two options for each number: either use the next number, or ignore it
        // add next number
        if (next + currentSum <= target) {
            List<Integer> newCombination = new ArrayList<>(combination);
            newCombination.add(i);
            result += sumRecursively(numbers, newCombination, tracker, i + 1, currentSum + next, target);
        }
        // ignore next number
        result += sumRecursively(numbers, new ArrayList<>(combination), tracker, i + 1, currentSum, target);
    }
    return result;
}
main()
public static void main(String[] args) {
    System.out.println(sumRecursively(List.of(1, 3, 4, 5, 6, 6), 12));
    System.out.println(sumRecursively(List.of(6, 6, 6, 6, 6, 6), 12));
}
Output:
12
36
UPD.
Got a comment that code "sucks" due to performance issues...
First of all, I believe you have missed the point: the SO community is not a service that solves coding-interview puzzles; generally speaking, if you came here with a puzzle you have already failed, so such comments are unacceptable.
Second, yes, the code suffers from performance issues simply because it is a naive brute-force solution - I spent about 15 minutes on it (for example, figuring out all possible combinations with the target sum has O(2^N) complexity; if that does not match the performance expectations, then any code based on such an idea will have poor performance). BTW, if you had expectations about the performance, you needed to:
provide correct input constraints (saying there are 6 numbers is not correct)
provide good test cases instead of saying the code does not work - that allows us to eliminate bad ideas about the algorithm.
One idea:
it seems that we do not need to compute all possible combinations with the target sum, because singletons are always preferable over N-lets, and pairs are either equivalent, do not influence the result, or are interchangeable with N-lets (e.g. in the case of [2,2,8,10,10] we would prefer to eliminate pairs first); but whether this holds for higher N is completely unclear - it is better to have some test cases.
Not sure I properly understand the problem, but I believe the solution is the following:
public class TargetNumber {

    public static void main(String[] args) {
        System.out.println(score(new int[]{1, 2, 1, 4, 5, 6}, 6));
    }

    public static int score(int[] candidates, int target) {
        List<List<Integer>> combinations = getUniqueCombinations(candidates, target);
        Map<Integer, Integer> freqs = new HashMap<>();
        for (int n : candidates) {
            freqs.merge(n, 1, Integer::sum);
        }
        return score(0, combinations, freqs, target);
    }

    public static int score(int offset, List<List<Integer>> combinations, Map<Integer, Integer> freqs, int target) {
        if (offset == combinations.size()) {
            return 0;
        }
        int result = 0;
        List<Integer> comb = combinations.get(offset);
        Map<Integer, Integer> nfreq = reduce(freqs, comb);
        if (nfreq != null) {
            result = Math.max(result, target + score(offset, combinations, nfreq, target));
        }
        result = Math.max(result, score(offset + 1, combinations, freqs, target));
        return result;
    }

    public static Map<Integer, Integer> reduce(Map<Integer, Integer> freqs, List<Integer> comb) {
        Map<Integer, Integer> result = new HashMap<>(freqs);
        for (int n : comb) {
            if (result.merge(n, -1, Integer::sum) < 0) {
                return null;
            }
        }
        return result;
    }

    public static List<List<Integer>> getUniqueCombinations(int[] candidates, int target) {
        List<List<Integer>> result = new ArrayList<>();
        Arrays.sort(candidates);
        getUniqueCombinations(candidates, target, 0, new ArrayList<>(), result);
        return result;
    }

    public static void getUniqueCombinations(int[] candidates, int target, int index, List<Integer> solution, List<List<Integer>> result) {
        for (int i = index, n = candidates.length; i < n; ) {
            int num = candidates[i];
            if (num > target) {
                break;
            }
            solution.add(num);
            if (num == target) {
                result.add(new ArrayList<>(solution));
            }
            getUniqueCombinations(candidates, target - num, i + 1, solution, result);
            solution.remove(solution.size() - 1);
            while (i < n && num == candidates[i]) {
                i++;
            }
        }
    }
}

How do I pass solutions from recursive method call to calling method? (backtracking algorithm)

I'm trying to implement a backtracking algorithm to balance weights on a scale. It's for university, so there are given weights I have to use (0, 2, 7, 20, 70, 200, 700). Weights can be placed on the scale multiple times to match the input. For example: input(80) -> result(20, 20, 20, 20) or input(16) -> result(7,7,2).
I have to use backtracking and recursion.
I have difficulties understanding how to do the backtracking if a proposal is wrong. I can only step back one step, but if the right solution requires two steps back my algorithm fails.
So my method isInvalid() is checking if the sum of all counterweights is higher than the input. If so, it will remove the last weight.
I guess this is my problem. For input(16) it produces (7,7,2) --> correct.
But for input(21) it never finishes, because it tries to add 20, and then tries to add 7. Then it will be over 21 and will remove 7, but it will never remove the 20.
/* This is my backtracking algorithm */
public Proposal calc(Proposal proposal) {
    Proposal result;
    if (proposal.isInvalid()) return null;
    if (proposal.isSolution()) return proposal;
    for (int i : proposal.possibleNextSteps()) {
        Proposal newProposal = new Proposal(proposal.getWeight(), proposal.getCounterWeights());
        newProposal.apply(i);
        result = calc(newProposal);
        if (result != null) return result;
    }
    return null;
}

/* this is the class Proposal (only required parts) */
public class Proposal {
    private int weight;
    private ArrayList<Integer> counterWeights;
    private static Integer[] weights = {0, 2, 7, 20, 70, 200};

    public Proposal(int weight, ArrayList<Integer> counterWeights) {
        this.weight = weight;
        this.counterWeights = counterWeights;
        Arrays.sort(weights, Collections.reverseOrder());
    }

    public boolean isInvalid() {
        if (counterWeights.stream().mapToInt(i -> i.intValue()).sum() > weight) {
            counterWeights.remove(counterWeights.size() - 1);
            return true;
        }
        return false;
    }

    public boolean isSolution() {
        return counterWeights.stream().mapToInt(value -> value).sum() == weight;
    }

    public Integer[] possibleNextSteps() {
        return weights;
    }

    public void apply(int option) {
        this.counterWeights.add(option);
    }
}
What am I doing wrong?
And also, is this the right way to reverse my array of weights?
Thanks!
EDIT:
I tried something different.
I changed this:
Proposal newProposal = new Proposal(proposal.getWeight()- proposal.getSum(), new ArrayList<>());
And this:
public boolean isInvalid() {
    return counterWeights.stream().mapToInt(value -> value).sum() > weight;
}
So now if I follow it step by step in debug mode, it is pretty much doing what I want it to do, but it does not pass the solutions from my recursion back to the previous call, so they do not add up to a final solution.
So basically I break the problem down into smaller problems (once I find a weight that fits, I call the method recursively with the difference between the total weight and the solution I've already found). But how do I pass the solutions to the calling method?
In the following implementation, a solution is an array of coefficients. A coefficient at index i is the number of times the weight at position i appears in the solution.
Note that you can have several solutions giving the same total weight; this implementation gives them all. It's easy to change it to return only the first solution found.
The recursive method void solve(int weight, int n, int total) tries, for index n, all counts for which the total weight is no greater than the target weight.
public class Solver {
    private final int[] weights;
    private int[] current;
    private final List<int[]> solutions = new ArrayList<>();

    public Solver(int... weights) {
        this.weights = weights;
    }

    public int[][] solve(int weight) {
        current = new int[weights.length];
        solutions.clear();
        solve(weight, 0, 0);
        return solutions.toArray(new int[solutions.size()][]);
    }

    public void printSolution(int[] solution) {
        int total = 0;
        for (int i = 0; i < solution.length; ++i) {
            for (int j = 0; j < solution[i]; ++j) {
                System.out.print(weights[i] + " ");
                total += weights[i];
            }
        }
        System.out.println(" total: " + total);
        System.out.println();
    }

    private void solve(int weight, int n, int total) {
        if (n >= current.length) {
            if (total == weight) {
                solutions.add(current.clone());
            }
        } else {
            solve(weight, n + 1, total);
            while (total < weight) {
                ++current[n];
                total += weights[n];
                solve(weight, n + 1, total);
            }
            current[n] = 0;
        }
    }
}
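A hypothetical usage example for the Solver above, with the weights from the question (weight 0 is left out on purpose, since a zero weight would keep the inner while loop in solve() from ever terminating):
// Hypothetical usage sketch; class name and chosen input are illustrative.
public class SolverDemo {
    public static void main(String[] args) {
        Solver solver = new Solver(2, 7, 20, 70, 200, 700);
        for (int[] solution : solver.solve(21)) {
            solver.printSolution(solution); // prints each solution's weights and total
        }
    }
}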

How to implement a Spliterator for streaming Fibonacci numbers?

I'm playing with Java 8 Spliterator and created one to stream Fibonacci numbers up to a given n. So for the Fibonacci series 0, 1, 1, 2, 3, 5, 8, ...
n    fib(n)
----------
1    0
2    1
3    1
4    2
Following is my implementation, which prints a bunch of 1s before running out of stack memory. Can you help me find the bug? (I think it's not advancing currentIndex, but I'm not sure what value to set it to.)
Edit 1: If you decide to answer, please keep it relevant to the question. This question is not about efficient fibonacci number generation; it's about learning spliterators.
FibonacciSpliterator:
@RequiredArgsConstructor
public class FibonacciSpliterator implements Spliterator<FibonacciPair> {
    private int currentIndex = 3;
    private FibonacciPair pair = new FibonacciPair(0, 1);
    private final int n;

    @Override
    public boolean tryAdvance(Consumer<? super FibonacciPair> action) {
        // System.out.println("tryAdvance called.");
        // System.out.printf("tryAdvance: currentIndex = %d, n = %d, pair = %s.\n", currentIndex, n, pair);
        action.accept(pair);
        return n - currentIndex >= 2;
    }

    @Override
    public Spliterator<FibonacciPair> trySplit() {
        // System.out.println("trySplit called.");
        FibonacciSpliterator fibonacciSpliterator = null;
        if (n - currentIndex >= 2) {
            // System.out.printf("trySplit Begin: currentIndex = %d, n = %d, pair = %s.\n", currentIndex, n, pair);
            fibonacciSpliterator = new FibonacciSpliterator(n);
            long currentFib = pair.getMinusTwo() + pair.getMinusOne();
            long nextFib = pair.getMinusOne() + currentFib;
            fibonacciSpliterator.pair = new FibonacciPair(currentFib, nextFib);
            fibonacciSpliterator.currentIndex = currentIndex + 3;
            // System.out.printf("trySplit End: currentIndex = %d, n = %d, pair = %s.\n", currentIndex, n, pair);
        }
        return fibonacciSpliterator;
    }

    @Override
    public long estimateSize() {
        return n - currentIndex;
    }

    @Override
    public int characteristics() {
        return ORDERED | IMMUTABLE | NONNULL;
    }
}
FibonacciPair:
@RequiredArgsConstructor
@Value
public class FibonacciPair {
    private final long minusOne;
    private final long minusTwo;

    @Override
    public String toString() {
        return String.format("%d %d ", minusOne, minusTwo);
    }
}
Usage:
Spliterator<FibonacciPair> spliterator = new FibonacciSpliterator(5);
StreamSupport.stream(spliterator, true)
             .forEachOrdered(System.out::print);
Besides the fact that your code is incomplete, there are at least two errors recognizable in your tryAdvance method. First, you are not actually making any advance; you are not modifying any state of your spliterator. Second, you are unconditionally invoking the action’s accept method, which does not match the fact that you are returning a conditional value rather than true.
The purpose of tryAdvance is:
as the name suggests, try to make an advance, i.e. calculate a next value
if there is a next value, invoke action.accept with that value and return true
otherwise just return false
Note further that your trySplit() does not look very convincing; I don’t even know where to start. You are better off inheriting from AbstractSpliterator and not implementing a custom trySplit(). Your operation doesn’t benefit from parallel execution anyway. A stream constructed with that source could only gain an advantage from parallel execution if you chain it with quite expensive per-element operations.
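A sketch of what a state-advancing tryAdvance could look like when extending Spliterators.AbstractSpliterator, as suggested above (class and field names are made up, not from the answer):
import java.util.Spliterators;
import java.util.function.Consumer;

// Sketch: the spliterator advances its own state on every call and only
// invokes the action when it actually has a value to hand out.
class SimpleFibonacciSpliterator extends Spliterators.AbstractSpliterator<Long> {
    private long previous = 0, current = 1;
    private int remaining;

    SimpleFibonacciSpliterator(int n) {
        super(n, ORDERED | IMMUTABLE | NONNULL);
        this.remaining = n;
    }

    @Override
    public boolean tryAdvance(Consumer<? super Long> action) {
        if (remaining <= 0) {
            return false;               // nothing left: just return false
        }
        action.accept(previous);        // publish the current value...
        long next = previous + current; // ...and only then advance the state
        previous = current;
        current = next;
        remaining--;
        return true;
    }
}
Used as StreamSupport.stream(new SimpleFibonacciSpliterator(10), false).forEach(System.out::println), this would print the first ten numbers of the series starting 0, 1, 1, 2, ...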
In general you don't need to implement the spliterator yourself. If you really need a Spliterator object, you may use a stream for this purpose:
Spliterator.OfLong spliterator = Stream
        .iterate(new long[] { 0, 1 },
                 prev -> new long[] { prev[1], prev[0] + prev[1] })
        .mapToLong(pair -> pair[1]).spliterator();
Testing:
for (int i = 0; i < 20; i++)
    spliterator.tryAdvance((LongConsumer) System.out::println);
Please note that holding Fibonacci numbers in a long variable is questionable: it overflows after the 92nd Fibonacci number. So if you want to create a spliterator which just iterates over the first 92 Fibonacci numbers, I'd suggest using a predefined array for this purpose:
Spliterator.OfLong spliterator = Spliterators.spliterator(new long[] {
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765,
10946, 17711, 28657, 46368, 75025, 121393, 196418, 317811, 514229, 832040, 1346269, 2178309,
3524578, 5702887, 9227465, 14930352, 24157817, 39088169, 63245986, 102334155, 165580141,
267914296, 433494437, 701408733, 1134903170, 1836311903, 2971215073L, 4807526976L,
7778742049L, 12586269025L, 20365011074L, 32951280099L, 53316291173L, 86267571272L, 139583862445L,
225851433717L, 365435296162L, 591286729879L, 956722026041L, 1548008755920L, 2504730781961L,
4052739537881L, 6557470319842L, 10610209857723L, 17167680177565L, 27777890035288L,
44945570212853L, 72723460248141L, 117669030460994L, 190392490709135L, 308061521170129L,
498454011879264L, 806515533049393L, 1304969544928657L, 2111485077978050L, 3416454622906707L,
5527939700884757L, 8944394323791464L, 14472334024676221L, 23416728348467685L, 37889062373143906L,
61305790721611591L, 99194853094755497L, 160500643816367088L, 259695496911122585L, 420196140727489673L,
679891637638612258L, 1100087778366101931L, 1779979416004714189L, 2880067194370816120L,
4660046610375530309L, 7540113804746346429L
}, Spliterator.ORDERED);
Array spliterator also splits well, so you will have real parallel processing.
Ok, let's write the spliterator. Using OfLong is still too boring: let's switch to BigInteger and not limit the user to 92 numbers. The tricky thing here is to quickly jump to a given Fibonacci number. I'll use the matrix multiplication algorithm described here for this purpose. Here's my code:
static class FiboSpliterator implements Spliterator<BigInteger> {
    private final static BigInteger[] STARTING_MATRIX = {
            BigInteger.ONE, BigInteger.ONE,
            BigInteger.ONE, BigInteger.ZERO};

    private BigInteger[] state; // previous and current numbers
    private int cur;            // position
    private final int fence;    // max number to cover by this spliterator

    public FiboSpliterator(int max) {
        this(0, max);
    }

    // State is not initialized until traversal
    private FiboSpliterator(int cur, int fence) {
        assert fence >= 0;
        this.cur = cur;
        this.fence = fence;
    }

    // Multiplication of 2x2 matrices, by definition
    static BigInteger[] multiply(BigInteger[] m1, BigInteger[] m2) {
        return new BigInteger[] {
                m1[0].multiply(m2[0]).add(m1[1].multiply(m2[2])),
                m1[0].multiply(m2[1]).add(m1[1].multiply(m2[3])),
                m1[2].multiply(m2[0]).add(m1[3].multiply(m2[2])),
                m1[2].multiply(m2[1]).add(m1[3].multiply(m2[3]))};
    }

    // Log(n) algorithm to raise a 2x2 matrix to the n-th power
    static BigInteger[] power(BigInteger[] m, int n) {
        assert n > 0;
        if (n == 1) {
            return m;
        }
        if (n % 2 == 0) {
            BigInteger[] root = power(m, n / 2);
            return multiply(root, root);
        } else {
            return multiply(power(m, n - 1), m);
        }
    }

    @Override
    public boolean tryAdvance(Consumer<? super BigInteger> action) {
        if (cur == fence)
            return false; // traversal finished
        if (state == null) {
            // initialize state: done only once
            if (cur == 0) {
                state = new BigInteger[] {BigInteger.ZERO, BigInteger.ONE};
            } else {
                BigInteger[] res = power(STARTING_MATRIX, cur);
                state = new BigInteger[] {res[1], res[0]};
            }
        }
        action.accept(state[1]);
        // update state
        if (++cur < fence) {
            BigInteger next = state[0].add(state[1]);
            state[0] = state[1];
            state[1] = next;
        }
        return true;
    }

    @Override
    public Spliterator<BigInteger> trySplit() {
        if (fence - cur < 2)
            return null;
        int mid = (fence + cur) >>> 1;
        if (mid - cur < 100) {
            // resulting interval is too small:
            // instead of jumping we just store the prefix into an array
            // and return an ArraySpliterator
            BigInteger[] array = new BigInteger[mid - cur];
            for (int i = 0; i < array.length; i++) {
                tryAdvance(f -> {});
                array[i] = state[0];
            }
            return Spliterators.spliterator(array, ORDERED | NONNULL | SORTED);
        }
        // Jump to another position
        return new FiboSpliterator(cur, cur = mid);
    }

    @Override
    public long estimateSize() {
        return fence - cur;
    }

    @Override
    public int characteristics() {
        return ORDERED | IMMUTABLE | SIZED | SUBSIZED | NONNULL | SORTED;
    }

    @Override
    public Comparator<? super BigInteger> getComparator() {
        return null; // natural order
    }
}
This implementation is actually faster in parallel for very big fence values (like 100000). Probably an even wiser implementation is possible which would split unevenly, reusing the intermediate results of the matrix multiplication.

Fixed-size collection that keeps top (N) values in Java

I need to keep the top N (< 1000) integers while adding values from a big list of integers (a lazily produced list of around a million elements). I want to try adding values to a collection, but it should keep only the top N (highest) integers. Is there any preferred data structure to use for this purpose?
I'd suggest using a sorted data structure such as TreeSet. Before insertion, check the number of items in the set; if it has reached 1000, remove the smallest number if it's smaller than the newly added number, and add the new number.
TreeSet<Integer> set = ...;

public void add(int n) {
    if (set.size() < 1000) {
        set.add(n);
    } else {
        Integer first = set.first();
        if (first.intValue() < n) {
            set.pollFirst();
            set.add(n);
        }
    }
}
Use the Google Guava MinMaxPriorityQueue class.
You can also use custom ordering by supplying a comparator (use the orderedBy(Comparator<B> comparator) method); a reversed-comparator sketch follows after the example below.
Note: This collection is NOT a sorted collection.
See javadoc
Example:
@Test
public void test() {
    final int maxSize = 5;
    // Natural order
    final MinMaxPriorityQueue<Integer> queue = MinMaxPriorityQueue
            .maximumSize(maxSize).create();
    queue.addAll(Arrays.asList(10, 30, 60, 70, 20, 80, 90, 50, 100, 40));
    assertEquals(maxSize, queue.size());
    assertEquals(new Integer(50), Collections.max(queue));
    System.out.println(queue);
}
Output:
[10, 50, 40, 30, 20]
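To keep the N largest values instead of the N smallest, the builder can be given a reversed comparator, roughly like this (a sketch, assuming Guava is on the classpath):
// With a reversed comparator, the element evicted once the queue exceeds
// maximumSize is the smallest value, so the 1000 largest values survive.
MinMaxPriorityQueue<Integer> topN = MinMaxPriorityQueue
        .orderedBy(Comparator.<Integer>reverseOrder())
        .maximumSize(1000)
        .create();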
One efficient solution is a slightly tweaked array-based priority queue using a binary min-heap.
The first N integers are simply added to the heap one by one, or you can build it from an array of the first N integers (slightly faster).
After that, compare each incoming integer with the root element (which is the MIN value kept so far). If the new integer is larger than that, simply replace the root with this new integer and perform a down-heap operation (i.e. trickle the new integer down until both its children are larger or it becomes a leaf). The data structure guarantees you always have the N largest integers seen so far, with an average addition time of O(log N).
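A rough Java sketch of that array-based approach might look as follows (names are illustrative; it is not a translation of the C# code below):
// Fixed-capacity binary min-heap whose root is the smallest of the kept values.
final class TopNHeap {
    private final int[] heap;
    private int size;

    TopNHeap(int capacity) {
        this.heap = new int[capacity];
    }

    void offer(int value) {
        if (size < heap.length) {           // still filling up: normal insert
            heap[size] = value;
            siftUp(size++);
        } else if (value > heap[0]) {       // larger than the current minimum:
            heap[0] = value;                // replace the root and restore heap order
            siftDown(0);
        }                                   // otherwise the value is discarded
    }

    private void siftUp(int i) {
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (heap[parent] <= heap[i]) break;
            swap(i, parent);
            i = parent;
        }
    }

    private void siftDown(int i) {
        while (true) {
            int left = 2 * i + 1, right = left + 1, smallest = i;
            if (left < size && heap[left] < heap[smallest]) smallest = left;
            if (right < size && heap[right] < heap[smallest]) smallest = right;
            if (smallest == i) break;
            swap(i, smallest);
            i = smallest;
        }
    }

    private void swap(int a, int b) {
        int tmp = heap[a];
        heap[a] = heap[b];
        heap[b] = tmp;
    }
}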
Here is my C# implementation; the method described above is named "EnqueueDown". "EnqueueUp" is a standard enqueue operation that adds a new leaf and trickles it up.
I have tested it on 1M numbers with a max heap size of 1000 and it runs in under 200 ms:
namespace ImagingShop.Research.FastPriorityQueue
{
    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;
    using System.Runtime.CompilerServices;

    public sealed class FastPriorityQueue<T> : IEnumerable<Tuple<T, float>>
    {
        private readonly int capacity;
        private readonly Tuple<T, float>[] nodes;
        private int count = 0;

        public FastPriorityQueue(int capacity)
        {
            this.capacity = capacity;
            this.nodes = new Tuple<T, float>[capacity];
        }

        public int Capacity => this.capacity;

        public int Count => this.count;

        public T FirstNode => this.nodes[0].Item1;

        public float FirstPriority => this.nodes[0].Item2;

        public void Clear()
        {
            this.count = 0;
        }

        public bool Contains(T node) => this.nodes.Any(tuple => Equals(tuple.Item1, node));

        public T Dequeue()
        {
            T nodeHead = this.nodes[0].Item1;
            int index = (this.count - 1);
            this.nodes[0] = this.nodes[index];
            this.count--;
            DownHeap(index);
            return nodeHead;
        }

        public void EnqueueDown(T node, float priority)
        {
            if (this.count == this.capacity)
            {
                if (priority < this.nodes[0].Item2)
                {
                    return;
                }
                this.nodes[0] = Tuple.Create(node, priority);
                DownHeap(0);
                return;
            }
            int index = this.count;
            this.count++;
            this.nodes[index] = Tuple.Create(node, priority);
            UpHeap(index);
        }

        public void EnqueueUp(T node, float priority)
        {
            int index = this.count;
            this.count++;
            this.nodes[index] = Tuple.Create(node, priority);
            UpHeap(index);
        }

        public IEnumerator<Tuple<T, float>> GetEnumerator()
        {
            for (int i = 0; i < this.count; i++) yield return this.nodes[i];
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void DownHeap(int index)
        {
            while (true)
            {
                int indexLeft = (index << 1);
                int indexRight = (indexLeft | 1);
                int indexMin = ((indexLeft < this.count) && (this.nodes[indexLeft].Item2 < this.nodes[index].Item2))
                    ? indexLeft
                    : index;
                if ((indexRight < this.count) && (this.nodes[indexRight].Item2 < this.nodes[indexMin].Item2))
                {
                    indexMin = indexRight;
                }
                if (indexMin == index)
                {
                    break;
                }
                Flip(index, indexMin);
                index = indexMin;
            }
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void Flip(int indexA, int indexB)
        {
            var temp = this.nodes[indexA];
            this.nodes[indexA] = this.nodes[indexB];
            this.nodes[indexB] = temp;
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void UpHeap(int index)
        {
            while (true)
            {
                if (index == 0)
                {
                    break;
                }
                int indexParent = (index >> 1);
                if (this.nodes[indexParent].Item2 <= this.nodes[index].Item2)
                {
                    break;
                }
                Flip(index, indexParent);
                index = indexParent;
            }
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return GetEnumerator();
        }
    }
}
The basic implementation is taken from "Cormen, Thomas H. Introduction to algorithms. MIT press, 2009."
In Java 1.7 one may use java.util.PriorityQueue. To keep the top N items, order the queue so that the smallest kept element sits at its head (for integers that is the natural ascending order, as in the comparator below). In this manner the smallest number is always on top and can be removed when there are too many items in the queue.
package eu.pawelsz.example.topn;

import java.util.Comparator;
import java.util.PriorityQueue;

public class TopN {

    public static <E> void add(int keep, PriorityQueue<E> priorityQueue, E element) {
        if (keep == priorityQueue.size()) {
            priorityQueue.poll();
        }
        priorityQueue.add(element);
    }

    public static void main(String[] args) {
        int N = 4;
        PriorityQueue<Integer> topN = new PriorityQueue<>(N, new Comparator<Integer>() {
            @Override
            public int compare(Integer o1, Integer o2) {
                return o1 - o2;
            }
        });
        add(N, topN, 1);
        add(N, topN, 2);
        add(N, topN, 3);
        add(N, topN, 4);
        System.out.println("smallest: " + topN.peek());
        add(N, topN, 8);
        System.out.println("smallest: " + topN.peek());
        add(N, topN, 5);
        System.out.println("smallest: " + topN.peek());
        add(N, topN, 2);
        System.out.println("smallest: " + topN.peek());
    }
}
// this keeps the top-most K instances in the queue
public static <E> void add(int keep, PriorityQueue<E> priorityQueue, E element) {
    if (priorityQueue.size() < keep) {
        priorityQueue.add(element);
    } else if (keep == priorityQueue.size()) {
        priorityQueue.add(element); // size = keep + 1 now
        priorityQueue.poll();       // remove the head (smallest) so it is resized to keep
    }
}
The fastest way is likely a simple array items = new Item[N]; and a revolving cursor int cursor = 0;. The cursor points to the insertion point of the next element.
To add a new element use the method
put(Item newItem) { items[cursor++] = newItem; if(cursor == N) cursor = 0; }
when accessing this structure you can make the last item added appear at index 0 via a small recalculation of the index, i.e.
get(int index) { return items[ cursor > index ? cursor-index-1 : cursor-index-1+N ]; }
(the -1 is because cursor always points at the next insertion point, i.e. cursor-1 is the last element added).
Summary: put(item) will add a new item. get(0) will get the last item added, get(1) will get the second last item, etc.
In case you need to take care of the case where n < N elements have been added you just need to check for null.
(TreeSets will likely be slower)
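Putting those fragments together, a minimal self-contained sketch could look like this (names are illustrative; note that it keeps the N most recently added items, with get(0) being the latest):
// Ring buffer sketch built from the put/get fragments above.
final class RingBuffer<T> {
    private final Object[] items;
    private int cursor;              // next insertion point

    RingBuffer(int capacity) {
        this.items = new Object[capacity];
    }

    void put(T item) {
        items[cursor++] = item;
        if (cursor == items.length) cursor = 0;
    }

    @SuppressWarnings("unchecked")
    T get(int index) {
        int n = items.length;
        int i = cursor > index ? cursor - index - 1 : cursor - index - 1 + n;
        return (T) items[i];         // may be null while fewer than N items were added
    }
}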
Your Question is answered here:
Size-limited queue that holds last N elements in Java
To summarize:
No, there is no such data structure in the default Java SDK, but Apache Commons Collections 4 has a CircularFifoQueue.
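A minimal usage sketch of that class, assuming commons-collections4 is on the classpath (note that it evicts the oldest entry, i.e. it keeps the last N elements added rather than the N largest values):
CircularFifoQueue<Integer> lastN = new CircularFifoQueue<>(1000);
lastN.add(42); // once 1000 elements are present, each add() silently drops the oldest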

Is it possible to split a Java list into three without looping?

Given a Java List with 21 elements.
What is the best way to create three new lists with:
A = 0, 3, 6, ... indexed elements from source
B = 1, 4, 7, ...
C = 2 ,5, 8, 11, 14, 17, 20
Is it possible without looping?
Well you could write a wrapper class which is able to provide a read-only "view" onto a list given a multiple (3 in this case) and an offset (0, 1 and 2). When asked for the item at a particular index, it would have to multiply by the "multiple" and add the offset, then look into the original list. (Likewise for the other operations.)
It would be simpler to loop though... what's the context here? What are you really trying to achieve?
Here's an example of what Jon mentioned (if of course you really don't want to just loop). The name isn't great... I'm not sure what a good name for such a thing would be.
public class OffsetList<E> extends AbstractList<E> {
    private final List<E> delegate;
    private final int offset;
    private final int multiple;

    public static <E> OffsetList<E> create(List<E> delegate, int offset, int multiple) {
        return new OffsetList<E>(delegate, offset, multiple);
    }

    private OffsetList(List<E> delegate, int offset, int multiple) {
        this.delegate = delegate;
        this.offset = offset;
        this.multiple = multiple;
    }

    @Override public E get(int index) {
        return delegate.get(offset + (index * multiple));
    }

    @Override public int size() {
        int offsetToEnd = delegate.size() - offset;
        return (int) Math.ceil(offsetToEnd / (double) multiple);
    }
}
Example use:
List<Integer> numbers = // the numbers 0 to 21 in order
List<Integer> first = OffsetList.create(numbers, 0, 3);
List<Integer> second = OffsetList.create(numbers, 1, 3);
List<Integer> third = OffsetList.create(numbers, 2, 3);
System.out.println(first); // [0, 3, 6, 9, 12, 15, 18, 21]
System.out.println(second); // [1, 4, 7, 10, 13, 16, 19]
System.out.println(third); // [2, 5, 8, 11, 14, 17, 20]
Creating each list is O(1) since they're views. Iterating each list is O(n) where n is the size of the actual view list, not the size of the full list it's based on. This assumes the original list is a random access list... this approach, like index-based iteration, would be very inefficient with a linked list.
Given that you say you're used to functional programming, I'm going to assume you want to split up the lists because you want to do something different with each. If that's the case, I would put the filtering logic at the Iterator level.
You could have a wrapping Iterator instead of a wrapping List. It might look something like this:
public <T> Iterable<T> filter(final Iterable<T> allElements, final int offset, final int multiple) {
    return new Iterable<T>() {
        public Iterator<T> iterator() {
            return new Iterator<T>() {
                int index = 0;
                Iterator<T> allElementsIt = allElements.iterator();

                public boolean hasNext() {
                    while (allElementsIt.hasNext()) {
                        if (isDesiredIndex(index)) {
                            return true;
                        } else {
                            allElementsIt.next();
                            index++;
                        }
                    }
                    return false;
                }

                private boolean isDesiredIndex(int index) {
                    return (index - offset) % multiple == 0;
                }

                public T next() {
                    if (hasNext()) {
                        index++; // advance past the element we are about to return
                        return allElementsIt.next();
                    } else {
                        throw new NoSuchElementException();
                    }
                }

                public void remove() {...}
            };
        }
    };
}
Then to use it:
for ( ElementType element : filter(elements, 2, 3) ) {
//do something to every third starting with third element
}
Next try :)
class Mod3Comparator implements Comparator<Integer> {
    public int compare(Integer a, Integer b) {
        if (a % 3 < b % 3 || (a % 3 == b % 3 && a < b)) {
            return -1;
        }
        if (a % 3 > b % 3 || (a % 3 == b % 3 && a > b)) {
            return 1;
        }
        return 0;
    }
}
First sort the list taking into consideration the modulo rule, then use the Arrays.copyOfRange method.
Collections.sort(list, new Mod3Comparator());
Integer[] array = new Integer[list.size()];
list.toArray(array);
List<Integer> A = Arrays.asList(Arrays.copyOfRange(array, 0, 7));
List<Integer> B = Arrays.asList(Arrays.copyOfRange(array, 7, 14));
...
Also see this example.
Unfortunately, I can't think of a way of doing so without pretending the lists are arrays and doing the following.
String[] twentyOne = new String[21];
String[] first = new String[3];
first[0] = twentyOne[0];
first[1] = twentyOne[3];
first[2] = twentyOne[6];
// And so on
String[] second = new String[3];
second[0] = twentyOne[1];
second[1] = twentyOne[4];
second[2] = twentyOne[7];
String[] third = new String[15];
third[0] = twentyOne[2];
// You get the picture
I only used arrays in the example because I'm more confident with them, and know them without needing to look at something.
May I ask why you want to avoid looping?
