Passing a non-associative function to reduce

My program has this line:
Function<String, Integer> f = (String s) -> s.chars().reduce(0, (a, b) -> 2 * a + b);
The function being passed to reduce is not associative. Reduce's documentation says that the function passed must be associative.
How can I rewrite this as an expression which doesn't break reduce's contract?

Under the current implementation, and if and only if you are never going to run the stream in parallel, you are safe with what you have right now, provided you are OK with these two caveats.
Otherwise, you can simply write the function with a for loop:
Function<String, Integer> f = s -> {
    int total = 0; // same start value as reduce(0, ...), so empty and one-char strings work too
    for (int x = 0; x < s.length(); x++) {
        total = total * 2 + s.charAt(x);
    }
    return total;
};

You can convert this function to an associative function, as explained in this answer using the example of List.hashCode(). The difference lies only in the factor (2 here vs. 31 there) and the start value (0 here vs. 1 there).
It can be adapted to your task, which is especially easy when you have a random access input like a String:
Function<String, Integer> f =
    s -> IntStream.range(0, s.length()).map(i -> s.charAt(i)<<(s.length()-i-1)).sum();
This would even run in parallel, but it’s unlikely that you will ever encounter strings so large that parallel evaluation provides a benefit. What remains is that most people might consider this solution less readable than a simple for loop…
Note that the above solution exhibits different overflow behavior once the String has more than 32 chars, because it uses the shift operator rather than multiplying by two: Java uses only the lowest five bits of an int shift distance, so a shift by 32 or more wraps around instead of producing zero.
The fix for this issue makes the solution even more efficient:
Function<String, Integer> f = s ->
    IntStream.range(Math.max(0, s.length()-32), s.length())
        .map(i -> s.charAt(i)<<(s.length()-i-1)).sum();
If the string has more than 32 chars, it only processes the last 32 chars, which is already sufficient to calculate the same result as your original function.
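One way to gain confidence in this rewrite is to compare it with a plain sequential loop on a few inputs, including one longer than 32 characters. The following is only a sketch: the class and method names (ShiftSumCheck, viaLoop, viaShiftSum) are made up for illustration, and the stream is run in parallel just to exercise the associative form.

```java
import java.util.stream.IntStream;

final class ShiftSumCheck {
    // Reference: the original left-to-right fold, written as a loop.
    static int viaLoop(String s) {
        int total = 0;
        for (int i = 0; i < s.length(); i++) {
            total = 2 * total + s.charAt(i);
        }
        return total;
    }

    // Associative form: each char is weighted by 2^(distance from the end).
    // Only the last 32 chars can contribute to the 32-bit result.
    static int viaShiftSum(String s) {
        int n = s.length();
        return IntStream.range(Math.max(0, n - 32), n)
                .parallel()
                .map(i -> s.charAt(i) << (n - i - 1))
                .sum();
    }

    public static void main(String[] args) {
        String[] samples = {"", "a", "abc", "The quick brown fox jumps over the lazy dog"};
        for (String s : samples) {
            // Prints whether both variants agree for this input.
            System.out.println('"' + s + "\" -> " + (viaLoop(s) == viaShiftSum(s)));
        }
    }
}
```

Both variants rely on int arithmetic wrapping around on overflow, which is why they agree even for long inputs.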

Related

Count Elements in a stream and return Integer instead of long

I need to count the elements in a Stream and assign the result to an Integer without casting.
.count() returns a long.
I thought about .collect(Collectors.reducing(..)) but can't figure it out.
I feel like there is something simple I'm missing.
My attempt:
Stream<String> s = Stream.of("Hallo ", "Test", "String");
Integer count = s.filter(e -> (e.length() >= lb && e.length() <= ub && !e.contains(" ")))
    .map(e -> e.toUpperCase())
    .distinct()
    .collect(Collectors.reducing(0, e -> 1, Integer::sum));
System.out.println(count);

Simply: don't.
Don't cast, but also don't make things overly complicated.
Rather look into safe ways of getting that int out of the long returned by count(). See here for starters:
int bar = Math.toIntExact(someLong);
for example. When you are 100% sure that the computed value always fits within an int, you simply don't write a catch for the potentially thrown ArithmeticException. And you still get the good feeling that you can't overflow without noticing.
But as said: don't invest time and energy into computing your own count when you can use built-in functionality to count things and turn the result into an int/Integer. Remember: each character you put into code needs to be read and understood later on, so even "20 characters" more add up over time. Always lean towards the shorter solution, as long as it is easy to read and understand.

Here is the right way. Map each distinct value to 1 using Stream::mapToInt. This produces an IntStream, which has sum/count methods that handle a stream of numeric values directly, without further mapping:
Integer count = s.filter(e -> (e.length() >= lb && e.length() <= ub && !e.contains(" ")))
    .map(String::toUpperCase)
    .distinct()
    .mapToInt(i -> 1)
    .sum();
Without mapping to int, you can use Stream::reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner) to get the very same result:
Integer count = s.filter(e -> (e.length() >= 2 && e.length() <= 10 && !e.contains(" ")))
    .map(String::toUpperCase)
    .distinct()
    .reduce(0, (a,b) -> a + 1, (a,b) -> a + b);
The signature of this method is a little bit complicated:
U identity is set to 0, the starting point of the count.
accumulator ((a,b) -> a + 1) ignores the String itself and adds 1 to the running result for each element (0+1+1+1...).
combiner ((a,b) -> a + b) adds two partial results together; it only comes into play when the stream is split, as in parallel execution. The sum of all the 1 values is the count.

If you want to count the elements in a stream without using the built-in .count() method, you could map each element to an int and reduce by summing them. Something like this:
Integer count = s.mapToInt(i -> 1).reduce((a, b) -> a + b).orElse(0);
Or, as @Holger commented below, use sum() after mapping:
Integer count = s.mapToInt(i -> 1).sum();

With Java 8, you can use Math.toIntExact(long val).
public static int toIntExact(long value)
Returns the value of the long argument; throwing an exception if the
value overflows an int.
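Applied to the question's pipeline, this could look roughly like the sketch below. The class and method names (CountAsInt, countDistinctUpper) are hypothetical, and the bounds 2 and 10 are borrowed from the earlier answer since the question does not define lb and ub.

```java
import java.util.stream.Stream;

final class CountAsInt {
    // Counts with the built-in count() and narrows the long safely.
    static int countDistinctUpper(Stream<String> words, int lb, int ub) {
        long n = words
                .filter(e -> e.length() >= lb && e.length() <= ub && !e.contains(" "))
                .map(String::toUpperCase)
                .distinct()
                .count();
        // Throws ArithmeticException instead of silently truncating.
        return Math.toIntExact(n);
    }

    public static void main(String[] args) {
        Integer count = countDistinctUpper(Stream.of("Hallo ", "Test", "String"), 2, 10);
        System.out.println(count); // "Hallo " contains a space, so two elements remain
    }
}
```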

Common element in Java infinite streams

I have three infinite Java IntStream objects. I want to find the smallest element that is present in all three of them.
IntStream a = IntStream.iterate(286, i->i+1).map(i -> (Integer)i*(i+1)/2);
IntStream b = IntStream.iterate(166, i->i+1).map(i -> (Integer)i*(3*i-1)/2);
IntStream c = IntStream.iterate(144, i->i+1).map(i -> i*(2*i-1));
I can always employ a brute force solution (without streams) which involves iterating in nested loops, but I was wondering if we can do it more efficiently with streams?

You need to iterate all 3 side by side, advancing the one with the lowest value and checking whether all 3 are equal.
Your code will not find an answer for the next value after 40755, because the next value is 1_533_776_805, whose intermediate value (before division by 2) is higher than Integer.MAX_VALUE (2_147_483_647).
So, here is one way to use your streams, after changing them to long and guarding against overflow.
// needs java.util.PrimitiveIterator and java.util.stream.LongStream
LongStream a = LongStream.iterate(286, i->i+1).map(i -> Math.multiplyExact(i, i+1)/2);
LongStream b = LongStream.iterate(166, i->i+1).map(i -> Math.multiplyExact(i, 3*i-1)/2);
LongStream c = LongStream.iterate(144, i->i+1).map(i -> Math.multiplyExact(i, 2*i-1));
PrimitiveIterator.OfLong aIter = a.iterator();
PrimitiveIterator.OfLong bIter = b.iterator();
PrimitiveIterator.OfLong cIter = c.iterator();
long aVal = aIter.nextLong();
long bVal = bIter.nextLong();
long cVal = cIter.nextLong();
while (aVal != bVal || bVal != cVal) {
    long min = Math.min(Math.min(aVal, bVal), cVal);
    if (aVal == min)
        aVal = aIter.nextLong();
    if (bVal == min)
        bVal = bIter.nextLong();
    if (cVal == min)
        cVal = cIter.nextLong();
}
System.out.println(aVal);

These functions are always increasing, so the code can stop as soon as the magic equal triplet is found.
The thing to code is:
a) when a stream's current value is below any other, it advances itself.
b) when it meets the same candidate value, it waits for the 3rd stream to decide.
c) when it has a higher value than all the others, it becomes the new candidate and waits for both others.
Reference juggling.
There may also be no solution (at least not within a short time).
Notice that stream c produces an even number whenever i is even, so the parity of its values alternates. There might be some optimization there to skip a and b faster.

I don't think there is anything smart possible with the stream API. The main reason is that you can't simply run through one stream until some condition is met; instead, you look at the current 3 elements and pick the next element from one of the streams before comparing the elements again.
The most efficient (and possibly also the cleanest) solution is to use iterators and keep calling next() on the appropriate stream until the answer is found.
To start with, you can focus on two streams only and find their first common value:
while (elementA != elementB) {
    if (elementA < elementB) {
        elementA = iteratorA.next();
    } else {
        elementB = iteratorB.next();
    }
}
Then you need to make the third stream catch up with these two:
while (elementC < elementA) {
    elementC = iteratorC.next();
}
At this point there are two options:
either elementC == elementA, in which case you have the answer
or elementC > elementA, in which case you advance the first two streams and start over (keep elementC, since it may still turn out to be the common value)
One thing to remember is the max value of an integer. Because you have i^2, it will overflow for i of about 46k, so you need to change the streams of ints to streams of longs (the answer is about 1.5 billion, and that's after the division by 2 in these functions).
Since you are doing exercises for practice, I don't think it's right to give you the full working code, but let me know if you still struggle with it ;)
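For readers who want something to check their own attempt against, the two-phase idea described above might be sketched as follows. This is an illustration under assumptions, not the answerer's code: the names CommonElement and firstCommon are invented, the streams are switched to LongStream as recommended, and after an overshoot only the first stream is advanced so that the current value of the third stream is never skipped.

```java
import java.util.PrimitiveIterator;
import java.util.stream.LongStream;

final class CommonElement {
    // All three streams are assumed to be strictly increasing.
    static long firstCommon(LongStream sa, LongStream sb, LongStream sc) {
        PrimitiveIterator.OfLong a = sa.iterator();
        PrimitiveIterator.OfLong b = sb.iterator();
        PrimitiveIterator.OfLong c = sc.iterator();
        long x = a.nextLong(), y = b.nextLong(), z = c.nextLong();
        while (true) {
            // Phase 1: advance the smaller of the first two until they agree.
            while (x != y) {
                if (x < y) x = a.nextLong(); else y = b.nextLong();
            }
            // Phase 2: let the third stream catch up.
            while (z < x) z = c.nextLong();
            if (z == x) return x;
            // The third stream overshot: move the first stream on and retry.
            x = a.nextLong();
        }
    }

    public static void main(String[] args) {
        LongStream a = LongStream.iterate(286, i -> i + 1).map(i -> i * (i + 1) / 2);
        LongStream b = LongStream.iterate(166, i -> i + 1).map(i -> i * (3 * i - 1) / 2);
        LongStream c = LongStream.iterate(144, i -> i + 1).map(i -> i * (2 * i - 1));
        System.out.println(firstCommon(a, b, c));
    }
}
```

If the streams are finite and share no element, the iterator calls end with a NoSuchElementException; a production version would need to handle that case.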

Using the stream api of java for a feed forward computation

For a college project of mine I needed to implement a deep learning neural network in plain Java. After profiling the application I wanted to see if the automatic parallelization using Java's stream API would lead to a significant improvement in performance, but I am struggling to transform my old code to a stream-based approach.
The method takes a vector (double array), performs a matrix multiplication, then adds a value to each element and finally applies a lambda function (DoubleFunction) to every element.
Here is the old code that I want to replace:
/* e.g.
double[] x = new double[100]
int inputNeurons = 100
int outputNeurons = 200
double[][] weights = new double[200][100]
double[] biases = new double[200]
*/
private double[] output(double[] x) {
    double[] y = new double[outputNeurons];
    for (int i = 0; i < outputNeurons; i++) {
        double preActivation = 0.;
        for (int j = 0; j < inputNeurons; j++) {
            preActivation += weights[i][j] * x[j];
        }
        preActivation += biases[i];
        y[i] = activation.apply(preActivation);
    }
    return y;
}
This is what I came up with so far (it does not work):
private double[] output(double[] x) {
    return Arrays.stream(weights).parallel()
        .map(outputNeuron -> IntStream.range(0, outputNeurons)
            .mapToDouble(i -> IntStream.range(0, inputNeurons)
                .mapToDouble(j -> x[i] * outputNeuron[i]).sum()
            ).map(activation::apply)
        ).toArray();
Since I don't know streams well enough, I would really appreciate any help!

Good attempt, but your stream approach is quite far from the imperative one. The exact equivalent of your imperative approach is:
return IntStream.range(0, outputNeurons)
    //.parallel() uncomment to see difference in performance
    .mapToDouble(i -> IntStream.range(0, inputNeurons)
        .mapToDouble(j -> weights[i][j] * x[j]).sum() + biases[i])
    .map(activation::apply)
    .toArray();
Note that many factors influence whether parallel streams will make your code faster or slower than your imperative approach or sequential streams. Thus, you'll need to consider the following before going parallel:
Data size
Number of cores
Cost per element (meaning time spent executing in parallel and overhead of decomposition and merging)
Source data structure
Packing (meaning primitive types are faster to operate on than boxed values).
You should also consider reading "Should I always use a parallel stream when possible?"
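To check that the stream version really matches the loop, both can be run on a small layer and compared. The sketch below is self-contained and makes a few assumptions that are not in the question: the class name FeedForward, passing weights and biases as parameters, and using DoubleUnaryOperator for the activation so it can be handed straight to DoubleStream.map.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;
import java.util.stream.IntStream;

final class FeedForward {
    // Imperative reference version.
    static double[] outputLoop(double[][] w, double[] b, double[] x, DoubleUnaryOperator act) {
        double[] y = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            double pre = 0.0;
            for (int j = 0; j < x.length; j++) {
                pre += w[i][j] * x[j];
            }
            y[i] = act.applyAsDouble(pre + b[i]);
        }
        return y;
    }

    // Stream version: one output neuron per index of the outer range.
    static double[] outputStream(double[][] w, double[] b, double[] x,
                                 DoubleUnaryOperator act, boolean parallel) {
        IntStream rows = IntStream.range(0, w.length);
        if (parallel) {
            rows = rows.parallel();
        }
        return rows
                .mapToDouble(i -> IntStream.range(0, x.length)
                        .mapToDouble(j -> w[i][j] * x[j]).sum() + b[i])
                .map(act)
                .toArray(); // encounter order is kept, even in parallel
    }

    public static void main(String[] args) {
        double[][] w = {{1, 2, 3}, {-1, -2, -3}};
        double[] b = {0.5, 0.5};
        double[] x = {1, 1, 1};
        DoubleUnaryOperator relu = v -> Math.max(0.0, v);
        System.out.println(Arrays.toString(outputLoop(w, b, x, relu)));
        System.out.println(Arrays.toString(outputStream(w, b, x, relu, true)));
    }
}
```

Because DoubleStream.sum() may add the terms in a different order than the loop, results on real data can differ in the last few bits; compare with a small tolerance rather than with ==.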

Choose best combinations of operators to find target number

I have an array of operations and a target number.
The operations could be
+ 3
- 3
* 4
/ 2
I want to find out how close I can get to the target number by using those operations.
I start from 0 and I need to iterate through the operations in that order, and I can choose to either use the operation or not use it.
So if the target number is 13, I can use + 3 and * 4 to get 12 which is the closest I can get to the target number 13.
I guess I need to compute all possible combinations (I guess the number of calculations is thus 2^n where n is the number of operations).
I have tried to do this in java with
import java.util.*;
public class Instruction {
    public static void main(String[] args) {
        // create scanner
        Scanner sc = new Scanner(System.in);
        // number of instructions
        int N = sc.nextInt();
        // target number
        int K = sc.nextInt();
        //
        String[] instructions = new String[N];
        // N instructions follow
        for (int i=0; i<N; i++) {
            //
            instructions[i] = sc.nextLine();
        }
        //
        System.out.println(search(instructions, 0, N, 0, K, 0, K));
    }
    public static int search(String[] instructions, int index, int length, int progressSoFar, int targetNumber, int bestTarget, int bestDistance) {
        //
        for (int i=index; i<length; i++) {
            // get operator
            char operator = instructions[i].charAt(0);
            // get number
            int number = Integer.parseInt(instructions[i].split("\\s+")[1]);
            //
            if (operator == '+') {
                progressSoFar += number;
            } else if (operator == '*') {
                progressSoFar *= number;
            } else if (operator == '-') {
                progressSoFar -= number;
            } else if (operator == '/') {
                progressSoFar /= number;
            }
            //
            int distance = Math.abs(targetNumber - progressSoFar);
            // if the absolute distance between progress so far
            // and the target number is less than what we have
            // previously accomplished, we update best distance
            if (distance < bestDistance) {
                bestTarget = progressSoFar;
                bestDistance = distance;
            }
            //
            if (true) {
                return bestTarget;
            } else {
                return search(instructions, index + 1, length, progressSoFar, targetNumber, bestTarget, bestDistance);
            }
        }
    }
}
It doesn't work yet, but I guess I'm a little closer to solving my problem. I just don't know how to end my recursion.
But maybe I shouldn't use recursion and should instead just list all combinations. I just don't know how to do this.
If I, for instance, have 3 operations and I want to compute all combinations, I get the 2^3 combinations
111
110
101
011
000
001
010
100
where 1 indicates that the operation is used and 0 indicates that it is not used.
It should be rather simple to do this and then choose which combination gave the best result (the number closest to the target number), but I don't know how to do this in java.

In pseudocode, you could try brute-force back-tracking, as in:
// ops: list of ops that have not yet been tried out
// target: goal result
// currentOps: list of ops used so far
// best: reference to the best result achieved so far (can be altered; use
//       an int[1], for example)
// opsForBest: list of ops used to achieve best result so far
test(ops, target, currentOps, best, opsForBest)
    if ops is now empty,
        current = evaluate(currentOps)
        if current is closer to target than best,
            best = current
            opsForBest = a copy of currentOps
    otherwise,
        // try including next op
        with the next operator in ops,
            test(opsAfterNext, target,
                currentOps concatenated with next, best, opsForBest)
        // try *not* including next op
        test(opsAfterNext, target, currentOps, best, opsForBest)
This is guaranteed to find the best answer. However, it will repeat many operations over and over. You can save some time by avoiding repeated calculations, which can be achieved using a cache of "how does this subexpression evaluate". When you include the cache, you enter the realm of "dynamic programming" (= reusing earlier results in later computation).
Edit: adding a more OO-ish variant
Variant returning the best result, and avoiding the use of that best[] array-of-one. Requires the use of an auxiliary class Answer with fields ops and result.
// ops: list of ops that have not yet been tried out
// target: goal result
// currentOps: list of ops used so far
Answer test(ops, target, currentOps)
    if ops is now empty,
        return new Answer(currentOps, evaluate(currentOps))
    otherwise,
        // try including next op
        with the next operator in ops,
            Answer withOp = test(opsAfterNext, target,
                currentOps concatenated with next)
        // try *not* including next op
        Answer withoutOp = test(opsAfterNext, target, currentOps)
        if withOp.result closer to target than withoutOp.result,
            return withOp
        else
            return withoutOp
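In Java, the use-it-or-skip-it recursion from the pseudocode could look roughly like this. It is a simplified sketch: the names (ClosestByBacktracking, search) are invented, it carries the running value instead of a list of operations, it returns only the closest value rather than an Answer object, and it uses integer division as in the question's code.

```java
import java.util.function.IntUnaryOperator;

final class ClosestByBacktracking {
    // Returns the value closest to target that can be reached from `current`
    // by either applying or skipping each of ops[index..], in order.
    static int search(IntUnaryOperator[] ops, int index, int current, int target) {
        if (index == ops.length) {
            return current; // no operations left to decide on
        }
        // Branch 1: apply the next operation.
        int withOp = search(ops, index + 1, ops[index].applyAsInt(current), target);
        // Branch 2: skip it.
        int withoutOp = search(ops, index + 1, current, target);
        return Math.abs(target - withOp) < Math.abs(target - withoutOp) ? withOp : withoutOp;
    }

    public static void main(String[] args) {
        IntUnaryOperator[] ops = {
            x -> x + 3,
            x -> x - 3,
            x -> x * 4,
            x -> x / 2,
        };
        System.out.println(search(ops, 0, 0, 13)); // 12, via "+ 3" then "* 4"
    }
}
```

The recursion ends naturally once index reaches the number of operations, which is the termination condition the question was missing.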

Dynamic programming
If the target value is t, and there are n operations in the list, and the largest absolute value you can create by combining some subsequence of them is k, and the absolute value of the product of all values that appear as an operand of a division operation is d, then there's a simple O(dkn)-time and -space dynamic programming algorithm that determines whether it's possible to compute the value i using some subset of the first j operations and stores this answer (a single bit) in dp[i][j]:
dp[i][j] = dp[i][j-1] || dp[invOp(i, j)][j-1]
where invOp(i, j) computes the inverse of the jth operation on the value i. Note that if the jth operation is a multiplication by, say, x, and i is not divisible by x, then the operation is considered to have no inverse, and the term dp[invOp(i, j)][j-1] is deemed to evaluate to false. All other operations have unique inverses.
To avoid loss-of-precision problems with floating point code, first multiply the original target value t, as well as all operands to addition and subtraction operations, by d. This ensures that any division operation / x we encounter will only ever be applied to a value that is known to be divisible by x. We will essentially be working throughout with integer multiples of 1/d.
Because some operations (namely subtractions and divisions) require solving subproblems for higher target values, we cannot in general calculate dp[i][j] in a bottom-up way. Instead we can use memoisation of the top-down recursion, starting at the (scaled) target value t*d and working outwards in steps of 1 in each direction.
C++ implementation
I've implemented this in C++ at https://ideone.com/hU1Rpq. The "interesting" part is canReach(i, j); the functions preceding this are just plumbing to handle the memoisation table. Specify the inputs on stdin with the target value first, then a space-separated list of operations in which operators immediately precede their operand values, e.g.
10 +8 +11 /2
or
10 +4000 +5500 /1000
The second example, which should give the same answer (9.5) as the first, seems to be around the ideone (and my) memory limits, although this could be extended somewhat by using long long int instead of int and a 2-bit table for _m[][][] instead of wasting a full byte on each entry.
Exponential worst-case time and space complexity
Note that in general, dk or even just k by itself could be exponential in the size of the input: e.g. if there is an addition, followed by n-1 multiplication operations, each of which involves a number larger than 1. It's not too difficult to compute k exactly via a different DP that simply looks for the largest and smallest numbers reachable using the first i operations for all 1 <= i <= n, but all we really need is an upper bound, and it's easy enough to get a (somewhat loose) one: simply discard the signs of all multiplication operands, convert all - operations to + operations, and then perform all multiplication and addition operations (i.e., ignoring divisions).
There are other optimisations that could be applied, for example dividing through by any common factor.
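The linked C++ code is not reproduced here. As a much simpler relative of the same idea (never recompute a value that has already been reached), the sketch below keeps the set of distinct values reachable after each operation. Note the differences from the algorithm described above: it works forwards rather than top-down with inverse operations, it uses integer division like the question's code instead of scaling by d, and the names (ReachableValues, closest) are made up.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongUnaryOperator;

final class ReachableValues {
    static long closest(LongUnaryOperator[] ops, long target) {
        Set<Long> reachable = new HashSet<>();
        reachable.add(0L); // start value
        for (LongUnaryOperator op : ops) {
            // Skipping the operation keeps every value reached so far.
            Set<Long> next = new HashSet<>(reachable);
            // Using the operation adds its result for each known value;
            // duplicates collapse, which is where the saving comes from.
            for (long v : reachable) {
                next.add(op.applyAsLong(v));
            }
            reachable = next;
        }
        long best = 0L; // 0 is always reachable by using no operation
        for (long v : reachable) {
            if (Math.abs(target - v) < Math.abs(target - best)) {
                best = v;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        LongUnaryOperator[] ops = {x -> x + 3, x -> x - 3, x -> x * 4, x -> x / 2};
        System.out.println(closest(ops, 13));
    }
}
```

The set can still grow exponentially in the worst case, in line with the complexity caveat above.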

Here's a Java 8 example, using memoization. I wonder if annealing can be applied...
public class Tester {
    public static interface Operation {
        public int doOperation(int cur);
    }
    static Operation ops[] = { // lambdas for the operations
        (x -> x + 3),
        (x -> x - 3),
        (x -> x * 4),
        (x -> x / 2),
    };
    private static int getTarget(){
        return 2;
    }
    public static void main (String args[]){
        int map[];
        int val = 0;
        int MAX_BITMASK = (1 << ops.length) - 1;//means ops.length < 31 [int overflow]
        map = new int[MAX_BITMASK + 1]; // one slot per bitmask, 0..MAX_BITMASK
        map[0] = val;
        final int target = getTarget();// To get rid of dead code warning
        int closest = val, delta = target < 0? -target: target;
        int bestSeq = 0;
        if (0 == target) {
            System.out.println("Winning sequence: Do nothing");
        }
        int lastBitMask = 0, opIndex = 0;
        int i = 0;
        // brute force algo; <= so the mask that uses every operation is tried too
        for (i = 1; i <= MAX_BITMASK; i++){
            val = map[i & lastBitMask]; // get prev memoized value
            val = ops[opIndex].doOperation(val); // compute
            map[i] = val; //add new memo
            //the rest just logic to find the closest
            // except the last part
            int d = val - target;
            d = d < 0? -d: d;
            if (d < delta) {
                bestSeq = i;
                closest = val;
                delta = d;
            }
            if (val == target){ // no point to continue
                break;
            }
            //advance memo mask 0b001 to 0b011 to 0b111, etc.
            // as well as the computing operation.
            if ((i & (i + 1)) == 0){ // check for 2^n -1
                lastBitMask = (lastBitMask << 1) + 1;
                opIndex++;
            }
        }
        System.out.println("Winning sequence: " + bestSeq);
        System.out.println("Closest to \'" + target + "\' is: " + closest);
    }
}
Worth noting: the "winning sequence" is the bit representation (displayed as decimal) of which operations were used and which weren't, as the OP has done in the question.
For those of you coming from Java 7, this is what I was referencing for lambdas: Lambda Expressions in GUI Applications. So if you're constrained to 7, you can still make this work quite easily.

Infinite Fibonacci Sequence with Memoized in Java 8

Firstly, I'm a JavaScript programmer, fairly new to Java 8, and I'm trying out the new functional features.
Since I'm experienced in JS coding, I implemented my own lazy functional JS library as a proof of concept.
https://github.com/kenokabe/spacetime
Using the library, I could write infinite sequences of the natural numbers and Fibonacci numbers as below:
JavaScript
var spacetime = require('./spacetime');
var _ = spacetime.lazy();
var natural = _(function(n) //memoized automatically
{
    return n; // Natural numbers is defined as the `n`th number becomes `n`
});
var natural10 = _(natural)
    .take(10)
    .compute(function(x)
    {
        console.log(x);
    });
//wrap a recursive function to memoize
// must be at the definition in the same scope
var fib = _(function(n)
{
    if (n <= 1)
        return 1; // as the Fib definition in Math
    else
        return fib(n - 2) + fib(n - 1); // as the Fib definition in Math
});
var fib10 = _(fib)
    .take(10)
    .compute(function(x)
    {
        console.log(x);
    });
Clear enough. The point is that I can define the natural/Fibonacci infinite sequence exactly as in the mathematical definition, and later compute only the required part of the infinite sequence with lazy evaluation.
So, now I wonder if I can do the same in Java 8.
For the natural sequence, I have posted another question here:
Infinite sequence of Natural numbers with Java8 generator
One way to define the natural sequence is to use iterate in Java 8:
Java8
IntStream natural = IntStream.iterate(0, i -> i + 1);
natural
.limit(10)
.forEach(System.out::println);
I find IntStream natural = IntStream.iterate(0, i -> i + 1); a fair definition of the natural numbers in the mathematical sense.
However, I wonder if it's possible to define it as I did before, that is,
JavaScript
var natural = _(function(n) //memoized automatically
{
    return n; // Natural numbers is defined as the `n`th number becomes `n`
});
because this looks more concise. Unfortunately, the answers suggest it's probably not possible, even if we use generate.
In addition, IntStream.iterate is not a good fit for the Fibonacci sequence.
I searched the web for ways to generate an infinite Fibonacci sequence; the best results I found are
http://blog.informatech.cr/2013/05/08/memoized-fibonacci-numbers-with-java-8/
Java8
private static Map<Integer,Long> memo = new HashMap<>();
static {
    memo.put(0,0L); //fibonacci(0)
    memo.put(1,1L); //fibonacci(1)
}
//And for the inductive step all we have to do is redefine our Fibonacci function as follows:
public static long fibonacci(int x) {
    return memo.computeIfAbsent(x, n -> fibonacci(n-1) + fibonacci(n-2));
}
This is not an infinite sequence (lazy Stream in Java8).
and
Providing Limit condition on Stream generation
Java8
Stream.generate(new Supplier<Long>() {
    private long n1 = 1;
    private long n2 = 2;
    @Override
    public Long get() {
        long fibonacci = n1;
        long n3 = n2 + n1;
        n1 = n2;
        n2 = n3;
        return fibonacci;
    }
}).limit(50).forEach(System.out::println);
This is an infinite sequence (lazy Stream in Java8), and you could say it's defined as in math.
However, I do not like this implementation because, as you can see, it needs many internal variables to obtain the sequence, such as n1, n2, n3 and then fibonacci. Accordingly, the code structure is complicated and you need to control mutable state, which is anti-functional and unlike the math definition, and it is probably not memoized.
So, here's my question. With a Java8 Stream, is there any way to define the infinite Fibonacci sequence in a concise, mathematical manner with memoization, like
JavaScript
var fib = _(function(n)
{
    if (n <= 1)
        return 1; // as the Fib definition in Math
    else
        return fib(n - 2) + fib(n - 1); // as the Fib definition in Math
});
Thanks for your thoughts.

You can take your map-based memoized fibonacci(x) and make an infinite stream out of it like this:
LongStream fibs = IntStream.iterate(1, i->i+1).mapToLong(i -> fibonacci(i));
But the easiest way to make an infinite stream of fibonacci numbers is like this:
LongStream fibs = Stream.iterate(
    new long[]{1, 1},
    f -> new long[]{f[1], f[0] + f[1]}
).mapToLong(f -> f[0]);
As the article you linked to points out, "infinite" really means "until long overflows" which happens quickly. If you want to generate hundreds of fibonacci numbers, replace long with BigInteger:
Stream<BigInteger> bigFibs = Stream.iterate(
    new BigInteger[]{BigInteger.ONE, BigInteger.ONE},
    f -> new BigInteger[]{f[1], f[0].add(f[1])}
).map(f -> f[0]);
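A short usage sketch for the BigInteger variant follows. The wrapper class FibStream and its methods fibs and firstN are hypothetical helpers added for illustration; since a stream can only be consumed once, the sketch builds a fresh one per use.

```java
import java.math.BigInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FibStream {
    // A fresh, lazy, unbounded stream of Fibonacci numbers starting 1, 1, 2, ...
    static Stream<BigInteger> fibs() {
        return Stream.iterate(
                new BigInteger[]{BigInteger.ONE, BigInteger.ONE},
                f -> new BigInteger[]{f[1], f[0].add(f[1])})
            .map(f -> f[0]);
    }

    // Takes only the first n elements; nothing beyond them is computed.
    static String firstN(int n) {
        return fibs().limit(n)
                .map(BigInteger::toString)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(firstN(10)); // 1,1,2,3,5,8,13,21,34,55
        // The 100th element no longer fits in a long but is fine as BigInteger.
        System.out.println(fibs().skip(99).findFirst().get());
    }
}
```

Each element is derived from the previous pair, so no separate memo table is needed for sequential traversal.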
