First of all, I'm a JavaScript programmer, fairly new to Java 8, and trying out the new functional features.
Since my expertise is JS coding, I implemented my own JS lazy-functional library as a proof of concept.
https://github.com/kenokabe/spacetime
Using the library, I can write an infinite sequence of the natural numbers and of Fibonacci numbers as below:
JavaScript
var spacetime = require('./spacetime');
var _ = spacetime.lazy();
var natural = _(function(n) //memoized automatically
{
return n; // Natural numbers is defined as the `n`th number becomes `n`
});
var natural10 = _(natural)
.take(10)
.compute(function(x)
{
console.log(x);
});
//wrap a recursive function to memoize
// must be at the definition in the same scope
var fib = _(function(n)
{
if (n <= 1)
return 1; // as the Fib definition in Math
else
return fib(n - 2) + fib(n - 1); // as the Fib definition in Math
});
var fib10 = _(fib)
.take(10)
.compute(function(x)
{
console.log(x);
});
Clear enough. The point is that I can define the natural/Fibonacci infinite sequences exactly as they are defined in math, and then later compute only the required part of the infinite sequence with lazy evaluation.
So now I wonder if I can do the same in Java 8.
For the natural sequence, I posted another question here:
Infinite sequence of Natural numbers with Java8 generator
One way to define the natural sequence is to use Java 8's iterate:
Java8
IntStream natural = IntStream.iterate(0, i -> i + 1);
natural
.limit(10)
.forEach(System.out::println);
I observe that IntStream natural = IntStream.iterate(0, i -> i + 1); is a fair definition of the natural numbers in the math sense.
However, I wonder if it's possible to define it as I did before, that is,
JavaScript
var natural = _(function(n) //memoized automatically
{
return n; // Natural numbers is defined as the `n`th number becomes `n`
});
because this looks more concise. Unfortunately, the answers suggest it's probably not possible even if we use generate.
In addition, IntStream.iterate does not fit the Fibonacci sequence.
I searched the web for ways to generate an infinite sequence of Fibonacci numbers; the best results I found are
http://blog.informatech.cr/2013/05/08/memoized-fibonacci-numbers-with-java-8/
Java8
private static Map<Integer,Long> memo = new HashMap<>();
static {
memo.put(0,0L); //fibonacci(0)
memo.put(1,1L); //fibonacci(1)
}
//And for the inductive step all we have to do is redefine our Fibonacci function as follows:
public static long fibonacci(int x) {
return memo.computeIfAbsent(x, n -> fibonacci(n-1) + fibonacci(n-2));
}
This is not an infinite sequence (a lazy Stream in Java 8), though.
and
Providing Limit condition on Stream generation
Java8
Stream.generate(new Supplier<Long>() {
private long n1 = 1;
private long n2 = 2;
@Override
public Long get() {
long fibonacci = n1;
long n3 = n2 + n1;
n1 = n2;
n2 = n3;
return fibonacci;
}
}).limit(50).forEach(System.out::println);
This is an infinite sequence (a lazy Stream in Java 8), and you could say it's defined as in math.
However, I don't like this implementation because, as you can see, it needs several internal variables to produce the sequence (n1, n2, n3, and fibonacci). Accordingly, the code structure is complicated and you have to manage mutable state, which is an anti-functional manner, unlike the math definition. It is probably not memoized either.
So, here's my question. With a Java 8 Stream, is there any way to write code that defines the infinite Fibonacci sequence in a concise, math-like manner with memoization, like
JavaScript
var fib = _(function(n)
{
if (n <= 1)
return 1; // as the Fib definition in Math
else
return fib(n - 2) + fib(n - 1); // as the Fib definition in Math
});
Thanks for your thoughts.
You can take your map-based memoized fibonacci(x) and make an infinite stream out of it like this:
LongStream fibs = IntStream.iterate(1, i->i+1).mapToLong(i -> fibonacci(i));
But the easiest way to make an infinite stream of fibonacci numbers is like this:
LongStream fibs = Stream.iterate(
new long[]{1, 1},
f -> new long[]{f[1], f[0] + f[1]}
).mapToLong(f -> f[0]);
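Either way, you can then consume the stream lazily; for example, to print the first ten values:
fibs.limit(10).forEach(System.out::println); // 1 1 2 3 5 8 13 21 34 55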
As the article you linked to points out, "infinite" really means "until long overflows" which happens quickly. If you want to generate hundreds of fibonacci numbers, replace long with BigInteger:
Stream<BigInteger> bigFibs = Stream.iterate(
new BigInteger[]{BigInteger.ONE, BigInteger.ONE},
f -> new BigInteger[]{f[1], f[0].add(f[1])}
).map(f -> f[0]);
Related
It seems that when using ordered Streams to process a short-circuiting operation on a numeric range that is difficult to bound, parallel() cannot be used.
E.g.:
public class InfiniteTest {
private static boolean isPrime(int x) {
if (x < 2) {
return false;
}
if (x % 2 == 0 && x > 2) {
return false;
}
// loop while i <= sqrt(x), using multiply for speedup
for (int i = 3; i * i <= x; i += 2) {
if (x % i == 0) {
return false;
}
}
return true;
}
private static int findNthPrime(final int n) {
// must not use infinite stream, causes OOME
// but even big size causes huge slowdown
return IntStream.range(1, 1_000_000_000)
// .parallel()
.filter(InfiniteTest::isPrime)
.skip(n - 1)
.findFirst()
.getAsInt();
}
public static void main(String[] args) {
int n = 1000; // find the nth prime number
System.out.println(findNthPrime(n));
}
}
This sequential stream works fine. But when I add parallel(), it seems to run forever (or at least for a very long time). I assume it's because the stream threads work on arbitrary numbers instead of starting with the first numbers in the stream. I cannot usefully bound the range of integers to scan for prime numbers.
So is there any simple trick to run this problem in parallel with streams without that trap, such as forcing the spliterator to serve chunks of work from the beginning of the stream? Or building the stream from substreams that cover increasing number ranges?
Or somehow setting up the multithreading as producer/consumer pattern but with streams?
Similar questions all just seem to try to discourage use of parallel:
Generate infinite parallel stream
Java 8, using .parallel in a stream causes OOM error
Java 8's streams: why parallel stream is slower?
Apart from 2 and 3, all prime numbers are of the form 6n-1 or 6n+1. You already treat 2 as a special case in your code. You might want to try also treating 3 as special:
if (x % 3 == 0) {
return x == 3;
}
And then run two parallel streams, one testing numbers of the form 6n-1, starting at 5, and the other testing numbers of the form 6n+1, starting at 7. Each stream can skip six numbers at a time.
You can use the Prime Number theorem to estimate the value of the nth prime and set the limit of your search slightly above that estimate for safety.
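As a rough sketch of the bounding idea (the method name is my own, and it reuses the question's isPrime): the estimate p(n) ≈ n(ln n + ln ln n) holds for n ≥ 6, and once the range is bounded and ordered, skip and findFirst respect encounter order even in parallel:
private static int findNthPrimeBounded(final int n) {
    double x = Math.max(n, 6);
    // Prime Number Theorem estimate for the nth prime, plus a 10% safety margin
    int limit = (int) (x * (Math.log(x) + Math.log(Math.log(x))) * 1.1);
    return IntStream.rangeClosed(2, limit)
        .parallel()
        .filter(InfiniteTest::isPrime)
        .skip(n - 1)
        .findFirst()
        .getAsInt();
}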
TL/DR: It is not possible.
It seems that processing unbounded streams in parallel with a short-circuiting method to find the earliest occurrences (in stream order) of anything is not possible in a useful way ("useful" meaning better than sequential in terms of time to find the result).
Explanation
I tried a custom implementation of AbstractIntSpliterator that splits the stream not into partitions (1-100, 101-200, ...) but interleavingly ([0, 2, 4, 6, 8, ...], [1, 3, 5, 7, ...]). This works correctly in the sequential case:
/**
 * Provides numbers starting at n; on split, this spliterator and the
 * child spliterator provide interleaving numbers
 */
public class InterleaveSplitIntSplitIterator extends Spliterators.AbstractIntSpliterator {
private int current;
private int increment;
protected InterleaveSplitIntSplitIterator(int start, int increment) {
super(Integer.MAX_VALUE,
Spliterator.DISTINCT
// splitting is interleaved, not prefixing
// | Spliterator.ORDERED
| Spliterator.NONNULL
| Spliterator.IMMUTABLE
// SORTED must imply ORDERED
// | Spliterator.SORTED
);
if (increment == 0) {
throw new IllegalArgumentException("Increment must be non-zero");
}
this.current = start;
this.increment = increment;
}
@Override
public boolean tryAdvance(IntConsumer action) {
// Don't benchmark with this on
// System.out.println(Thread.currentThread() + " " + current);
action.accept(current);
current += increment;
return true;
}
// this is required for ORDERED even if sorted() is never called
@Override
public Comparator<? super Integer> getComparator() {
if (increment > 0) {
return null;
}
return Comparator.<Integer>naturalOrder().reversed();
}
@Override
public OfInt trySplit() {
if (increment >= 2) {
return null;
}
int newIncrement = this.increment * 2;
int oldIncrement = this.increment;
this.increment = newIncrement;
return new InterleaveSplitIntSplitIterator(current + oldIncrement, newIncrement);
}
// for convenience
public static IntStream asIntStream(int start, int increment) {
return StreamSupport.intStream(
new InterleaveSplitIntSplitIterator(start, increment),
/* no, never set parallel here */ false);
}
}
However, such streams cannot have the Spliterator.ORDERED characteristic, because
If so, this Spliterator guarantees that method {@link #trySplit} splits a strict prefix of elements
and this also means such a stream cannot keep its SORTED characteristic, because
A Spliterator that reports {@code SORTED} must also report {@code ORDERED}
So my spliterator in parallel ends up producing (somewhat) jumbled numbers, which would have to be fixed by sorting before applying a limit, and that does not work well with infinite streams (in the general case).
So all solutions to this must use a spliterator that splits into chunks or prefix data, which are then consumed in ~arbitrary order; this causes many number ranges beyond the actual result to be processed, becoming (much) slower in general than a sequential solution.
So other than bounding the number range to test, it seems there cannot be a solution using a parallel stream. The problem is in the specification requiring the ORDERED characteristic to split a Stream by prefixing, instead of providing a different means of reassembling ordered stream results from multiple spliterators.
However, a solution using a sequential stream with inputs processed in parallel (buffered) may still be possible (but not as simple as calling parallel()), as sketched below.
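For instance, here is a rough sketch of that idea (my own code, reusing the question's isPrime): test fixed-size chunks with a parallel stream, but consume the chunk results in ascending order, so that only work within the current chunk is potentially wasted:
static int findNthPrimeChunked(int n, int chunkSize) {
    int found = 0; // primes counted in earlier chunks
    for (int lo = 2; ; lo += chunkSize) {
        // parallel within the chunk; toArray preserves encounter order
        int[] primes = IntStream.range(lo, lo + chunkSize)
            .parallel()
            .filter(InfiniteTest::isPrime)
            .toArray();
        if (found + primes.length >= n) {
            return primes[n - found - 1]; // the nth prime lies in this chunk
        }
        found += primes.length;
    }
}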
For a college project of mine I needed to implement a deep-learning neural network in plain Java. After profiling the application, I wanted to see if the automatic parallelization using Java's stream API would lead to a significant improvement in performance, but I am struggling to transform my old code into a stream-based approach.
The method takes a vector (double array), performs a matrix multiplication, then adds a value to each element and finally applies a lambda function (DoubleFunction) to every element.
Here is the old code that I want to replace:
/* e.g.
double[] x = double[100]
int inputNeurons = 100
int outputNeurons = 200
double[][] weights = double[200][100]
double[] biases = double[200]
*/
private double[] output(double[] x) {
    double[] y = new double[outputNeurons];
    for (int i = 0; i < outputNeurons; i++) {
        double preActivation = 0.;
        for (int j = 0; j < inputNeurons; j++) {
            preActivation += weights[i][j] * x[j];
        }
        preActivation += biases[i];
        y[i] = activation.apply(preActivation);
    }
    return y;
}
This is what I came up with so far (it does not work):
private double[] output(double[] x) {
return Arrays.stream(weights).parallel()
.map(outputNeuron -> IntStream.range(0, outputNeurons)
.mapToDouble(i -> IntStream.range(0, inputNeurons)
.mapToDouble(j -> x[i] * outputNeuron[i]).sum()
).map(activation::apply)
).toArray();
}
Since I don't know streams well enough, I would really appreciate any help!
Good attempt, but your stream approach is quite far off the imperative one. The exact equivalent of your imperative approach is:
return IntStream.range(0, outputNeurons)
//.parallel() uncomment to see difference in performance
.mapToDouble(i -> IntStream.range(0, inputNeurons)
.mapToDouble(j -> weights[i][j] * x[j]).sum() + biases[i])
.map(activation::apply)
.toArray();
Note that there are many factors that influence whether parallel streams will make your code faster or slower than the imperative approach or sequential streams. Thus, you'll need to consider some factors before going parallel:
Data size
Number of cores
Cost per element (meaning time spent executing in parallel and overhead of decomposition and merging)
Source data structure
Packing (meaning primitive types are faster to operate on than boxed values).
You should also consider reading Should I always use a parallel stream when possible?
My program has this line:
Function<String, Integer> f = (String s) -> s.chars().reduce(0, (a, b) -> 2 * a + b);
The function being passed to reduce is not associative. Reduce's documentation says that the function passed must be associative.
How can I rewrite this as an expression which doesn't break reduce's contract?
Under the current implementation, and if (and only if) you are not going to use parallel, you are safe with what you have right now, provided you are OK with these disclaimers.
Or you can obviously compute the function with a for loop; folding from the left with total * 2 + c matches reduce(0, (a, b) -> 2 * a + b) and also handles strings shorter than two characters:
Function<String, Integer> f = s -> {
    int total = 0;
    for (int x = 0; x < s.length(); x++) {
        total = total * 2 + s.charAt(x);
    }
    return total;
};
You can convert this function to an associative function, as explained in this answer using the example of List.hashCode(). The difference lies only in the factor (2 vs. 31) and the start value (0 vs. 1).
It can be adapted to your task, which is especially easy when you have a random access input like a String:
Function<String, Integer> f =
s -> IntStream.range(0, s.length()).map(i -> s.charAt(i)<<(s.length()-i-1)).sum();
This would even run in parallel, but it's unlikely that you will ever encounter strings so humongous that a parallel evaluation provides a benefit. So what remains is that most people might consider this solution less readable than a simple for loop…
Note that the above solution exhibits a different overflow behavior if the String has more than 32 chars, due to the use of the shift operator rather than multiplication by two.
The fix for this issue makes the solution even more efficient:
Function<String, Integer> f = s ->
IntStream.range(Math.max(0, s.length()-32), s.length())
.map(i -> s.charAt(i)<<(s.length()-i-1)).sum();
If the string has more than 32 chars, it only processes the last 32 chars, which is already sufficient to calculate the same result as your original function.
I am working on a class project to create a more efficient Fibonacci than the recursive version Fib(n-1) + Fib(n-2). For this project I need to use BigInteger. So far I have had the idea of using a map to store the previous Fibonacci numbers.
public static BigInteger theBigFib(BigInteger n) {
Map<BigInteger, BigInteger> store = new TreeMap<BigInteger, BigInteger>();
if (n.intValue()<= 2){
return BigInteger.ONE;
}else if(store.containsKey(n)){
return store.get(n);
}else{
BigInteger one = new BigInteger("1");
BigInteger two = new BigInteger("2");
BigInteger val = theBigFib(n.subtract(one)).add(theBigFib(n.subtract(two)));
store.put(n,val);
return val;
}
}
I think that the map is storing more than it should be. I also think this line
BigInteger val = theBigFib(n.subtract(one)).add(theBigFib(n.subtract(two)));
is an issue. I'd appreciate it if anyone could shed some light on what I'm doing wrong, or suggest another solution to make it faster than the basic code.
Thanks!
You don't need all the previous BigIntegers, you just need the last 2.
Instead of a recursive solution you can use a loop.
public static BigInteger getFib(int n) {
BigInteger a = BigInteger.ONE;
BigInteger b = BigInteger.ONE;
if (n < 2) {
return a;
}
BigInteger c = null;
while (n-- >= 2) {
c = a.add(b);
a = b;
b = c;
}
return c;
}
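For example, under this indexing (F(0) = F(1) = 1):
System.out.println(getFib(10)); // 89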
If you want to store all the previous values, you can use an array instead.
static BigInteger[] memo = new BigInteger[MAX]; // MAX: an upper bound on the largest n you will query
public static BigInteger getFib(int n) {
if (n < 2) {
return new BigInteger("1");
}
if (memo[n] != null) {
return memo[n];
}
memo[n] = getFib(n - 1).add(getFib(n - 2));
return memo[n];
}
If you just want the nth Fibonacci value fast and efficiently, you can use the matrix form of Fibonacci:
A   = | 1 1 |
      | 1 0 |

A^n = | F(n+1)  F(n)   |
      | F(n)    F(n-1) |
You can efficiently calculate A^n using Exponentiation by Squaring.
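For illustration, here is a rough sketch of that approach with BigInteger (the class and method names are my own, not from the question):
import java.math.BigInteger;

public class MatrixFib {
    // Multiply two 2x2 BigInteger matrices.
    static BigInteger[][] multiply(BigInteger[][] a, BigInteger[][] b) {
        return new BigInteger[][] {
            { a[0][0].multiply(b[0][0]).add(a[0][1].multiply(b[1][0])),
              a[0][0].multiply(b[0][1]).add(a[0][1].multiply(b[1][1])) },
            { a[1][0].multiply(b[0][0]).add(a[1][1].multiply(b[1][0])),
              a[1][0].multiply(b[0][1]).add(a[1][1].multiply(b[1][1])) }
        };
    }

    // A^n via exponentiation by squaring: O(log n) matrix multiplications.
    public static BigInteger fib(int n) {
        BigInteger[][] result = { { BigInteger.ONE, BigInteger.ZERO },
                                  { BigInteger.ZERO, BigInteger.ONE } }; // identity
        BigInteger[][] base = { { BigInteger.ONE, BigInteger.ONE },
                                { BigInteger.ONE, BigInteger.ZERO } };   // A
        while (n > 0) {
            if ((n & 1) == 1) {
                result = multiply(result, base); // fold in this power of A
            }
            base = multiply(base, base);
            n >>= 1;
        }
        return result[0][1]; // A^n = [[F(n+1), F(n)], [F(n), F(n-1)]]
    }

    public static void main(String[] args) {
        System.out.println(fib(100)); // 354224848179261915075
    }
}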
I believe the main issue in your code is that you create a new Map on each function call. Note that it's still a local variable, despite your method being static. So you're guaranteed that the store.containsKey(n) condition never holds, and your solution is no better than the naive one, i.e. it still has complexity exponential in n. More precisely, it takes about F(n) steps to get to the answer (basically because every "one" that makes up your answer is returned by some function call).
I'd suggest making the variable a static field instead of a local variable. Then the number of calls becomes linear instead of exponential and you will see a significant improvement. Other solutions include a for loop with three variables which iteratively calculates the Fibonacci numbers from 0, 1, 2 up to the n-th; the best solutions I know involve matrix exponentiation or the explicit formula with real numbers (which is bad for precision), but that's a question better suited for the Computer Science StackExchange site, imho.
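A minimal sketch of that fix, keeping the question's names (the lookup is written out explicitly rather than via computeIfAbsent, because the mapping function there must not modify the map, and this one recurses into it):
private static final Map<BigInteger, BigInteger> store = new TreeMap<>();

public static BigInteger theBigFib(BigInteger n) {
    if (n.intValue() <= 2) {
        return BigInteger.ONE;
    }
    BigInteger cached = store.get(n); // the memo now persists across calls
    if (cached != null) {
        return cached;
    }
    BigInteger val = theBigFib(n.subtract(BigInteger.ONE))
            .add(theBigFib(n.subtract(BigInteger.valueOf(2))));
    store.put(n, val);
    return val;
}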
I have an array of operations and a target number.
The operations could be
+ 3
- 3
* 4
/ 2
I want to find out how close I can get to the target number by using those operations.
I start from 0, and I need to iterate through the operations in that order; for each operation I can choose either to use it or not.
So if the target number is 13, I can use + 3 and * 4 to get 12 which is the closest I can get to the target number 13.
I guess I need to compute all possible combinations (I guess the number of calculations is thus 2^n where n is the number of operations).
I have tried to do this in java with
import java.util.*;
public class Instruction {
public static void main(String[] args) {
// create scanner
Scanner sc = new Scanner(System.in);
// number of instructions
int N = sc.nextInt();
// target number
int K = sc.nextInt();
//
String[] instructions = new String[N];
// N instructions follow
for (int i=0; i<N; i++) {
//
instructions[i] = sc.nextLine();
}
//
System.out.println(search(instructions, 0, N, 0, K, 0, K));
}
public static int search(String[] instructions, int index, int length, int progressSoFar, int targetNumber, int bestTarget, int bestDistance) {
//
for (int i=index; i<length; i++) {
// get operator
char operator = instructions[i].charAt(0);
// get number
int number = Integer.parseInt(instructions[i].split("\\s+")[1]);
//
if (operator == '+') {
progressSoFar += number;
} else if (operator == '*') {
progressSoFar *= number;
} else if (operator == '-') {
progressSoFar -= number;
} else if (operator == '/') {
progressSoFar /= number;
}
//
int distance = Math.abs(targetNumber - progressSoFar);
// if the absolute distance between progress so far
// and the target number is less than what we have
// previously accomplished, we update best distance
if (distance < bestDistance) {
bestTarget = progressSoFar;
bestDistance = distance;
}
//
if (true) {
return bestTarget;
} else {
return search(instructions, index + 1, length, progressSoFar, targetNumber, bestTarget, bestDistance);
}
}
}
}
It doesn't work yet, but I guess I'm a little closer to solving my problem. I just don't know how to end my recursion.
But maybe I shouldn't use recursion and should instead just list all combinations. I just don't know how to do that.
If I, for instance, have 3 operations and I want to compute all combinations, I get the 2^3 combinations
111
110
101
011
000
001
010
100
where 1 indicates that the operation is used and 0 indicates that it is not used.
It should be rather simple to do this and then choose the combination that gave the best result (the number closest to the target number), but I don't know how to do this in Java.
In pseudocode, you could try brute-force back-tracking, as in:
// ops: list of ops that have not yet been tried out
// target: goal result
// currentOps: list of ops used so far
// best: reference to the best result achieved so far (can be altered; use
// an int[1], for example)
// opsForBest: list of ops used to achieve best result so far
test(ops, target, currentOps, best, opsForBest)
if ops is now empty,
current = evaluate(currentOps)
if current is closer to target than best,
best = current
opsForBest = a copy of currentOps
otherwise,
// try including next op
with the next operator in ops,
test(opsAfterNext, target,
currentOps concatenated with next, best, opsForBest)
// try *not* including next op
test(opsAfterNext, target, currentOps, best, opsForBest)
This is guaranteed to find the best answer. However, it will repeat many operations over and over. You can save some time by avoiding repeated calculations, which can be achieved by using a cache of "how does this subexpression evaluate". When you include the cache, you enter the realm of "dynamic programming" (= reusing earlier results in later computation). A Java sketch follows.
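As a concrete illustration, here is a rough Java translation of the first variant (my own code; it applies each op while recursing instead of collecting a list of ops and evaluating it at the end):
import java.util.Arrays;
import java.util.List;
import java.util.function.IntUnaryOperator;

public class BruteForce {
    static int best;         // closest value found so far
    static int bestDistance; // |target - best|

    static void test(List<IntUnaryOperator> ops, int index, int target, int current) {
        if (index == ops.size()) {        // all ops decided: compare with best
            int distance = Math.abs(target - current);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = current;
            }
            return;
        }
        // try including the next op
        test(ops, index + 1, target, ops.get(index).applyAsInt(current));
        // try *not* including the next op
        test(ops, index + 1, target, current);
    }

    public static void main(String[] args) {
        List<IntUnaryOperator> ops = Arrays.asList(
                x -> x + 3, x -> x - 3, x -> x * 4, x -> x / 2);
        int target = 13;
        best = 0;
        bestDistance = Math.abs(target - best);
        test(ops, 0, target, 0);
        System.out.println("Closest: " + best); // 12, via +3 then *4
    }
}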
Edit: adding a more OO-ish variant
Variant returning the best result, and avoiding the use of that best[] array-of-one. Requires the use of an auxiliary class Answer with fields ops and result.
// ops: list of ops that have not yet been tried out
// target: goal result
// currentOps: list of ops used so far
Answer test(ops, target, currentOps)
if ops is now empty,
return new Answer(currentOps, evaluate(currentOps))
otherwise,
// try including next op
with the next operator in ops,
Answer withOp = test(opsAfterNext, target,
currentOps concatenated with next)
// try *not* including next op
Answer withoutOp = test(opsAfterNext, target, currentOps)
if withOp.result closer to target than withoutOp.result,
return withOp
else
return withoutOp
Dynamic programming
If the target value is t, and there are n operations in the list, and the largest absolute value you can create by combining some subsequence of them is k, and the absolute value of the product of all values that appear as an operand of a division operation is d, then there's a simple O(dkn)-time and -space dynamic programming algorithm that determines whether it's possible to compute the value i using some subset of the first j operations and stores this answer (a single bit) in dp[i][j]:
dp[i][j] = dp[i][j-1] || dp[invOp(i, j)][j-1]
where invOp(i, j) computes the inverse of the jth operation on the value i. Note that if the jth operation is a multiplication by, say, x, and i is not divisible by x, then the operation is considered to have no inverse, and the term dp[invOp(i, j)][j-1] is deemed to evaluate to false. All other operations have unique inverses.
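For example, if the jth operation is + 3, then invOp(i, j) = i - 3; if it is * 2, then invOp(i, j) = i / 2, which exists only when i is even.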
To avoid loss-of-precision problems with floating point code, first multiply the original target value t, as well as all operands to addition and subtraction operations, by d. This ensures that any division operation / x we encounter will only ever be applied to a value that is known to be divisible by x. We will essentially be working throughout with integer multiples of 1/d.
Because some operations (namely subtractions and divisions) require solving subproblems for higher target values, we cannot in general calculate dp[i][j] in a bottom-up way. Instead we can use memoisation of the top-down recursion, starting at the (scaled) target value t*d and working outwards in steps of 1 in each direction.
C++ implementation
I've implemented this in C++ at https://ideone.com/hU1Rpq. The "interesting" part is canReach(i, j); the functions preceding it are just plumbing to handle the memoisation table. Specify the inputs on stdin with the target value first, then a space-separated list of operations in which each operator immediately precedes its operand value, e.g.
10 +8 +11 /2
or
10 +4000 +5500 /1000
The second example, which should give the same answer (9.5) as the first, seems to be around the ideone (and my) memory limits, although this could be extended somewhat by using long long int instead of int and a 2-bit table for _m[][][] instead of wasting a full byte on each entry.
Exponential worst-case time and space complexity
Note that in general, dk or even just k by itself could be exponential in the size of the input: e.g. if there is an addition, followed by n-1 multiplication operations, each of which involves a number larger than 1. It's not too difficult to compute k exactly via a different DP that simply looks for the largest and smallest numbers reachable using the first i operations for all 1 <= i <= n, but all we really need is an upper bound, and it's easy enough to get a (somewhat loose) one: simply discard the signs of all multiplication operands, convert all - operations to + operations, and then perform all multiplication and addition operations (i.e., ignoring divisions).
There are other optimisations that could be applied, for example dividing through by any common factor.
Here's a Java 8 example, using memoization. I wonder if annealing can be applied...
public class Tester {
public static interface Operation {
public int doOperation(int cur);
}
static Operation ops[] = { // lambdas for the operations
(x -> x + 3),
(x -> x - 3),
(x -> x * 4),
(x -> x / 2),
};
private static int getTarget(){
return 2;
}
public static void main (String args[]){
int map[];
int val = 0;
int MAX_BITMASK = (1 << ops.length) - 1;//requires ops.length < 31 [int overflow]
map = new int[MAX_BITMASK + 1];
map[0] = val;
final int target = getTarget();// To get rid of dead code warning
int closest = val, delta = target < 0? -target: target;
int bestSeq = 0;
if (0 == target) {
System.out.println("Winning sequence: Do nothing");
}
int lastBitMask = 0, opIndex = 0;
int i = 0;
for (i = 1; i <= MAX_BITMASK; i++){// brute force algo
val = map[i & lastBitMask]; // get prev memoized value
val = ops[opIndex].doOperation(val); // compute
map[i] = val; //add new memo
//the rest just logic to find the closest
// except the last part
int d = val - target;
d = d < 0? -d: d;
if (d < delta) {
bestSeq = i;
closest = val;
delta = d;
}
if (val == target){ // no point to continue
break;
}
//advance memo mask 0b001 to 0b011 to 0b111, etc.
// as well as the computing operation.
if ((i & (i + 1)) == 0){ // check for 2^n -1
lastBitMask = (lastBitMask << 1) + 1;
opIndex++;
}
}
System.out.println("Winning sequence: " + bestSeq);
System.out.println("Closest to \'" + target + "\' is: " + closest);
}
}
Worth noting: the "winning sequence" is the bit representation (displayed as decimal) of which operations were used and which weren't, as the OP did in the question.
For those of you coming from Java 7, this is what I was referencing for lambdas: Lambda Expressions in GUI Applications. So if you're constrained to Java 7, you can still make this work quite easily.