How does this recursive lambda call in Java work?

I recently came across this piece of code in Java. It uses Function to print Fibonacci numbers, and it works.
import java.util.function.Function;
import java.util.stream.IntStream;

public class AppLambdaSubstitution {

    public static Function<Integer, Integer> Y(
            Function<Function<Integer, Integer>, Function<Integer, Integer>> f) {
        return x -> f.apply(Y(f)).apply(x);
    }

    public static void main(String[] args) {
        Function<Integer, Integer> fib = Y(func -> x -> {
            if (x < 2)
                return x;
            else
                return func.apply(x - 1) + func.apply(x - 2);
        });

        IntStream.range(1, 11)
                 .mapToObj(Integer::valueOf)
                 .map(fib)
                 .forEach(System.out::println);
    }
}
The part that has me confused is return x -> f.apply(Y(f)).apply(x);. Isn't Y(f) a recursive call to the method Y? We keep calling it with the Function f as a parameter. To me, there's no base case for this recursive call to return from. Why is there no overflow resulting from an endless recursive call?

Fundamentally, you are missing the point that x -> f.apply(Y(f)).apply(x); does not call apply; it returns a Function.
That's just a very complicated (and non-intuitive) way of demonstrating currying and recursion, IMO. Things would be much simpler if you replaced a couple of things to make it more readable.
This construction:
Function<Function<Integer, Integer>, Function<Integer, Integer>>
is not really needed, since the left type parameter is never used on its own; it exists only to get hold of the right one. As such, the left side could be anything at all (I will later replace it with a Supplier; that is not needed either, but it proves the point).
Actually, all you care about here is the Function that does the actual computation for each element of the Stream:
public static Function<Integer, Integer> right() {
    return new Function<Integer, Integer>() {
        @Override
        public Integer apply(Integer x) {
            if (x < 2) {
                return x;
            } else {
                return apply(x - 1) + apply(x - 2);
            }
        }
    };
}
Now you could write that entire construct with:
Supplier<Function<Integer, Integer>> toUse = () -> right();
Function<Integer, Integer> fib = curry(toUse);

IntStream.range(1, 11)
         .mapToObj(Integer::valueOf)
         .map(fib)
         .forEach(System.out::println);
This Supplier<Function<Integer, Integer>> toUse = () -> right(); should make you understand why, in the previous example (Function<Function, Function>), the left part was needed: just to get hold of the right one.
If you look even closer, you might notice that the Supplier is not needed at all, so you can simplify further:
IntStream.range(1, 11)
         .mapToObj(Integer::valueOf)
         .map(right())
         .forEach(System.out::println);

Related

How to xor predicates?

I need a method where I xor Predicates which I will receive as method params. I have a somewhat working but cumbersome solution for two predicates. To give a simple, minimal and reproducible example:
Predicate<String> pred1 = s -> s.contains("foo");
Predicate<String> pred2 = s -> s.contains("bar");
String toTest = "foobar";
The logical OR will return true for given predicates and the test string:
boolean oneOnly = pred1.or(pred2).test(toTest);
but for my use case it should return false since both substrings are included. It should only return true if and only if one condition is met.
For two predicates I have this:
static boolean xor(Predicate<String> pred1, Predicate<String> pred2, String toTest) {
    return pred1.and(pred2.negate()).or(pred2.and(pred1.negate())).test(toTest);
}
Is there a simple but convenient way to xor predicates?
In follow-up to @xdhmoore's answer: that's overkill and can be done much more simply:
static <T> Predicate<T> xor(Predicate<T> pred1, Predicate<T> pred2) {
    return t -> pred1.test(t) ^ pred2.test(t);
}
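Applied to the strings from the question (a quick sketch; XorDemo is a hypothetical wrapper class for the helper):

```java
import java.util.function.Predicate;

public class XorDemo {
    public static <T> Predicate<T> xor(Predicate<T> pred1, Predicate<T> pred2) {
        return t -> pred1.test(t) ^ pred2.test(t);
    }

    public static void main(String[] args) {
        Predicate<String> pred1 = s -> s.contains("foo");
        Predicate<String> pred2 = s -> s.contains("bar");
        System.out.println(xor(pred1, pred2).test("foobar")); // false: both match
        System.out.println(xor(pred1, pred2).test("foo"));    // true: exactly one matches
        // Since a Predicate is returned, it still composes with and/or/negate:
        System.out.println(xor(pred1, pred2).negate().test("foobar")); // true
    }
}
```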
Update:
Below are some examples of why you'd want to return a Predicate instead of a boolean, but @rzwitserloot's answer does it more succinctly.
To play the Devil's advocate: it's less pretty, but one advantage for the way you already have it is you are slightly more in line with the Predicate idioms. A little tweaking gets you:
Return a Predicate
static <T> Predicate<T> xor(Predicate<T> pred1, Predicate<T> pred2) {
    return pred1.and(pred2.negate())
                .or(pred2.and(pred1.negate()));
}

// Which means you can do this, which is probably more conducive to combining your
// new xor function with other predicates:
xor((Integer a) -> a > 1, (Integer b) -> b < 10).test(0);

// For example, because you return a Predicate:
xor((Integer a) -> a > 1, (Integer b) -> b < 10).negate().test(0);
Return a boolean
static <T> boolean xor(Predicate<T> pred1, Predicate<T> pred2, T toTest) {
    return pred1.test(toTest) ^ pred2.test(toTest);
}
// In contrast, if your xor function returns a boolean, you get this, which is
// similar, but is less conducive to using all the Predicate methods:
xor((Integer a) -> a > 1, (Integer b) -> b < 10, 14);
// To be honest, this seems more readable to me than the negate() function in the
// example above, but perhaps there are scenarios where the above is preferred...
!xor((Integer a) -> a > 1, (Integer b) -> b < 10, 14)
Not a big deal, but your question made me curious...
You could reduce your xor'ed predicates to a single predicate with Stream.reduce and then return the outcome, like so:
import java.util.Arrays;
import java.util.function.Predicate;

public class MultiXor {
    public static void main(String[] args) {
        System.out.println(xor("monkey", p -> p.equals("monkey"), p -> p.equals("dork"), p -> p.equalsIgnoreCase("Monkey")));
        System.out.println(true ^ false ^ true);
        System.out.println(xor("monkey", p -> p.equals("monkey"), p -> p.equals("dork")));
        System.out.println(true ^ false);
    }

    public static <T> boolean xor(final T param, Predicate<T>... predicates) {
        return Arrays.stream(predicates)
                     .reduce(p -> false, (previous, p) -> r -> previous.test(param) ^ p.test(param))
                     .test(param);
    }
}

Run a For-each loop on a Filtered HashMap

I am quite new to Java, and here is my problem.
I have a map of type Map<Integer, List<MyObject>>, which I call myMap.
Since myMap has a lot of entries (about 100,000), I don't think a for loop is a good idea, so I want to filter my Map<Integer, List<MyObject>> on the condition below:
myMap.get(i).get(every_one_of_them).a_special_attribute_of_my_MyObject == null;
where every_one_of_them means I want to delete the entries of myMap whose list members are all null in that attribute (for convenience, let's call it myAttribute).
One of my incomplete ideas was something like this:
Map<Integer, List<MyObject>> collect = myMap.entrySet().stream()
    .filter(x -> x.getValue().HERE_IS_WHERE_I_DO_NOT_KNOW_HOW_TO)
    .collect(Collectors.toMap(x -> x.getKey(), x -> x.getValue()));
Any help will be highly appreciated. Thanks.
You can iterate over the map's values() and remove the elements you don't want, using removeIf(Predicate condition).
To check whether all elements in a list fulfill some condition, you can use list.stream().allMatch(Predicate condition).
For instance, let's say we have a Map<Integer, List<String>> and we want to remove the lists in which all strings start with b or B. You can do it via:
myMap.values()
     .removeIf(list -> list.stream()
             .allMatch(str -> str.toLowerCase().startsWith("b"))
             // but in real application for better performance use
             // .allMatch(str -> str.regionMatches(true, 0, "b", 0, 1))
     );
DEMO:
Map<Integer, List<String>> myMap = new HashMap<>(Map.of(
        1, List.of("Abc", "Ab"),
        2, List.of("Bb", "Bc"),
        3, List.of("Cc")
));
myMap.values()
     .removeIf(list -> list.stream()
             .allMatch(str -> str.toLowerCase().startsWith("b"))
     );
System.out.println(myMap);
System.out.println(myMap);
Output:
{1=[Abc, Ab], 3=[Cc]}
As myMap has a lot of members (About 100000) , I don't think the for loop to be such a good idea so I wanna filter
That sounds like you think stream.filter is somehow faster than foreach. It's not; it's either slower or about as fast.
SPOILER: All the way at the end I do some basic performance tests, but I invite anyone to take that test and upgrade it to a full JMH test suite and run it on a variety of hardware. However - it says you're in fact exactly wrong, and foreach is considerably faster than anything involving streams.
Also, it sounds like you feel 100,000 is a lot of entries. It mostly isn't. A foreach loop (or rather, an iterator) will be faster, and removing with the iterator will be considerably faster still.
parallelism can help you out here, and is simpler with streams, but you can't just slap a parallel() in there and trust that it'll just work out. It depends on the underlying types. For example, your plain jane j.u.HashMap isn't very good at this; Something like a ConcurrentHashMap is far more capable. But if you take the time to copy over all data to a more suitable map type, well, in that timespan you could have done the entire job, and probably faster to boot! (Depends on how large those lists are).
Step 1: Make an oracle
But, first things first, we need an oracle function: One that determines if a given entry ought to be deleted. No matter what solution you go with, this is required:
public boolean keep(List<MyObject> mo) {
    for (MyObject obj : mo) if (obj.specialProperty != null) return true;
    return false;
}
you could 'streamify' it:
public boolean keep(List<MyObject> mo) {
    return mo.stream().anyMatch(o -> o.specialProperty != null);
}
Step 2: Filter the list
Once we have that, the task becomes easier:
var it = map.values().iterator();
while (it.hasNext()) if (!keep(it.next())) it.remove();
is now all you need. We can streamify that if you prefer, but note that you can't use streams to change a map 'in place', and copying over is usually considerably slower, so, this is likely slower and certainly takes more memory:
Map<Integer, List<MyObject>> result =
    map.entrySet().stream()
       .filter(e -> keep(e.getValue()))
       .collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
Note also how the stream option doesn't generally result in significantly shorter code either. Don't make the decision between stream or non-stream based on notions that streams are inherently better, or lead to more readable code. Programming just isn't that simple, I'm afraid.
We can also use some of the more functional methods in map itself:
map.values().removeIf(v -> !keep(v));
That seems like the clear winner here, although it's a bit bizarre that we have to 'bounce' through values(); Map itself has no removeIf method, but the collections returned by keySet, values, entrySet, etc. reflect any changes back to the map, so that works out.
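That view-backed behavior is worth seeing in isolation. A minimal sketch (the data and class name are mine): removing an element from the values() view removes the whole entry from the map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ViewRemovalDemo {
    public static Map<Integer, List<String>> filtered() {
        Map<Integer, List<String>> map = new HashMap<>();
        map.put(1, List.of("keep"));
        map.put(2, new ArrayList<>());       // empty list: this entry should go away
        map.put(3, List.of("keep", "me"));
        // removeIf on the backing view deletes the corresponding map entries
        map.values().removeIf(List::isEmpty);
        return map;
    }

    public static void main(String[] args) {
        System.out.println(filtered()); // entry 2 is gone; 1 and 3 remain
    }
}
```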
Let's performance test!
Performance testing is tricky and really requires using JMH for good results. By all means, as an exercise, do just that. But, let's just do a real quick scan:
import java.util.*;
import java.util.stream.*;

public class Test {
    static class MyObj {
        String foo;
    }

    public static MyObj hit() {
        MyObj o = new MyObj();
        o.foo = "";
        return o;
    }

    public static MyObj miss() {
        return new MyObj();
    }

    private static final int MAP_ELEMS = 100000;
    private static final int LIST_ELEMS = 50;
    private static final double HIT_OR_MISS = 0.01;
    private static final Random rnd = new Random();

    public static void main(String[] args) {
        var map = construct();
        long now = System.currentTimeMillis();
        filter_seq(map);
        long delta = System.currentTimeMillis() - now;
        System.out.printf("Sequential: %.3f\n", 0.001 * delta);

        map = construct();
        now = System.currentTimeMillis();
        filter_stream(map);
        delta = System.currentTimeMillis() - now;
        System.out.printf("Stream: %.3f\n", 0.001 * delta);

        map = construct();
        now = System.currentTimeMillis();
        filter_removeIf(map);
        delta = System.currentTimeMillis() - now;
        System.out.printf("RemoveIf: %.3f\n", 0.001 * delta);
    }

    private static Map<Integer, List<MyObj>> construct() {
        var m = new HashMap<Integer, List<MyObj>>();
        for (int i = 0; i < MAP_ELEMS; i++) {
            var list = new ArrayList<MyObj>();
            for (int j = 0; j < LIST_ELEMS; j++) {
                list.add(rnd.nextDouble() < HIT_OR_MISS ? hit() : miss());
            }
            m.put(i, list);
        }
        return m;
    }

    static boolean keep_seq(List<MyObj> list) {
        for (MyObj o : list) if (o.foo != null) return true;
        return false;
    }

    static boolean keep_stream(List<MyObj> list) {
        return list.stream().anyMatch(o -> o.foo != null);
    }

    static void filter_seq(Map<Integer, List<MyObj>> map) {
        var it = map.values().iterator();
        while (it.hasNext()) if (!keep_seq(it.next())) it.remove();
    }

    static void filter_stream(Map<Integer, List<MyObj>> map) {
        Map<Integer, List<MyObj>> result =
            map.entrySet().stream()
               .filter(e -> keep_stream(e.getValue()))
               .collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
    }

    static void filter_removeIf(Map<Integer, List<MyObj>> map) {
        map.values().removeIf(v -> !keep_stream(v));
    }
}
This, reliably, on my hardware anyway, shows that the stream route is by far the slowest, and the sequential option wins out with some percent from the removeIf variant. Which just goes to show that your initial line (if I can take that as 'I think foreach is too slow') was entirely off the mark, fortunately.
For fun I replaced the map with a ConcurrentHashMap and made the stream parallel(). This did not change the timing significantly, and I wasn't really expecting it to.
A note about style
In various snippets, I omit braces for loops and if statements. If you add them, the non-stream-based code occupies considerably more lines, and if you include the indent whitespace for the insides of these constructs, considerably more 'surface area' on the page. However, that is a ridiculous thing to cue off of: it is tantamount to saying "Actually, the commonly followed style guides for Java are incredibly obtuse and badly considered. However, I dare not break them. Fortunately, lambdas came along and gave me an excuse to toss the entire principle of those style guides out the window and pile it all into a single, braceless line, and oh look, lambdas lead to shorter code!" I would assume any reader, armed with this knowledge, can easily see through such baloney argumentation. The reasons for those braces primarily involve easier debug breakpointing and easy ways to add additional actions to a given 'code node', and those needs are exactly as important, if not more so, when using streams. If it's okay to one-liner and go brace-free for lambdas, then surely it is okay to do the same for if and for bodies.

How to convert a for iteration with conditions to Java 8 stream

Currently, I have this method, which I want to convert to Java 8 stream style (I have little practice with this API, by the way; that's the purpose of this little exercise):
private static Map<Integer, List<String>> splitByWords(List<String> list) {
    Map<Integer, List<String>> mapOfElements = new HashMap<>();
    for (int i = 0; i < list.size(); i++) {
        if (list.get(i).length() > 30 && list.get(i).contains("-")) {
            mapOfElements.put(i, Arrays.stream(list.get(i).split("-")).collect(Collectors.toList()));
        } else if (list.get(i).length() > 30) {
            mapOfElements.put(i, Arrays.asList(new String[]{list.get(i)}));
        } else {
            mapOfElements.put(i, Arrays.asList(new String[]{list.get(i) + "|"}));
        }
    }
    return mapOfElements;
}
This is what I've got so far:
private static Map<Integer, List<String>> splitByWords(List<String> list) {
    Map<Integer, List<String>> mapOfElements = new HashMap<>();
    IntStream.range(0, list.size())
        .filter(i -> list.get(i).length() > 30 && list.get(i).contains("-"))
        .boxed()
        .map(i -> mapOfElements.put(i, Arrays.stream(list.get(i).split("-")).collect(Collectors.toList())));
    // Copy/paste the above code twice, just changing the filter() and map() functions?
In the "old-fashioned" way, I just need one for iteration to do everything I need regarding my conditions. Is there a way to achieve that using the Stream API or, if I want to stick to it, I have to repeat the above code just changing the filter() and map() conditions, therefore having three for iterations?
The current solution with the for-loop looks good. As you have to distinguish three cases only, there is no need to generalize the processing.
Should there be more cases to distinguish, then it could make sense to refactor the code. My approach would be to explicitly define the different conditions and their corresponding string processing. Let me explain it using the code from the question.
First of all, I define the different conditions using an enum:
public enum StringClassification {
    CONTAINS_HYPHEN, LENGTH_GT_30, DEFAULT;

    public static StringClassification classify(String s) {
        if (s.length() > 30 && s.contains("-")) {
            return StringClassification.CONTAINS_HYPHEN;
        } else if (s.length() > 30) {
            return StringClassification.LENGTH_GT_30;
        } else {
            return StringClassification.DEFAULT;
        }
    }
}
Using this enum I define the corresponding string processors:
private static final Map<StringClassification, Function<String, List<String>>> PROCESSORS;
static {
    PROCESSORS = new EnumMap<>(StringClassification.class);
    PROCESSORS.put(StringClassification.CONTAINS_HYPHEN, l -> Arrays.stream(l.split("-")).collect(Collectors.toList()));
    PROCESSORS.put(StringClassification.LENGTH_GT_30, l -> Arrays.asList(new String[] { l }));
    PROCESSORS.put(StringClassification.DEFAULT, l -> Arrays.asList(new String[] { l + "|" }));
}
Based on this I can do the whole processing using the requested IntStream:
private static Map<Integer, List<String>> splitByWords(List<String> list) {
    return IntStream.range(0, list.size()).boxed()
        .collect(Collectors.toMap(Function.identity(),
                i -> PROCESSORS.get(StringClassification.classify(list.get(i))).apply(list.get(i))));
}
The approach is to retrieve for a string the appropriate StringClassification and then in turn the corresponding string processor. The string processors are implementing the strategy pattern by providing a Function<String, List<String>> which maps a String to a List<String> according to the StringClassification.
A quick example:
public static void main(String[] args) {
List<String> list = Arrays.asList("123",
"1-2",
"0987654321098765432109876543211",
"098765432109876543210987654321a-b-c");
System.out.println(splitByWords(list));
}
The output is:
{0=[123|], 1=[1-2|], 2=[0987654321098765432109876543211], 3=[098765432109876543210987654321a, b, c]}
This makes it easy to add or to remove conditions and string processors.
First off, I don't see any reason to use the type Map<Integer, List<String>> when the key is an index. Why not use List<List<String>> instead? If you don't use a filter, the elements will be at the same index as in the input.
The power of a more functional approach is that it's more readable what you're doing. Because you want to do multiple things for strings of multiple sizes, it's pretty hard to write a clean solution. You can, however, do it in a single loop:
private static List<List<String>> splitByWords(List<String> list) {
    return list.stream()
        .map(string -> string.length() > 30
                ? Arrays.asList(string.split("-"))
                : Arrays.asList(string + "|"))
        .collect(Collectors.toList());
}
You can add more complex logic by making your lambda multiline (not needed in this case). eg.
.map(string -> {
    // your complex logic
    // don't forget, when using curly braces you'll
    // need to return explicitly
    return result;
})
The more functional approach would be to group the strings by size, followed by applying a specific handler to each group. It's pretty hard to keep the index the same, so I changed the return value to Map<String, List<String>> so the result can be fetched by providing the original string:
private static Map<String, List<String>> splitByWords(List<String> list) {
    Map<String, List<String>> result = new HashMap<>();
    Map<Boolean, List<String>> greaterThan30;

    // group elements
    greaterThan30 = list.stream().collect(Collectors.groupingBy(
            string -> string.length() > 30
    ));

    // handle strings longer than 30 chars
    result.putAll(
            greaterThan30.get(true).stream().collect(Collectors.toMap(
                    Function.identity(), // the same as: string -> string
                    string -> Arrays.asList(string.split("-"))
            ))
    );

    // handle strings not longer than 30 chars
    result.putAll(
            greaterThan30.get(false).stream().collect(Collectors.toMap(
                    Function.identity(), // the same as: string -> string
                    string -> Arrays.asList(string + "|")
            ))
    );

    return result;
}
The above seems like a lot of hassle, but is in my opinion better understandable. You could also dispatch the logic to handle large and small strings to other methods, knowing the provided string does always match the criteria.
This is slower than the first solution. For a list of size n, it has to loop over n elements to group by the criteria, then over the x (0 <= x <= n) elements that match the criteria, followed by the n - x elements that don't (in total, twice over the whole list).
In this case it might not be worth the trouble since both the criteria, as well as the logic to apply are pretty simple.
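For completeness, here is a runnable sketch of that grouping approach (sample data and class name are mine; getOrDefault guards against one of the two groups being absent, which the version above would trip over if every string fell on the same side of the length check):

```java
import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

public class SplitDemo {
    public static Map<String, List<String>> splitByWords(List<String> list) {
        Map<String, List<String>> result = new HashMap<>();
        // group elements by whether they exceed 30 characters
        Map<Boolean, List<String>> greaterThan30 = list.stream()
                .collect(Collectors.groupingBy(string -> string.length() > 30));
        // long strings: split on hyphens
        result.putAll(greaterThan30.getOrDefault(true, List.of()).stream()
                .collect(Collectors.toMap(Function.identity(),
                        string -> Arrays.asList(string.split("-")))));
        // short strings: append a pipe
        result.putAll(greaterThan30.getOrDefault(false, List.of()).stream()
                .collect(Collectors.toMap(Function.identity(),
                        string -> Arrays.asList(string + "|"))));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(splitByWords(Arrays.asList(
                "1-2", "098765432109876543210987654321a-b-c")));
    }
}
```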

Access element of previous step in a stream or pass down element as "parameter" to next step?

What is a good way to access an element of a previous step of a stream?
The example here starts with a stream of parents (String) and flatMaps them to two children (Integer) for each parent. At the step where I am dealing with the Integer children (forEach) I would like to "remember" what the parent String was ('parentFromPreviousStep' compilation error).
The obvious solution is to introduce a holder class for a String plus an Integer and in the flatMap produce a stream of such holder objects. But this seems a bit clumsy and I was wondering whether there is a way to access a "previous element" in such a chain or to "pass down a parameter" (without introducing a special holder class).
List<String> parents = new ArrayList<>();
parents.add("parentA");
parents.add("parentB");
parents.add("parentC");

parents.stream()
    .flatMap(parent -> {
        List<Integer> children = new ArrayList<>();
        children.add(1);
        children.add(2);
        return children.stream();
    })
    .forEach(child -> System.out.println(child + ", my parent is: ???" + parentFromPreviousStep));
Background: This actually happens in Nashorn with the idea to provide JavaScript scripting code for an application without having to deploy the Java application. This is why I do not want to introduce a special Java holder class for my two business objects (which are simplified here as String and Integer). [The Java code above can be written almost the same in Nashorn, and the stream functions are Java, not JavaScript.]
Holger's comment to the question provides valuable information and leads to the following solution. The existence of SimpleImmutableEntry in the JDK solves the problem that creating a proprietary holder class was not an option for me in this situation. (This is a community wiki because Holger provided the answer.)
Stream.of("parentA", "parentB", "parentC")
.flatMap(parent -> Stream.of(
new AbstractMap.SimpleImmutableEntry<>(1, parent),
new AbstractMap.SimpleImmutableEntry<>(2, parent))
).forEach(child -> System.out.println(child.getKey() + ", my parent is: " + child.getValue()));
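On Java 9 and later, Map.entry(k, v) builds the same kind of immutable entry with less ceremony. A sketch (the EntryDemo class and the lines() helper are mine):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class EntryDemo {
    public static List<String> lines() {
        return Stream.of("parentA", "parentB", "parentC")
                // Map.entry produces an immutable key/value pair, like
                // AbstractMap.SimpleImmutableEntry but shorter to write
                .flatMap(parent -> Stream.of(Map.entry(1, parent), Map.entry(2, parent)))
                .map(child -> child.getKey() + ", my parent is: " + child.getValue())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        lines().forEach(System.out::println);
    }
}
```

Note that Map.entry rejects null keys and values, whereas SimpleImmutableEntry allows them.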
Child aggregates to a parent, hmm... smells like key-value pairs. Map!
So I tried it this way, and it worked. I hope this is helpful.
parents.stream()
    .flatMap(parent -> {
        List<Integer> children = new ArrayList<>();
        children.add(1);
        children.add(2);
        Map<String, List<Integer>> stringListMap = new HashMap<String, List<Integer>>();
        stringListMap.put(parent, children);
        return stringListMap.entrySet().stream();
    })
    .forEach(entry -> {
        entry.getValue().stream().forEach(val -> System.out.println(val + ", my parent is: " + entry.getKey()));
    });
This is possible in the reduce method of the stream as follows; this code actually gives you lists of the current element and the previous two elements.
I was a little surprised this solution was not in this Baeldung article:
https://www.baeldung.com/java-stream-last-element
public List<Integer> mainFunPreviousAccess(int n) {
    Supplier<Integer> s = new Supplier<Integer>() {
        int next = 0;

        @Override
        public Integer get() {
            return next++;
        }
    };
    return Stream.generate(s).takeWhile(j -> j <= n).map(j -> List.of(j)).reduce((x, y) -> {
        System.out.println("\naccumulator x " + x);
        System.out.println("y " + y);
        int xs = x.size();
        if (xs >= 2) {
            return List.of(x.get(xs - 2), x.get(xs - 1), y.get(0));
        }
        return List.of(x.get(0), y.get(0));
    }).get();
}
@Adligo :)
Here is another solution to the same problem that allows continued use of the stream, since it is not done through a terminal operation:
public void mainFunPreviousAccess2(int n) {
    //Stream.generate(s).takeWhile(j -> j <= i).forEach(j -> System.out.println("j " + j));
    Supplier<Integer> s = new Supplier<Integer>() {
        int next = 0;

        @Override
        public Integer get() {
            return next++;
        }
    };
    Function<Integer, List<Integer>> s2 = new Function<Integer, List<Integer>>() {
        Integer m1 = null;

        @Override
        public List<Integer> apply(Integer t) {
            if (m1 == null) {
                m1 = t;
                return List.of(t);
            }
            List<Integer> r = List.of(m1, t);
            m1 = t;
            return r;
        }
    };
    Stream.generate(s).takeWhile(k -> k <= n).map(s2).forEach(k -> System.out.println("hey " + k));
}
@Adligo :)

Generating primes with LongStream and jOOλ leads to StackOverflowError

For educational purposes I want to create a stream of prime numbers using Java-8. Here's my approach. The number x is prime if it has no prime divisors not exceeding sqrt(x). So assuming I already have a stream of primes I can check this with the following predicate:
x -> Seq.seq(primes()).limitWhile(p -> p <= Math.sqrt(x)).allMatch(p -> x % p != 0)
Here I used jOOλ library (0.9.10 if it matters) just for limitWhile operation which is absent in standard Stream API. So now knowing some previous prime prev I can generate the next prime iterating the numbers until I find the one matching this predicate:
prev -> LongStream.iterate(prev + 1, i -> i + 1)
                  .filter(x -> Seq.seq(primes()).limitWhile(p -> p <= Math.sqrt(x))
                                  .allMatch(p -> x % p != 0))
                  .findFirst()
                  .getAsLong()
Putting everything together I wrote the following primes() method:
public static LongStream primes() {
    return LongStream.iterate(2L,
        prev -> LongStream.iterate(prev + 1, i -> i + 1)
                          .filter(x -> Seq.seq(primes())
                                          .limitWhile(p -> p <= Math.sqrt(x))
                                          .allMatch(p -> x % p != 0))
                          .findFirst()
                          .getAsLong());
}
Now to launch this I use:
primes().forEach(System.out::println);
Unfortunately it fails with unpleasant StackOverflowError which looks like this:
Exception in thread "main" java.lang.StackOverflowError
at java.util.stream.ReferencePipeline$StatelessOp.opIsStateful(ReferencePipeline.java:624)
at java.util.stream.AbstractPipeline.<init>(AbstractPipeline.java:211)
at java.util.stream.ReferencePipeline.<init>(ReferencePipeline.java:94)
at java.util.stream.ReferencePipeline$StatelessOp.<init>(ReferencePipeline.java:618)
at java.util.stream.LongPipeline$3.<init>(LongPipeline.java:225)
at java.util.stream.LongPipeline.mapToObj(LongPipeline.java:224)
at java.util.stream.LongPipeline.boxed(LongPipeline.java:201)
at org.jooq.lambda.Seq.seq(Seq.java:2481)
at Primes.lambda$2(Primes.java:13)
at Primes$$Lambda$4/1555009629.test(Unknown Source)
at java.util.stream.LongPipeline$8$1.accept(LongPipeline.java:324)
at java.util.Spliterators$LongIteratorSpliterator.tryAdvance(Spliterators.java:2009)
at java.util.stream.LongPipeline.forEachWithCancel(LongPipeline.java:160)
at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:529)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:516)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:502)
at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.LongPipeline.findFirst(LongPipeline.java:474)
at Primes.lambda$0(Primes.java:14)
at Primes$$Lambda$1/918221580.applyAsLong(Unknown Source)
at java.util.stream.LongStream$1.nextLong(LongStream.java:747)
at java.util.Spliterators$LongIteratorSpliterator.tryAdvance(Spliterators.java:2009)
...
You might think that I deserve what I get: I called the primes() recursively inside the primes() method itself. However let's just change the method return type to Stream<Long> and use Stream.iterate instead, leaving everything else as is:
public static Stream<Long> primes() {
    return Stream.iterate(2L,
        prev -> LongStream.iterate(prev + 1, i -> i + 1)
                          .filter(x -> Seq.seq(primes())
                                          .limitWhile(p -> p <= Math.sqrt(x))
                                          .allMatch(p -> x % p != 0))
                          .findFirst()
                          .getAsLong());
}
Now it works like a charm! Not very fast, but in couple of minutes I get the prime numbers exceeding 1000000 without any exceptions. The result is correct, which can be checked against the table of primes:
System.out.println(primes().skip(9999).findFirst());
// prints Optional[104729] which is actually 10000th prime.
So the question is: what's wrong with the first LongStream-based version? Is it jOOλ bug, JDK bug or I'm doing something wrong?
Note that I'm not interested in alternative ways to generate primes, I want to know what's wrong with this specific code.
It seems that LongStream and Stream behave differently when streams are produced by iterate. The following code illustrates the distinction:
LongStream.iterate(1, i -> {
System.out.println("LongStream incrementing " + i);
return i + 1;
}).limit(1).count();
Stream.iterate(1L, i -> {
System.out.println("Stream incrementing " + i);
return i + 1;
}).limit(1).count();
The output is
LongStream incrementing 1
So LongStream will call the function even if only the first element is needed while Stream will not. This explains the exception you are getting.
I don't know if this should be called a bug. Javadoc doesn't specify this behavior one way or another although it would be nice if it were consistent.
One way to fix it is to hardcode the initial sequence of primes:
public static LongStream primes() {
    return LongStream.iterate(2L,
        prev -> prev == 2 ? 3 :
                prev == 3 ? 5 :
                LongStream.iterate(prev + 1, i -> i + 1)
                          .filter(x -> Seq.seq(primes())
                                          .limitWhile(p -> p <= Math.sqrt(x))
                                          .allMatch(p -> x % p != 0))
                          .findFirst()
                          .getAsLong());
}
You can produce this difference in much simpler ways. Consider the following two version of (equally inefficient) recursive long enumeration streams, which can be called as follows to produce a sequence from 1-5:
longs().limit(5).forEach(System.out::println);
Will cause the same StackOverflowError
public static LongStream longs() {
    return LongStream.iterate(1L, i ->
        1L + longs().skip(i - 1L)
                    .findFirst()
                    .getAsLong());
}
}
Will work
public static Stream<Long> longs() {
    return Stream.iterate(1L, i ->
        1L + longs().skip(i - 1L)
                    .findFirst()
                    .get());
}
The reason
The boxed Stream.iterate() implementation is optimised as follows:
final Iterator<T> iterator = new Iterator<T>() {
    @SuppressWarnings("unchecked")
    T t = (T) Streams.NONE;

    @Override
    public boolean hasNext() {
        return true;
    }

    @Override
    public T next() {
        return t = (t == Streams.NONE) ? seed : f.apply(t);
    }
};
unlike the LongStream.iterate() version:
final PrimitiveIterator.OfLong iterator = new PrimitiveIterator.OfLong() {
    long t = seed;

    @Override
    public boolean hasNext() {
        return true;
    }

    @Override
    public long nextLong() {
        long v = t;
        t = f.applyAsLong(t);
        return v;
    }
};
Notice how the boxed iterator calls the function only after the seed has been returned, whereas the primitive iterator caches the next value prior to returning the seed.
This means that when you use a recursive iteration function with the primitive iterator, the first value in the stream can never be produced, because the next value is fetched prematurely.
This can probably be reported as a JDK bug, and it also explains Misha's observation.
