Generate infinite parallel stream - java

Problem
Hi, I have a function that returns an infinite stream of results generated in parallel (yes, it is much faster in this case). So, obviously (or not), I used
Stream<Something> stream = Stream.generate(this::myGenerator).parallel()
It works, however... it doesn't when I want to limit the result (everything is fine when the stream is sequential). I mean, it produces results when I do something like
stream.peek(System.out::println).limit(2).collect(Collectors.toList())
but even when the peek output shows more than 10 elements, collect still hasn't finished (generation is slow, so those 10 can take up to a minute)... and that is the easy example. Actually, limiting the results is secondary; the main expectation is to keep getting better-than-recent results until the user kills the process (another case is to return the first result I can produce, throwing an exception if nothing else helps [findFirst didn't work, even when I had more elements on the console and no new results for about 30 seconds]).
So, the question is...
how to cope with that? My idea was also to use RxJava, and that raises another question: how to achieve a similar result with that tool (or another one)?
Code sample
public Stream<Solution> generateSolutions() {
    final Solution initialSolution = initialSolutionMaker.findSolution();
    return Stream.concat(
            Stream.of(initialSolution),
            Stream.generate(continuousSolutionMaker::findSolution)
    ).parallel();
}
new Solver(instance).generateSolutions()
.map(Solution::getPurpose)
.peek(System.out::println)
.limit(5).collect(Collectors.toList());
Implementation of findSolution is not important.
It has some side effects, like adding to a solutions repo (a synchronized singleton, etc.), but nothing more.

As explained in the already linked answer, the key point to an efficient parallel stream is to use a stream source already having an intrinsic size instead of using an unsized or even infinite stream and apply a limit on it. Injecting a size doesn’t work with the current implementation at all, while ensuring that a known size doesn’t get lost is much easier. Even if the exact size can’t be retained, like when applying a filter, the size still will be carried as an estimate size.
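The size information the paragraph above refers to can be observed directly on the spliterators; here is a small self-contained sketch (not from the original answer) contrasting an infinite generator with a sized range:

```java
import java.util.Spliterator;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class SizeCheck {
    public static void main(String[] args) {
        // Stream.generate is an unsized, infinite source: its spliterator
        // reports Long.MAX_VALUE, i.e. "unknown size"
        Spliterator<Double> unsized = Stream.generate(Math::random).spliterator();
        System.out.println(unsized.estimateSize() == Long.MAX_VALUE); // true

        // IntStream.range has an intrinsic size, which parallel execution
        // can use to split the work evenly up front
        Spliterator.OfInt sized = IntStream.range(0, 100).spliterator();
        System.out.println(sized.estimateSize());                         // 100
        System.out.println(sized.hasCharacteristics(Spliterator.SIZED));  // true
    }
}
```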
So instead of
Stream.generate(this::myGenerator).parallel()
.peek(System.out::println)
.limit(2)
.collect(Collectors.toList())
just use
IntStream.range(0, /* limit */ 2).unordered().parallel()
.mapToObj(unused -> this.myGenerator())
.peek(System.out::println)
.collect(Collectors.toList())
Or, closer to your sample code
public Stream<Solution> generateSolutions(int limit) {
    final Solution initialSolution = initialSolutionMaker.findSolution();
    return Stream.concat(
            Stream.of(initialSolution),
            IntStream.range(1, limit).unordered().parallel()
                    .mapToObj(unused -> continuousSolutionMaker.findSolution())
    );
}
new Solver(instance).generateSolutions(5)
.map(Solution::getPurpose)
.peek(System.out::println)
.collect(Collectors.toList());

Unfortunately this is expected behavior. As I remember, I've seen at least two topics on this matter; here is one of them.
The idea is that Stream.generate creates an unordered infinite stream and limit will not introduce the SIZED flag. Because of this when you spawn a parallel execution on that Stream, individual tasks have to sync their execution to see if they have reached that limit; by the time that sync happens there could be multiple elements already processed. For example this:
Stream.iterate(0, x -> x + 1)
.peek(System.out::println)
.parallel()
.limit(2)
.collect(Collectors.toList());
and this:
IntStream.of(1, 2, 3, 4)
.peek(System.out::println)
.parallel()
.limit(2)
.boxed()
.collect(Collectors.toList());
will always generate two elements in the List (Collectors.toList) and will always output two elements also (via peek).
On the other hand this:
Stream<Integer> stream = Stream.generate(new Random()::nextInt).parallel();
List<Integer> list = stream
        .peek(x -> {
            System.out.println("Before " + x);
        })
        .map(x -> {
            System.out.println("Mapping x " + x);
            return x;
        })
        .peek(x -> {
            System.out.println("After " + x);
        })
        .limit(2)
        .collect(Collectors.toList());
will generate two elements in the List, but it may process many more that later will be discarded by the limit. This is what you are actually seeing in your example.
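That over-processing can be observed by counting generator invocations; a hedged sketch (the exact number of calls varies between runs):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OverGeneration {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        List<Integer> list = Stream.generate(calls::incrementAndGet)
                .parallel()
                .limit(2)
                .collect(Collectors.toList());
        // The result always has exactly 2 elements...
        System.out.println(list.size());      // 2
        // ...but the generator may have been invoked more often than that
        System.out.println(calls.get() >= 2); // true; often much more in parallel
    }
}
```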
The only sane way of doing that (as far as I can tell) would be to create a custom Spliterator. I have not written many of them, but here is my attempt:
static class LimitingSpliterator<T> implements Spliterator<T> {

    private int limit;
    private final Supplier<T> generator;

    private LimitingSpliterator(Supplier<T> generator, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        this.limit = limit;
        this.generator = Objects.requireNonNull(generator);
    }

    @Override
    public boolean tryAdvance(Consumer<? super T> consumer) {
        if (limit == 0) {
            return false;
        }
        T nextElement = generator.get();
        --limit;
        consumer.accept(nextElement);
        return true;
    }

    @Override
    public LimitingSpliterator<T> trySplit() {
        if (limit <= 1) {
            return null;
        }
        int half = limit >> 1;
        limit = limit - half;
        return new LimitingSpliterator<>(generator, half);
    }

    @Override
    public long estimateSize() {
        // we know exactly how many elements remain, and SIZED requires
        // reporting that exact count
        return limit;
    }

    @Override
    public int characteristics() {
        return SIZED;
    }
}
And the usage would be:
StreamSupport.stream(new LimitingSpliterator<>(new Random()::nextInt, 7), true)
.peek(System.out::println)
.collect(Collectors.toList());

Related

Run a For-each loop on a Filtered HashMap

I am new to Java, and here is my problem.
I have a Map of type Map<Integer, List<MyObject>> that I call myMap.
Since myMap has a lot of entries (about 100,000), I don't think a for loop is such a good idea, so I want to filter my Map<Integer, List<MyObject>> where the below condition holds:
myMap.get(i).get(every_one_of_them).a_special_attribute_of_my_MyObject == null;
in which every_one_of_them means I want to delete the members of myMap for which all of the list's elements are null in that attribute (for convenience, let's call it myAttribute).
One of my incomplete ideas was something like this:
Map<Integer, List<toHandle>> collect = myMap.entrySet().stream()
    .filter(x -> x.getValue().HERE_IS_WHERE_I_DO_NOT_KNOW_HOW_TO)
    .collect(Collectors.toMap(x -> x.getKey(), x -> x.getValue()));
Any help will be highly appreciated. Thanks.
You can
iterate over the map's values() and remove the elements you don't want. You can use removeIf(Predicate condition) for that.
To check if all elements in a list fulfill some condition, you can use list.stream().allMatch(Predicate condition).
For instance, let's say we have a Map<Integer, List<String>> and we want to remove lists in which all strings start with b or B. You can do it via
myMap.values()
.removeIf(list -> list.stream()
.allMatch(str -> str.toLowerCase().startsWith("b"))
// but in real application for better performance use
// .allMatch(str -> str.regionMatches(true, 0, "b", 0, 1))
);
DEMO:
Map<Integer , List<String>> myMap = new HashMap<>(Map.of(
1, List.of("Abc", "Ab"),
2, List.of("Bb", "Bc"),
3, List.of("Cc")
));
myMap.values()
.removeIf(list -> list.stream()
.allMatch(str -> str.toLowerCase().startsWith("b"))
);
System.out.println(myMap);
Output:
{1=[Abc, Ab], 3=[Cc]}
As myMap has a lot of entries (about 100,000), I don't think a for loop is such a good idea, so I want to filter
That sounds like you think stream.filter is somehow faster than foreach. It's not; it's either slower or about as fast.
SPOILER: All the way at the end I do some basic performance tests, but I invite anyone to take that test and upgrade it to a full JMH test suite and run it on a variety of hardware. However - it says you're in fact exactly wrong, and foreach is considerably faster than anything involving streams.
Also, it sounds like you feel 100000 is a lot of entries. It mostly isn't. A foreach loop (or rather, an iterator) will be faster. Removing with the iterator will be considerably faster.
parallelism can help you out here, and is simpler with streams, but you can't just slap a parallel() in there and trust that it'll just work out. It depends on the underlying types. For example, your plain jane j.u.HashMap isn't very good at this; Something like a ConcurrentHashMap is far more capable. But if you take the time to copy over all data to a more suitable map type, well, in that timespan you could have done the entire job, and probably faster to boot! (Depends on how large those lists are).
Step 1: Make an oracle
But, first things first, we need an oracle function: One that determines if a given entry ought to be deleted. No matter what solution you go with, this is required:
public boolean keep(List<MyObject> mo) {
    for (MyObject obj : mo) if (obj.specialProperty != null) return true;
    return false;
}
you could 'streamify' it:
public boolean keep(List<MyObject> mo) {
    return mo.stream().anyMatch(o -> o.specialProperty != null);
}
Step 2: Filter the list
Once we have that, the task becomes easier:
var it = map.values().iterator();
while (it.hasNext()) if (!keep(it.next())) it.remove();
is now all you need. We can streamify that if you prefer, but note that you can't use streams to change a map 'in place', and copying over is usually considerably slower, so, this is likely slower and certainly takes more memory:
Map<Integer, List<MyObject>> result =
map.entrySet().stream()
.filter(e -> keep(e.getValue()))
.collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
Note also how the stream option doesn't generally result in significantly shorter code either. Don't make the decision between stream or non-stream based on notions that streams are inherently better, or lead to more readable code. Programming just isn't that simple, I'm afraid.
We can also use some of the more functional methods in map itself:
map.values().removeIf(v -> !keep(v));
That seems like the clear winner, here, although it's a bit bizarre we have to 'bounce' through values(); map itself has no removeIf method, but the collections returned by keySet, values, entrySet etc reflect any changes back to the map, so that works out.
Let's performance test!
Performance testing is tricky and really requires using JMH for good results. By all means, as an exercise, do just that. But, let's just do a real quick scan:
import java.util.*;
import java.util.stream.*;
public class Test {
static class MyObj {
String foo;
}
public static MyObj hit() {
MyObj o = new MyObj();
o.foo = "";
return o;
}
public static MyObj miss() {
return new MyObj();
}
private static final int MAP_ELEMS = 100000;
private static final int LIST_ELEMS = 50;
private static final double HIT_OR_MISS = 0.01;
private static final Random rnd = new Random();
public static void main(String[] args) {
var map = construct();
long now = System.currentTimeMillis();
filter_seq(map);
long delta = System.currentTimeMillis() - now;
System.out.printf("Sequential: %.3f\n", 0.001 * delta);
map = construct();
now = System.currentTimeMillis();
filter_stream(map);
delta = System.currentTimeMillis() - now;
System.out.printf("Stream: %.3f\n", 0.001 * delta);
map = construct();
now = System.currentTimeMillis();
filter_removeIf(map);
delta = System.currentTimeMillis() - now;
System.out.printf("RemoveIf: %.3f\n", 0.001 * delta);
}
private static Map<Integer, List<MyObj>> construct() {
var m = new HashMap<Integer, List<MyObj>>();
for (int i = 0; i < MAP_ELEMS; i++) {
var list = new ArrayList<MyObj>();
for (int j = 0; j < LIST_ELEMS; j++) {
list.add(rnd.nextDouble() < HIT_OR_MISS ? hit() : miss());
}
m.put(i, list);
}
return m;
}
static boolean keep_seq(List<MyObj> list) {
for (MyObj o : list) if (o.foo != null) return true;
return false;
}
static boolean keep_stream(List<MyObj> list) {
return list.stream().anyMatch(o -> o.foo != null);
}
static void filter_seq(Map<Integer, List<MyObj>> map) {
var it = map.values().iterator();
while (it.hasNext()) if (!keep_seq(it.next())) it.remove();
}
static void filter_stream(Map<Integer, List<MyObj>> map) {
Map<Integer, List<MyObj>> result =
map.entrySet().stream()
.filter(e -> keep_stream(e.getValue()))
.collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
}
static void filter_removeIf(Map<Integer, List<MyObj>> map) {
map.values().removeIf(v -> !keep_stream(v));
}
}
This, reliably, on my hardware anyway, shows that the stream route is by far the slowest, and the sequential option wins out with some percent from the removeIf variant. Which just goes to show that your initial line (if I can take that as 'I think foreach is too slow') was entirely off the mark, fortunately.
For fun I replaced the map with a ConcurrentHashMap and made the stream parallel(). This did not change the timing significantly, and I wasn't really expecting it to.
A note about style
In various snippets, I omit braces for loops and if statements. If you add them, the non-stream-based code occupies considerably more lines, and if you include the indent whitespace for the insides of these constructs, considerably more 'surface area' of paste. However, that is a ridiculous thing to clue off of - that is tantamount to saying: "Actually, the commonly followed style guides for java are incredibly obtuse and badly considered. However, I dare not break them. Fortunately, lambdas came along and gave me an excuse to toss the entire principle of those style guides right out the window and now pile it all into a single, braceless line, and oh look, lambdas lead to shorter code!". I would assume any reader, armed with this knowledge, can easily pierce through such baloney argumentation. The reasons for those braces primarily involve easier debug breakpointing and easy ways to add additional actions to a given 'code node', and those needs are exactly as important, if not more so, if using streams. If it's okay to one-liner and go brace-free for lambdas, then surely it is okay to do the same to if and for bodies.

Java predicate - match against first predicate [duplicate]

I've just started playing with Java 8 lambdas and I'm trying to implement some of the things that I'm used to in functional languages.
For example, most functional languages have some kind of find function that operates on sequences, or lists that returns the first element, for which the predicate is true. The only way I can see to achieve this in Java 8 is:
lst.stream()
.filter(x -> x > 5)
.findFirst()
However this seems inefficient to me, as the filter will scan the whole list, at least to my understanding (which could be wrong). Is there a better way?
No, filter does not scan the whole stream. It's an intermediate operation, which returns a lazy stream (actually all intermediate operations return a lazy stream). To convince yourself, you can simply run the following test:
List<Integer> list = Arrays.asList(1, 10, 3, 7, 5);
int a = list.stream()
.peek(num -> System.out.println("will filter " + num))
.filter(x -> x > 5)
.findFirst()
.get();
System.out.println(a);
Which outputs:
will filter 1
will filter 10
10
You see that only the first two elements of the stream are actually processed.
So you can go with your approach which is perfectly fine.
However this seems inefficient to me, as the filter will scan the whole list
No it won't - it will "break" as soon as the first element satisfying the predicate is found. You can read more about laziness in the stream package javadoc, in particular (emphasis mine):
Many stream operations, such as filtering, mapping, or duplicate removal, can be implemented lazily, exposing opportunities for optimization. For example, "find the first String with three consecutive vowels" need not examine all the input strings. Stream operations are divided into intermediate (Stream-producing) operations and terminal (value- or side-effect-producing) operations. Intermediate operations are always lazy.
return dataSource.getParkingLots()
.stream()
.filter(parkingLot -> Objects.equals(parkingLot.getId(), id))
.findFirst()
.orElse(null);
I had to filter out only one object from a list of objects. So I used this; hope it helps.
In addition to Alexis C's answer: if you are working with an ArrayList in which you are not sure whether the element you are searching for exists, use this.
Integer a = list.stream()
.peek(num -> System.out.println("will filter " + num))
.filter(x -> x > 5)
.findFirst()
.orElse(null);
Then you could simply check whether a is null.
Already answered by @AjaxLeung, but in comments and hard to find.
For check only
lst.stream()
.filter(x -> x > 5)
.findFirst()
.isPresent()
is simplified to
lst.stream()
.anyMatch(x -> x > 5)
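A quick self-contained check that the two forms agree (the sample list is made up):

```java
import java.util.Arrays;
import java.util.List;

public class PresenceCheck {
    public static void main(String[] args) {
        List<Integer> lst = Arrays.asList(1, 10, 3, 7, 5);
        // Both forms short-circuit on the first match; anyMatch just says so directly
        boolean viaFindFirst = lst.stream().filter(x -> x > 5).findFirst().isPresent();
        boolean viaAnyMatch = lst.stream().anyMatch(x -> x > 5);
        System.out.println(viaFindFirst + " " + viaAnyMatch); // true true
    }
}
```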
import org.junit.Test;

import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Stream is ~30 times slower for the same operation...
public class StreamPerfTest {

    int iterations = 100;
    List<Integer> list = Arrays.asList(1, 10, 3, 7, 5);

    // 55 ms
    @Test
    public void stream() {
        for (int i = 0; i < iterations; i++) {
            Optional<Integer> result = list.stream()
                    .filter(x -> x > 5)
                    .findFirst();
            System.out.println(result.orElse(null));
        }
    }

    // 2 ms
    @Test
    public void loop() {
        for (int i = 0; i < iterations; i++) {
            Integer result = null;
            for (Integer walk : list) {
                if (walk > 5) {
                    result = walk;
                    break;
                }
            }
            System.out.println(result);
        }
    }
}
A generic utility function with looping seems a lot cleaner to me:
static public <T> T find(List<T> elements, Predicate<T> p) {
    for (T item : elements) if (p.test(item)) return item;
    return null;
}

static public <T> T find(T[] elements, Predicate<T> p) {
    for (T item : elements) if (p.test(item)) return item;
    return null;
}
In use:
List<Integer> intList = Arrays.asList(1, 2, 3, 4, 5);
Integer[] intArr = new Integer[]{1, 2, 3, 4, 5};
System.out.println(find(intList, i -> i % 2 == 0)); // 2
System.out.println(find(intArr, i -> i % 2 != 0)); // 1
System.out.println(find(intList, i -> i > 5)); // null
Improved One-Liner answer: If you are looking for a boolean return value, we can do it better by adding isPresent:
return dataSource.getParkingLots().stream().filter(parkingLot -> Objects.equals(parkingLot.getId(), id)).findFirst().isPresent();

How to create a List<T> from Map<K,V> and List<K> of keys?

Using Java 8 lambdas, what's the "best" way to effectively create a new List<T> given a List<K> of possible keys and a Map<K,V>? This is the scenario where you are given a List of possible Map keys and are expected to generate a List<T> where T is some type that is constructed based on some aspect of V, the map value types.
I've explored a few and don't feel comfortable claiming one way is better than another (with maybe one exception -- see code). I'll clarify "best" as a combination of code clarity and runtime efficiency. These are what I came up with. I'm sure someone can do better, which is one aspect of this question. I don't like the filter aspect of most as it means needing to create intermediate structures and multiple passes over the names List. Right now, I'm opting for Example 6 -- a plain 'ol loop. (NOTE: Some cryptic thoughts are in the code comments, especially "need to reference externally..." This means external from the lambda.)
public class Java8Mapping {
private final Map<String,Wongo> nameToWongoMap = new HashMap<>();
public Java8Mapping(){
List<String> names = Arrays.asList("abbey","normal","hans","delbrook");
List<String> types = Arrays.asList("crazy","boring","shocking","dead");
for(int i=0; i<names.size(); i++){
nameToWongoMap.put(names.get(i),new Wongo(names.get(i),types.get(i)));
}
}
public static void main(String[] args) {
System.out.println("in main");
Java8Mapping j = new Java8Mapping();
List<String> testNames = Arrays.asList("abbey", "froderick","igor");
System.out.println(j.getBongosExample1(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample2(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample3(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample4(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample5(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample6(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
}
private static class Wongo{
String name;
String type;
public Wongo(String s, String t){name=s;type=t;}
@Override public String toString(){return "Wongo{name="+name+", type="+type+"}";}
}
private static class Bongo{
Wongo wongo;
public Bongo(Wongo w){wongo = w;}
@Override public String toString(){ return "Bongo{wongo="+wongo+"}";}
}
// 1: Create a list externally and add items inside 'forEach'.
// Needs to externally reference Map and List
public List<Bongo> getBongosExample1(List<String> names){
final List<Bongo> listOne = new ArrayList<>();
names.forEach(s -> {
Wongo w = nameToWongoMap.get(s);
if(w != null) {
listOne.add(new Bongo(nameToWongoMap.get(s)));
}
});
return listOne;
}
// 2: Use stream().map().collect()
// Needs to externally reference Map
public List<Bongo> getBongosExample2(List<String> names){
return names.stream()
.filter(s -> nameToWongoMap.get(s) != null)
.map(s -> new Bongo(nameToWongoMap.get(s)))
.collect(Collectors.toList());
}
// 3: Create custom Collector
// Needs to externally reference Map
public List<Bongo> getBongosExample3(List<String> names){
Function<List<Wongo>,List<Bongo>> finisher = list -> list.stream().map(Bongo::new).collect(Collectors.toList());
Collector<String,List<Wongo>,List<Bongo>> bongoCollector =
Collector.of(ArrayList::new,getAccumulator(),getCombiner(),finisher, Characteristics.UNORDERED);
return names.stream().collect(bongoCollector);
}
// example 3 helper code
private BiConsumer<List<Wongo>,String> getAccumulator(){
return (list,string) -> {
Wongo w = nameToWongoMap.get(string);
if(w != null){
list.add(w);
}
};
}
// example 3 helper code
private BinaryOperator<List<Wongo>> getCombiner(){
return (l1,l2) -> {
l1.addAll(l2);
return l1;
};
}
// 4: Use internal Bongo creation facility
public List<Bongo> getBongosExample4(List<String> names){
return names.stream().filter(s->nameToWongoMap.get(s) != null).map(s-> new Bongo(nameToWongoMap.get(s))).collect(Collectors.toList());
}
// 5: Stream the Map EntrySet. This avoids referring to anything outside of the stream,
// but bypasses the lookup benefit from Map.
public List<Bongo> getBongosExample5(List<String> names){
return nameToWongoMap.entrySet().stream().filter(e->names.contains(e.getKey())).map(e -> new Bongo(e.getValue())).collect(Collectors.toList());
}
// 6: Plain-ol-java loop
public List<Bongo> getBongosExample6(List<String> names){
List<Bongo> bongos = new ArrayList<>();
for(String s : names){
Wongo w = nameToWongoMap.get(s);
if(w != null){
bongos.add(new Bongo(w));
}
}
return bongos;
}
}
If namesToWongoMap is an instance variable, you can't really avoid a capturing lambda.
You can clean up the stream by splitting up the operations a little more:
return names.stream()
.map(n -> namesToWongoMap.get(n))
.filter(w -> w != null)
.map(w -> new Bongo(w))
.collect(toList());
return names.stream()
.map(namesToWongoMap::get)
.filter(Objects::nonNull)
.map(Bongo::new)
.collect(toList());
That way you don't call get twice.
This is very much like the for loop, except, for example, it could theoretically be parallelized if namesToWongoMap can't be mutated concurrently.
I don't like the filter aspect of most as it means needing to create intermediate structures and multiple passes over the names List.
There are no intermediate structures and there is only one pass over the List. A stream pipeline says "for each element...do this sequence of operations". Each element is visited once and the pipeline is applied.
Here are some relevant quotes from the java.util.stream package description:
A stream is not a data structure that stores elements; instead, it conveys elements from a source such as a data structure, an array, a generator function, or an I/O channel, through a pipeline of computational operations.
Processing streams lazily allows for significant efficiencies; in a pipeline such as the filter-map-sum example above, filtering, mapping, and summing can be fused into a single pass on the data, with minimal intermediate state.
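The "single pass" behavior can be verified by counting visits with peek (a sketch with made-up data):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FusedPass {
    public static void main(String[] args) {
        AtomicInteger visits = new AtomicInteger();
        List<Integer> out = Stream.of(1, 2, 3, 4)
                .peek(x -> visits.incrementAndGet()) // count element visits
                .filter(x -> x % 2 == 0)
                .map(x -> x * 10)
                .collect(Collectors.toList());
        System.out.println(out);          // [20, 40]
        System.out.println(visits.get()); // 4: each element visited once, one pass
    }
}
```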
Radiodef's answer pretty much nailed it, I think. The solution given there:
return names.stream()
.map(namesToWongoMap::get)
.filter(Objects::nonNull)
.map(Bongo::new)
.collect(toList());
is probably about the best that can be done in Java 8.
I did want to mention a small wrinkle in this, though. The Map.get call returns null if the name isn't present in the map, and this is subsequently filtered out. There's nothing wrong with this per se, though it does bake null-means-not-present semantics into the pipeline structure.
In some sense we'd want a mapper pipeline operation that has a choice of returning zero or one elements. A way to do this with streams is with flatMap. The flatmapper function can return an arbitrary number of elements into the stream, but in this case we want just zero or one. Here's how to do that:
return names.stream()
.flatMap(name -> {
Wongo w = nameToWongoMap.get(name);
return w == null ? Stream.empty() : Stream.of(w);
})
.map(Bongo::new)
.collect(toList());
I admit this is pretty clunky and so I wouldn't recommend doing this. A slightly better but somewhat obscure approach is this:
return names.stream()
.flatMap(name -> Optional.ofNullable(nameToWongoMap.get(name))
.map(Stream::of).orElseGet(Stream::empty))
.map(Bongo::new)
.collect(toList());
but I'm still not sure I'd recommend this as it stands.
The use of flatMap does point to another approach, though. If you have a more complicated policy of how to deal with the not-present case, you could refactor this into a helper function that returns a Stream containing the result or an empty Stream if there's no result.
Finally, JDK 9 -- still under development as of this writing -- has added Stream.ofNullable which is useful in exactly these situations:
return names.stream()
.flatMap(name -> Stream.ofNullable(nameToWongoMap.get(name)))
.map(Bongo::new)
.collect(toList());
As an aside, JDK 9 has also added Optional.stream which creates a zero-or-one stream from an Optional. This is useful in cases where you want to call an Optional-returning function from within flatMap. See this answer and this answer for more discussion.
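For illustration, here is a small JDK 9+ sketch of Optional.stream inside flatMap; the map and names below are stand-ins, not the question's types:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class OptionalStreamLookup {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("abbey", "crazy");
        map.put("hans", "shocking");
        List<String> names = Arrays.asList("abbey", "igor", "hans");
        // Optional.stream yields a zero-or-one-element stream,
        // so absent keys simply vanish from the pipeline
        List<String> found = names.stream()
                .flatMap(n -> Optional.ofNullable(map.get(n)).stream())
                .collect(Collectors.toList());
        System.out.println(found); // [crazy, shocking]
    }
}
```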
One approach I didn't see is retainAll:
public List<Bongo> getBongos(List<String> names) {
    Map<String, Wongo> copy = new HashMap<>(nameToWongoMap);
    copy.keySet().retainAll(names);
    return copy.values().stream().map(Bongo::new).collect(Collectors.toList());
}
The extra Map is a minimal performance hit, since it's just copying pointers to objects, not the objects themselves.
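A self-contained illustration of that write-through view behavior (the data is made up):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class RetainAllDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("abbey", 1);
        map.put("igor", 2);
        map.put("hans", 3);
        // keySet() is a view: removals on it write through to the backing map
        map.keySet().retainAll(Arrays.asList("abbey", "hans"));
        System.out.println(map.size());              // 2
        System.out.println(map.containsKey("igor")); // false
    }
}
```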

How to use Java 8 streams to find all values preceding a larger value?

Use Case
Through some coding Katas posted at work, I stumbled on this problem that I'm not sure how to solve.
Using Java 8 Streams, given a list of positive integers, produce a
list of integers where the integer preceded a larger value.
[10, 1, 15, 30, 2, 6]
The above input would yield:
[1, 15, 2]
since 1 precedes 15, 15 precedes 30, and 2 precedes 6.
Non-Stream Solution
public List<Integer> findSmallPrecedingValues(final List<Integer> values) {
    List<Integer> result = new ArrayList<Integer>();
    for (int i = 0; i < values.size(); i++) {
        Integer next = (i + 1 < values.size() ? values.get(i + 1) : -1);
        Integer current = values.get(i);
        if (current < next) {
            result.add(current); // List has no push(); add() appends the element
        }
    }
    return result;
}
What I've Tried
The problem I have is I can't figure out how to access next in the lambda.
return values.stream().filter(v -> v < next).collect(Collectors.toList());
Question
Is it possible to retrieve the next value in a stream?
Should I be using map and mapping to a Pair in order to access next?
Using IntStream.range:
static List<Integer> findSmallPrecedingValues(List<Integer> values) {
    return IntStream.range(0, values.size() - 1)
            .filter(i -> values.get(i) < values.get(i + 1))
            .mapToObj(values::get)
            .collect(Collectors.toList());
}
It's certainly nicer than an imperative solution with a large loop, but still a bit meh as far as the goal of "using a stream" in an idiomatic way.
Is it possible to retrieve the next value in a stream?
Nope, not really. The best cite I know of for that is in the java.util.stream package description:
The elements of a stream are only visited once during the life of a stream. Like an Iterator, a new stream must be generated to revisit the same elements of the source.
(Retrieving elements besides the current element being operated on would imply they could be visited more than once.)
We could also technically do it in a couple other ways:
Statefully (very meh).
Using a stream's iterator is technically still using the stream.
That's not pure Java 8, but recently I've published a small library called StreamEx which has a method exactly for this task:
// Find all numbers where the integer preceded a larger value.
Collection<Integer> numbers = Arrays.asList(10, 1, 15, 30, 2, 6);
List<Integer> res = StreamEx.of(numbers).pairMap((a, b) -> a < b ? a : null)
.nonNull().toList();
assertEquals(Arrays.asList(1, 15, 2), res);
The pairMap operation is internally implemented using a custom spliterator. As a result you have quite clean code which does not depend on whether the source is a List or anything else. Of course it works fine with parallel streams as well.
I've committed a test case for this task.
It's not a one-liner (it's a two-liner), but this works:
List<Integer> result = new ArrayList<>();
values.stream().reduce((a,b) -> {if (a < b) result.add(a); return b;});
Rather than solving it by "looking at the next element", this solves it by "looking at the previous element", which reduce() gives you for free. I have bent its intended usage by injecting a code fragment that populates the list based on the comparison of the previous and current elements, then returns the current element so the next iteration will see it as its previous element.
Some test code:
List<Integer> result = new ArrayList<>();
IntStream.of(10, 1, 15, 30, 2, 6).reduce((a,b) -> {if (a < b) result.add(a); return b;});
System.out.println(result);
Output:
[1, 15, 2]
The accepted answer works fine whether the stream is sequential or parallel, but can suffer if the underlying List is not random-access, due to multiple calls to get.
If your stream is sequential, you might roll this collector:
public static Collector<Integer, ?, List<Integer>> collectPrecedingValues() {
    int[] holder = {Integer.MAX_VALUE};
    return Collector.of(ArrayList::new,
            (l, elem) -> {
                if (holder[0] < elem) l.add(holder[0]);
                holder[0] = elem;
            },
            (l1, l2) -> {
                throw new UnsupportedOperationException("Don't run in parallel");
            });
}
and a usage:
List<Integer> precedingValues = list.stream().collect(collectPrecedingValues());
Nevertheless, you could also implement a collector that works for both sequential and parallel streams. The only thing is that you need to apply a final transformation, but here you have control over the List implementation, so you won't suffer from the get performance.
The idea is to first generate a list of pairs (each represented by an int[] array of size 2) that contains the values in the stream sliced into windows of size two with a step of one. When we need to merge two lists, we check for emptiness and bridge the last element of the first list with the first element of the second list. Then we apply a final transformation to filter only the desired values and map them to the desired output.
It might not be as simple as the accepted answer, but it can be an alternative solution.
public static Collector<Integer, ?, List<Integer>> collectPrecedingValues() {
    return Collectors.collectingAndThen(
            Collector.of(() -> new ArrayList<int[]>(),
                    (l, elem) -> {
                        if (l.isEmpty()) l.add(new int[]{Integer.MAX_VALUE, elem});
                        else l.add(new int[]{l.get(l.size() - 1)[1], elem});
                    },
                    (l1, l2) -> {
                        if (l1.isEmpty()) return l2;
                        if (l2.isEmpty()) return l1;
                        l2.get(0)[0] = l1.get(l1.size() - 1)[1];
                        l1.addAll(l2);
                        return l1;
                    }),
            l -> l.stream().filter(arr -> arr[0] < arr[1]).map(arr -> arr[0]).collect(Collectors.toList()));
}
You can then wrap these two collectors in a utility method, check whether the stream is parallel with isParallel(), and decide which collector to return.
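A minimal sketch of that utility (class and method names are hypothetical), combining the two collectors shown above and dispatching on the stream's parallelism:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PrecedingValues {

    // Sequential-only collector: remembers the previous element in a one-slot holder.
    static Collector<Integer, ?, List<Integer>> sequentialCollector() {
        int[] holder = {Integer.MAX_VALUE};
        return Collector.of(ArrayList::new,
                (l, elem) -> {
                    if (holder[0] < elem) l.add(holder[0]);
                    holder[0] = elem;
                },
                (l1, l2) -> {
                    throw new UnsupportedOperationException("Don't run in parallel");
                });
    }

    // Parallel-safe collector based on int[2] pairs; the combiner fixes the gap at chunk borders.
    static Collector<Integer, ?, List<Integer>> parallelCollector() {
        return Collectors.collectingAndThen(
                Collector.of(() -> new ArrayList<int[]>(),
                        (l, elem) -> {
                            if (l.isEmpty()) l.add(new int[]{Integer.MAX_VALUE, elem});
                            else l.add(new int[]{l.get(l.size() - 1)[1], elem});
                        },
                        (l1, l2) -> {
                            if (l1.isEmpty()) return l2;
                            if (l2.isEmpty()) return l1;
                            l2.get(0)[0] = l1.get(l1.size() - 1)[1];
                            l1.addAll(l2);
                            return l1;
                        }),
                l -> l.stream().filter(arr -> arr[0] < arr[1])
                        .map(arr -> arr[0])
                        .collect(Collectors.toList()));
    }

    // Pick the collector that matches the stream's execution mode.
    public static List<Integer> precedingValues(Stream<Integer> s) {
        return s.collect(s.isParallel() ? parallelCollector() : sequentialCollector());
    }
}
```

With this, both `precedingValues(list.stream())` and `precedingValues(list.parallelStream())` yield the same result.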
If you're willing to use a third-party library and don't need parallelism, then jOOλ offers SQL-style window functions as follows:
System.out.println(
Seq.of(10, 1, 15, 30, 2, 6)
.window()
.filter(w -> w.lead().isPresent() && w.value() < w.lead().get())
.map(w -> w.value())
.toList()
);
Yielding
[1, 15, 2]
The lead() function accesses the next value in traversal order from the window.
Disclaimer: I work for the company behind jOOλ
You can achieve that by using a bounded queue to store elements which flow through the stream (based on the idea I described in detail here: Is it possible to get next element in the Stream?)
The example below first defines an instance of the BoundedQueue class, which stores elements going through the stream (if you don't like the idea of extending LinkedList, refer to the link above for an alternative, more generic approach). Later you just examine the two subsequent elements, thanks to that helper class:
public class Kata {
    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>(asList(10, 1, 15, 30, 2, 6));

        class BoundedQueue<T> extends LinkedList<T> {
            public BoundedQueue<T> save(T curElem) {
                if (size() == 2) { // we need to know only two subsequent elements
                    pollLast();    // remove last to keep only the requested number of elements
                }
                offerFirst(curElem);
                return this;
            }

            public T getPrevious() {
                return (size() < 2) ? null : getLast();
            }

            public T getCurrent() {
                return (size() == 0) ? null : getFirst();
            }
        }

        BoundedQueue<Integer> streamHistory = new BoundedQueue<>();

        final List<Integer> answer = input.stream()
                .map(i -> streamHistory.save(i))
                .filter(e -> e.getPrevious() != null)
                .filter(e -> e.getCurrent() > e.getPrevious())
                .map(e -> e.getPrevious())
                .collect(Collectors.toList());

        answer.forEach(System.out::println);
    }
}

How to create an infinite stream with Java 8

Is there an easy way to create an infinite stream using Java 8, without external libraries?
For example in Scala:
Iterator.iterate(0)(_ + 2)
Yes, there is an easy way:
IntStream.iterate(0, i -> i + 2);
An example use case:
IntStream.iterate(0, i -> i + 2)
.limit(100)
.forEach(System.out::println);
This prints 0 to 198, increasing in steps of 2.
The generic method is:
Stream.iterate(T seed, UnaryOperator<T> f);
The latter is less commonly used.
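As a quick illustration of the generic variant, a sketch that iterates over non-primitive values, doubling a BigInteger on each step:

```java
import java.math.BigInteger;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class IterateExample {
    public static void main(String[] args) {
        // Infinite stream of powers of two as BigInteger; limit() makes it finite.
        List<BigInteger> powers = Stream.iterate(BigInteger.ONE, b -> b.shiftLeft(1))
                .limit(5)
                .collect(Collectors.toList());
        System.out.println(powers); // [1, 2, 4, 8, 16]
    }
}
```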
Here is an example:
PrimitiveIterator.OfInt it = new PrimitiveIterator.OfInt() {
    private int value = 0;

    @Override
    public int nextInt() {
        return value++;
    }

    @Override
    public boolean hasNext() {
        return true;
    }
};
Spliterator.OfInt spliterator = Spliterators.spliteratorUnknownSize(it,
        Spliterator.DISTINCT | Spliterator.IMMUTABLE |
        Spliterator.ORDERED | Spliterator.SORTED);
IntStream stream = StreamSupport.intStream(spliterator, false);
It's a bit verbose, as you see. To print the first 10 elements of this stream:
stream.limit(10).forEach(System.out::println);
You can of course also transform the elements, like you do in your Scala example:
IntStream plusTwoStream = stream.map(n -> n + 2);
Note that there are built-in infinite streams such as java.util.Random.ints() which gives you an infinite stream of random integers.
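For instance (a minimal sketch), the bounded variant Random.ints(origin, bound) also returns an infinite stream, here simulating die rolls:

```java
import java.util.Random;

public class RandomIntsExample {
    public static void main(String[] args) {
        // ints(1, 7) is an endless stream of values in [1, 6]; limit() makes it finite.
        new Random(42).ints(1, 7)
                .limit(5)
                .forEach(System.out::println);
    }
}
```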
There is another possible solution in Java 8:
AtomicInteger adder = new AtomicInteger();
IntStream stream = IntStream.generate(() -> adder.getAndAdd(2));
Important: an order of numbers is preserved only if the stream is sequential.
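A minimal sketch of the sequential case: because the supplier is invoked one call at a time, its side effect produces the values in order.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class GenerateExample {
    public static void main(String[] args) {
        AtomicInteger adder = new AtomicInteger();
        // Sequential: getAndAdd(2) runs once per element, in order.
        int[] first = IntStream.generate(() -> adder.getAndAdd(2))
                .limit(5)
                .toArray();
        System.out.println(Arrays.toString(first)); // [0, 2, 4, 6, 8]
    }
}
```

With .parallel() the same pipeline may interleave supplier calls, so the even numbers would still all appear, but not necessarily in increasing order.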
It's also worth noting that a new overload of IntStream.iterate has been added in Java 9:
static IntStream iterate(int seed,
                         IntPredicate hasNext,
                         IntUnaryOperator next);
seed - the initial element;
hasNext - a predicate to apply to elements to determine when the stream must terminate;
next - a function to be applied to the previous element to produce a new element.
Examples:
IntStream stream = IntStream.iterate(0, i -> i >= 0, i -> i + 2);
IntStream.iterate(0, i -> i < 10, i -> i + 2).forEach(System.out::println);
You can build your own infinite stream by implementing both Stream and Consumer, composing the two; you will likely need a queue to buffer your data, as in:
public class InfiniteStream<T> implements Consumer<T>, Stream<T> {
    private final Stream<T> stream;
    private final Queueing q;
    ...
    public InfiniteStream(int length) {
        this.q = new Queueing(this.length);
        this.stream = Stream.generate(q);
        ...
    }
    // implement Stream methods
    // implement accept
}
Check the full code here:
https://gist.github.com/bassemZohdy/e5fdd56de44cea3cd8ff
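A self-contained sketch of the same idea (class name hypothetical, and simplified to implement only Consumer): a BlockingQueue bridges a producer, which pushes elements via accept(), and an infinite Stream.generate whose supplier blocks until an element is available.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;
import java.util.function.Supplier;
import java.util.stream.Stream;

public class QueueBackedStream<T> implements Consumer<T> {
    private final BlockingQueue<T> queue;

    public QueueBackedStream(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    @Override
    public void accept(T t) {
        try {
            queue.put(t);            // blocks when the queue is full
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public Stream<T> stream() {
        Supplier<T> next = () -> {
            try {
                return queue.take(); // blocks until a producer offers an element
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException(e);
            }
        };
        // Infinite: terminates only via limit() or another short-circuiting operation.
        return Stream.generate(next);
    }
}
```

A producer thread would call accept() repeatedly, while the consumer applies limit() (or another short-circuiting operation) to the stream.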
