Peek the next Element in a stream - java

Is there a way to peek at the next element in a stream? The idea arose from a stream over a list of objects, where two consecutive objects should be compared (to smooth some diffs, but that shouldn't matter here). As an old-style for loop this would look like:
List<Car> autobahn = getCars();
for (int i = 0; i < autobahn.size()-1; i++) {
if(autobahn.get(i).speed>autobahn.get(i+1).speed)
autobahn.get(i).honk();
}
The best stream-based version so far would be:
autobahn.stream()
.limit(autobahn.size()-1)
.filter(car -> car.speed > autobahn.get(autobahn.indexOf(car)+1).speed)
.forEach(car -> car.honk());
The main problem with this solution is the indexOf method, since the same car might be on the autobahn twice. A better solution would be some way to peek at the next (or the previous) element. With a helper class this might even be possible, but it looks horrible:
BoxedCar boxedCar = new BoxedCar(autobahn.get(0));
autobahn.stream()
.skip(1)
.filter(car -> boxedCar.setContent(car))
.forEach(car -> car.winTheRace());
with the helper class:
class BoxedCar {
Car content;
BoxedCar(Car content) {
this.content = content;
}
boolean setContent(Car content) {
double speed = this.content.speed;
this.content = content;
return content.speed > speed;
}
}
Another option would be to convert the Stream<Car> into a kind of Stream<(Car,Car)>, with the second element of each pair somehow derived from the first stream (this also sounds awful, and I have no idea what it would look like).
Is there a nice way to do this with streams, or are we stuck with the for loop?

Sticking with the for loop wouldn't be a bad idea. The Stream API isn't designed for this type of requirement. You can refer to that answer for more insight.
However, a simple way to do this using the Stream API would be to use a Stream over the indexes of your list, supposing that you have random access.
IntStream.range(0, autobahn.size() - 1)
.filter(i -> autobahn.get(i).speed > autobahn.get(i+1).speed)
.forEach(i -> autobahn.get(i).honk());
Note that this closely resembles the for loop.

Using my free StreamEx library:
StreamEx.of(autobahn)
.pairMap((car, nextCar) -> car.speed < nextCar.speed ? car : null)
.nonNull()
.forEach(Car::honk);
Here the non-standard pairMap operation is used, which maps each adjacent pair of elements to a single element. This works for any stream source (not only a random-access indexed list) and can be parallelized pretty well.
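For readers who cannot add a dependency, the adjacent-pair idea can be approximated with the plain JDK. The following is only a minimal sketch, not how StreamEx implements pairMap: the class name AdjacentPairs, the helper pairMap, and the sample speeds are made up for illustration, and unlike StreamEx it assumes a List with cheap random access.

```java
import java.util.List;
import java.util.Objects;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical helper, not part of StreamEx or the JDK.
class AdjacentPairs {

    // Applies mapper to (list[i], list[i + 1]) for every valid index i.
    // Assumes the list offers cheap random access (e.g. ArrayList).
    static <T, R> Stream<R> pairMap(List<T> list,
                                    BiFunction<? super T, ? super T, ? extends R> mapper) {
        return IntStream.range(0, list.size() - 1)
                .mapToObj(i -> mapper.apply(list.get(i), list.get(i + 1)));
    }

    public static void main(String[] args) {
        List<Integer> speeds = List.of(120, 100, 130, 130, 90);

        // Keep every speed that is higher than the one right after it.
        List<Integer> fasterThanNext = pairMap(speeds, (cur, next) -> cur > next ? cur : null)
                .filter(Objects::nonNull)
                .collect(Collectors.toList());

        System.out.println(fasterThanNext); // prints [120, 130]
    }
}
```

An empty or single-element list simply yields an empty stream, because IntStream.range returns no indices when the end is not greater than the start.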

What about using an IntStream instead of a loop?
IntStream.range(0, autobahn.size() - 1)
.filter(i -> autobahn.get(i).speed < autobahn.get(i + 1).speed)
.forEach(i -> autobahn.get(i).honk());

Related

java 8 parallelStream().forEach Result data loss

There are two test cases which use parallelStream():
List<Integer> src = new ArrayList<>();
for (int i = 0; i < 20000; i++) {
src.add(i);
}
List<String> strings = new ArrayList<>();
src.parallelStream().filter(integer -> (integer % 2) == 0).forEach(integer -> strings.add(integer + ""));
System.out.println("=size=>" + strings.size());
=size=>9332
List<Integer> src = new ArrayList<>();
for (int i = 0; i < 20000; i++) {
src.add(i);
}
List<String> strings = new ArrayList<>();
src.parallelStream().forEach(integer -> strings.add(integer + ""));
System.out.println("=size=>" + strings.size());
=size=>17908
Why do I always lose data when using parallelStream?
What did I do wrong?
ArrayList isn't thread safe. You need to do
List<String> strings = Collections.synchronizedList(new ArrayList<>());
or
List<String> strings = new Vector<>();
to ensure all updates are synchronized, or switch to
List<String> strings = src.parallelStream()
.filter(integer -> (integer % 2) == 0)
.map(integer -> integer + "")
.collect(Collectors.toList());
and leave the list building to the Streams framework. Note that it's undefined whether the list returned by collect is modifiable, so if that is a requirement, you may need to modify your approach.
In terms of performance, Stream.collect is likely to be much faster than using Stream.forEach to add to a synchronized collection, since the Streams framework can handle collection of values in each thread separately without synchronization and combine the results at the end in a thread safe fashion.
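The "collect per thread, then combine" behaviour described above can be made visible with the three-argument form of Stream.collect, which takes a supplier, an accumulator, and a combiner. This is a rough sketch for illustration only; the class and method names are invented, and Collectors.toList() remains the simpler choice in practice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; names are illustrative only.
class ParallelCollectDemo {

    static List<String> evenNumbersAsStrings(List<Integer> src) {
        List<String> result = src.parallelStream()
                .filter(i -> i % 2 == 0)
                .map(String::valueOf)
                .collect(ArrayList::new,     // supplier: a fresh container per chunk of work
                         ArrayList::add,     // accumulator: adds to that private container
                         ArrayList::addAll); // combiner: merges partial containers in order
        return result;
    }

    public static void main(String[] args) {
        List<Integer> src = IntStream.range(0, 20000).boxed().collect(Collectors.toList());
        System.out.println("=size=>" + evenNumbersAsStrings(src).size()); // prints =size=>10000
    }
}
```

No ArrayList is ever shared between threads while it is being filled, so no synchronization is needed and no elements are lost.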
ArrayList isn't thread-safe. While one thread sees a list with 30 elements, another might still see 29 and overwrite the 30th position (losing one element).
Another issue arises when the array backing the list needs to be resized. A new, larger array is created and the elements of the original array are copied into it. Other threads may have added elements in the meantime that the resizing thread has not seen, or several threads may resize at once and only one of the new arrays wins.
When using multiple threads you need to either synchronize access to the list or use a thread-safe list (for example by wrapping it with Collections.synchronizedList or by using a CopyOnWriteArrayList, to mention two possible solutions). Even better would be to use the collect method on the stream to put everything into a list.
Combining parallelStream with forEach is a deadly combo if not used carefully.
Keep the following points in mind to avoid bugs:
If you have a pre-existing list to which you want to add objects from inside a parallelStream loop, wrap it with Collections.synchronizedList before looping and add through the wrapper.
If you have to create a new list, you can use a Vector, initialized outside the loop.
or
If you have to create a new list, simply use parallelStream and collect the output at the end.
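The first point above, appending to a pre-existing list, might look roughly like the following. This is a sketch under assumptions: the class and method names are invented, and the initial contents are placeholder values. Note that only the number of elements is guaranteed, not the order in which they are appended.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

// Hypothetical demo class; names are illustrative only.
class SynchronizedAppendDemo {

    // Appends the numbers 0..count-1 (as strings) to an existing list from a parallel stream.
    // The order of the appended elements is NOT guaranteed; only the element count is.
    static void appendInParallel(List<String> existing, int count) {
        // All writes go through the synchronized wrapper, never through 'existing' directly.
        List<String> safeView = Collections.synchronizedList(existing);
        IntStream.range(0, count)
                .parallel()
                .forEach(i -> safeView.add(String.valueOf(i)));
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>(List.of("a", "b", "c"));
        appendInParallel(strings, 20000);
        System.out.println("=size=>" + strings.size()); // prints =size=>20003
    }
}
```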
You lose the benefits of using stream (and parallel stream) when you try to do mutation. As a general rule, avoid mutation when using streams. Venkat Subramaniam explains why. Instead, use collectors. Also try to get a lot accomplished within the stream chain. For example:
System.out.println(
IntStream.range(0, 200000)
.filter(i -> i % 2 == 0)
.mapToObj(String::valueOf)
.collect(Collectors.toList()).size()
);
You can run that in parallel by adding .parallel() to the stream chain.

indexing <Stream> data Java

I need some help with indexing Stream data in Java. The context is that we need to manually set an index on documents that are embedded in another document (TL;DR: the output of this method needs to be a Stream).
return Stream.concat(firstStream, secondStream) <- these need to be indexed
.sorted(// sorted using Comparator)
.forEach? .map? // the class has index field with getter and setter so I think need to do `setIndex(i)` but wasnt sure where to get 'i'
Any advice would be greatly appreciated!
If you can construct your streams yourself from lists, use IntStream of indices rather than Stream of objects.
IntStream.range(0, firstList.size()).forEach(i -> firstList.get(i).setIndex(i));
int offsetForSecondList = firstList.size();
IntStream.range(0, secondList.size())
.forEach(i -> secondList.get(i).setIndex(offsetForSecondList + i));
I have not tried to compile the code, so forgive any typo.
Otherwise your AtomicReference approach works too.
Assuming you have a class MyObject:
class MyObject{
int index;
String name;
//getters,setters,cons, toString...
}
Something like below may be a starting point:
public static Stream<MyObject> fooBar(){
// just for example, in order to get the streams to be concatenated
List<MyObject> first = List.of(new MyObject("foo"),new MyObject("foo"),new MyObject("foo"));
List<MyObject> second = List.of(new MyObject("bar"),new MyObject("bar"),new MyObject("bar"));
AtomicInteger ai = new AtomicInteger(0);
return Stream.concat(first.stream(), second.stream())
.peek(myo -> myo.setIndex(ai.getAndIncrement()));
}
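Since the question sorts the concatenated stream before indexing, one alternative that avoids a side-effecting counter is to collect the sorted elements into a list, number them by position, and hand back a fresh stream. The sketch below assumes a made-up Doc class with a name and an index; the real document class and comparator from the question are not known.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Hypothetical demo class; names are illustrative only.
class IndexAfterSort {

    // Hypothetical stand-in for the embedded document class from the question.
    static class Doc {
        final String name;
        int index = -1;
        Doc(String name) { this.name = name; }
        String getName() { return name; }
        int getIndex() { return index; }
        void setIndex(int index) { this.index = index; }
    }

    // Concatenates and sorts both streams, then numbers the elements by sorted position.
    static Stream<Doc> indexed(Stream<Doc> first, Stream<Doc> second, Comparator<Doc> order) {
        List<Doc> sorted = Stream.concat(first, second)
                .sorted(order)
                .collect(Collectors.toList());
        IntStream.range(0, sorted.size())
                .forEach(i -> sorted.get(i).setIndex(i));
        return sorted.stream();
    }

    public static void main(String[] args) {
        Stream<Doc> result = indexed(
                Stream.of(new Doc("pear"), new Doc("apple")),
                Stream.of(new Doc("fig")),
                Comparator.comparing(Doc::getName));
        // Prints "0 apple", "1 fig", "2 pear", one per line.
        result.forEach(d -> System.out.println(d.getIndex() + " " + d.getName()));
    }
}
```

The sort already has to buffer every element, so the intermediate list costs little extra, and the indices do not depend on when or whether a downstream consumer pulls elements through a peek.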

How to pass in Streams Prev element result to next one

I have List of items where on each item I need to create some calculation.
Each calculation is built by the preceding element.
So for example:
List<Object> users = new ArrayList<>();
users.stream().filter(element -> calculateSomething(<need-prev-element-input>)).findFirst();
calculateSomething returns true or false depending on the calculation result of the previous element in the stream.
Any idea how can I do that?
Streams are not designed to be able to do any operation like this. You might be able to hack something together to do that, but it'll be awful; you should go back to using normal loops instead.
If you really want to use streams, stream over indexes:
IntStream.range(1, users.size())
.filter(i -> calculateSomething(users.get(i-1) , users.get(i)))
.mapToObj(users::get)
.findFirst();
There are also a number of non-standard libraries that let you stream over pairs from a list.
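As a self-contained illustration of streaming over indexes, the sketch below wraps the idea in a generic helper. The names PreviousElementSearch and firstAfterMatch are invented, and a BiPredicate stands in for the question's calculateSomething, whose real signature is not known.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.BiPredicate;
import java.util.stream.IntStream;

// Hypothetical demo class; names are illustrative only.
class PreviousElementSearch {

    // Returns the first element for which test(previous, current) holds, if any.
    // Assumes cheap random access; the first element is never returned
    // because it has no predecessor.
    static <T> Optional<T> firstAfterMatch(List<T> items,
                                           BiPredicate<? super T, ? super T> test) {
        return IntStream.range(1, items.size())
                .filter(i -> test.test(items.get(i - 1), items.get(i)))
                .mapToObj(items::get) // mapToObj, since IntStream.map can only produce ints
                .findFirst();
    }

    public static void main(String[] args) {
        List<Integer> readings = List.of(5, 7, 6, 9);
        // First reading that is lower than the one before it.
        System.out.println(firstAfterMatch(readings, (prev, cur) -> cur < prev)); // prints Optional[6]
    }
}
```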
Here's the Java 8 way of doing it:
<T extends User> Optional<T> findUserSomehow(List<T> users) {
for (int idx = 1; idx < users.size(); ++idx)
if (calculateSomething(users.get(idx - 1)))
return Optional.of(users.get(idx));
return Optional.empty();
}

filtering a stream against items in another list

I'm trying to filter a stream against data in a different list.
It works, but I use a for loop in the middle of the stream. I cannot find any information on how to convert the for loop to a stream.
I could just .stream() the selection.getItems(), then .forEach() and open a new .stream() of DATA.accounts, but that is poor code as it would have to restream on every .forEach.
y=1;
DATA.accounts.stream()
.flatMap(estimate -> estimate.getElements().stream())
.filter( ele-> {
// different list;
for (Element element:selection.getItems()){
if (element.getId()==ele.getId()){
return true;
}
}
return false;
})
.forEach(element -> {
element.setDateSchedualed(selectedDate);
element.setOrder(y);
y++;
});
I think what you really need is:
list1.removeAll(list2);
No streams involved though.
You can express the filter as
.filter(ele -> selection.getItems().stream()
.anyMatch(element -> element.getId()==ele.getId()))
The fact that this “would have to restream” shouldn’t bother you more than the fact that the original code will loop for every element. You have created an operation with O(n×m) time complexity in either case. This is acceptable if you can surely predict that one of these lists will always be very small.
Otherwise, there is no way around preparing this operation by storing the id values in a structure with a fast (O(1) in the best case) lookup. I.e.
Set<IdType> id = selection.getItems().stream()
.map(element -> element.getId())
.collect(Collectors.toSet());
…
.filter(ele -> id.contains(ele.getId()))
Besides that, your forEach approach incrementing the y variable clearly is an anti-pattern and it doesn’t even compile, when y is a local variable. And if y is a field, it would make this code even worse. Here, it’s much cleaner to accept a temporary storage into a List:
Set<IdType> id = selection.getItems().stream()
.map(element -> element.getId())
.collect(Collectors.toSet());
List<ElementType> list = DATA.accounts.stream()
.flatMap(estimate -> estimate.getElements().stream())
.filter(ele -> id.contains(ele.getId()))
.collect(Collectors.toList());
IntStream.range(0, list.size())
.forEach(ix -> {
ElementType element = list.get(ix);
element.setDateSchedualed(selectedDate);
element.setOrder(ix+1);
});
Put the other list's IDs in a Set selectedIds, then filter based on ele-> selectedIds.contains(ele.getId()).
That will give you (amortized) linear time complexity.
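A runnable sketch of the Set-based lookup follows. The Element class here is a made-up minimal stand-in with a long id; the real element type and id type in the question are not shown, so treat both as assumptions.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical demo class; names are illustrative only.
class FilterBySelection {

    // Hypothetical minimal element type; the real class in the question has more fields.
    static class Element {
        final long id;
        Element(long id) { this.id = id; }
        long getId() { return id; }
    }

    // Keeps only the candidates whose id also occurs in selection.
    static List<Element> selected(List<Element> candidates, List<Element> selection) {
        // Build the lookup set once, so each contains() call is O(1) on average.
        Set<Long> selectedIds = selection.stream()
                .map(Element::getId)
                .collect(Collectors.toSet());
        return candidates.stream()
                .filter(e -> selectedIds.contains(e.getId()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Element> all = List.of(new Element(1), new Element(2), new Element(3), new Element(4));
        List<Element> picked = List.of(new Element(4), new Element(2));
        // Prints 2 and 4, one per line, in the order of the candidate list.
        selected(all, picked).forEach(e -> System.out.println(e.getId()));
    }
}
```

The result keeps the encounter order of the candidates, not of the selection, which matters if an order number is assigned afterwards.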
Since you need to check presence among all elements in selected for each item in the stream, I don't expect there will be any straightforward method using only streams (because you cannot really stream the selected collection for this task).
I think there is actually nothing wrong with using a for-each loop if you want to search for the id in linear time, because for example if your items list was an ArrayList and you used its contains method for filtering, it would actually also just loop over the elements. You could write a general contains function like:
public static <E1, E2> boolean contains(Collection<E1> collection, E2 e2, BiPredicate<E1, E2> predicate){
for (E1 e1 : collection){
if (predicate.test(e1, e2)){
return true;
}
}
return false;
}
and replace your for-each loop with it:
ele -> contains(selection.getItems(), ele, (e1, e2) -> e1.getId() == e2.getId())

Guava function to add items to resultant collection based on applying conditions to each item in list

I am trying to create a set of items by looking through an existing set and adding each one to a resultant list if it meets certain conditions. I was wondering if there was a more efficient way to do this task using some of the Google Guava libraries. The algorithm is below
final List<String> matchingItems = new ArrayList<>();
for (final String li : getItems()) {
if (li.length() > 5) {
matchingItems.add(li);
}
}
return matchingItems;
No
There is no more efficient way to write that for loop than yours[1].
But you can rewrite it in several ways using Guava or Java 8. These will be slightly less efficient.
Using Guava
You can use Guava's FluentIterable class. It's kind of made for your use-case.
Note that you have to create a separate Predicate, but you can reuse it at will, of course.
Predicate<String> hasLengthGreaterThanFive = new Predicate<String>() {
@Override public boolean apply(String str) {
return str.length() > 5;
}
};
List<String> matchingItems = FluentIterable.from(getItems())
.filter(hasLengthGreaterThanFive)
.toList();
Using Java 8
Java 8 makes the previous bit of code much more readable, and you don't need to include Guava.
List<String> matchingItems = getItems().stream()
.filter(s -> s.length() > 5)
.collect(Collectors.toList());
[1]: Barring some bytecode-level optimizations.
