How do you create a stream of Boolean.FALSE of, say, length 100?
What I've struggled with is:
Originally I intended to create an array of Boolean.FALSE, but new Boolean[100] returns an array of nulls. So I considered using the Stream API as a convenient way to produce and manipulate the values.
Boolean has no no-arg constructor, so I can't pass a constructor reference to Stream.generate(), which accepts a Supplier<T>.
What I found is that Stream.iterate(Boolean.FALSE, bool -> Boolean.FALSE).limit(100); gives what I want, but it doesn't seem to be a very elegant solution, IMHO.
Another option I found is IntStream.range(0, 100).mapToObj(idx -> Boolean.FALSE);, which seems even stranger to me.
Although these options don't violate the pipeline concept of the Stream API, are there any more concise ways to create a stream of Boolean.FALSE?
Even though Boolean has no no-arg constructor, you can still use Stream.generate using a lambda:
Stream.generate(() -> Boolean.FALSE).limit(100)
This also has the advantage (compared to using a constructor) that those will be the same Boolean instances, and not 100 different but equal ones.
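A minimal sketch of the lambda-based generator, for anyone who wants to check the "same instance" claim. The class name GenerateFalseSketch and the helper hundredFalse() are made up for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GenerateFalseSketch {
    // Hypothetical helper: an infinite stream of Boolean.FALSE, cut off after 100 elements.
    static Stream<Boolean> hundredFalse() {
        return Stream.generate(() -> Boolean.FALSE).limit(100);
    }

    public static void main(String[] args) {
        List<Boolean> values = hundredFalse().collect(Collectors.toList());
        System.out.println(values.size());
        // Identity comparison (==) on purpose: every element is the one shared Boolean.FALSE object.
        System.out.println(values.stream().allMatch(b -> b == Boolean.FALSE));
    }
}
```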
You can use Collections's static <T> List<T> nCopies(int n, T o):
Collections.nCopies (100, Boolean.FALSE).stream()...
Note that the List returned by nCopies is tiny (it contains a single reference to the data object), so it doesn't require more storage than the Stream.generate().limit() solution, regardless of the required size.
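A small sketch of the nCopies approach, assuming a hypothetical class NCopiesSketch. It also shows two things that are easy to miss: the returned list is immutable, and toArray gives an ordinary array of Boolean.FALSE if an array was the original goal.

```java
import java.util.Collections;
import java.util.List;

class NCopiesSketch {
    // Hypothetical helper wrapping Collections.nCopies.
    static List<Boolean> hundredFalse() {
        return Collections.nCopies(100, Boolean.FALSE);
    }

    public static void main(String[] args) {
        List<Boolean> list = hundredFalse();
        System.out.println(list.size());
        System.out.println(list.stream().filter(b -> !b).count());

        // The list is immutable, so attempts to modify it are rejected.
        try {
            list.set(0, Boolean.TRUE);
        } catch (UnsupportedOperationException e) {
            System.out.println("immutable");
        }

        // A plain array can be copied out of it and modified freely.
        Boolean[] array = list.toArray(new Boolean[0]);
        System.out.println(array.length);
    }
}
```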
Of course, you could also build the stream directly:
Stream.Builder<Boolean> builder = Stream.builder();
for( int i = 0; i < 100; i++ )
builder.add( false );
Stream<Boolean> stream = builder.build();
Related
I have the following for loop:
List<Player> players = new ArrayList<>();
for (Team team : teams) {
ArrayList<TeamPlayer> teamPlayers = team.getTeamPlayers();
for (TeamPlayer player : teamPlayers) {
players.add(new Player(player.getName(), player.getPosition()));
}
}
and I'm trying to convert it to a Stream:
List<Player> players = teams.forEach(t -> t.getTeamPlayers()
.forEach(p -> players.add(new Player(p.getName(), p.getPosition())))
);
But I'm getting a compilation error:
variable 'players' might not have been initialized
Why is this happening? Maybe there's an alternative way to create the stream, I was thinking of using flatMap but not sure how to apply it.
First of all, you need to understand that Streams don't act like Loops.
Hence, don't try to mimic a loop. Examine the tools offered by the API. Operation forEach() is there for special cases when you need to perform side-effects, not in order to accumulate elements from the stream into a Collection.
Note: with teams.forEach() you're not actually using a stream, but method Iterable.forEach() which is available with every implementation of Iterable.
To perform reduction on streams, we have several specialized operations like collect, reduce, etc. (for more information refer to the API documentation - Reduction).
The collect() operation is meant to perform mutable reduction. You can use it to collect the data into a list by providing the built-in collector Collectors.toList() as an argument. Since Java 16, the toList() operation is also available directly on Stream; it is implemented on top of the toArray() operation and performs better than the namesake collector (therefore it's the preferred option if your JDK version allows you to use it). Note that it returns an unmodifiable list.
I was thinking of using flatMap but not sure how to apply it.
Operation flatMap() is meant to perform one-to-many transformations. It expects a Function which takes a stream element and generates a Stream of the resulting type, elements of the generated stream become a replacement for the initial element.
Note: the general approach to writing streams is to use as few operations as possible (because one of the main advantages that functional programming brings to Java is simplicity). For that reason, applying flatMap() in a single step when a stream element produces a Collection is idiomatic, since it's shorter than performing map().flatMap() in two steps.
Here's how the implementation might look:
List<Team> teams = List.of();
List<Player> players = teams.stream() // Stream<Team>
.flatMap(team -> team.getTeamPlayers().stream()) // Stream<TeamPlayer>
.map(player -> new Player(player.getName(), player.getPosition()))
.toList(); // for Java 16+ or collect(Collectors.toList())
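For anyone who wants to run this, here is a self-contained sketch. The Team, TeamPlayer and Player classes below are hypothetical stand-ins (the question doesn't show the real ones), and it uses collect(Collectors.toList()) so that it also works before Java 16.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapSketch {
    // Hypothetical stand-ins for the question's classes.
    static final class TeamPlayer {
        final String name;
        final String position;
        TeamPlayer(String name, String position) { this.name = name; this.position = position; }
        String getName() { return name; }
        String getPosition() { return position; }
    }

    static final class Player {
        final String name;
        final String position;
        Player(String name, String position) { this.name = name; this.position = position; }
    }

    static final class Team {
        final List<TeamPlayer> teamPlayers;
        Team(TeamPlayer... players) { this.teamPlayers = new ArrayList<>(Arrays.asList(players)); }
        List<TeamPlayer> getTeamPlayers() { return teamPlayers; }
    }

    static List<Player> toPlayers(List<Team> teams) {
        return teams.stream()                                       // Stream<Team>
                .flatMap(team -> team.getTeamPlayers().stream())    // Stream<TeamPlayer>
                .map(tp -> new Player(tp.getName(), tp.getPosition())) // Stream<Player>
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Team> teams = Arrays.asList(
                new Team(new TeamPlayer("Ann", "GK"), new TeamPlayer("Bob", "DF")),
                new Team(new TeamPlayer("Cy", "FW")));
        System.out.println(toPlayers(teams).size()); // 3
    }
}
```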
This is basically the answer of Alexander Ivanchenko, but with method reference.
final var players = teams.stream()
.map(Team::getTeamPlayers)
.flatMap(Collection::stream)
.map(p -> new Player(p.getName(), p.getPosition()))
.toList();
If your Player class has a factory method like this (depending on the relation between Player and TeamPlayer):
public static Player fromTeamPlayer(final TeamPlayer teamPlayer) {
return new Player(teamPlayer.getName(), teamPlayer.getPosition());
}
You could further reduce it to:
final var players = teams.stream()
.map(Team::getTeamPlayers)
.flatMap(Collection::stream)
.map(Player::fromTeamPlayer)
.toList();
In Java 8 I can map over streams with the map method, e.g.
Stream.of("Hello", "world").map(s -> s.length())
gives me a stream containing the integers [5, 5]. I am trying to do the same with lists. I have come up with
List<String> list = ...
list.stream().map(s -> s.length()).collect(Collectors.toList())
This works but is rather verbose. Is there a more concise solution? Ideally, there would be a similar map method for lists, but I haven't found any. So, are there any alternatives?
As compact as possible
Just wrap it into your own utility function:
public <T, S> List<S> mapBy(List<T> items, Function<T, S> mapFn) {
return items.stream().map(mapFn).collect(Collectors.toList());
}
Now you can just use mapBy(students, Student::getName). It doesn't get less verbose than that.
Note that this is only useful if that's the only data mutation you want to make. Once you have more stream operators you want to apply it'd be better to do just that as otherwise you keep creating intermediate lists, which is quite wasteful.
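A runnable sketch of such a helper, in case it is useful. The class name MapBySketch is made up, and the wildcard bounds on the Function parameter are my addition (an assumption about what you'd want, so that e.g. a Function<Object, Integer> is accepted too).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class MapBySketch {
    // Static so it can be called without an instance; bounds widened with wildcards.
    static <T, S> List<S> mapBy(List<T> items, Function<? super T, ? extends S> mapFn) {
        return items.stream().map(mapFn).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> lengths = mapBy(Arrays.asList("Hello", "world"), String::length);
        System.out.println(lengths); // [5, 5]
    }
}
```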
Practically speaking, to operate on each element of a list you have to either stream it or loop over it, and the stream version is more concise than the loop. You can shorten it a little further by replacing the lambda expression with a method reference:
list.stream().map(String::length).collect(Collectors.toList());
I have a List defined as follows:
List<Integer> list1 = new ArrayList<>();
list1.add(1);
list1.add(2);
How can I increment each element of the List by one (i.e. end up with a List [2,3]) using Java 8's Stream API without creating new List?
When you create a Stream from the List, you are not allowed to modify the source List from the Stream as specified in the “Non-interference” section of the package documentation. Not obeying this constraint can result in a ConcurrentModificationException or, even worse, a corrupted data structure without getting an exception.
The only solution to directly manipulate the list using a Java Stream, is to create a Stream not iterating over the list itself, i.e. a stream iterating over the indices like
IntStream.range(0, list1.size()).forEach(ix -> list1.set(ix, list1.get(ix)+1));
(as in Eran’s answer).
But it’s not necessary to use a Stream here. The goal can be achieved as simply as
list1.replaceAll(i -> i + 1);
This is a new List method introduced in Java 8, also allowing to smoothly use a lambda expression. Besides that, there are also the probably well-known Iterable.forEach, the nice Collection.removeIf, and the in-place List.sort method, to name other new Collection operations not involving the Stream API. Also, the Map interface got several new methods worth knowing.
See also “New and Enhanced APIs That Take Advantage of Lambda Expressions and Streams in Java SE 8” from the official documentation.
Holger's answer is just about perfect. However, if you're concerned with integer overflow, then you can use another utility method that was released in Java 8: Math#incrementExact. This will throw an ArithmeticException if the result overflows an int. A method reference can be used for this as well, as seen below:
list1.replaceAll(Math::incrementExact);
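A short sketch of both behaviours (in-place increment and the overflow check), assuming a hypothetical helper incrementAll in a made-up class IncrementInPlaceSketch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IncrementInPlaceSketch {
    // Hypothetical helper: increments every element in place, failing fast on int overflow.
    static void incrementAll(List<Integer> list) {
        list.replaceAll(Math::incrementExact);
    }

    public static void main(String[] args) {
        List<Integer> list1 = new ArrayList<>(Arrays.asList(1, 2));
        incrementAll(list1);
        System.out.println(list1); // [2, 3]

        List<Integer> atLimit = new ArrayList<>(Arrays.asList(Integer.MAX_VALUE));
        try {
            incrementAll(atLimit);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```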
You can iterate over the indices via an IntStream combined with forEach:
IntStream.range(0,list1.size()).forEach(i->list1.set(i,list1.get(i)+1));
However, this is not much different than a normal for loop, and probably less readable.
Alternatively, reassign the result to list1 (note that this does create a new List):
list1 = list1.stream().map(i -> i+1).collect(Collectors.toList());
public static Function<Map<String, LinkedList<Long>>, Map<String, LinkedList<Long>>> applyDiscount = (
objectOfMAp) -> {
objectOfMAp.values().forEach(listfLong -> {
LongStream.range(0, ((LinkedList<Long>) listfLong).size()).forEach(index -> {
Integer position = (int) index;
Double l = listfLong.get(position) - (10.0 / 100 * listfLong.get(position));
listfLong.set(position, l.longValue());
});
});
return objectOfMAp;
};
So I have some code using Java 8 streams, and it works. It does exactly what I need it to do, and it's legible (a rarity for functional programming). Towards the end of a subroutine, the code runs over a List of a custom pair type:
// All names Hungarian-Notation-ized for SO reading
class AFooAndABarWalkIntoABar
{
public int foo_int;
public BarClass bar_object;
....
}
List<AFooAndABarWalkIntoABar> results = ....;
The data here must be passed into other parts of the program as arrays, so they get copied out:
// extract either a foo or a bar from each "foo-and-bar" (fab)
int[] foo_array = results.stream()
.mapToInt (fab -> fab.foo_int)
.toArray();
BarClass[] bar_array = results.stream()
.map (fab -> fab.bar_object)
.toArray(BarClass[]::new);
And done. Now each array can go do its thing.
Except... that loop over the List twice bothers me in my soul. And if we ever need to track more information, they're likely going to add a third field, and then have to make a third pass to turn the 3-tuple into three arrays, etc. So I'm fooling around with trying to do it in a single pass.
Allocating the data structures is trivial, but maintaining an index for use by the Consumer seems hideous:
int[] foo_array = new int[results.size()];
BarClass[] bar_array = new BarClass[results.size()];
// the trick is providing a stateful iterator across the array:
// - can't just use 'int', it's not effectively final
// - an actual 'final int' would be hilariously wrong
// - "all problems can be solved with a level of indirection"
class Indirection { int iterating = 0; }
final Indirection sigh = new Indirection();
// equivalent possibility is
// final int[] disgusting = new int[]{ 0 };
// and then access disgusting[0] inside the lambda
// wash your hands after typing that code
results.stream().forEach (fab -> {
foo_array[sigh.iterating] = fab.foo_int;
bar_array[sigh.iterating] = fab.bar_object;
sigh.iterating++;
});
This produces identical arrays as the existing solution using multiple stream loops. And it does so in about half the time, go figure. But the iterator indirection tricks seem so unspeakably ugly, and of course preclude any possibility of populating the arrays in parallel.
Using a pair of ArrayList instances, created with appropriate capacity, would let the Consumer code simply call add for each instance, and no external iterator needed. But ArrayList's toArray(T[]) has to perform a copy of the storage array again, and in the int case there's boxing/unboxing on top of that.
(edit: The answers to the "possible duplicate" question all talk about only maintaining the indices in a stream, and using direct array indexing to get to the actual data during filter/map calls, along with a note that it doesn't really work if the data isn't accessible by direct index. While this question has a List and is "directly indexable" only from a viewpoint of "well, List#get exists, technically". If the results collection above is a LinkedList, for example, then calling an O(n) get N times with nonconsecutive index would be... bad.)
Are there other, better, possibilities that I'm missing? I thought a custom Collector might do it, but I can't figure out how to maintain the state there either and never even got as far as scratch code.
As the size of the stream is known, there is no reason to reinvent the wheel. The simplest solution is usually the best one. The second approach you have shown is nearly there: just use an AtomicInteger as the array index and you will achieve your goal, a single pass over the data and possible parallel stream execution (thanks to AtomicInteger).
So:
AtomicInteger index = new AtomicInteger();
results.parallelStream().forEach (fab -> {
int idx = index.getAndIncrement();
foo_array[idx] = fab.foo_int;
bar_array[idx] = fab.bar_object;
});
This is thread-safe for parallel execution, with one iteration over the whole collection.
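One property worth checking before relying on this: getAndIncrement hands out each index exactly once, so foo and bar of the same element always land in the same slot, but with a parallel stream the slots are not guaranteed to follow the list order. A sketch with a hypothetical Fab class standing in for the question's pair type (String replaces BarClass for brevity):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicIndexSketch {
    // Hypothetical stand-in for AFooAndABarWalkIntoABar.
    static final class Fab {
        final int fooInt;
        final String barObject;
        Fab(int fooInt, String barObject) { this.fooInt = fooInt; this.barObject = barObject; }
    }

    static void fill(List<Fab> results, int[] fooArray, String[] barArray) {
        AtomicInteger index = new AtomicInteger();
        results.parallelStream().forEach(fab -> {
            int idx = index.getAndIncrement(); // each index is handed out exactly once
            fooArray[idx] = fab.fooInt;
            barArray[idx] = fab.barObject;
        });
    }

    public static void main(String[] args) {
        List<Fab> results = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            results.add(new Fab(i, "bar" + i));
        }
        int[] fooArray = new int[results.size()];
        String[] barArray = new String[results.size()];
        fill(results, fooArray, barArray);

        // Pairs stay aligned slot by slot, whatever order the slots were filled in.
        boolean paired = true;
        for (int i = 0; i < fooArray.length; i++) {
            paired &= barArray[i].equals("bar" + fooArray[i]);
        }
        System.out.println("paired: " + paired);
    }
}
```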
If your prerequisites are that both, iterating the list and accessing the list via an index, are expensive operations, there is no chance of getting a benefit from the parallel Stream processing. You can try to go with this answer, if you don’t need the result values in the original list order.
Otherwise, you can’t benefit from the parallel Stream processing as it requires the source to be able to efficiently split its contents into two halves, which implies either, random access or fast iteration. If the source has no customized spliterator, the default implementation will try to enable parallel processing via buffering elements into an array, which already implies iterating before the parallel processing even starts and having additional array storage costs where your sole operation is an array storage operation anyway.
When you accept that there is no benefit from parallel processing, you can stay with your sequential solution, but solve the ugliness of the counter by moving it into the Consumer. Since lambda expressions don’t support this, you can turn to the good old anonymous inner class:
int[] foo_array = new int[results.size()];
BarClass[] bar_array = new BarClass[results.size()];
results.forEach(new Consumer<AFooAndABarWalkIntoABar>() {
int index=0;
public void accept(AFooAndABarWalkIntoABar t) {
foo_array[index]=t.foo_int;
bar_array[index]=t.bar_object;
index++;
}
});
Of course, there’s also the often-overlooked alternative of the good old for-loop:
int[] foo_array = new int[results.size()];
BarClass[] bar_array = new BarClass[results.size()];
int index=0;
for(AFooAndABarWalkIntoABar t: results) {
foo_array[index]=t.foo_int;
bar_array[index]=t.bar_object;
index++;
}
I wouldn’t be surprised, if this beats all other alternatives performance-wise for your scenario…
A way to reuse an index in a stream is to wrap your lambda in an IntStream that is in charge of incrementing the index:
IntStream.range(0, results.size()).forEach(i -> {
foo_array[i] = results.get(i).foo_int;
bar_array[i] = results.get(i).bar_object;
});
With regards to Antoniossss's answer, using an IntStream seems like a slightly preferable option to using AtomicInteger:
It also works with parallel();
Two fewer local variables;
Leaves the Stream API in charge of parallel processing;
Two fewer lines of code.
EDIT: as Mikhail Prokhorov pointed out, calling the get method twice on implementations such as LinkedList will be slower than other solutions, given the O(n) complexity of their implementations of get. This can be fixed with:
AFooAndABarWalkIntoABar temp = results.get(i);
foo_array[i] = temp.foo_int;
bar_array[i] = temp.bar_object;
Java 12 adds a teeing collector which provides an approach to do this in one pass. Here is some example code using Apache Commons Pair class.
import org.apache.commons.lang3.tuple.Pair;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
class Scratch {
public static void main(String[] args) {
final Stream<Pair<String, String>> pairs = Stream.of(
Pair.of("foo1", "bar1"),
Pair.of("foo2", "bar2"),
Pair.of("foo3", "bar3")
);
final Pair<List<String>, List<String>> zipped = pairs
.collect(Collectors.teeing(
Collectors.mapping(Pair::getLeft, Collectors.toList()),
Collectors.mapping(Pair::getRight, Collectors.toList()),
(lefts, rights) -> Pair.of(lefts, rights)
));
// Then get the arrays out
String[] lefts = zipped.getLeft().toArray(String[]::new);
String[] rights = zipped.getRight().toArray(String[]::new);
System.out.println(Arrays.toString(lefts));
System.out.println(Arrays.toString(rights));
}
}
The output will be
[foo1, foo2, foo3]
[bar1, bar2, bar3]
It does not require the stream size to be known ahead of time.
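On JDKs older than 12, a similar single-pass result can be sketched with a hand-written collector built via Collector.of. Everything named here (Fab, Split, splitting()) is hypothetical, and the int values are still boxed on the way through, so this is an illustration of the idea rather than a tuned solution.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class SplitCollectorSketch {
    // Hypothetical stand-in for the question's pair type.
    static final class Fab {
        final int fooInt;
        final String barObject;
        Fab(int fooInt, String barObject) { this.fooInt = fooInt; this.barObject = barObject; }
    }

    // Mutable accumulator holding both result lists.
    static final class Split {
        final List<Integer> foos = new ArrayList<>();
        final List<String> bars = new ArrayList<>();

        void add(Fab fab) {
            foos.add(fab.fooInt);
            bars.add(fab.barObject);
        }

        // Combines two partial results; only used when the stream runs in parallel.
        Split merge(Split other) {
            foos.addAll(other.foos);
            bars.addAll(other.bars);
            return this;
        }
    }

    static Collector<Fab, Split, Split> splitting() {
        return Collector.of(Split::new, Split::add, Split::merge);
    }

    public static void main(String[] args) {
        Split split = Arrays.asList(new Fab(1, "a"), new Fab(2, "b"), new Fab(3, "c"))
                .stream()
                .collect(splitting());
        int[] fooArray = split.foos.stream().mapToInt(Integer::intValue).toArray();
        String[] barArray = split.bars.toArray(new String[0]);
        System.out.println(Arrays.toString(fooArray)); // [1, 2, 3]
        System.out.println(Arrays.toString(barArray)); // [a, b, c]
    }
}
```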
I'd like to duplicate a Java 8 stream so that I can deal with it twice. I can collect as a list and get new streams from that;
// doSomething() returns a stream
List<A> thing = doSomething().collect(toList());
thing.stream()... // do stuff
thing.stream()... // do other stuff
But I kind of think there should be a more efficient/elegant way.
Is there a way to copy the stream without turning it into a collection?
I'm actually working with a stream of Eithers, so want to process the left projection one way before moving onto the right projection and dealing with that another way. Kind of like this (which, so far, I'm forced to use the toList trick with).
List<Either<Pair<A, Throwable>, A>> results = doSomething().collect(toList());
Stream<Pair<A, Throwable>> failures = results.stream().flatMap(either -> either.left());
failures.forEach(failure -> ... );
Stream<A> successes = results.stream().flatMap(either -> either.right());
successes.forEach(success -> ... );
I think your assumption about efficiency is kind of backwards. You get this huge efficiency payback if you're only going to use the data once, because you don't have to store it, and streams give you powerful "loop fusion" optimizations that let you flow the whole data efficiently through the pipeline.
If you want to re-use the same data, then by definition you either have to generate it twice (deterministically) or store it. If it already happens to be in a collection, great; then iterating it twice is cheap.
We did experiment in the design with "forked streams". What we found was that supporting this had real costs; it burdened the common case (use once) at the expense of the uncommon case. The big problem was dealing with "what happens when the two pipelines don't consume data at the same rate." Now you're back to buffering anyway. This was a feature that clearly didn't carry its weight.
If you want to operate on the same data repeatedly, either store it, or structure your operations as Consumers and do the following:
stream()...stuff....forEach(e -> { consumerA(e); consumerB(e); });
You might also look into the RxJava library, as its processing model lends itself better to this kind of "stream forking".
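The "structure your operations as Consumers" suggestion can be sketched roughly as follows. The helper forEachBoth and the class name are hypothetical; the point is only that both consumers see each element during a single traversal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Stream;

class TwoConsumersSketch {
    // Hypothetical helper: feeds every element to both consumers in one pass.
    static <T> void forEachBoth(Stream<T> stream,
                                Consumer<? super T> consumerA,
                                Consumer<? super T> consumerB) {
        stream.forEach(e -> {
            consumerA.accept(e);
            consumerB.accept(e);
        });
    }

    public static void main(String[] args) {
        List<String> upper = new ArrayList<>();
        List<Integer> lengths = new ArrayList<>();
        forEachBoth(Stream.of("foo", "quux"),
                s -> upper.add(s.toUpperCase()),
                s -> lengths.add(s.length()));
        System.out.println(upper);   // [FOO, QUUX]
        System.out.println(lengths); // [3, 4]
    }
}
```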
You can use a local variable with a Supplier to set up common parts of the stream pipeline.
From http://winterbe.com/posts/2014/07/31/java8-stream-tutorial-examples/:
Reusing Streams
Java 8 streams cannot be reused. As soon as you call any terminal operation the stream is closed:
Stream<String> stream = Stream.of("d2", "a2", "b1", "b3", "c")
.filter(s -> s.startsWith("a"));
stream.anyMatch(s -> true); // ok
stream.noneMatch(s -> true); // exception
Calling `noneMatch` after `anyMatch` on the same stream results in the following exception:
java.lang.IllegalStateException: stream has already been operated upon or closed
at
java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:229)
at
java.util.stream.ReferencePipeline.noneMatch(ReferencePipeline.java:459)
at com.winterbe.java8.Streams5.test7(Streams5.java:38)
at com.winterbe.java8.Streams5.main(Streams5.java:28)
To overcome this limitation we have to create a new stream chain for every terminal operation we want to execute, e.g. we could create a stream supplier to construct a new stream with all intermediate operations already set up:
Supplier<Stream<String>> streamSupplier =
() -> Stream.of("d2", "a2", "b1", "b3", "c")
.filter(s -> s.startsWith("a"));
streamSupplier.get().anyMatch(s -> true); // ok
streamSupplier.get().noneMatch(s -> true); // ok
Each call to get() constructs a new stream on which we are safe to call the desired terminal operation.
Use a Supplier to produce the stream for each terminal operation.
Supplier<Stream<Integer>> streamSupplier = () -> list.stream();
Whenever you need a stream of that collection,
use streamSupplier.get() to get a new stream.
Examples:
streamSupplier.get().anyMatch(predicate);
streamSupplier.get().allMatch(predicate2);
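A runnable sketch contrasting the two situations, with made-up names (StreamSupplierSketch, reuseFails, supplierWorks). Keep in mind that each get() builds and runs a fresh pipeline, so the source is traversed again for every terminal operation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

class StreamSupplierSketch {
    // Returns true if a second terminal operation on the same stream is rejected.
    static boolean reuseFails() {
        Stream<Integer> stream = Arrays.asList(1, 2, 3).stream();
        stream.anyMatch(i -> i > 2);
        try {
            stream.allMatch(i -> i > 0);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Each get() yields a fresh stream, so both terminal operations succeed.
    static boolean supplierWorks() {
        List<Integer> list = Arrays.asList(1, 2, 3);
        Supplier<Stream<Integer>> streamSupplier = list::stream;
        return streamSupplier.get().anyMatch(i -> i > 2)
                && streamSupplier.get().allMatch(i -> i > 0);
    }

    public static void main(String[] args) {
        System.out.println(reuseFails());    // true
        System.out.println(supplierWorks()); // true
    }
}
```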
We've implemented a duplicate() method for streams in jOOλ, an Open Source library that we created to improve integration testing for jOOQ. Essentially, you can just write:
Tuple2<Seq<A>, Seq<A>> duplicates = Seq.seq(doSomething()).duplicate();
Internally, there is a buffer storing all values that have been consumed from one stream but not from the other. That's probably as efficient as it gets if your two streams are consumed about at the same rate, and if you can live with the lack of thread-safety.
Here's how the algorithm works:
static <T> Tuple2<Seq<T>, Seq<T>> duplicate(Stream<T> stream) {
final LinkedList<T> gap = new LinkedList<>();
final Iterator<T> it = stream.iterator();
@SuppressWarnings("unchecked")
final Iterator<T>[] ahead = new Iterator[] { null };
class Duplicate implements Iterator<T> {
@Override
public boolean hasNext() {
if (ahead[0] == null || ahead[0] == this)
return it.hasNext();
return !gap.isEmpty();
}
@Override
public T next() {
if (ahead[0] == null)
ahead[0] = this;
if (ahead[0] == this) {
T value = it.next();
gap.offer(value);
return value;
}
return gap.poll();
}
}
return tuple(seq(new Duplicate()), seq(new Duplicate()));
}
More source code here
Tuple2 is probably like your Pair type, whereas Seq is Stream with some enhancements.
You could create a stream of runnables (for example):
results.stream()
.flatMap(either -> Stream.<Runnable> of(
() -> failure(either.left()),
() -> success(either.right())))
.forEach(Runnable::run);
Where failure and success are the operations to apply. This will however create quite a few temporary objects and may not be more efficient than starting from a collection and streaming/iterating it twice.
Another way to handle the elements multiple times is to use Stream.peek(Consumer):
doSomething().stream()
.peek(either -> handleFailure(either.left()))
.forEach(either -> handleSuccess(either.right()));
peek(Consumer) can be chained as many times as needed.
doSomething().stream()
.peek(element -> handleFoo(element.foo()))
.peek(element -> handleBar(element.bar()))
.peek(element -> handleBaz(element.baz()))
.forEach(element -> handleQux(element.qux()));
cyclops-react, a library I contribute to, has a static method that will allow you duplicate a Stream (and returns a jOOλ Tuple of Streams).
Stream<Integer> stream = Stream.of(1,2,3);
Tuple2<Stream<Integer>,Stream<Integer>> streams = StreamUtils.duplicate(stream);
As noted in the comments, there is a performance penalty when using duplicate on an existing Stream. A more performant alternative would be to use Streamable:
There is also a (lazy) Streamable class that can be constructed from a Stream, Iterable or Array and replayed multiple times.
Streamable<Integer> streamable = Streamable.of(1,2,3);
streamable.stream().forEach(System.out::println);
streamable.stream().forEach(System.out::println);
AsStreamable.synchronizedFromStream(stream) can be used to create a Streamable that will lazily populate its backing collection in a way that can be shared across threads. Streamable.fromStream(stream) will not incur any synchronization overhead.
For this particular problem you can also use partitioning. Something like:
// Partition the Eithers into lefts and rights
Map<Boolean, List<Either<Pair<A, Throwable>, A>>> partitioned = doSomething()
.collect(Collectors.partitioningBy(either -> either.isLeft()));
List<Either<Pair<A, Throwable>, A>> failures = partitioned.get(true); // lefts (failures)
List<Either<Pair<A, Throwable>, A>> successes = partitioned.get(false); // rights (successes)
We can make use of Stream.Builder while reading or iterating over a stream.
Here's the documentation for Stream.Builder:
https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.Builder.html
Use case
Let's say we have an employee stream and we need to use it to write employee data to an Excel file and then update the employee collection/table
[This is just a use case to show the use of Stream.Builder]:
Stream.Builder<Employee> builder = Stream.builder();
employee.forEach( emp -> {
//store employee data to excel file
// and use the same object to build the stream.
builder.add(emp);
});
//Now this stream can be used to update the employee collection
Stream<Employee> newStream = builder.build();
I had a similar problem, and could think of three different intermediate structures from which to create a copy of the stream: a List, an array and a Stream.Builder. I wrote a little benchmark program, which suggested that from a performance point of view the List was about 30% slower than the other two which were fairly similar.
The only drawback of converting to an array is that it is tricky if your element type is a generic type (which in my case it was); therefore I prefer to use a Stream.Builder.
I ended up writing a little function that creates a Collector:
private static <T> Collector<T, Stream.Builder<T>, Stream<T>> copyCollector()
{
return Collector.of(Stream::builder, Stream.Builder::add, (b1, b2) -> {
b2.build().forEach(b1);
return b1;
}, Stream.Builder::build);
}
I can then make a copy of any stream str by doing str.collect(copyCollector()) which feels quite in keeping with the idiomatic usage of streams.
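A usage sketch wrapped in a hypothetical class CopyCollectorSketch (the collector is widened from private to package-private here only so it can be called from outside the class). Note that the original stream is consumed by collect, and every element is buffered in the builder before the copy can be read.

```java
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CopyCollectorSketch {
    // Same collector as above.
    static <T> Collector<T, Stream.Builder<T>, Stream<T>> copyCollector() {
        return Collector.of(Stream::builder, Stream.Builder::add, (b1, b2) -> {
            b2.build().forEach(b1); // drain the second builder into the first
            return b1;
        }, Stream.Builder::build);
    }

    public static void main(String[] args) {
        Stream<String> str = Stream.of("a", "b", "c");
        // str is consumed here; copy is a new, independent stream over the buffered elements.
        Stream<String> copy = str.collect(copyCollector());
        List<String> elements = copy.collect(Collectors.toList());
        System.out.println(elements); // [a, b, c]
    }
}
```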