Similar questions have been asked, here and here, but given the advent of Java 8 and the generally outdated nature of those questions, I'm wondering if there is now something at least kindred to it?
This is what I'm referring to.
You can use a lambda and Stream.reduce; there is a page in the docs dedicated to reductions:
Integer totalAgeReduce = roster
    .stream()
    .map(Person::getAge)
    .reduce(
        0,
        (a, b) -> a + b);
This is the example used in the Python docs implemented with Java 8 streams:
List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5);
Optional<Integer> sum = numbers.stream().reduce((a, b) -> a + b);
System.out.println(sum.get());
The Stream.reduce Method
The Stream.reduce method is a general-purpose reduction operation. Consider the following pipeline, which calculates the sum of the members' ages in the collection roster. It uses the IntStream.sum reduction operation:
Integer totalAge = roster
    .stream()
    .mapToInt(Person::getAge)
    .sum();
Compare this with the following pipeline, which uses the Stream.reduce operation to calculate the same value:
Integer totalAgeReduce = roster
    .stream()
    .map(Person::getAge)
    .reduce(
        0,
        (a, b) -> a + b);
The reduce operation in this example takes two arguments:
identity: The identity element is both the initial value of the reduction and the default result if there are no elements in the stream. In this example, the identity element is 0; this is the initial value of the sum of ages and the default value if no members exist in the collection roster.
accumulator: The accumulator function takes two parameters: a partial result of the reduction (in this example, the sum of all processed integers so far) and the next element of the stream (in this example, an integer). It returns a new partial result. In this example, the accumulator function is a lambda expression that adds two Integer values and returns an Integer value:
(a, b) -> a + b
The reduce operation always returns a new value. However, the accumulator function also returns a new value every time it processes an element of a stream. Suppose that you want to reduce the elements of a stream to a more complex object, such as a collection. This might hinder the performance of your application. If your reduce operation involves adding elements to a collection, then every time your accumulator function processes an element, it creates a new collection that includes the element, which is inefficient. It would be more efficient for you to update an existing collection instead. You can do this with the Stream.collect method, which the next section describes.
The official Oracle tutorial describes how Stream.reduce works. Please have a look; I believe it will answer your query.
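To illustrate the tutorial's last point, here is a hedged sketch (assuming the roster/Person types from the tutorial above) contrasting the copy-per-element reduce with the mutating collect:

List<Integer> ages = roster
    .stream()
    .reduce(new ArrayList<Integer>(),
            (list, p) -> {
                List<Integer> copy = new ArrayList<>(list); // copies the partial list at every step
                copy.add(p.getAge());
                return copy;
            },
            (l1, l2) -> {
                List<Integer> merged = new ArrayList<>(l1);
                merged.addAll(l2);
                return merged;
            });

List<Integer> ages2 = roster
    .stream()
    .map(Person::getAge)
    .collect(Collectors.toList()); // mutates a container instead of copying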
Related
Stream.reduce has 3 method overloads.
reduce(BinaryOperator<T> accumulator)
reduce(T identity, BinaryOperator<T> accumulator)
reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)
The 1st overload can be used to calculate the sum of a list of integers, for example; it returns an Optional, which is empty if the stream is empty.
The 2nd overload is the same, but if the stream is empty it just returns the identity value as a default.
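For instance (a quick sketch; empty is a hypothetical list used only for illustration):

List<Integer> empty = Arrays.asList();

Optional<Integer> maybeSum = empty.stream().reduce(Integer::sum); // 1st overload: Optional.empty()
int sum = empty.stream().reduce(0, Integer::sum);                 // 2nd overload: falls back to the identity, 0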
I'm having a hard time understanding how the third overload (Stream.reduce(identity, accumulator, combiner)) works and what its use case is. So, how does it work, and why does it exist?
If I understand correctly, your question is about the third argument combiner.
Firstly, one of the design goals of the Java Streams API was to have the same API for sequential and parallel streams.
The 3-argument version of reduce is useful for parallel streams.
Suppose you are reducing a Collection<T> to a value of type U and you are using the parallel stream version. The parallel stream splits the source into smaller streams and produces a partial u' value for each by applying the second function. But now these different u' values have to be combined. How do they get combined? The third function is the one that provides that logic.
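A minimal sketch of that, assuming a small list of strings:

List<String> words = Arrays.asList("alpha", "beta", "gamma", "delta");

// Each parallel segment reduces its own elements to a partial int (u')
// with the accumulator; the combiner then merges the partial results.
int totalLength = words.parallelStream()
    .reduce(0,
            (partial, word) -> partial + word.length(), // accumulator: (U, T) -> U
            Integer::sum);                              // combiner:    (U, U) -> U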
Basically it combines a mapping function with a reduction. Most of the examples I've seen for this don't really demonstrate why it's preferable to calling map() and a normal reduce() in separate steps. The API Note comes in handy here:
Many reductions using this form can be represented more simply by an explicit combination of map and reduce operations. The accumulator function acts as a fused mapper and accumulator, which can sometimes be more efficient than separate mapping and reduction, such as when knowing the previously reduced value allows you to avoid some computation.
So let's say we have a Stream<String> numbers, and we want to parse them to BigDecimal and calculate their product. We could do something like this:
BigDecimal product = numbers.map(BigDecimal::new)
    .reduce(BigDecimal.ONE, BigDecimal::multiply);
But this has an inefficiency. If one of the numbers is "0", we're wasting cycles converting the remainder to BigDecimal. We can use the 3-arg reduce() here to bypass the mapping logic:
BigDecimal product = numbers.reduce(BigDecimal.ONE,
    (d, n) -> d.equals(BigDecimal.ZERO) ? BigDecimal.ZERO : new BigDecimal(n).multiply(d),
    BigDecimal::multiply);
Of course it would be even more efficient to short-circuit the stream entirely, but that's tricky to do in a stream, especially in parallel. And this is just an example to get the concept across.
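For completeness, here is one hedged way to get that short-circuit when the source is a List<String> rather than a Stream (an assumption, since a one-pass stream can't be consumed twice):

List<String> numbers = Arrays.asList("4", "0", "7");

// Check for a literal "0" up front and skip the conversions entirely if found.
BigDecimal product = numbers.contains("0")
    ? BigDecimal.ZERO
    : numbers.stream().map(BigDecimal::new).reduce(BigDecimal.ONE, BigDecimal::multiply);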
Note: Some of the examples are contrived for demonstration. In some instances a simple .sum() could have been used.
The big difference, imo, is that the third form has a BiFunction as a second argument instead of a BinaryOperator. So you can use the third form to change the result type. It also has a BinaryOperator as a combiner to combine the different results from parallel operations.
Generate some data
record Data(String name, int value) {}
Random r = new Random();
List<Data> dataList = r.ints(1000, 1, 20).mapToObj(i -> new Data("Item" + i, i)).toList();
No parallel operation here, but different types. The third argument is still required, so it just returns the sum unchanged.
int sum = dataList.stream().reduce(0, (item, data) -> item + data.value,
        (finalSum, partialSum) -> finalSum);
System.out.println(sum);
prints
10162
The second form. Use map to get the value to be summed. A BinaryOperator is used here since the types are the same and there is no parallel operation.
sum = dataList.stream().map(Data::value).reduce(0, (sum1, val) -> sum1 + val);
System.out.println(sum); // prints same as above
This shows the same as above but in parallel. The third argument accumulates the partial sums, and those sums are combined as each thread finishes, so there may not be a sensible order to the output.
sum = dataList.parallelStream().reduce(0, (sum1, data) -> sum1 + data.value,
        (finalSum, partialSum) -> {
            System.out.println("Adding " + partialSum + " to " + finalSum);
            finalSum += partialSum;
            return finalSum;
        });
System.out.println(sum);
prints something like the following
Adding 586 to 670
Adding 567 to 553
Adding 1256 to 1120
Adding 715 to 620
Adding 624 to 601
Adding 1335 to 1225
Adding 2560 to 2376
Adding 662 to 579
Adding 706 to 715
Adding 1421 to 1241
Adding 713 to 689
Adding 576 to 586
Adding 1402 to 1162
Adding 2662 to 2564
Adding 4936 to 5226
10162
One final note: none of the Collectors.reducing methods takes a BiFunction to handle different types. To handle this, the second argument is a Function that acts as a mapper, so that the third argument, a BinaryOperator, can combine the mapped values.
sum = dataList.parallelStream().collect(
        Collectors.reducing(0, Data::value, (finalSum, partialSum) -> {
            System.out.println("Adding " + partialSum + " to " + finalSum);
            return finalSum + partialSum;
        }));
System.out.println(sum);
I'm having trouble fully understanding the role that the combiner fulfills in Stream's reduce method.
For example, the following code doesn't compile:
int length = asList("str1", "str2").stream()
    .reduce(0, (accumulatedInt, str) -> accumulatedInt + str.length());
The compile error says:
(argument mismatch; int cannot be converted to java.lang.String)
but this code does compile:
int length = asList("str1", "str2").stream()
    .reduce(0, (accumulatedInt, str) -> accumulatedInt + str.length(),
        (accumulatedInt, accumulatedInt2) -> accumulatedInt + accumulatedInt2);
I understand that the combiner method is used in parallel streams - so in my example it is adding together two intermediate accumulated ints.
But I don't understand why the first example doesn't compile without the combiner or how the combiner is solving the conversion of string to int since it is just adding together two ints.
Can anyone shed light on this?
Eran's answer described the differences between the two-arg and three-arg versions of reduce in that the former reduces Stream<T> to T whereas the latter reduces Stream<T> to U. However, it didn't actually explain the need for the additional combiner function when reducing Stream<T> to U.
One of the design principles of the Streams API is that the API shouldn't differ between sequential and parallel streams, or put another way, a particular API shouldn't prevent a stream from running correctly either sequentially or in parallel. If your lambdas have the right properties (associative, non-interfering, etc.) a stream run sequentially or in parallel should give the same results.
Let's first consider the two-arg version of reduction:
T reduce(I, (T, T) -> T)
The sequential implementation is straightforward. The identity value I is "accumulated" with the zeroth stream element to give a result. This result is accumulated with the first stream element to give another result, which in turn is accumulated with the second stream element, and so forth. After the last element is accumulated, the final result is returned.
The parallel implementation starts off by splitting the stream into segments. Each segment is processed by its own thread in the sequential fashion I described above. Now, if we have N threads, we have N intermediate results. These need to be reduced down to one result. Since each intermediate result is of type T, and we have several, we can use the same accumulator function to reduce those N intermediate results down to a single result.
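To make that concrete, here is an illustrative sketch (not the actual implementation) of two segments being reduced and then merged with the very same accumulator function:

BinaryOperator<Integer> accumulator = Integer::sum;
List<Integer> elements = Arrays.asList(1, 2, 3, 4);

// Two segments, each reduced sequentially starting from the identity 0:
int left = accumulator.apply(accumulator.apply(0, elements.get(0)), elements.get(1));
int right = accumulator.apply(accumulator.apply(0, elements.get(2)), elements.get(3));

// The same (T, T) -> T function merges the two intermediate results:
int result = accumulator.apply(left, right); // 10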
Now let's consider a hypothetical two-arg reduction operation that reduces Stream<T> to U. In other languages, this is called a "fold" or "fold-left" operation so that's what I'll call it here. Note this doesn't exist in Java.
U foldLeft(I, (U, T) -> U)
(Note that the identity value I is of type U.)
The sequential version of foldLeft is just like the sequential version of reduce except that the intermediate values are of type U instead of type T. But it's otherwise the same. (A hypothetical foldRight operation would be similar except that the operations would be performed right-to-left instead of left-to-right.)
Now consider the parallel version of foldLeft. Let's start off by splitting the stream into segments. We can then have each of the N threads reduce the T values in its segment into N intermediate values of type U. Now what? How do we get from N values of type U down to a single result of type U?
What's missing is another function that combines the multiple intermediate results of type U into a single result of type U. If we have a function that combines two U values into one, that's sufficient to reduce any number of values down to one -- just like the original reduction above. Thus, the reduction operation that gives a result of a different type needs two functions:
U reduce(I, (U, T) -> U, (U, U) -> U)
Or, using Java syntax:
<U> U reduce(U identity, BiFunction<U,? super T,U> accumulator, BinaryOperator<U> combiner)
In summary, to do parallel reduction to a different result type, we need two functions: one that accumulates T elements to intermediate U values, and a second that combines the intermediate U values into a single U result. If we aren't switching types, it turns out that the accumulator function is the same as the combiner function. That's why reduction to the same type has only the accumulator function and reduction to a different type requires separate accumulator and combiner functions.
Finally, Java doesn't provide foldLeft and foldRight operations because they imply a particular ordering of operations that is inherently sequential. This clashes with the design principle stated above of providing APIs that support sequential and parallel operation equally.
Since I like doodles and arrows to clarify concepts... let's start!
From String to String (sequential stream)
Suppose you have 4 strings: your goal is to concatenate them into one. You basically start with a type and finish with the same type.
You can achieve this with
String res = Arrays.asList("one", "two", "three", "four")
    .stream()
    .reduce("",
        (accumulatedStr, str) -> accumulatedStr + str); //accumulator
and this helps you to visualize what's happening:
The accumulator function converts, step by step, the elements in your (red) stream to the final reduced (green) value. The accumulator function simply transforms a String object into another String.
From String to int (parallel stream)
Suppose you have the same 4 strings: your new goal is to sum their lengths, and you want to parallelize your stream.
What you need is something like this:
int length = Arrays.asList("one", "two", "three", "four")
    .parallelStream()
    .reduce(0,
        (accumulatedInt, str) -> accumulatedInt + str.length(), //accumulator
        (accumulatedInt, accumulatedInt2) -> accumulatedInt + accumulatedInt2); //combiner
and this is a scheme of what's happening
Here the accumulator function (a BiFunction) allows you to transform your String data into int data. Since the stream is parallel, it is split into two (red) parts, each of which is processed independently of the other and produces just as many partial (orange) results. Defining a combiner is needed to provide a rule for merging the partial int results into the final (green) one.
From String to int (sequential stream)
What if you don't want to parallelize your stream? Well, a combiner needs to be provided anyway, but it will never be invoked, given that no partial results will be produced.
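You can verify this with a quick sketch: on a sequential stream, the println inside the combiner below never fires (observed behavior of current JDK implementations; the combiner must still be supplied to satisfy the signature):

int length = Arrays.asList("one", "two", "three", "four")
    .stream()
    .reduce(0,
        (acc, str) -> acc + str.length(),         // accumulator
        (a, b) -> {                               // combiner
            System.out.println("combiner called"); // never printed sequentially
            return a + b;
        });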
The two and three argument versions of reduce which you tried to use don't accept the same type for the accumulator.
The two argument reduce is defined as:
T reduce(T identity,
         BinaryOperator<T> accumulator)
In your case, T is String, so BinaryOperator<T> should accept two String arguments and return a String. But you pass to it an int and a String, which results in the compilation error you got - argument mismatch; int cannot be converted to java.lang.String. Actually, I think passing 0 as the identity value is also wrong here, since a String is expected (T).
Also note that this version of reduce processes a stream of Ts and returns a T, so you can't use it to reduce a stream of String to an int.
The three argument reduce is defined as:
<U> U reduce(U identity,
             BiFunction<U,? super T,U> accumulator,
             BinaryOperator<U> combiner)
In your case U is Integer and T is String, so this method will reduce a stream of String to an Integer.
For the BiFunction<U,? super T,U> accumulator you can pass parameters of two different types (U and ? super T), which in your case are Integer and String. In addition, the identity value U accepts an Integer in your case, so passing it 0 is fine.
Another way to achieve what you want:
int length = asList("str1", "str2").stream().mapToInt(s -> s.length())
    .reduce(0, (accumulatedInt, len) -> accumulatedInt + len);
Here the type of the stream matches the return type of reduce, so you can use the two parameter version of reduce.
Of course you don't have to use reduce at all:
int length = asList("str1", "str2").stream().mapToInt(s -> s.length())
    .sum();
There is no reduce version that takes two different types without a combiner, since such a reduction can't be executed in parallel (I'm not sure why this is a requirement). The fact that the accumulator must be associative makes this interface pretty much useless, since:
list.stream().reduce(identity,
                     accumulator,
                     combiner);
Produces the same results as:
list.stream().map(t -> accumulator.apply(identity, t))
    .reduce(identity, combiner);
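A concrete instance of that equivalence, as a sketch:

List<String> list = Arrays.asList("a", "bb", "ccc");

// Three-arg reduce, changing the result type from String to Integer:
int viaReduce = list.stream()
    .reduce(0, (sum, s) -> sum + s.length(), Integer::sum);

// The same result as an explicit map followed by the two-arg reduce:
int viaMapReduce = list.stream()
    .map(s -> 0 + s.length())   // accumulator applied to (identity, element)
    .reduce(0, Integer::sum);   // the combiner finishes the reduction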
I was wondering how stream().toArray(x -> new Integer[x]) knows what size of array to create. I wrote a snippet in which I created a list of Integers of size 4 and filtered the values, and it created an array with the length of the filtered stream. I could not see any method on Stream to get the size of the stream.
List<Integer> intList = new ArrayList<Integer>();
intList.add(1);
intList.add(2);
intList.add(3);
intList.add(4);
Integer[] array = intList.stream()
    .filter(x -> x > 2)
    .toArray(x -> {
        System.out.println("x --> " + x);
        return new Integer[x];
    });
System.out.println("array length: " + array.length);
Output of above code:
x --> 2
array length: 2
Initially, the snippet was like this:
Integer[] array = intList.stream()
    .filter(x -> x > 2)
    .toArray(x -> new Integer[x]);
Just to understand what value of x it passes, I had to change it to print x in the lambda.
Of course, this is implementation dependent. For some streams, the size is predictable, if the source has a known size and no size-changing intermediate operation is involved. Since you are using a filter operation, this doesn't apply; however, there is an estimated size, based on the unfiltered count.
Now, the Stream implementation simply allocates a temporary buffer, either using the estimated size or a default size with support for increasing the capacity, if necessary, and copies the data into the destination array, created by your function, in a final step.
The intermediate buffers could be created via the supplied function, which is the reason why the documentation states “…using the provided generator function to allocate the returned array, as well as any additional arrays that might be required for a partitioned execution or for resizing” and I vaguely remember seeing such a behavior in early versions. However, the current implementation just uses Object[] arrays (or Object[][] in a “spined buffer”) for intermediate storage and uses the supplied function only for creating the final array. Therefore, you can’t observe intermediate array creation with the function, given this specific JRE implementation.
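You can observe the predictable-size case with a small sketch (again, implementation-dependent behavior rather than a guarantee):

// With a known source size and no size-changing operations, the generator
// receives the exact element count once:
Integer[] all = Stream.of(1, 2, 3, 4)
    .toArray(n -> {
        System.out.println("generator called with n = " + n); // prints: n = 4
        return new Integer[n];
    });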
The thing is: toArray is a terminal operation. It happens at the end, when the stream has been processed, meaning the "final" count is known by then, as there are no more operations that could remove or add values to the stream!
Simply look at Java's Stream documentation for toArray.
<A> A[] toArray(IntFunction<A[]> generator)
Returns an array containing the elements of this stream, using the provided generator function to allocate the returned array, as well as any additional arrays that might be required for a partitioned execution or for resizing.
This is a terminal operation.
API Note:
The generator function takes an integer, which is the size of the desired array, and produces an array of the desired size. This can be concisely expressed with an array constructor reference
Therefore toArray does give you the desired array size as a parameter, and you are responsible for allocating a correctly sized array, at least when using this method. This method is a terminal operation, so the size calculation is done within the internals of the Stream API.
IMHO it is easier to grasp if you name your lambda parameters differently for filter and toArray.
Integer[] array = intList.stream()
    .filter(myint -> myint > 2)
    .toArray(desiredArraySize -> new Integer[desiredArraySize]);
I want to sum a list of Integers. It works as follows, but the syntax does not feel right. Could the code be optimized?
Map<String, Integer> integers;
integers.values().stream().mapToInt(i -> i).sum();
This will work, but the i -> i is doing some automatic unboxing which is why it "feels" strange. mapToInt converts the stream to an IntStream "of primitive int-valued elements". Either of the following will work and better explain what the compiler is doing under the hood with your original syntax:
integers.values().stream().mapToInt(i -> i.intValue()).sum();
integers.values().stream().mapToInt(Integer::intValue).sum();
I suggest 2 more options:
integers.values().stream().mapToInt(Integer::intValue).sum();
integers.values().stream().collect(Collectors.summingInt(Integer::intValue));
The second one uses the Collectors.summingInt() collector; there is also a summingLong() collector, which you would use with mapToLong.
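For instance (a sketch, assuming the integers map from the question):

// Summing into a long avoids int overflow for large value sets:
long total = integers.values().stream()
    .collect(Collectors.summingLong(Integer::longValue));

// The equivalent primitive-stream form:
long total2 = integers.values().stream().mapToLong(Integer::longValue).sum();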
And a third option: Java 8 introduces a very effective LongAdder accumulator, designed to speed up summarizing in parallel streams and multi-thread environments. Here's an example use:
LongAdder a = new LongAdder();
integers.values().parallelStream().forEach(a::add);
int sum = a.intValue();
From the docs
Reduction operations
A reduction operation (also called a fold) takes a sequence of input elements and combines them into a single summary result by repeated application of a combining operation, such as finding the sum or maximum of a set of numbers, or accumulating elements into a list. The streams classes have multiple forms of general reduction operations, called reduce() and collect(), as well as multiple specialized reduction forms such as sum(), max(), or count().
Of course, such operations can be readily implemented as simple sequential loops, as in:
int sum = 0;
for (int x : numbers) {
    sum += x;
}
However, there are good reasons to prefer a reduce operation over a mutative accumulation such as the above. Not only is a reduction "more abstract" -- it operates on the stream as a whole rather than individual elements -- but a properly constructed reduce operation is inherently parallelizable, so long as the function(s) used to process the elements are associative and stateless. For example, given a stream of numbers for which we want to find the sum, we can write:
int sum = numbers.stream().reduce(0, (x,y) -> x+y);
or:
int sum = numbers.stream().reduce(0, Integer::sum);
These reduction operations can run safely in parallel with almost no modification:
int sum = numbers.parallelStream().reduce(0, Integer::sum);
So, for a map you would use:
integers.values().stream().mapToInt(i -> i).reduce(0, (x,y) -> x+y);
Or:
integers.values().stream().reduce(0, Integer::sum);
You can use the reduce method:
long sum = result.stream().map(e -> e.getCreditAmount()).reduce(0L, (x, y) -> x + y);
or
long sum = result.stream().map(e -> e.getCreditAmount()).reduce(0L, Long::sum);
You can use reduce() to sum a list of integers.
int sum = integers.values().stream().reduce(0, Integer::sum);
You can use the collect method to sum a list of integers.
List<Integer> list = Arrays.asList(2, 4, 5, 6);
int sum = list.stream().collect(Collectors.summingInt(Integer::intValue));
I have declared a list of Integers.
ArrayList<Integer> numberList = new ArrayList<Integer>(Arrays.asList(1, 2, 3, 4, 5));
You can try using these different ways below.
Using mapToInt
int sum = numberList.stream().mapToInt(Integer::intValue).sum();
Using summarizingInt
int sum = numberList.stream().collect(Collectors.summarizingInt(Integer::intValue)).getSum();
Using reduce
int sum = numberList.stream().reduce(Integer::sum).get().intValue();
This may help those who have objects in the list.
If you have a list of objects and want to sum a specific field of those objects, use the below.
List<ResultSom> somList = MyUtil.getResultSom();
BigDecimal result = somList.stream().map(ResultSom::getNetto)
    .reduce(BigDecimal.ZERO, BigDecimal::add);
This would be the shortest way to sum up an int array (for a long array use LongStream, for a double array DoubleStream, and so forth). Not all the primitive integer or floating point types have a Stream implementation, though.
IntStream.of(integers).sum();
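To illustrate (a sketch; only int, long, and double get dedicated stream types):

int[] ints = {1, 2, 3};
long[] longs = {1L, 2L, 3L};
double[] doubles = {1.0, 2.0, 3.0};

int sumInts = IntStream.of(ints).sum();             // 6
long sumLongs = LongStream.of(longs).sum();         // 6
double sumDoubles = DoubleStream.of(doubles).sum(); // 6.0
// byte[], short[], char[], and float[] have no dedicated stream type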
Unfortunately, it looks like the Stream API only returns normal streams from, say, List<Integer>#stream(). I guess they're pretty much forced to because of how generics work.
These normal Streams are of generic objects, so they don't have specialized methods like sum(), etc., so you have to use the weird re-stream "looks like a no-op" conversion by default to get to those methods... .mapToInt(i -> i).
Another option is using "Eclipse Collections", which is like an expanded Java Stream API:
IntLists.immutable.ofAll(integers.values()).sum();
There is one more option no one has considered here, and it takes advantage of a multi-core environment. If you want to use that advantage, then the following code should be used instead of the other mentioned solutions:
int sum = integers.values().parallelStream()
    .reduce(0, Integer::sum, Integer::sum);
This solution is similar to the other ones, but please notice the third argument to reduce. It tells the compiler what to do with the partial sums calculated in different chunks of the stream, by different threads. Also, parallelStream() is used instead of stream(). In this case the combiner just sums the partial results. An equivalent third argument is (i, j) -> i + j, which adds the value of a stream chunk (j) to the current value (i) and uses that as the current value for the next stream chunk, until all partial results are processed.
Even when using a plain stream(), it is useful to tell reduce what to do with the chunks' partial sums, in case someone, or you, would like to parallelize it in the future. The initial development is the best time for that, since later on you need to remember what this is supposed to be and spend some time understanding the purpose of that code again.
And of course, instead of a method reference you can use a different dialect of lambda. I prefer it this way, as it is more compact and still easily readable.
Also remember this can be used for more complex calculations too, but always be aware that there are no guarantees about the sequence and the assignment of stream elements to threads.
IntStream.of(1, 2, 23).sum(); // sum of the given ints: 26
IntStream.of(1, 2, 23, 1, 2, 23, 1, 2, 23).max().getAsInt(); // maximum of the given ints: 23