Library method to partition a collection by a predicate

I have a collection of objects that I would like to partition into two collections, one of which passes a predicate and one of which fails a predicate. I was hoping there would be a Guava method to do this, but the closest they come is filter, which doesn't give me the other collection.
I would imagine the signature of the method would be something like this:
public static <E> Pair<Collection<E>, Collection<E>> partition(Collection<E> source, Predicate<? super E> predicate)
I realize this is super fast to code myself, but I'm looking for an existing library method that does what I want.

Use Guava's Multimaps.index.
Here is an example, which partitions a list of words into two parts: those which have length > 3 and those that don't.
List<String> words = Arrays.asList("foo", "bar", "hello", "world");
ImmutableListMultimap<Boolean, String> partitionedMap = Multimaps.index(words, new Function<String, Boolean>(){
@Override
public Boolean apply(String input) {
return input.length() > 3;
}
});
System.out.println(partitionedMap);
prints:
{false=[foo, bar], true=[hello, world]}

With the new Java 8 features (streams and lambda expressions), you could write:
List<String> words = Arrays.asList("foo", "bar", "hello", "world");
Map<Boolean, List<String>> partitionedMap =
words.stream().collect(
Collectors.partitioningBy(word -> word.length() > 3));
System.out.println(partitionedMap);
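To get close to the signature asked for in the question, the collector can be wrapped in a small helper. This is only a minimal sketch: the class name PartitionSketch and the helper partition are made up for illustration, and Map.Entry stands in for Pair because the JDK has no Pair type.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class PartitionSketch {
    // Hypothetical helper: key = elements that pass, value = elements that fail.
    static <E> Map.Entry<List<E>, List<E>> partition(
            Collection<E> source, Predicate<? super E> predicate) {
        Map<Boolean, List<E>> parts = source.stream()
                .collect(Collectors.partitioningBy(predicate));
        // partitioningBy creates both keys, so neither get() returns null.
        return new SimpleImmutableEntry<>(parts.get(true), parts.get(false));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("foo", "bar", "hello", "world");
        Map.Entry<List<String>, List<String>> result =
                partition(words, w -> w.length() > 3);
        System.out.println(result.getKey());   // [hello, world]
        System.out.println(result.getValue()); // [foo, bar]
    }
}
```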

If you're using Eclipse Collections (formerly GS Collections), you can use the partition method on all RichIterables.
MutableList<Integer> integers = FastList.newListWith(-3, -2, -1, 0, 1, 2, 3);
PartitionMutableList<Integer> result = integers.partition(IntegerPredicates.isEven());
Assert.assertEquals(FastList.newListWith(-2, 0, 2), result.getSelected());
Assert.assertEquals(FastList.newListWith(-3, -1, 1, 3), result.getRejected());
The reason for using a custom type, PartitionMutableList, instead of Pair is to allow covariant return types for getSelected() and getRejected(). For example, partitioning a MutableCollection gives two collections instead of lists.
MutableCollection<Integer> integers = ...;
PartitionMutableCollection<Integer> result = integers.partition(IntegerPredicates.isEven());
MutableCollection<Integer> selected = result.getSelected();
If your collection isn't a RichIterable, you can still use the static utility in Eclipse Collections.
PartitionIterable<Integer> partitionIterable = Iterate.partition(integers, IntegerPredicates.isEven());
PartitionMutableList<Integer> partitionList = ListIterate.partition(integers, IntegerPredicates.isEven());
Note: I am a committer for Eclipse Collections.

This seems like a good job for the new Java 12 Collectors::teeing:
var dividedStrings = Stream.of("foo", "hello", "bar", "world")
.collect(Collectors.teeing(
Collectors.filtering(s -> s.length() <= 3, Collectors.toList()),
Collectors.filtering(s -> s.length() > 3, Collectors.toList()),
List::of
));
System.out.println(dividedStrings.get(0)); //[foo, bar]
System.out.println(dividedStrings.get(1)); //[hello, world]
You can find more examples here.

Apache Commons Collections IterableUtils provides methods for partitioning Iterable objects based on one or more predicates. (Look for the partition(...) methods.)

Note that when there is a limited set of partition keys known in advance, it may be much more efficient to simply iterate over the collection once per partition key, skipping all items with a different key on each pass, because this does not allocate many new objects for the garbage collector.
LocalDate start = LocalDate.now().with(TemporalAdjusters.firstDayOfYear());
LocalDate endExclusive = LocalDate.now().plusYears(1);
List<LocalDate> daysCollection = Stream.iterate(start, date -> date.plusDays(1))
.limit(ChronoUnit.DAYS.between(start, endExclusive))
.collect(Collectors.toList());
List<DayOfWeek> keys = Arrays.asList(DayOfWeek.values());
for (DayOfWeek key : keys) {
int count = 0;
for (LocalDate day : daysCollection) {
if (key == day.getDayOfWeek()) {
++count;
}
}
System.out.println(String.format("%s: %d days in this year", key, count));
}
Another approach that is both GC-friendly and encapsulated is to wrap the original collection in Java 8 filtering streams:
List<AbstractMap.SimpleEntry<DayOfWeek, Stream<LocalDate>>> partitions = keys.stream().map(
key -> new AbstractMap.SimpleEntry<>(
key, daysCollection.stream().filter(
day -> key == day.getDayOfWeek())))
.collect(Collectors.toList());
// partitions could be passed somewhere before being used
partitions.forEach(pair -> System.out.println(
String.format("%s: %d days in this year", pair.getKey(), pair.getValue().count())));
Both snippets print this:
MONDAY: 57 days in this year
TUESDAY: 57 days in this year
WEDNESDAY: 57 days in this year
THURSDAY: 57 days in this year
FRIDAY: 56 days in this year
SATURDAY: 56 days in this year
SUNDAY: 56 days in this year

Related

Filter List for unique elements

I am searching for an elegant way to filter a list for only the elements that are unique. An example:
[1, 2, 2, 3, 1, 4]
-> [3, 4] // 1 and 2 occur more than once
Most solutions I found manually compute the occurrences of all elements and then filter by the elements that have exactly one occurrence.
That does not sound too elegant to me, maybe there is a better solution, a best practice or a name for a data-structure that solves this already? I was also thinking about maybe utilizing streams, but I do not know how.
Note that I am not asking for duplicate removal, i.e. [1, 2, 3, 4] but for keeping only the unique elements, so [3, 4].
The order of the resulting list or what type of Collection exactly does not matter to me.

I doubt there is a better approach than actually counting and filtering for the ones that only appeared once. At least, all approaches I can think of will use something similar to that under the hood.
Also, it is not clear what you mean by elegant: readability or performance? So I will just dump some approaches.
Stream counting
Here is a stream-variant that computes number of occurrences (Map) and then filters for elements that appear only once. It is essentially the same as what you described already, or what Bags do under the hood:
List<E> result = elements.stream() // Stream<E>
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting())) // Map<E, Long>
.entrySet() // Set<Entry<E, Long>>
.stream() // Stream<Entry<E, Long>>
.filter(entry -> entry.getValue() == 1)
.map(Entry::getKey)
.collect(Collectors.toList());
It requires two full iterations over the data-set. Since it uses the Stream-API, the operations support multi-threading right from the get-go though. So if you have lots of elements, this might be pretty fast due to that.
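If the multi-threading remark is taken literally, the counting step could be run on a parallel stream. The sketch below is an illustration under assumptions, not a benchmark: the class and method names are made up, the elements must be non-null, and whether it is actually faster depends on the size of the input.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class ParallelSingletons {
    // Hypothetical helper: returns the elements that occur exactly once.
    static <E> Set<E> singletons(Collection<E> elements) {
        // First pass: count occurrences concurrently into one shared map.
        ConcurrentMap<E, Long> counts = elements.parallelStream()
                .collect(Collectors.groupingByConcurrent(
                        Function.identity(), Collectors.counting()));
        // Second pass: keep only the keys whose count is exactly 1.
        return counts.entrySet().stream()
                .filter(entry -> entry.getValue() == 1L)
                .map(Map.Entry::getKey)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Prints a set containing 3 and 4.
        System.out.println(singletons(Arrays.asList(1, 2, 2, 3, 1, 4)));
    }
}
```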
Manual Set
Here is another approach that reduces iteration and lookup time by manually collecting into a Set to identify duplicates as fast as possible:
Set<E> result = new HashSet<>();
Set<E> appeared = new HashSet<>();
for (E element : elements) {
if (result.contains(element)) { // 2nd occurrence
result.remove(element);
appeared.add(element);
continue;
}
if (appeared.contains(element)) { // >2nd occurrence
continue;
}
result.add(element); // 1st occurrence
}
As you see, this only requires one iteration over the elements instead of multiple.
This approach is elegant in a sense that it does not compute unnecessary information. For what you want, it is completely irrelevant to compute how often exactly elements appear. We only care for "does it appear once or more often?" and not if it appears 5 times or 11 times.

You can use a Bag to count occurrences; an element is unique when its getCount(...) is 1.
Bag is a collection that allows storing multiple items along with their repetition count:
public void whenAdded_thenCountIsKept() {
Bag<Integer> bag = new HashBag<>(
Arrays.asList(1, 2, 3, 3, 3, 1, 4));
assertThat(2, equalTo(bag.getCount(1)));
}
Or CollectionBag
Apache Commons Collections also provides a decorator called CollectionBag, which makes a bag compliant with the Java Collection contract.
To get the set of distinct elements:
bag.uniqueSet();
This returns a Set containing each distinct element of the Bag once. To keep only the elements that occur exactly once, filter that set by getCount(element) == 1.

You first need to collect everything; only once the end is reached can the groups with more than one element be dropped.
Map<String, Long> map = Stream.of("a", "b", "a", "a", "c", "d", "c")
.collect(Collectors.groupingBy(Function.identity(),
Collectors.counting()));
map.entrySet()
.stream()
.filter(e -> e.getValue() == 1L)
.map(e -> e.getKey())
.forEach(System.out::println);
Or in one go:
Stream.of("a", "b", "a", "a", "c", "d", "c")
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet()
.stream()
.filter(e -> e.getValue() == 1L)
.map(e -> e.getKey())
.forEach(System.out::println);

The idea of using a map to accumulate frequency counts sounds like a good one: it runs in roughly linear (O(n)) time and only requires O(n) extra space.
Here's an algorithm that requires zero extra space, at the expense of running in O(n^2) time:
public static <T> void retainSingletons(List<T> list)
{
int i = 0;
while (i < list.size()) {
boolean foundDup = false;
int j = i + 1;
while (j < list.size()) {
if (list.get(i).equals(list.get(j))) {
list.remove(j);
foundDup = true;
} else {
++j;
}
}
if (foundDup) {
list.remove(i);
} else {
++i;
}
}
}
The idea is straightforward: step a slow pointer, i, over the list until it runs off the end; for each value of i, run a fast pointer j from i+1 until the end of the list, removing any list[j] that's a duplicate of list[i]; after j runs out, if any duplicates of list[i] were found and removed, also remove list[i].

The following will work using Eclipse Collections:
IntList list = IntLists.mutable.with(1, 2, 2, 3, 1, 4);
IntSet unique = list.toBag().selectUnique();
System.out.println(unique);
Using an IntList removes the need to box the int values as Integer objects.
Note: I am a committer for Eclipse Collections.

Find missing integer in a sequential sorted stream

Let's say I have a list
ArrayList<String> arr = new ArrayList(Arrays.asList("N1", "N2", "N3", "N5"));
How do I find "N4"? That is, how do I find that the missing integer is 4?
What I've tried so far
Integer missingID = arr.stream().map(p -> Integer.parseInt(p.substring(1))).sorted()
.reduce((p1, p2) -> (p2 - p1) > 1 ? p1 + 1 : 0).get();
This doesn't work because reduce is not intended to work the way I need in this situation; actually, I have no idea how to do that.
If there's no missing number, then the next must be "N6" - or just 6 - (in this example).
It must be done with java standard stream's library, no use of third parties.

The algorithm to implement here is based on this one: to find the missing number in a sequence of integers, the trick is to:
calculate the sum of the elements in the sequence.
calculate the sum of the elements the sequence would have with the missing number: this is easy to do since we can determine the minimum, the maximum and we know that the sum from a sequence of integer going from min to max is max*(max+1)/2 - (min-1)*min/2.
find the difference between those two sums: that's our missing number
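The closed form in the second step follows from the usual formula for the sum of the first n positive integers (so it assumes min is at least 1). A worked check, using the values 2, 3, 4, 5, 7 from the example in this answer:

```latex
\[
\sum_{k=\min}^{\max} k
  = \sum_{k=1}^{\max} k \;-\; \sum_{k=1}^{\min-1} k
  = \frac{\max(\max+1)}{2} - \frac{(\min-1)\,\min}{2}
\]
% worked check with min = 2, max = 7 and elements {2, 3, 4, 5, 7}
\[
\frac{7 \cdot 8}{2} - \frac{1 \cdot 2}{2} = 28 - 1 = 27,
\qquad
27 - (2 + 3 + 4 + 5 + 7) = 27 - 21 = 6
\]
```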
In this case, we can collect statistics on our Stream by first mapping to an IntStream formed by only the numbers themselves and then calling summaryStatistics(). This returns an IntSummaryStatistics that has all the values we want: min, max and sum:
public static void main(String[] args) {
List<String> arr = Arrays.asList("N3", "N7", "N4", "N5", "N2");
IntSummaryStatistics statistics =
arr.stream()
.mapToInt(s -> Integer.parseInt(s.substring(1)))
.summaryStatistics();
long max = statistics.getMax();
long min = statistics.getMin();
long missing = max*(max+1)/2 - (min-1)*min/2 - statistics.getSum();
System.out.println(missing); // prints "6" here
}
If there is no missing number, this will print 0.
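The question also asks for the next number when nothing is missing. One way to cover that case is to treat a result of 0 as "no gap" and return max + 1 instead. This is a sketch under assumptions: the class and method names are made up, the list is non-empty, all numbers are positive and distinct, and at most one value is missing.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;

class MissingOrNext {
    // Hypothetical helper: the single missing number, or max + 1 if there is no gap.
    static long missingOrNext(List<String> labels) {
        IntSummaryStatistics stats = labels.stream()
                .mapToInt(s -> Integer.parseInt(s.substring(1)))
                .summaryStatistics();
        long max = stats.getMax();
        long min = stats.getMin();
        // Expected sum of min..max minus the actual sum.
        long missing = max * (max + 1) / 2 - (min - 1) * min / 2 - stats.getSum();
        return missing == 0 ? max + 1 : missing;
    }

    public static void main(String[] args) {
        System.out.println(missingOrNext(Arrays.asList("N1", "N2", "N3", "N5")));       // 4
        System.out.println(missingOrNext(Arrays.asList("N1", "N2", "N3", "N4", "N5"))); // 6
    }
}
```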

Here's the solution involving the pairMap operation from my free StreamEx library. It prints all the missing elements of the sorted input:
ArrayList<String> arr = new ArrayList(Arrays.asList("N1", "N2", "N3", "N5"));
StreamEx.of(arr).map(n -> Integer.parseInt(n.substring(1)))
.pairMap((a, b) -> IntStream.range(a+1, b))
.flatMapToInt(Function.identity())
.forEach(System.out::println);
The pairMap operation allows you to map every adjacent pair of the stream to something else. Here we map them to the streams of the skipped numbers, then flatten these streams.
The same solution is possible without third-party library, but looks more verbose:
ArrayList<String> arr = new ArrayList(Arrays.asList("N1", "N2", "N3", "N5"));
IntStream.range(0, arr.size()-1)
.flatMap(idx -> IntStream.range(
Integer.parseInt(arr.get(idx).substring(1))+1,
Integer.parseInt(arr.get(idx+1).substring(1))))
.forEach(System.out::println);

If there's only ONE missing number in the array, and if all numbers are positive, you could use the XOR algorithm, as explained in this question and its answers:
List<String> list = Arrays.asList("N5", "N2", "N3", "N6");
int xorArray = list.stream()
.mapToInt(p -> Integer.parseInt(p.substring(1)))
.reduce(0, (p1, p2) -> p1 ^ p2);
int xorAll = IntStream.rangeClosed(2, 6)
.reduce(0, (p1, p2) -> p1 ^ p2);
System.out.println(xorArray ^ xorAll); // 4
The advantage of this approach is that you don't need to use extra data structures, all you need is a couple of ints.
EDIT as per @Holger's comments below:
This solution requires you to know the range of the numbers in advance. Although on the other hand, it doesn't require the list and stream to be sorted.
Even if the list wasn't sorted, you could still get min and max (hence, the range) with IntSummaryStatistics, but this would require an extra iteration.
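A sketch of that variant, where the range is derived from the data instead of being hard-coded. The class and method names are made up for illustration; it assumes exactly one number is missing strictly between the minimum and the maximum, and it returns 0 when nothing is missing.

```java
import java.util.Arrays;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.stream.IntStream;

class XorMissing {
    // Hypothetical helper: finds the single missing number inside min..max.
    static int findMissing(List<String> labels) {
        // Parse once so the two passes below do not repeat the parsing.
        int[] values = labels.stream()
                .mapToInt(s -> Integer.parseInt(s.substring(1)))
                .toArray();
        // Extra iteration: determine the range from the data itself.
        IntSummaryStatistics stats = Arrays.stream(values).summaryStatistics();
        int xorValues = Arrays.stream(values).reduce(0, (a, b) -> a ^ b);
        int xorRange = IntStream.rangeClosed(stats.getMin(), stats.getMax())
                .reduce(0, (a, b) -> a ^ b);
        // Every number present in both cancels out; only the missing one remains.
        return xorValues ^ xorRange;
    }

    public static void main(String[] args) {
        System.out.println(findMissing(Arrays.asList("N5", "N2", "N3", "N6"))); // 4
    }
}
```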

You could create a state object which is used to transform a single input stream into multiple streams of missing entries. These missing entry streams can then be flat mapped to produce a single output:
public class GapCheck {
private String last;
public GapCheck(String first) {
last = first;
}
public Stream<String> streamMissing(String next) {
final int n = Integer.parseInt(next.replaceAll("N", ""));
final int l = Integer.parseInt(last.replaceAll("N", ""));
last = next;
return IntStream.range(l + 1, n).mapToObj(Integer::toString);
}
}
Usage:
final List<String> arr = new ArrayList(Arrays.asList("N1", "N3", "N5"));
arr.stream()
.flatMap(new GapCheck(arr.get(0))::streamMissing)
.forEach(System.out::println);
output:
2
4

This is more work than you might expect, but it can be done with a collect call.
public class Main {
public static void main(String[] args) {
ArrayList<String> arr = new ArrayList<String>(Arrays.asList("N1", "N2", "N3", "N5", "N7", "N14"));
Stream<Integer> st = arr.stream().map(p -> Integer.parseInt(p.substring(1))).sorted();
Holder<Integer> holder = st.collect(() -> new Holder<Integer>(),
(h, i) -> {
Integer last = h.getProcessed().isEmpty() ? null : h.getProcessed().get(h.getProcessed().size() - 1);
if (last != null) {
while (i - last > 1) {
h.getMissing().add(++last);
}
}
h.getProcessed().add(i);
},
(h, h2) -> {});
holder.getMissing().forEach(System.out::println);
}
private static class Holder<T> {
private ArrayList<T> processed;
private ArrayList<T> missing;
public Holder() {
this.processed = new ArrayList<>();
this.missing = new ArrayList<>();
}
public ArrayList<T> getProcessed() {
return this.processed;
}
public ArrayList<T> getMissing() {
return this.missing;
}
}
}
This prints
4
6
8
9
10
11
12
13
Note that this sort of thing isn't really a particularly strong fit for Streams. All of the stream processing methods will tend to pass you each item exactly one time, so you need to handle all runs of missing numbers at once, and in the end, you're writing kind of a lot of code to avoid just writing a loop.
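For comparison, here is roughly what "just writing a loop" could look like. It is a sketch with made-up class and method names; a stream is used only to parse and sort the labels, and the gap detection itself is a plain nested loop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MissingLoop {
    // Hypothetical helper: returns every number missing between consecutive values.
    static List<Integer> findMissing(List<String> labels) {
        int[] sorted = labels.stream()
                .mapToInt(s -> Integer.parseInt(s.substring(1)))
                .sorted()
                .toArray();
        List<Integer> missing = new ArrayList<>();
        for (int i = 1; i < sorted.length; i++) {
            // Emit everything strictly between the previous and the current value.
            for (int n = sorted[i - 1] + 1; n < sorted[i]; n++) {
                missing.add(n);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // [4, 6, 8, 9, 10, 11, 12, 13]
        System.out.println(findMissing(Arrays.asList("N1", "N2", "N3", "N5", "N7", "N14")));
    }
}
```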

Here is one solution using pure streams, albeit not very efficient.
public void test() {
List<String> arr = new ArrayList(
Arrays.asList("N1", "N2", "N3", "N5", "N7"));
List<Integer> list = IntStream
.range(1, arr.size())
.mapToObj(t -> new AbstractMap.SimpleEntry<Integer, Integer>(
extract(arr, t), extract(arr, t) - extract(arr, t - 1)))
.filter(t -> t.getValue() > 1)
.map(t -> t.getKey() - 1)
.collect(Collectors.toList());
System.out.println(list);
}
private int extract(List<String> arr, int t) {
return Integer.parseInt(arr.get(t).substring(1));
}
The major performance bottleneck is the repeated parsing of list elements. Note also that this solution reports every gap, but only one number per gap, so it lists all missing numbers only when each gap is a single number wide.
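The repeated parsing can be avoided by converting the labels to an int[] once and streaming over the indexes of that array. This is a sketch with made-up names that assumes the input is already sorted in ascending order; because it flat-maps each gap to a range, it also reports gaps wider than one number.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class GapsParsedOnce {
    // Hypothetical helper: each label is parsed exactly once.
    static List<Integer> findMissing(List<String> labels) {
        int[] ids = labels.stream()
                .mapToInt(s -> Integer.parseInt(s.substring(1)))
                .toArray();
        return IntStream.range(1, ids.length)
                // Numbers strictly between two neighbours; empty when they are adjacent.
                .flatMap(i -> IntStream.range(ids[i - 1] + 1, ids[i]))
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(findMissing(Arrays.asList("N1", "N2", "N3", "N5", "N7"))); // [4, 6]
    }
}
```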

Java 8 lambda get and remove element from list

Given a list of elements, I want to get the element with a given property and remove it from the list. The best solution I found is:
ProducerDTO p = producersProcedureActive
.stream()
.filter(producer -> producer.getPod().equals(pod))
.findFirst()
.get();
producersProcedureActive.remove(p);
Is it possible to combine get and remove in a lambda expression?

To Remove element from the list
objectA.removeIf(x -> conditions);
eg:
objectA.removeIf(x -> blockedWorkerIds.contains(x));
List<String> str1 = new ArrayList<String>();
str1.add("A");
str1.add("B");
str1.add("C");
str1.add("D");
List<String> str2 = new ArrayList<String>();
str2.add("D");
str2.add("E");
str1.removeIf(x -> str2.contains(x));
str1.forEach(System.out::println);
OUTPUT:
A
B
C

Although the thread is quite old, I still thought I would provide a solution using Java 8.
Make use of the removeIf function. Time complexity is O(n).
producersProcedureActive.removeIf(producer -> producer.getPod().equals(pod));
API reference: removeIf docs
Assumption: producersProcedureActive is a List
NOTE: With this approach you won't be able to get the hold of the deleted item.

Consider using vanilla java iterators to perform the task:
public static <T> T findAndRemoveFirst(Iterable<? extends T> collection, Predicate<? super T> test) {
T value = null;
for (Iterator<? extends T> it = collection.iterator(); it.hasNext();)
if (test.test(value = it.next())) {
it.remove();
return value;
}
return null;
}
Advantages:
It is plain and obvious.
It traverses only once and only up to the matching element.
You can do it on any Iterable even without stream() support (at least those implementing remove() on their iterator).
Disadvantages:
You cannot do it in place as a single expression (auxiliary method or variable required)
As for the
Is it possible to combine get and remove in a lambda expression?
other answers clearly show that it is possible, but you should be aware of
Search and removal may traverse the list twice
ConcurrentModificationException may be thrown when removing element from the list being iterated

The direct solution would be to invoke ifPresent(consumer) on the Optional returned by findFirst(). This consumer will be invoked when the optional is not empty. The benefit also is that it won't throw an exception if the find operation returned an empty optional, like your current code would do; instead, nothing will happen.
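The ifPresent variant could look roughly like this. It is a sketch only: the nested ProducerDTO class is a made-up stand-in for the type in the question, and removeFirstMatching is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IfPresentRemoval {
    // Hypothetical stand-in for the ProducerDTO in the question.
    static class ProducerDTO {
        private final String pod;
        ProducerDTO(String pod) { this.pod = pod; }
        String getPod() { return pod; }
    }

    // Removes the first producer with the given pod; does nothing if none matches.
    static void removeFirstMatching(List<ProducerDTO> producers, String pod) {
        producers.stream()
                .filter(producer -> producer.getPod().equals(pod))
                .findFirst()
                // The stream has already finished here, so modifying the list is safe.
                .ifPresent(found -> producers.remove(found));
    }

    public static void main(String[] args) {
        List<ProducerDTO> producers = new ArrayList<>(Arrays.asList(
                new ProducerDTO("A"), new ProducerDTO("B"), new ProducerDTO("C")));
        removeFirstMatching(producers, "B");
        System.out.println(producers.size()); // 2
        removeFirstMatching(producers, "Z");  // no match, no exception
        System.out.println(producers.size()); // 2
    }
}
```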
If you want to return the removed value, you can map the Optional to the result of calling remove:
producersProcedureActive.stream()
.filter(producer -> producer.getPod().equals(pod))
.findFirst()
.map(p -> {
producersProcedureActive.remove(p);
return p;
});
But note that the remove(Object) operation will again traverse the list to find the element to remove. If you have a list with random access, like an ArrayList, it would be better to make a Stream over the indexes of the list and find the first index matching the predicate:
IntStream.range(0, producersProcedureActive.size())
.filter(i -> producersProcedureActive.get(i).getPod().equals(pod))
.boxed()
.findFirst()
.map(i -> producersProcedureActive.remove((int) i));
With this solution, the remove(int) operation operates directly on the index.
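The (int) cast matters because List has two remove overloads, and a boxed Integer argument selects remove(Object) rather than remove(int). A small illustration with made-up class and method names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveOverloads {
    // Without the cast the call resolves to remove(Object): it searches for an
    // element equal to the Integer and returns false when there is none.
    static boolean removeWithoutCast(List<String> list, Integer boxedIndex) {
        return list.remove(boxedIndex);
    }

    // With the cast the call resolves to remove(int): it removes by position
    // and returns the removed element.
    static String removeWithCast(List<String> list, Integer boxedIndex) {
        return list.remove((int) boxedIndex);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(removeWithoutCast(names, 1)); // false
        System.out.println(names);                       // [a, b, c]
        System.out.println(removeWithCast(names, 1));    // b
        System.out.println(names);                       // [a, c]
    }
}
```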

You can use Java 8's filter and create another list if you don't want to change the old list:
List<ProducerDTO> result = producersProcedureActive
.stream()
.filter(producer -> producer.getPod().equals(pod))
.collect(Collectors.toList());

I'm sure this will be an unpopular answer, but it works...
ProducerDTO[] p = new ProducerDTO[1];
producersProcedureActive
.stream()
.filter(producer -> producer.getPod().equals(pod))
.findFirst()
.ifPresent(producer -> {producersProcedureActive.remove(producer); p[0] = producer;});
p[0] will either hold the found element or be null.
The "trick" here is circumventing the "effectively final" problem by using an array reference that is effectively final, but setting its first element.

With Eclipse Collections you can use detectIndex along with remove(int) on any java.util.List.
List<Integer> integers = Lists.mutable.with(1, 2, 3, 4, 5);
int index = Iterate.detectIndex(integers, i -> i > 2);
if (index > -1) {
integers.remove(index);
}
Assert.assertEquals(Lists.mutable.with(1, 2, 4, 5), integers);
If you use the MutableList type from Eclipse Collections, you can call the detectIndex method directly on the list.
MutableList<Integer> integers = Lists.mutable.with(1, 2, 3, 4, 5);
int index = integers.detectIndex(i -> i > 2);
if (index > -1) {
integers.remove(index);
}
Assert.assertEquals(Lists.mutable.with(1, 2, 4, 5), integers);
Note: I am a committer for Eclipse Collections

The below logic is the solution without modifying the original list
List<String> str1 = new ArrayList<String>();
str1.add("A");
str1.add("B");
str1.add("C");
str1.add("D");
List<String> str2 = new ArrayList<String>();
str2.add("D");
str2.add("E");
List<String> str3 = str1.stream()
.filter(item -> !str2.contains(item))
.collect(Collectors.toList());
str1 // ["A", "B", "C", "D"]
str2 // ["D", "E"]
str3 // ["A", "B", "C"]

When we want to get multiple elements from a List into a new list (filter using a predicate) and remove them from the existing list, I could not find a proper answer anywhere.
Here is how we can do it using Java Streaming API partitioning.
Map<Boolean, List<ProducerDTO>> classifiedElements = producersProcedureActive
.stream()
.collect(Collectors.partitioningBy(producer -> producer.getPod().equals(pod)));
// get two new lists
List<ProducerDTO> matching = classifiedElements.get(true);
List<ProducerDTO> nonMatching = classifiedElements.get(false);
// OR get non-matching elements to the existing list
producersProcedureActive = classifiedElements.get(false);
This way you effectively remove the filtered elements from the original list and add them to a new list.
Refer to section 5.2, Collectors.partitioningBy, of this article.

As others have suggested, this might be a use case for loops and iterables. In my opinion, this is the simplest approach. If you want to modify the list in-place, it cannot be considered "real" functional programming anyway. But you could use Collectors.partitioningBy() in order to get a new list with elements which satisfy your condition, and a new list of those which don't. Of course with this approach, if you have multiple elements satisfying the condition, all of those will be in that list and not only the first.

the task is: get *and* remove element from list
p.stream().collect( Collectors.collectingAndThen( Collector.of(
ArrayDeque::new,
(a, producer) -> {
if( producer.getPod().equals( pod ) )
a.addLast( producer );
},
(a1, a2) -> {
return( a1 );
},
rslt -> rslt.pollFirst()
),
(e) -> {
if( e != null )
p.remove( e ); // remove
return( e ); // get
} ) );

resumoRemessaPorInstrucoes.removeIf(item ->
item.getTipoOcorrenciaRegistro() == TipoOcorrenciaRegistroRemessa.PEDIDO_PROTESTO.getNome() ||
item.getTipoOcorrenciaRegistro() == TipoOcorrenciaRegistroRemessa.SUSTAR_PROTESTO_BAIXAR_TITULO.getNome());

Combining my initial idea and your answers I reached what seems to be the solution
to my own question:
public ProducerDTO findAndRemove(String pod) {
ProducerDTO p = null;
try {
p = IntStream.range(0, producersProcedureActive.size())
.filter(i -> producersProcedureActive.get(i).getPod().equals(pod))
.boxed()
.findFirst()
.map(i -> producersProcedureActive.remove((int)i))
.get();
logger.debug(p);
} catch (NoSuchElementException e) {
logger.error("No producer found with POD [" + pod + "]");
}
return p;
}
It removes the object using remove(int), which does not traverse the list again (as suggested by @Tunaki), and it returns the removed object to the function caller.
I read your answers suggesting that I choose safe methods like ifPresent instead of get, but I could not find a way to use them in this scenario.
Are there any important drawbacks to this kind of solution?
Edit following @Holger's advice
This should be the function I needed
public ProducerDTO findAndRemove(String pod) {
return IntStream.range(0, producersProcedureActive.size())
.filter(i -> producersProcedureActive.get(i).getPod().equals(pod))
.boxed()
.findFirst()
.map(i -> producersProcedureActive.remove((int)i))
.orElseGet(() -> {
logger.error("No producer found with POD [" + pod + "]");
return null;
});
}

A variation of the above:
import static java.util.function.Predicate.not;
final Optional<MyItem> myItem = originalCollection.stream().filter(myPredicate(someInfo)).findFirst();
final List<MyItem> myOtherItems = originalCollection.stream().filter(not(myPredicate(someInfo))).toList();
private Predicate<MyItem> myPredicate(Object someInfo) {
return myItem -> myItem.someField() == someInfo;
}

How to use Java 8 streams to find all values preceding a larger value?

Use Case
Through some coding Katas posted at work, I stumbled on this problem that I'm not sure how to solve.
Using Java 8 Streams, given a list of positive integers, produce a
list of integers where the integer preceded a larger value.
[10, 1, 15, 30, 2, 6]
The above input would yield:
[1, 15, 2]
since 1 precedes 15, 15 precedes 30, and 2 precedes 6.
Non-Stream Solution
public List<Integer> findSmallPrecedingValues(final List<Integer> values) {
List<Integer> result = new ArrayList<Integer>();
for (int i = 0; i < values.size(); i++) {
Integer next = (i + 1 < values.size() ? values.get(i + 1) : -1);
Integer current = values.get(i);
if (current < next) {
result.add(current);
}
}
return result;
}
What I've Tried
The problem I have is I can't figure out how to access next in the lambda.
return values.stream().filter(v -> v < next).collect(Collectors.toList());
Question
Is it possible to retrieve the next value in a stream?
Should I be using map and mapping to a Pair in order to access next?

Using IntStream.range:
static List<Integer> findSmallPrecedingValues(List<Integer> values) {
return IntStream.range(0, values.size() - 1)
.filter(i -> values.get(i) < values.get(i + 1))
.mapToObj(values::get)
.collect(Collectors.toList());
}
It's certainly nicer than an imperative solution with a large loop, but still a bit meh as far as the goal of "using a stream" in an idiomatic way.
Is it possible to retrieve the next value in a stream?
Nope, not really. The best cite I know of for that is in the java.util.stream package description:
The elements of a stream are only visited once during the life of a stream. Like an Iterator, a new stream must be generated to revisit the same elements of the source.
(Retrieving elements besides the current element being operated on would imply they could be visited more than once.)
We could also technically do it in a couple other ways:
Statefully (very meh).
Using a stream's iterator is technically still using the stream.
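The iterator option could be sketched as follows. The class name is made up for illustration, and the method only uses the stream as a source of elements, which is why it is "technically" a stream solution at best.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

class PrecedingViaIterator {
    // Walks the stream's iterator while remembering the previous element.
    static List<Integer> findSmallPrecedingValues(Stream<Integer> values) {
        List<Integer> result = new ArrayList<>();
        Iterator<Integer> it = values.iterator();
        if (!it.hasNext()) {
            return result; // empty stream, nothing to compare
        }
        Integer previous = it.next();
        while (it.hasNext()) {
            Integer current = it.next();
            if (previous < current) {
                result.add(previous);
            }
            previous = current;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(findSmallPrecedingValues(Stream.of(10, 1, 15, 30, 2, 6))); // [1, 15, 2]
    }
}
```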

That's not a pure Java8, but recently I've published a small library called StreamEx which has a method exactly for this task:
// Find all numbers where the integer preceded a larger value.
Collection<Integer> numbers = Arrays.asList(10, 1, 15, 30, 2, 6);
List<Integer> res = StreamEx.of(numbers).pairMap((a, b) -> a < b ? a : null)
.nonNull().toList();
assertEquals(Arrays.asList(1, 15, 2), res);
The pairMap operation is implemented internally using a custom spliterator. As a result you have quite clean code which does not depend on whether the source is a List or anything else. Of course it works fine with parallel streams as well.
Committed a testcase for this task.

It's not a one-liner (it's a two-liner), but this works:
List<Integer> result = new ArrayList<>();
values.stream().reduce((a,b) -> {if (a < b) result.add(a); return b;});
Rather than solving it by "looking at the next element", this solves it by "looking at the previous element", which reduce() gives you for free. I have bent its intended usage by injecting a code fragment that populates the list based on the comparison of previous and current elements, then returns the current so the next iteration will see it as its previous element.
Some test code:
List<Integer> result = new ArrayList<>();
IntStream.of(10, 1, 15, 30, 2, 6).reduce((a,b) -> {if (a < b) result.add(a); return b;});
System.out.println(result);
Output:
[1, 15, 2]

The accepted answer works fine whether the stream is sequential or parallel, but it can suffer if the underlying List is not random access, due to the multiple calls to get.
If your stream is sequential, you might roll this collector:
public static Collector<Integer, ?, List<Integer>> collectPrecedingValues() {
int[] holder = {Integer.MAX_VALUE};
return Collector.of(ArrayList::new,
(l, elem) -> {
if (holder[0] < elem) l.add(holder[0]);
holder[0] = elem;
},
(l1, l2) -> {
throw new UnsupportedOperationException("Don't run in parallel");
});
}
and a usage:
List<Integer> precedingValues = list.stream().collect(collectPrecedingValues());
Nevertheless, you could also implement a collector that works for both sequential and parallel streams. The only thing is that you need to apply a final transformation, but here you have control over the List implementation so you won't suffer from the get performance.
The idea is to first generate a list of pairs (each represented by an int[] array of size 2) which contains the values in the stream sliced by a window of size two with a gap of one. When we need to merge two lists, we check the emptiness and merge the gap of the last element of the first list with the first element of the second list. Then we apply a final transformation to filter only desired values and map them to have the desired output.
It might not be as simple as the accepted answer, but well it can be an alternative solution.
public static Collector<Integer, ?, List<Integer>> collectPrecedingValues() {
return Collectors.collectingAndThen(
Collector.of(() -> new ArrayList<int[]>(),
(l, elem) -> {
if (l.isEmpty()) l.add(new int[]{Integer.MAX_VALUE, elem});
else l.add(new int[]{l.get(l.size() - 1)[1], elem});
},
(l1, l2) -> {
if (l1.isEmpty()) return l2;
if (l2.isEmpty()) return l1;
l2.get(0)[0] = l1.get(l1.size() - 1)[1];
l1.addAll(l2);
return l1;
}), l -> l.stream().filter(arr -> arr[0] < arr[1]).map(arr -> arr[0]).collect(Collectors.toList()));
}
You can then wrap these two collectors in a utility collector method, check if the stream is parallel with isParallel and then decide which collector to return.

If you're willing to use a third party library and don't need parallelism, then jOOλ offers SQL-style window functions as follows
System.out.println(
Seq.of(10, 1, 15, 30, 2, 6)
.window()
.filter(w -> w.lead().isPresent() && w.value() < w.lead().get())
.map(w -> w.value())
.toList()
);
Yielding
[1, 15, 2]
The lead() function accesses the next value in traversal order from the window.
Disclaimer: I work for the company behind jOOλ

You can achieve that by using a bounded queue to store the elements which flow through the stream (this is based on the idea which I described in detail here: Is it possible to get next element in the Stream?).
The example below first defines an instance of the BoundedQueue class, which will store elements going through the stream (if you don't like the idea of extending LinkedList, refer to the link mentioned above for an alternative and more generic approach). Later you just examine the two subsequent elements, thanks to the helper class:
public class Kata {
public static void main(String[] args) {
List<Integer> input = new ArrayList<Integer>(asList(10, 1, 15, 30, 2, 6));
class BoundedQueue<T> extends LinkedList<T> {
public BoundedQueue<T> save(T curElem) {
if (size() == 2) { // we need to know only two subsequent elements
pollLast(); // remove last to keep only requested number of elements
}
offerFirst(curElem);
return this;
}
public T getPrevious() {
return (size() < 2) ? null : getLast();
}
public T getCurrent() {
return (size() == 0) ? null : getFirst();
}
}
BoundedQueue<Integer> streamHistory = new BoundedQueue<Integer>();
final List<Integer> answer = input.stream()
.map(i -> streamHistory.save(i))
.filter(e -> e.getPrevious() != null)
.filter(e -> e.getCurrent() > e.getPrevious())
.map(e -> e.getPrevious())
.collect(Collectors.toList());
answer.forEach(System.out::println);
}
}

How to force max to return ALL maximum values in a Java Stream?

I've tested the max function on Java 8 lambdas and streams a bit, and it seems that when max is executed, even if more than one object compares to 0, it returns an arbitrary element among the tied candidates without further consideration.
Is there an evident trick or function for this expected max behavior, so that all max values are returned? I don't see anything in the API, but I am sure there must be something better than comparing manually.
For instance:
// myComparator is an IntegerComparator
Stream.of(1, 3, 5, 3, 2, 3, 5)
.max(myComparator)
.forEach(System.out::println);
// Would print 5, 5 in any order.

I believe the OP is using a Comparator to partition the input into equivalence classes, and the desired result is a list of members of the equivalence class that is the maximum according to that Comparator.
Unfortunately, using int values as a sample problem is a terrible example. All equal int values are fungible, so there is no notion of preserving the ordering of equivalent values. Perhaps a better example is using string lengths, where the desired result is to return a list of strings from an input that all have the longest length within that input.
I don't know of any way to do this without storing at least partial results in a collection.
Given an input collection, say
List<String> list = ... ;
...it's simple enough to do this in two passes, the first to get the longest length, and the second to filter the strings that have that length:
int longest = list.stream()
.mapToInt(String::length)
.max()
.orElse(-1);
List<String> result = list.stream()
.filter(s -> s.length() == longest)
.collect(toList());
If the input is a stream, which cannot be traversed more than once, it is possible to compute the result in only a single pass using a collector. Writing such a collector isn't difficult, but it is a bit tedious as there are several cases to be handled. A helper function that generates such a collector, given a Comparator, is as follows:
static <T> Collector<T,?,List<T>> maxList(Comparator<? super T> comp) {
return Collector.of(
ArrayList::new,
(list, t) -> {
int c;
if (list.isEmpty() || (c = comp.compare(t, list.get(0))) == 0) {
list.add(t);
} else if (c > 0) {
list.clear();
list.add(t);
}
},
(list1, list2) -> {
if (list1.isEmpty()) {
return list2;
}
if (list2.isEmpty()) {
return list1;
}
int r = comp.compare(list1.get(0), list2.get(0));
if (r < 0) {
return list2;
} else if (r > 0) {
return list1;
} else {
list1.addAll(list2);
return list1;
}
});
}
This stores intermediate results in an ArrayList. The invariant is that all elements within any such list are equivalent in terms of the Comparator. When adding an element, if it's less than the elements in the list, it's ignored; if it's equal, it's added; and if it's greater, the list is emptied and the new element is added. Merging isn't too difficult either: the list with the greater elements is returned, but if their elements are equal the lists are appended.
Given an input stream, this is pretty easy to use:
Stream<String> input = ... ;
List<String> result = input.collect(maxList(comparing(String::length)));

I would group by value and store the values in a TreeMap in order to have them sorted, then I would get the max values by taking the last entry, as follows:
Stream.of(1, 3, 5, 3, 2, 3, 5)
.collect(groupingBy(Function.identity(), TreeMap::new, toList()))
.lastEntry()
.getValue()
.forEach(System.out::println);
Output:
5
5

I implemented a more generic collector solution with a custom downstream collector. Some readers might find it useful:
public static <T, A, D> Collector<T, ?, D> maxAll(Comparator<? super T> comparator,
Collector<? super T, A, D> downstream) {
Supplier<A> downstreamSupplier = downstream.supplier();
BiConsumer<A, ? super T> downstreamAccumulator = downstream.accumulator();
BinaryOperator<A> downstreamCombiner = downstream.combiner();
class Container {
A acc;
T obj;
boolean hasAny;
Container(A acc) {
this.acc = acc;
}
}
Supplier<Container> supplier = () -> new Container(downstreamSupplier.get());
BiConsumer<Container, T> accumulator = (acc, t) -> {
if(!acc.hasAny) {
downstreamAccumulator.accept(acc.acc, t);
acc.obj = t;
acc.hasAny = true;
} else {
int cmp = comparator.compare(t, acc.obj);
if (cmp > 0) {
acc.acc = downstreamSupplier.get();
acc.obj = t;
}
if (cmp >= 0)
downstreamAccumulator.accept(acc.acc, t);
}
};
BinaryOperator<Container> combiner = (acc1, acc2) -> {
if (!acc2.hasAny) {
return acc1;
}
if (!acc1.hasAny) {
return acc2;
}
int cmp = comparator.compare(acc1.obj, acc2.obj);
if (cmp > 0) {
return acc1;
}
if (cmp < 0) {
return acc2;
}
acc1.acc = downstreamCombiner.apply(acc1.acc, acc2.acc);
return acc1;
};
Function<Container, D> finisher = acc -> downstream.finisher().apply(acc.acc);
return Collector.of(supplier, accumulator, combiner, finisher);
}
So by default it can be collected to a list using:
public static <T> Collector<T, ?, List<T>> maxAll(Comparator<? super T> comparator) {
return maxAll(comparator, Collectors.toList());
}
But you can use other downstream collectors as well:
public static String joinLongestStrings(Collection<String> input) {
return input.stream().collect(
maxAll(Comparator.comparingInt(String::length), Collectors.joining(",")));
}

If I understood well, you want the frequency of the max value in the Stream.
One way to achieve that would be to store the results in a TreeMap<Integer, List<Integer>> when you collect elements from the Stream. Then you grab the last key (or the first, depending on the comparator you give) to get the value, which will contain the list of max values.
List<Integer> maxValues = st.collect(toMap(i -> i,
Arrays::asList,
(l1, l2) -> Stream.concat(l1.stream(), l2.stream()).collect(toList()),
TreeMap::new))
.lastEntry()
.getValue();
Collecting it from the Stream(4, 5, -2, 5, 5) will give you a List [5, 5, 5].
Another approach in the same spirit would be to use a group by operation combined with the counting() collector:
Entry<Integer, Long> maxValues = st.collect(groupingBy(i -> i,
TreeMap::new,
counting())).lastEntry(); //5=3 -> 5 appears 3 times
Conceptually, the elements are first grouped by value, as in a Map<Integer, List<Integer>>. The downstream counting() collector then yields the number of elements in each group, keyed by its value, resulting in a Map<Integer, Long>. From there you grab the max entry.
The first approach requires storing all the elements from the stream. The second one is better (see Holger's comment) as the intermediate List is not built. In both approaches, the result is computed in a single pass.
If you get the source from a collection, you may want to use Collections.max one time to find the maximum value followed by Collections.frequency to find how many times this value appears.
It requires two passes but uses less memory as you don't have to build the data-structure.
The stream equivalent would be coll.stream().max(...).get(...) followed by coll.stream().filter(...).count().
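As a rough illustration of the two-pass variants just described, the sketch below uses the sample values 4, 5, -2, 5, 5 from above. The class name MaxFrequency and its method names are made up for the example, and the collection is assumed to be non-empty (both Collections.max and Optional.get throw NoSuchElementException on empty input).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;

public class MaxFrequency {

    // Collection-based variant: Collections.max, then Collections.frequency.
    static int frequencyOfMax(Collection<Integer> coll) {
        Integer max = Collections.max(coll);       // first pass
        return Collections.frequency(coll, max);   // second pass
    }

    // Stream-based variant: max(...).get(), then filter(...).count().
    static long frequencyOfMaxWithStreams(Collection<Integer> coll) {
        Integer max = coll.stream().max(Comparator.naturalOrder()).get(); // first pass
        return coll.stream().filter(max::equals).count();                 // second pass
    }

    public static void main(String[] args) {
        Collection<Integer> coll = Arrays.asList(4, 5, -2, 5, 5);
        System.out.println(frequencyOfMax(coll));            // 3
        System.out.println(frequencyOfMaxWithStreams(coll)); // 3
    }
}
```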

I'm not really sure whether you are trying to
(a) find the number of occurrences of the maximum item, or
(b) find all the maximum values in the case of a Comparator that is not consistent with equals.
An example of (a) would be [1, 5, 4, 5, 1, 1] -> [5, 5].
An example of (b) would be:
Stream.of("Bar", "FOO", "foo", "BAR", "Foo")
.max((s, t) -> s.toLowerCase().compareTo(t.toLowerCase()));
which you want to give [FOO, foo, Foo], rather than just FOO or Optional[FOO].
In both cases, there are clever ways to do it in just one pass. But these approaches are of dubious value because you would need to keep track of unnecessary information along the way. For example, if you start with [2, 0, 2, 2, 1, 6, 2], it would only be when you reach 6 that you would realise it was not necessary to track all the 2s.
I think the best approach is the obvious one; use max, and then iterate the items again putting all the ties into a collection of your choice. This will work for both (a) and (b).
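A minimal sketch of that two-pass approach, assuming the source is a collection that can be traversed twice. The class MaxTies and the helper allMax are hypothetical names; the comparator is the one from example (b).

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class MaxTies {

    // First pass: find one maximum. Second pass: keep everything that ties with it.
    static <T> List<T> allMax(Collection<T> items, Comparator<? super T> comp) {
        Optional<T> max = items.stream().max(comp);
        if (!max.isPresent()) {
            return Collections.emptyList();          // empty input has no maximum
        }
        T best = max.get();
        return items.stream()
                .filter(item -> comp.compare(item, best) == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Case (b): a comparator that is not consistent with equals.
        Comparator<String> ignoreCase = (s, t) -> s.toLowerCase().compareTo(t.toLowerCase());
        System.out.println(allMax(Arrays.asList("Bar", "FOO", "foo", "BAR", "Foo"), ignoreCase));
        // [FOO, foo, Foo]

        // Case (a): natural ordering on integers.
        System.out.println(allMax(Arrays.asList(1, 5, 4, 5, 1, 1), Comparator.<Integer>naturalOrder()));
        // [5, 5]
    }
}
```

Ties are compared with the comparator rather than with equals, so the same helper covers both cases and keeps the tied elements in encounter order.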

If you'd rather rely on a library than the other answers here, StreamEx has a collector to do this.
Stream.of(1, 3, 5, 3, 2, 3, 5)
.collect(MoreCollectors.maxAll())
.forEach(System.out::println);
There's a version which takes a Comparator too for streams of items which don't have a natural ordering (i.e. don't implement Comparable).

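System.out.println(
Arrays.toString(
Stream.of(1, 3, 5, 3, 2, 3, 5)
.map(a -> new Integer[]{a})
// on a tie, concatenate both arrays; otherwise keep the array holding the larger value
.reduce((a, b) ->
a[0].equals(b[0]) ?
Stream.concat(Stream.of(a), Stream.of(b)).toArray(Integer[]::new) :
a[0] > b[0] ? a : b
).get()
)
); // prints [5, 5]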

I was searching for a good answer to this question, but for a slightly more complex case, and couldn't find anything until I figured it out myself, so I'm posting this in case it helps anybody.
I have a list of Kittens.
Kitten is an object which has a name, age and gender. I had to return a list of all the youngest kittens.
For example:
The kitten list would contain kitten objects (k1, k2, k3, k4) and their ages would be (1, 2, 3, 1) respectively. We want to return [k1, k4], because they are both the youngest. If only one youngest exists, the function should return [k1(youngest)].
Find the min value of the list (if it exists):
Optional<Kitten> minKitten = kittens.stream().min(Comparator.comparingInt(Kitten::getAge));
Filter the list by the min value:
return minKitten.map(value -> kittens.stream().filter(kitten -> kitten.getAge() == value.getAge())
.collect(Collectors.toList())).orElse(Collections.emptyList());

The following two lines will do it without implementing a separate comparator:
List<Integer> list = List.of(1, 3, 5, 3, 2, 3, 5);
// equals() rather than ==, because == on two Integer objects compares references
list.stream().filter(i -> i.equals(list.stream().max(Comparator.comparingInt(i2 -> i2)).get())).forEach(System.out::println);


Library method to partition a collection by a predicate - java

Apache Commons Collections IterableUtils provides methods for partitioning Iterable objects based on one or more predicates. (Look for the partition(...) methods.)
