find the largest 3 shops using java stream - java

I have a list of shop objects that are grouped by the item they have.
class Shop{
String shopName;
String item;
int size;
...}
How can I get a list of the 3 biggest shops (or n biggest shops) for each item?
For example, suppose I have
Shop("Walmart", "Hammer", 100);
Shop("Target", "Scissor", 30);
Shop("Walgreens", "Hammer", 300);
Shop("Glens", "Hammer", 500);
Shop("Walmart", "Scissor", 75);
Shop("Toms", "Hammer", 150);
I want to return a list of the top 3 shops grouped by item.
I grouped the items, but I am not sure how to iterate through the resulting Map or entry set...
public class Shop {
int size;
String item;
String name;
public Shop(int size, String item, String name){
this.size = size;
this.item = item;
this.name = name;
}
//Return a list of the top 3 largest shops by item
public static void main(){
List<Shop> shops = new LinkedList<Shop>();
Comparator<Shop> shopComparator = new Comparator<Shop>(){
@Override
public int compare(Shop f1, Shop f2) {
return f1.getSize() < f2.getSize() ? 1 : -1;
}
};
shops.stream().collect(groupingBy(Shop::getItem))
.entrySet()
.stream()
.filter(entry -> entry.getValue().stream().map )
.forEach(item -> item.getValue())//Stuck here
;
}
}

The most important thing that you can learn about streams is that they aren't inherently "better" than equivalent approaches by any measure. Sometimes, they make code more readable, other times, less so. Use them to clarify your code, and avoid them when they obfuscate it.
This is a case where your code will be far more readable by using a collector for this purpose. Coding your own is fairly easy, and if you really want to understand streams better, I recommend it as a simple learning exercise.
Here, I'm using MoreCollectors.greatest() from the StreamEx library:
Comparator<Shop> bySize = Comparator.comparingInt(Shop::getSize);
Map<String, List<Shop>> biggestByItem
= shops.stream().collect(groupingBy(Shop::getItem, greatest(bySize, 3)));
This isn't better because it's shorter, or because it is faster and uses constant memory; it's better because complexity is factored out of the code, and hidden behind meaningful names that explain the behavior. Instead of littering your application with complex pipelines that need to be read, tested, and maintained independently, you have written (or referenced) a reusable collector with a clear behavior.
As I mentioned, there is a bit of a learning curve in understanding how the pieces of a Collector work together, but it's worth studying. Here's a possible implementation for a similar collector:
public static <T> Collector<T, ?, List<T>> top(int limit, Comparator<? super T> order) {
if (limit < 1) throw new IndexOutOfBoundsException(limit);
Objects.requireNonNull(order);
Supplier<Queue<T>> supplier = () -> new PriorityQueue<>(order);
BiConsumer<Queue<T>, T> accumulator = (q, e) -> collect(order, limit, q, e);
BinaryOperator<Queue<T>> combiner = (q1, q2) -> {
q2.forEach(e -> collect(order, limit, q1, e));
return q1;
};
Function<Queue<T>, List<T>> finisher = q -> {
List<T> list = new ArrayList<>(q);
// A PriorityQueue does not iterate in sorted order, so sort explicitly: largest first.
list.sort(order.reversed());
return list;
};
return Collector.of(supplier, accumulator, combiner, finisher, Collector.Characteristics.UNORDERED);
}
private static <T> void collect(Comparator<? super T> order, int limit, Queue<T> q, T e) {
if (q.size() < limit) {
q.add(e);
} else if (order.compare(e, q.peek()) > 0) {
q.remove();
q.add(e);
}
}
Given this factory, it's trivial to create others that give you bottom(3, bySize), etc.
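As a rough sketch of that last point: assuming a top() factory like the one above, bottom() only needs to reverse the comparator. The class name TopBottom and the helper name offer are made up for this illustration, and the finisher sorts explicitly because a PriorityQueue does not iterate in sorted order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

// Hypothetical holder class for the two factories.
class TopBottom {
    // Keeps the `limit` greatest elements according to `order`, greatest first.
    static <T> Collector<T, ?, List<T>> top(int limit, Comparator<? super T> order) {
        if (limit < 1) throw new IllegalArgumentException("limit must be positive");
        Supplier<Queue<T>> supplier = () -> new PriorityQueue<>(order);
        BiConsumer<Queue<T>, T> accumulator = (q, e) -> offer(order, limit, q, e);
        BinaryOperator<Queue<T>> combiner = (q1, q2) -> {
            q2.forEach(e -> offer(order, limit, q1, e));
            return q1;
        };
        Function<Queue<T>, List<T>> finisher = q -> {
            List<T> list = new ArrayList<>(q);
            list.sort(order.reversed()); // heap iteration order is not sorted
            return list;
        };
        return Collector.of(supplier, accumulator, combiner, finisher);
    }

    // The `limit` smallest elements: the same collector with the order reversed.
    static <T> Collector<T, ?, List<T>> bottom(int limit, Comparator<? super T> order) {
        return top(limit, order.reversed());
    }

    // Adds e if there is room, or if it beats the current weakest element (the heap head).
    private static <T> void offer(Comparator<? super T> order, int limit, Queue<T> q, T e) {
        if (q.size() < limit) {
            q.add(e);
        } else if (order.compare(e, q.peek()) > 0) {
            q.remove();
            q.add(e);
        }
    }
}
```

With this in place, bottom(3, bySize) could be passed as the downstream collector of groupingBy in the same way as top(3, bySize).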
You may be interested in this related question and its answers.

Well, you could take the following steps:
With groupingBy(Shop::getItem), you could create a map which groups the shops by item, so your result would be a Map<String, List<Shop>>, where each list contains all shops with that item.
Now we need to sort the List<Shop> in reversed order, so the top items of the list are the shops with the largest size. In order to do this, we could use collectingAndThen as downstream collector to groupingBy.
Collectors.collectingAndThen(Collectors.toList(), finisherFunction);
Our finisher function should sort the list:
list -> {
Collections.sort(list, Comparator.comparing(Shop::size).reversed());
return list;
}
This would result in a Map<String, List<Shop>>, where the list is sorted, highest size first.
Now the only thing left to do is to limit the list size to 3. We could use subList. subList throws an IndexOutOfBoundsException if the end index is greater than the list size, that is, if the list contains fewer than 3 items, so we need to use Math.min(3, list.size()) to take this into account.
list -> {
Collections.sort(list, Comparator.comparing(Shop::size).reversed());
return list.subList(0, Math.min(3, list.size()));
}
The whole code then looks like this:
shops.stream()
.collect(groupingBy(Shop::item, Collectors.collectingAndThen(Collectors.toList(), list -> {
Collections.sort(list, Comparator.comparing(Shop::size).reversed());
return list.subList(0, Math.min(3, list.size()));
})));
Instead of 'manually' sorting the list and limiting it to 3, you could create a small class which does this automatically, both limiting and sorting the list as elements are added.
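A minimal sketch of such a class, assuming a hypothetical name TopN (it is not part of any library). It keeps its elements sorted best-first and drops whatever falls beyond the limit. The static collector() method is likewise an assumption, added so the class can be plugged into groupingBy.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collector;

// Hypothetical helper: holds at most `limit` elements, ordered best first.
class TopN<T> {
    private final int limit;
    private final Comparator<? super T> bestFirst;
    private final List<T> items = new ArrayList<>();

    TopN(int limit, Comparator<? super T> bestFirst) {
        this.limit = limit;
        this.bestFirst = bestFirst;
    }

    void add(T item) {
        // Binary search for the insertion point that keeps the list sorted.
        int pos = Collections.binarySearch(items, item, bestFirst);
        if (pos < 0) pos = -pos - 1;
        items.add(pos, item);
        // Drop the worst element once the limit is exceeded.
        if (items.size() > limit) items.remove(items.size() - 1);
    }

    TopN<T> merge(TopN<T> other) {
        other.items.forEach(this::add);
        return this;
    }

    List<T> toList() {
        return Collections.unmodifiableList(items);
    }

    // Wraps TopN in a Collector so it can serve as a downstream of groupingBy.
    static <T> Collector<T, ?, List<T>> collector(int limit, Comparator<? super T> bestFirst) {
        return Collector.<T, TopN<T>, List<T>>of(
            () -> new TopN<>(limit, bestFirst), TopN::add, TopN::merge, TopN::toList);
    }
}
```

Under these assumptions, the grouping could be written as groupingBy(Shop::item, TopN.collector(3, Comparator.comparing(Shop::size).reversed())).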

Not as fancy as MC Emperor but it seems to work.
I started from the part you already did:
shops.stream().collect(Collectors.groupingBy(Shop::getItem))
.entrySet().stream().map(entry -> {
entry.setValue(entry.getValue().stream()
.sorted(Comparator.comparingInt(s->-s.size))
.limit(3) // only keep top 3
.collect(Collectors.toList()));
return entry;
}).forEach(item -> {
System.out.println(item.getKey()+":"+item.getValue());
});

You can use groupingBy along with limit to get desired result:
import static java.util.stream.Collectors.*;
// Define the sort logic. reversed() applies descending order (the default is ascending)
Comparator<Shop> sortBySize = Comparator.comparingInt(Shop::getSize).reversed();
int limit = 3; // top n items
var itemToTopNShopsMap = list.stream().collect(
collectingAndThen(groupingBy(Shop::getItem),
itemToShopsMap -> getTopNShops(sortBySize, itemToShopsMap, limit)));
static Map<String, List<Shop>> getTopNShops(Comparator<Shop> sortBy, Map<String, List<Shop>> inMap, int limit) {
var returningMap = new HashMap<String, List<Shop>>();
for (var i : inMap.entrySet()) {
returningMap.put(i.getKey(), i.getValue().stream().sorted(sortBy).limit(limit).collect(toList()));
}
return returningMap;
}
We took the following steps:
Group the List by 'item'
For each grouping, i.e., item to list of shops entry, we sort the list of shops by predefined sort logic and collect (limit) the top n results.
Note:
In static method getTopNShops, mutation of source map is avoided. We could have written this method as a stream, but the stream version may have been less readable than the foreach loop.
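For comparison, here is one way the stream version mentioned in the note might look. This is a sketch only: the class name TopNPerKey and the generic signature are assumptions made for illustration, not part of the answer's code. Like the loop, it builds a new map and leaves the source map untouched.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical holder class for the stream-based variant.
class TopNPerKey {
    // For every key, copies the first `limit` values in `sortBy` order into a new map.
    static <K, V> Map<K, List<V>> topN(Map<K, List<V>> in, Comparator<? super V> sortBy, int limit) {
        return in.entrySet().stream()
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                e -> e.getValue().stream()
                      .sorted(sortBy)
                      .limit(limit)
                      .collect(Collectors.toList())));
    }
}
```

Whether this reads better than the for loop is a matter of taste; the nested pipeline inside toMap is the part that tends to hurt readability.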

Related

How to fetch 3 objects having the highest values from a List with Stream API

I have a method like this:
public String mostExpensiveItems() {
List<Entry> myList = getList();
List<Double> expensive = myList.stream()
.map(Entry::getAmount)
.sorted(Comparator.reverseOrder())
.limit(3)
.toList();
return "";
}
This method needs to return the product IDs of the 3 most expensive items as a string like this:
"item1, item2, item3"
I should be able to use only streams and I got stuck here. I should be able to sort the items by value then get the product IDs, but I can't seem to make it work.
Entry class
public class Entry {
private String productId;
private LocalDate date;
private String state;
private String category;
private Double amount;
public Entry(LocalDate orderDate, String state, String productId, String category, Double sales) {
this.date = orderDate;
this.productId = productId;
this.state = state;
this.category = category;
this.amount = sales;
}
public String getProductId() {
return productId;
}
Assuming product ID is inside Entry, it can be something like this.
public String mostExpensiveItems() {
List<Entry> myList = getList();
List<String> expensive = myList.stream()
.sorted(Comparator.comparing(Entry::getAmount).reversed())
.limit(3)
.map(Entry::getProductId)
.toList();
return "";
}
NB: I didn't test this out yet, but this should be able to convey the idea.
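To produce the requested "item1, item2, item3" string, the list can be joined at the end of the same pipeline. A minimal sketch, assuming a hypothetical Item record that stands in for Entry and carries only the two fields used here:

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class MostExpensive {
    // Hypothetical stand-in for the Entry class (records need Java 16+).
    record Item(String productId, double amount) {}

    // Sorts by amount, highest first, keeps three, and joins their product IDs.
    static String mostExpensiveItems(List<Item> items) {
        return items.stream()
            .sorted(Comparator.comparingDouble(Item::amount).reversed())
            .limit(3)
            .map(Item::productId)
            .collect(Collectors.joining(", "));
    }
}
```

Collectors.joining(", ") does the same job as collecting to a list and calling String.join afterwards, without the intermediate list.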
You don't need to sort all the given data for this task. Sorting is overkill when you only need to fetch the 3 largest values.
Sorting the whole data set costs O(n log n) time. Meanwhile, this task can be done in a single pass through the list, maintaining only the 3 largest values encountered so far. That costs O(n log k) for the k largest values, which is effectively linear for k = 3.
To implement the partial sorting with streams, you can define a custom collector (an object that is responsible for accumulating the data from the stream).
You can create a custom collector either inline by using one of the versions of the static method Collector.of() or by creating a class that implements the Collector interface.
These are parameters that you need to provide while defining a custom collector:
Supplier: Supplier<A> provides a mutable container that stores the elements of the stream. Because we need a partial sort here, a PriorityQueue is a handy mutable container.
Accumulator: BiConsumer<A,T> defines how to add elements to the container provided by the supplier. For this task, the accumulator must guarantee that the queue never exceeds the given size: once the queue is full, it rejects values that are smaller than the lowest value in the queue, and it removes the lowest value when a new, larger value needs to be added.
Combiner: BinaryOperator<A> defines how to merge two containers obtained while executing the stream in parallel. Here the combiner relies on the same logic as the accumulator.
Finisher: Function<A,R> produces the final result by transforming the mutable container. The finisher function in the code below turns the queue into an immutable list.
Characteristics: these provide additional information. Collector.Characteristics.UNORDERED, which is used in this case, denotes that the order in which partial results are produced while executing in parallel is not significant, which can improve the performance of this collector with parallel streams.
Note that with Collector.of() only supplier, accumulator and combiner are mandatory, other parameters are defined if needed.
The method below that generates a collector would be more reusable if we apply generic type parameter to it and declare it to expect a comparator as a parameter (will be used in the constructor of the PriorityQueue and while adding elements to the queue).
Custom collector:
public static <T> Collector<T, ?, List<T>> getMaxN(int size, Comparator<T> comparator) {
return Collector.of(
() -> new PriorityQueue<>(comparator),
(Queue<T> queue, T next) -> tryAdd(queue, next, comparator, size),
(Queue<T> left, Queue<T> right) -> {
right.forEach(next -> tryAdd(left, next, comparator, size));
return left;
},
(Queue<T> queue) -> queue.stream().toList(),
Collector.Characteristics.UNORDERED);
}
public static <T> void tryAdd(Queue<T> queue, T next, Comparator<T> comparator, int size) {
if (queue.size() == size && comparator.compare(queue.element(), next) < 0) queue.remove(); // if next value is greater than the smallest element in the queue and max size has been exceeded the smallest element needs to be removed from the queue
if (queue.size() < size) queue.add(next);
}
Stream:
public static <T> String getMostExpensive(List<T> list, Function<T, String> function,
Comparator<T> comparator, int limit) {
return list.stream()
.collect(getMaxN(limit, comparator))
.stream()
.map(function)
.collect(Collectors.joining(", "));
}
main() - demo with dummy Entry objects whose constructor expects only a product ID and an amount.
public static void main(String[] args) {
List<Entry> entries =
List.of(new Entry("item1", 2.6), new Entry("item2", 3.5), new Entry("item3", 5.7),
new Entry("item4", 1.9), new Entry("item5", 3.2), new Entry("item6", 9.5),
new Entry("item7", 7.2), new Entry("item8", 8.1), new Entry("item9", 7.9));
System.out.println(getMostExpensive(entries, Entry::getProductId,
Comparator.comparingDouble(Entry::getAmount), 3));
}
Output
item9, item6, item8 // the final result is not sorted: a PriorityQueue keeps its elements only partially ordered internally (they come out sorted only when dequeued). If the values need to be sorted, that can be done by changing the finisher function.
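If a sorted result is wanted, one possible change to the finisher is sketched below. The names MaxNSorted and getMaxNSorted are made up for this illustration; everything except the finisher follows the collector shown above.

```java
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical variant of getMaxN whose finisher sorts the survivors, largest first.
class MaxNSorted {
    static <T> Collector<T, ?, List<T>> getMaxNSorted(int size, Comparator<T> comparator) {
        return Collector.<T, Queue<T>, List<T>>of(
            () -> new PriorityQueue<>(comparator),
            (queue, next) -> tryAdd(queue, next, comparator, size),
            (left, right) -> {
                right.forEach(next -> tryAdd(left, next, comparator, size));
                return left;
            },
            // Only `size` elements remain at this point, so this sort is cheap.
            queue -> queue.stream()
                          .sorted(comparator.reversed())
                          .collect(Collectors.toList()),
            Collector.Characteristics.UNORDERED);
    }

    static <T> void tryAdd(Queue<T> queue, T next, Comparator<T> comparator, int size) {
        // Evict the smallest element if the queue is full and `next` is larger.
        if (queue.size() == size && comparator.compare(queue.element(), next) < 0) queue.remove();
        if (queue.size() < size) queue.add(next);
    }
}
```

Sorting in the finisher touches only the retained elements, so it does not change the overall cost of the single pass.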

How to avoid multiple Streams with Java 8

I am having the below code
trainResponse.getIds().stream()
.filter(id -> id.getType().equalsIgnoreCase("Company"))
.findFirst()
.ifPresent(id -> {
domainResp.setId(id.getId());
});
trainResponse.getIds().stream()
.filter(id -> id.getType().equalsIgnoreCase("Private"))
.findFirst()
.ifPresent(id ->
domainResp.setPrivateId(id.getId())
);
Here I'm iterating/streaming the list of Id objects 2 times.
The only difference between the two streams is in the filter() operation.
How to achieve it in single iteration, and what is the best approach (in terms of time and space complexity) to do this?
You can achieve that with the Stream API in one pass through the given set of data and without increasing memory consumption (i.e. the result will contain only ids having the required attributes).
For that, you can create a custom Collector that expects as its parameters a Collection of attributes to look for and a Function responsible for extracting the attribute from the stream element.
That's how this generic collector could be implemented.
/**
* @param <T> - the type of stream elements
* @param <F> - the type of the key (a field of the stream element)
*/
class CollectByKey<T, F> implements Collector<T, Map<F, T>, Map<F, T>> {
private final Set<F> keys;
private final Function<T, F> keyExtractor;
public CollectByKey(Collection<F> keys, Function<T, F> keyExtractor) {
this.keys = new HashSet<>(keys);
this.keyExtractor = keyExtractor;
}
@Override
public Supplier<Map<F, T>> supplier() {
return HashMap::new;
}
@Override
public BiConsumer<Map<F, T>, T> accumulator() {
return this::tryAdd;
}
private void tryAdd(Map<F, T> map, T item) {
F key = keyExtractor.apply(item);
if (keys.remove(key)) {
map.put(key, item);
}
}
@Override
public BinaryOperator<Map<F, T>> combiner() {
return this::tryCombine;
}
private Map<F, T> tryCombine(Map<F, T> left, Map<F, T> right) {
right.forEach(left::putIfAbsent);
return left;
}
@Override
public Function<Map<F, T>, Map<F, T>> finisher() {
return Function.identity();
}
@Override
public Set<Characteristics> characteristics() {
return Collections.emptySet();
}
}
main() - demo (dummy Id class is not shown)
public class CustomCollectorByGivenAttributes {
public static void main(String[] args) {
List<Id> ids = List.of(new Id(1, "Company"), new Id(2, "Fizz"),
new Id(3, "Private"), new Id(4, "Buzz"));
Map<String, Id> idByType = ids.stream()
.collect(new CollectByKey<>(List.of("Company", "Private"), Id::getType));
idByType.forEach((k, v) -> {
if (k.equalsIgnoreCase("Company")) domainResp.setId(v.getId());
if (k.equalsIgnoreCase("Private")) domainResp.setPrivateId(v.getId());
});
System.out.println(idByType.keySet()); // printing keys - added for demo purposes
}
}
Output
[Company, Private]
Note that once the set of keys becomes empty (i.e. all required data has been fetched), further stream elements are ignored, but the remaining elements still have to be traversed, because a collector cannot short-circuit.
IMO, the two streams solution is the most readable. And it may even be the most efficient solution using streams.
IMO, the best way to avoid multiple streams is to use a classical loop. For example:
// There may be bugs ...
boolean seenCompany = false;
boolean seenPrivate = false;
for (Id id: getIds()) {
if (!seenCompany && id.getType().equalsIgnoreCase("Company")) {
domainResp.setId(id.getId());
seenCompany = true;
} else if (!seenPrivate && id.getType().equalsIgnoreCase("Private")) {
domainResp.setPrivateId(id.getId());
seenPrivate = true;
}
if (seenCompany && seenPrivate) {
break;
}
}
It is unclear whether performing one iteration is more efficient than performing two. It will depend on the class returned by getIds() and on the iteration code.
The complicated logic with the two flags is how you replicate the short-circuiting behavior of findFirst() in your two-stream solution. I don't know if it is possible to do that at all using one stream. If it is, it will involve some pretty cunning code.
But as you can see, your original solution with two streams is clearly easier to understand than the above.
The main point of using streams is to make your code simpler. It is not about efficiency. When you try to do complicated things to make the streams more efficient, you are probably defeating the (true) purpose of using streams in the first place.
For your list of ids, you could just use a map, then assign them after retrieving, if present.
Map<String, Integer> seen = new HashMap<>();
for (Id id : ids) {
if (seen.size() == 2) {
break;
}
seen.computeIfAbsent(id.getType().toLowerCase(), v->id.getId());
}
If you want to test it, you can use the following:
record Id(String getType, int getId) {
@Override
public String toString() {
return String.format("[%s,%s]", getType, getId);
}
}
Random r = new Random();
List<Id> ids = r.ints(20, 1, 100)
.mapToObj(id -> new Id(
r.nextBoolean() ? "Company" : "Private", id))
.toList();
Edited to allow only certain types to be checked
If you have more than two types but only want to check certain ones, you can do it as follows.
The process is the same, except that you have a Set of allowed types.
You simply check that you are processing one of those types by using contains.
Map<String, Integer> seen = new HashMap<>();
Set<String> allowedTypes = Set.of("company", "private");
for (Id id : ids) {
// Normalize once so the lookup and the map key use the same case.
String type = id.getType().toLowerCase();
if (allowedTypes.contains(type)) {
if (seen.size() == allowedTypes.size()) {
break;
}
seen.computeIfAbsent(type,
v -> id.getId());
}
}
Testing is similar, except that additional types need to be included.
Create a list of some types that could be present,
and build a list of Ids from them as before.
Notice that the size of the allowed-types set replaces the value 2, so that more than two types can be checked before exiting the loop.
List<String> possibleTypes =
List.of("Company", "Type1", "Private", "Type2");
Random r = new Random();
List<Id> ids =
r.ints(30, 1, 100)
.mapToObj(id -> new Id(possibleTypes.get(
r.nextInt((possibleTypes.size()))),
id))
.toList();
You can group by type and check the resulting map.
I suppose the type of ids is IdType.
Map<String, List<IdType>> map = trainResponse.getIds()
.stream()
.collect(Collectors.groupingBy(
id -> id.getType().toLowerCase()));
Optional.ofNullable(map.get("company")).ifPresent(ids -> domainResp.setId(ids.get(0).getId()));
Optional.ofNullable(map.get("private")).ifPresent(ids -> domainResp.setPrivateId(ids.get(0).getId()));
I'd recommend a traditional for loop. In addition to being easily scalable, it prevents you from traversing the collection multiple times.
Your code looks like something that will be generalised in the future, hence my generic approach.
Here's some pseudocode (with errors, just for the sake of illustration):
Set<String> matches = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
for(id : trainResponse.getIds()) {
if (! matches.add(id.getType())) {
continue;
}
switch (id.getType().toLowerCase()) {
case "company":
domainResp.setId(id.getId());
break;
case "private":
...
}
}
Something along these lines might work. It would go through the whole stream, though, and wouldn't stop at the first occurrence.
But assuming a small stream and only one Id for each type, why not?
Map<String, Consumer<String>> setters = new HashMap<>();
setters.put("Company", domainResp::setId);
setters.put("Private", domainResp::setPrivateId);
trainResponse.getIds().forEach(id -> {
if (setters.containsKey(id.getType())) {
setters.get(id.getType()).accept(id.getId());
}
});
We can use Collectors.filtering (available from Java 9 onwards) to collect values based on a condition.
For this scenario, I changed the code as follows:
final Map<String, String> results = trainResponse.getIds()
.stream()
.collect(Collectors.filtering(
id -> id.getType().equals("Company") || id.getType().equals("Private"),
Collectors.toMap(Id::getType, Id::getId, (first, second) -> first)));
The ids can then be read from the results map.
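A self-contained sketch of the same idea, for experimenting. The Id record, the class name FilteringDemo and the method name firstIdByType are hypothetical stand-ins; only Collectors.filtering and Collectors.toMap are real library calls.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class FilteringDemo {
    // Hypothetical stand-in for the Id class from the question (records need Java 16+).
    record Id(String type, String id) {}

    // One pass over the list: keeps the first id seen for each wanted type.
    static Map<String, String> firstIdByType(List<Id> ids, Set<String> wanted) {
        return ids.stream()
            .collect(Collectors.filtering(
                (Id id) -> wanted.contains(id.type()),
                // The merge function keeps the earlier value when a type repeats.
                Collectors.toMap(Id::type, Id::id, (first, second) -> first)));
    }
}
```

Note that this still visits every element; unlike findFirst(), it does not stop early once both types have been found.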

Join list from two list in Java Object in stream

I have two lists of two different classes, where id and months are common to both:
public class NamePropeties{
private String id;
private Integer name;
private Integer months;
}
public class NameEntries {
private String id;
private Integer retailId;
private Integer months;
}
List NamePropetiesList = new ArrayList<>();
List NameEntries = new ArrayList<>();
Now I want to join the two lists (as SQL does, with a JOIN ON months and id) and return the data in a new list containing the elements whose months and id match in both lists.
If I iterate over only one list and look each element up in the other, there can be an iteration issue when the sizes differ.
I have tried to do it in several ways, but is there a way to do it with streams?
The general idea has been sketched in the comments: iterate one list, create a map whose keys are the attributes you want to join by, then iterate the other list and check if there's an entry in the map. If there is, get the value from the map and create a new object from the value of the map and the actual element of the list.
It's better to create a map from the list with the higher number of joined elements. Why? Because searching a map is O(1), no matter the size of the map. So, if you create a map from the list with the higher number of joined elements, then, when you iterate the second list (which is smaller), you'll be iterating over fewer elements.
Putting all this in code:
public static <B, S, J, R> List<R> join(
List<B> bigger,
List<S> smaller,
Function<B, J> biggerKeyExtractor,
Function<S, J> smallerKeyExtractor,
BiFunction<B, S, R> joiner) {
Map<J, List<B>> map = new LinkedHashMap<>();
bigger.forEach(b ->
map.computeIfAbsent(
biggerKeyExtractor.apply(b),
k -> new ArrayList<>())
.add(b));
List<R> result = new ArrayList<>();
smaller.forEach(s -> {
J key = smallerKeyExtractor.apply(s);
List<B> bs = map.get(key);
if (bs != null) {
bs.forEach(b -> {
R r = joiner.apply(b, s);
result.add(r);
});
}
});
return result;
}
This is a generic method that joins bigger List<B> and smaller List<S> by J join keys (in your case, as the join key is a composite of String and Integer types, J will be List<Object>). It takes care of duplicates and returns a result List<R>. The method receives both lists, functions that will extract the join keys from each list and a joiner function that will create new result R elements from joined B and S elements.
Note that the map is actually a multimap. This is because there might be duplicates as per the biggerKeyExtractor join function. We use Map.computeIfAbsent to create this multimap.
You should create a class like this to store joined results:
public class JoinedResult {
private final NameProperties properties;
private final NameEntries entries;
public JoinedResult(NameProperties properties, NameEntries entries) {
this.properties = properties;
this.entries = entries;
}
// TODO getters
}
Or, if you are on Java 16+ (records were a preview feature in Java 14 and 15), you might just use a record:
public record JoinedResult(NameProperties properties, NameEntries entries) { }
Or actually, any Pair class from out there will do, or you could even use Map.Entry.
With the result class (or record) in place, you should call the join method this way:
long propertiesSize = namePropertiesList.stream()
.map(p -> Arrays.asList(p.getMonths(), p.getId()))
.distinct()
.count();
long entriesSize = nameEntriesList.stream()
.map(e -> Arrays.asList(e.getMonths(), e.getId()))
.distinct()
.count();
List<JoinedResult> result = propertiesSize > entriesSize ?
join(namePropertiesList,
nameEntriesList,
p -> Arrays.asList(p.getMonths(), p.getId()),
e -> Arrays.asList(e.getMonths(), e.getId()),
JoinedResult::new) :
join(nameEntriesList,
namePropertiesList,
e -> Arrays.asList(e.getMonths(), e.getId()),
p -> Arrays.asList(p.getMonths(), p.getId()),
(e, p) -> new JoinedResult(p, e));
The key is to use generics and call the join method with the right arguments (they are flipped, as per the join keys size comparison).
Note 1: we can use List<Object> as the key of the map, because all Java lists implement equals and hashCode consistently (thus they can safely be used as map keys)
Note 2: if you are on Java 9+, you could use List.of instead of Arrays.asList, as long as neither months nor id can be null (List.of rejects null elements)
Note 3: I haven't checked for null or invalid arguments
Note 4: there is room for improvement, e.g. key extractor functions could be memoized, join keys could be reused instead of being calculated more than once, and the multimap could hold plain Object values for single elements and lists for duplicates, etc.
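To see the approach end to end, here is a compact sketch with made-up types. Props, Entries, Joined and Key are hypothetical records standing in for the classes above, and a dedicated Key record is swapped in for the List-based composite key, which is a design choice of this sketch rather than of the answer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

class HashJoinDemo {
    // Hypothetical stand-ins for the question's classes (records need Java 16+).
    record Props(String id, Integer name, Integer months) {}
    record Entries(String id, Integer retailId, Integer months) {}
    record Joined(Props props, Entries entries) {}
    // Composite join key; records get equals and hashCode for free.
    record Key(Integer months, String id) {}

    // Indexes `left` by key, then probes the index once per element of `right`.
    static <L, R, K, O> List<O> join(List<L> left, List<R> right,
            Function<L, K> leftKey, Function<R, K> rightKey, BiFunction<L, R, O> joiner) {
        Map<K, List<L>> index = new HashMap<>();
        for (L l : left) {
            index.computeIfAbsent(leftKey.apply(l), k -> new ArrayList<>()).add(l);
        }
        List<O> out = new ArrayList<>();
        for (R r : right) {
            for (L l : index.getOrDefault(rightKey.apply(r), List.of())) {
                out.add(joiner.apply(l, r));
            }
        }
        return out;
    }

    static List<Joined> joinOnIdAndMonths(List<Props> props, List<Entries> entries) {
        return join(props, entries,
            p -> new Key(p.months(), p.id()),
            e -> new Key(e.months(), e.id()),
            Joined::new);
    }
}
```

Each element of either list is visited once, so the join costs roughly the sum of the two list sizes plus the number of matches, instead of their product.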
If performance and nesting (as discussed) is not too much of a concern you could employ something along the lines of a crossjoin with filtering:
Result holder class
public class Tuple<A, B> {
public final A a;
public final B b;
public Tuple(A a, B b) {
this.a = a;
this.b = b;
}
}
Join with a predicate:
public static <A, B> List<Tuple<A, B>> joinOn(
List<A> l1,
List<B> l2,
Predicate<Tuple<A, B>> predicate) {
return l1.stream()
.flatMap(a -> l2.stream().map(b -> new Tuple<>(a, b)))
.filter(predicate)
.collect(Collectors.toList());
}
Call it like this:
List<Tuple<NamePropeties, NameEntries>> joined = joinOn(
properties,
names,
t -> Objects.equals(t.a.id, t.b.id) && Objects.equals(t.a.months, t.b.months)
);

What is the best way to implement the python count function in java?

I am learning how to use streams in java and I would like to know the most efficient way to copy the python count functionality into java.
For those unfamiliar with python count, see here.
I've already done a naive implementation but I doubt this would ever get added to a production level environment:
private List<String> countMessages(List<String> messages) {
Map<String, Integer> messageOccurrences = new HashMap<>();
List<String> stackedMessages = new LinkedList<String>();
this.messages.stream().filter((message) -> (messageOccurrences.containsKey(message))).forEachOrdered((message) -> {
int new_occ = messageOccurrences.get(message) + 1;
messageOccurrences.put(message, new_occ);
});
messageOccurrences.keySet().forEach((key) -> {
stackedMessages.add(key + "(" + messageOccurrences.get(key) + "times)" );
});
return stackedMessages;
}
Any improvements or pointers would be appreciated.
To answer the question "what is the best way to implement the python count function in java?".
Java already has Collections.frequency which will do exactly that.
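A quick illustration of Collections.frequency (the wrapper class FrequencyDemo and its count method are made up for the example):

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;

class FrequencyDemo {
    // Thin wrapper: counts the elements equal to `element`, like Python's list.count().
    static int count(Collection<?> source, Object element) {
        return Collections.frequency(source, element);
    }

    public static void main(String[] args) {
        List<String> messages = List.of("hello", "bye", "hello", "hi", "hello");
        System.out.println(count(messages, "hello")); // prints 3
        System.out.println(count(messages, "nope"));  // prints 0
    }
}
```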
However, if you want to do it with the streams API then I believe a generic solution would be:
public static <T> long count(Collection<T> source, T element) {
return source.stream().filter(e -> Objects.equals(e, element)).count();
}
then the use case would be:
long countHello = count(myStringList, "hello");
long countJohn = count(peopleList, new Person("John"));
long count101 = count(integerList, 101);
...
...
or you can even pass a predicate if you wanted:
public static <T> long count(Collection<T> source, Predicate<? super T> predicate) {
return source.stream().filter(predicate).count();
}
Then the use case would be for example:
long stringsGreaterThanTen = count(myStringList, s -> s.length() > 10);
long malesCount = count(peopleList, Person::isMale);
long evens = count(integerList, i -> i % 2 == 0);
...
...
Given your comment on the post, it seems like you want to "group" them and get the count of each group.
public Map<String, Long> countMessages(List<String> messages) {
return messages.stream()
.collect(groupingBy(Function.identity(), counting()));
}
This creates a stream from the messages list and then groups them, passing a counting() as the downstream collector meaning we will retrieve a Map<String, Long> where the keys are the elements and the values are the occurrences of that specific string.
Ensure you have the import:
import static java.util.stream.Collectors.*;
for the latter solution.
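If the goal is the "message(n times)" list from the question, the grouped counts can be formatted in a second step. A sketch under that assumption; the class name MessageCounter is hypothetical, and LinkedHashMap is used only to keep the messages in first-seen order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class MessageCounter {
    static List<String> countMessages(List<String> messages) {
        // Group equal messages and count each group, preserving encounter order.
        Map<String, Long> counts = messages.stream()
            .collect(Collectors.groupingBy(
                Function.identity(), LinkedHashMap::new, Collectors.counting()));
        // Format each entry as "message(n times)".
        return counts.entrySet().stream()
            .map(e -> e.getKey() + "(" + e.getValue() + " times)")
            .collect(Collectors.toList());
    }
}
```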

How to create a List<T> from Map<K,V> and List<K> of keys?

Using Java 8 lambdas, what's the "best" way to effectively create a new List<T> given a List<K> of possible keys and a Map<K,V>? This is the scenario where you are given a List of possible Map keys and are expected to generate a List<T> where T is some type that is constructed based on some aspect of V, the map value types.
I've explored a few and don't feel comfortable claiming one way is better than another (with maybe one exception -- see code). I'll clarify "best" as a combination of code clarity and runtime efficiency. These are what I came up with. I'm sure someone can do better, which is one aspect of this question. I don't like the filter aspect of most as it means needing to create intermediate structures and multiple passes over the names List. Right now, I'm opting for Example 6 -- a plain 'ol loop. (NOTE: Some cryptic thoughts are in the code comments, especially "need to reference externally..." This means external from the lambda.)
public class Java8Mapping {
private final Map<String,Wongo> nameToWongoMap = new HashMap<>();
public Java8Mapping(){
List<String> names = Arrays.asList("abbey","normal","hans","delbrook");
List<String> types = Arrays.asList("crazy","boring","shocking","dead");
for(int i=0; i<names.size(); i++){
nameToWongoMap.put(names.get(i),new Wongo(names.get(i),types.get(i)));
}
}
public static void main(String[] args) {
System.out.println("in main");
Java8Mapping j = new Java8Mapping();
List<String> testNames = Arrays.asList("abbey", "froderick","igor");
System.out.println(j.getBongosExample1(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample2(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample3(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample4(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample5(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample6(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
}
private static class Wongo{
String name;
String type;
public Wongo(String s, String t){name=s;type=t;}
@Override public String toString(){return "Wongo{name="+name+", type="+type+"}";}
}
private static class Bongo{
Wongo wongo;
public Bongo(Wongo w){wongo = w;}
@Override public String toString(){ return "Bongo{wongo="+wongo+"}";}
}
// 1: Create a list externally and add items inside 'forEach'.
// Needs to externally reference Map and List
public List<Bongo> getBongosExample1(List<String> names){
final List<Bongo> listOne = new ArrayList<>();
names.forEach(s -> {
Wongo w = nameToWongoMap.get(s);
if(w != null) {
listOne.add(new Bongo(nameToWongoMap.get(s)));
}
});
return listOne;
}
// 2: Use stream().map().collect()
// Needs to externally reference Map
public List<Bongo> getBongosExample2(List<String> names){
return names.stream()
.filter(s -> nameToWongoMap.get(s) != null)
.map(s -> new Bongo(nameToWongoMap.get(s)))
.collect(Collectors.toList());
}
// 3: Create custom Collector
// Needs to externally reference Map
public List<Bongo> getBongosExample3(List<String> names){
Function<List<Wongo>,List<Bongo>> finisher = list -> list.stream().map(Bongo::new).collect(Collectors.toList());
Collector<String,List<Wongo>,List<Bongo>> bongoCollector =
Collector.of(ArrayList::new,getAccumulator(),getCombiner(),finisher, Characteristics.UNORDERED);
return names.stream().collect(bongoCollector);
}
// example 3 helper code
private BiConsumer<List<Wongo>,String> getAccumulator(){
return (list,string) -> {
Wongo w = nameToWongoMap.get(string);
if(w != null){
list.add(w);
}
};
}
// example 3 helper code
private BinaryOperator<List<Wongo>> getCombiner(){
return (l1,l2) -> {
l1.addAll(l2);
return l1;
};
}
// 4: Use internal Bongo creation facility
public List<Bongo> getBongosExample4(List<String> names){
return names.stream().filter(s->nameToWongoMap.get(s) != null).map(s-> new Bongo(nameToWongoMap.get(s))).collect(Collectors.toList());
}
// 5: Stream the Map EntrySet. This avoids referring to anything outside of the stream,
// but bypasses the lookup benefit from Map.
public List<Bongo> getBongosExample5(List<String> names){
return nameToWongoMap.entrySet().stream().filter(e->names.contains(e.getKey())).map(e -> new Bongo(e.getValue())).collect(Collectors.toList());
}
// 6: Plain-ol-java loop
public List<Bongo> getBongosExample6(List<String> names){
List<Bongo> bongos = new ArrayList<>();
for(String s : names){
Wongo w = nameToWongoMap.get(s);
if(w != null){
bongos.add(new Bongo(w));
}
}
return bongos;
}
}
If nameToWongoMap is an instance variable, you can't really avoid a capturing lambda.
You can clean up the stream by splitting up the operations a little more:
return names.stream()
.map(n -> nameToWongoMap.get(n))
.filter(w -> w != null)
.map(w -> new Bongo(w))
.collect(toList());
Or, with method references:
return names.stream()
.map(nameToWongoMap::get)
.filter(Objects::nonNull)
.map(Bongo::new)
.collect(toList());
That way you don't call get twice.
This is very much like the for loop, except that it could, for example, theoretically be parallelized, as long as nameToWongoMap can't be mutated concurrently.
You wrote: "I don't like the filter aspect of most as it means needing to create intermediate structures and multiple passes over the names List."
There are no intermediate structures and there is only one pass over the List. A stream pipeline says "for each element...do this sequence of operations". Each element is visited once and the pipeline is applied.
Here are some relevant quotes from the java.util.stream package description:
"A stream is not a data structure that stores elements; instead, it conveys elements from a source such as a data structure, an array, a generator function, or an I/O channel, through a pipeline of computational operations."
"Processing streams lazily allows for significant efficiencies; in a pipeline such as the filter-map-sum example above, filtering, mapping, and summing can be fused into a single pass on the data, with minimal intermediate state."
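If you want to see the single-pass behavior for yourself, here is a minimal sketch. The class name SinglePassDemo, the trace method, and the sample names are made up for illustration and are not part of the question's code. It logs when each element leaves the source and when it survives the filter; on a sequential stream the log entries interleave per element instead of showing one full pass per stage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class SinglePassDemo {
    // Records the order in which the pipeline stages see each element.
    // Names starting with "x" are mapped to null to simulate a failed lookup (assumption for the demo).
    static List<String> trace(List<String> names) {
        List<String> log = new ArrayList<>();
        names.stream()
             .peek(n -> log.add("source:" + n))
             .map(n -> n.startsWith("x") ? null : n.toUpperCase())
             .filter(Objects::nonNull)
             .peek(n -> log.add("kept:" + n))
             .collect(Collectors.toList());
        return log;
    }

    public static void main(String[] args) {
        // prints [source:a, kept:A, source:xb, source:c, kept:C]
        System.out.println(trace(Arrays.asList("a", "xb", "c")));
    }
}
```

Each name goes through map and filter before the next name is pulled from the list, so no intermediate list of mapped values is ever built.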
Radiodef's answer pretty much nailed it, I think. The solution given there:
return names.stream()
.map(nameToWongoMap::get)
.filter(Objects::nonNull)
.map(Bongo::new)
.collect(toList());
is probably about the best that can be done in Java 8.
I did want to mention a small wrinkle in this, though. The Map.get call returns null if the name isn't present in the map, and this is subsequently filtered out. There's nothing wrong with this per se, though it does bake null-means-not-present semantics into the pipeline structure.
In some sense we'd want a mapper pipeline operation that has a choice of returning zero or one elements. A way to do this with streams is with flatMap. The flatmapper function can return an arbitrary number of elements into the stream, but in this case we want just zero or one. Here's how to do that:
return names.stream()
.flatMap(name -> {
Wongo w = nameToWongoMap.get(name);
return w == null ? Stream.empty() : Stream.of(w);
})
.map(Bongo::new)
.collect(toList());
I admit this is pretty clunky and so I wouldn't recommend doing this. A slightly better but somewhat obscure approach is this:
return names.stream()
.flatMap(name -> Optional.ofNullable(nameToWongoMap.get(name))
.map(Stream::of).orElseGet(Stream::empty))
.map(Bongo::new)
.collect(toList());
but I'm still not sure I'd recommend this as it stands.
The use of flatMap does point to another approach, though. If you have a more complicated policy of how to deal with the not-present case, you could refactor this into a helper function that returns a Stream containing the result or an empty Stream if there's no result.
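A rough sketch of that helper-function idea follows. The names LookupHelperDemo, lookup, and resolve are hypothetical, and the map values are simplified to String instead of Wongo to keep the example short. The not-present policy lives entirely inside lookup, so the pipeline itself stays flat.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LookupHelperDemo {
    // Stand-in for the question's nameToWongoMap; values simplified to String (assumption).
    static final Map<String, String> nameToWongoMap = new HashMap<>();
    static {
        nameToWongoMap.put("a", "Wongo-A");
        nameToWongoMap.put("c", "Wongo-C");
    }

    // Hypothetical helper: returns a one-element stream on a hit, an empty stream on a miss.
    // A more complicated not-present policy (logging, fallbacks, ...) would go here.
    static Stream<String> lookup(String name) {
        String w = nameToWongoMap.get(name);
        return w == null ? Stream.empty() : Stream.of(w);
    }

    static List<String> resolve(List<String> names) {
        return names.stream()
                    .flatMap(LookupHelperDemo::lookup)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [Wongo-A, Wongo-C]
        System.out.println(resolve(Arrays.asList("a", "b", "c")));
    }
}
```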
Finally, JDK 9 -- still under development as of this writing -- has added Stream.ofNullable which is useful in exactly these situations:
return names.stream()
.flatMap(name -> Stream.ofNullable(nameToWongoMap.get(name)))
.map(Bongo::new)
.collect(toList());
As an aside, JDK 9 has also added Optional.stream which creates a zero-or-one stream from an Optional. This is useful in cases where you want to call an Optional-returning function from within flatMap. See this answer and this answer for more discussion.
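For illustration, here is a small sketch of Optional.stream inside flatMap. It needs Java 9 or later. The find method is a hypothetical Optional-returning lookup, and the map values are again simplified to String; neither comes from the question.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class OptionalStreamDemo {
    // Stand-in for the question's nameToWongoMap; values simplified to String (assumption).
    static final Map<String, String> nameToWongoMap = new HashMap<>();
    static {
        nameToWongoMap.put("a", "Wongo-A");
        nameToWongoMap.put("c", "Wongo-C");
    }

    // Hypothetical lookup that reports absence with Optional instead of null.
    static Optional<String> find(String name) {
        return Optional.ofNullable(nameToWongoMap.get(name));
    }

    static List<String> resolve(List<String> names) {
        return names.stream()
                    // Optional.stream (Java 9+) yields zero or one element per name.
                    .flatMap(name -> find(name).stream())
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [Wongo-A, Wongo-C]
        System.out.println(resolve(Arrays.asList("a", "b", "c")));
    }
}
```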
One approach I didn't see is retainAll:
public List<Bongo> getBongos(List<String> names) {
Map<String, Wongo> copy = new HashMap<>(nameToWongoMap);
copy.keySet().retainAll(names);
return copy.values().stream().map(Bongo::new).collect(
Collectors.toList());
}
The extra Map is a minimal performance hit, since it's just copying pointers to objects, not the objects themselves.
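One thing to be aware of with this approach: the result follows the map's iteration order and contains each match once, whereas the stream-over-names versions follow the order of names and keep duplicates. Here is a small sketch of the difference. The class and method names are hypothetical, values are simplified to String, and a LinkedHashMap is used only so the demo's output order is deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

class RetainAllDemo {
    // LinkedHashMap chosen for predictable iteration order in this demo (assumption).
    static final Map<String, String> nameToWongoMap = new LinkedHashMap<>();
    static {
        nameToWongoMap.put("a", "Wongo-A");
        nameToWongoMap.put("b", "Wongo-B");
        nameToWongoMap.put("c", "Wongo-C");
    }

    // Copy the map, then drop every key that is not in names.
    static List<String> viaRetainAll(List<String> names) {
        Map<String, String> copy = new LinkedHashMap<>(nameToWongoMap);
        copy.keySet().retainAll(names);
        return new ArrayList<>(copy.values());
    }

    // Look up each name in order, skipping misses.
    static List<String> viaStream(List<String> names) {
        return names.stream()
                    .map(nameToWongoMap::get)
                    .filter(Objects::nonNull)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("c", "a", "c", "z");
        System.out.println(viaRetainAll(names)); // prints [Wongo-A, Wongo-C]
        System.out.println(viaStream(names));    // prints [Wongo-C, Wongo-A, Wongo-C]
    }
}
```

Whether that difference matters depends on what the caller expects from the returned list.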
