I have a List of Maps with certain keys that map to String values.
Something like List<Map<String,String>> aMapList;
Objective : Stream over this List of maps and collect values of a single key in all Maps.
How I'm doing this ->
key = "somekey";
aMapList.stream().map(a -> a.get(key)).collect(Collectors.averagingInt());
The Problem:
I get exceptions due to a.get(key) when there is no such key, because the averaging step then encounters a null. How do I check for this, or make the lambda ignore any such maps and move on?
I do know that I can add a filter on a -> a.containsKey(key) and then proceed.
Edit: I can also add more filters, or simply check multiple conditions in one filter.
Possible Solution:
aMapList.stream().filter(a -> a.containsKey(key)).
map(a -> a.get(key)).collect(Collectors.averagingInt());
Can this be made prettier? Instead of halting the operation, simply skip over them?
Is there some more generic way to skip over exceptions or nulls.
E.g., we can expand the lambda and add a try-catch block, but I still need to return something. What if I want the equivalent of "continue"?
E.g.
(a -> { return a.get(key); }).
Can be expanded to -->
(a -> { try { return a.get(key); }
catch (Exception e) { return null; } }).
The above still returns a null, instead of just skipping over.
I'm selecting the best answer for giving two options, but I do not find either of them prettier. Chaining filters seems to be the solution to this.
How about wrapping the result with Optional:
List<Optional<String>> values = aMapList.stream()
.map(a -> Optional.ofNullable(a.get(key)))
.collect(Collectors.toList());
Later code will know to expect possible empty elements.
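As a minimal sketch of what that "later code" might look like, the class below wraps each lookup in an Optional and then unwraps only the values that are present. The class and method names (OptionalValues, lookup, present) are made up for illustration and are not part of the answer above.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

final class OptionalValues {
    // Wraps each lookup in an Optional; maps without the key yield Optional.empty().
    static List<Optional<String>> lookup(List<Map<String, String>> maps, String key) {
        return maps.stream()
                .map(m -> Optional.ofNullable(m.get(key)))
                .collect(Collectors.toList());
    }

    // Later code unwraps only the values that are actually present.
    static List<String> present(List<Optional<String>> values) {
        return values.stream()
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, String>> maps = List.of(
                Map.of("somekey", "1"),
                Map.of("otherkey", "2"),
                Map.of("somekey", "3"));
        System.out.println(present(lookup(maps, "somekey")));
    }
}
```

The list of Optionals keeps one entry per map, so positions still line up with the original list; that is the main reason to prefer it over filtering right away.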
The solution you propose has a potential bug for maps that allow null values. For example:
Map<String, String> aMap = new HashMap<>();
aMap.put("somekey", null);
aMapList.add(aMap);
aMapList.stream()
.filter(a -> a.containsKey("somekey")) // true returned for containsKey
.map(a -> a.get("somekey")) // null returned for get
.collect(Collectors.toList());
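To make the pitfall concrete, here is a small sketch that compares the containsKey filter with a null filter on a list where one map stores an explicit null. The class name NullValuePitfall and its helper methods are illustrative assumptions, not code from the question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

final class NullValuePitfall {
    // Filters on key presence only; an explicit null value still gets through.
    static List<String> byContainsKey(List<Map<String, String>> maps, String key) {
        return maps.stream()
                .filter(m -> m.containsKey(key))
                .map(m -> m.get(key))
                .collect(Collectors.toList());
    }

    // Filters on the looked-up value; covers both a missing key and a null value.
    static List<String> byNonNull(List<Map<String, String>> maps, String key) {
        return maps.stream()
                .map(m -> m.get(key))
                .filter(Objects::nonNull)
                .collect(Collectors.toList());
    }

    // Sample data: one map stores null under "somekey", the other stores "x".
    static List<Map<String, String>> sample() {
        Map<String, String> withNull = new HashMap<>();
        withNull.put("somekey", null);
        Map<String, String> withValue = new HashMap<>();
        withValue.put("somekey", "x");
        List<Map<String, String>> maps = new ArrayList<>();
        maps.add(withNull);
        maps.add(withValue);
        return maps;
    }

    public static void main(String[] args) {
        System.out.println(byContainsKey(sample(), "somekey"));
        System.out.println(byNonNull(sample(), "somekey"));
    }
}
```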
Based on the Map documentation, and on your comment under your question, you're not actually getting an exception from a.get(key). Rather, that expression produces a null value, and you're having problems later when you run into these null values. So simply filtering out these null values right away should work just fine:
aMapList.stream()
.map(a -> a.get(key))
.filter(v -> v != null)
.collect(Collectors.toList());
This is prettier, simpler, and performs better than the workaround in your question.
I should mention that I usually prefer the Optional<> type when dealing with null values, but this filtering approach works better in this case since you specifically said you wanted to ignore elements where the key doesn't exist in a map list.
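Since the question ultimately averages the values, the same null filter can feed an averaging collector. The sketch below assumes the String values hold decimal integers (the question does not say so) and uses a made-up class name, AverageByKey.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

final class AverageByKey {
    // Averages the values stored under key, skipping maps where the lookup yields null.
    // Assumption: every non-null value parses as a decimal int.
    static double average(List<Map<String, String>> maps, String key) {
        return maps.stream()
                .map(m -> m.get(key))
                .filter(Objects::nonNull)
                .collect(Collectors.averagingInt(Integer::parseInt));
    }

    public static void main(String[] args) {
        List<Map<String, String>> maps = List.of(
                Map.of("somekey", "2"),
                Map.of("otherkey", "9"),
                Map.of("somekey", "4"));
        System.out.println(average(maps, "somekey"));
    }
}
```

Note that averagingInt returns 0.0 when no element survives the filter, so an empty result cannot be told apart from a genuine average of zero.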
The simplest I could come up with was:
aMapList.stream()
.filter(map -> map.containsKey(key))
.map(map -> map.get(key))
.collect(Collectors.toList());
By formatting the lambda in this fashion, it is easier to see the distinct steps that the code processes.
Although I reckon this is not exactly a prettier approach, you could do:
aMapList.stream().map(a -> a.containsKey(key) ? a.get(key) : null).collect(Collectors.toList());
In the following code I am trying to remove all nodes and leaves that do not have a root in the key of input map. Input is the Map<rootId: String, listOf(root,nodes,leaves)>
Working logic
@NotNull
private static Map<String, List<Element>> removeOrphanNodes(Map<String, List<Element>> mapOfAllProcesses) {
Map<String,List<Element>> refinedRootMap= new HashMap<>();
for(Map.Entry<String,List<Element>>entrySet: mapOfAllProcesses.entrySet())
{
if(entrySet.getValue().size()>1)
refinedRootMap.put(entrySet.getKey(),entrySet.getValue());
else {
Element loneElement = entrySet.getValue().get(0);
if (entrySet.getKey().equals(loneElement.getIdAsString()))
refinedRootMap.put(entrySet.getKey(),entrySet.getValue());
else if(loneElement.getCurrentOperations()!=null && loneElement.getCurrentOperations().iterator().next().getId().toHexString().equals(entrySet.getKey()))
refinedRootMap.put(entrySet.getKey(),entrySet.getValue());
}
}
return refinedRootMap;
}
The above code works as expected. I wanted to use streams to achieve the same functionality, but getCurrentOperations() can return null, which causes a NullPointerException.
My Attempt
return mapOfAllProcesses.entrySet().stream().filter(entry -> entry.getValue().size()>1 || entry.getValue().stream()
.anyMatch(
element-> element.getIdAsString().equals(entry.getKey())||element.getCurrentOperations().stream().findFirst().get().getId().toHexString().equals(entry.getKey())
)).collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
Don't do this.
The Stream API is not a replacement for loops or conventional iteration constructs.
Ideal code for putting into a Stream would be something that:
Does one function (reads data out, loads data in)
Could be parallelized
Has no side-effects (e.g. doesn't reach out to anything else)
Your code satisfies the last bullet, I'm not sure about the middle bullet and it definitely does a lot more based on conditions, which...isn't ideal for streaming.
Maybe a better way to approach this problem would be to re-think the data structure you're using? You're using a Map<K, List<V>>, and that can be contextualized inside of a Guava Multimap. Maybe that's where the first improvement needs to happen - using a more suitable data structure for this instead?
To avoid an NPE when the collection returned by element.getCurrentOperations() is null, you might use Stream.ofNullable(). A dummy default value when extracting the result from the optional helps when this collection is empty (and consequently the optional is empty).
return mapOfAllProcesses.entrySet().stream()
.filter(entry -> entry.getValue().size() > 1 ||
entry.getValue().stream()
.anyMatch(
element -> element.getIdAsString().equals(entry.getKey())
||
Stream.ofNullable(element.getCurrentOperations()).flatMap(ops -> ops.stream()).findFirst()
.map(operation -> operation.getId().toHexString())
.orElse("").equals(entry.getKey())
))
.collect(Collectors.toMap(
Map.Entry::getKey,
Map.Entry::getValue
));
It is worth noting that both snippets (imperative and functional) are extremely convoluted, so it would be wise to consider extracting some pieces of functionality into separate methods.
This should do it, but I don't think there's any advantage to be gained this way. It's slightly shorter, but not really any simpler, and definitely not any clearer.
@NotNull
private static Map<String, List<Element>> removeOrphanNodes(Map<String, List<Element>> mapOfAllProcesses) {
return mapOfAllProcesses.entrySet().stream().filter((entry) ->
(entry.getValue().size() > 1)
|| entry.getKey().equals(entry.getValue().get(0).getIdAsString())
|| Optional.ofNullable(entry.getValue().get(0).getCurrentOperations())
.stream()
.map((ops) -> ops.iterator().next().getId().toHexString())
.anyMatch((s) -> s.equals(entry.getKey()))
).collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
}
This version handles the fact that getCurrentOperations() can return null by capturing the return value as an Optional. The Optional's stream will be empty when getCurrentOperations() returns null, and anyMatch() returns false on an empty stream.
The map() of the inner stream could be omitted in favor of a more complex predicate for its anyMatch(), but I think the version with map() is clearer.
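The Optional-to-stream trick can be tried in isolation. The sketch below is a simplified stand-in that uses a Collection<String> of ids in place of the question's operations; the class and method names are hypothetical. It also adds an emptiness guard, which the answer's version does not have (there, iterator().next() would throw on an empty collection).

```java
import java.util.Collection;
import java.util.List;
import java.util.Optional;

final class NullSafeFirstMatch {
    // True only if ids is non-null, non-empty, and its first element equals key.
    static boolean firstIdMatches(Collection<String> ids, String key) {
        return Optional.ofNullable(ids)          // empty Optional when ids is null
                .stream()                        // zero- or one-element stream (Java 9+)
                .filter(c -> !c.isEmpty())       // extra guard, not in the answer above
                .map(c -> c.iterator().next())
                .anyMatch(key::equals);          // false on an empty stream
    }

    public static void main(String[] args) {
        System.out.println(firstIdMatches(null, "a"));
        System.out.println(firstIdMatches(List.of("a", "b"), "a"));
    }
}
```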
Do Java streams have a convenient way to map based upon a predicate, but if the predicate is not met to map to some other value?
Let's say I have Stream.of("2021", "", "2023"). I want to map that to Stream.of(Optional.of(Year.of(2021)), Optional.empty(), Optional.of(Year.of(2023))). Here's one way I could do that:
Stream<String> yearStrings = Stream.of("2021", "", "2023");
Stream<Optional<Year>> yearsFound = yearStrings.map(yearString ->
!yearString.isEmpty() ? Year.parse(yearString) : null)
.map(Optional::ofNullable);
But here is what I would like to do, using a hypothetical filter-map:
Stream<String> yearStrings = Stream.of("2021", "", "2023");
Stream<Optional<Year>> yearsFound = yearStrings.mapIfOrElse(not(String::isEmpty),
Year::parse, null).map(Optional::ofNullable);
Of course I can write my own mapIfOrElse(Predicate<>, Function<>, T) function to use with Stream.map(), but I wanted to check if there is something similar in Java's existing arsenal that I've missed.
There isn't a much better way of doing it than what you have. It might be nicer if you extracted it to a method, but that's really it.
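For what such an extracted method could look like, here is one possible sketch of the hypothetical mapIfOrElse from the question, written as a factory for a Function that can be passed to Stream.map(). Nothing like it exists in the JDK; the name, signature, and the surrounding class are assumptions for illustration.

```java
import java.time.Year;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class ConditionalMapping {
    // Hypothetical helper: applies mapper when condition holds, otherwise yields the fallback.
    static <T, R> Function<T, R> mapIfOrElse(Predicate<? super T> condition,
                                             Function<? super T, ? extends R> mapper,
                                             R otherwise) {
        return value -> condition.test(value) ? mapper.apply(value) : otherwise;
    }

    // Parses non-empty strings to Year and maps empty strings to Optional.empty().
    static List<Optional<Year>> parseYears(Stream<String> yearStrings) {
        // The typed local variable lets the compiler infer T = String and R = Year.
        Function<String, Year> parseOrNull = mapIfOrElse(s -> !s.isEmpty(), Year::parse, null);
        return yearStrings
                .map(parseOrNull)
                .map(Optional::ofNullable)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(parseYears(Stream.of("2021", "", "2023")));
    }
}
```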
Another way might be to construct Optionals from all values, and then use Optional.filter to map empty values to empty optionals:
yearStrings.map(Optional::of)
.map(opt -> opt.filter(Predicate.not(String::isEmpty)));
Is this better? Probably not.
Yet another way would be to make use of something like Guava's Strings.emptyToNull (other libraries are available), which turns your empty strings into null first; and then use Optional.ofNullable to turn non-nulls and nulls into non-empty and empty Optionals, respectively:
yearStrings.map(Strings::emptyToNull)
.map(Optional::ofNullable);
You can simply use filter to validate and only then map. Note that this drops the empty strings instead of mapping them to empty Optionals:
Stream<Year> yearsFound = yearStrings.filter(yearString -> !yearString.isEmpty()).map(Year::parse);
It's hardly possible to combine all these actions smoothly and readably within a single stream operation.
Here's a weird method-chaining with Java 16 mapMulti():
Stream<Optional<Year>> yearsFound = yearStrings
.mapMulti((yearString, consumer) ->
Optional.of(yearString).filter(s -> !s.isEmpty()).map(Year::parse)
.ifPresentOrElse(year -> consumer.accept(Optional.of(year)),
() -> consumer.accept(Optional.empty()))
);
I am generating a power set (Set<Set<Integer>>) from an original set (Set<Integer>).
i.e. {1, 2, 3} -> { {}, {1}, {2}, {3}, {1,2}, {2,3}, {1,3}, {1,2,3} }
Then I am using an isClique(Set<Integer>) method that returns a boolean if the given set is a clique in the adjacency matrix I am using.
I want to use a java stream to parallelize this operation and return the largest subset that is also a clique.
I am thinking something like this, but every variation I come up with causes a variety of compilation errors.
Optional result = powerSet.stream().parallel().
filter(e ->{return(isClique(e));}).
collect(Collectors.maxBy(Comparator Set<Integer> comparator));
I either get:
MaxClique.java:86: error: incompatible types: Stream<Set<Integer>> cannot be converted to Set<Integer>
currentMax = powerSet.stream().parallel().filter(e -> { return (isClique(e));});//.collect(Collectors.maxBy(Comparator <Set<Integer>> comparator));
or something related to the comparator (which I'm not sure I'm doing correctly).
Please advise, thanks.
You have some syntax problems. But beside that, you can compute the same optional using:
Optional<Set<Integer>> result = powerSet.stream().parallel()
.filter(e -> isClique(e))
.collect(
Collectors.maxBy(
(set1, set2) -> Integer.compare(set1.size(), set2.size())
)
);
This is filtering based on your condition, then pulling the max value based on a comparator that compares set sizes.
Your major issue is using the wrong syntax for the comparator. Rather, you'd want something along the lines of:
Optional<Set<Integer>> resultSet =
powerSet.stream()
.parallel()
.filter(e -> isClique(e))
.max(Comparator.comparingInt(Set::size));
Note the use of the max method as opposed to maxBy. This is because maxBy is typically used as a downstream collector; in fact, the real motivation for it to exist is to be used as a downstream collector.
Also, note the use of Optional<Set<Integer>> as the receiver type, as opposed to Optional as in your example code snippet. The latter is a raw type, and you should avoid using raw types unless there is no choice.
Lastly, but not least, if you haven't already done so then I'd suggest you try executing the code sequentially first and if you think you can benefit from parallel streams then you can proceed with the current approach.
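Putting the pieces together, a self-contained sketch might look like the following. The power-set generator, the boolean adjacency matrix, and the sample graph are all assumptions made for illustration; the question does not show its own isClique or data.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class LargestClique {
    // Builds every subset of items via bit masks (fine for small inputs only).
    static Set<Set<Integer>> powerSet(List<Integer> items) {
        Set<Set<Integer>> result = new HashSet<>();
        for (int mask = 0; mask < (1 << items.size()); mask++) {
            Set<Integer> subset = new HashSet<>();
            for (int i = 0; i < items.size(); i++) {
                if ((mask & (1 << i)) != 0) {
                    subset.add(items.get(i));
                }
            }
            result.add(subset);
        }
        return result;
    }

    // A set is a clique if every pair of distinct nodes is adjacent.
    static boolean isClique(Set<Integer> nodes, boolean[][] adjacent) {
        for (int a : nodes) {
            for (int b : nodes) {
                if (a != b && !adjacent[a][b]) {
                    return false;
                }
            }
        }
        return true;
    }

    // Filters the subsets down to cliques and picks one of maximum size.
    static Optional<Set<Integer>> largest(Set<Set<Integer>> powerSet, boolean[][] adjacent) {
        return powerSet.stream()
                .parallel()
                .filter(s -> isClique(s, adjacent))
                .max(Comparator.comparingInt(Set::size));
    }

    // Sample graph on nodes 0..3: a triangle 0-1-2 plus the edge 2-3.
    static boolean[][] sampleGraph() {
        boolean[][] adjacent = new boolean[4][4];
        int[][] edges = {{0, 1}, {1, 2}, {0, 2}, {2, 3}};
        for (int[] e : edges) {
            adjacent[e[0]][e[1]] = true;
            adjacent[e[1]][e[0]] = true;
        }
        return adjacent;
    }

    public static void main(String[] args) {
        System.out.println(largest(powerSet(List.of(0, 1, 2, 3)), sampleGraph()));
    }
}
```

If several cliques share the maximum size, max() returns one of them without any guarantee about which, so callers should not depend on a particular winner.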
This question already has answers here:
Why filter() after flatMap() is "not completely" lazy in Java streams?
Consider the following code:
urls.stream()
.flatMap(url -> fetchDataFromInternet(url).stream())
.filter(...)
.findFirst()
.get();
Will fetchDataFromInternet be called for second url when the first one was enough?
I tried with a smaller example and it looks like working as expected. i.e processes data one by one but can this behavior be relied on? If not, does calling .sequential() before .flatMap(...) help?
Stream.of("one", "two", "three")
.flatMap(num -> {
System.out.println("Processing " + num);
// return FetchFromInternetForNum(num).data().stream();
return Stream.of(num);
})
.peek(num -> System.out.println("Peek before filter: "+ num))
.filter(num -> num.length() > 0)
.peek(num -> System.out.println("Peek after filter: "+ num))
.forEach(num -> {
System.out.println("Done " + num);
});
Output:
Processing one
Peek before filter: one
Peek after filter: one
Done one
Processing two
Peek before filter: two
Peek after filter: two
Done two
Processing three
Peek before filter: three
Peek after filter: three
Done three
Update: Using official Oracle JDK8 if that matters on implementation
Answer:
Based on the comments and the answers below, flatmap is partially lazy. i.e reads the first stream fully and only when required, it goes for next. Reading a stream is eager but reading multiple streams is lazy.
If this behavior is intended, the API should let the function return an Iterable instead of a stream.
In other words: link
Under the current implementation, flatmap is eager; like any other stateful intermediate operation (like sorted and distinct). And it's very easy to prove :
int result = Stream.of(1)
.flatMap(x -> Stream.generate(() -> ThreadLocalRandom.current().nextInt()))
.findFirst()
.get();
System.out.println(result);
This never finishes as flatMap is computed eagerly. For your example:
urls.stream()
.flatMap(url -> fetchDataFromInternet(url).stream())
.filter(...)
.findFirst()
.get();
It means that for each url, the flatMap will block all other operations that come after it, even if you care about a single one. So suppose that, from a single url, your fetchDataFromInternet(url) generates 10_000 lines: your findFirst will have to wait for all 10_000 to be computed, even if you care about only one.
EDIT
This is fixed in Java 10, where we get our laziness back: see JDK-8075939
EDIT 2
This is fixed in Java 8 too (8u222): JDK-8225328
It’s not clear why you set up an example that does not address the actual question you’re interested in. If you want to know whether the processing is lazy when applying a short-circuiting operation like findFirst(), then use an example with findFirst() instead of forEach, which processes all elements anyway. Also, put the logging statement right into the function whose evaluation you want to track:
Stream.of("hello", "world")
.flatMap(s -> {
System.out.println("flatMap function evaluated for \""+s+'"');
return s.chars().boxed();
})
.peek(c -> System.out.printf("processing element %c%n", c))
.filter(c -> c>'h')
.findFirst()
.ifPresent(c -> System.out.printf("found an %c%n", c));
flatMap function evaluated for "hello"
processing element h
processing element e
processing element l
processing element l
processing element o
found an l
This demonstrates that the function passed to flatMap gets evaluated lazily as expected while the elements of the returned (sub-)stream are not evaluated as lazy as possible, as already discussed in the Q&A you have linked yourself.
So, regarding your fetchDataFromInternet method that gets invoked from the function passed to flatMap, you will get the desired laziness. But not for the data it returns.
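One way to check this claim without reading console output is to count how often the flatMap function runs. The sketch below uses a counter and a fake two-element sub-stream in place of fetchDataFromInternet; the class and method names are illustrative assumptions.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

final class FlatMapLaziness {
    // Counts how many times the flatMap function runs before findFirst() is satisfied.
    static int mapperCallsUntilFirst(List<String> urls) {
        AtomicInteger calls = new AtomicInteger();
        urls.stream()
                .flatMap(url -> {
                    calls.incrementAndGet();                    // stands in for fetchDataFromInternet(url)
                    return Stream.of(url + "#1", url + "#2");   // fake fetched data
                })
                .findFirst();
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println(mapperCallsUntilFirst(List.of("a", "b", "c")));
    }
}
```

On a sequential stream the counter stays at one for a non-empty list, since the first sub-stream already supplies an element; how many elements of that sub-stream get evaluated is the part that depends on the JDK version.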
Today I also stumbled upon this bug. The behavior is not so straightforward, because a simple case like the one below works fine, but similar production code doesn't.
stream(spliterator).map(o -> o).flatMap(Stream::of)..flatMap(Stream::of).findAny()
For those who cannot wait another couple of years for the migration to JDK 10, there is an alternative, truly lazy stream. It doesn't support parallel execution. It was written for JavaScript translation, but it worked out for me, because the interface is the same.
StreamHelper is collection based, but it is easy to adapt Spliterator.
https://github.com/yaitskov/j4ts/blob/stream/src/main/java/javaemul/internal/stream/StreamHelper.java
I am trying to print the values from a .stream() using two .filter() calls, but nothing is printed.
With one .filter() I am able to print the values.
Please find my code below.
listProducts.stream()
.flatMap(listproducts -> listproducts.getProductAttr().stream())
.flatMap(attr ->attr.getProductAttrValue().stream())
.filter(av -> av.getLabel().equalsIgnoreCase("source"))
.filter(av -> av.getLabel().equalsIgnoreCase("description"))
.forEachOrdered(av -> System.out.println(av.getValue()));
No element of your Stream can pass the Predicates passed to both of your filter calls, since av.getLabel() can't be equal to both "source" and "description" at the same time.
You can use a single filter instead:
.filter(av -> av.getLabel().equalsIgnoreCase("source") ||
av.getLabel().equalsIgnoreCase("description"))
.filter(av -> Pattern.matches("(?i)source|description", av.getLabel()))
You are keeping only "source" strings (ignoring the case) after the first filtering.
The second filter kicks away the previous results.
You should build a composite boolean expression within one filter.
I suggest writing that simple regexp.*
*It can be improved by precompiling the pattern, as @daniu suggested.
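A precompiled variant might look like the sketch below. It works on plain label strings instead of the question's attribute-value objects, and the class and constant names are made up for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class LabelFilter {
    // Compiled once and reused for every element; CASE_INSENSITIVE replaces the inline (?i) flag.
    private static final Pattern WANTED =
            Pattern.compile("source|description", Pattern.CASE_INSENSITIVE);

    // Keeps only labels that match "source" or "description" in full, ignoring case.
    static List<String> keepWanted(List<String> labels) {
        return labels.stream()
                .filter(label -> WANTED.matcher(label).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(keepWanted(List.of("Source", "price", "DESCRIPTION", "sources")));
    }
}
```

Pattern.matches(regex, input) recompiles the expression on every call, which is why hoisting the Pattern into a constant helps when the filter runs over many elements.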