I very much want to use Map.computeIfAbsent but it has been too long since lambdas in undergrad.
The docs give an example of the old way to do things, which I have adapted almost directly:
Map<String, Boolean> whoLetDogsOut = new ConcurrentHashMap<>();
String key = "snoop";
if (whoLetDogsOut.get(key) == null) {
    Boolean isLetOut = tryToLetOut(key);
    if (isLetOut != null)
        whoLetDogsOut.putIfAbsent(key, isLetOut);
}
And the new way:
map.computeIfAbsent(key, k -> new Value(f(k)));
But in their example, I think I'm not quite "getting it." How would I transform the code to use the new lambda way of expressing this?
Recently I was playing with this method too. I wrote a memoized algorithm to calculate Fibonacci numbers, which could serve as another illustration of how to use the method.
We can start by defining a map and putting the values in it for the base cases, namely fibonacci(0) and fibonacci(1):
private static Map<Integer,Long> memo = new HashMap<>();
static {
memo.put(0,0L); //fibonacci(0)
memo.put(1,1L); //fibonacci(1)
}
And for the inductive step all we have to do is redefine our Fibonacci function as follows:
public static long fibonacci(int x) {
return memo.computeIfAbsent(x, n -> fibonacci(n-2) + fibonacci(n-1));
}
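As you can see, computeIfAbsent uses the provided lambda expression to calculate the Fibonacci number when it is not already present in the map. Because each value is computed only once, this avoids the exponential number of calls made by the traditional tree-recursive algorithm. One caveat: the Map documentation says the mapping function should not modify the map during computation, and since Java 9 HashMap.computeIfAbsent may throw a ConcurrentModificationException when it detects such a change, so this recursive form is best treated as an illustration.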
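Since the mapping function is not supposed to modify the map it is being called on, here is a minimal sketch of the same memoization with the lookup and the update kept outside of computeIfAbsent. The class name SafeFibonacci is made up for illustration, and the sketch assumes n is not negative.

```java
import java.util.HashMap;
import java.util.Map;

class SafeFibonacci {
    private static final Map<Integer, Long> memo = new HashMap<>();
    static {
        memo.put(0, 0L); // fibonacci(0)
        memo.put(1, 1L); // fibonacci(1)
    }

    // Assumes n >= 0; negative input is not handled in this sketch.
    static long fibonacci(int n) {
        Long cached = memo.get(n);
        if (cached != null) {
            return cached;
        }
        // The recursive calls finish (and store their results) before this
        // frame writes to the map, so no map operation is interrupted.
        long value = fibonacci(n - 2) + fibonacci(n - 1);
        memo.put(n, value);
        return value;
    }

    public static void main(String[] args) {
        System.out.println(fibonacci(10)); // 55
        System.out.println(fibonacci(50)); // 12586269025
    }
}
```

This loses the one-liner, but it behaves the same on every Java version.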
Suppose you have the following code:
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
public class Test {
public static void main(String[] s) {
Map<String, Boolean> whoLetDogsOut = new ConcurrentHashMap<>();
whoLetDogsOut.computeIfAbsent("snoop", k -> f(k));
whoLetDogsOut.computeIfAbsent("snoop", k -> f(k));
}
static boolean f(String s) {
System.out.println("creating a value for \""+s+'"');
return s.isEmpty();
}
}
Then you will see the message creating a value for "snoop" exactly once, because on the second invocation of computeIfAbsent there is already a value for that key. The k in the lambda expression k -> f(k) is just a placeholder (parameter) for the key, which the map will pass to your lambda for computing the value. So in the example the key is passed to the function invocation.
Alternatively you could write: whoLetDogsOut.computeIfAbsent("snoop", k -> k.isEmpty()); to achieve the same result without a helper method (but you won’t see the debugging output then). And even simpler, as it is a simple delegation to an existing method you could write: whoLetDogsOut.computeIfAbsent("snoop", String::isEmpty); This delegation does not need any parameters to be written.
To be closer to the example in your question, you could write it as whoLetDogsOut.computeIfAbsent("snoop", key -> tryToLetOut(key)); (it doesn’t matter whether you name the parameter k or key). Or write it as whoLetDogsOut.computeIfAbsent("snoop", MyClass::tryToLetOut); if tryToLetOut is static or whoLetDogsOut.computeIfAbsent("snoop", this::tryToLetOut); if tryToLetOut is an instance method.
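To make the transformation concrete, here is a small sketch that puts the old and the new style side by side. The body of tryToLetOut is a made-up stand-in (the question does not show it), and the class name DogsOut is hypothetical. The point to notice is that when the mapping function returns null, computeIfAbsent stores nothing, which matches the `if (isLetOut != null)` check in the old code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class DogsOut {
    static final Map<String, Boolean> whoLetDogsOut = new ConcurrentHashMap<>();

    // Hypothetical stand-in: returns null when no answer can be computed.
    static Boolean tryToLetOut(String name) {
        return name.isEmpty() ? null : name.startsWith("s");
    }

    // Old style: check, compute, then store only a non-null result.
    static Boolean oldWay(String key) {
        Boolean current = whoLetDogsOut.get(key);
        if (current == null) {
            Boolean isLetOut = tryToLetOut(key);
            if (isLetOut != null) {
                Boolean previous = whoLetDogsOut.putIfAbsent(key, isLetOut);
                current = (previous != null) ? previous : isLetOut;
            }
        }
        return current;
    }

    // New style: a null result from the function leaves the map unchanged.
    static Boolean newWay(String key) {
        return whoLetDogsOut.computeIfAbsent(key, DogsOut::tryToLetOut);
    }

    public static void main(String[] args) {
        System.out.println(newWay("snoop")); // true
        System.out.println(newWay(""));      // null, and no entry is stored
        System.out.println(whoLetDogsOut);   // {snoop=true}
    }
}
```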
Another example. When building a complex map of maps, the computeIfAbsent() method is a replacement for map's get() method. Through chaining of computeIfAbsent() calls together, missing containers are constructed on-the-fly by provided lambda expressions:
// Stores regional movie ratings
Map<String, Map<Integer, Set<String>>> regionalMovieRatings = new TreeMap<>();
// This will throw NullPointerException!
regionalMovieRatings.get("New York").get(5).add("Boyhood");
// This will work
regionalMovieRatings
.computeIfAbsent("New York", region -> new TreeMap<>())
.computeIfAbsent(5, rating -> new TreeSet<>())
.add("Boyhood");
multi-map
This is really helpful if you want to create a multimap without resorting to the Google Guava library for its implementation of Multimap.
For example, suppose you want to store a list of students who enrolled for a particular subject.
The normal solution for this using JDK library is:
Map<String,List<String>> studentListSubjectWise = new TreeMap<>();
List<String> lis = studentListSubjectWise.get("a");
if (lis == null) {
    lis = new ArrayList<>();
    studentListSubjectWise.put("a", lis);
}
lis.add("John");
//continue....
Since this involves some boilerplate code, people tend to use Guava's Multimap.
Using Map.computeIfAbsent, we can write it in a single line without Guava's Multimap, as follows.
studentListSubjectWise.computeIfAbsent("a", (x -> new ArrayList<>())).add("John");
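As a slightly fuller sketch of the same idea, the one-liner can be wrapped in a small class. The names Enrollments, enroll and studentsOf are invented for this example. Reading goes through getOrDefault so that a lookup for an unknown subject does not create an empty list in the map.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class Enrollments {
    final Map<String, List<String>> studentsBySubject = new TreeMap<>();

    // Creates the list on first use, then appends to it.
    void enroll(String subject, String student) {
        studentsBySubject.computeIfAbsent(subject, s -> new ArrayList<>()).add(student);
    }

    // Read-only lookup; does not add an entry for unknown subjects.
    List<String> studentsOf(String subject) {
        return studentsBySubject.getOrDefault(subject, Collections.emptyList());
    }

    public static void main(String[] args) {
        Enrollments e = new Enrollments();
        e.enroll("math", "John");
        e.enroll("math", "Jane");
        e.enroll("art", "John");
        System.out.println(e.studentsBySubject);    // {art=[John], math=[John, Jane]}
        System.out.println(e.studentsOf("physics")); // []
    }
}
```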
Stuart Marks & Brian Goetz did a good talk about this
https://www.youtube.com/watch?v=9uTVXxJjuco
I came up with this comparison example (old vs. new), which demonstrates both approaches:
static Map<String, Set<String>> playerSkills = new HashMap<>();
public static void main(String[] args) {
//desired output
//player1, cricket, baseball
//player2, swimming
//old way
add("Player1","cricket");
add("Player2","swimming");
add("Player1","baseball");
System.out.println(playerSkills);
//clear
playerSkills.clear();
//new
addNew("Player1","cricket");
addNew("Player2","swimming");
addNew("Player1","baseball");
System.out.println(playerSkills);
}
private static void add(String name, String skill) {
Set<String> skills = playerSkills.get(name);
if(skills==null) {
skills= new HashSet<>();
playerSkills.put(name, skills);
}
skills.add(skill);
}
private static void addNew(String name, String skill) {
playerSkills
.computeIfAbsent(name, set -> new HashSet<>())
.add(skill);
}
I'm trying to use the Java summarizing collectors, which I have just discovered and which are perfect for my use case.
The only issue is that I can't make them work when I need to summarize multiple fields:
final Map<PackageType, LongSummaryStatistics> map2 = artifactory.getStorageInfo()
.getRepositorySummaries()
.stream()
.map(o -> RepositorySummaryValue.from(o))
.collect(groupingBy(
k -> k.getPackageType(),
summarizingLong(k -> k.filesCount)
));
this is my RepositorySummaryValue class:
@lombok.Value
@Builder(builderClassName = "Builder")
private static class RepositorySummaryValue {
long filesCount;
@NonNull
PackageType packageType;
@NonNull
String key;
@NonNull
RepositoryType type;
long usedSpaceBytes;
@SneakyThrows
static RepositorySummaryValue from(RepositorySummary source) {
return builder()
.filesCount(source.getFilesCount())
.packageType(source.getPackageType())
.key(source.getKey())
.type(source.getType())
.usedSpaceBytes(source.getUsedSpaceBytes())
.build();
}
}
What I want is to also get a summary for summarizingLong(k -> k.usedSpaceBytes).
Is there any way to do that?
=========EDIT============
I'm using Java 8
Here's an idea that, with a little setup, would allow any number of fields of some type T to be summarized. It may give you some ideas. It makes use of the compute* methods that were added to Map in Java 8.
I used a record in lieu of a class to hold the data. Records require Java 16 or later; on Java 8 a plain class with the same accessor methods would serve.
First, set up a list of method references to get the values you want.
Then, initialize another list with the names of those values (here I used your names, but the data is just for demo). Note: There are many ways to do this. A single list with a record holding the name and method reference would be another way.
Create a map of maps to house the results. The key to the outer map is the packageType; the key to the inner map is the name of the field you are summarizing for each packageType.
Now simply iterate over the list of data and build the map. The details of how the compute* methods work are explained in the Map interface JavaDoc, but here is a quick summary.
computeIfAbsent evaluates its second argument (the mapping function) only if the supplied key is not there, stores the result, and returns the value now associated with the key (here, the inner map). Another computeIfAbsent is then called on that inner map to see whether it has an entry for the field name. If not, it creates a LongSummaryStatistics instance and stores it under that name. Either way, the instance is returned, and the data item's value is passed to its accept method, which updates the statistics.
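import java.util.HashMap;
import java.util.List;
import java.util.LongSummaryStatistics;
import java.util.Map;
import java.util.Map.Entry;
import java.util.function.Function;
public class Summarizing {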
record Data(String getPackageType, long getFilesCount,
long getUsedSpaceBytes) {
}
static List<Function<Data, Long>> summaryFields = List
.of(Data::getFilesCount, Data::getUsedSpaceBytes);
static List<String> names = List.of("filesCount", "usedSpaceBytes");
public static void main(String[] args) {
List<Data> list = List.of(new Data("foo", 10, 100),
new Data("bar", 20, 200),
new Data("foo", 30, 300),
new Data("bar", 40, 400));
Map<String, Map<String, LongSummaryStatistics>> result = new HashMap<>();
for (Data d : list) {
Map<String, LongSummaryStatistics> innerMap = result
.computeIfAbsent(d.getPackageType(),
v -> new HashMap<>());
for (int i = 0; i < summaryFields.size(); i++) {
innerMap.computeIfAbsent(names.get(i),
v -> new LongSummaryStatistics())
.accept(summaryFields.get(i)
.apply(d));
}
}
result.entrySet().forEach(e-> {
System.out.println(e.getKey());
for (Entry<?,?> ee : e.getValue().entrySet()) {
System.out.println(" " + ee);
}
});
}
}
prints
bar
filesCount=LongSummaryStatistics{count=2, sum=60, min=20, average=30.000000, max=40}
usedSpaceBytes=LongSummaryStatistics{count=2, sum=600, min=200, average=300.000000, max=400}
foo
filesCount=LongSummaryStatistics{count=2, sum=40, min=10, average=20.000000, max=30}
usedSpaceBytes=LongSummaryStatistics{count=2, sum=400, min=100, average=200.000000, max=300}
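Since the question mentions Java 8, here is a sketch of a different technique that stays within Java 8 and within groupingBy: a small custom Collector whose accumulator holds one LongSummaryStatistics per field. The classes Repo and Stats are made up for illustration (Repo stands in for RepositorySummaryValue, with a String package type for brevity).

```java
import java.util.Arrays;
import java.util.List;
import java.util.LongSummaryStatistics;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class TwoFieldSummary {
    // Hypothetical stand-in for RepositorySummaryValue.
    static class Repo {
        final String packageType;
        final long filesCount;
        final long usedSpaceBytes;
        Repo(String packageType, long filesCount, long usedSpaceBytes) {
            this.packageType = packageType;
            this.filesCount = filesCount;
            this.usedSpaceBytes = usedSpaceBytes;
        }
    }

    // Mutable accumulator: one statistics object per summarized field.
    static class Stats {
        final LongSummaryStatistics files = new LongSummaryStatistics();
        final LongSummaryStatistics bytes = new LongSummaryStatistics();
        void accept(Repo r) {
            files.accept(r.filesCount);
            bytes.accept(r.usedSpaceBytes);
        }
        Stats combine(Stats other) {
            files.combine(other.files);
            bytes.combine(other.bytes);
            return this;
        }
    }

    static Map<String, Stats> summarize(List<Repo> repos) {
        Collector<Repo, Stats, Stats> bothFields =
                Collector.of(Stats::new, Stats::accept, Stats::combine);
        return repos.stream()
                .collect(Collectors.groupingBy(r -> r.packageType, bothFields));
    }

    public static void main(String[] args) {
        Map<String, Stats> result = summarize(Arrays.asList(
                new Repo("foo", 10, 100), new Repo("bar", 20, 200),
                new Repo("foo", 30, 300), new Repo("bar", 40, 400)));
        System.out.println(result.get("foo").files.getSum()); // 40
        System.out.println(result.get("bar").bytes.getMax()); // 400
    }
}
```

Adding a third field means adding one more LongSummaryStatistics to Stats and one line each to accept and combine.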
I am new to Java, and here is my problem.
I have a Map of type Map<Integer, List<MyObject>> that I call myMap.
As myMap has a lot of members (About 100000) , I don't think the for loop to be such a good idea so I wanna filter my Map<Integer, List<MyObject>> where the condition below holds:
myMap.get(i).get(every_one_of_them).a_special_attribute_of_my_MyObject == null;
Here every_one_of_them means that I want to delete the entries of myMap for which all members of the list (all of its objects) have null in that attribute (for convenience, let's call it myAttribute).
One incomplete idea of mine was something like this:
Map<Integer, List<MyObject>> collect = myMap.entrySet().stream()
.filter(x -> x.getValue.HERE_IS_WHERE_I_DO_NOT_KNOW_HOW_TO)
.collect(Collectors.toMap(x -> x.getKey(), x -> x.getValue()));
Any Help Will Be Highly Appreciated. Thanks.
You can iterate over the map's values() and remove the elements you don't want, using removeIf(Predicate condition).
To check whether all elements of a list fulfill some condition, you can use list.stream().allMatch(Predicate condition).
For instance, say we have a Map<Integer, List<String>> and we want to remove the lists in which all strings start with b or B. You can do it via
myMap.values()
.removeIf(list -> list.stream()
.allMatch(str -> str.toLowerCase().startsWith("b"))
// but in real application for better performance use
// .allMatch(str -> str.regionMatches(true, 0, "b", 0, 1))
);
DEMO:
Map<Integer , List<String>> myMap = new HashMap<>(Map.of(
1, List.of("Abc", "Ab"),
2, List.of("Bb", "Bc"),
3, List.of("Cc")
));
myMap.values()
.removeIf(list -> list.stream()
.allMatch(str -> str.toLowerCase().startsWith("b"))
);
System.out.println(myMap);
Output:
{1=[Abc, Ab], 3=[Cc]}
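Applied to the situation in the question, the same pattern might look like the sketch below. MyObject and its field myAttribute are guesses at the asker's class, and the method name is made up. Note that allMatch returns true for an empty list, so empty lists are removed as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RemoveAllNull {
    // Hypothetical shape of the asker's class.
    static class MyObject {
        final String myAttribute; // may be null
        MyObject(String myAttribute) {
            this.myAttribute = myAttribute;
        }
    }

    // Removes every entry whose list contains only objects with a null attribute.
    static void removeAllNullLists(Map<Integer, List<MyObject>> map) {
        map.values().removeIf(list ->
                list.stream().allMatch(o -> o.myAttribute == null));
    }

    public static void main(String[] args) {
        Map<Integer, List<MyObject>> myMap = new HashMap<>();
        myMap.put(1, Arrays.asList(new MyObject(null), new MyObject("x")));
        myMap.put(2, Arrays.asList(new MyObject(null), new MyObject(null)));
        myMap.put(3, new ArrayList<>());
        removeAllNullLists(myMap);
        System.out.println(myMap.keySet()); // [1]
    }
}
```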
As myMap has a lot of members (About 100000) , I don't think the for loop to be such a good idea so I wanna filter
That sounds like you think stream.filter is somehow faster than foreach. It's not; it's either slower or about as fast.
SPOILER: All the way at the end I do some basic performance tests, but I invite anyone to take that test and upgrade it to a full JMH test suite and run it on a variety of hardware. However - it says you're in fact exactly wrong, and foreach is considerably faster than anything involving streams.
Also, it sounds like you feel 100000 is a lot of entries. It mostly isn't. A foreach loop (or rather, an iterator) will be faster. Removing with the iterator will be considerably faster.
Parallelism can help you out here, and is simpler with streams, but you can't just slap a parallel() in there and trust that it'll just work out. It depends on the underlying types. For example, your plain jane j.u.HashMap isn't very good at this; something like a ConcurrentHashMap is far more capable. But if you take the time to copy over all data to a more suitable map type, well, in that timespan you could have done the entire job, and probably faster to boot! (Depends on how large those lists are.)
Step 1: Make an oracle
But, first things first, we need an oracle function: One that determines if a given entry ought to be deleted. No matter what solution you go with, this is required:
public boolean keep(List<MyObject> mo) {
for (MyObject obj : mo) if (obj.specialProperty != null) return true;
return false;
}
You could 'streamify' it:
public boolean keep(List<MyObject> mo) {
return mo.stream().anyMatch(o -> o.specialProperty != null);
}
Step 2: Filter the list
Once we have that, the task becomes easier:
var it = map.values().iterator();
while (it.hasNext()) if (!keep(it.next())) it.remove();
is now all you need. We can streamify that if you prefer, but note that you can't use streams to change a map 'in place', and copying over is usually considerably slower, so, this is likely slower and certainly takes more memory:
Map<Integer, List<MyObject>> result =
map.entrySet().stream()
.filter(e -> keep(e.getValue()))
.collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
Note also how the stream option doesn't generally result in significantly shorter code either. Don't make the decision between stream or non-stream based on notions that streams are inherently better, or lead to more readable code. Programming just isn't that simple, I'm afraid.
We can also use some of the more functional methods in map itself:
map.values().removeIf(v -> !keep(v));
That seems like the clear winner, here, although it's a bit bizarre we have to 'bounce' through values(); map itself has no removeIf method, but the collections returned by keySet, values, entrySet etc reflect any changes back to the map, so that works out.
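To illustrate that all three views write through to the map, here is a small sketch with made-up data; the class name MapViews and the removal rules are invented purely for the demonstration.

```java
import java.util.Map;
import java.util.TreeMap;

class MapViews {
    static Map<String, Integer> prune(Map<String, Integer> stock) {
        // Remove by key: the key set is a view, so the entries leave the map.
        stock.keySet().removeIf(name -> name.startsWith("tmp-"));
        // Remove by value.
        stock.values().removeIf(count -> count == 0);
        // Remove by key and value together.
        stock.entrySet().removeIf(e -> e.getKey().length() > 5 && e.getValue() < 2);
        return stock;
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = new TreeMap<>();
        stock.put("apple", 0);
        stock.put("pear", 3);
        stock.put("tmp-kiwi", 7);
        stock.put("banana", 1);
        System.out.println(prune(stock)); // {pear=3}
    }
}
```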
Let's performance test!
Performance testing is tricky and really requires using JMH for good results. By all means, as an exercise, do just that. But, let's just do a real quick scan:
import java.util.*;
import java.util.stream.*;
public class Test {
static class MyObj {
String foo;
}
public static MyObj hit() {
MyObj o = new MyObj();
o.foo = "";
return o;
}
public static MyObj miss() {
return new MyObj();
}
private static final int MAP_ELEMS = 100000;
private static final int LIST_ELEMS = 50;
private static final double HIT_OR_MISS = 0.01;
private static final Random rnd = new Random();
public static void main(String[] args) {
var map = construct();
long now = System.currentTimeMillis();
filter_seq(map);
long delta = System.currentTimeMillis() - now;
System.out.printf("Sequential: %.3f\n", 0.001 * delta);
map = construct();
now = System.currentTimeMillis();
filter_stream(map);
delta = System.currentTimeMillis() - now;
System.out.printf("Stream: %.3f\n", 0.001 * delta);
map = construct();
now = System.currentTimeMillis();
filter_removeIf(map);
delta = System.currentTimeMillis() - now;
System.out.printf("RemoveIf: %.3f\n", 0.001 * delta);
}
private static Map<Integer, List<MyObj>> construct() {
var m = new HashMap<Integer, List<MyObj>>();
for (int i = 0; i < MAP_ELEMS; i++) {
var list = new ArrayList<MyObj>();
for (int j = 0; j < LIST_ELEMS; j++) {
list.add(rnd.nextDouble() < HIT_OR_MISS ? hit() : miss());
}
m.put(i, list);
}
return m;
}
static boolean keep_seq(List<MyObj> list) {
for (MyObj o : list) if (o.foo != null) return true;
return false;
}
static boolean keep_stream(List<MyObj> list) {
return list.stream().anyMatch(o -> o.foo != null);
}
static void filter_seq(Map<Integer, List<MyObj>> map) {
var it = map.values().iterator();
while (it.hasNext()) if (!keep_seq(it.next())) it.remove();
}
static void filter_stream(Map<Integer, List<MyObj>> map) {
Map<Integer, List<MyObj>> result =
map.entrySet().stream()
.filter(e -> keep_stream(e.getValue()))
.collect(Collectors.toMap(e -> e.getKey(), e -> e.getValue()));
}
static void filter_removeIf(Map<Integer, List<MyObj>> map) {
map.values().removeIf(v -> !keep_stream(v));
}
}
This, reliably, on my hardware anyway, shows that the stream route is by far the slowest, and that the sequential option edges out the removeIf variant by some percent. Which just goes to show that your initial line (if I can take that as 'I think foreach is too slow') was entirely off the mark, fortunately.
For fun I replaced the map with a ConcurrentHashMap and made the stream parallel(). This did not change the timing significantly, and I wasn't really expecting it to.
A note about style
In various snippets, I omit braces for loops and if statements. If you add them, the non-stream-based code occupies considerably more lines, and if you include the indent whitespace for the insides of these constructs, considerably more 'surface area' of paste. However, that is a ridiculous thing to clue off of - that is tantamount to saying: "Actually, the commonly followed style guides for java are incredibly obtuse and badly considered. However, I dare not break them. Fortunately, lambdas came along and gave me an excuse to toss the entire principle of those style guides right out the window and now pile it all into a single, braceless line, and oh look, lambdas lead to shorter code!". I would assume any reader, armed with this knowledge, can easily pierce through such baloney argumentation. The reasons for those braces primarily involve easier debug breakpointing and easy ways to add additional actions to a given 'code node', and those needs are exactly as important, if not more so, if using streams. If it's okay to one-liner and go brace-free for lambdas, then surely it is okay to do the same to if and for bodies.
The definition of the BiFunction interface contains a method apply(T t, U u), which accepts two arguments. However, I don't understand the use or purpose of this interface and method. What do we need this interface for?
The problem with this question is that it's not clear whether you see the purpose of a Function, which has a method apply(T t).
The value of all the functional types is that you can pass code around like data. One common use of this is the callback, and until Java 8, we used to have to do this with anonymous class declarations:
ui.onClick(new ClickHandler() {
public void handleAction(Action action) {
// do something in response to a click, using `action`.
}
});
Now with lambdas we can do that much more tersely:
ui.onClick( action -> { /* do something with action */ });
We can also assign them to variables:
Consumer<Action> clickHandler = action -> { /* do something with action */ };
ui.onClick(clickHandler);
... and do the usual things we do with objects, like put them in collections:
Map<String,Consumer<Action>> handlers = new HashMap<>();
handlers.put("click", clickHandler);
A BiFunction is just this with two input parameters. Let's use what we've seen so far to do something useful with BiFunctions:
Map<String,BiFunction<Integer,Integer,Integer>> operators = new HashMap<>();
operators.put("+", (a,b) -> a + b);
operators.put("-", (a,b) -> a - b);
operators.put("*", (a,b) -> a * b);
...
// get a, b, op from ui
ui.output(operators.get(operator).apply(a,b));
One of usages of BiFunction is in the Map.merge method.
Here is an example usage of the Map.merge method, which takes a BiFunction as a parameter. If the key has no value, or its value is null, merge simply associates the key with the given value. Otherwise, it replaces the existing value with the result of applying the BiFunction to the old value and the given value.
HashMap<String, String> map = new HashMap<>();
map.put("1", null);
map.put("2", "Hello");
map.merge("1", "Hi", String::concat);
map.merge("2", "Hi", String::concat);
System.out.println(map.get("1")); // Hi
System.out.println(map.get("2")); // HelloHi
If a BiFunction were not used, you would have to write a lot more code, even spanning several lines.
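A common everyday use of merge is counting. The sketch below (class and method names are made up) counts words: the first occurrence stores 1, and every later occurrence applies Integer::sum, used here as the BiFunction, to the old count and 1.

```java
import java.util.Map;
import java.util.TreeMap;

class WordCount {
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                // Absent key: store 1. Present key: store oldCount + 1.
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("to be or not to be")); // {be=2, not=1, or=1, to=2}
    }
}
```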
Here is a link that shows all the usages of BiFunction in the JDK: https://docs.oracle.com/javase/8/docs/api/java/util/function/class-use/BiFunction.html
Go check it out!
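Another example is reduce(): the two-argument Stream.reduce takes a BinaryOperator, which is a BiFunction whose two arguments and result all have the same type: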
public static void main(String[] args) {
List<Integer> list = new ArrayList<>(Arrays.asList(5,5,10));
Integer reduce = list.stream().reduce(0, (v1,v2) -> v1+v2);
System.out.println(reduce); // result is: 20
}
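The three-argument overload of Stream.reduce takes a BiFunction directly, because there the running result and the stream elements may have different types. A minimal sketch (the class and method names are invented) that adds up the lengths of some strings:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;

class ReduceWithBiFunction {
    static int totalLength(List<String> words) {
        // Accumulator: (running total, next element) -> new running total.
        BiFunction<Integer, String, Integer> addLength =
                (sum, word) -> sum + word.length();
        // Combiner: merges partial totals (only used by parallel streams).
        BinaryOperator<Integer> combine = Integer::sum;
        return words.stream().reduce(0, addLength, combine);
    }

    public static void main(String[] args) {
        System.out.println(totalLength(Arrays.asList("a", "bb", "ccc"))); // 6
    }
}
```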
Using Java 8 lambdas, what's the "best" way to effectively create a new List<T> given a List<K> of possible keys and a Map<K,V>? This is the scenario where you are given a List of possible Map keys and are expected to generate a List<T> where T is some type that is constructed based on some aspect of V, the map value types.
I've explored a few and don't feel comfortable claiming one way is better than another (with maybe one exception -- see code). I'll clarify "best" as a combination of code clarity and runtime efficiency. These are what I came up with. I'm sure someone can do better, which is one aspect of this question. I don't like the filter aspect of most as it means needing to create intermediate structures and multiple passes over the names List. Right now, I'm opting for Example 6 -- a plain 'ol loop. (NOTE: Some cryptic thoughts are in the code comments, especially "need to reference externally..." This means external from the lambda.)
public class Java8Mapping {
private final Map<String,Wongo> nameToWongoMap = new HashMap<>();
public Java8Mapping(){
List<String> names = Arrays.asList("abbey","normal","hans","delbrook");
List<String> types = Arrays.asList("crazy","boring","shocking","dead");
for(int i=0; i<names.size(); i++){
nameToWongoMap.put(names.get(i),new Wongo(names.get(i),types.get(i)));
}
}
public static void main(String[] args) {
System.out.println("in main");
Java8Mapping j = new Java8Mapping();
List<String> testNames = Arrays.asList("abbey", "froderick","igor");
System.out.println(j.getBongosExample1(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample2(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample3(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample4(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample5(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
System.out.println(j.getBongosExample6(testNames).stream().map(Bongo::toString).collect(Collectors.joining(", ")));
}
private static class Wongo{
String name;
String type;
public Wongo(String s, String t){name=s;type=t;}
@Override public String toString(){return "Wongo{name="+name+", type="+type+"}";}
}
private static class Bongo{
Wongo wongo;
public Bongo(Wongo w){wongo = w;}
@Override public String toString(){ return "Bongo{wongo="+wongo+"}";}
}
// 1: Create a list externally and add items inside 'forEach'.
// Needs to externally reference Map and List
public List<Bongo> getBongosExample1(List<String> names){
final List<Bongo> listOne = new ArrayList<>();
names.forEach(s -> {
Wongo w = nameToWongoMap.get(s);
if(w != null) {
listOne.add(new Bongo(nameToWongoMap.get(s)));
}
});
return listOne;
}
// 2: Use stream().map().collect()
// Needs to externally reference Map
public List<Bongo> getBongosExample2(List<String> names){
return names.stream()
.filter(s -> nameToWongoMap.get(s) != null)
.map(s -> new Bongo(nameToWongoMap.get(s)))
.collect(Collectors.toList());
}
// 3: Create custom Collector
// Needs to externally reference Map
public List<Bongo> getBongosExample3(List<String> names){
Function<List<Wongo>,List<Bongo>> finisher = list -> list.stream().map(Bongo::new).collect(Collectors.toList());
Collector<String,List<Wongo>,List<Bongo>> bongoCollector =
Collector.of(ArrayList::new,getAccumulator(),getCombiner(),finisher, Characteristics.UNORDERED);
return names.stream().collect(bongoCollector);
}
// example 3 helper code
private BiConsumer<List<Wongo>,String> getAccumulator(){
return (list,string) -> {
Wongo w = nameToWongoMap.get(string);
if(w != null){
list.add(w);
}
};
}
// example 3 helper code
private BinaryOperator<List<Wongo>> getCombiner(){
return (l1,l2) -> {
l1.addAll(l2);
return l1;
};
}
// 4: Use internal Bongo creation facility
public List<Bongo> getBongosExample4(List<String> names){
return names.stream().filter(s->nameToWongoMap.get(s) != null).map(s-> new Bongo(nameToWongoMap.get(s))).collect(Collectors.toList());
}
// 5: Stream the Map EntrySet. This avoids referring to anything outside of the stream,
// but bypasses the lookup benefit from Map.
public List<Bongo> getBongosExample5(List<String> names){
return nameToWongoMap.entrySet().stream().filter(e->names.contains(e.getKey())).map(e -> new Bongo(e.getValue())).collect(Collectors.toList());
}
// 6: Plain-ol-java loop
public List<Bongo> getBongosExample6(List<String> names){
List<Bongo> bongos = new ArrayList<>();
for(String s : names){
Wongo w = nameToWongoMap.get(s);
if(w != null){
bongos.add(new Bongo(w));
}
}
return bongos;
}
}
If nameToWongoMap is an instance variable, you can't really avoid a capturing lambda.
You can clean up the stream by splitting up the operations a little more:
return names.stream()
    .map(n -> nameToWongoMap.get(n))
    .filter(w -> w != null)
    .map(w -> new Bongo(w))
    .collect(toList());
Or, with method references:
return names.stream()
    .map(nameToWongoMap::get)
    .filter(Objects::nonNull)
    .map(Bongo::new)
    .collect(toList());
That way you don't call get twice.
This is very much like the for loop, except, for example, it could theoretically be parallelized if nameToWongoMap can't be mutated concurrently.
I don't like the filter aspect of most as it means needing to create intermediate structures and multiple passes over the names List.
There are no intermediate structures and there is only one pass over the List. A stream pipeline says "for each element...do this sequence of operations". Each element is visited once and the pipeline is applied.
Here are some relevant quotes from the java.util.stream package description:
A stream is not a data structure that stores elements; instead, it conveys elements from a source such as a data structure, an array, a generator function, or an I/O channel, through a pipeline of computational operations.
Processing streams lazily allows for significant efficiencies; in a pipeline such as the filter-map-sum example above, filtering, mapping, and summing can be fused into a single pass on the data, with minimal intermediate state.
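One way to see the single pass for yourself is to log each stage with peek. In this sketch (class name and data are made up) the log shows each element travelling through the whole pipeline before the next one is visited. Writing to an outside list from peek is only acceptable here because the stream is sequential and the code is a demonstration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class SinglePass {
    static List<String> trace(List<String> names) {
        List<String> log = new ArrayList<>();
        List<String> result = names.stream()
                .peek(n -> log.add("visit " + n))
                .filter(n -> n.length() > 3)
                .peek(n -> log.add("keep " + n))
                .map(String::toUpperCase)
                .collect(Collectors.toList());
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        trace(Arrays.asList("abbey", "igor", "al")).forEach(System.out::println);
        // visit abbey
        // keep abbey
        // visit igor
        // keep igor
        // visit al
        // result [ABBEY, IGOR]
    }
}
```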
Radiodef's answer pretty much nailed it, I think. The solution given there:
return names.stream()
.map(nameToWongoMap::get)
.filter(Objects::nonNull)
.map(Bongo::new)
.collect(toList());
is probably about the best that can be done in Java 8.
I did want to mention a small wrinkle in this, though. The Map.get call returns null if the name isn't present in the map, and this is subsequently filtered out. There's nothing wrong with this per se, though it does bake null-means-not-present semantics into the pipeline structure.
In some sense we'd want a mapper pipeline operation that has a choice of returning zero or one elements. A way to do this with streams is with flatMap. The flatmapper function can return an arbitrary number of elements into the stream, but in this case we want just zero or one. Here's how to do that:
return names.stream()
.flatMap(name -> {
Wongo w = nameToWongoMap.get(name);
return w == null ? Stream.empty() : Stream.of(w);
})
.map(Bongo::new)
.collect(toList());
I admit this is pretty clunky and so I wouldn't recommend doing this. A slightly better but somewhat obscure approach is this:
return names.stream()
.flatMap(name -> Optional.ofNullable(nameToWongoMap.get(name))
.map(Stream::of).orElseGet(Stream::empty))
.map(Bongo::new)
.collect(toList());
but I'm still not sure I'd recommend this as it stands.
The use of flatMap does point to another approach, though. If you have a more complicated policy of how to deal with the not-present case, you could refactor this into a helper function that returns a Stream containing the result or an empty Stream if there's no result.
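A rough sketch of that refactoring, which works on Java 8: the not-present policy lives in one helper, and the pipeline just flat-maps through it. The map contents, the String value type and the names LookupHelper, lookup and typesOf are all made up for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LookupHelper {
    static final Map<String, String> nameToType = new HashMap<>();
    static {
        nameToType.put("abbey", "crazy");
        nameToType.put("hans", "shocking");
    }

    // One-element stream for a known name, empty stream otherwise.
    // A more elaborate policy (logging, defaults, ...) would go here.
    static Stream<String> lookup(String name) {
        String type = nameToType.get(name);
        return type == null ? Stream.empty() : Stream.of(type);
    }

    static List<String> typesOf(List<String> names) {
        return names.stream()
                .flatMap(LookupHelper::lookup)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(typesOf(Arrays.asList("abbey", "froderick", "hans")));
        // [crazy, shocking]
    }
}
```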
Finally, JDK 9 -- still under development as of this writing -- has added Stream.ofNullable which is useful in exactly these situations:
return names.stream()
.flatMap(name -> Stream.ofNullable(nameToWongoMap.get(name)))
.map(Bongo::new)
.collect(toList());
As an aside, JDK 9 has also added Optional.stream which creates a zero-or-one stream from an Optional. This is useful in cases where you want to call an Optional-returning function from within flatMap. See this answer and this answer for more discussion.
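For completeness, a short sketch of the Optional.stream variant, which needs Java 9 or later. The lookup method ageOf, the map contents and the class name are hypothetical; the point is only the map-then-flatMap(Optional::stream) shape.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class OptionalStreamDemo {
    static final Map<String, Integer> ages = new HashMap<>();
    static {
        ages.put("abbey", 30);
        ages.put("hans", 52);
    }

    // Hypothetical Optional-returning lookup.
    static Optional<Integer> ageOf(String name) {
        return Optional.ofNullable(ages.get(name));
    }

    static List<Integer> agesOf(List<String> names) {
        return names.stream()
                .map(OptionalStreamDemo::ageOf)
                .flatMap(Optional::stream) // Java 9+: empty Optionals disappear
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(agesOf(Arrays.asList("abbey", "igor", "hans"))); // [30, 52]
    }
}
```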
One approach I didn't see is retainAll:
public List<Bongo> getBongos(List<String> names) {
Map<String, Wongo> copy = new HashMap<>(nameToWongoMap);
copy.keySet().retainAll(names);
return copy.values().stream().map(Bongo::new).collect(
Collectors.toList());
}
The extra Map is a minimal performance hit, since it's just copying pointers to objects, not the objects themselves.
I have been trying to learn Java 8's new functional interface features, and I am having some difficulty refactoring code that I have previously written.
As part of a test case, I want to store a list of read names in a Map structure in order to check whether those reads have been "fixed" in a subsequent section of code. I am converting from an existing Map<String, Map<String, Short>> data structure. The reason I am flattening this data structure is that the outer "String" key of the original Map is not needed in the subsequent analysis (I used it to segregate data from different sources before merging them in the intermediate data). Here is my original program logic:
public class MyClass {
private Map<String, Map<String, Short>> anchorLookup;
...
public void CheckMissingAnchors(...){
Map<String, Boolean> anchorfound = new HashMap<>();
// My old logic used the foreach syntax to populate the "anchorfound" map
for(String rg : anchorLookup.keySet()){
for(String clone : anchorLookup.get(rg).keySet()){
anchorfound.put(clone, false);
}
}
...
// Does work to identify the read name in the file. If found, the boolean in the map
// is set to "true." Afterwards, the program prints the "true" and "false" counts in
// the map
}
}
I attempted to refactor the code to use functional interfaces; however, I am getting errors from my IDE (Netbeans 8.0 Patch 2 running Java 1.8.0_05):
public class MyClass {
private Map<String, Map<String, Short>> anchorLookup;
...
public void CheckMissingAnchors(...){
Map<String, Boolean> anchorfound = anchorLookup.keySet()
.stream()
.map((s) -> anchorlookup.get(s).keySet()) // at this point I am expecting a
// Stream<Set<String>> which I thought could be "streamed" for the collector method
// ; however, my IDE does not allow me to select the "stream()" method
.sequential() // this still gives me a Stream<Set<String>>
.collect(Collectors.toMap((s) -> s, (s) -> false);
// I receive an error for the preceding method call, as Stream<Set<String>> cannot be
// converted to type String
...
}
}
Is there a better way to create the "anchorfound" map using the Collection methods or is the vanilla Java "foreach" structure the best way to generate this data structure?
I apologize for any obvious errors in my code. My formal training was not in computer science but I would like to learn more about Java's implementation of functional programming concepts.
I believe what you need is a flatMap.
This way you convert each key of the outer map to a stream of the keys of the corresponding inner map, and then flatten them to a single stream of String.
public class MyClass {
private Map<String, Map<String, Short>> anchorLookup;
...
public void CheckMissingAnchors(...){
Map<String, Boolean> anchorfound = anchorLookup.keySet()
.stream()
.flatMap(s -> anchorLookup.get(s).keySet().stream())
.collect(Collectors.toMap((s) -> s, (s) -> false));
...
}
}
Eran's suggestion of flatMap is a good one, +1.
This can be simplified somewhat by using Map.values() instead of Map.keySet(), since the map's keys aren't used for any other purpose than to retrieve the values. Streaming the result of Map.values() gives a Stream<Map<String,Short>>. Here we don't care about the inner map's values, so we can use keySet() to extract the keys, giving a Stream<Set<String>>. Now we just flatMap these sets into Stream<String>. Finally we send the results into the collector as before.
The resulting code looks like this:
public class MyClass {
private Map<String, Map<String, Short>> anchorLookup;
public void checkMissingAnchors() {
Map<String, Boolean> anchorfound = anchorLookup.values().stream()
.map(Map::keySet)
.flatMap(Set::stream)
.collect(Collectors.toMap(s -> s, s -> false));
}
}
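One thing to watch for, in case the same clone name can appear under more than one outer key: the two-argument Collectors.toMap throws an IllegalStateException on duplicate keys. A sketch of the same pipeline with a merge function that tolerates duplicates (the class name AnchorFlags and the sample data are made up):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class AnchorFlags {
    static Map<String, Boolean> initialFlags(Map<String, Map<String, Short>> anchorLookup) {
        return anchorLookup.values().stream()
                .flatMap(inner -> inner.keySet().stream())
                // Third argument: on a duplicate key, keep the existing value.
                .collect(Collectors.toMap(name -> name, name -> false, (a, b) -> a));
    }

    public static void main(String[] args) {
        Map<String, Short> rg1 = new HashMap<>();
        rg1.put("clone1", (short) 1);
        rg1.put("clone2", (short) 2);
        Map<String, Short> rg2 = new HashMap<>();
        rg2.put("clone2", (short) 3); // same clone name under a second key
        Map<String, Map<String, Short>> lookup = new HashMap<>();
        lookup.put("rg1", rg1);
        lookup.put("rg2", rg2);
        System.out.println(initialFlags(lookup).size()); // 2
    }
}
```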