Lambda to populate Map - java

I am trying to fill up a map with words and the number of their occurrences. I am trying to write a lambda to do it, like so:
Consumer<String> wordCount = word -> map.computeIfAbsent(word, (w) -> (new Integer(1) + 1).intValue());
map is Map<String, Integer>. It should just insert the word in the map as a key if it is absent and if it is present it should increase its integer value by 1. This one is not correct syntax-wise.

You can't increment the count using computeIfAbsent, since it will only be computed the first time.
You probably meant:
map.compute(word, (w, i) -> i == null ? 1 : i + 1);
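For a runnable comparison, here is a minimal sketch of that counting step. The class name WordCounter and the sample words are made up for illustration, and the Map.merge variant is an alternative that the answer above does not mention.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

class WordCounter {
    // Counts words with Map.compute, as suggested in the answer above.
    static Map<String, Integer> countWithCompute(String... words) {
        Map<String, Integer> map = new HashMap<>();
        Consumer<String> wordCount = word -> map.compute(word, (w, i) -> i == null ? 1 : i + 1);
        for (String word : words) {
            wordCount.accept(word);
        }
        return map;
    }

    // Alternative (an assumption, not from the answer): Map.merge stores 1 for a
    // new key and otherwise combines the old count with 1 using Integer::sum.
    static Map<String, Integer> countWithMerge(String... words) {
        Map<String, Integer> map = new HashMap<>();
        for (String word : words) {
            map.merge(word, 1, Integer::sum);
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(countWithCompute("a", "b", "a"));
        System.out.println(countWithMerge("a", "b", "a"));
    }
}
```

Both methods should produce the same counts; merge is simply shorter when the update is "add one".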

This is what Collectors are for.
Assuming you have some Stream<String> words:
Map<String, Long> countedWords = words
.collect(Collectors
.groupingBy(
Function.identity(),
Collectors.counting()));
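A self-contained sketch of the groupingBy/counting approach, assuming a hypothetical class GroupingCount and made-up sample words. Note that Collectors.counting() produces Long values, so the result is a Map<String, Long> rather than the Map<String, Integer> from the question.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class GroupingCount {
    // Groups equal words together and counts the members of each group.
    static Map<String, Long> count(Stream<String> words) {
        return words.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counted = count(Stream.of("to", "be", "or", "not", "to", "be"));
        System.out.println(counted.get("to")); // prints 2
    }
}
```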

It doesn't compile because you can't call a method on a primitive:
new Integer(1) -> 1 // unboxing was applied
(1 + 1).intValue() // incorrect
I would write it with Map#put and Map#getOrDefault:
Consumer<String> consumer = word -> map.put(word, map.getOrDefault(word, 0) + 1);

Related

Collect key values from array to map without duplicates

My app gets a string from a web service. It looks like this:
name=Raul&city=Paris&id=167136
I want to get a map from this string:
{name=Raul, city=Paris, id=167136}
Code:
Arrays.stream(input.split("&"))
.map(sub -> sub.split("="))
.collect(Collectors.toMap(string -> string[0], string -> string[1]));
This works in most cases, but the app can get a string with duplicate keys, like this:
name=Raul&city=Paris&id=167136&city=Oslo
The app then crashes with the following uncaught exception:
Exception in thread "main" java.lang.IllegalStateException: Duplicate key city (attempted merging values Paris and Oslo)
I tried to change collect method:
.collect(Collectors.toMap(tokens -> tokens[0], tokens -> tokens[1]), (r, strings) -> strings[0]);
But the compiler says no:
Cannot resolve method 'collect(java.util.stream.Collector<T,capture<?>,java.util.Map<K,U>>, <lambda expression>)'
And Array type expected; found: 'T'
I guess it's because I have an array. How can I fix it?
You are misunderstanding the final argument of toMap (the merge operator). It must be passed to toMap itself, not to collect. When toMap finds a duplicate key, it hands the value already in the map and the new value for the same key to the merge operator, which produces the single value to store.
For example, if you want to just store the first value found then use (s1, s2) -> s1. If you want to comma separate them, use (s1, s2) -> s1 + ", " + s2.
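The two merge functions mentioned above can be tried with a sketch like the following. The class QueryStringParser, the parse helper, and the filter that skips pairs without "=" are illustrative assumptions, not part of the original code.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.stream.Collectors;

class QueryStringParser {
    // Splits "k1=v1&k2=v2" into a map; `merge` decides what to store for duplicate keys.
    static Map<String, String> parse(String input, BinaryOperator<String> merge) {
        return Arrays.stream(input.split("&"))
                .map(pair -> pair.split("=", 2))
                .filter(tokens -> tokens.length == 2) // assumption: skip pairs without '='
                .collect(Collectors.toMap(tokens -> tokens[0], tokens -> tokens[1], merge));
    }

    public static void main(String[] args) {
        String input = "name=Raul&city=Paris&id=167136&city=Oslo";
        // Keep the first value seen for a duplicate key.
        System.out.println(parse(input, (s1, s2) -> s1).get("city"));             // prints Paris
        // Join the values of a duplicate key with a comma.
        System.out.println(parse(input, (s1, s2) -> s1 + ", " + s2).get("city")); // prints Paris, Oslo
    }
}
```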
If you want to keep all values of duplicated keys and group them by key (since the app can get a string with duplicate keys), then instead of Collectors.toMap() you can use Collectors.groupingBy with a custom collector (Collector.of(...)):
String input = "name=Raul&city=Paris&city=Berlin&id=167136&id=03&id=505";
Map<String, Set<Object>> result = Arrays.stream(input.split("&"))
.map(splitedString -> splitedString.split("="))
.filter(keyValuePair -> keyValuePair.length == 2)
.collect(
Collectors.groupingBy(array -> array[0], Collector.of(
() -> new HashSet<>(), (set, array) -> set.add(array[1]),
(left, right) -> {
if (left.size() < right.size()) {
right.addAll(left);
return right;
} else {
left.addAll(right);
return left;
}
}, Collector.Characteristics.UNORDERED)
)
);
This way you'll get :
result => size = 3
"city" -> size = 2 ["Berlin", "Paris"]
"name" -> size = 1 ["Raul"]
"id" -> size = 3 ["167136","03","505"]
You can achieve a similar result using Kotlin collections; note that toMap() keeps the last value for a duplicate key:
val res = message
.split("&")
.map {
val entry = it.split("=")
Pair(entry[0], entry[1])
}
println(res)
println(res.toMap()) //distinct by key
The result is
[(name, Raul), (city, Paris), (id, 167136), (city, Oslo)]
{name=Raul, city=Oslo, id=167136}
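The same "last value wins" behavior can be sketched in Java with a merge function that returns its second argument. The class LastValueWins is hypothetical, and the LinkedHashMap supplier is an assumption made here so that the keys print in the order they first appear.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class LastValueWins {
    // For duplicate keys the later value replaces the earlier one.
    static Map<String, String> parse(String input) {
        return Arrays.stream(input.split("&"))
                .map(pair -> pair.split("=", 2))
                .filter(tokens -> tokens.length == 2) // assumption: skip pairs without '='
                .collect(Collectors.toMap(
                        tokens -> tokens[0],
                        tokens -> tokens[1],
                        (first, second) -> second,
                        LinkedHashMap::new));
    }

    public static void main(String[] args) {
        // prints {name=Raul, city=Oslo, id=167136}
        System.out.println(parse("name=Raul&city=Paris&id=167136&city=Oslo"));
    }
}
```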

Java 8 Stream to determine a maximum count in a text file

For my assignment I have to replace for loops with streams that count the frequency of words in a text document, and I am having trouble figuring the TODO part out.
String filename = "SophieSallyJack.txt";
if (args.length == 1) {
filename = args[0];
}
Map<String, Integer> wordFrequency = new TreeMap<>();
List<String> incoming = Utilities.readAFile(filename);
wordFrequency = incoming.stream()
.map(String::toLowerCase)
.filter(word -> !word.trim().isEmpty())
.collect(Collectors.toMap(word -> word, word -> 1, (a, b) -> a + b, TreeMap::new));
int maxCnt = 0;
// TODO add a single statement that uses streams to determine maxCnt
for (String word : incoming) {
Integer cnt = wordFrequency.get(word);
if (cnt != null) {
if (cnt > maxCnt) {
maxCnt = cnt;
}
}
}
System.out.print("Words that appear " + maxCnt + " times:");
I have tried this:
wordFrequency = incoming.parallelStream().
collect(Collectors.toConcurrentMap(w -> w, w -> 1, Integer::sum));
But that is not right and I'm not sure how to incorporate maxCnt into the stream.
Assuming you have all the words extracted from the file in a List<String>, the count for each word can be computed like this:
Map<String, Long> wordToCountMap = words.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
The most frequent word can then be computed from the above map like so:
Entry<String, Long> mostFrequentWord = wordToCountMap.entrySet().stream()
.max(Map.Entry.comparingByValue())
.orElse(new AbstractMap.SimpleEntry<>("Invalid", 0L));
You may chain the above two pipelines together if you wish, like this:
Entry<String, Long> mostFrequentWord = words.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet().stream()
.max(Map.Entry.comparingByValue())
.orElse(new AbstractMap.SimpleEntry<>("Invalid", 0L));
Update
As per the following discussion it is always good to return an Optional from your computation like so,
Optional<Entry<String, Long>> mostFrequentWord = words.stream()
.collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
.entrySet().stream()
.max(Map.Entry.comparingByValue());
Well, you have done almost everything you needed with that TreeMap. Be aware, though, that a TreeMap is sorted by key, so its lastEntry method returns the alphabetically last word, not the word with the highest frequency; you still have to search the values for the maximum.
A TreeMap is also not optimal here, since it keeps the keys sorted on every insert and you don't need sorted data, you need the max. Building a TreeMap of n entries is O(n log n), while filling a HashMap is O(n) on average.
So instead of that TreeMap, all you need is to switch to a HashMap:
wordFrequency = incoming.stream()
.map(String::toLowerCase)
.filter(word -> !word.trim().isEmpty())
.collect(Collectors.toMap(
Function.identity(),
word -> 1,
(a, b) -> a + b,
HashMap::new));
Once you have this Map, you need to find max - this operation is O(n) in general and could be achieved with a stream or without one:
Collections.max(wordFrequency.entrySet(), Map.Entry.comparingByValue())
This approach will give you O(n) for the HashMap inserts and O(n) for finding the max, thus O(n) overall, so it's faster than the TreeMap.
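Put together, the HashMap plus Collections.max approach might look like the sketch below. The class MostFrequent and the sample words are assumptions for illustration. Collections.max throws NoSuchElementException for an empty collection, so this version expects at least one non-blank word.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class MostFrequent {
    // Builds the frequency map in a HashMap, then scans it once for the largest count.
    static Map.Entry<String, Integer> mostFrequent(List<String> words) {
        Map<String, Integer> wordFrequency = words.stream()
                .map(String::toLowerCase)
                .filter(word -> !word.trim().isEmpty())
                .collect(Collectors.toMap(Function.identity(), word -> 1, Integer::sum, HashMap::new));
        // Throws NoSuchElementException if wordFrequency is empty.
        return Collections.max(wordFrequency.entrySet(), Map.Entry.comparingByValue());
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> top = mostFrequent(Arrays.asList("dog", "Dog", " ", "cat"));
        System.out.println(top.getKey() + " appears " + top.getValue() + " times"); // prints dog appears 2 times
    }
}
```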
Ok, first of all, your wordFrequency line can make use of Collectors#groupingBy and Collectors#counting instead of writing your own accumulator:
List<String> incoming = Arrays.asList("monkey", "dog", "MONKEY", "DOG", "giraffe", "giraffe", "giraffe", "Monkey");
Map<String, Long> wordFrequency = incoming.stream() // counting() yields Long, so the map type changes
.filter(word -> !word.trim().isEmpty()) // filter first, so we don't lowercase empty strings
.map(String::toLowerCase)
.collect(Collectors.groupingBy(s -> s, Collectors.counting()));
Now that we got that out of the way... Your TODO line says use streams to determine maxCnt. You can do that easily by using max with naturalOrder:
int maxCnt = wordFrequency.values()
.stream()
.max(Comparator.naturalOrder())
.orElse(0L)
.intValue();
However, your comments make me think that what you actually want is a one-liner to print the most frequent words (all of them), i.e. the words that have maxCnt as value in wordFrequency. So what we need is to "reverse" the map, grouping the words by count, and then pick the entry with highest count:
wordFrequency.entrySet().stream() // {monkey=3, dog=2, giraffe=3}
.collect(groupingBy(Map.Entry::getValue, mapping(Map.Entry::getKey, toList()))).entrySet().stream() // reverse map: {3=[monkey, giraffe], 2=[dog]}
.max(Comparator.comparingLong(Map.Entry::getKey)) // maxCnt and all words with it: 3=[monkey, giraffe]
.ifPresent(e -> {
System.out.println("Words that appear " + e.getKey() + " times: " + e.getValue());
});
This solution prints all the words with maxCnt, instead of just one:
Words that appear 3 times: [monkey, giraffe].
Of course, you can concatenate the statements to get one big do-it-all statement, like this:
incoming.stream() // [monkey, dog, MONKEY, DOG, giraffe, giraffe, giraffe, Monkey]
.filter(word -> !word.trim().isEmpty()) // filter first, so we don't lowercase empty strings
.map(String::toLowerCase)
.collect(groupingBy(s -> s, counting())).entrySet().stream() // {monkey=3, dog=2, giraffe=3}
.collect(groupingBy(Map.Entry::getValue, mapping(Map.Entry::getKey, toList()))).entrySet().stream() // reverse map: {3=[monkey, giraffe], 2=[dog]}
.max(Comparator.comparingLong(Map.Entry::getKey)) // maxCnt and all words with it: 3=[monkey, giraffe]
.ifPresent(e -> {
System.out.println("Words that appear " + e.getKey() + " times: " + e.getValue());
});
But now we're stretching the meaning of "one statement" :)
By piecing together information I was able to successfully replace the for loop with
int maxCnt = wordFrequency.values().stream().max(Comparator.naturalOrder()).get();
System.out.print("Words that appear " + maxCnt + " times:");
I appreciate all the help.
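For completeness, a two-step sketch that keeps the Map<String, Integer> from the question: first determine maxCnt, then list every word that has that count. The class MaxCountWords and its helper names are hypothetical, and orElse(0) is a choice made here so that an empty map does not throw.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class MaxCountWords {
    // Largest count in the map, or 0 when the map is empty.
    static int maxCount(Map<String, Integer> wordFrequency) {
        return wordFrequency.values().stream()
                .mapToInt(Integer::intValue)
                .max()
                .orElse(0);
    }

    // All words whose count equals `count`, in alphabetical order.
    static List<String> wordsWithCount(Map<String, Integer> wordFrequency, int count) {
        return wordFrequency.entrySet().stream()
                .filter(e -> e.getValue() == count)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Integer> wordFrequency = new TreeMap<>();
        wordFrequency.put("monkey", 3);
        wordFrequency.put("dog", 2);
        wordFrequency.put("giraffe", 3);
        int maxCnt = maxCount(wordFrequency);
        // prints Words that appear 3 times: [giraffe, monkey]
        System.out.println("Words that appear " + maxCnt + " times: " + wordsWithCount(wordFrequency, maxCnt));
    }
}
```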

split string and store it into HashMap java 8

I want to split below string and store it into HashMap.
String responseString = "name~peter-add~mumbai-md~v-refNo~";
First I split the string using the hyphen (-) as delimiter and store the tokens in an ArrayList as below:
public static List<String> getTokenizeString(String delimitedString, char separator) {
final Splitter splitter = Splitter.on(separator).trimResults();
final Iterable<String> tokens = splitter.split(delimitedString);
final List<String> tokenList = new ArrayList<String>();
for(String token: tokens){
tokenList.add(token);
}
return tokenList;
}
List<String> list = MyClass.getTokenizeString(responseString, '-');
and then using the below code to convert it to HashMap using stream.
Map<String, String> map = list.stream()
.collect(Collectors.toMap(k ->k.split("~")[0], v -> v.split("~")[1]));
The stream collector doesn't work as there is no value against refNo.
It works correctly if I have an even number of elements in the ArrayList.
Is there any way to handle this? Also, please suggest how I can do both tasks with Java 8 streams (I don't want to use the getTokenizeString() method).
Unless Splitter is doing any magic, the getTokenizeString method is obsolete here. You can perform the entire processing as a single operation:
Map<String,String> map = Pattern.compile("\\s*-\\s*")
.splitAsStream(responseString.trim())
.map(s -> s.split("~", 2))
.collect(Collectors.toMap(a -> a[0], a -> a.length>1? a[1]: ""));
By using the regular expression \s*-\s* as separator, you are considering white-space as part of the separator, hence implicitly trimming the entries. There’s only one initial trim operation before processing the entries, to ensure that there is no white-space before the first or after the last entry.
Then, simply split the entries in a map step before collecting into a Map.
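A runnable sketch of this approach, assuming a hypothetical class TildeParser and an input with extra spaces around one hyphen to show the implicit trimming. Because split is called with a limit of 2, the entry "refNo~" yields an empty second element, so refNo maps to the empty string.

```java
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class TildeParser {
    // Hyphen with optional surrounding white-space separates the entries.
    private static final Pattern ENTRY_SEPARATOR = Pattern.compile("\\s*-\\s*");

    static Map<String, String> parse(String responseString) {
        return ENTRY_SEPARATOR.splitAsStream(responseString.trim())
                .map(entry -> entry.split("~", 2)) // at most two parts: key and value
                .collect(Collectors.toMap(a -> a[0], a -> a.length > 1 ? a[1] : ""));
    }

    public static void main(String[] args) {
        Map<String, String> map = parse("name~peter - add~mumbai-md~v-refNo~");
        System.out.println(map.get("add"));               // prints mumbai
        System.out.println(map.get("refNo").isEmpty());   // prints true
    }
}
```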
First of all, you don't have to split the same String twice.
Second of all, check the length of the array to determine if a value is present for a given key.
Map<String, String> map =
list.stream()
.map(s -> s.split("~"))
.collect(Collectors.toMap(a -> a[0], a -> a.length > 1 ? a[1] : ""));
This is assuming you want to put the key with an empty String value if a key has no corresponding value.
Or you can skip the list variable :
Map<String, String> map1 =
MyClass.getTokenizeString(responseString, '-')
.stream()
.map(s -> s.split("~"))
.collect(Collectors.toMap(a -> a[0], a -> a.length > 1 ? a[1] : ""));
final String dataSheet = "103343262,6478342944, 103426540,84528784843, 103278808,263716791426, 103426733,27736529279, "
+ "103426000,27718159078, 103218982,19855201547, 103427376,27717278645, "
+ "103243034,81667273413";
final int chunk = 2;
AtomicInteger counter = new AtomicInteger();
// Sequential stream only: the shared counter puts consecutive tokens into the same group.
Map<String, String> pairs = Arrays.stream(dataSheet.split(","))
.map(String::trim)
.collect(Collectors.groupingBy(i -> counter.getAndIncrement() / chunk))
.values()
.stream()
.collect(Collectors.toMap(k -> k.get(0), v -> v.get(1)));
result:
pairs =
"103218982" -> "19855201547"
"103278808" -> "263716791426"
"103243034" -> "81667273413"
"103426733" -> "27736529279"
"103426540" -> "84528784843"
"103427376" -> "27717278645"
"103426000" -> "27718159078"
"103343262" -> "6478342944"
We need to group every 2 elements into a key/value pair, so we partition the list into chunks of 2. The expression counter.getAndIncrement() / 2 yields the same number for every 2 consecutive calls, e.g.:
IntStream.range(0,6).forEach((i)->System.out.println(counter.getAndIncrement()/2));
prints:
0
0
1
1
2
2
You may use the same idea to partition list into chunks.
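The same pairing can also be sketched with explicit indices instead of the shared AtomicInteger, which avoids relying on the order in which the classifier is called. The class PairChunks is hypothetical, and ignoring a trailing token without a partner is an assumption made here.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PairChunks {
    // Treats tokens 0 and 1 as the first key/value pair, tokens 2 and 3 as the second, and so on.
    static Map<String, String> toPairs(String csv) {
        String[] tokens = Arrays.stream(csv.split(","))
                .map(String::trim)
                .toArray(String[]::new);
        // Integer division drops a trailing token that has no partner (assumption).
        return IntStream.range(0, tokens.length / 2)
                .boxed()
                .collect(Collectors.toMap(i -> tokens[2 * i], i -> tokens[2 * i + 1]));
    }

    public static void main(String[] args) {
        Map<String, String> pairs = toPairs("103343262,6478342944, 103426540,84528784843");
        System.out.println(pairs.get("103343262")); // prints 6478342944
    }
}
```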
Another short way to do it:
String responseString = "name~peter-add~mumbai-md~v-refNo~";
Map<String, String> collect = Arrays.stream(responseString.split("-"))
.map(s -> s.split("~", 2))
.collect(Collectors.toMap(a -> a[0], a -> a.length > 1 ? a[1] : ""));
System.out.println(collect);
First you split the String on -, then map(s -> s.split("~", 2)) turns each entry into a String[], giving a Stream<String[]> like [name, peter][add, mumbai][md, v][refNo, ], and finally toMap collects it, with a[0] as the key and a[1] as the value.

Transform and filter a Java Map with streams

I have a Java Map that I'd like to transform and filter. As a trivial example, suppose I want to convert all values to Integers then remove the odd entries.
Map<String, String> input = new HashMap<>();
input.put("a", "1234");
input.put("b", "2345");
input.put("c", "3456");
input.put("d", "4567");
Map<String, Integer> output = input.entrySet().stream()
.collect(Collectors.toMap(
Map.Entry::getKey,
e -> Integer.parseInt(e.getValue())
))
.entrySet().stream()
.filter(e -> e.getValue() % 2 == 0)
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
System.out.println(output.toString());
This is correct and yields: {a=1234, c=3456}
However, I can't help but wonder if there's a way to avoid calling .entrySet().stream() twice.
Is there a way I can perform both transform and filter operations and call .collect() only once at the end?
Yes, you can map each entry to another temporary entry that will hold the key and the parsed integer value. Then you can filter each entry based on their value.
Map<String, Integer> output =
input.entrySet()
.stream()
.map(e -> new AbstractMap.SimpleEntry<>(e.getKey(), Integer.valueOf(e.getValue())))
.filter(e -> e.getValue() % 2 == 0)
.collect(Collectors.toMap(
Map.Entry::getKey,
Map.Entry::getValue
));
Note that I used Integer.valueOf instead of parseInt since we actually want a boxed int.
If you have the luxury to use the StreamEx library, you can do it quite simply:
Map<String, Integer> output =
EntryStream.of(input).mapValues(Integer::valueOf).filterValues(v -> v % 2 == 0).toMap();
One way to solve the problem with much less overhead is to move the mapping and filtering down to the collector.
Map<String, Integer> output = input.entrySet().stream().collect(
HashMap::new,
(map,e)->{ int i=Integer.parseInt(e.getValue()); if(i%2==0) map.put(e.getKey(), i); },
Map::putAll);
This does not require the creation of intermediate Map.Entry instances and even better, will postpone the boxing of int values to the point when the values are actually added to the Map, which implies that values rejected by the filter are not boxed at all.
Compared to what Collectors.toMap(…) does, the operation is also simplified by using Map.put rather than Map.merge as we know beforehand that we don’t have to handle key collisions here.
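The three-argument collect can be wrapped into a small generic helper. This is only a sketch: the class CollectTransform, the method transformAndFilter, and its parameter names are assumptions, not an existing API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

class CollectTransform {
    // Applies `transform` to every value and keeps only the results accepted by `keep`.
    static <K, V, R> Map<K, R> transformAndFilter(Map<K, V> input, Function<V, R> transform, Predicate<R> keep) {
        return input.entrySet().stream().collect(
                HashMap::new,                                  // supplier: one result map per thread
                (map, e) -> {                                  // accumulator: transform, test, maybe put
                    R value = transform.apply(e.getValue());
                    if (keep.test(value)) {
                        map.put(e.getKey(), value);
                    }
                },
                Map::putAll);                                  // combiner: used for parallel streams
    }

    public static void main(String[] args) {
        Map<String, String> input = new HashMap<>();
        input.put("a", "1234");
        input.put("b", "2345");
        input.put("c", "3456");
        input.put("d", "4567");
        Map<String, Integer> output = transformAndFilter(input, Integer::parseInt, v -> v % 2 == 0);
        System.out.println(output.keySet().contains("a") && output.size() == 2); // prints true
    }
}
```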
However, as long as you don’t want to utilize parallel execution you may also consider the ordinary loop
HashMap<String,Integer> output=new HashMap<>();
for(Map.Entry<String, String> e: input.entrySet()) {
int i = Integer.parseInt(e.getValue());
if(i%2==0) output.put(e.getKey(), i);
}
or the internal iteration variant:
HashMap<String,Integer> output=new HashMap<>();
input.forEach((k,v)->{ int i = Integer.parseInt(v); if(i%2==0) output.put(k, i); });
the latter being quite compact and at least on par with all other variants regarding single threaded performance.
Guava's your friend:
Map<String, Integer> output = Maps.filterValues(Maps.transformValues(input, Integer::valueOf), i -> i % 2 == 0);
Keep in mind that output is a transformed, filtered view of input. You'll need to make a copy if you want to operate on them independently.
You could use the Stream.collect(supplier, accumulator, combiner) method to transform the entries and conditionally accumulate them:
Map<String, Integer> even = input.entrySet().stream().collect(
HashMap::new,
(m, e) -> Optional.ofNullable(e)
.map(Map.Entry::getValue)
.map(Integer::valueOf)
.filter(i -> i % 2 == 0)
.ifPresent(i -> m.put(e.getKey(), i)),
Map::putAll);
System.out.println(even); // {a=1234, c=3456}
Here, inside the accumulator, I'm using Optional methods to apply both the transformation and the predicate, and, if the optional value is still present, I'm adding it to the map being collected.
Another way to do this is to remove the values you don't want from the transformed Map:
Map<String, Integer> output = input.entrySet().stream()
.collect(Collectors.toMap(
Map.Entry::getKey,
e -> Integer.parseInt(e.getValue()),
(a, b) -> { throw new AssertionError(); },
HashMap::new
));
output.values().removeIf(v -> v % 2 != 0);
This assumes you want a mutable Map as the result, if not you can probably create an immutable one from output.
If you are transforming the values into the same type and want to modify the Map in place, this could be a lot shorter with replaceAll:
input.replaceAll((k, v) -> v + " example");
input.values().removeIf(v -> v.length() > 10);
This also assumes input is mutable.
I don't recommend doing this because it will not work for all valid Map implementations and may stop working for HashMap in the future, but you can currently use replaceAll and cast a HashMap to change the type of the values:
((Map)input).replaceAll((k, v) -> Integer.parseInt((String)v));
Map<String, Integer> output = (Map)input;
output.values().removeIf(v -> v % 2 != 0);
This will also give you type safety warnings and if you try to retrieve a value from the Map through a reference of the old type like this:
String ex = input.get("a");
It will throw a ClassCastException.
You could move the first transform part into a method to avoid the boilerplate if you expect to use it a lot:
public static <K, VO, VN, M extends Map<K, VN>> M transformValues(
Map<? extends K, ? extends VO> old,
Function<? super VO, ? extends VN> f,
Supplier<? extends M> mapFactory){
return old.entrySet().stream().collect(Collectors.toMap(
Entry::getKey,
e -> f.apply(e.getValue()),
(a, b) -> { throw new IllegalStateException("Duplicate keys for values " + a + " " + b); },
mapFactory));
}
And use it like this:
Map<String, Integer> output = transformValues(input, Integer::parseInt, HashMap::new);
output.values().removeIf(v -> v % 2 != 0);
Note that the duplicate key exception can be thrown if, for example, the old Map is an IdentityHashMap and the mapFactory creates a HashMap.
Here is the code using abacus-common:
Map<String, String> input = N.asMap("a", "1234", "b", "2345", "c", "3456", "d", "4567");
Map<String, Integer> output = Stream.of(input)
.groupBy(e -> e.getKey(), e -> N.asInt(e.getValue()))
.filter(e -> e.getValue() % 2 == 0)
.toMap(Map.Entry::getKey, Map.Entry::getValue);
N.println(output.toString());
Disclosure: I'm the developer of abacus-common.

Java8 filter collect both type of value

Is there a way to collect both filtered and not filtered value in java 8 filter ?
One way is:
.filter( foo -> {
if(!foo.apply()){
// add to required collection
}
return foo.apply();
})
Is there a better alternative ?
Map<Boolean, List<Foo>> map =
collection.stream().collect(Collectors.partitioningBy(foo -> foo.isBar()));
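A runnable sketch of Collectors.partitioningBy, using integers and an even/odd predicate in place of the Foo type from the question. The class PartitionDemo is hypothetical. The resulting map always contains both keys: true for elements that match the predicate and false for those that do not.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PartitionDemo {
    // Splits the numbers into those that pass the predicate (true) and those that fail it (false).
    static Map<Boolean, List<Integer>> splitByEven(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Integer>> parts = splitByEven(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(parts.get(true));  // prints [2, 4]
        System.out.println(parts.get(false)); // prints [1, 3, 5]
    }
}
```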
You can use a ternary operator inside map, so that the function applied to each element depends on a condition. In the example below I square the even numbers and keep the odd numbers as they are.
List<Integer> temp = arrays.stream()
.map(i -> i % 2 == 0 ? i*i : i)
.collect(Collectors.toList());
In your case it would be like this :
List<Integer> temp = arrays.stream()
.map(foo -> !foo.apply() ? doSomething : doSomethingElse)
.collect(Collectors.toList());
