Sum of values in the map for specific key - java

I have a text file in which every line contains a pair of name and amount, like this:
Mike 5
Kate 2
Mike 3
I need to sum these values by key. I solved it this way:
public static void main(String[] args) {
Map<String, Integer> map = new HashMap<String, Integer>();
try {
Files.lines(Paths.get("/Users/walter/Desktop/stuff.txt"))
.map(line -> line.split("\\s+")).forEach(line -> {
String key = line[0];
if (map.containsKey(key)) {
Integer oldValue = map.get(key);
map.put(key, oldValue + Integer.parseInt(line[1]));
} else {
map.put(line[0], Integer.parseInt(line[1]));
}
});
map.forEach((k,v) -> System.out.println(k + " " +v));
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
;
}
How could I improve this code to make it more functional, with the ability to process the data concurrently (using parallel streams, etc.)?

When you want to write functional code, the rule is: don't use forEach. This is an imperative solution and breaks functional code.
What you want is to split each line and group by the first part (key) while summing the second part (values):
Map<String, Integer> map =
Files.lines(Paths.get("/Users/walter/Desktop/stuff.txt"))
.map(s -> s.split("\\s+"))
.collect(groupingBy(a -> a[0], summingInt(a -> Integer.parseInt(a[1]))));
In this code, we are splitting each line. Then, we are grouping the Stream using Collectors.groupingBy(classifier, downstream) where:
classifier, which is a function that classifies each item to extract the key of the resulting Map, just returns the first part of the line
downstream is a collector that reduces each value having the same key: in this case, it is Collectors.summingInt(mapper) which sums each integer extracted by the given mapper.
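To try the grouping collector without a file on disk, here is a minimal sketch that runs the same pipeline over an in-memory list. The class name SumByKeyDemo, the method sumByKey and the sample lines are assumptions made for illustration; the TreeMap factory is only there so that the printed order is predictable.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SumByKeyDemo {

    // Splits each "name amount" line on whitespace and sums the amounts per name.
    static Map<String, Integer> sumByKey(List<String> lines) {
        return lines.stream()
                .map(line -> line.trim().split("\\s+"))
                .collect(Collectors.groupingBy(
                        parts -> parts[0],   // classifier: the name
                        TreeMap::new,        // sorted map, only for stable output
                        Collectors.summingInt(parts -> Integer.parseInt(parts[1])))); // downstream: sum
    }

    public static void main(String[] args) {
        System.out.println(sumByKey(List.of("Mike 5", "Kate 2", "Mike 3")));
        // prints {Kate=2, Mike=8}
    }
}
```

The same collector also works on a parallel stream (lines.parallelStream()), because collect() merges the partial maps. Whether that is actually faster depends on the size of the input.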
As a side-note (and just so you know), you could rewrite your whole forEach more simply using the new Map.merge(key, value, remappingFunction) method, with just a call to:
map.merge(line[0], Integer.valueOf(line[1]), Integer::sum);
This will put a new value with the key line[0] with the value Integer.valueOf(line[1]) if this key did not exist, otherwise, it will update the key with the given remapping function (which is the sum of the old and new value in this case).
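As a rough sketch of that side-note, the loop below replaces the containsKey/get/put sequence with a single merge() call. The class and method names are hypothetical, and the lines come from a list instead of a file so that the example is self-contained.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MergeSumDemo {

    // Accumulates the amounts per name with Map.merge instead of containsKey/get/put.
    static Map<String, Integer> sumWithMerge(List<String> lines) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            // Absent key: stores the parsed amount. Present key: stores Integer.sum(old, new).
            totals.merge(parts[0], Integer.valueOf(parts[1]), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        System.out.println(sumWithMerge(List.of("Mike 5", "Kate 2", "Mike 3")));
        // prints {Kate=2, Mike=8}
    }
}
```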

Related

Effective way of comparing list elements in Java

Is there any effective way of comparing elements in Java and printing out the position of the element which occurs only once?
For example: if I have a list: ["Hi", "Hi", "No"], I want to print out 2 because "No" is in position 2. I have solved this using the following algorithm and it works, BUT the problem is that if I have a large list it takes too much time to compare the entire list to print out the first position of the unique word.
ArrayList<String> strings = new ArrayList<>();
for (int i = 0; i < strings.size(); i++) {
int oc = Collections.frequency(strings, strings.get(i));
if (oc == 1) { // braces added: without them the break runs on the first iteration
System.out.print(i);
break;
}
}
One option is to count each element's occurrences and then pick the first element whose count is 1, though I am not sure how large your list is.
Using Stream:
List<String> list = Arrays.asList("Hi", "Hi", "No");
// iterating through the list and storing each element with its number of occurrences in a Map
Map<String, Long> counts = list.stream().collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()));
String value = counts.entrySet().stream()
.filter(e -> e.getValue() == 1) // keeping only the elements that occur exactly once
.map(Map.Entry::getKey) // creating a stream of the remaining elements, all of which occur only once
.findFirst() // finding the first element of that stream
.get();
System.out.println(list.indexOf(value));
EDIT:
A simplified version can be
Map<String, Long> counts2 = new LinkedHashMap<String, Long>();
for(String val : list){
long count = counts2.getOrDefault(val, 0L);
counts2.put(val, ++count);
}
for(String key: counts2.keySet()){
if(counts2.get(key)==1){
System.out.println(list.indexOf(key));
break;
}
}
The basic idea is to count each element's occurrences and store the counts in a Map. Once you have the counts for all elements, you can simply look for the first element whose count is 1.
You can use a HashMap. For example, you can put the word as the key and its index as the value. Once you find the same word again, you can delete the key, and at the end the map contains the result. (Note that this only works if no word occurs more than twice, since a third occurrence would be inserted again.)
If there's only one word that's present only once, you can probably use a HashMap or HashSet + Deque (set for values, Deque for indices) to do this in linear time. A sort can give you the same in n log(n), so slower than linear but a lot faster than your solution. By sorting, it's easy to find in linear time (after the sort) which element is present only once because all duplicates will be next to each other in the array.
For example for a linear solution in pseudo-code (pseudo-Kotlin!):
counters = HashMap()
for (i, word in words.withIndex()) {
counters.merge(word, Counter(i, 1), (oldVal, newVal) -> Counter(oldVal.firstIndex, oldVal.count + newVal.count));
}
for (counter in counters.values()) {
if (counter.count == 1) return counter.firstIndex;
}
class Counter(firstIndex, count)
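The pseudo-code above could be translated to Java roughly as follows. This is only a sketch: the class name FirstUniqueIndex is made up, an int[] pair stands in for the Counter class, and a LinkedHashMap is used (an addition to the pseudo-code) so that the earliest unique element is found even when several elements are unique.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FirstUniqueIndex {

    // Returns the index of the first element that occurs exactly once, or -1 if there is none.
    static int firstUniqueIndex(List<String> words) {
        // value[0] = index of the first occurrence, value[1] = number of occurrences
        Map<String, int[]> counters = new LinkedHashMap<>();
        for (int i = 0; i < words.size(); i++) {
            counters.merge(words.get(i), new int[] {i, 1},
                    (oldVal, newVal) -> new int[] {oldVal[0], oldVal[1] + newVal[1]});
        }
        // LinkedHashMap iterates in first-insertion order, so the first hit has the smallest index.
        for (int[] counter : counters.values()) {
            if (counter[1] == 1) {
                return counter[0];
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstUniqueIndex(List.of("Hi", "Hi", "No"))); // prints 2
    }
}
```

Both loops touch each element once, so the running time is linear in the size of the list (assuming well-distributed hash codes).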
Map<String,Boolean> + loops
Instead of using Map<String,Integer> as suggested in other answers, you can maintain a HashMap (if you need to maintain the order, use LinkedHashMap instead) of type Map<String,Boolean>, where the value denotes whether an element is unique or not.
The simplest way to generate the map is the method put() in conjunction with a containsKey() check.
But there are also more concise options like replace() + putIfAbsent(). putIfAbsent() creates a new entry only if the key is not present in the map, so we can associate such a string with the value true (considered to be unique). On the other hand, replace() updates only an existing entry (otherwise the map is not affected); if the entry exists, the key is proved to be a duplicate and has to be associated with the value false (non-unique).
And since Java 8 we also have the method merge(), which expects three arguments: a key, a value, and a function that is used when the given key already exists to resolve the old value and the new one.
The last step is to generate the list of unique strings by iterating over the entry set of the newly created map. We need every key that has the value true (is unique) associated with it.
List<String> strings = // initializing the list
Map<String, Boolean> isUnique = new HashMap<>(); // or LinkedHashMap if you need preserve initial order of strings
for (String next: strings) {
isUnique.replace(next, false);
isUnique.putIfAbsent(next, true);
// isUnique.merge(next, true, (oldV, newV) -> false); // alternative: does the same as the two lines above
}
List<String> unique = new ArrayList<>();
for (Map.Entry<String, Boolean> entry: isUnique.entrySet()) {
if (entry.getValue()) unique.add(entry.getKey());
}
Stream-based solution
With streams, it can be done using collector toMap(). The overall logic remains the same.
List<String> unique = strings.stream()
.collect(Collectors.toMap( // creating intermediate map Map<String, Boolean>
Function.identity(), // key
key -> true, // value
(oldV, newV) -> false, // resolving duplicates
LinkedHashMap::new // Map implementation, if order is not important - discard this argument
))
.entrySet().stream()
.filter(Map.Entry::getValue)
.map(Map.Entry::getKey)
.toList(); // for Java 16+ or collect(Collectors.toList()) for earlier versions

How to remove Keys that would cause Collisions before executing Collectors.toMap()

I have a stream of objects similar to this previous question, however, instead of ignoring duplicate values, I would like to remove any values from that stream beforehand and print them out.
For example, from this snippet:
Map<String, String> phoneBook = people.stream()
.collect(toMap(Person::getName,
Person::getAddress));
If there were duplicate entries, it would cause a java.lang.IllegalStateException: Duplicate key error to be thrown.
The solution proposed in that question used a mergeFunction to keep the first entry if a collision was found.
Map<String, String> phoneBook =
people.stream()
.collect(Collectors.toMap(
Person::getName,
Person::getAddress,
(address1, address2) -> {
System.out.println("duplicate key found!");
return address1;
}
));
Instead of keeping the first entry, if there is a collision from a duplicate key in the stream, I want to know which value caused the collision and make sure that there are no occurrences of that value within the resulting map.
I.e. if "Bob" appeared three times in the stream, it should not be in the map even once.
In the process of creating that map, I would like to filter out any duplicate names and record them some way.
I want to make sure that when creating the list there can be no duplicate entry and for there to be some way to know which entries had duplicate keys in incoming stream. I was thinking about using groupingBy and filter beforehand to find the duplicate keys, but I am not sure what the best way to do it is.
I would like to remove any values from that stream beforehand.
As @JimGarrison has pointed out, preprocessing the data doesn't make sense.
You can't know in advance whether a name is unique or not until the whole data set has been processed.
Another thing you have to consider is that inside the stream pipeline (before the collector) you have no knowledge of what data has been encountered previously, because the results of intermediate operations should not depend on any state.
If you are thinking that streams act like a sequence of loops, and therefore assume that it's possible to preprocess stream elements before collecting them, that's not correct. Elements of the stream pipeline are processed lazily, one at a time. I.e. all the operations in the pipeline are applied to a single element, and each operation is applied only if it's needed (that's what laziness means).
For more information, have a look at this tutorial and API documentation
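The one-element-at-a-time behaviour can be observed with peek(). The sketch below (class name, method name and sample names are all made up for the demonstration) records which stage sees which element in a sequential stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class LazyPipelineDemo {

    // Records the order in which the pipeline stages see each element.
    static List<String> trace(List<String> names) {
        List<String> log = new ArrayList<>();
        List<String> result = names.stream()          // sequential stream
                .peek(n -> log.add("filter sees " + n))
                .filter(n -> n.length() > 3)
                .peek(n -> log.add("map sees " + n))
                .map(String::toUpperCase)
                .collect(Collectors.toList());
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        trace(List.of("Alise", "Bob", "Carol")).forEach(System.out::println);
        // filter sees Alise
        // map sees Alise
        // filter sees Bob
        // filter sees Carol
        // map sees Carol
        // result [ALISE, CAROL]
    }
}
```

"Alise" travels through the whole pipeline before "Bob" is even looked at, and "Bob" never reaches map() because the filter rejects it. There is no point in the pipeline at which all names have been seen.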
Implementations
You can segregate unique values and duplicates in a single stream statement by utilizing Collectors.teeing() and a custom object that will contain separate collections of duplicated and unique entries of the phone book.
Since the primary function of this object is only to carry the data, I've implemented it as a Java 16 record.
public record FilteredPhoneBook(Map<String, String> uniquePersonsAddressByName,
List<String> duplicatedNames) {}
Collector teeing() expects three arguments: two collectors and a function that merges the results produced by both collectors.
The map generated by the groupingBy() in conjunction with counting(), is meant to determine duplicated names.
Since there's no point in preprocessing the data, toMap(), which is used as the second collector, will create a map containing all names.
When both collectors hand their results to the merger function, it will take care of removing the duplicates.
public static FilteredPhoneBook getFilteredPhoneBook(Collection<Person> people) {
return people.stream()
.collect(Collectors.teeing(
Collectors.groupingBy(Person::getName, Collectors.counting()), // intermediate Map<String, Long>
Collectors.toMap( // intermediate Map<String, String>
Person::getName,
Person::getAddress,
(left, right) -> left),
(Map<String, Long> countByName, Map<String, String> addressByName) -> {
countByName.values().removeIf(count -> count == 1); // removing unique names
addressByName.keySet().removeAll(countByName.keySet()); // removing all duplicates
return new FilteredPhoneBook(addressByName, new ArrayList<>(countByName.keySet()));
}
));
}
Another way to address this problem is to utilize Map<String,Boolean> as the means of discovering duplicates, as @Holger has suggested.
The first collector is written using toMap(). It associates true with a key that has been encountered only once, and its mergeFunction assigns the value false if at least one duplicate was found.
The rest of the logic remains the same.
public static FilteredPhoneBook getFilteredPhoneBook(Collection<Person> people) {
return people.stream()
.collect(Collectors.teeing(
Collectors.toMap( // intermediate Map<String, Boolean>
Person::getName,
person -> true, // not proved to be a duplicate and initially considered unique
(left, right) -> false), // is a duplicate
Collectors.toMap( // intermediate Map<String, String>
Person::getName,
Person::getAddress,
(left, right) -> left),
(Map<String, Boolean> isUniqueByName, Map<String, String> addressByName) -> {
isUniqueByName.values().removeIf(Boolean::booleanValue); // removing unique names
addressByName.keySet().removeAll(isUniqueByName.keySet()); // removing all duplicates
return new FilteredPhoneBook(addressByName, new ArrayList<>(isUniqueByName.keySet()));
}
));
}
main() - demo
public static void main(String[] args) {
List<Person> people = List.of(
new Person("Alise", "address1"),
new Person("Bob", "address2"),
new Person("Bob", "address3"),
new Person("Carol", "address4"),
new Person("Bob", "address5")
);
FilteredPhoneBook filteredPhoneBook = getFilteredPhoneBook(people);
System.out.println("Unique entries:");
filteredPhoneBook.uniquePersonsAddressByName().forEach((k, v) -> System.out.println(k + " : " + v));
System.out.println("\nDuplicates:");
filteredPhoneBook.duplicatedNames().forEach(System.out::println);
}
Output
Unique entries:
Alise : address1
Carol : address4
Duplicates:
Bob
You can't know which keys are duplicates until you have processed the entire input stream. Therefore, any pre-processing step has to make a complete pass of the input before your main logic, which is wasteful.
An alternate approach could be:
Use the merge function to insert a dummy value for the offending key
At the same time, insert the offending key into a Set<K>
After the input stream is processed, iterate over the Set<K> to remove offending keys from the primary map.
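The three steps above might look like this in code. Note that this sketch uses Map.merge() in a plain loop rather than Collectors.toMap(), because the merge function of toMap() only receives the two colliding values and not the key. The Person record is a hypothetical stand-in for the question's class, and addresses are assumed to be non-null.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DuplicateKeyFilter {

    // Hypothetical stand-in for the question's Person class (records need Java 16+).
    record Person(String name, String address) {}

    // Builds name -> address and drops every name that occurs more than once.
    // The dropped names are added to the caller-supplied set.
    static Map<String, String> uniquePhoneBook(List<Person> people, Set<String> duplicates) {
        Map<String, String> phoneBook = new HashMap<>();
        for (Person p : people) {
            // merge() calls the function only when the key is already present,
            // so reaching it means the name is a duplicate.
            phoneBook.merge(p.name(), p.address(), (oldAddress, newAddress) -> {
                duplicates.add(p.name());
                return oldAddress; // placeholder value, removed after the loop
            });
        }
        phoneBook.keySet().removeAll(duplicates);
        return phoneBook;
    }

    public static void main(String[] args) {
        Set<String> duplicates = new HashSet<>();
        Map<String, String> book = uniquePhoneBook(List.of(
                new Person("Alise", "address1"),
                new Person("Bob", "address2"),
                new Person("Bob", "address3"),
                new Person("Carol", "address4"),
                new Person("Bob", "address5")), duplicates);
        System.out.println("Unique entries: " + book);
        System.out.println("Duplicates: " + duplicates);
    }
}
```

With the demo data, "Bob" ends up in the duplicates set and only "Alise" and "Carol" remain in the map.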
In mathematical terms you want to partition your grouped aggregate and handle both parts separately.
Map<String, String> makePhoneBook(Collection<Person> people) {
Map<Boolean, List<Person>> phoneBook = people.stream()
.collect(Collectors.groupingBy(Person::getName))
.values()
.stream()
.collect(Collectors.partitioningBy(list -> list.size() > 1,
Collectors.mapping(r -> r.get(0),
Collectors.toList())));
// handle duplicates
phoneBook.get(true)
.forEach(x -> System.out.println("duplicate found " + x));
return phoneBook.get(false).stream()
.collect(Collectors.toMap(
Person::getName,
Person::getAddress));
}

Java Map getValue not possible

I have code which gets all minimum values from a map called frequencies. Then it puts the min values, together with their percentage of the total, into a String. To calculate the percentage I want to call minEntryes.getValue() (minEntryes is the Map<String, Integer> with all the min values in it), but it does not work. My code:
StringBuilder wordFrequencies = new StringBuilder();
URL url = new URL(urlString);//urlString is a String parameter of the function
AtomicInteger elementCount = new AtomicInteger();//total count of all the different characters
Map<String, Integer> frequencies = new TreeMap<>();//where all the frequencies of the characters will be stored
//example: e=10, r=4, (=3 g=4...
//read and count all the characters, works fine
try (Stream<String> stream = new BufferedReader(
new InputStreamReader(url.openStream(), StandardCharsets.UTF_8)).lines()) {
stream
.flatMapToInt(CharSequence::chars)
.filter(c -> !Character.isWhitespace(c))
.mapToObj(Character::toString)
.map(String::toLowerCase)
.forEach(s -> {
frequencies.merge(s, 1, Integer::sum);
elementCount.getAndIncrement();
});
} catch (IOException e) {
return "IOException:\n" + e.getMessage();
}
//counting the letters which are present the least amount of times
//in the example from above those are
//r=4, g=4
try (Stream<Map.Entry<String, Integer>> stream = frequencies.entrySet().stream()) {
Map<String, Integer> minEntryes = new TreeMap<>();
stream
.collect(Collectors.groupingBy(Map.Entry::getValue))
.entrySet()
.stream()
.min(Map.Entry.comparingByKey())
.map(Map.Entry::getValue)
.ifPresent(key -> {
IntStream i = IntStream.range(0, key.size()); // range, not rangeClosed: the upper bound must be exclusive
i.forEach(s -> minEntryes.put(key.get(s).getKey(), key.get(s).getValue()));
});
wordFrequencies.append("\n\nSeltenste Zeichen: (").append(100 / elementCount.floatValue() * minEntryes.getValue().append("%)"));
//this does not work
minEntryes.forEach((key, value) -> wordFrequencies.append("\n'").append(key).append("'"));
}
The compiler tells me to call get(String key), but I don't know the key. I know my code for filling the Map is way too complicated, but I can't use Optional in this case (the task prohibits it). I tried to make it simpler, but nothing worked.
I could get a key from minEntryes.forEach, but I'm wondering if there's a better solution for this.
It's not clear to me what you are trying to do, but if the question is how to get the value without knowing the key:
1st method: Use a for loop
for (int value : minEntryes.values()) {
// use 'value' instead of 'minEntryes.getValue()'
}
2nd method: Iterator "hack" (If you know there is always one value)
int value = minEntryes.values().iterator().next();
// use 'value' instead of 'minEntryes.getValue()'
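If the goal is to collect the rarest characters and their share of the total, the whole step can probably be written without Optional and without knowing any key. The following is only a sketch under that assumption: the class and method names are invented, and the percentage is taken to mean "count of one rarest character divided by the total count".

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

class RarestEntries {

    // Returns the entries whose count equals the smallest count in the map.
    static Map<String, Integer> rarest(Map<String, Integer> frequencies) {
        Map<String, Integer> result = new TreeMap<>();
        if (frequencies.isEmpty()) {
            return result; // Collections.min would throw on an empty collection
        }
        int min = Collections.min(frequencies.values());
        frequencies.forEach((key, count) -> {
            if (count == min) {
                result.put(key, count);
            }
        });
        return result;
    }

    // Share of the total, in percent, taken by a single character with the given count.
    static float percentOfTotal(int count, int total) {
        return 100f * count / total;
    }

    public static void main(String[] args) {
        Map<String, Integer> frequencies = new TreeMap<>(Map.of("e", 10, "r", 4, "g", 4));
        Map<String, Integer> minEntries = rarest(frequencies);
        int minCount = minEntries.values().iterator().next(); // all values are equal here
        System.out.println(minEntries + " " + percentOfTotal(minCount, 18) + "%");
    }
}
```

Since every entry in the result has the same count, taking the first value with iterator().next() is safe once the map is known to be non-empty.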

Lambda to populate Map

I am trying to fill up a map with words and the number of their occurrences. I am trying to write a lambda to do it, like so:
Consumer<String> wordCount = word -> map.computeIfAbsent(word, (w) -> (new Integer(1) + 1).intValue());
map is Map<String, Integer>. It should just insert the word in the map as a key if it is absent and if it is present it should increase its integer value by 1. This one is not correct syntax-wise.
You can't increment the count using computeIfAbsent, since it will only be computed the first time.
You probably meant:
map.compute(word, (w, i) -> i == null ? 1 : i + 1);
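To make the difference visible, here is a small sketch that counts the same words once with compute() and once with computeIfAbsent(). The class and method names are hypothetical; a TreeMap is used only to get a predictable print order.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountDemo {

    // compute(): the function runs for absent keys (i == null) and for present keys alike.
    static Map<String, Integer> countWithCompute(List<String> words) {
        Map<String, Integer> map = new TreeMap<>();
        words.forEach(word -> map.compute(word, (w, i) -> i == null ? 1 : i + 1));
        return map;
    }

    // computeIfAbsent(): the function runs only for absent keys, so every count stays at 1.
    static Map<String, Integer> countWithComputeIfAbsent(List<String> words) {
        Map<String, Integer> map = new TreeMap<>();
        words.forEach(word -> map.computeIfAbsent(word, w -> 1));
        return map;
    }

    public static void main(String[] args) {
        List<String> words = List.of("a", "b", "a");
        System.out.println(countWithCompute(words));         // prints {a=2, b=1}
        System.out.println(countWithComputeIfAbsent(words)); // prints {a=1, b=1}
    }
}
```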
This is what Collectors are for.
Assuming you have some Stream<String> words:
Map<String, Long> countedWords = words
.collect(Collectors
.groupingBy(
Function.identity(),
Collectors.counting()));
It doesn't compile because you can't call a method on a primitive:
new Integer(1) -> 1 // unboxing was applied
(1 + 1).intValue() // incorrect
I would write it with Map#put and Map#getOrDefault:
Consumer<String> consumer = word -> map.put(word, map.getOrDefault(word, 0) + 1);

Adding non-duplicated elements to existing keys in java 8 functional style

I have a map I want to populate:
private Map<String, Set<String>> myMap = new HashMap<>();
with this method:
private void compute(String key, String[] parts) {
myMap.computeIfAbsent(key, k -> getMessage(parts));
}
compute() is invoked as follows:
for (String line : messages) {
String[] parts = line.split("-");
validator.validate(parts); //validates parts are as expected
String key = parts[parts.length - 1];
compute(key, parts);
}
parts elements are like this:
[AB, CC, 123]
[AB, FF, 123]
[AB, 456]
In the compute() method, as you can see I am trying to use the last part of the element of the array as a key and the other parts to be used as values for the map I am looking to build.
My Question: How do I add to existing key only the unique values using Java 8 functional style e.g.
{123=[AB, FF, CC]}
As you requested I added a lambda variant, which just adds the parts via lambda to the map in the compute-method:
private void compute(String key, String[] parts) {
myMap.computeIfAbsent(key,
s -> Stream.of(parts)
.limit(parts.length - 1)
.collect(toSet()));
}
But in this case you will only get something like 123=[AB, CC] in your map. Use merge instead, if you want to add also all values which come on subsequent calls:
private void compute(String key, String[] parts) {
myMap.merge(key,
Stream.of(parts) // merge() takes the new value itself, not a function producing it
.limit(parts.length - 1)
.collect(toSet()),
(currentSet, newSet) -> {currentSet.addAll(newSet); return currentSet;});
}
I am not sure what you intend with computeIfAbsent, but from what you listed as parts and what you expect as output, you may also want to try the following instead of the whole code you listed :
// the function to identify your key
Function<String[], String> keyFunction = strings -> strings[strings.length - 1];
// the function to identify your values
Function<String[], List<String>> valuesFunction = strings -> Arrays.asList(strings).subList(0, strings.length - 1);
// a collector to add all entries of a collection to a (sorted) TreeSet
Collector<List<String>, TreeSet<Object>, TreeSet<Object>> listTreeSetCollector = Collector.of(TreeSet::new, TreeSet::addAll, (left, right) -> {
left.addAll(right);
return left;
});
Map myMap = Arrays.stream(messages) // or: messages.stream()
.map(s -> s.split("-"))
.peek(validator::validate)
.collect(Collectors.groupingBy(keyFunction,
Collectors.mapping(valuesFunction, listTreeSetCollector)));
Using your samples as input you get the result you mentioned (well, actually sorted, as I used a TreeSet).
String[] messages = new String[]{
"AB-CC-123",
"AB-FF-123",
"AB-456"};
produces a map containing:
123=[AB, CC, FF]
456=[AB]
Last, but not least: if you can, pass the key and the values themselves to your method. Don't split the logic about identifying the key and identifying the values. That makes it really hard to understand your code later on or by someone else.
Try this:
private void compute(String[] parts) {
int lastIndex = parts.length - 1;
String key = parts[lastIndex];
List<String> values = Arrays.asList(parts).subList(0, lastIndex);
myMap.computeIfAbsent(key, k -> new HashSet<>()).addAll(values);
}
Or if you want, you can replace the entire loop with a stream:
Map<String, Set<String>> myMap = messages.stream() // if messages is an array, use Arrays.stream(messages)
.map(line -> line.split("-"))
.peek(validator::validate)
.collect(Collectors.toMap(
parts -> parts[parts.length - 1],
parts -> new HashSet<>(Arrays.asList(parts).subList(0, parts.length - 1)),
(a, b) -> { a.addAll(b); return a; }));
To add more parts to a possibly existing key you're using the wrong method; you want merge(), not computeIfAbsent().
If validator.validate() throws a checked Exception, you must call it outside a stream, so you'll need a foreach loop:
for (String message : messages) {
String[] parts = message.split("-");
validator.validate(parts);
LinkedList<String> list = new LinkedList<>(Arrays.asList(parts));
String key = list.getLast();
list.removeLast();
myMap.merge(key, new HashSet<>(list), (a, b) -> { a.addAll(b); return a; }); // Set::addAll returns boolean, so it cannot be the merge function
}
Using a LinkedList, which has methods getLast() and removeLast(), makes the code very readable.
Disclaimer: Code may not compile or work as it was thumbed in on my phone (but there's a reasonable chance it will work)
