HashMap : Adding values with common keys and printing them out - java

I have a file containing strings in key/value form, such as a person's name and a count. For example:
"Reggy, 15"
"Jenny, 20"
"Reggy, 4"
"Jenny, 5"
In the output, the counts should be summed per key, so for this example the output would be:
"Reggy, 19"
"Jenny, 25"
Here is my approach:
Read each line and extract the key and the count using a Scanner, with , as the delimiter.
Then check whether the key has been seen before: if it has, add the current value to the previous value; if not, store the current value as the value in the HashMap.
Sample Implementation:
public static void main(final String[] argv) {
final File file = new File("C:\\Users\\rachel\\Desktop\\keyCount.txt");
try {
final Scanner scanner = new Scanner(file);
while (scanner.hasNextLine()) {
if (scanner.hasNext(".*,")) {
String key;
final String value;
key = scanner.next(".*,").trim();
if (!(scanner.hasNext())) {
// pick a better exception to throw
throw new Error("Missing value for key: " + key);
}
key = key.substring(0, key.length() - 1);
value = scanner.next();
System.out.println("key = " + key + " value = " + value);
}
}
} catch (final FileNotFoundException ex) {
ex.printStackTrace();
}
}
The part I am not clear about is how to split each key/value pair while reading it in and how to build the HashMap from that.
Also, is the approach I am suggesting optimal, or is there a way to improve the performance further?

Since this is almost certainly a learning exercise, I'll stay away from writing code, letting you have all the fun.
Create a HashMap<String,Integer>. Every time you see a key/value pair, check whether the hash map already has a value for the key (use containsKey(key)). If it does, get the old value using get(key), add the new value, and store the result back using put(key, newValue). If the key is not there yet, add a new entry, again using put. Don't forget to make an int out of the String value (use Integer.valueOf(value) for that).
As far as optimizing goes, any optimization at this point would be premature: it does not even work! However, it's hard to get much faster than a single loop that you have, which is also rather straightforward.
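The steps described above could be sketched roughly as follows. This is only a minimal illustration, not the asker's final program: the class name KeyCountSum, the method sumByKey, and the in-memory list of lines standing in for the file are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the names and the in-memory input are assumptions.
public class KeyCountSum {

    // Sums the counts per key for lines of the form "name, count".
    public static Map<String, Integer> sumByKey(List<String> lines) {
        // LinkedHashMap keeps keys in first-seen order; a plain HashMap works too.
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (String line : lines) {
            String[] pair = line.split(",");
            if (pair.length != 2) {
                throw new IllegalArgumentException("Malformed line: " + line);
            }
            String key = pair[0].trim();
            int value = Integer.parseInt(pair[1].trim());
            if (totals.containsKey(key)) {
                // Key seen before: add the new count to the stored one.
                totals.put(key, totals.get(key) + value);
            } else {
                // First occurrence of this key.
                totals.put(key, value);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("Reggy, 15", "Jenny, 20", "Reggy, 4", "Jenny, 5");
        for (Map.Entry<String, Integer> e : sumByKey(lines).entrySet()) {
            System.out.println(e.getKey() + ", " + e.getValue());
        }
        // Prints:
        // Reggy, 19
        // Jenny, 25
    }
}
```

Reading from the real file would only change where the lines come from; the map logic stays the same.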

Try this:
Map<String, Long> map = new HashMap<String, Long>();
while (scanner.hasNextLine()) {
if (scanner.hasNext(".*,")) {
....
if(map.containsKey(key))
map.put(key, map.get(key) + Long.valueOf(value));
else
map.put(key, Long.valueOf(value));
}
}

The simplest way I can think of to split the values:
BufferedReader reader = new BufferedReader(new FileReader(file));
Map<String, Integer> mapping = new HashMap<String,Integer>();
String currentLine;
while ((currentLine = reader.readLine()) != null) {
String[] pair = currentLine.split(",");
if(pair.length != 2){ //could be less strict
throw new DataFormatException();
}
// trim: the sample lines have a space after the comma
String key = pair[0].trim();
int value = Integer.parseInt(pair[1].trim());
if(mapping.containsKey(key)){
value += mapping.get(key);
}
mapping.put(key,value);
}
reader.close();
This is most likely not the most efficient way in terms of performance, but it is pretty straightforward. Scanner is usually used for parsing, but the parsing here is not complex; it is just a string split.

For reading in, personally, I'd use:
Scanner.nextLine(), String.split(","), and Integer.valueOf(value)

Kind of late, but here is a clean solution with O(n) time complexity. It avoids sorting the arrays.
public class Solution {
public static void main(String[] args) {
// Anagram
String str1 = "School master";
String str2 = "The classroom";
char strChar1[] = str1.replaceAll("[\\s]", "").toLowerCase().toCharArray();
char strChar2[] = str2.replaceAll("[\\s]", "").toLowerCase().toCharArray();
HashMap<Character, Integer> map = new HashMap<Character, Integer>();
for (char c : strChar1) {
if(map.containsKey(c)){
int value=map.get(c)+1;
map.put(c, value);
}else{
map.put(c, 1);
}
}
for (char c : strChar2) {
if(map.containsKey(c)){
int value=map.get(c)-1;
map.put(c, value);
}else{
map.put(c, -1); // character is missing from the first string, so start below zero
}
}
for (char c : map.keySet()) {
if (map.get(c) != 0) {
System.out.println("Not anagram");
return; // stop here so that "Is anagram" is not printed as well
}
}
System.out.println("Is anagram");
}
}

public Map<String, Integer> mergeMaps(@NonNull final Map<String, Integer> mapOne,
@NonNull final Map<String, Integer> mapTwo) {
return Stream.of(mapOne.entrySet(), mapTwo.entrySet())
.flatMap(Collection::stream)
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, Integer::sum));
}

Related

Print words occurrences from Max to Min in Java (No Streams)

Can you please give me advice on how to print word occurrences from the most frequent to the least frequent?
I've tried different approaches and settled on a Map, which gets me closest to the expected result.
public class InputOutput {
private String wordsFrequency() {
StringBuilder result = new StringBuilder();
try {
Map<String, Integer> map = new HashMap<>();
BufferedReader reader = new BufferedReader(new FileReader("words.txt"));
String words;
while ((words = reader.readLine()) != null) {
Scanner scan = new Scanner(words);
while (scan.hasNext()) {
String word = scan.next();
if (map.containsKey(word))
map.put(word, map.get(word) + 1);
else
map.put(word, 1);
}
scan.close();
}
reader.close();
Set<Entry<String, Integer>> entrySet = map.entrySet();
for (Entry<String, Integer> entry : entrySet) {
result.append(entry.getKey()).append("\t").append(entry.getValue()).append("\n");
}
} catch (IOException e) {
e.printStackTrace();
}
return result.toString();
}
public static void main(String[] args) {
InputOutput requestedData = new InputOutput();
System.out.println(requestedData.wordsFrequency());
}
}
File contents:
the day is sunny the the
the sunny is is is is is is
Expected output:
is 7
the 4
sunny 2
day 1
The output I'm getting:
the 4
is 7
sunny 2
day 1
List<Map.Entry<String, Integer>> frequencies = new ArrayList<>(map.entrySet());
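frequencies.sort(Comparator.comparing((Map.Entry<String, Integer> e) -> e.getValue()).reversed());
A List can be sorted with a Comparator, and a SortedSet such as a TreeSet can be given a Comparator to order its elements. Here the comparator is built from a function that returns a Comparable value (the count). The lambda parameter needs an explicit type; otherwise the compiler cannot infer it once reversed() is chained.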
I'm sure there's a cleaner way to do it, but without using streams here's what I came up with:
String src = "the day is sunny the the the sunny is is is is is is";
try (Scanner scanner = new Scanner(new StringReader(src))) {
Map<String, Integer> map = new HashMap<>();
while (scanner.hasNext()) {
String word = scanner.next();
map.merge(word, 1, (a, b) -> a + 1);
}
Map<Integer, Collection<String>> cntMap = new TreeMap<>(Comparator.reverseOrder());
for (Entry<String, Integer> entry : map.entrySet()) {
Collection<String> list = cntMap.get(entry.getValue());
if (list == null) {
list = new TreeSet<>();
cntMap.put(entry.getValue(), list);
}
list.add(entry.getKey());
}
for (Entry<Integer, Collection<String>> entry : cntMap.entrySet()) {
System.out.println(entry.getValue() + " : " + entry.getKey());
}
}
You already have your data; here is how to get it in reverse sorted order.
Declare a SortedSet with a comparator that compares the values of the entries,
then add the entries to the SortedSet; they will be sorted as they are added.
Entry.comparingByValue(Comparator.reverseOrder()) sorts on the count only, in reverse order. Because a TreeSet treats elements that compare as equal as duplicates, words with the same count would collapse into a single entry, so add a tie-breaker if counts can repeat.
SortedSet<Entry<String,Integer>> set
= new TreeSet<>(Entry.comparingByValue(Comparator.reverseOrder()));
set.addAll(map.entrySet());
Then print them.
set.forEach(e-> System.out.printf("%-7s : %d%n", e.getKey(), e.getValue()));
For your data, this would print
is : 7
the : 4
sunny : 2
day : 1
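A TreeSet drops any element for which its comparator returns 0 against an element already present, so a comparator that looks only at the count keeps just one word per distinct count. The sample data happens to have distinct counts, so this does not show there. The following is a minimal sketch of one way around it, using the key as a tie-breaker; the class name SortedCounts, the method sortByCountDesc, and the extra word "rainy" are assumptions for illustration.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical example class; names are assumptions, not from the original answer.
public class SortedCounts {

    // Orders entries by count (descending), then by word (ascending) to break ties.
    public static SortedSet<Map.Entry<String, Integer>> sortByCountDesc(Map<String, Integer> counts) {
        Comparator<Map.Entry<String, Integer>> byCountDesc =
                Map.Entry.<String, Integer>comparingByValue().reversed();
        Comparator<Map.Entry<String, Integer>> withTieBreak =
                byCountDesc.thenComparing(Map.Entry.<String, Integer>comparingByKey());
        SortedSet<Map.Entry<String, Integer>> sorted = new TreeSet<>(withTieBreak);
        sorted.addAll(counts.entrySet());
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("the", 4);
        counts.put("is", 7);
        counts.put("sunny", 2);
        counts.put("rainy", 2); // same count as "sunny": kept thanks to the tie-breaker
        counts.put("day", 1);
        for (Map.Entry<String, Integer> e : sortByCountDesc(counts)) {
            System.out.println(e.getKey() + " : " + e.getValue());
        }
        // Order printed: is, the, rainy, sunny, day
    }
}
```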
The issues with the code you've provided:
If an exception occurs, the stream will not be closed. Moreover, even if all the data is read from the file successfully, an exception thrown while closing the reader means you'll lose the data, because the lines of code that process the map will not be executed. Use try-with-resources to ensure that your resources are properly closed.
Don't cram too much logic into one method. There are at least two responsibilities, and they should reside in separate methods, as the Single responsibility principle suggests.
Instead of utilizing Scanner you can split the line that has been read from a file.
And your current logic lacks sorting. That's why your current and expected output don't match.
You can generate a map Map<String, Integer> representing the frequency of each word.
Then create a list of entries of this map, sort it based on values in descending order.
And finally turn the sorted list of entries into a list of strings which you can print.
private static Map<String, Integer> wordsFrequency(String file) {
Map<String, Integer> frequencies = new HashMap<>();
try (var reader = Files.newBufferedReader(Path.of(file))) {
String line;
// read every line of the file, not just the first one
while ((line = reader.readLine()) != null) {
for (String word : line.split(" ")) {
// frequencies.merge(word, 1, Integer::sum); // an equivalent of the 2 lines below
int count = frequencies.getOrDefault(word, 0);
frequencies.put(word, count + 1);
}
}
} catch (IOException e) {
e.printStackTrace();
}
return frequencies;
}
public static List<String> mapToSortedList(Map<String, Integer> map) {
List<Map.Entry<String, Integer>> entries = new ArrayList<>(map.entrySet());
// sorting the list of entries
entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
List<String> result = new ArrayList<>();
for (Map.Entry<String, Integer> entry :entries) {
result.add(entry.getKey() + " " + entry.getValue());
}
return result;
}
public static void main(String[] args) {
mapToSortedList(wordsFrequency("filePath.txt")).forEach(System.out::println);
}

Why do I get the wrong HashCode from string?

I have this structure:
private HashMap<Integer,HashMap<Object,Integer>> commandList= new HashMap<>();
populated this way:
{1={1=2, 2=3, 3=4, -999=-999, -998=-998}}
from this code:
if ((msgTypeTemp=commandList.get(this.msgType).get(msgContent))==null) {
Object s= "1";
System.out.println("Class of s: "+s.getClass().getSimpleName()+"\nClass of msgContent: "+msgContent.getClass().getSimpleName());
System.out.println("msgMap:\n"+msgMap);
System.out.println("commandList:\n"+commandList);
System.out.println(s.hashCode());
System.out.println(msgContent.hashCode());
System.out.println(commandList.get(this.msgType).get(s));
this.msgType=JSockOS_UndefinedMsg.MSG_CODE;
specialMsg=true;
} else {
this.msgType=msgTypeTemp;
if (specialMsgType(this.msgType)){
specialMsg=true;
}
}
My HashMap is generic type <String,Integer>
However, whenever I call the get method with msgContent, the lookup does not use the hash code of "1". Instead, it uses a hash code that was 0 until that moment and then changed after the get call.
This happens only for calls that use the msgContent parameter.
If I use this: System.out.println(commandList.get(this.msgType).get(s));
It returns "2" as expected...
Look also this image, it may help.
msgContent is changed before the code above runs, as follows: it is first "2.1" and then becomes "1", remaining a String.
msgContent=msgContent.toString().split(Pattern.quote("."))[1];
do(msgContent); // a methods which implements the code showed before.
//msgContent is a parameter, --> public void do(Object msgContent)
[EDIT]:
PROBLEM FOUND: msgContent is 495 chars... will fix its changes and update!
Even though String is immutable, the value of hashCode is computed lazily for performance reasons, as shown here:
public int hashCode() {
int h = hash;
if (h == 0 && value.length > 0) {
char val[] = value;
for (int i = 0; i < value.length; i++) {
h = 31 * h + val[i];
}
hash = h;
}
return h;
}
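The lazy caching is an internal detail: seen from outside, two equal strings always report the same hash code, however they were built. A small check along these lines may help confirm that; the class name HashCodeCheck and the helper afterDot are assumptions for illustration, with the split taken from the question.

```java
import java.util.regex.Pattern;

// Hypothetical example class; names are assumptions, not from the question.
public class HashCodeCheck {

    // Returns the part after the first dot, mirroring the split used in the question.
    public static String afterDot(String s) {
        return s.split(Pattern.quote("."))[1];
    }

    public static void main(String[] args) {
        String literal = "1";
        String fromSplit = afterDot("2.1");
        System.out.println(literal.hashCode());        // 49, the char code of '1'
        System.out.println(fromSplit.hashCode());      // also 49
        System.out.println(literal.equals(fromSplit)); // true
    }
}
```

If msgContent really were the one-character string "1", its hash code would therefore be 49 as well.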
As for your actual problem: are you entirely certain that your keys are Strings? The type you've provided there is <Object, Integer>, not <String, Integer>.
My test case works fine as shown here (this prints elseSide):
public static void main(String... args) {
HashMap<Integer, HashMap<Object, Integer>> map = new HashMap<>();
HashMap<Object, Integer> innerMap = new HashMap<>();
innerMap.put("1", 2);
innerMap.put("2", 3);
innerMap.put("-999",-999);
innerMap.put("-998",-998);
map.put(1, innerMap);
int msgType = 1;
String msgContent = "2.1";
msgContent = msgContent.toString().split(Pattern.quote("."))[1];
System.out.println(map);
if(map.get(msgType).get(msgContent) == null) {
System.out.println("ifSide");
} else {
System.out.println("elseSide");
}
}
I think you should try adding the following debugging statements:
HashMap<Object, Integer> innerMap = commandList.get(this.msgType);
for(Map.Entry<Object, Integer> entry : innerMap.entrySet()) {
System.out.println("KeyClass: " + entry.getKey().getClass() +
"\tKeyValue:" + entry.getKey());
// This will make sure the string doesn't have any unprintable characters
if(entry.getKey() instanceof String) {
String key = (String) entry.getKey();
System.out.println("Key Length: " + key.length());
}
}
I don't think your key in your inner map is actually a String, or perhaps the String somehow has unprintable characters. A hash of 630719471 is much too high for a one character String. It's also possible that msgContent has unprintable characters as well.
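To check for unprintable characters, it can help to print each character's code rather than the string itself. The following is a minimal sketch; the class name KeyInspector and the method describe are assumptions for illustration, and the second sample string is made up.

```java
// Hypothetical debugging helper; names are assumptions, not from the question.
public class KeyInspector {

    // Renders the length and each character as U+XXXX so invisible characters show up.
    public static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        sb.append("length=").append(s.length()).append(" [");
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("1"));             // length=1 [U+0031]
        // A string that prints like "1" but carries two trailing NUL characters.
        System.out.println(describe("1\u0000\u0000")); // length=3 [U+0031 U+0000 U+0000]
    }
}
```

Calling describe on msgContent and on each key of the inner map would show at a glance whether the two really hold the same characters.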

compare & extract two Hashmap

I used a Scanner to read A.txt into HashMap A,
and the same method to read B.txt into HashMap B.
The two maps share some keys, and I would like to combine them.
If a key is present in both, print "key, value1, value2".
Here is what I have so far:
public static void main (String[] args) throws FileNotFoundException {
Scanner scanner1 = new Scanner(new File("score.txt"));
Map<String, String> tolerance = new HashMap<>();
Scanner scanner2 = new Scanner(new File("Count2.txt"));
Map<String, String> Pdegree = new HashMap<>();
while (scanner1.hasNextLine()) {
String line = scanner1.nextLine();
String[] array = line.split("\t",2);
String Name = array[0];
String score = array[1];
tolerance.put(Name,score);
}
while (scanner2.hasNextLine()) {
String line2 = scanner2.nextLine();
String[] array2 = line2.split("\t",2);
String Name2 = array2[0];
String degree = array2[1];
Pdegree.put(Name2,degree);
}
for(Map.Entry<String, String> entry : tolerance.entrySet()) {
String key = entry.getKey();
String value = entry.getValue();
for(Map.Entry<String, String> entry2 : Pdegree.entrySet()) {
String key2 = entry2.getKey();
String value2 = entry2.getValue();
if(key==key2){
System.out.println(key2 + "\t" + value + "\t" + value2);
}
}
}
}
}
Neither results nor error messages would show.
My question is how to extract the same key with respective values from two maps. Thanks.
I found the answer by myself. It should be
if(key.equals(key2))
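With equals the nested loops work, but they compare every key of one map against every key of the other. Since both are HashMaps, the second map can be queried directly instead. Below is a minimal sketch under a few assumptions: the class name CommonKeys and the method joinOnKey are made up for illustration, the sample data is invented, and neither map is expected to hold null values.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example class; names and sample data are assumptions.
public class CommonKeys {

    // Builds one "key<TAB>value1<TAB>value2" row for every key present in both maps.
    public static List<String> joinOnKey(Map<String, String> first, Map<String, String> second) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, String> entry : first.entrySet()) {
            // A single hash lookup replaces the inner loop; null means the key is absent
            // (assuming the maps never store null values).
            String other = second.get(entry.getKey());
            if (other != null) {
                rows.add(entry.getKey() + "\t" + entry.getValue() + "\t" + other);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> tolerance = new HashMap<>();
        tolerance.put("ann", "0.5");
        tolerance.put("bob", "0.7");
        Map<String, String> degree = new HashMap<>();
        degree.put("ann", "3");
        degree.put("cat", "9");
        joinOnKey(tolerance, degree).forEach(System.out::println); // only the "ann" row
    }
}
```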
You may use map1.putAll(map2) to combine two maps; note that for keys present in both, the value from map2 replaces the one in map1.
Why not use Guava's Multimap? I believe that if you use putAll and it comes across two identical keys, it simply adds a second value to the key. Then you can print out all the key/value pairs. What it does for an identical key with an identical value is implementation-dependent.
https://guava-libraries.googlecode.com/svn/tags/release03/javadoc/com/google/common/collect/Multimap.html#put(K, V)

What is the fastest method to find duplicates from a collection

This is what I have tried, but I get the feeling that it is either not right or not the best-performing approach. Is there a better way to search for and collect the duplicate values from a Map, or for that matter from any collection? And is there a better way to traverse a collection?
public class SearchDuplicates{
public static void main(String[] args) {
Map<Integer, String> directory=new HashMap<Integer, String>();
Map<Integer, String> repeatedEntries=new HashMap<Integer, String>();
// adding data
directory.put(1,"john");
directory.put(2,"michael");
directory.put(3,"mike");
directory.put(4,"anna");
directory.put(5,"julie");
directory.put(6,"simon");
directory.put(7,"tim");
directory.put(8,"ashley");
directory.put(9,"john");
directory.put(10,"michael");
directory.put(11,"mike");
directory.put(12,"anna");
directory.put(13,"julie");
directory.put(14,"simon");
directory.put(15,"tim");
directory.put(16,"ashley");
for(int i=1;i<=directory.size();i++) {
String result=directory.get(i);
for(int j=1;j<=directory.size();j++) {
if(j!=i && result==directory.get(j) &&j<i) {
repeatedEntries.put(j, result);
}
}
System.out.println(result);
}
for(Entry<Integer, String> entry : repeatedEntries.entrySet()) {
System.out.println("repeated "+entry.getValue());
}
}
}
Any help would be appreciated. Thanks in advance
You can use a Set to determine whether entries are duplicate. Also, repeatedEntries might as well be a Set, since the keys are meaningless:
Map<Integer, String> directory=new HashMap<Integer, String>();
Set<String> repeatedEntries=new HashSet<String>();
Set<String> seen = new HashSet<String>();
// ... initialize directory, then:
for(int j=1;j<=directory.size();j++){
String val = directory.get(j);
if (!seen.add(val)) {
// if add failed, then val was already seen
repeatedEntries.add(val);
}
}
At the cost of extra memory, this does the job in linear time (instead of quadratic time of your current algorithm).
EDIT: Here's a version of the loop that doesn't rely on the keys being consecutive integers starting at 1:
for (String val : directory.values()) {
if (!seen.add(val)) {
// if add failed, then val was already seen
repeatedEntries.add(val);
}
}
That will detect duplicate values for any Map, regardless of the keys.
You can use this to find the word count:
Map<String, Integer> repeatedEntries = new HashMap<String, Integer>();
for (String w : directory.values()) {
Integer n = repeatedEntries.get(w);
n = (n == null) ? 1 : ++n;
repeatedEntries.put(w, n);
}
and this to print the stats
for (Entry<String, Integer> e : repeatedEntries.entrySet()) {
System.out.println(e);
}
List and Vector have a contains(Object o) method, which returns a boolean indicating whether the object exists in the collection.
You can use Collections.frequency to find all possible duplicates in any collection:
Collections.frequency(list, "a")
The most generic way to find them:
Set<String> uniqueSet = new HashSet<String>(list);
for (String temp : uniqueSet) {
System.out.println(temp + ": " + Collections.frequency(list, temp));
}

Counting occurrences of a key in a Map in Java

I'm writing a project that captures Java keywords from a .java file and keeps track of the occurrences with a map. I've used a similar method in the past successfully, but I can't seem to adopt this method for my intended use here.
Map<String,Integer> map = new TreeMap<String,Integer>();
Set<String> keywordSet = new HashSet<String>(Arrays.asList(keywords));
Scanner input = new Scanner(file);
int counter = 0;
while (input.hasNext())
{
String key = input.next();
if (key.length() > 0)
{
if (keywordSet.contains(key))
{
map.put(key, 1);
counter++;
}
if(map.containsKey(key)) // <-- tried inner loop here, failed
{
int value = map.get(key);
value++;
map.put(key, value);
}
}
This block of code is supposed to add the keyword to the key, and increment the value each time the same key occurs. So far, it adds the keywords, but fails to properly increment the value. here is a sample output:
{assert=2, class=2, continue=2, default=2, else=2, ...}
Basically it increments every value in the map instead of the ones it's supposed to. I'm not sure if I'm over-thinking this or what. I've tried an inner loop and it gave me insane results. I really hope I'm just over-thinking this. Any help is greatly appreciated!
There's a much more concise (and easier to reason about) way to achieve what you want:
final ConcurrentMap<String, AtomicInteger> map = new ConcurrentHashMap<>();
final Scanner input = new Scanner(file);
while (input.hasNext()) {
final String key = input.next();
if (key.length() > 0) {
map.putIfAbsent(key, new AtomicInteger(0));
map.get(key).incrementAndGet();
}
}
Let's analyze why this works.
Whenever the Scanner encounters a keyword, there are two possible cases: either you have encountered it before (i.e., it is a known keyword), or it is a keyword you have not seen yet.
If it is an unseen keyword: putIfAbsent will put an AtomicInteger with value 0 in the map, and incrementAndGet() will set it to 1 right after, and, from now on, it becomes a known keyword;
If it is a known keyword: putIfAbsent will do nothing, and incrementAndGet() will increment the value that is already present in the map.
Then, if you want the key set, you do:
final Set<String> keys = map.keySet();
To print all the values, you could do something like:
for (final String k : map.keySet()) {
System.out.println(k + ": " + map.get(k).get());
}
You are not forced to use the two "different" classes I used above, ConcurrentMap and AtomicInteger. It is just easier to use them because they encapsulate much of the logic that you tried to write by yourself. The logic they encapsulate is exactly what the other answers describe (i.e., test whether the value is present, set it to 0 if not, then get whatever value is present, increment it, and put it back into the map).
To maintain the keys of the map (our words being counted) in alphabetical order, use a ConcurrentNavigableMap such as ConcurrentSkipListMap .
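As a rough sketch of that variation, the same putIfAbsent/incrementAndGet logic can be run against a ConcurrentSkipListMap. The class name SortedWordCount, the method count, and the use of a String in place of the file are assumptions for illustration.

```java
import java.util.Scanner;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; names and the String input are assumptions.
public class SortedWordCount {

    // Counts whitespace-separated tokens; the skip list map keeps keys in natural order.
    public static ConcurrentNavigableMap<String, AtomicInteger> count(String text) {
        ConcurrentNavigableMap<String, AtomicInteger> map = new ConcurrentSkipListMap<>();
        try (Scanner input = new Scanner(text)) {
            while (input.hasNext()) {
                String key = input.next();
                map.putIfAbsent(key, new AtomicInteger(0)); // no effect if the key exists
                map.get(key).incrementAndGet();
            }
        }
        return map;
    }

    public static void main(String[] args) {
        System.out.println(count("else class else assert class else"));
        // {assert=1, class=2, else=3}
    }
}
```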
For every key you scan you create a new entry in the map (overriding the existing one). Then, the next condition holds so you increment the count by 1, reaching the value 2.
The inner part should be something like:
if (keywordSet.contains(key))
{
Integer value = map.get(key);
if (value == null)
value = 0;
value++;
map.put(key, value);
}
Anyway, consider using some kind of a mutable integer to make this more efficient. You won't have to override entries in the map, and you won't be doing too much Integer boxing operations.
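One way to read the "mutable integer" suggestion is a small holder object stored in the map and incremented in place. The sketch below uses assumed names throughout (MutableCounter, Count, count); AtomicInteger or a one-element int[] would serve the same purpose.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; all names here are assumptions.
public class MutableCounter {

    // Minimal mutable holder, so the map entry itself is never replaced.
    public static final class Count {
        public int value;
    }

    public static Map<String, Count> count(Iterable<String> words) {
        Map<String, Count> map = new HashMap<>();
        for (String word : words) {
            Count c = map.get(word);
            if (c == null) {
                c = new Count();
                map.put(word, c); // the only put per distinct word
            }
            c.value++;            // increments in place, with no boxing
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Count> counts = count(Arrays.asList("class", "else", "class"));
        System.out.println("class=" + counts.get("class").value); // class=2
        System.out.println("else=" + counts.get("else").value);   // else=1
    }
}
```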
Even more concise using Map.merge (since Java 8):
if (keywordSet.contains(key)) {
map.merge(key, 1, (currentCount, notUsed) -> ++currentCount);
}
Here is a generic implementation of a counting map - a map with values representing the count of their keys:
public static <K> void count(K key, Map<K, Integer> map) {
map.merge(key, 1, (currentCount, notUsed) -> ++currentCount);
}
public static void main(String[] args) {
Map<String, Integer> map = new HashMap<>();
count("A", map);
count("B", map);
count("A", map);
count("Z", map);
count("A", map);
System.out.println(map); // {A=3, B=1, Z=1}
}
You always set the value to 1 and then update it by another one. What you need is to update the map value (and not setting it to 1 again).
Instead of:
map.put(key, 1);
use:
Integer value = map.get(key);
if (value == null){
value = 0;
}
value++;
map.put(key, value);
And drop the second if.
Map<String, Integer> map = new HashMap<String, Integer>();
Set<String> keywordSet = new HashSet<String>(Arrays.asList(keywords));
Scanner input = new Scanner(file);
while (input.hasNext()){
String key = input.next();
if (key.length() > 0)
if (keywordSet.contains(key)){
Integer counter = map.get(key);
if (counter == null)
map.put(key, 1);
else
map.put(key, counter + 1);
}
}
map.compute(key, (k, value) -> (value == null) ? 1 : (value + 1));
