I am currently working on a project to count the frequency of words in a text file. The driver program places the words into an ArrayList (after making them lowercase and removing whitespace), and then the FreqCount object places the ArrayList into a HashMap that will handle the frequency operations. So far, I can get the driver to read the text file, put it into an ArrayList, and then put that into the HashMap. My issue is that the HashMap nodes do not repeat, so I am trying to increment the value each time the word is seen.
Driver:
package threetenProg3;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.ArrayList;
public class Driver {
public static void main(String[] args) throws FileNotFoundException{
File in = new File("test.txt");
Scanner scanFile = new Scanner(in);
ArrayList<String> parsed = new ArrayList<String>();
while(scanFile.hasNext()) { //if this ends up cutting off bottom line, make it a do while loop
parsed.add(scanFile.next().toLowerCase());
}
for(int i = parsed.size()-1; i>=0; i--) { //prints arraylist backwards
System.out.println(parsed.get(i));
} //*/
FreqCount fc = new FreqCount(parsed);
System.out.println("\n Hashmap: \n");
fc.printMap();
scanFile.close();
}
}
FreqCount:
package threetenProg3;
import java.util.HashMap;
import java.util.List;
public class FreqCount {
//attributes and initializations
private HashMap<String, Integer> map = new HashMap<String, Integer>();
//constructors
FreqCount(List<String> driverList){
for(int dLIndex = driverList.size()-1; dLIndex>=0; dLIndex--) { //puts list into hashmap
for(String mapKey : map.keySet()) {
if(mapKey.equals(driverList.get(dLIndex))) {
int tval = map.get(mapKey);
map.remove(mapKey);
map.put(mapKey, tval+1);
}else {
map.put(mapKey, 1);
}
}
}
}
//methods
public void printMap() {
for (String i : map.keySet()) { //function ripped straight outta w3schools lol
System.out.println("key: " + i + " value: " + map.get(i));
}
} //*/
}
Text file:
ONE TWO ThReE FoUR fIve
six seven
EIGHT
NINE
TEN ELEVEN
ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE, ONE, ONE,
Output:
one,
one,
one,
one
one
one
one
one
one
one
one
one
one
one
one
one
one
eleven
ten
nine
eight
seven
six
five
four
three
two
one
Hashmap:
key: nine value: 1
key: one, value: 1
key: six value: 1
key: four value: 1
key: one value: 1
key: seven value: 1
key: eleven value: 1
key: ten value: 1
key: five value: 1
key: three value: 1
key: two value: 1
key: eight value: 1
From what I see, the output should be printing the correct values for the keys' frequencies. Thanks in advance for any help!
You can change the definition of FreqCount as follows:
FreqCount(List<String> driverList) {
for (int dLIndex = driverList.size() - 1; dLIndex >= 0; dLIndex--) {
String key = driverList.get(dLIndex);
if (map.get(key) == null) {
map.put(key, 1);
} else {
map.put(key, map.get(key) + 1);
}
}
}
Output after this change:
Hashmap:
key: nine value: 1
key: one, value: 3
key: six value: 1
key: four value: 1
key: one value: 15
key: seven value: 1
key: eleven value: 1
key: ten value: 1
key: five value: 1
key: three value: 1
key: two value: 1
key: eight value: 1
Alternatively,
FreqCount(List<String> driverList) {
for (int dLIndex = driverList.size() - 1; dLIndex >= 0; dLIndex--) {
String key = driverList.get(dLIndex);
map.put(key, map.getOrDefault(key, 0) + 1);
}
}
Map#getOrDefault returns the value to which the specified key is mapped, or the given default value if this map contains no mapping for the key.
The easiest way to do this, imo, is to use the Map.merge method. If the key is absent, merge stores the supplied value (here 1). Otherwise it applies the remapping function to the existing value and the supplied value and stores the result. In this case the second argument is unused and the existing value is replaced with the existing value + 1, so you end up with the frequency of each string.
Also notice that I changed your class to use a parse method which returns the map. It is not appropriate to do much computation in a class constructor.
After reading in the values:
FreqCount fc = new FreqCount();
Map<String,Integer> map = fc.parse(parsed);
map.entrySet().forEach(System.out::println);
prints
nine=1
one,=3
six=1
four=1
one=15
seven=1
eleven=1
ten=1
five=1
three=1
two=1
eight=1
The modified class
class FreqCount {
// attributes and initializations
private Map<String, Integer> map =
new HashMap<>();
public Map<String, Integer> parse (List<String> driverList) {
for (String str : driverList) {
map.merge(str, 1, (v1,notUsed)->v1 + 1);
}
return map;
}
}
You have several issues there.
FreqCount(List<String> driverList){
for(int dLIndex = driverList.size()-1; dLIndex>=0; dLIndex--) { //puts list into hashmap
if(map.get(driverList.get(dLIndex)) != null) {
int tval = map.get(driverList.get(dLIndex));
map.remove(driverList.get(dLIndex));
map.put(driverList.get(dLIndex), tval+1);
}else {
map.put(driverList.get(dLIndex), 1);
}
}
}
Your inner for loop iterates over the map's key set, but the map starts out empty, so that loop has nothing to iterate over. Instead, check whether the map already has a key for the word: if it does, add one to its value; if it does not, add a new pair with the value 1.
To avoid counting words with trailing commas or periods as separate keys, strip those characters while reading (you can add other characters to the regex that replaceAll takes as a parameter):
while(scanFile.hasNext()) { //if this ends up cutting off bottom line, make it a do while loop
String value = scanFile.next().toLowerCase();
value = value.replaceAll("[,.]", "");
parsed.add(value);
}
The output now is
one
one
one
one
one
one
one
one
one
one
one
one
one
one
one
one
one
eleven
ten
nine
eight
seven
six
five
four
three
two
one
Hashmap:
key: nine value: 1
key: six value: 1
key: four value: 1
key: one value: 18
key: seven value: 1
key: eleven value: 1
key: ten value: 1
key: five value: 1
key: three value: 1
key: two value: 1
key: eight value: 1
Each word now appears only once in the map, even if it was followed by a comma, and every count is correct.
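The two steps above (stripping punctuation, then counting) can be combined in one pass. This is only a minimal sketch: the class name WordFrequencySketch and the method countWords are hypothetical, and it assumes that commas and periods are the only punctuation to remove.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordFrequencySketch {
    // Lowercases each token, strips commas and periods, and counts what remains.
    static Map<String, Integer> countWords(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            String word = token.toLowerCase().replaceAll("[,.]", "");
            if (word.isEmpty()) {
                continue; // the token consisted of punctuation only
            }
            // Stores 1 for a new word, otherwise adds 1 to the existing count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // HashMap iteration order is unspecified, so the print order may vary.
        System.out.println(countWords(Arrays.asList("ONE", "one,", "Two", "one.")));
    }
}
```

Doing the normalization inside the counting method keeps the driver free of cleanup logic, at the cost of hard-coding the punctuation rule in one place.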
Related
Harold is a kidnapper who wrote a ransom note, but now he is worried it will be traced back to him through his handwriting. He found a magazine and wants to know if he can cut out whole words from it and use them to create an untraceable replica of his ransom note. The words in his note are case-sensitive and he must use only whole words available in the magazine. He cannot use substrings or concatenation to create the words he needs.
Given the words in the magazine and the words in the ransom note, print Yes if he can replicate his ransom note exactly using whole words from the magazine; otherwise, print No.
For example, the note is "Attack at dawn". The magazine contains only "attack at dawn". The magazine has all the right words, but there's a case mismatch. The answer is No.
Sample Input 0
6 4
give me one grand today night
give one grand today
Sample Output 0
Yes
Sample Input 1
6 5
two times three is not four
two times two is four
Sample Output 1
No
My code 5/22 test cases failed :(
I can't figure out why 5 failed.
static void checkMagazine(String[] magazine, String[] note) {
int flag = 1;
Map<String, Integer> wordMap = new HashMap<>();
for(String word: magazine) {
if(!wordMap.containsKey(word)) {
wordMap.put(word, 1);
} else
wordMap.put(word,wordMap.get(word)+1);
}
for(String word: note){
if(!wordMap.containsKey(word)){
flag = 0;
break;
}
else wordMap.remove(word, wordMap.get(word));
}
if(flag == 0)
System.out.println("No");
else
System.out.println("Yes");
}
It's probably because instead of decrementing the count of the words in the magazine when you retrieve one, you're removing all counts of that word completely. Try this:
for(String word: note){
if(!(wordMap.containsKey(word) && wordMap.get(word) > 0)){
flag = 0;
break;
}
else wordMap.put(word, wordMap.get(word)-1);
}
wordMap is a frequency table and gives word counts.
However, for every word in the note you must decrease the word count instead of removing the entry entirely. The entry can be removed only once the word count reaches 0.
Another issue is case sensitivity. Depending on the requirements, you may need to convert all words to lowercase; in this problem the words are case-sensitive, so no conversion is needed.
else {
wordMap.computeIfPresent(word, (k, v) -> v <= 1? null : v - 1);
}
This checks that the old value v is above 1 and then decreases it, or else returns a null value signaling to delete the entry.
The frequency counts can be done:
Map<String, Integer> wordMap = new HashMap<>();
for(String word: magazine) {
wordMap.merge(word, 1, Integer::sum);
}
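Putting the pieces of this answer together, a minimal sketch might look like the following. The class name RansomNoteSketch and the method canWrite are hypothetical, and returning a boolean instead of printing is a choice made here for clarity, not part of the original problem statement.

```java
import java.util.HashMap;
import java.util.Map;

class RansomNoteSketch {
    // Returns true if every word of the note can be taken from the magazine,
    // using each magazine word at most as often as it occurs there.
    static boolean canWrite(String[] magazine, String[] note) {
        Map<String, Integer> available = new HashMap<>();
        for (String word : magazine) {
            available.merge(word, 1, Integer::sum);
        }
        for (String word : note) {
            if (!available.containsKey(word)) {
                return false; // word never occurred, or all copies are used up
            }
            // Decrement the count; returning null removes the entry at zero.
            available.computeIfPresent(word, (k, v) -> v <= 1 ? null : v - 1);
        }
        return true;
    }

    public static void main(String[] args) {
        String[] magazine = "two times three is not four".split(" ");
        String[] note = "two times two is four".split(" ");
        System.out.println(canWrite(magazine, note) ? "Yes" : "No");
    }
}
```

Because entries are removed when their count reaches zero, a plain containsKey check is enough to detect a word that has run out.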
I think this implementation is simpler:
static boolean checkMagazine(String[] magazine, String[] note) {
List<String> magazineCopy = new ArrayList<>(Arrays.asList(magazine));
for (String word : note)
{
if (magazineCopy.contains(word)) {
magazineCopy.remove(word);
continue;
}
return false;
}
return true;
}
I suppose your error is here:
else wordMap.remove(word, wordMap.get(word));
You are removing the word from the map instead of decreasing its count. Remove the word from the map only when the count reaches 0.
Python Solution
def checkMagazine(magazine, ransom):
    magazine.sort()
    ransom.sort()
    flag = True  # an empty note can always be written
    for word in ransom:
        if word not in magazine:
            flag = False
            break
        # use up one copy of the word
        magazine.remove(word)
    if flag:
        print("Yes")
    else:
        print("No")
I have two HashMaps and want to compare them as fast as possible. The problem is that each key in mapA consists of two words joined by a space, while each key in mapB is a single word.
I don't want to count the occurrences, that is already done; I want to compare the two different kinds of Strings.
mapA:
key: hello world, value: 10
key: earth hi, value: 20
mapB:
key: hello, value: 5
key: world, value: 15
key: earth, value: 25
key: hi, value: 35
The first key of mapA should find the keys "hello" and "world" in mapB.
What I am trying to do is parse a long text to find co-occurrences and compute how often they occur relative to all words.
My first try:
for(String entry : mapA.keySet())
{
String key = (String) entry;
Integer mapAvalue = (Integer) mapA.get(entry);
Integer tokenVal1=0, tokenVal2=0;
String token1=key.substring(0, key.indexOf(" "));
String token2=key.substring(key.indexOf(" "),key.length()).trim();
for( String mapBentry : mapb.keySet())
{
String tokenkey = mapBentry;
if(tokenkey.equals(token1)){
tokenVal1=(Integer)tokens.get(tokenentry);
}
if(tokenkey.equals(token2)){
tokenVal2=(Integer)tokens.get(tokenentry);
}
if(token1!=null && token2!=null && tokenVal1>1000 && tokenVal2>1000 ){
procedurecall(mapAvalue, token1, token2, tokenVal1, tokenVal2);
}
}
}
You shouldn't iterate over a HashMap (O(n)) if you are just trying to find a particular key, that's what the HashMap lookup (O(1)) is used for. So eliminate your inner loop.
Also you can eliminate a few unnecessary variables in your code (e.g. key, tokenkey). You also don't need a third tokens map, you can put the token values in mapb.
for (String entry : mapA.keySet())
{
    Integer mapAvalue = (Integer) mapA.get(entry);
    String token1 = entry.substring(0, entry.indexOf(" "));
    String token2 = entry.substring(entry.indexOf(" "), entry.length()).trim();
    if (mapb.containsKey(token1) && mapb.containsKey(token2))
    {
        // look up the tokens:
        Integer tokenVal1 = (Integer) mapb.get(token1);
        Integer tokenVal2 = (Integer) mapb.get(token2);
        if (tokenVal1 > 1000 && tokenVal2 > 1000)
        {
            procedurecall(mapAvalue, token1, token2, tokenVal1, tokenVal2);
        }
    }
}
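For a self-contained version of the same idea, here is a rough sketch with typed maps so no casts are needed. The class CoOccurrenceSketch, the method countPairsAbove, and the threshold parameter are hypothetical; the counter merely stands in for the question's procedurecall, whose behavior is not shown.

```java
import java.util.HashMap;
import java.util.Map;

class CoOccurrenceSketch {
    // Counts the two-word keys whose words both exceed the threshold in wordCounts.
    static int countPairsAbove(Map<String, Integer> pairCounts,
                               Map<String, Integer> wordCounts,
                               int threshold) {
        int matches = 0;
        for (String pair : pairCounts.keySet()) {
            String[] tokens = pair.split(" ", 2);
            if (tokens.length < 2) {
                continue; // key does not contain two words
            }
            // Direct O(1) lookups replace the inner loop over all keys.
            Integer first = wordCounts.get(tokens[0]);
            Integer second = wordCounts.get(tokens[1].trim());
            if (first != null && second != null
                    && first > threshold && second > threshold) {
                matches++; // placeholder for the real per-pair processing
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, Integer> mapA = new HashMap<>();
        mapA.put("hello world", 10);
        mapA.put("earth hi", 20);
        Map<String, Integer> mapB = new HashMap<>();
        mapB.put("hello", 5);
        mapB.put("world", 15);
        mapB.put("earth", 25);
        mapB.put("hi", 35);
        System.out.println(countPairsAbove(mapA, mapB, 10));
    }
}
```

Using get once and checking for null avoids the second lookup that a containsKey followed by get would perform.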
I've made a String array out of a .txt file and now want to make a HashMap with these strings as keys. But I don't want the whole array as one key mapped to one value; I want each element to be its own key in the HashMap.
private static String[] readAndConvertInputFile() {
String str = StdIn.readAll();
String conv = str.replaceAll("\'s", "").replaceAll("[;,?.:*/\\-_()\"\'\n]", " ").replaceAll(" {2,}", " ").toLowerCase();
return conv.split(" "); }
So the information in the string is like ("word", "thing", "etc.", "pp.", "thing").
My value should be the frequency of the word in the text. So for example key: "word" value: 1, key: "thing" value: 2 and so on... I'm clueless and would be grateful if someone could help me, at least with the key. :)
You can create a Map while using the String value at each array index as the key, and an Integer as the value to keep track of how many times a word appeared.
Map<String,Integer> occurences = new HashMap<String,Integer>();
Then when you want to increment, you can check if the Map already contains the key, if it does, increase it by 1, otherwise, set it to 1.
if (occurences.containsKey(word)) {
occurences.put(word, occurences.get(word) + 1);
} else {
occurences.put(word, 1);
}
So, while you are looping over your string array, convert the String to lower case (if you want to ignore case for word occurrences), and increment the map using the if statement above.
for (String word : words) {
word = word.toLowerCase(); // remove if you want case sensitivity
if (occurences.containsKey(word)) {
occurences.put(word, occurences.get(word) + 1);
} else {
occurences.put(word, 1);
}
}
A full example is shown below. I converted the words to lowercase to ignore case when using them as keys in the map; if you want to keep case, remove the line where I convert to lowercase.
public static void main(String[] args) {
String s = "This this the has dog cat fish the cat horse";
String[] words = s.split(" ");
Map<String, Integer> occurences = new HashMap<String, Integer>();
for (String word : words) {
word = word.toLowerCase(); // remove if you want case sensitivity
if (occurences.containsKey(word)) {
occurences.put(word, occurences.get(word) + 1);
} else {
occurences.put(word, 1);
}
}
for(Entry<String,Integer> en : occurences.entrySet()){
System.out.println("Word \"" + en.getKey() + "\" appeared " + en.getValue() + " times.");
}
}
Which will give me output:
Word "cat" appeared 2 times.
Word "fish" appeared 1 times.
Word "horse" appeared 1 times.
Word "the" appeared 2 times.
Word "dog" appeared 1 times.
Word "this" appeared 2 times.
Word "has" appeared 1 times.
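Yes, you can use an array (regardless of element type) as a HashMap key.
No, you shouldn't do so. Arrays inherit identity-based equals and hashCode from Object, so two arrays with the same contents are treated as different keys, which is unlikely to be what you want (in general).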
In your particular case, I don't see why you even propose using an array as a key in the first place. You seem to want Strings drawn from among your array elements as keys.
You could construct a word frequency table like so:
Map<String, Integer> computeFrequencies(String[] words) {
Map<String, Integer> frequencies = new HashMap<String, Integer>();
for (String word: words) {
Integer wordFrequency = frequencies.get(word);
frequencies.put(word,
(wordFrequency == null) ? 1 : (wordFrequency + 1));
}
return frequencies;
}
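To illustrate the earlier point about arrays as keys, the following sketch compares an array key with a List key. The class ArrayKeySketch and its method names are hypothetical and exist only for this demonstration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ArrayKeySketch {
    // Looks up an array with equal contents but a different identity.
    static boolean arrayKeyFound() {
        Map<String[], Integer> byArray = new HashMap<>();
        byArray.put(new String[] {"word", "thing"}, 1);
        // Arrays compare by identity, so this distinct array is not found.
        return byArray.containsKey(new String[] {"word", "thing"});
    }

    // Lists compare by contents, so an equal list is found.
    static boolean listKeyFound() {
        Map<List<String>, Integer> byList = new HashMap<>();
        byList.put(Arrays.asList("word", "thing"), 1);
        return byList.containsKey(Arrays.asList("word", "thing"));
    }

    public static void main(String[] args) {
        System.out.println("array key found: " + arrayKeyFound());
        System.out.println("list key found: " + listKeyFound());
    }
}
```

If a composite key is ever needed, a List (or another type with content-based equals and hashCode) behaves the way most people expect.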
In Java 8, using streams:
String[] array=new String[]{"a","b","c","a"};
Map<String,Integer> map1=Arrays.stream(array).collect(Collectors.toMap(x->x,x->1,Integer::sum));
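An alternative to toMap is groupingBy combined with counting, which states the intent more directly. This is a sketch only: the class StreamCountSketch and the method count are hypothetical, and note that the counts come back as Long rather than Integer.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class StreamCountSketch {
    // Groups equal words together and counts the size of each group.
    static Map<String, Long> count(String[] words) {
        return Arrays.stream(words)
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.counting()));
    }

    public static void main(String[] args) {
        // Map iteration order is unspecified, so the print order may vary.
        System.out.println(count(new String[] {"a", "b", "c", "a"}));
    }
}
```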
Code:
public class duplicate
{
public static void main(String[] args)throws IOException
{
System.out.println("Enter words separated by spaces ('.' to quit):");
Set<String> s = new HashSet<String>();
Scanner input = new Scanner(System.in);
while (true)
{
String token = input.next();
if (".".equals(token))
break;
if (!s.add(token))
System.out.println("Duplicate detected: " + token);
}
System.out.println(s.size() + " distinct words:\n" + s);
Set<String> duplicatesnum = new HashSet<String>();
String token = input.next();
if (!s.add(token))
{
duplicatesnum.add(token);
System.out.println("Duplicate detected: " + token);
}
System.out.println(duplicatesnum.size());
}
}
the output is:
Enter words separated by spaces ('.' to quit):
one two one two .
Duplicate detected: one
Duplicate detected: two
2 distinct words:
[two, one]
I assume you want to know the number of different duplicate words. You can use another HashSet<String> for the duplicates.
//Outside the loop
Set<String> duplicates = new HashSet<String>();
//Inside the loop
if (!s.add(token))
{
duplicates.add(token);
System.out.println("Duplicate detected: " + token);
}
//Outside the loop
System.out.println(duplicates.size());
Also, if you care about the number of occurrences of each word, declare a HashMap<String, Integer>, as mentioned in other posts.
But if you want the total number of duplicate occurrences (not the number of distinct duplicated words), just declare a counter:
//Outside the loop
int duplicates = 0;
//Inside the loop
if (!s.add(token))
{
duplicates++;
System.out.println("Duplicate detected: " + token);
}
//Outside the loop
System.out.println(duplicates);
Instead of a HashSet, use a HashMap. A HashSet only stores the values. A HashMap maps a value to another value (see http://www.geekinterview.com/question_details/47545 for an explanation)
In your case, the key of the HashMap is your string (just as the key of the HashSet is the string). The value in the HashMap is the number of times you encountered this string.
When you find a new string, add it to the HashMap, and set the value of the entry to one.
When you encounter the same string later, increment the value in the HashMap.
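A minimal sketch of this approach, assuming the convention that a word's count starts at 1 the first time it is seen. The class DuplicateCountSketch and its method names are hypothetical. Both readings of "number of duplicates" are derived from the same map.

```java
import java.util.HashMap;
import java.util.Map;

class DuplicateCountSketch {
    // Maps each token to the number of times it was encountered.
    static Map<String, Integer> countOccurrences(String[] tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    // Number of different words that occur more than once.
    static int distinctDuplicates(Map<String, Integer> counts) {
        int result = 0;
        for (int count : counts.values()) {
            if (count > 1) {
                result++;
            }
        }
        return result;
    }

    // Number of occurrences beyond the first one, summed over all words.
    static int repeatedOccurrences(Map<String, Integer> counts) {
        int result = 0;
        for (int count : counts.values()) {
            result += count - 1;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                countOccurrences("aaa bbb aaa ccc aaa bbb".split(" "));
        System.out.println(distinctDuplicates(counts) + " duplicated words, "
                + repeatedOccurrences(counts) + " repeated occurrences");
    }
}
```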
Because you are using a HashSet, you will not know how many duplicates you have. If you went with a HashMap<String, Integer>, you could increment the count whenever get(key) returned a non-null value.
In the if (!s.add(token)) branch, you can increment a counter and then display its value at the end.
Your question is a bit misleading. Some people understand that you want:
Input: hello man, hello woman, say good by to your man.
Output:
Found duplicate: Hello
Found duplicate: Man
Duplicate count: 2
Others understood you wanted:
Input: hello man, hello woman, say hello to your man.
Output:
Found duplicate: Hello - 3 appearances
Found duplicate: Man - 2 appearances
Assuming you want the 1st option - go with Petar Minchev's solution
Assuming you want the 2nd option - go with Patrick's solution. Don't forget that when you use an Integer in a Map, you can get/put an int as well, and Java will automatically box/unbox it for you. But if you rely on this, you can get NPEs when asking the map for a key that does not exist:
Map<String,Integer> myMap = new HashMap<String,Integer>();
int count = myMap.get("key that does not exist"); // NPE here <---
The NPE occurs because 'get' returns null, and unboxing that null into an int invokes intValue() on it.
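The unboxing pitfall and one way around it can be sketched as follows. The class UnboxingSketch and its method names are hypothetical; getOrDefault is a standard Map method since Java 8.

```java
import java.util.HashMap;
import java.util.Map;

class UnboxingSketch {
    // Demonstrates that unboxing the null returned for a missing key throws.
    static boolean unboxingMissingKeyThrows() {
        Map<String, Integer> myMap = new HashMap<>();
        try {
            int count = myMap.get("missing"); // get returns null, unboxing fails
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Falls back to 0 for keys that are not in the map, so nothing is unboxed from null.
    static int safeCount(Map<String, Integer> map, String key) {
        return map.getOrDefault(key, 0);
    }

    public static void main(String[] args) {
        System.out.println("unboxing throws: " + unboxingMissingKeyThrows());
        System.out.println("safe count: " + safeCount(new HashMap<>(), "missing"));
    }
}
```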
You can use the Google Collections library (now part of Guava):
Multiset<String> words = HashMultiset.create();
while (true) {
String token = input.next();
if (".".equals(token))
break;
if (!words.add(token))
System.out.println("Duplicate detected: " + token);
}
System.out.println(words.elementSet().size() + " distinct words:\n" + words.elementSet());
Collection<Entry<String>> duplicateWords = Collections2.filter(words.entrySet(), new Predicate<Entry<String>>() {
public boolean apply(Entry<String> entry) {
return entry.getCount() > 1;
}
});
System.out.println("There are " + duplicateWords.size() + " duplicate words.");
System.out.println("The duplicate words are: " + Joiner.on(", ").join(duplicateWords));
Example of output:
Enter words separated by spaces ('.' to quit):
aaa bbb aaa ccc aaa bbb .
3 distinct words:
[aaa, ccc, bbb]
There are 2 duplicate words.
The duplicate words are: aaa x 3, bbb x 2
In "Programming Pearls" I came across the following problem: "print words in order of decreasing frequency". As I understand it, the problem is this. Suppose there is a given string array, let's call it s (I chose the words randomly; it does not matter),
String s[]={"cat","cat","dog","fox","cat","fox","dog","cat","fox"};
We see that string "cat" occurs 4 times, "fox" 3 times and "dog" 2 times. So the desired result will be this:
cat
fox
dog
I have written the following code in Java:
import java.util.*;
public class string {
public static void main(String[] args){
String s[]={"fox","cat","cat","fox","dog","cat","fox","dog","cat"};
Arrays.sort(s);
int counts;
int count[]=new int[s.length];
for (int i=0;i<s.length-1;i++){
counts=1;
while (s[i].equals(s[i+1])){
counts++;
}
count[i]=counts;
}
}
}
I have sorted the array and created a count array where I write the number of occurrences of each word in array.
My problem is that the indices of the count array and the string array do not line up. How can I print the words ordered by the largest counts in the integer array?
To keep track of the count of each word, I would use a Map which maps a word to its current count.
String s[]={"cat","cat","dog","fox","cat","fox","dog","cat","fox"};
Map<String, Integer> counts = new HashMap<String, Integer>();
for (String word : s) {
if (!counts.containsKey(word))
counts.put(word, 0);
counts.put(word, counts.get(word) + 1);
}
To print the result, go through the keys in the map and get the final value.
for (String word : counts.keySet())
System.out.println(word + ": " + (float) counts.get(word) / s.length);
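The loop above prints relative frequencies in the map's arbitrary order. To get the words in order of decreasing frequency, as the original problem asks, the entries can be sorted by count. This is a minimal sketch: the class FrequencyOrderSketch and the method byDecreasingFrequency are hypothetical, and breaking ties alphabetically is an assumption the problem does not specify.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FrequencyOrderSketch {
    // Returns the distinct words, most frequent first; ties sorted alphabetically.
    static List<String> byDecreasingFrequency(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> ordered = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            ordered.add(entry.getKey());
        }
        return ordered;
    }

    public static void main(String[] args) {
        String[] s = {"cat", "cat", "dog", "fox", "cat", "fox", "dog", "cat", "fox"};
        for (String word : byDecreasingFrequency(s)) {
            System.out.println(word);
        }
    }
}
```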