comparing Hashmaps by different String Keys

I have two HashMaps and want to compare them as fast as possible. The problem is that each key in mapA consists of two words joined by a space, while each key in mapB is a single word.
I don't want to count the occurrences, that is already done; I want to compare the two different kinds of keys.
mapA:
key: hello world, value: 10
key: earth hi, value: 20
mapB:
key: hello, value: 5
key: world, value: 15
key: earth, value: 25
key: hi, value: 35
The first key of mapA should find the keys "hello" and "world" in mapB.
What I am trying to do is parse a long text to find co-occurrences and compute a value for how often they occur relative to all words.
My first try:
for(String entry : mapA.keySet())
{
    String key = (String) entry;
    Integer mapAvalue = (Integer) mapA.get(entry);
    Integer tokenVal1=0, tokenVal2=0;
    String token1=key.substring(0, key.indexOf(" "));
    String token2=key.substring(key.indexOf(" "),key.length()).trim();
    for( String mapBentry : mapb.keySet())
    {
        String tokenkey = mapBentry;
        if(tokenkey.equals(token1)){
            tokenVal1=(Integer)tokens.get(mapBentry);
        }
        if(tokenkey.equals(token2)){
            tokenVal2=(Integer)tokens.get(mapBentry);
        }
        if(token1!=null && token2!=null && tokenVal1>1000 && tokenVal2>1000 ){
            procedurecall(mapAvalue, token1, token2, tokenVal1, tokenVal2);
        }
    }
}

You shouldn't iterate over a HashMap (O(n)) if you are just trying to find a particular key; that is what the HashMap lookup (O(1) on average) is for. So eliminate your inner loop.
You can also eliminate a few unnecessary variables in your code (e.g. key, tokenkey). You also don't need a third tokens map; you can put the token values in mapb.
for(String entry : mapA.keySet())
{
    Integer mapAvalue = (Integer) mapA.get(entry);
    String token1=entry.substring(0, entry.indexOf(" "));
    String token2=entry.substring(entry.indexOf(" "),entry.length()).trim();
    if(mapb.containsKey(token1) && mapb.containsKey(token2))
    {
        // look up the tokens:
        Integer tokenVal1=(Integer)mapb.get(token1);
        Integer tokenVal2=(Integer)mapb.get(token2);
        if(tokenVal1>1000 && tokenVal2>1000)
        {
            procedurecall(mapAvalue, token1, token2, tokenVal1, tokenVal2);
        }
    }
}
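For reference, here is a minimal self-contained sketch of the same lookup idea using generics, so no casts are needed. The class name PairLookup, the method frequentPairs and the minCount parameter are my own placeholders (the question uses a fixed threshold of 1000 and calls procedurecall instead of collecting strings), so treat it as an illustration rather than a drop-in replacement.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PairLookup {
    // Hypothetical helper: collects every two-word key of pairCounts whose
    // two tokens both occur more than minCount times according to tokenCounts.
    static List<String> frequentPairs(Map<String, Integer> pairCounts,
                                      Map<String, Integer> tokenCounts,
                                      int minCount) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> pair : pairCounts.entrySet()) {
            String[] tokens = pair.getKey().split(" ", 2);
            if (tokens.length < 2) {
                continue; // not a two-word key
            }
            // Direct hash lookups instead of scanning every key of the second map.
            Integer first = tokenCounts.get(tokens[0]);
            Integer second = tokenCounts.get(tokens[1].trim());
            if (first != null && second != null && first > minCount && second > minCount) {
                result.add(pair.getKey() + "=" + pair.getValue()
                        + " (" + first + ", " + second + ")");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> mapA = new HashMap<>();
        mapA.put("hello world", 10);
        mapA.put("earth hi", 20);

        Map<String, Integer> mapB = new HashMap<>();
        mapB.put("hello", 5);
        mapB.put("world", 15);
        mapB.put("earth", 25);
        mapB.put("hi", 35);

        // With a threshold of 10 only "earth hi" qualifies, because "hello" occurs 5 times.
        System.out.println(frequentPairs(mapA, mapB, 10)); // [earth hi=20 (25, 35)]
    }
}
```

Iterating over entrySet() also avoids the extra get() on the first map for every key.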

Related

Incrementing HashMap values for key frequency doesn't print

I am currently working on a project to count the frequency of words in a text file. The driver program places the words into an ArrayList (after making them lowercase and removing whitespace), and then the FreqCount object places the ArrayList into a HashMap that will handle the frequency operations. So far, I can get the driver to read the text file, put it into an ArrayList, and then put that into the HashMap. My issue is that the HashMap nodes do not repeat, so I am trying to increment the value each time the word is seen.
Driver:
package threetenProg3;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.ArrayList;
public class Driver {
    public static void main(String[] args) throws FileNotFoundException{
        File in = new File("test.txt");
        Scanner scanFile = new Scanner(in);
        ArrayList<String> parsed = new ArrayList<String>();
        while(scanFile.hasNext()) { //if this ends up cutting off bottom line, make it a do while loop
            parsed.add(scanFile.next().toLowerCase());
        }
        for(int i = parsed.size()-1; i>=0; i--) { //prints arraylist backwards
            System.out.println(parsed.get(i));
        } //*/
        FreqCount fc = new FreqCount(parsed);
        System.out.println("\n Hashmap: \n");
        fc.printMap();
        scanFile.close();
    }
}
FreqCount:
package threetenProg3;
import java.util.HashMap;
import java.util.List;
public class FreqCount {
    //attributes and initializations
    private HashMap<String, Integer> map = new HashMap<String, Integer>();
    //constructors
    FreqCount(List<String> driverList){
        for(int dLIndex = driverList.size()-1; dLIndex>=0; dLIndex--) { //puts list into hashmap
            for(String mapKey : map.keySet()) {
                if(mapKey.equals(driverList.get(dLIndex))) {
                    int tval = map.get(mapKey);
                    map.remove(mapKey);
                    map.put(mapKey, tval+1);
                }else {
                    map.put(mapKey, 1);
                }
            }
        }
    }
    //methods
    public void printMap() {
        for (String i : map.keySet()) { //function ripped straight outta w3schools lol
            System.out.println("key: " + i + " value: " + map.get(i));
        }
    } //*/
}
Text file:
ONE TWO ThReE FoUR fIve
six seven
EIGHT
NINE
TEN ELEVEN
ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE, ONE, ONE,
Output:
one,
one,
one,
one
one
one
one
one
one
one
one
one
one
one
one
one
one
eleven
ten
nine
eight
seven
six
five
four
three
two
one
Hashmap:
key: nine value: 1
key: one, value: 1
key: six value: 1
key: four value: 1
key: one value: 1
key: seven value: 1
key: eleven value: 1
key: ten value: 1
key: five value: 1
key: three value: 1
key: two value: 1
key: eight value: 1
From what I see, the output should be printing the correct values for the keys' frequencies. Thanks in advance for any help!

You can change the constructor of FreqCount as follows:
FreqCount(List<String> driverList) {
    for (int dLIndex = driverList.size() - 1; dLIndex >= 0; dLIndex--) {
        String key = driverList.get(dLIndex);
        if (map.get(key) == null) {
            map.put(key, 1);
        } else {
            map.put(key, map.get(key) + 1);
        }
    }
}
Output after this change:
Hashmap:
key: nine value: 1
key: one, value: 3
key: six value: 1
key: four value: 1
key: one value: 15
key: seven value: 1
key: eleven value: 1
key: ten value: 1
key: five value: 1
key: three value: 1
key: two value: 1
key: eight value: 1
Alternatively,
FreqCount(List<String> driverList) {
    for (int dLIndex = driverList.size() - 1; dLIndex >= 0; dLIndex--) {
        String key = driverList.get(dLIndex);
        map.put(key, map.getOrDefault(key, 0) + 1);
    }
}
Map#getOrDefault returns the value to which the specified key is mapped, or the given default value if this map contains no mapping for the key.

The easiest way to do this, in my opinion, is to use the Map.merge method. If the key has no value yet, merge stores the supplied value (here 1). Otherwise it calls the remapping function with the existing value and the supplied value, and stores the result. In this case the second argument is unused: the function returns the existing value + 1, so each word ends up mapped to its number of occurrences.
Also notice that I changed the class to use a parse method which returns the map. It is generally better not to do much computation in a class constructor.
After reading in the values:
FreqCount fc = new FreqCount();
Map<String,Integer> map = fc.parse(parsed);
map.entrySet().forEach(System.out::println);
prints
nine=1
one,=3
six=1
four=1
one=15
seven=1
eleven=1
ten=1
five=1
three=1
two=1
eight=1
The modified class
import java.util.HashMap;
import java.util.List;
import java.util.Map;
class FreqCount {
    // attributes and initializations
    private Map<String, Integer> map = new HashMap<>();
    public Map<String, Integer> parse(List<String> driverList) {
        for (String str : driverList) {
            // absent key: store 1; present key: store existing value + 1
            map.merge(str, 1, (v1, notUsed) -> v1 + 1);
        }
        return map;
    }
}
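A slightly more common spelling of the same idea passes Integer::sum as the remapping function, which adds the existing count and the supplied 1. The sketch below is only an illustration; the class name MergeCount and the sample words are made up for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MergeCount {
    // Counts how often each string occurs in the list.
    static Map<String, Integer> count(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // Absent key: stores 1. Present key: stores oldValue + 1.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(Arrays.asList("one", "two", "one", "one,"));
        System.out.println(counts.get("one"));  // 2
        System.out.println(counts.get("one,")); // 1, the trailing comma makes it a different key
    }
}
```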

You have several issues there.
FreqCount(List<String> driverList){
    for(int dLIndex = driverList.size()-1; dLIndex>=0; dLIndex--) { //puts list into hashmap
        if(map.get(driverList.get(dLIndex)) != null) {
            int tval = map.get(driverList.get(dLIndex));
            map.remove(driverList.get(dLIndex));
            map.put(driverList.get(dLIndex), tval+1);
        }else {
            map.put(driverList.get(dLIndex), 1);
        }
    }
}
Your inner for loop iterates over a Map that is still empty, so its body never runs and nothing is ever inserted. You need to check whether the map already has a key for that word: if it does, add one to the value; if it does not, add a new pair with value 1.
To avoid counting words with commas or periods separately, strip those characters first (you can add other characters to the regex that replaceAll takes as a parameter):
while(scanFile.hasNext()) { //if this ends up cutting off bottom line, make it a do while loop
    String value = scanFile.next().toLowerCase();
    value = value.replaceAll("[,.]", "");
    parsed.add(value);
}
The output now is
one
one
one
one
one
one
one
one
one
one
one
one
one
one
one
one
one
eleven
ten
nine
eight
seven
six
five
four
three
two
one
Hashmap:
key: nine value: 1
key: six value: 1
key: four value: 1
key: one value: 18
key: seven value: 1
key: eleven value: 1
key: ten value: 1
key: five value: 1
key: three value: 1
key: two value: 1
key: eight value: 1
with no repeated words, even when they are followed by commas, and the correct count for each one.

Creating an anagram dictionary

I have to create an anagram dictionary using a hashtable. I take in a word from the user and have to output all the anagrams from that word from my anagram dictionary.
This is my current program. I'm creating a hash function which calculates a hash for each word; words that are anagrams of each other will have the same hash and be put in the same slot in the hashtable.
The part I'm having difficulty with is this: when I create this map and apply my hash function to a user-supplied word to get the index into the hashtable, how can I return all the values at that index?
This is my code so far:
fis = new FileInputStream(file);
BufferedReader br = new BufferedReader(new InputStreamReader(fis));
System.out.println("Total file size to read (in bytes) : " + fis.available());
String content = new String();
while ((content = br.readLine()) != null) {
    singleAddress.add(content);
}
for(int i = 0; i<singleAddress.size(); i++)
{
    char[] chars = singleAddress.get(i).toCharArray();
    Arrays.sort(chars);
    int hash = 0;
    for(int j = 0; j<chars.length; j++)
    {
        hash = 2*hash + (int)chars[j];
    }
    numbers.put(singleAddress.get(i), hash);
    System.out.println(hash + " " + i);
}
I believe this will create the anagram dictionary in the hashtable, but I'm not sure how to return all the values at a given index.

I'd use a Map<String, List<String>> (or better, a Google Guava Multimap<String, String>) and then apply your logic:
1. make a lower case version of your word
2. sort the characters of the lower case version to form a key
3. use that key to put the word into the map
When the user provides input, you repeat steps 1 and 2 but use get(key) in step 3, and voilà, you have your list of anagrams.
Example:
Word = Anna -> key = aann
User input = nana -> key = aann
Then you do dictionary.get("aann") and should get the list containing the element "Anna".
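Put together, the approach could look roughly like the sketch below. It uses only the JDK (a plain HashMap of lists rather than Guava), and the names AnagramDictionary, keyOf, add and anagramsOf are placeholders I picked for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class AnagramDictionary {
    // key: sorted lower-case letters, value: all dictionary words with those letters
    private final Map<String, List<String>> wordsByKey = new HashMap<>();

    // Steps 1 and 2: lower-case the word, then sort its characters.
    static String keyOf(String word) {
        char[] chars = word.toLowerCase(Locale.ROOT).toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Step 3: store the word under its key, creating the list on first use.
    void add(String word) {
        wordsByKey.computeIfAbsent(keyOf(word), k -> new ArrayList<>()).add(word);
    }

    // Lookup: same key computation, then a single get on the map.
    List<String> anagramsOf(String input) {
        return wordsByKey.getOrDefault(keyOf(input), Collections.emptyList());
    }

    public static void main(String[] args) {
        AnagramDictionary dictionary = new AnagramDictionary();
        for (String word : Arrays.asList("Anna", "listen", "silent")) {
            dictionary.add(word);
        }
        System.out.println(dictionary.anagramsOf("nana"));   // [Anna]
        System.out.println(dictionary.anagramsOf("tinsel")); // [listen, silent]
    }
}
```

Because the sorted string itself is the key, the map's own hashCode()/equals() handling takes care of collisions, and no hand-written hash function is needed.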
Edit: Issues with your code
You don't show the declaration of singleAddress and numbers, but I assume they are a List<String> (since you call get(i) on it) and a Map<String, Integer>.
In numbers the key is the word and the value is the hash. You'd have to iterate over all entries in that map in order to retrieve all words with the same hash. Better swap it around.
The hash function might result in collisions, i.e. the same hash value for non-anagrams. As an example take "ad" and "bb": with 'a' = 97, 'b' = 98 and 'd' = 100, the hash for "ad" would be 2 * 97 + 100 = 294 and for "bb" it would be 2 * 98 + 98 = 294. That's why hash sets and maps in Java always use hashCode() and equals(): hashCode() is used to get the bucket, which is a list in the map, while equals() is then used to check whether the keys are actually the same.
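If you want to check the collision risk yourself, the following sketch reimplements the hash from the question (sort the characters, then hash = 2 * hash + c) as a standalone method. The class and method names are mine, and the word pair "ad"/"bb" is just one colliding example I worked out by hand.

```java
import java.util.Arrays;

class HashCollisionDemo {
    // Same computation as in the question: sort the characters, then fold them
    // into an int with hash = 2 * hash + character code.
    static int questionHash(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        int hash = 0;
        for (char c : chars) {
            hash = 2 * hash + c;
        }
        return hash;
    }

    public static void main(String[] args) {
        // Anagrams always agree, because sorting makes their character arrays identical.
        System.out.println(questionHash("ab") == questionHash("ba")); // true
        // But non-anagrams can agree as well: 2 * 97 + 100 == 2 * 98 + 98 == 294.
        System.out.println(questionHash("ad")); // 294
        System.out.println(questionHash("bb")); // 294
    }
}
```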

I would use a
Map<String, Collection<String>>
whose key would be the sorted string for a particular word and whose value would be a collection of all the words in your dictionary that can be made from that key.
For example:
Key: EILNST
Value: [ELINTS, ENLIST, INLETS, LISTEN, SILENT, TINSEL]
So, in case you want to search for the word "Listen", sort its letters (ignoring case) to get the key and look it up; you will get all its anagrams, and you have to exclude the word itself from the list retrieved.
Refer to solution:
Best algorithm to find anagram of word from dictonary

Distinctive number from hashmap

I'm new to Java and I need to know how I can calculate the number of distinct words in a HashMap.
I got tweets and stored them in an array of strings like this:
String [] words = {i, to , go , eat,know , i ,let , let , figure , eat};
HashMap <String,Integer> set=new HashMap();
for (String w:words)
{
int freq=set.get(w);
if (freq==null)
{
set.put(w1,1)
}
else
set.put(w1,freq+1)
}
Let's suppose that the HashMap now has all the words that I need.
Now how can I calculate the total number of distinct words?
I can see that those are the words that have value = 1 in the HashMap, right?
I tried to check
if (set.containsvalue(1))
int dist +=set.size();
but it didn't work!

int dist = 0;
for (int i : set.values())
    if (i == 1)
        ++dist;

Before you put a word into set, you should check if the key exists or not. If the key exists, then you should increase the value.

The following segment of your code is wrong:
int freq=set.get(w);
if (freq==null)
{
set.put(w1,1)
}
freq is declared as an int, so the null check does not compile: null checks apply only to references. Even if it compiled, unboxing the null that set.get(w) returns for a missing key into an int would throw a NullPointerException.
Also, I think there is a typo where you mix up w and w1.
The correct code is:
String[] words = {"i", "to", "go", "eat", "know", "i", "let", "let", "figure", "eat"};
Map<String, Integer> set = new HashMap<>();
for (String w : words)
{
    if (set.get(w) == null)
    {
        set.put(w, 1);
    }
    else
        set.put(w, set.get(w) + 1);
}
Now if you iterate over the map to check the keys for which the value is 1, you will have your distinct words.

You just have to iterate over the whole map and count the keys with freq == 1:
int unique = 0;
for(String word : set.keySet()) {
    int freq = set.get(word);
    if(freq == 1) {
        unique++;
    }
}
System.out.println(unique);
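Note that "distinct" can mean two different things here: the number of different words is simply the size of the map, while the loop above counts the words that occur exactly once. Below is a small sketch that shows both, using the word list from the question; the class and method names are my own.

```java
import java.util.HashMap;
import java.util.Map;

class DistinctWords {
    // Builds the word -> frequency map.
    static Map<String, Integer> frequencies(String[] words) {
        Map<String, Integer> freq = new HashMap<>();
        for (String w : words) {
            freq.merge(w, 1, Integer::sum);
        }
        return freq;
    }

    // Number of different words: every key is stored once, so this is just size().
    static int distinctCount(Map<String, Integer> freq) {
        return freq.size();
    }

    // Number of words that occur exactly once.
    static int onceOnlyCount(Map<String, Integer> freq) {
        int count = 0;
        for (int value : freq.values()) {
            if (value == 1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] words = {"i", "to", "go", "eat", "know", "i", "let", "let", "figure", "eat"};
        Map<String, Integer> freq = frequencies(words);
        System.out.println(distinctCount(freq)); // 7 (i, to, go, eat, know, let, figure)
        System.out.println(onceOnlyCount(freq)); // 4 (to, go, know, figure)
    }
}
```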

How can I use a string array as key in hash map?

I've made a String array out of a .txt file and now want to make a HashMap with these strings as keys. I don't want the whole array to be one key for one value; I want each element to be its own key in the HashMap.
private static String[] readAndConvertInputFile() {
    String str = StdIn.readAll();
    String conv = str.replaceAll("\'s", "").replaceAll("[;,?.:*/\\-_()\"\'\n]", " ").replaceAll(" {2,}", " ").toLowerCase();
    return conv.split(" ");
}
So the information in the string is like ("word", "thing", "etc.", "pp.", "thing").
My value should be the frequency of the word in the text. So for example key: "word" value: 1, key: "thing" value: 2 and so on... I'm clueless and would be grateful if someone could help me, at least with the key. :)

You can create a Map using the String value at each array index as the key, and an Integer as the value to keep track of how many times a word appeared.
Map<String, Integer> occurences = new HashMap<String, Integer>();
Then when you want to increment, you can check if the Map already contains the key. If it does, increase the value by 1; otherwise, set it to 1.
if (occurences.containsKey(word)) {
    occurences.put(word, occurences.get(word) + 1);
} else {
    occurences.put(word, 1);
}
So, while you are looping over your string array, convert the String to lower case (if you want to ignore case for word occurrences), and increment the map using the if statement above.
for (String word : words) {
    word = word.toLowerCase(); // remove if you want case sensitivity
    if (occurences.containsKey(word)) {
        occurences.put(word, occurences.get(word) + 1);
    } else {
        occurences.put(word, 1);
    }
}
A full example is shown below. I converted the words to lowercase to ignore case when using them as keys in the map; if you want to keep case, remove the line where I convert to lowercase.
public static void main(String[] args) {
    String s = "This this the has dog cat fish the cat horse";
    String[] words = s.split(" ");
    Map<String, Integer> occurences = new HashMap<String, Integer>();
    for (String word : words) {
        word = word.toLowerCase(); // remove if you want case sensitivity
        if (occurences.containsKey(word)) {
            occurences.put(word, occurences.get(word) + 1);
        } else {
            occurences.put(word, 1);
        }
    }
    for (Map.Entry<String, Integer> en : occurences.entrySet()) {
        System.out.println("Word \"" + en.getKey() + "\" appeared " + en.getValue() + " times.");
    }
}
Which will give me output:
Word "cat" appeared 2 times.
Word "fish" appeared 1 times.
Word "horse" appeared 1 times.
Word "the" appeared 2 times.
Word "dog" appeared 1 times.
Word "this" appeared 2 times.
Word "has" appeared 1 times.

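Yes, you can use an array (regardless of element type) as a HashMap key.
No, you shouldn't do so. The behavior is unlikely to be what you want (in general), because arrays inherit identity-based equals() and hashCode() from Object, so two arrays with the same contents count as different keys.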
In your particular case, I don't see why you even propose using an array as a key in the first place. You seem to want Strings drawn from among your array elements as keys.
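To illustrate why array keys rarely do what you expect, here is a small sketch comparing an array key with a List key. The class and method names are made up for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ArrayKeyDemo {
    // Arrays use Object's identity-based equals(), so an equal-looking
    // but separately created array does not match the stored key.
    static boolean foundWithArrayKey() {
        Map<String[], Integer> byArray = new HashMap<>();
        byArray.put(new String[] {"word", "thing"}, 1);
        return byArray.containsKey(new String[] {"word", "thing"});
    }

    // Lists compare element by element, so an equal list does match.
    static boolean foundWithListKey() {
        Map<List<String>, Integer> byList = new HashMap<>();
        byList.put(Arrays.asList("word", "thing"), 1);
        return byList.containsKey(Arrays.asList("word", "thing"));
    }

    public static void main(String[] args) {
        System.out.println(foundWithArrayKey()); // false
        System.out.println(foundWithListKey());  // true
    }
}
```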
You could construct a word frequency table like so:
Map<String, Integer> computeFrequencies(String[] words) {
    Map<String, Integer> frequencies = new HashMap<String, Integer>();
    for (String word: words) {
        Integer wordFrequency = frequencies.get(word);
        frequencies.put(word,
                (wordFrequency == null) ? 1 : (wordFrequency + 1));
    }
    return frequencies;
}

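In Java 8, using streams:
String[] array = new String[]{"a", "b", "c", "a"};
// The merge function receives the two values for a repeated key (not the key itself),
// so Integer::sum adds them up; "value + 1" would never count past 2.
Map<String, Integer> map1 = Arrays.stream(array).collect(Collectors.toMap(x -> x, x -> 1, Integer::sum));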
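Another stream-based option is Collectors.groupingBy combined with Collectors.counting, which avoids writing a merge function at all. Note that it yields Long counts rather than Integer. This is only a sketch; the class name StreamCount is a placeholder.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class StreamCount {
    // Groups equal strings together and counts the size of each group.
    static Map<String, Long> count(String[] words) {
        return Arrays.stream(words)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = count(new String[] {"a", "b", "c", "a", "a"});
        System.out.println(counts.get("a")); // 3
        System.out.println(counts.get("b")); // 1
    }
}
```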

how to find duplicate and unique string entries using Hashtable

Assume I'm taking input a string from command line and I want to find the duplicate and unique entries in the string by using Hashtable.
eg:
i/p:
hi hello bye hi good hello name hi day hi
o/p:
Unique elements are: bye, good, name, day
Duplicate elements are:
hi 4 times
hello 2 times

You can break the input apart by calling split(" ") on the input String. This will return a String[] containing the individual words. Iterate over this array, and use each String as the key into your Hashtable, with the value being an Integer. Each time you encounter a word, increment its value, starting from 0 if no value is currently there.
Hashtable<String, Integer> hashtable = new Hashtable<String, Integer>();
String[] splitInput = input.split(" ");
for(String inputToken : splitInput) {
    Integer val = hashtable.get(inputToken);
    if(val == null) {
        val = 0; // first occurrence: start from 0, incremented to 1 below
    }
    ++val;
    hashtable.put(inputToken, val);
}
Also, you may want to look into HashMap rather than Hashtable. HashMap is not thread safe, but is faster. Hashtable is synchronized and therefore a bit slower, but is thread safe. If you are trying to do this in a single thread, I would recommend HashMap.
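Once the counts exist, splitting them into unique and duplicate entries is a single pass over the map. The sketch below uses a LinkedHashMap instead of a Hashtable, purely so that the words come out in input order; that swap, and all the names in it, are my own choices and not part of the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class UniqueAndDuplicate {
    // Counts each whitespace-separated token, remembering first-seen order.
    static Map<String, Integer> countWords(String input) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : input.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Words that occur exactly once.
    static List<String> unique(Map<String, Integer> counts) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() == 1) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Words that occur more than once, with their counts.
    static Map<String, Integer> duplicates(Map<String, Integer> counts) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > 1) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("hi hello bye hi good hello name hi day hi");
        System.out.println("Unique elements are: " + String.join(", ", unique(counts)));
        System.out.println("Duplicate elements are:");
        for (Map.Entry<String, Integer> e : duplicates(counts).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue() + " times");
        }
    }
}
```

For the sample input this reports bye, good, name and day as unique, and hi (4 times) and hello (2 times) as duplicates.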

Use a hashtable with string as key and a numeric type as counter.
Go through all the words and if they are not in the map, insert them; otherwise increase the count (the data part of the hashtable).
hth
Mario

You can convert each string into an integer and then use the generated integer as the hash value. To convert a string to an int, you can treat it as a base-256 number and convert it.
