I've made a String array out of a .txt file and now want to build a HashMap with these strings as keys. I don't want the whole array as one key mapped to one value; I want each word to be its own key in the HashMap.
private static String[] readAndConvertInputFile() {
String str = StdIn.readAll();
String conv = str.replaceAll("\'s", "").replaceAll("[;,?.:*/\\-_()\"\'\n]", " ").replaceAll(" {2,}", " ").toLowerCase();
return conv.split(" "); }
So the information in the string is like ("word", "thing", "etc.", "pp.", "thing").
My value should be the frequency of the word in the text. So for example key: "word" value: 1, key: "thing" value: 2 and so on... I'm clueless and would be grateful if someone could help me, at least with the key. :)
You can create a Map that uses the String at each array index as the key and an Integer as the value, to keep track of how many times each word appeared.
Map<String,Integer> occurences = new HashMap<String,Integer>();
Then, when you want to increment, check whether the Map already contains the key: if it does, increase its value by 1; otherwise, set it to 1.
if (occurences.containsKey(word)) {
occurences.put(word, occurences.get(word) + 1);
} else {
occurences.put(word, 1);
}
So, while you are looping over your string array, convert the String to lower case (if you want to ignore case for word occurrences), and increment the map using the if statement above.
for (String word : words) {
word = word.toLowerCase(); // remove if you want case sensitivity
if (occurences.containsKey(word)) {
occurences.put(word, occurences.get(word) + 1);
} else {
occurences.put(word, 1);
}
}
A full example is shown below. I converted the words to lowercase to ignore case when using them as keys in the map; if you want to keep case, remove the line where I convert to lowercase.
import java.util.HashMap;
import java.util.Map;
import java.util.Map.Entry;
public class WordCount {
    public static void main(String[] args) {
        String s = "This this the has dog cat fish the cat horse";
        String[] words = s.split(" ");
        Map<String, Integer> occurences = new HashMap<String, Integer>();
        for (String word : words) {
            word = word.toLowerCase(); // remove if you want case sensitivity
            if (occurences.containsKey(word)) {
                occurences.put(word, occurences.get(word) + 1);
            } else {
                occurences.put(word, 1);
            }
        }
        for (Entry<String, Integer> en : occurences.entrySet()) {
            System.out.println("Word \"" + en.getKey() + "\" appeared " + en.getValue() + " times.");
        }
    }
}
Which will give me output:
Word "cat" appeared 2 times.
Word "fish" appeared 1 times.
Word "horse" appeared 1 times.
Word "the" appeared 2 times.
Word "dog" appeared 1 times.
Word "this" appeared 2 times.
Word "has" appeared 1 times.
Yes, you can use an array (regardless of element type) as a HashMap key.
No, you shouldn't do so. Arrays use identity-based equals() and hashCode(), so the behavior is unlikely to be what you want (in general).
In your particular case, I don't see why you even propose using an array as a key in the first place. You seem to want Strings drawn from among your array elements as keys.
You could construct a word frequency table like so:
Map<String, Integer> computeFrequencies(String[] words) {
Map<String, Integer> frequencies = new HashMap<String, Integer>();
for (String word: words) {
Integer wordFrequency = frequencies.get(word);
frequencies.put(word,
(wordFrequency == null) ? 1 : (wordFrequency + 1));
}
return frequencies;
}
In Java 8, using streams:
String[] array=new String[]{"a","b","c","a"};
Map<String,Integer> map1=Arrays.stream(array).collect(Collectors.toMap(x->x,x->1,Integer::sum)); // the merge function receives (oldValue, newValue), so the two counts must be added
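As an alternative sketch of the stream approach, Collectors.groupingBy combined with Collectors.counting groups equal words and counts each group. The class and method names here (StreamWordCount, countWords) are made up for illustration, and note that the counts come back as Long rather than Integer.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class StreamWordCount {
    // Groups equal words together and counts the size of each group.
    static Map<String, Long> countWords(String[] words) {
        return Arrays.stream(words)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        String[] array = {"a", "b", "c", "a"};
        // "a" maps to 2, "b" and "c" map to 1; the iteration order is not guaranteed.
        System.out.println(countWords(array));
    }
}
```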
Harold is a kidnapper who wrote a ransom note, but now he is worried it will be traced back to him through his handwriting. He found a magazine and wants to know if he can cut out whole words from it and use them to create an untraceable replica of his ransom note. The words in his note are case-sensitive and he must use only whole words available in the magazine. He cannot use substrings or concatenation to create the words he needs.
Given the words in the magazine and the words in the ransom note, print Yes if he can replicate his ransom note exactly using whole words from the magazine; otherwise, print No.
For example, the note is "Attack at dawn". The magazine contains only "attack at dawn". The magazine has all the right words, but there's a case mismatch. The answer is No.
Sample Input 0
6 4
give me one grand today night
give one grand today
Sample Output 0
Yes
Sample Input 1
6 5
two times three is not four
two times two is four
Sample Output 1
No
My code fails 5 of the 22 test cases :(
I can't figure out why 5 failed.
static void checkMagazine(String[] magazine, String[] note) {
int flag = 1;
Map<String, Integer> wordMap = new HashMap<>();
for(String word: magazine) {
if(!wordMap.containsKey(word)) {
wordMap.put(word, 1);
} else
wordMap.put(word,wordMap.get(word)+1);
}
for(String word: note){
if(!wordMap.containsKey(word)){
flag = 0;
break;
}
else wordMap.remove(word, wordMap.get(word));
}
if(flag == 0)
System.out.println("No");
else
System.out.println("Yes");
}
It's probably because instead of decrementing the count of the words in the magazine when you retrieve one, you're removing all counts of that word completely. Try this:
for(String word: note){
if(!(wordMap.containsKey(word) && wordMap.get(word) > 0)){
flag = 0;
break;
}
else wordMap.put(word, wordMap.get(word)-1);
}
wordMap is a frequency table and gives word counts.
However, for every word in the note, you must decrease the word count instead of removing the entry entirely. Only when the word count reaches 0 should the entry be removed.
Another issue is case sensitivity. Depending on the requirements, you may need to convert all words to lowercase.
else {
wordMap.computeIfPresent(word, (k, v) -> v <= 1? null : v - 1);
}
This decrements the old value v if it is above 1; otherwise it returns null, which tells the map to delete the entry.
The frequency counts can be done:
Map<String, Integer> wordMap = new HashMap<>();
for(String word: magazine) {
wordMap.merge(word, 1, Integer::sum);
}
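Putting the pieces of this answer together, a minimal sketch might look like the following. It assumes the case-sensitive rule from the problem statement; the names RansomNote and canReplicate are made up for illustration, and the method returns a boolean instead of printing so it is easier to test.

```java
import java.util.HashMap;
import java.util.Map;

public class RansomNote {
    // Returns true if every word of the note can be matched to a distinct magazine word.
    static boolean canReplicate(String[] magazine, String[] note) {
        Map<String, Integer> available = new HashMap<>();
        for (String word : magazine) {
            available.merge(word, 1, Integer::sum); // build the frequency table
        }
        for (String word : note) {
            Integer left = available.get(word);
            if (left == null || left == 0) {
                return false; // word is missing or already used up
            }
            available.put(word, left - 1); // use one copy, keep the rest
        }
        return true;
    }

    public static void main(String[] args) {
        String[] magazine = "two times three is not four".split(" ");
        String[] note = "two times two is four".split(" ");
        System.out.println(canReplicate(magazine, note) ? "Yes" : "No"); // prints "No"
    }
}
```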
I think this implementation is simpler:
static boolean checkMagazine(String[] magazine, String[] note) {
List<String> magazineCopy = new ArrayList<>(Arrays.asList(magazine));
for (String word : note)
{
if (magazineCopy.contains(word)) {
magazineCopy.remove(word);
continue;
}
return false;
}
return true;
}
I suppose your error is here:
else wordMap.remove(word, wordMap.get(word));
You are removing the word from the map instead of decreasing the number of such words. Only when that number reaches 0 should you remove the word from the map.
Python Solution
def checkMagazine(magazine, ransom):
    magazine.sort()
    ransom.sort()
    flag = True  # an empty note is trivially replicable
    for word in ransom:
        if word not in magazine:
            flag = False
            break
        else:
            magazine.remove(word)  # each magazine word can be used only once
    if flag:
        print("Yes")
    else:
        print("No")
I am trying to find and print the words in a string that occur more than once, and it almost works. I am, however, fighting with a small problem: the words are printed out twice since they occur twice in the sentence. I want them printed only once:
This is my code:
public class Main {
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
String sentence = "is this a sentence or is this not ";
String[] myStringArray = sentence.split(" "); //Split the sentence by space.
int[] count = new int[myStringArray.length];
for (int i = 0; i < myStringArray.length; i++){
for (int j = 0; j < myStringArray.length; j++){
if (myStringArray[i].matches(myStringArray[j]))
count[i]++;
//else break;
}
}
for (int i = 0; i < myStringArray.length; i++) {
if (count[i] > 1)
System.out.println("1b. - Tokens that occurs more than once: " + myStringArray[i] + "\n");
}
}
}
You can try for (int i = 0; i < myStringArray.length; i+=2) instead.
break on the first match, after incrementing. then it won't also increment the second match.
Your code has some problems with it.
If you notice, your code looks through the list of n elements n times, so it performs n^2 comparisons.
If a word occurs twice, the count reaches 2 at each of its two positions, so the word is printed twice.
What you need to keep track of is the set of words you have already seen, and check if a new word you encounter has already been seen or not.
If you had 3 occurrences of one word in your sentence, each of those tokens would have a count of 3. The 3 is redundant data that doesn't need to be stored for each token, but rather just once for the word.
All this can be done easily if you know how a Map works.
Here is an implementation that would work.
import java.util.HashMap;
import java.util.Map;
public class Main {
    public static void main(String[] args) {
        String sentence = "is this a sentence or is this not ";
        String[] myStringArray = sentence.split("\\s"); // Split the sentence on whitespace.
        Map<String, Integer> wordOccurrences = new HashMap<String, Integer>(myStringArray.length);
        for (String word : myStringArray)
            if (wordOccurrences.containsKey(word)) // HashMap has containsKey, not contains
                wordOccurrences.put(word, wordOccurrences.get(word) + 1);
            else wordOccurrences.put(word, 1);
        for (String word : wordOccurrences.keySet())
            if (wordOccurrences.get(word) > 1)
                System.out.println("1b. - Tokens that occurs more than once: " + word + "\n");
    }
}
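If only the repeated words are needed and not their counts, the "set of words you have already seen" idea can be sketched directly with two sets. This is an illustrative variant, not the answer's own code; the names RepeatedWords and findRepeated are made up.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class RepeatedWords {
    // Returns each word that occurs more than once, in the order it was first repeated.
    static Set<String> findRepeated(String sentence) {
        Set<String> seen = new HashSet<>();
        Set<String> repeated = new LinkedHashSet<>();
        for (String word : sentence.trim().split("\\s+")) {
            if (!seen.add(word)) {  // add returns false if the word was already present
                repeated.add(word); // a set, so each repeated word is stored only once
            }
        }
        return repeated;
    }

    public static void main(String[] args) {
        System.out.println(findRepeated("is this a sentence or is this not ")); // prints [is, this]
    }
}
```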
We want to find the repeating words from an input string. So, I suggest the following approach which is fairly simple:
Make a HashMap instance. The key (String) will be the word and the value (Integer) will be the frequency of its occurrence.
Split the string using the split("\\s") method to make an array of only words.
Introduce an Integer 'frequency' variable with initial value 0.
Iterate over the string array and check the frequency of each element (word): if the word is not yet in the map, add it with a frequency of 1; if the key (word) already exists, increment its frequency by 1.
So you are now left with each word and its frequency.
For example, if the input string is "We are getting dirty as this earth is getting polluted. We must stop it.", the map will be
{ ("We",2), ("are",1), ("getting",2), ("dirty",1), ("as",1), ("this",1), ("earth",1), ("is",1), ("polluted.",1), ("must",1), ("stop",1), ("it.",1) }
Now you know what is next step and how to use it. I agree with Kaushik.
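The steps above can be sketched as follows, using the example sentence from this answer. A LinkedHashMap is used here only so that the words come out in first-seen order; that choice and the names RepeatingWordsDemo and frequencies are assumptions of this sketch. Punctuation is left attached to the words, as in the map shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RepeatingWordsDemo {
    // Builds a word -> frequency map; LinkedHashMap keeps first-seen order.
    static Map<String, Integer> frequencies(String text) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String word : text.split("\\s+")) {
            Integer frequency = result.get(word);
            result.put(word, frequency == null ? 1 : frequency + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "We are getting dirty as this earth is getting polluted. We must stop it.";
        // Print only the words that repeat.
        for (Map.Entry<String, Integer> e : frequencies(input).entrySet()) {
            if (e.getValue() > 1) {
                System.out.println(e.getKey() + " occurs " + e.getValue() + " times");
            }
        }
    }
}
```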
How in Java can I get list of all characters appearing in string, with number of their appearances ? Let's say we have a string "I am really busy right now" so I should get :
i-2, a-2, r-2, m-1 and so on.
Just have a mapping of every character and their counts. You can get the character array of a String using String#toCharArray() and loop through it using the enhanced for loop. On every iteration, get the count from the mapping, set it if absent and then increment it with 1 and put back in map. Pretty straightforward.
Here's a basic kickoff example:
String string = "I am really busy right now";
Map<Character, Integer> characterCounts = new HashMap<Character, Integer>();
for (char character : string.toCharArray()) {
Integer characterCount = characterCounts.get(character);
if (characterCount == null) {
characterCount = 0;
}
characterCounts.put(character, characterCount + 1);
}
To learn more about maps, check the Sun tutorial on the subject.
You commented that it's "for a project", but it's however a typical homework question because it's pretty basic and covered in the first chapters of a decent Java book/tutorial. If you're new to Java, I suggest to get yourself through the Sun Trails Covering the Basics.
Is it homework? Without knowing it I'll assume a best-effort answer.
The logic behind your problem is to
go through the string one character at a time
count that character: if you only expect the 256 characters of an 8-bit encoding (that is, not full Unicode), you can keep an array of 256 ints and count them there. That way you don't need to search for the right counter; you just increment the element at the character's index.
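A minimal sketch of the array-based counting described above. It assumes that characters with a code of 256 or higher can simply be skipped, and it lowercases the input so that "I" and "i" are counted together; the names CharArrayCount and countChars are made up for illustration.

```java
public class CharArrayCount {
    // Counts characters whose code is below 256; anything else is ignored
    // (an assumption of this sketch).
    static int[] countChars(String text) {
        int[] counts = new int[256];
        for (char c : text.toCharArray()) {
            if (c < 256) {
                counts[c]++; // the character itself is the index of its counter
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countChars("I am really busy right now".toLowerCase());
        for (char c = 'a'; c <= 'z'; c++) {
            if (counts[c] > 0) {
                System.out.println(c + "-" + counts[c]);
            }
        }
    }
}
```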
I'm not sure of your exact needs but it seems you want to count occurrences regardless of the case, maybe also ignore characters such as whitespace, etc. So you might want something like this:
String initial = "I am really busy right now";
String cleaned = initial.replaceAll("\\s", "") //remove all whitespace characters
.toLowerCase(); // lower all characters
Map<Character, Integer> map = new HashMap<Character, Integer>();
for (char character : cleaned.toCharArray()) {
Integer count = map.get(character);
count = (count!=null) ? count + 1 : 1;
map.put(character, count);
}
for (Map.Entry<Character, Integer> entry : map.entrySet()) {
System.out.println(entry.getKey() + " : " + entry.getValue());
}
Tweak the regex to meet your exact requirements (to skip punctuation, etc).
String input = "AAZERTTYAATY";
char[] chars = input.toCharArray();
Map<Character, Integer> map = new HashMap<>();
for (char aChar : chars) {
Integer charCount = map.putIfAbsent(aChar, 1);
if (charCount != null) {
charCount++;
map.put(aChar, charCount);
}
}
Code:
public class duplicate
{
public static void main(String[] args)throws IOException
{
System.out.println("Enter words separated by spaces ('.' to quit):");
Set<String> s = new HashSet<String>();
Scanner input = new Scanner(System.in);
while (true)
{
String token = input.next();
if (".".equals(token))
break;
if (!s.add(token))
System.out.println("Duplicate detected: " + token);
}
System.out.println(s.size() + " distinct words:\n" + s);
Set<String> duplicatesnum = new HashSet<String>();
String token = input.next();
if (!s.add(token))
{
duplicatesnum.add(token);
System.out.println("Duplicate detected: " + token);
}
System.out.println(duplicatesnum.size());
}
}
the output is:
Enter words separated by spaces ('.' to quit):
one two one two .
Duplicate detected: one
Duplicate detected: two
2 distinct words:
[two, one]
I assume you want to know the number of different duplicate words. You can use another HashSet<String> for the duplicates.
//Outside the loop
Set<String> duplicates = new HashSet<String>();
//Inside the loop
if (!s.add(token))
{
duplicates.add(token);
System.out.println("Duplicate detected: " + token);
}
//Outside the loop
System.out.println(duplicates.size());
Also, if you care about the number of occurrences of each word, declare a HashMap<String, Integer>, as mentioned in other posts.
But if you want the total number of duplicate occurrences (not just distinct duplicate words), just declare a counter:
//Outside the loop
int duplicates = 0;
//Inside the loop
if (!s.add(token))
{
duplicates++;
System.out.println("Duplicate detected: " + token);
}
//Outside the loop
System.out.println(duplicates);
Instead of a HashSet, use a HashMap. A HashSet only stores the values. A HashMap maps a value to another value (see http://www.geekinterview.com/question_details/47545 for an explanation)
In your case, the key of the HashMap is your string (just as the key of the HashSet is the string). The value in the HashMap is the number of times you encountered this string.
When you find a new string, add it to the HashMap, and set the value of the entry to one.
When you encounter the same string later, increment the value in the HashMap.
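A rough sketch of this HashMap approach, where the stored value is the total number of times a token was encountered. To keep it testable, it takes the tokens as an array instead of reading them from a Scanner; the names DuplicateCounter, countTokens and distinctDuplicates are made up for illustration, and getOrDefault requires Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

public class DuplicateCounter {
    // Maps each token to the number of times it was encountered.
    static Map<String, Integer> countTokens(String[] tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            counts.put(token, counts.getOrDefault(token, 0) + 1);
        }
        return counts;
    }

    // Number of distinct tokens that occurred more than once.
    static int distinctDuplicates(Map<String, Integer> counts) {
        int duplicates = 0;
        for (int count : counts.values()) {
            if (count > 1) {
                duplicates++;
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countTokens("one two one two".split(" "));
        System.out.println(distinctDuplicates(counts)); // prints 2
    }
}
```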
Because you are using a HashSet, you will not know how many duplicates you have. If you went with a HashMap<String, Integer>, you could increment the count whenever the value stored for your key was != null.
Inside the if (!s.add(token)) block, you can increment a counter and then display its value at the end.
Your question is a bit misleading. Some people understand that you want:
Input: hello man, hello woman, say good by to your man.
Output:
Found duplicate: Hello
Found duplicate: Man
Duplicate count: 2
Others understood you wanted:
Input: hello man, hello woman, say hello to your man.
Output:
Found duplicate: Hello - 3 appearances
Found duplicate: Man - 2 appearances
Assuming you want the 1st option - go with Petar Minchev's solution
Assuming you want the 2nd option - go with Patrick's solution. Don't forget that when you use an Integer in a Map, you can get/put int as well, and Java will Automatically Box/Unbox it for you, but if you rely on this - you can get NPEs when asking the map for a key that does not exist:
Map<String,Integer> myMap = new HashMap<String,Integer>();
int count = myMap.get("key that does not exist"); // NPE here <---
The NPE occurs because 'get' returns null, and unboxing that null Integer into an int invokes intValue() on it, which triggers the NPE.
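To illustrate the difference, here is a small sketch showing that get itself returns null without failing, that unboxing the result is what throws, and one way to avoid the problem with Map.getOrDefault (Java 8 or later). The names MissingKeyDemo and safeCount are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class MissingKeyDemo {
    // Reads a count without risking an NPE: a missing key is treated as 0.
    static int safeCount(Map<String, Integer> map, String key) {
        return map.getOrDefault(key, 0);
    }

    public static void main(String[] args) {
        Map<String, Integer> myMap = new HashMap<>();
        Integer boxed = myMap.get("key that does not exist"); // fine: boxed is null
        System.out.println(boxed);                            // prints "null"
        System.out.println(safeCount(myMap, "key that does not exist")); // prints 0
        try {
            int unboxed = myMap.get("key that does not exist"); // unboxing null throws
            System.out.println(unboxed);
        } catch (NullPointerException e) {
            System.out.println("NPE on unboxing");
        }
    }
}
```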
You can use the Google Collections library (now part of Guava):
Multiset<String> words = HashMultiset.create();
while (true) {
String token = input.next();
if (".".equals(token))
break;
words.add(token); // Multiset.add always returns true, so duplicates are found from the counts below
}
System.out.println(words.elementSet().size() + " distinct words:\n" + words.elementSet());
Collection<Entry<String>> duplicateWords = Collections2.filter(words.entrySet(), new Predicate<Entry<String>>() {
public boolean apply(Entry<String> entry) {
return entry.getCount() > 1;
}
});
System.out.println("There are " + duplicateWords.size() + " duplicate words.");
System.out.println("The duplicate words are: " + Joiner.on(", ").join(duplicateWords));
Example of output:
Enter words separated by spaces ('.' to quit):
aaa bbb aaa ccc aaa bbb .
3 distinct words:
[aaa, ccc, bbb]
There are 2 duplicate words.
The duplicate words are: aaa x 3, bbb x 2
In "Programming Pearls" I came across the following problem: "print words in order of decreasing frequency". As I understand it, the problem is this. Suppose there is a given string array, let's call it s (I have chosen the words randomly; it does not matter),
String s[]={"cat","cat","dog","fox","cat","fox","dog","cat","fox"};
We see that string "cat" occurs 4 times, "fox" 3 times and "dog" 2 times. So the desired result will be this:
cat
fox
dog
I have written the following code in Java:
import java.util.*;
public class string {
public static void main(String[] args){
String s[]={"fox","cat","cat","fox","dog","cat","fox","dog","cat"};
Arrays.sort(s);
int counts;
int count[]=new int[s.length];
for (int i=0;i<s.length-1;i++){
counts=1;
while (s[i].equals(s[i+1])){
counts++;
}
count[i]=counts;
}
}
}
I have sorted the array and created a count array where I write the number of occurrences of each word in array.
My problem is that somehow the index of the integer array element and the string array element is not the same. How can I print words according to the maximum elements of the integer array?
To keep track of the count of each word, I would use a Map which maps a word to its current count.
String s[]={"cat","cat","dog","fox","cat","fox","dog","cat","fox"};
Map<String, Integer> counts = new HashMap<String, Integer>();
for (String word : s) {
if (!counts.containsKey(word))
counts.put(word, 0);
counts.put(word, counts.get(word) + 1);
}
To print the result, go through the keys in the map and get the final value.
for (String word : counts.keySet())
System.out.println(word + ": " + (float) counts.get(word) / s.length);
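To actually print the words in order of decreasing frequency, the keys can be sorted by their counts once the map is built. The following is a sketch under the assumption that the order of words with equal counts does not matter; the names FrequencyOrder and byDecreasingFrequency are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FrequencyOrder {
    // Returns the distinct words ordered by decreasing count.
    static List<String> byDecreasingFrequency(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        List<String> ordered = new ArrayList<>(counts.keySet());
        // Compare counts in reverse so the most frequent word comes first.
        ordered.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));
        return ordered;
    }

    public static void main(String[] args) {
        String[] s = {"cat", "cat", "dog", "fox", "cat", "fox", "dog", "cat", "fox"};
        for (String word : byDecreasingFrequency(s)) {
            System.out.println(word); // prints cat, fox, dog on separate lines
        }
    }
}
```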