program to determine number of duplicates in a sentence

program to determine number of duplicates in a sentence - java

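Code:
import java.io.IOException;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;
public class duplicate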
{
public static void main(String[] args)throws IOException
{
System.out.println("Enter words separated by spaces ('.' to quit):");
Set<String> s = new HashSet<String>();
Scanner input = new Scanner(System.in);
while (true)
{
String token = input.next();
if (".".equals(token))
break;
if (!s.add(token))
System.out.println("Duplicate detected: " + token);
}
System.out.println(s.size() + " distinct words:\n" + s);
Set<String> duplicatesnum = new HashSet<String>();
String token = input.next();
if (!s.add(token))
{
duplicatesnum.add(token);
System.out.println("Duplicate detected: " + token);
}
System.out.println(duplicatesnum.size());
}
}
the output is:
Enter words separated by spaces ('.' to quit):
one two one two .
Duplicate detected: one
Duplicate detected: two
2 distinct words:
[two, one]

I assume you want to know the number of different duplicate words. You can use another HashSet<String> for the duplicates.
//Outside the loop
Set<String> duplicates = new HashSet<String>();
//Inside the loop
if (!s.add(token))
{
duplicates.add(token);
System.out.println("Duplicate detected: " + token);
}
//Outside the loop
System.out.println(duplicates.size());
Also, if you care about the number of occurrences of each word, declare a HashMap<String, Integer>, as mentioned in other posts.
But if you want the total number of duplicate words (not just the distinct ones), simply declare a counter:
//Outside the loop
int duplicates = 0;
//Inside the loop
if (!s.add(token))
{
duplicates++;
System.out.println("Duplicate detected: " + token);
}
//Outside the loop
System.out.println(duplicates);

Instead of a HashSet, use a HashMap. A HashSet only stores the values. A HashMap maps a key to a value (see http://www.geekinterview.com/question_details/47545 for an explanation).
In your case, the key of the HashMap is your string (just as the element of the HashSet is the string). The value in the HashMap is the number of times you encountered this string.
When you find a new string, add it to the HashMap, and set the value of the entry to one.
When you encounter the same string later, increment the value in the HashMap.
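The counting steps described above could be sketched roughly as follows. This is only an illustration: the class name WordCounter and its method names are made up for this example, and it assumes the count starts at 1 the first time a word is seen.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class; the names are assumptions for illustration only.
class WordCounter {
    // Maps each word to the number of times it was encountered.
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            Integer seen = counts.get(word);               // null for a new word
            counts.put(word, seen == null ? 1 : seen + 1); // start at 1, then increment
        }
        return counts;
    }

    // Number of distinct words that occur more than once.
    static int countDuplicatedWords(Map<String, Integer> counts) {
        int duplicated = 0;
        for (int n : counts.values()) {
            if (n > 1) {
                duplicated++;
            }
        }
        return duplicated;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("one two one two one three".split(" "));
        System.out.println("one occurs " + counts.get("one") + " times");
        System.out.println(countDuplicatedWords(counts) + " words are duplicated");
    }
}
```

With this layout the same map answers both readings of the question: the per-word counts and the number of distinct duplicated words.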

Because you are using a HashSet, you will not know how many duplicates you have. If you went with a HashMap<String, Integer>, you could increment the count whenever get() returned a non-null value for your key.

In the if (!s.add(token)) branch, you can increment a counter and then display its value at the end.

Your question is a bit ambiguous. Some people understood that you want:
Input: hello man, hello woman, say good by to your man.
Output:
Found duplicate: Hello
Found duplicate: Man
Duplicate count: 2
Others understood you wanted:
Input: hello man, hello woman, say hello to your man.
Output:
Found duplicate: Hello - 3 appearances
Found duplicate: Man - 2 appearances
Assuming you want the 1st option, go with Petar Minchev's solution.
Assuming you want the 2nd option, go with Patrick's solution. Don't forget that when you use an Integer in a Map, you can get/put int as well, and Java will automatically box/unbox it for you, but if you rely on this, you can get NPEs when asking the map for a key that does not exist:
Map<String,Integer> myMap = new HashMap<String,Integer>();
int count = myMap.get("key that does not exist"); // NPE here <---
The NPE occurs because 'get' returns null, and auto-unboxing that null Integer into an int invokes intValue() on it, which triggers the NPE.
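One way to sidestep the unboxing problem is to ask the map for a default value. The sketch below assumes Java 8 or later (for Map.getOrDefault); the class name SafeLookup and the helper countOf are hypothetical names used only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; names are assumptions for illustration only.
class SafeLookup {
    // Returns the stored count, or 0 when the key is absent, so no null is ever unboxed.
    static int countOf(Map<String, Integer> map, String key) {
        return map.getOrDefault(key, 0);
    }

    public static void main(String[] args) {
        Map<String, Integer> myMap = new HashMap<>();
        myMap.put("hello", 3);

        // Keeping the result boxed is safe: it is simply null for a missing key.
        Integer boxed = myMap.get("key that does not exist");
        System.out.println("boxed is null: " + (boxed == null));

        // Unboxing through the helper never throws.
        System.out.println(countOf(myMap, "key that does not exist"));
        System.out.println(countOf(myMap, "hello"));
    }
}
```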

You can use the Google Collections library (now part of Guava):
Multiset<String> words = HashMultiset.create();
while (true) {
String token = input.next();
if (".".equals(token))
break;
words.add(token); // Multiset.add always returns true, so duplicates are found via the counts below
}
System.out.println(words.elementSet().size() + " distinct words:\n" + words.elementSet());
Collection<Entry<String>> duplicateWords = Collections2.filter(words.entrySet(), new Predicate<Entry<String>>() {
public boolean apply(Entry<String> entry) {
return entry.getCount() > 1;
}
});
System.out.println("There are " + duplicateWords.size() + " duplicate words.");
System.out.println("The duplicate words are: " + Joiner.on(", ").join(duplicateWords));
Example of output:
Enter words separated by spaces ('.' to quit):
aaa bbb aaa ccc aaa bbb .
3 distinct words:
[aaa, ccc, bbb]
There are 2 duplicate words.
The duplicate words are: aaa x 3, bbb x 2

Related

Hashmap in for loop not reading all the input

This is for AOC day 2. The input is something along the lines of
"6-7 z: dqzzzjbzz
13-16 j: jjjvjmjjkjjjjjjj
5-6 m: mmbmmlvmbmmgmmf
2-4 k: pkkl
16-17 k: kkkkkkkkkkkkkkkqf
10-16 s: mqpscpsszscsssrs
..."
It's formatted like 'min-max letter: password' and separated by line. I'm supposed to find how many passwords meet the minimum and maximum requirements. I put the whole input into a string variable and used Pattern.quote("\n") to separate the lines into a string array. This worked fine. Then, I replaced all the letters except for the numbers and '-' by making a pattern Pattern.compile("[^0-9]|-"); and running that for every index in the array, using .trim() to cut off the whitespace at the start and end of each string. This is all working fine; I'm getting the desired output like 6 7 and 13 16.
However, now I want to try and split this string into two. This is my code:
HashMap<Integer,Integer> numbers = new HashMap<Integer,Integer>();
for(int i = 0; i < inputArray.length; i++){
String [] xArray = x[i].split(Pattern.quote(" "));
int z = Integer.valueOf(xArray[0]);
int y = Integer.valueOf(xArray[1]);
System.out.println(z);
System.out.println(y);
numbers.put(z, y);
}
System.out.println(numbers);
So, I first make a HashMap which will store <min, max> values. Then, the for loop (which runs 1000 times) splits every entry, such as the 6 7 and 13 16 strings, into two at the " ". The System.out.println(z); and System.out.println(y); calls are working as intended.
6
7
13
16
...
This output goes on to give me 2000 integers, each on its own line. That's exactly what I want. However, the System.out.println(numbers); is outputting:
{1=3, 2=10, 3=4, 4=7, 5=6, 6=9, 7=12, 8=11, 9=10, 10=18, 11=16, 12=13, 13=18, 14=16, 15=18, 16=18, 17=18, 18=19, 19=20}
I have no idea where to even start with debugging this. I made a test file with an array that is formatted like "even, odd" integers all the way up to 100. Using this exact same code (I did change the variable names), I'm getting a better output. It's not exactly desired since it starts at 350=351 and then goes to like 11=15 and continues in a non-chronological order but at least it contains all the 100 keys and values.
Also, completely unrelated question but is my formatting of the for loop fine? The extra space at the beginning and the end of the code?
Edit: I want my expected output to be something like {6=7, 13=16, 5=6, 2=4, 16=17...}. Basically, the hashmap would have the minimum and maximum as the key and value and it'd be in chronological order.

The problem with your code is that you're trying to drive in a nail with a saw. A HashMap is not the right tool to achieve what you want, since
Keys are unique. If you put the same key multiple times, the earlier value will be overwritten.
The order of items in a HashMap is undefined.
A HashMap expresses a key-value relationship, which does not exist in this context.
A better data structure for your password data would probably just be an ArrayList<IntegerPair>, where you would have to define IntegerPair yourself, since Java doesn't have a built-in general-purpose pair type.
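A minimal sketch of that suggestion might look like this. The type name IntegerPair comes from the answer, but its fields (min, max), the parse helper, and the PairListDemo class are assumptions made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pair type; the field names are assumptions.
final class IntegerPair {
    final int min;
    final int max;

    IntegerPair(int min, int max) {
        this.min = min;
        this.max = max;
    }

    @Override
    public String toString() {
        return min + "=" + max;
    }
}

class PairListDemo {
    // Parses lines such as "6 7" into pairs, keeping input order and repeated minimums.
    static List<IntegerPair> parse(String[] lines) {
        List<IntegerPair> pairs = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.trim().split(" ");
            pairs.add(new IntegerPair(Integer.parseInt(parts[0]), Integer.parseInt(parts[1])));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // "6" appears twice as a minimum; a HashMap would keep only one of these entries.
        System.out.println(parse(new String[] {"6 7", "13 16", "5 6", "6 9"}));
    }
}
```

Unlike the HashMap, the list keeps every line, including repeated minimums, in the order they were read.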

I think you are complicating the task unnecessarily. I would proceed as follows:
1. split the input using the line separator
2. for each line, remove : and split using the spaces to get an array with length 3
3. build from the array in step two
3.1. the min/max char count from array[0]
3.2. character classes for the letter and its negation
3.3. remove from the password all letters that do not correspond to the given one and check if the length of the remaining string is in range.
Something like:
public static void main(String[] args){
String input = "6-7 z: dqzzzjbzz\n" +
"13-16 j: jjjvjmjjkjjjjjjj\n" +
"5-6 m: mmbmmlvmbmmgmmf\n" +
"2-4 k: pkkl\n" +
"16-17 k: kkkkkkkkkkkkkkkqf\n" +
"10-16 s: mqpscpsszscsssrs\n";
int count = 0;
for(String line : input.split("\n")){
String[] temp = line.replace(":", "").split(" "); //[6-7, z, dqzzzjbzz]
String minMax = "{" + (temp[0].replace('-', ',')) + "}"; //{6,7}
String letter = "[" + temp[1] + "]"; //[z]
String letterNegate = "[^" + temp[1] + "]"; //[^z]
if(temp[2].replaceAll(letterNegate, "").matches(letter + minMax)){
count++;
}
}
System.out.println(count + " passwords are valid");
}

Hash Tables: Ransom Note hackerrank

Harold is a kidnapper who wrote a ransom note, but now he is worried it will be traced back to him through his handwriting. He found a magazine and wants to know if he can cut out whole words from it and use them to create an untraceable replica of his ransom note. The words in his note are case-sensitive and he must use only whole words available in the magazine. He cannot use substrings or concatenation to create the words he needs.
Given the words in the magazine and the words in the ransom note, print Yes if he can replicate his ransom note exactly using whole words from the magazine; otherwise, print No.
For example, the note is "Attack at dawn". The magazine contains only "attack at dawn". The magazine has all the right words, but there's a case mismatch. The answer is No.
Sample Input 0
6 4
give me one grand today night
give one grand today
Sample Output 0
Yes
Sample Input 1
6 5
two times three is not four
two times two is four
Sample Output 1
No
My code 5/22 test cases failed :(
I can't figure out why 5 failed.
static void checkMagazine(String[] magazine, String[] note) {
int flag = 1;
Map<String, Integer> wordMap = new HashMap<>();
for(String word: magazine) {
if(!wordMap.containsKey(word)) {
wordMap.put(word, 1);
} else
wordMap.put(word,wordMap.get(word)+1);
}
for(String word: note){
if(!wordMap.containsKey(word)){
flag = 0;
break;
}
else wordMap.remove(word, wordMap.get(word));
}
if(flag == 0)
System.out.println("No");
else
System.out.println("Yes");
}

It's probably because instead of decrementing the count of the words in the magazine when you retrieve one, you're removing all counts of that word completely. Try this:
for(String word: note){
if(!(wordMap.containsKey(word) && wordMap.get(word) > 0)){
flag = 0;
break;
}
else wordMap.put(word, wordMap.get(word)-1);
}

wordMap is a frequency table and gives word counts.
However, for every word in the note, you must decrease the word count instead of removing the entry entirely. Only when the word count reaches 0 should the entry be removed.
Another issue is case sensitivity. Depending on the requirements, you may need to convert all words to lowercase.
else {
wordMap.computeIfPresent(word, (k, v) -> v <= 1? null : v - 1);
}
This checks whether the old value v is above 1 and, if so, decreases it; otherwise it returns null, which signals that the entry should be deleted.
The frequency counting can be done with merge:
Map<String, Integer> wordMap = new HashMap<>();
for(String word: magazine) {
wordMap.merge(word, 1, Integer::sum);
}
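Putting the two fragments above together, a complete check might look like the sketch below. The class name RansomNote and the boolean-returning method canWrite are hypothetical; the original checkMagazine prints Yes or No instead, and the comparison here stays case-sensitive as the problem statement requires.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper class; names are assumptions for illustration only.
class RansomNote {
    // Returns true if every word of the note can be cut out of the magazine.
    static boolean canWrite(String[] magazine, String[] note) {
        // Build the frequency table of magazine words.
        Map<String, Integer> wordMap = new HashMap<>();
        for (String word : magazine) {
            wordMap.merge(word, 1, Integer::sum);
        }
        // Consume one occurrence per note word; fail as soon as a word runs out.
        for (String word : note) {
            if (!wordMap.containsKey(word)) {
                return false;
            }
            // Decrement the count, or drop the entry when the last copy is used.
            wordMap.computeIfPresent(word, (k, v) -> v <= 1 ? null : v - 1);
        }
        return true;
    }

    public static void main(String[] args) {
        String[] magazine = "two times three is not four".split(" ");
        String[] note = "two times two is four".split(" ");
        System.out.println(canWrite(magazine, note) ? "Yes" : "No");
    }
}
```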

I think this implementation is simpler:
static boolean checkMagazine(String[] magazine, String[] note) {
List<String> magazineCopy = new ArrayList<>(Arrays.asList(magazine));
for (String word : note)
{
if (magazineCopy.contains(word)) {
magazineCopy.remove(word);
continue;
}
return false;
}
return true;
}
I suppose your error is here:
else wordMap.remove(word, wordMap.get(word));
You are removing the word from the map instead of decreasing its count; only when the count reaches 0 should you remove the word from the map.

Python Solution
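def checkMagazine(magazine, ransom):
    magazine.sort()
    ransom.sort()
    # Start as True so an empty note does not leave flag undefined
    flag = True
    for word in ransom:
        if word not in magazine:
            flag = False
            break
        else:
            # Use up one copy of the word from the magazine
            magazine.remove(word)
    if flag:
        print("Yes")
    else:
        print("No")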

Compare Certain Strings (part of sentence) from Paragraph

I have a Map,
HashMap<String,String> dataCheck= new HashMap<String,String>();
dataCheck.put("Flag1","Additional Income");
dataCheck.put("Flag2","Be your own boss");
dataCheck.put("Flag3","Compete for your business");
and a paragraph.
String paragraph = "When you have an additional Income, you can be your
own boss. So advertise with us and compete for your business. We help
you get additional income";
So what I want to achieve is, for every member of the HashMap, to compare it with the paragraph and find the number of repetitions. My output must be as follows:
Flag1 - 2 , Flag2 - 1 , Flag3 - 1
So, basically, I just want to get an idea of how to compare a certain string with another set of strings.
Update: The Match would be case insensitive.

You can use a loop with String.indexOf() to count occurrences.
In the following code, you'll see we are looping through our HashMap and comparing each entry to our paragraph.
HashMap<String, String> dataCheck = new HashMap<String, String>();
dataCheck.put("Flag1", "Additional Income");
dataCheck.put("Flag2", "Be your own boss");
dataCheck.put("Flag3", "Compete for your business");
String paragraph = "When you have an additional Income, you can be your own boss. So advertise with us and compete for your business. We help you get additional income";
// Now, iterate through each entry in the Map
for (Map.Entry<String, String> entry : dataCheck.entrySet()) {
// Keep track of the number of occurrences
int count = 0;
// After finding a match, we need to increase our index in the loop so it moves on to the next match
int startingIndex = 0;
// This will convert the strings to upper case (so our matches are case insensitive)
// It will continue looping until we get an indexOf == -1 (which means no match was found)
while ((startingIndex = paragraph.toUpperCase().indexOf(entry.getValue().toUpperCase(), startingIndex)) != -1) {
// Add to our count
count++;
// Move our index position forward for the next loop
startingIndex++;
}
// Finally, print out the total count per Flag
System.out.println(entry.getKey() + ": " + count);
}
Here is the result:
Flag1: 2
Flag2: 1
Flag3: 1

Java Print Stream printing spaces

I'm trying to print out a collection of words to a CSV file and am trying to avoid spaces being printed as a word.
static TreeMap<String,Integer> wordHash = new TreeMap<String,Integer>();
Set words=wordHash.entrySet();
Iterator it = words.iterator();
while(it.hasNext()) {
Map.Entry me = (Map.Entry)it.next();
System.out.println(me.getKey() + " occured " + me.getValue() + " times");
if (!me.getKey().equals(" ")) {
ps.println(me.getKey() + "," + me.getValue());
}
}
Whenever I open the CSV, as well as in the console, the output is :
1
10 1
a 4
test 2
I am trying to remove that top entry, which is a space. I thought the statement checking that the key wasn't a space would work; however, it's still printing spaces. Any help is appreciated. Thanks.

Your condition will eliminate only keys that consist of exactly one space. If you want to eliminate keys made up of any amount of whitespace, use:
if (!me.getKey().trim().isEmpty()) {
...
}
This is assuming me.getKey() can't be null.

How can I use a string array as key in hash map?

I've made a String array out of a .txt file and now want to make a HashMap with the strings as keys. But I don't want to have the whole array as one key to one value; I want each piece of information to be a new key in the HashMap.
private static String[] readAndConvertInputFile() {
String str = StdIn.readAll();
String conv = str.replaceAll("\'s", "").replaceAll("[;,?.:*/\\-_()\"\'\n]", " ").replaceAll(" {2,}", " ").toLowerCase();
return conv.split(" "); }
So the information in the string is like ("word", "thing", "etc.", "pp.", "thing").
My value should be the frequency of the word in the text. So for example key: "word" value: 1, key: "thing" value: 2 and so on... I'm clueless and would be grateful if someone could help me, at least with the key. :)

You can create a Map while using the String value at each array index as the key, and an Integer as the value to keep track of how many times a word appeared.
Map<String,Integer> occurences = new HashMap<String,Integer>();
Then when you want to increment, you can check if the Map already contains the key, if it does, increase it by 1, otherwise, set it to 1.
if (occurences.containsKey(word)) {
occurences.put(word, occurences.get(word) + 1);
} else {
occurences.put(word, 1);
}
So, while you are looping over your string array, convert the String to lower case (if you want to ignore case for word occurrences), and increment the map using the if statement above.
for (String word : words) {
word = word.toLowerCase(); // remove if you want case sensitivity
if (occurences.containsKey(word)) {
occurences.put(word, occurences.get(word) + 1);
} else {
occurences.put(word, 1);
}
}
A full example is shown below. I converted the words to lowercase to ignore case when using them as keys in the map; if you want to keep case, remove the line where I convert to lowercase.
public static void main(String[] args) {
String s = "This this the has dog cat fish the cat horse";
String[] words = s.split(" ");
Map<String, Integer> occurences = new HashMap<String, Integer>();
for (String word : words) {
word = word.toLowerCase(); // remove if you want case sensitivity
if (occurences.containsKey(word)) {
occurences.put(word, occurences.get(word) + 1);
} else {
occurences.put(word, 1);
}
}
for(Entry<String,Integer> en : occurences.entrySet()){
System.out.println("Word \"" + en.getKey() + "\" appeared " + en.getValue() + " times.");
}
}
Which will give me output:
Word "cat" appeared 2 times.
Word "fish" appeared 1 times.
Word "horse" appeared 1 times.
Word "the" appeared 2 times.
Word "dog" appeared 1 times.
Word "this" appeared 2 times.
Word "has" appeared 1 times.

Yes, you can use an array (regardless of element type) as a HashMap key.
No, you shouldn't do so. Arrays do not override equals() and hashCode(), so lookups compare by identity and the behavior is unlikely to be what you want (in general).
In your particular case, I don't see why you even propose using an array as a key in the first place. You seem to want Strings drawn from among your array elements as keys.
You could construct a word frequency table like so:
Map<String, Integer> computeFrequencies(String[] words) {
Map<String, Integer> frequencies = new HashMap<String, Integer>();
for (String word: words) {
Integer wordFrequency = frequencies.get(word);
frequencies.put(word,
(wordFrequency == null) ? 1 : (wordFrequency + 1));
}
return frequencies;
}
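To illustrate why an array makes a poor map key, the sketch below looks up an entry with a second array holding the same elements, then repeats the lookup with lists. The class ArrayKeyDemo and its method names are invented for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; names are assumptions for illustration only.
class ArrayKeyDemo {
    // Arrays inherit identity-based equals/hashCode, so an equal-looking array is not found.
    static boolean foundByEqualArray() {
        Map<String[], Integer> byArray = new HashMap<>();
        byArray.put(new String[] {"word", "thing"}, 1);
        return byArray.containsKey(new String[] {"word", "thing"});
    }

    // Lists compare element by element, so an equal list is found.
    static boolean foundByEqualList() {
        Map<List<String>, Integer> byList = new HashMap<>();
        byList.put(Arrays.asList("word", "thing"), 1);
        return byList.containsKey(Arrays.asList("word", "thing"));
    }

    public static void main(String[] args) {
        System.out.println("array key found: " + foundByEqualArray());
        System.out.println("list key found: " + foundByEqualList());
    }
}
```

If a composite key is really needed, a List (or a small value class) is usually the safer choice; for word frequencies, the individual strings are the natural keys.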

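In Java 8, using streams:
String[] array=new String[]{"a","b","c","a"};
// The merge function receives the two values stored for a repeated key, so add them together
Map<String,Integer> map1=Arrays.stream(array).collect(Collectors.toMap(x->x,x->1,(oldCount,newCount)->oldCount+newCount));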
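An alternative stream formulation, sketched here under the assumption that Long counts are acceptable, groups equal words and counts each group. The class name StreamWordCount and the method count are hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class; names are assumptions for illustration only.
class StreamWordCount {
    // Groups identical words together and counts the size of each group.
    static Map<String, Long> count(String[] words) {
        return Arrays.stream(words)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = count(new String[] {"a", "b", "c", "a", "a"});
        System.out.println(counts);
    }
}
```

Note that Collectors.counting() produces Long values, so the map type differs from the Map<String, Integer> used elsewhere on this page.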
