Remove only one frequency of duplicate word from String array - java

I have a string array
String a = "This is a life and our life will be full of fun just like the Benn Steller's Secret life of Walter Mitty.";
String a1[]=a.split(" ");
for(String temp: a1)
{
System.out.println(temp);
}
Here "life" is repeated three times. Now I have to remove only one occurrence of each duplicate word from the array.
please guide me....
Thanks.

You can use something like this, but note that it removes only the first occurrence of the specified word.
Here is the full code, which removes one duplicate (it needs java.util.HashMap and java.util.Map imported). Note that it does not ignore special characters, and that a space is the delimiter in this case.
public static void main(String []args){
String a = "This is a life and our life will be full of fun just like the Benn Steller's Secret life of Walter Mitty Mitty";
System.out.println(removeOneDuplicate(a));
}
public static String removeOneWord(String str, String word){
int value = str.indexOf(word);
String result = str.substring(0, value);
result += str.substring( value+word.length(), str.length());
return result;
}
public static String removeOneDuplicate(String a){
String [] tmp = a.split(" ");
Map<String, Integer> map = new HashMap<String, Integer>();
for(String s: tmp){
if( map.containsKey(s)){
int value = map.get(s);
if(value == 1)
a = removeOneWord(a, s);
map.put(s, value + 1);
}
else
map.put(s, 1);
}
return a;
}
Sample results:
INPUT: This is a life and our life will be full of fun just like the Benn Steller's Secret life of Walter Mitty Mitty
OUTPUT: This is a and our life will be full fun just like the Benn Steller's Secret life of Walter Mitty
In the result you can see that one occurrence each of life, of and Mitty has been removed.
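Because removeOneWord searches the whole sentence with indexOf, it can also hit the word inside a longer word (for example "is" inside "This") and it leaves a double space behind. A minimal sketch of a token-based alternative follows. The class name OneDuplicateRemover is made up for illustration, and it drops the second occurrence of each repeated word rather than the first, which is an assumption about what "remove one" should mean:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OneDuplicateRemover {
    // Drops the second occurrence of every repeated word; the first and any later ones are kept.
    static String removeOneDuplicate(String sentence) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> kept = new ArrayList<>();
        for (String word : sentence.split(" ")) {
            // merge returns the updated count for this word
            int count = seen.merge(word, 1, Integer::sum);
            if (count != 2) {
                kept.add(word);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        String a = "This is a life and our life will be full of fun just like "
                + "the Benn Steller's Secret life of Walter Mitty Mitty";
        System.out.println(removeOneDuplicate(a));
    }
}
```

For the sample input this prints "This is a life and our will be full of fun just like the Benn Steller's Secret life Walter Mitty", with single spaces throughout.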
EDIT
If you want to remove all duplicates and leave only the first occurrence of each word, change the following lines:
int value = str.indexOf(word); -> int value = str.lastIndexOf(word);
int value = map.get(s);
if(value == 1)
a = removeOneWord(a, s);
map.put(s, value + 1);
to:
a = removeOneWord(a, s);

First of all, the example you provided is not a String array. It is a String.
I am giving the solution based on String. If you need it for String array, you will be able to do this on your own, if you understand this.
First, let's take a string tokenizer. A tokenizer breaks a string apart at a given set of delimiter characters. In its simplest form, it breaks a string apart at whitespace.
For example, take a string str = "This is a test". A simple tokenizer will break this string into the words "This" "is" "a" "test".
Below is the code to declare and use a tokenizer:
StringTokenizer st = new StringTokenizer(a); // a is your given string
Now we declare an array of strings. (In an array of strings, each element is a single string.)
String[] str_arr = new String[100];
We will now use the tokenizer to get each word of your string and store the words in the array of strings, like this:
int index=0; // to keep track of index of the array (of strings)
while (st.hasMoreElements()) {
str_arr[index] = (String) st.nextElement();
index++;
}
So now we have an array of strings named 'str_arr'. Next we check, for each element of the array, whether a duplicate value occurs. If we find a duplicate, we replace it with an empty string, but we do this only once per word. The remaining duplicates are kept as they are, which is what you asked for, right?
To keep track of a string already searched and made null, we will use a HashMap like this.
HashMap<String, Integer> hash_map = new HashMap<String, Integer>();
Now we run 2 nested loops, after which we have a modified array in which the number of occurrences of each repeated string is reduced by 1.
for(int i=0; i<index; i++){
String current_string = str_arr[i];
for(int j=i+1; j<index; j++){
if( (current_string.equals(str_arr[j])) && (hash_map.containsKey(current_string)==false) && !str_arr[j].isEmpty()){ // compare content, not references
hash_map.put(str_arr[j], 1);
str_arr[j]="";
break;
}
}
}
Now, you can print all the words simply as below:
for(int i=0; i<index; i++)
System.out.print(str_arr[i]+" ");
INPUT: This is a life and our life will be full of fun just like the Benn Steller's Secret life of Walter Mitty.
OUTPUT: This is a life and our will be full of fun just like the Benn Steller's Secret life Walter Mitty.
Sorry for the long explanation. If any point is still unclear, please comment and I will try to reply.
Thanks!
Happy Coding :)

As we know, a Set does not contain duplicates at all.
My Code:
String a = "This is a life and our life will be full of fun just like the Benn Steller's Secret life of Walter Mitty.";
String[] aSpilt = a.split(" ");
List<String> list = Arrays.asList(aSpilt);
System.out.print("The input is : ");
list.forEach((s) -> System.out.print(s + " "));
System.out.println();
Set<String> noDuplicateSet = new LinkedHashSet<>();
Set<String> duplicateSet = new LinkedHashSet<>();
list.forEach((i) -> {
if (!noDuplicateSet.add(i) && i.equals("life")) {
duplicateSet.add(i + " ");
}
});
System.out.print("The output is : ");
noDuplicateSet.forEach((s) -> System.out.print(s + " "));
System.out.println("");
duplicateSet.forEach((s) -> System.out.print(s + " "));
My output:
The input is : This is a life and our life will be full of fun just like the Benn Steller's Secret life of Walter Mitty.
The output is : This is a life and our will be full of fun just like the Benn Steller's Secret Walter Mitty
Note:
I kept the first life and removed the rest. Be aware that noDuplicateSet drops every repeated word, so the second of is missing from the output as well; only life is recorded in duplicateSet.
I used lambda expressions to traverse the collections.
Sources:
http://www.programcreek.com/2013/03/hashset-vs-treeset-vs-linkedhashset/
http://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html

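// requires: import java.util.Scanner;
public static void main(String args[])
{
// First input line: the sentence. Second input line: the word to remove.
Scanner in = new Scanner(System.in);
String s = in.nextLine();
String ch[] = s.split(" ");
String m = in.nextLine();
for (int i = 0; i < ch.length; i++)
{
// matches() treats m as a regular expression; every matching token is blanked.
if (ch[i].matches(m))
ch[i] = "";
System.out.println(ch[i]);
}
}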

Related

Check if String contains multiple values stored in Array of strings

I'm trying to write a program that checks whether a string contains multiple words that must occur in a specific order. The words are stored in an array of Strings.
Here is what I have so far:
boolean Check = false;
Scanner S = new Scanner(System.in);
System.out.println("What is your question?");
String input=S.nextLine();
String[] Words = {"pay","car"};
for (int i = 0; i <= Words.length -1 ; i++) {
if (input.matches(".*\\b"+Words[i]+"\\b.*") && input.matches(".*\\b"+Words[1]+"\\b.*")) {
Check = true;
}
}
if (Check){
System.out.println("30k Dollar");
} else{
System.out.println("Wrong info! ");
}
Basically, when the user inputs, for example,
"how much should i pay for the car?" they will get the answer "30k Dollar",
because the strings "pay" and "car" are both in my array of strings.
Case 2: if the user inputs " bla bla car bla bla pay"
they will get the same answer.
How can I prevent the program from giving the same answer for the 2 different questions?
Also, in my code I used Words[i] and Words[1], but with a larger list of words this won't work. I tried using a nested loop, but it didn't work.
You don't need to iterate over input words, just generate the full regex:
String[] words = {"pay","car"};
String regex = ".*\\b" + String.join("\\b.*\\b", words) + "\\b.*";
String test1= "how much should i pay for the car?";
System.out.println(test1.matches(regex)); // true
String test2 = "bla bla car bla bla pay";
System.out.println(test2.matches(regex)); // false
I will assume you always look for words separated by spaces, so you can get the individual words using split:
String inputWords[] = input.split(" ");
First, we need to reduce the time complexity of checking whether a word is in our array. We could put the array into a set, but since we care about the order, it is better to use a map whose key is the word and whose value is the index of that word in the array:
Map<String,Integer> map = new HashMap<>();
String[] words = {"pay","car"};
for(int i =0; i< words.length; i++)
map.put(words[i], i);
So now all you need is to iterate over your inputWords and check that all the words are there and that you are not violating the order. The time complexity is O(n):
int lastFoundIndex = -1;
int numFound =0;
for(int i=0; i < inputWords.length; i++) {
if(map.get(inputWords[i]) != null ) {
if(map.get(inputWords[i]) < lastFoundIndex)
break;
lastFoundIndex = map.get(inputWords[i]);
numFound ++;
}
}
if(numFound >= words.length) // this condition allows more than one occurrence of a word without violating the order
System.out.println("30k Dollar");
else
System.out.println("Wrong info! ");
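Two caveats about the loop above: numFound counts repeated hits of the same word, so an input such as "pay pay" would pass, and splitting on a single space leaves punctuation attached ("car?" is not "car"). A minimal sketch of a stricter single-pass check is below. The class and method names are made up for illustration, and it assumes the required words are lowercase and that matching should ignore case and punctuation:

```java
import java.util.Locale;

final class OrderedWordCheck {
    // Returns true when every entry of `required` appears in `input`, in the given order.
    static boolean containsInOrder(String input, String[] required) {
        int next = 0; // index of the required word we are currently waiting for
        // Split on anything that is not a letter or digit, so "car?" yields "car".
        for (String token : input.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (next < required.length && token.equals(required[next])) {
                next++;
            }
        }
        return next == required.length;
    }

    public static void main(String[] args) {
        String[] words = {"pay", "car"};
        System.out.println(containsInOrder("how much should i pay for the car?", words));
        System.out.println(containsInOrder("bla bla car bla bla pay", words));
    }
}
```

The scan advances only when it meets the next word it is waiting for, so each required word has to be found after the previous one.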
You could combine them into a single regex check. You're already matching any character before or after (with .*) so just basically concatenate your regex strings into a single check.
if (input.matches(".*\\b" + Words[0] + "\\b.*\\b" + Words[1] + "\\b.*"))
EDIT: response to "also in my code I used Words[i] and Words[1] but when I got larger list of words this wont work, I tried using nested loop but it didn't work."
You could just iterate over the input words to create the regex string.
String regexPattern = ".*\\b" + String.join("\\b.*\\b", Words) + "\\b.*";
EDIT2: here's my answer and edit combined w/ more context in the code:
String[] Words = {"pay","car"};
String regexPattern = ".*\\b" + String.join("\\b.*\\b", Words) + "\\b.*";
if (input.matches(regexPattern)) {
System.out.println("30k Dollar");
} else {
System.out.println("Wrong info!");
}
EDIT3: Replaced Words.Join() with String.join() cause I can Java w/o a compiler, real gud.
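One thing to watch with a joined regex: if a word contains a regex metacharacter such as a dot, it is interpreted as pattern syntax. A possible variation, sketched here with a hypothetical helper class OrderedRegex, passes each word through Pattern.quote before joining:

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class OrderedRegex {
    // Builds a pattern that matches when all words occur in the given order,
    // treating every word as literal text.
    static Pattern inOrder(String... words) {
        String body = Arrays.stream(words)
                .map(Pattern::quote)
                .collect(Collectors.joining("\\b.*\\b"));
        return Pattern.compile(".*\\b" + body + "\\b.*");
    }

    public static void main(String[] args) {
        Pattern p = inOrder("pay", "car");
        System.out.println(p.matcher("how much should i pay for the car?").matches());
        System.out.println(p.matcher("bla bla car bla bla pay").matches());
    }
}
```

Compiling the pattern once also avoids rebuilding it on every call to String.matches.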

Java Scanner - Using conditions in scanner to check token type and splitting alphanumerals

I am trying to split two lines of strings inputted into the scanner as one big string, back into two separate strings (as shown with my below example and expected output).
Pseudo Code-ish Code
Scanner s = new Scanner("Fred: 18Bob D: 20").useDelimiter(":") //delimiter is probably pointless here
List<String> list = new ArrayList<>();
while (s.hasNext()) {
String str = "";
if (//check if next token is str) {
str = str + s.next();
}
if (//check if next token is :) {
//before the : will always be a name of arbitrary token length (such as Fred,
//and Bob D), I also need to split "name: int" to "name : int" to achieve this
str = str + ": " + s.next();
}
if (//check if next token is alphanumeral) {
//split the alphanumeral then add the int to str then the character
str = str + s.next() + "\n" + s.next() //of course this won't work
//since s.next(will go onto the letter 'D')
}
else {
//more code if needed otherwise make the above if statement an else
}
list.add(str);
}
System.out.println(list);
Expected Output
Fred: 18
Bob D: 20
I just can't figure out how I can achieve this. If any pointers towards achieving this can be given, I would be more than thankful.
Also, a quick question. What's the difference between \n and line.separator and when should I use each one? From the simple examples I've seen in my class codes, line.separator has been used to separate items in a List<String> so that's the only experience I have with that.
You can try the snippet below:
List<String> list = new ArrayList<String>();
String str="";
while(s.hasNext()){
if(s.hasNextInt()){
str+=s.nextInt()+" ";
}
else {
String tmpData = s.next();
String pattern = ".*?(\\d+).*";
if(tmpData.matches(pattern)){
String firstNumber = tmpData.replaceFirst(".*?(\\d+).*", "$1");
str+=firstNumber;
list.add(str);
str="";
str+=tmpData.replace(firstNumber, "")+" ";
}else{
str+=tmpData;
}
}
}
list.add(str);
System.out.println(list);
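If the input always has the shape "name: number" repeated back to back, a regular expression can pull the records apart without a Scanner. This is only a sketch under that assumption (names contain no digits or colons, every record ends in an integer), and RecordSplitter is a made-up name:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RecordSplitter {
    // Group 1: shortest run of non-digits up to the colon (the name).
    // Group 2: the digits that follow (the number).
    private static final Pattern RECORD = Pattern.compile("(\\D+?):\\s*(\\d+)");

    static List<String> split(String joined) {
        List<String> records = new ArrayList<>();
        Matcher m = RECORD.matcher(joined);
        while (m.find()) {
            records.add(m.group(1).trim() + ": " + m.group(2));
        }
        return records;
    }

    public static void main(String[] args) {
        for (String record : split("Fred: 18Bob D: 20")) {
            System.out.println(record);
        }
    }
}
```

For "Fred: 18Bob D: 20" this prints "Fred: 18" and "Bob D: 20" on separate lines.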

How to compare character input by user to dictionary file in Java?

I need to read the user input and compare this to a dictionary.txt. The user may input any number of characters and the program must return all the words in the English language that can be made from these characters. The letters can be used in any order and may only be used once.
For example:
User Input: "odg"
Output: "dog" , "god" ... and any others
After quite a substantial amount of research, I have come up with the following partial solution:
Read user input
Convert to an array of characters
Loop through the document depending on array length
Using indexOf to compare each character in this array to each line, then printing the word/s which do not return -1
How do I compare a set of characters entered by the user to those found in a text file (dictionary)? The characters do not have to be in any order to match (as seen in the example above).
Bear with me here, I know this must be one of the most inefficient ways to do such a task! Any further ideas on how to implement my original idea would be appreciated, while I am also open to any new and more efficient methods to perform this operation.
Below is what I have come up with thus far:
public static void main(String[] args) throws FileNotFoundException {
BufferedReader reader1 = new BufferedReader(new FileReader(FILENAME));
Scanner sc = new Scanner(System.in);
String line;
ArrayList<String> match = new ArrayList<>();
System.out.println("Enter characters to see which english words match: ");
String userInput = sc.next();
char arr[] = userInput.toCharArray();
int i;
try {
while ((line = reader1.readLine()) != null) {
for (i=0; i < arr.length; i++)
{
if ((line.indexOf(userInput.charAt(i)) != -1) && (line.length() == arr.length)) {
match.add(line);
}
else {
// System.out.println("no matches");
}
}
}
System.out.println(match);
}
catch (IOException e) {
e.printStackTrace();
}
**Current results: **
Words in text file:
cab
dog
god
back
dogs
quick
User input: "odg"
Program output:
[god, god, god, dog, dog, dog]
The program should return all words in the dictionary that can be made out of the string entered by the user. I am managing to return both words in this case; however, each is displayed three times (arr.length).
First of all, interesting question. I implemented my own solution and Ole V.V.'s solution. Here is the code, based on your post. I tested only the one test case you provided, so I am not sure whether this is what you want. Let me know if it does not work as you expected.
Solution One: counting O(nk)
public static void main(String[] args) throws IOException {
BufferedReader reader1 = new BufferedReader(new FileReader(FILENAME));
Scanner sc = new Scanner(System.in);
System.out.println("Enter characters to see which english words match: ");
String userInput = sc.next();
Map<Character, Integer> counter = count(userInput);
String line;
while ((line = reader1.readLine()) != null) {
Map<Character, Integer> lineCounter = count(line);
if(lineCounter.equals(counter)) {
System.out.println(line);
}
}
}
public static Map<Character, Integer> count(String input) {
Map<Character, Integer> result = new HashMap<Character, Integer>();
for (char c: input.toCharArray()) {
result.putIfAbsent(c, 0);
result.put(c, result.get(c) + 1);
}
return result;
}
Solution Two: sorting O(nk)
public static void main(String[] args) throws IOException {
BufferedReader reader = new BufferedReader(new FileReader(FILENAME));
Scanner sc = new Scanner(System.in);
System.out.println("Enter characters to see which english words match: ");
String userInput = sc.next();
userInput = sort(userInput);
String line;
while ((line = reader.readLine()) != null) {
String sortedLine = sort(line);
if(sortedLine.equals(userInput)) {
System.out.println(new String(line));
}
}
}
// counting sort
public static String sort(String input) {
char c[] = input.toCharArray();
int length = c.length;
char output[] = new char[length];
int count[] = new int[256];
for (int i = 0; i < length; i++) {
count[c[i]] = count[c[i]] + 1;
}
for (int i = 1; i <= 255; i++) {
count[i] += count[i - 1];
}
for (int i = 0; i < length; i++) {
output[count[c[i]] - 1] = c[i];
count[c[i]] = count[c[i]] - 1;
}
return new String(output);
}
The standard solution to this kind of problem is: sort the characters of the user input. So odg will become dgo and back will become abck. For each word in the dictionary, do the same sorting. So cab will become abc and dog will be dgo — hey, that’s the same as the first user input, so now we know that this word should be output.
The strong point with this solution is you make sure every letter is used exactly once. It even takes duplicate letters into account: if the same letter comes twice in the user input, it will only find words that also contain that letter exactly twice.
If you like, you can prepare your word list in advance by building a map where the keys are the alphabetically sorted words and the values are lists of words that contain those same letters. So key dgo will map to a list of [dog, god]. Then you just have to sort the input and make a lookup.
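The precomputed map described above could look roughly like this. AnagramIndex and its method names are invented for the sketch, and the word list is passed in directly instead of being read from dictionary.txt:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AnagramIndex {
    // Key: the letters of a word in sorted order. Value: all words with exactly those letters.
    private final Map<String, List<String>> bySortedLetters = new HashMap<>();

    AnagramIndex(Iterable<String> words) {
        for (String word : words) {
            bySortedLetters.computeIfAbsent(key(word), k -> new ArrayList<>()).add(word);
        }
    }

    // Sorts the characters of a word, e.g. "dog" and "god" both become "dgo".
    static String key(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Returns every indexed word that uses exactly the given letters.
    List<String> lookup(String letters) {
        return bySortedLetters.getOrDefault(key(letters), Collections.emptyList());
    }

    public static void main(String[] args) {
        AnagramIndex index = new AnagramIndex(
                Arrays.asList("cab", "dog", "god", "back", "dogs", "quick"));
        System.out.println(index.lookup("odg"));
    }
}
```

Building the index costs one sort per dictionary word, after which each query is a single sort plus a hash lookup.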
I'll show you a solution that is easy to understand and implement but not the fastest available:
Possible solution: Array sorting
Treat input string and dictionary word as array of chars, sort them, then compare them:
public static boolean stringsMatchSort(String a, String b) {
// Different length? Definitely no match!
if (a.length() != b.length()) {
return false;
}
// Turn both Strings to char arrays
char[] charsA = a.toCharArray();
char[] charsB = b.toCharArray();
// Sort both arrays
Arrays.sort(charsA);
Arrays.sort(charsB);
// Compare them, if equal: match!
return Arrays.equals(charsA, charsB);
}
Note how I made the meat of your program / problem into a method. You can then easily use that method in a loop that iterates over all words of your dictionary. The method doesn't care where the words come from: a file, a collection, additional user input, the network, etc.
It also helps to simplify your program by dividing it into smaller parts, each with a smaller responsibility. This is commonly known as divide & conquer and is one of the most valuable strategies for new and experienced programmers alike when it comes to tackling complicated problems.
Other solutions: Prime numbers, HashMaps, ...
There are other (including faster and more elegant) solutions available. Take a look at these related questions, which yours is pretty much a duplicate of:
"How to check if two words are anagrams"
"finding if two words are anagrams of each other"
Additional notes
Depending on your application, it might be a good idea to first read the dictionary into a suitable collection. This would be especially helpful if you perform multiple "queries" against the same dictionary. Or, if the dictionary is really huge, you could already strip out duplicates during the creation of the collection.

Java Word Count

I am just starting out in Java, so I appreciate your patience. I am writing a word count program, as you can tell from the title. I am stuck at the numWords statement in the for loop: I am not sure what I should set it equal to. If someone could point me in the right direction, that would be awesome. Thank you. Here is all of my code so far. Let me know if I am not specific enough in what I am asking; this is my first post. Thanks again.
import java.util.Scanner;
public class WCount {
public static void main (String[] args) {
Scanner stdin = new Scanner(System.in);
String [] wordArray = new String [10000];
int [] wordCount = new int [10000];
int numWords = 0;
while(stdin.hasNextLine()){
String s = stdin.nextLine();
String [] words = s.replaceAll("[^a-zA-Z ]", "").toLowerCase().split("\\s+");
for(int i = 0; i < words.length; i++){
numWords = 0;
}
}
}
}
If your code is intended to just count words, then you don't need to iterate through the words array at all. In other words, replace your for loop with just:
numWords += words.length;
Most likely a simpler approach would be to look for sequences of word characters (\w matches letters, digits and the underscore):
Matcher wordMatch = Pattern.compile("\\w+").matcher(s); // s is the input line
while (wordMatch.find())
numWords++;
If you need to do something with the words (such as store them in a map to a count) then this approach will make that simpler:
Map<String,Integer> wordCount = new HashMap<>();
Matcher wordMatch = Pattern.compile("\\w+").matcher(s); // s is the input line
while (wordMatch.find()) {
String word = wordMatch.group();
int count = wordCount.getOrDefault(word, 0);
wordCount.put(word, count + 1);
}
Don't worry. We were all beginners once.
First of all, you don't need the loop, because the array's length field already gives you the count. But if you want to practice with loops, it is as easy as incrementing the counter on each iteration:
numWords++;
Hint: Read the input
String sentence = stdin.nextLine();
Split the string
String [] words = sentence.split(" ");
Number of words in a sentence
System.out.println("number of words in a sentence are " + words.length);
You mentioned in comments that you would also like to print the line in alphabetical order. For that Java got you covered:
Arrays.sort(words);
The simplest way to count the words in a String phrase is to split it into a String array with String[] words = phrase.split(" "), passing a single space as the delimiter. This returns an array containing each word, and words.length then gives the number of words, provided the words are separated by single spaces.
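Splitting on a single space produces empty strings when the text has leading or repeated spaces, which inflates the count. A small sketch that trims first and splits on runs of whitespace avoids that. WordCounter is a hypothetical name:

```java
final class WordCounter {
    // Counts whitespace-separated words; leading, trailing and repeated spaces are ignored.
    static int countWords(String phrase) {
        String trimmed = phrase.trim();
        if (trimmed.isEmpty()) {
            return 0; // splitting an empty string would still yield one (empty) element
        }
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        System.out.println(countWords("  hello   world "));
    }
}
```

The explicit empty check matters because "".split("\\s+") returns an array of length 1.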

Learning simple java string decoding/decryption

I want to write a program that decrypts an input string. It selects 0,2,4,6,8 etc. characters from each section of text input and displays it in reverse in the decryption output.
Input: bxoqb swi eymrawn yim
Output: my name is bob
Keep in mind that the program ignores the space symbol, and repeats the loop at the beginning of each word!
I couldn't find anything on the net that isn't based on a more complicated encryption/decryption systems. I'm starting with the simple stuff, first.
edit: Yes, my question is how do I learn how to do this? Or if someone could teach me a technique to decode strings like this?
pseudo code:
Split your string on spaces and store the pieces in a list.
Iterate over the list, take each string (e.g. bxoqb), extract the characters you want (bob) and save the result.
Iterate over the same list in reverse order.
Hope it helps you to start.
The following code is the most straightforward way...
//code starts
public static void main(String[] args) {
String str = "bxoqb swi eymrawn yim";
String ans = decryption(str);
System.out.println(ans);
}
public static String decryption(String str) {
String ans = "";
String[] words = str.split(" ");
for (String s : words) {
for (int i = 0; i < s.length(); i += 2) {
ans = s.charAt(i) + ans;
}
ans = " " + ans;
}
return ans.trim();
}
//code ends
Hope it helps.
