Java, removing elements from an ArrayList - java

I'm having an issue with this project. The basic premise is user enters a phrase and it's supposed to find any duplicate words and how many there are.
My issue is when entering just one word multiple times, such as...
hello hello hello hello hello
The output for that would be;
"There are 2 duplicates of the word "hello" in the phrase you entered."
"There are 1 duplicates of the word "hello" in the phrase you entered."
This only seems to happen in situations like this. If I enter a random phrase with multiple words scattered throughout, it displays the correct answer. I think the problem has something to do with removing the duplicate words and how many times the code iterates through the phrase, but I just cannot wrap my head around it. I've added print lines everywhere and changed the number of iterations in all sorts of ways. I also ran it through a Java Visualizer and still couldn't find the exact problem. Any help is greatly appreciated!
This is an assignment for my online Java course, but it's only for learning/practice; it does not count towards my major. I'm not looking for complete answers though, just help.
public class DuplicateWords {
public static void main(String[] args) {
List<String> inputList = new ArrayList<String>();
List<String> finalList = new ArrayList<String>();
int duplicateCounter;
String duplicateStr = "";
Scanner scan = new Scanner(System.in);
System.out.println("Enter a sentence to determine duplicate words entered: ");
String inputValue = scan.nextLine();
inputValue = inputValue.toLowerCase();
inputList = Arrays.asList(inputValue.split("\\s+"));
finalList.addAll(inputList);
for(int i = 0; i < inputList.size(); i++) {
duplicateCounter = 0;
for(int j = i + 1; j < finalList.size(); j++) {
if(finalList.get(i).equalsIgnoreCase(finalList.get(j))
&& !finalList.get(i).equals("!") && !finalList.get(i).equals(".")
&& !finalList.get(i).equals(":") && !finalList.get(i).equals(";")
&& !finalList.get(i).equals(",") && !finalList.get(i).equals("\"")
&& !finalList.get(i).equals("?")) {
duplicateCounter++;
duplicateStr = finalList.get(i).toUpperCase();
}
if(finalList.get(i).equalsIgnoreCase(finalList.get(j))) {
finalList.remove(j);
}
}
if(duplicateCounter > 0) {
System.out.printf("There are %s duplicates of the word \"%s\" in the phrase you entered.", duplicateCounter, duplicateStr);
System.out.println();
}
}
}
}
Based on some suggestions I edited my code, but I'm not sure I'm going in the right direction
String previous = "";
for(Iterator<String> i = inputList.iterator(); i.hasNext();) {
String current = i.next();
duplicateCounter = 0;
for(int j = + 1; j < finalList.size(); j++) {
if(current.equalsIgnoreCase(finalList.get(j))
&& !current.equals("!") && !current.equals(".")
&& !current.equals(":") && !current.equals(";")
&& !current.equals(",") && !current.equals("\"")
&& !current.equals("?")) {
duplicateCounter++;
duplicateStr = current.toUpperCase();
}
if(current.equals(previous)) {
i.remove();
}
}
if(duplicateCounter > 0) {
System.out.printf("There are %s duplicates of the word \"%s\" in the phrase you entered.", duplicateCounter, duplicateStr);
System.out.println();
}
}

The problem with your code is that when you remove an item you still increment the index, so you skip over what would be the next item. In abbreviated form, your inner loop is:
for (int j = i + 1; j < finalList.size(); j++) {
String next = finalList.get(j);
if (some test on next)
finalList.remove(j);
}
After remove is called, the "next" item is at the same index j, because removing an item by index shifts every item to its right one place left to fill the gap. To fix this, add this line right after removing:
j--;
That would fix your problem, however, there's a cleaner way to do this:
String previous = "";
for (Iterator<String> i = inputList.iterator(); i.hasNext();) {
String current = i.next();
if (current.equals(previous)) {
i.remove(); // removes current item
}
previous = current;
}
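inputList now has all adjacent duplicates removed. (This requires a modifiable list such as an ArrayList; the fixed-size list returned by Arrays.asList throws UnsupportedOperationException when remove() is called.)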
To remove all duplicates:
List<String> finalList = inputList.stream().distinct().collect(Collectors.toList());
If you like pain, do it "manually":
Set<String> duplicates = new HashSet<>(); // sets are unique
for (Iterator<String> i = inputList.iterator(); i.hasNext();)
if (!duplicates.add(i.next())) // add returns true if the set changed
i.remove(); // removes current item
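A minimal sketch of the index-shift effect described above. The class and method names (`RemoveWhileLooping`, `removeSkipping`, `removeWithStepBack`) are made up for illustration and are not from the original post. Both methods work on a modifiable copy, since the input here comes from `Arrays.asList`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RemoveWhileLooping {
    // Buggy: after remove(j) the next element slides into slot j, then j++ skips it.
    static List<String> removeSkipping(List<String> words, String target) {
        List<String> copy = new ArrayList<>(words); // modifiable copy
        for (int j = 0; j < copy.size(); j++) {
            if (copy.get(j).equals(target)) {
                copy.remove(j);
            }
        }
        return copy;
    }

    // Fixed: step the index back so the element that shifted left is examined too.
    static List<String> removeWithStepBack(List<String> words, String target) {
        List<String> copy = new ArrayList<>(words);
        for (int j = 0; j < copy.size(); j++) {
            if (copy.get(j).equals(target)) {
                copy.remove(j);
                j--;
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("hello", "hello", "hello", "hello", "hello");
        System.out.println(removeSkipping(input, "hello"));     // two elements survive
        System.out.println(removeWithStepBack(input, "hello")); // empty list
    }
}
```

With five copies of "hello", the first version removes only every other element, which matches the "2 duplicates, then 1 duplicate" output in the question.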

I would start by populating a Map<String, Integer> with each word; increment the Integer each time you encounter a word. Something like
String inputValue = scan.nextLine().toLowerCase();
String[] words = inputValue.split("\\s+");
Map<String, Integer> countMap = new HashMap<>();
for (String word : words) {
Integer current = countMap.get(word);
int v = (current == null) ? 1 : current + 1;
countMap.put(word, v);
}
Then you can iterate the Map entrySet and display every key (word) where the count is greater than 1. Something like,
String msgFormat = "There are %d duplicates of the word \"%s\" in "
+ "the phrase you entered.%n";
for (Map.Entry<String, Integer> entry : countMap.entrySet()) {
if (entry.getValue() > 1) {
System.out.printf(msgFormat, entry.getValue(), entry.getKey());
}
}
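The same counting idea can be sketched a little more compactly with `Map.merge`. This is only an illustration under a few assumptions: the class name `DuplicateWordCounter` is hypothetical, a `LinkedHashMap` is used so words are reported in first-seen order, and the message reports `count - 1` because the question counts repeats beyond the first occurrence.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DuplicateWordCounter {
    // Counts how often each whitespace-separated word occurs, ignoring case.
    static Map<String, Integer> countWords(String phrase) {
        Map<String, Integer> counts = new LinkedHashMap<>(); // keeps first-seen order
        for (String word : phrase.trim().toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum); // insert 1 or add 1 to the old count
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        countWords("hello hello hello hello hello").forEach((word, count) -> {
            if (count > 1) {
                // count - 1 is the number of repeats beyond the first occurrence
                System.out.printf("There are %d duplicates of the word \"%s\".%n", count - 1, word);
            }
        });
    }
}
```

Because nothing is removed from a list while it is being walked, the skipped-element problem from the question cannot occur here.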

Before you add inputList to finalList, remove any duplicate items from inputList.

Related

Read ArrayList from file. Print words that appear only ONCE

Newbie to coding and Java, please be kind :)
I'm working on a project for school and I'm trying to iterate over an ArrayList that I read in from a text file.
I read the file in using a Scanner into an ArrayList and then sort the ArrayList using Collections.sort() with the hopes that I can check each element with the next one. If the element is the same as the next one, ignore and continue but if the element is not duplicated in the ArrayList, then add it to a new ArrayList.
So, when reading in a text file that has these words:
this this is a a sentence sentence that does not not make sense a sentence not sentence not really really why not this a sentence not sentence a this really why
the new ArrayList should be
is that does make sense
because those words only appear once.
public static void main (String[] args) throws FileNotFoundException {
Scanner fileIn = new Scanner(new File("words.txt"));
ArrayList<String> uniqueArrList = new ArrayList<String>();
ArrayList<String> tempArrList = new ArrayList<String>();
while (fileIn.hasNext()) {
tempArrList.add(fileIn.next());
Collections.sort(tempArrList);
}
for (String s : tempArrList) {
if(!uniqueArrList.contains(s))
uniqueArrList.add(s);
else if (uniqueArrList.contains(s))
uniqueArrList.remove(s);
Collections.sort(uniqueArrList);
System.out.println(uniqueArrList);
}
This is what I have so far but I keep ending up with this [a, does, is, make, really, sense, that]
I hope someone can tell me what I'm doing wrong :)
Your algorithm is not correct, because it keeps adding and removing items from uniqueArrList. As a result, it finds the words that appear an odd number of times, and it does not depend on the list being sorted at all.
You can sort the list once (move sort out of the loop) and then use a very simple strategy:
Walk the list using an integer index
Check the word at the current index against the word at the next index
If words are different, print the current word, and advance index by one
If words are the same, walk the list forward until you see a different word, and use the location of that word as the next value for the loop index.
Here is a sample implementation:
Scanner fileIn = new Scanner(new File("words.txt"));
List<String> list = new ArrayList<>();
while (fileIn.hasNext()) {
list.add(fileIn.next());
}
Collections.sort(list);
int pos = 0;
while (pos != list.size()) {
int next = pos+1;
while (next != list.size() && list.get(pos).equals(list.get(next))) {
next++;
}
if (next == pos+1) {
System.out.println(list.get(pos));
}
pos = next;
}
One option here would be to maintain a hashmap of words to counts as you parse the file. Then, iterate that map at the end to obtain the words which only appeared once:
Scanner fileIn = new Scanner(new File("words.txt"));
Map<String, Integer> map = new HashMap<>();
ArrayList<String> uniqueArrList = new ArrayList<String>();
while (fileIn.hasNext()) {
String word = fileIn.next();
Integer cnt = map.get(word);
map.put(word, cnt == null ? 1 : cnt.intValue() + 1);
}
// now iterate over all words in the map, adding unique words to a separate list
for (Map.Entry<String, Integer> entry : map.entrySet()) {
if (entry.getValue() == 1) {
uniqueArrList.add(entry.getKey());
}
}
Your current approach is close, but you should sort once, after you have added all the words. Then you need to keep an index into the List so you can test whether equal elements are adjacent. Something like,
List<String> uniqueArrList = new ArrayList<>();
List<String> tempArrList = new ArrayList<>();
while (fileIn.hasNext()) {
tempArrList.add(fileIn.next());
}
Collections.sort(tempArrList);
for (int i = 0; i < tempArrList.size(); ) {
String s = tempArrList.get(i);
int j = i + 1;
// skip all equal and adjacent values
while (j < tempArrList.size() && s.equals(tempArrList.get(j))) {
j++;
}
// keep the word only if its run has length one (this also covers the last element)
if (j == i + 1) {
uniqueArrList.add(s);
}
i = j;
}
System.out.println(uniqueArrList);
The easiest way would be to use a Set such as a HashSet, since it handles repeated elements for you. However, if you have to use lists, there is no need to sort the elements. Just iterate twice over the words:
List<String> uniqueWords = new ArrayList<>();
for (int i = 0; i < words.size(); i++) {
boolean hasDuplicate = false;
for (int j = 0; j < words.size(); j++) {
if (i != j) {
if (words.get(i).equals(words.get(j))){
hasDuplicate = true;
}
}
}
if (!hasDuplicate) {
uniqueWords.add(words.get(i));
}
}
Your logic error occurs when you call
else if (uniqueArrList.contains(s))
uniqueArrList.remove(s);
Use One Array:
Scanner fileIn = new Scanner(new File("words.txt"));
ArrayList<String> tempArrList = new ArrayList<String>();
while (fileIn.hasNext()) {
tempArrList.add(fileIn.next());
}
Collections.sort(tempArrList);
System.out.println(tempArrList);
if (tempArrList.size() > 1) {
for (int i = tempArrList.size() - 1; i >= 0; i--) {
String item = tempArrList.remove(i);
if (tempArrList.removeAll(Collections.singleton(item))) {
if (i > tempArrList.size()) {
i = tempArrList.size();
}
} else {
tempArrList.add(item);
}
}
}
System.out.println(tempArrList);
I hope this helps! Please leave feedback if it was helpful.
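For completeness only: removing duplicates is a no-brainer with Java 8 Streams, using the distinct() intermediate operation. Note that distinct() keeps one copy of every word rather than only the words that appear exactly once: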
public static void main (String[] args) throws FileNotFoundException {
final Scanner fileIn = new Scanner(new File("words.txt"));
final List<String> tempArrList = new ArrayList<String>();
while (fileIn.hasNext()) {
tempArrList.add(fileIn.next());
}
final List<String> uniqueArrList = tempArrList.stream().distinct().collect(Collectors.toList());
System.out.println(uniqueArrList);
}
This code prints (for the provided input):
[this, is, a, sentence, that, does, not, make, sense, really, why]
If we want all the words sorted, simply adding sorted() to the stream pipeline does the trick:
tempArrList.stream().sorted().distinct().collect(Collectors.toList());
and we obtain a sorted (and pretty) output:
[a, does, is, make, not, really, sense, sentence, that, this, why]
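To get only the words that appear exactly once, which is what the question asks for, a stream can group and count first and then filter. This is a hedged sketch rather than part of the original answer; the class name `OnceOnlyWords` and method name `appearingOnce` are hypothetical, and a `LinkedHashMap` is assumed so the result keeps first-seen order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class OnceOnlyWords {
    // Returns the words that occur exactly once, in the order they were first seen.
    static List<String> appearingOnce(List<String> words) {
        Map<String, Long> counts = words.stream()
                .collect(Collectors.groupingBy(Function.identity(),
                        LinkedHashMap::new, Collectors.counting()));
        return counts.entrySet().stream()
                .filter(e -> e.getValue() == 1L) // keep words counted once
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("this", "this", "is", "a", "a", "sentence");
        System.out.println(appearingOnce(words)); // [is, sentence]
    }
}
```

No sorting is needed for this approach; add `.sorted()` before the final `collect` if alphabetical output is wanted.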

Is it correct to convert 2D CharArray to String and use .charAt() to compare a character?

So I have a char variable called "temp" which I'd like to compare to the element stored in "X" CharArray[X][Y] while I'm in a third for loop after the 2D array.
For example:
char temp;
temp = ' ';
String end;
end = "";
for (int i = 0; i < CharArray.length; i++){
for (int m = 0; m < 2; m++){
if (somethingY){
if (somethingZ){
for (int j = 0; j < something.length; j++){
//something
temp = somethingX;
if (temp == String.valueOf(CharArray[i][m]).charAt(0)){
end = String.valueOf(CharArray[i][m]);
System.out.print(end);
}
}
}
}
}
}
I've tried printing "temp" where it says "temp = somethingX" and it prints just fine. But when I try to save the String into a String variable, it will not print the variable called "end".
According to this, it won't do anything if the object is something else, but "end" is a String.
So, what am I doing wrong?
EDIT: In case there's a confusion, "I'm trying to print "end", but I figured if temp == String.valueOf(CharArray[i][m]).charAt(0) is correct, so should "end"'s part.".
EDIT2: Defined "temp" for people...
EDIT3: I tried "end.equals(String.valueOf(CharArray[i][m]));", but still nothing happens when I try to print it. I get no errors nor anything.
EDIT4: I tried putting String.valueOf(CharArray[i][m]).charAt(0) into a another variable called "temp2" and doing if (temp == temp2), but still the same thing.
EDIT5: I tried temp == CharArray[0][m] and then end = CharArray[0][m], but still nothing prints.
EDIT6: OK. Since this will never get resolved, I'll just explain the whole point of my problem. -> I have an ArrayList where each line is a combination of a letter, a space and a number (e.g. "E 3"). I need to check whether a letter repeats and, if it does, sum the numbers from all occurrences of that letter.
For example, if I have the following ArrayList:
Z 3
O 9
I 1
J 7
Z 7
K 2
O 2
I 8
K 8
J 1
I need the output to be:
Z 10
O 11
I 9
J 8
K 10
I didn't want people to do the whole thing for me, but it seems I've no choice, since I've wasted 2 days on this problem and I'm running out of time.
Use a map :
ArrayList<String> input=new ArrayList<String>();
input.add("O 2");
input.add("O 2");
Map<String, Integer> map= new HashMap<String, Integer>();
for (String s:input) {
String[] splitted=s.split(" ");
String letter=splitted[0];
Integer number=Integer.parseInt(splitted[1]);
Integer num=map.get(letter);
if (num==null) {
map.put(letter,number);
}
else {
map.put(letter,number+num);
}
}
for (Map.Entry<String, Integer> entry : map.entrySet()) {
System.out.println(entry.getKey() + " " + Integer.toString(entry.getValue()));
}
Without using a map :
ArrayList<String> input=new ArrayList<String>();
input.add("O 2");
input.add("O 2");
ArrayList<String> letters=new ArrayList<String>();
ArrayList<Integer> numbers=new ArrayList<Integer>();
for (String s:input) {
String[] splitted=s.split(" ");
String letter=splitted[0];
Integer number=Integer.parseInt(splitted[1]);
int index=-1;
boolean isthere=false;
for (String l:letters) {
index++;
if (l.equals(letter)) {
isthere=true; //BUGFIX
break;
}
}
if (isthere==false) { //BUGFIX
letters.add(letter);
numbers.add(number);
}
else {
numbers.set(index,numbers.get(index)+number);
}
}
for (int i=0; i < letters.size(); i++) {
System.out.println(letters.get(i) + " " + numbers.get(i));
}
Converting it back to have a nice output :
ArrayList<String> output=new ArrayList<String>();
for (int i=0; i < letters.size(); i++) {
output.add(letters.get(i)+" "+Integer.toString(numbers.get(i)));
}
Feel free to comment if you have any questions.
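For the "letter number" summing problem, here is a small sketch that reproduces the expected output order from the question. It assumes each line is a letter, whitespace, and an integer; the class name `LetterSums` and method name `sumByLetter` are hypothetical. A `LinkedHashMap` is used because a plain `HashMap` makes no guarantee about iteration order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LetterSums {
    // Sums the numbers per letter, keeping letters in first-seen order.
    static Map<String, Integer> sumByLetter(List<String> lines) {
        Map<String, Integer> sums = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+"); // parts[0] = letter, parts[1] = number
            sums.merge(parts[0], Integer.parseInt(parts[1]), Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "Z 3", "O 9", "I 1", "J 7", "Z 7", "K 2", "O 2", "I 8", "K 8", "J 1");
        System.out.println(sumByLetter(input)); // {Z=10, O=11, I=9, J=8, K=10}
    }
}
```

Malformed lines (no number, or a non-numeric second field) would throw here; validation is left out to keep the sketch short.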

Apply a Frequency to an Element in an Array

I am trying to write a method that takes a set of Words (a custom class) and sorts them alphabetically into a list by their text value (this part works). From there I count how many of the following terms are the same as the current one, and that count becomes the frequency for all of those identical terms. This repeats until every element in the list has been assigned a frequency. Finally, it re-sorts the elements back into their original positions, using a stored field that holds each word's original index. Here is the code:
public void setFrequencies() {
List<Word> dupeWordList;
dupeWordList = new ArrayList<>(wordList);
dupeWordList.removeAll(Collections.singleton(null));
Collections.sort(dupeWordList, (Word one, Word other) -> one.getValue().compareTo(other.getValue()));
int count;
int currElement;
for(currElement = 0; currElement < dupeWordList.size(); currElement++) {
count = 1;
Word tempWord = dupeWordList.get(currElement);
tempWord.setFrequency(count);
if(currElement+1 <= dupeWordList.size() - 1) {
Word nextWord = dupeWordList.get(currElement+1);
while(tempWord.getValue().equals(nextWord.getValue())) {
count++;
currElement++;
tempWord.setFrequency(count);
for(int e = 0; e < count - 1; e++) {
Word middleWord = new Word();
if(currElement-count+2+e < dupeWordList.size() - 1) {
middleWord = dupeWordList.get(currElement-count+2+e);
}
middleWord.setFrequency(count);
}
if(currElement+1 <= dupeWordList.size() - 1) {
nextWord = dupeWordList.get(currElement+1);
} else {
break;
}
}
break;
}
}
List<Word> reSortedList = new ArrayList<>(wordList);
Word fillWord = new Word();
fillWord.setFrequency(0);
fillWord.setValue(null);
Collections.fill(reSortedList, fillWord);
for(int i = 0; i < dupeWordList.size(); i++) {
Word word = dupeWordList.get(i);
int wordOrder = word.getOrigOrder();
reSortedList.set(wordOrder, word);
}
System.out.println(Arrays.toString(DebugFreq(reSortedList)));
setWordList(reSortedList);
}
public int[] DebugFreq(List<Word> rSL) {
int[] results = new int[rSL.size()];
for(int i=0; i < results.length; i++) {
results[i] = rSL.get(i).getFrequency();
}
return results;
}
As you can see, I set up a little debug method at the bottom. When I run this method it shows that every word was given a frequency of 1. I can't see the issue in my code, nor does it raise any errors. Keep in mind that I have had it display the sorted dupeWordList: it is correctly alphabetized and there are consecutive duplicate elements in it, so this should not be happening.
If I understand you correctly, the code below should solve your problem.
You have a list of strings (terms or words) sorted in alphabetical order.
// Okay the below list is already sorted in alphabetical order.
List<String> dupeWordList = new ArrayList<>(wordList);
To count the Frequency of words in your list, Map<String, Integer> might help you as below.
//Take a Map with Integer as value and String as key.
Map<String,Integer> map = new HashMap<String,Integer>();
//Iterate your List
for(String s : dupeWordList)
{
if(map.containsKey(s))
{
map.put(s,map.get(s)+1);
// map.get(s) is auto-unboxed, so no cast is needed.
}else
{
map.put(s,1);
}
}
Now we have a map that holds the frequency of each word or term as its value.
Hope it helps.
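Since the question works with `Word` objects rather than plain strings, here is a rough sketch of how the map could be applied back to them. The nested `Word` class below is a hypothetical stand-in with only the accessors the question mentions (`getValue`, `getFrequency`, `setFrequency`); the real class presumably has more. Note that no sorting or re-sorting is needed with this approach.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WordFrequencies {
    // Hypothetical stand-in for the question's Word class.
    static final class Word {
        private final String value;
        private int frequency;
        Word(String value) { this.value = value; }
        String getValue() { return value; }
        int getFrequency() { return frequency; }
        void setFrequency(int frequency) { this.frequency = frequency; }
    }

    // First pass counts each text value; second pass writes the count onto every Word.
    static void setFrequencies(List<Word> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (Word w : words) {
            if (w != null) {
                counts.merge(w.getValue(), 1, Integer::sum);
            }
        }
        for (Word w : words) {
            if (w != null) {
                w.setFrequency(counts.get(w.getValue()));
            }
        }
    }

    public static void main(String[] args) {
        List<Word> words = new ArrayList<>();
        for (String s : new String[] {"b", "a", "b", "c"}) {
            words.add(new Word(s));
        }
        setFrequencies(words);
        for (Word w : words) {
            System.out.println(w.getValue() + " " + w.getFrequency());
        }
    }
}
```

The list keeps its original order throughout, so the stored original-position field is not needed for this step.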

Counting occurrences in a string array and deleting the repeats using java

I'm having trouble with some code. I have read words from a text file into a String array and removed the periods and commas. Now I need to check the number of occurrences of each word, and I managed to do that as well. However, my output contains every word in the file, repeats included, with its occurrence count.
Like this:
the 2
birds 2
are 1
going 2
north 2
north 2
Here is my code:
public static String counter(String[] wordList)
{
//String[] noRepeatString = null ;
//int[] countArr = null ;
for (int i = 0; i < wordList.length; i++)
{
int count = 1;
for(int j = 0; j < wordList.length; j++)
{
if(i != j) //to avoid comparing itself
{
if (wordList[i].compareTo(wordList[j]) == 0)
{
count++;
//noRepeatString[i] = wordList[i];
//countArr[i] = count;
}
}
}
System.out.println (wordList[i] + " " + count);
}
return null;
I need to figure out 1) how to get the count values into an array, and 2) how to delete the repetitions.
As seen in the commented-out lines, I tried to use a countArr[] and a noRepeatString[] in hopes of doing that, but I got a NullPointerException.
Any thoughts on this matter will be much appreciated :)
I would first convert the array into a list, because lists are easier to operate on than arrays.
List<String> list = Arrays.asList(wordList);
Then you should create a copy of that list (you'll see in a second why):
ArrayList<String> listTwo = new ArrayList<String>(list);
Now you remove all the duplicates in the second list:
HashSet hs = new HashSet();
hs.addAll(listTwo);
listTwo.clear();
listTwo.addAll(hs);
Then you loop through the second list and get the frequency of that word in the first list. But first you should create another arrayList to store the results:
ArrayList<String> results = new ArrayList<String>;
for(String word : listTwo){
int count = Collections.frequency(list, word);
String result = word +": " count;
results.add(result);
}
Finally you can output the results list:
for(String freq : results){
System.out.println(freq);}
I have not tested this code (I can't do that right now). Please ask if there is a problem or it doesn't work. See these questions for reference:
How do I remove repeated elements from ArrayList?
One-liner to count number of occurrences of String in a String[] in Java?
How do I clone a generic List in Java?
There are some syntax issues in your code, but otherwise it works fine. Corrected:
ArrayList<String> results = new ArrayList<String>();
for(String word : listTwo){
int count = Collections.frequency(list, word);
String result = word +": "+ count;
results.add(result);
}
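If the counts really need to end up in arrays, as the question asks, one possible sketch is to tally into a map first and size the arrays afterwards. The NullPointerException in the question most likely came from indexing arrays that were still null, which this avoids. The names `WordTally`, `tally`, `distinctWords` and `countsOf` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class WordTally {
    // Counts each word once per distinct value, keeping first-seen order.
    static Map<String, Integer> tally(String[] wordList) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : wordList) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // The distinct words, in the same order as the counts returned by countsOf.
    static String[] distinctWords(Map<String, Integer> counts) {
        return counts.keySet().toArray(new String[0]);
    }

    // The counts as a primitive array, parallel to distinctWords.
    static int[] countsOf(Map<String, Integer> counts) {
        return counts.values().stream().mapToInt(Integer::intValue).toArray();
    }

    public static void main(String[] args) {
        String[] input = {"the", "birds", "are", "going", "north", "the", "birds", "going", "north"};
        Map<String, Integer> counts = tally(input);
        String[] words = distinctWords(counts);
        int[] totals = countsOf(counts);
        for (int i = 0; i < words.length; i++) {
            System.out.println(words[i] + " " + totals[i]);
        }
    }
}
```

Compared with calling `Collections.frequency` once per distinct word, this makes a single pass over the input.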

Problem implementing classifier algorithm for whitespace separated words

I have a text and split it into words separated by whitespace.
I'm classifying units, and this works when the unit and value occur in the same word (e.g. '100m', '90kg', '140°F', 'US$500'), but I'm having problems when they appear separately, each part in its own word (e.g. '100 °C', 'US$ 450', '150 km').
The classifier algorithm can tell when a word is a unit whose value is missing, and whether that value is expected on the left or the right side.
My question is how I can iterate over all the words in a list while providing the correct words to the classifier.
This is only an example of the code. I have tried it in a lot of ways.
for(String word: words){
String category = classifier.classify(word);
if(classifier.needPreviousWord()){
// ?
}
if(classifier.needNextWord()){
// ?
}
}
In other words, I need to iterate over the list classifying all the words; if the previous word is needed for the test, provide the previous word and the unit, and if the next word is needed, provide the unit and the next word. It appears simple, but I don't know how to do it.
Don't use the implicit iterator of a for-each loop, but an explicit one. A ListIterator lets you go back and forth as you like.
ListIterator<String> i = words.listIterator();
while (i.hasNext()) {
String category = classifier.classify(i.next());
if(classifier.needPreviousWord()){
i.previous();
}
if(classifier.needNextWord()){
i.next();
}
}
This is not complete, because I don't know what your classifier does exactly, but it should give you an idea on how to proceed.
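One detail worth knowing when stepping back and forth: calling `previous()` on a `ListIterator` immediately after `next()` returns the same element again, not the one before it. A hedged sketch that sidesteps this by using `nextIndex()` to look at the neighbours without moving the cursor is shown below. The class name `NeighbourWalk` and method name `triples` are hypothetical, and the classifier itself is left out because its API is not shown in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

final class NeighbourWalk {
    // Builds "previous|current|next" for every word, with "" where a neighbour is missing.
    static List<String> triples(List<String> words) {
        List<String> out = new ArrayList<>();
        ListIterator<String> it = words.listIterator();
        while (it.hasNext()) {
            int idx = it.nextIndex();          // index of the word about to be returned
            String current = it.next();
            String prev = idx > 0 ? words.get(idx - 1) : "";
            String next = it.hasNext() ? words.get(it.nextIndex()) : "";
            out.add(prev + "|" + current + "|" + next);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(triples(Arrays.asList("about", "150", "km")));
        // [|about|150, about|150|km, 150|km|]
    }
}
```

Inside the loop, `prev + current` or `current + next` could be handed to the classifier whenever it reports that it needs a neighbouring word. The `words.get(...)` calls assume a random-access list such as `ArrayList`.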
This could help.
public static void main(String [] args)
{
List<String> words = new ArrayList<String>();
String previousWord = "";
String nextWord = "";
for(int i=0; i < words.size(); i++) {
if(i > 0) {
previousWord = words.get(i-1);
}
String currentWord = words.get(i);
if(i < words.size() - 1) {
nextWord = words.get(i+1);
} else {
nextWord = "";
}
String category = classifier.classify(currentWord);
if(classifier.needPreviousWord()){
if(previousWord.length() == 0) {
System.out.println("ERROR: missing previous unit");
} else {
System.out.println(previousWord + currentWord);
}
}
if(classifier.needNextWord()){
if(nextWord.length() == 0) {
System.out.println("ERROR: missing next unit");
} else {
System.out.println(currentWord + nextWord);
}
}
}
}
