I am trying to work out how to scan a text file of a conversation and find how many positive and negative words it contains. The positive and negative words are stored in two separate text files, which are used to 'scan' the conversation file.
After it finds the positive and negative words, I am trying to tally each up and then report whether more positive or negative words were found.
I have the code below so far; it only gives me a count of the positive words. I am not looking at something like NLP at this stage, just something at a much more basic level.
I think I have the second part, which looks for the negative words, in the wrong location. I also think I need a boolean to tell me whether more positive or negative words were found, but I can't work out how to do it.
I am pretty stuck, as I am new to Java and to programming in general.
Any help would be greatly appreciated.
package omgilisearch;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.HashSet;
import java.util.Map;
import java.util.Scanner;
import java.util.Set;
import java.util.TreeMap;
public class SentimentTest {
public static void main(String[] args) throws Exception {
printAllCounts(
readWordFile("ConversationTest.txt", loadKeywords("PositiveWords.txt")));
}
public static void main1(String[] args) throws Exception {
printAllCounts(
readWordFile("ConversationTest.txt", loadKeywords("NegativeWords.txt")));
}
private static Map<String, Integer> readWordFile(
String fname, Set<String> keywords) throws FileNotFoundException
{
final Map<String, Integer> frequencyData = new TreeMap<String, Integer>();
for (Scanner wordFile = new Scanner(new FileReader(fname));
wordFile.hasNext();)
{
final String word = wordFile.next();
if (keywords.contains(word))
frequencyData.put(word, getCount(word, frequencyData) + 1);
}
return frequencyData;
}
private static void printAllCounts(Map<String, Integer> frequencyData) {
System.out.println("-----------------------------------------------");
System.out.println(" Occurrences Word");
for(Map.Entry<String, Integer> e : frequencyData.entrySet())
System.out.printf("%15d %s\n", e.getValue(), e.getKey());
System.out.println("-----------------------------------------------");
}
private static int getCount(String word, Map<String, Integer> frequencyData) {
return frequencyData.containsKey(word)? frequencyData.get(word) : 0;
}
private static Set<String> loadKeywords(String fname)
throws FileNotFoundException
{
final Set<String> result = new HashSet<String>();
for (Scanner s = new Scanner(new FileReader(fname)); s.hasNext();)
result.add(s.next());
return result;
}
}
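For the tallying step the question describes, the two counts can be computed and compared in one place. A minimal sketch, with hard-coded word sets and a hard-coded conversation standing in for the three text files (in the real program these would come from loadKeywords and the Scanner loop above):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SentimentSketch {
    // Counts how many words of the conversation appear in the given keyword set.
    static int countMatches(List<String> conversation, Set<String> keywords) {
        int count = 0;
        for (String word : conversation) {
            if (keywords.contains(word)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hard-coded stand-ins for loadKeywords("PositiveWords.txt") etc.
        Set<String> positive = new HashSet<>(Arrays.asList("good", "great", "excellent"));
        Set<String> negative = new HashSet<>(Arrays.asList("bad", "awful", "terrible"));
        List<String> conversation = Arrays.asList(
                "the", "food", "was", "great", "but", "service", "was", "bad", "really", "bad");

        int positives = countMatches(conversation, positive);
        int negatives = countMatches(conversation, negative);
        boolean morePositive = positives > negatives;
        System.out.println("Positive: " + positives + ", negative: " + negatives); // Positive: 1, negative: 2
        System.out.println(morePositive ? "More positive words." : "Not more positive words.");
    }
}
```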
You would have to have some array of so-called "bad" words (which are hard-coded) and then iterate through the whole text file, comparing every word in the array with the word you are currently inspecting. If the word matches one of the words in the array, then increase some variable that holds the number of bad words, e.g. badWords++;. I believe this approach should work.
package omgilisearch;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.HashSet;
import java.util.Map;
import java.util.Scanner;
import java.util.Set;
import java.util.TreeMap;
public class SentimentTest {
public static void main(String[] args) throws Exception {
printAllCounts(
readWordFile("ConversationTest.txt"));
}
private static Map<String, Integer> readWordFile(String string) {
return null;
}
String[] goodWordsHolder = { "good", "great", "excellent" };
{
for (int iteration = 0; iteration < goodWordsHolder.length; iteration++) {
String currentWordInText = null; // the word currently being inspected
if (goodWordsHolder[iteration].equals(currentWordInText)) {
// the word is a good word
}
}
}
private static void printAllCounts(Map<String, Integer> frequencyData) {
System.out.println("-----------------------------------------------");
System.out.println(" Occurrences Word");
for(Map.Entry<String, Integer> e : frequencyData.entrySet())
System.out.printf("%15d %s\n", e.getValue(), e.getKey());
System.out.println("-----------------------------------------------");
}
}
package omgilisearch;
import java.io.*;
public class SentimentTest {
public static void main(String[] args) {
String[] lines = new String[0];
String path = "ConversationTest.txt";
BufferedReader br = null;
try {
File file = new File(path);
br = new BufferedReader(
new InputStreamReader(
new FileInputStream(file)));
String line;
while( (line = br.readLine()) != null ) {
lines = add(line, lines);
}
br.close();
} catch(IOException e) {
System.out.println("read error: " + e.getMessage());
}
print(lines);
}
private static String[] add(String s, String[] array) {
String[] goodWordsHolder = { "good", "great", "excellent" };
for (int iteration = 0; iteration < goodWordsHolder.length; iteration++) {
String currentWordInText = null; // the word currently being inspected
if (goodWordsHolder[iteration].equals(currentWordInText)) {
// the word is a good word
}
}
return goodWordsHolder;
}
private static void print(String[] data) {
for(int i = 0; i < data.length; i++)
System.out.println(data[i]);
}
}
Arrays store multiple items of the same type, e.g. String[] badWords;. I believe you should use this, since I'm sure you will have more than one bad word that you would like to find in the conversation text; if not, then simply use one String, e.g. String badWord;.
I'm not going to write out all the code that will make it work, I'll just give you an algorithm.
public class test {
// The process of picking out all the good and bad words
public static void main(String[] args) {
// Setting up all the needed variables
// Set up all the good words
String[] goodWordsHolder = new String[2];
goodWordsHolder[0] = "firstGoodWord";
goodWordsHolder[1] = "secondGoodWord";
// Set up all the bad words
String[] badWordsHolder = new String[2];
badWordsHolder[0] = "firstBadWord";
badWordsHolder[1] = "secondBadWord";
// Set up the counters
int amountOfGoodWords = 0;
int amountOfBadWords = 0;
int currentWordInText = 0;
// boolean that will exit the loop
boolean ConversationEnded = false;
while(!ConversationEnded) {
// Compare the currentWord from the conversation with the hard coded words
for(int iteration = 0; iteration < goodWordsHolder.length; iteration++) {
if(goodWordsHolder[iteration].equals(getWordInText(currentWordInText))) {
amountOfGoodWords++;
}
}
for(int iteration = 0; iteration < badWordsHolder.length; iteration++) {
if(badWordsHolder[iteration].equals(getWordInText(currentWordInText))) {
amountOfBadWords++;
}
}
// Increase the current word value so the next time we compare the next word in the conversation will be compared
currentWordInText++;
// Check that we haven't reached the end of the conversation
if(endOfTheConversationHasBeenReached()) {
// This will exit the while loop
ConversationEnded = true;
}
}
// Now print all the information to the console
System.out.println("Amount of good Words: " + amountOfGoodWords);
System.out.println("Amount of bad Words: " + amountOfBadWords);
if(amountOfGoodWords > amountOfBadWords) {
System.out.println("There are more good words than bad words.");
}
else {
System.out.println("There are more bad words than good words.");
}
}
// The method(s) you'll have to code out yourself. I suggest you read up on the web and so on to assist you with this.
private static String getWordInText(int currentWordInText) {
// TODO Auto-generated method stub
return null;
}
private static boolean endOfTheConversationHasBeenReached() {
// TODO Auto-generated method stub
return false;
}
}
Excuse me if there are any logical errors. The code hasn't been debugged yet. ;) Hopefully this will guide you in the right direction.
I am trying to create a hashtable so I can get an ArrayList from my text file, read it, and then write the counts into another text file. I should tokenize each word and get the keys and values by counting them. So far I am still at the beginning, and I don't see what is wrong with my code: there seems to be no error, but it doesn't connect to the text file and fill the ArrayList, or my code is simply wrong. I would appreciate any help. Thanks.
This is the Map file
public class Map {
public static String fileName= "C:Users\\ruken\\OneDrive\\Desktop\\workshop.txt";
private ArrayList<String> arr = new ArrayList<String>();
public ArrayList <String>getList () {
return this.arr;
}
private Hashtable<String, Integer> map = new Hashtable<String, Integer>();
public void load(String path) {
try{
FileReader f2 = new FileReader("C:Users\\ruken\\OneDrive\\Desktop\\workshop.txt");
Scanner s = new Scanner(f2);
while (s.hasNextLine()) {
String line = s.nextLine();
String[] words = line.split("\\s");
for (int i=0;i<words.length; i++){
String word = words[i];
if (! word.isEmpty()){
System.out.println(word);
arr.add(word);
}
}
}
f2.close();
System.out.println("An error occurred");
}
catch(IOException ex1)
{
Collections.sort(arr);
System.out.println("An error occurred.");
for (String counter: arr) {
System.out.println(counter);
}
ex1.printStackTrace();
}
}
public static void main(String[] args) {
Map m =new Map();
m.load("C:Users\\ruken\\OneDrive\\Desktop\\out.txt");
}
public Object get(String word) {
return null;
}
public void put(String word, int i) {
}
}
This is the Reduce file
package com.company;
import java.io.*;
import java.util.*;
public class Reduce {
private Hashtable<String, Integer> map=new Hashtable< String, Integer>();
public Hashtable < String, Integer> getHashTable () {
return map;
}
public void setHashTable ( Hashtable < String, Integer> map){
this.map =map;
}
public void findMin () {
}
public void findMax() {
}
public void sort (ArrayList<String> arr) throws IOException {
Collections.sort(arr);
Iterator it1 = arr.iterator();
while (it1.hasNext()) {
String word = it1.next().toString();
System.out.println(word);
}
}
//constructors
public void reduce (ArrayList<String> words) {
Iterator<String> it1 =words.iterator();
while (it1.hasNext()) {
String word=it1.next();
System.out.println (word);
if (map.containsKey(word)) {
map.put(word, 1);
}
else {
int count = map.get(word);
map.put(word, count+1);
}
System.out.println( map.containsValue(word));
}
}
}
Here is a part of workshop.txt. It is a basic, simple text:
"
Acknowledgements
I would like to thank Carl Fleischhauer and Prosser Gifford for the
opportunity to learn about areas of human activity unknown to me a scant
ten months ago, and the David and Lucile Packard Foundation for
supporting that opportunity. The help given by others is acknowledged on
a separate page.
19 October 1992
*** *** *** ****** *** *** ***
INTRODUCTION
The Workshop on Electronic Texts (1) drew together representatives of
various projects and interest groups to compare ideas, beliefs,
experiences, and, in particular, methods of placing and presenting
historical textual materials in computerized form. Most attendees gained
much in insight and outlook from the event. But the assembly did not
form a new nation, or, to put it another way, the diversity of projects
and interests was too great to draw the representatives into a cohesive,
action-oriented body.(2)"
Counting word frequency in text can be accomplished using the Java stream API.
Here is my implementation, followed by explanatory notes.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Hashtable;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collectors;
public class WordFreq {
public static void main(String[] args) {
Path path = Paths.get("workshop.txt");
Function<String, String> keyMapper = Function.identity();
Function<String, Integer> valueMapper = (word) -> Integer.valueOf(1);
BinaryOperator<Integer> mergeFunction = (a, b) -> Integer.valueOf(a.intValue() + b.intValue());
Supplier<Hashtable<String, Integer>> mapSupplier = () -> new Hashtable<>();
try {
Map<String, Integer> map = Files.lines(path)
.flatMap(line -> Arrays.stream(line.split("\\b")))
.filter(word -> word.matches("^\\w+$"))
.map(word -> word.toLowerCase())
.collect(Collectors.toMap(keyMapper, valueMapper, mergeFunction, mapSupplier));
BiConsumer<String, Integer> action = (k, v) -> System.out.printf("%3d %s%n", v, k);
map.forEach(action);
}
catch (IOException xIo) {
xIo.printStackTrace();
}
}
}
Method lines() in class java.nio.file.Files creates a stream of the lines of text in the file. In this case the file is your workshop.txt file.
For each line of the file that is read, I split it into words using method split() in class java.lang.String and convert the array returned by method split() into another stream.
Actually each line of text is split at every word boundary so the array of words that method split() returns may contain strings that aren't really words. Therefore I filter the "words" in order to extract only real words.
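For example, a boundary split keeps punctuation runs as their own array elements, which is exactly why the filter step is needed. A small demo (the sample sentence is made up):

```java
import java.util.Arrays;

public class SplitDemo {
    // Splitting at \b keeps punctuation runs as array elements;
    // only tokens matching \w+ are real words.
    static long realWordCount(String line) {
        return Arrays.stream(line.split("\\b"))
                     .filter(part -> part.matches("^\\w+$"))
                     .count();
    }

    public static void main(String[] args) {
        // The split produces word tokens and punctuation tokens side by side.
        System.out.println(Arrays.toString("end. Start".split("\\b")));
        System.out.println(realWordCount("end. Start")); // prints 2: "end" and "Start"
    }
}
```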
Then I convert each word to lower case so that my final map will be case-insensitive. In other words, the word The and the word the will be considered the same word.
Finally I create a Map where the map key is a distinct word in the text of file workshop.txt and the map value is an Integer which is the number of occurrences of that word in the text.
Since you stipulated that the Map must be a Hashtable, I explicitly created a Hashtable to store the results of the collect operation on the stream.
The last part of the above code displays the contents of the Hashtable.
I sorted out the first part, "Map", as below; I now have an alphabetically sorted array, as follows. Now I should count the tokenized key values.
"..
yet
yet
yet
yet
yet
yielded
you
young
zeal
zero.
zooming
..."
package com.company;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;
import java.util.Collections;
public class Map {
public static String fileName= "C:\\Users\\ruken\\OneDrive\\Desktop\\workshop.txt";
private ArrayList<String> arr = new ArrayList<String>();
public ArrayList <String>getList () {
return this.arr;
}
private Hashtable<String, Integer> map = new Hashtable<String, Integer>();
public void load() {
try{
FileReader f2 = new FileReader("C:\\Users\\ruken\\OneDrive\\Desktop\\workshop.txt");
Scanner s = new Scanner(f2);
while (s.hasNextLine()) {
String line = s.nextLine();
String[] words = line.split("\\s");
for (int i=0;i<words.length; i++){
String word = words[i];
if (! word.isEmpty()){
System.out.println(word);
arr.add(word);
}
}
}
f2.close();
System.out.println();
}
catch(IOException ex1){
System.out.println("An error occurred.");
ex1.printStackTrace(); }
{
Collections.sort(arr);
System.out.println("Sorted.");
for (String counter: arr) {
System.out.println(counter);
}
}
}
public static void main(String[] args) {
Map m =new Map();
m.load();
}
}
The second part, which does the reducing, is:
package com.company;
import java.io.*;
import java.util.*;
import java.io.FileWriter;
import java.io.IOException;
public class Reduce {
private Hashtable<String, Integer> map = new Hashtable<String, Integer>();
public Hashtable<String, Integer> getHashTable() {
return map;
}
public void setHashTable(Hashtable<String, Integer> map) {
this.map = map;
}
//constructors
public void reduce (ArrayList<String> arr) {
Iterator<String> it1 = arr.iterator();
while (it1.hasNext()) {
String word = it1.next();
System.out.println(word);
if (map.containsKey(word)) {
int a = (int) map.get(word);
a++;
map.put(word, a);
} else {
map.put(word, 1);
}
}
}
public void write () {
try {
FileWriter f1 = new FileWriter("C:\\Users\\ruken\\OneDrive\\Desktop\\output.txt");
Iterator<String> it1 = map.keySet().iterator();
while (it1.hasNext()) {
String word = it1.next().toString();
f1.write(word + "" + ":" + "" + map.get(word) + "\n" );
}
f1.close();
} catch (IOException e) {
e.printStackTrace();
}
}
public static void main(String[] args) {
Map m =new Map();
m.load();
Reduce r = new Reduce ();
ArrayList<String> arr= m.getList();
r.reduce(arr);
r.write();
}
}
I have a text file with about 5000 names or so, each on a separate line.
I have already added all the names to an ArrayList "names", but
I am not able to add anything to my ArrayList scores.
I don't know where I'm going wrong; especially in the addScores method, nothing gets output at all.
If any more information is required, please ask.
And thanks for the help.
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
public class ScoringNames {
BufferedReader x = null;
String location = "xxxxx\\names.txt";
ArrayList<String> names = new ArrayList<String>();
ArrayList<Integer> scores = new ArrayList<Integer>();
public void readFile(){ //Opens file, and prints every line.
try{
x = new BufferedReader(new FileReader(location));
} catch(FileNotFoundException e) {
e.printStackTrace();
}
try{
String name = x.readLine();
while(name != null){
//System.out.println(name);
names.add(name);
name = x.readLine();
}
} catch(IOException e) {
e.printStackTrace();
}
}
public int nameScore(String name){
this.readFile(); //Open file and read, so that values are added to <names>
this.sortNames();
int score = 0;
char[] tempName = name.toCharArray();
for (char i : tempName){
score += alphValue(i);
}
return score;
}
public void addScores(){
for(String x : names){
scores.add(nameScore(x));
}
}
public void printScores(){
for(int counter: scores)
System.out.println(counter);
}
public static void main(String[] args) {
ScoringNames abc = new ScoringNames();
abc.readFile();
abc.addScores();
abc.printScores();
}
}
The error I get is:
Exception in thread "main" java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
at java.util.ArrayList$Itr.next(ArrayList.java:851)
at ScoringNames.addScores(ScoringNames.java:148)
at ScoringNames.main(ScoringNames.java:163)
You are modifying the list names while iterating over it in the for loop of the addScores() method.
When you call the nameScore(String) method, you don't need to read the file again, as all the data has already been read and stored in the names list. You only need to evaluate the String and return the score.
public int nameScore(String name){
int score = 0;
char[] tempName = name.toCharArray();
for (char i : tempName){
score += alphValue(i);
}
return score;
}
Change the method nameScore by removing its top two lines:
this.readFile(); // Open file and read, so that values are added to <names>
this.sortNames();
These are unnecessary, and readFile() is the reason for the error.
The error occurs because you try to modify this.names inside a for-each loop (for (String x : names)), and that is forbidden in Java; it throws a ConcurrentModificationException.
public int nameScore(String name) {
int score = 0;
char[] tempName = name.toCharArray();
for (char i : tempName) {
score += alphValue(i);
}
return score;
}
I'm trying to create a method which searches through an ArrayList of words for a specific String. If the String does not equal any word in the ArrayList, the word is added to the list; if the word already exists in the list, it counts how many times the word occurs and then adds one more, which will represent the last String input.
This is what I've got so far in my code:
public void leggTilOrd(String ord) {
if (Ord.contains(ord)) {
teller++;
}
if (!Ord.contains(ord)) {
Ord.add(ord);
}
System.out.println(teller);
}
Obviously this will only add one to the counter (teller), so what I'm trying to achieve is to add 1 on top of all the occurrences of that specific String in the list, and this is where I'm stuck.
Edit: I should also mention that Ord is an ArrayList I've created earlier in the code.
Here's the full code:
import java.util.Scanner;
import java.io.File;
import java.util.ArrayList;
public class Ordliste {
private ArrayList<String> Ord = new ArrayList<String>();
private int teller = 0;
public void lesBok (String filnavn) throws Exception{
Scanner fil = new Scanner(new File(filnavn));
while(fil.hasNextLine()){
Ord.add(fil.next());
} fil.close();
}
public void leggTilOrd(String ord){
if(Ord.contains(ord)){
teller++;
} if (!Ord.contains(ord)){
Ord.add(ord);
} System.out.println(teller);
}
}
Instead of doing this, you can keep track of the strings in your list using a Set:
Set<String> set = new HashSet<String>();
public void leggTilOrd(String ord) {
if (set.contains(ord)) {
teller++;
} else {
Ord.add(ord);
set.add(ord);
}
System.out.println(teller);
}
Are you tied to using an ArrayList? I'd recommend using a Map<String, Integer> instead and store the number of occurrences of a specific string as value in the map:
if (map.get(ord) != null) {
map.put(ord, map.get(ord) + 1);
}
else {
map.put(ord, 1);
}
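A self-contained version of that idea, with the map declared alongside the method (the sample words are made up):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCountSketch {
    // Counts occurrences of each word; the map value is the number of
    // times the word has been added so far.
    static Map<String, Integer> count(List<String> words) {
        Map<String, Integer> map = new HashMap<>();
        for (String ord : words) {
            if (map.get(ord) != null) {
                map.put(ord, map.get(ord) + 1);
            } else {
                map.put(ord, 1);
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(Arrays.asList("hei", "hei", "takk"));
        System.out.println(counts.get("hei")); // prints 2
    }
}
```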
I'm still not sure exactly what you are trying to accomplish; do you increment teller each time you add a different word?
This is a simple way to do what I believe you are attempting, and should get you going in the right direction.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class test
{
static int teller = 0;
static List<String> words = new ArrayList<>(Arrays.asList("dog", "cat", "dog"));
public static void main(String[] args)
{
System.out.println(teller);
addAndCount("dog");
System.out.println(teller);
teller = 0;
addAndCount("cat");
System.out.println(teller);
teller = 0;
addAndCount("fish");
System.out.println(teller);
}
public static void addAndCount(String newWord)
{
teller += 1 + Math.toIntExact(
words.stream().filter(string -> newWord.equals(string)).count());
words.add(newWord);
}
}
I am making a game called Word Ladders. The game goes that you change one character of a word each time to get a new word, and eventually you want to reach the end word from a start word. All the words are contained in a dictionary, for which I used a HashSet. Basically I am at the point where I have constructed my graph in a HashMap, where the String key is the current word and the ArrayList contains all of the words that are one character off. What I am trying to do now is to search the word key against all of the words stored in the ArrayList, and keep doing that over and over until I hit the target end word, then record the path that was taken, maybe in a list or something.
So What I got so far is:
package ladder;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Scanner;
/**
*
* @author Dolphin
*/
public class Ladder {
public HashMap<String, ArrayList<String>> graph = new HashMap<String, ArrayList<String>>();
public HashSet<String> words = new HashSet<String>();
public ArrayList<String> path = new ArrayList<String>();
public Queue<String> ladder = new LinkedList<>();
public ArrayList<ArrayList<String>> path1 = new ArrayList<ArrayList<String>>();
public Queue<String> myQ = new LinkedList<>();
public String startingWord;
public String endingWord;
public void Ladder1(String filename, String start, String end)
throws FileNotFoundException {
startingWord = start;
endingWord = end;
loadlist(filename, start.length());//we cut the list down
for (String thewords : words) {
graph.put(thewords, new ArrayList<String>());
}
makeGraph(words, graph);
Bfs(startingWord, endingWord, graph, path);
//GeneratePath(graph,path,end,path1);
//System.out.print(path1);
// makeLadder(startingWord, endingWord, words);
//next we want to add each word to the array list, and make new array lists of words that are one character off
//then you find the shortest path of the graph
}
//first thing we do is separate the words from the list by the length of the start word
public void loadlist(String filename, Integer length)
throws FileNotFoundException {
Scanner scan = new Scanner(new File(filename));
while (scan.hasNextLine()) {
String word = scan.nextLine();
if (word.length() == length) {
words.add(word);
}
}
}
public void makeGraph(HashSet<String> words, HashMap<String, ArrayList<String>> graph) {
for (String x : graph.keySet()) {
ArrayList<String> xterms = graph.get(x);
for (String y : graph.keySet()) {
ArrayList<String> yterms = graph.get(y);
if (getDifference(x, y) == 1) {
if (!xterms.contains(y)) {
xterms.add(y);
}
if (!yterms.contains(x)) {
yterms.add(x);
}
}
graph.put(y, yterms);
}
graph.put(x, xterms);
}
//System.out.print(graph);
}
public void Bfs(String start, String end, HashMap<String, ArrayList<String>> graph, ArrayList<String> path) {
ArrayList<String> newpath = path;
ArrayList<String> stuff = graph.get(start);//we get the arraylist
for (int i = 0; i < stuff.size(); i++) {
myQ.add(stuff.get(i));
}
while (!myQ.isEmpty()) {
String w = myQ.poll();
newpath.add(w);
if (!w.equals(end)) {
newpath.clear();
Bfs(w, end, graph, newpath);
} else {
newpath = path;
}
}
}
public int getDifference(String word1, String word2) {
int difference = 0;
for (int i = 0; i < word1.length(); i++) {
if (word1.charAt(i) != word2.charAt(i)) {
difference++;
}
}
return difference;
}
public static void main(String[] args)
throws FileNotFoundException {
// TODO code application logic here
Ladder lad = new Ladder();
lad.Ladder1("wordList.txt", "roil", "food");
}
}
So pretty much my HashMap graph contains all the words in a file I pass through. I want to conduct a breadth-first search on that graph, where the graph is the HashMap<String, ArrayList<String>> I store it in, and I have a start word and an ending word that I want to search for and use. I also want to store the path that is used to get from the start word to the end word and eventually print it out. If someone could help implement the breadth-first search or guide me on how to do that, it would be really appreciated. I've been stuck on this for a very long time.
So I've updated my Bfs to be like this, from what was described below:
public void Bfs(String start, String end, HashMap<String, ArrayList<String>> graph, ArrayList<String> path) {
ArrayList<String> newpath = path;
ArrayList<String> stuff = graph.get(start);//we get the arraylist
for (int i = 0; i < stuff.size(); i++) {
myQ.add(stuff.get(i));
}
while (!myQ.isEmpty()) {
String w = myQ.poll();
newpath.add(w);
if (!w.equals(end)) {
newpath.clear();
Bfs(w, end, graph, newpath);
} else {
newpath = path;
}
}
}
I don't know how to store the word path, and I'm getting a stack overflow error. If anyone knows how to fix this, it would be greatly appreciated.
Thanks ahead of time.
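For reference, a breadth-first search over a graph of this shape is usually written iteratively with a queue, a visited set, and a parent map; the path is then rebuilt backwards from the end word, which avoids both the recursion (the stack overflow) and the path-storage problem. A minimal sketch against the same HashMap<String, ArrayList<String>> shape used above (the small word graph in main is a made-up example, not the wordList.txt data):

```java
import java.util.*;

public class BfsSketch {
    // Returns the shortest ladder from start to end, or an empty list if none exists.
    static List<String> bfs(Map<String, ArrayList<String>> graph, String start, String end) {
        Map<String, String> parent = new HashMap<>(); // word -> word we reached it from
        Queue<String> queue = new LinkedList<>();
        Set<String> visited = new HashSet<>();
        queue.add(start);
        visited.add(start);
        while (!queue.isEmpty()) {
            String word = queue.poll();
            if (word.equals(end)) {
                // Walk the parent links backwards to recover the path.
                LinkedList<String> path = new LinkedList<>();
                for (String w = end; w != null; w = parent.get(w)) {
                    path.addFirst(w);
                }
                return path;
            }
            for (String next : graph.getOrDefault(word, new ArrayList<>())) {
                if (visited.add(next)) { // add() returns false if already visited
                    parent.put(next, word);
                    queue.add(next);
                }
            }
        }
        return new ArrayList<>(); // no ladder found
    }

    public static void main(String[] args) {
        Map<String, ArrayList<String>> graph = new HashMap<>();
        graph.put("cold", new ArrayList<>(Arrays.asList("cord")));
        graph.put("cord", new ArrayList<>(Arrays.asList("cold", "word", "card")));
        graph.put("word", new ArrayList<>(Arrays.asList("cord", "ward")));
        graph.put("card", new ArrayList<>(Arrays.asList("cord", "ward")));
        graph.put("ward", new ArrayList<>(Arrays.asList("card", "word", "warm")));
        graph.put("warm", new ArrayList<>(Arrays.asList("ward")));
        System.out.println(bfs(graph, "cold", "warm")); // prints [cold, cord, word, ward, warm]
    }
}
```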
I have a working word occurrence program that took me a while to code (still new at Java) and I was wondering if I could get a little assistance. Here is my code that I have so far:
import java.io.*;
import java.util.ArrayList;
import java.util.List;
public class TestWordOccurenceProgram {
public static void main(String[] args) {
String thisLine = null;
try {
FileReader fr = new FileReader("myTextDocument.txt");
BufferedReader br = new BufferedReader(fr);
//List<String> wordList = new ArrayList<>();
List<String> words = new ArrayList<>();
// make ArrayList of integers
List<Integer> counts = new ArrayList<>();
String word = "";
while ((thisLine = br.readLine()) != null ) {
word = word.concat(thisLine);
word = word.concat(" ");
}
String[] wordList = word.split("\\s");
for (int i = 0; i < wordList.length; i++) {
String temp = wordList[i];
if(words.contains(temp)) {
int x = words.indexOf(temp);
int value = counts.get(x);
value++;
counts.set(x, value);
}
else {
words.add(temp);
counts.add(1);
}
}
for (int i = 0; i < words.size(); i++) {
System.out.println(words.get(i) + ": " + counts.get(i));
}
br.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
System.exit(1);
} catch (IOException e) {
e.printStackTrace();
System.exit(1);
}
}
}
Here is what "myTextDocument.txt" has:
i am a rabbit
a happy rabbit am
yay i am a rabbit
a rabbit i am yay
Here is my output:
i: 3
am: 4
a: 4
rabbit: 4
happy: 1
yay: 2
Does anyone know how I could arrange these items from the highest number of word occurrences to the lowest? Any help would be great!
You can use a Map instead of a List, and use a compare method to sort the map by its values.
Refer to this code:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
public class PQ {
public static void main(String[] args) {
String thisLine = null;
try {
FileReader fr = new FileReader("D:\\test.txt");
BufferedReader br = new BufferedReader(fr);
HashMap<String,Integer> map = new HashMap<String,Integer>();
ValueComparator comparator = new ValueComparator(map);
TreeMap<String, Integer> treemap = new TreeMap<String, Integer>(comparator);
while((thisLine = br.readLine()) != null){
String[] str = thisLine.split("\\s+");
for(String s:str){
if(map.containsKey(s)){
Integer i = map.get(s);
i++;
map.put(s,i);
}else{
map.put(s, 1);
}
}
}
treemap.putAll(map);
System.out.println(treemap);
br.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
System.exit(1);
} catch (IOException e) {
e.printStackTrace();
System.exit(1);
}
}
}
class ValueComparator implements Comparator<String>{
Map<String, Integer> base;
public ValueComparator(Map<String, Integer> base) {
this.base = base;
}
public int compare(String a, String b) {
if (base.get(a) >= base.get(b)) {
return -1;
} else {
return 1;
}
}
}
Rather than using two separate lists (one with words, one with counts), why not create a WordAndCount object that has something like getWord and getCount methods? This WordAndCount class can implement Comparable, where you do comparisons based on count. Then, you can store a single List<WordAndCount>, and just sort the single list using Collections.sort.
Roughly, the outline could look like this:
public class WordAndCount implements Comparable<WordAndCount> {
private String word;
private int count;
public WordAndCount(String word) {...}
public void incrementCount() {...}
public int compareTo(WordAndCount other) {...}
}
Wrapping up the combination into a single class makes this much easier to solve, as it provides the easy link between word and its count.
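Filled in, the outline might look like the following sketch; the accessors and the descending-by-count ordering are assumptions about the intended behavior:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class WordAndCount implements Comparable<WordAndCount> {
    private final String word;
    private int count;

    public WordAndCount(String word) {
        this.word = word;
        this.count = 1; // an object is created the first time a word is seen
    }

    public String getWord() { return word; }
    public int getCount() { return count; }
    public void incrementCount() { count++; }

    // Higher counts compare as "smaller", so Collections.sort
    // produces most-frequent-first order.
    public int compareTo(WordAndCount other) {
        return Integer.compare(other.count, this.count);
    }

    public static void main(String[] args) {
        List<WordAndCount> list = new ArrayList<>();
        WordAndCount rabbit = new WordAndCount("rabbit");
        rabbit.incrementCount();
        rabbit.incrementCount(); // "rabbit" seen 3 times in total
        list.add(new WordAndCount("happy"));
        list.add(rabbit);
        Collections.sort(list);
        System.out.println(list.get(0).getWord()); // prints rabbit
    }
}
```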
I would recommend using Collections in Java for this, but you can instead use temp variables.
So the idea is to sort by counts. Pseudo-code before outputting:
int tempCount;
String tempWord;
for (int i = 1; i < counts.size(); i++) {
if (counts.get(i) > counts.get(i-1)) { // a larger count should bubble up
tempCount = counts.get(i-1);
tempWord = words.get(i-1);
counts.set(i-1, counts.get(i));
counts.set(i, tempCount);
words.set(i-1, words.get(i));
words.set(i, tempWord);
}
}
You'd need an extra loop around that to order them completely, but hopefully this gives you the right idea.