Using one text file to search through another text file - java

So I've been trying to get this to work for some time. Let me preface this by saying that I'm not a programmer; it's more of a hobby that I've recently taken up. I've been trying to get two text files to search through each other line by line, i.e. one has a bunch of words (around 10, one per line), and the other has many more (close to 500), also one per line. What I would like is for my program to say how many times each of the words in the smaller text file appears in the larger one. What I have so far is:
import java.util.Scanner;
import java.io.File;
import java.util.regex.Pattern;
public class StringSearch
{
public static void main (String args[]) throws java.io.IOException
{
int tot = 0;
Scanner scan = null;
Scanner scan2 = null;
String str = null;
String str2 = null;
File file = new File("C:\\sample2.txt");
File file2 = new File("C:\\sample3.txt");
scan = new Scanner(file);
scan2 = new Scanner(file2);
while (scan.hasNextLine())
{
str = scan.nextLine();
tot = 0;
while (scan2.hasNextLine())
{
str2 = scan2.nextLine();
if(str.equals(str2))
{
tot++;
}
}
System.out.println("The String = " + str + " and it occurred " + tot + " times");
}
}
}
Not sure why this isn't working. It reads the first word in the first text file fine and counts how many times it appears in the second one, but then it just stops and doesn't move on to the second word in the first file. I hope that makes sense. Something is wrong with the second while loop, I think, but I have no idea what.
So, any help would be greatly appreciated. I'm hoping to get this to work and move on to more complicated projects in the future. Gotta start somewhere right?
Cheers Guys

The issue you are running into is that you are nesting one scanner inside another. With the scanners nested this way, scan2 reads through its entire text file for the first word; after that first pass it has consumed the whole file, so scan2.hasNextLine() never returns true again.
A better way to achieve what you want is what remyabel suggested: create a list containing all of the words from your small file, and iterate through it for every word in the other file. You also need something to keep track of how many times each word is hit, such as a HashMap.
It would look something along the lines of this:
// Requires java.io.File, java.util.ArrayList, java.util.HashMap and java.util.Scanner
Scanner scan = new Scanner(new File("C:\\sample2.txt"));
Scanner scan2 = new Scanner(new File("C:\\sample3.txt"));
//Will contain all of your words to check against
ArrayList<String> dictionary = new ArrayList<String>();
//Contains the number of times each word is hit
HashMap<String,Integer> hits = new HashMap<String, Integer>();
while(scan.hasNextLine())
{
    String word = scan.nextLine();
    dictionary.add(word);
    hits.put(word, 0);
}
//Read the large file once, comparing each line against every dictionary word
while (scan2.hasNextLine())
{
    String line = scan2.nextLine();
    for(String word: dictionary)
    {
        if(word.equals(line))
        {
            hits.put(word, hits.get(word) + 1);
        }
    }
}
for(String word: dictionary)
{
    System.out.println("The String = " + word + " and it occurred " + hits.get(word) + " times");
}
scan.close();
scan2.close();

Create a buffered reader and read the first file into a map of <String, Integer>:
// Requires java.io.BufferedReader, java.io.FileReader, java.util.HashMap and java.util.Map
BufferedReader words = new BufferedReader(new FileReader(args[0]));
Map<String, Integer> m = new HashMap<String, Integer>();
String word;
// readLine() returns null at the end of the file
while ((word = words.readLine()) != null) {
    if (word.trim().length() > 0) {
        m.put(word, 0);
    }
}
words.close();
Then read the words list and increment the map's value each time you find one:
BufferedReader listOfWords = new BufferedReader(new FileReader(args[1]));
while ((word = listOfWords.readLine()) != null) {
    if (m.get(word) != null) {
        m.put(word, m.get(word) + 1);
    }
}
listOfWords.close();
Then print the results:
for (String key : m.keySet()) {
    if (m.get(key) > 0) {
        System.out.println("The String = " + key + " occurred " + m.get(key) + " times");
    }
}

Your approach with nested loops would scan the second file once for every word in the first one, which is highly inefficient. I suggest loading the first file into a HashMap.
Not only would this give you quick lookups, it would also make updating the occurrence counts easy. You would be scanning the second file just once, and any duplicates in the first file would automatically be ignored (as the results would be the same).
Map<String, Integer> wordCounts = new HashMap<String, Integer>();
Scanner scanner = new Scanner("one\nfive\nten");
while (scanner.hasNextLine()) {
wordCounts.put(scanner.nextLine(), 0);
}
scanner.close();
scanner = new Scanner("one\n" + // 1 time
"two\nthree\nfour\n" +
"five\nfive\n" + // 2 times
"six\nseven\neight\nnine\n" +
"ten\nten\nten"); // 3 times
while (scanner.hasNextLine()) {
String word = scanner.nextLine();
Integer integer = wordCounts.get(word);
if (integer != null) {
wordCounts.put(word, ++integer);
}
}
scanner.close();
for (String word : wordCounts.keySet()) {
int count = wordCounts.get(word);
if (count > 0) {
System.out.println("'" + word + "' occurs " + count + " times.");
}
}
Output (the order may vary, since HashMap does not guarantee iteration order):
'ten' occurs 3 times.
'five' occurs 2 times.
'one' occurs 1 times.

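It's just a simple logic issue.
Add the following statement right after the System.out.println call, so that the inner scanner starts again from the top of the second file for each word:
scan2 = new Scanner(file2);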
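A minimal sketch of that fix, assuming in-memory strings stand in for the two files (the class and method names here are illustrative and not from the question). The inner Scanner is created fresh for each word, so it always starts at the beginning of the larger text:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

public class RescanSketch {
    // Counts how often each line of `small` appears as a line of `large`.
    static Map<String, Integer> countMatches(String small, String large) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        try (Scanner outer = new Scanner(small)) {
            while (outer.hasNextLine()) {
                String word = outer.nextLine();
                int tot = 0;
                // A new Scanner per word: the previous one is already exhausted.
                try (Scanner inner = new Scanner(large)) {
                    while (inner.hasNextLine()) {
                        if (word.equals(inner.nextLine())) {
                            tot++;
                        }
                    }
                }
                totals.put(word, tot);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        Map<String, Integer> totals = countMatches("one\nfive", "one\nfive\nfive\nsix");
        for (Map.Entry<String, Integer> e : totals.entrySet()) {
            System.out.println("The String = " + e.getKey()
                    + " and it occurred " + e.getValue() + " times");
        }
    }
}
```

With real files, the same shape applies with new Scanner(file) and new Scanner(file2). It still rereads the larger file once per word, which is acceptable for roughly 10 words against 500 lines but slower than the map-based answers.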

Related

I need a program that will ask the user to enter the information to save, line to line in a file. How can I do it?

It has to look like this:
Please, choose an option:
1. Read a file
2. Write in a new file
2
File name? problema.txt
How many lines do you want to write? 2
Write line 1: Hey
Write line 2: How are you?
Done! The file problema.txt has been created and updated with the content given.
I have tried various approaches but have not succeeded. First I did it with a two-dimensional array, but I cannot jump to the next line.
Then I tried it with the newLine() method, without the array, but it does not let me save more than one word.
Attempt 1
System.out.println("How many lines do you want to write? ");
int mida = sc.nextInt();
PrintStream escriptor = new PrintStream(f);
String [][] dades = new String [mida][3];
for (int i = 0; i < dades.length; i++) {
System.out.println("Write line " + i + " :");
for (int y=0; y < dades[i].length; y++) {
String paraula = sc.next();
System.out.println(paraula + " " + y);
dades[i][y] = paraula;
escriptor.print(" " + dades[i][y]);
}
escriptor.println();
}
Attempt 2
System.out.println("How many lines do you want to write? ");
int mida = sc.nextInt();
PrintStream escriptor = new PrintStream(f);
BufferedWriter ficheroSalida = new BufferedWriter(new FileWriter(new File(file1)));
for (int i = 0; i < mida; i++) {
System.out.println("Write line " + i + " :");
String paraula = sc.next();
ficheroSalida.write (paraula);
ficheroSalida.newLine();
ficheroSalida.flush();
}
System.out.println("Done! The file " + fitxer + " has been created and updated with the content given. ");
escriptor.close();
Attempt 1:
Write line 1: Hey How are
Write line 1: you...
Attempt 2:
Write line 1: Hey
Write line 2: How
Write line 3: are
Write line 4: you
Write line 5: ?
Well, you're almost there. First, I'd use a java.io.FileWriter in order to write the strings to a file.
It's not really necessary to use an array here if you just want to write the lines to a file.
You should also use the try-with-resources statement to create your writer. This makes sure that escriptor.close() gets called even if there is an error. You don't need to call .flush() in this case either, because that is done before the handle gets closed. It was good that you intended to do this on your own, but in general it's safer to use this kind of statement whenever possible.
import java.io.*;
import java.util.Scanner;
public class Example {
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
File f = new File("/tmp/output.txt");
System.out.println("How many lines do you want to write? ");
int mida = sc.nextInt();
sc.nextLine(); // Consume next empty line
try (FileWriter escriptor = new FileWriter(f)) {
for (int i = 0; i < mida; i++) {
System.out.println(String.format("Write line %d:", i + 1));
String paraula = sc.nextLine();
escriptor.write(String.format("%s\n", paraula));
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
In cases where your text file is kind of small and usage of streamreaders/streamwriters is not required, you can read the text, add what you want and write it all over again. Check this example:
import java.io.*;
import java.util.Scanner;
public class ReadWrite {
private static Scanner scanner;
public static void main(String[] args) throws FileNotFoundException, IOException {
scanner = new Scanner(System.in);
File desktop = new File(System.getProperty("user.home"), "Desktop");
System.out.println("Yo, which file would you like to edit from " + desktop.getAbsolutePath() + "?");
String fileName = scanner.next();
File textFile = new File(desktop, fileName);
if (!textFile.exists()) {
System.err.println("File " + textFile.getAbsolutePath() + " does not exist.");
System.exit(0);
}
String fileContent = readFileContent(textFile);
System.out.println("How many lines would you like to add?");
int lineNumber = scanner.nextInt();
for (int i = 1; i <= lineNumber; i++) {
System.out.println("Write line number #" + i + ":");
String line = scanner.next();
fileContent += line;
fileContent += System.lineSeparator();
}
//Write all the content again
try (PrintWriter out = new PrintWriter(textFile)) {
out.write(fileContent);
out.flush();
}
scanner.close();
}
private static String readFileContent(File f) throws FileNotFoundException, IOException {
try (BufferedReader br = new BufferedReader(new FileReader(f))) {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
sb.append(line);
sb.append(System.lineSeparator());
line = br.readLine();
}
String everything = sb.toString();
return everything;
}
}
}
An execution of the example would be:
Yo, which file would you like to edit from C:\Users\George\Desktop?
hello.txt
How many lines would you like to add?
4
Write line number #1:
Hello
Write line number #2:
Stack
Write line number #3:
Over
Write line number #4:
Flow
with the file containing after:
Hello
Stack
Over
Flow
And if you run again, with the following input:
Yo, which file would you like to edit from C:\Users\George\Desktop?
hello.txt
How many lines would you like to add?
2
Write line number #1:
Hey
Write line number #2:
too
text file will contain:
Hello
Stack
Over
Flow
Hey
too
However, if you try this with huge files, the whole content may not fit in memory and an OutOfMemoryError will be thrown. For small files, it is fine.
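For larger files, one alternative is to append to the file instead of reading it all back first. The following is only a sketch under that assumption: it swaps in java.nio's Files.newBufferedWriter with the APPEND option, which is not what the answer above uses, and the class name, method name and temp file are made up for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

public class AppendSketch {
    // Appends each line to the file, creating the file if it does not exist yet.
    static void appendLines(Path file, List<String> lines) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            for (String line : lines) {
                out.write(line);
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical target: a temp file stands in for hello.txt on the desktop.
        Path file = Files.createTempFile("hello", ".txt");
        appendLines(file, Arrays.asList("Hello", "Stack", "Over", "Flow"));
        appendLines(file, Arrays.asList("Hey", "too"));
        System.out.println(Files.readAllLines(file, StandardCharsets.UTF_8));
    }
}
```

Because existing content is never loaded, memory use depends only on the lines being added, not on the size of the file.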

Read a word from a file

If someone could help me figure out how to search if a word exists in a file, I would greatly appreciate it. I do know how to read an entire text file though.
And this is what I have so far:
public static void main(String[] args) throws IOException {
File file = new File("words.txt");
Scanner sc = new Scanner(System.in);
System.out.println("Enter a word you would like to search for:");
String word = sc.nextLine();
List<String> words = new ArrayList<>();
try {
sc = new Scanner(file).useDelimiter( ",");
while (sc.hasNext()) {
final String wordFromFile = sc.nextLine();
if (wordFromFile.contains(word)) {
// a match!
System.out.println("The entered word: " + word + " exists in the dictionary");
break;
}
}
} catch (IOException e) {
System.out.println(" cannot write to file " + file.toString());
}
}
}
Just iterate through all the words in the file and insert each into a HashSet first. This takes linear time, O(n); there is no way around that, since you have to read in the whole file.
Assuming one word per line, it's like:
HashSet<String> set = new HashSet<>();
while (sc.hasNextLine()) {
    set.add(sc.nextLine());
}
If someone is a stickler and really wants it read into a list-type collection, you can generate a HashSet from the list like this:
Set<String> set = new HashSet<>(wordList);
Note: this conversion is also O(n), so reading into a list and then converting costs O(2n), which is still O(n), but is far from optimal if the list is long.
The lookup and/or insertion of the word you check can then be done in O(1) time.
if (set.contains(word)) {
//...blah..blah...bla...
} else {
set.add(word);
}
Hence the hash in the name HashSet.
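Put together, the pieces above might look like the sketch below. It assumes one word per line and uses a string in place of words.txt so that it runs on its own; the class and method names are illustrative only.

```java
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public class WordSetSketch {
    // Loads one word per line into a set, skipping blank lines.
    static Set<String> loadWords(Scanner source) {
        Set<String> set = new HashSet<>();
        while (source.hasNextLine()) {
            String line = source.nextLine().trim();
            if (!line.isEmpty()) {
                set.add(line);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        // A string stands in for new Scanner(new File("words.txt")).
        Set<String> set = loadWords(new Scanner("apple\nbanana\n\ncherry"));
        String word = "banana";
        if (set.contains(word)) {
            System.out.println("The entered word: " + word + " exists in the dictionary");
        } else {
            set.add(word);
        }
    }
}
```

Note that contains() is an exact match, unlike the String.contains() substring test in the question's code.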
This might help you understand
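// new Scanner(file) and System.in.read() throw checked exceptions, so declare them
public static void main(String a[]) throws IOException {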
File file = new File("words.txt");
Scanner sc = new Scanner(System.in);
System.out.println("Enter a word you would like to search for:");
String word = sc.nextLine();
boolean exist = false;
List<String> words = new ArrayList<String>();
sc = new Scanner(file);
while(sc.hasNext()){
words.add(sc.next());
}
for(int i=0;i<words.size();i++){
if(words.get(i).equals(word)){
System.out.println("The entered word: " + word + " exists in the dictionary");
exist = true;
break;
}
}
if(!exist){
System.out.println("This word is not in the dictionary.");
System.out.println("Do you want to add it");
if(System.in.read() == 'y'){
words.add(word);
}
}
}

Getting the program to count the total amount of a char input from a user through a text file

My example text:
This-File-Contains-184-Characters.
The-Most-Frequent-Letter-Is-"E".
The-File-Includes-2-Upper-Case-Occurences
And-22-Lower-Case-Occurences-Of-"E".
The-Total-Number-Of-Its-Occurences-Is-24.
The example letter I'm using is "e".
My code:
import java.io.*;
import java.util.Scanner;
public class Homework4a
{
public static void main(String[] args) throws IOException
{
Scanner keyboard = new Scanner(System.in);
System.out.println("Enter name of the input file: ");
String fileName = keyboard.nextLine();
System.out.println("Enter letter: ");
char letter = keyboard.nextLine().charAt(0);
File file = new File(fileName);
Scanner scan = new Scanner(new FileReader(file));
try
{
char lowerCaseLetter = (new Character(letter)).toString().toLowerCase().charAt(0);
char upperCaseLetter = (new Character(letter)).toString().toUpperCase().charAt(0);
int lowerCounter=0;
int upperCounter = 0;
while(scan.hasNextLine())
{
String input = scan.nextLine();
for(int i=0; i<input.length(); i++)
{
if(input.charAt(i)== lowerCaseLetter)
{
lowerCounter++;
}
else if(input.charAt(i)== upperCaseLetter)
{
upperCounter++;
}
}
}
int totalLowerCounter = lowerCounter;
int totalUpperCounter = upperCounter;
int totalCounterSum = totalLowerCounter + totalUpperCounter;
System.out.println("The lower-case letter " + lowerCaseLetter + " occurs " + totalLowerCounter + " times");
System.out.println("The upper-case letter " + upperCaseLetter + " occurs " + totalUpperCounter + " times");
System.out.println("The total number of occurrences (\"" + lowerCaseLetter + "\" and \"" + upperCaseLetter +
"\") is " + (totalCounterSum));
}
finally
{
scan.close();
}
}
}
I'll give you some pointers. My best advice is to use divide & conquer:
Get the file name from user input (you already know about Scanner)
Read the text file (check out BufferedReader and FileReader)
Remember that a char is basically just an int, look up an ASCII table
You can therefore use an array (int[]) where the indices are the ASCII values and the values are the number of occurrences
Go over the contents of the file, char by char, and add to the array accordingly
Basically, divide & conquer is all about splitting a task into its smallest problems, then tackle them individually. By breaking an assignment down in this way, even quite complex problems will come down to small problems that are easy to solve.
What's also nice is that once you've broken it down like that, you can go ahead and write a method for each of these sub-tasks. This way, you get nicely organized code "for free".
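The pointers above can be sketched roughly as follows. This is an illustration only: it assumes plain ASCII input, a StringReader stands in for the file reader, and the class and method names are invented for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class CharCountSketch {
    // Reads the input char by char and tallies ASCII characters by their code.
    static int[] countChars(Reader in) throws IOException {
        int[] counts = new int[128];
        int c;
        while ((c = in.read()) != -1) {
            if (c < counts.length) { // ignore anything outside the ASCII range
                counts[c]++;
            }
        }
        return counts;
    }

    // Sums lower- and upper-case occurrences of one ASCII letter.
    static int totalFor(int[] counts, char letter) {
        char lower = Character.toLowerCase(letter);
        char upper = Character.toUpperCase(letter);
        return lower == upper ? counts[lower] : counts[lower] + counts[upper];
    }

    public static void main(String[] args) throws IOException {
        // A StringReader stands in for new BufferedReader(new FileReader(fileName)).
        int[] counts = countChars(new StringReader("The-Most-Frequent-Letter-Is-\"E\"."));
        System.out.println("e: " + counts['e'] + ", E: " + counts['E']
                + ", total: " + totalFor(counts, 'e'));
    }
}
```

Each sub-task gets its own method, and once the array is filled, the count for any character is a single array lookup.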
I figured out the problem: I needed to have the println outside of the loop. This code can help you read a text file and find a specific character by changing the variable in the "if else" statement to the specific character you need to find in the text file. It then calculates how many lowercase and uppercase occurrences there are and the total of both.
Here is code that you can adapt to count a letter in a text file. I have hard-coded the letter 'a'; you can make it dynamic as well.
import java.io.*;
import java.util.Scanner;
public class CountTheNumberOfAs {
public static void main(String[] args)throws IOException
{
String fileName = "JavaIntro.txt";
String line = "";
Scanner scanner = new Scanner(new FileReader(fileName));
try {
while ( scanner.hasNextLine() ){
line = scanner.nextLine();
int counter = 0;
for( int i=0; i<line.length(); i++ ) {
if( line.charAt(i) == 'a' ) {
counter++;
}
}
System.out.println(counter);
}
}
finally {
scanner.close();
}}}

Random list, SecretPhrase Java Project

My school project requires me to modify my last assignment (code below) to pull a random phrase from a list of at least 10 for the user to guess. I'm drawing a blank on this. Any help would be appreciated. I understand I have to add a class that would import the text file or list, and then I would need to modify a loop so that it selects one at random?
import java.util.Scanner; // Allows the user to read different value types
public class SecretPhrase {
String phrase; //
Scanner scan = new Scanner(System.in);
SecretPhrase(String phrase){
this.phrase = phrase.toLowerCase();
}
public static void main(String args[]){
SecretPhrase start = new SecretPhrase("Java is Great"); // The phrase the user will have identify
start.go(); // Starts Program
}
void go(){
String guess;
String word="";
String[] words = new String[phrase.length()]; // array to store all charachters
ArrayList<String> lettersGuessed = new ArrayList();
for(int i=0;i<phrase.length();i++){
if(phrase.charAt(i)== ' '){words[i] = " ";}
else{words[i] = "*";} // Array that uses * to hide actual letters
}
int Gcount =0; // Records the count
while(!word.equals(phrase)){ // continues the loop
word = "";
int Lcount = 0;
System.out.print("Guess a letter> ");
guess = scan.next();
for(int i=0;i<phrase.length();i++){ // Accounts for any attempts by user to use more than one charachter at a time.
if((guess.charAt(0)+"").equals(phrase.charAt(i)+"")&&(lettersGuessed.indexOf(guess.charAt(0)+"")==-1)){
words[i] = ( guess.charAt(0))+ "";
Lcount++;
}
}
lettersGuessed.add(guess.charAt(0)+""); // Reveals the letter in phrase
System.out.println("You found " + Lcount +" letters"); // Prints out the total number of times the charachter was in the phrase
for(int i=0;i<words.length;i++){
word=word+words[i];
}
System.out.println(word);
Gcount ++;
}
System.out.println("Good Job! It took you " + Gcount + " guesses!" ); // Prints out result with total count
}
}
In the existing code, you are creating a SecretPhrase object with the phrase to guess:
public static void main(String args[]){
SecretPhrase start = new SecretPhrase("Java is Great");
start.go(); // Starts Program
}
You should replace it with a List (either ArrayList or LinkedList would be fine) and populate it with your data (given by either file, user input or hard-coded):
ArrayList<SecretPhrase> phrases = new ArrayList<SecretPhrase>();
//reading from file:
File file = new File("myPhrases.txt");
FileReader reader = new FileReader(file);
BufferedReader br = new BufferedReader(reader);
String phrase = null;
while ((phrase = br.readLine()) != null) {
phrases.add(new SecretPhrase(phrase));
}
Now either use Random on phrases.size() and execute go on it, or if you're looking for it to be a series of phrases, you can create a permutation and loop over them. I'm not sure what your requirements here are.
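The random-selection option could look something like this sketch. To keep it runnable on its own it works on plain strings rather than SecretPhrase objects, and the phrases and helper name are placeholders, not part of the assignment.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class RandomPhraseSketch {
    // Returns one element chosen uniformly at random from a non-empty list.
    static String pickRandom(List<String> phrases, Random random) {
        // nextInt(n) returns a value from 0 (inclusive) to n (exclusive)
        return phrases.get(random.nextInt(phrases.size()));
    }

    public static void main(String[] args) {
        // Hard-coded placeholder phrases; in the answer they are read from myPhrases.txt.
        List<String> phrases = Arrays.asList("Java is Great", "Hello World", "Keep it simple");
        String phrase = pickRandom(phrases, new Random());
        System.out.println("Secret phrase has " + phrase.length() + " characters");
        // new SecretPhrase(phrase).go(); // would start the game with the chosen phrase
    }
}
```

Passing the Random in as a parameter makes it possible to supply a seeded instance when repeatable behaviour is wanted.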

input string problem

System.out.println("Please enter the required word :");
Scanner scan2 = new Scanner(System.in);
String word2 = scan2.nextLine();
String[] array2 = word2.split(" ");
for (int b = 0; b < array2.length; b++) {
int numofDoc = 0;
for (int i = 0; i < filename; i++) {
try {
BufferedReader in = new BufferedReader(new FileReader(
"C:\\Users\\user\\fypworkspace\\TextRenderer\\abc"
+ i + ".txt"));
int matchedWord = 0;
Scanner s2 = new Scanner(in);
{
while (s2.hasNext()) {
if (s2.next().equals(word2))
matchedWord++;
}
}
if (matchedWord > 0)
numofDoc++;
} catch (IOException e) {
System.out.println("File not found.");
}
}
System.out.println("This file contain the term " + numofDoc);
}
}
}
This is my code for calculating the number of documents containing a specific term. For example:
Assume I have 10 million text files and the string COW appears in one thousand of these. I am looking for the total of one thousand documents containing the COW string.
My program currently can only process one string input.
The output of my program is:
COW
The files containing this term is 1000.
The problem I am facing now is that when I input 3 strings, it cannot process them. For example:
COW IS GOOD
The files containing this term is 0.
The files containing this term is 0.
The files containing this term is 0.
I have been trying the whole day but I can't see where my mistake is. Mind pointing out my mistakes?
According to your code, you loop 3 times (array2.length) but never use array2 at all; instead, you look for the whole string "COW IS GOOD" three times. You should change the line s2.next().equals(word2) to s2.next().equals(array2[b]).
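As a rough illustration of comparing against one term at a time, the sketch below counts documents per term. It assumes in-memory strings in place of the abc*.txt files, and the helper name docsContaining is made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public class TermDocCountSketch {
    // Counts how many documents contain `term` as a whitespace-separated token.
    static int docsContaining(String term, List<String> documents) {
        int numOfDoc = 0;
        for (String doc : documents) {
            try (Scanner s = new Scanner(doc)) {
                while (s.hasNext()) {
                    if (s.next().equals(term)) {
                        numOfDoc++;
                        break; // one hit is enough for this document
                    }
                }
            }
        }
        return numOfDoc;
    }

    public static void main(String[] args) {
        // Placeholder documents standing in for the text files on disk.
        List<String> documents = Arrays.asList("COW IS GOOD", "THE COW", "GOOD DAY");
        for (String term : "COW IS GOOD".split(" ")) {
            System.out.println(term + ": " + docsContaining(term, documents));
        }
    }
}
```

The outer loop over the split input plays the role of the loop over array2, and each call compares tokens against a single term rather than the whole input line.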
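The problem lies here:
if (s2.next().equals(word2))
If word2 = "I love you" and you're doing an equals(), s2.next() would have to return the whole string "I love you", but it only ever returns a single token.
One way to solve this is to read each token once and compare it against every word:
String[] words = word2.split(" ");
while (s2.hasNext()) {
    String token = s2.next(); // read each token only once
    for (String word : words) {
        if (token.equals(word)) {
            matchedWord++;
        }
    }
}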
