I'm working on a project and I'm trying to count
1) The number of words.
2) The number of lines in a text file.
My problem is that I can't figure out how to detect when the file goes to the next line so I can increment lines correctly. Basically if next is not a space increment words and if next is a new line, increment lines. How would I do this? Thanks!
public static void readFile(Scanner f) {
int words = 0;
int lines = 0;
while (f.hasNext()) {
if (f.next().equals("\n")) {
lines++;
} else if (!(f.next().equals(" "))) {
words++;
}
}
System.out.println("Total number of words: " + words);
System.out.println("Total number of lines: " + lines);
}
Try this:
public static void readFile(Scanner f) {
int words = 0;
int lines = 0;
while (f.hasNextLine()) {
String line = f.nextLine();
lines++;
for (String token : line.split("\\s+")) {
if (!token.isEmpty()) {
words++;
}
}
}
System.out.println("Total number of words: " + words);
System.out.println("Total number of lines: " + lines);
}
Do you have to use an InputStream? (Yes) Then it is better to wrap it in an InputStreamReader and pass that to a BufferedReader, so you can read the file line by line and increment the line count as you go.
int numLines = 0;
try (BufferedReader br = new BufferedReader(new InputStreamReader(inputStream))) {
String line;
while ((line = br.readLine()) != null)
{
numLines++;
// process the line.
}
}
Then, to count the words, split each line using a regular expression that matches whitespace. String[] myStringArray = myString.split(myRegexPattern); returns a String[] of all the words. Then all you do is numWords += myStringArray.length; (length is a field on arrays, not a method).
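The two steps described above (read line by line, then split each line on whitespace) can be combined into one loop. The following is only a minimal sketch: the class name LineWordCounter and the method count are made up for illustration, and UTF-8 is assumed as the encoding. Each line is trimmed first so that blank lines and leading whitespace do not produce empty tokens.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original answer.
final class LineWordCounter {
    // Returns {lineCount, wordCount} for the given stream (UTF-8 is an assumption).
    static int[] count(InputStream inputStream) {
        int numLines = 0;
        int numWords = 0;
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(inputStream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                numLines++;
                // Trim first: splitting a blank line would otherwise yield one empty token.
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    numWords += trimmed.split("\\s+").length;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new int[] { numLines, numWords };
    }
}
```

For a file, something like LineWordCounter.count(new FileInputStream(path)) should work; the try-with-resources block closes the stream when the loop finishes.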
You can use an InputStreamReader to create a BufferedReader, which can read a file line by line:
int amountOfLines = 0;
// try-with-resources keeps br in scope for the loop and closes it afterwards
try (BufferedReader br = new BufferedReader(new InputStreamReader(inputStream))) {
String line;
while ((line = br.readLine()) != null) {
amountOfLines++;
// process the line.
}
} catch (IOException e) {
e.printStackTrace();
}
You can then use the split(String) method to separate each line into words.
Try following:
public static void readFile(Scanner f) {
int words = 0;
int lines = 0;
while (f.hasNextLine()) {
String line = f.nextLine();
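String trimmed = line.trim();
// Split on runs of whitespace; a blank line contributes no words
if (!trimmed.isEmpty()) {
words += trimmed.split("\\s+").length;
}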
lines++;
}
System.out.println("Total number of words: " + words);
System.out.println("Total number of lines: " + lines);
}
What I was trying to do was read an MNIST training file, expand the first value on each line into eleven fields, and keep the rest unchanged.
So 3,1,4,6 ... becomes ,0,0,0,1,0,0,0,0,0,0,1,4,6... (the leading comma produces an empty first field, so there are 11 fields in total)
I thought it would be an easy job, but it wasn't.
import java.io.*;
public class T {
public static void main(String[] args){
File file = new File("./src/dataset/mnist_train.csv");
File wfile = new File("./src/dataset/conv_mnist_train2.txt");
try{
BufferedReader bufferedReader = new BufferedReader(new FileReader(file));
BufferedWriter fileWriter = new BufferedWriter(new FileWriter(wfile));
String line;
String[] numbers;
int g = 0, cnt = 0, cnt2 = 0;
while ((line = bufferedReader.readLine()) != null) {
cnt2++;
numbers = line.split(",");
for(String i : numbers){
if(g == 0){
for(int j=0; j<10; ++j) {
if(j == Integer.parseInt(i)) fileWriter.write("," + 1);
else{ fileWriter.write("," + 0); cnt++;}
}
g++;
}
else {fileWriter.write("," +i); cnt++;}
}
fileWriter.newLine();
System.out.println(numbers.length + " " + cnt + " " + cnt2);
g = 0; cnt = 0;
}
}catch(Exception e){
e.printStackTrace();
}
}
}
g, cnt, and cnt2 are counters I used for debugging, but I didn't find any problem here; as expected, it converted each line of 785 values into a new line of 795 values.
import java.io.*;
public class Tes {
public static void main(String[] args){
File file = new File("./src/dataset/conv_mnist_train2.txt");
try{
BufferedReader bufferedReader = new BufferedReader(new FileReader(file));
String line;
int g = 0;
while ((line = bufferedReader.readLine()) != null) {
g++;
String[] N = line.split(",");
if(N.length != 795){
System.out.println(N.length + " " + g);
for(String i : N) System.out.print(i + " ");
System.out.println();
}
}
}catch(Exception e){
e.printStackTrace();
}
}
}
But when I run my second program, which shouldn't print anything, it prints a result saying that row 59994 consists of only 311 values. From my first program, I confirmed that row 59994 has 795 values. I don't know what's going on here.
I also tried using FileWriter and FileReader instead of BufferedWriter and BufferedReader, but that didn't solve the problem. Could somebody tell me what's going on and how to fix it?
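The problem was that I didn't close the reader/writer. Because the writer was never closed or flushed, the data still sitting in its buffer was never written, so the output file was cut off partway through a row. I didn't know this could cause such a serious error.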
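One way to make sure the reader and writer are always closed is try-with-resources, which closes (and therefore flushes) both even if an exception is thrown. The sketch below re-implements the conversion from the question under that pattern. The class name OneHotConverter and the method convert are hypothetical, and it assumes every line starts with a digit from 0 to 9.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper class illustrating try-with-resources.
final class OneHotConverter {
    // Expands the first value of every line into ten 0/1 fields and copies the rest.
    static void convert(Reader in, Writer out) {
        // Both resources are closed automatically, which flushes the writer's buffer.
        try (BufferedReader reader = new BufferedReader(in);
             BufferedWriter writer = new BufferedWriter(out)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(",");
                int label = Integer.parseInt(fields[0].trim());
                for (int j = 0; j < 10; j++) {
                    writer.write(j == label ? ",1" : ",0");
                }
                for (int k = 1; k < fields.length; k++) {
                    writer.write("," + fields[k]);
                }
                writer.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With files, this would be called as OneHotConverter.convert(new FileReader(file), new FileWriter(wfile)), using the file and wfile objects from the question.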
Background: This program reads in a text file and replaces a word in the file with user input.
Problem: I am trying to read in a line of text from a text file and store the words into an array.
Right now the array size is hard-coded with a fixed number of elements for test purposes, but I want to make the array capable of holding a text file of any size instead.
Here is my code.
public class FTR {
public static Scanner input = new Scanner(System.in);
public static Scanner input2 = new Scanner(System.in);
public static String fileName = "C:\\Users\\...";
public static String userInput, userInput2;
public static StringTokenizer line;
public static String array_of_words[] = new String[19]; //hard-coded
/* main */
public static void main(String[] args) {
readFile(fileName);
wordSearch(fileName);
replace(fileName);
}//main
/*
* method: readFile
*/
public static void readFile(String fileName) {
try {
FileReader file = new FileReader(fileName);
BufferedReader read = new BufferedReader(file);
String line_of_text = read.readLine();
while (line_of_text != null) {
System.out.println(line_of_text);
line_of_text = read.readLine();
}
} catch (Exception e) {
System.out.println("Unable to read file: " + fileName);
System.exit(0);
}
System.out.println("**************************************************");
}
/*
* method: wordSearch
*/
public static void wordSearch(String fileName) {
int amount = 0;
System.out.println("What word do you want to find?");
userInput = input.nextLine();
try {
FileReader file = new FileReader(fileName);
BufferedReader read = new BufferedReader(file);
String line_of_text = read.readLine();
while (line_of_text != null) { //there is a line to read
System.out.println(line_of_text);
line = new StringTokenizer(line_of_text); //tokenize the line into words
while (line.hasMoreTokens()) { //check if line has more words
String word = line.nextToken(); //get the word
if (userInput.equalsIgnoreCase(word)) {
amount += 1; //count the word
}
}
line_of_text = read.readLine(); //read the next line
}
} catch (Exception e) {
System.out.println("Unable to read file: " + fileName);
System.exit(0);
}
if (amount == 0) { //if userInput was not found in the file
System.out.println("'" + userInput + "'" + " was not found.");
System.exit(0);
}
System.out.println("Search for word: " + userInput);
System.out.println("Found: " + amount);
}//wordSearch
/*
* method: replace
*/
public static void replace(String fileName) {
int amount = 0;
int i = 0;
System.out.println("What word do you want to replace?");
userInput2 = input2.nextLine();
System.out.println("Replace all " + "'" + userInput2 + "'" + " with " + "'" + userInput + "'");
try {
FileReader file = new FileReader(fileName);
BufferedReader read = new BufferedReader(file);
String line_of_text = read.readLine();
while (line_of_text != null) { //there is a line to read
line = new StringTokenizer(line_of_text); //tokenize the line into words
while (line.hasMoreTokens()) { //check if line has more words
String word = line.nextToken(); //get the word
if (userInput2.equalsIgnoreCase(word)) {
amount += 1; //count the word
word = userInput;
}
array_of_words[i] = word; //add word to index in array
System.out.println("WORD: " + word + " was stored in array[" + i + "]");
i++; //increment array index
}
//THIS IS WHERE THE PRINTING HAPPENS
System.out.println("ARRAY ELEMENTS: " + Arrays.toString(array_of_words));
line_of_text = read.readLine(); //read the next line
}
BufferedWriter outputWriter = null;
outputWriter = new BufferedWriter(new FileWriter("C:\\Users\\..."));
for (i = 0; i < array_of_words.length; i++) { //go through the array
outputWriter.write(array_of_words[i] + " "); //write word from array to file
}
outputWriter.flush();
outputWriter.close();
} catch (Exception e) {
System.out.println("Unable to read file: " + fileName);
System.exit(0);
}
if (amount == 0) { //if userInput was not found in the file
System.out.println("'" + userInput2 + "'" + " was not found.");
System.exit(0);
}
}//replace
}//FTR
You can use java.util.ArrayList (which grows dynamically, unlike an array with a fixed size) to store the String objects (words or lines of the text file) by replacing your array with the code below:
public static List<String> array_of_words = new java.util.ArrayList<>();
You need to use add(string) to add a string and get(index) to retrieve it.
Please refer to the link below for more details:
http://docs.oracle.com/javase/8/docs/api/java/util/ArrayList.html
You may want to give a try to ArrayList.
In Java, plain arrays cannot be created without an initial size and cannot be expanded at run time, whereas ArrayList is a resizable-array implementation of the List interface. ArrayList also comes with a number of useful built-in methods such as
size()
isEmpty()
contains()
clone()
and others. On top of these, you can always convert your ArrayList to a plain array using its toArray() method. Hope this answers your question. I'll prepare some code and share it with you to further explain what you can achieve using the List interface.
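As an illustration of the ArrayList suggestion, here is a minimal sketch that collects every word into a list that grows as needed, replacing matches along the way. The class name WordList, the method readWords, and its parameters are hypothetical; it keeps the StringTokenizer approach from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper class; the names are for illustration only.
final class WordList {
    // Reads all words, swapping every case-insensitive match of target for replacement.
    static List<String> readWords(Reader in, String target, String replacement) {
        List<String> words = new ArrayList<>(); // grows automatically, no fixed size
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                StringTokenizer tokens = new StringTokenizer(line);
                while (tokens.hasMoreTokens()) {
                    String word = tokens.nextToken();
                    words.add(target.equalsIgnoreCase(word) ? replacement : word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }
}
```

The result can be written back with a loop over the list, or joined into one string with String.join(" ", words). Note that this sketch, like the original code, does not preserve the file's line breaks.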
Don't use native [] arrays; use one of the Java collections instead:
List<String> fileContent = Files.readAllLines(Paths.get(fileName));
fileContent.stream().forEach(System.out::println);
long amount = fileContent.stream()
.flatMap(line -> Arrays.stream(line.split(" +")))
.filter(word -> word.equalsIgnoreCase(userInput))
.count();
List<String> words = fileContent.stream()
.flatMap(line -> Arrays.stream(line.split(" +")))
.filter(word -> word.length() > 0)
.map(word -> word.equalsIgnoreCase(userInput2) ? userInput : word)
.collect(Collectors.toList());
Files.write(Paths.get(fileName), String.join(" ", words).getBytes());
Of course, you can also work with such lists more traditionally, with loops:
for(String line: fileContent) {
...
}
or even
for (int i = 0; i < fileContent.size(); ++i) {
String line = fileContent.get(i);
...
}
I just like streams :)
I want to combine two methods in my document parser, frequencyCounter and parseFiles, but I keep getting errors.
All of frequencyCounter should be a function that is executed from within parseFiles, and the relevant information (don't worry about the file's content) should be passed to doSomething so that it knows what to print.
Right now I just keep messing up how to put these two methods together. Please give me some advice.
This is my main class:
public class Yolo {
public static void frodo() throws Exception {
int n; // number of keywords
Scanner sc = new Scanner(System.in);
System.out.println("number of keywords : ");
n = sc.nextInt();
for (int j = 0; j <= n; j++) {
Scanner scan = new Scanner(System.in);
System.out.println("give the testword : ");
String testWord = scan.next();
System.out.println(testWord);
File document = new File("path//to//doc1.txt");
boolean check = true;
try {
FileInputStream fstream = new FileInputStream(document);
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
String strLine;
strLine = br.readLine();
// Read File Line By Line
int count = 0;
while ((strLine = br.readLine()) != null) {
// check to see whether testWord occurs at least once in the
// line of text
check = strLine.toLowerCase().contains(testWord.toLowerCase());
if (check) {
// get the line
String[] lineWords = strLine.split("\\s+");
// System.out.println(strLine);
count++;
}
}
System.out.println(testWord + "frequency: " + count);
br.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
}
The code below gives you this output:
Professor frequency: 54
engineering frequency: 188
data frequency: 2
mining frequency: 2
research frequency: 9
This is only for doc1, though; you have to add a loop to iterate over all 5 documents.
public class yolo {
public static void frodo() throws Exception {
String[] keywords = { "Professor" , "engineering" , "data" , "mining" , "research"};
for(int i=0; i< keywords.length; i++){
String testWord = keywords[i];
File document = new File("path//to//doc1.txt");
boolean check = true;
try {
FileInputStream fstream = new FileInputStream(document);
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));
String strLine;
strLine = br.readLine();
// Read File Line By Line
int count = 0;
while ((strLine = br.readLine()) != null) {
// check to see whether testWord occurs at least once in the
// line of text
check = strLine.toLowerCase().contains(testWord.toLowerCase());
if (check) {
// get the line
String[] lineWords = strLine.split("\\s+");
count++;
}
}
System.out.println(testWord + " frequency: " + count);
br.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
}
hope this helps!
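The structure the question asks for, frequencyCounter called from inside parseFiles for every document, could look roughly like the sketch below. To keep it self-contained, it assumes each document is already available as a list of lines (for example from Files.readAllLines); the class name KeywordFrequency and the "document:keyword" key format are made up for illustration. Unlike the code above, it does not skip the first line of each document.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; the in-memory documents stand in for the real files.
final class KeywordFrequency {
    // Counts the lines that contain the keyword at least once, ignoring case.
    static int frequencyCounter(List<String> lines, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        int count = 0;
        for (String line : lines) {
            if (line.toLowerCase(Locale.ROOT).contains(needle)) {
                count++;
            }
        }
        return count;
    }

    // Runs frequencyCounter for every keyword on every document.
    // Keys in the result have the (assumed) form "documentName:keyword".
    static Map<String, Integer> parseFiles(Map<String, List<String>> documents,
                                           List<String> keywords) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> doc : documents.entrySet()) {
            for (String keyword : keywords) {
                result.put(doc.getKey() + ":" + keyword,
                        frequencyCounter(doc.getValue(), keyword));
            }
        }
        return result;
    }
}
```

Like the original, this counts lines that contain the keyword, not individual occurrences, so a line that mentions the word twice still counts once.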
So I have pretty much completed (I think) my wc program in Java, that takes a filename from a user input (even multiple), and counts the lines, words, bytes (number of characters) from the file. There were 2 files provided for testing purposes, and they are in a .dat format, being readable from dos/linux command lines. Everything is working properly except for the count when there are \n or \r\n characters at the end of line. It will not count these. Please help?
import java.io.*;
import java.util.regex.Pattern;
public class Prog03 {
private static int totalWords = 0, currentWords = 0;
private static int totalLines =0, currentLines = 0;
private static int totalBytes = 0, currentBytes = 0;
public static void main(String[] args) {
System.out.println("This program determines the quantity of lines, words, and bytes\n" +
"in a file or files that you specify.\n" +
"\nPlease enter one or more file names, comma-separated: ");
getFileName();
System.out.println();
} // End of main method.
public static void countSingle (String fileName, BufferedReader in) {
try {
String line;
String[] words;
//int totalWords = 0;
int totalWords1 = 0;
int lines = 0;
int chars = 0;
while ((line = in.readLine()) != null) {
lines++;
currentLines = lines;
chars += line.length();
currentBytes = chars;
words = line.split(" ");
totalWords1 += countWords(line);
currentWords = totalWords1;
} // End of while loop.
System.out.println(currentLines + "\t\t" + currentWords + "\t\t" + currentBytes + "\t\t"
+ fileName);
} catch (Exception ex) {
ex.printStackTrace();
}
}
public static void countMultiple(String fileName, BufferedReader in) {
try {
String line;
String[] words;
int totalWords1 = 0;
int lines = 0;
int chars = 0;
while ((line = in.readLine()) != null) {
lines++;
currentLines = lines;
chars += line.length();
currentBytes = chars;
words = line.split(" ");
totalWords1 += countWords(line);
currentWords = totalWords1;
} // End of while loop.
totalLines += currentLines;
totalBytes += currentBytes;
totalWords += totalWords1;
} catch (Exception ex) {
ex.printStackTrace();
}
} // End of method count().
private static long countWords(String line) {
long numWords = 0;
int index = 0;
boolean prevWhitespace = true;
while (index < line.length()) {
char c = line.charAt(index++);
boolean currWhitespace = Character.isWhitespace(c);
if (prevWhitespace && !currWhitespace) {
numWords++;
}
prevWhitespace = currWhitespace;
}
return numWords;
} // End of method countWords().
private static void getFileName() {
BufferedReader in ;
try {
in = new BufferedReader(new InputStreamReader(System.in));
String fileName = in.readLine();
String [] files = fileName.split(", ");
System.out.println("Lines\t\tWords\t\tBytes" +
"\n--------\t--------\t--------");
for (int i = 0; i < files.length; i++) {
FileReader fileReader = new FileReader(files[i]);
in = new BufferedReader(fileReader);
if (files.length == 1) {
countSingle(files[0], in);
in.close();
}
else {
countMultiple(files[i], in);
System.out.println(currentLines + "\t\t" +
currentWords + "\t\t" + currentBytes + "\t\t"
+ files[i]);
in.close();
}
}
if (files.length > 1) {
System.out.println("----------------------------------------" +
"\n" + totalLines + "\t\t" + totalWords + "\t\t" + totalBytes + "\t\tTotals");
}
}
catch (FileNotFoundException ioe) {
System.out.println("The specified file was not found. Please recheck "
+ "the spelling and try again.");
ioe.printStackTrace();
}
catch (IOException ioe) {
ioe.printStackTrace();
}
}
} // End of class
That is the entire program, in case anyone helping needs to see it. However, this is where I count the length of each line (I assumed that the EOL characters would be part of this count, but they aren't):
public static void countMultiple(String fileName, BufferedReader in) {
try {
String line;
String[] words;
int totalWords1 = 0;
int lines = 0;
int chars = 0;
while ((line = in.readLine()) != null) {
lines++;
currentLines = lines;
chars += line.length(); // this is the count in question
currentBytes = chars;
words = line.split(" ");
totalWords1 += countWords(line);
currentWords = totalWords1;
} // End of while loop.
totalLines += currentLines;
totalBytes += currentBytes;
totalWords += totalWords1;
} catch (Exception ex) {
ex.printStackTrace();
}
}
BufferedReader.readLine() always strips the line terminator (\n, \r, or \r\n) from the string it returns, so there is no way to count those characters using readLine() alone.
You can use the read() method instead, but in that case you have to read each character individually.
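A minimal sketch of the read() approach follows. The class name CharCounter and the returned array layout are assumptions made for illustration. It counts characters, which equals the byte count only for single-byte encodings; for an exact byte count you would read from an InputStream instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical helper class; the names are for illustration only.
final class CharCounter {
    // Returns {newlineCount, charCount}; line terminators are included in charCount.
    static long[] count(Reader in) {
        long newlines = 0;
        long chars = 0;
        try (BufferedReader reader = new BufferedReader(in)) {
            int c;
            // read() returns one character at a time, or -1 at end of input.
            while ((c = reader.read()) != -1) {
                chars++;
                if (c == '\n') {
                    newlines++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new long[] { newlines, chars };
    }
}
```

Counting '\n' characters matches what wc reports as lines, so a last line without a trailing newline is not counted.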
Just a comment: to split a line into words, it is not enough to split on a single space with line.split(" "); you will get wrong counts if there are multiple spaces or tabs between words. It is better to split on any run of whitespace: line.split("\\s+");
This is some code that I found to help with reading in a 2D array, but the problem I am having is that it only works when reading a list of numbers structured like:
73
56
30
75
80
etc.
What I want is to be able to read multiple lines that are structured like this:
1,0,1,1,0,1,0,1,0,1
1,0,0,1,0,0,0,1,0,1
1,1,0,1,0,1,0,1,1,1
I just want to essentially import each line as an array, while structuring them like an array in the text file.
Everything I have read says to use scan.useDelimiter(","); but everywhere I try to use it, the program goes straight to the catch block that prints "Error converting number". If anyone can help I would greatly appreciate it. I also saw some information about using split with the BufferedReader, but I don't know which would be better to use, or why or how.
String filename = "res/test.txt"; // Finds the file you want to test.
try{
FileReader ConnectionToFile = new FileReader(filename);
BufferedReader read = new BufferedReader(ConnectionToFile);
Scanner scan = new Scanner(read);
int[][] Spaces = new int[10][10];
int counter = 0;
try{
while(scan.hasNext() && counter < 10)
{
for(int i = 0; i < 10; i++)
{
counter = counter + 1;
for(int m = 0; m < 10; m++)
{
Spaces[i][m] = scan.nextInt();
}
}
}
for(int i = 0; i < 10; i++)
{
//Prints out Arrays to the Console, (not needed in final)
System.out.println("Array" + (i + 1) + " is: " + Spaces[i][0] + ", " + Spaces[i][1] + ", " + Spaces[i][2] + ", " + Spaces[i][3] + ", " + Spaces[i][4] + ", " + Spaces[i][5] + ", " + Spaces[i][6]+ ", " + Spaces[i][7]+ ", " + Spaces[i][8]+ ", " + Spaces[i][9]);
}
}
catch(InputMismatchException e)
{
System.out.println("Error converting number");
}
scan.close();
read.close();
}
catch (IOException e)
{
System.out.println("IO-Error open/close of file" + filename);
}
}
I provide my code here.
public static int[][] readArray(String path) throws IOException {
//1,0,1,1,0,1,0,1,0,1
int[][] result = new int[3][10];
BufferedReader reader = new BufferedReader(new FileReader(path));
String line = null;
Scanner scanner = null;
line = reader.readLine();
if(line == null) {
return result;
}
String pattern = createPattern(line);
int lineNumber = 0;
MatchResult temp = null;
while(line != null) {
scanner = new Scanner(line);
scanner.findInLine(pattern);
temp = scanner.match();
int count = temp.groupCount();
for(int i=1;i<=count;i++) {
result[lineNumber][i-1] = Integer.parseInt(temp.group(i));
}
lineNumber++;
scanner.close();
line = reader.readLine();
}
return result;
}
public static String createPattern(String line) {
char[] chars = line.toCharArray();
StringBuilder pattern = new StringBuilder();;
for(char c : chars) {
if(',' == c) {
pattern.append(',');
} else {
pattern.append("(\\d+)");
}
}
return pattern.toString();
}
The following code snippet might be helpful. The basic idea is to read each line and parse it as CSV. Be advised that CSV parsing is generally hard and usually calls for a specialized library (such as CSVReader). However, the case at hand is relatively straightforward.
try {
String line = "";
int rowNumber = 0;
while(scan.hasNextLine()) {
line = scan.nextLine();
String[] elements = line.split(","); // split takes a String (regex), not a char
int elementCount = 0;
for(String element : elements) {
int elementValue = Integer.parseInt(element);
spaces[rowNumber][elementCount] = elementValue;
elementCount++;
}
rowNumber++;
}
} // you know what goes afterwards
Since the file is read line by line, parse each line using "," as the delimiter.
Here you just create a new Scanner object for each line and set its delimiter to ",".
The code looks like this, in the first for loop:
for(int i = 0; i < 10; i++)
{
Scanner newScan=new Scanner(scan.nextLine()).useDelimiter(",");
counter = counter + 1;
for(int m = 0; m < 10; m++)
{
Spaces[i][m] = newScan.nextInt();
}
}
Use the useDelimiter method in Scanner to set the delimiter to "," instead of the default whitespace delimiter.
In the sample input given, each row of the 2D array begins on a new line, so "," alone is not enough; multiple delimiters have to be specified.
Example:
scan.useDelimiter(",|\\r?\\n");
This sets the delimiter to either "," or a line break (a newline, optionally preceded by a carriage return).
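The multi-delimiter idea can be tried out with a small sketch like the one below. The class name GridScanner and the method readGrid are hypothetical, and the pattern uses \r?\n so that both Windows and Unix line endings are accepted, which is an assumption about the input file. It also assumes there are no spaces around the commas.

```java
import java.util.Scanner;

// Hypothetical helper class; the names are for illustration only.
final class GridScanner {
    // Parses rows x cols integers separated by commas or line breaks.
    static int[][] readGrid(String text, int rows, int cols) {
        int[][] grid = new int[rows][cols];
        try (Scanner scan = new Scanner(text)) {
            // A token ends at a comma or at a line break (\n or \r\n).
            scan.useDelimiter(",|\\r?\\n");
            for (int i = 0; i < rows; i++) {
                for (int m = 0; m < cols; m++) {
                    grid[i][m] = scan.nextInt();
                }
            }
        }
        return grid;
    }
}
```

The same useDelimiter call works on a Scanner that wraps a file or a BufferedReader; a String is used here only to keep the example self-contained.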
Why use a scanner for a file? You already have a BufferedReader:
FileReader fileReader = new FileReader(filename);
BufferedReader reader = new BufferedReader(fileReader);
Now you can read the file line by line. The tricky bit is that you want an array of int:
int[][] spaces = new int[10][10];
String line = null;
int row = 0;
while ((line = reader.readLine()) != null)
{
String[] array = line.split(",");
for (int i = 0; i < array.length; i++)
{
spaces[row][i] = Integer.parseInt(array[i]);
}
row++;
}
The other approach is using a Scanner for the individual lines:
while ((line = reader.readLine()) != null)
{
Scanner s = new Scanner(line).useDelimiter(","); // useDelimiter takes a String pattern, not a char
int col = 0;
while (s.hasNextInt())
{
spaces[row][col] = s.nextInt();
col++;
}
row++;
}
The other thing worth noting is that you're using an int[10][10]; this requires you to know the length of the file in advance. A List<int[]> would remove this requirement.
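A minimal sketch of the List<int[]> idea follows, assuming one comma-separated row per line. The class name RowReader and the method readRows are made up for illustration, and skipping blank lines is an added assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are for illustration only.
final class RowReader {
    // Reads every non-blank line as one int[] row; rows may differ in length.
    static List<int[]> readRows(Reader in) {
        List<int[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // assumption: blank lines carry no data
                }
                String[] parts = line.split(",");
                int[] row = new int[parts.length];
                for (int i = 0; i < parts.length; i++) {
                    row[i] = Integer.parseInt(parts[i].trim());
                }
                rows.add(row);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }
}
```

If a real int[][] is needed afterwards, rows.toArray(new int[0][]) converts the list once the number of rows is known.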