Using a tokenizer to count the number of words in a file - Java

I am making a program that lets a user choose a file, then reads from that file. I've been told to write the program using BufferedReader and StringTokenizer. So far the program opens the file and counts the number of lines, but counting the number of words is not so easy.
This is my code so far:
int getWords() throws IOException
{
    int count = 0;
    BufferedReader BF = new BufferedReader(new FileReader(f));
    try {
        StringTokenizer words = new StringTokenizer(BF.readLine());
        while (words.hasMoreTokens())
        {
            count++;
            words.nextToken();
        }
        BF.close();
    } catch (FileNotFoundException e) {
    }
    return count;
}
BufferedReader reads only one line at a time with readLine(), but I don't know how to make it read the rest of the lines.

To count the words in a line you can use countTokens() instead of a loop. To read all the lines, use:
String line = null;
while (null != (line = BF.readLine())) {
    StringTokenizer words = new StringTokenizer(line);
    words.countTokens(); // use this value as the number of words in the line
}

As you said, BufferedReader reads one line at a time, so you have to keep reading lines until there are no more. readLine() returns null when the end of the file is reached.
So do something like this:
int getWords() throws IOException {
    int count = 0;
    BufferedReader BF = new BufferedReader(new FileReader(f));
    String line;
    try {
        while ((line = BF.readLine()) != null) {
            StringTokenizer words = new StringTokenizer(line);
            while (words.hasMoreTokens()) {
                count++;
                words.nextToken();
            }
        }
        return count;
    } catch (FileNotFoundException e) {
        // Either rethrow the exception or return an error code like -1.
        return -1;
    } finally {
        BF.close();
    }
}
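On Java 7 and later, a try-with-resources block sidesteps the close/return awkwardness entirely. Below is a sketch (not from either answer) that combines both suggestions: read line by line and add countTokens() per line. The method takes a Reader parameter here only so the example is self-contained; for a file you would pass new FileReader(f).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.StringTokenizer;

public class WordCount {
    // Counts whitespace-separated tokens across all lines of the input.
    static int countWords(Reader in) throws IOException {
        int count = 0;
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            while ((line = br.readLine()) != null) {
                // countTokens() replaces the manual hasMoreTokens()/nextToken() loop
                count += new StringTokenizer(line).countTokens();
            }
        } // br is closed automatically, even if readLine() throws
        return count;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countWords(new StringReader("one two\nthree four five"))); // prints 5
    }
}
```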

Related

Reading Line of String in Text File is Not Consistent

Hi StackOverFlow People,
I have an issue in the system I am developing. I have 4451 lines of records in a text file, which I retrieve using BufferedReader, splitting every line on a pipe (|). I also use Quartz to run this file read every day; while testing I set the Quartz job to run every minute so I could verify that it actually reads the file each time. It reads all of the lines in the text file when I check with this:
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
String line = null;
int counter = 0;
while ((line = reader.readLine()) != null) {
    counter++;
}
System.out.println(counter);
But when I split the string, the result of retrieving the 4451 records is inconsistent. Sometimes it retrieves only 1000+ to 2000+ records, and sometimes it retrieves all 4451. This is my code:
try {
    BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
    String line = null;
    int counter = 0;
    String[] splitLine = null;
    while ((line = reader.readLine()) != null) {
        splitLine = line.split("\\|"); // splitting the line on the '|' delimiter
        for (String temp : splitLine) {
            System.out.println(temp);
        }
        counter++;
    }
    System.out.println(counter);
} catch (IOException e) {
    e.printStackTrace();
}
Could splitting the string while iterating over the file be the cause?
EDIT:
No exception occurs in this situation; it only prints the length via the counter variable.
My expected output is to retrieve all the records in the text file, splitting each line on the pipe; counter is the count of lines retrieved.
I didn't find any error in your code, and the code I have written below works fine for me:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

class Test {
    public static void main(String[] args) {
        FileReader inputStream = null;
        BufferedReader reader = null;
        try {
            inputStream = new FileReader("Input.txt");
            reader = new BufferedReader(inputStream);
            String line = null;
            int counter = 0;
            String[] splitLine = null;
            while ((line = reader.readLine()) != null) {
                splitLine = line.split("\\|");
                for (String temp : splitLine) {
                    System.out.println(temp);
                }
                counter++;
            }
            System.out.println(counter);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (reader != null) {
                    reader.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
Note that split() takes a regular expression, not a plain delimiter string, and | is a regex metacharacter (alternation), so it does need to be escaped: line.split("\\|") is correct. An unescaped line.split("|") would split the line between every character rather than on the pipes.
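Since String.split() interprets its argument as a regular expression, the escaping question is easy to check empirically. A minimal demonstration (the sample record is made up for illustration):

```java
import java.util.Arrays;

public class SplitDemo {
    public static void main(String[] args) {
        String record = "john|doe|42";

        // Escaped: the pipe is matched literally, giving the three fields
        System.out.println(Arrays.toString(record.split("\\|"))); // [john, doe, 42]

        // Unescaped: "|" is regex alternation of two empty patterns, which
        // matches between every character, so the record is not split on pipes
        System.out.println(Arrays.toString(record.split("|")));
    }
}
```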

How to read records from a text file?

I tried this:
public static void ReadRecord()
{
    String line = null;
    try
    {
        FileReader fr = new FileReader("input.txt");
        BufferedReader br = new BufferedReader(fr);
        line = br.readLine();
        while (line != null)
        {
            System.out.println(line);
        }
        br.close();
        fr.close();
    }
    catch (Exception e)
    {
    }
}
It non-stop, repeatedly reads only the one record that I had input and written to the file earlier. How do I read the records, and how do I use tokenization while reading them?
You have to read the lines repeatedly inside the loop using br.readLine(); br.readLine() reads only one line at a time.
Do something like this:
while ((line = br.readLine()) != null) {
    System.out.println(line);
}
Check this link too if you run into problems: http://www.mkyong.com/java/how-to-read-file-from-java-bufferedreader-example/
Tokenization
If you want to split your string into tokens, you can use the StringTokenizer class or the String.split() method.
StringTokenizer Class
StringTokenizer st = new StringTokenizer(line);
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
st.hasMoreTokens() - checks whether any more tokens are present.
st.nextToken() - returns the next token.
String.split()
String[] result = line.split("\\s"); // split the line into tokens
for (int x = 0; x < result.length; x++) {
    System.out.println(result[x]);
}
line.split("\\s") splits the line on single whitespace characters and returns a String array (use "\\s+" if tokens may be separated by runs of whitespace).
Try this:
while ((line = br.readLine()) != null)
{
    System.out.println(line);
}
Try this:
BufferedReader br = new BufferedReader(new FileReader("input.txt"));
String line;
while ((line = br.readLine()) != null)
    System.out.println(line);
For a text file called access.txt located, for example, on your X drive, this should work:
public static void readRecordFromTextFile() throws FileNotFoundException
{
    try {
        File file = new File("X:\\access.txt");
        Scanner sc = new Scanner(file);
        sc.useDelimiter(",|\r\n");
        while (sc.hasNext()) {
            System.out.println(sc.next());
        }
        sc.close(); // closing the scanner stream
    } catch (FileNotFoundException e) {
        System.out.println("Enter existing file name");
        e.printStackTrace();
    }
}

Is there a utility method for reading first n lines from file?

I have searched the following popular libraries:
Guava - Files.readLines (and Files.readFirstLine)
NIO - Files.readAllLines
Apache Commons - FileUtils.readLines
All of these methods read the whole file into memory as a collection of strings, which is not practical for large files with thousands of lines. Is there a simple method call in any of these libraries to read just the first n lines of a file?
You could use LineNumberReader
LineNumberReader reader = new LineNumberReader(
        new InputStreamReader(new FileInputStream("/path/to/file"), "UTF-8"));
try {
    String line;
    while ((line = reader.readLine()) != null && reader.getLineNumber() <= 10) {
        // ...
    }
} finally {
    reader.close();
}
With Java 8 you can use Files.lines:
List<String> readFirst(final Path path, final int numLines) throws IOException {
    try (final Stream<String> lines = Files.lines(path)) {
        return lines.limit(numLines).collect(Collectors.toList());
    }
}
Pre Java 8 you can write something yourself fairly easily:
List<String> readFirst(final Path path, final int numLines) throws IOException {
    try (final BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
        final List<String> lines = new ArrayList<>(numLines);
        int lineNum = 0;
        String line;
        while ((line = reader.readLine()) != null && lineNum < numLines) {
            lines.add(line);
            lineNum++;
        }
        return lines;
    }
}
I do not know of a ready-made utility that does what you want, but it is very simple to write. First create an instance of BufferedReader:
BufferedReader reader = new BufferedReader(new FileReader("myfile.txt"));
Now read the first line:
String line = reader.readLine();
Obviously you can call this method as many times as you need to read the first n lines of the file, for example:
for (int i = 0; i < n; i++) {
    String line = reader.readLine();
    // do whatever you want with the line
}

Java: how to use BufferedReader to read specific lines

Let's say I have a text file called data.txt that contains 2000 lines.
How do I read specific ranges of lines, say 500-1500 and then 1500-2000, and display their output?
This code reads the whole file (all 2000 lines):
public static String getContents(File aFile) {
    StringBuffer contents = new StringBuffer();
    try {
        BufferedReader input = new BufferedReader(new FileReader(aFile));
        try {
            String line = null;
            while ((line = input.readLine()) != null) {
                contents.append(line);
                contents.append(System.getProperty("line.separator"));
            }
        } finally {
            input.close();
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }
    return contents.toString();
}
How do I modify the above code to read specific lines?
I suggest java.io.LineNumberReader. It extends BufferedReader, and you can use its getLineNumber() method to get the current line number.
You can also use Java 7's java.nio.file.Files.readAllLines, which returns a List<String>, if that suits you better.
Notes:
1) Favour StringBuilder over StringBuffer; StringBuffer is just a legacy class.
2) contents.append(System.getProperty("line.separator")) is verbose; since Java 7 you can write contents.append(System.lineSeparator()) instead. (File.separator is the path separator, not a line separator, so it is not a substitute here.)
3) Catching the exception inside the method seems irrelevant; I would also suggest changing your code to:
public static String getContents(File aFile) throws IOException {
    BufferedReader rdr = new BufferedReader(new FileReader(aFile));
    try {
        StringBuilder sb = new StringBuilder();
        // read your lines
        return sb.toString();
    } finally {
        rdr.close();
    }
}
Now the code looks cleaner, in my view. And if you are on Java 7, use try-with-resources:
try (BufferedReader rdr = new BufferedReader(new FileReader("aFile"))) {
StringBuilder sb = new StringBuilder();
// read your lines
return sb.toString();
}
So finally your code could look like this:
public static String[] getContents(File aFile) throws IOException {
    try (LineNumberReader rdr = new LineNumberReader(new FileReader(aFile))) {
        StringBuilder sb1 = new StringBuilder();
        StringBuilder sb2 = new StringBuilder();
        String line;
        while ((line = rdr.readLine()) != null) {
            if (rdr.getLineNumber() >= 1500) {
                sb2.append(line).append(System.lineSeparator());
            } else if (rdr.getLineNumber() > 500) {
                sb1.append(line).append(System.lineSeparator());
            }
        }
        return new String[] { sb1.toString(), sb2.toString() };
    }
}
Note that it returns two strings: lines 501-1499 and lines 1500-2000.
A slightly cleaner solution would be to use FileUtils from Apache Commons IO.
http://commons.apache.org/io/api-release/org/apache/commons/io/FileUtils.html
Example snippet:
String line = FileUtils.readLines(aFile).get(lineNumber);
The better way is to use BufferedReader. If you want to read line 32, for example, skip the first 31 lines and then read it:
for (int x = 0; x < 31; x++) {
    buf.readLine();
}
String lineThirtyTwo = buf.readLine();
Now line 32 is stored in the String lineThirtyTwo.
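On Java 8+, another option (not mentioned in the answers above) is Files.lines with skip and limit, which streams the file instead of collecting every line first. A sketch, assuming 1-based inclusive line numbers; the class and method names are illustrative:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LineRange {
    // Returns lines 'from' through 'to' (1-based, inclusive) without
    // materializing the whole file as a collection first.
    static List<String> readRange(Path file, int from, int to) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.skip(from - 1)
                        .limit(to - from + 1)
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("data", ".txt");
        Files.write(tmp, Arrays.asList("line1", "line2", "line3", "line4"));
        System.out.println(readRange(tmp, 2, 3)); // prints [line2, line3]
        Files.delete(tmp);
    }
}
```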

Problems with my program involving ArrayLists, BufferedReader, methods, and overall forgetfulness of how Java works

I am having difficulties with a program that I have been working on all day. I am trying to read a text file line by line, turn each line into an ArrayList of its words, and then use the indexes of that ArrayList to define terms.
public class PCB {
    public static void main(String arg[]) {
        read();
    }

    public static ArrayList read() {
        BufferedReader inputStream = null;
        ArrayList<String> tokens = new ArrayList<String>();
        try {
            inputStream = new BufferedReader(new FileReader("processes1.txt"));
            String l;
            while ((l = inputStream.readLine()) != null) {
                Scanner tokenize = new Scanner(l);
                while (tokenize.hasNext()) {
                    tokens.add(tokenize.next());
                }
                return tokens;
            }
        } catch (IOException ioe) {
            ArrayList<String> nothing = new ArrayList<String>();
            nothing.add("error1");
            System.out.println("error");
            //return nothing;
        }
        return tokens;
    }
}
The problem is that it only reads the first line. What am I doing wrong?
Thank you so much in advance.
You have "return tokens;" inside your while loop. That early return cuts off processing after the first line.
Try changing your loop to the following; note how I moved the return statement.
while ((l = inputStream.readLine()) != null) {
    Scanner tokenize = new Scanner(l);
    while (tokenize.hasNext()) {
        tokens.add(tokenize.next());
    }
}
return tokens; // <-- outside the loop
Edit: If you want to read the entire file and store the tokens of each line in a separate list, you could create an ArrayList of ArrayLists.
public static ArrayList<ArrayList<String>> tokenizeFile(String filename) throws IOException {
    BufferedReader inputStream = new BufferedReader(new FileReader(filename));
    ArrayList<ArrayList<String>> lines = new ArrayList<ArrayList<String>>();
    while (true) {
        String line = inputStream.readLine();
        if (line == null) break;
        ArrayList<String> tokens = new ArrayList<String>();
        Scanner tokenizer = new Scanner(line);
        while (tokenizer.hasNext()) {
            tokens.add(tokenizer.next());
        }
        lines.add(tokens);
    }
    inputStream.close();
    return lines;
}
Note: My Java is rusty.
Simplify to this ...
String l;
while ((l = inputStream.readLine()) != null) {
    tokens.addAll(Arrays.asList(l.split("\\s+")));
}
... which creates a list of all tokens on all lines in the file (if that is what you want). Splitting on "\\s+" rather than a single space avoids empty tokens when words are separated by runs of whitespace.
