Finishing File Class - java

I keep getting an error telling me that lineNumber cannot be resolved to a variable. I'm not really sure how to fix this. Am I missing an import that would help with this?
Also, how would I count the number of chars with spaces and without spaces?
I also need a method to count unique words, but I'm not really sure what unique words are.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;
import java.util.StringTokenizer;
import java.util.ArrayList;
import java.util.List;
public class LineWordChar {
public void main(String[] args) throws IOException {
// Convert our text file to string
String text = new Scanner( new File("way to your file"), "UTF-8" ).useDelimiter("\\A").next();
BufferedReader bf=new BufferedReader(new FileReader("way to your file"));
String lines="";
int linesi=0;
int words=0;
int chars=0;
String s="";
// while next lines are present in file int linesi will add 1
while ((lines=bf.readLine())!=null){
linesi++;}
// Tokenizer separate our big string "Text" to little string and count them
StringTokenizer st=new StringTokenizer(text);
while (st.hasMoreTokens()){
s = st.nextToken();
words++;
// We take every word during separation and count number of char in this words
for (int i = 0; i < s.length(); i++) {
chars++;}
}
System.out.println("Number of lines: "+linesi);
System.out.println("Number of words: "+words);
System.out.print("Number of chars: "+chars);
}
}
abstract class WordCount {
/**
* @return HashMap a map containing the Character count, Word count and
* Sentence count
* @throws FileNotFoundException
*
*/
public static void main() throws FileNotFoundException {
lineNumber=2; // as u want
File f = null;
ArrayList<Integer> list=new ArrayList<Integer>();
f = new File("file_stats.txt");
Scanner sc = new Scanner(f);
int totalLines=0;
int totalWords=0;
int totalChars=0;
int totalSentences=0;
while(sc.hasNextLine())
{
totalLines++;
if(totalLines==lineNumber){
String line = sc.nextLine();
totalChars += line.length();
totalWords += new StringTokenizer(line, " ,").countTokens(); //line.split("\\s").length;
totalSentences += line.split("\\.").length;
break;
}
sc.nextLine();
}
list.add(totalChars);
list.add(totalWords);
list.add(totalSentences);
System.out.println(lineNumber+";"+totalWords+";"+totalChars+";"+totalSentences);
}
}

In order to get your code running you have to make at least two changes:
Replace:
lineNumber=2; // as u want
with
int lineNumber=2; // as u want
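Also, you need to modify your main method. Declaring throws on main is legal (the JVM prints the stack trace of an uncaught exception), but to serve as the entry point the method must be declared public static void main(String[] args), and catching exceptions inside it lets you decide how errors are reported: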
public static void main(String[] args) {
// Convert our text file to string
try {
String text = new Scanner(new File("way to your file"), "UTF-8").useDelimiter("\\A").next();
BufferedReader bf = new BufferedReader(new FileReader("way to your file"));
String lines = "";
int linesi = 0;
int words = 0;
int chars = 0;
String s = "";
// while next lines are present in file int linesi will add 1
while ((lines = bf.readLine()) != null) {
linesi++;
}
// Tokenizer separate our big string "Text" to little string and count them
StringTokenizer st = new StringTokenizer(text);
while (st.hasMoreTokens()) {
s = st.nextToken();
words++;
// We take every word during separation and count number of char in this words
for (int i = 0; i < s.length(); i++) {
chars++;
}
}
System.out.println("Number of lines: " + linesi);
System.out.println("Number of words: " + words);
System.out.print("Number of chars: " + chars);
} catch (Exception e) {
e.printStackTrace();
}
}
I've used a global Exception catch; you can split it into several catch blocks in order to handle each exception type separately. Running it gives me the obvious FileNotFoundException (the placeholder path does not exist); apart from that, your code runs now.
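As a rough illustration of splitting the catch into several blocks, here is a minimal sketch. The class name LineCounter, the method countLines and the -1 return value are assumptions made for this example; they are not part of the original code. It also uses try-with-resources so the reader is closed automatically.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical helper class, used only to illustrate separate catch blocks.
final class LineCounter {

    // Returns the number of lines in the file, or -1 if it cannot be read.
    static int countLines(String path) {
        // try-with-resources closes the reader even when an exception is thrown
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (FileNotFoundException e) {
            // More specific exception first: the file does not exist or cannot be opened
            System.err.println("File not found: " + path);
            return -1;
        } catch (IOException e) {
            // Any other I/O failure while reading
            System.err.println("Could not read " + path + ": " + e.getMessage());
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println("Number of lines: " + countLines("way to your file"));
    }
}
```

The FileNotFoundException block has to come before the IOException block, because FileNotFoundException is a subclass of IOException.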

The lineNumber variable must be declared with a data type:
int lineNumber=2; // as u want
Change the first line in the main method from just lineNumber = 2 to int lineNumber = 2. Java requires every variable to be declared with a type before it is used.
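For the other two parts of the question, here is a minimal sketch. It assumes that "unique words" means distinct words, each counted once no matter how often it appears, compared without regard to case, and that a word is any run of non-whitespace characters. The class and method names are made up for this example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class; the names are illustrative, not from the original post.
final class TextStats {

    // Counts every character in the text, whitespace included.
    static int charsWithSpaces(String text) {
        return text.length();
    }

    // Counts only the characters that are not whitespace (spaces, tabs, line breaks).
    static int charsWithoutSpaces(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (!Character.isWhitespace(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    // Counts distinct words, ignoring case.
    static int uniqueWords(String text) {
        Set<String> seen = new HashSet<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) { // an empty or blank text yields one empty token
                seen.add(word.toLowerCase());
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        String text = "the cat saw the dog";
        System.out.println(charsWithSpaces(text));    // 19
        System.out.println(charsWithoutSpaces(text)); // 15
        System.out.println(uniqueWords(text));        // 4
    }
}
```

A Set is used because it silently ignores duplicates, so its size is the number of distinct words.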

Related

Word Count from a text file using Java

I am trying to write a simple code that will give me the word count from a text file. The code is as follows:
import java.io.File; //to read file
import java.util.Scanner;
public class ReadTextFile {
public static void main(String[] args) throws Exception {
String filename = "textfile.txt";
File f = new File (filename);
Scanner scan = new Scanner(f);
int wordCnt = 1;
while(scan.hasNextLine()) {
String text = scan.nextLine();
for (int i = 0; i < text.length(); i++) {
if(text.charAt(i) == ' ' && text.charAt(i-1) != ' ') {
wordCnt++;
}
}
}
System.out.println("Word count is " + wordCnt);
}
}
This code compiles but does not give the correct word count. What am I doing incorrectly?
Right now you only increment wordCnt if the current character is a space and the character before it is not. This misses several cases, such as a word that is followed by a newline character rather than a space. Consider a file that looks like:
This is a text file\n
with a bunch of\n
words.
Your method should return ten, but since there is no space after the words file and of, the words that follow them are not counted.
If you just want the word count, start wordCnt at 0 and do something along the lines of:
while(scan.hasNextLine()){
String text = scan.nextLine();
wordCnt+= text.split("\\s+").length;
}
This splits each line on runs of whitespace and adds the number of tokens in the resulting array.
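One caveat with split("\\s+"): a blank line still yields one empty token, and a line with leading whitespace yields an empty first token, so both inflate the count. A minimal sketch that guards against this follows; the class name LineWordCounter and the method countWords are made up for this example.

```java
// Hypothetical helper; not part of the original answer.
final class LineWordCounter {

    // Returns the number of whitespace-separated words in one line.
    static int countWords(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return 0; // "".split("\\s+") would otherwise report one empty token
        }
        return trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        // Stand-in for the lines a Scanner would return from the file
        String[] lines = { "This is a text file", "with a bunch of", "", "  words." };
        int total = 0;
        for (String line : lines) {
            total += countWords(line);
        }
        System.out.println("Word count is " + total); // Word count is 10
    }
}
```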
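First of all, remember to close your resources; try-with-resources does that for you.
Since Java 8 you can count words this way (it needs imports for java.io.IOException, java.nio.file.Files, java.nio.file.Paths, java.util.Arrays and java.util.stream.Stream):
String regex = "\\s+";
String filename = "textfile.txt";
long wordCnt = 0;
try (Stream<String> lines = Files.lines(Paths.get(filename))) {
    wordCnt = lines
        .flatMap(str -> Arrays.stream(str.trim().split(regex))) // one stream element per token
        .filter(word -> !word.isEmpty()) // blank lines produce an empty token
        .count();
} catch (IOException e) {
    e.printStackTrace();
}
System.out.println("Word count is " + wordCnt);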

How to extract ONLY words from a txt file in Java

So I have to extract data from a text file.
The text file is set up like this.
3400 Moderate
310 Light
etc.
I need to extract the numbers and store them in one array, and extract the strings and store them in another array, so I can do calculations on the numbers based on what's written in the string array, and then output that to a file. I've got the last part down; I just can't figure out how to separate the ints from the strings when I extract the data from the .txt file.
Here is what I have now, but it's just extracting the int and the word as a String.
import java.io.*;
import java.util.*;
public class HorseFeed {
public static void main(String[] args){
Scanner sc = null;
try {
sc = new Scanner(new File("C:\\Users\\Patric\\Desktop\\HorseWork.txt"));
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
List<String> lines = new ArrayList<String>();
while (sc.hasNextLine()) {
lines.add(sc.nextLine());
}
String[] arr = lines.toArray(new String[0]);
for(int i = 0; i< 100; i++){
System.out.print(arr[i]);
}
}
}
Use split(String regex) in String class. Set the regex to search for whitespaces OR digits. It will return a String[] which contains words.
If you are analyzing it line by line, you would want another String[] in which you would append all the words from the new lines.
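A minimal sketch of that idea, assuming lines shaped like the sample data. The class name WordExtractor and the method wordsOnly are hypothetical. Note that split leaves an empty first element when the line starts with a digit, so empty pieces are skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating split on "whitespace OR digits".
final class WordExtractor {

    // Returns the words on a line, dropping digits and whitespace.
    static List<String> wordsOnly(String line) {
        List<String> words = new ArrayList<>();
        // [\s\d]+ matches any run of whitespace and/or digits
        for (String part : line.split("[\\s\\d]+")) {
            if (!part.isEmpty()) { // skip the empty piece before a leading number
                words.add(part);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> all = new ArrayList<>();
        all.addAll(wordsOnly("3400 Moderate"));
        all.addAll(wordsOnly("310 Light"));
        System.out.println(all); // [Moderate, Light]
    }
}
```

Collecting into a List avoids having to size a second String[] in advance; call toArray at the end if an array is required.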
Please follow this code:
import java.io.*;
import java.util.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class HorseFeed {
    public static void main(String[] args) throws FileNotFoundException, IOException {
        List<String> lineList = new ArrayList<String>();
        // Compile the pattern once, outside the loop
        Pattern pattern = Pattern.compile("[0-9]+");
        // try-with-resources closes the reader when done
        try (BufferedReader br = new BufferedReader(new FileReader(new File("C:\\Users\\Patric\\Desktop\\HorseWork.txt")))) {
            String line;
            while ((line = br.readLine()) != null) {
                // find() locates each run of digits; matches() would require the whole line to be digits
                Matcher matcher = pattern.matcher(line);
                while (matcher.find()) {
                    lineList.add(matcher.group());
                }
            }
        }
    }
}
Here lineList contains your integers (as strings).
This should work:
import java.io.*;
import java.util.*;
public class HorseFeed {
public static void main(String[] args) throws FileNotFoundException {
List<Integer> intList = new ArrayList<Integer>();
List<String> strList = new ArrayList<String>();
Scanner sc = new Scanner(new File("C:\\Users\\Patric\\Desktop\\HorseWork.txt"));
while (sc.hasNextLine()) {
String line = sc.nextLine();
String[] lineParts = line.split("\\s+");
Integer intValue = Integer.parseInt(lineParts[0]);
String strValue = lineParts[1];
intList.add(intValue);
strList.add(strValue);
}
System.out.println("Values: ");
for(int i = 0; i < intList.size(); i++) {
System.out.print("\t" + intList.get(i) + ": " + strList.get(i));
}
}
}
First extract all the text of the file and store it in a String. Then use the replaceAll method of the String class with a pattern to remove the digits from it.
Example:
String fileText = "welcome 2 java";
String ss = fileText.replaceAll("-?\\d+", "");
System.out.println(ss);

Code not printing anything

I am writing code that reads in a text file, given as a command-line argument to the main method, and prints each word in it on its own line without printing any word more than once. It does not print anything. Can anyone help?
import java.util.*;
import java.io.*;
public class Tokenization {
public static void main(String[] args) throws Exception{
String x = "";
String y = "";
File file = new File(args[0]);
Scanner s = new Scanner(file);
String [] words = null;
while (s.hasNext()){
x = s.nextLine();
}
words = x.split("\\p{Punct}");
String [] moreWords = null;
for (int i = 0; i < words.length;i++){
y = y + " " + words[i];
}
moreWords = y.split("\\s+");
String [] unique = unique(moreWords);
for (int i = 0;i<unique.length;i++){
System.out.println(unique[i]);
}
s.close();
}
public static String[] unique (String [] s) {
String [] uniques = new String[s.length];
for (int i = 0; i < s.length;i++){
for(int j = i + 1; j < s.length;j++){
if (!s[i].equalsIgnoreCase(s[j])){
uniques[i] = s[i];
}
}
}
return uniques;
}
}
You have several problems:
you're reading the whole file line by line, but each line overwrites x, so only the last line is kept
you're doing 2 splits, both on a regexp; 1 is enough
in unique you fill only some slots of the array; the other slots stay null
Here is a shorter version of what you need:
import java.io.File;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;
public class Tokenization {
public static void main(String[] args) throws Exception {
Set<String> words = new HashSet<String>();
try {
File file = new File(args[0]);
Scanner scanner = new Scanner(file);
while (scanner.hasNext()) {
String[] lineWords = scanner.nextLine().split("[\\p{Punct}\\s]+");
for (String s : lineWords)
if (!s.isEmpty()) // split yields "" for blank lines or leading punctuation
words.add(s.toLowerCase());
}
scanner.close();
} catch (Exception e) {
System.out.println("Cannot read file [" + e.getMessage() + "]");
System.exit(1);
}
for (String s : words)
System.out.println(s);
}
}
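If the exercise calls for an array-based unique method rather than a Set, the original loop can be repaired by comparing each word with the words already kept and trimming the unused slots at the end. This is only a sketch of one possible fix, not the approach used in the answer above; the class name UniqueWords is made up.

```java
import java.util.Arrays;

// Hypothetical array-only variant of the unique() method from the question.
final class UniqueWords {

    // Returns the distinct words of the input in first-seen order, ignoring case.
    static String[] unique(String[] words) {
        String[] kept = new String[words.length];
        int count = 0;
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // split can produce empty strings; they are not words
            }
            boolean seen = false;
            // Compare against the words kept so far, not against the rest of the input
            for (int j = 0; j < count; j++) {
                if (kept[j].equalsIgnoreCase(word)) {
                    seen = true;
                    break;
                }
            }
            if (!seen) {
                kept[count++] = word;
            }
        }
        return Arrays.copyOf(kept, count); // drop the unused null slots
    }

    public static void main(String[] args) {
        String[] input = { "", "To", "be", "or", "not", "to", "be" };
        System.out.println(Arrays.toString(unique(input))); // [To, be, or, not]
    }
}
```

This is quadratic in the number of words, which is fine for small files; the HashSet version scales better.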

Check if a file contains strings and create an array for new strings

I need to create a method that will read the file, and check each word in the file. Each new word in the file should be stored in a string array. The method should be case insensitive. Please help.
The file says the following:
Ask not what your country can do for you
ask what you can do for your country
So the array should only contain: ask, not, what, your, country, can, do, for, you
import java.util.*;
import java.io.*;
public class TextAnalysis {
public static void main (String [] args) throws IOException {
File in01 = new File("a5_testfiles/in01.txt");
Scanner fileScanner = new Scanner(in01);
System.out.println("TEXT FILE STATISTICS");
System.out.println("--------------------");
System.out.println("Length of the longest word: " + longestWord(fileScanner));
System.out.println("Number of words in file wordlist: " );
countWords();
System.out.println("Word-frequency statistics");
}
public static String longestWord (Scanner s) {
String longest = "";
while (s.hasNext()) {
String word = s.next();
if (word.length() > longest.length()) {
longest = word;
}
}
return (longest.length() + " " + "(\"" + longest + "\")");
}
public static void countWords () throws IOException {
File in01 = new File("a5_testfiles/in01.txt");
Scanner fileScanner = new Scanner(in01);
int count = 0;
while(fileScanner.hasNext()) {
String word = fileScanner.next();
count++;
}
System.out.println("Number of words in file: " + count);
}
public static int wordList (int words) {
File in01 = new File("a5_testfiles/in01.txt");
Scanner fileScanner = new Scanner(in01);
int size = words;
String [] list = new String[size];
for (int i = 0; i <= size; i++) {
while(fileScanner.hasNext()){
if(!list[].contains(fileScanner.next())){
list[i] = fileScanner.next();
}
}
}
}
}
You could take advantage of my following code snippet (it will not store the duplicate words)!
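File file = new File("names.txt");
StringBuilder sb = new StringBuilder();
char[] c = new char[256];
// try-with-resources closes the reader automatically
try (FileReader fr = new FileReader(file)) {
    int n;
    // read() returns how many chars were filled; append only that many
    while ((n = fr.read(c)) > 0) {
        sb.append(c, 0, n);
    }
}
// split on any run of whitespace so line breaks also separate words
String[] ss = sb.toString().toLowerCase().trim().split("\\s+");
TreeSet<String> ts = new TreeSet<String>();
for (String s : ss)
    ts.add(s);
for (String s : ts) {
    System.out.println(s);
}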
And the output is:
ask
can
country
do
for
not
what
you
your
You could always just try:
List<String> words = new ArrayList<String>();
//read lines in your file all at once
List<String> allLines = Files.readAllLines(yourFile, Charset.forName("UTF-8"));
for(int i = 0; i < allLines.size(); i++) {
//change each line from your file to an array of words using "split(" ")".
//Then add all those words to the list "words"
words.addAll(Arrays.asList(allLines.get(i).split(" ")));
}
//convert the list of words to an array.
String[] arr = words.toArray(new String[words.size()]);
Using Files.readAllLines(yourFile, Charset.forName("UTF-8")) to read all the lines of yourFile (a java.nio.file.Path) is much cleaner than reading each line individually. Keep in mind that the number of lines is not the number of words: a line usually holds several words, so each line still has to be split before you count or store anything.
Alternatively, if you are not on Java 7 or later, you can build the list of lines as follows and then split them into words at the end:
List<String> allLines = new ArrayList<String>();
Scanner fileScanner = new Scanner(yourFile);
while (fileScanner.hasNextLine()) {
allLines.add(fileScanner.nextLine());
}
fileScanner.close();
Then split each line as shown in the previous code and create your array. Ideally, also wrap the scanner in a try/catch block rather than declaring throws.
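The snippets above collect every word, duplicates included. To get the array the question asks for (each new word once, case-insensitive, in the order of first appearance), one option is a LinkedHashSet. The sketch below assumes the lines have already been read into a List<String>; the class name UniqueWordList and the method uniqueWords are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; takes the lines already read from the file.
final class UniqueWordList {

    // Returns each distinct word once, lower-cased, in first-seen order.
    static String[] uniqueWords(List<String> lines) {
        Set<String> seen = new LinkedHashSet<>(); // keeps insertion order, ignores repeats
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) { // blank lines yield one empty token
                    seen.add(word.toLowerCase(Locale.ROOT));
                }
            }
        }
        return seen.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Ask not what your country can do for you",
                "ask what you can do for your country");
        System.out.println(Arrays.toString(uniqueWords(lines)));
        // [ask, not, what, your, country, can, do, for, you]
    }
}
```

A TreeSet would work too, but it sorts the words alphabetically instead of keeping the order in which they appear in the file.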

Error while counting number of character,lines and words in java

I have written the following code to count the number of characters excluding white space, the number of words, and the number of lines, but my code is not showing the proper output.
import java.io.*;
class FileCount
{
public static void main(String args[]) throws Exception
{
FileInputStream file=new FileInputStream("sample.txt");
BufferedReader br=new BufferedReader(new InputStreamReader(file));
int i;
int countw=0,countl=0,countc=0;
do
{
i=br.read();
if((char)i==(' '))
countw++;
else if((char)i==('\n'))
countl++;
else
countc++;
}while(i!=-1);
System.out.println("Number of words:"+countw);
System.out.println("Number of lines:"+countl);
System.out.println("Number of characters:"+countc);
}
}
my file sample.txt has
hi my name is john
hey whts up
and my output is
Number of words:6
Number of lines:2
Number of characters:26
You need to discard other whitespace characters as well including repeats, if any. A split around \\s+ gives you words separated by not only all whitespace characters but also any appearance of those characters in succession.
Having got a list of all words in the line it gets easier to update the count of words and characters using length methods of array and String.
Something like this will give you the result:
String line = null;
String[] words = null;
while ((line = br.readLine()) != null) {
countl++;
words = line.split("\\s+");
countw += words.length;
for (String word : words) {
countc += word.length();
}
}
A new line also means that a word ends.
=> There is not always a ' ' after each word.
do
{
i=br.read();
if((char)i==(' '))
countw++;
else if((char)i==('\n')){
countl++;
countw++; // new line means also end of word
}
else
countc++;
}while(i!=-1);
End of file should also increase the number of words (if no ' ' or '\n' was the last character).
Also, more than one space between words is still not handled correctly.
=> You should think about more changes in your approach to handle this.
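One way to handle all of these cases in a single pass is to track whether the loop is currently inside a word and to count a word only at the transition from whitespace to non-whitespace. The sketch below works on a String to stay self-contained; the same loop body can be applied to the characters returned by br.read(). The class name WcSketch and the shape of the returned array are assumptions for this example.

```java
// Hypothetical single-pass counter using an "in word" flag.
final class WcSketch {

    // Returns {lines, words, chars, nonWhitespaceChars} for the given text.
    static int[] count(String text) {
        int lines = 0, words = 0, chars = 0, nonWhitespace = 0;
        boolean inWord = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            chars++; // every character, whitespace included
            if (c == '\n') {
                lines++;
            }
            if (Character.isWhitespace(c)) {
                inWord = false; // spaces, tabs and line breaks all end a word
            } else {
                nonWhitespace++;
                if (!inWord) { // first character of a new word
                    inWord = true;
                    words++;
                }
            }
        }
        return new int[] { lines, words, chars, nonWhitespace };
    }

    public static void main(String[] args) {
        int[] r = count("hi my name is john\nhey whts up\n");
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]); // 2 8 31 23
    }
}
```

Because a word is counted when it starts rather than when a delimiter is seen, repeated spaces and a missing trailing newline no longer change the result.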
import java.io.*;
class FileCount {
public static void main(String args[]) throws Exception {
FileInputStream file = new FileInputStream("sample.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(file));
int i;
int countw = 0, countl = 0, countc = 0;
do {
i = br.read();
if ((char) i == (' ')) { // You should also check for other delimiters, such as tabs, etc.
countw++;
}
if ((char) i == ('\n')) { // '\n' ends a line on Linux; Windows files end lines with "\r\n", so there the '\r' is counted as an extra character
countw++; // Newlines also delimit words
countl++;
} // Removed else. Newlines and spaces are also characters
if (i != -1) {
countc++; // Don't count EOF as character
}
} while (i != -1);
System.out.println("Number of words " + countw);
System.out.println("Number of lines " + countl); // Print lines instead of words
System.out.println("Number of characters " + countc);
}
}
Output:
Number of words 8
Number of lines 2
Number of characters 31
Validation
$ wc sample.txt
2 8 31 sample.txt
Try this:
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
public class FileCount {
/**
*
* @param filename
* @return three-element int array. Index 0 is number of lines
* index 1 is number of words, index 2 is number of characters
* (excluding newlines)
*/
public static int[] getStats(String filename) throws IOException {
FileInputStream file = new FileInputStream(filename);
BufferedReader br = new BufferedReader(new InputStreamReader(file));
int[] stats = new int[3];
String line;
while ((line = br.readLine()) != null) {
stats[0]++;
stats[1] += line.split(" ").length;
stats[2] += line.length();
}
return stats;
}
public static void main(String[] args) {
int[] stats = new int[3];
try {
stats = getStats("sample.txt");
} catch (IOException e) {
System.err.println(e.toString());
}
System.out.println("Number of words:" + stats[1]);
System.out.println("Number of lines:" + stats[0]);
System.out.println("Number of characters:" + stats[2]);
}
}
