I want to print out the total number of letters (not including whitespace characters) of all the Latin names in the data file. Duplicate letters must be counted. This is what I have done so far:
List<Person> peopleFile = new ArrayList<>();
int numberOfLetters = 0;
try {
BufferedReader br = new BufferedReader(new FileReader("people_data.txt"));
String fileRead = br.readLine();
while (fileRead != null) {
String[] tokenSize = fileRead.split(":");
String commonName = tokenSize[0];
String latinName = tokenSize[1];
Person personObj = new Person(commonName, latinName);
peopleFile.add(personObj);
fileRead = br.readLine();
// Iterating each word
for (String s: tokenSize) {
// Updating the numberOfLetters
numberOfLetters += s.length();
}
}
br.close();
}
catch (FileNotFoundException e) {
System.out.println("file not found");
}
catch (IOException ex) {
System.out.println("An error has occured: " + ex.getMessage());
}
System.out.print("Total number of letters in all Latin names = ");
System.out.println(numberOfLetters);
The problem is that it prints the number of letters in the whole file; I only want it to print the number of letters in the Latin names.
The text file:
David Lee:Cephaloscyllium ventriosum
Max Steel:Galeocerdo cuvier
Jimmy Park:Sphyrna mokarren
What you are doing wrong is adding up the length of every token on the line, even though you have already split the line into the common name and the Latin name. You can use this method to count the letters of any string or sentence.
public static int countLetter(String name) {
int count = 0;
if(name != null && !name.isEmpty()) {
/* This regular expression is splitting String at the
* sequence of Non-alphabetic characters. Hence actually
* splitting the Name into group of words */
String[] tokens = name.split("[^a-zA-Z]+");
for(String token : tokens) {
count += token.length();
}
}
return count;
}
And replace these lines
/* Note: here you are iterating all your Names from each line */
for (String s: tokenSize) {
// Updating the numberOfLetters
numberOfLetters += s.length();
}
with this
numberOfLetters += countLetter(latinName);
Does that make sense? I hope this helps you find your problem.
NB: you can experiment with this regex here
Remove all the whitespace before adding up the length:
s=s.replaceAll("[ \n\t]+","");
numberOfLetters += s.length();
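The two suggestions above can be combined into one small sketch. It assumes each line has the form "common name:Latin name", as in the sample file, and reads the lines from an in-memory list instead of people_data.txt so that it runs on its own. The class and method names (LatinNameCounter, countLetters, totalLatinLetters) are made up for illustration, and Character.isLetter is used here in place of the regex.

```java
import java.util.Arrays;
import java.util.List;

class LatinNameCounter {
    // Counts letters only; spaces, digits and punctuation are skipped.
    static int countLetters(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isLetter(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    // Sums the letters of the part after the first ':' on each line.
    // Lines without a ':' are ignored (an assumption, not from the question).
    static int totalLatinLetters(List<String> lines) {
        int total = 0;
        for (String line : lines) {
            String[] parts = line.split(":", 2);
            if (parts.length == 2) {
                total += countLetters(parts[1]);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "David Lee:Cephaloscyllium ventriosum",
                "Max Steel:Galeocerdo cuvier",
                "Jimmy Park:Sphyrna mokarren");
        // 25 + 16 + 15 letters, so this prints 56
        System.out.println("Total number of letters in all Latin names = "
                + totalLatinLetters(lines));
    }
}
```

The important part is that only parts[1], the Latin name, is passed to the counter, so the common name never contributes to the total.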
I've got the following code that opens and read a file and separates it to words.
My problem is building an array of these words in alphabetical order.
import java.io.*;
class MyMain {
public static void main(String[] args) throws IOException {
File file = new File("C:\\Kennedy.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
String line = null;
int line_count=0;
int byte_count;
int total_byte_count=0;
int fromIndex;
while( (line = br.readLine())!= null ){
line_count++;
fromIndex=0;
String [] tokens = line.split(",\\s+|\\s*\\\"\\s*|\\s+|\\.\\s*|\\s*\\:\\s*");
String line_rest=line;
for (int i=1; i <= tokens.length; i++) {
byte_count = line_rest.indexOf(tokens[i-1]);
//if ( tokens[i-1].length() != 0)
//System.out.println("\n(line:" + line_count + ", word:" + i + ", start_byte:" + (total_byte_count + fromIndex) + "' word_length:" + tokens[i-1].length() + ") = " + tokens[i-1]);
fromIndex = fromIndex + byte_count + 1 + tokens[i-1].length();
if (fromIndex < line.length())
line_rest = line.substring(fromIndex);
}
total_byte_count += fromIndex;
}
}
}
I would read the file with a Scanner¹ (and I would prefer the File(String, String) constructor to provide the parent folder). You should also remember to close your resources explicitly in a finally block, or use a try-with-resources statement. Finally, for sorting you can store your words in a TreeSet, in which the elements are ordered using their natural ordering². Something like:
File file = new File("C:/", "Kennedy.txt");
try (Scanner scanner = new Scanner(file)) {
Set<String> words = new TreeSet<>();
int line_count = 0;
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
line_count++;
String[] tokens = line.split(",\\s+|\\s*\\\"\\s*|\\s+|\\.\\s*|\\s*\\:\\s*");
Stream.of(tokens).forEach(word -> words.add(word));
}
System.out.printf("The file contains %d lines, and in alphabetical order [%s]%n",
line_count, words);
} catch (Exception e) {
e.printStackTrace();
}
¹ Mainly because it requires less code.
² Or by a Comparator provided at set creation time.
If you are storing the tokens in a String array, use Arrays.sort() to sort the array in place. Because the elements are Strings, they are sorted by their natural (lexicographic) ordering.
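As a minimal sketch of the Arrays.sort() suggestion: the class and method names below (SortTokens, sortedNatural, sortedIgnoringCase) are hypothetical, and the line is split on whitespace only rather than with the longer pattern from the question. Note that the natural ordering of String is case-sensitive, so uppercase letters sort before lowercase ones; String.CASE_INSENSITIVE_ORDER is one way around that.

```java
import java.util.Arrays;

class SortTokens {
    // Natural (case-sensitive) ordering: "Banana" sorts before "apple".
    static String[] sortedNatural(String line) {
        String[] tokens = line.trim().split("\\s+");
        Arrays.sort(tokens);
        return tokens;
    }

    // Case-insensitive ordering via the comparator built into String.
    static String[] sortedIgnoringCase(String line) {
        String[] tokens = line.trim().split("\\s+");
        Arrays.sort(tokens, String.CASE_INSENSITIVE_ORDER);
        return tokens;
    }

    public static void main(String[] args) {
        String line = "pear apple Banana";
        System.out.println(Arrays.toString(sortedNatural(line)));      // [Banana, apple, pear]
        System.out.println(Arrays.toString(sortedIgnoringCase(line))); // [apple, Banana, pear]
    }
}
```

Unlike a TreeSet, a sorted array keeps duplicate words, which may or may not be what you want.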
I am practicing writing a program that takes a text file from the user and reports data such as the number of characters, words, and lines in the text.
I have searched and looked over the same topic but cannot find a way to make my code run.
public class Document{
private Scanner sc;
// Sets users input to a file name
public Document(String documentName) throws FileNotFoundException {
File inputFile = new File(documentName);
try {
sc = new Scanner(inputFile);
} catch (IOException exception) {
System.out.println("File does not exists");
}
}
public int getChar() {
int Char= 0;
while (sc.hasNextLine()) {
String line = sc.nextLine();
Char += line.length() + 1;
}
return Char;
}
// Gets the number of words in a text
public int getWords() {
int Words = 0;
while (sc.hasNext()) {
String line = sc.next();
Words += new StringTokenizer(line, " ,").countTokens();
}
return Words;
}
public int getLines() {
int Lines= 0;
while (sc.hasNextLine()) {
Lines++;
}
return Lines;
}
}
Main method:
public class Main {
public static void main(String[] args) throws FileNotFoundException {
DocStats doc = new DocStats("someText.txt");
// outputs 1451, should be 1450
System.out.println("Number of characters: "
+ doc.getChar());
// outputs 0, should be 257
System.out.println("Number of words: " + doc.getWords());
// outputs 0, should be 49
System.out.println("Number of lines: " + doc.getLines());
}
}
I know exactly why I get 1451 instead of 1450. The reason is that there is no '\n' at the end of the last sentence, but my method still adds
numChars += line.length() + 1;
However, I cannot find a solution to why I get 0 for words and lines.
*My text includes characters such as: ? , - '
After all, could anyone help me to make this work?
**So far, the problem that concerns me is how to get the number of characters when the last sentence does not end with '\n'. Is there a chance I could fix that with an if statement?
-Thank you!
After doc.getChar() you have reached the end of file. So there's nothing more to read in this file!
You should reset your scanner in your getChar/Words/Lines methods, such as:
public int getChar() {
sc = new Scanner(inputFile);
...
// solving your problem with the last '\n'
while (sc.hasNextLine()) {
String line = sc.nextLine();
if (sc.hasNextLine())
Char += line.length() + 1;
else
Char += line.length();
}
return Char;
}
Please note that a line ending is not always \n! It might also be \r\n (especially on Windows)!
public int getWords() {
sc = new Scanner(inputFile);
...
public int getLines() {
sc = new Scanner(inputFile);
...
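The idea of giving each method its own fresh Scanner can be sketched as follows. To keep the example self-contained it scans a String held in memory instead of the file, which is an assumption made purely for illustration; with a file you would create new Scanner(inputFile) in the same places and handle FileNotFoundException. The class name TextStats is hypothetical. One more detail worth noting: hasNextLine() does not advance the scanner, so a line-counting loop also has to call nextLine(), otherwise it never terminates on non-empty input.

```java
import java.util.Scanner;

class TextStats {
    private final String text;

    TextStats(String text) {
        this.text = text;
    }

    // Counts the characters of every line plus one per separator between
    // lines; a "\r\n" separator is counted as one, a trailing one is not counted.
    int getChars() {
        int chars = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextLine()) {
                chars += sc.nextLine().length();
                if (sc.hasNextLine()) {
                    chars++;
                }
            }
        }
        return chars;
    }

    // Counts whitespace-separated tokens.
    int getWords() {
        int words = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNext()) {
                sc.next();
                words++;
            }
        }
        return words;
    }

    // Counts lines; nextLine() is what actually moves the scanner forward.
    int getLines() {
        int lines = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextLine()) {
                sc.nextLine();
                lines++;
            }
        }
        return lines;
    }
}
```

Because every method opens and closes its own Scanner, the three methods can be called in any order and any number of times.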
I would use a single pass to calculate all three, with separate counters: just loop over each character, check whether it starts a new word and so on, and increase the counts. Use Character.isWhitespace for the whitespace check.
import java.io.*;
/** Count lines, characters and words. Assumes every run of non-whitespace characters is a word, so even () is a word. */
public class ChrCounts{
String data;
int chrCnt;
int lineCnt;
int wordCnt;
public static void main(String args[]){
ChrCounts c = new ChrCounts();
try{
InputStream data = null;
if(args == null || args.length < 1){
data = new ByteArrayInputStream("quick brown foxes\n\r new toy\'s a fun game.\nblah blah.la la ga-ma".getBytes("utf-8"));
}else{
data = new BufferedInputStream( new FileInputStream(args[0]));
}
c.process(data);
c.print();
}catch(Exception e){
System.out.println("ee " + e);
e.printStackTrace();
}
}
public void print(){
System.out.println("line cnt " + lineCnt + "\nword cnt " + wordCnt + "\n chrs " + chrCnt);
}
public void process(InputStream data) throws Exception{
int chrCnt = 0;
int lineCnt = 0;
int wordCnt = 0;
boolean inWord = false;
boolean inNewline = false;
//char prev = ' ';
while(data.available() > 0){
int j = data.read();
if(j < 0)break;
chrCnt++;
final char c = (char)j;
//prev = c;
if(c == '\n' || c == '\r'){
chrCnt--;//some editors do not count line separators as characters
inWord = false;
if(!inNewline){
inNewline = true;
lineCnt++;
}else{
//chrCnt--;//some editors don't count adjacent line separators as characters
}
}else{
inNewline = false;
if(Character.isWhitespace(c)){
inWord = false;
}else{
if(!inWord){
inWord = true;
wordCnt++;
}
}
}
}
//we had some data and last char was not in new line, count last line
if(chrCnt > 0 && !inNewline){
lineCnt++;
}
this.chrCnt = chrCnt;
this.lineCnt = lineCnt;
this.wordCnt = wordCnt;
}
}
I am currently trying to compare the lines in a text file to find the shortest and the longest line and display how many characters each contains. The code listed below already counts all the characters, words, and lines, but I have no idea where to start comparing the lines. Any help would be appreciated.
import java.util.Scanner;
import java.io.*;
public class Test{
public static void main(String [] args){
System.out.println("Please enter the filename: ");
Scanner input = new Scanner(System.in);
String fileName = input.nextLine();
FileReader fReader;
try {
fReader = new FileReader(fileName);
BufferedReader reader = new BufferedReader(fReader);
String cursor; //
String content = "";
int lines = 0;
int words = 0;
int chars = 0;
while((cursor = reader.readLine()) != null){
// count lines
lines += 1;
content += cursor;
// count words
String []_words = cursor.split(" ");
for( String w : _words)
{
words++;
}
}
chars = content.length();
System.out.println("The filename is " + fileName);
System.out.println(chars + " Characters,");
System.out.println(words + " words and " + lines + " lines.");
} catch (FileNotFoundException ex) {
// Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex);
System.out.println("File not found!");
} catch (IOException ex) {
//Logger.getLogger(Main.class.getName()).log(Level.SEVERE, null, ex);
System.out.println("An error has occured: " + ex.getMessage());
}
}
}
You need two variables to store the shortest and the longest line:
String longest = "";
String shortest = "";
Then in your existing code, compare with current line:
while((cursor = reader.readLine()) != null){
// compare shortest and longest; the first line (lines == 0) initializes both.
int currentSize = cursor.length();
if (lines == 0 || currentSize > longest.length()) {
longest = cursor;
}
if (lines == 0 || currentSize < shortest.length()) {
shortest = cursor;
}
// count lines
lines += 1;
content += cursor;
// count words
String []_words = cursor.split(" ");
for( String w : _words)
{
words++;
}
}
After the loop you can do what you need with results:
System.out.println("Longest line has " + longest.length());
System.out.println("Shortest line has " + shortest.length());
If you only need the sizes and not the lines you can create int variables.
int longest = 0;
int shortest = Integer.MAX_VALUE; // any real line is at most this long
// then inside the loop
int currentSize = cursor.length();
if (currentSize > longest) {
longest = currentSize;
}
if (currentSize < shortest) {
shortest = currentSize;
}
You need 2 String variables, one to hold the shortest String and one to hold the longest String. Then as you process each line, compare the length of the current line to the shortest/longest.
If it is shorter than your shortest String, set the shortest String to the current line.
else
If it is longer than your longest String, set the longest String to the current line.
Process the results at the end on those two String variables.
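The approach described above can be sketched on its own, separate from the file-reading code. This version assumes the lines have already been read into a list; the class and method names (LineLengths, longest, shortest) are hypothetical. It uses null as the "nothing seen yet" marker, so an empty line in the input is still treated as a real line of length 0.

```java
import java.util.Arrays;
import java.util.List;

class LineLengths {
    // Returns the first line with the greatest length, or null for an empty list.
    static String longest(List<String> lines) {
        String longest = null;
        for (String line : lines) {
            if (longest == null || line.length() > longest.length()) {
                longest = line;
            }
        }
        return longest;
    }

    // Returns the first line with the smallest length, or null for an empty list.
    static String shortest(List<String> lines) {
        String shortest = null;
        for (String line : lines) {
            if (shortest == null || line.length() < shortest.length()) {
                shortest = line;
            }
        }
        return shortest;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("alpha", "be", "gamma ray");
        System.out.println("Longest line has " + longest(lines).length());   // 9
        System.out.println("Shortest line has " + shortest(lines).length()); // 2
    }
}
```

The two checks are independent of each other on purpose: a single line can be both the longest and the shortest seen so far, which is always the case for the first line.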
I was programming in Python, but now I want to write the same code in Java. Can you help me, please? This is the code that I was working on:
import random
import re
a = "y"
while a == "y":
    i = input('Search: ')
    b = i.lower()
    word2 = ""
    for letter in b:
        lista = []
        with open('d:\lista.txt', 'r') as inF:
            for item in inF:
                if item.startswith(letter):
                    lista.append(item)
        word = random.choice(lista)
        word2 = word2 + word
    print(word2)
    a = input("Again? ")
Now I want to do the same in Java, but I'm not really sure how to do it. It's not that easy, and I'm just a beginner. So far I have found some code that searches a text file, but I'm stuck.
This is the Java code. It finds the position of the word. I've been trying to modify it, without getting the results I'm looking for.
import java.io.*;
import java.util.Scanner;
class test {
public static void main(String[] args){
Scanner input = new Scanner(System.in);
System.out.println("Search: ");
String searchText = input.nextLine();
String fileName = "lista.txt";
StringBuilder sb = new StringBuilder();
try {
BufferedReader reader = new BufferedReader(new FileReader(fileName));
while (reader.ready()) {
sb.append(reader.readLine());
}
}
catch(IOException ex) {
ex.printStackTrace();
}
String fileText = sb.toString();
System.out.println("Position in file : " + fileText.indexOf(searchText));
}
}
What I want is to find an item in a text file, a list, but just want to show the items that begin with the letters of the string I want to search. For example, I have the string "urgent" and the text file contains:
baby
redman
love
urban
gentleman
game
elephant
night
todd
So the display would be "urban"+"redman"+"gentleman"+ until it reaches the end of the string.
Let's assume that you've already tokenized the string so you've got a list of Strings, each containing a single word. It's what comes from the reader if you've got one word per line, which is how your Python code is written.
String[] haystack = {"baby", "redman", "love", "urban", "gentleman", "game",
"elephant", "night", "todd"};
Now, to search for a needle, you can simply compare the first character of each haystack word to all the characters of the needle:
String needle = "urgent";
for (String s : haystack) {
for (int i = 0; i < needle.length(); ++i) {
if (s.charAt(0) == needle.charAt(i)) {
System.out.println(s);
break;
}
}
}
This solution runs in O(|needle| * |haystack|).
To improve it a bit at the cost of a little extra memory, we can precompute a hash set of the allowed first letters:
String needle = "urgent";
Set<Character> lookup = new HashSet<Character>();
for (int i = 0; i < needle.length(); ++i) {
lookup.add(needle.charAt(i));
}
for (String s : haystack) {
if (lookup.contains(s.charAt(0))) {
System.out.println(s);
}
}
The second solution runs in O(|needle| + |haystack|).
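Both loops above print the matches in the order of the word list. If the output should instead follow the letters of the search string, as in the "urban" + "redman" + "gentleman" example from the question, the loops can be swapped. The sketch below is one way to do that; the names NeedleOrderSearch and wordsForLetters are hypothetical, and it takes the first matching word for each letter rather than a random one so that the result is deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NeedleOrderSearch {
    // For each letter of the needle, in order, picks the first word in the
    // haystack that starts with that letter. Letters without a match are skipped.
    static List<String> wordsForLetters(String needle, List<String> haystack) {
        List<String> result = new ArrayList<>();
        for (char letter : needle.toCharArray()) {
            for (String word : haystack) {
                if (!word.isEmpty() && word.charAt(0) == letter) {
                    result.add(word);
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> haystack = Arrays.asList("baby", "redman", "love", "urban",
                "gentleman", "game", "elephant", "night", "todd");
        // prints [urban, redman, gentleman, elephant, night, todd]
        System.out.println(wordsForLetters("urgent", haystack));
    }
}
```

This is again O(|needle| * |haystack|); grouping the words by first letter in a map beforehand would bring the lookup per letter down to constant time.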
This works if your list of words isn't too large. If your list of words is large, you could adapt this so that you stream over the file multiple times, collecting the words to use.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
public class Test {
public static void main(String[] args) {
Map<Character, List<String>> map = new HashMap<Character, List<String>>();
File file = new File("./lista.txt");
BufferedReader reader = null;
try {
reader = new BufferedReader(new FileReader(file));
String line = null;
while ((line = reader.readLine()) != null) {
// assumes words are space separated with no
// quotes or commas
String[] tokens = line.split(" ");
for(String word : tokens) {
if(word.length() == 0) continue;
// might as well avoid case issues
word = word.toLowerCase();
Character firstLetter = Character.valueOf(word.charAt(0));
List<String> wordsThatStartWith = map.get(firstLetter);
if(wordsThatStartWith == null) {
wordsThatStartWith = new ArrayList<String>();
map.put(firstLetter, wordsThatStartWith);
}
wordsThatStartWith.add(word);
}
}
Random rand = new Random();
String test = "urgent";
List<String> words = new ArrayList<String>();
for (int i = 0; i < test.length(); i++) {
Character key = Character.valueOf(test.charAt(i));
List<String> wordsThatStartWith = map.get(key);
if(wordsThatStartWith != null){
String randomWord = wordsThatStartWith.get(rand.nextInt(wordsThatStartWith.size()));
words.add(randomWord);
} else {
// text file didn't contain any words that start
// with this letter, need to handle
}
}
for(String w : words) {
System.out.println(w);
}
} catch (Exception e) {
e.printStackTrace();
} finally {
if(reader != null) {
try {
reader.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
}
}
This assumes the content of lista.txt looks like
baby redman love urban gentleman game elephant night todd
And the output will look something like
urban
redman
gentleman
elephant
night
todd
I have a text file that contains 20,000 lines. The data in it is arranged in columns, but the spacing between columns varies and the column widths differ as well. For example:
aaaaa ()()()()()bdo()()()()()()()() ttttt ()() dgee ()()()()() yyyy
bbb()()()()()()()ggg ()()()()()()()( fff()()()(gbe()()()()()()( yHH
cc()()()()()()()()dddd()()()()()()() I ()()()()bdeg()()()()()()yyyyy
(Here the brackets represent spaces.)
Like that!!!
I want to replace the Nth (e.g. 4th) column with a specific word (e.g. "name").
Example output:
aaaaa ()()()()()bdo()()()()()()()() ttttt ()() name ()()()()() yyyy
bbb()()()()()()()ggg ()()()()()()()( fff()()()(name()()()()()()( yHH
cc()()()()()()()()dddd()()()()()()() I ()()()()name()()()()()()yyyyy
(Here the brackets represent spaces.)
Can anyone help me with this?
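// Needs java.io.File, java.io.IOException, java.io.PrintWriter,
// java.util.ArrayList, java.util.List and java.util.Scanner.
public static void replaceColumn(int column, String word, File file) throws IOException {
// Read every line first: opening a PrintWriter on the same file truncates it.
List<String> lines = new ArrayList<>();
try (Scanner in = new Scanner(file)) {
while (in.hasNextLine()) {
lines.add(in.nextLine());
}
}
try (PrintWriter out = new PrintWriter(file)) {
for (String line : lines) {
// Split on runs of whitespace; column is a zero-based index.
String[] columns = line.trim().split("\\s+");
columns[column] = word;
// Note: the columns are re-joined with single spaces.
out.println(arrayToString(columns, " "));
}
}
}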
//Helper method
private static String arrayToString(Object[] array, String separator) {
if (array.length == 0) {
return "";
}
StringBuilder sb = new StringBuilder();
for (Object element : array) {
sb.append(element);
sb.append(separator);
}
sb.delete(sb.length() - separator.length(), sb.length());
return sb.toString();
}
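Splitting and re-joining a line loses the original spacing between the columns. If the spacing should be kept, as in the example output in the question, one alternative is to locate the Nth run of non-whitespace characters with a regular expression and replace only that span. The following is a sketch under that assumption; the class name ColumnReplacer is hypothetical, and the column number here is 1-based to match the "4th column" wording of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ColumnReplacer {
    // A column is taken to be any run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile("\\S+");

    // Replaces the column-th token (1-based) and leaves all whitespace as it was.
    // Lines with fewer columns are returned unchanged.
    static String replaceColumn(String line, int column, String word) {
        Matcher matcher = TOKEN.matcher(line);
        int index = 0;
        while (matcher.find()) {
            index++;
            if (index == column) {
                return line.substring(0, matcher.start()) + word
                        + line.substring(matcher.end());
            }
        }
        return line;
    }

    public static void main(String[] args) {
        String line = "aaaaa     bdo        ttttt  dgee     yyyy";
        // prints: aaaaa     bdo        ttttt  name     yyyy
        System.out.println(replaceColumn(line, 4, "name"));
    }
}
```

The whitespace around the replaced token is untouched, so later columns shift only when the new word has a different length than the old one. Applying this to each line read from the file, and writing the results out after reading has finished, gives the per-file version.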