Reading a sequence until the empty line - java

I am writing a Java program and need help reading its input, which is a sequence of lines, each containing two tokens separated by one or more spaces.
import java.util.Scanner;
class ArrayCustomer {
public static void main(String[] args) {
Customer[] array = new Customer[5];
Scanner aScanner = new Scanner(System.in);
int index = readInput(aScanner, array);
}
}

It is better to check value.trim().length().
The trim() method removes any leading and trailing whitespace, so a line that contains only spaces also counts as empty.
Also, you are assigning a String to a Customer. You need to create a Customer object from the String before storing it in the array.
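Here is a minimal sketch of the reading loop, assuming every non-empty line holds exactly two tokens and that the input stops at the first blank line. The Customer class below, its two fields, and the readCustomers name are hypothetical, since the asker's own class is not shown, and a List is used instead of the fixed-size array.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical stand-in for the asker's Customer class.
class Customer {
    final String name;
    final String id;

    Customer(String name, String id) {
        this.name = name;
        this.id = id;
    }
}

class CustomerInput {
    // Reads lines until the first empty (or whitespace-only) line or end of input.
    static List<Customer> readCustomers(Scanner in) {
        List<Customer> customers = new ArrayList<>();
        while (in.hasNextLine()) {
            String line = in.nextLine().trim();
            if (line.length() == 0) {
                break; // an empty line ends the sequence
            }
            // "\\s+" treats any run of whitespace as one separator
            String[] tokens = line.split("\\s+");
            if (tokens.length == 2) {
                customers.add(new Customer(tokens[0], tokens[1]));
            }
        }
        return customers;
    }

    public static void main(String[] args) {
        List<Customer> customers = readCustomers(new Scanner(System.in));
        System.out.println("Read " + customers.size() + " customers");
    }
}
```

Lines that do not contain exactly two tokens are silently skipped here; you may prefer to report them instead.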

Try this code. Replace "stuff.txt" with the file you want to read. The code uses the split() method of the String class to tokenize each line of text until the end of the file. split() takes a regular expression (here a single space) that determines where each line is broken into tokens.
import java.io.*;
import java.util.ArrayList;
public class ReadFile {
static ArrayList<String> AL = new ArrayList<String>();
public static void main(String[] args) {
try {
BufferedReader br = new BufferedReader(new FileReader("stuff.txt"));
String datLine;
while((datLine = br.readLine()) != null) {
AL.add(datLine); // add line of text to ArrayList
System.out.println(datLine); //print line
}
System.out.println("tokenizing...");
//loop through String array
for(String x: AL) {
//split each line into 2 segments based on the space between them
String[] tokens = x.split(" ");
//loop through the tokens array
for(int j=0; j<tokens.length; j++) {
//only print if j is even and j+1 is still inside the tokens array, to prevent an ArrayIndexOutOfBoundsException
if ( j % 2 ==0 && (j+1) < tokens.length) {
System.out.println(tokens[j] + " " + tokens[j+1]);
}
}
}
} catch(IOException ioe) {
System.out.println("this was thrown: " + ioe);
}
}
}
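Because split() takes a regular expression, splitting on a single space produces empty tokens whenever two tokens are separated by more than one space. A small sketch of the difference (the class and method names are only illustrative):

```java
import java.util.Arrays;

class SplitDemo {
    // Splits on exactly one space; consecutive spaces yield empty tokens.
    static String[] splitOnSingleSpace(String line) {
        return line.split(" ");
    }

    // Trims the line, then splits on runs of whitespace.
    static String[] splitOnWhitespace(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String line = "first   second"; // three spaces between the tokens
        System.out.println(Arrays.toString(splitOnSingleSpace(line))); // [first, , , second]
        System.out.println(Arrays.toString(splitOnWhitespace(line)));  // [first, second]
    }
}
```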

Related

search for multiple strings from a text file in java

I'm trying to search one text file for multiple words given by the user (I store them in an array). If a word is present in the file it should be displayed, and if it isn't, it shouldn't.
If the same word is entered more than once, it should be searched for only once.
The problem: searching for a single word works, but with multiple words the program keeps reporting that a word isn't present even when it is.
I would like to know where I should put the for loop and what else might need to change.
package search;
import java.io.*;
import java.util.Scanner;
public class Read {
public static void main(String[] args) throws IOException
{
Scanner sc = new Scanner(System.in);
String[] words=null;
FileReader fr = new FileReader("java.txt");
BufferedReader br = new BufferedReader(fr);
String s;
System.out.println("Enter the number of words:");
Integer n = sc.nextInt();
String wordsArray[] = new String[n];
System.out.println("Enter words:");
for(int i=0; i<n; i++)
{
wordsArray[i]=sc.next();
}
for (int i = 0; i <n; i++) {
int count=0; //Initialize the count to zero
while((s=br.readLine())!=null) //Reading Content from the file
{
{
words=s.split(" "); //Split the word using space
for (String word : words)
{
if (word.equals(wordsArray[i])) //Search for the given word
{
count++; //If Present increase the count by one
}
}
if(count == 1)
{
System.out.println(wordsArray[i] + " is unique in file ");
}
else if (count == 0)
{
System.out.println("The given word is not present in the file");
}
else
{
System.out.println("The given word is present in the file more than 1 time");
}
}
}
}
fr.close();
}
}
The code you wrote is error-prone. Remember that a while loop should always have a proper termination condition.
Try the following code:
import java.util.HashMap;
import java.util.Map;
public class Read {
public static void main(String[] args)
{
// Declaring the String
String paragraph = "These words can be searched";
// Declaring a HashMap of <String, Integer>
Map<String, Integer> hashMap = new HashMap<>();
// The words to count, hard-coded here
// instead of being read from the user.
String[] words = new String[]{"These", "can", "searched"};
for (String word : words) {
// Asking whether the HashMap contains the
// key or not. Will return null if not.
Integer integer = hashMap.get(word);
if (integer == null)
// Storing the word as key and its
// occurrence as value in the HashMap.
hashMap.put(word, 1);
else {
// Incrementing the value if the word
// is already present in the HashMap.
hashMap.put(word, integer + 1);
}
}
System.out.println(hashMap);
}
}
I've hard-coded the values here; you can read the words from the console and the paragraph from the file instead.
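As for why the original loop fails on the second word: the BufferedReader is already at the end of the file after the first pass, so every later word sees no lines at all. A sketch that walks the lines once and counts every search word in the same pass is shown below. To keep it self-contained it takes the lines as a List<String> (they could come from Files.readAllLines, for instance); the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class WordSearch {
    // Counts how often each search word occurs in the given lines.
    // Duplicate search words collapse into a single map key.
    static Map<String, Integer> countOccurrences(List<String> lines, String[] searchWords) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String w : searchWords) {
            counts.put(w, 0);
        }
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                Integer current = counts.get(token);
                if (current != null) { // only count the words we were asked about
                    counts.put(token, current + 1);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("these words can be searched", "words can repeat");
        String[] search = {"words", "these", "missing", "words"};
        System.out.println(countOccurrences(lines, search)); // {words=2, these=1, missing=0}
    }
}
```

A count of 0 means the word is absent, 1 means it is unique in the file, and anything larger means it occurs more than once.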
The 'proper' class to use for extracting words from text is java.text.BreakIterator.
You can try the following (it reads line by line, which helps with large files):
import java.text.BreakIterator;
import java.util.Arrays;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;
import java.nio.file.Files;
import java.nio.file.Paths;
public class WordFinder {
public static void main(String[] args) {
try {
if (args.length < 2) {
WordFinder.usage();
System.exit(1);
}
ArrayList<String> argv = new ArrayList<>(Arrays.asList(args));
String path = argv.remove(0);
List<String> found = WordFinder.findWords(Files.lines(Paths.get(path)), argv);
System.out.printf("Found the following word(s) in file at %s%n", path);
System.out.println(found);
} catch (Throwable t) {
t.printStackTrace();
}
}
public static List<String> findWords(Stream<String> lines, ArrayList<String> searchWords) {
List<String> result = new ArrayList<>();
BreakIterator boundary = BreakIterator.getWordInstance();
lines.forEach(line -> {
boundary.setText(line);
int start = boundary.first();
for (int end = boundary.next(); end != BreakIterator.DONE; start = end, end = boundary.next()) {
String candidate = line.substring(start, end);
if (searchWords.contains(candidate)) {
result.add(candidate);
searchWords.remove(candidate);
}
}
});
return result;
}
private static void usage() {
System.err.println("Usage: java WordFinder <Path to input file> <Word 1> [<Word 2> <Word 3>...]");
}
}
Sample run:
goose@t410:/tmp$ echo 'the quick brown fox jumps over the lazy dog' >quick.txt
goose@t410:/tmp$ java WordFinder quick.txt dog goose the did quick over
Found the following word(s) in file at quick.txt
[the, quick, over, dog]
goose@t410:/tmp$

Java Counting occurrences in a List and removing addition/duplicates from the structure while updating the word count

I currently read the words from a text file into a String ArrayList. My assignment does not allow HashMaps, HashSets, or anything of that nature. While counting the occurrences of each word I also have to strip extra punctuation (, . : [ ] ; = -) and collapse duplicates of the same word. I'm having trouble with removing the punctuation and the duplicates, so any help is appreciated (I'm a beginner at Java). I'm also not allowed to use split.
Here is my code:
public static void main(String[] args) throws FileNotFoundException, IOException
{
//Create input Scanner
FileInputStream file = new FileInputStream("Assignment1BData.txt");
Scanner input = new Scanner(file);
//Create the ArrayList
ArrayList<String> wordCount = new ArrayList<String>();
ArrayList<Integer> numCount = new ArrayList<Integer>();
//Read through the file and find the words from text
while(input.hasNext())
{
String word = input.next();
//Create index to look through lines of text
if(wordCount.contains(word))
{
int index = numCount.indexOf(word);
numCount.set(index, numCount.get(index) + 1);
}
else
{
wordCount.add(word);
numCount.add(1);
}
}
input.close();
file.close();
//Print output in for loop
for(int i = 0; i < wordCount.size(); i++)
{
System.out.println(wordCount.get(i) + " = " + numCount.get(i));
}
}
The indexOf(Object o) method of ArrayList should solve the duplicates problem. Before adding a string to the ArrayList, call indexOf with it: if the string is already in the list it returns its index, otherwise it returns -1. Add the string read from the text file only when indexOf returns -1; otherwise skip it, since it already exists.
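A sketch of that idea with two parallel lists, assuming the punctuation to strip is exactly the set listed in the question. It uses neither HashMap nor split; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ParallelListCounter {
    // Characters to strip, taken from the question.
    private static final String PUNCTUATION = ",.:[];=-";

    final List<String> words = new ArrayList<>();
    final List<Integer> counts = new ArrayList<>();

    // Removes the listed punctuation characters without split or regex.
    static String strip(String raw) {
        StringBuilder sb = new StringBuilder();
        for (char c : raw.toCharArray()) {
            if (PUNCTUATION.indexOf(c) < 0) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Adds a word, or bumps its count if it is already in the list.
    void add(String raw) {
        String word = strip(raw);
        if (word.isEmpty()) {
            return; // the token was punctuation only
        }
        int index = words.indexOf(word); // -1 when the word is new
        if (index == -1) {
            words.add(word);
            counts.add(1);
        } else {
            counts.set(index, counts.get(index) + 1);
        }
    }
}
```

With a Scanner you would call add(input.next()) for every token and then print words.get(i) and counts.get(i) side by side.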
You can use something like newList = removeDuplicates(yourList), so that the duplicates are removed and you get a new list.
Got it here: https://www.geeksforgeeks.org/how-to-remove-duplicates-from-arraylist-in-java/
You can create a list with only the unique elements using the Stream API (Lists.newArrayList comes from Guava):
List<Integer> listWithDuplicates = Lists.newArrayList(5, 0, 3, 1, 2, 3, 0, 0);
List<Integer> listWithoutDuplicates = listWithDuplicates.stream()
.distinct()
.collect(Collectors.toList());
Then iterate over the original list (listWithDuplicates), compare each element with the entries of listWithoutDuplicates, and count the matches.
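One possible way to do the counting step is Collections.frequency, which scans the original list for each distinct value. This is only a sketch: it uses the standard Arrays.asList rather than Guava, and the class and method names are invented for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

class DistinctCount {
    // Returns one "value = count" entry per distinct element, in first-seen order.
    static List<String> describe(List<Integer> withDuplicates) {
        return withDuplicates.stream()
                .distinct()
                .map(v -> v + " = " + Collections.frequency(withDuplicates, v))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> listWithDuplicates = Arrays.asList(5, 0, 3, 1, 2, 3, 0, 0);
        System.out.println(describe(listWithDuplicates)); // [5 = 1, 0 = 3, 3 = 2, 1 = 1, 2 = 1]
    }
}
```

Each frequency call walks the whole list, so this is quadratic in the worst case, which is usually fine for an assignment-sized input.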
Try this, which removes the special symbols before doing the lookup and the count.
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Scanner;
public class WordCounter {
public static void main(String[] args) throws IOException {
//Create input Scanner
FileInputStream file = new FileInputStream("/tmp/Assignment1BData.txt");
Scanner input = new Scanner(file);
//Create the ArrayList
ArrayList<String> wordCount = new ArrayList<String>();
ArrayList<Integer> numCount = new ArrayList<Integer>();
//Read through the file and find the words from text
while(input.hasNext())
{
String word = input.next();
//Remove the punctuation; '-' goes last so it is a literal and not a range like "=-["
word = word.replaceAll("[,.:;=\\[\\]-]", "");
//Create index to look through lines of text
if(wordCount.contains(word))
{
int index = wordCount.indexOf(word);
numCount.set(index, numCount.get(index) + 1);
}
else
{
wordCount.add(word);
numCount.add(1);
}
}
input.close();
file.close();
//Print output in for loop
for(int i = 0; i < wordCount.size(); i++)
{
System.out.println(wordCount.get(i) + " = " + numCount.get(i));
}
}
}

Using a method to call individual strings from an array (looping)

This is the question from my assignment that I am unsure of:
The class is to contain a public method nextWord(). When a new line is read, use the String method .split("\\s+") to create an array of the words that are on the line. Each call to the nextWord() method is to return the next word in the array. When all of the words in the array have been processed, read the next line in the file. The nextWord() method returns the value null when the end of the file is reached.
I have read the file, and stored each individual string in an array called tokenz.
I'm not sure how I can have a method called "nextWord" which returns each individual word from tokenz one at a time. Maybe I don't understand the question?
The last part of the question is:
In your main class, write a method named processWords() which instantiates the MyReader class (using the String "A2Q2in.txt"). Then write a loop that obtains one word at a time from the MyReader class using the nextWord() method and prints each word on a new line.
I've thought of ways to do this but I'm not sure how to return each word from the nextWord method I'm supposed to write. I can't increment a counter because once the String is returned, anything after the return statement cannot be reached, since the method is done.
Any help would be appreciated, maybe I'm going about this the wrong way?
Can't use array lists or anything like that.
Here is my code.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
public class A2Q2
{
public static void main (String [] args)
{
processWords();
}
public static void processWords()
{
MyReader reader = new MyReader("A2Q2.txt");
String[] words = new String[174];
words[0] = reader.nextWord();
System.out.println(words[0]);
}
}
class MyReader
{
static String name;
static BufferedReader fileIn;
static String inputLine;
static int tokensLength = 0;
static String[] tokens;
static int counter = 0;
// constructor.
public MyReader(String name)
{
this.name = name;
}
public static String[] readFile()
{
String[] tokenz = new String[174];
int tokensLength = 0;
try
{
fileIn = new BufferedReader (new FileReader(name));
inputLine = fileIn.readLine();
while(inputLine !=null)
{
tokens = inputLine.split("\\s+");
for (int i = 0 ; i < tokens.length; i++)
{
int j = i + tokensLength;
tokenz[j] = tokens[i];
}
tokensLength = tokensLength + tokens.length;
inputLine = fileIn.readLine();
}
fileIn.close();
}
catch (IOException ioe)
{
System.out.println(ioe.getMessage());
ioe.printStackTrace();
}
//FULL ARRAY OF STRINGS IN TOKENZ
return tokenz;
}
public static String nextWord()
{
String[] tokenzz = readFile();
//????
return tokenzz[0];
}
}
Here's a conceptual model for you.
Keep track of your MyReader's state so that it knows which value to return next.
The following example uses tokenIndex to decide which token to return next.
class MyReader
{
String[] tokens;
int tokenIndex = 0;
public String nextWord()
{
if(tokens == null || tokens.length <= tokenIndex)
{
// feel free to replace this line with whatever logic you want to
// use to fill in a new line.
tokens = readNextLine();
tokenIndex = 0;
}
String retVal = tokens[tokenIndex];
tokenIndex++;
return retVal;
}
}
Mind you, this isn't a complete solution (it doesn't check for the end of the file, for instance, and readNextLine() is left for you to write), only a demonstration of the concept. You will have to elaborate on it a bit.
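For reference, here is one way the end-of-file check could be filled in. This is a sketch rather than the assignment's required design: the class name LineWordReader is made up, and it takes a BufferedReader so it can be tried without a file (the assignment's MyReader would wrap a FileReader around the file name instead).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class LineWordReader {
    private final BufferedReader in;
    private String[] tokens = new String[0];
    private int tokenIndex = 0;

    LineWordReader(BufferedReader in) {
        this.in = in;
    }

    // Returns the next word, or null once the end of the input is reached.
    String nextWord() throws IOException {
        while (tokenIndex >= tokens.length) {
            String line = in.readLine();
            if (line == null) {
                return null; // end of file
            }
            line = line.trim();
            // a blank line has no words, so the loop reads another line
            tokens = line.isEmpty() ? new String[0] : line.split("\\s+");
            tokenIndex = 0;
        }
        return tokens[tokenIndex++];
    }

    public static void main(String[] args) throws IOException {
        LineWordReader reader = new LineWordReader(
                new BufferedReader(new StringReader("one two\n\nthree")));
        for (String w = reader.nextWord(); w != null; w = reader.nextWord()) {
            System.out.println(w);
        }
    }
}
```

The index survives between calls because it is an instance field, which is what lets each call pick up where the previous one stopped.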
Use a loop and process each element in the array, printing them one at a time?

Calculate number of words in an ArrayList while some words are on the same line

I'm trying to calculate how many words an ArrayList contains. I know how to do this if every word is on a separate line, but some of the words are on the same line, like:
hello there
blah
cats dogs
So I'm thinking I should go through every entry and somehow find out how many words the current entry contains, something like:
public int numberOfWords(){
for(int i = 0; i < arraylist.size(); i++) {
int words = 0;
words = words + (number of words on current line);
//words should eventually equal to 5
}
return words;
}
Am I thinking right?
You should declare and initialize int words outside the loop so that it is not reset on every iteration. You can use the for-each syntax to loop through the list, which eliminates the need to get() items out of the list. To handle multiple words on a line, split the String into an array and count the items in the array.
public int numberOfWords(){
int words = 0;
for(String s:arraylist) {
words += s.split(" ").length;
}
return words;
}
Full Test
import java.util.ArrayList;
import java.util.List;
public class StackTest {
public static void main(String[] args) {
List<String> arraylist = new ArrayList<String>();
arraylist.add("hello there");
arraylist.add("blah");
arraylist.add(" cats dogs");
arraylist.add(" ");
arraylist.add(" ");
arraylist.add(" ");
int words = 0;
for(String s:arraylist) {
s = s.trim().replaceAll(" +", " "); //clean up the String
if(!s.isEmpty()){ //do not count empty strings
words += s.split(" ").length;
}
}
System.out.println(words);
}
}
It should look like this:
public int numberOfWords(){
int words = 0;
for(int i = 0; i < arraylist.size(); i++) {
String line = arraylist.get(i).trim();
if (!line.isEmpty()) {
words = words + line.split("\\s+").length; // number of words on the current line
}
//words should eventually equal 5
}
return words;
}
I think this could help you.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.StringTokenizer;
public class LineWord {
public static void main(String args[]) {
try {
File f = new File("C:\\Users\\MissingNumber\\Documents\\NetBeansProjects\\Puzzlecode\\src\\com\\test\\test.txt"); // create the File from its path
BufferedReader br = new BufferedReader(new FileReader(f));
String strLine = " ";
String filedata = "";
while ((strLine = br.readLine()) != null) {
filedata += strLine + " ";
}
StringTokenizer stk = new StringTokenizer(filedata);
List <String> token = new ArrayList <String>();
while (stk.hasMoreTokens()) {
token.add(stk.nextToken());
}
//Collections.sort(token);
System.out.println(token.size());
br.close();
} catch (Exception e) {
System.err.println("Error: " + e.getMessage());
}
}
}
So in this case you read the data from a file, tokenize it, store the tokens in a list, and count them. If you want to read from the console instead, use a BufferedReader, tokenize the input on whitespace, put the tokens in a list, and simply get its size.
Hope this is what you were looking for.

Counting number of words in a file

I'm having a problem counting the number of words in a file. My approach is to count a word whenever I see a space or a newline.
The problem is that if there are multiple blank lines between paragraphs, I end up counting them as words too. If you look at the readFile() method you can see what I am doing.
Could you help me out and guide me in the right direction on how to fix this?
Example input file (including a blank line):
word word word
word word
word word word
You can use a Scanner with a FileInputStream instead of a BufferedReader with a FileReader. For example:
File file = new File("sample.txt");
try(Scanner sc = new Scanner(new FileInputStream(file))){
int count=0;
while(sc.hasNext()){
sc.next();
count++;
}
System.out.println("Number of words: " + count);
}
I would change your approach a bit. First, I would use a BufferedReader to read the file line by line using readLine(). Then split each non-blank line on whitespace using String.split("\\s+") and use the length of the resulting array to see how many words are on that line. To get the number of characters you could look at the length of either each line or each split word (depending on whether you want to count whitespace as characters).
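A minimal sketch of that approach, assuming blank lines should contribute no words and that characters are counted per line, excluding line terminators. The class name and the int[] return shape are just choices made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class LineCounter {
    // Returns {lines, words, characters}; characters exclude line terminators.
    static int[] count(BufferedReader reader) throws IOException {
        int lines = 0;
        int words = 0;
        int chars = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            lines++;
            chars += line.length();
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) { // blank lines contribute no words
                words += trimmed.split("\\s+").length;
            }
        }
        return new int[] {lines, words, chars};
    }

    public static void main(String[] args) throws IOException {
        String sample = "word word word\n\nword word";
        int[] result = count(new BufferedReader(new StringReader(sample)));
        System.out.println("lines=" + result[0] + " words=" + result[1] + " chars=" + result[2]);
    }
}
```

For a real file, pass new BufferedReader(new FileReader(path)) instead of the StringReader.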
This is just a thought, but there is one very easy way to do it. If you only need the number of words and not the words themselves, use WordUtils from Apache Commons Lang:
import org.apache.commons.lang.WordUtils;
public class CountWord {
public static void main(String[] args) {
String str = "Just keep a boolean flag around that lets you know if the previous character was whitespace or not pseudocode follows";
String initials = WordUtils.initials(str);
System.out.println(initials);
//so number of words in your file will be
System.out.println(initials.length());
}
}
Just keep a boolean flag around that lets you know if the previous character was whitespace or not (pseudocode follows):
boolean prevWhitespace = false;
int wordCount = 0;
while (char ch = getNextChar(input)) {
if (isWhitespace(ch)) {
if (!prevWhitespace) {
prevWhitespace = true;
wordCount++;
}
} else {
prevWhitespace = false;
}
}
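A runnable take on the flag idea is sketched below. It differs slightly from the pseudocode: it counts a word when its first character is seen rather than when whitespace is seen, so leading whitespace does not produce a phantom word and a final word without trailing whitespace is still counted. It works on a String for simplicity; the class name is invented.

```java
class FlagWordCounter {
    static int countWords(String text) {
        boolean inWord = false;
        int wordCount = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isWhitespace(text.charAt(i))) {
                inWord = false;
            } else if (!inWord) {
                inWord = true; // first character of a new word
                wordCount++;
            }
        }
        return wordCount;
    }

    public static void main(String[] args) {
        System.out.println(countWords("word word word\n\n\nword word")); // 5
    }
}
```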
I think a correct approach would be by means of Regex:
String fileContent = <text from file>;
String[] words = Pattern.compile("\\s+").split(fileContent);
System.out.println("File has " + words.length + " words");
Hope it helps. The meaning of \s+ is described in the Pattern Javadoc.
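One caveat with the regex split: if the content starts with whitespace, the resulting array begins with an empty string, and an empty file still yields one (empty) element. A sketch that guards against both, with an invented class name:

```java
import java.util.regex.Pattern;

class RegexWordCounter {
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static int countWords(String fileContent) {
        String trimmed = fileContent.trim();
        if (trimmed.isEmpty()) {
            return 0; // split would otherwise return a single empty string
        }
        return WHITESPACE.split(trimmed).length;
    }

    public static void main(String[] args) {
        System.out.println("File has " + countWords("word word word\n\nword word\n") + " words");
    }
}
```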
import java.io.BufferedReader;
import java.io.FileReader;
public class CountWords {
public static void main (String args[]) throws Exception {
System.out.println ("Counting Words");
FileReader fr = new FileReader ("c:\\Customer1.txt");
BufferedReader br = new BufferedReader (fr);
String line = br.readLine();
int count = 0;
while (line != null) {
String trimmed = line.trim();
if (!trimmed.isEmpty()) { // skip blank lines so they are not counted as words
count += trimmed.split("\\s+").length;
}
line = br.readLine();
}
System.out.println(count);
}
}
Hack solution
You can read the text file into a String variable, then split the String into an array using a single space as the delimiter: stringVar.split(" ").
The array length would equal the number of "words" in the file.
Of course this wouldn't give you a count of the lines.
Three steps: consume all the whitespace, check whether it is a line break, then consume all the non-whitespace.
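int c = inFile.read();
while (c != -1) {
// consume whitespace, counting line breaks on the way
while (c != -1 && Character.isWhitespace(c)) {
if (c == '\n') { numberLines++; }
c = inFile.read();
}
if (c == -1) { break; }
// consume one word (a run of non-whitespace characters)
while (c != -1 && !Character.isWhitespace(c)) {
numberChars++;
c = inFile.read();
}
numberWords++;
}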
File Word-Count
If there are symbols between the words, you can split on them and count the number of words.
Scanner sc = new Scanner(new FileInputStream(new File("Input.txt")));
int count = 0;
while (sc.hasNext()) {
String[] s = sc.next().split("\\d*[.#:=#-]");
for (int i = 0; i < s.length; i++) {
if (!s[i].isEmpty()){
System.out.println(s[i]);
count++;
}
}
}
System.out.println("Word-Count : "+count);
Take a look at my solution here; it should work. The idea is to strip all the unwanted symbols from the words, then separate the words and store them in another variable (I used an ArrayList). By adjusting the excludedSymbols variable you can add more symbols to exclude from the words.
public static void countWords () {
String textFileLocation ="c:\\yourFileLocation";
String readWords ="";
ArrayList<String> extractOnlyWordsFromTextFile = new ArrayList<>();
// excludedSymbols can be extended to whatever you want to exclude from the file
String[] excludedSymbols = {" ", "," , "." , "/" , ":" , ";" , "<" , ">", "\n"};
String readByteCharByChar = "";
boolean testIfWord = false;
try {
InputStream inputStream = new FileInputStream(textFileLocation);
byte byte1 = (byte) inputStream.read();
while (byte1 != -1) {
readByteCharByChar +=String.valueOf((char)byte1);
for(int i=0;i<excludedSymbols.length;i++) {
if(readByteCharByChar.equals(excludedSymbols[i])) {
if(!readWords.equals("")) {
extractOnlyWordsFromTextFile.add(readWords);
}
readWords ="";
testIfWord = true;
break;
}
}
if(!testIfWord) {
readWords+=(char)byte1;
}
readByteCharByChar = "";
testIfWord = false;
byte1 = (byte)inputStream.read();
if(byte1 == -1 && !readWords.equals("")) {
extractOnlyWordsFromTextFile.add(readWords);
}
}
inputStream.close();
System.out.println(extractOnlyWordsFromTextFile);
System.out.println("The number of words in the chosen text file is: " + extractOnlyWordsFromTextFile.size());
} catch (IOException ioException) {
ioException.printStackTrace();
}
}
This can be done very concisely using Java 8:
Files.lines(Paths.get(file))
.flatMap(str->Stream.of(str.split("[ ,.!?\r\n]")))
.filter(s->s.length()>0).count();
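A self-contained variant of the stream approach is sketched here, assuming UTF-8 text and treating spaces and the punctuation , . ! ? as separators. The method name is made up, and try-with-resources is added because Files.lines keeps the file open until the stream is closed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.stream.Stream;

class StreamWordCounter {
    static long countWords(Path file) throws IOException {
        // close the stream (and the underlying file) when done
        try (Stream<String> lines = Files.lines(file)) {
            return lines
                    .flatMap(line -> Arrays.stream(line.split("[ ,.!?]+")))
                    .filter(s -> !s.isEmpty()) // blank lines and leading separators give ""
                    .count();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countWords(Paths.get(args[0])));
    }
}
```

The line terminators do not need to appear in the separator set, since Files.lines already strips them.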
BufferedReader bf= new BufferedReader(new FileReader("G://Sample.txt"));
String line=bf.readLine();
while(line!=null)
{
String[] words=line.split(" ");
System.out.println("this line contains " +words.length+ " words");
line=bf.readLine();
}
The code below works in Java 8:
//Read file into String
String fileContent = new String(Files.readAllBytes(Paths.get("MyFile.txt")), StandardCharsets.UTF_8);
//Split on runs of non-letter characters and keep the pieces in a list
List<String> words = Arrays.asList(fileContent.split("\\PL+"));
int count=0;
for(String x: words){
if(!x.isEmpty()) count++;
}
System.out.println(count);
It's easy: get the String from the file with a method such as getText(), then count the words:
public class Main {
static int countOfWords(String str) {
if (str == null || str.isEmpty()) {
return 0;
}else{
int numberWords = 0;
for (char c : str.toCharArray()) {
if (c == ' ') {
numberWords++;
}
}
return ++numberWords;
}
}
}
