My word count program is not working - java

The code below is meant to count the number of times the words in list y occur either in a document via FileReader or list x. Eventually I want list y to be an imported document as well, but when I run the code on a document it either gives me a false count or no count at all. What’s going on?
Also, the files are from Notepad. I'm using Windows.
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.*;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class test {
@SuppressWarnings("resource")
public static void main(String[] args) throws Exception {
don w = new don();
List<Integer> ist = new ArrayList<Integer>();
// List<String> x =Arrays.asList
// ("is","dishonorable","dismal","miserable","horrible","discouraging","distress","anguish","mine","is");
BufferedReader in = new BufferedReader(new FileReader("this one.txt"));
String str;
List<String> list = new ArrayList<String>();
while ((str = in.readLine()) != null) {
list.add(str);
// System.out.println(list);
List<String> y = Arrays.asList("Hello", "the", "string", "is", "mine");
for (String aY : y) {
int count = 0;
for (String aX : list) {
if (aY.contains(aX)) {
count++;
}
}
ist.add(count);
// no need to reset the count
}
int g = ist .stream()
.mapToInt(value -> value)
.sum();
System.out.println(g);
}
}
}

If you want to count, you should... count.
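Here, you only check whether one string contains the other as a substring. Note also that aY.contains(aX) asks whether the search word contains the entire line, which is the wrong way round.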
What you should do instead is roughly the following:
static int count(String line, String word) {
int count = 0;
// Resume the search right after the previous match, so adjacent matches are not skipped.
for (int offset = line.indexOf(word); offset >= 0; offset = line.indexOf(word, offset + word.length())) {
count++;
}
return count;
}
Keep in mind that this counts substrings, not whole words, so "is" is also found inside "this". If you need whole-word matches, regular expressions can help you further.
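To illustrate the regular-expression route mentioned above, here is a minimal sketch. It assumes that "word" means a run of characters delimited by \b word boundaries and that matching should ignore case; the class and method names are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeWordCount {
    // Counts occurrences of `word` in `line` as a whole word, ignoring case.
    static int countWholeWord(String line, String word) {
        // Pattern.quote makes the word literal; \b matches a word boundary.
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b", Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(line);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String line = "This string is mine, and this is the end";
        System.out.println(countWholeWord(line, "is"));   // prints 2 (the "is" inside "This" is not counted)
        System.out.println(countWholeWord(line, "this")); // prints 2 (case is ignored)
    }
}
```

Calling this once per line and per search word, and summing the results, gives the total count for a file.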

Related

search for multiple strings from a text file in java

I'm trying to search one txt file for multiple words given by the user (I store them in an array). If a word is present once in the file, it should be displayed; if it is not present, it should not be.
Also, if a word is duplicated in the input, it should be searched for only once.
The problem is that searching for a single word works, but with multiple words the program keeps reporting that the word isn't present even when it is.
I would like to know where I should put the for loop and what changes are needed.
package search;
import java.io.*;
import java.util.Scanner;
public class Read {
public static void main(String[] args) throws IOException
{
Scanner sc = new Scanner(System.in);
String[] words=null;
FileReader fr = new FileReader("java.txt");
BufferedReader br = new BufferedReader(fr);
String s;
System.out.println("Enter the number of words:");
Integer n = sc.nextInt();
String wordsArray[] = new String[n];
System.out.println("Enter words:");
for(int i=0; i<n; i++)
{
wordsArray[i]=sc.next();
}
for (int i = 0; i <n; i++) {
int count=0; //Intialize the word to zero
while((s=br.readLine())!=null) //Reading Content from the file
{
{
words=s.split(" "); //Split the word using space
for (String word : words)
{
if (word.equals(wordsArray[i])) //Search for the given word
{
count++; //If Present increase the count by one
}
}
if(count == 1)
{
System.out.println(wordsArray[i] + " is unique in file ");
}
else if (count == 0)
{
System.out.println("The given word is not present in the file");
}
else
{
System.out.println("The given word is present in the file more than 1 time");
}
}
}
}
fr.close();
}
}
The code you wrote is error-prone. Remember that a while loop always needs a proper termination condition.
Try the following code:
import java.util.HashMap;
import java.util.Map;
public class Read {
public static void main(String[] args)
{
// Declaring the String
String paragraph = "These words can be searched";
// Declaring a HashMap of <String, Integer>
Map<String, Integer> hashMap = new HashMap<>();
// The words to count, hard-coded here
// instead of being split from the paragraph.
String[] words = new String[]{"These", "can", "searched"};
for (String word : words) {
// Asking whether the HashMap contains the
// key or not. Will return null if not.
Integer integer = hashMap.get(word);
if (integer == null)
// Storing the word as key and its
// occurrence as value in the HashMap.
hashMap.put(word, 1);
else {
// Incrementing the value if the word
// is already present in the HashMap.
hashMap.put(word, integer + 1);
}
}
System.out.println(hashMap);
}
}
I've hard-coded the values here; you can read the words from the console and the paragraph from the file instead.
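One likely cause of the behaviour in the question is that the result is printed inside the while loop, once per line, and that the BufferedReader is already at end of file once the first word has been processed. A minimal sketch that reads the file once and then counts every requested word follows. It assumes words are separated by whitespace and that the search words arrive as command-line arguments; FileWordCounter and countWords are names invented for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FileWordCounter {
    // Counts how often each search word occurs as a whitespace-separated token.
    // Duplicate search words collapse into a single map key.
    static Map<String, Integer> countWords(List<String> lines, List<String> searchWords) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : searchWords) {
            counts.put(word, 0);
        }
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                if (counts.containsKey(token)) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // "java.txt" is the file name used in the question; readAllLines decodes it as UTF-8.
        List<String> lines = Files.readAllLines(Paths.get("java.txt"));
        for (Map.Entry<String, Integer> e : countWords(lines, Arrays.asList(args)).entrySet()) {
            int count = e.getValue();
            if (count == 0) {
                System.out.println(e.getKey() + " is not present in the file");
            } else if (count == 1) {
                System.out.println(e.getKey() + " is unique in the file");
            } else {
                System.out.println(e.getKey() + " is present " + count + " times in the file");
            }
        }
    }
}
```

The report is printed only after the whole file has been counted, so each word produces exactly one message.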
The 'proper' class to use for extracting words from text is java.text.BreakIterator
You can try the following (reading line-wise in case of large files)
import java.text.BreakIterator;
import java.util.Arrays;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;
import java.nio.file.Files;
import java.nio.file.Paths;
public class WordFinder {
public static void main(String[] args) {
try {
if (args.length < 2) {
WordFinder.usage();
System.exit(1);
}
ArrayList<String> argv = new ArrayList<>(Arrays.asList(args));
String path = argv.remove(0);
List<String> found = WordFinder.findWords(Files.lines(Paths.get(path)), argv);
System.out.printf("Found the following word(s) in file at %s%n", path);
System.out.println(found);
} catch (Throwable t) {
t.printStackTrace();
}
}
public static List<String> findWords(Stream<String> lines, ArrayList<String> searchWords) {
List<String> result = new ArrayList<>();
BreakIterator boundary = BreakIterator.getWordInstance();
lines.forEach(line -> {
boundary.setText(line);
int start = boundary.first();
for (int end = boundary.next(); end != BreakIterator.DONE; start = end, end = boundary.next()) {
String candidate = line.substring(start, end);
if (searchWords.contains(candidate)) {
result.add(candidate);
searchWords.remove(candidate);
}
}
});
return result;
}
private static void usage() {
System.err.println("Usage: java WordFinder <Path to input file> <Word 1> [<Word 2> <Word 3>...]");
}
}
Sample run:
goose@t410:/tmp$ echo 'the quick brown fox jumps over the lazy dog' >quick.txt
goose@t410:/tmp$ java WordFinder quick.txt dog goose the did quick over
Found the following word(s) in file at quick.txt
[the, quick, over, dog]
goose@t410:/tmp$

how to print an extra element in 1d array without bound exception?

At the beginning I read the file, use the split() method, and store each value in a 1D array. I must store indexes 0 and 1 as string values, and indexes 2, 3 and 4 must be stored in a 1D array, because the "supervisor" constructor takes two string values (name and id) and a 1D array (interests). The problem is that row 0 has an extra interest (3 interests), while rows 1 and 2 have two interests each.
What I thought of was to store the interests in an ArrayList (because the size is not fixed) and convert it back to a 1D array, but it did not work.
I also tried to store the interests in a 2D array and convert it back to a 1D array, but that did not work either. While splitting the file I split on both , and #, but I noticed that there is a # at the end of each line's interests,
so I kept the # and thought I could use an if condition while reading the file. Is there a simple way to avoid the error?
the file supervisor.txt contains:
00023, Dr. Haneen, artificial intelligent, data mining, pattern recognition#
00013, Dr. Manar, database, network#
00011, Dr. Hajar, software engineering, games#
Code
public static void main(String[] args)throws Exception {
File supervisorFile=new File("supervisor.txt");
if (!supervisorFile.exists()) {
System.out.println("Sorry the file is not found!"); //checks if the file exists if no it terminates the program
System.exit(0);
}
supervisor sup=null;
String[]supArray=null;
Scanner supRead=new Scanner(supervisorFile);//read supervisor file
while (supRead.hasNext()) {
supArray=supRead.nextLine().split(",");
sup=addSupervisor(supArray);
//System.out.println(sup.toString());
}
}
public static supervisor addSupervisor(String[]arr){
String id=arr[0];
String name=arr[1];
String[] interest=new String[3];
for (int i = 0; i < interest.length; i++) { //here i tried to store all the interests
interest[i]=arr[2]+arr[3]+arr[4];
}//it prints artificial intelligent data mining pattern recognition# and then an indexOutOfBoundsException
return new supervisor(id,name,interest);
}
The solution is to use split with a limit parameter.
class Supervisor{
final String id;
final String name;
String[] fields;
Supervisor(String id, String name, String[] fields) {
this.id = id;
this.name = name;
this.fields = fields;
}
}
Path path = Paths.get("supervisor.txt");
List<Supervisor> supervisors = Files.lines(path, Charset.defaultCharset())
.filter(l -> l.endsWith("#"))
.map(l -> l.substring(0, l.length() - 1)) // Remove #
.map(l -> l.split(",\\s*", 3)) // "00013", "Dr. Manar", "database, network"
.filter(w -> w.length == 3)
.map(w -> new Supervisor(w[0], w[1], w[2].split(",\\s*")))
.collect(Collectors.toList());
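To see what the limit parameter does on its own, here is a small sketch using one line from the question's file (with the trailing # already removed). SplitLimitDemo and splitRecord are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Splits a record into at most three parts: id, name, and the rest of the line.
    static String[] splitRecord(String line) {
        return line.split(",\\s*", 3);
    }

    public static void main(String[] args) {
        String[] parts = splitRecord("00013, Dr. Manar, database, network");
        System.out.println(parts.length); // prints 3
        System.out.println(parts[2]);     // prints database, network

        // The remainder can then be split without a limit to get every interest.
        String[] interests = parts[2].split(",\\s*");
        System.out.println(Arrays.toString(interests)); // prints [database, network]
    }
}
```

Because the third part holds everything after the second comma, the number of interests per line no longer matters.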
Use split and ArrayList
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.apache.commons.lang3.builder.ToStringBuilder;
import org.apache.commons.lang3.builder.ToStringStyle;
import lombok.AllArgsConstructor;
@AllArgsConstructor
class Supervisor {
String id;
String name;
List<String> interest;
@Override
public String toString() {
return ToStringBuilder.reflectionToString(this, ToStringStyle.NO_CLASS_NAME_STYLE);
}
}
public class AMain {
public static void main(String[] args) {
String id, name, line;
String[] arr;
List<String> list = new ArrayList<>();
try (BufferedReader br = new BufferedReader(new FileReader("file/supervisor.txt"))) {
while ((line = br.readLine()) != null) {
arr = line.trim().split(",");
list.addAll(Arrays.asList(arr));
if (list.size() > 2) {
id = list.get(0); // get id
list.remove(0); // remove id
name = list.get(0); // get name
list.remove(0); // remove name
System.out.println(new Supervisor(id, name, list));
}
list.clear(); // clear all
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
Output
[id=00023,name= Dr. Haneen,interest=[ artificial intelligent, data mining, pattern recognition#]]
[id=00013,name= Dr. Manar,interest=[ database, network#]]
[id=00011,name= Dr. Hajar,interest=[ software engineering, games#]]
Your second and third lines contain only four strings separated by commas, so the array has four elements. In your addSupervisor method you are trying to access arr[4], the fifth string, which is out of bounds.
You get an error because you are trying to use arr[4], but for lines 2 and 3 the size of the array is 4, so the maximum index you can use is 3.
I don't know for sure what supervisor is, but would this work?
public static supervisor addSupervisor(String[] arr){
String id=arr[0];
String name=arr[1];
String[] interest=new String[arr.length - 2];
for (int i = 0; i < interest.length; i++) {
interest[i]=arr[i + 2];
}
return new supervisor(id,name,interest);
}
First you should get the line and then work with it as you do. The problem is in the for loop, where you assume that every supervisor has 3 interests. Also, you are storing the concatenation of all the interests in every position of the array:
for (int i = 0; i < interest.length; i++) { //here i tried to store all the interests
interest[i]=arr[2]+arr[3]+arr[4];
}
So I think you should use a function like this:
private static String[] extractInterest(String[] line) {
String[] res = new String[line.length - 2]; //There are two index that haven't got interest
for(int i = 0; i<res.length; ++i) {
res[i] = line[i+2].replaceFirst(" ", "").replace("#","");
}
return res;
}
And this is the "main":
public static void main(String[] args) {
File file = new File("Data.txt");
try (Scanner sc = new Scanner(file)) { //Will close the sc automatically
String[] line;
while(sc.hasNext()) {
line = sc.nextLine().split(",");
int id = Integer.parseInt(line[0]);
String name = line[1].replaceFirst(" ",""); //For delete first " "
String[] interest = extractInterest(line);
Supervisor s = new Supervisor(id,name,interest);
System.out.println(s.toString());
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
One final piece of advice: by convention, Java class names begin with an uppercase letter, so you should rename your class "supervisor" to "Supervisor".

how to separate values in a string index that are char and int

Okay, basically I want to separate the elements in a string into int and char values while keeping them in the array. To be honest, that last part is not a requirement: if I need to separate the values into two different arrays then so be it, I'd just like to keep them together for neatness. This is my input:
5,4,A
6,3,A
8,7,B
7,6,B
5,2,A
9,7,B
The code I have so far generally does what I want, but not completely.
Here is the output I have managed to produce with my code, and this is where I'm stuck:
54A
63A
87B
76B
52A
97B
Here is the fun part: I need to take the numbers and the character values and separate them so I can use them in a comparison/math formula.
Basically I need this:
int 5, 4;
char 'A';
but of course stored in the array that they are in.
Here is the code I have come up with so far.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
public class dataminingp1
{
String[] data = new String[100];
String line;
public void readf() throws IOException
{
FileReader fr = new FileReader("C:\\input.txt");
BufferedReader br = new BufferedReader(fr);
int i = 0;
while ((line = br.readLine()) != null)
{
data[i] = line;
System.out.println(data[i]);
i++;
}
br.close();
System.out.println("Data length: "+data.length);
String[][] root;
List<String> lines = Files.readAllLines(Paths.get("input.txt"), StandardCharsets.UTF_8);
root = new String[lines.size()][];
lines.removeAll(Arrays.asList("", null)); // <- remove empty lines
for(int a =0; a<lines.size(); a++)
{
root[a] = lines.get(a).split(" ");
}
String changedlines;
for(int c = 0; c < lines.size(); c++)
{
changedlines = lines.get(c).replace(',', ' '); // remove all commas
lines.set(c, changedlines);// Set the 0th index in the lines with the changedLine
changedlines = lines.get(c).replaceAll(" ", ""); // remove all white/null spaces
lines.set(c, changedlines);
changedlines = lines.get(c).trim(); // remove all null spaces before and after the strings
lines.set(c, changedlines);
System.out.println(lines.get(c));
}
}
public static void main(String[] args) throws IOException
{
dataminingp1 sarray = new dataminingp1();
sarray.readf();
}
}
I would like to do this as easily as possible because I'm not too far along with Java, but I am learning, so if need be I can manage a more difficult process. Thank you in advance for any help you may give. I'm really starting to love Java as a language thanks to its simplicity.
This is an addition to my question to clear up any confusion.
What I want to do is take the values stored in the string array that I have in the code (from input.txt) and parse them into different data types, such as char for characters and int for integers. I'm not sure how to do that, so what I'm asking is: is there a way to parse all these values at once without having to split them into different arrays? I'm not sure how I'd do that, since it would be crazy to go through the input file and find exactly where every char and every int starts. I hope this clears things up a bit.
Here is something you could do:
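// Assumes list is a List<String>, and numbers and letters are List<String> instances declared elsewhere.
for (int i = 0; i < list.get(0).length(); i++) {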
try {
Integer.parseInt(list.get(0).substring(i, i+1));
// This is a number
numbers.add(list.get(0).substring(i, i+1));
} catch (NumberFormatException e) {
// This is not a number
letters.add(list.get(0).substring(i, i+1));
}
}
When the character is not a digit, Integer.parseInt throws a NumberFormatException, so you know it is a letter.
for(int c = 0; c < lines.size(); c++){
String[] chars = lines.get(c).split(",");
String changedLines = "int " + chars[0] + ", " + chars[1] + ";\nchar '" + chars[2] + "';";
lines.set(c, changedLines);
System.out.println(lines.get(c));
}
This is very easy if your input format is standardized like this. As long as you don't specify more (for example, that a row can have more than 3 values, or that the char can be in any column and not just the third), the easiest approach is this:
String line = "5,4,A";
String[] array = line.split(",");
int a = Integer.valueOf(array[0]);
int b = Integer.valueOf(array[1]);
char c = array[2].charAt(0);
Maybe something like this will help?
List<Integer> getIntsFromArray(String[] tokens) {
List<Integer> ints = new ArrayList<Integer>();
for (String token : tokens) {
try {
ints.add(Integer.parseInt(token));
} catch (NumberFormatException nfe) {
// ...
}
}
return ints;
}
This will only grab the integers, but maybe you could hack it around a bit to do what you want :p
List<String> lines = Files.readAllLines(Paths.get("input.txt"), StandardCharsets.UTF_8);
String[][] root = new String[lines.size()][];
for (int a = 0; a < lines.size(); a++) {
root[a] = lines.get(a).split(","); // Just changed the split condition to split on comma
}
Your root array now holds all the data in a 2D array, where each row represents one record/line from the input and each column holds one value (see below).
5 4 A
6 3 A
8 7 B
7 6 B
5 2 A
9 7 B
You can now traverse the array where you know that first 2 columns of each row are the numbers you need and the last column is the character.
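The traversal described above could look like the following sketch. It assumes every non-empty row has exactly the three columns shown (two integers, then a single character); RecordParser and Row are hypothetical names, not part of the question's code.

```java
import java.util.ArrayList;
import java.util.List;

class RecordParser {
    // Hypothetical holder for one input line such as "5,4,A".
    static final class Row {
        final int first;
        final int second;
        final char label;

        Row(int first, int second, char label) {
            this.first = first;
            this.second = second;
            this.label = label;
        }
    }

    // Converts the String[][] produced by split(",") into typed rows.
    static List<Row> parse(String[][] root) {
        List<Row> rows = new ArrayList<>();
        for (String[] cols : root) {
            if (cols.length < 3) {
                continue; // skip blank or incomplete lines
            }
            int first = Integer.parseInt(cols[0].trim());
            int second = Integer.parseInt(cols[1].trim());
            char label = cols[2].trim().charAt(0);
            rows.add(new Row(first, second, label));
        }
        return rows;
    }

    public static void main(String[] args) {
        String[][] root = { {"5", "4", "A"}, {"8", "7", "B"} };
        for (Row r : parse(root)) {
            // The ints can now be used in arithmetic and the char in comparisons.
            System.out.println((r.first + r.second) + " " + r.label);
        }
    }
}
```

Keeping the three values together in one object per line avoids maintaining separate parallel arrays for the numbers and the characters.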
Try it this way, using the Character.getNumericValue() and Character.isDigit() methods. This might also work:
String myStr = "54A";
boolean checkVal;
List<Integer> myInt = new ArrayList<Integer>();
List<Character> myChar = new ArrayList<Character>();
for (int i = 0; i < myStr.length(); i++) {
char c = myStr.charAt(i);
checkVal = Character.isDigit(c);
if(checkVal == true){
myInt.add(Character.getNumericValue(c));
}else{
myChar.add(c);
}
}
System.out.println(myInt);
System.out.println(myChar);
Also check, checking character properties

read from .txt file and convert in to List<Map>

I have a large .txt file. It has a format like this:
0 1 2 3 4 5 6 7 ... n
0 A, B, c, D, E, F, G, H,
1 AA, BB, CC, DD, EE, FF, GG, HH,
2
3
.
.
n
I want to save each row in a Map.
For example, for the first row: map<0,A>, map<1,B>, map<2,C>, ...
Then I want to save these maps in a List. For example, I want to save 100 rows in the List.
For example, if I call list.get(1).get(4), I should receive "EE".
That is, I first go to row 1, then to column 4, and receive "EE".
Could you please guide me on how to solve this problem?
I read some articles about "spring batch", and it seems related to what I want.
Could you please help me fix this problem?
public class spliter {
static int w=0;
private static HashMap<Integer,String> map = new HashMap<Integer,String>();
private static List<Map<Integer, String>> list=new ArrayList<Map<Integer,String>>();
public static void main(String[] args) throws IOException{
String string = null;
try {
BufferedReader reader = new BufferedReader(new FileReader("C:\\test.txt"));
while( (string = reader.readLine()) != null ) {
String[] parts = string.split(",");
int i=parts.length;
for(int j=0; j<i; j++){
map.put(j, parts[j]);
}
list.add(map);
w++;
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
}
Something this simple can be solved using a Scanner to read each line and then String.split(...) to split each line. Something like:
while line exists
read line into String using Scanner
split String using String#split(...)
use array from split to create a list
add above list to master list
end while
Note that you can contain this in a list of lists, without the need of a Map, at all. List<List<String>> should do it for you.
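The pseudocode above can be sketched roughly as follows. It assumes comma-separated cells with optional surrounding spaces, and it reads from an in-memory string so the example is self-contained; RowReader and readRows are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class RowReader {
    // Reads every line from the scanner and splits it into trimmed cells.
    static List<List<String>> readRows(Scanner in) {
        List<List<String>> rows = new ArrayList<>();
        while (in.hasNextLine()) {
            String line = in.nextLine();
            List<String> cells = new ArrayList<>();
            for (String cell : line.split(",")) {
                cells.add(cell.trim());
            }
            rows.add(cells);
        }
        return rows;
    }

    public static void main(String[] args) {
        // For a real file, construct the Scanner from a java.io.File instead.
        Scanner in = new Scanner("A, B, C, D, E\nAA, BB, CC, DD, EE\n");
        List<List<String>> rows = readRows(in);
        System.out.println(rows.get(1).get(4)); // prints EE
    }
}
```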
I think that it would be more instructive to you for us to give you general advice like this, and then to see what you can do with it.
I have made into a Community Wiki, so all might contribute easily to this answer and so no-one will get reputation for up-votes.
I needed the same thing, so I did it with your code in mind:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;
public class Spliter {
private static List<Map<String, Object>> list = new ArrayList<> ();
public static void main(String[] args) throws IOException {
BufferedReader reader = new BufferedReader(new FileReader("C:\\Users\\829784\\Desktop\\Repo\\Documentacion Repositorio\\test.txt"));
String line = "";
String firstline = reader.readLine();
while ((line = reader.readLine()) != null) {
Map < String, Object > map = new TreeMap < > ();
String[] partsfirstline = firstline.split(";");
String[] parts = line.split(";");
int i = parts.length;
for (int j = 0; j < i; j++) {
map.put(partsfirstline[j], parts[j]);
}
list.add(map);
}
System.out.println(list.get(0).values());
System.out.println(list.get(1).values());
}
}
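You could use something like this:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
public class ArrayReader {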
public static void main(String[] args) {
List<List<String >> array = new ArrayList<>();
try (BufferedReader br = new BufferedReader(new FileReader("file.txt"));){
String line;
while ((line=br.readLine())!=null)
array.add(Arrays.asList(line.split(",")));
} catch (IOException e) {
e.printStackTrace();
}
}
}
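If one Map per row is still wanted, as in the question, note that the question's code adds the same static map to the list on every iteration, so every list entry ends up showing the values written last. A minimal sketch that creates a fresh map for each line is shown here; it assumes comma-separated cells, and RowMaps and toMaps are names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RowMaps {
    // Builds one column-index-to-value map per input line.
    static List<Map<Integer, String>> toMaps(List<String> lines) {
        List<Map<Integer, String>> list = new ArrayList<>();
        for (String line : lines) {
            Map<Integer, String> row = new HashMap<>(); // a new map for every line
            String[] parts = line.split(",");
            for (int j = 0; j < parts.length; j++) {
                row.put(j, parts[j].trim());
            }
            list.add(row);
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("A, B, C, D, E", "AA, BB, CC, DD, EE");
        System.out.println(toMaps(lines).get(1).get(4)); // prints EE
    }
}
```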

Calculate number of words in an ArrayList while some words are on the same line

I'm trying to calculate how many words an ArrayList contains. I know how to do this if every word is on a separate line, but some of the words are on the same line, like:
hello there
blah
cats dogs
So I'm thinking I should go through every entry and somehow find out how many words the current entry contains, something like:
public int numberOfWords(){
for(int i = 0; i < arraylist.size(); i++) {
int words = 0;
words = words + (number of words on current line);
//words should eventually equal to 5
}
return words;
}
Am I thinking right?
You should declare and initialize int words outside of the loop, so that it is not reset on every iteration. You can use the for-each syntax to loop through the list, which eliminates the need to get() items out of the list. To handle multiple words on a line, split the String into an array and count the items in the array.
public int numberOfWords(){
int words = 0;
for(String s:arraylist) {
words += s.split(" ").length;
}
return words;
}
Full Test
import java.util.ArrayList;
import java.util.List;
public class StackTest {
public static void main(String[] args) {
List<String> arraylist = new ArrayList<String>();
arraylist.add("hello there");
arraylist.add("blah");
arraylist.add(" cats dogs");
arraylist.add(" ");
arraylist.add(" ");
arraylist.add(" ");
int words = 0;
for(String s:arraylist) {
s = s.trim().replaceAll(" +", " "); //clean up the String
if(!s.isEmpty()){ //do not count empty strings
words += s.split(" ").length;
}
}
System.out.println(words);
}
}
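As a variation on the test above, the clean-up and the split can be folded into a single split on the regular expression \\s+, which treats any run of whitespace (including tabs) as one separator. This is only a sketch; ListWordCount is a made-up class name and the list is passed in as a parameter.

```java
import java.util.Arrays;
import java.util.List;

class ListWordCount {
    // Counts whitespace-separated words across all entries of the list.
    static int numberOfWords(List<String> lines) {
        int words = 0;
        for (String s : lines) {
            String trimmed = s.trim();
            if (!trimmed.isEmpty()) { // a blank entry contributes no words
                words += trimmed.split("\\s+").length;
            }
        }
        return words;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello there", "blah", "  cats   dogs", "   ");
        System.out.println(numberOfWords(lines)); // prints 5
    }
}
```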
It should look like this:
public int numberOfWords(){
int words = 0;
for(int i = 0; i < arraylist.size(); i++) {
words = words + (number of words on current line);
//words should eventually equal to 5
}
return words;
}
I think this could help you.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.StringTokenizer;
public class LineWord {
public static void main(String args[]) {
try {
File f = new File("C:\\Users\\MissingNumber\\Documents\\NetBeansProjects\\Puzzlecode\\src\\com\\test\\test.txt"); // Create the File from its path
BufferedReader br = new BufferedReader(new FileReader(f)); // Read the file line by line
String strLine = " ";
String filedata = "";
while ((strLine = br.readLine()) != null) {
filedata += strLine + " ";
}
StringTokenizer stk = new StringTokenizer(filedata);
List <String> token = new ArrayList <String>();
while (stk.hasMoreTokens()) {
token.add(stk.nextToken());
}
//Collections.sort(token);
System.out.println(token.size());
br.close();
} catch (Exception e) {
System.err.println("Error: " + e.getMessage());
}
}
}
So in this case you read the data from a file, tokenize it, and store the tokens in a list; then you just count them. If you only want to get input from the console, use a BufferedReader, tokenize the input on whitespace, put the tokens in a list, and simply get the size.
I hope this is what you were looking for.
