Java NetBeans: Why does my string array terminate prematurely? - java

I found the problem. Apparently, there were random spaces in some of the names in the csv file, which was causing breaks at the 257th entry, as well as several others later on. So, I just took out the spaces and everything works fine now. Thanks to all who tried to help.
I have this code that reads from a csv file, puts the values in String array, and prints them for me to see. It runs fine until it reaches the 257th member of the array (each member has 3 values: last name, first name, and birth year). Here is a functioning version of the code:
package testing.csv.files;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
public class Test {
public static void main(String[] args) {
//.csv comma separated values
String fileName = "C:/Users/Owner/Desktop/Data.csv";
File file = new File(fileName); // TODO: read about File Names
try {
Scanner inputStream = new Scanner(file);
inputStream.next(); //Ignore first line of titles
while (inputStream.hasNext()){
String data = inputStream.next(); // gets a whole line
String[] values = data.split(",");
System.out.println(data);
}
inputStream.close();
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}
Now, when I change the line
System.out.println(data);
To this:
System.out.println(values[2]);
What I expected to happen was for only the birth years (3rd column) to be printed for every person in the array. However, it only prints out until the 257th person's birth year (out of over 18,000), and gives me the following error message:
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 2
at testing.csv.files.Test.main(Test.java:22)
Java Result: 1
BUILD SUCCESSFUL (total time: 0 seconds)
The "Test.java:22" in the stack trace seems to refer to the line I changed above. I am not really sure what the problem is. If my syntax is wrong, why did it print at all? The only thing I can think of is that perhaps a String array can only handle 257 different people, each with their own 3 values. If that were the case, then I would need some larger kind of structure to hold all of my data. Has anyone encountered this problem before? Is the problem somewhere in my syntax or loop?

If there are only two things in the values array, then the highest index you can use is 1.
For arrays, valid indices run from 0 to size - 1; that is, if your array has size ten, the last location you can index into is 9, or more verbosely: array[9].
Change your indexing statement to this:
System.out.println(values[1]);

You might want to look at the 257th record in the csv file. Would the split method create three tokens for it? If it produces fewer than three tokens and you try to print the third token by typing
System.out.println(values[2]);
you will get an ArrayIndexOutOfBoundsException.

Change:
String data = inputStream.next(); // next() can read the input only till the space
to:
String data = inputStream.nextLine(); // nextLine() reads input including space between the words
It is also safer to iterate over the array, or to check values.length first, instead of accessing a fixed index, because a particular line in the csv may not contain the third column.
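The two suggestions above (read whole lines with nextLine() and check the token count before indexing) can be combined. What follows is only a minimal sketch, not the asker's final code: the class name CsvYears, the method birthYears, and the sample rows are hypothetical, and it reads from a String rather than from Data.csv so that it runs on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class CsvYears {
    // Returns the third column of every well-formed row. Rows with fewer
    // than three comma-separated fields are reported and skipped.
    static List<String> birthYears(String csvText) {
        List<String> years = new ArrayList<>();
        try (Scanner in = new Scanner(csvText)) {
            if (in.hasNextLine()) {
                in.nextLine(); // skip the header row as a whole line
            }
            while (in.hasNextLine()) {
                String line = in.nextLine(); // keeps spaces inside names
                String[] values = line.split(",");
                if (values.length >= 3) {
                    years.add(values[2].trim());
                } else {
                    System.err.println("Skipping malformed row: " + line);
                }
            }
        }
        return years;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the real file.
        String csv = "Last,First,Year\n"
                + "Smith,John,1990\n"
                + "Van Dyke,Ann,1985\n"  // a space inside a name
                + "Broken,Row\n";         // only two fields
        System.out.println(birthYears(csv));
    }
}
```

With next() the row containing "Van Dyke" would be cut at the space, which is the kind of break the asker eventually traced to stray spaces in the file.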

Related

How to get information from a CSV file in Java

I have a .CSV file with columns for [ID], [NAME], [LASTNAME], [EMAIL], [GENDER]. There are 1000 entries.
Using those five columns I have to:
Find the total of people on the list. (using a for loop?)
Show the first 10 names (name, lastname).
Show 3 RANDOM names.
Only display emails. (Doable with the current code)
Display the first letters of their last name.
Add a random number behind their last name.
Can someone make an example, please?
As a Java beginner I really can't seem to find an answer to this. I have searched everywhere and I think I'm going crazy.
I have imported the .csv file into Eclipse using the following code; currently it only displays the IDs.
package test;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
public class test111 {
public static void main(String[] args) {
String fileName="test.csv";
File file = new File(fileName);
try {
Scanner inputStream = new Scanner(file);
inputStream.next();
while (inputStream.hasNext()) {
String data = inputStream.next();
String[] values = data.split(",");
System.out.println(values[0]);
}
inputStream.close();
System.out.println("e");
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
}
In order to take every value you have to do something like this:
int id = Integer.parseInt(values[0]);
String name = values[1];
String lastName = values[2];
String email = values[3];
String gender = values[4];
Yours is a lesson in why you shouldn't blindly copy code without reading it, researching it and understanding it.
Look at line 15 from your own code. What do you think is going on here?
String[] values = data.split(",");
You've read a line from your CSV file into a field called data and now you're calling the .split method with a comma as a parameter to break up that line into tokens wherever it finds a comma. Those tokens are being placed into the cells of an array called values.
Why are you only getting your ID attribute? Look at this line:
System.out.println(values[0]);
We just established that all tokens have been placed in an array called values. Let's look at the attribute key you provided:
[ID], [NAME], [LASTNAME], [EMAIL], [GENDER]
Hmm... if printing values[0] gives us the ID, I wonder what we'd get if we... oh, I don't know... maybe tried to print from a different element in the array?
Try:
System.out.println(values[1]);
System.out.println(values[2]);
System.out.println(values[3]);
System.out.println(values[4]);
See what you get.
Coding is about trying things in order to learn them. Knowledge doesn't magically become embedded in your head. To learn something, you have to practice repeatedly, and in doing so, you're going to make mistakes -- this is just how it goes. Don't give up so easily.
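For reference, the tasks listed in the question could be approached roughly as follows. This is a sketch under assumptions: the class PeopleCsv, its helper methods, and the sample rows are made up for illustration, the column order is taken from the question ([ID], [NAME], [LASTNAME], [EMAIL], [GENDER]), and no field is assumed to contain a comma.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Scanner;

class PeopleCsv {
    // Parses the rows, skipping the header and any row that does not
    // split into exactly five fields.
    static List<String[]> parse(String csvText) {
        List<String[]> rows = new ArrayList<>();
        try (Scanner in = new Scanner(csvText)) {
            if (in.hasNextLine()) in.nextLine(); // header
            while (in.hasNextLine()) {
                String[] values = in.nextLine().split(",");
                if (values.length == 5) rows.add(values);
            }
        }
        return rows;
    }

    static String fullName(String[] row) {
        return row[1] + " " + row[2];
    }

    // First letter of the last name, or "" if the last name is empty.
    static String initial(String[] row) {
        return row[2].isEmpty() ? "" : row[2].substring(0, 1);
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for test.csv.
        String csv = "ID,NAME,LASTNAME,EMAIL,GENDER\n"
                + "1,Ann,Lee,ann@example.com,F\n"
                + "2,Bob,Ray,bob@example.com,M\n"
                + "3,Cid,Orr,cid@example.com,M\n"
                + "4,Dee,Kim,dee@example.com,F\n";
        List<String[]> people = parse(csv);
        System.out.println("Total: " + people.size());

        // First 10 names (or fewer if the list is shorter).
        for (String[] p : people.subList(0, Math.min(10, people.size()))) {
            System.out.println(fullName(p));
        }

        // 3 random names: shuffle a copy and take the first three.
        List<String[]> shuffled = new ArrayList<>(people);
        Collections.shuffle(shuffled);
        for (String[] p : shuffled.subList(0, Math.min(3, shuffled.size()))) {
            System.out.println("Random pick: " + fullName(p));
        }

        Random rnd = new Random();
        for (String[] p : people) {
            System.out.println(p[3]);                    // email only
            System.out.println(initial(p));              // first letter of last name
            System.out.println(p[2] + rnd.nextInt(100)); // last name + random number
        }
    }
}
```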

Reliable saving JTextArea string in multi entry text file

I have software that stores its data in multiple nested data objects. When saving the project data, every instance gets an out handle (BufferedWriter) and writes its own data. Most data is single-line and no problem, but there are a few multiline strings that come from JTextAreas. To store them, I wrote a helper method multiLineWriter() that splits the string into single lines, writes the number of lines, and then writes the lines themselves. In theory. In practice it's not always working: often it writes a line count of 1 but then writes two lines, or it writes 1 followed by two lines of text and an empty line. It's not reliable. After loading the project back, the complete data is often destroyed. A typical object saving block looks like this:
// *** write data to file
public void writeDataFile(BufferedWriter out) {
try {
out.write(""+getHeadline() );
out.newLine();
out.write(""+getStartDateAsString() );
out.newLine();
out.write(""+getEndDateAsString() );
out.newLine();
out.write(""+getPlaceIndex() );
out.newLine();
multiLineWriter(out, getDescription() );
} catch(Exception e) {}
}
// *** read data from File
public void readDataFile(BufferedReader in) {
try {
setHeadline(in.readLine());
setStartDateAsString(in.readLine());
setEndDateAsString(in.readLine());
setPlaceIndex(in.readLine());
setDescription(multiLineReader(in));
} catch(Exception e) {}
}
The multiline writer/reader looks like this:
public void multiLineWriter(BufferedWriter out, String areaText) {
try {
String ls = System.getProperty("line.separator");
String[] lines = areaText.split(ls);
int lineCount = lines.length;
out.write(""+lineCount);
out.newLine();
for(int i = 0;i<lineCount;i++) {
out.write(lines[i]);
out.newLine();
}
} catch(Exception e) {}
}
public String multiLineReader(BufferedReader in) {
String targetString = "";
try {
String ls = System.getProperty("line.separator");
int lineCount = Integer.parseInt(in.readLine());
for(int i = 0;i<lineCount;i++) {
targetString = targetString + in.readLine() + ls;
}
} catch(Exception e) {}
return targetString;
}
As said, lineCount is often 1, but the loop seems to run two or more times, because I sometimes have two or three lines after the 1 in a data file.
This is not reliable enough for the project. Do you have an idea how I can change the multiLineWriter/reader to reliably store and read the data? The JTextArea save method does not work in this combined data file format.
More info:
Properties would be a good style for the whole data file. Since the old style seen above has worked for me most of the time, I am sticking to it. Changing the current project to properties would be a lot of manual work.
I reuse the out. I have a Project object that creates the out. This out is then passed to multiple objects with sub-objects, sometimes in loops, and each one writes its data to this single out. After all data is written, the Project object of course flushes and closes the stream. The empty catch blocks are no problem in this case, because no exceptions are thrown (so there is nothing to analyse in a stack trace). It's not an exception problem but a logic problem.
The JTextArea read/write is not a good option. At the time of saving the file, the data is not in a JTextArea but in a string that was saved from a JTextArea some time earlier during runtime. To use the write method of JTextArea I would need to restore the string to the area and then call write. Because there are hundreds of those description objects, I would need to do this hundreds of times in a save process. That doesn't sound good. On the other hand, I am sure that the read method would not work, because it would read the data file up to the end and wouldn't handle the nested data structure in the file.
It's not bad for the file to be human-readable. Currently this helps me manually correct the values after a save, so I am not losing any data (I know this is stupid, but it works :-)
In short: I guess I have a problem with the split method of String and the content of the strings in the resulting array.
The problem should be made clearer. I have this JTextArea. It is like one field in a display for datasets (it's a little private genealogy program that mainly manages hundreds of persons and places). A lot of data objects have a description field. The contents of the JTextArea are stored in one single String variable (for example, String personDescription) when you change the person on display. The writeDataFile() method you see above is for an event object, which has a description field, too.
So when I write a file, I write from one String to the file. Since this string is taken from the JTextArea, it contains all the newline characters that you can produce in a JTextArea. When storing this with one out.write(data) call, you get multiple lines in the resulting data file because of the newline characters in the String. So you can't read all this content back with one in.readLine() call. That's why I created the multiline writer and reader. But they don't work as expected.
Here is an excerpt from the resulting data file:
...
# +++ FileCollection:
0
# +++ ImageCollection:
0
58
true
Surname
Arthur
25.09.1877
1
01.01.1950
6
https://familysearch.org/
1
Bekannt ist, dass er auf dem Friedhof Großbeerenstr. lag.
Bekannt ist auch, dass die Trauzeugen bei der Heirat Dorothea Surname und Hermann Surname waren. Hermann ist vermutlich ein Bruder von Valerie.
Weitere Informationen gibt es nicht bisher.
# +++ EventCollection:
0
# +++ FileCollection:
0
...
There is more data above and below, but this is where the data is written incorrectly. It's directly below the link to familysearch.org. The first line that follows should contain the line count. If there were no text, it would be a 0 and the next line would be the info string '# +++ EventCollection:'. If there were one line, it would be a 1 and the next line would be that single line of description text, and so on for other numbers, depending on the number of lines from the JTextArea. But as you can see, a 1 is written in this case, and yet 3 (!) lines of text follow.
So the main problem seems to be the way I use the split method in multiLineWriter().
String ls = System.getProperty("line.separator");
String[] lines = areaText.split(ls);
int lineCount = lines.length;
This seems to be the critical part. Since I write the array returned by split in a loop, the loop would have to run three times to produce the 3 lines of text in the data file. But the lineCount is written as 1, so something is wrong. It could be that the string was not split at all and still contains line break characters. That is not what I want: the array of split strings should not contain any line break characters (that would break the file writing, too).
I hope the problem is better described now. The question is: how should the multiline writer and reader methods be designed to store and read this data reliably?
I tried it myself. As I said, there was a problem using the split method on strings. I have now changed this to use a Scanner. To be precise, I used some ideas from the question "How do I use System.getProperty("line.separator").toString()?"
So in the end I just changed the multiLineWriter method to use a Scanner (from the java.util package). It looks like this:
public void multiLineWriter(BufferedWriter out, String areaText) {
List<String> slines = new ArrayList<String>();
try {
Scanner sc = new Scanner(areaText);
while (sc.hasNextLine()) {
slines.add(sc.nextLine());
}
int slineCount = slines.size();
out.write(""+slineCount);
out.newLine();
for(int i = 0;i<slineCount;i++) {
out.write(slines.get(i));
out.newLine();
}
} catch(Exception e) {}
}
This now seems to be reliable for me. I ran a test that wrote the output of both the split method and the Scanner method in parallel; the split method had the wrong line count and the Scanner was correct.
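One plausible explanation for the wrong count, offered here as an assumption rather than something confirmed in the thread: a JTextArea keeps line breaks as "\n" internally, while System.getProperty("line.separator") is "\r\n" on Windows, so split(ls) finds nothing to split on and returns a single element that still contains line breaks. The sketch below instead splits on the regex \R (any line break, available since Java 8) and shows a matching reader. The names MultiLineIo, writeMultiLine, readMultiLine, and roundTrip are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.EOFException;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;

class MultiLineIo {
    // Writes the number of lines, then each line on its own row.
    // The limit -1 keeps trailing empty lines, so the count always
    // matches the number of rows written.
    static void writeMultiLine(BufferedWriter out, String text) throws IOException {
        String[] lines = text.split("\\R", -1);
        out.write(Integer.toString(lines.length));
        out.newLine();
        for (String line : lines) {
            out.write(line);
            out.newLine();
        }
    }

    // Reads the count, then exactly that many lines, joined with "\n".
    static String readMultiLine(BufferedReader in) throws IOException {
        String countLine = in.readLine();
        if (countLine == null) throw new EOFException("missing line count");
        int count = Integer.parseInt(countLine.trim());
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            String line = in.readLine();
            if (line == null) throw new EOFException("fewer lines than announced");
            if (i > 0) sb.append('\n');
            sb.append(line);
        }
        return sb.toString();
    }

    // Writes to an in-memory buffer and reads it back, for demonstration.
    static String roundTrip(String text) {
        try {
            StringWriter buffer = new StringWriter();
            try (BufferedWriter out = new BufferedWriter(buffer)) {
                writeMultiLine(out, text);
            }
            try (BufferedReader in = new BufferedReader(new StringReader(buffer.toString()))) {
                return readMultiLine(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String description = "first line\nsecond line\nthird line";
        System.out.println(description.equals(roundTrip(description)));
    }
}
```

Note that this reader normalizes every line break to "\n", which is the form a JTextArea expects. Unlike the reader in the question, it also does not append an extra separator after the last line.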

Number format exception for large inputs

This code works fine for some inputs,
but I get a NumberFormatException for higher input values such as 1000000.
The input values (read into s[]) range from 1 to 2000000.
What could be the reason?
import java.io.*;
import java.util.*;
import java.text.*;
import java.math.*;
import java.util.regex.*;
public class Solution {
public static void main(String[] args) {
/* Enter your code here. Read input from STDIN. Print output to STDOUT. Your class should be named Solution. */
try
{
BufferedReader read = new BufferedReader(new InputStreamReader(System.in));
int no=Integer.parseInt(read.readLine());
String s[]=read.readLine().split(" ");
int result=0;
for(int i=0; i<no; i++)
{
result+= Integer.parseInt(s[i]);
if(result<0)
result=0;
}
System.out.println(result);
}
catch(IOException e)
{
System.out.println(e.getMessage());
}
}
}
Your for-loop uses the first number you enter as the count of values to read from the second line. That's how your logic works so far. Unless you actually supply that many numbers (typed manually or copy/pasted), this throws an ArrayIndexOutOfBoundsException.
You would get a NumberFormatException if you were to type in non-digits as the second input, or a number larger than Integer.MAX_VALUE (2147483647) or less than Integer.MIN_VALUE (-2147483648).
Entering something like:
1000000
2 1 2 1 2 1 /*... 999990 digits later ...*/ 2 1 2 1
makes the program terminate correctly. Here's the input file I used, if anyone wants it: http://ge.tt/95Lr2Kw/v/0
The program was compiled and run manually from a command prompt like so: java Solution < in.txt.
Edit: I just remembered that the input values in the array could be as large as 2000000. A result value as large as 2000000^2 exceeds Integer.MAX_VALUE, so you would have to use a long (or a BigInteger) to hold it.
I agree with #lzmaki.
I don't get any NumberFormatException for your value.
But I do get an ArrayIndexOutOfBoundsException when I try this:
1000000
1 2 3
then press Enter
because the second line contains only 3 values, while the loop tries to read 1000000 of them from s[].
I got a NumberFormatException in the following case:
1000000
then press Enter twice
because the system then gets an empty string "", which is not a number, to convert to an integer.
I hope my analysis helps you find your bug :)
Assuming it's not a buffer overflow, any non-numeric character passed to Integer.parseInt will throw a NumberFormatException. This includes whitespace and non-printable characters like newlines as well as decimal points (Since floating point numbers are not integers).
You can try validating your inputs by using .trim() when you call read.readLine(), as well as checking for null or empty string before passing to Integer.parseInt(). Something like:
String input = read.readLine();
// readLine() returns null at end of input; reject that and blank lines
if ( input == null || input.trim().isEmpty() )
throw new IllegalArgumentException("Input must be a number");
int no=Integer.parseInt( input.trim() );
However you decide to validate input for the first line, do the same for the second readLine() call. Hopefully you can narrow down exactly what's causing the problem.
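As an illustration of the validation advice above, the second line could be summed as sketched below. This is not the asker's program: the class SafeSum and its method sum are hypothetical, it deliberately leaves out the asker's reset-to-zero step, and it assumes the values are whitespace-separated integers. Splitting on "\\s+" after trim() avoids the empty tokens that split(" ") produces when there are double or trailing spaces, and a long accumulator avoids int overflow.

```java
class SafeSum {
    // Sums whitespace-separated integers; blank input sums to 0.
    static long sum(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return 0L;
        }
        long total = 0L;
        for (String token : trimmed.split("\\s+")) {
            // Long.parseLong still throws NumberFormatException for
            // tokens such as "1.5" or "abc", which is the desired signal.
            total += Long.parseLong(token);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum("  1  2 3 "));
        System.out.println(sum("2000000000 2000000000"));
    }
}
```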

Java Scanner to print previous and next lines

I am using 'java.util.Scanner' to read and scan for keywords and want to print the previous 5 lines and next 5 lines of the encountered keyword, below is my code
ArrayList<String> keywords = new ArrayList<String>();
keywords.add("ERROR");
keywords.add("EXCEPTION");
java.io.File file = new java.io.File(LOG_FILE);
Scanner input = null;
try {
input = new Scanner(file);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
int count = 0;
String previousLine = null;
while(input.hasNext()){
String line = input.nextLine();
for(String keyword : keywords){
if(line.contains(keyword)){
//print prev 5 lines
system.out.println(previousLine); // this will print only last previous line ( i need last 5 previous lines)
???
//print next 5 lines
system.out.println(input.nextLine());
system.out.println(input.nextLine());
system.out.println(input.nextLine());
system.out.println(input.nextLine());
system.out.println(input.nextLine());
}
previousLine = line;
}
any pointers to print previous 5 lines..?
Save them in a Deque<String> such as a LinkedList<String> for its "First In First Out (FIFO)" behavior.
Either that or use 5 variables or an array of 5 Strings, manually move Strings from one slot or variable to another, and then print them.
If you use Deque/LinkedList, use the Deque's addFirst(...) method to add a new String to the beginning and removeLast() to remove the list's last String (if its size is > 5). Iterate through the LinkedList to get the current Strings it contains.
Other suggestions:
Your Scanner's check scanner.hasNextXXX() method should match the get method, scanner.nextXXX(). So you should check for hasNextLine() if you're going to call nextLine(). Otherwise you risk problems.
Please try to post real code here in your questions, not sort-of, will never compile code. i.e., system.out.println vs System.out.println. I know it's a little thing, but it means a lot when others try to play with your code.
Use ArrayList's contains(...) method to get rid of that for loop.
e.g.,
LinkedList<String> fivePrevLines = new LinkedList<>();
java.io.File file = new java.io.File(LOG_FILE);
Scanner input = null;
try {
input = new Scanner(file);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
while (input.hasNextLine()) {
String line = input.nextLine();
if (keywords.contains(line)) {
System.out.println("keyword found!");
for (String prevLine : fivePrevLines) {
System.out.println(prevLine);
}
} else {
fivePrevLines.addFirst(line);
if (fivePrevLines.size() > 5) {
fivePrevLines.removeLast();
}
}
}
if (input != null) {
input.close();
}
Edit
You state in comment:
ok i ran small test program to see if the contains(...) method works ...<unreadable unformatted code>... and this returned keyword not found...!
It's all how you use it. The contains(...) method works to check if a Collection contains another object. It won't work if you feed it a huge String that may or may not use one of the Strings in the collection, but will work on the individual Strings that comprise the larger String. For example:
ArrayList<String> temp = new ArrayList<String>();
temp.add("error");
temp.add("exception");
String s = "Internal Exception: org.apache.tomcat.dbcp.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object";
String[] tokens = s.split("[\\s\\.:,]+");
for (String token : tokens) {
if (temp.contains(token.toLowerCase())) {
System.out.println("keyword found: " + token);
} else {
System.out.println("keyword not found: " + token);
}
}
Also, you will want to avoid posting code in comments since they don't retain their formatting and are unreadable and untestable. Instead edit your original question and post a comment to alert us to the edit.
Edit 2
As per dspyz:
For stacks and queues, when there isn't any significant functionality/performance reason to use one over the other, you should default to ArrayDeque rather than LinkedList. It's generally faster, takes up less memory, and requires less garbage collection.
If your file is small (< a million lines) you are way better off just copying the lines into an ArrayList and then getting the next and previous 5 lines using random access into the array.
Sometimes the best solution is just plain brute force.
Your code is going to get tricky if you have two keyword hits inside your +-5 line window. Let's say you have hits two lines apart. Do you dump two 10-line windows? One 12-line window?
Random access will make implementing this stuff way easier.
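The ideas in this thread (a bounded buffer of previous lines, ArrayDeque instead of LinkedList, and the question of overlapping windows) could be put together roughly like this. It is a sketch under assumptions, not code from any of the answers: the class LogContext and the method context are hypothetical, a hit is any line that contains a keyword as a substring, and overlapping windows are merged into one run rather than printed twice.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class LogContext {
    // Returns every keyword line together with up to n lines before it
    // and up to n lines after it, in file order, without duplicates.
    static List<String> context(List<String> lines, List<String> keywords, int n) {
        List<String> out = new ArrayList<>();
        Deque<String> previous = new ArrayDeque<>(); // oldest line first
        int remainingAfter = 0; // lines still owed after the last hit
        for (String line : lines) {
            boolean hit = keywords.stream().anyMatch(line::contains);
            if (hit) {
                out.addAll(previous); // buffered lines before the hit
                previous.clear();
                out.add(line);
                remainingAfter = n;
            } else if (remainingAfter > 0) {
                out.add(line);        // part of the "next n lines"
                remainingAfter--;
            } else {
                previous.addLast(line);
                if (previous.size() > n) {
                    previous.removeFirst(); // keep only the last n lines
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical log lines standing in for LOG_FILE.
        List<String> log = Arrays.asList(
                "start", "loading", "connecting", "ERROR: timeout",
                "retrying", "connected", "done");
        for (String line : context(log, Arrays.asList("ERROR", "EXCEPTION"), 2)) {
            System.out.println(line);
        }
    }
}
```

When reading from a file, the same loop body can be driven by Scanner.hasNextLine()/nextLine() instead of a List, so the whole file never has to be held in memory.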

Displaying Stats with arrays in Java

I am trying to display statistics from a simple text file using arrays in Java. I know what I am supposed to do, but I don't really know how to code it. Can anybody show me sample code for how to do it?
So let's say the text file is called gameranking.txt, that contains the following information (This is a simple txt file to use as an example):
Game Event, 1st place, second place, third place, fourth place
World of Warcraft, John, Michael, Bill, Chris
Call of Duty, Michael, Chris, John, Bill
League of Legends, John, Chris, Bill, Michael.
My goal is to display stats such as how many first places, second places.. each individual won in a table like the following
Placement First place, second, third, fourth
John 2 0 1 0
Chris 0 2 0 1
etc...
My thought:
First, I would read gameranking.txt and store it in "input". Then I can use a while loop to read each line into a string called "line". Afterward, I would use the String method "split" to pull out each value and store the values in an array. Finally, I would count which placements each individual won and display them in a neat table using printf.
My first problem is I don't know how to create the arrays for this data. Do I first need to read through the file and see how many strings are in each row and column, then create the array table accordingly? Or can I store each string in an array as I read them?
The pseudocode that I have right now is the following.
Count how many rows are there and store it in row
Count how many column are there and store it in column
Create an array
String [] [] gameranking = new String [row] [column]
Next read the text file and store the info into the arrays
using:
while (input.hasNextLine) {
String line = input.nextLine();
while (line.hasNext()) {
Use line.split to pull out each string
first string = event and store it into the array
second string = first place
third string =......
Somewhere in the code, I need to count the placement....
Can somebody please show me how I should go about doing this?
I am not going to write the full program, but I will try to tackle each question and give you a simple suggestion:
Reading the initial file, you can get each line and store it in a string using a BufferedReader (or if you like, use a LineNumberReader)
BufferedReader br = new BufferedReader(new FileReader(file));
String strLine;
while ((strLine = br.readLine()) != null) {
......Do stuff....
}
At that point, inside the while loop, you go through the string (since it is comma-delimited, you can use that to separate each section). For each substring you can
a) compare it with first, second, third, fourth to get the placement.
b) if it's not any of those, then it could be either a game name or a user name.
You can figure that out by position, i.e. by nth substring (if this is the 5th substring, it's likely to be the first game name; since you have 4 players, the next game name will be the 10th substring, etc.). Do note, I ignored "Game Event" as that's not part of the pattern. You can use split to do this, or a number of other options; rather than try to explain that, I will give you a link to a tutorial I found:
http://pages.cs.wisc.edu/~hasti/cs302/examples/Parsing/parseString.html
As for tabulating results, Basically you can get an int array for each player which keeps track of their 1st, 2nd, 3rd, awards etc.
int[] Bob = new int[4]; //where 0 denotes # of 1st awards, etc.
int[] Jane = new int[4]; //where 0 denotes # of 1st awards, etc.
Showing the table is a matter of organizing the data and using a JTable in a GUI:
http://docs.oracle.com/javase/tutorial/uiswing/components/table.html
Alrighty...Here is what I wrote up, I am sure there is a cleaner and faster way, but this should give you an idea:
String[] Contestants = {"Bob","Bill","Chris","John","Michael"};
int[][] contPlace=new int[Contestants.length][4];
String file = "test.txt";
public FileParsing() throws Exception {
Arrays.fill(contPlace[0], 0);
Arrays.fill(contPlace[1], 0);
Arrays.fill(contPlace[2], 0);
Arrays.fill(contPlace[3], 0);
BufferedReader br = new BufferedReader(new FileReader(file));
String strLine;
while((strLine=br.readLine())!=null){
String[] line = strLine.split(",");
System.out.println(line[0]+"/"+line[1]+"/"+line[2]+"/"+line[3]+"/"+line[4]);
if(line[0].equals("Game Event")){
//line[1]==1st place;
//line[2]==2nd place;
//line[3]==3rd place;
}else{//we know we are on a game line, so we can just pick the names
for(int i=0;i<line.length;i++){
for(int j=0;j<Contestants.length;j++){
if(line[i].trim().equals(Contestants[j])){
System.out.println("j="+j+"i="+i+Contestants[j]);
contPlace[j][i-1]++; //i-1 because 1st substring is the game name
}
}
}
}
}
//Now how to get contestants out of the 2d array
System.out.println("Placement First Second Third Fourth");
System.out.println(Contestants[0]+" "+contPlace[0][0]+" "+contPlace[0][1]+" "+contPlace[0][2]+" "+contPlace[0][3]);
System.out.println(Contestants[1]+" "+contPlace[1][0]+" "+contPlace[1][1]+" "+contPlace[1][2]+" "+contPlace[1][3]);
System.out.println(Contestants[2]+" "+contPlace[2][0]+" "+contPlace[2][1]+" "+contPlace[2][2]+" "+contPlace[2][3]);
System.out.println(Contestants[3]+" "+contPlace[3][0]+" "+contPlace[3][1]+" "+contPlace[3][2]+" "+contPlace[3][3]);
System.out.println(Contestants[4]+" "+contPlace[4][0]+" "+contPlace[4][1]+" "+contPlace[4][2]+" "+contPlace[4][3]);
}
If you need to populate the contestants array or keep track of the games, you will have to insert appropriate code. Also note, using this 2-d array method is probably not best if you want to do anything other than display them. You should be able to take my code, add a main, and see it run.
Since it's a text file, you can use the Scanner class.
It can be configured to read the contents line by line, word by word, or by a custom delimiter.
The readfromfile method below opens a plain text file and reads its first comma-delimited token; call scanner.next() in a loop while scanner.hasNext() to read the rest.
public static void readfromfile(String fileName) {
try {
Scanner scanner = new Scanner(new File(fileName));
scanner.useDelimiter(",");
System.out.println(scanner.next()); //instead of printing, take each word and store them in string array
scanner.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
This will get you started.
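If the contestant names are not known in advance, a map from name to a four-element counter array avoids both the hard-coded Contestants array and the need to count rows and columns first. The following is a sketch under assumptions: the class PlacementStats and the method tally are hypothetical, the first line is treated as the header, and the sample lines follow the gameranking.txt layout from the question (without the trailing period).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PlacementStats {
    // Maps each player to {firsts, seconds, thirds, fourths}.
    static Map<String, int[]> tally(List<String> lines) {
        Map<String, int[]> counts = new LinkedHashMap<>();
        for (int row = 1; row < lines.size(); row++) { // row 0 is the header
            String[] fields = lines.get(row).split(",");
            // fields[0] is the game name; fields[1..4] are the placements.
            for (int place = 1; place < fields.length && place <= 4; place++) {
                String name = fields[place].trim();
                if (name.isEmpty()) continue;
                counts.computeIfAbsent(name, k -> new int[4])[place - 1]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Game Event, 1st place, second place, third place, fourth place",
                "World of Warcraft, John, Michael, Bill, Chris",
                "Call of Duty, Michael, Chris, John, Bill",
                "League of Legends, John, Chris, Bill, Michael");
        System.out.printf("%-10s %5s %6s %5s %6s%n",
                "Placement", "First", "Second", "Third", "Fourth");
        for (Map.Entry<String, int[]> e : tally(lines).entrySet()) {
            int[] c = e.getValue();
            System.out.printf("%-10s %5d %6d %5d %6d%n",
                    e.getKey(), c[0], c[1], c[2], c[3]);
        }
    }
}
```

The lines could just as well come from a BufferedReader or Scanner loop as in the answers above; a List is used here only to keep the example self-contained.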
