How can I use split() with a large number of elements in Java?

I need to process a big text file with almost 400 columns per line and almost 800,000 lines. Each line has the following format:
340,9,2,3........5,2,LA
For each line, if the last column is LA, I want to print the first column of that line.
I wrote a simple program to do it:
BufferedReader bufr = new BufferedReader(new FileReader ("A.txt"));
BufferedWriter bufw = new BufferedWriter(new FileWriter ("LA.txt"));
String line = null;
while ((line = bufr.readLine()) != null) {
String [] text = new String [388];
text = line.split(",");
if (text [387] == args[2]) {
bufw.write(text[0]);
bufw.newLine();
bufw.flush();
}
}
bufw.close();
bufr.close();
But it seems the length of an array can't be that big: I received a java.lang.ArrayIndexOutOfBoundsException.
I'm using split(",") to get the last column of a line, and the index ends up outside the array's bounds. How can I deal with this? Thanks.

text does not need to be initialized, String.split will create a correctly sized array:
String[] text = line.split(",");
You're also comparing Strings using reference equality (==). You should be using .equals():
if (text[387].equals(args[2])) { ... }
You're probably getting java.lang.ArrayIndexOutOfBoundsException because the index 387 is too big for the array that split returned. If you want to get the last element, use this:
text[text.length - 1]
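One possible reason the array ends up shorter than 388 elements, assuming some lines end with empty columns, is that split(",") silently drops trailing empty strings. The sketch below illustrates this; the class name, the helper lastField and the sample lines are made up for the example and are not from the question.

```java
class SplitLengthDemo {
    // Returns the last comma-separated field, keeping trailing empty fields.
    static String lastField(String line) {
        String[] parts = line.split(",", -1); // a negative limit keeps trailing empty strings
        return parts[parts.length - 1];
    }

    public static void main(String[] args) {
        String full = "340,9,2,LA";
        String trailingEmpty = "340,9,,"; // hypothetical line whose last two columns are empty

        System.out.println(full.split(",").length);              // 4
        System.out.println(trailingEmpty.split(",").length);     // 2: trailing empty strings are dropped
        System.out.println(trailingEmpty.split(",", -1).length); // 4
        System.out.println(lastField(full));                     // LA
    }
}
```

If every line really has 388 columns and only the comparison is wrong, text[text.length - 1] with .equals() is enough; the negative limit matters only when empty trailing columns are possible.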

Modify your code like this and try it:
String [] text = line.split(",");
if (text [text.length - 1].equals(args[2])) {
bufw.write(text[0]);
bufw.newLine();
bufw.flush();
}
Assuming args[2] is LA.

String [] text;
Change your code to this. You don't need to initialize a size. When the String.split method executes it will automatically initialize the correct size for your array.

If you just need the first and the last column, then there is no need to create an array out of the current line.
You could do something like this:
final String test = "340,9,2,354,63,5,5,45,634,5,5,2,LA";
final char delimiter = ',';
final String lastColumn = test.substring(test.lastIndexOf(delimiter) + 1);
if (lastColumn.equals("LA")) {
final String firstColumn = test.substring(0, test.indexOf(delimiter));
System.out.println(firstColumn);
}
This code extracts the last column first and tests it. If it matches "LA", then it extracts the first column. It ignores the remaining content of the line.
Your code would be:
BufferedReader bufr = new BufferedReader(new FileReader ("A.txt"));
BufferedWriter bufw = new BufferedWriter(new FileWriter ("LA.txt"));
final char delimiter = ','; // same delimiter as in the snippet above
String line = null;
while ((line = bufr.readLine()) != null) {
final String lastColumn = line.substring(line.lastIndexOf(delimiter) + 1);
if (lastColumn.equals(args[2])) {
bufw.write(line.substring(0, line.indexOf(delimiter)));
bufw.newLine();
bufw.flush();
}
}
bufw.close();
bufr.close();
(this code is not tested yet, but you get the idea :))
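For reference, the same idea can be sketched as a small self-contained class. This is only an illustration under a few assumptions: the names LastColumnFilter, firstColumnIfLastMatches and filter are invented here, the file is assumed to be UTF-8, and lines without any comma are treated as malformed and skipped.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LastColumnFilter {
    private static final char DELIMITER = ',';

    // Returns the first column if the last column equals `wanted`, otherwise null.
    static String firstColumnIfLastMatches(String line, String wanted) {
        int last = line.lastIndexOf(DELIMITER);
        if (last < 0) {
            return null; // no delimiter at all: skip the line
        }
        if (!line.substring(last + 1).equals(wanted)) {
            return null;
        }
        return line.substring(0, line.indexOf(DELIMITER));
    }

    // Writes the first column of every matching line of `in` to `out`.
    static void filter(Path in, Path out, String wanted) throws IOException {
        // try-with-resources closes (and flushes) both streams, even on errors
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String first = firstColumnIfLastMatches(line, wanted);
                if (first != null) {
                    writer.write(first);
                    writer.newLine();
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: LastColumnFilter <input> <output> <value>");
            return;
        }
        filter(Paths.get(args[0]), Paths.get(args[1]), args[2]);
    }
}
```

Leaving the flush to close() rather than flushing after every line lets the BufferedWriter actually buffer, which may matter for a file with hundreds of thousands of lines.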

Related

Read from file with BufferedReader

Basically I've got an assignment which reads multiple lines from a .txt file.
There are 4 values in the text file per line and each value is separated by 2 spaces.
There are about 10 lines of data in the file.
After taking the input from the file the program then puts it onto a Database. The database connection functionality works fine.
My issue now is with reading from the file using a BufferedReader.
The issue is that if I uncomment any 1 of the 3 lines at the bottom the BufferedReader reads every other line. And if I don't use them then there's an exception as the next input is of type String.
I have contemplated using a Scanner with the .hasNextLine() method.
Any thoughts on what could be the problem and how to fix it?
Thanks.
File file = new File(FILE_INPUT_NAME);
FileReader fr = new FileReader(file);
BufferedReader readFile = new BufferedReader(fr);
String line = null;
while ((line = readFile.readLine()) != null) {
String[] split = line.split(" ", 4);
String id = split[0];
nameFromFile = split[1];
String year = split[2];
String mark = split[3];
idFromFile = Integer.parseInt(id);
yearOfStudyFromFile = Integer.parseInt(year);
markFromFile = Integer.parseInt(mark);
//line = readFile.readLine();
//readFile.readLine();
//System.out.println(readFile.readLine());
}
Edit: There was an error in the formatting of the .txt file: a missing value.
But now I get an ArrayIndexOutOfBoundsException.
Edit 2: Another error in the .txt file! It turns out there was a single space instead of a double space. It seems to be working now. But any advice on how to deal with file errors like this in the future?
The issue is that if I uncomment any 1 of the 3 lines at the bottom the BufferedReader reads every other line.
Correct. If you put any of those lines of code in, the line of text it reads will be thrown away and not processed. You're already reading in the while condition, so you don't need another read.
A compilable version of the code posted could be
public void read() throws IOException {
File file = new File(FILE_INPUT_NAME);
FileReader fr = new FileReader(file);
BufferedReader readFile = new BufferedReader(fr);
String line;
while ((line = readFile.readLine()) != null) {
String[] split = line.split(" ", 4);
if (split.length != 4) { // Not enough tokens (e.g., empty line) read
continue;
}
String id = split[0];
String nameFromFile = split[1];
String year = split[2];
String mark = split[3];
int idFromFile = Integer.parseInt(id);
int yearOfStudyFromFile = Integer.parseInt(year);
int markFromFile = Integer.parseInt(mark);
//line = readFile.readLine();
//readFile.readLine();
//System.out.println(readFile.readLine());
}
}
The above splits on a single space (" ") rather than the two spaces described in the question. To split on any amount of whitespace, a regular expression can be used, e.g. "\\s+". Of course, exactly two spaces can also be used, if that reflects the structure of the input data.
What the method should do with the extracted values (e.g., returning them in an object of some type, or saving them to a database directly), is up to the application using it.
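On the question of how to cope with formatting errors in the file, one option is to validate each line and skip (or report) the ones that do not fit. The following is a rough sketch, not the asker's code: the class RecordLineParser and the holder StudentRecord are hypothetical, and it assumes the name column contains no whitespace.

```java
import java.util.Optional;

class RecordLineParser {
    // Hypothetical value holder for one line: id, name, year of study, mark.
    static final class StudentRecord {
        final int id;
        final String name;
        final int year;
        final int mark;

        StudentRecord(int id, String name, int year, int mark) {
            this.id = id;
            this.name = name;
            this.year = year;
            this.mark = mark;
        }
    }

    // Returns the parsed record, or Optional.empty() if the line is malformed.
    static Optional<StudentRecord> parse(String line) {
        String[] parts = line.trim().split("\\s+"); // one or more whitespace characters
        if (parts.length != 4) {
            return Optional.empty(); // missing or extra value
        }
        try {
            return Optional.of(new StudentRecord(
                    Integer.parseInt(parts[0]), parts[1],
                    Integer.parseInt(parts[2]), Integer.parseInt(parts[3])));
        } catch (NumberFormatException e) {
            return Optional.empty(); // a numeric column did not contain a number
        }
    }

    public static void main(String[] args) {
        String[] samples = {"1  Alice  2  90", "2 Bob 3  75", "3  Carol  1", "x  Dave  1  60"};
        for (String sample : samples) {
            System.out.println((parse(sample).isPresent() ? "ok:      " : "skipped: ") + sample);
        }
    }
}
```

Splitting on "\\s+" tolerates a single space where two were expected, while the length check and the NumberFormatException handler catch missing or non-numeric values instead of letting the loop crash.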

Buffered Reader find specific line separator char then read that line

My program needs to read from a multi-line .ini file. I've got it to the point where it reads every line that starts with a # and prints it, but I only want to record the value after the = sign. Here's what the file should look like:
#music=true
#Volume=100
#Full-Screen=false
#Update=true
This is what I want it to print:
true
100
false
true
This is the code I'm currently using:
@SuppressWarnings("resource")
public void getSettings() {
try {
BufferedReader br = new BufferedReader(new FileReader(new File("FileIO Plug-Ins/Game/game.ini")));
String input = "";
String output = "";
while ((input = br.readLine()) != null) {
String temp = input.trim();
temp = temp.replaceAll("#", "");
temp = temp.replaceAll("[*=]", "");
output += temp + "\n";
}
System.out.println(output);
}catch (IOException ex) {}
}
I'm not sure whether replaceAll("[*=]", "") truly means anything at all or whether it's just searching for all of those chars. Any help is appreciated!
Try the following:
if (temp.startsWith("#")){
String[] splitted = temp.split("=");
output += splitted[1] + "\n";
}
Explanation:
To process only the lines starting with the desired character, use the String#startsWith method. Once you have a string to extract values from, String#split will split the text on the character you pass as the method argument. So in your case, the text before the = character will be in the array at position 0, and the text you want to print will be at position 1.
Also note that if your file contains many lines starting with #, it would be wise not to concatenate strings with +=, but to use a StringBuilder / StringBuffer instead.
Hope it helps.
Better use a StringBuffer instead of using += with a String as shown below. Also, avoid declaring variables inside loop. Please see how I've done it outside the loop. It's the best practice as far as I know.
StringBuffer outputBuffer = new StringBuffer();
String[] fields;
String temp;
while((input = br.readLine()) != null)
{
temp = input.trim();
if(temp.startsWith("#"))
{
fields = temp.split("=");
outputBuffer.append(fields[1] + "\n");
}
}
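One caveat with the split("=") approaches above: fields[1] throws an ArrayIndexOutOfBoundsException for a line with no = sign or with nothing after it, because split drops trailing empty strings. A possible alternative, sketched here with the invented names IniValues and values, is to look for the first = with indexOf:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IniValues {
    // Returns the text after the first '=' for each line that starts with '#'.
    static List<String> values(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            int eq = line.indexOf('=');
            if (line.startsWith("#") && eq >= 0) {
                result.add(line.substring(eq + 1).trim()); // empty string if nothing follows '='
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "#music=true", "#Volume=100", "#Full-Screen=false", "#Update=true");
        // String.join builds the output in one step instead of repeated +=
        System.out.println(String.join("\n", values(sample)));
    }
}
```

Using the first = also keeps values that themselves contain an = intact, which split("=") would cut apart.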

Reading from text area by line and assigning variables

I need to append variables to each line of text from a TextArea. The TextArea is coded, and works perfectly. I can retrieve information from the TextArea by using TextArea.getText();
To break it apart, I am trying to use a BufferedReader. Unfortunately, this does not work. Is there a different way of doing this? Here is an example of how the information needs to be written in the text area:
"workerName"
"workerDepartment"
"workerNumber"
BufferedReader inStream= new BufferedReader
(new InputStreamReader(TextArea.getText()));
String workerName = "";
String workerDepartment = "";
int workerNumber = 0;
String line = inStream.readLine();
while (line != null) {
workerName = line;
line = inStream.readLine();
workerDepartment = line;
line = inStream.readLine();
workerNumber = Integer.parseInt(line);
}
inStream.close();
If the lines are separated by a delimiter (for example a newline or a comma), use the split method of String and pass it the delimiter:
String[] lines = TextArea.getText().split("\n");
//then you can access your array
String workerName = lines[0];
String workerDepartment = lines[1];
// and so on
Also, you need to check the array size before reading a value to prevent an ArrayIndexOutOfBoundsException. For example, if there are only two lines then you should not access lines[2], so do the check:
if ( lines.length < 3 ) {
// input is not complete, show error message
}
else {
// do your splitting and reading values
}
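If the text area can hold several workers, three lines each, the split-and-check idea above might be extended roughly as follows. This is a sketch under assumptions: WorkerTextParser and Worker are hypothetical names, the text contains no blank lines between entries, and "\\R" (available since Java 8) is used so that both \n and \r\n line endings are handled.

```java
import java.util.ArrayList;
import java.util.List;

class WorkerTextParser {
    // Hypothetical holder for the three values the question reads per worker.
    static final class Worker {
        final String name;
        final String department;
        final int number;

        Worker(String name, String department, int number) {
            this.name = name;
            this.department = department;
            this.number = number;
        }
    }

    // Parses text consisting of groups of three lines: name, department, number.
    static List<Worker> parse(String text) {
        String[] lines = text.trim().split("\\R"); // \R matches any line break
        if (lines.length % 3 != 0) {
            throw new IllegalArgumentException(
                    "Expected 3 lines per worker, got " + lines.length + " lines");
        }
        List<Worker> workers = new ArrayList<>();
        for (int i = 0; i < lines.length; i += 3) {
            workers.add(new Worker(lines[i].trim(), lines[i + 1].trim(),
                    Integer.parseInt(lines[i + 2].trim())));
        }
        return workers;
    }

    public static void main(String[] args) {
        // In the real program the argument would come from the text area's getText()
        for (Worker w : parse("Ann\nSales\n7\nBob\nIT\n12")) {
            System.out.println(w.name + " / " + w.department + " / " + w.number);
        }
    }
}
```

A non-numeric worker number surfaces as a NumberFormatException, which the caller can catch to show an error message.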

JAVA - import CSV to ArrayList

I'm trying to import a CSV file into an ArrayList using StringTokenizer:
public class Test
{
public static void main(String [] args)
{
List<ImportedXls> datalist = new ArrayList<ImportedXls>();
try
{
FileReader fr = new FileReader("c:\\temp.csv");
BufferedReader br = new BufferedReader(fr);
String stringRead = br.readLine();
while( stringRead != null )
{
StringTokenizer st = new StringTokenizer(stringRead, ",");
String docNumber = st.nextToken( );
String note = st.nextToken( ); /** PROBLEM */
String index = st.nextToken( ); /** PROBLEM */
ImportedXls temp = new ImportedXls(docNumber, note, index);
datalist.add(temp);
// read the next line
stringRead = br.readLine();
}
br.close( );
}
catch(IOException ioe){...}
for (ImportedXls item : datalist) {
System.out.println(item.getDocNumber());
}
}
}
I don't understand how nextToken works: if I initialize all three variables (docNumber, note and index) with nextToken(), it fails with:
Exception in thread "main" java.util.NoSuchElementException
at java.util.StringTokenizer.nextToken(Unknown Source)
at _test.Test.main(Test.java:32)
If I keep docNumber only, it works. Could you help me?
It seems that some of the rows of your input file have fewer than 3 comma-separated fields. You should always check whether the tokenizer has more tokens (StringTokenizer.hasMoreTokens), unless you are 100% sure your input is correct.
Parsing CSV files correctly is not a trivial task. Why not use a library that does it very well, such as http://opencsv.sourceforge.net/ ?
Seems like your code is getting to a line that the Tokenizer is only breaking up into 1 part instead of 3. Is it possible to have lines with missing data? If so, you need to handle this.
Most probably your input file doesn't contain another element delimited by , in at least one line. Please show us your input - if possible the line that fails.
However, you don't need to use StringTokenizer. Using String#split() might be easier:
...
while( stringRead != null )
{
String[] elements = stringRead.split(",");
if(elements.length < 3) {
throw new RuntimeException("line too short"); //handle missing entries
}
String docNumber = elements[0];
String note = elements[1];
String index = elements[2];
ImportedXls temp = new ImportedXls(docNumber, note, index);
datalist.add(temp);
// read the next line
stringRead = br.readLine();
}
...
You should be able to check your tokens using the hasMoreTokens() method. If this returns false, then it's possible that the line you've read does not contain anything (i.e., an empty string).
It would be better, though, to use the String.split() method. StringTokenizer is not deprecated, but its Javadoc describes it as a legacy class whose use is discouraged in new code.
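A related detail that may explain the NoSuchElementException even when a row looks complete: StringTokenizer skips empty fields, so a row with an empty middle column yields fewer tokens than it has columns. The sketch below compares the two approaches; the class name and the sample row are invented for illustration.

```java
import java.util.StringTokenizer;

class TokenizerVsSplit {
    // Number of tokens StringTokenizer reports for a comma-separated line.
    static int tokenizerCount(String line) {
        return new StringTokenizer(line, ",").countTokens();
    }

    // Number of fields split reports when empty fields are kept.
    static int splitCount(String line) {
        return line.split(",", -1).length;
    }

    public static void main(String[] args) {
        String row = "D1,,7"; // hypothetical row with an empty note column
        System.out.println(tokenizerCount(row)); // 2: the empty field is skipped
        System.out.println(splitCount(row));     // 3: the empty field is kept as ""
    }
}
```

With the tokenizer, the third nextToken() call on such a row would throw, whereas split with a negative limit returns an empty string for the missing note.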

How to read a String (file) to array in java

Suppose there is a file named SUN.txt.
The file contains: a,b,dd,ss,
I want to create a dynamic array whose size depends on the number of attributes in the file.
If there is a character after the last comma, the array will have indices 0-4, i.e. length 5.
In the case above there is no character after the last comma, so I get indices 0-3, i.e. an array of length 4. I want to read the NULL after the comma too.
How do I do that?
Sundhas
You should think about:
1) Reading the file into a String
2) Splitting the file by the separator ','
3) Using a list for adding the characters, and converting the list to an array when the list is filled
As Markus said, you want to do something like this..
//Create a buffered reader so that you can read in the file
BufferedReader reader = new BufferedReader(new FileReader(new File(
"\\SUN.txt")));
//The StringBuffer will be used to create a string if your file has multiple lines
StringBuffer sb = new StringBuffer();
String line;
while((line = reader.readLine())!= null)
{
sb.append(line);
}
//We now split the line on the "," to get a string array of the values
String [] store = sb.toString().split(",");
I do not quite understand why you would want the NULL after the comma. I am assuming you mean that you would like the element after the last comma to be null in your array? I do not quite see the point of that, but that is not what the question is about.
If that is the case, you won't read in a NULL. If there were a space after the comma, you could read that in.
If you would like a NULL, you would have to add it yourself at the end, so you could do something like:
//Create a buffered reader so that you can read in the file
BufferedReader reader = new BufferedReader(new FileReader(new File(
"\\SUN.txt")));
//Use an ArrayList to store the values including nulls
ArrayList<String> store = new ArrayList<String>();
String line;
while((line = reader.readLine())!= null)
{
String [] splitLine = line.split(",");
for(String x : splitLine)
{
//Add each individual value, not the whole line
store.add(x);
}
//This tests to see if the last character of the line is , and will add a null into the array list
if(line.endsWith(","))
store.add(null);
}
reader.close();
//toArray() without an argument returns Object[], so pass a typed array
String [] storeWithNull = store.toArray(new String[0]);
Well, if you simply want to open the file and store the content in an array of strings, then:
1) read the file into a string
2) split the string using the regex "," http://download.oracle.com/javase/1.5.0/docs/api/java/lang/String.html#split(java.lang.String)
But I'm curious why you can't use a String directly?
For your data structure, use a list of arrays. Each list entry is a line of your text file, and each entry is an array that holds the comma-separated values:
List<String[]> data = new ArrayList<String[]>();
String line = readNextLine(); // custom method, to be implemented
while (line != null) {
data.add(line.split(","));
line = readNextLine();
}
(assuming, your file contains 1..n lines of comma separated values)
You may want to have it like this:
"a,b,c,d," -> {"a", "b", "c", "d", null}
Here's a suggestion how to solve that problem:
List<String[]> data = new ArrayList<String[]>();
String line = readNextLine(); // custom method, to be implemented
while (line != null) {
String[] values = new String[5];
String[] pieces = line.split(",");
for (int i = 0; i<pieces.length; i++)
values[i] = pieces[i];
data.add(values);
line = readNextLine();
}
It seems like a CSV file. Something like this will work, assuming it has 5 lines and 5 values per line:
String [][] value = new String [5][5];
File file = new File("SUN.txt");
BufferedReader br = new BufferedReader(new FileReader(file));
String line = null;
int row = 0;
int col = 0;
while((line = br.readLine()) != null ){
StringTokenizer s = new StringTokenizer(line,",");
while (s.hasMoreTokens()){
value[row][col] = s.nextToken();
col++;
}
col = 0;
row++;
}
I haven't tested this code.
Read the file, using BufferedReader, one line at the time.
Use split(",", -1) to convert to an array of String[] including also empty strings beyond the last comma as part of your array.
Load the String[] parts into a List.
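The three steps above could look roughly like this for a single line. It is a sketch rather than a complete file reader: the class name TrailingFieldDemo and the method fields are invented, and turning empty strings into null is an assumption about what the asker wants.

```java
import java.util.ArrayList;
import java.util.List;

class TrailingFieldDemo {
    // Splits a line on commas, keeping the empty field after a trailing comma.
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        for (String part : line.split(",", -1)) { // -1 keeps trailing empty strings
            out.add(part.isEmpty() ? null : part); // assumption: empty fields become null
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(fields("a,b,dd,ss,"));  // [a, b, dd, ss, null]
        System.out.println(fields("a,b,dd,ss,x")); // [a, b, dd, ss, x]
    }
}
```

Both sample lines produce five elements, so the array length no longer depends on whether something follows the last comma.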
