I'm trying to create a 2D array from a .txt file, where the .txt file looks something like this:
xxxx
xxxx
xxxx
xxxx
or something like this:
xxx
xxx
xxx
So I need to handle multiple sizes of 2D array (note: the x and y dimensions of an array will not always be equal). Is there any way to initialize the array, or to get the number of characters/letters/numbers per line and the number of lines? I do not want to use a fixed-size declaration, something like:
String[][] myArray = new String[100][100];
And then would filling the array using the File and Scanner classes look like this?
File f = new File(filename);
Scanner input = new Scanner(f);
for(int i = 0; i < myArray.length; i++){
for(int j = 0; j < myArray[i].length; j++){
myArray[i][j] = input.nextLine();
}
}
You have several choices as I see it:
Iterate through the file twice, the first time getting the array parameters, or
Iterate through it once, but fill up a List<List<SomeType>> possibly instantiating your Lists as ArrayLists. The latter will give you much greater flexibility in the short and long run.
(per MadProgrammer) The third option is to re-structure the file to provide the meta data required to make decisions about the size of the array.
For example, using your code,
File f = new File(filename);
Scanner input = new Scanner(f);
List<List<String>> nestedLists = new ArrayList<>();
while (input.hasNextLine()) {
String line = input.nextLine();
List<String> innerList = new ArrayList<>();
Scanner innerScanner = new Scanner(line);
while (innerScanner.hasNext()) {
innerList.add(innerScanner.next());
}
nestedLists.add(innerList);
innerScanner.close();
}
input.close();
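If a plain array is still needed afterwards, the nested lists can be copied into one once reading has finished, since all the sizes are known by then. The following is only a sketch: the class and method names (NestedListToArray, readTokens, toArray) are made up for illustration, and it reads from a String instead of a file so that it runs on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class NestedListToArray {
    // Same reading loop as above: one inner list of whitespace-separated tokens per line.
    static List<List<String>> readTokens(Scanner input) {
        List<List<String>> nestedLists = new ArrayList<>();
        while (input.hasNextLine()) {
            List<String> innerList = new ArrayList<>();
            Scanner innerScanner = new Scanner(input.nextLine());
            while (innerScanner.hasNext()) {
                innerList.add(innerScanner.next());
            }
            innerScanner.close();
            nestedLists.add(innerList);
        }
        return nestedLists;
    }

    // Copies the lists into a String[][]; each row gets the length of its own line.
    static String[][] toArray(List<List<String>> nested) {
        String[][] result = new String[nested.size()][];
        for (int i = 0; i < nested.size(); i++) {
            result[i] = nested.get(i).toArray(new String[0]);
        }
        return result;
    }

    public static void main(String[] args) {
        // A String stands in for the file here; use new Scanner(new File(filename)) for real input.
        String[][] grid = toArray(readTokens(new Scanner("a b c\nd e\n")));
        System.out.println(grid.length + " rows, first row has " + grid[0].length + " columns");
    }
}
```

The rows of the resulting array do not have to be the same length, which is what makes this work for files of any shape.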
In Java, each row of a 2D array is itself an array, so every row can have whatever length you decide.
You can use: ArrayUtils.add(char[] array, char element) //static method (from Apache Commons Lang)
But before that, you need to check the length of the lines in the file.
Alternatively, you can use a nested ArrayList (an ArrayList of ArrayLists) as the collection that holds your data.
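As a rough sketch of the "each row is its own array" idea, assuming the goal is one char per cell: once the lines are in memory, each row can be sized from its own line. The class and method names (JaggedCharGrid, fromLines) are hypothetical, and the lines are hard-coded here; in a real program they could come from something like java.nio.file.Files.readAllLines.

```java
import java.util.Arrays;
import java.util.List;

class JaggedCharGrid {
    // Builds a char[][] whose row count is the number of lines
    // and whose row lengths follow the individual lines.
    static char[][] fromLines(List<String> lines) {
        char[][] grid = new char[lines.size()][];
        for (int row = 0; row < lines.size(); row++) {
            grid[row] = lines.get(row).toCharArray();
        }
        return grid;
    }

    public static void main(String[] args) {
        // Hard-coded stand-in for the contents of the .txt file.
        List<String> lines = Arrays.asList("xxx", "xxx", "xxx");
        char[][] grid = fromLines(lines);
        System.out.println(grid.length + " x " + grid[0].length);
    }
}
```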
I have a .txt file that lists integers in groups like so:
20,15,10,1,2
7,8,9,22,23
11,12,13,9,14
and I want to read in one of those groups randomly and store the integers of that group in an array. How would I go about doing this? Every group is one line of five integers separated by commas. The only way I could think of doing this is by incrementing a variable in a while loop to get the number of lines and then somehow reading from one randomly chosen line, but I'm not sure how to read from only that line. Here's the code that I could come up with to sort of explain what I'm thinking:
int line = 0;
Scanner filescan = new Scanner (new File("Coords.txt"));
while (filescan.hasNextLine())
{
line++;
}
Random r = new Random(line);
Now what do I do to make it scan line r and place all of the integers read on line r into a 1-d array?
There is an older answer on Stack Overflow about choosing a line at random. Using its choose() method you can pick a random line from the file. I take no credit for that answer; if you like this one, upvote the original.
String[] numberLine = choose(new File("Coords.txt")).split(",");
int[] numbers = new int[5];
for(int i = 0; i < 5; i++)
numbers[i] = Integer.parseInt(numberLine[i]);
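The choose() method itself is not reproduced in this thread. One common way to write such a method, which may or may not match the answer referred to, is reservoir sampling: read the lines once and replace the current pick with the n-th line with probability 1/n. The sketch below assumes a signature taking a Scanner and a Random (rather than a File) so that it can be tried without a file on disk.

```java
import java.util.Random;
import java.util.Scanner;

class RandomLine {
    // Reservoir sampling with a reservoir of size one:
    // every line ends up chosen with equal probability, in a single pass.
    static String choose(Scanner lines, Random rand) {
        String chosen = null;
        int seen = 0;
        while (lines.hasNextLine()) {
            String line = lines.nextLine();
            seen++;
            // rand.nextInt(seen) is 0 with probability 1/seen
            if (rand.nextInt(seen) == 0) {
                chosen = line;
            }
        }
        return chosen; // null if there were no lines at all
    }

    public static void main(String[] args) {
        // A String stands in for Coords.txt; use new Scanner(new File("Coords.txt")) for the real file.
        Scanner sc = new Scanner("20,15,10,1,2\n7,8,9,22,23\n11,12,13,9,14");
        System.out.println(choose(sc, new Random()));
    }
}
```

This avoids both counting the lines first and keeping them all in memory, at the cost of one random number per line.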
I'm assuming you know how to parse the line and get the integers out (Integer.parseInt, perhaps with a regular expression). If you're using a Scanner, you can specify that in your constructor.
Keep the contents of each line, and use that:
Scanner filescan = new Scanner (new File("Coords.txt"));
List<String> content = new ArrayList<String>(); // new
while (filescan.hasNextLine())
{
content.add(filescan.nextLine()); // new: consume the whole line, otherwise the loop never advances
}
filescan.close();
Random r = new Random(); // no argument: a constructor argument would be a seed, not an upper bound
String numbers = content.get(r.nextInt(content.size())); // new
// Get numbers out of "numbers"
Read lines one by one from the file, store them in a list and generate a random number from the list's size and use it to get the random line.
public static void main(String[] args) throws Exception {
List<String> aList = new ArrayList<String>();
Scanner filescan = new Scanner(new File("Coords.txt"));
while (filescan.hasNextLine()) {
String nxtLn = filescan.nextLine();
//there can be empty lines in your file, ignore them
if (!nxtLn.isEmpty()) {
//add lines to the list
aList.add(nxtLn);
}
}
System.out.println();
Random r = new Random();
int randomIndex=r.nextInt(aList.size());
//get the random line
String line=aList.get(randomIndex);
//make 1 d array
//...
}
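The "make 1 d array" step left open above could look roughly like this, assuming every line holds comma-separated integers as in the question. The class and method names (LineToIntArray, parse) are made up for the example.

```java
import java.util.Arrays;

class LineToIntArray {
    // Splits a line such as "20,15,10,1,2" on commas and parses each piece.
    static int[] parse(String line) {
        String[] parts = line.split(",");
        int[] numbers = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // trim() tolerates stray spaces around the commas
            numbers[i] = Integer.parseInt(parts[i].trim());
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("20,15,10,1,2")));
    }
}
```

Sizing the array from parts.length rather than a fixed 5 keeps the method usable if the groups ever change length; Integer.parseInt throws NumberFormatException on anything that is not an integer.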
My data is stored in large matrices stored in txt files with millions of rows and 4 columns of comma-separated values. (Each column stores a different variable, and each row stores a different millisecond's data for all four variables.) There is also some irrelevant header data in the first dozen or so lines. I need to write Java code to load this data into four arrays, with one array for each column in the txt matrix. The Java code also needs to be able to tell when the header is done, so that the first data row can be split into entries for the 4 arrays. Finally, the java code needs to iterate through the millions of data rows, repeating the process of decomposing each row into four numbers which are each entered into the appropriate array for the column in which the number was located.
Can anyone show me how to alter the code below in order to accomplish this?
I want to find the fastest way to accomplish this processing of millions of rows. Here is my code:
MainClass2.java
package packages;
public class MainClass2{
public static void main(String[] args){
readfile2 r = new readfile2();
r.openFile();
int x1Count = r.readFile();
r.populateArray(x1Count);
r.closeFile();
}
}
readfile2.java
package packages;
import java.io.*;
import java.util.*;
public class readfile2 {
private Scanner scan1;
private Scanner scan2;
public void openFile(){
try{
scan1 = new Scanner(new File("C:\\test\\samedatafile.txt"));
scan2 = new Scanner(new File("C:\\test\\samedatafile.txt"));
}
catch(Exception e){
System.out.println("could not find file");
}
}
public int readFile(){
int scan1Count = 0;
while(scan1.hasNext()){
scan1.next();
scan1Count += 1;
}
return scan1Count;
}
public double[] populateArray(int scan1Count){
double[] outputArray1 = new double[scan1Count];
double[] outputArray2 = new double[scan1Count];
double[] outputArray3 = new double[scan1Count];
double[] outputArray4 = new double[scan1Count];
int i = 0;
while(scan2.hasNext()){
//what code do I write here to:
// 1.) identify the start of my time series rows after the end of the header rows (e.g. row starts with a number AT LEAST 4 digits in length.)
// 2.) split each time series row's data into a separate new entry for each of the 4 output arrays
i++;
}
return outputArray1, outputArray2, outputArray3, outputArray4;
}
public void closeFile(){
scan1.close();
scan2.close();
}
}
Here are the first 19 lines of a typical data file:
text and numbers on first line
1 msec/sample
3 channels
ECG
Volts
Z_Hamming_0_05_LPF
Ohms
dz/dt
Volts
min,CH2,CH4,CH41,
,3087747,3087747,3087747,
0,-0.0518799,17.0624,0,
1.66667E-05,-0.0509644,17.0624,-0.00288295,
3.33333E-05,-0.0497437,17.0624,-0.00983428,
5E-05,-0.0482178,17.0624,-0.0161573,
6.66667E-05,-0.0466919,17.0624,-0.0204402,
8.33333E-05,-0.0448608,17.0624,-0.0213986,
0.0001,-0.0427246,17.0624,-0.0207532,
0.000116667,-0.0405884,17.0624,-0.0229672,
EDIT
I tested Shilaghae's code suggestion. It seems to work. However, the length of all the resulting arrays is the same as x1Count, so that zeros remain in the places where Shilaghae's pattern matching code is not able to place a number. (This is a result of how I wrote the code originally.)
I was having trouble finding the indices where zeros remain, but there seemed to be a lot more zeros besides the ones expected where the header was. When I graphed the derivative of the temp[1] output, I saw a number of sharp spikes where false zeros in temp[1] might be. If I can tell where the zeros in temp[1], temp[2], and temp[3] are, I might be able to modify the pattern matching to better retain all the data.
Also, it would be nice to simply shorten the output array to no longer include the rows where the header was in the input file. However, the tutorials I have found regarding variable length arrays only show oversimplified examples like:
int[] anArray = {100, 200, 300, 400};
The code might run faster if it no longer uses scan1 to produce scan1Count. I do not want to slow the code down by using an inefficient method to produce a variable-length array. And I also do not want to skip data in my time series in the cases where the pattern matching is not able to split the input row into 4 numbers. I would rather keep the in-time-series zeros so that I can find them and use them to debug the pattern matching.
Can anyone show how to do these things in fast-running code?
SECOND EDIT
So
"-{0,1}\\d+.\\d+,"
repeats four times in the expression:
"-{0,1}\\d+.\\d+,-{0,1}\\d+.\\d+,-{0,1}\\d+.\\d+,-{0,1}\\d+.\\d+,"
Does
"-{0,1}\\d+.\\d+,"
decompose into the following three statements:
"-{0,1}" means that a minus sign occurs zero or one times, while
"\\d+." means that the minus sign(or lack of minus sign) is followed by several digits of any value followed by a decimal point, so that finally
"\\d+," means that the decimal point is followed by several digits of any value?
If so, what about numbers in my data like "1.66667E-05," or "-8.06131E-05," ? I just scanned one of the input files, and (out of 3+ million 4-column rows) it contains 638 numbers that contain E, of which 5 were in the first column, and 633 were in the last column.
FINAL EDIT
The final code was very simple, and simply involved using string.split() with "," as the regular expression. To do that, I had to manually delete the headers from the input file so that the data only contained rows with 4 comma separated numbers.
In case anyone is curious, the final working code for this is:
public double[][] populateArray(int scan1Count){
double[] outputArray1 = new double[scan1Count];
double[] outputArray2 = new double[scan1Count];
double[] outputArray3 = new double[scan1Count];
double[] outputArray4 = new double[scan1Count];
try {
File tempfile = new File("C:\\test\\mydatafile.txt");
FileInputStream fis = new FileInputStream(tempfile);
DataInputStream in = new DataInputStream(fis);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strLine;
int i = 0;
while ((strLine = br.readLine()) != null) {
String[] split = strLine.split(",");
outputArray1[i] = Double.parseDouble(split[0]);
outputArray2[i] = Double.parseDouble(split[1]);
outputArray3[i] = Double.parseDouble(split[2]);
outputArray4[i] = Double.parseDouble(split[3]);
i++;
}
} catch (IOException e) {
System.out.println("e for exception is:"+e);
e.printStackTrace();
}
double[][] temp = new double[4][];
temp[0]= outputArray1;
temp[1]= outputArray2;
temp[2]= outputArray3;
temp[3]= outputArray4;
return temp;
}
Thank you for everyone's help. I am going to close this thread now because the question has been answered.
You could read the file line by line and, for every line, check with a regular expression (http://www.vogella.de/articles/JavaRegularExpressions/article.html) whether the line contains exactly 4 commas.
If it does, you can split the line with String.split and fill the 4 arrays; otherwise you move on to the next line.
public double[][] populateArray(int scan1Count){
double[] outputArray1 = new double[scan1Count];
double[] outputArray2 = new double[scan1Count];
double[] outputArray3 = new double[scan1Count];
double[] outputArray4 = new double[scan1Count];
//Read File Line By Line
try {
File tempfile = new File("samedatafile.txt");
FileInputStream fis = new FileInputStream(tempfile);
DataInputStream in = new DataInputStream(fis);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strLine;
int i = 0;
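// Compile the pattern once, before the loop, instead of once per line (needs java.util.regex.Pattern and Matcher)
Pattern pattern = Pattern.compile("-{0,1}\\d+.\\d+,-{0,1}\\d+.\\d+,-{0,1}\\d+.\\d+,-{0,1}\\d+.\\d+,");
while ((strLine = br.readLine()) != null) {
Matcher matcher = pattern.matcher(strLine);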
if (matcher.matches()){
String[] split = strLine.split(",");
outputArray1[i] = Double.parseDouble(split[0]);
outputArray2[i] = Double.parseDouble(split[1]);
outputArray3[i] = Double.parseDouble(split[2]);
outputArray4[i] = Double.parseDouble(split[3]);
}
i++;
}
} catch (IOException e) {
e.printStackTrace();
}
double[][] temp = new double[4][];
temp[0]= outputArray1;
temp[1]= outputArray2;
temp[2]= outputArray3;
temp[3]= outputArray4;
return temp;
}
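Two details of that pattern are worth checking against the sample rows from the question. Each field must contain a digit, one arbitrary character (the unescaped "."), and more digits, so a bare "0" or a value such as "1.66667E-05" makes the whole row fail to match. The sketch below compares the pattern as posted with a more tolerant one that allows an optional fraction and an optional exponent. The tolerant pattern and the names used (RowPatternDemo, ORIGINAL, TOLERANT, NUMBER) are assumptions for illustration, not part of the answer.

```java
import java.util.regex.Pattern;

class RowPatternDemo {
    // The pattern from the answer, unchanged.
    static final Pattern ORIGINAL = Pattern.compile(
            "-{0,1}\\d+.\\d+,-{0,1}\\d+.\\d+,-{0,1}\\d+.\\d+,-{0,1}\\d+.\\d+,");

    // Optional minus, digits, optional ".digits", optional exponent such as E-05.
    static final String NUMBER = "-?\\d+(?:\\.\\d+)?(?:[eE][-+]?\\d+)?";

    // Exactly four such numbers, each followed by a comma.
    static final Pattern TOLERANT = Pattern.compile("(?:" + NUMBER + ",){4}");

    public static void main(String[] args) {
        String[] rows = {
            "min,CH2,CH4,CH41,",
            "0,-0.0518799,17.0624,0,",
            "1.66667E-05,-0.0509644,17.0624,-0.00288295,",
            "0.0001,-0.0427246,17.0624,-0.0207532,"
        };
        for (String row : rows) {
            System.out.println(row
                    + "  original=" + ORIGINAL.matcher(row).matches()
                    + "  tolerant=" + TOLERANT.matcher(row).matches());
        }
    }
}
```

Rows rejected by the pattern still advance i in the loop above, which is one plausible source of zeros left in the middle of the output arrays.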
You can split up each line using String.split().
To skip the headers, you can either read the first N lines and discard them (if you know how many there are) or you will need to look for a specific marker - difficult to advise without seeing your data.
You may also need to change your approach a little because you currently seem to be sizing the arrays according to the total number of lines (assuming your Scanner returns lines?) rather than omitting the count of header lines.
I'd deal with the problem of the headers by simply attempting to parse every line as four numbers, and throwing away any lines where the parsing doesn't work. If there is a possibility of unparseable lines after the header lines, then you can set a flag the first time you get a "good" line, and then report any subsequent "bad" lines.
Split the lines with String.split(...). It is not the absolute fastest way to do it, but the CPU time of your program will be spent elsewhere ... so it probably doesn't matter.
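A minimal sketch of that idea, under a few assumptions: every data row has at least four comma-separated fields, Double.parseDouble is an acceptable test for "is a number" (it also accepts forms such as 1.66667E-05), and rows that fail are simply skipped. The class and method names (ColumnLoader, load) are invented for the example. The arrays start small and are doubled when full, so no separate counting pass over the file is needed.

```java
import java.util.Arrays;
import java.util.Iterator;

class ColumnLoader {
    // Returns four arrays (one per column), trimmed to the number of rows that parsed.
    static double[][] load(Iterator<String> lines) {
        double[][] cols = new double[4][1024];
        int rows = 0;
        while (lines.hasNext()) {
            String[] parts = lines.next().split(",");
            if (parts.length < 4) {
                continue; // header text without enough fields
            }
            double[] values = new double[4];
            try {
                for (int c = 0; c < 4; c++) {
                    values[c] = Double.parseDouble(parts[c].trim());
                }
            } catch (NumberFormatException e) {
                continue; // e.g. "min,CH2,CH4,CH41," or a row with an empty first field
            }
            if (rows == cols[0].length) {
                // Arrays are full: double their capacity.
                for (int c = 0; c < 4; c++) {
                    cols[c] = Arrays.copyOf(cols[c], rows * 2);
                }
            }
            for (int c = 0; c < 4; c++) {
                cols[c][rows] = values[c];
            }
            rows++;
        }
        for (int c = 0; c < 4; c++) {
            cols[c] = Arrays.copyOf(cols[c], rows); // drop unused capacity
        }
        return cols;
    }

    public static void main(String[] args) {
        // Stand-in for a file; with a BufferedReader br, pass br.lines().iterator() instead.
        Iterator<String> lines = Arrays.asList(
                "text and numbers on first line",
                "min,CH2,CH4,CH41,",
                ",3087747,3087747,3087747,",
                "0,-0.0518799,17.0624,0,",
                "1.66667E-05,-0.0509644,17.0624,-0.00288295,").iterator();
        double[][] cols = load(lines);
        System.out.println(cols[0].length + " data rows loaded");
    }
}
```

To report "bad" lines after the header instead of silently skipping them, a boolean flag set on the first good row could be checked in the two continue branches.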
How would you write this code?
This particular question is about a maze game that has an ArrayList of occupants, which are Explorers (you), Monsters (touching one will kill you), and Treasures. The game uses a grid of Square objects in which these occupants reside. What I want to add is file handling that can export the current configuration of the maze to a txt file or import it from one.
The specs:
First read in the rows and cols of the Maze to create a Square[][] of the appropriate size. Then construct and read in all the Squares/Occupants.
For Squares, the Maze will first determine that the line starts with "Square". It will then read in the row and col of the Square and use that information to construct a Square object. Finally it will pass the rest of the Scanner to the Square's toObject method so it can initialize itself.
For all other Occupants, the Maze will determine what kind of Occupant it is and construct the appropriate object using the constructor that only takes a Maze. It will not read the row or the col from the Scanner, but simply pass the Scanner on to the toObject method of the newly created object.
This is code that I have so far which could be wrong:
public void readMazeFromFile(String fileName) throws IOException, FileNotFoundException, MazeReadException
{
Scanner fileSc = new Scanner(new File(fileName));
String line = fileSc.nextLine(); //whats on the line, will be overwritten
Scanner lineSc = new Scanner(line);
String temp;
lineSc.useDelimiter(",");
int lineNum = 1; //every time you scan a line out, do lineNum++
int r1, r2, r3, r4, c1, c2, c3, c4;
rows = fileSc.nextInt();
cols = fileSc.nextInt();
Square hi = new Square(rows, cols);
line = fileSc.nextLine();
while ( line != null)
{
line = lineSc.nextLine();
lineSc = new Scanner(line);
if( lineSc.equals("Square"))
{
r1 = lineSc.nextInt();
c1 = lineSc.nextInt();
hi.toObject(lineSc);
}
if (lineSc.equals("Explorer"))
{
explorer.toObject(lineSc);
}
if (lineSc.equals("Treasure"))
{
Treasure.toObject(lineSc);
}
lineNum++;
}
Here is sample output:
5,5
Square,0,0,true,false,false,true,true,true
Square,0,1,true,false,true,false,true,true
Square,0,2,true,false,true,false,false,false
Square,0,3,true,false,false,false,false,false
Square,0,4,true,true,false,false,false,false
Square,1,0,false,false,true,true,true,true
Square,1,1,true,false,true,false,false,false
Square,1,2,true,true,false,false,false,false
Square,1,3,false,true,false,true,false,false
Square,1,4,false,true,false,true,false,false
Square,2,0,true,false,false,true,false,false
Square,2,1,true,false,true,false,false,false
Square,2,2,false,true,false,false,false,false
Square,2,3,false,true,false,true,false,false
Square,2,4,false,true,false,true,false,false
Square,3,0,false,true,false,true,false,false
Square,3,1,true,false,false,true,false,false
Square,3,2,false,true,false,false,false,false
Square,3,3,false,true,true,true,false,false
Square,3,4,false,true,false,true,false,false
Square,4,0,false,true,true,true,false,false
Square,4,1,false,true,true,true,false,false
Square,4,2,false,false,true,true,false,false
Square,4,3,true,false,true,false,false,false
Square,4,4,false,true,true,false,false,false
Explorer,0,0,Scary Name
Treasure,4,4,true
Treasure,2,2,false
Monster,4,4
Monster,3,3
What would you write for this section?
This is a skeleton that you can start with. You really don't need anything more than a Scanner here; it can do everything that you'd need to scan the input and do the conversion, etc. You want to use its next(), nextInt(), nextBoolean() and nextLine() methods.
Scanner in = new Scanner(new File(filename));
in.useDelimiter("\\s+|,");
int rows = in.nextInt();
int cols = in.nextInt();
// construct the array
while (in.hasNext()) {
String type = in.next();
int r = in.nextInt();
int c = in.nextInt();
// read more depending on type
}
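One possible way to flesh the skeleton out is to read the file line by line and give each line its own comma-delimited Scanner, which keeps a value such as "Scary Name" in one piece. The sketch below is only an outline: the real Square, Explorer, Treasure and Monster classes are not shown in the question, so a boolean grid and a list of type names stand in for them, and the class name MazeFileSketch is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class MazeFileSketch {
    int rows;
    int cols;
    boolean[][] squareSeen;                      // stand-in for the Square[][] grid
    List<String> occupants = new ArrayList<>();  // stand-in for the occupant objects

    static MazeFileSketch read(Scanner file) {
        MazeFileSketch maze = new MazeFileSketch();
        // First line: "rows,cols"
        Scanner header = new Scanner(file.nextLine()).useDelimiter(",");
        maze.rows = header.nextInt();
        maze.cols = header.nextInt();
        maze.squareSeen = new boolean[maze.rows][maze.cols];

        while (file.hasNextLine()) {
            String line = file.nextLine();
            if (line.isEmpty()) {
                continue;
            }
            Scanner lineSc = new Scanner(line).useDelimiter(",");
            String type = lineSc.next();
            if (type.equals("Square")) {
                int r = lineSc.nextInt();
                int c = lineSc.nextInt();
                // Real code would construct the Square for (r, c) here
                // and pass lineSc on to its toObject method.
                maze.squareSeen[r][c] = true;
            } else {
                // Real code would construct the occupant matching 'type'
                // and pass lineSc on to its toObject method.
                maze.occupants.add(type);
            }
        }
        return maze;
    }

    public static void main(String[] args) {
        // A String stands in for the file; use new Scanner(new File(fileName)) for real input.
        String text = "1,2\n"
                + "Square,0,0,true,false,false,true,true,true\n"
                + "Square,0,1,true,false,true,false,true,true\n"
                + "Explorer,0,0,Scary Name\n"
                + "Monster,0,1\n";
        MazeFileSketch maze = read(new Scanner(text));
        System.out.println(maze.rows + "x" + maze.cols + " maze, occupants: " + maze.occupants);
    }
}
```

Strings are compared with equals on the token that was read, not on the Scanner object itself.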
Use a FileReader on the file and then attach a StreamTokenizer to that, far more efficient :)
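For reference, a small sketch of what the StreamTokenizer route might look like. It treats commas as whitespace and labels each token as a number or a word. The class name and the NUM:/WORD: labels are invented for the example, a StringReader stands in for the FileReader, and the efficiency claim is not measured here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TokenizerSketch {
    static List<String> tokens(Reader source) {
        StreamTokenizer st = new StreamTokenizer(source);
        st.whitespaceChars(',', ','); // treat commas like whitespace between tokens
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add("NUM:" + (int) st.nval);  // numbers arrive as double in nval
                } else if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add("WORD:" + st.sval);       // words arrive in sval
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // Replace the StringReader with new FileReader(filename) to read an actual file.
        System.out.println(tokens(new StringReader("5,5\nSquare,0,1,true\n")));
    }
}
```

Note that the tokenizer splits on ordinary whitespace too, so a name containing a space arrives as two separate words, and values like true and false come back as words rather than booleans.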