Java Buffered Reader Text File Parsing - java

I am really struggling with parsing a text file. I have a text file which is in the following format
ID
Float Float
Float Float
.... // variable number of floats
END
ID
Float Float
Float Float
....
END
etc. However, the ID can take one of two values: 0, which means it is a new field, or -1, which means it is related to the last new field. A new field can be followed by any number of related fields, which is where the problem occurs.
I have a method in a library which takes an ArrayList of the new floats, then an ArrayList of ArrayLists of the related floats.
When I try to code the logic for this, I just keep ending up with more and more deeply nested while loops.
I would really appreciate any suggestions as to how I should go about this. Thanks in advance.
Here is the code I have so far.
BufferedReader br = new BufferedReader(new FileReader(buildingsFile));
String[] line = br.readLine().trim().split(" ");
boolean original = true;
while(true)
{
if(line[0].equals("END"))
break;
startCoordinate = new Coordinate(Double.parseDouble(line[0]), Double.parseDouble(line[1]));
while(true)
{
line = br.readLine().trim().split(" ");
if(!line[0].equals("END") && original == true)
polypoints.add(new Coordinate(Double.parseDouble(line[0]), Double.parseDouble(line[1])));
else if(!line[0].equals("END") && original == false)
cutout.add(new Coordinate(Double.parseDouble(line[0]), Double.parseDouble(line[1])));
else if(line[0].equals("END") && original == false)
{
cutouts.add(cutout);
cutout.clear();
}
else if(line[0].equals("-99999"))
original = false;
else if(line[0].equals("0"))
break;
}
buildingDB.addBuilding(mapName, startCoord, polypoints, cutouts);
}
New Code
int i = 0;
BufferedReader br = new BufferedReader(new FileReader(buildingsFile));
String[] line;
while(true)
{
line = br.readLine().trim().split(" ");
if(line[0].equals("END"))
break;
polygons.add(new Polygon(line));
while(true)
{
line = br.readLine().trim().split(" ");
if(line[0].equals("END"))
break;
polygons.get(i).addCoord(new Coordinate(Double.parseDouble(line[0]), Double.parseDouble(line[1])));
}
i++;
}
System.out.println(polygons.size());
int j = 0;
for(i = 0; i< polygons.size(); i++)
{
Building newBuilding = new Building();
if(polygons.get(i).isNew == true)
{
newBuilding = new Building();
newBuilding.startCoord = new Coordinate(polygons.get(i).x, polygons.get(i).y);
}
while(polygons.get(i).isNew == false)
newBuilding.cutouts.add(polygons.get(i).coords);
buildings.add(newBuilding);
}
for(i = 0; i<buildings.size(); i++)
{
System.out.println(i);
buildingDB.addBuilding(mapName, buildings.get(i).startCoord, buildings.get(i).polypoint, buildings.get(i).cutouts);
}

Maybe you could use a map for the new floats and the related floats. If I understood your question correctly, it should help. For example:
HashMap<String, Double> hm = new HashMap<>();
hm.put("Rohit", 3434.34);

I assume that a "field" means an ID and a variable number of coordinates (pairs of floats), that, judging from your code, represents a polygon in fact.
I would first load all the polygons, each into a separate Polygon object:
class Polygon {
boolean isNew;
List<Coordinate> coordinates;
}
and store the polygons in another list. Then in a 2nd pass go through all the polygons to group them according to their IDs into something like
class Building {
Polygon polygon;
List<Polygon> cutouts;
}
I think this would be fairly simple to code.
OTOH if you have a huge amount of data in the file, and/or you prefer processing the read data little by little, you could simply read a polygon and all its associated cutouts, until you find the next polygon (ID of 0), at which point you could simply pass the stuff read so far to the building DB and start reading the next polygon.
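The streaming variant described above can be sketched as follows. This is only a sketch under stated assumptions: the names BuildingParser and Building are hypothetical, coordinates are kept as plain double[] pairs instead of the question's Coordinate class, and any ID other than "0" is treated as a related block, since the question mentions -1 but its code checks for -99999.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class BuildingParser {

    /** One building: the outline read after an ID of 0, plus any cutouts that follow it. */
    static final class Building {
        final List<double[]> outline = new ArrayList<>();
        final List<List<double[]>> cutouts = new ArrayList<>();
    }

    /** Reads "ID, coordinate lines, END" blocks until the input is exhausted. */
    static List<Building> parse(BufferedReader in) throws IOException {
        List<Building> buildings = new ArrayList<>();
        Building current = null;
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // tolerate blank lines between blocks
            }
            // Assumption: "0" marks a new field, any other ID marks a related one.
            boolean isNew = line.equals("0");
            List<double[]> points = readPoints(in);
            if (isNew) {
                current = new Building();
                current.outline.addAll(points);
                buildings.add(current);
            } else if (current != null) {
                current.cutouts.add(points); // each cutout keeps its own list
            } else {
                throw new IOException("Related block found before any new block");
            }
        }
        return buildings;
    }

    /** Reads coordinate pairs up to and including the next END line. */
    private static List<double[]> readPoints(BufferedReader in) throws IOException {
        List<double[]> points = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null && !line.trim().equals("END")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines inside a block
            }
            String[] parts = line.trim().split("\\s+");
            points.add(new double[] {Double.parseDouble(parts[0]), Double.parseDouble(parts[1])});
        }
        return points;
    }
}
```

Splitting the work into "read one ID" and "read points until END" keeps the loops flat: there is one outer loop over blocks and one helper loop over coordinates. Each cutout also gets a fresh list, which avoids the problem in the question's first attempt, where the same list is added to cutouts and then cleared.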

You can try using ANTLR here. The grammar defines the format of the text you are expecting, and you can then wrap the parsed contents in a Java object. The * and + operators in the grammar take care of the repetition that you are currently handling with nested while and for loops. It is simple to use: you don't have to construct an AST, since you can take the parsed content from Java objects directly. The only overhead is that you have to add the ANTLR jar to your classpath.

Related

Java: Most efficient way to loop through CSV and sum values of one column for each unique value in another Column

I have a CSV file with 500,000 rows of data and 22 columns. This data represents all commercial flights in the USA for one year. I have been tasked with finding the tail number of the plane that flew the most miles in the data set. Column 5 contains the airplane's tail number for each flight. Column 22 contains the total distance traveled.
Please see my extractQ3 method below. First, I created a HashMap for the whole CSV using the createHashMap() method. Then, I ran a for loop to identify every unique tail number in the dataset and stored them in a list called tailNumbers. Then, for each unique tail number, I looped through the entire HashMap to calculate the total miles of distance for that tail number.
The code runs fine on smaller datasets, but once the size increased to 500,000 rows the code became horribly inefficient and takes an eternity to run. Can anyone provide me with a faster way to do this?
public class FlightData {
HashMap<String,String[]> dataMap;
public static void main(String[] args) {
FlightData map1 = new FlightData();
map1.dataMap = map1.createHashMap();
String answer = map1.extractQ3(map1);
}
public String extractQ3(FlightData map1) {
ArrayList<String> tailNumbers = new ArrayList<String>();
ArrayList<Integer> tailMiles = new ArrayList<Integer>();
//Filling the Array with all tail numbers
for (String[] value : map1.dataMap.values()) {
if(Arrays.asList(tailNumbers).contains(value[4])) {
} else {
tailNumbers.add(value[4]);
}
}
for (int i = 0; i < tailNumbers.size(); i++) {
String tempName = tailNumbers.get(i);
int miles = 0;
for (String[] value : map1.dataMap.values()) {
if(value[4].contentEquals(tempName) && value[19].contentEquals("0")) {
miles = miles + Integer.parseInt(value[21]);
}
}
tailMiles.add(miles);
}
Integer maxVal = Collections.max(tailMiles);
Integer maxIdx = tailMiles.indexOf(maxVal);
String maxPlane = tailNumbers.get(maxIdx);
return maxPlane;
}
public HashMap<String,String[]> createHashMap() {
File flightFile = new File("flights_small.csv");
HashMap<String,String[]> flightsMap = new HashMap<String,String[]>();
try {
Scanner s = new Scanner(flightFile);
while (s.hasNextLine()) {
String info = s.nextLine();
String [] piecesOfInfo = info.split(",");
String flightKey = piecesOfInfo[4] + "_" + piecesOfInfo[2] + "_" + piecesOfInfo[11]; //Setting the Key
String[] values = Arrays.copyOfRange(piecesOfInfo, 0, piecesOfInfo.length);
flightsMap.put(flightKey, values);
}
s.close();
}
catch (FileNotFoundException e)
{
System.out.println("Cannot open: " + flightFile);
}
return flightsMap;
}
}
The answer depends on what you mean by "most efficient", "horribly inefficient" and "takes an eternity". These are subjective terms. The answer may also depend on specific technical factors (speed vs. memory consumption; the number of unique flight keys compared to the number of overall records; etc.).
I would recommend applying some basic streamlining to your code, to start with. See if that gets you a better (acceptable) result. If you need more, then you can consider more advanced improvements.
Whatever you do, take some timings to understand the broad impacts of any changes you make.
Focus on going from "horrible" to "acceptable" - and then worry about more advanced tuning after that (if you still need it).
Consider using a BufferedReader instead of a Scanner, although the Scanner may be just fine for your needs (i.e. if it's not a bottleneck).
Consider using logic within your scanner loop to capture tail numbers and accumulated mileage in one pass of the data. The following is deliberately basic, for clarity and simplicity:
// The string is a tail number.
// The integer holds the accumulated miles flown for that tail number:
Map<String, Integer> planeMileages = new HashMap<>();
if (planeMileages.containsKey(tailNumber)) {
// add miles to existing total:
int accumulatedMileage = planeMileages.get(tailNumber) + flightMileage;
planeMileages.put(tailNumber, accumulatedMileage);
} else {
// capture new tail number:
planeMileages.put(tailNumber, flightMileage);
}
After that, once you have completed the scanner loop, you can iterate over your planeMileages to find the largest mileage:
String maxMilesTailNumber = null;
int maxMiles = 0;
for (Map.Entry<String, Integer> entry : planeMileages.entrySet()) {
int planeMiles = entry.getValue();
if (planeMiles > maxMiles) {
maxMilesTailNumber = entry.getKey();
maxMiles = planeMiles;
}
}
WARNING - This approach is just for illustration. It will only capture one tail number. There could be multiple planes with the same maximum mileage. You would have to adjust your logic to capture multiple "winners".
The above approach removes the need for several of your existing data structures, and related processing.
If you still face problems, put in some timers to see which specific areas of your code are slowest - and then you will have more specific tuning opportunities you can focus on.
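Putting the pieces above together, a single pass with a BufferedReader might look like the sketch below. The class and method names are hypothetical, the column indexes are passed in as parameters (4 and 21 in the question's data), and the question's extra filter on column 20 is left out for brevity. Map.merge replaces the containsKey/put pair, and the second method returns every tail number that shares the maximum, which addresses the warning about ties.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MileageTotals {

    /** Sums the distance column per tail number in a single pass over the CSV rows. */
    static Map<String, Long> totalMiles(BufferedReader in, int tailColumn, int milesColumn)
            throws IOException {
        Map<String, Long> totals = new HashMap<>();
        String line;
        while ((line = in.readLine()) != null) {
            String[] fields = line.split(",");
            if (fields.length <= Math.max(tailColumn, milesColumn)) {
                continue; // skip short or malformed rows
            }
            long miles;
            try {
                miles = Long.parseLong(fields[milesColumn].trim());
            } catch (NumberFormatException e) {
                continue; // skip a header row or a non-numeric distance
            }
            // Adds to the existing total, or starts a new one for an unseen tail number.
            totals.merge(fields[tailColumn].trim(), miles, Long::sum);
        }
        return totals;
    }

    /** Returns every tail number that shares the highest total, so ties are not lost. */
    static List<String> topTailNumbers(Map<String, Long> totals) {
        List<String> winners = new ArrayList<>();
        long best = Long.MIN_VALUE;
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                winners.clear();
                winners.add(e.getKey());
            } else if (e.getValue() == best) {
                winners.add(e.getKey());
            }
        }
        return winners;
    }
}
```

With this shape the file is read once and only one map entry per plane is kept in memory, instead of one array per flight.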
I suggest you use the java 8 Stream API, so that you can take advantage of Parallel streams.
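As a rough illustration of the Stream approach, the sketch below groups rows by tail number and sums the distance column. It assumes the rows are already split into String arrays; the class and method names are hypothetical. Switching rows.stream() to rows.parallelStream() should work with the same collector, but whether it is actually faster for this data would need to be measured.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

final class MileageStreams {

    /** Groups rows by tail number and sums the distance column for each group. */
    static Map<String, Long> totalMiles(List<String[]> rows, int tailColumn, int milesColumn) {
        return rows.stream()
                .collect(Collectors.groupingBy(
                        row -> row[tailColumn],
                        Collectors.summingLong(row -> Long.parseLong(row[milesColumn]))));
    }

    /** Picks one tail number with the largest total; empty if there are no rows. */
    static Optional<String> topTailNumber(Map<String, Long> totals) {
        return totals.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }
}
```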

Java read text file with maze and get all possible paths [closed]

EDIT: I have tried to store the lines character by character in a 2D array.
However, the problem is to get all possible paths from 0 to 1 through a maze that is stored in a text file. The asterisks are the walls or obstacles.
Maze looks like this
8,8
********
*0 *
* *
* ** *
* ** *
* *
* 1*
********
I'm not sure whether it's feasible to put it into a two-dimensional String array and then use recursion or dynamic programming afterwards.
Note that the only movements allowed are right and down. Also, the 0 (start) could be in the 2nd, 3rd or any other column, and the same goes for the 1 (destination).
Any tips or suggestions will be appreciated, thank you in advance!
Yep, this is fairly easy to do:
1. Read the first line of the text file and parse out the dimensions.
2. Create an array of length n.
3. For every (blank) item in the array:
   - Create a new length-n array as the data.
   - Parse the next line of the text file as individual characters into the array.
After this, you'll have your n x n data structure to complete your game with.
Using a Map to store this file seems like a good idea, and I don't think reading the file character by character would be an issue:
BufferedReader br = new BufferedReader(new FileReader(file));
String line = br.readLine();
You have specified the grid dimensions, say (n x n).
A simple way to do it is to generate a unique key for every coordinate, with a helper method that builds the keys stored in the map:
public String parseCoordinate(int x, int y){
    // The separator keeps keys unique: without it, (1, 11) and (11, 1) would both become "111"
    return x + "," + y;
}
Map<String, Boolean> gridMap = new HashMap<>();
So when you read the file character by character, you can put the parsed coordinates into the map as keys, with a value that is true for a wall:
gridMap.put(parseCoordinate(lineCount, characterCount), line.charAt(characterCount) == '*');
I'm assuming the only problem you are facing is deciding how to read the file correctly, before applying an algorithm to determine the number of unique paths in the given maze.
private static int[][] getMatrixFromFile(File f) throws IOException {
    // Read the input file as a list of lines
    List<String> lines = Files.readAllLines(f.toPath());
    // Get the dimensions of the maze from the first line, e.g. "8,8"
    String[] dimensions = lines.get(0).trim().split(",");
    // Initialize a sub-matrix of just the maze interior, ignoring the walls
    int[][] mat = new int[Integer.parseInt(dimensions[0]) - 2][Integer.parseInt(dimensions[1]) - 2];
    // For each line in the maze excluding the boundaries, encode '*' as 0 and anything else as 1
    for (int i = 2; i < lines.size() - 1; i++) {
        String currLine = lines.get(i);
        // Skip the first and last characters, which are the boundary walls
        for (int j = 1; j < currLine.length() - 1; j++) {
            mat[i - 2][j - 1] = (currLine.charAt(j) == '*') ? 0 : 1;
        }
    }
    return mat;
}
With this in place you can now focus on the algorithm for actually traversing the matrix to determine the number of unique paths from top-left to bottom-right.
Having said that, once you have the above matrix you are not limited to traversing from top-left to bottom-right; any arbitrary point in your maze can serve as the start or end point.
If you require help with figuring out the number of unique paths, I can edit to include that bit, but dynamic programming should help in getting the same.
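For the counting step, a dynamic-programming sketch could look like the following. It assumes a non-empty rectangular grid encoded as above (1 = open, 0 = wall), moves restricted to right and down as stated in the question, and start/end positions passed in by the caller; the class and method names are hypothetical. The number of ways to reach a cell is the sum of the ways to reach the cell above it and the cell to its left.

```java
final class MazePaths {

    /**
     * Counts the paths from (startRow, startCol) to (endRow, endCol) that move only
     * right or down through open cells (1 = open, 0 = wall).
     */
    static long countPaths(int[][] grid, int startRow, int startCol, int endRow, int endCol) {
        long[][] ways = new long[grid.length][grid[0].length];
        for (int r = startRow; r <= endRow; r++) {
            for (int c = startCol; c <= endCol; c++) {
                if (grid[r][c] == 0) {
                    continue; // walls contribute no paths
                }
                if (r == startRow && c == startCol) {
                    ways[r][c] = 1; // exactly one way to stand on the start cell
                    continue;
                }
                long fromAbove = (r > startRow) ? ways[r - 1][c] : 0;
                long fromLeft = (c > startCol) ? ways[r][c - 1] : 0;
                ways[r][c] = fromAbove + fromLeft;
            }
        }
        return ways[endRow][endCol];
    }
}
```

Only cells between the start and the end are visited, because a path that moves right and down can never leave that rectangle. Listing the paths themselves, rather than counting them, would need a recursive walk instead.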
private char[][] maze;
private void read() {
    final InputStream inputStream = YourClass.class.getResourceAsStream(INPUT_PATH);
    // try-with-resources closes the reader even if parsing fails
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream))) {
        final String header = reader.readLine();
        if (header == null) {
            throw new RuntimeException("Empty input"); // Use a dedicated exception
        }
        final String[] tokens = header.split(",");
        if (tokens.length < 2) {
            throw new RuntimeException("Invalid header"); // Use a dedicated exception
        }
        // Assumption: the header is "width,height", so dimY is the number of rows
        final int dimX = Integer.parseInt(tokens[0].trim());
        final int dimY = Integer.parseInt(tokens[1].trim());
        maze = new char[dimY][dimX];
        for (int i = 0; i < dimY; i++) {
            final String line = reader.readLine();
            if (line == null) {
                throw new RuntimeException("Maze has fewer rows than declared");
            }
            maze[i] = line.toCharArray();
        }
    } catch (final IOException e) {
        // handle exception
    }
}
Now, some assumptions: I assumed the first line contains the declaration of the maze size, so it is used to initialize the two-dimensional array. The other assumption is that you can make use of a char array, but that's pretty easy to change if you want.
From here you can start working on your path-finding algorithm.
By the way, what you're trying to implement reminds me a lot of a challenge in the Advent of Code series. There are a lot of people discussing their solutions to the challenge; just have a look on Reddit, for instance, and you'll find plenty of tips on how to go on with your little experiment.
Have fun!

checking if my array elements meet requirements

I need to create a method which checks each element in my array against a condition and reports true or false. Each element holds several values such as mass, formula, area etc. for one compound, and in total there are 30 compounds (so the array has 30 elements). I need an algorithm that asks, for each compound, whether mass < 50 and area > 5 is true.
My properties class looks like:
public void addProperty (Properties pro )
{
if (listSize >=listlength)
{
listlength = 2 * listlength;
TheProperties [] newList = new TheProperties [listlength];
System.arraycopy (proList, 0, newList, 0, proList.length);
proList = newList;
}
//add new property object in the next position
proList[listSize] = pro;
listSize++;
}
public int getSize()
{
return listSize;
}
//returns properties at a particular position in list numbered from 0
public TheProperties getProperties (int pos)
{
return proList[pos];
}
}
and after using my getters/setters from TheProperties I put all the information in the array using the following;
TheProperties tp = new TheProperties();
String i = tp.getMass();
String y = tp.getArea();
//etc
theList.addProperty(tp);
I then used the following to save an output of the file;
StringBuilder builder = new StringBuilder();
for (int i=0; i<theList.getSize(); i++)
{
if(theList.getProperties(i).getFormatted() != null)
{
builder.append(theList.getProperties(i).getFormatted());
builder.append("\n");
}
}
SaveFile sf = new SaveFile(this, builder.toString());
I just can't work out how to interrogate each compound individually to see whether it meets the requirements. Reading a file in and saving a value for each compound has worked, and I can write an if statement for the requirements, but how do I actually check that the elements for each compound match them? I am trying to word this as best I can; I am still working on my fairly poor Java skills.
I'm not entirely sure what you are after, as I found your description quite hard to understand, but if you want to see whether the mass is less than 50 and the area is greater than 5, a simple if statement like this will do:
if (tp.getMass() < 50 && tp.getArea() > 5) {}
You will, however, have to instantiate tp and ensure its attributes have been set, for example through a constructor.
Lots of ways to do this, which makes it hard to answer.
You could check at creation time, and just not even add the invalid ones to the list. That would mean you only have to loop once.
If you just want to save the output to the file, and not do anything else, I suggest you combine the reading and writing into one function.
Open up the read and the write file
while(read from file){
check value is ok
write to file
}
close both files
The advantages of doing it this way are:
You only loop through once, not three times, so it is faster
You never have to store the whole list in memory, so you can handle really large files, with thousands of elements.
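The read-check-write loop above might be sketched like this. The file layout is an assumption, since the question does not show it: each line is taken to be "name,mass,area", and the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;

final class CompoundFilter {

    /** Copies to 'out' only the lines whose mass is below 50 and whose area is above 5. */
    static int copyMatching(BufferedReader in, BufferedWriter out) throws IOException {
        int written = 0;
        String line;
        while ((line = in.readLine()) != null) {
            String[] fields = line.split(",");
            if (fields.length < 3) {
                continue; // skip lines that do not have name, mass and area
            }
            double mass;
            double area;
            try {
                mass = Double.parseDouble(fields[1].trim());
                area = Double.parseDouble(fields[2].trim());
            } catch (NumberFormatException e) {
                continue; // skip lines with non-numeric values
            }
            if (mass < 50 && area > 5) {
                out.write(line);
                out.newLine();
                written++;
            }
        }
        out.flush();
        return written; // number of compounds that met the requirements
    }
}
```

The caller would open both files, ideally in a try-with-resources block, and pass the reader and writer in, so that both are closed once the loop finishes.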
In case the requirements change, you can write a method that uses Predicate<T>, a functional interface designed for such cases (java.util.function.Predicate was introduced in Java 8):
// check each element of the list against a custom condition (predicate)
public static void checkProperties(TheList list, Predicate<TheProperties> criteria) {
    for (int i = 0; i < list.getSize(); i++) {
        TheProperties tp = list.getProperties(i);
        if (!criteria.test(tp)) {
            throw new IllegalArgumentException(
                "TheProperty at index " + i + " does not meet the specified criteria");
        }
    }
}
If you want to check if mass < 50 and area > 5, you would write:
checkProperties(theList, new Predicate<TheProperties>() {
    @Override
    public boolean test(TheProperties tp) {
        return tp.getMass() < 50 && tp.getArea() > 5;
    }
});
This can be shortened by using a lambda expression:
checkProperties(theList, (TheProperties tp) -> {
    return tp.getMass() < 50 && tp.getArea() > 5;
});

Does libsvm work for multi output regression?

I have been trying to use jlibSVM.
I want to use it for multi-output regression.
For example:
the feature set / inputs will be x1, x2, x3
and the outputs / target values will be y1, y2.
Is this possible using the libSVM library?
The API docs are not clear and there is no example app showing the use of jlibsvm, so I tried to modify the code inside lexecyexec/svm_train.java.
The author originally wrote the app to use only one output/target value.
This can be seen in the part where the author reads the training file:
private void read_problem() throws IOException
{
BufferedReader fp = new BufferedReader(new FileReader(input_file_name));
Vector<Float> vy = new Vector<Float>();
Vector<SparseVector> vx = new Vector<SparseVector>();
int max_index = 0;
while (true)
{
String line = fp.readLine();
if (line == null)
{
break;
}
StringTokenizer st = new StringTokenizer(line, " \t\n\r\f:");
vy.addElement(Float.parseFloat(st.nextToken()));
int m = st.countTokens() / 2;
SparseVector x = new SparseVector(m);
for (int j = 0; j < m; j++)
{
//x[j] = new svm_node();
x.indexes[j] = Integer.parseInt(st.nextToken());
x.values[j] = Float.parseFloat(st.nextToken());
}
if (m > 0)
{
max_index = Math.max(max_index, x.indexes[m - 1]);
}
vx.addElement(x);
}
I tried to modify it so that the vector vy accepts a sparse vector with 2 values.
The program runs, but the model file seems to be wrong.
Can anyone confirm whether they have used jlibsvm for multiple-output SVM regression?
If yes, can someone please explain how they achieved this?
If not, does someone know of a similar SVM implementation in Java?
The classic SVM algorithm does not support multi-dimensional outputs. One way to work around this would be to train a separate SVM model for each output dimension.
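The "one model per output" workaround can be sketched independently of any particular SVM library. In the sketch below, everything is hypothetical: the trainer parameter is a hook where a real program would call its single-output SVM training code, and meanTrainer is only a stand-in so the example runs on its own. No jlibsvm calls are shown, because its API is not described here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.ToDoubleFunction;

final class PerOutputRegression {

    /**
     * Trains one single-output model per target column.
     * 'trainer' is a hypothetical hook: it receives the inputs and one target column
     * and returns a predictor; a real program would call its SVM library there.
     */
    static List<ToDoubleFunction<double[]>> trainAll(
            double[][] inputs,
            double[][] targets,
            BiFunction<double[][], double[], ToDoubleFunction<double[]>> trainer) {
        int outputs = targets[0].length;
        List<ToDoubleFunction<double[]>> models = new ArrayList<>();
        for (int k = 0; k < outputs; k++) {
            double[] column = new double[targets.length];
            for (int i = 0; i < targets.length; i++) {
                column[i] = targets[i][k]; // extract the k-th output for every sample
            }
            models.add(trainer.apply(inputs, column));
        }
        return models;
    }

    /** Predicts every output by asking each per-output model in turn. */
    static double[] predict(List<ToDoubleFunction<double[]>> models, double[] x) {
        double[] y = new double[models.size()];
        for (int k = 0; k < y.length; k++) {
            y[k] = models.get(k).applyAsDouble(x);
        }
        return y;
    }

    /** Stand-in trainer for demonstration only: always predicts the mean of the column. */
    static ToDoubleFunction<double[]> meanTrainer(double[][] inputs, double[] column) {
        double sum = 0;
        for (double v : column) {
            sum += v;
        }
        double mean = sum / column.length;
        return x -> mean;
    }
}
```

With inputs x1, x2, x3 and targets y1, y2, this trains two models on the same inputs. In practice it would mean writing one training file per output, each with a single target value per line, and running the unmodified single-output trainer on each file.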

Java: Issue Reading Text file then Converting

I've got an issue getting a method to read a file and then convert its contents to integers. Here is a brief explanation of the program. It is essentially a car dealership inventory that keeps track of the vehicles in the lot by keeping them written down in a text file. When the program starts it will need to read the file and put all the current cars into an array so they can be displayed. Then the rest of the program will do other things like remove cars and add new ones etc. The part I am at is when the program first starts: it needs to read the file, but I can't seem to get it to work.
The text file consists of 6 lines per car: 4 numbers first, then 2 words. I want the method to read the first four lines, convert those into integers and store them in a temporary array. After that it will read the next two lines and store those in a temporary array as well. Afterwards I take all these stored values and send them to a constructor. The constructed object is then stored in an ArrayList, and the ArrayList can be accessed at any time. In the output it does all of this just fine, but it wants to run through the loop a second time despite the barriers in place to prevent this.
Here is the code. It's a class and not the main program. I will try to explain the program as best I can inside the code.
public class Vehicle {
//All the different private variables for the constructors and methods
private int intholder[], year, type, kilometres, price, loop;
private String make, model, myline, holder[];
//The Arraylist that the different vehicle objects will be stored
ArrayList<Vehicle> allCars = new ArrayList<Vehicle>();
//The Default constructor
public Vehicle(){
make = "Vehicle Make";
model = "Vehicle Model";
type = 0;
year = 0;
kilometres = 0;
price = 0;
}
//The constructor that has information sent to it
public Vehicle(int _type, int _year, int _kilometres, int _price, String _make, String _model){
make = _make;
model = _model;
type = _type;
year = _year;
kilometres = _kilometres;
price = _price;
}
//Text file information
/*
* CAR TYPE CODE:
* 1 - Sedan
* 2 - Truck
* 3 - Crossover
* 4 - SUV
* 5 - Sports
*
* There is a total of 6 lines for each car and are as follows
* 1 - int Type integer
* 2 - int Year
* 3 - int Kilometres
* 4 - int Asking price
* 5 - String Make
* 6 - String Model
*/
//The method in question. It reads through the file, converts the integers and stores them,
//stores the strings, and sends all the information to the constructor
public void readCars()throws IOException{
BufferedReader readFile = new BufferedReader(new FileReader("C:/Users/David/Desktop/FinalProject/Carlot.txt"));
//Setting the length of the temporary arrays
holder = new String[2];
intholder = new int[4];
//The main loop in the method.
do{
//Read the first 4 lines of the file and convert them to integers.
//The try catch shouldn't have to be there because the first 4 lines
//of the file are all numbers, but I put it in there to see when it was messing up.
for(int i = 0; i < 4; i++){
myline = readFile.readLine();
try{
intholder[i] = Integer.parseInt(myline);
}
catch(NumberFormatException e){
System.out.println(e);
}
//Had this in here to see how many lines down the file it would go before messing up.
System.out.println(myline);
}
//Loop to store the Strings
for(int i = 0; i < 2; i++){
myline = readFile.readLine();
holder[i] = myline;
System.out.println(myline);
}
//Sends all the data to the constructor
Vehicle V = new Vehicle(intholder[0], intholder[1], intholder[2], intholder[3], holder[0], holder[1]);
//Several if statements to determine which subclass of vehicle it is.
if(intholder[0]==1){
Sedan S = new Sedan();
allCars.add(S);
}
else if(intholder[0]==2){
Truck T = new Truck();
allCars.add(T);
}
else if(intholder[0]==3){
Crossover C = new Crossover();
allCars.add(C);
}
else if(intholder[0]==4){
SUV U = new SUV();
allCars.add(U);
}
else if(intholder[0]==5){
Sports P = new Sports();
allCars.add(P);
}
//Only break the loop if the myline equals null
}while(myline != null);
//if the loop breaks, close the file
readFile.close();
}
Now I think I know where it is going wrong. At the end of the do/while, it checks whether "myline" is null. Because the last line it read from the file was still a String, the loop continues. On the final pass through the loop, every line read is null, so converting to an integer is impossible and I get errors. But I have no idea how to check for the end of the file at the end of the loop without consuming the next line. Here is what the text file looks like.
1
2007
150250
5000
Toyota
Corolla
2
2005
240400
4500
Chevorlet
Silverado
I can't have it read at the end of the loop, because if it does and there are still more cars after the one I just processed, it will already have consumed the next line when the loop restarts, and everything is thrown off.
Any help is appreciated, Thanks!
Use a labeled break statement in your for loops to simply exit out of the main do while loop when myline becomes null. The way other objects are being instantiated within the loop doesn't leave much room for easy refactoring hence the use of a labeled break makes sense here.
outerloop:
do {
for (int i = 0; i < 4; i++) {
if ((myline = readFile.readLine()) == null) break outerloop;
// ..
}
for (int i = 0; i < 2; i++) {
if ((myline = readFile.readLine()) == null) break outerloop;
// ..
}
// ..
} while (myline != null);
Maybe you could use a while loop instead of a do-while loop and read the next line from the file before anything else. Something like this:
String myline = null;
while( (myline = readFile.readLine()) != null ) {
// All your logic...
}
readFile.close();
The condition of the while loop does the following: first, it reads the next line of the file with myline = readFile.readLine(). That assignment expression evaluates to the new value of myline, so we can then check that it is not null with the comparison:
(myline = readFile.readLine()) != null
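Applied to the six-line records in the question, the idea could be sketched as below. CarLotReader and CarRecord are hypothetical names, with CarRecord standing in for the question's Vehicle class. The line read in the while condition is used as the first field of the record, so no line is skipped and the end-of-file check only happens at a record boundary.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class CarLotReader {

    /** Hypothetical holder for one six-line record; stands in for the question's Vehicle. */
    static final class CarRecord {
        final int type;
        final int year;
        final int kilometres;
        final int price;
        final String make;
        final String model;

        CarRecord(int type, int year, int kilometres, int price, String make, String model) {
            this.type = type;
            this.year = year;
            this.kilometres = kilometres;
            this.price = price;
            this.make = make;
            this.model = model;
        }
    }

    /** Reads records of six lines each until the end of the input. */
    static List<CarRecord> readAll(BufferedReader in) throws IOException {
        List<CarRecord> cars = new ArrayList<>();
        String first;
        // The end-of-file check happens only here, at a record boundary.
        while ((first = in.readLine()) != null) {
            if (first.trim().isEmpty()) {
                continue; // ignore blank lines between records
            }
            String[] rest = new String[5];
            for (int i = 0; i < rest.length; i++) {
                rest[i] = in.readLine();
                if (rest[i] == null) {
                    throw new IOException("File ended in the middle of a record");
                }
            }
            cars.add(new CarRecord(
                    Integer.parseInt(first.trim()),
                    Integer.parseInt(rest[0].trim()),
                    Integer.parseInt(rest[1].trim()),
                    Integer.parseInt(rest[2].trim()),
                    rest[3].trim(),
                    rest[4].trim()));
        }
        return cars;
    }
}
```

A truncated file is reported as an error rather than silently producing a half-filled record.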
