Reading an undelimited text file in Java via FlatPack - java

I want to read data from a text file in Java, but the text file doesn't contain a delimiter such as a space or comma between some of the fields. Someone told me that this is possible with FlatPack.
So how can I read the text, parse it as if it were delimited, and store the values?
Example of the text file data:
"Prod Name" "City" "Price" "zipcode" "Date"
samsungA London 65001402110/07/2018
samsungA California 35001202122/08/2018
samsungA Delhi 44001202112/08/2018
I want to store the values as:
Name in string
City in string
Price in int
zipcode in int
date as date
Any view on how to achieve this?

You can do this with a simple file reader. Your file is delimited by spaces; each row ends with a newline character according to your example.
As such, you just need to do a bit of arithmetic to calculate the indexes, because the price, post code and date are all packed into the third piece of each row.
public static void main(String...args) throws IOException {
final File file = new File("/home/william/test.txt");
final String delimiter = " ";
final int dateStrLen = 10;
final int postCodeLen = 6;
BufferedReader br = new BufferedReader(new FileReader(file));
String tmp;
while ((tmp = br.readLine()) != null) {
String[] values = tmp.split(delimiter);
String name = values[0];
String city = values[1];
int dateStartPos = values[2].length() - dateStrLen;
int postCodeStartPos = dateStartPos - postCodeLen;
String date = values[2].substring(dateStartPos);
String postCode = values[2].substring(postCodeStartPos, dateStartPos);
String price = values[2].substring(0, postCodeStartPos);
// do something with the data
// you could store it with a dto or in arrays, one for each "column"
System.out.println(String.format("name: %s; city: %s; price: %s; post-code: %s; date: %s", name, city, price, postCode, date));
}
}

I think that whether or not you use FlatPack is not the problem.
If the file does not contain delimiters, then you should view the table as a file built from data columns and read it using character positions.
The first character of each row is position 0, the next character is position 1, then 2, and so on.
Then, in every row, the characters at positions 0 through 7 inclusive are the "Prod Name" and will return samsungA.
From character 9 to 18 (assuming 18 is the maximum position) you should read the "City" values.
So the prerequisite is to know how many characters wide each data column is.
For example, row 1 has "London" but another row has "California", and you could have wider names. So you need to know, or find, the maximum position at which the data ends for each data column.
And you can do it without FlatPack.
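The position-based reading described above can be sketched roughly as follows. The column boundaries and the padded sample row are assumptions made for illustration; the file in the question would first need each column padded to a known width, and the class and method names here are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class FixedWidthSketch {
    // Assumed column boundaries as [start, end) character positions:
    // name, city, price, zipcode, date.
    static final int[][] COLUMNS = { {0, 8}, {8, 18}, {18, 22}, {22, 27}, {27, 37} };

    // Cuts one row into trimmed column values using the positions above.
    static String[] slice(String row) {
        String[] out = new String[COLUMNS.length];
        for (int i = 0; i < COLUMNS.length; i++) {
            out[i] = row.substring(COLUMNS[i][0], COLUMNS[i][1]).trim();
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed sample row: the city is padded to 10 characters.
        String row = "samsungA" + String.format("%-10s", "London") + "6500" + "14021" + "10/07/2018";
        String[] v = slice(row);
        int price = Integer.parseInt(v[2]);
        int zip = Integer.parseInt(v[3]);
        LocalDate date = LocalDate.parse(v[4], DateTimeFormatter.ofPattern("dd/MM/yyyy"));
        System.out.println(v[0] + " | " + v[1] + " | " + price + " | " + zip + " | " + date);
    }
}
```

If the real widths differ, only the COLUMNS table needs to change.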

Well, you can use a parser and an XML schema to define the length of the required variables; that way you can extract them. But yes, those variables will have a predefined length.
String data = "samsungA500";
// The column lengths must add up to the record length: "samsungA" is 8 characters, "500" is 3.
String schema = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n" +
"<!-- DTD can be pulled from the Jar or over the web -->\r\n" +
"<!DOCTYPE PZMAP SYSTEM \"flatpack.dtd\" >\r\n" +
"<!--<!DOCTYPE PZMAP SYSTEM \"http://flatpack.sourceforge.net/flatpack.dtd\"> -->\r\n" +
"<PZMAP>\r\n" +
" <COLUMN name=\"std_name\" length=\"8\" />\r\n" +
" <COLUMN name=\"std_price\" length=\"3\" />\r\n" +
"</PZMAP>";
InputStream mapping = new ByteArrayInputStream(schema.getBytes());
InputStream dataStream = new ByteArrayInputStream(data.getBytes());
Parser pzparser = DefaultParserFactory.getInstance().newFixedLengthParser(mapping, dataStream);
DataSet ds = pzparser.parse();
while (ds.next()) {
System.out.println(ds.getString("std_name"));
System.out.println(ds.getInt("std_price"));
System.out.println(ds.getString("std_name"));
}

Related

How to control the comparator to avoid the , inside the " " in a csv file

I am trying to sort the values by age and I get an error. There is a method in the program where it separates the values by ",". Upon inspecting the error, it seems that it also considers the , inside the string address (which it shouldn't do). Any ideas as to how I can make it ignore the , inside the double quoted values?
Here's the part of the code where I split the values:
while ((string = input.readLine()) != null) { // Reads each line and loops until there is no more text to be read
String[] list = string.split(",", 8); // Uses "," to split the data
String name = list[0] + "," + list[1]; // The parts of data is assigned their specific variable
String email = list[2];// The parts of data is assigned their specific variable
String address = list[3];
int age = Integer.parseInt(list[4]); // Using parseInt, the data becomes an integer so comparison will be possible
String residency = list[5];
int district = Integer.parseInt(list[6]);
String gender = list[7];
person.add(new Person(name, email, address, age, residency, district, gender)); // Passes the information from the txt file
}
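One commonly used way to keep a comma inside a double-quoted value from acting as a separator is to split with a lookahead that only accepts commas followed by an even number of quotes. The sketch below assumes quotes are balanced and never escaped; the sample row and class name are made up, with the column order taken from the snippet above. A CSV library would be more robust for real data.

```java
import java.util.Arrays;

class QuotedCsvSplitSketch {
    // Matches a comma only when an even number of double quotes follows it,
    // i.e. when the comma lies outside any quoted value.
    static final String COMMA_OUTSIDE_QUOTES = ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)";

    static String[] split(String line) {
        // The -1 limit keeps trailing empty fields instead of dropping them.
        return line.split(COMMA_OUTSIDE_QUOTES, -1);
    }

    public static void main(String[] args) {
        // Hypothetical row; the quoted address contains a comma.
        String line = "Doe,John,jd@example.com,\"12 Main St, Springfield\",34,Resident,5,M";
        String[] list = split(line);
        System.out.println(list.length + " fields: " + Arrays.toString(list));
    }
}
```

The quoted field keeps its surrounding quotes after the split, so they still need to be stripped before use.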

What is wrong in my file reading with Scanner class?

Every time I run it, it gives this message: InputMismatchException. Where does the problem come from?
File f = new File("nameList.txt");
try {
PrintWriter out;
out = new PrintWriter(f);
for (int i = 0; i < 4; i++) {
out.printf("Name : %s Age : %d ", "Rezaee-Hadi", 19);
out.println("");
}
out.close();
} catch (IOException ex) {
System.out.println("Exception thrown : " + ex);
}
try {
Scanner in = new Scanner(f);
String name = in.nextLine();
int age = in.nextInt();
for (int i = 0; i < 4; i++) {
System.out.println(name);
System.out.println(age);
}
in.close();
} catch (FileNotFoundException ex) {
System.out.println("Exception thrown : " + ex);
}
You are creating your data file in the following data format:
Name : Rezaee-Hadi Age : 19
Now, it really doesn't matter (to some extent) how you format your data file as long as you realize that you may need to parse that data later on. You really don't need to maintain a header with your data on each file line. We already know that the first piece of data on any file line is to be a Name and the second piece of data on any file line is to be the Age of the person the Name relates to. So, the following is sufficient:
Rezaee-Hadi, 19
If you want, you can place a header as the very first line of the data file so that it can easily be determined what each piece of data on each line relates to, for example:
Name, Age
Rezaee-Hadi, 19
Fred Flintstone, 32
Tom Jones, 66
John Smith, 54
This is actually a typical format for CSV data files.
Keeping with the file data format you are already using:
There is nothing wrong with using the Scanner#nextLine() method. It's a good way to go but you should be iterating through the file line by line using a while loop because you may not always know exactly how many actual data lines are contained within the file, for example:
Scanner in = new Scanner(f);
String dataLine;
while (in.hasNextLine()) {
dataLine = in.nextLine().trim();
// Skip Blank Lines
if (dataLine.equals("")) {
continue;
}
System.out.println(dataLine);
}
This will print all the data lines contained within your file. But this is not what you really want, is it? You want to separate the name and age from each line, which means that you need to parse the data from each line. One way (in your case) would be something like this:
String dataLine;
Scanner in = new Scanner(f);
while (in.hasNextLine()) {
dataLine = in.nextLine().trim();
// Skip Blank Lines
if (dataLine.equals("")) {
continue;
}
String[] dataParts = dataLine.replace("Name : " , "").split(" Age : ");
System.out.println("The Person's Name: " + dataParts[0] + System.lineSeparator()
+ "The Person's Age: " + dataParts[1] + System.lineSeparator());
}
In the above code we iterate through the entire data file one line at a time using a while loop. As each line is read into the dataLine string variable, it is also trimmed of any leading or trailing whitespace. Normally we don't want that. We then check to make sure the line is not blank. We don't normally want blank lines either, so here we skip past them by issuing a continue to the while loop so as to immediately initiate another iteration. If the file line actually contains data then it is held within the dataLine variable.
Now we want to parse that data so as to retrieve the Name and the Age and place them into a String Array. We do this by using the String#split() method but first we get rid of the "Name : " portion of the line using the String#replace() method since we don't want to deal with this text while we parse the line. In the String#split() method we supply a string delimiter to split by and that delimiter is " Age : ".
String[] dataParts = dataLine.replace("Name : " , "").split(" Age : ");
Now when each line is parsed, the Name and Age will be contained within the dataParts[] string array as elements located at index 0 and index 1. We now use these array elements to display the results in the console window.
At this point the Age is a string located in the dataParts[] array at index 1, but you may want to convert this age to an Integer (int) type value. To do this you can use the Integer.parseInt() or Integer.valueOf() methods, but before you do that you should validate that the string you are about to pass to either of these methods is indeed a string representation of an integer. To do this you can use the String#matches() method along with a simple Regular Expression (RegEx):
int age = 0;
if (dataParts[1].matches("\\d+")) {
age = Integer.parseInt(dataParts[1]);
// OR age = Integer.valueOf(dataParts[1]);
System.out.println("Age = " + age);
}
else {
System.out.println("Age is not a numerical value!");
}
The regular expression "\\d+" placed within the String#matches() method basically means, "Does the supplied string consist only of one or more digits?". If the method finds that it does not, then boolean false is returned. If it finds that the value supplied is made up only of digits, then boolean true is returned. Doing things this way prevents a NumberFormatException for non-numeric input (a digit string too large for an int would still throw).
Use this:
int age = 0;
while (in.hasNext()) {
    // if the next token is an int, read it and print it
    if (in.hasNextInt()) {
        age = in.nextInt();
        System.out.println("Found Int value :" + age);
    } else {
        // otherwise consume the non-integer token so the loop can advance
        in.next();
    }
}
in place of this:
int age = in.nextInt();
Then you will no longer get the InputMismatchException.

Assign a variable to a string of text that is between a certain delimiters Ex. “|” using Java

I have a string that I want to break down, assigning different parts of it to different variables.
String:
String str ="NAME=Mike|Phone=555.555.555| address 298 Stack overflow drive";
To Extract the Name:
int startName = str.indexOf("=");
int endName = str.indexOf("|");
String name = str.substring(startName +1 , endName ).trim();
But I can't extract the phone number:
int startPhone = arg.indexOf("|Phone");
int endPhone = arg.indexOf("|");
String sip = arg.substring(startPhone + 7, endPhone).trim();
Now, how can I extract the phone number that sits between the "|" delimiters?
Also, is there a different way to extract the name, using the text between the "=" and the first "|"?
You can split on both = and | at the same time, and then pick the non-label parts
String delimiters = "[=\\|]";
String[] splitted = str.split(delimiters);
String name = splitted[1];
String phone = splitted[3];
Note that this code assumes that the input is formatted exactly as you posted. You may want to check for whitespace and other irregularities.
// "|" is a regex metacharacter (alternation), so it must be escaped for split()
String[] details = str.split("\\|");
String namePart = details[0];
String phonePart = details[1];
String addressPart = details[2];
String name = namePart.substring(namePart.indexOf("=") + 1).trim();
String phone = phonePart.substring(phonePart.indexOf("=") + 1).trim();
String address = addressPart.trim();
I hope this could help.
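Because String.split takes a regular expression, a bare "|" is read as alternation rather than as a literal pipe. A small sketch of the pipe-then-equals approach, using Pattern.quote to make the delimiter literal; the class and helper names are hypothetical:

```java
import java.util.regex.Pattern;

class PipeSplitSketch {
    // Splits on a literal "|"; Pattern.quote stops it being treated as regex alternation.
    static String[] parts(String str) {
        return str.split(Pattern.quote("|"));
    }

    // Returns the text after the first '=' in a part, or the whole part if there is none.
    static String valueOf(String part) {
        int eq = part.indexOf('=');
        return (eq >= 0 ? part.substring(eq + 1) : part).trim();
    }

    public static void main(String[] args) {
        String str = "NAME=Mike|Phone=555.555.555| address 298 Stack overflow drive";
        String[] details = parts(str);
        System.out.println("name:    " + valueOf(details[0]));
        System.out.println("phone:   " + valueOf(details[1]));
        System.out.println("address: " + valueOf(details[2]));
    }
}
```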

Parse a plain text into a Java Object

I'm parsing plain text and trying to convert it into an object.
The text looks like this (and I can't change the format):
"N001";"2014-08-12-07.11.37.352000";" ";"some#email.com ";4847 ;"street";"NAME SURNAME ";26 ;"CALIFORNIA ";21
and the object to convert it to:
String index;
String timestamp;
String mail;
Integer zipCode;
...
I've tried with:
StringTokenizer st1 = new StringTokenizer("\"N001\";\"2014-08-12-07.11.37.352000\";\" \";\"some#email.com \";4847 ;\"street\";\"NAME SURNAME \";26 ;\"CALIFORNIA \";21");
while (st1.hasMoreTokens()) {
    System.out.println(st1.nextToken(";").replaceAll("\"", ""));
}
And the output is correct. I was thinking of keeping a counter and hardcoding a switch/case inside the loop that sets the field depending on the counter, but the problem is that I have 40 fields...
Any ideas?
Thanks a lot!
String line = "\"N001\";\"2014-08-12-07.11.37.352000\";\" \";\"some#email.com \";4847 ;\"street\";\"NAME SURNAME \";26 ;\"CALIFORNIA \";21";
StringTokenizer st1 = new StringTokenizer(line, ";");
while (st1.hasMoreTokens()) {
    System.out.println(st1.nextToken().replaceAll("\"", ""));
}
Or you can use the split method and directly get an array of values using the delimiter ;
String[] values = line.split(";");
Then iterate through the array and convert the values the way you want.
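As a rough sketch of that conversion step: the Row class below is hypothetical, and the assumption that 4847 is the zip code comes only from the field order shown in the question.

```java
class SplitAndConvertSketch {
    // Hypothetical holder for the first few fields of a row.
    static class Row {
        String index;
        String timestamp;
        String mail;
        Integer zipCode;
    }

    // Removes double quotes and surrounding whitespace from one raw token.
    static String clean(String raw) {
        return raw.replace("\"", "").trim();
    }

    static Row parse(String line) {
        String[] values = line.split(";");
        Row row = new Row();
        row.index = clean(values[0]);
        row.timestamp = clean(values[1]);
        // values[2] is the blank field in the sample, so it is skipped here.
        row.mail = clean(values[3]);
        row.zipCode = Integer.valueOf(clean(values[4]));
        return row;
    }

    public static void main(String[] args) {
        String line = "\"N001\";\"2014-08-12-07.11.37.352000\";\" \";\"some#email.com \";4847 ;\"street\"";
        Row row = parse(line);
        System.out.println(row.index + " / " + row.timestamp + " / " + row.mail + " / " + row.zipCode);
    }
}
```

With 40 fields this gets repetitive, which is where a mapping table or a CSV library starts to pay off.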
Regardless of the way you are parsing the file, you somehow need to define the mapping of column-to-field (and how to parse the text).
If this is a CSV file, you could use a library like Super CSV. All you need to do is write a mapping definition.
I would first split your input String based on the semi-colon separator, then clean up the values.
For instance:
String input = "\"N001\";\"2014-08-12-07.11.37.352000\";\" " +
"\";\"some#email.com " +
"\";4847 ;\"street\";\"NAME " +
"SURNAME \";26 ;\"CALIFORNIA " +
"\";21 ";
// raw split
String[] split = input.split(";");
System.out.printf("Raw: %n%s%n", Arrays.toString(split));
// cleaning up whitespace and double quotes
ArrayList<String> cleanValues = new ArrayList<String>();
for (String s: split) {
String clean = s.replaceAll("[\\s\"]", "");
if (!clean.isEmpty()) {
cleanValues.add(clean);
}
}
System.out.printf("Clean: %n%s%n", cleanValues);
Output
Raw:
["N001", "2014-08-12-07.11.37.352000", " ", "some#email.com ", 4847 , "street", "NAME SURNAME ", 26 , "CALIFORNIA ", 21 ]
Clean:
[N001, 2014-08-12-07.11.37.352000, some#email.com, 4847, street, NAMESURNAME, 26, CALIFORNIA, 21]
Note
In order to map the values to your variables you will need to know their index in advance, and it will have to be consistent.
Then you can use the get(int i) method to retrieve them from your List - e.g. cleanValues.get(2) will get you the e-mail, etc.
Note (2)
If you do not know the indices in advance or they may vary, then you are in trouble.
You can of course try to get those indices by using regular expressions but I suspect you might end up complicating your life quite a bit.
You can use Java reflection to automate the process.
Iterate over the fields:
Field[] fields = dummyRow.getClass().getFields();
and set your values:
SomeClass object = construct.newInstance();
field.set(object , value);
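A fuller sketch of the reflection idea follows. Since getFields() returns fields in no guaranteed order, this version lists the field names explicitly in column order and looks each one up by name. The Row class, the COLUMN_FIELDS list and the two supported types are all assumptions for illustration.

```java
import java.lang.reflect.Field;

class ReflectionMappingSketch {
    // Hypothetical target class; the fields are public so getField() can find them.
    public static class Row {
        public String index;
        public String timestamp;
        public Integer zipCode;
    }

    // Field names in column order, listed explicitly because getFields() has no guaranteed order.
    static final String[] COLUMN_FIELDS = { "index", "timestamp", "zipCode" };

    static Row map(String[] values) {
        try {
            Row row = Row.class.getDeclaredConstructor().newInstance();
            for (int i = 0; i < COLUMN_FIELDS.length; i++) {
                Field field = Row.class.getField(COLUMN_FIELDS[i]);
                String text = values[i].trim();
                // Convert the text to the declared field type; only Integer and String are handled.
                Object value = field.getType() == Integer.class ? Integer.valueOf(text) : text;
                field.set(row, value);
            }
            return row;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not map row", e);
        }
    }

    public static void main(String[] args) {
        Row row = map(new String[] { "N001", "2014-08-12-07.11.37.352000", "4847 " });
        System.out.println(row.index + " / " + row.timestamp + " / " + row.zipCode);
    }
}
```

Adding a column then means adding a field and one entry in COLUMN_FIELDS rather than another case branch.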

Cannot get values from splitted Array String into a String

I am trying to get the values out of String[] value; into String lastName;, but I get an error: java.lang.ArrayIndexOutOfBoundsException: 2
at arduinojava.OpenFile.openCsv(OpenFile.java:51) (lastName = value[2];). Here is my code, but I am not sure whether it is going wrong at the split(), at declaring the variables, or at getting the data into another variable.
Also, I am calling input.next(); three times to skip the first row, because otherwise "of study" from "Field of study" would also be printed out.
The rows I am trying to share are in a .csv file:
University Firstname Lastname Field of study
Karlsruhe Jerone L Software Engineering
Amsterdam Shahin S Software Engineering
Mannheim Saman K Artificial Intelligence
Furtwangen Omid K Technical Computing
Esslingen Cherelle P Technical Computing
Here's my code:
// Declare Variable
JFileChooser fileChooser = new JFileChooser();
StringBuilder sb = new StringBuilder();
// StringBuilder data = new StringBuilder();
String data = "";
int rowCounter = 0;
String delimiter = ";";
String[] value;
String lastName = "";
/**
* Opencsv csv (comma-separated values) reader
*/
public void openCsv() throws Exception {
if (fileChooser.showOpenDialog(null) == JFileChooser.APPROVE_OPTION) {
// Get file
File file = fileChooser.getSelectedFile();
// Create a scanner for the file
Scanner input = new Scanner(file);
// Ignore first row
input.next();
input.next();
input.next();
// Read from input
while (input.hasNext()) {
// Gets whole row
// data.append(rowCounter + " " + input.nextLine() + "\n");
data = input.nextLine();
// Split row data
value = data.split(String.valueOf(delimiter));
lastName = value[2];
rowCounter++;
System.out.println(rowCounter + " " + data + "Lastname: " + lastName);
}
input.close();
} else {
sb.append("No file was selected");
}
}
The values in each line are separated by spaces, not by semicolons, as per your sample. Try splitting on one or more whitespace characters like this:
data.split("\\s+");
Change the delimiter as shown below:
String delimiter = "\\s+";
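One thing to watch with a whitespace split is that a last column such as "Software Engineering" contains spaces itself. A possible sketch, assuming the first three columns never contain spaces, passes a limit of 4 so the remainder of the row stays in one piece; the class and method names are hypothetical.

```java
class WhitespaceSplitSketch {
    // Splits a row on runs of whitespace into at most 4 parts, so the
    // last column ("Field of study") keeps its internal spaces.
    static String[] columns(String row) {
        return row.trim().split("\\s+", 4);
    }

    public static void main(String[] args) {
        String[] value = columns("Karlsruhe Jerone L Software Engineering");
        System.out.println("Lastname: " + value[2]);
        System.out.println("Field of study: " + value[3]);
    }
}
```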
EDIT
The CSV file should be in this format: all the values should be enclosed in double quotes, and there should be a valid separator such as a comma, space, or semicolon.
"University" "Firstname" "Lastname" "Field of study"
"Karlsruhe" "Jerone" "L" "Software Engineering"
"Amsterdam" "Shahin" "S" "Software Engineering"
Please check whether your file uses ';' as the delimiter; if not, add it and try again. It should work!
Use the OpenCSV library to read CSV files. Here is a detailed example on reading/writing CSV files using Java by Viral Patel.
