I am reading a CSV file with the method below:
public ArrayList<String> fileRead(File f) throws IOException {
FileReader fr = new FileReader(f);
BufferedReader br = new BufferedReader(fr);
ArrayList<String> CSVData = new ArrayList<String>();
String text;
try {
while ((text = br.readLine()) != null) {
CSVData.add(text);
log.debug(text);
}
log.info(f + ": Read successfully");
br.close();
fr.close();
} catch (IOException e) {
log.error("Error in reading file " + e.getMessage());
}
return CSVData;
}
but I want to read each row only up to a defined column number, e.g. up to the 20th column.
If there is an empty cell in one of the columns in between, I expect the code above to exit at (text = br.readLine()) != null.
So my question is: how do I read a CSV file up to a particular column, whether the cells are empty or not, so that the break point for moving to the next line is that column (for example the 20th)?
Thanks in advance for your help and support.
You should use uniVocity-parsers' column selection feature to process your file.
Here's an example:
CsvParserSettings settings = new CsvParserSettings();
settings.selectFields("Foo", "Bar", "Blah");
// or if your file does not have a row with column headers, you can use indexes:
settings.selectIndexes(4, 20, 2);
CsvParser parser = new CsvParser(settings);
List<String[]> allRows = parser.parseAll(new FileReader(f));
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
Do NOT try to parse a CSV by yourself. There are many intricacies there. For example, using code such as the one you pasted:
while ((text = br.readLine()) != null) {
Will break as soon as your CSV has values that contain newline characters.
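For simple files that have no quoted fields and no embedded newlines, the worry about empty cells can be checked with the standard library alone. The sketch below is not uniVocity-parsers code; ColumnLimiter and firstColumns are hypothetical names used only for illustration. It relies on two facts: readLine() returns null only at the end of the file (an empty cell never ends the loop), and split(",", -1) keeps empty cells, so a row can be cut off at a fixed column count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; only suitable for CSV without quoted fields or embedded newlines.
class ColumnLimiter {

    // Returns the first maxColumns cells of one CSV line, keeping empty cells.
    static List<String> firstColumns(String line, int maxColumns) {
        // The -1 limit stops split() from dropping trailing empty cells.
        String[] cells = line.split(",", -1);
        int count = Math.min(cells.length, maxColumns);
        return new ArrayList<>(Arrays.asList(cells).subList(0, count));
    }

    public static void main(String[] args) {
        // Empty cells in the middle and at the end are preserved.
        System.out.println(firstColumns("a,,c,,e,", 4));
    }
}
```

For the question's case, firstColumns(text, 20) could be called on every line read in the loop. Once values may contain quotes, commas, or newlines, a real parser remains the safer choice.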
I have a text document that has multiple separate entries all compiled into one .log file.
The format of the file looks something like this.
$#UserID#$
Date
User
UserInfo
SteamFriendID
=========================
<p>Message</p>
$#UserID#$
Date
User
UserInfo
SteamFriendID
========================
<p>Message</p>
$#UserID#$
Date
User
UserInfo
SteamFriendID
========================
<p>Message</p>
I'm trying to take everything between the instances of "$#UserID#$" and write each entry to a separate text file.
So far, from what I've found, I tried implementing it with a StringBuilder, something like this:
FileReader fr = new FileReader("Path to raw file.");
int idCount = 1;
FileWriter fw = new FileWriter("Path to parsed files" + idCount);
BufferedReader br = new BufferedReader(fr);
//String line, date, user, userInfo, steamID;
StringBuilder sb = new StringBuilder();
//br.readLine();
while ((line = br.readLine()) != null) {
if(line.substring(0,1).contains("$#")) {
if (sb.length() != 0) {
File file = new File("Path to parsed logs" + idCount);
PrintWriter pw = new PrintWriter(file, "UTF-8");
pw.println(sb.toString());
pw.close();
//System.out.println(sb.toString());
Sb.delete(0, sb.length());
idCount++;
}
continue;
}
sb.append(line + "\r\n");
}
But this only gives me the first 2 of the entries in separate parsed files. Leaving the 3rd one out for some reason.
The other way I was thinking about doing it was reading in all the lines using .readAllLines(), store the list as an array, loop through the lines to find "$#", get that line's index & then recursively write the lines starting at the index given.
Does anyone know of a better way to do this, or would be willing to explain to me why I'm only getting two of the three entries parsed?
Short / quick fix is to write the contents of the StringBuilder once after your while loop like this:
public static void main(String[] args) {
try {
int idCount = 1;
FileReader fr = new FileReader("<path to desired file>");
BufferedReader br = new BufferedReader(fr);
//String line, date, user, userInfo, steamID;
StringBuilder sb = new StringBuilder();
//br.readLine();
String line = "";
while ((line = br.readLine()) != null) {
if(line.startsWith("$#")) {
if (sb.length() != 0) {
writeFile(sb.toString(), idCount);
System.out.println(sb);
sb.setLength(0);
idCount++;
}
continue;
}
sb.append(line + "\r\n");
}
if (sb.length() != 0) {
writeFile(sb.toString(), idCount);
System.out.println(sb);
idCount++;
}
} catch (IOException e) {
e.printStackTrace();
}
}
private static void writeFile(String content, int id) throws IOException
{
File file = new File("<path to desired dir>\\ID_" + id + ".txt");
file.createNewFile();
PrintWriter pw = new PrintWriter(file, "UTF-8");
pw.println(content);
pw.close();
}
I've changed two additional things:
The condition line.substring(0,1).contains("$#") did not work: the substring call returns only one character but is compared against two characters, so it is never true. I changed it to use the startsWith method.
After the content of the StringBuilder was written to a file, it was not reset or emptied, so the second and third files would also contain every previous block (the third file would then equal the input). That is now done with sb.setLength(0);.
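The same grouping logic can also be written as a small function that works on lines already held in memory, which is close to the readAllLines() idea from the question. This is only a sketch: LogSplitter, splitEntries, and flush are hypothetical names, and it assumes, as the code above does, that any line starting with "$#" marks a new entry and is not part of the output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that groups log lines into entries.
class LogSplitter {

    // A line starting with "$#" begins a new entry and is itself skipped.
    static List<String> splitEntries(List<String> lines) {
        List<String> entries = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            if (line.startsWith("$#")) {
                flush(current, entries);
                continue;
            }
            current.append(line).append("\r\n");
        }
        // The last entry has no marker after it, so it is flushed here.
        flush(current, entries);
        return entries;
    }

    // Stores the collected text as one entry and empties the builder.
    private static void flush(StringBuilder current, List<String> entries) {
        if (current.length() != 0) {
            entries.add(current.toString());
            current.setLength(0);
        }
    }
}
```

The input could come from Files.readAllLines(path), and each returned entry could then be written to its own file, for example with the writeFile method shown above.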
I am trying to read in a text file and then manipulate a little and update the records into a new text file.
Here is what I have so far:
ArrayList<String> linesList = new ArrayList<>();
BufferedReader br;
String empid, email;
String[] data;
try {
String line;
br = new BufferedReader(new FileReader("file.txt"));
while ((line = br.readLine()) !=null) {
linesList.add(line);
}
br.close();
}
catch (IOException e) { e.printStackTrace(); }
for (int i = 0; i < linesList.size(); i++) {
data = linesList.get(i).split(",");
empid = data[0];
ccode = data[3];
}
File tempFile = new File("File2.txt");
BufferedWriter bw = new BufferedWriter(new FileWriter(tempFile));
for (int i = 0; i < linesList.size(); i++) {
if(i==0){
bw.write(linesList.get(i));
bw.newLine();
}
else{
data = linesList.get(i).split(",");
String empid1 = data[0];
if(data[13].equals("IND")) {
String replace = data[3].replaceAll("IND", "IN");
ccode1 = replace;
System.out.println(ccode1);
}
else if(data[13].equals("USA")) {
String replace = data[3].replaceAll("USA", "US");
ccode1 = replace;
}
else {
ccode1 = replace; //This does not work as replace is not defined here, but how can I get it to work here.
}
String newData=empid1+","+ccode1;
bw.write(newData);
bw.newLine();
}
}
Here is what is inside the text file:
EID,First,Last,Country
1,John,Smith,USA
2,Jane,Smith,IND
3,John,Adams,USA
So, what I need help with is editing the three letter country code and replacing it with a 2 letter country code. For example: USA would become US, and IND would become IN. I am able to read in the country code, but am having trouble in changing the value and then replacing the changed value back into a different text file. Any help is appreciated. Thanks in advance.
Open the file in a text editor and search and replace ,USA with ,US, ,IND with ,IN, and so on.
To automate it, do the replacement in the same while loop where you read each line:
//while(read){ line = line.replace(",USA", ",US");
That is the easiest way to reach your objective.
To save, open a BufferedWriter bw, just as you opened a reader, and use bw.write(). You will probably prefer to open both at the same time: the reader on your source file and the writer on a new file with an _out suffix. That way you don't need to keep the file data in memory; you can read and write as you loop.
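A minimal sketch of this read-and-write loop, assuming only the two country codes from the question. CountryCodeRewriter, shorten, and rewrite are hypothetical names; the method takes a Reader and a Writer so that it can be used with a FileReader and FileWriter or with in-memory strings.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

// Hypothetical helper that copies a CSV while shortening country codes.
class CountryCodeRewriter {

    // String is immutable, so the result of replace() has to be kept.
    static String shorten(String line) {
        return line.replace(",USA", ",US").replace(",IND", ",IN");
    }

    // Reads line by line and writes each changed line immediately,
    // so the whole file never has to be held in memory.
    static void rewrite(Reader in, Writer out) throws IOException {
        try (BufferedReader br = new BufferedReader(in);
             BufferedWriter bw = new BufferedWriter(out)) {
            String line;
            while ((line = br.readLine()) != null) {
                bw.write(shorten(line));
                bw.newLine();
            }
        }
    }
}
```

It could be called as rewrite(new FileReader("file.txt"), new FileWriter("file_out.txt")). Note that a plain text replacement also changes any other value that starts with ",USA" or ",IND", so it only fits data where that cannot happen.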
For harder ways, read the csv specs: https://www.rfc-editor.org/rfc/rfc4180#section-2
Notice that you have to account for the possibility of fields being enclosed in quotes, like "1","John","Smith","USA", which means you also have to replace ,\"USA with ,\"US.
The delimiter may or may not be a comma; you have to make sure your input always uses the same delimiter, or that you can detect it and switch at runtime.
You have to account for the case where a delimiter is part of a field, or where quotes are part of a field.
Once you know how to solve these issues you can, instead of using replace, parse the lines character by character using while( (/*int*/ c = br.read()) != -1), and do the replacement manually with an if gate.
/*while(read)*/
if( c == delimiter ){
if not field, start next field, else add to field value
} else if( c == quote ){
if field value empty, ignore and expect closing quote, else if quote escape not marked, mark it, else, add quote to field value
}
(...)
} else if( c == 13 or c == 10 ){
finished line, check last field of row read and replace data
}
To make it better/harder, define a parsing state machine, put the states into an Enum, and write the if gates with them in mind (this will make your code be more like a compiler parser).
You can find parsing code at different stages here: https://www.mkyong.com/java/how-to-read-and-parse-csv-file-in-java/
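The state-machine idea can be sketched as follows for a single record. This is an illustration under assumptions, not a complete RFC 4180 parser: CsvLineParser, parse, and the state names are hypothetical, the quote character is fixed to the double quote, and a doubled quote inside a quoted field is treated as an escaped quote.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical state-machine parser for one CSV record.
class CsvLineParser {

    private enum State { FIELD_START, UNQUOTED, QUOTED, QUOTE_IN_QUOTED }

    static List<String> parse(String record, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        State state = State.FIELD_START;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            switch (state) {
                case FIELD_START:
                    if (c == '"') {
                        state = State.QUOTED;          // opening quote
                    } else if (c == delimiter) {
                        fields.add("");                // empty field
                    } else {
                        field.append(c);
                        state = State.UNQUOTED;
                    }
                    break;
                case UNQUOTED:
                    if (c == delimiter) {
                        fields.add(field.toString());
                        field.setLength(0);
                        state = State.FIELD_START;
                    } else {
                        field.append(c);
                    }
                    break;
                case QUOTED:
                    if (c == '"') {
                        state = State.QUOTE_IN_QUOTED; // closing or escaped quote
                    } else {
                        field.append(c);               // delimiters are plain text here
                    }
                    break;
                case QUOTE_IN_QUOTED:
                    if (c == '"') {
                        field.append('"');             // doubled quote is a literal quote
                        state = State.QUOTED;
                    } else if (c == delimiter) {
                        fields.add(field.toString());
                        field.setLength(0);
                        state = State.FIELD_START;
                    } else {
                        field.append(c);               // lenient: text after a closing quote
                        state = State.UNQUOTED;
                    }
                    break;
            }
        }
        fields.add(field.toString());                  // last field has no delimiter after it
        return fields;
    }
}
```

With the fields separated this way, the country code can be changed by comparing the last field to "USA" or "IND" instead of searching the raw line. Writing the record back out would still need quoting logic, which this sketch leaves out.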
You need to change your approach a little. To edit a file, create a new file, write the content to it, delete the old file, and rename the new file to the old name.
ArrayList<String> linesList = new ArrayList<>();
BufferedReader br;
String[] data;
File original=new File("D:\\abc\\file.txt");
try {
String line;
br = new BufferedReader(new FileReader(original));
while ((line = br.readLine()) !=null) {
linesList.add(line);
}
br.close();
}
catch (IOException e) { e.printStackTrace(); }
File tempFile = new File("D:\\abc\\tempfile.txt");
BufferedWriter bw = new BufferedWriter(new FileWriter(tempFile));
for (int i = 0; i < linesList.size(); i++) {
if(i==0){
bw.write(linesList.get(i));
bw.newLine();
}
else{
data = linesList.get(i).split(",");
String empid = data[0];
String name=data[1];
String lname=data[2];
String ccode = data[3].substring(0, 2);
// No "\n" here: bw.newLine() below already adds the line break.
String newData=empid+","+name+","+lname+","+ccode;
bw.write(newData);
bw.newLine();
}
}
bw.close();
if (!original.delete()) {
System.out.println("Could not delete file");
return;
}
// Rename the new file to the filename the original file had.
if (!tempFile.renameTo(original))
System.out.println("Could not rename file");
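The delete-and-rename step can also be done with java.nio.file, where Files.move with REPLACE_EXISTING replaces the original in one call and throws an exception on failure instead of returning false. This is a sketch under assumptions: InPlaceRewriter and shortenCountryCodes are hypothetical names, the first line is a header, and the country code is in the fourth column as in the sample data.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that rewrites a CSV file through a temporary file.
class InPlaceRewriter {

    static void shortenCountryCodes(Path original) throws IOException {
        List<String> lines = Files.readAllLines(original, StandardCharsets.UTF_8);
        List<String> updated = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (i == 0) {
                updated.add(line);                       // keep the header row unchanged
                continue;
            }
            String[] data = line.split(",", -1);
            if (data.length > 3 && data[3].length() > 2) {
                data[3] = data[3].substring(0, 2);       // USA -> US, IND -> IN
            }
            updated.add(String.join(",", data));
        }
        // Write next to the original so the move stays on the same file system.
        Path dir = original.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "rewrite", ".tmp");
        Files.write(temp, updated, StandardCharsets.UTF_8);
        Files.move(temp, original, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Taking the first two characters only works for codes where that happens to be correct, as it is for USA and IND; a lookup table would be needed for a general conversion.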
I am new to Java and have a task I'm clueless about. I need to take all the CSV files from a folder, read them one by one, and validate them. For example, there is data like name, age, and email; the name should contain only letters, the age should be numeric, and the email should be in a valid email format. If a file has invalid data in any row, that CSV file should not be processed any further and should be moved to another folder that holds the erroneous CSV files. The program should then move on to the next CSV file in the folder until all of them have been checked, validated, and moved.
I don't know how to begin with this. Please help me out.
OK, let's separate this question into following four smaller topics:
Java program to read a folder, result is a list of files
Java program to read a file, result is a list of lines
Java program to parse a line, get a list of columns
For name, age, email, validate the data
Step 1: Java program to read a folder, result is a list of files
Assuming you have the following at the top of your Java program:
import java.io.*;
import java.util.*;
the code below gets the list of files in a folder:
File f = new File(folder);
File[] fileList = f.listFiles();
Step 2: Java program to read a file, result is a list of lines
String line;
BufferedReader br = new BufferedReader(new FileReader(path));
while ((line = br.readLine()) != null) {
String l = line.trim(); // Strip surrounding whitespace (readLine() already drops the line terminator). You can print the line here.
}
br.close();
Step 3: Java program to parse a line, get a list of columns
String[] columns = l.split(","); // separate line by comma
for( int i=0; i<columns.length; i++ )
{
System.out.println(columns[i].trim());// remove space after comma
}
Step 4: Validate e.g. age
Age has to be an integer, so parse it as one; Integer.parseInt throws a NumberFormatException if the value is not numeric.
int age = Integer.parseInt(columns[3].trim());//assuming age at column #3
Another answer has been posted in the meantime; note that it does not include the loop from folder to files.
Hope this helps.
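Putting the four steps together could look like the sketch below. Everything in it is an assumption made for illustration: the class and method names are hypothetical, the columns are taken to be name, age, email in that order with no header row, names may contain letters and single spaces, and the email pattern is deliberately loose.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical validator; column order and patterns are assumptions.
class CsvFolderValidator {

    private static final Pattern NAME = Pattern.compile("[A-Za-z]+( [A-Za-z]+)*");
    private static final Pattern AGE = Pattern.compile("\\d{1,3}");
    // Loose check only; full email validation is much more involved.
    private static final Pattern EMAIL = Pattern.compile("[^@\\s]+@[^@\\s]+\\.[^@\\s]+");

    // Assumes the columns are name, age, email; adjust the indexes to the real layout.
    static boolean isValidRow(String line) {
        String[] columns = line.split(",", -1);
        if (columns.length < 3) {
            return false;
        }
        return NAME.matcher(columns[0].trim()).matches()
                && AGE.matcher(columns[1].trim()).matches()
                && EMAIL.matcher(columns[2].trim()).matches();
    }

    // Moves every .csv file that contains at least one invalid row to errorFolder.
    static void moveInvalidFiles(Path folder, Path errorFolder) throws IOException {
        Files.createDirectories(errorFolder);
        List<Path> csvFiles = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(folder, "*.csv")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    csvFiles.add(p);
                }
            }
        }
        for (Path csv : csvFiles) {
            boolean valid = true;
            for (String line : Files.readAllLines(csv)) {
                if (!isValidRow(line)) {
                    valid = false;   // stop checking this file at the first bad row
                    break;
                }
            }
            if (!valid) {
                Files.move(csv, errorFolder.resolve(csv.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}
```

If the files have a header row, the first line would have to be skipped before validation, and the split(",") approach again only holds for files without quoted fields.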
First, save your file in .csv format (Excel sheets can be saved this way). Then call this function from main(); it reads the .csv file row by row, one cell at a time.
Try this out:
public List<HashMap<String, Object>> convertCSVRecordToList() {
String csvFile = "your_file_name.csv";
BufferedReader br = null;
String line = "";
String cvsSplitBy = ",";
HashMap<String, Object> Map = new HashMap<String, Object>();
List<HashMap<String, Object>> MapList = new ArrayList<HashMap<String, Object>>();
try {
br = new BufferedReader(new FileReader(csvFile));
while ((line = br.readLine()) != null) {
System.out.println(line);
String[] data = line.split(cvsSplitBy);
// Replace the placeholder keys with your real column names; the keys must be distinct.
Map.put("field_name_4", data[3]);
Map.put("field_name_1", data[0]);
Map.put("field_name_3", data[2]);
Map.put("field_name_2", data[1]);
MapList.add(Map);
Map = new HashMap<String, Object>();
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
catch (IOException e) {
e.printStackTrace();
}
return MapList;
}
I would like to read a CSV file and print the result on the command line as separate values, without the , delimiter.
For example, the CSV file has the data
PVT Ltd, computer department, 5 Employees
which needs to be displayed in the command prompt as
PVT Ltd
computer department
5 Employees
Right now my code is
try{
File myFile=new File("CSV-to-ArayList.csv");
FileReader fileReader=new FileReader(myFile);
BufferedReader reader=new BufferedReader(fileReader);
String line=null;
while((line= reader.readLine())!= null){
System.out.println(line);
}
reader.close();
}catch(Exception ex){
ex.printStackTrace();
}
But this prints the result exactly as it is in the CSV file:
PVT Ltd, computer department, 5 Employees
Please help: how can I achieve the result I want?
Here are 3 options which you can read about and experiment with:
Using java.util.Scanner
Using String.split() function
Using 3rd Party libraries like OpenCSV
And here is some code for you:
http://howtodoinjava.com/2013/05/27/parse-csv-files-in-java/
It's a nice example to experiment with basic java concepts.
I hope you do use this to gain knowledge and basic how-to.
You can split the string by separator and then join with a new separator, like a space or a new line:
So instead of:
while((line= reader.readLine())!= null){
System.out.println(line);
}
You can do:
while((line= reader.readLine())!= null){
System.out.println(String.join("\n", line.split(",\\s*")));
}
If the only separator is the ",", you could first read in the whole file and then split it into a String[], which you can then easily display line by line or use for other operations. This would be my code:
try {
File myFile = new File("est.csv");
FileReader fileReader = new FileReader(myFile);
String result;
try (BufferedReader reader = new BufferedReader(fileReader)) {
String line;
result = "";
while ((line = reader.readLine()) != null) {
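// Add a separator so the last item of one line is not glued to the first item of the next.
result = result.concat(line).concat(",");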
}
}
String[] items = result.split(",\\s*");
for (String item : items) {
System.out.println(item);
}
} catch (Exception ex) {
ex.printStackTrace();
}
The "\\s*" part of the regex matches any whitespace that follows the comma between the items.
I have a tab-delimited text file which I want to parse using opencsv and upload to a database. I used CSVReader() to parse the file. The problem is, some column values have tabs within. For instance, a column ends with a tab, and then it has another tab which is used for separating it from the next column.
I'm having trouble parsing this file. How do I avoid delimiters which are part of the value?
This is the file I'm trying to parse. Each line has 2 columns and there are 5 rows in total. The first row is the header. However, when I parse it using the following code, I get only 3 rows:
CSVReader reader = new CSVReader(new FileReader("input.txt"), '\t');
String[] nextLine;
int cnt = 0;
while ((nextLine = reader.readNext()) != null) {
if (nextLine != null) {
cnt++;
System.out.println("Length of row "+cnt+" = "+nextLine.length);
System.out.println(Arrays.toString(nextLine));
}
}
******** Update ********
Doing a normal readline such as below prints 5 lines:
BufferedReader br = new BufferedReader(new FileReader("input.txt"));
int lines = 0;
while(br.readLine() != null){
lines++;
}
System.out.println(lines);
Put quotes on your data - here is a modified unit test from CSVReaderTest that shows quotes will work:
@Test
public void testSkippingLinesWithDifferentEscape() throws IOException
{
StringBuilder sb = new StringBuilder(CSVParser.INITIAL_READ_SIZE);
sb.append("Skip this line?t with tab").append("\n"); // should skip this
sb.append("And this line too").append("\n"); // and this
sb.append("a\t'b\tb\tb'\t'c'").append("\n"); // single quoted elements
CSVReader c = new CSVReader(new StringReader(sb.toString()), '\t', '\'', '?', 2);
String[] nextLine = c.readNext();
assertEquals(3, nextLine.length);
assertEquals("a", nextLine[0]);
assertEquals("b\tb\tb", nextLine[1]);
assertEquals("c", nextLine[2]);
}
If that does not work please post some of the lines from your input.txt. When I click on the link it takes me to some website trying to sell me a dropbox clone.