Exported CSV is strangely formatted - java

What am I doing? I am exporting my SQLite database to a CSV file, or at least trying to.
I've done this both manually and with "OpenCSV".
With both methods I get very strange results; the output is not well formatted. Neither the columns (which are usually separated by ','?) nor the special characters (which OpenCSV is said to handle) look the way they should. Code:
CSVWriter writer = new CSVWriter(new FileWriter(file),'\n',',');
String[] items = new String[11];
c.moveToFirst();
while(!c.isAfterLast()){
items[0] = c.getString(c.getColumnIndex(BaseColumns._ID));
items[1] = c.getString(c.getColumnIndex(DepotTableMetaData.ITEM_QRCODE));
items[2] = c.getString(c.getColumnIndex(DepotTableMetaData.ITEM_NAME));
items[3] = c.getString(c.getColumnIndex(DepotTableMetaData.ITEM_AMOUNT));
items[4] = c.getString(c.getColumnIndex(DepotTableMetaData.ITEM_UNIT));
items[5] = c.getString(c.getColumnIndex(DepotTableMetaData.ITEM_PPU));
items[6] = c.getString(c.getColumnIndex(DepotTableMetaData.ITEM_TOTAL));
items[7] = c.getString(c.getColumnIndex(DepotTableMetaData.ITEM_COMMENT));
items[8] = c.getString(c.getColumnIndex(DepotTableMetaData.ITEM_SHOPPING));
items[9] = c.getString(c.getColumnIndex(DepotTableMetaData.CREATED_DATE));
items[10] = c.getString(c.getColumnIndex(DepotTableMetaData.MODIFIED_DATE));
c.moveToNext();
writer.writeNext(items);
}
writer.close();
and it all gives this as a result:
I've also done it with FileWriter and StringBuffer, but that gives exactly the same results. I'd love it if you could help me ;)
I have looked through Stack Overflow but couldn't find a matching question ;/
Edit: yes, I know that I use the "old, deprecated" cursor, but that's not the question here. Thanks.
Edit 2: SOLVED!
You have to specify a common encoding explicitly:
CSVWriter writer = new CSVWriter(new OutputStreamWriter(new FileOutputStream(destination+"/output.csv"),"UTF-8"));
did the job perfectly!
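The same encoding fix can be sketched with the standard library alone. FileWriter(File) uses the default charset, which can differ between platforms and Java versions, so naming UTF-8 explicitly removes that variable. The class and method names below (Utf8Export, writeUtf8, roundTrip) are made up for illustration and are not part of OpenCSV.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8Export {
    // Writes the given text to the target file, always encoded as UTF-8.
    static void writeUtf8(Path target, String csvText) throws IOException {
        try (Writer out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            out.write(csvText);
        }
    }

    // Hypothetical helper: writes text to a temp file and returns the raw bytes on disk.
    static byte[] roundTrip(String csvText) {
        try {
            Path tmp = Files.createTempFile("export", ".csv");
            try {
                writeUtf8(tmp, csvText);
                return Files.readAllBytes(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "K\u00e4se" contains one non-ASCII character, which takes two bytes in UTF-8.
        System.out.println(roundTrip("K\u00e4se").length); // 5
    }
}
```

Whoever reads the file back (a spreadsheet, another program) has to be told the same encoding, otherwise the special characters will still look wrong.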

You use an OpenCSV writer, which takes a row of the CSV file as an array of Strings and generates the separators between columns and rows automatically. Instead of letting OpenCSV do that for you, you do it explicitly by appending all the values of a row to a single String. So OpenCSV takes your single value and treats it as one column, in which commas and newlines must be encoded.
You should call writer.writeNext() with an array of Strings, each String in the array being a single cell from the table. The writer will generate the commas and the newlines for you.

Univocity CSV parser glues the whole line if it begins with quote "

I'm using univocity 2.7.5 to parse a CSV file. Until now it worked fine and parsed each row of the file into a String array with n elements, where n is the number of columns in the row. But now I have a file where rows start with a quote ("), and the parser cannot handle it. It returns each row as a String array with a single element that contains the whole row. When I removed that quote from the CSV file it worked fine, but there are about 500,000 rows. What should I do to make it work?
Here is the sample line from my file (it has quotes in source file too):
"100926653937,Kasym Amina,620414400630,Marzhan Erbolova,""Kazakhstan, Almaty, 66, 3"",87029845662"
And here's my code:
CsvParserSettings settings = new CsvParserSettings();
settings.setDelimiterDetectionEnabled(true);
CsvParser parser = new CsvParser(settings);
List<String[]> rows = parser.parseAll(csvFile);
Author of the library here. The input you have there is a well-formed CSV, with a single value consisting of:
100926653937,Kasym Amina,620414400630,Marzhan Erbolova,"Kazakhstan, Almaty, 66, 3",87029845662
If that row appeared in the middle of your input, I suppose your input has unescaped quotes (somewhere before you got to that line). Try playing with the unescaped quote handling setting. For example, this might work:
settings.setUnescapedQuoteHandling(UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE);
If nothing works, and all your lines look like the one you posted, then you can parse the input twice (which is shitty and slow but will work):
CsvParser parser = new CsvParser(settings);
parser.beginParsing(csvFile);
List<String[]> out = new ArrayList<>();
String[] row;
while ((row = parser.parseNext()) != null) {
//got a row with unexpected length?
if(row.length == 1){
//break it down again.
row = parser.parseLine(row[0]);
}
out.add(row);
}
Hope this helps.

Best way to populate a user defined object using the values of string array

I am reading two different CSV files and populating two different kinds of objects from them. I split each line with a regex (a different one for each file) and populate the object from the resulting array, as shown below:
public static <T> List<T> readCsv(String filePath, String type) {
List<T> list = new ArrayList<T>();
try {
File file = new File(filePath);
FileInputStream fileInputStream = new FileInputStream(file);
InputStreamReader inputStreamReader = new InputStreamReader(fileInputStream);
BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
list = bufferedReader.lines().skip(1).map(line -> {
T obj = null;
String[] data = null;
if (type.equalsIgnoreCase("Student")) {
data = line.split(",");
ABC abc = new ABC();
abc.setName(data[0]);
abc.setRollNo(data[1]);
abc.setMobileNo(data[2]);
obj = (T)abc;
} else if (type.equalsIgnoreCase("Employee")) {
data = line.split("\\|");
XYZ xyz = new XYZ();
xyz.setName(Integer.parseInt(data[0]));
xyz.setCity(data[1]);
xyz.setEmployer(data[2]);
xyz.setDesignation(data[3]);
obj = (T)xyz;
}
return obj;
}).collect(Collectors.toList());
} catch (Exception e) {
e.printStackTrace();
}
return list;
}
csv files are as below:
i. csv file to populate ABC object:
Name,rollNo,mobileNo
Test1,1000,8888888888
Test2,1001,9999999990
ii. csv file to populate XYZ object
Name|City|Employer|Designation
Test1|City1|Emp1|SSE
Test2|City2|Emp2|
The issue is that data can be missing for any of the above columns, as shown in the second CSV file. In that case I get an ArrayIndexOutOfBoundsException.
Can anyone tell me the best way to populate the object from the data in the string array?
Thanks in advance.
In addition to the other mistakes you made, which were pointed out to you in the comments, your actual problem is caused by line.split("\\|") being equivalent to line.split("\\|", 0), which discards trailing empty strings. You need to call line.split("\\|", -1) instead and it will work.
The problem appears to be that one or more of the last values on any given CSV line may be empty. In that case, you run into the fact that String.split(String) suppresses trailing empty strings.
Supposing that you can rely on all the fields in fact being present, even if empty, you can simply use the two-arg form of split():
data = line.split(",", -1);
You can find details in that method's API docs.
If you cannot be confident that the fields will be present at all, then you can force them to be by adding delimiters to the end of the input string:
data = (line + ",,").split(",", -1);
Since you only use the first few values, any extra trailing values introduced by the extra delimiters would be ignored.
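To make the difference between the two forms of split() concrete, here is a small sketch using the last data line of the second CSV file from the question. The class and method names (SplitLimitDemo, splitKeepingEmpty) are made up for illustration.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Splits a pipe-delimited line and keeps trailing empty fields (limit -1).
    static String[] splitKeepingEmpty(String line) {
        return line.split("\\|", -1);
    }

    public static void main(String[] args) {
        String line = "Test2|City2|Emp2|";

        // One-arg split drops the trailing empty string, so data[3] would be out of bounds.
        System.out.println(line.split("\\|").length);        // 3

        // A negative limit keeps it, so all four columns are present.
        String[] data = splitKeepingEmpty(line);
        System.out.println(data.length);                     // 4
        System.out.println(Arrays.toString(data));           // [Test2, City2, Emp2, ]
    }
}
```

With the four-element array, data[3] is an empty string rather than a missing index, so the setter calls in the question no longer throw.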

How to write two comma separated values as one value

I am retrieving values with a regular expression in JMeter and writing them to a CSV file. One of the extracted values comes back as two comma-separated values (value1,value2). How can I write those two values as a single value in the CSV file? Below is my code:
String statusvar = vars.get("guid");
String guidstat = vars.get("guidn");
String custstat = vars.get("custType");
String fpath = vars.get("write_file_path");
String newStatus;
FileWriter fstream = new FileWriter(fpath+"new_record.csv", false);
BufferedWriter out = new BufferedWriter(fstream);
out.write(statusvar+","+guidstat+","+custstat);
out.newLine();
out.flush();
Write your values within quotes and it should be OK. If a value contains quotes, then you'd need to escape them. Just replace each " by "", so value"a,valueB is written as "value""a,valueB"
If this becomes too tricky then I suggest getting a CSV parsing/writing library to do the job for you such as univocity-parsers - I'm the author of this one by the way.
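The quoting rule described above can be sketched as a small helper. This is only an illustration of the rule, not how univocity-parsers or JMeter do it; the names CsvQuoting, quote and toCsvLine are made up, and the sketch assumes none of the values is null.

```java
import java.util.StringJoiner;

class CsvQuoting {
    // Doubles every quote inside the value, then wraps the whole value in quotes.
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Builds one CSV line in which every value is quoted, so embedded commas are safe.
    static String toCsvLine(String... values) {
        StringJoiner joiner = new StringJoiner(",");
        for (String value : values) {
            joiner.add(quote(value));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // The second value contains a comma but still ends up in a single column.
        System.out.println(toCsvLine("guid-1", "value1,value2", "retail"));
        // prints: "guid-1","value1,value2","retail"
    }
}
```

In the script from the question, the out.write(...) argument would be built by a helper like toCsvLine instead of plain string concatenation.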

Trying to convert CSV to XLSX, but columns are being split up using the wrong commas

I am using the accepted answer from here. Basically, I am converting a csv to .xlsx, and it looks like the solution pulls everything in individual cells into 1 line using the buffered reader, and then using:
String str[] = currentLine.split(",");
the string is split into separate array elements, one per column. My problem is that some of my data contains commas, so the algorithm gets confused and creates more columns than needed, splitting sentences across different columns, which doesn't work for me. Is there another way I can split the lines? I'd happily split the string on a different unique character (maybe |?), but I don't know how to replace the comma in the lines returned by the BufferedReader. Any help would be great. The code I am using is below for reference:
public static void csvToXLSX() {
try {
String csvFileAddress = "test.csv"; //csv file address
String xlsxFileAddress = "test.xlsx"; //xlsx file address
XSSFWorkbook workBook = new XSSFWorkbook();
XSSFSheet sheet = workBook.createSheet("sheet1");
String currentLine=null;
int RowNum=0;
BufferedReader br = new BufferedReader(new FileReader(csvFileAddress));
while ((currentLine = br.readLine()) != null) {
String str[] = currentLine.split(",");
RowNum++;
XSSFRow currentRow=sheet.createRow(RowNum);
for(int i=0;i<str.length;i++){
currentRow.createCell(i).setCellValue(str[i]);
}
}
FileOutputStream fileOutputStream = new FileOutputStream(xlsxFileAddress);
workBook.write(fileOutputStream);
fileOutputStream.close();
System.out.println("Done");
} catch (Exception ex) {
System.out.println(ex.getMessage()+"Exception in try");
}
}
Well, CSV is more than just a text file with values separated by commas.
For example, fields in a CSV file can be quoted; this is how a comma is escaped within a field.
Quotes inside a quoted field are escaped as well, by doubling them.
There can also be newlines within a single CSV record; they must be inside a quoted field too.
So, to sum up, the CSV lines
1,"2,3","4
5",6,7,""""
should be parsed to an array of "1", "2,3", "4\n5", "6", "7", "\"" (and that is a single row of the CSV table).
As you can see, you can't just mindlessly split every line by comma. I suggest you use a library instead of doing this yourself. http://www.liquibase.org/javadoc/liquibase/util/csv/opencsv/CSVReader.html will work just fine.
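To show why a plain split(",") is not enough, here is a minimal sketch of a parser that follows the three rules above (quoted fields, doubled quotes, newlines inside quotes). It is an illustration under simplifying assumptions, not a replacement for a library: MiniCsvParser is a made-up name, the separator is fixed to a comma, carriage returns outside quotes are dropped, and malformed input is not reported.

```java
import java.util.ArrayList;
import java.util.List;

class MiniCsvParser {
    // Parses CSV text into records, each record being a list of field values.
    static List<List<String>> parse(String text) {
        List<List<String>> records = new ArrayList<>();
        List<String> record = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        boolean pending = false; // true once the current record has any content

        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (inQuotes) {
                if (ch != '"') {
                    field.append(ch);            // commas and newlines are literal here
                } else if (i + 1 < text.length() && text.charAt(i + 1) == '"') {
                    field.append('"');           // "" inside quotes is one literal quote
                    i++;
                } else {
                    inQuotes = false;            // closing quote
                }
            } else if (ch == '"') {
                inQuotes = true;
                pending = true;
            } else if (ch == ',') {
                record.add(field.toString());    // end of field
                field.setLength(0);
                pending = true;
            } else if (ch == '\n') {
                record.add(field.toString());    // end of field and of record
                field.setLength(0);
                records.add(record);
                record = new ArrayList<>();
                pending = false;
            } else if (ch != '\r') {
                field.append(ch);
                pending = true;
            }
        }
        if (pending) {                           // last record without trailing newline
            record.add(field.toString());
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // The two-line example from the answer: one record with six fields.
        List<List<String>> rows = parse("1,\"2,3\",\"4\n5\",6,7,\"\"\"\"");
        System.out.println(rows.size());         // 1
        System.out.println(rows.get(0).size());  // 6
    }
}
```

Note that the parser has to see the whole text rather than one readLine() result at a time, because a quoted field may span several physical lines.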

How to escape comma and double quote at same time for CSV file?

I am writing a Java app to export data from Oracle to a CSV file.
Unfortunately the content of the data can be quite tricky. The comma is still the delimiter, but some data in a row could look like this:
| ID | FN | LN | AGE | COMMENT |
|----------------------------------------------------------------|
| 123 | John | Smith | 39 | I said "Hey, I am 5'10"." |
|----------------------------------------------------------------|
So this is one of the strings in the comment column:
I said "Hey, I am 5'10"."
No kidding, I need to show the above comment unchanged in Excel or OpenOffice from a CSV file generated by Java, and of course without breaking the regular escaping cases (i.e. regular double quotes and regular commas within a tuple). I know regular expressions are powerful, but how can we achieve this in such a complicated situation?
There are several libraries. Here are two examples:
❐ Apache Commons Lang
Apache Commons Lang includes a special class to escape or unescape strings (CSV, EcmaScript, HTML, Java, Json, XML): org.apache.commons.lang3.StringEscapeUtils.
Escape to CSV
String escaped = StringEscapeUtils
.escapeCsv("I said \"Hey, I am 5'10\".\""); // I said "Hey, I am 5'10"."
System.out.println(escaped); // "I said ""Hey, I am 5'10""."""
Unescape from CSV
String unescaped = StringEscapeUtils
.unescapeCsv("\"I said \"\"Hey, I am 5'10\"\".\"\"\""); // "I said ""Hey, I am 5'10""."""
System.out.println(unescaped); // I said "Hey, I am 5'10"."
* You can download it from here.
❐ OpenCSV
If you use OpenCSV, you will not need to worry about escaping or unescaping; you only write or read the content.
Writing file:
FileOutputStream fos = new FileOutputStream("awesomefile.csv");
OutputStreamWriter osw = new OutputStreamWriter(fos, "UTF-8");
CSVWriter writer = new CSVWriter(osw);
...
String[] row = {
"123",
"John",
"Smith",
"39",
"I said \"Hey, I am 5'10\".\""
};
writer.writeNext(row);
...
writer.close();
osw.close();
fos.close();
Reading file:
FileInputStream fis = new FileInputStream("awesomefile.csv");
InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
CSVReader reader = new CSVReader(isr);
for (String[] row; (row = reader.readNext()) != null;) {
System.out.println(Arrays.toString(row));
}
reader.close();
isr.close();
fis.close();
* You can download it from here.
Excel has to be able to handle the exact same situation.
Put those things into Excel, save them as CSV, and examine the file with a text editor. Then you'll know the rules Excel is applying to these situations.
Make Java produce the same output.
The formats used by Excel are published, by the way...
Edit 1: Here's what Excel does
Edit 2: Note that PHP's fputcsv does exactly the same thing as Excel if you use " as the enclosure.
rdeslonde#mydomain.com
Richard
"This is what I think"
gets transformed into this:
Email,Fname,Quoted
rdeslonde#mydomain.com,Richard,"""This is what I think"""
Thanks to both Tony and Paul for the quick feedback, it's very helpful. I actually figured out a solution through POJO. Here it is:
if (cell_value.indexOf("\"") != -1 || cell_value.indexOf(",") != -1) {
cell_value = cell_value.replaceAll("\"", "\"\"");
row.append("\"");
row.append(cell_value);
row.append("\"");
} else {
row.append(cell_value);
}
In short, if there is a special character such as a comma or a double quote within the string inside the cell, first escape each double quote ("\"") by adding an additional double quote (giving "\"\""), then wrap the whole thing in double quotes ("\"" + theWholeThing + "\"").
You could also look at how Python writes Excel-compatible csv files.
I believe the default for Excel is to double up literal quote characters; that is, a literal quote " is written as "".
If you're using CSVWriter, check that you don't have the option
.withQuotechar(CSVWriter.NO_QUOTE_CHARACTER)
When I removed it, the comma showed up as expected and was no longer treated as the start of a new column.
"cell one","cell "" two","cell "" ,three"
Save this to a CSV file and look at the result: the double quote is used to escape itself.
Important Note
"cell one","cell "" two", "cell "" ,three"
will give you a different result, because there is a space after the comma, so the quote that follows it no longer starts the field and is treated as a literal " character.
String stringWithQuates = "\""+ "your,comma,separated,string" + "\"";
This will retain the comma in the CSV file.
In OpenCSV, use the constructor below to create the CSVWriter object:
CSVWriter csvWriter = new CSVWriter(writer, CSVWriter.DEFAULT_SEPARATOR, CSVWriter.DEFAULT_QUOTE_CHARACTER, CSVWriter.DEFAULT_ESCAPE_CHARACTER, CSVWriter.DEFAULT_LINE_END);
Here, DEFAULT_QUOTE_CHARACTER is very important.
With it, you can insert any ',' or '"' into the CSV file and it will work.
