Writing to CSV file with multiline - java

I am trying to write to a CSV file in which one of the cells contains multiple lines but must remain a single cell. I have read online that wrapping the value in double quotes ("") generally works. That is the case in Finder, but when I open the file in Excel it does not work that way. What do you suggest?

Try this tutorial:
https://www.baeldung.com/apache-commons-csv
But post your code to get a more detailed response.
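For reference, the quoting the question mentions can be sketched with the standard library alone. This is a minimal sketch, assuming the common CSV convention (as described in RFC 4180) that a field containing a comma, a double quote, or a line break is wrapped in double quotes and that embedded double quotes are doubled. The class and method names are hypothetical, and this is not the Apache Commons CSV API from the tutorial.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
public class MultilineCsv {

    // Quote a field when it contains a comma, a double quote, or a line break.
    // Embedded double quotes are doubled.
    static String escape(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        String escaped = field.replace("\"", "\"\"");
        return needsQuotes ? "\"" + escaped + "\"" : escaped;
    }

    // Join the escaped fields of one record with commas.
    static String toRow(List<String> fields) {
        return fields.stream().map(MultilineCsv::escape).collect(Collectors.joining(","));
    }

    // Write all records to the file, ending each record with CRLF.
    static void writeRows(Path file, List<List<String>> rows) throws IOException {
        StringBuilder out = new StringBuilder();
        for (List<String> row : rows) {
            out.append(toRow(row)).append("\r\n");
        }
        Files.write(file, out.toString().getBytes(StandardCharsets.UTF_8));
    }
}
```

Only the line break inside the quoted field is kept as part of the cell; the CRLF after each record ends the row. Whether Excel then shows the value as one cell may also depend on how the file is opened or imported, which this code does not control.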

How to read/print the contents of an Excel file using docx4j?

I've been googling and I've not found one example for this.
I am able to extract the contents of a DOCX file, but so far I have no clue how to get the contents of an Excel file.
I know you use
SpreadsheetMLPackage spreadsheetMLPackage = SpreadsheetMLPackage.load(file);
to load the file, but I don't know how to proceed from here. I've checked the methods SpreadsheetMLPackage has, but none of them has gotten me the contents.
First you need to understand the structure of an xlsx file.
Unzip one, or run it through the docx4j webapp.
For how the parts relate to one another, see:
http://openxmldeveloper.org/blog/b/openxmldeveloper/archive/2007/08/13/1970.aspx
I guess the key method you'll want is getWorksheet
But first you'll need to get the WorkbookPart; do that with spreadsheetMLPackage.getWorkbookPart()

Parsing XML file from the end of file

I want to use XML for storing some data, but I do not want to read the full file when I need the last item that was inserted, nor do I want to rewrite the full file when adding new data. Is there a standard way in Java to parse an XML file from the end rather than from the beginning, so that, for example, a SAX or StAX parser would first encounter the closing root tag and then the last element? Or, if I want to do this, should I read and write everything as if it were a regular text file?
Fundamentally, XML is a poor representation choice for this. The format is inherently "contained" like this, and I haven't seen any APIs which encourage you to fight against that.
Options:
Choose a different format entirely (e.g. use a database)
Create lots of small XML files instead - each one self-contained. When you want the whole of the data, read all the files
Just swallow the hit and read/write the whole file each time.
I found a good discussion of this with example solutions for what I want:
http://www.oreillynet.com/xml/blog/2007/03/parsing_xml_backwards.html
It seems that XML is not a good file format for what I want to achieve. There is no standard parser that can parse XML from the end instead of the beginning.
Probably the best solution for me will be to store all the XML data in one file made up of the contents of many small XML documents, one per line. The file itself is not well-formed XML, but each line contains well-formed XML that I will parse using a standard XML parser (StAX).
This way I will be able to read just the lines at the end of the file and append new data to the end of the file. When I need all of the data or only part of it, I will read all of the lines or some of them. I can probably also implement pagination from the end of the file, because the file can be big.
Why XML in each line? I think an XML API makes it easy to parse, and XML is more human-readable than values separated by some symbol on a line.
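The one-record-per-line idea could be sketched roughly as follows. This is a minimal sketch under stated assumptions: every record is a complete XML document on a single line with no raw line breaks inside it, the file is UTF-8, and the class name, method names, and the `id` attribute used in the usage note are hypothetical. Reading backwards one byte at a time is slow for long lines, so a real implementation would probably read in blocks.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, for illustration only.
public class XmlLineLog {

    // Append one record as a single line without rewriting the rest of the file.
    static void append(Path file, String xmlRecord) throws IOException {
        Files.write(file, (xmlRecord + "\n").getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Return the last non-empty line by scanning backwards from the end of the file.
    static String readLastLine(Path file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long end = raf.length();
            // Skip trailing line terminators.
            while (end > 0) {
                raf.seek(end - 1);
                int b = raf.read();
                if (b != '\n' && b != '\r') break;
                end--;
            }
            long start = end;
            // Walk back to the previous line feed, or to the start of the file.
            while (start > 0) {
                raf.seek(start - 1);
                if (raf.read() == '\n') break;
                start--;
            }
            byte[] bytes = new byte[(int) (end - start)];
            raf.seek(start);
            raf.readFully(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    // Parse one line with StAX and return an attribute of its root element.
    static String rootAttribute(String xml, String name) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            reader.nextTag(); // advance to the root start element
            return reader.getAttributeValue(null, name);
        } finally {
            reader.close();
        }
    }
}
```

For example, after appending `<entry id="1"/>` and `<entry id="2"/>`, `readLastLine` returns the second record and `rootAttribute(line, "id")` reads its `id` without touching the earlier lines.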
Why not use SAX/StAX and simply process only your last entry? Yes, it will need to open and go through the whole file, but at least it's fairly efficient compared with loading the whole DOM tree.
Short of doing that, I don't think you can do what you're asking using XML as a source.
Another alternative, apart from the ones provided by Jon Skeet in his answer, would be to keep the same format but insert the latest entries first, and stop processing the files as soon as you've read your entry.

Read text files and write it to excel in java

I have to read a text file and write its contents to an already existing Excel file. The Excel file is a customized sheet with different items in different columns, and each item has its own value. The items and their values can be found in a text file, but I don't have much idea how to do this.
E.g- example.txt
Name: John
Age=24
Sex=M
Graduate=M.S
example.xlsx
Age: Sex:
Name: Graduate:
Thanks in advance :)
Just as for so many other problems that need solved, there's an Apache library for that! In this case, it's the POI library. I've only used it for very basic spreadsheet manipulation, but managed that by just following a few tutorials. I'd link to one, but I can't now remember where it was.
Please see Apache POI-HSSF library for reading and writing Excel files with Java. There are some quick guides to get you started.
This post How to read and write excel file in java might help you.
You can also create a *.csv (comma-separated values) file in Java. Just create a simple text file with a .csv extension and put your values in it like this:
Age:,24,Sex:,M,
So you just separate your values with commas (or other delimiters like ';').
Every line in this file is a row, and every delimiter separates two columns. You won't be able to add colours/styles/formatting this way, but it gives you a file that is openable and understandable even without Excel (or other spreadsheet software).
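The CSV suggestion can be sketched for the example.txt layout from the question. This is a minimal sketch, assuming each input line is a key and a value separated by the first ':' or '=' and that values contain no commas or quotes (otherwise they would need quoting). The class and method names are hypothetical, and unlike the sample row above, the generated row has no trailing comma.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
public class KeyValueToCsv {

    // Split each line at the first ':' or '=' into a trimmed key and value.
    // Lines without either separator are skipped.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split("[:=]", 2);
            if (parts.length == 2) {
                values.put(parts[0].trim(), parts[1].trim());
            }
        }
        return values;
    }

    // Build one row in the "Label:,value,Label:,value" layout.
    // A label with no parsed value gets an empty cell.
    static String toCsvRow(Map<String, String> values, List<String> labels) {
        return labels.stream()
                .map(label -> label + ":," + values.getOrDefault(label, ""))
                .collect(Collectors.joining(","));
    }
}
```

With the four lines of example.txt as input, `toCsvRow(values, List.of("Age", "Sex"))` yields `Age:,24,Sex:,M`, and a second call with `Name` and `Graduate` yields the second row of the sheet.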

Excel or text file, which one to use?

I need to recommend an input format: an Excel file or a text file.
Assume the input is a large number of lines, and I need to read the first string of each one, for example:
A,B,C,D....
I need to read the first string (in this case A) to identify the matching row. Should I use an Excel file and use POI to read the first cell of each row, or a text file where the tokens on each line are separated by a delimiter, and parse each line to read the first token?
Use a text file. Because computers like it more. If business requires it, rename that text file into a "csv" file and you've got an Excel file.
If humans are going to enter the data, use Excel. If the file is a communication channel between two systems, use the simplest file format possible.
If at all possible, use text file - much easier to handle/troubleshoot, easier to generate, uses less memory, does not have restrictions on number of rows, etc. In general - more predictable.
If you go with text files, people prepare them manually, and you are dealing with non-ASCII text, you had better make sure everybody sends you the files in the correct encoding (UTF-8 would usually be best). This is not an issue with Excel.
The only reason to use Excel workbook would be when you need some "business-people" to produce those input files, then that input effectively becomes a user interface to your system - Excel is usually considered more user friendly than Notepad. ;-)
If you do go with Excel, make sure that the people producing those Excel files will give you the correct version (I assume you would want the "old" XLS format, not the new XLSX format).
Rule of thumb: use a text file. It's more interchangeable and way easier to handle by any other software you may need to support in a few years.
If you need humans to edit the data and you need the polished, colored display that Excel can provide, consider creating a macro that stores the data as CSV.
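The text-file option from the question can be sketched as a streaming lookup on the first token of each line. This is a minimal sketch, assuming a UTF-8 file and a plain delimiter with no quoting. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper class, for illustration only.
public class FirstTokenLookup {

    // Read the file line by line and return the first line whose first token equals key.
    // Only one line is held in memory at a time.
    static Optional<String> findRow(Path file, String key, String delimiter) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int end = line.indexOf(delimiter);
                // A line without the delimiter is treated as a single token.
                String first = end < 0 ? line : line.substring(0, end);
                if (first.equals(key)) {
                    return Optional.of(line);
                }
            }
        }
        return Optional.empty();
    }
}
```

Because the reader stops at the first match and never loads the whole file, the number of rows is limited only by how long a scan is acceptable.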

How to find if the file is a CSV file?

I have a scenario in which the user uploads a file to the system. The only file type the system understands is CSV, but the user can upload any type of file, e.g. JPEG, DOC, HTML. I need to throw an exception if the user uploads anything other than a CSV file.
Can anybody let me know how I can find out whether the uploaded file is a CSV file or not?
CSV files vary a lot, and they all could be called, legitimately, CSV files.
I think your approach is not the best one. The correct approach would be to determine whether the uploaded file is a text file the application can parse, rather than whether it is a CSV or not.
You would report errors whenever you can't parse the file, be it a JPG, MP3 or CSV in a format you cannot parse.
To do that, I would try to find a library to parse various CSV file formats, else you have a long road ahead writing code to parse many possible types of CSV files (or restricting the application's flexibility by supporting few CSV formats.)
One such library for Java is opencsv.
If you're using some library CSV parser, all you would have to do is catch any errors it throws.
If the CSV parser you're using is remotely robust, it will throw some useful errors in the event that it doesn't understand the file format.
I can think of several methods.
One way is to try to decode the file using UTF-8. (This is built into Java and is probably built into .NET too.) If the file decodes properly, then you at least know that it's a text file of some kind.
Once you know it's a text file, parse out the individual fields from each line and check that you get the number of fields that you expect. If the number of fields per line is inconsistent then you might just have a file that contains text but is not organized into lines and fields.
Otherwise you have a CSV. Then you can validate the fields.
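The two checks described above (strict UTF-8 decoding, then a consistent field count) could be sketched like this. This is a minimal sketch, assuming the expected number of fields is known in advance and that fields never contain quoted commas or line breaks. A real CSV parser would be needed for quoted fields. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class CsvSniffer {

    // Strict UTF-8 decode: returns null when the bytes are not valid UTF-8.
    static String decodeUtf8(byte[] bytes) {
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    // True when the bytes decode as UTF-8, there is at least one non-empty line,
    // and every non-empty line splits into exactly expectedFields fields.
    static boolean looksLikeCsv(byte[] bytes, int expectedFields) {
        String text = decodeUtf8(bytes);
        if (text == null) return false;
        boolean sawLine = false;
        for (String line : text.split("\\R")) {
            if (line.isEmpty()) continue;
            sawLine = true;
            // Naive split on commas; the limit -1 keeps trailing empty fields.
            if (line.split(",", -1).length != expectedFields) return false;
        }
        return sawLine;
    }
}
```

A binary upload such as a JPEG typically fails the decode step, while a plain text file with uneven lines fails the field count, so both cases can be reported to the user as unparseable input.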
If it's a web application, you might want to check the content-type HTTP header the browser sends when uploading/posting a file through a form.
If there's a binding for the language you're using, you might also try libmagic, which is pretty good at recognizing file types. For example, the UNIX tool file uses it.
http://sourceforge.net/projects/libmagic/
I don't know if you can tell for 100% certain in any way, but I'd suggest that the first validations should be:
Is the file extension .csv
Count the number of commas per line; there should normally be the same number of commas on each line for the file to be a valid CSV file. (As Jkramer said, this only works if the files can't contain quoted commas.)
Try this:
String type = Files.probeContentType(Paths.get(filepath));
I solved it like this: read the file with UTF-16 encoding. If no comma is found in the file, the UTF-16 decoding didn't work, which means that this CSV file is in Excel format (NOT plain text).
// readCSVFile (UTF-16) and readFile (UTF-8) are the poster's own helpers, not shown here.
if (fileA.endsWith(".csv") && fileB.endsWith(".csv")) {
    second_list = readCSVFile(fileA);
    new_list = readCSVFile(fileB);
    if (!String.join("", second_list).contains(",") || !String.join("", new_list).contains(",")) {
        // No comma found, so the UTF-16 decode did not work: re-read both files as UTF-8 text.
        System.out.println("[WARN] csv files will be read like text files. (UTF-16 encoding couldn't find any comma in the file, i.e., UTF-16 encoding didn't work)");
        second_list = readFile(fileA);
        new_list = readFile(fileB);
    } else {
        // Keep the lists as decoded with UTF-16.
    }
}
