opencsv: how to parse data with double quotes inside cells? - java

I'm trying to parse some public data using opencsv (version 3.10). Here's a snippet of code that grabs a CSV and maps the records to a list of POJOs:
URL permitsURL = new URL("http://assessor.boco.solutions/ASR_PublicDataFiles/Permits.csv");
InputStream permitInputStream = permitsURL.openStream();
Reader permitStreamReader = new InputStreamReader(permitInputStream);
CsvToBean<PermitRecord> csvToBean = new CsvToBean<PermitRecord>();
Map<String, String> columnMapping = new HashMap<String, String>();
columnMapping.put("strap", "strap");
columnMapping.put("issued_by", "issuedBy");
columnMapping.put("permit_num", "permitNum");
columnMapping.put("permit_category", "permitCategory");
columnMapping.put("issue_dt", "issueDt");
columnMapping.put("estimated_value", "estimatedValue");
columnMapping.put("description", "description");
HeaderColumnNameTranslateMappingStrategy<PermitRecord> strategy = new HeaderColumnNameTranslateMappingStrategy<PermitRecord>();
strategy.setType(PermitRecord.class);
strategy.setColumnMapping(columnMapping);
List<PermitRecord> permitRecordList = null;
CSVReader csvReader = new CSVReader(permitStreamReader);
permitRecordList = csvToBean.parse(strategy, csvReader);
There are fewer records in the parsed list than in the CSV. Looking at the data, I notice that there are sometimes double quotes within the cell values. Here's an example:
"R0601364 ","LAFAYETTE","14-0486","DECK","4/29/2014 12:00:00 AM","3834","deck under 36\"""
"R0601365 ","LAFAYETTE","13-0570","NEW CONSTRUCTION","5/22/2013 12:00:00 AM","121899","SIN FAMILY HOME PLN CUSTOM FIN BASEMENT"
The deck under 36" value is causing the subsequent records to get rolled into the description, which is obvious when the parsed list is inspected in the IDE.
Can you see what I'm doing wrong? I suspect there's an easy fix because the file is parsed correctly by Excel, and opencsv seems to be the de facto standard for Java CSV parsing.

The Univocity CSV parsers are really easy to use. Mapping the CSV columns to POJO attributes is a breeze.
I added the following dependency to the pom.xml:
<dependency>
<groupId>com.univocity</groupId>
<artifactId>univocity-parsers</artifactId>
<version>2.5.4</version>
</dependency>
The CSV columns are mapped to attributes using annotations. Note the handy annotations:
@Parsed(field = "abc"): maps the CSV column to the variable
@Trim: removes leading/trailing whitespace
@Format(formats = {"MM/dd/yyyy"}): allows us to specify the date format
Here's the POJO:
package io.woolford.entity;
import com.univocity.parsers.annotations.Format;
import com.univocity.parsers.annotations.Parsed;
import com.univocity.parsers.annotations.Trim;
import java.util.Date;
public class PermitRecord {
    @Trim
    @Parsed(field = "strap")
    private String strap;
    @Parsed(field = "issued_by")
    private String issuedBy;
    @Parsed(field = "permit_num")
    private String permitNum;
    @Parsed(field = "permit_category")
    private String permitCategory;
    @Format(formats = {"MM/dd/yyyy"})
    @Parsed(field = "issue_dt")
    private Date issueDt;
    @Parsed(field = "estimated_value")
    private Integer estimatedValue;
    @Parsed(field = "description")
    private String description;
// getters & setters removed for brevity
}
Then, to create a list of POJOs from the records in the CSV file:
URL permitsURL = new URL("http://assessor.boco.solutions/ASR_PublicDataFiles/Permits.csv");
InputStream permitInputStream = permitsURL.openStream();
List<PermitRecord> permitRecordList = new CsvRoutines().parseAll(PermitRecord.class, permitInputStream);
Credit to @JeronimoBackes for this elegant solution. And thanks to Univocity for their excellent CSV parser.
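For anyone wondering why the original row breaks: opencsv's default parser treats the backslash as an escape character, so in 36\""" the \" is read as a literal quote, the following "" as another literal quote, and the field never closes. Excel follows the RFC 4180 convention instead, where a backslash is an ordinary character and only "" stands for a quote inside a quoted field. The sketch below is neither opencsv nor Univocity code. It is a minimal hand-rolled parser for a single record, and the names Rfc4180LineSketch and parseLine are made up for illustration. It does not handle records that span several lines.

```java
import java.util.ArrayList;
import java.util.List;

public class Rfc4180LineSketch {

    // Splits one CSV record using RFC 4180 quoting rules:
    // inside a quoted field, "" means a literal quote and a backslash is just a character.
    public static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote: keep one, skip the other
                        i++;
                    } else {
                        inQuotes = false;    // closing quote of the field
                    }
                } else {
                    current.append(c);       // includes backslashes, taken literally
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote of a field
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // The problematic record from the question, written as a Java string literal.
        String line = "\"R0601364 \",\"LAFAYETTE\",\"14-0486\",\"DECK\","
                + "\"4/29/2014 12:00:00 AM\",\"3834\",\"deck under 36\\\"\"\"";
        List<String> fields = parseLine(line);
        System.out.println(fields.size()); // 7
        System.out.println(fields.get(6)); // deck under 36\"
    }
}
```

If you would rather stay on opencsv, I believe it ships an RFC4180Parser that can be plugged into CSVReaderBuilder, but please check the documentation for your version before relying on that.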

Related

How can I grab a value from an ArrayList nested in a LinkedHashMap?

I currently have a yaml file that looks like this:
description: this-apps-config
options:
  - customer: joe
    id: 1
    date: 2022-01-01
    print: False
  - customer: jane
    id: 2
    date: 2022-01-02
    print: True
I am able to successfully read this in using snakeyaml:
Yaml yaml = new Yaml();
InputStream inputStream = new FileInputStream(new File("file.yml"));
Map<String, Object> data = yaml.load(inputStream);
System.out.println(data);
The above code retrieves everything as a LinkedHashMap, with options being an ArrayList of further maps, which looks like this:
{description=this-apps-config, options=[{customer=joe, id=1, date=2022-01-01, print=False}, {customer=jane, id=2, date=2022-01-02, print=True}]}
My question is, how do I get the print value in each of the options? The closest I've gotten is doing:
ArrayList<Object> al = new ArrayList<>();
al.add(data.get("options"));
This only gets me that first options ArrayList though. Not sure how to get deeper.
Thanks
SnakeYAML allows you to load a file into a custom class, and supports top-level types that have fields of other types, including collections. Try something like the following:
public class MyYaml {
private String description;
private List<Customer> options;
// getters and setters
}
public class Customer {
private String customer;
private int id;
private Date date;
private boolean print;
// getters and setters
}
And then where you want to load the file:
Yaml yaml = new Yaml();
InputStream inputStream = this.getClass()
    .getClassLoader()
    .getResourceAsStream("myFile.yaml");
// loadAs tells SnakeYAML which class to build; a plain load() would return a Map
MyYaml myYaml = yaml.loadAs(inputStream, MyYaml.class);
Here is a relevant tutorial that might help you.
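If you would rather keep the untyped Map that yaml.load() returns, you can also walk it by hand. The following is only a sketch: the class and method names (NestedOptionsSketch, printFlags) are made up, and the map is built manually in main to mimic the structure shown in the question, so the example runs without SnakeYAML on the classpath. The print value is converted via its string form so it works whether the loader produced a Boolean or a String.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NestedOptionsSketch {

    // Collects the "print" value of every entry under "options".
    @SuppressWarnings("unchecked")
    public static List<Boolean> printFlags(Map<String, Object> data) {
        List<Boolean> flags = new ArrayList<>();
        // "options" holds a list whose elements are themselves maps.
        List<Map<String, Object>> options = (List<Map<String, Object>>) data.get("options");
        for (Map<String, Object> option : options) {
            Object raw = option.get("print");
            // parseBoolean ignores case, so "True", "true" and Boolean.TRUE all become true.
            flags.add(Boolean.parseBoolean(String.valueOf(raw)));
        }
        return flags;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the map that yaml.load(inputStream) would return.
        Map<String, Object> joe = new LinkedHashMap<>();
        joe.put("customer", "joe");
        joe.put("id", 1);
        joe.put("print", Boolean.FALSE);

        Map<String, Object> jane = new LinkedHashMap<>();
        jane.put("customer", "jane");
        jane.put("id", 2);
        jane.put("print", Boolean.TRUE);

        Map<String, Object> data = new LinkedHashMap<>();
        data.put("description", "this-apps-config");
        data.put("options", Arrays.asList(joe, jane));

        System.out.println(printFlags(data)); // [false, true]
    }
}
```

The unchecked cast is the price of working with the raw map, which is one reason the typed approach above tends to be nicer once the file has a stable shape.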

Trouble with getting strings into JSON

My servlet receives/loads multiple parameters for an article (price, id, count, name).
While they are saved in the session for other purposes, I want to display them in a shopping cart.
So my idea was to get all the values into a JSON object like this:
{"id":1, "productName":"article1"}
but my JSON always ends up empty.
These are my parameters:
String prname = request.getParameter("name");
String anz = String.valueOf(session.getAttribute("Anzahl"));
String prid = request.getParameter("id");
String price = request.getParameter("price");
I had two approaches.
First try:
class ToJson{
String prname1 = String.valueOf(session.getAttribute("prname"));
String anz1 = String.valueOf(session.getAttribute("Anzahl"));
String prid1 = String.valueOf(session.getAttribute("id"));
String price1 = String.valueOf(session.getAttribute("price"));
}
ToJson obj = new ToJson();
Jsonb jsonb = JsonbBuilder.create();
String jsn1 = jsonb.toJson(obj);
Ends up with: {}
Second try:
ArrayList<String> ar = new ArrayList<String>();
ar.add(prname);
ar.add(price);
ar.add(prid);
ar.add(anz);
ToJson obj = new ToJson();
Jsonb jsonb = JsonbBuilder.create();
String jsn = jsonb.toJson(ar);
Ends up with: ["P1neu","25","1","145"]
It isn't in the format I wanted, and I also don't know how to access the separate values here. I tried jsn[1] but it didn't work.
Could you help me, please?
To your first question, why the JSON object is printed empty:
Your ToJson class is missing getters and setters, so the JSON builder/parser cannot access its properties/fields, and that's why it is printed as an empty object.
To your second question, how to access the JSON properties:
A JSON document is natively a string, so you can't read part of it with jsn[1].
To read JSON object properties, convert the string into a POJO using any of the available open-source parser libraries, such as Jackson or Gson, and then access the POJO's properties through standard Java getters/setters.
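To make the getter point concrete, here is a small sketch that uses only the JDK. It does not call JSON-B. Instead it uses java.beans.Introspector to list the readable bean properties of a class, which is roughly the kind of information a getter-based serializer works from. CartItemSketch, CartItem and NoGetters are hypothetical names for this illustration.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Set;
import java.util.TreeSet;

public class CartItemSketch {

    // Exposes its state through public getters, so it has readable bean properties.
    public static class CartItem {
        private final String id;
        private final String productName;

        public CartItem(String id, String productName) {
            this.id = id;
            this.productName = productName;
        }

        public String getId() { return id; }
        public String getProductName() { return productName; }
    }

    // Same idea as the ToJson class in the question: fields only, no getters.
    public static class NoGetters {
        String prname1 = "P1neu";
        String price1 = "25";
    }

    // Returns the names of all properties that have a public getter, sorted by name.
    public static Set<String> readableProperties(Class<?> type) {
        try {
            Set<String> names = new TreeSet<>();
            BeanInfo info = Introspector.getBeanInfo(type, Object.class);
            for (PropertyDescriptor descriptor : info.getPropertyDescriptors()) {
                if (descriptor.getReadMethod() != null) {
                    names.add(descriptor.getName());
                }
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not inspect " + type, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readableProperties(CartItem.class));  // [id, productName]
        System.out.println(readableProperties(NoGetters.class)); // []
    }
}
```

With a class shaped like CartItem, passing an instance to jsonb.toJson(...) as in the question should produce an object with id and productName keys rather than {}.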

How to map csv file to pojo class in java

I am using the Java Maven plugin. I want to load the records of an employee.csv file into a POJO class.
I generate this POJO class from the employee.csv header, and all of its fields are of type String. Now I want to map employee.csv to the generated POJO class. My requirement is that I don't want to specify the column names manually, because if the CSV file changes I would have to change my code again, so the mapping should work dynamically with any file. For instance:
firstName,lastName,title,salary
john,karter,manager,54372
I want to map this to pojo which I have already
public class Employee
{
private String firstName;
private String lastName;
.
.
//getters and setters
//toString()
}
uniVocity-parsers allows you to map your pojo easily.
class Employee {
    @Trim
    @LowerCase
    @Parsed
    private String firstName;
    @Parsed
    private String lastName;
    @NullString(nulls = { "?", "-" }) // if the value parsed in the salary column is "?" or "-", it will be replaced by null.
    @Parsed(defaultNullRead = "0") // if a value resolves to null, it will be converted to the String "0".
    private Integer salary; // The attribute name will be matched against the column header in the file automatically.
...
}
To parse:
BeanListProcessor<Employee> rowProcessor = new BeanListProcessor<Employee>(Employee.class);
CsvParserSettings parserSettings = new CsvParserSettings();
parserSettings.setRowProcessor(rowProcessor);
parserSettings.setHeaderExtractionEnabled(true);
CsvParser parser = new CsvParser(parserSettings);
//And parse!
//this submits all rows parsed from the input to the BeanListProcessor
parser.parse(new FileReader(new File("/path/to/your.csv")));
List<Employee> beans = rowProcessor.getBeans();
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
You can use the openCSV jar to read the data, and then map each column value to the corresponding class attribute.
For security reasons, I cannot share my code with you.
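If you want to see the header-driven idea without any library, it can be sketched with plain reflection: read the header row, then set the field whose name matches each column. This is only an illustration under simplifying assumptions. The names HeaderMappingSketch and map are made up, all fields are assumed to be String (as in the question), and the naive split(",") does not cope with quoted values or embedded commas, which is exactly what the parser libraries above handle for you.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HeaderMappingSketch {

    // Stand-in for the generated POJO from the question; all fields are Strings.
    public static class Employee {
        private String firstName;
        private String lastName;
        private String title;
        private String salary;

        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
        public String getTitle() { return title; }
        public String getSalary() { return salary; }
    }

    // Maps each data line to a new bean, matching header names to field names.
    public static <T> List<T> map(List<String> lines, Class<T> type) {
        try {
            String[] headers = lines.get(0).split(",");
            List<T> beans = new ArrayList<>();
            for (String line : lines.subList(1, lines.size())) {
                String[] values = line.split(",", -1); // -1 keeps trailing empty columns
                T bean = type.getDeclaredConstructor().newInstance();
                for (int i = 0; i < headers.length && i < values.length; i++) {
                    Field field = type.getDeclaredField(headers[i].trim());
                    field.setAccessible(true);
                    field.set(bean, values[i].trim());
                }
                beans.add(bean);
            }
            return beans;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("CSV header does not match " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "firstName,lastName,title,salary",
                "john,karter,manager,54372");
        Employee employee = map(lines, Employee.class).get(0);
        System.out.println(employee.getFirstName() + " " + employee.getSalary()); // john 54372
    }
}
```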

Converting a single CSV/TSV string into a Java object?

Instead of converting an entire CSV file to an object, is there a simple API that takes in one CSV or TSV string and converts it to an object? The APIs I've found so far are geared towards converting a CSV/TSV file to a list of objects.
Obviously I could just split the String and call a constructor, but was wondering if there was a clean api I could use.
You can do this with Jackson. It looks pretty similar to the other answers but seems to perform better than SuperCSV according to their tests.
Define your POJO (both the annotation and the constructor seem to be necessary):
@JsonPropertyOrder({ "foo", "bar" })
public class FooBar {
private String foo;
private String bar;
public FooBar() {
}
// Setters, getters, toString()
}
Then parse it:
String input = "1,2\n3,4";
StringReader reader = new StringReader(input);
CsvMapper m = new CsvMapper();
CsvSchema schema = m.schemaFor(FooBar.class).withoutHeader().withLineSeparator("\n").withColumnSeparator(',');
try {
MappingIterator<FooBar> r = m.reader(FooBar.class).with(schema).readValues(reader);
while (r.hasNext()) {
System.out.println(r.nextValue());
}
} catch (JsonProcessingException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
Go with uniVocity-parsers, as it is at least twice as fast as SuperCSV and has way more features.
For example, let's say your bean is:
class TestBean {
    // if the value parsed in the quantity column is "?" or "-", it will be replaced by null.
    @NullString(nulls = { "?", "-" })
    // if a value resolves to null, it will be converted to the String "0".
    @Parsed(defaultNullRead = "0")
    private Integer quantity; // The attribute type defines which conversion will be executed when processing the value.
    // In this case, IntegerConversion will be used.
    // The attribute name will be matched against the column header in the file automatically.
    @Trim
    @LowerCase
    // the value for the comments attribute is in the column at index 4 (0 is the first column, so this means fifth column in the file)
    @Parsed(index = 4)
    private String comments;
    // you can also explicitly give the name of a column in the file.
    @Parsed(field = "amount")
    private BigDecimal amount;
    @Trim
    @LowerCase
    // values "no", "n" and "null" will be converted to false; values "yes" and "y" will be converted to true
    @BooleanString(falseStrings = { "no", "n", "null" }, trueStrings = { "yes", "y" })
    @Parsed
    private Boolean pending;
}
Now, to read your input as a list of TestBean:
// BeanListProcessor converts each parsed row to an instance of a given class, then stores each instance into a list.
BeanListProcessor<TestBean> rowProcessor = new BeanListProcessor<TestBean>(TestBean.class);
CsvParserSettings parserSettings = new CsvParserSettings();
parserSettings.setRowProcessor(rowProcessor);
parserSettings.setHeaderExtractionEnabled(true);
CsvParser parser = new CsvParser(parserSettings);
parser.parse(getReader("/examples/bean_test.csv"));
// The BeanListProcessor provides a list of objects extracted from the input.
List<TestBean> beans = rowProcessor.getBeans();
To parse TSV files, just change the combination of CsvParserSettings & CsvParser to TsvParserSettings & TsvParser.
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
I'm using this API:
http://jsefa.sourceforge.net/
You can use annotations to convert your entities to CSV.
In the case of SuperCSV which you mentioned in a comment, you could pass it a String wrapped in a StringReader, i.e.
CsvBeanReader beanReader=new CsvBeanReader(new StringReader(theString), preferences);
beanReader.read(theBean, nameMapping);
I was recently dealing with a similar issue. In my case I wanted to import a single CSV row at a time into a single POJO, as I was getting my data in the form of discrete single-line websocket updates. In the end Jackson worked best for me, as I didn't have to put everything into a list of POJOs first.
Here's the code:
String csvString = "rick|sanchez|99";
private CsvMapper mapper=new CsvMapper();
private CsvSchema schema = mapper.schemaFor(Pojo.class).withColumnSeparator('|');
private ObjectReader r=mapper.readerFor(Pojo.class).with(schema);
Pojo pojo=r.readValue(csvString);
For this to work you also need to add the following annotation to your POJO:
@JsonPropertyOrder({"firstName","lastName","age"})
As far as I know, it's the only one that easily lets you parse a single CSV line into a single POJO instance. Obviously you could also do this by hand with a constructor, but these libraries deal with type conversions for you, so they are particularly useful if your POJO contains lots of different attributes.
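For comparison, the "by hand" route mentioned in the question and above can look like the sketch below. It uses only the JDK, and the names SingleLineSketch, Person and parse are hypothetical. It assumes exactly three columns and no quoting, so every type conversion (here just the int) has to be written out explicitly, which is the part a library saves you.

```java
import java.util.regex.Pattern;

public class SingleLineSketch {

    // Hypothetical target type with the same three properties as the Pojo above.
    public static class Person {
        private final String firstName;
        private final String lastName;
        private final int age;

        public Person(String firstName, String lastName, int age) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.age = age;
        }

        public String getFirstName() { return firstName; }
        public String getLastName() { return lastName; }
        public int getAge() { return age; }
    }

    // Splits one delimited line and hands the pieces to the constructor.
    public static Person parse(String line, char separator) {
        // Pattern.quote is needed because '|' is a regex metacharacter.
        String[] parts = line.split(Pattern.quote(String.valueOf(separator)), -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 columns but found " + parts.length);
        }
        return new Person(parts[0].trim(), parts[1].trim(), Integer.parseInt(parts[2].trim()));
    }

    public static void main(String[] args) {
        Person person = parse("rick|sanchez|99", '|');
        System.out.println(person.getFirstName() + " " + person.getAge()); // rick 99
    }
}
```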

Is there an elegant way to Generate Excel spreadsheet from List<POJO>? (JAVA)

In Java, is there an elegant way to generate an Excel spreadsheet from a List<POJO>?
There are two possible and radically different approaches:
Write a CSV file. That's comma-separated, you just write out your fields, separated by commas, into a file with a .csv extension. Excel can read that just fine and it's dramatically simple.
Use Apache/Jakarta POI, a library, to write perfectly formatted, Office-compatible Excel files (Excel 95, 2003, ... various standards). This takes a bit more work.
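A rough sketch of the CSV route, using only the JDK: the one thing that is easy to get wrong is quoting, since a value that contains a comma, a quote or a line break has to be wrapped in double quotes with inner quotes doubled. The names CsvWriterSketch, escape and toCsv are made up for this example, and each row is assumed to be already converted to strings.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CsvWriterSketch {

    // Quotes a value only when it contains a comma, a quote or a line break.
    public static String escape(String value) {
        if (value == null) {
            return "";
        }
        boolean needsQuotes = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        String doubled = value.replace("\"", "\"\""); // inner quotes are doubled
        return needsQuotes ? "\"" + doubled + "\"" : doubled;
    }

    // Joins fields with commas and rows with CRLF line breaks.
    public static String toCsv(List<List<String>> rows) {
        return rows.stream()
                .map(row -> row.stream()
                        .map(CsvWriterSketch::escape)
                        .collect(Collectors.joining(",")))
                .collect(Collectors.joining("\r\n"));
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Id", "Name", "Note"),
                Arrays.asList("1", "Smith, John", "deck under 36\""));
        System.out.println(toCsv(rows));
    }
}
```

The resulting string can be written to a file with a .csv extension, for example with java.nio.file.Files.write. For a real List<POJO> you would first turn each bean into a row of strings, by hand or with one of the mapping libraries.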
As a previous answer suggests, CSV is an easy way to do this, but Excel has a habit of inferring data types - for example, if a string looks like a number, it will be formatted as a number, even if you have double-quoted it. If you want more control, you can try generating Excel XML, which in your case may be using a template, and generating a table that looks a little bit like an HTML table. See an example of a simple Excel XML document.
You can try ssio:
public class Player {
    @SsColumn(index = 0, name = "Id")
    private long id;
    @SsColumn(index = 1) // the column name will be decided as "Birth Country"
    private String birthCountry;
    @SsColumn(index = 2, typeHandler = FullNameTypeHandler.class) //complex prop type
    private FullName fullName;
    @SsColumn(index = 3) //The enum's name() will be saved. Otherwise, use a typeHandler
    private SportType sportType;
    @SsColumn(index = 4, format = "yyyy/MM/dd") //date format
    private LocalDate birthDate;
    @SsColumn(index = 5, typeHandler = TimestampAsMillisHandler.class)
    //if you prefer saving timestamp as number
    private LocalDateTime createdWhen;
...
}
SaveParam<Player> saveParam =
//Excel-like file. For CSV, use "new CsvSaveParamBuilder()"
new OfficeSaveParamBuilder<Player>()
.setBeanClass(Player.class)
.setBeans(players)
.setOutputTarget(outputStream)
.build();
SsioManager ssioManager = SsioManagerFactory.newInstance();
SaveResult saveResult = ssioManager.save(saveParam);
