Java POJO to/from CSV, using field names as column titles


I’m looking for a Java library that can read/write a list of “simple objects” from/to a CSV file.
Let’s define a “simple object” as a POJO whose fields are all primitive types or strings.
The matching between an object’s field and a CSV column must be defined according to the name of the field and the title (first row) of the column - the two must be identical. No additional matching information should be required by the library! Such additional matching information is a horrible code duplication (with respect to the definition of the POJO class) if you simply want the CSV titles to match the field names.
This last feature is something I’ve failed to find in all the libraries I looked at: OpenCSV, Super CSV and BeanIO.
Thanks!!
Ofer

uniVocity-parsers does not require you to provide the field names in your class, but it uses annotations if you need to specify a different name, or a data manipulation to be performed. It is also way faster than the other libraries you tried:
class TestBean {
    @NullString(nulls = { "?", "-" }) // if the value parsed in the quantity column is "?" or "-", it will be replaced by null.
    @Parsed(defaultNullRead = "0") // if a value resolves to null, it will be converted to the String "0".
    private Integer quantity; // The attribute name will be matched against the column header in the file automatically.

    @Trim
    @LowerCase
    @Parsed
    private String comments;
    ...
}
To parse:
BeanListProcessor<TestBean> rowProcessor = new BeanListProcessor<TestBean>(TestBean.class);
CsvParserSettings parserSettings = new CsvParserSettings();
parserSettings.setRowProcessor(rowProcessor);
parserSettings.setHeaderExtractionEnabled(true);
CsvParser parser = new CsvParser(parserSettings);
//And parse!
//this submits all rows parsed from the input to the BeanListProcessor
parser.parse(new FileReader(new File("/examples/bean_test.csv")));
List<TestBean> beans = rowProcessor.getBeans();
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
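For illustration only, the name-based matching the question asks for can be sketched with plain reflection. This is not how uniVocity-parsers (or any other library) is implemented; the class HeaderFieldMapper, its methods and the Item POJO are hypothetical names made up for this sketch, and it assumes cells contain no quoted commas and that the POJO has a no-argument constructor.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any CSV library.
class HeaderFieldMapper {

    // Example POJO: the field names double as the CSV column titles.
    static class Item {
        String name;
        int quantity;
    }

    // Builds one bean per data row, matching each header cell to the
    // declared field of the same name. Naive split: no quoted commas.
    static <T> List<T> read(List<String> lines, Class<T> type) {
        String[] header = lines.get(0).split(",", -1);
        List<T> beans = new ArrayList<>();
        try {
            for (String line : lines.subList(1, lines.size())) {
                String[] cells = line.split(",", -1);
                T bean = type.getDeclaredConstructor().newInstance();
                for (int i = 0; i < header.length && i < cells.length; i++) {
                    Field field = type.getDeclaredField(header[i].trim());
                    field.setAccessible(true);
                    field.set(bean, convert(cells[i].trim(), field.getType()));
                }
                beans.add(bean);
            }
        } catch (ReflectiveOperationException e) {
            // Unknown column title, missing constructor, or inaccessible field.
            throw new IllegalArgumentException("Cannot map CSV to " + type.getName(), e);
        }
        return beans;
    }

    // Converts a cell to the field's type; only a few types are covered.
    static Object convert(String text, Class<?> target) {
        if (target == int.class || target == Integer.class) return Integer.valueOf(text);
        if (target == long.class || target == Long.class) return Long.valueOf(text);
        if (target == double.class || target == Double.class) return Double.valueOf(text);
        if (target == boolean.class || target == Boolean.class) return Boolean.valueOf(text);
        return text; // fall back to the raw string
    }
}
```

Calling HeaderFieldMapper.read(lines, HeaderFieldMapper.Item.class) with the lines "name,quantity" and "apple,3" would yield one Item. A real CSV parser is still needed for quoting, escaping and multi-line values.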

Related

ParserException in Manchester Syntax

I am trying to use ManchesterOWLSyntaxParser from the OWL API. I need to convert a string in Manchester syntax to an OWL axiom that I can add to an existing ontology. The problem is that I always get a ParserException (something like the one below):
Exception in thread "main" org.semanticweb.owlapi.manchestersyntax.renderer.ParserException: Encountered Class: at line 1 column 1. Expected one of:
Class name
Object property name
Data property name
inv
Functional
inverse
InverseFunctional
(
Asymmetric
Transitive
Irreflexive
{
Symmetric
Reflexive
at org.semanticweb.owlapi.manchestersyntax.parser.ManchesterOWLSyntaxParserImpl$ExceptionBuilder.build(ManchesterOWLSyntaxParserImpl.java:2802)
at org.semanticweb.owlapi.manchestersyntax.parser.ManchesterOWLSyntaxParserImpl.parseAxiom(ManchesterOWLSyntaxParserImpl.java:2368)
at Main.main(Main.java:29)
I have read about Manchester syntax on the W3C website, but I don't know where the problem is. Maybe the Manchester parser should be used in a different way.
Here is the code, with an example of a string in Manchester syntax that I tried to parse:
OWLOntology o = ontologyManager.loadOntologyFromOntologyDocument(new File("family.owl"));
OWLDataFactory df = o.getOWLOntologyManager().getOWLDataFactory();
ManchesterOWLSyntaxParser parser = new ManchesterOWLSyntaxParserImpl(ontologyManager.getOntologyConfigurator(), df);
parser.setStringToParse("Class: <somePrefix#Father>" +
" EquivalentTo: \n" +
" <somePrefix#Male>\n" +
" and <somePrefix#Parent>");
OWLAxiom ax = parser.parseAxiom();

The ontology does not have declarations for the classes and properties in the fragment. The parser cannot parse the fragment without knowing the entities involved.
Just like parsing a whole ontology, classes, properties and data types need declaration axioms in the ontology object.

Read a text File which has numeric columns (like -10.0, -9.9, +9.9 etc.) through Apache Flink

I have a requirement to read a file that is generated by another application. The file has 201 columns with numeric names: -10.0, -9.9, -9.8, -9.7, ..., 0, ..., +9.7, +9.8, +9.9, +10.0. I already read many files through Flink, but those files have string column names, and I create a model object whose attributes match the column names in the file, as below:
DataSet<Person> csvInput = env.readCsvFile("file:///path/to/my/textfile")
.pojoType(Person.class, "name", "age", "zipcode");
The code above reads the file and populates Person objects with the values in the file.
The challenge with the new requirement is that the column names are numeric, and in Java I cannot declare a variable whose name is a number with a decimal point, such as -10.0.
For example, private String -10.0 is not allowed in Java.
I am looking for a solution; could anyone please help me out?

Modify the metamodel's schema to change/rename column names

I am using Apache MetaModel to get the schema information. There is one use case where I need to create a CsvDataContext object for a CSV file with no header. I have the column names in a separate data structure (List<String> colNames).
The context object gives column names as "A", "B", "C", etc. I guess metamodel assigns some default column names to the tables with no headers.
Is there any way to modify the schema which is held by the CsvDataContext object?
I believe UpdateableDataContext should work, but the documentation doesn't expose any method that allows modifying the metadata like column name.
How is it possible to achieve this scenario?

When you create your CsvDataContext, you specify a CsvConfiguration. One of the options in the CsvConfiguration is to provide a ColumnNamingStrategy. The default strategy is indeed to use alphabetic characters: A, B, C, etc. But you can use a custom naming strategy, like this:
ColumnNamingStrategy columnNamingStrategy =
ColumnNamingStrategies.customNames("id", "foo", "bar", "baz");
CsvConfiguration configuration = new CsvConfiguration(
0, columnNamingStrategy, "UTF-8", ',', '"', '\\', true, false);
return new CsvDataContext(file, configuration);

Apache common CSV formatter: IOException: invalid char between encapsulated token and delimiter

I am trying to parse a CSV file using JakartaCommons-csv
Sample input file
Field1,Field2,Field3,Field4,Field5
"Ryan, R"u"bianes"," dummy#gmail.com","29445","626","South delhi, Rohini 122001"
Formatter: CSVFormat.newFormat(',').withIgnoreEmptyLines().withQuote('"')
CSV_DELIMITER is ,
Output
Field1 value after CSV parsing should be : Ryan, R"u"bianes
Field5 value after CSV parsing should be : South delhi, Rohini 122001
Exception: Caused by: java.io.IOException: (line 2) invalid char between encapsulated token and delimiter

The problem is that your file is not following the accepted standard for quoting in CSV files. The correct way to represent a quote in a quoted string is by repeating the quote. For example:
Field1,Field2,Field3,Field4,Field5
"Ryan, R""u""bianes"," dummy#gmail.com","29445","626","South delhi, Rohini 122001"
If you restrict yourself to the standard form of CSV quoting, the Apache Commons CSV parser should work.
Unfortunately, it is not feasible to write a consistent parser for your variant format because there is no way to disambiguate an embedded comma and a field separator if you need to represent a field containing "Ryan R","baines".
The rules for quoting in CSV files are set out in various places including RFC 4180.
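As a minimal sketch of the quote-doubling rule described above (not the Commons CSV implementation), the following writes a field in the standard form and reads it back. The class Rfc4180Quoting and its methods are hypothetical names used only for this illustration, and the splitter handles a single record without embedded line breaks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating RFC 4180 quoting; not a full CSV parser.
class Rfc4180Quoting {

    // Wraps a value in quotes and doubles every embedded quote.
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Splits one record, treating "" inside a quoted field as a literal quote.
    static List<String> split(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < record.length() && record.charAt(i + 1) == '"') {
                    current.append('"'); // doubled quote becomes one literal quote
                    i++;                 // skip the second quote of the pair
                } else if (c == '"') {
                    inQuotes = false;    // closing quote of the field
                } else {
                    current.append(c);   // commas are plain data inside quotes
                }
            } else if (c == '"') {
                inQuotes = true;         // opening quote of a field
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());  // last field has no trailing comma
        return fields;
    }
}
```

With this rule the embedded comma and the embedded quotes are unambiguous: a quote followed by another quote is data, and a quote followed by a comma ends the field.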

The problem here is that the quotes are not properly escaped, and your parser doesn't handle that. Try univocity-parsers, as this is the only parser for Java I know of that can handle unescaped quotes inside a quoted value. It is also 4 times faster than Commons CSV. Try this code:
//configure the parser to handle your situation
CsvParserSettings settings = new CsvParserSettings();
settings.setHeaderExtractionEnabled(true); //uses first line as headers
settings.setUnescapedQuoteHandling(UnescapedQuoteHandling.STOP_AT_CLOSING_QUOTE);
settings.trimQuotedValues(true); //trim whitespace around values in quotes
//create the parser
CsvParser parser = new CsvParser(settings);
String input = "" +
"Field1,Field2,Field3,Field4,Field5\n" +
"\"Ryan, R\"u\"bianes\",\" dummy#gmail.com\",\"29445\",\"626\",\"South delhi, Rohini 122001\"";
//parse your input
List<String[]> rows = parser.parseAll(new StringReader(input));
//print the parsed values
for(String[] row : rows){
for(String value : row){
System.out.println('[' + value + ']');
}
System.out.println("-----");
}
This will print:
[Ryan, R"u"bianes]
[dummy#gmail.com]
[29445]
[626]
[South delhi, Rohini 122001]
-----
Hope it helps.
Disclosure: I'm the author of this library, it's open source and free (Apache 2.0 license)

Working out a Google App engine entity's property data type

I have this entity and I'm trying to determine the type of its properties, preferably in terms of Google App Engine's internal data types (as opposed to Java data types).
The below code is obviously simplified. In reality I do not know the entity's properties or anything else about it.
final DatastoreService dss = DatastoreServiceFactory.getDatastoreService();
final Query query = new Query("Person");
final PreparedQuery pq = dss.prepare(query);
for (Entity entity : pq.asIterable())
{
final Object property = entity.getProperty("some_property");
// Here I want to determine which data type 'property' represents - GAE-wise.
}
In App Engine's Java code I've found some hints:
DataTypeTranslator
DataTypeTranslator.typeMap (internal private member)
Property.Meaning.GD_PHONENUMBER
I'm unable to link those together into what I need - some sort of reflection.
I wish I was able to do something like this:
entity.getPropertyType("some_property");
Does anyone know better?
DataTypeTranslator source code here
Edit #1: <<
IGNORE this one. It's me who put these postfixes (I was confused by the doc).
Here's more important info I've found.
I'm getting it in Eclipse's tooltip mini-window when I hover over an entity (one which I just fetched from the Datastore).
The Datastore seems to send it (this payload) as raw text which is nice, maybe I'll have to parse it (but, how do I get it from code LOL).
Pay attention to the types in here, it's written plain simple.
Here it is:
<Entity [Bird(9)]:
Int64Type:44rmna4kc2g23i9brlupps74ir#Int64Type = 1234567890
String:igt7qvk9p89nc3gjqn9s3jq69c = 7tns1l48vpttq5ff47i3jlq3f9
PhoneNumber:auih50aecl574ud23v9h4rfvt1#PhoneNumberType = 03-6491234
Date:k1qstkn9np0mpb6fp41cj6i3am = Wed Jul 20 23:03:13 UTC 2011
>
For example, property named String:igt7qvk9p89nc3gjqn9s3jq69c has the value of 7tns1l48vpttq5ff47i3jlq3f9 and it doesn't tell its type. Also property Date:k1qstkn9np0mpb6fp41cj6i3am.
Property named Int64Type:44rmna4kc2g23i9brlupps74ir has the value of "1234567890" and here it strictly mentions that the data type is of "Int64Type".

I'm searching for it too.
It's a bit of a hack, but at least my output includes the type (without needing a secret decoder ring). But my code is slightly different:
Query allusersentityquery = new Query();
allusersentityquery.setAncestor(userKey);
for (final Entity entity : datastore.prepare(allusersentityquery).asIterable()) {
Map<String, Object> properties = entity.getProperties();
String[] propertyNames = properties.keySet().toArray(
new String[properties.size()]);
for(final String propertyName : propertyNames) {
// propertyNames string contains
// "com.google.appengine.api.datastore.PostalAddress" if it is a Postal Address
}
}
There seems to be no documentation about determining the property types here.
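One way to read the approach above is to look at the runtime Java class of each property value. The sketch below uses a plain Map<String, Object> as a stand-in for the map returned by entity.getProperties(), so it runs without the App Engine SDK; PropertyTypeSketch and describe are hypothetical names. Note that this reports the Java class of the value, not the Datastore's internal meaning.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the input map stands in for entity.getProperties().
class PropertyTypeSketch {

    // Maps each property name to the fully qualified class name of its value.
    static Map<String, String> describe(Map<String, Object> properties) {
        Map<String, String> types = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            Object value = entry.getValue();
            // A null value carries no type information at all.
            types.put(entry.getKey(), value == null ? "null" : value.getClass().getName());
        }
        return types;
    }
}
```

With a real entity, a value stored as a postal address would presumably show up under its SDK class name, as the comment in the answer's code suggests.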
