beanIO: identify different records with literal

SITUATION:
I use BeanIO 2.1.0 to read a CSV file into different kinds of objects.
This is my CSV file, a list of animals (color, type, number of legs).
The list also contains animals without a type (last row).
brown;cat;4
white;dog;4
brown;dog;4
black;;8
I want to read the CSV file into different animal objects.
If the type is 'cat', it should become a Cat object. The same goes for dog.
If the type is neither cat nor dog, e.g. empty or an unknown animal type, it should become an Animal object.
Here is the corresponding BeanIO mapping:
<beanio xmlns="http://www.beanio.org/2012/03" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">
<stream name="animalFile" format="csv" >
<parser>
<property name="delimiter" value=";"/>
</parser>
<record name="animal" class="zoo.Cat">
<field name="color" />
<field name="type" rid="true" literal="cat"/>
<field name="legs"/>
</record>
<record name="animal" class="zoo.Dog">
<field name="color" />
<field name="type" rid="true" literal="dog"/>
<field name="legs"/>
</record>
<record name="animal" class="zoo.Animal" >
<field name="color" />
<field name="type"/>
<field name="legs"/>
</record>
</stream>
</beanio>
My program reads the csv-file, parses it with beanIO and calls the toString-method of the parsed objects.
This is the output. It looks fine:
CAT: brown;cat;4
DOG: white;dog;4
DOG: brown;dog;4
ANIMAL: black;;8
PROBLEM:
Now I just change the order of the animals in the CSV file.
The unknown animal type is now in the second row:
brown;cat;4
black;;8
white;dog;4
brown;dog;4
This is the new output!
Once the first unknown animal is found, all the following rows are also read as unknown animals.
CAT: brown;cat;4
ANIMAL: black;;8
ANIMAL: white;dog;4
ANIMAL: brown;dog;4
QUESTION:
Is it a bug in beanIO or can I configure it in the beanIO-mapping?

EDIT: Updated answer after comments from OP.
This is not a bug in BeanIO. There are two ways to identify a record. First, there is the literal attribute, which you have used so far. Second, you can use a regular expression (the regex attribute).
You want to match an Animal object when the type field is neither cat nor dog, or, as you stated, when it is empty.
The type field definition for the Animal record could be either of the following two.
<field name="type" rid="true" regex="\s*" />
This matches whenever the type field is empty or contains only whitespace, as defined by Java regular expressions.
OR
<field name="type" rid="true" regex="^(?:(?!\b(cat|dog)\b).)*$" />
This matches any record where the type field doesn't contain cat or dog as a whole word.
Try it with this Animal record:
<record name="animal" class="zoo.Animal" >
<field name="color" />
<field name="type" rid="true" regex="^(?:(?!\b(cat|dog)\b).)*$" />
<field name="legs"/>
</record>
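To see how the two regular expressions from this answer behave on sample type values, here is a minimal plain-Java sketch. It does not use BeanIO at all. The class name RecordIdSketch and the classify helper are made up for illustration, and it assumes the regex has to match the whole field text, which is why it calls matches(); check the BeanIO reference guide if that detail matters for your data.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of BeanIO.
class RecordIdSketch {
    // The two regexes suggested in the answer, escaped for a Java string literal.
    static final Pattern BLANK = Pattern.compile("\\s*");
    static final Pattern NOT_CAT_OR_DOG = Pattern.compile("^(?:(?!\\b(cat|dog)\\b).)*$");

    // Mimics the three record definitions: literal "cat", literal "dog",
    // and the regex-identified Animal record.
    static String classify(String type) {
        if (type.equals("cat")) return "CAT";
        if (type.equals("dog")) return "DOG";
        // Assumption: the regex must match the entire field text.
        if (NOT_CAT_OR_DOG.matcher(type).matches()) return "ANIMAL";
        return "UNIDENTIFIED";
    }

    public static void main(String[] args) {
        // The type column of the reordered file from the question.
        for (String type : new String[] {"cat", "", "dog", "dog"}) {
            System.out.println("'" + type + "' -> " + classify(type));
        }
    }
}
```

Note that a value such as "big cat" is neither the literal cat nor accepted by the second regex, so in this sketch it ends up unidentified. The simpler \s* pattern only accepts empty or whitespace-only values.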
Off-topic. Technically you are not reading a CSV file because then your delimiter must be a comma. Instead, you have a delimited format which uses a semi-colon (;) as a delimiter.
I would also suggest that you make the names of your record definitions unique in your xml mapping file. The record name is used in error messages for reporting the location of a problem. If you have the same record name for all records, you will not know where to look for the problem.

Related

Key-value mapping in BeanIO fixed-length record

I have the following specification for a fixed-length data file (refer to record-C type of specification, page 4)
a second part, having a length of 1,800 characters, consisting of a table of 75 elements to be used for the display of the only data present in the communication; each of these elements is constituted by a field-code
of 8 characters and by a field-value of 16 characters
It means that the first 89 characters (omitted in the above summary) are plain old fixed-length and then, for the remaining 1800, I have to take them into groups of key-value pairs each counting up to 24 characters. Blank spaces are trimmed and empty pairs are not considered in the process.
Ideally, my bean may be constructed like
public class RecordC{
private List<Pair<String, String>> table = new ArrayList<>(MAX_TABLE_SIZE); //I don't want to use Map **yet**
}
Something can be e.g. Apache Common's Pair<String,String> or anything suitable for KVP mapping.
I understand that I can create a whole TypeHandler that takes the full 1800 bytes but I wanted to exploit the power of BeanIO.
Here is what I have done so far
<record name="RECORD_C" class="it.csttech.ftt.data.beans.ftt2017.RecordC" order="3" minOccurs="1" maxOccurs="1" maxLength="2000">
<field name="tipoRecord" rid="true" at="0" ignore="true" required="true" length="1" lazy="true" literal="C" />
<field name="cfContribuente" at="1" length="16" align="left" trim="true" lazy="true" />
<field name="progressivoModulo" at="17" length="8" padding="0" align="right" trim="true" lazy="true" />
<field name="spazioDisposizioneUtente" at="25" length="3" align="left" trim="true" lazy="true" />
<field name="spazioUtente" at="53" length="20" align="left" trim="true" lazy="true" />
<field name="cfProduttoreSoftware" at="73" length="16" align="left" trim="true" lazy="true" />
<segment name="table" collection="list" lazy="true" class="org.apache.commons.lang3.tuple.ImmutablePair">
<field name="key" type="java.lang.String" at="0" length="8" trim="true" lazy="true" setter="#1" />
<field name="value" type="java.lang.String" at="8" length="16" trim="true" lazy="true" setter="#2" />
</segment>
<field name="terminatorA" at="1897" length="1" rid="true" literal="A" ignore="true" />
</record>
Unfortunately, this does not work in testing. I get only a single record in the list, decoded at positions [0-7] and [8-23] instead of the expected [89-113][114-???][....][....]
The question is: how do I declare repeating fixed-length fields in BeanIO?

I have resolved my unmarshalling problem for now by removing all at attributes from the RecordC specification. As I found out, the "at" attribute is absolute to the record, not relative to the repeating segment. The only cost is that I had to add a few ignored fields to the mapping.
I will test the marshalling against the official controller once I have data.
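For reference, the layout arithmetic described in the question (an 89-character prefix, then up to 75 pairs of an 8-character code and a 16-character value) can be sketched in plain Java. This is not a BeanIO mapping or a TypeHandler, just an illustration of where each pair sits in the record. The class name RecordCTableSketch and the choice of Map.Entry as the pair type are assumptions made for the example, as is treating a pair as empty only when both code and value are blank.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, independent of BeanIO.
class RecordCTableSketch {
    static final int TABLE_OFFSET = 89;    // plain fixed-length fields come first
    static final int KEY_LENGTH = 8;       // field-code
    static final int VALUE_LENGTH = 16;    // field-value
    static final int PAIR_LENGTH = KEY_LENGTH + VALUE_LENGTH; // 24
    static final int MAX_TABLE_SIZE = 75;  // 75 * 24 = 1800 characters

    static List<Map.Entry<String, String>> parseTable(String record) {
        List<Map.Entry<String, String>> table = new ArrayList<>();
        for (int i = 0; i < MAX_TABLE_SIZE; i++) {
            // Absolute position of pair i within the whole record.
            int start = TABLE_OFFSET + i * PAIR_LENGTH;
            if (start + PAIR_LENGTH > record.length()) {
                break; // record is shorter than the full table
            }
            String key = record.substring(start, start + KEY_LENGTH).trim();
            String value = record.substring(start + KEY_LENGTH, start + PAIR_LENGTH).trim();
            // Assumption: a pair counts as empty when both parts are blank.
            if (key.isEmpty() && value.isEmpty()) {
                continue;
            }
            table.add(new SimpleImmutableEntry<>(key, value));
        }
        return table;
    }
}
```

The first pair starts at position 89 and the second at 113, which matches the point made above that positions are absolute to the record rather than relative to the repeating segment.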

Mapping both xml element and its attribute using BeanIO

I would like to map the totalAmt tag in the XML file below, both its value 100 and its attribute Ccy.
<?xml version="1.0" encoding="UTF-8"?>
<transaction>
<id>
<eId>transactionId001</eId>
</id>
<amount>
<totalAmt Ccy="XXX">100</totalAmt>
</amount>
</transaction>
From reading the BeanIO reference guide and posts here, I got the impression that only one of them can be mapped.
So my question is: can BeanIO handle this tag, and could you show me how?
What I have tried, which didn't work:
<segment name="amount">
<field name="totalAmount" xmlName="totalAmt"></field>
<field name="currency" xmlName="Ccy" xmlType="attribute"></field>
</segment>

Close, but you still need to add a nested segment element inside the segment tag to tell BeanIO which element the attribute belongs to.
Example:
<segment name="amount">
<field name="totalAmount" xmlName="totalAmt"></field>
<segment name="totalAmt">
<field name="type" xmlName="Ccy" xmlType="attribute"></field>
</segment>
</segment>

I am using BeanIO version 2.1, with this segment:
<segment name="totalAmt">
<field name="totalAmount" xmlType="text"></field> <!-- the bean property "totalAmount" receives the element text, e.g. 100 -->
<field name="Cctype" xmlName="Ccy" xmlType="attribute" default="XXX"></field> <!-- either set a default value such as XXX, or the value is taken from the Cctype property -->
</segment>

BeanIO - Expected minimum 1 occurrences

I recently upgraded BeanIO to version 2.0.6 to parse my flat (tab-delimited) files into Java objects, and I noticed some weird behavior.
I can't leave fields null at the end of the last line of the file, because BeanIO throws this error message at me: "Expected minimum 1 occurrences."
I even tried setting maxLength to 4 on the entire record so that it accounts for the extra null field at the end, but it still throws that exception. What's strange is that it only does this for the last line and not for null fields in the other lines.
Mapping:
<beanio xmlns="http://www.beanio.org/2012/03"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">
<stream name="Inventory" format="delimited" strict="true" resourceBundle="com.crunchtime.mapping.cdp.Inventory">
<record name="myRecord" minOccurs="1" maxOccurs="unbounded" minLength="0" maxLength="4" class="com.test.Record">
<field name="userName" type="string"/>
<field name="userId" type="string"/>
<field name="type" type="string"/>
<field name="version" type="string"/>
</record>
</stream>
</beanio>
File:
Mark User1 M 1.0
Tom User2 D 1.1
Jim User3 M 2.0
Scott User4 G
Does anybody have any ideas on how to disable that behavior?
I looked at beanio.properties, but I can't modify it since it's locked.

Using BeanIO 2.0 or later, you must configure fields that may not be present in the input stream with minOccurs="0".
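To find out which lines of a tab-delimited file are missing trailing fields, and therefore which fields would need minOccurs="0", a small plain-Java check can help. This is only a sketch and does not use BeanIO. The class name TrailingFieldSketch and its methods are hypothetical, and the expected count of 4 comes from the four fields in the mapping above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper, independent of BeanIO.
class TrailingFieldSketch {
    static final int EXPECTED_FIELDS = 4; // userName, userId, type, version

    // A limit of -1 keeps trailing empty strings, so a trailing tab
    // still counts as an (empty) last field.
    static int fieldCount(String line) {
        return line.split("\t", -1).length;
    }

    // Returns the 1-based numbers of lines that have fewer fields than expected.
    static List<Integer> shortLines(List<String> lines) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (fieldCount(lines.get(i)) < EXPECTED_FIELDS) {
                result.add(i + 1);
            }
        }
        return result;
    }
}
```

A line such as "Scott", "User4", "G" joined by tabs yields three fields, while the same line with a trailing tab yields four with an empty last field. If only the last line of the file lacks the trailing tab, that could explain why only that line fails, but this is a guess and not something the question confirms.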

Unmarshalling list of objects using castor gives java.lang.IllegalArgumentException: object is not an instance of declaring class

I'm using Castor 1.3.3-rc1 and I've been puzzled by this problem. I have read the manuals a few times and I believe I've done everything right here, but I keep getting:
java.lang.IllegalArgumentException: object is not an instance of declaring class{File: [not available]; line: 4; column: 43}
when unmarshalling my xml.
These are my java classes:
public class ReportConfiguration {
private List<ColumnMapping> columnMappings;
// getters and setters omitted
}
public class ColumnMapping {
private int index;
private String label;
private String sumTotal;
// getters and setters omitted
}
This is my xml data file which will be unmarshalled into java classes above
<reportConfiguration>
<columnMappings>
<columnMapping index="0" label="Login"/>
<columnMapping index="1" label="Group"/>
<columnMapping index="2" label="Profit" sumTotal="yes"/>
</columnMappings>
</reportConfiguration>
And this is my castor mapping file
<mapping>
<class name="my.company.ReportConfiguration">
<map-to xml="reportConfiguration"/>
<field name="columnMappings" collection="arraylist" type="my.company.ColumnMapping">
<bind-xml name="columnMappings"/>
</field>
</class>
<class name="my.company.ColumnMapping">
<map-to xml="columnMapping"/>
<field name="index" type="integer" required="true">
<bind-xml name="index" node="attribute"/>
</field>
<field name="label" type="string" required="true">
<bind-xml name="label" node="attribute"/>
</field>
<field name="sumTotal" type="string">
<bind-xml name="sumTotal" node="attribute"/>
</field>
</class>
</mapping>
I used Spring OXM, created an org.springframework.oxm.castor.CastorMarshaller instance in my application context, and injected an Unmarshaller instance as a dependency. When unmarshalling I just do something like this:
ReportConfiguration config = (ReportConfiguration) unmarshaller.unmarshal(new StreamSource(inputStream));
Can anyone spot what I did wrong, or suggest how else I can debug this problem?

Ah, actually I found the answer. I need to supply the container="false" attribute in the Castor mapping:
<field name="columnMappings" collection="arraylist" type="my.company.ColumnMapping" container="false">
<bind-xml name="columnMappings"/>
</field>
This is what castor manual says:
container Indicates whether the field should be treated as a
container, i.e. only it's fields should be persisted, but not the
containing class itself. In this case, the container attribute should
be set to true (supported in Castor XML only).
I think the default is true, in which case Castor expects to find multiple instances of <columnMapping> directly under <reportConfiguration>, not contained inside a <columnMappings> element.
A more helpful error message would have been nice.

Trying to serialize an object compactly using Castor

I'm using Castor to write out a map of user ID's to time intervals. I'm using it to save and resume progress in a lengthy task, and I'm trying to make the XML as compact as possible. My map is from string userID's to a class that contains the interval timestamps, along with additional transient data that I don't need to serialize.
I'm able to use a nested class mapping:
...
<field name="userIntervals" collection="map">
<bind-xml name="u">
<class name="org.exolab.castor.mapping.MapItem">
<field name="key" type="string"><bind-xml name="n" node="attribute"/></field>
<field name="value" type="my.package.TimeInterval"/>
</class>
</bind-xml>
</field>
...
<class name="my.package.TimeInterval">
<map-to xml="ti"/>
<field name="intervalStart" type="long"><bind-xml name="s" node="attribute"/></field>
<field name="intervalEnd" type="long"><bind-xml name="e" node="attribute"/></field>
</class>
...
And get output that looks like:
<u n="36164639"><value s="1292750896000" e="1292750896000"/></u>
What I'd like is the name, start, and end of the user in a single node like this.
<u n="36164639" s="1292750896000" e="1292750896000"/>
But I can't seem to finagle it so the start and end attributes in the "value" go in the same node as the "key". Any ideas would be greatly appreciated.

Nash,
I think arranging the Castor mapping this way is a bit tricky.
If you want a structure like
<u n="36164639" s="1292750896000" e="1292750896000"/>
then you need to create a new POJO class that has
all three fields: key, intervalStart, and intervalEnd.
Name the class KeyTimeInterval
and map it as shown below.
<field name="userIntervals" collection="map">
<class name="org.exolab.castor.mapping.MapItem">
<field name="u" type="my.package.KeyTimeInterval">
<bind-xml name="u" node="element"/>
</field>
</class>
</field>
<class name="my.package.KeyTimeInterval">
<field name="key" type="String">
<bind-xml name="n" node="attribute"/></field>
<field name="intervalStart" type="long">
<bind-xml name="s" node="attribute"/></field>
<field name="intervalEnd" type="long">
<bind-xml name="e" node="attribute"/></field>
</class>

I think you should be able to use the location attribute on s and e. Try this:
...
<class name="my.package.TimeInterval">
<map-to xml="ti"/>
<field name="intervalStart" type="long">
<bind-xml name="s" location="u" node="attribute"/>
</field>
<field name="intervalEnd" type="long">
<bind-xml name="e" location="u" node="attribute"/>
</field>
</class>

I am answering my own question here, since there is a solution that does exactly what I want, and there's actually an error in the explanation at http://www.castor.org/xml-mapping.html#Sample-3:-Using-the-container-attribute - the container attribute is exactly what's needed here.
Changing one line in the mapping:
<field name="value" type="my.package.TimeInterval" container="true"/>
did exactly what I wanted: it didn't create a subelement for the value, it just mapped the fields into the existing parent element. Since then, I've used this quite a few times to map multiple-value classes into their parent.
The error is that the documentation states you do this by setting the container attribute to false. It should be true.
