Key-value mapping in BeanIO fixed-length record - java

I have the following specification for a fixed-length data file (refer to the record-C type specification, page 4):
a second part, 1,800 characters long, consisting of a table of 75 elements used to display only the data present in the communication; each of these elements is made up of a field-code of 8 characters and a field-value of 16 characters
This means that the first 89 characters (omitted in the above summary) are plain old fixed-length fields, and the remaining 1,800 have to be split into key-value pairs of 24 characters each (an 8-character key plus a 16-character value). Blank space is trimmed and empty pairs are not considered in the process.
Ideally, my bean would be constructed like this:
public class RecordC {
    private List<Pair<String, String>> table = new ArrayList<>(MAX_TABLE_SIZE); // I don't want to use Map **yet**
}
The pair type can be e.g. Apache Commons' Pair<String, String> or anything else suitable for KVP mapping.
I understand that I could write a whole TypeHandler that takes the full 1800 characters, but I wanted to exploit the power of BeanIO instead.
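For the record, such a handler would look roughly like the sketch below (untested; the constants, the skipping of empty pairs and the padding logic are my own reading of the spec):

import java.util.ArrayList;
import java.util.List;

import org.apache.commons.lang3.tuple.ImmutablePair;
import org.apache.commons.lang3.tuple.Pair;
import org.beanio.types.TypeConversionException;
import org.beanio.types.TypeHandler;

public class KvpTableHandler implements TypeHandler {

    private static final int PAIR_LENGTH = 24;    // 8-char key + 16-char value
    private static final int KEY_LENGTH = 8;
    private static final int TABLE_LENGTH = 1800; // 75 pairs

    @Override
    public Object parse(String text) throws TypeConversionException {
        List<Pair<String, String>> table = new ArrayList<>();
        for (int i = 0; i + PAIR_LENGTH <= text.length(); i += PAIR_LENGTH) {
            String key = text.substring(i, i + KEY_LENGTH).trim();
            String value = text.substring(i + KEY_LENGTH, i + PAIR_LENGTH).trim();
            if (!key.isEmpty() || !value.isEmpty()) { // empty pairs are not considered
                table.add(new ImmutablePair<>(key, value));
            }
        }
        return table;
    }

    @Override
    public String format(Object value) {
        @SuppressWarnings("unchecked")
        List<Pair<String, String>> table = (List<Pair<String, String>>) value;
        StringBuilder sb = new StringBuilder(TABLE_LENGTH);
        for (Pair<String, String> pair : table) {
            // left-align key and value in their 8- and 16-char slots
            sb.append(String.format("%-8s%-16s", pair.getLeft(), pair.getRight()));
        }
        while (sb.length() < TABLE_LENGTH) { // pad unused slots with blanks
            sb.append(' ');
        }
        return sb.toString();
    }

    @Override
    public Class<?> getType() {
        return List.class;
    }
}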
Here is what I have done so far:
<record name="RECORD_C" class="it.csttech.ftt.data.beans.ftt2017.RecordC" order="3" minOccurs="1" maxOccurs="1" maxLength="2000">
<field name="tipoRecord" rid="true" at="0" ignore="true" required="true" length="1" lazy="true" literal="C" />
<field name="cfContribuente" at="1" length="16" align="left" trim="true" lazy="true" />
<field name="progressivoModulo" at="17" length="8" padding="0" align="right" trim="true" lazy="true" />
<field name="spazioDisposizioneUtente" at="25" length="3" align="left" trim="true" lazy="true" />
<field name="spazioUtente" at="53" length="20" align="left" trim="true" lazy="true" />
<field name="cfProduttoreSoftware" at="73" length="16" align="left" trim="true" lazy="true" />
<segment name="table" collection="list" lazy="true" class="org.apache.commons.lang3.tuple.ImmutablePair">
<field name="key" type="java.lang.String" at="0" length="8" trim="true" lazy="true" setter="#1" />
<field name="value" type="java.lang.String" at="8" length="16" trim="true" lazy="true" setter="#2" />
</segment>
<field name="terminatorA" at="1897" length="1" rid="true" literal="A" ignore="true" />
</record>
Unfortunately, this does not work in testing: I get only a single pair in the list, decoded at positions [0-7] and [8-23] instead of the expected [89-113][114-???][....][....].
The question is: how do I declare repeating fixed-length fields in BeanIO?

I have now resolved my unmarshalling problem by removing all at attributes from the RECORD_C specification. As I found out, the at attribute is absolute to the record, not relative to the repeating segment. This forced me to add a few ignored filler fields to the mapping, a small cost.
I will test the marshalling against the official controller once I have data.
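For reference, the reworked record now looks roughly like this (a sketch; the filler names, filler lengths and occurs bounds are my reconstruction of the layout, still to be validated):

<record name="RECORD_C" class="it.csttech.ftt.data.beans.ftt2017.RecordC" order="3" minOccurs="1" maxOccurs="1" maxLength="2000">
    <field name="tipoRecord" rid="true" ignore="true" required="true" length="1" literal="C" />
    <field name="cfContribuente" length="16" align="left" trim="true" lazy="true" />
    <field name="progressivoModulo" length="8" padding="0" align="right" trim="true" lazy="true" />
    <field name="spazioDisposizioneUtente" length="3" align="left" trim="true" lazy="true" />
    <field name="filler1" length="25" ignore="true" /> <!-- hypothetical filler up to position 53 -->
    <field name="spazioUtente" length="20" align="left" trim="true" lazy="true" />
    <field name="cfProduttoreSoftware" length="16" align="left" trim="true" lazy="true" />
    <segment name="table" collection="list" minOccurs="0" maxOccurs="75" lazy="true" class="org.apache.commons.lang3.tuple.ImmutablePair">
        <field name="key" type="java.lang.String" length="8" trim="true" lazy="true" setter="#1" />
        <field name="value" type="java.lang.String" length="16" trim="true" lazy="true" setter="#2" />
    </segment>
    <field name="filler2" length="8" ignore="true" /> <!-- hypothetical filler before the terminator -->
    <field name="terminatorA" rid="true" length="1" literal="A" ignore="true" />
</record>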

Related

beanIO: identify different records with literal

SITUATION:
I use beanIO 2.1.0 to read a csv file into different kinds of objects.
This is my csv file, a list of animals (color, type, number of legs).
The list also contains animals without a type (last row).
brown;cat;4
white;dog;4
brown;dog;4
black;;8
I want to read the csv file into different animal objects.
If the type is 'cat' it should be a Cat object; the same with dog.
If the type isn't cat or dog, e.g. empty or an unknown animal type, then it should be a plain Animal object.
Here is the corresponding beanIO mapping:
<beanio xmlns="http://www.beanio.org/2012/03" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.beanio.org/2012/03 http://www.beanio.org/2012/03/mapping.xsd">
    <stream name="animalFile" format="csv">
        <parser>
            <property name="delimiter" value=";"/>
        </parser>
        <record name="animal" class="zoo.Cat">
            <field name="color" />
            <field name="type" rid="true" literal="cat"/>
            <field name="legs"/>
        </record>
        <record name="animal" class="zoo.Dog">
            <field name="color" />
            <field name="type" rid="true" literal="dog"/>
            <field name="legs"/>
        </record>
        <record name="animal" class="zoo.Animal">
            <field name="color" />
            <field name="type"/>
            <field name="legs"/>
        </record>
    </stream>
</beanio>
My program reads the csv file, parses it with beanIO and calls the toString method of the parsed objects.
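The reading code is the usual BeanIO boilerplate, roughly like this sketch (the mapping and csv file names here are made up):

import java.io.File;

import org.beanio.BeanReader;
import org.beanio.StreamFactory;

public class ZooReader {
    public static void main(String[] args) {
        StreamFactory factory = StreamFactory.newInstance();
        factory.load("animal-mapping.xml"); // the mapping shown above
        BeanReader reader = factory.createReader("animalFile", new File("animals.csv"));
        try {
            Object record;
            while ((record = reader.read()) != null) { // a Cat, Dog or Animal
                System.out.println(record);            // toString() produces the output below
            }
        } finally {
            reader.close();
        }
    }
}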
This is the output. It looks fine:
CAT: brown;cat;4
DOG: white;dog;4
DOG: brown;dog;4
ANIMAL: black;;8
PROBLEM:
Now I just change the order of the animals in the csv file, so that the unknown animal type is in the second row:
brown;cat;4
black;;8
white;dog;4
brown;dog;4
This is the new output. Once the first unknown animal is found, all the following rows are also parsed as unknown animals:
CAT: brown;cat;4
ANIMAL: black;;8
ANIMAL: white;dog;4
ANIMAL: brown;dog;4
QUESTION:
Is this a bug in beanIO, or can I fix it in the beanIO mapping?
EDIT: Updated answer after comments from OP.
This is not a bug in BeanIO. You have two options for identifying a record: first, the literal attribute, as you have used so far; second, a regular expression (regex).
You want to match an Animal object when the type field is not cat or dog or, as you stated, when it is empty.
Your type field definition for the Animal record could be either of the following two.
<field name="type" rid="true" regex="\s*" />
This will match whenever the type field is empty or contains only whitespace, as defined by Java regular expressions.
OR
<field name="type" rid="true" regex=""^(?:(?!\b(cat|dog)\b).)*$" />
This will match any record where the type field doesn't contain the words cat or dog.
Try it with this Animal record:
<record name="animal" class="zoo.Animal" >
<field name="color" />
<field name="type" rid="true" regex=""^(?:(?!\b(cat|dog)\b).)*$" />
<field name="legs"/>
</record>
Off-topic: technically you are not reading a CSV file, because then your delimiter would have to be a comma. Instead, you have a delimited format which uses a semicolon (;) as its delimiter.
I would also suggest making the names of your record definitions unique in your XML mapping file. The record name is used in error messages to report the location of a problem; if all records share the same name, you will not know where to look.

How to add Annotations elements in metadata generated by Apache Olingo V2.0?

I have developed an OData service for a System entity which generates metadata, but I can't figure out how to add an Annotations element to it. The generated metadata is as follows:
<?xml version="1.0" encoding="utf-8"?>
<edmx:Edmx xmlns:edmx="http://schemas.microsoft.com/ado/2007/06/edmx" xmlns:sap="http://www.sap.com/Protocols/SAPData" Version="1.0">
    <edmx:DataServices xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" m:DataServiceVersion="1.0">
        <Schema xmlns="http://schemas.microsoft.com/ado/2008/09/edm" Namespace="myNamespace" sap:schema-version="1">
            <EntityType Name="System">
                <Key>
                    <PropertyRef Name="Id" />
                </Key>
                <Property Name="Id" Type="Edm.Int32" Nullable="false" />
                <Property Name="name" Type="Edm.String" sap:label="System Name" sap:creatable="false" sap:updatable="false" sap:sortable="false" sap:required-in-filter="true" />
                <Property Name="description" Type="Edm.String" />
                <Property Name="status" Type="Edm.String" />
                <Property Name="type" Type="Edm.String" />
            </EntityType>
            <EntityContainer Name="ODataEntityContainer" m:IsDefaultEntityContainer="true">
                <EntitySet Name="Systems" EntityType="myNamespace.System" />
                <FunctionImport Name="NumberOfSystems" ReturnType="Collection(myNamespace.System)" m:HttpMethod="GET" />
            </EntityContainer>
        </Schema>
    </edmx:DataServices>
</edmx:Edmx>
I need to add the following elements to the above metadata:
<Annotations Target="myNamespace.System" xmlns="http://docs.oasis-open.org/odata/ns/edm">
    <Annotation Term="com.sap.vocabularies.UI.v1.LineItem">
        <Collection>
            <Record Type="com.sap.vocabularies.UI.v1.DataField">
                <PropertyValue Property="Value" Path="name" />
            </Record>
            <Record Type="com.sap.vocabularies.UI.v1.DataField">
                <PropertyValue Property="Value" Path="description" />
            </Record>
            <Record Type="com.sap.vocabularies.UI.v1.DataField">
                <PropertyValue Property="Value" Path="status" />
            </Record>
        </Collection>
    </Annotation>
</Annotations>
I came across the org.apache.olingo.commons.api.edm.provider.annotation package but can't find any suitable API there. Please let me know how I should proceed.
Thanks in advance.
The annotations you would like to use were introduced with OData V3, which is why they are not directly supported by the Olingo V2 library.
You can, however, use the EdmProvider's AnnotationElement and AnnotationAttribute classes to mimic this behaviour. For example, you can create an AnnotationElement with the name "Annotations"; this element then gets an AnnotationAttribute Target=SomeString. Since an AnnotationElement can have child elements, you can put your Collection element inside it. Namespaces are also handled with AnnotationAttributes.
Note that you can only attach the annotation to Edm elements which implement the EdmAnnotatable interface, so this is a difference from V3.
This is currently the only way to get this behaviour with Olingo V2.
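A rough sketch of how building the block above could look (untested; it assumes the fluent setters of the org.apache.olingo.odata2.api.edm.provider classes and shows only the first Record):

import java.util.Arrays;

import org.apache.olingo.odata2.api.edm.provider.AnnotationAttribute;
import org.apache.olingo.odata2.api.edm.provider.AnnotationElement;

// Inside your EdmProvider, where the Schema is assembled:
AnnotationElement propertyValue = new AnnotationElement().setName("PropertyValue")
        .setAttributes(Arrays.asList(
                new AnnotationAttribute().setName("Property").setText("Value"),
                new AnnotationAttribute().setName("Path").setText("name")));

AnnotationElement record = new AnnotationElement().setName("Record")
        .setAttributes(Arrays.asList(
                new AnnotationAttribute().setName("Type").setText("com.sap.vocabularies.UI.v1.DataField")))
        .setChildElements(Arrays.asList(propertyValue));

AnnotationElement collection = new AnnotationElement().setName("Collection")
        .setChildElements(Arrays.asList(record)); // add the other Records the same way

AnnotationElement annotation = new AnnotationElement().setName("Annotation")
        .setAttributes(Arrays.asList(
                new AnnotationAttribute().setName("Term").setText("com.sap.vocabularies.UI.v1.LineItem")))
        .setChildElements(Arrays.asList(collection));

AnnotationElement annotations = new AnnotationElement().setName("Annotations")
        .setAttributes(Arrays.asList(
                new AnnotationAttribute().setName("Target").setText("myNamespace.System"),
                new AnnotationAttribute().setName("xmlns").setText("http://docs.oasis-open.org/odata/ns/edm")))
        .setChildElements(Arrays.asList(annotation));

// Attach the tree to the schema:
schema.setAnnotationElements(Arrays.asList(annotations));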

How can I create a blob index field correctly with Solr 5?

I think the title of my question explains much of what I need. I am using the Data Import Handler of Apache Solr 5. I configured my solrconfig.xml, schema.xml and data-config.xml, and it's working for now.
However, I need to add one more field. An Oracle Blob field. First, let me show my configurations:
data-config.xml
<dataConfig>
    <!-- Datasource -->
    <dataSource name="myDS"
                setReadOnly="true"
                driver="oracle.jdbc.OracleDriver"
                url="jdbc:oracle:thin:@//server.example.com:1521/service_name"
                user="user"
                password="pass"/>
    <document name="products">
        <entity name="product"
                dataSource="myDS"
                query="select * from products"
                pk="id"
                processor="SqlEntityProcessor">
            <field column="id" name="id" />
            <field column="name" name="name" />
            <field column="price" name="price" />
            <field column="store" name="store" />
            <!-- I've added this blob field -->
            <field column="picture" name="picture" />
        </entity>
    </document>
</dataConfig>
solrconfig.xml
<requestHandler name="/products" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
        <str name="config">data-config.xml</str>
    </lst>
</requestHandler>
<!-- JDBCs -->
<lib dir="../../../lib" />
My fields in schema.xml
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="_version_" type="long" indexed="true" stored="true"/>
<field name="_text" type="string" indexed="true" stored="false" multiValued="true"/>
<field name="name" type="string" indexed="true" stored="true"/>
<field name="price" type="float" indexed="true" stored="true"/>
<!-- BLOB field -->
<field name="picture" type="binary" indexed="true" stored="true"/>
<copyField source="*" dest="_text"/>
<!-- ommited solr default fields -->
Now, when I start a full-import, Solr only indexes some of the records. This is the output after Solr finishes importing:
Indexing completed. Added/Updated: 64 documents. Deleted 0 documents. (Duration: 04s)
Requests: 1 (0/s), Fetched: 1369 (342/s), Skipped: 0, Processed: 64 (16/s)
Started: less than a minute ago
As you can see, I have 1369 records, but Solr only indexes 64 documents. If I remove the picture field from the schema, or set its indexed and stored attributes to false, Solr imports all documents.
I opened the Solr log and found this error when importing the blob field:
3436212 [Thread-19] WARN org.apache.solr.handler.dataimport.SolrWriter – Error creating document : SolrInputDocument(fields: [name=PRODUCTNAME, price=PRICE, store=STORE, picture=oracle.sql.BLOB@4130607a, _version_=1497915495941144576])
org.apache.solr.common.SolrException: ERROR: [doc=<ID>] Error adding field 'picture'='oracle.sql.BLOB@4130607a' msg=Illegal character .
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:176)
at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:78)
at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:240)
at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:166)
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:931)
at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1085)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:697)
at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:104)
at org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:71)
at org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:263)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:511)
at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:415)
at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:330)
at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:232)
at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:416)
at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:480)
at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:461)
Caused by: java.lang.IllegalArgumentException: Illegal character .
at org.apache.solr.common.util.Base64.base64toInt(Base64.java:150)
at org.apache.solr.common.util.Base64.base64ToByteArray(Base64.java:117)
at org.apache.solr.schema.BinaryField.createField(BinaryField.java:89)
at org.apache.solr.schema.FieldType.createFields(FieldType.java:305)
at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:48)
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:123)
... 18 more
I checked by querying directly against the database, and it works fine. I am using Solr 5, ojdbc7 and Java 8. How can I use the binary field correctly in Solr?
Update
I've changed the properties of picture in schema.xml, setting indexed=false:
<!-- BLOB field -->
<field name="picture" type="binary" indexed="false" stored="true"/>
Then I restarted Solr, reloaded my core, and ran a full-import again. No success, and the same exception. The same 64 documents described above were imported, and the picture field does not appear in the JSON response. The query I execute is:
/select?q=*%3A*&wt=json&indent=true
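Since the stack trace shows BinaryField trying to Base64-decode the blob's toString() value, one direction I am considering is a custom DIH transformer that converts the BLOB into a Base64 string before Solr sees it. A sketch (untested; the package, class and column names are mine):

package com.example; // hypothetical package

import java.sql.Blob;
import java.util.Base64;
import java.util.Map;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Transformer;

public class BlobToBase64Transformer extends Transformer {

    @Override
    public Object transformRow(Map<String, Object> row, Context context) {
        Object value = row.get("PICTURE"); // column label as returned by the Oracle driver
        if (value instanceof Blob) {
            try {
                Blob blob = (Blob) value;
                byte[] bytes = blob.getBytes(1, (int) blob.length());
                // BinaryField Base64-decodes string input, so hand it a Base64 string
                row.put("PICTURE", Base64.getEncoder().encodeToString(bytes));
            } catch (Exception e) {
                row.put("PICTURE", null); // skip unreadable blobs instead of failing the import
            }
        }
        return row;
    }
}

It would then be referenced on the entity in data-config.xml with transformer="com.example.BlobToBase64Transformer".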

DataSource.getInheritsFrom() fails with ClassCastException

In a SmartGWT EE project I have a hierarchy of DataSources described in .ds.xml files; here are some of them:
BaseElement_DS.ds.xml
<DataSource ID="BaseElement_DS" serverConstructor="com.isomorphic.jpa.JPADataSource"
beanClassName="lnudb.server.model.BaseElement">
<fields>
<field name="id" type="sequence" hidden="true" primaryKey="true" />
<field name="name" type="text" title="Name" required="true" />
<field name="dsId" type="text" title="Datasource" hidden="true"/>
</fields>
</DataSource>
Human_DS.ds.xml
<DataSource ID="Human_DS" serverConstructor="com.isomorphic.jpa.JPADataSource"
beanClassName="org.zasadnyy.lnudb.server.model.Human" inheritsFrom="BaseElement_DS"
useParentFieldOrder="true">
<fields>
<field name="surname" type="text" />
<field name="birthday" type="date" title="Birthday" required="false" />
</fields>
</DataSource>
Problem: when I try to get the parent datasource ID in code:
String parentDsId = DataSource.get("Human_DS").getInheritsFrom();
a ClassCastException is raised from inside the getInheritsFrom() method:
java.lang.ClassCastException: com.google.gwt.core.client.JavaScriptObject$ cannot be cast to java.lang.String
I will be grateful for any help.
This way you won't get the exception any longer:
String parentDsId = DataSource.get("Human_DS").getInheritsFrom() + "";
However, I'm not sure whether this is "ok" for your purposes. If it is not, try retrieving the value as a JavaScriptObject and converting it to a String yourself. I hope this helps.
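For example, with a guard for datasources that don't inherit from anything (the string checks are my own assumption about what the concatenation may produce):

String parentDsId = DataSource.get("Human_DS").getInheritsFrom() + "";
if (parentDsId.isEmpty() || "null".equals(parentDsId) || "undefined".equals(parentDsId)) {
    parentDsId = null; // no parent datasource
}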

Trying to serialize an object compactly using Castor

I'm using Castor to write out a map of user IDs to time intervals. I'm using it to save and resume progress in a lengthy task, and I'm trying to make the XML as compact as possible. My map is from string user IDs to a class that contains the interval timestamps, along with additional transient data that I don't need to serialize.
I'm able to use a nested class mapping:
...
<field name="userIntervals" collection="map">
<bind-xml name="u">
<class name="org.exolab.castor.mapping.MapItem">
<field name="key" type="string"><bind-xml name="n" node="attribute"/></field>
<field name="value" type="my.package.TimeInterval"/>
</class>
</bind-xml>
</field>
...
<class name="my.package.TimeInterval">
<map-to xml="ti"/>
<field name="intervalStart" type="long"><bind-xml name="s" node="attribute"/></field>
<field name="intervalEnd" type="long"><bind-xml name="e" node="attribute"/></field>
</class>
...
And get output that looks like:
<u n="36164639"><value s="1292750896000" e="1292750896000"/></u>
What I'd like is the name, start, and end of the user in a single node, like this:
<u n="36164639" s="1292750896000" e="1292750896000"/>
But I can't seem to finagle it so that the start and end attributes of the "value" go in the same node as the "key". Any ideas would be greatly appreciated.
Nash,
I think arranging the Castor mapping is a bit tricky.
If you want to have a structure like
<u n="36164639" s="1292750896000" e="1292750896000"/>
then you need to create a new POJO that holds all three fields: key, intervalStart and intervalEnd.
Name the class KeyTimeInterval and map it as below.
<field name="userIntervals" collection="map">
<class name="org.exolab.castor.mapping.MapItem">
<field name="u" type="my.package.KeyTimeInterval">
<bind-xml name="u" node="element"/>
</field>
</class>
</field>
<class name="my.package.KeyTimeInterval">
<field name="key" type="String">
<bind-xml name="n" node="attribute"/></field>
<field name="intervalStart" type="long">
<bind-xml name="s" node="attribute"/></field>
<field name="intervalEnd" type="long">
<bind-xml name="e" node="attribute"/></field>
</class>
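For completeness, a sketch of the KeyTimeInterval POJO (field names taken from the mapping above; the getters and setters are needed by Castor):

public class KeyTimeInterval {
    private String key;
    private long intervalStart;
    private long intervalEnd;

    public String getKey() { return key; }
    public void setKey(String key) { this.key = key; }

    public long getIntervalStart() { return intervalStart; }
    public void setIntervalStart(long intervalStart) { this.intervalStart = intervalStart; }

    public long getIntervalEnd() { return intervalEnd; }
    public void setIntervalEnd(long intervalEnd) { this.intervalEnd = intervalEnd; }
}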
I think you should be able to use location on s and e. Try this:
...
<class name="my.package.TimeInterval">
<map-to xml="ti"/>
<field name="intervalStart" type="long">
<bind-xml name="s" location="u" node="attribute"/>
</field>
<field name="intervalEnd" type="long">
<bind-xml name="e" location="u" node="attribute"/>
</field>
</class>
I'm answering my own question here, since there is a solution that does exactly what I want, and there's actually an error in the explanation at http://www.castor.org/xml-mapping.html#Sample-3:-Using-the-container-attribute - the container attribute is exactly what's needed here.
Changing one line in the mapping:
<field name="value" type="my.package.TimeInterval" container="true"/>
did exactly what I wanted: it didn't create a subelement for the value, it just mapped the fields into the existing parent element. Since then, I've used this quite a few times to map multiple-value classes into their parent.
The error is that the documentation states you do this by setting the container attribute to false; of course, it should be true.
