Java fixed field file format - java

I would like to know whether something like JAXB exists for fixed-field ASCII text formats.
It would be very useful to marshal Java objects to a fixed-field ASCII text file, the way JAXB does with XML.
Thank you.

I found this, and I think it is exactly what I needed. Note that it needs commons-lang, but not the latest version: I had to use the legacy v2.6, otherwise I got a "StringUtils not found" error.
http://fixedformat4j.ancientprogramming.com/usage/index.html
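For readers who cannot add a dependency, the idea of marshalling to a fixed-field line can also be sketched with the JDK alone. This is not the fixedformat4j API; the class name, the two fields, and their widths (10 characters for the name, 8 digits for the amount) are hypothetical and chosen only for illustration.

```java
import java.util.Locale;

class FixedWidthSketch {
    // Hypothetical layout: name = 10 chars, left-aligned, space-padded;
    // amount = 8 chars, right-aligned, zero-padded.
    // Amounts wider than 8 characters would break the layout and are not handled here.
    static String marshal(String name, int amount) {
        String n = name.length() > 10 ? name.substring(0, 10) : name;
        return String.format(Locale.ROOT, "%-10s%08d", n, amount);
    }

    // Reads the name back from columns 0..9 and drops the padding.
    static String unmarshalName(String line) {
        return line.substring(0, 10).trim();
    }

    // Reads the amount back from columns 10..17.
    static int unmarshalAmount(String line) {
        return Integer.parseInt(line.substring(10, 18));
    }

    public static void main(String[] args) {
        String line = marshal("Widget", 42);
        System.out.println("[" + line + "]");
    }
}
```

A library such as the one linked above replaces the hand-written offsets with annotations on the Java class, which is closer to the JAXB style the question asks for.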

Related

Cleanest way to deserialize a non-standard (wrong) format of list of JSON string that doesn't have quote

I'm trying to deserialize a Java String to a List of String. Due to some reason, the input may come in two formats:
"[\"string1\", \"string2\"]"
or
"[string1, string2]"
The library I'm using is Jackson databind.
For the first case, it's a typical, easy case that Jackson supports.
For the second, I understand it's not valid JSON, and I could hack around it by splitting the String on "," and removing the "[" and "]", but I would like to know if someone knows a clean way to deserialize something like that.
Thanks in advance.
To answer your question, you can look into YAML parsers. :)
Jackson has an extension for YAML support, so that would be your clean solution.
YAML is a superset of JSON, so it can parse any valid JSON, as well as many looser forms (such as strings without quotes).
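For comparison, the manual fallback mentioned in the question (strip the brackets, split on commas) can be sketched with the JDK alone. The class and method names are hypothetical, and the sketch assumes that elements never contain commas, brackets, or escaped quotes; a real parser such as the YAML route above does not have that limitation.

```java
import java.util.ArrayList;
import java.util.List;

class LenientListSketch {
    // Accepts both ["string1", "string2"] and [string1, string2].
    // Assumption: elements contain no commas, brackets, or escaped quotes.
    static List<String> parse(String input) {
        String s = input.trim();
        if (s.length() < 2 || s.charAt(0) != '[' || s.charAt(s.length() - 1) != ']') {
            throw new IllegalArgumentException("Expected a bracketed list: " + input);
        }
        String body = s.substring(1, s.length() - 1).trim();
        List<String> result = new ArrayList<>();
        if (body.isEmpty()) {
            return result;
        }
        for (String part : body.split(",")) {
            String item = part.trim();
            // Drop one pair of surrounding double quotes if present.
            if (item.length() >= 2 && item.startsWith("\"") && item.endsWith("\"")) {
                item = item.substring(1, item.length() - 1);
            }
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("[\"string1\", \"string2\"]"));
        System.out.println(parse("[string1, string2]"));
    }
}
```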

Escaping an xml string in java

I read elements with CDATA sections from an RSS feed, which I need to convert to valid XML. The content in the CDATA section is mostly valid XHTML, but sometimes characters like ampersands appear in attributes (URLs).
I can use .replaceAll("&", "&amp;") to solve this, but thinking a bit ahead, other invalid characters may show up in attributes or text.
The CMS to which I'm importing the element, won't accept CDATA sections without setting up another configuration for the content, so my question is: is there any simple way to escape the string, only for attributes and text?
I'm using the jdom library to manipulate the xml after the import.
Edit: I've checked out apache's StringEscapeUtils, but this is escaping the whole string. I need something that will only escape attribute values and text inside elements.
Apache Commons provides handy functions for this: StringEscapeUtils
When you use JDOM, it will automatically and correctly escape any content that needs it. Is your CMS loaded with the output of JDOM, or are you using some other library to populate the CMS?
In essence, if you have valid XML input and you use JDOM (something from org.jdom2.output.*) to output the data, then you will always have good output. So what are you doing to get broken output?
Rolf

Escaping Special Characters with JiBX (Un) Marshalling

I want special characters to be escaped during marshalling.
Is there any way to do this?
alt="<i><b> image alt</b></i>"
This is saved as
&lt;b&gt;&lt;i&gt;image alt&lt;/b&gt;&lt;/i&gt;
I want to save the value as it is.
If you store something as XML, you HAVE to escape those characters. Otherwise your XML will become invalid:
<xml>text</xml>
If text == </xml>, the XML will clearly be invalid:
<xml></xml></xml>
This must be:
<xml>&lt;/xml&gt;</xml>
If you unmarshal it, it should become the correct value again.
You may also use CDATA.
I thought I'd share my experience, because the answers I found weren't quite comprehensive (and I'm still not sure this is the most professional solution out there).
In our project we use maven-jibx-plugin to generate POJOs from XSDs (in two runs as usual: 1. *.xsd -> binding.xml, then 2. binding.xml -> *.java).
Based on the documentation of the value node and Dennis Sosnoski's answer on the JiBX mailing list, I added xml-maven-plugin to our project build process. I use it to apply an XSL file to the generated binding.xml before POJO generation. The point is to change the value of the style attribute on the appropriate value node from text to cdata.
So far it seems to have solved my encoding issue, and now I can return XML to the client like:
<Description><![CDATA[<strong>Valuable content goes here</strong>...<br />]]></Description>
Hope this makes someone's life easier. :)

Java library to escape/clean XML?

I get some malformed xml text input like:
"<Tag>something</Tag> 8 > 3, 2 < 3, ... <Tag>something</Tag>"
I want to clean the input so as to get:
"<Tag>something</Tag> 8 &gt; 3, 2 &lt; 3, ... <Tag>something</Tag>"
That is, escape special symbols like < and > while keeping the valid tags ("<Tag>something</Tag>"; note, with the same case).
Do you know of any Java library to do this? Probably an XML/HTML parser? (Though I don't really need a parser, simply a "clean" procedure.)
JTidy is "HTML syntax checker and pretty printer. Like its non-Java cousin, JTidy can be used as a tool for cleaning up malformed and faulty HTML"
But it can also be used with XML. Check the documentation. It's incredibly smart, so it will probably work for you.
I don't know of any library that would do that. Your input is malformed XML, and no proper XML parser would accept it. More important, it is not always possible to distinguish an actual tag from something that looks-like-a-tag-but-is-really-text. Therefore any heuristic-based attempt that you make to solve the problem will be fragile; i.e. it could occasionally produce malformed XML.
The best approach is to address the problem before you assemble the XML.
If you generate the XML by (for example) unparsing a DOM, then the unparser will take care of the escaping for you.
If you are generating the XML by templating or string bashing, then you need to call something like StringEscapeUtils.escapeXml on the relevant text chunks ... before the XML tags get incorporated.
If you leave the problem until after the "XML" has been assembled, it cannot be properly fixed.
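The point about letting the serializer do the escaping can be illustrated with the JDK's built-in DOM and Transformer classes. This is a minimal sketch, not the asker's code: the class name, method names, and sample text are hypothetical. The raw text is attached to an element as text content, and the serializer escapes it on output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class DomEscapeSketch {
    // Builds <tagName>rawText</tagName>; the serializer escapes the text.
    static String wrap(String tagName, String rawText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element el = doc.createElement(tagName);
            el.setTextContent(rawText); // stored unescaped in the tree
            doc.appendChild(el);

            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize XML", e);
        }
    }

    // Parses the serialized XML and returns the root element's text.
    static String readBack(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = wrap("Tag", "8 > 3, 2 < 3 & more");
        System.out.println(xml);
        System.out.println(readBack(xml));
    }
}
```

Because the text never passes through string concatenation with the tags, there is no stage at which an unescaped "<" or "&" can end up in the markup.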
The best solution is to fix the program generating your text input. The easiest such fix would involve an escape utility like the other answers suggested. If that's not an option, I'd use a regular expression like
</?[a-zA-Z]+ */?>
to match the expected tags, and then split the string up into tags (which you want to pass through unchanged) and text between tags (against which you want to apply an escape method.)
I wouldn't count on an XML parser to be able to do it for you because what you're dealing with isn't valid XML. It is possible for the existing lack of escaping to produce ambiguities, so you might not be able to do a perfect job either.
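The regular-expression approach described above could look roughly like the following sketch. The class and method names are hypothetical. It uses the same tag pattern, so it only recognizes simple tags without attributes, and it assumes the text between tags has not already been partially escaped (an existing "&amp;" would be escaped a second time).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagAwareEscaper {
    // Pattern from the answer: opening, closing, or self-closing tags
    // made of letters only, with no attributes.
    private static final Pattern TAG = Pattern.compile("</?[a-zA-Z]+ */?>");

    // Escapes the three characters that matter in element text.
    // '&' goes first so the entities added afterwards are not re-escaped.
    static String escapeText(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Passes matched tags through unchanged and escapes everything between them.
    static String clean(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(input);
        int last = 0;
        while (m.find()) {
            out.append(escapeText(input.substring(last, m.start())));
            out.append(m.group());
            last = m.end();
        }
        out.append(escapeText(input.substring(last)));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(
            clean("<Tag>something</Tag> 8 > 3, 2 < 3, ... <Tag>something</Tag>"));
    }
}
```

As the answers note, this remains a heuristic: text that happens to look like a tag (for example "a <b> c") is passed through as markup.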
Check out Guava's XmlEscaper. It is in pre-release for version 11 but the code is available.
Apache Commons Lang contains a class named StringEscapeUtils which does exactly what you want! The method you'd want to use is escapeXml, I presume.

JSON decode issue

I'm trying to decode JSON output of a Java program (jackson) and having some issues.
The cause of the problem is the following snippet:
{
"description": "... lives\uMOVE™ OFFERS ",
}
Which causes ValueError: Invalid \uXXXX escape.
Any ideas on how to fix this?
EDIT: The output is from an Avro file, the Avro package uses jackson to emit records as JSON.
EDIT2: After poking about in the source files, it might be the case that the JSON is constructed manually (sorry jackson).
What's the original string supposed to look like? \uXXXX is a Unicode escape sequence, so the decoder is interpreting \uMOVE as a single escaped character, but MOVE is not a valid hexadecimal value. JSON is always assumed to be Unicode, so you'll likely need to fix the string in the originating app.
Try quoting the \u like this:
{
"description": "... lives\\uMOVE™ OFFERS ",
}
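If the producer cannot be fixed, the quoting suggested above can be applied as a pre-processing step before the text reaches the JSON decoder. The sketch below is a workaround under stated assumptions, not a general JSON repair tool: the class name is hypothetical, and it assumes that any backslash followed by "u" but not by four hexadecimal digits was meant literally. A sequence that follows an already escaped backslash is not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class JsonUnicodeEscapeRepair {
    // Matches a single backslash followed by 'u' that is NOT followed by four
    // hex digits and is NOT itself preceded by another backslash.
    private static final Pattern BAD_ESCAPE =
            Pattern.compile("(?<!\\\\)\\\\u(?![0-9a-fA-F]{4})");

    // Doubles the backslash of each invalid escape so it decodes as a
    // literal backslash followed by 'u'.
    static String repair(String json) {
        return BAD_ESCAPE.matcher(json)
                .replaceAll(Matcher.quoteReplacement("\\\\u"));
    }

    public static void main(String[] args) {
        String bad = "{\"description\": \"... lives\\uMOVE OFFERS \"}";
        System.out.println(repair(bad));
    }
}
```

Valid escapes such as a backslash-u followed by four hex digits are left untouched, so well-formed input passes through unchanged.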
Basically the input isn't valid json.
The spec at http://www.json.org/ defines how strings should be encoded. You will have to fix the JSON output from the other application.
This is a known bug in Avro versions < 1.6.0. See AVRO-851 for more details.
Jackson does not currently have a configuration feature to allow accepting such input. (Was it generated with Jackson?)
You could modify the stream parser to handle it. Follow the stack trace to the method(s) that would need changing.
You could submit a change request at http://jira.codehaus.org/browse/JACKSON for Jackson to be enhanced to provide such a feature, though I'm not sure how popular the request would be, and whether it would ever be implemented.
