We are moving to Camel in our application. I need to process some XML messages (get values, compare statuses). To solve these problems we have a bunch of custom processors written in pure Java, but I was asked to change this to use Camel features.
Example of code:
.choice()
.when().xpath("/Response/Header/Status = 'OK' ")......
This is working fine.
Now I need to compare hint with some other hint, to do this I need to set value of:
/Response/Header/Hint
to lower case and check for contains.
If the /Response/Header/Hint value, for example
<Hint>MyHint</Hint>
converted to lower case contains "hint", then route to... otherwise to ....
I am not an XPath expert, and it looks like Camel has some changes for this, so can you please help me with this?
One more thing I am interested in: how do I remove the whole <Hint>MyHint</Hint> element before passing the message forward (remove some tags)?
Also, can you advise some tutorial to get into XPath for Camel quickly?
You could use fn:lower-case(string) to compare the hint, as explained in "How can I convert a string to upper- or lower-case with XSLT?". Note that lower-case() is an XPath 2.0 function; in XPath 1.0 the usual substitute is translate().
For the removal of the <Hint> tag you have multiple possibilities, like:
Use XSLT to filter the content as shown in remove xml tags with XSLT
Call a Bean that does the filtering
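As a rough illustration of the XSLT option, here is a minimal sketch that uses only the JDK's javax.xml.transform API outside of Camel. The stylesheet is an assumption (an identity transform plus an empty template for Hint), not the RemoveTag.xsl from this thread, and the class and method names are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class RemoveHint {
    // Assumed stylesheet: copy everything (identity transform),
    // but match Hint with an empty template so it is dropped.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match='Hint'/>"
        + "</xsl:stylesheet>";

    static String removeHint(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        // Leave out the <?xml ...?> declaration to keep the result compact.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Response><Header><Status>OK</Status>"
            + "<Hint>MyHint</Hint></Header></Response>";
        System.out.println(removeHint(xml));
    }
}
```

In a Camel route the same stylesheet would be referenced through the xslt: endpoint instead of being applied by hand.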
Answer is this:
.choice()
.when().xpath("/Response/Header/Status/text() = 'OK'")
.to("xslt:xsl/RemoveTag.xsl")
.choice().when().xpath("//Response/Header/Hint[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'hint')]").to
RemoveTag.xsl is a slightly modified version of the stylesheet from "remove xml tags with XSLT".
Many thanks to olivier roger!
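To try the lower-case "contains" check outside a route, the following sketch evaluates an equivalent XPath 1.0 expression with the JDK's javax.xml.xpath API. The class and method names are hypothetical, and the sample message is assumed from the question.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class HintCheck {
    // XPath 1.0 has no lower-case(), so translate() maps A-Z onto a-z
    // before contains() looks for the lower-case fragment.
    static final String HINT_CONTAINS =
        "contains(translate(/Response/Header/Hint, "
        + "'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'hint')";

    static boolean hintContains(String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Boolean) xpath.evaluate(
            HINT_CONTAINS, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Response><Header><Status>OK</Status>"
            + "<Hint>MyHint</Hint></Header></Response>";
        System.out.println(hintContains(xml)); // prints: true
    }
}
```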
Related
I am confused about how to implement this requirement.
I receive an xml as a string from the database and need to find the value of particular elements inside the xml string. Here, my thought was,
1- convert String to xml.
2 - loop the xml using NodeList and DocumentBuilder (OR) Use JaxB. which one is the better option?
I'd definitely recommend JAXB instead of doing it by hand but if you're a bit masochistic it's doable by hand :3
One more option is to use Regular Expressions or use Groovy:)
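For comparison, the by-hand route does not have to loop over a NodeList: parsing the string with DocumentBuilder and asking XPath for the element is fairly short. This is only a sketch; the class name, the sample XML, and the /order/id path are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XmlStringLookup {
    // Parses the XML held in a String and returns the text of the first
    // node selected by the XPath expression ("" if nothing matches).
    static String valueOf(String xml, String expression) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical payload standing in for the string read from the database.
        String xml = "<order><id>42</id><customer>Ann</customer></order>";
        System.out.println(valueOf(xml, "/order/id")); // prints: 42
    }
}
```

JAXB remains the more comfortable choice when the whole document maps onto classes; the XPath lookup is handy when only a few values are needed.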
The two approaches I usually follow are:
Convert the HTML to a string, and then test it against a target string. The problem with this approach is that it is too brittle, and there'll be very frequent false negatives due to say, things like extra whitespace somewhere.
Convert the HTML to a string and parse it back as an XML, and then use XPath queries to assert on specific nodes. This approach works well but not all HTML comes with closing tags and parsing it as XML fails in such cases.
Both these approaches have serious flaws. I imagine there must be a well-established approach (or approaches) for this sort of tests. What is it?
You could use jsoup or JTidy instead of XML parsing and use your second strategy.
I read elements with CDATA sections from an RSS feed, which I need to convert to valid XML. The content in the CDATA section is mostly valid XHTML, but sometimes characters like ampersands appear in attributes (URLs).
I can use .replaceAll("&", "&amp;") to solve this, but thinking a bit ahead, other invalid characters may show up in attributes or text.
The CMS to which I'm importing the element, won't accept CDATA sections without setting up another configuration for the content, so my question is: is there any simple way to escape the string, only for attributes and text?
I'm using the jdom library to manipulate the xml after the import.
Edit: I've checked out apache's StringEscapeUtils, but this is escaping the whole string. I need something that will only escape attribute values and text inside elements.
Apache Commons provides handy functions for this: StringEscapeUtils
When you use JDOM, it will automatically and correctly escape any content that needs it. Is your CMS loaded with the output of JDOM, or are you using some other library to populate the CMS...?
In essence, if you have valid XML input, and you use JDOM (something from org.jdom2.output.*) to output the data, then you will always have good output.... so, what are you doing to have broken output?
Rolf
I want special characters to be escaped during marshalling.
Is there any way to do this?
alt="<i><b> image alt</b></i>"
This is saved as
&lt;b&gt;&lt;i&gt;image alt&lt;/b&gt;&lt;/i&gt;
I want to save the value as it is.
If you store something as XML, you HAVE to escape those characters. Otherwise your XML will become invalid:
<xml>text</xml>
If text == </xml>, the XML will clearly be invalid:
<xml></xml></xml>
This must be:
<xml>&lt;/xml&gt;</xml>
If you unmarshall it, it should become the correct value again.
You may also use CDATA
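The round trip described above (escaped on write, original value back on read) can be checked with a small sketch. It swaps in the JDK's DOM and Transformer APIs rather than JAXB, so it shows the principle only and not the asker's marshaller; the img element and the class name are assumptions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class EscapeRoundTrip {
    // Writes <img alt="..."/>; the serializer escapes the markup characters.
    static String serialize(String altValue) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element img = doc.createElement("img");
        img.setAttribute("alt", altValue);
        doc.appendChild(img);
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Parses the serialized form; the parser undoes the escaping.
    static String readAlt(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getAttribute("alt");
    }

    public static void main(String[] args) throws Exception {
        String xml = serialize("<i><b> image alt</b></i>");
        System.out.println(xml);          // stored form, with the markup escaped
        System.out.println(readAlt(xml)); // original value again
    }
}
```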
I thought I'd share my experience, because the answers I found weren't quite comprehensive (and I'm still not entirely sure this is the most professional solution out there).
In our project we use maven-jibx-plugin to generate POJOs from XSDs (in two runs as usual: 1. *.xsd->binding.xml, then 2. binding.xml-> *.java).
Based on documentation of value node and Dennis Sosnoski's answer on jibx mailing list I added xml-maven-plugin to our project build process. I use it to apply an XSL file on generated binding.xml before POJO generation. The point is to change value of style attribute on appropriate value node from text to cdata.
So far it seems to have solved my encoding issue, and now I can return XML to the client like:
<Description><![CDATA[<strong>Valuable content goes here</strong>...<br />]]></Description>
Hope this makes someone's life easier. :)
I get some malformed xml text input like:
"<Tag>something</Tag> 8 > 3, 2 < 3, ... <Tag>something</Tag>"
I want to clean the input so as to get:
"<Tag>something</Tag> 8 &gt; 3, 2 &lt; 3, ... <Tag>something</Tag>"
That is, escape special symbols like < and >, yet keep the valid tags ("<Tag>something</Tag>"; note, with the same case).
Do you know of any Java library to do this? Probably an XML/HTML parser? (Though I don't really need a parser, simply a "clean" procedure.)
JTidy is "HTML syntax checker and pretty printer. Like its non-Java cousin, JTidy can be used as a tool for cleaning up malformed and faulty HTML"
But it can also be used with XML. Check the documentation. It's incredibly smart; it will probably work for you.
I don't know of any library that would do that. Your input is malformed XML, and no proper XML parser would accept it. More important, it is not always possible to distinguish an actual tag from something that looks-like-a-tag-but-is-really-text. Therefore any heuristic-based attempt that you make to solve the problem will be fragile; i.e. it could occasionally produce malformed XML.
The best approach is to address the problem before you assemble the XML.
If you generate the XML by (for example) unparsing a DOM, then the unparser will take care of the escaping for you.
If you are generating the XML by templating or string bashing, then you need to call something like StringEscapeUtils.escapeXml on the relevant text chunks ... before the XML tags get incorporated.
If you leave the problem until after the "XML" has been assembled, it cannot be properly fixed.
The best solution is to fix the program generating your text input. The easiest such fix would involve an escape utility like the other answers suggested. If that's not an option, I'd use a regular expression like
</?[a-zA-Z]+ */?>
to match the expected tags, and then split the string up into tags (which you want to pass through unchanged) and text between tags (against which you want to apply an escape method.)
I wouldn't count on an XML parser to be able to do it for you because what you're dealing with isn't valid XML. It is possible for the existing lack of escaping to produce ambiguities, so you might not be able to do a perfect job either.
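A minimal sketch of that split-and-escape idea, using the regular expression from this answer, might look like the following. The class and method names are made up, and the hand-rolled escapeText stands in for a library escape utility. As noted, it stays a heuristic: text that happens to look like a tag passes through unescaped, and entities that are already escaped would get escaped a second time.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TagPreservingEscaper {
    // Same pattern as above: simple opening, closing, or self-closing tags without attributes.
    private static final Pattern TAG = Pattern.compile("</?[a-zA-Z]+ */?>");

    // Stand-in for a library escaper; '&' must be replaced first.
    static String escapeText(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String clean(String input) {
        StringBuilder out = new StringBuilder();
        Matcher matcher = TAG.matcher(input);
        int last = 0;
        while (matcher.find()) {
            // Escape the text before this tag, then copy the tag unchanged.
            out.append(escapeText(input.substring(last, matcher.start())));
            out.append(matcher.group());
            last = matcher.end();
        }
        // Escape whatever follows the final tag.
        out.append(escapeText(input.substring(last)));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(clean("<Tag>something</Tag> 8 > 3, 2 < 3, ... <Tag>something</Tag>"));
    }
}
```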
Check out Guava's XmlEscaper. It is in pre-release for version 11 but the code is available.
Apache Commons Lang contains a class named StringEscapeUtils which does exactly what you want! The method you'd want to use is escapeXml, I presume.