Jaxb unmarshal - elements with minoccurs 0 which should be combined - java

I have following XML file:
<?xml version="1.0" encoding="utf-8"?>
<Paragraph>
<ParaStyleName>headline_red</ParaStyleName>
<TextStyleRanges>
<TextStyleRange>
<CharStyleName>[Ohne]</CharStyleName>
<Contents>
<Content>inhalt</Content>
<Content>test text</Content>
<SpecialCharacter name="HARD_RETURN"/>
<Content> "text here</Content>
<SpecialCharacter name="DOUBLE_QUOTE_LEFT"/>
</Contents>
</TextStyleRange>
</TextStyleRanges>
</Paragraph>
From this xml I need to obtain the Content part like this:
inhalt test text HARD_RETURN "text here DOUBLE_QUOTE_LEFT
For me the tag order inside <Contents> is important. The problem is that the number of <Content> and <SpecialCharacter> elements is not fixed, and the position of these tags is not fixed either.
Note: I'm using JAXB for this, and I have created model classes for Contents, Content and SpecialCharacter. Contents has an ArrayList<Content> and an ArrayList<SpecialCharacter> as members, but with two separate lists I can't link them correctly to keep the original order of the tags.
Please HELP me with a solution for this case.
Thanks!

You are going to need to merge these two lists as follows:
@XmlElements({
@XmlElement(name="Content", type=Content.class),
@XmlElement(name="SpecialCharacter", type=SpecialCharacter.class)
})
public List<Object> getValues() {
return values;
}
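To check what order the merged list should end up in, the same document can be walked with the JDK's built-in DOM parser. This is only a sketch of a plain DOM traversal, not JAXB; the class name ContentsOrder and the method flatten are made up for illustration, and it assumes the first <Contents> element is the one of interest.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: reads the children of <Contents> in document order.
class ContentsOrder {
    static String flatten(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node contents = doc.getElementsByTagName("Contents").item(0);
            List<String> parts = new ArrayList<>();
            // Siblings are visited in the order they appear in the file.
            for (Node n = contents.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace text nodes between elements
                }
                Element e = (Element) n;
                if ("Content".equals(e.getTagName())) {
                    parts.add(e.getTextContent().trim());
                } else if ("SpecialCharacter".equals(e.getTagName())) {
                    parts.add(e.getAttribute("name"));
                }
            }
            return String.join(" ", parts);
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse XML", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<Contents><Content>inhalt</Content>"
                + "<SpecialCharacter name=\"HARD_RETURN\"/>"
                + "<Content>more</Content></Contents>";
        System.out.println(flatten(xml));
    }
}
```

With JAXB, the single List<Object> from getValues() should give the same interleaving, because both element types are collected into one list as they are read.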

Related

XmlUnit ignore order of elements when comparing XML files

I have the two following xml files:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pricebooks xmlns="http://www.blablabla.com">
<pricebook>
<header pricebook-id="my-id">
<currency>GBP</currency>
<display-name xml:lang="x-default">display name</display-name>
<description>my description 1</description>
</header>
<price-tables>
<price-table product-id="id1" mode="mode1">
<amount quantity="1">30.00</amount>
</price-table>
<price-table product-id="id2" mode="mode2">
<amount quantity="1">60.00</amount>
</price-table>
</price-tables>
</pricebook>
</pricebooks>
and
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pricebooks xmlns="http://www.blablabla.com">
<pricebook>
<header pricebook-id="my-id">
<currency>GBP</currency>
<display-name xml:lang="x-default">display name</display-name>
<description>my description 1</description>
</header>
<price-tables>
<price-table product-id="id2" mode="mode2">
<amount quantity="1">60.00</amount>
</price-table>
<price-table product-id="id1" mode="mode1">
<amount quantity="1">30.00</amount>
</price-table>
</price-tables>
</pricebook>
</pricebooks>
Which I'm trying to compare ignoring the order of the elements price-table, so for me those two are equal. I'm using
<dependency>
<groupId>org.xmlunit</groupId>
<artifactId>xmlunit-core</artifactId>
<version>2.5.0</version>
</dependency>
and the code is the following, but I'm not able to make it work. It complains because the attribute values id1 and id2 are different.
Diff myDiffSimilar = DiffBuilder
.compare(expected)
.withTest(actual)
.checkForSimilar()
.ignoreWhitespace()
.ignoreComments()
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText))
.build();
assertFalse(myDiffSimilar.hasDifferences());
I have also tried to edit the nodeMatcher as follows:
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.conditionalBuilder()
.whenElementIsNamed("price-tables")
.thenUse(ElementSelectors.byXPath("./price-table", ElementSelectors.byNameAndText))
.elseUse(ElementSelectors.byName)
.build()))
Thank you for your help.
I don't see any nested text inside your price-table elements at all, so byNameAndText matches on element names only - which will be the same for all price-tables and thus not do what you want.
In your example there is no ambiguity for price-tables as there is only one anyway. So the byXPath approach looks wrong. At least in your snippet XMLUnit should do fine with byName except for the price-table elements.
I'm not sure whether product-id alone is what identifies your price-table elements or the combination of all attributes. Either byNameAndAttributes("product-id") or its byNameAndAllAttributes cousin should work.
If it is only product-id then byNameAndAttributes("product-id") becomes byName for all elements that don't have any product-id attribute at all. In this special case byNameAndAttributes("product-id") alone will work for your whole document as we can see it - more or less by accident.
If you need more complex rules for elements other than price-table, or you want to make things more explicit, then
ElementSelectors.conditionalBuilder()
.whenElementIsNamed("price-table")
.thenUse(ElementSelectors.byNameAndAttributes("product-id"))
// more special cases
.elseUse(ElementSelectors.byName)
.build()
is the better choice.

Check if element node contains no text using java and Xpath?

I am new to XPath. I need to get a boolean response from XPath: if an element does not contain any text it should return false, otherwise true. I have seen many examples, but I don't have much time to learn XPath expressions. Below is the XML file.
<?xml version="1.0" encoding="UTF-8" ?>
<order id="1234" date="05/06/2013">
<customer first_name="James" last_name="Rorrison">
<email>j.rorri@me.com</email>
<phoneNumber>+44 1234 1234</phoneNumber>
</customer>
<content>
<order_line item="H2G2" quantity="1">
<unit_price>23.5</unit_price>
</order_line>
<order_line item="Harry Potter" quantity="2">
<unit_price></unit_price>//**I want false here**
</order_line>
</content>
<credit_card number="1357" expiry_date="10/13" control_number="234" type="Visa" />
</order>
Could you point me in the right direction to create an XPath expression for this problem?
What I want is an expression (dummy expression) like the one below.
/order/content/order_line/unit_price[at this point I want to put a validation which will return true or false based on some check of isNull or notNull].
The following xpath will do this:
not(boolean(//*[not(text() or *)]))
but this xpath will also include the credit_card node, since it too does not contain any text (the attributes are not text()).
If you also want to exclude nodes with attributes, then use this:
not(boolean(//*[not(text() or * or @*)]))
Following your edit, you can do this..
/order/content/order_line/unit_price[not(text())]
It will return a list of nodes with no text, and from there you can test against the count of nodes for your test.
Or, to return true/false:
not(boolean(/order/content/order_line/unit_price[not(text())]))
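From Java, a true/false expression of this kind can be evaluated with the standard javax.xml.xpath API by asking for XPathConstants.BOOLEAN. A minimal sketch, assuming the XML is available as a String; the class name UnitPriceCheck and the method allUnitPricesHaveText are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper wrapping the boolean XPath check.
class UnitPriceCheck {
    // False as soon as one unit_price element has no text node.
    private static final String EXPR =
            "not(boolean(/order/content/order_line/unit_price[not(text())]))";

    static boolean allUnitPricesHaveText(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // BOOLEAN makes evaluate() return a java.lang.Boolean.
            return (Boolean) xpath.evaluate(
                    EXPR, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<order><content>"
                + "<order_line item=\"H2G2\"><unit_price>23.5</unit_price></order_line>"
                + "<order_line item=\"Harry Potter\"><unit_price></unit_price></order_line>"
                + "</content></order>";
        System.out.println(allUnitPricesHaveText(xml));
    }
}
```

Note that a unit_price containing only whitespace still has a text node, so it would count as having text in this sketch.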

Xml id attribute to work with Java's getElementById? [duplicate]

This question already has answers here:
Java DOM getElementByID
(2 answers)
Closed 3 years ago.
I have an xml document being parsed in Java as a w3c document.
In my XML, I have many elements of the same name, e.g. <item ..... />, each one with a unique attribute value, e.g. <item name="a" .... />.
I want in java to do:
doc.getElementById("a")
in order to get that specific item I have there with that name.
How can I tell java to use 'name' as the id?
Or, alternately, How can I fetch that specific item in least complexity?
DOM is not the best API to easily query your document and get back found elements. Learn XPath, which is a more appropriate API, or iterate through the tree of elements by yourself.
getElementById() will only return the element which has the given id attribute (edit: marked as such in the document DTD or schema). It can't find by name attribute.
See Java XML DOM: how are id Attributes special? for details.
You need to write a DTD that defines your attribute as being of type ID.
Well, to make this a complete answer: I had to use a DTD, like everyone stated.
Since my needs are quite simple, I embedded it in my XML the following way:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ATTLIST item
name ID #REQUIRED
>
]>
<root> .... </root>
The only important thing left to know is that once you declare the ATTLIST, you have to declare all the rest of the element's attributes as well. For optional ones, add #IMPLIED:
some-attribute CDATA #IMPLIED
It says that some-attribute contains character data (CDATA; PCDATA is not a valid attribute type) and is implied, which means the attribute is optional: it may or may not be present.
So eventually, it'll look something like:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ATTLIST item
name ID #REQUIRED
some-attribute CDATA #IMPLIED
>
]>
<root> .... </root>
And from Java side, Just use it blindly, e.g getElementById("some-name")
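A minimal sketch of the Java side, using the JDK's built-in DOM parser. The sample document, the class name ItemById and the method findItem are made up for illustration. It assumes the parser reads the internal DTD subset, which the default JDK parser does even without validation enabled.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: looks up an element whose 'name' attribute is declared as ID.
class ItemById {
    // Sample document with the embedded ATTLIST declaration.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<!DOCTYPE root [\n"
            + "<!ATTLIST item\n"
            + "  name ID #REQUIRED\n"
            + "  some-attribute CDATA #IMPLIED\n"
            + ">\n"
            + "]>\n"
            + "<root><item name=\"a\" some-attribute=\"first\"/><item name=\"b\"/></root>";

    static Element findItem(String xml, String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Works because the DTD marks 'name' as an attribute of type ID.
            return doc.getElementById(id);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element item = findItem(XML, "a");
        System.out.println(item == null ? "not found" : item.getAttribute("some-attribute"));
    }
}
```

getElementById returns null when no element carries the requested ID, so callers should check for that.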
In order to make doc.getElementById("a") work you need to change your XML to <item id="a" name="a" .... />
If you can't change the XML, you could use XPath to retrieve this element.
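The XPath route could look roughly like this, using only javax.xml.xpath from the JDK. The class name ItemByName, the method findByName and the XPath variable $name are assumptions for the sketch. The variable resolver is used so that the searched value is not concatenated into the expression.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: finds the first <item> with a given 'name' attribute.
class ItemByName {
    static Element findByName(String xml, String name) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind $name to the searched value instead of building the string by hand.
            xpath.setXPathVariableResolver(
                    q -> "name".equals(q.getLocalPart()) ? name : null);
            // NODE returns the first match in document order, or null if none.
            return (Element) xpath.evaluate(
                    "//item[@name=$name]",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODE);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><item name=\"a\" value=\"1\"/><item name=\"b\" value=\"2\"/></root>";
        Element item = findByName(xml, "b");
        System.out.println(item == null ? "not found" : item.getAttribute("value"));
    }
}
```

This needs no DTD and no change to the XML, at the cost of scanning the document on each lookup.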

Getting doc tag names while iterating the elements

I have this XML document:
<?xml version="1.0" encoding="UTF-16"?>
<root>
<items>
<item1>
<tag1>1</tag1>
<tag2>2</tag2>
<tag3>3</tag3>
</item1>
<item2>
<tag1>4</tag1>
<tag2>5</tag2>
<tag3>6</tag3>
</item2>
</items>
</root>
I want to iterate the item elements (item1, item2...), and for each tag get the tag name and after that the value of the tag.
I am using DOM parser.
Any ideas?
Sorry, but this ain't an unsolvable or complicated problem, this is simply reading a tutorial which can be googled within seconds.
And of course, you might also check the documentation, which will give you a hint about this handy method called "getNodeName()".
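For reference, a minimal sketch of what that looks like with the DOM parser from the JDK. The class name TagLister and the method describe are made up for illustration, and it assumes the item elements are direct children of the first <items> element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: lists "item/tag=value" for every tag below <items>.
class TagLister {
    static List<String> describe(String xml) {
        List<String> out = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node items = doc.getElementsByTagName("items").item(0);
            for (Node item = items.getFirstChild(); item != null; item = item.getNextSibling()) {
                if (item.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace between elements
                }
                for (Node tag = item.getFirstChild(); tag != null; tag = tag.getNextSibling()) {
                    if (tag.getNodeType() != Node.ELEMENT_NODE) {
                        continue;
                    }
                    // getNodeName() gives the tag name, getTextContent() its value.
                    out.add(item.getNodeName() + "/" + tag.getNodeName()
                            + "=" + tag.getTextContent().trim());
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<root><items><item1><tag1>1</tag1><tag2>2</tag2></item1></items></root>";
        describe(xml).forEach(System.out::println);
    }
}
```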

unmarshalling <br/> in XML data

I have some xml data I'm trying to unmarshall into java objects and one of the elements contains <br/> elements:
<details>
<para>
Line Number One
<br/>
Line Number Two
</para>
</details>
In my Details java object I have:
class Details {
@XmlElement(name="para")
private List<String> paragraphs;
}
The problem is that the only element in the paragraphs list is 'Line Number Two'. Does anyone know how I can deal with this?
You can represent mixed content with @XmlMixed as follows (note that it's applied to the content of a class itself rather than to its element, thus you need an additional class):
class Details {
@XmlElement(name="para")
private Para para;
...
}
class Para {
@XmlMixed
@XmlAnyElement
private List<Object> paragraphs;
...
}
paragraphs property will contain Strings for text lines and Elements for XML elements.
In that case the XML is not formed correctly. Put the entire data inside the tags within CDATA to avoid this issue. Refer - http://www.w3schools.com/xml/xml_cdata.asp
You could use @XmlAnyElement along with a DomHandler to preserve fragments of the XML document as a String. Below is a link to a complete example demonstrating how to do this:
http://blog.bdoughan.com/2011/04/xmlanyelement-and-non-dom-properties.html
