XmlUnit ignore order of elements when comparing XML files - java

I have the two following xml files:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pricebooks xmlns="http://www.blablabla.com">
<pricebook>
<header pricebook-id="my-id">
<currency>GBP</currency>
<display-name xml:lang="x-default">display name</display-name>
<description>my description 1</description>
</header>
<price-tables>
<price-table product-id="id1" mode="mode1">
<amount quantity="1">30.00</amount>
</price-table>
<price-table product-id="id2" mode="mode2">
<amount quantity="1">60.00</amount>
</price-table>
</price-tables>
</pricebook>
</pricebooks>
and
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pricebooks xmlns="http://www.blablabla.com">
<pricebook>
<header pricebook-id="my-id">
<currency>GBP</currency>
<display-name xml:lang="x-default">display name</display-name>
<description>my description 1</description>
</header>
<price-tables>
<price-table product-id="id2" mode="mode2">
<amount quantity="1">60.00</amount>
</price-table>
<price-table product-id="id1" mode="mode1">
<amount quantity="1">30.00</amount>
</price-table>
</price-tables>
</pricebook>
</pricebooks>
I'm trying to compare them while ignoring the order of the price-table elements, so for me those two are equal. I'm using
<dependency>
<groupId>org.xmlunit</groupId>
<artifactId>xmlunit-core</artifactId>
<version>2.5.0</version>
</dependency>
and the code is the following, but I'm not able to make it work. It complains because the attribute values id1 and id2 are different.
Diff myDiffSimilar = DiffBuilder
.compare(expected)
.withTest(actual)
.checkForSimilar()
.ignoreWhitespace()
.ignoreComments()
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText))
.build();
assertFalse(myDiffSimilar.hasDifferences());
I have also tried to edit the nodeMatcher as follows:
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.conditionalBuilder()
.whenElementIsNamed("price-tables")
.thenUse(ElementSelectors.byXPath("./price-table", ElementSelectors.byNameAndText))
.elseUse(ElementSelectors.byName)
.build()))
Thank you for your help.

I don't see any nested text inside your price-table elements at all, so byNameAndText matches on element names only - which will be the same for all price-table elements and thus not do what you want.
In your example there is no ambiguity for price-tables as there is only one anyway. So the byXPath approach looks wrong. At least in your snippet XMLUnit should do fine with byName except for the price-table elements.
I'm not sure whether product-id alone is what identifies your price-table elements or the combination of all attributes. Either byNameAndAttributes("product-id") or its byNameAndAllAttributes cousin should work.
If it is only product-id then byNameAndAttributes("product-id") becomes byName for all elements that don't have any product-id attribute at all. In this special case byNameAndAttributes("product-id") alone will work for your whole document as we can see it - more or less by accident.
If you need more complex rules for elements other than price-table, or you want to make things more explicit, then
ElementSelectors.conditionalBuilder()
.whenElementIsNamed("price-table")
.thenUse(ElementSelectors.byNameAndAttributes("product-id"))
// more special cases
.elseUse(ElementSelectors.byName)
.build()
is the better choice.
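To make the idea behind matching on name plus product-id concrete, here is a minimal sketch in plain JDK DOM code. It is not XMLUnit and not how XMLUnit is implemented; the class and method names are hypothetical, and it only compares the text content of each price-table, which is an assumption made to keep the example short. It shows why keying on product-id makes the element order irrelevant.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PriceTableMatchSketch {

    // Hypothetical helper: keys each price-table's text content by its product-id.
    static Map<String, String> amountsByProductId(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // the sample documents use a default namespace
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList tables = doc.getElementsByTagNameNS("*", "price-table");
            Map<String, String> amounts = new HashMap<>();
            for (int i = 0; i < tables.getLength(); i++) {
                Element table = (Element) tables.item(i);
                amounts.put(table.getAttribute("product-id"), table.getTextContent().trim());
            }
            return amounts;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Two documents match when every product-id maps to the same amount, whatever the order.
    static boolean sameIgnoringOrder(String expected, String actual) {
        return amountsByProductId(expected).equals(amountsByProductId(actual));
    }

    public static void main(String[] args) {
        String head = "<pricebooks xmlns=\"http://www.blablabla.com\"><pricebook><price-tables>";
        String tail = "</price-tables></pricebook></pricebooks>";
        String t1 = "<price-table product-id=\"id1\" mode=\"mode1\">"
                + "<amount quantity=\"1\">30.00</amount></price-table>";
        String t2 = "<price-table product-id=\"id2\" mode=\"mode2\">"
                + "<amount quantity=\"1\">60.00</amount></price-table>";
        // Same tables, swapped order
        System.out.println(sameIgnoringOrder(head + t1 + t2 + tail, head + t2 + t1 + tail));
    }
}
```

In a real test you would let XMLUnit do the pairing with the selector shown above; the sketch only illustrates what "identify the element by product-id instead of by position" means.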

Related

Jaxb unmarshal - elements with minoccurs 0 which should be combined

I have following XML file:
<?xml version="1.0" encoding="utf-8"?>
<Paragraph>
<ParaStyleName>headline_red</ParaStyleName>
<TextStyleRanges>
<TextStyleRange>
<CharStyleName>[Ohne]</CharStyleName>
<Contents>
<Content>inhalt</Content>
<Content>test text</Content>
<SpecialCharacter name="HARD_RETURN"/>
<Content> "text here</Content>
<SpecialCharacter name="DOUBLE_QUOTE_LEFT"/>
</Contents>
</TextStyleRange>
</TextStyleRanges>
</Paragraph>
From this xml I need to obtain the Content part like this:
inhalt test text HARD_RETURN "text here DOUBLE_QUOTE_LEFT
For me the tag order inside <Contents> is important. The problem is that the number of <Content> and <SpecialCharacter> elements is not fixed, and neither is their position.
Note: I'm using JAXB for this and I have created model classes for Contents, Content and SpecialCharacter. In Contents I have an ArrayList<Content> and an ArrayList<SpecialCharacter> as members, but with two separate lists I can't link them correctly to keep the original order of the tags.
Please HELP me with a solution for this case.
Thanks!
You are going to need to merge these two lists as follows:
@XmlElements({
@XmlElement(name="Content", type=Content.class),
@XmlElement(name="SpecialCharacter", type=SpecialCharacter.class)
})
public List<Object> getValues() {
return values;
}
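The point of the single merged list is that it preserves document order. To show the order the question is after without depending on a JAXB runtime, here is a rough sketch that uses plain JDK DOM instead of JAXB (a swapped-in technique, not the approach above). The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ContentsOrderSketch {

    // Hypothetical helper: flattens the children of the first <Contents> in document order.
    static String flatten(String xml) {
        try {
            Node contents = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName("Contents").item(0);
            List<String> parts = new ArrayList<>();
            for (Node n = contents.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace text nodes between elements
                }
                Element e = (Element) n;
                if ("Content".equals(e.getTagName())) {
                    parts.add(e.getTextContent().trim());
                } else if ("SpecialCharacter".equals(e.getTagName())) {
                    parts.add(e.getAttribute("name"));
                }
            }
            return String.join(" ", parts);
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse XML", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<Contents><Content>inhalt</Content><Content>test text</Content>"
                + "<SpecialCharacter name=\"HARD_RETURN\"/><Content> \"text here</Content>"
                + "<SpecialCharacter name=\"DOUBLE_QUOTE_LEFT\"/></Contents>";
        System.out.println(flatten(xml));
    }
}
```

With JAXB, iterating the merged List<Object> and checking each entry with instanceof would give the same ordering.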

XML Comparison using XMLUnit

I want to compare two XML files using XMLUnit (I don't want to reinvent something which is already present).
XML 1:
<?xml version="1.0"?>
<Product>
<Property>
<Container value="1">Test 01</Container>
<Container value="3">Test 02</Container>
<Container value="5">Test 03</Container>
</Property>
</Product>
XML2:
<?xml version="1.0"?>
<Product>
<Property>
<Container value="3">Test 01</Container>
<Container value="7">Test 02</Container>
<Container value="1">Test 03</Container>
<Container value="5">Test 04</Container>
</Property>
</Product>
I want to compare the elements only if the node along with the attribute matches. Also if the position is different then it should be similar.
I have tried DetailedDiff, but it reports a lot of results and I only want to extract specific changes. Please give your suggestions.
If you're after an order-independent comparison then ElementQualifier would help:
http://xmlunit.sourceforge.net/userguide/html/ar01s03.html
In some cases the order of elements in two pieces of XML may not be significant. If this is true, the DifferenceEngine needs help to determine which Elements to compare. This is the job of an ElementQualifier (see Section 3.4, “ElementQualifier”).
Specifically, ElementNameAndAttributeQualifier seems to match your requirements:
Only Elements with the same name - and Namespace URI if present - as well as the same values for all attributes given in ElementNameAndAttributeQualifier's constructor qualify.

Xml id attribute to work with Java's getElementById? [duplicate]

I have an xml document being parsed in Java as a w3c document.
In my XML, I have many elements of the same name, e.g. <item ..... />, each one with a unique attribute value, e.g. <item name="a" .... />.
I want in java to do:
doc.getElementById("a")
in order to get that specific item I have there with that name.
How can I tell Java to use 'name' as the id?
Or, alternatively, how can I fetch that specific item with the least complexity?
DOM is not the best API to easily query your document and get back found elements. Learn XPath, which is a more appropriate API, or iterate through the tree of elements by yourself.
getElementById() will only return the element which has the given id attribute (edit: marked as such in the document DTD or schema). It can't find by name attribute.
See Java XML DOM: how are id Attributes special? for details.
You need to write a DTD that defines your attribute as being of type ID.
Well, to make this a complete answer: I had to use a DTD, like everyone stated.
Since my needs are quite simple, I embedded it in my XML the following way:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ATTLIST item
name ID #REQUIRED
>
]>
<root> .... </root>
The only other important thing to know is that once you declare the ATTLIST, you have to declare all of the element's other attributes as well, so you need to add them as IMPLIED:
some-attribute CDATA #IMPLIED
It says that some-attribute contains character data (CDATA) and is implied, which means it may or may not be present. Note that PCDATA is not an alternative here: it applies to element content, not to attribute types.
So eventually, it'll look something like:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ATTLIST item
name ID #REQUIRED
some-attribute CDATA #IMPLIED
>
]>
<root> .... </root>
On the Java side, just call it directly, e.g. getElementById("some-name").
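Putting the pieces together, a minimal sketch with the JDK's built-in DOM parser might look like this. The item elements and attribute values are made up for illustration, and the class and method names are hypothetical. It assumes the default JDK parser, which picks up the ID type from the internal DTD subset.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class IdAttributeSketch {

    // Sample document: the internal DTD subset declares item/@name as an ID.
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<!DOCTYPE root [\n"
            + "<!ATTLIST item\n"
            + "  name ID #REQUIRED\n"
            + "  some-attribute CDATA #IMPLIED\n"
            + ">\n"
            + "]>\n"
            + "<root><item name=\"a\" some-attribute=\"first\"/><item name=\"b\"/></root>";

    // Hypothetical helper: looks an item up by its ID and returns some-attribute, or null.
    static String someAttributeOf(String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            Element item = doc.getElementById(id);
            return item == null ? null : item.getAttribute("some-attribute");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(someAttributeOf("a"));
        System.out.println(someAttributeOf("no-such-id"));
    }
}
```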
In order to make doc.getElementById("a") work you need to change your XML to <item id="a" name="a" .... />
If you can't change the XML, you could use XPath to retrieve this element.
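For the XPath route, a minimal sketch using the JDK's javax.xml.xpath API could look like the following. The sample document, its value attribute, and the class and method names are assumptions for illustration. The name is passed as an XPath variable so it does not have to be concatenated into the expression.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class FindItemByNameSketch {

    // Returns the first <item> whose name attribute equals the given name, or null.
    static Element findItem(Document doc, String name) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The expression uses a single variable, $name, so the resolver always returns name.
            xpath.setXPathVariableResolver(variable -> name);
            return (Element) xpath.evaluate("//item[@name=$name]", doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    // Hypothetical helper: parses the XML and returns the value attribute of the matching item.
    static String valueOf(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element item = findItem(doc, name);
            return item == null ? null : item.getAttribute("value");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><item name=\"a\" value=\"1\"/><item name=\"b\" value=\"2\"/></root>";
        System.out.println(valueOf(xml, "b"));
    }
}
```

This needs no DTD and no change to the XML, at the cost of evaluating an expression per lookup.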

Getting doc tag names while iterating the elements

I have this XML document:
<?xml version="1.0" encoding="UTF-16"?>
<root>
<items>
<item1>
<tag1>1</tag1>
<tag2>2</tag2>
<tag3>3</tag3>
</item1>
<item2>
<tag1>4</tag1>
<tag2>5</tag2>
<tag3>6</tag3>
</item2>
</items>
</root>
I want to iterate the item elements (item1, item2...), and for each tag get the tag name and after that the value of the tag.
I am using DOM parser.
Any ideas?
Sorry, but this ain't an unsolvable or complicated problem, this is simply reading a tutorial which can be googled within seconds.
And of course, you might also check the documentation, which will give you a hint about this handy method called "getNodeName()".
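For reference, a minimal sketch of that iteration with the JDK DOM parser could look like this. The class and method names are hypothetical, and the sample string leaves out the XML declaration for brevity. Whitespace between elements shows up as text nodes, so the sketch filters for element nodes before calling getNodeName().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TagNamesSketch {

    // Returns only the element children, skipping whitespace text nodes.
    static List<Element> childElements(Node parent) {
        List<Element> children = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                children.add((Element) n);
            }
        }
        return children;
    }

    // Builds one "item/tag = value" line per tag, in document order.
    static List<String> describe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node items = doc.getElementsByTagName("items").item(0);
            List<String> lines = new ArrayList<>();
            for (Element item : childElements(items)) {
                for (Element tag : childElements(item)) {
                    lines.add(item.getNodeName() + "/" + tag.getNodeName()
                            + " = " + tag.getTextContent());
                }
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><items>"
                + "<item1><tag1>1</tag1><tag2>2</tag2><tag3>3</tag3></item1>"
                + "<item2><tag1>4</tag1><tag2>5</tag2><tag3>6</tag3></item2>"
                + "</items></root>";
        describe(xml).forEach(System.out::println);
    }
}
```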

Update a single element in an xml document

Is it possible to parse and then modify a single element in an XML document?
I'm currently writing a script in ruby which needs to modify a value (specified by xpath) in an xml file. I'm currently using the REXML library to do this:
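require 'rexml/document'
include REXML # lets Document and XPath be used without the REXML:: prefix

xmldocument = Document.new(File.new(filename))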
property = XPath.first(xmldocument, "/parent/element/property")
property.text = "New property value"
puts xmldocument
Where the input xml is:
<?xml version="1.0" encoding="UTF-8"?>
<parent>
<element>
<property>Old property value</property>
<verbose />
</element>
...
(more elements here)
...
</parent>
And the output is:
<?xml version='1.0' encoding='UTF-8'?>
<parent>
<element>
<property>New property value</property>
<verbose/>
</element>
...
(more elements here)
...
</parent>
Notice that the output XML is slightly reformatted, so more than my desired change is made. For example, the tag <verbose /> is changed to <verbose/>, and double quotes are replaced with single quotes in the first line.
What is the best way to modify just a given element of an xml file and leave the rest of the file intact? Ideally, there is a solution for Ruby but I'd love to know the solution in other languages such as Java.
In Java, the Saxon library should accomplish everything you're looking for:
http://sourceforge.net/projects/saxon/
