XML Comparison using XMLUnit - java

I want to compare two XML files using XMLUnit (I don't want to reinvent something which is already present).
XML 1:
<?xml version="1.0"?>
<Product>
<Property>
<Container value="1">Test 01</Container>
<Container value="3">Test 02</Container>
<Container value="5">Test 03</Container>
</Property>
</Product>
XML 2:
<?xml version="1.0"?>
<Product>
<Property>
<Container value="3">Test 01</Container>
<Container value="7">Test 02</Container>
<Container value="1">Test 03</Container>
<Container value="5">Test 04</Container>
</Property>
</Product>
I want to compare elements only when both the node name and the attribute match. If matching elements merely appear at a different position, the documents should be reported as similar.
I have tried DetailedDiff, but it reports a lot of results and I only want to extract specific changes. Please give your suggestions.

If you're after an order-independent comparison then ElementQualifier would help:
http://xmlunit.sourceforge.net/userguide/html/ar01s03.html
In some cases the order of elements in two pieces of XML may not be significant. If this is true, the DifferenceEngine needs help to determine which Elements to compare. This is the job of an ElementQualifier (see Section 3.4, “ElementQualifier”).
Specifically, ElementNameAndAttributeQualifier seems to match your requirements:
Only Elements with the same name - and Namespace URI if present - as well as the same values for all attributes given in ElementNameAndAttributeQualifier's constructor qualify.
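A minimal sketch of how this could look with the XMLUnit 1.x API (package org.custommonkey.xmlunit), assuming the value attribute is what identifies a Container. The class name, method name, and inlined XML strings are illustrative only, and deciding which of the reported differences count as the "specific changes" is left to the caller.

```java
import java.util.List;
import org.custommonkey.xmlunit.DetailedDiff;
import org.custommonkey.xmlunit.Diff;
import org.custommonkey.xmlunit.ElementNameAndAttributeQualifier;
import org.custommonkey.xmlunit.XMLUnit;

final class ContainerComparison {

    // The two documents from the question, inlined for illustration.
    static final String XML_1 =
        "<Product><Property>"
        + "<Container value=\"1\">Test 01</Container>"
        + "<Container value=\"3\">Test 02</Container>"
        + "<Container value=\"5\">Test 03</Container>"
        + "</Property></Product>";

    static final String XML_2 =
        "<Product><Property>"
        + "<Container value=\"3\">Test 01</Container>"
        + "<Container value=\"7\">Test 02</Container>"
        + "<Container value=\"1\">Test 03</Container>"
        + "<Container value=\"5\">Test 04</Container>"
        + "</Property></Product>";

    /** Builds a Diff that pairs elements by name plus their "value" attribute. */
    static Diff diffByValueAttribute(String control, String test) throws Exception {
        // Note: this is a global XMLUnit setting, not a per-Diff one.
        XMLUnit.setIgnoreWhitespace(true);
        Diff diff = new Diff(control, test);
        // Elements qualify for comparison only if name and "value" attribute agree,
        // regardless of their position among siblings.
        diff.overrideElementQualifier(new ElementNameAndAttributeQualifier("value"));
        return diff;
    }

    public static void main(String[] args) throws Exception {
        Diff diff = diffByValueAttribute(XML_1, XML_2);
        System.out.println("similar: " + diff.similar());
        // DetailedDiff lists every difference instead of stopping at the first one.
        List<?> differences = new DetailedDiff(diff).getAllDifferences();
        for (Object difference : differences) {
            System.out.println(difference);
        }
    }
}
```

With the qualifier in place, a pure reordering of Container elements should be reported as a recoverable (similar) difference, while differing text under the same value attribute is still reported as a real difference.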


XmlUnit ignore order of elements when comparing XML files

I have the two following xml files:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pricebooks xmlns="http://www.blablabla.com">
<pricebook>
<header pricebook-id="my-id">
<currency>GBP</currency>
<display-name xml:lang="x-default">display name</display-name>
<description>my description 1</description>
</header>
<price-tables>
<price-table product-id="id1" mode="mode1">
<amount quantity="1">30.00</amount>
</price-table>
<price-table product-id="id2" mode="mode2">
<amount quantity="1">60.00</amount>
</price-table>
</price-tables>
</pricebook>
</pricebooks>
and
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<pricebooks xmlns="http://www.blablabla.com">
<pricebook>
<header pricebook-id="my-id">
<currency>GBP</currency>
<display-name xml:lang="x-default">display name</display-name>
<description>my description 1</description>
</header>
<price-tables>
<price-table product-id="id2" mode="mode2">
<amount quantity="1">60.00</amount>
</price-table>
<price-table product-id="id1" mode="mode1">
<amount quantity="1">30.00</amount>
</price-table>
</price-tables>
</pricebook>
</pricebooks>
I'm trying to compare them while ignoring the order of the price-table elements, so for me those two documents are equal. I'm using
<dependency>
<groupId>org.xmlunit</groupId>
<artifactId>xmlunit-core</artifactId>
<version>2.5.0</version>
</dependency>
and the code is the following, but I'm not able to make it work. It complains because the attribute values id1 and id2 are different.
Diff myDiffSimilar = DiffBuilder
.compare(expected)
.withTest(actual)
.checkForSimilar()
.ignoreWhitespace()
.ignoreComments()
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.byNameAndText))
.build();
assertFalse(myDiffSimilar.hasDifferences());
I have also tried to edit the nodeMatcher as follows:
.withNodeMatcher(new DefaultNodeMatcher(ElementSelectors.conditionalBuilder()
.whenElementIsNamed("price-tables")
.thenUse(ElementSelectors.byXPath("./price-table", ElementSelectors.byNameAndText))
.elseUse(ElementSelectors.byName)
.build()))
Thank you for your help.
I don't see any nested text inside your price-table elements at all, so byNameAndText matches on element names only, which will be the same for all price-table elements and thus does not do what you want.
In your example there is no ambiguity for price-tables, as there is only one anyway, so the byXPath approach looks wrong. At least in your snippet, XMLUnit should do fine with byName for everything except the price-table elements.
I'm not sure whether product-id alone is what identifies your price-table elements or the combination of all attributes. Either byNameAndAttributes("product-id") or its byNameAndAllAttributes cousin should work.
If it is only product-id, then byNameAndAttributes("product-id") becomes byName for all elements that don't have any product-id attribute at all. In this special case byNameAndAttributes("product-id") alone will work for your whole document as far as we can see it, more or less by accident.
If you need more complex rules for elements other than price-table, or you want to make things more explicit, then
ElementSelectors.conditionalBuilder()
.whenElementIsNamed("price-table")
.thenUse(ElementSelectors.byNameAndAttributes("product-id"))
// more special cases
.elseUse(ElementSelectors.byName)
.build()
is the better choice.
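For reference, here is a sketch of the conditional selector plugged into a complete XMLUnit 2 comparison. It assumes xmlunit-core is on the classpath and that product-id alone identifies a price-table; the class name, method name, and the trimmed-down sample documents are illustrative only.

```java
import org.xmlunit.builder.DiffBuilder;
import org.xmlunit.builder.Input;
import org.xmlunit.diff.DefaultNodeMatcher;
import org.xmlunit.diff.Diff;
import org.xmlunit.diff.ElementSelectors;

final class PriceTableComparison {

    // Shortened versions of the two documents from the question.
    static final String EXPECTED =
        "<pricebooks xmlns=\"http://www.blablabla.com\"><pricebook><price-tables>"
        + "<price-table product-id=\"id1\" mode=\"mode1\"><amount quantity=\"1\">30.00</amount></price-table>"
        + "<price-table product-id=\"id2\" mode=\"mode2\"><amount quantity=\"1\">60.00</amount></price-table>"
        + "</price-tables></pricebook></pricebooks>";

    static final String ACTUAL =
        "<pricebooks xmlns=\"http://www.blablabla.com\"><pricebook><price-tables>"
        + "<price-table product-id=\"id2\" mode=\"mode2\"><amount quantity=\"1\">60.00</amount></price-table>"
        + "<price-table product-id=\"id1\" mode=\"mode1\"><amount quantity=\"1\">30.00</amount></price-table>"
        + "</price-tables></pricebook></pricebooks>";

    /** Compares two documents, pairing price-table elements by their product-id. */
    static Diff compareIgnoringPriceTableOrder(String expected, String actual) {
        return DiffBuilder
            .compare(Input.fromString(expected))
            .withTest(Input.fromString(actual))
            .checkForSimilar()      // differences in child order count as "similar"
            .ignoreWhitespace()
            .ignoreComments()
            .withNodeMatcher(new DefaultNodeMatcher(
                ElementSelectors.conditionalBuilder()
                    .whenElementIsNamed("price-table")
                    .thenUse(ElementSelectors.byNameAndAttributes("product-id"))
                    .elseUse(ElementSelectors.byName)
                    .build()))
            .build();
    }

    public static void main(String[] args) {
        Diff diff = compareIgnoringPriceTableOrder(EXPECTED, ACTUAL);
        System.out.println("has differences: " + diff.hasDifferences());
        System.out.println(diff);
    }
}
```

If mode is also part of what identifies a price-table, byNameAndAllAttributes can be swapped in for byNameAndAttributes("product-id").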

XStream read back xml (java)

I have an xml with a format like this:
<list>
<book>
<price>20</price>
<author>sam</author>
<features type="comic">
<pocket>yes</pocket>
</features>
<avaiable>yes</avaiable>
</book>
<book>
<price>50</price>
<author>john</author>
<features type="novel">
<manga>no</manga>
</features>
<avaiable>yes</avaiable>
</book>
</list>
What I need is to read the features type first, since this attribute determines which class I need to instantiate in Java. I've tried to read down to this node and then moveUp() to the book node again, but when I try to go down again to get the rest of the node values it's not possible. It seems that nodes cannot be read again once you have moved down past them.
So my question is: is there a way to first get a specific node and then move back to read the other nodes?

Xml id attribute to work with Java's getElementById? [duplicate]

I have an XML document being parsed in Java as a W3C Document.
In my XML, I have many elements of the same name, e.g. <item ..... />, each with a unique attribute value, e.g. <item name="a" .... />.
In Java, I want to do:
doc.getElementById("a")
in order to get the specific item that has that name.
How can I tell Java to use 'name' as the id?
Or, alternatively, how can I fetch that specific item with the least complexity?
DOM is not the best API to easily query your document and get back found elements. Learn XPath, which is a more appropriate API, or iterate through the tree of elements by yourself.
getElementById() will only return the element which has the given id attribute (edit: marked as such in the document DTD or schema). It can't find by name attribute.
See Java XML DOM: how are id Attributes special? for details.
You need to write a DTD that defines your attribute as being of type ID.
To make this a complete answer: I had to use a DTD, as everyone stated.
Since my needs are quite simple, I embedded it in my XML the following way:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ATTLIST item
name ID #REQUIRED
>
]>
<root> .... </root>
The only important thing left to know is that once you declare the ATTLIST, and the document is validated against the DTD, all the remaining attributes of item have to be declared as well. Optional ones can be marked as IMPLIED:
some-attribute CDATA #IMPLIED
It says that some-attribute holds character data (CDATA is the attribute type for plain text; #PCDATA belongs to element content models and is not an attribute type) and is implied, which means it is optional: it may or may not be present.
So eventually, it'll look something like:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ATTLIST item
name ID #REQUIRED
some-attribute CDATA #IMPLIED
>
]>
<root> .... </root>
On the Java side, just use it directly, e.g. getElementById("some-name").
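A small self-contained sketch of this approach using only the JDK's DOM parser. The class name and the sample document are made up for illustration; it assumes the parser reads the internal DTD subset, which the JDK's default DocumentBuilder is expected to do even without validation switched on.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class ItemByIdLookup {

    // Sample document: the internal DTD subset declares item/@name as type ID.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE root [\n"
        + "  <!ATTLIST item\n"
        + "    name ID #REQUIRED\n"
        + "    some-attribute CDATA #IMPLIED\n"
        + "  >\n"
        + "]>\n"
        + "<root><item name=\"a\" some-attribute=\"x\"/><item name=\"b\"/></root>";

    /** Parses the given XML text into a DOM Document. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        // Works only because "name" is declared as an ID attribute in the DTD.
        Element item = doc.getElementById("a");
        System.out.println(item == null ? "not found" : item.getAttribute("some-attribute"));
    }
}
```

Note that ID values must be valid XML names, so values that start with a digit or contain spaces would not qualify.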
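In order to make doc.getElementById("a") work you would need to change your XML to <item id="a" name="a" .... /> and have that attribute declared as type ID in a DTD or schema; the attribute name alone does not make it an ID.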
If you can't change the XML, you could use XPath to retrieve this element.
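A sketch of the XPath route using the JDK's javax.xml.xpath package, for the case where the XML cannot be changed. The class and method names and the sample document are hypothetical; the lookup value is passed as an XPath variable so that it does not have to be spliced into the expression string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class ItemByNameXPath {

    /** Parses the given XML text into a DOM Document. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the first item element whose name attribute equals the given value, or null. */
    static Element findItemByName(Document doc, String name) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve $name to the requested value instead of concatenating strings.
        xpath.setXPathVariableResolver(
            variable -> "name".equals(variable.getLocalPart()) ? name : null);
        return (Element) xpath.evaluate("//item[@name = $name]", doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        // Illustrative document; no DTD is needed for this approach.
        Document doc = parse("<root><item name=\"a\" price=\"1\"/><item name=\"b\" price=\"2\"/></root>");
        Element item = findItemByName(doc, "b");
        System.out.println(item == null ? "not found" : item.getAttribute("price"));
    }
}
```

The //item expression scans the whole tree on every call; for many lookups on a large document, building a Map from name to Element in one pass may be the cheaper option.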

XML parsing:Retrieve multiple rows in xml using digester

While parsing an XML file like the one below, I want to get the list of telephone numbers for one particular id. I am using Digester to do this, but I don't understand how to add the call methods or create the objects. Can anyone help me with this? My XML file contains thousands of such types.
<?xml version='1.0' encoding='utf-8'?>
<address-book>
<contact type="individual">
<id>50</id>
<city>New York</city>
<province>NY</province>
<postalcode>10013</postalcode>
<country>USA</country>
<address>
<telephone>1-212-345-6789</telephone>
<telephone>1-212-345-6789</telephone>
<telephone>1-212-345-6789</telephone>
<telephone>1-212-345-6789</telephone>
</address>
</contact>
<contact type="business">
<id>52</id>
<city>Zagreb</city>
<province></province>
<postalcode>10000</postalcode>
<country>Croatia</country>
<address>
<telephone>1-212-345-6789</telephone>
<telephone>1-212-345-6789</telephone>
<telephone>1-212-345-6789</telephone>
<telephone>1-212-345-6789</telephone>
</address>
</contact>
Also, how should I stop the parsing once I have found the required id?
Although the question is specific to apache-commons-digester, this can be solved with the XML libraries already available, namely a SAX parser coupled with an XPath search. If what is being searched for is known, an XPath query can find the data relatively efficiently instead of brute-forcing through it. Otherwise, if you are traversing the entire data set for indexing or other purposes, I would again recommend a simple SAX parser that loops through the elements (possibly via an //MyElement type XPath query) and passes each value to a function for indexing or whatever operation is needed. The apache-commons-digester may be overly complicated and/or slow for what is needed.
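As a rough sketch of the SAX route with the JDK's built-in parser: the handler below collects the telephone values of one contact and aborts parsing by throwing a marker exception once that contact is closed, which is a common idiom rather than an official stop API. It assumes that id appears before address inside each contact, as in the sample; the class names and the sample numbers are made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class TelephoneLookup {

    // Illustrative data in the shape of the question's file.
    static final String SAMPLE =
        "<address-book>"
        + "<contact type=\"individual\"><id>50</id><address>"
        + "<telephone>1-212-555-0101</telephone><telephone>1-212-555-0102</telephone>"
        + "</address></contact>"
        + "<contact type=\"business\"><id>52</id><address>"
        + "<telephone>1-212-555-0103</telephone>"
        + "</address></contact>"
        + "</address-book>";

    /** Marker exception used to abort parsing once the wanted contact is complete. */
    private static final class StopParsing extends SAXException {
        StopParsing() { super("wanted contact found"); }
    }

    private static final class Handler extends DefaultHandler {
        private final String wantedId;
        private final List<String> phones = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private boolean inWantedContact;

        Handler(String wantedId) { this.wantedId = wantedId; }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            String value = text.toString().trim();
            if ("id".equals(qName)) {
                inWantedContact = wantedId.equals(value);
            } else if ("telephone".equals(qName) && inWantedContact) {
                phones.add(value);
            } else if ("contact".equals(qName) && inWantedContact) {
                throw new StopParsing(); // nothing more to read for this id
            }
            text.setLength(0);
        }
    }

    /** Returns the telephone numbers of the contact with the given id (empty if absent). */
    static List<String> telephonesFor(String xml, String contactId) throws Exception {
        Handler handler = new Handler(contactId);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsing done) {
            // Expected: parsing was cut short on purpose.
        }
        return handler.phones;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(telephonesFor(SAMPLE, "50"));
    }
}
```

For a file on disk, the InputSource would wrap a file stream instead of a StringReader; the handler itself stays the same.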

Update a single element in an xml document

Is it possible to parse and then modify a single element in an XML document?
I'm currently writing a script in ruby which needs to modify a value (specified by xpath) in an xml file. I'm currently using the REXML library to do this:
xmldocument = Document.new(File.new(filename))
property = XPath.first(xmldocument, "/parent/element/property")
property.text = "New property value"
puts xmldocument
Where the input xml is:
<?xml version="1.0" encoding="UTF-8"?>
<parent>
<element>
<property>Old property value</property>
<verbose />
</element>
...
(more elements here)
...
</parent>
And the output is:
<?xml version='1.0' encoding='UTF-8'?>
<parent>
<element>
<property>New property value</property>
<verbose/>
</element>
...
(more elements here)
...
</parent>
Notice that the output XML is slightly reformatted, so more than my desired change is made. For example, the tag <verbose /> is changed to <verbose/>, and double quotes are replaced with single quotes in the first line.
What is the best way to modify just a given element of an xml file and leave the rest of the file intact? Ideally, there is a solution for Ruby but I'd love to know the solution in other languages such as Java.
In Java, the Saxon library should accomplish everything you're looking for:
http://sourceforge.net/projects/saxon/
