Adding content from one XML file to another in Java

I have a properties file whose values are the contents of name tags in two XML files, a source and a destination.
For each value in the properties file, I need to check whether a name tag with that value exists in the destination XML. If it does, nothing should happen. If it does not, the source XML should be searched for a name tag with that value, and once it is found, the node containing it should be copied from source.xml into destination.xml.
Please help me with this Java code:
private void updateCofigDestn() throws ParserConfigurationException, TransformerConfigurationException, TransformerException, IOException, SAXException {
prop = loadConfigProperties();
String ConfigSrcFile = prop.getProperty("ConfigSourceFile");
String ConfigDesnFile = prop.getProperty("ConfigDestnFile");
System.out.println("\nConfig Destn Path update config :: " + ConfigDesnFile);
File configSrcFile = new File(ConfigSrcFile + "\\config.xml");
File configDstnFile = new File(ConfigDesnFile + "\\config.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setValidating(false);
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document docSrc = dBuilder.parse(configSrcFile);
Document docDestn = dBuilder.parse(configDstnFile);
Set < Object > keys = getAllKeys();
for (Object k: keys) {
if (k.toString().startsWith("JDBC")) {
System.out.println("Inside Keys");
String key = (String) k;
keyVal = getPropertyValue(key);
System.out.println(key + ": " + getPropertyValue(key));
NodeList listSrc =
docSrc.getElementsByTagName("jdbc-system-resource");
NodeList listDsn =
docDestn.getElementsByTagName("jdbc-system-resource");
System.out.println("listDsn.item(0)" + listDsn.item(0).getTextContent());
if (listDsn.item(0) != null) {
for (int t = 0; t < listDsn.getLength(); t++) {
Element elmntDsn1 = (Element) listDsn.item(t);
String DsNameDsn1 = elmntDsn1.getElementsByTagName("name").item(0).getTextContent();
System.out.println("DS At DESTN in Update Conf " + DsNameDsn1);
if (keyVal.equalsIgnoreCase(DsNameDsn1)) {} else {
for (int temp = 0; temp < listSrc.getLength(); temp++) {
Element elmntSrc = (Element) listSrc.item(temp);
String DsNameSrc = elmntSrc.getElementsByTagName("name").item(0).getTextContent();
// elmntSrc.getElementsByTagName(keyVal).item(0).getTextContent();
// configDestn(keyVal);
//System.out.println("value bool >>>>> " +res ) ;
if (keyVal.equalsIgnoreCase(DsNameSrc) && keyVal != null) {
Node copiedNode = docDestn.importNode(elmntSrc, true);
docDestn.getDocumentElement().appendChild(copiedNode);
System.out.println(" Updating the destination Config File");
TransformerFactory.newInstance().newTransformer().transform(new DOMSource(docDestn),
new StreamResult(new FileWriter(configDstnFile)));
}
}
}
}
} else {
System.out.println("Destination List is null ");
for (int temp = 0; temp < listSrc.getLength(); temp++) {
Element elmntSrc = (Element) listSrc.item(temp);
String elmntValSrc = elmntSrc.getElementsByTagName("name").item(0).getTextContent();
if (keyVal.equalsIgnoreCase(elmntValSrc) &&
keyVal != null) {
Node copiedNode = docDestn.importNode(elmntSrc, true);
docDestn.getDocumentElement().appendChild(copiedNode);
System.out.println(" Updating the destination Config File in NULL");
TransformerFactory.newInstance().newTransformer().transform(new DOMSource(docDestn),
new StreamResult(new FileWriter(configDstnFile)));
}
}
}
}
}
}
For example:
config.properties
file1 = def
file2 = xyz
file3 = abc
source.xml
<domain>
<node0>
<name>xyz</name>
</node0>
<node1>
<name>abc</name>
</node1>
<node2>
<name>def</name>
</node2>
</domain>
destination.xml
<domain>
<node1>
<name>abc</name>
</node1>
</domain>
Step 1: It takes the value of file1, 'def', from the properties file and looks for it in destination.xml. Since it is not there, the node is appended.
Step 2: It takes the next value, 'xyz' for file2, from the properties file and looks for it in destination.xml. Since it is not there, the node is appended.
Step 3: It takes the next value, 'abc' for file3, from the properties file and looks for it in destination.xml. Since it is already there, nothing is appended.
The destination.xml should then look like this:
<domain>
<node1>
<name>abc</name>
</node1>
<node0>
<name>xyz</name>
</node0>
<node2>
<name>def</name>
</node2>
</domain>
This is what I need to do in Java. I have tried a lot of code without success.
Please help me out with this.

You're almost there.
The mistake is that you nest three for-loops where only two levels are needed.
The loop over listSrc should not sit inside the loop over listDsn: first scan the whole destination for the value, and only if it is missing scan the source.
Try the version below.
package com.tmp;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.Map;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class Tmp {
public static void main( String[] args ) {
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder builder = builderFactory.newDocumentBuilder();
XPath path = XPathFactory.newInstance().newXPath();
Document destDocument = builder.parse( new FileInputStream( "D:\\tmp\\destination.xml" ) );
Document srcDocument = builder.parse( new FileInputStream( "D:\\tmp\\source.xml" ) );
Element destRootEle = destDocument.getDocumentElement();
Element srcRootEle = srcDocument.getDocumentElement();
Properties properties = new Properties();
properties.load( new FileInputStream( "D:\\tmp\\config.properties" ) );
// read properties from config.properties file one by one
for ( Map.Entry<Object, Object> entry : properties.entrySet() ) {
String propVal = (String) entry.getValue();
NodeList destNodeList = (NodeList) path.evaluate( "//name", destRootEle, XPathConstants.NODESET );
boolean destNodeNotExist = true;
// iterate through the destination.xml to check whether property value is exist or not
for ( int i = 0; i < destNodeList.getLength(); i++ ) {
Node node = destNodeList.item( i );
if ( propVal.trim().equals( node.getTextContent().trim() ) ) {
destNodeNotExist = false;
break;
}
}
// if the property value is not found in destination.xml then check for the node in source.xml to add to the destination.xml
if ( destNodeNotExist ) {
NodeList srcNodeList = (NodeList) path.evaluate( "//name", srcRootEle, XPathConstants.NODESET );
for ( int i = 0; i < srcNodeList.getLength(); i++ ) {
Node missingNodeToAdd = srcNodeList.item( i );
if ( propVal.trim().equals( missingNodeToAdd.getTextContent().trim() ) ) {
destRootEle.appendChild( destDocument.adoptNode( missingNodeToAdd.getParentNode() ) );
break;
}
}
}
}
// save the changes made to destination.xml file into file system
Transformer tr = TransformerFactory.newInstance().newTransformer();
tr.setOutputProperty( OutputKeys.INDENT, "yes" );
tr.setOutputProperty( OutputKeys.OMIT_XML_DECLARATION, "yes" );
tr.setOutputProperty( OutputKeys.ENCODING, "UTF-8" );
tr.transform( new DOMSource( destDocument ), new StreamResult( new FileOutputStream( "D:\\tmp\\destination.xml" ) ) );
} catch ( Exception e ) {
e.printStackTrace();
}
}
}
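The same two-level idea can be tried without touching any files. The following is a minimal sketch rather than the answer's exact code: the class XmlMergeSketch and its helpers findName and merge are hypothetical names, the XML is passed in as strings, and importNode is used instead of adoptNode so that the source document is left unchanged.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlMergeSketch {

    // Hypothetical helper: first <name> element whose trimmed text equals value, or null.
    static Node findName(Document doc, String value) {
        NodeList names = doc.getElementsByTagName("name");
        for (int i = 0; i < names.getLength(); i++) {
            if (value.trim().equals(names.item(i).getTextContent().trim())) {
                return names.item(i);
            }
        }
        return null;
    }

    // Copies into destXml the parent of every <name> that is listed in values,
    // missing from the destination, and present in the source.
    static String merge(String srcXml, String destXml, List<String> values) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document src = builder.parse(new InputSource(new StringReader(srcXml)));
            Document dest = builder.parse(new InputSource(new StringReader(destXml)));
            for (String value : values) {
                if (findName(dest, value) != null) {
                    continue; // already in the destination, nothing to do
                }
                Node inSrc = findName(src, value);
                if (inSrc != null) {
                    // importNode makes a deep copy owned by dest; src stays intact
                    Node copy = dest.importNode(inSrc.getParentNode(), true);
                    dest.getDocumentElement().appendChild(copy);
                }
            }
            Transformer tr = TransformerFactory.newInstance().newTransformer();
            tr.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tr.transform(new DOMSource(dest), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("merge failed", e);
        }
    }
}
```

Because the destination is searched again for every value, a node appended for one property is also seen by later properties, so the same node is not added twice.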

Related

Get element text content from xml with <break> using DOM

I have the following part from xml file:
<database>
<document form='Record'>
<item name='SystemsList'><text>2000;Generl;All equipment<break/>
2001;General;All equipment<break/>
2002;General;All equipment<break/>
2003;General;All Equipment</text></item>
<item name='RmNumber'><text>001</text></item>
<item name='Reason'><text>Don't know</text></item>
<item name='Something'><text>smth</text></item>
</document>
</database>
For now I use the following code:
Document doc1 = dBuilder.parse(fXmlFile1);
doc1.getDocumentElement().normalize();
NodeList kList1 =doc1.getElementsByTagName("item");
for(int temp=0;temp<kList1.getLength();temp++)
{
Node kNode1=kList1.item(temp);
//System.out.println("\nCurrent Element :" + kNode.getNodeName());
if (kNode1.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) kNode1;
//System.out.println("node name"+eElement.getNodeName());
Node in=eElement.getFirstChild();
//System.out.println("__________________"+in.getFirstChild().getTextContent());
//System.out.println("IN text content----:"+in.getTextContent()+":--------");
if(eElement.getAttribute("name").equals("SystemsList")==true)
{
NodeList kList2=in.getChildNodes();
//if((in.getTextContent()!=null)&&!(in.getTextContent()).isEmpty()&& !(in.getTextContent().length()==0))
//{
for(int k=0;k<kList2.getLength();k++)
{
Node kNode2 = kList2.item(k);
if((kNode2.getTextContent()!=null)&&!(kNode2.getTextContent()).isEmpty()&& !(kNode2.getTextContent().length()==0))
stringBuilder.append(kNode2.getTextContent()+"\n");
}
//}
}
}
}
String s=new String(stringBuilder);
String sa[]=s.split("\n");
System.out.println("size"+sa.length);
for(String st:sa)
{
System.out.println(st);
}
This code produces the following String: "2000;Generl;All equipment2001;General;All equipment2002;General;All equipment2003;General;All Equipment".
The question is: how can I turn this XML part with <break/> into an ArrayList where each element is one line from the XML above, or simply into a String array, for example SystemsListByYear[0]="2000;Generl;All equipment", SystemsListByYear[1]="2001;General;All equipment", and so on?
P.S. I use the DOM library.
Edited-question-to-correct
Edit part:
if (kNode1.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) kNode1;
//System.out.println("node name"+eElement.getNodeName());
Node in=eElement.getFirstChild();
//System.out.println("__________________"+in.getFirstChild().getTextContent());
//System.out.println("IN text content----:"+in.getTextContent()+":--------");
if(eElement.getAttribute("name").equals("SystemsList")==true)
{
NodeList kList2=in.getChildNodes();
//if((in.getTextContent()!=null)&&!(in.getTextContent()).isEmpty()&& !(in.getTextContent().length()==0))
//{
for(int k=0;k<kList2.getLength();k++)
{
Node kNode2 = kList2.item(k);
if((kNode2.getTextContent()!=null)&&!(kNode2.getTextContent()).isEmpty()&& !(kNode2.getTextContent().length()==0))
stringBuilder.append(kNode2.getTextContent()+"\n");
}
//}
}
}
Then this will solve your problem
package com.test;
import java.io.File;
import java.io.FileInputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class Test {
public static void main(String args[]) throws Exception {
FileInputStream fileInputStream = new FileInputStream(new File(
"src/file.xml"));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document doc1 = builder.parse(fileInputStream);
doc1.getDocumentElement().normalize();
NodeList kList1 = doc1.getElementsByTagName("item");
List<String> alist=new ArrayList<String>();
StringBuilder stringBuilder=new StringBuilder();
String SystemsListByYear;
for (int temp = 0; temp < kList1.getLength(); temp++) {
Node kNode1 = kList1.item(temp);
System.out.println("\nCurrent Element :" + kNode1.getNodeName());
if (kNode1.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) kNode1;
System.out.println("node name"+eElement.getNodeName());
Node in=eElement.getFirstChild();
if((in.getTextContent()!=null)&&!(in.getTextContent()).isEmpty()&& !(in.getTextContent().length()==0))
stringBuilder.append(in.getTextContent());
}
}
String s=new String(stringBuilder);
String sa[]=s.split("\n");
System.out.println("size"+sa.length);
for(String st:sa)
{
System.out.println(st);
}
}
}
output
node nameitem
size4
2000;Generl;All equipment
2001;General;All equipment
2002;General;All equipment
2003;General;All Equipment
Split the text content at <break/> and add each split element to an ArrayList.
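One way to read that suggestion with plain DOM is to treat every text node between two <break/> elements as one line. The sketch below assumes the structure shown in the question (an item with name='SystemsList' containing a text element); the class BreakSplitSketch and the method splitOnBreak are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class BreakSplitSketch {

    // Returns one list entry per text run inside <item name='SystemsList'><text>...</text>.
    static List<String> splitOnBreak(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            NodeList items = doc.getElementsByTagName("item");
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                if (!"SystemsList".equals(item.getAttribute("name"))) {
                    continue;
                }
                Node text = item.getElementsByTagName("text").item(0);
                if (text == null) {
                    continue;
                }
                // The children of <text> alternate between text nodes and <break/> elements.
                for (Node child = text.getFirstChild(); child != null; child = child.getNextSibling()) {
                    if (child.getNodeType() == Node.TEXT_NODE) {
                        String line = child.getNodeValue().trim();
                        if (!line.isEmpty()) {
                            lines.add(line);
                        }
                    }
                }
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

Calling lines.toArray(new String[0]) on the result gives the String array form asked for in the question.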

Read values from a complex xml using java

Hi, I am new to Java and am trying to read an XML file.
Here is my XML file:
<?xml version="1.0" encoding="UTF-8"?>
<parameter>
<attribute>a</attribute>
I am trying to read the keys and values from the XML, but I am stuck. Here is my code:
public class TestDBMain {
public static void main(String[] args) throws Exception {
// TODO Auto-generated method stub
File file = new File("ACL.xml");
DocumentBuilderFactory dbfactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = dbfactory.newDocumentBuilder();
Document doc = builder.parse(file);
NodeList nList = doc.getElementsByTagName("testCaseDataName");
for(int i = 0;i<nList.getLength();i++){
Node nNode = nList.item(i);
if(nNode.getNodeType()== Node.ELEMENT_NODE){
Element ele = (Element) nNode;
// System.out.println(ele.getTextContent());
//System.out.println(ele.getElementsByTagName("testCaseName").item(0).getTextContent());
System.out.println(ele.getAttributeNode("testCaseDataName"));
//I dont know which methods to use to print the key and value in the xml under parameter
}
}
}
}
Can anyone please help me with this
Disclaimer: I maintain the JDOM project, so I am biased.... but... this is an ideal use case for JDOM:
Document doc = new SAXBuilder().build(new File("ACL.xml"));
Element root = doc.getRootElement();
for (Element testcase : root.getChildren()) {
int id = Integer.parseInt(testcase.getChildText("id"));
String name = testcase.getChildText("testCaseName");
String expect = testcase.getChildText("expectedResult");
Map<String,String> params = new LinkedHashMap<String,String>();
Element parmemt = testcase.getChild("parameter");
if (parmemt != null) {
Iterator<Element> it = parmemt.getChildren().iterator();
while (it.hasNext()) {
Element key = it.next();
if (!"key".equals(key.getName())) {
throw new IllegalStateException("Expected key but got " + key);
}
if (!it.hasNext()) {
throw new IllegalStateException("Expected value for key " + key);
}
Element val = it.next();
if (!"value".equals(val.getName())) {
throw new IllegalStateException("Expected value but got " + val);
}
params.put(key.getValue(), val.getValue());
}
}
System.out.printf("Processing test case %d -> %s\n Expect %s\n Parameters: %s\n",
id, name, expect, params.toString());
}
For me this produces the output
Processing test case 1 -> EditTest
Expect nooptionsacltrue
Parameters: {}
Processing test case 2 -> AddTest
Expect featuresaddedacltrue
Parameters: {featues=w,f}
Processing test case 3 -> AddTest
Expect duplicateacltrue
Parameters: {projectType=NEW, Name=28HPM, status=ACTIVE, canOrder=Yes}
Your code selects the <testCaseDataName> nodes but never goes inside them.
Try this:
for(int i = 0;i<nList.getLength();i++){
NodeList nodeList = nList.item(i).getChildNodes();
for(int j = 0;j<nodeList.getLength();j++){
Node nNode = nodeList.item(j);
if(nNode.getNodeType()== Node.ELEMENT_NODE){
System.out.println(nNode.getNodeName() +" : "+nNode.getTextContent());
if(nNode.getNodeName().equals("parameter")){
NodeList param = nNode.getChildNodes();
System.out.println(" "+param.item(0).getNodeName() +" : "+param.item(0).getTextContent());
System.out.println(" "+param.item(1).getNodeName() +" : "+param.item(1).getTextContent());
}
}
}
}
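Note that getChildNodes() also returns the whitespace text nodes between elements, so item(0) and item(1) are not necessarily the first two child elements. A small sketch of a helper that keeps only element children is shown below; ChildElementsSketch and both method names are hypothetical, and the XML used to try it out is an assumed shape, since the question's file is not shown in full.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildElementsSketch {

    // Maps each child element's name to its trimmed text, skipping text and comment nodes.
    // If two children share a name, the later one overwrites the earlier one.
    static Map<String, String> childElements(Node parent) {
        Map<String, String> result = new LinkedHashMap<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return result;
    }

    // Convenience wrapper: parses xml and inspects the first element named tag.
    static Map<String, String> childElementsOfFirst(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node first = doc.getElementsByTagName(tag).item(0);
            return first == null ? new LinkedHashMap<String, String>() : childElements(first);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```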

DOM parser, why do I get just one child of an element?

my question is "DOM parser, why do I get just one child of an element?"
I looked into this and this one, but I do not get the point.
What I'm trying to do is the following:
I have an XML file (see the extract below) :
<POITEM>
<item>
<PO_ITEM>00010</PO_ITEM>
<SHORT_TEXT>ITEM_A</SHORT_TEXT>
<MATL_GROUP>20010102</MATL_GROUP>
<AGREEMENT>4600010076</AGREEMENT>
<AGMT_ITEM>00010</AGMT_ITEM>
<HL_ITEM>00000</HL_ITEM>
<NET_PRICE>1.000000000</NET_PRICE>
<QUANTITY>1.000</QUANTITY>
<PO_UNIT>EA</PO_UNIT>
</item>
<item>
<PO_ITEM>00020</PO_ITEM>
<SHORT_TEXT>ITEM_B</SHORT_TEXT>
<MATL_GROUP>20010102</MATL_GROUP>
<AGREEMENT>4600010080</AGREEMENT>
<AGMT_ITEM>00020</AGMT_ITEM>
<HL_ITEM>00000</HL_ITEM>
<NET_PRICE>5.000000000</NET_PRICE>
<QUANTITY>5.000</QUANTITY>
<PO_UNIT>EA</PO_UNIT>
</item>
</POITEM>
I only want to extract <PO_ITEM>, <SHORT_TEXT>, <MATL_GROUP>, <NET_PRICE>, <QUANTITY> and <PO_UNIT> and write it into another, smaller XML file.
So this is my code:
nodes = dcmt.getElementsByTagName("POITEM");
Element rootElement2 = doc1.createElement("PO_POSITIONS");
rootElement1.appendChild(rootElement2);
Element details2 = doc1.createElement("PO_DETAILS");
rootElement2.appendChild(details2);
for (int i = 0; i < nodes.getLength(); i++) {
Node node = nodes.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE) {
Element element = (Element) node;
Element position = doc1.createElement("position");
details2.appendChild(position);
Element poItm = doc1.createElement("PO_ITEM");
poItm.appendChild(doc1.createTextNode(getValue("PO_ITEM", element)));
position.appendChild(poItm);
Element matlGrp = doc1.createElement("MATL_GROUP");
matlGrp.appendChild(doc1.createTextNode(getValue("MATL_GROUP",element)));
position.appendChild(matlGrp);
Element poUnit = doc1.createElement("PO_UNIT");
poUnit.appendChild(doc1.createTextNode(getValue("PO_UNIT",element)));
position.appendChild(poUnit);
Element netPrice = doc1.createElement("NET_PRICE");
netPrice.appendChild(doc1.createTextNode(getValue("NET_PRICE",element)));
position.appendChild(netPrice);
Element shortTxt = doc1.createElement("SHORT_TEXT");
shortTxt.appendChild(doc1.createTextNode(getValue("SHORT_TEXT",element)));
position.appendChild(shortTxt);
//Element matl = doc2.createElement("MATERIAL");
//matl.appendChild(doc2.createTextNode(getValue("MATERIAL",element)));
//details2.appendChild(matl);
Element qnty = doc1.createElement("QUANTITY");
qnty.appendChild(doc1.createTextNode(getValue("QUANTITY",element)));
position.appendChild(qnty);
/*Element preqNr = doc1.createElement("PREQ_NO");
preqNr.appendChild(doc1.createTextNode(getValue("PREQ_NO",element)));
details2.appendChild(preqNr); */
}
}
So far so good: I get a new XML file, but it only holds the first entry. As I understand it, nodes = dcmt.getElementsByTagName("POITEM"); enters the first <item>, reads up to the first </item>, and then leaves the loop. How do I step into the next item? Do I need some kind of loop to access each <item>?
By the way, changing the structure of the XML file is not an option, since I get the file from an interface.
Or am I making a mistake when writing the new XML file?
The output looks like this:
<PO_POSITIONS>
<PO_DETAILS>
<position>
<PO_ITEM>00010</PO_ITEM>
<MATL_GROUP>20010102</MATL_GROUP>
<PO_UNIT>EA</PO_UNIT>
<NET_PRICE>1.00000000</NET_PRICE>
<SHORT_TEXT>ITEM_A</SHORT_TEXT>
<QUANTITY>1.000</QUANTITY>
</position>
</PO_DETAILS>
</PO_POSITIONS>
You could parse it yourself, but it's kind of a pain. When I worked with XML way back when, we used stylesheets for these kinds of transformations; see the post "How to transform XML with XSL using Java".
If that's not an option, here is how to do it by hand (I omitted the new document construction, but you can see where it goes):
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.junit.Test;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class XMLTest {
@Test
public void testXmlParsing() throws Exception {
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(new File("/Users/aakture/Documents/workspace-sts-2.9.1.RELEASE/smartfox/branches/trunk/java/gelato/src/test/resources/sample.xml").getAbsolutePath());
Node poItem = doc.getElementsByTagName("POITEM").item(0);
NodeList poItemChildren = poItem.getChildNodes();
for (int i = 0; i < poItemChildren.getLength(); i++) {
Node item = poItemChildren.item(i);
NodeList itemChildren = item.getChildNodes();
for (int j = 0; j < itemChildren.getLength(); j++) {
Node itemChild = itemChildren.item(j);
if("PO_ITEM".equals(itemChild.getNodeName())) {
System.out.println("found PO_ITEM: " + itemChild.getTextContent());
}
if("MATL_GROUP".equals(itemChild.getNodeName())) {
System.out.println("found MATL_GROUP: " + itemChild.getTextContent());
}
}
}
}
}
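The question's loop runs once because there is only one <POITEM> element; looping over its <item> elements instead yields one <position> per item. The sketch below fills in the omitted document construction under some assumptions: PoItemExtractSketch, WANTED and extract are hypothetical names, the input is passed as a string, and PO_POSITIONS is used as the root of the new document (in the question it hangs under another root element that is not shown).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PoItemExtractSketch {

    // The six fields the question wants to keep.
    static final String[] WANTED = {
        "PO_ITEM", "SHORT_TEXT", "MATL_GROUP", "NET_PRICE", "QUANTITY", "PO_UNIT"
    };

    static String extract(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document in = builder.parse(new InputSource(new StringReader(xml)));
            Document out = builder.newDocument();
            Element root = out.createElement("PO_POSITIONS");
            out.appendChild(root);
            Element details = out.createElement("PO_DETAILS");
            root.appendChild(details);

            NodeList poItems = in.getElementsByTagName("POITEM");
            for (int p = 0; p < poItems.getLength(); p++) {
                // Loop over every <item> below this <POITEM>, not over <POITEM> itself.
                NodeList items = ((Element) poItems.item(p)).getElementsByTagName("item");
                for (int i = 0; i < items.getLength(); i++) {
                    Element item = (Element) items.item(i);
                    Element position = out.createElement("position");
                    details.appendChild(position);
                    for (String tag : WANTED) {
                        NodeList hits = item.getElementsByTagName(tag);
                        if (hits.getLength() > 0) {
                            Element copy = out.createElement(tag);
                            copy.setTextContent(hits.item(0).getTextContent());
                            position.appendChild(copy);
                        }
                    }
                }
            }
            Transformer tr = TransformerFactory.newInstance().newTransformer();
            tr.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            tr.transform(new DOMSource(out), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("extraction failed", e);
        }
    }
}
```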

Getting XML Node text value with Java DOM

I can't fetch text value with Node.getNodeValue(), Node.getFirstChild().getNodeValue() or with Node.getTextContent().
My XML is like
<add job="351">
<tag>foobar</tag>
<tag>foobar2</tag>
</add>
And I'm trying to get the tag value (fetching non-text elements works fine). My Java code looks like this:
Document doc = db.parse(new File(args[0]));
Node n = doc.getFirstChild();
NodeList nl = n.getChildNodes();
Node an,an2;
for (int i=0; i < nl.getLength(); i++) {
an = nl.item(i);
if(an.getNodeType()==Node.ELEMENT_NODE) {
NodeList nl2 = an.getChildNodes();
for(int i2=0; i2<nl2.getLength(); i2++) {
an2 = nl2.item(i2);
// DEBUG PRINTS
System.out.println(an2.getNodeName() + ": type (" + an2.getNodeType() + "):");
if(an2.hasChildNodes())
System.out.println(an2.getFirstChild().getTextContent());
if(an2.hasChildNodes())
System.out.println(an2.getFirstChild().getNodeValue());
System.out.println(an2.getTextContent());
System.out.println(an2.getNodeValue());
}
}
}
It prints out
tag type (1):
tag1
tag1
tag1
null
#text type (3):
_blank line_
_blank line_
...
Thanks for the help.
I'd print out the result of an2.getNodeName() as well for debugging purposes. My guess is that your tree crawling code isn't crawling to the nodes that you think it is. That suspicion is enhanced by the lack of checking for node names in your code.
Other than that, the javadoc for Node defines "getNodeValue()" to return null for Nodes of type Element. Therefore, you really should be using getTextContent(). I'm not sure why that wouldn't give you the text that you want.
Perhaps iterate the children of your tag node and see what types are there?
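To see the difference between the accessors side by side, the following minimal sketch walks only the element children of the root and records what each call returns. NodeValueSketch and describeTags are hypothetical names; unlike the question's code, it starts from getDocumentElement() and goes down one level only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeValueSketch {

    // One row per child element of the root:
    // nodeName|getNodeValue()|getTextContent()|getFirstChild().getNodeValue()
    static List<String> describeTags(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> rows = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip the whitespace text nodes between the tags
                }
                Node first = child.getFirstChild();
                String firstValue = (first == null) ? null : first.getNodeValue();
                rows.add(child.getNodeName()
                        + "|" + child.getNodeValue()    // null for element nodes
                        + "|" + child.getTextContent()  // concatenated text of the subtree
                        + "|" + firstValue);            // value of the first child (a text node here)
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

For the XML in the question, each row has null in the second field and the tag's text in the third and fourth.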
Tried this code and it works for me:
String xml = "<add job=\"351\">\n" +
" <tag>foobar</tag>\n" +
" <tag>foobar2</tag>\n" +
"</add>";
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
ByteArrayInputStream bis = new ByteArrayInputStream(xml.getBytes());
Document doc = db.parse(bis);
Node n = doc.getFirstChild();
NodeList nl = n.getChildNodes();
Node an,an2;
for (int i=0; i < nl.getLength(); i++) {
an = nl.item(i);
if(an.getNodeType()==Node.ELEMENT_NODE) {
NodeList nl2 = an.getChildNodes();
for(int i2=0; i2<nl2.getLength(); i2++) {
an2 = nl2.item(i2);
// DEBUG PRINTS
System.out.println(an2.getNodeName() + ": type (" + an2.getNodeType() + "):");
if(an2.hasChildNodes()) System.out.println(an2.getFirstChild().getTextContent());
if(an2.hasChildNodes()) System.out.println(an2.getFirstChild().getNodeValue());
System.out.println(an2.getTextContent());
System.out.println(an2.getNodeValue());
}
}
}
Output was:
#text: type (3): foobar foobar
#text: type (3): foobar2 foobar2
If your XML goes quite deep, you might want to consider using XPath, which comes with your JRE, so you can access the contents far more easily using:
String text = xp.evaluate("//add[@job='351']/tag[position()=1]/text()",
document.getDocumentElement());
Full example:
import static org.junit.Assert.assertEquals;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.junit.Before;
import org.junit.Test;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
public class XPathTest {
private Document document;
@Before
public void setup() throws Exception {
String xml = "<add job=\"351\"><tag>foobar</tag><tag>foobar2</tag></add>";
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
document = db.parse(new InputSource(new StringReader(xml)));
}
@Test
public void testXPath() throws Exception {
XPathFactory xpf = XPathFactory.newInstance();
XPath xp = xpf.newXPath();
String text = xp.evaluate("//add[@job='351']/tag[position()=1]/text()",
document.getDocumentElement());
assertEquals("foobar", text);
}
}
I use a very old Java version, JDK 1.4.08, and I had the same issue. The Node class there did not have the getTextContent() method, so I had to use Node.getFirstChild().getNodeValue() instead of Node.getNodeValue() to get the value of the node. This fixed it for me.
If you are open to vtd-xml, which excels at both performance and memory efficiency, below is code that does what you are looking for, using both XPath and manual navigation. The overall code is more concise and easier to understand.
import com.ximpleware.*;
public class queryText {
public static void main(String[] s) throws VTDException{
VTDGen vg = new VTDGen();
if (!vg.parseFile("input.xml", true))
return;
VTDNav vn = vg.getNav();
AutoPilot ap = new AutoPilot(vn);
// first manually navigate
if(vn.toElement(VTDNav.FC,"tag")){
int i= vn.getText();
if (i!=-1){
System.out.println("text ===>"+vn.toString(i));
}
if (vn.toElement(VTDNav.NS,"tag")){
i=vn.getText();
System.out.println("text ===>"+vn.toString(i));
}
}
// second version use XPath
ap.selectXPath("/add/tag/text()");
int i=0;
while((i=ap.evalXPath())!= -1){
System.out.println("text node ====>"+vn.toString(i));
}
}
}

Best way to compare 2 XML documents in Java

I'm trying to write an automated test of an application that basically translates a custom message format into an XML message and sends it out the other end. I've got a good set of input/output message pairs so all I need to do is send the input messages in and listen for the XML message to come out the other end.
When it comes time to compare the actual output to the expected output, I'm running into some problems. My first thought was just to do string comparisons on the expected and actual messages. This doesn't work very well because the example data we have isn't always formatted consistently, and different aliases are often used for the XML namespace (and sometimes namespaces aren't used at all).
I know I can parse both strings and then walk through each element and compare them myself and this wouldn't be too difficult to do, but I get the feeling there's a better way or a library I could leverage.
So, boiled down, the question is:
Given two Java Strings which both contain valid XML how would you go about determining if they are semantically equivalent? Bonus points if you have a way to determine what the differences are.
Sounds like a job for XMLUnit
http://www.xmlunit.org/
https://github.com/xmlunit
Example:
public class SomeTest extends XMLTestCase {
@Test
public void test() {
String xml1 = ...
String xml2 = ...
XMLUnit.setIgnoreWhitespace(true); // ignore whitespace differences
// can also compare xml Documents, InputSources, Readers, Diffs
assertXMLEqual(xml1, xml2); // assertXMLEqual comes from XMLTestCase
}
}
The following will check if the documents are equal using standard JDK libraries.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
dbf.setCoalescing(true);
dbf.setIgnoringElementContentWhitespace(true);
dbf.setIgnoringComments(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc1 = db.parse(new File("file1.xml"));
doc1.normalizeDocument();
Document doc2 = db.parse(new File("file2.xml"));
doc2.normalizeDocument();
Assert.assertTrue(doc1.isEqualNode(doc2));
normalizeDocument() is there to put both documents into a normal form before they are compared (for example, adjacent text nodes are merged).
The above code still requires the whitespace within the elements to be the same, because the parser preserves it and isEqualNode() evaluates it. The standard XML parser that comes with Java does not let you set a feature to produce a canonical version or to honor xml:space. If that is a problem, you may need a replacement XML parser such as Xerces, or you can use JDOM.
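If whitespace-only text between elements is the only difference that gets in the way, one workaround with the standard DOM API is to remove those text nodes before calling isEqualNode(). This is a sketch under that assumption, not part of the answer above; XmlEqualSketch and its methods are hypothetical names. It also treats an element containing only whitespace as empty, and it still considers different namespace prefixes unequal.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XmlEqualSketch {

    // Recursively removes text nodes that contain nothing but whitespace.
    static void stripWhitespace(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripWhitespace(child);
            }
            child = next;
        }
    }

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            dbf.setCoalescing(true);
            dbf.setIgnoringComments(true);
            return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Compares the two root elements after dropping whitespace-only text nodes.
    static boolean equalIgnoringWhitespace(String xml1, String xml2) {
        Document doc1 = parse(xml1);
        Document doc2 = parse(xml2);
        stripWhitespace(doc1);
        stripWhitespace(doc2);
        return doc1.getDocumentElement().isEqualNode(doc2.getDocumentElement());
    }
}
```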
Xom has a Canonicalizer utility which turns your DOMs into a regular form, which you can then stringify and compare. So regardless of whitespace irregularities or attribute ordering, you can get regular, predictable comparisons of your documents.
This works especially well in IDEs that have dedicated visual String comparators, like Eclipse. You get a visual representation of the semantic differences between the documents.
The latest version of XMLUnit can help with asserting that two XML documents are equal. XMLUnit.setIgnoreWhitespace() and XMLUnit.setIgnoreAttributeOrder() may also be needed, depending on the case.
See the working code of a simple XMLUnit example below.
import java.util.List;
import org.custommonkey.xmlunit.DetailedDiff;
import org.custommonkey.xmlunit.XMLUnit;
import org.junit.Assert;
public class TestXml {
public static void main(String[] args) throws Exception {
String result = "<abc attr=\"value1\" title=\"something\"> </abc>";
// will be ok
assertXMLEquals("<abc attr=\"value1\" title=\"something\"></abc>", result);
}
public static void assertXMLEquals(String expectedXML, String actualXML) throws Exception {
XMLUnit.setIgnoreWhitespace(true);
XMLUnit.setIgnoreAttributeOrder(true);
DetailedDiff diff = new DetailedDiff(XMLUnit.compareXML(expectedXML, actualXML));
List<?> allDifferences = diff.getAllDifferences();
Assert.assertEquals("Differences found: "+ diff.toString(), 0, allDifferences.size());
}
}
If using Maven, add this to your pom.xml:
<dependency>
<groupId>xmlunit</groupId>
<artifactId>xmlunit</artifactId>
<version>1.4</version>
</dependency>
Building on Tom's answer, here's an example using XMLUnit v2.
It uses these maven dependencies
<dependency>
<groupId>org.xmlunit</groupId>
<artifactId>xmlunit-core</artifactId>
<version>2.0.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.xmlunit</groupId>
<artifactId>xmlunit-matchers</artifactId>
<version>2.0.0</version>
<scope>test</scope>
</dependency>
..and here's the test code
import static org.junit.Assert.assertThat;
import static org.xmlunit.matchers.CompareMatcher.isIdenticalTo;
import org.junit.Test;
import org.xmlunit.builder.Input;
import org.xmlunit.input.WhitespaceStrippedSource;
public class SomeTest {
@Test
public void test() {
String result = "<root></root>";
String expected = "<root> </root>";
// ignore whitespace differences
// https://github.com/xmlunit/user-guide/wiki/Providing-Input-to-XMLUnit#whitespacestrippedsource
assertThat(result, isIdenticalTo(new WhitespaceStrippedSource(Input.from(expected).build())));
assertThat(result, isIdenticalTo(Input.from(expected).build())); // will fail due to whitespace differences
}
}
The documentation that outlines this is https://github.com/xmlunit/xmlunit#comparing-two-documents
Thanks, I extended this, try this ...
import java.io.ByteArrayInputStream;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
public class XmlDiff
{
private boolean nodeTypeDiff = true;
private boolean nodeValueDiff = true;
public boolean diff( String xml1, String xml2, List<String> diffs ) throws Exception
{
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
dbf.setCoalescing(true);
dbf.setIgnoringElementContentWhitespace(true);
dbf.setIgnoringComments(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc1 = db.parse(new ByteArrayInputStream(xml1.getBytes()));
Document doc2 = db.parse(new ByteArrayInputStream(xml2.getBytes()));
doc1.normalizeDocument();
doc2.normalizeDocument();
return diff( doc1, doc2, diffs );
}
/**
* Diff 2 nodes and put the diffs in the list
*/
public boolean diff( Node node1, Node node2, List<String> diffs ) throws Exception
{
if( diffNodeExists( node1, node2, diffs ) )
{
return true;
}
if( nodeTypeDiff )
{
diffNodeType(node1, node2, diffs );
}
if( nodeValueDiff )
{
diffNodeValue(node1, node2, diffs );
}
System.out.println(node1.getNodeName() + "/" + node2.getNodeName());
diffAttributes( node1, node2, diffs );
diffNodes( node1, node2, diffs );
return diffs.size() > 0;
}
/**
* Diff the nodes
*/
public boolean diffNodes( Node node1, Node node2, List<String> diffs ) throws Exception
{
//Index children by name (note: siblings that share a name overwrite each other)
Map<String,Node> children1 = new LinkedHashMap<String,Node>();
for( Node child1 = node1.getFirstChild(); child1 != null; child1 = child1.getNextSibling() )
{
children1.put( child1.getNodeName(), child1 );
}
//Index children by name
Map<String,Node> children2 = new LinkedHashMap<String,Node>();
for( Node child2 = node2.getFirstChild(); child2!= null; child2 = child2.getNextSibling() )
{
children2.put( child2.getNodeName(), child2 );
}
//Diff all the children1
for( Node child1 : children1.values() )
{
Node child2 = children2.remove( child1.getNodeName() );
diff( child1, child2, diffs );
}
//Diff all the children2 left over
for( Node child2 : children2.values() )
{
Node child1 = children1.get( child2.getNodeName() );
diff( child1, child2, diffs );
}
return diffs.size() > 0;
}
/**
* Diff the attributes
*/
public boolean diffAttributes( Node node1, Node node2, List<String> diffs ) throws Exception
{
//Index attributes by name
NamedNodeMap nodeMap1 = node1.getAttributes();
Map<String,Node> attributes1 = new LinkedHashMap<String,Node>();
for( int index = 0; nodeMap1 != null && index < nodeMap1.getLength(); index++ )
{
attributes1.put( nodeMap1.item(index).getNodeName(), nodeMap1.item(index) );
}
//Index attributes by name
NamedNodeMap nodeMap2 = node2.getAttributes();
Map<String,Node> attributes2 = new LinkedHashMap<String,Node>();
for( int index = 0; nodeMap2 != null && index < nodeMap2.getLength(); index++ )
{
attributes2.put( nodeMap2.item(index).getNodeName(), nodeMap2.item(index) );
}
//Diff all the attributes1
for( Node attribute1 : attributes1.values() )
{
Node attribute2 = attributes2.remove( attribute1.getNodeName() );
diff( attribute1, attribute2, diffs );
}
//Diff all the attributes2 left over
for( Node attribute2 : attributes2.values() )
{
Node attribute1 = attributes1.get( attribute2.getNodeName() );
diff( attribute1, attribute2, diffs );
}
return diffs.size() > 0;
}
/**
* Check that the nodes exist
*/
public boolean diffNodeExists( Node node1, Node node2, List<String> diffs ) throws Exception
{
if( node1 == null && node2 == null )
{
// getPath(null) would throw a NullPointerException, so no path is reported here
diffs.add( "null:node " + node1 + "!=" + node2 + "\n" );
return true;
}
if( node1 == null && node2 != null )
{
diffs.add( getPath(node2) + ":node " + node1 + "!=" + node2.getNodeName() );
return true;
}
if( node1 != null && node2 == null )
{
diffs.add( getPath(node1) + ":node " + node1.getNodeName() + "!=" + node2 );
return true;
}
return false;
}
/**
* Diff the Node Type
*/
public boolean diffNodeType( Node node1, Node node2, List<String> diffs ) throws Exception
{
if( node1.getNodeType() != node2.getNodeType() )
{
diffs.add( getPath(node1) + ":type " + node1.getNodeType() + "!=" + node2.getNodeType() );
return true;
}
return false;
}
/**
* Diff the Node Value
*/
public boolean diffNodeValue( Node node1, Node node2, List<String> diffs ) throws Exception
{
if( node1.getNodeValue() == null && node2.getNodeValue() == null )
{
return false;
}
if( node1.getNodeValue() == null && node2.getNodeValue() != null )
{
diffs.add( getPath(node1) + ":value " + node1.getNodeValue() + "!=" + node2.getNodeValue() );
return true;
}
if( node1.getNodeValue() != null && node2.getNodeValue() == null )
{
diffs.add( getPath(node1) + ":value " + node1.getNodeValue() + "!=" + node2.getNodeValue() );
return true;
}
if( !node1.getNodeValue().equals( node2.getNodeValue() ) )
{
diffs.add( getPath(node1) + ":value " + node1.getNodeValue() + "!=" + node2.getNodeValue() );
return true;
}
return false;
}
/**
* Get the node path
*/
public String getPath( Node node )
{
StringBuilder path = new StringBuilder();
do
{
path.insert(0, node.getNodeName() );
path.insert( 0, "/" );
}
while( ( node = node.getParentNode() ) != null );
return path.toString();
}
}
AssertJ 1.4+ has specific assertions to compare XML content:
String expectedXml = "<foo />";
String actualXml = "<bar />";
assertThat(actualXml).isXmlEqualTo(expectedXml);
Here is the Documentation
The code below works for me:
String xml1 = ...
String xml2 = ...
XMLUnit.setIgnoreWhitespace(true);
XMLUnit.setIgnoreAttributeOrder(true);
XMLAssert.assertXMLEqual(xml1, xml2);
skaffman seems to be giving a good answer.
Another way is to format both XML strings with a command-line utility like xmlstarlet (http://xmlstar.sourceforge.net/) and then use any diff utility (or library) to diff the resulting output files. I don't know if this is a good solution when the issues are with namespaces.
I'm using Altova DiffDog which has options to compare XML files structurally (ignoring string data).
This means that (if checking the 'ignore text' option):
<foo a="xxx" b="xxx">xxx</foo>
and
<foo b="yyy" a="yyy">yyy</foo>
are equal in the sense that they have structural equality. This is handy if you have example files that differ in data, but not structure!
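If you want the same kind of structural check from Java without a GUI tool, here is a minimal sketch using only the JDK's DOM parser. It is not how DiffDog works internally; the class name StructuralXml and the method sameStructure are made up for illustration. It assumes "structure" means element names, the set of attribute names, and the order of child elements, and it ignores text and attribute values.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of any library): compares structure only.
public final class StructuralXml {
    private StructuralXml() {
    }

    /** Returns true if both documents have the same element/attribute structure. */
    public static boolean sameStructure(String xml1, String xml2) {
        return sameStructure(parse(xml1), parse(xml2));
    }

    private static Element parse(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return db.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    private static boolean sameStructure(Element e1, Element e2) {
        // Element names must match
        if (!e1.getNodeName().equals(e2.getNodeName())) {
            return false;
        }
        // Attribute names must match; order and values are ignored
        if (!attributeNames(e1).equals(attributeNames(e2))) {
            return false;
        }
        // Child elements must match pairwise, in document order
        List<Element> c1 = childElements(e1);
        List<Element> c2 = childElements(e2);
        if (c1.size() != c2.size()) {
            return false;
        }
        for (int i = 0; i < c1.size(); i++) {
            if (!sameStructure(c1.get(i), c2.get(i))) {
                return false;
            }
        }
        return true;
    }

    private static TreeSet<String> attributeNames(Element e) {
        TreeSet<String> names = new TreeSet<String>();
        NamedNodeMap attrs = e.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            names.add(attrs.item(i).getNodeName());
        }
        return names;
    }

    private static List<Element> childElements(Element e) {
        List<Element> result = new ArrayList<Element>();
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) n); // text, comments etc. are skipped
            }
        }
        return result;
    }
}
```

With this sketch, the two foo elements above compare as structurally equal, while renaming an attribute or a child element makes the comparison fail.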
I required the same functionality as requested in the main question. As I was not allowed to use any 3rd party libraries, I created my own solution based on @Archimedes Trajano's solution.
My solution follows.
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;
import java.util.Map.Entry;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.junit.Assert;
import org.w3c.dom.Document;
/**
* Asserts for asserting XML strings.
*/
public final class AssertXml {
private AssertXml() {
}
private static Pattern NAMESPACE_PATTERN = Pattern.compile("xmlns:(ns\\d+)=\"(.*?)\"");
/**
* Asserts that two XML are of identical content (namespace aliases are ignored).
*
* @param expectedXml expected XML
* @param actualXml actual XML
* @throws Exception thrown if XML parsing fails
*/
public static void assertEqualXmls(String expectedXml, String actualXml) throws Exception {
// Find all namespace mappings
Map<String, String> fullnamespace2newAlias = new HashMap<String, String>();
generateNewAliasesForNamespacesFromXml(expectedXml, fullnamespace2newAlias);
generateNewAliasesForNamespacesFromXml(actualXml, fullnamespace2newAlias);
for (Entry<String, String> entry : fullnamespace2newAlias.entrySet()) {
String newAlias = entry.getValue();
String namespace = entry.getKey();
Pattern nsReplacePattern = Pattern.compile("xmlns:(ns\\d+)=\"" + namespace + "\"");
expectedXml = transletaNamespaceAliasesToNewAlias(expectedXml, newAlias, nsReplacePattern);
actualXml = transletaNamespaceAliasesToNewAlias(actualXml, newAlias, nsReplacePattern);
}
// parse and normalize both documents after the namespace aliases have been unified
DocumentBuilder db = initDocumentParserFactory();
Document expectedDocuemnt = db.parse(new ByteArrayInputStream(expectedXml.getBytes(Charset.forName("UTF-8"))));
expectedDocuemnt.normalizeDocument();
Document actualDocument = db.parse(new ByteArrayInputStream(actualXml.getBytes(Charset.forName("UTF-8"))));
actualDocument.normalizeDocument();
if (!expectedDocuemnt.isEqualNode(actualDocument)) {
Assert.assertEquals(expectedXml, actualXml); // just to better visualize the differences, e.g. in Eclipse
}
}
private static DocumentBuilder initDocumentParserFactory() throws ParserConfigurationException {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(false);
dbf.setCoalescing(true);
dbf.setIgnoringElementContentWhitespace(true);
dbf.setIgnoringComments(true);
DocumentBuilder db = dbf.newDocumentBuilder();
return db;
}
private static String transletaNamespaceAliasesToNewAlias(String xml, String newAlias, Pattern namespacePattern) {
Matcher nsMatcherExp = namespacePattern.matcher(xml);
if (nsMatcherExp.find()) {
xml = xml.replaceAll(nsMatcherExp.group(1) + "[:]", newAlias + ":");
xml = xml.replaceAll(nsMatcherExp.group(1) + "=", newAlias + "=");
}
return xml;
}
private static void generateNewAliasesForNamespacesFromXml(String xml, Map<String, String> fullnamespace2newAlias) {
Matcher nsMatcher = NAMESPACE_PATTERN.matcher(xml);
while (nsMatcher.find()) {
if (!fullnamespace2newAlias.containsKey(nsMatcher.group(2))) {
fullnamespace2newAlias.put(nsMatcher.group(2), "nsTr" + (fullnamespace2newAlias.size() + 1));
}
}
}
}
It compares two XML strings and handles mismatched namespace prefix mappings by translating them to the same unique aliases in both input strings.
It could be fine-tuned, e.g. in how the namespaces are translated, but it does the job for my requirements.
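As a possible alternative to rewriting prefixes with regular expressions, the comparison can be done on a namespace-aware DOM by matching namespace URI plus local name and ignoring the prefix altogether. This is a different technique from the class above, shown only as a rough JDK-only sketch; the class name PrefixAgnosticXml and the method equalIgnoringPrefixes are hypothetical. It assumes whitespace-only text is insignificant and that text values can be compared after trimming, and it does not handle prefixes used inside attribute or text values.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of any library): compares XML ignoring namespace prefixes.
public final class PrefixAgnosticXml {
    private PrefixAgnosticXml() {
    }

    public static boolean equalIgnoringPrefixes(String xml1, String xml2) {
        return sameElement(parse(xml1), parse(xml2));
    }

    private static Element parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed for getNamespaceURI()/getLocalName()
            dbf.setCoalescing(true);     // fold CDATA sections into text nodes
            return dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // "{namespace-uri}local-name", which does not depend on the prefix
    private static String expandedName(Node node) {
        return "{" + node.getNamespaceURI() + "}" + node.getLocalName();
    }

    private static Map<String, String> attributes(Element element) {
        Map<String, String> result = new TreeMap<String, String>();
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            String name = attr.getNodeName();
            if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                continue; // namespace declarations are not data
            }
            result.put(expandedName(attr), attr.getNodeValue());
        }
        return result;
    }

    // Child elements plus text nodes that are not whitespace-only
    private static List<Node> significantChildren(Element element) {
        List<Node> result = new ArrayList<Node>();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            boolean isElement = n.getNodeType() == Node.ELEMENT_NODE;
            boolean isText = n.getNodeType() == Node.TEXT_NODE
                    && !n.getNodeValue().trim().isEmpty();
            if (isElement || isText) {
                result.add(n);
            }
        }
        return result;
    }

    private static boolean sameElement(Element a, Element b) {
        if (!expandedName(a).equals(expandedName(b))) {
            return false;
        }
        if (!attributes(a).equals(attributes(b))) {
            return false;
        }
        List<Node> ca = significantChildren(a);
        List<Node> cb = significantChildren(b);
        if (ca.size() != cb.size()) {
            return false;
        }
        for (int i = 0; i < ca.size(); i++) {
            Node x = ca.get(i);
            Node y = cb.get(i);
            if (x.getNodeType() != y.getNodeType()) {
                return false;
            }
            if (x.getNodeType() == Node.ELEMENT_NODE) {
                if (!sameElement((Element) x, (Element) y)) {
                    return false;
                }
            } else if (!x.getNodeValue().trim().equals(y.getNodeValue().trim())) {
                return false;
            }
        }
        return true;
    }
}
```

The trade-off is that this only answers equal or not equal; it does not produce the side-by-side string diff that Assert.assertEquals gives you in an IDE.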
This will compare full XML strings (reformatting them on the way). It makes it easy to work with your IDE (IntelliJ, Eclipse), because you can just click and see the differences between the XML files visually.
import org.apache.xml.security.c14n.CanonicalizationException;
import org.apache.xml.security.c14n.Canonicalizer;
import org.apache.xml.security.c14n.InvalidCanonicalizerException;
import org.w3c.dom.Element;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
import java.io.IOException;
import java.io.StringReader;
import static org.apache.xml.security.Init.init;
import static org.junit.Assert.assertEquals;
public class XmlUtils {
static {
init();
}
public static String toCanonicalXml(String xml) throws InvalidCanonicalizerException, ParserConfigurationException, SAXException, CanonicalizationException, IOException {
Canonicalizer canon = Canonicalizer.getInstance(Canonicalizer.ALGO_ID_C14N_OMIT_COMMENTS);
byte canonXmlBytes[] = canon.canonicalize(xml.getBytes());
return new String(canonXmlBytes);
}
public static String prettyFormat(String input) throws TransformerException, ParserConfigurationException, IOException, SAXException, InstantiationException, IllegalAccessException, ClassNotFoundException {
InputSource src = new InputSource(new StringReader(input));
Element document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement();
Boolean keepDeclaration = input.startsWith("<?xml");
DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
DOMImplementationLS impl = (DOMImplementationLS) registry.getDOMImplementation("LS");
LSSerializer writer = impl.createLSSerializer();
writer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
writer.getDomConfig().setParameter("xml-declaration", keepDeclaration);
return writer.writeToString(document);
}
public static void assertXMLEqual(String expected, String actual) throws ParserConfigurationException, IOException, SAXException, CanonicalizationException, InvalidCanonicalizerException, TransformerException, IllegalAccessException, ClassNotFoundException, InstantiationException {
String canonicalExpected = prettyFormat(toCanonicalXml(expected));
String canonicalActual = prettyFormat(toCanonicalXml(actual));
assertEquals(canonicalExpected, canonicalActual);
}
}
I prefer this to XmlUnit because the client code (test code) is cleaner.
Using XMLUnit 2.x
In the pom.xml
<dependency>
<groupId>org.xmlunit</groupId>
<artifactId>xmlunit-assertj3</artifactId>
<version>2.9.0</version>
</dependency>
Test implementation (using JUnit 5):
import org.junit.jupiter.api.Test;
import org.xmlunit.assertj3.XmlAssert;
public class FooTest {
@Test
public void compareXml() {
//
String xmlContentA = "<foo></foo>";
String xmlContentB = "<foo></foo>";
//
XmlAssert.assertThat(xmlContentA).and(xmlContentB).areSimilar();
}
}
Other methods : areIdentical(), areNotIdentical(), areNotSimilar()
More details (configuration of assertThat(~).and(~) and examples) in this documentation page.
XMLUnit also has (among other features) a DifferenceEvaluator to do more precise comparisons.
XMLUnit website
Using JExamXML with a Java application:
import com.a7soft.examxml.ExamXML;
import com.a7soft.examxml.Options;
.................
// Reads two XML files into two strings
String s1 = readFile("orders1.xml");
String s2 = readFile("orders.xml");
// Loads options saved in a property file
Options.loadOptions("options");
// Compares two Strings representing XML entities
System.out.println( ExamXML.compareXMLString( s1, s2 ) );
Since you say "semantically equivalent", I assume you want to do more than literally verify that the XML outputs are equal as strings, and that you'd want something like
<foo> some stuff here</foo>
and
<foo>some stuff here</foo>
to read as equivalent. Ultimately it depends on how you define "semantically equivalent" for whatever object you're reconstituting the message from. Simply build that object from the messages and use a custom equals() to define what you're looking for.
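To make that concrete, here is a small sketch of the idea using only the JDK. The FooMessage class and its fromXml factory are invented for this example, and it assumes that leading and trailing whitespace in the element's text is not significant for your notion of equivalence; adjust equals() to whatever your definition is.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical domain object rebuilt from a <foo>...</foo> message.
public final class FooMessage {
    private final String text;

    private FooMessage(String text) {
        // Assumption: surrounding whitespace carries no meaning
        this.text = text.trim();
    }

    /** Builds the object from the XML message. */
    public static FooMessage fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return new FooMessage(root.getTextContent());
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // The custom equals() is where "semantically equivalent" gets defined
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof FooMessage)) {
            return false;
        }
        return text.equals(((FooMessage) other).text);
    }

    @Override
    public int hashCode() {
        return text.hashCode();
    }
}
```

A test can then compare FooMessage.fromXml(expected) with FooMessage.fromXml(actual) using a plain assertEquals, and the two foo messages above come out equal.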
