Locating XML element anywhere in the document with Java

Locating XML element anywhere in the document with Java - java

Given the following XML (example):
<?xml version="1.0" encoding="UTF-8"?>
<rsb:VersionInfo xmlns:atom="http://www.w3.org/2005/Atom" xmlns:rsb="http://ws.rsb.de/v2">
<rsb:Variant>Windows</rsb:Variant>
<rsb:Version>10</rsb:Version>
</rsb:VersionInfo>
I need to get the values of Variant and Version. My current approach is using XPath as I cannnot rely on the given structure. All I know is that there is an element rsb:Version somewhere in the document.
XPath xpath = XPathFactory.newInstance().newXPath();
String expression = "//Variant";
InputSource inputSource = new InputSource("test.xml");
String result = (String) xpath.evaluate(expression, inputSource, XPathConstants.STRING);
System.out.println(result);
This however does not output anything. I have tried the following XPath expressions:
//Variant
//Variant/text()
//rsb:Variant
//rsb:Variant/text()
What is the correct XPath expression? Or is there an even simpler way getting to this element?

I would suggest just looping through the document to find the given tag
public static void main(String[] args) throws SAXException, IOException,ParserConfigurationException, TransformerException {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document document = docBuilder.parse(new File("test.xml"));
NodeList nodeList = document.getElementsByTagName("rsb:VersionInfo");
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE) {
// do something with the current element
System.out.println(node.getNodeName());
}
}
}
Edit: Yassin pointed out that it won't get child nodes. This should point you in the right direction for getting the children.
private static List<Node> getChildren(Node n)
{
List<Node> children = asList(n.getChildNodes());
Iterator<Node> it = children.iterator();
while (it.hasNext())
if (it.next().getNodeType() != Node.ELEMENT_NODE)
it.remove();
return children;
}

Related

use xpath to extract value from xml file with multiple namespace in java

I am trying to extract node value with multiple namespaces in java but not succeed. The xml file is like:
<ns26:start xmlns:ns26="http://www.tektronix.com/iris/isa/capture/start"
xmlns:ns31="http://www.tektronix.com/iris/isa/filters"
xmlns:ns13="http://www.tektronix.com/iris/isa/monitoredObjects"
xmlns:ns6="http://www.tektronix.com/iris/isa"
xmlns:ns10="http://www.tektronix.com/iris/isa/monNodeObjects"
xmlns:ns7="http://www.tektronix.com/iris/isa/capture/monitoredElements"
xmlns:ns11="http://www.tektronix.com/iris/isa/pointcodes"
xmlns:ns8="http://www.tektronix.com/iris/isa/capture/captureSession"
xmlns:ns2="http://www.tektronix.com/iris/isa/sessionSaveInfo"
xmlns:ns4="http://www.tektronix.com/iris/isa/customData"
xmlns:ns3="http://www.tektronix.com/iris/isa/manifest">
<ns6:Id>LAB:11300/isaclient;440</ns6:Id>
</ns26:start>
I want to extract Id with xpath local-name(). Expression like //*[local-name()='start']/*[local-name()='Id'] but didn't get any matched node. Please help to find issue here. Thanks
Add the java code here:
public static List<String> getXPathValueNamespace(String xml, String expression throws ParserConfigurationException, SAXException, IOException, XPathExpressionException
{
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder;
Document doc = null;
List<String> list = new ArrayList<String>();
builder = factory.newDocumentBuilder();
InputSource source = new InputSource(new StringReader(xml));
doc = builder.parse(source);
// Create XPathFactory object
XPathFactory xpathFactory = XPathFactory.newInstance();
// Create XPath object
XPath xpath = xpathFactory.newXPath();
XPathExpression expr = xpath.compile(expression);
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++)
list.add(nodes.item(i).getNodeValue());
return list;
}

The expression //*[local-name()='start']/*[local-name()='Id'] works and for the example document one node should be contained in the result node list.
But you should use nodes.item(i).getTextContent() to retrieve the node content, since getNodeValue() returns null for element nodes.

How to get XML node names without namespace in java?

I have an xml file having data which looks like given below:
....
<ems:MessageInformation>
<ecs:MessageID>2147321820</ecs:MessageID>
<ecs:MessageTimeStamp>2016-01-01T04:38:33</ecs:MessageTimeStamp>
<ecs:SendingSystem>LD</ecs:SendingSystem>
<ecs:ReceivingSystem>CH</ecs:ReceivingSystem>
<ecs:ServicingFipsCountyCode>037</ecs:ServicingFipsCountyCode>
<ecs:Environment>UGS-D8UACS02</ecs:Environment>
</ems:MessageInformation>
....
There are many other nodes also. All nodes have namespace like ecs,tns,ems etc. I am suing following code part to extract all node names without namespace.
public static void main(String[] args) throws SAXException, IOException, ParserConfigurationException, TransformerException {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document document = docBuilder.parse(new File("C:\\Users\\DadMadhR\\Desktop\\temp\\EDR_D3A0327.XML"));
NodeList nodeList = document.getElementsByTagName("*");
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
//System.out.println(node.getNodeName());
System.out.println(node.getLocalName());
}
}
But when I execute this code, it's printing null for individual node. Can someone tell me what I am doing wrong here?
I read on internet and I came to know that node.getLocalName() will give node name without namespace. What is wrong then in my case?

You need to set the document builder factory to be namespace aware first. Then getLocalName() will start returning non-null values.
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
docBuilder.setNamespaceAware(true); // <=== here
Document document = docBuilder.parse(new File("C:\\Users\\DadMadhR\\Desktop\\temp\\EDR_D3A0327.XML"));

Java Dom parser reports wrong number of child nodes

I have the following xml file:
<?xml version="1.0" encoding="UTF-8"?>
<users>
<user id="0" firstname="John"/>
</users>
Then I'm trying to parse it with java, but getchildnodes reports wrong number of child nodes.
Java code:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(this.file);
document.getDocumentElement().normalize();
Element root = document.getDocumentElement();
NodeList nodes = root.getChildNodes();
System.out.println(nodes.getLength());
Result: 3
Also I'm getting NPEs for accessing the nodes attributes, so I'm guessing something's going horribly wrong.

The child nodes consist of elements and text nodes for whitespace. You will want to check the node type before processing the attributes. You may also want to consider using the javax.xml.xpath APIs available in the JDK/JRE starting with Java SE 5.
Example 1
This example demonstrates how to issue an XPath statement against a DOM.
package forum11649396;
import java.io.StringReader;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import org.w3c.dom.*;
import org.xml.sax.InputSource;
public class Demo {
public static void main(String[] args) throws Exception {
String xml = "<?xml version='1.0' encoding='UTF-8'?><users><user id='0' firstname='John'/></users>";
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document document = db.parse(new InputSource(new StringReader(xml)));
XPathFactory xpf = XPathFactory.newInstance();
XPath xpath = xpf.newXPath();
Element userElement = (Element) xpath.evaluate("/users/user", document, XPathConstants.NODE);
System.out.println(userElement.getAttribute("id"));
System.out.println(userElement.getAttribute("firstname"));
}
}
Example 2
The following example demonstrates how to issue an XPath statement against an InputSource to get a DOM node. This saves you from having to parse the XML into a DOM yourself.
package forum11649396;
import java.io.StringReader;
import javax.xml.xpath.*;
import org.w3c.dom.*;
import org.xml.sax.InputSource;
public class Demo {
public static void main(String[] args) throws Exception {
String xml = "<?xml version='1.0' encoding='UTF-8'?><users><user id='0' firstname='John'/></users>";
XPathFactory xpf = XPathFactory.newInstance();
XPath xpath = xpf.newXPath();
InputSource inputSource = new InputSource(new StringReader(xml));
Element userElement = (Element) xpath.evaluate("/users/user", inputSource, XPathConstants.NODE);
System.out.println(userElement.getAttribute("id"));
System.out.println(userElement.getAttribute("firstname"));
}
}

There are three child nodes:
a text node containing a line break
an element node (tagged user)
a text node containing a line break
So when processing the child nodes, check for element nodes.

You have to make sure you account for the '\n' between the nodes, which count for text nodes. You can test for that using if(root.getNodeType() == Node.ELEMENT_NODE)
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(this.file);
document.getDocumentElement().normalize();
for(Node root = document.getFirstChild(); root != null; root = root.getNextSibling()) {
if(root.getNodeType() == Node.ELEMENT_NODE) {
NodeList nodes = root.getChildNodes();
System.out.println(root.getNodeName() + " has "+nodes.getLength()+" children");
for(int i=0; i<nodes.getLength(); i++) {
Node n = nodes.item(i);
System.out.println("\t"+n.getNodeName());
}
}
}

I didn't notice any of the answers addressing your last note about NPEs when trying to access attributes.
Also I'm getting NPEs for accessing the nodes attributes, so I'm guessing something's going horribly wrong.
Since I've seen the following suggestion on a few sites, I assume it's a common way to access attributes:
String myPropValue = node.getAttributes().getNamedItem("myProp").getNodeValue();
which works fine if the nodes always contain a myProp attribute, but if it has no attributes, getAttributes will return null. Also, if there are attributes, but no myProp attribute, getNamedItem will return null.
I'm currently using
public static String getStrAttr(Node node, String key) {
if (node.hasAttributes()) {
Node item = node.getAttributes().getNamedItem(key);
if (item != null) {
return item.getNodeValue();
}
}
return null;
}
public static int getIntAttr(Node node, String key) {
if (node.hasAttributes()) {
Node item = node.getAttributes().getNamedItem(key);
if (item != null) {
return Integer.parseInt(item.getNodeValue());
}
}
return -1;
}
in a utility class, but your mileage may vary.

Getting null values when trying to parse an XML using XPATH in Java

When running this code in Eclipse I get only null values:
String recordsAsXML = "<vGVC:Row xmlns:vGVC=\"urn:com.versai:Bacon:GetViewContentsResponse.v1.0\"><vGVC:Column><vGVC:Value>aaa0</vGVC:Value><vGVC:ObjectName>2ff31656-b10c-11e0-99f6-e01321378d2a</vGVC:ObjectName></vGVC:Column><vGVC:ModifiedAfterQuery>false</vGVC:ModifiedAfterQuery></vGVC:Row>";
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(recordsAsXML));
Document doc = builder.parse(is);
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("//vGVC:Value/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
I'm trying to extract the "aaa0" value.
Could you please help me understand what I'm doing wrong?
Thanks in advance, Michal.

You should use namespace context to let XPath know about your namespace, for example like this:
xpath.setNamespaceContext(new NamespaceContext() {
#Override public Iterator<?> getPrefixes(String namespaceURI) { return null; }
#Override public String getPrefix(String namespaceURI) { return "vGVC"; }
#Override public String getNamespaceURI(String prefix) { return "urn:com.versai:Bacon:GetViewContentsResponse.v1.0"; }
});
This should be done before calling evaluate, and if you have more than one namespace to handle, you can add them all in one implementation and use if statements to return the right prefix/URI.
More details in the javadocs here: http://download.oracle.com/javase/6/docs/api/javax/xml/namespace/NamespaceContext.html

Create XML document using nodeList

I need to create a XML Document object using the NodeList. Can someone pls help me to do this. This is my Java code:
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.*;
import org.w3c.dom.*;
public class ReadFile {
public static void main(String[] args) {
String exp = "/configs/markets";
String path = "testConfig.xml";
try {
Document xmlDocument = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(path);
XPath xPath = XPathFactory.newInstance().newXPath();
XPathExpression xPathExpression = xPath.compile(exp);
NodeList nodes = (NodeList)
xPathExpression.evaluate(xmlDocument,
XPathConstants.NODESET);
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
I want to have an XML file like this:
<configs>
<markets>
<market>
<name>Real</name>
</market>
<market>
<name>play</name>
</market>
</markets>
</configs>
Thanks in advance.

You should do it like this:
you create a new org.w3c.dom.Document newXmlDoc where you store the nodes in your NodeList,
you create a new root element, and append it to newXmlDoc
then, for each node n in your NodeList, you import n in newXmlDoc, and then you append n as a child of root
Here is the code:
public static void main(String[] args) {
String exp = "/configs/markets/market";
String path = "src/a/testConfig.xml";
try {
Document xmlDocument = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().parse(path);
XPath xPath = XPathFactory.newInstance().newXPath();
XPathExpression xPathExpression = xPath.compile(exp);
NodeList nodes = (NodeList) xPathExpression.
evaluate(xmlDocument, XPathConstants.NODESET);
Document newXmlDocument = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().newDocument();
Element root = newXmlDocument.createElement("root");
newXmlDocument.appendChild(root);
for (int i = 0; i < nodes.getLength(); i++) {
Node node = nodes.item(i);
Node copyNode = newXmlDocument.importNode(node, true);
root.appendChild(copyNode);
}
printTree(newXmlDocument);
} catch (Exception ex) {
ex.printStackTrace();
}
}
public static void printXmlDocument(Document document) {
DOMImplementationLS domImplementationLS =
(DOMImplementationLS) document.getImplementation();
LSSerializer lsSerializer =
domImplementationLS.createLSSerializer();
String string = lsSerializer.writeToString(document);
System.out.println(string);
}
The output is:
<?xml version="1.0" encoding="UTF-16"?>
<root><market>
<name>Real</name>
</market><market>
<name>play</name>
</market></root>
Some notes:
I've changed exp to /configs/markets/market, because I suspect you want to copy the market elements, rather than the single markets element
for the printXmlDocument, I've used the interesting code in this answer
I hope this helps.
If you don't want to create a new root element, then you may use your original XPath expression, which returns a NodeList consisting of a single node (keep in mind that your XML must have a single root element) that you can directly add to your new XML document.
See following code, where I commented lines from the code above:
public static void main(String[] args) {
//String exp = "/configs/markets/market/";
String exp = "/configs/markets";
String path = "src/a/testConfig.xml";
try {
Document xmlDocument = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().parse(path);
XPath xPath = XPathFactory.newInstance().newXPath();
XPathExpression xPathExpression = xPath.compile(exp);
NodeList nodes = (NodeList) xPathExpression.
evaluate(xmlDocument,XPathConstants.NODESET);
Document newXmlDocument = DocumentBuilderFactory.newInstance()
.newDocumentBuilder().newDocument();
//Element root = newXmlDocument.createElement("root");
//newXmlDocument.appendChild(root);
for (int i = 0; i < nodes.getLength(); i++) {
Node node = nodes.item(i);
Node copyNode = newXmlDocument.importNode(node, true);
newXmlDocument.appendChild(copyNode);
//root.appendChild(copyNode);
}
printXmlDocument(newXmlDocument);
} catch (Exception ex) {
ex.printStackTrace();
}
}
This will give you the following output:
<?xml version="1.0" encoding="UTF-16"?>
<markets>
<market>
<name>Real</name>
</market>
<market>
<name>play</name>
</market>
</markets>

you can try the adoptNode() method of Document. Maybe you will need to iterate over your NodeList. You can access the individual Nodes with nodeList.item(i).If you want to wrap your search results in an Element, you can use createElement() from the Document and appendChild() on the newly created Element

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Locating XML element anywhere in the document with Java - java

Related

use xpath to extract value from xml file with multiple namespace in java

How to get XML node names without namespace in java?

Java Dom parser reports wrong number of child nodes

Getting null values when trying to parse an XML using XPATH in Java

Create XML document using nodeList

Categories

Resources