Retrieve value of XML node and node attribute using XPath in JAXP - java

Given an xml document that looks like the following:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<entry key="agentType">STANDARD</entry>
<entry key="DestinationTransferStates"></entry>
<entry key="AgentStatusPublishRate">300</entry>
<entry key="agentVersion">f000-703-GM2-20101109-1550</entry>
<entry key="CommandTimeUTC">2010-12-24T02:25:43Z</entry>
<entry key="PublishTimeUTC">2010-12-24T02:26:09Z</entry>
<entry key="queueManager">AGENTQMGR</entry>
</properties>
I want to print the values of the "key" attribute and the element so it looks like this:
agentType = STANDARD
DestinationTransferStates =
AgentStatusPublishRate = 300
agentVersion = f000-703-GM2-20101109-1550
CommandTimeUTC = 2010-12-24T02:25:43Z
PublishTimeUTC = 2010-12-24T02:26:09Z
queueManager = AGENTQMGR
I'm able to print the node values with no problem using this code:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("//properties/entry/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
And I can print the values of the "key" attribute by changing the xpath expression and the node methods as follows:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("//properties/entry");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getAttributes().getNamedItem("key").getNodeValue());
}
It seems like there would be a way to get at both of these values in a single evaluate. I could always evaluate two NodeLists and iterate through them with a common index but I'm not sure they are guaranteed to be returned in the same order. Any suggestions appreciated.

What about getTextContent()? This should do the work.
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++)
{
Node currentItem = nodes.item(i);
String key = currentItem.getAttributes().getNamedItem("key").getNodeValue();
String value = currentItem.getTextContent();
System.out.printf("%1s = %2s\n", key, value);
}
For further informations please see the javadoc for getTextContent(). I hope this will help you.

Related

Howto iterate through XML by node/element names

Is there a way i can iterate over nodes/elements by their names like this:
<rootnode>
<foo>
<bar>
stuff
</bar>
....
document.getDocumentElement.getElement("foo").getElement("bar").getValue();
I think the XPath should do the trick.
Provided you already have parsed the document as org.w3c.dom.Document:
String expression = "/rootnode/foo/bar";
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
}

Java XPath - select all nodes does not work with namespaces

I would like to select all elements (in my case //ab) from given XML. But it does not select anything (matches variable contains empty list):
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("//ab");
NodeList matches = (NodeList) expr.evaluate(new InputSource(new ByteArrayInputStream(
("<x:xml xmlns:x=\"yyy\">"+
"<x:ab>123</x:ab>"+
"</x:xml>").getBytes())), XPathConstants.NODESET);
if (matches != null)
{
for (int i = 0; i < matches.getLength(); i++)
{
Node node = (Node) matches.item(i);
System.out.println(node.getNodeName());
System.out.println(node.getNodeValue());
}
}
Interesting is when change the xpath to : *//ab , then it does select the ab element.
Also when I remove namespace from xml then the xpath //ab does select the ab element.
Same as when I add namespaceContext to xpath object and add namespace to xpath query: //x:ab , then it does select the ab element.
How should I change the code, so I would get ab element without changing the query?

XPATH evaluation against child node

I have xml as follows,
<students>
<Student><age>23</age><id>2000</id><name>PP2000</name></Student>
<Student><age>23</age><id>1000</id><name>PP1000</name></Student>
</students>
I have 2 xpaths Template XPATH = students/Student will be the template nodes, but I cannot hard code this xpath, because it will change for other XMLs, and XML is pretty dynamic, can expand (but with the same base XPATHs) So if I evaluate one more XPATH using the template node, I'm using the following code,
XPath xpathResource = XPathFactory.newInstance().newXPath();
Document xmlDocument = //creating document;
NodeList nodeList = (NodeList)xpathResource.compile("//students/Student").evaluate(xmlDocument, XPathConstants.NODESET);
for (int nodeIndex = 0; nodeIndex < nodeList.getLength(); nodeIndex++) {
Node currentNode = nodeList.item(nodeIndex);
String xpathID = "//students/Student/id";
String xpathName = "//students/Student/name";
NodeList childID = (NodeList)xpathResource.compile(xpathID).evaluate(currentNode, XPathConstants.NODESET);
NodeList childName = (NodeList)xpathResource.compile(xpathName).evaluate(currentNode, XPathConstants.NODESET);
System.out.println("node ID " +childID.item(0).getTextContent());
System.out.println("node Name " +childName.item(0).getTextContent());
}
Now the problem is, this for loop will execute for 2 times, but both time I'm getting 2000 , PP2000 as ID value. Is there any way to iterate to the child node with generic XPATH against a node. I cannot go generic XPATH against the whole XMLDocument, I have some validation to do. I want to use XML nodelist as result set rows, so that I can validate the XML value and do my stuff.
XPath xpathResource = XPathFactory.newInstance().newXPath();
Document xmlDocument = //creating document;
NodeList nodeList = (NodeList)xpathResource.compile("//students/Student/id").evaluate(xmlDocument, XPathConstants.NODESET);
for (int nodeIndex = 0; nodeIndex < nodeList.getLength(); nodeIndex++) {
Node currentNode = nodeList.item(nodeIndex);
System.out.println("node " +currentNode.getTextContent());
}

Getting a node's children depending on another children's text

I want to get the english or hungarian elements' text depending on the title. So far, I came up with this. Can you help me with a cleaner or more professional solution for this using xpath?
The XML:
<Textbook>
<TEXT>
<Title>SAMPLE TITLE 1</Title>
<English>Sample english text</English>
<Hungarian>Sample hungarian text</Hungarian>
</TEXT>
<TEXT>
<Title>SAMPLE TITLE 2</Title>
<English>Sample english text 2</English>
<Hungarian>Sample hungarian text 2</Hungarian>
</TEXT>
</Textbook>
The code:
public String getResults (String elementName, String language) throws XPathExpressionException {
xpathFactory = XPathFactory.newInstance();
xpath = xpathFactory.newXPath();
XPathExpression expr = xpath.compile("/Textbook/TEXT/Title");
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
if (nodes.item(i).getTextContent().equals(elementName)) {
XPathExpression expr2 = xpath.compile("/Textbook/TEXT/" + language);
NodeList nodes2 = (NodeList) expr2.evaluate(doc, XPathConstants.NODESET);
return nodes2.item(i).getTextContent();
}
}
return null;
}
You can filter <TEXT> by it's chld <Title> content. For example, this XPath will get <TEXT> having child <Title> with content equals "SAMPLE TITLE 1" :
//TEXT[Title='SAMPLE TITLE 1']
So your requirement can actually be fulfilled using single XPath like so :
.....
String path = "/Textbook/TEXT[Title='" + elementName + "']/" + language
XPathExpression expr = xpath.compile(path);
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
return nodes.item(i).getTextContent();
}
.....

Getting XML child elements with XPath

I have this XML:
<root>
<items>
<item1>
<tag1>1</tag1>
<sub>
<sub1>10 </sub1>
<sub2>20 </sub2>
</sub>
</item1>
<item2>
<tag1>1</tag1>
<sub>
<sub1> </sub1>
<sub2> </sub2>
</sub>
</item2>
</items>
</root>
I want to get the item1 element and the name and values of the child elements.
That is, i want to get: tag1 - 1,sub1-10,sub2-20.
How can i do this? so far i can only get elements without children.
Document doc = ...;
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("/root/items/item1/*/text()");
Object o = expr.evaluate(doc, XPathConstants.NODESET);
NodeList list = (NodeList) o;
import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
/**
* File: Ex1.java #author ronda
*/
public class Ex1 {
public static void main(String[] args) throws Exception {
DocumentBuilderFactory Factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = Factory.newDocumentBuilder();
Document doc = builder.parse("myxml.xml");
//creating an XPathFactory:
XPathFactory factory = XPathFactory.newInstance();
//using this factory to create an XPath object:
XPath xpath = factory.newXPath();
// XPath Query for showing all nodes value
XPathExpression expr = xpath.compile("//" + "item1" + "/*");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
System.out.println(nodes.getLength());
for (int i = 0; i < nodes.getLength(); i++) {
Element el = (Element) nodes.item(i);
System.out.println("tag: " + el.getNodeName());
// seach for the Text children
if (el.getFirstChild().getNodeType() == Node.TEXT_NODE)
System.out.println("inner value:" + el.getFirstChild().getNodeValue());
NodeList children = el.getChildNodes();
for (int k = 0; k < children.getLength(); k++) {
Node child = children.item(k);
if (child.getNodeType() != Node.TEXT_NODE) {
System.out.println("child tag: " + child.getNodeName());
if (child.getFirstChild().getNodeType() == Node.TEXT_NODE)
System.out.println("inner child value:" + child.getFirstChild().getNodeValue());;
}
}
}
}
}
I get this output loading the xml of your question in file named: myxml.xml:
run:
2
tag: tag1
inner value:1
tag: sub
inner value:
child tag: sub1
inner child value:10
child tag: sub2
inner child value:20
...a bit wordy, but allow us to understand how it works. PS: I found a good guide in here

Categories