I want to get the text of the English or Hungarian element, depending on the title. So far I have come up with the code below. Is there a cleaner or more professional way to do this with XPath?
The XML:
<Textbook>
<TEXT>
<Title>SAMPLE TITLE 1</Title>
<English>Sample english text</English>
<Hungarian>Sample hungarian text</Hungarian>
</TEXT>
<TEXT>
<Title>SAMPLE TITLE 2</Title>
<English>Sample english text 2</English>
<Hungarian>Sample hungarian text 2</Hungarian>
</TEXT>
</Textbook>
The code:
public String getResults(String elementName, String language) throws XPathExpressionException {
    xpathFactory = XPathFactory.newInstance();
    xpath = xpathFactory.newXPath();
    XPathExpression expr = xpath.compile("/Textbook/TEXT/Title");
    NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
    for (int i = 0; i < nodes.getLength(); i++) {
        if (nodes.item(i).getTextContent().equals(elementName)) {
            XPathExpression expr2 = xpath.compile("/Textbook/TEXT/" + language);
            NodeList nodes2 = (NodeList) expr2.evaluate(doc, XPathConstants.NODESET);
            return nodes2.item(i).getTextContent();
        }
    }
    return null;
}
You can filter <TEXT> by the content of its child <Title>. For example, this XPath selects the <TEXT> element whose child <Title> has the content "SAMPLE TITLE 1":
//TEXT[Title='SAMPLE TITLE 1']
So your requirement can actually be fulfilled with a single XPath, like so:
.....
String path = "/Textbook/TEXT[Title='" + elementName + "']/" + language;
XPathExpression expr = xpath.compile(path);
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
    return nodes.item(i).getTextContent();
}
.....
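For reference, here is a minimal self-contained sketch of the same idea. The class name TextbookLookup and the method getResult are made up for illustration, the XML is the sample from the question inlined as a string, and the sketch assumes the title contains no single quote (which would break the predicate).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TextbookLookup {
    // The sample document from the question, inlined so the sketch runs on its own.
    static final String XML =
        "<Textbook>"
        + "<TEXT><Title>SAMPLE TITLE 1</Title>"
        + "<English>Sample english text</English>"
        + "<Hungarian>Sample hungarian text</Hungarian></TEXT>"
        + "<TEXT><Title>SAMPLE TITLE 2</Title>"
        + "<English>Sample english text 2</English>"
        + "<Hungarian>Sample hungarian text 2</Hungarian></TEXT>"
        + "</Textbook>";

    // Returns the text of the language element under the TEXT whose Title matches,
    // or null when nothing matches.
    static String getResult(String title, String language) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Assumption: title contains no single quote.
            String path = "/Textbook/TEXT[Title='" + title + "']/" + language;
            // NODE returns the first match in document order, so no loop is needed.
            Node node = (Node) xpath.evaluate(path, doc, XPathConstants.NODE);
            return node == null ? null : node.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XPath lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(getResult("SAMPLE TITLE 2", "Hungarian"));
    }
}
```

Asking for XPathConstants.NODE instead of NODESET avoids the loop that returns on its first iteration.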
I am trying to display the text of all <text> elements in an XFA XML document while ignoring namespaces.
I came up with an XPath that returns the desired results in XMLSpy with XPath 1.0, but the same XPath in Java returns null for some reason.
Xpath = //*[local-name()='text'][string-length(normalize-space(.))>0]
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
ArrayList<String> list = new ArrayList<>();
XPathExpression expr = xpath.compile("//*[local-name()='text'][string-length(normalize-space(.))>0]");
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println("This prints null = " + nodes.item(i).getNodeValue());
}
The XML file would not post here, so it can be viewed at the link below:
https://drive.google.com/file/d/1n-v3gzT-3GgxNnYKFUvMPjRQmtnkqcpY/view?usp=sharing
The problem is that it's not the <text> elements that contain the values, but their child text nodes.
Replace the line
System.out.println("This prints null = " + nodes.item(i).getNodeValue());
with
System.out.println("This does not print null = " + nodes.item(i).getFirstChild().getNodeValue());
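To make the difference concrete, here is a small sketch. The class name TextNodeValue, the tiny document, and the namespace urn:example:xfa are invented for illustration and are not taken from the linked file. It reads the first non-empty text element three ways: getNodeValue() on the element itself, getNodeValue() on its first child, and getTextContent().

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TextNodeValue {
    // Hypothetical stand-in for an XFA document; the namespace URI is made up.
    static final String XML =
        "<template xmlns='urn:example:xfa'>"
        + "<draw><value><text>Hello XFA</text></value></draw>"
        + "<draw><value><text> </text></value></draw>"
        + "</template>";

    // Returns { element node value, first child node value, text content }.
    static String[] read() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node text = (Node) xpath.evaluate(
                "//*[local-name()='text'][string-length(normalize-space(.))>0]",
                doc, XPathConstants.NODE);
            return new String[] {
                text.getNodeValue(),                 // null: elements have no node value
                text.getFirstChild().getNodeValue(), // value of the child text node
                text.getTextContent()                // all descendant text concatenated
            };
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        for (String s : read()) {
            System.out.println(s);
        }
    }
}
```

getTextContent() is the more forgiving choice when a text element may hold nested markup, because getFirstChild() only helps when the first child happens to be a text node.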
I have XML as follows:
<students>
<Student><age>23</age><id>2000</id><name>PP2000</name></Student>
<Student><age>23</age><id>1000</id><name>PP1000</name></Student>
</students>
I have two XPaths. The template XPath, students/Student, selects the template nodes, but I cannot hard-code it because it changes for other XMLs; the XML is quite dynamic and can expand (with the same base XPaths). To evaluate a second XPath against each template node, I'm using the following code:
XPath xpathResource = XPathFactory.newInstance().newXPath();
Document xmlDocument = //creating document;
NodeList nodeList = (NodeList) xpathResource.compile("//students/Student").evaluate(xmlDocument, XPathConstants.NODESET);
for (int nodeIndex = 0; nodeIndex < nodeList.getLength(); nodeIndex++) {
    Node currentNode = nodeList.item(nodeIndex);
    String xpathID = "//students/Student/id";
    String xpathName = "//students/Student/name";
    NodeList childID = (NodeList) xpathResource.compile(xpathID).evaluate(currentNode, XPathConstants.NODESET);
    NodeList childName = (NodeList) xpathResource.compile(xpathName).evaluate(currentNode, XPathConstants.NODESET);
    System.out.println("node ID " + childID.item(0).getTextContent());
    System.out.println("node Name " + childName.item(0).getTextContent());
}
Now the problem is that this for loop executes twice, but both times I get 2000 and PP2000 as the values. Is there any way to evaluate a generic XPath against each child node? I cannot run a generic XPath against the whole XML document, because I have some validation to do. I want to use the XML node list as result-set rows, so that I can validate each XML value and do my processing.
XPath xpathResource = XPathFactory.newInstance().newXPath();
Document xmlDocument = //creating document;
NodeList nodeList = (NodeList) xpathResource.compile("//students/Student/id").evaluate(xmlDocument, XPathConstants.NODESET);
for (int nodeIndex = 0; nodeIndex < nodeList.getLength(); nodeIndex++) {
    Node currentNode = nodeList.item(nodeIndex);
    System.out.println("node " + currentNode.getTextContent());
}
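One likely cause of the repeated 2000 / PP2000 is that a path starting with // is absolute: it is resolved from the root of the document that contains the context node, so every iteration sees the first Student again. A possible sketch, assuming the per-row paths can be written relative to the template node (here simply id and name), is shown below. The class name StudentRows and the method readRows are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class StudentRows {
    // The sample document from the question, inlined so the sketch runs on its own.
    static final String XML =
        "<students>"
        + "<Student><age>23</age><id>2000</id><name>PP2000</name></Student>"
        + "<Student><age>23</age><id>1000</id><name>PP1000</name></Student>"
        + "</students>";

    // Treats each Student as a row and reads its columns with relative paths.
    static List<String> readRows() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Relative paths: resolved against whatever node is passed to evaluate().
            XPathExpression idExpr = xpath.compile("id");
            XPathExpression nameExpr = xpath.compile("name");
            NodeList students = (NodeList) xpath.evaluate(
                "//students/Student", doc, XPathConstants.NODESET);
            List<String> rows = new ArrayList<>();
            for (int i = 0; i < students.getLength(); i++) {
                Node student = students.item(i);
                // evaluate(Object) returns the string value ("" when nothing matches).
                rows.add(idExpr.evaluate(student) + " " + nameExpr.evaluate(student));
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        readRows().forEach(System.out::println);
    }
}
```

Each row can then be validated before moving on to the next Student, which keeps the template XPath and the per-row XPaths independent of each other.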
XML stream
<l>
<i>
<a>AAA</a>
<b>BBB</b>
<c>CCC</c>
</i>
<i>
<a>AAA2</a>
<b>BBB2</b>
<c>CCC2</c>
</i>
<i>
...
</i>
</l>
I want to output the following text with some Java code:
> CCC
> CCC2
...
Here is the code I wrote to produce the expected result:
Java code
DocumentBuilder docBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document d = docBuilder.parse("file:///C:/path/to/my/xml/stream.xml");
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("//i");
NodeList listOfiNodes = (NodeList) expr.evaluate(d, XPathConstants.NODESET);
for (int i = 0; i < listOfiNodes.getLength(); i++) {
    XPathExpression expr2 = xpath.compile("//c");
    System.out.println("> " + ((Node) expr2.evaluate(listOfiNodes.item(i), XPathConstants.NODE)).getTextContent());
}
expr2 keeps on returning the first c node. So I get this output:
> CCC
> CCC
...
The evaluation performed by expr2 doesn't seem to "stay" on the node passed to evaluate() method. Why?
Note: I don't want to get the c nodes directly with the XPath //i/c (or /l/i/c).
Java 6
//c selects all matching nodes in the whole document. Use c instead and you will receive this output:
> CCC
> CCC2
Note that the line printing the result throws a NullPointerException if an i node does not contain a c child. The following code should work as expected:
DocumentBuilder docBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document d = docBuilder.parse("stream.xml");
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("//i");
NodeList listOfiNodes = (NodeList) expr.evaluate(d, XPathConstants.NODESET);
for (int i = 0; i < listOfiNodes.getLength(); i++) {
    javax.xml.xpath.XPathExpression expr2 = xpath.compile("c");
    Node item = listOfiNodes.item(i);
    Node node = (Node) expr2.evaluate(item, XPathConstants.NODE);
    if (null != node) {
        System.out.println("> " + node.getTextContent());
    }
}
Replace "//c" with ".//c":
XPathExpression expr2 = xpath.compile(".//c");
This starts the search at the current node and matches c at any depth below it, instead of searching the whole document.
XPathExpression expr2 = xpath.compile(".//c");
for (int i = 0; i < listOfiNodes.getLength(); i++) {
    System.out.println("> " + ((Node) expr2.evaluate(listOfiNodes.item(i), XPathConstants.NODE)).getTextContent());
}
Output:
> CCC
> CCC2
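The three forms discussed in these answers differ only in where the search starts and how deep it goes. The sketch below is a hypothetical variant of the question's document in which the second i wraps its c in an extra w element (added here purely to show the difference between c and .//c). The class name ContextPaths and the method eval are made up for illustration. Every expression is evaluated with the second i as the context node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class ContextPaths {
    // Hypothetical variant of the question's XML: the second c is nested inside w.
    static final String XML =
        "<l>"
        + "<i><a>AAA</a><c>CCC</c></i>"
        + "<i><a>AAA2</a><w><c>CCC2</c></w></i>"
        + "</l>";

    // Evaluates the expression against the second i element and returns the
    // string value of the first match ("" when nothing matches).
    static String eval(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node secondI = (Node) xpath.evaluate("/l/i[2]", doc, XPathConstants.NODE);
            return xpath.evaluate(expression, secondI);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("//c  -> '" + eval("//c") + "'");  // whole document
        System.out.println("c    -> '" + eval("c") + "'");    // direct children only
        System.out.println(".//c -> '" + eval(".//c") + "'"); // any depth below context
    }
}
```

With the context on the second i, //c still finds CCC from the first i, c finds nothing because the c is not a direct child, and .//c finds CCC2.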
I have this XML:
<root>
<items>
<item1>
<tag1>1</tag1>
<sub>
<sub1>10 </sub1>
<sub2>20 </sub2>
</sub>
</item1>
<item2>
<tag1>1</tag1>
<sub>
<sub1> </sub1>
<sub2> </sub2>
</sub>
</item2>
</items>
</root>
I want to get the item1 element and the names and values of its child elements.
That is, I want to get: tag1 - 1, sub1 - 10, sub2 - 20.
How can I do this? So far I can only get elements without children.
Document doc = ...;
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("/root/items/item1/*/text()");
Object o = expr.evaluate(doc, XPathConstants.NODESET);
NodeList list = (NodeList) o;
import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
/**
* File: Ex1.java @author ronda
*/
public class Ex1 {
    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = dbFactory.newDocumentBuilder();
        Document doc = builder.parse("myxml.xml");
        // creating an XPathFactory:
        XPathFactory factory = XPathFactory.newInstance();
        // using this factory to create an XPath object:
        XPath xpath = factory.newXPath();
        // XPath query selecting every child element of item1
        XPathExpression expr = xpath.compile("//" + "item1" + "/*");
        Object result = expr.evaluate(doc, XPathConstants.NODESET);
        NodeList nodes = (NodeList) result;
        System.out.println(nodes.getLength());
        for (int i = 0; i < nodes.getLength(); i++) {
            Element el = (Element) nodes.item(i);
            System.out.println("tag: " + el.getNodeName());
            // search for the Text children (getFirstChild() is null for an empty element)
            Node first = el.getFirstChild();
            if (first != null && first.getNodeType() == Node.TEXT_NODE)
                System.out.println("inner value:" + first.getNodeValue());
            NodeList children = el.getChildNodes();
            for (int k = 0; k < children.getLength(); k++) {
                Node child = children.item(k);
                if (child.getNodeType() != Node.TEXT_NODE) {
                    System.out.println("child tag: " + child.getNodeName());
                    Node childFirst = child.getFirstChild();
                    if (childFirst != null && childFirst.getNodeType() == Node.TEXT_NODE)
                        System.out.println("inner child value:" + childFirst.getNodeValue());
                }
            }
        }
    }
}
I get this output when loading the XML from your question from a file named myxml.xml:
run:
2
tag: tag1
inner value:1
tag: sub
inner value:
child tag: sub1
inner child value:10
child tag: sub2
inner child value:20
...a bit wordy, but it lets us understand how it works. PS: I found a good guide here.
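If only the leaf elements under item1 are wanted, one alternative is to let XPath do the traversal: the predicate [not(*)] keeps elements that have no element children. This is a sketch, not a replacement for the walk above. The class name LeafElements and the method read are made up for illustration, the XML is the question's sample inlined, and values are trimmed because the sample contains trailing spaces.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LeafElements {
    // The sample document from the question, inlined so the sketch runs on its own.
    static final String XML =
        "<root><items>"
        + "<item1><tag1>1</tag1><sub><sub1>10 </sub1><sub2>20 </sub2></sub></item1>"
        + "<item2><tag1>1</tag1><sub><sub1> </sub1><sub2> </sub2></sub></item2>"
        + "</items></root>";

    // Returns "name - value" for every descendant of item1 without element children.
    static List<String> read() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList leaves = (NodeList) xpath.evaluate(
                "/root/items/item1//*[not(*)]", doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < leaves.getLength(); i++) {
                Node leaf = leaves.item(i);
                out.add(leaf.getNodeName() + " - " + leaf.getTextContent().trim());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        read().forEach(System.out::println);
    }
}
```

The container element sub is skipped because it has element children, so only tag1, sub1 and sub2 are reported.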
Given an xml document that looks like the following:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<entry key="agentType">STANDARD</entry>
<entry key="DestinationTransferStates"></entry>
<entry key="AgentStatusPublishRate">300</entry>
<entry key="agentVersion">f000-703-GM2-20101109-1550</entry>
<entry key="CommandTimeUTC">2010-12-24T02:25:43Z</entry>
<entry key="PublishTimeUTC">2010-12-24T02:26:09Z</entry>
<entry key="queueManager">AGENTQMGR</entry>
</properties>
I want to print the values of the "key" attribute and the element so it looks like this:
agentType = STANDARD
DestinationTransferStates =
AgentStatusPublishRate = 300
agentVersion = f000-703-GM2-20101109-1550
CommandTimeUTC = 2010-12-24T02:25:43Z
PublishTimeUTC = 2010-12-24T02:26:09Z
queueManager = AGENTQMGR
I'm able to print the node values with no problem using this code:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("//properties/entry/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println(nodes.item(i).getNodeValue());
}
And I can print the values of the "key" attribute by changing the xpath expression and the node methods as follows:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("//properties/entry");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println(nodes.item(i).getAttributes().getNamedItem("key").getNodeValue());
}
It seems like there would be a way to get at both of these values in a single evaluate. I could always evaluate two NodeLists and iterate through them with a common index but I'm not sure they are guaranteed to be returned in the same order. Any suggestions appreciated.
What about getTextContent()? Called on the entry elements selected by //properties/entry, it should do the job:
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
    Node currentItem = nodes.item(i);
    String key = currentItem.getAttributes().getNamedItem("key").getNodeValue();
    String value = currentItem.getTextContent();
    System.out.printf("%s = %s%n", key, value);
}
For further information, please see the Javadoc for getTextContent(). I hope this helps.
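Because the key and the value are read from the same entry node, the ordering concern from the question does not arise. As a self-contained sketch of that approach (the class name EntryPairs and the method read are made up for illustration; the XML is a shortened copy of the question's sample with the DOCTYPE left out so that parsing does not depend on resolving the DTD):

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class EntryPairs {
    // Shortened copy of the question's sample, without the DOCTYPE declaration.
    static final String XML =
        "<properties>"
        + "<entry key=\"agentType\">STANDARD</entry>"
        + "<entry key=\"DestinationTransferStates\"></entry>"
        + "<entry key=\"AgentStatusPublishRate\">300</entry>"
        + "</properties>";

    // Collects key attribute -> element text, in the order the nodes are returned.
    static Map<String, String> read() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList entries = (NodeList) xpath.evaluate(
                "//properties/entry", doc, XPathConstants.NODESET);
            Map<String, String> pairs = new LinkedHashMap<>();
            for (int i = 0; i < entries.getLength(); i++) {
                Element entry = (Element) entries.item(i);
                // Both values come from the same node, so they cannot get out of step.
                pairs.put(entry.getAttribute("key"), entry.getTextContent());
            }
            return pairs;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        read().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

An empty entry such as DestinationTransferStates yields an empty string rather than null, since getTextContent() on an element with no children returns "".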