XPath read root element returns null - java

Given an XML file like:
<source>
<element value="a"/>
<element value="b"/>
</source>
I'm trying to read the root element ("source") of the XML using Java and XPath:
public String parseExpression(Document doc) {
NodeList nodeList = (NodeList) xPath.compile("/").evaluate(
doc, XPathConstants.NODESET);
return nodeList.item(0).getFirstChild().getNodeValue();
}
However it returns null. Why?

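Because getNodeValue() is defined to return null for element nodes; it returns neither the element's name nor the value of any of its attributes. To read an attribute, try ((Element) node).getAttribute("value") instead, where node is one of the element children; to read the root element's name, call getNodeName() on nodeList.item(0).getFirstChild().
The value you are trying to read is not in the element node you are accessing.
It is in a separate attribute node, which is only accessible once you cast the node to Element.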
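To make the difference concrete, here is a minimal sketch, assuming a well-formed version of the XML from the question (child elements self-closed). The class name RootElementDemo and its helper methods are made up for illustration; only the JAXP and DOM calls are standard.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class RootElementDemo {
    // Assumed well-formed variant of the question's XML.
    static final String XML =
            "<source><element value=\"a\"/><element value=\"b\"/></source>";

    // Parses an XML string into a DOM document, wrapping checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Selects the root element with "/*" and returns its tag name.
    static String rootName(Document doc) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            Element root = (Element) xPath.evaluate("/*", doc, XPathConstants.NODE);
            return root.getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    // Returns the "value" attribute of the first <element> child of <source>.
    static String firstValue(Document doc) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            Element first = (Element) xPath.evaluate(
                    "/source/element[1]", doc, XPathConstants.NODE);
            return first.getAttribute("value");
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        // getNodeValue() is null for element nodes by definition.
        System.out.println(doc.getDocumentElement().getNodeValue());
        System.out.println(rootName(doc));   // name of the root element
        System.out.println(firstValue(doc)); // attribute of the first child
    }
}
```

Note that "/" selects the document node, while "/*" selects the root element directly, which avoids the extra getFirstChild() call.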

Related

Get XML child node from parent node using XPATH java

I'm trying to get specific child node from list of nodes using xpath.
Here is my xml input
<root>
<Transaction>
<code> 123 </code>
<Reason> test1 </Reason>
</Transaction>
<Transaction>
<code> 456 </code>
</Transaction>
<Transaction>
<code> 789 </code>
<Reason> test2 </Reason>
</Transaction>
</root>
I'm trying to get all the transactions as a node list and then check, one by one, whether each transaction has a Reason or not, all using XPath. Here is my sample code.
Document document = builder.parse(new FileInputStream(file));
XPath xPath = XPathFactory.newInstance().newXPath();
//Get all the transactions
NodeList nodeList = (NodeList) xPath.evaluate("//Transaction", document, XPathConstants.NODESET);
for (int temp = 0; temp < nodeList.getLength(); temp++) {
Node node = nodeList.item(temp);
Element element = (Element) node;
if (node.getNodeType() == Node.ELEMENT_NODE) {
XPath xPath2 = XPathFactory.newInstance().newXPath();
//This one always return first value
Node child = (Node) xPath2.evaluate("//Reason", node, XPathConstants.NODE);
if(child != null) {
System.out.println(child.getTextContent());
}
//This is working as expected
if(element.getElementsByTagName("Reason").getLength() > 0) {
System.out.println(element.getElementsByTagName("Reason").item(0).getTextContent());
}
}
}
If I cast the node to Element and get the child element by tag name, it works fine. But when I try to do it using XPath, it returns values from other nodes as well.
There's no need to execute the loop in Java; you can do it in the XPath expression itself:
//Transaction[Reason]
That XPath will return you all the Transaction elements which have a child Reason element.
If you want to get the Reason elements which are children of a Transaction element, then use this XPath:
//Transaction/Reason
If you want to get all the text nodes which are children of Reason elements which are children of a Transaction element, then use this XPath:
//Transaction/Reason/text()
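If you do want to keep the loop, the likely reason the original code always prints the first value is that an expression starting with // is absolute: it is evaluated from the document root even when you pass a Transaction node as the context. A relative path such as Reason (or .//Reason) stays inside the current node. Below is a minimal sketch of that difference; the class name RelativeXPathDemo and the method reasons are hypothetical, and the sample XML drops the padding spaces around the text values.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RelativeXPathDemo {
    // Sample data modelled on the question (whitespace around values removed).
    static final String XML = "<root>"
            + "<Transaction><code>123</code><Reason>test1</Reason></Transaction>"
            + "<Transaction><code>456</code></Transaction>"
            + "<Transaction><code>789</code><Reason>test2</Reason></Transaction>"
            + "</root>";

    // Evaluates reasonPath once per Transaction, using that Transaction as
    // the context node; the list holds null where nothing matched.
    static List<String> reasons(String xml, String reasonPath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList transactions = (NodeList) xPath.evaluate(
                    "//Transaction", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < transactions.getLength(); i++) {
                Node reason = (Node) xPath.evaluate(
                        reasonPath, transactions.item(i), XPathConstants.NODE);
                result.add(reason == null ? null : reason.getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Absolute path: searches the whole document on every iteration.
        System.out.println(reasons(XML, "//Reason"));
        // Relative path: only looks at children of the current Transaction.
        System.out.println(reasons(XML, "Reason"));
    }
}
```

With the relative path, the second transaction yields null because it has no Reason child, which is the per-transaction check the question was after.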

Why does getLocalName() return null?

I'm loading some XML string like this:
Document doc = getDocumentBuilder().parse(new InputSource(new StringReader(xml)));
Later, I extract a node from this Document:
XPath xpath = getXPathFactory().newXPath();
XPathExpression expr = xpath.compile(expressionXPATH);
NodeList nodeList = (NodeList)expr.evaluate(doc, XPathConstants.NODESET);
Node node = nodeList.item(0);
Now I want to get the local name of this node but I get null.
node.getLocalName(); // return null
With the debugger, I saw that my node has the following type: DOCUMENT_POSITION_DISCONNECTED.
The Javadoc states that getLocalName() returns null for this type of node.
Why is the node of type DOCUMENT_POSITION_DISCONNECTED and not ELEMENT_NODE?
How can I "convert" the type of the node?
As the documentation https://docs.oracle.com/javase/7/docs/api/org/w3c/dom/Node.html#getLocalName() states:
for nodes created with a DOM Level 1 method, [...] this is always null
so make sure you use a namespace-aware DocumentBuilderFactory by calling setNamespaceAware(true). The parser then builds a namespace-aware DOM Level 2/3 tree, and getLocalName() returns a non-null value.
A simple test program
String xml = "<root/>";
DocumentBuilderFactory db = DocumentBuilderFactory.newInstance();
Document dom1 = db.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
System.out.println(dom1.getDocumentElement().getLocalName() == null);
db.setNamespaceAware(true);
Document dom2 = db.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
System.out.println(dom2.getDocumentElement().getLocalName() == null);
outputs
true
false
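so (at least) the local name problem you have is caused by using a DOM Level 1 document, that is, one built by a DocumentBuilderFactory that is not namespace aware. As for the node type: DOCUMENT_POSITION_DISCONNECTED and ELEMENT_NODE are both constants with the value 1 on the Node interface, so the debugger is most likely just labelling the value 1 with the wrong constant name, and the node is an ordinary element.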

Java xPath - extract subdocument from XML

I have an XML document as follows:
<DocumentWrapper>
<DocumentHeader>
...
</DocumentHeader>
<DocumentBody>
<Invoice>
<Buyer/>
<Seller/>
</Invoice>
</DocumentBody>
</DocumentWrapper>
I would like to extract from it the content of DocumentBody element as String, raw XML document:
<Invoice>
<Buyer/>
<Seller/>
</Invoice>
With XPath it should be simple to select it with:
/DocumentWrapper/DocumentBody
Unfortunately, my Java code doesn't work as I want. It returns empty lines instead of the expected result. Is there any way to do this, or do I have to return a NodeList and then generate an XML document from it?
My Java code:
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
XPathExpression expression = xPath.compile(xPathQuery);
String result = expression.evaluate(xmlDocument);
Calling this method
String result = expression.evaluate(xmlDocument);
is the same as calling this
String result = (String) expression.evaluate(xmlDocument, XPathConstants.STRING);
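which returns the string value of the result node: its own character data or, when the result node is an element, the concatenated text of all its descendant text nodes. DocumentBody contains only whitespace text, hence the empty lines.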
You should probably do something like this:
Node result = (Node) expression.evaluate(xmlDocument, XPathConstants.NODE);
TransformerFactory.newInstance().newTransformer()
.transform(new DOMSource(result), new StreamResult(System.out));
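Since the question asks for the result as a String rather than on System.out, the same idea can be sketched with a StringWriter. This is only an illustration under assumptions: the class name SubdocumentDemo and the method extract are invented here, the header element is left empty, and the XPath selects the Invoice element so that the DocumentBody wrapper is not part of the output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SubdocumentDemo {
    // Sample input modelled on the question; the header content is omitted.
    static final String XML = "<DocumentWrapper><DocumentHeader/>"
            + "<DocumentBody><Invoice><Buyer/><Seller/></Invoice></DocumentBody>"
            + "</DocumentWrapper>";

    // Serializes the first node matched by xPathQuery to an XML string,
    // or returns null when the query matches nothing.
    static String extract(String xml, String xPathQuery) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(xPathQuery, doc, XPathConstants.NODE);
            if (node == null) {
                return null; // nothing matched
            }
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Leave out the <?xml ...?> header so only the fragment is returned.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not extract subdocument", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extract(XML, "/DocumentWrapper/DocumentBody/Invoice"));
    }
}
```

If DocumentBody can hold several children, evaluating with XPathConstants.NODESET and serializing each node in turn would be the analogous approach.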

How to retrieve a specific node's value in XPath?

I have an XML file with this format:
<object>
<origin>1:1:1</origin>
<normal>2:2:2</normal>
<leafs>
<object>
<origin>1:1:1</origin>
<normal>3:3:3</normal>
<leafs>none</leafs>
</object>
</leafs>
</object>
How could I retrieve the value "none" of element <leafs> on second level of the tree? I used this
XPathExpression expLeafs = xpath.compile("*[name()='leafs']");
Object resLeafs = expLeafs.evaluate(node, XPathConstants.NODESET);
NodeList leafsList = (NodeList) resLeafs;
if (!leafsList.item(0).getFirstChild().getNodeValue().equals("none"))
more code...
but it doesn't work because there are some empty text nodes before and after "none". Is there a way to deal with this, something like xpath.compile("*[value()='none']")?
I just ran a simple test program using your XML file and
expr = xpath.compile("/object/leafs/object/leafs/text()");
and got the desired "none" result. If you have additional requirements, you'll have to edit your question.
After checking the line of code @Lord Torgamus provided, I managed to parse the document as I needed, like this:
XPathExpression expLeafs = xpath.compile("*[name()='leafs']");
Object resLeafs = expLeafs.evaluate(node, XPathConstants.NODESET);
NodeList leafsList = (NodeList) resLeafs;
Node nd = leafsList.item(0);
XPathExpression expr = xpath.compile("text()");
Object resultObj = expr.evaluate(nd, XPathConstants.NODE);
String str = expr.evaluate(nd).trim();
System.out.println(str);
and the output is "none" with no other empty text node.
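An alternative that avoids trimming in Java is the standard XPath function normalize-space(), which strips leading and trailing whitespace and collapses inner runs. It can also be used in a predicate, which is probably the closest thing to the *[value()='none'] idea from the question. A small sketch, with a made-up class name LeafsDemo and hypothetical helper methods:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LeafsDemo {
    // The question's XML, including the line breaks that create the
    // whitespace-only text nodes.
    static final String XML = "<object>\n"
            + "  <origin>1:1:1</origin>\n"
            + "  <normal>2:2:2</normal>\n"
            + "  <leafs>\n"
            + "    <object>\n"
            + "      <origin>1:1:1</origin>\n"
            + "      <normal>3:3:3</normal>\n"
            + "      <leafs>none</leafs>\n"
            + "    </object>\n"
            + "  </leafs>\n"
            + "</object>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Whitespace-normalized text of the second-level <leafs> element.
    static String innerLeafs(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(
                    "normalize-space(/object/leafs/object/leafs)", parse(xml));
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    // Number of <leafs> elements whose normalized text is exactly "none".
    static int countNoneLeafs(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList matches = (NodeList) xpath.evaluate(
                    "//leafs[normalize-space(.)='none']",
                    parse(xml), XPathConstants.NODESET);
            return matches.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(innerLeafs(XML));
        System.out.println(countNoneLeafs(XML));
    }
}
```

The outer leafs element does not match the predicate because its string value also contains the text of its nested origin and normal elements.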

Document - How to get a tag's value by its name?

I'm using Java's DOM parser to parse an XML file.
Let's say I have the following XML:
<?xml version="1.0"?>
<config>
<dotcms>
<endPoint>ip</endPoint>
</dotcms>
</config>
I like to get the value of 'endPoint'. I can do it with the following code snippet. (assuming that I already parsed it with DocumentBuilder)
NodeList nodeList = this.doc.getElementsByTagName("dotcms");
Node nValue = (Node) nodeList.item(0);
return nValue.getNodeValue();
Is it possible to get a value of a field by a field's name? Like....
Node nValue = nodeList.getByName("endPoint") something like this...?
You should use XPath for these sorts of tasks:
//endPoint/text()
or:
/config/dotcms/endPoint/text()
Java has built-in support for XPath:
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("//endPoint/text()");
Object value = expr.evaluate(doc, XPathConstants.STRING);
You could also use jOOX, a jQuery-like DOM wrapper, to write even less code:
// Using css-style selectors
String text1 = $(document).find("endPoint").text();
// Using XPath
String text2 = $(document).xpath("//endPoint").text();
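For reference, here is a self-contained sketch of the built-in route next to a plain DOM lookup by tag name, assuming the XML from the question without the stray closing tag. The class name EndPointDemo and both method names are invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EndPointDemo {
    static final String XML = "<?xml version=\"1.0\"?>"
            + "<config><dotcms><endPoint>ip</endPoint></dotcms></config>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // XPath route: the two-argument evaluate() returns the string value.
    static String viaXPath(Document doc) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate("/config/dotcms/endPoint", doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    // DOM route: look the element up by tag name and read its text content.
    // getTextContent() is used because getNodeValue() is null for elements.
    static String viaDom(Document doc) {
        return doc.getElementsByTagName("endPoint").item(0).getTextContent();
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(viaXPath(doc));
        System.out.println(viaDom(doc));
    }
}
```

The DOM variant throws a NullPointerException when no endPoint element exists, so a length check on the returned NodeList would be needed in real code.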
