Java XML Parse xPath not getting all child elements

I have a very large XML file with many elements.
I'm only interested in the cases, which look like the example below. There are about 400 cases in the XML document. I want to parse the document and print each element's name and value.
<cases>
<case>
<id/>
<title/>
<type/>
<priority/>
<estimate/>
<references/>
<custom>
<functional_area/>
<technology_dependence/>
<reviewed/>
<steps_completed>
</steps_completed>
<preconds> </preconds>
<steps_seperated>
<step>
<index/>
<content>
</content>
<expected>
</expected>
</step>
<step>
<index/>
<content>
</content>
<expected>
</expected>
</step>
<step>
</steps_seperated>
</custom>
</case>
At the moment my code works fine up until "steps_seperated", where it stops and moves on to the next case.
I can't work out why it stops after "steps_seperated" and starts a new case.
A second issue I've noticed is that it only displays 10 or so cases (I'm not sure whether this is because I'm running it in NetBeans).
Any help would be very much appreciated. Thank you.
P.S. Here is an MVCE:
public void printCaseElements(NodeList list) {
    for (int i = 0; i < list.getLength(); i++) {
        Element el = (Element) list.item(i);
        System.out.println("tag: " + el.getNodeName());
        if (el.getFirstChild().getNodeType() == Node.TEXT_NODE) {
            System.out.println("Inner Value: " + el.getFirstChild().getNodeValue());
            System.out.println("________________________________________________________________________");
            NodeList children = el.getChildNodes();
            for (int k = 0; k < children.getLength(); k++) {
                Node child = children.item(k);
                if (child.getNodeType() != Node.TEXT_NODE) {
                    System.out.println("child tag: " + child.getNodeName());
                    if (child.getFirstChild().getNodeType() == Node.TEXT_NODE) {
                        System.out.println("inner child value :" + child.getFirstChild().getNodeValue());
                        System.out.println("____________________________________________________________________________________________");
                    }
                }
            }
        }
    }
}
P.P.S. I think the issue may be in how I use XPath:
DOMParser parser =new DOMParser() ;
InputSource source = new InputSource(path) ;
try {
parser.parse(source);
Element docElement = parser.getDocument().getDocumentElement();
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
XPathExpression expression =xPath.compile("//case/*");
NodeList list =(NodeList) expression.evaluate(docElement,XPathConstants.NODESET);
DocumentBuilder documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document newDoc = documentBuilder.newDocument();
Element newElement = newDoc.createElement("cases");
newDoc.appendChild(newElement);
for(int i =0 ; i <list.getLength(); i++){
Node n = newDoc.importNode(list.item(i), true);
newElement.appendChild(n);
}
When I changed "//case/*" to "//steps_seperated/*", it showed all the elements below steps_seperated, but not the elements before steps_seperated.
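One possible explanation, going by the code shown: "//case/*" selects only the direct children of each case, and the nested loop in printCaseElements descends exactly one more level. The step elements sit two levels below custom, so they are never visited. A recursive walk avoids hard-coding the depth. The following is a minimal sketch rather than a drop-in fix: the class CaseWalker, its method names and the "depth:tag=text" output format are made up for illustration, and it parses from a string with the standard JAXP DocumentBuilder instead of DOMParser. It also never calls getFirstChild() without a check, since that method returns null for empty elements such as <id/>.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; the name and the output format are illustrative only.
class CaseWalker {

    // Returns one "depth:tag=text" entry per element under every <case>, at any depth.
    static List<String> describeCases(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            // Select the <case> elements themselves; the recursion finds the descendants.
            NodeList cases = (NodeList) xPath.evaluate("//case", doc, XPathConstants.NODESET);
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < cases.getLength(); i++) {
                walk(cases.item(i), 0, lines);
            }
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    // Records the element, then recurses into each of its child nodes.
    private static void walk(Node node, int depth, List<String> lines) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip whitespace text, comments and so on
        }
        lines.add(depth + ":" + node.getNodeName() + "=" + ownText(node));
        NodeList children = node.getChildNodes();
        for (int k = 0; k < children.getLength(); k++) {
            walk(children.item(k), depth + 1, lines);
        }
    }

    // Concatenates the element's direct text children; empty elements yield "".
    private static String ownText(Node element) {
        StringBuilder text = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int k = 0; k < children.getLength(); k++) {
            Node child = children.item(k);
            if (child.getNodeType() == Node.TEXT_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString().trim();
    }

    public static void main(String[] args) {
        String xml = "<cases><case><id>7</id><custom><steps_seperated>"
                + "<step><index>1</index><content>go</content></step>"
                + "</steps_seperated></custom></case></cases>";
        describeCases(xml).forEach(System.out::println);
    }
}
```

Because the recursion starts at each case element and follows getChildNodes() all the way down, step, index, content and expected are reported along with the top-level elements, however deeply they are nested.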

Related

XPATH evaluation against child node

I have xml as follows,
<students>
<Student><age>23</age><id>2000</id><name>PP2000</name></Student>
<Student><age>23</age><id>1000</id><name>PP1000</name></Student>
</students>
I have two XPaths. The template XPath, students/Student, selects the template nodes, but I cannot hard-code it, because it will change for other XMLs; the XML is quite dynamic and can expand (but with the same base XPaths). To evaluate one more XPath against each template node, I'm using the following code:
XPath xpathResource = XPathFactory.newInstance().newXPath();
Document xmlDocument = ...; // creating document
NodeList nodeList = (NodeList) xpathResource.compile("//students/Student").evaluate(xmlDocument, XPathConstants.NODESET);
for (int nodeIndex = 0; nodeIndex < nodeList.getLength(); nodeIndex++) {
    Node currentNode = nodeList.item(nodeIndex);
    String xpathID = "//students/Student/id";
    String xpathName = "//students/Student/name";
    NodeList childID = (NodeList) xpathResource.compile(xpathID).evaluate(currentNode, XPathConstants.NODESET);
    NodeList childName = (NodeList) xpathResource.compile(xpathName).evaluate(currentNode, XPathConstants.NODESET);
    System.out.println("node ID " + childID.item(0).getTextContent());
    System.out.println("node Name " + childName.item(0).getTextContent());
}
Now the problem is that this for loop executes twice, but both times I get 2000 and PP2000 as the ID and name values. Is there any way to evaluate a generic XPath against a child node? I cannot evaluate a generic XPath against the whole XML document, because I have some validation to do. I want to use the XML node list as result-set rows, so that I can validate each XML value and do my processing.

XPath xpathResource = XPathFactory.newInstance().newXPath();
Document xmlDocument = ...; // creating document
NodeList nodeList = (NodeList) xpathResource.compile("//students/Student/id").evaluate(xmlDocument, XPathConstants.NODESET);
for (int nodeIndex = 0; nodeIndex < nodeList.getLength(); nodeIndex++) {
    Node currentNode = nodeList.item(nodeIndex);
    System.out.println("node " + currentNode.getTextContent());
}
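The repeated 2000 / PP2000 is what XPath prescribes: an expression that starts with "/" or "//" is resolved from the document root, whichever node is passed as the context, so item(0) is always the first Student's value. A relative path such as "id" is resolved against the context node instead. Here is a minimal sketch of that idea; the class and method names are hypothetical, and the document is parsed from a string only to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original question.
class RelativeXPathDemo {

    // Returns one "id,name" row per Student element.
    static List<String> idAndNamePerStudent(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList students = (NodeList) xpath.evaluate(
                    "//students/Student", doc, XPathConstants.NODESET);

            // Relative expressions: no leading slash, so they start at the context node.
            XPathExpression idExpr = xpath.compile("id");
            XPathExpression nameExpr = xpath.compile("name");

            List<String> rows = new ArrayList<>();
            for (int i = 0; i < students.getLength(); i++) {
                Node student = students.item(i);
                String id = (String) idExpr.evaluate(student, XPathConstants.STRING);
                String name = (String) nameExpr.evaluate(student, XPathConstants.STRING);
                rows.add(id + "," + name);
            }
            return rows;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<students>"
                + "<Student><age>23</age><id>2000</id><name>PP2000</name></Student>"
                + "<Student><age>23</age><id>1000</id><name>PP1000</name></Student>"
                + "</students>";
        idAndNamePerStudent(xml).forEach(System.out::println);
    }
}
```

Since only the template XPath names the row element, the per-row expressions ("id", "name") stay generic and can be reused with a different template path.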

Getting tag and value from XML file

I have an XML config file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Config>
<useProxy>true</useProxy>
<proxyReqPass>true</proxyReqPass>
<proxyHost>proxy.net.br</proxyHost>
<proxyUser>admin</proxyUser>
<proxyPass>12345</proxyPass>
</Config>
I have a list of Data objects; each Data contains two strings, the tag name and the value of the tag. I want to fill this list with the data from the XML file, like this:
List<Data> data = new ArrayList<Data>();
File fXmlFile = new File("Config.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
NodeList elements = doc.getElementsByTagName("Config");
for (int i = 0; i < elements.getLength(); i++) {
Node nNode = elements.item(i);
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
data.add(new Data(eElement.getTagName(), eElement.getTextContent()));
}
}
And if I print the list:
for (int i = 0; i < data.size(); i++)
    System.out.println("Node: " + data.get(i).getTagName() + " Value: " + data.get(i).getTextContent());
I want the result to be:
Node: useProxy Value: true
Node: proxyReqPass Value: true
Node: proxyHost Value: proxy.net.br
Node: proxyUser Value: admin
Node: proxyPass Value: 12345
But the result is:
Node: Config Value:
false
false
I don't know where my mistake is. Could somebody please help me?

You're iterating over the results of the search for the <Config> tag. You should be iterating over the children of the search result.
NodeList configTags = doc.getElementsByTagName("Config");
// assuming there will only be one `Config` node
NodeList elements = configTags.item(0).getChildNodes();
for (int i = 0; i < elements.getLength(); i++) {
    // (everything else looks correct)...
}
When you call getElementsByTagName(), a NodeList is returned which, in your case, should always contain one node, the <Config> node. To access the child nodes (<useProxy>, etc.), you need to get the first Node out of the node list and query for its children with getChildNodes().
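Put together, the approach above might look like the following sketch. The Data class from the question is not shown, so a LinkedHashMap stands in for the list of tag/value pairs; ConfigReader and readSettings are hypothetical names. The ELEMENT_NODE check matters here because getChildNodes() also returns the whitespace text nodes between the tags.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; a Map replaces the question's Data class, which is not shown.
class ConfigReader {

    // Returns tag name -> text value for each child element of the first <Config>.
    static Map<String, String> readSettings(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Assumes the document contains at least one <Config> element.
            NodeList children = doc.getElementsByTagName("Config").item(0).getChildNodes();

            Map<String, String> settings = new LinkedHashMap<>(); // keeps document order
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip the whitespace text nodes that sit between the child elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    settings.put(child.getNodeName(), child.getTextContent());
                }
            }
            return settings;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Config>\n<useProxy>true</useProxy>\n"
                + "<proxyHost>proxy.net.br</proxyHost>\n</Config>";
        readSettings(xml).forEach((tag, value) ->
                System.out.println("Node: " + tag + " Value: " + value));
    }
}
```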

Getting XML child elements with XPath

I have this XML:
<root>
<items>
<item1>
<tag1>1</tag1>
<sub>
<sub1>10 </sub1>
<sub2>20 </sub2>
</sub>
</item1>
<item2>
<tag1>1</tag1>
<sub>
<sub1> </sub1>
<sub2> </sub2>
</sub>
</item2>
</items>
</root>
I want to get the item1 element and the names and values of its child elements.
That is, I want to get: tag1 - 1, sub1 - 10, sub2 - 20.
How can I do this? So far I can only get elements without children.

Document doc = ...;
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("/root/items/item1/*/text()");
Object o = expr.evaluate(doc, XPathConstants.NODESET);
NodeList list = (NodeList) o;

import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.xpath.*;

/**
 * File: Ex1.java
 * @author ronda
 */
public class Ex1 {
    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = dbFactory.newDocumentBuilder();
        Document doc = builder.parse("myxml.xml");
        // creating an XPathFactory:
        XPathFactory factory = XPathFactory.newInstance();
        // using this factory to create an XPath object:
        XPath xpath = factory.newXPath();
        // XPath query selecting all child elements of item1
        XPathExpression expr = xpath.compile("//" + "item1" + "/*");
        Object result = expr.evaluate(doc, XPathConstants.NODESET);
        NodeList nodes = (NodeList) result;
        System.out.println(nodes.getLength());
        for (int i = 0; i < nodes.getLength(); i++) {
            Element el = (Element) nodes.item(i);
            System.out.println("tag: " + el.getNodeName());
            // search for the Text children
            if (el.getFirstChild().getNodeType() == Node.TEXT_NODE)
                System.out.println("inner value:" + el.getFirstChild().getNodeValue());
            NodeList children = el.getChildNodes();
            for (int k = 0; k < children.getLength(); k++) {
                Node child = children.item(k);
                if (child.getNodeType() != Node.TEXT_NODE) {
                    System.out.println("child tag: " + child.getNodeName());
                    if (child.getFirstChild().getNodeType() == Node.TEXT_NODE)
                        System.out.println("inner child value:" + child.getFirstChild().getNodeValue());
                }
            }
        }
    }
}
I get this output when loading the XML from your question from a file named myxml.xml:
run:
2
tag: tag1
inner value:1
tag: sub
inner value:
child tag: sub1
inner child value:10
child tag: sub2
inner child value:20
It's a bit wordy, but it lets us see how it works. PS: I found a good guide here.
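As a shorter alternative, a single XPath can select only the leaf elements under item1, at any depth, which gives the tag1 / sub1 / sub2 triple directly. This is a sketch under assumptions, not the code above: the class name LeafElements and the "name-value" output format are invented for illustration, and the predicate [not(*)] simply means "has no child elements".

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration.
class LeafElements {

    // Returns "name-value" for every descendant of item1 that has no child elements.
    static List<String> leavesOfItem1(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // "//*" reaches every depth below item1; "[not(*)]" keeps only leaf elements.
            NodeList leaves = (NodeList) xpath.evaluate(
                    "/root/items/item1//*[not(*)]", doc, XPathConstants.NODESET);

            List<String> pairs = new ArrayList<>();
            for (int i = 0; i < leaves.getLength(); i++) {
                Node leaf = leaves.item(i);
                pairs.add(leaf.getNodeName() + "-" + leaf.getTextContent().trim());
            }
            return pairs;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root><items><item1><tag1>1</tag1>"
                + "<sub><sub1>10 </sub1><sub2>20 </sub2></sub>"
                + "</item1></items></root>";
        leavesOfItem1(xml).forEach(System.out::println);
    }
}
```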

how to get attribute of given node?

I am trying to write a DOM XML parser.
My XML file:
<?xml version="1.0"?>
<BLAH>
<AgentNm type="citi1">
<accName>accName1</accName>
<accType>accType1</accType>
<someThing>someThing1</someThing>
<amt>100000</amt>
</AgentNm>
<AgentNm type="citi2">
<accName>accName2</accName>
<accType>accType2</accType>
<someThing>someThing2</someThing>
<amt>200000</amt>
</AgentNm>
</BLAH>
And I tried the following Java code:
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse (new File("c:\\file.xml"));
// normalize text representation
doc.getDocumentElement ().normalize ();
System.out.println ("Root element of the doc is " +doc.getDocumentElement().getNodeName());
NodeList agentNm = doc.getElementsByTagName("AgentNm");
int totalAgentNm = agentNm.getLength();
System.out.println("Total no of Agents : " + totalAgentNm);
for(int s=0; s<agentNm.getLength() ; s++){
Node firstPersonNode = agentNm.item(s);
if(firstPersonNode.getNodeType() == Node.ELEMENT_NODE){
Element firstPersonElement = (Element)firstPersonNode;
PrintNodeElem(firstPersonElement,"type");
}//end of if clause
}//end of for loop with s var
static void PrintNodeElem(Element nodeElem,String elem){
NodeList someThingList = nodeElem.getElementsByTagName(elem);
Element ageElement = (Element)someThingList.item(0);
NodeList textAgeList = ageElement.getChildNodes();
System.out.println(elem+" : " +((Node)textAgeList.item(0)).getNodeValue().trim());
}
But when I try to execute the above method, I get a NullPointerException.
Can anyone explain how to fix this?

If you want an attribute of a given node, I would suggest XPath. It is much easier.
http://onjava.com/onjava/2005/01/12/xpath.html
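For context on the exception: "type" is an attribute of AgentNm, not a child element, so getElementsByTagName("type") returns an empty list, item(0) is null, and the following getChildNodes() call throws. Below is a minimal sketch of reading the attribute both with plain DOM (Element.getAttribute) and with the XPath "//AgentNm/@type"; the class and method names are hypothetical and the XML is passed as a string to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class for illustration.
class AgentTypes {

    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Plain DOM: getElementsByTagName returns only elements, so the cast is safe.
    static List<String> viaDom(String xml) {
        try {
            NodeList agents = parse(xml).getElementsByTagName("AgentNm");
            List<String> types = new ArrayList<>();
            for (int i = 0; i < agents.getLength(); i++) {
                Element agent = (Element) agents.item(i);
                types.add(agent.getAttribute("type")); // "" if the attribute is absent
            }
            return types;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    // XPath: "@type" selects the attribute nodes directly.
    static List<String> viaXPath(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList attrs = (NodeList) xpath.evaluate(
                    "//AgentNm/@type", parse(xml), XPathConstants.NODESET);
            List<String> types = new ArrayList<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                types.add(attrs.item(i).getNodeValue());
            }
            return types;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<BLAH><AgentNm type=\"citi1\"><amt>100000</amt></AgentNm>"
                + "<AgentNm type=\"citi2\"><amt>200000</amt></AgentNm></BLAH>";
        System.out.println(viaDom(xml));
        System.out.println(viaXPath(xml));
    }
}
```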

Looping over nodes and extracting specific subnode values using Java's XPath

I understand from Googling that it makes more sense to extract data from XML using XPath than by using DOM looping.
At the moment, I have implemented a solution using DOM, but the code is verbose, and it feels untidy and unmaintainable, so I would like to switch to a cleaner XPath solution.
Let's say I have this structure:
<products>
<product>
<title>Some title 1</title>
<image>Some image 1</image>
</product>
<product>
<title>Some title 2</title>
<image>Some image 2</image>
</product>
...
</products>
I want to be able to run a for loop for each of the <product> elements, and inside this for loop, extract the title and image node values.
My code looks like this:
InputStream is = conn.getInputStream();
DocumentBuilder builder =
DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = builder.parse(is);
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("/products/product");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList products = (NodeList) result;
for (int i = 0; i < products.getLength(); i++) {
Node n = products.item(i);
if (n != null && n.getNodeType() == Node.ELEMENT_NODE) {
Element product = (Element) n;
// do some DOM navigation to get the title and image
}
}
Inside my for loop I get each <product> as a Node, which is cast to an Element.
Can I simply use my instance of XPathExpression to compile and run another XPath on the Node or the Element?

Yes, you can always do it like this:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("/products/product");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
expr = xpath.compile("title"); // The new XPath expression to find 'title' within 'product'.
NodeList products = (NodeList) result;
for (int i = 0; i < products.getLength(); i++) {
    Node n = products.item(i);
    if (n != null && n.getNodeType() == Node.ELEMENT_NODE) {
        Element product = (Element) n;
        NodeList nodes = (NodeList) expr.evaluate(product, XPathConstants.NODESET); // Find the 'title' in the 'product'
        System.out.println("TITLE: " + nodes.item(0).getTextContent()); // And here is the title
    }
}
Here I have given an example of extracting the 'title' value. You can do the same for 'image'.

I'm not a big fan of this approach because you have to build a document (which might be expensive) before you can apply XPaths to it.
I've found VTD-XML a lot more efficient when it comes to applying XPaths to documents, because you don't need to load the whole document into memory. Here is some sample code:
final VTDGen vg = new VTDGen();
vg.parseFile("file.xml", false);
final VTDNav vn = vg.getNav();
final AutoPilot ap = new AutoPilot(vn);
ap.selectXPath("/products/product");
while (ap.evalXPath() != -1) {
System.out.println("PRODUCT:");
// you could either apply another xpath or simply get the first child
if (vn.toElement(VTDNav.FIRST_CHILD, "title")) {
int val = vn.getText();
if (val != -1) {
System.out.println("Title: " + vn.toNormalizedString(val));
}
vn.toElement(VTDNav.PARENT);
}
if (vn.toElement(VTDNav.FIRST_CHILD, "image")) {
int val = vn.getText();
if (val != -1) {
System.out.println("Image: " + vn.toNormalizedString(val));
}
vn.toElement(VTDNav.PARENT);
}
}
Also see this post on Faster XPaths with VTD-XML.
