Java XML Parse xPath not getting all child elements - java

I have a very large XML file with a lot of elements.
I'm only interested in the cases, which look like the example below. There are about 400 cases in the XML document. I want to parse the document and print each element's name and value.
<cases>
<case>
<id/>
<title/>
<type/>
<priority/>
<estimate/>
<references/>
<custom>
<functional_area/>
<technology_dependence/>
<reviewed/>
<steps_completed>
</steps_completed>
<preconds> </preconds>
<steps_seperated>
<step>
<index/>
<content>
</content>
<expected>
</expected>
</step>
<step>
<index/>
<content>
</content>
<expected>
</expected>
</step>
<step>
</steps_seperated>
</custom>
</case>
At the moment my code works fine up until "steps_seperated", where it stops and moves on to the next case.
My code is shown in the MVCE below.
I can't work out why it stops after "steps_seperated" and starts a new case.
A second issue I've noticed is that it only prints 10 or so cases (I'm not sure whether this is because I'm running it in NetBeans).
Any help would be very much appreciated, thank you.
MVCE:
public void printCaseElements(NodeList list){
for(int i = 0 ;i <list.getLength();i++){
Element el = (Element) list.item(i);
System.out.println("tag: " + el.getNodeName());
if(el.getFirstChild().getNodeType() == Node.TEXT_NODE)
{
System.out.println("Inner Value: "+ el.getFirstChild().getNodeValue());
System.out.println("________________________________________________________________________");
NodeList children = el.getChildNodes();
for(int k = 0; k < children.getLength(); k++){
Node child = children.item(k);
if (child.getNodeType() != Node.TEXT_NODE){
System.out.println("child tag: "+ child.getNodeName());
if(child.getFirstChild().getNodeType() == Node.TEXT_NODE){
System.out.println("inner child value :" + child.getFirstChild().getNodeValue());
System.out.println("____________________________________________________________________________________________");
}
}
}
}
}
P.S. I think the issue may be in how I use XPath:
DOMParser parser =new DOMParser() ;
InputSource source = new InputSource(path) ;
try {
parser.parse(source);
Element docElement = parser.getDocument().getDocumentElement();
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
XPathExpression expression =xPath.compile("//case/*");
NodeList list =(NodeList) expression.evaluate(docElement,XPathConstants.NODESET);
DocumentBuilder documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document newDoc = documentBuilder.newDocument();
Element newElement = newDoc.createElement("cases");
newDoc.appendChild(newElement);
for(int i =0 ; i <list.getLength(); i++){
Node n = newDoc.importNode(list.item(i), true);
newElement.appendChild(n);
}
When I changed "//case/*" to "//steps_seperated/*", it showed all the elements below steps_seperated, but not the elements before steps_seperated.
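Judging from the MVCE alone, a likely cause is depth: printCaseElements looks at each node returned by "//case/*" and at that node's direct children, so anything nested a level further down (the index, content and expected elements inside each step) is never visited. It also calls getFirstChild() on elements that may be empty (such as <id/>), where it returns null, so the following getNodeType() call would throw a NullPointerException. A minimal recursive sketch that visits every depth is shown here; the names CaseWalker, describe and walkCases and the inline sample XML are made up for illustration, and it uses the JDK's DocumentBuilder instead of the DOMParser from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CaseWalker {

    // Recursively records "path = text" for every leaf element, at any depth.
    public static void describe(Element el, String path, List<String> out) {
        String here = path + "/" + el.getNodeName();
        boolean hasElementChild = false;
        NodeList children = el.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip text/whitespace nodes; recurse only into elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
                describe((Element) child, here, out);
            }
        }
        if (!hasElementChild) {
            // getTextContent() returns "" for an empty element, so no null check is needed.
            out.add(here + " = " + el.getTextContent().trim());
        }
    }

    // Hypothetical helper: parses the XML text and walks every <case>.
    public static List<String> walkCases(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        // Select each <case>; the recursion handles everything beneath it.
        NodeList cases = (NodeList) xPath.evaluate("//case", doc, XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < cases.getLength(); i++) {
            describe((Element) cases.item(i), "", out);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<cases><case><id>7</id><custom><steps_seperated>"
                + "<step><index>1</index><content>open</content><expected>ok</expected></step>"
                + "</steps_seperated></custom></case></cases>";
        walkCases(xml).forEach(System.out::println);
    }
}
```

Selecting "//case" rather than "//case/*" keeps each case together, and the recursion replaces the two hand-written loop levels.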

Related

XPATH evaluation against child node

I have xml as follows,
<students>
<Student><age>23</age><id>2000</id><name>PP2000</name></Student>
<Student><age>23</age><id>1000</id><name>PP1000</name></Student>
</students>
I have two XPaths. The template XPath, students/Student, selects the template nodes, but I cannot hard-code it because it will change for other XMLs; the XML is fairly dynamic and can expand (with the same base XPaths). To evaluate a second XPath against each template node, I'm using the following code:
XPath xpathResource = XPathFactory.newInstance().newXPath();
Document xmlDocument = //creating document;
NodeList nodeList = (NodeList)xpathResource.compile("//students/Student").evaluate(xmlDocument, XPathConstants.NODESET);
for (int nodeIndex = 0; nodeIndex < nodeList.getLength(); nodeIndex++) {
Node currentNode = nodeList.item(nodeIndex);
String xpathID = "//students/Student/id";
String xpathName = "//students/Student/name";
NodeList childID = (NodeList)xpathResource.compile(xpathID).evaluate(currentNode, XPathConstants.NODESET);
NodeList childName = (NodeList)xpathResource.compile(xpathName).evaluate(currentNode, XPathConstants.NODESET);
System.out.println("node ID " +childID.item(0).getTextContent());
System.out.println("node Name " +childName.item(0).getTextContent());
}
Now the problem is that this for loop executes twice, but both times I get 2000 and PP2000 as the values. Is there any way to evaluate a generic XPath against a child node? I cannot run a generic XPath against the whole XML document, because I have some validation to do. I want to use the XML NodeList as result-set rows, so that I can validate the XML values and do my processing.
XPath xpathResource = XPathFactory.newInstance().newXPath();
Document xmlDocument = //creating document;
NodeList nodeList = (NodeList)xpathResource.compile("//students/Student/id").evaluate(xmlDocument, XPathConstants.NODESET);
for (int nodeIndex = 0; nodeIndex < nodeList.getLength(); nodeIndex++) {
Node currentNode = nodeList.item(nodeIndex);
System.out.println("node " +currentNode.getTextContent());
}
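In XPath, an expression that starts with / or // is resolved from the document root no matter which context node is passed to evaluate(), which would explain why every iteration returns the first student. A sketch using paths relative to the current Student node follows; it assumes the XML from the question is supplied as a string, and StudentRows and readRows are names invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class StudentRows {

    // Hypothetical helper: returns one "id,name" row per <Student>.
    public static List<String> readRows(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList students = (NodeList) xpath.evaluate(
                "//students/Student", doc, XPathConstants.NODESET);

        // No leading slash: these are resolved against the node passed to evaluate().
        XPathExpression idExpr = xpath.compile("id");
        XPathExpression nameExpr = xpath.compile("name");

        List<String> rows = new ArrayList<>();
        for (int i = 0; i < students.getLength(); i++) {
            Node student = students.item(i);
            // evaluate(Object) returns the string value of the first match.
            rows.add(idExpr.evaluate(student) + "," + nameExpr.evaluate(student));
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<students>"
                + "<Student><age>23</age><id>2000</id><name>PP2000</name></Student>"
                + "<Student><age>23</age><id>1000</id><name>PP1000</name></Student>"
                + "</students>";
        readRows(xml).forEach(System.out::println);
    }
}
```

If the template path has to stay configurable, the per-row paths can be stored relative to it (id, name) rather than as absolute paths.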

Getting tag and value from XML file

I have an XML config file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Config>
<useProxy>true</useProxy>
<proxyReqPass>true</proxyReqPass>
<proxyHost>proxy.net.br</proxyHost>
<proxyUser>admin</proxyUser>
<proxyPass>12345</proxyPass>
</Config>
I have a list of Data objects; each Data contains two strings, the tag name and the value of the tag. I want to fill this list with the data from the XML file, like this:
List<Data> data = new ArrayList<Data>();
File fXmlFile = new File("Config.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
NodeList elements = doc.getElementsByTagName("Config");
for (int i = 0; i < elements.getLength(); i++) {
Node nNode = elements.item(i);
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
data.add(new Data(eElement.getTagName(), eElement.getTextContent()));
}
}
And if I print the list:
for(int i = 0; i < list.size(); i++)
System.out.println("Node: " + list.get(i).getTagName() + " Value: " + list.get(i).getTextContent());
I want the result to be:
Node: useProxy Value: true
Node: proxyReqPass Value: true
Node: proxyHost Value: proxy.net.br
Node: proxyUser Value: admin
Node: proxyPass Value: 12345
But the result is:
Node: Config Value:
false
false
I don't know where my mistake is. Could somebody please help me?
You're iterating over the results of the search for the <Config> tag. You should be iterating over the children of that search result.
NodeList configTags = doc.getElementsByTagName("Config");
// assuming there will only be one `Config` node
NodeList elements = configTags.item(0).getChildNodes();
for (int i = 0; i < elements.getLength(); i++) {
    // (everything else looks correct)...
}
When you query getElementsByTagName(), a NodeList is returned which, in your case, should always contain one node, the <Config> node. To access the child nodes (<useProxy>, etc.), you need to get the first Node out of the node list and query for its children with getChildNodes().
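Put together, the fix could look like the sketch below. The Data class here is a hypothetical stand-in for the asker's own class (its real constructor and getters are not shown in the question), and ConfigReader and read are names made up for the example. Note that getChildNodes() also returns the whitespace between tags as text nodes, which is why the ELEMENT_NODE check matters.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ConfigReader {

    // Hypothetical stand-in for the asker's Data class: a tag name plus its text.
    public static final class Data {
        public final String tag;
        public final String value;

        public Data(String tag, String value) {
            this.tag = tag;
            this.value = value;
        }
    }

    // Collects one Data entry per element directly under <Config>.
    public static List<Data> read(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // Assumes exactly one <Config> element, as in the question.
        Node config = doc.getElementsByTagName("Config").item(0);
        NodeList children = config.getChildNodes();

        List<Data> data = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            // Whitespace between tags appears as TEXT_NODE children; skip those.
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                data.add(new Data(n.getNodeName(), n.getTextContent()));
            }
        }
        return data;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Config>\n  <useProxy>true</useProxy>\n"
                + "  <proxyHost>proxy.net.br</proxyHost>\n</Config>";
        for (Data d : read(xml)) {
            System.out.println("Node: " + d.tag + " Value: " + d.value);
        }
    }
}
```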

Getting XML child elements with XPath

I have this XML:
<root>
<items>
<item1>
<tag1>1</tag1>
<sub>
<sub1>10 </sub1>
<sub2>20 </sub2>
</sub>
</item1>
<item2>
<tag1>1</tag1>
<sub>
<sub1> </sub1>
<sub2> </sub2>
</sub>
</item2>
</items>
</root>
I want to get the item1 element and the names and values of its child elements.
That is, I want to get: tag1 - 1, sub1 - 10, sub2 - 20.
How can I do this? So far I can only get elements without children.
Document doc = ...;
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("/root/items/item1/*/text()");
Object o = expr.evaluate(doc, XPathConstants.NODESET);
NodeList list = (NodeList) o;
import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.xpath.*;

/**
 * File: Ex1.java
 * @author ronda
 */
public class Ex1 {
    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = dbFactory.newDocumentBuilder();
        Document doc = builder.parse("myxml.xml");
        // creating an XPathFactory:
        XPathFactory factory = XPathFactory.newInstance();
        // using this factory to create an XPath object:
        XPath xpath = factory.newXPath();
        // XPath query selecting every child element of item1
        XPathExpression expr = xpath.compile("//" + "item1" + "/*");
        Object result = expr.evaluate(doc, XPathConstants.NODESET);
        NodeList nodes = (NodeList) result;
        System.out.println(nodes.getLength());
        for (int i = 0; i < nodes.getLength(); i++) {
            Element el = (Element) nodes.item(i);
            System.out.println("tag: " + el.getNodeName());
            // search for the text children (getFirstChild() is null for an empty element)
            if (el.getFirstChild() != null && el.getFirstChild().getNodeType() == Node.TEXT_NODE)
                System.out.println("inner value:" + el.getFirstChild().getNodeValue());
            NodeList children = el.getChildNodes();
            for (int k = 0; k < children.getLength(); k++) {
                Node child = children.item(k);
                if (child.getNodeType() != Node.TEXT_NODE) {
                    System.out.println("child tag: " + child.getNodeName());
                    if (child.getFirstChild() != null && child.getFirstChild().getNodeType() == Node.TEXT_NODE)
                        System.out.println("inner child value:" + child.getFirstChild().getNodeValue());
                }
            }
        }
    }
}
With the XML from your question saved in a file named myxml.xml, I get this output:
run:
2
tag: tag1
inner value:1
tag: sub
inner value:
child tag: sub1
inner child value:10
child tag: sub2
inner child value:20
It's a bit wordy, but it lets us see how it works. PS: I found a good guide here
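A shorter variant is to let XPath pick the leaf elements directly: the predicate [not(*)] matches elements that have no element children, so //item1//*[not(*)] returns tag1, sub1 and sub2 in document order. A minimal sketch, assuming the XML from the question is passed in as a string; LeafValues and leaves are names invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LeafValues {

    // Returns "name - value" for every leaf element below /root/items/item1.
    public static List<String> leaves(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // [not(*)] keeps only elements without element children (the leaves).
        NodeList nodes = (NodeList) xpath.evaluate(
                "/root/items/item1//*[not(*)]", doc, XPathConstants.NODESET);

        List<String> out = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node n = nodes.item(i);
            // trim() drops the trailing space in values such as "10 ".
            out.add(n.getNodeName() + " - " + n.getTextContent().trim());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><items><item1><tag1>1</tag1>"
                + "<sub><sub1>10 </sub1><sub2>20 </sub2></sub>"
                + "</item1></items></root>";
        leaves(xml).forEach(System.out::println);
    }
}
```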

how to get attribute of given node?

I am trying to parse XML with DOM.
My XML file:
<?xml version="1.0"?>
<BLAH>
<AgentNm type="citi1">
<accName>accName1</accName>
<accType>accType1</accType>
<someThing>someThing1</someThing>
<amt>100000</amt>
</AgentNm>
<AgentNm type="citi2">
<accName>accName2</accName>
<accType>accType2</accType>
<someThing>someThing2</someThing>
<amt>200000</amt>
</AgentNm>
</BLAH>
And I tried the following Java code:
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse (new File("c:\\file.xml"));
// normalize text representation
doc.getDocumentElement ().normalize ();
System.out.println ("Root element of the doc is " +doc.getDocumentElement().getNodeName());
NodeList agentNm = doc.getElementsByTagName("AgentNm");
int totalAgentNm = agentNm.getLength();
System.out.println("Total no of Agents : " + totalAgentNm);
for(int s=0; s<agentNm.getLength() ; s++){
Node firstPersonNode = agentNm.item(s);
if(firstPersonNode.getNodeType() == Node.ELEMENT_NODE){
Element firstPersonElement = (Element)firstPersonNode;
PrintNodeElem(firstPersonElement,"type");
}//end of if clause
}//end of for loop with s var
static void PrintNodeElem(Element nodeElem,String elem){
NodeList someThingList = nodeElem.getElementsByTagName(elem);
Element ageElement = (Element)someThingList.item(0);
NodeList textAgeList = ageElement.getChildNodes();
System.out.println(elem+" : " +((Node)textAgeList.item(0)).getNodeValue().trim());
}
But when I try to execute the above method, I get a NullPointerException.
Can anyone explain how to fix this?
If you want an attribute of a given node, I would suggest XPath. It is much easier.
http://onjava.com/onjava/2005/01/12/xpath.html
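The NullPointerException in the question comes from treating type as a child element: getElementsByTagName("type") finds no such element, so item(0) returns null. Attributes are read with Element.getAttribute(), or selected in XPath with @type. The sketch below shows both; AgentTypes, viaDom and viaXPath are names made up for illustration, and the XML is passed in as a string instead of being read from c:\file.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class AgentTypes {

    private static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Plain DOM: read the attribute straight from each <AgentNm> element.
    public static List<String> viaDom(String xml) throws Exception {
        NodeList agents = parse(xml).getElementsByTagName("AgentNm");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < agents.getLength(); i++) {
            // getAttribute returns "" (never null) when the attribute is absent.
            out.add(((Element) agents.item(i)).getAttribute("type"));
        }
        return out;
    }

    // XPath: "@type" selects the attribute nodes themselves.
    public static List<String> viaXPath(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList attrs = (NodeList) xpath.evaluate(
                "/BLAH/AgentNm/@type", parse(xml), XPathConstants.NODESET);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            out.add(attrs.item(i).getNodeValue());
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<BLAH><AgentNm type=\"citi1\"><amt>100000</amt></AgentNm>"
                + "<AgentNm type=\"citi2\"><amt>200000</amt></AgentNm></BLAH>";
        System.out.println(viaDom(xml));
        System.out.println(viaXPath(xml));
    }
}
```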

Looping over nodes and extracting specific subnode values using Java's XPath

I understand from Googling that it makes more sense to extract data from XML using XPath than by using DOM looping.
At the moment, I have implemented a solution using DOM, but the code is verbose, and it feels untidy and unmaintainable, so I would like to switch to a cleaner XPath solution.
Let's say I have this structure:
<products>
<product>
<title>Some title 1</title>
<image>Some image 1</image>
</product>
<product>
<title>Some title 2</title>
<image>Some image 2</image>
</product>
...
</products>
I want to be able to run a for loop for each of the <product> elements, and inside this for loop, extract the title and image node values.
My code looks like this:
InputStream is = conn.getInputStream();
DocumentBuilder builder =
DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = builder.parse(is);
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("/products/product");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList products = (NodeList) result;
for (int i = 0; i < products.getLength(); i++) {
Node n = products.item(i);
if (n != null && n.getNodeType() == Node.ELEMENT_NODE) {
Element product = (Element) n;
// do some DOM navigation to get the title and image
}
}
Inside my for loop I get each <product> as a Node, which is cast to an Element.
Can I simply use my instance of XPathExpression to compile and run another XPath on the Node or the Element?
Yes, you can do it like this:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("/products/product");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
expr = xpath.compile("title"); // The new XPath expression to find 'title' within 'product'.
NodeList products = (NodeList) result;
for (int i = 0; i < products.getLength(); i++) {
    Node n = products.item(i);
    if (n != null && n.getNodeType() == Node.ELEMENT_NODE) {
        Element product = (Element) n;
        NodeList nodes = (NodeList) expr.evaluate(product, XPathConstants.NODESET); // Find the 'title' in the 'product'
        System.out.println("TITLE: " + nodes.item(0).getTextContent()); // And here is the title
    }
}
Here I have given an example of extracting the 'title' value. You can do the same for 'image'.
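For completeness, here is a sketch that reads both values per product. It compiles the two relative expressions once and evaluates them as strings, so a product without an image yields an empty string instead of a NullPointerException from item(0). The names ProductFields and read are invented for this example, and the XML is passed as a string, not read from the connection's input stream.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ProductFields {

    // Returns "title | image" for each <product>.
    public static List<String> read(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList products = (NodeList) xpath.evaluate(
                "/products/product", doc, XPathConstants.NODESET);

        // Relative expressions, compiled once and reused for every product.
        XPathExpression titleExpr = xpath.compile("title");
        XPathExpression imageExpr = xpath.compile("image");

        List<String> out = new ArrayList<>();
        for (int i = 0; i < products.getLength(); i++) {
            Node product = products.item(i);
            // STRING evaluation gives "" when the element is missing.
            String title = (String) titleExpr.evaluate(product, XPathConstants.STRING);
            String image = (String) imageExpr.evaluate(product, XPathConstants.STRING);
            out.add(title + " | " + image);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<products>"
                + "<product><title>Some title 1</title><image>Some image 1</image></product>"
                + "<product><title>Some title 2</title><image>Some image 2</image></product>"
                + "</products>";
        read(xml).forEach(System.out::println);
    }
}
```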
I'm not a big fan of this approach because you have to build a DOM document (which might be expensive) before you can apply XPaths to it.
I've found VTD-XML a lot more efficient when it comes to applying XPaths to documents, because it indexes the raw document bytes instead of building a full object tree. Here is some sample code:
final VTDGen vg = new VTDGen();
vg.parseFile("file.xml", false);
final VTDNav vn = vg.getNav();
final AutoPilot ap = new AutoPilot(vn);
ap.selectXPath("/products/product");
while (ap.evalXPath() != -1) {
    System.out.println("PRODUCT:");
    // you could either apply another xpath or simply get the first child
    if (vn.toElement(VTDNav.FIRST_CHILD, "title")) {
        int val = vn.getText();
        if (val != -1) {
            System.out.println("Title: " + vn.toNormalizedString(val));
        }
        vn.toElement(VTDNav.PARENT);
    }
    if (vn.toElement(VTDNav.FIRST_CHILD, "image")) {
        int val = vn.getText();
        if (val != -1) {
            System.out.println("Image: " + vn.toNormalizedString(val));
        }
        vn.toElement(VTDNav.PARENT);
    }
}
Also see this post on Faster XPaths with VTD-XML.
