Simple dom4j parsing in Java - can't access child nodes

I know this is so easy and I've spent all day banging my head. I have an XML document like this:
<WMS_Capabilities version="1.3.0" xmlns="http://www.opengis.net/wms">
<Service>
<Name>WMS</Name>
<Title>Metacarta WMS VMaplv0</Title>
</Service>
<Capability>
<Layer>
<Name>Vmap0</Name>
<Title>Metacarta WMS VMaplv0</Title>
<Abstract>Vmap0</Abstract>
...
There can be multiple Layer nodes, and any Layer node can have a nested Layer node. I can quickly select all of the layer nodes and iterate through them with the following xpath code:
Map<String, String> uris = new HashMap<String, String>();
uris.put("wms", "http://www.opengis.net/wms");
XPath xpath1 = doc.createXPath("//wms:Layer");
xpath1.setNamespaceURIs(uris);
List nodes1 = xpath1.selectNodes(doc);
for (Iterator<?> layerIt = nodes1.iterator(); layerIt.hasNext();) {
Node node = (Node) layerIt.next();
}
I get back all Layer nodes. Perfect. But when I try to access each Name or Title child node, I get nothing. I've tried as many combinations as I can think of:
name = node.selectSingleNode("./wms:Name");
name = node.selectSingleNode("wms:Name");
name = node.selectSingleNode("Name");
etc. etc., but it always returns null. I'm guessing it has something to do with the namespace, but all I'm after is the name and title text values for each of the Layer nodes I've obtained. Can anybody offer any help?

I believe that Node.selectSingleNode() evaluates the supplied XPath expression with an empty namespace context, so there is no way to access a node that is in a namespace by its name. You have to use an expression such as *[local-name()='Name'] instead. If you want or need a namespace context, execute the XPath expressions via the XPath object.
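To make the namespace point concrete, here is a minimal sketch. It deliberately uses the JDK's javax.xml.xpath rather than dom4j so that it runs without extra dependencies; the class name LocalNameDemo and the trimmed-down XML are made up for illustration. The XPath object has no namespace context, so an unprefixed name test finds nothing, while local-name() still matches:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; JDK XPath stands in for dom4j here.
class LocalNameDemo {
    // Cut-down version of the WMS document from the question.
    static final String XML =
        "<WMS_Capabilities version=\"1.3.0\" xmlns=\"http://www.opengis.net/wms\">"
      + "<Capability><Layer><Name>Vmap0</Name>"
      + "<Title>Metacarta WMS VMaplv0</Title></Layer></Capability>"
      + "</WMS_Capabilities>";

    /** Evaluates the expression against XML and returns the string value of the first match ("" if none). */
    static String eval(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // keep the xmlns information in the DOM
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath(); // no NamespaceContext set
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed names only match elements in no namespace, so this finds nothing.
        System.out.println("[" + eval("//Layer/Name") + "]");
        // local-name() compares only the local part, so the namespace does not matter.
        System.out.println("[" + eval("//*[local-name()='Layer']/*[local-name()='Name']") + "]");
    }
}
```

The local-name() form is handy for quick lookups, but it will also match same-named elements from any other namespace, so binding a prefix is usually the safer choice when the document mixes vocabularies.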

Thanks everybody for the help. It was Michael Kay's last clue that got it for me... I needed to use a relative path from the current node, include the namespace URI, and select from the context of the current node I'm iterating through:
Map<String, String> uris = new HashMap<String, String>();
uris.put("wms", "http://www.opengis.net/wms");
XPath xpath1 = doc.createXPath("//wms:Layer");
xpath1.setNamespaceURIs(uris);
List nodes1 = xpath1.selectNodes(doc);
for (Iterator<?> layerIt = nodes1.iterator(); layerIt.hasNext();) {
Node node = (Node) layerIt.next();
XPath nameXpath = node.createXPath("./wms:Name");
nameXpath.setNamespaceURIs(uris);
XPath titleXpath = node.createXPath("./wms:Title");
titleXpath.setNamespaceURIs(uris);
Node name = nameXpath.selectSingleNode(node);
Node title = titleXpath.selectSingleNode(node);
}

Related

How to determine the class type of a sub-class in Java

I have an XML file with the following elements.
<productType>
<productTypeX />
<!-- One of the following elements are also possible:
<productTypeY />
<productTypeZ />
-->
</productType>
So, the XML could also look like this:
<productType>
<productTypeZ />
</productType>
The XML is unmarshalled to a POJO by using JAXB.
How can I determine if the child of <productType> is X, Y or Z? Either in the mapped POJO or directly in the XML?
There is a way, though it may be no cheaper than checking by hand (writing an if for every sub-class getter, e.g. null == obj.getProductTypeX()). Here it is:
Let's assume that you end up with a JAXBElement<ProductType> productType when you unmarshal.
Next you need an Element (org.w3c.dom.Element) object, which can be obtained like this:
DOMResult res = new DOMResult();
marshaller.marshal(productType, res);
Element elt = ((Document)res.getNode()).getDocumentElement();
The Element interface extends the Node interface, so what we have here is a tree structure, and we can get its existing children like this:
NodeList nodeList = elt.getChildNodes();
Now you can check the name and value of every Node, but first check its node type, because the child list also contains text and comment nodes. In most cases you only care about ELEMENT_NODE:
for (int i = 0; i < nodeList.getLength(); i++) {
Node currentNode = nodeList.item(i);
if (currentNode.getNodeType() == Node.ELEMENT_NODE) {
currentNode.getNodeName();
currentNode.getTextContent();
//And whatever you like
}
}
I hope this helps, or at least points you in the right direction.
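If inspecting the XML directly is acceptable, the same child-walking idea can be sketched without JAXB at all. This is only an illustration: the class ProductTypeChild and its method name are made up, and it assumes <productType> is the document element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; assumes <productType> is the root of the parsed XML.
class ProductTypeChild {
    /** Returns the name of the first child element of the root, or null if there is none. */
    static String firstChildElementName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip whitespace text nodes and comments; only elements count.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    return child.getNodeName();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<productType>\n  <!-- a comment -->\n  <productTypeZ />\n</productType>";
        System.out.println(firstChildElementName(xml)); // name of the element that is actually present
    }
}
```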

How to change the Node of a TriplePath in Jena?

I want to change a node of a Jena TriplePath (org.apache.jena.sparql.core.TriplePath), but I haven't found a way to do it. Imagine I have this code:
TriplePath tp = null;
....
//tp has been defined and not null
Node domain = tp.getSubject();
Node predicate = tp.getPredicate();
Node range = tp.getObject();
Node newNode = NodeFactory.createURI("http://www.example.com/example/example");
//And now? How can I set a Node (domain/predicate/range) of tp?
The question is: how can I replace any Node (domain/predicate/range) of the TriplePath tp with the newNode I've created? Is there a way?
You need to create a new path and assign it to tp. TriplePaths are immutable, as is the rest of the SPARQL algebra in Jena (any ways to defeat this should not be used!).
For more complex setups, have a template with variables and use:
TriplePath Substitute.substitute(TriplePath triplePath, Binding binding)

How to parse through a Node and extract the value of a child node in Java?

The function this code lives in takes a Node configNode as input. I need to extract the value of its child node inTemplate. The code is below; it only prints null.
XPath xpath = XPathFactory.newInstance().newXPath();
Node inTemplateNode = (Node) xpath.compile("@inTemplate").evaluate(configNode, XPathConstants.NODE);
String inTemplate = (inTemplateNode != null) ? inTemplateNode.getTextContent() : null;
System.out.println("inTemplate Value =" + inTemplate);
Can anyone help me understand why this code is not working?
The XPath expression @inTemplate selects the attribute named inTemplate of the context node (e.g. <config inTemplate="foo"/>). If you really need an attribute value, then ((Element)configNode).getAttribute("inTemplate") should work in the DOM without any need for XPath.
If you want to select a child element named inTemplate (e.g. <config><inTemplate>foo</inTemplate></config>), then use the path inTemplate and not @inTemplate.
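A small sketch of the difference between an attribute test such as @inTemplate and the child-element path inTemplate, using only the JDK. The class name AttributeVsChild and the two sample documents are invented for the example:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class; the root element plays the role of configNode.
class AttributeVsChild {
    /** Evaluates the expression relative to the root element; returns the text of the match or null. */
    static String eval(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Node configNode = doc.getDocumentElement();
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node result = (Node) xpath.evaluate(expression, configNode, XPathConstants.NODE);
            return result != null ? result.getTextContent() : null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String withChild = "<config><inTemplate>foo</inTemplate></config>";
        String withAttribute = "<config inTemplate=\"bar\"/>";
        System.out.println(eval(withChild, "@inTemplate"));     // no such attribute: null
        System.out.println(eval(withChild, "inTemplate"));      // child element text
        System.out.println(eval(withAttribute, "@inTemplate")); // attribute value
    }
}
```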

Reading an XML file with multiple child nodes

Suppose I have an XML file like the one below.
<top>
<CRAWL>
<NAME>div[class=name],attr=0</NAME>
<PRICE>span[class~=(?i)(price-new|price-old)],attr=0</PRICE>
<DESC>div[class~=(?i)(sttl dyn|bin)],attr=0</DESC>
<PROD_IMG>div[class=image]>a>img,attr=src</PROD_IMG>
<URL>div[class=name]>a,attr=href</URL>
</CRAWL>
<CRAWL>
<NAME>img[class=img],attr=alt</NAME>
<PRICE>div[class=g-b],attr=0</PRICE>
<DESC>div[class~=(?i)(sttl dyn|bin)],attr=0</DESC>
<PROD_IMG>img[itemprop=image],attr=src</PROD_IMG>
<URL>a[class=img],attr=href</URL>
</CRAWL>
</top>
What I want is to take all the values under the first <CRAWL> and, once that operation is finished, move on to the next <CRAWL> and repeat, however many <CRAWL> tags there are. I have managed to get it working when just one <CRAWL> is present. I use the values inside the tags to drive some other functions; each <CRAWL> holds different values, which I use for different operations. Everything else is fine; I just don't know how to loop over the entries in the XML file.
regards
If I'm understanding this correctly, you're trying to extract data from ALL <CRAWL> tags that exist within your XML fragment. There are multiple solutions to this. I'm listing them below:
XPath: If you know exactly what your XML structure is, you can use XPath to find the data within the tags of each <CRAWL> node:
// Instantiate XPath variable
XPath xpath = XPathFactory.newInstance().newXPath();
// Define the exact XPath expressions you want to get data for:
XPathExpression name = xpath.compile("//top/CRAWL/NAME/text()");
XPathExpression price = xpath.compile("//top/CRAWL/PRICE/text()");
XPathExpression desc = xpath.compile("//top/CRAWL/DESC/text()");
XPathExpression prod_img = xpath.compile("//top/CRAWL/PROD_IMG/text()");
XPathExpression url = xpath.compile("//top/CRAWL/URL/text()");
At this point the variables above hold only compiled expressions. To get the data, evaluate each one against the parsed document, e.g. name.evaluate(xmlDoc, XPathConstants.NODESET), which returns a NodeList with one text node per matching tag across all <CRAWL> elements. You could then drop each result into an array, giving you all the data for each tag.
The other (more efficient) solution is DOM-based parsing:
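// Instantiate the doc builder (needs javax.xml.parsers.* and org.w3c.dom.*)
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder xmlDocBuilder = domFactory.newDocumentBuilder();
Document xmlDoc = xmlDocBuilder.parse("xmlFile.xml");
// Create NodeList of element tag "CRAWL"
NodeList crawlNodeList = xmlDoc.getElementsByTagName("CRAWL");
// Now iterate through each item in the NodeList and get the values of
// each of the elements in Name, Price, Desc etc.
// NodeList is not Iterable, so use an index-based loop.
for (int c = 0; c < crawlNodeList.getLength(); c++) {
    Node node = crawlNodeList.item(c);
    // getChildNodes() returns a NodeList, not a NamedNodeMap
    NodeList subNodeList = node.getChildNodes();
    int currentNodeListLength = subNodeList.getLength();
    // Get each node's name and value
    for (int i = 0; i < currentNodeListLength; i++) {
        Node child = subNodeList.item(i);
        // Skip the whitespace text nodes between the elements
        if (child.getNodeType() != Node.ELEMENT_NODE) continue;
        // Iterate through all of the values in the nodeList,
        // e.g. NAME, PRICE, DESC, etc.
        String tag = child.getNodeName();
        String value = child.getTextContent();
        // Do something with these values
    }
}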
Hope this helps!
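The two approaches above can also be combined: select every <CRAWL> once, then evaluate short relative paths against each one so the values stay grouped per entry. The following is a sketch under assumptions, not the answerer's code; the class CrawlReader, the method readCrawls and the choice of three tags are all made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: one map of tag -> value per <CRAWL> element.
class CrawlReader {
    static final String[] TAGS = {"NAME", "PRICE", "URL"}; // subset chosen for brevity

    static List<Map<String, String>> readCrawls(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList crawls = (NodeList) xpath.evaluate("/top/CRAWL", doc, XPathConstants.NODESET);
            List<Map<String, String>> result = new ArrayList<>();
            for (int i = 0; i < crawls.getLength(); i++) {
                Node crawl = crawls.item(i);
                Map<String, String> fields = new LinkedHashMap<>();
                for (String tag : TAGS) {
                    // Relative path: evaluated against this <CRAWL> only.
                    fields.put(tag, xpath.evaluate(tag, crawl));
                }
                result.add(fields);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<top>"
            + "<CRAWL><NAME>div[class=name],attr=0</NAME><PRICE>p1</PRICE><URL>u1</URL></CRAWL>"
            + "<CRAWL><NAME>img[class=img],attr=alt</NAME><PRICE>p2</PRICE><URL>u2</URL></CRAWL>"
            + "</top>";
        for (Map<String, String> crawl : readCrawls(xml)) {
            System.out.println(crawl); // run the per-entry operation here
        }
    }
}
```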

XPath query returns duplicate nodes

I have a SOAP response that I'm processing in Java. It has an element with several different child elements. I'm using the following code to try to grab all of the bond nodes and find which one has a child tag with a value of ACTIVE. The NodeList returned by the initial evaluate statement contains 4 nodes, which is the correct number of children in the SOAP response, but they are all duplicates of the first element. Here is the code:
NodeList nodes = (NodeList)xpath.evaluate("//:bond", doc, XPathConstants.NODESET);
for(int i = 0; i < nodes.getLength(); i++){
HashMap<String, String> map = new HashMap<String, String>();
Element bond = (Element)nodes.item(i);
// Get only active bonds
String status = xpath.evaluate("//:status", bond);
String id = xpath.evaluate("//:instrumentId", bond);
if(!status.equals("ACTIVE"))
continue;
map.put("isin", xpath.evaluate(":isin", bond));
map.put("cusip", xpath.evaluate(":cusip", bond));
}
Thanks for your help,
Jared
The answer to your immediate question is that expressions like //:status will ignore the node that you pass in, and start from the root of the document.
However, there's probably an easier solution than what you've got, by using XPath to apply the test to the node. I think this should work, although it might contain typos (in particular, I can't remember whether text() can stand on its own or must be used in a predicate expression):
//:bond/:status[text()='ACTIVE']/..
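Here is a minimal sketch of the absolute-versus-relative behaviour described above. To keep it short it assumes a document without namespaces (so the prefix from the question is dropped), and the class name RelativeXPathDemo and the sample data are invented:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo; assumes no namespaces, unlike the SOAP response in the question.
class RelativeXPathDemo {
    static final String XML = "<bonds>"
        + "<bond><status>EXPIRED</status><isin>A</isin></bond>"
        + "<bond><status>ACTIVE</status><isin>B</isin></bond>"
        + "</bonds>";

    static Document parse() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(XML)));
    }

    /** Evaluates statusExpression with each <bond> as the context node. */
    static List<String> statusPerBond(String statusExpression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList bonds = (NodeList) xpath.evaluate("//bond", parse(), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < bonds.getLength(); i++) {
                out.add(xpath.evaluate(statusExpression, bonds.item(i)));
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Lets a predicate do the filtering instead of a Java-side if. */
    static List<String> activeIsins() {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList isins = (NodeList) xpath.evaluate(
                "//bond[status='ACTIVE']/isin", parse(), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < isins.getLength(); i++) {
                out.add(isins.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(statusPerBond("//status")); // restarts at the root: same value every time
        System.out.println(statusPerBond("status"));   // relative: one value per bond
        System.out.println(activeIsins());
    }
}
```

A leading // always restarts the search at the document root, whatever context node is passed in; dropping it (or writing .//status for descendants) keeps the lookup inside the current bond.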
