Search in xpath - java

Suppose this is the xml:
<lib>
<books type="paperback" name="A" />
<books type="pdf" name="B" />
<books type="hardbound" name="A" />
</lib>
What will be the xpath code to search for book of type="paperback" and name="A"? TIA.
Currently my code looks like this:
import org.w3c.dom.*;
import javax.xml.xpath.*;
import javax.xml.parsers.*;
import java.io.IOException;
import org.xml.sax.SAXException;
public class demo {
public static void main(String[] args)
throws ParserConfigurationException, SAXException,
IOException, XPathExpressionException {
DocumentBuilderFactory domFactory =
DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("xml.xml");
XPath xpath = XPathFactory.newInstance().newXPath();
// XPath Query for showing all nodes value
String version="fl1.0";
XPathExpression expr = xpath.compile("//books/type[#input="paperback"]/text()");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
}
}

/lib/books[#type='paperback' and #name='A']
Have a look here if you're struggling with xpath syntax, it has a few nice examples.
Also, if you just need help with XML in general and related technologies, have a look at the guide here

Right now you seem to search for something like <book><type input="paperback"/></book> which clearly is wrong. Your xpath expression should probably be something like //books[#type="paperback" and #name="A"]/text()

The answers above are good...I wanted to contribute a resource. The best xpath tutorial I've seen is here.

Related

XPathExpression.evaluate using Node [duplicate]

I want to manipulate xml doc having default namespace but no prefix. Is there a way to use xpath without namespace uri just as if there is no namespace?
I believe it should be possible if we set namespaceAware property of documentBuilderFactory to false. But in my case it is not working.
Is my understanding is incorrect or I am doing some mistake in code?
Here is my code:
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(false);
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("E:/test.xml");
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nl = (NodeList) xPath.evaluate("//author", dDoc, XPathConstants.NODESET);
System.out.println(nl.getLength());
} catch (Exception e) {
e.printStackTrace();
}
Here is my xml:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="http://www.mydomain.com/schema">
<author>
<book title="t1"/>
<book title="t2"/>
</author>
</root>
The XPath processing for a document that uses the default namespace (no prefix) is the same as the XPath processing for a document that uses prefixes:
For namespace qualified documents you can use a NamespaceContext when you execute the XPath. You will need to prefix the fragments in the XPath to match the NamespaceContext. The prefixes you use do not need to match the prefixes used in the document.
http://download.oracle.com/javase/6/docs/api/javax/xml/namespace/NamespaceContext.html
Here is how it looks with your code:
import java.util.Iterator;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
public class Demo {
public static void main(String[] args) {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("E:/test.xml");
XPath xPath = XPathFactory.newInstance().newXPath();
xPath.setNamespaceContext(new MyNamespaceContext());
NodeList nl = (NodeList) xPath.evaluate("/ns:root/ns:author", dDoc, XPathConstants.NODESET);
System.out.println(nl.getLength());
} catch (Exception e) {
e.printStackTrace();
}
}
private static class MyNamespaceContext implements NamespaceContext {
public String getNamespaceURI(String prefix) {
if("ns".equals(prefix)) {
return "http://www.mydomain.com/schema";
}
return null;
}
public String getPrefix(String namespaceURI) {
return null;
}
public Iterator getPrefixes(String namespaceURI) {
return null;
}
}
}
Note:
I also used the corrected XPath suggested by Dennis.
The following also appears to work, and is closer to your original question:
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
public class Demo {
public static void main(String[] args) {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("E:/test.xml");
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nl = (NodeList) xPath.evaluate("/root/author", dDoc, XPathConstants.NODESET);
System.out.println(nl.getLength());
} catch (Exception e) {
e.printStackTrace();
}
}
}
Blaise Doughan is right, attached code is correct.
Problem was somewhere elese. I was running all my tests through Application launcher in Eclipse IDE and nothing was working. Then I discovered Eclipse project was cause of all grief. I ran my class from command prompt, it worked. Created a new eclipse project and pasted same code there, it worked there too.
Thank you all guys for your time and efforts.
I've written a simple NamespaceContext implementation (here), that might be of help. It takes a Map<String, String> as input, where the key is a prefix, and the value is a namespace.
It follows the NamespaceContext spesification, and you can see how it works in the unit tests.
Map<String, String> mappings = new HashMap<>();
mappings.put("foo", "http://foo");
mappings.put("foo2", "http://foo");
mappings.put("bar", "http://bar");
context = new SimpleNamespaceContext(mappings);
context.getNamespaceURI("foo"); // "http://foo"
context.getPrefix("http://foo"); // "foo" or "foo2"
context.getPrefixes("http://foo"); // ["foo", "foo2"]
Note that it has a dependency on Google Guava

Parse Last.Fm XML from API in Java

URL: http://ws.audioscrobbler.com/2.0/?method=chart.gethypedtracks&api_key=1732077d6772048ccc671c754061cb18&limit=10
From the above url I need to somehow remove the Artist name and the track name from the XML file produced from each Song given but I have no Idea how to work with an XML file structured in this way ??
Any help or pointers would be very much appreciated !
Thanks,
Ross
Here's a fully working class that loads the URL you have indicated and parses the Track and artist names.
Basically it reads the xml into a Document, and runs 2 xpath queries in loops to get the data you want.
The document itself is simple xml, if you reformat it, it looks like:
<?xml version="1.0" encoding="utf-8"?>
<lfm status="ok">
<tracks page="1" perPage="10" totalPages="50" total="500">
<track>
<name>Hysterical</name>
<duration>231</duration>
<percentagechange>3626</percentagechange>
<mbid/>
<url>http://www.last.fm/music/Clap+Your+Hands+Say+Yeah/_/Hysterical</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Clap Your Hands Say Yeah</name>
...
All I did to clean it up was run it through a re-formatter like xmlstarlet as I mentioned in my comment. Note: you don't have to reformat it for java to read it if it's well formed. Human readable is all a re-format does for you.
The first xpath query gets the track name using a path lfm/tracks/track/name. You can use something like this xpath tester to try out your xpath queries (you can paste your xml in and it will reformat it too). If you don't understand xpath, there are many sources on the net.
The second xpath works relative to the current track name node, and looks for a following-sibling node of type artist with a name sub-node, and then displays the text of the node.
Here's the code
package net.fish;
import java.net.URL;
import java.net.URLConnection;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class ParseXML {
private static final DocumentBuilderFactory DOCUMENT_BUILDER_FACTORY = DocumentBuilderFactory.newInstance();
private static final XPathFactory XPATH_FACTORY = XPathFactory.newInstance();
public static void main(String[] args) throws Exception {
new ParseXML().parseXml("http://ws.audioscrobbler.com/2.0/?method=chart.gethypedtracks&api_key=1732077d6772048ccc671c754061cb18&limit=10");
}
private void parseXml(String urlPath) throws Exception {
URL url = new URL(urlPath);
URLConnection connection = url.openConnection();
DocumentBuilder db = DOCUMENT_BUILDER_FACTORY.newDocumentBuilder();
final Document document = db.parse(connection.getInputStream());
XPath xPathEvaluator = XPATH_FACTORY.newXPath();
XPathExpression nameExpr = xPathEvaluator.compile("lfm/tracks/track/name");
NodeList trackNameNodes = (NodeList) nameExpr.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < trackNameNodes.getLength(); i++) {
Node trackNameNode = trackNameNodes.item(i);
System.out.println(String.format("Track Name: %s" , trackNameNode.getTextContent()));
XPathExpression artistNameExpr = xPathEvaluator.compile("following-sibling::artist/name");
NodeList artistNameNodes = (NodeList) artistNameExpr.evaluate(trackNameNode, XPathConstants.NODESET);
for (int j=0; j < artistNameNodes.getLength(); j++) {
System.out.println(String.format(" - Artist Name: %s", artistNameNodes.item(j).getTextContent()));
}
}
}
}

Why doesn't XPath namespace-uri() work out of the box with Java?

I am trying to use the namespace-uri() function in XPath to retrieve nodes based on their fully qualified name. The query //*[local-name() = 'customerName' and namespace-uri() = 'http://example.com/officeN'] in this online XPath tester, among others, correctly returns the relevant nodes. Yet the following self-contained Java class does not retrieve anything. What am I doing wrong with namespace-uri()?
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
public class Test{
public static void main(String[] args)throws Exception {
XPathExpression expr = XPathFactory.newInstance().newXPath().compile(
"//*[local-name() = 'customerName' and namespace-uri() = 'http://example.com/officeN']");
String xml=
"<Agents xmlns:n=\"http://example.com/officeN\">\n"+
"\t<n:Agent>\n\t\t<n:customerName>Joe Shmo</n:customerName>\n\t</n:Agent>\n"+
"\t<n:Agent>\n\t\t<n:customerName>Mary Brown</n:customerName>\n\t</n:Agent>\n</Agents>";
System.out.println(xml);
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
System.err.println("\n\nNodes:");
for (int i = 0; i < nodes.getLength(); i++) {
System.err.println(nodes.item(i));
}
}
}
The query looks fine. You also need to declare your DocumentBuilderFactory to be "namespace-aware".
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

XPath error in Android

My goal is executing an XQuery using XPath.
My XML file is:
<?xml version="1.0" encoding="UTF-8"?>
<postes>
<poste>
<gouvernourat>Kairouan</gouvernourat>
<ville>Kairouan sud</ville>
<cp>3100</cp>
</poste>
<poste>
<gouvernourat>Tunis</gouvernourat>
<ville>Ghazela</ville>
<cp>1002</cp>
</poste>
</postes>
My Java code is:
package xmlparse;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class QueryXML {
public void query() throws ParserConfigurationException, SAXException,
IOException, XPathExpressionException {
// Standard of reading a XML file
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder;
Document doc = null;
XPathExpression expr = null;
builder = factory.newDocumentBuilder();
doc = builder.parse("a.xml"); //C:\\Users\\aymen\\Desktop\\
// Create a XPathFactory
XPathFactory xFactory = XPathFactory.newInstance();
// Create a XPath object
XPath xpath = xFactory.newXPath();
// Compile the XPath expression
expr = xpath.compile("/postes/poste[gouvernourat='Tunis']/ville/text()");
// Run the query and get a nodeset
Object result = expr.evaluate(doc, XPathConstants.NODESET);
// Cast the result to a DOM NodeList
NodeList nodes = (NodeList) result;
for (int i=0; i<nodes.getLength();i++){
System.out.println(nodes.item(i).getNodeValue());
}
}
public static void main(String[] args) throws XPathExpressionException, ParserConfigurationException, SAXException, IOException {
QueryXML process = new QueryXML();
process.query();
}
}
When I launch this Java code the result is displayed on the console correctly (System.out.println).
But if I copy this code to my Android application and change System.out.println(nodes.item(i).getNodeValue()); to Text2.setText(nodes.item(i).getNodeValue()); (I have a TextView named Text2)
When I execute the code and I click the button the TextView stays empty (No error for Force Close)
Thank you in advance
Attribute names needs to start with '#' while using XPath in Android.
So change
[gouvernourat='Tunis']
To
[#gouvernourat='Tunis']
Refer http://developer.android.com/reference/javax/xml/xpath/package-summary.html for details.

What am I doing wrong with XPath?

I am trying to manipulate xsd schema as an xml document that should not be a problem, I believe. But facing troubles with XPath. Whatever XPath I try, it returns nothing. Tried it with or without namespaces but no success.
Please help me understand what am I doing wrong?
My xml is:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.mydomain.com" xmlns="http://www.mydomain.com" elementFormDefault="qualified">
<xs:complexType name="Label">
<xs:choice maxOccurs="unbounded" minOccurs="0">
<xs:element name="Listener"/>
</xs:choice>
</xs:complexType>
</xs:schema>
and application code is:
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setValidating(false);
domFactory.setNamespaceAware(true);
domFactory.setIgnoringComments(true);
domFactory.setIgnoringElementContentWhitespace(true);
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("C:/Temp/test.xsd");
// This part works
Node rootNode = dDoc.getElementsByTagName("xs:schema").item(0);
System.out.println(rootNode.getNodeName());
// This part doesn't work
XPath xPath1 = XPathFactory.newInstance().newXPath();
NodeList nList1 = (NodeList) xPath1.evaluate("//xs:schema", dDoc, XPathConstants.NODESET);
System.out.println(nList1.item(0).getNodeName());
// This part doesn't work
XPath xPath2 = XPathFactory.newInstance().newXPath();
NodeList nList2 = (NodeList) xPath2.evaluate("//xs:element", rootNode, XPathConstants.NODESET);
System.out.println(nList2.item(0).getNodeName());
}catch (Exception e){
e.printStackTrace();
}
Set a namespace context using XPath.setNamespaceContext(). This binds the xs prefix to the http://www.w3.org/2001/XMLSchema namespace.
Made changes to your code. It works:
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class XPathtest {
public static void main(String[] args) {
DocumentBuilderFactory domFactory = DocumentBuilderFactory
.newInstance();
domFactory.setValidating(false);
domFactory.setNamespaceAware(true);
domFactory.setIgnoringComments(true);
domFactory.setIgnoringElementContentWhitespace(true);
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("C:/Temp/test.xsd");
// This part works
Node rootNode = dDoc.getElementsByTagName("xs:schema").item(0);
System.out.println(rootNode.getNodeName());
// This part doesn't work
XPath xPath1 = XPathFactory.newInstance().newXPath();
NamespaceContext nsContext = new NamespaceContext() {
#Override
public String getNamespaceURI(String prefix) {
return "http://www.w3.org/2001/XMLSchema";
}
#Override
public String getPrefix(String namespaceURI) {
return "xs";
}
#Override
public Iterator getPrefixes(String namespaceURI) {
Set s = new HashSet();
s.add("xs");
return s.iterator();
}
};
xPath1.setNamespaceContext((NamespaceContext) nsContext);
NodeList nList1 = (NodeList) xPath1.evaluate("//xs:schema", dDoc,
XPathConstants.NODESET);
System.out.println(nList1.item(0).getNodeName());
// This part doesn't work
// XPath xPath2 = XPathFactory.newInstance().newXPath();
NodeList nList2 = (NodeList) xPath1.evaluate("//xs:element",
rootNode, XPathConstants.NODESET);
System.out.println(nList2.item(0).getNodeName());
} catch (Exception e) {
e.printStackTrace();
}
}
}
The reason is that you haven't specified what xs means. An xml parser must know the namespace url, xs is just an identifier.
You can demonstrate this yourself by using the following code:
XPath xPath = XPathFactory.newInstance().newXPath();
SimpleNamespaceContext nsContext = new SimpleNamespaceContext();
nsContext.addNamespace("t", "http://www.w3.org/2001/XMLSchema");
xPath.setNamespaceContext(nsContext);
xPath.evaluate("//t:schema", dDoc, XPathConstants.NODESET);
You can see that I now use the identifier t instead of xs but that doesn't matter as long as you use the same namespace url.

Categories