I want to manipulate an XML document that has a default namespace but no prefix. Is there a way to use XPath without the namespace URI, just as if there were no namespace?
I believe it should be possible if the namespaceAware property of DocumentBuilderFactory is set to false, but in my case it is not working.
Is my understanding incorrect, or am I making a mistake in the code?
Here is my code:
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(false);
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("E:/test.xml");
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nl = (NodeList) xPath.evaluate("//author", dDoc, XPathConstants.NODESET);
System.out.println(nl.getLength());
} catch (Exception e) {
e.printStackTrace();
}
Here is my XML:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="http://www.mydomain.com/schema">
<author>
<book title="t1"/>
<book title="t2"/>
</author>
</root>
The XPath processing for a document that uses the default namespace (no prefix) is the same as the XPath processing for a document that uses prefixes:
For namespace qualified documents you can use a NamespaceContext when you execute the XPath. You will need to prefix the fragments in the XPath to match the NamespaceContext. The prefixes you use do not need to match the prefixes used in the document.
http://download.oracle.com/javase/6/docs/api/javax/xml/namespace/NamespaceContext.html
Here is how it looks with your code:
import java.util.Iterator;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
public class Demo {
public static void main(String[] args) {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("E:/test.xml");
XPath xPath = XPathFactory.newInstance().newXPath();
xPath.setNamespaceContext(new MyNamespaceContext());
NodeList nl = (NodeList) xPath.evaluate("/ns:root/ns:author", dDoc, XPathConstants.NODESET);
System.out.println(nl.getLength());
} catch (Exception e) {
e.printStackTrace();
}
}
private static class MyNamespaceContext implements NamespaceContext {
public String getNamespaceURI(String prefix) {
if("ns".equals(prefix)) {
return "http://www.mydomain.com/schema";
}
return null;
}
public String getPrefix(String namespaceURI) {
return null;
}
public Iterator getPrefixes(String namespaceURI) {
return null;
}
}
}
Note:
I also used the corrected XPath suggested by Dennis.
The following also appears to work, and is closer to your original question:
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
public class Demo {
public static void main(String[] args) {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("E:/test.xml");
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nl = (NodeList) xPath.evaluate("/root/author", dDoc, XPathConstants.NODESET);
System.out.println(nl.getLength());
} catch (Exception e) {
e.printStackTrace();
}
}
}
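For anyone comparing the two approaches, here is a minimal sketch that runs them side by side. It assumes the XML from the question is inlined as a string instead of being read from E:/test.xml, and the class and method names (DefaultNamespaceDemo, count) are made up for illustration. With a namespace-aware parser an unprefixed name in XPath 1.0 only selects elements in no namespace, so /root/author finds nothing; matching on local-name() is one way to ignore the namespace without a NamespaceContext.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DefaultNamespaceDemo {
    // Same structure as the question's test.xml, inlined (assumption for the sketch).
    static final String XML =
        "<root xmlns=\"http://www.mydomain.com/schema\">"
        + "<author><book title=\"t1\"/><book title=\"t2\"/></author>"
        + "</root>";

    // Parses XML with the requested namespace awareness and counts the matched nodes.
    public static int count(boolean namespaceAware, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Namespace-unaware parse: the approach from the second listing above.
        System.out.println(count(false, "/root/author"));
        // Namespace-aware parse: unprefixed names select no-namespace elements only.
        System.out.println(count(true, "/root/author"));
        // Namespace-aware parse, ignoring the namespace via local-name().
        System.out.println(count(true, "/*[local-name()='root']/*[local-name()='author']"));
    }
}
```

The local-name() form gets verbose quickly, so for anything beyond a one-off query the NamespaceContext approach is probably easier to maintain.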
Blaise Doughan is right; the attached code is correct.
The problem was somewhere else. I was running all my tests through the application launcher in the Eclipse IDE and nothing was working. Then I discovered that the Eclipse project was the cause of all the grief. I ran my class from the command prompt and it worked. I then created a new Eclipse project, pasted the same code there, and it worked there too.
Thank you all for your time and effort.
I've written a simple NamespaceContext implementation (here) that might be of help. It takes a Map<String, String> as input, where the key is a prefix and the value is a namespace URI.
It follows the NamespaceContext specification, and you can see how it works in the unit tests.
Map<String, String> mappings = new HashMap<>();
mappings.put("foo", "http://foo");
mappings.put("foo2", "http://foo");
mappings.put("bar", "http://bar");
NamespaceContext context = new SimpleNamespaceContext(mappings);
context.getNamespaceURI("foo"); // "http://foo"
context.getPrefix("http://foo"); // "foo" or "foo2"
context.getPrefixes("http://foo"); // ["foo", "foo2"]
Note that it has a dependency on Google Guava
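If you would rather avoid the extra dependency, a Map-backed NamespaceContext can be sketched with the JDK alone. This is not the implementation linked above; the class name MapNamespaceContext is hypothetical, and the behavior for unbound prefixes and the reserved xml/xmlns bindings follows my reading of the NamespaceContext Javadoc.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

// Hypothetical JDK-only NamespaceContext backed by a prefix-to-URI map.
public class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri = new LinkedHashMap<>();

    public MapNamespaceContext(Map<String, String> mappings) {
        prefixToUri.putAll(mappings);
        // Bindings that the NamespaceContext contract fixes for every implementation.
        prefixToUri.put(XMLConstants.XML_NS_PREFIX, XMLConstants.XML_NS_URI);
        prefixToUri.put(XMLConstants.XMLNS_ATTRIBUTE, XMLConstants.XMLNS_ATTRIBUTE_NS_URI);
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = prefixToUri.get(prefix);
        // Unbound prefixes map to the empty string rather than null.
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> prefixes = getPrefixes(namespaceURI);
        return prefixes.hasNext() ? prefixes.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("namespaceURI must not be null");
        }
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                prefixes.add(entry.getKey());
            }
        }
        // The returned iterator must not allow removal.
        return Collections.unmodifiableList(prefixes).iterator();
    }
}
```

It plugs in the same way as any other context: xPath.setNamespaceContext(new MapNamespaceContext(mappings)).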
I have been banging my head against this for two days now.
I have an XHTML web page from which I want to scrape some data.
I am using JTidy to parse the page into a DOM and then XPathFactory to find nodes using XPath.
The XHTML snippet looks something like this:
<div style="line-height: 22px;" id="dvTitle" class="titlebtmbrdr01">BAJAJ AUTO LTD.</div>
Now I want the text BAJAJ AUTO LTD.
The code that I am using is:
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Vector;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class BSEQuotesExtractor implements valueExtractor {
@Override
public Vector<String> getName(Document d) throws XPathExpressionException {
// TODO Auto-generated method stub
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("//div[@id='dvTitle']/text()");
Object result = expr.evaluate(d, XPathConstants.NODESET);
NodeList nodes = (NodeList)result;
for(int i=0;i<nodes.getLength();i++)
{
System.out.println(nodes.item(i).getNodeValue());
}
return null;
}
public static void main(String[] args) throws MalformedURLException, IOException, XPathExpressionException{
BSEQuotesExtractor q = new BSEQuotesExtractor();
DOMParser parser = new DOMParser(new URL("http://www.bseindia.com/bseplus/StockReach/StockQuote/Equity/BAJAJ%20AUTO%20LTD/BAJAJAUT/532977/Scrips").openStream());
Document d = parser.getDocument();
q.getName(d);
}
}
But I get a null output and not BAJAJ AUTO LTD.
Please rescue me.
Try this:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("//div[@id='dvTitle']");
Object result = expr.evaluate(d, XPathConstants.NODE);
Node node = (Node)result;
System.out.println(node.getTextContent());
You must use XPathConstants.STRING instead of XPathConstants.NODESET.
You want the value of a single element (the div), not a list of nodes.
Write:
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
String divContent = (String) xpath.evaluate("//div[@id='dvTitle']", document, XPathConstants.STRING);
divContent then holds "BAJAJ AUTO LTD.".
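A self-contained way to try the STRING return type is sketched below. It assumes a small well-formed XML string standing in for the DOM that JTidy would produce from the real page, and the class and method names (DivTextDemo, titleText) are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DivTextDemo {
    // Stand-in for the scraped page (assumption: the real DOM comes from JTidy).
    static final String XHTML =
        "<html><body>"
        + "<div style=\"line-height: 22px;\" id=\"dvTitle\" class=\"titlebtmbrdr01\">BAJAJ AUTO LTD.</div>"
        + "</body></html>";

    // Returns the string value of the div whose id attribute is dvTitle.
    public static String titleText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // STRING asks the engine for the string value of the first matching node.
            return (String) xpath.evaluate("//div[@id='dvTitle']", doc, XPathConstants.STRING);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titleText(XHTML)); // prints "BAJAJ AUTO LTD."
    }
}
```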
I am trying to use the namespace-uri() function in XPath to retrieve nodes based on their fully qualified name. The query //*[local-name() = 'customerName' and namespace-uri() = 'http://example.com/officeN'] in this online XPath tester, among others, correctly returns the relevant nodes. Yet the following self-contained Java class does not retrieve anything. What am I doing wrong with namespace-uri()?
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
public class Test{
public static void main(String[] args)throws Exception {
XPathExpression expr = XPathFactory.newInstance().newXPath().compile(
"//*[local-name() = 'customerName' and namespace-uri() = 'http://example.com/officeN']");
String xml=
"<Agents xmlns:n=\"http://example.com/officeN\">\n"+
"\t<n:Agent>\n\t\t<n:customerName>Joe Shmo</n:customerName>\n\t</n:Agent>\n"+
"\t<n:Agent>\n\t\t<n:customerName>Mary Brown</n:customerName>\n\t</n:Agent>\n</Agents>";
System.out.println(xml);
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
System.err.println("\n\nNodes:");
for (int i = 0; i < nodes.getLength(); i++) {
System.err.println(nodes.item(i));
}
}
}
The query looks fine. You also need to declare your DocumentBuilderFactory to be "namespace-aware".
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
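Putting that together with the question's query, a minimal sketch could look like this. The XML is the question's sample with the whitespace removed, and the class and method names (NamespaceUriDemo, customerNames) are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceUriDemo {
    static final String XML =
        "<Agents xmlns:n=\"http://example.com/officeN\">"
        + "<n:Agent><n:customerName>Joe Shmo</n:customerName></n:Agent>"
        + "<n:Agent><n:customerName>Mary Brown</n:customerName></n:Agent>"
        + "</Agents>";

    static final String QUERY =
        "//*[local-name() = 'customerName' and namespace-uri() = 'http://example.com/officeN']";

    // Runs the query against a DOM built with the given namespace awareness.
    public static List<String> customerNames(boolean namespaceAware) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // Without this, the DOM carries no namespace URIs for namespace-uri() to compare.
            dbf.setNamespaceAware(namespaceAware);
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));
            XPathExpression expr = XPathFactory.newInstance().newXPath().compile(QUERY);
            NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getTextContent());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(customerNames(true)); // prints [Joe Shmo, Mary Brown]
    }
}
```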
My goal is to execute a query against an XML file using XPath.
My XML file is:
<?xml version="1.0" encoding="UTF-8"?>
<postes>
<poste>
<gouvernourat>Kairouan</gouvernourat>
<ville>Kairouan sud</ville>
<cp>3100</cp>
</poste>
<poste>
<gouvernourat>Tunis</gouvernourat>
<ville>Ghazela</ville>
<cp>1002</cp>
</poste>
</postes>
My Java code is:
package xmlparse;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class QueryXML {
public void query() throws ParserConfigurationException, SAXException,
IOException, XPathExpressionException {
// Standard of reading a XML file
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder;
Document doc = null;
XPathExpression expr = null;
builder = factory.newDocumentBuilder();
doc = builder.parse("a.xml"); //C:\\Users\\aymen\\Desktop\\
// Create a XPathFactory
XPathFactory xFactory = XPathFactory.newInstance();
// Create a XPath object
XPath xpath = xFactory.newXPath();
// Compile the XPath expression
expr = xpath.compile("/postes/poste[gouvernourat='Tunis']/ville/text()");
// Run the query and get a nodeset
Object result = expr.evaluate(doc, XPathConstants.NODESET);
// Cast the result to a DOM NodeList
NodeList nodes = (NodeList) result;
for (int i=0; i<nodes.getLength();i++){
System.out.println(nodes.item(i).getNodeValue());
}
}
public static void main(String[] args) throws XPathExpressionException, ParserConfigurationException, SAXException, IOException {
QueryXML process = new QueryXML();
process.query();
}
}
When I launch this Java code, the result is displayed correctly on the console (System.out.println).
But if I copy this code to my Android application and change System.out.println(nodes.item(i).getNodeValue()); to Text2.setText(nodes.item(i).getNodeValue()); (I have a TextView named Text2), it does not work.
When I execute the code and click the button, the TextView stays empty (no error and no force close).
Thank you in advance.
Attribute names need to start with '@' when using XPath in Android.
So change
[gouvernourat='Tunis']
to
[@gouvernourat='Tunis']
Refer to http://developer.android.com/reference/javax/xml/xpath/package-summary.html for details.
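It may help to check which form of the predicate fits the XML in the question. In standard XPath 1.0, [gouvernourat='Tunis'] tests a child element and [@gouvernourat='Tunis'] tests an attribute. The sketch below runs both against the question's XML on desktop Java; I have not verified it on Android, and the class and method names (PredicateDemo, ville) are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class PredicateDemo {
    // The question's a.xml, inlined: gouvernourat is a child element, not an attribute.
    static final String XML =
        "<postes>"
        + "<poste><gouvernourat>Kairouan</gouvernourat><ville>Kairouan sud</ville><cp>3100</cp></poste>"
        + "<poste><gouvernourat>Tunis</gouvernourat><ville>Ghazela</ville><cp>1002</cp></poste>"
        + "</postes>";

    // Evaluates /postes/poste[predicate]/ville and returns its string value ("" if no match).
    public static String ville(String predicate) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate("/postes/poste[" + predicate + "]/ville", doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Element test: matches the poste whose gouvernourat child is Tunis.
        System.out.println("element:   '" + ville("gouvernourat='Tunis'") + "'");  // 'Ghazela'
        // Attribute test: no poste has a gouvernourat attribute, so nothing matches.
        System.out.println("attribute: '" + ville("@gouvernourat='Tunis'") + "'"); // ''
    }
}
```

If the attribute form is the one that returns nothing for your file, the empty TextView probably has another cause, such as how a.xml is loaded on the device.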
I am trying to manipulate an XSD schema as an XML document, which I believe should not be a problem, but I am having trouble with XPath. Whatever XPath I try returns nothing. I have tried it with and without namespaces, with no success.
Please help me understand what I am doing wrong.
My XML is:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.mydomain.com" xmlns="http://www.mydomain.com" elementFormDefault="qualified">
<xs:complexType name="Label">
<xs:choice maxOccurs="unbounded" minOccurs="0">
<xs:element name="Listener"/>
</xs:choice>
</xs:complexType>
</xs:schema>
and application code is:
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setValidating(false);
domFactory.setNamespaceAware(true);
domFactory.setIgnoringComments(true);
domFactory.setIgnoringElementContentWhitespace(true);
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("C:/Temp/test.xsd");
// This part works
Node rootNode = dDoc.getElementsByTagName("xs:schema").item(0);
System.out.println(rootNode.getNodeName());
// This part doesn't work
XPath xPath1 = XPathFactory.newInstance().newXPath();
NodeList nList1 = (NodeList) xPath1.evaluate("//xs:schema", dDoc, XPathConstants.NODESET);
System.out.println(nList1.item(0).getNodeName());
// This part doesn't work
XPath xPath2 = XPathFactory.newInstance().newXPath();
NodeList nList2 = (NodeList) xPath2.evaluate("//xs:element", rootNode, XPathConstants.NODESET);
System.out.println(nList2.item(0).getNodeName());
}catch (Exception e){
e.printStackTrace();
}
Set a namespace context using XPath.setNamespaceContext(). This binds the xs prefix to the http://www.w3.org/2001/XMLSchema namespace.
I made changes to your code, and it now works:
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class XPathtest {
public static void main(String[] args) {
DocumentBuilderFactory domFactory = DocumentBuilderFactory
.newInstance();
domFactory.setValidating(false);
domFactory.setNamespaceAware(true);
domFactory.setIgnoringComments(true);
domFactory.setIgnoringElementContentWhitespace(true);
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("C:/Temp/test.xsd");
// This part works
Node rootNode = dDoc.getElementsByTagName("xs:schema").item(0);
System.out.println(rootNode.getNodeName());
// With a NamespaceContext set, the prefixed expression now matches
XPath xPath1 = XPathFactory.newInstance().newXPath();
NamespaceContext nsContext = new NamespaceContext() {
@Override
public String getNamespaceURI(String prefix) {
return "http://www.w3.org/2001/XMLSchema";
}
@Override
public String getPrefix(String namespaceURI) {
return "xs";
}
@Override
public Iterator<String> getPrefixes(String namespaceURI) {
Set<String> s = new HashSet<>();
s.add("xs");
return s.iterator();
}
};
xPath1.setNamespaceContext(nsContext);
NodeList nList1 = (NodeList) xPath1.evaluate("//xs:schema", dDoc,
XPathConstants.NODESET);
System.out.println(nList1.item(0).getNodeName());
// Reusing xPath1 keeps the namespace context for the second query
NodeList nList2 = (NodeList) xPath1.evaluate("//xs:element",
rootNode, XPathConstants.NODESET);
System.out.println(nList2.item(0).getNodeName());
} catch (Exception e) {
e.printStackTrace();
}
}
}
The reason is that you haven't specified what xs means. The XPath evaluator must know the namespace URI; xs is just an identifier bound to it.
You can demonstrate this yourself by using the following code:
XPath xPath = XPathFactory.newInstance().newXPath();
SimpleNamespaceContext nsContext = new SimpleNamespaceContext();
nsContext.addNamespace("t", "http://www.w3.org/2001/XMLSchema");
xPath.setNamespaceContext(nsContext);
xPath.evaluate("//t:schema", dDoc, XPathConstants.NODESET);
You can see that I now use the identifier t instead of xs, but that doesn't matter as long as it is bound to the same namespace URI.
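The same point can be shown with the JDK alone. In the sketch below the schema from the question is inlined as a string, the prefix t is bound through an anonymous NamespaceContext, and the class and method names (PrefixIndependenceDemo, firstElementName) are invented for the example.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class PrefixIndependenceDemo {
    // The question's test.xsd, inlined; the document itself uses the xs prefix.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\""
        + " targetNamespace=\"http://www.mydomain.com\" xmlns=\"http://www.mydomain.com\""
        + " elementFormDefault=\"qualified\">"
        + "<xs:complexType name=\"Label\">"
        + "<xs:choice maxOccurs=\"unbounded\" minOccurs=\"0\"><xs:element name=\"Listener\"/></xs:choice>"
        + "</xs:complexType></xs:schema>";

    // Looks up the name attribute of the first xs:element, addressing it as t:element.
    public static String firstElementName() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(XSD)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    // Only the URI has to match the document; the prefix is our own choice.
                    return "t".equals(prefix)
                        ? XMLConstants.W3C_XML_SCHEMA_NS_URI
                        : XMLConstants.NULL_NS_URI;
                }
                @Override
                public String getPrefix(String namespaceURI) {
                    return null; // reverse lookup is not needed for this query
                }
                @Override
                public Iterator<String> getPrefixes(String namespaceURI) {
                    return Collections.emptyIterator();
                }
            });
            return (String) xPath.evaluate("//t:element/@name", doc, XPathConstants.STRING);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstElementName()); // prints "Listener"
    }
}
```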
I am looking for example Java code that can construct an XML document that uses namespaces. I cannot seem to find anything using my normal favourite tool so was hoping someone may be able to help me out.
There are a number of ways of doing this. Just a couple of examples:
Using XOM
import nu.xom.Document;
import nu.xom.Element;
public class XomTest {
public static void main(String[] args) {
XomTest xomTest = new XomTest();
xomTest.testXmlDocumentWithNamespaces();
}
private void testXmlDocumentWithNamespaces() {
Element root = new Element("my:example", "urn:example.namespace");
Document document = new Document(root);
Element element = new Element("element", "http://another.namespace");
root.appendChild(element);
System.out.print(document.toXML());
}
}
Using Java Implementation of W3C DOM
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;
public class DomTest {
private static DocumentBuilderFactory dbf = DocumentBuilderFactory
.newInstance();
public static void main(String[] args) throws Exception {
DomTest domTest = new DomTest();
domTest.testXmlDocumentWithNamespaces();
}
public void testXmlDocumentWithNamespaces() throws Exception {
DocumentBuilder db = dbf.newDocumentBuilder();
DOMImplementation domImpl = db.getDOMImplementation();
Document document = buildExampleDocumentWithNamespaces(domImpl);
serialize(domImpl, document);
}
private Document buildExampleDocumentWithNamespaces(
DOMImplementation domImpl) {
Document document = domImpl.createDocument("urn:example.namespace",
"my:example", null);
Element element = document.createElementNS("http://another.namespace",
"element");
document.getDocumentElement().appendChild(element);
return document;
}
private void serialize(DOMImplementation domImpl, Document document) {
DOMImplementationLS ls = (DOMImplementationLS) domImpl;
LSSerializer lss = ls.createLSSerializer();
LSOutput lso = ls.createLSOutput();
lso.setByteStream(System.out);
lss.write(document, lso);
}
}
I am not sure what you are trying to do, but I use JDOM for most of my XML work, and it supports namespaces (of course).
The code:
Document doc = new Document();
Namespace sNS = Namespace.getNamespace("someNS", "someNamespace");
Element element = new Element("SomeElement", sNS);
element.setAttribute("someKey", "someValue", Namespace.getNamespace("someONS", "someOtherNamespace"));
Element element2 = new Element("SomeElement", Namespace.getNamespace("someNS", "someNamespace"));
element2.setAttribute("someKey", "someValue", sNS);
element.addContent(element2);
doc.addContent(element);
produces the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<someNS:SomeElement xmlns:someNS="someNamespace" xmlns:someONS="someOtherNamespace" someONS:someKey="someValue">
<someNS:SomeElement someNS:someKey="someValue" />
</someNS:SomeElement>
Which should contain everything you need. Hope that helps.