What is the proper way to call getAttributeNS using Java DOM?

I'm having a problem correctly calling getAttributeNS() (and other NS methods) from Java DOM. First, here is my sample XML doc:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book xmlns:c="http://www.w3schools.com/children/" xmlns:foo="http://foo.org/foo" category="CHILDREN">
<title foo:lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
And here is my little Java class that uses DOM and calls getAttributeNS:
package com.mycompany.proj;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Element;
import java.io.File;
public class AttributeNSProblem
{
public static void main(String[] args)
{
try
{
File fXmlFile = new File("bookstore_ns.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("title");
Element elem = (Element)nList.item(0);
String lang = elem.getAttributeNS("http://foo.org/foo", "lang");
System.out.println("title lang: " + lang);
lang = elem.getAttribute("foo:lang");
System.out.println("title lang: " + lang);
}
catch (Exception e)
{
e.printStackTrace();
}
}
}
When I call getAttributeNS("http://foo.org/foo", "lang"), it returns an empty String. I've also tried getAttributeNS("foo", "lang"), same result.
What's the proper way to retrieve the value of an attribute qualified by a namespace?
Thanks.

DocumentBuilderFactory is not namespace-aware by default. Immediately after DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();, add dbFactory.setNamespaceAware(true);
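To make the difference visible, here is a minimal sketch that parses a shortened, inline copy of the question's document twice, once with a namespace-aware factory and once with the default one. The class name AttributeNSDemo, the helper langOf and the inline XML string are illustrative assumptions, not part of the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class AttributeNSDemo {
    // Inline, shortened copy of the question's document (assumption: the
    // real file is read from disk; a string keeps the sketch self-contained).
    static final String XML =
            "<bookstore><book xmlns:foo=\"http://foo.org/foo\">"
            + "<title foo:lang=\"en\">Harry Potter</title></book></bookstore>";

    // Parses XML and looks up the title's lang attribute by namespace URI.
    static String langOf(boolean namespaceAware) {
        try {
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            dbFactory.setNamespaceAware(namespaceAware); // the one-line fix
            Document doc = dbFactory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            Element title = (Element) doc.getElementsByTagName("title").item(0);
            // The first argument is the namespace URI, never the prefix "foo".
            return title.getAttributeNS("http://foo.org/foo", "lang");
        } catch (Exception e) {
            // Checked parser exceptions are wrapped to keep the helper easy to call.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("namespace-aware: [" + langOf(true) + "]");
        System.out.println("default factory: [" + langOf(false) + "]");
    }
}
```

Note that getAttributeNS("foo", "lang") cannot work in either mode, because "foo" is the prefix and the method expects the namespace URI.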

Related

XPathExpression.evaluate using Node [duplicate]

I want to manipulate an XML document that has a default namespace but no prefix. Is there a way to use XPath without the namespace URI, just as if there were no namespace?
I believe this should be possible if the namespaceAware property of DocumentBuilderFactory is set to false, but in my case it is not working.
Is my understanding incorrect, or am I making a mistake in the code?
Here is my code:
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(false);
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("E:/test.xml");
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nl = (NodeList) xPath.evaluate("//author", dDoc, XPathConstants.NODESET);
System.out.println(nl.getLength());
} catch (Exception e) {
e.printStackTrace();
}
Here is my xml:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="http://www.mydomain.com/schema">
<author>
<book title="t1"/>
<book title="t2"/>
</author>
</root>

The XPath processing for a document that uses the default namespace (no prefix) is the same as the XPath processing for a document that uses prefixes:
For namespace qualified documents you can use a NamespaceContext when you execute the XPath. You will need to prefix the fragments in the XPath to match the NamespaceContext. The prefixes you use do not need to match the prefixes used in the document.
http://download.oracle.com/javase/6/docs/api/javax/xml/namespace/NamespaceContext.html
Here is how it looks with your code:
import java.util.Iterator;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
public class Demo {
public static void main(String[] args) {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("E:/test.xml");
XPath xPath = XPathFactory.newInstance().newXPath();
xPath.setNamespaceContext(new MyNamespaceContext());
NodeList nl = (NodeList) xPath.evaluate("/ns:root/ns:author", dDoc, XPathConstants.NODESET);
System.out.println(nl.getLength());
} catch (Exception e) {
e.printStackTrace();
}
}
private static class MyNamespaceContext implements NamespaceContext {
public String getNamespaceURI(String prefix) {
if("ns".equals(prefix)) {
return "http://www.mydomain.com/schema";
}
return null;
}
public String getPrefix(String namespaceURI) {
return null;
}
public Iterator getPrefixes(String namespaceURI) {
return null;
}
}
}
Note:
I also used the corrected XPath suggested by Dennis.
The following also appears to work, and is closer to your original question:
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
public class Demo {
public static void main(String[] args) {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
try {
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document dDoc = builder.parse("E:/test.xml");
XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nl = (NodeList) xPath.evaluate("/root/author", dDoc, XPathConstants.NODESET);
System.out.println(nl.getLength());
} catch (Exception e) {
e.printStackTrace();
}
}
}

Blaise Doughan is right; the attached code is correct.
The problem was somewhere else. I was running all my tests through the application launcher in the Eclipse IDE and nothing worked. Then I discovered that the Eclipse project itself was the cause of all the grief: when I ran my class from the command prompt, it worked, and when I created a new Eclipse project and pasted the same code there, it worked too.
Thank you all for your time and effort.

I've written a simple NamespaceContext implementation that might be of help. It takes a Map<String, String> as input, where the key is a prefix and the value is a namespace.
It follows the NamespaceContext specification, and you can see how it works in the unit tests.
Map<String, String> mappings = new HashMap<>();
mappings.put("foo", "http://foo");
mappings.put("foo2", "http://foo");
mappings.put("bar", "http://bar");
NamespaceContext context = new SimpleNamespaceContext(mappings);
context.getNamespaceURI("foo"); // "http://foo"
context.getPrefix("http://foo"); // "foo" or "foo2"
context.getPrefixes("http://foo"); // ["foo", "foo2"]
Note that it has a dependency on Google Guava.
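If pulling in Guava is not an option, the same idea can be sketched with the JDK alone. The class below is a hypothetical MapNamespaceContext written for illustration; it is not the SimpleNamespaceContext mentioned above, and it only approximates that class's behaviour based on the description given here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

// Hypothetical, Guava-free NamespaceContext backed by a prefix -> URI map.
final class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri;

    MapNamespaceContext(Map<String, String> mappings) {
        // Copy so later changes to the caller's map do not leak in.
        this.prefixToUri = new LinkedHashMap<>(mappings);
        // The NamespaceContext contract fixes these two bindings.
        prefixToUri.put(XMLConstants.XML_NS_PREFIX, XMLConstants.XML_NS_URI);
        prefixToUri.put(XMLConstants.XMLNS_ATTRIBUTE,
                XMLConstants.XMLNS_ATTRIBUTE_NS_URI);
    }

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = prefixToUri.get(prefix);
        // Unbound prefixes map to the empty string, as the contract requires.
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String namespaceURI) {
        Iterator<String> it = getPrefixes(namespaceURI);
        return it.hasNext() ? it.next() : null;
    }

    @Override
    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("namespaceURI must not be null");
        }
        List<String> prefixes = new ArrayList<>();
        for (Map.Entry<String, String> e : prefixToUri.entrySet()) {
            if (e.getValue().equals(namespaceURI)) {
                prefixes.add(e.getKey());
            }
        }
        // The contract asks for an iterator that does not support remove().
        return Collections.unmodifiableList(prefixes).iterator();
    }
}
```

An instance can be passed to xPath.setNamespaceContext(...) in place of the hand-written MyNamespaceContext shown in the earlier answer.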

How to select nodes with different tags using DOM?

I have an XML file that looks like this:
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<HWData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<NE MOID="WBTS-42" NEType="WBTS">
<EQHO MOID="EQHO-1-0" >
<UNIT MOID="UNIT-FAN-1" State="enabled"></UNIT>
<UNIT MOID="UNIT-FAN-3" State="enabled"></UNIT>
</EQHO>
</NE>
<NE MOID="RNC-40" NEType="RNC">
<EQHO MOID="EQHO-3-0" >
<UNIT MOID="UNIT-FAN-5" State="disabled"></UNIT>
<UNIT MOID="UNIT-FAN-6" State="disabled"></UNIT>
</EQHO>
</NE>
</HWData>
How can I get a NodeList containing the "NE" and "UNIT" elements using DOM?
Thanks.

You can do it manually:
import java.io.File;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class XmlDomTest {
public static void main(String[] args) throws Exception {
File file = new File("/path/to/your/file");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(file);
Set<String> filteredNames = new HashSet<String>(Arrays.asList("NE", "UNIT"));
NodeList list = collectNodes(doc, filteredNames);
for (int i = 0; i < list.getLength(); i++)
System.out.println(list.item(i).getNodeName());
}
private static NodeList collectNodes(Document doc, Set<String> filteredNames) {
Node ret = doc.createElement("NodeList");
collectNodes(doc, filteredNames, ret);
return ret.getChildNodes();
}
private static void collectNodes(Node node, Set<String> filteredNames, Node ret) {
NodeList chn = node.getChildNodes();
for (int i = 0; i < chn.getLength(); i++) {
Node child = chn.item(i);
if (filteredNames.contains(child.getNodeName()))
ret.appendChild(child);
collectNodes(child, filteredNames, ret);
}
}
}
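One caveat about the code above: appendChild re-parents a node, so every matched element is moved out of the parsed document into the temporary "NodeList" element. If the document has to stay intact, a plain java.util.List can be filled instead. The following is a sketch under that assumption; the class name CollectNodesDemo and its helpers are made up for illustration, and it returns a List<Node> rather than a NodeList.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class CollectNodesDemo {
    // Parses an XML string; checked exceptions are wrapped for brevity.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns all descendant elements whose tag name is in names,
    // in depth-first document order, without modifying the tree.
    static List<Node> collect(Node root, Set<String> names) {
        List<Node> out = new ArrayList<>();
        collect(root, names, out);
        return out;
    }

    private static void collect(Node node, Set<String> names, List<Node> out) {
        // Walk siblings by reference instead of indexing a live NodeList.
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && names.contains(child.getNodeName())) {
                out.add(child); // only referenced, not moved
            }
            collect(child, names, out);
        }
    }
}
```

Usage would look like collect(doc, new HashSet<>(Arrays.asList("NE", "UNIT"))), after which the matched elements are still attached to their original parents.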

Try this:
public static List<String> MOIDList(File file) throws SAXException, IOException, ParserConfigurationException, XPathExpressionException{
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(file);
XPath xPath = XPathFactory.newInstance().newXPath();
XPathExpression exp = xPath.compile("//NE | //UNIT");
NodeList nl = (NodeList)exp.evaluate(doc, XPathConstants.NODESET);
List<String> MoidList = new ArrayList<>();
for (int i = 0; i < nl.getLength(); i++) {
String moid=((Element)nl.item(i)).getAttribute("MOID");
MoidList.add(moid);
}
return MoidList;
}

The XPath that selects only the MOID attributes is //NE/@MOID | //UNIT/@MOID.
You could also have a look at my open-source XML parsing library, unXml. It's available on Maven Central.
You can then do the following:
import com.nerdforge.unxml.Parsing;
import com.nerdforge.unxml.factory.ParsingFactory;
import org.w3c.dom.Document;
import java.util.List;
public class Parser {
public List<String> parseXml(String xml){
Parsing parsing = ParsingFactory.getInstance().create();
Document document = parsing.xml().document(xml);
List<String> result = parsing
.arr("//NE/@MOID | //UNIT/@MOID", parsing.text())
.as(String.class)
.apply(document);
return result;
}
}
parseXml will return the result:
[WBTS-42, UNIT-FAN-1, UNIT-FAN-3, RNC-40, UNIT-FAN-5, UNIT-FAN-6]
You can also create more complex nested data structures if you need to. Leave a comment here if you want an example of how to do it.
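For comparison, the attribute-selecting XPath can also be evaluated with the JDK's own javax.xml.xpath API, without any third-party library. This is only a sketch; the class name MoidXPathDemo and the method moids are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class MoidXPathDemo {
    // Returns the MOID values of all NE and UNIT elements in the given XML.
    static List<String> moids(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            // The node set contains attribute nodes, not elements.
            NodeList attrs = (NodeList) xPath.evaluate(
                    "//NE/@MOID | //UNIT/@MOID", doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                result.add(attrs.item(i).getNodeValue());
            }
            return result;
        } catch (Exception e) {
            // Checked exceptions are wrapped to keep the helper easy to call.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(moids(
                "<HWData><NE MOID=\"WBTS-42\"><EQHO MOID=\"EQHO-1-0\">"
                + "<UNIT MOID=\"UNIT-FAN-1\"/></EQHO></NE></HWData>"));
    }
}
```

Because the expression selects the attributes directly, no cast to Element and no getAttribute call is needed; EQHO elements are skipped because the XPath never names them.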

org.xml.sax.SAXParseException: Premature end of file

I currently have the following XML file.
http://www.cse.unsw.edu.au/~cs9321/14s1/assignments/musicDb.xml
My XMLParser.java class.
package edu.unsw.comp9321.assignment1;
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
public class XMLParser {
public void search () {
try{
File fXmlFile = new File("/COMP9321Assignment1/xml/musicDb.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
} catch (Exception e) {
e.printStackTrace();
}
}
}
I create an object in another class and call the search but I keep receiving the above stated error.
Would anyone know what the problem might be?
Thanks for your help.

An "org.xml.sax.SAXParseException: Premature end of file" error can occur for several reasons. Check the following:
1. Check that all tags in your XML are closed properly, at the same level.
2. Check for namespace issues.
3. Check that your XML document is well-formed.
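The well-formedness check in the list above can be automated with a few lines. The sketch below is an assumption-laden example (the class WellFormedCheck and the method isWellFormed are invented names): it parses a string and reports the position of the first fatal error. With the JDK's built-in parser, an empty input is one case that is rejected this way, so it is also worth confirming that the file being opened actually contains the expected content.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class WellFormedCheck {
    // Returns true if the XML parses without a fatal error.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXParseException e) {
            // Line and column point at where the parser gave up.
            System.err.println("Not well-formed at line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage());
            return false;
        } catch (Exception e) {
            // Configuration or I/O problems are not well-formedness errors.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));  // balanced tags
        System.out.println(isWellFormed("<a><b></a>"));   // <b> never closed
        System.out.println(isWellFormed(""));             // empty input
    }
}
```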

Getting too many child nodes and can't get attributes

I have a simple XML file, and I want to read its attributes. There are a few examples on the web, but I still don't understand why I get 17 child nodes when I see only 4 elements. I even tried counting the places where text could be, but I still don't arrive at that number, unless it is the length of the output. As a result, I don't know how to get the Name attribute of every tag3.
<?xml version="1.0" encoding="UTF-8"?>
<tag1 xmlns="something">
<xxxxxx-Set>
<tag3 Name="a"/>
<tag3 Name="b"/>
<tag3 Name="c"/>
<tag3 Name="d"/>
</xxxxxx-Set>
<tagB>
<tag3 Name="a"/>
<tag3 Name="b"/>
<tag3 Name="c"/>
<tag3 Name="d"/>
</tagB>
</tag1>
This is my java code:
import java.io.File;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class ParseXML {
public static void main(String[] args) {
try {
File test= new File("test.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(test);
NodeList tagAs= doc.getElementsByTagName("xxxxxx-Set").item(0).getChildNodes(); //should be all the tag3 elements?
for(int i = 0; i < tagAs.getLength(); i++) {
System.out.println(tagAs);
System.out.println(i);
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
Note: adding .getAttributes().getNamedItem("Name").getNodeValue() to the print statement throws a NullPointerException.
And the output is:
[xxxxxx-Set: null]
0
[xxxxxx-Set: null]
1
...
[xxxxxx-Set: null]
16

If you want to read all the Name attributes (lowercase attribute names would be better), use the following approach:
Element xSet = (Element) doc.getElementsByTagName("xxxxxx-Set").item(0);
NodeList xSetTags = xSet.getElementsByTagName("tag3");
for(int i = 0; i < xSetTags.getLength(); i++) {
Element tag3 = (Element) xSetTags.item(i);
System.out.println(tag3.getAttribute("Name"));
}
I used the org.w3c.dom.Element interface. Working with org.w3c.dom.Node directly is not the best idea, because that interface represents not only XML elements but also attributes, comments, text and other node types. See the documentation for the difference between Node and Element.
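The extra child nodes in the question are the whitespace text nodes between the elements, and calling getAttributes() on a text node returns null, which explains the exception. Here is a small sketch that shows the count and filters on the node type; the class name ChildNodesDemo and the shortened two-element document are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ChildNodesDemo {
    // Shortened version of the question's file, indentation included.
    static final String XML =
            "<tag1 xmlns=\"something\">\n"
            + "  <xxxxxx-Set>\n"
            + "    <tag3 Name=\"a\"/>\n"
            + "    <tag3 Name=\"b\"/>\n"
            + "  </xxxxxx-Set>\n"
            + "</tag1>";

    // Parses the XML and returns the first xxxxxx-Set element.
    static Element firstSet(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return (Element) doc.getElementsByTagName("xxxxxx-Set").item(0);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the Name attribute of element children only, skipping text nodes.
    static List<String> names(Element parent) {
        List<String> names = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(((Element) child).getAttribute("Name"));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Element set = firstSet(XML);
        // Two elements plus three whitespace text nodes around them.
        System.out.println("child nodes: " + set.getChildNodes().getLength());
        System.out.println("names: " + names(set));
    }
}
```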

XML text extraction

Scenario:
Given the following XML file:
<a:root
xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
aaaaaaaaaaaaaa
</a:root>
How do I extract the text inside the main element <a:root>:
"\naaaaaaaaaaaaaa\n"
The code I have right now is:
import java.io.File;
import java.util.Stack;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
public class Proof {
public static void main(String[] args) {
Document doc = null;
DocumentBuilderFactory dbf = null;
DocumentBuilder docBuild = null;
try {
dbf = DocumentBuilderFactory.newInstance();
docBuild = dbf.newDocumentBuilder();
doc = docBuild.parse(new File("test2.xml"));
System.out.println(doc.getFirstChild().getTextContent());
} catch(Exception e) {
e.printStackTrace();
}
}
}
But it returns the text I want ("aaaaaaaaaaaaaa") plus the inner text of all the other elements. Output:
Apples
Bananas
African Coffee Table
80
120
aaaaaaaaaaaaaa
The requirement is not to use any additional Java XML library.

The answer by @Kirill Polishchuk is not correct.
The proposed expression:
a:root/text()
is a relative expression; unless it is evaluated with the root (/) node as the context node, it selects nothing in the provided XML document.
Even the XPath expression /a:root/text() is incorrect, because it selects three text nodes -- all the text node children of the top element -- including two whitespace-only text nodes.
Here is a correct XPath solution:
/a:root/text()[string-length(normalize-space()) > 0]
When this XPath expression is applied to the provided XML document (corrected to be well-formed):
<a:root
xmlns:a="UNDEFINED !!!!"
xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
aaaaaaaaaaaaaa
</a:root>
It selects the last (and only non-whitespace-only) text node child of the top element, as required:
aaaaaaaaaaaaaa
XSLT-based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:a="UNDEFINED !!!!"
>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:text>"</xsl:text>
<xsl:copy-of select=
"/a:root/text()
[string-length(normalize-space()) > 0]"/>"
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to the provided XML document (above), the wanted, correctly selected text node is output:
"
aaaaaaaaaaaaaa
"
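Since the question asks for a Java solution without extra libraries, the same XPath expression can also be evaluated with the JDK's javax.xml.xpath API. The sketch below makes a few assumptions that are not in the original: the class name RootTextDemo, a shortened inline document, and the placeholder namespace URI urn:example:a bound to the a prefix (the question's document does not declare that prefix at all).

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class RootTextDemo {
    // Placeholder URI for the "a" prefix (assumption, see lead-in).
    static final String A_NS = "urn:example:a";

    // Returns the concatenated non-whitespace-only text children of a:root.
    static String rootText(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // needed for prefixed XPath steps
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "a".equals(prefix) ? A_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) { return null; }
                public Iterator<String> getPrefixes(String uri) { return null; }
            });
            NodeList texts = (NodeList) xpath.evaluate(
                    "/a:root/text()[string-length(normalize-space()) > 0]",
                    doc, XPathConstants.NODESET);
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < texts.getLength(); i++) {
                sb.append(texts.item(i).getNodeValue());
            }
            return sb.toString();
        } catch (Exception e) {
            // Checked exceptions are wrapped to keep the helper easy to call.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a:root xmlns:a=\"" + A_NS + "\""
                + " xmlns:h=\"http://www.w3.org/TR/html4/\">\n"
                + "<h:table><h:td>Apples</h:td></h:table>\n"
                + "aaaaaaaaaaaaaa\n</a:root>";
        System.out.println(rootText(xml).trim());
    }
}
```

The prefix used in the XPath only has to match the NamespaceContext, not the prefix used in the document.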

You can use XPath: a:root/text()

Use this:
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class Proof {
public static void main(String[] args) {
Document doc = null;
DocumentBuilderFactory dbf = null;
DocumentBuilder docBuild = null;
try {
dbf = DocumentBuilderFactory.newInstance();
docBuild = dbf.newDocumentBuilder();
doc = docBuild.parse(new File("test2.xml"));
Element x= doc.getDocumentElement();
NodeList m=x.getChildNodes();
for(int i=0;i<m.getLength();i++){
Node it=m.item(i);
if(it.getNodeType()==Node.TEXT_NODE){ // TEXT_NODE == 3: direct text children only
System.out.println(it.getNodeValue());
}
}
} catch(Exception e) {
e.printStackTrace();
}
}
}
