Getting attributes values from a node in XML file - java

I have an XML file and I want to get the values of node attributes in it. It works fine when the node has a plain name, but for nodes named like something:something it does not give me back any result, just null.
The XML file :
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
<channel>
<title>Yahoo! Weather - Sunnyvale, CA</title>
<link>http://us.rd.yahoo.com/dailynews/rss/weather/Sunnyvale__CA/*http://weather.yahoo.com/forecast/USCA1116_f.html</link>
<description>Yahoo! Weather for Sunnyvale, CA</description>
<language>en-us</language>
<lastBuildDate>Fri, 18 Dec 2009 9:38 am PST</lastBuildDate>
<ttl>60</ttl>
<yweather:location city="Sunnyvale" region="CA" country="United States"/>
<yweather:units temperature="F" distance="mi" pressure="in" speed="mph"/>
</channel>
</rss>
The Java Code :
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("//rss/@version");
Object result = expr.evaluate(doc, XPathConstants.STRING);
System.out.println(result);
The previous Java code works fine, but when I replace //rss/@version with
//rss/channel/yweather:location/@city it returns null.

First of all, the part before the : is a namespace prefix, which is bound to a namespace URI by an xmlns:prefix declaration. Namespaces are quite an important concept in XML.
To retrieve a value from a namespaced node, you have to make the XPath context aware of the namespace. You can do this using
xpath.setNamespaceContext(context);
context must be an implementation of NamespaceContext. In this case, the namespaces are declared within the XML, so it is convenient to have a namespace resolver that reads them from the document directly (the document should be parsed with a namespace-aware DocumentBuilderFactory for the lookups to work). This class does exactly that:
public class UniversalNamespaceResolver implements NamespaceContext {
    // The document whose namespace declarations are used for lookups
    private Document sourceDocument;

    public UniversalNamespaceResolver(Document document) {
        sourceDocument = document;
    }

    // Resolve a prefix to its namespace URI; the empty prefix means the default namespace
    public String getNamespaceURI(String prefix) {
        if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return sourceDocument.lookupNamespaceURI(null);
        } else {
            return sourceDocument.lookupNamespaceURI(prefix);
        }
    }

    public String getPrefix(String namespaceURI) {
        return sourceDocument.lookupPrefix(namespaceURI);
    }

    // Not needed for XPath evaluation
    public Iterator getPrefixes(String namespaceURI) {
        return null;
    }
}
Read more about it at http://www.ibm.com/developerworks/library/x-nmspccontext/
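Putting the pieces together for the feed in the question, a minimal sketch could look like the following. It assumes the document is parsed with a namespace-aware factory and that prefixes are resolved from the document element; the class name WeatherCityDemo and the trimmed XML string are made up for illustration. Note that an attribute step in XPath is written with @.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class WeatherCityDemo {

    // Trimmed copy of the feed from the question (illustrative only).
    static final String XML =
        "<rss version=\"2.0\" xmlns:yweather=\"http://xml.weather.yahoo.com/ns/rss/1.0\">"
        + "<channel><yweather:location city=\"Sunnyvale\" region=\"CA\"/></channel></rss>";

    static String evaluate(String xml, String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // without this, prefix lookups on the DOM are not expected to work
        final Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                // Resolve the prefix from the declarations on the root element.
                String uri = doc.getDocumentElement().lookupNamespaceURI(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String namespaceURI) {
                return doc.getDocumentElement().lookupPrefix(namespaceURI);
            }
            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.<String>emptyIterator();
            }
        });
        // With no return type given, evaluate() returns the string value of the result.
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(evaluate(XML, "//rss/@version"));
        System.out.println(evaluate(XML, "//rss/channel/yweather:location/@city"));
    }
}
```

Resolving prefixes from the root element is enough here because both namespaces are declared on rss; documents that declare namespaces deeper in the tree would need a different lookup strategy.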

Related

How to read < as < from an XML? [duplicate]

I am new to XML. I want to read the following XML on the basis of request name. Please help me on how to read the below XML in Java -
<?xml version="1.0"?>
<config>
<Request name="ValidateEmailRequest">
<requestqueue>emailrequest</requestqueue>
<responsequeue>emailresponse</responsequeue>
</Request>
<Request name="CleanEmail">
<requestqueue>Cleanrequest</requestqueue>
<responsequeue>Cleanresponse</responsequeue>
</Request>
</config>
If your XML is a String, Then you can do the following:
String xml = ""; //Populated XML String....
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));
Element rootElement = document.getDocumentElement();
If your XML is in a file, then Document document will be instantiated like this:
Document document = builder.parse(new File("file.xml"));
document.getDocumentElement() returns the node that is the document element of the document (in your case <config>).
Once you have rootElement, you can read an element's attribute by calling rootElement.getAttribute(name), and so on. See the Javadoc for org.w3c.dom.Element for more methods, and for DocumentBuilder and DocumentBuilderFactory for more on parsing. Bear in mind that the example builds an XML DOM tree in memory, so if you have a huge XML document, the tree can be huge.
Update: here is an example that gets the text value of the <requestqueue> element:
protected String getString(String tagName, Element element) {
NodeList list = element.getElementsByTagName(tagName);
if (list != null && list.getLength() > 0) {
NodeList subList = list.item(0).getChildNodes();
if (subList != null && subList.getLength() > 0) {
return subList.item(0).getNodeValue();
}
}
return null;
}
You can then call it like this:
String requestQueueName = getString("requestqueue", element);
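To pick the values by request name, as the question asks, one option is to loop over the <Request> elements and compare their name attribute. The sketch below is only an illustration: the class RequestConfigReader and its method names are invented, and it assumes the configuration has the shape shown in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RequestConfigReader {

    // Same structure as the configuration file in the question.
    static final String XML =
        "<config>"
        + "<Request name=\"ValidateEmailRequest\">"
        + "<requestqueue>emailrequest</requestqueue><responsequeue>emailresponse</responsequeue>"
        + "</Request>"
        + "<Request name=\"CleanEmail\">"
        + "<requestqueue>Cleanrequest</requestqueue><responsequeue>Cleanresponse</responsequeue>"
        + "</Request>"
        + "</config>";

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text of <tagName> inside the <Request> whose name attribute matches, or null.
    static String valueFor(Document doc, String requestName, String tagName) {
        NodeList requests = doc.getElementsByTagName("Request");
        for (int i = 0; i < requests.getLength(); i++) {
            Element request = (Element) requests.item(i);
            if (requestName.equals(request.getAttribute("name"))) {
                NodeList children = request.getElementsByTagName(tagName);
                return children.getLength() > 0 ? children.item(0).getTextContent() : null;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println(valueFor(doc, "CleanEmail", "requestqueue"));
        System.out.println(valueFor(doc, "ValidateEmailRequest", "responsequeue"));
    }
}
```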
If you only need to retrieve one (the first) value from the XML:
public static String getTagValue(String xml, String tagName){
return xml.split("<"+tagName+">")[1].split("</"+tagName+">")[0];
}
If you want to parse the whole XML document, use jsoup:
Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
for (Element e : doc.select("Request")) {
System.out.println(e);
}
If you are just looking to get a single value from the XML you may want to use Java's XPath library. For an example see my answer to a previous question:
How to use XPath on xml docs having default namespace
It would look something like:
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
public class Demo {
    public static void main(String[] args) {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        try {
            DocumentBuilder builder = domFactory.newDocumentBuilder();
            Document dDoc = builder.parse("E:/test.xml");
            XPath xPath = XPathFactory.newInstance().newXPath();
            // Select the name attribute of the first <Request> under the <config> root
            Node node = (Node) xPath.evaluate("/config/Request/@name", dDoc, XPathConstants.NODE);
            System.out.println(node.getNodeValue());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
There are a number of different ways to do this. You might want to check out XStream or JAXB. There are tutorials and the examples.
If the XML is well formed then you can convert it to Document. By using the XPath you can get the XML Elements.
String xml = "<stackusers><name>Yash</name><age>30</age></stackusers>";
From the XML string, create a Document and find the elements using XPath.
Document doc = getDocument(xml, true);
public static Document getDocument(String xmlData, boolean isXMLData) throws Exception {
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
dbFactory.setNamespaceAware(true);
dbFactory.setIgnoringComments(true);
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc;
if (isXMLData) {
InputSource ips = new org.xml.sax.InputSource(new StringReader(xmlData));
doc = dBuilder.parse(ips);
} else {
doc = dBuilder.parse( new File(xmlData) );
}
return doc;
}
Use org.apache.xpath.XPathAPI to get Node or NodeList.
System.out.println("XPathAPI:"+getNodeValue(doc, "/stackusers/age/text()"));
NodeList nodeList = getNodeList(doc, "/stackusers");
System.out.println("XPathAPI NodeList:"+ getXmlContentAsString(nodeList));
System.out.println("XPathAPI NodeList:"+ getXmlContentAsString(nodeList.item(0)));
public static String getNodeValue(Document doc, String xpathExpression) throws Exception {
Node node = org.apache.xpath.XPathAPI.selectSingleNode(doc, xpathExpression);
String nodeValue = node.getNodeValue();
return nodeValue;
}
public static NodeList getNodeList(Document doc, String xpathExpression) throws Exception {
NodeList result = org.apache.xpath.XPathAPI.selectNodeList(doc, xpathExpression);
return result;
}
Using javax.xml.xpath.XPathFactory
System.out.println("javax.xml.xpath.XPathFactory:"+getXPathFactoryValue(doc, "/stackusers/age"));
static XPath xpath = javax.xml.xpath.XPathFactory.newInstance().newXPath();
public static String getXPathFactoryValue(Document doc, String xpathExpression) throws XPathExpressionException, TransformerException, IOException {
Node node = (Node) xpath.evaluate(xpathExpression, doc, XPathConstants.NODE);
String nodeStr = getXmlContentAsString(node);
return nodeStr;
}
Using Document Element.
System.out.println("DocumentElementText:"+getDocumentElementText(doc, "age"));
public static String getDocumentElementText(Document doc, String elementName) {
return doc.getElementsByTagName(elementName).item(0).getTextContent();
}
Get value in between two strings.
String nodeValue = org.apache.commons.lang.StringUtils.substringBetween(xml, "<age>", "</age>");
System.out.println("StringUtils.substringBetween():"+nodeValue);
Full Example:
public static void main(String[] args) throws Exception {
String xml = "<stackusers><name>Yash</name><age>30</age></stackusers>";
Document doc = getDocument(xml, true);
String nodeValue = org.apache.commons.lang.StringUtils.substringBetween(xml, "<age>", "</age>");
System.out.println("StringUtils.substringBetween():"+nodeValue);
System.out.println("DocumentElementText:"+getDocumentElementText(doc, "age"));
System.out.println("javax.xml.xpath.XPathFactory:"+getXPathFactoryValue(doc, "/stackusers/age"));
System.out.println("XPathAPI:"+getNodeValue(doc, "/stackusers/age/text()"));
NodeList nodeList = getNodeList(doc, "/stackusers");
System.out.println("XPathAPI NodeList:"+ getXmlContentAsString(nodeList));
System.out.println("XPathAPI NodeList:"+ getXmlContentAsString(nodeList.item(0)));
}
public static String getXmlContentAsString(Node node) throws TransformerException, IOException {
StringBuilder stringBuilder = new StringBuilder();
NodeList childNodes = node.getChildNodes();
int length = childNodes.getLength();
for (int i = 0; i < length; i++) {
stringBuilder.append( toString(childNodes.item(i), true) );
}
return stringBuilder.toString();
}
Output:
StringUtils.substringBetween():30
DocumentElementText:30
javax.xml.xpath.XPathFactory:30
XPathAPI:30
XPathAPI NodeList:<stackusers>
<name>Yash</name>
<age>30</age>
</stackusers>
XPathAPI NodeList:<name>Yash</name><age>30</age>
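The full example above calls a toString(Node, boolean) helper that is not shown in the answer. One possible way to write such a helper, purely as an assumption about what was intended, is to serialize the node with javax.xml.transform. Treating the boolean as "omit the XML declaration" is a guess, and the class name NodeToString is invented for this sketch.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeToString {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Serializes a DOM node back to markup using an identity transform.
    // The meaning of the flag is an assumption, not taken from the original answer.
    static String toString(Node node, boolean omitXmlDeclaration) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION,
                omitXmlDeclaration ? "yes" : "no");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(node), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<stackusers><name>Yash</name><age>30</age></stackusers>");
        Node age = doc.getElementsByTagName("age").item(0);
        System.out.println(toString(age, true));
    }
}
```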
The following links might help:
http://labe.felk.cvut.cz/~xfaigl/mep/xml/java-xml.htm
http://developerlife.com/tutorials/?p=25
http://www.java-samples.com/showtutorial.php?tutorialid=152
There are two general ways of doing that. You can either build a Document Object Model (DOM) of the XML file,
or use event-driven parsing (SAX), which is an alternative to the DOM representation of the XML. Of course, there is much more to know about processing XML; for instance, if you are given an XML Schema definition (XSD), you could use JAXB.
There are various APIs available to read/write XML files through Java.
I would prefer using StAX.
Also, this can be useful: Java XML APIs
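As a rough illustration of the StAX suggestion, the sketch below pulls events from the parser and stops as soon as it has found the queue for one request. The class and method names (StaxRequestReader, requestQueueFor) are hypothetical, and the XML is the configuration from the question.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxRequestReader {

    static final String XML =
        "<config>"
        + "<Request name=\"ValidateEmailRequest\">"
        + "<requestqueue>emailrequest</requestqueue><responsequeue>emailresponse</responsequeue>"
        + "</Request>"
        + "<Request name=\"CleanEmail\">"
        + "<requestqueue>Cleanrequest</requestqueue><responsequeue>Cleanresponse</responsequeue>"
        + "</Request>"
        + "</config>";

    // Returns the <requestqueue> text of the <Request> with the given name, or null if absent.
    static String requestQueueFor(String xml, String requestName) throws Exception {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        boolean insideWantedRequest = false;
        try {
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;
                }
                String localName = reader.getLocalName();
                if ("Request".equals(localName)) {
                    // A null namespace URI means the attribute is matched by local name only.
                    insideWantedRequest = requestName.equals(reader.getAttributeValue(null, "name"));
                } else if (insideWantedRequest && "requestqueue".equals(localName)) {
                    return reader.getElementText();
                }
            }
        } finally {
            reader.close();
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(requestQueueFor(XML, "CleanEmail"));
    }
}
```

Unlike the DOM approach, nothing is kept in memory beyond the current event, which is why streaming tends to suit large files.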
You can make a class which extends org.xml.sax.helpers.DefaultHandler and dispatches each element to methods named
start_<tag_name>(Attributes attrs);
and
end_<tag_name>();
For example:
start_request_queue(attrs);
etc.
Then extend that class to implement the XML configuration file parsers you want. Example:
...
public void startElement(String uri, String name, String qname,
        org.xml.sax.Attributes attrs)
        throws org.xml.sax.SAXException {
    Class[] args = new Class[2];
    args[0] = uri.getClass();
    args[1] = org.xml.sax.Attributes.class;
    try {
        // Dispatch to a handler method named start_<tag_name>, e.g. start_Request
        String mname = name.replace("-", "");
        java.lang.reflect.Method m =
            getClass().getDeclaredMethod("start_" + mname, args);
        m.invoke(this, new Object[] { uri, (org.xml.sax.Attributes)attrs });
    }
    catch (IllegalAccessException e) {
        throw new RuntimeException(e);
    }
    catch (NoSuchMethodException e) {
        throw new RuntimeException(e);
    }
    catch (java.lang.reflect.InvocationTargetException e) {
        // Rethrow the handler's own failure as a SAXException, keeping its stack trace
        org.xml.sax.SAXException se =
            new org.xml.sax.SAXException(String.valueOf(e.getTargetException()));
        se.setStackTrace(e.getTargetException().getStackTrace());
        throw se;
    }
}
and in a particular configuration parser:
public void start_Request(String uri, org.xml.sax.Attributes attrs) {
// make sure to read attributes correctly
System.err.println("Request, name=" + attrs.getValue(0));
}
Since you are using this for configuration, your best bet is apache commons-configuration. For simple files it's way easier to use than "raw" XML parsers.
See the XML how-to

Why won't my xpath work?

I have the following xml:
<?xml version="1.0" encoding="UTF-8"?>
<prefix:someName xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:prefix="someUri" xsi:schemaLocation="someLocation.xsd">
<prefix:someName2>
....
</prefix:someName2>
</prefix:someName>
And my code looks like this:
private Node doXpathThingy(Document doc) {
final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
XPath xPath = XPathFactory.newInstance().newXPath();
xPath.setNamespaceContext(new NamespaceContext(){
@Override
public String getNamespaceURI(String prefix) {
if (prefix == null) {
throw new NullPointerException("Null prefix");
}
return doc.lookupNamespaceURI(prefix);
}
@Override
public String getPrefix(String namespaceURI) {
return null;
}
@Override
public Iterator getPrefixes(String namespaceURI) {
return null;
}
});
try {
XPathExpression expr = xPath.compile(xpathString);
return (Node)expr.evaluate(doc, XPathConstants.NODESET);
} catch (Exception e) {
.... }
}
I'm trying to get this to work with any valid xpath. It works with these xpaths:
"prefix:someName"
"."
But NOT with: "prefix:someName2". It returns null.
I guess I'm still not getting something about namespaces, but I don't understand what? I've tried leaving out the prefixes from my xpath but then nothing works at all.
I've also checked if the correct uri is returned for the prefix at doc.lookupNamespaceURI(prefix), and it is.
Any help would be greatly appreciated.
The query prefix:XXX means child::prefix:XXX, that is, find an element child of the context node whose name is prefix:XXX. Your context node is the document node at the root of the tree. The document node has a child named prefix:someName, but it doesn't have a child named prefix:someName2. If you want to find a grandchild of the document node, try the query */prefix:someName2.
Can't say I'm familiar with the Java way of doing XPath, but it looks like you are making an XPath query from the root of the document, so what you are seeing is the expected behavior.
Try this to find someName2 anywhere in the doc
//prefix:someName2
or this to find it as the child of someName
/prefix:someName/prefix:someName2
or this to find it as the direct child of any root element
/*/prefix:someName2
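The difference between these queries can be checked with a small program. The sketch below is an illustration only: it hard-codes the prefix binding instead of reading it from the document, uses a cut-down version of the XML from the question, and the class name ContextNodeDemo is invented.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ContextNodeDemo {

    static final String XML =
        "<prefix:someName xmlns:prefix=\"someUri\">"
        + "<prefix:someName2>text</prefix:someName2>"
        + "</prefix:someName>";

    // Counts the nodes an expression selects when the context node is the document node.
    static int count(String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "prefix".equals(prefix) ? "someUri" : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String namespaceURI) {
                return "someUri".equals(namespaceURI) ? "prefix" : null;
            }
            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.<String>emptyIterator();
            }
        });
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        String[] expressions = {
            "prefix:someName",                    // child of the document node
            "prefix:someName2",                   // not a child of the document node
            "*/prefix:someName2",
            "//prefix:someName2",
            "/prefix:someName/prefix:someName2"
        };
        for (String expression : expressions) {
            System.out.println(expression + " -> " + count(expression));
        }
    }
}
```

Note that the result of a NODESET evaluation is a NodeList, so it should be cast to NodeList rather than Node.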

Get value from xml tags which are of same name using xpaths

I publish some csv input file on a server and it gives me a xml file that looks like this:
<ns0:TransportationEvent xmlns:ns0="http://www.server.com/schemas/TransportationEvent.xsd">
<ns0:creationDateTime>2017-04-06</ns0:creationDateTime>
.....
.....
</ns0:TransportationEvent>
<ns0:TransportationEvent xmlns:ns0="http://www.fedex.com/schemas/TransportationEvent.xsd">
<ns0:creationDateTime>2017-04-25</ns0:creationDateTime>
.....
.....
</ns0:TransportationEvent>
The TransportationEvent tag would be added again and again with the updated date in it.
I am retrieving data from this xml using XpathFactory class and NamespaceContext class which is shown as below:
NamespaceContext ctx = new NamespaceContext() {
public String getNamespaceURI(String prefix) {
String uri;
if (prefix.equals("ns0"))
uri = "http://www.server.com/schemas/TransportationEvent.xsd";
else
uri = null;
return uri;
}
public Iterator getPrefixes(String val) {
return null;
}
// Dummy implementation - not used!
public String getPrefix(String uri) {
return null;
}
};
XPathFactory xpathFact = XPathFactory.newInstance();
XPath xpath = xpathFact.newXPath();
xpath.setNamespaceContext(ctx);
String strXpath = "//ns0:TransportationEvent/ns0:creationDateTime/text()";
String creationDateTime = xpath.evaluate(strXpath, doc);
The above code gives the value of creationDateTime as 2017-04-06. Basically, it always takes values from the first TransportationEvent tag.
I need to pick data from that "TransportationEvent" tag where the "creationDateTime" is equal to today's date.
I can perform this by using NodeList class and can iterate through all the "TransportationEvent" tags but then I would not be able to use the Xpath or NamespaceContext implementation. I am finding no connection between the NodeList class and the NamespaceContext class or the Xpath class.
I want to get the value of ctx which has the context of the latest TransportationEvent tag.
I know I am missing something. Could somebody help please?
Use the last() function in a predicate to select only the last TransportationEvent:
String strXpath = "//ns0:TransportationEvent[last()]/ns0:creationDateTime/text()";
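If the goal is the event for a particular date rather than simply the last one, a predicate on creationDateTime is another option, and the matched node can then serve as the context for further relative XPath queries with the same NamespaceContext. The sketch below rests on several assumptions: the events are wrapped in a single root element (here called events), both events use the same namespace URI, and ns0:status is a made-up child element standing in for the elided content. The class name EventByDate is also invented.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class EventByDate {

    static final String NS = "http://www.server.com/schemas/TransportationEvent.xsd";

    // Hypothetical wrapper root and status element, added so the example is well-formed.
    static final String XML =
        "<events>"
        + "<ns0:TransportationEvent xmlns:ns0=\"" + NS + "\">"
        + "<ns0:creationDateTime>2017-04-06</ns0:creationDateTime>"
        + "<ns0:status>PICKED_UP</ns0:status>"
        + "</ns0:TransportationEvent>"
        + "<ns0:TransportationEvent xmlns:ns0=\"" + NS + "\">"
        + "<ns0:creationDateTime>2017-04-25</ns0:creationDateTime>"
        + "<ns0:status>DELIVERED</ns0:status>"
        + "</ns0:TransportationEvent>"
        + "</events>";

    // Returns the status of the event created on the given date (yyyy-MM-dd), or null.
    static String statusOn(String xml, String date) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "ns0".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String namespaceURI) {
                return NS.equals(namespaceURI) ? "ns0" : null;
            }
            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.<String>emptyIterator();
            }
        });

        // Step 1: select the whole event whose date matches.
        String eventPath = "//ns0:TransportationEvent[ns0:creationDateTime='" + date + "']";
        Node event = (Node) xpath.evaluate(eventPath, doc, XPathConstants.NODE);
        if (event == null) {
            return null;
        }
        // Step 2: evaluate a relative path with the matched event as the context node.
        return xpath.evaluate("ns0:status", event);
    }

    public static void main(String[] args) throws Exception {
        // java.time.LocalDate.now().toString() produces the same yyyy-MM-dd format for "today".
        System.out.println(statusOn(XML, "2017-04-25"));
    }
}
```

The same two-step pattern works when iterating a NodeList: each item can be passed to xpath.evaluate as the context node.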

javax.xml, XPath is not extracted from XML with namespaces

There is "original" XML
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
<soap:Header>
<context xmlns="urn:zimbra">
<session id="555">555</session>
<change token="333"/>
</context>
</soap:Header>
<soap:Body>
<AuthResponse xmlns="urn:zimbraAccount">
<lifetime>172799999</lifetime>
<session id="555">555</session>
<skin>carbon</skin>
</AuthResponse>
</soap:Body>
</soap:Envelope>
The XML is parsed in this way
// javax.xml.parsers.*
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(pathToXml);
Then I'm trying to extract session id by XPath
// javax.xml.xpath.*;
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
// the next XPath works neither in Java nor in the online XPath tester
//XPathExpression expr = xpath.compile("/soap:Envelope/soap:Header/context/session/text()");
// this XPath works in the online XPath tester but not in Java
XPathExpression expr = xpath.compile("/soap:Envelope/soap:Header/*[name()='context']/*[name()='session']/text()");
String sessionId = (String)expr.evaluate(doc, XPathConstants.STRING);
Tested here
http://www.xpathtester.com/xpath/678ae9388e3ae2fc8406eb8cf14f3119
When the XML is simplified to this
<Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
<Header>
<context>
<session id="555">555</session>
<change token="333"/>
</context>
</Header>
<Body>
<AuthResponse xmlns="urn:zimbraAccount">
<lifetime>172799999</lifetime>
<session id="555">555</session>
<skin>carbon</skin>
</AuthResponse>
</Body>
</Envelope>
This XPath does its job
XPathExpression expr = xpath.compile("/Envelope/Header/context/session/text()");
How to extract session id from "original" XML with Java?
UPDATE: JDK 1.6
The answer is that you need to correctly use namespaces and namespace prefixes:
First, make your DocumentBuilderFactory namespace aware by calling this before you use it:
factory.setNamespaceAware(true);
Then do this to retrieve the value you want:
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
xpath.setNamespaceContext(new NamespaceContext() {
@Override
public String getNamespaceURI(String prefix) {
if (prefix.equals("soap")) {
return "http://www.w3.org/2003/05/soap-envelope";
}
if (prefix.equals("zmb")) {
return "urn:zimbra";
}
return XMLConstants.NULL_NS_URI;
}
@Override
public String getPrefix(String namespaceURI) {
throw new UnsupportedOperationException("Not supported yet.");
}
@Override
public Iterator getPrefixes(String namespaceURI) {
throw new UnsupportedOperationException("Not supported yet.");
}
});
XPathExpression expr =
xpath.compile("/soap:Envelope/soap:Header/zmb:context/zmb:session");
String sessionId = (String)expr.evaluate(doc, XPathConstants.STRING);
You may need to add a line to the beginning of your file to import the NamespaceContext class:
import javax.xml.namespace.NamespaceContext;
http://ideone.com/X3iX5N
You can always do it by ignoring the namespace; it is not the ideal method, but it works.
"/*[local-name()='Envelope']/*[local-name()='Header']/*[local-name()='context']/*[local-name()='session']/text()"
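Both approaches can be compared side by side. The sketch below is a self-contained illustration that assumes namespace-aware parsing and uses a shortened copy of the original XML; the zmb prefix is just a local name chosen for the urn:zimbra default namespace, and the class name ZimbraSessionDemo is invented.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ZimbraSessionDemo {

    static final String XML =
        "<soap:Envelope xmlns:soap=\"http://www.w3.org/2003/05/soap-envelope\">"
        + "<soap:Header><context xmlns=\"urn:zimbra\">"
        + "<session id=\"555\">555</session><change token=\"333\"/>"
        + "</context></soap:Header>"
        + "<soap:Body/>"
        + "</soap:Envelope>";

    static String evaluate(String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(XML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                if ("soap".equals(prefix)) {
                    return "http://www.w3.org/2003/05/soap-envelope";
                }
                if ("zmb".equals(prefix)) {
                    return "urn:zimbra"; // default namespace in the XML, needs a prefix in XPath 1.0
                }
                return XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String namespaceURI) {
                return null;
            }
            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.<String>emptyIterator();
            }
        });
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        // Namespace-qualified path.
        System.out.println(evaluate("/soap:Envelope/soap:Header/zmb:context/zmb:session"));
        // Namespace-agnostic path based on local-name().
        System.out.println(evaluate(
            "/*[local-name()='Envelope']/*[local-name()='Header']"
            + "/*[local-name()='context']/*[local-name()='session']/text()"));
    }
}
```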

using xpath with namespace from a java class

I am trying to parse an xml document with namespace using XPATH. I have read how it is supposed to be done. I have implemented NamespaceContext as well. But, I still am not getting the values. I think I am missing something simple.
My xml input is
<?xml version="1.0" encoding="UTF-8"?>
<ns1:customer xmlns:ns1="http://test/ns1">
<ns1:name>john</ns1:name>
</ns1:customer>
My Main file is TestXMLPath
public static void main(String[] args) throws Exception {
String myInputXML = "src/testxmlpath/input-with-namespace.xml";
DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
String expression ="/ns1:customer/ns1:name";
Document document = db.parse(new File(myInputXML)) ;
XPath xpath = XPathFactory.newInstance().newXPath();
xpath.setNamespaceContext(new SimpleNamespaceContextImpl());
String value = xpath.evaluate(expression,document);
System.out.println("value" + value);
}
my NamespaceContext implementation is
public class SimpleNamespaceContextImpl implements NamespaceContext {
@Override
public String getNamespaceURI(String prefix) {
System.out.println("getNameSpace for prefix "+prefix);
if (prefix == null) {
throw new NullPointerException("Null prefix");
} else if ("ns1".equals(prefix)) {
return "http://test/ns1";
} else if ("xml".equals(prefix)) {
return XMLConstants.XML_NS_URI;
} else {
return XMLConstants.XML_NS_URI;
}
}
@Override
public String getPrefix(String namespaceURI) {
return "ns1";
}
@Override
public Iterator getPrefixes(String namespaceURI) {
return null;
}
}
I print out when a method gets called. Here is the output.
getNameSpace for prefix ns1
getNameSpace for prefix ns1
value
BUILD SUCCESSFUL
I can't understand, why won't it work ??
Any help will be greatly appreciated.
Thanks
Works fine for me. Output:
getNameSpace for prefix ns1
getNameSpace for prefix ns1
valuejohn
Are you sure you're loading the right document? I'm using Xerces to build the document and Saxon to evaluate the XPath. A dump of the relevant classes:
class com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl
class com.sun.org.apache.xerces.internal.dom.DeferredDocumentImpl
class net.sf.saxon.xpath.XPathFactoryImpl
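One difference worth checking, offered here as a guess rather than a confirmed diagnosis, is that the question's code never calls setNamespaceAware(true) on the DocumentBuilderFactory. With the JDK's built-in XPath implementation, a DOM built without namespace awareness generally does not match prefixed steps, whereas other XPath engines may behave differently. The sketch below lets you compare both settings; the class name NamespaceAwareCheck is invented.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceAwareCheck {

    static final String XML =
        "<ns1:customer xmlns:ns1=\"http://test/ns1\"><ns1:name>john</ns1:name></ns1:customer>";

    static String evaluate(String xml, boolean namespaceAware) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(namespaceAware); // the factory default is false
        Document document = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                if (prefix == null) {
                    throw new IllegalArgumentException("Null prefix");
                }
                // Unknown prefixes map to "no namespace" rather than to the xml namespace.
                return "ns1".equals(prefix) ? "http://test/ns1" : XMLConstants.NULL_NS_URI;
            }
            @Override
            public String getPrefix(String namespaceURI) {
                return "http://test/ns1".equals(namespaceURI) ? "ns1" : null;
            }
            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return Collections.<String>emptyIterator();
            }
        });
        return xpath.evaluate("/ns1:customer/ns1:name", document);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("namespace aware:     '" + evaluate(XML, true) + "'");
        System.out.println("not namespace aware: '" + evaluate(XML, false) + "'");
    }
}
```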
