How to get a node value in XPath - Java

I've got a section of XML that looks like this:
<entry>
<id>tag:example.com,2005:Release/343597</id>
<published>2012-04-10T11:29:19Z</published>
<updated>2012-04-10T12:04:41Z</updated>
<link type="text/html" href="http://example.com/projects/example1" rel="alternate"/>
<title>example1</title>
</entry>
I need to grab the link http://example.com/projects/example1 from this block. I'm not sure how to do this. To get the title of the project I use this code:
String title1 = children.item(9).getFirstChild().getNodeValue();
where children is the getChildNodes() result for the <entry> </entry> block. But I keep getting NullPointerExceptions when I try to get the node value for the <link> node in a similar way. I see that the XML is different for the <link> node, and I'm not sure what its value is. Please advise!

The XPath expression to get that attribute is
//entry/link/@href
In Java you can write
Document doc = ... // your XML document
XPathExpression xp = XPathFactory.newInstance().newXPath().compile("//entry/link/@href");
String href = xp.evaluate(doc);
Then, if you need the link value of the entry with a specific id, you can change the XPath expression to
//entry[id='tag:example.com,2005:Release/343597']/link/@href
Finally, if the document has many entry elements and you want to get all the links, you can write
Document doc = ... // your XML document
XPathExpression xp = XPathFactory.newInstance().newXPath().compile("//entry/link/@href");
NodeList links = (NodeList) xp.evaluate(doc, XPathConstants.NODESET);
// and iterate on links

Here is the complete code:
DocumentBuilderFactory domFactory = DocumentBuilderFactory
.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("test.xml");
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xpath.compile("//entry/link/@href");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
// Print the attribute value rather than the node object
System.out.println(nodes.item(i).getNodeValue());
}
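The NullPointerException in the question is consistent with <link> being an empty element: it has no child nodes, so getFirstChild() returns null, and the URL lives in the href attribute instead. The following is a minimal, self-contained sketch of the attribute lookup. The class name EntryLinkExample and the wrapping <feed> root element are assumptions made for illustration; they do not appear in the original post.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class EntryLinkExample {
    // Sample data from the question, wrapped in a hypothetical <feed> root.
    static final String XML =
        "<feed><entry>"
        + "<id>tag:example.com,2005:Release/343597</id>"
        + "<link type=\"text/html\" href=\"http://example.com/projects/example1\" rel=\"alternate\"/>"
        + "<title>example1</title>"
        + "</entry></feed>";

    // Parses XML held in a string (not a file name or URL).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the href of the first matching link as a string.
    static String firstHref(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("//entry/link/@href", doc);
    }

    // Returns how many href attribute nodes the expression selects.
    static int hrefCount(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                "//entry/link/@href", doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println(firstHref(doc));
        System.out.println(hrefCount(doc));
    }
}
```

If you already hold the <link> node from the DOM, casting it to Element and calling getAttribute("href") gives the same value without XPath.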

Related

How to write XPath to get node attribute value from a "Name Space XML" in Java

INPUT_XML:
<?xml version="1.0" encoding="UTF-8"?>
<root xmlns:ns1="http://path1/schema1" xmlns:ns2="http://path2/schema2">
<ns1:abc>1234</ns1:abc>
<ns2:def>5678</ns2:def>
</root>
In Java, I am trying to write an XPath expression that gets the value of the "xmlns:ns1" attribute from the INPUT_XML string content above.
I've tried the following:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(INPUT_XML);
String xpathExpression = "/root/xmlns:ns1";
// Create XPathFactory object
XPathFactory xpathFactory = XPathFactory.newInstance();
// Create XPath object
XPath xpath = xpathFactory.newXPath();
// Create XPathExpression object
XPathExpression expr = xpath.compile(xpathExpression);
// Evaluate expression result on XML document
NodeList nodes = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
But the above code does not give the expected value of the specified attribute, i.e. xmlns:ns1. I strongly suspect the XPath expression is wrong. Please suggest the right XPath expression or the right approach to tackle this issue.
If you're using an XPath 1.0 processor, or an XPath 2.0 processor with XPath 1.0 compatibility mode turned on, you can use the namespace axis to select the namespace value.
You will need to make the following change in your code:
String xpathExpression = "/root/namespace::ns1";
The xmlns:ns1="http://path1/schema1" and xmlns:ns2="http://path2/schema2" are not attributes, but namespace declarations. You cannot retrieve them with an XPath expression so easily (there is the XPath function namespace-uri() for this purpose, but the root element is not in any namespace; it only declares them for use by its descendants).
When using DOM API you could use method lookupNamespaceURI():
System.out.println("ns1 = " + doc.getDocumentElement().lookupNamespaceURI("ns1"));
System.out.println("ns2 = " + doc.getDocumentElement().lookupNamespaceURI("ns2"));
When using XPath you could try following expressions:
namespace-uri(/*[local-name()='root']/*[local-name()='abc'])
namespace-uri(/*[local-name()='root']/*[local-name()='def'])
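Both approaches above can be tried in one small program. This is a sketch rather than a definitive implementation: the class name NamespaceLookupExample and its helper methods are made up for illustration. It also parses the XML through an InputSource, because DocumentBuilder.parse(String) interprets its argument as a URI, not as XML content.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceLookupExample {
    // INPUT_XML from the question, with the XML declaration closed properly.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<root xmlns:ns1=\"http://path1/schema1\" xmlns:ns2=\"http://path2/schema2\">"
        + "<ns1:abc>1234</ns1:abc><ns2:def>5678</ns2:def></root>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for namespace lookups
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // DOM approach: resolve a prefix declared on the root element.
    static String viaDom(Document doc, String prefix) {
        return doc.getDocumentElement().lookupNamespaceURI(prefix);
    }

    // XPath approach: ask for the namespace URI of a child element,
    // matching it by local name so no NamespaceContext is needed.
    static String viaXPath(Document doc, String childLocalName) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "namespace-uri(/*[local-name()='root']/*[local-name()='"
                + childLocalName + "'])";
        return xpath.evaluate(expr, doc);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        System.out.println("ns1 = " + viaDom(doc, "ns1"));
        System.out.println("ns2 = " + viaXPath(doc, "def"));
    }
}
```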

How to extract data from <dc> tag in java?

I am currently trying to extract the <dc:title> element from an EPUB in Java. However, I tried using
doc.getDocumentElement().getElementsByTagName("dc:title");
and it only showed 2nd element :com.sun.org.apache.xerces.internal.dom.DeepNodeListImpl. How can I extract <dc:title>?
Here is my code:
File fXmlFile = new File("file directory");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
System.out.println("1st element :" + doc.getElementsByTagName("dc"));
System.out.println("2nd element :" + doc.getDocumentElement().getElementsByTagName("dc:title"));
System output:
1st element : com.sun.org.apache.xerces.internal.dom.DeepNodeListImpl@4f53e9be
2nd element :com.sun.org.apache.xerces.internal.dom.DeepNodeListImpl@e16e1a2
Added Sample Data
<dc:title>
<![CDATA[someData]]>
</dc:title>
<dc:creator>
<![CDATA[someData]]>
</dc:creator>
<dc:language>someData</dc:language>
The method getElementsByTagName(String) returns a NodeList of matching elements (note the plural 's'), which is why printing it shows only the list object. You then need to pick the element you want, for example with .item(index), which gives you a Node instance. You can then call getNodeValue() on that Node object.
EDITED: because of the CDATA section, use Node.getTextContent() instead:
NodeList elems = doc.getElementsByTagName("dc:title");
Node item = elems.item(0);
System.out.println(item.getTextContent());
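I would suggest using XPath to get the desired output.
Also, refer to the following link for examples.
https://www.journaldev.com/1194/java-xpath-example-tutorial
For example:
XPath xPath = XPathFactory.newInstance().newXPath();
// Note: the dc prefix must be bound with xPath.setNamespaceContext(...) and the document parsed namespace-aware; otherwise the prefix cannot be resolved
String expression = "//dc:title/text()";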
NodeList nodes = (NodeList) xPath.compile(expression).evaluate(doc, XPathConstants.NODESET);
System.out.println(nodes.item(0).getNodeValue());
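As a namespace-aware alternative to matching the literal string "dc:title", the element can be looked up by namespace URI and local name. The sketch below assumes the dc prefix is bound to the Dublin Core namespace http://purl.org/dc/elements/1.1/, as is usual in EPUB package files; the question does not show the declaration, so check your own file. The class name DcTitleExample and the <metadata> wrapper are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DcTitleExample {
    // Assumption: dc is bound to the Dublin Core elements namespace.
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    // Hypothetical fragment modelled on the sample data in the question.
    static final String XML =
        "<metadata xmlns:dc=\"" + DC_NS + "\">"
        + "<dc:title>\n<![CDATA[someData]]>\n</dc:title>"
        + "<dc:language>someData</dc:language>"
        + "</metadata>";

    // Returns the trimmed text of the first dc:title, or null if there is none.
    static String firstTitle(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for getElementsByTagNameNS
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList titles = doc.getElementsByTagNameNS(DC_NS, "title");
        if (titles.getLength() == 0) {
            return null;
        }
        // getTextContent() concatenates text and CDATA children.
        return titles.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstTitle(XML));
    }
}
```

The trim() call removes the line breaks that surround the CDATA section in the sample data.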

How to select all child nodes from a parent node from xml if there are many parents node with same name?

<?xml version="1.0" encoding="UTF-8"?>
<invoices>
<invoice>
<obs>
<ob>
<code>ABC</code>
</ob>
<ob>
<code>123</code>
</ob>
</obs>
</invoice>
<invoice>
<obs>
<ob>
<code>DEF</code>
</ob>
</obs>
</invoice>
</invoices>
Question:
I receive this XML from an external system. It can contain a large number of invoice nodes, and each invoice node can contain a large number of code nodes. I want to read the code nodes of every invoice node and save them in an array like this:
invoice[1].code[1]=ABC
invoice[1].code[2]=123
invoice[2].code[1]=DEF
How can I do this using XPathExpression in Java? My code below is not working.
expr = xpath.compile("//invoices/invoice/obs/ob/code/text()");
result1=expr.evaluate(dc, XPathConstants.NODESET);
nodes =(NodeList)result1;
Please give a general solution that also works when the number of nodes is large.
Your expression returns the codes of both invoices, because there are two invoice elements. Try giving each invoice an identifier in your XML, like
<invoices>
<invoice name="invoice1">
<obs>
<ob>
<code>ABC</code>
</ob>
</obs>
</invoice>
<invoice name="invoice2">
<obs>
<ob>
<code>DEF</code>
</ob>
</obs>
</invoice>
</invoices>
then
expr = xpath.compile("//invoices/invoice[@name='invoice1']/obs/ob/code/text()");
You could try this XPath:
//invoices/invoice[descendant::code[.='ABC']]/obs/ob/code
Your XPath matches code nodes under both invoice elements. Instead, you can select the invoice nodes first and then evaluate a relative expression against each one:
String xml = "<invoices><invoice><obs><ob><code>ABC</code></ob><ob><code>111</code></ob></obs></invoice><invoice><obs><ob><code>DEF</code></ob></obs></invoice></invoices>";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder =factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(xml)));
XPathFactory xpathFactory = XPathFactory.newInstance();
Map<String,List<String>> invoiceCodeMap = new LinkedHashMap<>();
XPathExpression invoiceXpathExp = xpathFactory.newXPath().compile("//invoices/invoice");
NodeList invoiceNodes = (NodeList) invoiceXpathExp.evaluate(document, XPathConstants.NODESET);
//Iterate Invoice nodes
for(int invoiceIndex=0;invoiceIndex<invoiceNodes.getLength();invoiceIndex++){
String invoiceID = "invoice"+(invoiceIndex+1);
List<String> codeList = new ArrayList<>();
XPathExpression codeXpathExp = xpathFactory.newXPath().compile("obs/ob/code/text()");
NodeList codeNodes = (NodeList) codeXpathExp.evaluate(invoiceNodes.item(invoiceIndex), XPathConstants.NODESET);
for(int codeIndex=0;codeIndex<codeNodes.getLength();codeIndex++){
Node code = codeNodes.item(codeIndex);
codeList.add(code.getTextContent());
}
invoiceCodeMap.put(invoiceID, codeList);
}
System.out.println(invoiceCodeMap);

How to access values read from XML using XPath in Java

I want to read XML data using XPath in Java.
I have the following XML file named MyXML.xml:
<?xml version="1.0" encoding="iso-8859-1" ?>
<REPOSITORY xmlns:LIBRARY="http://www.openarchives.org/LIBRARY/2.0/"
xmlns:xsi="http://www.w3.prg/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.openarchives.org/LIBRARY/2.0/ http://www.openarchives.org/LIBRARY/2.0/LIBRARY-PHM.xsd">
<repository>Test</repository>
<records>
<record>
<ejemplar>
<library_book:book
xmlns:library_book="http://www.w3c.es/LIBRARY/book/"
xmlns:book="http://www.w3c.es/LIBRARY/book/"
xmlns:bookAssets="http://www.w3c.es/LIBRARY/book/"
xmlns:bookAsset="http://www.w3c.es/LIBRARY/book/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3c.es/LIBRARY/book/ http://www.w3c.es/LIBRARY/replacement/book.xsd">
<book:bookAssets count="1">
<book:bookAsset nasset="1">
<book:bookAsset.id>value1</book:bookAsset.id>
<book:bookAsset.event>
<book:bookAsset.event.id>value2</book:bookAsset.event.id>
</book:bookAsset.event>
</book:bookAsset>
</book:bookAssets>
</library_book:book>
</ejemplar>
</record>
</records>
</REPOSITORY>
I want to access the values value1 and value2. For this, I tried the following:
// Standard of reading a XML file
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder;
Document doc = null;
XPathExpression expr = null;
builder = factory.newDocumentBuilder();
doc = builder.parse("MyXML.xml");
// Create a XPathFactory
XPathFactory xFactory = XPathFactory.newInstance();
// Create a XPath object
XPath xpath = xFactory.newXPath();
expr = xpath.compile("//REPOSITORY/records/record/ejemplar/library_book:book//book:bookAsset.event.id/text()");
Object result = expr.evaluate(doc, XPathConstants.STRING);
System.out.println("RESULT=" + (String)result);
But I don't get any results; it only prints RESULT=.
How do I access the values value1 and value2? What XPath expression should I use?
Thanks in advance.
I'm using JDK6.
You are having problems with namespaces. What you can do is:
1. take them into account
2. ignore them using the XPath local-name() function
Solution 1 implies implementing a NamespaceContext that maps namespace prefixes to URIs and setting it on the XPath object before querying.
Solution 2 is easy; you just need to change your XPath (but depending on your XML you may need to fine-tune your XPath to be sure to select the correct element):
XPath xpath = xFactory.newXPath();
expr = xpath.compile("//*[local-name()='bookAsset.event.id']/text()");
Object result = expr.evaluate(doc, XPathConstants.STRING);
System.out.println("RESULT=" + result);
Runnable example on ideone.
You can take a look at the following blog article to better understand the uses of namespaces and XPath in Java (even if old)
Try
Object result = expr.evaluate(doc, XPathConstants.NODESET);
// Cast the result to a DOM NodeList
NodeList nodes = (NodeList) result;
for (int i=0; i<nodes.getLength();i++){
System.out.println(nodes.item(i).getNodeValue());
}
One approach is to implement a namespace context like:
public static class UniversalNamespaceResolver implements NamespaceContext {
private Document sourceDocument;
public UniversalNamespaceResolver(Document document) {
sourceDocument = document;
}
public String getNamespaceURI(String prefix) {
if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
return sourceDocument.lookupNamespaceURI(null);
} else {
return sourceDocument.lookupNamespaceURI(prefix);
}
}
public String getPrefix(String namespaceURI) {
return sourceDocument.lookupPrefix(namespaceURI);
}
public Iterator getPrefixes(String namespaceURI) {
return null;
}
}
And then use it like
xpath.setNamespaceContext(new UniversalNamespaceResolver(doc));
You also need to move up all the namespace declarations to the root node (REPOSITORY). Otherwise it might be a problem if you have namespace declarations on two different levels.
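If the namespace URI is known in advance, a fixed mapping is an alternative to resolving prefixes from the document. The sketch below is written under that assumption, using the URI http://www.w3c.es/LIBRARY/book/ from the question's XML (trimmed here to the relevant elements). The class names and the prefix b are hypothetical. The prefix used in the XPath expression does not have to match the prefix used in the document, because only the URI is compared.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class BookAssetExample {
    static final String BOOK_NS = "http://www.w3c.es/LIBRARY/book/";

    // Trimmed version of the question's document.
    static final String XML =
        "<REPOSITORY><records><record><ejemplar>"
        + "<library_book:book xmlns:library_book=\"" + BOOK_NS + "\""
        + " xmlns:book=\"" + BOOK_NS + "\">"
        + "<book:bookAssets count=\"1\"><book:bookAsset nasset=\"1\">"
        + "<book:bookAsset.id>value1</book:bookAsset.id>"
        + "<book:bookAsset.event>"
        + "<book:bookAsset.event.id>value2</book:bookAsset.event.id>"
        + "</book:bookAsset.event>"
        + "</book:bookAsset></book:bookAssets>"
        + "</library_book:book></ejemplar></record></records></REPOSITORY>";

    // Fixed mapping: the hypothetical prefix "b" stands for the book namespace.
    static class BookNamespaceContext implements NamespaceContext {
        public String getNamespaceURI(String prefix) {
            return "b".equals(prefix) ? BOOK_NS : XMLConstants.NULL_NS_URI;
        }
        public String getPrefix(String namespaceURI) {
            return BOOK_NS.equals(namespaceURI) ? "b" : null;
        }
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singletonList(prefix).iterator();
        }
    }

    // Evaluates an expression against the sample and returns its string value.
    static String evaluate(String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new BookNamespaceContext());
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(evaluate("//b:bookAsset.id"));
        System.out.println(evaluate("//b:bookAsset.event.id"));
    }
}
```

Note that REPOSITORY, records, record and ejemplar are in no namespace, so they are written without a prefix in any expression that names them.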

How to get root node attributes in Java

I have an XML file like the one below. I want to get the latitude and longitude attributes of the pharmacies node. I can get the child nodes' attributes, but I couldn't get the root node's attributes. I am new to Java and XML, and I could not find a solution.
<pharmacies Acc="4" latitude="36.8673380" longitude="30.6346640" address="Ayujkila">
<pharmacy name="sadde" owner="" address="dedes" distance="327.000555668" phone="342343" lat="36.8644" long="30.6345" accuracy="8"/>
<pharmacy name="Sun " owner="" address="degerse" distance="364.450016586" phone="45623" lat="36.8641" long="30.6353" accuracy="8"/>
<pharmacy name="lara" owner="" address="freacde" distance="927.262190129" phone="564667" lat="36.8731" long="30.6422" accuracy="8"/>
<end/>
</pharmacies>
This is the relevant part of my code. I get the XML file from a URL.
DocumentBuilderFactory dbf =DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(url.openStream()));
doc.getDocumentElement().normalize();
NodeList nodeList =doc.getElementsByTagName("pharmacy");
for (int i = 0; i < nodeList.getLength(); i++){
Node node =nodeList.item(i);
Element fstElmnt = (Element) node;
NodeList pharmacyList = fstElmnt.getElementsByTagName("pharmacy");
Element pharmacyElement = (Element) pharmacyList.item(0);
HashMap<String,String>map=new HashMap<String,String>();
map.put("name", pharmacyElement.getAttribute("name"));
map.put("distance", pharmacyElement.getAttribute("phone"));
list.add(map);
latt.add(pharmacyElement.getAttribute("lat"));
....
The <pharmacies> element itself can be obtained using
Element pharmacies = doc.getDocumentElement();
You can get the attributes from that.
doc.getDocumentElement() will return the root element and you can call getAttribute( attrName ) on it like you would on any other element.
Try the following:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
doc.getDocumentElement().normalize();
System.out.println(doc.getChildNodes().getLength());
Node item = doc.getChildNodes().item(0);
System.out.println(item.getNodeName());
Node lat = item.getAttributes().getNamedItem("latitude");
String s = lat.getNodeValue();
System.out.println(s.equals("36.8673380")); // Value of /pharmacies/@latitude
You need to use pharmacies instead of pharmacy if you want the attributes of the root node pharmacies, and use the getAttributes method instead. You can see a lot of examples at this site.
http://java.sun.com/developer/codesamples/xml.html#dom
Try this; it works for me. res is your final string:
doc = b.parse(new ByteArrayInputStream(result.getBytes("UTF-8")));
Node rootNode=doc.getDocumentElement();
res = rootNode.getNodeName().toString();
<pharmacies> is itself an element and can be obtained using
Element pharmacies = doc.getDocumentElement();
This pharmacies variable, of type Element, gives access to all the attributes of the <pharmacies> element. We can get the desired attributes one by one by name, like:
pharmacies.getAttribute("latitude"); // Will give 36.8673380
pharmacies.getAttribute("longitude"); // Will give 30.6346640
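Putting the answers above together, here is a minimal sketch that reads attributes from the root element of a shortened version of the question's XML. The class name RootAttributeExample and the helper rootAttribute are illustrative names, not part of the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class RootAttributeExample {
    // Shortened version of the XML from the question.
    static final String XML =
        "<pharmacies Acc=\"4\" latitude=\"36.8673380\""
        + " longitude=\"30.6346640\" address=\"Ayujkila\">"
        + "<pharmacy name=\"sadde\" lat=\"36.8644\" long=\"30.6345\"/>"
        + "</pharmacies>";

    // Returns the named attribute of the root element.
    // Element.getAttribute returns an empty string when the attribute is absent.
    static String rootAttribute(String xml, String name) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        return root.getAttribute(name);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootAttribute(XML, "latitude"));
        System.out.println(rootAttribute(XML, "longitude"));
    }
}
```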
