Parsing XML using Java

I am trying to parse a dom element.
Element element:
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<id>http://X/feed2</id>
<title>Sample Feed</title>
<entry>
<id>http://X/feed2/104</id>
<title>New Title</title>
</entry>
</feed>
I am trying to fetch the following entry:
<entry>
<id>http://top.cs.vt.edu/libx2/vsony7#vt.edu/feed2/104</id>
<title>New Title</title>
</entry>
I am parsing the xml by using the xpath:
"/atom:feed/atom:entry[atom:id=\"http://X/feed2/104\"]"
However, I get an exception when I evaluate this expression against the DOM element. Can someone suggest a simple approach to achieve this in Java?
Please see my full code:
public static void parseXml() {
String externalEntryIdUrl = "http://theta.cs.vt.edu/~rupen/thirtylibapps/137";
String externalFeedUrl = StringUtils.substringBeforeLast(externalEntryIdUrl, "/");
try {
URL url = new URL(externalFeedUrl);
InputStream externalXml = new BufferedInputStream(url.openStream());
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(externalXml);
Element externalFeed = doc.getDocumentElement();
String atomNameSpace = "xmlns:atom=\"http://www.w3.org/2005/Atom\"";
String entryIdPath = String.format("//%s:entry[%s:id=%s]", atomNameSpace, atomNameSpace, externalEntryIdUrl);
Element externalEntry = (Element) XPathSupport.evalNode(entryIdPath, externalFeed);
} catch (Exception ex) {
// Throw exception
}
}
static synchronized Node evalNode(String xpathExpr, Node node) {
NodeList result = evalNodeSet(xpathExpr, node);
if (result.getLength() > 1)
throw new Error ("More than one node for:" + xpathExpr);
else if (result.getLength() == 1)
return result.item(0);
else
return null;
}
static synchronized NodeList evalNodeSet(String xpathExpr, Node node) {
try {
static XPath xpath = factory.newXPath();
xpath.setNamespaceContext(context);
static NamespaceContext context = new NamespaceContext() {
private Map<String, String> prefix2URI = new HashMap<String, String>();
{
prefix2URI.put("libx", "http://libx.org/xml/libx2");
prefix2URI.put("atom", "http://www.w3.org/2005/Atom");
}
};
XPathExpression expr = xpath.compile(xpathExpr);
Object result = expr.evaluate(node, XPathConstants.NODESET);
return (NodeList)result;
} catch (XPathExpressionException xpee) {
throw new Error ("An xpath expression exception: " + xpee);
}
}
SEVERE: >>java.lang.Error: An xpath expression exception: javax.xml.xpath.XPathExpressionException

You can use a SAX parser.
Here is an example of SAX parsing: http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/
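As a rough illustration of the SAX approach applied to the feed in the question, the sketch below scans the document once and returns the title of the entry whose id matches. The class name SaxEntryTitleFinder and the method findTitle are made up for this example, and it assumes that id and title are direct, text-only children of entry.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
class SaxEntryTitleFinder {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    /** Returns the title of the first entry whose id equals entryId, or null. */
    static String findTitle(String feedXml, String entryId) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so uri/localName are populated
        final String[] found = new String[1];

        DefaultHandler handler = new DefaultHandler() {
            private boolean inEntry;
            private String id;
            private String title;
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                text.setLength(0); // start collecting text for this element
                if (ATOM_NS.equals(uri) && "entry".equals(localName)) {
                    inEntry = true;
                    id = null;
                    title = null;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (!inEntry || !ATOM_NS.equals(uri)) {
                    return; // ignores the feed-level id and title
                }
                if ("id".equals(localName)) {
                    id = text.toString().trim();
                } else if ("title".equals(localName)) {
                    title = text.toString().trim();
                } else if ("entry".equals(localName)) {
                    if (entryId.equals(id) && found[0] == null) {
                        found[0] = title;
                    }
                    inEntry = false;
                }
            }
        };

        factory.newSAXParser().parse(new InputSource(new StringReader(feedXml)), handler);
        return found[0];
    }

    public static void main(String[] args) throws Exception {
        String feed = "<feed xmlns=\"http://www.w3.org/2005/Atom\">"
                + "<id>http://X/feed2</id><title>Sample Feed</title>"
                + "<entry><id>http://X/feed2/104</id><title>New Title</title></entry>"
                + "</feed>";
        System.out.println(findTitle(feed, "http://X/feed2/104"));
    }
}
```

SAX avoids building the whole tree in memory, at the cost of tracking state by hand, so it tends to pay off mainly for large feeds.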

You could leverage a NamespaceContext and do something like the following:
package forum9059851;
import java.io.FileInputStream;
import java.util.Iterator;
import javax.xml.namespace.NamespaceContext;
import javax.xml.xpath.*;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
public class Demo {
public static void main(String[] args) {
try {
XPathFactory xpf = XPathFactory.newInstance();
XPath xp = xpf.newXPath();
xp.setNamespaceContext(new MyNamespaceContext());
XPathExpression xpe = xp.compile("ns:feed/ns:entry");
FileInputStream xmlStream = new FileInputStream("src/forum9059851/input.xml");
InputSource xmlInput = new InputSource(xmlStream);
Element result = (Element) xpe.evaluate(xmlInput, XPathConstants.NODE);
System.out.println(result);
} catch (Exception ex) {
// Throw exception
}
}
private static class MyNamespaceContext implements NamespaceContext {
public String getNamespaceURI(String prefix) {
if("ns".equals(prefix)) {
return "http://www.w3.org/2005/Atom";
}
return null;
}
public String getPrefix(String namespaceURI) {
return null;
}
public Iterator getPrefixes(String namespaceURI) {
return null;
}
}
}

If you don't want to reinvent the wheel and want to parse feed data I would recommend going with the already available Rome library.

I figured out that I hadn't enabled namespace awareness when parsing the XML fetched from the URL.
So,
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
Doing so fixes my issue. Without it, setting the namespace context on the XPath instance, as shown in my example, is not enough by itself.
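To make the fix concrete, here is a minimal sketch that combines a namespace-aware DocumentBuilderFactory with a NamespaceContext binding the atom prefix. The class name AtomEntryLookup and the method findEntry are invented for illustration, the feed is passed in as a string rather than fetched from a URL, and the expression is built with naive quoting that assumes the id contains no single quote.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class AtomEntryLookup {
    static final String ATOM_NS = "http://www.w3.org/2005/Atom";

    // Binds only the "atom" prefix used in the XPath expression below.
    private static final NamespaceContext ATOM_CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            return "atom".equals(prefix) ? ATOM_NS : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            return ATOM_NS.equals(namespaceURI) ? "atom" : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            return ATOM_NS.equals(namespaceURI)
                    ? Collections.singletonList("atom").iterator()
                    : Collections.<String>emptyIterator();
        }
    };

    /** Returns the matching entry element, or null if there is none. */
    static Element findEntry(String feedXml, String entryId) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // without this, atom:entry never matches
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(feedXml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(ATOM_CONTEXT);
        // Naive quoting: assumes entryId contains no single quote.
        String expr = "/atom:feed/atom:entry[atom:id='" + entryId + "']";
        return (Element) xpath.evaluate(expr, doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        String feed = "<feed xmlns=\"http://www.w3.org/2005/Atom\">"
                + "<id>http://X/feed2</id><title>Sample Feed</title>"
                + "<entry><id>http://X/feed2/104</id><title>New Title</title></entry>"
                + "</feed>";
        Element entry = findEntry(feed, "http://X/feed2/104");
        System.out.println(
                entry.getElementsByTagNameNS(ATOM_NS, "title").item(0).getTextContent());
    }
}
```

Both halves are needed: the parser has to record namespace URIs on the nodes, and the XPath engine has to know which URI the atom prefix stands for.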

Related

Conversion of String to w3c.Document returns null Document

I am getting an XML response from a third-party API via a Feign client. I collect it in a String and then try to convert the String into an org.w3c.dom.Document.
I searched for String-to-Document conversion code and came across the links below.
https://howtodoinjava.com/java/xml/parse-string-to-xml-dom/
How to convert String to DOM Document object in java?
https://www.journaldev.com/1237/java-convert-string-to-xml-document-and-xml-document-to-string
The problem is that my conversion logic is not working and the Document is null.
public static void main(String[] args) {
final String xmlStr = "<Emp id=\"1\"><name>Pankaj</name><age>25</age>\n"+
"<role>Developer</role><gen>Male</gen></Emp>";
Document doc = convertStringToXMLDocument(xmlStr);
}
private static Document convertStringToXMLDocument(String xmlString)
{
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try
{
builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(new StringReader(xmlString)));
return doc;
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
I have tried debugging the code of builder.parse() but cannot find out why the converted document is null.
Output: doc: "[#document : null]"
Your Document doc is not null. "[#document: null]" is just the output of Xerces' NodeImpl.toString(), which prints [node name: node value]. So [#document: null] means a document node that has no direct value. That is expected, because document and element nodes have no node value of their own; nodes such as text nodes do.
You will be able to iterate over the Document doc and to get all the child nodes (element nodes as well as text nodes) as follows:
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
public class DocumentBuilderTest {
public static void main(String[] args) {
final String xmlStr = "<Emp id=\"1\"><name>Pankaj</name><age>25</age>\n"+
"<role>Developer</role><gen>Male</gen></Emp>";
Document doc = convertStringToXMLDocument(xmlStr);
printAllChildNodes(doc);
}
public static void printAllChildNodes(Node node) {
System.out.println(node);
NodeList nodeList = node.getChildNodes();
for (int i = 0; i < nodeList.getLength(); i++) {
Node currentNode = nodeList.item(i);
printAllChildNodes(currentNode);
}
}
private static Document convertStringToXMLDocument(String xmlString) {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try {
builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(new StringReader(xmlString)));
return doc;
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
}
This prints:
[#document: null]
[Emp: null]
[name: null]
[#text: Pankaj]
[age: null]
[#text: 25]
[#text:
]
[role: null]
[#text: Developer]
[gen: null]
[#text: Male]
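Another way to confirm that the document was parsed, without relying on toString(), is to look at the root element or to serialize the tree back to text. The sketch below does both with the JDK's identity Transformer; the class name DocumentToString and its method names are made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class DocumentToString {
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Serializes the DOM tree back to XML text, without the XML declaration. */
    static String serialize(Document doc) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<Emp id=\"1\"><name>Pankaj</name><age>25</age></Emp>");
        // The root element name shows that parsing produced a real tree.
        System.out.println(doc.getDocumentElement().getNodeName());
        System.out.println(serialize(doc));
    }
}
```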

How to read an online XML file for currency rates in java

I'm building a simple currency converter which needs to use online rates. I found the following API from the European Central Bank:
http://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml
My problem is that I'm struggling to implement it. Here is what I have so far, after using a bunch of different sources to put this code together.
try{
URL url = new URL("http://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(url.openStream()));
doc.getDocumentElement().normalize();
NodeList nodeList1 = doc.getElementsByTagName("Cube");
for(int i = 0; i < nodeList1.getLength(); i++){
Node node = nodeList1.item(i);
}
}
catch(Exception e){
}
What I thought is that this code would pull down all the nodes named "Cube", which contain the rates.
Does anyone have an easier way to pull the rates from the API into an array, in the order they appear in the XML? That's all I'm trying to do.
Thanks
XPath is one way to answer this, since you just want to extract information from the XML and not change the XML. The structure of the XML suggests that you're looking for nodes that are Cube nodes, that are child of Cube which is also a child of Cube -- Cube nested three times, so extract nodes with an XPath compiled using this String: "//Cube/Cube/Cube". This looks for nodes that have Cube nested 3 times located anywhere (the //) in the Document:
XPathExpression expr = xpath.compile("//Cube/Cube/Cube");
Then check the nodes for a "currency" attribute. If they have this, then they also have a "rate" attribute, and then extract this information.
NamedNodeMap attribs = node.getAttributes();
if (attribs.getLength() > 0) {
Node currencyAttrib = attribs.getNamedItem(CURRENCY);
if (currencyAttrib != null) {
String currencyTxt = currencyAttrib.getNodeValue();
String rateTxt = attribs.getNamedItem(RATE).getNodeValue();
// ...
}
}
Where CURRENCY = "currency" and RATE = "rate"
For example:
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class TestXPath {
private static final String CURRENCY = "currency";
private static final String CUBE_NODE = "//Cube/Cube/Cube";
private static final String RATE = "rate";
public static void main(String[] args) {
List<CurrencyRate> currRateList = new ArrayList<>();
DocumentBuilderFactory builderFactory =
DocumentBuilderFactory.newInstance();
DocumentBuilder builder = null;
try {
builder = builderFactory.newDocumentBuilder();
} catch (ParserConfigurationException e) {
e.printStackTrace();
}
Document document = null;
String spec = "http://www.ecb.europa.eu/stats/eurofxref/eurofxref-daily.xml";
try {
URL url = new URL(spec);
InputStream is = url.openStream();
document = builder.parse(is);
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
String xPathString = CUBE_NODE;
XPathExpression expr = xpath.compile(xPathString);
NodeList nl = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nl.getLength(); i++) {
Node node = nl.item(i);
NamedNodeMap attribs = node.getAttributes();
if (attribs.getLength() > 0) {
Node currencyAttrib = attribs.getNamedItem(CURRENCY);
if (currencyAttrib != null) {
String currencyTxt = currencyAttrib.getNodeValue();
String rateTxt = attribs.getNamedItem(RATE).getNodeValue();
currRateList.add(new CurrencyRate(currencyTxt, rateTxt));
}
}
}
} catch (SAXException | IOException | XPathExpressionException e) {
e.printStackTrace();
}
for (CurrencyRate currencyRate : currRateList) {
System.out.println(currencyRate);
}
}
}
public class CurrencyRate {
private String currency;
private String rate; // ?double
public CurrencyRate(String currency, String rate) {
super();
this.currency = currency;
this.rate = rate;
}
public String getCurrency() {
return currency;
}
public String getRate() {
return rate;
}
@Override
public String toString() {
return "CurrencyRate [currency=" + currency + ", rate=" + rate + "]";
}
// equals, hashCode,....
}
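If the rates are needed as numbers rather than strings (the "?double" comment above hints at this), one option is to collect them into an insertion-ordered map of BigDecimal values. The sketch below assumes a simplified, namespace-free sample with invented rates instead of the live feed, and the class name RateTable is hypothetical. Like the answer above, it leaves the parser non-namespace-aware; with a namespace-aware parser and a feed that declares a default namespace, the unprefixed Cube step would need a prefix and a NamespaceContext.

```java
import java.io.StringReader;
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class RateTable {
    /** Returns currency -> rate, in the order the Cube elements appear. */
    static Map<String, BigDecimal> parseRates(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Select only the Cube elements that actually carry a rate.
        NodeList cubes = (NodeList) xpath.evaluate(
                "//Cube[@currency and @rate]", doc, XPathConstants.NODESET);

        Map<String, BigDecimal> rates = new LinkedHashMap<>();
        for (int i = 0; i < cubes.getLength(); i++) {
            Element cube = (Element) cubes.item(i);
            rates.put(cube.getAttribute("currency"),
                    new BigDecimal(cube.getAttribute("rate")));
        }
        return rates;
    }

    public static void main(String[] args) throws Exception {
        // Simplified sample with made-up values, not real ECB data.
        String sample = "<Envelope><Cube><Cube time=\"2024-01-02\">"
                + "<Cube currency=\"USD\" rate=\"1.0956\"/>"
                + "<Cube currency=\"JPY\" rate=\"155.32\"/>"
                + "</Cube></Cube></Envelope>";
        parseRates(sample).forEach((currency, rate) ->
                System.out.println(currency + " = " + rate));
    }
}
```

BigDecimal keeps the decimal digits exactly as published, which is usually preferable to double for money calculations.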

Convert XML String to Map and get the key & value pairs using Java

I have an XML String. I'm trying to convert that string into a map so that I can get the keys and values. However, the conversion does not work. Here is my code:
String xmlString = "<?xml version="1.0" encoding="UTF-8"?><user>
<kyc></kyc>
<address></address>
<resiFI></resiFI></user>"
def convertStringToDocument = {
xmlString ->
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try {
builder = factory.newDocumentBuilder();
org.w3c.dom.Document doc = builder.parse(new InputSource(new StringReader(xmlString)));
return doc;
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
def populateDocProofsFromWaiversXML = {
xmlString, mandateFlag ->
final List<DocumentProof> documentProofs = new ArrayList<DocumentProof>();
if (xmlString != null) {
try {
HashMap<String, String> values = new HashMap<String, String>();
Document xml = convertStringToDocument(waiversList);
org.w3c.dom.Node user = xml.getFirstChild();
NodeList childs = user.getChildNodes();
org.w3c.dom.Node child;
for (int i = 0; i < childs.getLength(); i++) {
child = childs.item(i);
System.out.println(child.getNodeName());
System.out.println(child.getNodeValue());
values.put(child.getNodeName(), child.getNodeValue());
}
} catch (Throwable t) {
println "error"
//LOG.error("Could not set document proofs from waivers ", t);
}
}
return documentProofs;
}
I'd like to get "kyc" as key and the respective value. Any better ideas?
package com.test;
import java.io.StringReader;
import java.util.HashMap;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
public class Random {
/**
* @param args
*/
public static void main(String[] args) {
HashMap<String, String> values = new HashMap<String, String>();
String xmlString = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><user><kyc>123</kyc><address>test</address><resiFI>asds</resiFI></user>";
Document xml = convertStringToDocument(xmlString);
Node user = xml.getFirstChild();
NodeList childs = user.getChildNodes();
Node child;
for (int i = 0; i < childs.getLength(); i++) {
child = childs.item(i);
System.out.println(child.getNodeName());
System.out.println(child.getTextContent());
values.put(child.getNodeName(), child.getTextContent());
}
}
private static Document convertStringToDocument(String xmlStr) {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try {
builder = factory.newDocumentBuilder();
Document doc = builder.parse(new InputSource(new StringReader(
xmlStr)));
return doc;
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
}
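This will work. The key change is calling getTextContent() instead of getNodeValue(): for an element node, getNodeValue() always returns null, while getTextContent() returns the text inside the element.
You can explore the rest of the DOM API from here.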
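One thing to watch for when the XML contains line breaks between elements, as the string in the question does: getChildNodes() also returns whitespace text nodes named #text, which would end up as a key in the map. A minimal sketch that keeps only element children follows; the class name XmlToMap and the method childElementsToMap are invented for this example, and it assumes the children are flat, text-only elements with distinct names.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class XmlToMap {
    /** Maps each child element of the root to its text content. */
    static Map<String, String> childElementsToMap(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();

        Map<String, String> values = new LinkedHashMap<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes, comments, and so on.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                values.put(child.getNodeName(), child.getTextContent());
            }
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<user>\n<kyc>123</kyc>\n<address>test</address>\n"
                + "<resiFI></resiFI></user>";
        System.out.println(childElementsToMap(xml));
    }
}
```

An empty element such as resiFI maps to an empty string rather than null, because getTextContent() concatenates the (absent) text children.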

Document parsing shows null

I need help with the following.
I want to get the attributes of the xref node in the code below, i.e. id, location, and type, each with its value.
I am passing the XML as a string, but the document shows null after parsing.
Please help me with this.
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
public class GetAtrribute {
/**
* @param args
*/
public static void main(String[] args) {
String xml = "<xref id=\"19703675\" location=\"abstract\" type=\"external\">PubMed Abstract: http://www.abcd.nlm.nih.gov/...</xref>"; //Populated XML String....
GetAtrribute ga = new GetAtrribute();
try {
ga.getValues(xml);
} catch (Exception e) {
e.printStackTrace();
}
}
public String getValues(String xmlStr) throws Exception {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
xmlStr = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" + xmlStr;
try {
builder = factory.newDocumentBuilder();
Document document = builder.parse(new InputSource(new StringReader(
xmlStr)));
Element element = document.getDocumentElement();
NodeList list = element.getElementsByTagName("xref");
if (list != null && list.getLength() > 0) {
NodeList subList = list.item(0).getChildNodes();
if (subList != null && subList.getLength() > 0) {
return subList.item(0).getNodeValue();
}
for (int count = 0; count < subList.getLength(); count++) {
System.out.println(subList.item(count).getNodeValue());
}
}
} catch (Exception e) {
e.printStackTrace();
}
return xmlStr;
}
}
Your problem is that when you run this line:
Element element = document.getDocumentElement();
you're actually selecting xref already, because it's the only XML element. Calling getElementsByTagName("xref") on it then finds nothing, since that method only searches descendants. You could either wrap another element around xref, or just use the variable 'element' to get the details.
p.s. your class name is misspelled: GetAtrribute -> GetAttribute
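Following that suggestion, a minimal sketch of reading the attributes straight from the document element could look like this. The class name XrefAttributes and the method attributesOfRoot are made up for the example; a TreeMap is used only so the printed order is predictable, since a NamedNodeMap does not promise any particular order.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class XrefAttributes {
    /** Returns all attributes of the root element as name -> value. */
    static Map<String, String> attributesOfRoot(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        // The root element is xref itself; no further lookup is needed.
        Element root = doc.getDocumentElement();

        Map<String, String> result = new TreeMap<>();
        NamedNodeMap attrs = root.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            result.put(attr.getNodeName(), attr.getNodeValue());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<xref id=\"19703675\" location=\"abstract\" type=\"external\">"
                + "PubMed Abstract</xref>";
        attributesOfRoot(xml).forEach((name, value) ->
                System.out.println(name + " = " + value));
    }
}
```

When only one attribute is needed, calling getAttribute("id") on the root element is enough.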
I suggest using XPath to find data in your XML:
XPath xPath = XPathFactory.newInstance().newXPath();
Document baseDoc;
try (InputStream pStm = new ByteArrayInputStream(baseXmlString.getBytes("utf-8"))) {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
baseDoc = builder.parse(pStm);
} catch (SAXException | IOException | ParserConfigurationException ex) {
getLogger().error(null, ex);
return null;
}
try {
XPathExpression expression = xPath.compile(xPathExpression);
return (T) expression.evaluate(baseDoc, pathType);
} catch (XPathExpressionException ex) {
getLogger().error(null, ex);
}
return null;
For example take a look at here

XML parsing with Child not value parsing

import java.io.File;
import java.io.FileInputStream;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import com.sun.org.apache.xml.internal.dtm.ref.DTMNodeList;
public class XPathEvaluator {
/*
* ServiceGroup serviceGroup = new ServiceGroup(); List<Service>
* requiredServices = new ArrayList<Service>(); List<Service>
* recommandedServices = new ArrayList<Service>(); Service service = new
* Service();
*/
public void evaluateDocument(File xmlDocument) {
try {
XPathFactory factory = XPathFactory.newInstance();
XPath xPath = factory.newXPath();
String requiredServicesExpression = "/Envelope/Header";
InputSource requiredServicesInputSource = new InputSource(
new FileInputStream(xmlDocument));
DTMNodeList requiredServicesNodes = (DTMNodeList) xPath.evaluate(
requiredServicesExpression, requiredServicesInputSource,
XPathConstants.NODESET);
System.out.println(requiredServicesNodes.getLength());
NodeList requiredNodeList = (NodeList) requiredServicesNodes;
for (int i = 0; i < requiredNodeList.getLength(); i++) {
Node node = requiredNodeList.item(i);
System.out.println(node.getChildNodes());
}
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String[] argv) {
XPathEvaluator evaluator = new XPathEvaluator();
File xmlDocument = new File("d://eva.xml");
evaluator.evaluateDocument(xmlDocument);
}
}
My XML is the following; I am trying to parse the header information:
<?xml version="1.0" encoding="UTF-8"?>
<Envelope>
<Header>
<User id="MAKRISH"/>
<Request-Id id="1"/>
<Type name="Response"/>
<Application-Source name="vss" version="1.0"/>
<Application-Destination name="test" />
<Outgo-Timestamp date="2012-08-24" time="14:50:00"/>
<DealerCode>08301</DealerCode>
<Market>00000</Market>
</Header>
</Envelope>
I am not able to get the children of Header. How can I get them? The getChildNodes method appears to give me null. I have checked many solutions but haven't found anything that works.
The following parsing is done with DOM, as per the question's tags. I hope this helps you solve it:
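import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class DomPrinter {
public static void main(String[] args) {
try {
// "xmlfile" is a placeholder for the path to the XML document
File file = new File("xmlfile");
DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(file);
Element root = document.getDocumentElement();
root.normalize();
printNode(root, 0);
} catch (Exception e) {
// Report the failure instead of silently swallowing it
e.printStackTrace();
}
}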
public static void printNode(Node node, int depth) {
if (node.getNodeType() == Node.TEXT_NODE) {
System.out.printf("%s%n", node.getNodeValue());
} else {
NamedNodeMap attributes = node.getAttributes();
if ((attributes == null) || (attributes.getLength() == 0)) {
System.out.printf("%s%n", node.getNodeName());
} else {
System.out.printf("%s ", node.getNodeName());
printAttributes(attributes);
}
}
NodeList children = node.getChildNodes();
for(int i=0; i<children.getLength(); i++) {
Node childNode = children.item(i);
printNode(childNode, depth+1);
}
}
private static void printAttributes(NamedNodeMap attributes) {
for(int i=0; i<attributes.getLength(); i++)
{
Node attribute = attributes.item(i);
System.out.printf(" %s=\"%s\"", attribute.getNodeName(),
attribute.getNodeValue());
}
// End the line so the next node starts on a new one
System.out.println();
}
}
The accepted answer to this related question has a good example of parsing xml using xpath.
I've debugged into your code, and the getChildNodes call is in fact not returning null, but it has got a confusing toString().
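To see the children instead of the list's toString(), iterate over the nodes. The sketch below selects the element children of Header directly with XPath and builds one line per child; the class name HeaderChildren and the method describeHeaderChildren are invented for illustration. It casts to the standard org.w3c.dom.NodeList interface, so the JDK-internal DTMNodeList import is not needed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
class HeaderChildren {
    /** Describes each element child of /Envelope/Header on one line. */
    static List<String> describeHeaderChildren(String xml) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // "*" selects element children only, so whitespace text is skipped.
        NodeList nodes = (NodeList) xpath.evaluate(
                "/Envelope/Header/*",
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element el = (Element) nodes.item(i);
            StringBuilder line = new StringBuilder(el.getNodeName());
            NamedNodeMap attrs = el.getAttributes();
            for (int j = 0; j < attrs.getLength(); j++) {
                Node attr = attrs.item(j);
                line.append(' ').append(attr.getNodeName())
                        .append('=').append(attr.getNodeValue());
            }
            String text = el.getTextContent().trim();
            if (!text.isEmpty()) {
                line.append(" -> ").append(text);
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Envelope><Header>"
                + "<User id=\"MAKRISH\"/>"
                + "<DealerCode>08301</DealerCode>"
                + "</Header></Envelope>";
        describeHeaderChildren(xml).forEach(System.out::println);
    }
}
```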
