Parse XML with XPath & namespaces in Java

Can you help me adjust this code so it manages to parse the XML? If I drop the XML namespace it works:
String webXmlContent = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
"<foo xmlns=\"http://foo.bar/boo\"><bar>baz</bar></foo>";
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
org.w3c.dom.Document doc = builder.parse(new StringInputStream(webXmlContent));
NamespaceContextImpl namespaceContext = new NamespaceContextImpl();
namespaceContext.startPrefixMapping("foo", "http://www.w3.org/2001/XMLSchema-instance");
XPath xpath = XPathFactory.newInstance().newXPath();
xpath.setNamespaceContext(namespaceContext);
XPathExpression expr = xpath.compile("/foo/bar");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
System.out.println("Got " + nodes.getLength() + " nodes");

You must use a prefix in your XPath, e.g. "/my:foo/my:bar". You can choose any prefix you like; it has nothing to do with the prefixes used (or not used) in the XML file, but you must choose one. This is a limitation of XPath 1.0, where an unprefixed name always refers to an element in no namespace.
You must also map that prefix ("my") to "http://foo.bar/boo", not to "http://www.w3.org/2001/XMLSchema-instance".
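Putting both points together, a minimal sketch might look like the following. The class name NamespaceXPathDemo, the method countBarNodes and the prefix "my" are illustrative choices, not part of the original code. The anonymous NamespaceContext is a hypothetical stand-in for NamespaceContextImpl, and a ByteArrayInputStream replaces StringInputStream so that only JDK classes are needed.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

final class NamespaceXPathDemo {

    static final String NS = "http://foo.bar/boo";
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<foo xmlns=\"" + NS + "\"><bar>baz</bar></foo>";

    /** Counts the nodes matched by /my:foo/my:bar in the sample document. */
    static int countBarNodes() {
        try {
            DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
            domFactory.setNamespaceAware(true);
            Document doc = domFactory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Hypothetical inline context: maps the arbitrary prefix "my" to the document's namespace.
            xpath.setNamespaceContext(new NamespaceContext() {
                @Override
                public String getNamespaceURI(String prefix) {
                    return "my".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
                }

                @Override
                public String getPrefix(String namespaceURI) {
                    return NS.equals(namespaceURI) ? "my" : null;
                }

                @Override
                public Iterator<String> getPrefixes(String namespaceURI) {
                    String prefix = getPrefix(namespaceURI);
                    return prefix == null
                            ? Collections.<String>emptyIterator()
                            : Collections.singletonList(prefix).iterator();
                }
            });

            NodeList nodes = (NodeList) xpath.evaluate("/my:foo/my:bar", doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            // Wrapped so the sketch can be called without checked-exception handling.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Got " + countBarNodes() + " nodes");
    }
}
```

The prefix lives only in the XPath expression and the NamespaceContext; the XML itself keeps its default namespace declaration unchanged.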

Related

how to parse xml to java in nodelist

This is my XML:
<?xml version = "1.0" encoding = "UTF-8"?>
<ns0:GetADSLProfileResponse xmlns:ns0 = "http://">
<ns0:Result>
<ns0:eCode>0</ns0:eCode>
<ns0:eDesc>Success</ns0:eDesc>
</ns0:Result>
</ns0:GetADSLProfileResponse>
This is my Java code. I need to know how to get started with this.
I tried some code I found online, but it did not solve my problem.
How do I loop over the values in Result and get 0 from eCode and Success from eDesc?
CustomerProfileResult pojo = new CustomerProfileResult();
String body = readfile();
System.out.println(body);
try {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document dom = db.parse(new InputSource(new StringReader(body)));
XPath xpath =XPathFactory.newInstance().newXPath();
XPathExpression name = xpath.compile("/xml/GetADSLProfileResponse/Result");
NodeList nodeName = (NodeList) name.evaluate(dom, XPathConstants.NODESET);
if(nodeName!=null){
}
Summary
You can try the following expression, which selects the nodes regardless of the namespace prefix ns0:
/*[local-name()='GetADSLProfileResponse']/*[local-name()='Result']/*
Explanation
In your syntax, several parts were incorrect. Let's take a look together. XPath syntax /xml means that the root node of the document is <xml>, but the root element is <ns0:GetADSLProfileResponse>; GetADSLProfileResponse is incorrect too, because your XML file contains a namespace. Same for Result:
/xml/GetADSLProfileResponse/Result
In my solution, I ignored the namespace, because the namespace URI you provided is incomplete. Here's a full program to get started:
String XML =
"<?xml version = \"1.0\" encoding = \"UTF-8\"?>\n"
+ "<ns0:GetADSLProfileResponse xmlns:ns0 = \"http://\">\n"
+ " <ns0:Result>\n"
+ " <ns0:eCode>0</ns0:eCode>\n"
+ " <ns0:eDesc>Success</ns0:eDesc>\n"
+ " </ns0:Result>\n"
+ "</ns0:GetADSLProfileResponse> ";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document;
try (InputStream in = new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8))) {
document = builder.parse(in);
}
XPath xPath = XPathFactory.newInstance().newXPath();
XPathExpression expr = xPath.compile("/*[local-name()='GetADSLProfileResponse']/*[local-name()='Result']/*");
NodeList nodeList = (NodeList) expr.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
System.out.println(node.getNodeName() + ": " + node.getTextContent());
}
It prints:
ns0:eCode: 0
ns0:eDesc: Success
See also:
How to query XML using namespaces in Java with XPath?
Node (Java Platform SE 8)

parse xml using dom java

I have the XML below:
<modelingOutput>
<listOfTopics>
<topic id="1">
<token id="354">wish</token>
</topic>
</listOfTopics>
<rankedDocs>
<topic id="1">
<documents>
<document id="1" numWords="0"/>
<document id="2" numWords="1"/>
<document id="3" numWords="2"/>
</documents>
</topic>
</rankedDocs>
<listOfDocs>
<documents>
<document id="1">
<topic id="1" percentage="4.790644689978203%"/>
<topic id="2" percentage="11.427632949428334%"/>
<topic id="3" percentage="17.86913349249596%"/>
</document>
</documents>
</listOfDocs>
</modelingOutput>
I want to parse this XML file and get the topic id and percentage from listOfDocs.
My first approach is to get all document elements from the XML and then check whether the grandparent node is listOfDocs.
But document elements exist in both rankedDocs and listOfDocs, so I get a very large list.
Is there a better way to parse this XML that avoids the if statement?
My code:
public void parse(){
Document dom = null;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(xml));
dom = db.parse(is);
Element doc = dom.getDocumentElement();
NodeList documentnl = doc.getElementsByTagName("document");
for (int i = 1; i <= documentnl.getLength(); i++) {
Node item = documentnl.item(i);
Node parentNode = item.getParentNode();
Node grandpNode = parentNode.getParentNode();
if(grandpNode.getNodeName() == "listOfDocs"){
//get value
}
}
}
First, when checking the node name you shouldn't compare Strings using ==. Always use the equals method instead.
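To illustrate the difference, here is a small sketch; the class NodeNameComparison and the helper isListOfDocs are hypothetical names introduced only for this example. It relies on the fact that == compares object references, while equals compares the characters.

```java
final class NodeNameComparison {

    /** Returns true when the given node name is exactly "listOfDocs". */
    static boolean isListOfDocs(String nodeName) {
        // Constant first, so a null nodeName yields false instead of a NullPointerException.
        return "listOfDocs".equals(nodeName);
    }

    public static void main(String[] args) {
        // A String created at runtime is a different object from the literal,
        // so the reference comparison below is false even though the text matches.
        String runtimeName = new String("listOfDocs");
        System.out.println(runtimeName == "listOfDocs"); // false
        System.out.println(isListOfDocs(runtimeName));   // true
    }
}
```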
You can use XPath to evaluate only the document topic elements under listOfDocs:
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
XPathExpression xPathExpression = xPath.compile("//listOfDocs//document/topic");
NodeList topicnl = (NodeList) xPathExpression.evaluate(dom, XPathConstants.NODESET);
for(int i = 0; i < topicnl.getLength(); i++) {
...
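The snippet above is cut off at the loop body. As a separate, self-contained sketch (not the answerer's own continuation), the same expression can be run against a shortened copy of the question's XML. The class ListOfDocsTopics and the method topicSummaries are hypothetical names, and reading the attributes through Element.getAttribute is just one possible choice.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ListOfDocsTopics {

    // Shortened copy of the XML from the question.
    static final String XML =
            "<modelingOutput>"
            + "<rankedDocs><topic id=\"1\"><documents>"
            + "<document id=\"1\" numWords=\"0\"/>"
            + "</documents></topic></rankedDocs>"
            + "<listOfDocs><documents><document id=\"1\">"
            + "<topic id=\"1\" percentage=\"4.790644689978203%\"/>"
            + "<topic id=\"2\" percentage=\"11.427632949428334%\"/>"
            + "</document></documents></listOfDocs>"
            + "</modelingOutput>";

    /** Returns one "id: percentage" string per topic element below listOfDocs. */
    static List<String> topicSummaries() {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            NodeList topicnl = (NodeList) xPath.evaluate(
                    "//listOfDocs//document/topic", dom, XPathConstants.NODESET);

            List<String> summaries = new ArrayList<>();
            for (int i = 0; i < topicnl.getLength(); i++) {
                // The expression selects elements only, so the cast is safe here.
                Element topic = (Element) topicnl.item(i);
                summaries.add(topic.getAttribute("id") + ": " + topic.getAttribute("percentage"));
            }
            return summaries;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        topicSummaries().forEach(System.out::println);
    }
}
```

Because the path starts at listOfDocs, the document elements under rankedDocs are never visited, so no grandparent check is needed.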
If you do not want to use the if statement you can use XPath to get the element you need directly.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("source.xml");
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("/*/listOfDocs/documents/document/topic");
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getAttributes().getNamedItem("id"));
System.out.println(nodes.item(i).getAttributes().getNamedItem("percentage"));
}
I like to use XMLBeam for such tasks:
public class Answer {
@XBDocURL("resource://data.xml")
public interface DataProjection {
public interface Topic {
@XBRead("./@id")
int getID();
@XBRead("./@percentage")
String getPercentage();
}
@XBRead("/modelingOutput/listOfDocs//document/topic")
List<Topic> getTopics();
}
public static void main(final String[] args) throws IOException {
final DataProjection dataProjection = new XBProjector().io().fromURLAnnotation(DataProjection.class);
for (Topic topic : dataProjection.getTopics()) {
System.out.println(topic.getID() + ": " + topic.getPercentage());
}
}
}
There is even a convenient way to convert the percentage to float or double. Tell me if you like to have an example.

How to get xml attribute values using Document builder factory

How can I get attribute values? With the following code I get ; as the output for msg. I want to print the MSID, type, CHID, SPOS, type and PPOS values. Can anyone help solve this?
String xml1="<message MSID='20' type='2635'>"
+"<che CHID='501' SPOS='2'>"
+"<pds type='S'>"
+"<position PPOS='S01'/>"
+"</pds>"
+"</che>"
+"</message>";
InputSource source = new InputSource(new StringReader(xml1));
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document document = db.parse(source);
XPathFactory xpathFactory = XPathFactory.newInstance();
XPath xpath = xpathFactory.newXPath();
String msg = xpath.evaluate("/message/che/CHID", document);
String status = xpath.evaluate("/pds/position/PPOS", document);
System.out.println("msg=" + msg + ";" + "status=" + status);
You need to use @ in your XPath to select an attribute. Your path for the second value is also wrong, because it must start at the root element:
String msg = xpath.evaluate("/message/che/@CHID", document);
String status = xpath.evaluate("/message/che/pds/position/@PPOS", document);
With those changes, I get an output of:
msg=501;status=S01
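The question asks for all six attribute values. A sketch along the same lines, using the XPath @ abbreviation for each attribute, could look like this. The class AttributeXPathDemo, the PATHS array and the method readAll are hypothetical names chosen for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class AttributeXPathDemo {

    static final String XML =
            "<message MSID='20' type='2635'>"
            + "<che CHID='501' SPOS='2'>"
            + "<pds type='S'>"
            + "<position PPOS='S01'/>"
            + "</pds>"
            + "</che>"
            + "</message>";

    // One absolute path per attribute, in the order listed in the question.
    static final String[] PATHS = {
        "/message/@MSID",
        "/message/@type",
        "/message/che/@CHID",
        "/message/che/@SPOS",
        "/message/che/pds/@type",
        "/message/che/pds/position/@PPOS"
    };

    /** Evaluates every path and joins the attribute values with ';'. */
    static String readAll() {
        try {
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            StringBuilder out = new StringBuilder();
            for (String path : PATHS) {
                if (out.length() > 0) {
                    out.append(';');
                }
                out.append(xpath.evaluate(path, document));
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAll());
    }
}
```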
You can use Document.getDocumentElement() to get the root element and Element.getElementsByTagName() to get child elements:
Document document = db.parse(source);
Element docEl = document.getDocumentElement(); // This is <message>
String msid = docEl.getAttribute("MSID");
String type = docEl.getAttribute("type");
Element position = (Element) docEl.getElementsByTagName("position").item(0);
String ppos = position.getAttribute("PPOS");
System.out.println(msid); // Prints "20"
System.out.println(type); // Prints "2635"
System.out.println(ppos); // Prints "S01"

Java XML - nested elements with same name

How can I reach elements that share the same name and are nested recursively, using the Java XML APIs? This worked with Python's ElementTree, but I need to get it running in Java.
I have tried:
String filepath = ("file.xml");
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse(filepath);
NodeList nl = doc.getElementsByTagName("*/*/foo");
Example
<foo>
<foo>
<foo>
</foo>
</foo>
</foo>
You seem to be under the impression that getElementsByTagName takes an XPath expression. It doesn't. As documented:
Returns a NodeList of all the Elements in document order with a given tag name and are contained in the document.
If you need to use XPath, you should look at the javax.xml.xpath package. Sample code:
Object set = xpath.evaluate("*/*/foo", doc, XPathConstants.NODESET);
NodeList list = (NodeList) set;
int count = list.getLength();
for (int i = 0; i < count; i++) {
Node node = list.item(i);
// Handle the node
}
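The sample above assumes an existing xpath object and a parsed doc. A self-contained sketch that fills in those pieces for the three nested foo elements might look like this; the class NestedFooDemo and the method count are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NestedFooDemo {

    // Same shape as the example in the question: three nested <foo> elements.
    static final String XML = "<foo><foo><foo></foo></foo></foo>";

    /** Returns how many nodes the expression selects, evaluated from the document node. */
    static int count(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList list = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return list.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Only the innermost <foo> sits exactly three levels below the document node.
        System.out.println(count("*/*/foo")); // 1
        // The descendant axis matches every <foo>, whatever its depth.
        System.out.println(count("//foo"));   // 3
    }
}
```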

Parsing xml name-value fields

In my XML, I am searching for specific names and want to retrieve their values.
For example, I have this field:
<n0:field>
<n0:name n4:type="n3:string" xmlns:n3="http://www.w3.org/2001/XMLSchema" xmlns:n4="http://www.w3.org/2001/XMLSchema-instance">LifePolicyID</n0:name>
<n0:value n6:type="n5:string" xmlns:n5="http://www.w3.org/2001/XMLSchema" xmlns:n6="http://www.w3.org/2001/XMLSchema-instance">1</n0:value>
</n0:field>
I am trying to get the value of the LifePolicyID name.
Is there a way to do it programmatically?
Right now I am using XPath like this:
XPathExpression xpe = xpath.compile("//*[name/text()='" + name +"']/value");
Here name is LifePolicyID. But it doesn't work.
Any ideas?
Your code seems to work for me:
String xml =
"<n0:field xmlns:n0='http://test/uri'>" +
" <n0:name n4:type='n3:string' xmlns:n3='http://www.w3.org/2001/XMLSchema' xmlns:n4='http://www.w3.org/2001/XMLSchema-instance'>LifePolicyID</n0:name>" +
" <n0:value n6:type='n5:string' xmlns:n5='http://www.w3.org/2001/XMLSchema' xmlns:n6='http://www.w3.org/2001/XMLSchema-instance'>1</n0:value>" +
"</n0:field>";
String name = "LifePolicyID";
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression xpe = xpath.compile("//*[name/text()='" + name +"']/value");
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
System.out.println(xpe.evaluate(doc));
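The code above parses with the factory's default settings, which are not namespace-aware. One possible reason the expression fails for the asker (an assumption, since the question does not show how the document was parsed) is a namespace-aware parser: in that case the unprefixed names name and value no longer match n0:name and n0:value. A sketch of a variant that matches on local-name() instead follows; the class NamespaceAwareLookup and its methods are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class NamespaceAwareLookup {

    static final String XML =
            "<n0:field xmlns:n0='http://test/uri'>"
            + " <n0:name n4:type='n3:string' xmlns:n3='http://www.w3.org/2001/XMLSchema'"
            + " xmlns:n4='http://www.w3.org/2001/XMLSchema-instance'>LifePolicyID</n0:name>"
            + " <n0:value n6:type='n5:string' xmlns:n5='http://www.w3.org/2001/XMLSchema'"
            + " xmlns:n6='http://www.w3.org/2001/XMLSchema-instance'>1</n0:value>"
            + "</n0:field>";

    /** Evaluates the expression as a string against a namespace-aware DOM. */
    static String evaluate(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // assumption: the failing setup parsed this way
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Looks up the value whose sibling name element has the given text, ignoring namespaces. */
    static String valueByLocalName(String name) {
        return evaluate("//*[*[local-name()='name']/text()='" + name + "']"
                + "/*[local-name()='value']");
    }

    public static void main(String[] args) {
        // Unprefixed names select elements in no namespace, so this finds nothing here.
        System.out.println("[" + evaluate("//*[name/text()='LifePolicyID']/value") + "]");
        System.out.println(valueByLocalName("LifePolicyID"));
    }
}
```

Registering a NamespaceContext and writing prefixed names in the expression is the stricter alternative when the namespace URI is known.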
