I have the XML below:
<modelingOutput>
<listOfTopics>
<topic id="1">
<token id="354">wish</token>
</topic>
</listOfTopics>
<rankedDocs>
<topic id="1">
<documents>
<document id="1" numWords="0"/>
<document id="2" numWords="1"/>
<document id="3" numWords="2"/>
</documents>
</topic>
</rankedDocs>
<listOfDocs>
<documents>
<document id="1">
<topic id="1" percentage="4.790644689978203%"/>
<topic id="2" percentage="11.427632949428334%"/>
<topic id="3" percentage="17.86913349249596%"/>
</document>
</documents>
</listOfDocs>
</modelingOutput>
I want to parse this XML file and get the topic id and percentage from listOfDocs.
My first approach is to get all document elements from the XML and then check whether each one's grandparent node is listOfDocs.
But document elements exist in both rankedDocs and listOfDocs, so I end up with a very large list.
Is there a better way to parse this XML that avoids the if statement?
My code:
public void parse(){
Document dom = null;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(xml));
dom = db.parse(is);
Element doc = dom.getDocumentElement();
NodeList documentnl = doc.getElementsByTagName("document");
for (int i = 0; i < documentnl.getLength(); i++) {
Node item = documentnl.item(i);
Node parentNode = item.getParentNode();
Node grandpNode = parentNode.getParentNode();
if(grandpNode.getNodeName() == "listOfDocs") {
//get value
}
}
}
First, when checking the node name you shouldn't compare Strings using ==. Always use the equals method instead.
You can use XPath to evaluate only the document topic elements under listOfDocs:
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
XPathExpression xPathExpression = xPath.compile("//listOfDocs//document/topic");
NodeList topicnl = (NodeList) xPathExpression.evaluate(dom, XPathConstants.NODESET);
for(int i = 0; i < topicnl.getLength(); i++) {
...
If you do not want to use the if statement you can use XPath to get the element you need directly.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("source.xml");
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("/*/listOfDocs/documents/document/topic");
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getAttributes().getNamedItem("id"));
System.out.println(nodes.item(i).getAttributes().getNamedItem("percentage"));
}
Please check GitHub project here.
Hope this helps.
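To read the attribute values as plain strings rather than printing the attribute nodes, a minimal sketch along the same lines could look like this. The class and method names are made up for illustration, and the XML is passed in as a String (as in the question) rather than read from source.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any answer above.
class ListOfDocsTopics {

    // Returns one "id=percentage" string per <topic> found under <listOfDocs>.
    static List<String> topicsUnderListOfDocs(String xml) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        // The full path skips the <topic> elements under <rankedDocs> and <listOfTopics>.
        NodeList topics = (NodeList) xPath.evaluate(
                "/modelingOutput/listOfDocs/documents/document/topic",
                dom, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < topics.getLength(); i++) {
            Element topic = (Element) topics.item(i);
            result.add(topic.getAttribute("id") + "=" + topic.getAttribute("percentage"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<modelingOutput><listOfDocs><documents><document id=\"1\">"
                + "<topic id=\"1\" percentage=\"4.790644689978203%\"/>"
                + "<topic id=\"2\" percentage=\"11.427632949428334%\"/>"
                + "</document></documents></listOfDocs></modelingOutput>";
        topicsUnderListOfDocs(xml).forEach(System.out::println);
    }
}
```

Element.getAttribute returns an empty string when the attribute is missing, so no null check is needed in the loop.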
I like to use XMLBeam for such tasks:
public class Answer {
@XBDocURL("resource://data.xml")
public interface DataProjection {
public interface Topic {
@XBRead("./@id")
int getID();
@XBRead("./@percentage")
String getPercentage();
}
@XBRead("/modelingOutput/listOfDocs//document/topic")
List<Topic> getTopics();
}
public static void main(final String[] args) throws IOException {
final DataProjection dataProjection = new XBProjector().io().fromURLAnnotation(DataProjection.class);
for (Topic topic : dataProjection.getTopics()) {
System.out.println(topic.getID() + ": " + topic.getPercentage());
}
}
}
There is even a convenient way to convert the percentage to float or double. Tell me if you would like an example.
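Independently of XMLBeam, the percentage strings can also be converted in plain Java. This is only a sketch of that swapped-in approach, not XMLBeam's own conversion; the class and method names are hypothetical, and it assumes every value ends with a % sign as in the question's XML.

```java
// Hypothetical helper, plain Java only (no XMLBeam involved).
class PercentageParser {

    // Converts a string such as "4.790644689978203%" to the double 4.790644689978203.
    static double parsePercentage(String text) {
        String trimmed = text.trim();
        if (!trimmed.endsWith("%")) {
            throw new IllegalArgumentException("Not a percentage: " + text);
        }
        // Drop the trailing '%' and parse the remaining digits.
        return Double.parseDouble(trimmed.substring(0, trimmed.length() - 1));
    }

    public static void main(String[] args) {
        System.out.println(parsePercentage("4.790644689978203%"));
    }
}
```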
I have an XML document that has multiple hhp:HourlyHistoricalPrice elements, as follows:
<?xml version="1.0"?>
<hhp:HourlyHistoricalPrices xmlns:hhp="urn:or-HourlyHistoricalPrices">
<hhp:HourlyHistoricalPrice xmlns:hhp="urn:or-HourlyHistoricalPrice">
<hhp:indexId>1025127</hhp:indexId>
<hhp:resetDate>20161231T000000</hhp:resetDate>
<hhp:refSource>AIBO</hhp:refSource>
<hhp:indexLocation/>
<hhp:price1>50,870000</hhp:price1>
...
<hhp:price48>43,910000</hhp:price48>
</hhp:HourlyHistoricalPrice>
<hhp:HourlyHistoricalPrice xmlns:hhp="urn:or-HourlyHistoricalPrice">
<hhp:indexId>1025127</hhp:indexId>
<hhp:resetDate>20160101T000000</hhp:resetDate>
<hhp:refSource>AIBO</hhp:refSource>
<hhp:indexLocation/>
<hhp:price1>51,870000</hhp:price1>
...
<hhp:price48>49,910000</hhp:price48>
</hhp:HourlyHistoricalPrice>
<hhp:HourlyHistoricalPrice xmlns:hhp="urn:or-HourlyHistoricalPrice">
<hhp:indexId>1025127</hhp:indexId>
<hhp:resetDate>20163112T000000</hhp:resetDate>
<hhp:refSource>APX</hhp:refSource>
<hhp:indexLocation/>
<hhp:price1>63,870000</hhp:price1>
...
<hhp:price48>29,910000</hhp:price48>
</hhp:HourlyHistoricalPrice>
</hhp:HourlyHistoricalPrices>
I want to retrieve only the hhp:HourlyHistoricalPrice nodes that have a particular value for hhp:refSource, e.g. AIBO.
I was trying the below XPathExpression but this retrieves nothing.
XPathFactory xpf = XPathFactory.newInstance();
XPath xpath = xpf.newXPath();
String strExprssion =
"/hhp:HourlyHistoricalPrices/hhp:HourlyHistoricalPrice[hhp:refSource='AIBO']";
XPathExpression expression = xpath.compile(strExprssion);
NodeList nodes = (NodeList) expression.evaluate(originalXmlDoc, XPathConstants.NODESET);
System.out.println(nodes.getLength());
I would be grateful if somebody could provide advice on the correct expression to use.
Thanks a lot.
You need to expand the prefix into the xml namespace it represents:
String strExprssion = "//urn:or-HourlyHistoricalPrice:HourlyHistoricalPrice[urn:or-HourlyHistoricalPrice:refSource='AIBO']";
So for me, this test class
public class XPathCheck {
public static void main(String[] args) throws FileNotFoundException, IOException, XPathExpressionException {
XPathFactory xpf = XPathFactory.newInstance();
XPath xpath = xpf.newXPath();
try (InputStream file = new FileInputStream(Paths.get("src", "inputFile.xml").toFile())) {
String strExprssion = "//urn:or-HourlyHistoricalPrice:HourlyHistoricalPrice[urn:or-HourlyHistoricalPrice:refSource='AIBO']";
XPathExpression expression = xpath.compile(strExprssion);
NodeList nodes = (NodeList) expression.evaluate(new InputSource(file), XPathConstants.NODESET);
System.out.println(nodes.getLength());
}
}
}
outputs "2".
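Another option, which does not depend on how a given XPath implementation treats the expanded name, is to parse with a namespace-aware factory and bind a prefix through a NamespaceContext. Note that in the posted XML the outer element and the inner elements declare different URIs for hhp, so a single prefix cannot match both levels. The sketch below binds its own prefix p (an arbitrary choice) to the inner URI; the class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: count price elements whose refSource is AIBO.
class NamespaceXPathSketch {
    static final String PRICE_NS = "urn:or-HourlyHistoricalPrice";

    static int countAibo(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // without this, prefixed XPath steps do not match
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Map the prefix "p" used in the expression to the inner namespace URI.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "p".equals(prefix) ? PRICE_NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String namespaceURI) {
                return PRICE_NS.equals(namespaceURI) ? "p" : null;
            }
            @Override public Iterator<String> getPrefixes(String namespaceURI) {
                return PRICE_NS.equals(namespaceURI)
                        ? Collections.singletonList("p").iterator()
                        : Collections.<String>emptyIterator();
            }
        });
        NodeList nodes = (NodeList) xpath.evaluate(
                "//p:HourlyHistoricalPrice[p:refSource='AIBO']", doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<hhp:HourlyHistoricalPrices xmlns:hhp='urn:or-HourlyHistoricalPrices'>"
                + "<hhp:HourlyHistoricalPrice xmlns:hhp='urn:or-HourlyHistoricalPrice'>"
                + "<hhp:refSource>AIBO</hhp:refSource></hhp:HourlyHistoricalPrice>"
                + "<hhp:HourlyHistoricalPrice xmlns:hhp='urn:or-HourlyHistoricalPrice'>"
                + "<hhp:refSource>APX</hhp:refSource></hhp:HourlyHistoricalPrice>"
                + "</hhp:HourlyHistoricalPrices>";
        System.out.println(countAibo(xml));
    }
}
```

The prefix in the expression does not have to be the one used in the document; only the namespace URI has to match.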
Hello, I am getting an unexpected error. Please help me out!
I want to search for a person by name and display all the available information about them.
In the following code I am trying to find the person with the first name Ivan. The translate call is copied from another XML question on Stack Overflow as a case-insensitive option.
public static void main(String[] args) {
try {
DocumentBuilderFactory factory = DocumentBuilderFactory
.newInstance();
Document doc = factory.newDocumentBuilder().parse(
new File("staff.xml"));
XPathFactory xFactory = XPathFactory.newInstance();
XPath xPath = xFactory.newXPath();
XPathExpression exp = xPath
.compile("/staff/person/name/firstName[contains(translate(text(), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz'), 'Ivan')]");
NodeList nl = (NodeList) exp.evaluate(doc.getFirstChild(),
XPathConstants.NODESET);
for (int index = 0; index < nl.getLength(); index++) {
Node node = nl.item(index);
System.out.println(node.getTextContent());
}
} catch (Exception ex) {
Logger.getLogger(TestXML05.class.getName()).log(Level.SEVERE, null,
ex);
}
}
And this is my XML example file:
<?xml version="1.0" encoding="utf-8"?>
<staff>
<person id="1" role="chief">
<name>
<firstName>Ivan</firstName>
<lastName>Popov</lastName>
</name>
<phone>
<phoneOne>0273090909</phoneOne>
<phoneTwo>0878123456</phoneTwo>
</phone>
<email>i.popov@fdiba.tu-sofia.bg</email>
<room>10202</room>
<title>Dr.Ing.</title>
</person>
<person id="2" role="dozent">
<name>
<firstName>Georgi</firstName>
<lastName>Ivanov</lastName>
</name>
<phone>
<phoneOne>029988115</phoneOne>
<phoneTwo>0888123333</phoneTwo>
</phone>
<email>g.ivanov@fdiba.tu-sofia.bg</email>
<room>10203</room>
<title>Dr.Ing.</title>
</person>
<person id="3" role="assistent">
<name>
<firstName>Petur</firstName>
<lastName>Kirilov</lastName>
</name>
<phone>
<phoneOne>028773455</phoneOne>
<phoneTwo>0898448576</phoneTwo>
</phone>
<email>p.kirilov@fdiba.tu-sofia.bg</email>
<room>10308</room>
<title>Ing.</title>
</person>
</staff>
Your XPath expression seems to be incorrect. Change it to /staff/person/name/firstName[contains(text(),'Georgi')]/../.. so that it selects the person node for the person with the first name Georgi.
public static void main(String[] args) {
try {
DocumentBuilderFactory factory = DocumentBuilderFactory
.newInstance();
Document doc = factory.newDocumentBuilder().parse(
new File("src/resources/staff.xml"));
XPathFactory xFactory = XPathFactory.newInstance();
XPath xPath = xFactory.newXPath();
XPathExpression exp = xPath
.compile("/staff/person/name/firstName[contains(text(),'Georgi')]/../..");
NodeList nl = (NodeList) exp.evaluate(doc,
XPathConstants.NODESET);
for (int index = 0; index < nl.getLength(); index++) {
Node node = nl.item(index);
if (node.hasAttributes()) {
Attr attr = (Attr) node.getAttributes().getNamedItem("role");
if (attr != null) {
String attribute= attr.getValue();
System.out.println("Person role : " + attribute);
}
}
System.out.println(node.getTextContent());
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
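If the case-insensitive match from the question is still wanted, note that translate(...) lower-cases the element text, so the search term has to be lower case too ('ivan', not 'Ivan'). The sketch below shows one way to do that; the class and method names and the $needle variable are assumptions for illustration, and the search term is passed through an XPathVariableResolver instead of being concatenated into the expression.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: case-insensitive first-name search returning person ids.
class StaffSearch {
    private static final String UPPER = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final String LOWER = "abcdefghijklmnopqrstuvwxyz";

    static List<String> findPersonIdsByFirstName(String xml, String firstName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        // Lower-case the search term so it can match the translated element text.
        String needle = firstName.toLowerCase(Locale.ROOT);
        xPath.setXPathVariableResolver(
                name -> "needle".equals(name.getLocalPart()) ? needle : null);
        // Select the <person> directly instead of climbing back up with /../..
        String expression = "/staff/person[contains(translate(name/firstName, '"
                + UPPER + "', '" + LOWER + "'), $needle)]";
        NodeList persons = (NodeList) xPath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < persons.getLength(); i++) {
            ids.add(((Element) persons.item(i)).getAttribute("id"));
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<staff>"
                + "<person id=\"1\"><name><firstName>Ivan</firstName><lastName>Popov</lastName></name></person>"
                + "<person id=\"2\"><name><firstName>Georgi</firstName><lastName>Ivanov</lastName></name></person>"
                + "</staff>";
        System.out.println(findPersonIdsByFirstName(xml, "IVAN"));
    }
}
```

Because only firstName is tested, the person with the last name Ivanov is not matched.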
I never really know how to work with XML tags. How do I traverse the nodes and print a particular node in the XML? Below is the XML file.
<Employees>
<Employee>
<Gender></Gender>
<Name>
<Firstname></Firstname>
<Lastname></Lastname>
</Name>
<Email></Email>
<Projects>
<Project></Project>
</Projects>
<PhoneNumbers>
<Home></Home>
<Office></Office>
</PhoneNumbers>
</Employee>
</Employees>
There is no data, but this is the structure. I am using the following code to parse it partially.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(false);
DocumentBuilder builder = factory.newDocumentBuilder();
Document xmlDocument = builder.parse("employees.xml");
System.out.println(xmlDocument.getDocumentElement().getNodeName());
I would like to print the gender and lastname values. How do I parse a tag that is inside the Name tag, which in turn is inside the Employee tag?
Regards.
You should use XPath. There is a good explanation in this post.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(<uri_as_string>);
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile(<xpath_expression>);
Try this.
String expression = "/Employees/Employee/Gender"; //read Gender value
NodeList nList = (NodeList) xPath.compile(expression).evaluate(document, XPathConstants.NODESET);
for (int j = 0; nList != null && j < nList.getLength(); j++) {
Node node = nList.item(j);
// getTextContent() is safe for empty elements, where getFirstChild() would return null
System.out.println(node.getTextContent());
}
expression = "/Employees/Employee/Name/Lastname"; //read Lastname value
nList = (NodeList) xPath.compile(expression).evaluate(document, XPathConstants.NODESET);
for (int j = 0; nList != null && j < nList.getLength(); j++) {
Node node = nList.item(j);
System.out.println(node.getTextContent());
}
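To keep each employee's gender and last name together, one option is to select the Employee nodes first and then evaluate relative paths against each of them. This is a sketch under that assumption; the class and method names and the "|" separator are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: one "gender|lastname" entry per <Employee>.
class EmployeeReader {

    static List<String> genderAndLastname(String xml) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList employees = (NodeList) xPath.evaluate(
                "/Employees/Employee", document, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < employees.getLength(); i++) {
            Node employee = employees.item(i);
            // Relative paths are resolved against the current <Employee> node.
            // evaluate(String, Object) returns the text, or "" for an empty or missing element.
            String gender = xPath.evaluate("Gender", employee);
            String lastname = xPath.evaluate("Name/Lastname", employee);
            result.add(gender + "|" + lastname);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Employees><Employee><Gender>F</Gender>"
                + "<Name><Firstname>Ann</Firstname><Lastname>Smith</Lastname></Name>"
                + "</Employee></Employees>";
        genderAndLastname(xml).forEach(System.out::println);
    }
}
```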
How can I reach elements that have the same name and are nested recursively, using Java XML? This worked in Python's ElementTree, but I need to get it running in Java.
I have tried:
String filepath = ("file.xml");
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse(filepath);
NodeList nl = doc.getElementsByTagName("*/*/foo");
Example
<foo>
<foo>
<foo>
</foo>
</foo>
</foo>
You seem to be under the impression that getElementsByTagName takes an XPath expression. It doesn't. As documented:
Returns a NodeList of all the Elements in document order with a given tag name and are contained in the document.
If you need to use XPath, you should look at the javax.xml.xpath package. Sample code:
Object set = xpath.evaluate("*/*/foo", doc, XPathConstants.NODESET);
NodeList list = (NodeList) set;
int count = list.getLength();
for (int i = 0; i < count; i++) {
Node node = list.item(i);
// Handle the node
}
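To make the difference concrete, here is a small sketch that runs both approaches against the nested foo example. The class and method names are hypothetical; the expressions are the ones discussed above plus //foo for comparison.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch comparing getElementsByTagName with XPath evaluation.
class NestedFoo {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Number of nodes selected by an XPath expression, evaluated from the document node.
    static int countByXPath(String xml, String expression) throws Exception {
        NodeList list = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, parse(xml), XPathConstants.NODESET);
        return list.getLength();
    }

    // Number of elements with exactly this tag name, at any depth.
    static int countByTagName(String xml, String tagName) throws Exception {
        return parse(xml).getElementsByTagName(tagName).getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<foo><foo><foo/></foo></foo>";
        System.out.println(countByTagName(xml, "*/*/foo")); // 0: treated as a literal tag name
        System.out.println(countByTagName(xml, "foo"));     // 3: every foo at any depth
        System.out.println(countByXPath(xml, "*/*/foo"));   // 1: only the innermost foo
        System.out.println(countByXPath(xml, "//foo"));     // 3: every foo at any depth
    }
}
```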
Here is my code, maybe you will notice right away what I'm missing :
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse(fileName);
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr = xpath.compile("//CustomerId");
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
Text a = doc.createTextNode("value");
Element p = doc.createElement("newNode");
p.appendChild(a);
for (int i = 0; i < nodes.getLength(); i++) {
nodes.item(i).insertBefore(p, nodes.item(i));
}
I'm trying to insert a new node (<newNode>value</newNode>) before the existing CustomerId node. Here is my sample XML file:
<Customer>
<names>
<firstName>fName</firstName>
<lastName>lName</lastName>
<middleName>nName</middleName>
<nickName/>
</names>
<addressList>
<address>
<streetInfo>
<houseNumber>22</houseNumber>
<baseName>Street base name</baseName>
<district>kewl district</district>
</streetInfo>
<zipcode>22231</zipcode>
<state>xxx</state>
<country>xxxz</country>
<primary>true</primary>
</address>
</addressList>
<CustomerId/>
<SSN>561381</SSN>
<phone>
<homePhone>123123123</homePhone>
<officePhone/>
<homePhone>21319414</homePhone>
</phone>
<preferred>true</preferred>
</Customer>
This is the exception being thrown. I just don't know what else to try:
NOT_FOUND_ERR: An attempt is made to reference a node in a context where it does not exist.
Here is an example I just tested using the XML sample you provided.
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setIgnoringComments(true);
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse(new File("XmlTest.xml"));
NodeList nodes = doc.getElementsByTagName("CustomerId");
Text a = doc.createTextNode("value");
Element p = doc.createElement("newNode");
p.appendChild(a);
nodes.item(0).getParentNode().insertBefore(p, nodes.item(0));
Here is the result:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Customer>
<names>
<firstName>fName</firstName>
<lastName>lName</lastName>
<middleName>nName</middleName>
<nickName/>
</names>
<addressList>
<address>
<streetInfo>
<houseNumber>22</houseNumber>
<baseName>Street base name</baseName>
<district>kewl district</district>
</streetInfo>
<zipcode>22231</zipcode>
<state>xxx</state>
<country>xxxz</country>
<primary>true</primary>
</address>
</addressList>
<newNode>value</newNode>
<CustomerId/>
<SSN>561381</SSN>
<phone>
<homePhone>123123123</homePhone>
<officePhone/>
<homePhone>21319414</homePhone>
</phone>
<preferred>true</preferred>
</Customer>
If you're interested, here's the sample code I used to show the result:
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlOutput = result.getWriter().toString();
System.out.println(xmlOutput);
I think you want to insert into the parent, not the child:
nodes.item(i).getParentNode().insertBefore(p, nodes.item(i));
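One more thing to watch when the document contains several CustomerId elements: a DOM node can only be in one place, so inserting the same p repeatedly just moves it, and it ends up only before the last match. A minimal sketch that inserts a fresh copy each time, assuming a hypothetical helper class and using cloneNode(true):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: put <newNode>value</newNode> before every <CustomerId>.
class InsertBeforeEach {

    static Document insertBeforeEachCustomerId(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Build the element once and use it as a template.
        Element template = doc.createElement("newNode");
        template.appendChild(doc.createTextNode("value"));

        // The list is live, but inserting <newNode> elements does not add or
        // remove any <CustomerId>, so iterating by index is safe here.
        NodeList targets = doc.getElementsByTagName("CustomerId");
        for (int i = 0; i < targets.getLength(); i++) {
            Node target = targets.item(i);
            // insertBefore must be called on the parent of the reference node;
            // a deep clone gives every target its own copy.
            target.getParentNode().insertBefore(template.cloneNode(true), target);
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = insertBeforeEachCustomerId(
                "<Customers><Customer><CustomerId/></Customer>"
                + "<Customer><CustomerId/></Customer></Customers>");
        System.out.println(doc.getElementsByTagName("newNode").getLength());
    }
}
```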