Xml parser - Getting information inside comment - java

My objective is to access some information inside commented-out XML, in this case the text inside the town tag.
For an HTML extract like this:
<script type="text/xml">
<!--
<world>
<city>
<town>
London
</town>
</city>
<city>
<town>
New York
</town>
</city>
</world>
-->
</script>
I want to get that "London" and "New York".
My code is:
doc.getDocumentElement().normalize();
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("city");
System.out.println("----------------------------");
for (int temp = 0; temp < nList.getLength(); temp++) {
    Node nNode = nList.item(temp);
    if (nNode.getNodeType() == Node.ELEMENT_NODE) {
        Element eElement = (Element) nNode;
        // Tag names are case-sensitive: the element is <town>.
        System.out.println("Town : " + eElement.getElementsByTagName("town").item(0).getTextContent());
    }
}
Yet this code does not work. But if I remove the comment markers, it does. Why doesn't it work with the comments, and how can I fix it?
Thank you!

It doesn't work because, in your example, city is not an XML element; it is part of the text of a comment. The parser keeps a comment as a single comment node and does not parse its contents, so doc.getElementsByTagName("city") has no elements to return.
You can read a bit more about it here
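One possible workaround, sketched here under assumptions, is to read the comment node's text and parse that text as a second XML document. The class and method names below are hypothetical, and the sketch assumes the surrounding markup is itself well-formed XML; a real HTML page often is not, and would need an HTML parser to reach the comment first.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CommentedXmlReader {

    // Parses an XML string into a DOM document (comments are kept by default).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the trimmed text of every <town> found inside comments that are
    // direct children of the root element.
    static List<String> townsInsideComments(String outerXml) throws Exception {
        Document outer = parse(outerXml);
        List<String> towns = new ArrayList<>();
        NodeList children = outer.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.COMMENT_NODE) {
                continue;
            }
            // The comment body is plain text after the first parse, so parse it again.
            Document inner = parse(child.getNodeValue().trim());
            NodeList townNodes = inner.getElementsByTagName("town");
            for (int j = 0; j < townNodes.getLength(); j++) {
                towns.add(townNodes.item(j).getTextContent().trim());
            }
        }
        return towns;
    }

    public static void main(String[] args) throws Exception {
        String html = "<script type=\"text/xml\">\n<!--\n<world>\n"
                + "<city>\n<town>\nLondon\n</town>\n</city>\n"
                + "<city>\n<town>\nNew York\n</town>\n</city>\n"
                + "</world>\n-->\n</script>";
        System.out.println(townsInsideComments(html)); // prints [London, New York]
    }
}
```

The second parse fails if the comment text is not a well-formed document on its own, so this only suits comments that wrap a complete XML tree.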

Related

How to read the values from a XML file using Java when all the tags are same..?

My XML looks something like below
<var id="attr1">
<attr1>
<var id="key1">value1</var>
<var id="key2">value2</var>
<var id="key3">value3</var>
</attr1>
</var>
<var id="attr2">
<attr2>
<var id="key1">value4</var>
<var id="key2">value5</var>
<var id="key3">value6</var>
<var id="key4">value7</var>
</attr2>
</var>
I am trying to get the values (which are unique) from the XML above. The key names can repeat, and every tag has the same name ("var"), which makes it hard to pick out a single value. The attribute names (attr1, attr2) are unique as well. I use the code below:
NodeList nList = doc.getElementsByTagName("var");
for (int temp = 0; temp < nList.getLength(); temp++) {
    Node nNode = nList.item(temp);
    System.out.println("Current Element :" + nNode.getNodeName());
    if (nNode.getNodeType() == Node.ELEMENT_NODE) {
        Element eElement = (Element) nNode;
        System.out.println("Parent id : "
                + eElement.getAttribute("id"));
    }
}
When I run the code above, I get all the keys. Is there a way to get the value of a specific key? How do I traverse down to a value (say, value1)?
You can get an element's text content by calling eElement.getTextContent().
See the Javadoc for the Node interface (which Element extends).
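To reach one specific value rather than all of them, an XPath expression that matches on the id attributes is one option. The sketch below uses the standard javax.xml.xpath API; the class name, the method name, and the wrapping <vars> root element are assumptions added for illustration (the fragment in the question has two top-level elements, so it needs some root to be parsed as a document).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class VarLookup {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the text of the <var id=key> nested under <var id=group>,
    // or an empty string when no such element exists.
    static String valueOf(Document doc, String group, String key) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Sketch only: the ids are pasted into the expression, so they must not contain quotes.
        String expression = "//var[@id='" + group + "']//var[@id='" + key + "']";
        return xpath.evaluate(expression, doc).trim();
    }

    public static void main(String[] args) throws Exception {
        // <vars> is a hypothetical wrapper so the fragment has a single root.
        String xml = "<vars>"
                + "<var id=\"attr1\"><attr1>"
                + "<var id=\"key1\">value1</var><var id=\"key2\">value2</var>"
                + "</attr1></var>"
                + "<var id=\"attr2\"><attr2>"
                + "<var id=\"key1\">value4</var><var id=\"key4\">value7</var>"
                + "</attr2></var>"
                + "</vars>";
        Document doc = parse(xml);
        System.out.println(valueOf(doc, "attr1", "key1")); // prints value1
        System.out.println(valueOf(doc, "attr2", "key1")); // prints value4
    }
}
```

Matching on both the outer and the inner id is what disambiguates keys that repeat across groups, such as key1 here.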

How to add nodes from another xml using xmlbeans

I am using XMLBeans to generate an XML document, and I need to extract all the children from another XML file and insert them into my current document.
The to_be_added.xml:
<root>
<style>
.....
</style>
<atlas img="styles/jmap.png">
....
</atlas>
.....
</root>
This XML file does not have a schema, so I have not created Java classes to map it. Think of it as a plain XML file.
I want the style and atlas nodes added. I use the following code:
XmlObject pointRoot = XmlObject.Factory.parse(Main.class.getResourceAsStream("to_be_added.xml"));
NodeList nodeList = pointRoot.getDomNode().getChildNodes();
Node themeNode = renderthemeDoc.getDomNode();
for (int i = 0; i < nodeList.getLength(); i++) {
    Node node = nodeList.item(i);
    themeNode.appendChild(node);
}
Then I got error:
Exception in thread "main"
org.apache.xmlbeans.impl.store.DomImpl$WrongDocumentErr: Child to add
is from another document
I then found this post by searching "child to .... another document": how to add a xml document to another xml document in java, which said that the connection between the element and its document has to be broken before the element can be added to another document.
So I tried to build the Document objects (that is why the variables pointDoc and themeDoc exist):
XmlObject pointRoot = XmlObject.Factory.parse(Main.class.getResourceAsStream("to_be_added.xml"));
Document pointDoc = pointRoot.getDomNode().getOwnerDocument();
System.out.println(pointDoc);
Element element = pointDoc.getDocumentElement();
NodeList nodeList = element.getChildNodes();
Document themeDoc = myCurrentDoc.getDomNode().getOwnerDocument();
for (int i = 0; i < nodeList.getLength(); i++) {
    Node node = nodeList.item(i);
    node = themeDoc.importNode(node, true);
    themeDoc.appendChild(node);
}
Then I got a NullPointerException, which indicated that pointDoc is null.
That is the whole process how I try to solve this problem. If it is unclear, please tell me, I will update accordingly.
Is it possible to fix it?
Since your other XML file is not mapped to a class, you can use a regular DOM parser to read it and extract its nodes. Alternatively, the generic XmlObject.Factory still gives you access to the nodes:
XmlObject pointRoot = XmlObject.Factory.parse( "<root>\n" +
" <style>\n" +
" </style>\n" +
" <atlas img=\"styles/jmap.png\">\n" +
" </atlas>\n" +
"</root>");
Node pointDoc = pointRoot.getDomNode().getFirstChild();
NodeList nodeList = pointDoc.getChildNodes();
for (int i = 0; i < nodeList.getLength(); i++) {
    System.out.println("Node: " + nodeList.item(i).getNodeName());
}
This will print:
Node: #text
Node: style
Node: #text
Node: atlas
Node: #text
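Once the child nodes are reachable, the copy step itself could look like the sketch below. It uses the standard JAXP DOM parser rather than XMLBeans, so whether XMLBeans' DOM nodes behave identically is an assumption; the class name, the method name, and the rendertheme root element are hypothetical. Two details matter: importNode makes a copy owned by the target document, and the copy is appended to the target's root element, since a Document can hold only one element child.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildImporter {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Copies every child element of the source root under the target root
    // and returns how many elements were copied.
    static int importChildElements(Document source, Document target) {
        Element targetRoot = target.getDocumentElement();
        NodeList children = source.getDocumentElement().getChildNodes();
        int imported = 0;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip the whitespace #text nodes between elements
            }
            // importNode leaves the source untouched and returns a detached copy.
            targetRoot.appendChild(target.importNode(child, true));
            imported++;
        }
        return imported;
    }

    public static void main(String[] args) throws Exception {
        Document source = parse("<root>\n<style/>\n<atlas img=\"styles/jmap.png\"/>\n</root>");
        Document target = parse("<rendertheme/>"); // hypothetical target root
        System.out.println(importChildElements(source, target)); // prints 2
    }
}
```

Because importNode copies instead of moving, the source NodeList does not shrink while the loop runs, which keeps the index-based iteration safe.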

Java parse xml i18n

I'm doing i18n for my web app and I have a list of error messages in an xml file, like this:
<ErrorList>
<Errors culture="en">
<Error id="InvalidBooking">
<Heading1>We are unable to find your booking.</Heading1>
<Heading2></Heading2>
<Description1>Please try again.</Description1>
<Description2>OR</Description2>
<Description3>Seek assistance at the Check-In counter.</Description3>
<Description4></Description4>
</Error>
</Errors>
<Errors culture="zh">
<Error id="InvalidBooking">
<Heading1>我们无法找到您的预订。</Heading1>
<Heading2></Heading2>
<Description1>请再试一次。.</Description1>
<Description2>或</Description2>
<Description3>请联系登机柜台寻求帮助。</Description3>
<Description4></Description4>
</Error>
</Errors>
</ErrorList>
I use the DOM xml parser to load the error messages. Then I get the specific error by matching Error id:
DocumentBuilder builder = DocumentBuilderFactory.newInstance().
        newDocumentBuilder();
Document doc = builder.parse(xml);
doc.getDocumentElement().normalize();
NodeList nList = doc.getElementsByTagName("Error");
for (int temp = 0; temp < nList.getLength(); temp++) {
    Node nNode = nList.item(temp);
    if (nNode.getNodeType() == Node.ELEMENT_NODE) {
        Element eElement = (Element) nNode;
        Error _error = new Error();
        _error.setKey(eElement.getAttribute("id"));
        _error.setHeading1(eElement.getElementsByTagName("Heading1").item(0).getTextContent());
        _error.setHeading2(eElement.getElementsByTagName("Heading2").item(0).getTextContent());
        _error.setDescription1(eElement.getElementsByTagName("Description1").item(0).getTextContent());
        _error.setDescription2(eElement.getElementsByTagName("Description2").item(0).getTextContent());
        _error.setDescription3(eElement.getElementsByTagName("Description3").item(0).getTextContent());
        _error.setDescription5(eElement.getElementsByTagName("Description4").item(0).getTextContent());
        getErrors().add(_error);
    }
}
How do I specify the culture so that only the error messages in that section are loaded? For example, when I switch to Chinese, it should look up Errors culture="zh" and parse only that section.
Thanks.
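In your for loop, check the culture attribute before building the Error. Note that culture is set on the parent Errors element, not on Error itself, and that getAttribute is defined on Element rather than Node, so cast the parent node first:
Element parent = (Element) eElement.getParentNode();
if ("zh".equals(parent.getAttribute("culture"))) {
    // do it here...
}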
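A slightly different arrangement, sketched below, is to select the matching Errors section first and then read only the Error elements inside it. The class and method names are hypothetical, and a plain map from error id to Heading1 text stands in for the asker's Error class, whose definition is not shown.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ErrorCatalog {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Maps each Error id to its Heading1 text, reading only the <Errors>
    // section whose culture attribute matches.
    static Map<String, String> headingsFor(Document doc, String culture) {
        Map<String, String> headings = new LinkedHashMap<>();
        NodeList sections = doc.getElementsByTagName("Errors");
        for (int i = 0; i < sections.getLength(); i++) {
            // getElementsByTagName only returns elements, so the cast is safe.
            Element section = (Element) sections.item(i);
            if (!culture.equals(section.getAttribute("culture"))) {
                continue;
            }
            NodeList errors = section.getElementsByTagName("Error");
            for (int j = 0; j < errors.getLength(); j++) {
                Element error = (Element) errors.item(j);
                // Assumes every Error has a Heading1 child, as in the sample file.
                String heading = error.getElementsByTagName("Heading1").item(0).getTextContent();
                headings.put(error.getAttribute("id"), heading);
            }
        }
        return headings;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ErrorList>"
                + "<Errors culture=\"en\"><Error id=\"InvalidBooking\">"
                + "<Heading1>We are unable to find your booking.</Heading1></Error></Errors>"
                + "<Errors culture=\"zh\"><Error id=\"InvalidBooking\">"
                + "<Heading1>(zh heading)</Heading1></Error></Errors>"
                + "</ErrorList>";
        System.out.println(headingsFor(parse(xml), "en"));
    }
}
```

Filtering at the section level means the culture check runs once per section instead of once per error.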

Parsing XML file Using Java (DOM parser)

Ok, so I have been able to kind of parse through this xml file. But I am unable to get to the section I want.
http://www.faroo.com/api?q=iphone&start=1&length=10&l=en&src=news&f=rss
This is the URL of the XML, because it looks very ugly pasted in here. I have copied the XML to a file. The part that I need is the "title" in the first "item". I tried this code:
System.out.println(myDocument.getElementsByTagName("item").item(0).getTextContent());
This prints all of the contents of the first "item", such as "title", "link" and "description", but I do not want all of it; I only want "title" to be printed. I am having problems getting it to work exactly right, but I feel like I am close. Any help will be appreciated. Thanks.
From the Oracle documentation on the org.w3c.dom package:
This attribute returns the text content of this node and its descendants.
Your code is calling getTextContent() on the item tag. If you modify your code so that it retrieves the text from the title tag, it works correctly.
System.out.println(myDocument.getElementsByTagName("item").item(0).getFirstChild().getTextContent());
Note that this relies on title being the first child node of item. If whitespace precedes the title tag, the first child is a text node instead, so you may want to switch to a more order-independent solution.
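A minimal order-independent sketch is to look up title by name inside the first item, rather than relying on child position. The class name, the method name, and the sample feed below are hypothetical; only the item and title tag names come from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FirstItemTitle {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Returns the title of the first <item>, or null when there is no item
    // or the item has no <title>.
    static String firstItemTitle(Document doc) {
        NodeList items = doc.getElementsByTagName("item");
        if (items.getLength() == 0) {
            return null;
        }
        Element firstItem = (Element) items.item(0);
        // Searching within the item ignores the channel-level <title>.
        NodeList titles = firstItem.getElementsByTagName("title");
        return titles.getLength() == 0 ? null : titles.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical feed: <link> comes before <title> inside the item.
        String rss = "<rss><channel><title>Feed</title>"
                + "<item>\n<link>http://example.com/1</link><title>First</title></item>"
                + "<item><title>Second</title></item>"
                + "</channel></rss>";
        System.out.println(firstItemTitle(parse(rss))); // prints First
    }
}
```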
Below is code that iterates through the whole RSS feed and gets all the titles, links and descriptions. You can create an object that has title, link and description as fields and use it as you please:
try {
    File fXmlFile = new File("api.xml");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(fXmlFile);
    doc.getDocumentElement().normalize();
    NodeList nList = doc.getElementsByTagName("item");
    for (int temp = 0; temp < nList.getLength(); temp++) {
        Node nNode = nList.item(temp);
        if (nNode.getNodeType() == Node.ELEMENT_NODE) {
            Element eElement = (Element) nNode;
            System.out.println("title : " + eElement.getElementsByTagName("title").item(0).getTextContent());
            System.out.println("link : " + eElement.getElementsByTagName("link").item(0).getTextContent());
            System.out.println("description : " + eElement.getElementsByTagName("description").item(0).getTextContent());
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
Hope that helps.

Java XML importNode function not working as expected

My XML looks like this.
I would like to "export" collected_objects into another document. Here is my code:
NodeList nList = reader.getElementsByTagName("collected_objects");
for (int temp = 0; temp < nList.getLength(); temp++) {
    Node nNode = nList.item(temp);
    output.importNode(nNode, true);
}
output refers to the new document I want to write to.
The code is not importing anything from the source document. All I get is the XML "header": <?xml version="1.0" encoding="UTF-8" standalone="no"?>
I was expecting that since I've set deep to true, all of the child nodes will be imported but that is not happening.
What am I doing wrong?
importNode only creates a copy of the node that is owned by the target document. You still have to attach that copy somewhere using Node.appendChild(child).
Use importNode this way:
// Assumes doc is the (still empty) target document and iterator walks the source nodes to copy.
Element rootElement = doc.createElement("collected_objects");
doc.appendChild(rootElement);
for (Node n = iterator.nextNode(); n != null; n = iterator.nextNode()) {
    rootElement.appendChild(doc.importNode(n, true));
}
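Put together end to end, the import-then-append step might look like the following sketch, which also serializes the result so the effect is visible. The names reader and output follow the question; the class name, the method name, and the sample XML are hypothetical, since the original XML is not shown. Only the first collected_objects match is copied, because the new document can hold a single root element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class CollectedObjectsExporter {

    // Copies the first <collected_objects> element of the source XML into a
    // new document and returns that document serialized as a string.
    static String export(String sourceXml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document reader = builder.parse(new InputSource(new StringReader(sourceXml)));
        Document output = builder.newDocument();

        NodeList nList = reader.getElementsByTagName("collected_objects");
        if (nList.getLength() > 0) {
            // importNode returns a deep copy owned by output; it is not in the tree yet.
            Node copy = output.importNode(nList.item(0), true);
            // Appending the copy makes it the root element of the new document.
            output.appendChild(copy);
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(output), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical source document.
        String xml = "<definitions><collected_objects><object id=\"1\"/></collected_objects>"
                + "<other/></definitions>";
        System.out.println(export(xml));
    }
}
```

Discarding the return value of importNode, as in the question's loop, leaves the copy detached, which is why only the XML declaration was written out.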
