I am trying to write code that can parse any XML file and print its contents. I am using the DOM parser. I can get the name of the root tag, but I can't obtain the tag names of its immediate children. This is easy with getElementsByTagName when the node names are known in advance. Is there a way to do it when they are not?
My code looks like this:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(file);
doc.getDocumentElement().normalize();
String rootName = doc.getDocumentElement().getNodeName(); // this gets me the name of the root node
Now, how can I get the names of the immediate child nodes so that I can traverse the XML using getElementsByTagName("x")?
Thanks in advance.
getChildNodes() returns all children of an element. The list will contain more than just elements (whitespace text nodes and comments, for example), so you'll have to check whether each child node is an element:
NodeList nodes = doc.getDocumentElement().getChildNodes();
for (int i = 0; i < nodes.getLength(); i++) {
    Node node = nodes.item(i); // NodeList has item(int), not get(int)
    if (node instanceof Element) {
        Element childElement = (Element) node;
        System.out.println("tag name: " + childElement.getTagName());
    }
}
I have to extract the tag value from an XML document that contains a single tag like the one below:
<error>Permission denied</error>
I have tried:
String xmlRecords = "<error>Permission denied</error>";
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(xmlRecords));
Document doc = db.parse(is);
Node nodes = doc.getFirstChild();
String value = nodes.getNodeValue();
but it doesn't work.
How can I do it?
Use doc.getDocumentElement().getTextContent() to get the string Permission denied.
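For reference, here is a minimal self-contained sketch of that suggestion. The class name ErrorTextExample and the helper rootText are made up for illustration. It also shows why the original attempt came back empty: getNodeValue() is defined to return null for element nodes, whereas getTextContent() returns the text inside the element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ErrorTextExample {
    // Hypothetical helper: parses an XML string and returns the text
    // contained in its root element.
    static String rootText(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));
        // getNodeValue() on an element is null; getTextContent() gives the text.
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(rootText("<error>Permission denied</error>"));
    }
}
```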
With DOM it is useful to know the structure of the XML document and which node level you are looking for.
Once you have the Document, you can use document.getElementsByTagName("root") to find the root or parent tag, then get its children as a list and search it for the item. Something like this:
NodeList listresults = document.getElementsByTagName("root"); // name of the root or parent element
NodeList nl = listresults.item(0).getChildNodes();
// Iterate over the child nodes
for (int temp = 0; temp < nl.getLength(); temp++) {
    Node node = nl.item(temp);
    // Check whether it is an element node
    if (node.getNodeType() == Node.ELEMENT_NODE) {
        Element element = (Element) node;
        if (element.getNodeName().equals("error")) {
            // check the element
        }
    }
}
I hope this helps you.
Just try the following code:
String value = nodes.getTextContent();
If you use the approach above, you have to build the string yourself. You can get the tag name and its content with these methods:
tag name = nodes.getNodeName()
tag value = nodes.getTextContent()
I guess this is what you want:
// Search from the document: Element.getElementsByTagName only returns
// descendants, so it would miss an <error> that is itself the root element.
NodeList errorTagList = document.getElementsByTagName("error");
if (errorTagList != null && errorTagList.getLength() > 0) {
    NodeList errorTagSubList = errorTagList.item(0).getChildNodes();
    if (errorTagSubList != null && errorTagSubList.getLength() > 0) {
        String value = errorTagSubList.item(0).getNodeValue();
    }
}
How do I list the element names at a given level in an xml schema hierarchy? The code I have below is listing all element names at every level of the hierarchy, with no concept of nesting.
Here is my xml file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><?xml-stylesheet type="text/xsl" href="CDA.xsl"?>
<SomeDocument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:something">
<title>some title</title>
<languageCode code="en-US"/>
<versionNumber value="1"/>
<recordTarget>
<someRole>
<id extension="998991"/>
<addr use="HP">
<streetAddressLine>1357 Amber Drive</streetAddressLine>
<city>Beaverton</city>
<state>OR</state>
<postalCode>97867</postalCode>
<country>US</country>
</addr>
<telecom value="tel:(816)276-6909" use="HP"/>
</someRole>
</recordTarget>
</SomeDocument>
Here is my java method for importing and iterating the xml file:
public static void parseFile() {
    //get the factory
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
        //Using factory get an instance of document builder
        DocumentBuilder db = dbf.newDocumentBuilder();
        //parse using builder to get DOM representation of the XML file
        Document dom = db.parse("D:\\mypath\\somefile.xml");
        //get the root element
        Element docEle = dom.getDocumentElement();
        //get a nodelist of elements
        NodeList nl = docEle.getElementsByTagName("*");
        if (nl != null && nl.getLength() > 0) {
            for (int i = 0; i < nl.getLength(); i++) {
                Node node = nl.item(i);
                if (node.getNodeType() == Node.ELEMENT_NODE) {
                    System.out.println("node.getNodeName() is: " + node.getNodeName());
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
The output of the above program is:
title
languageCode
versionNumber
recordTarget
someRole
id
addr
streetAddressLine
city
state
postalCode
country
telecom
Instead, I would like to output the following:
title
languageCode
versionNumber
recordTarget
It would be nice to then be able to list the children of recordTarget as someRole, and then to list the children of someRole as id, addr, and telecom. And so on, but at my discretion in the code. How can I change my code to get the output that I want?
You're getting every descendant element, at every depth, with this line:
NodeList nl = docEle.getElementsByTagName("*");
Change it to
NodeList nl = docEle.getChildNodes();
to get only its immediate children. Since your loop already skips non-element nodes, your print statement will then give you the output you're looking for.
Then, when you iterate through your NodeList, you can choose to call the same method on each Node you retrieve:
NodeList children = node.getChildNodes();
If you want to print an XML-like structure, perhaps a recursive method that prints all child nodes is what you are looking for.
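A rough sketch of such a recursive method follows. The names ElementTreePrinter, collect and outline are invented for this example, and the XML is passed in as a string instead of being read from the file path in the question. Each call handles only the immediate children of one node, so you can stop descending at whatever level you like.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ElementTreePrinter {
    // Adds the names of the element children of 'parent' to 'out',
    // then recurses into each child with a deeper indent.
    static void collect(Node parent, String indent, List<String> out) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip text nodes, comments and other non-element children.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                out.add(indent + child.getNodeName());
                collect(child, indent + "  ", out);
            }
        }
    }

    // Hypothetical convenience wrapper: parses a string and returns the
    // indented outline of everything below the root element.
    static List<String> outline(String xml) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> out = new ArrayList<>();
        collect(dom.getDocumentElement(), "", out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<SomeDocument><title>some title</title>"
                + "<recordTarget><someRole><id/></someRole></recordTarget></SomeDocument>";
        for (String line : outline(xml)) {
            System.out.println(line);
        }
    }
}
```

To list a single level only, drop the recursive call and invoke collect on whichever node you are interested in.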
You could re-write the parseFile (I'd rather call it parseChildrenElementNames) method to take an input String that specifies the element name for which you want to print out its children element names:
public static void parseChildrenElementNames(String parentElementName) {
    // get the factory
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
        // Using factory get an instance of document builder
        DocumentBuilder db = dbf.newDocumentBuilder();
        // parse using builder to get DOM representation of the XML file
        Document dom = db.parse("D:\\mypath\\somefile.xml");
        // find all elements with the requested name
        NodeList elementsByTagName = dom.getElementsByTagName(parentElementName);
        // item(0) returns null for an empty list, so check the length first
        if (elementsByTagName.getLength() > 0) {
            Node parentElement = elementsByTagName.item(0);
            // get the immediate children of that element
            NodeList nl = parentElement.getChildNodes();
            for (int i = 0; i < nl.getLength(); i++) {
                Node node = nl.item(i);
                if (node.getNodeType() == Node.ELEMENT_NODE) {
                    System.out.println("node.getNodeName() is: "
                            + node.getNodeName());
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
However, this will only consider the first element that matches the specified name.
For example, to get the list of elements under the first node named someRole, you would call parseChildrenElementNames("someRole"); which would print out:
node.getNodeName() is: id
node.getNodeName() is: addr
node.getNodeName() is: telecom
I am trying to import a node from one document into another:
DocumentBuilder db = dbf.newDocumentBuilder();
DocumentBuilder db2 = dbf2.newDocumentBuilder();
Document doc1 =parser.buildDoc(message.getBytes("UTF-8"));
Document doc2 = db2.parse(new FileInputStream(new File("C:\\Temp\\workspace2\\Resource2Q\\xml_template.xml")));
NodeList list = doc1.getElementsByTagName("Form");
for (int i = 0; i < list.getLength(); i++) {
    Element element = (Element) list.item(i);
    Node copiedNode = doc1.importNode(element, true);
    doc2.getDocumentElement().appendChild(copiedNode); ...
The last line of code gives me: "WRONG_DOCUMENT_ERR: A node is used in a different document than the one that created it".
Why is this happening? I am importing the node.
Node copiedNode = doc1.importNode(element, true);
should be
Node copiedNode = doc2.importNode(element, true);
The node comes from doc1, and you want to import it into doc2, not into doc1, which already owns it. importNode must be called on the destination document.
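One way to avoid mixing up the two documents is to ask the target parent for its owner document and import through that. The sketch below is only an illustration: the class ImportNodeExample and the helper copyInto are made-up names, and the two documents are built in memory instead of coming from the message and template file in the question.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class ImportNodeExample {
    // Hypothetical helper: deep-copies 'source' into the document that owns
    // 'targetParent' and appends the copy there. The source node is untouched.
    static Node copyInto(Element targetParent, Node source) {
        Document targetDoc = targetParent.getOwnerDocument();
        Node copy = targetDoc.importNode(source, true); // copy is owned by targetDoc
        return targetParent.appendChild(copy);
    }

    // Builds two small documents and copies <Form> from the first into the second.
    static boolean demo() throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        Document doc1 = db.newDocument();
        Element form = doc1.createElement("Form");
        doc1.appendChild(form);

        Document doc2 = db.newDocument();
        Element root = doc2.createElement("template");
        doc2.appendChild(root);

        Node copied = copyInto(root, form);
        // The copy belongs to doc2; the original still belongs to doc1.
        return copied.getOwnerDocument() == doc2 && form.getOwnerDocument() == doc1;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```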
How can I reach elements that share the same name and are nested inside each other, using Java's XML APIs? This worked with Python's ElementTree, but I need to get it running in Java.
I have tried:
String filepath = ("file.xml");
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse(filepath);
NodeList nl = doc.getElementsByTagName("*/*/foo");
Example
<foo>
<foo>
<foo>
</foo>
</foo>
</foo>
You seem to be under the impression that getElementsByTagName takes an XPath expression. It doesn't. As documented:
Returns a NodeList of all the Elements in document order with a given tag name and are contained in the document.
If you need to use XPath, you should look at the javax.xml.xpath package. Sample code:
XPath xpath = XPathFactory.newInstance().newXPath();
Object set = xpath.evaluate("*/*/foo", doc, XPathConstants.NODESET);
NodeList list = (NodeList) set;
int count = list.getLength();
for (int i = 0; i < count; i++) {
    Node node = list.item(i);
    // Handle the node
}
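Here is a self-contained version you can run against the example from the question. The class name XPathFooExample and the helper countMatches are hypothetical. It evaluates an XPath expression against a parsed document and returns how many nodes matched, which makes the difference between a path expression and a plain tag-name lookup easy to see.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathFooExample {
    // Hypothetical helper: parses 'xml' and counts the nodes selected by
    // 'expression', evaluated with the document node as context.
    static int countMatches(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList list = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        return list.getLength();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<foo><foo><foo></foo></foo></foo>";
        // Only the foo nested two levels below the root element.
        System.out.println(countMatches(xml, "*/*/foo"));
        // Every foo at any depth, like getElementsByTagName("foo").
        System.out.println(countMatches(xml, "//foo"));
    }
}
```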
I have this XML document:
<?xml version="1.0" encoding="utf-8"?>
<RootElement>
<Achild>
.....
</Achild>
</RootElement>
How can I check if the document contains Achild element or not? I tried
final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
// Use the factory to create a builder
try {
final DocumentBuilder builder = factory.newDocumentBuilder();
final Document doc = builder.parse(configFile);
final Node parentNode = doc.getDocumentElement();
final Element childElement = (Element) parentNode.getFirstChild();
if(childElement.getNodeName().equalsIgnoreCase(...
but it gives me an error (childElement is null).
I think you're getting a #text node (the whitespace between <RootElement> and <Achild>) as the first child, which is a pretty common mistake. For example:
final Node parentNode = doc.getDocumentElement();
Node childElement = parentNode.getFirstChild();
System.out.println(childElement.getNodeName());
Prints:
#text
Use instead:
final Node parentNode = doc.getDocumentElement();
NodeList childElements = parentNode.getChildNodes();
for (int i = 0; i < childElements.getLength(); ++i) {
    Node childElement = childElements.item(i);
    if (childElement instanceof Element) {
        System.out.println(childElement.getNodeName());
    }
}
Wanted result:
Achild
EDIT:
There is a second way, using the DocumentBuilderFactory.setIgnoringElementContentWhitespace method:
factory.setIgnoringElementContentWhitespace(true);
However, this works only in validating mode, so you need to provide a DTD in your XML document:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE RootElement [
<!ELEMENT RootElement (Achild)+>
<!ELEMENT Achild (#PCDATA)>
]>
<RootElement>
<Achild>some text</Achild>
</RootElement>
and set factory.setValidating(true). Full example:
final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setValidating(true);
factory.setIgnoringElementContentWhitespace(true);
final DocumentBuilder builder = factory.newDocumentBuilder();
final Document doc = builder.parse("input.xml");
final Node rootNode = doc.getDocumentElement();
final Element childElement = (Element) rootNode.getFirstChild();
System.out.println(childElement.getNodeName());
Wanted result with original code:
Achild
It sounds like getFirstChild() is returning a text node containing the whitespace between "<RootElement>" and "<Achild>", in which case you need to advance to the next sibling node to get to the element you expect.
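Advancing through the siblings could look roughly like this. FirstElementChild, firstElementChild and firstChildName are illustrative names, not part of the DOM API. The loop walks from getFirstChild() through getNextSibling() until it reaches an element, and returns null if the parent has no element children at all.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FirstElementChild {
    // Returns the first child of 'parent' that is an element, skipping
    // whitespace text nodes and comments, or null if there is none.
    static Element firstElementChild(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    // Hypothetical wrapper for trying it out on an XML string.
    static String firstChildName(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element child = firstElementChild(doc.getDocumentElement());
        return child == null ? null : child.getNodeName();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<RootElement>\n  <Achild>some text</Achild>\n</RootElement>";
        System.out.println(firstChildName(xml));
    }
}
```

Checking the result against null also answers the original question of whether the document contains an Achild element at all.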