I have this XML code:
<root>
  <node>
    <first_child/>
    <second_child/>
    <third_child/>
  </node>
</root>
I need to take the child nodes one by one and save them in three Node variables using DOM.
If I use
doc.getElementsByTagName("node");
I get the "node" element with all of its children, while I need only "first_child", "second_child" and "third_child".
How can I obtain these?
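Element el = (Element) doc.getElementsByTagName("node").item(0);
NodeList children = el.getChildNodes();
for (int i = 0; i < children.getLength(); i++) {
    Node child = children.item(i);
    // Skip whitespace text nodes; only element children are wanted.
    if (child.getNodeType() == Node.ELEMENT_NODE) {
        System.out.println(child.getNodeName());
    }
}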
Element el;
el = (Element) doc.getElementsByTagName("node").item(0);
el.getChildNodes();
You can get the children in this way.
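To make the idea above concrete, here is a minimal sketch that returns only the element children, so that whitespace between the tags does not show up as extra nodes. The class name ChildElements and the helpers childElements and parse are made up for illustration; only the org.w3c.dom and javax.xml.parsers calls are standard JDK API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildElements {

    // Returns the direct element children of the first <tagName> element (hypothetical helper).
    static List<Element> childElements(Document doc, String tagName) {
        List<Element> result = new ArrayList<>();
        Node parent = doc.getElementsByTagName(tagName).item(0);
        if (parent == null) {
            return result; // no such element in the document
        }
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Text nodes (including whitespace) and comments are skipped.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    // Hypothetical helper: parses an XML string and wraps checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<root>\n<node>\n<first_child/>\n<second_child/>\n"
                + "<third_child/>\n</node>\n</root>");
        for (Element child : childElements(doc, "node")) {
            System.out.println(child.getNodeName());
        }
    }
}
```

The returned list can then be indexed to fill the three Node variables, for example with get(0), get(1) and get(2), after checking its size.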
var children = document.getElementById('node').getElementsByTagName('*');
I have an XML file with the following elements.
<productType>
<productTypeX />
<!-- One of the following elements is also possible:
<productTypeY />
<productTypeZ />
-->
</productType>
So, the XML could also look like this:
<productType>
<productTypeZ />
</productType>
The XML is unmarshalled to a POJO by using JAXB.
How can I determine if the child of <productType> is X, Y or Z? Either in the mapped POJO or directly in the XML?
There is a way to do this, though it may be no cheaper than checking by hand, that is, writing a null check for every getter of the sub-types (null == obj.getProductTypeX()). Here it is:
Let's assume that you end up with a JAXBElement<ProductType> productType when you unmarshal.
Now you need an Element (org.w3c.dom.Element) object, which can be obtained like this:
DOMResult res = new DOMResult();
marshaller.marshal(productType, res);
Element elt = ((Document)res.getNode()).getDocumentElement();
The Element interface extends the Node interface, so we end up with a tree-structured object and can get its existing children like this:
NodeList nodeList = elt.getChildNodes();
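Now you can check the type and value of every Node. In most cases you should first check that the Node is an ELEMENT_NODE, because getChildNodes() also returns text and comment nodes (attributes are not children; they are reached through getAttributes()):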
for (int i = 0; i < nodeList.getLength(); i++) {
    Node currentNode = nodeList.item(i);
    if (currentNode.getNodeType() == Node.ELEMENT_NODE) {
        currentNode.getNodeName();
        currentNode.getTextContent();
        //And whatever you like
    }
}
I hope this helps, or at least points you in the right direction.
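For the "directly in the XML" half of the question, a small sketch along the same lines: walk the children of the root element and report the name of the first one that is an element, so the comment block in the sample is skipped. This does not use JAXB at all; the class ProductTypeKind and the method firstChildElementName are illustrative names, not part of any library.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ProductTypeKind {

    // Returns the name of the first element child of the root element, if there is one.
    static Optional<String> firstChildElementName(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        // Walk the sibling chain; comments and whitespace text nodes are skipped.
        for (Node n = doc.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return Optional.of(n.getNodeName());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String xml = "<productType>\n"
                + "  <!-- alternatives: productTypeX, productTypeY -->\n"
                + "  <productTypeZ />\n"
                + "</productType>";
        System.out.println(firstChildElementName(xml).orElse("(no child element)"));
    }
}
```

Returning an Optional keeps the "no child at all" case explicit instead of relying on a null check.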
I'm trying to extract values from an InputStream containing XML data. The general data layout is something like this:
<objects count="1">
<object>
<stuff>...</stuff>
<more_stuff>...</more_stuff>
...
<connections>
<connection>124</connection>
<connection>128</connection>
</connections>
</object>
</objects>
I need to find the integers stored in the <connection> elements. However, I can't guarantee that there will always be exactly two (there may be just one or none at all). What's more, there will be cases where the element <connections> is not present.
I've been looking at examples like this, but it doesn't mention how to handle cases where a parent is non-existent.
The case where <connections> doesn't exist at all is quite rare (but is something I definitely need to know about when it does happen), and the case where it does exist but contains fewer than two <connection> elements would be even rarer (basically I expect it to never happen).
Should I just assume everything is in place and catch the exception if something happens, or is there a clever way to detect the presence of <connections>?
My initial idea was to use something like:
InputStream response = urlConnection.getInputStream();
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(response);
String xPathExpressionString = "/objects/object/connections/connection";
XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
XPathExpression expr = xPath.compile(xPathExpressionString);
NodeList nodeList = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
    Node intersectionNode = nodeList.item(i);
    if (intersectionNode.getNodeType() == Node.ELEMENT_NODE) { // What is this anyway?
        // Do something with value
    }
}
According to the example linked above, this should handle the case with varying numbers of <connection> elements, but how should I deal with <connections> missing altogether?
(Btw, there should always only be a single object, so no need to worry about that)
Use this xpath expression:
"//object//connection"
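The "//" construct is shorthand for the descendant-or-self axis (/descendant-or-self::node()/). So the expression above selects all <connection> elements that have an <object> ancestor. If <connections> is missing, the result is simply an empty node list; no exception is thrown.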
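If you need to tell "<connections> is missing" apart from "<connections> is there but empty", one option is to ask XPath for a boolean first. The following is only a sketch under that assumption: ConnectionReader and readConnections are hypothetical names, and the paths are the ones from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ConnectionReader {

    // Empty Optional: <connections> is missing. Otherwise: the values found (possibly none).
    static Optional<List<Integer>> readConnections(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xPath = XPathFactory.newInstance().newXPath();

            // boolean() of a node-set is true when the set contains at least one node.
            Boolean present = (Boolean) xPath.evaluate(
                    "boolean(/objects/object/connections)", doc, XPathConstants.BOOLEAN);
            if (!present) {
                return Optional.empty();
            }

            // An empty NodeList comes back when there are no <connection> children.
            NodeList nodes = (NodeList) xPath.evaluate(
                    "/objects/object/connections/connection", doc, XPathConstants.NODESET);
            List<Integer> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(Integer.parseInt(nodes.item(i).getTextContent().trim()));
            }
            return Optional.of(values);
        } catch (Exception e) {
            throw new IllegalStateException("Could not read connections", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<objects count=\"1\"><object><connections>"
                + "<connection>124</connection><connection>128</connection>"
                + "</connections></object></objects>";
        System.out.println(readConnections(xml));
        System.out.println(readConnections("<objects count=\"1\"><object/></objects>"));
    }
}
```

The caller can then log or alert on the empty Optional (the rare case) and check the list size separately if exactly two values are expected.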
The code below walks the child tags of the document's root element. When execution reaches the second if block, a <connections> tag exists among those children.
As you said, we don't know anything about the parent, so the line below can be adapted to the structure of the XML at hand:
group.getChildNodes().item(0).getChildNodes()......
Document doc = dBuilder.parse(inputFile);
doc.getDocumentElement().normalize();
NodeList groupList = doc.getChildNodes().item(0).getChildNodes();
for (int groupCount = 0; groupCount < groupList.getLength(); groupCount++)
{
    Node group = groupList.item(groupCount);
    if (group.getNodeType() == Node.ELEMENT_NODE)
    {
        if (group.getNodeName().equals("connections"))
        {
        }
    }
}
My first answer on Stack Overflow. Hope this helps.
I know I can use DocumentBuilder to parse an xml file and traverse through the nodes but I am stuck at figuring out if the node has any more children. So for example in this xml:
<MyDoc>
<book>
<title> ABCD </title>
</book>
</MyDoc>
If I call node.hasChildNodes(), I get true for both book and title. What I am trying to do is this: if a node has a text value (not attributes), like title, print it; otherwise do nothing. I know this is a simple check, but I just can't seem to find the answer on the web. I am probably not searching with the right keywords. Thanks in advance.
Try getChildNodes(). It returns a NodeList object that lets you iterate through all of the nodes under the one you're referencing, regardless of what names they might have.
You have to check the type of the child nodes returned by getChildNodes() by calling getNodeType() on each of them. <book> has a child of type ELEMENT_NODE, whereas <title> has a child of type TEXT_NODE.
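As a rough sketch of that type check: an element counts as "text only" when none of its children is an ELEMENT_NODE and its text is not blank. The names LeafText, hasOnlyText and leafTexts are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class LeafText {

    // True when the element has no element children and some non-blank text.
    static boolean hasOnlyText(Element element) {
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                return false; // e.g. <book>, which contains <title>
            }
        }
        return !element.getTextContent().trim().isEmpty();
    }

    // Collects the trimmed text of every text-only element, in document order.
    static List<String> leafTexts(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        List<String> texts = new ArrayList<>();
        NodeList all = doc.getElementsByTagName("*"); // "*" matches every element
        for (int i = 0; i < all.getLength(); i++) {
            Element element = (Element) all.item(i);
            if (hasOnlyText(element)) {
                texts.add(element.getTextContent().trim());
            }
        }
        return texts;
    }

    public static void main(String[] args) {
        String xml = "<MyDoc>\n<book>\n<title> ABCD </title>\n</book>\n</MyDoc>";
        System.out.println(leafTexts(xml));
    }
}
```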
I am not sure, but I think you want a way to iterate through all of the elements regardless of how deeply they are nested. The code below recursively goes through all nodes and prints each node's value as long as it is not just whitespace:
public static void main(String[] args) throws SAXException, IOException, ParserConfigurationException
{
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document doc = builder.parse("test.xml");
    NodeList childNodes = doc.getChildNodes();
    iterateNodes(childNodes);
}
private static void iterateNodes(NodeList childNodes)
{
    for (int i = 0; i < childNodes.getLength(); ++i)
    {
        Node node = childNodes.item(i);
        String text = node.getNodeValue();
        if (text != null && !text.trim().isEmpty()) {
            System.out.println(text);
        }
        if (node.hasChildNodes()) {
            iterateNodes(node.getChildNodes());
        }
    }
}
Text nodes exist under element nodes in a DOM, and data is always stored in text nodes. Perhaps the most common error in DOM processing is to navigate to an element node and expect it to contain the data that is stored in that element. Not so! Even the simplest element node has a text node under it that contains the data.
Ref: http://docs.oracle.com/javase/tutorial/jaxp/dom/readingXML.html
I have an XML Document:
<entities xmlns="urn:yahoo:cap">
<entity score="0.988">
<text end="4" endchar="4" start="0" startchar="0">Messi</text>
<wiki_url>http://en.wikipedia.com/wiki/Lionel_Messi</wiki_url>
<types>
<type region="us">/person</type>
</types>
</entity>
</entities>
I have a TreeMap<String,String> data which stores the getTextContent() for both the "text" and "wiki_url" elements. Some "entity" elements will only have the "text" element (no "wiki_url"), so I need a way of finding out when "text" is the only child and when there is also a "wiki_url". I could use document.getElementsByTagName("text") and document.getElementsByTagName("wiki_url"), but then I would lose the relationship between the text and the URL.
I'm trying to get the amount of elements within the "entity" element by using:
NodeList entities = document.getElementsByTagName("entity"); //List of all the entity nodes
int nchild; //Number of children
System.out.println("Number of entities: "+ entities.getLength()); //Prints 1 as expected
nchild=entities.item(0).getChildNodes().getLength(); //Returns 7
However, as shown above, this returns 7 (which I don't understand; surely it's 3, or 4 if you include the grandchild).
I was then going to use the number of children to cycle through them all to check if getNodeName().equals("wiki_url") and save it to data if correct.
Why am I getting 7 as the number of children when I can only count 3 children and 1 grandchild?
The whitespace following the > of <entity score="0.988"> also counts as a node; similarly, the end-of-line characters between the tags are parsed into text nodes. If you are interested in a particular node with a given name, add a helper method like the one below and call it wherever you need it.
Node getChild(final NodeList list, final String name)
{
    for (int i = 0; i < list.getLength(); i++)
    {
        final Node node = list.item(i);
        if (name.equals(node.getNodeName()))
        {
            return node;
        }
    }
    return null;
}
and call
final NodeList childNodes = entities.item(0).getChildNodes();
final Node textNode = getChild(childNodes, "text");
final Node wikiUrlNode = getChild(childNodes, "wiki_url");
Normally, when working with DOM, come up with helper methods like the one above to simplify the main processing logic.
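To see where the 7 comes from, here is a small sketch that compares the raw child count with the number of element children. The XML is the sample from the question with one line break between tags; ChildCount, countElementChildren and firstEntity are illustrative names only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ChildCount {

    // Counts only the element children of a node, ignoring text and comment nodes.
    static int countElementChildren(Node parent) {
        int count = 0;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                count++;
            }
        }
        return count;
    }

    // Hypothetical helper: parses the XML and returns the first <entity> element.
    static Node firstEntity(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName("entity").item(0);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<entities xmlns=\"urn:yahoo:cap\">\n"
                + "<entity score=\"0.988\">\n"
                + "<text end=\"4\" endchar=\"4\" start=\"0\" startchar=\"0\">Messi</text>\n"
                + "<wiki_url>http://en.wikipedia.com/wiki/Lionel_Messi</wiki_url>\n"
                + "<types>\n<type region=\"us\">/person</type>\n</types>\n"
                + "</entity>\n"
                + "</entities>";
        Node entity = firstEntity(xml);
        // Whitespace text nodes sit before, between and after the three elements.
        System.out.println("All child nodes: " + entity.getChildNodes().getLength());
        System.out.println("Element children: " + countElementChildren(entity));
    }
}
```

With the line breaks laid out as above, the two counts should come out as 7 and 3: four whitespace text nodes plus the three elements.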
I have an XML structure as follows:
<rurl modify="0" children="yes" index="8" name="R-URL">
<status>enabled</status>
<rurl-link priority="3">http</rurl-link>
<rurl-link priority="5">http://localhost:80</rurl-link>
<rurl-link priority="4">abc</rurl-link>
<rurl-link priority="3">b</rurl-link>
<rurl-link priority="2">a</rurl-link>
<rurl-link priority="1">newlinkkkkkkk</rurl-link>
</rurl>
Now I want to remove a child node whose text is equal to http. Currently I am using this code:
while (subchilditr.hasNext()) {
    Element subchild = (Element) subchilditr.next();
    if (subchild.getText().equalsIgnoreCase(text)) {
        message = subchild.getText();
        update = "Success";
        subchild.removeAttribute("priority");
        subchild.removeContent();
    }
}
But it is not completely removing the sub element from xml file. It leaves me with
<rurl-link/>
Any suggestions?
You'll need to do this:
List<Element> elements = new ArrayList<Element>();
while (subchilditr.hasNext()) {
    Element subchild = (Element) subchilditr.next();
    if (subchild.getText().equalsIgnoreCase(text)) {
        elements.add(subchild);
    }
}
for (Element element : elements) {
    element.getParent().removeContent(element);
}
If you try to remove an element inside of the loop you'll get a ConcurrentModificationException.
If you have the parent element rurl you can remove its children using the method removeChild or removeChildren.
Use removeChild()
http://download.oracle.com/javase/1.5.0/docs/api/org/w3c/dom/Node.html#removeChild(org.w3c.dom.Node)
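For the plain org.w3c.dom API that the link above points to (not the JDOM-style Element used in the question), a sketch could look like the following. The NodeList returned by getChildNodes() is live, so the matches are collected first and removed afterwards. RemoveByText, removeChildrenWithText and parse are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RemoveByText {

    // Removes direct children named tagName whose trimmed text equals text
    // (ignoring case) and returns how many were removed.
    static int removeChildrenWithText(Element parent, String tagName, String text) {
        List<Node> toRemove = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && tagName.equals(child.getNodeName())
                    && child.getTextContent().trim().equalsIgnoreCase(text)) {
                toRemove.add(child);
            }
        }
        // Remove after the scan so the live NodeList does not shift underneath the loop.
        for (Node node : toRemove) {
            parent.removeChild(node);
        }
        return toRemove.size();
    }

    // Hypothetical helper: parses an XML string and wraps checked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<rurl name=\"R-URL\"><status>enabled</status>"
                + "<rurl-link priority=\"3\">http</rurl-link>"
                + "<rurl-link priority=\"5\">http://localhost:80</rurl-link></rurl>");
        int removed = removeChildrenWithText(doc.getDocumentElement(), "rurl-link", "http");
        System.out.println("Removed: " + removed);
    }
}
```

Because the whole element is detached from its parent, no empty <rurl-link/> is left behind.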