Is there any way to programmatically comment a particular child element in XML?
My requirement is to look up an attribute value in the XML. If that value exists, I need to add a comment to the child element that the attribute belongs to.
eg:
<Company>
<employee name="John">
<dept id="Purchase"></dept>
</employee>
</Company>
So here, if I search for the dept id "Purchase" and it is found, a comment should be added to the employee John.
Any ideas? I am using the JDOM parser.
This Java code (using the W3C DOM API) adds a comment to the document itself, before the root element:
Element element = doc.getDocumentElement();
Comment comment = doc.createComment("This is a comment");
element.getParentNode().insertBefore(comment, element);
You can create new comment nodes using Document.createComment() and insert them into your DOM tree using e.g. Node.insertBefore(Node, Node).
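For the concrete case in the question, a minimal sketch might look like the following. It uses the W3C DOM API from the answer above rather than JDOM, and the class name CommentByAttribute and the helper markParentsOfDept are made up for illustration. It looks for dept elements with a given id and inserts a comment just before the element that contains each match.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Comment;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CommentByAttribute {

    /** Parses an XML string into a W3C DOM document. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /**
     * Inserts a comment before the parent of every dept element whose id
     * attribute equals deptId. Returns the number of comments inserted.
     * (Hypothetical helper, not part of any library.)
     */
    public static int markParentsOfDept(Document doc, String deptId, String text) {
        NodeList depts = doc.getElementsByTagName("dept");
        int inserted = 0;
        for (int i = 0; i < depts.getLength(); i++) {
            Element dept = (Element) depts.item(i);
            if (deptId.equals(dept.getAttribute("id"))) {
                Node employee = dept.getParentNode();
                Comment comment = doc.createComment(text);
                employee.getParentNode().insertBefore(comment, employee);
                inserted++;
            }
        }
        return inserted;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<Company><employee name=\"John\">"
                + "<dept id=\"Purchase\"/></employee></Company>");
        int n = markParentsOfDept(doc, "Purchase", "dept Purchase found");
        System.out.println(n + " comment(s) inserted");
    }
}
```

The attribute comparison is case-sensitive, so "purchase" would not match "Purchase" in this sketch.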
I have this XML (just a little part.. the complete xml is big)
<Root>
<Products>
<Product ID="307488">
<ClassificationReference ClassificationID="AR" Type="AgencyLink"/>
<ClassificationReference ClassificationID="AM" Type="AgencyLink">
<MetaData>
<Value AttributeID="tipoDeCompra" ID="C">Compra Centralizada</Value>
</MetaData>
</ClassificationReference>
</Product>
</Products>
</Root>
Well... I want to get the data from the line
<Value AttributeID="tipoDeCompra" ID="C">Compra Centralizada</Value>
I'm using DOM and when I use nodoValue.getTextContent() I got "Compra Centralizada" and that is ok...
But when I use nodoValue.getNodeName() I got "MetaData" but I was expecting "Value"
What is the explanation for this behaviour?
Thanks!
Your nodoValue variable most likely points to the MetaData node, so the returned name is correct.
Note that for an element node Node.getTextContent() returns the concatenation of the text content of all child nodes. Therefore in your example the text content of the MetaData element is equal to the text content of the Value element, namely Compra Centralizada.
I guess you are getting the Node object using getElementsByTagName("MetaData"). In this case nodoValue.getTextContent() will return the text content correctly, but to get the node name you need to get the child node.
Your current node must be MetaData, and getTextContent() returns all the text between its opening and closing tags. This is why you are getting
Compra Centralizada
as the value. To reach the Value tag, get the child nodes using getChildNodes() and pick the first element among them (with indented XML, the first child may be a whitespace text node).
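The behaviour described in these answers can be reproduced with a small sketch. The class name NodeNameDemo and the helper firstChildElement are made up for illustration, and the XML is reduced to the MetaData part of the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NodeNameDemo {

    public static final String XML =
            "<MetaData>\n"
          + "  <Value AttributeID=\"tipoDeCompra\" ID=\"C\">Compra Centralizada</Value>\n"
          + "</MetaData>";

    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the first child that is an element, skipping text nodes; null if none. */
    public static Element firstChildElement(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(XML);
        Node metaData = doc.getElementsByTagName("MetaData").item(0);

        System.out.println(metaData.getNodeName());               // MetaData
        System.out.println(metaData.getTextContent().trim());     // Compra Centralizada
        System.out.println(metaData.getFirstChild().getNodeName()); // #text (whitespace)

        Element value = firstChildElement(metaData);
        System.out.println(value.getNodeName());                  // Value
    }
}
```

Calling getElementsByTagName("Value") directly would be another way to reach the element without walking the children.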
I have an XML document being parsed in Java as a W3C document.
In my XML, I have many elements with the same name, e.g. <item ..... />, each with a unique attribute value, e.g. <item name="a" .... />.
In Java, I want to do:
doc.getElementById("a")
in order to get that specific item I have there with that name.
How can I tell Java to use 'name' as the id?
Or, alternatively, how can I fetch that specific item with the least complexity?
DOM is not the best API to easily query your document and get back found elements. Learn XPath, which is a more appropriate API, or iterate through the tree of elements by yourself.
getElementById() will only return the element which has the given id attribute (edit: marked as such in the document DTD or schema). It can't find by name attribute.
See Java XML DOM: how are id Attributes special? for details.
You need to write a DTD that defines your attribute as being of type ID.
Well, to give a complete answer: I had to use a DTD, as everyone stated.
Since my needs are quite simple, I embedded it in my XML the following way:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ATTLIST item
name ID #REQUIRED
>
]>
<root> .... </root>
The only other thing to know is that once you declare the ATTLIST, you have to declare the rest of the item attributes as well (at least if the parser validates against the DTD), marking the optional ones as #IMPLIED:
some-attribute CDATA #IMPLIED
This says that some-attribute contains character data (CDATA is the attribute type for plain strings; #PCDATA applies only to element content, not to attributes) and is implied, which means it may or may not be present.
So eventually, it'll look something like:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ATTLIST item
name ID #REQUIRED
some-attribute CDATA #IMPLIED
>
]>
<root> .... </root>
And from the Java side, just use it directly, e.g. getElementById("some-name")
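As a rough illustration of the embedded DTD approach, the sketch below parses a document with the ATTLIST shown above and then calls getElementById. The class name IdViaDtd and the sample items are made up, and the sketch assumes the JDK's default (non-validating) parser, which reads the internal DTD subset and records name as an ID attribute.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class IdViaDtd {

    // Sample document; the items and attribute values are illustrative only.
    public static final String XML =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
          + "<!DOCTYPE root [\n"
          + "  <!ATTLIST item\n"
          + "    name ID #REQUIRED\n"
          + "    some-attribute CDATA #IMPLIED\n"
          + "  >\n"
          + "]>\n"
          + "<root><item name=\"a\" some-attribute=\"x\"/><item name=\"b\"/></root>";

    /** Parses the XML and looks up an element by its DTD-declared ID attribute. */
    public static Element findById(String xml, String id) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementById(id); // null when no element has that ID
    }

    public static void main(String[] args) throws Exception {
        Element item = findById(XML, "a");
        System.out.println(item == null ? "not found" : item.getAttribute("some-attribute"));
    }
}
```

If DOCTYPE declarations are disabled on the parser factory for security reasons, this approach will not work and an XPath lookup is the simpler route.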
In order to make doc.getElementById("a") work you need to change your XML to <item id="a" name="a" .... /> (and, as noted above, the id attribute still has to be declared as type ID in a DTD or schema).
If you can't change the XML, you could use XPath to retrieve this element.
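The XPath route could be sketched roughly as follows. The class name FindByName and the sample document are made up for illustration. The name is passed in through an XPath variable so that quotes in the value cannot break the expression.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class FindByName {

    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the first item element whose name attribute equals the given value, or null. */
    public static Element findItemByName(Document doc, String name) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve the $name variable used in the expression below.
        xpath.setXPathVariableResolver(
                qname -> "name".equals(qname.getLocalPart()) ? name : null);
        return (Element) xpath.evaluate("//item[@name=$name]", doc, XPathConstants.NODE);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root><item name=\"a\" v=\"1\"/><item name=\"b\" v=\"2\"/></root>");
        Element item = findItemByName(doc, "b");
        System.out.println(item.getAttribute("v")); // 2
    }
}
```

This needs no DTD and leaves the XML untouched, at the cost of a tree search on each lookup.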
I have a xml file like the following:
<file>
<students>
<student>
<name>Arthur</name>
<height>168</height>
</student>
<student>
<name>John</name>
<height>176</height>
</student>
</students>
</file>
How do I check whether for each opening tag, there is an ending tag? For example, if I do not provide the ending tag as:
<file>
<students>
<student>
<name>Arthur</name>
<height>168</height>
// Ending tag for student missing here
<student>
<name>John</name>
<height>176</height>
</student>
</students>
</file>
How do I continue parsing the rest of the file?
I tried with a SAX parser as explained here, but it's not very suitable for me, as it throws an exception when a closing tag is missing, as in the second XML sample I provided.
An XML file that does not satisfy your condition "for each opening tag, there is an ending tag" is not well-formed.
Checking that an XML file is well-formed is the first task of an XML parser. Hence, you need an XML parser.
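A minimal sketch of such a check with the JDK's SAX parser is shown below; the class name WellFormedCheck is made up for illustration. The exception the question mentions is exactly how the parser reports a document that is not well-formed, so the sketch turns it into a boolean result. Note that a conforming parser stops at the first such error and does not continue with the rest of the file.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class WellFormedCheck {

    /** Returns true if the text parses as well-formed XML, false otherwise. */
    public static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // Raised for well-formedness errors such as a missing end tag.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<file><student></student></file>")); // true
        System.out.println(isWellFormed("<file><student></file>"));           // false
    }
}
```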
The tutorial you found has a bug in it. characters() may be called multiple times for the same element (source). The proper way to mark the end of an element is to reset the respective boolean states inside of endElement(). The comments section has code that shows the required change.
With that issue fixed, you can do error checking in startElement() to ensure that the file is not trying to start an invalid element given the current state. This will also allow you to ensure that a name element is only found inside of a student element.
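The two points above could be combined in a handler along these lines. This is only a sketch, not the tutorial's code: the class name StudentNameHandler and its error messages are made up. Text is accumulated across characters() calls, state is reset in endElement(), and startElement() rejects elements that are not allowed in the current state.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class StudentNameHandler extends DefaultHandler {
    private final List<String> names = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inStudent;
    private boolean inName;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs)
            throws SAXException {
        if ("student".equals(qName)) {
            // A student starting inside another student means an end tag is missing.
            if (inStudent) throw new SAXException("<student> started inside <student>");
            inStudent = true;
        } else if ("name".equals(qName)) {
            if (!inStudent) throw new SAXException("<name> outside <student>");
            inName = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append rather than assign.
        if (inName) text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("name".equals(qName)) {
            names.add(text.toString());
            inName = false;
        } else if ("student".equals(qName)) {
            inStudent = false;
        }
    }

    /** Parses the XML and returns the student names in document order. */
    public static List<String> parseNames(String xml) throws Exception {
        StudentNameHandler handler = new StudentNameHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.names;
    }
}
```

With the broken sample from the question, the second student start tag triggers the SAXException from startElement() before the parser itself reaches the mismatched end tag.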
You can implement the following algorithm (pseudo-code):
String xml = ...
stack = new Stack()
while True:
    tag = extractNextTag(xml)
    // no new tag is found
    if tag == null:
        break
    if (tag.isOpening()):
        stack.push(tag.name)
    else:
        // a closing tag with nothing open is also an error
        if stack.isEmpty():
            error("Open/close tag error")
        oldTagName = stack.pop()
        if (oldTagName != tag.name):
            error("Open/close tag error")
// tags left open at the end of the input
if ! stack.isEmpty():
    error("Open/close tag error")
You can implement the function extractNextTag in 10-20 lines of code using some knowledge of parsers, or just by writing a simple regular expression.
Of course, when you search for a new tag, you need to start searching from the character that follows the last tag you found.
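The pseudo-code above might translate to Java roughly as follows, with a regular expression standing in for extractNextTag. The class name TagBalanceChecker and the pattern are illustrative assumptions. The pattern is deliberately simple: it ignores the XML declaration and comments, but it would be confused by tags inside comments or CDATA sections and by a ">" inside an attribute value, so it is no substitute for a real parser.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagBalanceChecker {

    // Group 1: "/" for a closing tag; group 2: tag name; group 3: "/" for a self-closing tag.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z_][\\w.-]*)[^<>]*?(/?)>");

    /** Returns true if every opening tag is closed in the right order. */
    public static boolean isBalanced(String xml) {
        Deque<String> stack = new ArrayDeque<>();
        Matcher m = TAG.matcher(xml);
        // find() resumes after the previous match, as the text above requires.
        while (m.find()) {
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = !m.group(3).isEmpty();
            String name = m.group(2);
            if (closing) {
                if (stack.isEmpty() || !stack.pop().equals(name)) {
                    return false; // closing tag does not match the innermost open tag
                }
            } else if (!selfClosing) {
                stack.push(name);
            }
        }
        return stack.isEmpty(); // anything left was never closed
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("<a><b/><c></c></a>")); // true
        System.out.println(isBalanced("<a><b></a>"));         // false
    }
}
```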
I wonder whether there is a way to get the element that has a particular value (text) from an XML document using XPath.
Example doc:
<domain log-root="/logs" application-root="/applications"><resources>
<jdbc-resource pool-name="SamplePool" jndi-name="jdbc/sample" />
<jdbc-resource pool-name="TimerPool" jndi-name="abc">text1</jdbc-resource>
<jdbc-resource pool-name="TimerPool" jndi-name="def">text2</jdbc-resource>
<jdbc-resource pool-name="TimerPool" jndi-name="ghi">text3</jdbc-resource></resources></domain>
Example XPath query:
/domain//jdbc-resource[@pool-name='TimerPool']/text()='text2'
Please post your ideas if there is any.
Use:
/domain/*/jdbc-resource[@pool-name='TimerPool' and .='text2']
or you may use:
/domain/*/jdbc-resource[@pool-name='TimerPool'][.='text2']
Both expressions above select all jdbc-resource elements whose pool-name attribute has the string value "TimerPool", whose own string value is "text2", and that are grandchildren of the top element of the XML document.
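To try such an expression from Java, a sketch along these lines could be used. The class name JdbcResourceQuery is made up, and the XML is the example document from the question with its closing tags repaired.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class JdbcResourceQuery {

    public static final String XML =
            "<domain log-root=\"/logs\" application-root=\"/applications\"><resources>"
          + "<jdbc-resource pool-name=\"SamplePool\" jndi-name=\"jdbc/sample\" />"
          + "<jdbc-resource pool-name=\"TimerPool\" jndi-name=\"abc\">text1</jdbc-resource>"
          + "<jdbc-resource pool-name=\"TimerPool\" jndi-name=\"def\">text2</jdbc-resource>"
          + "<jdbc-resource pool-name=\"TimerPool\" jndi-name=\"ghi\">text3</jdbc-resource>"
          + "</resources></domain>";

    /** Evaluates an XPath expression against the XML and returns the matching nodes. */
    public static NodeList select(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        NodeList hits = select(XML,
                "/domain/*/jdbc-resource[@pool-name='TimerPool' and .='text2']");
        for (int i = 0; i < hits.getLength(); i++) {
            Element e = (Element) hits.item(i);
            System.out.println(e.getAttribute("jndi-name")); // def
        }
    }
}
```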
Well, text() should do. http://www.w3schools.com/xpath/xpath_examples.asp
Have you tried it already? Also, check the path; it could be
//jdbc-resource[@pool-name='TimerPool']/text()='text2'
or
/domain/resources/jdbc-resource[@pool-name='TimerPool']/text()='text2'
or
//resources/jdbc-resource[@pool-name='TimerPool']/text()='text2'
I'm using SAX to read/parse XML documents and I have it working fine, except for this particular site, where Eclipse tells me "junk after document element" and I get no data returned:
http://www.zachblume.com/apis/rhyme.php?format=xml&word=example
The site is not mine; I'm just trying to get some data from it.
Yes, that's not an XML document. It's trying to include more than one root element:
<?xml version="1.0"?>
<word>ampal</word>
<word>ample</word>
<word>hampel</word>
<word>hample</word>
<word>lampl</word>
<word>pampel</word>
<word>sample</word>
The parser regards everything after <word>ampal</word> as junk, because by that time it has read a complete document; hence the complaint about "junk after document element".
An XML document can only have one root, but several children within the root. For example:
<?xml version="1.0"?>
<words>
<word>ampal</word>
<word>ample</word>
<word>hampel</word>
<word>hample</word>
<word>lampl</word>
<word>pampel</word>
<word>sample</word>
</words>
The page does not contain XML. It contains an XML snippet at best:
<?xml version="1.0"?>
<word>ampal</word>
<word>ample</word>
<word>hampel</word>
<word>hample</word>
<word>lampl</word>
<word>pampel</word>
<word>sample</word>
This is incorrect since there is no document element. SAX interprets the first <word> as the document element, and correctly reports "junk after document element" since for all it knows, the document element ends on line 1.
To get around the error, do not treat this document as XML. Download it as text, remove the XML declaration (<?xml version="1.0"?>) and then wrap it in a fake document element before you try to process it.
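The workaround could be sketched like this. The class name FragmentWrapper and the wrapper element name words are made up for illustration, and downloading the text from the site is left out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FragmentWrapper {

    /**
     * Strips a leading XML declaration, wraps the remaining text in a
     * single root element, and parses the result.
     */
    public static Document parseFragment(String text) throws Exception {
        String body = text.replaceFirst("^\\s*<\\?xml[^>]*\\?>", "");
        String wrapped = "<words>" + body + "</words>"; // "words" is an arbitrary root name
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(wrapped)));
    }

    public static void main(String[] args) throws Exception {
        String downloaded = "<?xml version=\"1.0\"?>\n"
                + "<word>ampal</word>\n"
                + "<word>ample</word>\n"
                + "<word>sample</word>\n";
        NodeList words = parseFragment(downloaded).getElementsByTagName("word");
        for (int i = 0; i < words.getLength(); i++) {
            System.out.println(words.item(i).getTextContent());
        }
    }
}
```

The same wrapped string can be fed to a SAX parser instead of a DocumentBuilder if the existing SAX handler should be reused.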