Why is DOM doing this? (Wrong nodeName XML) - java

I have this XML (just a small part; the complete XML is big):
<Root>
<Products>
<Product ID="307488">
<ClassificationReference ClassificationID="AR" Type="AgencyLink"/>
<ClassificationReference ClassificationID="AM" Type="AgencyLink">
<MetaData>
<Value AttributeID="tipoDeCompra" ID="C">Compra Centralizada</Value>
</MetaData>
</ClassificationReference>
</Product>
</Products>
</Root>
Well... I want to get the data from the line
<Value AttributeID="tipoDeCompra" ID="C">Compra Centralizada</Value>
I'm using DOM, and when I call nodoValue.getTextContent() I get "Compra Centralizada", which is OK.
But when I call nodoValue.getNodeName() I get "MetaData", whereas I was expecting "Value".
What is the explanation for this behaviour?
Thanks!

Your nodoValue variable most likely points to the MetaData node, so the returned name is correct.
Note that for an element node Node.getTextContent() returns the concatenation of the text content of all child nodes. Therefore in your example the text content of the MetaData element is equal to the text content of the Value element, namely Compra Centralizada.

I guess you are getting the Node object using getElementsByTagName("MetaData"). In that case nodoValue.getTextContent() will return the text content correctly, but to get the node name you expect you need to get the child node.

Your current node must be MetaData, and getTextContent() gives all the text between its opening and closing tags. That is why you are getting
Compra Centralizada
as the value. You should get its child using getChildNodes(), and from there you can get the Value tag.
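To illustrate the answers above, here is a minimal sketch using the JDK's built-in DOM parser. The class name NodeNameDemo and the helper firstChildElement are made up for this example, and the XML is a trimmed copy of the one in the question. It also shows one thing to watch for: when the XML is indented, the first child returned for MetaData can be a whitespace text node rather than Value, so the helper skips non-element children.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class NodeNameDemo {
    // Trimmed copy of the XML from the question, keeping the line breaks inside MetaData.
    static final String XML =
        "<Root><Products><Product ID=\"307488\">"
        + "<ClassificationReference ClassificationID=\"AM\" Type=\"AgencyLink\">"
        + "<MetaData>\n"
        + "<Value AttributeID=\"tipoDeCompra\" ID=\"C\">Compra Centralizada</Value>\n"
        + "</MetaData>"
        + "</ClassificationReference></Product></Products></Root>";

    static Document parse() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(XML.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the sample XML", e);
        }
    }

    // Hypothetical helper: returns the first child that is an element,
    // skipping whitespace text nodes; null if there is none.
    static Element firstChildElement(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                return (Element) n;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document doc = parse();
        Node metaData = doc.getElementsByTagName("MetaData").item(0);

        // The text content of MetaData is the concatenated text of its descendants.
        System.out.println(metaData.getNodeName());                 // MetaData
        System.out.println(metaData.getTextContent().trim());       // Compra Centralizada

        // The very first child here is the line break after <MetaData>, not Value.
        System.out.println(metaData.getFirstChild().getNodeName()); // #text

        Element value = firstChildElement(metaData);
        System.out.println(value.getNodeName());                    // Value
        System.out.println(value.getAttribute("AttributeID"));      // tipoDeCompra
    }
}
```

Calling getElementsByTagName("Value") directly would be another way to land on the right node, if the rest of the code does not need MetaData itself.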

Related

Check if element node contains no text using java and Xpath?

I am new to XPath. I need to get a boolean response from XPath: if an element does not contain any text it should return false, otherwise true. I have seen many examples, but I don't have much time to learn XPath expressions. Below is the XML file.
<?xml version="1.0" encoding="UTF-8" ?>
<order id="1234" date="05/06/2013">
<customer first_name="James" last_name="Rorrison">
<email>j.rorri@me.com</email>
<phoneNumber>+44 1234 1234</phoneNumber>
</customer>
<content>
<order_line item="H2G2" quantity="1">
<unit_price>23.5</unit_price>
</order_line>
<order_line item="Harry Potter" quantity="2">
<unit_price></unit_price>//**I want false here**
</order_line>
</content>
<credit_card number="1357" expiry_date="10/13" control_number="234" type="Visa" />
</order>
Could you point me in the right direction for creating an XPath expression for this problem?
What I want is an expression (dummy expression) like the one below.
/order/content/order_line/unit_price[at this point I want to put a validation which will return true or false based on some check of isNull or notNull].
The following xpath will do this:
not(boolean(//*[not(text() or *)]))
but this XPath will also include the credit_card node, since it too does not contain any text (the attributes are not text()).
If you also want to exclude nodes with attributes, then use this:
not(boolean(//*[not(text() or * or @*)]))
Following your edit, you can do this:
/order/content/order_line/unit_price[not(text())]
It will return a list of nodes with no text, and from there you can test against the count of nodes.
Or, to return true/false:
not(boolean(/order/content/order_line/unit_price[not(text())]))
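For the Java side, here is a rough sketch of evaluating such expressions as booleans with the JDK's javax.xml.xpath API. The class name UnitPriceCheck and the method evaluate are made up for illustration, and the XML is a cut-down version of the order document from the question. The per-item expressions in main are my own variation (an assumption about what a per-line check could look like), not something taken from the answer.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class UnitPriceCheck {
    // Cut-down copy of the order document from the question.
    static final String XML =
        "<order id=\"1234\"><content>"
        + "<order_line item=\"H2G2\" quantity=\"1\"><unit_price>23.5</unit_price></order_line>"
        + "<order_line item=\"Harry Potter\" quantity=\"2\"><unit_price></unit_price></order_line>"
        + "</content></order>";

    // Evaluates an XPath expression against XML and converts the result to a boolean.
    static boolean evaluate(String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return (Boolean) xpath.evaluate(
                expression, new InputSource(new StringReader(XML)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // False as soon as any unit_price has no text node.
        System.out.println(evaluate(
            "not(boolean(/order/content/order_line/unit_price[not(text())]))")); // false

        // Per-line variant: true only if that line's unit_price has text.
        System.out.println(evaluate(
            "boolean(/order/content/order_line[@item='H2G2']/unit_price[text()])")); // true
        System.out.println(evaluate(
            "boolean(/order/content/order_line[@item='Harry Potter']/unit_price[text()])")); // false
    }
}
```

Note that the first expression answers "do all unit_price elements have text?" for the whole document, which may or may not be what you want if you need the answer per order_line.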

Return type of node itself in xpath

Having the following xml
<xml>
<property href="abc">b</property>
<element attr="def">k</element>
</xml>
How can I make the following XPath return, literally, the string element?
*[@attr='def']
On its own this might seem a weird thing to do, but using the above XPath I can't find the node type itself (only the attributes and children).
If you want the name of an element node, then use name(*[@attr = 'def']) or local-name(*[@attr = 'def']).
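As a quick check of the name() suggestion, here is a small sketch with the JDK's XPath API. The class and method names are invented for the example, and I am assuming the expression is evaluated from the document root, so the path is written as /xml/* rather than the relative * from the question.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class ElementNameLookup {
    // The XML from the question, on one line.
    static final String XML =
        "<xml><property href=\"abc\">b</property><element attr=\"def\">k</element></xml>";

    // Evaluates the expression and returns its string value.
    static String evaluateAsString(String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(XML)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // name() returns the element name of the first matching node.
        System.out.println(evaluateAsString("name(/xml/*[@attr='def'])"));       // element
        // Without name(), the string value (the text content) comes back instead.
        System.out.println(evaluateAsString("/xml/*[@attr='def']"));             // k
    }
}
```

The difference between name() and local-name() only shows up with namespace prefixes, which this document does not use.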

XStream doesn't show CData tags

When I read an XML file with XStream, it doesn't show the <![CDATA[ and ]]> tags.
I'd like XStream to show them.
For example:
This is a part of "test.xml"
<![CDATA[<b>]]>
If I show it in a browser, the browser shows it correctly:
<![CDATA[ <b> ]]>
But when I read and show XML with XStream I see only:
<b>
If I'm not mistaken, each element should have a name and a value (if they are being read in as XppDom objects). I'm guessing that what you're looking at is the value. With the CDATA section it might be a little different because it is unparsed data, so the name may be "!CDATA" or it may not have one at all. In the normal case, if you have <node attr1='val1'> text </node>, then once it is read in, calling .getName() will return "node", .getValue() will return the text, and .getAttribute("attr1") will return "val1".
If you wanted to print everything with their tags you could make a method String formatXppDom(XppDom elem) to format a printable string with the tags.

query to get element that has particular value using xpath

I am wondering whether there is a way to get the element that has a particular value (text) from an XML document using XPath.
Example doc:
<domain log-root="/logs" application-root="/applications"><resources>
<jdbc-resource pool-name="SamplePool" jndi-name="jdbc/sample" />
<jdbc-resource pool-name="TimerPool" jndi-name="abc">text1</jdbc-resource>
<jdbc-resource pool-name="TimerPool" jndi-name="def">text2</jdbc-resource>
<jdbc-resource pool-name="TimerPool" jndi-name="ghi">text3</jdbc-resource></resources></domain>
Example xPath Query:
/domain//jdbc-resource[@pool-name='TimerPool']/text()='text2'
Please post your ideas if there is any.
Use:
/domain/*/jdbc-resource[@pool-name='TimerPool' and .='text2']
or you may use:
/domain/*/jdbc-resource[@pool-name='TimerPool'][.='text2']
Both expressions above select all jdbc-resource elements that are grandchildren of the top element of the XML document, whose pool-name attribute has the string value "TimerPool", and whose own string value is "text2".
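If you are running this from Java, a minimal sketch could look like the following. It uses the JDK's javax.xml.xpath API and evaluates the expression as a node set; the class name JdbcResourceQuery and the helper jndiNamesOfMatches are hypothetical, and returning the jndi-name attribute is just a convenient way to show which element matched.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class JdbcResourceQuery {
    // The example document from the question.
    static final String XML =
        "<domain log-root=\"/logs\" application-root=\"/applications\"><resources>"
        + "<jdbc-resource pool-name=\"SamplePool\" jndi-name=\"jdbc/sample\" />"
        + "<jdbc-resource pool-name=\"TimerPool\" jndi-name=\"abc\">text1</jdbc-resource>"
        + "<jdbc-resource pool-name=\"TimerPool\" jndi-name=\"def\">text2</jdbc-resource>"
        + "<jdbc-resource pool-name=\"TimerPool\" jndi-name=\"ghi\">text3</jdbc-resource>"
        + "</resources></domain>";

    // Hypothetical helper: returns the jndi-name of every element the expression selects.
    static List<String> jndiNamesOfMatches(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("jndi-name"));
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(jndiNamesOfMatches(
            "/domain/*/jdbc-resource[@pool-name='TimerPool' and .='text2']")); // [def]
        System.out.println(jndiNamesOfMatches(
            "/domain/*/jdbc-resource[@pool-name='TimerPool'][.='text2']"));    // [def]
    }
}
```

The comparison has to sit inside the predicate for this to work. An expression that ends in /text()='text2' evaluates to a boolean, not to the element, so it cannot be requested as a node set.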
Well, text() should do. http://www.w3schools.com/xpath/xpath_examples.asp
Have you tried it already? Also, check the path, it could be
//jdbc-resource[@pool-name='TimerPool']/text()='text2'
or
/domain/resources/jdbc-resource[@pool-name='TimerPool']/text()='text2'
or
//resources/jdbc-resource[@pool-name='TimerPool']/text()='text2'

java xpath parsing

Is there a way to retrieve from an XML file all the nodes that are not empty using XPath? The XML looks like this:
<workspace>
<light>
<activeFlag>true</activeFlag>
<ambientLight>0.0:0.0:0.0:0.0</ambientLight>
<diffuseLight>1.0:1.0;1.0:1.0</diffuseLight>
<specularLight>2.0:2.0:2.0:2.0</specularLight>
<position>0.1:0.1:0.1:0.1</position>
<spotDirection>0.2:0.2:0.2:0.2</spotDirection>
<spotExponent>1.0</spotExponent>
<spotCutoff>2.0</spotCutoff>
<constantAttenuation>3.0</constantAttenuation>
<linearAtenuation>4.0</linearAtenuation>
<quadricAttenuation>5.0</quadricAttenuation>
</light>
<camera>
<activeFlag>true</activeFlag>
<position>2:2:2</position>
<normal>1:1:1</normal>
<direction>0:0:0</direction>
</camera>
<object>
<material>lemn</material>
<Lu>1</Lu>
<Lv>2</Lv>
<unit>metric</unit>
<tip>tip</tip>
<origin>1:1:1</origin>
<normal>2:2:2</normal>
<parent>
<object>null</object>
</parent>
<leafs>
<object>null</object>
</leafs>
</object>
After each tag the parser "sees" another empty node that I don't need.
I guess what you want is all element nodes that have an immediate text node child that does not consist solely of white space:
//*[string-length(normalize-space(text())) > 0]
If you're using XSLT, use <xsl:strip-space elements="*"/>. If you're not, it depends on what technology you are using (you haven't told us), e.g. DOM, JDOM, etc.
You want:
//*[normalize-space()]
The expression:
//*[string-length(normalize-space(text())) > 0]
is a wrong answer. It selects all elements in the document whose first text node child's text isn't whitespace-only.
Therefore, this wouldn't select:
<p><b>Hello </b><i>World!</i></p>
although this paragraph contains quite a lot of text...
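To make the difference between the two expressions concrete, here is a small sketch that runs both against a tiny document with the JDK's XPath 1.0 implementation. The class name NonEmptyElements, the method select, and the wrapper elements doc and empty are invented for the example; only the p paragraph comes from the answer above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NonEmptyElements {
    // Indented on purpose, so that doc starts with a whitespace-only text node.
    static final String XML =
        "<doc>\n"
        + "  <p><b>Hello </b><i>World!</i></p>\n"
        + "  <empty>  </empty>\n"
        + "</doc>";

    // Returns the names of all elements selected by the expression.
    static List<String> select(String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Uses the whole string value of each element: finds doc, p, b and i, but not empty.
        System.out.println(select("//*[normalize-space()]"));

        // Only looks at the first text node child: finds b and i, misses p and doc.
        System.out.println(select("//*[string-length(normalize-space(text())) > 0]"));
    }
}
```

Keep in mind that //*[normalize-space()] also matches ancestors such as doc, because their string value includes the text of all descendants.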
