Using XPath in XMLObject to query by namespace - java

I have a simple XML document
<abc:MyForm xmlns:abc='http://myform.com'>
<abc:Forms>
<def:Form1 xmlns:def='http://decform.com'>
....
</def:Form1>
<ghi:Form2 xmlns:ghi='http://ghiform.com'>
....
</ghi:Form2>
</abc:Forms>
</abc:MyForm>
I'm using XmlObject from Apache XMLBeans, and the following XPath expression works perfectly:
object.selectPath("declare namespace abc='http://myform.com'
abc:Form/abc:Forms/*");
This gives me the two Form nodes (def and ghi). However, I want to be able to query by namespace, so let's say I only want Form2. I've tried this and it fails:
object.selectPath("declare namespace abc='http://myform.com'
abc:Form/abc:Forms/*
[namespace-uri() = 'http://ghiform.com']");
The selectPath returns 0 nodes. Does anyone know what is going on?
Update:
If I do the following in 2 steps, then I can get the result that I want.
XmlObject forms = object.selectPath("declare namespace abc='http://myform.com'
abc:Form/abc:Forms")[0];
forms.selectPath("*[namespace-uri() = 'http://ghiform.com']");
This gives me the ghi:Form node just like it should. I don't understand why it doesn't work as a single XPath expression, though.
Thanks

The simple answer is that you can't. The namespace prefix is just a shorthand for the namespace URI, which is all that matters.
For a namespace-aware parser, your two tags are identical.
If you really want to differentiate using the prefix (although you really, really shouldn't be doing it), you can use a non-namespace-aware parser and just treat the prefix as if it were part of the element name.
But ideally you should read a tutorial on how namespaces work and try to use them as they were designed to be used.
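For comparison, filtering on the namespace URI rather than the prefix can be written as a single expression with the JDK's own javax.xml.xpath API. This is only a sketch with the standard library, not XMLBeans, so it does not explain why selectPath returned 0 nodes; the class name NamespaceUriFilter is made up for illustration, and the elided element content is left empty.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceUriFilter {
    // The document from the question; the "...." content is left out.
    static final String XML =
        "<abc:MyForm xmlns:abc='http://myform.com'>"
      + "<abc:Forms>"
      + "<def:Form1 xmlns:def='http://decform.com'/>"
      + "<ghi:Form2 xmlns:ghi='http://ghiform.com'/>"
      + "</abc:Forms>"
      + "</abc:MyForm>";

    // Evaluates the expression and returns the qualified names of the matched nodes.
    static List<String> names(String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // needed so elements carry their namespace URI
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            result.add(hits.item(i).getNodeName());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Wildcard steps avoid having to register the abc prefix.
        System.out.println(names("/*/*/*"));
        System.out.println(names("/*/*/*[namespace-uri() = 'http://ghiform.com']"));
    }
}
```

The predicate compares the namespace URI, so the result stays the same if the document binds http://ghiform.com to a different prefix.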

Related

Access JDOM elements independent of its namespace

I have XML like the following:
<v2:Root xmlns:v2="www.example.com/xsd/">
<ABC>test data</ABC>
<ABC>test data1</ABC>
<ABC>test data2</ABC>
</v2:Root>
When I access the ABC element using JDOM2, the element value shown in the debugger is
[Element:ABC[Namespace:"www.example.com/xsd/"]].
That's why I can't access the element with just the XPath expression "//ABC". I'm forced to use the expression "/*[local-name()='ABC']", which works.
Now, my requirement is to access the element using the expression "//ABC" only. Is there any way?
Thanks in advance for any help.
I think you are mistaken about what your XML actually looks like. I believe you also must have:
xmlns="www.example.com/xsd/"
in there somewhere otherwise your ABC Elements would be in the NO_NAMESPACE namespace (and the ABC toString() method would look like: [Element:ABC] )
So, your XML snippet does not match the ABC Element toString() output.
If you fix your question it will be easier to suggest what your XPath expression should look like.
EDIT, assuming I am right that you have the additional redefinition of the default Namespace, then you can use the following JDOM to get the ABC elements:
XPathFactory xpf = XPathFactory.instance();
Namespace defns = Namespace.getNamespace("defns", "www.example.com/xsd/");
XPathExpression<Element> xpe = xpf.compile("//defns:ABC", Filters.element(), null, defns);
List<Element> abcs = xpe.evaluate(doc);
You should read the following excerpt from the XPath specification carefully:
A QName in the node test is expanded into an expanded-name using the namespace declarations from the expression context. This is the same way expansion is done for element type names in start and end-tags except that the default namespace declared with xmlns is not used: if the QName does not have a prefix, then the namespace URI is null (this is the same way attribute names are expanded). It is an error if the QName has a prefix for which there is no namespace declaration in the expression context.
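The rule quoted above can be observed with the JDK's javax.xml.xpath API (used here instead of JDOM only to keep the sketch dependency-free). The document below is an assumption: it is the question's XML plus the default-namespace declaration that the answer suspects is present. The class name UnprefixedNameTest is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class UnprefixedNameTest {
    // Assumed input: the xmlns='...' declaration is not in the posted snippet.
    static final String XML =
        "<v2:Root xmlns:v2='www.example.com/xsd/' xmlns='www.example.com/xsd/'>"
      + "<ABC>test data</ABC><ABC>test data1</ABC><ABC>test data2</ABC>"
      + "</v2:Root>";

    // Returns how many nodes the expression selects in the document above.
    static int count(String expression) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
        return hits.getLength();
    }

    public static void main(String[] args) throws Exception {
        // An unprefixed name test means "no namespace", so it ignores the default namespace.
        System.out.println(count("//ABC"));
        // Matching on the local name alone finds the namespaced elements.
        System.out.println(count("//*[local-name()='ABC']"));
    }
}
```

Under these assumptions the first query selects nothing and the second selects the three ABC elements, which is why "//ABC" alone cannot work once the elements are in a namespace.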

How can I use XPath to resolve the following 'tags' values?

The XML file is:
<xml-fragment xmlns:xyz="http://someurl">
<xyz:xyzcontent>
<contentattribute>
<key>tags</key>
<value>tag1, tag2</value>
</contentattribute>
</xyz:xyzcontent>
...
I've tried the following:
XPathExpression createdDateExpression = xpath.compile("/contentattribute/key/attribute::tags/value");
There are several problems with your query.
The XML is broken (root tag not closed) -- probably just a copy/paste mistake
You're starting somewhere right in the middle of the XML tree, but actually try to query from the root node. Use the descendant-or-self-axis // in the beginning.
Which attribute are you querying using the attribute-axis? There is none.
Where did you register the namespaces? What namespace is xyz, anyway? I guess it's actually vp, but you obfuscated incompletely (or are not giving all relevant parts of the document).
Use predicates and string comparison to filter at axis steps.
Try the following:
Make sure to register the namespace, have a look at the reference for that (or give more information).
Use the XPath query //contentattribute[key='tags']/value
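A minimal sketch of that query with the JDK's javax.xml.xpath API follows. It assumes the fragment is closed with </xml-fragment> and that the elided "..." part contains nothing relevant; the class name TagsLookup is made up. Because contentattribute, key and value are in no namespace in the posted fragment, this particular expression needs no prefix registration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class TagsLookup {
    // The fragment from the question, with the root element closed (an assumption).
    static final String XML =
        "<xml-fragment xmlns:xyz='http://someurl'>"
      + "<xyz:xyzcontent>"
      + "<contentattribute>"
      + "<key>tags</key>"
      + "<value>tag1, tag2</value>"
      + "</contentattribute>"
      + "</xyz:xyzcontent>"
      + "</xml-fragment>";

    // Returns the string value of the first <value> whose sibling <key> is 'tags'.
    static String tags() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The predicate [key='tags'] filters on the child element's text, not on an attribute.
        return xpath.evaluate("//contentattribute[key='tags']/value", doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(tags());
    }
}
```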

Reading XML data between tags

I have an XML document and am writing a servlet to pick up its contents. One such tag is <itunes:author>Jonathan Kendrick</itunes:author>
I need to get the author value from this, but the : is causing trouble.
I tried using a namespace and an escape sequence for the :, but neither worked for me.
For the other XML elements I am simply using
String link=node.getChildText("link").toString();
I am using the JDOM parser.
In your XML, the sequence 'itunes:author' represents what's called a QName, a "Qualified Name". In XML it consists of a 'Namespace prefix' and a 'Local Name'. In your example, the namespace prefix is 'itunes', and the 'local name' is 'author'.
What you want is the 'author' element in the namespace linked to the prefix 'itunes'. The actual namespace is normally a full URL. I believe the full URL for your example is probably xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd", but you should check that.
So, the Namespace is "http://www.itunes.com/dtds/podcast-1.0.dtd", and its prefix is declared to be 'itunes' (but it could be something else; the actual prefix name is not technically important).
You want to get the 'author' in the 'http://www.itunes.com/dtds/podcast-1.0.dtd' Namespace so you want:
String author = node.getChildText("author", Namespace.getNamespace("http://www.itunes.com/dtds/podcast-1.0.dtd"));
For more information on Namespaces check out: http://www.w3schools.com/xml/xml_namespaces.asp
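The same lookup by namespace URI and local name can be sketched with the JDK's built-in DOM, for readers without JDOM on the classpath. The feed item below is hypothetical (only the itunes:author element comes from the question), the namespace URI is the one guessed above and should be checked against the real feed, and the class name ItunesAuthor is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ItunesAuthor {
    // Assumed namespace URI; verify it against the xmlns:itunes declaration in the feed.
    static final String ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd";

    // Hypothetical item; only <itunes:author> is taken from the question.
    static final String XML =
        "<item xmlns:itunes='" + ITUNES_NS + "'>"
      + "<link>http://example.com/episode1</link>"
      + "<itunes:author>Jonathan Kendrick</itunes:author>"
      + "</item>";

    // Looks the element up by (namespace URI, local name); the prefix plays no role.
    static String author() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for getElementsByTagNameNS to match
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        NodeList authors = doc.getDocumentElement()
                .getElementsByTagNameNS(ITUNES_NS, "author");
        return authors.getLength() > 0 ? authors.item(0).getTextContent() : null;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(author());
    }
}
```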

XPath results to empty string

Following is my XML file
<xyzevent xmlns="http://www.xyz.com/common/xyzevent/v1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<header>
----
</header>
<subscription xmlns="http://www.xyz.com/common/xyzevent/source/v1">
<sender></sender>
<receiver>
<clientsubscription>
<servicemap>nanna</servicemap>
</clientsubscription>
</receiver>
</subscription>
</xyzevent>
When I build an org.w3c.dom.Document from this XML and apply an XPathExpression with the expression
/xyzevent/subscription/receiver/clientsubscription/servicemap/text()
the result is an empty string. What can be the issue with the expression?
Thank you
That's because your XML document uses a namespace. XPath is really annoying with namespaces. To confirm this, strip the two xmlns=http://.../v1 declarations from the document and run your XPath expression against the unnamespaced, unverifiable XML file. It'll match.
What's happening is that your XPath expression tries to select /xyzevent, when your document contains {http://.../v1}:xyzevent, which is not the same thing.
There are various ways around this problem. The proper way is to set up a NamespaceContext so you can use the prefix:localName notation in your XPath expression and have the prefixes be resolved to the correct URI. There's a short blurb about this in the xerces docs and some more elsewhere on StackOverflow. There's an extensive description at ibm.com.
Your NamespaceContext will contain two (or more) mappings:
{
event => http://www.xyz.com/common/xyzevent/v1
source => http://www.xyz.com/common/xyzevent/source/v1
}
Your XPath expression can then become /event:xyzevent/source:subscription/source:receiver/.../text().
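A minimal sketch of such a NamespaceContext with the JDK's javax.xml.xpath API is shown here. The prefixes event and source are the arbitrary ones chosen above, the header content elided in the question is omitted, and the class name ServiceMapLookup is made up.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ServiceMapLookup {
    static final String EVENT_NS = "http://www.xyz.com/common/xyzevent/v1";
    static final String SOURCE_NS = "http://www.xyz.com/common/xyzevent/source/v1";

    // The document from the question; the header content is omitted.
    static final String XML =
        "<xyzevent xmlns='" + EVENT_NS + "'>"
      + "<header/>"
      + "<subscription xmlns='" + SOURCE_NS + "'>"
      + "<sender></sender>"
      + "<receiver><clientsubscription>"
      + "<servicemap>nanna</servicemap>"
      + "</clientsubscription></receiver>"
      + "</subscription>"
      + "</xyzevent>";

    // Maps the prefixes used in the XPath expression to namespace URIs.
    static final NamespaceContext CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            if ("event".equals(prefix)) return EVENT_NS;
            if ("source".equals(prefix)) return SOURCE_NS;
            return XMLConstants.NULL_NS_URI;
        }
        @Override
        public String getPrefix(String uri) {
            return null; // reverse lookup is not needed for evaluating this expression
        }
        @Override
        public Iterator<String> getPrefixes(String uri) {
            return Collections.emptyIterator();
        }
    };

    static String serviceMap() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(CONTEXT);
        return xpath.evaluate(
            "/event:xyzevent/source:subscription/source:receiver"
          + "/source:clientsubscription/source:servicemap/text()", doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serviceMap());
    }
}
```

The prefixes in the expression only have to agree with the NamespaceContext, not with the document, which uses default namespaces and no prefixes at all.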
As a nasty workaround, you can rewrite your xpath expression to select using the local-name() function:
/*[local-name()='xyzevent']/*[local-name()='subscription']/ ...
In this case, the expression matches any element whose local name is xyzevent, regardless of namespace URI.
Your XML has a default namespace: xmlns="http://www.xyz.com/common/xyzevent/v1", therefore you need to define it in your XML/XPath engine.
Or use this XPath:
/*[local-name() = 'xyzevent']
/*[local-name() = 'subscription']
/*[local-name() = 'receiver']
/*[local-name() = 'clientsubscription']
/*[local-name() = 'servicemap']
/text()
xyzevent is your root element, so you just need to use "/subscription/receiver/clientsubscription/servicemap/text()".
I evaluated your expression in the following link:
http://www.whitebeam.org/library/guide/TechNotes/xpathtestbed.rhtm
And it seems, it selects "nanna".

Java+DOM: How do I elegantly rename an xmlns:xyz attribute?

I have something like that as input:
<root xmlns="urn:my:main"
xmlns:a="urn:my:a" xmlns:b="urn:my:b">
...
</root>
And want to have something like that as output:
<MY_main:root xmlns:MY_main="urn:my:main"
xmlns:MY_a="urn:my:a" xmlns:MY_b="urn:my:b">
...
</MY_main:root>
... or the other way round.
How do I achieve this using DOM in an elegant way?
That is, without searching for attribute names starting with "xmlns".
You will not find the xmlns attributes in your DOM, they are not part of the DOM.
You may have some success if you find the nodes you want (getElementsByTagNameNS) and set their qualifiedName (qname) to a new value containing the prefix you like. Then re-generate the XML document.
By the way, the namespace prefix (which is what you are trying to change) is largely irrelevant when using any sane XML parser. The namespace URI is what counts. Why would you want to set the prefix to a specific value?
I have used the following jdom stub to remove all the namespace references:
Element rootElement = new SAXBuilder().build(contents).getRootElement();
for (Iterator i = rootElement.getDescendants(new ElementFilter()); i.hasNext();) {
Element el = (Element) i.next();
if (el.getNamespace() != null) el.setNamespace(null);
}
return rootElement;
Reading and writing the xml is done as normal. If you are just after human readable output that should do the job. If however you need to convert back you may have a problem.
The following may work to replace the namespaces with a more friendly version based on your example (untested):
rootElement.setNamespace(Namespace.getNamespace("MY_main", "urn:my:main"));
rootElement.addNamespaceDeclaration(Namespace.getNamespace("MY_a", "urn:my:a"));
rootElement.addNamespaceDeclaration(Namespace.getNamespace("MY_b", "urn:my:b"));
