I have something like this as input:
<root xmlns="urn:my:main"
xmlns:a="urn:my:a" xmlns:b="urn:my:b">
...
</root>
And I want to have something like this as output:
<MY_main:root xmlns:MY_main="urn:my:main"
xmlns:MY_a="urn:my:a" xmlns:MY_b="urn:my:b">
...
</MY_main:root>
... or the other way round.
How do I achieve this using DOM in an elegant way?
That is, without searching for attribute names starting with "xmlns".
You will not find the xmlns attributes in your DOM, they are not part of the DOM.
You may have some success if you find the nodes you want (getElementsByTagNameNS) and set their qualifiedName (qname) to a new value containing the prefix you like. Then re-generate the XML document.
By the way, the namespace prefix (which is what you are trying to change) is largely irrelevant when using any sane XML parser. The namespace URI is what counts. Why would you want to set the prefix to a specific value?
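The renaming idea above can be sketched with the standard `Node.setPrefix` method, assuming a namespace-aware parse. The class name `PrefixRenamer`, the method `renamePrefixes`, and the URI-to-prefix map are hypothetical names chosen for illustration. The sketch only renames the in-memory element nodes; it does not rewrite the namespace declarations on the root, so how the result serializes depends on the serializer you use.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PrefixRenamer {

    /** Parses the XML namespace-aware, re-prefixes elements by namespace URI, returns the root. */
    static Element renamePrefixes(String xml, Map<String, String> prefixByUri) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // without this, getNamespaceURI() returns null
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        // "*", "*" matches every element regardless of namespace and local name.
        NodeList all = doc.getElementsByTagNameNS("*", "*");
        for (int i = 0; i < all.getLength(); i++) {
            Node node = all.item(i);
            String uri = node.getNamespaceURI();
            if (uri == null) {
                continue; // elements in no namespace cannot carry a prefix
            }
            String prefix = prefixByUri.get(uri);
            if (prefix != null) {
                node.setPrefix(prefix); // changes the qualified name, keeps URI and local name
            }
        }
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root xmlns=\"urn:my:main\" xmlns:a=\"urn:my:a\"><a:child/></root>";
        Element root = renamePrefixes(xml, Map.of("urn:my:main", "MY_main", "urn:my:a", "MY_a"));
        System.out.println(root.getNodeName());
        System.out.println(root.getFirstChild().getNodeName());
    }
}
```

Keying the map on the namespace URI rather than on the old prefix keeps the lookup independent of whatever prefixes the input happened to use.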
I have used the following JDOM stub to remove all the namespace references:
Element rootElement = new SAXBuilder().build(contents).getRootElement();
for (Iterator i = rootElement.getDescendants(new ElementFilter()); i.hasNext();) {
Element el = (Element) i.next();
if (el.getNamespace() != null) el.setNamespace(null);
}
return rootElement;
Reading and writing the XML is done as normal. If you are just after human-readable output, that should do the job. If, however, you need to convert back, you may have a problem.
The following may work to replace the namespaces with a more friendly version based on your example (untested):
rootElement.setNamespace(Namespace.getNamespace("MY_main", "urn:my:main"));
rootElement.addNamespaceDeclaration(Namespace.getNamespace("MY_a", "urn:my:a"));
rootElement.addNamespaceDeclaration(Namespace.getNamespace("MY_b", "urn:my:b"));
Application Background:
Basically, I am building an application that parses an XML document with a SAX parser. For every incoming tag I would like to know its datatype and other information, so I use the XSD associated with that XML file to look these up. To that end, I parse the XSD file and store all the information in a HashMap, so that whenever a tag arrives I can pass its name as the key and obtain the information collected for it during XSD parsing.
Problem I am facing:
As of now, I am able to parse my XSD using DocumentBuilderFactory. But when collecting elements, I can only get one type of element into my NodeList, such as elements with the tag name "xs:element". My XSD also has other element types such as "xs:complexType", "xs:any", etc. I would like to read all of them into a single NodeList that I can later loop over and push into the HashMap. However, I am unable to add any further elements to my NodeList after adding one type to it.
The code below collects the tags named xs:element:
NodeList list = doc.getElementsByTagName("xs:element");
How can I add the tags with xs:complexType and xs:any to the same NodeList?
Is this a good way to find the datatype and other attributes from the XSD, or is there a better approach? Since I may need to hit the HashMap many times, once for every tag in the XML, will there be a performance issue?
Is DocumentBuilderFactory a good approach for parsing the XSD, or are there better libraries for XSD parsing? I looked into Xerces2 but could not find any good examples, so I got stuck and posted the question here.
Following is my code for parsing the XSD using DocumentBuilderFactory:
public class DOMParser {
private static Map<String, Element> xmlTags = new HashMap<String, Element>();
public static void main(String[] args) throws URISyntaxException, SAXException, IOException, ParserConfigurationException {
String xsdPath1 = Paths.get(Xerces2Parser.class.getClassLoader().getResource("test.xsd").toURI()).toFile().getAbsolutePath();
String filePath1 = Path.of(xsdPath1).toString();
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse(new File(filePath1));
NodeList list = doc.getElementsByTagName("xs:element");
System.out.println(list.getLength());
// How to add the xs:complexType to same list as above
// list.add(doc.getElementsByTagName("xs:complexType"));
// list = doc.getElementsByTagName("xs:complexType");
// Loop and add data to Map for future lookups
for (int i = 0; i < list.getLength(); i++) {
Element element = (Element) list.item(i);
if (element.hasAttributes()) {
xmlTags.put(element.getAttribute("name"), element);
}
}
}
}
I don't know what you are trying to achieve (you have described the code you are writing, not the problem it is designed to solve) but what you are doing seems misguided. Trying to get useful information out of an XSD schema by parsing it at the XML level is really hard work, and it's clear from the questions you are asking that you haven't appreciated the complexities of what you are attempting.
It's hard to advise you on the low-level detail of maintaining hash maps and node lists when we don't understand what you are trying to achieve. What information are you trying to extract from the schema, and why?
There are a number of ways of getting information out of a schema at a higher level. Xerces has a Java API for accessing a compiled schema. Saxon has an XML representation of compiled schemas called SCM (the difference from raw XSD is that all the work of expanding xs:include and xs:import, expanding attribute groups, model groups, and substitution groups etc has been done for you). Saxon also has an XPath API (a set of extension functions) for accessing compiled schema information.
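Setting the higher-level APIs aside, the narrow question of getting several tag types into one collection can be sketched as follows. A `NodeList` cannot be appended to, so this sketch copies matching elements into a `java.util.List`. It assumes a namespace-aware parse and matches on the XML Schema namespace instead of the literal `xs:` prefix; `XsdComponentCollector` and `collect` are hypothetical names. It still reads the schema at the raw XML level, so the caveats above about includes, imports, and groups apply unchanged.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XsdComponentCollector {

    // Local names of the schema components we want to keep.
    private static final Set<String> WANTED = Set.of("element", "complexType", "any");

    /** Returns all xs:element, xs:complexType and xs:any elements in document order. */
    static List<Element> collect(String xsd) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // needed for getElementsByTagNameNS and getLocalName
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xsd)));

        // Every element in the XML Schema namespace, whatever prefix the file uses.
        NodeList all = doc.getElementsByTagNameNS(XMLConstants.W3C_XML_SCHEMA_NS_URI, "*");
        List<Element> result = new ArrayList<>();
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (WANTED.contains(e.getLocalName())) {
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                + "<xs:element name=\"order\" type=\"OrderType\"/>"
                + "<xs:complexType name=\"OrderType\"><xs:sequence>"
                + "<xs:element name=\"id\" type=\"xs:string\"/><xs:any/>"
                + "</xs:sequence></xs:complexType></xs:schema>";
        for (Element e : collect(xsd)) {
            System.out.println(e.getLocalName() + " name=" + e.getAttribute("name"));
        }
    }
}
```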
My XML file looks like this:
<Messages>
<Contact Name="Robin" Number="8775454554">
<Message Date="24 Jan 2012" Time="04:04">this is report1</Message>
</Contact>
<Contact Name="Tobin" Number="546456456">
<Message Date="24 Jan 2012" Time="04:04">this is report2</Message>
</Contact>
</Messages>
I need to check whether the 'Number' attribute of Contact element is equal to 'somenumber' and if it is, I'm required to insert one more Message element inside Contact element.
How can it be achieved using DOM? And what are the drawbacks of using DOM?
The main drawback of using DOM is that the whole model must be loaded into memory at once, whereas if you're simply parsing the document as a stream, you can limit the data you keep in memory at any one point. This of course isn't really an issue until you're processing very large XML documents.
As for the processing side of things, something like the following should work:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document dom = db.parse(is);
NodeList contacts = dom.getElementsByTagName("Contact");
for(int i = 0; i < contacts.getLength(); i++) {
Element contact = (Element) contacts.item(i);
String contactNumber = contact.getAttribute("Number");
if(contactNumber.equals(somenumber)) {
Element newMessage = dom.createElement("Message");
// Configure the message element
contact.appendChild(newMessage);
}
}
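The snippet above leaves `is` and `somenumber` undefined. A self-contained variant, assuming the XML arrives as a string and the new message only needs text content, might look like this; `ContactMessages` and `addMessage` are hypothetical names, and the Date and Time attributes are left out for brevity.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ContactMessages {

    /** Appends a Message element to every Contact whose Number attribute equals number. */
    static Document addMessage(String xml, String number, String text) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList contacts = dom.getElementsByTagName("Contact");
        for (int i = 0; i < contacts.getLength(); i++) {
            Element contact = (Element) contacts.item(i);
            // getAttribute returns "" (never null) when the attribute is missing.
            if (number.equals(contact.getAttribute("Number"))) {
                Element newMessage = dom.createElement("Message");
                newMessage.setTextContent(text);
                contact.appendChild(newMessage); // becomes the last child of the Contact
            }
        }
        return dom;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Messages>"
                + "<Contact Name=\"Robin\" Number=\"8775454554\"><Message>this is report1</Message></Contact>"
                + "<Contact Name=\"Tobin\" Number=\"546456456\"><Message>this is report2</Message></Contact>"
                + "</Messages>";
        Document dom = addMessage(xml, "546456456", "this is report3");
        System.out.println(dom.getElementsByTagName("Message").getLength());
    }
}
```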
DOM has two main disadvantages:
It requires reading the complete XML into a Java representation in memory. That can be both time- and memory-consuming.
It is a pretty verbose API, so you need to write a lot of code to achieve simple things like you're asking for.
If time and memory consumption is OK for you, but verbosity is not, you could still use jOOX, a library that I have created to wrap standard Java DOM objects to simplify manipulation of XML. These are some examples of how you would implement your requirement with jOOX:
// With css-style selectors
String result1 = $(file).find("Contact[Number=somenumber]").append(
$("<Message Date=\"25 Jan 2012\" Time=\"23:44\">this is report2</Message>")
).toString();
// With XPath
String result2 = $(file).find("//Contact[@Number = somenumber]").append(
$("<Message Date=\"25 Jan 2012\" Time=\"23:44\">this is report2</Message>")
).toString();
// Instead of file, you can also provide your source XML in various other forms
Note that jOOX only wraps standard Java DOM. The underlying operations (find() and append(), as well as $()) actually perform various DOM operations.
You will do something to this effect.
Get the NodeList of Contact element.
Iterate through the NodeList and get Contact element.
Get Number through contact.getAttribute("Number") where contact is of type Element.
If your number equals someNumber, then add Message by calling contact.appendChild(). Message must be an element.
Use the Document to create a new Element:
Element message = doc.createElement("Message");
message.setAttribute("message", strMessage);
Now add this element after whatever element you want using:
elem.getParentNode().insertBefore(message, elem.getNextSibling());
You might want to take a look at this tutorial; it's about exactly what you want to do.
I have an XML document that I am trying to parse and read, but I don't know how many nodes it may contain. So I am trying to read each node and its value.
How do I do that for, say:
<company>
<personName>John</personName>
<emailId>abc@test.com</emailId>
<department>Products</department>
(may have additional nodes and values)
</company>
Sorry, I forgot to add my code, which uses DOM:
Document document = getDocumentBuilder().parse(new ByteArrayInputStream(myXML.getBytes("UTF-8")));
String xPathExp = "//company";
XPath xPath = getXPath();
NodeList nodeList = (NodeList)xPath.evaluate(xPathExp, document, XPathConstants.NODESET);
nodeListSize = nodeList.getLength();
System.out.println("#####nodeListSize"+nodeListSize);
for(int i=0;i<nodeListSize;i++){
element=(Element)nodeList.item(i);
m1XMLOutputResponse=element.getTextContent();
System.out.println("#####"+element.getTagName()+" "+element.getTextContent());
}
Consider using the JAXB library. It's really a painless way of mapping your XML to Java classes and back. The basic principle is that JAXB takes your XML Schemas (XSD) and generates corresponding Java classes for you. Then you just call the marshal or unmarshal methods, which generate the XML from your Java objects or populate your Java objects with the contents of the XML.
The only drawback is, of course, that you'd need to know how to write the XML Schemas :)
Learn how to use XML DOM. Here is an example on how to use XML DOM to fetch node and node values.
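As a minimal sketch of that approach, the following walks the child elements of a parent without knowing their names in advance. `ChildElementReader` and `readChildren` are hypothetical names; the sketch assumes each child name occurs once, since a repeated name would overwrite the earlier entry in the map.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ChildElementReader {

    /** Maps each child element name of the first parentTag element to its trimmed text. */
    static Map<String, String> readChildren(String xml, String parentTag) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Map<String, String> values = new LinkedHashMap<>(); // keeps document order
        Node parent = doc.getElementsByTagName(parentTag).item(0);
        if (parent == null) {
            return values; // no such element in the document
        }
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip whitespace text nodes, comments, and other non-element children.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                values.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company>\n"
                + "  <personName>John</personName>\n"
                + "  <emailId>abc@test.com</emailId>\n"
                + "  <department>Products</department>\n"
                + "</company>";
        readChildren(xml, "company").forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```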
I have a simple XML document
<abc:MyForm xmlns:abc='http://myform.com'>
<abc:Forms>
<def:Form1 xmlns:def='http://decform.com'>
....
</def:Form1>
<ghi:Form2 xmlns:ghi='http://ghiform.com'>
....
</ghi:Form2>
</abc:Forms>
</abc:MyForm>
I'm using XmlObject from Apache XMLBeans, and when I try the following XPath expression it works perfectly:
object.selectPath("declare namespace abc='http://myform.com' "
+ "abc:Form/abc:Forms/*");
This gives me the 2 Form nodes (def and ghi). However, I want to be able to query by specifying a namespace, so let's say I only want Form2. I've tried this and it fails:
object.selectPath("declare namespace abc='http://myform.com' "
+ "abc:Form/abc:Forms/*"
+ "[namespace-uri() = 'http://ghiform.com']");
The selectPath returns 0 nodes. Does anyone know what is going on?
Update:
If I do the following in 2 steps, then I can get the result that I want:
XmlObject forms = object.selectPath("declare namespace abc='http://myform.com' "
+ "abc:Form/abc:Forms")[0];
forms.selectPath("*[namespace-uri() = 'http://ghiform.com']");
This gives me the ghi:Form node just like it should. I don't understand why it doesn't work as a single XPath expression, though.
Thanks
The simple answer is that you can't. The namespace prefix is just a shorthand for the namespace URI, which is all that matters.
For a namespace-aware parser, your two tags are identical.
If you really want to differentiate using the prefix (although you really, really shouldn't be doing it), you can use a non namespace-aware parser and just treat the prefix as if it was part of the element name.
But ideally you should read a tutorial on how namespaces work and try to use them as they were designed to be used.
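To illustrate the point that a namespace-aware parser treats differently prefixed tags as the same name, here is a small sketch comparing two root elements by namespace URI and local name. `NamespaceIdentity` and `sameExpandedName` are hypothetical names, and the `urn:x` and `urn:y` URIs are made up for the example.

```java
import java.io.StringReader;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class NamespaceIdentity {

    /** Parses the XML namespace-aware and returns its root element. */
    static Element parseRoot(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for getNamespaceURI and getLocalName
        return dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    /** True when both roots have the same namespace URI and local name, ignoring prefixes. */
    static boolean sameExpandedName(String xmlA, String xmlB) throws Exception {
        Element a = parseRoot(xmlA);
        Element b = parseRoot(xmlB);
        return Objects.equals(a.getNamespaceURI(), b.getNamespaceURI())
                && Objects.equals(a.getLocalName(), b.getLocalName());
    }

    public static void main(String[] args) throws Exception {
        // Same URI, different prefixes: the expanded names match.
        System.out.println(sameExpandedName(
                "<a:form xmlns:a=\"urn:x\"/>", "<b:form xmlns:b=\"urn:x\"/>"));
        // Same prefix, different URIs: the expanded names differ.
        System.out.println(sameExpandedName(
                "<a:form xmlns:a=\"urn:x\"/>", "<a:form xmlns:a=\"urn:y\"/>"));
    }
}
```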
I have an XML file where some sub tags (child node elements) are optional.
e.g.
<part>
<note>
<rest/>
</note>
<note>
<pitch></pitch>
</note>
<note>
<pitch></pitch>
</note>
</part>
But when I read the XML file by tags, it throws a NullPointerException, since some sub-tags are optional (e.g. rest and pitch in the example above). How can I filter this out? I couldn't find any method to check whether an element with a particular tag name exists. Even if I check that getElementsByTagName("tag-name") does not return null, execution still enters the condition body and obviously throws the exception.
How may I resolve this?
The java code is:
if(fstelm_Note.getElementsByTagName("rest")!=null){
if(fstelm_Note.getElementsByTagName("rest")==null){
break;
}
NodeList restElmLst = fstelm_Note.getElementsByTagName("rest");
Element restElm = (Element)restElmLst.item(0);
NodeList rest = restElm.getChildNodes();
String restVal = ((Node)rest.item(0)).getNodeValue().toString();
}else if(fstelm_Note.getElementsByTagName("note")!=null){
if(fstelm_Note.getElementsByTagName("note")==null){
break;
}
NodeList noteElmLst = fstelm_Note.getElementsByTagName("note");
Element noteElm = (Element)noteElmLst.item(0);
NodeList note = noteElm.getChildNodes();
String noteVal = ((Node)note.item(0)).getNodeValue().toString();
}
Any insight or suggestions are appreciated.
Thanks in advance.
I had this very same problem (using getElementsByTagName() to get "optional" nodes in an XML file), so I can tell by experience how to solve it. It turns out that getElementsByTagName does not return null when no matching nodes are found; instead, it returns a NodeList object of zero length.
As you may guess, the right way to check if a node exists in an XML file before trying to fetch its contents would be something similar to:
NodeList nl = element.getElementsByTagName("myTag");
if (nl.getLength() > 0) {
value = nl.item(0).getTextContent();
}
Make sure to specify a "default" value in case the tag is never found.
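The length check and the default value can be wrapped in one small helper. This is a sketch with hypothetical names (`OptionalElements`, `textOrDefault`, `parseRoot`); note that `getElementsByTagName` searches all descendants of the parent, not just its direct children.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class OptionalElements {

    /** Returns the text of the first descendant named tag, or defaultValue if none exists. */
    static String textOrDefault(Element parent, String tag, String defaultValue) {
        NodeList matches = parent.getElementsByTagName(tag);
        // An absent tag yields an empty list, never null.
        if (matches.getLength() == 0) {
            return defaultValue;
        }
        return matches.item(0).getTextContent();
    }

    /** Convenience parser for the usage example below. */
    static Element parseRoot(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element withPitch = parseRoot("<note><pitch>C4</pitch></note>");
        Element withRest = parseRoot("<note><rest/></note>");
        System.out.println(textOrDefault(withPitch, "pitch", "none"));
        System.out.println(textOrDefault(withRest, "pitch", "none"));
    }
}
```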
It may be that your NodeLists are not null, but are empty. Can you try changing your code like this and see what happens?
NodeList restElmLst = fstelm_Note.getElementsByTagName("rest");
if (restElmLst != null && restElmLst.getLength() > 0)
{
Element restElm = (Element)restElmLst.item(0);
...
etc. (Doublecheck syntax etc., since I'm not in front of a compiler.)
Your requirements are extremely unclear but I would very likely use the javax.xml.xpath package to parse your XML document with the XML Path Language (XPath).
Have a look at:
XML Validation and XPath Evaluation in J2SE 5.0
Parsing an XML Document with XPath
But you should try to explain the general problem you are trying to solve rather than the specific problem you're facing. By doing so, 1. you will probably get better answers and 2. you may find that the currently chosen path is not the best one.
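For the optional-element case in this thread, a sketch using the javax.xml.xpath package could look like the following. `XPathOptionalCheck` and `countNotesWith` are hypothetical names, and the child name is concatenated into the expression, so it is assumed to be a trusted, plain element name.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class XPathOptionalCheck {

    /** Counts the note elements that have at least one child element named childName. */
    static int countNotesWith(String xml, String childName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // The predicate [childName] is true only for notes that contain such a child,
        // so missing optional elements simply do not match instead of causing errors.
        Double count = (Double) xpath.evaluate(
                "count(//note[" + childName + "])", doc, XPathConstants.NUMBER);
        return count.intValue();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<part>"
                + "<note><rest/></note>"
                + "<note><pitch></pitch></note>"
                + "<note><pitch></pitch></note>"
                + "</part>";
        System.out.println(countNotesWith(xml, "rest"));
        System.out.println(countNotesWith(xml, "pitch"));
    }
}
```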
Try something like the following (C#, LINQ to XML):
bool hasCity = OrderXml.Elements("City").Any();
where OrderXml is parent element.
First you need to create the NodeList and then check its length to determine whether the element exists in the XML.
NodeList restElmLst = fstelm_Note.getElementsByTagName("rest");
if (restElmLst.getLength() > 0) {
String restVal = restElmLst.item(0).getTextContent();
}