I'm trying to use the renameNode() method of the org.w3c.dom.Document class to rename the root node of an XML document.
My code is similar to this:
xml.renameNode(Element, "http://newnamespaceURI", "NewRootNodeName");
The code does rename the root element but doesn't apply the namespace prefix. Hard-coding the namespace prefix would not work as it has to be dynamic.
Any ideas why it is not working?
Many thanks
I tried it with JDK 6:
public static void main(String[] args) throws Exception {
// Create an empty XML document
Document xml = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
// Create the root node with a namespace
Element root = xml.createElementNS("http://oldns", "doc-root");
xml.appendChild(root);
// Add two child nodes. One with the root namespace and one with another ns
root.appendChild(xml.createElementNS("http://oldns", "child-node-1"));
root.appendChild(xml.createElementNS("http://other-ns", "child-node-2"));
// Serialize the document
System.out.println(serializeXml(xml));
// Rename the root node
xml.renameNode(root, "http://new-ns", "new-root");
// Serialize the document
System.out.println(serializeXml(xml));
}
/*
* Helper function to serialize a XML document.
*/
private static String serializeXml(Document doc) throws Exception {
Transformer transformer = TransformerFactory.newInstance().newTransformer();
Source source = new DOMSource(doc.getDocumentElement());
StringWriter out = new StringWriter();
Result result = new StreamResult(out);
transformer.transform(source, result);
return out.toString();
}
The output is (formatting added by me):
<doc-root xmlns="http://oldns">
<child-node-1/>
<child-node-2 xmlns="http://other-ns"/>
</doc-root>
<new-root xmlns="http://new-ns">
<child-node-1 xmlns="http://oldns"/>
<child-node-2 xmlns="http://other-ns"/>
</new-root>
So it works as expected. The root node has a new local name and a new namespace, while the child nodes remain the same, including their namespaces.
I managed to sort this by looking up the namespace prefix like this:
String namespacePrefix = rootelement.lookupPrefix("http://newnamespaceURI");
and then using this with the renameNode method:
xml.renameNode(Element, "http://newnamespaceURI", namespacePrefix + ":" + "NewRootNodeName");
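For anyone who wants to try this in isolation, here is a minimal sketch of the lookup-then-rename approach. It assumes the prefix (here a made-up "n") is already declared on the root element. The class name, the old element name, and the null fallback are my own additions, not part of the original code.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class RenameRootWithPrefix {

    static final String NEW_NS = "http://newnamespaceURI";

    /** Renames the root element and returns its new qualified name. */
    static String renameRoot() {
        try {
            Document xml = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = xml.createElementNS("http://oldns", "OldRootNodeName");
            xml.appendChild(root);
            // Assumption: the prefix "n" is already bound to the new namespace on the root
            root.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:n", NEW_NS);

            // lookupPrefix returns null when no prefix is bound to the URI
            String prefix = root.lookupPrefix(NEW_NS);
            String qualifiedName = (prefix == null)
                    ? "NewRootNodeName"
                    : prefix + ":NewRootNodeName";

            // renameNode may return a new node, so use its return value
            Node renamed = xml.renameNode(root, NEW_NS, qualifiedName);
            return renamed.getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(renameRoot());
    }
}
```

The null check matters because lookupPrefix gives you null when the namespace is only bound as the default namespace or not bound at all, and concatenating that would produce a name starting with "null:".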
I am trying to extract an element (as a String) out of an XML document. I have tried both approaches suggested in this SO answer (a similar method is also suggested here) and they both fail to properly account for namespace prefixes that may be defined in some outer-level document.
Using the following code:
// entry point method; see examples of values for the String `s` in the question
public static String stripPayload(String s) throws Exception {
final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
final Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(s)));
final XPath xPath = XPathFactory.newInstance().newXPath();
final String xPathToGetToTheNodeWeWishToExtract = "/*[local-name()='envelope']/*[local-name()='payload']";
final Node result = (Node) xPath.evaluate(xPathToGetToTheNodeWeWishToExtract, doc, XPathConstants.NODE);
return nodeToString_A(result); // or: nodeToString_B(result)
}
public static String nodeToString_A(Node node) throws Exception {
final StringWriter buf = new StringWriter();
final Transformer xform = TransformerFactory.newInstance().newTransformer();
xform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
xform.setOutputProperty(OutputKeys.STANDALONE, "yes");
xform.transform(new DOMSource(node), new StreamResult(buf));
return(buf.toString());
}
public static String nodeToString_B(Node node) throws Exception {
final Document document = node.getOwnerDocument();
final DOMImplementationLS domImplLS = (DOMImplementationLS) document.getImplementation();
final LSSerializer serializer = domImplLS.createLSSerializer();
final String str = serializer.writeToString(node);
return str;
}
If the stripPayload method is passed either of the following strings:
<envelope><payload><a></a><b></b></payload></envelope>
or
<envelope><p:payload xmlns:p='foo'><a></a><b></b></p:payload></envelope>
… both nodeToString_A and nodeToString_B methods work. However, if I pass the following equally valid XML document where the namespace prefix is defined in an outer element:
<envelope xmlns:p='foo'><p:payload><a></a><b></b></p:payload></envelope>
… then both methods fail as they simply emit:
<p:payload><a/><b/></p:payload>
Thus, they are already producing an invalid document as the namespace prefix definition is left out.
The more complicated example below (which uses namespace prefixes in attributes):
<envelope xmlns:p='foo' xmlns:a='alpha'><p:payload a:attr='dummy'><a></a><b></b></p:payload></envelope>
… actually causes nodeToString_A to fail with an exception whereas at least nodeToString_B produces the invalid:
<p:payload a:attr="dummy"><a/><b/></p:payload>
(where again, the prefixes are not defined).
So my question is:
What is a robust way to extract and stringify an inner XML element in a way that takes care of namespace prefixes that may be defined in some outer element?
You just need to enable namespace awareness on the DocumentBuilderFactory.
public static String stripPayload(String s) throws Exception {
final DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
...
}
The output will be ...
<p:payload xmlns:p="foo"><a/><b/></p:payload>
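In case the elided part is unclear, here is a self-contained sketch of the whole method with the fix applied. It reuses the XPath and the Transformer-based serialization from the question; the class name is made up, and the checked exceptions are wrapped only to keep the example short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class StripPayloadDemo {

    static String stripPayload(String s) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // the one-line fix
            Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(s)));

            // Same XPath as in the question
            Node payload = (Node) XPathFactory.newInstance().newXPath().evaluate(
                    "/*[local-name()='envelope']/*[local-name()='payload']",
                    doc, XPathConstants.NODE);

            // Serialize only the payload element
            Transformer xform = TransformerFactory.newInstance().newTransformer();
            xform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter buf = new StringWriter();
            xform.transform(new DOMSource(payload), new StreamResult(buf));
            return buf.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stripPayload(
                "<envelope xmlns:p='foo'><p:payload><a></a><b></b></p:payload></envelope>"));
    }
}
```

With a namespace-aware parse, each node knows its namespace URI, so the serializer can re-declare the prefix on the extracted element even though the original declaration sat on the envelope.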
I am currently using the Java XPath API to extract some text from a String.
This String, however, often has HTML formatting (<b>, <em>, <sub>, etc). When I run my code, the HTML tags are stripped off. Is there any way to avoid this?
Here is a sample input:
<document>
<summary>
The <b>dog</b> jumped over the fence.
</summary>
</document>
Here is a snippet of my code:
XPathFactory factory = XPathFactory.newInstance();
XPath xPath = factory.newXPath();
InputSource source = new InputSource(new StringReader(xml));
String output = xPath.evaluate("/document/summary", source);
Here is the current output:
The dog jumped over the fence.
Here is the output I want:
The <b>dog</b> jumped over the fence.
Thanks in advance for all your help.
A simple, straightforward (but maybe not very efficient) solution:
/**
* Serializes a XML node to a string representation without XML declaration
*
* @param node The XML node
* @return The string representation
* @throws TransformerFactoryConfigurationError
* @throws TransformerException
*/
private static String node2String(Node node) throws TransformerFactoryConfigurationError, TransformerException {
final Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
final StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(node), new StreamResult(writer));
return writer.toString();
}
/**
* Serializes the inner (child) nodes of a XML element.
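* @param el The element whose child nodes are serialized
* @return The concatenated string representation of the child nodes
* @throws TransformerFactoryConfigurationError
* @throws TransformerException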
*/
private static String elementInner2String(Element el) throws TransformerFactoryConfigurationError, TransformerException {
final NodeList children = el.getChildNodes();
final StringBuilder sb = new StringBuilder();
for(int i = 0; i < children.getLength(); i++) {
final Node child = children.item(i);
sb.append(node2String(child));
}
return sb.toString();
}
Then the XPath evaluation should return the node instead of the string:
Element summaryElement = (Element) xpath.evaluate("/document/summary", doc, XPathConstants.NODE);
String output = elementInner2String(summaryElement);
The <b>dog</b> jumped over the fence
Get the children of this node. You will have two text nodes and one element node. Treat them accordingly.
The parser reads the text as XML and classifies the contents of the summary node as text, node, text. When you evaluate /document/summary as a string, the resolver returns the concatenated text of all descendants of the selected node. This gives you text + node.text + text, which is why you lose the bold tag. The input string inside summary should be either:
HTML encoded -or-
Wrapped in a CDATA tag.
Wrapping the contents in a CDATA section makes the parser treat them as text:
<document>
<summary>
<![CDATA[The <b>dog</b> jumped over the fence.]]>
</summary>
</document>
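To make the difference concrete, here is a small sketch that evaluates the same XPath against the plain input and the CDATA-wrapped input. The class and method names are made up for the example, and the whitespace around the text is trimmed.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class CdataSummaryDemo {

    /** Returns the trimmed string value of /document/summary. */
    static String summaryOf(String xml) {
        try {
            XPath xPath = XPathFactory.newInstance().newXPath();
            return xPath.evaluate("/document/summary",
                    new InputSource(new StringReader(xml))).trim();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String plain = "<document><summary>The <b>dog</b> jumped over the fence.</summary></document>";
        String cdata = "<document><summary><![CDATA[The <b>dog</b> jumped over the fence.]]></summary></document>";

        // <b> is parsed as an element, so only its text survives
        System.out.println(summaryOf(plain)); // The dog jumped over the fence.
        // inside CDATA the markup is plain character data
        System.out.println(summaryOf(cdata)); // The <b>dog</b> jumped over the fence.
    }
}
```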
The problem with your approach is that the parser treats the contents of summary as XML and requires them to be well-formed. If you had an unbalanced tag inside summary, you would get an exception.
One answer to your question would be to loop over the child nodes and collect the text data while preserving the element names. This may work for your example, but it will break if you have an unbalanced tag:
The <b>dog</b> jumped over <br> the fence
Don't use this solution to parse data between the summary tag. Instead either use CDATA or use some sort of regex to get content between the start and end points.
This is my XML:
<?xml version="1.0" encoding="UTF_8" standalone="yes"?>
<StoreMessage xmlns="http://www.xxx.com/feed">
<billingDetail>
<billingDetailId>987</billingDetailId>
<contextId>0</contextId>
<userId>
<pan>F0F8DJH348DJ</pan>
<contractSerialNumber>46446</contractSerialNumber>
</userId>
<declaredVehicleClass>A</declaredVehicleClass>
</billingDetail>
<billingDetail>
<billingDetailId>543</billingDetailId>
<contextId>0</contextId>
<userId>
<pan>F0F854534534348DJ</pan>
<contractSerialNumber>4666546446</contractSerialNumber>
</userId>
<declaredVehicleClass>C</declaredVehicleClass>
</billingDetail>
</StoreMessage>
With the JDOM parser I want to get all <billingDetail> XML nodes from it.
My code:
SAXBuilder builder = new SAXBuilder();
try {
Reader in = new StringReader(xmlAsString);
Document document = (Document)builder.build(in);
Element rootNode = document.getRootElement();
List<?> list = rootNode.getChildren("billingDetail");
XMLOutputter outp = new XMLOutputter();
outp.setFormat(Format.getCompactFormat());
for (int i = 0; i < list.size(); i++) {
Element node = (Element)list.get(i);
StringWriter sw = new StringWriter();
outp.output(node.getContent(), sw);
StringBuffer sb = sw.getBuffer();
String text = sb.toString();
xmlRecords.add(sb.toString());
}
} catch (IOException io) {
io.printStackTrace();
} catch (JDOMException jdomex) {
jdomex.printStackTrace();
}
But I never get an XML node as a string in the output, like:
<billingDetail>
<billingDetailId>987</billingDetailId>
<contextId>0</contextId>
<userId>
<pan>F0F8DJH348DJ</pan>
<contractSerialNumber>46446</contractSerialNumber>
</userId>
<declaredVehicleClass>A</declaredVehicleClass>
</billingDetail>
What am I doing wrong? How can I get this output with the JDOM parser?
EDIT
And why does it work if the XML starts with
<StoreMessage> instead of <StoreMessage xmlns="http://www.xxx.com/MediationFeed">?
How is this possible?
The problem is that there are two versions of the getChildren method:
java.util.List getChildren(java.lang.String name)
This returns a List of all the child elements nested directly (one level deep) within this element with the given local name and belonging to no namespace, returned as Element objects.
and
java.util.List getChildren(java.lang.String name, Namespace ns)
This returns a List of all the child elements nested directly (one level deep) within this element with the given local name and belonging to the given Namespace, returned as Element objects.
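The first one doesn't find your nodes because they belong to a namespace: the default namespace declared on <StoreMessage> is inherited by its children. Use the second one, for example rootNode.getChildren("billingDetail", rootNode.getNamespace()). This also explains your edit: without the xmlns declaration the elements belong to no namespace, so the first variant matches them.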
I am creating a W3C Document object using a String value. Once I created the Document object, I want to add a namespace to the root element of this document. Here's my current code:
Document document = builder.parse(new InputSource(new StringReader(xmlString)));
document.getDocumentElement().setAttributeNS("http://com", "xmlns:ns2", "Test");
document.setPrefix("ns2");
TransformerFactory tranFactory = TransformerFactory.newInstance();
Transformer aTransformer = tranFactory.newTransformer();
Source src = new DOMSource(document);
Result dest = new StreamResult(new File("c:\\xmlFileName.xml"));
aTransformer.transform(src, dest);
What I use as input:
<product>
<arg0>DDDDDD</arg0>
<arg1>DDDD</arg1>
</product>
What the output should look like:
<ns2:product xmlns:ns2="http://com">
<arg0>DDDDDD</arg0>
<arg1>DDDD</arg1>
</ns2:product>
I need to add the prefix value and namespace also to the input xml string. If I try the above code I am getting this exception:
NAMESPACE_ERR: An attempt is made to create or change an object in a way which is incorrect with regard to namespaces.
Appreciate your help!
Since there is not an easy way to rename the root element, we'll have to replace it with an element that has the correct namespace and attribute, and then copy all the original children into it. Forcing the namespace declaration is not needed because by giving the element the correct namespace (URI) and setting the prefix, the declaration will be automatic.
Replace the setAttributeNS and setPrefix calls (lines 2 and 3 of the question's code) with this:
String namespace = "http://com";
String prefix = "ns2";
// Upgrade the DOM level 1 to level 2 with the correct namespace
Element originalDocumentElement = document.getDocumentElement();
Element newDocumentElement = document.createElementNS(namespace, originalDocumentElement.getNodeName());
// Set the desired namespace and prefix
newDocumentElement.setPrefix(prefix);
// Copy all children
NodeList list = originalDocumentElement.getChildNodes();
while(list.getLength()!=0) {
newDocumentElement.appendChild(list.item(0));
}
// Replace the original element
document.replaceChild(newDocumentElement, originalDocumentElement);
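Here is a self-contained sketch of the same replace-the-root idea, in case someone wants to run it against the input from the question. It differs slightly from the snippet above: it parses namespace-aware and passes the prefixed name straight to createElementNS instead of calling setPrefix afterwards. The class and method names are made up, and attributes on the old root are not copied.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AddRootNamespaceDemo {

    static String addRootNamespace(String xml, String namespace, String prefix) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document document = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

            // Create the replacement root in the target namespace, with the prefix
            Element oldRoot = document.getDocumentElement();
            Element newRoot = document.createElementNS(namespace, prefix + ":" + oldRoot.getLocalName());

            // Move (not copy) every child; appendChild detaches it from the old root
            while (oldRoot.hasChildNodes()) {
                newRoot.appendChild(oldRoot.getFirstChild());
            }
            document.replaceChild(newRoot, oldRoot);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(addRootNamespace(
                "<product><arg0>DDDDDD</arg0><arg1>DDDD</arg1></product>", "http://com", "ns2"));
    }
}
```

The children stay in no namespace, which matches the desired output in the question, where only the root carries the ns2 prefix.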
In the original code the author tried to declare an element namespace like this:
.setAttributeNS("http://com", "xmlns:ns2", "Test");
The first parameter is the namespace of the attribute, and since this is a namespace declaration attribute, it needs to have the http://www.w3.org/2000/xmlns/ URI. The namespace being declared goes in the third parameter:
.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:ns2", "http://com");
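A quick way to convince yourself of the difference is a sketch like the one below. It tries the call from the question, checks that it is rejected with NAMESPACE_ERR, then makes the corrected call and looks the prefix up again. The class and helper names are made up for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class XmlnsAttributeDemo {

    static final String XMLNS_URI = "http://www.w3.org/2000/xmlns/";

    /** Creates a fresh document with a <product> root and returns the root. */
    static Element newRoot() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElementNS(null, "product");
            doc.appendChild(root);
            return root;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True if the call from the question fails with NAMESPACE_ERR. */
    static boolean wrongCallIsRejected() {
        try {
            newRoot().setAttributeNS("http://com", "xmlns:ns2", "Test");
            return false;
        } catch (DOMException e) {
            return e.code == DOMException.NAMESPACE_ERR;
        }
    }

    /** Declares ns2 the correct way and returns the URI the prefix resolves to. */
    static String declaredUri() {
        Element root = newRoot();
        root.setAttributeNS(XMLNS_URI, "xmlns:ns2", "http://com");
        return root.lookupNamespaceURI("ns2");
    }

    public static void main(String[] args) {
        System.out.println(wrongCallIsRejected());
        System.out.println(declaredUri());
    }
}
```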
The approach below also works for me, but it probably should not be used in performance-critical cases.
Add the namespace to the document root element as an attribute.
Transform the document to an XML string. The purpose of this step is to make the child elements in the XML string inherit the parent element's namespace.
Now the XML string has the namespace.
You can use the XML string to build a document again, or use it for JAXB unmarshalling, etc.
private static String addNamespaceToXml(InputStream in)
throws ParserConfigurationException, SAXException, IOException,
TransformerException {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
/*
* Must not be namespace aware, otherwise the generated XML string will
* have the wrong namespace
*/
// dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document document = db.parse(in);
Element documentElement = document.getDocumentElement();
// Add name space to root element as attribute
documentElement.setAttribute("xmlns", "http://you_name_space");
String xml = transformXmlNodeToXmlString(documentElement);
return xml;
}
private static String transformXmlNodeToXmlString(Node node)
throws TransformerException {
TransformerFactory transFactory = TransformerFactory.newInstance();
Transformer transformer = transFactory.newTransformer();
StringWriter buffer = new StringWriter();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.transform(new DOMSource(node), new StreamResult(buffer));
String xml = buffer.toString();
return xml;
}
Partially gleaned from here, and also from a comment above, I was able to get it to work (transforming an arbitrary DOM Node and adding a prefix to it and all its children) thus:
private String addNamespacePrefix(Document doc, Node node) throws TransformerException {
Element mainRootElement = doc.createElementNS(
"http://abc.de/x/y/z", // namespace
"my-prefix:fake-header-element" // prefix to "register" it with the DOM so we don't get exceptions later...
);
List<Element> descendants = nodeListToArrayRecurse(node.getChildNodes()); // for some reason we have to grab all these before doing the first "renameNode" ... no idea why ...
mainRootElement.appendChild(node);
doc.renameNode(node, "http://abc.de/x/y/z", "my-prefix:" + node.getNodeName());
descendants.stream().forEach(c -> doc.renameNode(c, "http://abc.de/x/y/z", "my-prefix:" + c.getNodeName()));
}
private List<Element> nodeListToArrayRecurse(NodeList entryNodes) {
List<Element> allEntries = new ArrayList<>();
for (int i = 0; i < entryNodes.getLength(); i++) {
Node child = entryNodes.item(i);
if (child.getNodeType() == Node.ELEMENT_NODE) {
allEntries.add((Element) child);
allEntries.addAll(nodeListToArrayRecurse(child.getChildNodes())); // recurse
} // ignore other [i.e. text] nodes https://stackoverflow.com/questions/14566596/loop-through-all-elements-in-xml-using-nodelist
}
return allEntries;
}
If it helps anybody. I then convert it to string, then manually remove the extra header and closing lines. What a pain, I must be doing something wrong...
This seems to be working for me, and it's much simpler than those answers provided:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
document = builder.parse(new File(filename));
document.getDocumentElement().setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:yourNamespace", "http://whatever/else");
I'm trying to convert a ResultSet to an XML file.
I've first used this example for the serialization.
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
...
DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
DOMImplementationLS impl =
(DOMImplementationLS)registry.getDOMImplementation("LS");
...
LSSerializer writer = impl.createLSSerializer();
String str = writer.writeToString(document);
After I made this work, I tried to validate my XML file and got a couple of warnings.
One about not having a doctype. So I tried another way to implement this. I came across the Transformer class. This class lets me set the encoding, doctype, etc.
The previous implementation supports automatic namespace fix-up. The following does not.
private static Document toDocument(ResultSet rs) throws Exception {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.newDocument();
URL namespaceURL = new URL("http://www.w3.org/2001/XMLSchema-instance");
String namespace = "xmlns:xsi="+namespaceURL.toString();
Element messages = doc.createElementNS(namespace, "messages");
doc.appendChild(messages);
ResultSetMetaData rsmd = rs.getMetaData();
int colCount = rsmd.getColumnCount();
String attributeValue = "true";
String attribute = "xsi:nil";
rs.beforeFirst();
while(rs.next()) {
amountOfRecords = 0;
Element message = doc.createElement("message");
messages.appendChild(message);
for(int i = 1; i <= colCount; i++) {
Object value = rs.getObject(i);
String columnName = rsmd.getColumnName(i);
Element messageNode = doc.createElement(columnName);
if(value != null) {
messageNode.appendChild(doc.createTextNode(value.toString()));
} else {
messageNode.setAttribute(attribute, attributeValue);
}
message.appendChild(messageNode);
}
amountOfRecords++;
}
logger.info("Amount of records archived: " + amountOfRecords);
TransformerFactory tff = TransformerFactory.newInstance();
Transformer tf = tff.newTransformer();
tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
tf.setOutputProperty(OutputKeys.INDENT, "yes");
BufferedWriter bf = createFile();
StreamResult sr = new StreamResult(bf);
DOMSource source = new DOMSource(doc);
tf.transform(source, sr);
return doc;
}
While I was testing the implementation above I got a TransformerException: Namespace for prefix 'xsi' has not been declared. As you can see, I've tried to add a namespace with the xsi prefix to the root element of my document. After testing this I still got the exception. What is the correct way to set namespaces and their prefixes?
Edit: Another problem I have with the first implementation is that the last element in the XML document doesn't have the last three closing tags.
The correct way to create a node in a namespace-aware document is to call createElementNS on the Document (here doc) and append the result where you need it:
doc.createElementNS("http://example/namespace", "PREFIX:aNodeName");
So you can replace "PREFIX" with your own custom prefix and replace "aNodeName" with the name of your node. To avoid each node carrying its own namespace declaration, you can define the namespaces as attributes on your root node like so:
rootNode.setAttribute("xmlns:PREFIX", "http://example/namespace");
Please be sure to set:
documentBuilderFactory.setNamespaceAware(true)
Otherwise you don't have namespaceAwareness.
Please note that setting an xmlns-prefix with setAttribute is wrong.
If you ever want to, e.g., sign your DOM, you have to use setAttributeNS:
element.setAttributeNS("http://www.w3.org/2000/xmlns/", "xmlns:PREFIX", "http://example/namespace");
You haven't added the namespace declaration in the root node; you just declared the root node in the namespace, two entirely different things. When building a DOM, you need to reference the namespace on every relevant Node. In other words, when you add your attribute, you need to define its namespace (e.g., setAttributeNS).
Side note: Although XML namespaces look like URLs, they really aren't. There's no need to use the URL class here.
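Putting the points above together, a minimal sketch of the document-building part could look like this. The ResultSet loop is left out, and the column name "subject" is made up for the example. The namespace is kept as a plain String constant, the xsi prefix is declared once on the root with setAttributeNS, and the nil attribute is set with setAttributeNS as well.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class XsiNilDemo {

    static final String XMLNS_URI = "http://www.w3.org/2000/xmlns/";
    static final String XSI_URI = "http://www.w3.org/2001/XMLSchema-instance";

    static String buildMessages() {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().newDocument();

            // Root element in no namespace, with xmlns:xsi declared once
            Element messages = doc.createElementNS(null, "messages");
            messages.setAttributeNS(XMLNS_URI, "xmlns:xsi", XSI_URI);
            doc.appendChild(messages);

            Element message = doc.createElementNS(null, "message");
            messages.appendChild(message);

            // Hypothetical column with a NULL value: mark it with xsi:nil
            Element column = doc.createElementNS(null, "subject");
            column.setAttributeNS(XSI_URI, "xsi:nil", "true");
            message.appendChild(column);

            Transformer tf = TransformerFactory.newInstance().newTransformer();
            tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            tf.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildMessages());
    }
}
```

Because the attribute node carries its namespace URI, the Transformer has what it needs to resolve the xsi prefix, which is what the plain setAttribute call in the question did not provide.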