Convert Element to XML string without <?xml version=...> prefix - java

I have legacy code that generates a lot of org.w3c.dom.Element objects like this:
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
Element exampleElement = doc.createElement("example");
exampleElement.appendChild(...
How can I convert exampleElement to an XML string like this? (Any additional libraries are allowed.)
<example>
...
</example>
And not like this:
<?xml version="1.0" encoding="UTF-8"?>
<example>
...
</example>

Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
Writer writer = new StringWriter();
transformer.transform(new DOMSource(node), new StreamResult(writer));
return writer.toString();

There are a few options:
Use an XSL Transformer and set the omit-xml-declaration output property:
Transformer transformer= TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty("omit-xml-declaration", "yes");
transformer.transform(new DOMSource(document), new StreamResult(stream));
Use string replacement to strip the first line, either with a regex or with indexOf("?>").
Depending on the implementation of the w3c Document in the JDK you use, it may be possible to set options in the document's DOMConfiguration, such as canonical-form:
document.getDomConfig().setParameter("canonical-form",true);
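As a minimal sketch of the first option applied to a bare Element rather than a whole Document: the class name ElementToString, the helper names and the sample child element are made up for illustration, and checked exceptions are wrapped only to keep the example short.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class; not part of any library.
final class ElementToString {

    /** Serializes a DOM node (Element or Document) without the XML declaration. */
    static String toXml(Node node) {
        try {
            // newTransformer() with no arguments gives an identity transformer.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Could not serialize node", e);
        }
    }

    /** Builds a small sample element; the element names are illustrative only. */
    static Element sampleElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element example = doc.createElement("example");
            Element child = doc.createElement("child");
            child.setTextContent("value");
            example.appendChild(child);
            return example;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(sampleElement()));
    }
}
```

Passing the Element itself to DOMSource, instead of the owning Document, serializes just that subtree.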

Related

Pre-added Whitespaces in XML on first line only

I am facing an issue while generating XML through Java code. I print the XML directly onto a web page to present the stats. The first line of the generated XML always has some spaces at the start, which makes the XML look weird. How do I remove them? I tried removing the XML declaration line, but whatever the first line is, it still has leading whitespace.
Please let me know how to remove it.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<headNode>
<statsHead1>
<totalCount>0.05</totalCount>
</statsHead1>
<statsHead2>
<statsSubHead1>
<count1>0</count1>
<count2>0.0</count2>
</statsSubHead1>
<statsSubHead2>
<count1>0</count1>
<count2>0.0</count2>
</statsSubHead2>
<totalcount1>0</totalcount1>
<totalcount2>0.0</totalcount2>
</statsHead2>
</headNode>
Code that I use is
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
DOMSource source = new DOMSource(doc);
StringWriter outWriter = new StringWriter();
Result result = new StreamResult(outWriter);
transformer.transform(source, result);
return outWriter.toString();

how to print double quotes in xml attribute value

Hi, I am generating an XML file using the javax.xml parsers, and I am able to generate the file. But in my attribute value I am getting &quot; instead of a double quote.
How do I print double quotes in an attribute value? Below is my code:
Document doc = docBuilder.newDocument();
Element rootElement = doc.createElement("elements");
doc.appendChild(rootElement);
rootElement.setAttribute("area", "area");
rootElement.setAttribute("page", "pagename");
//element
Element element = doc.createElement("element");
rootElement.appendChild(element);
element.setAttribute("key", "key");
element.setAttribute("id", "id");
element.setAttribute("path", "//*[#id=\"email\"]");
}
// write the content into xml file
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File(ApplicationContext.getPath()+File.separator+"test.xml"));
// Output to console for testing
// StreamResult result = new StreamResult(System.out);
transformer.transform(source, result);
Output :
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<elements area="area" page="pagename">
<element id="id" key="key" path="//*[#id=&quot;email&quot;]"/>
</elements>
Expected output:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<elements area="area" page="pagename">
<element id="id" key="key" path="//*[#id="email"]"/>
</elements>
Thanks in advance.
The output you are trying to produce is not well-formed XML, and no XML parser will accept it. If you want to produce stuff that isn't XML then you can do so, of course, but XML-aware tools will try very hard to prevent it.
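To illustrate the point, here is a small sketch showing that the escaped form is only how the value is written to the file: parsing the serialized XML again gives back the attribute with its literal double quotes. The class and method names are hypothetical, and the attribute value is the one from the question.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of any library.
final class AttributeQuoteRoundTrip {

    // Attribute value taken from the question.
    static final String PATH_VALUE = "//*[#id=\"email\"]";

    /** Builds a document with one attribute whose value contains double quotes. */
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element element = doc.createElement("element");
            element.setAttribute("path", PATH_VALUE);
            doc.appendChild(element);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build document", e);
        }
    }

    /** Serializes the document with an identity transformer. */
    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    /** Parses XML text back into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = serialize(sampleDocument());
        System.out.println(xml); // the quotes appear escaped in the serialized text
        String value = parse(xml).getDocumentElement().getAttribute("path");
        System.out.println(value); // the parsed value has plain double quotes again
    }
}
```

Any XML-aware consumer of the file sees the unescaped value, so the escaping usually needs no workaround.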

How to remove standalone attribute declaration in xml document?

I am currently creating an XML document using Java and then transforming it into a String. The document is created as follows:
DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
Document doc = docBuilder.newDocument();
doc.setXmlVersion("1.0");
For transforming the document into String, I include the following declaration:
TransformerFactory transfac = TransformerFactory.newInstance();
Transformer trans = transfac.newTransformer();
trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
trans.setOutputProperty(OutputKeys.VERSION, "1.0");
trans.setOutputProperty(OutputKeys.ENCODING,"UTF-8");
trans.setOutputProperty(OutputKeys.INDENT, "yes");
And then I do the transformation:
StringWriter sw = new StringWriter();
StreamResult result = new StreamResult(sw);
DOMSource source = new DOMSource(doc);
trans.transform(source, result);
String xmlString = sw.toString();
The problem is that in the XML Declaration attributes, the standalone attribute is included and I don't want that, but I want the version and encoding attributes to appear:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
Is there any property where that could be specified?
From what I've read, you can do this by calling the method below on the Document before creating the DOMSource:
doc.setXmlStandalone(true); //before creating the DOMSource
If you call setXmlStandalone(false) on the Document, the output will always contain standalone="no", no matter what you set (if anything) on the Transformer, so you cannot control whether the attribute appears. So call setXmlStandalone(true) on the Document. If you do want the attribute in the output, set it on the Transformer through OutputKeys with "yes" or "no" as needed.
Read the thread in this forum
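A minimal sketch for trying this out, assuming the Transformer implementation bundled with the JDK (other implementations may treat the flag differently). The class name StandaloneDeclaration and the root element are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical demo class; not part of any library.
final class StandaloneDeclaration {

    /** Serializes a one-element document after setting the xmlStandalone flag. */
    static String serialize(boolean xmlStandalone) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("root"));
            doc.setXmlVersion("1.0");
            // Must be set before the DOMSource is created and transformed.
            doc.setXmlStandalone(xmlStandalone);

            Transformer trans = TransformerFactory.newInstance().newTransformer();
            trans.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            trans.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

            StringWriter sw = new StringWriter();
            trans.transform(new DOMSource(doc), new StreamResult(sw));
            return sw.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        // Compare the declarations produced for the two flag values.
        System.out.println(serialize(false));
        System.out.println(serialize(true));
    }
}
```

Comparing the two printed declarations shows whether the standalone attribute disappears in your environment.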

Remove the XML header from an XML in Java

StringWriter writer = new StringWriter();
XmlSerializer serializer = new KXmlSerializer();
serializer.setOutput(writer);
serializer.startDocument(null, null);
serializer.setFeature("http://xmlpull.org/v1/doc/features.html#indent-output", true);
// Creating XML
serializer.endDocument();
String xmlString = writer.toString();
In the above environment, are there any standard APIs available to remove the XML header <?xml version='1.0' ?>, or do you suggest going via string manipulation:
if (s.startsWith("<?xml ")) {
s = s.substring(s.indexOf("?>") + 2);
}
I want the output in xmlString without the XML header info <?xml version='1.0' ?>.
Ideally you can make an API call to exclude the XML header if desired. It doesn't appear that KXmlSerializer supports this though (skimming through the code here). If you had an org.w3c.dom.Document (or actually any other implementation of javax.xml.transform.Source) you could accomplish what you want this way:
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(doc), new StreamResult(writer));
Otherwise if you have to use KXmlSerializer it looks like you'll have to manipulate the output.
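If string manipulation is the route taken, a slightly more defensive version of the snippet from the question could look like the sketch below. It tolerates leading whitespace, leaves other processing instructions such as <?xml-stylesheet ...?> alone, and drops the line break after the declaration. The class and method names are hypothetical.

```java
// Hypothetical helper class; not part of any library.
final class XmlDeclarationStripper {

    /** Returns the XML text without a leading XML declaration, if one is present. */
    static String strip(String xml) {
        // Skip any whitespace before the declaration.
        int start = 0;
        while (start < xml.length() && Character.isWhitespace(xml.charAt(start))) {
            start++;
        }
        if (!xml.startsWith("<?xml", start)) {
            return xml; // no declaration at all
        }
        // "<?xml" must be followed by whitespace; otherwise it is some other
        // processing instruction such as <?xml-stylesheet ...?>.
        int afterTarget = start + "<?xml".length();
        if (afterTarget >= xml.length() || !Character.isWhitespace(xml.charAt(afterTarget))) {
            return xml;
        }
        int end = xml.indexOf("?>", afterTarget);
        if (end < 0) {
            return xml; // unterminated declaration: leave the input untouched
        }
        // Also drop whitespace (typically a line break) right after the declaration.
        int rest = end + 2;
        while (rest < xml.length() && Character.isWhitespace(xml.charAt(rest))) {
            rest++;
        }
        return xml.substring(rest);
    }

    public static void main(String[] args) {
        System.out.println(strip("<?xml version='1.0' ?>\n<root/>"));
    }
}
```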
If you use a JAXP serializer you get access to all the output properties defined in XSLT, for example omit-xml-declaration="yes". You can get this in the form of an "identity transformer", obtained by calling transformerFactory.newTransformer() with no parameters, on which you then call setOutputProperty(). Another example:
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
t.setOutputProperty("omit-xml-declaration", "yes");
Don't call:
serializer.startDocument();
It adds the XML header. You do still need to call:
serializer.endDocument();
otherwise your XML will come out as a blank String.

XML Document to String

What's the simplest way to get the String representation of an XML Document (org.w3c.dom.Document)? That is, all nodes should be on a single line.
As an example, from
<root>
<a>trge</a>
<b>156</b>
</root>
(this is only a tree representation, in my code it's a org.w3c.dom.Document object, so I can't treat it as a String)
to
"<root> <a>trge</a> <b>156</b> </root>"
Thanks!
Assuming doc is your instance of org.w3c.dom.Document:
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(doc), new StreamResult(writer));
String output = writer.getBuffer().toString().replaceAll("\n|\r", "");
Use the Apache XMLSerializer. Here's an example:
http://www.informit.com/articles/article.asp?p=31349&seqNum=3&rl=1
You can check this as well:
http://www.netomatix.com/XmlFileToString.aspx
First you need to get rid of all newline characters in all your text nodes. Then you can use an identity transform to output your DOM tree. Look at the javadoc for TransformerFactory#newTransformer().
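The two steps above can be sketched as follows. This is only one way to do it: the class name SingleLineXml and its helpers are hypothetical, the sample markup is the one from the question, and note that the document passed in is modified in place.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of any library.
final class SingleLineXml {

    /** Removes line breaks from all text nodes, then serializes with an identity transform. */
    static String toSingleLine(Document doc) {
        try {
            removeLineBreaks(doc); // step 1: mutates the document in place
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer)); // step 2
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    /** Recursively strips \r and \n from the data of every text node below the given node. */
    private static void removeLineBreaks(Node node) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            node.setNodeValue(node.getNodeValue().replaceAll("[\\r\\n]+", ""));
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            removeLineBreaks(children.item(i));
        }
    }

    /** Parses XML text into a DOM document; used here only to build sample input. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<root>\n<a>trge</a>\n<b>156</b>\n</root>");
        System.out.println(toSingleLine(doc));
    }
}
```

Unlike calling replaceAll on the serialized string, this only touches text nodes, so the markup itself is never edited as text.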
