Pre-added Whitespaces in XML on first line only - java

I am facing an issue while generating XML through Java code. I print the XML directly onto a web page to present the stats. The first line of the generated XML always has some spaces at the start, which makes the XML look odd. How do I remove them? I tried removing the XML declaration line, but whatever the first line is, it still has leading whitespace.
Please let me know how to remove it.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<headNode>
<statsHead1>
<totalCount>0.05</totalCount>
</statsHead1>
<statsHead2>
<statsSubHead1>
<count1>0</count1>
<count2>0.0</count2>
</statsSubHead1>
<statsSubHead2>
<count1>0</count1>
<count2>0.0</count2>
</statsSubHead2>
<totalcount1>0</totalcount1>
<totalcount2>0.0</totalcount2>
</statsHead2>
</headNode>
Code that I use is
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
DOMSource source = new DOMSource(doc);
StringWriter outWriter = new StringWriter();
Result result = new StreamResult(outWriter);
transformer.transform(source, result);
return outWriter.toString();

Related

Java XML api removes whitespace before self closing tag

I have an XML file which contains only one element:
<Message>
<Location URI ="XXX:XXX:XXX" />
</Message>
I want to read and print the same XML using Java, but when printed it loses the whitespace before />
<Message>
<Location URI ="XXX:XXX:XXX"/>
</Message>
I have tried different configurations of DocumentBuilderFactory and Transformer, but the result is the same.
Any ideas?
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document requestDocument = builder.parse(this.getClass().getResourceAsStream("/message-template.xml"));
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
DOMSource domSource = new DOMSource(requestDocument);
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
transformer.transform(domSource, result);
System.out.println(writer.toString());
Here :
DOMSource domSource = new DOMSource(requestDocument);
...
transformer.transform(domSource, result);
You transform a DOMSource into a StreamResult. A DOMSource is not a textual representation of the XML file but a Document Object Model (DOM) tree.
So whitespace that is not needed to represent the content of the tree is not kept in the DOMSource:
URI ="XXX:XXX:XXX" />
|-------> not preserved
Most APIs that represent and manipulate XML work this way.
If you need to keep insignificant whitespace in your result, you will probably have to parse the XML file yourself.
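The effect described above can be reproduced with a small round trip. This is only a sketch: the class name WhitespaceRoundTrip and the helper roundTrip are made up for illustration, and the exact serialized form depends on the Transformer implementation in use (the one bundled with the JDK is assumed here).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class WhitespaceRoundTrip {

    // Parses the XML text into a DOM tree and serializes the tree back to a string.
    // Whitespace inside the tags is not part of the tree, so it cannot come back out.
    static String roundTrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("XML round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Same markup as in the question: a space before '=' and before '/>'.
        String input = "<Message><Location URI =\"XXX:XXX:XXX\" /></Message>";
        System.out.println(roundTrip(input));
    }
}
```

The attribute name and value survive because they are part of the tree; the spaces around them inside the tag are regenerated by the serializer and so do not.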

DOM XML Public Doctype not appearing in result xml file

I have written code to generate XML files. I am stuck at defining the doctype for the XML, as it should be PUBLIC. I am able to get a SYSTEM doctype successfully, but somehow I cannot get a PUBLIC doctype written to the XML. The code below for the SYSTEM doctype works, but the same snippet for the PUBLIC doctype does not:
String xmldestpath = "C:/failed/tester.xml";
doctype2 = CreateDoctypeString();
StreamResult result = new StreamResult(new File(xmldestpath ));
try {
transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,"TEST");
transformer.transform(source, result);
// logger.debug("COMPLETED Copying xml files /....!!");
System.out.println("COMPLETED Copying xml files to bulk import....!!");
The snippet that does not work is below. It does not raise an error, but no doctype appears in the resulting XML:
String xmldestpath = "C:/failed/tester.xml";
doctype2 = CreateDoctypeString();
StreamResult result = new StreamResult(new File(xmldestpath ));
try {
transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC,"TEST");
transformer.transform(source, result);
// //logger.debug("COMPLETED Copying xml files /....!!");
System.out.println("COMPLETED Copying xml files to bulk import....!!");
If you know you need/want PUBLIC, perhaps you should know that a public literal cannot exist without a system literal.
The XML specification shows:
ExternalID ::= 'SYSTEM' S SystemLiteral
| 'PUBLIC' S PubidLiteral S SystemLiteral
So it should be easy to conclude that you need to specify both in order to get it to work, as demonstrated by this MCVE:
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, "TEST1");
transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "TEST2");
transformer.transform(new StreamSource(new StringReader("<Root></Root>")),
new StreamResult(System.out));
Output
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Root PUBLIC "TEST1" "TEST2">
<Root/>

Convert Element to XML string without <?xml version=...> prefix

I have legacy code with a lot of org.w3c.dom.Element generation like this:
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
Element exampleElement = doc.createElement("example");
exampleElement.appendChild(...
How can I convert exampleElement to an XML string like this? (Any additional libraries are allowed.)
<example>
...
</example>
Not that
<?xml version="1.0" encoding="UTF-8"?>
<example>
...
</example>
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
Writer writer = new StringWriter();
transformer.transform(new DOMSource(node), new StreamResult(writer));
return writer.toString();
There are some options:
Use an XSL Transformer with the output property omit-xml-declaration:
Transformer transformer= TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty("omit-xml-declaration", "yes");
transformer.transform(new DOMSource(document), new StreamResult(stream));
Use string replacement to remove the first line, with a regex or with indexOf("?>").
Depending on the implementation of the W3C Document in the JDK you use, it may be possible to set options in the document's DOMConfiguration, such as canonical-form:
document.getDomConfig().setParameter("canonical-form",true);
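The first option can be sketched end to end as follows. The class name ElementToString, the helpers toXml and sampleElement, and the child element with its text are hypothetical stand-ins for the elided appendChild(...) calls in the question; only the omit-xml-declaration property comes from the answers above.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class ElementToString {

    // Serializes any DOM node (here an Element) without the <?xml ...?> declaration.
    static String toXml(Node node) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    // Builds an illustrative element; the child and its text are assumptions.
    static Element sampleElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element example = doc.createElement("example");
            Element child = doc.createElement("child");
            child.setTextContent("text");
            example.appendChild(child);
            return example;
        } catch (Exception e) {
            throw new IllegalStateException("Could not create document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(sampleElement()));
    }
}
```

Passing the Element itself to DOMSource, rather than the owning Document, serializes only that subtree.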

Java Transformer outputs &lt; and &gt; instead of <>

I am editing an XML file in Java with a Transformer by adding more nodes. The old XML is unchanged, but the new XML nodes have &lt; and &gt; instead of < and > and are all on the same line. How do I get < and > instead of &lt; and &gt;, and how do I get line breaks after the new nodes? I have already read several similar threads but wasn't able to get the right formatting. Here is the relevant portion of the code:
// Read the XML file
DocumentBuilderFactory dbf= DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc=db.parse(xmlFile.getAbsoluteFile());
Element root = doc.getDocumentElement();
// create a new node
Element newNode = doc.createElement("Item");
// add it to the root node
root.appendChild(newNode);
// create a new attribute
Attr attribute = doc.createAttribute("Name");
// assign the attribute a value
attribute.setValue("Test...");
// add the attribute to the new node
newNode.setAttributeNode(attribute);
// transform the XML
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
StreamResult result = new StreamResult(new FileWriter(xmlFile.getAbsoluteFile()));
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
Thanks
To replace &gt; and the other escaped entities you can use StringEscapeUtils from org.apache.commons.lang3:
StringEscapeUtils.unescapeXml(resp.toString());
After that you can use the following property of transformer for having line breaks in your xml:
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
based on a question posted here:
public void writeToOutputStream(Document fDoc, OutputStream out) throws Exception {
fDoc.setXmlStandalone(true);
DOMSource docSource = new DOMSource(fDoc);
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.INDENT, "no");
transformer.transform(docSource, new StreamResult(out));
}
produces:
<?xml version="1.0" encoding="UTF-8"?>
The differences I see:
fDoc.setXmlStandalone(true);
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
Try passing an OutputStream instead of a Writer to StreamResult.
StreamResult result = new StreamResult(new FileOutputStream(xmlFile.getAbsoluteFile()));
The StreamResult documentation also suggests using a stream rather than a writer.
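StreamResult can wrap a java.io.OutputStream, which lets the transformer control the output encoding itself. A minimal sketch of that approach is below; the class name WriteXmlToFile, the helper names, the temporary file, and the root element name Items are assumptions for illustration, while the Item element and its Name attribute follow the question.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class WriteXmlToFile {

    // Writes the document to a temporary file through an OutputStream and reads it back.
    static String roundTripThroughFile(Document doc) {
        try {
            Path target = Files.createTempFile("items", ".xml");
            // try-with-resources closes the stream even if the transform fails.
            try (OutputStream out = Files.newOutputStream(target)) {
                Transformer transformer = TransformerFactory.newInstance().newTransformer();
                transformer.setOutputProperty(OutputKeys.INDENT, "yes");
                transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
                transformer.transform(new DOMSource(doc), new StreamResult(out));
            }
            String xml = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
            Files.delete(target);
            return xml;
        } catch (IOException | TransformerException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
    }

    // Builds a small document with one Item element, created through the DOM API.
    static Document sampleDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("Items"); // hypothetical root name
            doc.appendChild(root);
            Element item = doc.createElement("Item");
            item.setAttribute("Name", "Test...");
            root.appendChild(item);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripThroughFile(sampleDocument()));
    }
}
```

Elements added with createElement and appendChild are written as real tags; only text content gets its angle brackets escaped on output.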

Remove the XML header from an XML in Java

StringWriter writer = new StringWriter();
XmlSerializer serializer = new KXmlSerializer();
serializer.setOutput(writer);
serializer.startDocument(null, null);
serializer.setFeature("http://xmlpull.org/v1/doc/features.html#indent-output", true);
// Creating XML
serializer.endDocument();
String xmlString = writer.toString();
Given the code above, is there a standard API available to remove the XML header <?xml version='1.0' ?>, or would you suggest string manipulation such as:
if (s.startsWith("<?xml ")) {
s = s.substring(s.indexOf("?>") + 2);
}
I want the output in xmlString without the XML header <?xml version='1.0' ?>.
Ideally you can make an API call to exclude the XML header if desired. It doesn't appear that KXmlSerializer supports this though (skimming through the code here). If you had a org.w3c.dom.Document (or actually any other implementation of javax.xml.transform.Source) you could accomplish what you want this way:
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(doc), new StreamResult(writer));
Otherwise if you have to use KXmlSerializer it looks like you'll have to manipulate the output.
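If the output has to be manipulated, a sketch of the string-based fallback might look like this. It follows the startsWith/indexOf idea from the question; the class name XmlDeclarationStripper and the method stripDeclaration are hypothetical, and the extra step that drops whitespace after the declaration is an addition, not something the question does.

```java
public class XmlDeclarationStripper {

    // Removes a leading <?xml ...?> declaration, if present, plus any whitespace after it.
    static String stripDeclaration(String xml) {
        if (xml.startsWith("<?xml ")) {
            int end = xml.indexOf("?>");
            if (end >= 0) {
                // Skip past "?>" and drop the line break that usually follows it.
                return xml.substring(end + 2).replaceFirst("^\\s+", "");
            }
        }
        // No declaration at the start: return the input unchanged.
        return xml;
    }

    public static void main(String[] args) {
        System.out.println(stripDeclaration("<?xml version='1.0' ?>\n<root/>")); // prints <root/>
    }
}
```

This only handles a declaration at the very start of the string; input with a byte order mark or leading whitespace before the declaration would pass through unchanged.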
If you use a JAXP serializer, you get access to all the output properties defined in XSLT, for example omit-xml-declaration="yes". You can get this in the form of an "identity transformer", obtained by calling transformerFactory.newTransformer() with no parameters, on which you then call setOutputProperty(). Another example:
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
t.setOutputProperty("omit-xml-declaration", "yes");
Don't make the call to:
serializer.startDocument(null, null);
It adds the XML header. You still need to call:
serializer.endDocument();
otherwise your XML will be created as a blank String.
