Remove the XML header from an XML document in Java

StringWriter writer = new StringWriter();
XmlSerializer serializer = new KXmlSerializer();
serializer.setOutput(writer);
serializer.startDocument(null, null);
serializer.setFeature("http://xmlpull.org/v1/doc/features.html#indent-output", true);
// Creating XML
serializer.endDocument();
String xmlString = writer.toString();
In the above setup, is there a standard API to remove the XML header <?xml version='1.0' ?>, or would you suggest string manipulation instead:
if (xmlString.startsWith("<?xml ")) {
    xmlString = xmlString.substring(xmlString.indexOf("?>") + 2);
}
I want xmlString to hold the output without the XML header <?xml version='1.0' ?>.

Ideally you would make an API call to exclude the XML header. KXmlSerializer does not appear to support this, though (judging from a skim of its source code). If you had an org.w3c.dom.Document (or anything else you can wrap in a javax.xml.transform.Source), you could accomplish what you want this way:
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(doc), new StreamResult(writer));
Otherwise if you have to use KXmlSerializer it looks like you'll have to manipulate the output.
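If string manipulation is the route taken, a slightly more defensive version of the check from the question might look like the sketch below. The class and method names are made up for illustration, and it assumes the declaration, if present, sits at the very start of the string (no byte order mark or leading whitespace).

```java
final class XmlDeclarationStripper {

    // Returns the input without a leading <?xml ... ?> declaration.
    // Strings that do not start with a declaration are returned unchanged.
    static String stripXmlDeclaration(String xml) {
        // "<?xml" must be followed by whitespace; this keeps processing
        // instructions such as <?xml-stylesheet ...?> untouched.
        boolean hasDeclaration = xml.length() > 5
                && xml.startsWith("<?xml")
                && Character.isWhitespace(xml.charAt(5));
        if (!hasDeclaration) {
            return xml;
        }
        int end = xml.indexOf("?>");
        if (end < 0) {
            return xml; // malformed declaration: leave the input alone
        }
        // Also skip any whitespace or line breaks right after the declaration.
        int start = end + 2;
        while (start < xml.length() && Character.isWhitespace(xml.charAt(start))) {
            start++;
        }
        return xml.substring(start);
    }

    public static void main(String[] args) {
        System.out.println(stripXmlDeclaration("<?xml version='1.0' ?>\n<root/>"));
    }
}
```

Unlike the original two-line check, this variant leaves the string alone when no closing ?> is found and also trims the line break that serializers often emit after the header.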

If you use a JAXP serializer you get access to all the output properties defined in XSLT, for example omit-xml-declaration="yes". You can get one in the form of an "identity transformer", obtained by calling transformerFactory.newTransformer() with no parameters, on which you then call setOutputProperty(). For example:
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
t.setOutputProperty("omit-xml-declaration", "yes");
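When all you have is the already serialized string (as with the KXmlSerializer output in the question), one possible way to use the identity transformer is to feed the string back in through a StreamSource. This is only a sketch: the class and method names are hypothetical, and re-parsing costs an extra pass over the document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class IdentityReserializer {

    // Re-serializes an XML string through an identity transformer
    // with the XML declaration switched off.
    static String withoutDeclaration(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers do not have to handle a checked exception.
            throw new IllegalStateException("Could not re-serialize XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(withoutDeclaration("<?xml version='1.0'?><root><a>1</a></root>"));
    }
}
```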

Don't call:
serializer.startDocument(null, null);
It is what adds the XML header. You do still need to call:
serializer.endDocument();
otherwise your XML will come out as an empty String.

Related

How to prevent self-closing <tags/> in XML?

I modify an XML file using the Transformer class and its transform method. My changes are applied correctly, but the XML style changes (empty elements are written in a different way):
Original:
<a struct="b"></a>
<c></c>
After edit:
<a struct="b"/>
<c/>
I know that I can set properties with transformer.setOutputProperty(OutputKeys.KEY, value), but I did not find a suitable setting.
How can I keep the transformer from changing the output format?
XMLReader xr = new XMLFilterImpl(XMLReaderFactory.createXMLReader());
Source src = new SAXSource(xr, new InputSource(new StringReader(xmlArray[i])));
// <<modify xml>>
TransformerFactory transFactory = TransformerFactory.newInstance();
Transformer transformer = transFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION,"yes");
StringWriter buffer = new StringWriter();
transformer.transform(src, new StreamResult(buffer));
xmlArray[i] = buffer.toString();
Those forms are semantically equivalent. No conforming XML parser will care, and neither should you.
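To see the equivalence for yourself, you could parse both spellings and compare the resulting DOM trees. The sketch below uses the standard DOM isEqualNode method; the class name and the wrapping <r> root element are assumptions added so that each sample is a single well-formed document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class EmptyElementEquivalence {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // True when both strings parse to structurally equal element trees.
    static boolean sameTree(String first, String second) {
        return parseRoot(first).isEqualNode(parseRoot(second));
    }

    public static void main(String[] args) {
        String original = "<r><a struct=\"b\"></a><c></c></r>";
        String afterEdit = "<r><a struct=\"b\"/><c/></r>";
        System.out.println(sameTree(original, afterEdit));
    }
}
```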

DOM XML Public Doctype not appearing in result xml file

I have written code to generate XML files. I am stuck at defining the doctype for the XML, which should be PUBLIC. I can get a SYSTEM doctype written successfully, but not a PUBLIC one. The code below for the SYSTEM doctype works, but the same snippet for the PUBLIC doctype does not:
String xmldestpath = "C:/failed/tester.xml";
doctype2 = CreateDoctypeString();
StreamResult result = new StreamResult(new File(xmldestpath ));
try {
transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,"TEST");
transformer.transform(source, result);
// logger.debug("COMPLETED Copying xml files /....!!");
System.out.println("COMPLETED Copying xml files to bulk import....!!");
Non-working snippet. It gives no error, but no doctype appears in the resulting XML:
String xmldestpath = "C:/failed/tester.xml";
doctype2 = CreateDoctypeString();
StreamResult result = new StreamResult(new File(xmldestpath ));
try {
transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC,"TEST");
transformer.transform(source, result);
// //logger.debug("COMPLETED Copying xml files /....!!");
System.out.println("COMPLETED Copying xml files to bulk import....!!");
If you know you need/want PUBLIC, perhaps you should know that a public literal cannot exist without a system literal.
The XML specification shows:
ExternalID ::= 'SYSTEM' S SystemLiteral
| 'PUBLIC' S PubidLiteral S SystemLiteral
So it should be easy to conclude that you need to specify both in order to get it to work, as demonstrated by this MCVE:
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, "TEST1");
transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, "TEST2");
transformer.transform(new StreamSource(new StringReader("<Root></Root>")),
new StreamResult(System.out));
Output
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Root PUBLIC "TEST1" "TEST2">
<Root/>

Convert Element to XML string without <?xml version=...> prefix

I have legacy code that generates a lot of org.w3c.dom.Element objects like this:
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
Element exampleElement = doc.createElement("example");
exampleElement.appendChild(...
How can I convert exampleElement to an XML string like this? (Any additional libraries are allowed.)
<example>
...
</example>
And not like this:
<?xml version="1.0" encoding="UTF-8"?>
<example>
...
</example>
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
Writer writer = new StringWriter();
// node is the Element (or any other DOM Node) to serialize, e.g. exampleElement
transformer.transform(new DOMSource(node), new StreamResult(writer));
return writer.toString();
There are a few options:
Use an XSL Transformer with the omit-xml-declaration output property:
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty("omit-xml-declaration", "yes");
transformer.transform(new DOMSource(document), new StreamResult(stream));
Use string replacement to drop the first line, either with a regex or with indexOf("?>").
Depending on the implementation of the w3c Document in the JDK you use, it may be possible to set DOMConfiguration parameters on the document, such as canonical-form:
document.getDomConfig().setParameter("canonical-form", true);
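A related DOM Level 3 route, not listed above, is the Load and Save LSSerializer, whose DOMConfiguration has an xml-declaration parameter. The sketch below assumes the DOM implementation bundled with your JDK supports the "LS" feature; the class and helper names are illustrative only.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

final class ElementToString {

    // Serializes a DOM node (Document or Element) without the XML declaration.
    static String serialize(Node node) {
        Document owner = node.getNodeType() == Node.DOCUMENT_NODE
                ? (Document) node
                : node.getOwnerDocument();
        // Assumption: the implementation supports DOM Level 3 Load and Save.
        DOMImplementationLS ls =
                (DOMImplementationLS) owner.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        return serializer.writeToString(node);
    }

    // Builds a small <example> element similar to the one in the question.
    static Element buildExample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element example = doc.createElement("example");
            Element child = doc.createElement("child");
            child.setTextContent("text");
            example.appendChild(child);
            return example;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create a DocumentBuilder", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(buildExample()));
    }
}
```

This works on an Element that has not been attached to the document, which matches the legacy code in the question.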

Can JAXP be used to create HTML5 documents?

Are there elements in the HTML5 specification which cannot be created with an XML library such as JAXP? One example is named HTML entities, which are not defined in XML. Are there other areas which are incompatible?
JAXP apparently only works on well-formed XML. You'd need to convert the HTML to XHTML before passing it to JAXP's standard parser.
// Create Transformer
TransformerFactory tf = TransformerFactory.newInstance();
StreamSource xslt = new StreamSource(
"src/blog/jaxbsource/xslt/stylesheet.xsl");
Transformer transformer = tf.newTransformer(xslt);
// Source
JAXBContext jc = JAXBContext.newInstance(Library.class);
JAXBSource source = new JAXBSource(jc, catalog);
// Result
StreamResult result = new StreamResult(System.out);
// Transform
transformer.transform(source, result);
Source: https://dzone.com/articles/using-jaxb-xslt-produce-html
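On the output side, a JAXP Transformer can also be asked to use the XSLT "html" output method, which serializes void elements such as br without a closing tag. The following is a minimal sketch that builds a tiny DOM by hand; the class name and the page content are made up, and the exact whitespace of the result depends on the transformer implementation.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class HtmlOutputSketch {

    // Serializes a DOM tree using the XSLT "html" output method.
    static String toHtml(Node node) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.METHOD, "html");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize as HTML", e);
        }
    }

    // Builds <html><body><p>line one</p><br/></body></html> as a DOM tree.
    static Document buildPage() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            Element p = doc.createElement("p");
            p.setTextContent("line one");
            body.appendChild(p);
            body.appendChild(doc.createElement("br"));
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not build the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(buildPage()));
    }
}
```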

XML Document to String

What's the simplest way to get the String representation of an XML Document (org.w3c.dom.Document), with all nodes on a single line?
As an example, from
<root>
<a>trge</a>
<b>156</b>
</root>
(this is only a tree representation; in my code it's an org.w3c.dom.Document object, so I can't treat it as a String)
to
"<root> <a>trge</a> <b>156</b> </root>"
Thanks!
Assuming doc is your instance of org.w3c.dom.Document:
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(doc), new StreamResult(writer));
String output = writer.getBuffer().toString().replaceAll("\n|\r", "");
Use the Apache XMLSerializer. Here's an example:
http://www.informit.com/articles/article.asp?p=31349&seqNum=3&rl=1
You can check this as well:
http://www.netomatix.com/XmlFileToString.aspx
First you need to get rid of all newline characters in all your text nodes. Then you can use an identity transform to output your DOM tree. Look at the javadoc for TransformerFactory#newTransformer().
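The two steps described here could be sketched roughly as follows. The class and method names are hypothetical, and the sketch makes one assumption beyond the answer: text nodes that contain nothing but whitespace are removed entirely, so the result has no spaces between tags.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class SingleLineXml {

    // Step 1: remove line breaks from every text node below parent.
    // Text nodes left with only whitespace are dropped (an assumption).
    static void stripLineBreaks(Node parent) {
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // fetch before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE) {
                String text = child.getNodeValue().replaceAll("[\\r\\n]+", "");
                if (text.trim().isEmpty()) {
                    parent.removeChild(child);
                } else {
                    child.setNodeValue(text);
                }
            } else {
                stripLineBreaks(child);
            }
            child = next;
        }
    }

    // Step 2: write the tree with an identity transformer.
    // Note that this modifies the Document passed in.
    static String toSingleLine(Document doc) {
        try {
            stripLineBreaks(doc);
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize the document", e);
        }
    }

    // Convenience parser used only to build a sample Document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toSingleLine(parse("<root>\n<a>trge</a>\n<b>156</b>\n</root>")));
    }
}
```

Cleaning the text nodes in the DOM, rather than running replaceAll on the serialized string, avoids touching line breaks that might appear elsewhere in the output, for example inside attribute values.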
