I'm trying to output a CDATA section in the result of XSLT using Xalan 2.7.1. I have applied this XSL to the XML in a tool and the result contains CDATA. In the method below, no CDATA is in the result and no exception is thrown. I feel like I'm missing something here.
test.xml
<?xml version="1.0" encoding="UTF-8"?>
<parentelem>
<childelem>Test text</childelem>
</parentelem>
test.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" cdata-section-elements="newchildelem" />
<xsl:template match="/">
<parentelem>
<newchildelem><xsl:value-of select="/parentelem/childelem" /></newchildelem>
</parentelem>
</xsl:template>
</xsl:stylesheet>
Transform.java
import java.io.FileReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXResult;
import javax.xml.transform.stax.StAXSource;
public class Transform {
public static void main (String[] args){
try {
XMLStreamReader xmlReader = XMLInputFactory.newInstance().createXMLStreamReader(
new FileReader("test.xml"));
XMLStreamReader xslReader = XMLInputFactory.newInstance().createXMLStreamReader(
new FileReader("test.xsl"));
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Source xslSource = new StAXSource(xslReader);
Source xmlSource = new StAXSource(xmlReader);
Transformer transf = transformerFactory.newTransformer(xslSource);
StringWriter xmlString = new StringWriter();
XMLStreamWriter xmlWriter = XMLOutputFactory.newInstance().createXMLStreamWriter(
xmlString);
Result transformedXml = new StAXResult(xmlWriter);
transf.transform(xmlSource, transformedXml);
xmlWriter.flush();
System.out.println(xmlString.toString());
} catch (Exception e) {
e.printStackTrace();
}
}
}
console output
<?xml version="1.0" encoding="UTF-8"?><parentelem><newchildelem>Test text</newchildelem></parentelem>
Are you saying you want to output the CDATA as part of the element?
<newchildelem><xsl:value-of select="/parentelem/childelem" /></newchildelem>
with
<newchildelem><xsl:text><![CDATA[
</xsl:text><xsl:value-of select="/parentelem/childelem" /><xsl:text>]]></xsl:text></newchildelem>
or some other form, but with the escaped characters to omit
<newchildelem><![CDATA[Test text]]></newchildelem>
or am I misunderstanding the question perhaps?
It works for me, with Xalan 2.7.1, not sure why it doesn't work for you.
I simplified the code fragment, but I don't think there's any functional difference, but try it anyway:
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.apache.xalan.Version;
public class Transform {
public static void main(String[] args) throws Exception {
System.out.println(Version.getVersion());
Source xslSource = new StreamSource(Transform.class.getResourceAsStream("test.xsl"));
Source xmlSource = new StreamSource(Transform.class.getResourceAsStream("test.xml"));
Transformer transf = TransformerFactory.newInstance().newTransformer(xslSource);
StreamResult transformedXml = new StreamResult(System.out);
transf.transform(xmlSource, transformedXml);
}
}
Output is:
Xalan Java 2.4.1
<?xml version="1.0" encoding="UTF-8"?>
<parentelem><newchildelem><![CDATA[Test text]]></newchildelem></parentelem>
What is odd is that Xalan's Version.getVersion() returns 2.4.1, not 2.7.1, and I'm definitely using 2.7.1 here.
1) I guess it turns out i haven't been using Xalan 2.7.1. The code in skaffman's answer made me think to check Version.getVersion() and the signature is "XL TXE Java 1.0.7". This appears to be the default when using IBM java [IBM J9 VM (build 2.4, J2RE 1.6.0)].
2) I switched from using StAXSource and StAXResult to using StreamSource and StreamResult and it works fine (like in skaffman's answer). Specifically the change from StAXResult to StreamResult is what worked. Using StAXSource with StreamResult works too.
Related
I am using sax to create an xml file, but when I run the program, the following error occurs:
[Fatal Error] :-1:-1: Premature end of file.
org.xml.sax.SAXParseException; Premature end of file.
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:205)
I checked the files written, but I didn't see any problems.The following is my code and the xml file of the operation.
1. Code
The main logic of the following code is: create an XML node, and then append this node to the old file.
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
public class XmlWriterByDom {
private static final XMLConfigUtils xmlConfig = XMLConfigUtils.getInstance();
public void xmlInsert(Map<String, String> xmlNode, String xmlPath) {
Document doc = xmlConfig.getDocument(xmlPath);
Text nodeValue;
Element root = doc.getDocumentElement();
Element food = doc.createElement(XmlTag.FOOD);
Element name = doc.createElement(XmlTag.NAME);
Element price = doc.createElement(XmlTag.PRICE);
Element desc = doc.createElement(XmlTag.DESC);
nodeValue = doc.createTextNode(xmlNode.get(XmlTag.FOOD));
name.appendChild(nodeValue);
food.appendChild(name);
nodeValue = doc.createTextNode(xmlNode.get(XmlTag.PRICE));
price.appendChild(nodeValue);
food.appendChild(price);
nodeValue = doc.createTextNode(xmlNode.get(XmlTag.DESC));
desc.appendChild(nodeValue);
food.appendChild(desc);
root.appendChild(food);
try {
xmlPath = Objects.requireNonNull( this.getClass().getClassLoader().getResource(xmlPath)).getPath();
TransformerFactory transformer = TransformerFactory.newInstance();
Transformer trans = transformer.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File(xmlPath).toURI().getPath());
trans.transform(source, result);
} catch (TransformerException e) {
e.printStackTrace();
}
}
public static void main(String[] args) {
Map<String, String> xmlNode = new HashMap<>(3);
xmlNode.put("name", "tomato");
xmlNode.put("price", "$10");
xmlNode.put("description", "dishes");
new XmlWriterByDom().xmlInsert(xmlNode, "xml/henan-dishes.xml");
}
}
2. XML
<?xml version="1.0" encoding="UTF-8" ?>
<menu>
<food>
<name>1</name>
<price>18$</price>
<description>2</description>
</food>
<food>
<name>1</name>
<price>59$</price>
<description>2</description>
</food>
</menu>
When I try to delete the compiled folder with the target name and re-execute, the result is correct. May be really a problem with the file system!
Project structure
I am trying to transform the following XML
<PHONEBOOK>
<PERSON>
<NAME>Ren1</NAME>
<EMAIL>ren1#gmail.com</EMAIL>
<TELEPHONE>999-999-9999</TELEPHONE>
<WEB>www.ren1.com</WEB>
</PERSON>
<PERSON>
<NAME>Ren2</NAME>
<EMAIL>ren2#gmail.com</EMAIL>
<TELEPHONE>999-999-9999</TELEPHONE>
<WEB>www.ren2.com</WEB>
</PERSON>
<PERSON>
<NAME>Ren3</NAME>
<EMAIL>ren3#gmail.com</EMAIL>
<TELEPHONE>999-999-9999</TELEPHONE>
<WEB>www.ren3.com</WEB>
</PERSON>
</PHONEBOOK>
to
<Names><Name>Ren1</Name><Name>Ren2</Name><Name>Ren3</Name></Names>
using DOMSource, DOMResult and XSLT.
XSLT used is as follows
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output omit-xml-declaration="yes" method="xml"></xsl:output>
<xsl:template match="/">
<Names>
<xsl:for-each select="PHONEBOOK/PERSON">
<Name>
<xsl:value-of select="NAME" />
</Name>
</xsl:for-each>
</Names>
Java Code used for transformation:
package test1;
import java.io.IOException;
import java.io.StringWriter;
import java.io.ObjectInputStream.GetField;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;
public class Test2 {
public static void main(String[] args) throws TransformerException,
ParserConfigurationException, SAXException, IOException {
// TODO Auto-generated method stub
//Stylesheet
StreamSource stylesource = new StreamSource(
"src/test1/transform_stylesheet1.xsl");
DocumentBuilderFactory docbFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder dBuilder = docbFactory.newDocumentBuilder();
//source XML
Document sourceDoc = dBuilder.parse("src/test1/Sample1.xml");
DOMSource source = new DOMSource(sourceDoc);
TransformerFactory transformerFactory = TransformerFactory
.newInstance();
Transformer transformer = transformerFactory
.newTransformer(stylesource);
Document document = dBuilder.newDocument();
DOMResult result = new DOMResult(document);
transformer.transform(source, result);
Node resultDoc = ((Document) result.getNode()).getDocumentElement();
System.out.println(resultDoc.getChildNodes().getLength());
// print the result
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(resultDoc), new StreamResult(writer));
String str = writer.toString();
System.out.println(str);
}
}
Output of the above is as follows:
3 <Names/>
but i expect,
3
<Names><Name>Ren1</Name><Name>Ren2</Name><Name>Ren3</Name></Names>
i debugged the code and found that 'resultDoc' has the content which i expect. Am i missing something while printing the result?
Your problem is that you're using the same transformer for the stylesheet processing and the output. That means, the stylesheet is applied again, but this time to the <Names><Name>Ren1</Name>...</Names> xml. You can imagine that this doesn't give the results you want.
Change your code to:
// print the result
StringWriter writer = new StringWriter();
Transformer transformer2 = transformerFactory.newTransformer();
transformer2.transform(new DOMSource(resultDoc), new StreamResult(writer));
String str = writer.toString();
System.out.println(str);
and it should work.
As #Abel mentions, you can also do the stylesheet processing and the to String in one go:
StringWriter writer = new StringWriter();
transformer.transform(source, new StreamResult(writer));
String str = writer.toString();
System.out.println(str);
You don't need the DOMResult and DOMSource variables then.
I am using XMLStreamReader to achieve my goal(splitting xml file). It looks good, but still does not give the desired result. My aim is to split every node "nextTag" from an input file:
<?xml version="1.0" encoding="UTF-8"?>
<firstTag>
<nextTag>1</nextTag>
<nextTag>2</nextTag>
</firstTag>
The outcome should look like this:
<?xml version="1.0" encoding="UTF-8"?><nextTag>1</nextTag>
<?xml version="1.0" encoding="UTF-8"?><nextTag>2</nextTag>
Referring to Split 1GB Xml file using Java I achieved my goal with this code:
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;
public class Demo4 {
public static void main(String[] args) throws Exception {
InputStream inputStream = new FileInputStream("input.xml");
BufferedReader in = new BufferedReader(new InputStreamReader(inputStream));
XMLInputFactory factory = XMLInputFactory.newInstance();
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
XMLStreamReader streamReader = factory.createXMLStreamReader(in);
while (streamReader.hasNext()) {
streamReader.next();
if (streamReader.getEventType() == XMLStreamReader.START_ELEMENT
&& "nextTag".equals(streamReader.getLocalName())) {
StringWriter writer = new StringWriter();
t.transform(new StAXSource(streamReader), new StreamResult(
writer));
String output = writer.toString();
System.out.println(output);
}
}
}
}
Actually very simple. But, my input file is in form from a single line:
<?xml version="1.0" encoding="UTF-8"?><firstTag><nextTag>1</nextTag><nextTag>2</nextTag></firstTag>
My Java code does not produce the desired output anymore, instead just this result:
<?xml version="1.0" encoding="UTF-8"?><nextTag>1</nextTag>
After spending hours, I am pretty sure to already find out the reason:
t.transform(new StAXSource(streamReader), new StreamResult(writer));
It is because, after the transform method being executed, the cursor will automatically moved forward to the next event. And in the code, I have this fraction:
while (streamReader.hasNext()) {
streamReader.next();
...
t.transform(new StAXSource(streamReader), new StreamResult(writer));
...
}
After the first transform, the streamReader gets directly 2 times next():
1. from the transform method
2. from the next method in the while loop
So, in case of this specific line XML, the cursor can never achive the second open tag .
In opposite, if the input XML has a pretty print form, the second can be reached from the cursor because there is a space-event after the first closing tag
Unfortunately, I could not find anything how to do settings, so that the transformator does not automatically spring to next event after performing the transform method. This is so frustating.
Does anybody have any idea how I can deal with it? Also semantically is very welcome. Thank you so much.
Regards,
Ratna
PS. I can surely write a workaround for this problem(pretty print the xml document before transforming it, but this would mean that the input xml was being modified before, this is not allowed)
As you elaborated did the transformation step proceed to the next create element if the element-nodes follow directly each other.
In order to deal with this, you can rewrite you code using nested while loops, like this:
while(reader.next() != XMLStreamConstants.END_DOCUMENT) {
while(reader.getEventType() == XMLStreamConstants.START_ELEMENT && reader.getLocalName().equals("nextTag")) {
StringWriter writer = new StringWriter();
// will transform the current node to a String, moves the cursor to the next START_ELEMENT
t.transform(new StAXSource(reader), new StreamResult(writer));
System.out.println(writer.toString());
}
}
In case your xml file fits in memory, you can try with the help of the JOOX library, imported in gradle like:
compile 'org.jooq:joox:1.3.0'
And the main class, like:
import java.io.File;
import java.io.IOException;
import org.joox.JOOX;
import org.joox.Match;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import static org.joox.JOOX.$;
public class Main {
public static void main(String[] args)
throws IOException, SAXException, TransformerException {
DocumentBuilder builder = JOOX.builder();
Document document = builder.parse(new File(args[0]));
Transformer transformer =
TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty("omit-xml-declaration", "no");
final Match $m = $(document);
$m.find("nextTag").forEach(tag -> {
try {
transformer.transform(
new DOMSource(tag),
new StreamResult(System.out));
System.out.println();
}
catch (TransformerException e) {
System.exit(1);
}
});
}
}
It yields:
<?xml version="1.0" encoding="UTF-8"?><nextTag>1</nextTag>
<?xml version="1.0" encoding="UTF-8"?><nextTag>2</nextTag>
Is there a way to print the XML content without the XML header tag in Java?
For example if I have an XML like this:
<?xml version='1.0' encoding='UTF-8'?>
<rootElement>
<childElement>Text</childElement>
</rootElement>
I just want to print
<rootElement>
<childElement>Text</childElement>
</rootElement>
This is very similar to what I am doing so far:
http://sacrosanctblood.blogspot.com/2008/07/convert-xml-file-to-xml-string-in-java.html
I cannot give out the exact source code but the above link example should give you some idea. Here's that code with imports:
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Text;
public String convertXMLFileToString(String fileName)
{
try{
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
InputStream inputStream = new FileInputStream(new File(fileName));
org.w3c.dom.Document doc = documentBuilderFactory.newDocumentBuilder().parse(inputStream);
StringWriter stw = new StringWriter();
Transformer serializer = TransformerFactory.newInstance().newTransformer();
serializer.transform(new DOMSource(doc), new StreamResult(stw));
return stw.toString();
}
catch (Exception e) {
e.printStackTrace();
}
return null;
}
Transformer serializer = TransformerFactory.newInstance().newTransformer();
serializer.setOutputProperty("omit-xml-declaration", "yes");
serializer.transform(new DOMSource(doc), new StreamResult(stw));
Good and old XSL ;).
In an open source project I maintain, we have at least three different ways of reading, processing and writing XML files and I would like to standardise on a single method for ease of maintenance and stability.
Currently all of the project files use XML from the configuration to the stored data, we're hoping to migrate to a simple database at some point in the future but will still need to read/write some form of XML files.
The data is stored in an XML format that we then use a XSLT engine (Saxon) to transform into the final HTML files.
We currently utilise these methods:
- XMLEventReader/XMLOutputFactory (javax.xml.stream)
- DocumentBuilderFactory (javax.xml.parsers)
- JAXBContext (javax.xml.bind)
Are there any obvious pros and cons to each of these?
Personally, I like the simplicity of DOM (Document Builder), but I'm willing to convert to one of the others if it makes sense in terms of performance or other factors.
Edited to add:
There can be a significant number of files read/written when the project runs, between 100 & 10,000 individual files of around 5Kb each
It depends on what you are doing with the data.
If you are simply performing XSLT transforms on XML files to produce HTML files then you may not need to touch a parser directly:
import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
public class Demo {
public static void main(String[] args) throws Exception {
TransformerFactory tf = TransformerFactory.newInstance();
StreamSource xsltTransform = new StreamSource(new File("xslt.xml"));
Transformer transformer = tf.newTransformer(xsltTransform);
StreamSource source = new StreamSource(new File("source.xml"));
StreamResult result = new StreamResult(new File("result.html"));
transformer.transform(source, result);
}
}
If you need to make changes to the input document before you transform it, DOM is a convenient mechanism for doing this:
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
public class Demo {
public static void main(String[] args) throws Exception {
TransformerFactory tf = TransformerFactory.newInstance();
StreamSource xsltTransform = new StreamSource(new File("xslt.xml"));
Transformer transformer = tf.newTransformer(xsltTransform);
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document document = db.parse(new File("source.xml"));
// modify the document
DOMSource source = new DOMSource(document);
StreamResult result = new StreamResult(new File("result.html"));
transformer.transform(source, result);
}
}
If you prefer a typed model to make changes to the data then JAXB is a perfect fit:
import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Unmarshaller;
import javax.xml.bind.util.JAXBSource;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
public class Demo {
public static void main(String[] args) throws Exception {
TransformerFactory tf = TransformerFactory.newInstance();
StreamSource xsltTransform = new StreamSource(new File("xslt.xml"));
Transformer transformer = tf.newTransformer(xsltTransform);
JAXBContext jc = JAXBContext.newInstance("com.example.model");
Unmarshaller unmarshaller = jc.createUnmarshaller();
Model model = (Model) unmarshaller.unmarshal(new File("source.xml"));
// modify the domain model
JAXBSource source = new JAXBSource(jc, model);
StreamResult result = new StreamResult(new File("result.html"));
transformer.transform(source, result);
}
}
This is a very subjective topic. It primarily depends on how you are going to use the xml and size of XML. If XML is (always) small enough to be loaded in to memory, then you don't have to worry about memory foot print. You can use DOM parser. If you need to a parse through 150 MB xml you may want to think of using SAX. etc.