I have an xml file like the following:
<?xml version="1.0"?>
<Book>
<Title>Ulysses</Title>
<Author>James <b>Joyce</b></Author>
</Book>
I need to parse this using Java into a pojo like
title="Ulysses"
author="James <b>Joyce</b>"
In other words, I need the html or possible custom xml tags to remain as plain text rather than xml elements, when parsing.
I can't edit the XML at all but it would be ok for me to create a custom xslt file to transform the xml.
I've got the following Java code for using xslt to assist with the reading of the xml,
TransformerFactory factory = TransformerFactory.newInstance();
Source stylesheetSource = new StreamSource(new File(stylesheetPathname).getAbsoluteFile());
Transformer transformer = factory.newTransformer(stylesheetSource);
Source inputSource = new StreamSource(new File(inputPathname).getAbsoluteFile());
Result outputResult = new StreamResult(new File(outputPathname).getAbsoluteFile());
transformer.transform(inputSource, outputResult);
This applies my XSLT and writes the result out, but I can't come up with the correct XSLT. I had a look at "Add CDATA to an xml file", but that did not work for me.
Essentially, I believe I want the file to look like
<?xml version="1.0"?>
<Book>
<Title>Ulysses</Title>
<Author><![CDATA[James <b>Joyce</b>]]></Author>
</Book>
Then I can extract "James <b>Joyce</b>". I tried the approach suggested in "Add CDATA to an xml file", but it did not work for me.
I used the following xslt:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" omit-xml-declaration="no"/>
<xsl:template match="Author">
<xsl:copy>
<xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
<xsl:copy-of select="*"/>
<xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
and this produced:
<?xml version="1.0" encoding="UTF-8"?>
Ulysses
<Author><![CDATA[
<b>Joyce</b>]]></Author>
Can you please help with this? I want the original document to be written out in its entirety, but with the CDATA section surrounding everything within the Author element.
Thanks
Isn't using a simple html/xml parser like Jsoup a better way of solving this?
Using Jsoup you can try something like this:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;
import org.jsoup.select.Elements;
public class Example {
public static void main(String[] args) {
String xml = "<?xml version=\"1.0\"?>\n"
+ "<Book>\n"
+ " <Title>Ulysses</Title>\n"
+ " <Author>James <b>Joyce</b></Author>\n"
+ "</Book>";
Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
doc.outputSettings().prettyPrint(false);
Elements books = doc.select("Book");
for(Element e: books){
Book b = new Book(e.select("Title").html(),e.select("Author").html());
System.out.println(b.title);
System.out.println(b.author);
}
}
public static class Book{
String title;
String author;
public Book(String title, String author) {
this.title = title;
this.author = author;
}
}
}
With XSLT 3.0 as supported by Saxon 9.8 HE (available on Maven and Sourceforge) you can use XSLT as follows:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:math="http://www.w3.org/2005/xpath-functions/math"
exclude-result-prefixes="xs math"
version="3.0">
<xsl:output cdata-section-elements="Author"/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template match="Author">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:value-of select="serialize(node())"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
As for your attempt, you basically need to "implement" the identity transformation template concisely written in XSLT 3.0 as <xsl:mode on-no-match="shallow-copy"/> as a template
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
in XSLT 1.0 so that those nodes not handled by more specialized template (like the one for Author elements) are recursively copied through.
Then, with the copy-of selecting all child nodes node() and not only the element nodes * you get
<xsl:template match="Author">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:text disable-output-escaping="yes">&lt;![CDATA[</xsl:text>
<xsl:copy-of select="node()"/>
<xsl:text disable-output-escaping="yes">]]&gt;</xsl:text>
</xsl:copy>
</xsl:template>
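To try the XSLT 1.0 variant end to end, here is a minimal sketch that applies the identity template plus the Author template with the JDK's built-in transformer and then reads the author back as plain text. The class name CdataAuthorDemo, its method names and the inlined sample are made up for illustration, and it assumes the processor honours disable-output-escaping when writing to a stream (it is an optional feature, so other processors may ignore it).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
public class CdataAuthorDemo {
    static final String SAMPLE =
        "<Book><Title>Ulysses</Title><Author>James <b>Joyce</b></Author></Book>";

    // Identity template plus the Author template from the answer above.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@* | node()'>"
      + "<xsl:copy><xsl:apply-templates select='@* | node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='Author'>"
      + "<xsl:copy>"
      + "<xsl:apply-templates select='@*'/>"
      + "<xsl:text disable-output-escaping='yes'>&lt;![CDATA[</xsl:text>"
      + "<xsl:copy-of select='node()'/>"
      + "<xsl:text disable-output-escaping='yes'>]]&gt;</xsl:text>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet and returns the serialized result.
    static String wrapAuthor(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        // A stream result is required: output escaping is a serializer feature.
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    // Parses the transformed XML; the CDATA content comes back as plain text.
    static String readAuthor(String transformedXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(transformedXml)));
        return doc.getElementsByTagName("Author").item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String transformed = wrapAuthor(SAMPLE);
        System.out.println(transformed);
        System.out.println(readAuthor(transformed));
    }
}
```

Note that this trick breaks if the Author content itself ever contains the sequence ]]>, so treat it as a workaround rather than a general solution.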
Can someone provide me the source code for replacing values dynamically for existing XSLT file using java object
XSLT File:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:apply-templates select="CheckDomainCmd"/>
</xsl:template>
<xsl:template match="CheckDomainCmd">
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:ietf:params:xml:ns:epp-1.0 epp-1.0.xsd">
<command>
<check>
<domain:check xmlns:domain="http://www.nic.cz/xml/epp/domain-1.4" xsi:schemaLocation="http://www.nic.cz/xml/epp/domain-1.4 domain-1.4.xsd">
<domain:name><xsl:value-of select="DomainName"/>.<xsl:value-of select="TLD" /></domain:name>
</domain:check>
</check>
<clTRID>
<xsl:value-of select="RIMTransactionID"/>
</clTRID>
</command>
</epp>
</xsl:template>
</xsl:stylesheet>
Java Object:
public class CheckDomain {
private String DomainName;
private String TLD;
private String RIMTransactionID;
// getters and setters
}
I need Java/Spring source code that puts values into the XSLT select attributes dynamically.
For example, given the following values in a Java object, how do I transfer them into the XSLT attributes?
public class XSLTConversion {
public static void main(String[] args) {
CheckDomain checkDomain = new CheckDomain();
checkDomain.setDomainName("test");
checkDomain.setTLD("com");
checkDomain.setRIMTransactionID("qwertyco123456");
replaceValuesToXSLTFile(checkDomain, "checkdomain.xslt");
}
public static void replaceValuesToXSLTFile(CheckDomain checkDomain, String fileName) {
}
}
After transformation, I need the file content like below
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:apply-templates select="CheckDomainCmd"/>
</xsl:template>
<xsl:template match="CheckDomainCmd">
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:ietf:params:xml:ns:epp-1.0 epp-1.0.xsd">
<command>
<check>
<domain:check xmlns:domain="http://www.nic.cz/xml/epp/domain-1.4" xsi:schemaLocation="http://www.nic.cz/xml/epp/domain-1.4 domain-1.4.xsd">
<domain:name><xsl:value-of select="test"/>.<xsl:value-of select="com" /></domain:name>
</domain:check>
</check>
<clTRID>
<xsl:value-of select="qwertyco123456"/>
</clTRID>
</command>
</epp>
</xsl:template>
</xsl:stylesheet>
Your XSLT code needs to declare global parameters and reference them:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:param name="DomainName"/>
<xsl:param name="TLD"/>
<xsl:param name="RIMTransactionID"/>
<xsl:template match="/">
<xsl:apply-templates select="CheckDomainCmd"/>
</xsl:template>
<xsl:template match="CheckDomainCmd">
<epp xmlns="urn:ietf:params:xml:ns:epp-1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:ietf:params:xml:ns:epp-1.0 epp-1.0.xsd">
<command>
<check>
<domain:check xmlns:domain="http://www.nic.cz/xml/epp/domain-1.4" xsi:schemaLocation="http://www.nic.cz/xml/epp/domain-1.4 domain-1.4.xsd">
<domain:name><xsl:value-of select="$DomainName"/>.<xsl:value-of select="$TLD" /></domain:name>
</domain:check>
</check>
<clTRID>
<xsl:value-of select="$RIMTransactionID"/>
</clTRID>
</command>
</epp>
</xsl:template>
</xsl:stylesheet>
then your Java code can create a Transformer from the XSLT and use e.g. transformer.setParameter("TLD", checkDomain.getTLD()) (see https://docs.oracle.com/javase/8/docs/api/javax/xml/transform/Transformer.html#setParameter-java.lang.String-java.lang.Object-) and so on for the other parameters before calling the transform method.
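As a rough illustration of the setParameter approach, the sketch below uses a cut-down stylesheet with the same three global parameters. The class name XsltParamDemo, the render method and the simplified output elements are assumptions for the example, not part of your EPP stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; the stylesheet is a simplified stand-in for the real one.
public class XsltParamDemo {
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:param name='DomainName'/>"
      + "<xsl:param name='TLD'/>"
      + "<xsl:param name='RIMTransactionID'/>"
      + "<xsl:template match='/'>"
      + "<check>"
      + "<name><xsl:value-of select='$DomainName'/>.<xsl:value-of select='$TLD'/></name>"
      + "<clTRID><xsl:value-of select='$RIMTransactionID'/></clTRID>"
      + "</check>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String render(String domainName, String tld, String transactionId) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Bind the Java values to the global xsl:param declarations by name.
        t.setParameter("DomainName", domainName);
        t.setParameter("TLD", tld);
        t.setParameter("RIMTransactionID", transactionId);
        StringWriter out = new StringWriter();
        // The template only reads parameters, so any well-formed input document will do.
        t.transform(new StreamSource(new StringReader("<CheckDomainCmd/>")), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render("test", "com", "qwertyco123456"));
    }
}
```

The stylesheet file itself stays unchanged on disk; the values are supplied at transformation time, which is usually preferable to rewriting the select attributes.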
I am able to generate a CSV file from an XML file using XSLT, but only the header row shows up in the CSV file. The values are not showing up.
Here is my java code:-
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
public class xml2csv {
public static void main(String[] args) throws Exception {
File stylesheet = new File("C:/Users/Admin/Desktop/out.xslt");
File xmlSource = new File("C:/Users/Admin/Desktop/out.xml");
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(xmlSource);
StreamSource stylesource = new StreamSource(stylesheet);
Transformer transformer = TransformerFactory.newInstance().newTransformer(stylesource);
Source source = new DOMSource(document);
Result outputTarget = new StreamResult(new File("C:/Users/Admin/Desktop/out.csv"));
transformer.transform(source, outputTarget);
}
}
the XML file:-
<root>
<header>Symbol</header>
<row>NIFTY 50</row>
<row>LUPIN</row>
<header>Open</header>
<row>9,670.35</row>
<row>1,082.90</row>
<header>High</header>
<row>9,684.25</row>
<row>1,137.00</row>
</root>
XSLT file:-
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" >
<xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
<xsl:template match="/">
Symbol,Open,High
<xsl:for-each select="//header">
<xsl:value-of select="concat(Symbol, ',', Open, ',', High)"/>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
So, I am getting only header of XML using this XSLT, where am I going wrong?
If I am guessing correctly at what you're trying to accomplish here, you will need to do something like:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:output method="text"/>
<xsl:template match="/root">
<!-- header -->
<xsl:text>Symbol,Open,High
</xsl:text>
<!-- data -->
<xsl:variable name="n" select="count(row) div 3" />
<xsl:for-each select="row[position() &lt;= $n]">
<xsl:variable name="i" select="position()" />
<xsl:text>"</xsl:text>
<xsl:value-of select="."/>
<xsl:text>","</xsl:text>
<xsl:value-of select="../row[$n + $i]"/>
<xsl:text>","</xsl:text>
<xsl:value-of select="../row[2 * $n + $i]"/>
<xsl:text>"
</xsl:text>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
Applied to your input example, the result will be:
Symbol,Open,High
"NIFTY 50","9,670.35","9,684.25"
"LUPIN","1,082.90","1,137.00"
I have added quotes around the values because some of them contain commas, but I did not handle the possibility of some of them containing a quote.
As I mentioned in a comment to your question, this could be a lot easier if your XML were structured in a more friendly way.
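If it is acceptable to do the pivot in Java rather than XSLT, the same idea (n rows per column, the i-th row of each column block forming one CSV line) can be sketched with plain DOM, which also makes it easy to double embedded quotes. The class name RowsToCsv and its methods are invented for this example, and it assumes every header is followed by the same number of row elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; assumes each header has the same number of rows.
public class RowsToCsv {
    static final String SAMPLE =
        "<root>"
      + "<header>Symbol</header><row>NIFTY 50</row><row>LUPIN</row>"
      + "<header>Open</header><row>9,670.35</row><row>1,082.90</row>"
      + "<header>High</header><row>9,684.25</row><row>1,137.00</row>"
      + "</root>";

    // Wraps a value in quotes and doubles any embedded quote characters.
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    static String toCsv(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        NodeList headers = doc.getElementsByTagName("header");
        NodeList rows = doc.getElementsByTagName("row");
        int columns = headers.getLength();
        int n = rows.getLength() / columns; // rows per column, like $n in the stylesheet

        StringBuilder csv = new StringBuilder();
        for (int c = 0; c < columns; c++) {
            if (c > 0) csv.append(',');
            csv.append(headers.item(c).getTextContent());
        }
        csv.append('\n');
        for (int i = 0; i < n; i++) {
            for (int c = 0; c < columns; c++) {
                if (c > 0) csv.append(',');
                // The i-th value of column c sits at index c * n + i in document order.
                csv.append(quote(rows.item(c * n + i).getTextContent()));
            }
            csv.append('\n');
        }
        return csv.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(toCsv(SAMPLE));
    }
}
```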
I have an XML document that I need to sort by date. I have a Java program that applies an XSL stylesheet. The output should be the same XML, sorted by date descending, but the sort is not working: the output looks the same as the input.
Here is what the source xml looks like:
<?xml version="1.0"?>
<ABCResponse xmlns="http://www.example.com/Schema">
<ABCDocumentList>
<ABCDocument>
<DocumentType>APPLICATION</DocumentType>
<EffectiveDate>20140110010000</EffectiveDate>
<Name>JOE DOCS</Name>
</ABCDocument>
<ABCDocument>
<DocumentType>FORM</DocumentType>
<EffectiveDate>20140206010000</EffectiveDate>
<Name>JOE DOCS</Name>
</ABCDocument>
<ABCDocument>
<DocumentType>PDF</DocumentType>
<EffectiveDate>20140120010000</EffectiveDate>
<Name>JOE DOCS</Name>
</ABCDocument>
</ABCDocumentList>
</ABCResponse>
Java:
import java.io.File;
import java.io.StringReader;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
public class TestMisc {
/**
* @param args
*/
public static void main(String[] args) {
Source xmlInput = new StreamSource(new File("c:/temp/ABCListDocsResponse.xml"));
Source xsl = new StreamSource(new File("c:/temp/DocResponseDateSort.xsl"));
Result xmlOutput = new StreamResult(new File("c:/temp/ABC_output1.xml"));
try {
javax.xml.transform.TransformerFactory transFact =
javax.xml.transform.TransformerFactory.newInstance( );
javax.xml.transform.Transformer trans = transFact.newTransformer(xsl);
trans.transform(xmlInput, xmlOutput);
} catch (TransformerException e) {
// Report the failure instead of silently swallowing it
e.printStackTrace();
}
}
}
Here is the XSL
XSL:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" version="1.0" indent="yes"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="ABCResponse/ABCDocumentList/ABCDocument">
<xsl:copy>
<xsl:apply-templates select="@*|node()">
<xsl:sort select="EffectiveDate" data-type="number" order="descending"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
If you want to sort the ABCDocuments, you need to stop one level higher. You also need to use namespaces correctly since your source XML uses namespaces:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ex="http://www.example.com/Schema">
<xsl:output method="xml" omit-xml-declaration="yes" version="1.0" indent="yes"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="ex:ABCDocumentList">
<xsl:copy>
<xsl:apply-templates select="ex:ABCDocument">
<xsl:sort select="ex:EffectiveDate" data-type="number" order="descending"/>
</xsl:apply-templates>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
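To check the corrected stylesheet quickly, a small harness along these lines can run it against a trimmed copy of your documents. The class name SortByDateDemo and the shortened sample are assumptions for the sketch; the stylesheet is the one above with the output settings left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical harness; the sample is a shortened version of the question's XML.
public class SortByDateDemo {
    static final String SAMPLE =
        "<ABCResponse xmlns='http://www.example.com/Schema'><ABCDocumentList>"
      + "<ABCDocument><DocumentType>APPLICATION</DocumentType>"
      + "<EffectiveDate>20140110010000</EffectiveDate></ABCDocument>"
      + "<ABCDocument><DocumentType>FORM</DocumentType>"
      + "<EffectiveDate>20140206010000</EffectiveDate></ABCDocument>"
      + "<ABCDocument><DocumentType>PDF</DocumentType>"
      + "<EffectiveDate>20140120010000</EffectiveDate></ABCDocument>"
      + "</ABCDocumentList></ABCResponse>";

    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:ex='http://www.example.com/Schema'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      // Sort where the ABCDocument elements are selected, i.e. in their parent.
      + "<xsl:template match='ex:ABCDocumentList'>"
      + "<xsl:copy>"
      + "<xsl:apply-templates select='ex:ABCDocument'>"
      + "<xsl:sort select='ex:EffectiveDate' data-type='number' order='descending'/>"
      + "</xsl:apply-templates>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String sort(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sort(SAMPLE));
    }
}
```

With the sample above, the FORM document (February) should come first and the APPLICATION document (10 January) last.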
I have an XML file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
<cd>
<title>Empire Burlesque</title>
<artist>Bob Dylan</artist>
<country>USA</country>
<company>Columbia</company>
<price>10.90</price>
<year>1985</year>
</cd>
</catalog>
And this XSL file:
<?xml version="1.0" ?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:value-of select="/catalog/cd/artist"/>
<xsl:variable name = "artist" select = "/catalog/cd/artist()"/>
<xsl:variable name="year" select="/catalog/cd/year()"/>
<xsl:Object-bean name="{$artist}" id="{$year}">
</xsl:Object-bean>
</xsl:template>
</xsl:stylesheet>
Now I want to transform the result into a Java class.
Java:
@XmlRootElement(name = "Object-bean")
@XmlAccessorType(XmlAccessType.NONE)
public class ObjectBean {
@XmlAttribute(name = "name")
private String name;
@XmlAttribute
private String id;
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
}
but when i run it it show me this error:
Error at xsl:Object-bean on line 7 column 49 of test.xsl:
XTSE0010: Unknown XSLT element: Object-bean
Exception in thread "main" javax.xml.transform.TransformerConfigurationException: Failed to compile stylesheet. 1 error detected.
at net.sf.saxon.PreparedStylesheet.prepare(PreparedStylesheet.java:176)
at net.sf.saxon.TransformerFactoryImpl.newTemplates(TransformerFactoryImpl.java:139)
at net.sf.saxon.TransformerFactoryImpl.newTransformer(TransformerFactoryImpl.java:91)
at XslExecutor.main(XslExecutor.java:28)
The XML holds the original data (Document A). The XSLT is a transformation template that translates that XML (Document A) into another XML document (Document B). Finally, you are trying to unmarshal the output of the XSLT (Document B) into a POJO annotated with JAXB. JAXB annotations work similarly to the XSLT template: they provide a binding mechanism between XML and a POJO.
(XML Document A) --[XSLT]--> (XML Document B) --[JAXB]--> POJO
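With that common understanding in place: the output you are showing says the XSLT transformation is failing, because the stylesheet itself is invalid. Elements in the XSLT namespace must be XSLT instructions, so xsl:Object-bean is rejected (XTSE0010), and paths such as /catalog/cd/artist() are not valid XPath. Start with something like this, which works with the XML you provided: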
<?xml version="1.0" ?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<xsl:element name="Object-bean">
<xsl:attribute name="name">
<xsl:value-of select="/catalog/cd/artist"/>
</xsl:attribute>
<xsl:attribute name="id">
<xsl:value-of select="/catalog/cd/year"/>
</xsl:attribute>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
The error is caused by your invalid XSLT stylesheet. What do you want to achieve by applying the XSLT transformation? If you want it to build the POJO directly, that is not a good idea.
First transform your initial XML file with the XSLT stylesheet, and then unmarshal the resulting XML into your POJO using JAXB.
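The two-step flow can be sketched as follows. To keep the example free of extra dependencies, the JAXB unmarshalling step is replaced here by plain DOM attribute reads, so this is a stand-in for the second step rather than JAXB itself. The class name XsltToPojoDemo is made up, and the stylesheet is assumed to emit name and id attributes so that they line up with the ObjectBean fields.

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo; DOM attribute reads stand in for JAXB unmarshalling.
public class XsltToPojoDemo {
    public static class ObjectBean {
        public String name;
        public String id;
    }

    static final String SAMPLE =
        "<catalog><cd><title>Empire Burlesque</title><artist>Bob Dylan</artist>"
      + "<year>1985</year></cd></catalog>";

    // Step 1 stylesheet: Document A (catalog) -> Document B (Object-bean).
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'>"
      + "<Object-bean name='{/catalog/cd/artist}' id='{/catalog/cd/year}'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static ObjectBean convert(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        // Transform into an in-memory DOM instead of a file.
        DOMResult result = new DOMResult();
        t.transform(new StreamSource(new StringReader(xml)), result);
        Element root = ((Document) result.getNode()).getDocumentElement();

        // Step 2: bind Document B to the POJO (JAXB would do this via annotations).
        ObjectBean bean = new ObjectBean();
        bean.name = root.getAttribute("name");
        bean.id = root.getAttribute("id");
        return bean;
    }

    public static void main(String[] args) throws Exception {
        ObjectBean bean = convert(SAMPLE);
        System.out.println(bean.name + " / " + bean.id);
    }
}
```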
========= UPDATE =========
Many thanks to Tomalak for the correct XSL syntax, and to Ian Roberts for pointing out that in order to use namespaces in my XSLT, I need to call "setNamespaceAware(true)" at the very beginning, in my DocumentBuilderFactory.
========= END UPDATE =========
Q: How can I write an XSLT stylesheet that filters out all elements and/or all node trees in the "http://foo.com/abc" namespace?
I have an XML file that looks like this:
SOURCE XML:
<zoo xmlns="http://myurl.com/wsdl/myservice">
<animal>elephant</animal>
<exhibit>
<animal>walrus</animal>
<animal>sea otter</animal>
<trainer xmlns="http://foo.com/abc">Jack</trainer>
</exhibit>
<exhibit xmlns="http://foo.com/abc">
<animal>lion</animal>
<animal>tiger</animal>
</exhibit>
</zoo>
DESIRED RESULT XML:
<zoo xmlns="http://myurl.com/wsdl/myservice">
<animal>elephant</animal>
<exhibit>
<animal>walrus</animal>
<animal>sea otter</animal>
</exhibit>
</zoo>
XSLT (thanks to Tomalak):
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:a="http://foo.com/abc"
exclude-result-prefixes="a"
>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="a:* | @a:*" />
</xsl:stylesheet>
Thank you in advance!
JAVA PROGRAM THAT SUCCESSFULLY DOES THE XSLT FILTERING BY NAMESPACE:
import java.io.*;
import org.w3c.dom.*; // XML DOM
import javax.xml.parsers.*; // DocumentBuilder, etc
import javax.xml.transform.*; // Transformer, etc
import javax.xml.transform.stream.*; // StreamResult, StreamSource, etc
import javax.xml.transform.dom.DOMSource;
public class Test {
public static void main(String[] args) {
new Test().testZoo();
}
public void testZoo () {
String zooXml = Test.readXmlFile ("zoo.xml");
if (zooXml == null)
return;
try {
// Create a new document builder factory, and make sure it's namespace-aware
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder docBuilder = dbf.newDocumentBuilder ();
// Read XFDD input string into DOM
Document xfddDoc =
docBuilder.parse(new org.xml.sax.InputSource(new StringReader (zooXml))); // StringBufferInputStream is deprecated
// Filter out all elements in "http://foo.com/abc" namespace
StreamSource styleSource = new StreamSource (new File ("zoo.xsl"));
Transformer transformer =
TransformerFactory.newInstance().newTransformer (styleSource);
// Convert final DOM back to String
StringWriter buffer = new StringWriter ();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes"); // Remember: we want to insert this XML as a subnode
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.transform(new DOMSource(xfddDoc), new StreamResult (buffer));
String translatedXml = buffer.toString();
}
catch (Exception e) {
System.out.println ("convertTransactionData error: " + e.getMessage());
e.printStackTrace ();
}
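}
// Test.readXmlFile(String), which loads the file into a String, was not included in the original post
}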
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:a="http://foo.com/abc"
exclude-result-prefixes="a"
>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="a:* | @a:*" />
</xsl:stylesheet>
Empty templates match nodes but do not output anything, effectively removing what they match.
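For a self-contained check of the empty-template technique, the sketch below feeds the zoo sample to the stylesheet through a StreamSource, which the transformer parses in a namespace-aware way, so no DocumentBuilderFactory setup is involved. The class name NamespaceFilterDemo and the inlined strings are assumptions made for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; the XML is the zoo sample from the question.
public class NamespaceFilterDemo {
    static final String SAMPLE =
        "<zoo xmlns='http://myurl.com/wsdl/myservice'>"
      + "<animal>elephant</animal>"
      + "<exhibit><animal>walrus</animal><animal>sea otter</animal>"
      + "<trainer xmlns='http://foo.com/abc'>Jack</trainer></exhibit>"
      + "<exhibit xmlns='http://foo.com/abc'><animal>lion</animal><animal>tiger</animal></exhibit>"
      + "</zoo>";

    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:a='http://foo.com/abc' exclude-result-prefixes='a'>"
      + "<xsl:template match='@* | node()'>"
      + "<xsl:copy><xsl:apply-templates select='@* | node()'/></xsl:copy>"
      + "</xsl:template>"
      // Empty template: anything in the a: namespace produces no output.
      + "<xsl:template match='a:* | @a:*'/>"
      + "</xsl:stylesheet>";

    static String filter(String xml) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(filter(SAMPLE));
    }
}
```

Because the second exhibit element is itself in the filtered namespace, its whole subtree disappears; the identity template never descends into it.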