Java Saxon 10.5 XSL 3 Transformation, href file not found - java

I'm writing a Java application that performs an XML transformation using XSLT 3.0 with Saxon-HE 10.5 (as a Maven project).
My XSLT stylesheet imports other stylesheets using <xsl:import> (e.g. <xsl:import href="sheet1.xsl"/>). All of the stylesheets are located inside ./src/main/resources. However, when I run the program, I get a FileNotFoundException from Saxon, because it looks for the imported files in the project base directory.
I assume there is some way to change where Saxon is looking for the files, but I was not able to find out how to achieve this when using the s9api API.
Here's my Java code performing the transformation:
public void transformXML(String xmlFile, String output) throws SaxonApiException, IOException, XPathExpressionException, ParserConfigurationException, SAXException {
    Processor processor = new Processor(false);
    XsltCompiler compiler = processor.newXsltCompiler();
    XsltExecutable stylesheet = compiler.compile(new StreamSource(this.getClass().getClassLoader().getResourceAsStream("transform.xsl")));
    Serializer out = processor.newSerializer(new File(output));
    out.setOutputProperty(Serializer.Property.METHOD, "text");
    Xslt30Transformer transformer = stylesheet.load30();
    transformer.transform(new StreamSource(new File(xmlFile)), out);
}
Any help is appreciated.
Edit:
My solution based on @Michael Kay's recommendation:
public void transformXML(String xmlFile, String output) throws SaxonApiException, IOException, XPathExpressionException, ParserConfigurationException, SAXException {
    Processor processor = new Processor(false);
    XsltCompiler compiler = processor.newXsltCompiler();
    // Resolve xsl:import/xsl:include targets from the classpath
    compiler.setURIResolver(new ClasspathResourceURIResolver());
    XsltExecutable stylesheet = compiler.compile(new StreamSource(this.getClass().getClassLoader().getResourceAsStream("transform.xsl")));
    Serializer out = processor.newSerializer(new File(output));
    out.setOutputProperty(Serializer.Property.METHOD, "text");
    Xslt30Transformer transformer = stylesheet.load30();
    transformer.transform(new StreamSource(new File(xmlFile)), out);
}
class ClasspathResourceURIResolver implements URIResolver
{
    @Override
    public Source resolve(String href, String base) throws TransformerException {
        // Looks up href as a classpath resource; the base URI is ignored
        return new StreamSource(this.getClass().getClassLoader().getResourceAsStream(href));
    }
}

Saxon doesn't know the base URI of the stylesheet (it has no way of knowing, because you haven't told it), so it can't resolve relative URIs appearing in xsl:import/@href.
Normally I would suggest supplying a base URI in the second argument of new StreamSource(). However, since the main stylesheet is loaded using getResourceAsStream(), I suspect you want to load secondary stylesheet modules using the same mechanism, and this can be done by setting a URIResolver on the XsltCompiler object.
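As a rough illustration of the base-URI suggestion above, the sketch below passes the resource URL as the second argument of new StreamSource(), so that the relative href in xsl:import is resolved against it. It deliberately uses the JDK's built-in JAXP TransformerFactory and an XSLT 1.0 stylesheet instead of Saxon's s9api, purely so that it runs without extra dependencies. The class name BaseUriDemo, the temporary directory standing in for src/main/resources, and the stylesheet contents are all assumptions made for the example.

```java
import java.io.InputStream;
import java.io.StringReader;
import java.io.StringWriter;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: not part of the original question or answer.
class BaseUriDemo {
    static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

    static String run() throws Exception {
        // Stand-in for src/main/resources: a temp directory exposed via a class loader.
        Path dir = Files.createTempDirectory("xslt-resources");
        Files.write(dir.resolve("sheet1.xsl"), (
              "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSL_NS + "'>"
            + "<xsl:template name='greet'>hello from sheet1</xsl:template>"
            + "</xsl:stylesheet>").getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("transform.xsl"), (
              "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSL_NS + "'>"
            + "<xsl:import href='sheet1.xsl'/>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/'><xsl:call-template name='greet'/></xsl:template>"
            + "</xsl:stylesheet>").getBytes(StandardCharsets.UTF_8));

        try (URLClassLoader loader =
                 new URLClassLoader(new URL[] { dir.toUri().toURL() }, null)) {
            // getResource() returns a URL, which also serves as the base URI.
            URL url = loader.getResource("transform.xsl");
            if (url == null) {
                throw new IllegalStateException("transform.xsl not found");
            }
            try (InputStream in = url.openStream()) {
                // Second argument = system ID, i.e. the stylesheet's base URI.
                StreamSource xslt = new StreamSource(in, url.toExternalForm());
                Transformer t = TransformerFactory.newInstance().newTransformer(xslt);
                StringWriter out = new StringWriter();
                t.transform(new StreamSource(new StringReader("<doc/>")),
                            new StreamResult(out));
                return out.toString();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

With s9api, the same two-argument StreamSource can be handed to XsltCompiler.compile(), since that method accepts any Source.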

Related

Unable find href within XSLT, when transforming XML. Java

Some background... I am transforming an XML file (condensedModel) using an XSLT stylesheet (deployments.xslt).
deployments.xslt uses a functions.xslt file, which is included via href.
This appears to be where the problem is: the processor does not seem to find the functions.xslt file that I reference from deployments.xslt.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
<xsl:variable name="lowercase" select="'abcdefghijklmnopqrstuvwxyz-_'"/>
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ. '"/>
<!-- <xsl:variable name="kubename" select="'name'"/>-->
<xsl:variable name="kubename" select="'app.kubernetes.io/name'"/>
<xsl:variable name="quot">"</xsl:variable>
<xsl:variable name="apos">'</xsl:variable>
<xsl:include href="functions.xslt"/>
Java:
private TransformerFactory factory = TransformerFactory.newInstance();
private Transformer transformer;
private void init(String xslt) throws IOException, TransformerConfigurationException, AggregatorException {
    if (xslt == null || xslt.isEmpty()) {
        throw new AggregatorException("XSLT was null or empty. Unable to transform condensed model to yaml");
    }
    transformer = factory.newTransformer(new StreamSource(new StringReader(xslt)));
    Security.addProvider(new BouncyCastleProvider());
}
public String transform(TransformingYamlEnum transformerToUse) throws TransformerException, IOException, AggregatorException {
    String xslt = determineXsltToUse(transformerToUse);
    init(xslt);
    try (ByteArrayOutputStream bos = new ByteArrayOutputStream()) {
        InputStream inputStream = new ByteArrayInputStream(condensedModel.getBytes(StandardCharsets.UTF_8));
        process(inputStream, bos);
        return bos.toString("UTF-8");
    }
}
private void process(InputStream inputStream, OutputStream outputStream) throws TransformerException {
    transformer.transform(
        new StreamSource(inputStream),
        new StreamResult(outputStream));
}
This all runs from within a jar, and deployments.xslt is loaded in my init() method. The actual transformation takes place when I call transform().
It does work when I hard-code the path to, say, the desktop and manually place the functions.xslt file there. But as you can imagine, this is not a viable option. Any ideas about what I am doing wrong?
The XSLT processor cannot resolve a relative URI in xsl:include/xsl:import unless it knows the base URI of the stylesheet. If you supply the input as a StreamSource wrapping an InputStream, without also supplying a SystemId, then the base URI will be unknown.
In your example the XSLT processor has no idea where the stylesheet code came from.
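When the stylesheet text only exists as a String, as in init() above, there may be no meaningful URL to supply as a system ID. One alternative, shown here only as a sketch and not as what the answer prescribes, is to register a URIResolver on the TransformerFactory that serves the included modules itself. The class name InMemoryResolverDemo, the in-memory map, and the stylesheet contents are assumptions for illustration; a real application would more likely read the modules from the classpath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: not part of the original question or answer.
class InMemoryResolverDemo {
    static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

    static String run() throws Exception {
        // Included modules, keyed by the href used in xsl:include.
        Map<String, String> modules = new HashMap<>();
        modules.put("functions.xslt",
              "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSL_NS + "'>"
            + "<xsl:template name='shout'>FROM FUNCTIONS</xsl:template>"
            + "</xsl:stylesheet>");

        // Main stylesheet held as a String, so it has no base URI.
        String mainXslt =
              "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSL_NS + "'>"
            + "<xsl:output method='text'/>"
            + "<xsl:include href='functions.xslt'/>"
            + "<xsl:template match='/'><xsl:call-template name='shout'/></xsl:template>"
            + "</xsl:stylesheet>";

        TransformerFactory factory = TransformerFactory.newInstance();
        // The resolver must be set before the stylesheet is compiled,
        // because xsl:include is processed at compile time.
        factory.setURIResolver((href, base) -> {
            String text = modules.get(href);
            if (text == null) {
                throw new TransformerException("Unknown stylesheet module: " + href);
            }
            return new StreamSource(new StringReader(text));
        });

        Transformer t = factory.newTransformer(new StreamSource(new StringReader(mainXslt)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader("<doc/>")), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

Setting a system ID on the StreamSource remains the simpler fix whenever the stylesheet does come from a location that has a URL.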

How to secure javax.xml.transform.TransformerFactory from XML external attacks

I have researched the subject but couldn't find any relevant information.
Do we need to take any security measures to protect javax.xml.transform.Transformer against XML external entity (XXE) attacks?
I tried the following, and it still seems to expand the external entity declared in the DTD.
String fileData = "<!DOCTYPE acunetix [ <!ENTITY sampleVal SYSTEM \"file:///media/sample\">]><username>&sampleVal;</username>";
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
Transformer transformer = transformerFactory.newTransformer();
StringWriter buff = new StringWriter();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.transform(new StreamSource(new StringReader(fileData)), new StreamResult(buff));
System.out.println(buff.toString());
The output contains the value from the file:
<username>test</username>
Your code seems correct. When I run this slightly modified JUnit test case:
@Test
public void test() throws TransformerException, URISyntaxException {
    File testFile = new File(getClass().getResource("test.txt").toURI());
    assertTrue(testFile.exists());
    String fileData = "<!DOCTYPE acunetix [ <!ENTITY foo SYSTEM \"file://" +
        testFile.toString() +
        "\">]><xxe>&foo;</xxe>";
    TransformerFactory transformerFactory = TransformerFactory.newInstance();
    System.out.println(transformerFactory.getClass().getName());
    transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
    Transformer transformer = transformerFactory.newTransformer();
    StringWriter buff = new StringWriter();
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    transformer.transform(new StreamSource(new StringReader(fileData)), new StreamResult(buff));
    assertEquals("<xxe>&foo;</xxe>", buff.toString());
}
I get the following output:
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl
[Fatal Error] :1:182: External Entity: Failed to read external document 'test.txt', because 'file' access is not allowed due to restriction set by the accessExternalDTD property.
ERROR: 'External Entity: Failed to read external document 'test.txt', because 'file' access is not allowed due to restriction set by the accessExternalDTD property.'
From the setFeature JavaDocs:
All implementations are required to support the XMLConstants.FEATURE_SECURE_PROCESSING feature. When the feature is:
true: the implementation will limit XML processing to conform to implementation limits and behave in a secure fashion as defined by the implementation. Examples include resolving user defined style sheets and functions. If XML processing is limited for security reasons, it will be reported via a call to the registered ErrorListener.fatalError(TransformerException exception). See setErrorListener(ErrorListener listener).
That error goes away if I comment out transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true); and then the test fails because the entity is resolved.
Try adding an ErrorListener to both the TransformerFactory and Transformer:
transformerFactory.setErrorListener(new ErrorListener() {
    @Override
    public void warning(TransformerException exception) throws TransformerException {
        System.out.println("In Warning: " + exception.toString());
    }
    @Override
    public void error(TransformerException exception) throws TransformerException {
        System.out.println("In Error: " + exception.toString());
    }
    @Override
    public void fatalError(TransformerException exception) throws TransformerException {
        System.out.println("In Fatal: " + exception.toString());
    }
});
Transformer transformer = transformerFactory.newTransformer();
transformer.setErrorListener(transformerFactory.getErrorListener());
I see the following new console output now:
In Error: javax.xml.transform.TransformerException: External Entity: Failed to read external document 'test.txt', because 'file' access is not allowed due to restriction set by the accessExternalDTD property.
Maybe your implementation is treating it as a warning? Or maybe it depends on which implementation you're using? The JavaDoc spec doesn't look precise, so one implementation might behave differently from another. I'd be interested to know which implementations are faulty!
I know that this is an old post, but for those who find themselves here, I hope this helps :)
After applying the solution below, SonarQube still complained with 'Disable access to external entities in XML parsing' security issue :(
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
Eventually I landed on the solution below which finally fixed the issue for me.
TransformerFactory factory = TransformerFactory.newInstance();
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
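To check whether such a configuration actually blocks external entities on a given JDK, a small self-test along the following lines may help. It is a sketch under assumptions: the class name XxeHardeningDemo, the marker string, and the temporary file are invented for the example, and it sets only ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_STYLESHEET, which are the two access properties the XMLConstants documentation describes for TransformerFactory. Whether a particular implementation also accepts ACCESS_EXTERNAL_SCHEMA is left open here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: not part of the original answers.
class XxeHardeningDemo {

    // Factory with secure processing on and external DTD/stylesheet access disabled.
    static TransformerFactory hardenedFactory() throws TransformerConfigurationException {
        TransformerFactory factory = TransformerFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return factory;
    }

    // Returns true if an identity transform copies the external file's content
    // into the output, i.e. the external entity was expanded.
    static boolean leaksSecret(TransformerFactory factory) throws Exception {
        Path secret = Files.createTempFile("xxe-secret", ".txt");
        Files.write(secret, "TOP-SECRET".getBytes(StandardCharsets.UTF_8));
        String xml = "<!DOCTYPE xxe [ <!ENTITY foo SYSTEM \"" + secret.toUri() + "\"> ]>"
                   + "<xxe>&foo;</xxe>";
        StringWriter out = new StringWriter();
        try {
            factory.newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
        } catch (TransformerException e) {
            // The processor refused to read the external entity: nothing leaked.
            return false;
        } finally {
            Files.deleteIfExists(secret);
        }
        return out.toString().contains("TOP-SECRET");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("hardened factory leaks: " + leaksSecret(hardenedFactory()));
    }
}
```

Running the same check against the TransformerFactory implementation that is actually on the application's classpath shows whether that implementation honours the settings.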

Load resource referencing resources in Java Webapp

I have an XSLT transformation implemented roughly like this:
public static String transform(Source xml, String xsltPath) {
    try {
        InputStream is = MyClass.class.getResourceAsStream(xsltPath);
        final Source xslt = new StreamSource(is);
        final TransformerFactory transFact = TransformerFactory.newInstance();
        final Transformer trans = transFact.newTransformer(xslt);
        final OutputStream os = new ByteArrayOutputStream();
        final StreamResult result = new StreamResult(os);
        trans.transform(xml, new StreamResult(os));
        final String theResult = result.getOutputStream().toString();
        return theResult;
    }
    catch (TransformerException e) {
        return null;
    }
}
As you can see, the XSLT is loaded from the resources. The function and the transformation files it needs are bundled in a library, and this works as long as the library runs standalone, for example from a main method.
However, if this library is bundled with a web application and deployed in Jetty/Tomcat, it gets more complicated. As long as the transformation files themselves do not reference any other resource files there is no problem, but with files like this:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet>
    <xsl:import href="import_file1.xsl" />
    <xsl:import href="import_file2.xsl" />
    <xsl:template name="aTtemplate">
        <xsl:for-each select="document('import_file3.xml')">
            ...
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
The imports cannot be resolved and the document used in the for-each loop cannot be found. In Tomcat a workaround is to put the files inside the $TOMCAT/bin directory, but that is not a suitable solution for us. Is there a way to resolve these resources recursively from inside the library?

JAXP with XSD import ArrayIndexOutOfBoundsException

I'm facing an issue with JDK (both 1.6 and 1.7) XSLT transformations.
I want to process a simple WSDL, which uses xsd:import for its XSD (located in the same directory), with my XSLT transformation.
public static void main(String[] args) throws Exception {
    InputStream xmlStream = new FileInputStream("/home/d1x/temp/xslt/test.wsdl");
    String xmlSystemId = "file:///home/d1x/temp/xslt/test.wsdl";
    InputStream xsltStream = XsltTransformation.class.getResourceAsStream("wsdl-viewer.xsl");
    OutputStream outputStream = new FileOutputStream("/home/d1x/temp/xslt/output.html");
    new XsltTransformation().transform(xmlStream, xmlSystemId, xsltStream, outputStream);
}
public void transform(InputStream xmlStream, String xmlSystemId, InputStream xsltStream, OutputStream outputStream) {
    Source xmlSource = new StreamSource(xmlStream, xmlSystemId);
    Source xsltSource = new StreamSource(xsltStream);
    TransformerFactory transFact = TransformerFactory.newInstance();
    try {
        Transformer trans = transFact.newTransformer(xsltSource);
        trans.transform(xmlSource, new StreamResult(outputStream));
    } catch (TransformerConfigurationException e) {
        e.printStackTrace();
    } catch (TransformerException e) {
        e.printStackTrace();
    }
}
When I run my code, I get this exception that is kinda hard to debug. When I remove the import, everything works fine.
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at com.sun.org.apache.xml.internal.utils.SuballocatedIntVector.elementAt(SuballocatedIntVector.java:438)
at com.sun.org.apache.xml.internal.dtm.ref.DTMDefaultBase._firstch(DTMDefaultBase.java:524)
at com.sun.org.apache.xalan.internal.xsltc.dom.SAXImpl.access$200(SAXImpl.java:76)
at com.sun.org.apache.xalan.internal.xsltc.dom.SAXImpl$NamespaceChildrenIterator.next(SAXImpl.java:1433)
at com.sun.org.apache.xalan.internal.xsltc.dom.StepIterator.next(StepIterator.java:111)
at com.sun.org.apache.xalan.internal.xsltc.dom.StepIterator.next(StepIterator.java:111)
at com.sun.org.apache.xalan.internal.xsltc.dom.DupFilterIterator.setStartNode(DupFilterIterator.java:96)
at com.sun.org.apache.xalan.internal.xsltc.dom.UnionIterator$LookAheadIterator.setStartNode(UnionIterator.java:78)
at com.sun.org.apache.xalan.internal.xsltc.dom.MultiValuedNodeHeapIterator.setStartNode(MultiValuedNodeHeapIterator.java:212)
at com.sun.org.apache.xalan.internal.xsltc.dom.CurrentNodeListIterator.setStartNode(CurrentNodeListIterator.java:153)
at com.sun.org.apache.xalan.internal.xsltc.dom.CachedNodeListIterator.setStartNode(CachedNodeListIterator.java:55)
at GregorSamsa.topLevel()
... etc...
The WSDL itself is very simple and uses the import:
...<types>
<xsd:schema>
<xsd:import namespace="http://mytest.com" schemaLocation="test.xsd"/>
</xsd:schema>
</types>...
The XSLT used can be found at: http://tomi.vanek.sk/xml/wsdl-viewer.xsl
I managed to solve this issue by switching to the Saxon implementation of JAXP instead of the built-in Java implementation. The only code change was:
TransformerFactory transFact = net.sf.saxon.TransformerFactoryImpl.newInstance();

Transformation Failing due to xsl:include

I have a Java maven project which includes XSLT transformations. I load the stylesheet as follows:
TransformerFactory tFactory = TransformerFactory.newInstance();
DocumentBuilderFactory dFactory = DocumentBuilderFactory
.newInstance();
dFactory.setNamespaceAware(true);
DocumentBuilder dBuilder = dFactory.newDocumentBuilder();
ClassLoader cl = this.getClass().getClassLoader();
java.io.InputStream in = cl.getResourceAsStream("xsl/stylesheet.xsl");
InputSource xslInputSource = new InputSource(in);
Document xslDoc = dBuilder.parse(xslInputSource);
DOMSource xslDomSource = new DOMSource(xslDoc);
Transformer transformer = tFactory.newTransformer(xslDomSource);
The stylesheet.xsl has a number of xsl:include statements. These appear to be causing problems: when I try to run my unit tests I get the following errors:
C:\Code\workspace\app\dummy.xsl; Line #0; Column #0; Had IO Exception with stylesheet file: footer.xsl
C:\Code\workspace\app\dummy.xsl; Line #0; Column #0; Had IO Exception with stylesheet file: topbar.xsl
The include statements in the XSLT are relative links:
<xsl:include href="footer.xsl"/>
<xsl:include href="topbar.xsl"/>
I have tried changing these to the following, but I still get the error:
<xsl:include href="xsl/footer.xsl"/>
<xsl:include href="xsl/topbar.xsl"/>
Any ideas? Any help much appreciated.
Solved my problem using a URIResolver.
class MyURIResolver implements URIResolver {
    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            ClassLoader cl = this.getClass().getClassLoader();
            java.io.InputStream in = cl.getResourceAsStream("xsl/" + href);
            InputSource xslInputSource = new InputSource(in);
            Document xslDoc = dBuilder.parse(xslInputSource);
            DOMSource xslDomSource = new DOMSource(xslDoc);
            xslDomSource.setSystemId("xsl/" + href);
            return xslDomSource;
        } catch (...
And assigning this with the TransformerFactory
tFactory.setURIResolver(new MyURIResolver());
URIResolver can also be used in a more straightforward way as below:
class XsltURIResolver implements URIResolver {
    @Override
    public Source resolve(String href, String base) throws TransformerException {
        try {
            InputStream inputStream = this.getClass().getClassLoader().getResourceAsStream("xslts/" + href);
            return new StreamSource(inputStream);
        }
        catch (Exception ex) {
            ex.printStackTrace();
            return null;
        }
    }
}
Use the URIResolver with TransformerFactory as shown below:
TransformerFactory transFact = TransformerFactory.newInstance();
transFact.setURIResolver(new XsltURIResolver());
Or with a lambda expression:
transFact.setURIResolver((href, base) -> {
    final InputStream s = this.getClass().getClassLoader().getResourceAsStream("xslts/" + href);
    return new StreamSource(s);
});
Set an EntityResolver on your DocumentBuilder object.
You'll have to implement the EntityResolver interface to resolve your external entities (footer.xsl and topbar.xsl).
I had a problem similar to this once with relative paths in the XSLT.
If you can, try to put absolute paths in the XSLT - that should resolve the error.
An absolute path probably isn't preferable for the final version of the XSLT, but it should get you past the maven problem. Perhaps you can have two versions of the XSLT, one with absolute paths for maven and one with relative paths for whatever other tool it's being used with.
