Transformation Failing due to xsl:include - java

I have a Java maven project which includes XSLT transformations. I load the stylesheet as follows:
TransformerFactory tFactory = TransformerFactory.newInstance();
DocumentBuilderFactory dFactory = DocumentBuilderFactory
.newInstance();
dFactory.setNamespaceAware(true);
DocumentBuilder dBuilder = dFactory.newDocumentBuilder();
ClassLoader cl = this.getClass().getClassLoader();
java.io.InputStream in = cl.getResourceAsStream("xsl/stylesheet.xsl");
InputSource xslInputSource = new InputSource(in);
Document xslDoc = dBuilder.parse(xslInputSource);
DOMSource xslDomSource = new DOMSource(xslDoc);
Transformer transformer = tFactory.newTransformer(xslDomSource);
The stylesheet.xsl has a number of statements. These appear to be causing problems, when I try to run my unit tests I get the following errors:
C:\Code\workspace\app\dummy.xsl; Line #0; Column #0; Had IO Exception with stylesheet file: footer.xsl
C:\Code\workspace\app\dummy.xsl; Line #0; Column #0; Had IO Exception with stylesheet file: topbar.xsl
The include statements in the XSLT are relative links
xsl:include href="footer.xsl"
xsl:include href="topbar.xsl"
I have tried experimenting and changing these to the following - but I still get the error.
xsl:include href="xsl/footer.xsl"
xsl:include href="xsl/topbar.xsl"
Any ideas? Any help much appreciated.

Solved my problem using a URIResolver.
class MyURIResolver implements URIResolver {
#Override
public Source resolve(String href, String base) throws TransformerException {
try {
ClassLoader cl = this.getClass().getClassLoader();
java.io.InputStream in = cl.getResourceAsStream("xsl/" + href);
InputSource xslInputSource = new InputSource(in);
Document xslDoc = dBuilder.parse(xslInputSource);
DOMSource xslDomSource = new DOMSource(xslDoc);
xslDomSource.setSystemId("xsl/" + href);
return xslDomSource;
} catch (...
And assigning this with the TransformerFactory
tFactory.setURIResolver(new MyURIResolver());

URIResolver can also be used in a more straightforward way as below:
class XsltURIResolver implements URIResolver {
#Override
public Source resolve(String href, String base) throws TransformerException {
try{
InputStream inputStream = this.getClass().getClassLoader().getResourceAsStream("xslts/" + href);
return new StreamSource(inputStream);
}
catch(Exception ex){
ex.printStackTrace();
return null;
}
}
}
Use the URIResolver with TransformerFactory as shown below:
TransformerFactory transFact = TransformerFactory.newInstance();
transFact.setURIResolver(new XsltURIResolver());
Or with a lambda expression:
transFact.setURIResolver((href, base) -> {
final InputStream s = this.getClass().getClassLoader().getResourceAsStream("xslts/" + href);
return new StreamSource(s);
});

Set your DocumentBuilder object with an EntityResolver.
You'll have to extend EntityResolver class to resolve your external entities (footer.xsl and topbar.xsl).

I had a problem similar to this once with relative paths in the XSLT.
If you can, try to put absolute paths in the XSLT - that should resolve the error.
An absolute path probably isn't preferable for the final version of the XSLT, but it should get you past the maven problem. Perhaps you can have two versions of the XSLT, one with absolute paths for maven and one with relative paths for whatever other tool it's being used with.

Related

Java Saxon 10.5 XSL 3 Transformation, href file not found

I'm writing a Java application that does a XML transformation using XSLT3, using Saxon-HE 10.5 (as a Maven project).
My XSLT sheet imports other XSLT sheets, using <xsl:import> (e.g. <xsl:import href="sheet1.xsl"/>). All of the XSLT sheets are located inside ./src/main/resources. However, when I try to run the program, I get a FileNotFound Exception from Saxon, since it is looking for the files at the project base directory.
I assume there is some way to change where Saxon is looking for the files, but I was not able to find out how to achieve this when using the s9api API.
Here's my Java code performing the transformation:
public void transformXML(String xmlFile, String output) throws SaxonApiException, IOException, XPathExpressionException, ParserConfigurationException, SAXException {
Processor processor = new Processor(false);
XsltCompiler compiler = processor.newXsltCompiler();
XsltExecutable stylesheet = compiler.compile(new StreamSource(this.getClass().getClassLoader().getResourceAsStream("transform.xsl")));
Serializer out = processor.newSerializer(new File(output));
out.setOutputProperty(Serializer.Property.METHOD, "text");
Xslt30Transformer transformer = stylesheet.load30();
transformer.transform(new StreamSource(new File(xmlFile)), out);
}
Any help is appreciated.
Edit:
My solution based on #Michael Kay's recommendation:
public void transformXML(String xmlFile, String output) throws SaxonApiException, IOException, XPathExpressionException, ParserConfigurationException, SAXException {
Processor processor = new Processor(false);
XsltCompiler compiler = processor.newXsltCompiler();
compiler.setURIResolver(new ClasspathResourceURIResolver());
XsltExecutable stylesheet = compiler.compile(new StreamSource(this.getClass().getClassLoader().getResourceAsStream("transform.xsl")));
Serializer out = processor.newSerializer(new File(output));
out.setOutputProperty(Serializer.Property.METHOD, "text");
Xslt30Transformer transformer = stylesheet.load30();
transformer.transform(new StreamSource(new File(xmlFile)), out);
}
}
class ClasspathResourceURIResolver implements URIResolver
{
#Override
public Source resolve(String href, String base) throws TransformerException {
return new StreamSource(this.getClass().getClassLoader().getResourceAsStream(href));
}
}
Saxon doesn't know the base URI of the stylesheet (it has no way of knowing, because you haven't told it), so it can't resolve relative URIs appearing in xsl:import/#href.
Normally I would suggest supplying a base URI in the second argument of new StreamSource(). However, since the main stylesheet is loaded using getResourceAsStream(), I suspect you want to load secondary stylesheet modules using the same mechanism, and this can be done by setting a URIResolver on the XsltCompiler object.

How to secure javax.xml.transform.TransformerFactory from XML external attacks

I have researched on the subject but couldn't find any relevant info regarding that
Do we need to take any security measurements to secure javax.xml.transform.Transformer against XML external entity attacks?
I did the following and it seems to expand the dtd.
String fileData = "<!DOCTYPE acunetix [ <!ENTITY sampleVal SYSTEM \"file:///media/sample\">]><username>&sampleVal;</username>";
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
Transformer transformer = transformerFactory.newTransformer();
StringWriter buff = new StringWriter();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.transform(new StreamSource(new StringReader(fileData)), new StreamResult(buff));
System.out.println(buff.toString());
output contains the value from the file
<username>test</username>
Your code seems correct. When I run this slightly modified JUnit test case:
#Test
public void test() throws TransformerException, URISyntaxException {
File testFile = new File(getClass().getResource("test.txt").toURI());
assertTrue(testFile.exists());
String fileData = "<!DOCTYPE acunetix [ <!ENTITY foo SYSTEM \"file://" +
testFile.toString() +
"\">]><xxe>&foo;</xxe>";
TransformerFactory transformerFactory = TransformerFactory.newInstance();
System.out.println(transformerFactory.getClass().getName());
transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
Transformer transformer = transformerFactory.newTransformer();
StringWriter buff = new StringWriter();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.transform(new StreamSource(new StringReader(fileData)), new StreamResult(buff));
assertEquals("<xxe>&foo;</xxe>", buff.toString());
}
I get the following output:
com.sun.org.apache.xalan.internal.xsltc.trax.TransformerFactoryImpl
[Fatal Error] :1:182: External Entity: Failed to read external document 'test.txt', because 'file' access is not allowed due to restriction set by the accessExternalDTD property.
ERROR: 'External Entity: Failed to read external document 'test.txt', because 'file' access is not allowed due to restriction set by the accessExternalDTD property.'
From the setFeature JavaDocs:
All implementations are required to support the XMLConstants.FEATURE_SECURE_PROCESSING feature. When the feature is:
true: the implementation will limit XML processing to conform to implementation limits and behave in a secure fashion as defined by the implementation. Examples include resolving user defined style sheets and functions. If XML processing is limited for security reasons, it will be reported via a call to the registered ErrorListener.fatalError(TransformerException exception). See setErrorListener(ErrorListener listener).
That error goes away if I comment out transformerFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true); and then the test fails because the entity is resolved.
Try adding an ErrorListener to both the TransformerFactory and Transformer:
transformerFactory.setErrorListener(new ErrorListener() {
#Override
public void warning(TransformerException exception) throws TransformerException {
System.out.println("In Warning: " + exception.toString());
}
#Override
public void error(TransformerException exception) throws TransformerException {
System.out.println("In Error: " + exception.toString());
}
#Override
public void fatalError(TransformerException exception) throws TransformerException {
System.out.println("In Fatal: " + exception.toString());
}
});
Transformer transformer = transformerFactory.newTransformer();
transformer.setErrorListener(transformerFactory.getErrorListener());
I see the following new console output now:
In Error: javax.xml.transform.TransformerException: External Entity: Failed to read external document 'test.txt', because 'file' access is not allowed due to restriction set by the accessExternalDTD property.
Maybe your implementation is treating it as a warning? Otherwise, maybe it's the implementation you're using? It looks like the JavaDoc spec isn't precise, so one implementation might do something different than another. I'd be interested to know faulty implementations!
I know that this is an old post but for those who find themselves here, I hope is helps :)
After applying the solution below, SonarQube still complained with 'Disable access to external entities in XML parsing' security issue :(
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
Eventually I landed on the solution below which finally fixed the issue for me.
TransformerFactory factory = TransformerFactory.newInstance();
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);

How to read a XML Document by using DocumentBuilderFactory and DocumentBuilder?

I have written the following java method to read an XML File by using DocumentBuilderFactory and DocumentBuilder:
public static Document readAndGenerateXmlFile(String path, String fileName){
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = null;
Document xmlDocument = null;
try {
docBuilder = docBuilderFactory.newDocumentBuilder();
xmlDocument = docBuilder.parse(new File(path + fileName));
} catch (ParserConfigurationException exec) {
logger.error(exec);
} catch (SAXException exec) {
logger.error(exec);
} catch (IOException exec) {
logger.error(exec);
}
return xmlDocument;
}
Also I works with Apache Maven and using a Glassfish Application Server. The XML file, which I want to read, exists in the following path "src/main/resources/myfolder/myXmlFile.xml". The parameter for the method are "path=src/main/resources/" and "fileName=tester.xml"
The java method will be called by the following method:
#ManagedBean(name="mybean")
#SessionScoped
public class GuiBean {
#PostConstruct
public void initializeGUI(){
Document xmlDocument = MyXmlFactory.readAndGenerateXmlFile("src/main/resources/myfolder", "myXmlFile.xml" );
// other java code
}
}
But now I have the problem, that occurs an IOException during execute the java method above. I get the error message "The system could not found the named path". Also I can see, that the path will be extend by Java JVM (?) to "C:/Tools/myGlassfishServer/src/main/resources/myfolder/myXmlFile.xml".
Does anybody have an idea, why I get this errormessage? If I donĀ“t started this method on an application server, the file will be founded.

JAXP with XSD import ArrayIndexOutOfBoundsException

I'm facing an issue with JDK (both 1.6 and 1.7) XSLT transformations.
The thing is that I want to process simple WSDL that is using xsd:import for its XSD (that lies in same location) with my XSLT transformation.
public static void main(String[] args) throws Exception {
InputStream xmlStream = new FileInputStream("/home/d1x/temp/xslt/test.wsdl");
String xmlSystemId = "file:///home/d1x/temp/xslt/test.wsdl";
InputStream xsltStream = XsltTransformation.class.getResourceAsStream("wsdl-viewer.xsl");
OutputStream outputStream = new FileOutputStream("/home/d1x/temp/xslt/output.html");
new XsltTransformation().transform(xmlStream, xmlSystemId, xsltStream, outputStream);
}
public void transform(InputStream xmlStream, String xmlSystemId, InputStream xsltStream, OutputStream outputStream) {
Source xmlSource = new StreamSource(xmlStream, xmlSystemId);
Source xsltSource = new StreamSource(xsltStream);
TransformerFactory transFact = TransformerFactory.newInstance();
try {
Transformer trans = transFact.newTransformer(xsltSource);
trans.transform(xmlSource, new StreamResult(outputStream));
} catch (TransformerConfigurationException e) {
e.printStackTrace();
} catch (TransformerException e) {
e.printStackTrace();
}
}
When I run my code, I get this exception that is kinda hard to debug. When I remove the import, everything works fine.
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
at com.sun.org.apache.xml.internal.utils.SuballocatedIntVector.elementAt(SuballocatedIntVector.java:438)
at com.sun.org.apache.xml.internal.dtm.ref.DTMDefaultBase._firstch(DTMDefaultBase.java:524)
at com.sun.org.apache.xalan.internal.xsltc.dom.SAXImpl.access$200(SAXImpl.java:76)
at com.sun.org.apache.xalan.internal.xsltc.dom.SAXImpl$NamespaceChildrenIterator.next(SAXImpl.java:1433)
at com.sun.org.apache.xalan.internal.xsltc.dom.StepIterator.next(StepIterator.java:111)
at com.sun.org.apache.xalan.internal.xsltc.dom.StepIterator.next(StepIterator.java:111)
at com.sun.org.apache.xalan.internal.xsltc.dom.DupFilterIterator.setStartNode(DupFilterIterator.java:96)
at com.sun.org.apache.xalan.internal.xsltc.dom.UnionIterator$LookAheadIterator.setStartNode(UnionIterator.java:78)
at com.sun.org.apache.xalan.internal.xsltc.dom.MultiValuedNodeHeapIterator.setStartNode(MultiValuedNodeHeapIterator.java:212)
at com.sun.org.apache.xalan.internal.xsltc.dom.CurrentNodeListIterator.setStartNode(CurrentNodeListIterator.java:153)
at com.sun.org.apache.xalan.internal.xsltc.dom.CachedNodeListIterator.setStartNode(CachedNodeListIterator.java:55)
at GregorSamsa.topLevel()
... etc...
WSDL itself is very simple and is using the import:
...<types>
<xsd:schema>
<xsd:import namespace="http://mytest.com" schemaLocation="test.xsd"/>
</xsd:schema>
</types>...
Used XSLT can be found at: http://tomi.vanek.sk/xml/wsdl-viewer.xsl
I managed to solve this issue by switching to Saxon implementation of JAXP instead of built-in Java implementation. The only code change was:
TransformerFactory transFact = net.sf.saxon.TransformerFactoryImpl.newInstance();

Make DocumentBuilder.parse ignore DTD references

When I parse my xml file (variable f) in this method, I get an error
C:\Documents and Settings\joe\Desktop\aicpcudev\OnlineModule\map.dtd (The system cannot find the path specified)
I know I do not have the dtd, nor do I need it. How can I parse this File object into a Document object while ignoring DTD reference errors?
private static Document getDoc(File f, String docId) throws Exception{
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(f);
return doc;
}
Try setting features on the DocumentBuilderFactory:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setValidating(false);
dbf.setNamespaceAware(true);
dbf.setFeature("http://xml.org/sax/features/namespaces", false);
dbf.setFeature("http://xml.org/sax/features/validation", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
DocumentBuilder db = dbf.newDocumentBuilder();
...
Ultimately, I think the options are specific to the parser implementation. Here is some documentation for Xerces2 if that helps.
A similar approach to the one suggested by #anjanb
builder.setEntityResolver(new EntityResolver() {
#Override
public InputSource resolveEntity(String publicId, String systemId)
throws SAXException, IOException {
if (systemId.contains("foo.dtd")) {
return new InputSource(new StringReader(""));
} else {
return null;
}
}
});
I found that simply returning an empty InputSource worked just as well?
I found an issue where the DTD file was in the jar file along with the XML. I solved the issue based on the examples here, as follows: -
DocumentBuilder db = dbf.newDocumentBuilder();
db.setEntityResolver(new EntityResolver() {
public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
if (systemId.contains("doc.dtd")) {
InputStream dtdStream = MyClass.class
.getResourceAsStream("/my/package/doc.dtd");
return new InputSource(dtdStream);
} else {
return null;
}
}
});
Source XML (With DTD)
<!DOCTYPE MYSERVICE SYSTEM "./MYSERVICE.DTD">
<MYACCSERVICE>
<REQ_PAYLOAD>
<ACCOUNT>1234567890</ACCOUNT>
<BRANCH>001</BRANCH>
<CURRENCY>USD</CURRENCY>
<TRANS_REFERENCE>201611100000777</TRANS_REFERENCE>
</REQ_PAYLOAD>
</MYACCSERVICE>
Java DOM implementation for accepting above XML as String and removing DTD declaration
public Document removeDTDFromXML(String payload) throws Exception {
System.out.println("### Payload received in XMlDTDRemover: " + payload);
Document doc = null;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
dbf.setValidating(false);
dbf.setNamespaceAware(true);
dbf.setFeature("http://xml.org/sax/features/namespaces", false);
dbf.setFeature("http://xml.org/sax/features/validation", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(payload));
doc = db.parse(is);
} catch (ParserConfigurationException e) {
System.out.println("Parse Error: " + e.getMessage());
return null;
} catch (SAXException e) {
System.out.println("SAX Error: " + e.getMessage());
return null;
} catch (IOException e) {
System.out.println("IO Error: " + e.getMessage());
return null;
}
return doc;
}
Destination XML (Without DTD)
<MYACCSERVICE>
<REQ_PAYLOAD>
<ACCOUNT>1234567890</ACCOUNT>
<BRANCH>001</BRANCH>
<CURRENCY>USD</CURRENCY>
<TRANS_REFERENCE>201611100000777</TRANS_REFERENCE>
</REQ_PAYLOAD>
</MYACCSERVICE>
I know I do not have the dtd, nor do I need it.
I am suspicious of this statement; does your document contain any entity references? If so, you definitely need the DTD.
Anyway, the usual way of preventing this from happening is using an XML catalog to define a local path for "map.dtd".
here's another user who got the same issue : http://forums.sun.com/thread.jspa?threadID=284209&forumID=34
user ddssot on that post says
myDocumentBuilder.setEntityResolver(new EntityResolver() {
public InputSource resolveEntity(java.lang.String publicId, java.lang.String systemId)
throws SAXException, java.io.IOException
{
if (publicId.equals("--myDTDpublicID--"))
// this deactivates the open office DTD
return new InputSource(new ByteArrayInputStream("<?xml version='1.0' encoding='UTF-8'?>".getBytes()));
else return null;
}
});
The user further mentions "As you can see, when the parser hits the DTD, the entity resolver is called. I recognize my DTD with its specific ID and return an empty XML doc instead of the real DTD, stopping all validation..."
Hope this helps.
I'm working with sonarqube, and sonarlint for eclipse showed me Untrusted XML should be parsed without resolving external data (squid:S2755)
I managed to solve it using:
factory = DocumentBuilderFactory.newInstance();
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// If you can't completely disable DTDs, then at least do the following:
// Xerces 1 - http://xerces.apache.org/xerces-j/features.html#external-general-entities
// Xerces 2 - http://xerces.apache.org/xerces2-j/features.html#external-general-entities
// JDK7+ - http://xml.org/sax/features/external-general-entities
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
// Xerces 1 - http://xerces.apache.org/xerces-j/features.html#external-parameter-entities
// Xerces 2 - http://xerces.apache.org/xerces2-j/features.html#external-parameter-entities
// JDK7+ - http://xml.org/sax/features/external-parameter-entities
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
// Disable external DTDs as well
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
// and these as well, per Timothy Morgan's 2014 paper: "XML Schema, DTD, and Entity Attacks"
factory.setXIncludeAware(false);
factory.setExpandEntityReferences(false);

Categories