Veracode XML External Entity Reference (XXE)

I've got the following finding in my Veracode report:
Improper Restriction of XML External Entity Reference ('XXE') (CWE ID 611)
referring to the code below:
...
DocumentBuilderFactory dbf=null;
DocumentBuilder db = null;
try {
dbf=DocumentBuilderFactory.newInstance();
dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
dbf.setExpandEntityReferences(false);
dbf.setXIncludeAware(false);
dbf.setValidating(false);
db = dbf.newDocumentBuilder();
InputStream stream = new ByteArrayInputStream(datosXml.getBytes());
Document doc = db.parse(stream, "");
...
I've been researching but I haven't found out a reason for this finding or a way of making it disappear.
Could you tell me how to do it?

Have you seen the OWASP guide about XXE?
You are not disabling the 3 features you should disable. Most importantly the first one:
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
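
A minimal sketch of how these features could be combined on one factory, assuming the JDK's built-in Xerces-based parser (other JAXP implementations may not recognise every feature URI). The class and method names (SecureDomParsing, newSecureFactory, parse, isRejected) are hypothetical and not taken from the question's code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical helper class; all names are illustrative only.
class SecureDomParsing {

    static DocumentBuilderFactory newSecureFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Most important: reject any document that carries a DOCTYPE declaration.
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defence in depth, in case DOCTYPEs are ever re-enabled.
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    static Document parse(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return newSecureFactory().newDocumentBuilder().parse(new ByteArrayInputStream(bytes));
    }

    // Returns true when the parser refuses the document with a SAX error.
    static boolean isRejected(String xml) {
        try {
            parse(xml);
            return false;
        } catch (SAXException e) {
            return true;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this configuration a document that contains a DOCTYPE is refused outright, while ordinary documents still parse.") : "DOCTYPE should be rejected";
assert !SecureDomParsing.isRejected("<test>ok</test>") : "plain document should parse";
</test>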

Background:
The XXE attack exploits XML's ability to define arbitrary entities in a Document Type Definition (DTD), including external entities whose content the parser reads from a local file or a remote URL.
Below is an example of an XML file containing a DTD declaration that, when processed, may return the contents of the local “/etc/passwd” file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE test [
<!ELEMENT test ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<test>&xxe;</test>
Mitigation:
To avoid exploitation of the XXE vulnerability, the best approach is to disable the ability to load entities from external sources.
The way to disable DTDs differs depending on the language used (Java, C++, .NET) and the XML parser being used (DocumentBuilderFactory, SAXParserFactory, TransformerFactory, to name a few for Java).
The two references below provide the best information on how to achieve this.
https://rules.sonarsource.com/java/RSPEC-2755
https://github.com/OWASP/CheatSheetSeries/blob/master/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.md
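
As a rough illustration of how the settings differ per parser, here is a sketch for three other JAXP factories, assuming the JDK's default implementations and Java 7u40 or later (for the XMLConstants.ACCESS_EXTERNAL_* attributes). The class and method names are hypothetical; the references above remain the place to check the full list of options.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.transform.TransformerFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class; all names are illustrative only.
class SecureParserFactories {

    // SAX: same feature URIs as for DocumentBuilderFactory.
    static SAXParserFactory secureSax() throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        spf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        spf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return spf;
    }

    // XSLT: forbid fetching external DTDs and stylesheets over any protocol.
    static TransformerFactory secureTransformer() {
        TransformerFactory tf = TransformerFactory.newInstance();
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return tf;
    }

    // StAX: switch off DTD support and external entities.
    static XMLInputFactory secureStax() {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        xif.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        xif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        return xif;
    }

    // Returns true when the hardened SAX parser refuses the document.
    static boolean saxRejects(String xml) throws Exception {
        try {
            secureSax().newSAXParser()
                       .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return false;
        } catch (SAXException e) {
            return true;
        }
    }
}
```") : "SAX parser should reject DOCTYPE";
assert !SecureParserFactories.saxRejects("<test>ok</test>") : "plain document should parse";
assert SecureParserFactories.secureTransformer() != null;
assert SecureParserFactories.secureStax() != null;
</test>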

Related

XML parser configured does not prevent nor limit external entities resolution. This can expose the parser to an XML External Entities attack

We had a security audit on our code, and it reported that our code is vulnerable to XML External Entity (XXE) attacks.
Explanation- XML External Entities attacks benefit from an XML feature to build documents dynamically at the time of processing. An XML entity allows inclusion of data dynamically from a given resource. External entities allow an XML document to include data from an external URI. Unless configured to do otherwise, external entities force the XML parser to access the resource specified by the URI, e.g., a file on the local machine or on a remote system. This behavior exposes the application to XML External Entity (XXE) attacks, which can be used to perform denial of service of the local system, gain unauthorized access to files on the local machine, scan remote machines, and perform denial of service of remote systems. The following XML document shows an example of an XXE attack.
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [
<!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///dev/random" >]><foo>&xxe;</foo>
This example could crash the server (on a UNIX system), if the XML parser attempts to substitute the entity with the contents of the /dev/random file.
Recommendation- The XML unmarshaller should be configured securely so that it does not allow external entities as part of an incoming XML document. To avoid XXE injection do not use unmarshal methods that process an XML source directly as java.io.File, java.io.Reader or java.io.InputStream. Parse the document with a securely configured parser and use an unmarshal method that takes the secure parser as the XML source as shown in the following example:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setExpandEntityReferences(false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document document = db.parse(<XML Source>);
Model model = (Model) u.unmarshal(document);
The code where the XXE issue was found is below:
1- DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
2- DocumentBuilder db = dbf.newDocumentBuilder();
3- InputSource is = new InputSource();
4- is.setCharacterStream(new StringReader(xml));
5-
6- Document doc = db.parse(is);
7- NodeList nodes = doc.getElementsByTagName(elementsByTagName);
8-
9- return nodes;
The XXE finding is reported on line 6.
How can I resolve this issue? Any help is appreciated!

For a detailed explanation and options for remediation I suggest you look at the OWASP XXE Cheat Sheet.
We had a similar issue raised and resolved it by disabling DOCTYPEs (the first suggestion in the cheat sheet) as we didn't need them:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

For javax.xml.parsers.DocumentBuilderFactory, the following settings would be enough to prevent an XXE attack:
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Disallow the DTDs (doctypes) entirely.
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// Or do the following:
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
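
For the second option (DOCTYPEs stay allowed, external entities are switched off), a small probe can show the effect. This is a sketch assuming the JDK's built-in parser; the class name, the temporary file and the marker text SECRET are made up for the demonstration.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; all names are illustrative only.
class ExternalEntitiesDisabled {

    // Parses with DOCTYPEs allowed but external entities and DTDs disabled,
    // and returns the text content of the root element.
    static String parseTextContent(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    // Points an external entity at a temporary file and returns what the parser
    // put into the document; the file content should not appear in the result.
    static String probe() throws Exception {
        Path secret = Files.createTempFile("xxe-demo", ".txt");
        try {
            Files.write(secret, "SECRET".getBytes(StandardCharsets.UTF_8));
            String xml = "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"" + secret.toUri() + "\">]>"
                       + "<foo>before-&xxe;-after</foo>";
            return parseTextContent(xml);
        } finally {
            Files.deleteIfExists(secret);
        }
    }
}
```

Unlike disallow-doctype-decl, this variant does not reject the document; the external entity is simply left unresolved.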

How to add a tag to the prolog of an XML via Java

I need my XML to have the following tags, the problem is that I'm not entirely sure how to communicate the issue so google searches are not providing anything. Hopefully by seeing my question you'll understand my issue.
The beginning of the XML document that I am creating in my Java code needs to have the following in the prolog:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<?AdlibExpress applanguage="USA" appversion="5.0.0" dtdversion="2.6.3" ?>
<!DOCTYPE JOBS SYSTEM "D:\Adlib5.4\System\DTD\AdlibExpress.dtd">
The encoding is easy enough. I am having difficulty adding:
<?AdlibExpress applanguage="USA" appversion="5.0.0" dtdversion="2.6.3" ?>
I don't even know what to look for or what to call this.
Here is the library I am using
DocumentBuilderFactory docFactory1 = DocumentBuilderFactory.newInstance();
docFactory1.setNamespaceAware(true);
DocumentBuilder docBuilder1 = docFactory1.newDocumentBuilder();
// root elements
Document doc = docBuilder1.newDocument();
There's got to be something in one of these classes to do what I'm talking about, but I'm not having any luck finding it. If you're able to help me, can you also let me know how to properly describe the issue? Thanks

<?AdlibExpress...?> is a processing instruction.
You can create it with
doc.appendChild(
doc.createProcessingInstruction("AdlibExpress", "applanguage=\"USA\" appversion=\"5.0.0\" dtdversion=\"2.6.3\""));
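
A fuller sketch of how the whole prolog from the question might be produced, assuming the document is serialized with the JDK's default identity Transformer. The class name PrologBuilder and the empty JOBS root element are illustrative; the encoding, processing instruction and DTD path are the ones given in the question.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical demo class; the name is illustrative only.
class PrologBuilder {

    static String build() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        // Append the processing instruction before the root element
        // so that it ends up in the prolog.
        doc.appendChild(doc.createProcessingInstruction("AdlibExpress",
                "applanguage=\"USA\" appversion=\"5.0.0\" dtdversion=\"2.6.3\""));
        doc.appendChild(doc.createElementNS(null, "JOBS"));

        // The encoding and the DOCTYPE are serializer settings, not DOM nodes.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
        transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,
                "D:\\Adlib5.4\\System\\DTD\\AdlibExpress.dtd");

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

The exact whitespace and the relative placement of the DOCTYPE depend on the serializer, so it is worth checking the output against what the consuming application expects.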

Specifying DTD to be used by DocumentBuilders for XML parsing?

I am currently writing a tool, using Java 1.6, that brings together a number of XML files. All of the files validate to the DocBook 4.5 DTD (I have checked this using xmllint and specifying the DocBook 4.5 DTD as the --dtdvalid parameter), but not all of them include the DOCTYPE declaration.
I load each XML file into the DOM to perform the required manipulation like so:
private Document fileToDocument( File input ) throws ParserConfigurationException, IOException, SAXException {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
factory.setIgnoringElementContentWhitespace(false);
factory.setIgnoringComments(false);
factory.setValidating(false);
factory.setExpandEntityReferences(false);
DocumentBuilder builder = factory.newDocumentBuilder();
return builder.parse( input );
}
For the most part this has worked quite well: I can use the returned object to navigate the tree, perform the required manipulations and then write the document back out. Where I am encountering problems is with files which:
Do not include the DOCTYPE declaration, and
Do include entities defined in the DTD (for example &mdash; / &#8212;).
Where this is the case an exception is thrown from the builder.parse(...) call with the message:
[Fatal Error] :5:15: The entity "mdash" was referenced, but not declared.
Fair enough, it isn't declared. What I would ideally do in this instance is set the DocumentBuilderFactory to always use the DocBook 4.5 DTD regardless of whether one is specified in the file.
I did try validation using the DocBook 4.5 schema but found that this produced a number of unrelated errors with the XML. It seems like the schema might not be functionally equivalent to the DTD, at least for this version of the DocBook specification.
The other option I can think of is to read the file in, try and detect whether a doctype was set or not, and then set one if none was found prior to actually parsing the XML into the DOM.
So, my question is: is there a smarter way that I have not seen to tell the parser to use a specific DTD, or to ensure that parsing proceeds despite the entities not resolving (not just the &mdash; example but any entities in the XML; there are a large number of potential ones)?

Could using an EntityResolver2 and implementing EntityResolver2.getExternalSubset() help?
... This method can also be used with documents that have no DOCTYPE declaration. When the root element is encountered, but no DOCTYPE declaration has been seen, this method is invoked. If it returns a value for the external subset, that root element is declared to be the root element, giving the effect of splicing a DOCTYPE declaration at the end the prolog of a document that could not otherwise be valid. ...
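
A minimal sketch of that idea, assuming the JDK's built-in parser, which is expected to honour EntityResolver2 when one is passed to DocumentBuilder.setEntityResolver. The class name and the one-line stand-in DTD are hypothetical; a real tool would return an InputSource for a local copy of the DocBook 4.5 DTD instead.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.ext.EntityResolver2;

// Hypothetical resolver; the name and the stand-in DTD are illustrative only.
class DefaultSubsetResolver implements EntityResolver2 {

    // Stand-in for the real DTD: declares just the one entity used below.
    private static final String FALLBACK_DTD = "<!ENTITY mdash \"&#8212;\">";

    // Invoked when the document has no DOCTYPE (or no external subset).
    @Override
    public InputSource getExternalSubset(String name, String baseURI) {
        return new InputSource(new StringReader(FALLBACK_DTD));
    }

    // Returning null asks the parser to use its default resolution.
    @Override
    public InputSource resolveEntity(String name, String publicId,
                                     String baseURI, String systemId) {
        return null;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        return null;
    }

    // Parses the XML with the resolver installed and returns the root's text.
    static String parseText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setEntityResolver(new DefaultSubsetResolver());
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }
}
```

For example, parseText("<para>a&mdash;b</para>") is intended to parse without the "referenced, but not declared" error, because the entity now comes from the supplied external subset.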

Adding source validation to a StructuredTextViewer

I added a nice XML source viewer to my application. Now, I have an XSD schema that defines the XML document. Any idea where to start on adding some source validation that relies on this schema?
Thanks!

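To check that your XML is well-formed, just run it through a parser obtained from DocumentBuilderFactory. To additionally validate it against an .xsd schema referenced in the XML, make the factory namespace-aware, set its JAXP schemaLanguage attribute (http://java.sun.com/xml/jaxp/properties/schemaLanguage) to the W3C XML Schema namespace, and call the following (on its own, setValidating(true) only enables DTD validation):
factory.setValidating( true );
If the .xsd schema is not referenced within the XML that you are validating, you can supply it yourself like this: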
factory.setAttribute(JAXP_SCHEMA_SOURCE, new File(schemaSource) );
For more information, read the article from Oracle here:
http://download.oracle.com/javaee/1.4/tutorial/doc/JAXPDOM8.html
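
As an alternative to the factory attributes above, the javax.xml.validation API (Java 5 and later) can validate against a schema you supply and hand back the problems with their positions, which is handy for marking them in a viewer. This is only a sketch: the class name SourceValidator and the "line:column message" format are assumptions, and wiring the result into a StructuredTextViewer is not shown.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical helper class; all names are illustrative only.
class SourceValidator {

    // Validates xml against xsd and returns one "line:column message" per problem.
    static List<String> validate(String xsd, String xml) throws Exception {
        SchemaFactory schemaFactory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();

        final List<String> problems = new ArrayList<>();
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { record(e); }
            @Override public void error(SAXParseException e) { record(e); }
            @Override public void fatalError(SAXParseException e) throws SAXParseException {
                record(e);
                throw e; // the document is not well-formed; stop here
            }
            private void record(SAXParseException e) {
                problems.add(e.getLineNumber() + ":" + e.getColumnNumber()
                        + " " + e.getMessage());
            }
        });

        try {
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXParseException fatal) {
            // Already recorded by the error handler.
        }
        return problems;
    }
}
```

An empty list means the document is valid against the schema; each entry otherwise carries a position that can be mapped to a marker in the source view.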

XML to be validated against multiple xsd schemas

I'm writing the xsd and the code to validate, so I have great control here.
I would like to have an upload facility that adds stuff to my application based on an xml file. One part of the xml file should be validated against different schemas based on one of the values in the other part of it. Here's an example to illustrate:
<foo>
<name>Harold</name>
<bar>Alpha</bar>
<baz>Mercury</baz>
<!-- ... more general info that applies to all foos ... -->
<bar-config>
<!-- the content here is specific to the bar named "Alpha" -->
</bar-config>
<baz-config>
<!-- the content here is specific to the baz named "Mercury" -->
</baz-config>
</foo>
In this case, there is some controlled vocabulary for the content of <bar>, and I can handle that part just fine. Then, based on the bar value, the appropriate xml schema should be used to validate the content of bar-config. Similarly for baz and baz-config.
The code doing the parsing/validation is written in Java. Not sure how language-dependent the solution will be.
Ideally, the solution would permit the xml author to declare the appropriate schema locations and what-not so that s/he could get the xml validated on the fly in a sufficiently smart editor.
Also, the possible values for <bar> and <baz> are orthogonal, so I don't want to do this by extension for every possible bar/baz combo. What I mean is, if there are 24 possible bar values/schemas and 8 possible baz values/schemas, I want to be able to write 1 + 24 + 8 = 33 total schemas, instead of 1 * 24 * 8 = 192 total schemas.
Also, I'd prefer to NOT break out the bar-config and baz-config into separate xml files if possible. I realize that might make all the problems much easier, as each xml file would have a single schema, but I'm trying to see if there is a good single-xml-file solution.

I finally figured this out.
First of all, in the foo schema, the bar-config and baz-config elements have a type which includes an any element, like this:
<sequence>
<any minOccurs="0" maxOccurs="1"
processContents="lax" namespace="##any" />
</sequence>
In the xml, then, you must specify the proper namespace using the xmlns attribute on the child element of bar-config or baz-config, like this:
<bar-config>
<config xmlns="http://www.example.org/bar/Alpha">
... config xml here ...
</config>
</bar-config>
Then, your XML schema file for bar Alpha will have a target namespace of http://www.example.org/bar/Alpha and will define the root element config.
If your XML file has namespace declarations and schema locations for both of the schema files, this is sufficient for the editor to do all of the validating (at least good enough for Eclipse).
So far, we have satisfied the requirement that the xml author may write the xml in such a way that it is validated in the editor.
Now, we need the consumer to be able to validate. In my case, I'm using Java.
If by some chance, you know the schema files that you will need to use to validate ahead of time, then you simply create a single Schema object and validate as usual, like this:
Schema schema = factory().newSchema(new Source[] {
new StreamSource(stream("foo.xsd")),
new StreamSource(stream("Alpha.xsd")),
new StreamSource(stream("Mercury.xsd")),
});
In this case, however, we don't know which xsd files to use until we have parsed the main document. So, the general procedure is to:
1. Validate the xml using only the main (foo) schema
2. Determine the schema to use to validate the portion of the document
3. Find the node that is the root of the portion to validate using a separate schema
4. Import that node into a brand new document
5. Validate the brand new document using the other schema file
Caveat: it appears that the document must be built namespace-aware in order for this to work.
Here's some code (this was ripped from various places of my code, so there might be some errors introduced by the copy-and-paste):
// Contains the filename of the xml file
String filename;
// Load the xml data using a namespace-aware builder (the method
// 'stream' simply opens an input stream on a file)
Document document;
DocumentBuilderFactory docBuilderFactory =
DocumentBuilderFactory.newInstance();
docBuilderFactory.setNamespaceAware(true);
document = docBuilderFactory.newDocumentBuilder().parse(stream(filename));
// Create the schema factory
SchemaFactory sFactory = SchemaFactory.newInstance(
XMLConstants.W3C_XML_SCHEMA_NS_URI);
// Load the main schema
Schema schema = sFactory.newSchema(
new StreamSource(stream("foo.xsd")));
// Validate using main schema
schema.newValidator().validate(new DOMSource(document));
// Get the node that is the root for the portion you want to validate
// using another schema
Node node = getSpecialNode(document);
// Build a Document from that node
Document subDocument = docBuilderFactory.newDocumentBuilder().newDocument();
subDocument.appendChild(subDocument.importNode(node, true));
// Determine the schema to use using your own logic
Schema subSchema = parseAndDetermineSchema(document);
// Validate using other schema
subSchema.newValidator().validate(new DOMSource(subDocument));
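
For readers who want something that compiles on its own, here is a condensed sketch of the same two-stage procedure. Everything in it is an assumption made for the demonstration: the class name, the tiny inline schemas (a foo with only bar and bar-config, and a config element holding an integer level), and the fact that the sub-schema is picked by a hard-coded namespace rather than by your own lookup logic.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; schemas and names are illustrative only.
class TwoStageValidation {
    static final String ALPHA_NS = "http://www.example.org/bar/Alpha";
    static final String XS = "xmlns:xs='http://www.w3.org/2001/XMLSchema'";

    // Main schema: bar-config accepts any single element, validated laxly.
    static final String FOO_XSD =
          "<xs:schema " + XS + "><xs:element name='foo'><xs:complexType><xs:sequence>"
        + "<xs:element name='bar' type='xs:string'/>"
        + "<xs:element name='bar-config'><xs:complexType><xs:sequence>"
        + "<xs:any minOccurs='0' maxOccurs='1' processContents='lax' namespace='##any'/>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    // Sub-schema for bar "Alpha": a config element with an integer level.
    static final String ALPHA_XSD =
          "<xs:schema " + XS + " targetNamespace='" + ALPHA_NS
        + "' elementFormDefault='qualified'>"
        + "<xs:element name='config'><xs:complexType><xs:sequence>"
        + "<xs:element name='level' type='xs:int'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    static Schema schema(String xsd) throws SAXException {
        return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                            .newSchema(new StreamSource(new StringReader(xsd)));
    }

    static boolean isValid(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for schema validation of the DOM
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = dbf.newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));
        try {
            // Stage 1: validate the whole document against the main schema only.
            schema(FOO_XSD).newValidator().validate(new DOMSource(document));

            // Stage 2: copy the config subtree into its own document and
            // validate it against the schema chosen for it.
            Node config = document.getElementsByTagNameNS(ALPHA_NS, "config").item(0);
            if (config == null) {
                return true; // nothing further to validate
            }
            Document subDocument = builder.newDocument();
            subDocument.appendChild(subDocument.importNode(config, true));
            schema(ALPHA_XSD).newValidator().validate(new DOMSource(subDocument));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

A document whose config contains a non-integer level passes stage 1 (the main schema only checks it laxly) and is caught in stage 2.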

Take a look at NVDL (Namespace-based Validation Dispatching Language) - http://www.nvdl.org/
It is designed to do what you want to do (validate parts of an XML document that have their own namespaces and schemas).
There is a tutorial here - http://www.dpawson.co.uk/nvdl/ - and a Java implementation here - http://jnvdl.sourceforge.net/
Hope that helps!
Kevin

You need to define a target namespace for each separately-validated portions of the instance document. Then you define a master schema that uses <xsd:include> to reference the schema documents for these components.
The limitation with this approach is that you can't let the individual components define the schemas that should be used to validate them. But it's a bad idea in general to let a document tell you how to validate it (i.e., validation should be something that your application controls).

You can also use a "resource resolver" to let XML authors specify their own schema file, at least to some extent; for example: https://stackoverflow.com/a/41225329/32453. At the end of the day, you want a fully compliant XML file that can be validated with normal tools anyway :)
