JAXB Validator does not detect syntax errors? - java

I want to validate an XML file against its XSD before unmarshalling it.
The code is as follows :
SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = factory.newSchema(xsdFilePath);
Validator validator = schema.newValidator();
validator.setErrorHandler(new MyValidationErrorHandler());
validator.validate(new StreamSource(xmlFilePath));
I found that when an XML element is not closed, the Validator fails to record it as an error, but the Unmarshaller recognizes this and throws an "Invalid content was found starting with element.." error.
I want the Validation and the Unmarshalling/Marshalling to be different operations.
Are there ways to have the Validator detect such syntax errors in the xml file?

You'll have to distinguish two things:
The elementary syntax of an XML document
The document's compliance with an XML Schema
If the elementary syntax isn't right, there's no document that can be investigated for its element structure, attribute existence, value compliance with facets and so on.
I'm afraid you'll have to catch both kinds of exceptions.
You may, however, handle everything in a single unmarshalling operation:
JAXBContext payloadContext = JAXBContext.newInstance("generated");
Unmarshaller unmarshaller = payloadContext.createUnmarshaller();
unmarshaller.setSchema(schemaFactory.newSchema(... ));
unmarshaller.setEventHandler( new ValidationEventHandler(){
public boolean handleEvent(ValidationEvent event) {
System.out.println( "Event! " + event );
return true;
}
} );
Later
To have validation only, you'll still have to parse, but if you don't have JAXB-ish classes, you get by with JAXP:
static class Handler implements ErrorHandler {
public void error(SAXParseException exception){
System.out.println( "error: " + exception.getMessage() );
}
public void fatalError(SAXParseException exception){
System.out.println( "fatal: " + exception.getMessage() );
}
public void warning(SAXParseException exception){
System.out.println( "warning: " + exception.getMessage() );
}
}
Handler handler = new Handler();
DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
parser.setErrorHandler( handler );
try {
Document document = parser.parse(new File("test.xml"));
SchemaFactory factory =
SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Source schemaFile = new StreamSource(new File("test.xsd"));
Schema schema = factory.newSchema(schemaFile);
Validator validator = schema.newValidator();
validator.setErrorHandler( handler );
try {
validator.validate(new DOMSource(document));
} catch (SAXException e) {
// ...
System.out.println( "Validation error" );
}
} catch (SAXParseException e) {
// syntax error in XML document
System.out.println( "Syntax error" );
}
For validation, setting a handler will not throw a ParseException, so one of these is redundant.
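The two kinds of problems described above can also be told apart with the Validator alone. The following is a minimal sketch, assuming the JDK's bundled Xerces-based implementation, where well-formedness problems arrive through fatalError() and schema violations through error(). The names XmlCheck, Result and check are made up for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: classifies a document as valid, schema-invalid or malformed.
public class XmlCheck {
    public enum Result { VALID, INVALID, MALFORMED }

    public static Result check(String xsd, String xml) throws SAXException, IOException {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();
        final List<SAXParseException> schemaErrors = new ArrayList<>();
        validator.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            // Schema violations: record them and let validation continue.
            public void error(SAXParseException e) { schemaErrors.add(e); }
            // Syntax (well-formedness) problems: rethrow so validate() aborts.
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // error() never throws here, so only a fatal error gets this far
            return Result.MALFORMED;
        }
        return schemaErrors.isEmpty() ? Result.VALID : Result.INVALID;
    }
}
```

With a schema that declares a single positiveInteger element n, check(xsd, "<n>1") should come back as MALFORMED and check(xsd, "<n>0</n>") as INVALID.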

Related

How to get XML element information in case of SAXParseException

When validating an XML source against an XSD schema in a standard Java environment, I cannot find a way to get information about the element that failed validation (in many specific cases).
When catching a SAXParseException, the element information is gone. However, when debugging into xerces.XMLSchemaValidator, I can see that the reason is that the specific error message is not defined to include information about the element.
For example (and this is also the case in my java demo) the "cvc-mininclusive-valid" error is defined this way:
cvc-minInclusive-valid: Value ''{0}'' is not facet-valid with respect to minInclusive ''{1}'' for type ''{2}''.
https://wiki.xmldation.com/Support/Validator/cvc-mininclusive-valid
What I would prefer is that this kind of message be produced:
cvc-type.3.1.3: The value ''{1}'' of element ''{0}'' is not valid. https://wiki.xmldation.com/Support/Validator/cvc-type-3-1-3
When debugging into xerces.XMLSchemaValidator, I can see that there are two consecutive calls to reportSchemaError(...), the second occurring only if the first one returned without an exception being thrown.
Is there any way to configure the validator to use the second way of reporting OR to enrich the SAXParseException with the element information?
Please see my copy&paste&runnable example code below for further explanation:
String xsd =
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n" +
"<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" version=\"1.0\">" +
"<xs:element name=\"demo\">" +
"<xs:complexType>" +
"<xs:sequence>" +
// given are two elements that cannot be < 1
"<xs:element name=\"foo\" type=\"xs:positiveInteger\" minOccurs=\"0\" maxOccurs=\"unbounded\" />" +
"<xs:element name=\"bar\" type=\"xs:positiveInteger\" minOccurs=\"0\" maxOccurs=\"unbounded\" />" +
"</xs:sequence>" +
"</xs:complexType>" +
"</xs:element>" +
"</xs:schema>";
String xml =
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
"<demo>" +
"<foo>1</foo>" +
// invalid!
"<foo>0</foo>" +
"<bar>2</bar>" +
"</demo>";
Validator validator = SchemaFactory
.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
.newSchema(new StreamSource(new StringReader(xsd)))
.newValidator();
try {
validator.validate(new StreamSource(new StringReader(xml)));
} catch (SAXParseException e) {
// unfortunately no element or line/column info:
System.err.println(e.getMessage());
// better, but still no element info:
System.err.println(String.format("Line %s - Column %s - %s",
e.getLineNumber(),
e.getColumnNumber(),
e.getMessage()));
}
This isn't well documented but if you have a recent version of Xerces-J (see SVN Rev 380997), you can validate a DOMSource and query the Validator from your ErrorHandler to retrieve the current Element node that the validator was processing when it reported the error.
For example, you could write an ErrorHandler like:
public class ValidatorErrorHandler implements ErrorHandler {
private Validator validator;
public ValidatorErrorHandler(Validator v) {
validator = v;
}
...
public void error(SAXParseException spe) throws SAXException {
Node node = null;
try {
node = (Node)
validator.getProperty(
"http://apache.org/xml/properties/dom/current-element-node");
}
catch (SAXException se) {}
...
}
}
and then invoke the Validator with this ErrorHandler like:
Validator validator = SchemaFactory
.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
.newSchema(new StreamSource(new StringReader(xsd)))
.newValidator();
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new InputSource(new StringReader(xml)));
ErrorHandler errorHandler = new ValidatorErrorHandler(validator);
validator.setErrorHandler(errorHandler);
validator.validate(new DOMSource(doc));
to obtain the element where an error occurred.
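Put together, the pieces above might look like the following self-contained sketch. It assumes the JAXP implementation in use recognises the current-element-node property (the Xerces-derived validator bundled with the JDK is expected to); other implementations may throw SAXNotRecognizedException instead. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: returns the name of the element being processed for each schema error.
public class ElementAwareValidation {
    static final String CURRENT_ELEMENT =
            "http://apache.org/xml/properties/dom/current-element-node";

    public static List<String> failingElements(String xsd, String xml) throws Exception {
        final Validator validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)))
                .newValidator();
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        final List<String> names = new ArrayList<>();
        validator.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) throws SAXException {
                // Ask the validator which DOM element it was processing when the error fired.
                Node node = (Node) validator.getProperty(CURRENT_ELEMENT);
                names.add(node == null ? "(unknown)" : node.getNodeName());
            }
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        // The property is only meaningful when validating a DOMSource.
        validator.validate(new DOMSource(doc));
        return names;
    }
}
```

Because error() does not rethrow, validation continues and every reported error contributes one entry, so the same element can appear more than once.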
Try using an error handler:
public class LoggingErrorHandler implements ErrorHandler {
private boolean isValid = true;
public boolean isValid() {
return this.isValid;
}
@Override
public void warning(SAXParseException exc) {
System.err.println(exc);
}
@Override
public void error(SAXParseException exc) {
System.err.println(exc);
this.isValid = false;
}
@Override
public void fatalError(SAXParseException exc) throws SAXParseException {
System.err.println(exc);
this.isValid = false;
throw exc;
}
}
and use it in validator:
Validator validator = SchemaFactory
.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
.newSchema(new StreamSource(new StringReader(xsd)))
.newValidator();
LoggingErrorHandler errorHandler = new LoggingErrorHandler();
validator.setErrorHandler(errorHandler);
validator.validate(new StreamSource(new StringReader(xml)));
return errorHandler.isValid();
I know this is old, but the answer from Michael Glavassevich works like a charm! I'm not yet able to upvote or comment, but that answer shows real, deep knowledge.

How can I get more information on an invalid DOM element through the Validator?

I am validating an in-memory DOM object using the javax.xml.validation.Validator class against an XSD schema. I am getting a SAXParseException being thrown during the validation whenever there is some data corruption in the information I populate my DOM from.
An example error:
org.xml.sax.SAXParseException: cvc-datatype-valid.1.2.1: '???"??[?????G?>???p~tn??~0?1]' is not a valid value for 'hexBinary'.
What I am hoping is that there is a way to find the location of this error in my in-memory DOM and print out the offending element and its parent element. My current code is:
public void writeDocumentToFile(Document document) throws XMLWriteException {
try {
// Validate the document against the schema
Validator validator = getSchema(xmlSchema).newValidator();
validator.validate(new DOMSource(document));
// Serialisation logic here.
} catch(SAXException e) {
throw new XMLWriteException(e); // This is being thrown
} // Some other exceptions caught here.
}
private Schema getSchema(URL schema) throws SAXException {
SchemaFactory schemaFactory =
SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
// Some logic here to specify a ResourceResolver
return schemaFactory.newSchema(schema);
}
I have looked into the Validator#setErrorHandler(ErrorHandler handler) method but the ErrorHandler interface only gives me exposure to a SAXParseException which only exposes the line number and column number of the error. Because I am using an in-memory DOM this returns -1 for both line and column number.
Is there a better way to do this? I don't really want to have to manually validate the Strings before I add them to the DOM if the libraries provide me the function I'm looking for.
I'm using JDK 6 update 26 and JDK 6 update 7 depending on where this code is running.
EDIT: With this code added -
validator.setErrorHandler(new ErrorHandler() {
@Override
public void warning(SAXParseException exception) throws SAXException {
printException(exception);
throw exception;
}
@Override
public void error(SAXParseException exception) throws SAXException {
printException(exception);
throw exception;
}
@Override
public void fatalError(SAXParseException exception) throws SAXException {
printException(exception);
throw exception;
}
private void printException(SAXParseException exception) {
System.out.println("exception.getPublicId() = " + exception.getPublicId());
System.out.println("exception.getSystemId() = " + exception.getSystemId());
System.out.println("exception.getColumnNumber() = " + exception.getColumnNumber());
System.out.println("exception.getLineNumber() = " + exception.getLineNumber());
}
});
I get the output:
exception.getPublicId() = null
exception.getSystemId() = null
exception.getColumnNumber() = -1
exception.getLineNumber() = -1
If you are using Xerces (the Sun JDK default), you can get the element that failed validation through the http://apache.org/xml/properties/dom/current-element-node property:
...
catch (SAXParseException e)
{
Element curElement = (Element)validator.getProperty("http://apache.org/xml/properties/dom/current-element-node");
System.out.println("Validation error: " + e.getMessage());
System.out.println("Element: " + curElement);
}
Example:
String xml = "<root xmlns=\"http://www.myschema.org\">\n" +
"<text>This is text</text>\n" +
"<number>32</number>\n" +
"<number>abc</number>\n" +
"</root>";
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
Document doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));
Schema schema = getSchema(getClass().getResource("myschema.xsd"));
Validator validator = schema.newValidator();
try
{
validator.validate(new DOMSource(doc));
}
catch (SAXParseException e)
{
Element curElement = (Element)validator.getProperty("http://apache.org/xml/properties/dom/current-element-node");
System.out.println("Validation error: " + e.getMessage());
System.out.println(curElement.getLocalName() + ": " + curElement.getTextContent());
//Use curElement.getParentNode() or whatever you need here
}
If you need to get line/column numbers from the DOM, this answer has a solution to that problem.
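Since line and column numbers are not available for an in-memory DOM, one way to report where the offending element sits is to build a simple path by walking up through getParentNode(). This is only a sketch using plain DOM calls; the class name NodePath is made up, and the output format is XPath-like but not namespace-aware.

```java
import org.w3c.dom.Node;

// Hypothetical helper: builds a path such as /root[1]/number[2] for a DOM element.
public class NodePath {
    public static String of(Node node) {
        StringBuilder path = new StringBuilder();
        // Walk from the element up to (but not including) the Document node.
        for (Node n = node; n != null && n.getNodeType() == Node.ELEMENT_NODE; n = n.getParentNode()) {
            int index = 1;
            // Count preceding siblings with the same name to get a 1-based position.
            for (Node s = n.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
                if (s.getNodeType() == Node.ELEMENT_NODE && s.getNodeName().equals(n.getNodeName())) {
                    index++;
                }
            }
            path.insert(0, "/" + n.getNodeName() + "[" + index + "]");
        }
        return path.toString();
    }
}
```

In the catch block above, NodePath.of(curElement) could then be logged together with e.getMessage().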
SAXParseException exposes the SystemId and PublicId. Does that not give you enough information?

How can I unmarshal in JAXB and get schema validation without using an explicit schema file?

I am using JAXB for my application configuration.
I feel like I am doing something really crooked, and I am looking for a way to avoid needing an actual file for this operation.
As you can see in the code, I:
1. create a schema file from my JAXBContext (from my class annotations, actually)
2. set this schema file in order to allow true validation when I unmarshal
JAXBContext context = JAXBContext.newInstance(clazz);
context.generateSchema(new MySchemaOutputResolver()); // ultimately creates schemaFile
Schema mySchema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI).newSchema(schemaFile);
Unmarshaller u = context.createUnmarshaller();
u.setSchema(mySchema);
u.unmarshal(...);
Do any of you know how I can validate with JAXB without needing to create a schema file that sits on my computer?
Do I need to create a schema for validation at all? It looks redundant when I can get it from JAXBContext.generateSchema.
How do you do this?
Regarding ekeren's solution above, it's not a good idea to use PipedOutputStream/PipedInputStream in a single thread, lest you overflow the buffer and cause a deadlock. ByteArrayOutputStream/ByteArrayInputStream works, but if your JAXB classes generate multiple schemas (in different namespaces) you need multiple StreamSources.
I ended up with this:
JAXBContext jc = JAXBContext.newInstance(Something.class);
final List<ByteArrayOutputStream> outs = new ArrayList<ByteArrayOutputStream>();
jc.generateSchema(new SchemaOutputResolver(){
@Override
public Result createOutput(String namespaceUri, String suggestedFileName) throws IOException {
ByteArrayOutputStream out = new ByteArrayOutputStream();
outs.add(out);
StreamResult streamResult = new StreamResult(out);
streamResult.setSystemId("");
return streamResult;
}});
StreamSource[] sources = new StreamSource[outs.size()];
for (int i=0; i<outs.size(); i++) {
ByteArrayOutputStream out = outs.get(i);
// to examine schema: System.out.append(new String(out.toByteArray()));
sources[i] = new StreamSource(new ByteArrayInputStream(out.toByteArray()),"");
}
SchemaFactory sf = SchemaFactory.newInstance( XMLConstants.W3C_XML_SCHEMA_NS_URI );
m.setSchema(sf.newSchema(sources));
m.marshal(docs, new DefaultHandler()); // performs the schema validation
I had the exact issue and found a solution in the Apache Axis 2 source code:
protected List<DOMResult> generateJaxbSchemas(JAXBContext context) throws IOException {
final List<DOMResult> results = new ArrayList<DOMResult>();
context.generateSchema(new SchemaOutputResolver() {
@Override
public Result createOutput(String ns, String file) throws IOException {
DOMResult result = new DOMResult();
result.setSystemId(file);
results.add(result);
return result;
}
});
return results;
}
Once you have acquired your list of DOMResults that represent the schemas, you will need to transform them into DOMSource objects before you can feed them into a SchemaFactory. This second step might look something like this:
Unmarshaller u = myJAXBContext.createUnmarshaller();
List<DOMSource> dsList = new ArrayList<DOMSource>();
for(DOMResult domresult : myDomList){
dsList.add(new DOMSource(domresult.getNode()));
}
String schemaLang = "http://www.w3.org/2001/XMLSchema";
SchemaFactory sFactory = SchemaFactory.newInstance(schemaLang);
Schema schema = sFactory.newSchema((DOMSource[]) dsList.toArray(new DOMSource[0]));
u.setSchema(schema);
I believe you just need to set a ValidationEventHandler on your unmarshaller. Something like this:
public class JAXBValidator extends ValidationEventCollector {
@Override
public boolean handleEvent(ValidationEvent event) {
if (event.getSeverity() == ValidationEvent.ERROR ||
event.getSeverity() == ValidationEvent.FATAL_ERROR)
{
ValidationEventLocator locator = event.getLocator();
// change RuntimeException to something more appropriate
throw new RuntimeException("XML Validation Exception: " +
event.getMessage() + " at row: " + locator.getLineNumber() +
" column: " + locator.getColumnNumber());
}
return true;
}
}
And in your code:
Unmarshaller u = m_context.createUnmarshaller();
u.setEventHandler(new JAXBValidator());
u.unmarshal(...);
If you use Maven, the jaxb2-maven-plugin can help you. It generates schemas in the generate-resources phase.

How to validate against schema in JAXB 2.0 without marshalling?

I need to validate my JAXB objects before marshalling to an XML file. Prior to JAXB 2.0, one could use a javax.xml.bind.Validator. But that has been deprecated, so I'm trying to figure out the proper way of doing this. I'm familiar with validating at marshal time, but in my case I just want to know if it's valid. I suppose I could marshal to a temp file or memory and throw it away, but I'm wondering if there is a more elegant solution.
Firstly, javax.xml.bind.Validator has been deprecated in favour of javax.xml.validation.Schema (javadoc). The idea is that you parse your schema via a javax.xml.validation.SchemaFactory (javadoc), and inject that into the marshaller/unmarshaller.
As for your question regarding validation without marshalling, the problem here is that JAXB actually delegates the validation to Xerces (or whichever SAX processor you're using), and Xerces validates your document as a stream of SAX events. So in order to validate, you need to perform some kind of marshalling.
The lowest-impact implementation of this would be to use a "/dev/null" implementation of a SAX processor. Marshalling to a null OutputStream would still involve XML generation, which is wasteful. So I would suggest:
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(locationOfMySchema);
Marshaller marshaller = jaxbContext.createMarshaller();
marshaller.setSchema(schema);
marshaller.marshal(objectToMarshal, new DefaultHandler());
DefaultHandler will discard all the events, and the marshal() operation will throw a JAXBException if validation against the schema fails.
You could use a javax.xml.bind.util.JAXBSource (javadoc) and a javax.xml.validation.Validator (javadoc), throw in an implementation of org.xml.sax.ErrorHandler (javadoc) and do the following:
import java.io.File;
import javax.xml.XMLConstants;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.util.JAXBSource;
import javax.xml.validation.*;
public class Demo {
public static void main(String[] args) throws Exception {
Customer customer = new Customer();
customer.setName("Jane Doe");
customer.getPhoneNumbers().add(new PhoneNumber());
customer.getPhoneNumbers().add(new PhoneNumber());
customer.getPhoneNumbers().add(new PhoneNumber());
JAXBContext jc = JAXBContext.newInstance(Customer.class);
JAXBSource source = new JAXBSource(jc, customer);
SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sf.newSchema(new File("customer.xsd"));
Validator validator = schema.newValidator();
validator.setErrorHandler(new MyErrorHandler());
validator.validate(source);
}
}
For More Information, See My Blog
http://blog.bdoughan.com/2010/11/validate-jaxb-object-model-with-xml.html
This is how we did it. I had to find a way to validate the XML file against the XSD corresponding to the version of the XML, since we have many apps using different versions of the XML content.
I didn't really find any good examples on the net and finally finished with this. Hope this will help.
ValidationEventCollector vec = new ValidationEventCollector();
SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
URL xsdURL = getClass().getResource("/xsd/" + xsd);
Schema schema = sf.newSchema(xsdURL);
//You should change your jaxbContext here for your stuff....
Unmarshaller um = (getJAXBContext(NotificationReponseEnum.NOTIFICATION, notificationWrapper.getEnteteNotification().getTypeNotification()))
.createUnmarshaller();
um.setSchema(schema);
try {
StringReader reader = new StringReader(xml);
um.setEventHandler(vec);
um.unmarshal(reader);
} catch (javax.xml.bind.UnmarshalException ex) {
if (vec != null && vec.hasEvents()) {
erreurs = new ArrayList < MessageErreur > ();
for (ValidationEvent ve: vec.getEvents()) {
MessageErreur erreur = new MessageErreur();
String msg = ve.getMessage();
ValidationEventLocator vel = ve.getLocator();
int numLigne = vel.getLineNumber();
int numColonne = vel.getColumnNumber();
erreur.setMessage(msg);
erreur.setCode(ve.getSeverity());
erreur.setException(ve.getLinkedException());
erreur.setPosition(numLigne, numColonne);
erreurs.add(erreur);
logger.debug("XML validation error at " + numLigne + "." + numColonne + ": " + msg);
}
}
}

Validate an XML File Against Multiple Schema Definitions

I'm trying to validate an XML file against a number of different schemas (apologies for the contrived example):
a.xsd
b.xsd
c.xsd
c.xsd in particular includes b.xsd, and b.xsd includes a.xsd, using:
<xs:include schemaLocation="b.xsd"/>
I'm trying to do this via Xerces in the following manner:
XMLSchemaFactory xmlSchemaFactory = new XMLSchemaFactory();
Schema schema = xmlSchemaFactory.newSchema(new StreamSource[] { new StreamSource(this.getClass().getResourceAsStream("a.xsd"), "a.xsd"),
new StreamSource(this.getClass().getResourceAsStream("b.xsd"), "b.xsd"),
new StreamSource(this.getClass().getResourceAsStream("c.xsd"), "c.xsd")});
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new StringReader(xmlContent)));
but this is failing to import all three of the schemas correctly resulting in cannot resolve the name 'blah' to a(n) 'group' component.
I've validated this successfully using Python, but having real problems with Java 6.0 and Xerces 2.8.1. Can anybody suggest what's going wrong here, or an easier approach to validate my XML documents?
So just in case anybody else runs into the same issue here, I needed to load a parent schema (and implicit child schemas) from a unit test, as a resource, to validate an XML String. I used the Xerces XMLSchemaFactory to do this along with the Java 6 validator.
In order to load the child schemas correctly via an include, I had to write a custom resource resolver. Code can be found here:
https://code.google.com/p/xmlsanity/source/browse/src/com/arc90/xmlsanity/validation/ResourceResolver.java
To use the resolver specify it on the schema factory:
xmlSchemaFactory.setResourceResolver(new ResourceResolver());
and it will use it to resolve your resources via the classpath (in my case from src/main/resources). Any comments are welcome on this...
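The linked ResourceResolver is not reproduced here. As a rough idea of what such a resolver involves, the sketch below implements org.w3c.dom.ls.LSResourceResolver against an in-memory map instead of the classpath, which keeps the example self-contained. The class name InMemoryResolver and the map keyed by schemaLocation are assumptions made for this illustration, not the linked code.

```java
import java.util.Map;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;

// Hypothetical resolver: serves included/imported schemas from a map of
// schemaLocation -> schema text instead of the file system.
public class InMemoryResolver implements LSResourceResolver {
    private final Map<String, String> schemas;
    private final DOMImplementationLS ls;

    public InMemoryResolver(Map<String, String> schemas) throws ReflectiveOperationException {
        this.schemas = schemas;
        // The "LS" feature gives access to a factory for LSInput objects.
        this.ls = (DOMImplementationLS) DOMImplementationRegistry.newInstance()
                .getDOMImplementation("LS");
    }

    @Override
    public LSInput resolveResource(String type, String namespaceURI, String publicId,
                                   String systemId, String baseURI) {
        String text = schemas.get(systemId);
        if (text == null) {
            return null; // unknown location: fall back to the default lookup
        }
        LSInput input = ls.createLSInput();
        input.setStringData(text);
        input.setSystemId(systemId);
        input.setPublicId(publicId);
        input.setBaseURI(baseURI);
        return input;
    }
}
```

A classpath-based variant would look the same except that it would call input.setByteStream(...) with a stream obtained from getResourceAsStream for the requested systemId.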
http://www.kdgregory.com/index.php?page=xml.parsing
section 'Multiple schemas for a single document'
My solution based on that document:
URL xsdUrlA = this.getClass().getResource("a.xsd");
URL xsdUrlB = this.getClass().getResource("b.xsd");
URL xsdUrlC = this.getClass().getResource("c.xsd");
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
//---
String W3C_XSD_TOP_ELEMENT =
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
+ "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" elementFormDefault=\"qualified\">\n"
+ "<xs:include schemaLocation=\"" +xsdUrlA.getPath() +"\"/>\n"
+ "<xs:include schemaLocation=\"" +xsdUrlB.getPath() +"\"/>\n"
+ "<xs:include schemaLocation=\"" +xsdUrlC.getPath() +"\"/>\n"
+"</xs:schema>";
Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(W3C_XSD_TOP_ELEMENT), "xsdTop"));
The schema stuff in Xerces is (a) very, very pedantic, and (b) gives utterly useless error messages when it doesn't like what it finds. It's a frustrating combination.
The schema stuff in Python may be a lot more forgiving, and may have been letting small errors in the schema go past unreported.
Now if, as you say, c.xsd includes b.xsd, and b.xsd includes a.xsd, then there's no need to load all three into the schema factory. Not only is it unnecessary, it will likely confuse Xerces and result in errors, so this may be your problem. Just pass c.xsd to the factory, and let it resolve b.xsd and a.xsd itself, which it should do relative to c.xsd.
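To illustrate the point about passing only c.xsd, here is a small sketch that writes three chained schemas to a temporary directory and hands just the top-level one to the factory as a URL, so that the relative xs:include locations have a base to resolve against. The schema contents, the class name TopLevelSchemaOnly and the temp-directory setup are invented for this example; the temporary files are not cleaned up.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo: c.xsd includes b.xsd, which includes a.xsd; only c.xsd is loaded explicitly.
public class TopLevelSchemaOnly {
    static final String XS = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>";
    static final String A_XSD = XS
            + "<xs:simpleType name='Pos'><xs:restriction base='xs:positiveInteger'/></xs:simpleType>"
            + "</xs:schema>";
    static final String B_XSD = XS
            + "<xs:include schemaLocation='a.xsd'/>"
            + "<xs:element name='item' type='Pos'/></xs:schema>";
    static final String C_XSD = XS
            + "<xs:include schemaLocation='b.xsd'/>"
            + "<xs:element name='demo'><xs:complexType><xs:sequence>"
            + "<xs:element ref='item' maxOccurs='unbounded'/>"
            + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    public static boolean isValid(String xml) throws Exception {
        Path dir = Files.createTempDirectory("schemas");
        Files.write(dir.resolve("a.xsd"), A_XSD.getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("b.xsd"), B_XSD.getBytes(StandardCharsets.UTF_8));
        Path top = dir.resolve("c.xsd");
        Files.write(top, C_XSD.getBytes(StandardCharsets.UTF_8));

        // Passing a URL gives the factory a base location for the relative includes.
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(top.toUri().toURL());
        try {
            // No ErrorHandler is set, so the first schema error is thrown.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```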
From the xerces documentation :
http://xerces.apache.org/xerces2-j/faq-xs.html
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
...
StreamSource[] schemaDocuments = /* created by your application */;
Source instanceDocument = /* created by your application */;
SchemaFactory sf = SchemaFactory.newInstance(
"http://www.w3.org/XML/XMLSchema/v1.1");
Schema s = sf.newSchema(schemaDocuments);
Validator v = s.newValidator();
v.validate(instanceDocument);
I faced the same problem and after investigating found this solution. It works for me.
Enum to setup the different XSDs:
public enum XsdFile {
// @formatter:off
A("a.xsd"),
B("b.xsd"),
C("c.xsd");
// @formatter:on
private final String value;
private XsdFile(String value) {
this.value = value;
}
public String getValue() {
return this.value;
}
}
Method to validate:
public static void validateXmlAgainstManyXsds() {
final SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
String xmlFile;
xmlFile = "example.xml";
// Use of Enum class in order to get the different XSDs
Source[] sources = new Source[XsdFile.class.getEnumConstants().length];
for (XsdFile xsdFile : XsdFile.class.getEnumConstants()) {
sources[xsdFile.ordinal()] = new StreamSource(xsdFile.getValue());
}
try {
final Schema schema = schemaFactory.newSchema(sources);
final Validator validator = schema.newValidator();
System.out.println("Validating " + xmlFile + " against XSDs " + Arrays.toString(sources));
validator.validate(new StreamSource(new File(xmlFile)));
} catch (Exception exception) {
System.out.println("ERROR: Unable to validate " + xmlFile + " against XSDs " + Arrays.toString(sources)
+ " - " + exception);
}
System.out.println("Validation process completed.");
}
I ended up using this:
import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;
import java.io.IOException;
.
.
.
try {
SAXParser parser = new SAXParser();
parser.setFeature("http://xml.org/sax/features/validation", true);
parser.setFeature("http://apache.org/xml/features/validation/schema", true);
parser.setFeature("http://apache.org/xml/features/validation/schema-full-checking", true);
parser.setProperty("http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation", "http://your_url_schema_location");
Validator handler = new Validator();
parser.setErrorHandler(handler);
parser.parse("file:///" + "/home/user/myfile.xml");
} catch (SAXException e) {
e.printStackTrace();
} catch (IOException ex) {
ex.printStackTrace();
}
class Validator extends DefaultHandler {
public boolean validationError = false;
public SAXParseException saxParseException = null;
public void error(SAXParseException exception)
throws SAXException {
validationError = true;
saxParseException = exception;
}
public void fatalError(SAXParseException exception)
throws SAXException {
validationError = true;
saxParseException = exception;
}
public void warning(SAXParseException exception)
throws SAXException {
}
}
Remember to change:
1) The parameter "http://your_url_schema_location" for your xsd file location.
2) The string "/home/user/myfile.xml" for the one pointing to your xml file.
I didn't have to set the variable: -Djavax.xml.validation.SchemaFactory:http://www.w3.org/2001/XMLSchema=org.apache.xerces.jaxp.validation.XMLSchemaFactory
In case anybody still comes here looking for a way to validate XML or an object against multiple XSDs, here is what worked for me:
// Using a URL is the most important part here: with a URL, the relative paths of include/import elements inside the XSD file are resolved. Load only the parent-level XSD here (not all the included XSDs).
URL xsdUrl = getClass().getClassLoader().getResource("my/parent/schema.xsd");
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(xsdUrl);
JAXBContext jaxbContext = JAXBContext.newInstance(MyClass.class);
Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
unmarshaller.setSchema(schema);
/* If you need to validate object against xsd, uncomment this
ObjectFactory objectFactory = new ObjectFactory();
JAXBElement<MyClass> wrappedObject = objectFactory.createMyClassObject(myClassObject);
Marshaller marshaller = jaxbContext.createMarshaller();
marshaller.setSchema(schema);
marshaller.marshal(wrappedObject, new DefaultHandler());
*/
unmarshaller.unmarshal(getClass().getClassLoader().getResource("your/xml/file.xml"));
If all XSDs belong to the same namespace, then create a new XSD and include the other XSDs in it. Then, in Java, create the schema from the new XSD.
Schema schema = xmlSchemaFactory.newSchema(
new StreamSource(this.getClass().getResourceAsStream("/path/to/all_in_one.xsd")));
all_in_one.xsd :
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:ex="http://example.org/schema/"
targetNamespace="http://example.org/schema/"
elementFormDefault="unqualified"
attributeFormDefault="unqualified">
<xs:include schemaLocation="relative/path/to/a.xsd"></xs:include>
<xs:include schemaLocation="relative/path/to/b.xsd"></xs:include>
<xs:include schemaLocation="relative/path/to/c.xsd"></xs:include>
</xs:schema>
