Catching exceptions with Xalan xslt - java

I have the following XSLT code based on Xalan:
TransformerFactory factory = TransformerFactory.newInstance();
XalanErrorListener listener = new XalanErrorListener();
factory.setErrorListener(listener);
// Create transformer
StreamSource config = new StreamSource(xslPath);
Transformer transformer = factory.newTransformer(config);
// Create input / output
StreamSource source = new StreamSource(inputPath);
StreamResult result = new StreamResult(outputPath);
// Transform
transformer.transform(source, result);
My XalanErrorListener simply implements the error, fatalError and warning methods of the javax.xml.transform.ErrorListener interface and logs the exception:
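import javax.xml.transform.ErrorListener;
import javax.xml.transform.TransformerException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class XalanErrorListener implements ErrorListener {
    static final Logger LOGGER = LoggerFactory.getLogger(XalanErrorListener.class);

    // Each callback only logs; nothing is rethrown.
    @Override
    public void error(TransformerException exception) throws TransformerException {
        LOGGER.error(exception.getMessage(), exception);
    }

    @Override
    public void fatalError(TransformerException exception) throws TransformerException {
        LOGGER.error(exception.getMessage(), exception);
    }

    @Override
    public void warning(TransformerException exception) throws TransformerException {
        LOGGER.warn(exception.getMessage(), exception);
    }
}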
Yet, when executing on a badly encoded file, I get the following message in the console:
(Location of error unknown)
com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException:
Invalid byte 2 of 2-byte UTF-8 sequence.
The program executes normally: no exception is thrown or logged and the generated file is empty!
How can I catch the exception to handle it the way I want?

The ErrorListener you supply to Xalan catches transformation errors, but it does not catch XML parsing errors. For that you need to supply an ErrorHandler to the Xerces parser.
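A minimal sketch of that approach, assuming the JAXP parser and transformer bundled with the JDK rather than a standalone Xalan/Xerces installation: the input is parsed by an XMLReader that carries its own ErrorHandler, and the reader is handed to the transformer wrapped in a SAXSource. The names ParserErrorDemo, StrictHandler and copy are hypothetical, and the identity transformer stands in for the real stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;

final class ParserErrorDemo {

    /** Hypothetical handler: rethrows parser errors so they cannot go unnoticed. */
    static final class StrictHandler implements ErrorHandler {
        @Override
        public void warning(SAXParseException e) {
            System.err.println("warning: " + e.getMessage());
        }

        @Override
        public void error(SAXParseException e) throws SAXException {
            throw e;
        }

        @Override
        public void fatalError(SAXParseException e) throws SAXException {
            throw e;
        }
    }

    /** Copies the input with the identity transformer, parsing it with our own XMLReader. */
    static String copy(String xml) {
        try {
            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            XMLReader reader = spf.newSAXParser().getXMLReader();
            reader.setErrorHandler(new StrictHandler());

            // Identity transformer; a real stylesheet would be passed to newTransformer(...).
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(
                    new SAXSource(reader, new InputSource(new StringReader(xml))),
                    new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | TransformerException e) {
            // Parser errors arrive here, either directly or wrapped by the transformer.
            throw new IllegalArgumentException("Input could not be parsed or transformed", e);
        }
    }
}
```

Calling ParserErrorDemo.copy with a truncated or badly encoded document should then fail with an exception the caller can handle, instead of silently producing an empty result.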

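The problem was that the ErrorListener has to be set on the Transformer, not on the TransformerFactory. The listener registered on the factory is used while the stylesheet is being processed, not during the transformation itself: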
Transformer transformer = factory.newTransformer(config);
transformer.setErrorListener(listener);

Related

Handle illegal URI characters in xslt inclusion

In an XSL transformation I have an XSLT file that includes some other XSLT files. The problem is that the URIs of these files contain illegal characters, in particular '##'. The XSLT looks like this:
<xsl:include href="/appdm/tomcat/webapps/sentys##1.0.0/WEB-INF/classes/xslt/release_java/xslt/gen.xslt" />
and when I try to instantiate a java Transformer I get the error:
javax.xml.transform.TransformerConfigurationException: javax.xml.transform.TransformerConfigurationException: javax.xml.transform.TransformerException: org.xml.sax.SAXException: org.apache.xml.utils.URI$MalformedURIException: Fragment contains invalid character:#
This is the java code:
public String xslTransform2String(String sXml, String sXslt) throws Exception {
String sResult = null;
try {
Source oStrSource = createStringSource(sXml);
DocumentBuilderFactory oDocFactory = DocumentBuilderFactory.newInstance();
oDocFactory.setNamespaceAware(true);
//sXslt is the xslt content with the inclusions
//<xsl:include href="/appdm/tomcat/webapps/sentys##1.0.0/WEB-INF/classes/xslt/release_java/xslt/gen.xslt" />"
Document oDocXslt = oDocFactory.newDocumentBuilder().parse(new InputSource(new StringReader(sXslt)));
Source oXsltSource = new DOMSource(oDocXslt);
StringWriter oStrOut = new StringWriter();
Result oTransRes = createStringResult(oStrOut);
Transformer oTrans = createXsltTransformer(oXsltSource);
oTrans.transform(oStrSource, oTransRes);
sResult = oStrOut.toString();
} catch (Exception oEx) {
throw new BddException(oEx, XmlProvider.ERR_XSLT, null);
}
return sResult;
}
private Transformer createXsltTransformer(Source oXsltSource) throws Exception {
Transformer transformer = getXsltTransformerFactory().newTransformer(
oXsltSource);
ErrorListener errorListener = new DefaultErrorListener();
transformer.setErrorListener(errorListener);
return transformer;
}
Is there a way I can use relative paths instead of absolute paths?
Thank you
To avoid the MalformedURIException, replace the second or both # with %23.
See https://stackoverflow.com/a/5007362/4092205
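As a rough sketch of that replacement, the href can be escaped before the stylesheet text is parsed. The class HrefEscaper and its methods are hypothetical helpers, and java.net.URI is used here only as a stand-in syntax check; it is not the org.apache.xml.utils.URI class that raised the exception.

```java
import java.net.URI;

final class HrefEscaper {

    /** Percent-encodes every '#' so it is no longer read as a fragment delimiter. */
    static String escapeHash(String path) {
        return path.replace("#", "%23");
    }

    /** Returns true if java.net.URI accepts the string as a URI reference. */
    static boolean isValidUri(String candidate) {
        try {
            URI.create(candidate);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

For the path in the question, escapeHash turns sentys##1.0.0 into sentys%23%231.0.0, which is a syntactically valid URI path.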

JAXB Validator does not detect syntax errors?

I want to validate an XML file against its XSD before unmarshalling it.
The code is as follows :
SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = factory.newSchema(xsdFilePath);
Validator validator = schema.newValidator();
validator.setErrorHandler(new MyValidationErrorHandler());
validator.validate(new StreamSource(xmlFilePath));
I found that when an XML element is not closed, the Validator fails to record it as an error, but the Unmarshaller recognizes this and throws an "Invalid content was found starting with element.." error.
I want the Validation and the Unmarshalling/Marshalling to be different operations.
Are there ways to have the Validator detect such syntax errors in the xml file?
You'll have to distinguish two things:
The elementary syntax of an XML document
The document's compliance with an XML Schema
If the elementary syntax isn't right, there's no document that can be investigated for its element structure, attribute existence, value compliance with facets and so on.
I'm afraid you'll have to catch both kinds of exceptions.
You may, however, handle everything in a single unmarshalling operation:
JAXBContext payloadContext = JAXBContext.newInstance("generated");
Unmarshaller unmarshaller = payloadContext.createUnmarshaller();
unmarshaller.setSchema(schemaFactory.newSchema(... ));
unmarshaller.setEventHandler( new ValidationEventHandler(){
public boolean handleEvent(ValidationEvent event) {
System.out.println( "Event! " + event );
return true;
}
} );
Later
To have validation only, you'll still have to parse, but if you don't have JAXB-ish classes, you get by with JAXP:
static class Handler implements ErrorHandler {
public void error(SAXParseException exception){
System.out.println( "error: " + exception.getMessage() );
}
public void fatalError(SAXParseException exception){
System.out.println( "fatal: " + exception.getMessage() );
}
public void warning(SAXParseException exception){
System.out.println( "warning: " + exception.getMessage() );
}
}
Handler handler = new Handler();
DocumentBuilder parser = DocumentBuilderFactory.newInstance().newDocumentBuilder();
parser.setErrorHandler( handler );
try {
Document document = parser.parse(new File("test.xml"));
SchemaFactory factory =
SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Source schemaFile = new StreamSource(new File("test.xsd"));
Schema schema = factory.newSchema(schemaFile);
Validator validator = schema.newValidator();
validator.setErrorHandler( handler );
try {
validator.validate(new DOMSource(document));
} catch (SAXException e) {
// ...
System.out.println( "Validation error" );
}
} catch (SAXParseException e) {
// syntax error in XML document
System.out.println( "Syntax error" );
}
For validation, once an ErrorHandler is set, schema violations are reported to it rather than thrown as a SAXParseException, so either the handler or the catch block is redundant.
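The two kinds of problems can be observed side by side with a small experiment. The following is only a sketch, assuming the JDK's built-in schema validator; the class ValidationDemo, the inline XSD and the problems method are made up for illustration. Schema violations arrive through error(), while a document that is not well-formed is reported through fatalError() and then aborts validation with a SAXException.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class ValidationDemo {

    // Hypothetical schema: <root> must contain exactly one integer <a>.
    static final String XSD =
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
          + "<xs:element name='root'><xs:complexType><xs:sequence>"
          + "<xs:element name='a' type='xs:int'/>"
          + "</xs:sequence></xs:complexType></xs:element>"
          + "</xs:schema>";

    /** Returns every problem found, each prefixed with its severity. */
    static List<String> problems(String xml) {
        final List<String> found = new ArrayList<>();
        try {
            Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    found.add("warning: " + e.getMessage());
                }

                @Override
                public void error(SAXParseException e) {
                    found.add("error: " + e.getMessage());
                }

                @Override
                public void fatalError(SAXParseException e) {
                    found.add("fatal: " + e.getMessage());
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Validation stops after a fatal error; record it if the handler did not.
            if (found.isEmpty()) {
                found.add("fatal: " + e.getMessage());
            }
        } catch (IOException e) {
            found.add("io: " + e.getMessage());
        }
        return found;
    }
}
```

A handler that implements all three callbacks, including fatalError, is what lets the caller see unclosed elements as well as schema violations.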

How to transform XML to HTML using XSLT in ANDROID?

I am working on an Android (2.2) project which needs XSL transformation. The code below works perfectly in a regular non-Android Java project:
public static String transform() throws TransformerException {
Source xmlInput = new StreamSource(new File("samplexml.xml"));
Source xslInput = new StreamSource(new File("samplexslt.xslt"));
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(xslInput);
OutputStream baos = new ByteArrayOutputStream();
Result result = new StreamResult(baos);
transformer.transform(xmlInput, result);
return baos.toString();
}
I need similar functionality on android. For this I created 2 files under resources/raw:
samplexml.xml
samplexslt.xslt
(The contents of these files come from here.)
I tried the below code & it does not work (note the StreamSource constructor arg):
public static String transform() throws TransformerException {
TransformerFactory factory = TransformerFactory.newInstance();
Source xmlInput = new StreamSource(this.getResources().openRawResource(R.raw.samplexml));
Source xslInput = new StreamSource(this.getResources().openRawResource(R.raw.samplexslt));
Transformer transformer = factory.newTransformer(xslInput);//NullPointerException here
OutputStream baos = new ByteArrayOutputStream();
Result result = new StreamResult(baos);
transformer.transform(xmlInput, result);
}
I saw the spec & believe I need to set a systemId. But I couldn't get the above code to work.
So, how should XSLT transformations be handled in an Android project? Please share your thoughts.
You cannot use this in a static context, which is what your static method transform() does. You can do it like this:
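import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import android.app.Activity;
import android.content.res.Resources;
import android.os.Bundle;
import android.util.Log;

public class YourLoadXSLClass extends Activity {
    static Resources res;
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        res = getResources();
        try {
            String strHTML = transform();
            // Other code.....
        } catch (TransformerException e) {
            Log.e("YourLoadXSLClass", "XSLT transformation failed", e);
        }
    }
    /*
     * Your method that transforms the XML with XSLT.
     */
    public static String transform() throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // The raw files are accessible here through the static res field.
        Source xmlInput = new StreamSource(
                res.openRawResource(R.raw.samplexml));
        Source xslInput = new StreamSource(
                res.openRawResource(R.raw.samplexslt));
        Transformer transformer = factory.newTransformer(xslInput);
        OutputStream baos = new ByteArrayOutputStream();
        Result result = new StreamResult(baos);
        transformer.transform(xmlInput, result);
        return baos.toString();
    }
}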
Here is the complete class code that does what you need. I hope this helps!
I've never done anything with XSLT but, looking at your code, logically there are only two things that could cause an NPE on that line. The first would be that factory might be null but that doesn't make sense.
That leaves xslInput as being the culprit which suggests openRawResource(R.raw.samplexslt) is failing to return a valid InputStream for the StreamSource constructor to use. Try putting a log statement in such as...
if (xslInput != null) {
    Transformer transformer = factory.newTransformer(xslInput);
    ...
} else {
    Log.d("SomeTAG", "xslInput is null!!!");
}
If it turns out that xslInput is actually null then it suggests openRawResource(...) can't find/process the .xslt file properly. In that case I'd suggest using AssetManager to open the .xslt file by name (the file then has to live in the project's assets folder)...
AssetManager am = this.getAssets();
Source xslInput = new StreamSource(am.open("samplexslt.xslt"));

How to skip well-formed for java DOM parser

I know this has been asked multiple times here, but I've a different issue dealing with it. In my case, the app receives a non well-formed dom structure passed as a string. Here's a sample :
<div class='video yt'><div class='yt_url'>http://www.youtube.com/watch?v=U_QLu_Twd0g&feature=abcde_gdata</div></div>
As you can see, the content is not well-formed. Now, if I try to parse using a normal SAX or DOM parse it'll throw an exception which is understood.
org.xml.sax.SAXParseException: The reference to entity "feature" must end with the ';' delimiter.
As per the requirement, I need to read this document,add few additional div tags and send the content back as a string. This works great by using a DOM parser as I can read through the input structure and add additional tags at their required position.
I tried using tools like JTidy to pre-process the input and then parse it, but that converts the document into a full-blown HTML page, which I don't want. Here's the sample code:
StringWriter writer = new StringWriter();
Tidy tidy = new Tidy(); // obtain a new Tidy instance
tidy.setXHTML(true);
tidy.parse(new ByteArrayInputStream(content.getBytes()), writer);
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new ByteArrayInputStream(writer.toString().getBytes()));
// Traverse thru the content and add new tags
....
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
This completely converts the input to a well-formed html document. It then becomes hard to remove html tags manually. The other option I tried was to use SAX2DOM, which too creates a HTML doc. Here's a sample code .
ByteArrayInputStream is = new ByteArrayInputStream(content.getBytes());
Parser p = new Parser();
p.setFeature(IContentExtractionConstant.SAX_NAMESPACE,true);
SAX2DOM sax2dom = new SAX2DOM();
p.setContentHandler(sax2dom);
p.parse(new InputSource(is));
Document doc = (Document)sax2dom.getDOM();
I'll appreciate if someone can share their ideas.
Thanks
The simplest way is replacing xml reserved characters with the corresponding xml entities. You can do this manually:
content = content.replaceAll("&", "&amp;");
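A blanket replacement would also double-escape ampersands that are already part of a character reference. One possible refinement, sketched here with a hypothetical AmpersandEscaper class, is to escape only those ampersands that do not start something shaped like a reference. Note the assumption: names such as &nbsp; are left untouched even though XML does not predefine them, so they would still fail to parse without a DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class AmpersandEscaper {

    // An '&' that is NOT followed by a named, decimal or hexadecimal reference and ';'.
    private static final String BARE_AMP =
            "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)";

    /** Escapes bare ampersands and leaves existing references as they are. */
    static String escapeBareAmpersands(String content) {
        return content.replaceAll(BARE_AMP, "&amp;");
    }

    /** Parses the string with the DOM parser and returns the root element's text. */
    static String textOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Not well-formed: " + e.getMessage(), e);
        }
    }
}
```

After escaping, the sample div from the question can be read by the regular DOM parser, and the text content comes back with the original, unescaped ampersand.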
If you don't want to modify your string before parsing it, I can propose another way using a SAX parser, but this solution is more complicated. Basically you have to:
write a LexicalHandler in combination with a ContentHandler
tell the parser to continue its execution after a fatal error (the ErrorHandler isn't enough)
treat undeclared entities as simple text
UPDATE
According to your comment, I'm going to add some details regarding the second solution. I've written a class which extends DefaultHandler (the default implementation of EntityResolver, DTDHandler, ContentHandler and ErrorHandler) and implements LexicalHandler. I've overridden ErrorHandler's fatalError method (my implementation does nothing instead of throwing the exception) and ContentHandler's characters method, which works in combination with the startEntity method of LexicalHandler.
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.DefaultHandler;

public class MyHandler extends DefaultHandler implements LexicalHandler {
    // Name of the entity reported by startEntity, pending output as plain text.
    private String currentEntity = null;

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        // Deliberately empty: swallow the fatal error instead of rethrowing it.
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        String content = new String(ch, start, length);
        if (currentEntity != null) {
            // Put the undeclared entity back into the text as "&name".
            content = "&" + currentEntity + content;
            currentEntity = null;
        }
        System.out.print(content);
    }

    @Override
    public void startEntity(String name) throws SAXException {
        currentEntity = name;
    }

    // The remaining LexicalHandler callbacks are not needed here.
    @Override
    public void endEntity(String name) throws SAXException {
    }

    @Override
    public void startDTD(String name, String publicId, String systemId)
            throws SAXException {
    }

    @Override
    public void endDTD() throws SAXException {
    }

    @Override
    public void startCDATA() throws SAXException {
    }

    @Override
    public void endCDATA() throws SAXException {
    }

    @Override
    public void comment(char[] ch, int start, int length) throws SAXException {
    }
}
This is my main method, which parses your non-well-formed XML. The setFeature call is essential: without it the parser throws the SAXParseException despite the empty ErrorHandler implementation.
public static void main(String[] args) throws ParserConfigurationException,
SAXException, IOException {
String xml = "<div class='video yt'><div class='yt_url'>http://www.youtube.com/watch?v=U_QLu_Twd0g&feature=abcde_gdata</div></div>";
SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
XMLReader xmlReader = saxParser.getXMLReader();
MyHandler myHandler = new MyHandler();
xmlReader.setContentHandler(myHandler);
xmlReader.setErrorHandler(myHandler);
xmlReader.setProperty("http://xml.org/sax/properties/lexical-handler",
myHandler);
xmlReader.setFeature(
"http://apache.org/xml/features/continue-after-fatal-error",
true);
xmlReader.parse(new InputSource(new StringReader(xml)));
}
This main prints out the content of your div element which contains the error:
http://www.youtube.com/watch?v=U_QLu_Twd0g&feature=abcde_gdata
Keep in mind that this is an example which works with your input, maybe you'll have to complete it...for instance if you have some characters correctly escaped you should add some lines of code to handle this situation etc.
Hope this helps.

how can i unmarshall in jaxb and enjoy the schema validation without using an explicit schema file

I am using jaxb for my application configurations
I feel like I am doing something really crooked and I am looking for a way to not need an actual file or this transaction.
As you can see in code I:
1. Create a schema file from my JAXBContext (from my class annotations, actually)
2. Set this schema file on the unmarshaller to get real validation when I unmarshal
JAXBContext context = JAXBContext.newInstance(clazz);
context.generateSchema(new MySchemaOutputResolver()); // ultimately creates schemaFile
Schema mySchema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI).newSchema(schemaFile);
Unmarshaller u = context.createUnmarshaller();
u.setSchema(mySchema);
u.unmarshal(...);
Does anyone know how I can validate with JAXB without creating a schema file that sits on my computer?
Do I need to create a schema file for validation at all? It looks redundant when I can get the schema from JAXBContext.generateSchema.
How do you do this?
Regarding ekeren's solution above, it's not a good idea to use PipedOutputStream/PipedInputStream in a single thread, lest you overflow the buffer and cause a deadlock. ByteArrayOutputStream/ByteArrayInputStream works, but if your JAXB classes generate multiple schemas (in different namespaces) you need multiple StreamSources.
I ended up with this:
JAXBContext jc = JAXBContext.newInstance(Something.class);
final List<ByteArrayOutputStream> outs = new ArrayList<ByteArrayOutputStream>();
jc.generateSchema(new SchemaOutputResolver(){
@Override
public Result createOutput(String namespaceUri, String suggestedFileName) throws IOException {
ByteArrayOutputStream out = new ByteArrayOutputStream();
outs.add(out);
StreamResult streamResult = new StreamResult(out);
streamResult.setSystemId("");
return streamResult;
}});
StreamSource[] sources = new StreamSource[outs.size()];
for (int i=0; i<outs.size(); i++) {
ByteArrayOutputStream out = outs.get(i);
// to examine schema: System.out.append(new String(out.toByteArray()));
sources[i] = new StreamSource(new ByteArrayInputStream(out.toByteArray()),"");
}
SchemaFactory sf = SchemaFactory.newInstance( XMLConstants.W3C_XML_SCHEMA_NS_URI );
m.setSchema(sf.newSchema(sources));
m.marshal(docs, new DefaultHandler()); // performs the schema validation
I had the exact issue and found a solution in the Apache Axis 2 source code:
protected List<DOMResult> generateJaxbSchemas(JAXBContext context) throws IOException {
final List<DOMResult> results = new ArrayList<DOMResult>();
context.generateSchema(new SchemaOutputResolver() {
@Override
public Result createOutput(String ns, String file) throws IOException {
DOMResult result = new DOMResult();
result.setSystemId(file);
results.add(result);
return result;
}
});
return results;
}
After you've acquired your list of DOMResults that represent the schemas, you will need to transform them into DOMSource objects before you can feed them into a SchemaFactory. This second step might look something like this:
Unmarshaller u = myJAXBContext.createUnmarshaller();
List<DOMSource> dsList = new ArrayList<DOMSource>();
for(DOMResult domresult : myDomList){
dsList.add(new DOMSource(domresult.getNode()));
}
String schemaLang = "http://www.w3.org/2001/XMLSchema";
SchemaFactory sFactory = SchemaFactory.newInstance(schemaLang);
Schema schema = sFactory.newSchema((DOMSource[]) dsList.toArray(new DOMSource[0]));
u.setSchema(schema);
I believe you just need to set a ValidationEventHandler on your unmarshaller. Something like this:
public class JAXBValidator extends ValidationEventCollector {
@Override
public boolean handleEvent(ValidationEvent event) {
if (event.getSeverity() == event.ERROR ||
event.getSeverity() == event.FATAL_ERROR)
{
ValidationEventLocator locator = event.getLocator();
// change RuntimeException to something more appropriate
throw new RuntimeException("XML Validation Exception: " +
event.getMessage() + " at row: " + locator.getLineNumber() +
" column: " + locator.getColumnNumber());
}
return true;
}
}
And in your code:
Unmarshaller u = m_context.createUnmarshaller();
u.setEventHandler(new JAXBValidator());
u.unmarshal(...);
If you use Maven, the jaxb2-maven-plugin can help: it generates schemas in the generate-resources phase.
