I am given an XML document that must be allowed to have a Document Type Declaration (DTD), but we prohibit any ENTITY declarations.
The XML document is parsed with SAXParser.parse(), as follows:
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
factory.setValidating(true);
SAXParser parser = factory.newSAXParser();
The XML is then passed into the parser as an InputSource:
InputSource inputSource= ... ;
parser.parse(inputSource, handler);
And the handler has a resolveEntity method, which SAXParser.parse() calls:
public InputSource resolveEntity(String pubID, String sysID) throws SAXException {
InputSource inputSource = null;
try {
inputSource = entityResolver.resolveEntity(publicId, systemId);
}
catch (IOException e) {
throw new SAXException(e);
}
return inputSource;
}
When I pass in an XML file with an ENTITY reference, it seems that nothing is being done - no exceptions are thrown and nothing is stripped - to or about the prohibited ENTITY reference.
Here is an example of the bad XML I am using. The DTD should be allowed, but the !ENTITY line should be disallowed:
<!DOCTYPE foo SYSTEM "foo.dtd" [
<!ENTITY gotcha SYSTEM "file:///gotcha.txt"> <!-- This is disallowed-->
]>
<label>&gotcha;</label>
What do I need to do to make sure that ENTITY references are disallowed in the XML, but that DTDs are still allowed?
Set a org.xml.sax.ext.DeclHandler on the SaxParser.
parser.setProperty("http://xml.org/sax/properties/declaration-handler", myDeclHandler);
The DeclHandler gets notified when a internal entity declaration is parsed. To disallow entity decls you can simple throw a SAXException:
public class MyDeclHandler extends org.xml.sax.ext.DefaultHandler2 {
public void internalEntityDecl(String name, String value) throws SAXException {
throw new SAXException("not allowed");
}
}
Related
I previously unmarshaled my documents with this Method from "javax.xml.bind.Unmarshaller" (i am using Moxy as jaxb provider):
<Object> JAXBElement<Object> javax.xml.bind.Unmarshaller.unmarshal(Source source, Class<Object> declaredType) throws JAXBException
Now i have to refactor my code to use
<?> JAXBElement<?> javax.xml.bind.Unmarshaller.unmarshal(Node node, Class<?> declaredType) throws JAXBException
The problem is that i get an unmarshal exception:
Caused by: javax.xml.bind.UnmarshalException
- with linked exception:
[Exception [EclipseLink-25004] (Eclipse Persistence Services - 2.4.0.v20120608-r11652): org.eclipse.persistence.exceptions.XMLMarshalException
Exception Description: An error occurred unmarshalling the document
Internal Exception: org.xml.sax.SAXParseExceptionpublicId: ; systemId: ; lineNumber: 0; columnNumber: 0; cvc-elt.1: Cannot find the declaration of element 'invoice:request'.]
i tried unmarshal with Document and with Document.getDocumentElement(). The data in document is the same as in InputStream. Document is created this way:
protected static Document extractDocument(InputStream sourceData) {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
Document doc;
try {
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
doc = docBuilder.parse(new InputSource(sourceData));
} catch (Exception e) {
throw new IllegalArgumentException("Problem on reading xml, cause: ", e);
}
return doc;
}
I need the "intermediate" Document to do the type recognition/ to get the second argument for unmarshal.
so how to use the unmarshal method with Node/Document that the outcome is the same as used with the inputstream?
Edit for rick
I am unmarshalling xml data from here xsds are in this zip and examples are here.
i wrote a simple test (jaxb provider is moxy, it was also used to generate the binding classes):
#Test
public void test() throws Exception {
Unmarshaller um;
JAXBContext jaxbContext = JAXBContext
.newInstance("....generatedClasses.invoice.general430.request");
um = jaxbContext.createUnmarshaller();
InputStream xmlIs = SingleTests.class.getResourceAsStream("/430_requests/dentist_ersred_TG_430.xml");
SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = sf.newSchema(this.getClass().getResource("/xsd/" + "generalInvoiceRequest_430.xsd"));
// start
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
Document doc;
try {
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
doc = docBuilder.parse(new InputSource(xmlIs));
} catch (Exception e) {
throw new Exception("Problem on recognizing invoice type of given xml, cause: ", e);
}
// end
um.setSchema(schema);
um.unmarshal(doc, ....generatedClasses.invoice.general430.request.RequestType.class);
}
This is not working with the error described above. But as soon as deleted the "start end block" and unmarshall the stream xmlIs directly (not using the Document) the code works.
Ok i asked the same Question here the answer is:
simple add
docBuilderFactory.setNamespaceAware(true);
The following code works for me with a simple test model and gives the same result in all three cases:
JAXBContext ctx = JAXBContext.newInstance(new Class[] { TestObject.class }, null);
Source source = new StreamSource("src/stack16038604/instance.xml");
JAXBElement o1 = ctx.createUnmarshaller().unmarshal(source, TestObject.class);
System.out.println(o1.getValue());
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
InputStream inputStream2 = ClassLoader.getSystemResourceAsStream("stack16038604/instance.xml");
Document node = builderFactory.newDocumentBuilder().parse(inputStream2);
JAXBElement o2 = ctx.createUnmarshaller().unmarshal(node, TestObject.class);
System.out.println(o2.getValue());
DocumentBuilderFactory builderFactory2 = DocumentBuilderFactory.newInstance();
InputStream inputStream3 = ClassLoader.getSystemResourceAsStream("stack16038604/instance.xml");
InputSource inputSource = new InputSource(inputStream3);
Document node2 = builderFactory2.newDocumentBuilder().parse(inputSource);
JAXBElement o3 = ctx.createUnmarshaller().unmarshal(node2, TestObject.class);
System.out.println(o3.getValue());
If you provided a systemId in the Source use-case, then you should use the same systemId on the InputSource you create:
inputSource.setSystemId(sameIdAsYouUsedForSource);
If you are still having problems could you post the XML that you are trying to unmarshal? Also if your JAXB objects were generated from an XML Schema that could provide useful information as well.
Hope this helps,
Rick
I have a xml file which is a response from Webservice.It has got various namespaces involved with it. When I try to validate it with appropriate XSD its throwing "org.xml.sax.SAXParseException: cvc-elt.1: Cannot find the declaration of element 'SOAP-ENV:Envelope'." The namespace declaration for all the namespaces are declared in the response.
Following is my code
try {
DocumentBuilderFactory xmlFact = DocumentBuilderFactory.newInstance();
SchemaFactory schemaFactory = SchemaFactory
.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
SAXSource mainInputStream = new SAXSource(new InputSource(new FileInputStream(new File("FIXEDINCOME_v3_0.xsd"))));
SAXSource importInputStream1 =new SAXSource(new InputSource(new FileInputStream(new File("Rating.xsd"))));
SAXSource importInputStream2 = new SAXSource(new InputSource(new FileInputStream(new File("Datatypes.xsd"))));
Source[] sourceSchema = new SAXSource[]{mainInputStream , importInputStream1, importInputStream2};
Schema schema = schemaFactory.newSchema(sourceSchema);
xmlFact.setNamespaceAware(true);
xmlFact.setSchema(schema);
DocumentBuilder builder = xmlFact.newDocumentBuilder();
xmlDOC = builder.parse(new InputSource(new StringReader(inputXML)));
NamespaceContext ctx = new NamespaceContext() {
public String getNamespaceURI(String prefix) {
String uri;
if (prefix.equals("ns0"))
uri = "http://namespace.worldnet.ml.com/EDS/Standards/Common/Service_v1_0/";
else if (prefix.equals("ns1"))
uri = "http://namespace.worldnet.ml.com/EDS/Product/Services/Get_Product_Data_Svc_v3_0/";
else if (prefix.equals("ns2"))
uri = "http://namespace.worldnet.ml.com/DataSOA/Product/Objects/FixedIncome/FixedIncome_v3_0/";
else if (prefix.equals("ns3")) {
uri = "http://namespace.worldnet.ml.com/DataSOA/Product/Objects/Rating/Rating_v2_0/";
} else if (prefix.equals("SOAP-ENV")) {
uri = "http://schemas.xmlsoap.org/soap-envelope/";
} else
uri = null;
return uri;
}
// Dummy implementation - not used!
public Iterator getPrefixes(String val) {
return null;
}
// Dummy implemenation - not used!
public String getPrefix(String uri) {
return null;
}
};
XPathFactory xpathFact = XPathFactory.newInstance();
xPath = xpathFact.newXPath();
xPath.setNamespaceContext(ctx);
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
I don't think the problem is with detecting the namespace definition for the SOAP-ENV prefix. The validator needs the XSD file that defines elements used in that namespace in order to validate the SOAP-ENV:Envelope element, so you need to tell the validator where that schema is located.
I think the solution is either to add the following to your XML reponse:
<SOAP-ENV:Envelope
xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsi="http://schemas.xmlsoap.org/soap/envelope/"
xsi:schemaLocation="http://schemas.xmlsoap.org/soap/envelope/
http://schemas.xmlsoap.org/soap/envelope/">
Or, go download that schema off the web, save it to your local filesystem as an XSD file, and add it to your sourceSchema array. The first method should be preferred as it leads to more portable code (and XML).
Have you tried using the following URI for the SOAP-ENV?
http://schemas.xmlsoap.org/soap/envelope/
The set of schemas you are providing for validation does not include the soap schema. you can either include the soap schema in the schema collection, or, if you don't care about validating the soap wrapper, just grab the actual body content element and run your validation from there.
i have a schema.xsd which includes and modifies xhtml like this:
<xs:redefine schemaLocation="http://www.w3.org/MarkUp/SCHEMA/xhtml11.xsd">
...
</xs:redefine>
Now i have written a Validator, which
reads the schema from xml file
uses a CatalogManager for resolving entities
it works fine as it does not load any files from the net but rather finds xhtml11.xsd as given in my catalog.xml file.
public class XmlTemplateValidator implements TemplateValidator
{
public List<SAXParseException> validate ( String xml ) throws Exception
{
Reader input = new StringReader(xml);
InputSource inputSource = new InputSource(input);
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setNamespaceAware(true);
factory.setValidating(true);
SAXParser parser = factory.newSAXParser();
parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage", "http://www.w3.org/2001/XMLSchema");
XMLReader reader = parser.getXMLReader();
reader.setEntityResolver(new CatalogResolver());
DefaultErrorHandler handler = new DefaultErrorHandler();
reader.setErrorHandler(handler);
reader.parse(inputSource);
return handler.getSaxParseExceptions();
}
}
Now i want exactly the same thing, but i want to give the schema inside my validator (so not let the author say against which schema it should validate, but rather let the validator decide which schema to use.
public class NewXmlTemplateValidator implements TemplateValidator
{
static final String schemaSource = "schema.xsd";
public List<SAXParseException> validate ( String xml ) throws Exception
{
Reader input = new StringReader(xml);
SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
URL schemaUrl = getClass().getResource(schemaSource);
Schema schema = factory.newSchema(schemaUrl);
Validator validator = schema.newValidator();
DefaultErrorHandler handler = new DefaultErrorHandler();
validator.setErrorHandler(handler);
Source source = new StreamSource(input);
validator.validate(source);
return handler.getSaxParseExceptions();
}
}
It works but it does load all xhtml files form the net which takes rather long and is not what i want.
So i want to validate a XML String against a predefined schema with proper entity resolving via a catalog.xml definition.
How can i easily add a CatalogResolver to the second setup?
Add an XMLCatalogResolver in the second example like this:
SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
URL schemaUrl = getClass().getResource(schemaSource);
String catalog = getClass().getResource("/catalog.xml").getFile();
XMLCatalogResolver resolver = new XMLCatalogResolver(new String[] { catalog });
factory.setResourceResolver(resolver);
I know this has been asked multiple times here, but I've a different issue dealing with it. In my case, the app receives a non well-formed dom structure passed as a string. Here's a sample :
<div class='video yt'><div class='yt_url'>http://www.youtube.com/watch?v=U_QLu_Twd0g&feature=abcde_gdata</div></div>
As you can see, the content is not well-formed. Now, if I try to parse using a normal SAX or DOM parse it'll throw an exception which is understood.
org.xml.sax.SAXParseException: The reference to entity "feature" must end with the ';' delimiter.
As per the requirement, I need to read this document,add few additional div tags and send the content back as a string. This works great by using a DOM parser as I can read through the input structure and add additional tags at their required position.
I tried using tools like JTidy to do a pre-processing and then parse, but that results in converting the document to a fully-blown html, which I don't want. Here's a sample code :
StringWriter writer = new StringWriter();
Tidy tidy = new Tidy(); // obtain a new Tidy instance
tidy.setXHTML(true);
tidy.parse(new ByteArrayInputStream(content.getBytes()), writer);
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new ByteArrayInputStream(writer.toString().getBytes()));
// Traverse thru the content and add new tags
....
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
This completely converts the input to a well-formed html document. It then becomes hard to remove html tags manually. The other option I tried was to use SAX2DOM, which too creates a HTML doc. Here's a sample code .
ByteArrayInputStream is = new ByteArrayInputStream(content.getBytes());
Parser p = new Parser();
p.setFeature(IContentExtractionConstant.SAX_NAMESPACE,true);
SAX2DOM sax2dom = new SAX2DOM();
p.setContentHandler(sax2dom);
p.parse(new InputSource(is));
Document doc = (Document)sax2dom.getDOM();
I'll appreciate if someone can share their ideas.
Thanks
The simplest way is replacing xml reserved characters with the corresponding xml entities. You can do this manually:
content.replaceAll("&", "&");
If you don't want to modify your string before parsing it, I could propose you another way using SaxParser, but this solution is more complicated. Basically you have to:
write a LexicalHandler in
combination with ContentHandler
tell the parser to continue its
execution after fatal error (the
ErrorHandler isn't enough)
treat undeclared entities as simple
text
UPDATE
According to your comment, I'm going to add some details regarding the second solution. I've writed a class which extends DefaulHandler (default implementation of EntityResolver, DTDHandler, ContentHandler and ErrorHandler) and implements LexicalHandler. I've extended ErrorHandler's fatalError method (my implementations does nothing instead of throwing the exception) and ContentHandler's characters method which works in combination with startEntity method of LexicalHandler.
public class MyHandler extends DefaultHandler implements LexicalHandler {
private String currentEntity = null;
#Override
public void fatalError(SAXParseException e) throws SAXException {
}
#Override
public void characters(char[] ch, int start, int length)
throws SAXException {
String content = new String(ch, start, length);
if (currentEntity != null) {
content = "&" + currentEntity + content;
currentEntity = null;
}
System.out.print(content);
}
#Override
public void startEntity(String name) throws SAXException {
currentEntity = name;
}
#Override
public void endEntity(String name) throws SAXException {
}
#Override
public void startDTD(String name, String publicId, String systemId)
throws SAXException {
}
#Override
public void endDTD() throws SAXException {
}
#Override
public void startCDATA() throws SAXException {
}
#Override
public void endCDATA() throws SAXException {
}
#Override
public void comment(char[] ch, int start, int length) throws SAXException {
}
}
This is my main which parses your xml not well formed. It's very important the setFeature, because without it the parser throws the SaxParseException despite of the ErrorHandler empty implementation.
public static void main(String[] args) throws ParserConfigurationException,
SAXException, IOException {
String xml = "<div class='video yt'><div class='yt_url'>http://www.youtube.com/watch?v=U_QLu_Twd0g&feature=abcde_gdata</div></div>";
SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
XMLReader xmlReader = saxParser.getXMLReader();
MyHandler myHandler = new MyHandler();
xmlReader.setContentHandler(myHandler);
xmlReader.setErrorHandler(myHandler);
xmlReader.setProperty("http://xml.org/sax/properties/lexical-handler",
myHandler);
xmlReader.setFeature(
"http://apache.org/xml/features/continue-after-fatal-error",
true);
xmlReader.parse(new InputSource(new StringReader(xml)));
}
This main prints out the content of your div element which contains the error:
http://www.youtube.com/watch?v=U_QLu_Twd0g&feature=abcde_gdata
Keep in mind that this is an example which works with your input, maybe you'll have to complete it...for instance if you have some characters correctly escaped you should add some lines of code to handle this situation etc.
Hope this helps.
I'm trying to use JAXB to unmarshal an xml file into objects but have come across a few difficulties. The actual project has a few thousand lines in the xml file so i've reproduced the error on a smaller scale as follows:
The XML file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<catalogue title="some catalogue title"
publisher="some publishing house"
xmlns="x-schema:TamsDataSchema.xml"/>
The XSD file for producing JAXB classes
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="catalogue" type="catalogueType"/>
<xsd:complexType name="catalogueType">
<xsd:sequence>
<xsd:element ref="journal" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="title" type="xsd:string"/>
<xsd:attribute name="publisher" type="xsd:string"/>
</xsd:complexType>
</xsd:schema>
Code snippet 1:
final JAXBContext context = JAXBContext.newInstance(CatalogueType.class);
um = context.createUnmarshaller();
CatalogueType ct = (CatalogueType)um.unmarshal(new File("file output address"));
Which throws the error:
javax.xml.bind.UnmarshalException: unexpected element (uri:"x-schema:TamsDataSchema.xml", local:"catalogue"). Expected elements are <{}catalogue>
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext.handleEvent(UnmarshallingContext.java:642)
at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:247)
at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:242)
at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportUnexpectedChildElement(Loader.java:116)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext$DefaultRootLoader.childElement(UnmarshallingContext.java:1049)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(UnmarshallingContext.java:478)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(UnmarshallingContext.java:459)
at com.sun.xml.bind.v2.runtime.unmarshaller.SAXConnector.startElement(SAXConnector.java:148)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
...etc
So the namespace in the XML document is causing issues, unfortunately if it's removed it works fine, but as the file is supplied by the client we're stuck with it. I've attempted numerous ways of specifying it in the XSD but none of the permutations seem to work.
I also attempted to unmarshal ignoring namespace using the following code:
Unmarshaller um = context.createUnmarshaller();
final SAXParserFactory sax = SAXParserFactory.newInstance();
sax.setNamespaceAware(false);
final XMLReader reader = sax.newSAXParser().getXMLReader();
final Source er = new SAXSource(reader, new InputSource(new FileReader("file location")));
CatalogueType ct = (CatalogueType)um.unmarshal(er);
System.out.println(ct.getPublisher());
System.out.println(ct.getTitle());
which works fine but fails to unmarshal element attributes and prints
null
null
Due to reasons beyond our control we're limited to using Java 1.5 and we're using JAXB 2.0 which is unfortunate because the second code block works as desired using Java 1.6.
any suggestions would be greatly appreciated, the alternative is cutting the namespace declaration out of the file before parsing it which seems inelegant.
Thank you for this post and your code snippet. It definitely put me on the right path as I was also going nuts trying to deal with some vendor-provided XML that had xmlns="http://vendor.com/foo" all over the place.
My first solution (before I read your post) was to take the XML in a String, then xmlString.replaceAll(" xmlns=", " ylmns="); (the horror, the horror). Besides offending my sensibility, in was a pain when processing XML from an InputStream.
My second solution, after looking at your code snippet: (I'm using Java7)
// given an InputStream inputStream:
String packageName = docClass.getPackage().getName();
JAXBContext jc = JAXBContext.newInstance(packageName);
Unmarshaller u = jc.createUnmarshaller();
InputSource is = new InputSource(inputStream);
final SAXParserFactory sax = SAXParserFactory.newInstance();
sax.setNamespaceAware(false);
final XMLReader reader;
try {
reader = sax.newSAXParser().getXMLReader();
} catch (SAXException | ParserConfigurationException e) {
throw new RuntimeException(e);
}
SAXSource source = new SAXSource(reader, is);
#SuppressWarnings("unchecked")
JAXBElement<T> doc = (JAXBElement<T>)u.unmarshal(source);
return doc.getValue();
But now, I found a third solution which I like much better, and hopefully that might be useful to others: How to define properly the expected namespace in the schema:
<xsd:schema jxb:version="2.0"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
xmlns="http://vendor.com/foo"
targetNamespace="http://vendor.com/foo"
elementFormDefault="unqualified"
attributeFormDefault="unqualified">
With that, we can now remove the sax.setNamespaceAware(false); line (update: actually, if we keep the unmarshal(SAXSource) call, then we need to sax.setNamespaceAware(true). But the simpler way is to not bother with SAXSource and the code surrounding its creation and instead unmarshal(InputStream) which by default is namespace-aware. And the ouput of a marshal() also has the proper namespace too.
Yeh. Only about 4 hours down the drain.
How to ignore the namespaces
You can use an XMLStreamReader that is non-namespace aware, it will basically trim out all namespaces from the xml file that you're parsing:
// configure the stream reader factory
XMLInputFactory xif = XMLInputFactory.newFactory();
xif.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); // this is the magic line
// create xml stream reader using our configured factory
StreamSource source = new StreamSource(someFile);
XMLStreamReader xsr = xif.createXMLStreamReader(source);
// unmarshall, note that it's better to reuse JAXBContext, as newInstance()
// calls are pretty expensive
JAXBContext jc = JAXBContext.newInstance(your.ObjectFactory.class);
Unmarshaller unmarshaller = jc.createUnmarshaller();
Object unmarshal = unmarshaller.unmarshal(xsr);
Now the actual xml that gets fed into JAXB doesn't have any namespace info.
Important note (xjc)
If you generated java classes from an xsd schema using xjc and the schema had a namespace defined, then the generated annotations will have that namespace, so delete it manually! Otherwise JAXB won't recognize such data.
Places where the annotations should be changed:
ObjectFactory.java
// change this line
private final static QName _SomeType_QNAME = new QName("some-weird-namespace", "SomeType");
// to something like
private final static QName _SomeType_QNAME = new QName("", "SomeType", "");
// and this annotation
#XmlElementDecl(namespace = "some-weird-namespace", name = "SomeType")
// to this
#XmlElementDecl(namespace = "", name = "SomeType")
package-info.java
// change this annotation
#javax.xml.bind.annotation.XmlSchema(namespace = "some-weird-namespace", elementFormDefault = javax.xml.bind.annotation.XmlNsForm.QUALIFIED)
// to something like this
#javax.xml.bind.annotation.XmlSchema(namespace = "", elementFormDefault = javax.xml.bind.annotation.XmlNsForm.QUALIFIED)
Now your JAXB code will expect to see everything without any namespaces and the XMLStreamReader that we created supplies just that.
Here is my solution for this Namespace related issue. We can trick JAXB by implementing our own XMLFilter and Attribute.
class MyAttr extends AttributesImpl {
MyAttr(Attributes atts) {
super(atts);
}
#Override
public String getLocalName(int index) {
return super.getQName(index);
}
}
class MyFilter extends XMLFilterImpl {
#Override
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
super.startElement(uri, localName, qName, new VersAttr(atts));
}
}
public SomeObject testFromXML(InputStream input) {
try {
// Create the JAXBContext
JAXBContext jc = JAXBContext.newInstance(SomeObject.class);
// Create the XMLFilter
XMLFilter filter = new VersFilter();
// Set the parent XMLReader on the XMLFilter
SAXParserFactory spf = SAXParserFactory.newInstance();
//spf.setNamespaceAware(false);
SAXParser sp = spf.newSAXParser();
XMLReader xr = sp.getXMLReader();
filter.setParent(xr);
// Set UnmarshallerHandler as ContentHandler on XMLFilter
Unmarshaller unmarshaller = jc.createUnmarshaller();
UnmarshallerHandler unmarshallerHandler = unmarshaller
.getUnmarshallerHandler();
filter.setContentHandler(unmarshallerHandler);
// Parse the XML
InputSource is = new InputSource(input);
filter.parse(is);
return (SomeObject) unmarshallerHandler.getResult();
}catch (Exception e) {
logger.debug(ExceptionUtils.getFullStackTrace(e));
}
return null;
}
There is a workaround for this issue explained in this post: JAXB: How to ignore namespace during unmarshalling XML document?. It explains how to dynamically add/remove xmlns entries from XML using a SAX Filter. Handles marshalling and unmarshalling alike.