JAXB unmarshalling ignoring namespace turns element attributes into null - java

I'm trying to use JAXB to unmarshal an xml file into objects but have come across a few difficulties. The actual project has a few thousand lines in the xml file so i've reproduced the error on a smaller scale as follows:
The XML file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<catalogue title="some catalogue title"
publisher="some publishing house"
xmlns="x-schema:TamsDataSchema.xml"/>
The XSD file for producing JAXB classes
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="catalogue" type="catalogueType"/>
<xsd:complexType name="catalogueType">
<xsd:sequence>
<xsd:element ref="journal" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
<xsd:attribute name="title" type="xsd:string"/>
<xsd:attribute name="publisher" type="xsd:string"/>
</xsd:complexType>
</xsd:schema>
Code snippet 1:
final JAXBContext context = JAXBContext.newInstance(CatalogueType.class);
um = context.createUnmarshaller();
CatalogueType ct = (CatalogueType)um.unmarshal(new File("file output address"));
Which throws the error:
javax.xml.bind.UnmarshalException: unexpected element (uri:"x-schema:TamsDataSchema.xml", local:"catalogue"). Expected elements are <{}catalogue>
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext.handleEvent(UnmarshallingContext.java:642)
at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:247)
at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportError(Loader.java:242)
at com.sun.xml.bind.v2.runtime.unmarshaller.Loader.reportUnexpectedChildElement(Loader.java:116)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext$DefaultRootLoader.childElement(UnmarshallingContext.java:1049)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(UnmarshallingContext.java:478)
at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(UnmarshallingContext.java:459)
at com.sun.xml.bind.v2.runtime.unmarshaller.SAXConnector.startElement(SAXConnector.java:148)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl$NSContentDispatcher.scanRootElementHook(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
...etc
So the namespace in the XML document is causing issues, unfortunately if it's removed it works fine, but as the file is supplied by the client we're stuck with it. I've attempted numerous ways of specifying it in the XSD but none of the permutations seem to work.
I also attempted to unmarshal ignoring namespace using the following code:
Unmarshaller um = context.createUnmarshaller();
final SAXParserFactory sax = SAXParserFactory.newInstance();
sax.setNamespaceAware(false);
final XMLReader reader = sax.newSAXParser().getXMLReader();
final Source er = new SAXSource(reader, new InputSource(new FileReader("file location")));
CatalogueType ct = (CatalogueType)um.unmarshal(er);
System.out.println(ct.getPublisher());
System.out.println(ct.getTitle());
which works fine but fails to unmarshal element attributes and prints
null
null
Due to reasons beyond our control we're limited to using Java 1.5 and we're using JAXB 2.0 which is unfortunate because the second code block works as desired using Java 1.6.
any suggestions would be greatly appreciated, the alternative is cutting the namespace declaration out of the file before parsing it which seems inelegant.

Thank you for this post and your code snippet. It definitely put me on the right path as I was also going nuts trying to deal with some vendor-provided XML that had xmlns="http://vendor.com/foo" all over the place.
My first solution (before I read your post) was to take the XML in a String, then xmlString.replaceAll(" xmlns=", " ylmns="); (the horror, the horror). Besides offending my sensibility, in was a pain when processing XML from an InputStream.
My second solution, after looking at your code snippet: (I'm using Java7)
// given an InputStream inputStream:
String packageName = docClass.getPackage().getName();
JAXBContext jc = JAXBContext.newInstance(packageName);
Unmarshaller u = jc.createUnmarshaller();
InputSource is = new InputSource(inputStream);
final SAXParserFactory sax = SAXParserFactory.newInstance();
sax.setNamespaceAware(false);
final XMLReader reader;
try {
reader = sax.newSAXParser().getXMLReader();
} catch (SAXException | ParserConfigurationException e) {
throw new RuntimeException(e);
}
SAXSource source = new SAXSource(reader, is);
#SuppressWarnings("unchecked")
JAXBElement<T> doc = (JAXBElement<T>)u.unmarshal(source);
return doc.getValue();
But now, I found a third solution which I like much better, and hopefully that might be useful to others: How to define properly the expected namespace in the schema:
<xsd:schema jxb:version="2.0"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
xmlns="http://vendor.com/foo"
targetNamespace="http://vendor.com/foo"
elementFormDefault="unqualified"
attributeFormDefault="unqualified">
With that, we can now remove the sax.setNamespaceAware(false); line (update: actually, if we keep the unmarshal(SAXSource) call, then we need to sax.setNamespaceAware(true). But the simpler way is to not bother with SAXSource and the code surrounding its creation and instead unmarshal(InputStream) which by default is namespace-aware. And the ouput of a marshal() also has the proper namespace too.
Yeh. Only about 4 hours down the drain.

How to ignore the namespaces
You can use an XMLStreamReader that is non-namespace aware, it will basically trim out all namespaces from the xml file that you're parsing:
// configure the stream reader factory
XMLInputFactory xif = XMLInputFactory.newFactory();
xif.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, false); // this is the magic line
// create xml stream reader using our configured factory
StreamSource source = new StreamSource(someFile);
XMLStreamReader xsr = xif.createXMLStreamReader(source);
// unmarshall, note that it's better to reuse JAXBContext, as newInstance()
// calls are pretty expensive
JAXBContext jc = JAXBContext.newInstance(your.ObjectFactory.class);
Unmarshaller unmarshaller = jc.createUnmarshaller();
Object unmarshal = unmarshaller.unmarshal(xsr);
Now the actual xml that gets fed into JAXB doesn't have any namespace info.
Important note (xjc)
If you generated java classes from an xsd schema using xjc and the schema had a namespace defined, then the generated annotations will have that namespace, so delete it manually! Otherwise JAXB won't recognize such data.
Places where the annotations should be changed:
ObjectFactory.java
// change this line
private final static QName _SomeType_QNAME = new QName("some-weird-namespace", "SomeType");
// to something like
private final static QName _SomeType_QNAME = new QName("", "SomeType", "");
// and this annotation
#XmlElementDecl(namespace = "some-weird-namespace", name = "SomeType")
// to this
#XmlElementDecl(namespace = "", name = "SomeType")
package-info.java
// change this annotation
#javax.xml.bind.annotation.XmlSchema(namespace = "some-weird-namespace", elementFormDefault = javax.xml.bind.annotation.XmlNsForm.QUALIFIED)
// to something like this
#javax.xml.bind.annotation.XmlSchema(namespace = "", elementFormDefault = javax.xml.bind.annotation.XmlNsForm.QUALIFIED)
Now your JAXB code will expect to see everything without any namespaces and the XMLStreamReader that we created supplies just that.

Here is my solution for this Namespace related issue. We can trick JAXB by implementing our own XMLFilter and Attribute.
class MyAttr extends AttributesImpl {
MyAttr(Attributes atts) {
super(atts);
}
#Override
public String getLocalName(int index) {
return super.getQName(index);
}
}
class MyFilter extends XMLFilterImpl {
#Override
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
super.startElement(uri, localName, qName, new VersAttr(atts));
}
}
public SomeObject testFromXML(InputStream input) {
try {
// Create the JAXBContext
JAXBContext jc = JAXBContext.newInstance(SomeObject.class);
// Create the XMLFilter
XMLFilter filter = new VersFilter();
// Set the parent XMLReader on the XMLFilter
SAXParserFactory spf = SAXParserFactory.newInstance();
//spf.setNamespaceAware(false);
SAXParser sp = spf.newSAXParser();
XMLReader xr = sp.getXMLReader();
filter.setParent(xr);
// Set UnmarshallerHandler as ContentHandler on XMLFilter
Unmarshaller unmarshaller = jc.createUnmarshaller();
UnmarshallerHandler unmarshallerHandler = unmarshaller
.getUnmarshallerHandler();
filter.setContentHandler(unmarshallerHandler);
// Parse the XML
InputSource is = new InputSource(input);
filter.parse(is);
return (SomeObject) unmarshallerHandler.getResult();
}catch (Exception e) {
logger.debug(ExceptionUtils.getFullStackTrace(e));
}
return null;
}

There is a workaround for this issue explained in this post: JAXB: How to ignore namespace during unmarshalling XML document?. It explains how to dynamically add/remove xmlns entries from XML using a SAX Filter. Handles marshalling and unmarshalling alike.

Related

JAXB ~ Dynamically parse multiple namespaces

I am trying to create an unmarshaller that will work for the following XML files:
<?xml version="1.0" encoding="UTF-8"?>
<REQ-IF xmlns="http://www.omg.org/spec/ReqIF/20110401/reqif.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.omg.org/spec/ReqIF/20110401/reqif.xsd
xml:lang="en">
[...]
</REQ-IF>
<?xml version="1.0" encoding="UTF-8"?>
<REQ-IF xmlns="http://www.omg.org/spec/ReqIF/20110401/reqif.xsd"
xmlns:configuration="http://eclipse.org/rmf/pror/toolextensions/1.0"
xmlns:id="http://pror.org/presentation/id"
xmlns:xhtml="http://www.w3.org/1999/xhtml">
[...]
</REQ-IF>
<?xml version="1.0" encoding="UTF-8"?>
<REQ-IF xmlns="http://www.omg.org/spec/ReqIF/20110401/reqif.xsd"
xmlns:doors="http://www.ibm.com/rdm/doors/REQIF/xmlns/1.0"
xmlns:reqif="http://www.omg.org/spec/ReqIF/20110401/reqif.xsd"
xmlns:reqif-common="http://www.prostep.org/reqif"
xmlns:reqif-xhtml="http://www.w3.org/1999/xhtml"
xmlns:rm="http://www.ibm.com/rm"
xmlns:rm-reqif="http://www.ibm.com/rm/reqif"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
[...]
</REQ-IF>
All those files are structurally the same and are based of the same top-level namespace, but also contain a variety of variable sub-level namespaces and other "things" (which by my understanding should be attributes, but are not), which need to be saved in the system.
Thus far, I have managed to get to the point where this much is saved:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<REQ-IF xmlns="http://www.omg.org/spec/ReqIF/20110401/reqif.xsd">
[...]
</REQ-IF>
however, my intended result would look like this:
<?xml version="1.0" encoding="UTF-8"?>
<REQ-IF xmlns="http://www.omg.org/spec/ReqIF/20110401/reqif.xsd"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.omg.org/spec/ReqIF/20110401/reqif.xsd
xml:lang="en">
[...]
</REQ-IF>
So the top-level namespace is saved, but the sub-level namespaces and other "things" are lost in the import/export process. This is bad.
How can I save those other sub-namespaces and other "things", considering that they are dynamically generated?
Basically, what I want to say is "save all these extra attributes in any way you like while parsing the XML, and once you export the XML again, re-write them exactly as they were".
If your main use case is to read ReqIF files, consider that there is an open source implementation of a ReqIF (de)serializer at https://www.eclipse.org/rmf/
Unfortunately it seems that JAXB alone is not capable of manage all the namespace prefixes dynamically and you need to combine it with another parsing mechanism.
I would try to implement something like this (only rough implementation, details below):
public class MyXmlHandler {
XMLInputFactory xif = XMLInputFactory.newInstance();
XMLOutputFactory xof = XMLOutputFactory.newInstance();
XMLEventFactory xef = XMLEventFactory.newInstance();
/**
* Retrieve XMLEvent for root element
*/
public StartElement getStartElement(String source) throws XMLStreamException {
XMLEvent event;
XMLEventReader reader = xif.createXMLEventReader(new StringReader(source));
while (reader.hasNext()) {
event = reader.nextEvent();
if (event.isStartElement()) {
return event.asStartElement();
}
// alternativery you can retrieve here also QNames for first level child elements
// and return all this data in some synthetic wrapper class
}
return null; // alternatively throw an exception
}
/**
* Write root element, than some content from JAXB elements, than end element
*/
public void write(
Marshaller marshaller,
Writer writer,
StartElement root,
List<JAXBElement> elements
) throws JAXBException, XMLStreamException {
XMLEventWriter xew = xof.createXMLEventWriter(writer);
marshaller.setProperty(Marshaller.JAXB_FRAGMENT, true);
xew.add(root);
for(JAXBElement element : elements) {
marshaller.marshal(element, xew);
}
xew.add(xef.createEndElement(root.getName(), root.getNamespaces()));
xew.close();
}
}
And use it like this:
// create JAXB context and unmarshaller
JAXBContext ctx = JAXBContext.newInstance(RootClass.class);
Unmarshaller unmarshaller = ctx.createUnmarshaller();
// unmarshall XML
JAXBElement<RootClass> element = unmarshaller.unmarshal(source, RootClass.class);
RootClass rootValue = element.getValue();
// extract root element data from XML
StartElement root = handler.getStartElement(data);
// perform some business logic
// create marshaller
Marshaller marshaller = ctx.createMarshaller();
// create list of JAXBElements for root children
List<JAXBElement> elements = new ArrayList<>();
QName qname = ... // construct qualified name or retrieve it from saved structure
// very schematic, names of the child elements depend on your implementation
elements.add(new JAXBElement(qname , rootValue.getChild().getClass(), rootValue.getChild()));
handler.write(marshaller, writer, root, elements);
When preserving root element data, you can save also QNames for its children in some wrapper class. So far I see the REQ-IF structure contains header, core content and tool extensions. You could save QNames for all of them and then use them for constructing JAXB elements during marshalling process.

Java Unmarshalling issue (jaxb) [duplicate]

I am using JAXB to parse xml elements from the SOAP response. I have defined POJO classes for the xml elements. I have tested pojo classes without namespace and prefix its working fine .Though when i am trying to parse with namespaces and prefix facing the following exception.Requirement is to parse the input from SOAPMessage Object
javax.xml.bind.UnmarshalException: unexpected element (uri:"http://schemas.xmlsoap.org/soap/envelope/", local:"Envelope"). Expected elements are <{}Envelope>
Tried to fix by creating #XMLSchema for the package in package-info.java and located this file in package folder.Can any one guide me to move forward?
Referred this posts but didn help me .
EDITED :XMLSchema
#javax.xml.bind.annotation.XmlSchema (
xmlns = { #javax.xml.bind.annotation.XmlNs(prefix = "env",
namespaceURI="http://schemas.xmlsoap.org/soap/envelope/"),
#javax.xml.bind.annotation.XmlNs(prefix="ns3", namespaceURI="http://www.xxxx.com/ncp/oomr/dto/")
}
)
package com.one.two;
Thanks in advance
This can be done without modifying the generated JAXB code using standard SOAPMessage class. I wrote about this here and here
It's a little fiddly but works correctly.
Marshalling
Farm farm = new Farm();
farm.getHorse().add(new Horse());
farm.getHorse().get(0).setName("glue factory");
farm.getHorse().get(0).setHeight(BigInteger.valueOf(123));
Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
Marshaller marshaller = JAXBContext.newInstance(Farm.class).createMarshaller();
marshaller.marshal(farm, document);
SOAPMessage soapMessage = MessageFactory.newInstance().createMessage();
soapMessage.getSOAPBody().addDocument(document);
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
soapMessage.writeTo(outputStream);
String output = new String(outputStream.toByteArray());
Unmarshalling
String example =
"<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\"><soapenv:Header /><soapenv:Body><ns2:farm xmlns:ns2=\"http://adamish.com/example/farm\"><horse height=\"123\" name=\"glue factory\"/></ns2:farm></soapenv:Body></soapenv:Envelope>";
SOAPMessage message = MessageFactory.newInstance().createMessage(null,
new ByteArrayInputStream(example.getBytes()));
Unmarshaller unmarshaller = JAXBContext.newInstance(Farm.class).createUnmarshaller();
Farm farm = (Farm)unmarshaller.unmarshal(message.getSOAPBody().extractContentAsDocument());
Here is how you can handle your use cae:
If You Need to Map the Envelope Element
package-info
Typically you would use the #XmlSchema as follows. Using the namespace and elementFormDefault properties like I've done means that all data mapped to XML elements unless otherwise mapped will belong to the http://www.xxxx.com/ncp/oomr/dto/ namespace. The information specified in xmlns is for XML schema generation altough some JAXB implementations use this to determine the preferred prefix for a namespace when marshalling (see: http://blog.bdoughan.com/2011/11/jaxb-and-namespace-prefixes.html).
#XmlSchema (
namespace="http://www.xxxx.com/ncp/oomr/dto/",
elementFormDefault=XmlNsForm.QUALIFIED,
xmlns = {
#XmlNs(prefix = "env", namespaceURI="http://schemas.xmlsoap.org/soap/envelope/"),
#XmlNs(prefix="whatever", namespaceURI="http://www.xxxx.com/ncp/oomr/dto/")
}
)
package com.one.two;
import javax.xml.bind.annotation.*;
Envelope
If within the com.one.two you need to map to elements from a namespace other than http://www.xxxx.com/ncp/oomr/dto/ then you need to specify it in the #XmlRootElement and #XmlElement annotations.
package com.one.two;
import javax.xml.bind.annotation.*;
#XmlRootElement(name="Envelope", namespace="http://schemas.xmlsoap.org/soap/envelope/")
#XmlAccessorType(XmlAccessType.FIELD)
public class Envelope {
#XmlElement(name="Body", namespace="http://schemas.xmlsoap.org/soap/envelope/")
private Body body;
}
For More Information
http://blog.bdoughan.com/2010/08/jaxb-namespaces.html
If You Just Want to Map the Body
You can use a StAX parser to parse the message and advance to the payload portion and unmarshal from there:
import javax.xml.bind.*;
import javax.xml.stream.*;
import javax.xml.transform.stream.StreamSource;
public class UnmarshalDemo {
public static void main(String[] args) throws Exception {
XMLInputFactory xif = XMLInputFactory.newFactory();
StreamSource xml = new StreamSource("src/blog/stax/middle/input.xml");
XMLStreamReader xsr = xif.createXMLStreamReader(xml);
xsr.nextTag();
while(!xsr.getLocalName().equals("return")) {
xsr.nextTag();
}
JAXBContext jc = JAXBContext.newInstance(Customer.class);
Unmarshaller unmarshaller = jc.createUnmarshaller();
JAXBElement<Customer> jb = unmarshaller.unmarshal(xsr, Customer.class);
xsr.close();
}
}
For More Information
http://blog.bdoughan.com/2012/08/handle-middle-of-xml-document-with-jaxb.html
Just wanted to add onto the existing answers -- while unmarshalling if the XML document is not namespace aware you might receive an error: javax.xml.bind.UnmarshalException: unexpected element (uri:"http://some.url";, local:"someOperation")
If this is the case you can simply use a different method on the unmarshaller:
Unmarshaller unmarshaller = JAXBContext.newInstance(YourObject.class).createUnmarshaller();
JAXBElement<YourObject> element = unmarshaller.unmarshal(message.getSOAPBody().extractContentAsDocument(), YourObject.class);
YourObject yo = element.getValue();

Jaxb ignore the namespace on unmarshalling

I use Jaxb2 and Spring. I am trying to unmarshal some XML that are sent by 2 of my customers.
Up to now, I only had to handle one customer which sent some xml like this :
<foo xmlns="com.acme">
<bar>[...]</bar>
<foo>
that is bound to a POJO like this :
#XmlType(name = "", propOrder = {"bar"})
#XmlRootElement(name = "Foo")
public class Foo {
#XmlElement(name = "Bar")
private String bar;
[...]
}
I discovered that the previous developer hardcoded the namespace in the unmarshaller in order to make it work.
Now, the second customer sends the same XML but changes the namespace!
<foo xmlns="com.xyz">
<bar>[...]</bar>
<foo>
Obviously, the unmarshaller fails to unmarshall because it expects some {com.acme}foo instead of {com.xyz}foo. Unforunately, asking the customer to change the XML is not an option.
What I tried :
1) In application-context.xml, I searched for a configuration which would allow me to ignore the namespace but could not find one :
<bean id="marshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="packagesToScan">
<list>
<value>com.mycompany.mypkg</value>
</list>
</property>
<property name="marshallerProperties">
<map>
<entry key="???"><value type="java.lang.Boolean">false</value></entry>
</map>
</property>
</bean>
it seems that the only available options are the ones listed in the Jaxb2Marshaller's Javadoc :
/**
* Set the JAXB {#code Marshaller} properties. These properties will be set on the
* underlying JAXB {#code Marshaller}, and allow for features such as indentation.
* #param properties the properties
* #see javax.xml.bind.Marshaller#setProperty(String, Object)
* #see javax.xml.bind.Marshaller#JAXB_ENCODING
* #see javax.xml.bind.Marshaller#JAXB_FORMATTED_OUTPUT
* #see javax.xml.bind.Marshaller#JAXB_NO_NAMESPACE_SCHEMA_LOCATION
* #see javax.xml.bind.Marshaller#JAXB_SCHEMA_LOCATION
*/
public void setMarshallerProperties(Map<String, ?> properties) {
this.marshallerProperties = properties;
}
2) I also tried to configure the unmarshaller in the code :
try {
jc = JAXBContext.newInstance("com.mycompany.mypkg");
Unmarshaller u = jc.createUnmarshaller();
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(false);//Tried this option.
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(xmlFile.toFile());
u.unmarshal(new DOMSource(doc));
return (Foo)u.unmarshal(new StreamSource(xmlFile.toFile()));
} catch (ParserConfigurationException | SAXException | IOException | JAXBException e) {
LOGGER.error("Erreur Unmarshalling CPL");
}
3) Different form with a SAXParser :
try {
jc = JAXBContext.newInstance("com.mycompany.mypkg");
Unmarshaller um = jc.createUnmarshaller();
final SAXParserFactory sax = SAXParserFactory.newInstance();
sax.setNamespaceAware(false);
final XMLReader reader = sax.newSAXParser().getXMLReader();
final Source er = new SAXSource(reader, new InputSource(new FileReader(xmlFile.toFile())));
return (Foo)um.unmarshal(er);
}catch(...) {[...]}
This one works! But still, I would prefer to be able to autowire the Unmarshaller without needing this ugly conf everytime.
Namesapce awareness is feature of the document reader/builder/parser not marshallers. XML elements from different namespaces represents different entities == objects, so marshallers cannot ignore them.
You correctly switched off the namespaces in your SAX reader and as you said it worked. I don't understand your problem with it, your marshaller still can be injected, the difference is in obtaining the input data.
The same trick with document builder should also work (I will test it later on), I suspect that you were still using the marshaller with "hardcoded" namespace but your document was namespace free.
In my project I use XSLT to solve similar issue. Setting namespace awarness is definitely easier solution. But, with XSLT I could selectviely remove only some namespaces and also my my input xml are not always identical (ignoring namespaces) and sometimes I have to rename few elements so XSLT gives me this extra flexibility.
To remove namespaces you can use such xslt template:
<xsl:stylesheet version="1.0" xmlns:e="http://timet.dom.robust.ed" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<xsl:template match="/">
<xsl:copy>
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="#* | node()" />
</xsl:element>
</xsl:template>
<xsl:template match="#*">
<xsl:attribute name="{local-name()}">
<xsl:value-of select="."/>
</xsl:attribute>
</xsl:template>
<xsl:template match="text() | processing-instruction() | comment()">
<xsl:copy />
</xsl:template>
</xsl:stylesheet>
Then in Java before unmarshalling I transform the input data:
Transformer transformer = TransformerFactory.newInstance().newTransformer(stylesource);
Source source = new DOMSource(xml);
DOMResult result = new DOMResult();
transformer.transform(source, result);
Thank you all, here shared my solution which works for my code , i try to make it generic every namespace contain ": " i write code if any tag have ":" it will remove from xml , This is used to skip namespace during unmarshalling using jaxb.
public class NamespaceFilter {
private NamespaceFilter() {
}
private static final String COLON = ":";
public static XMLReader nameSpaceFilter() throws SAXException {
XMLReader xr = new XMLFilterImpl(XMLReaderFactory.createXMLReader()) {
private boolean skipNamespace;
#Override
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
if (qName.indexOf(COLON) > -1) {
skipNamespace = true;
} else {
skipNamespace = false;
super.startElement("", localName, qName, atts);
}
}
#Override
public void endElement(String uri, String localName, String qName) throws SAXException {
if (qName.indexOf(COLON) > -1) {
skipNamespace = true;
} else {
skipNamespace = false;
super.endElement("", localName, qName);
}
}
#Override
public void characters(char[] ch, int start, int length) throws SAXException {
if (!skipNamespace) {
super.characters(ch, start, length);
}
}
};
return xr;
}
}
for unmarshalling
XMLReader xr = NamespaceFilter.nameSpaceFilter();
Source src = new SAXSource(xr, new InputSource("C:\\Users\\binal\\Desktop\\response.xml"));
StringWriter sw = new StringWriter();
Result res = new StreamResult(sw);
TransformerFactory.newInstance().newTransformer().transform(src, res);
JAXBContext jc = JAXBContext.newInstance(Tab.class);
Unmarshaller u = jc.createUnmarshaller();
String done = sw.getBuffer().toString();
StringReader reader = new StringReader(done);
Tab tab = (Tab) u.unmarshal(reader);
System.out.println(tab);
`
I've followed Java docs to handle namespace in xml while unmarshalling, it did the trick.
JAXBContext jc = JAXBContext.newInstance( "com.acme.foo" );
Unmarshaller u = jc.createUnmarshaller();
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(false);
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(new File( "nosferatu.xml"));
//If just a string
//InputSource is = new InputSource(new StringReader(line));
//Document doc = db.parse(is)
Object o = u.unmarshal( doc );
Used Java API JAXB Context and Unmarshal features.
https://docs.oracle.com/javase/8/docs/api/javax/xml/bind/Unmarshaller.html

SAXException when parsing XML file with XSD schema

I have the following XSD file:
<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
targetNamespace='http://www.wvf.com/schemas'
xmlns='http://www.wvf.com/schemas'
xmlns:acmewvf='http://www.wvf.com/schemas'>
<xs:element name='loft'>
</xs:element>
</xs:schema>
and the following XML file:
<?xml version="1.0"?>
<acmewvf:loft xmlns:acmewvf="http://www.wvf.com/schemas"
xmlns="http://www.wvf.com/schemas">
</acmewvf:loft>
When I execute the following Java code:
public void parse(InputStream constraints) {
final SchemaFactory schemaFactory = new XMLSchemaFactory();
final URL resource =
ClassLoader.getSystemClassLoader().getResource(SCHEMA_PATH);
final DocumentBuilderFactory factory =
DocumentBuilderFactory.newInstance();
Document doc = null;
factory.setSchema(schemaFactory.newSchema(resource));
final DocumentBuilder builder = factory.newDocumentBuilder();
doc = builder.parse(constraints);
I get the following SAXException (on the last line of the code):
cvc-elt.1: Cannot find the declaration
of element 'acmewvf:loft'.
(Note that SCHEMA_PATH points to the XSD file whose contents are given above and constraints is an input stream to the XML file whose contents are also given above.)
What's going wrong here?
See Using the Validating Parser. Probably, you should try to add the following to generate a namespace-aware, validating parser:
factory.setNamespaceAware(true);
factory.setValidating(true);
try {
factory.setAttribute(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
}
catch (IllegalArgumentException x) {
// Happens if the parser does not support JAXP 1.2
...
}
Don't forget to define:
static final String JAXP_SCHEMA_LANGUAGE =
"http://java.sun.com/xml/jaxp/properties/schemaLanguage";
static final String W3C_XML_SCHEMA =
"http://www.w3.org/2001/XMLSchema";

Validate an XML File Against Multiple Schema Definitions

I'm trying to validate an XML file against a number of different schemas (apologies for the contrived example):
a.xsd
b.xsd
c.xsd
c.xsd in particular imports b.xsd and b.xsd imports a.xsd, using:
<xs:include schemaLocation="b.xsd"/>
I'm trying to do this via Xerces in the following manner:
XMLSchemaFactory xmlSchemaFactory = new XMLSchemaFactory();
Schema schema = xmlSchemaFactory.newSchema(new StreamSource[] { new StreamSource(this.getClass().getResourceAsStream("a.xsd"), "a.xsd"),
new StreamSource(this.getClass().getResourceAsStream("b.xsd"), "b.xsd"),
new StreamSource(this.getClass().getResourceAsStream("c.xsd"), "c.xsd")});
Validator validator = schema.newValidator();
validator.validate(new StreamSource(new StringReader(xmlContent)));
but this is failing to import all three of the schemas correctly resulting in cannot resolve the name 'blah' to a(n) 'group' component.
I've validated this successfully using Python, but having real problems with Java 6.0 and Xerces 2.8.1. Can anybody suggest what's going wrong here, or an easier approach to validate my XML documents?
So just in case anybody else runs into the same issue here, I needed to load a parent schema (and implicit child schemas) from a unit test - as a resource - to validate an XML String. I used the Xerces XMLSchemFactory to do this along with the Java 6 validator.
In order to load the child schema's correctly via an include I had to write a custom resource resolver. Code can be found here:
https://code.google.com/p/xmlsanity/source/browse/src/com/arc90/xmlsanity/validation/ResourceResolver.java
To use the resolver specify it on the schema factory:
xmlSchemaFactory.setResourceResolver(new ResourceResolver());
and it will use it to resolve your resources via the classpath (in my case from src/main/resources). Any comments are welcome on this...
http://www.kdgregory.com/index.php?page=xml.parsing
section 'Multiple schemas for a single document'
My solution based on that document:
URL xsdUrlA = this.getClass().getResource("a.xsd");
URL xsdUrlB = this.getClass().getResource("b.xsd");
URL xsdUrlC = this.getClass().getResource("c.xsd");
SchemaFactory schemaFactory = schemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
//---
String W3C_XSD_TOP_ELEMENT =
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
+ "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\" elementFormDefault=\"qualified\">\n"
+ "<xs:include schemaLocation=\"" +xsdUrlA.getPath() +"\"/>\n"
+ "<xs:include schemaLocation=\"" +xsdUrlB.getPath() +"\"/>\n"
+ "<xs:include schemaLocation=\"" +xsdUrlC.getPath() +"\"/>\n"
+"</xs:schema>";
Schema schema = schemaFactory.newSchema(new StreamSource(new StringReader(W3C_XSD_TOP_ELEMENT), "xsdTop"));
The schema stuff in Xerces is (a) very, very pedantic, and (b) gives utterly useless error messages when it doesn't like what it finds. It's a frustrating combination.
The schema stuff in python may be a lot more forgiving, and was letting small errors in the schema go past unreported.
Now if, as you say, c.xsd includes b.xsd, and b.xsd includes a.xsd, then there's no need to load all three into the schema factory. Not only is it unnecessary, it will likely confuse Xerces and result in errors, so this may be your problem. Just pass c.xsd to the factory, and let it resolve b.xsd and a.xsd itself, which it should do relative to c.xsd.
From the xerces documentation :
http://xerces.apache.org/xerces2-j/faq-xs.html
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
...
StreamSource[] schemaDocuments = /* created by your application */;
Source instanceDocument = /* created by your application */;
SchemaFactory sf = SchemaFactory.newInstance(
"http://www.w3.org/XML/XMLSchema/v1.1");
Schema s = sf.newSchema(schemaDocuments);
Validator v = s.newValidator();
v.validate(instanceDocument);
I faced the same problem and after investigating found this solution. It works for me.
Enum to setup the different XSDs:
public enum XsdFile {
// #formatter:off
A("a.xsd"),
B("b.xsd"),
C("c.xsd");
// #formatter:on
private final String value;
private XsdFile(String value) {
this.value = value;
}
public String getValue() {
return this.value;
}
}
Method to validate:
public static void validateXmlAgainstManyXsds() {
final SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
String xmlFile;
xmlFile = "example.xml";
// Use of Enum class in order to get the different XSDs
Source[] sources = new Source[XsdFile.class.getEnumConstants().length];
for (XsdFile xsdFile : XsdFile.class.getEnumConstants()) {
sources[xsdFile.ordinal()] = new StreamSource(xsdFile.getValue());
}
try {
final Schema schema = schemaFactory.newSchema(sources);
final Validator validator = schema.newValidator();
System.out.println("Validating " + xmlFile + " against XSDs " + Arrays.toString(sources));
validator.validate(new StreamSource(new File(xmlFile)));
} catch (Exception exception) {
System.out.println("ERROR: Unable to validate " + xmlFile + " against XSDs " + Arrays.toString(sources)
+ " - " + exception);
}
System.out.println("Validation process completed.");
}
I ended up using this:
import org.apache.xerces.parsers.SAXParser;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;
import java.io.IOException;
.
.
.
try {
SAXParser parser = new SAXParser();
parser.setFeature("http://xml.org/sax/features/validation", true);
parser.setFeature("http://apache.org/xml/features/validation/schema", true);
parser.setFeature("http://apache.org/xml/features/validation/schema-full-checking", true);
parser.setProperty("http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation", "http://your_url_schema_location");
Validator handler = new Validator();
parser.setErrorHandler(handler);
parser.parse("file:///" + "/home/user/myfile.xml");
} catch (SAXException e) {
e.printStackTrace();
} catch (IOException ex) {
e.printStackTrace();
}
class Validator extends DefaultHandler {
public boolean validationError = false;
public SAXParseException saxParseException = null;
public void error(SAXParseException exception)
throws SAXException {
validationError = true;
saxParseException = exception;
}
public void fatalError(SAXParseException exception)
throws SAXException {
validationError = true;
saxParseException = exception;
}
public void warning(SAXParseException exception)
throws SAXException {
}
}
Remember to change:
1) The parameter "http://your_url_schema_location" for you xsd file location.
2) The string "/home/user/myfile.xml" for the one pointing to your xml file.
I didn't have to set the variable: -Djavax.xml.validation.SchemaFactory:http://www.w3.org/2001/XMLSchema=org.apache.xerces.jaxp.validation.XMLSchemaFactory
Just in case, anybody still come here to find the solution for validating xml or object against multiple XSDs, I am mentioning it here
//Using **URL** is the most important here. With URL, the relative paths are resolved for include, import inside the xsd file. Just get the parent level xsd here (not all included xsds).
URL xsdUrl = getClass().getClassLoader().getResource("my/parent/schema.xsd");
SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
Schema schema = schemaFactory.newSchema(xsdUrl);
JAXBContext jaxbContext = JAXBContext.newInstance(MyClass.class);
Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
unmarshaller.setSchema(schema);
/* If you need to validate object against xsd, uncomment this
ObjectFactory objectFactory = new ObjectFactory();
JAXBElement<MyClass> wrappedObject = objectFactory.createMyClassObject(myClassObject);
marshaller.marshal(wrappedShipmentMessage, new DefaultHandler());
*/
unmarshaller.unmarshal(getClass().getClassLoader().getResource("your/xml/file.xml"));
If all XSDs belong to the same namespace then create a new XSD and import other XSDs into it. Then in java create schema with the new XSD.
Schema schema = xmlSchemaFactory.newSchema(
new StreamSource(this.getClass().getResourceAsStream("/path/to/all_in_one.xsd"));
all_in_one.xsd :
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:ex="http://example.org/schema/"
targetNamespace="http://example.org/schema/"
elementFormDefault="unqualified"
attributeFormDefault="unqualified">
<xs:include schemaLocation="relative/path/to/a.xsd"></xs:include>
<xs:include schemaLocation="relative/path/to/b.xsd"></xs:include>
<xs:include schemaLocation="relative/path/to/c.xsd"></xs:include>
</xs:schema>

Categories