Adding source validation to a StructuredTextViewer - java

I added a nice XML source viewer to my application. Now I have an XSD schema that defines the XML document. Any idea where to start on adding source validation that relies on this schema?
Thanks!

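To check that your XML is well-formed, just parse it with a DocumentBuilder obtained from a DocumentBuilderFactory. To additionally validate it against an .xsd schema referenced in the XML, make the factory namespace-aware and validating, and tell it that the schema language is W3C XML Schema (setValidating(true) on its own only validates against a DTD):
factory.setNamespaceAware(true);
factory.setValidating(true);
factory.setAttribute(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
If the xsd schema is not referenced within the XML that you are validating, you can supply it yourself like this:
factory.setAttribute(JAXP_SCHEMA_SOURCE, new File(schemaSource));
JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA and JAXP_SCHEMA_SOURCE are String constants that you define yourself; their values are given in the article below.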
For more information, read the article from Oracle here:
http://download.oracle.com/javaee/1.4/tutorial/doc/JAXPDOM8.html
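For a source viewer, it helps to collect every validation problem with its line number instead of stopping at the first one. The following is a minimal sketch of that idea, not a drop-in for StructuredTextViewer: the class name XsdSourceCheck and the method check are made up for illustration, and the schema is passed as a String only to keep the example self-contained (a File or URI works as well).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: validates editor text against an XSD and reports problems.
class XsdSourceCheck {
    // JAXP property names and value, as listed in the Oracle tutorial.
    static final String JAXP_SCHEMA_LANGUAGE =
            "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String JAXP_SCHEMA_SOURCE =
            "http://java.sun.com/xml/jaxp/properties/schemaSource";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    // Returns one "line N: message" entry per problem; an empty list means valid.
    static List<String> check(String xml, String xsd) {
        List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(true);
            factory.setAttribute(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
            factory.setAttribute(JAXP_SCHEMA_SOURCE,
                    new ByteArrayInputStream(xsd.getBytes(StandardCharsets.UTF_8)));

            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { /* ignored here */ }
                // Schema violations arrive here; record them and keep parsing.
                @Override public void error(SAXParseException e) { problems.add(describe(e)); }
                // Well-formedness errors cannot be recovered from, so rethrow.
                @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            problems.add(describe(e)); // the text is not well-formed XML
        } catch (SAXException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return problems;
    }

    private static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ": " + e.getMessage();
    }
}
```

The returned line numbers could then be turned into markers or annotations in the viewer; how that is done depends on the editor framework and is left out here.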

Related

Multiple XSD schemas in one xjc/JAXB generated XML file?

I've got a set of XSD schema files I'm generating Java classes from:
http://xmlgw.companieshouse.gov.uk/v1-0/schema/Egov_ch.xsd
http://xmlgw.companieshouse.gov.uk/v1-0/schema/forms/ReturnofAllotmentShares-v3-0.xsd
http://xmlgw.companieshouse.gov.uk/v1-0/schema/forms/FormSubmission-v2-11.xsd
These come together to form a single XML document I need to submit to the Companies House XML Gateway. Right now, the XML files generated from the Java files generated by xjc from these schemas do not include schemaLocation or other schema-related information. By following this answer I'm able to add a top-level schemaLocation to the <GovTalkMessage> element, but not to the others (e.g. <FormSubmission> and <ReturnofAllotmentShares>).
Is there a way I can do this, or to set it at Java-class generation time?
EDIT: Error returned when I remove one of the schemaLocation attributes (if I include it, it returns successfully)
<GovTalkErrors>
<Error>
<RaisedBy>CH_XML_Gateway</RaisedBy>
<Number>100</Number>
<Type>fatal</Type>
<Text>XML failed schema validation: Invalid XML: Unknown element 'ReturnofAllotmentShares' line 36 column 86</Text>
<Location></Location>
</Error>
</GovTalkErrors>
My understanding is that without the schemaLocation attribute, the parser doesn't know what specific document we're referring to here. Each document (maybe 20 of them) is its own XML schema.

How to parse complex nested xml file in JAVA

I am new to XML parsing and cannot decide how to parse this complex XML file in Java.
I am able to parse a simple XML file, but when it comes to a complex XML file I am confused and not able to read its elements using Java.
Here is my sample XML file.
<?xml version="1.0"?>
<env:ContentEnvelope xsi:schemaLocation="http://fundamental.schemas.financial.jso.com/Fundamental/2011-07-07/
https://theshare.jso.com/sites/TRM-IA/Content%20Marketplace/Strategic%20Data%20Interfaces/SDI%20Schemas/Schemas/Fundamentals/2015-09-25/FundamentalMaster.xsd"
xmlns:esg="http://fundamental.schemas.financial.jso.com/ESGSupportingInfo/2011-07-07/"
xmlns:md="http://data.schemas.financial.jso.com/metadata/2010-10-10/"
xmlns:cr="http://fundamental.schemas.financial.jso.com/CoraxData/2012-10-25/"
xmlns:ful="http://fundamental.schemas.financial.jso.com/FundamentalLineItem/2011-07-07/"
xmlns:fun="http://fundamental.schemas.financial.jso.com/Fundamental/2011-07-07/"
xmlns:ir="http://fundamental.schemas.financial.jso.com/FinancialInstrumentRelationship/2011-07-07/"
xmlns:fl="http://fundamental.schemas.financial.jso.com/FinancialLineItem/2011-07-07/"
xmlns:pe="http://fundamental.schemas.financial.jso.com/FinancialPeriod/2011-07-07/"
xmlns:seg="http://fundamental.schemas.financial.jso.com/FinancialSegment/2011-07-07/"
xmlns:sr="http://fundamental.schemas.financial.jso.com/FinancialSource/2011-07-07/"
xmlns:sli="http://fundamental.schemas.financial.jso.com/StandardizedLineItem/2011-07-07/"
xmlns:ss="http://fundamental.schemas.financial.jso.com/StandardizedStatement/2011-07-07/"
xmlns:fs="http://fundamental.schemas.financial.jso.com/FinancialStatement/2011-07-07/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:env="http://data.schemas.tfn.jso.com/Envelope/2008-05-01/" minVers="1.0" majVers="3" pubStyle="Message">
<env:Header>
<env:Info>
<env:Id>urn:uuid:069527ab-2c10-48bb-b3d2-206f4e66e5d2</env:Id>
<env:TimeStamp>2016-12-23T10:09:09+00:00</env:TimeStamp>
</env:Info>
<fun:OrgId>20240</fun:OrgId>
<fun:PartitionId>1</fun:PartitionId>
</env:Header>
<env:Body minVers="0.0" majVers="1" contentSet="Fundamental">
<env:ContentItem action="Insert">
<env:Data xsi:type="fun:FundamentalDataItem">
<fun:Fundamental effectiveTo="9999-12-31T00:00:00+00:00" effectiveFrom="2013-06-29T00:55:15.313+00:00" uniqueFuamentalSet="0054341342">
<fun:OrganizationId objectType="Organization" objectTypeId="404510">42565596</fun:OrganizationId>
<fun:PrimaryReportingEntityCode>A4C67</fun:PrimaryReportingEntityCode>
<fun:TotalPrimaryReportingShares>567923000.00000</fun:TotalPrimaryReportingShares>
<fun:LocalLanguageId>505074</fun:LocalLanguageId>
<fun:IndustryGroups>
<fun:IndustryGroup validTo="9999-12-31T00:00:00+00:00" validFrom="1900-01-01T00:00:00+00:00">
<fun:GroupCode>BNK</fun:GroupCode>
<fun:GroupName languageId="505074">Bank</fun:GroupName>
<fun:TaxonomyId>1</fun:TaxonomyId>
<fun:IndustryGroupCodeId>3011649</fun:IndustryGroupCodeId>
</fun:IndustryGroup>
</fun:IndustryGroups>
<fun:GaapCode>CAG</fun:GaapCode>
<fun:ConsolidationBasis>Consolidated</fun:ConsolidationBasis>
<fun:IsFiling>true</fun:IsFiling>
<fun:ConsolidationBasisId>3013598</fun:ConsolidationBasisId>
<fun:GaapCodeId>3011536</fun:GaapCodeId>
<fun:Taxonomies>
<fun:Taxonomy>1</fun:Taxonomy>
</fun:Taxonomies>
<fun:WorldScopeIds>
<fun:WorldScopeId validTo="9999-12-31T00:00:00+00:00" validFrom="2012-03-31T00:00:00+00:00">C12436390</fun:WorldScopeId>
</fun:WorldScopeIds>
</fun:Fundamental>
</env:Data>
</env:ContentItem>
JAXB will definitely help you here.
Since you are dealing with complex XML files, I would suggest the approach below (I agree it's lengthy and manual, but it should work fine).
1) Generate an XSD schema from the given XML content
2) Create a JAXB project in Eclipse, create an empty XSD file, and paste in the XSD schema generated above
3) To convert the .xsd file to POJOs, right-click the .xsd file and generate JAXB classes
4) Now write code to unmarshal the data and run it; this should give you an instance of the corresponding Java class.

JAXB and XSLT processor

I am using JAXB and maven-jaxb2-plugin and I am able right now to bind my schemas to Java code successfully.
I also have a .xsl file "annotate_schemas.xsl" that modifies a specific schema adding some additional information.
Finally, on the schema that I want transformed, I added the header:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="annotate_schemas.xsl"?>
...
The problem is that, while the .xsl is correct (if I open my schema file in a browser, the transformation is done flawlessly), JAXB ignores it and binds an untouched version of my schema.
My question is: does JAXB (and/or its plugin) have an XSLT processor? Is there a way to tell JAXB to bind the result of the XSLT transformation instead of the original?
Thank you very much
JAXB, like the vast majority of XML-consuming applications, takes no notice of an <?xml-stylesheet?> processing instruction. If you want to transform a document before passing it to JAXB, you need to transform it explicitly, for example by using the JAXP transformation API. (JAXP can also locate the stylesheet named by the xml-stylesheet PI, if that's how you want to control it: TransformerFactory.getAssociatedStylesheet() returns it as a Source that you can then pass to newTransformer().)
You can try something like this:
TransformerFactory transFact = TransformerFactory.newInstance();
Templates displayTemplate = transFact.newTemplates(new StreamSource(new File("your_xsl_file")));
TransformerHandler handler =
((SAXTransformerFactory) transFact).newTransformerHandler(displayTemplate);
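The snippet above stops once the handler is created. As a rough sketch of the explicit transformation itself, the following applies a stylesheet to the schema text and returns the result, which could then be written to the file that the JAXB binding step reads. The class name SchemaPreprocessor and the method transform are illustrative only, and Strings are used instead of files to keep the example self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: runs an XSLT stylesheet over a document before binding.
class SchemaPreprocessor {
    static String transform(String xml, String xsl) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            // Compile the stylesheet once; Templates can be reused across documents.
            Templates templates = factory.newTemplates(
                    new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            templates.newTransformer().transform(
                    new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }
}
```

In a Maven build this would have to run before the code generation phase, so that the plugin binds the transformed schema rather than the original.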

Dynamic XML creation in Java

I am trying to dynamically create an XML file in Java to display a timetable. I have created a DTD for my XML file and I have an XSL file I would like to use to transform the XML. I don't know exactly how to continue.
What I've tried so far: on click of some button, a Servlet is called which generates the content of the XML file as a String (inserting the dynamic parts of the XML into the String). I now have a String containing the content of the XML file. I would now like to transform the XML using an XSL file I have on my server and display the result in the page which called the Servlet (doing this via AJAX).
I'm not sure if I'm going in the right direction; perhaps I shouldn't even create the XML in String form to begin with. So my question is, how do I continue from here? How do I transform the XML string using the XSL file and send it as a response to the AJAX call, so I can plant the generated code into the page? Or, if this is not the way to do it, how do I create a dynamic XML file in a different way that produces the same result?
You can use JAXP for this. It's part of the standard Java SE API.
StringReader xmlInput = new StringReader(xmlStringWhichYouHaveCreated);
InputStream xslInput = getServletContext().getResourceAsStream("file.xsl"); // Or wherever it is. As long as you've it as an InputStream, it's fine.
Source xmlSource = new StreamSource(xmlInput);
Source xslSource = new StreamSource(xslInput);
Result xmlResult = new StreamResult(response.getOutputStream()); // XML result will be written to HTTP response.
Transformer transformer = TransformerFactory.newInstance().newTransformer(xslSource);
transformer.transform(xmlSource, xmlResult);
Depending on how complicated and large your XML is going to be I would suggest two options. For small, simple structures Java's DOM implementation (Document) will suffice.
If your XML is more elaborate I would look into JAXB. The benefit there is that there are tools that automatically create Java classes from an XML schema (XSD). So you'd have to transform your DTD into an XSD first, but that shouldn't be a problem. You end up with plain data transfer objects (plain objects with getters/setters for the values of the corresponding XML elements) and parsing/encoding plus setting namespaces correctly is done for you. It's quite convenient but can also be a bit of an overkill for simple XML structures.
In both cases, you will end up with a Document instance that you can finally transform using JAXP.
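As a rough illustration of the DOM route, the sketch below builds a small Document in memory and transforms it with JAXP. The element and attribute names (timetable, lesson, day) and the class TimetableDom are invented for the example, since the question does not show the DTD; the result is returned as a String here, whereas in a servlet it would be written to the response.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example: element names are assumptions, not taken from the question.
class TimetableDom {
    // Builds <timetable><lesson day="...">subject</lesson>...</timetable>
    // from {day, subject} pairs.
    static Document build(String[][] lessons) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("timetable");
            doc.appendChild(root);
            for (String[] lesson : lessons) {
                Element item = doc.createElement("lesson");
                item.setAttribute("day", lesson[0]);
                // Special characters such as & and < are escaped on output,
                // which is the main advantage over concatenating Strings.
                item.setTextContent(lesson[1]);
                root.appendChild(item);
            }
            return doc;
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Applies the stylesheet to the in-memory document.
    static String render(Document doc, String xsl) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (TransformerException ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```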
Apache XMLBeans is a nice solution for serializing to and from XML. Here's what you need to do:
Download XMLBeans from http://www.apache.org/dyn/closer.cgi/xmlbeans/binaries
Use the XMLBeans inst2xsd executable (in the bin dir) to generate an XSD; note that it works from a sample XML instance document rather than from the DTD itself
Use the XMLBeans ANT task to convert the XSD into classes which you can use in your app
Here's an example ANT script to use XMLBeans to create the classes:
<project name="my_project" basedir="..">
<property name="my_project.project.path" value="${basedir}"/>
<property name="xbean.dir" value="C:/lib/xmlbeans-2.2.0/lib" />
<path id="classpath">
<fileset dir="${xbean.dir}" includes="**/*.jar" />
</path>
<taskdef name="xmlbean" classname="org.apache.xmlbeans.impl.tool.XMLBean" classpathref="classpath" />
<xmlbean schema="${my_project.project.path}/my.xsd" srcgendir="${my_project.project.path}/src-tms-template-filter-fields" classgendir="${my_project.project.path}/bin">
<classpath><path refid="classpath" /></classpath>
</xmlbean>
</project>
You'll now have nice Java classes which you can use for clean code to create the XML from the data stored in your DB. Use BalusC's answer for the XSLT.

XML to be validated against multiple xsd schemas

I'm writing the xsd and the code to validate, so I have great control here.
I would like to have an upload facility that adds stuff to my application based on an xml file. One part of the xml file should be validated against different schemas based on one of the values in the other part of it. Here's an example to illustrate:
<foo>
<name>Harold</name>
<bar>Alpha</bar>
<baz>Mercury</baz>
<!-- ... more general info that applies to all foos ... -->
<bar-config>
<!-- the content here is specific to the bar named "Alpha" -->
</bar-config>
<baz-config>
<!-- the content here is specific to the baz named "Mercury" -->
</baz-config>
</foo>
In this case, there is some controlled vocabulary for the content of <bar>, and I can handle that part just fine. Then, based on the bar value, the appropriate xml schema should be used to validate the content of bar-config. Similarly for baz and baz-config.
The code doing the parsing/validation is written in Java. Not sure how language-dependent the solution will be.
Ideally, the solution would permit the xml author to declare the appropriate schema locations and what-not so that s/he could get the xml validated on the fly in a sufficiently smart editor.
Also, the possible values for <bar> and <baz> are orthogonal, so I don't want to do this by extension for every possible bar/baz combo. What I mean is, if there are 24 possible bar values/schemas and 8 possible baz values/schemas, I want to be able to write 1 + 24 + 8 = 33 total schemas, instead of 1 * 24 * 8 = 192 total schemas.
Also, I'd prefer to NOT break out the bar-config and baz-config into separate xml files if possible. I realize that might make all the problems much easier, as each xml file would have a single schema, but I'm trying to see if there is a good single-xml-file solution.
I finally figured this out.
First of all, in the foo schema, the bar-config and baz-config elements have a type which includes an any element, like this:
<sequence>
<any minOccurs="0" maxOccurs="1"
processContents="lax" namespace="##any" />
</sequence>
In the xml, then, you must specify the proper namespace using the xmlns attribute on the child element of bar-config or baz-config, like this:
<bar-config>
<config xmlns="http://www.example.org/bar/Alpha">
... config xml here ...
</config>
</bar-config>
Then, your XML schema file for bar Alpha will have a target namespace of http://www.example.org/bar/Alpha and will define the root element config.
If your XML file has namespace declarations and schema locations for both of the schema files, this is sufficient for the editor to do all of the validating (at least good enough for Eclipse).
So far, we have satisfied the requirement that the xml author may write the xml in such a way that it is validated in the editor.
Now, we need the consumer to be able to validate. In my case, I'm using Java.
If by some chance, you know the schema files that you will need to use to validate ahead of time, then you simply create a single Schema object and validate as usual, like this:
Schema schema = factory().newSchema(new Source[] {
new StreamSource(stream("foo.xsd")),
new StreamSource(stream("Alpha.xsd")),
new StreamSource(stream("Mercury.xsd")),
});
In this case, however, we don't know which xsd files to use until we have parsed the main document. So, the general procedure is to:
1) Validate the xml using only the main (foo) schema
2) Determine the schema to use to validate the portion of the document
3) Find the node that is the root of the portion to validate using a separate schema
4) Import that node into a brand new document
5) Validate the brand new document using the other schema file
Caveat: it appears that the document must be built namespace-aware in order for this to work.
Here's some code (this was ripped from various places of my code, so there might be some errors introduced by the copy-and-paste):
// Contains the filename of the xml file
String filename;
// Load the xml data using a namespace-aware builder (the method
// 'stream' simply opens an input stream on a file)
Document document;
DocumentBuilderFactory docBuilderFactory =
DocumentBuilderFactory.newInstance();
docBuilderFactory.setNamespaceAware(true);
document = docBuilderFactory.newDocumentBuilder().parse(stream(filename));
// Create the schema factory
SchemaFactory sFactory = SchemaFactory.newInstance(
XMLConstants.W3C_XML_SCHEMA_NS_URI);
// Load the main schema
Schema schema = sFactory.newSchema(
new StreamSource(stream("foo.xsd")));
// Validate using main schema
schema.newValidator().validate(new DOMSource(document));
// Get the node that is the root for the portion you want to validate
// using another schema
Node node= getSpecialNode(document);
// Build a Document from that node
Document subDocument = docBuilderFactory.newDocumentBuilder().newDocument();
subDocument.appendChild(subDocument.importNode(node, true));
// Determine the schema to use using your own logic
Schema subSchema = parseAndDetermineSchema(document);
// Validate using other schema
subSchema.newValidator().validate(new DOMSource(subDocument));
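For reference, here is a self-contained variant of the same procedure that can be run as-is. It is only a sketch: the class TwoStageValidator, the method isValid, and the Map from <bar> values to schema text are stand-ins for the stream, getSpecialNode and parseAndDetermineSchema helpers above, and it handles bar-config only (baz-config would follow the same pattern).

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: all names here are illustrative.
class TwoStageValidator {
    // barXsds maps each allowed <bar> value to the text of its schema.
    // Returns false when the XML or a schema cannot be parsed or a stage rejects it.
    static boolean isValid(String xml, String fooXsd, Map<String, String> barXsds) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required, see the caveat above
            DocumentBuilder builder = dbf.newDocumentBuilder();
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

            // Stage 1: validate against the main schema only.
            compile(sf, fooXsd).newValidator().validate(new DOMSource(document));

            // Pick the second schema from the value of <bar>.
            String bar = document.getElementsByTagName("bar").item(0).getTextContent().trim();
            String barXsd = barXsds.get(bar);
            if (barXsd == null) {
                return false; // value outside the controlled vocabulary
            }

            // Stage 2: copy the child of <bar-config> into its own document and validate it.
            Element config = firstElementChild(
                    document.getElementsByTagName("bar-config").item(0));
            if (config == null) {
                return true; // the wildcard is optional (minOccurs="0")
            }
            Document subDocument = builder.newDocument();
            subDocument.appendChild(subDocument.importNode(config, true));
            compile(sf, barXsd).newValidator().validate(new DOMSource(subDocument));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static Schema compile(SchemaFactory sf, String xsd) throws SAXException {
        return sf.newSchema(new StreamSource(new StringReader(xsd)));
    }

    private static Element firstElementChild(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        return null;
    }
}
```

Returning a boolean keeps the example short; real code would more likely report the SAXException message so the uploader can see what was rejected.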
Take a look at NVDL (Namespace-based Validation Dispatching Language) - http://www.nvdl.org/
It is designed to do what you want to do (validate parts of an XML document that have their own namespaces and schemas).
There is a tutorial here - http://www.dpawson.co.uk/nvdl/ - and a Java implementation here - http://jnvdl.sourceforge.net/
Hope that helps!
Kevin
You need to define a target namespace for each separately-validated portion of the instance document. Then you define a master schema that uses <xsd:import> to reference the schema documents for these components (<xsd:include> only works for schema documents that share the master schema's target namespace).
The limitation with this approach is that you can't let the individual components define the schemas that should be used to validate them. But it's a bad idea in general to let a document tell you how to validate it (i.e., validation should be something that your application controls).
You can also use a "resource resolver" to allow "xml authors" to specify their own schema file, at least to some extent, e.g. https://stackoverflow.com/a/41225329/32453. At the end of the day, you want a fully compliant xml file that can be validated with normal tools anyway :)
