I am trying to validate a PDF file's documentation using XSD: I convert the given PDF to XML and validate it against the schema, and it validates. But suppose there is a heading with two subheadings. How do I change the XSD so that a particular type of heading must have at least two subheadings with particular text (words or sentences)? How do I add conditions to the XSD so that it validates only specifically designed documents?
Here is the XSD:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
<xs:element name="elements">
<xs:complexType>
<xs:sequence>
<xs:element maxOccurs="unbounded" ref="element"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="element">
<xs:complexType>
<xs:sequence>
<xs:element ref="pageno"/>
</xs:sequence>
<xs:attribute name="level" use="required" type="xs:integer"/>
<xs:attribute name="title" use="required"/>
<xs:attribute name="type" use="required"/>
</xs:complexType>
</xs:element>
<xs:element name="pageno" type="xs:integer"/>
</xs:schema>
And here is the XML I used to generate this XSD:
<elements>
<element type ="Introduction" level="1" title="Introduction">
<pageno>4</pageno>
</element>
<element type ="Introduction" level="2" title="Enhancements to the HP CSA vCenter Simple Compute">
<pageno>4</pageno>
</element>
<element type ="System requirements" level="1" title="System requirements">
<pageno>5</pageno>
</element>
<element type ="System requirements" level="2" title="Software components">
<pageno>5</pageno>
</element>
<element type ="Configuration requirements" level="1" title="Configuration requirements">
<pageno>7</pageno>
</element>
<element type ="Configuration requirements" level="2" title="Installing content capsule">
<pageno>7</pageno>
</element>
<element type ="Configuring offerings in HP CSA" level="1" title="Configuring offerings in HP CSA">
<pageno>8</pageno>
</element>
<element type ="Configuring offerings in HP CSA" level="2" title="Configuring subscriber options">
<pageno>8</pageno>
</element>
<element type ="Configuring subscriber options" level="2" title="Adding providers">
<pageno>8</pageno>
</element>
<element type ="Adding providers" level="2" title="Associating resource offerings with providers">
<pageno>9</pageno>
</element>
<element type ="Associating resource offerings with providers" level="2" title="Changing component properties">
<pageno>10</pageno>
</element>
<element type ="Changing component properties" level="2" title="Creating the service offering">
<pageno>12</pageno>
</element>
<element type ="Creating the service offering" level="2" title="Publishing the service offering">
<pageno>13</pageno>
</element>
<element type ="Publishing the service offering" level="3" title="Publishing service offering to a Catalog">
<pageno>13</pageno>
</element>
<element type ="Subscribing to the service" level="1" title="Subscribing to the service">
<pageno>14</pageno>
</element>
<element type ="Subscribing to the service" level="2" title="Canceling a subscription">
<pageno>14</pageno>
</element>
<!-- <element type ="adasdasd" level = "5" title= "dasdsad">
</element> -->
<element type ="Limitations" level="1" title="Limitations">
<pageno>16</pageno>
</element>
<element type ="Appendix A: HP Operations Orchestration flows" level="1" title="Appendix A: HP Operations Orchestration flows">
<pageno>17</pageno>
</element>
<element type ="Appendix B: Integrating with IP Address Management solutions" level="1" title="Appendix B: Integrating with IP Address Management solutions">
<pageno>19</pageno>
</element>
<element type ="Additional resources" level="1" title="Additional resources">
<pageno>20</pageno>
</element>
<element type ="Send Documentation Feedback" level="1" title="Send Documentation Feedback">
<pageno>21</pageno>
</element>
</elements>
If my question is unclear, please let me know and I will answer any queries.
Thank you.
You can define a Schema type of your own that is a heading that must contain specific text, and use the conditional type assignment feature of XSD 1.1, if your implementation supports it.
A more widely supported approach is to embed Schematron rules in your schema to do what you want.
Remember that if you over-constrain your input the schema may become fragile -- that is, hard to maintain in the event of changes.
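As a rough illustration of the Schematron route, the sketch below is a standalone rule set rather than one embedded in the XSD. It assumes the flat structure from the question, where every heading is an element sibling and a level-2 entry belongs to the nearest preceding level-1 entry. The chosen heading and the two required subheading titles are only examples taken from the sample XML, and level-1 titles are assumed to be unique.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern>
    <!-- Fires for the one level-1 heading we want to constrain (example title). -->
    <sch:rule context="element[@level = 1 and @title = 'Configuring offerings in HP CSA']">
      <!-- Level-2 entries whose nearest preceding level-1 sibling is this heading. -->
      <sch:let name="subs"
               value="following-sibling::element[@level = 2][preceding-sibling::element[@level = 1][1]/@title = 'Configuring offerings in HP CSA']"/>
      <sch:assert test="count($subs) &gt;= 2">
        This heading must have at least two level-2 subheadings.
      </sch:assert>
      <!-- Required subheading texts (assumed examples from the sample document). -->
      <sch:assert test="$subs[@title = 'Configuring subscriber options']">
        Missing subheading "Configuring subscriber options".
      </sch:assert>
      <sch:assert test="$subs[@title = 'Adding providers']">
        Missing subheading "Adding providers".
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

This would be run with a Schematron processor alongside the normal XSD validation. If your tooling extracts embedded rules, the same sch:pattern could probably live inside an xs:appinfo annotation in the XSD instead.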
Related
So, I have an object which can contain a list of Objects of that type. I wrote an XSD which looks something like this:
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="myNamespace" elementFormDefault="qualified">
<complexType name="BinModel">
<sequence>
<element type="string" name="min" />
<element type="string" name="max" />
<element type="string" name="fieldname" />
<element type="int" name="defaultValue" />
<element xmlns:ref="BinModel" name="innerBins" maxOccurs="unbounded" minOccurs="0" />
</sequence>
</complexType>
<element name="AllBins">
<complexType>
<sequence>
<element type="string" name="fieldnames" maxOccurs="unbounded" minOccurs="0"/>
<element type="int" name="defaultValue"/>
<element xmlns:type="BinModel" name="outerBins" maxOccurs="unbounded" minOccurs="0" />
</sequence>
</complexType>
</element>
</schema>
It produces two java classes, BinModel and AllBins respectively, but in each of those classes, even though I specify that they contain a list of type BinModel, it produces a List of type Object.
How do I generate a class which has a List of BinModels?
So I realized something was wrong when I had to add xmlns: before type and ref just to keep the editor from showing errors. I looked at another XSD for a recursively defined object elsewhere in my stupidly huge codebase to find a solution, and discovered two things.
1. We were defining our own namespace
2. We referenced our own objects within that namespace.
So the corrected code looks something like this:
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:myProject="myNamespace" targetNamespace="myNamespace" elementFormDefault="qualified">
<complexType name="BinModel">
<sequence>
<element type="string" name="min" />
<element type="string" name="max" />
<element type="string" name="fieldname" />
<element type="int" name="defaultValue" />
<element type="myProject:BinModel" name="innerBins" maxOccurs="unbounded" minOccurs="0" />
</sequence>
</complexType>
<element name="AllBins">
<complexType>
<sequence>
<element type="string" name="fieldnames" maxOccurs="unbounded" minOccurs="0"/>
<element type="int" name="defaultValue"/>
<element type="myProject:BinModel" name="outerBins" maxOccurs="unbounded" minOccurs="0" />
</sequence>
</complexType>
</element>
</schema>
I need a valid XSD for an XML file.
This is the XML file:
<?xml version="1.0" encoding="UTF-8"?>
<edge xmlns="http://www.example.org/flow">
<flowPara
style="letter-spacing:0px;fill:#000000;
font-size:9pt;font-family:Arial;text-anchor:start;text-align:start;"
id="textArea_38">
Text Flow checking and
<flowSpan style="fill:#FA0101; ">
font color change
text
<flowSpan style="fill:#C5A2A2; " />
</flowSpan>
in
<flowSpan
style="font-style:italic;letter-spacing:0px;fill:#000000;
font-size:9pt;font-family:Arial;text-anchor:start;text-align:start;">text
area
</flowSpan>
.
<flowSpan style="letter-spacing:0px;">
<flowSpan
style="text-decoration:underline;letter-spacing:0px;fill:#000000;
font-size:9pt;font-family:Arial;text-anchor:start;text-align:start;">Text
Flow
</flowSpan>
checking and
<flowSpan style="fill:#FA0101; ">
font color
change text
<flowSpan style="fill:#C5A2A2; ">
<flowSpan style="fill:#000000;
">in text area.</flowSpan>
</flowSpan>
</flowSpan>
</flowSpan>
</flowPara>
</edge>
This is the XSD file that I created:
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.example.org/flow"
xmlns:tns="http://www.example.org/flow" elementFormDefault="qualified">
<element name="edge">
<complexType>
<sequence>
<element name="flowPara" maxOccurs="unbounded">
<complexType mixed="true">
<sequence>
<element name="flowspan" maxOccurs="unbounded">
<complexType mixed="true">
<sequence>
<element name="flowspan" maxOccurs="unbounded">
<complexType mixed="true">
<simpleContent>
<extension base="string">
<attribute name="style" type="string" />
</extension>
</simpleContent>
</complexType>
</element>
</sequence>
<attribute name="style" type="string" />
</complexType>
</element>
</sequence>
<attribute name="style" type="string" />
<attribute name="id" type="string" />
</complexType>
</element>
</sequence>
</complexType>
</element>
</schema>
I got this exception:
Exception: cvc-complex-type.2.4.b: The content of element 'flowPara' is not complete. One of '{"http://www.example.org/flow":flowspan}' is expected.
1) In your schema, flowspan is spelled with a lowercase s in several places. Change it to flowSpan, since XML names are case-sensitive.
2) You declared mixed content, but your nested flowSpan element is not optional, so it is always required: a flowSpan that does not contain another flowSpan will fail validation. Add minOccurs="0" to make it optional.
3) You don't need to declare simple content if you have mixed content with an optional nested flowSpan. Because flowSpan is used recursively, you can reorganize the schema to declare it once and refer to it by reference. You could try this:
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.example.org/flow"
xmlns:tns="http://www.example.org/flow"
elementFormDefault="qualified">
<element name="edge">
<complexType>
<sequence>
<element name="flowPara" maxOccurs="unbounded">
<complexType mixed="true">
<sequence>
<element ref="tns:flowSpan" maxOccurs="unbounded"/>
</sequence>
<attribute name="style" type="string" />
<attribute name="id" type="string" />
</complexType>
</element>
</sequence>
</complexType>
</element>
<element name="flowSpan">
<complexType mixed="true">
<sequence>
<element ref="tns:flowSpan" minOccurs="0" maxOccurs="unbounded"/>
</sequence>
<attribute name="style" type="string" />
</complexType>
</element>
</schema>
You didn't say minOccurs='0', so it requires one.
I have an XML string, and I could not use the supplied XSD to unmarshal the object in java. So I tried to use an online tool (www.freeformatter.com/xsd-generator.html) to generate a valid xsd and got the same error. I don't understand what I'm seeing.
Here's the XML:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Message xmlns:ns1="http://www.domain.com/ws" xmlns="http://www.domain.com/ws/protocol">
<HeaderMessage>
<MSGTYPE>reply</MSGTYPE>
<ORIGINATOR>XXXX</ORIGINATOR>
<SENDER>XXXX</SENDER>
<TIMESTAMP>2013-12-12 17:48:09.649</TIMESTAMP>
<IDPROCESS>2013-12-12 17:48:09.649</IDPROCESS>
<IDMESSAGE>AN-1386866889649</IDMESSAGE>
<IDREQUEST>AN-1386866889649</IDREQUEST>
<SERVICENAME>RESULT</SERVICENAME>
<ERRORFLAG>OK</ERRORFLAG>
<ERRORCODE>300</ERRORCODE>
<ERRORMSG>Success</ERRORMSG>
</HeaderMessage>
<BodyMessage>
<ns1:ServiceResultObject isin="XX0000000000">
<ns1:ResultObject value="true" codIsin="XX0000000000" />
</ns1:ServiceResultObject>
</BodyMessage>
</Message>
And here's the XSD I got from the tool:
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" targetNamespace="http://www.domain.com/ws" xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="ServiceResultObject">
<xs:complexType>
<xs:sequence>
<xs:element name="ResultObject">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute type="xs:string" name="value"/>
<xs:attribute type="xs:string" name="codIsin"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute type="xs:string" name="isin"/>
</xs:complexType>
</xs:element>
</xs:schema>
After I generate the classes, I get the error
javax.xml.bind.UnmarshalException: unexpected element (uri:"http://www.domain.com/ws/protocol", local:"Message"). Expected elements are <{http://www.domain.com/ws}ServiceResultObject>
Why do I lose all this header information? Why does the XSD not result in a schema that actually unmarshals the object? The XSD supplied by the service guys here also only defined the inner object.
Since your XML document has 2 namespaces (http://www.domain.com/ws/protocol & http://www.domain.com/ws) you are going to need 2 XML schemas to represent it. One schema can reference another with an import element.
XML Schemas
Below I have started the XML Schemas that you will need for your XML.
ws.xsd (for http://www.domain.com/ws namespace)
This is part of the XML schema for the http://www.domain.com/ws namespace. The complete version is the one you have already generated.
<?xml version="1.0" encoding="UTF-8"?>
<schema
xmlns="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.domain.com/ws"
xmlns:tns="http://www.domain.com/ws"
elementFormDefault="qualified">
<element name="ServiceResultObject">
<complexType>
<sequence/>
<attribute name="isin" type="string"/>
</complexType>
</element>
</schema>
ws_protocol.xsd (for http://www.domain.com/ws/protocol namespace)
Here is a partial version of the schema that you are missing for the http://www.domain.com/ws/protocol namespace. Note the import element that references the other XML Schema, and <element ref="ws:ServiceResultObject"/> which references an element from the other XML Schema.
<?xml version="1.0" encoding="UTF-8"?>
<schema
xmlns="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.domain.com/ws/protocol"
xmlns:tns="http://www.domain.com/ws/protocol"
xmlns:ws="http://www.domain.com/ws"
elementFormDefault="qualified">
<import namespace="http://www.domain.com/ws" schemaLocation="ws.xsd"/>
<element name="Message">
<complexType>
<sequence>
<element name="HeaderMessage">
<complexType>
<sequence>
<element name="MSGTYPE" type="string"/>
</sequence>
</complexType>
</element>
<element name="BodyMessage">
<complexType>
<sequence>
<element ref="ws:ServiceResultObject"/>
</sequence>
</complexType>
</element>
</sequence>
</complexType>
</element>
</schema>
Creating the JAXBContext
Once you have the two XML Schemas, the classes will be generated into two different packages. Below is an example of how to bootstrap the JAXBContext. Note that the package names are delimited by the : character.
JAXBContext jc = JAXBContext.newInstance("com.domain.ws:com.domain.ws.protocol");
I have a remote system that returns an XML similar to the one below.
<BalanceResponse xmlns="http://example.com/balance">
<BalanceResult>
<Balance xmlns="">
<amount>10</amount>
</Balance>
</BalanceResult>
</BalanceResponse>
I created an xsd to match it
<s:schema xmlns:s="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" targetNamespace="http://example.com/balance">
<s:element name="BalanceResponse">
<s:complexType>
<s:sequence>
<s:element minOccurs="0" maxOccurs="1" name="BalanceResult">
<s:complexType>
<s:sequence>
<s:element minOccurs="0" maxOccurs="1" name="Balance">
<s:complexType>
<s:sequence>
<s:element minOccurs="0" maxOccurs="1" name="amount" type="s:decimal" />
</s:sequence>
</s:complexType>
</s:element>
</s:sequence>
</s:complexType>
</s:element>
</s:sequence>
</s:complexType>
</s:element>
</s:schema>
I use JAXB to generate the stub classes. However, I know that my (un)marshaller cannot bind the Balance element because its namespace is different.
The question is: how can I declare a different (blank) namespace for my Balance element?
You could do something like the following. Since elementFormDefault is unqualified, all global elements (BalanceResponse and BalanceResult) will be namespace qualified and all local elements (Balance and amount) won't be.
<?xml version="1.0" encoding="UTF-8"?>
<schema
xmlns="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://example.com/balance"
xmlns:tns="http://example.com/balance"
elementFormDefault="unqualified">
<element name="BalanceResponse">
<complexType>
<sequence>
<element ref="tns:BalanceResult"/>
</sequence>
</complexType>
</element>
<element name="BalanceResult">
<complexType>
<sequence>
<element name="Balance">
<complexType>
<sequence>
<element name="amount" type="int"/>
</sequence>
</complexType>
</element>
</sequence>
</complexType>
</element>
</schema>
If, as in the XML Schema in your question, you set elementFormDefault to qualified, then all of the XML elements are expected to be namespace qualified.
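A possible variation, offered only as a sketch: keep elementFormDefault="qualified" as in the question and opt individual local elements out with the form attribute. Here BalanceResult is declared locally (so it stays qualified), while Balance and amount are marked form="unqualified" to match the xmlns="" in the sample response. The minOccurs values and the decimal type are carried over from the question's schema.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<schema
    xmlns="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://example.com/balance"
    elementFormDefault="qualified">
  <element name="BalanceResponse">
    <complexType>
      <sequence>
        <!-- Local element, qualified because of elementFormDefault. -->
        <element name="BalanceResult" minOccurs="0">
          <complexType>
            <sequence>
              <!-- form="unqualified" overrides the schema-wide default for this element. -->
              <element name="Balance" form="unqualified" minOccurs="0">
                <complexType>
                  <sequence>
                    <element name="amount" form="unqualified" type="decimal" minOccurs="0"/>
                  </sequence>
                </complexType>
              </element>
            </sequence>
          </complexType>
        </element>
      </sequence>
    </complexType>
  </element>
</schema>
```

Which of the two layouts is preferable likely depends on how many elements deviate from the default: a per-element form attribute keeps the exceptions visible where they occur.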
I am using JAXB and can't figure out why my nested objects aren't being unmarshalled. I am generating the classes via the XJC command.
For example, when I unmarshal the Works object, the Composers collection always contains one Composer instance with a null name.
My XML looks like this:
<Works>
<Work>
<Composer>
<Name>Test Name</Name>
</Composer>
</Work>
</Works>
and XSD is like this:
<schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
jxb:version="2.0" xmlns:tns="http://www.example.org/test/"
targetNamespace="http://www.example.org/test/">
<element name="Works" type="tns:Work"></element>
<complexType name="Work">
<sequence>
<element name="Composers" type="tns:Composer" maxOccurs="unbounded"
minOccurs="1">
</element>
</sequence>
</complexType>
<complexType name="Composer">
<sequence>
<element name="Name" type="string">
</element>
</sequence>
</complexType>
</schema>
And my code that does the unmarshalling:
JAXBContext jc = JAXBContext.newInstance("mypackagename");
Unmarshaller um = jc.createUnmarshaller();
Works works = (Works)um.unmarshal(new FileReader("src/main/resources/works.xml"));
Work work = works.getWorks().get(0);
Composer composer = work.getComposers().get(0);
System.out.println(composer.getName());
Name is always NULL, even though I know it has a value.
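Your schema does not match your XML: it declares Works with the type Work and names the child element Composers, while the XML contains Work and Composer elements. You could have an XML schema like:
<schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:jxb="http://java.sun.com/xml/ns/jaxb"
jxb:version="2.0" xmlns:tns="http://www.example.org/test/"
targetNamespace="http://www.example.org/test/" elementFormDefault="qualified">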
<element name="Works" type="tns:Works"></element>
<complexType name="Works">
<sequence>
<element name="Work" type="tns:Work" maxOccurs="unbounded"/>
</sequence>
</complexType>
<complexType name="Work">
<sequence>
<element name="Composer" type="tns:Composer" maxOccurs="unbounded"/>
</sequence>
</complexType>
<complexType name="Composer">
<sequence>
<element name="Name" type="string"/>
</sequence>
</complexType>
</schema>
That corresponds to the following XML. Note the namespace declaration on the root element, which your original document lacks:
<Works xmlns="http://www.example.org/test/">
<Work>
<Composer>
<Name>Test Name</Name>
</Composer>
</Work>
</Works>