Java, UnmarshallingException caused by XML attribute with special chars: ;ìè+òàù-<^èç°§_>!£$%&/()=?~`'#; - java

My XML file has a tag with an attribute "containsValue" which contains the "special" characters you can see in the subject:
<original_msg_body id="msgBodySpecialCharsRule" containsValue=";ìè+òàù-<^èç°§_>!£$%&/()=?~`'#;" />
In my XML schema the attribute is declared as xs:string:
<xs:attribute name="containsValue" type="xs:string" />
I use this value in a Java program that checks whether it is contained in another String,
but I always get this exception:
javax.xml.bind.UnmarshalException
- with linked exception:
[org.xml.sax.SAXParseException: The value of attribute "containsValue" associated with an element type "original_msg_body" must not contain the '<' character.]
How can I solve it? I've tried changing the attribute type to xs:NMTOKEN, but I get the same exception. Is there any other type?
I think I could change the character encoding, for example using the HTML representation, like &lt;, but then that could be tricky for the string comparison...

Use entity references: replace < with &lt;, > with &gt;, etc. in your XML document. Your XML parser will then handle the conversion between the actual character and its entity reference. That is, in your code you get the actual < or > character.

You need to escape special XML characters like <, > and " with &lt;, &gt; and &quot;.
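To illustrate the concern about string comparison, here is a minimal sketch that parses an escaped attribute and compares the result with a plain Java string. It uses the JDK's DOM parser rather than JAXB, and a shortened, ASCII-only version of the value from the question; the class and method names are made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

final class EscapedAttributeDemo {

    /** Parses the XML text and returns the named attribute of the root element. */
    static String readRootAttribute(String xml, String attributeName) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getAttribute(attributeName);
        } catch (Exception e) {
            // Wrapped so that callers do not have to handle checked exceptions.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Only '<' and '&' are written as entity references here; '>' and the
        // apostrophe are allowed as-is inside a double-quoted attribute value.
        String xml = "<original_msg_body id=\"msgBodySpecialCharsRule\" "
                + "containsValue=\";+-&lt;^_>!$%&amp;/()=?~`'#;\" />";

        String value = readRootAttribute(xml, "containsValue");
        String incoming = "prefix ;+-<^_>!$%&/()=?~`'#; suffix";

        System.out.println(value);                    // the unescaped characters
        System.out.println(incoming.contains(value)); // plain String comparison
    }
}
```

The parser hands back the literal characters, so the comparison code never sees &lt; or &amp;.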

Related

How define a CDATA type in XSD so user doesn't have to escape characters or use "<![CDATA" tag?

I have an XSD that defines an element "password". I want to allow any character there. Currently, I have the element defined as an xs:string and the user has to either escape the string (e.g. myP&amp;ssword) or enclose it in a CDATA tag (e.g. <![CDATA[myP&ssword]]>).
Is there a way to define an XSD so it won't require either of those but won't fail on validation?
XSD Element:
<xs:element name="password" type="xs:string" />
XML That throws the error:
<password>myP&ssword</password>
Is this possible?
No, this is not possible. A schema specifies rules that determine whether an XML document is valid with respect to those rules, but it still requires the document to be well-formed.
From the XML schema recommendation:
Any application that consumes well-formed XML can use the formalism
defined here to express syntactic, structural and value constraints
applicable to its document instances.
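A small sketch of what this means in practice: the parse below involves no schema at all, yet the unescaped form is still rejected, while the escaped form and the CDATA form both yield the same text. The class and method names are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class PasswordWellFormedness {

    /** Returns the text of the root element, or null if the input is not well-formed. */
    static String rootTextOrNull(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getTextContent();
        } catch (SAXException e) {
            // The parser reports well-formedness errors as SAXParseException.
            return null;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Rejected before any schema could be consulted.
        System.out.println(rootTextOrNull("<password>myP&ssword</password>"));
        // Both accepted forms carry the same password text.
        System.out.println(rootTextOrNull("<password>myP&amp;ssword</password>"));
        System.out.println(rootTextOrNull("<password><![CDATA[myP&ssword]]></password>"));
    }
}
```

The JDK's default parser may also print a "[Fatal Error]" line to standard error for the first input; that is only its default error reporting.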

org.xml.sax.SAXParseException: The value of attribute "id" associated with an element type "Employee" must not contain the '<' character. at

I have an issue when parsing XML: my input XML contains some special characters like &, <, > and ".
While parsing the XML with the SAX parser API I get the exception below.
org.xml.sax.SAXParseException: The value of attribute "id" associated with an element type "Employee" must not contain the '<' character. at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
Please let me know how to replace the special characters before XML parsing happens.
If anybody has a piece of code, please share.
It would be a great help!
-Vishwanath
You need to escape the following characters:
" &quot;
' &apos;
< &lt;
> &gt;
& &amp;
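As a rough sketch of the table above, the following helper replaces each of the five characters with its predefined entity. It is meant to be applied to the raw values (attribute values or text) while the document is being built, not to the finished XML text, where it would escape the markup as well. The class name is invented for this example, and the helper does not deal with control characters that XML 1.0 forbids.

```java
final class XmlEscaper {

    /** Replaces the five characters that have predefined XML entities. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical raw id value containing reserved characters.
        String id = "A<1>&\"B\"";
        System.out.println("<Employee id=\"" + escape(id) + "\"/>");
    }
}
```

Where possible, it is usually simpler to let an XML writer API produce the document so the escaping happens automatically.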

Skipping Html Content in Tag attributes

I am using a SAX parser to parse the following piece of data, whose "Description" attribute contains HTML content. But I am getting the error "The value of attribute "Description" associated with an element type "null" must not contain the '<' character".
How can I make the SAX parser ignore this tag during XML processing?
<Thread ThreadID="22" Title="google"
Description="http://google.com/"
DisplayName="Sam" LoginID="hjaja" UserEmailID="abx#ers"
UserSapCode="12345"
IsAnonymous="Yes" CreatedDate="2015-04-29T21:56:04.943" ReplyCount="0"
ViewCount="0" PopularityPoints="0" LastUpdatedBy="" LastPostDate="" />
Thanks in advance.
I really think that you should take a look at this post (HTML code inside XML) to see how other people recommended tackling this problem.
No XML parser can parse this data, because it does not comply with the XML format. Please refer to the XML specification.
There are two ways you can solve this:
Change the source format
Change the source so that it produces proper XML. You can include HTML by escaping the characters like this:
" &quot;
' &apos;
< &lt;
> &gt;
& &amp;
Change the target algorithm
The second way is to write your own parsing algorithm for your case.
Usually the answer is the first one.
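If the source is a Java program, one way to follow the first option is to let an XML writer do the escaping instead of concatenating strings. The sketch below uses the JDK's StAX XMLStreamWriter; the class name, method names and the sample HTML are assumptions made for this example, and only three of the attributes from the question are written.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;
import org.xml.sax.InputSource;

final class ThreadXmlWriter {

    /** Writes a Thread element; the writer escapes the attribute values. */
    static String writeThread(String threadId, String title, String descriptionHtml) {
        StringWriter buffer = new StringWriter();
        try {
            XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(buffer);
            writer.writeEmptyElement("Thread");
            writer.writeAttribute("ThreadID", threadId);
            writer.writeAttribute("Title", title);
            writer.writeAttribute("Description", descriptionHtml);
            writer.writeEndDocument();
            writer.flush();
            writer.close();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return buffer.toString();
    }

    /** Parses the XML again and returns one attribute of the root element. */
    static String readRootAttribute(String xml, String name) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getAttribute(name);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical HTML content for the Description attribute.
        String html = "<a href=\"http://google.com/\">Google & more</a>";
        String xml = writeThread("22", "google", html);
        System.out.println(xml);
        System.out.println(readRootAttribute(xml, "Description").equals(html));
    }
}
```

Reading the attribute back returns the original HTML string, so the consumer does not need to know that escaping took place.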

xml mapping error

I am working on a project that contains an XML file (IDE: Eclipse Indigo).
I am facing a problem with a single line:
<?xml version="1.0" encoding="UTF-8"?>
<BookingConfirmRQ xmlns="http://www.expediaconnect.com/EQC/BC/2007/09"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Authentication username="yyyyyyyy" password="xxxxxxxx" />
<Hotel id="<hotelId/>" />
<BookingConfirmNumbers>
<BookingConfirmNumber bookingID="<bookindId/>"
bookingType="<bookingType/>" confirmNumber="<confirmNumber/>"
confirmTime="<confirmTime/>" />
</BookingConfirmNumbers>
</BookingConfirmRQ>
Here, near <Hotel id="<hotelId/>" />, I am getting this error:
The value of attribute "id" associated with an element type "Hotel" must not contain the '<' character.
I searched for it, checked the JARs and reformatted the file, but I am still getting the error. Can somebody help me?
Thank you.
You can turn off XML validation in Eclipse under Window > Preferences > Validation; that way you can avoid this error if you don't want to change the file.
Attribute values should only contain literal text:
<Hotel id="134" />
You need to escape the angle brackets in the value of the attribute like this:
<Hotel id="&lt;hotelId/&gt;" />
The same goes for all the other attributes. The angle brackets are on the list of reserved characters that have to be escaped in XML.
Unless you do that, the XML is not well-formed and nothing will process it. Turning off validation - i.e. validation against a DTD or schema - will not help here. The XML has to be well-formed before it can be parsed.
That said, the XML looks very odd, as if you're including whole XML-elements as the value of attributes which is just wrong. So even if you fix the escaping problem this XML may not say what you meant.
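To see that disabling validation makes no difference, here is a minimal sketch that runs a non-validating SAX parse over the Hotel line in three variants. The class and method names are invented for this example; the exact error text depends on the parser implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class WellFormednessCheck {

    /** Returns null if the input is well-formed, otherwise the parser's error message. */
    static String firstError(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false); // no DTD or schema validation at all
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstError("<Hotel id=\"<hotelId/>\" />"));       // an error message
        System.out.println(firstError("<Hotel id=\"&lt;hotelId/&gt;\" />")); // null
        System.out.println(firstError("<Hotel id=\"134\" />"));              // null
    }
}
```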

How do I include &, <, > etc in XML attribute values

I want to create an XML file which will be used to store the structure of a Java program. I am able to successfully parse the Java program and create the tags as required. The problem arises when I try to include the source code inside my tags, since Java source code may use a vast number of reserved characters like &, < and >. I am not able to create a valid XML.
My XML should go like this:
<?xml version="1.0"?>
<prg name="prg_name">
<class name="class_name">
<parent>parent class</parent>
<interface>Interface name</interface>
.
.
.
<method name= "method_name">
<statement>the ordinary java statement</statement>
<if condition="Conditional Expression">
<statement> true statements </statement>
</if>
<else>
<statement> false statements </statement>
</else>
<statement> usual control statements </statement>
.
.
.
</method>
</class>
.
.
.
</prg>
Like this, but the problem is conditional expressions of if or other statements have a lot of & or other reserved symbols in them which prevents XML from getting validated. Since all this data (source code) is given by the user I have little control over it. Escaping the characters will be very costly in terms of time.
I can use CDATA to escape the element text but it can not be used for attribute values containing conditional expressions. I am using Antlr Java grammar to parse the Java program and getting the attributes and content for the tags. So is there any other workaround for it?
You will have to escape
" to &quot;
' to &apos;
< to &lt;
> to &gt;
& to &amp;
for XML.
In XML attributes you must escape
" with &quot;
< with &lt;
& with &amp;
if you wrap attribute values in double quotes ("), e.g.
<MyTag attr="If a&lt;b &amp; b&lt;c then a&lt;c, it's obvious"/>
This means a tag MyTag with an attribute attr whose text is: If a<b & b<c then a<c, it's obvious. Note that there is no need to use &apos; to escape the ' character.
If you wrap attribute values in single quotes (') then you should escape these characters:
' with &apos;
< with &lt;
& with &amp;
and you can write " as is.
Escaping > as &gt; in attribute text is not required, e.g. <a b=">"/> is well-formed XML.
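The rules above can be sketched as a small Java helper that escapes only what the chosen quote character requires. The class and method names are invented for this example. The sketch deliberately ignores one further detail: literal tabs and line breaks in attribute values are normalized to spaces by the parser unless they are written as character references.

```java
final class AttributeEscaper {

    /** Escapes '&', '<' and the surrounding quote character; everything else is kept. */
    static String escapeAttribute(String raw, char quote) {
        if (quote != '"' && quote != '\'') {
            throw new IllegalArgumentException("quote must be \" or '");
        }
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '&') {
                out.append("&amp;");
            } else if (c == '<') {
                out.append("&lt;");
            } else if (c == quote) {
                out.append(quote == '"' ? "&quot;" : "&apos;");
            } else {
                out.append(c); // includes '>' and the other quote character
            }
        }
        return out.toString();
    }

    /** Builds name="value" (or name='value') with the value escaped. */
    static String attribute(String name, String raw, char quote) {
        return name + "=" + quote + escapeAttribute(raw, quote) + quote;
    }

    public static void main(String[] args) {
        String condition = "If a<b & b<c then a<c, it's obvious";
        System.out.println("<MyTag " + attribute("attr", condition, '"') + "/>");
        System.out.println("<MyTag " + attribute("attr", condition, '\'') + "/>");
    }
}
```

A single pass over the string like this is linear in its length, which may help put the question's worry about the cost of escaping into perspective.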
