I have a requirement to take any XML document, parse it dynamically, and store its contents in a stack/deque for further processing.
Can someone recommend a good way to parse XML dynamically in Java?
Consider this XML
<Response>
<Stock>
<RecordID>130</RecordID>
<SegmentLength>0023</SegmentLength>
<Account>
<Number>233342</Number>
<Type>P</Type>
</Account>
</Stock>
<Stock>
<RecordID>030</RecordID>
<SegmentLength>1023</SegmentLength>
<Account>
<Number>255673</Number>
<Type>P</Type>
</Account>
</Stock>
</Response>
How can I write a method that parses this XML dynamically and pushes the elements onto a stack/deque?
I don't think I can use DOM, because as far as I can tell DOM requires me to provide the element tag names while parsing. The program should be able to accept any XML and parse it dynamically.
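One way to do this without knowing any tag names in advance is a SAX handler that tracks the currently open elements and records each leaf element as it closes. The following is only a minimal sketch: the class name GenericXmlToDeque, the parse method, and the "path=text" entry format are made up for illustration, and you would likely want a richer entry type in real code.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: not from the question, just an illustration.
public class GenericXmlToDeque {

    /**
     * Parses any well-formed XML and returns one "path=text" entry per
     * leaf element, in document order. No tag names are hard-coded.
     */
    public static Deque<String> parse(String xml) {
        Deque<String> out = new ArrayDeque<>();
        Deque<String> path = new ArrayDeque<>(); // names of the currently open elements

        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean leaf; // true until a child element closes inside the current one

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                path.addLast(qName);
                text.setLength(0);
                leaf = true;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (leaf) {
                    out.addLast(String.join("/", path) + "=" + text.toString().trim());
                }
                path.removeLast();
                leaf = false; // the parent has at least one child, so it is not a leaf
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<Response><Stock><RecordID>130</RecordID>"
                + "<Account><Number>233342</Number><Type>P</Type></Account></Stock></Response>";
        parse(xml).forEach(System.out::println);
    }
}
```

Because the result is a Deque, you can consume it as a queue (pollFirst) or as a stack (pollLast), whichever suits the later processing.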
<XML>
<log>
<date>20022014</date>
<time>2323</time>
<schools>
<school name="ahss"/>
<student>shiva</student>
<class>B</class>
</schools>
</log>
<log>...</log>
</XML>
I need to parse this XML format using DOM. I have tried many alternatives but could not get it to work.
Can anyone help?
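For what it is worth, DOM can walk a document without being told any tag names: start at the document element and recurse over the child nodes. Below is a rough sketch under that assumption. The class name DomWalk, the describe method, and the indented output format are hypothetical and only meant to show the traversal.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: illustrates a generic DOM traversal.
public class DomWalk {

    /** Returns one line per element: indentation, tag name, attributes, and leaf text. */
    public static List<String> describe(String xml) {
        List<String> lines = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            walk(doc.getDocumentElement(), 0, lines);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return lines;
    }

    private static void walk(Element el, int depth, List<String> lines) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        line.append(el.getTagName());

        // Attributes such as name="ahss" are discovered, not hard-coded.
        NamedNodeMap atts = el.getAttributes();
        for (int i = 0; i < atts.getLength(); i++) {
            Node a = atts.item(i);
            line.append(" @").append(a.getNodeName()).append('=').append(a.getNodeValue());
        }

        // Collect child elements, skipping whitespace text nodes and comments.
        List<Element> children = new ArrayList<>();
        NodeList kids = el.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i) instanceof Element) {
                children.add((Element) kids.item(i));
            }
        }

        // Only leaf elements carry a text value in this sketch.
        if (children.isEmpty()) {
            String text = el.getTextContent().trim();
            if (!text.isEmpty()) {
                line.append(": ").append(text);
            }
        }
        lines.add(line.toString());

        for (Element child : children) {
            walk(child, depth + 1, lines);
        }
    }
}
```

Calling DomWalk.describe on the log document above would produce one line each for log, date, time, schools, school, student, and class, without the code mentioning any of those names.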
I am using the format below for my web service responses.
<Name>abc</Name>
<Detail>
<RESPONSE>
<Age>20</Age>
<Address>blahblah</Address>
<Mobile>12345</Mobile>
</RESPONSE>
</Detail>
Due to the requirements, I need to return XML-formatted data inside the <Detail></Detail> tag.
In my Java class, I use XStream to serialize the data into XML and put it inside the Detail tag.
But when I test using SoapUI, I get an extra <![CDATA[<RESPONSE>...</RESPONSE>]]> wrapper inside the Detail tag.
How can I avoid getting that CDATA wrapper in the XML response?
<![CDATA[......]]> tells the parser not to interpret its content as XML markup but to treat it as plain text, which is called character data. The parser will not look for any XML meaning inside it.
As Dave Newton and kshitij said, the CDATA wrapper is removed automatically when the XML is converted into an object.
If you are not parsing the raw text yourself, there is nothing to worry about.
I am working in Java. I am parsing an XML file and reading the tag values, and that works. My XML file is as follows:
<DOC>
<STUDENT>
<ID>1</ID>
<NAME>DAN</NAME>
<ADDRESS>U.K</ADDRESS>
</STUDENT>
<STUDENT>
<ID>2</ID>
<NAME>JACK</NAME>
<ADDRESS>U.S</ADDRESS>
</STUDENT>
</DOC>
My question is how to fetch everything inside <DOC>....</DOC>, including the tag names as well as the values. That is, I want the data as follows:
"<STUDENT>
<ID>1</ID>
<NAME>DAN</NAME>
<ADDRESS>U.K</ADDRESS>
</STUDENT>
<STUDENT>
<ID>2</ID>
<NAME>JACK</NAME>
<ADDRESS>U.S</ADDRESS>
</STUDENT>"
Please guide me how to do it.
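One possible approach, sketched below, is to parse the file with DOM and then serialize each child element of the root back to text with an identity Transformer. The class name InnerXml and the method innerXml are invented for this example, and the exact whitespace of the output may differ from the original file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: returns the markup found inside the root element.
public class InnerXml {

    public static String innerXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Identity transform: copies nodes to text without changing them.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            // Without this, every serialized child would start with <?xml ...?>.
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Serialize only elements (e.g. each STUDENT); skip whitespace text nodes.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    t.transform(new DOMSource(child), new StreamResult(out));
                }
            }
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not extract inner XML", e);
        }
    }
}
```

For the DOC example, InnerXml.innerXml(xml) should return both STUDENT elements, tags included, without the enclosing DOC tags.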
The most common approaches in Java are to use either a SAX or a DOM parsing library.
If you look them up you should find plenty of documentation and tutorials about them.
DOM is normally the easiest to use, as it stores the entire XML document in memory and you can then access any tag. However, it is less performant and can be problematic with very large XML documents. SAX requires more work, but it reads the XML as a stream and processes each tag as it reaches it.
Both are able to do what you need, though.
Take a look at SAX Parser.
This link might be helpful too: http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/
I have a 'complex item' that is in XML, and a 'workitem' (also in XML) that contains lots of other info. I would like the workitem to contain a string that holds the complex item's XML.
For example:
<inouts name="ClaimType" type="complex" value="<xml string here>"/>
However, with SAX and other Java parsers I cannot get this line to parse. The parser doesn't like the < or the " characters in the string. I have tried escaping them, and converting the " to '.
Is there any way around this at all, or will I have to come up with another solution?
Thanks
I think you'll find that the XML you're dealing with won't parse with a lot of parsers since it isn't well-formed (a raw < is not allowed inside an attribute value). If you have control over the XML, you'll at a bare minimum need to escape the attribute so it's something like:
<inouts name="ClaimType" type="complex" value="&lt;xml string here&gt;" />
Then, once you've extracted the attribute you can possibly re-parse it to treat it as XML.
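The escape-then-re-parse idea can be sketched as follows. The class name NestedAttributeXml and both method names are made up for the example. The parser unescapes the attribute when you read it, so the extracted string can be fed straight back into a second parse. Keep in mind that parsers normalize line breaks and tabs inside attribute values to spaces, so this suits compact, single-line XML best.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: XML stored inside an escaped attribute value.
public class NestedAttributeXml {

    /** Escapes raw text so it can be placed inside a double-quoted attribute. */
    public static String escapeForAttribute(String raw) {
        // '&' must be replaced first, or the other entities would be double-escaped.
        return raw.replace("&", "&amp;")
                  .replace("<", "&lt;")
                  .replace(">", "&gt;")
                  .replace("\"", "&quot;");
    }

    /** Parses the outer XML, reads the named attribute of the root, and parses that value as XML. */
    public static Document parseEmbedded(String outerXml, String attributeName) {
        Document outer = parse(outerXml);
        // getAttribute returns the unescaped value, i.e. the original inner XML text.
        String embedded = outer.getDocumentElement().getAttribute(attributeName);
        return parse(embedded);
    }

    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String inner = "<claim id=\"7\"><type>P</type></claim>";
        String outer = "<inouts name=\"ClaimType\" type=\"complex\" value=\""
                + escapeForAttribute(inner) + "\"/>";
        Document doc = parseEmbedded(outer, "value");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```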
Alternatively, you can take one of the approaches above (using CDATA sections) with some re-factoring of your XML.
If you don't have control over your XML, you could try using the TagSoup library to parse it to see how you go. (Disclaimer: I've only used TagSoup for HTML, I have no idea how it'd go with non-HTML content)
(The tag soup site actually appears down ATM, but you should be able to find enough doco on the web, and downloads via the maven repository)
Possibly the easiest solution would be to use a CDATA section. You could convert your example to look like this:
<inouts name="ClaimType" type="complex">
<![CDATA[
<xml string here>
]]>
</inouts>
If you have more than one attribute you want to store complex strings for, you could use multiple child elements with different names:
<inouts name="ClaimType" type="complex">
<value1>
<![CDATA[
<xml string here>
]]>
</value1>
<value2>
<![CDATA[
<xml string here>
]]>
</value2>
</inouts>
Or multiple value elements with an identifying id:
<inouts name="ClaimType" type="complex">
<value id="complexString1">
<![CDATA[
<xml string here>
]]>
</value>
<value id="complexString2">
<![CDATA[
<xml string here>
]]>
</value>
</inouts>
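Reading such CDATA values back is straightforward with DOM, since getTextContent() returns the character data without the CDATA markers. Here is a small sketch for the "value elements with an id" layout. The class name CdataValues and the method valuesById are assumptions made for the example, and the returned strings are trimmed because the CDATA sections above are surrounded by line breaks.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects the CDATA payload of each <value id="..."> element.
public class CdataValues {

    /** Maps each value element's id attribute to its (trimmed) text content. */
    public static Map<String, String> valuesById(String xml) {
        Map<String, String> result = new LinkedHashMap<>(); // keeps document order
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList values = doc.getElementsByTagName("value");
            for (int i = 0; i < values.getLength(); i++) {
                Element value = (Element) values.item(i);
                // The CDATA content comes back as plain text; markup inside it is not parsed.
                result.put(value.getAttribute("id"), value.getTextContent().trim());
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return result;
    }
}
```

If the stored string is itself well-formed XML, each map value can then be handed to a second parser run.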
CDATA section or escaping
NB There is a big difference between escaping and encoding, which some other posters have referred to. Be careful of confusing the two.
I'm not sure how it works for attributes, and if escaping (< as &lt; and > as &gt;) does not work, then I don't know.
If it were an inner tag: you could use the Xml Any mechanism (never used it myself) or declare it in a CDATA section.
you are http://www.doingitwrong.com/
If inouts/@value really is tree-structured (i.e. XML) then it shouldn't be an attribute, it should be a child element:
<inout name="ClaimType" type="complex">
<value>
<some-arbitrary>
<xml-stuff/>
</some-arbitrary>
</value>
</inout>
If it is not, in fact, guaranteed to be well-formed XML, but just sort of looks like it because you put some pointy brackets in it, then you should ask yourself if there isn't some better way to solve this problem. That failing, use <![CDATA[ as some have already suggested.