How to solve docbuilder SAX exception: unexpected end of document? - java

I have a service that gives some car information in an xml format.
<?xml version="1.0" encoding='UTF-8'?>
<cars>
<car>
<id>5</id>
<name>qwer</name>
</car>
<car>
<id>6</id>
<name>qwert</name>
</car>
</cars>
Now the problem I'm having is with this code:
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document doc = docBuilder.parse(xml);
It sometimes throws a SAXException whose cause is "SAXException: unexpected end of document". Sometimes it works just fine, but after I reboot the server (still in development) I sometimes keep getting the exception.
But when I put a BufferedReader there to see what it is receiving, copy the value into an XML document, and open it in Firefox/IE, it looks just fine.

An XML document must have one, and only one, root element.
You should have a <cars> element (or similar) wrapping your group of <car>s.
The error message doesn't make sense though - since you have unexpected content after what should be the end of the document.

You get this exception because the example you entered is a valid XML fragment (as a consequence readable by Firefox), but an invalid XML document, as it has more than one root node, which is forbidden by XML rules.
Try to create one XML document for each <car> tag, and SAX will be fine.
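A quick way to see the single-root rule in action is to feed a few strings to DocumentBuilder and check which ones parse. This is only a sketch: the class and method names (RootElementCheck, parses) are made up for illustration, and the third input, a document cut off halfway, is my own assumption about another way a parse can fail, not something taken from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

class RootElementCheck {
    // Hypothetical helper: true if the text parses as one well-formed XML document.
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // SAXException signals malformed input; any other failure is treated the same here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Two root elements: rejected.
        System.out.println(parses("<car><id>5</id></car><car><id>6</id></car>"));
        // One wrapping root element: accepted.
        System.out.println(parses("<cars><car><id>5</id></car><car><id>6</id></car></cars>"));
        // Assumed case: input truncated before the closing tags: rejected.
        System.out.println(parses("<cars><car><id>5</id>"));
    }
}
```

The parser may also print its own error message to stderr for the rejected inputs; that is the default error handler, not part of this sketch.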

Related

Special characters creates problem while writing xml

First of all, please excuse my shallow understanding of coding, as I am a business analyst. Now my question: I am writing Java code to convert a CSV file into XML. I am able to read the CSV into objects successfully. However, while writing the XML, an error is thrown whenever a special character such as a space or "=" is encountered.
Here is the problematic piece of code. I have improvised the value passed to createElement just to highlight the problem; in the actual code I get this value from an object:
DocumentBuilderFactory documentFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentFactory.newDocumentBuilder();
Document xmlDocument= documentBuilder.newDocument();
Element root = xmlDocument.createElement("Media NationalGroupId=\"8\" AllFTA=\"1002\" AllSTV=\"1001\"");
xmlDocument.appendChild(root);
My xml should look something like this
<Media DateCreated="20200224 145251" NationalGroupId="8" AllFTA="1002" AllSTV="1001" AllTV="1000" NextId="1000000">
createElement should only receive Media as the argument.
To add the other attributes (DateCreated, NationalGroupId, etc), you need to call setAttribute on root, one by one.
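As a minimal sketch of that advice, the element is created with the bare name and each attribute is set separately. The attribute names and values are copied from the target XML in the question; the class and method names (MediaElementExample, buildMedia) are made up for this example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class MediaElementExample {
    // Hypothetical helper: builds the <Media> root element with its attributes.
    static Element buildMedia() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            // Only the element name goes into createElement.
            Element root = doc.createElement("Media");
            // Each attribute is added on its own; values may contain spaces.
            root.setAttribute("DateCreated", "20200224 145251");
            root.setAttribute("NationalGroupId", "8");
            root.setAttribute("AllFTA", "1002");
            root.setAttribute("AllSTV", "1001");
            root.setAttribute("AllTV", "1000");
            root.setAttribute("NextId", "1000000");
            doc.appendChild(root);
            return root;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element media = buildMedia();
        System.out.println(media.getTagName() + " has "
                + media.getAttributes().getLength() + " attributes");
    }
}
```

Note that a DOM serializer is free to write the attributes in a different order than they were set; attribute order carries no meaning in XML.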

Error while parsing a feed due to a whitespace in XML file

In one part of my app, I'm using this code to read a RSS feed:
DocumentBuilder builder = factory.newDocumentBuilder();
Document dom = builder.parse(this.url.openConnection().getInputStream());
Element root = dom.getDocumentElement();
NodeList items = root.getElementsByTagName("item");
for (int i=0;i<items.getLength();i++){...
The problem is that one of the feeds I want to read starts with a whitespace character just before the <?xml, like this:
<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Joomla! - Open Source Content Management" -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
When my app tries to read this feed, it throws the following error:
org.xml.sax.SAXParseException: processing instructions must not start with xml (position:unknown #1:2 in java.io.InputStreamReader#605667c)
Now my question is: how can I avoid this error?
Thanks.
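If you are sure of the structure of the response, you can use the skip() method of InputStream. Note that skip() returns the number of bytes skipped, not the stream, so call it before handing the stream to the parser:
InputStream in = this.url.openConnection().getInputStream();
in.skip(1); // discard the single leading whitespace byte
Document dom = builder.parse(in);
Element root = dom.getDocumentElement();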
Otherwise, convert the InputStream to a String (Apache Commons IO's IOUtils can do this), process the String, and then parse the XML.
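The String-based approach could look roughly like the sketch below, which uses only the standard library instead of IOUtils. It assumes the feed is UTF-8 (as its declaration says) and that the stray characters are plain whitespace; String.trim() would not remove a byte order mark. The names LeadingWhitespaceFeed and parseTrimmed are invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class LeadingWhitespaceFeed {
    // Hypothetical helper: reads the whole stream, trims surrounding whitespace, then parses.
    static Document parseTrimmed(InputStream in) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            // Assumption: the feed is UTF-8 encoded.
            String xml = new String(buffer.toByteArray(), StandardCharsets.UTF_8).trim();
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse feed", e);
        }
    }

    public static void main(String[] args) {
        String feed = " <?xml version=\"1.0\" encoding=\"utf-8\"?>"
                + "<rss version=\"2.0\"><channel><item/><item/></channel></rss>";
        Document dom = parseTrimmed(new ByteArrayInputStream(feed.getBytes(StandardCharsets.UTF_8)));
        System.out.println(dom.getDocumentElement().getElementsByTagName("item").getLength());
    }
}
```

In the app, the argument would be the stream from this.url.openConnection().getInputStream(). Reading the whole feed into memory is fine for typical RSS sizes but is a trade-off to keep in mind for very large documents.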

how to parse an XML file in Android (using navigation drawer as layout)

I'm developing an Android application and would like to use a navigation drawer as the layout.
I'm currently reusing parts of the code autogenerated by the Eclipse IDE. The problem is that I can't read my XML file correctly: I keep being told that the file is not reachable because of an incorrect path (could not open file:///MYPATH).
I put my XML file in the xml directory, which is a subdirectory of "res".
How can I access this directory from a utility class that is not linked to any activity class of my project?
What is the best way to parse and print an XML file within an Android application?
Is there a specialized API for doing this?
(sample taken from my XML file:
<?xml version="1.0" encoding="UTF-8"?>
<data>
<zname id = "breast" name="Breast">
<cancer cancer_name="">
<t_parameter t_level="" t_desc="">
</t_parameter>
<n_parameter n_level="" n_desc="">
</n_parameter>
<m_parameter m_level="" m_desc="">
</m_parameter>
<stage stage_level="" stage_desc="">
</stage>
<guideline url="">
</guideline>
</cancer>
</zname>
</data>
)
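Create an InputStream. Note that openRawResource() reads files stored under res/raw, so move the file there (files under res/xml are compiled to a binary format and are read with getResources().getXml() instead):
InputStream is = context.getResources().openRawResource(R.raw.data);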
Once you have the input stream (above) you can pass it to an instance of DocumentBuilder:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document dom = builder.parse(is);
Element root = dom.getDocumentElement();
NodeList items = root.getElementsByTagName("zname");
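To go from the NodeList to actual values, each node can be cast to Element and its attributes read. The sketch below is plain Java so it can run outside Android: it parses a String instead of a resource stream, and the names ZnameReader and znameNames are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ZnameReader {
    // Hypothetical helper: returns "id=name" for every <zname> element in the document.
    static List<String> znameNames(String xml) {
        try {
            Document dom = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList items = dom.getDocumentElement().getElementsByTagName("zname");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < items.getLength(); i++) {
                // getElementsByTagName only returns element nodes, so the cast is safe.
                Element zname = (Element) items.item(i);
                result.add(zname.getAttribute("id") + "=" + zname.getAttribute("name"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<data><zname id = \"breast\" name=\"Breast\">"
                + "<cancer cancer_name=\"\"></cancer></zname></data>";
        System.out.println(znameNames(xml)); // prints [breast=Breast]
    }
}
```

On Android the same loop would work on the Document obtained from the resource InputStream.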

Using java to generating XML with specific DTD declarations

I need to generate an XML file that contains a specific XML declaration and DTD declaration, as shown below:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE paymentService PUBLIC "-//CompanyName//DTD CompanyName PaymentService v2//EN"
"http://dtd.CompanyName.com/paymentService_v2.dtd">
The remaining XML to be generated also has specific elements and associated values.
I was wondering what would be the best way to generate this XML within my Java class: using a StringBuffer or DOM? Any suggestions with an example or sample code would be hugely appreciated.
Thanks
I would recommend using the Java DOM API. Dealing with XML or XHTML in String objects is notoriously time-consuming and bug-prone, so try to use a proper XML API like DOM whenever you have the option.
The code below adds a doc type to the document using Java DOM. The <?xml ...?> declaration is not stored in the DOM tree; it is normally added at the top when the document is serialized, for example by a Transformer.
// Create document
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.newDocument();
//Create doc type
DOMImplementation domImpl = doc.getImplementation();
DocumentType doctype = domImpl.createDocumentType("paymentService", "-//CompanyName//DTD CompanyName PaymentService v2//EN", "http://dtd.CompanyName.com/paymentService_v2.dtd");
doc.appendChild(doctype);
// Add root element
Element rootElement = doc.createElement("root");
doc.appendChild(rootElement);
Once serialized, the XML created by the above should look like this:
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE paymentService PUBLIC "-//CompanyName//DTD CompanyName PaymentService v2//EN" "http://dtd.CompanyName.com/paymentService_v2.dtd">
<root>
</root>
A great many of the methods used in the code above can throw a large number and variety of exceptions, so make sure your exception handling is up to scratch. I hope this helps.
Link to the Official DOM API Guide
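For the serialization step itself, one option is a javax.xml.transform Transformer with the DOCTYPE identifiers set as output properties, which avoids relying on how a particular serializer treats a DocumentType node. The sketch below makes a few assumptions of its own: the root element is named paymentService so that it matches the DOCTYPE name, no DocumentType node is appended to the DOM, and the class and method names (DoctypeWriter, render) are invented for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

class DoctypeWriter {
    // Hypothetical helper: builds a one-element document and serializes it with a DOCTYPE.
    static String render() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            // Assumption: the root element carries the same name as the DOCTYPE.
            doc.appendChild(doc.createElement("paymentService"));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // The serializer writes the DOCTYPE from these two output properties.
            transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC,
                    "-//CompanyName//DTD CompanyName PaymentService v2//EN");
            transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM,
                    "http://dtd.CompanyName.com/paymentService_v2.dtd");

            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

To write to a file instead of a String, pass a StreamResult wrapping a File or an OutputStream.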

Java parsing xml file with appended data

I have an XML file which looks like this:
<Header>
<Type>TestType</Type>
<Owner>Me</Owner>
</Header>
ĺß™¸Ű;?źÉćáţ¬=ńgăűßEŶáCórýjąŞŢđ·I_§Ä†ÉD¤ďsĂŢŘö¤xi¦Ö†5ÚPMáx^š‡âő
Those funny letters are binary coded data.
I'm having trouble parsing it. All I want to do is read the values of the Type and Owner nodes and the data after Header. That data can be big. It's basically XML with data appended after it. The header always starts with <Header> and ends with </Header>. The number of child nodes in it can change.
I tried just simple parsing:
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(f);
and what I got was:
com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 2 of 3-byte UTF-8 sequence.
In order to be processed by an XML parser a file must be well formed and optionally valid (The latter requires testing against a "schema" describing the expected tag format).
In this case your document is not well formed:
$ xmllint --noout File1.xml
File1.xml:5: parser error : Extra content at the end of the document
ĺß™¸Ű;?źÉćáţ¬=ńgăűßEŶáCórýjąŞŢđ·I_§Ä†ÉD¤ďsĂ
^
I would suggest finding some way to strip away the offending characters and then processing the properly formatted XML. For example, assuming the XML is in the first 4 lines of the file:
head -n 4 File1.xml | xmllint --noout -
You could try a SAX parser instead which does not read in the whole document. Just read in elements/attributes until you have what you want, then stop.
But this is not a well formed XML file. If possible, fix it by putting the (encoded) binary data into its own element.
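If the file format cannot be changed, the "strip, then parse" idea can be done in Java by cutting the file at the end of </Header>. This is a sketch under several assumptions: the whole file fits in memory (the question says the data can be big, so a streaming variant may be needed), the byte sequence </Header> first appears as the real closing tag, and everything after it, including any line break, counts as payload. HeaderSplitter and its methods are made-up names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class HeaderSplitter {
    static final String END_TAG = "</Header>";

    // Offset of the first byte after </Header>; throws if the tag is missing.
    static int headerEnd(byte[] file) {
        // ISO-8859-1 maps every byte to exactly one char, so char indexes equal byte offsets.
        String text = new String(file, StandardCharsets.ISO_8859_1);
        int start = text.indexOf(END_TAG);
        if (start < 0) {
            throw new IllegalArgumentException("no " + END_TAG + " found");
        }
        return start + END_TAG.length();
    }

    // Parses only the XML header part of the file.
    static Document parseHeader(byte[] file) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(file, 0, headerEnd(file)));
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse header", e);
        }
    }

    // Returns the raw bytes that follow the header.
    static byte[] payload(byte[] file) {
        return Arrays.copyOfRange(file, headerEnd(file), file.length);
    }
}
```

With the bytes of the file loaded (for example via java.nio.file.Files.readAllBytes), parseHeader(bytes).getElementsByTagName("Type").item(0).getTextContent() would give the Type value, and payload(bytes) the appended binary data.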
