SAXParser catches nothing with minimized XML document - java

I'm running a small Android project that reads RSS/Atom feed documents using the SAX library. Everything works well for ordinary RSS sources, but with minimized sources (without spaces or newline tokens) it produces nothing but a list of blank items. Logcat also shows nothing. I double-checked this with various RSS sites, but the problem is still there. Below is my subclass of DefaultHandler, which I use to handle RSS sources:
public class RssContentHandler extends DefaultHandler {
private static final int UNKNOWN_STATE = -1;
private static final int ELEMENT_START = 0;
private static final int TITLE_END = 1;
private static final int DESCRIPTION_END = 2;
private static final int LINK_END = 3;
private static final int PUBDATE_END = 4;
private static final int CHANNEL_END = 5;
private int iState = UNKNOWN_STATE;
private String fullCharacters;
private boolean itemFound = false;
private RssItem rssItem;
private RssFeed rssFeed;
public RssContentHandler() {
}
public RssFeed getFeed() {
return this.rssFeed;
}
@Override
public void startDocument() {
rssItem = new RssItem();
rssFeed = new RssFeed();
Log.i("startDocument", "startDocument");
}
@Override
public void endDocument() {
}
@Override
public void startElement(String _uri, String _localName, String _qName, Attributes _attributes) {
if (_localName.equalsIgnoreCase("item")) {
itemFound = true;
rssItem = new RssItem();
this.iState = UNKNOWN_STATE;
} else
this.iState = ELEMENT_START;
fullCharacters = "";
}
@Override
public void endElement(String _uri, String _localName, String _qName) {
if (_localName.equalsIgnoreCase("item"))
this.rssFeed.addItem(this.rssItem);
else if (_localName.equalsIgnoreCase("title"))
this.iState = TITLE_END;
else if (_localName.equalsIgnoreCase("description"))
this.iState = DESCRIPTION_END;
else if (_localName.equalsIgnoreCase("link"))
this.iState = LINK_END;
else if (_localName.equalsIgnoreCase("pubDate"))
this.iState = PUBDATE_END;
else if (_localName.equalsIgnoreCase("channel"))
this.iState = CHANNEL_END;
else
this.iState = UNKNOWN_STATE;
}
@Override
public void characters(char[] _ch, int _start, int _length) {
String strCharacters = new String(_ch, _start, _length);
if (this.iState == ELEMENT_START)
fullCharacters += strCharacters;
else {
if (!itemFound) {
switch (this.iState) {
case TITLE_END:
this.rssFeed.setTitle(fullCharacters);
break;
case DESCRIPTION_END:
this.rssFeed.setDescription(fullCharacters);
break;
case LINK_END:
this.rssFeed.setLink(fullCharacters);
break;
case PUBDATE_END:
this.rssFeed.setPubDate(fullCharacters);
break;
}
} else {
switch (this.iState) {
case TITLE_END:
this.rssItem.setTitle(fullCharacters);
Log.i("characters", fullCharacters);
break;
case DESCRIPTION_END:
this.rssItem.setDescription(fullCharacters);
break;
case LINK_END:
this.rssItem.setLink(fullCharacters);
break;
case PUBDATE_END:
this.rssItem.setPubDate(fullCharacters);
break;
}
}
this.iState = UNKNOWN_STATE;
}
}
}
And the snippet that sets up the parser:
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet();
try {
request.setURI(new URI(_strUrl));
} catch (URISyntaxException e) {
e.printStackTrace();
}
HttpResponse response = client.execute(request);
Reader inputStream = new InputStreamReader(response.getEntity().getContent());
RssContentHandler rssContentHandler = new RssContentHandler();
InputSource inputSource = new InputSource();
inputSource.setCharacterStream(inputStream);
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
SAXParser saxParser = saxParserFactory.newSAXParser();
saxParser.parse(inputSource, rssContentHandler);
this.rssFeed = rssContentHandler.getFeed();
P.S. I'm using Android 2.3 x86 installed on VirtualBox for debugging, and these sources work fine with the built-in RSS reader app that comes with the x86 version. So what's wrong here?

Try with _qName instead of _localName.
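One likely reason the posted handler fails on minimized feeds: it stores each value inside characters() only after endElement has set the state, so it depends on a whitespace text node following every closing tag. A minimal sketch of an alternative, assuming a plain javax.xml.parsers setup, is shown below. The class name TitleHandler and the helper parseTitles are hypothetical; the handler buffers text in characters(), stores it in endElement, and falls back to qName when localName is empty.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the <title> of every <item>.
final class TitleHandler extends DefaultHandler {
    final List<String> titles = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inItem;

    // Without namespace awareness localName may be empty, so fall back to qName.
    private static String name(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (name(localName, qName).equalsIgnoreCase("item")) {
            inItem = true;
        }
        text.setLength(0); // start a fresh buffer for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times for one element, so only append here.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String n = name(localName, qName);
        if (inItem && n.equalsIgnoreCase("title")) {
            titles.add(text.toString().trim()); // the value is complete only now
        } else if (n.equalsIgnoreCase("item")) {
            inItem = false;
        }
        text.setLength(0);
    }

    // Convenience entry point for the sketch: parse a string and return item titles.
    static List<String> parseTitles(String xml) throws Exception {
        TitleHandler handler = new TitleHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.titles;
    }
}
```

Because nothing is stored until endElement, the result does not depend on whether whitespace separates the tags.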
Your XML contains CDATA, which your current handler does not treat specially. To be notified of CDATA boundaries when parsing raw HTML, you have to use a LexicalHandler.
public class MyHandler implements LexicalHandler {
public void startDTD(String name, String publicId, String systemId)
throws SAXException {}
public void endDTD() throws SAXException {}
public void startEntity(String name) throws SAXException {}
public void endEntity(String name) throws SAXException {}
public void startCDATA() throws SAXException {}
public void endCDATA() throws SAXException {}
public void comment (char[] text, int start, int length)
throws SAXException {
String comment = new String(text, start, length);
System.out.println(comment);
}
}
You can also parse your XML with DOM if memory is not an issue. For more help, see Handling Lexical Events.
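A LexicalHandler only receives events once it is registered on the XMLReader through the standard SAX property http://xml.org/sax/properties/lexical-handler. The sketch below assumes the JDK's built-in parser and uses DefaultHandler2, which already implements LexicalHandler; the class name CdataDemo is hypothetical. It also shows that the text inside a CDATA section still arrives through characters().

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical demo: counts CDATA sections and collects all character data.
final class CdataDemo extends DefaultHandler2 {
    final StringBuilder text = new StringBuilder();
    int cdataSections;

    @Override
    public void startCDATA() {
        cdataSections++; // lexical event: a CDATA section begins
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // CDATA content is delivered here too
    }

    static CdataDemo parse(String xml) throws Exception {
        CdataDemo handler = new CdataDemo();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // Standard SAX2 property used to register a LexicalHandler.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler;
    }
}
```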


How to convert a simple method that returns the List<String> into Multi<String> based on Smallrye Mutiny?

I am developing an application that reads an XML file and creates a hash ID based on the details present in the XML. As of now, everything is working perfectly, and I am able to get the List<String>.
I would like to convert this application to reactive streams using SmallRye Mutiny. I went through some of the documentation but did not clearly understand how to convert it so that I do not have to wait for the whole XML file to be processed before returning the List<String>. Instead, I would like to start returning the Multi<String> items as and when they are generated.
Following is the simple XML that I am reading using SAX Parser to create the Hash ID:
<customerList>
<customer>
<name>Batman</name>
<age>25</age>
</customer>
<customer>
<name>Superman</name>
<age>28</age>
</customer>
</customerList>
Following is the Main application which will make a call to SaxHandler:
public Multi<String> xmlEventHashGenerator(final InputStream xmlStream) throws SAXException, ParserConfigurationException, IOException {
final SAXParserFactory factory = SAXParserFactory.newInstance();
final SaxHandler saxHandler = new SaxHandler();
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
factory.newSAXParser().parse(xmlStream, saxHandler);
return Multi.createFrom().emitter(em ->{
saxHandler.getRootNodes().forEach(contextNode -> {
final String preHashString = contextNode.toString();
try {
final StringBuilder hashId = new StringBuilder();
final byte[] digest = MessageDigest.getInstance("SHA-256").digest(preHashString.getBytes(StandardCharsets.UTF_8));
hashId.append(DatatypeConverter.printHexBinary(digest).toLowerCase());
em.emit(hashId.toString());
} catch (NoSuchAlgorithmException e) {
e.printStackTrace();
}
});
em.complete();
});
}
Following is the SaxHandler which will read the XML and create HashIDs:
public class SaxHandler extends DefaultHandler {
@Getter
private final List<String> eventHashIds = new ArrayList<>();
@Getter
private final List<ContextNode> rootNodes = new ArrayList<>();
private final HashMap<String, String> contextHeader = new HashMap<>();
private final String hashAlgorithm;
private ContextNode currentNode = null;
private ContextNode rootNode = null;
private final StringBuilder currentValue = new StringBuilder();
public SaxHandler(final String hashAlgorithm) {
this.hashAlgorithm = hashAlgorithm;
}
@Override
public void startElement(final String uri, final String localName, final String qName, final Attributes attributes) {
if (rootNode == null && qName.equals("customer")) {
rootNode = new ContextNode(contextHeader);
currentNode = rootNode;
rootNode.children.add(new ContextNode(rootNode, "type", qName));
}else if (currentNode != null) {
ContextNode n = new ContextNode(currentNode, qName, (String) null);
currentNode.children.add(n);
currentNode = n;
}
}
@Override
public void characters(char[] ch, int start, int length) {
currentValue.append(ch, start, length);
}
@Override
public void endElement(final String uri, final String localName, final String qName) {
if (rootNode != null && !qName.equals("customer")) {
final String value = !currentValue.toString().trim().equals("") ? currentValue.toString().trim() : null;
currentNode.children.add(new ContextNode(currentNode, qName, value));
}
if (qName.equals("customer")) {
rootNodes.add(rootNode);
rootNode = null;
}
currentValue.setLength(0);
}
}
Following is the Test:
@Test
public void xmlTest() throws Exception {
final HashGenerator eventHashGenerator = new HashGenerator();
final InputStream xmlStream = getClass().getResourceAsStream("/customer.xml");
final List<String> eventHashIds = eventHashGenerator.xmlHashGenerator(xmlStream, "sha3-256");
System.out.println("\nGenerated Event Hash Ids : \n" + eventHashIds);
}
Can someone please point me to an example or give me some idea of how to convert this application into a SmallRye Mutiny Multi<String> based application?
I think you can refactor xmlEventHashGenerator to
public Multi<String> xmlEventHashGenerator(final InputStream xmlStream) throws SAXException, ParserConfigurationException, IOException {
final SAXParserFactory factory = SAXParserFactory.newInstance();
final SaxHandler saxHandler = new SaxHandler();
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
factory.newSAXParser().parse(xmlStream, saxHandler);
return Multi.createFrom()
.iterable( saxHandler.getRootNodes() )
.map( ContextNode::toString )
.map( this::convertDatatype );
}
private String convertDatatype(String preHashString) {
try {
// I think we could create the MessageDigest instance only once
byte[] digest = MessageDigest.getInstance( "SHA-256" )
.digest( preHashString.getBytes( StandardCharsets.UTF_8 ) );
return DatatypeConverter.printHexBinary( digest ).toLowerCase();
}
catch (NoSuchAlgorithmException e) {
throw new IllegalArgumentException( e );
}
}
The test method will look something like:
@Test
public void xmlTest() throws Exception {
final HashGenerator eventHashGenerator = new HashGenerator();
final InputStream xmlStream = getClass().getResourceAsStream("/customer.xml");
System.out.println("Generated Event Hash Ids: ");
eventHashGenerator
.xmlEventHashGenerator(xmlStream)
// Print all the hash codes
.invoke( hash -> System.out.println( hash ) )
// A Multi has no await(), so collect the items into a Uni first
.collect().asList()
.await().indefinitely();
}
But if you want to concatenate all the hash codes, you can do:
@Test
public void xmlTest() throws Exception {
final HashGenerator eventHashGenerator = new HashGenerator();
final InputStream xmlStream = getClass()
.getResourceAsStream("/customer.xml");
String hash = eventHashGenerator
.xmlEventHashGenerator(xmlStream)
// Concatenate all the results
.collect().with( Collectors.joining() )
// Print the hashcode
.invoke( hashcode -> System.out.println("\nGenerated Event Hash Ids : \n" + hashcode) )
.await().indefinitely();
}
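The hashing step above relies on DatatypeConverter, which belongs to JAXB and is not bundled with JDK 11 and later. A minimal sketch of the same hex-encoded SHA-256 step using only the standard library follows; the names HashUtil and sha256Hex are hypothetical and not part of the original code.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: lower-case hex SHA-256 without javax.xml.bind.
final class HashUtil {
    static String sha256Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Emit the high nibble, then the low nibble, as hex digits.
                hex.append(Character.forDigit((b >> 4) & 0xF, 16));
                hex.append(Character.forDigit(b & 0xF, 16));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }
}
```

A method like this could be passed to the map step in place of convertDatatype, for example as HashUtil::sha256Hex.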

XML parsing in Android not working

I have a few Java files that I have to try and get info from an XML on the internet. I made the files with the help of some tutorials online but I can't find the problem with what I have.
Below are the three classes I used.
MainXMLClass.java
public class News extends ActionBarActivity {
static final String baseURL = "http://coderdojo.com/rss.xml";
ListView News;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.news);
getActionBar().setHomeButtonEnabled(true);
xmlRefs();
GetURLData();
ArrayList<String> XMLData = new ArrayList<>();
XMLData.add(XMLDataCollected.GetXMLData());
ArrayAdapter<String> adapter = new ArrayAdapter<String>(this,
R.id.lvNews, XMLData);
News.setAdapter(adapter);
}
private void xmlRefs() {
// TODO Auto-generated method stub
News = (ListView) findViewById(R.id.lvNews);
}
private void GetURLData() {
// TODO Auto-generated method stub
try {
URL webPage = new URL(baseURL);
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser();
XMLReader reader = parser.getXMLReader();
XMLDataHandler Data = new XMLDataHandler();
reader.setContentHandler(Data);
reader.parse(new InputSource(webPage.openStream()));
} catch (Exception e) {
e.printStackTrace();
}
}
}
My XMLHandler.java Class:
public class XMLDataHandler extends DefaultHandler {
XMLDataCollected Info = new XMLDataCollected();
public String getInformation() {
return XMLDataCollected.GetXMLData();
}
@Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
if (localName.equals("title")) {
String title = localName.getBytes().toString();
Info.setTitle(title);
} else if (localName.equals("link")) {
String link = localName.getBytes().toString();
Info.setLink(link);
} else if (localName.equals("description")) {
String description = localName.getBytes().toString();
Info.setDescription(description);
}
}
}
And finally my XMLDataCollected.java class:
public class XMLDataCollected {
static String title;
static String description;
static String link;
public void setTitle(String t) {
title = t;
}
public void setDescription(String d) {
description = d;
}
public void setLink(String l) {
link = l;
}
public static String GetXMLData() {
return title + description + link;
}
}
I've been trying for about three days to get this sorted but so far I haven't been able to find a solution anywhere.
This is my first time trying to use XML parsing so I'm aware there is bound to be a few things wrong with the files but any help is appreciated.
I may be missing something, but are you sure about that URL? As I understand it, the final URL is http://coderdojo.com/news?page=0 (or some other page number), and when I typed it into the browser, the result was not in XML format.

XML response how to assign values to variables

I get an XML response to an HTTP request and store it in a String variable:
String str = in.readLine();
And the contents of str is:
<response>
<lastUpdate>2012-04-26 21:29:18</lastUpdate>
<state>tx</state>
<population>
<li>
<timeWindow>DAYS7</timeWindow>
<confidenceInterval>
<high>15</high>
<low>0</low>
</confidenceInterval>
<size>0</size>
</li>
</population>
</response>
I want to assign the values tx and DAYS7 to variables. How do I do that?
Thanks
Slightly modified code from http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/
public class ReadXMLFile {
// Your variables
static String state;
static String timeWindow;
public static void main(String argv[]) {
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
// Http Response you get
String httpResponse = "<response><lastUpdate>2012-04-26 21:29:18</lastUpdate><state>tx</state><population><li><timeWindow>DAYS7</timeWindow><confidenceInterval><high>15</high><low>0</low></confidenceInterval><size>0</size></li></population></response>";
DefaultHandler handler = new DefaultHandler() {
boolean bstate = false;
boolean tw = false;
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
if (qName.equalsIgnoreCase("STATE")) {
bstate = true;
}
if (qName.equalsIgnoreCase("TIMEWINDOW")) {
tw = true;
}
}
public void characters(char ch[], int start, int length) throws SAXException {
if (bstate) {
state = new String(ch, start, length);
bstate = false;
}
if (tw) {
timeWindow = new String(ch, start, length);
tw = false;
}
}
};
saxParser.parse(new InputSource(new ByteArrayInputStream(httpResponse.getBytes("utf-8"))), handler);
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("State is " + state);
System.out.println("Time window is " + timeWindow);
}
}
If you're running this as part of a larger process, you might want to make ReadXMLFile extend DefaultHandler.
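For pulling a couple of values out of a small response, XPath is a different technique from the SAX handler above and avoids the state flags altogether. A minimal sketch using the JDK's javax.xml.xpath API follows; the class name XPathExtract and the method valueAt are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates one XPath expression against an XML string.
final class XPathExtract {
    static String valueAt(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Returns the string value of the first node matched by the expression.
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }
}
```

With the response from the question, the two values could be read with the expressions /response/state and /response/population/li/timeWindow. This parses the document once per call, which is fine for small responses but not for large ones.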

Parsing an XML file from a network database in Android

I am trying to parse an XML file from a URL. I found an example at the following link:
http://www.anddev.org/parsing_xml_from_the_net_-_using_the_saxparser-t353.html
and tried using it in my code, but it returned null values.
Following is my XML parsing code:
public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
URL url = new URL("http://www.siva.com/search");
/** Handling XML */
SAXParserFactory saxparserfactory = SAXParserFactory.newInstance();
SAXParser saxparser = saxparserfactory.newSAXParser();
XMLReader xmlreader = saxparser.getXMLReader();
/* Create a new ContentHandler and apply it to the XML-Reader*/
ForListXMLHandler forlistmyhandler = new ForListXMLHandler();
xmlreader.setContentHandler(forlistmyhandler);
/* Parse the xml-data from our URL. */
xmlreader.parse(new InputSource(url.openStream()));
/* Parsing has finished. */
/* Our ExampleHandler now provides the parsed data to us. */
ParsedDataSet parsedDataSet = forlistmyhandler.getParsedData();
System.out.println(parsedDataSet.toString());
}
Following is the code of my XML handler:
public class ForListXMLHandler extends DefaultHandler {
private boolean in_outertag = false;
private boolean in_innertag = false;
private boolean in_First_name = false;
private boolean in_Last_name = false;
private ParsedDataSet myParsedDataSet = new ParsedDataSet();
public ParsedDataSet getParsedData() {
return this.myParsedDataSet;
}
@Override
public void startDocument() throws SAXException {
this.myParsedDataSet = new ParsedDataSet();
}
@Override
public void endDocument() throws SAXException {
// Nothing to do
}
public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException {
if (localName.equals("Searchdata")) {
this.in_outertag = true;
} else if (localName.equals("Searchdata")) {
this.in_innertag = true;
} else if (localName.equals("First_name")) {
this.in_First_name = true;
} else if (localName.equals("Last_name")) {
this.in_Last_name = true;
}
}
/**
* Gets called on closing tags like: </tag>
*/
@Override
public void endElement(String namespaceURI, String localName, String qName) throws SAXException {
if (localName.equals("Searchdata")) {
this.in_outertag = false;
} else if (localName.equals("Searchdata")) {
this.in_innertag = false;
} else if (localName.equals("First_name")) {
this.in_First_name = false;
} else if (localName.equals("Last_name")) {
// Nothing to do here
}
}
/**
* Gets called on the following structure: <tag>characters</tag>
*/
@Override
public void characters(char ch[], int start, int length) {
if (this.in_First_name) {
myParsedDataSet.setfirstname(new String(ch, start, length));
}
if (this.in_Last_name) {
myParsedDataSet.setlastname(new String(ch, start, length));
}
}
}
Next is my parsed data set class:
public class ParsedDataSet {
private String First_name = null;
private String Last_name = null;
public String getFirstname() {
return First_name;
}
public void setfirstname(String First_name) {
this.First_name = First_name;
}
public String getlastname() {
return Last_name;
}
public void setlastname(String Last_name) {
this.Last_name = Last_name;
}
public String toString() {
return this.First_name + "\n" + this.Last_name;
}
}
Please tell me where I am going wrong.
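SAX reports events in document order: startElement, then characters, then endElement. In your handler, in_Last_name is never reset in endElement, so any later characters call (for example, the whitespace between elements) can overwrite a value that was already stored. A simple fix is to clear each flag in characters as soon as its text has been read, something like this: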
@Override
public void endElement(String namespaceURI, String localName, String qName) throws SAXException {
}
@Override
public void characters(char ch[], int start, int length) {
if (this.in_First_name) {
this.in_First_name = false;
myParsedDataSet.setfirstname(new String(ch, start, length));
}
if (this.in_Last_name) {
this.in_Last_name = false;
myParsedDataSet.setlastname(new String(ch, start, length));
}
}
You should also take a look at "Working with XML on Android" for a complete explanation.

Reading multiple xml documents from a socket in java

I'm writing a client which needs to read multiple consecutive small XML documents over a socket. I can assume that the encoding is always UTF-8 and that there is optionally delimiting whitespace between documents. The documents should ultimately go into DOM objects. What is the best way to accomplish this?
The essence of the problem is that the parsers expect a single document in the stream and consider the rest of the content junk. I thought that I could artificially end the document by tracking the element depth and creating a new reader on the existing input stream, e.g. something like:
// Broken
public void parseInputStream(InputStream inputStream) throws Exception
{
XMLInputFactory factory = XMLInputFactory.newInstance();
XMLOutputFactory xof = XMLOutputFactory.newInstance();
XMLEventFactory eventFactory = XMLEventFactory.newInstance();
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document doc = documentBuilder.newDocument();
XMLEventWriter domWriter = xof.createXMLEventWriter(new DOMResult(doc));
XMLStreamReader xmlStreamReader = factory.createXMLStreamReader(inputStream);
XMLEventReader reader = factory.createXMLEventReader(xmlStreamReader);
int depth = 0;
while (reader.hasNext()) {
XMLEvent evt = reader.nextEvent();
domWriter.add(evt);
switch (evt.getEventType()) {
case XMLEvent.START_ELEMENT:
depth++;
break;
case XMLEvent.END_ELEMENT:
depth--;
if (depth == 0)
{
domWriter.add(eventFactory.createEndDocument());
System.out.println(doc);
reader.close();
xmlStreamReader.close();
xmlStreamReader = factory.createXMLStreamReader(inputStream);
reader = factory.createXMLEventReader(xmlStreamReader);
doc = documentBuilder.newDocument();
domWriter = xof.createXMLEventWriter(new DOMResult(doc));
domWriter.add(eventFactory.createStartDocument());
}
break;
}
}
}
However, running this on input such as <a></a><b></b><c></c> prints the first document and then throws an XMLStreamException. What's the right way to do this?
Clarification: Unfortunately the protocol is fixed by the server and cannot be changed, so prepending a length or wrapping the contents would not work.
Length-prefix each document (in bytes):
1. Read the length of the first document from the socket.
2. Read that much data from the socket, dumping it into a ByteArrayOutputStream.
3. Create a ByteArrayInputStream from the results.
4. Parse that ByteArrayInputStream to get the first document.
5. Repeat for the second document, and so on.
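The steps above can be sketched as follows, assuming the sender is able to write a 4-byte big-endian length before each document. That prefix format is an assumption, since the answer does not specify one, and the class name LengthPrefixedReader is hypothetical. Reading straight into a byte array with readFully stands in for the ByteArrayOutputStream step.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical reader for a stream of [int length][XML bytes] records.
final class LengthPrefixedReader {
    static List<Document> readAll(InputStream in) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        DataInputStream data = new DataInputStream(in);
        List<Document> docs = new ArrayList<>();
        while (true) {
            int length;
            try {
                length = data.readInt(); // assumed 4-byte big-endian prefix
            } catch (EOFException endOfStream) {
                return docs; // no further documents
            }
            byte[] body = new byte[length];
            data.readFully(body); // blocks until the whole document has arrived
            // Each document is parsed from its own bounded stream.
            docs.add(builder.parse(new ByteArrayInputStream(body)));
        }
    }
}
```

Because every parse sees exactly one document, the parser never reads ahead into the next one.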
IIRC, XML documents can have comments and processing-instructions at the end, so there's no real way of telling exactly when you have come to the end of the file.
A couple of ways of handling the situation have already been mentioned. Another alternative is to put in an illegal character or byte into the stream, such as NUL or zero. This has the advantage that you don't need to alter the documents and you never need to buffer an entire file.
Just change the input to whatever stream you have:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
public class LogParser {
private XMLInputFactory inputFactory = null;
private XMLStreamReader xmlReader = null;
InputStream is;
private int depth;
private QName rootElement;
private static class XMLStream extends InputStream
{
InputStream delegate;
StringReader startroot = new StringReader("<root>");
StringReader endroot = new StringReader("</root>");
XMLStream(InputStream delegate)
{
this.delegate = delegate;
}
public int read() throws IOException {
int c = startroot.read();
if(c==-1)
{
c = delegate.read();
}
if(c==-1)
{
c = endroot.read();
}
return c;
}
}
public LogParser() {
inputFactory = XMLInputFactory.newInstance();
}
public void read() throws Exception {
is = new XMLStream(new FileInputStream(new File(
"./myfile.log")));
xmlReader = inputFactory.createXMLStreamReader(is);
while (xmlReader.hasNext()) {
printEvent(xmlReader);
xmlReader.next();
}
xmlReader.close();
}
public void printEvent(XMLStreamReader xmlr) throws Exception {
switch (xmlr.getEventType()) {
case XMLStreamConstants.END_DOCUMENT:
System.out.println("finished");
break;
case XMLStreamConstants.START_ELEMENT:
System.out.print("<");
printName(xmlr);
printNamespaces(xmlr);
printAttributes(xmlr);
System.out.print(">");
if(rootElement==null && depth==1)
{
rootElement = xmlr.getName();
}
depth++;
break;
case XMLStreamConstants.END_ELEMENT:
System.out.print("</");
printName(xmlr);
System.out.print(">");
depth--;
if(depth==1 && rootElement.equals(xmlr.getName()))
{
rootElement=null;
System.out.println("finished element");
}
break;
case XMLStreamConstants.SPACE:
case XMLStreamConstants.CHARACTERS:
int start = xmlr.getTextStart();
int length = xmlr.getTextLength();
System.out
.print(new String(xmlr.getTextCharacters(), start, length));
break;
case XMLStreamConstants.PROCESSING_INSTRUCTION:
System.out.print("<?");
if (xmlr.hasText())
System.out.print(xmlr.getText());
System.out.print("?>");
break;
case XMLStreamConstants.CDATA:
System.out.print("<![CDATA[");
start = xmlr.getTextStart();
length = xmlr.getTextLength();
System.out
.print(new String(xmlr.getTextCharacters(), start, length));
System.out.print("]]>");
break;
case XMLStreamConstants.COMMENT:
System.out.print("<!--");
if (xmlr.hasText())
System.out.print(xmlr.getText());
System.out.print("-->");
break;
case XMLStreamConstants.ENTITY_REFERENCE:
System.out.print(xmlr.getLocalName() + "=");
if (xmlr.hasText())
System.out.print("[" + xmlr.getText() + "]");
break;
case XMLStreamConstants.START_DOCUMENT:
System.out.print("<?xml");
System.out.print(" version='" + xmlr.getVersion() + "'");
System.out.print(" encoding='" + xmlr.getCharacterEncodingScheme()
+ "'");
if (xmlr.isStandalone())
System.out.print(" standalone='yes'");
else
System.out.print(" standalone='no'");
System.out.print("?>");
break;
}
}
/**
* @param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
try {
new LogParser().read();
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private static void printName(XMLStreamReader xmlr) {
if (xmlr.hasName()) {
System.out.print(getName(xmlr));
}
}
private static String getName(XMLStreamReader xmlr) {
if (xmlr.hasName()) {
String prefix = xmlr.getPrefix();
String uri = xmlr.getNamespaceURI();
String localName = xmlr.getLocalName();
return getName(prefix, uri, localName);
}
return null;
}
private static String getName(String prefix, String uri, String localName) {
String name = "";
if (uri != null && !("".equals(uri)))
name += "['" + uri + "']:";
if (prefix != null)
name += prefix + ":";
if (localName != null)
name += localName;
return name;
}
private static void printAttributes(XMLStreamReader xmlr) {
for (int i = 0; i < xmlr.getAttributeCount(); i++) {
printAttribute(xmlr, i);
}
}
private static void printAttribute(XMLStreamReader xmlr, int index) {
String prefix = xmlr.getAttributePrefix(index);
String namespace = xmlr.getAttributeNamespace(index);
String localName = xmlr.getAttributeLocalName(index);
String value = xmlr.getAttributeValue(index);
System.out.print(" ");
System.out.print(getName(prefix, namespace, localName));
System.out.print("='" + value + "'");
}
private static void printNamespaces(XMLStreamReader xmlr) {
for (int i = 0; i < xmlr.getNamespaceCount(); i++) {
printNamespace(xmlr, i);
}
}
private static void printNamespace(XMLStreamReader xmlr, int index) {
String prefix = xmlr.getNamespacePrefix(index);
String uri = xmlr.getNamespaceURI(index);
System.out.print(" ");
if (prefix == null)
System.out.print("xmlns='" + uri + "'");
else
System.out.print("xmlns:" + prefix + "='" + uri + "'");
}
}
A simple solution is to wrap the documents on the sending side in a new root element:
<?xml version="1.0"?>
<documents>
... document 1 ...
... document 2 ...
</documents>
You must make sure that you don't include the XML header (<?xml ...?>), though. If all documents use the same encoding, this can be accomplished with a simple filter which just ignores the first line of each document if it starts with <?xml
Found this forum message (which you probably already saw), which has a solution that wraps the input stream and tests for one of two ASCII characters (see post).
You could try an adaptation on this by first converting to use a reader (for proper character encoding) and then doing element counting until you reach the closing element, at which point you trigger the EOM.
Hi
I also had this problem at work (so I won't post the resulting code). The most elegant solution that I could think of, and which works pretty nicely in my opinion, is as follows:
Create a class for example DocumentSplittingInputStream which extends InputStream and takes the underlying inputstream in its constructor (or gets set after construction...).
Add a field with a byte array closeTag containing the bytes of the closing root node you are looking for.
Add a field int called matchCount or something, initialised to zero.
Add a field boolean called underlyingInputStreamNotFinished, initialised to true
On the read() implementation:
Check if matchCount == closeTag.length, if it does, set matchCount to -1, return -1
If matchCount == -1, set matchCount = 0, call read() on the underlying inputstream until you get -1 or '<' (the xml declaration of the next document on the stream) and return it. Note that for all I know the xml spec allows comments after the document element, but I knew I was not going to get that from the source so did not bother handling it - if you can not be sure you'll need to change the "gobble" slightly.
Otherwise read an int from the underlying inputstream (if it equals closeTag[matchCount] then increment matchCount, if it doesn't then reset matchCount to zero) and return the newly read byte
Add a method which returns the boolean on whether the underlying stream has closed.
All reads on the underlying input stream should go through a separate method where it checks if the value read is -1 and if so, sets the field "underlyingInputStreamNotFinished" to false.
I may have missed some minor points, but I'm sure you get the picture.
Then in the using code you do something like, if you are using xstream:
DocumentSplittingInputStream dsis = new DocumentSplittingInputStream(underlyingInputStream);
while (dsis.underlyingInputStreamNotFinished()) {
MyObject mo = xstream.fromXML(dsis);
mo.doSomething(); // or something.doSomething(mo);
}
David
I had to do something like this, and during my research on how to approach it I found this thread. Even though it is quite old, I just replied (to myself) here, wrapping everything in its own Reader for simpler use.
I was faced with a similar problem. A web service I'm consuming will (in some cases) return multiple xml documents in response to a single HTTP GET request. I could read the entire response into a String and split it, but instead I implemented a splitting input stream based on user467257's post above. Here is the code:
public class AnotherSplittingInputStream extends InputStream {
private final InputStream realStream;
private final byte[] closeTag;
private int matchCount;
private boolean realStreamFinished;
private boolean reachedCloseTag;
public AnotherSplittingInputStream(InputStream realStream, String closeTag) {
this.realStream = realStream;
this.closeTag = closeTag.getBytes();
}
@Override
public int read() throws IOException {
if (reachedCloseTag) {
return -1;
}
if (matchCount == closeTag.length) {
matchCount = 0;
reachedCloseTag = true;
return -1;
}
int ch = realStream.read();
if (ch == -1) {
realStreamFinished = true;
}
else if (ch == closeTag[matchCount]) {
matchCount++;
} else {
matchCount = 0;
}
return ch;
}
public boolean hasMoreData() {
if (realStreamFinished == true) {
return false;
} else {
reachedCloseTag = false;
return true;
}
}
}
And to use it:
String xml =
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<root>first root</root>" +
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
"<root>second root</root>";
ByteArrayInputStream is = new ByteArrayInputStream(xml.getBytes());
AnotherSplittingInputStream splitter = new AnotherSplittingInputStream(is, "</root>");
BufferedReader reader = new BufferedReader(new InputStreamReader(splitter));
while (splitter.hasMoreData()) {
System.out.println("Starting next stream");
String line = null;
while ((line = reader.readLine()) != null) {
System.out.println("line ["+line+"]");
}
}
I use a JAXB approach to unmarshal multiple messages from a single stream:
MultiInputStream.java
public class MultiInputStream extends InputStream {
private final Reader source;
private final StringReader startRoot = new StringReader("<root>");
private final StringReader endRoot = new StringReader("</root>");
public MultiInputStream(Reader source) {
this.source = source;
}
@Override
public int read() throws IOException {
int count = startRoot.read();
if (count == -1) {
count = source.read();
}
if (count == -1) {
count = endRoot.read();
}
return count;
}
}
MultiEventReader.java
public class MultiEventReader implements XMLEventReader {
private final XMLEventReader reader;
private boolean isXMLEvent = false;
private int level = 0;
public MultiEventReader(XMLEventReader reader) throws XMLStreamException {
this.reader = reader;
startXML();
}
private void startXML() throws XMLStreamException {
while (reader.hasNext()) {
XMLEvent event = reader.nextEvent();
if (event.isStartElement()) {
return;
}
}
}
public boolean hasNextXML() {
return reader.hasNext();
}
public void nextXML() throws XMLStreamException {
while (reader.hasNext()) {
XMLEvent event = reader.peek();
if (event.isStartElement()) {
isXMLEvent = true;
return;
}
reader.nextEvent();
}
}
@Override
public XMLEvent nextEvent() throws XMLStreamException {
XMLEvent event = reader.nextEvent();
if (event.isStartElement()) {
level++;
}
if (event.isEndElement()) {
level--;
if (level == 0) {
isXMLEvent = false;
}
}
return event;
}
@Override
public boolean hasNext() {
return isXMLEvent;
}
@Override
public XMLEvent peek() throws XMLStreamException {
XMLEvent event = reader.peek();
if (level == 0) {
while (event != null && !event.isStartElement() && reader.hasNext()) {
reader.nextEvent();
event = reader.peek();
}
}
return event;
}
@Override
public String getElementText() throws XMLStreamException {
throw new NotImplementedException();
}
@Override
public XMLEvent nextTag() throws XMLStreamException {
throw new NotImplementedException();
}
@Override
public Object getProperty(String name) throws IllegalArgumentException {
throw new NotImplementedException();
}
@Override
public void close() throws XMLStreamException {
throw new NotImplementedException();
}
@Override
public Object next() {
throw new NotImplementedException();
}
@Override
public void remove() {
throw new NotImplementedException();
}
}
Message.java
@XmlAccessorType(XmlAccessType.FIELD)
@XmlRootElement(name = "Message")
public class Message {
public Message() {
}
@XmlAttribute(name = "ID", required = true)
protected long id;
public long getId() {
return id;
}
public void setId(long id) {
this.id = id;
}
@Override
public String toString() {
return "Message{id=" + id + '}';
}
}
Read multiple messages:
public static void main(String[] args) throws Exception{
StringReader stringReader = new StringReader(
"<Message ID=\"123\" />\n" +
"<Message ID=\"321\" />"
);
JAXBContext context = JAXBContext.newInstance(Message.class);
Unmarshaller unmarshaller = context.createUnmarshaller();
XMLInputFactory inputFactory = XMLInputFactory.newFactory();
MultiInputStream multiInputStream = new MultiInputStream(stringReader);
XMLEventReader xmlEventReader = inputFactory.createXMLEventReader(multiInputStream);
MultiEventReader multiEventReader = new MultiEventReader(xmlEventReader);
while (multiEventReader.hasNextXML()) {
Object message = unmarshaller.unmarshal(multiEventReader);
System.out.println(message);
multiEventReader.nextXML();
}
}
Results:
Message{id=123}
Message{id=321}
