How to fix issue with StAX parsing in java using xml? - java

I am trying to parse the value displayName from an xml file I have. I am using the StAX way to do the parsing, however when I go to parse the code it grabs everything but the actual value I want. I honestly have no clue why because I strictly mention to grab the name that .equalsIgnoreCase(). I don't receive any errors, the only problem is I can't grab the value that I need.
Java:
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.util.Iterator;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.*;
public class Driver {
private static boolean bname;
public static void main(String[] args) throws FileNotFoundException, XMLStreamException {
File file = new File("C:\\Users\\Robert\\Desktop\\root\\SDKCode\\src\\main\\java\\com\\example\\xmlClass\\data.xml");
parser(file);
}
public static void parser(File file) throws FileNotFoundException, XMLStreamException {
bname = false;
XMLInputFactory factory = XMLInputFactory.newInstance();
XMLEventReader eventReader = factory.createXMLEventReader(new FileReader(file));
while (eventReader.hasNext()) {
XMLEvent event = eventReader.nextEvent();
// This will trigger when the tag is of type <...>
if (event.isStartElement()) {
StartElement element = (StartElement) event;
Iterator<Attribute> iterator = element.getAttributes();
while (iterator.hasNext()) {
Attribute attribute = iterator.next();
QName name = attribute.getName();
String value = attribute.getValue();
System.out.println(name + " = " + value);
}
if (element.getName().toString().equalsIgnoreCase("displayName")) {
bname = true;
}
}
if (event.isEndElement()) {
EndElement element = (EndElement) event;
if (element.getName().toString().equalsIgnoreCase("displayName")) {
bname = false;
}
}
if (event.isCharacters()) {
// Depending upon the tag opened the data is retrieved .
Characters element = (Characters) event;
if (bname) {
System.out.println(element.getData());
}
}
}
}
}
XML:
<?xml version="1.0" encoding="UTF-8"?>
<results
xmlns="urn:www-collation-com:1.0"
xmlns:coll="urn:www-collation-com:1.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:www-collation-com:1.0
urn:www-collation-com:1.0/results.xsd">
<WebServiceImpl array="1"
guid="FFVVRJ5618KJRHNFUIRV845NRUVHR" xsi:type="coll:com.model.topology.app.web.WebService">
<isPlaceholder>false</isPlaceholder>
<displayName>server.servername.siqom.siqom.us.com</displayName>
<hierarchyType>WebService</hierarchyType>
<hierarchyDomain>app.web</hierarchyDomain>
</WebServiceImpl>
</results>

Related

How to create an area?

import java.io.File;
import java.io.IOException;
import java.util.List;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;
public class ReadXMLFile {
public static void main(String[] args) {
SAXBuilder builder = new SAXBuilder();
File xmlFile = new File("c:\\test.xml");
try {
Document document = (Document) builder.build(xmlFile);
Element rootNode = document.getRootElement();
List list = rootNode.getChildren("raum");
for (int i = 0; i < list.size(); i++) {
Element node = (Element) list.get(i);
System.out.println("ID : " + node.getChildText("ID"));
}
} catch (IOException io) {
System.out.println(io.getMessage());
} catch (JDOMException jdomex) {
System.out.println(jdomex.getMessage());
}
}
}
I don't understand how the step in between has to look like in order to insert the imported coordinates into the polygon.. Maybe someone can help me with this?
You can follow any sample JDOM parser example and do it.
For example, this explains how to read the xml and take the data in a list and iterate over it. Just follow the steps and understand what you are doing, you can easily get it done.
For the sake of completeness?
This is how to achieve it using SAX parser.
Note that it is not clear to me, from your question, which Polygon you are referring to. I presume it is a java class. It can't be java.awt.Polygon because its points are all int whereas your sample XML file contains only double values. The only other class I thought of was javafx.scene.shape.Polygon that contains an array of points where each point is a double. Hence in the below code, I create an instance of javafx.scene.shape.Polygon.
For the situation you describe in your question, I don't see the point (no pun intended) in loading the entire DOM tree into memory. You simply need to create a point every time you encounter a x and a y coordinate in the XML file and add those coordinates to a collection of points.
Here is the code. Note that I created an XML file named polygon0.xml that contains the entire XML from your question. Also note that you can extend class org.xml.sax.helpers.DefaultHandler rather than implement interface ContentHandler.
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import javafx.scene.shape.Polygon;
public class Polygons implements ContentHandler {
private boolean isX;
private boolean isY;
private Polygon polygon;
/* Start 'ContentHandler' interface methods. */
#Override // org.xml.sax.ContentHandler
public void setDocumentLocator(Locator locator) {
// Do nothing.
}
#Override // org.xml.sax.ContentHandler
public void startDocument() throws SAXException {
polygon = new Polygon();
}
#Override // org.xml.sax.ContentHandler
public void endDocument() throws SAXException {
// Do nothing.
}
#Override // org.xml.sax.ContentHandler
public void startPrefixMapping(String prefix, String uri) throws SAXException {
// Do nothing.
}
#Override // org.xml.sax.ContentHandler
public void endPrefixMapping(String prefix) throws SAXException {
// Do nothing.
}
#Override // org.xml.sax.ContentHandler
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
isX = "x".equals(qName);
isY = "y".equals(qName);
}
#Override // org.xml.sax.ContentHandler
public void endElement(String uri, String localName, String qName) throws SAXException {
if (isX) {
isX = false;
}
if (isY) {
isY = false;
}
}
#Override // org.xml.sax.ContentHandler
public void characters(char[] ch, int start, int length) throws SAXException {
if (isX || isY) {
StringBuilder sb = new StringBuilder(length);
int end = start + length;
for (int i = start; i < end; i++) {
sb.append(ch[i]);
}
polygon.getPoints().add(Double.parseDouble(sb.toString()));
}
}
#Override // org.xml.sax.ContentHandler
public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
// Do nothing.
}
#Override // org.xml.sax.ContentHandler
public void processingInstruction(String target, String data) throws SAXException {
// Do nothing.
}
#Override // org.xml.sax.ContentHandler
public void skippedEntity(String name) throws SAXException {
// Do nothing.
}
/* End 'ContentHandler' interface methods. */
public static void main(String[] args) {
Polygons instance = new Polygons();
Path path = Paths.get("polygon0.xml");
SAXParserFactory spf = SAXParserFactory.newInstance();
try (FileReader reader = new FileReader(path.toFile())) { // throws java.io.IOException
SAXParser saxParser = spf.newSAXParser(); // throws javax.xml.parsers.ParserConfigurationException , org.xml.sax.SAXException
XMLReader xmlReader = saxParser.getXMLReader(); // throws org.xml.sax.SAXException
xmlReader.setContentHandler(instance);
InputSource input = new InputSource(reader);
xmlReader.parse(input);
System.out.println(instance.polygon);
}
catch (IOException |
ParserConfigurationException |
SAXException x) {
x.printStackTrace();
}
}
}
Here is the output from running the above code:
Polygon[points=[400.3, 997.2, 400.3, 833.1, 509.9, 833.1, 509.9, 700.0, 242.2, 700.0, 242.2, 600.1, 111.1, 600.1, 111.1, 300.0, 300.0, 300.0, 300.0, 420.0, 600.5, 420.0, 600.5, 101.9, 717.8, 101.9, 717.8, 200.0, 876.5, 200.0, 876.5, 500.8, 1012.1, 500.8, 1012.1, 900.2, 902.0, 900.2, 902.0, 997.2], fill=0x000000ff]
EDIT
As requested, by OP, here is an implementation using JDOM (version 2.0.6)
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.filter.ElementFilter;
import org.jdom2.input.SAXBuilder;
import org.jdom2.util.IteratorIterable;
import javafx.scene.shape.Polygon;
public class Polygon2 {
public static void main(String[] args) {
Polygon polygon = new Polygon();
Path path = Paths.get("polygon0.xml");
SAXBuilder builder = new SAXBuilder();
try {
Document jdomDoc = builder.build(path.toFile()); // throws java.io.IOException , org.jdom2.JDOMException
Element root = jdomDoc.getRootElement();
IteratorIterable<Element> iter = root.getDescendants(new ElementFilter("edge"));
while (iter.hasNext()) {
Element elem = iter.next();
Element childX = elem.getChild("x");
polygon.getPoints().add(Double.parseDouble(childX.getText()));
Element childY = elem.getChild("y");
polygon.getPoints().add(Double.parseDouble(childY.getText()));
}
}
catch (IOException | JDOMException x) {
x.printStackTrace();
}
System.out.println(polygon);
}
}
You can read XML files DOM parser library check this article.
I assume you are working on a Desktop application so you might want to use FileChooser for file selection. Here is an example of this.
Also, I think you would need to make some structural changes (for convinience) to your XML file so that it would have something like this:
<xpoints>
<x>5<x/>
...
</xpoints>
<ypoints>
<y>5<y/>
...
</ypoints>
But for existing structure doing something like this would be enogh:
File file = new File("file");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(file);
doc.getDocumentElement().normalize();
NodeList nodeList = doc.getElementsByTagName("edge");
// you can iterate over all edges
for (int itr = 0; itr < nodeList.getLength(); itr++)
{
Node node = nodeList.item(itr);
if (node.getNodeType() == Node.ELEMENT_NODE)
{
Element eElement = (Element) node;
//then you can access values, for example, to pass them to an array
array.add(eElement.getElementsByTagName("x").item(0).getTextContent()));
}
}

Getting Parent Child Hierarchy in Sax XML parser

I'm using SAX (Simple API for XML) to parse an XML document. I'm getting output for all the tags the file have, but i want it to show the tags in parent child hierarchy.
For Example:
This is my output
<dblp>
<www>
<author>
</author><title>
</title><url>
</url><year>
</year></www><inproceedings>
<month>
</month><pages>
</pages><booktitle>
</booktitle><note>
</note><cdrom>
</cdrom></inproceedings><article>
<journal>
</journal><volume>
</volume></article><ee>
</ee><book>
<publisher>
</publisher><isbn>
</isbn></book><incollection>
<crossref>
</crossref></incollection><editor>
</editor><series>
</series></dblp>
But i want it to display the output like this (it displays the children with extra spacing (that's how i want it to be))
<dblp>
<www>
<author>
</author>
<title>
</title>
<url>
</url>
<year>
</year>
</www>
<inproceedings>
<month>
</month>
<pages>
</pages>
<booktitle>
</booktitle>
<note>
</note>
<cdrom>
</cdrom>
</inproceedings>
<article>
<journal>
</journal>
<volume>
</volume>
</article>
<ee>
</ee>
<book>
<publisher>
</publisher>
<isbn>
</isbn>
</book>
<incollection>
<crossref>
</crossref>
</incollection>
<editor>
</editor>
<series>
</series>
</dblp>
But i can't figure out how can i detect that parser is parsing a parent tag or a children.
here is my code:
package com.teamincredibles.sax;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class Parser extends DefaultHandler {
public void getXml() {
try {
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
SAXParser saxParser = saxParserFactory.newSAXParser();
final MySet openingTagList = new MySet();
final MySet closingTagList = new MySet();
DefaultHandler defaultHandler = new DefaultHandler() {
public void startDocument() throws SAXException {
System.out.println("Starting Parsing...\n");
}
public void endDocument() throws SAXException {
System.out.print("\n\nDone Parsing!");
}
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
if (!openingTagList.contains(qName)) {
openingTagList.add(qName);
System.out.print("<" + qName + ">\n");
}
}
public void characters(char ch[], int start, int length)
throws SAXException {
/*for(int i=start; i<(start+length);i++){
System.out.print(ch[i]);
}*/
}
public void endElement(String uri, String localName, String qName)
throws SAXException {
if (!closingTagList.contains(qName)) {
closingTagList.add(qName);
System.out.print("</" + qName + ">");
}
}
};
saxParser.parse("xml/sample.xml", defaultHandler);
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String args[]) {
Parser readXml = new Parser();
readXml.getXml();
}
}
You can consider a StAX implementation:
package be.duo.stax;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
public class StaxExample {
public void getXml() {
InputStream is = null;
try {
is = new FileInputStream("c:\\dev\\sample.xml");
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
XMLStreamReader reader = inputFactory.createXMLStreamReader(is);
parse(reader, 0);
} catch(Exception ex) {
System.out.println(ex.getMessage());
} finally {
if(is != null) {
try {
is.close();
} catch(IOException ioe) {
System.out.println(ioe.getMessage());
}
}
}
}
private void parse(XMLStreamReader reader, int depth) throws XMLStreamException {
while(true) {
if(reader.hasNext()) {
switch(reader.next()) {
case XMLStreamConstants.START_ELEMENT:
writeBeginTag(reader.getLocalName(), depth);
parse(reader, depth+1);
break;
case XMLStreamConstants.END_ELEMENT:
writeEndTag(reader.getLocalName(), depth-1);
return;
}
}
}
}
private void writeBeginTag(String tag, int depth) {
for(int i = 0; i < depth; i++) {
System.out.print(" ");
}
System.out.println("<" + tag + ">");
}
private void writeEndTag(String tag, int depth) {
for(int i = 0; i < depth; i++) {
System.out.print(" ");
}
System.out.println("</" + tag + ">");
}
public static void main(String[] args) {
StaxExample app = new StaxExample();
app.getXml();
}
}
There is an idiom for StAX with a loop like this for every tag in the XML:
private MyTagObject parseMyTag(XMLStreamReader reader, String myTag) throws XMLStreamException {
MyTagObject myTagObject = new MyTagObject();
while (true) {
switch (reader.next()) {
case XMLStreamConstants.START_ELEMENT:
String localName = reader.getLocalName();
if(localName.equals("myOtherTag1")) {
myTagObject.setMyOtherTag1(parseMyOtherTag1(reader, localName));
} else if(localName.equals("myOtherTag2")) {
myTagObject.setMyOtherTag2(parseMyOtherTag2(reader, localName));
}
// and so on
break;
case XMLStreamConstants.END_ELEMENT:
if(reader.getLocalName().equals(myTag) {
return myTagObject;
}
break;
}
}
well what have you tried? you should use a transformer found here: How to pretty print XML from Java?
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
//initialize StreamResult with File object to save to file
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);
Almost any useful SAX application needs to maintain a stack. When startElement is called, you push information to the stack, when endElement is called, you pop the stack. Exactly what you put on the stack depends on the application; it's often the element name. For your application, you don't actually need a full stack, you only need to know its depth. You could get by with maintaining this using depth++ in startElement and depth-- in endElement(). Then you just output depth spaces before the element name.

Parsing issue in Java StAX XML

I've been searching for a solution to this problem and found nothing. I'm creating a simple chat over Java TCP/IP sockets as an assignment and I've encountered a problem creating an XML string containing the letter åäö.
Lets say I pass an XMLObject with the fields
name="Server", color="#FF0000" text="åäö"
the generated xmlstring that is returned by the parser will be
<?xml version="1.0"?><message sender="Server"><text color="#FF0000”>???</text></message>
åäö turns into ???. Any idea? Thanks!
** UPDATE**
Ok, so I just realised the problem arises when getting some text from a JTextArea.
If I run this code:
package self.edu.javaprojekt.view;
import self.edu.javaprojekt.model.XMLObject;
import self.edu.javaprojekt.model.XMLParser;
public class Test {
/**
* #param args
*/
public static void main(String[] args) {
XMLObject object = new XMLObject();
object.setColor("#FF0000");
object.setSender("Server");
object.setText("åäö");
String parsed = XMLParser.parse(object);
System.out.println(parsed);
System.out.println(XMLParser.parse(parsed));
}
}
I get the proper result:
<?xml version="1.0" encoding="UTF-8"?><message sender="Server"><text color="#FF0000">åäö</text></message>
XMLObject [sender=Server, text=åäö, color=#FF0000]
But with a text area. The text that is returned in the following method is added to an XMLObject and then sent for parsing.
public String readAndClearMessageArea() {
String text = messageTextArea.getText();
appendToPane("\n" + controller.getName() + ": " + text, controller.getColor());
messageTextArea.setText("");
return text;
}
I get the weird result. Any new ideas? Thanks!!
This is the parser
package self.edu.javaprojekt.model;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.Iterator;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartDocument;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;
public class XMLParser {
public static final String MESSAGE_TAG = "message";
public static final String SENDER_TAG = "sender";
public static final String TEXT_TAG = "text";
public static final String COLOR_TAG = "color";
public static final String DISCONNECT_TAG = "disconnect";
public static final String ENCRYPTED_TAG = "encrypted";
public static final String ENCRYPTED_TYPE_TAG = "type";
public static final String KEY_REQUEST_TAG = "keyrequest";
public static String parse(XMLObject object) {
// create an XMLOutputFactory
XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
// create XMLEventWriter
XMLEventWriter eventWriter;
ByteArrayOutputStream out = new ByteArrayOutputStream();
String xmlString = "";
try {
eventWriter = outputFactory.createXMLEventWriter(out);
// create an EventFactory
XMLEventFactory eventFactory = XMLEventFactory.newInstance();
//XMLEvent end = eventFactory.createDTD("\n");
//XMLEvent tab = eventFactory.createDTD("\t");
// create and write Start Tag
StartDocument startDocument = eventFactory.createStartDocument("UTF-8");
eventWriter.add(startDocument);
// create message open tag
//eventWriter.add(end);
StartElement messageStartElement = eventFactory.createStartElement(
"", "", MESSAGE_TAG);
eventWriter.add(messageStartElement);
eventWriter.add(eventFactory.createAttribute(SENDER_TAG,
object.getSender()));
//eventWriter.add(end);
// create the text
//eventWriter.add(tab);
StartElement textStartElement = eventFactory.createStartElement("",
"", TEXT_TAG);
eventWriter.add(textStartElement);
eventWriter.add(eventFactory.createAttribute(COLOR_TAG,
object.getColor()));
Characters characters = eventFactory.createCharacters(object.getText());
eventWriter.add(characters);
//eventWriter.add(tab);
eventWriter.add(eventFactory.createEndElement("", "", TEXT_TAG));
//eventWriter.add(end);
eventWriter.add(eventFactory.createEndElement("", "", MESSAGE_TAG));
//eventWriter.add(end);
eventWriter.add(eventFactory.createEndDocument());
eventWriter.close();
xmlString = out.toString("UTF-8");
} catch (XMLStreamException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (UnsupportedEncodingException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return xmlString;
}
#SuppressWarnings("unchecked")
public static XMLObject parse(String string) {
XMLObject object = null;
try {
// First, create a new XMLInputFactory
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
// Setup a new eventReader
InputStream in;
if (!string.equals("")) {
in = new ByteArrayInputStream(Charset.forName("UTF-8")
.encode(string).array());
} else {
in = new FileInputStream("xmltest.txt");
}
XMLEventReader eventReader = inputFactory.createXMLEventReader(in);
// read the XML document
while (eventReader.hasNext()) {
XMLEvent event = eventReader.nextEvent();
if (event.isStartElement()) {
StartElement startElement = event.asStartElement();
// If we have a message
if (startElement.getName().getLocalPart()
.equals(MESSAGE_TAG)) {
object = new XMLObject();
// We read the attributes from this tag and add the
// sender
// attribute to our object
Iterator<Attribute> attributes = startElement
.getAttributes();
while (attributes.hasNext()) {
Attribute attribute = attributes.next();
if (attribute.getName().getLocalPart()
.equals(SENDER_TAG)) {
object.setSender(attribute.getValue());
}
}
}
if (startElement.getName().getLocalPart().equals(TEXT_TAG)) {
// We read the attributes from this tag and add the
// color
// attribute to our object
Iterator<Attribute> attributes = startElement
.getAttributes();
while (attributes.hasNext()) {
Attribute attribute = attributes.next();
if (attribute.getName().getLocalPart()
.equals(COLOR_TAG)) {
object.setColor(attribute.getValue());
}
}
// And then get the text
event = eventReader.nextEvent();
object.setText(event.asCharacters().getData());
continue;
}
}
// If we reach the end of the message, return the object
if (event.isEndElement()) {
EndElement endElement = event.asEndElement();
if (endElement.getName().getLocalPart().equals(MESSAGE_TAG)) {
return object;
}
}
}
} catch (XMLStreamException | FileNotFoundException e) {
e.printStackTrace();
}
return null;
}
}
This is the class holding the xml values
package self.edu.javaprojekt.model;
public class XMLObject {
private String sender, text, color;
public String getSender() {
return sender;
}
public String getText() {
return text;
}
public String getColor() {
return color;
}
public void setSender(String sender) {
this.sender = sender;
}
public void setText(String text) {
this.text = text;
}
public void setColor(String color) {
this.color = color;
}
#Override
public String toString() {
return "XMLObject [sender=" + sender + ", text=" + text + ", color="
+ color + "]";
}
}
I have seen this before, you can use the different encoding to see which one work for you. But try ISO-8859-1 or UTF-16. Not sure which encoding you need.

How to read a big XML file in JAVA and Spilt it into small XML files Based on Tag?

I am new to JAVA programming, now I in need of JAVA program to read a big XML file that containing .. tags. Sample input as follows.
Input.xml
<row>
<Name>Filename1</Name>
</row>
<row>
<Name>Filename2</Name>
</row>
<row>
<Name>Filename3</Name>
</row>
<row>
<Name>Filename4</Name>
</row>
<row>
<Name>Filename5</Name>
</row>
<row>
<Name>Filename6</Name>
</row>
.
.
I need output as first <row> </row> as a single .xml file with filename as filename1.xml
and second <row>..</row> as filename2.xml and so.
Can anyone tell the steps how to do it in simple way with Java, it will be very useful if you give any sample codes ?
You could do the following with StAX because you said your xml is large
Code for Your Use Case
The following code uses StAX APIs to break up the document as outlined in your question:
import java.io.*;
import java.util.*;
import javax.xml.namespace.QName;
import javax.xml.stream.*;
import javax.xml.stream.events.*;
public class Demo {
public static void main(String[] args) throws Exception {
Demo demo = new Demo();
demo.split("src/forum7408938/input.xml", "nickname");
//demo.split("src/forum7408938/input.xml", null);
}
private void split(String xmlResource, String condition) throws Exception {
XMLEventFactory xef = XMLEventFactory.newFactory();
XMLInputFactory xif = XMLInputFactory.newInstance();
XMLEventReader xer = xif.createXMLEventReader(new FileReader(xmlResource));
StartElement rootStartElement = xer.nextTag().asStartElement(); // Advance to statements element
StartDocument startDocument = xef.createStartDocument();
EndDocument endDocument = xef.createEndDocument();
XMLOutputFactory xof = XMLOutputFactory.newFactory();
while(xer.hasNext() && !xer.peek().isEndDocument()) {
boolean metCondition;
XMLEvent xmlEvent = xer.nextTag();
if(!xmlEvent.isStartElement()) {
break;
}
// Be able to split XML file into n parts with x split elements(from
// the dummy XML example staff is the split element).
StartElement breakStartElement = xmlEvent.asStartElement();
List<XMLEvent> cachedXMLEvents = new ArrayList<XMLEvent>();
// BOUNTY CRITERIA
// I'd like to be able to specify condition that must be in the
// split element i.e. I want only staff which have nickname, I want
// to discard those without nicknames. But be able to also split
// without conditions while running split without conditions.
if(null == condition) {
cachedXMLEvents.add(breakStartElement);
metCondition = true;
} else {
cachedXMLEvents.add(breakStartElement);
xmlEvent = xer.nextEvent();
metCondition = false;
while(!(xmlEvent.isEndElement() && xmlEvent.asEndElement().getName().equals(breakStartElement.getName()))) {
cachedXMLEvents.add(xmlEvent);
if(xmlEvent.isStartElement() && xmlEvent.asStartElement().getName().getLocalPart().equals(condition)) {
metCondition = true;
break;
}
xmlEvent = xer.nextEvent();
}
}
if(metCondition) {
// Create a file for the fragment, the name is derived from the value of the id attribute
FileWriter fileWriter = null;
fileWriter = new FileWriter("src/forum7408938/" + breakStartElement.getAttributeByName(new QName("id")).getValue() + ".xml");
// A StAX XMLEventWriter will be used to write the XML fragment
XMLEventWriter xew = xof.createXMLEventWriter(fileWriter);
xew.add(startDocument);
// BOUNTY CRITERIA
// The content of the spitted files should be wrapped in the
// root element from the original file(like in the dummy example
// company)
xew.add(rootStartElement);
// Write the XMLEvents that were cached while when we were
// checking the fragment to see if it matched our criteria.
for(XMLEvent cachedEvent : cachedXMLEvents) {
xew.add(cachedEvent);
}
// Write the XMLEvents that we still need to parse from this
// fragment
xmlEvent = xer.nextEvent();
while(xer.hasNext() && !(xmlEvent.isEndElement() && xmlEvent.asEndElement().getName().equals(breakStartElement.getName()))) {
xew.add(xmlEvent);
xmlEvent = xer.nextEvent();
}
xew.add(xmlEvent);
// Close everything we opened
xew.add(xef.createEndElement(rootStartElement.getName(), null));
xew.add(endDocument);
fileWriter.close();
}
}
}
}
I can suggest using SAXParser and extending the DefaultHandler class' methods.
You can use a few booleans to keep a track of which tag you are in.
DefaultHandler will let you know when you are in a particular tag by the startElement() method. Then, you will be given the contents of the tag by the characters() method and finally you will be notified of the end of a tag by the endElement() method.
As soon as you are notified of the end of a <row>, you can get the contents of the tag you just saved and create a file out of it.
Looking at your example, you just need a couple of boolean values -- boolean inRow and boolean inName so this should not be a hard task =)
Example from Mykong (I am leaving out the actual code, you must do it on your own. It is fairly trivial):
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class ReadXMLFile {
public static void main(String argv[]) {
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
DefaultHandler handler = new DefaultHandler() {
boolean bfname = false;
boolean blname = false;
boolean bnname = false;
boolean bsalary = false;
public void startElement(String uri, String localName,String qName,
Attributes attributes) throws SAXException {
System.out.println("Start Element :" + qName);
if (qName.equalsIgnoreCase("FIRSTNAME")) {
bfname = true;
}
if (qName.equalsIgnoreCase("LASTNAME")) {
blname = true;
}
if (qName.equalsIgnoreCase("NICKNAME")) {
bnname = true;
}
if (qName.equalsIgnoreCase("SALARY")) {
bsalary = true;
}
}
public void endElement(String uri, String localName,
String qName) throws SAXException {
System.out.println("End Element :" + qName);
}
public void characters(char ch[], int start, int length) throws SAXException {
if (bfname) {
System.out.println("First Name : " + new String(ch, start, length));
bfname = false;
}
if (blname) {
System.out.println("Last Name : " + new String(ch, start, length));
blname = false;
}
if (bnname) {
System.out.println("Nick Name : " + new String(ch, start, length));
bnname = false;
}
if (bsalary) {
System.out.println("Salary : " + new String(ch, start, length));
bsalary = false;
}
}
};
saxParser.parse("c:\\file.xml", handler);
} catch (Exception e) {
e.printStackTrace();
}
}
}
The best approach is JAXB MArshal and unmarshaller to read and create xml fils.
Here is example
Assuming that your file have element that contains those rows:
<root>
<row><Name>Filename1</Name></row>
<row><Name>Filename2</Name></row>
<row><Name>Filename3</Name></row>
<row><Name>Filename4</Name></row>
<row><Name>Filename5</Name></row>
<row><Name>Filename6</Name></row>
</root>
This code will do the trick:
package com.example;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class Main {
public static String readXmlFromFile(String fileName) throws Exception {
BufferedReader reader = new BufferedReader(new FileReader(fileName));
String line = null;
StringBuilder stringBuilder = new StringBuilder();
String lineSeparator = System.getProperty("line.separator");
while ((line = reader.readLine()) != null) {
stringBuilder.append(line);
stringBuilder.append(lineSeparator);
}
return stringBuilder.toString();
}
public static List<String> divideXmlByTag(String xml, String tag) throws Exception {
List<String> list = new ArrayList<String>();
Document document = loadXmlDocument(xml);
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
NodeList rowList = document.getElementsByTagName(tag);
for(int i=0; i<rowList.getLength(); i++) {
Node rowNode = rowList.item(i);
if (rowNode.getNodeType() == Node.ELEMENT_NODE) {
DOMSource source = new DOMSource(rowNode);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
StreamResult streamResult = new StreamResult(baos);
transformer.transform(source, streamResult);
list.add(baos.toString());
}
}
return list;
}
private static Document loadXmlDocument(String xml) throws SAXException, IOException, ParserConfigurationException {
return loadXmlDocument(new ByteArrayInputStream(xml.getBytes()));
}
private static Document loadXmlDocument(InputStream inputStream) throws SAXException, IOException, ParserConfigurationException {
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
documentBuilderFactory.setNamespaceAware(true);
DocumentBuilder documentBuilder = null;
documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(inputStream);
inputStream.close();
return document;
}
public static void main(String[] args) throws Exception {
String xmlString = readXmlFromFile("d:/test.xml");
System.out.println("original xml:\n" + xmlString + "\n");
System.out.println("divided xml:\n");
List<String> dividedXmls = divideXmlByTag(xmlString, "row");
for (String xmlPart : dividedXmls) {
System.out.println(xmlPart + "\n");
}
}
}
You only need to write this xml parts to separates files.
Since the user requested one more solution posting other way.
use a StAX parser for this situation. It will prevent the entire document from being read into memory at one time.
Advance the XMLStreamReader to the local root element of the sub-fragment.
You can then use the javax.xml.transform APIs to produce a new document from this XML fragment. This will advance the XMLStreamReader to the end of that fragment.
Repeat step 1 for the next fragment.
Code Example
For the following XML, output each "statement" section into a file named after the "account attributes value":
<statements>
<statement account="123">
...stuff...
</statement>
<statement account="456">
...stuff...
</statement>
import java.io.File;
import java.io.FileReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;
public class Demo {
public static void main(String[] args) throws Exception {
XMLInputFactory xif = XMLInputFactory.newInstance();
XMLStreamReader xsr = xif.createXMLStreamReader(new FileReader("input.xml"));
xsr.nextTag(); // Advance to statements element
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
while(xsr.nextTag() == XMLStreamConstants.START_ELEMENT) {
File file = new File("out/" + xsr.getAttributeValue(null, "account") + ".xml");
t.transform(new StAXSource(xsr), new StreamResult(file));
}
}
}
If you're new to Java then the people recommending SAX and StAX parsing are throwing you in at the deep end! This is pretty low-level stuff, highly efficient, but not designed for beginners. You said the file is "big" and they've all assumed that to mean "very big", but in my experience an unquantified "big" can mean anything from 1Mb to 20Gb, so designing a solution based on that description is somewhat premature.
It's much easier to do this with XSLT 2.0 than with Java. All it takes is a stylesheet like this:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:template match="row">
<xsl:result-document href="{FileName}">
<xsl:copy-of select="."/>
</xsl:result-document>
</xsl:template>
</xsl:stylesheet>
And if it has to be within a Java application, you can easily invoke the transformation from Java using an API.
Try out this,
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
public class Test{
static public void main(String[] arg) throws Exception{
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("foo.xml");
TransformerFactory tranFactory = TransformerFactory.newInstance();
Transformer aTransformer = tranFactory.newTransformer();
NodeList list = doc.getFirstChild().getChildNodes();
for (int i=0; i<list.getLength(); i++){
Node element = list.item(i).cloneNode(true);
if(element.hasChildNodes()){
Source src = new DOMSource(element);
FileOutputStream fs=new FileOutputStream("k" + i + ".xml");
Result dest = new StreamResult(fs);
aTransformer.transform(src, dest);
fs.close();
}
}
}
}
Source: Related Answer

When used Hardcoded value only XML parser is working

This is my program,
I am using STAX Parser to parse an XML document .
When I use hardcoded value then only its working (item.account = "sss";)
But when ever if I try to use
item.account = eventReader.getElementText();
The size of the List is 0
(List items = new ArrayList();)
import java.io.StringReader;
import java.util.*;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;
public class Pavan {
static final String DATE = "date";
static final String ITEM = "item";
static final String MODE = "mode";
static final String UNIT = "unit";
static final String CURRENT = "current";
static final String INTERACTIVE = "interactive";
#SuppressWarnings({ "unchecked", "null" })
public static void main(String args[]) {
List<Position> items = new ArrayList<Position>();
try {
String documentAsString = "<request>\r\n" + "<accountid>1234</accountid>\r\n" + "<accountid>234</accountid>\r\n" + "</request>";
StringReader stringReadertext = new StringReader(documentAsString);
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
XMLStreamReader reader = inputFactory.createXMLStreamReader(stringReadertext);
XMLEventReader eventReader = inputFactory.createXMLEventReader(reader);
Position item = null;
while (eventReader.hasNext()) {
XMLEvent event = eventReader.nextEvent();
if (event.isStartElement()) {
StartElement startElement = event.asStartElement();
if (startElement.getName().toString().equals("accountid")) {
item = new Position();
item.account = "sss";
//item.account = eventReader.getElementText();
}
}
// If we reach the end of an item element we add it to the list
if (event.isEndElement()) {
EndElement endElement = event.asEndElement();
if (endElement.getName().toString().equals("accountid")) {
items.add(item);
}
}
}
} catch (XMLStreamException e) {
e.printStackTrace();
}
System.out.println(items.size());
}
}
class Position
{
String account ;
}
Because your end event was "eaten" and the if (event.isEndElement()) was never triggered. However, when you hardcode, "if (event.isEndElement())" is triggered because the missing of "eventReader.getElementText()".
i believe that getElementText() is consuming the end event, so your second if block is never entered. the docs for that method are a little confusing.

Categories