How do I parse my simple XML file with Java and SAX? - java

I am trying to parse the file below. I want to print the id and name of each passenger. Can you give me code to parse it?
<?xml version="1.0" encoding="utf-8"?>
<root xmlns:android="www.google.com">
<passenger id = "001">
<name>Tom Cruise</name>
</passenger>
<passenger id = "002">
<name>Tom Hanks</name>
</passenger>
</root>
UPDATE: This is what I have tried so far. The code and the problems I ran into are described in my other question, "Error in output of a simple SAX parser".

Here is a working example to start with, though I suggest you use StAX instead; you will find that SAX is not very convenient.
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class SAX2 {
public static void main(String[] args) throws Exception {
SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
parser.parse(new File("test.xml"), new DefaultHandler() {
@Override
public void startElement(String uri, String localName,
String qName, Attributes atts) throws SAXException {
if (qName.equals("passenger")) {
System.out.println("id = " + atts.getValue(0));
}
}
@Override
public void endElement(String uri, String localName, String qName)
throws SAXException {
}
@Override
public void characters(char[] ch, int start, int length)
throws SAXException {
String text = new String(ch, start, length);
if (!text.trim().isEmpty()) {
System.out.println("name " + text);
}
}
});
}
}
Output:
id = 001
name Tom Cruise
id = 002
name Tom Hanks
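Since the answer above recommends StAX, here is a minimal sketch of what the same task could look like with the cursor API in javax.xml.stream. The class name PassengerStax and the method readPassengers are made up for this illustration, and the XML is passed in as a String rather than read from test.xml.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of the original answer.
class PassengerStax {

    // Returns one "id = ..., name = ..." entry per <name> found inside a <passenger>.
    static List<String> readPassengers(String xml) throws XMLStreamException {
        List<String> result = new ArrayList<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        String id = null;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                String element = reader.getLocalName();
                if (element.equals("passenger")) {
                    // null namespace: match the attribute by local name only
                    id = reader.getAttributeValue(null, "id");
                } else if (element.equals("name")) {
                    // getElementText() consumes the text up to the matching end tag
                    result.add("id = " + id + ", name = " + reader.getElementText());
                }
            }
        }
        reader.close();
        return result;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><passenger id=\"001\"><name>Tom Cruise</name></passenger>"
                + "<passenger id=\"002\"><name>Tom Hanks</name></passenger></root>";
        readPassengers(xml).forEach(System.out::println);
    }
}
```

Unlike SAX, the loop pulls events itself, so the id can be kept in a local variable instead of handler state.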

Create a DocumentBuilderFactory.
Obtain a DocumentBuilder from the factory.
Use one of the parse() methods of the builder to create a Document.
Once you have a Document, you can get the passenger Elements with Document's getElementsByTagName() method.
I'm sure you'll be able to work out the rest.
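For reference, the steps above could be sketched roughly as follows. The class name PassengerDom and the method readPassengers are invented for this example, the XML is passed as a String, and the code assumes every passenger element contains a name child.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class illustrating the DOM steps listed above.
class PassengerDom {

    static List<String> readPassengers(String xml) throws Exception {
        // Steps 1-3: factory -> builder -> Document
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        // Step 4: look up the passenger elements by tag name
        List<String> result = new ArrayList<>();
        NodeList passengers = document.getElementsByTagName("passenger");
        for (int i = 0; i < passengers.getLength(); i++) {
            Element passenger = (Element) passengers.item(i);
            String id = passenger.getAttribute("id");
            // Assumption: each passenger has at least one <name> child
            String name = passenger.getElementsByTagName("name").item(0).getTextContent();
            result.add("id = " + id + ", name = " + name);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><passenger id=\"001\"><name>Tom Cruise</name></passenger>"
                + "<passenger id=\"002\"><name>Tom Hanks</name></passenger></root>";
        readPassengers(xml).forEach(System.out::println);
    }
}
```

DOM loads the whole document into memory, which is fine for a small file like this one but is the main reason to prefer SAX or StAX for very large inputs.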

SAXParserFactory factory = SAXParserFactory.newInstance();
try {
InputStream xmlInput = new FileInputStream("theFile.xml");
SAXParser saxParser = factory.newSAXParser();
DefaultHandler handler = new SaxHandler(); // SaxHandler is your own subclass of DefaultHandler
saxParser.parse(xmlInput, handler);
} catch (Throwable err) {
err.printStackTrace ();
}

String str = "<?xml version=\"1.0\" encoding=\"utf-8\"?> " +
"<root xmlns:android=\"www.google.com\">" +
"<passenger id = \"001\">" +
"<name>Tom Cruise</name>" +
"</passenger>" +
"<passenger id = \"002\">" +
"<name>Tom Hanks</name>" +
"</passenger>" +
"</root>";
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(str));
final Document document = db.parse(is);
System.out.println("node Name " + document.getChildNodes().item(0).getChildNodes().item(1).getNodeName());

Related

How to read XML declaration with Java SAX

I want to read the XML declaration from an XML file with Java SAX. For example
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
I tried using DefaultHandler, but characters and startElement don't get called for the XML declaration. This is my code:
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class SAXStuff {
public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
SAXParser sp = SAXParserFactory.newInstance().newSAXParser();
sp.parse("test.xml", new DefaultHandler() {
public void characters(char[] ch, int start, int length) throws SAXException {
for(int i = start; i < start + length; i++) {
System.out.print(ch[i]);
}
}
public void startElement(String uri, String localName, String qName, Attributes attributes)
throws SAXException {
System.out.println(qName);
}
});
}
}
How can I get the XML declaration using SAX in Java?
Since Java 14, org.xml.sax.ContentHandler has a declaration method for this purpose. DefaultHandler implements ContentHandler, so this method can be overridden to provide a custom action.
This is the method signature:
void declaration(String version, String encoding, String standalone) throws SAXException
version - the version string as in the input document, null if not specified
encoding - the encoding string as in the input document, null if not specified
standalone - the standalone string as in the input document, null if not specified
Example:
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser();
DefaultHandler handler = new DefaultHandler(){
@Override
public void declaration(String version, String encoding, String standalone) {
String declaration = "<?xml "
+ (version != null ? "version=\"" + version + "\"": "")
+ (encoding != null ? " encoding=\"" + encoding + "\"": "")
+ (standalone != null ? " standalone=\"" + standalone + "\"": "")
+ "?>";
System.out.println(declaration);
}
};
parser.parse(new File("file.xml"), handler);

Split XML using SAX parser

I have the following XML file.
<Engineers>
<Engineer>
<Name>JOHN</Name>
<Position>STL</Position>
<Team>SS</Team>
</Engineer>
<Engineer>
<Name>UDAY</Name>
<Position>TL</Position>
<Team>SG</Team>
</Engineer>
<Engineer>
<Name>INDRA</Name>
<Position>Director</Position>
<Team>PP</Team>
</Engineer>
</Engineers>
I need to split this XML into smaller XML strings when the XPath Engineers/Engineer is given.
The smaller XML strings are as follows:
<Engineers>
<Engineer>
<Name>INDRA</Name>
<Position>Director</Position>
<Team>PP</Team>
</Engineer>
</Engineers>
<Engineers>
<Engineer>
<Name>JOHN</Name>
<Position>STL</Position>
<Team>SS</Team>
</Engineer>
</Engineers>
So far I have implemented the following using SAX. It reads the elements inside the XML, but it does not produce the output I want. How can I proceed?
public class ReadSAX
{
public static void main( String[] args )
{
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
DefaultHandler handler = new DefaultHandler() {
public void startElement(String uri, String localName,
String qName, Attributes attributes)
throws SAXException {
System.out.println("Start Element :" + qName);
}
public void endElement(String uri, String localName,
String qName)
throws SAXException {
System.out.println("End Element :" + qName);
}
public void characters(char ch[], int start, int length)
throws SAXException {
System.out.println(new String(ch, start, length));
}
};
File file = new File("c:\\file.xml");
InputStream inputStream= new FileInputStream(file);
Reader reader = new InputStreamReader(inputStream,"UTF-8");
InputSource is = new InputSource(reader);
is.setEncoding("UTF-8");
saxParser.parse(is, handler);
} catch (Exception e) {
e.printStackTrace();
}
}
}
Why use such a low-level coding approach?
In XSLT 2.0 it's simply
<xsl:template match="/">
<xsl:for-each select="Engineers/Engineer">
<xsl:result-document href="{position()}.xml">
<Engineers>
<xsl:copy-of select="."/>
</Engineers>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
and if that takes too much memory, get a streaming XSLT 3.0 processor which will solve the problem.
I think what you need is VTD-XML's cut-and-paste capability. This paper, "Performance Analysis of Java APIs for XML Processing", will tell you more about VTD-XML:
http://sdiwc.us/digitlib/journal_paper.php?paper=00000582.pdf
import com.ximpleware.*;
import java.io.*;
public class splitXML {
public static void main(String[] args) throws VTDException, IOException {
VTDGen vg = new VTDGen();
if (!vg.parseFile("d:\\xml\\input.xml", false)){
System.out.println("error");
return;
}
VTDNav vn = vg.getNav();
AutoPilot ap = new AutoPilot(vn);
ap.selectXPath("/Engineers/Engineer"); // XPath is case-sensitive; match the element names in the input
int i=0,n=0;
FileOutputStream fos =null;
byte[] stag="<Engineers>".getBytes();
byte[] etag="</Engineers>".getBytes();
while((i=ap.evalXPath())!=-1){
// open the output file first, then write the wrapper start tag
fos = new FileOutputStream("d:\\xml\\output"+(++n)+".xml");
fos.write(stag);
long l = vn.getElementFragment();
fos.write(vn.getXML().getBytes(), (int)l, (int)(l>>32));
fos.write(etag);
fos.close();
}
}
}
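If you want to stay with plain SAX, as the question asks, one possible sketch is to buffer the events of each Engineer element and wrap them in a fresh Engineers root. The class name EngineerSplitter and the method split are invented here; the element names are hard-coded rather than taken from an XPath, and attributes are ignored because the sample document has none.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects one XML string per <Engineer> element.
class EngineerSplitter extends DefaultHandler {
    private final List<String> fragments = new ArrayList<>();
    private StringBuilder current; // non-null only while inside an <Engineer>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (qName.equals("Engineer")) {
            current = new StringBuilder("<Engineers>"); // start a new fragment
        }
        if (current != null) {
            current.append('<').append(qName).append('>'); // attributes omitted (none in the sample)
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.append(escape(new String(ch, start, length)));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return; // outside any <Engineer>, e.g. the closing root tag
        }
        current.append("</").append(qName).append('>');
        if (qName.equals("Engineer")) {
            fragments.add(current.append("</Engineers>").toString());
            current = null;
        }
    }

    // Re-escape characters that the parser has already unescaped.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static List<String> split(String xml) throws Exception {
        EngineerSplitter handler = new EngineerSplitter();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.fragments;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Engineers><Engineer><Name>JOHN</Name></Engineer>"
                + "<Engineer><Name>UDAY</Name></Engineer></Engineers>";
        split(xml).forEach(System.out::println);
    }
}
```

Only one fragment is held in memory at a time while parsing, so this scales to large inputs as long as each fragment is written out (or otherwise consumed) instead of being collected in a list.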

XML string parsing in Java

I am trying to parse an XML-formatted String, for example:
<params city="SANTA ANA" dateOfBirth="1970-01-01"/>
My goal is to add the attribute names to one ArrayList, such as {city, dateOfBirth}, and the attribute values to another ArrayList, such as {SANTA ANA, 1970-01-01}.
Any advice would be appreciated.
Create SAXParserFactory.
Create SAXParser.
Create YourHandler, which extends DefaultHandler.
Parse your file using SAXParser and YourHandler.
For example:
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser();
parser.parse(yourFile, new YourHandler());
} catch (ParserConfigurationException | SAXException | IOException e) {
System.err.println(e.getMessage());
}
where yourFile is a java.io.File object.
In YourHandler class:
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class YourHandler extends DefaultHandler {
String tag = "params"; // needed tag
String city = "city"; // name of the attribute
String value; // your value of the city
@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
// use qName: localName is empty unless the factory is namespace-aware
if(qName.equals(tag)) {
value = attributes.getValue(city);
}
}
public String getValue() {
return value;
}
}
For more information, see the Javadoc for SAXParser and DefaultHandler.
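The handler above only picks out the city attribute. To fill the two lists the question asks for, a handler could loop over all attributes of each element instead. This is a minimal sketch; the class name AttributeCollector and the method collect are made up for illustration, and the input is parsed from a String via InputSource.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: gathers every attribute name and value it encounters.
class AttributeCollector extends DefaultHandler {
    final List<String> names = new ArrayList<>();
    final List<String> values = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // Attributes is index-based; names.get(i) and values.get(i) stay paired
        for (int i = 0; i < attributes.getLength(); i++) {
            names.add(attributes.getQName(i));
            values.add(attributes.getValue(i));
        }
    }

    static AttributeCollector collect(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        AttributeCollector handler = new AttributeCollector();
        // Wrap the String in an InputSource; parse(String) would treat it as a URI
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        AttributeCollector c = collect("<params city=\"SANTA ANA\" dateOfBirth=\"1970-01-01\"/>");
        System.out.println("Attribute names: " + c.names);
        System.out.println("Values: " + c.values);
    }
}
```

SAX does not promise that attributes are reported in document order, so rely on the pairing by index rather than on a specific position.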
Using JDOM (http://www.jdom.org/docs/apidocs/):
String myString = "<params city='SANTA ANA' dateOfBirth='1970-01-01'/>";
SAXBuilder builder = new SAXBuilder();
Document myStringAsXML = builder.build(new StringReader(myString));
Element rootElement = myStringAsXML.getRootElement();
ArrayList<String> attributeNames = new ArrayList<String>();
ArrayList<String> values = new ArrayList<String>();
List<Attribute> attributes = new ArrayList<Attribute>();
attributes.addAll(rootElement.getAttributes());
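// restrict the descendants to elements (org.jdom2.filter.ElementFilter); text nodes carry no attributes
Iterator<Element> childIterator = rootElement.getDescendants(new ElementFilter());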
while (childIterator.hasNext()) {
Element childElement = childIterator.next();
attributes.addAll(childElement.getAttributes());
}
for (Attribute attribute: attributes) {
attributeNames.add(attribute.getName());
values.add(attribute.getValue());
}
System.out.println("Attribute names: " + attributeNames);
System.out.println("Values: " + values);

Getting Parent Child Hierarchy in Sax XML parser

I'm using SAX (Simple API for XML) to parse an XML document. I'm getting output for all the tags the file has, but I want it to show the tags in a parent-child hierarchy.
For Example:
This is my output
<dblp>
<www>
<author>
</author><title>
</title><url>
</url><year>
</year></www><inproceedings>
<month>
</month><pages>
</pages><booktitle>
</booktitle><note>
</note><cdrom>
</cdrom></inproceedings><article>
<journal>
</journal><volume>
</volume></article><ee>
</ee><book>
<publisher>
</publisher><isbn>
</isbn></book><incollection>
<crossref>
</crossref></incollection><editor>
</editor><series>
</series></dblp>
But I want the output displayed like this, with the children indented (extra spacing) under their parent:
<dblp>
<www>
<author>
</author>
<title>
</title>
<url>
</url>
<year>
</year>
</www>
<inproceedings>
<month>
</month>
<pages>
</pages>
<booktitle>
</booktitle>
<note>
</note>
<cdrom>
</cdrom>
</inproceedings>
<article>
<journal>
</journal>
<volume>
</volume>
</article>
<ee>
</ee>
<book>
<publisher>
</publisher>
<isbn>
</isbn>
</book>
<incollection>
<crossref>
</crossref>
</incollection>
<editor>
</editor>
<series>
</series>
</dblp>
But I can't figure out how to detect whether the parser is at a parent tag or a child tag.
Here is my code:
package com.teamincredibles.sax;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class Parser extends DefaultHandler {
public void getXml() {
try {
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
SAXParser saxParser = saxParserFactory.newSAXParser();
final MySet openingTagList = new MySet();
final MySet closingTagList = new MySet();
DefaultHandler defaultHandler = new DefaultHandler() {
public void startDocument() throws SAXException {
System.out.println("Starting Parsing...\n");
}
public void endDocument() throws SAXException {
System.out.print("\n\nDone Parsing!");
}
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
if (!openingTagList.contains(qName)) {
openingTagList.add(qName);
System.out.print("<" + qName + ">\n");
}
}
public void characters(char ch[], int start, int length)
throws SAXException {
/*for(int i=start; i<(start+length);i++){
System.out.print(ch[i]);
}*/
}
public void endElement(String uri, String localName, String qName)
throws SAXException {
if (!closingTagList.contains(qName)) {
closingTagList.add(qName);
System.out.print("</" + qName + ">");
}
}
};
saxParser.parse("xml/sample.xml", defaultHandler);
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String args[]) {
Parser readXml = new Parser();
readXml.getXml();
}
}
You can consider a StAX implementation:
package be.duo.stax;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
public class StaxExample {
public void getXml() {
InputStream is = null;
try {
is = new FileInputStream("c:\\dev\\sample.xml");
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
XMLStreamReader reader = inputFactory.createXMLStreamReader(is);
parse(reader, 0);
} catch(Exception ex) {
System.out.println(ex.getMessage());
} finally {
if(is != null) {
try {
is.close();
} catch(IOException ioe) {
System.out.println(ioe.getMessage());
}
}
}
}
private void parse(XMLStreamReader reader, int depth) throws XMLStreamException {
// loop until the document is exhausted; while(true) would spin forever after END_DOCUMENT
while(reader.hasNext()) {
switch(reader.next()) {
case XMLStreamConstants.START_ELEMENT:
writeBeginTag(reader.getLocalName(), depth);
parse(reader, depth+1);
break;
case XMLStreamConstants.END_ELEMENT:
writeEndTag(reader.getLocalName(), depth-1);
return;
}
}
}
private void writeBeginTag(String tag, int depth) {
for(int i = 0; i < depth; i++) {
System.out.print(" ");
}
System.out.println("<" + tag + ">");
}
private void writeEndTag(String tag, int depth) {
for(int i = 0; i < depth; i++) {
System.out.print(" ");
}
System.out.println("</" + tag + ">");
}
public static void main(String[] args) {
StaxExample app = new StaxExample();
app.getXml();
}
}
There is an idiom for StAX with a loop like this for every tag in the XML:
private MyTagObject parseMyTag(XMLStreamReader reader, String myTag) throws XMLStreamException {
MyTagObject myTagObject = new MyTagObject();
while (true) {
switch (reader.next()) {
case XMLStreamConstants.START_ELEMENT:
String localName = reader.getLocalName();
if(localName.equals("myOtherTag1")) {
myTagObject.setMyOtherTag1(parseMyOtherTag1(reader, localName));
} else if(localName.equals("myOtherTag2")) {
myTagObject.setMyOtherTag2(parseMyOtherTag2(reader, localName));
}
// and so on
break;
case XMLStreamConstants.END_ELEMENT:
if(reader.getLocalName().equals(myTag)) {
return myTagObject;
}
break;
}
}
}
What have you tried so far? You could use a Transformer, as described in "How to pretty print XML from Java?":
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
//initialize StreamResult with File object to save to file
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);
Almost any useful SAX application needs to maintain a stack. When startElement is called, you push information to the stack, when endElement is called, you pop the stack. Exactly what you put on the stack depends on the application; it's often the element name. For your application, you don't actually need a full stack, you only need to know its depth. You could get by with maintaining this using depth++ in startElement and depth-- in endElement(). Then you just output depth spaces before the element name.
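As a rough illustration of the depth-counter idea, a handler along these lines should do. The class name IndentingHandler and the method format are invented for this sketch, it indents by two spaces per level, and unlike the code in the question it prints every element rather than only the first occurrence of each tag name.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: prints start and end tags indented by nesting depth.
class IndentingHandler extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    private int depth = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        indent();
        out.append('<').append(qName).append(">\n");
        depth++; // everything until the matching endElement is one level deeper
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--; // back to the level of the matching start tag
        indent();
        out.append("</").append(qName).append(">\n");
    }

    private void indent() {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    static String format(String xml) throws Exception {
        IndentingHandler handler = new IndentingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(format("<dblp><www><author/><title/></www><article><journal/></article></dblp>"));
    }
}
```

If you later need to know which element is the parent, not just how deep you are, replace the int with a Deque<String> of element names and push and pop in the same two places.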

Parsing and updating xml using SAX parser in java

I have an XML file with similar tags:
<properties>
<definition>
<name>IP</name>
<description></description>
<defaultValue>10.1.1.1</defaultValue>
</definition>
<definition>
<name>Name</name>
<description></description>
<defaultValue>MyName</defaultValue>
</definition>
<definition>
<name>Environment</name>
<description></description>
<defaultValue>Production</defaultValue>
</definition>
</properties>
I want to update the defaultValue of the definition whose name is Environment.
Is it possible to do that using a SAX parser?
Can you please point me to the proper documentation?
So far I have parsed the document, but when I update defaultValue, it updates all defaultValues. I don't know how to target the exact defaultValue tag.
Anything is possible with SAX; it's just way harder than it has to be. It's pretty old school, and there are many easier ways to do this (JAXB, XQuery, XPath, DOM, etc.).
That said, let's do it with SAX.
It sounds like the problem is that you are not tracking your position in the document. SAX simply fires a callback whenever it encounters an event (start tag, text, end tag) in the document.
This is a fairly crude way of parsing the document and updating the relevant node using SAX. Basically, I check when we hit a name element with the value you want to update (Environment) and set a flag, so that when we reach the contents of the defaultValue node, the characters callback can remove the existing value and replace it with the new one.
import java.io.StringReader;
import java.util.Arrays;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
public class Q26897496 extends DefaultHandler {
public static String xmlDoc = "<?xml version='1.0'?>"
+ "<properties>"
+ " <definition>"
+ " <name>IP</name>"
+ " <description></description>"
+ " <defaultValue>10.1.1.1</defaultValue>"
+ " </definition>"
+ " <definition>"
+ " <name>Name</name>"
+ " <description></description>"
+ " <defaultValue>MyName</defaultValue>"
+ " </definition>"
+ " <definition>"
+ " <name>Environment</name>"
+ " <description></description>"
+ " <defaultValue>Production</defaultValue>"
+ " </definition>"
+ "</properties>";
String elementName;
boolean mark = false;
char[] updatedDoc;
public static void main(String[] args) {
Q26897496 q = new Q26897496();
try {
q.parse();
} catch (Exception e) {
e.printStackTrace();
}
}
public Q26897496() {
}
public void parse() throws Exception {
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setNamespaceAware(true);
SAXParser saxParser = spf.newSAXParser();
XMLReader xml = saxParser.getXMLReader();
xml.setContentHandler(this);
xml.parse(new InputSource(new StringReader(xmlDoc)));
System.out.println("new xml: \n" + new String(updatedDoc));
}
@Override
public void startDocument() throws SAXException {
System.out.println("starting");
}
@Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
this.elementName = localName;
}
@Override
public void characters(char[] ch, int start, int length)
throws SAXException {
String value = new String(ch, start, length);
if (elementName.equals("name")) {
if (value.equals("Environment")) {
this.mark = true;
}
}
if (elementName.equals("defaultValue") && mark == true) {
// update (this assumes ch holds the whole document, which depends on the parser's buffering)
String tmpDoc = new String(ch);
String leading = tmpDoc.substring(0, start);
String trailing = tmpDoc.substring(start + length, tmpDoc.length());
this.updatedDoc = (leading + "NewValueForDefaulValue" + trailing).toCharArray();
mark = false;
}
}
}
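The code above edits the parser's own character buffer, which only works when that buffer happens to contain the entire document. A somewhat more robust sketch keeps the same state-tracking idea but rebuilds the XML from the events it receives. The class name DefaultValueUpdater and the method update are invented for this example; attributes, comments and the XML declaration are not copied, which is acceptable for the sample document but would need extending for real input.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: re-emits the document, replacing one defaultValue.
class DefaultValueUpdater extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    private final StringBuilder text = new StringBuilder(); // text since the last tag
    private final String targetName;
    private final String newValue;
    private boolean inTargetDefinition = false;

    DefaultValueUpdater(String targetName, String newValue) {
        this.targetName = targetName;
        this.newValue = newValue;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        out.append(escape(text.toString())); // whitespace between sibling tags
        text.setLength(0);
        out.append('<').append(qName).append('>'); // attributes omitted (none in the sample)
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may arrive in several chunks per text node
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString();
        text.setLength(0);
        if (qName.equals("name")) {
            inTargetDefinition = value.trim().equals(targetName);
        } else if (qName.equals("defaultValue") && inTargetDefinition) {
            value = newValue; // the only place the content is changed
        } else if (qName.equals("definition")) {
            inTargetDefinition = false; // reset the flag for the next definition
        }
        out.append(escape(value)).append("</").append(qName).append('>');
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String update(String xml, String targetName, String newValue) throws Exception {
        DefaultValueUpdater handler = new DefaultValueUpdater(targetName, newValue);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<properties><definition><name>Environment</name>"
                + "<defaultValue>Production</defaultValue></definition></properties>";
        System.out.println(update(xml, "Environment", "Staging"));
    }
}
```

The flag is set when the name element closes and cleared when its definition closes, so it relies on name appearing before defaultValue inside each definition, as it does in the sample.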
