I'm reading an XML file using the SAX parser.
Here is my sample XML:
<?xml version="1.0"?><company><Account AccountNumber="100"><staff><firstname>yong</firstname><firstname>jin</firstname></staff></Account></company>
Here is the code
import java.util.Arrays;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
public class ReadXML {
public static void main(String argv[]) {
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
DefaultHandler handler = new DefaultHandler() {
boolean bAccount = false;
public void startElement(String uri, String localName, String qName, Attributes attributes)
throws SAXException {
System.out.println("Start Element :" + qName);
if (qName.equalsIgnoreCase("ACCOUNT")) {
bAccount = true;
}
}
public void endElement(String uri, String localName, String qName) throws SAXException {
System.out.println("End Element :" + qName);
}
public void characters(char[] ch, int start, int length) throws SAXException {
System.out.println("Im here:" + bAccount);
if (bAccount) {
System.out.println("Account First Name : " + new String(ch, start, length));
bAccount = false;
StringBuilder Account = new StringBuilder();
for (int i = start; i < ch.length - 1; i--) {
if (String.valueOf(ch[i]).equals("<")) {
System.out.println("Account:" +Account);
break;
} else {
Account.append(ch[i]);
}
}
}
}
};
saxParser.parse("C:\\Lenny\\Work\\XML\\Out_SaxParsing_01.xml", handler);
} catch (Exception e) {
e.printStackTrace();
}
}
}
As you can see in the XML, the Account tag looks like this: Account AccountNumber="100". I want to capture the whole tag, including the attribute.
To achieve that, in the characters method I'm trying to read the array from right to left, so that I can get Account AccountNumber="100" when the Account event is encountered.
But I can't get there. The event is generated, but it does not go to the characters method. I think it should go into the characters method once the Account tag is encountered, but it doesn't.
May I know what I am missing or doing wrong?
Any help, please.
AccountNumber="100" is an attribute of the Account element, so inside your startElement handler you can read the attributes parameter to access that value.
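A minimal sketch of that suggestion, assuming the sample XML from the question. The class name AccountAttributes and the helper accountNumbers are made up for illustration; only Attributes.getValue and the standard SAX classes come from the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
class AccountAttributes {

    // Returns the AccountNumber attribute of every <Account> element in the XML.
    static List<String> accountNumbers(String xml) throws Exception {
        List<String> numbers = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                if ("Account".equalsIgnoreCase(qName)) {
                    // getValue returns null when the attribute is absent.
                    String number = attributes.getValue("AccountNumber");
                    if (number != null) {
                        numbers.add(number);
                    }
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return numbers;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?><company><Account AccountNumber=\"100\"><staff>"
                + "<firstname>yong</firstname><firstname>jin</firstname>"
                + "</staff></Account></company>";
        System.out.println(accountNumbers(xml)); // prints [100]
    }
}
```

The characters callback only ever receives text between tags, so the tag name and its attributes are never visible there; they arrive through startElement.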
import java.io.File;
import java.io.IOException;
import java.util.List;
import org.jdom.Document;
import org.jdom.Element;
import org.jdom.JDOMException;
import org.jdom.input.SAXBuilder;
public class ReadXMLFile {
public static void main(String[] args) {
SAXBuilder builder = new SAXBuilder();
File xmlFile = new File("c:\\test.xml");
try {
Document document = (Document) builder.build(xmlFile);
Element rootNode = document.getRootElement();
List list = rootNode.getChildren("raum");
for (int i = 0; i < list.size(); i++) {
Element node = (Element) list.get(i);
System.out.println("ID : " + node.getChildText("ID"));
}
} catch (IOException io) {
System.out.println(io.getMessage());
} catch (JDOMException jdomex) {
System.out.println(jdomex.getMessage());
}
}
}
I don't understand what the step in between has to look like in order to insert the imported coordinates into the polygon. Maybe someone can help me with this?
You can follow any sample JDOM parser example and do it.
For example, this explains how to read the xml and take the data in a list and iterate over it. Just follow the steps and understand what you are doing, you can easily get it done.
For the sake of completeness?
This is how to achieve it using SAX parser.
Note that it is not clear to me, from your question, which Polygon you are referring to. I presume it is a java class. It can't be java.awt.Polygon because its points are all int whereas your sample XML file contains only double values. The only other class I thought of was javafx.scene.shape.Polygon that contains an array of points where each point is a double. Hence in the below code, I create an instance of javafx.scene.shape.Polygon.
For the situation you describe in your question, I don't see the point (no pun intended) in loading the entire DOM tree into memory. You simply need to create a point every time you encounter an x and a y coordinate in the XML file and add those coordinates to a collection of points.
Here is the code. Note that I created an XML file named polygon0.xml that contains the entire XML from your question. Also note that you can extend class org.xml.sax.helpers.DefaultHandler rather than implement interface ContentHandler.
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import javafx.scene.shape.Polygon;
public class Polygons implements ContentHandler {
private boolean isX;
private boolean isY;
private Polygon polygon;
/* Start 'ContentHandler' interface methods. */
@Override // org.xml.sax.ContentHandler
public void setDocumentLocator(Locator locator) {
// Do nothing.
}
@Override // org.xml.sax.ContentHandler
public void startDocument() throws SAXException {
polygon = new Polygon();
}
@Override // org.xml.sax.ContentHandler
public void endDocument() throws SAXException {
// Do nothing.
}
@Override // org.xml.sax.ContentHandler
public void startPrefixMapping(String prefix, String uri) throws SAXException {
// Do nothing.
}
@Override // org.xml.sax.ContentHandler
public void endPrefixMapping(String prefix) throws SAXException {
// Do nothing.
}
@Override // org.xml.sax.ContentHandler
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
isX = "x".equals(qName);
isY = "y".equals(qName);
}
@Override // org.xml.sax.ContentHandler
public void endElement(String uri, String localName, String qName) throws SAXException {
if (isX) {
isX = false;
}
if (isY) {
isY = false;
}
}
@Override // org.xml.sax.ContentHandler
public void characters(char[] ch, int start, int length) throws SAXException {
if (isX || isY) {
StringBuilder sb = new StringBuilder(length);
int end = start + length;
for (int i = start; i < end; i++) {
sb.append(ch[i]);
}
polygon.getPoints().add(Double.parseDouble(sb.toString()));
}
}
@Override // org.xml.sax.ContentHandler
public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
// Do nothing.
}
@Override // org.xml.sax.ContentHandler
public void processingInstruction(String target, String data) throws SAXException {
// Do nothing.
}
@Override // org.xml.sax.ContentHandler
public void skippedEntity(String name) throws SAXException {
// Do nothing.
}
/* End 'ContentHandler' interface methods. */
public static void main(String[] args) {
Polygons instance = new Polygons();
Path path = Paths.get("polygon0.xml");
SAXParserFactory spf = SAXParserFactory.newInstance();
try (FileReader reader = new FileReader(path.toFile())) { // throws java.io.IOException
SAXParser saxParser = spf.newSAXParser(); // throws javax.xml.parsers.ParserConfigurationException , org.xml.sax.SAXException
XMLReader xmlReader = saxParser.getXMLReader(); // throws org.xml.sax.SAXException
xmlReader.setContentHandler(instance);
InputSource input = new InputSource(reader);
xmlReader.parse(input);
System.out.println(instance.polygon);
}
catch (IOException |
ParserConfigurationException |
SAXException x) {
x.printStackTrace();
}
}
}
Here is the output from running the above code:
Polygon[points=[400.3, 997.2, 400.3, 833.1, 509.9, 833.1, 509.9, 700.0, 242.2, 700.0, 242.2, 600.1, 111.1, 600.1, 111.1, 300.0, 300.0, 300.0, 300.0, 420.0, 600.5, 420.0, 600.5, 101.9, 717.8, 101.9, 717.8, 200.0, 876.5, 200.0, 876.5, 500.8, 1012.1, 500.8, 1012.1, 900.2, 902.0, 900.2, 902.0, 997.2], fill=0x000000ff]
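As mentioned, extending DefaultHandler avoids the empty callback methods. Here is a rough sketch of that variant. It assumes, as the JDOM code in the question suggests, that each edge element holds one x and one y child; the class name PointCollector and the root element used in main are invented for the example. It collects the coordinates into a plain List<Double> so that it runs without JavaFX, and it buffers text until endElement because a SAX parser is allowed to deliver one text node through several characters calls.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; only the callbacks that are needed are overridden.
class PointCollector extends DefaultHandler {
    private final List<Double> points = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inCoordinate;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        inCoordinate = "x".equals(qName) || "y".equals(qName);
        text.setLength(0); // start a fresh buffer for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inCoordinate) {
            text.append(ch, start, length); // may be called more than once per element
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (inCoordinate) {
            points.add(Double.parseDouble(text.toString().trim()));
            inCoordinate = false;
        }
    }

    // Parses the XML text and returns the coordinates in document order.
    static List<Double> parse(String xml) throws Exception {
        PointCollector handler = new PointCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.points;
    }

    public static void main(String[] args) throws Exception {
        // The <polygon> root element is an assumption made for this example.
        String xml = "<polygon><edge><x>400.3</x><y>997.2</y></edge>"
                + "<edge><x>400.3</x><y>833.1</y></edge></polygon>";
        System.out.println(parse(xml)); // prints [400.3, 997.2, 400.3, 833.1]
    }
}
```

The resulting list could then be passed to polygon.getPoints().addAll(...) if a javafx.scene.shape.Polygon is what you need.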
EDIT
As requested by the OP, here is an implementation using JDOM (version 2.0.6):
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.filter.ElementFilter;
import org.jdom2.input.SAXBuilder;
import org.jdom2.util.IteratorIterable;
import javafx.scene.shape.Polygon;
public class Polygon2 {
public static void main(String[] args) {
Polygon polygon = new Polygon();
Path path = Paths.get("polygon0.xml");
SAXBuilder builder = new SAXBuilder();
try {
Document jdomDoc = builder.build(path.toFile()); // throws java.io.IOException , org.jdom2.JDOMException
Element root = jdomDoc.getRootElement();
IteratorIterable<Element> iter = root.getDescendants(new ElementFilter("edge"));
while (iter.hasNext()) {
Element elem = iter.next();
Element childX = elem.getChild("x");
polygon.getPoints().add(Double.parseDouble(childX.getText()));
Element childY = elem.getChild("y");
polygon.getPoints().add(Double.parseDouble(childY.getText()));
}
}
catch (IOException | JDOMException x) {
x.printStackTrace();
}
System.out.println(polygon);
}
}
You can read XML files with a DOM parser library; check this article.
I assume you are working on a desktop application, so you might want to use FileChooser for file selection. Here is an example of this.
Also, I think you would need to make some structural changes (for convenience) to your XML file so that it looks something like this:
<xpoints>
<x>5</x>
...
</xpoints>
<ypoints>
<y>5</y>
...
</ypoints>
But for the existing structure, doing something like this would be enough:
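import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
File file = new File("file");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(file);
doc.getDocumentElement().normalize();
NodeList nodeList = doc.getElementsByTagName("edge");
// collects the text of the first x element of every edge
List<String> array = new ArrayList<>();
// you can iterate over all edges
for (int itr = 0; itr < nodeList.getLength(); itr++)
{
Node node = nodeList.item(itr);
if (node.getNodeType() == Node.ELEMENT_NODE)
{
Element eElement = (Element) node;
//then you can access values, for example, to pass them to an array
array.add(eElement.getElementsByTagName("x").item(0).getTextContent());
}
}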
I am trying to parse the attached XML file.
The XML document is given below; see attachments 1 and 2.
Sample data of the XML file:
To parse this XML I used a SAX parser. The program is as follows.
package com.dom;
import java.io.File;
import java.io.IOException;
import java.util.Enumeration;
import java.util.Hashtable;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
public class DemoXML {
File file;
SAXParserFactory factory;
SAXParser saxParser;
UserHandler handler;
public void loadXML()
{
file = new File("E:/fifthWorkbenchProjects/XMLUtility/src/input/FIXBOND.xml");
System.out.println(file.exists());
}
public void readXML()
{
factory = SAXParserFactory.newInstance();
try {
saxParser = factory.newSAXParser();
handler = new UserHandler();
try {
saxParser.parse(file,handler);
} catch (IOException e) {
e.printStackTrace();
}
} catch (ParserConfigurationException e) {
e.printStackTrace();
} catch (SAXException e) {
e.printStackTrace();
}
}
public static void main(String args[])
{
DemoXML ob = new DemoXML();
ob.loadXML();
ob.readXML();
}
}
class UserHandler extends DefaultHandler
{
Hashtable tags;
@Override
public void startDocument()
{
System.out.println("Document started");
tags = new Hashtable();
}
@Override
public void endDocument()
{
System.out.println("Documents ended");
}
@Override
public void startElement(String namespaceURI,String localName,String qname,Attributes atts) throws SAXException
{
// System.out.println("Element started");
// if(qname.equals("Currency"))
System.out.print(qname+"-->");
}
@Override
public void endElement(String uri,String localName, String qname)
{
}
@Override
public void characters(char[] ch, int start, int length)
{
String str = new String(ch,start,length);
System.out.println(str);
System.out.println();
}
}
I get output in the following manner.
true
Document started
FIgovcorpagncy-->InstrumentDescription-->InstrumentType-->FI GOVCORPAGNCY
InstrumentSubType-->FIXDBOND
InstrumentName-->QUEENSNR 0% 07/06/2016
InstrumentDescription-->QUEENSNR 0% 07/06/2016
Currency-->GBP
InstrumentStatus-->ACTIVE
AmountOutstanding-->48384375
AmtOutstandingDate-->2012-06-27T00:00:00.000
PrincipalExchange-->N
CountryOfRisk-->GB
InstrumentCompleteness-->50
CapitalRanking-->1
AtIssuance-->IssueDate-->2012-06-27T00:00:00.000
OriginalIssueAmount-->48384375
PrivatePlacementFlag-->Y
MinimumDenomination-->1000
MinimumIncrement-->0.01
and so on. I am able to access all nodes, but notice one thing: for the first element in the tree, the complete element path is printed, like
FIgovcorpagncy-->InstrumentDescription-->InstrumentType-->FI GOVCORPAGNCY
whereas for the rest of the elements in the tree, it prints only the tag name and the corresponding value, like
InstrumentSubType-->FIXDBOND
InstrumentName-->QUEENSNR 0% 07/06/2016
InstrumentDescription-->QUEENSNR 0% 07/06/2016
Currency-->GBP
InstrumentStatus-->ACTIVE
AmountOutstanding-->48384375
and so on.
My requirement is to print these elements with their full hierarchy as well, in the same manner as the first element.
How do I go about it?
class UserHandler extends DefaultHandler
{
List li_elements,li_values;
LinkedHashMap<List<String>,List<String>> hm;
boolean endElementFlag;
@Override
public void startDocument()
{
System.out.println("Document started");
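li_elements = new ArrayList<String>();
li_values=new ArrayList<String>();
// initialise the map so that endDocument() does not throw a NullPointerException
hm = new LinkedHashMap<List<String>,List<String>>();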
}
@Override
public void endDocument()
{
System.out.println("Documents ended"+hm.size());
for(Map.Entry m:hm.entrySet())
{
System.out.println(m.getKey()+""+m.getValue());
}
}
@Override
public void startElement(String namespaceURI,String localName,String qname,Attributes atts) throws SAXException
{
li_elements.add(qname);
//System.out.println("Element Started");
//System.out.println(qname+" added in element list");
}
@Override
public void endElement(String uri,String localName, String qname)
{
if(!li_values.isEmpty())
{
System.out.println("Element address list:-"+li_elements+"and Corresponding Value:-"+li_values);
System.out.println();
}
li_elements.remove(li_elements.size()-1);
li_values.clear();
}
@Override
public void characters(char[] ch, int start, int length)
{
String str = new String(ch,start,length);
li_values.add(str);
}
}
I was expecting something like this. It prints the output in the format I was hoping for.
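For comparison, here is a compact sketch of the same idea that keeps the open elements on a stack and joins them with "-->" whenever an element with text ends. The class name PathPrinter is invented, and the nesting of the sample XML in main is only inferred from the output shown in the question, since the real file is in the attachments.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
class PathPrinter extends DefaultHandler {
    private final Deque<String> path = new ArrayDeque<>();  // currently open elements, root first
    private final StringBuilder text = new StringBuilder(); // text of the current element
    private final List<String> lines = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        path.addLast(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            // Full path from the root down to this element, then its value.
            lines.add(String.join("-->", path) + "-->" + value);
        }
        text.setLength(0);
        path.removeLast();
    }

    // Parses the XML text and returns one "path-->value" line per leaf element.
    static List<String> paths(String xml) throws Exception {
        PathPrinter handler = new PathPrinter();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<FIgovcorpagncy><InstrumentDescription>"
                + "<InstrumentType>FI GOVCORPAGNCY</InstrumentType>"
                + "<Currency>GBP</Currency>"
                + "</InstrumentDescription></FIgovcorpagncy>";
        paths(xml).forEach(System.out::println);
    }
}
```

Clearing the buffer in both startElement and endElement keeps the whitespace between sibling elements from being reported as a value.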
I'm using SAX (Simple API for XML) to parse an XML document. My purpose is to parse the document so that I can separate the entities from the XML and create an ER diagram from them (which I will create manually once I have all the entities in the file).
I'm at a very early stage of coding everything described above, but I'm stuck on this particular problem right now.
Here is my code:
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class Parser extends DefaultHandler {
public void getXml() {
try {
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
SAXParser saxParser = saxParserFactory.newSAXParser();
final MySet openingTagList = new MySet();
final MySet closingTagList = new MySet();
DefaultHandler defaultHandler = new DefaultHandler() {
public void startDocument() throws SAXException {
System.out.println("Starting Parsing...\n");
}
public void endDocument() throws SAXException {
System.out.print("\n\nDone Parsing!");
}
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
if (!openingTagList.contains(qName)) {
openingTagList.add(qName);
System.out.print("<" + qName + ">");
}
}
public void characters(char ch[], int start, int length)
throws SAXException {
for (int i = start; i < (start + length); i++) {
System.out.print(ch[i]);
}
}
public void endElement(String uri, String localName, String qName)
throws SAXException {
if (!closingTagList.contains(qName)) {
closingTagList.add(qName);
System.out.print("</" + qName + ">");
}
}
};
saxParser.parse("student.xml", defaultHandler);
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String args[]) {
Parser readXml = new Parser();
readXml.getXml();
}
}
What I'm trying to achieve is this: when the startElement method detects that a tag was already traversed, it should skip that tag as well as all the other entities inside it, but I'm confused about how to implement that part.
Note: the purpose is to read the tags; I don't care about the records in between them. MySet is just an abstraction that provides methods such as contains (whether the set holds the passed data), nothing much.
Any help would be appreciated. Thanks.
Due to the nature of XML, it's not possible to know which tags will appear later in the file, so there is no 'skip the next x bytes' trick.
Just ask for reasonably sized files; maybe there is a possibility to split the data.
In my opinion, reading an XML file of more than 1 GB is no fun, regardless of the library used.
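The parser still has to read every byte, but the handler itself can ignore a repeated tag and everything nested inside it by counting depth. The following is only a sketch of that idea: the class name UniqueTagHandler is made up, java.util.HashSet stands in for MySet, and the sample XML is invented because student.xml is not shown.

```java
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
class UniqueTagHandler extends DefaultHandler {
    private final Set<String> seen = new HashSet<>();
    private final StringBuilder out = new StringBuilder();
    private int skipDepth; // greater than 0 while inside a subtree that is being ignored

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (skipDepth > 0) {          // already skipping: just track the nesting
            skipDepth++;
            return;
        }
        if (!seen.add(qName)) {       // add() returns false when the tag was seen before
            skipDepth = 1;            // start skipping this element and its children
            return;
        }
        out.append('<').append(qName).append('>');
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (skipDepth > 0) {          // leaving a skipped element
            skipDepth--;
            return;
        }
        out.append("</").append(qName).append('>');
    }

    // Parses the XML text and returns the tags that were not skipped.
    static String uniqueTags(String xml) throws Exception {
        UniqueTagHandler handler = new UniqueTagHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<students><student><name>a</name></student>"
                + "<student><name>b</name><age>3</age></student></students>";
        // prints <students><student><name></name></student></students>
        System.out.println(uniqueTags(xml));
    }
}
```

One consequence of skipping whole subtrees shows up in the example: the age tag is never reported, because it first appears inside a student element that was already seen. If such tags matter for the ER diagram, only the repeated start and end tags should be suppressed, not their children.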
I'm using SAX (Simple API for XML) to parse an XML document. I'm getting output for all the tags the file has, but I want it to show the tags in a parent-child hierarchy.
For example:
This is my output:
<dblp>
<www>
<author>
</author><title>
</title><url>
</url><year>
</year></www><inproceedings>
<month>
</month><pages>
</pages><booktitle>
</booktitle><note>
</note><cdrom>
</cdrom></inproceedings><article>
<journal>
</journal><volume>
</volume></article><ee>
</ee><book>
<publisher>
</publisher><isbn>
</isbn></book><incollection>
<crossref>
</crossref></incollection><editor>
</editor><series>
</series></dblp>
But I want the output to look like this (children displayed with extra indentation):
<dblp>
    <www>
        <author>
        </author>
        <title>
        </title>
        <url>
        </url>
        <year>
        </year>
    </www>
    <inproceedings>
        <month>
        </month>
        <pages>
        </pages>
        <booktitle>
        </booktitle>
        <note>
        </note>
        <cdrom>
        </cdrom>
    </inproceedings>
    <article>
        <journal>
        </journal>
        <volume>
        </volume>
    </article>
    <ee>
    </ee>
    <book>
        <publisher>
        </publisher>
        <isbn>
        </isbn>
    </book>
    <incollection>
        <crossref>
        </crossref>
    </incollection>
    <editor>
    </editor>
    <series>
    </series>
</dblp>
But I can't figure out how to detect whether the parser is currently parsing a parent tag or a child.
Here is my code:
package com.teamincredibles.sax;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class Parser extends DefaultHandler {
public void getXml() {
try {
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
SAXParser saxParser = saxParserFactory.newSAXParser();
final MySet openingTagList = new MySet();
final MySet closingTagList = new MySet();
DefaultHandler defaultHandler = new DefaultHandler() {
public void startDocument() throws SAXException {
System.out.println("Starting Parsing...\n");
}
public void endDocument() throws SAXException {
System.out.print("\n\nDone Parsing!");
}
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
if (!openingTagList.contains(qName)) {
openingTagList.add(qName);
System.out.print("<" + qName + ">\n");
}
}
public void characters(char ch[], int start, int length)
throws SAXException {
/*for(int i=start; i<(start+length);i++){
System.out.print(ch[i]);
}*/
}
public void endElement(String uri, String localName, String qName)
throws SAXException {
if (!closingTagList.contains(qName)) {
closingTagList.add(qName);
System.out.print("</" + qName + ">");
}
}
};
saxParser.parse("xml/sample.xml", defaultHandler);
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String args[]) {
Parser readXml = new Parser();
readXml.getXml();
}
}
You can consider a StAX implementation:
package be.duo.stax;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
public class StaxExample {
public void getXml() {
InputStream is = null;
try {
is = new FileInputStream("c:\\dev\\sample.xml");
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
XMLStreamReader reader = inputFactory.createXMLStreamReader(is);
parse(reader, 0);
} catch(Exception ex) {
System.out.println(ex.getMessage());
} finally {
if(is != null) {
try {
is.close();
} catch(IOException ioe) {
System.out.println(ioe.getMessage());
}
}
}
}
private void parse(XMLStreamReader reader, int depth) throws XMLStreamException {
// Loop only while events remain; while(true) would spin forever after the end of the document.
while(reader.hasNext()) {
switch(reader.next()) {
case XMLStreamConstants.START_ELEMENT:
writeBeginTag(reader.getLocalName(), depth);
parse(reader, depth+1);
break;
case XMLStreamConstants.END_ELEMENT:
writeEndTag(reader.getLocalName(), depth-1);
return;
}
}
}
private void writeBeginTag(String tag, int depth) {
for(int i = 0; i < depth; i++) {
System.out.print(" ");
}
System.out.println("<" + tag + ">");
}
private void writeEndTag(String tag, int depth) {
for(int i = 0; i < depth; i++) {
System.out.print(" ");
}
System.out.println("</" + tag + ">");
}
public static void main(String[] args) {
StaxExample app = new StaxExample();
app.getXml();
}
}
There is an idiom for StAX with a loop like this for every tag in the XML:
private MyTagObject parseMyTag(XMLStreamReader reader, String myTag) throws XMLStreamException {
MyTagObject myTagObject = new MyTagObject();
while (true) {
switch (reader.next()) {
case XMLStreamConstants.START_ELEMENT:
String localName = reader.getLocalName();
if(localName.equals("myOtherTag1")) {
myTagObject.setMyOtherTag1(parseMyOtherTag1(reader, localName));
} else if(localName.equals("myOtherTag2")) {
myTagObject.setMyOtherTag2(parseMyOtherTag2(reader, localName));
}
// and so on
break;
case XMLStreamConstants.END_ELEMENT:
if(reader.getLocalName().equals(myTag)) {
return myTagObject;
}
break;
}
}
}
Well, what have you tried? You should use a Transformer, as described in: How to pretty print XML from Java?
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
//initialize StreamResult with File object to save to file
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);
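The snippet above assumes that a DOM Document named doc already exists. A self-contained sketch, with an invented class name PrettyPrint and a made-up sample document, could look like this. Note that it loads the whole document into memory, and that the exact indentation width depends on the Transformer implementation shipped with your JDK.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical class name, used only for this illustration.
class PrettyPrint {

    // Parses the XML text into a DOM tree and serialises it again with indentation enabled.
    static String indent(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(indent("<dblp><www><author>a</author></www></dblp>"));
    }
}
```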
Almost any useful SAX application needs to maintain a stack. When startElement is called, you push information to the stack, when endElement is called, you pop the stack. Exactly what you put on the stack depends on the application; it's often the element name. For your application, you don't actually need a full stack, you only need to know its depth. You could get by with maintaining this using depth++ in startElement and depth-- in endElement(). Then you just output depth spaces before the element name.
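The depth counter described above can be sketched as follows. The class name IndentingHandler, the two-space indent, and the sample input are choices made for this example; unlike the code in the question, it prints every tag rather than only the first occurrence of each name.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
class IndentingHandler extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    private int depth; // number of elements currently open

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        indent();
        out.append('<').append(qName).append(">\n");
        depth++; // children of this element are one level deeper
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        depth--; // back to the level of the matching start tag
        indent();
        out.append("</").append(qName).append(">\n");
    }

    private void indent() {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    // Parses the XML text and returns the tags, one per line, indented by depth.
    static String format(String xml) throws Exception {
        IndentingHandler handler = new IndentingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(format("<dblp><www><author>x</author></www></dblp>"));
    }
}
```

If the element names themselves are needed later (for example to print a path), replace the int with a Deque<String> and push in startElement, pop in endElement.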
I have a student.xml file that I am parsing with a SAX parser, and now I need to store the data in a MySQL database. What approach is recommended?
Code:
package sax;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class ReadXML extends DefaultHandler{
public void characters(char[] ch, int start, int length) throws SAXException {
String s =new String(ch, start, length);
if(s.trim().length()>0) {
System.out.println(" Value: "+s);
}
}
public void startDocument() throws SAXException {
System.out.println("Start document");
}
public void endDocument() throws SAXException {
System.out.println("End document");
}
public void startElement(String uri, String localName, String name,
Attributes attributes) throws SAXException {
System.out.println("start element : "+name);
}
public void endElement(String uri, String localName, String name) throws SAXException {
System.out.println("end element");
}
public static void main(String[] args) {
ReadXML handler = new ReadXML();
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
saxParser.parse("student.xml", handler);
} catch (Exception e) {
e.printStackTrace();
}
}
}
I would create a set of tables that represent the data contained in student.xml and then populate them as you parse the data.
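A rough sketch of that approach is shown below. Everything about the data layout is an assumption, since student.xml is not shown: the element names student, name and age, the table student with columns name and age, and the class name StudentLoader are all hypothetical. Only the SAX and JDBC calls themselves are standard.

```java
import java.io.StringReader;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name; element, table and column names are assumptions.
class StudentLoader extends DefaultHandler {
    private final List<String[]> rows = new ArrayList<>(); // each row is {name, age}
    private final StringBuilder text = new StringBuilder();
    private String name;
    private String age;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0);
        if ("student".equals(qName)) { // a new record starts
            name = null;
            age = null;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        switch (qName) {
            case "name":    name = text.toString().trim(); break;
            case "age":     age = text.toString().trim(); break;
            case "student": rows.add(new String[] {name, age}); break;
            default:        break;
        }
    }

    // Parses the XML text and returns one {name, age} pair per student element.
    static List<String[]> parse(String xml) throws Exception {
        StudentLoader handler = new StudentLoader();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.rows;
    }

    // Inserts all rows in one JDBC batch; assumes every row has a numeric age.
    static void insertAll(Connection connection, List<String[]> rows) throws SQLException {
        String sql = "INSERT INTO student (name, age) VALUES (?, ?)";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            for (String[] row : rows) {
                statement.setString(1, row[0]);
                statement.setInt(2, Integer.parseInt(row[1]));
                statement.addBatch();
            }
            statement.executeBatch();
        }
    }
}
```

To use it, obtain a Connection from DriverManager.getConnection with your own MySQL JDBC URL and credentials, then call insertAll(connection, parse(xml)). For very large files it may be preferable to flush the batch every few thousand rows from inside endElement instead of collecting everything first.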
You might be able to store the XML directly in the DB. Many database packages can insert XML-formatted data into the appropriate places in the database. I believe that PostgreSQL and MS SQL Server can do it.
There is existing functionality to do this in MySQL, for example the LOAD XML statement.