Join two strings retrieved by SAX - java

I have an XML file like this one:
<?xml version="1.0" encoding="UTF-8"?>
<Article>
<ArticleTitle>Java-SAX Tutorial</ArticleTitle>
<Author>
<FamilyName>Yong</FamilyName>
<GivenName>Mook</GivenName>
<GivenName>Kim</GivenName>
<nickname>mkyong</nickname>
<salary>100000</salary>
</Author>
<Author>
<FamilyName>Low</FamilyName>
<GivenName>Yin</GivenName>
<GivenName>Fong</GivenName>
<nickname>fong fong</nickname>
<salary>200000</salary>
</Author>
</Article>
I have tried the example in mkyong's tutorial here, and I can retrieve the data from it using SAX. It gives me:
Article Title : Java-SAX Tutorial
Given Name : Kim
Given Name : Mook
Family Name : Yong
Given Name : Yin
Given Name : Fong
Family Name : Low
But I want it to give me something like this:
Article Title : Java-SAX Tutorial
Author : Kim Mook Yong
Author : Yin Fong Low
In other words, I would like to retrieve some (not all) of the child nodes of the Author node, put them in a string variable and display them.
This is the class I use in order to parse the Authors with the modification I have tried to do:
public class ReadAuthors {
public void parse(String filePath) {
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
DefaultHandler handler = new DefaultHandler() {
boolean bFamilyName = false;
boolean bGivenName = false;
@Override
public void startElement(String uri, String localName,String qName,
Attributes attributes) throws SAXException {
if (qName.equalsIgnoreCase("FamilyName")) {
bFamilyName = true;
}
if (qName.equalsIgnoreCase("GivenName")) {
bGivenName = true;
}
}
@Override
public void endElement(String uri, String localName,
String qName) throws SAXException {
}
@Override
public void characters(char ch[], int start, int length) throws SAXException {
String fullName = "";
String familyName = "";
String givenName ="";
if (bFamilyName) {
familyName = new String(ch, start, length);
fullName += familyName;
bFamilyName = false;
}
if (bGivenName) {
givenName = new String(ch, start, length);
fullName += " " + givenName;
bGivenName = false;
}
System.out.println("Full Name : " + fullName);
}
};
saxParser.parse(filePath, handler);
} catch (Exception e) {
e.printStackTrace();
}
}
}
With this modification, it only gives me the ArticleTitle value and it doesn't return anything regarding the authors' full names.
I have another class for parsing the ArticleTitle node and they are both called in a Main class.
What did I do wrong? And how can I fix it?

The fullName variable is reset every time the characters method is called, because it is declared inside that method. I think you should move that variable into the handler as a field: initialize it with an empty string when Author starts and write it out when Author ends. The concatenation should work as you did it. I haven't tried this out, but something similar should work:
public class ReadAuthors {
public void parse(String filePath) {
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
DefaultHandler handler = new DefaultHandler() {
boolean bName = false;
String fullName = "";
@Override
public void startElement(String uri, String localName,String qName,
Attributes attributes) throws SAXException {
if (qName.equalsIgnoreCase("FamilyName")) {
bName = true;
}
if (qName.equalsIgnoreCase("GivenName")) {
bName = true;
}
if (qName.equalsIgnoreCase("Author")) {
fullName = "";
}
}
@Override
public void endElement(String uri, String localName,
String qName) throws SAXException {
if (qName.equalsIgnoreCase("Author")) {
System.out.println("Full Name : " + fullName);
}
}
@Override
public void characters(char ch[], int start, int length) throws SAXException {
if (bName) {
// Append the name part, separating parts with a single space.
String name = new String(ch, start, length);
fullName += fullName.isEmpty() ? name : " " + name;
bName = false;
}
}
};
saxParser.parse(filePath, handler);
} catch (Exception e) {
e.printStackTrace();
}
}
}
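As a variation on the idea above, here is a minimal sketch that buffers text in a StringBuilder and assembles the name when Author ends, so the given names can be printed before the family name regardless of their order in the file. The class name AuthorNameCollector and its parse helper are made up for illustration, and the XML is passed as a string only to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original code.
class AuthorNameCollector extends DefaultHandler {
    private final List<String> authors = new ArrayList<>();
    private final List<String> givenNames = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private String familyName = "";

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // start collecting text for the new element
        if (qName.equals("Author")) {
            givenNames.clear();
            familyName = "";
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may call this several times for one element, so append.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        switch (qName) {
            case "GivenName":
                givenNames.add(value);
                break;
            case "FamilyName":
                familyName = value;
                break;
            case "Author":
                // Given names first, then the family name; nickname and salary are ignored.
                authors.add((String.join(" ", givenNames) + " " + familyName).trim());
                break;
            default:
                break;
        }
        text.setLength(0);
    }

    static List<String> parse(String xml) {
        try {
            AuthorNameCollector handler = new AuthorNameCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.authors;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Article><ArticleTitle>Java-SAX Tutorial</ArticleTitle>"
                + "<Author><FamilyName>Yong</FamilyName><GivenName>Mook</GivenName>"
                + "<GivenName>Kim</GivenName><nickname>mkyong</nickname></Author>"
                + "<Author><FamilyName>Low</FamilyName><GivenName>Yin</GivenName>"
                + "<GivenName>Fong</GivenName><nickname>fong fong</nickname></Author>"
                + "</Article>";
        for (String author : parse(xml)) {
            System.out.println("Author : " + author);
        }
    }
}
```

Doing the work in endElement rather than in characters means the handler does not depend on how the parser happens to split the text.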

Related

Turn java XML SAX Parser to web app Tomcat

I've got a Java SAX parser for XML (we set the date, make a URL request for this date and parse the XML file). Now I need to turn this code into a web app in Tomcat. I've imported all the necessary libraries and created the artifacts, but I don't know how to change the code itself.
Here is the initial code.
Handler:
public class UserHandler extends DefaultHandler {
boolean bName = false;
boolean bValue = false;
String result=" ";
@Override
public void startElement(String uri,
String localName, String qName, Attributes attributes) throws SAXException {
if (qName.equalsIgnoreCase("Valute")) {
String CharCode = attributes.getValue("CharCode");
} else if (qName.equalsIgnoreCase("Name")) {
bName = true;
} else if (qName.equalsIgnoreCase("Value")) {
bValue = true;
}
}
@Override
public void endElement(String uri,
String localName, String qName) throws SAXException {
if (qName.equalsIgnoreCase("Valute")) {
System.out.print(" ");
}
}
@Override
public void characters(char ch[], int start, int length) throws SAXException {
if (bName) {
result=(new String(ch, start, length)+" ");
bName = false;
} else if (bValue) {
result=result+(new String(ch, start, length));
bValue = false;
System.out.print(result);
}
}
}
Main:
public class Main {
public static void main(String[] args) throws MalformedURLException {
//Set the date dd.mm.yyyy
String date="12.08.2020";
String link ="http://www.cbr.ru/scripts/XML_daily.asp?date_req=";
URL url =new URL(link);
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
UserHandler userHandler = new UserHandler();
saxParser.parse(String.valueOf(url+date), userHandler);
} catch (Exception e) {
e.printStackTrace();
}
}
}

SAX Parser: How to read only from `Item` tag?

I am using a SAX parser to parse some XML content. Please check my code below.
public void parse(InputSource is, AppDataBean appDataBean) throws RuntimeException {
int limitCheck;
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
Log.d("SAX",appDataBean.getUrl());
DefaultHandler handler = new DefaultHandler() {
boolean title = false;
boolean link = false;
boolean author = false;
public void startElement(String uri, String localName,
String qName, Attributes attributes)
throws SAXException {
if (qName.equalsIgnoreCase(TITLE)) {
title = true;
}
if (qName.equalsIgnoreCase(LINK)) {
link = true;
}
if (qName.equalsIgnoreCase(AUTHOR)) {
author = true;
}
//Log.d("SAX","Start Element :" + qName);
}
public void endElement(String uri, String localName,
String qName)
throws SAXException {
}
public void characters(char ch[], int start, int length)
throws SAXException {
System.out.println(new String(ch, start, length));
if (title) {
Log.d("SAX","End Element :" + "First Name : "
+ new String(ch, start, length));
title = false;
}
if (link) {
Log.d("SAX","End Element :" + "Last Name : "
+ new String(ch, start, length));
link = false;
}
if (author) {
Log.d("SAX","End Element :" + "Nick Name : "
+ new String(ch, start, length));
author = false;
}
}
};
saxParser.parse(is, handler);
} catch (Exception e) {
e.printStackTrace();
}
}
Below is what my XML looks like.
<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
<?xml-stylesheet type="text/xsl" href="rss.xsl"?>
<channel>
<title>MyRSS</title>
<atom:link href="http://www.example.com/rss.php" rel="self" type="application/rss+xml" />
<link>http://www.example.com/rss.php</link>
<description>MyRSS</description>
<language>en-us</language>
<pubDate>Tue, 22 May 2018 13:15:15 +0530</pubDate>
<item>
<title>Title 1</title>
<pubDate>Tue, 22 May 2018 13:14:40 +0530</pubDate>
<link>http://www.example.com/news.php?nid=47610</link>
<guid>http://www.example.com/news.php?nid=47610</guid>
<description>bla bla bla</description>
</item>
</channel>
</rss>
However, here I need to skip the tags directly under the Channel tag and only read a tag if its parent is Item. Only then can I get the real content. How can I do this?
Update
As suggested by an answer, I tried using the SAX parser with a stack. Below is the code, but it is still no good: now it prints nothing for the First Name.
public void parse(InputSource is, AppDataBean appDataBean) throws RuntimeException {
int limitCheck;
stack = new Stack<>();
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
Log.d("SAX", appDataBean.getUrl());
DefaultHandler handler = new DefaultHandler() {
boolean title = false;
boolean link = false;
boolean author = false;
public void startElement(String uri, String localName,
String qName, Attributes attributes)
throws SAXException {
Log.d("SAX", "localName: " + localName);
if(localName.equalsIgnoreCase("item"))
{
stack = new Stack<>();
stack.push(qName);
}
if (qName.equalsIgnoreCase(TITLE)) {
if(stack.peek().equalsIgnoreCase("item"))
{
title = true;
}
}
if (qName.equalsIgnoreCase(LINK)) {
link = true;
}
if (qName.equalsIgnoreCase(AUTHOR)) {
author = true;
}
//Log.d("SAX","Start Element :" + qName);
}
public void endElement(String uri, String localName,
String qName)
throws SAXException {
stack.pop();
}
public void characters(char ch[], int start, int length)
throws SAXException {
System.out.println(new String(ch, start, length));
if (title) {
Log.d("SAX", "End Element :" + "First Name : "
+ new String(ch, start, length));
title = false;
}
if (link) {
Log.d("SAX", "End Element :" + "Last Name : "
+ new String(ch, start, length));
link = false;
}
if (author) {
Log.d("SAX", "End Element :" + "Nick Name : "
+ new String(ch, start, length));
author = false;
}
}
};
saxParser.parse(is, handler);
} catch (Exception e) {
e.printStackTrace();
}
}
Typically a SAX application will maintain a stack to hold context. On a startElement event, push the element name to the stack; on endElement pop it off the stack. Then when you get a startElement event for a title element, you can do stack.peek() to see what the parent of the title is.
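The approach described above can be sketched as follows. The important difference from the updated code in the question is that every element is pushed in startElement, not only item, so each pop in endElement is matched by an earlier push. The class name ItemTitleHandler and the parse helper are hypothetical, and the feed is cut down to the parts that matter.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that keeps only <title> elements whose parent is <item>.
class ItemTitleHandler extends DefaultHandler {
    private final Deque<String> path = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();
    private final List<String> itemTitles = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        path.push(qName);   // push every element, not just <item>
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        path.pop();         // remove the element that just closed
        // After the pop, peek() is the parent of the closed element (null at the root).
        if (qName.equals("title") && "item".equals(path.peek())) {
            itemTitles.add(text.toString().trim());
        }
        text.setLength(0);
    }

    static List<String> parse(String xml) {
        try {
            ItemTitleHandler handler = new ItemTitleHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.itemTitles;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rss version=\"2.0\"><channel><title>MyRSS</title>"
                + "<link>http://www.example.com/rss.php</link>"
                + "<item><title>Title 1</title>"
                + "<link>http://www.example.com/news.php?nid=47610</link></item>"
                + "</channel></rss>";
        System.out.println(parse(xml)); // the channel title MyRSS is skipped
    }
}
```

The same parent check can be applied to link or any other element that appears both under channel and under item.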

How to handle namespaces with SAX Parser?

I'm trying to learn to parse XML documents. I have an XML document that uses namespaces, so I'm sure I need to do something extra to parse it correctly.
This is what I have:
DefaultHandler handler = new DefaultHandler() {
boolean bfname = false;
boolean blname = false;
boolean bnname = false;
boolean bsalary = false;
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
System.out.println("Start Element :" + qName);
if (qName.equalsIgnoreCase("FIRSTNAME")) {
bfname = true;
}
if (qName.equalsIgnoreCase("LASTNAME")) {
blname = true;
}
if (qName.equalsIgnoreCase("NICKNAME")) {
bnname = true;
}
if (qName.equalsIgnoreCase("SALARY")) {
bsalary = true;
}
}
public void endElement(String uri, String localName,
String qName) throws SAXException {
System.out.println("End Element :" + qName);
}
public void characters(char ch[], int start, int length) throws SAXException {
if (bfname) {
System.out.println("First Name : " + new String(ch, start, length));
bfname = false;
}
if (blname) {
System.out.println("Last Name : " + new String(ch, start, length));
blname = false;
}
if (bnname) {
System.out.println("Nick Name : " + new String(ch, start, length));
bnname = false;
}
if (bsalary) {
System.out.println("Salary : " + new String(ch, start, length));
bsalary = false;
}
}
};
saxParser.parse(file, handler);
My question is: how can I handle the namespace in this example?
To elaborate on Blaise's point with sample code, consider this contrived example:
<?xml version="1.0" encoding="UTF-8"?>
<!-- ns.xml -->
<root xmlns:foo="http://data" xmlns="http://data">
<foo:record>ONE</foo:record>
<bar:record xmlns:bar="http://data">TWO</bar:record>
<record>THREE</record>
<record xmlns="http://metadata">meta 1</record>
<foo:record xmlns:foo="http://metadata">meta 2</foo:record>
</root>
There are two different types of record element: one in the http://data namespace, the other in the http://metadata namespace. There are three data records and two metadata records.
The document could be normalized to this:
<?xml version="1.0" encoding="UTF-8"?>
<ns0:root xmlns:ns0="http://data" xmlns:ns1="http://metadata">
<ns0:record>ONE</ns0:record>
<ns0:record>TWO</ns0:record>
<ns0:record>THREE</ns0:record>
<ns1:record>meta 1</ns1:record>
<ns1:record>meta 2</ns1:record>
</ns0:root>
But the code must handle the general case.
Here is some code for printing the metadata records:
class MetadataPrinter extends DefaultHandler {
private boolean isMeta = false;
@Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
isMeta = "http://metadata".equals(uri) && "record".equals(localName);
}
@Override
public void endElement(String uri, String localName, String qName)
throws SAXException {
if (isMeta) {
System.out.println();
isMeta = false;
}
}
@Override
public void characters(char[] ch, int start, int length)
throws SAXException {
if (isMeta) {
System.out.print(new String(ch, start, length));
}
}
}
SAXParserFactory factory = SAXParserFactory.newInstance();
factory.setNamespaceAware(true);
SAXParser parser = factory.newSAXParser();
parser.parse(new File("ns.xml"), new MetadataPrinter());
Note: namespace awareness must be enabled explicitly in some of the older Java XML APIs (SAX and DOM among them.)
In a namespace-qualified XML document there are two components to a node's name: the namespace URI and the local name (these are passed in as parameters to the startElement and endElement events). When you are checking for the presence of an element, you should match on both of these parameters. Currently your code would work for both documents below even though they are namespace qualified differently.
<foo xmlns="FOO">
<bar>Hello World</bar>
</foo>
And
<foo xmlns="BAR">
<bar>Hello World</bar>
</foo>
You are currently (and incorrectly) matching on the qName parameter. The problem with what you are doing is that the qName might change based on the prefix used to represent a namespace. The two documents below have the exact same namespace qualification. The local names and namespaces are the same, but their QNames are different.
<foo xmlns="FOO">
<bar>Hello World</bar>
</foo>
And
<ns:foo xmlns:ns="FOO">
<ns:bar>Hello World</ns:bar>
</ns:foo>
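To see this in practice, here is a small sketch that parses both spellings of the document with a namespace-aware parser and records only the namespace URI and local name of each element. The class name ExpandedNameDemo, the expandedNames helper and the {uri}localName notation are choices made for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original answer.
class ExpandedNameDemo {
    // Returns "{namespaceURI}localName" for every element, in document order.
    static List<String> expandedNames(String xml) {
        List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                // qName is deliberately ignored: it depends on the prefix used.
                names.add("{" + uri + "}" + localName);
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // without this, uri and localName are not reliable
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String defaultNs = "<foo xmlns=\"FOO\"><bar>Hello World</bar></foo>";
        String prefixed = "<ns:foo xmlns:ns=\"FOO\"><ns:bar>Hello World</ns:bar></ns:foo>";
        // Both documents yield the same list of expanded names.
        System.out.println(expandedNames(defaultNs));
        System.out.println(expandedNames(prefixed));
    }
}
```

Matching on the pair (uri, localName) therefore treats the two documents as equivalent, which a comparison on qName would not.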

XML response how to assign values to variables

I get an XML response to an HTTP request. I store it in a string variable:
String str = in.readLine();
And the contents of str is:
<response>
<lastUpdate>2012-04-26 21:29:18</lastUpdate>
<state>tx</state>
<population>
<li>
<timeWindow>DAYS7</timeWindow>
<confidenceInterval>
<high>15</high>
<low>0</low>
</confidenceInterval>
<size>0</size>
</li>
</population>
</response>
I want to assign the values tx and DAYS7 to variables. How do I do that?
Thanks
Slightly modified code from http://www.mkyong.com/java/how-to-read-xml-file-in-java-sax-parser/
public class ReadXMLFile {
// Your variables
static String state;
static String timeWindow;
public static void main(String argv[]) {
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
// Http Response you get
String httpResponse = "<response><lastUpdate>2012-04-26 21:29:18</lastUpdate><state>tx</state><population><li><timeWindow>DAYS7</timeWindow><confidenceInterval><high>15</high><low>0</low></confidenceInterval><size>0</size></li></population></response>";
DefaultHandler handler = new DefaultHandler() {
boolean bstate = false;
boolean tw = false;
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
if (qName.equalsIgnoreCase("STATE")) {
bstate = true;
}
if (qName.equalsIgnoreCase("TIMEWINDOW")) {
tw = true;
}
}
public void characters(char ch[], int start, int length) throws SAXException {
if (bstate) {
state = new String(ch, start, length);
bstate = false;
}
if (tw) {
timeWindow = new String(ch, start, length);
tw = false;
}
}
};
saxParser.parse(new InputSource(new ByteArrayInputStream(httpResponse.getBytes("utf-8"))), handler);
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("State is " + state);
System.out.println("Time windows is " + timeWindow);
}
}
If you're running this as part of some larger process, you might want to make ReadXMLFile extend DefaultHandler.
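A minimal sketch of that suggestion, assuming a separate handler class with getters instead of static fields. ResponseHandler, getState, getTimeWindow and the parse helper are names invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that keeps the parsed values as instance state.
class ResponseHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private String state;
    private String timeWindow;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("state")) {
            state = text.toString().trim();
        } else if (qName.equals("timeWindow")) {
            timeWindow = text.toString().trim();
        }
        text.setLength(0);
    }

    String getState() {
        return state;
    }

    String getTimeWindow() {
        return timeWindow;
    }

    static ResponseHandler parse(String xml) {
        try {
            ResponseHandler handler = new ResponseHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        ResponseHandler result = parse("<response><state>tx</state><population><li>"
                + "<timeWindow>DAYS7</timeWindow></li></population></response>");
        System.out.println("State is " + result.getState());
        System.out.println("Time window is " + result.getTimeWindow());
    }
}
```

Because each parse gets its own handler instance, concurrent requests do not overwrite each other's values the way shared static fields could.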

Reading nested tags with sax parser

I am trying to read an XML file with the following tags, but the SAX parser is unable to read nested tags like:
<active-prod-ownership>
<ActiveProdOwnership>
<Product code="3N3" component="TRI_SCORE" orderNumber="1-77305469" />
</ActiveProdOwnership>
</active-prod-ownership>
Here is the code I am using:
public class LoginConsumerResponseParser extends DefaultHandler {
// ===========================================================
// Fields
// ===========================================================
static String str="default";
private boolean in_errorCode=false;
private boolean in_Ack=false;
private boolean in_activeProdOwnership= false;
private boolean in_consumerId= false;
private boolean in_consumerAccToken=false;
public void startDocument() throws SAXException {
Log.e("i am ","in start document");
}
public void endDocument() throws SAXException {
// Nothing to do
Log.e("doc read", " ends here");
}
/** Gets called on opening tags like:
* <tag>
* Can provide attribute(s), when xml was like:
* <tag attribute="attributeValue">*/
public void startElement(String namespaceURI, String localName,
String qName, Attributes atts) throws SAXException {
if(localName.equals("ack")){
in_Ack=true;
}
if(localName.equals("error-code")){
in_errorCode=true;
}
if(localName.equals("active-prod-ownership")){
Log.e("in", "active product ownership");
in_activeProdOwnership=true;
}
if(localName.equals("consumer-id")){
in_consumerId= true;
}
if(localName.equals("consumer-access-token"))
{
in_consumerAccToken= true;
}
}
/** Gets called on closing tags like:
* </tag> */
public void endElement(String namespaceURI, String localName, String qName)
throws SAXException {
if(localName.equals("ack")){
in_Ack=false;
}
if(localName.equals("error-code")){
in_errorCode=false;
}
if(localName.equals("active-prod-ownership")){
in_activeProdOwnership=false;
}
if(localName.equals("consumer-id")){
in_consumerId= false;
}
if(localName.equals("consumer-access-token"))
{
in_consumerAccToken= false;
}
}
/** Gets called on the following structure:
* <tag>characters</tag> */
public void characters(char ch[], int start, int length) {
if(in_Ack){
str= new String(ch,start,length);
}
if(str.equalsIgnoreCase("success")){
if(in_consumerId){
}
if(in_consumerAccToken){
}
if(in_activeProdOwnership){
str= new String(ch,start,length);
Log.e("active prod",str);
}
}
}
}
But on reaching the active-prod-ownership tag, it reads only "<" as the contents of the tag.
Please help, I need the whole data to be read.
The tags in your XML file and the ones your parser checks for do not match. I think you are mixing up tags with attribute names. Here is code that correctly parses your sample XML:
public class LoginConsumerResponseParser extends DefaultHandler {
public void startDocument() throws SAXException {
System.out.println("startDocument()");
}
public void endDocument() throws SAXException {
System.out.println("endDocument()");
}
public void startElement(String namespaceURI, String localName,
String qName, Attributes attrs)
throws SAXException {
if (qName.equals("ActiveProdOwnership")) {
inActiveProdOwnership = true;
} else if (qName.equals("Product")) {
if (!inActiveProdOwnership) {
throw new SAXException("Product tag not expected here.");
}
int length = attrs.getLength();
for (int i=0; i<length; i++) {
String name = attrs.getQName(i);
System.out.print(name + ": ");
String value = attrs.getValue(i);
System.out.println(value);
}
}
}
public void endElement(String namespaceURI, String localName, String qName)
throws SAXException {
// Match on qName, as in startElement: localName is empty when the parser is not namespace aware.
if (qName.equals("ActiveProdOwnership"))
inActiveProdOwnership = false;
}
public void characters(char ch[], int start, int length) {
}
public static void main(String args[]) throws Exception {
String xmlFile = args[0];
File file = new File(xmlFile);
if (file.exists()) {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser();
DefaultHandler handler = new LoginConsumerResponseParser();
parser.parse(xmlFile, handler);
}
else {
System.out.println("File not found!");
}
}
private boolean inActiveProdOwnership = false;
}
A sample run will produce the following output:
startDocument()
code: 3N3
component: TRI_SCORE
orderNumber: 1-77305469
endDocument()
I suspect this is what's going wrong:
new String(ch,start,length);
The constructor call itself is valid, since String has a (char[], int, int) constructor. However, the parser is free to deliver an element's text in several characters() calls, so any single call may hold only a fragment of the content.
I suggest instead that you make the str field a StringBuilder, not a String, and then use this:
builder.append(ch,start,length);
You then need to clear the StringBuilder each time startElement() is called.
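Here is a rough sketch of that StringBuilder approach. To keep it self-contained, the parser is simulated by calling the handler methods directly and delivering the text in two chunks; TextAccumulator, getLastText and simulateSplitDelivery are hypothetical names.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that accumulates element text across characters() calls.
class TextAccumulator extends DefaultHandler {
    private final StringBuilder builder = new StringBuilder();
    private String lastText = "";

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        builder.setLength(0); // clear the buffer for the new element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        builder.append(ch, start, length); // append, never overwrite
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        lastText = builder.toString(); // the complete text is known only here
    }

    String getLastText() {
        return lastText;
    }

    // Simulates a parser that splits the text "success" into two chunks.
    static String simulateSplitDelivery() {
        TextAccumulator handler = new TextAccumulator();
        char[] buffer = "success".toCharArray();
        handler.startElement("", "", "ack", new AttributesImpl());
        handler.characters(buffer, 0, 3); // "suc"
        handler.characters(buffer, 3, 4); // "cess"
        handler.endElement("", "", "ack");
        return handler.getLastText();
    }

    public static void main(String[] args) {
        System.out.println(simulateSplitDelivery());
    }
}
```

Note that clearing the buffer in startElement discards any text of the parent element seen so far, which is fine for leaf elements such as ack but would need adjusting for mixed content.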
