Java Replace words within xml

Java Replace words within xml - java

I have the following xml
<some tag>
<some_nested_tag attr="Hello"> Text </some_nested_tag>
Hello world Hello Programming
</some tag>
From the above xml, I want to replace the occurances of the word "Hello" which are part of the tag content but not part of tag attribute.
I want the following output (Replacing Hello by HI):
<some tag>
<some_nested_tag attr="Hello"> Text </some_nested_tag>
HI world HI Programming
</some tag>
I tried java regex and also some of the DOM parser tutorials, but without any luck. I am posting here for help as I have limited time available to fix this in my project. Help would be appreciated.

That can be done by using a negative lookbehind.
Try this regex:
(?<!attr=")Hello
It will match Hello that is not preceded by attr=.
So you could try this:
str = str.replaceAll("(?<!attr=")Hello", "Hi");
It can also be done by negative lookahead:
Hello(?!([^<]+)?>)

string.replaceAll("(?i)\\shello\\s", " HI ");
Regex Explanation:
\sHello\s
Options: Case insensitive
Match a single character that is a “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s»
Match the character string “Hello” literally (case insensitive) «Hello»
Match a single character that is a “whitespace character” (ASCII space, tab, line feed, carriage return, vertical tab, form feed) «\s»
hi
Insert the character string “ HI ” literally « HI »
Regex101 Demo

XSLT is a language for transforming XML documents into other XML documents. You can match all the text nodes containing 'Hello' and replace the content of those particular nodes.
A small example of using XSLT in Java:
import javax.xml.transform.*;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.File;
import java.io.IOException;
import java.net.URISyntaxException;
public class TestMain {
public static void main(String[] args) throws IOException, URISyntaxException, TransformerException {
TransformerFactory factory = TransformerFactory.newInstance();
Source xslt = new StreamSource(new File("transform.xslt"));
Transformer transformer = factory.newTransformer(xslt);
Source text = new StreamSource(new File("input.xml"));
transformer.transform(text, new StreamResult(new File("output.xml")));
}
}
There was a good question on replacing string using XSLT - you can find an example of XSLT template there:
XSLT string replace

Here is a fully functional example using SAX parser. It is adapted to your case with minimal changes from this example
The actual replacement takes place in MyCopyHandler#endElement() and MyCopyHandler#startElement() and the XML element text content is collected in MyCopyHandler#characters(). Note the buffer maintenance too - it is important in handling mixed element content (text and child elements)
I know XSLT solution is also possible, but it is not that portable.
public class XMLReplace {
/**
* #param args
* #throws SAXException
* #throws ParserConfigurationException
*/
public static void main(String[] args) throws Exception {
final String str = "<root> Hello <nested attr='Hello'> Text </nested> Hello world Hello Programming </root>";
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser parser = spf.newSAXParser();
XMLReader reader = parser.getXMLReader();
reader.setErrorHandler(new MyErrorHandler());
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PrintWriter out = new PrintWriter(baos);
MyCopyHandler duper = new MyCopyHandler(out);
reader.setContentHandler(duper);
InputSource is = new InputSource(new StringReader(str));
reader.parse(is);
out.close();
System.out.println(baos);
}
}
class MyCopyHandler implements ContentHandler {
private boolean namespaceBegin = false;
private String currentNamespace;
private String currentNamespaceUri;
private Locator locator;
private final PrintWriter out;
private final StringBuilder buffer = new StringBuilder();
public MyCopyHandler(PrintWriter out) {
this.out = out;
}
public void setDocumentLocator(Locator locator) {
this.locator = locator;
}
public void startDocument() {
}
public void endDocument() {
}
public void startPrefixMapping(String prefix, String uri) {
namespaceBegin = true;
currentNamespace = prefix;
currentNamespaceUri = uri;
}
public void endPrefixMapping(String prefix) {
}
public void startElement(String namespaceURI, String localName, String qName, Attributes atts) {
// Flush buffer - needed in case of mixed content (text + elements)
out.print(buffer.toString().replaceAll("Hello", "HI"));
// Prepare to collect element text content
this.buffer.setLength(0);
out.print("<" + qName);
if (namespaceBegin) {
out.print(" xmlns:" + currentNamespace + "=\"" + currentNamespaceUri + "\"");
namespaceBegin = false;
}
for (int i = 0; i < atts.getLength(); i++) {
out.print(" " + atts.getQName(i) + "=\"" + atts.getValue(i) + "\"");
}
out.print(">");
}
public void endElement(String namespaceURI, String localName, String qName) {
// Process text content
out.print(buffer.toString().replaceAll("Hello", "HI"));
out.print("</" + qName + ">");
// Reset buffer
buffer.setLength(0);
}
public void characters(char[] ch, int start, int length) {
// Store chunk of text - parser is allowed to provide text content in chunks for performance reasons
buffer.append(Arrays.copyOfRange(ch, start, start + length));
}
public void ignorableWhitespace(char[] ch, int start, int length) {
for (int i = start; i < start + length; i++)
out.print(ch[i]);
}
public void processingInstruction(String target, String data) {
out.print("<?" + target + " " + data + "?>");
}
public void skippedEntity(String name) {
out.print("&" + name + ";");
}
}
class MyErrorHandler implements ErrorHandler {
public void warning(SAXParseException e) throws SAXException {
show("Warning", e);
throw (e);
}
public void error(SAXParseException e) throws SAXException {
show("Error", e);
throw (e);
}
public void fatalError(SAXParseException e) throws SAXException {
show("Fatal Error", e);
throw (e);
}
private void show(String type, SAXParseException e) {
System.out.println(type + ": " + e.getMessage());
System.out.println("Line " + e.getLineNumber() + " Column " + e.getColumnNumber());
System.out.println("System ID: " + e.getSystemId());
}
}

Related

How to read XML declaration with Java SAX

I want to read the XML declaration from an XML file with Java SAX. For example
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
I tried using DefaultHandler, but characters and startElement don't get called for the XML declaration. This is my code:
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class SAXStuff {
public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
SAXParser sp = SAXParserFactory.newInstance().newSAXParser();
sp.parse("test.xml", new DefaultHandler() {
public void characters(char[] ch, int start, int length) throws SAXException {
for(int i = start; i < start + length; i++) {
System.out.print(ch[i]);
}
}
public void startElement(String uri, String localName, String qName, Attributes attributes)
throws SAXException {
System.out.println(qName);
}
});
}
}
How can I get the XML declaration using SAX in Java?

Since Java 14, org.xml.sax.ContentHandler has a declaration method for this purpose. DefaultHandler implements ContentHandler, so this method can be overriden to provide a custom action.
This is the method signature:
void declaration(String version, String encoding, String standalone) throws SAXException
version - the version string as in the input document, null if not specified
encoding - the encoding string as in the input document, null if not specified
standalone - the standalone string as in the input document, null if not specified
Example:
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser parser = factory.newSAXParser();
DefaultHandler handler = new DefaultHandler(){
#Override
public void declaration(String version, String encoding, String standalone) {
String declaration = "<?xml "
+ (version != null ? "version=\"" + version + "\"": "")
+ (encoding != null ? " encoding=\"" + encoding + "\"": "")
+ (standalone != null ? " standalone=\"" + standalone + "\"": "")
+ "?>";
System.out.println(declaration);
}
};
parser.parse(new File("file.xml"), handler);

I am trying to use a url xml parse but it looks like i keep getting an empty xml

I am trying to get info from a weather API called even though when i am making the request i am getting a response, but when i am trying to get only a specific part of the response i get null response every time can someone help? here is the code for my handler :
package weathercalls;
import java.util.ArrayList;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;
public class Handler extends DefaultHandler
{
// Create three array lists to store the data
public ArrayList<Integer> lows = new ArrayList<Integer>();
public ArrayList<Integer> highs = new ArrayList<Integer>();
public ArrayList<String> regions = new ArrayList<String>();
// Make sure that the code in DefaultHandler's
// constructor is called:
public Handler()
{
super();
}
/*** Below are the three methods that we are extending ***/
#Override
public void startDocument()
{
System.out.println("Start document");
}
#Override
public void endDocument()
{
System.out.println("End document");
}
// This is where all the work is happening:
#Override
public void startElement(String uri, String name, String qName, Attributes atts)
{
if(qName.compareTo("region") == 0)
{
String region = atts.getLocalName(0);
System.out.println("Day: " + region);
this.regions.add(region);
}
if(qName.compareToIgnoreCase("wind_degree") == 0)
{
int low = atts.getLength();
System.out.println("Low: " + low);
this.lows.add(low);
}
if(qName.compareToIgnoreCase("high") == 0)
{
int high = Integer.parseInt(atts.getValue(0));
System.out.println("High: " + high);
this.highs.add(high);
}
}
}
and here is my main file code :
package weathercalls;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
public class weatherCalls {
public static void main(String[] args) throws Exception {
//Main url
String main_url = "http://api.weatherapi.com/v1/";
//Live or Weekly forecast
String live_weather = "current.xml?key=";
//String sevendays_weather = "orecast.xml?key=";
//API Key + q
String API_Key = "c2e285e55db74def97f151114201701&q=";
//Location Setters
String location = "London";
InputSource inSource = null;
InputStream in = null;
XMLReader xr = null;
/**
URL weather = new URL(main_url + live_weather + API_Key + location);
URLConnection yc = weather.openConnection();
BufferedReader in1 = new BufferedReader(
new InputStreamReader(
yc.getInputStream()));
String inputLine;
while ((inputLine = in1.readLine()) != null)
System.out.println(inputLine);
in1.close();**/
try
{
// Turn the string into a URL object
String complete_url = main_url + live_weather + API_Key + location;
URL urlObject = new URL(complete_url);
// Open the stream (which returns an InputStream):
in = urlObject.openStream();
/** Now parse the data (the stream) that we received back ***/
// Create an XML reader
SAXParserFactory parserFactory = SAXParserFactory.newInstance();
SAXParser parser = parserFactory.newSAXParser();
xr = parser.getXMLReader();
// Tell that XML reader to use our special Google Handler
Handler ourSpecialHandler = new Handler();
xr.setContentHandler(ourSpecialHandler);
// We have an InputStream, but let's just wrap it in
// an InputSource (the SAX parser likes it that way)
inSource = new InputSource(in);
// And parse it!
xr.parse(inSource);
System.out.println(complete_url);
System.out.println(urlObject);
System.out.println(in);
System.out.println(xr);
System.out.println(inSource);
System.out.println(parser);
}
catch(IOException ioe)
{
ioe.printStackTrace();
}
catch(SAXException se)
{
se.printStackTrace();
}
}
}
and this is my console print:
Start document
Day: null
Low: 0
End document
http://api.weatherapi.com/v1/current.xml?key=c2e285e55db74def97f151114201701&q=London
http://api.weatherapi.com/v1/current.xml?key=c2e285e55db74def97f151114201701&q=London
sun.net.www.protocol.http.HttpURLConnection$HttpInputStream#2471cca7
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser#5fe5c6f
org.xml.sax.InputSource#6979e8cb
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl#763d9750

I think you are trying to extract the values from the XML tags and if it is the case then you are doing it wrong. Attributes object contains the attributes of a particular tag and to get the value you have to do some extra work. Similar to the start of a tag, there are separate events for the contents and the end of a tag. current_tag variable will keep track of the current tag being processed. Below is a sample code:
class Handler extends DefaultHandler {
// Create three array lists to store the data
public ArrayList<Integer> lows = new ArrayList<Integer>();
public ArrayList<Integer> highs = new ArrayList<Integer>();
public ArrayList<String> regions = new ArrayList<String>();
// Make sure that the code in DefaultHandler's
// constructor is called:
public Handler() {
super();
}
/*** Below are the three methods that we are extending ***/
#Override
public void startDocument() {
System.out.println("Start document");
}
#Override
public void endDocument() {
System.out.println("End document");
}
//Keeps track of the current tag;
String currentTag = "";
// This is where all the work is happening:
#Override
public void startElement(String uri, String name, String qName, Attributes atts) {
//Save the current tag being handled
currentTag = qName;
}
//Detect end tag
#Override
public void endElement(String uri, String localName, String qName) throws SAXException {
//Reset it
currentTag = "";
}
#Override
public void characters(char[] ch, int start, int length) throws SAXException {
//Rules based on current tag
switch (currentTag) {
case "region":
String region = String.valueOf(ch, start, length);
this.regions.add(region);
System.out.println("Day: " + region);
break;
case "wind_degree":
int low = Integer.parseInt(String.valueOf(ch, start, length));
System.out.println("Low: " + low);
this.lows.add(low);
break;
case "high":
int high = Integer.parseInt(String.valueOf(ch, start, length));
System.out.println("High: " + high);
this.highs.add(high);
break;
}
}}
NOTE: Please refrain from sharing your API keys or passwords on the internet.

XML parse section by section in SAX or StAX

shorten version of my XML file look like this:
<?xml version="1.0" encoding="UTF-8"?>
<MzIdentML id="MS-GF+">
<SequenceCollection xmlns="http://psidev.info/psi/pi/mzIdentML/1.1">
<DBSequence length="146" id="DBSeq143">
<cvParam cvRef="PSI-MS" accession="MS:1001088"></cvParam>
</DBSequence>
<Peptide id="Pep7">
<PeptideSequence>MFLSFPTTK</PeptideSequence>
<Modification location="1" monoisotopicMassDelta="15.994915">
<cvParam cvRef="UNIMOD" accession="UNIMOD:35" name="Oxidation"></cvParam>
</Modification>
</Peptide>
<PeptideEvidence dBSequence_ref="DBSeq143" id="PepEv_160_1_18"></PeptideEvidence>
<PeptideEvidence dBSequence_ref="DBSeq143" id="PepEv_275_8_133"></PeptideEvidence>
</SequenceCollection>
</MzIdentML>
I want to get DBSequence, Peptide and PeptideEvidence details separately.but attributes of parent and children(or nested children..if there are).In other words, I want all the attribues as key-value pairs in each section I illustrated bellow:
----------------------------------------------------------------------
<DBSequence length="146" id="DBSeq143">
<cvParam cvRef="PSI-MS" accession="MS:1001088"></cvParam>
</DBSequence>
----------------------------------------------------------------------
<Peptide id="Pep7">
<PeptideSequence>MFLSFPTTK</PeptideSequence>
<Modification location="1" monoisotopicMassDelta="15.994915">
<cvParam cvRef="UNIMOD" accession="UNIMOD:35" name="Oxidation"></cvParam>
</Modification>
</Peptide>
----------------------------------------------------------------------
<PeptideEvidence dBSequence_ref="DBSeq143" id="PepEv_160_1_18"></PeptideEvidence>
<PeptideEvidence dBSequence_ref="DBSeq143" id="PepEv_275_8_133"></PeptideEvidence>
----------------------------------------------------------------------
For example, if we consider <DBSequence> section:
<DBSequence length="146" id="DBSeq143">
<cvParam cvRef="PSI-MS" accession="MS:1001088"></cvParam>
</DBSequence>
should be output as:
DBSequence=>length=146;id=DBSeq143;cvRef=PSI-MS;accession=MS:1001088;
This is the code I wrote in SAX:
package lucene.parse;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class MzIdentMLSAXParser extends DefaultHandler {
private boolean isDBsequence = false;
String DBSequenceSection;
String PeptideEvidenceDocument;
public static void main(String[] argv) throws SAXException, ParserConfigurationException, IOException {
MzIdentMLSAXParser ps = new MzIdentMLSAXParser("file_path_here/sample.xml");
}
public MzIdentMLSAXParser(String dataDir) throws FileNotFoundException, SAXException, ParserConfigurationException, IOException {
FileInputStream fis = new FileInputStream(dataDir);
SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser parser = spf.newSAXParser();
parser.parse(fis, this);
}
#Override
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException {
if (qName.equals("DBSequence")) {
// each time we found a new DBSequence, we re-initialize DBSequenceSection
DBSequenceSection = "";
// get attributes of DBSequence
for (int i = 0; i < atts.getLength(); i++) {
DBSequenceSection += atts.getQName(i) + "=" + atts.getValue(i) + ";";
}
isDBsequence = true;
} else if ((qName.equals("cvParam")) && (isDBsequence)) {
// get attributes of cvParam which are belongs to DBSequence
// there can be cvParam that are not belongs to DBSequence.
for (int i = 0; i < atts.getLength(); i++) {
DBSequenceSection += atts.getQName(i) + "=" + atts.getValue(i) + ";";
}
} else if (qName.equals("PeptideEvidence")) {
// each time we found a new PeptideEvidence, we re-initialize docuDBSequenceSectionment
PeptideEvidenceDocument = "";
for (int i = 0; i < atts.getLength(); i++) {
PeptideEvidenceDocument += atts.getQName(i) + "=" + atts.getValue(i) + ";";
}
}
}
#Override
public void endElement(String uri, String localName, String qName) throws SAXException {
if (qName.equals("DBSequence")) {
System.out.println(qName +"=>"+DBSequenceSection);
isDBsequence = false;
} else if (qName.equals("PeptideEvidence")) {
System.out.println(qName +"=>"+PeptideEvidenceDocument);
}
}
}
Is there any easy way of doing this? because I have lots of tags like this with nested nodes. Challenge here is <cvParam> appears not only in <DBSequence> tag, but in other tags like <Modification> etc. I tried with StAX too. but couldn't make it.

Here is a working example of using StAX. StAX excels when parsing known XML structures, but can be used for dynamic parsing too.
This code relies on knowledge, e.g. knowing that we want the content of DBSequence, Peptide, and PeptideEvidence, and that PeptideSequence has text content, while the others don't.
The methods use recursion to follow the structure of the XML.
public static void main(String[] args) throws Exception {
String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
"<MzIdentML id=\"MS-GF+\">\n" +
" <SequenceCollection xmlns=\"http://psidev.info/psi/pi/mzIdentML/1.1\">\n" +
" <DBSequence length=\"146\" id=\"DBSeq143\">\n" +
" <cvParam cvRef=\"PSI-MS\" accession=\"MS:1001088\"></cvParam>\n" +
" </DBSequence>\n" +
" <Peptide id=\"Pep7\">\n" +
" <PeptideSequence>MFLSFPTTK</PeptideSequence>\n" +
" <Modification location=\"1\" monoisotopicMassDelta=\"15.994915\">\n" +
" <cvParam cvRef=\"UNIMOD\" accession=\"UNIMOD:35\" name=\"Oxidation\"></cvParam>\n" +
" </Modification>\n" +
" </Peptide>\n" +
" <PeptideEvidence dBSequence_ref=\"DBSeq143\" id=\"PepEv_160_1_18\"></PeptideEvidence>\n" +
" <PeptideEvidence dBSequence_ref=\"DBSeq143\" id=\"PepEv_275_8_133\"></PeptideEvidence>\n" +
" </SequenceCollection>\n" +
"</MzIdentML>";
XMLStreamReader reader = XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
try {
reader.nextTag();
search(reader);
} finally {
reader.close();
}
}
private static void search(XMLStreamReader reader) throws XMLStreamException {
// reader must be on START_ELEMENT upon entry, and will be on matching END_ELEMENT on return
assert reader.getEventType() == XMLStreamConstants.START_ELEMENT;
while (reader.nextTag() == XMLStreamConstants.START_ELEMENT) {
String name = reader.getLocalName();
switch (name) {
case "DBSequence":
case "Peptide":
case "PeptideEvidence": {
Map<String, String> props = new LinkedHashMap<>();
collectProps(reader, props);
System.out.println(name + ": " + props);
break; }
default:
search(reader);
}
}
}
private static void collectProps(XMLStreamReader reader, Map<String, String> props) throws XMLStreamException {
// reader must be on START_ELEMENT upon entry, and will be on matching END_ELEMENT on return
assert reader.getEventType() == XMLStreamConstants.START_ELEMENT;
for (int i = 0; i < reader.getAttributeCount(); i++)
props.put(reader.getAttributeLocalName(i), reader.getAttributeValue(i));
String name = reader.getLocalName();
switch (name) {
case "PeptideSequence":
props.put(name, reader.getElementText());
break;
default:
while (reader.nextTag() == XMLStreamConstants.START_ELEMENT)
collectProps(reader, props);
}
}
OUTPUT
DBSequence: {length=146, id=DBSeq143, cvRef=PSI-MS, accession=MS:1001088}
Peptide: {id=Pep7, PeptideSequence=MFLSFPTTK, location=1, monoisotopicMassDelta=15.994915, cvRef=UNIMOD, accession=UNIMOD:35, name=Oxidation}
PeptideEvidence: {dBSequence_ref=DBSeq143, id=PepEv_160_1_18}
PeptideEvidence: {dBSequence_ref=DBSeq143, id=PepEv_275_8_133}

Getting Parent Child Hierarchy in Sax XML parser

I'm using SAX (Simple API for XML) to parse an XML document. I'm getting output for all the tags the file have, but i want it to show the tags in parent child hierarchy.
For Example:
This is my output
<dblp>
<www>
<author>
</author><title>
</title><url>
</url><year>
</year></www><inproceedings>
<month>
</month><pages>
</pages><booktitle>
</booktitle><note>
</note><cdrom>
</cdrom></inproceedings><article>
<journal>
</journal><volume>
</volume></article><ee>
</ee><book>
<publisher>
</publisher><isbn>
</isbn></book><incollection>
<crossref>
</crossref></incollection><editor>
</editor><series>
</series></dblp>
But i want it to display the output like this (it displays the children with extra spacing (that's how i want it to be))
<dblp>
<www>
<author>
</author>
<title>
</title>
<url>
</url>
<year>
</year>
</www>
<inproceedings>
<month>
</month>
<pages>
</pages>
<booktitle>
</booktitle>
<note>
</note>
<cdrom>
</cdrom>
</inproceedings>
<article>
<journal>
</journal>
<volume>
</volume>
</article>
<ee>
</ee>
<book>
<publisher>
</publisher>
<isbn>
</isbn>
</book>
<incollection>
<crossref>
</crossref>
</incollection>
<editor>
</editor>
<series>
</series>
</dblp>
But i can't figure out how can i detect that parser is parsing a parent tag or a children.
here is my code:
package com.teamincredibles.sax;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
public class Parser extends DefaultHandler {
public void getXml() {
try {
SAXParserFactory saxParserFactory = SAXParserFactory.newInstance();
SAXParser saxParser = saxParserFactory.newSAXParser();
final MySet openingTagList = new MySet();
final MySet closingTagList = new MySet();
DefaultHandler defaultHandler = new DefaultHandler() {
public void startDocument() throws SAXException {
System.out.println("Starting Parsing...\n");
}
public void endDocument() throws SAXException {
System.out.print("\n\nDone Parsing!");
}
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
if (!openingTagList.contains(qName)) {
openingTagList.add(qName);
System.out.print("<" + qName + ">\n");
}
}
public void characters(char ch[], int start, int length)
throws SAXException {
/*for(int i=start; i<(start+length);i++){
System.out.print(ch[i]);
}*/
}
public void endElement(String uri, String localName, String qName)
throws SAXException {
if (!closingTagList.contains(qName)) {
closingTagList.add(qName);
System.out.print("</" + qName + ">");
}
}
};
saxParser.parse("xml/sample.xml", defaultHandler);
} catch (Exception e) {
e.printStackTrace();
}
}
public static void main(String args[]) {
Parser readXml = new Parser();
readXml.getXml();
}
}

You can consider a StAX implementation:
package be.duo.stax;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
public class StaxExample {
public void getXml() {
InputStream is = null;
try {
is = new FileInputStream("c:\\dev\\sample.xml");
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
XMLStreamReader reader = inputFactory.createXMLStreamReader(is);
parse(reader, 0);
} catch(Exception ex) {
System.out.println(ex.getMessage());
} finally {
if(is != null) {
try {
is.close();
} catch(IOException ioe) {
System.out.println(ioe.getMessage());
}
}
}
}
private void parse(XMLStreamReader reader, int depth) throws XMLStreamException {
while(true) {
if(reader.hasNext()) {
switch(reader.next()) {
case XMLStreamConstants.START_ELEMENT:
writeBeginTag(reader.getLocalName(), depth);
parse(reader, depth+1);
break;
case XMLStreamConstants.END_ELEMENT:
writeEndTag(reader.getLocalName(), depth-1);
return;
}
}
}
}
private void writeBeginTag(String tag, int depth) {
for(int i = 0; i < depth; i++) {
System.out.print(" ");
}
System.out.println("<" + tag + ">");
}
private void writeEndTag(String tag, int depth) {
for(int i = 0; i < depth; i++) {
System.out.print(" ");
}
System.out.println("</" + tag + ">");
}
public static void main(String[] args) {
StaxExample app = new StaxExample();
app.getXml();
}
}
There is an idiom for StAX with a loop like this for every tag in the XML:
private MyTagObject parseMyTag(XMLStreamReader reader, String myTag) throws XMLStreamException {
MyTagObject myTagObject = new MyTagObject();
while (true) {
switch (reader.next()) {
case XMLStreamConstants.START_ELEMENT:
String localName = reader.getLocalName();
if(localName.equals("myOtherTag1")) {
myTagObject.setMyOtherTag1(parseMyOtherTag1(reader, localName));
} else if(localName.equals("myOtherTag2")) {
myTagObject.setMyOtherTag2(parseMyOtherTag2(reader, localName));
}
// and so on
break;
case XMLStreamConstants.END_ELEMENT:
if(reader.getLocalName().equals(myTag) {
return myTagObject;
}
break;
}
}

well what have you tried? you should use a transformer found here: How to pretty print XML from Java?
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
//initialize StreamResult with File object to save to file
StreamResult result = new StreamResult(new StringWriter());
DOMSource source = new DOMSource(doc);
transformer.transform(source, result);
String xmlString = result.getWriter().toString();
System.out.println(xmlString);

Almost any useful SAX application needs to maintain a stack. When startElement is called, you push information to the stack, when endElement is called, you pop the stack. Exactly what you put on the stack depends on the application; it's often the element name. For your application, you don't actually need a full stack, you only need to know its depth. You could get by with maintaining this using depth++ in startElement and depth-- in endElement(). Then you just output depth spaces before the element name.

Parsing and updating xml using SAX parser in java

I have an xml file with similar tags ->
<properties>
<definition>
<name>IP</name>
<description></description>
<defaultValue>10.1.1.1</defaultValue>
</definition>
<definition>
<name>Name</name>
<description></description>
<defaultValue>MyName</defaultValue>
</definition>
<definition>
<name>Environment</name>
<description></description>
<defaultValue>Production</defaultValue>
</definition>
</properties>
I want to update the default value of the definition with name : Environment.
Is it possible to do that using SAX parser?
Can you please point me to proper documentation?
So far I have parsed the document but when I update defaultValue, it updates all defaultValues. I dont know how to parse the exact default value tag.

Anything is possible with SAX, it's just waaaaay harder than it has to be. It's pretty old school and there are many easier ways to do this (JAXB, XQuery, XPath, DOM etc ).
That said lets do it with SAX.
It sounds like the problem you are having is that you are not tracking the state of your progress through the document. SAX simply works by making the callbacks when it stumbles across an event within the document
This is a fairly crude way of parsing the doc and updating the relevant node using SAX. Basically I am checking when we hit a element with the value you want to update (Environment) and setting a flag so that when we get to the contents of the defaultValue node, the characters callback lets me remove the existing value and replace it with the new value.
import java.io.StringReader;
import java.util.Arrays;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
public class Q26897496 extends DefaultHandler {
public static String xmlDoc = "<?xml version='1.0'?>"
+ "<properties>"
+ " <definition>"
+ " <name>IP</name>"
+ " <description></description>"
+ " <defaultValue>10.1.1.1</defaultValue>"
+ " </definition>"
+ " <definition>"
+ " <name>Name</name>"
+ " <description></description>"
+ " <defaultValue>MyName</defaultValue>"
+ " </definition>"
+ " <definition>"
+ " <name>Environment</name>"
+ " <description></description>"
+ " <defaultValue>Production</defaultValue>"
+ " </definition>"
+ "</properties>";
String elementName;
boolean mark = false;
char[] updatedDoc;
public static void main(String[] args) {
Q26897496 q = new Q26897496();
try {
q.parse();
} catch (Exception e) {
e.printStackTrace();
}
}
public Q26897496() {
}
public void parse() throws Exception {
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setNamespaceAware(true);
SAXParser saxParser = spf.newSAXParser();
XMLReader xml = saxParser.getXMLReader();
xml.setContentHandler(this);
xml.parse(new InputSource(new StringReader(xmlDoc)));
System.out.println("new xml: \n" + new String(updatedDoc));
}
#Override
public void startDocument() throws SAXException {
System.out.println("starting");
}
#Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
this.elementName = localName;
}
#Override
public void characters(char[] ch, int start, int length)
throws SAXException {
String value = new String(ch).substring(start, start + length);
if (elementName.equals("name")) {
if (value.equals("Environment")) {
this.mark = true;
}
}
if (elementName.equals("defaultValue") && mark == true) {
// update
String tmpDoc = new String(ch);
String leading = tmpDoc.substring(0, start);
String trailing = tmpDoc.substring(start + length, tmpDoc.length());
this.updatedDoc = (leading + "NewValueForDefaulValue" + trailing).toCharArray();
mark = false;
}
}
}

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Java Replace words within xml - java

That can be done by using a negative lookbehind. Try this regex: (?<!attr=")Hello It will match Hello that is not preceded by attr=. So you could try this: str = str.replaceAll("(?<!attr=")Hello", "Hi"); It can also be done by negative lookahead: Hello(?!([^<]+)?>)

Related

How to read XML declaration with Java SAX

I am trying to use a url xml parse but it looks like i keep getting an empty xml

XML parse section by section in SAX or StAX

Getting Parent Child Hierarchy in Sax XML parser

Parsing and updating xml using SAX parser in java

Categories

Resources