Parsing dblp.xml with java DOM/SAX

I am trying to parse dblp.xml in Java to get the author names, titles, years, etc., but since the file is huge (860 MB), I cannot use DOM/SAX on the complete file.
So I split the file into multiple smaller files of around 100 MB each.
Now each file contains thousands of nodes like this:
<dblp>
<inproceedings mdate="2011-06-23" key="conf/aime/BianchiD95">
<author>Nadia Bianchi</author>
<author>Claudia Diamantini</author>
<title>Integration of Neural Networks and Rule Based Systems in the Interpretation of Liver Biopsy Images.</title>
<pages>367-378</pages>
<year>1995</year>
<crossref>conf/aime/1995</crossref>
<booktitle>AIME</booktitle>
<url>db/conf/aime/aime1995.html#BianchiD95</url>
<ee>http://dx.doi.org/10.1007/3-540-60025-6_152</ee>
</inproceedings>
</dblp>
I assumed 100 MB would be readable with DOM, but the code stops after roughly 45k lines. Here is the Java code I am using:
@SuppressWarnings({"unchecked", "null"})
public List<dblpModel> readConfigDOM(String configFile) {
List<dblpModel> items = new ArrayList<dblpModel>();
List<String> strList = null;
dblpModel item = null;
try {
File fXmlFile = new File(configFile);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
NodeList nList = doc.getElementsByTagName("incollection");
for (int temp = 0; temp < nList.getLength(); temp++) {
item = new dblpModel();
strList = new ArrayList<String>();
Node nNode = nList.item(temp);
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
strList = getTagValueString("title", eElement);
System.out.println(strList.get(0).toString());
strList = getTagValueString("author", eElement);
System.out.println("Author : " + strList.size());
for(String s: strList) {
System.out.println(s);
}
}
items.add(item);
}
} catch (Exception e) {
e.printStackTrace();
}
return items;
}
private static String getTagValueString(String sTag, Element eElement) {
String temp = "";
StringBuffer concatTestSb = new StringBuffer();
List<String> strList = new ArrayList<String>();
int len = eElement.getElementsByTagName(sTag).getLength();
try {
for (int i = 0; i < len; i++) {
NodeList nl = eElement.getElementsByTagName(sTag).item(i).getChildNodes();
if (nl.getLength() > 1) {
for (int j = 0; j < nl.getLength(); j++) {
concatTestSb.append(nl.item(j).getTextContent());
}
} else {
temp = nl.item(0).getNodeValue();
concatTestSb.append(temp);
if (len > 1) {
concatTestSb.append("*");
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
return concatTestSb.toString();
}
Any help? I have also tried using the StAX API for parsing large documents, but that also

If your goal is just to get the details out, then use a BufferedReader to read the file as a text file. If you want, throw in some regex.
If using MySQL is an option, you may be able to get it to do the heavy lifting through its XML functions.
Hope this helps.
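A minimal sketch of the line-by-line approach described above. It assumes that every field of interest sits on a single line, as in the sample from the question; the class name DblpLineScan and the choice of the tags author, title and year are illustrative only. It does not decode XML entities, so it is a rough extraction rather than real XML parsing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DblpLineScan {
    // Matches one simple element that sits entirely on one line, e.g. <author>Name</author>
    private static final Pattern FIELD =
            Pattern.compile("<(author|title|year)>(.*?)</\\1>");

    /** Returns "tag=value" strings in the order they appear in the input. */
    public static List<String> scan(BufferedReader reader) {
        List<String> fields = new ArrayList<>();
        try {
            String line;
            // Only one line is held in memory at a time, so the file size does not matter
            while ((line = reader.readLine()) != null) {
                Matcher m = FIELD.matcher(line);
                while (m.find()) {
                    fields.add(m.group(1) + "=" + m.group(2));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "<inproceedings mdate=\"2011-06-23\" key=\"conf/aime/BianchiD95\">\n"
                + "<author>Nadia Bianchi</author>\n"
                + "<author>Claudia Diamantini</author>\n"
                + "<year>1995</year>\n"
                + "</inproceedings>\n";
        // For the real file, wrap a FileReader instead of a StringReader
        System.out.println(scan(new BufferedReader(new StringReader(sample))));
    }
}
```

To group the fields per publication, you could start a new record whenever a line begins with an opening tag such as <inproceedings and close it on the matching end tag.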

Don't fuss too much about the XML format. It is not terribly useful anyway. Just read it as a text file and parse the lines as strings. You can then export the data to CSV and use it the way you want from that point.
Unfortunately, XML is not very efficient for large documents. I did something similar here for a research project:
http://qualityofdata.com/2011/03/27/dblp-for-sql-server/

Related

Reading multiple xml files java

I have ~25000 XML files I need to read in Java. This is my code:
private static void ProcessFile() {
try {
File fXmlFile = new File("C:/Users/Emolk/Desktop/000010.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("sindex");
System.out.println("----------------------------");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
System.out.println("");
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
System.out.println("Name : " + eElement.getElementsByTagName("name").item(0).getTextContent());
System.out.println("Count : " + eElement.getElementsByTagName("count").item(0).getTextContent());
Entity CE = new Entity(eElement.getElementsByTagName("name").item(0).getTextContent(), Integer.parseInt(eElement.getElementsByTagName("count").item(0).getTextContent()));
Entities.add(CE);
System.out.println("Entity added! ");
}
}
System.out.println(Entities);
} catch (Exception e) {
e.printStackTrace();
}
}
How do I read 25000 files instead of just the one?
I tried joining all the XML files together using this: https://www.sobolsoft.com/howtouse/combine-xml-files.htm
But that gave me this error:
[Fatal Error] joined.xml:130:2: The markup in the document following the root element must be well-formed.

If performance is not a concern, then you can do something like,
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class ReadFiles {
// Entity is the class from the question; it is assumed to be on the classpath
private static final List<Entity> Entities = new ArrayList<>();
public static void main(String[] args) {
File dir = new File("D:/Work"); // Directory that contains your XML files
File[] files = dir.listFiles(); // null if the directory does not exist
if (files == null) {
return;
}
for (File file : files) {
if (file.isFile() && file.getName().endsWith(".xml")) { // Only process .xml files
ProcessFile(file, Entities);
}
}
System.out.println(Entities);
}
private static void ProcessFile(File fXmlFile, List<Entity> Entities) {
try {
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("sindex");
System.out.println("----------------------------");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
System.out.println("");
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
System.out.println("Name : " + eElement.getElementsByTagName("name").item(0).getTextContent());
System.out.println("Count : " + eElement.getElementsByTagName("count").item(0).getTextContent());
Entity CE = new Entity(eElement.getElementsByTagName("name").item(0).getTextContent(), Integer.parseInt(eElement.getElementsByTagName("count").item(0).getTextContent()));
Entities.add(CE);
System.out.println("Entity added! ");
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

To read multiple files you should use a loop of some kind. For example, you can scan a directory for all valid files:
File folder = new File("path/to/directory");
File[] files = folder.listFiles();
for (int i = 0; i < files.length; i++) {
// you can also filter for .xml if needed
if (files[i].isFile()) {
// parse the file
}
}
Next, you need to decide how you want to parse your files: sequentially or in parallel.
Parallel parsing can be a lot faster, since multiple threads parse files at the same time.
One Thread:
You can reuse the code that you already wrote and loop over the files:
for (File file : files) {
processFile(file, yourListOfEntities);
}
Multiple Threads:
Acquire an ExecutorService and submit one task per file.
ExecutorService service = Executors.newFixedThreadPool(5);
for (File file : files) {
service.execute(() -> processFile(file, yourListOfEntities));
}
An important note here: ArrayList is not thread safe, so since the list is used by multiple threads you should synchronize access to it:
List<Entity> synchronizedList = Collections.synchronizedList(yourListOfEntities);
Also, DocumentBuilder is not thread safe and should not be shared between threads (you already get this right if you just call your method, which creates a new one each time). This note only matters if you think about optimizing it.
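The multi-threaded variant can be sketched as below. This is one possible arrangement, not the only one: instead of sharing a synchronized list, each task returns its own result and the results are collected through invokeAll, which sidesteps the synchronization concern. The class name ParallelXmlReader and the choice to collect the text of every <name> element are assumptions made for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class ParallelXmlReader {
    /** Parses one file with its own DocumentBuilder and returns the text of every <name> element. */
    static List<String> readNames(File file) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(file);
        NodeList names = doc.getElementsByTagName("name");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < names.getLength(); i++) {
            result.add(names.item(i).getTextContent());
        }
        return result;
    }

    /** Parses all files on a fixed-size pool; results keep the order of the input list. */
    public static List<String> readAll(List<File> files, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (File f : files) {
                tasks.add(() -> readNames(f));
            }
            List<String> all = new ArrayList<>();
            // invokeAll blocks until every task has finished
            for (Future<List<String>> future : pool.invokeAll(tasks)) {
                all.addAll(future.get()); // a parse failure surfaces here as ExecutionException
            }
            return all;
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        } finally {
            pool.shutdown(); // without this the pool threads keep the JVM alive
        }
    }

    public static void main(String[] args) {
        File[] found = new File(args.length > 0 ? args[0] : ".").listFiles();
        List<File> files = new ArrayList<>();
        if (found != null) {
            for (File f : found) {
                if (f.isFile() && f.getName().endsWith(".xml")) {
                    files.add(f);
                }
            }
        }
        System.out.println(readAll(files, 5));
    }
}
```

Whether this is actually faster than the single-threaded loop depends on disk speed and file size, so it is worth measuring both.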

Not able to parse duplicate xml tag values using DocumentBuilder in Java

I'm able to parse the XML object if it has a single unique inner tag. But the problem comes when I have two duplicate tags in a parent tag. How can I get both tag values? I'm getting the response as XML string.
Here is my code
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(responseXML));
if (is != null) {
Document doc = db.parse(is);
String errorCode = "";
NodeList errorDetails = doc.getElementsByTagName("ERROR-LIST");
if (errorDetails != null) {
int length = errorDetails.getLength();
if (length > 0) {
for (int i = 0; i < length; i++) {
if (errorDetails.item(i).getNodeType() == Node.ELEMENT_NODE) {
Element el = (Element) errorDetails.item(i);
if (el.getNodeName().contains("ERROR-LIST")) {
NodeList errorCodes = el.getElementsByTagName("ERROR-CODE");
for (int j = 0; j < errorCodes.getLength(); j++) {
Node errorCode1 = errorCodes.item(j);
logger.info(errorCode1.getNodeValue());
}
}
}
}
} else {
isValidResponse = true;
}
}
}
The response which I'm getting from server is
<DATA><HEADER><RESPONSE-TYPE CODE = "0" DESCRIPTION = "Response Error" />
</HEADER><BODY><ERROR-LIST>
<ERROR-CODE>9000</ERROR-CODE>
<ERROR-CODE>1076</ERROR-CODE>
</ERROR-LIST></BODY></DATA>
I'm able to get only the 9000 error code. How can I catch all error codes which are under the error list?
Any ideas would be greatly appreciated.

You are explicitly requesting the first element of the error list:
el.getElementsByTagName("ERROR-CODE").item(0).getTextContent();
Loop over all the nodes getElementsByTagName returns.
NodeList errorCodes = el.getElementsByTagName("ERROR-CODE");
for (int j = 0; j < errorCodes.getLength(); j++) {
String errorCode = errorCodes.item(j).getTextContent();
}
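For reference, here is a self-contained sketch of the same loop run against the response from the question. The class and method names (ErrorCodeReader, readErrorCodes) are made up for the example. Note that getNodeValue() returns null for element nodes, which is why getTextContent() is used to read the text inside each ERROR-CODE element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ErrorCodeReader {
    /** Returns the text of every ERROR-CODE element in the response, in document order. */
    public static List<String> readErrorCodes(String responseXml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(responseXml)));
            NodeList codes = doc.getElementsByTagName("ERROR-CODE");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < codes.getLength(); i++) {
                // getTextContent() reads the text inside the element;
                // getNodeValue() would return null for an element node
                result.add(codes.item(i).getTextContent().trim());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid response XML", e);
        }
    }

    public static void main(String[] args) {
        String response = "<DATA><HEADER><RESPONSE-TYPE CODE = \"0\" DESCRIPTION = \"Response Error\" />"
                + "</HEADER><BODY><ERROR-LIST>"
                + "<ERROR-CODE>9000</ERROR-CODE>"
                + "<ERROR-CODE>1076</ERROR-CODE>"
                + "</ERROR-LIST></BODY></DATA>";
        System.out.println(readErrorCodes(response)); // prints [9000, 1076]
    }
}
```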

Java File Iteration in For loop

I have a few files saved in my local directory: video files and XML files. Video file details are stored in the XML files.
We are moving videos from one system to another. Before uploading a video file and its XML data to the other system, I need to check for the title of the video in the other system and upload only if the same title doesn't exist.
This is working fine, but the upload is happening 4 times instead of 2 times. Please help.
Here is the main code:
List<File> videoFiles = new ArrayList<File>();
List<File> xmlFiles = new ArrayList<File>();
File[] allVideos = checkVideos();
for(File file:allVideos ) {
if(file.getName().endsWith("flv")) {
videoFiles .add(file);
}
if(file.getName().endsWith("xml")) {
xmlFiles .add(file);
}
}
System.out.println(videoFiles.size());
System.out.println(xmlFiles.size());
processUpload(videoFiles ,xmlFiles);
Here are the methods:
private static void processUpload(List<File> videoFiles, List<File> xmlFiles) throws ParserConfigurationException, SAXException, IOException, ApiException {
NodeList nodes = null;
File video= null;
File xml = null;
String title = null;
String localFileTitle = null;
Media newMedia = null;
for(int i=0;i < videoFiles.size();i++) {
System.out.println("videoFiles.getName() ->"+videoFiles.get(i).getName());
video= videoFiles.get(i);
for(int j=0;j < xmlFiles.size();j++) {
xml = xmlFiles.get(j);
System.out.println("xmlFiles.getName() ->"+xmlFiles.get(i).getName());
nodes = parseXml(xml);
localFileTitle = processNodes(nodes);
title = checkTitles(localFileTitle);
newMedia = initiateUploadProcess(flv, title );
}
}
}
private static NodeList parseXml(File xmlFile) throws ParserConfigurationException, SAXException, IOException {
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();
//System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("video");
return nList;
}
private static String processNodes(NodeList nodes) {
String fileTitle = null;
if(nodes.getLength() >= 1) {
for (int temp = 0; temp < nodes.getLength(); temp++) {
Node nNode = nodes.item(temp);
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
fileTitle = eElement.getElementsByTagName("title").item(0).getTextContent();
if(fileTitle != null) {
//System.out.println("Local File Title ------>"+fileTitle);
}
}
}
}
return fileTitle;
}
private static String checkTitles(String localTitle) throws ApiException {
String title = null;
MediaList mediaResponse = fetchVideos(caID);
if(mediaResponse.totalCount >= 1) {
for(MediaEntry media:mediaResponse.objects) {
if(!localTitle.equals(media.name)) {
System.out.println("Titles are not same. Need to return");
title = localTitle;
}
}
}
return title ;
}
private static MediaEntry initiateUploadProcess(File videoFile,
String localFileTitle) throws ApiException, ParserConfigurationException, SAXException, IOException {
UploadToken ktoken = null;
UploadMedia entry = null;
MediaEntry mediaEntry = null;
ktoken = generateToken();
if (ktoken != null) {
//System.out.println("ktoken.id ----------->" + ktoken.id);
if (ktoken.id != null) {
uploadToken(ktoken.id, flvFile);
entry = uploadMediaToChannel(categoryID, categoryName, localFileTitle);
if (entry.id != null) {
System.out.println("entry.id ------->" + entry.id);
mediaEntry = addMediaContent(ktoken.id, entry.id);
}
}
}
return mediaEntry;
}
Here is the output:
videoFiles.getName() ->22701846_91167469.flv
xmlFiles.getName() ->22701846_91167469.xml
Titles are not same. Need to return
Titles are not same. Need to return
video.id ------->0_50wh1m4p
xmlFiles.getName() ->22701846_91167469.xml
Titles are not same. Need to return
Titles are not same. Need to return
video.id ------->0_79v605ue
videoFiles.getName() ->22701846_91477939.flv
xmlFiles.getName() ->22701846_91477939.xml
Titles are not same. Need to return
Titles are not same. Need to return
video.id ------->0_0kihent1
xmlFiles.getName() ->22701846_91477939.xml
Titles are not same. Need to return
Titles are not same. Need to return
Titles are not same. Need to return
video.id ------->0_miogft0i

In pseudo-code, you have something like those two loops there:
for (i in 1,2)
for (j in 1, 2)
...
upload
And you are surprised that you have 4 (2 x 2) uploads?
Try something like:
for (i in 1,2)
... fetch video file info
... fetch xml file info
upload
instead!
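In Java, the single-loop idea could look roughly like the sketch below. It assumes that each video and its XML file share the same base name (as the file names in the question's output suggest, e.g. 22701846_91167469.flv and 22701846_91167469.xml); the class name FilePairing and its methods are hypothetical. The upload would then run once per pair instead of once per combination.

```java
import java.io.File;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FilePairing {
    /** Strips the extension: "22701846_91167469.flv" becomes "22701846_91167469". */
    static String baseName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? fileName : fileName.substring(0, dot);
    }

    /** Maps each video to the XML file with the same base name; videos without a match are skipped. */
    public static Map<File, File> pairByBaseName(List<File> videos, List<File> xmls) {
        Map<String, File> xmlByBase = new HashMap<>();
        for (File xml : xmls) {
            xmlByBase.put(baseName(xml.getName()), xml);
        }
        Map<File, File> pairs = new LinkedHashMap<>();
        for (File video : videos) {
            File xml = xmlByBase.get(baseName(video.getName()));
            if (xml != null) {
                pairs.put(video, xml);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<File> videos = Arrays.asList(new File("22701846_91167469.flv"), new File("22701846_91477939.flv"));
        List<File> xmls = Arrays.asList(new File("22701846_91477939.xml"), new File("22701846_91167469.xml"));
        // One iteration per video: parse the XML, check the title, then upload once
        for (Map.Entry<File, File> pair : pairByBaseName(videos, xmls).entrySet()) {
            System.out.println(pair.getKey().getName() + " -> " + pair.getValue().getName());
        }
    }
}
```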

Doing the nested for loops is not very efficient, but I think the problem is in your checkVideos() method (not listed in the question) and it is returning duplicate file objects.
Edit: the line where you print the xml file is using the "i" variable (from the outer loop)

Unable to delete a specific node in XML

I have an XML file and I need to delete a specific node. The node to be deleted will be determined dynamically by the program logic. I have been searching the internet for a solution but still couldn't delete my node. I am getting the error: NOT_FOUND_ERR: An attempt is made to reference a node in a context where it does not exist
Below is a sample XML file. I need to delete the <NameValuePairs> node which has <name>Local Variables</name>. My Java code follows the sample XML.
Sample XML File
<?xml version="1.0" encoding="UTF-8"?>
<DeploymentDescriptors xmlns="http://www.tibco.com/xmlns/dd">
<name>Test</name>
<version>1</version>
<DeploymentDescriptorFactory>
<name>RepoInstance</name>
</DeploymentDescriptorFactory>
<DeploymentDescriptorFactory>
<name>NameValuePairs</name>
</DeploymentDescriptorFactory>
<NameValuePairs>
<name>Global Variables</name>
<NameValuePair>
<name>Connections1</name>
<value>7222</value>
<requiresConfiguration>true</requiresConfiguration>
</NameValuePair>
<NameValuePair>
<name>Connections2</name>
<value>7222</value>
<requiresConfiguration>true</requiresConfiguration>
</NameValuePair>
</NameValuePairs>
<NameValuePairs>
<name>Local Variables</name>
<NameValuePair>
<name>Connections3</name>
<value>8222</value>
<requiresConfiguration>true</requiresConfiguration>
</NameValuePair>
<NameValuePair>
<name>Connections3</name>
<value>8222</value>
<requiresConfiguration>true</requiresConfiguration>
</NameValuePair>
</NameValuePairs>
</DeploymentDescriptors>
Java Code
File fDestFile = new File("myfile.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document oDoc3 = dBuilder.parse(fDestFile);
NodeList oDestFlowList = oDoc3.getElementsByTagName("NameValuePairs");
for (int m = 0; m < oDestFlowList.getLength(); m++) {
NodeList oDestchildList = oDestFlowList.item(m).getChildNodes();
for (int n = 0; n < oDestchildList.getLength(); n++) {
Node oDestchildNode = oDestchildList.item(n);
if ("name".equals(oDestchildNode.getNodeName())) {
//oDestchildNode.getParentNode().removeChild(oDestchildNode); //Not Working
//oDoc3.getDocumentElement().removeChild(oDestchildNode); //Not Working
}
}
}
}

You need to create a separate reference to the parent node as an Element so that you aren't referencing the node that you are removing:
File fDestFile = new File("src/myfile.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = null;
try {
dBuilder = dbFactory.newDocumentBuilder();
Document oDoc3 = null;
oDoc3 = dBuilder.parse(fDestFile);
NodeList oDestFlowList = oDoc3.getElementsByTagName("NameValuePairs");
// Loop through all 'NameValuePairs'
for (int m = oDestFlowList.getLength()-1; m >=0 ; m--) {
NodeList oDestchildList = oDestFlowList.item(m).getChildNodes();
// Loop through children of 'NameValuePairs'
for (int n = oDestchildList.getLength()-1; n >=0 ; n--) {
// Remove children if they are of the type 'name'
if(oDestchildList.item(n).getNodeName().equals("name")){
// For debugging: print before removing, because the NodeList is live and item(n) changes after removal
System.out.println(oDestchildList.item(n).getNodeName());
oDestFlowList.item(m).removeChild(oDestchildList.item(n));
}
}
}
Source source = new DOMSource(oDoc3);
Result result = new StreamResult(fDestFile);
Transformer transformer = null;
transformer = TransformerFactory.newInstance().newTransformer();
// Transform your XML document (i.e. save changes to file)
transformer.transform(source, result);
} catch (Exception e) {
// Catch the exception here
e.printStackTrace();
}
}
If you are still having issues, then I would think that it is an issue with the node types. This was working for me before I put the check in for 'oDestchildNode.getNodeType()' but I would look at what type of node you are returning and go from there.

Here is the code that finally worked:
public static void main(String[] args) {
File fXmlSubFile = new File("Sub.xml");
File fXmlOriginalFile = new File("Original.xml");
File fDestFile = new File("myfile.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder;
FileChannel source = null;
FileChannel destination = null;
XPath xPath = XPathFactory.newInstance().newXPath();
try{
if (!fDestFile.exists()) {
fDestFile.createNewFile();
}
source = new FileInputStream(fXmlOriginalFile).getChannel();
destination = new FileOutputStream(fDestFile).getChannel();
if (destination != null && source != null) {
destination.transferFrom(source, 0, source.size());
}
if (source != null) {
source.close();
}
if (destination != null) {
destination.close();
}
dBuilder = dbFactory.newDocumentBuilder();
Document oSubDoc = dBuilder.parse(fXmlSubFile);
Document oDestDoc = dBuilder.parse(fDestFile);
oSubDoc.getDocumentElement().normalize();
oDestDoc.getDocumentElement().normalize();
String sDestExpression = "/DeploymentDescriptors/NameValuePairs";
String sSubExpression = "/NameValuePairs";
NodeList nodeDestList = (NodeList) xPath.compile(sDestExpression).evaluate(oDestDoc, XPathConstants.NODESET);
NodeList nodeSubList = (NodeList) xPath.compile(sSubExpression).evaluate(oSubDoc, XPathConstants.NODESET);
for (int i = nodeDestList.getLength()-1; i >=0 ; i--) {
Node oDestNode = nodeDestList.item(i);
if (oDestNode.getNodeType() == Node.ELEMENT_NODE) {
Element oDestElement = (Element) oDestNode;
for (int j =0; j<nodeSubList.getLength(); j++) {
Node oSubNode = nodeSubList.item(j);
if (oSubNode.getNodeType() == Node.ELEMENT_NODE) {
Element oSubElement = (Element) oSubNode;
if(oDestElement.getElementsByTagName("name").item(0).getTextContent().equals(oSubElement.getElementsByTagName("name").item(0).getTextContent())){
oDestNode.getParentNode().removeChild(oDestNode);
}
}
}
}
}
Source src = new DOMSource(oDestDoc);
Result result = new StreamResult(fDestFile);
Transformer transformer = null;
transformer = TransformerFactory.newInstance().newTransformer();
// Transform your XML document (i.e. save changes to file)
transformer.transform(src, result);
}catch(Exception ex){
System.out.println("error:"+ex.getMessage());
ex.printStackTrace();
}
}
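A shorter sketch of the core step, for the case stated in the question (remove the NameValuePairs block whose name is "Local Variables"). The class and method names are illustrative. Two points matter: removeChild must be called on the node's actual parent, otherwise the DOM throws NOT_FOUND_ERR, and the NodeList returned by getElementsByTagName is live, so the loop runs backwards.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RemoveNameValuePairs {
    /** Removes every NameValuePairs element whose direct name child equals groupName. */
    public static int removeGroup(Document doc, String groupName) {
        NodeList groups = doc.getElementsByTagName("NameValuePairs");
        int removed = 0;
        // The list is live: iterate backwards so removals do not shift unvisited items
        for (int i = groups.getLength() - 1; i >= 0; i--) {
            Node group = groups.item(i);
            if (groupName.equals(directChildText(group, "name"))) {
                // Remove through the real parent to avoid NOT_FOUND_ERR
                group.getParentNode().removeChild(group);
                removed++;
            }
        }
        return removed;
    }

    /** Text of the first direct child element with the given tag, or null if there is none. */
    static String directChildText(Node parent, String tag) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE && tag.equals(c.getNodeName())) {
                return c.getTextContent().trim();
            }
        }
        return null;
    }

    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<DeploymentDescriptors xmlns=\"http://www.tibco.com/xmlns/dd\">"
                + "<NameValuePairs><name>Global Variables</name></NameValuePairs>"
                + "<NameValuePairs><name>Local Variables</name></NameValuePairs>"
                + "</DeploymentDescriptors>");
        System.out.println("Removed: " + removeGroup(doc, "Local Variables"));
        System.out.println("Remaining: " + doc.getElementsByTagName("NameValuePairs").getLength());
    }
}
```

Saving the modified document back to the file works with a Transformer, as in the code above.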

How to parse xml in android-java?

In my application, I have an XML file and I want to parse the XML file and extract data from the XML tags. Here is my XML file.
<array>
<recipe>
<name> Crispy Fried Chicken </name>
<description> Deliciously Crispy Fried Chicken</description>
<prepTime>1.5 hours </prepTime>
<instructions>instruction steps</instructions>
<ingredients>
<item>
<itemName>Chicken Parts</itemName>
<itemAmount>2 lbs</itemAmount>
</item>
<item>
<itemName>Salt &amp; Peppers</itemName>
<itemAmount>As teste</itemAmount>
</item>
</ingredients>
</recipe>
<recipe>
<name> Bourben Chicken </name>
<description> A good recipe! A tad on the hot side!</description>
<prepTime>1 hours </prepTime>
<instructions>instruction steps</instructions>
<ingredients>
<item>
<itemName>Boneless Chicken</itemName>
<itemAmount>2.5 lbs</itemAmount>
</item>
<item>
<itemName>Olive Oil</itemName>
<itemAmount>1 -2 tablespoon</itemAmount>
</item>
<item>
<itemName>Olive Oil</itemName>
<itemAmount>1 -2 tablespoon</itemAmount>
</item>
</ingredients>
</recipe>
</array>
I have used a DOM parser to parse the above XML file and I have extracted data from the <name>, <description>, <prepTime> and <instructions> tags, but I don't know how to extract data from the <ingredients> tag. Here is the DOM parser code that I have developed:
public class DOMParser
{
// parse Plist and fill in arraylist
public ArrayList<DataModel> parsePlist(String xml)
{
final ArrayList<DataModel> dataModels = new ArrayList<DataModel>();
//Get the xml string from assets XML file
final Document doc = convertStringIntoXML(xml);
// final NodeList nodes_array = doc.getElementsByTagName("array");
//Iterating through the nodes and extracting the data.
NodeList nodeList = doc.getDocumentElement().getChildNodes();
for (int i = 0; i < nodeList.getLength(); i++)
{
Node node = nodeList.item(i);
if (node instanceof Element)
{
DataModel model = new DataModel();
NodeList childNodes = node.getChildNodes();
for (int j = 0; j < childNodes.getLength(); j++)
{
Node cNode = childNodes.item(j);
if (cNode instanceof Element)
{
String content = cNode.getLastChild().getTextContent().trim();
if(cNode.getNodeName().equalsIgnoreCase("name"))
model.setName(content);
else if(cNode.getNodeName().equalsIgnoreCase("description"))
model.setDescription(content);
else if(cNode.getNodeName().equalsIgnoreCase("prepTime"))
model.setPrepTime(content);
else if(cNode.getNodeName().equalsIgnoreCase("instructions"))
model.setInstructions(content);
}
}
dataModels.add(model);
}
}
return dataModels;
}
// Create xml document object from XML String
private Document convertStringIntoXML(String xml)
{
Document doc = null;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try
{
DocumentBuilder db = dbf.newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(xml));
doc = db.parse(is);
}
catch (ParserConfigurationException e)
{
System.out.println("XML parse error: " + e.getMessage());
return null;
}
catch (SAXException e)
{
System.out.println("Wrong XML file structure: " + e.getMessage());
return null;
}
catch (IOException e)
{
System.out.println("I/O exception: " + e.getMessage());
return null;
}
return doc;
}
}

You need to iterate over the child nodes of ingredients, like you do for the recipe tag.
But the easier way is to use XPath.
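A possible XPath version, sketched with the standard javax.xml.xpath API. The class name IngredientXPath and the "itemName: itemAmount" output format are assumptions for the example; in the real app you would fill your DataModel instead. Recipes that share the same name would overwrite each other in this map.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class IngredientXPath {
    /** Maps each recipe name to its ingredients, formatted as "itemName: itemAmount". */
    public static Map<String, List<String>> ingredientsByRecipe(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Map<String, List<String>> result = new LinkedHashMap<>();
            NodeList recipes = (NodeList) xpath.evaluate("/array/recipe", doc, XPathConstants.NODESET);
            for (int i = 0; i < recipes.getLength(); i++) {
                Node recipe = recipes.item(i);
                // Relative expressions are evaluated against the current recipe node
                String recipeName = xpath.evaluate("name", recipe).trim();
                NodeList items = (NodeList) xpath.evaluate("ingredients/item", recipe, XPathConstants.NODESET);
                List<String> ingredients = new ArrayList<>();
                for (int j = 0; j < items.getLength(); j++) {
                    Node item = items.item(j);
                    ingredients.add(xpath.evaluate("itemName", item).trim()
                            + ": " + xpath.evaluate("itemAmount", item).trim());
                }
                result.put(recipeName, ingredients);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read recipes", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<array><recipe><name> Crispy Fried Chicken </name><ingredients>"
                + "<item><itemName>Chicken Parts</itemName><itemAmount>2 lbs</itemAmount></item>"
                + "</ingredients></recipe></array>";
        System.out.println(ingredientsByRecipe(xml));
    }
}
```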

You can change your code as below.
public ArrayList<DataModel> parsePlist(String xml)
{
final ArrayList<DataModel> dataModels = new ArrayList<DataModel>();
//Get the xml string from assets XML file
final Document doc = convertStringIntoXML(xml);
//final NodeList nodes_array = doc.getElementsByTagName("array");
//Iterating through the nodes and extracting the data.
NodeList nodeList = doc.getDocumentElement().getChildNodes();
for (int i = 0; i < nodeList.getLength(); i++)
{
Node node = nodeList.item(i);
if (node instanceof Element)
{
DataModel model = new DataModel();
NodeList childNodes = node.getChildNodes();
for (int j = 0; j < childNodes.getLength(); j++)
{
Node cNode = childNodes.item(j);
if (cNode instanceof Element)
{
String content = cNode.getLastChild().getTextContent().trim();
if(cNode.getNodeName().equalsIgnoreCase("name"))
model.setName(content);
else if(cNode.getNodeName().equalsIgnoreCase("description"))
model.setDescription(content);
else if(cNode.getNodeName().equalsIgnoreCase("prepTime"))
model.setPrepTime(content);
else if(cNode.getNodeName().equalsIgnoreCase("instructions"))
model.setInstructions(content);
else if(cNode.getNodeName().equalsIgnoreCase("ingredients"))
{
// cNode is already the <ingredients> element, so look up its <item> children directly
Element ingredEle = (Element)cNode;
NodeList itemList = ingredEle.getElementsByTagName("item");
// Use a new loop variable; i and j are already taken by the outer loops
for (int k = 0; k < itemList.getLength(); k++)
{
Element itemEle = (Element)itemList.item(k);
String name = getNodeValue(itemEle, "itemName");
if (name != null)
{
//set name here
}
String amount = getNodeValue(itemEle, "itemAmount");
if (amount != null)
{
//set amount here
}
}
}
}
}
dataModels.add(model);
}
}
return dataModels;
}
private String getNodeValue(Element element, String elementTemplateLoc) {
NodeList nodes = element.getElementsByTagName(elementTemplateLoc);
// Return null when the tag is missing instead of throwing a NullPointerException
return nodes.getLength() > 0 ? nodes.item(0).getTextContent().trim() : null;
}
Hope this will work for you
