I have the following XML document and I'm trying to read the inner text of its elements. I have tried numerous approaches (XPath, DOM, SAX) without success.
This is my XML. I'm not sure whether the XML structure or my code is causing the problem.
<?xml version="1.0"?>
<ArrayOfPurchaseEntitites xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema">
<PurchaseEntitites>
<rInstalmentAmt>634.0</rInstalmentAmt>
<rAnnualRate>12.0</rAnnualRate>
<rInterestAmt>2670.0</rInterestAmt>
<dFirstInstalment>3/31/2016 12:00:00 AM</dFirstInstalment>
<dLastInstalment>8/31/2018 12:00:00 AM</dLastInstalment>
<rInsurancePremium>1350.0</rInsurancePremium>
<sResponseCode>00</sResponseCode>
</PurchaseEntitites>
</ArrayOfPurchaseEntitites>
InputStream stream = connect.getInputStream();
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
documentBuilderFactory.setNamespaceAware(true);
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document doc = documentBuilder.parse(stream);
doc.normalize();
System.out.println("===============================================================");
String g = doc.getDocumentElement().getTextContent();
System.out.println(g);
NodeList rootNodes = doc.getElementsByTagName("ArrayOfPurchaseEntitites");
Node rootnode =rootNodes.item(0);
Element rootElement = (Element) rootnode;
NodeList noteslist = rootElement.getElementsByTagName("PurchaseEntitites");
for(int i = 0; i < noteslist.getLength(); i++)
{
Node theNote = noteslist.item(i);
Element noteElement =(Element) theNote;
Node theExpiryDate = noteElement.getElementsByTagName("dLastInstalment").item(0);
Element dateElement = (Element) theExpiryDate;
System.out.println(dateElement.getTextContent());
}
stream.close();
I had a similar problem where I wanted to call getElementsByTagName on the first item in a NodeList. The trick, which you already use, is to cast the Node to Element. To be safe, I suggest guarding the cast with if (rootnode instanceof Element).
Assuming you use the packages javax.xml.parsers and org.w3c.dom (not a wild guess), your code works fine when the XML is read from a file.
So if there is still a problem with the code (it's been a while since this question was asked), I suggest you update the question with more information about connect.getInputStream();.
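To make the instanceof suggestion concrete, here is a minimal sketch that parses the XML from a string instead of a connection. The class name InstalmentReader and the helper lastInstalments are made up for illustration; only the element names come from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class InstalmentReader {

    // Hypothetical helper: collects the text of the first <dLastInstalment>
    // inside every <PurchaseEntitites> element.
    static List<String> lastInstalments(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> result = new ArrayList<>();
            NodeList entities = doc.getElementsByTagName("PurchaseEntitites");
            for (int i = 0; i < entities.getLength(); i++) {
                Node entity = entities.item(i);
                // Guard the cast, as suggested above.
                if (!(entity instanceof Element)) {
                    continue;
                }
                Node date = ((Element) entity).getElementsByTagName("dLastInstalment").item(0);
                if (date != null) {
                    result.add(date.getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<ArrayOfPurchaseEntitites><PurchaseEntitites>"
                + "<dLastInstalment>8/31/2018 12:00:00 AM</dLastInstalment>"
                + "</PurchaseEntitites></ArrayOfPurchaseEntitites>";
        System.out.println(lastInstalments(xml));
    }
}
```

If this works with a literal string but not with your connection, the stream is the likely culprit, so it is worth dumping what connect.getInputStream() actually returns.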
I have an XPath into an XML document whose structure looks like this:
<Statement xsi:type="conditionStatement">
<Id>CONDITION_0001</Id>
<Bounds>
<xValue>13</xValue>
<yValue>145</yValue>
<Height>402</Height>
<Width>513</Width>
</Bounds>
.........
.........
</statement>
The XPath takes me to the xsi:type attribute. But when I try to get the name of its parent node, which I expect to be "statement", I get null.
My code for this is:-
nodeList = (NodeList) xPath.compile(xPathSrcFile).evaluate(xmlDocument, XPathConstants.NODESET);
for (int i = 0; i < nodeList.getLength(); i++) {
nodeList.item(i).getParentNode();
}
In all other cases the code works fine, but when the XPath points at "xsi", the code throws a NullPointerException.
I need some help getting the node name in this case.
Try this:
NodeList nodeList = null;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
InputStream inputStream= new FileInputStream(file);//xmlDocument as file
Reader reader = new InputStreamReader(inputStream,"ISO-8859-1");
InputSource is = new InputSource(reader);
is.setEncoding("ISO-8859-1");
Document doc = db.parse(is);
Element docEle = doc.getDocumentElement();
nodeList = docEle.getElementsByTagName("Statement");
1. Your XML file is incorrect:
it begins with Statement
and ends with /statement
2. You need this on the root tag:
<root xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
3. To get the name of your tag, use (item(i) returns a Node, and getTagName() exists only on Element):
nodeList.item(i).getNodeName();
4. What is your XPath?
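One more thing that may explain the null: in the DOM, an attribute node has no parent, so getParentNode() returns null when the XPath selects xsi:type. The element that carries the attribute is available through Attr.getOwnerElement(). A rough sketch, assuming a namespace-aware parser and an XPath of my own choosing (the asker's actual XPath is unknown); OwnerElementDemo and ownerNames are made-up names:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class OwnerElementDemo {

    // Hypothetical helper: for every node the XPath selects, returns the name
    // of the element it belongs to.
    static List<String> ownerNames(String xml, String xpathExpr) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpathExpr, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                // Attributes have no parent node; use the owner element instead.
                Node owner = (node instanceof Attr)
                        ? ((Attr) node).getOwnerElement()
                        : node.getParentNode();
                if (owner != null) {
                    names.add(owner.getNodeName());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<root xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">"
                + "<Statement xsi:type=\"conditionStatement\"><Id>CONDITION_0001</Id></Statement>"
                + "</root>";
        // Assumed XPath: selects every attribute whose local name is "type".
        System.out.println(ownerNames(xml, "//@*[local-name()='type']"));
    }
}
```

The local-name() test is only there to avoid setting up a NamespaceContext for the xsi prefix; with one registered, //@xsi:type would do the same job.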
I've written a Twitter desktop app that basically just lets me post tweets and pics... nothing fancy.
I've got everything working except this last part: persisting a config file, which is the following XML generated by my application.
<?xml version="1.0" encoding="UTF-8" standalone="no"?><Twitterer><config id="1"><accessToken>ENDLESS-STRING-OF-CHARACTERS</accessToken><accessTokenSecret>ANOTHER-ENDLESS-STRING-OF-CHARACTERS</accessTokenSecret></config></Twitterer>
What I need to do is just set the accessToken & accessTokenSecret variables. The filename is config.xml.
I've been looking at a lot of examples on the net, but can't seem to wrap my head around only getting two values from the file, which shouldn't need a loop.
This is as far as I've gotten on this last piece of my puzzle:
try {
File fXmlFile = new File(this.getFileName());
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
NodeList nList = doc.getElementsByTagName("config");
int numberOfConfigs = nList.getLength();
// GET THE TWO VARIABLES HERE
} catch (Exception e) {
}
If anyone can help me just read those two tags into their corresponding variables I would be quite appreciative. I can handle the rest of the Authorization after that.
What I need to do is just set the accessToken & accessTokenSecret variables
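A simple approach using the getElementsByTagName() method:
Element root = doc.getDocumentElement();
String accessToken = root.getElementsByTagName("accessToken").item(0).getTextContent();
String accessTokenSecret = root.getElementsByTagName("accessTokenSecret").item(0).getTextContent();
System.out.println(accessToken);
System.out.println(accessTokenSecret);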
output:
ENDLESS-STRING-OF-CHARACTERS
ANOTHER-ENDLESS-STRING-OF-CHARACTERS
Or read them as child nodes of the config tag (this relies on config having no whitespace between its children, as in the file above):
Element root = doc.getDocumentElement();
NodeList configNodeList = root.getElementsByTagName("config");
NodeList nodeList = configNodeList.item(0).getChildNodes();
System.out.println(nodeList.item(0).getTextContent());
System.out.println(nodeList.item(1).getTextContent());
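Since only two values are needed, XPath is another option that avoids loops and index arithmetic. This is just a sketch: ConfigReader and readTokens are invented names, and it parses from a string so it can be run standalone. In the app you would presumably parse the config.xml file as in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class ConfigReader {

    // Hypothetical helper: returns { accessToken, accessTokenSecret }.
    static String[] readTokens(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // evaluate(String, Object) returns the string value of the first match.
            String token = xpath.evaluate("/Twitterer/config/accessToken", doc);
            String secret = xpath.evaluate("/Twitterer/config/accessTokenSecret", doc);
            return new String[] {token, secret};
        } catch (Exception e) {
            throw new IllegalStateException("Could not read config", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Twitterer><config id=\"1\">"
                + "<accessToken>ENDLESS-STRING-OF-CHARACTERS</accessToken>"
                + "<accessTokenSecret>ANOTHER-ENDLESS-STRING-OF-CHARACTERS</accessTokenSecret>"
                + "</config></Twitterer>";
        String[] tokens = readTokens(xml);
        System.out.println(tokens[0]);
        System.out.println(tokens[1]);
    }
}
```

Unlike the getChildNodes() variant, this does not depend on the position of the two elements or on the file containing no whitespace.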
I have a series of XML files I am looking through and grabbing a specific element from.
<key>A</key>
I'm using this snippet of code to grab the XML element, but it returns null instead of the element I am looking for. I am not able to change the XML files.
File key = new File(filePath);
PrintWriter keyWriter = new PrintWriter(key);
File xmlFile = new File(configPath);
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(xmlFile);
NodeList nodes = document.getElementsByTagName("key");
Element keyValue = (Element) nodes.item(0);
keyWriter.println(keyValue);
keyWriter.close();
}
I've tried using the Document methods as well as Apache XMLConfiguration and getElementById, but all have returned null so far.
I noticed that your code passes the Element object itself to the writer's println method:
keyWriter.println(keyValue);
This writes the element's toString() representation rather than its text. With the JDK's built-in parser that typically looks like [key: null], because the node value of an element is always null. Try replacing it with:
keyWriter.println(keyValue.getTextContent());
I have an XML config file that has just one parent and one child. This will always be like this and never change. It looks something like this:
<parent>
<child1>test</child1>
<child2>123</child2>
</parent>
I want to use the Java DOM (org.w3c.dom.Document) to parse the XML into a TreeMap so that I can access the child elements as keys/values. I'm guessing I'd need to create a for loop that scans through the XML and adds each key (element name) and value (element text) line by line?
You can traverse the XML document using the JAXP APIs; you don't need to know the structure or node names in advance:
InputStream is = new ByteArrayInputStream(xml.getBytes("UTF-8"));
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbf.newDocumentBuilder();
Document doc = docBuilder.parse(is);
NodeList nodeList = doc.getChildNodes();
Then you can iterate over the nodes and read each node's attributes (getAttributes() returns null for anything that is not an element):
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
NamedNodeMap attributes = node.getAttributes();
//...
}
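For the TreeMap the question asks about, something along these lines should work. It is a sketch under the assumption that the values are the text of the root's direct child elements (as in the sample) rather than XML attributes; XmlToMap and childrenToMap are names I made up.

```java
import java.io.StringReader;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XmlToMap {

    // Hypothetical helper: maps each child element name of the root to its text.
    static TreeMap<String, String> childrenToMap(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            TreeMap<String, String> map = new TreeMap<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip the whitespace-only text nodes between the elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    map.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
            return map;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<parent>\n<child1>test</child1>\n<child2>123</child2>\n</parent>";
        System.out.println(childrenToMap(xml));
    }
}
```

If two children share a name, the later one overwrites the earlier one, which is fine for a flat config file but worth knowing.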
I am currently modifying a piece of code and I am wondering if the way the XML is formatted (tabs and spacing) will affect the way in which it is parsed into the DocumentBuilderFactory class.
In essence the question is...can I pass a big long string with no spacing into the DocumentBuilderFactory or does it need to be formatted in some way?
Thanks in advance. Included below is the class definition from Oracle's website.
Class DocumentBuilderFactory
"Defines a factory API that enables applications to obtain a parser that produces DOM object trees from XML documents. "
The documents will be different. Tabs and new lines will be converted into text nodes. You can eliminate these using the following method on DocumentBuilderFactory:
http://download.oracle.com/javase/6/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setIgnoringElementContentWhitespace(boolean)
But in order for it to work you must set up your DOM parser to validate the content against a DTD or xml schema.
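As a rough illustration of that requirement, the sketch below embeds a small DTD in the document so the parser knows that root may contain only elements. The DTD, the sample document and the names WhitespaceDemo and countRootChildren are my own assumptions, not something from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class WhitespaceDemo {

    // Assumed sample: the internal DTD declares that <root> has element-only
    // content, so the whitespace between <A/> and <B> is ignorable.
    static final String SAMPLE =
            "<!DOCTYPE root [\n"
          + "  <!ELEMENT root (A, B)>\n"
          + "  <!ELEMENT A EMPTY>\n"
          + "  <!ELEMENT B (#PCDATA)>\n"
          + "]>\n"
          + "<root>\n"
          + "  <A/>\n"
          + "  <B>tagB</B>\n"
          + "</root>";

    // Hypothetical helper: counts the root's child nodes, optionally asking the
    // parser to validate and drop ignorable whitespace.
    static int countRootChildren(String xml, boolean ignoreWhitespace) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            // Validation is what tells the parser which whitespace is ignorable.
            dbf.setValidating(ignoreWhitespace);
            dbf.setIgnoringElementContentWhitespace(ignoreWhitespace);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("default parser:      " + countRootChildren(SAMPLE, false));
        System.out.println("ignoring whitespace: " + countRootChildren(SAMPLE, true));
    }
}
```

Without a DTD or schema the parser cannot tell ignorable whitespace from meaningful text, so the flag on its own has no effect.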
Alternatively you could programmatically remove the extra whitespace yourself using something like the following:
public static void removeEmptyTextNodes(Node node) {
NodeList nodeList = node.getChildNodes();
Node childNode;
for (int x = nodeList.getLength() - 1; x >= 0; x--) {
childNode = nodeList.item(x);
if (childNode.getNodeType() == Node.TEXT_NODE) {
if (childNode.getNodeValue().trim().equals("")) {
node.removeChild(childNode);
}
} else if (childNode.getNodeType() == Node.ELEMENT_NODE) {
removeEmptyTextNodes(childNode);
}
}
}
It should not affect the parser's ability to parse the string, as long as the string is valid XML. Tabs and newlines between elements are mostly there for the human reader, although by default a DOM parser keeps them as whitespace-only text nodes rather than discarding them.
Note that you will have to pass an input stream (a ByteArrayInputStream, for example; StringBufferInputStream is deprecated) or an InputSource to the DocumentBuilder, because the String version of parse assumes the string is a URI pointing to the XML.
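A minimal sketch of parsing XML held in a String, wrapping it in an InputSource over a StringReader so that character encoding is not an issue. StringParsing and parseXml are illustrative names only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class StringParsing {

    // Hypothetical helper: parses XML text that is already in memory.
    static Document parseXml(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // builder.parse(xml) would treat the string as a URI, so wrap it.
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // One long line with no formatting parses fine.
        Document doc = parseXml("<root><A></A><B>tagB</B></root>");
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```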
The DocumentBuilder builds different DOM objects for xml string with line feeds and xml string without line feeds. Here is the code I tested:
String newlineChar = "\n";
StringBuilder sb = new StringBuilder();
sb.append("<root>").append(newlineChar).append("<A>").append("</A>").append(newlineChar).append("<B>tagB").append("</B>").append("</root>");
// With the JDK's default parser, getLocalName() returns null for elements
// unless the factory is namespace-aware.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
InputStream xmlInput = new ByteArrayInputStream(sb.toString().getBytes());
Element documentRoot = builder.parse(xmlInput).getDocumentElement();
NodeList nodes = documentRoot.getChildNodes();
System.out.println("How many children does the root have? => " + nodes.getLength());
for (int index = 0; index < nodes.getLength(); index++) {
    // Text nodes have no local name, so they print as null.
    System.out.println(nodes.item(index).getLocalName());
}
Output:
How many children does the root have? => 4
null
A
null
B
But if newlineChar is removed from the StringBuilder, the output is:
How many children does the root have? => 2
A
B
This demonstrates that the DOM objects generated by DocumentBuilder are different.
The format of the XML string shouldn't have any effect, but I remember a strange problem when I passed a long string to an XML parser: the parser was unable to parse an XML file that was written as one long line.
It may be safer to insert line breaks so that lines are no longer than, say, 1000 bytes.
Sadly, I remember neither why that error occurred nor which parser I was using.