I'm trying to read xml file, ex :
<entry>
<title>FEED TITLE</title>
<id>5467sdad98787ad3149878sasda</id>
<tempi type="application/xml">
<conento xmlns="http://mydomainname.com/xsd/radiofeed.xsd" madeIn="USA" />
</tempi>
</entry>
Here is the code I have so far :
Here is my attempt of trying to code this, what to say not successful thats why I started bounty. Here it is http://pastebin.com/huKP4KED .
Bounty update :
I really really tried to do this for days now didn't expect to be so hard, I'll accept useful links/books/tutorials but prefer code because I need this done yesterday.
Here is what I need:
Concerning xml above :
I need to get value of title, id
attribute value of tempi as well as madeIn attribute value of contento
What is the best way to do this ?
EDIT:
#Pascal Thivent
Maybe creating method would be good idea like public String getValue(String xml, Element elementname), where you specify tag name, the method returns tag value or tag attribute(maybe give it name as additional method argument) if the value is not available
What I really want to get certain tag value or attribute if tag value(s) is not available, so I'm in the process of thinking what is the best way to do so since I've never done it before
The best solution for this is to use XPath. Your pastebin is expired, but here's what I gathered. Let's say we have the following feed.xml file:
<?xml version="1.0" encoding="UTF-8" ?>
<entries>
<entry>
<title>FEED TITLE 1</title>
<id>id1</id>
<tempi type="type1">
<conento xmlns="dontcare?" madeIn="MadeIn1" />
</tempi>
</entry>
<entry>
<title>FEED TITLE 2</title>
<id>id2</id>
<tempi type="type2">
<conento xmlns="dontcare?" madeIn="MadeIn2" />
</tempi>
</entry>
<entry>
<id>id3</id>
</entry>
</entries>
Here's a short but compile-and-runnable proof-of-concept (with feed.xml file in the same directory).
import javax.xml.xpath.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import java.io.*;
import java.util.*;
public class XPathTest {
static class Entry {
final String title, id, origin, type;
Entry(String title, String id, String origin, String type) {
this.title = title;
this.id = id;
this.origin = origin;
this.type = type;
}
#Override public String toString() {
return String.format("%s:%s(%s)[%s]", id, title, origin, type);
}
}
final static XPath xpath = XPathFactory.newInstance().newXPath();
static String evalString(Node context, String path) throws XPathExpressionException {
return (String) xpath.evaluate(path, context, XPathConstants.STRING);
}
public static void main(String[] args) throws Exception {
File file = new File("feed.xml");
Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(file);
NodeList entriesNodeList = (NodeList) xpath.evaluate("//entry", document, XPathConstants.NODESET);
List<Entry> entries = new ArrayList<Entry>();
for (int i = 0; i < entriesNodeList.getLength(); i++) {
Node entryNode = entriesNodeList.item(i);
entries.add(new Entry(
evalString(entryNode, "title"),
evalString(entryNode, "id"),
evalString(entryNode, "tempi/conento/#madeIn"),
evalString(entryNode, "tempi/#type")
));
}
for (Entry entry : entries) {
System.out.println(entry);
}
}
}
This produces the following output:
id1:FEED TITLE 1(MadeIn1)[type1]
id2:FEED TITLE 2(MadeIn2)[type2]
id3:()[]
Note how using XPath makes the value retrieval very simple, intuitive, readable, and straightforward, and "missing" values are also gracefully handled.
API links
package javax.xml.xpath
http://www.w3.org/TR/xpath
Wikipedia/XPath
Use Element.getAttribute and Element.setAttribute
In your example, ((Node) content.item(0)).getFirstChild().getAttributes(). Assuming that content is a typo, and you mean contento, getFirstChild is correctly returning NULL as contento has no children. Try: ((Node) contento.item(0)).getAttributes() instead.
Another issue is that by using getFirstChild and getChildNodes()[0] without checking the return value, you are running the risk of picking up child text nodes, instead of the element you want.
As pointed out, <contento> doesn't have any child so instead of:
(contento.item(0)).getFirstChild().getAttributes()
You should treat the Node as Element and use getAttribute(String), something like this:
((Element)contento.item(0)).getAttribute("madeIn")
Here is a modified version of your code (it's not the most robust code I've written):
InputStream inputStream = new ByteArrayInputStream(xml.getBytes());
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(inputStream);
doc.getDocumentElement().normalize();
System.out.println("Root element " + doc.getDocumentElement().getNodeName());
NodeList nodeLst = doc.getElementsByTagName("entry");
System.out.println("Information of all entries");
for (int s = 0; s < nodeLst.getLength(); s++) {
Node fstNode = nodeLst.item(s);
if (fstNode.getNodeType() == Node.ELEMENT_NODE) {
Element fstElmnt = (Element) fstNode;
NodeList title = fstElmnt.getElementsByTagName("title").item(0).getChildNodes();
System.out.println("Title : " + (title.item(0)).getNodeValue());
NodeList id = fstElmnt.getElementsByTagName("id").item(0).getChildNodes();
System.out.println("Id: " + (id.item(0)).getNodeValue());
Node tempiNode = fstElmnt.getElementsByTagName("tempi").item(0);
System.out.println("Type : " + ((Element) tempiNode).getAttribute("type"));
Node contento = tempiNode.getChildNodes().item(0);
System.out.println("Made in : " + ((Element) contento).getAttribute("madeIn"));
}
}
Running it on your XML snippet produces the following output:
Root element entry
Information of all entries
Title : FEED TITLE
Id: 5467sdad98787ad3149878sasda
Type : application/xml
Made in : USA
By the way, did you consider using something like Rome instead?
Related
I have an XML file as below. I want to get its specific child tag from the parent tag using java.
<?xml version="1.0"?>
<class>
<question id="scores">
<ans>12</ans>
<ans>32</ans>
<ans>44</ans>
</question>
<question id="ratings">
<ans>10</ans>
<ans>22</ans>
<ans>45</ans>
<ans>100</ans>
</question>
<default>
Sorry wrong
</default>
</class>
i want the function to be like this
String function(String id)
it will return the ans tag randomly
i.e if I give input id=scores, the program will look in the XML tag for scores as id and get length()of its children, in this case, 3, then retun randomly like 32 or 44 or 12.if id is not present, return default.
my code so far
public class ChatBot {
private String filepath="E:\\myfile.xml";
private File file;
private Document doc;
public ChatBot() throws SAXException, IOException, ParserConfigurationException {
file = new File("E:\\myfile.xml");
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
doc = db.parse(file);
}
String Function(String id){
// This part
return null;
}
}
As suggested by #LMC (because of org.w3c.dom.Document.getElementById() not recognizing arbitrary id attributes as IDs for getElementById() or as a browser would, mostly for HTML semantics/format), maybe:
String Function(String id) throws XPathExpressionException {
XPath xPath = XPathFactory.newInstance().newXPath();
// Be aware: id inserted without any escaping!
NodeList parents = (NodeList)xPath.evaluate("/class/question[#id='" + id + "']", doc, XPathConstants.NODESET);
if (parents.getLength() < 1) {
return null;
} else if (parents.getLength() > 1) {
// Huh, duplicates?
}
Element parent = (Element)parents.item(0);
NodeList children = parent.getChildNodes();
List<Element> answers = new ArrayList<Element>();
for (int i = 0, max = children.getLength(); i < max; i++) {
if (children.item(i).getNodeType() != Node.ELEMENT_NODE) {
continue;
}
if (children.item(i).getNodeName().equals("ans") != true) {
// Huh?
continue;
}
answers.add((Element)children.item(i));
}
if (answers.size() <= 0) {
return null;
}
int selection = (int)(Math.random() * answers.size());
return answers.get(selection).getTextContent();
}
Want to Achive:
Get an unknown XML file's Elements (Element Name, How many elements are there in the xml file).
Then get all the attributes and their name and values to use it later (eg Comparison to other xml file)
element_vs_attribute
Researched:
1. 2. 3. 4. 5.
And many more
Does Anyone have any idea for this?
I dont want to pre define more then 500 table like in the previous code snippet, somehow i should be able to get the number of elements and the element names itself dynamically.
EDIT!
Example1
<Root Attri1="" Attri2="">
<element1 EAttri1="" EAttri2=""/>
<Element2 EAttri1="" EAttri2="">
<nestedelement3 NEAttri1="" NEAttri2=""/>
</Element2>
</Root>
Example2
<Root Attri1="" Attri2="" Attr="" At="">
<element1 EAttri1="" EAttri2="">
<nestedElement2 EAttri1="" EAttri2="">
<nestedelement3 NEAttri1="" NEAttri2=""/>
</nestedElement2>
</element1>
</Root>
Program Snipet:
String Example1[] = {"element1","Element2","nestedelement3"};
String Example2[] = {"element1","nestedElement2","nestedelement3"};
for(int i=0;i<Example1.length;++){
NodeList Elements = oldDOC.getElementsByTagName(Example1[i]);
for(int j=0;j<Elements.getLength();j++) {
Node nodeinfo=Elements.item(j);
for(int l=0;l<nodeinfo.getAttributes().getLength();l++) {
.....
}
}
Output:
The expected result is to get all the Element and all the Attributes out from the XML file without pre defining anything.
eg:
Elements: element1 Element2 nestedelement3
Attributes: Attri1 Attri2 EAttri1 EAttri2 EAttri1 EAttri2 NEAttri1 NEAttri2
The right tool for this job is xpath
It allows you to collect all or some elements and attributes based on various criteria. It is the closest you will get to a "universal" xml parser.
Here is the solution that I came up with. The solution first finds all element names in the given xml doc, then for each element, it counts the element's occurrences, then collect it all to a map. same for attributes.
I added inline comments and method/variable names should be self explanatory.
import java.io.*;
import java.nio.file.*;
import java.util.*;
import java.util.function.*;
import java.util.stream.*;
import org.w3c.dom.*;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
public class TestXpath
{
public static void main(String[] args) {
XPath xPath = XPathFactory.newInstance().newXPath();
try (InputStream is = Files.newInputStream(Paths.get("C://temp/test.xml"))) {
// parse file into xml doc
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document xmlDocument = builder.parse(is);
// find all element names in xml doc
Set<String> allElementNames = findNames(xmlDocument, xPath.compile("//*[name()]"));
// for each name, count occurrences, and collect to map
Map<String, Integer> elementsAndOccurrences = allElementNames.stream()
.collect(Collectors.toMap(Function.identity(), name -> countElementOccurrences(xmlDocument, name)));
System.out.println(elementsAndOccurrences);
// find all attribute names in xml doc
Set<String> allAttributeNames = findNames(xmlDocument, xPath.compile("//#*"));
// for each name, count occurrences, and collect to map
Map<String, Integer> attributesAndOccurrences = allAttributeNames.stream()
.collect(Collectors.toMap(Function.identity(), name -> countAttributeOccurrences(xmlDocument, name)));
System.out.println(attributesAndOccurrences);
} catch (Exception e) {
e.printStackTrace();
}
}
public static Set<String> findNames(Document xmlDoc, XPathExpression xpathExpr) {
try {
NodeList nodeList = (NodeList)xpathExpr.evaluate(xmlDoc, XPathConstants.NODESET);
// convert nodeList to set of node names
return IntStream.range(0, nodeList.getLength())
.mapToObj(i -> nodeList.item(i).getNodeName())
.collect(Collectors.toSet());
} catch (XPathExpressionException e) {
e.printStackTrace();
}
return new HashSet<>();
}
public static int countElementOccurrences(Document xmlDoc, String elementName) {
return countOccurrences(xmlDoc, elementName, "count(//*[name()='" + elementName + "'])");
}
public static int countAttributeOccurrences(Document xmlDoc, String attributeName) {
return countOccurrences(xmlDoc, attributeName, "count(//#*[name()='" + attributeName + "'])");
}
public static int countOccurrences(Document xmlDoc, String name, String xpathExpr) {
XPath xPath = XPathFactory.newInstance().newXPath();
try {
Number count = (Number)xPath.compile(xpathExpr).evaluate(xmlDoc, XPathConstants.NUMBER);
return count.intValue();
} catch (XPathExpressionException e) {
e.printStackTrace();
}
return 0;
}
}
How do I list the element names at a given level in an xml schema hierarchy? The code I have below is listing all element names at every level of the hierarchy, with no concept of nesting.
Here is my xml file:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><?xml-stylesheet type="text/xsl" href="CDA.xsl"?>
<SomeDocument xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:something">
<title>some title</title>
<languageCode code="en-US"/>
<versionNumber value="1"/>
<recordTarget>
<someRole>
<id extension="998991"/>
<addr use="HP">
<streetAddressLine>1357 Amber Drive</streetAddressLine>
<city>Beaverton</city>
<state>OR</state>
<postalCode>97867</postalCode>
<country>US</country>
</addr>
<telecom value="tel:(816)276-6909" use="HP"/>
</someRole>
</recordTarget>
</SomeDocument>
Here is my java method for importing and iterating the xml file:
public static void parseFile() {
//get the factory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
//Using factory get an instance of document builder
DocumentBuilder db = dbf.newDocumentBuilder();
//parse using builder to get DOM representation of the XML file
Document dom = db.parse("D:\\mypath\\somefile.xml");
//get the root element
Element docEle = dom.getDocumentElement();
//get a nodelist of elements
NodeList nl = docEle.getElementsByTagName("*");
if (nl != null && nl.getLength() > 0) {
for (int i = 0; i < nl.getLength(); i++) {
Node node = nl.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE) {
System.out.println("node.getNodeName() is: "+node.getNodeName());
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
The output of the above program is:
title
languageCode
versionNumber
recordTarget
someRole
id
addr
streetAddressLine
city
state
postalCode
country
telecom
Instead, I would like to output the following:
title
languageCode
versionNumber
recordTarget
It would be nice to then be able to list the children of recordTarget as someRole, and then to list the children of someRole as id, addr, and telecom. And so on, but at my discretion in the code. How can I change my code to get the output that I want?
You're getting all nodes with this line:
NodeList nl = docEle.getElementsByTagName("*");
Change it to
NodeList nl = docEle.getChildNodes();
to get all of its children. Your print statement will then give you the output you're looking for.
Then, when you iterate through your NodeList, you can choose to call the same method on each Node you create:
NodeList children = node.getChildNodes();
If you want to print an XML-like structure, perhaps a recursive method that prints all child nodes is what you are looking for.
You could re-write the parseFile (I'd rather call it parseChildrenElementNames) method to take an input String that specifies the element name for which you want to print out its children element names:
public static void parseChildrenElementNames(String parentElementName) {
// get the factory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
// Using factory get an instance of document builder
DocumentBuilder db = dbf.newDocumentBuilder();
// parse using builder to get DOM representation of the XML file
Document dom = db
.parse("D:\\mypath\\somefile.xml");
// get the root element
NodeList elementsByTagName = dom.getElementsByTagName(parentElementName);
if(elementsByTagName != null) {
Node parentElement = elementsByTagName.item(0);
// get a nodelist of elements
NodeList nl = parentElement.getChildNodes();
if (nl != null) {
for (int i = 0; i < nl.getLength(); i++) {
Node node = nl.item(i);
if (node.getNodeType() == Node.ELEMENT_NODE) {
System.out.println("node.getNodeName() is: "
+ node.getNodeName());
}
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
However, this will only consider the first element that matches the specified name.
For example, to get the list of elements under the first node named someRole, you would call parseChildrenElementNames("someRole"); which would print out:
node.getNodeName() is: id
node.getNodeName() is: addr
node.getNodeName() is: telecom
If I have an XML document like below:
<foo>
<foo1>Foo Test 1</foo1>
<foo2>
<another1>
<test10>This is a duplicate</test10>
</another1>
</foo2>
<foo2>
<another1>
<test1>Foo Test 2</test1>
</another1>
</foo2>
<foo3>Foo Test 3</foo3>
<foo4>Foo Test 4</foo4>
</foo>
How do I get the XPath of <test1> for example? So the output should be something like: foo/foo2[2]/another1/test1
I'm guessing the code would look something like this:
public String getXPath(Document document, String xmlTag) {
String xpath = "";
...
//Get the node from xmlTag
//Get the xpath using the node
return xpath;
}
Let's say String XPathVar = getXPath(document, "<test1>");. I need to get back an absolute xpath that will work in the following code:
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression xpr = xpath.compile(XPathVar);
xpr.evaluate(Document, XPathConstants.STRING);
But it can't be a shortcut like //test1 because it will also be used for meta data purposes.
When printing the result out via:
System.out.println(xpr.evaluate(Document, XPathConstants.STRING));
I should get the node's value. So if XPathVar = foo/foo2[2]/another1/test1 then I should get back:
Foo Test 2 and not This is a duplicate
You don't 'get' an xpath in the same way you don't 'get' sql.
An xpath is a query you write based on your understanding of an xml document or schema, just as sql is a query you write based on your understanding of a database schema - you don't 'get' either of them.
I would be possible to generate xpath statements from the DOM simply by walking back up the nodes from a given node, though to do this generically enough, taking into account attribute values on each node, would make the resulting code next to useless. For example (which comes with a warning that this will find the first node that has a given name, xpath is much more that this and you may as well just use the xpath //foo2):
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
public class XPathExample
{
private static String getXPath(Node root, String elementName)
{
for (int i = 0; i < root.getChildNodes().getLength(); i++)
{
Node node = root.getChildNodes().item(i);
if (node instanceof Element)
{
if (node.getNodeName().equals(elementName))
{
return "/" + node.getNodeName();
}
else if (node.getChildNodes().getLength() > 0)
{
String xpath = getXPath(node, elementName);
if (xpath != null)
{
return "/" + node.getNodeName() + xpath;
}
}
}
}
return null;
}
private static String getXPath(Document document, String elementName)
{
return document.getDocumentElement().getNodeName() + getXPath(document.getDocumentElement(), elementName);
}
public static void main(String[] args)
{
try
{
Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(
new ByteArrayInputStream(
("<foo><foo1>Foo Test 1</foo1><foo2><another1><test1>Foo Test 2</test1></another1></foo2><foo3>Foo Test 3</foo3><foo4>Foo Test 4</foo4></foo>").getBytes()
)
);
String xpath = "/" + getXPath(document, "test1");
System.out.println(xpath);
Node node1 = (Node)XPathFactory.newInstance().newXPath().compile(xpath).evaluate(document, XPathConstants.NODE);
Node node2 = (Node)XPathFactory.newInstance().newXPath().compile("//test1").evaluate(document, XPathConstants.NODE);
//This evaluates to true, hence you may as well just use the xpath //test1.
System.out.println(node1.equals(node2));
}
catch (Exception e)
{
e.printStackTrace();
}
}
}
Likewise you could write an XML transformation that turned an xml document into a series of xpath statements but this transformation would be more complicated that writing the xpath in the first place and so largely pointless.
How's this:
private static String getXPath(Document root, String elementName)
{
try{
XPathExpression expr = XPathFactory.newInstance().newXPath().compile("//" + elementName);
Node node = (Node)expr.evaluate(root, XPathConstants.NODE);
if(node != null) {
return getXPath(node);
}
}
catch(XPathExpressionException e) { }
return null;
}
private static String getXPath(Node node) {
if(node == null || node.getNodeType() != Node.ELEMENT_NODE) {
return "";
}
return getXPath(node.getParentNode()) + "/" + node.getNodeName();
}
Note that this is first locating the node (using XPath) and then using the located node to get its XPath. Quite the roundabout approach to get a value you already have.
Working ideone example: http://ideone.com/EL4783
I am using DOM to parse an XML string as in the following example. This works great except in one instance. The document which I am trying to parse looks like this:
<response requestID=\"1234\">
<expectedValue>Alarm</expectedValue>
<recommendations>For steps on how to resolve visit Website and use the search features for \"Alarm\"<recommendations>
<setting>Active</setting>
<response>
The code I used to parse the XML is as follows:
try {
DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(xmlResult));
Document doc = db.parse(is);
NodeList nlResponse = doc.getElementsByTagName("response");
String[] String = new String[3]; //result entries
for (int i = 0; i < nlResponse.getLength(); i++) {
Element e = (Element) nlResponse.item(i);
int c1 = 0; //count for string array
NodeList ev = e.getElementsByTagName("expectedValue");
Element line = (Element) ev.item(0);
String[c1] = (getCharacterDataFromElement(line));
c1++;
NodeList rec = e.getElementsByTagName("recommendations");
line = (Element) rec.item(0);
String[c1] = (getCharacterDataFromElement(line));
c1++;
NodeList set = e.getElementsByTagName("settings");
line = (Element) set.item(0);
String[c1] = (getCharacterDataFromElement(line));
c1++;
I am able to parse the code and put the result into a string array (as opposed to the System.out.println()). With the current code, my string array looks as follows:
String[0] = "Alarm"
String[1] = "For steps on how to resolve visit"
String[2] = "Active"
I would like some way of being able to read the rest of the information within "Recommendations" in order to ultimately display the hyperlink (along with other output) in a TextView. How can I do this?
I apologize for my previous answer in assuming your xml was ill-formed.
I think what is happening is that your call to the getCharacterDataFromElement is only looking at the first child node for text, when it will need to look at all the child nodes and getting the href attribute as well as the text for the 2nd child node when looking at the recommendations node.
e.g. after getting the Element for recommendation
String srec = "";
NodeList nl = line.getChildNodes();
srec += nl.item(0).getTextContent();
Node n = nl.item(1);
NamedNodeMap nm = n.getAttributes();
srec += "" + n.getTextContent() + "";
srec += nl.item(2).getTextContent();
String[c1] = srec;