I have a simple XML file and I want to read its attributes. There are a few examples on the web, but I still don't understand why I get 17 nodes when I see only 4. I even tried counting the places where I think text could be, but I still don't arrive at that number, unless it is the length of the output. As a result, I don't know how to get the Name attribute of every tag3.
<?xml version="1.0" encoding="UTF-8"?>
<tag1 xmlns="something">
<xxxxxx-Set>
<tag3 Name="a"/>
<tag3 Name="b"/>
<tag3 Name="c"/>
<tag3 Name="d"/>
</xxxxxx-Set>
<tagB>
<tag3 Name="a"/>
<tag3 Name="b"/>
<tag3 Name="c"/>
<tag3 Name="d"/>
</tagB>
</tag1>
This is my Java code:
import java.io.File;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class ParseXML {
public static void main(String[] args) {
try {
File test= new File("test.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(test);
NodeList tagAs= doc.getElementsByTagName("xxxxxx-Set").item(0).getChildNodes(); //should be all the tag3 elements?
for(int i = 0; i < tagAs.getLength(); i++) {
System.out.println(tagAs);
System.out.println(i);
}
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
Note: adding .getAttributes().getNamedItem("Name").getNodeValue() to the print statement throws a NullPointerException.
And the output is:
[xxxxxx-Set: null]
0
[xxxxxx-Set: null]
1
...
[xxxxxx-Set: null]
16
If you want to read all the Name attributes (it's better to name attributes in lower case), use the following approach:
Element xSet = (Element) doc.getElementsByTagName("xxxxxx-Set").item(0);
NodeList xSetTags = xSet.getElementsByTagName("tag3");
for(int i = 0; i < xSetTags.getLength(); i++) {
Element tag3 = (Element) xSetTags.item(i);
System.out.println(tag3.getAttribute("Name"));
}
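I did this with the org.w3c.dom.Element class. Working with org.w3c.dom.Node directly is not the best idea, because Node represents not only XML elements but also attributes, comments, text and other node types. That is also why getChildNodes() returns more than four nodes: the whitespace between the tags comes back as text nodes, and getAttributes() returns null for those. See the documentation for the difference between the Node and Element classes.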
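To illustrate the Node versus Element point, here is a minimal sketch that keeps getChildNodes() but skips everything that is not an element. The class name ChildElementNames and the shortened inline XML are assumptions made for the example; only the JDK's own DOM classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ChildElementNames {
    // Same shape as the XML in the question, shortened to two tag3 entries (illustrative data).
    static final String XML =
          "<tag1 xmlns=\"something\">\n"
        + "  <xxxxxx-Set>\n"
        + "    <tag3 Name=\"a\"/>\n"
        + "    <tag3 Name=\"b\"/>\n"
        + "  </xxxxxx-Set>\n"
        + "</tag1>";

    // Collects the Name attribute of every element child of the first xxxxxx-Set.
    static List<String> names(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList children = doc.getElementsByTagName("xxxxxx-Set").item(0).getChildNodes();
        List<String> names = new ArrayList<>();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Whitespace between tags arrives as text nodes; only elements carry attributes.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                names.add(((Element) child).getAttribute("Name"));
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(names(XML));
    }
}
```

The node type check is what prevents the NullPointerException from the question: text nodes never reach the attribute lookup.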
My objective is to create a reusable XML parsing class whose return type could be an array or an ArrayList.
My code works, but I could not make the class/method reusable because I could not get the array or ArrayList return type to work.
1) I have created an XML file as follows:
<SearchStrings>
<Search id="1111" type="high">
<Questions>What is software Testing?</Questions>
<Tips>How to connect database with eclipse ?</Tips>
<Multiple>Who was the first prime minister of India? </Multiple>
<Persons>Who is Dr.APJ Abdul Kalam </Persons>
</Search>
<Search id="2222" type="low">
<Questions>What is Automation Testing?</Questions>
<Tips>How to use selenium webdriver </Tips>
<Multiple>Who was the fourth prime minister of India? </Multiple>
<Persons>Who is Superman? </Persons>
</Search>
<Search id="3333" type="medium">
<Questions>What is Selenium ide Testing?</Questions>
<Tips>How to use selenium webdriver with eclipse ? </Tips>
<Multiple>Who was the ninth prime minister of India? </Multiple>
<Persons>Who is Spiderman? </Persons>
</Search>
<Search id="4444" type="optional">
<Questions>What is database Testing?</Questions>
<Tips>How to use Class in java ? </Tips>
<Multiple>Who was the eight prime minister of India? </Multiple>
<Persons>Who is motherindia? </Persons>
</Search>
</SearchStrings>
2) I am creating a class that fetches all the nodes of a tag at once, stores them in a String[] SearchStrings, and then uses this array to type each value into the Google search box with .sendKeys(value).
Simplified:
1) Store the text of the tag elements in a reusable data type. My knowledge is limited, so I am using a String array.
2) Fetch the String array elements and search for them on Google using .sendKeys(element).
My code is as below:
package searchexperiment;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
public class Experiment implements Paths
{
public static WebDriver driver;
static Document document;
static DocumentBuilder db;
public static void main(String args[])
{
String[] SearchStrings;
driver=new FirefoxDriver();
driver.manage().timeouts().implicitlyWait(50, TimeUnit.SECONDS);
driver.get("https://www.google.com/");
//loading xml as test data
WebElement googlebox=driver.findElement(By.id("gbqfq"));
try {
FileInputStream file = new FileInputStream(new File(test_xml));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(file);
XPath xPath = XPathFactory.newInstance().newXPath();
System.out.println("*************************");
String expression="/SearchStrings/Search/Questions";
System.out.println("This is ordered expression \n"+expression);
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
for(int i=0;i< nodeList.getLength();i++)
{
// Node nNode = emailNodeElementList.item(j);
// Element eElement = (Element) nNode;
System.out.println("Taking the loop value");
// below push is not working.
Object array = push(SearchStrings[i],nodeList.item(i).getFirstChild().getNodeValue());
String text=nodeList.item(i).getFirstChild().getNodeValue();
googlebox.clear();
googlebox.sendKeys(text);
System.out.println("Closing the loop value");
}
I am using the String array in order to make the XML parsing class reusable.
I have used an interface to hold the file name:
public interface Paths {
String test_xml="XML/Searchtext.xml";
}
The reusable class and method ended up as:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;
import searchexperiment.Paths;
public class DocBuilderClass implements Paths
{
public static String[] username()
{
String[] SearchElements=new String[4];
try
{
FileInputStream file = new FileInputStream(new File(test_xml));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(file);
XPath xPath = XPathFactory.newInstance().newXPath();
System.out.println("*************************");
String expression="/SearchStrings/Search/Tips";
System.out.println("This is ordered expression \n"+expression);
NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
//int size=
for(int i=0;i< nodeList.getLength();i++)
{
// Node nNode = emailNodeElementList.item(j);
// Element eElement = (Element) nNode;
System.out.println("Taking the loop value");
//Object array = push(SearchStrings[i],nodeList.item(i).getFirstChild().getNodeValue());
String text=nodeList.item(i).getFirstChild().getNodeValue();
//googlebox.clear();
// googlebox.sendKeys(text);
SearchElements[i]=text;
System.out.println("Closing the loop value");
}
}
catch(Exception ex)
{
System.out.println("This is a exception" + ex);
}
finally
{
}
return SearchElements;
}
}
and the way to call the class was as follows:
String [] namelist=DocBuilderClass.username();
for(int i=0;i<namelist.length;i++)
{
String abc=namelist[i];
googlebox.sendKeys(abc);
googlebox.clear();
googlebox.sendKeys(namelist[i]);
}
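The fixed-size String[4] above only fits a file with exactly four Search entries. As a hedged alternative, the sketch below returns a List<String> sized by whatever the XPath query finds, and takes the source and the expression as parameters so the same method can serve Questions, Tips and the other tags. The class name XmlTextReader, the method name readTexts and the inline sample data are assumptions made for this example, not part of the original code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlTextReader {
    // Returns the trimmed text of every node matched by the XPath expression.
    static List<String> readTexts(InputSource source, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(source);
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
        // The list grows with the result, so no size has to be guessed up front.
        List<String> texts = new ArrayList<>(nodes.getLength());
        for (int i = 0; i < nodes.getLength(); i++) {
            texts.add(nodes.item(i).getTextContent().trim());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        // Illustrative data in the same shape as the Searchtext.xml shown earlier.
        String xml = "<SearchStrings>"
                + "<Search id=\"1111\" type=\"high\"><Tips>How to connect database with eclipse ?</Tips></Search>"
                + "<Search id=\"2222\" type=\"low\"><Tips>How to use selenium webdriver </Tips></Search>"
                + "</SearchStrings>";
        for (String tip : readTexts(new InputSource(new StringReader(xml)), "/SearchStrings/Search/Tips")) {
            System.out.println(tip);
        }
    }
}
```

To read from a file instead, the caller could pass new InputSource(new FileInputStream(test_xml)); the loop over the returned list is where googlebox.sendKeys(...) would go.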
The references I used were one on String[] arrays and one on XML parsing.
What I learned from this: your basics need to be strong before you can solve complex problems.
I'm having a problem correctly calling getAttributeNS() (and other NS methods) from Java DOM. First, here is my sample XML doc:
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book xmlns:c="http://www.w3schools.com/children/" xmlns:foo="http://foo.org/foo" category="CHILDREN">
<title foo:lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
</bookstore>
And here is my little Java class that uses DOM and calls getAttributeNS:
package com.mycompany.proj;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Element;
import java.io.File;
public class AttributeNSProblem
{
public static void main(String[] args)
{
try
{
File fXmlFile = new File("bookstore_ns.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
System.out.println("Root element: " + doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("title");
Element elem = (Element)nList.item(0);
String lang = elem.getAttributeNS("http://foo.org/foo", "lang");
System.out.println("title lang: " + lang);
lang = elem.getAttribute("foo:lang");
System.out.println("title lang: " + lang);
}
catch (Exception e)
{
e.printStackTrace();
}
}
}
When I call getAttributeNS("http://foo.org/foo", "lang"), it returns an empty String. I've also tried getAttributeNS("foo", "lang"), same result.
What's the proper way to retrieve the value of an attribute qualified by a namespace?
Thanks.
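Immediately after DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();, add dbFactory.setNamespaceAware(true);. The factory is not namespace-aware by default, so without this call the NS methods cannot match the attribute by its namespace URI.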
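A minimal sketch of that fix, assuming a trimmed-down inline copy of the bookstore document instead of bookstore_ns.xml. The class name NamespaceAwareAttribute and the boolean parameter are only there so both settings can be compared side by side.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class NamespaceAwareAttribute {
    // Trimmed-down version of the bookstore document from the question.
    static final String XML =
          "<bookstore>"
        + "<book xmlns:foo=\"http://foo.org/foo\" category=\"CHILDREN\">"
        + "<title foo:lang=\"en\">Harry Potter</title>"
        + "</book>"
        + "</bookstore>";

    // Looks up foo:lang on the first title element by namespace URI and local name.
    static String lang(String xml, boolean namespaceAware) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Must be set before newDocumentBuilder(); the default is false.
        factory.setNamespaceAware(namespaceAware);
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        Element title = (Element) doc.getElementsByTagName("title").item(0);
        return title.getAttributeNS("http://foo.org/foo", "lang");
    }

    public static void main(String[] args) throws Exception {
        System.out.println("namespace-aware:     '" + lang(XML, true) + "'");
        System.out.println("not namespace-aware: '" + lang(XML, false) + "'");
    }
}
```

Note that the first argument to getAttributeNS is the namespace URI, not the prefix, so getAttributeNS("foo", "lang") would not match in either mode.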
URL: http://ws.audioscrobbler.com/2.0/?method=chart.gethypedtracks&api_key=1732077d6772048ccc671c754061cb18&limit=10
From the above URL I need to extract the artist name and the track name for each song in the XML it returns, but I have no idea how to work with an XML file structured in this way.
Any help or pointers would be very much appreciated!
Thanks,
Ross
Here's a fully working class that loads the URL you indicated and extracts the track and artist names.
Basically it reads the XML into a Document and runs two XPath queries in loops to get the data you want.
The document itself is simple XML; if you reformat it, it looks like:
<?xml version="1.0" encoding="utf-8"?>
<lfm status="ok">
<tracks page="1" perPage="10" totalPages="50" total="500">
<track>
<name>Hysterical</name>
<duration>231</duration>
<percentagechange>3626</percentagechange>
<mbid/>
<url>http://www.last.fm/music/Clap+Your+Hands+Say+Yeah/_/Hysterical</url>
<streamable fulltrack="0">0</streamable>
<artist>
<name>Clap Your Hands Say Yeah</name>
...
All I did to clean it up was run it through a re-formatter like xmlstarlet, as I mentioned in my comment. Note: you don't have to reformat it for Java to read it, as long as it's well formed. Reformatting only makes it human readable.
The first XPath query gets the track name using the path lfm/tracks/track/name. You can use an online XPath tester to try out your XPath queries (you can paste your XML in and it will reformat it too). If you don't understand XPath, there are many resources on the net.
The second XPath query works relative to the current track name node: it looks for a following-sibling node named artist with a name child node, and the code then prints the text of that node.
Here's the code:
package net.fish;
import java.net.URL;
import java.net.URLConnection;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class ParseXML {
private static final DocumentBuilderFactory DOCUMENT_BUILDER_FACTORY = DocumentBuilderFactory.newInstance();
private static final XPathFactory XPATH_FACTORY = XPathFactory.newInstance();
public static void main(String[] args) throws Exception {
new ParseXML().parseXml("http://ws.audioscrobbler.com/2.0/?method=chart.gethypedtracks&api_key=1732077d6772048ccc671c754061cb18&limit=10");
}
private void parseXml(String urlPath) throws Exception {
URL url = new URL(urlPath);
URLConnection connection = url.openConnection();
DocumentBuilder db = DOCUMENT_BUILDER_FACTORY.newDocumentBuilder();
final Document document = db.parse(connection.getInputStream());
XPath xPathEvaluator = XPATH_FACTORY.newXPath();
XPathExpression nameExpr = xPathEvaluator.compile("lfm/tracks/track/name");
// Compile the relative artist query once instead of on every loop iteration.
XPathExpression artistNameExpr = xPathEvaluator.compile("following-sibling::artist/name");
NodeList trackNameNodes = (NodeList) nameExpr.evaluate(document, XPathConstants.NODESET);
for (int i = 0; i < trackNameNodes.getLength(); i++) {
Node trackNameNode = trackNameNodes.item(i);
System.out.println(String.format("Track Name: %s" , trackNameNode.getTextContent()));
NodeList artistNameNodes = (NodeList) artistNameExpr.evaluate(trackNameNode, XPathConstants.NODESET);
for (int j=0; j < artistNameNodes.getLength(); j++) {
System.out.println(String.format(" - Artist Name: %s", artistNameNodes.item(j).getTextContent()));
}
}
}
}
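For trying the relative-query idea without a network call, here is a small variant that works on an inline string. It is a sketch rather than the code above: it selects each track element once and then evaluates two relative paths against it, which is one of several ways to pair a track with its artist. The class name TrackArtistPairs and the one-track sample string are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TrackArtistPairs {
    // One track in the same structure as the reformatted response shown earlier.
    static final String XML =
          "<lfm status=\"ok\"><tracks>"
        + "<track><name>Hysterical</name><duration>231</duration>"
        + "<artist><name>Clap Your Hands Say Yeah</name></artist></track>"
        + "</tracks></lfm>";

    // Returns one "track - artist" string per track element.
    static List<String> pairs(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        XPathExpression trackExpr = xpath.compile("/lfm/tracks/track");
        // Relative paths: evaluated with each track element as the context node.
        XPathExpression nameExpr = xpath.compile("name");
        XPathExpression artistExpr = xpath.compile("artist/name");

        NodeList tracks = (NodeList) trackExpr.evaluate(doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < tracks.getLength(); i++) {
            Node track = tracks.item(i);
            // evaluate(Object) returns the string value of the first matching node.
            result.add(nameExpr.evaluate(track) + " - " + artistExpr.evaluate(track));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        pairs(XML).forEach(System.out::println);
    }
}
```

Because "name" is evaluated relative to the track element, it matches only the track's own name child and not the name nested inside artist.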
Scenario:
Given the following XML file:
<a:root
xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
aaaaaaaaaaaaaa
</a:root>
How do I extract the text inside the main element <a:root>:
"\naaaaaaaaaaaaaa\n"
The code I have right now is:
import java.io.File;
import java.util.Stack;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
public class Proof {
public static void main(String[] args) {
Document doc = null;
DocumentBuilderFactory dbf = null;
DocumentBuilder docBuild = null;
try {
dbf = DocumentBuilderFactory.newInstance();
docBuild = dbf.newDocumentBuilder();
doc = docBuild.parse(new File("test2.xml"));
System.out.println(doc.getFirstChild().getTextContent());
} catch(Exception e) {
e.printStackTrace();
}
}
}
But it returns the text I want ("aaaaaaaaaaaaaa") plus the inner text of all the other elements. Output:
Apples
Bananas
African Coffee Table
80
120
aaaaaaaaaaaaaa
The requirement is not to use any additional XML Java library!
The answer by @Kirill Polishchuk is not correct.
The proposed:
a:root/text()
is a relative expression, and unless it is evaluated with the root (/) node as the context node, it selects nothing in the provided XML document.
Even the XPath expression /a:root/text() is incorrect, because it selects three text nodes (all the text node children of the top element), including two whitespace-only text nodes.
Here is a correct XPath solution:
/a:root/text()[string-length(normalize-space()) > 0]
When this XPath expression is applied to the provided XML document (corrected to be well-formed):
<a:root
xmlns:a="UNDEFINED !!!!"
xmlns:h="http://www.w3.org/TR/html4/"
xmlns:f="http://www.w3schools.com/furniture">
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
aaaaaaaaaaaaaa
</a:root>
It selects the last (and only non-whitespace-only) text node child of the top element, as required:
aaaaaaaaaaaaaa
XSLT-based verification:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:a="UNDEFINED !!!!"
>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:text>"</xsl:text>
<xsl:copy-of select=
"/a:root/text()
[string-length(normalize-space()) > 0]"/>"
</xsl:template>
</xsl:stylesheet>
When this transformation is applied to the provided XML document (above), the wanted, correctly selected text node is output:
"
aaaaaaaaaaaaaa
"
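Since the question rules out extra libraries, the same expression can also be evaluated with the javax.xml.xpath classes that ship with the JDK. The following is a sketch under stated assumptions: the a prefix is bound to a made-up URI (urn:example:a), the sample document is shortened, and the class name RootTextXPath is hypothetical. The parser has to be namespace-aware and the XPath object needs a NamespaceContext that maps the a prefix to the same URI.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class RootTextXPath {
    // Assumed namespace URI for the "a" prefix; the original document leaves it undefined.
    static final String A_NS = "urn:example:a";

    static final String XML =
          "<a:root xmlns:a=\"" + A_NS + "\" xmlns:f=\"http://www.w3schools.com/furniture\">\n"
        + "<f:table><f:name>African Coffee Table</f:name></f:table>\n"
        + "aaaaaaaaaaaaaa\n"
        + "</a:root>";

    // Returns the first non-whitespace text node child of a:root, line breaks included.
    static String rootText(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for prefixed names in XPath
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "a".equals(prefix) ? A_NS : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String uri) {
                return A_NS.equals(uri) ? "a" : null;
            }
            @Override public Iterator<String> getPrefixes(String uri) {
                return A_NS.equals(uri)
                        ? Collections.singletonList("a").iterator()
                        : Collections.<String>emptyIterator();
            }
        });
        return (String) xpath.evaluate(
                "/a:root/text()[string-length(normalize-space()) > 0]",
                doc, XPathConstants.STRING);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("\"" + rootText(XML) + "\"");
    }
}
```

Requesting XPathConstants.STRING returns the string value of the first selected node, which is enough here because the predicate leaves only one text node.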
You can use XPath: a:root/text()
Use this
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class Proof {
public static void main(String[] args) {
Document doc = null;
DocumentBuilderFactory dbf = null;
DocumentBuilder docBuild = null;
try {
dbf = DocumentBuilderFactory.newInstance();
docBuild = dbf.newDocumentBuilder();
doc = docBuild.parse(new File("test2.xml"));
Element x= doc.getDocumentElement();
// Direct children of the root: the elements and the text between them.
NodeList m=x.getChildNodes();
for(int i=0;i<m.getLength();i++){
Node it=m.item(i);
// Keep text nodes only, skipping the whitespace-only ones between elements.
if(it.getNodeType()==Node.TEXT_NODE && !it.getNodeValue().trim().isEmpty()){
System.out.println(it.getNodeValue());
}
}
} catch(Exception e) {
e.printStackTrace();
}
}
}
I am parsing this XML file:
<?xml version="1.0" encoding="UTF-8"?>
<tests>
<test category="Русский"/>
<test category="ελληνικά"/>
<test category="中文"/>
<test category="English"/>
</tests>
Main class is:
import java.io.File;
import java.io.FileInputStream;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
public class TestUnicode {
public static void main(String[] args) throws Exception {
XPath xpath = XPathFactory.newInstance().newXPath();
XPathExpression lolwhy = xpath.compile("//test");
final InputSource inputSource =
new InputSource(
new FileInputStream(
new File("sample.xml")));
NodeList parent = (NodeList) lolwhy.evaluate(
inputSource,
XPathConstants.NODESET);
System.out.println(parent.getLength());
for (int i = 0; i < parent.getLength(); i++) {
System.out.println(parent.item(i).getAttributes().
getNamedItem("category").getNodeValue());
}
}
}
And the output is:
4
???????
????????
??
English
What am I doing wrong here?
EDIT: OK, this issue turned out to be the same as "Hebrew appears as question marks in NetBeans", and the solution is described in "Setting the default Java character encoding?".
It could be that the parsing is OK but the output is wrong.
If you used a font that doesn't contain those characters, or if you write the values to HTML but specify the wrong encoding, this can be the result.
The font issue is the more likely one.
System.out.println is the culprit.
See if this helps
http://hints.macworld.com/article.php?story=20050208053951714
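One way to take the platform default encoding out of the picture is to wrap System.out in a PrintStream with an explicit charset. This is a sketch, not a guaranteed fix: the console, IDE output window and font still have to be able to display UTF-8. The class name Utf8Output and the helper method encode are made up for the example, and the sample word is written with Unicode escapes so the source file's own encoding does not matter.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class Utf8Output {
    // "Русский" written with escapes so the source file encoding cannot corrupt it.
    static final String SAMPLE = "\u0420\u0443\u0441\u0441\u043a\u0438\u0439";

    // Encodes text through a PrintStream that is explicitly configured for UTF-8.
    static byte[] encode(String text) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(buffer, true, "UTF-8")) {
            out.print(text);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Same idea applied to the console instead of an in-memory buffer.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(SAMPLE);
    }
}
```

If the bytes produced this way are correct but the console still shows question marks, the remaining problem is on the display side (console encoding or font), as the answers above suggest.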