reading xml - java, dom

I have a problem reading data from XML using DOM. I don't know why "System.out.println(nNode.getChildNodes().item(0).hasAttributes());" prints false. In my XML file this node has attributes. Could you help me, please?
This is my code:
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public class XmlParser {
private String[] linia;
private String[] wariant;
private String[] przystanek;
private String[] tabliczka;
private String[] dzien;
private String[] godz;
private String[] min;
public void readXml() {
try {
File fXmlFile = new File("c:\\file.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
System.out.println("Root element :"
+ doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("linia");
System.out.println("-----------------------");
Node nNode = nList.item(0);
linia = new String[nNode.getAttributes().getLength()];
System.out.println(nNode.getAttributes().getLength());
int i = 0;
while (i < nNode.getAttributes().getLength()) {
linia[i] = nNode.getAttributes().item(i) + "";
System.out.print(linia[i] + " ");
i++;
}
wariant = new String[nNode.getChildNodes().getLength()];
System.out.println();
System.out.println(nNode.getChildNodes().getLength());
System.out.println(nNode.getNodeName());
int j = 0;
System.out.println(nNode.getChildNodes().item(0).hasAttributes());
while (j < nNode.getChildNodes().getLength()) {
wariant[j] = nNode.getChildNodes().item(j).getAttributes()
.item(0)
+ "";
// if(wariant[j].toString()!=null)
System.out.println(" " + wariant[j]);
j++;
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

Have you checked the child node at index 1? My guess is that your parser keeps the characters between tags (newlines, tabs, spaces) as text nodes. Text nodes have no attributes, so hasAttributes() returns false for them and getAttributes() returns null.
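To illustrate the point about whitespace between tags, here is a minimal sketch that keeps only the element children of a node. The class name ChildElements, the helper names and the sample attributes (nr, id) are assumptions made up for this example; the asker's real file may look different.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class ChildElements {
    // Returns only the children that are elements, skipping text nodes
    // (including the whitespace-only ones between tags).
    static List<Element> childElements(Node parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result.add((Element) child);
            }
        }
        return result;
    }

    // Parses an XML string so that the example does not need a file on disk.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical document; the attribute names are invented for illustration.
        String xml = "<linia nr=\"1\">\n  <wariant id=\"a\"/>\n  <wariant id=\"b\"/>\n</linia>";
        Node linia = parse(xml).getElementsByTagName("linia").item(0);
        // Child 0 is the newline and indentation after <linia ...>, which is a text node.
        System.out.println(linia.getChildNodes().item(0).getNodeType() == Node.TEXT_NODE);
        for (Element wariant : childElements(linia)) {
            System.out.println(wariant.getAttribute("id"));
        }
    }
}
```

With a filter like this, the loop in the question would only call getAttributes() on real elements and would no longer hit the text nodes.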

Related

Java XPath - find tags prefixed with

I have the following HTML:
<data-my-tag>
<data-another-tag>
... content ...
</data-another-tag>
<data-my-tag>
... content ...
</data-my-tag>
</data-my-tag>
Now I need to find all tags whose names start with the prefix <data-. I need their names and also their contents. I know this cannot be done with a regex, so I started working with javax.xml.parsers. It is easy to find tags by a particular name, but I am unable to find tags that start with a given prefix.
What expression or code finds tags starting with a prefix?

You can use XPath's starts-with function:
public void findElements(InputSource source,
String prefix) {
try {
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList matches = (NodeList) xpath.evaluate(
"//*[starts-with(local-name(), '" + prefix + "')]",
source, XPathConstants.NODESET);
int count = matches.getLength();
for (int i = 0; i < count; i++) {
Node match = matches.item(i);
System.out.println("Element: " + match.getNodeName());
System.out.println("Text: " + match.getTextContent().trim());
System.out.println();
}
} catch (XPathException e) {
throw new RuntimeException(e);
}
}

Could we use something like this? Note that it only looks at the direct children of the root element:
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.File;
public class Demo {
public static void main(String[] args) {
try {
File inputFile = new File("input.txt");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(inputFile);
doc.getDocumentElement().normalize();
NodeList nList = doc.getDocumentElement().getChildNodes();
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
// Both conditions must hold, and node names do not include the "<" character
if (nNode.getNodeType() == Node.ELEMENT_NODE && nNode.getNodeName().startsWith("data-")) {
System.out.println(nNode.getTextContent());
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
}

How can I grab api data from an xml where the tag name changes

I'm trying to build a five-day forecast with the openweathermap API. It returns the five-day forecast as XML.
I've been trying to get the info using the code NodeList nodeList = doc.getElementsByTagName("time");. If you check the XML you'll see that each <time> tag contains the forecast for a 3-hour range. The problem is that I can't seem to grab anything from inside those tags, since the tag actually looks like <time from="date and time" to="date and time + 3hrs">.
try {
DocumentBuilderFactory dbFactory =
DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(this.fiveDayForecastURL);
doc.getDocumentElement().normalize();
NodeList nodeList = doc.getElementsByTagName("time");
for (int i = 0; i < nodeList.getLength(); i++){
Node node = nodeList.item(i);
NamedNodeMap namedNodeMap = node.getAttributes();
Node attr = namedNodeMap.getNamedItem("max");
// just trying to grab anything from inside these tags
// but ideally would want min and max temp for the range
if (node.getNodeType() == Node.ELEMENT_NODE){
System.out.println(attr);
// always prints null
}
System.out.println(node);
// always prints [time: null]
}
} catch (ParserConfigurationException | IOException ex) {
ex.printStackTrace();
} catch (org.xml.sax.SAXException e) {
e.printStackTrace();
}
I'm sure I'm missing some lines of code or something, but is there a way to grab everything between the time tags even though the tag names seem to change every time? Thanks.

First off, I forgot how cumbersome it is to parse XML with a DOM parser.
Have you considered requesting the returned data as JSON that you can then parse with gson?
So - you're on the right track, but in order to get the min/max temperature for a given time period, you need to keep digging down in the DOM hierarchy.
temperature is a child element of time, so you'll need to grab it, then get the min and max attribute values off of it.
Something like:
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import java.io.IOException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.NamedNodeMap;
public class Test {
private static final String FIVE_DAY_FORECAST_URL =
"https://api.openweathermap.org/data/2.5/forecast?q=Denver&appid=8984d739fa91d7031fff0e84a3d2c520&mode=xml&units=imperial";
private static final String TIME_ELEM = "time";
private static final String TEMPERATURE_ELEM = "temperature";
private static final String TIME_FROM_ATTR = "from";
private static final String TIME_TO_ATTR = "to";
private static final String TEMPERATURE_MIN_ATTR = "min";
private static final String TEMPERATURE_MAX_ATTR = "max";
private static void getWeatherForcast() {
try {
final DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
final DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
final Document doc = dBuilder.parse(FIVE_DAY_FORECAST_URL);
doc.getDocumentElement().normalize();
NodeList nodeList = doc.getElementsByTagName(TIME_ELEM);
for (int i = 0; i < nodeList.getLength(); i++) {
final Node node = nodeList.item(i);
final NamedNodeMap namedNodeMap = node.getAttributes();
final Node fromAttr = namedNodeMap.getNamedItem(TIME_FROM_ATTR);
final Node toAttr = namedNodeMap.getNamedItem(TIME_TO_ATTR);
System.out.println("Time: " + fromAttr + " " + toAttr);
final NodeList timeChildren = node.getChildNodes();
for (int j = 0; j < timeChildren.getLength(); j++) {
final Node timeChild = timeChildren.item(j);
if (TEMPERATURE_ELEM.equals(timeChild.getNodeName())) {
final NamedNodeMap temperatureAttrMap = timeChild.getAttributes();
final String minTemp = temperatureAttrMap.getNamedItem(TEMPERATURE_MIN_ATTR).getNodeValue();
final String maxTemp = temperatureAttrMap.getNamedItem(TEMPERATURE_MAX_ATTR).getNodeValue();
System.out.println("min: " + minTemp + " max: " + maxTemp);
}
}
}
} catch (ParserConfigurationException | IOException ex) {
ex.printStackTrace();
} catch (org.xml.sax.SAXException e) {
e.printStackTrace();
}
}
public static void main(String[] args) {
getWeatherForcast();
}
}

Task from java web services and XML

This is the task from a Java web services and XML course:
Create a translation service.
The client should call the service method as follows:
getWord ("automobil", "russian", "polish")
The first parameter is the word to translate, the second is the source language, and the third is the target language.
The method should return a string with the matching word, or with several words separated by commas if there are synonyms.
As its data source, the service should use XML documents (the system may contain only a few words, enough to test the functionality).
This is my Java code:
package xmlparsiranje;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import java.io.File;
import java.util.Scanner;
public class Xmlparsiranje {
public static void main(String[] argv) throws Exception {
// try {
File fXmlFile = new File("C:\\zaTestiranje.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
Scanner word = new Scanner(System.in);
System.out.println("Input word: ");
String rijec = word.nextLine();
Scanner izvoriste = new Scanner(System.in);
System.out.println("Izvorni: ");
String izvorni = izvoriste.nextLine();
Scanner Odrediste = new Scanner(System.in);
System.out.println("Odrediste: ");
String odredisni = Odrediste.nextLine();
NodeList nList = doc.getElementsByTagName("word");
// System.out.println(odredisni);
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
Element eElement = (Element) nNode;
NodeList engleski = eElement.getElementsByTagName("english");
NodeList ruski = eElement.getElementsByTagName("russian");
NodeList poljski = eElement.getElementsByTagName("polish");
// System.out.println(engleski.item(0).getFirstChild().getTextContent());
if (odredisni.equals("english"))
{
if(izvorni.equals("russian")){
if(ruski.item(0).getFirstChild().getTextContent().equals(rijec))
{
System.out.println(ruski.item(0).getFirstChild().getTextContent());
System.out.println(engleski.item(0).getFirstChild().getTextContent());
}
}
if(izvorni.equals("polish")) {
if(poljski.item(0).getFirstChild().getTextContent().equals(rijec)) {
System.out.println(poljski.item(0).getFirstChild().getTextContent());
System.out.println(engleski.item(0).getFirstChild().getTextContent());
}
}
}
if (odredisni.equals("russian"))
{
if(izvorni.equals("english")){
if(engleski.item(0).getFirstChild().getTextContent().equals(rijec))
{
System.out.println(engleski.item(0).getFirstChild().getTextContent());
System.out.println(ruski.item(0).getFirstChild().getTextContent());
}
}
if(izvorni.equals("polish")) {
if(poljski.item(0).getFirstChild().getTextContent().equals(rijec)) {
System.out.println(poljski.item(0).getFirstChild().getTextContent());
System.out.println(ruski.item(0).getFirstChild().getTextContent());
}
}
}
if (odredisni.equals("polish"))
{
if(izvorni.equals("english")){
if(engleski.item(0).getFirstChild().getTextContent().equals(rijec))
{
System.out.println(engleski.item(0).getFirstChild().getTextContent());
System.out.println(poljski.item(0).getFirstChild().getTextContent());
}
}
if(izvorni.equals("russian")) {
if(ruski.item(0).getFirstChild().getTextContent().equals(rijec)) {
System.out.println(ruski.item(0).getFirstChild().getTextContent());
System.out.println(poljski.item(0).getFirstChild().getTextContent());
}
}
}
/* String trazenaRijec = getTagValue("english", eElement);
String engleski = getTagValue("english", eElement);
String ruski = getTagValue("russian", eElement);
String poljski = getTagValue("polish", eElement);
if (odredisni.equals(engleski))
{
System.out.println("Engleski : " + getTagValue("english", eElement));
}
if (odredisni.equals(ruski))
{
System.out.println("Ruski : " + getTagValue("russian", eElement));
}
if (odredisni.equals(poljski))
{
System.out.println("Poljski : " + getTagValue("polish", eElement));
} */
/* System.out.println("English : " + getTagValue("english", eElement));
System.out.println("Russian : " + getTagValue("russian", eElement));
System.out.println("Polish : " + getTagValue("polish", eElement));*/
}
// } catch (Exception e) {
// e.printStackTrace();
}
private static String getTagValue(String sTag, Element eElement) {
NodeList nlList = eElement.getElementsByTagName(sTag).item(0).getChildNodes();
Node nValue = (Node) nlList.item(0);
return nValue.getNodeValue();
}
}
And this is XML file:
<?xml version="1.0" encoding="UTF-8"?>
<translate>
<word>
<english>Car</english>
<russian>Avtomobil</russian>
<polish>Samochod</polish>
</word>
<word>
<english>Love</english>
<russian>Lobite</russian>
<polish>milosc</polish>
</word>
<word>
<english>Busy</english>
<russian>Zanimate</russian>
<polish>Zajety</polish>
</word>
</translate>
The instructor did not accept it. He says there is no service.
What am I doing wrong?
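One possible reading of the instructor's remark is that the lookup should be a callable getWord method rather than console input handled in main. Below is a minimal sketch of such a method, assuming the XML layout shown above. The class name Translator, the case-insensitive comparison and the empty-string result for unknown words are assumptions, and publishing the method as an actual web service (for example with JAX-WS) is a separate step that is not shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class Translator {
    private final Document doc;

    // Builds the dictionary from XML text; a file or URL would work the same way.
    Translator(String xml) {
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse dictionary XML", e);
        }
    }

    // Returns all translations of word, joined with commas, or "" if there are none.
    String getWord(String word, String from, String to) {
        List<String> matches = new ArrayList<>();
        NodeList words = doc.getElementsByTagName("word");
        for (int i = 0; i < words.getLength(); i++) {
            Element entry = (Element) words.item(i);
            NodeList source = entry.getElementsByTagName(from);
            NodeList target = entry.getElementsByTagName(to);
            if (source.getLength() == 0 || target.getLength() == 0) {
                continue; // this entry lacks one of the two languages
            }
            // Assumption: the comparison ignores case and surrounding whitespace.
            if (source.item(0).getTextContent().trim().equalsIgnoreCase(word)) {
                matches.add(target.item(0).getTextContent().trim());
            }
        }
        return String.join(",", matches);
    }

    public static void main(String[] args) {
        String xml = "<translate>"
                + "<word><english>Car</english><russian>Avtomobil</russian><polish>Samochod</polish></word>"
                + "<word><english>Busy</english><russian>Zanimate</russian><polish>Zajety</polish></word>"
                + "</translate>";
        // prints Samochod
        System.out.println(new Translator(xml).getWord("avtomobil", "russian", "polish"));
    }
}
```

Because the language names are used directly as tag names, one loop replaces the nested if blocks for every language pair.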

Parsing XML from webpage

If I copy and paste the XML from this site into an XML file, I can parse it with Java:
http://api.indeed.com/ads/apisearch?publisher=8397709210207872&q=java&l=austin%2C+tx&sort&radius&st&jt&start&limit&fromage&filter&latlong=1&chnl&userip=1.2.3.4&v=2
However, I want to parse it directly from the web page if possible.
Here's my current code:
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.w3c.dom.Node;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;
import java.io.File;
import java.io.IOException;
public class XMLParser {
public void readXML(String parse) {
File xml = new File(parse);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder;
try {
dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xml);
// System.out.println("Root element :"
// + doc.getDocumentElement().getNodeName());
NodeList nList = doc.getElementsByTagName("result");
System.out.println("----------------------------");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
// System.out.println("\nCurrent Element :" +
// nNode.getNodeName());
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
System.out.println("job title : "
+
eElement.getElementsByTagName("jobtitle").item(0)
.getTextContent());
System.out.println("Company: "
+
eElement.getElementsByTagName("company")
.item(0).getTextContent());
System.out.println("City : "
+
eElement.getElementsByTagName("city").item(0)
.getTextContent());
System.out.println("State : "
+
eElement.getElementsByTagName("state").item(0)
.getTextContent());
System.out.println("Country : "
+
eElement.getElementsByTagName("country").item(0)
.getTextContent());
System.out.println("Date posted : "
+
eElement.getElementsByTagName("date").item(0)
.getTextContent());
System.out.println("Job summary : "
+
eElement.getElementsByTagName("snippet").item(0)
.getTextContent());
System.out.println("Latitude : "
+
eElement.getElementsByTagName("latitude").item(0).getTextContent());
System.out.println("longitude : "
+
eElement.getElementsByTagName("longitude").item(0).getTextContent());
}
}
} catch (ParserConfigurationException | SAXException | IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
public static void main(String[] args) {
new XMLParser().readXML("test.xml");
}
}
Any help would be appreciated.

Give it the URI instead of the XML. It will download it for you.
Document doc = dBuilder.parse(uriString);
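As a small illustration of that overload: DocumentBuilder.parse(String) interprets its argument as a URI and fetches the document itself. The sketch below uses a temporary file: URI as a stand-in for the http: URL so that it runs without network access. The class name ParseFromUri, the helper names and the sample element names are assumptions.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class ParseFromUri {
    // Parses the XML found at the given URI and returns the root element's name.
    static String rootName(String uri) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(uri); // the String is a URI, not XML text
            return doc.getDocumentElement().getNodeName();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse " + uri, e);
        }
    }

    // Writes XML to a temporary file and returns its file: URI (stand-in for a web URL).
    static String tempFileUri(String xml) {
        try {
            File tmp = File.createTempFile("sample", ".xml");
            tmp.deleteOnExit();
            Files.write(tmp.toPath(), xml.getBytes(StandardCharsets.UTF_8));
            return tmp.toURI().toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not write temporary file", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical payload; the real API response has more elements.
        String uri = tempFileUri("<response><results><result/></results></response>");
        System.out.println(rootName(uri)); // prints response
    }
}
```

Passing an http: or https: address to rootName should work the same way, provided the server returns well-formed XML.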

Here is a code snippet:
String url = "http://api.indeed.com/ads/apisearch?publisher=8397709210207872&q=java&l=austin%2C+tx&sort&radius&st&jt&start&limit&fromage&filter&latlong=1&chnl&userip=1.2.3.4&v=2";
try
{
DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
DocumentBuilder b = f.newDocumentBuilder();
// parse(String) treats its argument as a URI and downloads the document
Document doc = b.parse(url);
}
catch (ParserConfigurationException | SAXException | IOException e)
{
e.printStackTrace();
}

You need to handle the elements/nodes you want inside a for loop, so the code can scan through the XML and find the node you are searching for.
This reads the XML from the connection's input stream and builds a DOM structure:
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
// connection is assumed to be an already opened java.net.URLConnection
Document doc = builder.parse(connection.getInputStream());
NodeList nodes = doc.getElementsByTagName("mode");
for (int i = 0; i < nodes.getLength(); i++) {
// Gets the tag from the XML and its content
Element elemMode = (Element) nodes.item(i);
}
Afterwards, if you want to pick out a value and parse it to an int (or whatever type you need), do this inside the loop:
int currentMode = Integer.parseInt(elemMode.getTextContent().trim());

This is how I parsed data directly from the URL http://www.nbp.pl/kursy/xml/+something:
static class Kurs {
public float kurs_sprzedazy;
public float kurs_kupna;
}
private static DocumentBuilder dBuilder;
private static Kurs getData(String filename, String currency) throws Exception {
Document doc = dBuilder.parse("http://www.nbp.pl/kursy/xml/"+filename+".xml");
doc.getDocumentElement().normalize();
NodeList nList = doc.getElementsByTagName("pozycja");
for(int i = 0; i < nList.getLength(); i++) {
Element nNode = (Element)nList.item(i);
if(nNode.getElementsByTagName("kod_waluty").item(0).getTextContent().equals(currency)) {
Kurs kurs = new Kurs();
String data = nNode.getElementsByTagName("kurs_sprzedazy").item(0).getTextContent();
data = data.replace(',', '.');
kurs.kurs_sprzedazy = Float.parseFloat(data);
data = nNode.getElementsByTagName("kurs_kupna").item(0).getTextContent();
data = data.replace(',', '.');
kurs.kurs_kupna = Float.parseFloat(data);
return kurs;
}
}
return null;
}

Failing to get element values using Element.getAttribute()

I would like to read an XML file. I've found an example that works as long as the XML element doesn't have any attributes. Of course I've tried to find out how to read attributes, but it doesn't work.
Example XML:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<car>
<properties>
<test h="1.12" w="4.2">
<colour>red</colour>
</test>
</properties>
</car>
Java Code:
public void readXML(String file) {
try {
File fXmlFile = new File(file);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
for (int temp = 0; temp < nList.getLength(); temp++) {
Node nNode = nList.item(temp);
if (nNode.getNodeType() == Node.ELEMENT_NODE) {
Element eElement = (Element) nNode;
System.out.println("test : "
+ getTagValue("test", eElement));
System.out.println("colour : " + getTagValue("colour", eElement));
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
public String getTagValue(String sTag, Element eElement) {
NodeList nlList = eElement.getElementsByTagName(sTag).item(0)
.getChildNodes();
Node nValue = (Node) nlList.item(0);
System.out.println(nValue.hasAttributes());
if (sTag.startsWith("test")) {
return eElement.getAttribute("w");
} else {
return nValue.getNodeValue();
}
}
Output:
false
test :
false
colour : red
My problem is that I can't print out the attributes. How can I get the attributes?

There is a lot wrong with your code: undeclared variables and a seemingly crazy algorithm. I rewrote it and it works:
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
public final class LearninXmlDoc
{
private static String getTagValue(final Element element)
{
System.out.println(element.getTagName() + " has attributes: " + element.hasAttributes());
if (element.getTagName().startsWith("test"))
{
return element.getAttribute("w");
}
else
{
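// getNodeValue() is always null for element nodes; getTextContent() returns the text inside
return element.getTextContent().trim();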
}
}
public static void main(String[] args)
{
final String fileName = "c:\\tmp\\test\\domXml.xml";
readXML(fileName);
}
private static void readXML(String fileName)
{
Document document;
DocumentBuilder documentBuilder;
DocumentBuilderFactory documentBuilderFactory;
NodeList nodeList;
File xmlInputFile;
try
{
xmlInputFile = new File(fileName);
documentBuilderFactory = DocumentBuilderFactory.newInstance();
documentBuilder = documentBuilderFactory.newDocumentBuilder();
document = documentBuilder.parse(xmlInputFile);
nodeList = document.getElementsByTagName("*");
document.getDocumentElement().normalize();
for (int index = 0; index < nodeList.getLength(); index++)
{
Node node = nodeList.item(index);
if (node.getNodeType() == Node.ELEMENT_NODE)
{
Element element = (Element) node;
System.out.println("\tcolour : " + getTagValue(element));
System.out.println("\ttest : " + getTagValue(element));
System.out.println("-----");
}
}
}
catch (Exception exception)
{
exception.printStackTrace();
}
}
}

If you have a schema for the file, or can make one, you can use XMLBeans. It makes Java beans out of the XML, as the name implies. Then you can just use getters to get the attributes.

Use the dom4j library.
InputStream is = new FileInputStream(filePath);
SAXReader reader = new SAXReader();
org.dom4j.Document doc = reader.read(is);
is.close();
Element content = doc.getRootElement(); // returns the root element of your XML file
List<Element> methodEls = content.elements("element"); // returns a List of all child elements named "element"
Attribute attrib = methodEls.get(0).attribute("attributeName"); // the "attributeName" attribute of the first element named "element"

If you're looking purely to obtain attributes (e.g. from a config / ini file), I would recommend using a Java properties file.
http://docs.oracle.com/javase/tutorial/essential/environment/properties.html
If you just want to read a file, create a new FileReader and wrap it in a BufferedReader.
BufferedReader in = new BufferedReader(new FileReader("example.xml"));
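For the properties-file suggestion, a minimal sketch with java.util.Properties might look like this. The keys h, w and colour mirror the attributes in the question's XML, but storing them in a properties file is an assumption of this example, as are the class and method names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class PropertiesDemo {
    // Loads key=value pairs from text; a FileReader could be passed to load() instead.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read properties", e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical file content, one key=value pair per line.
        Properties props = load("h=1.12\nw=4.2\ncolour=red\n");
        double width = Double.parseDouble(props.getProperty("w"));
        System.out.println("w = " + width);
        // The second argument is a default used when the key is missing.
        System.out.println("depth = " + props.getProperty("depth", "unknown"));
    }
}
```

This only fits flat key/value data; nested structures like the car XML above still need an XML parser.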
