How to get XML node names without namespace in java? - java

I have an XML file whose data looks like this:
....
<ems:MessageInformation>
<ecs:MessageID>2147321820</ecs:MessageID>
<ecs:MessageTimeStamp>2016-01-01T04:38:33</ecs:MessageTimeStamp>
<ecs:SendingSystem>LD</ecs:SendingSystem>
<ecs:ReceivingSystem>CH</ecs:ReceivingSystem>
<ecs:ServicingFipsCountyCode>037</ecs:ServicingFipsCountyCode>
<ecs:Environment>UGS-D8UACS02</ecs:Environment>
</ems:MessageInformation>
....
There are many other nodes as well. All nodes have a namespace prefix such as ecs, tns, ems, etc. I am using the following code to extract all node names without the namespace prefix.
public static void main(String[] args) throws SAXException, IOException, ParserConfigurationException, TransformerException {
    DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
    Document document = docBuilder.parse(new File("C:\\Users\\DadMadhR\\Desktop\\temp\\EDR_D3A0327.XML"));
    NodeList nodeList = document.getElementsByTagName("*");
    for (int i = 0; i < nodeList.getLength(); i++) {
        Node node = nodeList.item(i);
        //System.out.println(node.getNodeName());
        System.out.println(node.getLocalName());
    }
}
But when I execute this code, it prints null for every node. Can someone tell me what I am doing wrong here?
I read online that node.getLocalName() returns the node name without the namespace prefix. What is wrong in my case?

You need to make the document builder factory namespace aware before you create the DocumentBuilder. Then getLocalName() will start returning non-null values.
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
docBuilderFactory.setNamespaceAware(true); // <=== here, on the factory, before newDocumentBuilder()
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
Document document = docBuilder.parse(new File("C:\\Users\\DadMadhR\\Desktop\\temp\\EDR_D3A0327.XML"));
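For reference, here is a minimal self-contained sketch of the fix. Note that setNamespaceAware(boolean) is a method of DocumentBuilderFactory, not of DocumentBuilder, and it has to be called before newDocumentBuilder(). The class name LocalNames, the inline XML string, and the urn:example:... namespace URIs are placeholders for illustration; the question does not show the real namespace declarations.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LocalNames {
    // Returns the local (prefix-free) name of every element, in document order.
    static List<String> localNames(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // must be set before newDocumentBuilder()
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        NodeList nodeList = document.getElementsByTagName("*");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodeList.getLength(); i++) {
            names.add(nodeList.item(i).getLocalName());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder namespace URIs (assumption): the real ones are not in the question.
        String xml = "<ems:MessageInformation xmlns:ems='urn:example:ems' xmlns:ecs='urn:example:ecs'>"
                + "<ecs:MessageID>2147321820</ecs:MessageID>"
                + "<ecs:SendingSystem>LD</ecs:SendingSystem>"
                + "</ems:MessageInformation>";
        System.out.println(localNames(xml)); // [MessageInformation, MessageID, SendingSystem]
    }
}
```

Without the setNamespaceAware(true) call, the parser builds DOM Level 1 style nodes, for which getLocalName() is documented to return null.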

Related

Getting an attribute value in xml generated from UML models

I have an XML string like this and I want to get the attribute value of "Element/xmi:type/name". That is, I want to retrieve, in a loop, the name of each element whose type is activity. How do I do that? I am using the javax.xml.parsers library.
<element xmi:idref="EAID_53685791_7F62_48a2_8BE8_DB7513AC776A" xmi:type="uml:Activity" name="Return error value" scope="public">
<model package="EAPK_263A2FE8_8346_4d1e_A851_39B9D573143D" tpos="0" ea_localid="98" ea_eleType="element"/>
<properties isSpecification="false" sType="Activity" nType="0" scope="public" isAbstract="false"/>
<project author="shiva999" version="1.0" phase="1.0" created="2016-08-16 09:44:25" modified="2016-08-16 10:13:51" complexity="1" status="Proposed"/>
<code gentype="&lt;none&gt;"/>
<style appearance="BackColor=-1;BorderColor=-1;BorderWidth=-1;FontColor=-1;VSwimLanes=1;HSwimLanes=1;BorderStyle=0;"/>
<modelDocument/>
<tags/>
<xrefs/>
<extendedProperties tagged="0" package_name="Activity Model"/>
<links>
<ControlFlow xmi:id="EAID_873CF8C4_0192_4099_8F66_6B36FA760AB6" start="EAID_53685791_7F62_48a2_8BE8_DB7513AC776A" end="EAID_D2EB427B_3AFD_4700_BD72_13B36684E595"/>
<ControlFlow xmi:id="EAID_2FECE2AE_6CA0_48a4_82AE_D743D257F37C" start="EAID_0D85B784_4393_429e_9BA1_7983BD7891CA" end="EAID_53685791_7F62_48a2_8BE8_DB7513AC776A"/>
</links>
</element>
Below is the code I have written. I get the error "The method type(int) is undefined for the type NodeList". I am new to XML parsing and am following online tutorials.
public class DomXMLParser {
    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.parse(new File("activity.xml"));
        NodeList nodeList = document.getElementsByTagName("Type");
        for(int x=0,size= nodeList.getLength(); x<size; x++) {
            System.out.println(nodeList.type(x).getAttributes().getNamedItem("name").getNodeValue());
        }
    }
}
Below is the code using XPath. It does not give the expected result. I have many such activity nodes and transition edges between them, so I need to store them as a list and create a hierarchical graph.
public class DomXMLParser {
    public static void main(String[] args) throws ParserConfigurationException, SAXException,
            IOException, XPathExpressionException {
        //DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        try {
            File fXmlFile = new File("C:/Projekte/activity.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            dbFactory.setNamespaceAware(true);
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(fXmlFile);
            XPathFactory factory = XPathFactory.newInstance();
            XPath xpath = factory.newXPath();
            javax.xml.xpath.XPathExpression expr
                    = xpath.compile("//xmi:XMI[xmi:type ='uml:Activity']/name/text()");
            Object result = expr.evaluate(doc, XPathConstants.NODESET);
            NodeList nodes = (NodeList) result;
            for (int i = 0; i < nodes.getLength(); i++) {
                System.out.println(nodes.item(i).getNodeValue());
            }
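Two things stand out in the attempts above. NodeList has no type(int) method; the accessor is item(int). And in the XPath, xmi:type and name are attributes of element, so they have to be addressed with @ rather than as child steps. A possible sketch follows, assuming the element tags are in no namespace, as in the snippet. It matches the type attribute by local name so that no NamespaceContext is needed for the xmi prefix (an unprefixed type attribute would match as well). The class name ActivityNames and the urn:example:xmi URI in the sample are placeholders, since the question does not show the real namespace declaration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ActivityNames {
    // Collects the name attribute of every <element> whose (xmi:)type is uml:Activity.
    static List<String> activityNames(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // @*[local-name()='type'] selects the type attribute regardless of its prefix.
        String expression = "//element[@*[local-name()='type']='uml:Activity']/@name";
        NodeList attrs = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

        List<String> names = new ArrayList<>();
        for (int i = 0; i < attrs.getLength(); i++) {
            names.add(attrs.item(i).getNodeValue()); // attribute nodes do carry a node value
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder namespace URI (assumption), reduced sample data.
        String xml = "<elements xmlns:xmi='urn:example:xmi'>"
                + "<element xmi:type='uml:Activity' name='Return error value' scope='public'/>"
                + "<element xmi:type='uml:Class' name='Something else'/>"
                + "</elements>";
        System.out.println(activityNames(xml)); // [Return error value]
    }
}
```

The returned list can then serve as the node set for the graph; the ControlFlow start and end attributes could be read the same way to build the edges.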

use xpath to extract value from xml file with multiple namespace in java

I am trying to extract a node value from an XML file with multiple namespaces in Java, without success. The XML file looks like this:
<ns26:start xmlns:ns26="http://www.tektronix.com/iris/isa/capture/start"
xmlns:ns31="http://www.tektronix.com/iris/isa/filters"
xmlns:ns13="http://www.tektronix.com/iris/isa/monitoredObjects"
xmlns:ns6="http://www.tektronix.com/iris/isa"
xmlns:ns10="http://www.tektronix.com/iris/isa/monNodeObjects"
xmlns:ns7="http://www.tektronix.com/iris/isa/capture/monitoredElements"
xmlns:ns11="http://www.tektronix.com/iris/isa/pointcodes"
xmlns:ns8="http://www.tektronix.com/iris/isa/capture/captureSession"
xmlns:ns2="http://www.tektronix.com/iris/isa/sessionSaveInfo"
xmlns:ns4="http://www.tektronix.com/iris/isa/customData"
xmlns:ns3="http://www.tektronix.com/iris/isa/manifest">
<ns6:Id>LAB:11300/isaclient;440</ns6:Id>
</ns26:start>
I want to extract Id using the XPath local-name() function. I tried an expression like //*[local-name()='start']/*[local-name()='Id'] but did not get any matching node. Please help me find the issue. Thanks.
Here is the Java code:
public static List<String> getXPathValueNamespace(String xml, String expression) throws ParserConfigurationException, SAXException, IOException, XPathExpressionException
{
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);
    DocumentBuilder builder;
    Document doc = null;
    List<String> list = new ArrayList<String>();
    builder = factory.newDocumentBuilder();
    InputSource source = new InputSource(new StringReader(xml));
    doc = builder.parse(source);
    // Create XPathFactory object
    XPathFactory xpathFactory = XPathFactory.newInstance();
    // Create XPath object
    XPath xpath = xpathFactory.newXPath();
    XPathExpression expr = xpath.compile(expression);
    NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);
    for (int i = 0; i < nodes.getLength(); i++)
        list.add(nodes.item(i).getNodeValue());
    return list;
}
The expression //*[local-name()='start']/*[local-name()='Id'] is correct; for the example document the resulting node list contains one node.
However, you should use nodes.item(i).getTextContent() to retrieve the node content, since getNodeValue() returns null for element nodes.
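A short sketch of that change, using a reduced version of the document from the question (only the two namespace declarations that are actually used are kept). The class and method names are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TextContentDemo {
    // Evaluates the expression and returns the text content of each matched node.
    static List<String> values(String xml, String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);

        List<String> list = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // getNodeValue() would be null here because the matches are element nodes
            list.add(nodes.item(i).getTextContent());
        }
        return list;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ns26:start xmlns:ns26='http://www.tektronix.com/iris/isa/capture/start'"
                + " xmlns:ns6='http://www.tektronix.com/iris/isa'>"
                + "<ns6:Id>LAB:11300/isaclient;440</ns6:Id>"
                + "</ns26:start>";
        System.out.println(values(xml, "//*[local-name()='start']/*[local-name()='Id']"));
        // [LAB:11300/isaclient;440]
    }
}
```

Alternatively, appending /text() to the expression selects the text nodes themselves, and for text nodes getNodeValue() does return the content.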

Locating XML element anywhere in the document with Java

Given the following XML (example):
<?xml version="1.0" encoding="UTF-8"?>
<rsb:VersionInfo xmlns:atom="http://www.w3.org/2005/Atom" xmlns:rsb="http://ws.rsb.de/v2">
<rsb:Variant>Windows</rsb:Variant>
<rsb:Version>10</rsb:Version>
</rsb:VersionInfo>
I need to get the values of Variant and Version. My current approach uses XPath, as I cannot rely on the given structure. All I know is that there is an element rsb:Version somewhere in the document.
XPath xpath = XPathFactory.newInstance().newXPath();
String expression = "//Variant";
InputSource inputSource = new InputSource("test.xml");
String result = (String) xpath.evaluate(expression, inputSource, XPathConstants.STRING);
System.out.println(result);
This however does not output anything. I have tried the following XPath expressions:
//Variant
//Variant/text()
//rsb:Variant
//rsb:Variant/text()
What is the correct XPath expression? Or is there an even simpler way of getting to this element?
I would suggest just looping through the document to find the given tag:
public static void main(String[] args) throws SAXException, IOException, ParserConfigurationException, TransformerException {
    DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory
            .newInstance();
    DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
    Document document = docBuilder.parse(new File("test.xml"));
    NodeList nodeList = document.getElementsByTagName("rsb:VersionInfo");
    for (int i = 0; i < nodeList.getLength(); i++) {
        Node node = nodeList.item(i);
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            // do something with the current element
            System.out.println(node.getNodeName());
        }
    }
}
Edit: Yassin pointed out that it won't get child nodes. This should point you in the right direction for getting the children.
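private static List<Node> getChildren(Node n)
{
    // NodeList is not a java.util.List, so copy the element children by hand
    List<Node> children = new ArrayList<>();
    NodeList childNodes = n.getChildNodes();
    for (int i = 0; i < childNodes.getLength(); i++) {
        if (childNodes.item(i).getNodeType() == Node.ELEMENT_NODE) {
            children.add(childNodes.item(i));
        }
    }
    return children;
}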
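If the element can sit anywhere in the document, another option is a namespace-aware parse combined with getElementsByTagNameNS, which searches all descendants and does not depend on the prefix used in the file. The following is only a sketch: the class and method names are invented here, and the namespace URI is the one declared for rsb in the example document. Passing "*" as the namespace matches any namespace.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FindAnywhere {
    // Returns the text of the first element with the given namespace and local name,
    // wherever it occurs in the document, or null if there is none.
    static String firstText(String xml, String namespaceUri, String localName) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getElementsByTagNameNS to match
        Document doc = factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

        NodeList hits = doc.getElementsByTagNameNS(namespaceUri, localName);
        return hits.getLength() > 0 ? hits.item(0).getTextContent() : null;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<rsb:VersionInfo xmlns:atom='http://www.w3.org/2005/Atom' xmlns:rsb='http://ws.rsb.de/v2'>"
                + "<rsb:Variant>Windows</rsb:Variant>"
                + "<rsb:Version>10</rsb:Version>"
                + "</rsb:VersionInfo>";
        System.out.println(firstText(xml, "http://ws.rsb.de/v2", "Variant")); // Windows
        System.out.println(firstText(xml, "*", "Version"));                   // 10
    }
}
```

With XPath, an expression such as //*[local-name()='Variant'] should work in the same spirit, whereas //rsb:Variant needs a NamespaceContext that binds the rsb prefix.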

Retrieve values from XML tag in Java

I have a set of XML string outputs from a natural language tool and need to retrieve values from them, and also assign a null value to those tags that are not present in the output string. I tried to use the Java code provided in Extracting data from XML using Java, but it doesn't seem to work.
Current sample tag inventory is listed below:
<TimeStamp>, <Role>, <SpeakerId>, <Person>, <Location>, <Organization>
Sample XML output string:
<TimeStamp>00.00.00</TimeStamp> <Role>Speaker1</Role><SpeakerId>1234</SpeakerId>Blah, blah, blah.
Desired output:
TimeStamp: 00.00.00
Role: Speaker1
SpeakerId: 1234
Person: null
Place: null
Organization: null
In order to use the Java code provided in the above link (in the updated code), I inserted <Dummy> and </Dummy> as follows:
<Dummy><TimeStamp>00.00.00</TimeStamp><Role>Speaker1</Role><SpeakerId>1234</SpeakerId>Blah, blah, blah.</Dummy>
However, it returns dummy and null only. Since I'm still a newbie to Java, detailed explanations will be much appreciated.
Try it this way; I hope it helps:
File fXmlFile = new File("yourfile.xml");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
You can get child node list like this:
NodeList nList = doc.getElementsByTagName("staff");
Get the item like this:
Node nNode = nList.item(temp);
This is what I ended up doing for my Java wrapper (showing TimeStamp only):
public class NERPost {
    public String convertXML (String input) {
        String nerOutput = input;
        try {
            DocumentBuilderFactory docBuilderFactory =
                    DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
            InputSource is = new InputSource();
            is.setCharacterStream(new StringReader(nerOutput));
            Document doc = docBuilder.parse(is);
            // normalize text representation
            doc.getDocumentElement().normalize();
            NodeList listOfDummies = doc.getElementsByTagName("dummy");
            for (int s = 0; s < listOfDummies.getLength(); s++) {
                Node firstDummyNode = listOfDummies.item(s);
                if (firstDummyNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element firstDummyElement = (Element) firstDummyNode;
                    //Convert each entity label --------------------------------
                    //TimeStamp
                    String ts = "<TimeStamp>";
                    Boolean foundTs;
                    if (foundTs = nerOutput.contains(ts)) {
                        NodeList timeStampList = firstDummyElement.getElementsByTagName("TimeStamp");
                        //do it recursively
                        for (int i = 0; i < timeStampList.getLength(); i++) {
                            Node firstTimeStampNode = timeStampList.item(i);
                            Element timeStampElement = (Element) firstTimeStampNode;
                            NodeList textTSList = timeStampElement.getChildNodes();
                            String timeStampOutput = ((Node) textTSList.item(0)).getNodeValue().trim();
                            System.out.println("<TimeStamp>" + timeStampOutput + "</TimeStamp>\n");
                        } //end for
                    }//end if
                    //other XML tags
                    //.....
                }//end if
            }//end for
        }
        catch...
        }//end try
}}
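A more compact way to get the desired output, offered as a sketch rather than a definitive solution: wrap the fragment in a synthetic root element, then look up every tag of the inventory and fall back to null when it is absent. The class name TagValues and the method extract are invented for this example, and it assumes the tool output is well-formed apart from the missing root. Keep in mind that XML names are case-sensitive, so a lookup for "dummy" will not find a <Dummy> element.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class TagValues {
    // Maps each requested tag to the text of its first occurrence, or to null if absent.
    static Map<String, String> extract(String fragment, List<String> tags) throws Exception {
        // The fragment has several top-level nodes, so it needs a single root to be parseable.
        String wrapped = "<Dummy>" + fragment + "</Dummy>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(wrapped)));

        Map<String, String> result = new LinkedHashMap<>(); // keeps the inventory order
        for (String tag : tags) {
            NodeList hits = doc.getElementsByTagName(tag); // tag names are case-sensitive
            result.put(tag, hits.getLength() > 0 ? hits.item(0).getTextContent().trim() : null);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String output = "<TimeStamp>00.00.00</TimeStamp> <Role>Speaker1</Role>"
                + "<SpeakerId>1234</SpeakerId>Blah, blah, blah.";
        List<String> inventory = Arrays.asList(
                "TimeStamp", "Role", "SpeakerId", "Person", "Location", "Organization");
        for (Map.Entry<String, String> e : extract(output, inventory).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Missing tags are stored with a null value, so they print as "Person: null" and so on, in the same order as the inventory.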

Load XML into TreeMap

I have an XML config file that has just one parent and one child. This will always be like this and never change. It looks something like this:
<parent>
<child1>test</child1>
<child2>123</child2>
</parent>
I want to use java DOM (org.w3c.dom.Document) to parse the XML into a TreeMap so that I can access the attributes as keys/values. I'm guessing I'd need to create a for loop that scans through the XML and adds the key (parent) and value (child) line by line?
You can traverse the XML document using the JAXP APIs; you don't need to know the structure or node names in advance:
InputStream is = new ByteArrayInputStream(xml.getBytes("UTF-8"));
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbf.newDocumentBuilder();
Document doc = docBuilder.parse(is);
NodeList nodeList = doc.getChildNodes();
You can then iterate over the document and get the nodes and attributes:
for (int i = 0; i < nodeList.getLength(); i++) {
    Node node = nodeList.item(i);
    NamedNodeMap attributes = node.getAttributes();
    //...
}
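For the flat config file from the question, a minimal sketch of loading it into a TreeMap could look like this. It assumes that the keys are the child element names and the values their text content (the sample has no XML attributes); the class name XmlToTreeMap and the method load are made up for the example.

```java
import java.io.StringReader;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlToTreeMap {
    // Puts every element child of the root into the map: element name -> trimmed text.
    static TreeMap<String, String> load(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        TreeMap<String, String> map = new TreeMap<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Skip the whitespace text nodes that sit between the elements.
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                map.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return map;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<parent>\n<child1>test</child1>\n<child2>123</child2>\n</parent>";
        System.out.println(load(xml)); // {child1=test, child2=123}
    }
}
```

Note that doc.getChildNodes() only yields the root element itself (plus any top-level comments or processing instructions), which is why the sketch starts from getDocumentElement().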
