Read Complex Xml file in java

Read Complex Xml file in java - java

I am able to read many type of xml file in java. but today i got a xml file and not able to read its details.
<ENVELOPE>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>1</BILLREF>
<BILLPARTY>Party1</BILLPARTY>
</BILLFIXED>
<BILLCL>-10800.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-10800.00</BILLFINAL>
<BILLDUE>1-Jul-2017</BILLDUE>
<BILLOVERDUE>30</BILLOVERDUE>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>2</BILLREF>
<BILLPARTY>Party2</BILLPARTY>
</BILLFIXED>
<BILLCL>-2000.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-2000.00</BILLFINAL>
<BILLDUE>1-Jul-2017</BILLDUE>
<BILLOVERDUE>30</BILLOVERDUE>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>3</BILLREF>
<BILLPARTY>Party3</BILLPARTY>
</BILLFIXED>
<BILLCL>-1416.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-1416.00</BILLFINAL>
<BILLDUE>31-Jul-2017</BILLDUE>
<BILLOVERDUE>0</BILLOVERDUE>
</ENVELOPE>
I am using this code for read xml file. I am able to read data inside <BILLFIXED> tag but not able to read data outside of this like <BILLFINAL> and <BILLDUE> etc.
try {
File fXmlFile = new File("filepath");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
NodeList billNodeList = doc.getElementsByTagName("ENVELOPE");
for(int i=0;i<billNodeList.getLength();i++){
Node voucherNode = billNodeList.item(i);
Element voucherElement = (Element) voucherNode;
NodeList nList = voucherElement.getElementsByTagName("BILLFIXED");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node insideNode = nList.item(temp);
Element voucherElements = (Element) insideNode;
System.out.println(voucherElements.getElementsByTagName("BILLDATE").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLREF").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLPARTY").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLFINAL").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLOVERDUE").item(0).getTextContent());
}
}
} catch (Exception e) {
e.printStackTrace();
}
I am try all possible way which i know that but currently i am not able to find any solution.
If anyone have any solution please share with me.

One way to do it, is to "fix" the XML to be more well-structured, e.g. like this:
// Fix the XML
Element envelopeElem = doc.getDocumentElement();
List<Node> children = new ArrayList<>();
for (Node child = envelopeElem.getFirstChild(); child != null; child = child.getNextSibling())
children.add(child);
Element billElem = null;
for (Node child : children) {
if (child.getNodeType() == Node.ELEMENT_NODE && "BILLFIXED".equals(child.getNodeName()))
envelopeElem.insertBefore(billElem = doc.createElement("BILL"), child);
if (billElem != null)
billElem.appendChild(child);
}
The code basically creates a new <BILL> element as a child of <ENVELOPE> whenever it encounters a <BILLFIXED> element, then moves all subsequent nodes into the <BILL> element.
The result is that the XML in the DOM tree looks like this1, which should be easier for you to process:
<ENVELOPE>
<BILL>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>1</BILLREF>
<BILLPARTY>Party1</BILLPARTY>
</BILLFIXED>
<BILLCL>-10800.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-10800.00</BILLFINAL>
<BILLDUE>1-Jul-2017</BILLDUE>
<BILLOVERDUE>30</BILLOVERDUE>
</BILL>
<BILL>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>2</BILLREF>
<BILLPARTY>Party2</BILLPARTY>
</BILLFIXED>
<BILLCL>-2000.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-2000.00</BILLFINAL>
<BILLDUE>1-Jul-2017</BILLDUE>
<BILLOVERDUE>30</BILLOVERDUE>
</BILL>
<BILL>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>3</BILLREF>
<BILLPARTY>Party3</BILLPARTY>
</BILLFIXED>
<BILLCL>-1416.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-1416.00</BILLFINAL>
<BILLDUE>31-Jul-2017</BILLDUE>
<BILLOVERDUE>0</BILLOVERDUE>
</BILL>
</ENVELOPE>
1) The XML has been reformatted for human readability, i.e. it has been re-indented.

It isn't well-structured XML. Inside your <envelope> tags there is nothing to indicate the start of each set of six attributes that constitute a 'bill'. You'd normally expect that each one would have a <bill> and </bill> tag to contain them. And this is going to confuse the parser...

As per sample XML, it has data for 3 records. But each record does not have any separation. Looks like each field data populated into XML tag and written into file.
There 2 possible option I would suggest
JAVA based : As Andreas suggested, Read the file content and add a root tag for each record which would give finite XML structure then would be easier to handle. Performance impact may raise when the input file is in large size.
Transformation based : Try STX transformation which would convert the structure to required format either XML or even flat file. Then processing would be simpler

Related

Get xml value by element name

How can i get XML value by attribute for the below XML:
I have tried:
String xml = "<Info><document><document>234doc</document></document></Info>";
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
.parse(new InputSource(new StringReader(xml)));
NodeList errNodes = doc.getElementsByTagName("error");
if (errNodes.getLength() > 0) {
Element err = (Element)errNodes.item(0);
} else {
Node value = doc.getElementsByTagName("document").item(0);
out.println(value);
}
I am looking for the output: "234doc". But I am not sure how do get the value. Can any one please suggest?

This is not a rocket science. You should debug your code, explore classes you might already know (Document, NodeList, Node, Element) and understand your xml structure (for more info look here). One way to achieve your result is:
System.out.println(doc.getChildNodes().item(0).getTextContent());

Adding sub-element to element java (DOM)

Essentially, i'm creating an XML document from a file (a database), and then i'm comparing another parsed XML file (with updated information) to the original database, then writing the new information into the database.
I'm using java's org.w3c.dom.
After lots of struggling, i decided to just create a new Document object and will write from there from the oldDocument and newDocument ones i'm comparing the elements in.
The XML doc is in the following format:
<Log>
<File name="something.c">
<Warning file="something.c" line="101" column="23"/>
<Warning file="something.c" line="505" column="71" />
</File>
</Log>
as an example.
How would i go about adding in a new "warning" Element to the "File" without getting the pesky "org.w3c.dom.DOMException: WRONG_DOCUMENT_ERR: A node is used in a different document than the one that created it." exception?
Cutting it down, I have something similar to:
public static Document update(Element databaseRoot, Element newRoot){
Document doc = db.newDocument(); // DocumentBuilder defined previously
Element baseRoot = doc.createElement("Log");
//for each file i have:
Element newFileRoot = doc.createElement("File");
//some for loop that parses through each 'file' and looks at the warnings
//when i come to a new warning to add to the Document:
NodeList newWarnings = newFileToCompare.getChildNodes(); //newFileToCompare comes from the newRoot element
for (int m = 0; m < newWarnings.getLength(); m++){
if(newWarnings.item(m).getNodeType() == Node.ELEMENT_NODE){
Element newWarning = (Element)newWarnings.item(m);
Element newWarningRoot = (Element)newWarning.cloneNode(false);
newFileRoot.appendChild(doc.importNode(newWarningRoot,true)); // this is what crashes
}
}
// for new files i have this which works:
newFileRoot = (Element)newFiles.item(i).cloneNode(true);
baseRoot.appendChild(doc.importNode(newFileRoot,true));
doc.appendChild(baseRoot);
return doc;
}
Any ideas? I'm beating my head against the wall. First time doing this.

Going through with a debugger I verified that the document owners were correct. Using node.getOwnerDocument(), I realized that the newFileRoot was connected to the wrong document earlier when I created it, so I changed
Element newFileRoot = (Element)pastFileToFind.cloneNode(false);
to
Element newFileRoot = (Element)doc.importNode(pastFileToFind.cloneNode(false),true);
since later on when i was trying to add the newWarningRoot to newFileRoot, they had different Documents (newWarningRoot was correct but newFileRoot was connected to the wrong document)

How to get XML content as a String

<root>
<h id="1">
<d value="1,2,3,4,5"><open>10:00</open><close>23:00</close></d>
<d value="6"><open>10:00</open><close>2:00</close></d>
<d value="7"><open>10:00</open><close>21:00</close></d>
</h>
<h id="2">
</h>
</root>
Here I have the XML which root has list of <h> tagged nodes. Now I need to break these into parts and set it into different variables (add into a map).
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(new InputSource(new ByteArrayInputStream(data.getBytes("utf-8"))));
NodeList nList = doc.getElementsByTagName("h");
for (int i = 0; i < nList.getLength(); i++)
{
Node nNode = nList.item(i);
System.out.println(nNode.getAttributes().getNamedItem("id") + " " + ?????);
}
what should I call in order to get the value (String value) of a nNode ?
Here is what Im looking for as the asnwer for the above code once some one fills the ????
1 <h id="1"><d value="1,2,3,4,5"><open>10:00</open><close>23:00</close></d><d value="6">open>10:00</open><close>2:00</close></d><d value="7"><open>10:00</open><close>21:00</close></d></h>
2 <h id="2"></h>
And i don't mind having as root element

You can use Node.getTextContent() to conveniently get all the text of a node (gets text of children as well).
See Parsing xml file contents without knowing xml file structure for a short example.
If you're trying to get the value attributes of the d nodes (I can't actually tell, your question is slightly unclear to me), then it would be different -- for that you would iterate through the children of each h node (use getChildNodes() or getFirstChild() + getNextSibling()) then grab their value attributes just as you are getting the id attribute of the h nodes (the above link also shows an example of iterating through child nodes).

Have you tried jDom library? http://www.jdom.org/docs/apidocs/org/jdom2/output/XMLOutputter.html
XMLOutputter outp = new XMLOutputter();
String s = outp.outputString(your_jdom_element);

Have you tried nNode.toString() if you are using Node from javax.xml.soap.Node.

You can use that:
http://docs.oracle.com/javase/7/docs/api/org/w3c/dom/Node.html#getTextContent()
but your sample nNode has other nodes, not just text. It seems you need helper method to construct String from child nodes.
Pass your nNode to nodeToString
XML Node to String in Java

Reading XML in java using DOM

I am new to read XML in Java using DOM. Could someone help me with simple code steps to read this XML in DOM?
Here is my XML:
<DataSet xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' xsi:noNamespaceSchemaLocation='datamartschema.1.3.xsd'>
<DataStream title='QUESTIONNAIRE'>
<Record>
<TransactionDate>2014-05-28T14:17:31.2186777-06:00</TransactionDate><SubType>xhaksdj</SubType>
<IntegerValue title='ComponentID'>11111</IntegerValue>
</Record><Record>
<TransactionDate>2014-05-28T14:17:31.2186777-06:00</TransactionDate><SubType>jhgjhg</SubType>
<IntegerValue title='ComponentID'>11111</IntegerValue>
</Record>
</DataStream>
</DataSet>
In this XML I need to read the DataStream value and Record values. My expected output is
DataStream=QUESTIONNAIRE and my records are
<TransactionDate>2014-05-28T14:17:31.2186777-06:00</TransactionDate><SubType>xhaksdj</SubType><IntegerValue title='ComponentID'>11111</IntegerValue><TransactionDate>2014-05-28T14:17:31.2186777-06:00</TransactionDate><SubType>jhgjhg</SubType><IntegerValue title='ComponentID'>11111</IntegerValue>
How can I get this output? I tried myself but I can't get the records output like above. I get the output without Tags which are present in the above output.I am using this line to get the output. But it does not give me correct output. Also, how to read the datastream value from this XML? Kindly help me.
This is my code snippets
NodeList datasetallRecords = indElement.getElementsByTagName("Record");
for (int y = 0; y < datasetallRecords.getLength(); y++) {
Element recordsElement = (Element) datasetallRecords.item(y);
recordXMl = recordXMl + recordsElement.getTextContent();
String d = datasetallRecords.item(y).getTextContent();
if (recordsElement.getTagName().equalsIgnoreCase("SubType")) {
lsDataStreamSubTypes.add(recordsElement.getTextContent());
}
recordCount = y;
}

When you create new instance of builder you can get DataStream
it would be look like this:
Element root = document.getDocumentElement();
NodeList dataStreams = root.getElementsByTagName("DataStream");
then get throw this list and get all info like this:
for (int i = 0; i < dataStreams.lenght(); i++) {
Element dataStream = (Element) dataStreams.item(i);
if (dataStream.getNodeType() == Element.ELEMENT_NODE) {
String title = dataStream.getAttributes()
.getNamedItem("title").getTextContent();
}
}

First you need to create a Node like this
Node nNode = datasetallRecords.item(y);
then an element like this
Element eElement = (Element) nNode;
now you can start taking the values from the element by using the getelementbyid and getnodevalue method.

You're not getting the tags because the call to getTextContent() on the "Record" node will return only the textual content of that node and its descendants.
If you need to nodes as well you'll have to process the XML by hand. Have a look at the DOM tutorial it covers processing a document in DOM mode very well including how to read out element names.

Inserting a node in an xml file

I would like to insert a node in an xml file using Java DOM. I am actually editing a lot of contents of a dummy file in order to mofidy it like the original.
I would want to add an open node and close node in between the following file;
<?xml version="1.0" encoding="utf-8"?>
<Memory xmlns:xyz="http://www.w3.org/2001/XMLSchema-instance"
xmlns:abc="http://www.w3.org/2001/XMLSchema" Derivative="ABC"
xmlns="http://..">
///////////<Address> ///////////(which I would like to insert)
<Block ---------
--------
-------
/>
////////// </Address> /////////(which I would like to insert)
<Parameters Thread ="yyyy" />
</Memory>
I hereby request you to let me know how to I insert -- in between the xml file?
Thanks in advance.!
What I have tried doing is;
Element child = doc.createElement("Address");
child.appendChild(doc.createTextNode("Block"));
root.appendChild(child);
But this gives me an output like;
<Address> Block </Address> and not the way i expect :(
And now, what I have tried is to add these lines;
Element cd = doc.createElement("Address");
Node Block = root.getFirstChild().getNextSibling();
cd.appendChild(Block);
root.insertBefore(cd, root.getFirstChild());
But still, this is not the output which i am looking for. I got this output as
---------

What you want is probably:
Node parent = block.getParentNode()
Node blockRemoved = parent.removeChild(block);
// Create address
parent.appendChild(address);
address.appendChild(blockRemoved);
This is the standard way to re-attach a node in another place under W3C DOM.

Here:
DocumentBuilder b = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document document = b.parse(...);
// Parent of existing Block elements and new Address elemet
// Might be retrieved differently depending on
// actual structure
Element parent = document.getDocumentElement();
Element address = document.createElement("address");
NodeList nl = parent.getElementsByTagName("Block");
for (int i = 0; i < nl.getLength(); ++i) {
Element block = (Element) nl.item(i);
if (i == 0)
parent.insertBefore(address, block);
parent.removeChild(block);
address.appendChild(block);
}
// UPDATE: how to pretty print
LSSerializer serializer =
((DOMImplementationLS)document.getImplementation()).createLSSerializer();
serializer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
LSOutput output =
((DOMImplementationLS)document.getImplementation()).createLSOutput();
output.setByteStream(System.out);
serializer.write(document, output);

I assume you are using the W3C DOM (e.g. http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html ). If so try
insertBefore(address, block);

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Read Complex Xml file in java - java

It isn't well-structured XML. Inside your <envelope> tags there is nothing to indicate the start of each set of six attributes that constitute a 'bill'. You'd normally expect that each one would have a <bill> and </bill> tag to contain them. And this is going to confuse the parser...

Related

Get xml value by element name

Adding sub-element to element java (DOM)

How to get XML content as a String

Reading XML in java using DOM

Inserting a node in an xml file

Categories

Resources