Java - Handle XML With Less Than/Greater Than Symbols in Text - java

I am trying to parse an XML file with the "less than" and "greater than" symbols in the text.
Here is a sample XML file:
<document>
<summary>
The equation for t is: 567<T<600.
</summary>
</document>
Is there any way to handle this in a Java XML parser? I know about escaping the characters by changing them to &lt; and &gt;, but I only want to escape the characters in the text.
Currently, I am trying to use the DocumentBuilder, but it is erroring out.
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true);
domFactory.setExpandEntityReferences(false);
try {
    DocumentBuilder builder = domFactory.newDocumentBuilder();
    Document document = builder.parse(new InputSource(new StringReader(sectionXML.toString())));
} catch (ParserConfigurationException e) {
    e.printStackTrace();
} catch (SAXException | IOException e) {
    // parse() also declares these two checked exceptions
    e.printStackTrace();
}
The error I am getting is:
[Fatal Error] :1:70: Element type "T" must be followed by either attribute specifications, ">" or "/>".
org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 70; Element type "T" must be followed by either attribute specifications, ">" or "/>".
Any thoughts? Thanks in advance for any help.
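One possible workaround, sketched under the assumption that the set of real element names is known in advance (here only document and summary, taken from the sample): escape every "<" that does not open or close one of those tags before handing the string to the parser. The class name, the method names, and the whitelist are hypothetical, and this is a heuristic rather than a general repair for malformed XML. A stray ">" in text does not need escaping for the parser to accept it.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class StrayLessThanEscaper {
    // Hypothetical whitelist: the only element names treated as real tags.
    private static final String KNOWN_TAGS = "document|summary";

    // Replaces every '<' that does not open or close a whitelisted tag with "&lt;".
    static String escapeStrayLessThan(String xml) {
        return xml.replaceAll("<(?!/?(?:" + KNOWN_TAGS + ")\\b[^<>]*>)", "&lt;");
    }

    // Escapes the input, parses it, and returns the trimmed text of the first <summary>.
    static String summaryText(String rawXml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new InputSource(new StringReader(escapeStrayLessThan(rawXml))));
        return doc.getElementsByTagName("summary").item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String raw = "<document>\n<summary>\nThe equation for t is: 567<T<600.\n"
                + "</summary>\n</document>";
        System.out.println(summaryText(raw));
    }
}
```

If the tag names are not known up front, this approach breaks down, because "<T" is indistinguishable from the start of an element named T. In that case the data has to be fixed where it is produced.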

Related

XML Parse - Issue with parsing text from specific Node [duplicate]

I am facing an issue parsing XML to extract data from a specific node. I referred to Link1, Link2, and Link3. Note that I am able to parse and get the data for other nodes in the XML file below, such as id and order_id. But for the line below, I am unable to extract segment_id and instrument_id:
<trade segment_id="NSE-F&O " instrument_id="NSE:INFRATEL17NOVFUT">
Not sure if the way the XML file is setup or the way I am trying to extract the data for that specific node is wrong. Hope the specific issue I face is clear.
XML File:
<contract_note version="0.1">
<contracts>
<contract>
<id>CNT-17/18-5310750</id>
<name>CONTRACT NOTE CUM BILL</name>
<description>None</description>
<timestamp>2017-11-01</timestamp>
<trades>
<trade segment_id="NSE-F&O " instrument_id="NSE:INFRATEL17NOVFUT">
<id>37513030</id>
<order_id>1300000000352370</order_id>
<timestamp>09:20:48</timestamp>
<description>None</description>
<type>buy</type>
<quantity>1700</quantity>
<average_price>444.2</average_price>
<value>755140.0</value>
</trade>
</trades>
</contract>
</contracts>
</contract_note>
Code:
try {
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(xmlFile);
    NodeList cNoteList = doc.getElementsByTagName("contract");
    Node nNode = cNoteList.item(0);
    if (nNode.getNodeType() == Node.ELEMENT_NODE) {
        Element eElement = (Element) nNode;
        for (int j = 1; j <= eElement.getElementsByTagName("trade").getLength(); j++) {
            // Check if data can be read for Node - 'id'
            System.out.println(eElement.getElementsByTagName("id").item(j).getTextContent());
            // Check if data can be read for segment_id & instrument_id
            System.out.println("Scrip: " + eElement.getElementsByTagName("trade").item(0).getTextContent());
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
Edit:
Corrected the xml file info provided above.
As @Juan commented, your XML is bad. Fix it by following the required XML escaping rules and replacing segment_id="NSE-F&O " with segment_id="NSE-F&amp;O ".
If you cannot change the XML, then see How to parse invalid (bad / not well-formed) XML? for options, but the best option is to fix the XML at the source.
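Once the ampersand is escaped, the two values can be read as attributes of the trade element rather than through getTextContent(), which only returns the text of child nodes. A minimal sketch, assuming a cut-down copy of the XML from the question; the class and method names are made up for illustration:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class TradeAttributes {
    // Returns the value of the given attribute on the first <trade> element.
    static String tradeAttribute(String xml, String attributeName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element trade = (Element) doc.getElementsByTagName("trade").item(0);
        return trade.getAttribute(attributeName);
    }

    public static void main(String[] args) throws Exception {
        // The ampersand is written as &amp; so that the document is well-formed.
        String xml = "<trades>"
                + "<trade segment_id=\"NSE-F&amp;O \" instrument_id=\"NSE:INFRATEL17NOVFUT\">"
                + "<id>37513030</id>"
                + "</trade>"
                + "</trades>";
        System.out.println(tradeAttribute(xml, "segment_id"));
        System.out.println(tradeAttribute(xml, "instrument_id"));
    }
}
```

The parser turns &amp; back into a plain & when the attribute is read, so the application code never sees the escaped form.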

Unable to use special character '-' while generating XML

I am building XML using Java. My element has a few attributes, and the attribute values contain '-'.
But when I set the attributes like this:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.newDocument();
Element dffgr=doc.createElement("diffgr:diffgram");
dffgr.setAttribute("xmlns:msdata", "urn:schemas-­microsoft­-com:xml­-msdata".toString());
dffgr.setAttribute("xmlns:diffgr", "urn:schemas-­microsoft­-com:xml­diffgram-­v1".toString());
the '-' is replaced by '\xAD'.
The output is:
<diffgr:diffgram xmlns:diffgr="urn:schemasέicrosoftΣom:xmlΤiffgramζ1" xmlns:msdata="urn:schemasέicrosoftΣom:xmlέsdata">
and the desired output is:
<diffgr:diffgram xmlns:msdata="urn:schemas­microsoft­com:xml­msdata" xmlns:diffgr="urn:schemas­microsoft­com:xml­diffgram­v1">
Please help.
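Copy and paste this:
dffgr.setAttribute("xmlns:msdata", "urn:schemas-microsoft-com:xml-msdata");
dffgr.setAttribute("xmlns:diffgr", "urn:schemas-microsoft-com:xml-diffgram-v1");
You are using the wrong character for -. Your string literals contain soft hyphens (U+00AD, the 'xAD' you are seeing) instead of, or next to, the plain ASCII hyphen-minus (U+002D).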
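To check whether a literal that was pasted from a web page or a PDF carries the invisible character, something like the following can help. This is a sketch with hypothetical class and method names. Stripping only repairs strings where a real hyphen sits next to the soft one; where the soft hyphen stands in for the hyphen, the literal has to be retyped.

```java
public class SoftHyphenCheck {
    private static final char SOFT_HYPHEN = '\u00AD';

    // Returns the index of the first soft hyphen, or -1 if the string has none.
    static int firstSoftHyphen(String value) {
        return value.indexOf(SOFT_HYPHEN);
    }

    // Removes every soft hyphen; ordinary hyphens (U+002D) are left untouched.
    static String stripSoftHyphens(String value) {
        return value.replace(String.valueOf(SOFT_HYPHEN), "");
    }

    public static void main(String[] args) {
        // Same shape as the first literal in the question: each '-' has a soft hyphen beside it.
        String pasted = "urn:schemas-\u00ADmicrosoft\u00AD-com:xml\u00AD-msdata";
        System.out.println("first soft hyphen at index " + firstSoftHyphen(pasted));
        System.out.println(stripSoftHyphens(pasted));
    }
}
```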

How to read xml using XPATH in Java

I want to read XML data using XPath in Java. With the information I have gathered so far, I have not been able to parse the XML the way I need.
I just want to take the value of 'bankId'.
This is part of the example XML:
<?xml version="1.0" encoding="UTF-8"?>
<p:RawData xsi:type="p:RawData" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ba="http://service.bni.co.id/bancslink" xmlns:bo="http://service.bni.co.id/core/bo" xmlns:core="http://service.bni.co.id/core" xmlns:mhp="http://service.bni.co.id/mhp" xmlns:mpnG2="http://service.bni.co.id/mhp/mpn_g2" xmlns:mpnG2_1="http://service.bni.co.id/bancslink/mpn_g2" xmlns:p="http://BP_MultiHostPayment">
<boList>
<name>McpRequest</name>
<bo xsi:type="bo:CommonBillPaymentReq">
<billerCode>0128</billerCode>
<regionCode>0001</regionCode>
<billerName>MPN G2 IDR</billerName>
<billingNum>820150629548121</billingNum>
<customerName>EVA RAJAGUKGUK SH. M.KN </customerName>
<paymentMethod>2</paymentMethod>
<accountNum>0373163437</accountNum>
<trxAmount>50000</trxAmount>
<naration>820150629548121</naration>
<invoiceNum>820150629548121</invoiceNum>
<billInvoiceNum1>INVNUM1</billInvoiceNum1>
<billAmount1>0</billAmount1>
<billInvoiceNum2>INVNUM2</billInvoiceNum2>
<billAmount2>0</billAmount2>
<billInvoiceNum3>INVNUM3</billInvoiceNum3>
<billAmount3>0</billAmount3>
<isDecAmount>false</isDecAmount>
<trxDecAmount>50000</trxDecAmount>
</bo>
</boList>
<boList>
<name>McpTellerHeader</name>
<bo xsi:type="core:TellerHeader">
<tellerId>00004</tellerId>
<branchCode>0997</branchCode>
<overrideFlag>I</overrideFlag>
</bo>
</boList>
<boList>
<name>HostRequest</name>
<bo xsi:type="mhp:Request">
<header>
<hostId>MPN_G2</hostId>
<channelId>ATM</channelId>
<branchId></branchId>
<terminalId></terminalId>
<locationId></locationId>
<transDateTime>2015-06-30T22:26:33</transDateTime>
<transId>20150630T222633N042462J0000001969R840SLEXI0</transId>
</header>
<content xsi:type="mpnG2:PaymentReq">
<bankId>520009000990</bankId>
<billingInfo1>013</billingInfo1>
<billingInfo2>03</billingInfo2>
<billingInfo3>409257</billingInfo3>
<branchCode>0997</branchCode>
<channelType>7010</channelType>
<currency>IDR</currency>
<gmt>2015-06-30T15:26:33.942</gmt>
<localDatetime>2015-06-30T22:26:33.943</localDatetime>
<settlementDate>0701</settlementDate>
<switcherCode>001</switcherCode>
<terminalId>S1HKWGA032</terminalId>
<terminalLocation>0997</terminalLocation>
<transactionId>013978</transactionId>
<amount>50000</amount>
<billerAccountNumber>3010194605</billerAccountNumber>
<customerName>EVA RAJAGUKGUK SH. M.KN </customerName>
<ntb>000000058111</ntb>
<paymentCode>820150629548121</paymentCode>
</content>
</bo>
</boList>
</p:RawData>
This is my Java code:
try {
    FileInputStream file = new FileInputStream(new File("C:/Log/contoh.xml"));
    DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = builderFactory.newDocumentBuilder();
    Document xmlDocument = builder.parse(file);
    XPath xPath = XPathFactory.newInstance().newXPath();
    System.out.println("hup hup");
    String expression = "/p:RawData/boList/boList/boList/bo/content[@xsi:type='mpnG2:PaymentReq']/bankId";
    System.out.println(expression);
    String bankId = xPath.compile(expression).evaluate(xmlDocument);
    System.out.println(bankId);
    System.out.println("hup hup 2");
    expression = "/p:RawData/boList/boList/boList/bo/content[@xsi:type='mpnG2:PaymentReq']/bankId";
    NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(xmlDocument, XPathConstants.NODESET);
    for (int i = 0; i < nodeList.getLength(); i++) {
        System.out.println(nodeList.item(i).getFirstChild().getNodeValue());
    }
} catch (Exception e) {
    e.printStackTrace();
}
Only this appears when I run the code:
hup hup
/p:RawData/boList/boList/boList/bo/content[@xsi:type='mpnG2:PaymentReq']/bankId
hup hup 2
Any help would be appreciated :)
Try one of the following XPath expressions:
//content[@xsi:type='mpnG2:PaymentReq']/bankId or
//content[@xsi:type='mpnG2:PaymentReq'][1]/bankId or
//bankId or
//bankId[1]
I've tested these expressions against your XML in this online XML XPath tester.
The XPath expression is wrong.
/p:RawData/boList/boList/boList/bo/content[@xsi:type='mpnG2:PaymentReq']/bankId
would select the element at:
<p:RawData>
<boList>
<boList>
<boList>
<bo>
<content xsi:type="mpnG2:PaymentReq" />
</bo>
</boList>
</boList>
</boList>
</p:RawData>
There is no such element in the XML: the three boList elements are siblings, not nested inside each other.
You want
/p:RawData/boList[3]/bo/content[@xsi:type='mpnG2:PaymentReq']/bankId
to select the third boList.
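For the prefixes p: and xsi: in that expression to match anything in Java, the document generally has to be parsed namespace-aware and the XPath object needs a NamespaceContext. Here is a minimal sketch of that setup, using a trimmed copy of the XML from the question. The class names BankIdLookup and SimpleNamespaceContext are made up for illustration:

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class BankIdLookup {
    // Minimal prefix-to-URI map for the two prefixes used in the expression.
    static final class SimpleNamespaceContext implements NamespaceContext {
        private final Map<String, String> uris = new HashMap<>();

        SimpleNamespaceContext() {
            uris.put("p", "http://BP_MultiHostPayment");
            uris.put("xsi", "http://www.w3.org/2001/XMLSchema-instance");
        }

        @Override
        public String getNamespaceURI(String prefix) {
            String uri = uris.get(prefix);
            return uri != null ? uri : XMLConstants.NULL_NS_URI;
        }

        @Override
        public String getPrefix(String namespaceURI) {
            return null; // reverse lookup is not needed for this sketch
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            return Collections.emptyIterator();
        }
    }

    // Trimmed copy of the document from the question; only the parts the expression touches.
    static String sampleXml() {
        return "<p:RawData xmlns:p=\"http://BP_MultiHostPayment\""
                + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
                + " xmlns:mhp=\"http://service.bni.co.id/mhp\""
                + " xmlns:mpnG2=\"http://service.bni.co.id/mhp/mpn_g2\">"
                + "<boList><name>McpRequest</name></boList>"
                + "<boList><name>McpTellerHeader</name></boList>"
                + "<boList><name>HostRequest</name>"
                + "<bo xsi:type=\"mhp:Request\">"
                + "<content xsi:type=\"mpnG2:PaymentReq\"><bankId>520009000990</bankId></content>"
                + "</bo></boList>"
                + "</p:RawData>";
    }

    static String bankId(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed for prefix-qualified XPath steps
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        xPath.setNamespaceContext(new SimpleNamespaceContext());
        return xPath.evaluate(
                "/p:RawData/boList[3]/bo/content[@xsi:type='mpnG2:PaymentReq']/bankId", doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(bankId(sampleXml()));
    }
}
```

The predicate compares the attribute value as a plain string, so 'mpnG2:PaymentReq' has to match the text in the file exactly. The mpnG2 prefix inside that string is not resolved by the XPath engine.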

add two namespaces into same dom element

I need to create a DOM document like this:
<namespace:Facturae xmlns:namespace="URI1" xmlns:namespace2="URI2">
//<.......
</namespace:Facturae>
But the following code produces the error:
NAMESPACE_ERR: An attempt is made to create or change an object in a way which is incorrect with regard to namespaces.
The code is:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.newDocument();
Element FacturaeElement = document.createElementNS("URI1", "Facturae");
document.appendChild(FacturaeElement);
FacturaeElement.setPrefix("namespace"); //First namespace OK
FacturaeElement.setAttributeNS("URI2", "xmlns:namespace2", "aaa"); //Generate error
//Rest of code
How can I add a second namespace to the element?
After searching for more information, I found a solution:
I use the normal setAttribute method (without a namespace) and give the attribute name with the xmlns prefix, like this: "xmlns:namespace2".
Then I create the subelement in this namespace and set its prefix afterwards.
Element FacturaeElement = document.createElementNS("URI1", "Facturae");
document.appendChild(FacturaeElement);
FacturaeElement.setPrefix("namespace"); //First namespace
FacturaeElement.setAttribute("xmlns:namespace2", "URI2"); //second namespace
//I create the subelement with a namespace
Element FileHeaderElement = document.createElementNS("URI2", "FileHeader");
FacturaeElement.appendChild(FileHeaderElement);
FileHeaderElement.setPrefix("namespace2");
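An alternative sketch that stays with the namespace-aware methods: the original setAttributeNS call fails because an attribute whose qualified name starts with xmlns: must be created in the reserved namespace http://www.w3.org/2000/xmlns/ (available as XMLConstants.XMLNS_ATTRIBUTE_NS_URI), not in "URI2". The class name TwoNamespaces and the method name build are made up for illustration:

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class TwoNamespaces {
    static Document build() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder().newDocument();

        // The qualified name carries the prefix, so setPrefix is not needed.
        Element root = document.createElementNS("URI1", "namespace:Facturae");
        document.appendChild(root);

        // Namespace declarations are attributes in the reserved xmlns namespace.
        root.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:namespace", "URI1");
        root.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:namespace2", "URI2");

        // Child element in the second namespace, again with the prefix in the name.
        Element fileHeader = document.createElementNS("URI2", "namespace2:FileHeader");
        root.appendChild(fileHeader);
        return document;
    }

    public static void main(String[] args) throws Exception {
        Element root = build().getDocumentElement();
        System.out.println(root.getNodeName());
        System.out.println(root.getAttribute("xmlns:namespace"));
        System.out.println(root.getAttribute("xmlns:namespace2"));
    }
}
```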

org.xml.sax.SAXParseException while parsing XML using XPath

I am trying to get values from an XML using XPATH. I received the following exception:
[Fatal Error] books.xml:4:16: The prefix "abc" for element "abc:priority" is not bound.
Exception in thread "main" org.xml.sax.SAXParseException; systemId: file:///D:/XSL%20TEST%20APP%20BACK%20UP/XMLTestApp/books.xml; lineNumber: 4; columnNumber: 16; The prefix "abc" for element "abc:priority" is not bound.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
at xpath.XPathExample.main(XPathExample.java:18)
I am getting this error because my XML is a little different from a normal one (please see below):
<?xml version="1.0" encoding="UTF-8"?>
<inventory>
<Sample>
<abc:priority>1</abc:priority>
<abc:value>2</abc:value>
</Sample>
</inventory>
Here is my code (Java) to get values from the above XML:
import java.io.IOException;
import org.w3c.dom.*;
import org.xml.sax.SAXException;
import javax.xml.parsers.*;
import javax.xml.xpath.*;
public class XPathExample {
public static void main(String[] args)
throws ParserConfigurationException, SAXException,
IOException, XPathExpressionException {
DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
domFactory.setNamespaceAware(true); // never forget this!
DocumentBuilder builder = domFactory.newDocumentBuilder();
Document doc = builder.parse("books.xml");
XPathFactory factory = XPathFactory.newInstance();
XPath xpath = factory.newXPath();
XPathExpression expr
= xpath.compile("//Sample/*/text()"); // was: //book/Sample[author='Neal Stephenson']/title/text()
Object result = expr.evaluate(doc, XPathConstants.NODESET);
NodeList nodes = (NodeList) result;
for (int i = 0; i < nodes.getLength(); i++) {
System.out.println(nodes.item(i).getNodeValue());
}
}
}
If I remove the colon (that is, the abc: prefix), I do not get this error.
Is it possible to get content from an XML like mentioned above using XPATH?
"Is it possible to get content from an XML like mentioned above using Xpath ?" - I don't think so. This XML isn't well-formed.
From the spec (http://www.w3.org/TR/REC-xml-names/#ns-qualnames):
The Prefix provides the namespace prefix part of the qualified name,
and MUST be associated with a namespace URI reference in a namespace
declaration. [Definition: The LocalPart provides the local part of the
qualified name.]
In order to do anything with it, I think you'll have to add a namespace declaration.
Example
<inventory xmlns:abc="x">
<Sample>
<abc:priority>1</abc:priority>
<abc:value>2</abc:value>
</Sample>
</inventory>
Try without this line:
domFactory.setNamespaceAware(true); // never forget this!
Although it normally is a bad idea to run without namespace awareness, in this specific case it makes sense, since the input file is the way it is.
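A minimal sketch of that second suggestion, with the XML from the question passed in as a string. The class and method names are hypothetical. With namespace awareness off, the parser treats abc:priority as an ordinary element name, so the unbound prefix is no longer an error:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class UnboundPrefixExample {
    // Returns the text of every child element of <Sample>, in document order.
    static List<String> sampleValues(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(false); // the default, set explicitly for clarity
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//Sample/*/text()", doc, XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<inventory><Sample>"
                + "<abc:priority>1</abc:priority>"
                + "<abc:value>2</abc:value>"
                + "</Sample></inventory>";
        System.out.println(sampleValues(xml));
    }
}
```

The wildcard step is used here on purpose: selecting the elements by their prefixed names in XPath would bring the namespace question back.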
