I'm using XMLStreamReader to parse a piece of xml:
XMLStreamReader rd = XMLInputFactory.newInstance().createXMLStreamReader(io_xml, "UTF-8");
...
if (eventType == XMLStreamConstants.START_ELEMENT) {
String name = rd.getLocalName();
if (name.equals("key")) {
String val = rd.getElementText();
}
}
Problem is, I'm getting a bad read for the following string: "<key>cami%C3%B5es%2Babc</key>"
org.junit.ComparisonFailure:
expected:<cami[%C3%B5es%]2Babc> but was:<cami[ C3 B5es ]2Babc>
Do I need to do anything special within the XML? I already tried putting everything within a CDATA section, but I get the same error.
Using a "regular" parser everything works:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(xml));
Document parse = builder.parse(is);
String value = parse.getFirstChild().getTextContent();
...
I figured it out. The problem was in a different section of the code. A setter that didn't just set.
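For anyone landing here with the same symptom: the percent sign has no special meaning in XML, so a StAX reader should hand back percent-escapes untouched. A minimal sketch to check that in isolation (the class and method names are made up for illustration, and a StringReader stands in for the io_xml stream from the question):

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not from the original post.
final class StaxTextCheck {

    /** Returns the text of the first <key> element, or null if there is none. */
    static String readKey(String xml) {
        try {
            XMLStreamReader rd = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (rd.hasNext()) {
                    // Same check as in the question: a start tag named "key"
                    if (rd.next() == XMLStreamConstants.START_ELEMENT
                            && "key".equals(rd.getLocalName())) {
                        return rd.getElementText();
                    }
                }
                return null;
            } finally {
                rd.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        // Prints the element text exactly as the reader returned it
        System.out.println(readKey("<key>cami%C3%B5es%2Babc</key>"));
    }
}
```

If this returns the string unchanged, the reader is not the culprit and the decoding is happening somewhere downstream, such as a setter that does more than set.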
I am successfully making an API call that is a SOAP request with an account number in the body. I connect using HttpURLConnection and read the results using BufferedReader:
if (responseCode == HttpURLConnection.HTTP_OK) { // success
    BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
    String inputLine;
    StringBuffer sb = new StringBuffer();
    // Accumulate the whole response body, one line at a time
    while ((inputLine = in.readLine()) != null) {
        sb.append(inputLine).append("\n");
    }
    in.close();
    String xml2String = sb.toString();
Then I use DocumentBuilderFactory to build the document to read into the parser:
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbFactory.newDocumentBuilder();
Document xmlDom = docBuilder.parse(new InputSource(inputLine));
And then try to parse:
NodeList returnList = xmlDom.getElementsByTagName("DATA");
// Get the DATA
DOMParser parser = new DOMParser();
parser.parse(new InputSource(new StringReader(returnList.item(0).getTextContent())));
Document doc = parser.getDocument();
NodeList responsedata = doc.getDocumentElement().getChildNodes();
This is the error I get (which includes the output from the API request):
Exception,no protocol:
{"d":"<DATA><BussFlds><FieldName>FirstName</FieldName><Value><![CDATA[TESTY]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>LastName</FieldName><Value><![CDATA[TESTER]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>TYPE</FieldName><Value><![CDATA[]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>DATE</FieldName><Value><![CDATA[]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>CUSTCODE</FieldName><Value><![CDATA[]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>PREMCODE</FieldName><Value><![CDATA[]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>ADDRESS</FieldName><Value><![CDATA[]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>CITY</FieldName><Value><![CDATA[]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>STATE</FieldName><Value><![CDATA[]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>ZIP</FieldName><Value><![CDATA[]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>ZIP4</FieldName><Value><![CDATA[]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>ACCTBALANCE</FieldName><Value><![CDATA[]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>PASTDUE</FieldName><Value><![CDATA[]]></Value><DataType>String</DataType><Format></Format><Editable>True</Editable></BussFlds><BussFlds><FieldName>PHONE</FieldName><Value><![CDATA[]]></Value><DataType>
String</DataType><Format></Format><Editable>True</Editable></BussFlds></DATA>"}
I suspect that it is that curly bracket data on the first row or missing header information but I am not sure if that is the issue or how to fix it. Thanks!
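In
docBuilder.parse(new InputSource(inputLine))
you are passing inputLine, a single line of the response, and InputSource(String) treats its argument as a system ID (a URI) rather than as XML content, hence "no protocol". Pass your variable xml2String wrapped in a reader instead: new InputSource(new StringReader(xml2String)).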
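For what it's worth, a "no protocol" message is what you would expect when a string of markup is handed to InputSource(String), because that constructor takes a location (system ID), not content. A minimal sketch of the difference; the class and method names are made up for illustration:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not from the original post.
final class InputSourceDemo {

    private static DocumentBuilder newBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Hands the string over as a system ID (a location); true if that fails. */
    static boolean failsAsSystemId(String text) {
        try {
            newBuilder().parse(new InputSource(text));
            return false;
        } catch (IOException | SAXException e) {
            // Markup is not a resolvable URL, so the parser cannot open it
            return true;
        }
    }

    /** Hands the string over as the XML content itself and returns the root tag. */
    static String rootNameFromContent(String xml) {
        try {
            Document doc = newBuilder().parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTagName();
        } catch (IOException | SAXException e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<DATA><BussFlds/></DATA>";
        System.out.println(failsAsSystemId(xml));
        System.out.println(rootNameFromContent(xml));
    }
}
```

The same string is accepted or rejected depending only on whether it is wrapped in a StringReader.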
This response:
{"d":"<DATA><BussFlds>…
is not XML, so you cannot read it with a DocumentBuilder or any other XML parser.
It is in a format known as JSON, which happens to carry the XML as a string value.
So you will want to pass the response to a JSON parser first, not to an XML parser.
A JSON “object” is basically a dictionary (that is, a lookup table) with string keys. Your response has exactly one entry, whose key is "d". So you first need to parse the response as JSON:
String xml;
try (JsonParser jsonParser = Json.createParser(con.getInputStream())) {
    // getObject() is only valid once the parser is positioned on START_OBJECT
    jsonParser.next();
    xml = jsonParser.getObject().getString("d");
}
(There are other JSON parsing libraries available. I chose the one that is part of Java EE for the above example.)
Notice that the code does not attempt to read con.getInputStream() as a string first. There is no benefit to doing that. The parser accepts an InputStream directly. Which means there is no need to use InputStreamReader, or BufferedReader, or StringBuffer.
Now that you have XML content in the xml variable, you can parse it with DocumentBuilder:
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = dbFactory.newDocumentBuilder();
Document xmlDom = docBuilder.parse(new InputSource(new StringReader(xml)));
Side note: You should never use StringBuffer. Use StringBuilder instead. StringBuffer has been around since Java 1.0 and synchronizes every method for multithreaded use, which is almost never needed and which adds overhead. StringBuilder, added in Java 5, offers the same API without the locking.
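Once the XML is parsed, the field names and values can be read out of the DOM. A minimal sketch, assuming the structure shown in the question (a DATA root containing BussFlds entries, each with one FieldName and one Value); the class and method names are made up for illustration:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not from the original post.
final class BussFldsReader {

    /** Maps each FieldName to its Value, in document order. */
    static Map<String, String> readFields(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        NodeList entries = doc.getElementsByTagName("BussFlds");
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            // Assumes every entry has exactly one FieldName and one Value child
            String name = entry.getElementsByTagName("FieldName").item(0).getTextContent();
            // getTextContent() returns the CDATA content without the CDATA markers
            String value = entry.getElementsByTagName("Value").item(0).getTextContent();
            fields.put(name, value);
        }
        return fields;
    }

    public static void main(String[] args) {
        String xml = "<DATA><BussFlds><FieldName>FirstName</FieldName>"
                + "<Value><![CDATA[TESTY]]></Value></BussFlds></DATA>";
        System.out.println(readFields(xml));
    }
}
```

A LinkedHashMap keeps the fields in the order the service returned them, which may or may not matter for your use.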
I have the following code:
DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xmlFile);
How can I get it to parse XML contained within a String instead of a file?
I have this function in my code base, this should work for you.
public static Document loadXMLFromString(String xml) throws Exception
{
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(xml));
return builder.parse(is);
}
also see this similar question
One way is to use the version of parse that takes an InputSource rather than a file
A SAX InputSource can be constructed from a Reader object. One Reader object is the StringReader
So something like
parse(new InputSource(new StringReader(myString))) may work.
Convert the string to an InputStream and pass it to DocumentBuilder
final InputStream stream = new ByteArrayInputStream(string.getBytes(StandardCharsets.UTF_8));
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
builder.parse(stream);
EDIT: In response to bendin's comment regarding encoding, see shsteimer's answer to this question.
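On the encoding point: when the parser is given bytes, it decodes them using the encoding named in the XML declaration (UTF-8 if none is declared), whereas a Reader supplies characters that are already decoded. So a string whose declaration names one encoding but whose bytes were produced with another can come back garbled. A minimal sketch of that mismatch; the class and method names are made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not from the original answers.
final class EncodingDemo {

    private static String rootText(InputSource source) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(source).getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    /** Character stream: the characters reach the parser already decoded. */
    static String textViaReader(String xml) {
        return rootText(new InputSource(new StringReader(xml)));
    }

    /** Byte stream: the parser decodes the bytes according to the declaration. */
    static String textViaUtf8Bytes(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return rootText(new InputSource(new ByteArrayInputStream(bytes)));
    }

    public static void main(String[] args) {
        // The declaration says ISO-8859-1, but the bytes below are UTF-8
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a>\u00e9</a>";
        System.out.println("\u00e9".equals(textViaReader(xml)));
        System.out.println("\u00e9".equals(textViaUtf8Bytes(xml)));
    }
}
```

If the XML is already in a String, the StringReader route sidesteps the question entirely.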
I'm using this method
public Document parseXmlFromString(String xmlString) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder builder = factory.newDocumentBuilder();
    // Encode explicitly; the no-argument getBytes() uses the platform default charset
    InputStream inputStream = new ByteArrayInputStream(xmlString.getBytes(StandardCharsets.UTF_8));
    return builder.parse(inputStream);
}
The Javadocs show that the parse method is overloaded.
Create an InputStream or InputSource from your XML string and you should be set.
You can use the Scilca XML Progession package available at GitHub.
XMLIterator xi = new VirtualXML.XMLIterator("<xml />");
XMLReader xr = new XMLReader(xi);
Document d = xr.parseDocument();
I want to parse XML in Java.
The XML looks like:
<Attributes><ProductAttribute ID="359"><ProductAttributeValue><Value>1150</Value></ProductAttributeValue></ProductAttribute><ProductAttribute ID="361"><ProductAttributeValue><Value>1155</Value></ProductAttributeValue></ProductAttribute></Attributes>
My try was:
public static void parseXml(String sb) throws Exception{
    // Inner double quotes must be escaped inside a Java string literal
    sb = "<Attributes><ProductAttribute ID=\"359\"><ProductAttributeValue><Value>1150</Value></ProductAttributeValue></ProductAttribute><ProductAttribute ID=\"361\"><ProductAttributeValue><Value>1155</Value></ProductAttributeValue></ProductAttribute></Attributes>";
Document dom;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
dom = db.parse(new InputSource(new ByteArrayInputStream(sb.getBytes("utf-8"))));
dom.toString();
}
First I just wanted to see whether the parsing works, but it doesn't.
I get the error:
Premature end of file
Does anybody have an idea how I can parse this?
This question is not a duplicate. I have read the answers to another question similar to mine, but the XML there is different.
Thanks
First parse the JSON string to get the XML, then parse the xml. For instance, using JSON-java:
JSONObject obj = new JSONObject(json);
JSONArray arr = obj.getJSONArray("value");
for (Object elm: arr) {
String xml = ((JSONObject)elm).getString("AttributesXml");
DocumentBuilderFactory documentBuilderFactory
= DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder
= documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(
new InputSource(new StringReader(xml)));
doSomethingWith(document);
}
Note that the parse method that takes a String argument expects a URI, not the XML source itself, so you must use another overload.
UPDATE:
I see that you have updated your question, and the xml is in sb; this will be:
DocumentBuilderFactory documentBuilderFactory
= DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder
= documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.parse(
new InputSource(new StringReader(sb)));
doSomethingWith(document);
I have the following code which turns a string, that I pass into the function, into a document:
DocumentBuilderFactory dbFactory_ = DocumentBuilderFactory.newInstance();
Document doc_;
void toXml(String s)
{
documentBuild();
DocumentBuilder dBuilder = dbFactory_.newDocumentBuilder();
StringReader reader = new StringReader(s);
InputSource inputSource = new InputSource(reader);
doc_ = dBuilder.parse(inputSource);
}
The problem is that some of the legacy code that I'm using passes a single word like RANDOM or FICTION into this toXml function. I would like to turn these calls into valid XML before trying to parse them. Right now, if I call the function with s = FICTION, it throws a SAXParseException. Could anyone advise me on the right way to do this? If you have any questions, let me know.
Thank you for your time
-Josh
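This creates an XmlDocument with a single element named after the string you pass in (the snippet is C#/.NET; the Java DOM API has equivalent calls):
static string buildXml(string s) {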
XmlDocument d = new XmlDocument();
d.AppendChild(d.CreateElement(s));
StringWriter sw = new StringWriter();
XmlTextWriter xw = new XmlTextWriter(sw);
d.WriteTo(xw);
return sw.ToString();
}
buildXml("Test"); //This will return <Test />
It's a bit ugly, but it will create the XML without you having to do any string work on your own ;)
You could wrap the load in a try/catch in your method, so that if the string fails to load as XML directly, it is passed to this function and the result is loaded instead.
Have you tried the seemingly obvious <FICTION/> or <FICTION></FICTION>?
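In Java, the try-then-fall-back idea suggested above could look roughly like this. It is a sketch under assumptions: the class and method names are made up, and any input that fails to parse (not only a bare word) ends up in the fallback branch, where createElement throws a DOMException if the text is not a legal element name.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not from the original post.
final class LenientXml {

    /** Parses s as XML; if that fails, builds a document whose root element is named s. */
    static Document toDocument(String s) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            try {
                return builder.parse(new InputSource(new StringReader(s)));
            } catch (SAXException notXml) {
                // Bare word such as FICTION: let the DOM API create <FICTION/>
                Document doc = builder.newDocument();
                doc.appendChild(doc.createElement(s.trim()));
                return doc;
            }
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toDocument("FICTION").getDocumentElement().getTagName());
        System.out.println(toDocument("<a><b/></a>").getDocumentElement().getTagName());
    }
}
```

Building the element through the DOM rather than concatenating "<" + s + "/>" means an invalid name fails loudly instead of producing broken markup. The default error handler may still print the parse error to stderr before the fallback runs.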