DomParser from string XML getting null document - java

I'm trying to parse an XML string using DOMParser, but when I get the document it shows [#document: null] and does not seem to contain the XML data I passed in.
The code is something like this:
Document doc = null;
DOMParser parser = new DOMParser();
logger.debug("Parsing");
InputSource IS = new InputSource(new StringReader(nameFile));
parser.parse(IS);
doc = parser.getDocument();
NodeList NL = doc.getElementsByTagName("element");
The problem starts at doc = parser.getDocument().
It returns [#document: null], so the NodeList can't find the element that I'm looking for.
My XML is quite big: around 50K characters.
My question is: what could be causing this problem?
For your information, the same code works on OAS with JDK 1.4; I'm now moving the application to WebLogic 12c with JDK 1.6.
Thanks in advance.
UPDATED:
Sorry for not mentioning the data type of nameFile. It is a String containing XML data.
UPDATED2:
I've tried with a simple XML string, but no luck.
Example:
1st example (a string without any whitespace between elements):
nameFile = "<?xml version='1.0'?><company><staff id='1001'><firstname>yong</firstname><lastname>mook kim</lastname><nickname>mkyong</nickname><salary>100000</salary></staff><staff id='2001'><firstname>low</firstname><lastname>yin fong</lastname><nickname>fong fong</nickname><salary>200000</salary></staff></company>";
2nd example:
nameFile = "<message>Hello</message>";
Neither of these works. It always returns [#document: null].

I assume 'nameFile' in your code snippet is a string! The following works perfectly for me.
String nameFile= "<message>HELLO World</message>";
DOMParser parser = new DOMParser();
try {
parser.parse(new InputSource(new java.io.StringReader(nameFile)));
Document doc = parser.getDocument();
String message = doc.getDocumentElement().getTextContent();
System.out.println(message);
} catch (SAXException e) {
// handle SAXException
} catch (IOException e) {
// handle IOException
}
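One thing worth checking is whether the document is really empty or whether only its printed form looks that way. A Document node's node value is null by definition, and Xerces-based parsers typically print a node as its name and value, so [#document: null] can appear even after a successful parse. Below is a minimal sketch of that check. It assumes the JDK's built-in JAXP DocumentBuilder instead of Xerces' DOMParser (so no extra jar is needed), and the class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DocumentToStringDemo {

    // Parses an XML string with the JDK's JAXP parser (assumption: used here
    // in place of Xerces' DOMParser so the sketch has no extra dependency).
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<message>Hello</message>");

        // Implementation-defined text; it says nothing about the children.
        System.out.println(doc);

        // The node value of a Document is always null, even when it has content.
        System.out.println(doc.getNodeName() + " / " + doc.getNodeValue());

        // The parsed data is reachable through the document element.
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

If getDocumentElement() returns the expected root, the parse itself succeeded and the problem is more likely in the lookup (for example the tag name passed to getElementsByTagName) than in the parser.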

Related

Unexpected token (<) using Document builder in Android Studio

I'm using DocumentBuilder and NodeList in Android Studio to parse an XML document. I previously found that the XML was invalid because it had unescaped ampersands within the text. After taking care of this and double-checking with the W3 XML validator, I still get an unexpected token error:
e: "org.xml.sax.SAXParseException: Unexpected token (position:TEXT \n \n 601\n ...#5262:1 in java.io.StringReader#cd0db4a)"
However, when I open the xml and look at the line referred to, I don't see anything that would be considered troublesome:
... ...
5257 <WebSvcLocation>
5258 <Id>1521981</Id>
5259 <Name>Warehouse: Row 3</Name>
5260 <SiteName>Warehouse</SiteName>
5261 </WebSvcLocation>
5262 </ArrayOfWebSvcLocation>
I have also checked the XML for non-printing characters and have not found any. Below is the code I have been using:
public List<Location> SpinnerXML(String xml){
List<Location> list = new ArrayList<Location>();
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
InputSource is;
String s = xml.replaceAll("[&]"," and ");
try {
builder = factory.newDocumentBuilder();
is = new InputSource(new StringReader(s));
Document doc = builder.parse(is);
NodeList lt = doc.getElementsByTagName("WebSvcLocation");
int id;
String name,siteName;
for (int i = 0; i < lt.getLength(); i++) {
Element el = (Element) lt.item(i);
id = Integer.parseInt(getValue(el, "Id"));
name = getValue(el, "Name");
siteName = getValue(el, "SiteName");
list.add(new Location(id, name, siteName));
}
} catch (ParserConfigurationException e){
} catch (SAXException e){
e.printStackTrace();
} catch (IOException e){
}
return list;
}
The XML I have been trying to read is hosted here.
Thanks in advance for the help!
InputSource seems to do some guessing as to the encoding, so here are some things to try.
From here:
Android note: The Android platform default (encoding) is always UTF-8.
Referenced from here:
"Java stores strings as UTF-16 internally, but the encoding used externally, the "system default encoding", varies."
(1) I would initially recommend:
is.setEncoding("UTF-8");
(2) But it should do no harm to replace this:
Document doc = builder.parse(is);
With this:
Document doc = builder.parse(new ByteArrayInputStream(s.getBytes()));
(3) OR try this:
String s1 = URLDecoder.decode(s, "UTF-8");
Document doc = builder.parse(new ByteArrayInputStream(s1.getBytes()));
NOTE:
If you try (2) or (3), comment out:
is = new InputSource(new StringReader(s));
since the InputSource is then no longer used.
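For options (2) and (3), note that a bare getBytes() encodes with the platform default charset, which is UTF-8 on Android but can differ elsewhere. A minimal sketch of the same idea with the charset spelled out follows. The class name, method name, and sample XML are made up for illustration, and StandardCharsets is assumed to be available (Java 7+, Android API 19+).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class Utf8BytesParseDemo {

    // Encodes the string as UTF-8 explicitly rather than relying on the
    // platform default charset that a bare getBytes() would use.
    static Document parseAsUtf8(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample with one non-ASCII character (e with acute accent).
        String xml = "<Name>Caf\u00e9: Row 3</Name>";
        Document doc = parseAsUtf8(xml);
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

This works without an XML declaration because a parser reading bytes with no declaration and no byte order mark assumes UTF-8. If the string carries a declaration naming a different encoding, the bytes should be produced in that encoding instead.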

How to convert String having contents in XML format into JDom document

How do I convert a String containing XML into a JDOM document?
I am trying with the code below:
String docString = txtEditor.getDocumentProvider().getDocument(
txtEditor.getEditorInput()).get();
SAXBuilder sb= new SAXBuilder();
doc = sb.build(new StringReader(docString));
Can anyone help me resolve this problem?
Thanks in advance!
This is how you generally parse an XML file into a Document:
try {
SAXBuilder builder = new SAXBuilder();
Document anotherDocument = builder.build(new File("/some/directory/sample.xml"));
} catch(JDOMException e) {
e.printStackTrace();
} catch(IOException e) {
e.printStackTrace();
}
This is taken from the IBM JDOM reference.
If you have a string, you can convert it to an InputStream and pass that in:
String exampleXML = "<your-xml-string>";
InputStream stream = new ByteArrayInputStream(exampleXML.getBytes("UTF-8"));
Document anotherDocument = builder.build(stream);
For the various arguments builder.build() supports, you can go through the API docs.
This is a FAQ that should have an answer more accessible than the actual FAQ: How do I build a document from a String?
So, I have created issue #111.
For what it's worth, I have previously improved the error messages for this situation (see the previous issue #63), and now you should get an error that says:
MalformedURLException mx = new MalformedURLException(
"SAXBuilder.build(String) expects the String to be " +
"a systemID, but in this instance it appears to be " +
"actual XML data.");
Bottom line is that you should be using:
Document parseddoc = new SAXBuilder().build(new StringReader(myxmlstring));
rolfl

How to read XML response in specific character code in Java

I use the Wikimedia API Sandbox for Japanese.
Japanese Version
English Version
I send an HTTP request to Wikimedia and get a result in XML.
When I send the request on the API Sandbox web page, the characters in the result are not corrupted.
But when I fetch the result in Java, it contains corrupted characters.
I cannot specify a character encoding in the XML file.
How can I specify a character encoding for the result?
How can I resolve my problem?
try {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db
.parse(new URL(
"http://ja.wikipedia.org/w/api.php?action=query&prop=categories&format=xml&cllimit=10&titles="
+ key).openStream());
Element root = doc.getDocumentElement();
NodeList queryList = root.getChildNodes();
Node query = queryList.item(0);
if (query instanceof Element) {
Element queryEle = (Element) query;
NodeList pagesList = queryEle.getChildNodes();
Node pgs = pagesList.item(0);
if (pgs instanceof Element) {
Element pagesElement = (Element) pgs;
NodeList pageList = pagesElement.getChildNodes();
Node page = pageList.item(0);
if (page instanceof Element) {
Element pageElement = (Element) page;
String title = pageElement.getAttribute("title");
title = new String(title.getBytes("UTF-8"), "UTF-8");
}
}
}
} catch (ParserConfigurationException e) {
} catch (SAXException e) {
} catch (IOException e) {
}
When I send a request now, I get a result whose page title is "大学", but in Java it shows "??".
I use the above code in an Android application.
title = new String(title.getBytes("UTF-8"), "UTF-8"); can be left out.
It worked for me with key=1 (receiving UTF-8). I have a UTF-8 Linux PC, though. Maybe you did not output in a UTF-8 context. Try writing the Document to a file.
You could do more inspection:
URLConnection connection = new URL("...").openConnection();
... connection.getContentEncoding();
... connection.getContentType();
InputStream in = connection.getInputStream();
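To follow the suggestion of writing the Document to a file, here is a minimal sketch that serializes a parsed document as UTF-8 and reads the bytes back, so the console's encoding is taken out of the picture. It uses a small inline XML string instead of the live API response, and the class and method names are made up for illustration. It also assumes java.nio.file is available (Java 7+, Android API 26+).

```java
import java.io.OutputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class WriteDocumentAsUtf8Demo {

    // Parses the XML, writes the DOM to a temporary file as UTF-8,
    // and returns the file content decoded as UTF-8.
    static String writeAndReadBack(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            Path file = Files.createTempFile("page", ".xml");
            try {
                Transformer transformer =
                        TransformerFactory.newInstance().newTransformer();
                transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
                try (OutputStream out = Files.newOutputStream(file)) {
                    transformer.transform(new DOMSource(doc), new StreamResult(out));
                }
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not write document", e);
        }
    }

    public static void main(String[] args) {
        // \u5927\u5B66 is the title from the question, written as escapes so the
        // source file's own encoding does not matter.
        String written = writeAndReadBack("<page title=\"\u5927\u5B66\"/>");
        System.out.println(written.contains("\u5927\u5B66"));
    }
}
```

If the file opens correctly in a UTF-8 editor, the parsing side is fine and the "??" most likely comes from wherever the string is displayed or logged.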

How to convert a string to a xml in java?

I have a string object "hello world".
I need to create an XML file from this string, with hello world as its text content.
I tried the following code snippet:
String xmlString = "<?xml version=\"1.0\" encoding=\"utf-8\"?><soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"></soap:Envelope>";
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try
{
builder = factory.newDocumentBuilder();
// Use String reader
Document document = builder.parse( new InputSource(
new StringReader( xmlString) ) );
TransformerFactory tranFactory = TransformerFactory.newInstance();
Transformer aTransformer = tranFactory.newTransformer();
Source src = new DOMSource( document );
Result dest = new StreamResult( new File("D:\\myXML.xml" ) );
aTransformer.transform( src, dest );
} catch (Exception e)
{
// TODO Auto-generated catch block
e.printStackTrace();
}
This code works fine, but when I replace the string with "Hello world" it stops working.
Can anyone help me out with this?
Thanks
You cannot parse the string "hello world" as XML, because it is not a well-formed XML document: it has no tags (and no declaration).
The code above will not turn text into XML objects; it only takes a string that is already valid XML and writes it out to a file.
To be honest, if you just want to write it to a file, the xml stuff is all unnecessary.
If you want some kind of "hello world" xml file, you'll need to add the declaration and some tags yourself.
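Adding the tags does not have to mean string concatenation. A minimal sketch, assuming the goal is one root element with the plain text as its content: build the element with the DOM API and let a Transformer write the declaration and handle escaping. The class name, method name, and the "message" tag are made up for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class PlainTextToXmlDemo {

    // Wraps plain text in a single root element and serializes the result.
    static String wrapInElement(String tagName, String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();

            Element root = doc.createElement(tagName);
            // Special characters such as & and < are escaped on output.
            root.setTextContent(text);
            doc.appendChild(root);

            Transformer transformer =
                    TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(wrapInElement("message", "hello world"));
    }
}
```

To write to a file as in the question, pass a StreamResult built from a File or an OutputStream instead of the StringWriter.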
This error is because you are trying to parse xmlString as a valid XML string, which it is not. For example, your code will run fine with the following xmlString:
String xmlString = "<hi>Hello World</hi>";
If you have String newNode = "<node>Hello World</node>"; you can use:
Element node = DocumentBuilderFactory
.newInstance()
.newDocumentBuilder()
.parse(new ByteArrayInputStream(newNode.getBytes()))
.getDocumentElement();
The simplest solution here is:
If it's a valid string (well-formed per the XML rules), just write it to a new file using FileWriter and give it an .xml extension.
In any case, it will not convert if it's not a valid XML string.

Building a DOM Document with tagsoup

I cannot make TagSoup work. I'm using the code that follows, but when I print the Node returned by the parser (the line with System.err.println(doc);) , I always get "[#document: null]".
I don't know how to find the bug in this code or, whatever it is, the origin of the problem. Please help!
public final Document parseDOM(final File fileToParse) {
Parser p = new Parser();
SAX2DOM sax2dom = null;
org.w3c.dom.Node doc = null;
try {
URL url = new URL("http://stackoverflow.com/");
p.setFeature(Parser.namespacesFeature, false);
p.setFeature(Parser.namespacePrefixesFeature, false);
sax2dom = new SAX2DOM();
p.setContentHandler(sax2dom);
p.parse(new InputSource(new InputStreamReader(url.openStream())));
doc = sax2dom.getDOM();
System.err.println(doc);
} catch (Exception e) {
// TODO handle exception
e.printStackTrace();
}
return doc.getOwnerDocument();
}
From the documentation on getOwnerDocument:
When this node is a Document or a DocumentType which is not used with any Document yet, this is null.
Since getDOM in your case should return a Document, you could simply cast the return value or change the type of doc to Document.
Your parser is working, but you just can't print out a node like that. The easiest way to print out a node and all its children is to use an XML Serializer like this:
Writer out = new StringWriter();
XMLSerializer serializer = new XMLSerializer(out, new OutputFormat());
serializer.serialize(doc);
System.out.println(out.toString());
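XMLSerializer is a Xerces class, so it needs Xerces on the classpath. If that is not available, a similar result can be had with the JDK's own javax.xml.transform API. The following is a minimal sketch under that assumption; the class and method names are made up for illustration, and it parses a small inline string rather than TagSoup output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class NodePrinterDemo {

    // Serializes a node and all of its children to a string.
    static String nodeToString(Node node) {
        try {
            Transformer transformer =
                    TransformerFactory.newInstance().newTransformer();
            // Leave out the <?xml ...?> line so only the markup is printed.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize node", e);
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader("<message>Hello</message>")));

        // Prints the markup instead of the node's default toString() text.
        System.out.println(nodeToString(doc));
    }
}
```

The same helper accepts any Node, so it can also print a single element from the middle of a tree.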
