I've built a document using JAXP like this:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.newDocument();
Element rootElement = document.createElement("Root");
for (MyObject o : myCollection) {
    Element entry = document.createElement("Entry");
    Element entryItem = document.createElement("EntryItem");
    entryItem.appendChild(document.createTextNode(o.getProperty()));
    entry.appendChild(entryItem);
    rootElement.appendChild(entry);
}
document.appendChild(rootElement);
Now, when I try to output the XML for the document like this:
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(document);
StreamResult result = new StreamResult(new StringWriter());
transformer.transform(source, result);
System.out.println(result.getWriter().toString());
It falls apart on the transformer.transform line with the following error:
FATAL ERROR: 'java.lang.NullPointerException'
:null
How do I go about debugging this? I've made sure that transformer, source and result aren't null.
I'm guessing that this:
entryItem.appendChild(document.createTextNode(o.getProperty()));
created a text node with a null value. Looking at Xerces' code (which is the default JAXP implementation shipped with Oracle's JDK 1.6), I see no null validation done at the time of constructing the text node. I suspect that that, later, makes the Transformer die.
Either that, or there's some JAXP configuration problem.
You may wish to set the jaxp.debug system property (available JDK 1.6+) to get some JAXP tracing information.
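If a null property value does turn out to be the cause, one way to rule it out is to skip the text node whenever the value is null. The following is only a minimal sketch: the class name NullSafeTextNodes, the method toXml and the list of strings (standing in for the values returned by o.getProperty()) are assumptions made for illustration, not part of the original code.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NullSafeTextNodes {

    // Hypothetical helper: "values" stands in for the o.getProperty() results.
    static String toXml(List<String> values) throws Exception {
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element rootElement = document.createElement("Root");
        document.appendChild(rootElement);

        for (String value : values) {
            Element entry = document.createElement("Entry");
            Element entryItem = document.createElement("EntryItem");
            // Only create a text node when there is a value to write,
            // so no text node ever carries null data.
            if (value != null) {
                entryItem.appendChild(document.createTextNode(value));
            }
            entry.appendChild(entryItem);
            rootElement.appendChild(entry);
        }

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        // The second value is null on purpose.
        System.out.println(toXml(Arrays.asList("a", null)));
    }
}
```

If the exception disappears with this guard in place, the null text node was the culprit; if not, the JAXP configuration is the next thing to check.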
--How about the document?
Ooops sorry, obviously the second part follows the first :) Which parser are you using?
I have hit somewhat of a roadblock.
My goal is to filter out everything except the number.
Here is the xml file
<?xml version="1.0" encoding="utf-8" ?>
<orders>
<order>
<stuff>"Some random information and # 123456"</stuff>
</order>
</orders>
Here is my incomplete code. I don't know how to find it nor how to go about making the change I want.
public static void main(String argv[]) {
    try {
        // Read the file
        File inputFile = new File("C:\\filepath...\\asdf.xml");
        DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
        Document doc = docBuilder.parse(inputFile);
        // I don't know where to go from there
        NodeList filter = doc.getChildNodes();
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        Transformer transformer = transformerFactory.newTransformer();
        DOMSource source = new DOMSource(doc);
        StreamResult consoleResult = new StreamResult(System.out);
        transformer.transform(source, consoleResult);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
When you use
Transformer transformer = transformerFactory.newTransformer();
the transformer is an "identity transformer" - it copies the input to the output with no change. In effect you're using the identity transformer here for serialization only, to convert the DOM to lexical XML.
If you want to make actual changes to the XML content, you have two choices: either write Java code to modify the in-memory DOM tree before serialising it, or write XSLT code so your Transformer is doing a real transformation not just an identity transformation. XSLT is almost certainly the better approach except that it involves more of a learning curve.
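To illustrate the first option (changing the in-memory DOM in Java before serialising it), here is a rough sketch. It assumes the goal is to keep only the digits inside every stuff element; the class name KeepDigitsDom and the method keepDigits are made up for the example, and the XML is passed in as a string rather than read from a file.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class KeepDigitsDom {

    // Hypothetical helper: strips every non-digit character from <stuff> elements.
    static String keepDigits(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList stuffNodes = doc.getElementsByTagName("stuff");
        for (int i = 0; i < stuffNodes.getLength(); i++) {
            Node stuff = stuffNodes.item(i);
            // Assumption: "the number" means all digits in the element's text.
            stuff.setTextContent(stuff.getTextContent().replaceAll("[^0-9]", ""));
        }

        // The identity transformer is used purely for serialization here.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<orders><order><stuff>\"Some random information and # 123456\""
                + "</stuff></order></orders>";
        System.out.println(keepDigits(xml));
    }
}
```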
I'm not sure exactly what output you want, which makes it difficult to give you working code. The phrase "filter out" is unfortunately ambiguous: when people say "I want to filter out X", they sometimes mean they want to remove X, and sometimes that they want to remove everything except X. Also, "removing the number" isn't a complete specification unless we know everything that might appear in your document. For example, is the number always preceded by "#", or is that only the case in this one sample input? One approach would be to remove all digits, which you could do in XSLT with a call to translate(., '0123456789', '').
Note that if you're using XSLT you don't need to construct a DOM first; in fact, doing so is a waste of time and space. Just supply the lexical XML as input to the transformer, in the form of a StreamSource.
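A sketch of the XSLT route, feeding the transformer from a StreamSource instead of a DOM. The stylesheet is an identity transform plus one template that rewrites the text inside stuff elements. Everything named here (the class DigitFilter, the methods stylesheet and transform, and the two constants) is an assumption for illustration. REMOVE_DIGITS is the translate() call mentioned above; KEEP_DIGITS uses the nested-translate idiom, in case the intent is to keep only the number.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DigitFilter {

    // Removes every digit from the text.
    static final String REMOVE_DIGITS = "translate(., '0123456789', '')";
    // Removes every character that is not a digit (nested translate idiom).
    static final String KEEP_DIGITS = "translate(., translate(., '0123456789', ''), '')";

    // Builds an XSLT 1.0 stylesheet: identity transform, except that text
    // inside <stuff> is replaced by the value of the given XPath expression.
    static String stylesheet(String select) {
        return "<xsl:stylesheet version=\"1.0\""
             + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
             + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
             + "<xsl:template match=\"@*|node()\">"
             + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
             + "</xsl:template>"
             + "<xsl:template match=\"stuff/text()\">"
             + "<xsl:value-of select=\"" + select + "\"/>"
             + "</xsl:template>"
             + "</xsl:stylesheet>";
    }

    static String transform(String xml, String select) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet(select))));
        StringWriter out = new StringWriter();
        // Lexical XML goes straight in; no DOM is built by our code.
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<orders><order><stuff>\"Some random information and # 123456\""
                + "</stuff></order></orders>";
        System.out.println(transform(xml, REMOVE_DIGITS));
        System.out.println(transform(xml, KEEP_DIGITS));
    }
}
```

To read from a file instead, new StreamSource(new File(...)) can replace the StringReader-based source.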
For example, I have an input.xml file with the following content:
<root>&quot;Hello World!&quot;</root>
Then I read it and parse:
// parse xml file
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse(new File("input.xml"));
doc.getDocumentElement().normalize();
Suppose I just want to save it somewhere else:
Transformer transformer = TransformerFactory.newInstance().newTransformer();
Result output = new StreamResult(new FileOutputStream("output.xml"));
doc.setXmlStandalone(true);
Source input = new DOMSource(doc);
transformer.setOutputProperty(OutputKeys.METHOD, "xml");
transformer.transform(input, output);
And I get this output.xml file:
<root>"Hello World!"</root>
What do I need to do to avoid this replacement?
I have not found any solution anywhere.
This question is similar to "Is there a Java XML API that can parse a document without resolving character entities?" and "Java XML Parsing: Avoid entity reference resolution".
If you want to keep entity references, you need a special XML parser.
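A small sketch of why the serializer cannot restore the references on its own: with the standard JAXP parser, a predefined reference such as &quot; has already been replaced by the character it stands for by the time the DOM exists. The class name EntityDemo and the method parsedText are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntityDemo {

    // Parses the XML and returns the text content of the root element.
    static String parsedText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The DOM text holds literal quote characters, not the references.
        System.out.println(parsedText("<root>&quot;Hello World!&quot;</root>"));
    }
}
```

Since the tree only stores the plain characters, nothing downstream knows that the input used references.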
I have a standalone Java project in which I am evaluating XPath expressions against my XML file, and it gives the right result.
When I integrated my source code into an application deployed on WebSphere 7, the result was no longer correct.
After checking, I found that in the first case (standalone project) the Document is well built (all the nodes from the root are recognized), while in the second case (source code added to the app deployed on WebSphere 7) the root node is missing from the Document.
I used the same implementation of "DocumentBuilderFactory" in both cases and the problem persists.
This is the code I used:
domFactory = DocumentBuilderFactory.newInstance();
xmlContent = new byte[inputStream.available()];
inputStream.read(xmlContent);
ByteArrayInputStream bais = new ByteArrayInputStream(xmlContent);
DocumentBuilder documentBuilder = domFactory.newDocumentBuilder();
document = documentBuilder.parse(bais);
xPATH = XPathFactory.newInstance().newXPath();
transformerFactory = TransformerFactory.newInstance();
transformer = transformerFactory.newTransformer();
The implementation behind domFactory is "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl" from the jar xerces-impl.1.5.
Any ideas would be helpful.
The InputStream.read(byte[]) method might or might not fill the array passed to it. At first, you were lucky; just because the code worked once doesn't mean it was guaranteed to work every time or on every platform.
The other problem is that the contract of the InputStream.available() method specifically states:
Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.
In your case, there is a pretty simple solution. You already have an InputStream. Do not read it into a byte array; delete all uses of xmlContent, bais, and available(). Parse the original InputStream instead:
DocumentBuilder documentBuilder = domFactory.newDocumentBuilder();
document = documentBuilder.parse(inputStream);
xPATH = XPathFactory.newInstance().newXPath();
transformerFactory = TransformerFactory.newInstance();
transformer = transformerFactory.newTransformer();
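If the bytes really are needed for something else as well, the usual pattern is to loop until read() returns -1 rather than trusting available(). This is a sketch only; the class name ReadFully and the methods readAll and parse are hypothetical names for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class ReadFully {

    // Reads the stream to the end, however many bytes each read() call delivers.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int count;
        while ((count = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, count);
        }
        return buffer.toByteArray();
    }

    // Parses from the complete byte array, so a short read cannot truncate the XML.
    static Document parse(InputStream in) throws Exception {
        byte[] xmlContent = readAll(in);
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlContent));
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<root><child/></root>".getBytes("UTF-8");
        Document document = parse(new ByteArrayInputStream(xml));
        System.out.println(document.getDocumentElement().getNodeName());
    }
}
```

Passing the InputStream straight to the parser remains the simpler choice when the raw bytes are not needed.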
I use the javax.xml.parsers.DocumentBuilder, and want to write an org.w3c.dom.Document to a file.
If there is an empty element, the default output is collapsed:
<element/>
Can I change this behavior so that it doesn't collapse the element? I.e.:
<element></element>
Thanks for your help.
This actually depends on how you're writing the document to a file and has nothing to do with DOM itself. The following example uses the popular Transformer-based approach:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
Document document = factory.newDocumentBuilder().newDocument();
Element element = document.createElement("tag");
document.appendChild(element);
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.METHOD, "html");
DOMSource source = new DOMSource(document);
StreamResult result = new StreamResult(System.out);
transformer.transform(source, result);
It outputs <tag></tag> as you're expecting. Please note that changing the output method has other side effects, such as the XML declaration being omitted.
I am trying to write an org.w3c.dom.Document to a file. I get the Document from
String URL = "http://....";
DOMParser parser = new DOMParser();
Document doc = null;
try {
parser.parse(new InputSource(URL));
doc = parser.getDocument();
} catch (Exception e) {
    e.printStackTrace();
}
Then I write this Document to a file using
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File(file));
transformer.transform(source, result);
While doing this I keep getting the following error
ERROR: 'Namespace for prefix 'xlink' has not been declared.'
What might be wrong? Thanks
I recommend using a different library such as Dom4J rather than trying to fight your way through the built-in XML API in Java. Dom4J is better designed and makes your code much more readable:
Document doc = new SAXReader().read(inputStream);
new XMLWriter(outputStream).write(doc);
None of this mucking around with FactoryFactoryFactoryFactories.
I know this doesn't directly answer your question but hopefully it will help anyway. Dom4j knows how to talk to the Java XML API so you can mix and match them to suit your needs. You can even plug it into Xalan or something similar if you want to use XSLT.