Evaluation of XPath changes with different environments - java

I have a standalone Java project in which I evaluate XPath expressions against an XML file, and it gives the right result.
When I integrated the source code into an application deployed on WebSphere 7, the result was no longer correct.
After checking, I found that in the first case (standalone project) the Document is well built (all the nodes from the root are recognized), while in the second case (source code added to the app deployed on WS 7) the root node is missing from the Document.
I used the same implementation of "DocumentBuilderFactory" in both cases and the problem persists.
This is the code I used:
domFactory = DocumentBuilderFactory.newInstance();
xmlContent = new byte[inputStream.available()];
inputStream.read(xmlContent);
ByteArrayInputStream bais = new ByteArrayInputStream(xmlContent);
DocumentBuilder documentBuilder = domFactory.newDocumentBuilder();
document = documentBuilder.parse(bais);
xPATH = XPathFactory.newInstance().newXPath();
transformerFactory = TransformerFactory.newInstance();
transformer = transformerFactory.newTransformer();
The DocumentBuilderFactory implementation is "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl" from the jar xerces-impl.1.5.
Any ideas would be helpful.

The InputStream.read(byte[]) method might or might not fill the array passed to it. At first, you were lucky; just because the code worked once doesn't mean it was guaranteed to work every time or on every platform.
The other problem is that the contract of the InputStream.available() method specifically states:
Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.
In your case, there is a pretty simple solution. You already have an InputStream. Do not read it into a byte array; delete all uses of xmlContent, bais, and available(). Parse the original InputStream instead:
DocumentBuilder documentBuilder = domFactory.newDocumentBuilder();
document = documentBuilder.parse(inputStream);
xPATH = XPathFactory.newInstance().newXPath();
transformerFactory = TransformerFactory.newInstance();
transformer = transformerFactory.newTransformer();
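As a minimal sketch of the advice above, the following parses an InputStream directly and evaluates an XPath expression against the result. The class name XPathFromStream, the helper evaluate, and the sample XML are made up for illustration; they are not from the original code.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

class XPathFromStream {
    // Hypothetical helper: parses the stream as-is (no available(), no
    // intermediate byte array) and returns the XPath result as a string.
    static String evaluate(InputStream in, String expression) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(in); // the parser reads until end of stream
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, document);
    }

    public static void main(String[] args) throws Exception {
        // Sample document, assumed for the demo only.
        String xml = "<root><item>42</item></root>";
        try (InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))) {
            System.out.println(evaluate(in, "/root/item"));
        }
    }
}
```

Because the parser consumes the stream itself, it no longer matters how many bytes a single read() call happens to return on a given platform.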

Related

Java edit XML file with DOM

I have hit somewhat of a roadblock.
My goal is to filter out everything except the number.
Here is the xml file
<?xml version="1.0" encoding="utf-8" ?>
<orders>
<order>
<stuff>"Some random information and # 123456"</stuff>
</order>
</orders>
Here is my incomplete code. I don't know how to find it nor how to go about making the change I want.
public static void main(String argv[]) {
try {
// Read the file
File inputFile = new File("C:\\filepath...\\asdf.xml");
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse(inputFile);
// I don't know where to go from there
NodeList filter = doc.getChildNodes();
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult consoleResult = new StreamResult(System.out);
transformer.transform(source, consoleResult);
} catch (Exception e) {
e.printStackTrace();
}
}
When you use
Transformer transformer = transformerFactory.newTransformer();
the transformer is an "identity transformer" - it copies the input to the output with no change. In effect you're using the identity transformer here for serialization only, to convert the DOM to lexical XML.
If you want to make actual changes to the XML content, you have two choices: either write Java code to modify the in-memory DOM tree before serialising it, or write XSLT code so your Transformer is doing a real transformation not just an identity transformation. XSLT is almost certainly the better approach except that it involves more of a learning curve.
I'm not sure exactly what output you want, which makes it difficult to give you working code. The phrase "filter out" is unfortunately ambiguous: when people say "I want to filter out X" they sometimes mean they want to remove X, and sometimes they mean they want to remove everything except X. Also, "removing the number" isn't a complete specification unless we know all possibilities of what might appear in your document. For example, is the number always preceded by "#", or is that only the case in this one example input? But one approach would be to remove all digits, which you could do with a call on translate(., '0123456789', '').
Note that if you're using XSLT you don't need to construct a DOM first, in fact, it's a waste of time and space. Just supply the lexical XML as input to the transformer, in the form of a StreamSource.
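To make the XSLT route concrete, here is a rough sketch that feeds lexical XML to the transformer as a StreamSource and strips digits with translate(), following the "remove all digits" reading of the question. The stylesheet, the class name StripDigits, and the match pattern stuff/text() are assumptions based on the sample input, not something the asker provided.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class StripDigits {
    // Assumed stylesheet: an identity template that copies everything,
    // plus one rule that removes digits from text directly under <stuff>.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='stuff/text()'>"
      + "<xsl:value-of select=\"translate(., '0123456789', '')\"/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms lexical XML to lexical XML; no DOM is built by this code.
    static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<orders><order><stuff>\"Some random information and # 123456\"</stuff></order></orders>";
        System.out.println(transform(xml));
    }
}
```

If the goal is the opposite (keep only the number), the same structure applies and only the expression in the second template would change.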

TransformerFactory - avoiding network lookups to verify DTDs

I need to program an offline transformation of XML documents.
I have been able to stop DTD network lookups when loading the original XML file with the following:
DocumentBuilderFactory factory;
factory = DocumentBuilderFactory.newInstance();
factory.setValidating(false);
factory.setNamespaceAware(true);
factory.setFeature("http://xml.org/sax/features/namespaces", false);
factory.setFeature("http://xml.org/sax/features/validation", false);
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
// open up the xml document
docbuilder = factory.newDocumentBuilder();
doc = docbuilder.parse(new FileInputStream(m_strFilePath));
However, I am unable to apply this to the TransformerFactory object.
The DTDs are available locally, but I do not know how to direct the transformer to look at the local files as opposed to trying to do a network lookup.
From what I can see, the transformer needs these documents to correctly do the transformation.
For information, I am transforming MusicXML documents from Partwise to Timewise.
As you have probably guessed, XSLT is not my strong point (far from it).
Do I need to modify the XSLT files to reference local files, or can this be done differently ?
Further to the comments below, here is an excerpt of the xsl file. It is the only place that I see which refers to an external file :
<!--
XML output, with a DOCTYPE refering the timewise DTD.
Here we use the full Internet URL.
-->
<xsl:output method="xml" indent="yes" encoding="UTF-8"
omit-xml-declaration="no" standalone="no"
doctype-system="http://www.musicxml.org/dtds/timewise.dtd"
doctype-public="-//Recordare//DTD MusicXML 2.0 Timewise//EN" />
Is the mentioned technique valid for this also ?
The DTD file contains references to a number of MOD files like this :
<!ENTITY % layout PUBLIC
"-//Recordare//ELEMENTS MusicXML 2.0 Layout//EN"
"layout.mod">
I presume that these files will be imported in turn as well.
Ok, here is the answer which works for me.
1st step : load the original document, turning off validation and dtd loading within the factory.
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
// stop the network loading of DTD files
factory.setValidating(false);
factory.setNamespaceAware(true);
factory.setFeature("http://xml.org/sax/features/namespaces", false);
factory.setFeature("http://xml.org/sax/features/validation", false);
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
// open up the xml document
DocumentBuilder docbuilder = factory.newDocumentBuilder();
Document doc = docbuilder.parse(new FileInputStream(m_strFilePath));
2nd step : Now that I have got the document in memory ... and after having detected that I need to transform it -
TransformerFactory transformfactory = TransformerFactory.newInstance();
Templates xsl = transformfactory.newTemplates(new StreamSource(new FileInputStream((String)m_XslFile)));
Transformer transformer = xsl.newTransformer();
Document newdoc = docbuilder.newDocument();
Result XmlResult = new DOMResult(newdoc);
// now transform
transformer.transform(
new DOMSource(doc.getDocumentElement()),
XmlResult);
I needed to do this as I have further processing going on afterwards and did not want the overhead of outputting to file and reloading.
A brief explanation:
The trick is to use the original DOM object, which was built with all the validation features turned off. You can see this here:
transformer.transform(
new DOMSource(doc.getDocumentElement()), // <<-----
XmlResult);
This has been tested with network access TURNED OFF.
So I know that there are no more network lookups.
However, if the DTDs, MODs, etc. are available locally, then, as per the suggestions, the use of an EntityResolver is the answer. Again, this should be applied to the original docbuilder object.
I now have a transformed document stored in newdoc, ready to play with.
I hope this will help others.
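For the EntityResolver route mentioned above, a minimal sketch could look like the following. The class name LocalDtdResolver and the stub DTD text are invented so the example stays self-contained; a real setup would return an InputSource that opens the local copies of timewise.dtd and the .mod files instead.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class LocalDtdResolver {
    static final String DTD_URL = "http://www.musicxml.org/dtds/timewise.dtd";

    // Stand-in for the local DTD content (an assumption for this demo).
    static final String LOCAL_DTD = "<!ELEMENT score-timewise ANY>";

    // Sample input whose DOCTYPE points at the Internet URL.
    static final String SAMPLE =
        "<!DOCTYPE score-timewise PUBLIC \"-//Recordare//DTD MusicXML 2.0 Timewise//EN\" \""
        + DTD_URL + "\"><score-timewise/>";

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        EntityResolver resolver = (publicId, systemId) -> {
            // Serve the known DTD locally; anything else gets an empty
            // entity, so the parser never falls back to a network fetch.
            String text = DTD_URL.equals(systemId) ? LOCAL_DTD : "";
            InputSource source = new InputSource(new StringReader(text));
            // Keeping the system ID lets relative references such as
            // "layout.mod" come back through this resolver as well.
            source.setSystemId(systemId);
            return source;
        };
        builder.setEntityResolver(resolver);
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse(SAMPLE).getDocumentElement().getNodeName());
    }
}
```

Returning null from resolveEntity would hand control back to the parser's default behaviour, which is why this sketch always returns an InputSource.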
You can use a library like Apache xml-commons-resolver and write a catalog file to map web URLs to your local copy of the relevant files. To wire this catalog up to the transformer mechanism you would need to use a SAXSource instead of a StreamSource as the source of your stylesheet:
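CatalogResolver resolver = new CatalogResolver();
// A SAXSource built from an InputSource alone has no XMLReader yet,
// so create one explicitly and attach the resolver to it.
SAXParserFactory spf = SAXParserFactory.newInstance();
spf.setNamespaceAware(true);
XMLReader reader = spf.newSAXParser().getXMLReader();
reader.setEntityResolver(resolver);
SAXSource styleSource = new SAXSource(reader, new InputSource("file:/path/to/stylesheet.xsl"));
TransformerFactory tf = TransformerFactory.newInstance();
tf.setURIResolver(resolver);
Transformer transformer = tf.newTransformer(styleSource);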
The usual way to do this in Java is to use an LSResourceResolver to resolve the system ID (and/or public ID) to your local file. This is documented at http://docs.oracle.com/javase/7/docs/api/org/w3c/dom/ls/LSResourceResolver.html. You shouldn't need anything outside of standard Java XML parser features to get this working.

Error writing XML Document to file in Java

I am trying to write org.w3c.dom.Document to a file. I get the Document from
String URL = "http://....";
DOMParser parser = new DOMParser();
Document doc = null;
try {
parser.parse(new InputSource(URL));
doc = parser.getDocument();
} catch (Exception e) {
e.printStackTrace();
}
Then I write this Document to a file using
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File(file));
transformer.transform(source, result);
While doing this I keep getting the following error
ERROR: 'Namespace for prefix 'xlink' has not been declared.'
What might be wrong? Thanks
I recommend using a different library such as Dom4J rather than trying to fight your way through the built-in XML API in Java. Dom4J is better designed and makes your code much more readable:
Document doc = new SAXReader().read(inputStream);
new XMLWriter(outputStream).write(doc);
None of this mucking around with FactoryFactoryFactoryFactories.
I know this doesn't directly answer your question but hopefully it will help anyway. Dom4j knows how to talk to the Java XML API so you can mix and match them to suit your needs. You can even plug it into Xalan or something similar if you want to use XSLT.

org.w3c.dom.Document to String without javax.xml.transform

I've spent a while looking around on Google for a way to convert an org.w3c.dom.Document to a string representation of the whole DOM tree, so I can save the object to the file system.
However all the solutions I've found use javax.xml.transform.Transformer which isn't supported as part of the Android 2.1 API. How can I do this without using this class/containing package?
Please try this code:
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
Document doc = docBuilder.parse("/path/to/file.xml");
DOMImplementation domImpl = doc.getImplementation();
DOMImplementationLS domImplLS = (DOMImplementationLS)domImpl.getFeature("LS", "3.0");
LSSerializer serializer = domImplLS.createLSSerializer();
serializer.getDomConfig().setParameter("xml-declaration", Boolean.valueOf(false));
LSOutput lsOutput = domImplLS.createLSOutput();
Writer output = new StringWriter(); // any java.io.Writer will do
lsOutput.setCharacterStream(output);
serializer.write(doc, lsOutput);
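A self-contained version of the same idea, as a sketch: it serializes a Document to a String with the DOM Level 3 Load and Save API and never touches javax.xml.transform. The class name DomToString and its helpers are illustrative, and whether the "LS" feature is actually available on Android 2.1 is an assumption that has not been checked here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

class DomToString {
    // Serializes the whole tree without an XML declaration.
    static String serialize(Document doc) {
        DOMImplementationLS ls =
            (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        StringWriter out = new StringWriter();
        LSOutput lsOutput = ls.createLSOutput();
        lsOutput.setCharacterStream(out); // write characters, not bytes
        serializer.write(doc, lsOutput);
        return out.toString();
    }

    // Helper for the demo: parse a string, then serialize it again.
    static String roundTrip(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return serialize(doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("<a><b>hi</b></a>"));
    }
}
```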
To avoid using Transformer you should manually iterate over your xml tree, otherwise you can rely on some external libraries. You should take a look here.

Transformer's transform causes a fatal error, why?

I've built a document using JAXP like this:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.newDocument();
Element rootElement = document.createElement("Root");
for (MyObject o : myCollection) {
Element entry = document.createElement("Entry");
Element entryItem = document.createElement("EntryItem");
entryItem.appendChild(document.createTextNode(o.getProperty()));
entry.appendChild(entryItem);
rootElement.appendChild(entry);
}
document.appendChild(rootElement);
Now, when I try to output the XML for the document like this:
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(document);
StreamResult result = new StreamResult(new StringWriter());
transformer.transform(source, result);
System.out.println(result.getWriter().toString());
It falls apart on the transformer.transform line with the following error:
FATAL ERROR: 'java.lang.NullPointerException'
:null
How do I go about debugging this? I've made sure that transformer, source and result aren't null.
I'm guessing that this:
entryItem.appendChild(document.createTextNode(o.getProperty()));
created a text node with a null value. Looking at Xerces' code (which is the default JAXP implementation shipped with Oracle's JDK 1.6), I see no null check at the time the text node is constructed. I suspect that this is what later makes the Transformer die.
Either that, or there's some JAXP configuration problem.
You may wish to set the jaxp.debug system property (available JDK 1.6+) to get some JAXP tracing information.
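If the null text node is indeed the cause, one way to rule it out is to guard the value before it reaches createTextNode. The sketch below does that and enables jaxp.debug in main; the class name NullSafeEntries and the use of a plain list of strings in place of myCollection are assumptions made for the example.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NullSafeEntries {
    // Builds the same Root/Entry/EntryItem structure as the question,
    // replacing null values with an empty string before creating text nodes.
    static String build(List<String> values) throws Exception {
        Document document =
            DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = document.createElement("Root");
        for (String value : values) {
            Element entry = document.createElement("Entry");
            Element entryItem = document.createElement("EntryItem");
            // Guard: never hand a null to createTextNode.
            entryItem.appendChild(document.createTextNode(value == null ? "" : value));
            entry.appendChild(entryItem);
            root.appendChild(entry);
        }
        document.appendChild(root);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Optional JAXP tracing, as suggested above.
        System.setProperty("jaxp.debug", "1");
        System.out.println(build(Arrays.asList("value", null)));
    }
}
```

If the error disappears with the guard in place, the null property value was the culprit; if not, the jaxp.debug output may help narrow down a configuration problem.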
--How about the document?
Ooops sorry, obviously the second part follows the first :) Which parser are you using?
