javax.xml.transform.Transformer very slow

I am using this code to write a (simple) DOM tree to a string, but on my LG Optimus L3 it takes 30 seconds or more. How can I make it faster?
Transformer t = TransformerFactory.newInstance().newTransformer();
t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
t.transform(new DOMSource(doc), new StreamResult(writer));
result = writer.getBuffer().toString();

I ended up just writing my own serializer. It certainly doesn't support everything, only tag names, attributes and text content (and it does not escape special characters), but it is simple and fast:
void write(Node e, StringWriter w) {
if (e.getNodeType() == Node.ELEMENT_NODE) {
w.write("<"+e.getNodeName());
if (e.hasAttributes()) {
NamedNodeMap attrs = e.getAttributes();
for (int i = 0; i < attrs.getLength(); i++) {
w.write(" "+attrs.item(i).getNodeName()+"=\""+
attrs.item(i).getNodeValue()+"\"");
}
}
w.write(">");
if (e.hasChildNodes()) {
NodeList children = e.getChildNodes();
for (int i = 0; i < children.getLength(); i++) {
write(children.item(i), w);
}
}
w.write("</"+e.getNodeName()+">");
}
if (e.getNodeType() == Node.TEXT_NODE) {
w.write(e.getTextContent());
}
}
You use it with a Document like this:
StringWriter writer = new StringWriter();
String result;
write(doc.getDocumentElement(), writer);
result = writer.getBuffer().toString();
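The serializer above writes attribute values and text content verbatim, so a value containing `<`, `&` or `"` would produce malformed XML. The following is a minimal sketch of an escaping helper that could be applied to those values. The class and method names (`XmlEscaper`, `escape`) are hypothetical and not part of the original answer.

```java
final class XmlEscaper {
    private XmlEscaper() {}

    /** Replaces the characters that are special in XML text and attribute values. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

In `write`, the calls to `getNodeValue()` and `getTextContent()` could be wrapped as `XmlEscaper.escape(...)` before being passed to `w.write`.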

Related

Resulting certificate in signature contains "&#13;", how to remove it?

I'm using xades4j to sign an XML document, and everything works fine.
But in the resulting XML the X509Certificate looks something like this:
<ds:X509Certificate> MIIDUjCCAjqgAwIBAgIIYFxFM0GPYwowDQYJKoZIhvcNAQELBQAwKTEMMAoGA1UEAwwDRkVMMQww
CgYDVQQKDANTQVQxCzAJBgNVBAYTAkdUMB4XDTE4MTIxMDE1MTQyOFoXDTIwMTIwOTE1MTQyOFow
KDERMA8GA1UEAwwIODI1NzYyNTQxEzARBgNVBAoMCnNhdC5nb2IuZ3QwggEiMA0GCSqGSIb3DQEB
AQUAA4IBDwAwggEKAoIBAQC6QTYY7yGtmikBaV6pNVee6WzNBToIr3jlFikbvZI4JD+4p0LJqten
</ds:X509Certificate>
How can I remove the "&#13;" from it?
The method that performs the signing is this one:
@Override
public DOMSource generarFirmaDigitalParaXML(Document xml, KeyingDataProvider keyingDataProvider, String nombreArchivoXmlOriginal) {
final Element rootElement = xml.getDocumentElement();
Element elementoAFirmar = null;
NodeList nodeList = xml.getElementsByTagName("dte:DatosEmision");
DOMSource source = null;
int lenght = nodeList.getLength();
for (int i = 0; i < lenght; i++) {
Node nNode = nodeList.item(i);
elementoAFirmar = (Element) nNode;
}
XadesBesSigningProfile profile = new XadesBesSigningProfile(keyingDataProvider);
try {
XadesSigner signer = profile.newSigner();
String atributoUtilizado = seleccionarAttributoComoId(elementoAFirmar, "ID");
if (atributoUtilizado != null) {
DataObjectDesc obj = new DataObjectReference("#" + elementoAFirmar.getAttribute(atributoUtilizado))
.withTransform(new EnvelopedSignatureTransform());
SignedDataObjects dataObjs = new SignedDataObjects().withSignedDataObject(obj);
signer.sign(dataObjs, rootElement);
xml.setXmlStandalone(true);
source = new DOMSource(xml);
} else {
throw new Exception("Atributo no encontrado en el XML");
}
} catch (Exception e) {
bitacora.log(Level.SEVERE, LOGGER, bitacora.obtenerStackTrace(e), true);
}
return source;
}

It took me a few hours to resolve, but I finally found the solution here: https://bugs.openjdk.java.net/browse/JDK-8264194
Set this system property before any signing code runs, for example in a static initializer:
static {
System.setProperty("com.sun.org.apache.xml.internal.security.ignoreLineBreaks", "true");
}

Reorganize XML using Java DOM - Hierarchy_Request_Error

I have an XML file with the following structure:
<test>
<testcase classname="TestsQuarantine.CreateUsers" name="Administrator"/>
<testcase classname="TestsQuarantine.Login" name="documentMailQuarantine"/>
<testcase classname="TestsClerk.CreateUsers" name="John"/>
</test>
I need to reorganize it to
<test>
<testsuite name="Quarantine">
<testcase classname="TestsQuarantine.CreateUsers" name="Administrator"/>
<testcase classname="TestsQuarantine.Login" name="documentMailQuarantine"/>
</testsuite>
<testsuite name="Clerk">
<testcase classname="TestsClerk.CreateUsers" name="John"/>
</testsuite>
</test>
At this point I read the file into a NodeList, iterate through it, create a new root, and try to swap it with the original to get the structure I need, but I get the following error:
HIERARCHY_REQUEST_ERR: An attempt was made to insert a node where it
is not permitted.
It happens on the line that swaps the roots, and I'm out of ideas as to why. Here is my code:
File file = new File(fullPath);
List<Element> clerk = null,
quara = null,
misc = null;
try {
DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
Document doc = builder.parse(file);
NodeList nodes = doc.getElementsByTagName("test");
Element root = doc.getDocumentElement(),
newRoot = doc.createElement("test");
clerk = new ArrayList<Element>();
quara = new ArrayList<Element>();
misc = new ArrayList<Element>();
for(int i=0; i < nodes.getLength(); i++) {
Element node = (Element) nodes.item(i);
if(node.getAttribute("classname").contains("Clerk")) {
clerk.add(node);
} else if(node.getAttribute("classname").contains("Quarantine")) {
quara.add(node);
} else {
misc.add(node);
}
}
if(clerk.isEmpty() == false) {
Element clerkSuite = doc.createElement("testsuite");
clerkSuite.setAttribute("name", "Clerk");
for(Element el : clerk) {
clerkSuite.appendChild(el);
}
newRoot.appendChild(clerkSuite);
}
if(quara.isEmpty() == false) {
Element quaraSuite = doc.createElement("testsuite");
quaraSuite.setAttribute("name", "Quarantine");
for(Element el : quara) {
quaraSuite.appendChild(el);
}
newRoot.appendChild(quaraSuite);
}
if(misc.isEmpty() == false) {
Element miscSuite = doc.createElement("testsuite");
miscSuite.setAttribute("name", "Miscellaneous");
for(Element el : misc) {
miscSuite.appendChild(el);
}
newRoot.appendChild(miscSuite);
}
root.getParentNode().replaceChild(newRoot, root);
DOMSource original = new DOMSource(doc);
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
StreamResult overritten = new StreamResult(fullPath);
transformer.transform(original, overritten);
} catch (Exception e) {
e.printStackTrace();
}
What do I have to change to make it work?

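Your iteration over the testcase nodes is incorrect. doc.getElementsByTagName("test") returns the root <test> element itself, not its <testcase> children, so the root ends up appended inside one of the new suites and the final replaceChild then tries to insert newRoot into its own descendant. I changed that fragment to the one below, and your code works: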
Node testNode = doc.getDocumentElement();
NodeList testCases= testNode.getChildNodes();
for(int i=0; i < testCases.getLength(); i++) {
Node n = testCases.item(i);
if (!(n instanceof Text)) {
Element testCase = (Element) n;
if (testCase.getAttribute("classname").contains("Clerk")) {
clerk.add(testCase);
} else if (testCase.getAttribute("classname").contains("Quarantine")) {
quara.add(testCase);
} else {
misc.add(testCase);
}
}
}
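For comparison, here is a self-contained sketch of the same regrouping that avoids replacing the root at all: it collects the element children first and then moves each one into a suite element appended to the existing root. The class name `SuiteGrouper`, the helper `suiteNameFor`, and parsing from a string are assumptions made for illustration. This is a different design from the question's root swap, not the answer's exact method.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class SuiteGrouper {
    private SuiteGrouper() {}

    /** Hypothetical rule mirroring the question: pick a suite name from the classname. */
    static String suiteNameFor(String className) {
        if (className.contains("Clerk")) return "Clerk";
        if (className.contains("Quarantine")) return "Quarantine";
        return "Miscellaneous";
    }

    /** Parses the XML and moves every element child of the root into a <testsuite>. */
    static Document group(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();

            // Snapshot the element children first; the child list changes as nodes move.
            List<Element> cases = new ArrayList<>();
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) cases.add((Element) n);
            }

            Map<String, Element> suites = new LinkedHashMap<>();
            for (Element testCase : cases) {
                String name = suiteNameFor(testCase.getAttribute("classname"));
                Element suite = suites.get(name);
                if (suite == null) {
                    suite = doc.createElement("testsuite");
                    suite.setAttribute("name", name);
                    root.appendChild(suite);
                    suites.put(name, suite);
                }
                suite.appendChild(testCase); // appendChild moves the node out of its old parent
            }
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException("Could not regroup test cases", e);
        }
    }
}
```

Whitespace-only text nodes that sat between the original test cases stay under the root, so the output may need re-indenting when it is serialized.");
org.w3c.dom.Element r = d.getDocumentElement();
if (r.getChildNodes().getLength() != 2) throw new AssertionError("expected 2 suites");
org.w3c.dom.Element first = (org.w3c.dom.Element) r.getFirstChild();
if (!first.getTagName().equals("testsuite")) throw new AssertionError("first child is not a testsuite");
if (!first.getAttribute("name").equals("Quarantine")) throw new AssertionError("first suite should be Quarantine");
if (first.getChildNodes().getLength() != 2) throw new AssertionError("Quarantine should hold 2 cases");
org.w3c.dom.Element second = (org.w3c.dom.Element) r.getLastChild();
if (!second.getAttribute("name").equals("Clerk")) throw new AssertionError("second suite should be Clerk");
if (second.getChildNodes().getLength() != 1) throw new AssertionError("Clerk should hold 1 case");
</test>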

setTextContent does not work for me

I have a document, data.xml, which you can find below. The writeXML function is used to set a node's value.
I call targetNode.setTextContent(strValue) with strValue = "400.00" to update node P2, but P2 stays empty in my XML file; it is never updated by setTextContent().
My Selenium version is 2.40.0.
public void writeXML(String strTestName, String strTargetNode, String strValue) throws Exception{
report= new ReportGen();
//get data.xml path
String path = System.getProperty("user.dir") + "\\data.xml";
Document document = load(path);
//get root node
Element root = document.getDocumentElement();
// System.out.println("The root node is:"+root.getTagName());
NodeList nl = root.getChildNodes();
NodeList cnl = null;
org.w3c.dom.Node targetNode = null;
String logStr = null;
String strNodeName = null;
int length = nl.getLength();
try{
for(int i=0; i<length;i++){
targetNode = nl.item(i);
if(targetNode!=null && targetNode instanceof Element && targetNode.getNodeName().equals(strTestName)){
if(targetNode.hasChildNodes()){
cnl = targetNode.getChildNodes();
break;
}else{
assert false;
}
}
}
length = cnl.getLength();
for(int i=0; i<length;i++){
targetNode = cnl.item(i);
strNodeName =targetNode.getNodeName();
if(targetNode!=null&&strNodeName.equals(strTargetNode)){
targetNode.setTextContent(strValue);
break;
}
}
}catch(Exception exception){
logStr=exception.getMessage();
assert false;
}
}
Below is my data.xml
<SF>
<TC03>
<KAM></KAM>
<PartnerName></PartnerName>
<Product></Product>
<P2></P2>
<P4></P4>
<P5></P5>
</TC03>
</SF>
Could anyone give a suggestion?

You have to save the document back to the file once you have changed node values with setTextContent.
Add something like this at the end of your code:
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(document);
// fXmlFile stands for the file being updated (data.xml in the question)
OutputStream stream = new FileOutputStream(fXmlFile);
StreamResult sresult = new StreamResult(stream);
transformer.transform(source, sresult);
stream.close();
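To see that setTextContent does change the in-memory tree and that the change only becomes visible once the document is serialized, here is a small self-contained sketch that works on strings instead of files. The class and method names (`SetTextDemo`, `setAndSerialize`) are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class SetTextDemo {
    private SetTextDemo() {}

    /** Sets the text of the first element named tag and returns the serialized document. */
    static String setAndSerialize(String xml, String tag, String value) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node target = doc.getElementsByTagName(tag).item(0);
            if (target == null) {
                throw new IllegalArgumentException("No element named " + tag);
            }
            target.setTextContent(value); // changes the in-memory tree only

            // Serializing is what makes the change visible outside the DOM.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalStateException("Could not update the document", e);
        }
    }
}
```

Writing to a file works the same way; only the StreamResult target changes.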

Deleting XML element in Java

I have an XML:
<?xml version="1.0" encoding="UTF-8"?>
<songs>
<song>
<title>Gracious</title>
<artist>Ben Howard</artist>
<genre>Singer/Songwriter</genre>
</song>
<song>
<title>Only Love</title>
<artist>Ben Howard</artist>
<genre>Singer/Songwriter</genre>
</song>
<song>
<title>Bad Blood</title>
<artist>Bastille</artist>
<genre>N/A</genre>
</song>
<song>
<title>Keep Your Head Up</title>
<artist>Ben Howard</artist>
<genre>Singer/Songwriter</genre>
</song>
<song>
<title>Intro</title>
<artist>Alt-J</artist>
<genre>Alternative</genre>
</song>
</songs>
and my Java code is:
public static void deleteSong(Song song) {
String songTitle = song.getTitle();
String songArtist = song.getArtist();
String songGenre = song.getGenre();
try {
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
File file = new File("songs.xml");
Document doc = db.parse(file);
NodeList songList = doc.getElementsByTagName("song");
if (songList != null && songList.getLength() > 0) {
for (int i = 0; i < songList.getLength(); i++) {
Node node = songList.item(i);
Element e = (Element) node;
NodeList nodeList = e.getElementsByTagName("title");
String title = nodeList.item(0).getChildNodes().item(0)
.getNodeValue();
nodeList = e.getElementsByTagName("artist");
String artist = nodeList.item(0).getChildNodes().item(0)
.getNodeValue();
nodeList = e.getElementsByTagName("genre");
String genre = nodeList.item(0).getChildNodes().item(0)
.getNodeValue();
System.out.println(title + " Title");
System.out.println(songTitle + " SongTitle");
if (title.equals(songTitle)) {
if (artist.equals(songArtist)) {
if (genre.equals(songGenre)) {
doc.getFirstChild().removeChild(node);
}
}
}
}
}
MainDisplay.main(null);
} catch (Exception e) {
System.out.println(e);
}
}
The song to be deleted is passed into the method and then compared to the songs in the XML file. However, even when the song matches one in the XML, it isn't deleted. No exceptions are thrown.

You need to remove the relevant node through its own parent; in your code you remove it from the document's first child, which is not necessarily its parent.
Then write your changes back to the file:
if (title.equals(songTitle) && artist.equals(songArtist) && genre.equals(songGenre) ) {
node.getParentNode().removeChild(node);
}
// write back to xml file
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File(filepath));
transformer.transform(source, result);

From what I see, you are only reading the document. At some point, you will have to write the changes back to the XML file.
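One more detail to watch when removing nodes in a loop: the NodeList returned by getElementsByTagName is live, so removing an element shifts the indices of the ones after it, and a forward loop can skip matches. Below is a minimal sketch that iterates backwards instead. The names `SongRemover`, `parse` and `removeByTitle` are hypothetical, and it matches on the title only to keep the example short.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class SongRemover {
    private SongRemover() {}

    /** Parses an XML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** Removes every <song> whose <title> equals the given title; returns how many were removed. */
    static int removeByTitle(Document doc, String title) {
        NodeList songs = doc.getElementsByTagName("song");
        int removed = 0;
        // Walk from the end: the list is live, so removals shift later indices.
        for (int i = songs.getLength() - 1; i >= 0; i--) {
            Element song = (Element) songs.item(i);
            Node titleNode = song.getElementsByTagName("title").item(0);
            if (titleNode != null && title.equals(titleNode.getTextContent())) {
                song.getParentNode().removeChild(song);
                removed++;
            }
        }
        return removed;
    }
}
```

After the removal, the document still has to be written back with a Transformer, as in the answers above.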

How to strip whitespace-only text nodes from a DOM before serialization?

I have some Java (5.0) code that constructs a DOM from various (cached) data sources, then removes certain element nodes that are not required, then serializes the result into an XML string using:
// Serialize DOM back into a string
Writer out = new StringWriter();
Transformer tf = TransformerFactory.newInstance().newTransformer();
tf.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
tf.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
tf.setOutputProperty(OutputKeys.INDENT, "no");
tf.transform(new DOMSource(doc), new StreamResult(out));
return out.toString();
However, since I'm removing several element nodes, I end up with a lot of extra whitespace in the final serialized document.
Is there a simple way to remove/collapse the extraneous whitespace from the DOM before (or while) it's serialized into a String?

You can find empty text nodes using XPath, then remove them programmatically like so:
XPathFactory xpathFactory = XPathFactory.newInstance();
// XPath to find empty text nodes.
XPathExpression xpathExp = xpathFactory.newXPath().compile(
"//text()[normalize-space(.) = '']");
NodeList emptyTextNodes = (NodeList)
xpathExp.evaluate(doc, XPathConstants.NODESET);
// Remove each empty text node from document.
for (int i = 0; i < emptyTextNodes.getLength(); i++) {
Node emptyTextNode = emptyTextNodes.item(i);
emptyTextNode.getParentNode().removeChild(emptyTextNode);
}
This approach might be useful if you want more control over node removal than is easily achieved with an XSL template.

Try using the following XSL and the strip-space element to serialize your DOM:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
http://helpdesk.objects.com.au/java/how-do-i-remove-whitespace-from-an-xml-document
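The stylesheet on its own does not show how to apply it from Java. The sketch below, under the assumption that the stylesheet is embedded as a string constant, passes it to `TransformerFactory.newTransformer(Source)` and serializes a DOM through it. The class and method names (`StripSpaceDemo`, `serialize`) are made up for illustration, and the input is parsed from a string only so the example is self-contained.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class StripSpaceDemo {
    private StripSpaceDemo() {}

    // Identity transform plus strip-space, as in the stylesheet above.
    private static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:strip-space elements=\"*\"/>"
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Parses the XML into a DOM, then serializes it through the stylesheet. */
    static String serialize(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

In the question's code, the only change would be to create the Transformer from the stylesheet instead of calling the no-argument `newTransformer()`.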

The code below deletes comment nodes and whitespace-only text nodes. If a text node has content, its value is trimmed:
public static void clean(Node node)
{
NodeList childNodes = node.getChildNodes();
for (int n = childNodes.getLength() - 1; n >= 0; n--)
{
Node child = childNodes.item(n);
short nodeType = child.getNodeType();
if (nodeType == Node.ELEMENT_NODE)
clean(child);
else if (nodeType == Node.TEXT_NODE)
{
String trimmedNodeVal = child.getNodeValue().trim();
if (trimmedNodeVal.length() == 0)
node.removeChild(child);
else
child.setNodeValue(trimmedNodeVal);
}
else if (nodeType == Node.COMMENT_NODE)
node.removeChild(child);
}
}
Ref: http://www.sitepoint.com/removing-useless-nodes-from-the-dom/

Another possible approach is to remove neighboring whitespace at the same time as you're removing the target nodes:
private void removeNodeAndTrailingWhitespace(Node node) {
List<Node> exiles = new ArrayList<Node>();
exiles.add(node);
for (Node whitespace = node.getNextSibling();
whitespace != null && whitespace.getNodeType() == Node.TEXT_NODE && whitespace.getTextContent().matches("\\s*");
whitespace = whitespace.getNextSibling()) {
exiles.add(whitespace);
}
for (Node exile: exiles) {
exile.getParentNode().removeChild(exile);
}
}
This has the benefit of keeping the rest of the existing formatting intact.

The following code works:
public String getSoapXmlFormatted(String pXml) {
try {
if (pXml != null) {
DocumentBuilderFactory tDbFactory = DocumentBuilderFactory
.newInstance();
DocumentBuilder tDBuilder;
tDBuilder = tDbFactory.newDocumentBuilder();
Document tDoc = tDBuilder.parse(new InputSource(
new StringReader(pXml)));
removeWhitespaces(tDoc);
final DOMImplementationRegistry tRegistry = DOMImplementationRegistry
.newInstance();
final DOMImplementationLS tImpl = (DOMImplementationLS) tRegistry
.getDOMImplementation("LS");
final LSSerializer tWriter = tImpl.createLSSerializer();
tWriter.getDomConfig().setParameter("format-pretty-print",
Boolean.FALSE);
tWriter.getDomConfig().setParameter(
"element-content-whitespace", Boolean.TRUE);
pXml = tWriter.writeToString(tDoc);
}
} catch (RuntimeException | ParserConfigurationException | SAXException
| IOException | ClassNotFoundException | InstantiationException
| IllegalAccessException tE) {
tE.printStackTrace();
}
return pXml;
}
public void removeWhitespaces(Node pRootNode) {
if (pRootNode != null) {
NodeList tList = pRootNode.getChildNodes();
if (tList != null && tList.getLength() > 0) {
ArrayList<Node> tRemoveNodeList = new ArrayList<Node>();
for (int i = 0; i < tList.getLength(); i++) {
Node tChildNode = tList.item(i);
if (tChildNode.getNodeType() == Node.TEXT_NODE) {
if (tChildNode.getTextContent() == null
|| "".equals(tChildNode.getTextContent().trim()))
tRemoveNodeList.add(tChildNode);
} else
removeWhitespaces(tChildNode);
}
for (Node tRemoveNode : tRemoveNodeList) {
pRootNode.removeChild(tRemoveNode);
}
}
}
}

I did it like this:
private static final Pattern WHITESPACE_PATTERN = Pattern.compile("\\s*", Pattern.DOTALL);
private void removeWhitespace(Document doc) {
LinkedList<NodeList> stack = new LinkedList<>();
stack.add(doc.getDocumentElement().getChildNodes());
while (!stack.isEmpty()) {
NodeList nodeList = stack.removeFirst();
for (int i = nodeList.getLength() - 1; i >= 0; --i) {
Node node = nodeList.item(i);
if (node.getNodeType() == Node.TEXT_NODE) {
if (WHITESPACE_PATTERN.matcher(node.getTextContent()).matches()) {
node.getParentNode().removeChild(node);
}
} else if (node.getNodeType() == Node.ELEMENT_NODE) {
stack.add(node.getChildNodes());
}
}
}
}

transformer.setOutputProperty(OutputKeys.INDENT, "yes");
This will retain xml indentation.
