Java Style XML one node per line and no whitespace - java

I'm experimenting with removing whitespace from an XML document while keeping each node on its own line when adding and removing elements in Java, and I'm having trouble understanding XSL style sheets.
Here is what's happening so far.
Firstly I have the following XML,
<jobs>
<job>Job 1</job>
<job>Job 2</job>
<job>Job 3</job>
<job>Job 4</job>
</jobs>
Then I remove one of the elements and it ends up looking like this, with whitespace left where the element was,
<jobs>
<job>Job 1</job>
<job>Job 3</job>
<job>Job 4</job>
</jobs>
So I tried applying the following style sheet I found,
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Which makes the xml appear on one line because it removes all whitespace. But I'm trying to keep the file readable too.
<jobs><job>Job 1</job><job>Job 2</job><job>Job 3</job><job>Job 4</job></jobs>
I was wondering if anyone has a style sheet to achieve this?

You need to add indent="yes" to the xsl:output element:
<xsl:output method="xml" indent="yes" omit-xml-declaration="yes"/>
(Also, you might want to switch to XSL Version 2.0)
Hope this helps
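For anyone applying this from Java, here is a minimal sketch using the JDK's built-in javax.xml.transform API. The class and method names (OneNodePerLine, reformat) are made up for illustration, and the exact indentation depth depends on the XSLT processor in use, so treat it as a starting point rather than a definitive implementation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: strips whitespace-only text nodes, then lets the
// serializer put each element on its own line via indent="yes".
final class OneNodePerLine {

    // Identity transform combined with strip-space and indent="yes".
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" indent=\"yes\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:strip-space elements=\"*\"/>"
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String reformat(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The blank line mimics the gap left behind by a removed element.
        String xml = "<jobs>\n<job>Job 1</job>\n\n<job>Job 3</job>\n<job>Job 4</job>\n</jobs>";
        System.out.println(reformat(xml));
    }
}
```

The strip-space instruction removes the leftover blank line, and indent="yes" asks the serializer to add line breaks back between elements.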

Related

Fastest way to check for comments and processing instructions in XML file and strip whitespace?

What's the most performant way to check for comments and processing instructions in an XML file and strip unnecessary whitespace in Java?
Processing should fail if either comment or processing instruction is contained in the XML file otherwise all unnecessary whitespace should be removed.
What would the solution look like if processing instructions and comments should be removed as well, instead of failing the validation / transformation?
You could use this xslt (version 1.0):
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|text()|*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="processing-instruction()|comment()"/>
</xsl:stylesheet>
Unnecessary whitespace is somewhat tricky, since in a mixed-content model whitespace can be significant.
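The stylesheet above covers the removal case. For the "fail if present" case, one possible approach (not from the answer above, and not benchmarked) is a streaming StAX pass that stops at the first comment or processing instruction. The class and method names below are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: reports whether a document contains any comment or
// processing instruction, without building a tree.
final class CommentPiCheck {

    static boolean containsCommentOrPi(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                // The XML declaration is not reported as a processing instruction.
                if (event == XMLStreamConstants.COMMENT
                        || event == XMLStreamConstants.PROCESSING_INSTRUCTION) {
                    return true;
                }
            }
            return false;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(containsCommentOrPi("<a><!-- note --><b/></a>"));
        System.out.println(containsCommentOrPi("<a><b/></a>"));
    }
}
```

A caller could throw an exception when this returns true and run the whitespace-stripping transformation otherwise.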

XSLT to pick specific nodes, and parameterizing it from Java

I have an XML structure like this,
<?xml version="1.0" encoding="UTF-8"?>
<Package>
<PackageHeader>yadda yadda </PackageHeader>
<PackageBody>
<Element1>foo</Element1>
<Element2>bar</Element2>
<ElementN>xyz</ElementN>
</PackageBody>
</Package>
I have a requirement where I need to eliminate either Element1, Element2, or ElementN, so I wrote this XSLT,
<xsl:stylesheet version="3.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" />
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()[not(self::Element1)][not(self::Element2)]" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
I am running this via a simple Java XSL transformation program. The transformed XML contains everything from the original XML minus Element1 and Element2. I tried many ways to pass parameters from the Java program to control which nodes are eliminated, but no luck so far. Any help would be much appreciated.
Sounds like a task for XSLT 3 with static parameters and shadow attributes, best used with the Saxon s9api (http://saxonica.com/html/documentation9.9/javadoc/index.html) if that is the processor used:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
version="3.0">
<xsl:param name="element-to-be-removed" static="yes" as="xs:string" select="'Element2'"/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:template _match="{$element-to-be-removed}"/>
</xsl:stylesheet>
https://xsltfiddle.liberty-development.net/gVhEaiE
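If Saxon is not available, a different technique may be enough: an ordinary runtime parameter in an XSLT 1.0 stylesheet, set through the JDK's Transformer.setParameter. This is not the static-parameter approach described above. The parameter name elementToRemove and the class name are assumptions made for this sketch, and the parameter should always be given a non-empty value, since an empty string would also match text nodes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: copies the input but skips every node whose name
// equals the runtime parameter "elementToRemove".
final class RemoveElementByParam {

    // The variable is used in a select expression, not in a match pattern,
    // because XSLT 1.0 does not allow variable references in patterns.
    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
      + "<xsl:param name=\"elementToRemove\"/>"
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy>"
      + "<xsl:apply-templates select=\"@*|node()[not(name() = $elementToRemove)]\"/>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String remove(String xml, String elementName) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        transformer.setParameter("elementToRemove", elementName);
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Package><PackageBody><Element1>foo</Element1>"
                   + "<Element2>bar</Element2></PackageBody></Package>";
        System.out.println(remove(xml, "Element2"));
    }
}
```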

How to copy parent node namespace to child element using xslt?

My XML, which I created using Java's JAXBContext and Marshaller, looks like this.
I want to reformat only part of the XML, not the whole document.
<?xml version="1.0" encoding="UTF-8"?>
<ns4:Requests xmlns:ns2="http://www.dummy.com/xsd/tublu/murmur_001" xmlns:ns3="http://www.dummy.com/xsd/CommonObjects_001" xmlns:ns4="http://www.dummy.com/xsd/naku_001">
<ns4:RequestSetId>fhskgvseruigiu</ns4:RequestSetId>
<ns4:RequestStream>CHAPP</ns4:RequestStream>
<ns4:Request>
<ns4:TrackAndTrace>
<ns4:CPAId>003</ns4:CPAId>
<ns4:CorrelationId>ytuty</ns4:CorrelationId>
</ns4:TrackAndTrace>
</ns4:Request>
<ns4:Request>
<ns4:TrackAndTrace>
<ns4:CPAId>003</ns4:CPAId>
<ns4:CorrelationId>cyuri7</ns4:CorrelationId>
</ns4:TrackAndTrace>
</ns4:Request>
</ns4:Requests>
I want to format like
<?xml version="1.0" encoding="UTF-8"?>
<ns4:Requests xmlns:ns2="http://www.dummy.com/xsd/tublu/murmur_001" xmlns:ns4="http://www.dummy.com/xsd/naku_001" xmlns:ns3="http://www.dummy.com/xsd/CommonObjects_001">
<ns4:RequestSetId>fhskgvseruigiu</ns4:RequestSetId>
<ns4:RequestStream>CHAPP</ns4:RequestStream>
<ns4:Request xmlns:ns4="http://www.dummy.com/xsd/naku_001"><ns4:TrackAndTrace><ns4:CPAId>003</ns4:CPAId><ns4:CorrelationId>ytuty</ns4:CorrelationId></ns4:TrackAndTrace></ns4:Request>
<ns4:Request xmlns:ns4="http://www.dummy.com/xsd/naku_001"><ns4:TrackAndTrace><ns4:CPAId>003</ns4:CPAId><ns4:CorrelationId>cyuri7</ns4:CorrelationId></ns4:TrackAndTrace></ns4:Request>
</ns4:Requests>
Here is the solution (by Transforming the XML Data using Java's XSLT APIs),
As you may have noticed, JAXB alone cannot meet this requirement, but after marshalling the object to a formatted XML string (as you have shown) you can post-process it with a suitable XSLT file.
So to get a linearized 'Request' element, just make use of the xsl shown below:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="TrackAndTrace"/>
<xsl:strip-space elements="Request"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Note: Also tested that above method/approach is working properly - used the Stylizer sample code (from https://docs.oracle.com/javase/tutorial/jaxp/xslt/transformingXML.html)
Cheers!
Update: If you want a solution that also preserves the original namespace prefix as shown in your question, follow this variation
Add factory.setNamespaceAware(true); in the Stylizer code
& Use this tweaked XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:ns4="http://www.dummy.com/xsd/naku_001">
<xsl:strip-space elements="ns4:TrackAndTrace"/>
<xsl:strip-space elements="ns4:Request"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
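As a rough sketch of how the namespace-aware variant could be run from plain Java (instead of the Stylizer sample), the following feeds a shortened version of the question's document through the JDK transformer. The class name and the shortened input are made up for illustration, and the output will not repeat the xmlns:ns4 declaration on each Request element, because the declaration on the root is already in scope.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: strips whitespace only inside ns4:Request and
// ns4:TrackAndTrace, leaving the rest of the layout as it was.
final class LinearizeRequests {

    static final String NS4 = "http://www.dummy.com/xsd/naku_001";

    static final String XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\""
      + " xmlns:ns4=\"" + NS4 + "\">"
      + "<xsl:strip-space elements=\"ns4:TrackAndTrace\"/>"
      + "<xsl:strip-space elements=\"ns4:Request\"/>"
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String linearize(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<ns4:Requests xmlns:ns4=\"" + NS4 + "\">\n"
          + "<ns4:RequestStream>CHAPP</ns4:RequestStream>\n"
          + "<ns4:Request>\n<ns4:TrackAndTrace>\n<ns4:CPAId>003</ns4:CPAId>\n"
          + "</ns4:TrackAndTrace>\n</ns4:Request>\n"
          + "</ns4:Requests>";
        System.out.println(linearize(xml));
    }
}
```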

XSLT dont add XMLNS to elements

I have a simple HTML fragment similar to this:
<a href="123">link</a>
I need to transform it to
<abc:href var="123">link</abc:href>
I do it with XSLT, so I had to add the namespace in xsl:stylesheet
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:abc="http://abc.ru">
It works almost fine, unfortunately the XSLT transform keeps on adding a XMLNS to the output, like here:
<abc:href var="123" xmlns:abc="http://abc.ru">link</abc:href>
I don't need the xmlns definition, can I remove it?
Although it really goes against the grain, and I advise strongly against it, if you need to produce this malformed XML, then you can use an instruction like...
<xsl:value-of disable-output-escaping="yes" select="
concat('&lt;abc:href var=&quot;',$href,'&quot;>',$link,'&lt;/abc:href>')
"/>
... where $href and $link are place-markers for the appropriate expression.
Update
In response to the OP's comment, one could use a template like this...
<xsl:template match="a">
<xsl:value-of disable-output-escaping="yes" select="
concat('&lt;abc:href var=&quot;',@href,'&quot;>',.,'&lt;/abc:href>')
"/>
</xsl:template>
This ugly solution should be used only as a last resort. A much better solution would be to use XSLT to produce your WHOLE document, not just an invalid fragment of it. This way your document would be well formed and you could bring to bear the full power and simplicity of XSLT.
It works almost fine, unfortunately the XSLT transform keeps on adding
a XMLNS to the output, like here:
<abc:href var="123" xmlns:abc="http://abc.ru">link</abc:href>
I don't need the xmlns definition, can I remove it?
The wanted removal of the namespace declaration would produce a (namespace-)non-well-formed XML document and for this reason the XSLT processor adds the namespace declaration -- as required by the W3C XSLT specifications.
You can cause these namespace declarations to "disappear" by placing the namespace declaration on a common ancestor (such as the top element of the generated XML document).
Here is a complete example:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="/*">
<top xmlns:abc="http://abc.ru">
<xsl:apply-templates/>
</top>
</xsl:template>
<xsl:template match="a[@href]">
<xsl:element name="abc:href" namespace="http://abc.ru">
<xsl:attribute name="var">
<xsl:apply-templates/>
</xsl:attribute>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the following document:
<html>
link1
link2
link3
link4
</html>
the wanted, correct result is produced:
<top xmlns:abc="http://abc.ru">
<abc:href var="link1"/>
<abc:href var="link2"/>
<abc:href var="link3"/>
<abc:href var="link4"/>
</top>
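To see why the processor insists on the declaration, it may help to check what a namespace-aware parser makes of the fragment without it. The following is a small sketch using the JDK's DOM parser; the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper: reports whether a namespace-aware parser accepts the input.
final class NamespaceCheck {

    static boolean parsesWithNamespaces(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Rethrow errors instead of letting the default handler print them.
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            public void error(SAXParseException e) throws SAXException { throw e; }
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // An unbound prefix such as "abc" is rejected by the parser.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parsesWithNamespaces(
            "<abc:href var=\"123\">link</abc:href>"));
        System.out.println(parsesWithNamespaces(
            "<abc:href var=\"123\" xmlns:abc=\"http://abc.ru\">link</abc:href>"));
    }
}
```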
This is sad, but I really need an invalid xml
XSLT is designed to prevent you producing bad XML. If you want to produce bad XML, don't use XSLT.
Try it with exclude-result-prefixes, like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:abc="http://abc.ru"
exclude-result-prefixes="abc">
<xsl:template match="/">
<xsl:apply-templates select="@* | node()"/>
</xsl:template>
<xsl:template match="a">
<href var="{@href}"><xsl:value-of select="."/></href>
</xsl:template>
</xsl:stylesheet>

Create xmlns attribute in the XML using XSLT Transformation

I am trying to add the xmlns attribute to the resulting XML with a value passed by parameter during XSLT transformation using JDK Transformer (Oracle XML v2 Parser or JAXP) but it always defaults to http://www.w3.org/2000/xmlns/
My source XML
<test/>
My XSLT
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://example.com">
<xsl:param name="myNameSpace" select="'http://neilghosh.com'"/>
<xsl:template match="/">
<process>
<xsl:attribute name="xmlns:neil">
<xsl:value-of select="$myNameSpace"/>
</xsl:attribute>
</process>
</xsl:template>
</xsl:stylesheet>
My Result
<?xml version="1.0"?>
<process xmlns="http://www.w3.org/2000/xmlns/" xmlns:neil="neilghosh.com">
</process>
My Desired Result
<?xml version="1.0"?>
<process xmlns="http://example.com" xmlns:neil="neilghosh.com">
</process>
Firstly, in the XSLT data model, you don't want to create an attribute node, you want to create a namespace node.
Namespace nodes are usually created automatically: if you create an element or attribute in a particular namespace, the requisite namespace node (and hence, when serialized, the namespace declaration) are added automatically by the processor.
If you want to create a namespace node that isn't necessary (because it's not used in the name of any element or attribute) then in XSLT 2.0 you can use xsl:namespace. If you're stuck with XSLT 1.0 then there's a workaround, that involves creating an element in the relevant namespace and then copying its namespace node:
<xsl:variable name="ns">
<xsl:element name="neil:dummy" namespace="{$param}"/>
</xsl:variable>
<process>
<xsl:copy-of select="$ns/*/namespace::neil"/>
</process>
Michael Kay provided you with the correct answer, but based on your comments, you aren't sure how to use it in your transformation.
Here is a complete transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:ext="http://exslt.org/common" exclude-result-prefixes="ext">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:param name="pNamespace" select="'neilghosh.com'"/>
<xsl:variable name="vDummy">
<xsl:element name="neil:x" namespace="{$pNamespace}"/>
</xsl:variable>
<xsl:template match="/*">
<xsl:element name="process" namespace="http://example.com">
<xsl:copy-of select="namespace::*"/>
<xsl:copy-of select="ext:node-set($vDummy)/*/namespace::*[.=$pNamespace]"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
When this transformation is applied on the provided XML document:
<test/>
the wanted, correct result is produced:
<process xmlns="http://example.com" xmlns:neil="neilghosh.com" />
Namespace declarations in XML are not attributes even though they look like attributes. In XSLT 2.0 you can use <xsl:namespace name="neil" select="$myNameSpace" /> to add a namespace declaration to the result tree dynamically but that feature is not available in XSLT 1.0.
Don't try to create "xmlns" attributes yourself. Create the namespaces in the XSLT and they will be done automatically.
This XSLT works (tested with Saxon 9.4):
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:neil="neilghosh.com"
xpath-default-namespace="http://example.com"
xmlns="http://example.com" version="2.0">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:param name="myDynamicNamespace" select="'http://neilghosh.com'"/>
<xsl:template match="/">
<xsl:element name="process">
<xsl:namespace name="neil" select="$myDynamicNamespace"/>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
And gives the following output:
<?xml version="1.0" encoding="UTF-8"?>
<process xmlns="http://example.com" xmlns:neil="http://neilghosh.com"/>
Finally got a workaround which worked with my XSLT processor (Oracle XML V2 Parser).
I had to transform it to a DOM Document and then persist that DOM to filesystem instead of outputting directly to StreamResult
I used DOMResult in the transform method
Following XSLT fragment worked but there was an extra xmlns:xmlns="http://www.w3.org/2000/xmlns/" which was probably absorbed by Document and did not appear in the final output when I persisted to file system.
<process>
<xsl:attribute name="xmlns">
<xsl:value-of select="'http://example.com'"/>
</xsl:attribute>
</process>
I know this is not the best way to do it, but given the parser constraint this is the only choice I have now.
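Since the workaround already goes through a DOM Document, another option might be to add the dynamic namespace declaration on the DOM side, where a declaration is expressed as an attribute in the http://www.w3.org/2000/xmlns/ namespace. This is a sketch of that idea with the JDK's own parser and serializer, not something tested against the Oracle XML V2 Parser; the class and method names are made up, and it builds the process element directly rather than taking it from a DOMResult.

```java
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: creates <process> in the http://example.com namespace
// and declares the "neil" prefix with a URI chosen at runtime.
final class DynamicNamespaceDom {

    static String buildProcess(String neilNamespace) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        Element process = doc.createElementNS("http://example.com", "process");
        // In DOM, a namespace declaration is an attribute in the xmlns namespace.
        process.setAttributeNS(XMLConstants.XMLNS_ATTRIBUTE_NS_URI, "xmlns:neil", neilNamespace);
        doc.appendChild(process);

        // An identity transformer is used here only to serialize the DOM.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildProcess("http://neilghosh.com"));
    }
}
```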
