When I publish my modspecs to PDF (XSL-FO), my tables have a problem: the content of a cell overflows its column into the next one. How do I force a break in the text so that it wraps onto a new line instead?
I can't manually insert zero-width space characters, since the table entries are generated programmatically. I'm looking for a simple solution that I can add to docbook_pdf.xsl (either as an xsl:param or an xsl:attribute).
EDIT:
Here is where I'm at currently:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0" xmlns:fo="http://www.w3.org/1999/XSL/Format">
<xsl:import href="urn:docbkx:stylesheet"/>
...(the beginning of my stylesheet for pdf generation, e.g. header and footer content stuff)
<xsl:template match="text()">
<xsl:call-template name="intersperse-with-zero-spaces">
<xsl:with-param name="str" select="."/>
</xsl:call-template>
</xsl:template>
<xsl:template name="intersperse-with-zero-spaces">
<xsl:param name="str"/>
<!-- characters next to which no zero-width space is inserted -->
<xsl:variable name="spacechars" select="'&#x9;&#xA;&#xD;&#x20;&#x200B;'"/>
<xsl:if test="string-length($str) > 0">
<xsl:variable name="c1" select="substring($str, 1, 1)"/>
<xsl:variable name="c2" select="substring($str, 2, 1)"/>
<xsl:value-of select="$c1"/>
<xsl:if test="$c2 != '' and
not(contains($spacechars, $c1) or
contains($spacechars, $c2))">
<xsl:text>&#x200B;</xsl:text>
</xsl:if>
<xsl:call-template name="intersperse-with-zero-spaces">
<xsl:with-param name="str" select="substring($str, 2)"/>
</xsl:call-template>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
With this, the long words are successfully broken up in the table cells! Unfortunately, the side effect is that normal text elsewhere (like in a paragraph under section X) now has its words broken up so that they appear on separate lines. Is there a way to restrict the above process to just tables?
In the long words, try inserting a zero-width space character between the characters where a break is allowed.
You can use XSLT to insert a zero-width space between every character. Here is one way to do it: http://groups.yahoo.com/neo/groups/XSL-FO/conversations/topics/1177.
Here is a mailing list thread where various approaches to the problem are discussed: http://www.stylusstudio.com/xsllist/200201/post80920.html.
The SourceForge DocBook stylesheets include a template for breaking up long URLs in FO output; see http://www.sagehill.net/docbookxsl/Ulinks.html#BreakLongUrls. The template (hyphenate-url) is in xref.xsl.
Since you're using XSLT 2.0:
<xsl:template match="text()">
<xsl:value-of
select="replace(replace(., '(\P{Zs})(\P{Zs})', '$1&#x200B;$2'),
'([^\p{Zs}&#x200B;])([^\p{Zs}&#x200B;])',
'$1&#x200B;$2')" />
</xsl:template>
This uses category escapes (http://www.w3.org/TR/xmlschema-2/#nt-catEsc) rather than an explicit list of characters to match, but you could use an explicit list instead. Two replace() calls are needed because regex matches cannot overlap, so the inner replace() can only insert the character between every second pair of characters. The outer replace() then matches pairs of characters that are neither space characters nor the character added by the inner replace().
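To confine the zero-width spaces to table cells, as the question's edit asks, the match pattern can be narrowed so that text outside tables falls through to the imported templates. This is only a sketch under assumptions: it presumes non-namespaced DocBook, where CALS cells are entry elements and HTML-style cells are td or th (DocBook 5 input would need the DocBook namespace bound to a prefix), and it presumes an XSLT 2.0 processor.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:import href="urn:docbkx:stylesheet"/>

  <!-- Assumption: table cells are entry (CALS) or td/th (HTML tables).
       Only text nodes inside those cells are rewritten. -->
  <xsl:template match="entry//text() | td//text() | th//text()">
    <!-- Inner replace(): zero-width space inside every second pair.
         Outer replace(): fills in the pairs the inner pass skipped. -->
    <xsl:value-of
        select="replace(replace(., '(\P{Zs})(\P{Zs})', '$1&#x200B;$2'),
                        '([^\p{Zs}&#x200B;])([^\p{Zs}&#x200B;])',
                        '$1&#x200B;$2')"/>
  </xsl:template>
</xsl:stylesheet>
```

Because the pattern is more specific than a bare text() match and sits in the importing stylesheet, it takes precedence inside cells only; a catch-all text() template is then no longer needed.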
Inserting after every thirteenth non-space character:
<xsl:template match="text()">
<xsl:value-of
select="replace(replace(., '(\P{Zs}{13})', '$1&#x200B;'),
'&#x200B;(\p{Zs})',
'$1')" />
</xsl:template>
The inner replace() inserts the character after every 13 non-space characters, and the outer replace() fixes it if the 14th character was a space character.
If you are using AH Formatter, then you can use axf:word-break="break-all" to allow AH Formatter to break anywhere within a word. See https://www.antenna.co.jp/AHF/help/en/ahf-ext.html#axf.word-break.
I am looking for a way to not have some letters in the numbering of chapters generated by xalan in an .xml to .fo transformation.
I am using org.apache.xalan.xsltc.trax.TransformerFactoryImpl to transform an .xml file into an .fo file to later make a PDF out of it. In the xml file I have some numbered chapters like so:
<prcitem2 numbering="9">
They are transformed in the .fo like so:
(This block is inside an fo:list-item-label, inside an fo:list-item, but I am on mobile and can't edit it properly. Sorry)
<fo:block>Й.</fo:block>
The xsl in charge of the transformation is :
<xsl:when test="ancestor-or-self::prcitem2">
<xsl:choose>
<xsl:when test="($language = 'ru')">
<xsl:number count="prcitem2" format="А."/>
</xsl:when>
</xsl:choose>
But my Russian comrades have informed me that some of their letters can't be used in numbering, as ATA and Russian standards do not allow it (e.g. Й, З (that's not a 3) and some others).
Is there a way to forbid the use of these letters?
As I mentioned in comments, I don't see a way to "fix" the built-in xsl:number algorithm and I suggest you replace it with your own.
In the following template, replace the alpha parameter's value with the Cyrillic characters you want to use. Everything else is self-adjusting.
Note that the input numbering is expected to start at zero, so call the template with the decimal parameter set to $your-number - 1.
<xsl:template name="dec-to-alpha">
<xsl:param name="decimal"/>
<xsl:param name="alpha" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
<xsl:variable name="base" select="string-length($alpha)"/>
<xsl:variable name="bit" select="$decimal mod $base"/>
<xsl:variable name="char" select="substring($alpha, $bit + 1, 1)"/>
<xsl:variable name="next" select="floor($decimal div $base)"/>
<xsl:if test="$next">
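<xsl:call-template name="dec-to-alpha">
<xsl:with-param name="decimal" select="$next - 1"/>
<!-- pass the alphabet along so a caller-supplied value survives the recursion -->
<xsl:with-param name="alpha" select="$alpha"/>
</xsl:call-template>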
</xsl:if>
<xsl:value-of select="$char"/>
</xsl:template>
Demo: https://xsltfiddle.liberty-development.net/94rmq74
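As a usage illustration, here is a minimal self-contained sketch of how the template might be called for prcitem2. Everything specific in it is an assumption: the ten-letter alphabet in $ru-alpha is hypothetical (it merely skips З and Й), the context node is taken to be the prcitem2 itself, and the position is derived from preceding siblings. The template copy below passes alpha along in the recursive call so that a caller-supplied alphabet is used for every letter.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical alphabet that skips З and Й; replace it with the
       letters your standard actually permits. -->
  <xsl:variable name="ru-alpha" select="'АБВГДЕЖИКЛ'"/>

  <!-- Suppress stray text so only the labels are printed. -->
  <xsl:template match="text()"/>

  <xsl:template match="prcitem2">
    <xsl:call-template name="dec-to-alpha">
      <!-- Zero-based position among prcitem2 siblings. -->
      <xsl:with-param name="decimal"
          select="count(preceding-sibling::prcitem2)"/>
      <xsl:with-param name="alpha" select="$ru-alpha"/>
    </xsl:call-template>
    <xsl:text>.&#10;</xsl:text>
  </xsl:template>

  <xsl:template name="dec-to-alpha">
    <xsl:param name="decimal"/>
    <xsl:param name="alpha" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
    <xsl:variable name="base" select="string-length($alpha)"/>
    <xsl:variable name="next" select="floor($decimal div $base)"/>
    <xsl:if test="$next">
      <xsl:call-template name="dec-to-alpha">
        <xsl:with-param name="decimal" select="$next - 1"/>
        <!-- Keep using the same alphabet for the leading letters. -->
        <xsl:with-param name="alpha" select="$alpha"/>
      </xsl:call-template>
    </xsl:if>
    <xsl:value-of select="substring($alpha, $decimal mod $base + 1, 1)"/>
  </xsl:template>
</xsl:stylesheet>
```

With this ten-letter alphabet, the first ten sibling items are labelled А. through Л. and the eleventh comes out as АА., since the scheme has no zero digit.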
I am trying to check if a String is contained within a set. I have an Excel sheet that I convert to an xml file; example:
Excel sheet on left and converted sheet on right (RowData.xml):
So I have an xml file where those set of numbers may or may not be there. For example, the source xml may look like this:
Source.xml:
<Data>
<Number>5556781234</Number>
<Number>5556781235</Number>
<Number>5556781236</Number>
</Data>
As you see it can stop anywhere. The source xml file may have all the numbers listed in RowData.xml or it may have only 1 or more. So my question is, how would I check for that in my xslt file?
I want to do this:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- This is the Excel sheet converted to an XML file -->
<xsl:param name="sheet-uri" select="'RowData.xml'"/>
<xsl:param name="sheet-doc" select="document($sheet-uri)"/>
<xsl:template match="Data">
<xsl:for-each select="Data/Number">
<xsl:variable name="continue" select="$sheet-doc//Sheet/Row[Number = current()]/Continue"/>
<xsl:if test="">
<!-- Check the Source.xml against the RowData.xml and
see if the set contains any "No"'s in it. -->
<!-- If it does then don't do the following -->
<Data2>
<Number><xsl:value-of select="Number"/></Number>
<Timestamp>125222</Timestamp>
</Data2>
</xsl:if>
</xsl:for-each>
</xsl:template>
So basically, before making the <Data2> element, check the numbers in Source.xml and see if any of those numbers have a value of No for the column Continue in RowData.xml. I don't know how to make the if statement above. I know there's a contains() function in xslt; however, I don't know how I can use it here.
Is this possible? Please let me know if anything was confusing. Thanks in advance!
"check the numbers in Source.xml and see if any of those numbers have a value of No for the column Continue in RowData.xml."
You can take advantage of XSLT's "existential equal" operator here:
test="doc('Source.xml')/Data/Number =
$sheet-doc//Sheet/Row[Continue='No']/Number"
Essentially, if A and B are sets of values, then A = B returns true if some value in A is equal to some value in B.
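Put into the stylesheet from the question, that comparison could look like the sketch below. It assumes RowData.xml has the Sheet/Row/Number and Sheet/Row/Continue structure implied by the question's XPath, that Source.xml is the main input document, and it adds a hypothetical Results wrapper element so the output has a single root.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- The Excel sheet converted to XML, as in the question. -->
  <xsl:param name="sheet-uri" select="'RowData.xml'"/>
  <xsl:param name="sheet-doc" select="document($sheet-uri)"/>

  <xsl:template match="/Data">
    <!-- Results is a hypothetical wrapper, not part of the question. -->
    <Results>
      <!-- Number = ... is true if ANY Number in the input equals ANY
           Number of a Row marked No; not() inverts that. -->
      <xsl:if test="not(Number = $sheet-doc//Sheet/Row[Continue = 'No']/Number)">
        <xsl:for-each select="Number">
          <Data2>
            <Number><xsl:value-of select="."/></Number>
            <Timestamp>125222</Timestamp>
          </Data2>
        </xsl:for-each>
      </xsl:if>
    </Results>
  </xsl:template>
</xsl:stylesheet>
```

Note the use of not(A = B) rather than A != B: with node-sets, A != B is true as soon as any pair of values differs, which is almost always the case and is not the negation of A = B.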
I would suggest you use the key mechanism - esp. if you're using XSLT 2.0.
Define a key as:
<xsl:key name="row" match="Row" use="Number" />
then do:
<xsl:template match="/Data">
<xsl:for-each select="Number[not(key('row', ., $sheet-doc))]">
<Data2>
<xsl:copy-of select="."/>
<Timestamp>125222</Timestamp>
</Data2>
</xsl:for-each>
</xsl:template>
This selects only Number elements that do not have a corresponding Row in the RowData.xml document.
I would like to mark up some strings in an XML document.
For example, I have:
<p> I like to go to Florida </p>
I need to tag the string "go" and have the output as:
<p> I like to <something>go</something> to Florida</p>
What is the best way to do this? I am using Java. I need to treat the XML file as XML not as text. I found some solutions that treat an xml file as a text file and use string.replace but I do not think those are good solutions.
Any suggestion is much appreciated.
Thank you,
Try an XSLT 2.0 transformation like this:
<xsl:template match="@*|*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:analyze-string select="." regex="go">
<xsl:matching-substring>
<something><xsl:value-of select="."/></something>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
You can of course extend the regular expression, e.g. regex="go|come|walk|run"; if you only want to match whole words, you might want to use tokenize() to split it into words and process each word separately.
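For whole-word matching, one possible variation (a sketch, not the tokenize() approach mentioned above) is to let xsl:analyze-string pick out each run of word characters and compare it against a word list. The $words parameter is a hypothetical name introduced for this example; XPath regular expressions have no \b word-boundary anchor, which is why each word is matched as a whole with \w+ instead.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Hypothetical list of words to tag. -->
  <xsl:param name="words" select="('go', 'come', 'walk', 'run')"/>

  <!-- Copy elements and attributes unchanged. -->
  <xsl:template match="@*|*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="text()">
    <!-- Each maximal run of word characters is one match. -->
    <xsl:analyze-string select="." regex="\w+">
      <xsl:matching-substring>
        <xsl:choose>
          <!-- True if the word equals any item in $words. -->
          <xsl:when test=". = $words">
            <something><xsl:value-of select="."/></something>
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="."/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:matching-substring>
      <!-- Spaces and punctuation pass through untouched. -->
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the example paragraph, this wraps the standalone word "go" but leaves words such as "going" or "ago" alone, because only complete words are compared against the list.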
I am having a certain issue with special characters in my XML.
Basically, I am splitting an XML document into multiple XML files using the Xalan processor.
When splitting the documents, I use the value of the name tag as the name of the generated file. The problem is that the name contains characters that aren't recognized by the XML processor, like ™ (TM) and ® (R). I want to remove those characters ONLY when naming the files.
<xsl:template match="products">
<redirect:write select="concat('..\\xml\\product\\en\\',translate(string(name),'&lt;/&gt; ',''),'.xml')">
The above is the XSL code I have written to split the XML into multiple XML files. As you can see, I am using the translate() function to remove '/', '<' and '>' from the name. I was hoping I could do the same with ™ (TM) and ® (R), but it doesn't seem to work.
Please advise me how I would be able to do that.
Thanks for your help in advance.
I don't have Xalan, but with 8 other XSLT processors this transformation:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="text()">
<xsl:value-of select="translate(., '&lt;/>™®', '')"/>
===================
<xsl:value-of select="translate(., '&lt;/>&#x2122;&#xAE;', '')"/>
</xsl:template>
</xsl:stylesheet>
when applied on this XML document:
<t>XXX™ My Trademark®</t>
produces the wanted result:
XXX My Trademark
===================
XXX My Trademark
I suggest that you try to use one of the two expressions above -- at least the second may work successfully.
Following Dimitre's answer: if you are not sure which special characters could appear in the name, maybe you should instead keep only the characters you consider legal in a document name.
As example:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="text()">
<xsl:value-of select="translate(.,
translate(.,
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ',
''),
'')"/>
</xsl:template>
</xsl:stylesheet>
With input:
<t>XXX™ My > Trademark®</t>
Result:
XXX My Trademark
Say I have a very simple XML with an empty tag 'B':
<Root>
<A>foo</A>
<B></B>
<C>bar</C>
</Root>
I'm currently using XSLT to remove a few tags, like 'C' for example:
<?xml version="1.0" ?>
<xsl:stylesheet version="2.0" xmlns="http://www.w3.org/1999/XSL/Transform" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="no" encoding="utf-8" omit-xml-declaration="yes" />
<xsl:template match="*">
<xsl:copy>
<xsl:copy-of select="@*" />
<xsl:apply-templates />
</xsl:copy>
</xsl:template>
<xsl:template match="C" />
</xsl:stylesheet>
So far OK, but the problem is I end up having an output like this:
<Root>
<A>foo</A>
<B/>
</Root>
when I actually really want:
<Root>
<A>foo</A>
<B></B>
</Root>
Is there a way to prevent 'B' from collapsing?
Thanks.
OK, so here's what worked for me:
<xsl:output method="html"/>
Try this:
<script type="..." src="...">&#160;</script>
Your HTML output will be:
<script type="..." src="..."> </script>
The &#160; prevents the collapsing but translates to a blank space. It's worked for me in the past.
There is no standard way, as they are equivalent. You might be able to find an XSLT engine that has an option for this behaviour, but I'm not aware of any.
If you're passing this to a third party that cannot accept empty tags using this syntax, then you may have to post-process the output yourself (or convince the third party to fix their XML parsing)
It is up to the XSLT engine to decide how the XML tag is rendered, because a parser should see no difference between the two variations. However, when outputting HTML this is a common problem (for <textarea> and <script> tags for example.) The simplest (but ugly) solution is to add a single whitespace inside the tag (this does change the meaning of the tag slightly though.)
This has been a long time issue and I finally made it work with a simple solution.
Add <xsl:text/> if you have a space character. I added a space in my helper class.
<xsl:choose>
<xsl:when test="$textAreaValue=' '">
<xsl:text/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$textAreaValue"/>
</xsl:otherwise>
</xsl:choose>
They are NOT always equivalent. Many browsers can't deal with <script type="..." src="..." /> and want a separate closing tag. I ran into this problem while using xml/xsl with PHP. Output "html" didn't work, I'm still looking for a solution.
No. The two are syntactically identical, so you shouldn't have to worry.
It should not be a problem whether it is <B></B> or <B/>. However, if you are using another tool that expects empty XML tags in the <B></B> form only, then you have a problem. A not very elegant way to handle this is to add a space between the starting and ending 'B' tags through XSLT code.
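A minimal sketch of that space-insertion workaround, assuming it is acceptable for formerly empty elements to contain a single space (which does change the document's content slightly):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no" omit-xml-declaration="yes"/>

  <!-- Identity template: copy everything as-is. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop C, as in the question; the explicit priority keeps this
       rule ahead of the empty-element rule below. -->
  <xsl:template match="C" priority="1"/>

  <!-- Elements with no child nodes get a single space as content. -->
  <xsl:template match="*[not(node())]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:text> </xsl:text>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

With the Root example from the question, B is then serialized as <B> </B> rather than <B/>, since it is no longer empty.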
<xsl:text disable-output-escaping="yes">
<![CDATA[<div></div>]]>
</xsl:text>
This works fine with C#'s XslCompiledTransform class with .NET 2.0, but may very well fail almost anywhere else. Do not use it unless you are programmatically doing the transform yourself; it is not portable at all.
It's 7 years late, but for future readers I will buck the trend here and propose an actual solution to the original question. A solution that does not modify the original with spaces or the output directive.
The idea was to use an empty variable to trick the parser.
If you only want to do it just for one tag B, my first thought was to use something like this to attach a dummy variable.
<xsl:variable name="dummyempty" select="''"/>
<xsl:template match="B">
<xsl:copy>
<xsl:apply-templates select="@*" />
<xsl:value-of select="concat(., $dummyempty)"/>
</xsl:copy>
</xsl:template>
But I found that in fact, even the dummy variable is not necessary. This preserved empty tags, at least when tested with xsltproc on Linux:
<xsl:template match="B">
<xsl:copy>
<xsl:apply-templates select="@*" />
<xsl:value-of select="."/>
</xsl:copy>
</xsl:template>
For a more generic solution to handle ALL empty tags, try this:
<xsl:variable name="dummyempty" select="''"/>
<xsl:template match="*[. = '']">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
<xsl:value-of select="$dummyempty"/>
</xsl:copy>
</xsl:template>
Again, depending on how smart your parser is, you may not even need the dummy variable.