XSLT Cookbook
XSLT Cookbook, Second Edition By Sal Mangano
December 2005
Pages: 774

Table of Contents

Chapter 1: XPath
Neo, sooner or later you're going to realize just as I did that there's a difference between knowing the path and walking the path.
Morpheus (The Matrix)
XPath is an expression language that is fundamental to XML processing. You can no more master XSLT without mastering XPath than you can master English without learning the alphabet. Several readers of the first edition of XSLT Cookbook took me to task for not covering XPath. This chapter has been added partly to appease them but more so due to the greatly increased power of the latest XPath 2.0 specifications. However, many of these recipes are applicable to XPath 1.0 as well.
In XSLT 1.0, XPath plays three crucial roles. First, it is used within templates for addressing into the document to extract data as it is being transformed. Second, XPath syntax is used as a pattern language in the matching rules for templates. Third, it is used to perform simple math and string manipulations via built-in XPath operators and functions.
XSLT 2.0 retains and strengthens this intimate connection by drawing heavily on the new computational abilities of XPath 2.0. In fact, one can make a reasonable argument that the enhanced capabilities of XSLT 2.0 stem largely from the advances in XPath 2.0. The new XPath 2.0 facilities include sequences, regular expressions, conditional and iterative expressions, and an enhanced XML Schema-compliant type system, as well as a large number of new built-in functions.
Each recipe in this chapter is a collection of mini-recipes for solving certain classes of XPath problems that often arise while using XSLT. We annotate each XPath expression with the XPath 2.0 commenting convention (: comment :) but users of XPath/XSLT 1.0 should be aware that these comments are not legal syntax. When we are showing the result of an XPath evaluation that is empty, we will write (), which happens to be the way one writes a literal empty sequence in XPath 2.0.
Additional content appearing in this section has been removed.
Purchase this book now or read it online at Safari to get the whole thing!
Effectively Using Axes
You need to select nodes in an XML tree in ways that consider complex relationships within the hierarchical structure.
Each of the following solutions is organized around related sets of axes. For each group, a sample XML document is presented with the context node in bold. An explanation of the effect of evaluating the path is provided, along with an indication of the nodes that will be selected with respect to the highlighted context. In some cases, the solution will consider other nodes as the context to illustrate subtleties of the particular path expression.

Child and descendant axes

The child axis is the default axis in XPath. This means one does not need to use the child:: axis specification, but you can if you are feeling pedantic. One can reach deeper into the XML tree using the descendant:: and the descendant-or-self:: axes. The former excludes the context node and the latter includes it.
<Test id="descendants">
   <parent>
      <X id="1"/>
      <X id="2"/>
      <Y id="3">
        <X id="3-1"/>
        <Y id="3-2"/>
        <X id="3-3"/>
      </Y>
      <X id="4"/>
      <Y id="5"/>
      <Z id="6"/>
      <X id="7"/>
      <X id="8"/>
      <Y id="9"/>
    </parent>
</Test>

(: Select all child elements named X :)
X   (: same as child::X :)

Result: <X id="1"/> <X id="2"/> <X id="4"/> <X id="7"/> <X id="8"/>

(:Select the first X child element:)

X[1]    

Result: <X id="1"/>

(:Select the last X child element:)

X[last()]    

Result: <X id="8"/>


(:Select the first child element, provided it is an X. Otherwise empty:)

*[1][self::X]    

Result: <X id="1"/>


(:Select the last child, provided it is an X. Otherwise empty:)

*[last()][self::X]    

Result: ()

*[last()][self::Y]    

Result: <Y id="9"/>

(: Select all descendants named X :)
descendant::X

Result: <X id="1"/> <X id="2"/> <X id="3-1"/> <X id="3-3"/> <X id="4"/> <X id="7"/> <X id="8"/>
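As a rough check, the paths above can be run from a stylesheet. The sketch below assumes the sample document is the input and that the context node is the parent element, which is what the listed results imply. The labels and output layout are illustrative only.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Assumption: parent is the context node for the paths below. -->
    <xsl:for-each select="Test/parent">
      <xsl:text>child X:</xsl:text>
      <xsl:for-each select="X">
        <xsl:value-of select="concat(' ', @id)"/>
      </xsl:for-each>
      <xsl:text>&#xa;descendant X:</xsl:text>
      <xsl:for-each select="descendant::X">
        <xsl:value-of select="concat(' ', @id)"/>
      </xsl:for-each>
      <xsl:text>&#xa;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Against the sample document, this should print "child X: 1 2 4 7 8" followed by "descendant X: 1 2 3-1 3-3 4 7 8". The two extra ids come from the X elements nested inside the Y with id 3.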

(: Select the context node, if it is an X, and all descendants named X :)

Filtering Nodes
You need to select nodes based on the data they contain instead of, or in addition to, their names or positions.
Many of the mini-recipes in Recipe 1.1 used predicates to filter nodes, but those predicates were based strictly on position of the node or node name. Here we consider a variety of predicates that filter based on data content. In these examples, we use a simple child element path X before each predicate, but one could equally substitute any path expression for X, including those in Recipe 1.1.
In the following examples, we use the XPath 2.0 comparison operators (eq, ne, lt, le, gt, and ge) instead of the operators (=, !=, <, <=, >, and >=). This is because when one is comparing atomic values, the new operators are preferred. In XPath 1.0, you only have the latter operators so make the appropriate substitution. The new operators were introduced in XPath 2.0 because they have simpler semantics and will probably be more efficient as a result. The complexity of the old operators comes when one considers cases where a sequence is on either side of the comparison. Recipe 1.8 covers this topic further.
Another point must be made for those working in XPath 2.0, because that version incorporates type information when a schema is available. That could cause some of the expressions below to raise type errors. For example, X[@a = 10] is not the same as X[@a = '10'] when the attribute a has an integer type. Here we assume there is no schema and therefore all atomic values have the type untypedAtomic. You can find more on this topic in Recipes 1.9 and 1.10.
(: Select X child elements that have an attribute named a. :)
X[@a]

(: Select X children that have at least one attribute. :)
X[@*]

(: Select X children that have at least three attributes. :)
X[count(@*) > 2]
Working with Sequences
You want to manipulate collections of arbitrary nodes and atomic values derived from an XML document or documents.

XPath 1.0

There is no notion of a sequence in XPath 1.0, and hence these recipes are largely inapplicable. XPath 1.0 has node sets instead. There is an idiomatic way to construct the empty node set in XPath 1.0.
(: The empty node set :)
/..

XPath 2.0

(: The empty sequence constructor. :)
()

(: Sequence consisting of the single atomic item 1. :)
1

(: Use the comma operator to construct a sequence. Here we build a sequence 
of all X children of the context, followed by Y children, followed by Z children. :)
X, Y, Z

(: Use the to operator to construct ranges. :)
1 to 10

(: Here we combine comma with several ranges. :)
1 to 10, 100 to 110, 17, 19, 23

(: Variables and functions can be used as well. :)
1 to $x

1 to count(para)

(: Sequences do not nest so the following two sequences are the same. :)

((1,2,3), (4,5, (6,7), 8, 9, 10))

1,2,3,4,5,6,7,8,9,10

(: The to operator cannot create a decreasing sequence directly. :)
10 to 1 (: This sequence is empty! :)

(: You can accomplish the intended effect with the following. :)
for $n in 1 to 10 return 11 - $n

(: Remove duplicates from a sequence. :)
distinct-values($seq)

(: Return the size of a sequence. :)
count($seq)

(: Test if a sequence is empty. :)
empty($seq) (: prefer over count($seq) eq 0 :)

(: Locate the positions of an item in a sequence. Index-of produces a sequence 
of integers for every item in the first arg that is eq to the second. :)
index-of($seq, $item) 

(: Extract subsequences. :)

(: Up to 3 items from $seq, starting with the second. :)
subsequence($seq, 2, 3)

(: All items from $seq at position 3 to the end of the sequence. :)
subsequence($seq, 3)
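
To make the results of these functions concrete, here is a minimal XSLT 2.0 sketch that evaluates a few of them. The sample sequence in $seq is hypothetical, and the stylesheet can be run against any XML input because it never reads the source document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical sample data. -->
  <xsl:variable name="seq" select="(10, 20, 30, 20, 40)"/>

  <xsl:template match="/">
    <!-- In XSLT 2.0, xsl:value-of separates sequence items with a space. -->
    <xsl:value-of select="for $n in 1 to 5 return 6 - $n"/> <!-- 5 4 3 2 1 -->
    <xsl:text>&#xa;</xsl:text>
    <xsl:value-of select="count(10 to 1)"/>                 <!-- 0 -->
    <xsl:text>&#xa;</xsl:text>
    <xsl:value-of select="index-of($seq, 20)"/>             <!-- 2 4 -->
    <xsl:text>&#xa;</xsl:text>
    <xsl:value-of select="subsequence($seq, 2, 3)"/>        <!-- 20 30 20 -->
    <xsl:text>&#xa;</xsl:text>
    <xsl:value-of select="subsequence($seq, 3)"/>           <!-- 30 20 40 -->
    <xsl:text>&#xa;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Note that index-of returns two positions because 20 occurs twice, and that positions are counted from 1.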

(: Insert a sequence, $seq2, before the 3rd item in an input sequence, $seq1. :)
Shrinking Conditional Code with If Expressions
Your complex XSLT code is too verbose due to the high overhead of XML when expressing simple if-then-else conditions.

XPath 1.0

There are a few tricks you can play in XPath 1.0 to avoid using XSLT's verbose xsl:choose in simple situations. These tricks rely on the fact that false converts to 0 and true to 1 when used in a mathematical context.
So, for example, min, max, and absolute value can be calculated directly in XPath 1.0. In these examples, assume $x and $y contain integers.
(: min :)
($x <= $y) * $x + ($y < $x) * $y

(: max :)
($x >= $y) * $x + ($y > $x) * $y

(: abs :)
(1 - 2 * ($x < 0)) * $x
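
When these expressions are placed in a stylesheet attribute, the < character must be escaped as &lt;. The following XSLT 1.0 sketch shows one way to do that. The values of $x and $y are hypothetical, and any XML document will do as input.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical sample values, chosen only for illustration. -->
  <xsl:variable name="x" select="-7"/>
  <xsl:variable name="y" select="3"/>

  <xsl:template match="/">
    <!-- Each comparison yields true or false, which * converts to 1 or 0. -->
    <xsl:text>min=</xsl:text>
    <xsl:value-of select="($x &lt;= $y) * $x + ($y &lt; $x) * $y"/>
    <xsl:text> max=</xsl:text>
    <xsl:value-of select="($x >= $y) * $x + ($y > $x) * $y"/>
    <xsl:text> abs=</xsl:text>
    <xsl:value-of select="(1 - 2 * ($x &lt; 0)) * $x"/>
  </xsl:template>
</xsl:stylesheet>
```

With these values the output should be "min=-7 max=3 abs=7". For min, the first comparison is true and the second is false, so the sum reduces to 1 * -7 + 0 * 3.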
               

XPath 2.0

For the simple cases in the XPath 1.0 section (min, max, abs), there are now built-in XPath functions. For other simple conditionals, use the new conditional if expression.
(: Default the value of a missing attribute to 10. :)
if (@x) then @x else 10

(: Default the value of a missing element to 'unauthorized'. :)
if (password) then password else 'unauthorized'

(: Guard against division by zero. :)
if ($d ne 0) then $x div $d else 0

(: A para element's text if it contains at least one non-whitespace character; otherwise, a single space. :)
if (normalize-space(para)) then string(para) else ' '
If you are a veteran XSLT 1.0 programmer, you probably cringe every time you need to add some conditional code to a template. I know I do, and often go through pains to exploit XSLT's pattern-matching constructs to minimize conditional code. This is not because such code is more complicated or inefficient in XSLT but rather because it is so darn verbose. A simple xsl:if is not that bad, but if you need to express if-then-else logic, you are now forced to use the bulkier
Eliminating Recursion with for Expressions
You want to derive an output sequence from an input sequence where each item in the output is an arbitrarily complex function of the input and the sizes of each sequence are not necessarily the same.

XPath 1.0

Not applicable in 1.0. Use a recursive XSLT template.

XPath 2.0

Use XPath 2.0's for expression. Here we show four cases demonstrating how the for expression can map sequences of differing input and output sizes.

Aggregation

(: Sum of squares. :)
sum(for $x in $numbers return $x * $x)
 
(: Average of squares. :)
avg(for $x in $numbers return $x * $x)
               

Mapping

(: Map a sequence of words in all paragraphs to a sequence of word lengths. :)
for $x in //para/tokenize(., ' ')  return string-length($x) 

(: Map a sequence of words in all paragraphs to a sequence of word lengths, for words longer than three letters. :)
for $x in //para/tokenize(., ' ')  return if (string-length($x) gt 3) then string-length($x) else () 

(: Same as above but with a condition on the input sequence. :)
for $x in //para/tokenize(., ' ')[string-length(.) gt 3] return string-length($x)
               

Generating

(: Generate a sequence of squares of the first 100 integers. :)
for $i in 1 to 100 return $i * $i 

(: Generate a sequence of squares in reverse order. :)
for $i in 0 to 10 return  (10 - $i) * (10 - $i)
               

Expanding

(: Map a sequence of paragraphs to a duped sequence of paragraphs. :)
for $x in //para return ($x, $x)

(: Duplicate words. :)
for $x in //para/tokenize(., ' ') return ($x, $x)
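
The differing input and output sizes are easier to see with concrete data. This XSLT 2.0 sketch uses hypothetical values in $numbers and $text in place of nodes from a document, so it can be run against any XML input.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical sample data. -->
  <xsl:variable name="numbers" select="(1, 2, 3)"/>
  <xsl:variable name="text" select="'the quick brown fox'"/>

  <xsl:template match="/">
    <!-- Aggregation: three items in, one item out (1 + 4 + 9). -->
    <xsl:value-of select="sum(for $x in $numbers return $x * $x)"/>
    <xsl:text>&#xa;</xsl:text>
    <!-- Mapping with a filter: four words in, two lengths out. -->
    <xsl:value-of select="for $w in tokenize($text, ' ')[string-length(.) gt 3]
                          return string-length($w)"/>
    <xsl:text>&#xa;</xsl:text>
    <!-- Expanding: three items in, six items out. -->
    <xsl:value-of select="for $x in $numbers return ($x, $x)"/>
    <xsl:text>&#xa;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The three output lines should be "14", "5 5", and "1 1 2 2 3 3". Only "quick" and "brown" are longer than three letters, which is why the second line has two items.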
Taming Complex Logic Using Quantifiers
You need to test a sequence for the existence of a condition in some or all of its items.

XPath 1.0

If the condition is based on equality, then the semantics of the = and != operators in XPath 1.0 and 2.0 will suffice.
(: True if at least one section is referenced. :)
//section/@id = //ref/@idref

(: True if all section elements are referenced by some ref element. :)
count(//section) = count(//section[@id = //ref/@idref])

XPath 2.0

In XPath 2.0, use some and every expressions to do the same.
(: True if at least one section is referenced. :)
some $id in //section/@id satisfies $id = //ref/@idref

(: True if all section elements are referenced by some ref element. :)
every $id in //section/@id satisfies $id = //ref/@idref
However, you can go quite a bit further with less effort in XPath 2.0.
(: There exists a section that references every section except itself. :)
some $s in //section satisfies
    every $id in //section[@id ne $s/@id]/@id satisfies $id = $s/ref/@idref 

(: $sequence2 is a sub-sequence of $sequence1 :)
count($sequence2) <= count($sequence1) and
(every $pos in 1 to count($sequence1),
           $item1 in $sequence1[$pos],
           $item2 in $sequence2[$pos] satisfies $item1 = $item2)
If you remove the count check in the preceding expression, it would assert that at least the first count($sequence1) items in $sequence2 are the same as corresponding items in $sequence1.
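The positional nature of this test can be checked with a small XSLT 2.0 sketch. The function name ckbk:leading-items-match and the namespace URI bound to ckbk are hypothetical, introduced only for this illustration. The quantified expression is parenthesized because the XPath 2.0 grammar does not accept an every expression directly as an operand of and.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ckbk="http://example.com/ckbk">
  <xsl:output method="text"/>

  <!-- Hypothetical wrapper around the expression discussed above. -->
  <xsl:function name="ckbk:leading-items-match">
    <xsl:param name="sequence1"/>
    <xsl:param name="sequence2"/>
    <xsl:sequence select="count($sequence2) le count($sequence1) and
        (every $pos in 1 to count($sequence1),
               $item1 in $sequence1[$pos],
               $item2 in $sequence2[$pos] satisfies $item1 = $item2)"/>
  </xsl:function>

  <xsl:template match="/">
    <xsl:value-of select="ckbk:leading-items-match((1, 2, 3, 4), (1, 2, 3)),
                          ckbk:leading-items-match((1, 2, 3, 4), (1, 3)),
                          ckbk:leading-items-match((1, 2), (1, 2, 3))"/>
  </xsl:template>
</xsl:stylesheet>
```

This should print "true false false". In the first call, position 4 has no counterpart in the shorter sequence, so that binding contributes nothing and the every test still holds. The second call fails at position 2, and the third fails the count check.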
The semantics of =, !=, <, >, <=, >= in XPath 1.0 and 2.0 sometimes surprise the uninitiated when one of the operands is a sequence or XPath 1.0 node set. This is because the operators evaluate to true if there is at least one pair of values from each side of the expression which compare according to the relation. In XPath 1.0, this can sometimes work to your advantage, as we have shown previously, but other times it can leave your head spinning and you longing to be back in the 5th grade where math made sense. For example, one would guess that
Using Set Operations
You want to process sequences as if they were mathematical sets.

XPath 1.0

The union operation (|) over nodes is supported in XPath 1.0, but one needs a bit of trickery to achieve intersection and set difference.
(: union :)
$set1 | $set2

(: intersection :)
$set1[count(. | $set2) = count($set2)]

(: difference :)
$set1[count(. | $set2) != count($set2)]
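
The count trick works because adding a node to $set2 that is already in $set2 leaves the size of the union unchanged. The XSLT 1.0 sketch below illustrates this on a hypothetical input document, shown in the comment; the element names list and item are assumptions made for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input:
       <list><item id="a"/><item id="b"/><item id="c"/><item id="d"/></list> -->
  <xsl:variable name="set1" select="/list/item[position() &lt;= 3]"/> <!-- a b c -->
  <xsl:variable name="set2" select="/list/item[position() >= 2]"/>    <!-- b c d -->

  <xsl:template match="/">
    <!-- Keep the nodes of $set1 whose union with $set2 is no larger than $set2. -->
    <xsl:text>intersection:</xsl:text>
    <xsl:for-each select="$set1[count(. | $set2) = count($set2)]">
      <xsl:value-of select="concat(' ', @id)"/>
    </xsl:for-each>
    <!-- Keep the nodes of $set1 that enlarge the union, i.e. are not in $set2. -->
    <xsl:text>&#xa;difference:</xsl:text>
    <xsl:for-each select="$set1[count(. | $set2) != count($set2)]">
      <xsl:value-of select="concat(' ', @id)"/>
    </xsl:for-each>
    <xsl:text>&#xa;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For the assumed input, the output should be "intersection: b c" followed by "difference: a".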

XPath 2.0

The | operator in XPath 2.0 remains but union is added as an alias. In addition, intersect and except are added for intersection and set difference respectively.
(: union :)
$set1 union $set2

(: intersection :)
$set1 intersect $set2

(: difference :)
$set1 except $set2
In XPath 2.0, node sets are replaced by sequences. Unlike node sets, sequences are ordered and can contain duplicate items. However, when using the XPath 2.0 set operations, duplicates and ordering are ignored so sequences behave just like sets. The result of a set operation will never contain duplicates even if the inputs did.
The except operator is used in an XPath 2.0 idiom for selecting all attributes but a given set.
(: All attributes except @a. :)
@* except @a  

(: All attributes except @a and @b. :)
@* except (@a, @b)
In XPath 1.0, one needs the following, more awkward expression:
@*[local-name(.) != 'a' and local-name(.) != 'b']
Interestingly enough, XPath only allows set operations over sequences of nodes. Atomic values are not allowed. This is because the set operations are over node identity and not value. One can get the effect of sets of values using the following XPath 2.0 expressions. For XPath 1.0, you will need to use XSLT recursion. See Chapter 8.
(: union :)
distinct-values( ($items1, $items2) )

(: intersection :)
distinct-values( $items1[. = $items2] )

(: difference :)
distinct-values( $items1[not(. = $items2)] )
Using Node Comparisons
You want to identify nodes or relate them by their position in a document.

XPath 1.0

In these examples, assume $x and $y each contain a single node from the same document. Also, recall that document order means the sequence in which nodes appear within a document.
(: Test if $x and $y are the same exact node. :)
generate-id($x) = generate-id($y)

(: You can also take advantage of the | operator's removal of duplicates. :)
count($x|$y) = 1

(: Test if $x precedes $y in document order - note that this does not work if $x 
or $y are attributes. :)
count($x/preceding::node()) < count($y/preceding::node()) or 
$x = $y/ancestor::node()

(: Test if $x follows $y in document order - note that this does not work if $x 
or $y are attributes. :)
count($x/following::node()) < count($y/following::node()) or
$y = $x/ancestor::node()

XPath 2.0

(: Test if $x and $y are the same exact node. :)
$x is $y

(: Test if $x precedes $y in document order. :)
$x << $y

(: Test if $x follows $y in document order. :)
$x >> $y
               
The new XPath 2.0 node comparison operators are likely to be more efficient and certainly easier to understand than the XPath 1.0 counterparts. However, if you are using XSLT 2.0, you will not find too many situations where these operators are required. There are many situations where you think you need << or >> when the xsl:for-each-group element is preferred. See Recipe 6.2 for examples.
Coping with XPath 2.0's Extended Type System
XPath 2.0's stricter type rules have you cursing the W3C and longing for Perl.
Most incompatibilities between XPath/XSLT 1.0 and 2.0 come from type errors. This is true regardless of whether a schema is present or not. You can eliminate many problems encountered in porting legacy XSLT 1.0 to XSLT 2.0 with respect to XPath differences by running in 1.0 compatibility mode.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- ... -->

</xsl:stylesheet>
In my opinion, eventually you will want to stop using compatibility mode. XPath 2.0 provides several facilities for dealing with type conversions. First, you can use conversion functions explicitly.
(: Convert the first X child of the context to a number. :)
number(X[1]) + 17

(: Convert a number in $n to a string. :)
concat("id-", string($n))
XPath 2.0 also provides type constructors so you can explicitly control the interpretation of a string.
(: Construct a date from a string. :)
xs:date("2005-06-01") 

(: Construct doubles from strings. :)
xs:double("1.1e8") + xs:double("23000")
Finally, XPath has the operators castable as, cast as, and treat as. Most of the time, you want to use the first two.
if ($x castable as xs:date) then $x cast as xs:date else xs:date("1970-01-01")
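
A minimal XSLT 2.0 sketch of this guard follows. The two input strings are hypothetical, and the fallback date is the one used in the expression above. The xs prefix must be bound to the XML Schema namespace for the type names to resolve.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>

  <!-- Hypothetical inputs: one valid lexical xs:date and one that is not. -->
  <xsl:variable name="inputs" select="('2005-06-01', 'June 1, 2005')"/>

  <xsl:template match="/">
    <!-- castable as tests the conversion first, so cast as cannot fail. -->
    <xsl:value-of separator="&#xa;"
        select="for $x in $inputs return
                  if ($x castable as xs:date)
                  then $x cast as xs:date
                  else xs:date('1970-01-01')"/>
  </xsl:template>
</xsl:stylesheet>
```

The first string converts as is and the second falls back to the default, so the output should be "2005-06-01" on one line and "1970-01-01" on the next.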
The operator, treat as, is not a conversion per se but rather an assertion that tells the XPath processor that you promise at runtime a value will conform to a specified type. If this turns out not to be the case, then a type error will occur. XPath 2.0 added treat as so XPath implementers could perform static (compile time) type checking in addition to dynamic type checking while allowing programmers to selectively disable static type checks. Static type checking XSLT 2.0 implementations will likely be rare so you can ignore treat as for the time being. It is far more likely to arise in higher-end XQuery processors that do static type checking to facilitate various optimizations.
Exploiting XPath 2.0's Extended Type System
You use XML Schema religiously when processing XML and would like to reap its rewards.
If you validate your documents against a schema, the resulting nodes become annotated with type information. You can then test for these types in XPath 2.0 (and while matching templates in XSLT 2.0).
(: Test if all invoiceDate elements have been validated as dates. :)
if (order/invoiceDate instance of element(*, xs:date)) then "invoicing complete" else "invoicing incomplete"
instance of is only useful in the presence of schema validation. In addition, it is not the same as castable as. For instance, 10 castable as xs:positiveInteger is always true but 10 instance of xs:positiveInteger is never true because integer literals are labeled as xs:integer, not as the derived type xs:positiveInteger.
However, the benefit of validation is not simply the ability to test types using instance of but rather from the safety and convenience of knowing that there will be no type error surprises once validation is passed. This can lead to more concise stylesheets.
(: Without validation, you should code like this. :)
for $order in Order return xs:date($order/invoiceDate) - xs:date($order/createDate)

(: If you know all date elements have been validated, you can dispense with 
the xs:date constructor. :)
for $order in Order return $order/invoiceDate - $order/createDate
            
My personal preference is to use XML Schemas as specification documents and not validation tools. Therefore, I tend to write XSLT transformations in ways that are resilient to type errors and use explicit conversions where needed. Stylesheets written in this manner will work in the presence of validation or not.
Once you begin to write stylesheets that depend on validation, you are locked into implementations that perform validation. On the other hand, if your company standards say all XML documents will be schema-validated before processing, then you can simplify your XSLT based on assurances that certain data types will appear in certain situations.
Chapter 2: Strings
I believe everybody in the world should have guns. Citizens should have bazookas and rocket launchers too. I believe that all citizens should have their weapons of choice. However, I also believe that only I should have the ammunition. Because frankly, I wouldn't trust the rest of the goobers with anything more dangerous than [a] string.
—Scott Adams
When it comes to manipulating strings, XSLT 1.0 certainly lacks the heavy artillery of Perl. XSLT is a language optimized for processing XML markup, not strings. However, since XML is simply a structured form of text, string processing is inevitable in all but the most trivial transformation problems. Unfortunately, XSLT 1.0 has only nine standard functions for string processing. Java, on the other hand, has about two dozen, and Perl, the undisputed king of modern text-processing languages, has a couple dozen plus a highly advanced regular-expression engine.
With the emergence of XSLT 2.0 implementations, XSLT developers can dispense with their Perl string envy. XPath 2.0 now provides 20 functions related to string processing. The functions include support for regular expressions. In addition, XSLT 2.0 adds facilities for parsing unstructured text via regular expressions so it can be converted to proper XML.
XSLT 1.0 programmers have two choices when they need to perform advanced string processing. First, they can call out to external functions written in Java or some other language supported by their XSLT processor. This can be extremely convenient if portability is not an issue and fairly heavy-duty string manipulation is needed. Second, they can implement the advanced string-handling functionality directly in XSLT. This chapter shows that quite a bit of common string manipulation can be done within the confines of XSLT 1.0 and also how the same problems are more easily handled in XSLT 2.0.
You can implement advanced string functions in XSLT 1.0 by combining the capabilities of the native string functions and by exploiting the power of recursion, which is an integral part of all advanced uses of XSLT. In fact, recursion is such an important technique in XSLT that it is worthwhile to look through some of these recipes even if you have no intention of implementing your string-processing needs directly in XSLT.
This book also refers to the excellent work of EXSLT.org, a community initiative that helps standardize extensions to the XSLT language. You may want to check out their site at
Testing If a String Ends with Another String
You need to test if a string ends with a particular substring.

XSLT 1.0

substring($value, (string-length($value) - string-length($substr)) + 1) = $substr

XSLT 2.0

ends-with($value, $substr)
XSLT 1.0 contains a native starts-with() function but no ends-with(). This is rectified in 2.0. However, as the previous 1.0 code shows, ends-with() can be implemented easily in terms of substring() and string-length(). The code simply extracts the last string-length($substr) characters from the target string and compares them to the substring.
Programmers accustomed to having the first position in a string start at index 0 should note that XSLT strings start at index 1.
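The index arithmetic can be traced with a small XSLT 1.0 sketch. The values of $value and $substr are hypothetical, and any XML document will do as input.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical sample values. -->
  <xsl:variable name="value" select="'report.xml'"/>
  <xsl:variable name="substr" select="'.xml'"/>

  <xsl:template match="/">
    <!-- string-length($value) is 10 and string-length($substr) is 4,
         so the extracted tail starts at position (10 - 4) + 1 = 7. -->
    <xsl:value-of select="substring($value,
        (string-length($value) - string-length($substr)) + 1) = $substr"/>
  </xsl:template>
</xsl:stylesheet>
```

Position 7 of 'report.xml' is the period, so the extracted tail is '.xml' and the stylesheet should output "true". Changing $substr to '.xsl' should give "false".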
Finding the Position of a Substring
You want to find the index of a substring within a string rather than the text before or after the substring.

XSLT 1.0

<xsl:template name="string-index-of">
  <xsl:param name="input"/>
  <xsl:param name="substr"/>
  <xsl:choose>
    <xsl:when test="contains($input, $substr)">
      <xsl:value-of select="string-length(substring-before($input, $substr))+1"/>
    </xsl:when>
    <xsl:otherwise>0</xsl:otherwise>
  </xsl:choose>
</xsl:template>

XSLT 2.0

<xsl:function name="ckbk:string-index-of">
  <xsl:param name="input"/>
  <xsl:param name="substr"/>
  <xsl:sequence select="if (contains($input, $substr)) 
                        then string-length(substring-before($input, $substr))+1 
                        else 0"/>
</xsl:function>
The position of a substring within another string is simply the length of the string preceding it plus 1. If you are certain that the target string contains the substring, then you can simply use string-length(substring-before($input, $substr))+1. However, in general, you need a way to handle the case in which the substring is not present. Here, zero is chosen as an indication of this case, but you can use another value such as -1 or NaN.
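The XSLT 2.0 function is listed without its enclosing stylesheet, so the following sketch shows one way it might be declared and called. The namespace URI bound to the ckbk prefix is a placeholder assumed for this example, as are the sample strings.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ckbk="http://example.com/ckbk">
  <xsl:output method="text"/>

  <!-- Same logic as the XSLT 2.0 solution; the prefix must be declared. -->
  <xsl:function name="ckbk:string-index-of">
    <xsl:param name="input"/>
    <xsl:param name="substr"/>
    <xsl:sequence select="if (contains($input, $substr))
                          then string-length(substring-before($input, $substr)) + 1
                          else 0"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- 'hello ' has six characters, so 'world' starts at position 7. -->
    <xsl:value-of select="ckbk:string-index-of('hello world', 'world'),
                          ckbk:string-index-of('hello world', 'xyz')"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against any XML input, this should print "7 0": the first call finds the substring and the second returns the not-found value.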
Removing Specific Characters from a String
You want to strip certain characters (e.g., whitespace) from a string.

XSLT 1.0

Use translate with an empty replace string. For example, the following code can strip whitespace from a string:
translate($input," &#x9;&#xa;&#xd;", "")

XSLT 2.0

Using translate() is still a good idea in XSLT 2.0 because it will usually perform best. However, some string removal tasks are much more naturally implemented using regular expressions and the new replace() function:
(: \s matches all whitespace characters :)
replace($input,"\s","")
translate() is a versatile string function that is often used to compensate for missing string-processing capabilities in XSLT 1.0. Here you use the fact that translate() will not copy characters in the input string that are in the from string but do not have a corresponding character in the to string.
You can also use translate to remove all but a specific set of characters from a string. For example, the following code removes all non-numeric characters from a string:
translate($string, 
          translate($string,'0123456789',''),'')
The inner translate() removes all characters of interest (e.g., numbers) to obtain a from string for the outer translate(), which removes these non-numeric characters from the original string.
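To make the two steps concrete, the following sketch splits the nested call into two variables. The sample phone number and the variable name unwanted are illustrative assumptions.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Sample value, chosen only for illustration -->
    <xsl:variable name="string" select="'(555) 123-4567'"/>

    <!-- Inner call: delete the digits, which leaves only the characters
         we want to get rid of (parentheses, space, and hyphen) -->
    <xsl:variable name="unwanted"
                  select="translate($string, '0123456789', '')"/>

    <!-- Outer call: delete those leftover characters from the original -->
    <xsl:value-of select="translate($string, $unwanted, '')"/>
  </xsl:template>

</xsl:stylesheet>
```

For this input, the stylesheet should output 5551234567.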
Sometimes you do not want to remove all occurrences of whitespace, but instead want to remove leading, trailing, and redundant internal whitespace. XPath has a built-in function, normalize-space(), which does just that. If you ever need to normalize based on characters other than spaces, then you might use the following code (where C is the character you want to normalize):
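One way to approach this, offered as a sketch rather than as the definitive listing, is to swap C with the space character, let normalize-space() do the work, and then swap back. The example assumes C is an underscore and that the input contains no tabs or newlines, since normalize-space() would collapse those as well.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Sample value; the underscore plays the role of C -->
    <xsl:variable name="input" select="'__hello___big world__'"/>

    <!-- Step 1: swap underscores and spaces.
         Step 2: normalize-space() trims and collapses what are now spaces.
         Step 3: swap back, restoring the original spaces. -->
    <xsl:value-of
        select="translate(
                  normalize-space(translate($input, '_ ', ' _')),
                  '_ ', ' _')"/>
  </xsl:template>

</xsl:stylesheet>
```

For this input, the result should be hello_big world: the leading and trailing underscores are gone, the run of three is collapsed to one, and the genuine space survives the round trip.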
Finding Substrings from the End of a String
XSLT does not have any functions for searching strings in reverse.

XSLT 1.0

Using recursion, you can emulate a reverse search with a search for the last occurrence of substr. Using this technique, you can create a substring-before-last and a substring-after-last:
<xsl:template name="substring-before-last">
  <xsl:param name="input" />
  <xsl:param name="substr" />
  <xsl:if test="$substr and contains($input, $substr)">
    <xsl:variable name="temp" select="substring-after($input, $substr)" />
    <xsl:value-of select="substring-before($input, $substr)" />
    <xsl:if test="contains($temp, $substr)">
      <xsl:value-of select="$substr" />
      <xsl:call-template name="substring-before-last">
        <xsl:with-param name="input" select="$temp" />
        <xsl:with-param name="substr" select="$substr" />
      </xsl:call-template>
    </xsl:if>
  </xsl:if>
</xsl:template>
   
<xsl:template name="substring-after-last">
<xsl:param name="input"/>
<xsl:param name="substr"/>
   
<!-- Extract the string which comes after the first occurrence -->
<xsl:variable name="temp" select="substring-after($input,$substr)"/>
   
<xsl:choose>
     <!-- If it still contains the search string then recursively process -->
     <xsl:when test="$substr and contains($temp,$substr)">
          <xsl:call-template name="substring-after-last">
               <xsl:with-param name="input" select="$temp"/>
               <xsl:with-param name="substr" select="$substr"/>
          </xsl:call-template>
     </xsl:when>
     <xsl:otherwise>
          <xsl:value-of select="$temp"/>
     </xsl:otherwise>
</xsl:choose>
</xsl:template>

XSLT 2.0

XSLT 2.0 does not add reverse versions of substring-before/after, but one can get the desired effect using the versatile tokenize() function, which splits a string on a regular expression:
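A minimal sketch of the tokenize() approach might look like the following. It is one possible formulation, not necessarily the one the text refers to. The function names, the ckbk namespace URI, and the sample path are assumptions. The delimiter is assumed to be non-empty, and because tokenize() treats it as a regular expression, metacharacters in it would need escaping.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ckbk="http://www.oreilly.com/catalog/xsltckbk"
    exclude-result-prefixes="xs ckbk">

  <xsl:output method="text"/>

  <!-- Rejoin every token except the last; yields '' if $delim is absent -->
  <xsl:function name="ckbk:substring-before-last" as="xs:string">
    <xsl:param name="input" as="xs:string"/>
    <xsl:param name="delim" as="xs:string"/>
    <xsl:sequence
        select="string-join(tokenize($input, $delim)[position() lt last()],
                            $delim)"/>
  </xsl:function>

  <!-- Take the last token, but only if a split actually happened -->
  <xsl:function name="ckbk:substring-after-last" as="xs:string">
    <xsl:param name="input" as="xs:string"/>
    <xsl:param name="delim" as="xs:string"/>
    <xsl:variable name="tokens" select="tokenize($input, $delim)"/>
    <xsl:sequence
        select="if (count($tokens) gt 1) then $tokens[last()] else ''"/>
  </xsl:function>

  <xsl:template match="/">
    <xsl:value-of
        select="ckbk:substring-before-last('/usr/local/bin/saxon', '/')"/>
    <xsl:text>&#xa;</xsl:text>
    <xsl:value-of
        select="ckbk:substring-after-last('/usr/local/bin/saxon', '/')"/>
  </xsl:template>

</xsl:stylesheet>
```

With the sample path, this should print /usr/local/bin followed by saxon, which matches what the XSLT 1.0 templates produce for the same arguments.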
Duplicating a String N Times
You need to duplicate a string N times, where N is a parameter. For example, you might need to pad out a string with spaces to achieve alignment.

XSLT 1.0

A nice solution is a recursive approach that doubles the input string until it is the required length while being careful to handle cases in which $count is odd:
<xsl:template name="dup">
     <xsl:param name="input"/>
     <xsl:param name="count" select="2"/>
     <xsl:choose>
          <xsl:when test="not($count) or not($input)"/>
          <xsl:when test="$count = 1">
               <xsl:value-of select="$input"/>
          </xsl:when>
          <xsl:otherwise>
               <!-- If $count is odd append an extra copy of input -->
               <xsl:if test="$count mod 2">
                    <xsl:value-of select="$input"/>
               </xsl:if>
               <!-- Recursively apply template after doubling input and 
               halving count -->
               <xsl:call-template name="dup">
                    <xsl:with-param name="input" 
                         select="concat($input,$input)"/>
                    <xsl:with-param name="count" 
                         select="floor($count div 2)"/>
               </xsl:call-template>     
          </xsl:otherwise>
     </xsl:choose>
</xsl:template>

XSLT 2.0

In 2.0, we can duplicate quite easily with a for expression. We overload dup to replicate the behavior of the defaulted argument in the XSLT 1.0 implementation:
<xsl:function name="ckbk:dup">
  <xsl:param name="input" as="xs:string"/>
  <xsl:sequence select="ckbk:dup($input,2)"/>
</xsl:function>

<xsl:function name="ckbk:dup">
  <xsl:param name="input" as="xs:string"/>
  <xsl:param name="count" as="xs:integer"/>
  <xsl:sequence select="string-join(for $i in 1 to $count return $input,'')"/>
</xsl:function>
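The padding use case mentioned in the problem statement can be sketched on top of the two-argument function. The helper ckbk:pad-right is hypothetical (it is not part of the recipe), as are the ckbk namespace URI and the sample values.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ckbk="http://www.oreilly.com/catalog/xsltckbk"
    exclude-result-prefixes="xs ckbk">

  <xsl:output method="text"/>

  <!-- Two-argument dup from the solution -->
  <xsl:function name="ckbk:dup" as="xs:string">
    <xsl:param name="input" as="xs:string"/>
    <xsl:param name="count" as="xs:integer"/>
    <xsl:sequence
        select="string-join(for $i in 1 to $count return $input, '')"/>
  </xsl:function>

  <!-- Hypothetical helper: append spaces until $text is $width wide.
       If $text is already wide enough, the range 1 to $count is empty
       and nothing is appended. -->
  <xsl:function name="ckbk:pad-right" as="xs:string">
    <xsl:param name="text" as="xs:string"/>
    <xsl:param name="width" as="xs:integer"/>
    <xsl:sequence
        select="concat($text,
                       ckbk:dup(' ', $width - string-length($text)))"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Brackets make the padding visible in the output -->
    <xsl:value-of select="concat('[', ckbk:pad-right('id', 6), ']')"/>
  </xsl:template>

</xsl:stylesheet>
```

This should print the string id followed by four spaces, enclosed in square brackets.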
Reversing a String
You need to reverse the characters of a string.

XSLT 1.0

This template reverses $input in a subtle yet effective way:
<xsl:template name="reverse">
     <xsl:param name="input"/>
     <xsl:variable name="len" select="string-length($input)"/>
     <xsl:choose>
          <!-- Strings of length less than 2 are trivial to reverse -->
          <xsl:when test="$len &lt; 2">
               <xsl:value-of select="$input"/>
          </xsl:when>
          <!-- Strings of length 2 are also trivial to reverse -->
          <xsl:when test="$len = 2">
               <xsl:value-of select="substring($input,2,1)"/>
               <xsl:value-of select="substring($input,1,1)"/>
          </xsl:when>
          <xsl:otherwise>
               <!-- Swap the recursive application of this template to 
               the first half and second half of input -->
               <xsl:variable name="mid" select="floor($len div 2)"/>
               <xsl:call-template name="reverse">
                    <xsl:with-param name="input"
                         select="substring($input,$mid+1,$mid+1)"/>
               </xsl:call-template>
               <xsl:call-template name="reverse">
                    <xsl:with-param name="input"
                         select="substring($input,1,$mid)"/>
               </xsl:call-template>
          </xsl:otherwise>
     </xsl:choose>
</xsl:template>

XSLT 2.0

Reversing is trivial in 2.0.
<xsl:function name="ckbk:reverse">
  <xsl:param name="input" as="xs:string"/>
  <xsl:sequence select="codepoints-to-string(
                         reverse(string-to-codepoints($input)))"/>
</xsl:function>
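To see why this works, it can help to print the intermediate sequences. The following sketch does so for a three-character string. The ckbk namespace URI and the sample string are assumptions.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ckbk="http://www.oreilly.com/catalog/xsltckbk"
    exclude-result-prefixes="xs ckbk">

  <xsl:output method="text"/>

  <xsl:function name="ckbk:reverse" as="xs:string">
    <xsl:param name="input" as="xs:string"/>
    <xsl:sequence
        select="codepoints-to-string(reverse(string-to-codepoints($input)))"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Step 1: the string as a sequence of Unicode code points -->
    <xsl:value-of select="string-to-codepoints('abc')" separator=" "/>
    <xsl:text>&#xa;</xsl:text>
    <!-- Step 2: the same sequence in reverse order -->
    <xsl:value-of select="reverse(string-to-codepoints('abc'))"
                  separator=" "/>
    <xsl:text>&#xa;</xsl:text>
    <!-- Step 3: the reversed sequence turned back into a string -->
    <xsl:value-of select="ckbk:reverse('abc')"/>
  </xsl:template>

</xsl:stylesheet>
```

The three output lines should be 97 98 99, then 99 98 97, then cba. Because the reversal operates on individual code points, a string containing combining characters would have those marks separated from their base characters.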

XSLT 1.0

The algorithm shown in the solution is not the most obvious, but it is efficient. In fact, this algorithm successfully reverses even very large strings, whereas other more obvious algorithms either take too long or fail with a stack overflow. The basic idea behind this algorithm is to swap the first half of the string with the second half and to keep applying the algorithm to these halves recursively until you are left with strings of length two or less, at which point the reverse operation is trivial. The following example illustrates how this algorithm works. At each step, I placed a + where the string was split and concatenated.
Replacing Text
You want to replace all occurrences of a substring within a target string with another string.

XSLT 1.0

The following recursive template replaces all occurrences of a search string with a replacement string:
<xsl:template name="search-and-replace">
     <xsl:param name="input"/>
     <xsl:param name="search-string"/>
     <xsl:param name="replace-string"/>
     <xsl:choose>
          <!-- See if the input contains the search string -->
          <xsl:when test="$search-string and 
                           contains($input,$search-string)">
          <!-- If so, then concatenate the substring before the search
          string to the replacement string and to the result of
          recursively applying this template to the remaining substring.
          -->
               <xsl:value-of 
                    select="substring-before($input,$search-string)"/>
               <xsl:value-of select="$replace-string"/>
               <xsl:call-template name="search-and-replace">
                    <xsl:with-param name="input"
                    select="substring-after($input,$search-string)"/>
                    <xsl:with-param name="search-string" 
                    select="$search-string"/>
                    <xsl:with-param name="replace-string" 
                        select="$replace-string"/>
               </xsl:call-template>
          </xsl:when>
          <xsl:otherwise>
               <!-- There are no more occurrences of the search string so 
               just return the current input string -->
               <xsl:value-of select="$input"/>
          </xsl:otherwise>
     </xsl:choose>
</xsl:template>
If you want to replace only whole words, then you must ensure that the characters immediately before and after the search string are in the class of characters considered word delimiters. We chose the characters in the variable
Converting Case
You want to convert an uppercase string to lowercase or vice versa.

XSLT 1.0

Use the XPath translate() function. This code, for example, converts from upper- to lowercase:
translate($input,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')
This example converts from lower- to uppercase:
translate($input, 'abcdefghijklmnopqrstuvwxyz','ABCDEFGHIJKLMNOPQRSTUVWXYZ')

XSLT 2.0

Use the XPath 2.0 functions upper-case() and lower-case():
upper-case($input)
lower-case($input)
This recipe is, of course, trivial. However, I include it as an opportunity to discuss the XSLT 1.0 solution's shortcomings. Case conversion is trivial as long as your text is restricted to a single locale. In English, you rarely, if ever, need to deal with special characters containing accents or other complicated case conversions in which a single character must convert to two characters. The most common example is German, in which the lowercase ß (eszett) is converted to an uppercase SS. Many modern programming languages provide case-conversion functions that are sensitive to locale, but XSLT does not support this concept directly. This is unfortunate, considering that XSLT has other features supporting internationalization.
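The eszett case can be sketched by running the same word through both approaches. The sample word is an assumption, and the exact result of upper-case() depends on the case mappings your processor applies.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="/">
    <!-- The character reference xDF is the lowercase eszett -->
    <xsl:variable name="word" select="'stra&#xDF;e'"/>

    <!-- translate() maps one character to one character; the eszett is
         not in the from string, so it is copied through unchanged -->
    <xsl:value-of
        select="translate($word,
                          'abcdefghijklmnopqrstuvwxyz',
                          'ABCDEFGHIJKLMNOPQRSTUVWXYZ')"/>
    <xsl:text>&#xa;</xsl:text>

    <!-- upper-case() uses Unicode case mappings, which may change the
         length of the string -->
    <xsl:value-of select="upper-case($word)"/>
  </xsl:template>

</xsl:stylesheet>
```

The first line of output keeps the lowercase ß in the middle of an otherwise uppercase word. A processor that applies the full Unicode mappings would typically print STRASSE on the second line.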
A slight improvement can be made by defining general XML entities for each type of conversion, as shown in the following example:
<?xml version="1.0" encoding="UTF-8"?>   
<!DOCTYPE stylesheet [
     <!ENTITY UPPERCASE "ABCDEFGHIJKLMNOPQRSTUVWXYZ">
     <!ENTITY LOWERCASE "abcdefghijklmnopqrstuvwxyz">
     <!ENTITY UPPER_TO_LOWER " '&UPPERCASE;' , '&LOWERCASE;' ">
     <!ENTITY LOWER_TO_UPPER " '&LOWERCASE;' , '&UPPERCASE;' ">
]>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
     <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
   
     <xsl:template match="/">
     <xsl:variable name="test"
          select=" 'The rain in Spain falls mainly on the plain' "/>
     <output>
          <lowercase>
               <xsl:value-of
                    select="translate($test,&UPPER_TO_LOWER;)"/>
          </lowercase>
          <uppercase>
               <xsl:value-of
                    select="translate($test,&LOWER_TO_UPPER;)"/>
          </uppercase>
     </output>
     </xsl:template>
   
</xsl:stylesheet>
Tokenizing a String
You want to break a string into a list of tokens based on the occurrence of one or more delimiter characters.

XSLT 1.0

Jeni Tennison implemented this solution (but the comments are my doing). The tokenizer returns each token as a token element whose text content is the token. It also defaults to character-level tokenization if the delimiter string is empty:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template name="tokenize">
  <xsl:param name="string" select="''" />
  <xsl:param name="delimiters" select="' &#x9;&#xA;'" />
  <xsl:choose>
     <!-- Nothing to do if empty string -->
    <xsl:when test="not($string)" />
   
     <!-- No delimiters signals character level tokenization. -->
    <xsl:when test="not($delimiters)">
      <xsl:call-template name="_tokenize-characters">
        <xsl:with-param name="string" select="$string" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="_tokenize-delimiters">
        <xsl:with-param name="string" select="$string" />
        <xsl:with-param name="delimiters" select="$delimiters" />
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
   
<xsl:template name="_tokenize-characters">
  <xsl:param name="string" />
  <xsl:if test="$string">
    <token><xsl:value-of select="substring($string, 1, 1)" /></token>
    <xsl:call-template name="_tokenize-characters">
      <xsl:with-param name="string" select="substring($string, 2)" />
    </xsl:call-template>
  </xsl:if>
</xsl:template>
   
<xsl:template name="_tokenize-delimiters">
  <xsl:param name="string" />
  <xsl:param name="delimiters" />
  <xsl:param name="last-delimit"/> 
  <!-- Extract a delimiter -->
  <xsl:variable name="delimiter" select="substring($delimiters, 1, 1)" />
  <xsl:choose>
     <!-- If the delimiter is empty we have a token -->
    <xsl:when test="not($delimiter)">
      <token><xsl:value-of select="$string"/></token>
    </xsl:when>
     <!-- If the string contains at least one delimiter we must split it -->
    <xsl:when test="contains($string, $delimiter)">
      <!-- If it starts with the delimiter we don't need to handle the -->
       <!-- before part -->
      <xsl:if test="not(starts-with($string, $delimiter))">
         <!-- Handle the part that comes before the current delimiter -->
         <!-- with the next delimiter. If there is no next the first test -->
         <!-- in this template will detect the token -->
        <xsl:call-template name="_tokenize-delimiters">
          <xsl:with-param name="string" 
                          select="substring-before($string, $delimiter)" />
          <xsl:with-param name="delimiters" 
                          select="substring($delimiters, 2)" />
        </xsl:call-template>
      </xsl:if>
       <!-- Handle the part that comes after the delimiter using the -->
       <!-- current delimiter -->
      <xsl:call-template name="_tokenize-delimiters">
        <xsl:with-param name="string" 
                        select="substring-after($string, $delimiter)" />
        <xsl:with-param name="delimiters" select="$delimiters" />
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
       <!-- No occurrences of current delimiter so move on to next -->
      <xsl:call-template name="_tokenize-delimiters">
        <xsl:with-param name="string" 
                        select="$string" />
        <xsl:with-param name="delimiters" 
                        select="substring($delimiters, 2)" />
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>
   
</xsl:stylesheet>
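For readers on an XSLT 2.0 processor, a comparable result can be sketched with the built-in tokenize() function. This is an alternative technique, not a translation of the templates above. The sample string and the tokens wrapper element are assumptions.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <tokens>
      <!-- The regex treats any run of space, tab, or newline as a single
           delimiter; the predicate drops the empty tokens produced by
           leading or trailing delimiters -->
      <xsl:for-each
          select="tokenize('  one two&#x9;three ', '[ \t\n]+')[. ne '']">
        <token><xsl:value-of select="."/></token>
      </xsl:for-each>
    </tokens>
  </xsl:template>

</xsl:stylesheet>
```

This should produce three token elements containing one, two, and three. Unlike the XSLT 1.0 version, the delimiters here are given as a regular expression rather than as a plain list of characters.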
Making Do Without Regular Expressions
You would like to perform regular-expression-like operations in XSLT 1.0, but you don't want to resort to nonstandard extensions.
Several common regular-expression-like matches can be emulated in native XPath 1.0. Table 2-1 lists these matches in Perl syntax along with their XSLT/XPath equivalents.