Optimizing Stylesheets

Overview

This document outlines optimizations that can be performed on a stylesheet after it has been compiled, but before it is executed.

Create AVTs out of xsl:attributes

People often write stylesheets like

<LRE>
  <xsl:attribute name="foo">
    text here
    <xsl:value-of select="@bar"/>
  </xsl:attribute>

This could be optimized into an AVT like

<LRE foo="text here{@bar}">

This can be done both in templates and in attribute-sets.
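The rewrite above can be sketched in Python. This is only an illustration under stated assumptions: attribute_to_avt is a hypothetical helper, not part of the engine, and it only handles bodies made of literal text and xsl:value-of. It also assumes the name of the xsl:attribute is itself a constant.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
VALUE_OF = "{" + XSL_NS + "}value-of"


def attribute_to_avt(attr_elem):
    """Return an AVT string equivalent to the body of an xsl:attribute
    element, or None if the body is not just text and xsl:value-of."""

    def literal(text):
        # Whitespace-only text nodes are stripped from stylesheets.
        if text is None or not text.strip():
            return ""
        # Literal curly braces must be doubled inside an AVT.
        return text.replace("{", "{{").replace("}", "}}")

    parts = [literal(attr_elem.text)]
    for child in attr_elem:
        select = child.get("select")
        if child.tag != VALUE_OF or select is None:
            return None  # e.g. xsl:if or xsl:choose: keep the instruction
        parts.append("{" + select + "}")
        parts.append(literal(child.tail))
    return "".join(parts)


example = ET.fromstring(
    '<xsl:attribute xmlns:xsl="' + XSL_NS + '" name="foo">'
    'text here<xsl:value-of select="@bar"/></xsl:attribute>'
)
print(attribute_to_avt(example))  # prints: text here{@bar}
```

Returning None for anything unrecognized keeps the pass conservative: the original xsl:attribute instruction simply stays in place.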

Investigate how RTFs are being used

When only the string-value or double-value of an RTF is being used, it is unnecessary to create an entire RTF. Instead we could set up a string-handler and use the resulting string.

Once we have a node-set() function we could also have a special handler that creates the resulting nodeset directly, rather than whatever we create once we have real RTFs. That handler would be set up when we can determine that an RTF is used only as an argument to the node-set() function.

In cases where an RTF is passed as a parameter in a call to another template, we could do further analysis to see how that parameter is used in that template, either by first analyzing how all template parameters are used in all templates, or by recursively searching the called template when we find that an RTF is used as an argument.

We will still need a catch-all RTF implementation that is used when we can't determine how a variable is used, or when it is used in multiple different ways.
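The handler selection described in this section could look roughly like the following sketch. The Use categories, the handler names and choose_handler are all hypothetical. Treating mixed string and number uses as a single string case is an assumption, based on the number-value of an RTF being derived from its string-value.

```python
from enum import Enum


class Use(Enum):
    STRING = "string"      # e.g. xsl:value-of, string(), concat()
    NUMBER = "number"      # e.g. arithmetic, number()
    NODESET = "node-set"   # passed to node-set()
    OTHER = "other"        # xsl:copy-of, unknown, or not analyzed


def choose_handler(uses):
    """Pick an output handler for a variable body from its observed uses."""
    kinds = set(uses)
    if not kinds or Use.OTHER in kinds:
        return "rtf"       # catch-all implementation
    if kinds <= {Use.STRING, Use.NUMBER}:
        return "string"    # the number is computed from the string
    if kinds == {Use.NODESET}:
        return "nodeset"
    return "rtf"           # used in several incompatible ways
```

A parameter passed on to another template would be reported as Use.OTHER until the called template has been analyzed as well.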

Inline/calculate constant variables

Variables whose value does not depend on the source document can be inlined into the stylesheet. The value of these variables could also be calculated before the transformation is started.

This can be combined with the previous optimization of RTFs so that a stylesheet like

<xsl:variable name="v">_</xsl:variable>
...
<xsl:value-of select="concat(@id, $v, foo)"/>

is executed like

<xsl:value-of select="concat(@id, '_', foo)"/>

This can be done for both global and local variables.
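A minimal sketch of the substitution step, using a toy expression tree. The Literal, VarRef, Path and Call classes are hypothetical stand-ins for the engine's compiled expression objects, and the constants map is assumed to contain only the variables that are in scope and already known to be constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class VarRef:
    name: str


@dataclass(frozen=True)
class Path:
    text: str  # anything that reads the source document


@dataclass(frozen=True)
class Call:
    name: str
    args: tuple


def inline_constants(expr, constants):
    """Replace references to known-constant variables with literals."""
    if isinstance(expr, VarRef) and expr.name in constants:
        return Literal(constants[expr.name])
    if isinstance(expr, Call):
        new_args = tuple(inline_constants(a, constants) for a in expr.args)
        return Call(expr.name, new_args)
    return expr  # literals, paths and non-constant variables are kept


before = Call("concat", (Path("@id"), VarRef("v"), Path("foo")))
after = inline_constants(before, {"v": "_"})
```

Here after is the tree for concat(@id, '_', foo), matching the example above.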

Resolve cross-references

Cross-references to named templates, attribute-sets and template-modes can be resolved before the stylesheet is executed. That way we don't have to search for the template/attribute-set with a certain name, or the group of templates for a certain mode.

Note that for apply-imports we won't always know at compile-time which mode to search for templates in.

Resolve variables by index rather than expanded name

Rather than searching for variables with a certain name at runtime, we could remove the names of variables/parameters entirely and put all variables in an array (possibly a separate array for global variables), and then let variable-references in expressions just point to an index in that array.
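One way to assign those indexes at compile time is sketched below. SlotAllocator is a hypothetical name. The sketch assumes that names are already expanded names (for example a namespace and local-name pair) and that slots may be reused once a scope ends.

```python
class SlotAllocator:
    """Assigns array indexes to variables while a template is compiled."""

    def __init__(self):
        self._scopes = [{}]
        self._next = 0
        self.frame_size = 0  # array length needed at runtime

    def push_scope(self):
        self._scopes.append({})

    def pop_scope(self):
        scope = self._scopes.pop()
        self._next -= len(scope)  # sibling scopes can reuse these slots

    def declare(self, name):
        scope = self._scopes[-1]
        if name in scope:
            raise ValueError("variable declared twice in one scope")
        scope[name] = self._next
        self._next += 1
        self.frame_size = max(self.frame_size, self._next)
        return scope[name]

    def resolve(self, name):
        # Innermost declaration wins, as with name lookup at runtime.
        for scope in reversed(self._scopes):
            if name in scope:
                return scope[name]
        raise KeyError(name)
```

After compilation, a variable-reference only needs to store the integer returned by resolve, and each template invocation allocates an array of frame_size entries.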

Calculate and use constant AVTs

Many instruction-elements use AVTs that are often constant in stylesheets, such as the data-type attribute of xsl:sort. For xsl:sort we could pre-create a finished nodesorter if all AVTs in all xsl:sort-elements are constant.

Another example is xsl:element and xsl:attribute, where we could use LRE-instructions if the AVTs are constant. For xsl:number we could prepare the list of txFormattedCounters.
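Detecting a constant AVT only requires checking that it contains no expression. The helper below is a hypothetical sketch, not the engine's AVT parser. It treats any single curly brace as "not constant", which also covers malformed AVTs.

```python
def constant_avt_value(avt):
    """Return the literal value of an AVT with no expressions, else None."""
    out = []
    i = 0
    while i < len(avt):
        ch = avt[i]
        if ch in "{}":
            if avt[i:i + 2] == ch * 2:  # "{{" or "}}" is an escaped brace
                out.append(ch)
                i += 2
                continue
            return None  # a real expression, or a malformed AVT
        out.append(ch)
        i += 1
    return "".join(out)
```

For instance, constant_avt_value("number") yields a value that could be baked into a pre-created sorter, while constant_avt_value("{$type}") returns None and the AVT has to be evaluated at runtime.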

Reuse parameter-map if exact same parameters are used

If a template calls another template and uses the exact same parameters we can reuse the same parameter-map.

<xsl:template name="hsbc_html_body">
  <xsl:param name="appName"/>
  <xsl:param name="title"/>
  <body>
    <xsl:call-template name="hsbc_form">
      <xsl:with-param name="appName" select="$appName"/>
      <xsl:with-param name="title" select="$title"/>
    </xsl:call-template>
  </body>
</xsl:template>

We have to watch out for default parameter-values, as well as for parameters that are specified by the caller of this template but not used by this template, for example if the parameter "hello" is specified when calling the above template.

These restrictions might make this optimization useless.

Combine consecutive LRE items into a txResultBuffer

When a template contains several consecutive "LRE instructions" such as LRE-elements, LRE-attributes and textnodes, we can replace their instructions with a special instruction that contains a txResultBuffer. This buffer would then be flushed when the instruction is executed. This can of course not be done for LRE-attributes that contain AVTs. Even PI and comment-instructions can be included in this buffer if their contents are strictly literal.

For example, a template like

<xsl:template match="/">
  <html>
    <body bgcolor="green">
      Here goes pagecontent
      <xsl:apply-templates />
    </body>
  </html>
</xsl:template>

would result in the first txResultBuffer containing the following transactions: eStartElementTransaction, eStartElementTransaction, eAttributeTransaction, eCharacterTransaction.

There is probably a lower limit to when it's worth the effort to replace the normal instructions with the buffer-instruction. A single "LRE instruction" is probably better kept as a normal instruction.
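The grouping pass, including the lower limit, might be sketched as follows. Instructions are modelled as plain tuples and the kind names are hypothetical. Treating end-element instructions as bufferable is an assumption, and AVT attributes get their own non-literal kind so that they always break a run.

```python
LITERAL_KINDS = {"start_element", "end_element", "attribute", "text"}


def coalesce(instructions, min_run=2):
    """Merge runs of literal instructions into ("buffer", [...]) entries."""
    result, run = [], []

    def flush():
        if len(run) >= min_run:
            result.append(("buffer", list(run)))
        else:
            result.extend(run)  # too short: keep the normal instructions
        run.clear()

    for instr in instructions:
        if instr[0] in LITERAL_KINDS:
            run.append(instr)
        else:
            flush()
            result.append(instr)
    flush()
    return result


template = [
    ("start_element", "html"),
    ("start_element", "body"),
    ("attribute", "bgcolor", "green"),
    ("text", "Here goes pagecontent"),
    ("apply_templates",),
    ("end_element", "body"),
    ("end_element", "html"),
]
optimized = coalesce(template)
```

For the template above, optimized holds a buffer with the first four items, the apply_templates instruction, and a second buffer with the two end-element items. Raising min_run is how the lower limit would be tuned.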