mirror of
https://github.com/rn10950/RetroZilla.git
synced 2024-11-10 18:00:15 +01:00
524 lines
24 KiB
HTML
524 lines
24 KiB
HTML
<html>
|
|
<head>
|
|
<title>Optimizing XPath</title>
|
|
<style>
|
|
.comment {
|
|
font-style: italic;
|
|
}
|
|
h4 {
|
|
margin: 0;
|
|
padding: 0;
|
|
}
|
|
</style>
|
|
</head>
|
|
<body>
|
|
|
|
<h1>Optimizing XPath</h1>
|
|
|
|
<h2>Overview</h2>
|
|
<p>
|
|
This document outlines optimizations that we can perform to execute
|
|
xpath-expressions faster.
|
|
</p>
|
|
|
|
<h2>Stage 1, <a href="http://bugzilla.mozilla.org/show_bug.cgi?id=94471">DONE</a></h2>
|
|
|
|
<h3>Summary</h3>
|
|
<p>
|
|
Speed up retrieval of orderInfo objects by storing them in resp.
|
|
node instead of in a hash.
|
|
</p>
|
|
|
|
<h3>Details</h3>
|
|
<p>
|
|
We currently spend a GREAT deal of time looking through a
|
|
DOMHelper::orders hash looking for the orderInfo object for a
|
|
specific node. If we moved the ownership and retrieval of these
|
|
orderInfo objects to the Node class instead we will probably save
|
|
a lot of time. I.E. instead of calling
|
|
<code>myDOMHelper->getDocumentOrder(node)</code> you call
|
|
<code>node->getDocumentOrder()</code> which then returns the
|
|
orderInfo object.
|
|
</p>
|
|
|
|
<p>
|
|
It would also be nice if we at the same time fixed some bugs wrt the
|
|
orderInfo objects and the function that sorts nodes using them.
|
|
</p>
|
|
|
|
<p>
|
|
Bugs filed at this are 88964 and 94471
|
|
</p>
|
|
|
|
|
|
|
|
<h2>Stage 2, <a href="http://bugzilla.mozilla.org/show_bug.cgi?id=85893">DONE</a></h2>
|
|
|
|
|
|
<h3>Summary</h3>
|
|
<p>
|
|
Speed up document-order sorting by having the XPath engine always
|
|
return document-ordered nodesets.
|
|
</p>
|
|
|
|
<h3>Details</h3>
|
|
<p>
|
|
Currently the nodesets returned from the XPath engine are totally
|
|
unordered (or rather, have undefined order) which forces the XSLT
|
|
code to sort the nodesets. This is quite expensive since it requires
|
|
us to generate orderInfo objects for every node. Considering that
|
|
many XPath classes actually returns nodesets that are already
|
|
ordered in document order (or reversed document order) this seems a
|
|
bit unnecessary.
|
|
</p>
|
|
|
|
<p>
|
|
However we still need to handle the classes that don't by default
|
|
return document-ordered nodesets. A good example of this is the id()
|
|
function. For example "id('foo bar')" produces two nodes which the
|
|
id-function has no idea how they relate in terms of document order.
|
|
Another example is "foo | bar", where the UnionExpr object gets two
|
|
nodesets (ordered in document order since all XPath classes should
|
|
now return ordered nodesets) and need to merge them into a single
|
|
ordered nodeset.
|
|
</p>
|
|
|
|
<h2>Stage 3, <a href="http://bugzilla.mozilla.org/show_bug.cgi?id=205703">DONE</a></h2>
|
|
|
|
<h3>Summary</h3>
|
|
<p>
|
|
Refcount <code>ExprResult</code>s to reduce the number of objects
|
|
created during evaluation.
|
|
</p>
|
|
|
|
<h3>Details</h3>
|
|
<p>
|
|
Right now every subexpression creates a new object during evaluation.
|
|
If we refcounted objects we would be often be able to reuse the same
|
|
objects across multiple evaluations. We should also keep global
|
|
result-objects for true and false, that way expressions that return
|
|
bool-values would never have to create any objects.
|
|
</p>
|
|
|
|
<p>
|
|
This does however require that the returned objects arn't modified
|
|
since they might be used elsewhere. This is not a big problem in the
|
|
current code where we pretty much only modify nodesets in a couple
|
|
of places.
|
|
</p>
|
|
|
|
<p>
|
|
To be able to reuse objects across subexpressions we chould have an
|
|
<code>ExprResult::ensureModifyable</code>-function. This would
|
|
return the same object if the refcount is 1, and create a new object
|
|
to return otherwise. This is especially usefull for nodesets which
|
|
would be mostly used by a single object at a time. But it could be
|
|
just as usefull for other types, though then we might need a
|
|
<code>ExprResult::ensureModifyableOfType(ExprResult::ResultType)</code>-function
|
|
that only returned itself if it has a refcount of 1 and is of the
|
|
requsted type.
|
|
</p>
|
|
|
|
<h2>Stage 4</h2>
|
|
|
|
<h3>Summary</h3>
|
|
<p>
|
|
Speed up evaluation of XPath expressions by using specialized
|
|
classes for common optimizable expressions.
|
|
</p>
|
|
|
|
<h3>Details</h3>
|
|
<p>
|
|
Some common expressions are possible to execute faster if we have
|
|
classes that are specialized for them. For example the expression
|
|
"@foo" can be evaluated by simply calling |context->getAttributeNode
|
|
("foo")|, instead we now walk all attributes of the context node and
|
|
filter each node using a AttributeExpr. Below is a list of
|
|
expressions that I can think of that are optimizable, but there are
|
|
probably more.
|
|
</p>
|
|
|
|
<p>
|
|
One thing that we IMHO should keep in mind is to only put effort on
|
|
optimising expressions that are actually used in realworld
|
|
stylesheets. For example "foo | foo", "foo | bar[0]" and
|
|
"foo[position()]" can all be optimised to "foo", but since noone
|
|
should be so stupid as to write such an expression we shouldn't
|
|
spend time or codesize on that. Of course we should return the
|
|
correct result according to spec for those expressions, we just
|
|
shouldn't bother with evaluating them fast.
|
|
</p>
|
|
|
|
|
|
<p>
|
|
Apart from finding expression that we can evaluate more cleverly
|
|
there is also the problem of how and where do we create these
|
|
optimised objects instead of the unoptimised, general ones we create
|
|
now. And what are these optimised classes, should they be normal
|
|
Expr classes or should they be something else? We could also add
|
|
"optional" methods to Expr which have default implementations in
|
|
Expr, for example a ::isContextSensitive() which returns MB_TRUE
|
|
unless overridden. However we probably can't answer all this until
|
|
we know which expressions we want to optimised and how we want to
|
|
optimise them.
|
|
</p>
|
|
|
|
<p>
|
|
These expressions can be optimised:
|
|
</p>
|
|
|
|
<p><h4><a name="usecase01" href="#usecase01">Use case 1</a></h4>
|
|
<h4>Class:</h4>
|
|
Steps along the attribute axis which doesn't contain wildcards
|
|
<h4>Example:</h4>
|
|
@foo
|
|
<h4>What we do today:</h4>
|
|
Walk through the attributes NamedNodeMap and filter each node using a
|
|
NameTest.
|
|
<h4>What we could do:</h4>
|
|
Call getAttributeNode (or actually getAttributeNodeNS) on the
|
|
contextnode and return a nodeset containing just the returned node, or
|
|
an empty nodeset if NULL is returned.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase02" href="#usecase02">Use case 2</a></h4>
|
|
<h4>Class:</h4>
|
|
Union expressions where each expression consists of a LocationStep and
|
|
all LocationSteps have the same axis. None of the LocationSteps have any
|
|
predicates (well, this could be relaxed a bit)
|
|
<h4>Example:</h4>
|
|
foo | bar | baz
|
|
<h4>What we do today:</h4>
|
|
Evaluate each LocationStep separately and thus walk the same path through
|
|
the document each time. During the walking the NodeTest is applied to
|
|
filter out the correct nodes. The resulting nodesets are then merged and
|
|
thus we generate orderInfo objects for most nodes.
|
|
<h4>What we could do:</h4>
|
|
Have just one LocationStep object which contains a NodeTest that is a
|
|
"UnionNodeTest" which contains a list of NodeTests. The UnionNodeTest
|
|
then tests each NodeTest until it finds one that returns true. If none
|
|
do then false is returned.
|
|
This results in just one walk along the axis and no need to generate any
|
|
orderInfo objects.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase03" href="#usecase03">Use case 3</a></h4>
|
|
<h4>Class:</h4>
|
|
Steps where the predicates isn't context-node-list sensitive.
|
|
<h4>Example:</h4>
|
|
foo[@bar]
|
|
<h4>What we do today:</h4>
|
|
Build a nodeset of all nodes that match 'foo' and then filter the
|
|
nodeset through the predicate and thus do some node shuffling.
|
|
<h4>What we could do:</h4>
|
|
Create a "PredicatedNodeTest" that contains a NodeTest and a list of
|
|
predicates. The PredicatedNodeTest returns true if both the NodeTest
|
|
returns true and all predicats evaluate to true. Then let the
|
|
LocationStep have that PredicateNodeTest as NodeTest and no predicates.
|
|
This will save us the predicate filtering and thus some node shuffling.
|
|
(Note how this combines nicely with the previous optimisation...)
|
|
(Actually this can be done even if some predicates are context-list
|
|
sensitive, but only up until the first that isn't.)
|
|
</p>
|
|
|
|
<p><h4><a name="usecase04" href="#usecase04">Use case 4</a></h4>
|
|
<h4>Class:</h4>
|
|
PathExprs that only contains steps that from the child:: and attribute::
|
|
axes.
|
|
<h4>Example:</h4>
|
|
foo/bar/baz
|
|
<h4>What we do today:</h4>
|
|
For each step we evaluate the step once for every node in a nodeset
|
|
(for example for the second step the nodeset is the list of all "foo"
|
|
children) and then merge the resulting nodesets while making sure that
|
|
we keep the nodes in document order (and thus generate orderInfo
|
|
objects).
|
|
<h4>What we could do:</h4>
|
|
The same thing except that we don't merge the resulting nodeset, but
|
|
rather just concatenate them. We always know that the resulting nodesets
|
|
are after each other in node order.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase05" href="#usecase05">Use case 5</a></h4>
|
|
<h4>Class:</h4>
|
|
List of predicates where some predicate are not context-list sensitive
|
|
<h4>Example:</h4>
|
|
foo[position() > 3][@bar][.//baz][position() > size() div 2][.//@fud]
|
|
<h4>What we do today:</h4>
|
|
Apply each predicate separately requiring us to shuffle nodes five times
|
|
in the above example.
|
|
<h4>What we could do:</h4>
|
|
Merge all predicates that are not node context-list sensitive into the
|
|
previous predicate. The above predicate list could be merged into the
|
|
following predicate list
|
|
foo[(position() > 3) and (@bar) and (.//baz)][(position() > size() div 2) and (.//@fud)]
|
|
Which only requires two node-shuffles
|
|
</p>
|
|
|
|
<p><h4><a name="usecase06" href="#usecase06">Use case 6</a></h4>
|
|
<h4>Class:</h4>
|
|
Predicates that are only context-list-position sensitive and not
|
|
context-list-size sensitive
|
|
<h4>Example:</h4>
|
|
foo[position() > 5][position() mod 2]
|
|
<h4>What we do today:</h4>
|
|
Build the entire list of nodes that matches "foo" and then apply the
|
|
predicates
|
|
<h4>What we could do:</h4>
|
|
Apply the predicates during the initial build of the first nodeset. We
|
|
would have to keep track of how many nodes has passed each and somehow
|
|
override the code that calculates the context-list-position.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase07" href="#usecase07">Use case 7</a></h4>
|
|
<h4>Class:</h4>
|
|
Predicates that are constants
|
|
<h4>Example:</h4>
|
|
foo[5]
|
|
<h4>What we do today:</h4>
|
|
Perform the appropriate walk and build the entire nodeset. Then apply
|
|
the predicate.
|
|
<h4>What we could do:</h4>
|
|
There are three types of constant results; 1) Numerical values 2)
|
|
Results with a true boolean-value 3) Results with a false boolean value.
|
|
In the case of 1) we should only step up until the n:th node (5 in above
|
|
example) and then stop. For 2) we should completely ignore the predicate
|
|
and for 3) we should return an empty nodeset without doing any walking.
|
|
In some cases we can't at parsetime decide if a constant expression will
|
|
return a numerical or not, for example for "foo[$pos]", so the decision
|
|
of 1) 2) or 3) would have to be made at evaltime. However we should be
|
|
able to decide if it's a constant or not at parsetime.
|
|
Note that while evaluating a LocationStep [//foo] can be considered
|
|
constant.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase08" href="#usecase08">Use case 8</a></h4>
|
|
<h4>Class:</h4>
|
|
PathExprs that contains '//' followed by an unpredicated child-step.
|
|
<h4>Example:</h4>
|
|
.//bar
|
|
<h4>What we do today:</h4>
|
|
We walk the entire subtree below the contextnode and at every node we
|
|
evaluate the 'bar'-expression which walks all the children of the
|
|
contextnode. This means that we'll walk the entire subtree twice.
|
|
<h4>What we could do:</h4>
|
|
Change the expression into "./descendant::bar". This means that we'll
|
|
only walk the tree once. This can only be done if there are no
|
|
predicates since the context-node-list will be different for
|
|
predicates in the new expression.
|
|
Note that this combines nicely with the "Steps where the predicates
|
|
isn't context-node-list sensitive" optimization.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase09" href="#usecase09">Use case 9</a></h4>
|
|
<h4>Class:</h4>
|
|
PathExprs where the first step is '.'
|
|
<h4>Example:</h4>
|
|
./*
|
|
<h4>What we do today:</h4>
|
|
Evaluate the step "." which always returns the same node and then
|
|
evaluate the rest of the PathExpr.
|
|
<h4>What we could do:</h4>
|
|
Remove the '.'-step and simply evaluate the other steps. In the example
|
|
we could even remove the entire PathExpr-object and replace it with a
|
|
single Step-object.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase10" href="#usecase10">Use case 10</a></h4>
|
|
<h4>Class:</h4>
|
|
Steps along the attribute axis which doesn't contain wildcards and
|
|
we only care about the boolean value.
|
|
<h4>Example:</h4>
|
|
foo[@bar], @foo or @bar
|
|
<h4>What we do today:</h4>
|
|
Evaluate the step and create a nodeset. Then get the bool-value of
|
|
the nodeset by checking if the nodeset contain any nodes.
|
|
<h4>What we could do:</h4>
|
|
Simply check if the current element has an attribute of the
|
|
requested name and return a bool-result.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase11" href="#usecase11">Use case 11</a></h4>
|
|
<h4>Class:</h4>
|
|
Unpredicated steps where we only care about the boolean value.
|
|
<h4>Example:</h4>
|
|
foo[<u>processing-instruction()</u>]
|
|
<h4>What we do today:</h4>
|
|
Evaluate the step and create a nodeset. Then get the bool-value of
|
|
the nodeset by checking if the nodeset contain any nodes.
|
|
<h4>What we could do:</h4>
|
|
Walk along the axis until we find a node that matches the nodetest.
|
|
If one is found we can stop the walking and return a true
|
|
bool-result immediatly, otherwise a false bool-result is returned.
|
|
It might not be worth implementing all axes unless we can reuse
|
|
code from the normal Step-code. This could also be applied to
|
|
<code>PathExpr</code>s by getting the boolvalue of the last step.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase12" href="#usecase12">Use case 12</a></h4>
|
|
<h4>Class:</h4>
|
|
Unpredicated steps where we only care about the string-value.
|
|
<h4>Example:</h4>
|
|
starts-with(<u>processing-instruction()</u>, 'hello')
|
|
<h4>What we do today:</h4>
|
|
Evaluate the step and create a nodeset. Then get the string-value of
|
|
the nodeset by getting the stringvalue of the first node.
|
|
<h4>What we could do:</h4>
|
|
Walk along the axis until we find a node that matches the nodetest.
|
|
If one is found we can stop the walking and return a string-result
|
|
containing the value of that node. Otherwise an empty string-result
|
|
can be returned.<br>
|
|
This can also be done when we only care about the number-value.<br>
|
|
This could be combined with the "Unpredicated steps where we only
|
|
care about the boolean value" optimization by instead of returning
|
|
a bool-value or string-value return a nodeset containing just the
|
|
found node. If that is done this optimization could be applied to
|
|
<code>PathExpr</code>s.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase13" href="#usecase13">Use case 13</a></h4>
|
|
<h4>Class:</h4>
|
|
Expressions where the value of an attribute is compared to
|
|
a literal.
|
|
<h4>Example:</h4>
|
|
@bar = 'value'
|
|
<h4>What we do today:</h4>
|
|
Evaluate the attribute-step and then compare the resulting nodeset
|
|
to the value.
|
|
<h4>What we could do:</h4>
|
|
Get the attribute-value for the element and compare that directly
|
|
to the value. In the above example we would just call
|
|
<code>getAttr('bar', kNameSpaceID_None)</code> and compare the
|
|
resulting string with 'value'.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase14" href="#usecase14">Use case 14</a></h4>
|
|
<h4>Class:</h4>
|
|
PathExprs where the last step has a predicate that is not
|
|
context-nodeset dependent and that contains a part that is not
|
|
context-node dependent.
|
|
<h4>Example:</h4>
|
|
foo/*[@bar = current()/@bar]
|
|
<h4>What we do today:</h4>
|
|
<h4>What we could do:</h4>
|
|
First evaluate "foo/*" and "current()/@bar". Then replace
|
|
"current()/@bar" with a literal (and possibly optimize) and filter
|
|
all nodes in the nodeset from "foo/*".
|
|
</p>
|
|
|
|
<p><h4><a name="usecase15" href="#usecase15">Use case 15</a></h4>
|
|
<h4>Class:</h4>
|
|
local-name() or namespace-uri() compared to a literal
|
|
<h4>Example:</h4>
|
|
local-name() = 'foo'
|
|
<h4>What we do today:</h4>
|
|
evaluate the local-name function and compare the string-result to
|
|
the string-result of the literal.
|
|
<h4>What we could do:</h4>
|
|
Atomize the literal (or get the namespaceID in case of
|
|
namespace-uri()) and then compare that to the atom-name of the
|
|
contextnode. This is primarily usefull when combined with the
|
|
previous class.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase16" href="#usecase16">Use case 16</a></h4>
|
|
<h4>Class:</h4>
|
|
Comparisons where one side is a nodeset and the other is not a
|
|
bool-value.
|
|
<h4>Example:</h4>
|
|
//myElem = @baz
|
|
<h4>What we do today:</h4>
|
|
Evaluate both sides and then compare them according to the spec.
|
|
<h4>What we could do:</h4>
|
|
First of all we should start by evaluating the nodeset-side, if the
|
|
result is an empty nodeset false can be returned immediatly.
|
|
Otherwise we evaluate as normal. When both sides are nodesets we
|
|
should examine them and try to figure out which is faster to
|
|
evaluate. That expression should be evaluated first (probably
|
|
by making it the left-hand-side expression).
|
|
</p>
|
|
|
|
<p><h4><a name="usecase17" href="#usecase17">Use case 17</a></h4>
|
|
<h4>Class:</h4>
|
|
Comparisons where one side is a PathExpr and the other is a
|
|
bool-value.
|
|
<h4>Example:</h4>
|
|
baz = ($foo > $bar)
|
|
<h4>What we do today:</h4>
|
|
Evaluate both sides and then compare them.
|
|
<h4>What we could do:</h4>
|
|
Apply the "Steps where we only care about the boolean
|
|
value"-optimization on the PathExpr-side and then evaluate as usual.
|
|
</p>
|
|
|
|
<p><h4><a name="usecase18" href="#usecase18">Use case 18</a></h4>
|
|
<h4>Class:</h4>
|
|
Subexpressions that will be evaluated more then once where the only
|
|
change is in context it doesn't depend on
|
|
<h4>Example:</h4>
|
|
foo[@bar = <u>sum($var/@bar)</u>]
|
|
<h4>What we do today:</h4>
|
|
Reevaluate the subexpression every time we need it and every time
|
|
get the same result.
|
|
<h4>What we could do:</h4>
|
|
We should save the result from the first evaluation and just bring
|
|
it back the following time we need it. This can be done by
|
|
inserting an extra expression between the subexpression and its
|
|
parent, this expression would then first go look in a cache
|
|
available through the <code>nsIEvalContext</code>, if the value
|
|
isn't available there the original expression is evaluated and its
|
|
result is saved in the cache. The cache can be keyed on an integer
|
|
which is stored in the inserted 'cache-expression'.<br>
|
|
The cache itself could be created by another expression that is
|
|
inserted at the top of the expression. This way that expression
|
|
works as a boundry-point for the cache and can in theory be
|
|
inserted anywhere in an expression if needed.
|
|
</p>
|
|
|
|
<h2>Stage 5</h2>
|
|
|
|
<h3>Summary</h3>
|
|
<p>
|
|
Detect when we can concatenate nodesets instead of merge them in
|
|
PathExpr.
|
|
</p>
|
|
|
|
<h3>Details</h3>
|
|
<p>
|
|
Why can we for expressions like "foo/bar/baz" concatenate the resulting
|
|
nodesets without having to check nodeorder? Because at every step two
|
|
statements are true:
|
|
<ol>
|
|
<li>We iterate a nodeset where no node is an ancestor of another</li>
|
|
<li>The LocationStep only returns nodes that are members of the subtree
|
|
below the context-node</li>
|
|
</ol>
|
|
</p>
|
|
|
|
<p>
|
|
For example; While evaluating the second step in "foo/bar/baz" we
|
|
iterate a nodelist containing all "foo" children of the original
|
|
contextnode, i.e. none can be an ancestor of another. And the
|
|
LocationStep "bar" only returns children of the contextnode.
|
|
</p>
|
|
|
|
<p>
|
|
So, it would be nice if we can detect when this occurs as often as
|
|
possible. For example the expression "id(foo)/bar/baz" fulfils those
|
|
requirements if the nodeset returned from contains doesn't contain any
|
|
ancestors of other nodes in the nodeset, which probably often is the
|
|
case in real-world stylesheets.
|
|
</p>
|
|
|
|
<p>
|
|
We should perform this check on every step to be able to take advantage
|
|
of it as often as possible. For example the in expression
|
|
"id(@boss)/ancestor::team/members" we can't use this optimisation at the
|
|
second step since the ancestor axis returns nodes that are not members
|
|
of the contextnodes subtree. However we will probably be able to use the
|
|
optimisation at the third step since if iterated nodeset contains only
|
|
one node (and thus can't contain ancestors of it's members).
|
|
</p>
|
|
</body>
|
|
</html>
|