The systems and methods for type inference for optimized XSLT implementation in accordance with the present invention are further described with reference to the accompanying drawings in which:
Certain specific details are set forth in the following description and figures to provide a thorough understanding of various embodiments of the invention. Certain well-known details often associated with computing and software technology are not set forth in the following disclosure, however, to avoid unnecessarily obscuring the various embodiments of the invention. Further, those of ordinary skill in the relevant art will understand that they can practice other embodiments of the invention without one or more of the details described below. Finally, while various methods are described with reference to steps and sequences in the following disclosure, the description as such is for providing a clear implementation of embodiments of the invention, and the steps and sequences of steps should not be taken as required to practice this invention.
In one embodiment, contemplated systems and methods can perform type inference in a system with a call-graph, a data-flow graph, various flow analyses and an optimizing code generator. The performance of a technology such as XslCompiledTransform is achieved through a combination of techniques including optimizations as referenced in the background section, as well as: computation of a call graph, computation of a data-flow graph, side-effect inference, type inference, as described herein, unused parameter elimination, dead-code elimination, and focus inference, as described in U.S. Provisional Application 60/789,555.
Example of Type Inference
Consider the following template:
Independently of other templates contained in the stylesheet, we may infer that:
By contrast, the types of $par and $var4 cannot be determined without analyzing all callers of the given template. For instance, if all callers pass string values for the “par” parameter, we may infer type string for both $par and $var4. If some callers pass string values and others pass node set values, then we cannot make $par and $var4 strongly typed, and may end up allocating “variant” storage for them.
In one embodiment, type inference may be conducted using a program analysis for XSLT programs that comprises the phases illustrated in
While
Phase 1—Naïve Type Inference for XPath Expressions
According to step 101 in
Other flags denote the corresponding XPath data types, except for XslFlags.Node, which indicates a node set containing exactly one node.
Global and Local Variables
For every global or local variable, one embodiment of the invention may initialize its type flags with the type of the XPath expression contained in the “select” attribute of that variable, or XslFlags.Rtf if the variable is bound to a result tree fragment 202. For the majority of XPath expressions, their types may be inferred in a straightforward manner using the semantics of XPath operators and the signatures of XPath and XSLT functions:
However, if an expression contains just a variable reference ($<name>), its type cannot be inferred by a naïve type inference algorithm:
In this case type flags cannot be inferred without inter-template analysis, and are not set initially.
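The naïve inference described in this phase can be illustrated with a minimal sketch. It assumes a hypothetical tuple encoding of XPath expressions and a hypothetical `infer_naive` helper; only the Node and Rtf flags are named in this description, and the remaining flag names are assumed for illustration. The function and operator tables list a few XPath 1.0 built-ins whose result types are fixed by the XPath specification.

```python
from enum import IntFlag


class XslFlags(IntFlag):
    # NODE and RTF mirror flags named in the text; the other names are assumed.
    NONE = 0
    STRING = 1
    NUMBER = 2
    BOOLEAN = 4
    NODE = 8
    NODESET = 16
    RTF = 32


# Return types of a few XPath 1.0 core functions.
FUNCTION_TYPES = {
    "concat": XslFlags.STRING,
    "count": XslFlags.NUMBER,
    "string-length": XslFlags.NUMBER,
    "not": XslFlags.BOOLEAN,
}

# Result types of a few XPath 1.0 binary operators.
OPERATOR_TYPES = {
    "+": XslFlags.NUMBER,
    "*": XslFlags.NUMBER,
    "=": XslFlags.BOOLEAN,
    "and": XslFlags.BOOLEAN,
    "|": XslFlags.NODESET,
}


def infer_naive(expr):
    """Return (flags, type_donor) for a tuple-encoded XPath expression."""
    kind = expr[0]
    if kind == "string":
        return XslFlags.STRING, None
    if kind == "number":
        return XslFlags.NUMBER, None
    if kind == "call":
        return FUNCTION_TYPES[expr[1]], None
    if kind == "op":
        return OPERATOR_TYPES[expr[1]], None
    if kind == "var":
        # A bare $name: no flags yet; report the variable that governs the type.
        return XslFlags.NONE, expr[1]
    raise ValueError(f"unsupported expression kind: {kind}")
```

In this sketch, a bare variable reference yields no flags and instead names the variable whose type must be resolved later by inter-template analysis.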
Local Parameters
Since values of local parameters are specified by callers, the naïve type inference method cannot infer their type flags. In one embodiment, type flags may be calculated for default values of local parameters 203, without setting any type flags on local parameters themselves.
Global Parameters and Extension Functions
An XSLT stylesheet may have global parameters, whose values are to be specified at execution time, and therefore, not known during compilation of that stylesheet. Effectively, global parameters may be assigned values of arbitrary type, so they are marked with all existing type flags 204.
In addition, the XslCompiledTransform engine allows a stylesheet to involve calls to extension functions, whose return types are not known at compile time either. In one embodiment, such calls are treated the same way as global parameters.
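The initialization rules of phase 1 may be summarized in a short sketch. The `initial_flags` function and the string labels for declaration kinds are hypothetical, and the flag names other than Node and Rtf are assumed; the sketch only restates the rules given above and is not the engine's actual code.

```python
from enum import IntFlag


class XslFlags(IntFlag):
    # NODE and RTF mirror flags named in the text; the other names are assumed.
    NONE = 0
    STRING = 1
    NUMBER = 2
    BOOLEAN = 4
    NODE = 8
    NODESET = 16
    RTF = 32


ALL_TYPES = (XslFlags.STRING | XslFlags.NUMBER | XslFlags.BOOLEAN
             | XslFlags.NODE | XslFlags.NODESET | XslFlags.RTF)


def initial_flags(kind, select_flags=XslFlags.NONE, is_rtf=False):
    """Initial type flags for a declaration, before inter-template analysis."""
    if kind in ("global-param", "extension-call"):
        # The value is unknown at compile time, so any type is possible.
        return ALL_TYPES
    if kind == "local-param":
        # Callers decide; flags of the default value are tracked separately.
        return XslFlags.NONE
    if kind == "variable":
        return XslFlags.RTF if is_rtf else select_flags
    raise ValueError(f"unknown declaration kind: {kind}")
```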
Phase 2—Data-Flow Graph Construction
According to step 102 in
In one embodiment, the method may begin by building a data-flow graph to determine which types of values may actually be assigned to variables and parameters of an XSLT stylesheet 301. In one exemplary algorithm, a data-flow graph represents the relation “can-be-assigned to” for three cases:
The first two of these instructions may specify values of the callee's local parameters via their xsl:with-param child nodes. If the value of some parameter P is specified via xsl:with-param, and the type flags of the expression specified by that xsl:with-param were inferred during phase 1, those type flags are included in the type flags of P 302. Suppose that the type flags of the xsl:with-param could not be inferred during phase 1, which means it contains just a variable or a parameter reference:
In this case we add an edge <P, Q> to the data-flow graph 303.
If the value of some parameter is not specified (and in the case of xsl:apply-imports it is never specified), the default parameter value will be used instead 304. It is important to distinguish the case in which the value of a local parameter is always specified by callers. In such a case, we do not have to update the type flags of that parameter with the type flags of its default value. That makes a big difference, because even if there is no default value specified in the stylesheet, the empty string default value is assumed by XSLT 1.0 rules. Consider the following two equivalent templates:
Suppose that all callers pass a single node as a value of the “node” parameter. In that case we may ignore the default value and infer node type.
A local parameter, whose default value may be used during execution of the stylesheet, is marked with a special XslFlags.MayBeDefault flag 305.
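The handling of xsl:with-param values and default values in phase 2 might be sketched as follows. The `Param` class, `record_call`, and `finish` are hypothetical names, and the sketch assumes that Q in an edge <P, Q> denotes the variable or parameter referenced by the xsl:with-param for P. For simplicity, it also assumes that the flags of every default value are already known.

```python
from enum import IntFlag


class XslFlags(IntFlag):
    # NODE and MAY_BE_DEFAULT mirror flags named in the text; STRING is assumed.
    NONE = 0
    STRING = 1
    NODE = 8
    MAY_BE_DEFAULT = 64


class Param:
    def __init__(self, name, default_flags=XslFlags.STRING):
        self.name = name
        self.flags = XslFlags.NONE
        # XSLT 1.0 assumes an empty string when no default is given.
        self.default_flags = default_flags


def record_call(params, with_params, edges):
    """Process one call site of a template.

    with_params maps a parameter name to ("flags", XslFlags) when phase 1
    inferred a type, or to ("donor", name) for a bare variable reference.
    """
    for name, param in params.items():
        if name not in with_params:
            # No value supplied, so the default may be used at run time.
            param.flags |= XslFlags.MAY_BE_DEFAULT
            continue
        kind, value = with_params[name]
        if kind == "flags":
            param.flags |= value
        else:
            edges.append((name, value))  # edge <P, Q>


def finish(params):
    """Merge default-value flags only where the default can actually be used."""
    for param in params.values():
        if param.flags & XslFlags.MAY_BE_DEFAULT:
            param.flags |= param.default_flags
```

When every caller supplies a single node, the parameter in this sketch ends up with the Node flag only, and the implicit empty-string default does not widen its type.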
Phase 3—Inter-Template Type Flags Propagation
According to step 103 in
Exemplary systems are now in a position to propagate type flags through the data-flow graph. In the general case of flow analysis, the result is potentially obtained by means of a fix-point algorithm on a control-flow or data-flow graph (“stop if no more changes have been made in the previous pass”). It turns out that type inference admits a more efficient approach: a one-pass (hence non-iterative) post-order (“depth-first”) traversal of the data-flow graph 401. This is an insight that contributes to the scalability of type inference.
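One way to picture the non-iterative propagation is a depth-first reachability pass run once per flag, sketched below. The `propagate_flag` function, the plain-integer flag values, and the `successors` mapping (from a value's source to the variables and parameters it can be assigned to) are assumptions made for illustration and are not taken from the listings.

```python
STRING = 1   # assumed flag values, for illustration only
NODESET = 2


def propagate_flag(flag, flags, successors):
    """Set `flag` on every node reachable from a node that already has it."""
    visited = set()
    # Start from every node that carries the flag after phases 1 and 2.
    stack = [node for node, value in flags.items() if value & flag]
    while stack:
        node = stack.pop()
        if node in visited:
            continue  # each node is handled at most once per flag
        visited.add(node)
        flags[node] = flags.get(node, 0) | flag
        stack.extend(successors.get(node, ()))


# Usage: $var1 is a string, $other is a node set, and both may flow into $par.
flags = {"var1": STRING, "other": NODESET, "par": 0, "var4": 0}
successors = {"var1": ["par"], "other": ["par"], "par": ["var4"], "var4": ["par"]}
for flag in (STRING, NODESET):
    propagate_flag(flag, flags, successors)
```

Because each node is visited at most once per flag, the cost of this sketch grows linearly with the size of the graph for each flag, and cycles in the graph do not cause repeated passes.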
Type Flags Propagation for xsl:apply-templates
In one embodiment, our technique handles the xsl:apply-templates instruction. Consider the following XSLT program:
In this case, the xsl:apply-templates instruction may call either the “T1” template or the “T2” template according to step 402, and therefore the type flags of both “par” parameters depend on the type flags of the XPath expression denoted by the ellipsis. This means that logically we should add as many edges to the data-flow graph as there are templates that have the “par” parameter and carry the relevant mode “M”. In one exemplary embodiment, we instead add edges to a special node that collectively represents all “par” parameters for the given mode “M” according to step 403. This also improves scalability of the inference. As an aside, this discussion also demonstrates that type inference naturally interacts with the XSLT concept of modes.
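The effect of the special per-mode node on the number of edges may be sketched as follows. The `DataFlowGraph` class, the `proxy` key, and the two helper functions are hypothetical; in particular, linking the shared node to each template's parameter with one edge per template is an assumption about how the collective node could be connected.

```python
from collections import defaultdict


class DataFlowGraph:
    def __init__(self):
        # Maps a source node to the nodes its type flags flow into.
        self.successors = defaultdict(set)

    def add_edge(self, source, target):
        self.successors[source].add(target)

    def edge_count(self):
        return sum(len(targets) for targets in self.successors.values())


def proxy(mode, param_name):
    """Key of the shared node standing for all same-named parameters in a mode."""
    return ("mode-param", mode, param_name)


def declare_template_param(graph, template, mode, param_name):
    # One edge per template, from the shared node to the template's parameter.
    graph.add_edge(proxy(mode, param_name), (template, param_name))


def record_apply_templates(graph, mode, param_name, donor):
    # One edge per call site, regardless of how many templates match the mode.
    graph.add_edge(donor, proxy(mode, param_name))


# Usage: two templates in mode "M" and three call sites passing "par".
graph = DataFlowGraph()
for template in ("T1", "T2"):
    declare_template_param(graph, template, "M", "par")
for donor in ("a", "b", "c"):
    record_apply_templates(graph, "M", "par", donor)
```

With C call sites and T matching templates, this arrangement needs C + T edges rather than C × T, which is where the scalability benefit in this sketch comes from.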
Phase 4—Allocation of Data Storages and Eliminating Type Checks
According to step 104 in
A number of other optimizations may be made based on the computed set of type flags. For example, predicates are valid on node sets only, and, in general, one needs to check the runtime type of a parameter before applying the predicate to it. However, if the type node set has been inferred for the parameter, the runtime check is not needed, and may be optimized out 504:
As another example, the XslCompiledTransform engine uses the same data structure for representing both node and result tree fragment types. Thus, if the computed set of type flags contains only XslFlags.Node and XslFlags.Rtf flags, the data storage for the node type will be used; however, the code generator will insert runtime type checks for all operations that are valid on node sets, but invalid on result tree fragments:
Though we still need a runtime type check for the parameter “par”, we benefit from using the most efficient data storage for it.
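The storage decisions discussed in this phase can be condensed into a small sketch. The `choose_storage` function and the storage labels are hypothetical, and the flag names other than Node and Rtf are assumed; the sketch covers only the cases mentioned in this description (a single inferred type, the Node plus Rtf combination, and the “variant” fallback).

```python
from enum import IntFlag


class XslFlags(IntFlag):
    # NODE and RTF mirror flags named in the text; the other names are assumed.
    NONE = 0
    STRING = 1
    NUMBER = 2
    BOOLEAN = 4
    NODE = 8
    NODESET = 16
    RTF = 32


# Hypothetical storage labels for variables with exactly one inferred type.
SINGLE_TYPE_STORAGE = {
    XslFlags.STRING: "string",
    XslFlags.NUMBER: "number",
    XslFlags.BOOLEAN: "boolean",
    XslFlags.NODE: "node",
    XslFlags.NODESET: "node-set",
    XslFlags.RTF: "rtf",
}


def choose_storage(flags):
    """Return (storage, needs_runtime_check) for a computed set of type flags."""
    if flags == XslFlags.NODE | XslFlags.RTF:
        # Shared representation: node storage, but node-set-only operations
        # still need a runtime check.
        return "node", True
    if flags in SINGLE_TYPE_STORAGE:
        return SINGLE_TYPE_STORAGE[flags], False
    # Several unrelated types (or none) were inferred: fall back to a variant.
    return "variant", True
```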
Overview of Exemplary Implementation
This section presents an overview of an exemplary implementation. The logic, as described above, can be implemented in a system such as the .NET Framework 2.0. Such an implementation uses, for example, C# 2.0.
The exemplary implementation is located in the XslAstAnalyzer class, which is one of the internal classes of the XslCompiledTransform implementation. This class implements a visitor on the XSLT AST—the in-memory tree that represents stylesheets. We use the standard visitor pattern here. We refer to Listing 1 for the visitor methods which resemble the AST node types for an XSLT program.
The visitor traverses the AST in a bottom-up manner while naïvely inferring type flags for variables and parameters according to phase 1 and adding edges to the data-flow graph according to phase 2. Hence, phase 1 and phase 2 are carried out in an interleaved manner. The visit methods also build data structures for other program analyses as mentioned earlier. Listing 2 demonstrates a sketch of the helper XPathAnalyzer class used for computing type flags for XPath expressions. If an XPath expression contains just a variable or parameter reference with optional parentheses, its type flags cannot be inferred naïvely, and in that case XPathAnalyzer returns the variable or the parameter that governs the type of the expression (“type donor”), so that XslAstAnalyzer can use that information when constructing the data-flow graph.
We refer to Listing 3 for a sketch of the graph class that is instantiated for data-flow graphs. Upon completion of the visitor's work, the XslAstAnalyzer class calls the PropagateFlag( ) method separately for each of the data type flags. The propagation method is also shown in Listing 3.
As a result of this analysis, all variables and parameters in the AST are annotated with XslFlags, and the “code generator” component of the XSLT compiler can use this information directly to allocate the most appropriate data storage for them.
In addition to the specific implementations explicitly set forth herein, other aspects and implementations will be apparent to those skilled in the art from consideration of the specification disclosed herein. It is intended that the specification and illustrated implementations be considered as examples only, with a true scope and spirit being indicated by the following claims.
The shown methods correspond to the AST node types for XSLT programs.
This class is used to compute type flags of XPath expressions.
This class is used to represent (reverse) call graphs.
There is a general facility for flag annotation.
There is also ready support for propagation of flags (using DepthFirstSearch; elided).
This application claims priority to U.S. Provisional Application 60/789,554, filed Apr. 4, 2006. This application is related by subject matter to U.S. Provisional Application 60/789,555, filed Apr. 4, 2006, and any subsequent nonprovisional applications claiming priority thereto.
Number | Date | Country
---|---|---
60789554 | Apr 2006 | US
60789555 | Apr 2006 | US