The present invention relates to database processing, and, in particular embodiments, to an apparatus and method for using parameterized intermediate representation for just-in-time compilation in a database query execution engine.
Central Processing Unit (CPU) cost of query execution is becoming more critical in modern database systems, such as when slow disk accesses are largely avoided with the adoption of solid-state drive (SSD) devices. Just-in-time (JIT) compilation is an approach used to improve the CPU performance in a database system. JIT compilation refers to a compilation scheme in query execution performed during execution of a program, at run-time, rather than prior to execution. The low level virtual machine (LLVM) compiler framework is a good candidate for JIT compilation due to its efficiency in code optimization and native code generation, and the quality of its compiled code. The LLVM includes a building function (referred to as IRBuilder) for generating, at run-time, an intermediate representation (IR) of a query-specific function. The LLVM can more efficiently generate optimized machine code from the IR than the compiled query-specific function, for instance. However, the code generation of LLVM IR by using the LLVM IRBuilder at run-time is costly, e.g., in terms of time and/or computing resources, such as memory, and is error-prone. Alternatively, a tool such as clang/clang++ provided by LLVM can be used for compiling C/C++ code into LLVM IR. This approach by itself may not benefit from the JIT compilation if the C/C++ code is not specialized for the incoming query. Thus, there is a need for a scheme of generating IR for JIT compilation of a query with improved efficiency.
In accordance with an embodiment, a method supporting query just-in-time (JIT) compilation and execution in a database management system includes identifying a central processing unit (CPU) intensive function in a query, and identifying, in the CPU intensive function, one or more parameters. The one or more parameters represent variables with values changeable at different query instances. The CPU intensive function is compiled to a parameterized intermediate representation (IR) including the one or more parameters. The parameterized IR of the CPU intensive function is saved in a catalog of parameterized IRs.
In accordance with another embodiment, a method supporting query JIT compilation and execution in a database management system includes compiling a CPU intensive function to a parameterized IR including one or more parameters. The one or more parameters represent variables with values changeable at different query instances. The method further includes saving the parameterized IR of the CPU intensive function in a catalog of parameterized IRs, and loading, during preparation for execution of an incoming query, a parameterized IR from a catalog. In the parameterized IR, the one or more parameters are replaced with constant values for the variables of the incoming query. The parameterized IR is compiled, using the JIT compilation, with the constant values replacing the one or more parameters, to generate a machine code for the execution of the incoming query.
In accordance with yet another embodiment, an apparatus for a database query execution engine comprises at least one processor and a non-transitory computer readable storage medium storing programming for execution by the at least one processor. The programming includes instructions to identify a CPU intensive function in a query, and identify, in the CPU intensive function, one or more parameters that represent variables with values changeable at different query instances. The programming includes further instructions to compile the CPU intensive function to a parameterized IR including the one or more parameters, and save the parameterized IR of the CPU intensive function in a catalog of parameterized IRs.
The foregoing has outlined rather broadly the features of an embodiment of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of embodiments of the invention will be described hereinafter, which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiments disclosed may be readily utilized as a basis for modifying or designing other structures or processes for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.
For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawing, in which:
Corresponding numerals and symbols in the different figures generally refer to corresponding parts unless otherwise indicated. The figures are drawn to clearly illustrate the relevant aspects of the embodiments and are not necessarily drawn to scale.
The making and using of the presently preferred embodiments are discussed in detail below. It should be appreciated, however, that the present invention provides many applicable inventive concepts that can be embodied in a wide variety of specific contexts. The specific embodiments discussed are merely illustrative of specific ways to make and use the invention, and do not limit the scope of the invention.
System and method embodiments are provided herein for using parameterized Intermediate Representation (IR) for just-in-time (JIT) compilation in database query execution engines. Specifically, for a given query function such as a processor (or CPU) intensive function handled by the database execution engine, the variables of the function which are invariant (fixed) for a specific query are identified. The variables can take different values at different query instances, and thus can be used as parameters for the query. The CPU intensive functions can be functions that demand more processing resources, such as in terms of time, memory or other processing resources. The CPU intensive function can be identified by CPU profiling. The identified variables can include, for example, schema information and expression or data type related variables. The variables identified as invariants can enable more compiler optimizations such as dead code elimination, loop unrolling, constant folding and propagation, and inlining of a virtual function call or of a call via a function pointer. The identified variables are set in a template IR of the CPU intensive function as parameters of the IR. During the JIT compilation, e.g., by an IR compiler such as the LLVM JIT compiler, the parameters in the IR are replaced by constant values of the query specific information to generate the optimized machine code for execution.
The template IR with the parameters, referred to herein as the parameterized IR, is statically compiled from the original function (interpreted code), and the IR can be stored in a catalog table with a unique ID and the parameter names. The parameterized IR is loaded, at run-time, during the preparation for the execution of a specific query, and the parameters in the IR are then replaced with the constant values to obtain a modified IR. The modified IR is then JIT compiled to generate native machine code for the function. Generating and compiling a parameterized IR in this way, with the block of instructions in the query function handled at one time, can avoid the costly and error-prone instruction-by-instruction generation of the query specific IR function at run-time. By saving a generic version of the parameterized IR function, loading the IR at run-time, and JIT compiling the IR function after injecting the query specific information into the parameterized IR, there is no need to generate C/C++ code at run-time for each JIT compilation. The scheme of generating and compiling the parameterized IR is also faster than generating C/C++ code for the query and using a C/C++ compiler to JIT compile the query.
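The load-and-bind step described above can be sketched in C. This is only an illustration under stated assumptions: plain text substitution stands in for an IR-level replacement, the helper name bind_parameter is hypothetical, and a real implementation would more likely operate on the parsed IR than on raw text, since a naive textual match cannot tell a parameter name from a longer name that starts with it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `ir` into `out`, replacing every occurrence of
 * the parameter name `param` with the constant text `value`.
 * Returns 0 on success, or -1 if the result does not fit in `out`.
 * Naive textual match, for illustration only. */
static int bind_parameter(const char *ir, const char *param, const char *value,
                          char *out, size_t out_size)
{
    assert(ir && param && value && out);
    size_t plen = strlen(param);
    size_t vlen = strlen(value);
    size_t used = 0;

    if (plen == 0 || out_size == 0)
        return -1;

    while (*ir != '\0') {
        if (strncmp(ir, param, plen) == 0) {
            /* A reference to the parameter: emit the constant instead. */
            if (used + vlen >= out_size)
                return -1;
            memcpy(out + used, value, vlen);
            used += vlen;
            ir += plen;
        } else {
            /* Any other character is copied through unchanged. */
            if (used + 1 >= out_size)
                return -1;
            out[used++] = *ir++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

For instance, binding the parameter %natts to 3 in the line "%cmp = icmp slt i32 %i, %natts" yields "%cmp = icmp slt i32 %i, 3", which the JIT compiler can then treat as a comparison against a constant.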
In an embodiment, the executor 140 (the database query execution engine) implements JIT compilation using parameterized IR.
The unique ID can have an enumerated data type for matching its IR in the table, and is used for retrieving the IR later in the query execution engine (during execution run-time). The enumerated data type can be defined as follows:
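A minimal sketch of such an enumerated type and a catalog lookup keyed by it is given here. All identifiers (ir_id, IR_ID_SDT_LOOP, IR_ID_HASH_JOIN, ir_catalog_entry, ir_catalog_lookup, and the nkeys parameter name) are hypothetical and chosen only for illustration, and the IR text is reduced to placeholder comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical IDs: one enumerator per function whose parameterized IR
 * is stored in the catalog. IR_ID_COUNT is a sentinel, not a real entry. */
typedef enum {
    IR_ID_SDT_LOOP = 0,
    IR_ID_HASH_JOIN,
    IR_ID_COUNT
} ir_id;

/* One catalog row: the unique ID, the parameter names, and the IR itself. */
typedef struct {
    ir_id       id;
    const char *param_names;  /* comma-separated parameter names */
    const char *ir_text;      /* parameterized IR (placeholder text here) */
} ir_catalog_entry;

static const ir_catalog_entry ir_catalog[] = {
    { IR_ID_SDT_LOOP,  "natts", "; parameterized IR of sdt_loop (omitted)" },
    { IR_ID_HASH_JOIN, "nkeys", "; parameterized IR of a hash join (omitted)" },
};

/* Keep the table and the enumeration in step at compile time. */
static_assert(sizeof ir_catalog / sizeof ir_catalog[0] == IR_ID_COUNT,
              "one catalog row per IR ID");

/* Return the catalog row for `id`, or NULL if no such row exists. */
static const ir_catalog_entry *ir_catalog_lookup(ir_id id)
{
    for (size_t i = 0; i < sizeof ir_catalog / sizeof ir_catalog[0]; i++) {
        if (ir_catalog[i].id == id)
            return &ir_catalog[i];
    }
    return NULL;
}
```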
The code (in C language) below illustrates an example of a function sdt_loop extracted from PostgreSQL:
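As a rough sketch of the shape such a function might take, assuming a much simplified tuple that holds only fixed-width integer columns: the type simple_tuple, its fields, and the name sdt_loop_sketch are hypothetical and are not PostgreSQL's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified in-memory tuple: integer columns stored
 * back to back. Real tuples also carry headers, NULL bitmaps, and
 * variable-width columns, all of which are left out here. */
typedef struct {
    const int *data;   /* column values */
    int        ncols;  /* number of columns present in the tuple */
} simple_tuple;

/* Extract the first `natts` column values of `tuple` into `values`.
 * `natts` is the variable that stays fixed for a given relation in a query. */
static void sdt_loop_sketch(const simple_tuple *tuple, int *values, int natts)
{
    assert(tuple != NULL && values != NULL);
    assert(natts >= 0 && natts <= tuple->ncols);

    for (int attnum = 0; attnum < natts; attnum++)
        values[attnum] = tuple->data[attnum];
}
```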
In the example, the function sdt_loop is simplified and extracted from the query execution engine of PostgreSQL. The function sdt_loop is used to extract column values from an in-memory tuple. The value of the variable natts is invariant for a specific relation in a query and it is the number of column values that need to be extracted from a tuple. Based on the CPU profiling and program analysis, the function sdt_loop is identified as a candidate for JIT compilation in the database query execution engine. The variable natts in the function is also identified as parameter in the compiled IR since for a specific relation in a query, the value of natts is not changed during the execution of the query. The function is statically compiled into IR with the augmented parameter of natts as follows:
The parameterized IR above is saved in the catalog table to be loaded for JIT compilation in the query execution engine of PostgreSQL. In a scenario, a query is received with natts equal to 3 as follows:
In the above example, C3 is the third column in the table T.
The parameterized IR is hence loaded from the catalog table and the reference of the parameter natts is replaced with the constant 3 in the IR as follows:
The above LLVM IR is equivalent to the following C code:
The function is then JIT compiled and the native machine code is generated for the execution of this query. The resulting optimized code with JIT compilation on the function sdt_loop becomes:
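The effect of binding natts to 3 can be illustrated at the C level with three variants of a simplified extraction loop. The function names and the integer-only column layout are assumptions made for this sketch, and the unrolled form shows what an optimizer may produce once the loop bound is a known constant, not the exact code any particular compiler emits.

```c
#include <assert.h>
#include <stddef.h>

/* Generic form: the loop bound natts is only known at run-time. */
static void extract_generic(const int *data, int *values, int natts)
{
    assert(natts >= 0);
    for (int i = 0; i < natts; i++)
        values[i] = data[i];
}

/* After the parameter is replaced: the loop bound is the constant 3. */
static void extract_natts3(const int *data, int *values)
{
    for (int i = 0; i < 3; i++)
        values[i] = data[i];
}

/* What loop unrolling can reduce the constant-bound loop to:
 * no counter, no comparison, no branch. */
static void extract_natts3_unrolled(const int *data, int *values)
{
    values[0] = data[0];
    values[1] = data[1];
    values[2] = data[2];
}
```

All three variants produce the same values for a query that reads three columns; only the generic one pays for the loop control on every tuple.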
Without loss of generality, the method above can be applied to other CPU intensive functions in the database query execution engine, such as hash join, sort or aggregation with group-by, index creation, or others. In the method, a program analysis can be applied to identify parameters in a candidate function for JIT compilation. If the value of a variable is invariant for a specific query (query specific information) and the replacement of the variable with a constant is expected to enable more compiler optimizations on the function, such as dead code elimination, loop unrolling, constant folding and propagation, or inlining of a virtual function call or of a call via a function pointer, then this variable is a parameter in the IR of the function for JIT compilation. Query specific information can be identified from the schema information of relations (e.g., tables, views, indexes) in the query, or from the expressions and data types in the query. For example, in the above sdt_loop function, the value of the variable natts can be determined for a specific query before its execution based on the schema information and the accessed columns in the query. As another example, some schema information, such as a NOT NULL constraint, can help with dead code elimination (the NULL value check of a NOT NULL column is redundant for a query). The data type information (from the schema or from the query itself) can help resolve the function for a function pointer or a virtual function.
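The NOT NULL case can be sketched in C as follows. The functions sum_column and sum_column_not_null and the flag column_not_null are hypothetical names for this illustration; the second function shows by hand what dead code elimination may leave once the flag is known to be true for the query.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Generic form: the NULL check runs for every row unless the column is
 * declared NOT NULL, which is query specific schema information. */
static int sum_column(const int *vals, const bool *is_null, int n,
                      bool column_not_null)
{
    assert(n >= 0);
    int sum = 0;
    for (int i = 0; i < n; i++) {
        if (!column_not_null && is_null[i])
            continue;  /* skip NULL values */
        sum += vals[i];
    }
    return sum;
}

/* With column_not_null fixed to true, the condition above is always
 * false, so the check and the is_null array are dead and can be dropped. */
static int sum_column_not_null(const int *vals, int n)
{
    assert(n >= 0);
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += vals[i];
    return sum;
}
```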
In embodiments, either of two methods can be implemented to inject query specific information (constant values) into the parameterized IR. As the above example for the sdt_loop function shows, the first method comprises replacing the references of each parameter with its related constant value for the specific query. When the IR is loaded from the catalog table, the number of parameters and their names in the IR can also be obtained. The IR is parsed to replace the reference of a parameter in any instruction by the related constant value. The second method comprises inserting an assignment statement for each parameter at the beginning of the function as follows for the sdt_loop function example:
The JIT compiler then propagates the constant value to the references of the parameter. The LLVM IR is in static single assignment form (SSA form), and there is no assignment instruction in LLVM IR. As such, the add instruction can be used to add zero to the constant, which serves as the assignment of the constant to the parameter variable. The resulting code becomes:
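A C-level analogue of this second method is sketched here: a single assignment is placed at the top of the function and constant propagation does the rest. The function name extract_with_injected_natts is hypothetical, and the expression 3 + 0 merely mirrors an add-with-zero instruction of the general shape %natts = add i32 3, 0 in LLVM IR.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified extraction loop with the query specific constant injected
 * as the first statement. The natts argument passed by the caller is
 * overwritten, so every later reference sees the constant 3. */
static int extract_with_injected_natts(const int *data, int *values, int natts)
{
    assert(data != NULL && values != NULL);

    natts = 3 + 0;  /* injected assignment, analogous to "add 3, 0" */

    for (int i = 0; i < natts; i++)
        values[i] = data[i];
    return natts;   /* number of values extracted */
}
```

Compared with rewriting every reference, this form touches only one place in the function and leaves the propagation of the constant to the compiler.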
The JIT compilation can be applied to a function to generate the optimized machine code and return the function pointer of the JIT compiled function. In a scenario, the CPU intensive function may contain only a relatively small portion of code (e.g., a relatively small loop with multiple iterations) that is CPU intensive and has query-specific information. However, other code in the same function may be either not CPU intensive or may have no query-specific information. In such a case, to reduce the cost of JIT compilation, the CPU intensive and query-specific portion of code can be split from the original function and a new function for that portion is thus constructed. The newly constructed and relatively small function is then JIT compiled instead of the original larger function.
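The split can be sketched in C with a function pointer standing in for the pointer returned by the JIT compiler. The names extract_fn, extract_loop, extract_loop_natts3, and process_tuple are hypothetical; extract_loop_natts3 is written by hand here to play the role of the specialized code that JIT compilation would generate.

```c
#include <assert.h>
#include <stddef.h>

/* Signature of the small, hot portion that is worth JIT compiling. */
typedef void (*extract_fn)(const int *data, int *values, int natts);

/* Hot portion split out of the larger function (generic version). */
static void extract_loop(const int *data, int *values, int natts)
{
    for (int i = 0; i < natts; i++)
        values[i] = data[i];
}

/* Stand-in for the JIT compiled version specialized for natts == 3. */
static void extract_loop_natts3(const int *data, int *values, int natts)
{
    (void)natts;  /* fixed to 3 for this query */
    values[0] = data[0];
    values[1] = data[1];
    values[2] = data[2];
}

/* Remaining larger function: validation and bookkeeping stay as they
 * are, and only the hot portion is reached through the pointer. */
static int process_tuple(const int *data, int ncols, int *values, int natts,
                         extract_fn hot)
{
    assert(hot != NULL);
    if (natts < 0 || natts > ncols)
        return -1;             /* cold path: argument check */
    hot(data, values, natts);  /* hot path: generic or specialized */
    return natts;
}
```

Only the small function behind the pointer needs to be compiled per query, so the compilation cost scales with the hot loop and not with the whole original function.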
The CPU 410 may comprise any type of electronic data processor. The memory 420 may comprise any type of system memory such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous DRAM (SDRAM), read-only memory (ROM), a combination thereof, or the like. In an embodiment, the memory 420 may include ROM for use at boot-up, and DRAM for program and data storage for use while executing programs. In embodiments, the memory 420 is non-transitory. The mass storage device 430 may comprise any type of storage device configured to store data, programs, and other information and to make the data, programs, and other information accessible via the bus. The mass storage device 430 may comprise, for example, one or more of a solid state drive, hard disk drive, a magnetic disk drive, an optical disk drive, or the like.
The video adapter 440 and the I/O interface 460 provide interfaces to couple external input and output devices to the processing unit. As illustrated, examples of input and output devices include a display 490 coupled to the video adapter 440 and any combination of mouse/keyboard/printer 470 coupled to the I/O interface 460. Other devices may be coupled to the processing unit 401, and additional or fewer interface cards may be utilized. For example, a serial interface card (not shown) may be used to provide a serial interface for a printer.
The processing unit 401 also includes one or more network interfaces 450, which may comprise wired links, such as an Ethernet cable or the like, and/or wireless links to access nodes or one or more networks 480. The network interface 450 allows the processing unit 401 to communicate with remote units via the networks 480. For example, the network interface 450 may provide wireless communication via one or more transmitters/transmit antennas and one or more receivers/receive antennas. In an embodiment, the processing unit 401 is coupled to a local-area network or a wide-area network for data processing and communications with remote devices, such as other processing units, the Internet, remote storage facilities, or the like.
While several embodiments have been provided in the present disclosure, it should be understood that the disclosed systems and methods might be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system or certain features may be omitted, or not implemented.
In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled or directly coupled or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the spirit and scope disclosed herein.