This application is related to the following application filed on the same date herewith: Deferred Maintenance of Sparse Join Indexes (Applicant Reference Number: 2704.007US1).
A relational database stores data that is logically related by rows and columns. The database may be queried for data such as by using a query language to identify the data desired, and where in the database to look for the data. A query optimizer may take the query and determine if there is a more efficient way to process the query. Some databases facilitate the creation of indexes, which may be thought of as a subset of the database that contains data that is organized in a certain manner, and may contain further data that is aggregated, or pre-calculated. Currently, when a join index is used to rewrite a query, the join index contains the same set or a super set of rows that is used to process the query. If the join index does not contain all the rows needed to process the query, it is not used.
In the following description, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific embodiments which may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that structural, logical and electrical changes may be made without departing from the scope of the present invention. The following description of example embodiments is, therefore, not to be taken in a limited sense, and the scope of the present invention is defined by the appended claims.
The functions or methods described herein may be implemented in software or a combination of software and human implemented procedures in one embodiment. The software may consist of computer executable instructions stored on computer readable media such as memory or other type of storage devices. The term “computer readable media” is also used to represent any means by which the computer readable instructions may be received by the computer, such as by different forms of wired or wireless transmissions. Further, such functions correspond to modules, which are software, hardware, firmware or any combination thereof. Multiple functions may be performed in one or more modules as desired, and the embodiments described are merely examples. The software may be executed on a digital signal processor, ASIC, microprocessor, or other type of processor operating on a computer system, such as a personal computer, server or other computer system.
A join index with single table predicates is known as “sparse join index”. The predicate is called sparse condition, which filters out rows that are not needed in the join index. A “snapshot join index” may be defined with a sparse condition that confines the data set included in the join index. A sparse join index may be used to answer queries on bigger data sets by fetching the missing rows of the sparse join index from a base table. In various embodiments, faster response time to queries may be obtained by leveraging pre-aggregated, pre-joined data contained within such horizontal partial covering join indexes. A join index may be used in processing a query when the join index is sparse. A sparse join index contains a subset of rows that will be used in processing the query. A “snapshot join index” may be defined with a sparse condition that confines the data set included in the join index. A sparse join index may be used to answer queries on bigger data sets by fetching the missing rows of the sparse join index from a base table. In various embodiments, faster response time to queries may be obtained by leveraging pre-aggregated, pre-joined data contained within sparse join indexes.
Operators of the computer system 100 typically use a workstation 110, terminal, computer, handheld wireless device or other input device to interact with the computer system 100. This interaction generally comprises queries that conform to a Structured Query Language (SQL) standard, and invoke functions performed by a Relational Database Management System (RDBMS) executed by the system 100. In further embodiments, the computer system 100 may implement on-line analysis processing (OLAP) or multidimensional OLAP (MOLAP) or relational OLAP (ROLAP). Various other processing systems may also be implemented by computer system 100 or other computer systems capable of providing access to relational databases.
In one embodiment, the RDBMS comprises the Teradata.® product offered by Teradata US, Inc., and may include one or more Parallel Database Extensions (PDEs) 112, Parsing Engines (PEs) 114, and Access Module Processors (AMPs) 116. These components of the RDBMS perform the function which enable of RDBMS and SQL standards, i.e., definition, compilation, interpretation, optimization, database access control, database retrieval, and database update.
Work may be divided among the PUs 102 in the system 100 by spreading the storage of a partitioned relational database 118 managed by the RDBMS across multiple AMPs 116 and the DSUs 106 (which are managed by the AMPs 116). Thus, a DSU 106 may store only a subset of rows that comprise a table in the partitioned database 118 and work is managed by the system 100 so that the task of operating on each subset of rows is performed by the AMP 116 managing the DSUs 106 that store the subset of rows.
The PEs 114 handle communications, session control, optimization and query plan generation and control. The PEs 114 fully parallelize all functions among the AMPs 116. As a result, the system of
Both the PEs 114 and AMPs 116 are known as “virtual processors” or “vprocs”. The vproc concept is accomplished by executing multiple threads or processes in a PU 102, wherein each thread or process is encapsulated within a vproc. The vproc concept adds a level of abstraction between the multi-threading of a work unit and the physical layout of the parallel processing computer system 100. Moreover, when a PU 102 itself is comprised of a plurality of processors or nodes, the vproc concept provides for intra-node as well as the inter-node parallelism.
The vproc concept results in better system 100 availability without undue programming overhead. The vprocs also provide a degree of location transparency, in that vprocs communicate with each other using addresses that are vproc-specific, rather than node-specific. Further, vprocs facilitate redundancy by providing a level of isolation/abstraction between the physical node 102 and the thread or process. The result is increased system 100 utilization and fault tolerance.
In various embodiments, data partitioning and repartitioning may be performed, in order to enhance parallel processing across multiple AMPs 116. For example, the data may be hash partitioned, range partitioned, or not partitioned at all (i.e., locally processed). Hash partitioning is a partitioning scheme in which a predefined hash function and map is used to assign records to AMPs 116, wherein the hashing function generates a hash “bucket” number and the hash bucket numbers are mapped to AMPs 116. Range partitioning is a partitioning scheme in which each AMP 116 manages the records falling within a range of values, wherein the entire data set is divided into as many ranges as there are AMPs 116. No partitioning means that a single AMP 116 manages all of the records.
Generally, the PDEs 112, PEs 114, and AMPs 116 are tangibly embodied in and/or accessible from a device, media, carrier, or signal, such as RAM, ROM, one or more of the DSUs 106, and/or a remote system or device communicating with the computer system 100 via one or more of the DCUs 108. The PDEs 112, PEs 114, and AMPs 116 each comprise logic and/or data which, when executed, invoked, and/or interpreted by the PUs 102 of the computer system 100, cause the methods or elements of the present invention to be performed.
As noted above, many different hardware and software environments may be used to implement the methods described herein. A spectrum of embodiments ranging from stand alone processors with a single storage device, to multiple distributed processors with distributed storage devices storing one or more databases may be used in various embodiments.
In
In a further embodiment as illustrated in
In one example embodiment, base tables are defined with a partitioned primary index (PPI) by which new incoming data go to the most recent partition(s) as described in the following database definition language (DDL) statements:
The last two lines correspond to a very specific example. In further embodiments, the statements may be represented generically as:
wherein xxx and yyy are dates, and zzz is a number of time periods QQQ. As indicated above, the dates are in one specific format, but may be in other formats as desired. While a MONTH is indicated as the time period above, the time period may be varied to correspond to an actual application, such as a day, week, year, quarter, hour, minute, or whatever other type of time period desired.
A join index may be defined with the sparse condition that specifies a “snapshot view” of the data. When the base table is updated, the join index (JI) maintenance may be bypassed as a result of the values of the updated rows being outside the range set by the sparse condition.
1) First, perform an INSERT into HONG.orders.
->No rows are returned to the user as the result of statement 1.
EXPLAIN DEL orders WHERE o_orderdate<‘2003-01-01’;
No rows are returned to the user as the result of statement 1.
This “snapshot ji” can be used to answer queries that ask for more rows than those included in the JI. For example, the following query
Assume that the sparse condition in the JI definition and the query condition are sparse_ji_Cond and query_cond, respectively. The ji_ret_cond, which represents the condition used for the join index retrieval, is calculated as:
ji_ret_cond=query_cond AND sparse_ji_cond
Since all the rows in the join index already satisfy the sparse_ji_cond, the expression can be simplified as:
ji_ret_cond=query_cond
Furthermore, when
!query_cond AND sparse_ji_cond=false;
i.e. the row set of the query result is a superset of the row set included in the join index, ji_ret_cond may be set to be true because all the rows in the join index are needed to answer the query. In the above example,
Therefore, ji_ret_cond=true in this example.
The base_ret_cond, which represents the condition for the retrieval from the base table to get the extra rows needed in the query, is calculated as:
Therefore base_ret_cond specifies two range conditions that correspond to the rows that are required in the query but are not included in the join index.
The above example illustrates the idea of the “horizontal partial covering”—when the row set in JI is a subset of that required by the query, a retrieval to the base table for the rest of rows is needed and the union of the two can give the row set required in the query.
In the following, an example is described where using the horizontal partial covering method can help to improve performance by leveraging an aggregate join index (AJI) with aggregates at the same or lower level than that is required in the query. Assume a fact table and a dimension table as:
An AJI at week level is defined as:
A query that rolls up to the month level,
The time ranges covered in the join index (from startweek to endweek) and in the query (from startmonth to endmonth) may be different. So going back to the base table to get the missing rows in the join index may be needed in order to use the AJI to answer the query. A time line 400 in
The Abstract is provided to comply with 37 C.F.R. §1.72(b) to allow the reader to quickly ascertain the nature and gist of the technical disclosure. The Abstract is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.