This application is related to the United States Patent Application entitled Dynamic Partition Enhanced Joining by Mark Morris and Bhashyam Ramesh, Ser. No. 10/862,686, filed on even date.
This application is related to the United States Patent Application entitled Dynamic Partition Enhanced Inequality Joining Using A Value-Count Index by Mark Morris and Bhashyam Ramesh, Ser. No. 10/862,649, filed on even date.
One important feature in relational database system (RDBMS) is the ability to perform queries that join columns from two or more tables. An example of a query including a join is:
SELECT T1.*, T2.*FROM T1, T2 WHERE T1.A condition T2.B;
where T1 and T2 are tables, T1.A is a column in T1, T2.B is a column in T2, and condition is a comparison between the operands to the condition. The example query above returns all of the columns in T1 horizontally concatenated with all of the columns in T2, for rows where T1.A and T2.B satisfy the equality condition. In mathematical terms, this query may be described as a Cartesian product with a condition or cross product with a condition.
In general, in one aspect, the invention features a method of performing a database query to join two or more tables on a condition where each table includes one or more join columns. Each of the join columns include zero or more join-column values. The method includes excluding join-column values that do not satisfy the condition, using one or more value-count indexes and performing the database query on non-excluded join-column values.
Implementations of the invention may include one or more of the following. The database may include value-count indexes for each of the join columns. Excluding join-column values that do not satisfy the condition may include joining the value-count indexes for the join columns to produce a qualifying set. Performing the database query on non-excluded join-column values may include joining each of the tables with the qualifying set to produce a worktable and performing the database query on the worktables. Performing the database query on non-excluded join-column values may include creating an IN list populated with values from the qualifying set. Performing the database query on non-excluded join-column values may include scanning each of the tables. Performing the database query on non-excluded join-column values may include creating a worktable for each table. Performing the database query on non-excluded join-column values may include scanning each row in each table and if the one or more join columns have values in the IN list, adding the row to the worktable. Performing the database query on non-excluded join-column values may include performing the original query on the worktables.
A first join column may be in a first table and a second join column may be in a second table. The database may have a first-join-column-value-count index. Excluding values that do not satisfy the condition, using one or more value-count indexes may include joining the first-join-column-value-count index and the second table to produce a worktable. Performing the database query on non-excluded values may include performing the database query on the first table and the worktable. The join columns in the first set may not have value-count indexes and where join columns in the second set may have value-count indexes.
In general, in another aspect the invention features a method of performing a database query to join two or more tables on an inequality condition. Each table includes one or more join columns. The database includes a value-count index for each join column. Each value-count index includes one or more indexed values and one or more counts. The method includes defining two or more partitions for each table, using the one or more value-count indexes and performing the database query on one partition from each table.
Implementations of the invention may include one or more of the following. Defining two or more partitions may include joining the value-count indexes for the join columns to create a qualifying set. Defining two or more partitions may include joining each table with the qualifying set to create two or more work tables. One or more of the work tables may include values that do not satisfy the inequality condition. One of the work tables may include values that satisfy the inequality condition. Performing the database query on one partition from each table may include performing the database query on the worktables that include values that do not satisfy the inequality condition and may also include storing the query result. The method may also include generating one or more Cartesian products for one or more pairs of work tables, where one or more of the work tables in the pair satisfy the inequality condition. The method may also include storing the Cartesian products. The method may also include merging the query result and Cartesian products.
In general, in another aspect, the invention features a computer program, stored on a tangible storage medium, for performing a database query to join two or more tables on a condition. Each table includes one or more join columns. Each of the join columns includes zero or more join-column values. The computer program including executable instructions that cause a computer to exclude join-column values that do not satisfy the condition, using one or more value-count indexes and cause the computer to perform the database query on non-excluded join-column values.
In general, in another aspect, the invention features a computer program, stored on a tangible storage medium, for performing a database query to join two or more tables on an inequality condition. Each of the tables includes one or more join columns. The database includes a value-count index for each join column. Each value-count index includes one or more indexed values and one or more counts. The computer program includes executable instructions that cause a computer to define two or more partitions for each table using the one or more value-count indexes. The computer program includes executable instructions that cause a computer to perform the database query on one partition from each table.
In general, in another aspect, the invention features a database system including a massively parallel processing system. The massively parallel processing system includes one or more nodes, a plurality of CPUs, a plurality of data storage facilities, and a process for execution on the massively parallel processing system for performing a database query to join two or more tables on an inequality condition. Each of the nodes provides access to one or more CPUs. Each of the one or more CPUs provides access to one or more data storage facilities. Each table includes one or more columns. The database includes a value-count index for each join column. Each value-count index includes one or more indexed values and one or more counts. The process includes defining two or more partitions for each table with a join column using the one or more value-count indexes. The process includes performing the database query on one partition from each table.
In general, in another aspect, the invention features a database system including a massively parallel processing system. The massively parallel processing system includes one or more nodes, a plurality of CPUs, a plurality of data storage facilities, and a process for execution on the massively parallel processing system for performing a database query to join two or more tables on an inequality condition. Each of the nodes provides access to one or more CPUs. Each of the one or more CPUs provides access to one or more data storage facilities. Each table includes one or more columns. The database includes a value-count index for each join column. Each value-count index includes one or more indexed values and one or more counts. The process includes defining two or more partitions for each table with a join column using the one or more value-count indexes. The process includes performing the database query on one partition from each table.
The techniques for performing joins disclosed herein have particular application, but are not limited, to large databases that might contain many millions or billions of records managed by a database system (“DBS”) 100, such as a Teradata Active Data Warehousing System available from NCR Corporation.
For the case in which one or more virtual processors are running on a single physical processor, the single physical processor swaps between the set of N virtual processors.
For the case in which N virtual processors are running on an M-processor node, the node's operating system schedules the N virtual processors to run on its set of M physical processors. If there are 4 virtual processors and 4 physical processors, then typically each virtual processor would run on its own physical processor. If there are 8 virtual processors and 4 physical processors, the operating system would schedule the 8 virtual processors against the 4 physical processors, in which case swapping of the virtual processors would occur.
Each of the processing modules 1101 . . . N manages a portion of a database that is stored in a corresponding one of the data-storage facilities 1201 . . . N. Each of the data-storage facilities 1201 . . . N includes one or more disk drives. The DBS may include multiple nodes 1052 . . . O in addition to the illustrated node 1051, connected by extending the network 115.
The system stores data in one or more tables in the data-storage facilities 1201 . . . N. The rows 1251 . . . Z of the tables are stored across multiple data-storage facilities 1201 . . . N to ensure that the system workload is distributed evenly across the processing modules 1101 . . . N. A parsing engine 130 organizes the storage of data and the distribution of table rows 1251 . . . Z among the processing modules 1101 . . . N. The parsing engine 130 also coordinates the retrieval of data from the data-storage facilities 1201 . . . N in response to queries received from a user at a mainframe 135 or a client computer 140. The DBS 100 usually receives queries and commands to build tables in a standard format, such as SQL.
In one implementation, the rows 1251 . . . Z are distributed across the data-storage facilities 1201 . . . N by the parsing engine 130 in accordance with their primary index. The primary index defines the columns of the rows that are used for calculating a hash value. The function that produces the hash value from the values in the columns specified by the primary index is called the hash function. Some portion, possibly the entirety, of the hash value is designated a “hash bucket.” The hash buckets are assigned to data-storage facilities 1201 . . . N and associated processing modules 1101 . . . N by a hash bucket map. The characteristics of the columns chosen for the primary index determine how evenly the rows are distributed.
In one example system, the parsing engine 130 is made up of three components: a session control 200, a parser 205, and a dispatcher 210, as shown in
Once the session control 200 allows a session to begin, a user may submit a SQL query, which is routed to the parser 205. As illustrated in
An example system for processing a SQL query including one or more joins is shown in
SELECT T1.*, T2.*, T3.* FROM T1, T2, T3 WHERE T1.A condition T2.B AND T1.A condition T3.C;
where T1, T2, and T3 are tables, T1.A is a column in T1, T2.B is a column in T2, and T3.C is a column in T3, and “condition” is a comparison between two operators. In one example implementation, the system joins T1 and T2 on the condition “T1.A condition T2.B”. The system also joins T1 and T3 on the condition “T1.A condition T3.C”. Finally, the system joins the two previous results. In this example implementation, the query is decomposed into three join operations and the system loops (blocks 405 and 410) three times.
Another example system for performing a SQL query including one or more joins performs the example SQL query above using two joins. The system performs a join between T1 and T2 on the condition “T1.A condition T2.B” and stores the result. In one example implementation, the result is stored in a work table or a spool table S1. The system then performs a join between T3 and S1 on the condition “S1.A condition T3.C”.
Within the loop defined by blocks 405 and 410 the system processes the join (block 415). One example system for processing the join is shown in
SELECT T1.*, T2.* FROM T1, T2 WHERE T1.A equality_condition T2.B;
where equality_condition is “=” or any other equality condition. If the join is not an equijoin, the system evaluates other methods to process the join (block 510). Otherwise, the system determines if the DBS 100 includes a Value Count Index (VCI) for each of the join columns (e.g., T1.A and T2.B). A VCI is a mechanism or data structure for tracking which values appear in a column and how many times the values appear in the column. One example VCI is an aggregate join index that is automatically updated each time a value the in indexed column is added, deleted, or altered. The VCI typically contains two columns: a value column and a count column. The value column represents the value in the indexed column and the count column represents the number of occurrences of the value in the column.
Returning to
An example system for producing a qualifying set Q3 (block 520) is shown in
SELECT VCI1a.Value FROM VCI1a, VCI2b WHERE VC1a.Value equality_condition VCI2b.Value;
where VCI1a is the VCI for column T1.A, VCI2b is the VCI for column T2.B, and where each VCI includes a Value column representing the value of the indexed column and a Count column representing the count of the indexed value in the column. The selected value, VCI1a. Value is placed in Q3 to produce the qualifying set. After joining the VCIs for the join columns on an equality condition, the system returns Q3 (block 610).
Returning to
An example system for producing the worktable SL (block 715) is shown in
Returning to
SELECT S4.*, S5.* from S4, S5 WHERE S4.A equality_condition S5.B;
where S4 is the table T1 qualified by Q3, S5 is the table T2 qualified by Q3, and S4.A and S5.B are columns in S4 and S5, respectively.
Another example system for processing the join (block 415,
SELECT S6.*, T2.* from S6, T2 WHERE S6.A equality_condition T2.B;
In certain implementations of system, the join performed in block 1015 may be omitted if the result of the query only needs to contain the column in T2 that is indexed by VCI2. In such a situation, the Count values of VCI2 may be used to satisfy the SQL query.
SELECT T1.*, T2.* FROM T1, T2 WHERE T1.A inequality_condition T2.B;
where inequality_condition is a comparison that evaluates if the operands to the condition are unequal. An example inequality_condition is “< >.”
The system determines if there is a VCI on all join columns (block 515, as described with respect to
An example system for producing the worktables W, X, Y, and Z (block 1105) is shown in
Returning to
Returning to
SELECT X.*, Z.* FROM X, Z WHERE X.A inequality_condition Z.B;
where inequality_condition is the join condition, X and Z are worktables, and X.A and Z.B are columns in the worktables. The system merges the worktables to produce the result (block 1120). An example system of merging the worktables is shown in
Although the system for processing joins has been described with respect to certain example SQL queries, the system may be used with many other SQL queries. For example, the system may be used with SQL queries including OUTER JOIN or DISTINCT terms.
The foregoing description of the preferred embodiment of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto.
Number | Name | Date | Kind |
---|---|---|---|
4956774 | Shibamiya et al. | Sep 1990 | A |
5241648 | Cheng et al. | Aug 1993 | A |
5557791 | Cheng et al. | Sep 1996 | A |
5822750 | Jou et al. | Oct 1998 | A |
5903893 | Kleewein et al. | May 1999 | A |
5918225 | White et al. | Jun 1999 | A |
5930785 | Lohman et al. | Jul 1999 | A |
5983215 | Ross et al. | Nov 1999 | A |
6052693 | Smith et al. | Apr 2000 | A |
6167399 | Hoang | Dec 2000 | A |
6226639 | Lindsay et al. | May 2001 | B1 |
6505189 | On Au et al. | Jan 2003 | B1 |
6944633 | Higa et al. | Sep 2005 | B1 |
20010014888 | Tsuchida et al. | Aug 2001 | A1 |
20030074348 | Sinclair et al. | Apr 2003 | A1 |
20030217033 | Sandler et al. | Nov 2003 | A1 |
20040172400 | Zarom et al. | Sep 2004 | A1 |
20040249845 | Das | Dec 2004 | A1 |
20040260684 | Agrawal et al. | Dec 2004 | A1 |
20050149491 | Bakalash et al. | Jul 2005 | A1 |
20050160102 | Abdo et al. | Jul 2005 | A1 |
20050240615 | Barsness et al. | Oct 2005 | A1 |
20060047638 | Dettinger et al. | Mar 2006 | A1 |