The present application is related to U.S. patent application Ser. No. 10/874,400, entitled “Multi-Tier Query Processing”, filed by Rafi Ahmed on Jun. 22, 2004, the content of which is incorporated herein by reference.
The present application is also related to U.S. patent application Ser. No. 11/237,039, entitled “PARALLEL QUERY PROCESSING TECHNIQUES FOR MINUS AND INTERSECT OPERATORS”, filed by Bhaskar Ghosh, Rafi Ahmed, Hermann Baer on Sep. 27, 2005, the content of which is incorporated herein by reference.
The present invention relates to databases and, more specifically, to query processing techniques for queries that contain MINUS and INTERSECT operators.
A join is a query that combines rows from two or more sources, such as tables, views, or snapshots. In the context of database systems, a join is performed whenever multiple tables appear in a query's FROM clause. The query's select list can select any columns from any of the base tables listed in the FROM clause.
An equijoin is a join with a join condition containing an equality operator. An equijoin combines rows that have equivalent values for the specified columns. Query 1 is an equijoin that combines the rows of tables R and S where the value in column r.b is the same as the value in column s.b:
For the purpose of illustration, assume that tables R and S contain the following rows:
Under these circumstances, Query 1 would produce the result set:
In this example, two rows ((W, A) and (X, A)) in table R combine with row (1, A) in table S. Therefore, row (1, A) appears twice in the result set. Similarly, row (Z, C) in table R combines with two rows ((2, C) and (3, C)) in table S. Therefore, row (Z, C) appears twice in the result set. Row (Y, B) of table R does not combine with any row in table S, so row (Y, B) is not reflected in the result set of the equijoin. Similarly, row (4, D) of table S does not combine with any row in table R, so row (4, D) is not reflected in the result set of the equijoin.
An equijoin is an example of a binary operation that produces a result multi-set (a multi-set is a collection of items that allows non-distinct items) based on the contents of two multi-set sources. Other binary SQL operations that produce a result set based on two multi-set sources are minus and intersect. Each of these operations shall be described in greater detail hereafter.
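The equijoin described above can be reproduced with a minimal sketch. The sketch assumes Python's built-in sqlite3 module as a stand-in database server, and assumes that the first column of each table is named "a" (only column "b" is named in the text). The row values are taken from the description above.

```python
import sqlite3

# In-memory stand-in for the database server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (a TEXT, b TEXT);      -- column name "a" is assumed
    CREATE TABLE S (a INTEGER, b TEXT);   -- column name "a" is assumed
    INSERT INTO R VALUES ('W','A'), ('X','A'), ('Y','B'), ('Z','C');
    INSERT INTO S VALUES (1,'A'), (2,'C'), (3,'C'), (4,'D');
""")

# Equijoin on R.b = S.b; ORDER BY only makes the output deterministic.
equijoin_rows = conn.execute(
    "SELECT R.a, R.b, S.a, S.b FROM R, S WHERE R.b = S.b ORDER BY R.a, S.a"
).fetchall()

for row in equijoin_rows:
    print(row)
```

Rows (Y, B) and (4, D) do not appear in the output, while (1, A) and (Z, C) each contribute to two joined rows.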
A minus operation returns all of the distinct elements of one multi-set (the “left-hand source”) that do not match any values in another multi-set (the “right-hand source”). Thus, column R.b minus column S.b would produce the result set (B), because B is the only value in R.b that does not match with any value in S.b. Significantly, even if R.b has the value “B” in several rows, the result set of the minus operation would only include a single “B”, because minus operations only return distinct values (i.e. no duplicates).
An intersect operation returns all of the distinct elements in one multi-set (the “left-hand source”) that are also contained in another multi-set (the “right-hand source”). Thus, column R.b intersect column S.b would produce the result set (A, C), because A and C are the values in R.b that match values in S.b. Significantly, even though C matches two rows in S.b, only one C is included in the result set because intersect operations only return distinct values.
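The two set operations may be illustrated with the same sample rows. This is a sketch only: it assumes Python's sqlite3 module, in which the MINUS operator is spelled EXCEPT, and it assumes the column name "a" for the first column of each table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (a TEXT, b TEXT);
    CREATE TABLE S (a INTEGER, b TEXT);
    INSERT INTO R VALUES ('W','A'), ('X','A'), ('Y','B'), ('Z','C');
    INSERT INTO S VALUES (1,'A'), (2,'C'), (3,'C'), (4,'D');
""")

# SQLite uses EXCEPT where other servers use MINUS.
minus_rows = conn.execute(
    "SELECT b FROM R EXCEPT SELECT b FROM S ORDER BY b"
).fetchall()

intersect_rows = conn.execute(
    "SELECT b FROM R INTERSECT SELECT b FROM S ORDER BY b"
).fetchall()

print(minus_rows)      # distinct values of R.b with no match in S.b
print(intersect_rows)  # distinct values of R.b that also occur in S.b
```

Although "A" occurs twice in R.b and "C" occurs twice in S.b, each value appears at most once in either result.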
MINUS and INTERSECT are commonly used set operators in the Structured Query Language (“SQL”) that is supported by most database servers. MINUS and INTERSECT operators have been adopted in ANSI SQL for the last ten years, and every major database vendor offers support of the MINUS and INTERSECT operators in some form or another. A common strategy for performing the INTERSECT/MINUS operations involves, for example, performing a sort-merge join and a sort-merge anti-join, respectively.
In data-warehouses with reporting applications, set operators are usually evaluated on very large sets of data, so it is critical to make the set operations, such as MINUS and INTERSECT, scale in any SQL execution engine. Based on the foregoing, it is desirable to provide techniques that handle MINUS/INTERSECT operations more efficiently.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to avoid unnecessarily obscuring the present invention.
Various techniques are described hereafter for processing database commands that include MINUS and/or INTERSECT operators. According to one approach, multi-tiered transformation is performed on queries that contain MINUS and/or INTERSECT operators to create a plurality of transformed queries. Each of the transformed queries produces the same result as the original query, but does not include the MINUS and/or INTERSECT operator. Queries that produce the same result as a particular query are referred to herein as “equivalent queries” or “semantically equivalent queries” of that particular query.
To achieve the same result set as the original query, the transformed queries employ equijoins, antijoins, and/or semijoins, and duplicate elimination operations. Costs are estimated for each of the various transformed queries. Based on the cost estimates, one of the transformed queries is selected as the query that is to be executed to perform the operations specified in the original query.
According to one embodiment, the database server constructs equivalent queries by transforming the set operations INTERSECT and MINUS contained in the original query into join/semi-join and anti-join, respectively. When the database server converts INTERSECT/MINUS into [semi]/anti-join, join methods such as hash, sort-merge, and nested loops may be employed to perform the join. In addition, new access paths and alternate join orders become possible. Converting the set operators into joins also provides scalability and parallelization.
Equijoin operations have been described above, with reference to tables R and S. The following sections illustrate anti-join and semi-join operations with reference to the same tables R and S.
An antijoin operation returns all of the rows in a left-hand source that do not combine with any rows in a right-hand source. Thus, if tables R and S are combined in an anti-join where the join condition is R.b=S.b, then the result set would be:
Anti-join does not remove duplicates from the left-hand source. Thus, if two rows of R contain (Y, B), then the result set would include both of the rows. Antijoins are described in detail in U.S. Pat. No. 6,449,606.
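A minimal sketch of the anti-join follows. It assumes Python's sqlite3 module and expresses the anti-join with a NOT EXISTS subquery, which is one common way to write it in SQL and is not necessarily how any particular database server represents it internally. The column name "a" is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (a TEXT, b TEXT);
    CREATE TABLE S (a INTEGER, b TEXT);
    INSERT INTO R VALUES ('W','A'), ('X','A'), ('Y','B'), ('Z','C');
    INSERT INTO S VALUES (1,'A'), (2,'C'), (3,'C'), (4,'D');
""")

# Anti-join: keep rows of R that combine with no row of S on R.b = S.b.
anti_join_sql = """
    SELECT R.a, R.b FROM R
    WHERE NOT EXISTS (SELECT 1 FROM S WHERE S.b = R.b)
    ORDER BY R.a
"""
anti_rows = conn.execute(anti_join_sql).fetchall()

# A second (Y, B) row in the left-hand source is not removed by the anti-join.
conn.execute("INSERT INTO R VALUES ('Y','B')")
anti_rows_with_duplicate = conn.execute(anti_join_sql).fetchall()

print(anti_rows)
print(anti_rows_with_duplicate)
```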
A semi-join operation returns all of the rows in a left-hand source that satisfy the join condition relative to any rows in a right-hand source. Thus, if tables R and S are combined in a semi-join where the join condition is R.b=S.b, then the result set would be:
In the result set, the row (Z, C) is only included once, even though (Z, C) satisfies the join condition relative to multiple rows ((2, C) and (3, C)) in the right-hand table. Thus, duplicates in the right-hand source do not affect the result of a semi-join. However, semi-join does not remove duplicates from the left-hand source. Thus, if two rows of R contained (W, A), then the result set would include both of the rows. Semi-joins are discussed in detail in U.S. Pat. No. 6,449,609.
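The semi-join behavior described above may be sketched as follows. The sketch assumes Python's sqlite3 module and writes the semi-join as an EXISTS subquery, which is one possible SQL formulation. The column name "a" is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (a TEXT, b TEXT);
    CREATE TABLE S (a INTEGER, b TEXT);
    INSERT INTO R VALUES ('W','A'), ('X','A'), ('Y','B'), ('Z','C');
    INSERT INTO S VALUES (1,'A'), (2,'C'), (3,'C'), (4,'D');
""")

# Semi-join: keep each row of R that matches at least one row of S.
semi_join_sql = """
    SELECT R.a, R.b FROM R
    WHERE EXISTS (SELECT 1 FROM S WHERE S.b = R.b)
    ORDER BY R.a
"""
semi_rows = conn.execute(semi_join_sql).fetchall()

# Duplicates in the left-hand source survive the semi-join.
conn.execute("INSERT INTO R VALUES ('W','A')")
semi_rows_with_duplicate = conn.execute(semi_join_sql).fetchall()

print(semi_rows)
print(semi_rows_with_duplicate)
```

Row (Z, C) appears once in the first result even though it matches two rows of S, while the duplicated (W, A) row appears twice in the second result.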
For the purpose of explanation, textual examples of various transformed queries are provided. However, query transformations are not implemented at the textual level. For example, the text of a query may be used to generate an internal query structure, which serves as the internal representation of the query. Query transformations may be performed by transforming the internal query structure. Thus, while the textual representation of a transformed query may indicate the creation of a view, the actual execution of the transformed query need not involve the creation of an in-line view. Performing query transformations using internal representations of queries is described in detail in U.S. Pat. No. 5,963,932.
As mentioned above, database queries that contain the INTERSECT operator are transformed to produce several transformed queries, each of which produces the same result as the original query. For the purpose of explanation, it shall be assumed that a database server has received the following query (Q1):
In query Q1, T1.X is the left-hand source for the INTERSECT operation and T2.Y is the right-hand source for the intersect operation. In the following sections, examples are given of how query Q1 may be transformed to produce equivalent queries. The transformations affect how the database server will produce the results of the command, but do not actually change the results thus produced.
Because MINUS and INTERSECT produce sets of unique values, the transformed queries typically involve duplicate elimination (represented by the DISTINCT operator). However, the transformed queries differ relative to where, within the transformed query or the query plan, the duplicate elimination is performed.
The optimal point at which to perform duplicate elimination may vary based on a variety of factors. For example, if duplicate elimination is performed on the result, then there would be no need to sort the input operands to the MINUS/INTERSECT operator in the query. The output of INTERSECT/MINUS will always be smaller than the sum of the inputs and will generally be smaller than either input. Therefore, doing delayed evaluation of the distinct operator (i.e., duplicate elimination) should generally be cheaper than doing duplicate elimination on the inputs. On the other hand, by performing early duplicate elimination, a significant reduction may result in the number of rows on which multiple distinct operators apply, as well as in the number of rows later used in the join. Therefore, placement of the distinct operator varies among the various transformed queries. After the various transformed queries have been generated, the database server selects the transformation that has the lowest estimated cost.
According to one embodiment, the database server transforms query Q1 to produce the following transformed query (Q1A):
Query Q1A achieves the results of Q1 using an equijoin. In Q1A, duplicate elimination is deferred by applying the DISTINCT operator to the output of the equijoin. Specifically, because the DISTINCT operator is applied to the values produced by the equijoin, the equijoin is performed prior to execution of the distinct operation.
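The following sketch shows what a transformation of this kind could look like. The query strings are reconstructed from the prose and are assumptions, not the original query text. The sketch assumes Python's sqlite3 module, single-column tables T1(x) and T2(y), illustrative row values, and non-null columns (INTERSECT and an equality join condition treat NULL values differently).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (x INTEGER NOT NULL);
    CREATE TABLE T2 (y INTEGER NOT NULL);
    INSERT INTO T1 VALUES (1), (1), (2), (3), (3), (4);
    INSERT INTO T2 VALUES (1), (3), (3), (5);
""")

# Assumed form of the original query (an INTERSECT of T1.x and T2.y).
q1 = "SELECT x FROM T1 INTERSECT SELECT y FROM T2 ORDER BY 1"

# Assumed form of the transformed query: equijoin first, DISTINCT on its output.
q1a = "SELECT DISTINCT T1.x FROM T1, T2 WHERE T1.x = T2.y ORDER BY T1.x"

q1_rows = conn.execute(q1).fetchall()
q1a_rows = conn.execute(q1a).fetchall()
print(q1_rows, q1a_rows)
```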
The operations that a database server would perform when executing Q1A may be further illustrated using the rowsource or iterator model of SQL execution and the Parallel Execution model, which is described in detail in U.S. Pat. No. 5,956,704. Specifically, if the database server chooses to perform the equijoin using a hash join, the physical plan corresponding to Q1A could look like the following:
If this query plan is executed in parallel using a HASH-HASH distribution for the hash join, then the following parallelism is achieved during execution. One set of slave processes performs a full scan of table T1, while another set of slave processes performs a full table scan of T2. Both sets of table scan slaves redistribute their rows to a set of hash join slaves using hash-based distribution. Each hash-join slave performs the join based on the join condition. Moreover, since the join column T1.x is the same as the distinct column and the input redistribution into the join-slaves is based on HASH(T1.x) from the left side, all rows with the same value of T1.x will be sent to the same join slave, implying that the distinct computation can be performed on the same join slave without need for another redistribution. Duplicate elimination is then performed on the result of the join.
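The reason no second redistribution is needed can be seen in a small single-process simulation. This is a sketch under stated assumptions: the function names hash_distribute and parallel_join_distinct are hypothetical, each "slave" is simply a list index, and Python's built-in hash stands in for the server's distribution function.

```python
from collections import defaultdict

def hash_distribute(values, n_slaves):
    """Route each value to a join slave chosen by hashing the value."""
    parts = defaultdict(list)
    for v in values:
        parts[hash(v) % n_slaves].append(v)
    return parts

def parallel_join_distinct(t1_x, t2_y, n_slaves=3):
    """Simulate a HASH-HASH distributed hash join with a local distinct."""
    left = hash_distribute(t1_x, n_slaves)
    right = hash_distribute(t2_y, n_slaves)
    per_slave = []
    for slave in range(n_slaves):
        # Build a hash table on this slave's share of T2.y.
        build = defaultdict(int)
        for y in right[slave]:
            build[y] += 1
        # Probe with this slave's share of T1.x (equijoin output, with duplicates).
        joined = [x for x in left[slave] for _ in range(build.get(x, 0))]
        # Equal values always land on the same slave, so distinct is purely local.
        per_slave.append(set(joined))
    return per_slave

per_slave = parallel_join_distinct([1, 1, 2, 3, 3, 4], [1, 3, 3, 5])
result = sorted(set().union(*per_slave))
print(per_slave, result)
```

Because every occurrence of a given value is routed to the same slave, the per-slave result sets are disjoint and their union is already duplicate-free.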
According to one embodiment, the database server transforms query Q1 to produce the following transformed query (Q1B):
Query Q1B achieves the results of Q1 using a semi-join between a view (V1) and the right-hand source (T2) of the INTERSECT operation of the original query Q1. The view represents the results of performing duplicate elimination (using the DISTINCT operator) on T1.X, the left-hand source of the INTERSECT operation specified in the original query Q1. The clause “V1.X S=T2.Y” is used to indicate a semi-join operation, in which V1.X is the left-hand source and T2.Y is the right-hand source.
If the database server chooses a hash join to perform the semi-join, then the physical plan corresponding to Q1B could look like the following:
Unlike Q1A, Q1B specifies that the DISTINCT operation be performed before T1 is joined with T2. In situations in which the left-hand side of the INTERSECT operation includes many duplicate join values, performing the DISTINCT operation on the left-hand values (T1.X) before performing the join may significantly reduce the number of left-hand rows that have to be processed during the join operation. The combination of performing a distinct operation on the left-hand source and using a semi-join ensures that all values in the result set of the join are distinct. Consequently, no separate distinct operation need be performed on the output of the semi-join operation.
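A sketch of a Q1B-style query follows. The query text is an assumption reconstructed from the prose: the “S=” semi-join notation is written here as an EXISTS subquery, the in-line view V1 is written as a derived table, and Python's sqlite3 module with illustrative non-null data stands in for the database server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (x INTEGER NOT NULL);
    CREATE TABLE T2 (y INTEGER NOT NULL);
    INSERT INTO T1 VALUES (1), (1), (2), (3), (3), (4);
    INSERT INTO T2 VALUES (1), (3), (3), (5);
""")

# DISTINCT is applied to the left-hand source first (view V1),
# then V1 is semi-joined with T2; no DISTINCT on the output.
q1b = """
    SELECT V1.x
    FROM (SELECT DISTINCT x FROM T1) V1
    WHERE EXISTS (SELECT 1 FROM T2 WHERE T2.y = V1.x)
    ORDER BY V1.x
"""
q1b_rows = conn.execute(q1b).fetchall()

intersect_rows = conn.execute(
    "SELECT x FROM T1 INTERSECT SELECT y FROM T2 ORDER BY 1"
).fetchall()
print(q1b_rows, intersect_rows)
```

Value 3 appears twice in T2.y, yet it appears once in the output because a semi-join returns each left-hand row at most once.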
According to one embodiment, the database server transforms query Q1 to produce the following transformed query (Q1C):
Query Q1C achieves the result of Q1 using an equijoin between the left-hand source (T1.X) and a view (V2). The view represents the result of performing duplicate elimination (using the DISTINCT operator) on T2.Y, the right-hand source of the INTERSECT operation specified in the original query Q1.
If the database server chooses to perform the equijoin using a hash join, the physical plan corresponding to Q1C could look like the following:
In this example, duplicate elimination is performed on the right-hand source prior to the join, and on the results of the join. Performing duplicate elimination on the results of the join is necessary because the combination of duplicate elimination on the right-hand source, and an equijoin, does not guarantee that the result set produced by the join will be unique.
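The need for the second duplicate elimination can be checked with a small sketch. The query text is an assumption reconstructed from the prose, with view V2 written as a derived table, and Python's sqlite3 module with illustrative non-null data standing in for the database server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (x INTEGER NOT NULL);
    CREATE TABLE T2 (y INTEGER NOT NULL);
    INSERT INTO T1 VALUES (1), (1), (2), (3), (3), (4);
    INSERT INTO T2 VALUES (1), (3), (3), (5);
""")

join_with_v2 = """
    SELECT {distinct} T1.x
    FROM T1, (SELECT DISTINCT y FROM T2) V2
    WHERE T1.x = V2.y
    ORDER BY T1.x
"""

# Without DISTINCT on the join output, duplicates from T1 pass through.
without_outer_distinct = conn.execute(
    join_with_v2.format(distinct="")).fetchall()

# Q1C-style query: DISTINCT on the right-hand source and on the join output.
q1c_rows = conn.execute(
    join_with_v2.format(distinct="DISTINCT")).fetchall()

print(without_outer_distinct)
print(q1c_rows)
```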
According to one embodiment, the database server transforms query Q1 to produce the following transformed query (Q1D):
If the database server chooses to perform the equijoin using a hash join, then the physical plan corresponding to Q1D could look like the following:
In this transformed query, duplicate elimination is performed on both inputs to the equijoin operation. Performing duplicate elimination on both inputs guarantees that the result set produced by the equijoin will not have any duplicates. Consequently, no duplicate elimination is performed on the result set produced by the equijoin.
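A Q1D-style query may be sketched as follows. The query text is an assumption reconstructed from the prose, with views V1 and V2 written as derived tables, and Python's sqlite3 module with illustrative non-null data standing in for the database server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (x INTEGER NOT NULL);
    CREATE TABLE T2 (y INTEGER NOT NULL);
    INSERT INTO T1 VALUES (1), (1), (2), (3), (3), (4);
    INSERT INTO T2 VALUES (1), (3), (3), (5);
""")

# DISTINCT on both inputs; the equijoin output needs no further DISTINCT,
# because each value occurs at most once on each side of the join.
q1d = """
    SELECT V1.x
    FROM (SELECT DISTINCT x FROM T1) V1,
         (SELECT DISTINCT y FROM T2) V2
    WHERE V1.x = V2.y
    ORDER BY V1.x
"""
q1d_rows = conn.execute(q1d).fetchall()
print(q1d_rows)
```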
As mentioned above, once the various transformed queries are generated, cost estimates are generated for each of the queries. The cost of a given query is based on a variety of factors, including how many duplicates are in each of the input sources. Once the cost estimates have been generated, the optimizer within the database server selects the transformed query with the lowest estimated cost. According to one embodiment, the original query Q1 is not considered as a possible candidate for selection, since one of the transformed queries is guaranteed to be equivalent to, if not better than, the original. The execution plan of the transformed query thus selected is executed in response to requests to execute the original query.
As mentioned above, database queries that contain the MINUS operator are transformed to produce several transformed queries, each of which produces the same result as the original query. For the purpose of explanation, it shall be assumed that a database server has received the following query (Q2):
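The selection step can be illustrated with a toy model. Everything in this sketch is hypothetical: the function estimate_costs, its parameters, and its cost formulas are invented for illustration and do not describe any actual optimizer. The model simply counts rows flowing through each operator, assuming that duplicates are spread uniformly over the distinct values.

```python
def estimate_costs(n1, d1, n2, d2, k):
    """Toy row-count cost model (hypothetical, for illustration only).

    n1, n2: row counts of the left and right sources
    d1, d2: numbers of distinct join values in each source
    k:      number of distinct values common to both sources
    """
    dup1, dup2 = n1 // d1, n2 // d2   # average duplicates per value

    def join(left_rows, right_rows, out_rows):
        # Assumed join cost: read both inputs and emit the output.
        return left_rows + right_rows + out_rows

    return {
        # Join first, then DISTINCT over the join output.
        "Q1A": join(n1, n2, k * dup1 * dup2) + k * dup1 * dup2,
        # DISTINCT on the left source, then semi-join.
        "Q1B": n1 + join(d1, n2, k),
        # DISTINCT on the right source, join, DISTINCT on the output.
        "Q1C": n2 + join(n1, d2, k * dup1) + k * dup1,
        # DISTINCT on both sources, then join.
        "Q1D": n1 + n2 + join(d1, d2, k),
    }

costs = estimate_costs(n1=1_000_000, d1=1_000, n2=100, d2=100, k=50)
best = min(costs, key=costs.get)
print(costs)
print("selected:", best)
```

With a heavily duplicated left-hand source and a small duplicate-free right-hand source, this toy model favors the variant that eliminates left-hand duplicates before the join. Different input statistics would favor a different variant, which is why several candidates are costed.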
In the following sections, examples are given of how query Q2 may be transformed to produce equivalent queries. Similar to the INTERSECT transformations described above, the MINUS transformations differ relative to where duplicate elimination is performed. However, the MINUS transformations differ from the INTERSECT transformations in that the MINUS transformations involve anti-join operations. The transformations affect how the database server will produce the results of the command, but do not actually change the results thus produced.
According to one embodiment, the database server transforms query Q2 to produce the following equivalent query (Q2A):
Query Q2A achieves the results of Q2 using an anti-join between a view (V2) and the left-hand source (T1) of the MINUS of the original query Q2. The clause “T1.x A=V2.y” is used to indicate the anti-join operation, in which T1.x is the left hand source and V2.y is the right hand source. In query Q2A, duplicate elimination is deferred, and applied only to the results of the anti-join operation.
According to one embodiment, the database server transforms query Q2 to produce the following equivalent query (Q2B):
Query Q2B achieves the results of Q2 using an anti-join between two views (V1 and V2). The clause “V1.x A=V2.y” is used to indicate the anti-join operation, in which V1.x is the left hand source and V2.y is the right hand source. Duplicate elimination is performed on T1.x in V1. Duplicate elimination is not performed on T2.y. However, because duplicate elimination is performed on the left-hand source of the anti-join, the results of the anti-join are guaranteed to contain no duplicates.
According to one embodiment, the database server transforms query Q2 to produce the following equivalent query (Q2C):
Query Q2C achieves the results of Q2 using an anti-join between the left-hand source of the MINUS operation (T1.x) and a view (V2). The clause “T1.x A=V2.y” is used to indicate the anti-join operation, in which T1.x is the left hand source and V2.y is the right hand source. Duplicate elimination is performed on T2.y in V2. Duplicate elimination is also performed on the result set of the anti-join operation.
According to one embodiment, the database server transforms query Q2 to produce the following equivalent query (Q2D):
Query Q2D achieves the results of Q2 using an anti-join between the two views (V1) and (V2). The clause “V1.x A=V2.y” is used to indicate the anti-join operation, in which V1.x is the left hand source and V2.y is the right hand source. Duplicate elimination is performed on T1.x in V1. Similarly, duplicate elimination is performed on T2.y in V2. Duplicate elimination need not be performed on the result set of the anti-join, since under these circumstances the result-set of the anti-join is guaranteed to be unique.
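The four MINUS transformations may be compared side by side in a sketch. The query strings are assumptions reconstructed from the prose: the “A=” anti-join notation is written as a NOT EXISTS subquery, the views are written as derived tables, and V2 in the Q2A-style and Q2B-style queries is assumed to be a plain (non-distinct) view of T2. The sketch uses Python's sqlite3 module, where MINUS is spelled EXCEPT, with illustrative non-null data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (x INTEGER NOT NULL);
    CREATE TABLE T2 (y INTEGER NOT NULL);
    INSERT INTO T1 VALUES (1), (1), (2), (2), (3), (3), (4);
    INSERT INTO T2 VALUES (1), (3), (3), (5);
""")

def anti_join(outer_distinct, left_distinct, right_distinct):
    """Build one anti-join variant; the flags place the DISTINCT operators."""
    left = "(SELECT DISTINCT x FROM T1)" if left_distinct else "T1"
    right = "(SELECT DISTINCT y FROM T2)" if right_distinct else "(SELECT y FROM T2)"
    head = "SELECT DISTINCT" if outer_distinct else "SELECT"
    return (f"{head} L.x FROM {left} L "
            f"WHERE NOT EXISTS (SELECT 1 FROM {right} V2 WHERE V2.y = L.x) "
            f"ORDER BY L.x")

variants = {
    "Q2A": anti_join(outer_distinct=True,  left_distinct=False, right_distinct=False),
    "Q2B": anti_join(outer_distinct=False, left_distinct=True,  right_distinct=False),
    "Q2C": anti_join(outer_distinct=True,  left_distinct=False, right_distinct=True),
    "Q2D": anti_join(outer_distinct=False, left_distinct=True,  right_distinct=True),
}

# Assumed form of the original query; SQLite spells MINUS as EXCEPT.
q2_rows = conn.execute(
    "SELECT x FROM T1 EXCEPT SELECT y FROM T2 ORDER BY 1").fetchall()
variant_rows = {name: conn.execute(sql).fetchall()
                for name, sql in variants.items()}

print(q2_rows)
for name, rows in variant_rows.items():
    print(name, rows)
```

In each variant, duplicate elimination is applied either to the left-hand source or to the anti-join output, which is what keeps the duplicated value 2 from appearing twice.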
Queries that have more than one type of set operator within the same query block are said to have non-uniform set operators. An example of a query with a non-uniform set operation is illustrated below (Q3):
Using the transformation techniques described above, query Q3 may be transformed in a variety of ways. Specifically, the portion of the query block that contains the INTERSECT operator may be transformed according to any of the INTERSECT transformations described above. Similarly, the portion of the query block that contains the MINUS operator may be transformed according to any of the MINUS transformations described above. One example of how query Q3 may be transformed is illustrated below (Q3A):
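The text of queries Q3 and Q3A is not reproduced here, so the following is a hypothetical example of a query with non-uniform set operators and one possible transformation of it. The table T3, its column z, and the row values are assumptions. The sketch uses Python's sqlite3 module, where MINUS is spelled EXCEPT, and non-null columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (x INTEGER NOT NULL);
    CREATE TABLE T2 (y INTEGER NOT NULL);
    CREATE TABLE T3 (z INTEGER NOT NULL);   -- hypothetical third source
    INSERT INTO T1 VALUES (1), (1), (2), (3), (3), (4);
    INSERT INTO T2 VALUES (1), (2), (3), (3), (5);
    INSERT INTO T3 VALUES (3);
""")

# Hypothetical non-uniform query: (T1 INTERSECT T2) MINUS T3.
non_uniform = """
    SELECT x FROM T1
    INTERSECT SELECT y FROM T2
    EXCEPT SELECT z FROM T3
    ORDER BY 1
"""

# One possible transformation: the INTERSECT becomes an equijoin, the MINUS
# becomes an anti-join (NOT EXISTS), and DISTINCT is applied to the output.
transformed = """
    SELECT DISTINCT T1.x
    FROM T1, T2
    WHERE T1.x = T2.y
      AND NOT EXISTS (SELECT 1 FROM T3 WHERE T3.z = T1.x)
    ORDER BY T1.x
"""

non_uniform_rows = conn.execute(non_uniform).fetchall()
transformed_rows = conn.execute(transformed).fetchall()
print(non_uniform_rows, transformed_rows)
```

Other combinations are equally possible, for example a semi-join over a duplicate-free view for the INTERSECT portion, since each portion may be transformed independently.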
Hardware Overview
Computer system 200 may be coupled via bus 202 to a display 212, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 214, including alphanumeric and other keys, is coupled to bus 202 for communicating information and command selections to processor 204. Another type of user input device is cursor control 216, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 204 and for controlling cursor movement on display 212. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
The invention is related to the use of computer system 200 for implementing the techniques described herein. According to one embodiment of the invention, those techniques are performed by computer system 200 in response to processor 204 executing one or more sequences of one or more instructions contained in main memory 206. Such instructions may be read into main memory 206 from another machine-readable medium, such as storage device 210. Execution of the sequences of instructions contained in main memory 206 causes processor 204 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.
The term “machine-readable medium” as used herein refers to any medium that participates in providing data that causes a machine to operate in a specific fashion. In an embodiment implemented using computer system 200, various machine-readable media are involved, for example, in providing instructions to processor 204 for execution. Such a medium may take many forms, including but not limited to storage media, which includes non-volatile and volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 210. Volatile media includes dynamic memory, such as main memory 206. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 202. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
Common forms of machine-readable media include, for example, storage media such as a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punchcards, papertape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, or transmission media such as a carrier wave as described hereinafter, or any other medium from which a computer can read.
Various forms of machine-readable media may be involved in carrying one or more sequences of one or more instructions to processor 204 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 200 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 202. Bus 202 carries the data to main memory 206, from which processor 204 retrieves and executes the instructions. The instructions received by main memory 206 may optionally be stored on storage device 210 either before or after execution by processor 204.
Computer system 200 also includes a communication interface 218 coupled to bus 202. Communication interface 218 provides a two-way data communication coupling to a network link 220 that is connected to a local network 222. For example, communication interface 218 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 218 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 218 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
Network link 220 typically provides data communication through one or more networks to other data devices. For example, network link 220 may provide a connection through local network 222 to a host computer 224 or to data equipment operated by an Internet Service Provider (ISP) 226. ISP 226 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 228. Local network 222 and Internet 228 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 220 and through communication interface 218, which carry the digital data to and from computer system 200, are exemplary forms of carrier waves transporting the information.
Computer system 200 can send messages and receive data, including program code, through the network(s), network link 220 and communication interface 218. In the Internet example, a server 230 might transmit a requested code for an application program through Internet 228, ISP 226, local network 222 and communication interface 218.
The received code may be executed by processor 204 as it is received, and/or stored in storage device 210, or other non-volatile storage for later execution. In this manner, computer system 200 may obtain application code in the form of a carrier wave.
In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Number | Name | Date | Kind |
---|---|---|---|
5412804 | Krishna | May 1995 | A |
5437032 | Wolf et al. | Jul 1995 | A |
5495605 | Cadot | Feb 1996 | A |
5548755 | Leung et al. | Aug 1996 | A |
5588150 | Lin et al. | Dec 1996 | A |
5590319 | Cohen et al. | Dec 1996 | A |
5590324 | Leung et al. | Dec 1996 | A |
5724570 | Zeller et al. | Mar 1998 | A |
5797136 | Boyer et al. | Aug 1998 | A |
5822748 | Cohen et al. | Oct 1998 | A |
5832477 | Bhargava et al. | Nov 1998 | A |
5905981 | Lawler | May 1999 | A |
5924088 | Jakobsson et al. | Jul 1999 | A |
5963932 | Jakobsson et al. | Oct 1999 | A |
5963959 | Sun et al. | Oct 1999 | A |
5974408 | Cohen et al. | Oct 1999 | A |
6026394 | Tsuchida et al. | Feb 2000 | A |
6032143 | Leung et al. | Feb 2000 | A |
6061676 | Srivastava et al. | May 2000 | A |
6289334 | Reiner et al. | Sep 2001 | B1 |
6298342 | Graefe et al. | Oct 2001 | B1 |
6339768 | Leung et al. | Jan 2002 | B1 |
6370524 | Witkowski | Apr 2002 | B1 |
6510422 | Galindo-Legaria et al. | Jan 2003 | B1 |
6529896 | Leung et al. | Mar 2003 | B1 |
6529901 | Chaudhuri et al. | Mar 2003 | B1 |
6535874 | Purcell | Mar 2003 | B2 |
6622138 | Bellamkonda et al. | Sep 2003 | B1 |
6694306 | Nishizawa et al. | Feb 2004 | B1 |
6792420 | Stephen Chen et al. | Sep 2004 | B2 |
6801905 | Andrei | Oct 2004 | B2 |
6934699 | Haas et al. | Aug 2005 | B1 |
7031956 | Lee et al. | Apr 2006 | B1 |
7089225 | Li et al. | Aug 2006 | B2 |
7111020 | Gupta et al. | Sep 2006 | B1 |
7146360 | Allen et al. | Dec 2006 | B2 |
7158994 | Smith et al. | Jan 2007 | B1 |
7167852 | Ahmed et al. | Jan 2007 | B1 |
7246108 | Ahmed | Jul 2007 | B2 |
7440935 | Day et al. | Oct 2008 | B2 |
20010047372 | Gorelik et al. | Nov 2001 | A1 |
20030055814 | Chen et al. | Mar 2003 | A1 |
20030120825 | Avvari et al. | Jun 2003 | A1 |
20040148278 | Milo et al. | Jul 2004 | A1 |
20040167904 | Wen et al. | Aug 2004 | A1 |
20040220911 | Zuzarte et al. | Nov 2004 | A1 |
20040220923 | Nica | Nov 2004 | A1 |
20040267760 | Brundage et al. | Dec 2004 | A1 |
20050033730 | Chaudhuri et al. | Feb 2005 | A1 |
20050055382 | Ferrat et al. | Mar 2005 | A1 |
20050076018 | Neidecker-Lutz | Apr 2005 | A1 |
20050149584 | Bourbonnais et al. | Jul 2005 | A1 |
20050187917 | Lawande et al. | Aug 2005 | A1 |
20050198013 | Cunningham et al. | Sep 2005 | A1 |
20050210010 | Larson et al. | Sep 2005 | A1 |
20050234965 | Rozenshtein et al. | Oct 2005 | A1 |
20050283471 | Ahmed | Dec 2005 | A1 |
20050289125 | Liu et al. | Dec 2005 | A1 |
20060167865 | Andrei | Jul 2006 | A1 |
20070027880 | Dettinger et al. | Feb 2007 | A1 |
20070073643 | Ghosh et al. | Mar 2007 | A1 |
Number | Date | Country | |
---|---|---|---|
20070073643 A1 | Mar 2007 | US |