1. Field of the Invention
The present invention relates to a method, system, and program for accessing data in a database table.
2. Description of the Related Art
Database programs include a feature referred to as cursors. A cursor is a named control structure used by an application program to point to a row of interest within some set of rows and to retrieve rows from the set, possibly making updates and deletions. A cursor points to rows from a database table that satisfy a structured query language (SQL) query against the table. The rows in the table that satisfy the SQL query comprise a result table of data. The SQL query includes an SQL SELECT statement and a WHERE clause to qualify rows according to a search condition. An application can then access data on a row-by-row basis from the result table.
If the result table is static and not updateable, then the result table may be materialized in a workfile. Alternatively, the cursor may be dynamic and point directly to the rows in the base table. In such case, the result table is not materialized in a workfile and the cursor is updateable when the base table is updated.
After a cursor is opened or initialized, an application program may issue fetch statements to move a cursor positioned on a row one or more rows, forward or backward, from a current cursor position. In current implementations, if a cursor is implemented as sensitive, then changes made to the database after the result table is materialized are visible to the cursor. The cursor has some level of sensitivity to any updates or deletes made to the rows underlying its result table after the table is materialized. The cursor may further be sensitive to positioned updates or deletes using the same cursor. Additionally, the cursor can have sensitivity to committed changes made outside this cursor. A static cursor specifies that the size of the result table and the order of the rows does not change after the cursor is opened. Rows inserted into the underlying table are not added to the result table regardless of how the rows are inserted. Rows in the result table do not move if columns in the ORDER BY clause are updated in rows that have already been materialized. Positioned updates and deletes are allowed if the result table is updateable. The SELECT statement of a cursor that is defined as SENSITIVE STATIC cannot contain an INSERT statement. Further, a static cursor may have visibility to changes made by this cursor using positioned updates or deletes. Committed changes made outside this cursor are visible with the SENSITIVE option of the FETCH statement. A FETCH SENSITIVE can result in a hole in the result table (that is, a difference between the result table and its underlying base table). If an updated row in the base table of a cursor no longer satisfies the predicate of its SELECT statement, an update hole occurs in the result table. If a row of a cursor was deleted in the base table, a delete hole occurs in the result table. When a FETCH SENSITIVE detects an update hole, no data is returned (a warning is issued), and the cursor is left positioned on the update hole. When a FETCH SENSITIVE detects a delete hole, no data is returned (a warning is issued), and the cursor is left positioned on the delete hole.
Open DataBase Connectivity (ODBC) is a standard database access method to allow applications to access data in a database management system (DBMS). An ODBC driver translates the application's queries into statements that the DBMS understands. The ODBC standards describe scrollable, keyset driven, static and dynamic cursors. The ODBC standards mention that cursors may be updateable or nonupdateable. Cursors are defined as updateable if the application is capable of modifying the data in the cursor result table. As discussed, the result table may be implemented in a work file or comprise the rows pointed to by the cursor in the base table. The ODBC also mentions that when positioned on a row in an updateable cursor, the application can perform position updates or delete operations that target the base table rows used to build the current row in the cursor.
The ODBC defines the following types of cursors:
Cursors may be categorized as forward-only or scrollable. If the cursor is scrollable then they can be either static, keyset or dynamic. Further details of scrollable cursors are described in the copending and commonly assigned patent application entitled “Method, System, and Program for Implementing Scrollable Cursors in a Database”, having U.S. application Ser. No. 09/656,558, filed on Sep. 7, 2000, which patent application is incorporated herein by reference in its entirety.
Current database systems provide static scrollable cursors only. However, static scrollable cursors restrict the applications form being able to fetch newly inserted rows. Furthermore, they require extra disk storage to save the temporary result table per transaction. While some applications prefer to work with a fixed set of rows, at the expense of extra disk storage, there is also an extensive requirement in the industry to be able to scroll on the database table there by having immediate access to the most current data via the access path selected by the database optimizer.
Provided are a method, system, and program for accessing data in a database table. A fetch request is received to fetch data from a base table that satisfies a query predicate, wherein rows of the base table are stored in table partitions and wherein there is one index partition for each determined table partition, wherein each index partition includes nodes, wherein each node in each index partition includes at least one key column value from a corresponding table row in the table partition associated with the index partition and a location identifier identifying the corresponding table row in the corresponding table partition. A determination is made of a set of nodes, one from each index partition, whose key column value satisfies the query predicate. One node from the set is selected and data is returned from the table row identified by the location identifier in the selected node in response to the fetch request.
In further implementations, a determination is made as to whether to modify a direction of the fetch request. The direction of the fetch request is modified if the determination is made to modify the fetch request. And a determination is made of the set of nodes based on the direction of the fetch request.
In still further implementations, a subsequent fetch request is received to fetch data from the base table. A previously selected node selected in a previous fetch request in the set is replaced with one node in the index partition including the previously selected node whose key column value satisfies the query predicate to form a modified set. One node is selected from the modified set and the table row identified by the location identifier in the node selected from the modified set is returned.
Still further, the cached keys are discarded if the fetch request is in an opposite direction of a previous fetch request. A determination is made of a new set of nodes from each index partition. The determined new set of nodes is cached when performing the fetch operation.
Referring now to the drawings in which like reference numbers represents corresponding parts throughout:
a and 1b illustrate a computing environment in which aspects of the invention are implemented in accordance with implementations of the invention;
In the following description, reference is made to the accompanying drawings which form a part hereof, and which illustrate several embodiments of the present invention. It is understood that other embodiments may be utilized and structural and operational changes may be made without departing from the scope of the present invention.
a illustrates a computing environment in which a database may be implemented. A server 2, which may comprise any server class system known in the art, receives and handles database requests, such as database queries, directed to tables or indexes in a database 6. Database 6 is shown as having a table 8, a partition map 10, and an index 11. The partition map 10 identifies one or more columns of the table 8 as partition map columns, and uses the partition map column values to associate table rows with one partition 14a, 14b . . . 14n in a table space 16 in storage 18 where the table 8 data is stored. The partition map 10 criteria may indicate ranges of values associated with different partitions 14a, 14b . . . 14n defined for the table 8, such that when adding a row to the table 8, the column value for the added row corresponding to the partition map column determines in which partition 14a, 14b . . . 14n the row will be stored. For instance, the partition map 10 may indicate a range of partition map column values for each partition 14a, 14b . . . 14n, such that the row is stored in the partition associated with the range including the column value of the row.
The index 11 on the table 8 may be implemented as a plurality of data partitioned secondary indexe (DPSI) partitions 12a, 12b . . . 12n in an index space 13, where each DPSI partition 12a, 12b . . . 12n provides a scannable index on a subset of rows of the table 8. This makes the index 11 partitioned. In certain implementations, each DPSI partition 12a, 12b . . . 12n is associated with one table space partition 14a, 14b . . . 14n, such that each DPSI partition 12a, 12b . . . 12n has nodes arranged in a tree structure, such as a B-tree, where each node has a key value for one table record and a location identifier, such as record identifier (RIDs), pointing to the corresponding table row in the table space partition 14a, 14b . . . 14m. In this way, the rows in the table 8 in one table space partition 14a, 14b . . . 14n may be separately searched by searching the DPSI partitions 12a, 12b . . . 12n that includes the key value for each row in the table space partition 14a, 14b . . . 14n associated with that DPSI partition 12a, 12b . . . 12n.
b provides a further example of a data partitioned secondary index (DPSI) and, in particular, a mapping of a data partitioned secondary index 30 to physical partitions on storage device 18 against which parallel tasks 32, 34 execute in accordance with certain implementations of the invention. In particular, the data partitioned secondary index (DPSI) 30 is partitioned into three partitions: data partitioned secondary index (DPSI) partition A 36, data partitioned secondary index (DPSI) partition B 38, and data partitioned secondary index (DPSI) partition C 40. Storage device 18 stores physical partition A 40 with data pages 1-10, physical partition B 142 with data pages 11-20, and physical partition C 144 with data pages 21-30. Each data partitioned secondary index (DPSI) partition is associated with a physical partition. For example, data partitioned secondary index (DPSI) partition A 36 is associated with physical partition A 40. Data partitioned secondary index (DPSI) partition B 38 is associated with physical partition B 42. Data partitioned secondary index (DPSI) partition C 40 is associated with physical partition C 44. Additionally, parallel task 32 is assigned data pages 1-10 of physical partition A 40 and data pages 11-20 of physical partition B 42. Parallel task 34 is assigned data pages 21-30 of physical partition C 46.
By assigning data pages to parallel tasks, certain implementations of the invention, further discussed in detail in U.S. patent application Ser. No. 10/353,138, which was incorporated by reference in its entirety above, are able to map the data pages to physical partitions, which are then mapped to associations with data partitioned secondary index partitions. Since different parallel tasks access different data pages and different data partitioned secondary index partitions, I/O contention between parallel tasks is minimized and overall elapsed time is reduced. Thus, implementations of the invention achieve an elapsed time improvement over sequential database query execution by creating multiple parallel tasks to concurrently access data through a data partitioned secondary index and by using data page range partitioning (i.e., assigning different data page ranges to different parallel tasks having the same key range).
The storage 18 may comprise any type of non-volatile storage device known in the art. Further, pages of any of the table space partitions 12a, 12b . . . 12n and 14a, 14b . . . 14m, or part thereof, may be loaded into the memory of the server 2 or the database members 20a, 20b . . . 20n.
Although only one table 8, partition map 10, index 11 and one set of DPSI partitions 12a, 12b . . . 12n and table space partitions 14a, 14b . . . 14n are shown, the database 6 may include many tables and DPSIs, as well as additional sets of index and table space partitions for the tables and indexes in the database 6. Moreover, multiple DPSIs defined on one table may each be comprised of multiple partitions, where each DPSI has a different set of key columns on the table.
Database members 20a, 20b . . . 20k include a local buffer pool 22a, 22b . . . 22k to store records retrieved from the database 6 and a database server 24a, 24b . . . 24k, such as a database management server (DBMS), to receive requests from database clients 25 and execute such requests against the database 6. The database clients initiating the requests may execute within the server 2 or on a system external to the server 2. The database server 24a, 24b . . . 24k may then perform operations on database 6 data retrieved through the database server 24a, 24b . . . 24k written to the local buffer pool 22a, 22b . . . 22k. When modifying data in a local buffer pool 22a, 22b . . . 22k, the database servers 24a, 24b . . . 24k would write any modifications to a group buffer pool 26. Any database server 24a, 24b . . . 24k accessing database data in their local buffer pool 22a, 22b . . . 22k would first check the group buffer pool 26 to determine if there is a more recent version of the data they are accessing, and if so, access such more recent version of the data from the group buffer pool 26.
The database member 20a, 20b . . . 20n may be implemented in computer systems separate from the server 2. Alternatively, the database members 20a, 20b . . . 20n may comprise programs or processes that execute within the server 2.
If (at block 108) the query includes search predicates on partition map columns, then a determination is made (at block 112) of the one or more qualifying table space partitions associated with column values in the partition map 10 that are capable of satisfying the query search predicates. A determination is then made (at block 114) of the DPSI partitions 12a, 12b . . . 12n corresponding to the determined table space partitions 14a, 14b . . . 14n having index nodes whose key values are capable of satisfying one or more of the query search predicates. Control then proceeds to block to block 130 in
To perform the first scan (from blocks 110 or 114), then the database server 24a, 24b . . . 24k accesses (at block 130) the root, or the top, of each of the m DPSI partitions 12a, 12b . . . 12n to scan, where m may be n or less than n. The database server 24a, 24b . . . 24k then scans (at block 132) in the determined direction from the previously determined starting node of each of the m DPSI partitions 12a, 12b . . . 12n to determine the first node in each DPSI partition whose key value satisfies the query predicates. The scan direction and starting node may be determined by the logic of
If (at block 140) all queried table rows satisfy the query predicates on the non-key columns, then the qualifying key values located in each scanned DPSI partition are sorted (at block 142) if there are multiple nodes according to the sort order of the index. If the index is sorted in an ascending order, then the key values would be sorted from lowest to highest, if a descending order, then the key values would be sorted from highest to lowest. In certain implementations, each DPSI may comprise a B-tree or other searchable tree data structure having a root or top node and descendant nodes organizing one or more key columns according to a sort criteria. The first qualifying key value is selected (at block 144) from the sorted one or more located key values. The index node of the selected qualifying key values is now the winning node. The database server 24a, 24b . . . 24k then reevaluates (at block 146) whether the key value of the winning node still satisfies the query predicate search conditions (on both DPSI key columns and non-DPSI key columns), in the event the row has been updated while doing the search. This reevaluation ensures only qualifying rows are returned. If (at block 146) the row corresponding to the winning index node still satisfies the query predicates, then a determination is made (at block 147) whether the fetch quantity is satisfied. If the fetch request is to fetch forward or backward by a certain quantity or number, then the row is only returned when that fetch quantity is reached. If (at block 147) the fetch quantity is not satisfied, then the current winning key is discarded (at block 149) and the fetch quantity is incremented by one. Control then proceeds to block 162 in
If (at block 140) the queried table rows do not satisfy the query predicates on the non-DPSI key columns, the database server 24a, 24b . . . 24k scans (at block 150) from node(s) in DPSI partition(s) of the node(s) that do not satisfy the non-DPSI key column predicates to locate next qualifying node(s) that satisfy the DPSI key column predicates. If (at block 152) there is one qualifying node located in the DPSI partitions scanned at block 150 that satisfies the key column predicates, then control proceeds to block 136. Otherwise, if (at block 152) there are no nodes that satisfy the DPSI key column predicates determined in the scan at block 150, then control proceeds to block 153 to determine whether there are qualifying key values form other DPSI partitions not scanned at block 150. If so, control proceeds to block 142, otherwise, control ends.
With respect to
The described logic of
The described implementations provide a technique to limit a query to a subset of the multiple index (DPSI) partitions when possible and return rows in the order of the key columns of the index while the same range of key values can exist across multiple index partitions. This avoids the need to query an entire index. Instead, the query is optimized by, in certain instances, limiting the query to an index partition that has fewer searchable entries than an index on all rows of the table, where each index partition provides an index on a subset of table 8 rows stored in a particular partition 14a, 14b . . . 14n.
In certain implementations, the database server programs 24a, 24b . . . 24k would have the capability to implement a scrollable cursor.
In a dynamic scrollable cursor, the cursor operations, such as fetch forward, backward, relative, etc., are performed directly on a table 8 or index 11 on the table without using a result set. This avoids the need to first buffer those table rows that qualify according to the search predicate in a separate result table 200 (
Dynamic scrollable cursors provide access to the current the data because the scrollable cursor operates against the table 8 or index 11 on the table.
In certain implementations, the scrollable cursor may be defined to scroll on search predicates including a key column of an index on a table. For such cursors, the scrollable cursor would scroll on the index, and upon locating an index node satisfying the search predicates of the scrollable cursor, return the row in the table addressed by the index node located in the scrollable cursor fetch operation.
From block 306 or 308, the database server 24a, 24b . . . 24k receives (at block 312) the modified fetch request with the fetch direction 282 set and a scroll quantity, or number of rows to scroll in the set current direction 283. If (at block 314) this is the first fetch, i.e., the counter 286 is set to zero, then the starting cursor keys 284 are set (at block 316) for each DPSI partition 252a, 252b . . . 252n to the last key if the current direction 283 is forward or to the highest key if the current direction 283 is backward. Control then proceeds (at block 318) to block 132 in
The above described cursor implementations allows cursor fetch forward or backward operations on an index of a table that is implemented in multiple index partitions.
The logic of
The following example illustrates dynamic scrolling with DPSI. If, for example, there are four partitions having key values in the DPSI in ascending order and the application SELECT statement requests these values in ascending order, and it is able to scan backwards as well.
After open cursor, FETCH NEXT will build the following key values in the cache: 4, 6, 5, 2 which are the lowest key from each part. After sorting, the ordered key values are 2, 4, 5, 6, the lowest, 2, is returned. On the subsequent FETCH NEXT, 2, will be discarded because it has already been returned and it will be replaced by the next key from part 4, which is 10. After sorting the cache will hold 4, 5, 6, 10. And 4 will be returned. On yet another FETCH NEXT, the cache will contain 5, 6, 10, 12, and 5 will be returned.
If the application switches direction and requests fetch previous, the cache is discarded and replaced (at block 324 in
The described database management techniques disclosed herein may be implemented as a method, apparatus or article of manufacture using standard programming and/or engineering techniques to produce software, firmware, hardware, or any combination thereof. The term “article of manufacture” as used herein refers to code or logic implemented in hardware logic (e.g., an integrated circuit chip, Programmable Gate Array (PGA), Application Specific Integrated Circuit (ASIC), etc.) or a computer readable medium, such as magnetic storage medium (e.g., hard disk drives, floppy disks,, tape, etc.), optical storage (CD-ROMs, optical disks, etc.), volatile and non-volatile memory devices (e.g., EEPROMs, ROMs, PROMs, RAMs, DRAMs, SRAMs, firmware, programmable logic, etc.). Code in the computer readable medium is accessed and executed by a processor complex. The code in which preferred embodiments are implemented may further be accessible through a transmission media or from a file server over a network. In such cases, the article of manufacture in which the code is implemented may comprise a transmission media, such as a network transmission line, wireless transmission media, signals propagating through space, radio waves, infrared signals, etc. Thus, the “article of manufacture” may comprise the medium in which the code is embodied. Additionally, the “article of manufacture” may comprise a combination of hardware and software components in which the code is embodied, processed, and executed. Of course, those skilled in the art will recognize that many modifications may be made to this configuration without departing from the scope of the present invention, and that the article of manufacture may comprise any information bearing medium known in the art.
The discussion and flowcharts in
The partitions for which an index partition is provided may comprise a portion of a page set or any other portion of a physical storage unit which is provided to store database tables.
The described implementations implemented the database server in a server accessible to multiple database members. In alternative implementations, the operations described with respect to the database server may be performed by any database program handling database operations and database tables and indexes, including database programs that execute on computers other than servers.
The foregoing description of the implementations has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto. The above specification, examples and data provide a complete description of the manufacture and use of the composition of the invention. Since many implementations of the invention can be made without departing from the spirit and scope of the invention, the invention resides in the claims hereinafter appended.