The present invention relates to Enterprise Data Warehouses and Relational Database Management Systems used to support real-time decision management.
Enterprise data warehouse (EDW) systems have traditionally been used for decision support by enabling business entities to view some aspect of their businesses in real-time and plan tactical or strategic moves aimed at improving operations. Businesses increasingly are trying to push the decision support capabilities of EDW systems to real-time, allowing for decisions on business operations to be made quickly in response to certain events. In many cases, real-time events occur in some time-ordered manner or priority, the order of which the businesses would like to preserve and react to accordingly. Traditional EDW systems, however, do not easily lend themselves to these sort of time-ordered operations, and attempts at making them do so have proven difficult.
In general, according to one embodiment, this invention involves a method for use in delivering data from a database table in response to a query. The method includes accessing an ordered list that indicates an order of retrieval for each of one or more rows in the database table and returning the row that is first in the order.
In general, according to one embodiment, this invention involves a method for use in delivering data from a database table in response to a query. The method includes assessing whether the database table has any rows of data. When there are no rows of data in the table, halting execution of the query until data is placed in the database table and then returning the data that was placed in the database table.
In general, according to one embodiment, this invention involves a system for use in delivering data from a database table in response to a query. The system provided by embodiments of the present invention comprises: a database that includes one table; one or more data-storage facilities that together store the database; and a database-management component coupled to the database and configured to access an ordered list that indicates an order of retrieval for each of one or more rows in the database table and returns the row that is first in the order.
In general, according to one embodiment, this invention involves a system for use in storing data in a database table. The system provided by embodiments of the present invention comprises: a database that includes one table; one or more data-storage facilities that together store the database; and a database-management component coupled to the database and configured to store data in one or more rows in the database table and create an ordered list that indicates an order of retrieval for the one or more rows.
In general, according to one embodiment, this invention involves a system for use in delivering data from a database table in response to a query. The system provided by embodiments of the present invention comprises: a database that includes one table; one or more data-storage facilities that together store the database; and a database-management component coupled to the database and configured to assess whether the database table has any rows of data. When there are no rows of data in the table, halting executing of the query until data is placed in the database table and then returning the data that was placed in the database table.
Each of the nodes 102 executes one or more computer programs, such as an operating system, Data Mining Applications performing data mining operations, Real-time Data Management (RTDM) Applications for performing operations on the data, Client Database Applications interacting with the database, and/or a Relational Database Management System (RDBMS) for managing a relational database stored across one or more of the DSUs. In some embodiments, a computer system having only a single node manages all or some subset of the computer programs.
Those skilled in the art will recognize that the exemplary environment illustrated in
In addition to traditional relational tables, the database 200 also includes one or more tables, known as queue tables 210, that provide enhanced support for real-time decision making in an EDW system. Each of the queue tables 210 is a relational table that has non-traditional “queue like” properties. In particular, each of the queue tables 210 is created and managed in a manner that provides for the preservation of order amongst data placed in the table. This preservation of order typically appears as a preservation of some time related order (e.g., the time order in which the data is placed in the queue table or comes into existence at some external source) or as the preservation of some priority-based order (e.g., higher priority data is distinguished from lower priority data). In general, the queue tables typically use some characteristic of the data they store—be it an intrinsic characteristic of the data created with the data itself or an extrinsic characteristic added to the data at some point—to indicate an order of retrieval for returning the rows of data that make up the tables.
One way that the database, through the queue tables 210, preserves the order of data is through the use of one or more index type structures known as ordered lists 220. Each of the ordered lists 220 in the database maintains, for a corresponding queue table 210, an order of retrieval for the rows stored in that queue table. Unlike a standard database index 240, which can not anticipate which data will be requested or the order in which it will be returned, the ordered lists 220 are created precisely to specify which rows of data are returned in response to individual queries.
The criteria used in any one of the ordered lists 220 to govern the retrieval order for rows of data stored in the corresponding queue table 210 are independent of the nature of the database system in which the queue tables are implemented and are often chosen by the DBA. For example, in some systems, the retrieval order in an ordered list relies on time based information, such as a timestamp indicating when each row was created in the corresponding queue table. The database may or may not store the time based information with the corresponding rows of data. This time based information, when it is stored with the corresponding data, is typically stored in a time indicator data column 215 in the corresponding queue table 210. In some embodiments however, the time based information, if stored at all, is stored outside of the queue table. For example, in some of these embodiments, the time based information is stored as part of the ordered list that corresponds to the queue table. Also, in some embodiments, the time based information is hidden and is not assessable through database queries, while in other embodiments the time based information is accessible through database queries.
In other embodiments, the retrieval order preserved by the ordered list is based on an occurrence of some event other than the time at which the data was stored in the corresponding queue table. In these embodiments, the time at which the event occurred is typically supplied with the data.
In still other embodiments, retrieval order is based on priority ratings assigned to the rows of data stored in the queue table. In some cases, the priority ratings are provided to the database along with the data and, in others, priority ratings are calculated within the database by applying an algorithm to the data stored in the queue table.
Queue tables like those described above carry several advantages beyond their ability to preserve order among data. These advantages include: 1) the ability to suspend execution of a database query when no data is available and then resume execution of the query and return the requested data when the data becomes available; 2) the ability to return only a single row of data in response to a query, even when multiple rows satisfy the query; and 3) the ability to execute “select and consume” queries that lead to the return of a row of data followed by deletion of the data from the queue table.
In regard to the first of these advantages, when the database receives a traditional database query, the database either returns the requested data or upon failing to return data, indicates that no data is available and ends the query. In a real-time environment, failing to return data wastes valuable computer and database resources, because it forces the reissuance of the query, perhaps more than once, until data is available. Through the use of queue tables, on the other hand, the database is able to suspend execution of a query when no data is available and resume execution of the query once the data becomes available. Suspending and resuming queries in this manner reduces the consumption of computer and database resources used in processing queries and reduces the delays traditionally associated with retrieving real-time data from the database.
The second of the advantages listed above is the ability of the database to return only a single row of data from a database table in response to a query, even when multiple rows of data might satisfy the query. In a traditional database system, a single row of data can be returned only if the query is sufficiently complex and has decided to target only that row. Quiet often, such a query is not even possible. In a database system implementing queue tables however, the DBA can set the system to return only one row from a queue table in response to any query. Returning only a single row of data greatly reduces the complexity of database queries and minimizes the impact on the resources required to execute the queries when only small amounts of data are needed at a time.
The use of queue tables also allows the database to execute “select and consume” commands. These commands provide a simple mechanism for retrieving a row of data from a queue table and then removing or deleting the returned row from the queue table. In general, the “select and consume command” is performed as an indivisible command that is not and in fact can not be interrupted during execution. In a traditional database system, data is locked at the SQL command level to prevent the execution of other queries attempting to retrieve the same data before that data can be deleted from the table. The structure of the queue table allows for easy execution of a “select and consume” command in the database system.
In some systems, when the database system suspends a query, the application that issued the query also suspends execution. In some of these systems, the application starts a timer before issuing the query. If the timer expires before the application has received a response to the query, the application (or some part of the application) resumes execution and the application decides whether to: 1) abort the query altogether; 2) process other work while continuing to await for a response; or 3) set another timer and continue to wait for a response to the query.
Returning to step 410, if no data is available, the database assesses the type of table targeted by the query (step 420). If a queue table is not involved in the query, the database will return a token indicating that no data was available and then terminate the processing of the query (step 470). If a queue table is involved in the query, the database will suspend execution of the query until the requested data is received (step 435) and then terminate the processing of the query (step 470).
Upon receiving data for storage in a queue table (step 500), the database creates a row in the queue table (step 505) and stores the data in that row (step 510). A record is then created in an ordered list associated with the queue table (step 515) and a pointer to the row in the queue table is stored in the record (step 520). The record is added to the ordered list so that it is last in the ordered list (step 525). (It is possible that the record can be both last and first in the ordered list if the queue table has only one row of data.) The record is last in the ordered list because the data that it points to is the most recently stored data in the queue table. The record remains last in the ordered list until another record is added to the ordered list. As additional data is inserted into the queue table, the data associated with the record ages and becomes decreasingly recent. At some point, this data becomes the oldest data in the queue table and therefore will be first in the ordered list. Once the data is stored and the ordered list updated, an indication is sent to any query waiting for data to be inserted into this queue table (step 530). The indication causes the waiting or suspended query to resume execution.
In some systems, a query command that targets a queue table will include an SQL “where” clause that places restrictions on the data to be returned. When a query of this type is received, the database examines the data in the queue table in the order specified by the ordered list associated with the queue table and returns at least a portion of the data from the first row found that satisfies the restrictions contained in the query. The row may not be the first row listed in the ordered list, but it will be the row closest to the first row in the ordered list that satisfies the query. If no data is found, the execution of the query is suspended until data satisfying the query is inserted into the queue table. In still other systems, when a query containing a “where” clause is received, the database uses one or more database indexes along with the ordered list to reduce the time needed to search for data in the queue table.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate that numerous modifications and variations of the invention exist. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of the invention.
This application is a divisional application of U.S. patent application Ser. No. 11/238,937; entitled “ENHANCING TABLES AND SQL INTERACTION WITH QUEUE SEMANTICS” by Jason Chen and Bhashyam Ramesh; filed on Sep. 29, 2005.