In conventional database systems, transactions are used to retrieve data from a database and to insert, update, or delete records of the database. In a distributed database system, each of two or more database nodes may execute respective transactions in parallel, and/or a single transaction may affect data located on more than one database node. Distributed database systems therefore employ transaction management techniques.
The following description is provided to enable any person skilled in the art to make and use the described embodiments. Various modifications, however, will remain readily apparent to those skilled in the art.
Generally, each logical element described herein may be implemented by any number of devices coupled via any number of public and/or private networks. Two or more of such devices may be located remote from one another and may communicate with one another via any known manner of network(s) and/or via a dedicated connection.
System 100 includes database instance 110, which is a distributed database including database nodes 112, 114 and 116. Each of database nodes 112, 114 and 116 includes at least one processor and a memory device. The memory devices of database nodes 112, 114 and 116 need not be physically segregated as illustrated in the figure.
In some embodiments, the memory of database nodes 112, 114 and 116 is implemented as Random Access Memory (e.g., cache memory for storing recently-used data) and one or more fixed disks (e.g., persistent memory for storing their respective portions of the full database). Alternatively, one or more of nodes 112, 114 and 116 may implement an “in-memory” database, in which volatile (e.g., non-disk-based) memory (e.g., Random Access Memory) is used both as cache memory and for storing its entire respective portion of the full database. In some embodiments, the data of the full database may comprise one or more of conventional tabular data, row-based data, column-based data, and object-based data. Database instance 110 may also or alternatively support multi-tenancy by providing multiple logical database systems which are programmatically isolated from one another.
According to some embodiments, database nodes 112, 114 and 116 each execute a database server process to provide the full data of database instance 110 to database applications. More specifically, database instance 110 may communicate with one or more database applications executed by client 120 over one or more interfaces (e.g., a Structured Query Language (SQL)-based interface) in order to provide data thereto. Client 120 may comprise one or more processors and memory storing program code which is executable by the one or more processors to cause client 120 to perform the actions attributed thereto herein.
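By way of illustration only, the following sketch shows the general shape of such a SQL-based interface as it might be used by a database application. Python's built-in sqlite3 module stands in for database instance 110 here (it is a local engine, not a distributed one), and the table and column names are hypothetical.

```python
# Illustration only: the shape of a SQL-based interface between a database
# application and a database. sqlite3 stands in for database instance 110;
# it is a local engine, not a distributed one.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (product_id TEXT, quantity INTEGER)")
conn.execute("INSERT INTO inventory VALUES ('p1', 4)")
for row in conn.execute("SELECT product_id, quantity FROM inventory WHERE quantity < 10"):
    print(row)   # the database application consumes result rows
conn.close()
```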
Client 120 may thereby comprise an application server executing database applications to provide, for example, business reporting, inventory control, online shopping, and/or any other suitable functions. The database applications may, in turn, support presentation applications executed by end-user devices (e.g., desktop computers, laptop computers, tablet computers, smartphones, etc.). Such a presentation application may simply comprise a Web browser to access and display reports generated by a database application.
The data of database instance 110 may be received from disparate hardware and software systems, some of which are not interoperable with one another. The systems may comprise a back-end data environment employed in a business or industrial context. The data may be pushed to database instance 110 and/or provided in response to queries received therefrom.
Database instance 110 and each element thereof may also include other unshown elements that may be used during operation thereof, such as any suitable program code, scripts, or other functional data that is executable to interface with other elements, other applications, other data files, operating system files, and device drivers. These elements are known to those in the art, and are therefore not described in detail herein.
Each transaction T# of each of connections 210, 220 and 230 is terminated in response to an instruction to commit the transaction. Accordingly, a transaction may include one or more write or query statements before an instruction to commit the transaction is issued. Each query statement “sees” a particular snapshot of the database instance at a point in time, which may be determined based on the read mode of the statement's associated connection.
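A minimal sketch of this statement/commit structure follows; all names are hypothetical, and the sketch merely models a transaction that accumulates statements until an instruction to commit terminates it.

```python
# Minimal sketch (hypothetical names): a transaction groups one or more
# write/query statements and is terminated in response to a commit.
from dataclasses import dataclass, field

@dataclass
class Transaction:
    tid: str
    read_mode: str                 # e.g. "ReadCommitted", "RepeatableRead"
    statements: list = field(default_factory=list)
    committed: bool = False

    def execute(self, statement: str) -> None:
        assert not self.committed, "transaction already terminated"
        self.statements.append(statement)

    def commit(self) -> None:
        self.committed = True      # terminates the transaction

t1 = Transaction("T1", "RepeatableRead")
t1.execute("W1: UPDATE ...")       # write statement
t1.execute("Q1: SELECT ...")       # query statement sees a snapshot
t1.commit()                        # instruction to commit terminates T1
```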
For purposes of the present example, certain assumptions are made regarding the read modes of connections 210, 220 and 230 and the order in which write statements W1, W2 and W3 are committed.
As a result of the foregoing assumptions, statements Q1, Q2 and Q3 of transaction T1 each see the result of statement W1, and statements Q4 and Q5 of transaction T3 also see the result of statement W1. Statement Q6 of transaction T3, on the other hand, sees the result of statements W1, W2 and W3.
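One way to state the visibility rule implied by this example (an assumption on our part, expressed in terms of commit timestamps and snapshot timestamps) is sketched below.

```python
# Assumed visibility rule consistent with the example above: a write is
# visible to a query if the write committed at or before the snapshot
# timestamp assigned to the query.
def visible(write_commit_ts: int, query_snapshot_ts: int) -> bool:
    return write_commit_ts <= query_snapshot_ts

# If W1, W2 and W3 commit at timestamps 1, 2 and 3, then queries running
# at snapshot 1 see only W1, while a query at snapshot 3 sees all three.
assert visible(1, 1) and not visible(2, 1)
assert visible(1, 3) and visible(2, 3) and visible(3, 3)
```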
As described in commonly-assigned U.S. application Ser. No. (Atty Docket no. 2010P00461US), the particular snapshot seen by a statement/transaction may be governed by a “transaction token” in some embodiments. A transaction token, or snapshot timestamp, is assigned to each statement or transaction by a transaction coordinator (e.g., a master database node). A write transaction creates update versions and updates the transaction token when committed. A garbage collector also operates to merge or delete update versions according to a collection protocol. Under such an implementation, each statement in a ReadCommitted-mode transaction may be associated with its own transaction token, while each statement in a RepeatableRead-mode or Serializable-mode transaction may be associated with a same transaction token.
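The token assignment just described might be sketched as follows; the coordinator is reduced to a single counter, and all names are illustrative rather than prescriptive.

```python
# Sketch of transaction-token (snapshot timestamp) assignment. A counter
# stands in for the transaction coordinator (e.g., a master database node).
class Coordinator:
    def __init__(self) -> None:
        self._commit_ts = 0            # advanced by committing writes

    def current_token(self) -> int:
        return self._commit_ts

    def on_write_commit(self) -> int:
        self._commit_ts += 1           # new update versions become visible
        return self._commit_ts

class Txn:
    def __init__(self, read_mode: str) -> None:
        self.read_mode = read_mode     # "ReadCommitted", "RepeatableRead", ...
        self.token = None              # snapshot timestamp, once assigned

def token_for_statement(coord: Coordinator, txn: Txn) -> int:
    if txn.read_mode == "ReadCommitted":
        return coord.current_token()   # fresh token per statement
    if txn.token is None:              # RepeatableRead / Serializable:
        txn.token = coord.current_token()  # fix the snapshot once
    return txn.token                   # same token for every statement
```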
Initially, a query Q1 is received by database node 314 from client device 320. As is known in the art, the query may be pre-compiled for execution by database node 314, or may conform to any suitable compilable query language that is or becomes known, such as, for example, SQL. Database node 314 may comprise a database node of a distributed database as described above with respect to database instance 110.
Query Q1 is associated with a particular transaction (i.e., transaction T1). The transaction may be initiated by database node 314 in response to reception of query Q1 or may have been previously initiated. Similarly, client 320 may open a connection with database node 314 prior to transmission of query Q1.
Returning to the present example, database node 314 transmits a request for a transaction token to coordinator database node 312 in response to query Q1, and coordinator node 312 returns a transaction token indicating a snapshot timestamp to database node 314.
Having received the transaction token, database node 314 may execute query Q1 based on the snapshot timestamp indicated by the transaction token. Execution of query Q1 generates query results which are transmitted to client 320. As noted above, the transaction token is also transmitted to client 320, which stores the transaction token.
Client device 320 then transmits query Q2 and the stored transaction token to database node 314. In this regard, query Q2 is also associated with transaction T1 and is intended to view the same snapshot as viewed by query Q1. In some embodiments, queries Q1 and Q2 are executed in RepeatableRead mode or Serializable mode as described above.
Since database node 314 now possesses a suitable transaction token for query Q2, node 314 may, in some embodiments, execute query Q2 without having to request a token from coordinator database node 312. Accordingly, query Q2 is executed in view of the received token and the results are returned to client 320 as illustrated above.
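The Q1/Q2 exchange may be summarized by the following sketch. The classes are hypothetical simplifications, and a fixed token stands in for the coordinator round trip to node 312.

```python
# Hypothetical sketch of the exchange: the node contacts the coordinator
# only when the client does not supply a stored token.
class SimpleCoordinator:               # stands in for coordinator node 312
    def current_token(self) -> int:
        return 7                       # fixed snapshot timestamp for the sketch

class DatabaseNode:                    # stands in for database node 314
    def __init__(self, coordinator: SimpleCoordinator) -> None:
        self.coordinator = coordinator

    def execute_query(self, sql: str, token=None):
        if token is None:              # first query of the transaction:
            token = self.coordinator.current_token()  # coordinator round trip
        results = f"results of {sql!r} at snapshot {token}"
        return results, token          # token returned with the results

class Client:                          # stands in for client device 320
    def __init__(self, node: DatabaseNode) -> None:
        self.node = node
        self.token = None              # stored transaction token

    def query(self, sql: str) -> str:
        results, self.token = self.node.execute_query(sql, self.token)
        return results

client = Client(DatabaseNode(SimpleCoordinator()))
client.query("Q1")                     # node requests a token from coordinator
client.query("Q2")                     # stored token reused; no coordinator call
```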
Client device 320 then transmits an instruction to commit transaction T1, and transaction T1 is terminated in response as described above.
According to some embodiments, client device 320 may store tokens associated with more than one ongoing transaction. For example, client device 320 may store a token associated with a transaction instantiated on database node 314 and a token associated with a transaction instantiated on database node 316 of system 400. If a database node supports more than one contemporaneous transaction, then client device 320 may store a token associated with each contemporaneous transaction instantiated on the database node.
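One plausible structure for such client-side bookkeeping, offered purely as an illustration, keys each stored token by database node and transaction.

```python
# Illustrative client-side token store: one token per ongoing transaction,
# keyed by (database node, transaction). All names are hypothetical.
class TokenStore:
    def __init__(self) -> None:
        self._tokens = {}              # (node_id, txn_id) -> transaction token

    def put(self, node_id: str, txn_id: str, token: int) -> None:
        self._tokens[(node_id, txn_id)] = token

    def get(self, node_id: str, txn_id: str):
        return self._tokens.get((node_id, txn_id))

    def discard(self, node_id: str, txn_id: str) -> None:
        self._tokens.pop((node_id, txn_id), None)  # e.g., after commit

store = TokenStore()
store.put("node-314", "T1", 7)         # transaction on database node 314
store.put("node-316", "T2", 3)         # contemporaneous transaction elsewhere
assert store.get("node-314", "T1") == 7
store.discard("node-314", "T1")        # T1 committed; token no longer needed
```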
Database master 1110 and each of database slaves 1112, 1114 and 1116 may comprise a multi-processor “blade” server. Each of database master 1110 and database slaves 1112, 1114 and 1116 may operate as described herein with respect to database nodes, and database master 1110 may perform additional transaction coordination functions and other master server functions which are not performed by database slaves 1112, 1114 and 1116 as is known in the art.
Database master 1110 and database slaves 1112, 1114 and 1116 are connected via network switch 1120, and are thereby also connected to shared storage 1130. Shared storage 1130 and all other memory mentioned herein may comprise any appropriate non-transitory storage device, including combinations of magnetic storage devices (e.g., magnetic tape and hard disk drives), flash memory, optical storage devices, and Read Only Memory (ROM) devices, etc.
Shared storage 1130 may comprise the persistent storage of a database instance distributed among database master 1110 and database slaves 1112, 1114 and 1116. As such, various portions of the data within shared storage 1130 may be allotted to (i.e., managed by) one of database master 1110 and database slaves 1112, 1114 and 1116.
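Such an allotment might be recorded, for illustration only, as a simple mapping from data partitions to the owning server; the partition names below are hypothetical.

```python
# Hypothetical allotment of shared-storage partitions to servers: each
# portion of the data is managed by exactly one of the master and slaves.
allotment = {
    "partition-1": "database-master-1110",
    "partition-2": "database-slave-1112",
    "partition-3": "database-slave-1114",
    "partition-4": "database-slave-1116",
}

def owner(partition: str) -> str:
    return allotment[partition]        # the server managing that portion

assert owner("partition-3") == "database-slave-1114"
```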
Application server 1140 may also comprise a multi-processor blade server. Application server 1140, as described above, may execute database applications to provide functionality to end users operating user devices.
Embodiments described herein are solely for the purpose of illustration. Those skilled in the art will recognize that other embodiments may be practiced with modifications and alterations to those described above.