Concurrency control systems aim to provide database transaction isolation for data in a system accessed by multiple users. Where multiple users attempt to read and/or write to a data slot in a database in parallel, controls are implemented such that a first transaction does not adversely affect other transactions and that the serializability of the system is not violated. For example, pessimistic concurrency controls may be implemented to lock an entity in the database such that a holder of a lock may disallow anyone from reading or writing to the entity.
The present application may be more fully appreciated in connection with the following detailed description taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:
Systems, methods, and equivalents for adaptively changing the concurrency control mode of a data slot are provided. A concurrency control mode is defined by the operational procedures used to maintain the consistency of the database entity when facing concurrent transactions. Concurrency control modes aim to prevent the adverse effects of transactions attempting to modify data in a database concurrently. Specifically, concurrency control modes aim to maintain consistency of an entity in a database interacted with by multiple transactions in parallel by maintaining the principles of serializability within a data system. Serializability is maintained where a database state resulting from multiple concurrent transactions mimics the result of the transactions executing serially.
Multiple concurrency control modes may be implemented in a database. A pessimistic concurrency control (PCC) mode, for instance, may limit concurrency by allowing readers and writers to “lock” data, disallowing other readers and writers from submitting transactions, such as read or write transactions, with respect to the data. For example, data may be locked in the database and accessible only by the lock holder such that the lock holder has exclusive access to update the data.
Optimistic concurrency control (OCC) is a concurrency control mode that may enable multiple readers and writers to perform transactions on the same data or entity and abort a transaction before committing in the event the transaction would violate the principle of serializability within the system, i.e., where there is a read-write conflict. In an example of OCC, a single version of data is maintained, the data is read in shared memory, and the data is written to in private memory. When in an OCC mode, transactions may be validated before they are committed, and a transaction may be aborted if the transaction is found to be invalid. For example, if a transaction attempts to write to data that was modified by another transaction subsequent to the time the data was read by the transaction, the transaction may be found to be invalid and aborted. In an example, the transaction may be retried responsive to the abort.
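The OCC behavior described above may be sketched as follows. This is a minimal, single-threaded illustration and not a definitive implementation: the names Slot and OccTransaction and the per-slot version counter are assumptions introduced here, with the counter standing in for whatever conflict-detection mechanism a given system uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    """Hypothetical shared data slot holding a single version of the data."""
    value: int
    version: int = 0  # assumed counter, incremented on every committed write


@dataclass
class OccTransaction:
    """Reads shared data, buffers its write privately, validates at commit."""
    slot: Slot
    read_version: Optional[int] = None
    private_write: Optional[int] = None

    def read(self) -> int:
        # Remember which version was observed so it can be validated later.
        self.read_version = self.slot.version
        return self.slot.value

    def write(self, new_value: int) -> None:
        # The write stays in private memory until the transaction commits.
        self.private_write = new_value

    def commit(self) -> bool:
        # Validation: abort if the data was modified after this transaction read it.
        if self.read_version is not None and self.read_version != self.slot.version:
            return False  # aborted; the caller may retry
        if self.private_write is not None:
            self.slot.value = self.private_write
            self.slot.version += 1
        return True


slot = Slot(value=10)
t1, t2 = OccTransaction(slot), OccTransaction(slot)
t1.write(t1.read() + 1)
t2.write(t2.read() + 5)
t1_committed = t1.commit()  # validates and installs its write
t2_committed = t2.commit()  # fails validation because t1 modified the slot
```

In this sketch the second transaction fails validation and would be retried by its caller, mirroring the retry responsive to an abort.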
Multi-version concurrency control (MVCC) is a concurrency control mode that, unlike OCC, may not abort a transaction before committing in the event a read-write conflict would occur. In MVCC, an additional version of the data or entity may be created responsive to detecting a read-write conflict. MVCC may mimic isolation within the database by creating a snapshot of data at a point in time at which the transaction is initiated. The transaction may perform on the additional version, i.e. the snapshot, until such time as a commit occurs.
A timestamp may be applied to the additional version to mark the time of creation of the additional version. Where multiple additional versions are created, the additional versions may be ordered by timestamp. A timestamp may similarly be applied to a transaction, which may mark a time the transaction was initiated, mark a time the transaction last read from data, etc. In an example, a timestamp assigned to the transaction may be compared with the timestamps of any number of the ordered additional versions and may read or write to the additional version that has the latest timestamp prior to the timestamp of the transaction. In doing so, the transaction may maintain a consistent view of the database on which it is operating.
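The timestamp comparison described above may be illustrated with a short sketch. The function name visible_version and the representation of a version as a (timestamp, value) pair are assumptions made for illustration; the versions are assumed to be already ordered by timestamp.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

Version = Tuple[int, str]  # (timestamp, value), an assumed representation


def visible_version(versions: List[Version], txn_ts: int) -> Optional[Version]:
    """Return the version with the latest timestamp strictly before txn_ts.

    `versions` must be sorted in ascending timestamp order.
    """
    timestamps = [ts for ts, _ in versions]
    # bisect_left yields the count of versions whose timestamp is < txn_ts.
    idx = bisect_left(timestamps, txn_ts)
    return versions[idx - 1] if idx > 0 else None


versions = [(10, "a"), (20, "b"), (30, "c")]
chosen = visible_version(versions, 25)
```

A transaction with timestamp 25 selects the version stamped 20, and keeps selecting it even if later versions are appended, which is how the consistent view is preserved.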
In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present systems and methods. For some examples, the present systems and methods may be practiced without these specific details. Reference in the specification to “an example” or similar language means that a particular feature, structure, or characteristic described in connection with that example is included as described, but may not be included in other examples. In other instances, methods and structures may not be described in detail to avoid unnecessarily obscuring the description of the examples. Also, the examples may be used in combination with each other.
The following terminology is understood to mean the following when recited by the specification or the claims. The singular forms “a,” “an,” and “the” mean “one or more.” The terms “including” and “having” are intended to have the same inclusive meaning as the term “comprising.”
Any of the processors discussed herein may include a microprocessor, a microcontroller, a programmable gate array, an application-specific integrated circuit (ASIC), a computer processor, or the like. Any of the processors may, for example, include multiple cores on a chip, multiple cores across multiple chips, multiple cores across multiple devices, or combinations thereof. In some examples, any of the processors may include at least one integrated circuit (IC), other control logic, other electronic circuits, or combinations thereof.
A concurrency control mode for a data slot may be changed. A data slot may include any number of data fields for housing data within a database.
Second memory resource 130 may store data in a database. The database may include a data slot 140 having in-place data field 142 to house an in-place data value which may be a committed data value. In an example, the housed committed data value may be a latest committed data value of the data slot.
Data slot 140 may also include a mode indicator field 144. In an example, the mode indicator field 144 may house data, such as a mode indicator, to indicate what concurrency control mode the data slot is in. For instance, mode indicator field 144 may house a mode indicator to indicate whether the concurrency control mode of the data slot is in optimistic concurrency control (OCC) mode or multi-version concurrency control (MVCC) mode. Specifically, the mode indicator may indicate that the concurrency control mode of the data slot is in OCC mode, such that a transaction with respect to the data slot is aborted upon detection of a read-write conflict. The mode indicator may also indicate that the concurrency control mode of the data slot is in MVCC mode, such that an additional version of the committed data value may be created responsive to detecting the read-write conflict.
Instructions may be provided in first memory resource 120 of system 100. Specifically, instructions 122 may be provided to change a concurrency control (CC) mode from OCC to MVCC. In an example, instructions 122 may change the concurrency mode from OCC to MVCC responsive to detecting a read-write conflict for the data slot. Instructions 124 may also be provided in first memory resource 120 to change the CC mode from MVCC to OCC. In an example, instructions 124 may change the CC mode from MVCC to OCC responsive to detecting that the data slot satisfied a low contention criterion which is described in greater detail herein. Accordingly, instructions may be provided in first memory resource 120 to switch from OCC to MVCC and from MVCC to OCC.
A transaction, or a succession of transactions, may be executed by a thread. A thread may be a unit of execution, i.e., the thread executes instructions that operate on data stored in memory. In an example, transactions executed by different threads will be stored in the ring buffer of the respective thread. An example memory in the form of a collection 230 of two per-thread ring buffers is illustrated in
Additional fields may be included within data slot 140 in addition to in-place data field 142 and mode indicator 144, including priority indicator 246 and version chain locator 248. Version chain locator 248 may locate any transaction seeking to update in-place data field 142, and specifically, may locate any created additional versions in the form of updates to in-place data field 142. In an example, a created additional version to update in-place data field 142 may be stored within a transaction record located in memory. A transaction record may be stored within a cyclical ring-buffer. The cyclical ring-buffer may describe the logical organization of the data in memory. Any number of ring-buffers may be utilized, and may depend on any number of physical processors in the system, the load on the database server, etc.
In an example, multiple updates in the form of additional versions may be stored in any number of per-thread ring buffers within collection 230. Additional versions stored within different per-thread ring buffers, e.g., where created by different threads, may be linked in the form of a version chain. The version chain located by version chain locator 248 may link together multiple created additional versions for a single data slot. In an example, the created additional versions may be linked sequentially such that a first additional version of the created additional versions points to a second additional version of the created additional versions that was created earlier in time than was the first additional version. For instance, a first additional version stored within first thread ring buffer 232 may point to a second additional version stored within second thread ring buffer 234. While two per-thread ring buffers are illustrated in example system 200, additional versions may be stored and linked between any number of per-thread ring buffers. Thus, version chain locator 248 may locate the version chain stored within the collection of per-thread ring buffers 230.
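The linking of additional versions across per-thread ring buffers may be pictured as a singly linked list running from newer to older versions. The sketch below is illustrative only: the class AdditionalVersion, the field names, and the use of a text label in place of a real ring buffer are assumptions.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class AdditionalVersion:
    value: str
    commit_ts: int
    owner_buffer: str  # label standing in for the per-thread ring buffer
    older: Optional["AdditionalVersion"] = None  # link to an earlier version


def walk_chain(head: Optional[AdditionalVersion]) -> Iterator[AdditionalVersion]:
    """Follow the chain from the newest version toward older ones."""
    node = head
    while node is not None:
        yield node
        node = node.older


# The second (older) version lives in one thread's buffer, the first (newer)
# version in another thread's buffer and points back to the older one.
second = AdditionalVersion("v1", commit_ts=100, owner_buffer="ring buffer 234")
first = AdditionalVersion("v2", commit_ts=200, owner_buffer="ring buffer 232",
                          older=second)

version_chain_locator = first  # the data slot only needs to locate the head
chain = [(v.value, v.owner_buffer) for v in walk_chain(version_chain_locator)]
```

Because each version carries its own link, the chain can cross buffer boundaries without the data slot tracking which thread created which version.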
Priority indicator 246 may also be provided within data slot 140 and may indicate from which data slot field a read should first occur. For example, priority indicator 246 may indicate whether a read should first occur from in-place data field 142 of data slot 140 or the additional versions pointed to by version chain locator 248.
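The fields of data slot 140 discussed so far may be summarized in a small data structure. This is a hedged sketch: the Python names, the enum encodings, and the default values (OCC mode, in-place priority, no version chain) are assumptions and are not specified by the description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class Mode(Enum):
    OCC = "occ"
    MVCC = "mvcc"


class Priority(Enum):
    IN_PLACE = "in_place"
    VERSION_CHAIN = "version_chain"


@dataclass
class DataSlot:
    in_place_value: Any                       # in-place data field 142
    mode: Mode = Mode.OCC                     # mode indicator field 144 (assumed default)
    priority: Priority = Priority.IN_PLACE    # priority indicator 246 (assumed default)
    version_chain: Optional[list] = None      # version chain locator 248

    def first_read_source(self) -> str:
        """Name the field a read should consult first, per the priority indicator."""
        return self.priority.value


slot = DataSlot(in_place_value=42)
before = slot.first_read_source()
slot.priority = Priority.VERSION_CHAIN
after = slot.first_read_source()
```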
A timestamp may be assigned to an additional version of a data value upon committing. Memory 120 may include instructions 226 to assign a commit timestamp to an additional version. Thus, each additional version stored within a per-thread ring buffer may be assigned a commit timestamp indicating the time at which the additional version committed. For example, first additional version 236 may be assigned first commit timestamp 242 and second additional version 238 may be assigned second commit timestamp 244. The assigned timestamps may keep record of when each additional version was committed to the version chain.
Any of the non-transitory computer-readable storage media described herein may include a single medium or multiple media. The non-transitory computer readable storage medium may comprise any electronic, magnetic, optical, or other physical storage device. For example, the non-transitory computer readable storage medium may include random access memory (RAM), static memory, read-only memory, an electrically erasable programmable read-only memory (EEPROM), a hard drive, an optical drive, a storage drive, a CD, a DVD, or the like.
During moments of high contention at a data slot, a concurrency mode of the data slot may be switched from an optimistic concurrency control (OCC) mode to a multi-version concurrency control (MVCC) mode. Instructions 324 may be provided to change a concurrency control mode of the data slot from an OCC mode to an MVCC mode. In an example, the OCC mode aborts a transaction upon detecting a read-write conflict. When in an OCC mode, transactions may be validated before they are committed, and a transaction may be aborted where the transaction is found to be invalid. In an example, the MVCC mode creates a snapshot of the data targeted by the transaction in the form of an additional version of the stored data value upon detecting a read-write conflict such that concurrent transactions may read and/or write to identical values from a given point in time and an illusion of isolation within the system may be maintained.
During moments of low contention at the data slot, a low contention criterion may be satisfied such that the concurrency control mode of the data slot may be switched from MVCC mode to OCC mode. In an example, the satisfaction of the low contention criterion may be determined from a version chain field of a data slot, such as version chain locator 248 of
Instructions 326 may detect that the version chain field is empty, indicating that there are no additional versions of the stored data value stored within the collection of per-thread ring buffers. Where the version chain is empty the low contention criterion may be satisfied. In an example, the low contention criterion may be satisfied where the version chain is made up of fewer than a threshold number of additional versions.
The data slot may be changed from MVCC mode to OCC mode where the version chain of the data slot does not point to any additional version of the stored data value. Instructions 328 are provided to change the concurrency control mode from MVCC mode to OCC mode responsive to detecting that the version chain does not point to any additional version of the stored data value. Accordingly, a non-transitory computer readable medium may be provided to dynamically switch a data slot from OCC mode to MVCC mode during periods of high contention, and from MVCC mode to OCC mode during periods of low contention.
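The adaptive switching between modes may be sketched as a small decision function. The function names, the string encoding of modes, and the threshold parameter are assumptions; a threshold of 1 corresponds to the case in which the criterion is satisfied only by an empty version chain.

```python
from typing import Sequence


def low_contention(version_chain: Sequence, threshold: int = 1) -> bool:
    """Criterion sketch: satisfied when the chain holds fewer than `threshold`
    additional versions (threshold=1 means the chain must be empty)."""
    return len(version_chain) < threshold


def next_mode(current_mode: str, read_write_conflict: bool,
              version_chain: Sequence, threshold: int = 1) -> str:
    """Return the concurrency control mode the data slot should move to."""
    if current_mode == "OCC" and read_write_conflict:
        return "MVCC"  # high contention: stop aborting, start versioning
    if current_mode == "MVCC" and low_contention(version_chain, threshold):
        return "OCC"   # low contention: fall back to the single-version mode
    return current_mode


after_conflict = next_mode("OCC", True, [])
after_quiet_period = next_mode("MVCC", False, [])
still_contended = next_mode("MVCC", False, ["v1", "v2"])
```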
The method may be implemented in the form of executable instructions stored on a computer-readable medium or in the form of electronic circuitry. For example, method 400 may be implemented by executable instructions in a non-transitory computer readable medium as in example
An update of a transaction may be written to an optimistic concurrency control (OCC) buffer in a transaction record when in an OCC mode, and an update of a transaction may be written to a multi-version concurrency control (MVCC) buffer in a transaction record when in an MVCC mode.
OCC buffer 530 may be included in transaction record 510 as well as MVCC buffer 540. A write request of a transaction may write to OCC buffer 530 or to MVCC buffer 540. Illustrated in
In an example, the concurrency control mode of a data slot may determine whether an update is written to OCC buffer 530 or to MVCC buffer 540. For example, when a data slot is in OCC mode, an update to that slot may be written to OCC buffer 530. Conversely, when a data slot is in MVCC mode, an update to that slot may be written to MVCC buffer 540. Accordingly, MVCC buffer 540 may contain updates for data slots under MVCC mode, and OCC buffer 530 may contain updates for data slots under OCC mode.
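The routing of updates by slot mode may be sketched as follows. The class name TransactionRecord matches the description, but its fields, the write method, and the string encoding of modes are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class TransactionRecord:
    # Each buffer holds (slot identifier, new value) pairs.
    occ_buffer: List[Tuple[str, Any]] = field(default_factory=list)
    mvcc_buffer: List[Tuple[str, Any]] = field(default_factory=list)

    def write(self, slot_id: str, slot_mode: str, value: Any) -> None:
        """Place the update in the buffer matching the slot's current mode."""
        if slot_mode == "OCC":
            self.occ_buffer.append((slot_id, value))
        elif slot_mode == "MVCC":
            self.mvcc_buffer.append((slot_id, value))
        else:
            raise ValueError(f"unknown concurrency control mode: {slot_mode}")


record = TransactionRecord()
record.write("slot-a", "OCC", 1)
record.write("slot-b", "MVCC", 2)
record.write("slot-c", "OCC", 3)
```

A single transaction may therefore populate both buffers when it touches slots that are in different modes.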
In an example, the transaction record may be stored within non-volatile memory. In an example, the transaction record is stored in a transaction record buffer, which may be stored in a per-thread ring buffer, such as collection of per-thread ring buffers 230 of
To maintain the atomicity of a transaction and ensure serializability is maintained, a transaction may be both executed and validated prior to committing.
Where the transaction record buffer is not full, or after a full or partial garbage collection at block 606, space within the transaction record buffer may be allocated to the transaction at block 608. In an example, operations in the transaction entering the execution phase may be ordered in a queue, and the transactions may be executed by order of transaction within the queue. At block 610 it may be determined whether the operational queue is empty. A negative determination may lead to any reads or writes of the transaction being executed in program order. Specifically, it may be determined whether an operation is a read request at block 612. If the operation is determined to be a read request, the read is executed at block 614, e.g., using the method of
Before a transaction is committed, it is validated to ensure that any executed operations resulting from the executed transaction are valid and serializability will not be violated if the transaction commits.
Each executed read may be validated at step 704, as will be further described in
At block 710, the committed write requests may be installed sequentially to their respective data slots. In an example, installing the write requests to their respective data slots may occur during a post-commit phase, which is described below in
A transaction may take the form of a read request or a write request and a read request and/or a write request may be executed and validated.
A determination that priority indicator (p) points to the in-place data value leads to block 806, where the in-place data value is read. In an example, an assigned commit timestamp of the in-place data value may also be read and it may be determined at block 808 whether a transaction timestamp is greater than the commit timestamp of the in-place data value. In an example, the transaction timestamp may be a start timestamp of the transaction, such as the start timestamp assigned at block 602 in
Where the transaction timestamp is determined to be greater than the commit timestamp of the in-place data value, the in-place data value may be read as illustrated at block 830. Where the transaction timestamp is not determined to be greater than the commit timestamp of the in-place data value, a read may occur from the version chain at block 810. Specifically, it may be determined at block 812 whether there are any additional committed versions within the version chain that have a commit timestamp that is less than the transaction timestamp. Where there are additional committed versions that have a commit timestamp less than the transaction timestamp, the additional version having the greatest timestamp that is less than the transaction timestamp may be determined at block 814. The determined version may be read at block 830.
It may be determined that there are no committed versions that have a commit timestamp less than the transaction timestamp. This may be because any committed version that had a commit timestamp less than the transaction timestamp may have been overwritten or garbage collected. Where it is determined that there are no additional committed versions within the version chain that have a commit timestamp less than the transaction timestamp, the mode indicator, e.g. in mode indicator field 144 of
Looking back to block 804, it may be determined that the priority indicator (p) does not point to the in-place data value. In an example, p may point to the version chain and not the in-place data value. A read occurs from the version chain at block 816 where it is determined that p points to the version chain. Similar to block 812, it may then be determined at block 818 whether there are any additional versions having a commit timestamp that is less than the transaction timestamp. Similar to block 814, where there are additional committed versions that have a commit timestamp less than the transaction timestamp, the additional committed version having the greatest timestamp that is less than the transaction timestamp may be determined at block 820. The determined version may be read at block 830.
Where it is determined that there are no additional versions having a commit timestamp that is less than the timestamp of the executing transaction, the in-place data value may be read at block 822. Similar to block 808, it may be determined at block 824 whether the transaction timestamp is greater than the commit timestamp of the in-place data value. Where it is determined that the transaction timestamp is greater than the commit timestamp of the in-place data value, the in-place data value may be read at block 830.
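The two read paths described above, in-place first or version chain first, may be sketched together. All names below are assumptions, a version is represented as a (commit timestamp, value) pair, and the case in which no readable version is found is deliberately left as a placeholder that returns None rather than modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Slot:
    in_place_value: str
    in_place_commit_ts: int
    priority_in_place: bool  # True: priority indicator points to the in-place value
    chain: List[Tuple[int, str]] = field(default_factory=list)  # (commit_ts, value)


def read_in_place(slot: Slot, txn_ts: int) -> Optional[str]:
    # Blocks 808 and 824: readable only if the transaction timestamp is greater.
    return slot.in_place_value if txn_ts > slot.in_place_commit_ts else None


def read_from_chain(slot: Slot, txn_ts: int) -> Optional[str]:
    # Blocks 812-814 and 818-820: greatest commit timestamp below txn_ts.
    candidates = [v for v in slot.chain if v[0] < txn_ts]
    return max(candidates, key=lambda v: v[0])[1] if candidates else None


def read(slot: Slot, txn_ts: int) -> Optional[str]:
    """Try the source named by the priority indicator first, then the other."""
    if slot.priority_in_place:
        order = (read_in_place, read_from_chain)
    else:
        order = (read_from_chain, read_in_place)
    for source in order:
        value = source(slot, txn_ts)
        if value is not None:
            return value
    return None  # placeholder: handling of this case is not modeled in the sketch


in_place_first = Slot("new", 50, True, [(10, "old"), (30, "mid")])
chain_first = Slot("base", 5, False, [(10, "a"), (30, "b")])
```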
Where it is determined that the transaction timestamp is not greater than the commit timestamp of the in-place data value, the mode indicator, e.g. in mode indicator field 144 of
As described at
An operation may be executed and then added to a read set, i.e., a set of read operations that are performed by a transaction, and/or write set for validation. In an example, a read operation may be validated in the order in which it was added to the read set. Similarly, a write operation may be validated in the order in which it was added to the write set.
Where it is determined at block 1004 that the version read during the execution phase is not the same as the version read during the validation phase, or where it is determined at block 1005 that the commit timestamp is not greater than the timestamp of the read version, the concurrency mode of the data slot may be changed from optimistic concurrency control (OCC) mode to multi-version concurrency control (MVCC) mode at block 1008. In an example, the version read during the execution phase may not be the same as the version read during the validation phase because of a read-write conflict, that is, because a concurrent transaction may have modified the version read during the execution phase prior to validation. The transaction may be aborted at block 1010 following the concurrency mode change at block 1008. Accordingly, the transaction may be validated, or the concurrency mode of the data slot may be updated provided the transaction is not validated.
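The read validation outcome described above may be sketched as follows. This is one possible reading, stated as an assumption: the "commit timestamp" compared at block 1005 is taken to be that of the validating transaction, and a version is represented as a (commit timestamp, value) pair. The names Slot and validate_read are illustrative.

```python
from dataclasses import dataclass
from typing import Tuple

Version = Tuple[int, str]  # (commit_ts, value), an assumed representation


@dataclass
class Slot:
    mode: str = "OCC"


def validate_read(slot: Slot, version_at_execution: Version,
                  version_at_validation: Version, txn_commit_ts: int) -> bool:
    """Return True if the read validates; otherwise switch the slot to MVCC
    and return False, signalling that the transaction is to be aborted."""
    same_version = version_at_execution == version_at_validation   # block 1004
    commit_after_read = txn_commit_ts > version_at_execution[0]    # block 1005
    if same_version and commit_after_read:
        return True
    slot.mode = "MVCC"  # block 1008: contention detected on this slot
    return False        # block 1010: abort


quiet_slot, contended_slot = Slot(), Slot()
ok = validate_read(quiet_slot, (10, "a"), (10, "a"), txn_commit_ts=20)
conflicted = validate_read(contended_slot, (10, "a"), (15, "b"), txn_commit_ts=20)
```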
A positive determination at block 1104, however, leads to the determination at block 1108 as to whether the data slot associated with the write transaction is in optimistic concurrency control (OCC) mode or multi-version concurrency control (MVCC) mode. The mode of the data slot may be indicated by the mode indicator of the data slot, e.g. in mode indicator field 144 in
A determination that the priority indicator points to the version chain may indicate an erroneous state in the protocol as described in greater detail below, and the transaction may abort at block 1106. Where it is determined that the priority indicator does not point to the version chain, but rather the in-place data value of the data slot, the priority indicator is changed from pointing to the in-place data value to the version chain and the write may be applied to the version chain at block 1114.
Returning to block 1108, it may be determined that the concurrency mode is not in optimistic concurrency control (OCC) mode. For example, it may be determined by the mode indicator of the data slot that the data slot is in multi-version concurrency control (MVCC) mode. Where it is determined at block 1108 that the data slot is not in OCC mode, it is determined at block 1116 whether a low contention criterion is satisfied. In an example, the low contention criterion may be satisfied where the version chain of the data slot is empty. In an example, the low contention criterion may be satisfied where no additional versions of the data slot have been created, or where no additional version of the data slot is currently stored within a transaction record buffer. In an example, the low contention criterion is satisfied where a version chain locator of the data slot, e.g. version chain locator 248 of
A determination that the low contention criterion is satisfied may indicate a period of low contention regarding transactions associated with the data slot. Where the low contention criterion is determined to be satisfied at block 1116, the concurrency mode of the data slot may be switched from MVCC mode to OCC mode at block 1118, followed by a priority indicator read at block 1110, a determination as to whether the priority indicator points to the in-place data value at block 1112, and, depending on the determination, an update to the version chain at block 1114 or an abort of the transaction at block 1106.
Specifically, a determination at block 1112 that the priority indicator does not point to the in-place data value may lead to an abort at block 1106. If prior transactions have successfully committed and the updates and/or writes of the transaction have been installed, then in a period of low contention, a successful transition to OCC mode may be indicated where the priority indicator points to the in-place data value. The priority indicator not pointing to the in-place data value may indicate that the last committed transaction to modify the data slot did not successfully complete installation. In MVCC mode, the latest committed update may be located within the version chain. In OCC mode, however, the latest committed value is expected to be in the in-place value. Thus, in an example, the transaction may abort at block 1106. In another example, a determination at block 1112 that the priority indicator does not point to the in-place data value may be followed by switching the concurrency mode from OCC mode to MVCC mode.
In an example, a determination that the low contention criterion is not satisfied may indicate that the version chain is not empty, that the version chain is made up of more than a threshold number of additional versions, etc. Where the low contention criterion is not satisfied, a new version may be created in the version chain at block 1120. In an example, versions within the version chain are ordered by commit time, and the newly created version may be placed within the version chain according to the commit time of the newly created version. The priority indicator may then be changed to point to the version chain at block 1122.
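The write path from block 1108 onward may be sketched as follows. This is an illustrative reading of the flow rather than the definitive procedure: the names are assumptions, the low contention criterion is taken to be an empty version chain, a version is a (commit timestamp, value) pair, and the abort is modeled as a Python exception.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


class Abort(Exception):
    """Raised where the flow reaches block 1106."""


@dataclass
class Slot:
    mode: str       # "OCC" or "MVCC" (mode indicator)
    priority: str   # "in_place" or "chain" (priority indicator)
    chain: List[Tuple[int, str]] = field(default_factory=list)


def occ_write(slot: Slot, commit_ts: int, value: str) -> None:
    # Blocks 1110-1112: the priority indicator must point to the in-place value.
    if slot.priority != "in_place":
        raise Abort("priority indicator points to the version chain")
    # Block 1114: repoint the indicator and apply the write to the chain.
    slot.priority = "chain"
    slot.chain.append((commit_ts, value))


def apply_write(slot: Slot, commit_ts: int, value: str) -> None:
    if slot.mode == "OCC":            # block 1108
        occ_write(slot, commit_ts, value)
    elif not slot.chain:              # block 1116: low contention satisfied
        slot.mode = "OCC"             # block 1118
        occ_write(slot, commit_ts, value)
    else:
        slot.chain.append((commit_ts, value))   # block 1120
        slot.chain.sort(key=lambda v: v[0])     # keep commit-time order
        slot.priority = "chain"                 # block 1122


busy = Slot("MVCC", "chain", [(10, "a")])
apply_write(busy, 20, "b")

quiet = Slot("MVCC", "in_place")
apply_write(quiet, 30, "c")

stale = Slot("MVCC", "chain")  # empty chain, yet the indicator points to it
try:
    apply_write(stale, 40, "d")
    stale_aborted = False
except Abort:
    stale_aborted = True
```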
Once a transaction is committed, post-commit operations may be executed.
Once the write set of the transaction has been installed, the OCC transaction buffer, i.e., the memory holding data stored when in an optimistic concurrency control (OCC) mode, may be reclaimed. Where the write set is determined to be empty, the OCC buffer is reclaimed at block 1208. In an example, an identification value may be stored within the OCC buffer upon reclamation to indicate the completion of the post-commit phase.
In an example, a transaction may commit and the post-commit phase may fail to partially or fully execute. This may occur, for example, where a thread crashes during the post-commit phase. Where the post-commit phase fails to perform, a modified post-commit phase may be performed. During the modified post-commit phase, it may be determined whether a version within the version chain has a latest commit time, and if so, the in-place data value of the determined version is updated. Subsequent to the update, the priority indicator may also be updated to point to the in-place data value.
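The modified post-commit phase may be sketched as a small recovery routine. The sketch assumes one reading of the paragraph above, namely that the in-place data value is overwritten with the chain version having the latest commit time when that version is newer than the in-place value. The names and the (commit timestamp, value) representation are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Slot:
    in_place_value: str
    in_place_commit_ts: int
    priority: str  # "in_place" or "chain"
    chain: List[Tuple[int, str]] = field(default_factory=list)  # (commit_ts, value)


def recover_post_commit(slot: Slot) -> bool:
    """Install the latest committed chain version in place, if there is one.

    Returns True when the in-place value was updated.
    """
    if not slot.chain:
        return False
    latest_ts, latest_value = max(slot.chain, key=lambda v: v[0])
    if latest_ts <= slot.in_place_commit_ts:
        return False  # the in-place value is already the latest
    slot.in_place_value = latest_value
    slot.in_place_commit_ts = latest_ts
    slot.priority = "in_place"  # reads should now start from the in-place value
    return True


# A thread crashed after committing version 30 but before installing it.
crashed = Slot("old", 10, "chain", [(20, "mid"), (30, "new")])
recovered = recover_post_commit(crashed)
```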
To promote space and computational efficiency, garbage collection may occur within a transaction record such that memory within a transaction record is reclaimed. Garbage collection may occur in the foreground, such that garbage collection is performed by transaction processing threads that have run out of space in their ring buffer, or in the background, such that garbage collection is performed by dedicated garbage collection threads. In an example, the OCC buffer and/or MVCC buffer may be reclaimed periodically, such that an OCC buffer and/or an MVCC buffer may be reclaimed after a specified period of time. In an example, periodic garbage collection occurs in the background, such that garbage collection is performed by garbage collection threads distinct from the transaction processing threads that are dedicated to reclaiming memory. This temporal garbage collection method may be performed at per-thread ring buffers, e.g. first thread ring buffer 232 or second thread ring buffer 234 of
In an example, a transaction processing thread may reclaim space within its per-thread ring buffer where the ring buffer lacks sufficient available space for the transaction processing thread to allocate a new transaction record. In an example, transaction records from the per-thread ring buffer for a transaction processing thread that lacks sufficient space are garbage collected in the foreground. A transaction record may also be reclaimed where the status field of a transaction record, e.g. status field 520 of
The features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or the elements of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or elements are mutually exclusive.
In the foregoing description, numerous details are set forth to provide an understanding of the subject disclosed herein. However, various examples may be practiced without some of these details. Some examples may include modifications and variations from the details discussed above. It is intended that the appended claims cover such modifications and variations.