Concurrency control is a mechanism that manages and coordinates concurrent accesses to a database in a multi-user database management system (DBMS). In multi-user DBMS environments, database updates performed by one user may conflict with database retrievals and updates performed by another. Concurrency control allows multiple users to simultaneously access the same database while preserving the illusion that each user is executing alone on a dedicated system.
The following detailed description references the drawings wherein:
The following detailed description refers to the accompanying drawings. Wherever possible, the same reference numbers are used in the drawings and the following description to refer to the same or similar parts. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only. While several examples are described in this document, modifications, adaptations, and other implementations are possible. Accordingly, the following detailed description does not limit the disclosed examples. Instead, the proper scope of the disclosed examples may be defined by the appended claims.
Example types of concurrency control may include a pessimistic concurrency control type and an optimistic concurrency control type. In pessimistic concurrency control, data may be marked as locked while it is being updated by a user. Other users cannot perform actions that would conflict with the lock until that user commits the update and/or releases the lock. In optimistic concurrency control, before a user commits an update, the system verifies that no other transaction has modified the data being read by the update transaction. If the verification reveals conflicting updates to the same data, the update transaction would be rejected (e.g., the user receives an error). The user may roll back the rejected transaction and start over.
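The two control types described above can be sketched as follows. This is a minimal, single-process illustration, not the implementation of any particular DBMS; the `Record` class, its `version` counter, and the `ConflictError` exception are hypothetical names introduced only for this example.

```python
import threading


class ConflictError(Exception):
    """Raised when an optimistic commit detects a conflicting update."""


class Record:
    """Hypothetical in-memory data object with a version counter."""

    def __init__(self, value):
        self.value = value
        self.version = 0
        self.lock = threading.Lock()


def pessimistic_update(record, new_value):
    # Pessimistic: hold the lock for the whole update, so conflicting
    # writers must wait until the lock is released.
    with record.lock:
        record.value = new_value
        record.version += 1


def optimistic_read(record):
    # Optimistic: read without locking and remember the version that was seen.
    return record.value, record.version


def optimistic_commit(record, new_value, version_seen):
    # Verify at commit time that no other update changed the record since
    # it was read. The lock is held only for this short check-and-write step.
    with record.lock:
        if record.version != version_seen:
            raise ConflictError("record changed since it was read")
        record.value = new_value
        record.version += 1


# Usage: a competing update makes the first optimistic commit fail,
# so the transaction re-reads the record and starts over.
rec = Record(10)
value, seen = optimistic_read(rec)
pessimistic_update(rec, 20)
try:
    optimistic_commit(rec, value + 1, seen)
except ConflictError:
    value, seen = optimistic_read(rec)
    optimistic_commit(rec, value + 1, seen)
```

In this sketch the rejected transaction repeats its read and its write, which is the restart cost that makes the optimistic type unattractive when conflicts are frequent.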
The pessimistic control type assures data integrity at the price of reduced concurrency (e.g., having transactions wait for other transactions' locks to clear) and/or reduced performance (e.g., managing locks). Thus, the optimistic concurrency control type is generally used in environments with low data contention (e.g., less likelihood of update conflicts). When conflicts are rare, transactions can complete without the expense of reduced concurrency and performance. However, when conflicts are frequent (e.g., greater likelihood of update conflicts), the pessimistic control type may be a better fit because the cost of repeatedly restarting transactions would significantly hurt performance. Although a particular type of concurrency control could be selected to handle all of the transactions occurring in a database and/or being initiated by an application, it is technically challenging to dynamically determine a concurrency control type to be used for individual data objects.
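The trade-off above can be made concrete with a toy cost model. The model is an assumption made for illustration only and is not part of the disclosure: every optimistic attempt costs `work_cost` and fails independently with probability `conflict_probability`, while a pessimistic transaction always succeeds on its first attempt but pays an additional `lock_overhead` for lock management and waiting. All function and parameter names are hypothetical.

```python
def expected_optimistic_cost(work_cost, conflict_probability):
    # With independent failures, the expected number of attempts until the
    # first success is 1 / (1 - p) (geometric distribution).
    if not 0.0 <= conflict_probability < 1.0:
        raise ValueError("conflict_probability must be in [0, 1)")
    return work_cost / (1.0 - conflict_probability)


def expected_pessimistic_cost(work_cost, lock_overhead):
    # A single attempt that always commits, plus the cost of locking.
    return work_cost + lock_overhead


def break_even_probability(work_cost, lock_overhead):
    # Solving work_cost / (1 - p) = work_cost + lock_overhead for p gives
    # p = lock_overhead / (work_cost + lock_overhead).
    return lock_overhead / (work_cost + lock_overhead)


# Usage: with these made-up costs, the optimistic type is cheaper only
# while the conflict probability stays below the break-even point.
p_star = break_even_probability(work_cost=10.0, lock_overhead=2.0)
low = expected_optimistic_cost(10.0, 0.05)    # below break-even
high = expected_optimistic_cost(10.0, 0.5)    # above break-even
locked = expected_pessimistic_cost(10.0, 2.0)
```

Under this model, the conflict probability at which the two types cost the same depends only on the ratio of lock overhead to transaction work, which is why a per-object estimate of the conflict probability is useful.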
Examples disclosed herein provide technical solutions to these technical challenges by generating prediction models for concurrency control types. Some of the examples disclosed herein may enable generating a prediction model based on training data. The training data may comprise a set of access data associated with a data object. The set of access data may comprise: values for a set of attributes of the data object, and an indication of whether a conflict occurred during processing of a request to access the data object. In some instances, the set of access data may further include contextual data including, but not being limited to, of access, and geographical location of the user who requests the access. The prediction model may then determine a probability of a conflict occurring during processing of a request to access another data object. Based on the determined probability of the conflict, a concurrency control type to be used to process this request may be determined.
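The final step described above, determining a concurrency control type from a predicted conflict probability, can be sketched as a simple threshold rule. The function name and the default threshold of 0.5 are assumptions made for illustration; the disclosure does not specify how the probability is mapped to a type.

```python
def select_concurrency_control(conflict_probability, threshold=0.5):
    """Return a concurrency control type for a predicted conflict probability.

    The threshold is a hypothetical tuning parameter, not a value taken
    from the disclosure.
    """
    if not 0.0 <= conflict_probability <= 1.0:
        raise ValueError("conflict_probability must be in [0, 1]")
    # Likely conflicts favor locking; unlikely conflicts favor the
    # lock-free optimistic path.
    if conflict_probability >= threshold:
        return "pessimistic"
    return "optimistic"


# Usage
rare = select_concurrency_control(0.1)
frequent = select_concurrency_control(0.8)
```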
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. The term “plurality,” as used herein, is defined as two or more than two. The term “another,” as used herein, is defined as at least a second or more. The term “coupled,” as used herein, is defined as connected, whether directly without any intervening elements or indirectly with at least one intervening element, unless otherwise indicated. Two elements can be coupled mechanically, electrically, or communicatively linked through a communication channel, pathway, network, or system. The term “and/or” as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will also be understood that, although the terms first, second, third, etc. may be used herein to describe various elements, these elements should not be limited by these terms, as these terms are only used to distinguish one element from another unless stated otherwise or the context indicates otherwise. As used herein, the term “includes” means includes but is not limited to, and the term “including” means including but not limited to. The term “based on” means based at least in part on.
The various components (e.g., components 129, 130, and/or 140) depicted in
Prediction model system 110 may comprise an access data engine 121, a prediction model engine 122, a request process engine 123, and/or other engines. The term “engine”, as used herein, refers to a combination of hardware and programming that performs a designated function. As is illustrated with respect to
Access data engine 121 may identify access data associated with various data objects. For example, access data engine 121 may identify a first set of access data associated with a first data object. The first set of access data may comprise: values for a first set of attributes of the first data object, a concurrency control type that was used to process a request to access the first data object, and/or an indication of whether a conflict occurred during processing of the request to access the first data object.
A “request to access” a particular data object, as used herein, may include a request to read, delete, and/or update the data object and/or other database transaction related to the data object.
A “concurrency control type that was used to process a request to access a particular data object,” as used herein, may comprise a pessimistic concurrency control type, an optimistic concurrency control type, and/or other types of concurrency control. For example, in pessimistic concurrency control, data may be marked as locked while it is being updated by a user. Other users cannot perform actions that would conflict with the lock until that user commits the update and/or releases the lock. In optimistic concurrency control, before a user commits an update, the system verifies that no other transaction has modified the data being read by the update transaction. If the verification reveals conflicting updates to the same data, the update transaction would be rejected (e.g., the user receives an error). The user may roll back the rejected transaction and start over. Note that although two concurrency control types (e.g., pessimistic and optimistic types) are discussed in the examples described herein (e.g., examples illustrated in
An “indication of a conflict that occurred,” as used herein, may comprise: a first indication that no occurrence of a conflict was detected during processing of a request to access a particular data object (e.g., denoted by “None”), a second indication that an occurrence of a conflict was detected during the processing (e.g., denoted by “Conflict”), and/or other indications of a conflict. In some implementations, the second indication may indicate a type of a conflict that occurred: that the conflict occurred due to a locking of that particular object (e.g., denoted by “Lock”), that the conflict occurred due to a rejection of an update to the particular data object (e.g., denoted by “Reject”), and/or other types of a conflict.
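One way to represent a set of access data as defined above is a small record type. The sketch below is illustrative only: the class and field names, and the example attribute names `object_type` and `size`, are hypothetical, while the indication labels “None”, “Conflict”, “Lock”, and “Reject” follow the definitions above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class ControlType(Enum):
    PESSIMISTIC = "pessimistic"
    OPTIMISTIC = "optimistic"


class ConflictIndication(Enum):
    NONE = "None"          # no conflict was detected
    CONFLICT = "Conflict"  # a conflict was detected, type not recorded
    LOCK = "Lock"          # conflict due to a locking of the data object
    REJECT = "Reject"      # conflict due to a rejected update

    @property
    def is_conflict(self) -> bool:
        return self is not ConflictIndication.NONE


@dataclass(frozen=True)
class AccessRecord:
    """Hypothetical container for one set of access data."""

    object_id: str
    attributes: Dict[str, str]            # values for the object's attributes
    control_type: Optional[ControlType]   # type used to process the request
    indication: ConflictIndication        # whether a conflict occurred


# Usage: an update that was rejected under optimistic concurrency control.
record = AccessRecord(
    object_id="1",
    attributes={"object_type": "invoice", "size": "large"},
    control_type=ControlType.OPTIMISTIC,
    indication=ConflictIndication.REJECT,
)
```

Collapsing the indication to a boolean with `is_conflict` keeps the record usable both for models that only predict whether a conflict occurs and for models that also predict its type.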
Suppose that an example dataset as illustrated in
The access data of various data objects like the example dataset shown in
Prediction model engine 122 may generate, using a machine-learning algorithm, a prediction model based on the training data that includes the access data (e.g., identified by access data engine 121). Any machine-learning algorithm known in the art may be used to generate a prediction model. The prediction model may be trained to recognize patterns in the training data. In some implementations, a trained prediction model may be tested using test data to ensure that its output is validated within an acceptable margin of error. A properly trained and/or tested prediction model may use the knowledge discovered from the training data to predict an output (e.g., an indication of whether a conflict is predicted to occur) given a new data object and/or given at least a portion of its access data (e.g., values for a set of attributes of the new data object). Prior to processing a request to access a new data object, the new data object may be identified and/or submitted to the prediction model to predict an indication of whether a conflict is predicted to occur. Based on this indication of a predicted conflict, prediction model engine 122 may determine a concurrency control type to be used to process the request to access that new data object.
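As a minimal sketch of generating such a prediction model, the example below trains a naive Bayes classifier over categorical attribute values. Naive Bayes is chosen here only because it fits in a few lines of standard-library Python; the description above does not prescribe a particular algorithm. The class name, the attribute names `type` and `region`, and the training records are made up for illustration.

```python
import math
from collections import Counter


class ConflictModel:
    """Naive Bayes over categorical attributes; the label is True for a conflict."""

    def __init__(self):
        self.label_counts = Counter()                    # label -> records seen
        self.counts = {True: Counter(), False: Counter()}  # (attr, value) counts
        self.values = {}                                 # attr -> set of values seen

    def fit(self, records):
        for attributes, conflict in records:
            label = bool(conflict)
            self.label_counts[label] += 1
            for name, value in attributes.items():
                self.counts[label][(name, value)] += 1
                self.values.setdefault(name, set()).add(value)
        return self

    def conflict_probability(self, attributes):
        total = sum(self.label_counts.values())
        log_scores = {}
        for label in (True, False):
            n_label = self.label_counts[label]
            # Laplace-smoothed prior and per-attribute likelihoods.
            score = math.log((n_label + 1) / (total + 2))
            for name, value in attributes.items():
                n_values = len(self.values.get(name, set()) | {value})
                seen = self.counts[label][(name, value)]
                score += math.log((seen + 1) / (n_label + n_values))
            log_scores[label] = score
        # Normalize the two scores into a probability of conflict.
        top = max(log_scores.values())
        exp_true = math.exp(log_scores[True] - top)
        exp_false = math.exp(log_scores[False] - top)
        return exp_true / (exp_true + exp_false)


# Usage with made-up training data: (attribute values, conflict occurred?)
training = [
    ({"type": "order", "region": "EU"}, True),
    ({"type": "order", "region": "US"}, True),
    ({"type": "profile", "region": "EU"}, False),
    ({"type": "profile", "region": "US"}, False),
]
model = ConflictModel().fit(training)
p_order = model.conflict_probability({"type": "order", "region": "EU"})
p_profile = model.conflict_probability({"type": "profile", "region": "EU"})
```

In this toy data only the `type` attribute separates conflicting from non-conflicting accesses, so the model assigns a higher conflict probability to a new `order` object than to a new `profile` object regardless of region.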
An “indication of a conflict that is predicted to occur,” as used herein, may comprise: a first indication that no conflict is predicted to occur during processing of a request to access a particular data object (e.g., denoted by “None”), a second indication that a conflict is predicted to occur during the processing (e.g., denoted by “Conflict”), and/or other indications of a conflict. In some implementations, the second indication may indicate a type of a conflict that is predicted to occur: that the conflict is predicted to occur due to a locking of that particular object (e.g., denoted by “Lock”), that the conflict is predicted to occur due to a rejection of an update to the particular data object (e.g., denoted by “Reject”), and/or other types of a conflict.
For example, suppose that an example dataset as illustrated in
In the example illustrated in
With respect to Data Object ID #13 in
Request process engine 123 may process a request to access a data object. For example, request process engine 123 may read, delete, and/or update the data object according to the request. The processing may result in the requested transaction being successfully committed or in the requested transaction failing to commit (e.g., because a conflict occurred). During processing of a request to access a data object, request process engine 123 may use a particular concurrency control type that has been selected, determined, and/or suggested by prediction model engine 122 for that particular data object. Returning to the example illustrated in
In some implementations, after applying the concurrency control type that was selected, determined, and/or suggested by prediction model engine 122 to process a request to access a new data object, prediction model engine 122 may monitor for an indication of whether a conflict actually occurred or not during the processing of the request to access the new data object. Returning to the example illustrated in
In performing their respective functions, engines 121-123 may access data storage 129 and/or other suitable database(s). Data storage 129 may represent any memory accessible to prediction model system 110 that can be used to store and retrieve data. Data storage 129 and/or other database may comprise random access memory (RAM), read-only memory (ROM), electrically-erasable programmable read-only memory (EEPROM), cache memory, floppy disks, hard disks, optical disks, tapes, solid state drives, flash drives, portable compact disks, and/or other storage media for storing computer-executable instructions and/or data. Prediction model system 110 may access data storage 129 locally or remotely via network 50 or other networks.
Data storage 129 may include a database to organize and store data. The database may reside in a single or multiple physical device(s) and in a single or multiple physical location(s). The database may store a plurality of types of data and/or files and associated data or file description, administrative information, or any other data.
In the foregoing discussion, engines 121-123 were described as combinations of hardware and programming. Engines 121-123 may be implemented in a number of fashions. Referring to
In
Referring to
In
Machine-readable storage medium 310 (or machine-readable storage medium 410) may be any electronic, magnetic, optical, or other physical storage device that contains or stores executable instructions. In some implementations, machine-readable storage medium 310 (or machine-readable storage medium 410) may be a non-transitory storage medium, where the term “non-transitory” does not encompass transitory propagating signals. Machine-readable storage medium 310 (or machine-readable storage medium 410) may be implemented in a single device or distributed across devices. Likewise, processor 311 (or processor 411) may represent any number of processors capable of executing instructions stored by machine-readable storage medium 310 (or machine-readable storage medium 410). Processor 311 (or processor 411) may be integrated in a single device or distributed across devices. Further, machine-readable storage medium 310 (or machine-readable storage medium 410) may be fully or partially integrated in the same device as processor 311 (or processor 411) or it may be separate but accessible to that device and processor 311 (or processor 411).
In one example, the program instructions may be part of an installation package that when installed can be executed by processor 311 (or processor 411) to implement prediction model system 110. In this case, machine-readable storage medium 310 (or machine-readable storage medium 410) may be a portable medium such as a floppy disk, CD, DVD, or flash drive or a memory maintained by a server from which the installation package can be downloaded and installed. In another example, the program instructions may be part of an application or applications already installed. Here, machine-readable storage medium 310 (or machine-readable storage medium 410) may include a hard disk, optical disk, tapes, solid state drives, RAM, ROM, EEPROM, or the like.
Processor 311 may be at least one central processing unit (CPU), microprocessor, and/or other hardware device suitable for retrieval and execution of instructions stored in machine-readable storage medium 310. Processor 311 may fetch, decode, and execute program instructions 321, and/or other instructions. As an alternative or in addition to retrieving and executing instructions, processor 311 may include at least one electronic circuit comprising a number of electronic components for performing the functionality of instructions 321, and/or other instructions.
Processor 411 may be at least one central processing unit (CPU), microprocessor, and/or other hardware device suitable for retrieval and execution of instructions stored in machine-readable storage medium 410. Processor 411 may fetch, decode, and execute program instructions 421-423, and/or other instructions. As an alternative or in addition to retrieving and executing instructions, processor 411 may include at least one electronic circuit comprising a number of electronic components for performing the functionality of at least one of instructions 421-423, and/or other instructions.
In block 521, method 500 may include identifying a first set of access data associated with a first data object. The first set of access data may comprise: values for a first set of attributes of the first data object, and an indication of whether a conflict occurred during processing of a request to access the first data object. Referring back to
In block 522, method 500 may include identifying a second set of access data associated with a second data object. The second set of access data may comprise: values for a second set of attributes of the second data object, and an indication of whether a conflict occurred during processing of a request to access the second data object. Referring back to
In block 523, method 500 may include generating, using a machine-learning algorithm, a prediction model based on training data that includes the first and second sets of access data. Referring back to
In block 621, method 600 may include identifying a first set of access data associated with a first data object. The first set of access data may comprise: values for a first set of attributes of the first data object, and an indication of whether a conflict occurred during processing of a request to access the first data object. Referring back to
In block 622, method 600 may include identifying a second set of access data associated with a second data object. The second set of access data may comprise: values for a second set of attributes of the second data object, and an indication of whether a conflict occurred during processing of a request to access the second data object. Referring back to
In block 623, method 600 may include generating, using a machine-learning algorithm, a prediction model based on training data that includes the first and second sets of access data. Referring back to
In block 624, method 600 may include determining, using the prediction model, an indication of whether a conflict is predicted to occur during processing of a request to access a third data object. Referring back to
In block 625, method 600 may include processing the request to access the third data object using the concurrency control type. Referring back to
In block 626, method 600 may include identifying an indication of whether a conflict occurred during processing of the request to access the third data object. Referring back to
In block 627, method 600 may include updating the prediction model based on the training data that includes a third set of access data associated with the third data object. The third set of access data may comprise: values for a third set of attributes of the third data object, and the indication of whether the conflict occurred during processing of the request to access the third data object. Referring back to
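Blocks 624 through 627 of method 600 form a feedback loop, which can be sketched as follows. This is an illustrative sketch under stated assumptions: `FrequencyModel` is a hypothetical stand-in for the prediction model (a smoothed conflict rate per attribute combination), `process` is a caller-supplied callable that returns whether a conflict occurred, and the threshold of 0.5 is an arbitrary choice.

```python
from collections import Counter


class FrequencyModel:
    """Hypothetical stand-in model: conflict rate per attribute combination."""

    def __init__(self):
        self.seen = Counter()
        self.conflicts = Counter()

    @staticmethod
    def _key(attributes):
        return tuple(sorted(attributes.items()))

    def conflict_probability(self, attributes):
        key = self._key(attributes)
        # Laplace smoothing: 0.5 when nothing is known about these attributes.
        return (self.conflicts[key] + 1) / (self.seen[key] + 2)

    def update(self, attributes, conflict_occurred):
        key = self._key(attributes)
        self.seen[key] += 1
        if conflict_occurred:
            self.conflicts[key] += 1


def handle_request(model, attributes, process, threshold=0.5):
    # Block 624: predict whether a conflict is likely for this data object.
    probability = model.conflict_probability(attributes)
    control_type = "pessimistic" if probability > threshold else "optimistic"
    # Block 625: process the request using the selected control type.
    conflict_occurred = process(control_type)
    # Blocks 626-627: record the observed outcome and update the model.
    model.update(attributes, conflict_occurred)
    return control_type, conflict_occurred


# Usage: in this made-up scenario, conflicts surface only on the optimistic path.
model = FrequencyModel()
attrs = {"type": "order"}
first = handle_request(model, attrs, lambda control_type: control_type == "optimistic")
second = handle_request(model, attrs, lambda control_type: control_type == "optimistic")
```

After the first request conflicts under the optimistic type, the updated model raises its estimate for that attribute combination and the second request is processed pessimistically.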
The foregoing disclosure describes a number of example implementations for prediction models for concurrency control types. The disclosed examples may include systems, devices, computer-readable storage media, and methods for prediction models for concurrency control types. For purposes of explanation, certain examples are described with reference to the components illustrated in
Further, all or part of the functionality of illustrated elements may co-exist or be distributed among several geographically dispersed locations. Moreover, the disclosed examples may be implemented in various environments and are not limited to the illustrated examples. Further, the sequence of operations described in connection with
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2015/061651 | 11/19/2015 | WO | 00 |