The present invention relates to the field of computer technologies, and in particular, to a data persistence processing method and apparatus, and a database system.
Compared with a disk, a memory provides higher throughput and a quicker response. A database system therefore generally stores data in a memory preferentially, for example data whose reading and writing is complex, so as to speed up data reading and writing and to implement caching. A database system generally uses a page as the unit of caching. When a process modifies data in a cache, the kernel marks the page as a dirty page, and the database system writes the data of the dirty page to a disk at a proper time, so that the data in the cache and the data in the disk remain consistent.
A checkpoint mechanism allows a database to recover after a fault occurs. A traditional checkpoint mechanism, also called a full checkpoint mechanism, transfers all dirty pages in a checkpoint queue to a disk at one time. When this mechanism is used for data persistence processing, the entire checkpoint queue must be locked for the whole period of data persistence processing to ensure consistency between data in a memory and data in a disk. In other words, normal transaction operations of a user are blocked for a relatively long period.
To overcome the disadvantage that the traditional full checkpoint mechanism affects execution of normal transactions, a mechanism called “fuzzy checkpoint” has been put forward. A fuzzy checkpoint mechanism aims to copy generated dirty pages to a disk step by step, thereby reducing the impact of data persistence processing on normal transaction operations of a user. However, the prior art lacks an effective solution for specifically implementing the fuzzy checkpoint mechanism.
Embodiments of the present invention provide a data persistence processing method and apparatus, and a database system, so as to improve dumping efficiency of dirty pages to a certain extent.
According to one aspect, an embodiment of the present invention provides a data persistence processing method, including: adding, to a checkpoint queue each time when a dirty page is generated in a database system memory, a page identifier respectively corresponding to each generated dirty page; determining an active group and a current group in the checkpoint queue, where the page identifiers that are in the checkpoint queue and are respectively corresponding to multiple dirty pages to be currently dumped to a disk form the active group, and a group inserted with a dirty page that is newly added in the checkpoint queue is the current group; successively dumping, to a data file of the disk, each dirty page that is corresponding to each page identifier included in the active group on a preset checkpoint occurrence occasion; and determining a next active group in the checkpoint queue if dumping of the dirty pages related to the active group is completed, and successively dumping, to a data file of the disk, each dirty page that is corresponding to each page identifier included in the next active group on the checkpoint occurrence occasion.
According to another aspect, an embodiment of the present invention further provides a data persistence processing apparatus, including: a checkpoint queue maintaining unit configured to add, to a checkpoint queue each time when a dirty page is generated in a database system memory, a page identifier respectively corresponding to each generated dirty page; a group processing unit configured to determine an active group and a current group in the checkpoint queue, where the page identifiers that are in the checkpoint queue and are respectively corresponding to multiple dirty pages to be currently dumped to a disk form the active group; and a group inserted with a dirty page that is newly added in the checkpoint queue is the current group; and a dirty page bulk dumping unit configured to successively dump, to a data file of the disk, each dirty page that is corresponding to each page identifier included in the active group on a preset checkpoint occurrence occasion; the group processing unit is further configured to determine a next active group in the checkpoint queue if dumping of the dirty pages related to the active group is completed; and the dirty page bulk dumping unit is further configured to successively dump, to a data file of the disk, each dirty page that is corresponding to each page identifier included in the next active group on the checkpoint occurrence occasion.
According to still another aspect, an embodiment of the present invention further provides a database system, including: a disk file, a memory database, and a database management system, where the database management system is configured to manage data stored in the memory database; the database management system includes the foregoing data persistence processing apparatus; and the data persistence processing apparatus is configured to dump the data stored in the memory database to the disk file.
In the data persistence processing method and apparatus, and the database system provided by the embodiments of the present invention, a checkpoint queue is maintained dynamically. Page identifiers that are in the checkpoint queue and correspond to multiple dirty pages to be currently dumped to a disk are used as an active group, and the group into which a newly added dirty page is inserted in the checkpoint queue is used as a current group. On each checkpoint occurrence occasion, the dirty pages corresponding to the page identifiers included in the active group are successively dumped to a data file of the disk. After dumping of those dirty pages is completed, a next active group is determined in the checkpoint queue, so that each dirty page corresponding to a page identifier included in the next active group is successively dumped to a data file of the disk on a next checkpoint occurrence occasion. By performing this processing cyclically, dirty pages are dumped to the disk in groups and in bulk according to the checkpoint occurrence occasion, which improves dumping efficiency of the dirty pages while keeping the impact of the dumping on normal transaction operations small.
To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments. The accompanying drawings in the following description show merely some embodiments of the present invention, and persons of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the following clearly describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. The described embodiments are a part rather than all of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative efforts shall fall within the protection scope of the present invention.
11: Add, to a checkpoint queue each time when a dirty page is generated in a database system memory, a page identifier respectively corresponding to each generated dirty page.
A checkpoint queue is dynamically maintained in the database system, where the checkpoint queue is used for caching the page identifier corresponding to each dirty page generated in the database system memory. Each time a dirty page is generated in the database system memory, the page identifier corresponding to the dirty page may be added to the checkpoint queue, so that page identifiers are queued in the time sequence in which the dirty pages are generated. After data of the dirty page corresponding to any page identifier included in the checkpoint queue is dumped from a memory to a data file of a disk, the page identifier of the dirty page is automatically deleted from the checkpoint queue.
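The queue maintenance described above can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the class name `CheckpointQueue` and its method names are hypothetical, and the rule that a page dirtied again while already queued keeps its original position is an assumption that the description does not state.

```python
from collections import OrderedDict


class CheckpointQueue:
    """Holds page identifiers of dirty pages in the order the pages became dirty."""

    def __init__(self):
        # An OrderedDict keeps insertion order and allows O(1) lookup and deletion.
        self._ids = OrderedDict()

    def on_page_dirtied(self, page_id):
        # Assumption: a page that is already queued keeps its original position.
        if page_id not in self._ids:
            self._ids[page_id] = None

    def on_page_dumped(self, page_id):
        # Called once the page's data has been written to the data file of the disk.
        self._ids.pop(page_id, None)

    def page_ids(self):
        return list(self._ids)


queue = CheckpointQueue()
for pid in (7, 3, 7, 9):
    queue.on_page_dirtied(pid)
queue.on_page_dumped(3)
print(queue.page_ids())  # [7, 9]
```

Keeping only page identifiers in the queue, rather than page contents, means the queue stays small regardless of page size.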
12: Determine an active group and a current group in the checkpoint queue, and successively dump each dirty page that is corresponding to each page identifier included in the active group to a data file of the disk on a preset checkpoint occurrence occasion, where the page identifiers that are in the checkpoint queue and are respectively corresponding to multiple dirty pages to be currently dumped to the disk form the active group; and a group inserted with a dirty page that is newly added in the checkpoint queue is the current group.
The page identifiers included in the checkpoint queue may be grouped, so as to implement dumping of the dirty pages in groups and in bulk. For example, the page identifiers that are in the checkpoint queue and correspond to the dirty pages that currently need to be dumped to the disk form the active group, and the group into which a newly added dirty page is inserted in the checkpoint queue is the current group. In an optional implementation manner, each page identifier included in the active group may be marked with an active group identifier. After this processing, the page identifiers included in the checkpoint queue fall into two types: page identifiers marked with the active group identifier, namely the page identifiers included in the active group, whose dirty pages currently need to be dumped from the memory to the disk; and page identifiers without the active group identifier, namely all other page identifiers in the checkpoint queue. After the active group is determined, an optional example of the current group in the checkpoint queue is shown in
After the current active group is determined, the dirty pages corresponding to each page identifier included in the active group may be successively dumped in the data file of the disk on the checkpoint occurrence occasion, where the checkpoint occurrence occasion may be pre-determined. For example, the checkpoint occurrence occasion may be determined from a perspective of an atomic operation, so as to reduce impact of a checkpoint mechanism on a normal transaction operation.
After the dirty page corresponding to any page identifier is dumped to the data file of the disk, the page identifier may be automatically deleted from the checkpoint queue, that is, the page identifier is automatically deleted from the active group.
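Step 12 can be illustrated with a short sketch. It assumes, purely for illustration, that the active group is the first `active_size` identifiers at the head of the queue, that the queue is a plain list, and that the memory and the data file are dictionaries keyed by page identifier; the function names are hypothetical.

```python
def split_groups(queue, active_size):
    """Return (active_group, current_group) for a list of page identifiers."""
    return queue[:active_size], queue[active_size:]


def dump_active_group(queue, active_group, memory, data_file):
    """Write each active page to the data file and delete its identifier."""
    for page_id in list(active_group):
        data_file[page_id] = memory[page_id]  # stand-in for a write to disk
        queue.remove(page_id)                 # identifier leaves the checkpoint queue
        active_group.remove(page_id)          # and therefore the active group


memory = {1: b"a", 2: b"b", 3: b"c", 4: b"d"}
queue = [1, 2, 3, 4]
data_file = {}
active, current = split_groups(queue, active_size=3)
dump_active_group(queue, active, memory, data_file)
print(queue, sorted(data_file))  # [4] [1, 2, 3]
```

Only the identifiers in the active group are touched by the dump; identifiers in the current group stay queued for a later checkpoint occurrence occasion.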
13: Determine a next active group in the checkpoint queue if dumping of the dirty pages related to the active group is completed, and successively dump each dirty page that is corresponding to each page identifier included in the next active group to a data file of the disk on the checkpoint occurrence occasion.
After the dirty pages corresponding to each page identifier included in the active group are dumped to the data file of the disk, the next active group may be determined in the checkpoint queue, that is, the remaining page identifiers in the checkpoint queue are regrouped, of which an example is shown in
If the number of remaining page identifiers in the checkpoint queue is smaller than the number of page identifiers preset for the active group, all the remaining page identifiers in the checkpoint queue may be grouped into the active group. For example, as shown in
After the next active group is determined, the dirty pages corresponding to each page identifier included in the active group may be dumped to the data file of the disk on a new checkpoint occurrence occasion. In addition, a page identifier of a new dirty page generated in the memory after grouping is added to the current group. The specific implementation manner is similar to step 12, and details are not described herein again.
If the number of remaining page identifiers in the checkpoint queue is 0, that is, the checkpoint queue is empty, the foregoing steps 12 and 13 are not executed; when a new page identifier is added to the checkpoint queue and a new checkpoint occurrence occasion arrives, the foregoing steps 12 and 13 are executed again.
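The regrouping rules of step 13 (a full group, a short group, and an empty queue) can be sketched as follows. The function names and the parameter `preset_size` are hypothetical, and each loop iteration stands in for one checkpoint occurrence occasion.

```python
def next_active_group(queue, preset_size):
    """Choose the next active group from the head of the checkpoint queue."""
    if len(queue) < preset_size:
        # Fewer identifiers remain than the preset size: take all of them.
        return list(queue)
    return queue[:preset_size]


def run_checkpoints(queue, preset_size, occasions):
    """Simulate several checkpoint occurrence occasions; return the dumped groups."""
    dumped_groups = []
    for _ in range(occasions):
        active = next_active_group(queue, preset_size)
        if not active:
            continue  # empty queue: steps 12 and 13 are skipped on this occasion
        dumped_groups.append(active)
        del queue[:len(active)]  # identifiers are deleted once their pages are dumped
    return dumped_groups


print(run_checkpoints([1, 2, 3, 4, 5], preset_size=3, occasions=3))
# [[1, 2, 3], [4, 5]]
```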
In the data persistence processing method provided by the embodiment of the present invention, a checkpoint queue is maintained dynamically. Page identifiers that are in the checkpoint queue and correspond to multiple dirty pages to be currently dumped to a disk are used as an active group, and the group into which a newly added dirty page is inserted in the checkpoint queue is a current group. On each checkpoint occurrence occasion, the dirty pages corresponding to the page identifiers included in the active group are successively dumped to a data file of the disk. After dumping of those dirty pages is completed, a next active group is determined in the checkpoint queue, so that each dirty page corresponding to a page identifier included in the next active group is successively dumped to a data file of the disk on a next checkpoint occurrence occasion. By performing this processing cyclically, dirty pages are dumped to the disk in groups and in bulk according to the checkpoint occurrence occasion, which improves dumping efficiency of the dirty pages while keeping the impact of the dumping on normal transaction operations small.
On the basis of the foregoing technical solutions, optionally, if it is determined that a dirty page corresponding to any page identifier included in the checkpoint queue needs to be modified, whether that page identifier belongs to the active group is determined. If the page identifier belongs to the active group, a mirrored page of the corresponding dirty page is created before the dirty page is dumped to the data file of the disk; if the page identifier does not belong to the active group, no mirrored page is created. After the mirrored page is created, when it is time to dump the dirty page corresponding to the page identifier, the mirrored page corresponding to the page identifier is dumped to the data file of the disk. With this processing, a mirrored page does not need to be created for every dirty page whose page identifier is in the checkpoint queue; a mirrored page is created only for a page whose identifier is in the active group and that is determined for modification. This reduces the memory space required for creating mirrored pages while ensuring consistency between data in the memory and data in the disk.
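The mirrored-page rule can be sketched as a small copy-before-modify buffer pool. The class `BufferPool` and its fields are hypothetical; creating the mirror only on the first modification is an assumption, and re-queuing a page that was modified while it was in the active group is outside the scope of this sketch.

```python
class BufferPool:
    def __init__(self, pages, active_group):
        self.pages = dict(pages)         # page_id -> data currently in memory
        self.active = set(active_group)  # identifiers waiting to be dumped
        self.mirrors = {}                # page_id -> copy taken before modification

    def modify(self, page_id, new_data):
        # A mirrored page is created only for pages in the active group.
        if page_id in self.active and page_id not in self.mirrors:
            self.mirrors[page_id] = self.pages[page_id]
        self.pages[page_id] = new_data

    def dump(self, page_id, data_file):
        # If a mirrored page exists, it is the mirror that goes to the data file.
        data_file[page_id] = self.mirrors.pop(page_id, self.pages[page_id])
        self.active.discard(page_id)


pool = BufferPool({1: b"old1", 2: b"old2"}, active_group=[1])
pool.modify(1, b"new1")  # page 1 is in the active group: mirror created
pool.modify(2, b"new2")  # page 2 is not: no mirror
data_file = {}
pool.dump(1, data_file)
print(data_file[1], pool.pages[1])  # b'old1' b'new1'
```

Because only active-group pages are ever mirrored, the number of extra page copies in memory is bounded by the size of the active group.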
On the basis of the foregoing technical solutions, optionally, an atomic operation may relate to a plurality of dirty pages, and an active group may include dirty pages related to a plurality of atomic operations. Before the dirty pages corresponding to each page identifier included in the active group are dumped to the data file of the disk, a log that is of each atomic operation associated with the active group and buffered in a log buffer area of the memory may be dumped to a log file of the disk. For example, an atomic operation associated with each page identifier included in the current active group is determined; an address of each log buffer area associated with the determined atomic operation is acquired from the log buffer area of the database system memory; and a log cached at the acquired address of each log buffer area is dumped to the log file of the disk. After dumping of a corresponding log is completed, the dirty pages corresponding to each page identifier included in the active group are then dumped to the data file of the disk.
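The ordering described above, logs first and pages second, can be sketched as follows. The log buffer area is modelled, as an assumption, by a dictionary that maps an atomic operation identifier to its buffered log records (standing in for the addresses of log buffer areas); all names are hypothetical.

```python
def flush_logs_then_pages(active_group, page_to_ops, log_buffer,
                          log_file, memory, data_file):
    """Dump the logs of the associated atomic operations, then the dirty pages."""
    # 1. Determine the atomic operations associated with the active group.
    ops = sorted({op for pid in active_group for op in page_to_ops[pid]})
    # 2. Move their buffered logs to the log file of the disk first.
    for op in ops:
        log_file.extend(log_buffer.pop(op, []))
    # 3. Only then dump the dirty pages to the data file of the disk.
    for pid in active_group:
        data_file[pid] = memory[pid]
    return ops


page_to_ops = {1: ["op_a"], 2: ["op_a", "op_b"]}
log_buffer = {"op_a": ["a1", "a2"], "op_b": ["b1"], "op_c": ["c1"]}
log_file, data_file = [], {}
ops = flush_logs_then_pages([1, 2], page_to_ops, log_buffer,
                            log_file, {1: b"p1", 2: b"p2"}, data_file)
print(ops, log_file)  # ['op_a', 'op_b'] ['a1', 'a2', 'b1']
```

Logs of atomic operations that are not associated with the active group (`op_c` here) stay in the log buffer area.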
Further, optionally, after dumping of the dirty pages of the active group is completed and a next active group is determined, a log-file starting point of each atomic operation that is associated with a page identifier included in the next active group is acquired. The log-file starting point of an atomic operation indicates the storage location, in the log file, of the log that is generated when the atomic operation starts running; the logs included in the log file are stored in a time sequence. The minimum value among the acquired log-file starting points is set as a database recovery point. The database recovery point indicates the position in the log file from which the logs required for recovery are read if the database system encounters a fault before it completes dumping, to the disk, the dirty pages corresponding to the page identifiers included in the next active group. With this processing, the logs required for database recovery may be determined quickly according to the recovery point, thereby improving the speed of database system recovery. For example, in
It should be noted that, to make the description brief, the foregoing method embodiments are described as a series of action combinations. However, persons skilled in the art should understand that the present invention is not limited to the described sequence of the actions, because some steps may be performed in other order or simultaneously according to the present invention. In addition, persons skilled in the art should also understand that the embodiments described in the specification all belong to exemplary embodiments and the involved actions and modules are not necessarily mandatory to the present invention.
In the foregoing embodiments, description of each embodiment has its emphasis. For a part that is not described in detail in a certain embodiment, reference may be made to related description in another embodiment.
Persons of ordinary skill in the art may understand: all or a part of the steps of the foregoing method embodiments may be implemented by a program instructing relevant hardware. The foregoing program may be stored in a computer readable storage medium. When the program runs, the steps included in the foregoing method embodiments are performed. The foregoing storage medium includes any medium that can store program code, such as a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The group processing unit 42 may be further configured to determine a next active group in the checkpoint queue if dumping of the dirty pages related to the active group is completed.
The dirty page bulk dumping unit 43 may be further configured to successively dump, to a data file of the disk, each dirty page that is corresponding to each page identifier included in the next active group on the checkpoint occurrence occasion.
To ensure coherence of atomic operations running in the database system memory, the checkpoint occurrence occasion includes: a time at which no atomic operation is currently running in the database system memory.
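One possible reading of this condition, a checkpoint proceeds only while no atomic operation is running, can be sketched with a counter guarded by a lock. The class `AtomicOperationTracker` is hypothetical. Holding the lock during the dump, which delays the start of new atomic operations until the dump returns, is a simplifying assumption, and the callback must not itself call `begin` or `end`.

```python
import threading


class AtomicOperationTracker:
    def __init__(self):
        self._lock = threading.Lock()
        self._running = 0  # number of atomic operations currently running

    def begin(self):
        with self._lock:
            self._running += 1

    def end(self):
        with self._lock:
            self._running -= 1

    def try_checkpoint(self, dump_active_group):
        """Run the dump only if no atomic operation is currently running."""
        with self._lock:
            if self._running > 0:
                return False
            dump_active_group()
            return True


tracker = AtomicOperationTracker()
dumped = []
tracker.begin()
first = tracker.try_checkpoint(lambda: dumped.append("group-1"))
tracker.end()
second = tracker.try_checkpoint(lambda: dumped.append("group-1"))
print(first, second, dumped)  # False True ['group-1']
```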
By using the foregoing data persistence processing apparatus, dirty pages may be dumped in groups and in bulk to a data file of the disk according to the checkpoint occurrence occasion, thereby minimizing impact on a normal transaction process during a process of executing a checkpoint, and improving dumping efficiency of the dirty pages.
As shown in
On the basis of the foregoing technical solutions, optionally, the data persistence processing apparatus 40 further includes a log file dumping processing unit 45. The log file dumping processing unit 45 is configured to: determine an atomic operation associated with each page identifier included in the active group; acquire an address of each log buffer area associated with the atomic operation in a log buffer area of the database system memory; and dump, to a log file of the disk, a log buffered at the acquired address of each log buffer area. Processing in this way helps ensure correctness of recovered data when the database system recovers from a fault based on the disk.
Further, optionally, the data persistence processing apparatus 40 further includes a database recovery point setting module 46. The database recovery point setting module 46 may be configured to: acquire a log-file starting point of each atomic operation that is associated with a page identifier included in the next active group, where the log-file starting point of an atomic operation indicates the storage location, in the log file, of the log that is generated when the atomic operation starts running, and the logs included in the log file are stored in a time sequence; and set the minimum value among the acquired log-file starting points as a database recovery point. The database recovery point indicates the position in the log file from which the logs required for recovery are read if the database system encounters a fault before it completes dumping, to the disk, the dirty pages corresponding to the page identifiers included in the next active group. With this processing, the logs required for database recovery may be determined quickly according to the recovery point, thereby improving the speed of database system recovery.
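The recovery point computation can be sketched as taking the minimum over the log-file starting points of the associated atomic operations. In this illustration the log file is a list whose indices stand in for storage locations, the next active group is assumed to be non-empty, and all names and sample values are hypothetical.

```python
def recovery_point(next_active_group, page_to_ops, op_log_start):
    """Return the smallest log-file starting point among the associated operations."""
    ops = {op for pid in next_active_group for op in page_to_ops[pid]}
    return min(op_log_start[op] for op in ops)


def logs_for_recovery(log_file, point):
    """Logs are stored in time sequence, so recovery reads from the point onward."""
    return log_file[point:]


log_file = ["c:begin", "c:end", "b:begin", "b:write", "a:begin", "a:write"]
op_log_start = {"op_a": 4, "op_b": 2, "op_c": 0}
page_to_ops = {10: ["op_a"], 11: ["op_a", "op_b"]}

point = recovery_point([10, 11], page_to_ops, op_log_start)
print(point, logs_for_recovery(log_file, point))
# 2 ['b:begin', 'b:write', 'a:begin', 'a:write']
```

Logs that precede the recovery point (those of `op_c` here) belong to operations not associated with the next active group, so recovery does not need to read them.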
The data persistence processing apparatus provided by the embodiment of the present invention is configured to implement the data persistence processing method provided by this embodiment of the present invention. For a working mechanism of the apparatus, reference may be made to a corresponding record in the foregoing method embodiments of the present invention, and details are not described herein again.
As shown in
The solutions of the present invention may be described in a general context of a computer-executable instruction executed by a computer, for example, a program unit. Generally, the program unit includes a routine, a program, an object, a component, a data structure, and the like, which executes a specific task or implements a specific abstract data type. The solutions of the present invention may also be implemented in a distributed computing environment. In the distributed computing environment, a task is executed by a remote processing device connected by using a communications network. In the distributed computing environment, the program unit may be located in a storage medium of a local or remote computer including a storage device.
In addition, each functional unit in the embodiments of the present invention may be integrated into one unit, or may exist alone physically, or two or more functional units are integrated into one unit. The foregoing integrated unit may be implemented in a form of hardware or in a form of a software functional unit, or may be implemented in a form of hardware plus a software functional unit.
The embodiments of the present specification are described in a progressive manner. The same or similar parts of the embodiments can be referenced mutually. The focus of each embodiment is placed on a difference from other embodiments. Especially, for the apparatus embodiments, as they are fundamentally similar to the method embodiments, their description is simplified, and for relevant parts, reference may be made to the description of the method embodiments. The described apparatus embodiments are merely exemplary. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of network units. A part or all of the modules may be selected according to an actual need to achieve the objectives of the solutions of the embodiments. Persons of ordinary skill in the art may understand and implement the embodiments of the present invention without creative efforts.
Persons skilled in the art may understand that the modules in the apparatuses provided in the embodiments may be distributed in the apparatuses according to the descriptions of the embodiments, or may be arranged in one or more apparatuses which are different from those described in the embodiments. Units in the foregoing embodiments may be integrated into one unit, or may be further split into multiple sub-modules. When the functions are implemented in a form of a software functional unit and sold or used as an independent product, the functions may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part contributing to the prior art, or a part of the technical solutions may be implemented in a form of a software product. The software product is stored in a storage medium, and includes several instructions for instructing a computer device (which may be a personal computer, a server, or a network device) to perform all or a part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a universal serial bus (USB) flash drive, a removable hard disk, a ROM, a RAM, a disk, or an optical disc.
Persons skilled in the art may understand that, the accompanying drawings are merely schematic drawings of embodiments, and modules or procedures in the accompanying drawings are not necessarily required for implementing the present invention.
Finally, it should be noted that the foregoing embodiments are merely intended for describing the technical solutions of the present invention other than limiting the present invention. Although the present invention is described in detail with reference to the foregoing embodiments, persons of ordinary skill in the art should understand that they may still make modifications to the technical solutions described in the foregoing embodiments or make equivalent replacements to some technical features thereof, without departing from the spirit and scope of the technical solutions of the embodiments of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
201210133474.4 | May 2012 | CN | national |
This application is a continuation of International Application No. PCT/CN2012/083305, filed on Oct. 22, 2012, which claims priority to Chinese Patent Application No. 201210133474.4, filed on May 2, 2012, both of which are hereby incorporated by reference in their entireties.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/CN2012/083305 | Oct 2012 | US |
Child | 14529501 | | US |