System And Method For Implementing A Partial-Blocking Consistency Point In A Database

Information

  • Patent Application
  • Publication Number
    20080005191
  • Date Filed
    June 28, 2006
  • Date Published
    January 03, 2008
Abstract
A partial-blocking consistency point system identifies transaction updates with a consistency point ID associated with a consistency point sequence number, records consistency point data that identify a location of the partial-blocking consistency point, flushes to a non-volatile storage the transaction updates identified with the consistency point sequence number without blocking transaction activity, and hardens to the non-volatile storage the recorded partial-blocking consistency point so that data associated with the recorded partial-blocking consistency point can be recovered. The consistency point sequence number is incremented each time the partial-blocking consistency point is recorded to uniquely identify the partial-blocking consistency point and transaction updates associated with the partial-blocking consistency point. The transaction updates identified with the consistency point sequence number are processed to improve efficiency of the flushing of the transaction updates.
Description

BRIEF DESCRIPTION OF THE DRAWINGS

The various features of the present invention and the manner of attaining them will be described in greater detail with reference to the following description, claims, and drawings, wherein reference numerals are reused, where appropriate, to indicate a correspondence between the referenced items, and wherein:



FIG. 1 is a schematic illustration of an exemplary operating environment in which a partial-blocking consistency point system of the present invention can be used;



FIG. 2 is a block diagram of the high-level architecture of the partial-blocking consistency point system of FIG. 1;



FIG. 3 is a diagram of an exemplary log stream illustrating performance of the partial-blocking consistency point system of FIGS. 1 and 2;



FIG. 4 is a process flow chart illustrating a method of operation of the partial-blocking consistency point system of FIGS. 1 and 2 in applying a consistency point ID to pages in a buffer; and



FIG. 5 is a process flow chart illustrating a method of operation of the partial-blocking consistency point system of FIGS. 1 and 2 in generating a partial-blocking consistency point.





DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS


FIG. 1 portrays an exemplary overall environment in which a system, a service, a computer program product, and an associated method (the “system 10”) for implementing a partial-blocking consistency point in a database according to the present invention may be used. System 10 comprises a software programming code or a computer program product that is typically embedded within, or installed on a computer, a switching device, or any layer or point residing between hosts and storage devices. For example, system 10 can be installed in a virtualization file system, a virtualization layer, or a virtualization storage-switching device. Alternatively, system 10 can be saved on a suitable storage medium such as a diskette, a CD, a hard drive, or like devices.


System 10 can take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment containing both hardware and software elements. In one embodiment, system 10 is implemented in software, which comprises but is not limited to firmware, resident software, microcode, etc.


Furthermore, system 10 can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.


The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium comprise a semiconductor or solid-state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk. Current examples of optical disks comprise compact disk—read only memory (CD-ROM), compact disk—read/write (CD-R/W) and DVD.


A data processing system suitable for storing and/or executing program code comprises at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can comprise local memory employed during actual execution of the program code, bulk storage, and cache memories that provide temporary storage of at least some program code to reduce the number of times code is retrieved from bulk storage during execution.


Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers.


Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.


Hosts, such as an application host 1, 15, through an application host N, 20, (collectively referenced as application hosts 25) access a storage system 30 through a network 35. The storage system 30 comprises a non-volatile storage device such as storage device 40. While system 10 is described in terms of network 35, application hosts 25 may also access the storage system 30 and system 10 locally rather than remotely.


The storage system 30 further comprises a log 45 and a buffer 50. Log 45 and buffer 50 comprise (or alternatively form part of) a volatile memory. The application hosts 25 perform a transaction such as modifying, deleting, or adding data stored in storage device 40. System 10 records the transaction in the buffer 50 (interchangeably referenced herein as the buffer pool 50) and logs the transaction in log 45. In summary, the present invention uses two logs: one log to record transactions, typically referred to as the redo log, and another log to record physical images of pages prior to updates being applied, typically referred to as the physical log (or undo log). Certain databases may combine the two logs into a single log and other databases may separate them into two distinct logs.



FIG. 2 illustrates a high-level hierarchy of system 10. System 10 comprises a page identification module 205, a consistency point recording module 210, a flushing module 215, and a hardening module 220. Given a transaction input 225, the page identification module 205 associates pages in the buffer 50 with a partial-blocking consistency point via a consistency point sequence number. The consistency point recording module 210 briefly blocks processing by the application hosts 25 to record a partial-blocking consistency point, then allows all processing by the application hosts 25 to continue. The flushing module 215 flushes from the buffer 50 to the storage device 40 all transaction updates that have occurred prior to the recording of the partial-blocking consistency point. The hardening module 220 hardens to the storage device 40 the recorded partial-blocking consistency point.


By not blocking update transactions, system 10 exhibits the unobtrusive characteristics of fuzzy checkpoints while providing the simplicity, recovery performance, and predictability of a full-blocking consistency point. System 10 establishes a partial-blocking consistency point that is a physical consistency point similar to that of a hard consistency point while allowing updates to occur while the processing of the partial-blocking consistency point is in progress. System 10 controls processing of pages being modified, either newly modified or re-modified, while the original set of modified dirty pages in the buffer 50 is written to the storage device 40.


In the following exemplary scenario, a consistency point with consistency point sequence number (CPSN) N is being processed, and the system CPSN has been incremented to N+1. During checkpoint processing, if a non-modified page is modified, it is stamped with CPSN=N+1. If a modified page is updated and its CPSN<N+1, the page must be flushed as part of the checkpoint processing but has not been flushed yet. The page is therefore flagged with a special flag that instructs the checkpoint processing to flush it, and the page is stamped with N+1. As a result, system 10 tracks page modifications to make certain that all the pages required for the consistency point are flushed to disk.
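The stamping rule in this scenario can be illustrated with a minimal Python sketch. The `Page` class, the `must_flush` field (standing in for the "special flag"), and the function name `stamp_on_update` are hypothetical names chosen for illustration and do not appear in the description.

```python
from dataclasses import dataclass


@dataclass
class Page:
    page_no: int
    modified: bool = False
    cpsn: int = 0             # consistency point ID stamped on the page
    must_flush: bool = False  # hypothetical stand-in for the "special flag"


def stamp_on_update(page: Page, current_cpsn: int) -> None:
    """Stamp a page that is updated while consistency point N is processed.

    current_cpsn is N+1 in the scenario described above.
    """
    if page.modified and page.cpsn < current_cpsn:
        # Dirty page belonging to the checkpoint that has not been
        # flushed yet: flag it so checkpoint processing still flushes it.
        page.must_flush = True
    page.modified = True
    page.cpsn = current_cpsn


n = 7
clean = Page(1)                                 # not modified before the update
dirty_old = Page(2, modified=True, cpsn=n)      # dirty, belongs to checkpoint N
dirty_new = Page(3, modified=True, cpsn=n + 1)  # already stamped N+1
for p in (clean, dirty_old, dirty_new):
    stamp_on_update(p, n + 1)
```

After the loop, only `dirty_old` carries the flush flag, and all three pages are stamped with N+1.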


Compared to conventional consistency point systems, system 10 reduces the quiescent time that transactions experience while waiting for a consistency point to complete.


System 10 assumes that all transactions from the start of the partial-blocking consistency point, known as the consistency point restart position (CPR), are applied and no log records in log 45 are skipped.


In one embodiment, each transaction comprises a series of critical sections. Each critical section comprises a set of operations: a physical log of a page, a logical log of operations, and a modification of buffer 50. Log 45 comprises the physical log and the logical log. The physical log of a page comprises an undo log. The logical log of operations comprises a redo log. The modification of buffer 50 comprises actual page updates. A sequence of operations within a critical section remains the same. There can be more than one invocation of a set of operations inside the critical section, but the sequence for any given page remains the same. Write ahead logging rules apply, indicating that the modified page cannot be flushed to the storage device 40 until the physical log and logical log are flushed.
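As an illustration only, the critical-section sequence and the write-ahead logging rule might be sketched as follows in Python. The class `Store`, the exception `WalViolation`, and the in-memory lists standing in for the physical and logical logs are assumptions made for this sketch, not structures named in the description.

```python
class WalViolation(Exception):
    """Raised when a page would reach storage before its log records."""


class Store:
    def __init__(self):
        self.physical_log = []  # undo log: before-images of pages
        self.logical_log = []   # redo log: operations
        self.flushed_physical = 0  # number of physical log records flushed
        self.flushed_logical = 0   # number of logical log records flushed
        self.buffer = {}        # page_no -> current page contents
        self.log_marks = {}     # page_no -> log lengths the page depends on
        self.disk = {}          # stand-in for the storage device

    def critical_section(self, page_no, new_value, operation):
        # The sequence is fixed: physical log, logical log, buffer update.
        self.physical_log.append((page_no, self.buffer.get(page_no)))
        self.logical_log.append((page_no, operation))
        self.buffer[page_no] = new_value
        self.log_marks[page_no] = (len(self.physical_log), len(self.logical_log))

    def flush_logs(self):
        self.flushed_physical = len(self.physical_log)
        self.flushed_logical = len(self.logical_log)

    def flush_page(self, page_no):
        physical_mark, logical_mark = self.log_marks[page_no]
        if physical_mark > self.flushed_physical or logical_mark > self.flushed_logical:
            raise WalViolation(f"logs for page {page_no} are not flushed yet")
        self.disk[page_no] = self.buffer[page_no]


store = Store()
store.critical_section(1, "v1", "set page 1 to v1")
try:
    store.flush_page(1)   # rejected: logs have not been flushed
    rejected = False
except WalViolation:
    rejected = True
store.flush_logs()
store.flush_page(1)       # allowed once both logs are flushed
```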



FIG. 3 shows an exemplary log stream 300 illustrating performance of system 10. System 10 uses an interval processing 305 to identify which data to flush to the storage device 40 to establish a single restore point for applications 25. Pages to be flushed from buffer 50 to the storage device 40 are referenced to a previous partial-blocking consistency point 310. The partial-blocking consistency point 310 comprises a consistency point sequence number of “n”. The page identification module 205 identifies transactions written to the buffer after the partial-blocking consistency point 310 with a consistency point ID set at “n+1”.


At a point A, 315, the consistency point recording module 210 specifies in the log 45 a partial-blocking consistency point. There are two recording actions that take place when a consistency point is recorded. The list of open transactions is recorded to non-volatile storage (the log) at the time of the consistency point. The log position of the list of open transactions is recorded to volatile memory. The restart point of the physical log is recorded to non-volatile memory. Together, these log positions form the recovery restart point.


The consistency point recording module 210 increments the consistency point sequence number by 1, to “n+2”. The page identification module 205 identifies transactions written to buffer 50 after the specified partial-blocking consistency point with the consistency point sequence number of “n+2”. Consequently, any transactions occurring after the partial-blocking consistency point is specified are associated with a next partial-blocking consistency point with consistency point sequence number “n+2”.


While specifying the partial-blocking consistency point, the consistency point recording module 210 briefly blocks transactions, as indicated by a transaction-blocked interval 320. At a point B, 325, the consistency point recording module 210 allows transactions by applications 25. The page identification module 205 associates allowed transactions with the incremented consistency point sequence number (n+2). At point B, 325, the consistency point recording module 210 opens the interval processing 305 and the flushing module 215 initiates disk flushing.


Disk flushing comprises flushing to the storage device 40 all the pages in the buffer 50 required for the partial-blocking consistency point recorded at point A, 315, i.e., those pages that have a consistency point ID of “n+1”. If a page in buffer 50 is updated after the partial-blocking consistency point is recorded at point A, 315, the flushing module 215 does not flush that page to the storage device 40 because that update is not required for the consistency point. In the event of a recovery, that updated page is restored from the log 45 during log recovery.


Because the consistency point sequence number identifies the pages in the buffer 50, the flushing module 215 performs an efficient non-random I/O. In one embodiment, the flushing module 215 orders the pages in buffer 50 prior to flushing the pages in buffer 50 to the storage device 40. By identifying the pages in the buffer 50 that are flushed by the flushing module 215, an administrator is able to tune system 10 to perform efficiently and meet the policies of an application such as a database using system 10.
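A minimal Python sketch of selecting and ordering the pages for one consistency point is given below. It assumes, purely for illustration, that each page is a dictionary with hypothetical keys `page_no`, `modified`, and `cp_id`, and that ascending page number approximates on-disk order; the description does not specify the ordering key.

```python
def select_flush_set(pages, cp_id):
    """Return the modified pages stamped with cp_id, ordered by page number."""
    chosen = [p for p in pages if p["modified"] and p["cp_id"] == cp_id]
    # Ordering by page number is an assumption standing in for whatever
    # ordering makes the I/O non-random on a given storage device.
    return sorted(chosen, key=lambda p: p["page_no"])


pages = [
    {"page_no": 9, "modified": True, "cp_id": 4},
    {"page_no": 2, "modified": True, "cp_id": 4},
    {"page_no": 5, "modified": True, "cp_id": 5},   # belongs to the next point
    {"page_no": 1, "modified": False, "cp_id": 4},  # already clean
]
flush_order = [p["page_no"] for p in select_flush_set(pages, 4)]
```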


Another embodiment refers to the non-ordering of the pages. Some storage mechanisms have a physical configuration different from what is exposed to the server as storage. For instance, a SAN or NAS device may appear as a single disk while containing several disks with a large memory cache within the storage device (e.g., EMC, IBM Shark). In this embodiment, the pages are readily identifiable and therefore allow ordering of the I/O as well as the ability to submit, for example, thousands of I/Os in a single call to the I/O subsystem, which takes advantage of new I/O APIs. In essence, according to this embodiment, the easily identifiable pages allow for more efficient and more flexible I/O. Another alternative embodiment could similarly be effected for the non-ordered I/O.


When all pages in the buffer 50 associated with the partial-blocking consistency point (i.e., all pages in buffer 50 with the consistency point ID of “n+1”) have been flushed to the storage device 40, the flushing module 215 closes the interval processing at point C, 330. At point C, 330, the hardening module 220 hardens the partial-blocking consistency data transferred to the storage device 40 and processing of the partial-blocking consistency point is complete.



FIG. 4 is a process flow chart illustrating a method of operation of system 10 in applying a consistency point ID to pages in buffer 50. Applications 25 access the storage system 30 (step 405). If transactions are blocked (i.e., a partial-blocking consistency point is being recorded) (decision step 410), transaction input 225 is paused (step 415) until transactions are no longer blocked (decision step 410).


The page identification module 205 receives transaction input 225 (step 420). The page identification module 205 sets the consistency point ID of the page equal to the consistency point sequence number (step 425). The transaction is written to buffer 50 (step 430) and logged (step 435). When a transaction is applied to a page, the page in buffer 50 is also marked as modified.
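The flow of FIG. 4 might be sketched in Python as follows. The class name `PageIdentification`, the use of `threading.Event` as the blocking mechanism, and the dictionary layout of buffered pages are assumptions for illustration; a production system would need stronger synchronization than this sketch provides.

```python
import threading


class PageIdentification:
    def __init__(self, cpsn):
        self.cpsn = cpsn               # current consistency point sequence number
        self.gate = threading.Event()  # cleared while a consistency point is recorded
        self.gate.set()                # transactions are allowed initially
        self.buffer = {}               # page_no -> page state
        self.log = []                  # stand-in for log 45

    def apply(self, page_no, data):
        self.gate.wait()               # steps 410/415: pause while blocked
        self.buffer[page_no] = {       # steps 425/430: stamp and write to buffer
            "data": data,
            "cp_id": self.cpsn,
            "modified": True,
        }
        self.log.append((self.cpsn, page_no, data))  # step 435: log the transaction


ident = PageIdentification(cpsn=3)
ident.apply(10, "x")
ident.cpsn = 4   # a consistency point was recorded in between
ident.apply(11, "y")
```

Page 10 ends up stamped with consistency point ID 3 and page 11 with ID 4, so a later flush can tell the two sets apart.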



FIG. 5 is a process flow chart illustrating a method of operation of system 10 in generating a partial-blocking consistency point. The consistency point recording module 210 initiates a partial-blocking consistency point (step 505). The consistency point recording module 210 blocks transaction input 225 (step 510). The consistency point recording module 210 records partial-blocking consistency point data (step 515). Recording the partial-blocking consistency point data comprises writing the partial-blocking consistency point data to log 45 and recording into memory the position of a physical location of the partial-blocking consistency point in the log.


The consistency point recording module 210 increments the consistency point sequence number (step 520) and unblocks transaction input (step 525). System 10 uses the consistency point sequence number to determine the requirements for modifying subsequent pages once processing of the partial-blocking consistency point begins.


The flushing module 215 identifies pages in the buffer 50 that are marked modified and whose consistency point ID equals the incremented consistency point sequence number minus 1 (step 530). It is possible that a page that was modified during the interval has already been flushed prior to the checkpoint: system 10 preemptively flushes modified pages to storage outside of the checkpoint flushing module to avoid running out of buffers for page faults. This flushing can leave pages that had transactions applied during the checkpoint interval; however, because these pages have already been flushed to disk, they do not need to be flushed again. Every page that has been modified by a transaction is marked as modified (or dirty); in other terms, the state of the page changes to “modified”. When a page is flushed to disk, it is unmarked, that is, its state is changed to “not modified”. Pages flushed to storage by the flushing module 215 likewise change state to “not modified”. The flushing module 215 flushes to the storage device 40 the identified and optionally processed pages (step 540). The hardening module 220 hardens to the storage device 40 the recorded partial-blocking consistency data (step 545).
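The overall sequence of FIG. 5 can be sketched, under simplifying assumptions, as a single-threaded Python class. All names (`ConsistencyPointSystem`, `begin_consistency_point`, `finish_consistency_point`, `hardened_cp`) are hypothetical, the log and storage device are plain in-memory containers, and blocking is reduced to a boolean flag.

```python
class ConsistencyPointSystem:
    def __init__(self):
        self.cpsn = 1                # consistency point sequence number
        self.blocked = False
        self.buffer = {}             # page_no -> {"data", "cp_id", "modified"}
        self.log = []                # stand-in for log 45
        self.disk = {}               # stand-in for storage device 40
        self.cp_log_position = None  # position of the record, kept in memory
        self.hardened_cp = None      # hardened consistency point data

    def write(self, page_no, data):
        if self.blocked:
            raise RuntimeError("transaction input is blocked")
        self.buffer[page_no] = {"data": data, "cp_id": self.cpsn, "modified": True}
        self.log.append(("update", page_no, data))

    def begin_consistency_point(self):
        self.blocked = True                                # block transaction input
        self.log.append(("consistency_point", self.cpsn))  # record the data
        self.cp_log_position = len(self.log) - 1
        self.cpsn += 1                                     # increment the number
        self.blocked = False                               # unblock
        return self.cpsn - 1                               # ID of pages to flush

    def finish_consistency_point(self, cp_id):
        for page_no in sorted(self.buffer):
            page = self.buffer[page_no]
            if page["modified"] and page["cp_id"] == cp_id:
                self.disk[page_no] = page["data"]          # flush
                page["modified"] = False                   # now "not modified"
        self.hardened_cp = (cp_id, self.cp_log_position)   # harden


system = ConsistencyPointSystem()
system.write(1, "a")
system.write(2, "b")
cp = system.begin_consistency_point()
system.write(2, "b2")   # re-stamped with the incremented number, not flushed
system.write(3, "c")    # belongs to the next consistency point
system.finish_consistency_point(cp)
```

In this sketch only page 1 reaches storage; pages 2 and 3 carry the incremented sequence number and would be restored from the log during recovery.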


If a failure occurs prior to completion of the partial-blocking consistency point, restoration and crash recovery begins with the previous partial-blocking consistency point record in the reserved pages since the partial-blocking consistency point record has not been completed.
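One way to picture recovery from the last completed consistency point is the simplified Python sketch below. It assumes a redo-only log of `("update", page_no, data)` tuples and a hypothetical function `recover`; the physical (undo) log and the list of open transactions described earlier are deliberately left out, so this is not a complete recovery procedure.

```python
def recover(disk, log, hardened_cp_position):
    """Rebuild page contents from storage plus the log after a failure.

    hardened_cp_position is the log position of the last consistency point
    that was fully hardened; every later record is applied, none skipped.
    """
    pages = dict(disk)
    for record in log[hardened_cp_position + 1:]:
        if record[0] == "update":
            _, page_no, data = record
            pages[page_no] = data
    return pages


log = [
    ("update", 1, "a"),
    ("update", 2, "b"),
    ("consistency_point", 1),  # last completed consistency point
    ("update", 2, "b2"),
    ("update", 3, "c"),
    ("consistency_point", 2),  # failure occurred before this one completed
    ("update", 4, "d"),
]
# Storage holds everything up to the completed consistency point.
recovered = recover({1: "a", 2: "b"}, log, hardened_cp_position=2)
```

Because the second consistency point never completed, replay starts from the first one and applies every later update in order.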


It is to be understood that the specific embodiments of the invention that have been described are merely illustrative of certain applications of the principle of the present invention. Numerous modifications may be made to the system and method for implementing a partial-blocking consistency point in a database described herein without departing from the spirit and scope of the present invention. In one exemplary embodiment, system 10 processes one consistency point at a time. The next consistency point does not start until the previous consistency point is completed.

Claims
  • 1. A processor-executable method of implementing a partial-blocking consistency point in a database system, comprising: identifying a plurality of transaction updates with a consistency point ID associated with a consistency point sequence number; recording a plurality of consistency point data that identify a location of the partial-blocking consistency point associated with the consistency point sequence number; flushing, to a non-volatile storage, the transaction updates identified with the consistency point sequence number, without blocking transaction activity; hardening, to the non-volatile storage, the recorded partial-blocking consistency point so that a plurality of data associated with the recorded partial-blocking consistency point can be recovered.
  • 2. The method of claim 1, wherein hardening the recorded partial-blocking consistency point includes allowing the transaction activity to proceed without interruption.
  • 3. The method of claim 1, further comprising incrementing the consistency point sequence number each time the partial-blocking consistency point is recorded.
  • 4. The method of claim 3, wherein incrementing the consistency point sequence number includes uniquely identifying the partial-blocking consistency point.
  • 5. The method of claim 3, wherein incrementing the consistency point sequence number includes uniquely identifying the transaction updates associated with the partial-blocking consistency point.
  • 6. The method of claim 1, further comprising processing the transaction updates identified with the consistency point sequence number, in order to improve a flushing efficiency of the transaction updates.
  • 7. A processor-executable system for implementing a partial-blocking consistency point in a database system, comprising: a page identification module for identifying a plurality of transaction updates with a consistency point ID associated with a consistency point sequence number; a consistency point recording module for recording a plurality of consistency point data that identify a location of the partial-blocking consistency point associated with the consistency point sequence number; a flushing module for flushing, to a non-volatile storage, the transaction updates identified with the consistency point sequence number, without blocking transaction activity; and a hardening module for hardening, to the non-volatile storage, the recorded partial-blocking consistency point so that a plurality of data associated with the recorded partial-blocking consistency point can be recovered.
  • 8. The system of claim 7, wherein the hardening module hardens the recorded partial-blocking consistency point and allows the transaction activity to proceed without interruption.
  • 9. The system of claim 7, wherein the consistency point sequence number is incremented each time the partial-blocking consistency point is recorded.
  • 10. The system of claim 9, wherein the consistency point sequence number is incremented to uniquely identify the partial-blocking consistency point.
  • 11. The system of claim 9, wherein the consistency point sequence number is incremented to uniquely identify the transaction updates associated with the partial-blocking consistency point.
  • 12. The system of claim 7, wherein the transaction updates identified with the consistency point sequence number are processed in order to improve a flushing efficiency of the transaction updates.
  • 13. A computer program product having program codes stored on a computer-usable medium for implementing a partial-blocking consistency point in a database computer program product, comprising: a program code for identifying a plurality of transaction updates with a consistency point ID associated with a consistency point sequence number; a program code for recording a plurality of consistency point data that identify a location of the partial-blocking consistency point associated with the consistency point sequence number; a program code for flushing, to a non-volatile storage, the transaction updates identified with the consistency point sequence number, without blocking transaction activity; and
  • 14. The computer program product of claim 13, wherein the program code for hardening the recorded partial-blocking consistency point includes a program code for allowing the transaction activity to proceed without interruption.
  • 15. The computer program product of claim 13, further comprising a program code for incrementing the consistency point sequence number each time the partial-blocking consistency point is recorded.
  • 16. The computer program product of claim 15, wherein the program code for incrementing the consistency point sequence number includes a program code for uniquely identifying the partial-blocking consistency point.
  • 17. The computer program product of claim 15, wherein the program code for incrementing the consistency point sequence number includes a program code for uniquely identifying the transaction updates associated with the partial-blocking consistency point.
  • 18. The computer program product of claim 13, further comprising a program code for processing the transaction updates identified with the consistency point sequence number, in order to improve a flushing efficiency of the transaction updates.