The present application claims priority from Japanese application JP2008-307284 filed on Dec. 2, 2008, the content of which is hereby incorporated by reference into this application.
The present invention relates to a database management method, a database management program and a database management apparatus.
With the advancement of the information society, many enterprises use systems provided with databases to supply various kinds of online services, such as banks' ATMs and air-ticket sales systems. As competition among enterprises intensifies, enterprises increasingly extend their online service hours and expand the contents of their online services in order to differentiate themselves from one another.
In building an online service system, various kinds of hardware and middleware are often combined. Such a system requires maintenance not only of each combined software and hardware part but also of the overall system, which makes the management of this type of system very costly.
Since the system management cost is incurred constantly while the system is in operation, it is important for an enterprise to reduce this cost and thereby secure profit in order to stay ahead of its competitors. Hence, it is expected that making effective use of the time outside the online service hours leads to effective use of the system and yields results that justify the system management cost.
The database management system often processes the data accumulated in online services as a batch in order to generate statistical data, which is then used to support the management decisions of an enterprise. The batch process executed in generating the statistical data requires the management system to access a massive amount of data, so the load applied to the system tends to rise during execution of the batch process.
In particular, when data is stored on an external storage medium such as a disk drive unit, the data transfer between the main body of the system and the external storage medium becomes heavy. Hence, the batch process is often executed at night, outside the online service hours, when the load applied to the system is relatively light.
As the online services are expanded and the number of online service users grows, the amount of data to be processed increases and the batch process time becomes longer. On the other hand, each enterprise extends its online service hours, so the time that can be allocated to the batch process becomes shorter accordingly. Therefore, there is a demand to shorten the batch process time.
In turn, the promotion of the batch process in the database management system will be described with reference to
When the system executes a plurality of batch processes, in order to make ready for any system failure, the data of the database at the batch process end times (t1, t3) are backed up (c1, c2).
In the conventional system promotion, in order to prevent a non-backed-up page on a database from being updated during the backup process (a data overwrite error), after the batch process is finished, the result of the batch process is backed up.
The difference of the process shown in
This method is executed to record the update logs L1 and L2 that are the update histories of the database when the batch process is executed.
Hence, even if data on the database is overwritten by the batch process b2, the database can be recovered when a system failure occurs by writing back onto the database the backup data bk1 taken before the overwrite together with the update log L2. (Refer to C. J. Date, “Introduction to Database Systems” (Systems Programming Series), Addison-Wesley Publishing, July 1982, page 20.) This recovering process returns the data on the database to its state at the time t1.
In the promotion of the method shown in
To reduce the load applied onto the system (simply referred to as the system load), the system may be sped up by loading the data stored on the external storage medium into a memory and then reading and writing the data held in the memory. (This speed-up process is referred to as the in-memory data process.) The in-memory data process does not require the system to access the external storage medium for each data access, which is quite effective in reducing the system load.
In
When the DB synchronization is finished at a time t2, in a backup process c1, the data is read out of the external storage medium and is written as the backup data on the target storage medium. As mentioned above, to back up the data in memory, the DB synchronization is executed. After the DB synchronization is finished, the batch process b2 is executed.
To guard against the data overwrite error, the system must delay the start of the next batch process b2 until the DB synchronization for the batch process b1 is finished. This waiting time makes the overall time of the batch processes (times t0 to t5) longer.
It is therefore an object of the present invention to solve the foregoing problems and to reduce the processing time of the batch processes accompanied with the data backup in the database management system.
In carrying out the invention in a preferred mode, the present invention concerns a database management method provided in a database management apparatus arranged to manage data saved in a database, which includes a data access processing portion and a backup processing portion, the backup processing portion operating to write data at a given time saved in the database stored in a storage unit as backup data out of the database after the given time, the data access processing portion operating to execute a data access to the database in response to a data access request for the database, wherein, when an update request for data not yet written out as the backup data occurs in a data access request after the given time, the data at the given time is written as the backup data out of the database and then the data of the database is updated in response to the update request, and wherein the database management apparatus executes the process in the backup processing portion and the process in the data access processing portion in parallel.
The other component means will be described later.
The present invention is therefore effective in shortening the processing time of the batch process covering the data backup in the database management system.
Other objects, features and advantages of the invention will become apparent from the following description of the embodiments of the invention taken in conjunction with the accompanying drawings.
Hereafter, an embodiment of a database management system to which the present invention is applied will be described with reference to the appended drawings.
The database management apparatus 1 is configured as a computer including a processor 10, a memory 20 and a communication interface through which the computer communicates with another apparatus. The processor 10 implements the processing units for executing various kinds of processes by running the programs stored in the memory 20.
The memory 20 stores the programs, which compose a DB access request accepting portion 21, a backup request accepting portion 22, a recovery request accepting portion 23, an in-memory request accepting portion 24, a DB access processing portion 25, a backup processing portion 26, a recovery processing portion 27 (also referred to as the recovering portion 27), and an in-memory processing portion 28, respectively.
The memory 20 stores as data a backup flag 29 that indicates the backup process is being executed, a backup management table 30, and an in-memory database 31.
The terminal 70 accepts a request for a batch process for executing a batch service through an input unit through which data is inputted by a user. The terminal 70 establishes a transaction for executing the batch process between the database management apparatus 1 and the terminal 70 itself. If a data access is required during execution of the batch process, the terminal 70 sends a data access request or the like to the database management apparatus 1 through the transaction. The access request is described according to the SQL (Structured Query Language) rules, for example.
The database management apparatus 1 manages the data saved in each database (a database 52 and an in-memory database 31). The in-memory database 31 is created by storing the data of the database 52 in the memory (making the data of the database 52 the same as the data of the memory). The database management apparatus 1 controls the read and the write of the target data in response to a data access request sent from the terminal 70.
The storage apparatus 50 operates to save the data to be managed by the database management apparatus 1 in the database 52 of a storage unit 51 included in the storage apparatus 50 itself.
The storage apparatus 50 further stores as a DB access history 53 in the storage unit 51 a history of accesses having been made in response to a request for accessing the in-memory database 31 and the database 52.
Further, as the storage medium on which the database 52 and the DB access history 53 are stored, the storage apparatus 50 may have a storage unit 51 with a built-in storage medium such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive), or another type of storage unit 51, such as a DVD drive, into which a storage medium such as a readable and writable DVD (Digital Versatile Disc) may be loaded.
The backup apparatus 80 operates to store in a storage unit 81 the data of the in-memory database 31 managed by the database management apparatus 1 as backup data 82. The backup apparatus may be a storage device, an external storage medium directly connected with the database management apparatus 1, a tape device, or the like.
In the database management apparatus 1, the DB access request accepting portion 21 accepts the data access request sent from the terminal 70 or the like. The DB access processing portion 25 reads or writes the data saved in the in-memory database 31 in response to the data access request accepted by the DB access request accepting portion 21.
The DB access processing portion 25 accesses the data saved in the in-memory database 31 in response to an access request involved in the batch process. If the data access to the in-memory database 31 results in updating the data of the in-memory database 31, the DB access processing portion 25 executes the process of making the data of the database 52 the same as the data of the in-memory database 31, either synchronously or asynchronously.
The backup request accepting portion 22 accepts the backup request sent from the terminal 70 or the like. The backup processing portion 26 backs up the data of the database 52 stored in the storage apparatus 50 and the in-memory database 31 in response to the backup request accepted by the backup request accepting portion 22.
The recovery request accepting portion 23 accepts a data recovery request sent from the terminal 70 or the like. The recovery processing portion 27 prepares the backup data required for the requested recovery from the backup data 82 in response to the data recovery request accepted by the recovery request accepting portion 23. The prepared backup data is reflected on the in-memory database so that the concerned data of the in-memory database 31 may be recovered.
The in-memory request accepting portion 24 accepts commands such as execution and release ones for processing the in-memory data. The in-memory processing portion 28 controls the in-memory data processing based on the command accepted by the in-memory request accepting portion 24.
The in-memory data processing means the process of loading the data to be managed, stored in the storage apparatus 50, into the memory 20 and accessing the in-memory database 31 stored in the memory 20 when a data access request is accepted. This process eliminates the necessity of accessing the storage apparatus 50 (database 52), which is provided as another apparatus, and thereby makes the response to data accesses faster.
The database management apparatus 1 executes the backup process by writing the data of the in-memory database 31 being managed by the apparatus 1 itself onto the backup data 82 stored in the backup apparatus 80. This backup processing is implemented by any one of the following two writing processes.
The first sort of writing process is the “normal write”, which is the writing process to be executed by the backup processing portion 26. In particular, after the DB access processing portion 25 executes the committing process (the process of defining the data) about a predetermined transaction and then finishes the transaction, the backup processing portion 26 writes out the data of the in-memory database 31 including the data defined by the committing process onto the backup data 82.
The backup processing portion 26 creates a backup schedule data (see
Alternatively, without creating the backup schedule data in advance, the backup processing portion 26 may constantly measure the degree of load of the database management apparatus 1 and set the writing time of a page to a time when the load of the database management apparatus 1 falls below a predetermined degree. That is, the so-called dynamic backup control is carried out.
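The dynamic backup control described above may be illustrated by the following minimal sketch. It is not the implementation of the embodiment: the function name `dynamic_backup`, the sampled load values, and the threshold of 0.5 are all assumptions introduced only for illustration. One pending page is written out each time a sampled load is below the threshold.

```python
from typing import Callable, Iterable, List


def dynamic_backup(
    pending: List[str],
    load_samples: Iterable[float],
    threshold: float,
    write_page: Callable[[str], None],
) -> List[str]:
    """Write one pending page whenever the sampled load is below the threshold.

    Returns the pages that are still pending when the samples run out.
    All names and the sampling model are hypothetical.
    """
    remaining = list(pending)
    for load in load_samples:
        if not remaining:
            break
        if load < threshold:
            # The load is light enough: back up the next page now.
            write_page(remaining.pop(0))
    return remaining


written: List[str] = []
# Hypothetical load samples (fraction of capacity) and threshold.
left = dynamic_backup(["p1", "p2", "p3"], [0.9, 0.3, 0.8, 0.2], 0.5, written.append)
print(written, left)
```

In this sketch a page is deferred whenever the load is at or above the threshold, so the backup yields to the transaction processing during busy periods.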
The second sort of writing process is the “extra writing”, which is the writing process to be executed by the DB access processing portion 25. The data to be written out as the backup data 82 is not the latest data but the data at a predetermined time (for example, the end time of a predetermined transaction).
On the other hand, within the in-memory database 31, a data value in the page about which the data writing is not finished may be rewritten as the latest data value by the data access process of the latest transaction.
To cope with this unfavorable rewrite, if the transaction currently being processed updates a page that has not yet been backed up by the “normal writing”, the DB access processing portion 25 first writes out the pre-update data value of the page to the backup data 82 on behalf of the backup processing portion 26 and then writes the updated data value to the page of the in-memory database 31.
The aforementioned backup processes based on the “normal writing” and the “extra writing” are both processes of writing the data from the in-memory database 31 to the backup data 82. However, these writing processes are executed by different components. (The writing process based on the “normal writing” is executed by the backup processing portion 26, while the writing process based on the “extra writing” is executed by the DB access processing portion 25.)
Hence, to prevent collision of data accesses between the data writing processes executed by the DB access processing portion 25 and the backup processing portion 26, the database management apparatus 1 executes the following exclusive process.
One of the DB access processing portion 25 and the backup processing portion 26 operates to check if the other processing portion is executing the data writing process before the backup process is started. If the other processing portion is executing the data writing process, the processing portion for checking for it has to stay in the waiting state until the other processing portion finishes the writing process.
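The exclusive process may be sketched, for example, with a mutual-exclusion lock. The sketch below is an assumption for illustration only: the embodiment does not state which mechanism is used, and the names `write_lock`, `exclusive_write` and `run_writes` as well as the page values are hypothetical. Whichever thread acquires the lock second waits until the first finishes its write.

```python
import threading
from typing import Dict, List, Tuple

# Hypothetical lock shared by the two writers of the backup data.
write_lock = threading.Lock()
backup_data: Dict[str, int] = {}


def exclusive_write(page_id: str, value: int) -> None:
    """Write one page to the backup data while holding the lock.

    A caller that finds the lock held by the other writer blocks here,
    which corresponds to staying in the waiting state.
    """
    with write_lock:
        backup_data[page_id] = value


def run_writes(pages: List[Tuple[str, int]]) -> None:
    for page_id, value in pages:
        exclusive_write(page_id, value)


# One thread stands in for the normal writing, the other for the extra writing.
normal = threading.Thread(target=run_writes, args=([("p1", 10), ("p2", 20)],))
extra = threading.Thread(target=run_writes, args=([("p4", 35)],))
normal.start()
extra.start()
normal.join()
extra.join()
print(sorted(backup_data.items()))
```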
The backup flag 29 indicates if the backup process is being executed. It thus has any one value of “being processed” and “unprocessed”. When the flag 29 has a value of “being processed”, it indicates the backup is under process, while when the flag 29 has a value of “unprocessed”, it indicates the backup is not processed.
When the flag 29 indicates “being processed”, the DB access processing portion 25 determines whether it needs to execute the data writing process (the “extra writing”); when the flag 29 indicates “unprocessed”, this determination is not made.
Further, as to the same page, just one of the “normal writing” backup process and the “extra writing” backup process is required. Hence, the database management apparatus 1 manages the progress of the backup process of each page in the backup management table 30.
The in-memory database 31 and the backup data 82 are the storage areas on which the data to be saved in the database 52 are stored. The data is managed with a “page ID” and a “data value in the page” matched to each other.
Further, the input and output of data among the in-memory database 31, the backup data 82 and the database 52 is carried out page by page.
In
The backup management table 30 is a table that records, for each page of the in-memory database 31, whether the page has already been backed up. The table 30 manages a “page ID”, which specifies a page to be backed up, and a “flag”, which indicates whether the page has been backed up, with the “page ID” and the “flag” matched to each other.
The “flag” takes any one of the values “complete”, which indicates that a page has been backed up, and “incomplete”, which indicates that a page is not backed up yet.
One of the DB access processing portion 25 and the backup processing portion 26 executes the backup of a page with the flag of “incomplete” and then changes the flag from “incomplete” into “complete”. The other processing portion skips the backup of a page with the flag of “complete”. This function makes it possible to prevent the so-called “double backup”, that is, a duplicate write of a page by one processing portion though the page has already been written out by the other processing portion.
When executing the data update of the in-memory database 31 in a batch process, the DB access processing portion 25 writes in the DB access history 53 the ID of the batch process, the page ID whose data is to be updated and an update time in a manner to make them matched to one another.
In the backup schedule data, a page ID and a scheduled writing time of a page of the ID are described for each page to be backed up. The page ID is matched to the scheduled writing time thereof.
The backup processing portion 26 may execute the normal writing process of each page at the time given by the backup schedule data, or may write out the pages in ascending order of their scheduled writing times. The backup schedule data may be created by the backup processing portion 26 or may be data inputted from the terminal 70 or the like.
It is desirable that the backup schedule data be arranged so that the processing load of the database management apparatus 1 stays substantially even over time. This is because, if a massive number of pages are backed up within a specific period, the heavy processing load of the backup process puts pressure on the data access executed by the DB access processing portion 25 and adversely affects the transaction processing.
Further, if a frequency of the “extra write” is made higher in the DB access processing portion 25, the adverse effect is also given to the processing time of the transaction or the batch process.
To overcome these shortcomings, the backup processing portion 26 uses either of the following two creating methods. Each method predicts the current and future data accesses with high accuracy from the past DB access history 53 and, based on the prediction, creates backup schedule data that is highly effective in distributing the processing load.
The first creating method is a method of creating the backup schedule data based on the update sequence of the pages extracted from the DB access history 53.
In the DB access history 53 shown in
In this schedule data, the page “p1” of the earlier writing times in the past is preferentially processed in the future backup process. The page “p3” of the later writing times in the past is postponed in the future backup process.
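The first creating method may be sketched as follows. This is only an illustrative sketch: the function name `schedule_by_update_order` and the history records are hypothetical and are not taken from the drawings. Each record holds a batch process ID, a page ID and an update time, and the pages are scheduled in the order in which they were first updated in the past.

```python
from typing import Dict, List, Tuple

# (batch process ID, page ID, update time); the layout follows the DB access
# history 53, but the concrete values below are hypothetical.
HistoryRecord = Tuple[str, str, int]


def schedule_by_update_order(history: List[HistoryRecord]) -> List[str]:
    """Return page IDs ordered by the time of their first past update."""
    first_update: Dict[str, int] = {}
    for _batch_id, page_id, update_time in history:
        if page_id not in first_update or update_time < first_update[page_id]:
            first_update[page_id] = update_time
    # Pages updated earlier in the past are backed up earlier in the future.
    return sorted(first_update, key=lambda page_id: first_update[page_id])


history: List[HistoryRecord] = [
    ("b1", "p1", 1),
    ("b1", "p2", 2),
    ("b1", "p4", 3),
    ("b1", "p3", 4),
    ("b1", "p1", 5),  # a later update of p1 does not change its first-update time
]
schedule = schedule_by_update_order(history)
print(schedule)
```

With these hypothetical records, the page “p1” is scheduled first and the page “p3” last.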
The second creating method is a method of creating the backup schedule data based on the average of the update sequence of the pages extracted from the DB access history 53. For example, it is assumed that about the pages “p1” to “p4”, the average of the update sequence of the pages extracted from the DB access history 53 is as shown. Herein, the “First rank” means that the “First-rank” page of the four pages is backed up at the earliest (the most previous) time.
Page “p1”: average rank 1.2
Page “p2”: average rank 2.5
Page “p3”: average rank 2.9
Page “p4”: average rank 3.7
The backup processing portion 26 creates the backup schedule data in which the page IDs are arranged in ascending order of average rank (smaller values first), that is, in the sequence of “p1” to “p2” to “p3” to “p4”.
The aforementioned two creating methods make it more likely that each page is backed up by the “normal write” according to the backup schedule data rather than by the “extra write”.
The lower frequency of the “extra write” allows the DB access processing portion 25 to allocate more processing capability to the transaction being currently processed.
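The second creating method may be sketched as follows. The function names `average_ranks` and `schedule_from_averages` and the past update orders are assumptions for illustration; only the four average values for “p1” to “p4” are taken from the description above.

```python
from collections import defaultdict
from typing import Dict, List


def average_ranks(past_orders: List[List[str]]) -> Dict[str, float]:
    """Average, per page, the rank (1 = earliest) it held in each past order."""
    ranks: Dict[str, List[int]] = defaultdict(list)
    for order in past_orders:
        for rank, page_id in enumerate(order, start=1):
            ranks[page_id].append(rank)
    return {page_id: sum(r) / len(r) for page_id, r in ranks.items()}


def schedule_from_averages(averages: Dict[str, float]) -> List[str]:
    """Arrange page IDs in ascending order of average rank."""
    return sorted(averages, key=lambda page_id: averages[page_id])


# Averages stated in the description.
schedule = schedule_from_averages({"p4": 3.7, "p1": 1.2, "p3": 2.9, "p2": 2.5})
print(schedule)

# Hypothetical past update orders, used only to show how averages could arise.
averages = average_ranks([["p1", "p2", "p3"], ["p1", "p3", "p2"], ["p2", "p1", "p3"]])
print(schedule_from_averages(averages))
```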
The backup process bk1 is a process of writing out the data of the in-memory database 31 at an end time (time t1) of the batch process b1 to the backup data 82. The backup process bk2 is a process of writing out the data of the in-memory database 31 at the end time (time t6) of the batch process b2 to the backup data 82. In addition, the backup processing portion 26 sets the backup flag 29 to “being processed” before executing each of the backup processes bk1 and bk2 and sets the flag 29 to “unprocessed” after executing it.
At a time t0, the terminal 70 starts the batch process b1 and the DB access processing portion 25 reads and writes the pages saved in the in-memory database 31. Herein, the in-memory database 31 has been already stored in the memory.
At a time t1, the terminal 70 finishes the batch process b1, and the terminal 70 and the DB access processing portion 25 start the next batch process b2. At the same time, the backup processing portion 26 creates the backup schedule data (see
The backup schedule data is created by rearranging the pages (p1, p2, p3, p4) to be backed up in the sequence of the pages to be updated earlier by the batch process b1, that is, the sequence of p1 to p2 to p3 to p4.
As shown in
As shown in
As shown in
The DB access processing portion 25 writes out the not-yet-updated data “35” of the page p4 to the backup data 82 on behalf of the backup processing portion 26. (This corresponds to the extra write.) At the same time, the DB access processing portion 25 writes the updated data “50” in the in-memory database 31. In this extra write, the DB access processing portion 25 changes the flag of the page p4 of the backup management table 30 into “complete”.
As shown in
Afterwards, at a time t5, the write time of the page p4 indicated by the backup schedule data arrives. Since the extra write of the page p4 was already executed at the time t3 in the backup process bk1, the backup processing portion 26 skips the backup of the page p4 and finishes the backup process bk1. The backup processing portion 26 decides to skip the backup of the page p4 because the flag of the table 30 for the page p4 is set to “complete”.
The foregoing backup process bk1 makes it possible to make the data (see
The foregoing description has concerned the backup process of the data from the in-memory database 31 to the backup data 82. Next, the description turns to the two cases of recovering (writing back) the data from the backup data 82 to the in-memory database 31 if a system failure occurs in the batch process. This recovery is executed by the recovering portion 27.
The first case concerns with the case that a system failure occurs before the backup process bk1 is completed as shown in
The process at times t0 to t5 holds true to the process shown in
When the batch process b2 and the backup process bk1 are executed in parallel at a time t11, a system failure occurs in the batch process. In response to a report on failure occurrence sent from the DB access processing portion 25, the backup processing portion 26 stops the backup process bk1 and executes the recovering process r1 of the in-memory database 31.
As shown in
As shown in
This write-back process results in making the data of the in-memory database 31 recovered as the data at the time t1. Going back to
The process at the times t0 to t5 holds true to the process shown in
As shown in
As shown in
This flow of process is executed for each page to be backed up (referred to as the target page) each time the write time of a target page or the write sequence thereof comes.
In step S101, the backup processing portion 26 refers to the backup management table 30. If the target page backup is “complete” (Yes in S101), the backup processing portion 26 skips the backup process and then finishes the process.
In step S102, if the target page backup is “incomplete” (No in S101), the backup processing portion 26 reads the data value of the target page from the in-memory database 31 and writes the read data value out to the backup data 82.
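Steps S101 and S102 may be sketched as follows. This is a minimal sketch under stated assumptions: the function name `normal_write`, the use of dictionaries for the in-memory database 31, the backup data 82 and the backup management table 30, and the value of the page p3 are hypothetical. The flag change to “complete” follows the description of the backup management table 30.

```python
from typing import Dict

COMPLETE = "complete"
INCOMPLETE = "incomplete"


def normal_write(
    page_id: str,
    in_memory_db: Dict[str, int],
    backup_data: Dict[str, int],
    backup_table: Dict[str, str],
) -> bool:
    """Back up one target page; return False if the page was skipped."""
    # S101: skip the page if it has already been backed up.
    if backup_table[page_id] == COMPLETE:
        return False
    # S102: read the page from the in-memory database and write it out.
    backup_data[page_id] = in_memory_db[page_id]
    backup_table[page_id] = COMPLETE
    return True


# Page p4 was already written out by the extra write (value "35"), and the
# in-memory database now holds the updated value "50". The value of p3 is
# hypothetical.
in_memory_db = {"p3": 30, "p4": 50}
backup_data = {"p4": 35}
backup_table = {"p3": INCOMPLETE, "p4": COMPLETE}

wrote_p3 = normal_write("p3", in_memory_db, backup_data, backup_table)
wrote_p4 = normal_write("p4", in_memory_db, backup_data, backup_table)
print(wrote_p3, wrote_p4, backup_data)
```

Because the page p4 is skipped, its backed-up value remains the pre-update value rather than the updated one.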
In step S201, the DB access processing portion 25 refers to the backup flag 29. If the backup processing portion 26 keeps the normal write “being processed” (Yes in S201), the portion 25 goes to a step S202, while if the portion 26 keeps the normal write “unprocessed” (No in S201), the portion 25 goes to a step S204.
In the step S202, the DB access processing portion 25 refers to the backup management table 30. If the target page backup is “complete” (Yes in S202), the portion 25 skips the backup process and goes to the step S204.
In a step S203, if the target page backup is “incomplete” (No in S202), the DB access processing portion 25 reads the not-yet-updated data value of the target page from the in-memory database 31 and then writes out the data value to the backup data 82.
In the step S204, the DB access processing portion 25 writes the updated data of the target page out to the in-memory database 31.
In a step S205, the processor 10 of the database management apparatus 1 determines if the requesting source for data update to be executed in the step S204 is the terminal 70. If the determining condition is met (Yes in S205), the DB access processing portion 25 goes to a step S206, while if it is not met (No in S205), the portion 25 finishes the process.
In the step S206, about the data updated and written in the in-memory database 31 in the step S204, the DB access processing portion 25 writes in the DB access history 53 the batch process ID of the batch process being executed, the ID of the page whose data value is to be updated and the data update time in a manner so that those factors may be matched to one another.
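Steps S201 to S206 may be sketched as follows, using the update of the page p4 from “35” to “50” as the example. The function name `update_page`, its parameters, the dictionary and list representations, and the numeric update time are assumptions for illustration, not the implementation of the embodiment.

```python
from typing import Dict, List, Tuple


def update_page(
    page_id: str,
    new_value: int,
    in_memory_db: Dict[str, int],
    backup_data: Dict[str, int],
    backup_table: Dict[str, str],
    backup_in_progress: bool,
    history: List[Tuple[str, str, int]],
    batch_id: str,
    from_terminal: bool,
    update_time: int,
) -> None:
    """Update one page, performing the extra write first when it is needed."""
    # S201: is the normal write "being processed"?
    if backup_in_progress:
        # S202: has the target page already been backed up?
        if backup_table[page_id] == "incomplete":
            # S203: extra write of the not-yet-updated value.
            backup_data[page_id] = in_memory_db[page_id]
            backup_table[page_id] = "complete"
    # S204: write the updated data to the in-memory database.
    in_memory_db[page_id] = new_value
    # S205: was the update requested by the terminal?
    if from_terminal:
        # S206: record batch process ID, page ID and update time.
        history.append((batch_id, page_id, update_time))


in_memory_db = {"p4": 35}
backup_data: Dict[str, int] = {}
backup_table = {"p4": "incomplete"}
history: List[Tuple[str, str, int]] = []

# Batch process b2 updates page p4 while the backup is in progress.
update_page("p4", 50, in_memory_db, backup_data, backup_table,
            True, history, "b2", True, 3)
print(in_memory_db, backup_data, backup_table, history)
```

After the call, the backup data holds the value at the backup reference time while the in-memory database holds the latest value.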
As set forth above, the database management method and apparatus according to the embodiments of the invention provide a capability of executing the backup process and the batch process in parallel and thereby shortening the total required time of the batch process and the promotion service.
If the backup process and the batch process are simply combined so that they are executed in parallel, however, the data overwrite problem arises.
In the foregoing embodiments, when such a data overwrite is about to take place, the loss of the target page's data through the overwrite is avoided by executing the extra write.
In the foregoing embodiments, by creating the highly accurate backup schedule data based on the data access history extracted from the DB access history 53, it is possible to lessen an execution frequency of the extra write that has an adverse effect on the batch process.
It should be further understood by those skilled in the art that although the foregoing description has been made on embodiments of the invention, the invention is not limited thereto and various changes and modifications may be made without departing from the spirit of the invention and the scope of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
2008-307284 | Dec 2008 | JP | national |