This application is based upon and claims the benefit of priority of the prior Japanese Patent Application No. 2015-003703, filed on Jan. 9, 2015, the entire contents of which are incorporated herein by reference.
The embodiment discussed herein is related to an apparatus, a snapshot management method, and a recording medium.
Conventionally, there is a technique in which a snapshot obtained by duplicating a database at a certain time is created. By accumulating snapshots at short time intervals, time points at each of which a restorable database exists can be increased. On the other hand, because the storage area to accumulate the snapshots is finite, there is a limit to the number of snapshots that can be accumulated. As a related art, there is e.g. a technique in which, when a failure occurs, a virtual machine is restored to the acquisition time point of a snapshot by using the snapshot of the virtual machine and input logs from the acquisition time point to the time point of the failure occurrence included in communication logs are populated into the restored virtual machine in chronological order. Furthermore, there is a technique in which header information of two packets transmitted to a standby server is properly rewritten so that the time interval of two packets transmitted to a main server may become identical to the time interval of the two packets transmitted to the standby server.
As related-art documents, there are Japanese Laid-open Patent Publication No. 2009-080705 and Japanese Laid-open Patent Publication No. 2011-199680.
However, according to the related art, it is difficult to determine which snapshot is to be deleted from the accumulated plural snapshots. For example, if an arbitrary snapshot is simply deleted from the plural snapshots, the amount of accumulated snapshots can be reduced, but the restoring time of the database may increase.
According to an aspect of the invention, an apparatus includes: a first memory configured to store a plurality of snapshots of a database, each of the plurality of snapshots being obtained by duplicating contents of the database at a different time; a second memory configured to store correspondence information for each of processing requests to the database, the correspondence information including a time at which each of the processing requests is accepted by the database in association with a time period taken for processing each of the processing requests; and a processor coupled to the first memory and the second memory and configured to: identify a total time period to process processing requests received by the database from a first time at which a first snapshot is obtained to a second time at which a second snapshot is obtained, the first snapshot and the second snapshot being included in the plurality of snapshots; decide a snapshot to be deleted from the plurality of snapshots based on the identified total time period; and delete the decided snapshot from the first memory.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
One aspect of the embodiment of a snapshot management method, a system, and a recording medium to be disclosed intends to suppress increase in the restoring time of a database and reduce the amount of accumulation of snapshots. The embodiment of the disclosure will be described in detail below with reference to the accompanying drawings.
The DB 102 accepts a processing request. When accepting a processing request, the DB 102 executes processing corresponding to the processing request. The processing request is described, for example, in a database query language, e.g. as an SQL query. Hereinafter, suppose that the processing request is an SQL query. For example, if the SQL query is a query to refer to the DB 102 and return data matching a condition, the processing corresponding to the SQL query is processing of retrieving the data matching the condition and returning the data to the request source as a response. Moreover, if the SQL query is a query to update the data of the DB 102, the processing corresponding to the SQL query is processing of updating the data of the DB 102 and returning the update result.
By saving the issued SQL query, the contents of not only the DB at the time of a snapshot but also the DB at an arbitrary time can be restored if the SQL query is issued to the DB which is restored from the snapshot.
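As a non-limiting illustration, restoring the DB at an arbitrary time from a snapshot and the saved SQL queries may be sketched as follows. The sketch assumes an in-memory SQLite database as a stand-in for the DB; the function name `restore_at`, the table `kv`, and the logged queries are hypothetical and do not appear in the embodiment.

```python
import sqlite3


def restore_at(snapshot, snapshot_time, query_log, target_time):
    """Copy the snapshot into a fresh DB, then replay the saved queries
    accepted after the snapshot time and up to the target time."""
    restored = sqlite3.connect(":memory:")
    snapshot.backup(restored)  # duplicate the snapshot contents
    for accepted_at, sql in sorted(query_log):
        if snapshot_time < accepted_at <= target_time:
            restored.execute(sql)
    restored.commit()
    return restored


# Hypothetical snapshot duplicated at time 0.
snapshot = sqlite3.connect(":memory:")
snapshot.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v INTEGER)")
snapshot.execute("INSERT INTO kv VALUES ('a', 1)")
snapshot.commit()

# Hypothetical saved SQL queries as (acceptance time, SQL text).
query_log = [
    (1, "INSERT INTO kv VALUES ('b', 2)"),
    (2, "UPDATE kv SET v = 10 WHERE k = 'a'"),
    (3, "DELETE FROM kv WHERE k = 'b'"),
]

db_at_2 = restore_at(snapshot, 0, query_log, 2)
rows_at_2 = db_at_2.execute("SELECT k, v FROM kv ORDER BY k").fetchall()
db_at_3 = restore_at(snapshot, 0, query_log, 3)
rows_at_3 = db_at_3.execute("SELECT k, v FROM kv ORDER BY k").fetchall()
```

The closer the snapshot time is to the target time, the fewer queries are replayed, which is why the number of saved snapshots affects the restoring time.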
Here, the restoring time of the DB can be shortened as the number of saved snapshots is increased. However, there is a limit to the number of snapshots that can be saved, and therefore snapshots are deleted. It is difficult, though, to reduce the data amount of the snapshots while suppressing an increase in the restoring time of the database. For example, if an arbitrary snapshot is simply deleted from the plural snapshots, the amount of accumulated snapshots can be reduced, but the restoring time of the database may increase.
Therefore, the management device 101 in the present embodiment decides the snapshot to be deleted from plural snapshots on the basis of the processing time of the SQL queries received between the creation time points of the respective snapshots. This allows the management device 101 to delete a snapshot that can be quickly re-created, which suppresses an increase in the restoring time of the DB while reducing the amount of accumulated snapshots. Although the management device 101 decides the snapshot to be deleted in the present embodiment, the management device 101 may instead decide the snapshot to be left.
The specific operation of the management device 101 will be described by using
The management device 101 stores correspondence information in which the times of the SQL queries 1 to 4 are associated with the processing times of the SQL queries 1 to 4. Here, the times of the SQL queries 1 to 4 may be either the request times of the SQL queries 1 to 4 or the response times.
Furthermore, suppose that the management device 101 manages snapshots s0 to s2 and, in
The management device 101 refers to the correspondence information and, for one snapshot of the snapshots s1 and s2, identifies the processing time of the SQL queries accepted by the DB 102 from the time of another snapshot different from the one snapshot until the time of the one snapshot. This processing time is the creation time of the one snapshot.
The management device 101 may identify the processing time of the SQL query about one snapshot or may identify the processing time of the SQL query about all of the snapshots of the deletion candidates. Furthermore, another snapshot different from the one snapshot may be any snapshot as long as this snapshot is a snapshot previous to the one snapshot. However, it is preferable that this snapshot is a snapshot immediately previous to the one snapshot.
For example, in the example of
Then, the management device 101 decides the snapshot to be deleted from plural snapshots of the deletion candidates on the basis of the identified creation time of the snapshot. For example, when the creation time is identified about a certain snapshot, the management device 101 decides the certain snapshot as the snapshot to be deleted if the identified creation time is shorter than a predetermined threshold. Furthermore, when the creation times are identified about all of the snapshots of the deletion candidates, the management device 101 decides the snapshot having the shortest creation time among the identified creation times as the snapshot to be deleted.
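As a non-limiting illustration, the two decision methods described above (comparison with a threshold, and selection of the shortest creation time) may be sketched as follows. The snapshot times, the processing times in the correspondence information, the threshold, and the treatment of the interval boundaries are assumptions made only for this sketch.

```python
def creation_time(prev_time, snap_time, correspondence):
    """Sum the processing times of the queries accepted after the previous
    snapshot time and up to the snapshot time (boundary handling assumed)."""
    return sum(p for accepted_at, p in correspondence
               if prev_time < accepted_at <= snap_time)


def decide_by_threshold(creation_times, threshold):
    """Snapshots whose creation time is shorter than the threshold."""
    return [sid for sid, c in creation_times.items() if c < threshold]


def decide_shortest(creation_times):
    """The snapshot having the shortest creation time."""
    return min(creation_times, key=creation_times.get)


# Hypothetical snapshot times and correspondence information,
# given as (acceptance time, processing time) pairs.
snapshot_times = {"s0": 0, "s1": 10, "s2": 20}
correspondence = [(2, 1.0), (7, 2.0), (12, 0.5), (18, 0.5)]

creation_times = {
    "s1": creation_time(snapshot_times["s0"], snapshot_times["s1"], correspondence),
    "s2": creation_time(snapshot_times["s1"], snapshot_times["s2"], correspondence),
}
by_threshold = decide_by_threshold(creation_times, threshold=2.0)
shortest = decide_shortest(creation_times)
```

With these hypothetical values, s2 can be re-created from s1 in less time than s1 can be re-created from s0, so both methods select s2.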
In the example of
The DB restoring device 211 includes an SQL query storage unit 221 and a snapshot storage unit 222. The production system 201 includes an HTTP server 231 and a DB server 232. Furthermore, the production system 201 includes networks 233 and 234. The HTTP request storage device 213, the HTTP server 231, and the user terminal 214 are coupled by the network 233. The HTTP server 231, the DB server 232, and the DB restoring device 211 are coupled by the network 234. The test system 202 includes an access reproduction device 241, an HTTP server 242, and a test DB server 243.
Here, the DB restoring device 211 is equivalent to the management device 101 illustrated in
The production system 201 is a system that offers some kind of service. For example, in the example of
The HTTP server 231 generates an SQL query from the received HTTP request and transmits the SQL query to the network 234. The network 234 transmits the received SQL query to each of the DB server 232 and the DB restoring device 211. The DB restoring device 211 stores the received SQL query in the SQL query storage unit 221.
The DB server 232 executes processing corresponding to the received SQL query. Furthermore, the DB server 232 creates a snapshot in a time period in which less access is made, such as at the time of maintenance or at night. The snapshot created in such a time period will be referred to as the “initial snapshot.” The DB server 232 transmits the created snapshot to the DB restoring device 211. The DB restoring device 211 stores the received snapshot in the snapshot storage unit 222. Here, the HTTP requests saved in the HTTP request storage device 213 and the SQL queries saved in the SQL query storage unit 221 may be based on packet capture or may be logs, as long as the issuance order can be identified collectively.
The test system 202 is a system that tests the operation of the production system 201. For example, when a trouble occurs in the production system 201, the access reproduction control device 212 receives an instruction to reproduce the state at the time of the occurrence of the trouble from the verifier terminal 215 by operation by a verifier. Then, the access reproduction control device 212 instructs the DB restoring device 211 to reproduce, in the test DB server 243, the contents of the DB server 232 at the time of the occurrence of the trouble in the production system 201.
Furthermore, the access reproduction control device 212 instructs the access reproduction device 241 to transmit an HTTP request desired to be reproduced to the HTTP server 242 by using the HTTP request saved in the HTTP request storage device 213. The access reproduction device 241 transmits the HTTP request saved in the HTTP request storage device 213 to the HTTP server 242 while converting the Internet protocol (IP) address and the port number of the HTTP request in conformity with the test system 202. A more detailed operation example will be described with
In
The CPU 301 is an arithmetic processing device responsible for overall control of the computer. Furthermore, the computer may include plural CPUs. The ROM 302 is a non-volatile memory that stores programs such as a boot program. The RAM 303 is a volatile memory used as a work area of the CPU 301.
The disk drive 304 is a control device that controls reading and writing of data on the disk 305 in accordance with control of the CPU 301. As the disk drive 304, e.g. a magnetic disk drive, an optical disk drive, a solid state drive, or the like can be employed. The disk 305 is a non-volatile memory that stores data written under control by the disk drive 304. For example, if the disk drive 304 is a magnetic disk drive, a magnetic disk can be employed as the disk 305. Furthermore, if the disk drive 304 is an optical disk drive, an optical disk can be employed as the disk 305. In addition, if the disk drive 304 is a solid state drive, a semiconductor memory formed of a semiconductor element, i.e. a so-called semiconductor disk, can be employed as the disk 305.
The communication interface 306 is a control device that is responsible for an interface between a network and the inside and controls input and output of data from and to another device. For example, the communication interface 306 is coupled to another device through a communication line and via a network. As the communication interface 306, e.g. a modem, a local area network (LAN) adaptor, or the like can be employed.
Furthermore, the user terminal 214 and the verifier terminal 215 include hardware such as a display, a keyboard, and a mouse in addition to the hardware illustrated in
Next, regarding (2) in
Then, regarding (3) in
Next, regarding (4) in
Then, regarding (5) in
Next, regarding (6) in
For example, suppose that, in the example of
(Functional Configuration Example of DB Restoring Device 211)
Furthermore, the DB restoring device 211 can access the SQL query storage unit 221, the snapshot storage unit 222, and snapshot use information 610. The SQL query storage unit 221, the snapshot storage unit 222, and the snapshot use information 610 are stored in a storage device such as the RAM 303 or the disk 305. The SQL query storage unit 221 stores the times of SQL queries and the processing times of the SQL queries in association with each other. The snapshot storage unit 222 stores snapshots. The snapshot use information 610 stores information to identify the time when the snapshot stored in the snapshot storage unit 222 is used to restore the contents of the DB server 232. One example of the stored contents of the snapshot use information 610 will be described with
The identifying unit 601 refers to the SQL query storage unit 221 and identifies the processing time of the SQL queries accepted by the DB server 232 from the time of another snapshot different from one snapshot of plural snapshots until the time of the one snapshot. The processing time of the SQL queries may be the processing time of the queries to update the DB server 232 among all SQL queries. A specific identifying method will be described with
The deciding unit 602 decides the snapshot to be deleted from the plural snapshots on the basis of the creation time identified by the identifying unit 601. The method for the decision may be absolute comparison with use of a threshold. Alternatively, if plural creation times exist as the creation times identified by the identifying unit 601, the deciding unit 602 may decide the snapshot having the shortest creation time among the plural snapshots as the snapshot to be deleted.
Furthermore, the deciding unit 602 may decide the snapshot to be deleted from the plural snapshots on the basis of the time identified by the identifying unit 601 and the time when the snapshot is used to restore the contents of the DB server 232, stored in the snapshot use information 610. For example, the deciding unit 602 decides the snapshot to be deleted in accordance with the least recently used (LRU) algorithm. For example, if the times when the snapshot is used to restore the contents of the DB server 232 are equivalent, the deciding unit 602 may decide the snapshot having a smaller number of times of being used to restore the contents of the DB server 232 as the snapshot to be deleted. A specific example of the decision of the snapshots to be deleted will be described with
The deleting unit 603 deletes the snapshot decided by the deciding unit 602 from the snapshot storage unit 222.
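As a non-limiting illustration, a decision in accordance with the LRU algorithm, with the number of times of use as a tie-breaker, may be sketched as follows. The class `SnapshotUse` and its field names are hypothetical. Treating a snapshot that has never been used for restoring as the least recently used one is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SnapshotUse:
    snapshot_id: str
    last_used: Optional[float]  # None if never used for restoring
    use_count: int = 0


def lru_victim(entries):
    """Pick the least recently used snapshot; among snapshots with the same
    last use time, pick the one used fewer times."""
    def key(entry):
        never_used = entry.last_used is None
        # Never-used snapshots sort first (assumption of this sketch).
        return (0 if never_used else 1, entry.last_used or 0.0, entry.use_count)
    return min(entries, key=key).snapshot_id


# Hypothetical use records.
entries = [
    SnapshotUse("id1", last_used=5.0, use_count=3),
    SnapshotUse("id2", last_used=5.0, use_count=1),
    SnapshotUse("id3", last_used=9.0, use_count=1),
]
victim = lru_victim(entries)
victim_with_unused = lru_victim(entries + [SnapshotUse("id4", last_used=None)])
```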
The classifying unit 604 refers to the SQL query storage unit 221 and classifies the SQL queries stored by the SQL query storage unit 221 into plural groups in accordance with the times of the SQL queries. Corresponding to each group of the plural groups classified by the classifying unit 604, the creating unit 605 creates a snapshot obtained by duplicating the contents of the DB obtained by issuing the SQL queries belonging to each group. Here, the issuance destination of the SQL queries is the DB including the contents at the time of the first SQL query among the processing requests belonging to each group, and is e.g. the test DB server 243. The method for the classification and the method for creating the snapshots will be described with
The snapshot use information 610 includes fields of immediately-previous sql of snapshot, test use time, and snapshot ID. In the field of immediately-previous sql of snapshot, identification information of the SQL query issued immediately previous to the DB server 232 at the duplication time point of the corresponding snapshot is stored. In the field of test use time, the time when the corresponding snapshot is used to restore the contents of the DB server 232 is stored. In the field of snapshot ID, identification information of the corresponding snapshot is stored.
Here, for simplification of description, suppose that sql0 is issued at the oldest time among sql0 to sql8 in the snapshot use information 610 illustrated in
For example, the snapshot use information 610 indicates that the snapshot id8 is used at t8, which is the latest time. However, the SQL query issued immediately previous to the DB server 232 at the duplication time point of the snapshot id8 is sql7 and is issued at an older time than the SQL query sql8 corresponding to the snapshot id7.
Furthermore, the DB restoring device 211 also manages the initial snapshot by using the snapshot use information 610. At this time, in order to indicate that this snapshot is the initial snapshot, the snapshot use information 610 stores a flag indicating that this snapshot is the initial snapshot in association with the snapshot ID although not represented in
In
First, the DB restoring device 211 selects n [%] of the snapshots in order, starting from the snapshot whose last use for restoring is the oldest. n [%] is a value decided by an administrator of the DB restoring device 211. In the example of
Furthermore, in the middle of access reproduction, in order to narrow down the reproduction target more finely, the reproduction is often carried out from a time slightly subsequent to the specified time. In this case, because the snapshot slightly subsequent to the specified time can be quickly made from the snapshot of the specified time, the DB restoring device 211 can shorten the creation time of the snapshot by leaving the snapshot of the specified time.
Then, the DB restoring device 211 decides the snapshots id2 and id4, whose creation time is short, among the selected snapshots id1 to id4 as the snapshots to be deleted. In this manner, snapshots with a short creation time are treated as the deletion target in consideration of the trouble of creation in the future. This shortens the average creation time in the DB restoring device 211.
The DB restoring device 211 identifies the time obtained by summing the processing times of the SQL queries as the creation time of the snapshot. Furthermore, the DB restoring device 211 may identify the processing times of the update SQL queries among the SQL queries as the creation time of the snapshot. For example, if the DB server 232 is a DB that simultaneously accepts and processes SQL queries, the DB restoring device 211 executes the processing on the basis of the assumption that the processing time of the update SQL query does not change even when another reference SQL query simultaneously processed is excluded. Actually there is a possibility that excluding the reference SQL query shortens the processing time of the update SQL query. However, the amount of time shortening is small relative to the processing time of the update SQL query and does not have to be exactly obtained. Furthermore, even with an asynchronous update SQL query, the processing time of the asynchronous update SQL query does not change.
Regarding the processing time of the SQL query, the DB restoring device 211 defines e.g. the time obtained by subtracting the request time of the SQL query from the response time as the processing time of the SQL query. Alternatively, suppose that it is apparent from the specifications of the production system 201 that the processing time of the SQL query issued in the production system 201 is substantially steady. In this case, the DB restoring device 211 may employ a predetermined value as the processing time of the SQL query.
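As a non-limiting illustration, the identification of the creation time from the saved SQL queries may be sketched as follows. The class `LoggedQuery`, the set `UPDATE_VERBS` used to recognize update SQL queries, and the sample values are assumptions made only for this sketch.

```python
from dataclasses import dataclass

# Assumed set of statements treated as update SQL queries.
UPDATE_VERBS = {"INSERT", "UPDATE", "DELETE"}


@dataclass
class LoggedQuery:
    sql: str
    request_time: float
    response_time: float


def processing_time(query, fixed=None):
    """Response time minus request time, or a predetermined value if given."""
    if fixed is not None:
        return fixed
    return query.response_time - query.request_time


def is_update(query):
    words = query.sql.split()
    return bool(words) and words[0].upper() in UPDATE_VERBS


def creation_time(queries, updates_only=True, fixed=None):
    """Sum the processing times, optionally of the update SQL queries only."""
    return sum(processing_time(q, fixed) for q in queries
               if is_update(q) or not updates_only)


# Hypothetical saved SQL queries.
log = [
    LoggedQuery("SELECT * FROM t", 0.0, 0.5),
    LoggedQuery("INSERT INTO t VALUES (1)", 1.0, 1.5),
    LoggedQuery("UPDATE t SET c = 2", 2.0, 2.25),
]
all_time = creation_time(log, updates_only=False)
update_time = creation_time(log)
fixed_time = creation_time(log, fixed=0.125)
```

Excluding the reference SQL queries reflects that only the update SQL queries have to be reissued to re-create the contents of the DB.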
In the example of
The DB restoring device 211 identifies, as the number of snapshots that can be saved, a value obtained by dividing a value resulting from subtraction of the present data amount of the snapshot storage unit 222 from the amount of data that can be saved in the snapshot storage unit 222 by the data amount of the snapshot per one snapshot. Suppose that the DB restoring device 211 identifies the number of snapshots that can be saved as three in the example of
Next, the DB restoring device 211 decides the update SQL queries corresponding to the timing of saving of a snapshot from the processing time of sql so that the three snapshots to be created may be evenly distributed. In the example of
Next, the DB restoring device 211 issues sql1 to sql10 to a test DB server 243_s0. Thereby, the DB restoring device 211 creates a snapshot s10 obtained by duplicating the contents of a test DB server 243_s10 that has become DB state 1. Similarly, the DB restoring device 211 issues sql11 to sql20 to the test DB server 243_s10. Thereby, the DB restoring device 211 creates a snapshot s20 obtained by duplicating the contents of a test DB server 243_s20 that has become DB state 2. Similarly, the DB restoring device 211 issues sql21 to sql30 to the test DB server 243_s20. Thereby, the DB restoring device 211 creates a snapshot s30 obtained by duplicating the contents of a test DB server 243_s30 that has become the target state.
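As a non-limiting illustration, the identification of the number of snapshots that can be saved and the decision of the save timing may be sketched as follows. The data amounts are hypothetical, and splitting the cumulative processing time into equal parts is one possible reading of "evenly distributed" that is assumed here.

```python
def savable_snapshots(capacity, used, size_per_snapshot):
    """(amount that can be saved - present amount) / amount per snapshot."""
    return (capacity - used) // size_per_snapshot


def save_points(processing_times, count):
    """Return 0-based indices of the update SQL queries after which a
    snapshot is saved, splitting the total processing time evenly."""
    total = sum(processing_times)
    elapsed = 0.0
    next_k = 1  # index of the next boundary, total * next_k / count
    result = []
    for i, p in enumerate(processing_times):
        elapsed += p
        if next_k <= count and elapsed >= total * next_k / count:
            result.append(i)
            # Skip every boundary already passed by this query.
            while next_k <= count and elapsed >= total * next_k / count:
                next_k += 1
    return result


# Hypothetical data amounts in arbitrary units.
count = savable_snapshots(capacity=100, used=40, size_per_snapshot=20)

# 30 update SQL queries with equal (hypothetical) processing times.
uniform = save_points([1] * 30, count)
# Unequal processing times shift the save points accordingly.
skewed = save_points([4, 1, 1, 2, 2, 2], count)
```

With 30 queries of equal processing time and three snapshots, the save points fall after the 10th, 20th, and 30th queries, which corresponds to the snapshots s10, s20, and s30 described above.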
Next, flowcharts representing processing executed by the access reproduction system 200 will be described by using
The DB restoring device 211 saves an initial snapshot (S1101). Next, the access reproduction system 200 starts saving of HTTP requests and SQL queries (S1102). Regarding the processing of S1102, for example, the HTTP request storage device 213 starts saving of HTTP requests and the DB restoring device 211 starts saving of SQL queries.
After the elapse of a certain time from the processing of S1102, the DB restoring device 211 executes speculative snapshot creation processing (S1103). Details of the speculative snapshot creation processing will be described with
Then, the DB restoring device 211 restores the DB by using the selected snapshot and issuing the SQL queries (S1105). Next, the DB restoring device 211 determines whether the snapshot can be saved or no SQL query was issued in the DB restoring (S1106). If the snapshot cannot be saved and an SQL query was issued in the DB restoring (S1106: No), the DB restoring device 211 executes deleted-snapshot decision processing (S1107). Details of the deleted-snapshot decision processing will be described with
After the end of the processing of S1107 or if the snapshot can be saved or the SQL query is not issued in the DB restoring (S1106: Yes), the DB restoring device 211 determines whether or not the SQL query is issued in the DB restoring (S1108). If the SQL query is issued in the DB restoring (S1108: Yes), the DB restoring device 211 saves the snapshot corresponding to the restored DB (S1109).
After the end of the processing of S1109 or if the SQL query is not issued in the DB restoring (S1108: No), the DB restoring device 211 updates the snapshot use information (S1110). Then, the access reproduction control device 212 performs a test based on access reproduction (S1111). After the end of the processing of S1111, the access reproduction system 200 makes transition to the processing of S1103. By executing the access reproduction processing, the DB restoring device 211 can reproduce the state in the production system 201 at the time point specified by the verifier by the test system 202 and the verifier can carry out verification.
The DB restoring device 211 selects n [%] of the snapshots in order, starting from the snapshot whose last use for restoring is the oldest (S1201). Next, for each of the selected snapshots, the DB restoring device 211 identifies the creation time from the immediately-previous snapshot (S1202). Then, the DB restoring device 211 decides the snapshot having the shortest creation time among the selected snapshots as the snapshot to be deleted (S1203). Then, the DB restoring device 211 deletes the decided snapshot (S1204). After the end of the processing of S1204, the DB restoring device 211 ends the deleted-snapshot decision processing. By executing the deleted-snapshot decision processing, the DB restoring device 211 reduces the data amount of the snapshot storage unit 222 and can suppress an increase in the restoring time of the DB.
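As a non-limiting illustration, the processing of S1201 to S1203 may be sketched as follows. The dictionary keys, the sample values, and the rounding of the n [%] selection (rounded up, with at least one snapshot selected) are assumptions made only for this sketch.

```python
import math


def decide_deleted_snapshot(snapshots, n_percent):
    """Select n% of the snapshots starting from the least recently used
    (S1201), then pick the one with the shortest creation time (S1202, S1203)."""
    if not snapshots:
        return None
    ordered = sorted(snapshots, key=lambda s: s["last_used"])
    # Rounding up and selecting at least one snapshot are assumptions.
    k = max(1, math.ceil(len(ordered) * n_percent / 100))
    candidates = ordered[:k]
    return min(candidates, key=lambda s: s["creation_time"])["id"]


# Hypothetical snapshots id1 to id8, used for restoring at times 1 to 8.
creation = [5.0, 1.0, 4.0, 2.0, 0.5, 3.0, 6.0, 7.0]
snapshots = [
    {"id": f"id{i}", "last_used": i, "creation_time": c}
    for i, c in enumerate(creation, start=1)
]
deleted = decide_deleted_snapshot(snapshots, n_percent=50)
```

In this hypothetical data, id5 has the shortest creation time overall but is not selected because it was used recently; the decision is limited to id1 to id4.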
The DB restoring device 211 identifies the number of snapshots that can be saved (S1301). Next, the DB restoring device 211 identifies the processing time of the update SQL query through analysis of the SQL queries saved in the SQL query storage unit 221 (S1302). Then, the DB restoring device 211 decides the update SQL queries corresponding to the timing of saving of a snapshot from the identified number of snapshots that can be saved and the processing time of the update SQL query (S1303). Next, the DB restoring device 211 sets “lastpos” to the last update SQL query (S1304). Then, the DB restoring device 211 sets “start” to the first update SQL query (S1305).
Next, the DB restoring device 211 determines whether or not the update SQL query corresponding to the timing of saving of a snapshot exists in the range from “start” to “lastpos” (S1401). If the update SQL query corresponding to the timing of saving of a snapshot exists (S1401: Yes), the DB restoring device 211 sets “pos” to the next update SQL query corresponding to the timing of saving of a snapshot (S1402). Next, the DB restoring device 211 issues the update SQL queries from “start” to “pos” to the test DB server 243 (S1403). Then, the DB restoring device 211 duplicates the contents of the test DB server 243 to create a snapshot (S1404). The DB restoring device 211 saves the created snapshot in the snapshot storage unit 222. Next, the DB restoring device 211 updates the snapshot use information 610 (S1405).
Then, the DB restoring device 211 sets “start” to the update SQL query next to “pos” (S1406). Regarding the processing of S1405, the identification information of the SQL query indicated by “pos” is stored in the field of immediately-previous sql of snapshot. Nothing is stored in the field of test use time and an empty field is left. Next, the DB restoring device 211 makes transition to the processing of S1401.
On the other hand, if the update SQL query corresponding to the timing of saving of a snapshot does not exist (S1401: No), the DB restoring device 211 issues the update SQL queries from “start” to “lastpos” to the test DB server 243 (S1407). Next, the DB restoring device 211 duplicates the contents of the test DB server 243 to create a snapshot (S1408). The DB restoring device 211 saves the created snapshot in the snapshot storage unit 222. Then, the DB restoring device 211 updates the snapshot use information 610 (S1409). Regarding the processing of S1409, the identification information of the SQL query indicated by “lastpos” is stored in the field of immediately-previous sql of snapshot. Nothing is stored in the field of test use time and an empty field is left.
After the end of the processing of S1409, the DB restoring device 211 ends the speculative snapshot creation processing. By executing the speculative snapshot creation processing, the DB restoring device 211 can shorten the average value of the restoring time of the DB.
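As a non-limiting illustration, the loop over “start,” “pos,” and “lastpos” in the speculative snapshot creation processing may be sketched as follows. The test DB server is replaced by a plain list of issued queries, and the snapshot IDs are hypothetical. Ending the loop when no update SQL query remains after “pos” is an assumption of this sketch, made so that an empty final range does not produce a duplicate snapshot.

```python
def speculative_snapshots(update_queries, save_points):
    """update_queries: update SQL queries in issuance order.
    save_points: sorted 0-based indices corresponding to the save timing."""
    test_db = []    # stand-in for the test DB server: queries issued so far
    snapshots = []  # duplicated contents of the stand-in test DB
    use_info = []   # rows of the snapshot use information
    if not update_queries:
        return snapshots, use_info
    lastpos = len(update_queries) - 1   # S1304
    start = 0                           # S1305
    while start <= lastpos:
        upcoming = [p for p in save_points if start <= p <= lastpos]  # S1401
        pos = upcoming[0] if upcoming else lastpos                    # S1402
        test_db.extend(update_queries[start:pos + 1])                 # S1403, S1407
        snapshots.append(list(test_db))                               # S1404, S1408
        use_info.append({                                             # S1405, S1409
            "immediately_previous_sql": update_queries[pos],
            "test_use_time": None,  # left empty for speculative snapshots
            "snapshot_id": f"s{pos + 1}",
        })
        start = pos + 1                                               # S1406
    return snapshots, use_info


# Hypothetical update SQL queries sql1 to sql30, saved after sql10 and sql20.
queries = [f"sql{i}" for i in range(1, 31)]
snaps, info = speculative_snapshots(queries, save_points=[9, 19])
```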
As described above, according to the DB restoring device 211, the snapshot to be deleted is decided from plural snapshots on the basis of the processing time of the SQL queries received between the creation time points of the respective snapshots, i.e. the creation time of the snapshot. This allows the DB restoring device 211 to suppress an increase in the restoring time of the contents of the DB server 232 while reducing the data amount by deleting the snapshot that can be quickly re-created. Furthermore, the DB restoring device 211 can create the snapshot without imposing a burden on the production system 201. Even if a stop for several seconds for creating the snapshot is permitted in the production system 201, it is difficult to save the snapshot at arbitrary timing. In contrast, the DB restoring device 211 can save the snapshot at arbitrary timing.
Furthermore, according to the DB restoring device 211, the snapshot to be deleted may be decided on the basis of the creation time of the snapshot and the test use time of the snapshot. This relies on the rule of thumb that, when a trouble in the production system 201 is corrected, a problem readily occurs again at the corrected place. By leaving the snapshots recently used for restoring, the DB restoring device 211 can shorten the creation time of the snapshot.
Moreover, according to the DB restoring device 211, the snapshot may be speculatively created with reference to the SQL query storage unit 221. This allows the DB restoring device 211 to shorten the average value of the restoring time of the DB. Furthermore, the snapshot speculatively created is readily deleted because the field of test use time is empty. Therefore, a situation does not occur in which a snapshot frequently used is deleted due to the snapshot speculatively created.
The snapshot management method described in the present embodiment can be implemented by executing a program prepared in advance by a computer such as a personal computer, a workstation or the like. The present snapshot management program is recorded in a computer-readable recording medium such as a hard disk, a flexible disk, a compact disc-read only memory (CD-ROM), or a digital versatile disk (DVD) and is read out from the recording medium by the computer to be executed. Furthermore, the present snapshot management program may be distributed via a network such as the Internet.
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2015-003703 | Jan 2015 | JP | national |