This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2016-057003, filed on Mar. 22, 2016; the entire contents of which are incorporated herein by reference.
Embodiments described herein relate generally to a data management device, a data management method, and a computer program product.
As a data management method used in assembling a database, the key value store (KVS) method is known. In the KVS method, data made of pairs of keys, which represent identification information, and values, which are the targets for writing and reading, is stored in a storage device. The user can specify a key, and can write or read the desired value at high speeds.
If the KVS-type data is replicated among a plurality of storage devices, it becomes possible to build a storage system having redundancy and a high degree of reliability. Usually, the operations for such replication are performed not in the client server representing the user interface but in the storage system. For example, there is a method in which a storage server copies a snapshot of the data, which is stored in a host storage device, into slave storage devices.
In a KVS-type database assembled across a plurality of storage devices, the order of writing needs to be managed in order to guarantee consistency in the operations among the storage devices. Put simply, if the number of writing operations that can be performed at a single timing is limited to one, it becomes possible to guarantee consistency. However, according to this method, operations such as writing cannot be processed using a pipeline, thereby leading to a decline in the throughput.
The database 11 is a KVS-type distributed database including a plurality of KVS-type databases 31 to 33. Each of the KVS-type databases 31 to 33 is used to store data made of pairs of keys, which represent identification information, and values, which are the targets for writing and reading.
The obtaining unit 21 obtains a user request 10 (a first-type request) as an instruction for writing a value in the database 11 or reading a value from the database 11. The user request 10 includes address information corresponding to a writing operation or a reading operation. The address information can be, for example, information enabling identification of sectors assigned according to the logical block addressing (LBA). However, the address information is not limited to the LBA-based information, and alternatively can be a KVS key, for example.
The converting unit 22 converts the user request 10 into a KVS request 20 (a second-type request). The KVS request 20 can be a KVS command including a key. In the first embodiment, the key included in the KVS request 20 includes a combination of address information and version information. Herein, for each set of address information, the version information is assigned and is updated every time a writing instruction is issued in which the corresponding address information represents the writing position. For example, in the case in which version information “0” is assigned for the initial writing instruction in which address information “0” represents the writing position, when the second writing instruction is issued in which the address information “0” represents the writing position, the version information is updated to “1” (this manner of updating is only exemplary). That is, the converting unit 22 generates information containing a combination of the address information, which is included in the user request 10, and the version information, which is updated every time a writing instruction is issued; and generates a key to be used in searching the KVS-type database 11.
Based on the KVS request 20 generated by the converting unit 22, the control unit 23 either performs a writing operation for writing a value in the database 11 or performs a reading operation for reading a value from the database 11. As a result of a writing operation performed by the control unit 23, data containing key-value pairs that are distinguished by keys including version information is written in the database 11. Since the version information is updated every time a writing instruction is issued, consistency in the order of writing is maintained.
The CPU 111 follows a control program stored in the ROM 113, and performs predetermined arithmetic processing using the RAM 112 as the working area. The input device 114 is a device used to input information from outside. Examples of the input device 114 include a keyboard, a mouse, and a touch-sensitive panel. The output device 115 is a device used to output internally-generated information to outside. Examples of the output device 115 include a display and a speaker. The communication I/F 116 is a device that enables transmission of information to and reception of information from external devices via a network 121. Using the data management device 101, at least some of the functions of the obtaining unit 21, the converting unit 22, and the control unit 23 are implemented. The obtaining unit 21, the converting unit 22, and the control unit 23 can be configured using the CPU 111 that performs operations according to computer programs; or can be configured using various logic circuits; or can be configured using cooperation between the CPU 111 and various logic circuits.
The database system 201 includes a storage server (a file server) 205 and distributed storage 206. In an identical manner to the data management device 101, the storage server 205 is an information processing device that includes a CPU controlled by computer programs, a ROM used to store computer programs, and a RAM serving as the working area. The distributed storage 206 includes a plurality of storage devices in each of which a KVS-type database is assembled. A storage device can be a flash memory device (such as a solid state drive (SSD)) or can be a magnetic memory device (such as a hard disk drive (HDD)). Based on the KVS protocol, the storage server 205 performs an operation for writing a value in the distributed storage 206, an operation for reading a value from the distributed storage 206, an operation for setting a master storage device, and an operation for setting slave storage devices. Using the database system 201, the functions of the database 11 are implemented.
Meanwhile, the hardware configuration illustrated in
The data management device 101 includes an application 301 and a block storage unit 302.
The application 301 corresponds to at least some portion of the obtaining unit 21 illustrated in
The application 301 obtains (generates) the user request 10 based on the user input. In this example, the user request 10 can also include instruction-details information, address information, and a value. The instruction-details information enables identification of the type of the instruction desired by the user; and can be a character string, a symbol, or a value assigned to each type of instruction such as “writing” or “reading”. The address information enables identification of the writing position or the reading position as described earlier, and can be a value assigned according to the LBA method or can be a key in the KVS method. The value represents the target data for writing, and can be a text file or an image file. Depending on the type of instruction (for example, in a reading instruction), the value may not be included in the user request 10.
The block storage unit 302 performs, based on the user request 10, an operation for writing a value in the database system 201 or an operation for reading a value from the database system 201. For example, the block storage unit 302 implements the LBA method and performs various operations as if writing/reading of a value were being performed with respect to a single memory area. The block storage unit 302 includes a converting unit 311, a storage I/F 312, and a memory unit 313.
The converting unit 311 corresponds at least to some portion of the converting unit 22 illustrated in
As described earlier, the converting unit 311 combines address information and version information included in the user request 10, and generates a key to be included in a KVS command. Herein, the converting unit 311 manages a counter table 321 stored in the memory unit 313, and sets the version information corresponding to the address information based on the counter table 321.
The counter table 321 indicates the correspondence relationship between address information 322 and counter values 323. Herein, a counter value 323 is assigned to each set of address information 322. Every time a writing instruction (the user request 10) is issued in which the position indicated by the address information 322 is the writing position, the converting unit 311 increments the counter value 323 corresponding to that address information 322.
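The behavior of the counter table 321 can be illustrated with a minimal sketch. The class name `CounterTable`, its method names, and the choice of starting unseen addresses at a counter value of 0 are assumptions made for illustration and are not taken from the embodiment.

```python
from collections import defaultdict


class CounterTable:
    """Hypothetical stand-in for the counter table 321 (address -> counter value)."""

    def __init__(self):
        # Assumption: an address that has never been written starts at 0.
        self._counters = defaultdict(int)

    def next_write_key(self, address):
        # A writing instruction increments the counter before the key is built.
        self._counters[address] += 1
        return (address, self._counters[address])

    def read_key(self, address):
        # A reading instruction reuses the current counter without updating it.
        return (address, self._counters[address])


table = CounterTable()
first_write = table.next_write_key(0)   # key for the first write to address 0
read_after = table.read_key(0)          # key for a read issued after that write
second_write = table.next_write_key(0)  # key for the second write to address 0
print(first_write, read_after, second_write)
```

In this sketch, a key is a tuple of the address information and the counter value, so that each write to the same address produces a distinct key while a read targets the most recently issued write.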
The storage I/F 312 corresponds to at least some portion of the control unit 23 illustrated in
The storage I/F 312 manages a master monitoring table 331 and a request monitoring table 341 that are stored in the memory unit 313.
The master monitoring table 331 indicates whether each of a plurality of (for example, three) storage devices 211 to 213 included in the database system 201 is a master storage device or a slave storage device. The master monitoring table 331 illustrated in
The storage I/F 312 transmits the KVS request 20 for a writing operation (for example, the KVS request 20 including the SET command) to the master storage device and the slave storage devices; but transmits the KVS request 20 for a reading operation (for example, the KVS request 20 including the GET command) only to the master storage device. A normal reading operation is performed using only the master storage device, and a writing operation is performed with respect to the master storage device as well as the slave storage devices so as to achieve redundancy. As a result, redundancy can be achieved without having to perform the operation of copying the snapshot of the host in the slave storage devices in the database system 201.
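The routing rule described above can be sketched as follows. The function `route`, the dictionary `roles`, and the device names are hypothetical; the dictionary is only loosely modeled on the master monitoring table 331.

```python
def route(command, roles):
    """Return the storage devices that should receive a KVS command.

    `roles` maps a device name to "master" or "slave" (illustrative names only).
    """
    master = [dev for dev, role in roles.items() if role == "master"]
    slaves = [dev for dev, role in roles.items() if role == "slave"]
    if command == "SET":
        # Writing operations go to the master and to every slave for redundancy.
        return master + slaves
    if command == "GET":
        # Normal reading operations use only the master.
        return master
    raise ValueError(f"unsupported command: {command}")


roles = {"storage_211": "master", "storage_212": "slave", "storage_213": "slave"}
set_targets = route("SET", roles)
get_targets = route("GET", roles)
print(set_targets, get_targets)
```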
The request monitoring table 341 is a table listing the KVS requests 20 that have been issued for instructing a writing operation but that are determined to be unimplemented. In this example, firstly, the storage I/F 312 adds, in the request monitoring table 341, the KVS request 20 that has been issued by the storage I/F 312 to the master storage device of the database system 201. Thereafter, when a response indicating that the writing is performed normally is received from the master storage device, the storage I/F 312 deletes the concerned KVS request 20 from the request monitoring table 341. Thus, the KVS requests 20 that remain listed in the request monitoring table 341 are the KVS requests 20 for which the implementation of the requested operation is not confirmed.
Regarding the KVS requests 20 listed in the request monitoring table 341, the storage I/F 312 retransmits those KVS requests based on a predetermined condition. Examples of the predetermined condition include the case in which there is a change in the master storage device and the case in which identical KVS requests 20 are listed in the request monitoring table 341 for a predetermined period of time or beyond. The version information included in the retransmitted KVS requests 20 matches with the version information that was included in the key of the identical KVS request transmitted earlier. As a result, the KVS requests 20 that are retransmitted are prevented from being treated as new KVS requests 20, thereby making it possible to maintain consistency in the writing operations.
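The handling of the request monitoring table 341 can be sketched as below. The class `RequestMonitor`, its method names, and the tuple representation of a KVS request are assumptions for illustration; the retransmission trigger (a change of master or a timeout) is left to the caller.

```python
class RequestMonitor:
    """Hypothetical model of the request monitoring table 341."""

    def __init__(self):
        # Requests sent to the master whose implementation is not yet confirmed.
        self._pending = []

    def sent(self, request):
        self._pending.append(request)

    def confirmed(self, request):
        # A normal response from the master removes the request from the table.
        self._pending.remove(request)

    def retransmit(self, send):
        # Each unconfirmed request is resent unchanged, so its key (and thus
        # its version information) is identical to the earlier transmission.
        for request in list(self._pending):
            send(request)


monitor = RequestMonitor()
req_b = ("SET", (0, 2), "DataB")
req_c = ("SET", (0, 3), "DataC")
monitor.sent(req_b)
monitor.sent(req_c)
monitor.confirmed(req_b)

resent = []
monitor.retransmit(resent.append)
print(resent)
```

Because the retransmitted request carries the same key as before, a storage device that already holds that key simply stores the same value again rather than treating the request as a newer write.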
Explained below with reference to
The converting unit 311 converts the user request 10A into a KVS request 20A. Herein, the KVS request 20A includes the command “SET” indicating a writing instruction; a key “(0, 1)”; and the value “DataA”.
Firstly, the converting unit 311 updates the counter table 321 in a corresponding manner to the fact that “Write” is included in the user request 10A. Moreover, the converting unit 311 increments the counter value 323, which corresponds to the address information 322: “0” of the counter table 321, from “0” to “1”. Then, the converting unit 311 generates the key “(0, 1)” by combining the address information “0” and the counter value “1” based on the updated counter table 321, and generates the KVS request 20A that includes the key “(0, 1)”.
Since the KVS request 20A is issued for instructing a writing operation, the storage I/F 312 transmits the KVS request 20A to the master storage device (the first storage device 211) as well as to the slave storage devices (the second storage device 212 and the third storage device 213). Moreover, the storage I/F 312 writes the KVS request 20A, which has been transmitted to the master storage device, in the request monitoring table 341.
The response signal 30 that is received by the storage I/F 312 is transmitted to the application 301 via the converting unit 311. Thus, via the application 301, the user can recognize that the writing instruction was implemented normally.
The converting unit 311 converts the user request 10B into a KVS request 20B. Herein, the KVS request 20B includes the command “GET” indicating a reading instruction and includes the key “(0, 1)”.
Since “Write” is not included in the user request 10B, the converting unit 311 does not update the counter table 321. The converting unit 311 obtains the counter value 323: “1” corresponding to the address information 322: “0” in the counter table 321, and generates the key “(0, 1)” by combining the address information “0” and the counter value “1”. Then, the converting unit 311 generates the KVS request 20B that includes the key “(0, 1)”.
Since the KVS request 20B is issued for performing a reading operation, the storage I/F 312 transmits the KVS request 20B only to the master storage device (the first storage device 211). Moreover, the storage I/F 312 writes the KVS request 20B, which has been transmitted to the master storage device, in the request monitoring table 341.
The value 40 “DataA” that is transmitted to the storage I/F 312 is also transmitted to the application 301 via the converting unit 311. Thus, via the application 301, the user can obtain the value 40 “DataA”.
The converting unit 311 converts the user requests 10C into KVS requests 20C “SET/(0, 2)/DataB”, “GET/(0, 2)”, and “SET/(0, 3)/DataC”.
Firstly, according to the fact that “Write” is included in the initial user request 10C “Write/0/DataB”, the converting unit 311 updates the counter table 321, and increments the counter value 323: “1” corresponding to the address information 322: “0” (i.e., the state illustrated in
Subsequently, the converting unit 311 generates the KVS request 20C “GET/(0, 2)” corresponding to the second user request 10C “Read/0”. At that time, since “Write” is not included in the user request 10C “Read/0”, the converting unit 311 does not update the counter table 321. Then, the converting unit 311 generates the key “(0, 2)” by combining the address information “0” and the counter value “2” based on the current counter table 321, and generates the KVS request 20C “GET/(0, 2)”.
Subsequently, the converting unit 311 generates the KVS request 20C “SET/(0, 3)/DataC” corresponding to the third user request 10C “Write/0/DataC”. At that time, according to the fact that “Write” is included in the third user request 10C “Write/0/DataC”, the converting unit 311 updates the counter table 321 and increments the counter value 323 corresponding to the address information 322: “0” from “2” to “3”. Then, the converting unit 311 generates a key “(0, 3)” by combining the address information “0” and the counter value “3” based on the updated counter table 321, and generates the KVS request 20C “SET/(0, 3)/DataC” corresponding to the user request 10C “Write/0/DataC”.
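The conversion of the three user requests 10C can be reproduced with a short sketch. The function `convert` and the slash-separated string format are assumptions chosen to mirror the notation used in this description; the initial counter state of “1” for the address information “0” follows the preceding example.

```python
def convert(user_requests, counters):
    """Convert "Write/addr/value" and "Read/addr" strings into KVS request strings.

    `counters` maps an address to its current counter value and is updated in place.
    """
    kvs_requests = []
    for request in user_requests:
        parts = request.split("/", 2)
        kind, address = parts[0], int(parts[1])
        if kind == "Write":
            # A writing instruction increments the counter before the key is built.
            counters[address] = counters.get(address, 0) + 1
            kvs_requests.append(f"SET/({address}, {counters[address]})/{parts[2]}")
        elif kind == "Read":
            # A reading instruction uses the current counter as it is.
            kvs_requests.append(f"GET/({address}, {counters.get(address, 0)})")
        else:
            raise ValueError(f"unsupported instruction: {kind}")
    return kvs_requests


counters = {0: 1}  # state after the first write to address 0
converted = convert(["Write/0/DataB", "Read/0", "Write/0/DataC"], counters)
print(converted)
```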
The storage I/F 312 transmits all KVS requests 20C “SET/(0, 2)/DataB”, “GET/(0, 2)”, and “SET/(0, 3)/DataC” to the first storage device 211 representing the master storage device. Moreover, the storage I/F 312 transmits the KVS requests 20C “SET/(0, 2)/DataB” and “SET/(0, 3)/DataC”, which are requests for a writing operation, to the second storage device 212 and the third storage device 213 representing the slave storage devices. Furthermore, the storage I/F 312 writes, in the request monitoring table 341, the KVS requests 20C “SET/(0, 2)/DataB”, “GET/(0, 2)”, and “SET/(0, 3)/DataC” that have been transmitted to the master storage device.
The storage I/F 312 updates the master monitoring table 331 in response to the change in the master storage device. Then, the storage I/F 312 retransmits the KVS requests 20C, which are still listed in the request monitoring table 341, to the second storage device 212 representing the new master storage device.
As described above, according to the first embodiment, regarding the retransmitted KVS request 20C “GET/(0, 2)” for a reading operation, the value 40 “DataB” represents the response. The result of that response matches with the result of the response that would have been obtained in the pre-failure master storage device (the first storage device 211 representing the master storage device). That is because the KVS request 20C “GET/(0, 2)” issued before the failure (see
In the examples illustrated in
In contrast, in the first embodiment, the key of the KVS request 20 includes the version information that is updated every time a writing instruction is issued. Hence, even if the KVS request 20 is retransmitted, the writing operation can be appropriately performed without getting affected by the writing operation performed at a later instance. Moreover, since a plurality of writing operations or reading operations can be performed in a concurrent manner, there is no decline in the throughput. Thus, according to the first embodiment, in a KVS-type database, the reliability can be enhanced without causing a decline in the throughput.
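The effect on a retransmitted reading request can be illustrated with a small sketch that models a slave storage device as a dictionary. The unversioned variant, in which the address alone serves as the key, is an assumption introduced here only for contrast.

```python
# State of a slave storage device after it has received both SET requests
# of the example, with keys that include version information.
versioned_slave = {}
versioned_slave[(0, 2)] = "DataB"
versioned_slave[(0, 3)] = "DataC"

# Contrast case (assumption): the address alone is the key, so the later
# write overwrites the earlier one.
plain_slave = {}
plain_slave[0] = "DataB"
plain_slave[0] = "DataC"

# The slave becomes the new master and the pending read is retransmitted.
versioned_result = versioned_slave[(0, 2)]  # same answer as before the failure
plain_result = plain_slave[0]               # affected by the later write
print(versioned_result, plain_result)
```

With versioned keys, the retransmitted “GET/(0, 2)” still returns “DataB” even though “DataC” has already been written for the same address information.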
A second embodiment is described below with reference to the accompanying drawings. The identical constituent elements to those explained in the first embodiment are referred to by the same reference numerals, and the explanation is sometimes not repeated.
The deleting unit 615 performs operations for deleting the data corresponding to the past sets of version information from the data stored in the storage devices 211 to 213. The deleting unit 615 transmits a request signal 50 to the converting unit 311 at predetermined timings (for example, at regular intervals). The request signal 50 is a signal for requesting the converting unit 311 to transmit the current counter value 323 corresponding to each set of address information 322. Upon receiving the request signal 50, the converting unit 311 generates counter information 60 that contains the correspondence relationship between the sets of address information 322 and the current counter values 323 based on the counter table 321, and transmits counter information 60 to the deleting unit 615. Upon receiving the counter information 60, the deleting unit 615 issues, to the database system 201, a deletion request 70 for instructing an operation of deleting the data containing the keys having the past version information (counter values). The deleting unit 615 updates the data deletion monitoring table 621 according to the data deletion. Herein, the data deletion monitoring table 621 is a table indicating the correspondence relationship between the sets of address information 322 and deletion counter values 623 indicating the version information of the deleted sets of data.
Based on the counter information 60, the deleting unit 615 generates deletion requests 70 “DELETE/(0, 1)” and “DELETE/(0, 2)”, and transmits them to the database system 201. The deletion request 70 “DELETE/(0, 1)” is a KVS command for deleting the data containing the key (0, 1); while the deletion request 70 “DELETE/(0, 2)” is a KVS command for deleting the data containing the key (0, 2). In this example, if A represents the already-deleted deletion counter value 623 and if B represents the current counter value 323, then the deleting unit 615 deletes the data corresponding to the counter values from (A+1) to (B−1). In this example, the already-deleted deletion counter value 623 (A) corresponding to the address information 322: “0” is “0” (see
As described above, according to the second embodiment, in addition to achieving the effect according to the first embodiment, the data corresponding to the past versions can be deleted from the storage devices 211 to 213. That makes it possible to hold down the consumption of the memory capacity of the storage devices 211 to 213.
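The generation of the deletion requests 70 can be sketched as follows. The function `build_deletion_requests` and its arguments are hypothetical. The sketch assumes that versions greater than the already-deleted deletion counter value and smaller than the current counter value are deleted, which is consistent with the example of “DELETE/(0, 1)” and “DELETE/(0, 2)”; the manner of updating the deletion table is likewise an assumption.

```python
def build_deletion_requests(counter_info, deleted):
    """Build DELETE commands for past versions and update `deleted` in place.

    counter_info: address -> current counter value
    deleted:      address -> highest already-deleted counter value
    """
    requests = []
    for address, current in sorted(counter_info.items()):
        last_deleted = deleted.get(address, 0)
        # The current version is kept; only older, not yet deleted versions go.
        for version in range(last_deleted + 1, current):
            requests.append(f"DELETE/({address}, {version})")
        if current - 1 > last_deleted:
            deleted[address] = current - 1  # assumption: record the newest deleted version
    return requests


deleted = {0: 0}
first_pass = build_deletion_requests({0: 3}, deleted)
second_pass = build_deletion_requests({0: 3}, deleted)
print(first_pass, second_pass, deleted)
```

Running the function a second time with an unchanged counter value yields no further deletion requests, since the past versions have already been recorded as deleted.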
Meanwhile, the computer programs for implementing the various functions of the data management systems 1 and 501 (the data management device 101) can be recorded as installable or executable files in a computer-readable recording medium such as a compact disk read only memory (CD-ROM), a flexible disk (FD), a compact disk recordable (CD-R), and a digital versatile disk (DVD). Alternatively, the computer programs can be downloaded from a predetermined memory device connected to a network into a predetermined information processing device. Still alternatively, the computer programs can be stored in advance in a ROM of a predetermined information processing device. Herein, the computer programs can be made of a plurality of modules for implementing the various functions.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Number | Date | Country | Kind |
---|---|---|---|
2016-057003 | Mar 2016 | JP | national |