The present invention relates to a database management method, and in particular relates to a method of managing a database in which a plurality of master nodes are hierarchically connected by a network.
The present applicant took particular note of the technology of managing a database using a transaction log, and proposed a data synchronization method based on a one-to-one master/slave system in Japanese Patent Application Publication No. 2006-293910, and subsequently proposed a data synchronization method based on a one-to-N (wherein N is a positive integer) master/slave system in PCT/JP2010/054311 (hereinafter referred to as the “Earlier Unpublished Application”).
Here, a transaction log refers to the technology of realizing high-speed operations while maintaining the permanence of changes by recording the history of changes made to the database in an area that is different from the recording area of the database.
In particular, Japanese Patent Application Publication No. 2006-293910 focused on realizing a replication system by using a transaction log, and, with the Earlier Unpublished Application, a slave node that received a search command from a client sends a request message to a master node, the slave node requests the master node to send a transaction log up to the latest version of the update of the master database when the slave node does not receive a return message from the master node within a predetermined time, the master node that received the foregoing request sends the requested transaction log to the slave node, and the slave node refers to the received log and updates its own replica database.
Meanwhile, the Earlier Unpublished Application was based on the premise of being a network configuration in which its node configuration comprises a plurality of mirror nodes to one master node.
Thus, since the mirror node would never independently execute data update instructions (INSERT, UPDATE, DELETE), the slave node only had to update its own database by referring to the transaction log from the master node.
Meanwhile, pursuant to the diversity and complexity of databases, a so-called multi master system comprising a plurality of master nodes is attracting attention. With respect to this point, in a network configuration where the master nodes have symmetric properties (when the master nodes are of a parallel structure), proposed is a theory of prioritizing the update information among the master nodes, and synchronizing the master nodes. However, the procedures for synchronizing all nodes are complex, and the solution during contention was not realistic.
The present inventors devised the present invention in light of the foregoing circumstances. Thus, the technical object of this invention is to realize a database management method capable of hierarchically building the master nodes, and using such hierarchical structure to reliably and efficiently update the nodes.
The present invention adopted the following measures to achieve the foregoing object.
According to an aspect of the present invention is a method of managing a database which is a recordable database hierarchically including higher-level and lower-level master nodes in which records can be updated, comprising: a step of a database processing unit of any one of lower-level master nodes generating and sending, to a higher-level master node, a write set expanded in its own memory including a shadow copy of a database and a heaptuple map, when an instruction for updating the database is issued in the lower-level master node; a step of the higher-level master node comparing the heaptuple map in the write set received from the lower-level master node and its own database, and verifying whether a corresponding line of the database registered as a target has been updated by a different write set; a step of aborting the write set when the update has been performed, updating its own database using the shadow copy when the update has not been performed, and generating a record of this update as a transaction log; a step of distributing the transaction log to the lower-level master nodes including the lower-level master node which has sent the write set; and a step of a transaction log processing unit of the lower-level master node updating its own database based on the received transaction log.
According to another aspect of the present invention is the method of managing a database, wherein the shadow copy of the write set in the lower-level master node is configured only from newly added lines.
According to another aspect of the present invention is the method of managing a database, wherein, in a stage where the write set is being generated in the lower-level master node, if search processing is performed on a master database of that lower-level master node, the method executes: a step of the database processing unit of the lower-level master node referring to the master database; and a step of referring to the heaptuple map one line at a time to determine whether the corresponding line number has been entered in the search processing, setting the master database as a direct search target when the corresponding line number has not been entered, and, when the corresponding line number has been entered, determining whether the entry is a deletion command or an update command, and removing the corresponding line number from the search target when the entry is the deletion command, and setting the entry of the shadow copy in the heaptuple map as the search target when the entry is the update command.
According to another aspect of the present invention is the method of managing a database, wherein, in a stage where the write set is being generated in the lower-level master node, if search processing is performed on a master database of that lower-level master node, the method executes: a step of the database processing unit of the lower-level master node referring to the master database; a step of referring to the whole heaptuple map, extracting entries of line numbers to be the search target, and removing the all entered line numbers from the search target on the assumption that the line numbers have been deleted; and a step of referring to the shadow copy, referring to line numbers of entries that have been added in the shadow copy, and setting only those line numbers as the search target.
According to another aspect of the present invention is the method of managing a database, wherein, when the write set is generated in the lower-level master node, an area of a backend memory (BEM) in the lower-level master node for registering the write set can be referred to at least by the transaction log processing unit which executes an update instruction based on the transaction log distributed from the higher-level master node, and
when the transaction log processing unit refers to the backend memory (BEM) and if a corresponding line to be updated based on the update instruction is contained in the heaptuple map (HTM) of the write set, the transaction log processing unit aborts the transaction that is generating that heaptuple map (HTM).
According to another aspect of the present invention is a database management method of a higher-level master node in a recordable database hierarchically including master nodes, comprising: a step of a higher-level master node receiving a write set generated by a database processing unit of any one of lower-level master nodes from a shadow copy of a database and a heaptuple map expanded in its own memory, when an instruction for updating the database is issued in the lower-level master node; a step of the higher-level master node comparing the heaptuple map in the received write set and its own database, and verifying whether a corresponding line of the database registered as a target has been updated by a different write set; and a step of aborting the write set when the update has been performed, and updating its own database using the shadow copy when the update has not been performed.
According to another aspect of the present invention is the database management method of a higher-level master node, wherein the higher-level master node receives, as the write set, the shadow copy which is configured only from newly added lines in the lower-level master node.
According to another aspect of the present invention is a database management method of a lower-level master node in a recordable database hierarchically including higher-level and lower-level master nodes in which records can be updated, comprising: a step of a database processing unit of any one of lower-level master nodes generating, for a higher-level master node, a write set expanded in its own memory including a shadow copy of a database and a heaptuple map, when an instruction for updating the database is issued in the lower-level master node, a step of sending the write set to the higher-level master node for causing the higher-level master node to compare the heaptuple map in the write set received from the lower-level master node and its own database, to verify whether a corresponding line of the database registered as a target has been updated by a different write set, to abort the write set when the update has been performed, to update the database of the higher-level master node using the shadow copy when the update has not been performed, to generate a record of this update as a transaction log, and to distribute the transaction log to the lower-level master nodes including the lower-level master node which has sent the write set; and a step of a transaction log processing unit of the lower-level master node updating, upon receiving the transaction log, its own database based on the received transaction log.
According to another aspect of the present invention is the database management method of a lower-level master node, wherein the shadow copy of the write set in the lower-level master node is configured only from newly added lines.
According to another aspect of the present invention is the method of managing a database, wherein, in a stage where the write set is being generated in the lower-level master node, if search processing is performed on a master database of that lower-level master node, the method executes: a step of the database processing unit of the lower-level master node referring to the master database; and a step of referring to the heaptuple map one line at a time to determine whether the corresponding line number has been entered in the search processing, setting the master database as a direct search target when the corresponding line number has not been entered, and, when the corresponding line number has been entered, determining whether the entry is a deletion command or an update command, and removing the corresponding line number from the search target when the entry is the deletion command, and setting the entry of the shadow copy in the heaptuple map as the search target when the entry is the update command.
According to another aspect of the present invention is the method of managing a database, wherein, in a stage where the write set is being generated in the lower-level master node, if search processing is performed on a master database of that lower-level master node, the method executes: a step of the database processing unit of the lower-level master node referring to the master database; a step of referring to the whole heaptuple map, extracting entries of line numbers to be the search target, and removing the all entered line numbers from the search target on the assumption that the line numbers have been deleted; and a step of referring to the shadow copy, referring to line numbers of entries that have been added in the shadow copy, and setting only those line numbers as the search target.
According to another aspect of the present invention is the method of managing a database, wherein, when the write set is generated in the lower-level master node, an area of a backend memory (BEM) in the lower-level master node for registering the write set can be referred to at least by the transaction log processing unit which executes an update instruction based on the transaction log distributed from the higher-level master node, and
when the transaction log processing unit refers to the backend memory (BEM) and if a corresponding line to be updated based on the update instruction is contained in the heaptuple map (HTM) of the write set, the transaction log processing unit aborts the transaction that is generating that heaptuple map (HTM).
According to the present invention, multi master nodes are hierarchically built, a shadow copy and a heap tuple map expanded in its own memory are sent as a write set from a lower-level master node to a higher-level master node, the higher-level master node that received the foregoing write set verifies whether the relevant row has been previously updated by a different write set, and performs database update processing by using the shadow copy and heap tuple map if the relevant row has not been updated. In addition, as a result of the higher-level master node sending the record of this update as a transaction log to the lower-level master node, the database can be updated efficiently and consistently from the lower-level master node to the higher-level master node, and from the higher-level master node to the lower-level master node under its control.
The present invention is now explained with reference to the appended drawings.
While the foregoing Earlier Unpublished Application only needs to copy (replicate) the transaction log data of the master node to the lower-level nodes, this embodiment is unique in that attention was focused on the fact that, with a configuration of hierarchical multi master nodes, when an update instruction is also being executed in the lower-level master nodes, the consistency of all lower-level nodes cannot be maintained only by referring to the transaction log from the higher-level node. This is explained below.
As shown in
Here, the database processing unit lib sends the write set, which was generated in the backend memory (BEM), to the higher-level master via a communication module (11d) without directly writing data into a master database (11a) while referring to the master database (11a).
This kind of processing is performed in the higher-level master node (MS101), as well as in the lower-level master nodes (MS201, MS202, . . . , MS20n and MS301, MS302, . . . , MS30n).
The large-scale hard drive (HD) stores an operating system (OS) as well as application programs (APL), and the functions as the foregoing master node are realized by the central processing unit (CPU) reading and sequentially executing the programs via the bus (BUS) and the main memory (MM). Note that, although the explanation is omitted, the lower-level master nodes (MS201, MS202, . . . , MS20n and MS301, MS302, . . . , MS30n) are also configured the same.
The processing performed by the database processing unit (11b) explained with reference to
In the lower-level master node (MS201), when this kind of update instruction is given to the master database based on instructions from the client terminal (CL), as described above, a write set configured from a heap tuple map (HTM, heap file) and a shadow copy (SC) is generated in the backend memory (BEM).
The original row number (ctid) and the row number (sctid) of the new row are associated and registered in the heap tuple map (HTM). The heap tuple map (HTM) is additionally generated each time the database is updated. Note that, since the row number to which the instruction (sc1) of row number 5 is to be written is still unknown at this stage, a new instruction (sc1) is written in sctid.
Meanwhile, the shadow copy of the row number to be rewritten by referring to the master database (11a) is generated in the shadow copy (SC). Here, since the row number to be newly added is still unknown at this stage, a new instruction (sc1) is written in the row number.
Note that, at this stage, since the database processing unit (11b) of the lower-level master node (MS201) knows that row number 4 to which the DELETE instruction is applied and old row number 5 to which the UPDATE instruction is applied will be deleted based on the generation of the heap tuple map (HTM), it is also possible to only write the new instruction (sc1) as the shadow copy (SC).
The write set generated as described above is sent from the lower-level master node (MS201) to the higher-level master node (MS101).
In the higher-level master node (MS101), when the database processing unit lib (central processing unit (CPU)) receives the write set from the lower-level master node (MS201), it activates the transaction log processing unit (11c) pursuant to the update instruction and starts the generation of transaction log data. In addition, the database processing unit lib reads the heap tuple map (HTM) from the received write set, and compares it with its own master database (11a). Here, whether the contents of the target tuples (row numbers 4, 5 and 7 in this case) have been updated in the master database (11a) is verified. In
Meanwhile, upon comparing the heap tuple map (HTM) in the write set from the lower-level master node (M201) and its own database, if the relevant row has already been updated in higher-level master node (M201) based on a different write set, the processing based on that write set is aborted (stopped).
According to
This transaction log data is distributed from the communication module (11d) to the lower-level master node (MS201) which has sent the write set, as well as to all lower-level master nodes (MS202, . . . , MS20n and MS301, MS302, . . . , MS30n).
The lower-level master nodes that received the transaction log data copy (replicate) such transaction log data in their own database.
Specifically, when a lower-level master node (M202, for instance) receives the transaction log data shown in
Accordingly, in the lower-level master nodes, the databases are uniformly managed by replicating the transaction log data sent from the higher-level master node.
The present invention was explained above based on an embodiment, but the present invention is not limited thereto. Modified examples of the present invention are explained below.
(Case Where the Database is Updated in the Higher-Level Master Node MS101)
With respect to the processing in a case where an update instruction of a master database is issued in a lower-level master node (MS201, for instance), as explained in
(When Search Processing is Executed at the Stage Where the Lower-Level Master Node is Generating a Write Set)
In the lower-level master node (MS201, for example), when a search is executed to the master database of that lower-level master node at the stage when a write set is being generated as shown in
The following two measures may be considered in the foregoing case.
The first type is where the database processing unit (11b) refers to the master database (11a), and thereafter refers to the heap tuple map (HTM). In addition, whether the search target row number has been entered in the heap tuple map (HTM) is checked. If there is such an entry, whether that entry is a deletion or update is determined, and, if the entry is an update, the shadow copy (SC) is referred to and the entry (sc1) of that shadow copy (SC) is set as the search target. For example, in the case shown in
Meanwhile, if the search target is row number 4 in the example shown in
Meanwhile, in the example shown in
Here, the database processing unit (11b) may refer to the shadow copy (SC) and set the entry (sc1) which rewrote the row number 5 as the search target.
The second type is here the database processing unit (11b) foremost refers to the master database (11a), and thereafter refers to the overall heap tuple map (HTM). Here, whether the row to be searched has been entered is checked, and it is assumed that all entered row numbers (row numbers 4 and 5 in this case) have been deleted (to be excluded from the search target). Subsequently, the database processing unit 11b refers to the shadow copy (SC), and refers to the entry (sc1) that was added in the shadow copy and sets this as the search target.
(When Contention Occurs While Update of the Lower-Level Master is Being Performed with the Transaction Log Data from the Higher-Level Master)
When an instruction for updating the database of the lower-level master is being executed to the corresponding row while replication is being performed on the database of the lower-level master node based on the transaction log data distributed from the higher-level master, contention will occur.
Specifically, a corresponding case would be when transaction log data including an entry of row number 5 is distributed from the higher-level master while row numbers 4 and 5 are being updated based on the update instruction to the lower-level master node.
In the foregoing case, even if the write set created in the lower-level master node is sent to the higher-level master node, since the transaction log data relating to the corresponding row has already been distributed to the higher-level master node, the write set is aborted since contention with the higher-level master node is detected. Accordingly, contention in that lower-level master node may be ignored.
Meanwhile, as another method of resolving the contention in the lower-level master nodes, when a write set (heap tuple map (HTM) and shadow copy (SC)) is created in a lower-level master node, these may be registered in a backend memory (BEM) of that lower-level master node so that these areas can be referred to from two or more processes (specifically, the replication process and the write set generation process). In other words, in the lower-level master node, desirably the write set, and more specifically only the heap tuple map (HTM) is stored in the shared memory.
In the foregoing case, as a result of referring to the write set in the backend memory (BEM) upon the lower-level master node executing the replication of the master database 11a, the write set that contradicts the update instruction can be aborted at the stage of the lower-level master node. Specifically, the transaction log processing unit 11c refers to the backend memory (BEM), and aborts the transaction that is generating the heap tuple map (HTM) when the corresponding row to be updated based on the update instruction is contained in that heap tuple map (HTM).
Accordingly, by storing the heap tuple map (HTM) in a shared memory and enabling the referral thereof from a plurality of processes, contention can be prevented at the stage of the lower-level master node even in a multi master system database. In addition, since only the heap tuple map (HTM) needs to be stored in the shared memory, it will not occupy the valuable shared memory.
The present invention can be applied to a data management system of a multi master node structure comprising a hierarchical structure.
Number | Date | Country | Kind |
---|---|---|---|
2010-239713 | Oct 2010 | JP | national |
This application is a Continuation Application of U.S. Ser. No. 13/513,773, filed Jun. 4, 2012, which is a U.S. National Stage filing under 35 U.S.C. 371 of PCT International Application No. PCT/JP2011/068057, filed on Aug. 8, 2011. The entire contents of each of the above documents in hereby incorporated by reference into the present application.
Number | Name | Date | Kind |
---|---|---|---|
20120011098 | Yamada | Jan 2012 | A1 |
20120297487 | Xia | Nov 2012 | A1 |
20130013556 | Yamada | Jan 2013 | A1 |
20130179416 | Yamada | Jul 2013 | A1 |
Number | Date | Country |
---|---|---|
2001-319012 | Nov 2001 | JP |
2002-149459 | May 2002 | JP |
2006-293910 | Oct 2006 | JP |
2007-264685 | Oct 2007 | JP |
2010106991 | Sep 2010 | WO |
Entry |
---|
International Search Report of PCT/JP2011/068057, mailing date of Aug. 30, 2011. |
U.S. Non-Final Office Action dated Sep. 18, 2013, issued in related U.S. Appl. No. 13/784,244 (11 pages). |
U.S. Non-Final Office Action dated Sep. 18, 2013, issued in related U.S. Appl. No. 13/784,254 (11 pages). |
U.S. Non-Final Office Action dated Jan. 31, 2014, issued in related U.S. Appl. No. 13/784,203 (14 pages). |
U.S. Notice of Allowance dated Feb. 19, 2014, issued in related U.S. Appl. No. 13/784,254 (13 pages). |
U.S. Notice of Allowance dated Feb. 5, 2014, issued in related U.S. Appl. No. 13/784,244 (12 pages). |
U.S. Non-Final Office Action dated Apr. 11, 2014, issued in related U.S. Appl. No. 13/784,288 (23 pages). |
Number | Date | Country | |
---|---|---|---|
20130179417 A1 | Jul 2013 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 13513773 | US | |
Child | 13784272 | US |