The embodiments discussed herein are related to an information processing apparatus, a data management method, and a program.
The method of associating keys with data and saving them in one or more storage devices is known as one of data management methods in information processing systems. For example, software, such as application software, designates a key and performs data manipulation, such as data reading or data writing. It is assumed that data is distributed and saved in a plurality of storage devices. For example, a key and a hash function are used for determining a save destination storage device of data. In addition, a key and a hash function are used for searching for a storage device in which desired data is saved.
Furthermore, the method of dividing a set of keys into plural segments and managing data on a segment basis is known as a method for efficiently managing data. For example, the method of determining a save destination storage device of data on a segment basis is proposed. A method using a binary tree referred to as, for example, a PHT (Prefix Hash Tree) is proposed as the method of dividing a set of keys into plural segments and assigning the keys to the segments.
A binary number whose length corresponds to depth from a root node is given to each node of the PHT as a label. The root node has node 0 and node 1 as child nodes. Node 0 has node 00 and node 01 as child nodes. A key is represented by a binary number and its prefix (portion of the key corresponding to several bits from the head) and a label are compared. By doing so, each key is assigned to a leaf node (node having no child node). Keys and data are managed on a segment basis, where each segment corresponds to a leaf node. The depth of the PHT (degree of segment division) is determined according to, for example, the amount of data to be managed.
Please see, for example, Yatin Chawathe, Sriram Ramabhadran, Silvia Ratnasamy, Anthony LaMarca, Scott Shenker and Joseph Hellerstein, “A Case Study in Building Layered DHT Applications”, Proceedings of the ACM SIGCOMM 2005 conference on Applications, technologies, architectures and protocols for computer communications, Aug. 22-26, 2005.
By the way, as an information processing system is operated, data is written to storage devices. As a result, the amount of data to be managed may increase. Accordingly, when data is managed on a segment basis, a segment in which the amount of data is large may be dynamically divided further into plural segments. In order to distribute the load on the information processing system, data which belongs to at least one of the plural segments obtained by the division may be moved from a storage device to which the segment before the division is assigned to another storage device.
However, if data is moved from one storage device to another each time a segment is divided, then traffic for data management increases. This may impede essential data access. For example, if many of the data accesses from the outside are for adding data, there is a possibility that half of the traffic on a network in the information processing system is for moving data from one storage device to another.
According to an aspect, there is provided an information processing apparatus used in an information processing system which divides a set of keys associated with data to be stored into a plurality of segments and which manages arrangement of the data in a plurality of storage devices on a segment basis. The information processing apparatus includes: a memory which stores segment information indicative of correspondences between the segments and the keys; and a processor which divides, according to an increase in data which belongs to a segment, the segment into a plurality of segments and updates the segment information, and allows data that belongs to at least one of a plurality of second segments obtained by dividing a first segment once or by hierarchically dividing the first segment N times, at the time of the N meeting a determined condition, to be moved from a first storage device which stores data that belongs to the first segment to a second storage device and restricts, at the time of the N not meeting the determined condition, movement of data that belongs to the plurality of second segments, the N being greater than 1.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention.
Embodiments will now be described with reference to the drawings.
The storage section 11 stores segment information. The segment information indicates the relationships between keys and segments. As illustrated in
The control section 12 divides, according to an increase in data which belongs to a segment (such as an increase in the amount of data which belongs to a segment or an increase in the number of keys associated with data which belongs to a segment), the segment into a plurality of segments and updates segment information. For example, the control section 12 refers to segment information stored in the storage section 11, and divides, on the basis of the amount of data which belongs to a segment or the number of keys associated with data which belongs to a segment and a threshold, the segment into two or more segments. The control section 12 confirms the amount of the data or the number of the keys, for example, by gaining access to a storage device in which the segment is arranged.
In the example of
When a plurality of segments are generated by performing hierarchical division operations with a segment as reference, the control section 12 determines whether or not each division operation meets a determined division depth condition. If a division operation meets the determined division depth condition, then the control section 12 allows data which belongs to at least one of generated plural segments to be moved (to be transferred) from a storage device which stores data that belongs to an original segment to another storage device. Furthermore, if a division operation does not meet the determined division depth condition, then the control section 12 restricts the movement of data which belongs to generated plural segments.
For example, a division depth condition is whether or not an interval between a hierarchical level to which an original segment belongs and a hierarchical level to which generated plural segments belong is a multiple of a determined hierarchical level interval greater than or equal to two. If a hierarchical level interval is a multiple of the determined hierarchical level interval, then the division depth condition is met. If a hierarchical level interval is not a multiple of the determined hierarchical level interval, then the division depth condition is not met. Information indicative of the determined hierarchical level interval is stored in advance in, for example, the storage section 11.
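The division depth condition described above can be sketched as follows. This is a minimal illustration, not the implementation of the control section 12; the function name `transfer_allowed` and its parameters are hypothetical, and hierarchical levels are assumed to be plain integers.

```python
def transfer_allowed(origin_level: int, new_level: int, interval: int) -> bool:
    """Return True when the division depth condition is met.

    origin_level: hierarchical level of the original segment (assumed integer).
    new_level:    hierarchical level of the segments generated by division.
    interval:     determined hierarchical level interval (2 or greater).
    """
    if interval < 2:
        raise ValueError("the determined hierarchical level interval must be >= 2")
    gap = new_level - origin_level
    # The condition is met only when the gap is a positive multiple of the interval.
    return gap > 0 and gap % interval == 0
```

With an interval of 2 and an original segment at level 1, a division that reaches level 2 does not meet the condition, whereas a division that reaches level 3 does.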
For example, it is assumed that the determined hierarchical level interval is 2. As illustrated in
On the other hand, the interval between a hierarchical level corresponding to n=3 and the hierarchical level corresponding to n=1 is 2. Accordingly, the control section 12 allows data which belongs to at least one of segments sg3a, sg3b, sg3c, and sg3d at the hierarchical level corresponding to n=3 to be transferred from a storage device which stores data that belongs to the segment sg1 at the hierarchical level corresponding to n=1 to another storage device. As illustrated in
Furthermore, in
By the way, in the example of
The control section 12 may be realized as a program executed by the use of a CPU (Central Processing Unit) and a RAM.
As has been described, the control section 12 hierarchically divides a segment and generates plural segments, which are descendants of the segment. If division does not meet a determined division depth condition at this time, then the movement of data which belongs to generated segments to another storage device is restricted.
As a result, data transfer does not occur every time a segment is divided. This curbs the increase in the amount of data transferred when the load on an information processing system is distributed. Accordingly, the movement of data from one storage device to another at load distribution time does not impede, for example, access by a user to data in the information processing system.
Each of the server apparatus 100, 100a, 100b, and 100c is a server computer which exercises distributed management of data. Each of the server apparatus 100, 100a, 100b, and 100c associates keys for identifying data with the data, stores them, and holds segment information for searching on the basis of a key for a server apparatus in which data is stored. Any of the server apparatus 100, 100a, 100b, and 100c can accept from the client apparatus 300 a request to handle data. A server apparatus which accepts a request including a key searches on the basis of segment information for a server apparatus in which data to be handled is stored, and requests the server apparatus to handle the data.
The management apparatus 200 is a computer used by a user (manager of the information processing system, for example). The management apparatus 200 manages the server apparatus 100, 100a, 100b, and 100c on the basis of operation by the user. For example, when a server apparatus which stores data is added to the information processing system, the management apparatus 200 transmits information regarding the added server apparatus to the server apparatus 100, 100a, 100b, and 100c. In addition, the management apparatus 200 transmits to the server apparatus 100, 100a, 100b, and 100c setting information for adjusting the distributed arrangement of data.
The client apparatus 300 is a computer used by a user (user of cloud services, for example). For example, application software which handles data stored in the server apparatus 100, 100a, 100b, or 100c is executed on the client apparatus 300. The client apparatus 300 transmits to any of the server apparatus 100, 100a, 100b, and 100c via the networks 41 and 42 a request to handle data. One key or a range of a key is designated in a request. Data handling includes data reading and data writing.
Whichever of the server apparatus 100, 100a, 100b, and 100c receives a request from the client apparatus 300, messages are transmitted among the server apparatus 100, 100a, 100b, and 100c so that the requested data handling is performed. As has been described, the information processing system according to the second embodiment can be designed so that, as far as possible, no apparatus becomes a data handling bottleneck. Availability, response performance, and the like are improved.
(Example of Hardware of Server Apparatus)
The CPU 101 is an operation unit which controls information processing on the server apparatus 100. The CPU 101 reads out at least a part of a program or data stored in the HDD 103, expands it in the RAM 102, and executes the program. The server apparatus 100 may include a plurality of operation units to perform distributed information processing.
The RAM 102 is a volatile memory which temporarily stores a program or data that the CPU 101 handles. The server apparatus 100 may include a memory which differs from a RAM in type, or include a plurality of memories.
The HDD 103 is a nonvolatile storage unit which stores programs, such as an OS (Operating System) program and application programs, and data used for information processing. The HDD 103 reads from or writes to a built-in magnetic disk in accordance with an instruction from the CPU 101. The server apparatus 100 may include a nonvolatile storage unit (such as an SSD (Solid State Drive)) other than an HDD or include a plurality of storage units.
In accordance with an instruction from the CPU 101, the image signal processing section 104 outputs an image to a display 31 connected to the server apparatus 100. A CRT (Cathode Ray Tube) display, a liquid crystal display, or the like is used as the display 31.
The input signal processing section 105 acquires an input signal from an input device 32 connected to the server apparatus 100, and outputs it to the CPU 101. A pointing device, such as a mouse or a touch panel, a keyboard, or the like is used as the input device 32.
The disk drive 106 is a drive unit which reads a program or data recorded on a record medium 33. A magnetic disk, such as an FD (Flexible Disk) or an HDD, an optical disk, such as a CD (Compact Disc) or a DVD (Digital Versatile Disc), an MO (Magneto-Optical disk), or the like is used as the record medium 33. For example, the disk drive 106 stores a program or data which it reads from the record medium 33 in the RAM 102 or the HDD 103 in accordance with an instruction from the CPU 101.
The communication unit 107 is a communication interface which is connected to the network 41 and which performs communication. The communication unit 107 may be connected to the network 41 by wire or radio. That is to say, the communication unit 107 may be a wired communication interface or a radio communication interface.
(Example of Request to Information Processing System)
In
In
In
A process in which, for example, the server apparatus 100 determines, from a key included in a request received from the client apparatus 300, which server apparatus stores data corresponding to the key will be described later.
As has been described, each of the server apparatus 100, 100a, 100b, and 100c associates keys with data, holds them, and reads out data from or writes data to a storage device by the key. A key to be handled is designated by, for example, application software executed on the client apparatus 300. Accordingly, the complexity of data processing by the server apparatus 100, 100a, 100b, or 100c is decreased and the load on the server apparatus 100, 100a, 100b, or 100c is reduced.
If data corresponding to a key designated in a write request does not exist, then the write request means to add the data. On the other hand, if data corresponding to a key designated in a write request already exists, then the write request means to overwrite the data. For example, application software executed on the client apparatus 300 is responsible for checking whether or not data corresponding to a key exists or determining whether or not overwrite is allowed. For example, the information processing system according to the second embodiment is used for managing data (log data, for example) for which time is considered as a key.
(Example of Segment Management Tree)
With a segment management tree, key space is hierarchically divided into intervals referred to as segments. Each node of a segment management tree corresponds to a segment. A label whose length corresponds to the depth of a node is given to the node. For example, the label “0” is given to a child node on the left side of a root node and the label “1” is given to a child node on the right side of the root node. In addition, when the label “L” is given to a node, the label “L+0” is given to a child node on the left side of the node and the label “L+1” is given to a child node on the right side of the node. Each key in the key space is associated with a leaf node (which has no child node) to which a label that matches its prefix is given.
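The association of a key with a leaf node by prefix matching can be sketched as follows. This is only an illustration under stated assumptions: keys and labels are represented as strings of the characters 0 and 1, the set of leaf labels is a hypothetical example, and the function name `find_leaf` does not appear in the embodiments.

```python
def find_leaf(key_bits: str, leaves: set[str]) -> str:
    """Return the label of the leaf node whose label is a prefix of key_bits.

    leaves is assumed to hold the labels of all leaf nodes of a binary
    segment management tree, so exactly one label matches any given key.
    """
    for length in range(len(key_bits) + 1):
        prefix = key_bits[:length]
        if prefix in leaves:
            return prefix
    raise KeyError(f"no leaf label is a prefix of {key_bits!r}")


# Hypothetical leaf set: the node "0" has been divided twice, the node "1" has not.
example_leaves = {"00", "010", "011", "1"}
```

For instance, with this hypothetical leaf set the 5-bit key 01101 is assigned to the segment whose label is 011.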
For example, it is assumed that a key is represented by a 5-bit binary number. If a key designated by the client apparatus 300 is not represented by a binary number, then the key is converted to a binary number. Furthermore, as illustrated in
Data which belongs to segments corresponding to the leaf nodes of the segment management tree is stored in storage units (HDD 103 illustrated in
By determining in this way on a segment basis a server apparatus in which data is arranged, key locality is maintained and data associated with keys whose values are close to one another is arranged as much as possible in the same server apparatus. Accordingly, range-designated data processing, such as a range-designated read request, is efficiently performed. The depth (hierarchical level) of the segment management tree is dynamically adjusted according to the amount of data stored in the server apparatus 100, 100a, 100b, or 100c. As described later, a segment to which a large amount of data belongs is dynamically divided further into plural segments.
In the above description a binary tree is used as an example of a segment management tree. However, a tree in which each node has any number of branches may be used. If each node other than a leaf node has b (b is an integer greater than or equal to 2) child nodes, then a label represented as a b-ary number is given to each node. For example, if b=3, then a root node has child nodes whose labels are “0”, “1”, and “2”. In addition, the node whose label is “0” has child nodes whose labels are “00”, “01”, and “02”. To associate a key with a segment, the key is converted to a b-ary number and a prefix of the key and a label are compared.
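The b-ary labeling described above can be illustrated with a short sketch. The helper names `to_base_b` and `child_labels` are hypothetical, and b is limited to 10 or less here only so that each digit can be written as a single character.

```python
def to_base_b(value: int, b: int, width: int) -> str:
    """Convert a non-negative integer key to a fixed-width b-ary digit string."""
    if not 2 <= b <= 10:
        raise ValueError("this sketch supports 2 <= b <= 10 only")
    if value < 0:
        raise ValueError("value must be non-negative")
    digits = []
    for _ in range(width):
        value, digit = divmod(value, b)
        digits.append(str(digit))
    if value:
        raise ValueError("value does not fit in the given width")
    return "".join(reversed(digits))


def child_labels(label: str, b: int) -> list[str]:
    """Return the labels of the b child nodes of the node whose label is given."""
    return [label + str(digit) for digit in range(b)]
```

With b=3, `child_labels("", 3)` yields the labels 0, 1, and 2 for the children of the root node, and `child_labels("0", 3)` yields 00, 01, and 02.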
(Example of Method for Arranging Data in Plurality of Server Apparatus)
It is assumed that a value area (area within which a hash value falls) of a hash function is 0 to 2^n − 1 (n is a natural number), and that hash value space is formed by looping the value area. Each of the server apparatus 100, 100a, 100b, and 100c calculates a hash value corresponding thereto by applying the hash function to its identification information (address, for example). In addition, each of the server apparatus 100, 100a, 100b, and 100c calculates a hash value corresponding to a segment by applying the hash function to a label of the segment or another label obtained by converting the label by a conversion method described later. Each of the server apparatus 100, 100a, 100b, and 100c then determines a server apparatus in which data that belongs to the segment is arranged on the basis of the relative positions of a hash value of each server apparatus and the hash value of the segment on a loop. For example, a server apparatus whose hash value resides ahead of the hash value of the segment on the loop and whose hash value is the smallest is selected.
For example, it is assumed that a hash value of the server apparatus 100 is h(s1), that a hash value of the server apparatus 100a is h(s2), and that a hash value of the label “0” is h(0). If h(s2)>h(0)>h(s1), then a segment whose label is “0” is associated with the server apparatus 100a. Furthermore, it is assumed that a hash value of the server apparatus 100b is h(s3), that a hash value of the server apparatus 100c is h(s4), and that a hash value of the label “001” is h(001). If h(s4)>h(001)>h(s3), then a segment whose label is “001” is associated with the server apparatus 100c.
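The selection on the looped hash value space can be sketched as follows. This is a minimal illustration under several assumptions: SHA-1 truncated to n bits stands in for the unspecified hash function, "ahead of" is taken to mean strictly greater with wrap-around to the smallest hash value, and the names `ring_hash` and `pick_server` are hypothetical.

```python
import bisect
import hashlib


def ring_hash(text: str, n_bits: int = 32) -> int:
    """Map text into the value area 0 to 2^n - 1 (SHA-1 is an assumed stand-in)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << n_bits)


def pick_server(segment_hash: int, server_hashes: dict[str, int]) -> str:
    """Select the server whose hash value is the nearest one ahead of segment_hash."""
    ring = sorted((value, name) for name, value in server_hashes.items())
    points = [value for value, _ in ring]
    # First server hash strictly greater than the segment hash ...
    index = bisect.bisect_right(points, segment_hash)
    # ... wrapping around to the start of the loop when none is larger.
    return ring[index % len(ring)][1]
```

For example, with hypothetical hash values s1=10, s2=50, s3=120, and s4=200, a segment whose hash value is 30 is associated with s2, which corresponds to the case h(s2)>h(0)>h(s1) above.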
However, a method for arranging a segment in the server apparatus 100, 100a, 100b, or 100c is not limited to the above method. For example, the server apparatus selection method of finding a remainder by dividing a hash value calculated from a label of a segment by the number of server apparatus and arranging the segment according to the remainder may be adopted. For example, a hash value is divided by 4. If a remainder is 0, then the server apparatus 100 is selected. If a remainder is 1, then the server apparatus 100a is selected. If a remainder is 2, then the server apparatus 100b is selected. If a remainder is 3, then the server apparatus 100c is selected.
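The remainder-based alternative can be sketched in a few lines. The function name `server_for_hash` is hypothetical, and the server apparatus are identified here simply by their reference numerals.

```python
def server_for_hash(hash_value: int, servers: list[str]) -> str:
    """Select a server apparatus from the remainder of hash_value divided by the server count."""
    if not servers:
        raise ValueError("at least one server apparatus is required")
    return servers[hash_value % len(servers)]


# Order follows the example: remainder 0 -> 100, 1 -> 100a, 2 -> 100b, 3 -> 100c.
SERVERS = ["100", "100a", "100b", "100c"]
```

A hash value of 6 leaves a remainder of 2 when divided by 4, so `server_for_hash(6, SERVERS)` selects the server apparatus 100b.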
(Example of Software of Server Apparatus)
The server apparatus 100 includes a communication processing section 110, a storage destination determination section 120, an event processing section 130, and a data storage section 140. The communication processing section 110, the storage destination determination section 120, and the event processing section 130 are realized as programs executed by the use of, for example, the CPU 101 and the RAM 102 illustrated in
The communication processing section 110 includes a request receiving block 111, a transfer condition storage block 112, a data transmission block 113, and a message processing block 114.
When the request receiving block 111 receives a request from the client apparatus 300 illustrated in
The transfer condition storage block 112 stores information indicative of a transfer condition on the basis of which whether to transfer data which belongs to segments after division from a server apparatus which stores the data to another server apparatus is determined at the time of a segment division process described later. Hierarchical level intervals at which the transfer of data which belongs to segments generated by hierarchically dividing a segment in the segment management tree illustrated in
The data transmission block 113 refers to a transfer condition stored in the transfer condition storage block 112 at the time of a segment division process. If data which belongs to segments after division is stored in the data storage section 140, then the data transmission block 113 determines on the basis of the transfer condition whether to transfer data which belongs to at least one of the segments after the division to another server apparatus. Furthermore, when the data transmission block 113 determines to transfer data which belongs to at least one of the segments after the division, the data transmission block 113 asks the storage destination determination section 120 which server apparatus is the storage destination of data which belongs to each segment after the division. If a server apparatus designated by the storage destination determination section 120 is not the server apparatus 100, then the data transmission block 113 requests the data to be transmitted from the data processing block 131 and receives the data. The data transmission block 113 then transmits the data to the server apparatus determined by the storage destination determination section 120.
The message processing block 114 transmits a message to or receives a message from another server apparatus (server apparatus 100a, 100b, or 100c). When the message processing block 114 receives a read request message or a write request message from another server apparatus, the message processing block 114 requests the event processing section 130 to perform data processing. In addition, when a segment division process or the like occurs, the message processing block 114 transmits a segment update message to another server apparatus in response to a request from a segment management block 122. Furthermore, when the message processing block 114 receives a segment update message from another server apparatus, the message processing block 114 requests the segment management block 122 to update segment information.
Moreover, the message processing block 114 accepts a request from the client apparatus 300 received by the request receiving block 111. If the request is a write request, then the message processing block 114 passes a key included in the request to the storage destination determination section 120 and asks the storage destination determination section 120 which server apparatus the data is to be written to. If the storage destination server apparatus is the server apparatus 100, then the message processing block 114 stores the data in the data storage section 140 via the event processing section 130. If the storage destination server apparatus is a server apparatus other than the server apparatus 100, then the message processing block 114 transfers the request to that server apparatus. If the request is a read request, then the message processing block 114 passes a key included in the request to the storage destination determination section 120 and asks the storage destination determination section 120 which server apparatus the data resides in. If the storage destination server apparatus is the server apparatus 100, then the message processing block 114 retrieves the data from the data storage section 140 via the event processing section 130. If the storage destination server apparatus is a server apparatus other than the server apparatus 100, then the message processing block 114 transfers the request to that server apparatus.
The storage destination determination section 120 includes a segment information storage block 121 and a segment management block 122.
The segment information storage block 121 stores segment information in which a segment management tree like that illustrated in
On the basis of segment information stored in the segment information storage block 121, the segment management block 122 searches for one or more segments corresponding to a key or a key range designated in a request. The segment management block 122 then determines, by applying a hash function to labels of the found segments or to labels obtained by converting them by a method described later, one or more server apparatus in which data that belongs to the found segments is arranged. Furthermore, if the data that belongs to the found segments resides in the server apparatus 100, then the segment management block 122 requests the event processing section 130 to process the data. If the data that belongs to the found segments resides in a server apparatus other than the server apparatus 100, then the segment management block 122 requests the message processing block 114 to transmit a message to the server apparatus.
In addition, when the event processing section 130 reports segment division to the segment management block 122, the segment management block 122 updates segment information stored in the segment information storage block 121. The segment management block 122 then requests the message processing block 114 to transmit a message indicative of segment information update to the other server apparatus. Furthermore, the segment management block 122 updates the segment information stored in the segment information storage block 121 in response to a request from the message processing block 114. The segment management block 122 receives identification information for a server apparatus in which a segment may be arranged from the management apparatus 200 as setting information, and holds the identification information.
The event processing section 130 includes the data processing block 131, a calculation expression storage block 132, a threshold calculation block 133, and a division determination block 134.
The data processing block 131 processes data in response to a request from the segment management block 122 or the message processing block 114. If a read request in which a key or a key range is designated is made, then the data processing block 131 reads out from the data storage section 140 data corresponding to the key or a data group corresponding to the key range. If a write request in which a key is designated is made, then the data processing block 131 associates data with the key and writes the data to the data storage section 140.
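Reading by a key or a key range and writing by a key can be sketched with a small in-memory store. This is only an illustration, not the data storage section 140 itself: the class name `DataStore` is hypothetical, keys are assumed to be fixed-width bit strings (so that string order equals numeric order), and a key range is assumed to include both ends.

```python
import bisect


class DataStore:
    """Hypothetical in-memory stand-in for a store of key/data pairs."""

    def __init__(self) -> None:
        self._keys: list[str] = []          # kept sorted for range reads
        self._values: dict[str, str] = {}

    def write(self, key: str, data: str) -> None:
        """Add the data, or overwrite it when the key already exists."""
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = data

    def read(self, key: str) -> str:
        """Read out the data associated with a single key."""
        return self._values[key]

    def read_range(self, low: str, high: str) -> list[tuple[str, str]]:
        """Read out the data group whose keys fall within [low, high]."""
        start = bisect.bisect_left(self._keys, low)
        end = bisect.bisect_right(self._keys, high)
        return [(key, self._values[key]) for key in self._keys[start:end]]
```

Keeping the keys sorted is one possible design choice; it makes a range-designated read a pair of binary searches followed by a contiguous scan.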
Furthermore, if the data processing block 131 writes data to the data storage section 140, the data processing block 131 inquires of the division determination block 134 whether or not there is a need to divide a segment. If the division determination block 134 determines that there is a need to divide the segment, then the data processing block 131 divides the segment and reports the division of the segment to the segment management block 122. In addition, the data processing block 131 transmits to the data transmission block 113 data which belongs to one or more segments of segments after the division in response to a request from the data transmission block 113.
The calculation expression storage block 132 stores calculation expressions by which a threshold for determining whether or not there is a need to divide a segment is calculated. As described later, the threshold is calculated on the basis of a label of the segment for which whether or not there is a need of division is determined. The threshold calculation expressions are described by, for example, a user of the management apparatus 200 and are set in the calculation expression storage block 132 by the management apparatus 200. The calculation expression storage block 132 is realized as a storage area on, for example, the RAM 102 or the HDD 103 illustrated in
The threshold calculation block 133 calculates, from a label of a segment designated by the division determination block 134 and threshold calculation expressions stored in the calculation expression storage block 132, a threshold for the designated segment. The threshold calculation block 133 then returns the calculated threshold to the division determination block 134.
The division determination block 134 determines in response to an inquiry from the data processing block 131 whether or not there is a need to divide a segment to which data has been written. The division determination block 134 designates a label of the segment to which the data has been written, and requests the threshold calculation block 133 to calculate a threshold. For example, the division determination block 134 refers to segment information and finds the label of the segment from a key designated in a write request. Furthermore, the division determination block 134 refers to the data storage section 140 and finds the amount of or the number of pieces of data which belongs to the segment to which the data has been written (total amount or total number of pieces of data associated with keys which belong to the segment, for example). If the amount of or the number of pieces of the data is larger than the threshold, then the division determination block 134 determines that there is a need to divide the segment. If the amount of or the number of pieces of the data is smaller than or equal to the threshold, then the division determination block 134 determines that there is no need to divide the segment.
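The comparison performed by the division determination block 134 can be sketched as follows, using the number of pieces of data as the measure. The function name `needs_division` is hypothetical, and keys are assumed to be bit strings so that membership in a segment is a prefix test against the segment label.

```python
def needs_division(segment_label: str, stored_keys: list[str], threshold: float) -> bool:
    """Return True when the number of keys in the segment is larger than the threshold."""
    count = sum(1 for key in stored_keys if key.startswith(segment_label))
    # Division is needed only when the count strictly exceeds the threshold.
    return count > threshold
```

A count equal to the threshold does not trigger division, which matches the "smaller than or equal to" case described above.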
The division determination block 134 may periodically check the amount of or the number of pieces of data which belongs to each segment, and determine that there is a need of division of a segment for which the amount of or the number of pieces of data is larger than a threshold.
The data storage section 140 associates keys with data and stores them. The data storage section 140 may divide a storage area according to segments and store keys and data.
(Segment Division Process)
(Step S11) The division determination block 134 specifies a key of data which the data processing block 131 writes to the data storage section 140. The division determination block 134 then searches for a segment corresponding to the key on the basis of segment information stored in the segment information storage block 121.
(Step S12) The threshold calculation block 133 substitutes a label of the segment for which the division determination block 134 searches in step S11 in calculation expressions stored in the calculation expression storage block 132 to calculate a threshold for the segment. For example, the threshold calculation expressions are expressed as follows:
threshold th(L) = R × 2^(decimal part of (f × N)) (1)
where th(L) indicates a threshold applied to a segment whose label is L, R indicates a determined root threshold, and N indicates the above hierarchical level interval at which transfer is allowed.
f = value(L) / b^(length(L)) (2)
where b indicates the number of branches in a segment management tree, length(L) indicates the number of bits of a label L, and value(L) indicates the decimal value of the binary number represented by the label L.
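The threshold calculation can be sketched as follows. The sketch rests on an assumed reading of the two expressions: expression (1) is taken as R multiplied by 2 raised to the power of the decimal part of f×N, and expression (2) as value(L) divided by b raised to the power of length(L). The function names are hypothetical, and treating the empty root label as f=0 is a further assumption.

```python
import math


def label_fraction(label: str, b: int = 2) -> float:
    """f = value(L) / b^length(L); the empty (root) label is assumed to give 0."""
    if not label:
        return 0.0
    return int(label, b) / b ** len(label)


def threshold(label: str, root_threshold: float, interval: int, b: int = 2) -> float:
    """th(L) = R * 2^(decimal part of f * N), under the reading stated above."""
    f = label_fraction(label, b)
    decimal_part = math.modf(f * interval)[0]
    return root_threshold * 2 ** decimal_part
```

Under this reading the threshold always lies between R (inclusive) and 2R (exclusive). For example, with R=100 and N=2, the label 1 gives f=0.5 and a threshold of 100, while the label 01 gives f=0.25 and a threshold of about 141.4.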
(Step S13) The division determination block 134 refers to the data storage section 140 and calculates the amount of data (total amount of data associated with keys which belong to the segment, for example) or the number of pieces of data which belongs to the segment for which the division determination block 134 searches in step S11. The division determination block 134 then determines whether or not the amount of or the number of pieces of the data is larger than the threshold calculated in step S12. If the amount of or the number of pieces of the data is larger than the threshold, then step S14 is performed. If the amount of or the number of pieces of the data is smaller than or equal to the threshold, then the process ends.
(Step S14) The segment management block 122 defines b child segments obtained by dividing the segment for which the division determination block 134 searches in step S11. For example, if a label of the segment is L and b=2, then the segment management block 122 defines child segments whose labels are (L+0) and (L+1).
(Step S15) The data processing block 131 selects from the data storage section 140 one key which belongs to the segment for which the division determination block 134 searches in step S11.
(Step S16) The data processing block 131 specifies a child segment, of the child segments defined in step S14, whose label matches a prefix of the key selected in step S15. The data processing block 131 then assigns the key selected in step S15 to the specified child segment.
(Step S17) The data processing block 131 determines whether or not it has selected all keys in step S15. If all the keys are selected, then step S18 is performed. If there is a key which is not yet selected, then the process is repeated from step S15.
(Step S18) The data transmission block 113 determines whether to allow the transfer of data which belongs to one of the b child segments to another server apparatus (data transfer determination process). An example of a data transfer determination process will be described later. If the data transmission block 113 determines to allow data transfer, then step S19 is performed. If the data transmission block 113 determines not to allow data transfer, then step S20 is performed.
(Step S19) The data transmission block 113 reads out from the data storage section 140 via the data processing block 131 data which belongs to the child segment (which is associated with a key that belongs to the child segment) and which is to be arranged in another server apparatus, and transmits the data to another server apparatus. When the transmission of the data is completed, the data transmission block 113 informs the data processing block 131 that the transmission of the data is completed. When the data processing block 131 is informed by the data transmission block 113 that the transmission of the data is completed, the data processing block 131 deletes from the data storage section 140 the data which belongs to the child segment and which is arranged in another server apparatus.
(Step S20) The segment management block 122 updates the segment information stored in the segment information storage block 121 so that it reflects the segment division performed in step S14. The message processing block 114 notifies all of the other server apparatus of the segment update.
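Steps S13 through S17 can be sketched roughly as follows. This is a minimal illustration rather than the apparatus's actual implementation: the function names `should_divide` and `divide_segment` are hypothetical, keys and labels are assumed to be binary strings (b=2), and the number of pieces of data stands in for the amount of data.

```python
def should_divide(num_items: int, threshold: float) -> bool:
    """Step S13: divide only when the segment holds more than the threshold."""
    return num_items > threshold


def divide_segment(label: str, keys: list[str]) -> dict[str, list[str]]:
    """Steps S14 to S17: define two child segments and reassign every key.

    Assumes b = 2, so the children of label L are L+"0" and L+"1".
    """
    children: dict[str, list[str]] = {label + "0": [], label + "1": []}
    for key in keys:  # steps S15 and S17: visit each key exactly once
        if not key.startswith(label) or len(key) <= len(label):
            raise ValueError(f"key {key!r} does not belong to segment {label!r}")
        # Step S16: the child whose label matches the key's prefix.
        children[key[: len(label) + 1]].append(key)
    return children
```

For example, dividing the segment whose label is "0" with the keys "0001", "0110", and "0101" assigns the first key to the child "00" and the other two to the child "01".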
In the above segment division process, the division of a written segment is taken as an example. However, a segment division process may be performed according to a result obtained by periodically comparing the amount of or the number of pieces of data which belongs to each segment at the current lowest hierarchical level of a segment management tree with a threshold.
Furthermore, in the above segment division process, a threshold is calculated from a label of a written segment for which a search is made, and the calculated threshold is used in step S13. However, another method may be used. For example, the threshold calculation block 133 may calculate a threshold at a stage at which each segment has been generated, and store it in a storage block (segment information storage block 121, for example). In that case, the division determination block 134 retrieves from the storage block a threshold corresponding to a segment for which the division determination block 134 searches in step S11, and uses the threshold in step S13. In addition, the same threshold may be used for all segments.
In the example of
Root threshold R=128 is set for a segment corresponding to a leftmost node (root node (whose label is “Φ”) or a node whose label is “0”, “00”, “000”, or “0000”) at each hierarchical level (n=1 to 5). Thresholds obtained by multiplying 128 by $2^{\text{decimal part of }(f \times N)}$ are set for segments corresponding to nodes other than the leftmost nodes.
With a segment whose label is “1”, for example, f=0.5 is obtained from the above expression (2). In expression (1), f×N=0.5×4=2.0. Its decimal part is 0, so threshold th(“1”) = $128 \times 2^{0} = 128$ (MB).
Furthermore, with a segment whose label is “111”, $f = 7/2^{3} = 0.875$ is obtained from the above expression (2). In expression (1), f×N=0.875×4=3.5. Its decimal part is 0.5, so threshold th(“111”) = $128 \times 2^{0.5} \approx 181$ (MB).
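The two worked examples above can be reproduced with a short sketch of expressions (1) and (2). The function names `label_fraction` and `threshold` are hypothetical, and the defaults R=128, N=4, and b=2 are taken from this example only; exact rational arithmetic is used so that the decimal part is not disturbed by rounding.

```python
from fractions import Fraction


def label_fraction(label: str, b: int = 2) -> Fraction:
    """Expression (2): f = value(L) / b**length(L); the empty root label gives 0."""
    value = int(label, b) if label else 0
    return Fraction(value, b ** len(label))


def threshold(label: str, root_threshold: float = 128.0, n: int = 4, b: int = 2) -> float:
    """Expression (1): th(L) = R * 2**(decimal part of f*N)."""
    fn = label_fraction(label, b) * n
    decimal_part = fn - (fn.numerator // fn.denominator)  # fn is never negative
    return root_threshold * 2 ** float(decimal_part)
```

With these defaults, `threshold("1")` evaluates to 128 and `threshold("111")` to roughly 181, matching the values derived in the text.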
Calculating thresholds on the basis of expression (1) staggers the timing at which segments are divided when they belong to a hierarchical level at which a transfer may occur at the time of division. In the segment management tree illustrated in
Furthermore, if data is uniformly added to the key space and a threshold for each segment is calculated by the use of expression (1), then a segment which belongs to a lower hierarchical level is divided after the division of a segment which belongs to an upper hierarchical level is completed. For example, after the segment whose label is “1” is divided, the segment whose label is “00” is divided. After a segment whose label is “11” is divided, the segment whose label is “000” is divided. After the segment whose label is “111” is divided, the segment whose label is “0000” is divided.
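The ordering described above can be checked with a short derivation. It is a sketch that relies on the stated uniformity assumption, and the symbols d, D, and D_split are introduced here only for illustration: d is the number of bits of the label, and D is the total amount of data added to the key space.

```latex
% Under uniform addition, a segment with a d-bit label holds about D / b^d,
% so it is divided once
\frac{D}{b^{d}} > \mathrm{th}(L)
\quad\Longleftrightarrow\quad
D > D_{\mathrm{split}}(L) = \mathrm{th}(L)\, b^{d}.
% The decimal part lies in [0, 1), so expression (1) gives R <= th(L) < 2R, hence
R\, b^{d} \;\le\; D_{\mathrm{split}}(L) \;<\; 2R\, b^{d} \;\le\; R\, b^{d+1}
\qquad (b \ge 2).
```

Every segment with a d-bit label is therefore divided before any segment with a (d+1)-bit label. With R=128, N=4, and b=2, for instance, the segment “1” is divided at D=256 and “00” at D=512, while “111” is divided at about D=1448 and “0000” at D=2048.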
(Data Transfer Determination Process)
An example of step S18 (data transfer determination process) of the flow chart indicated in
(Step S21) The data transmission block 113 converts in the following way a label of a child segment defined in step S14 of the flow chart indicated in
(Label Conversion Method 1)
The data transmission block 113 refers to a hierarchical level interval N at which transfer is allowed. The hierarchical level interval N at which transfer is allowed is stored in the transfer condition storage block 112. For example, the data transmission block 113 then keeps, counting from the head of the label of a segment represented as a binary number, only the largest number of bits that is a multiple of N, and deletes the remaining bits.
For example, it is assumed that N=4. If this label conversion method is applied to the segment management tree illustrated in
Furthermore, if a number of consecutive bits of a label, counted from the LSB (Least Significant Bit) toward the MSB (Most Significant Bit) and equal to a multiple of N, are all 0, then the data transmission block 113 deletes those consecutive 0's.
For example, it is assumed that N=4. The label “11110000” is converted to the label “1111” and the label “10100000000” is converted to the label “101”.
(Label Conversion Method 2)
The data transmission block 113 applies expression (2) to a label of a segment represented as, for example, a binary number to find the value of f. The data transmission block 113 may acquire and use the value of f found in the threshold calculation process (performed in step S12 of
If a remainder (A) obtained by dividing the number of bits of the label by N matches the integral part of (f×N), then the data transmission block 113 keeps the label unchanged. If the remainder (A) does not match the integral part of (f×N), then the data transmission block 113 deletes A bits from the label, starting at the LSB and proceeding toward the MSB.
Furthermore, if a number of consecutive bits of a label, counted from the LSB toward the MSB and equal to a multiple of N, are all 0, then the data transmission block 113 deletes those consecutive 0's. This is the same as in label conversion method 1.
(Step S22) The data transmission block 113 applies one of the above two label conversion methods. After that, the data transmission block 113 informs the segment management block 122 of the label of the child segment, or of the label after the conversion, and asks the segment management block 122 in which server apparatus the data that belongs to the child segment is to be arranged.
(Step S23) The segment management block 122 applies a hash function to the label or the label after the conversion of the child segment to calculate a hash value corresponding to the child segment. As illustrated in
(Step S24) The data transmission block 113 determines whether or not the determined server apparatus is different from a server apparatus which is determined in advance by the same steps as steps S21 and S22 and in which data that belongs to the parent segment is arranged. If any of server apparatus determined for the plural child segments is different from the server apparatus in which the data that belongs to the parent segment is arranged, then step S25 is performed. If all server apparatus determined for the plural child segments are the same as the server apparatus in which the data that belongs to the parent segment is arranged, then step S26 is performed.
(Step S25) The data transmission block 113 allows data that belongs to a child segment for which a server apparatus different from the server apparatus associated with the parent segment is designated to be transferred from the server apparatus associated with the parent segment to a designated server apparatus.
(Step S26) The data transmission block 113 does not allow data that belongs to each child segment to be transferred from the server apparatus associated with the parent segment to another server apparatus.
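Steps S21 through S26 can be sketched as follows. Several parts are assumptions made only for illustration: the names `server_for` and `transfer_destinations` are hypothetical, SHA-1 stands in for the unspecified hash function, and mapping a hash value to a server apparatus by taking it modulo the number of server apparatus is a simplification of the mapping the source refers to. The label conversion is passed in as a function so that either conversion method can be used.

```python
import hashlib
from typing import Callable


def server_for(label: str, num_servers: int) -> int:
    """Step S23 (assumed mapping): hash the label and reduce it to a server index."""
    digest = hashlib.sha1(label.encode("ascii")).digest()
    return int.from_bytes(digest, "big") % num_servers


def transfer_destinations(
    parent_label: str,
    child_labels: list[str],
    num_servers: int,
    convert: Callable[[str], str],
) -> dict[str, int]:
    """Steps S21 to S26: return only the children whose server differs from the parent's."""
    parent_server = server_for(convert(parent_label), num_servers)
    moves: dict[str, int] = {}
    for child in child_labels:
        child_server = server_for(convert(child), num_servers)  # steps S21 to S23
        if child_server != parent_server:  # step S24
            moves[child] = child_server    # step S25: transfer is allowed
    return moves                           # empty dict corresponds to step S26
```

When the conversion maps the parent label and a child label to the same string, the two hash values coincide and no transfer is allowed for that child, regardless of the number of server apparatus.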
(Example of Application of Label Conversion Method 1)
In the following description it is assumed that data that belongs to a segment at a hierarchical level corresponding to n=1 is stored in the data storage section 140 of the server apparatus 100. An arrow added to a segment indicates that data which belongs to the segment is allowed to be transferred from the server apparatus 100 to another server apparatus.
It is assumed that a hierarchical level interval N at which transfer is allowed is 4. As stated above, if the above label conversion method 1 is applied, all bits of a label of each segment which belongs to a hierarchical level corresponding to n=1, 2, 3, or 4 are deleted and a hash value is calculated. As a result, the same hash value is obtained. Accordingly, the segment management block 122 informs the data transmission block 113 of the same server apparatus 100 as a transfer destination. This means that data that belongs to each segment which belongs to a hierarchical level corresponding to n=1, 2, 3, or 4 is not moved.
With a segment which belongs to a hierarchical level corresponding to n=5 and whose label is “0000”, all 0's are deleted by applying the above label conversion method 1. Accordingly, the same server apparatus 100 is designated. With the other segments which belong to the hierarchical level corresponding to n=5, the segment management block 122 designates a transfer destination server apparatus according to a label value of each segment. However, the segment management block 122 may designate the same server apparatus 100 for a segment, depending on the number of server apparatus or the like. In that case, the data transmission block 113 does not transfer data which belongs to the segment from the server apparatus 100 to another server apparatus.
It is assumed that N=4 and that label conversion method 1 is applied. When segments are divided further in the segment management tree illustrated in
As has been described, if label conversion method 1 is applied and the determined division depth (hierarchical level interval) is not satisfied, the movement of data which belongs to generated segments to another storage device is restricted. This suppresses an increase in the average amount of data transferred when a load is distributed in the information processing system. Therefore, the movement of data from one storage device to another at load distribution time does not impede, for example, access by a user to data in the information processing system.
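Label conversion method 1 can be sketched as below. The function names are hypothetical, labels are assumed to be binary strings, and applying the truncation rule before the trailing-zero rule is an assumption about ordering that the text does not state. Under that ordering the label “10100000000” would first be cut to 8 bits and end up as “1010”, so the earlier example that converts it to “101” is read here as illustrating the trailing-zero rule on its own.

```python
def strip_zero_groups(label: str, n: int = 4) -> str:
    """Delete trailing 0's (LSB side) in whole groups of n bits."""
    while label.endswith("0" * n):
        label = label[:-n]
    return label


def convert_label_method1(label: str, n: int = 4) -> str:
    """Keep the leading bits up to the largest multiple of n, then strip zero groups."""
    kept = label[: (len(label) // n) * n]
    return strip_zero_groups(kept, n)
```

With N=4, every label of one to three bits and the label “0000” convert to the empty label, which is why the corresponding segments hash to the same server apparatus as the root segment.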
(Example of Application of Label Conversion Method 2)
In the following description it is assumed that data that belongs to a segment at a hierarchical level corresponding to n=1 is stored in the data storage section 140 of the server apparatus 100. This is the same with the example of application of label conversion method 1. Furthermore, an arrow added to a segment indicates that data which belongs to the segment is allowed to be transferred from the server apparatus 100 to another server apparatus.
It is assumed that a hierarchical level interval N at which transfer is allowed is 4 and that the above label conversion method 2 is applied. With a segment which belongs to a hierarchical level corresponding to n=2 and whose label is “0”, the number of bits of the label is 1, a remainder (A) obtained by dividing the number of bits of the label by N(=4) is 1, and (f×N)=0.0. In this case, the remainder (A)(=1) does not match the integral part of (f×N)(=0), so one bit is deleted from the label, starting at the LSB. As a result, all the bits of the label are deleted. Accordingly, the hash value is the same as that of the segment which belongs to the hierarchical level corresponding to n=1. The segment management block 122 informs the data transmission block 113 of the server apparatus 100 as a transfer destination. This means that data which belongs to the segment whose label is “0” is not moved. The same applies to the segment whose label is “1”: the remainder (A)(=1) does not match the integral part of (f×N)(=2), so its single bit is deleted, the hash value is the same as that of the segment which belongs to the hierarchical level corresponding to n=1, and the segment management block 122 informs the data transmission block 113 of the server apparatus 100 as a transfer destination. This means that data which belongs to the segment whose label is “1” is not moved.
With each segment which belongs to a hierarchical level corresponding to n=3, the number of bits of a label is 2 and a remainder (A) obtained by dividing the number of the bits of the label by N(=4) is 2. With a segment whose label is “10”, an integral part of (f×4) matches 2. Accordingly, the data transmission block 113 informs the segment management block 122 of the label “10”. The segment management block 122 finds a hash value corresponding to the label “10”, determines on the basis of the hash value a server apparatus to which data that belongs to the segment whose label is “10” is to be transferred, and informs the data transmission block 113 of the server apparatus. If the server apparatus of which the segment management block 122 informs the data transmission block 113 is different from the server apparatus 100, then the data transmission block 113 transfers the data that belongs to the segment whose label is “10” to the server apparatus of which the segment management block 122 informs the data transmission block 113.
With the other segments which belong to the hierarchical level corresponding to n=3, a number of bits equal to the remainder (A) is deleted from each label by applying label conversion method 2. As a result, the hash values are the same as that of the segment which belongs to the hierarchical level corresponding to n=1. Accordingly, the segment management block 122 informs the data transmission block 113 of the server apparatus 100 as a transfer destination. This means that data which belongs to these segments is not moved.
With each segment which belongs to a hierarchical level corresponding to n=4, the number of bits of a label is 3 and a remainder (A) obtained by dividing the number of the bits of the label by N(=4) is 3. With a segment whose label is “110” or “111”, an integral part of (f×4) matches 3. Accordingly, the data transmission block 113 informs the segment management block 122 of the label “110” or “111”. The segment management block 122 finds a hash value corresponding to the label “110” or “111”, determines on the basis of the hash value a server apparatus to which data that belongs to the segment whose label is “110” or “111” is to be transferred, and informs the data transmission block 113 of the server apparatus. If the server apparatus of which the segment management block 122 informs the data transmission block 113 is different from the server apparatus 100, then the data transmission block 113 transfers the data that belongs to the segment whose label is “110” or “111” to the server apparatus of which the segment management block 122 informs the data transmission block 113.
With the other segments which belong to the hierarchical level corresponding to n=4, a number of bits equal to the remainder (A) is deleted from each label by applying label conversion method 2. As a result, the hash values are the same as that of the segment which belongs to the hierarchical level corresponding to n=1. Accordingly, the segment management block 122 informs the data transmission block 113 of the server apparatus 100 as a transfer destination. This means that data which belongs to these segments is not moved.
With each segment which belongs to a hierarchical level corresponding to n=5, the number of bits of a label is 4 and a remainder (A) obtained by dividing the number of bits of the label by N(=4) is 0. With segments whose labels are “0000” through “0011”, the integral part of (f×4) matches 0, so the labels are kept. However, the label “0000” includes consecutive 0's from the LSB toward the MSB the number of which is a multiple of N (exactly N), so the consecutive 0's are deleted and the label becomes empty. With the labels “0100” through “0111”, the remainder (A) is 0, so no bit is deleted. Accordingly, the data transmission block 113 informs the segment management block 122 of the labels “(empty)” through “0111”. The segment management block 122 finds hash values corresponding to the labels “(empty)” through “0111”, determines on the basis of the hash values the server apparatus to which data that belongs to the corresponding segments is to be transferred, and informs the data transmission block 113 of the server apparatus. The hash value corresponding to the label “(empty)” is the same as that of the segment which belongs to the hierarchical level corresponding to n=1, so the server apparatus 100 is designated. If a server apparatus of which the segment management block 122 informs the data transmission block 113 as a transfer destination of the data that belongs to one of the segments whose labels are “0001” through “0111” is different from the server apparatus 100, then the data transmission block 113 transfers the data that belongs to that segment to the designated server apparatus.
With segments whose labels are “1000” through “1111”, a remainder (A) is also 0, so no bit is deleted from a label value. However, if transfer occurs for the segment whose label is “10”, “110” or “111”, the segments whose labels are “1000” through “1111” do not meet the division depth condition. That is to say, an interval between the hierarchical level to which the segment whose label is “10”, “110” or “111” belongs and the hierarchical level to which the segments whose labels are “1000” through “1111” belong is smaller than N(=4). Accordingly, the data transmission block 113 does not allow data which belongs to these segments to be moved. When segment division is performed further, descendant segments of the segment whose label is “10”, “110” or “111” are generated. As a result, segments for which the movement of data is allowed appear at hierarchical level intervals of 4. The same applies to descendant segments of the segments whose labels are “0001” through “0111”.
As has been described, in the example of
If label conversion method 2 is applied in the above way, the same effect that is obtained by applying label conversion method 1 is achieved. In addition, timing at which a data transfer occurs for plural segments can be staggered. That is to say, it is possible to prevent data transfers from occurring in the same period of time. Therefore, the load on the information processing system is reduced.
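The label conversions walked through in this example can be reproduced with the following sketch of label conversion method 2. The function name is hypothetical, labels are assumed to be binary strings (b=2), and the integral part of (f×N) is computed with integer arithmetic as (value(L)×N) divided by 2^length(L), which is equivalent to expression (2) but avoids rounding.

```python
def convert_label_method2(label: str, n: int = 4) -> str:
    """Label conversion method 2 for binary labels (a sketch, assuming b = 2)."""
    length = len(label)
    value = int(label, 2) if label else 0
    integral_part = (value * n) // (2 ** length)  # integral part of f*N
    remainder = length % n                        # the remainder A
    # If A does not match, delete A bits from the LSB side (nothing when A is 0).
    if remainder != integral_part and remainder > 0:
        label = label[:-remainder]
    # Same trailing-zero rule as in method 1: drop whole groups of n zeros.
    while label.endswith("0" * n):
        label = label[:-n]
    return label
```

With N=4, the labels “10”, “110”, and “111” are kept, the other one-, two-, and three-bit labels and “0000” become empty, and the four-bit labels “0001” through “1111” are unchanged, in line with the cases described above.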
The server apparatus 100a, 100b, or 100c illustrated in
(Request Handling)
(Step S31) The request receiving block 111 receives a request from the client apparatus 300. The request may be a read request in which one key is designated, a read request in which a key range is designated, or a write request in which one key is designated.
(Step S32) The segment management block 122 determines whether or not a key range is designated in the request received in step S31. If a key range is designated in the request received in step S31, then the segment management block 122 proceeds to step S34. If a key range is not designated in the request received in step S31, then the segment management block 122 proceeds to step S33.
(Step S33) The segment management block 122 refers to segment information stored in the segment information storage block 121, and searches for a segment to which a key designated in the request belongs. For example, at this time the segment management block 122 compares a prefix of the key represented as a b-ary number (b is an integer greater than or equal to 2) with a label from a root node to a leaf node of a segment management tree, and specifies a leaf node whose label matches the prefix. The segment management block 122 searches for a segment corresponding to the specified leaf node.
(Step S34) The segment management block 122 refers to segment information and searches for all segments including at least one key within the key range designated in the request. For example, in order to search for plural segments, the segment management block 122 finds the longest prefix common to the maximum key value and the minimum key value within the key range, and specifies a node whose label matches the longest prefix. The segment management block 122 then specifies plural leaf nodes below the specified node. The segment management block 122 searches for plural segments corresponding to the specified plural leaf nodes.
For example, it is assumed that the key range is 00010 to 00101. The longest prefix common to the minimum key value “00010” and the maximum key value “00101” is “00”. Accordingly, the segment management block 122 specifies leaf nodes whose labels are “000” and “001” and which are below a node whose label is “00”. The segment management block 122 then searches for segments corresponding to the leaf nodes whose labels are “000” and “001”.
(Step S35) The segment management block 122 converts a label of the segment for which the segment management block 122 searches in step S33 or S34 by the use of label conversion method 1 or 2 applied in the above data transfer determination process. The segment management block 122 then applies a hash function to the label or a label after the conversion to calculate a hash value. The segment management block 122 then specifies from the hash value a server apparatus in which the segment is arranged. At this time, for example, the method indicated in
(Step S36) The segment management block 122 determines whether or not the segment management block 122 specifies another server apparatus in step S35 as a server apparatus in which a segment to which data to be processed belongs is arranged. If the segment management block 122 specifies another server apparatus in step S35, then step S37 is performed. If the segment management block 122 specifies in step S35 the server apparatus (server apparatus 100) including itself, then step S38 is performed.
(Step S37) The message processing block 114 transmits a message indicative of a read request or a write request to another server apparatus specified in step S35. The message includes the key, or the part of the key range, that is designated in the request from the client apparatus 300 and stored in the destination server apparatus. After that, the message processing block 114 receives a response message from the destination server apparatus. The response message includes data corresponding to the key or a report on the completion of writing.
(Step S38) The data processing block 131 performs data processing in response to the request from the client apparatus 300. If the request is a read request, then the data processing block 131 reads out from the data storage section 140 data corresponding to the key, or the part of the key range, that is designated in the request from the client apparatus 300 and stored in the server apparatus 100. If the request is a write request, then the data processing block 131 associates a key included in the request with data, and writes them to the data storage section 140.
(Step S39) The request receiving block 111 transmits to the client apparatus 300 a response to the request which it receives in step S31. If the request is a read request, then the request receiving block 111 transmits to the client apparatus 300 the data read out from another server apparatus or the server apparatus 100 in step S37 or S38 as the response. If the request is a write request, then the request receiving block 111 transmits to the client apparatus 300 a report on the completion of writing confirmed in step S37 or S38 as the response.
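The segment searches in steps S33 and S34 can be sketched as follows. The function names are hypothetical, keys and labels are assumed to be binary strings, and the segment information is simplified to a plain list of leaf labels. As in the description of step S34, the range search returns every leaf below the node matching the longest common prefix, which can include leaves that hold no key inside the range.

```python
def find_leaf(key: str, leaves: list[str]) -> str:
    """Step S33: the leaf whose label matches a prefix of the key."""
    for leaf in leaves:
        if key.startswith(leaf):
            return leaf
    raise KeyError(key)


def common_prefix(a: str, b: str) -> str:
    """Longest prefix shared by the minimum and maximum key values."""
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    return a[:i]


def find_leaves_for_range(min_key: str, max_key: str, leaves: list[str]) -> list[str]:
    """Step S34: leaves at or below the node labeled with the common prefix."""
    prefix = common_prefix(min_key, max_key)
    # A leaf that is itself a prefix of the common prefix covers the whole range.
    return [leaf for leaf in leaves
            if leaf.startswith(prefix) or prefix.startswith(leaf)]
```

For the key range 00010 to 00101 and leaves labeled "000", "001", "01", and "1", the common prefix is "00" and the search returns the leaves "000" and "001", as in the example above.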
(Example of Sequence of Operation in Information Processing System)
(Step S41) The client apparatus 300 transmits to the server apparatus 100 a request indicative of data writing. The request includes a key and data.
(Step S42) The server apparatus 100 searches on the basis of segment information it has for a segment to which the key belongs, and converts a label by the use of applied label conversion method 1 or 2. The server apparatus 100 then uses a label after the conversion and a hash function for specifying a server apparatus (server apparatus 100a, in this case) in which data that belongs to the segment is arranged. The server apparatus 100 then transmits to the server apparatus 100a a message indicative of a write request. The message includes the key and the data.
(Step S43) The server apparatus 100a associates the key included in the message received from the server apparatus 100 with the data included in the message received from the server apparatus 100, and writes them to a storage device included therein. The server apparatus 100a then transmits to the server apparatus 100 a response indicative of the completion of the writing.
(Step S44) When the server apparatus 100 confirms that the server apparatus 100a has completed the writing, the server apparatus 100 transmits a response to the client apparatus 300.
(Step S45) The server apparatus 100a detects that the amount of or the number of pieces of data which belongs to the segment has exceeded a threshold as a result of writing the data. The server apparatus 100a divides the segment into b (two, for example) segments.
(Step S46) The server apparatus 100a performs a data transfer determination process like that indicated in
(Step S47) The server apparatus 100a transmits to the server apparatus 100b a key and data to be stored in the server apparatus 100b.
(Step S48) The server apparatus 100b associates the key received from the server apparatus 100a with the data received from the server apparatus 100a, and writes them to a storage device included therein. The server apparatus 100b then transmits to the server apparatus 100a a response indicative of the completion of the writing.
(Step S49) The server apparatus 100a deletes from the storage device included therein the data in the segment whose duplication into the server apparatus 100b is completed. In addition, the server apparatus 100a updates the segment information which it has so that it reflects the segment division. The server apparatus 100a then transmits to all the other server apparatus (server apparatus 100, 100b, and 100c) notice indicative of the segment information update.
(Step S50) Each of the server apparatus 100, 100b, and 100c updates segment information which it has on the basis of the notice from the server apparatus 100a. Each of the server apparatus 100, 100b, and 100c then transmits to the server apparatus 100a a response indicative of the completion of the segment information update. As a result, the segment information update performed by the server apparatus 100a is reflected in the segment information which each of the server apparatus 100, 100b, and 100c has.
According to the second embodiment, each of the server apparatus 100, 100a, 100b, and 100c associates a key with data, stores them, and performs data processing for a key designated by the client apparatus 300. This decreases the complexity of data processing and the load. Furthermore, data is distributed and is stored in plural server apparatus. In addition, any server apparatus can accept a request from the client apparatus 300. As a result, the system can be designed so that no apparatus becomes a bottleneck. Accordingly, the availability of the information processing system is improved.
Furthermore, key space is divided into plural segments and data is arranged in server apparatus on a segment basis. Accordingly, there is a high probability that data corresponding to keys the values of which are close to one another is stored in the same server apparatus. As a result, data processing in which a range of keys is designated is performed efficiently. In addition, whether or not there is a need for segment division or data transfer is determined independently for each segment on the basis of its label. Therefore, each server apparatus makes a determination without conducting negotiations with another server apparatus. As a result, scalability is improved.
Furthermore, if the determined division depth (hierarchical level interval) is not satisfied, then the movement of data which belongs to a generated segment to another storage device is restricted. This suppresses an increase in the average amount of data transferred when a load is distributed in the information processing system. Therefore, the movement of data from one storage device to another at load distribution time does not impede, for example, access by a user to data in the information processing system.
Furthermore, if label conversion method 2 is applied in a data transfer determination process, timing at which a data transfer occurs for plural segments can be staggered. That is to say, it is possible to prevent data transfers from occurring in the same period of time. Therefore, the load on the information processing system is reduced.
As stated above, the data management in the second embodiment can be realized by causing each of the server apparatus 100, 100a, 100b, and 100c, which are computers, to execute a data management program. The program is recorded in a computer-readable record medium (record medium 33, for example). A magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is used as a record medium. A magnetic disk may be an FD or an HDD. An optical disk may be a CD, a CD-R(Recordable)/RW(ReWritable), a DVD, or a DVD-R/RW.
To distribute the program, portable record media or the like in which the program is recorded are provided. Alternatively, the program may be stored in a storage device of another computer and be distributed via the network 41. A computer stores in a storage device (HDD 103, for example) the program recorded in a portable record medium, the program received from another computer, or the like, reads the program from the storage device, and executes the program. However, the computer may directly execute the program read from the portable record medium or directly execute the program received from another computer via the network 41.
Traffic for data management in an information processing system including a plurality of storage devices is reduced.
All examples and conditional language provided herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
This application is a continuation application of International Application PCT/JP2012/050158 filed on Jan. 6, 2012 which designated the U.S., the entire contents of which are incorporated herein by reference.
 | Number | Date | Country |
---|---|---|---|
Parent | PCT/JP2012/050158 | Jan 2012 | US |
Child | 14315377 | | US |