This application is based upon and claims the benefit of priority of the prior Japanese Patent application No. 2021-003717, filed on Jan. 13, 2021, the entire contents of which are incorporated herein by reference.
The embodiment discussed herein relates to an information processing system, an information processing apparatus, and a method for processing information.
As an example of an information processing system including multiple information processing apparatuses, a block storage system is known in which a computing server and a storage server are communicably connected to each other via a network.
[Patent Document 1] Japanese Laid-open Patent Publication No. 2018-142314
[Patent Document 2] Japanese Laid-open Patent Publication No. 2018-185760
[Patent Document 3] Japanese Laid-open Patent Publication No. 2005-202942
In a block storage system, when data is written from a computing server into a storage server, the data passes through the network, which generates communication traffic.
For example, by employing a contents cache in a computing server, passage of data through the network can be suppressed in terms of writing cache-hit data, which means that deduplication is enabled. On the other hand, cache-miss data is not deduplicated.
As described above, depending on the operation mode of the information processing system, the tendency of writing accesses to the information processing apparatus, and the like, the effect of deduplication in reducing data traffic may decline with, for example, an increase in the frequency of cache misses.
According to an aspect of the embodiments, an information processing system includes: a first information processing apparatus; and a second information processing apparatus connected to the first information processing apparatus via a network. The first information processing apparatus includes a first memory, a first storing region that stores a fingerprint of data, and a first processor coupled to the first memory and the first storing region. The first processor is configured to transmit, in a case where a fingerprint of writing target data to be written into the second information processing apparatus exists in the first storing region, a writing request including the fingerprint to the second information processing apparatus, and transmit, in a case where the fingerprint does not exist in the first storing region, a writing request containing the writing target data and the fingerprint to the second information processing apparatus. The second information processing apparatus includes a second memory, a second storing region that stores respective fingerprints of a plurality of data pieces written into a storing device in a sequence of writing the plurality of data pieces, and a second processor coupled to the second memory and the second storing region. The second processor is configured to receive a plurality of the writing requests from the first information processing apparatus via the network, determine, based on writing positions of the plurality of the fingerprints included in the plurality of writing requests on a data layout of the second storing region, whether or not the plurality of writing requests have sequentiality, read, when determining that the plurality of writing requests have sequentiality, a subsequent fingerprint to the plurality of fingerprints on the data layout of the second storing region, and transmit the subsequent fingerprint to the first information processing apparatus.
The first information processing apparatus stores the subsequent fingerprint into the first storing region.
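The exchange between the two apparatuses can be sketched as follows. This is a minimal illustration, not the claimed implementation; the function name, the dictionary-shaped request, and the use of SHA-1 are all assumptions for the sketch.

```python
import hashlib


def make_write_request(data: bytes, known_fps: set) -> dict:
    """Build a writing request for the second apparatus.

    If the fingerprint of the writing target data already exists in the
    first storing region (known_fps), transmit only the fingerprint;
    otherwise transmit the fingerprint together with the data itself.
    """
    fp = hashlib.sha1(data).hexdigest()  # fingerprint of the writing target
    if fp in known_fps:
        return {"fp": fp}                # deduplicated: data body omitted
    known_fps.add(fp)
    return {"fp": fp, "data": data}      # first write: data body included
```

A second write of identical content thus carries only the fingerprint, which is what suppresses the passage of the data body through the network.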
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
Hereinafter, an embodiment of the present invention will now be described with reference to the accompanying drawings. However, the embodiment described below is merely illustrative and there is no intention to exclude the application of various modifications and techniques that are not explicitly described below. For example, the present embodiment can be variously modified and implemented without departing from the scope thereof. In the drawings to be used in the following description, like reference numbers denote the same or similar parts, unless otherwise specified.
As illustrated in
As illustrated in
As illustrated in
As illustrated in
In the first, third, and fourth configuration examples illustrated in
For example, by employing a contents cache in the computing server 110, passage of data through the network 120 can be suppressed in terms of writing cache-hit data, which means that deduplication is enabled.
Each local cache 150 includes a cache 151. The storage server 130 includes a cache 131, a deduplicating and compacting unit 132 that deduplicates and compresses data, and a Redundant Arrays of Inexpensive Disks (RAID) 133 that stores data. In the first and third configuration examples, as illustrated in
However, in either of the examples of
The contents cache 141 is, for example, a deduplicated cache and may include, by way of example, a “Logical Unit Number (LUN),” a “Logical Block Address (LBA),” a “fingerprint,” and “data.” A fingerprint (FP) is a fixed-length or variable-length data string calculated on the basis of data, and may be, as an example, a hash value calculated by a hash function. Various hash functions such as SHA-1 can be used as the hash function.
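The structure of such a deduplicated contents cache can be sketched as below, assuming SHA-1 as the hash function as mentioned above. The class name, the two-dictionary layout, and the `put` interface are illustrative assumptions, not the described apparatus.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Fixed-length fingerprint (FP) of a data block; SHA-1 as an example."""
    return hashlib.sha1(data).hexdigest()


class ContentsCache:
    """Deduplicated cache: one data copy per FP, many (LUN, LBA) references."""

    def __init__(self):
        self.by_fp = {}     # fingerprint -> data, stored only once
        self.location = {}  # (lun, lba) -> fingerprint

    def put(self, lun: int, lba: int, data: bytes) -> bool:
        """Insert data; return True on a cache hit (duplicate content)."""
        fp = fingerprint(data)
        hit = fp in self.by_fp
        if not hit:
            self.by_fp[fp] = data
        self.location[(lun, lba)] = fp
        return hit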
As illustrated in
In the example of
Accordingly, the efficiency of the cache capacity can be enhanced, and from the viewpoint of communication, the data transfer amount at the time of writing can be reduced.
An effective example brought by the contents cache 141 is, as illustrated in
When the definition files are updated at the start of the virtual desktops, multiple writings of the same data occur from multiple virtual desktops to the storage servers 130 around the working start time. These writings allow the data to be fetched (stored) into the contents cache 141 because the size of the data related to the writings is small and the writings occur at substantially the same time.
In the example of
As described above, unless deduplication is performed in the contents cache 141, the data traffic is not reduced. In other words, unless the data exists in the contents cache 141 (a cache hit occurs), the data traffic is not reduced. Another conceivable approach is to compress the data, but compression reduces the data traffic by only about 30 to 40 percent and does not bring the drastic reduction achieved by deduplication, which suppresses the transmission of the entire data.
One cause of failed deduplication against the contents cache 141 is a cache miss on content that was previously written. In this case, although the data traffic increases, deduplication might still be possible if an inquiry is made to the storage server 130. The underlying cause is that the contents cache 141 of the computing server 110 stores only part of the FPs throughout the system.
An example of a use case of a block storage system is a case where multiple users store a data set into the storage servers 130 for machine learning of Artificial Intelligence (AI).
The data set used in the machine learning of AI can be tens of PBs (petabytes). For example, the users download the data set from a community site and deploy it onto the storage servers 130. It is assumed that the data sets used in machine learning have the same data and a similar writing sequence.
In terms of the storage capacity of the contents cache 141, it is difficult to place all writings of a data set of several tens of PBs in the contents cache 141. However, the data sets, which contain the same data and similar writing sequence, have regularity.
With the foregoing in view, description of the one embodiment will be made in relation to, as an example of a scheme to reduce data traffic when data is written into an information processing apparatus, a scheme that achieves deduplication in writing data sets from the second and subsequent users by using regularity.
The following description is based on the block storage system 100D according to the fourth configuration example. However, the scheme according to the one embodiment is also applicable to writing for duplication in the block storage system 100B according to the second configuration example. In other words, in terms of an I/O (Input/Output) path, the computing server 110 serving as a writing destination of the block storage system 100B can be treated the same as the storage server 130 in the block storage system 100D.
The computing server 110 is an example of a first information processing apparatus, and the storage server 130 is an example of a second information processing apparatus. Further, in cases where the multiple computing servers 110 have a redundant configuration and data is written between the computing servers 110 in the example illustrated in
Each computing server 2 may include a storage component 20 having a contents cache 20a. Each storage server 4 may include a prefetching unit 40a, a deduplicating and compacting unit 40b, and a storage 40c.
Each storage server 4 according to the one embodiment reduces data traffic by predicting regularity and transmitting an FP that is likely to be written by the computing server 2 to the contents cache 20a of the computing server 2 in advance.
For example, the storage server 4 prefetches an FP, focusing on sequentiality of data that can be detected inside the storage server 4. As illustrated in
As a scheme for detecting the regularity described above, time series analysis has been known, for example. Time series analysis is, for example, a scheme of analysis that provides an FP written for each LUN with a time stamp. In time series analysis, additional resources of the storage server 4 or a server on a cloud are used for managing the time stamp provided to each FP. In addition, when time series analysis is performed inside the storage of the storage server 4, the time series analysis, which is high in processing load, may be a cause of degrading the performance of the storage server 4.
For the above, the one embodiment focuses on sequentiality of data as the regularity. By using the sequentiality of data that can be detected inside the storage of the storage server 4 as the regularity, it is possible to complete the process within the storage. In order to enhance the detection accuracy, time series analysis may be employed as regularity in addition to the sequentiality of the data to the extent that the use of additional resources is permitted.
As illustrated in
As illustrated in
In cases where the storage server 4 determines that the FPs are sequential (succeeds in the determination), the storage server 4 reads the FPs at and after the 532nd byte on the data layout of the storing region 40d, which follow the received FPs, and transfers the read FPs to the computing server 2 (see reference numbers (3)).
Thereby, in cases where the FPs of the fourth and subsequent data in the writing sequence match the FPs received from the storage server 4, the computing server 2 can omit the transmission of the data as in the case of the first to third data. In other words, in the block storage system 1, it is possible to reduce the data traffic by deduplication.
The sequential determination described above is assumed to use the writing positions in the storage 40c, for instance, a disk group such as a RAID.
For example, in cases where the sequential determination uses LUNs and LBAs, since the data layout on the LUNs is based on the logical writing positions of the actual data, subsequent data is guaranteed to follow when data is read sequentially on the basis of the LUNs and the LBAs. In other words, on the data layout of the LUN, the subsequent data is guaranteed to be the next data on the same LUN.
On the other hand, in the scheme of the one embodiment, the sequential determination depends on the writing sequence of the fingerprints. That is, in the example of
One of the cases where it is difficult to write "in the writing sequence in units of an LUN as much as possible" is when writing of the metadata or a journal log of a file system occurs. For example, a block storage sometimes uses a file system. The file system sometimes writes, for example, metadata and a journal log into the storage 40c in addition to the data body in accordance with workload data of a user.
As illustrated in
As illustrated in
As a solution to the above, the block storage system 1 according to the one embodiment may perform compaction of FPs as illustrated in
For example, as illustrated in
Thus, at the time of the next writing into the storage server 4, since compaction is already performed in the storing region 40d-2, the FPs therein are easily determined to be sequential and the storing region 40d-2 has a small number of pieces of unrequired data, which can enhance the prefetching hit rate.
As described above, according to the scheme of the one embodiment, by transferring FPs that are likely to cause cache hits in prefetching from the storage server 4 to the computing server 2 in advance, the deduplication rate can be enhanced by prefetching hits. This can reduce the data traffic.
For example, in the event of executing a workload of writing which has sequentiality and in which deduplication is effective, deduplication can be accomplished regardless of the size of the contents cache 20a even in large scale writing.
In addition, since compaction can remove unrequired data that causes errors in sequential determination and a decrease in the prefetching hit rate, the deduplication rate can be further enhanced at, for example, the third and subsequent writings.
As illustrated in
The contents cache 20a is, for example, a cache in which deduplication has been performed, and may include an “LUN”, an “LBA”, a “fingerprint”, and “data”, as the data structure illustrated in
The dirty data managing unit 21 manages dirty data in the contents cache 20a, which has not yet been written into the storage server 4. For example, the dirty data managing unit 21 may manage metadata such as LUN+LBA along with dirty data. The dirty data managing unit 21 outputs data to the deduplication determining unit 22 when the deduplication determining unit 22 determines to perform deduplication.
The deduplication determining unit 22 calculates the FP of the data, and determines whether or not the deduplication of the data is to be performed. The FP calculated by the deduplication determining unit 22 is managed by the FP managing unit 23.
The FP managing unit 23 manages the FP held in the contents cache 20a. The FP managing unit 23 may manage FPs received from the prefetching unit 40a of the storage server 4 in addition to the FPs calculated from the data in the contents cache 20a.
The network IF unit 20b has a function as a communication IF to an external information processing apparatus such as the storage server 4.
As illustrated in
The network IF unit 40e has a function as a communication IF to an external information processing apparatus such as the computing server 2.
The first managing unit 41 manages FPs that the storage server 4 holds. For example, the first managing unit 41 may read and write an FP from and to the back end through the first layout managing unit 44. The first managing unit 41 may, for example, receive a writing request including an FP of writing target data to be written into the storage 40c from the computing server 2 through the network 3 by the network IF unit 40e.
The second managing unit 42 manages data except for the FPs. For example, the second managing unit 42 may manage various data held by the storage server 4, including metadata such as a reference count and mapping from the LUN+LBA to the address of the data, a data body, and the like. The second managing unit 42 outputs the data body to the deduplication hit determining unit 43 in deduplication determination. The second managing unit 42 may read and write various data except for the FPs from the back end through the second layout managing unit 45.
The deduplication hit determining unit 43 calculates the FP of the data, and determines whether or not the deduplication of the data is to be performed. The FP calculated by the deduplication hit determining unit 43 is managed by the first managing unit 41.
The first layout managing unit 44 manages, through the drive IF unit 40f, the layout on the volume of the storage 40c when an FP is read or written. For example, the first layout managing unit 44 may determine the position of an FP to be read or written.
The second layout managing unit 45 manages, through the drive IF unit 40f, the layout on the volume of the storage 40c when reading or writing metadata such as a reference count and mapping from the LUN+LBA to the address of the data, the data body, and the like. For example, the second layout managing unit 45 may determine the positions of the metadata, the data body, and the like to be read and written.
The drive IF unit 40f has a function as an IF for reading from and writing to the drive of the storage 40c serving as the back end of the deduplication.
The storage 40c is an example of a storing device configured by combining multiple drives. The storage 40c may be a virtual volume such as RAID, for example. Examples of the drive include at least one of drives such as a Solid State Drive (SSD), a Hard Disk Drive (HDD), and a remote drive. The storage 40c may include a storing region (not illustrated) that stores data to be written and one or more storing regions 40d that store metadata such as an FP.
The storing region 40d is an example of a second storing region, and may store, for example, respective FPs of multiple data pieces written into the storage 40c in the sequence of writing the multiple data pieces.
The hit rate and history managing unit 46 determines the prefetching hit rate and manages the hit history.
For example, in order to determine the prefetching hit rate, when adding a prefetched FP to the contents cache 20a, the hit rate and history managing unit 46 may add, through the first managing unit 41, information indicating the prefetched FP, for example, a flag, to the FP. In cases where the FP with a flag is written from the computing server 2, which means a prefetching hit, the hit rate and history managing unit 46 may transfer the FP with the flag to the storage 40c through the first managing unit 41, to update the hit rate. Incidentally, the presence or absence of a flag may be regarded as the presence or absence of an entry in a hit history table 46a to be described below. That is, addition of a flag to an FP may represent addition of an entry to the hit history table 46a.
Further, for example, the hit rate and history managing unit 46 may use the hit history table 46a that manages the hit number in the storage server 4 in order to manage the hit history of prefetching. The hit history table 46a is an example of information that records, for each of multiple FPs transmitted in prefetching, the number of times of receiving a writing request including an FP that matches the transmitted FP.
The hit rate and history managing unit 46 may create an entry in the hit history table 46a when prefetching is carried out in the storage server 4. The hit rate and history managing unit 46 may update the hit number of the target FP upon a prefetching hit. The hit rate and history managing unit 46 may delete an entry when a predetermined time has elapsed after prefetching.
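The lifecycle of such a table (create on prefetch, count on a matching write, expire after a fixed time) can be sketched as follows. The class, the time-to-live value, and the dictionary layout are illustrative assumptions.

```python
import time


class HitHistoryTable:
    """Per-FP prefetching hit counts; an entry exists only for prefetched FPs."""

    def __init__(self, ttl: float = 60.0):
        self.ttl = ttl     # assumed expiry interval after prefetching
        self.entries = {}  # fp -> {"hits": int, "created": float}

    def on_prefetch(self, fp: str) -> None:
        """Create an entry when the FP is transmitted in prefetching."""
        self.entries[fp] = {"hits": 0, "created": time.monotonic()}

    def on_write(self, fp: str) -> bool:
        """Count a prefetching hit; return True if the FP had been prefetched."""
        entry = self.entries.get(fp)
        if entry is None:
            return False
        entry["hits"] += 1
        return True

    def expire(self) -> None:
        """Delete entries once the predetermined time has elapsed."""
        now = time.monotonic()
        self.entries = {fp: e for fp, e in self.entries.items()
                        if now - e["created"] < self.ttl}
```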
The sequential determining unit 47 performs sequential determination based on FPs. For example, the sequential determining unit 47 may detect the sequentiality of multiple received writing requests on the basis of writing positions of multiple FPs included in the multiple writing requests on the data layout of the storing region 40d.
The sequential determining unit 47 may use the parameters of P, N, and H in the sequential determination. The parameter P represents the number of entries having sequentiality that the sequential determining unit 47 detects (i.e., the number of times that the sequential determining unit 47 detects sequentiality), and may be an integer of two or more. The parameter N is a coefficient for determining the distance between FPs, which serves as a criterion for determining that the positions of the hit FPs are successive on the data layout of the storing region 40d, in other words, for determining that the FPs are sequential, and may be, for example, an integer of one or more. The parameter H is a threshold for performing prefetching, and may be, for example, an integer of two or more. In the following description, it is assumed that P=8, N=16, and H=5.
For example, when the hit FP is located within ±(α×N) (a first given range) of the position of the last hit FP (e.g., in the immediately preceding writing request) on the data layout of the storing region 40d, the sequential determining unit 47 may determine that the FPs are sequential. The symbol α represents the data size of an FP and is, for example, eight bytes. The case of N=1 can be said to be truly sequential, but N may be set to 2 or more to leave a margin for reordering of the I/O sequence. Thus, even if the FPs are not successive on the data layout of the storing region 40d, the sequential determining unit 47 can determine that the FPs are sequential if the hit FPs are within the distance of ±(α×N).
As another example, the sequential determining unit 47 may determine that the FPs are sequential if the FPs on the data layout of the storing region 40d are hit H times or more. As the above, the sequential determining unit 47 can enhance the accuracy of the sequential determination by determining that the FPs have sequentiality after the FPs are hit a certain number of times.
In the example of
When replacing the entries in the FP history table 47a, the sequential determining unit 47 may replace the entries that have not been used for a fixed interval or longer, or that hold values at the location nearest to the accessed FP.
As described above, the sequential determining unit 47 may detect the sequentiality of multiple writing requests in cases where, regarding the multiple FPs that are stored in the storing region 40d and matching the FPs included in the multiple writing requests, a given number of pairs of neighboring FPs in a sequence of receiving the multiple writing requests on the data layout each fall within the first given range.
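The determination above can be sketched as a single check over the layout positions of the hit FPs. This is a simplified reading of the scheme, with the example parameter values P=8, N=16, H=5 and α=8 taken from the description; the function shape itself is an assumption.

```python
ALPHA = 8  # data size of one FP on the layout, in bytes (example value)
N = 16     # coefficient for the FP-to-FP distance criterion
H = 5      # threshold number of hits before prefetching


def is_sequential(hit_positions: list) -> bool:
    """Sequential determination over byte offsets of hit FPs.

    The writing requests are judged sequential when at least H hits were
    observed and every pair of neighboring hits, in receiving order,
    lies within +/-(ALPHA * N) of each other on the data layout.
    """
    if len(hit_positions) < H:
        return False
    return all(abs(b - a) <= ALPHA * N
               for a, b in zip(hit_positions, hit_positions[1:]))
```

With these values, neighboring hits may be up to 128 bytes apart on the layout (16 FPs of 8 bytes) and still be treated as sequential, which tolerates some switching of the I/O sequence.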
The parameter adjusting unit 48 adjusts the above-described parameters used for the sequential determination. For example, the parameter adjusting unit 48 may perform parameter adjustment when the sequential determination is performed under an eased condition, and cause the sequential determining unit 47 to perform the sequential determination based on the adjusted parameters.
For example, in cases where the FPs are not determined to be sequential in the sequential determination by the sequential determining unit 47, the parameter adjusting unit 48 adjusts the parameters such that the condition for determining that the FPs are sequential is eased.
As illustrated in an example of
When the hit occurs H times, the sequential determining unit 47 calculates the distance between each pair of neighboring FPs from the corresponding entries in the FP history table 47a and determines whether or not there is a distance larger than the distance based on N′ after the parameter adjustment. When there are one or more distances larger than the distance based on N′, since the sequential determination is made under an eased condition, the sequential determining unit 47 inhibits the prefetching unit 40a from executing prefetching and the process shifts to the compaction determination to be made by the compaction determining unit 49. On the other hand, when there is no distance larger than the distance based on N′, the sequential determining unit 47 may determine that the FPs have the sequentiality.
As described above, in cases where the sequentiality of multiple writing requests is not detected in the determination based on the first given range, the sequential determining unit 47 may detect the sequentiality of the multiple writing requests based on a second given range (e.g., ±(α×N′)) including the first given range. In the event of detecting the sequentiality in the determination based on the second given range, the sequential determining unit 47 may suppress the prefetching by the prefetching unit 40a.
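The two-stage determination (strict range first, eased range N′ as a fallback that suppresses prefetching) can be sketched as below. The three-valued return and the default N′=2N are assumptions for illustration.

```python
def eased_check(hit_positions: list, alpha: int = 8,
                n: int = 16, n_eased: int = 32) -> str:
    """Two-stage sequential determination with an eased coefficient.

    Returns "sequential" when all neighboring gaps fit the first given
    range (+/-(alpha * n)), "eased" when they fit only the widened second
    range (+/-(alpha * n_eased)), in which case prefetching is suppressed
    and compaction determination follows, and "none" otherwise.
    """
    gaps = [abs(b - a) for a, b in zip(hit_positions, hit_positions[1:])]
    if all(g <= alpha * n for g in gaps):
        return "sequential"
    if all(g <= alpha * n_eased for g in gaps):
        return "eased"
    return "none"
```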
The prefetching unit 40a prefetches an FP and transfers the prefetched FP to the computing server 2. For example, in cases where the sequential determining unit 47 determines (detects) the presence of the sequentiality, in other words, the sequential determination is successful, the prefetching unit 40a may determine to execute prefetching and schedule the prefetching.
For example, in prefetching, the prefetching unit 40a may read an FP subsequent to the multiple FPs received immediately before, e.g., a subsequent FP on the data layout of the storing region 40d, and transmit the read subsequent FP to the computing server 2.
As an example, the prefetching unit 40a may obtain the information on the FP subsequent to the FPs which have been hit H times in the sequential determining unit 47 through the first layout managing unit 44 and notify the obtained information to the computing server 2 through the network IF unit 40e.
If it is determined that there are one or more distances equal to or longer than the distance based on N′ adjusted by the parameter adjusting unit 48, the prefetching unit 40a may suppress the execution of prefetching because the sequential determination is performed in a state in which the condition is eased. On the other hand, if there is no distance equal to or longer than the distance based on N′, the prefetching unit 40a may determine to execute prefetching.
Upon receiving the FP transmitted by the prefetching unit 40a, the storage component 20 of the computing server 2 may store the received FP into the contents cache 20a. This makes it possible for the computing server 2 to use the prefetched FP in processing by the deduplication determining unit 22 at the time of transmitting the next writing request.
The compaction determining unit 49 determines whether or not to perform compaction. For example, the compaction determining unit 49 may make a determination triggered by one or both of a prefetching hit and sequential determination.
In the event of a prefetching hit, the compaction determining unit 49 refers to entries around the hit FP in the hit history table 46a, and marks, as unrequired data, an entry having a difference in the hit number. An example of the entry having a difference in the hit number may be one having a hit number equal to or less than a hit number obtained by subtracting a given threshold (first threshold) from the maximum hit number among the entries around the hit FP or from the average hit number of those entries.
In the first example, the compaction determining unit 49 may recognize, as unrequired data, each entry having a hit number equal to or less than a value obtained by subtracting a threshold from the maximum hit number among n (n is an integer of one or more) histories. If n=3 and the threshold is 2, since the maximum hit number is 3 and the threshold is 2 in the example of
In the second example, the compaction determining unit 49 may recognize, as unrequired data, each entry having a hit number equal to or less than a value obtained by subtracting a threshold from the average hit number among the n histories. If n=3 and the threshold is 1, since the average hit number is 2 and the threshold is 1 in the example of
Then, the compaction determining unit 49 may schedule the compaction when the number of pieces of unrequired data among the n histories in the periphery is equal to or larger than a threshold (second threshold).
In the example of
For example, the first layout managing unit 44 may arrange, in another storing region 40d-2, the FPs [4F89A3], [B107E5], and [C26D4A], which are obtained by excluding the FP [58E13B] of “528” in the storing region 40d-1, by the scheduled compaction. The compaction determining unit 49 may update the locations of the FPs after the arrangement onto the storing region 40d-2 in the hit history table 46a.
As described above, when receiving a writing request containing an FP that matches the FP transmitted in the prefetching (in the case of a prefetching hit), the compaction determining unit 49 may select an FP to be excluded on the basis of the hit history table 46a. Then, the compaction determining unit 49 may move one or more FPs except for the selected removing target FP among multiple fingerprints stored in the first region 40d-1 of the storing region 40d to the second region 40d-2 of the storing region 40d.
When an entry is hit H times in the sequential determination, the compaction determining unit 49 calculates the distance between each pair of FPs in the corresponding entry in the FP history table 47a, and determines whether or not a distance equal to or longer than the distance based on N exists. If such a distance exists, the compaction determining unit 49 schedules compaction to exclude the unrequired data.
In the first example, the compaction determining unit 49 may determine to execute compaction if there are m (m is an integer of one or more) or more FPs having distances equal to or longer than a value (N-threshold) obtained by subtracting a threshold from N. If N=16, the threshold (third threshold)=2, and m=2, since the entry “No. 0” has two distances of “14” or more in the example of
In the second example, the compaction determining unit 49 may determine to execute compaction when the average value of the distances is equal to or greater than a value (N-threshold) obtained by subtracting a threshold from N. If N=16 and the threshold (fourth threshold)=7, in the example of
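The first of these two triggers can be sketched as follows, using the example values N=16, third threshold=2, and m=2; the function shape is an assumption.

```python
def should_compact(distances: list, n: int = 16, threshold: int = 2,
                   m: int = 2) -> bool:
    """Compaction decision from FP-to-FP distances in one history entry.

    Compaction is triggered when m or more distances are equal to or
    longer than (n - threshold), i.e. when the hit FPs are spread out by
    unrequired data lying between them on the layout.
    """
    wide = [d for d in distances if d >= n - threshold]
    return len(wide) >= m
```

With these values, an entry containing two or more distances of 14 or larger schedules compaction; the average-based second example would instead compare the mean distance against (N − fourth threshold).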
In the compaction triggered by the sequential determination, the compaction determining unit 49 may determine, as unrequired data to be removed, an FP existing between FPs that are separated on the data layout of the storing region 40d by a distance equal to or longer than (N−threshold), obtained by subtracting a threshold from N. As illustrated in
As described above, in cases where the sequential determining unit 47 detects the sequentiality based on the second given range, the compaction determining unit 49 may select a removing-target FP on the basis of the writing positions of the FPs neighboring on the data layout and the first given range. Then, the compaction determining unit 49 may move one or more FPs remaining after excluding the selected removing-target FP among the multiple FPs stored in the first region 40d-1 of the storing region 40d to the second region 40d-2 of the storing region 40d.
Next, description will now be made in relation to an example of operation of the block storage system 1 according to the one embodiment.
The dirty data managing unit 21 of the storage component 20 determines whether or not the FP of the writing target data is hit in the contents cache 20a, using the deduplication determining unit 22 (Step S2).
When a cache hit occurs in the contents cache 20a (YES in Step S2), the dirty data managing unit 21 transfers the FP and the LUN+LBA to the storage server 4 (Step S3), and the process proceeds to Step S5.
When a cache hit does not occur in the contents cache 20a (NO in Step S2), the dirty data managing unit 21 transfers the writing target data, the FP, and the LUN+LBA to the storage server 4 (Step S4), and the process proceeds to Step S5.
The dirty data managing unit 21 waits for a response from the storage server 4 to the request transmitted in Step S3 or Step S4 (Step S5).
The dirty data managing unit 21 analyzes the received response, and determines whether or not the prefetched FP is included in the response (Step S6). If the prefetched FP is not included in the response (NO in Step S6), the process ends.
In cases where the prefetched FP is included in the response (YES in Step S6), the dirty data managing unit 21 adds the received FP to the contents cache 20a through the FP managing unit 23 (Step S7), and then the writing process by the computing server 2 ends.
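The writing process on the computing server side (Steps S1 to S7) can be sketched as follows. The `StubStorage` class, the use of SHA-1 as the fingerprint function, and all names are assumptions made for illustration; the embodiment does not prescribe a concrete fingerprint algorithm or transport.

```python
import hashlib

class StubStorage:
    """Minimal stand-in for the storage server 4 (illustrative only)."""
    def __init__(self, prefetched_fps=()):
        self.requests = []
        self.prefetched_fps = list(prefetched_fps)

    def write(self, fp, addr, data=None):
        self.requests.append((fp, addr, data))
        # Corresponds to Steps S15/S16: respond with completion of
        # writing, attaching FPs to be prefetched when any exist.
        return {"prefetched_fps": self.prefetched_fps}

def write(storage, contents_cache, lun_lba, data):
    fp = hashlib.sha1(data).digest()            # FP of the writing target data
    if fp in contents_cache:                     # Step S2: hit in the contents cache?
        resp = storage.write(fp, lun_lba)                  # Step S3: FP and LUN+LBA only
    else:
        resp = storage.write(fp, lun_lba, data)            # Step S4: data, FP, and LUN+LBA
    for pf in resp["prefetched_fps"]:            # Steps S6-S7: register prefetched FPs
        contents_cache.add(pf)                   # so later sequential writes hit the cache
    return fp
```

On a cache hit only the 36-byte metadata crosses the network; on a miss the data itself is transferred as well, and any prefetched FPs returned in the response widen the cache for subsequent writes.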
The computing server 2 executes the process illustrated in
The storage server 4 causes the first managing unit 41 and the second managing unit 42 to execute a storage process after the deduplication (Step S12). The storage process may be, for example, similar to that of a storage server in a known block storage system.
The storage server 4 performs a prefetching process (Step S13). The prefetching unit 40a determines whether or not an FP to be prefetched exists (Step S14).
If an FP to be prefetched exists (YES in Step S14), the prefetching unit 40a responds to the computing server 2 with the completion of writing while attaching the FP to be prefetched (Step S15), and the receiving process by the storage server 4 ends.
If the FP to be prefetched does not exist (NO in step S14), the storage server 4 responds to the computing server 2 with the completion of writing (Step S16), and the receiving process by the storage server 4 ends.
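The receiving process on the storage server side (Steps S11 to S16) can be summarized in the following rough skeleton. The dictionary-based deduplication store and the `prefetcher` callback are placeholders introduced for illustration only.

```python
# Rough skeleton of the receiving process by the storage server 4.
# dedup_store maps FP -> data; prefetcher encapsulates the prefetching
# process of Step S13 and returns FPs to be prefetched (possibly none).

def receive_writing_request(request, dedup_store, prefetcher, fp_history):
    # Step S12: deduplicated storage - store the data only when the FP
    # is not yet known on the storage side.
    if request.get("data") is not None and request["fp"] not in dedup_store:
        dedup_store[request["fp"]] = request["data"]
    # Step S13: prefetching process (sequential determination etc.).
    fps_to_prefetch = prefetcher(request["fp"], fp_history)
    # Steps S14-S16: attach FPs to the completion response when present.
    response = {"status": "complete"}
    if fps_to_prefetch:
        response["prefetched_fps"] = fps_to_prefetch
    return response
```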
On the basis of the hit history table 46a, the compaction determining unit 49 determines whether or not a prefetching hit exists and many pieces of unrequired data exist in the hit history (Step S22). For example, as illustrated in
If a prefetching hit does not exist, or many pieces of unrequired data do not exist in the hit history (NO in Step S22), the process proceeds to Step S24.
If a prefetching hit exists and many pieces of unrequired data exist in the hit history (YES in Step S22), the compaction determining unit 49 schedules compaction triggered by a prefetching hit (Step S23), and the process proceeds to Step S24.
The sequential determining unit 47 performs sequential determination based on the FP history table 47a and the FP received from the computing server 2, and determines whether or not the FP is hit in the FP history table 47a (Step S24).
If the FP is not hit (NO in Step S24), the sequential determining unit 47 and the parameter adjusting unit 48 perform the sequential determination under an eased condition (parameters), and determine whether or not the FP is hit in the FP history table 47a (Step S25).
If the FP is not hit in Step S25 (NO in Step S25), the process proceeds to Step S28. On the other hand, if the FP is hit (YES in Step S24 or YES in Step S25), the process proceeds to Step S26.
In Step S26, the prefetching unit 40a determines whether or not to perform prefetching. If the prefetching is not to be performed, for example, in Step S26 executed via YES in Step S25 (NO in Step S26), the process proceeds to Step S28.
If the prefetching is to be performed, for example, in Step S26 executed via YES in Step S24 (YES in Step S26), the prefetching unit 40a schedules prefetching (Step S27), and the process proceeds to Step S28.
In Step S28, the compaction determining unit 49 determines whether or not many pieces of unrequired data exist on the basis of the FP history table 47a at the time of the sequential determination. For example, as illustrated in
If many pieces of unrequired data do not exist at the time of the sequential determination (NO in Step S28), the prefetching process ends.
If many pieces of unrequired data exist at the time of the sequential determination (YES in Step S28), the compaction determining unit 49 schedules compaction triggered by the sequential determination (Step S29), and the prefetching process ends.
The compaction scheduled in Steps S23 and S29 is performed by the first layout managing unit 44 at a given timing. The prefetching scheduled in Step S27 is performed by the prefetching unit 40a at a given timing (for example, at Step S15 in
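The two-stage hit check of Steps S24 and S25 — a strict sequential determination followed by a retry under an eased condition — can be sketched as follows. The window-based search and all names are illustrative assumptions; the FP history table 47a is modeled here simply as a list of FPs in writing order.

```python
# Illustrative sketch of the sequential determination (Steps S24-S25).
# fp_history: FPs in the sequence they were written; "window" models the
# given range, measured from the position of the preceding hit, within
# which the received FP must appear to count as sequential.

def find_hit(fp, fp_history, last_pos, window):
    """Return the position of fp within `window` entries after last_pos,
    or None when there is no hit."""
    lo, hi = last_pos + 1, min(last_pos + 1 + window, len(fp_history))
    for pos in range(lo, hi):
        if fp_history[pos] == fp:
            return pos
    return None

def sequential_determination(fp, fp_history, last_pos, window, eased_window):
    pos = find_hit(fp, fp_history, last_pos, window)            # Step S24
    if pos is None:                                              # Step S25: retry
        pos = find_hit(fp, fp_history, last_pos, eased_window)   # with eased parameters
    return pos  # None corresponds to NO in both Step S24 and Step S25
```

Widening the window (the eased condition) lets writes that skip a few blocks, for example because some data was deduplicated on the computing server side, still be recognized as sequential.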
Hereinafter, description will now be made in relation to an application example of the scheme according to the one embodiment with reference to
As illustrated in
Next, as illustrated in
Next, as illustrated in
For example, when it is assumed that the data traffic of LUN+LBA is 8+8=16 B and that of FP is 20 B, a conventional method uses a communication size of 4096+16+20=4132 B each time. On the other hand, assuming that the deduplication succeeds for all data, the scheme of the one embodiment uses a communication size of 16+20=36 B each time. In the writing of the 1-PB data set 40g, since the number of times of communication is 2^(50-12)=2^38, the data traffic can be reduced from 4132×2^38 B to 36×2^38 B. Being expressed as a percentage, the data traffic can be reduced to 36/4132≈0.87%.
The data transfer amount of FPs from the storage server 4 to the computing server 2 in an ideal case is 20×2^38 B. In the case of the writing by the user B illustrated in
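The arithmetic behind the traffic comparison above can be checked directly; the constant names below are introduced only for this sketch.

```python
# Data-traffic arithmetic for the 1-PB example (4-KiB blocks).
BLOCK = 4096                 # block size in bytes (2^12)
ADDR = 8 + 8                 # LUN + LBA, 16 B
FP = 20                      # fingerprint, 20 B
N_WRITES = 2 ** (50 - 12)    # 1 PB / 4 KiB = 2^38 write requests

conventional = (BLOCK + ADDR + FP) * N_WRITES   # 4132 B per request
dedup_all = (ADDR + FP) * N_WRITES              # 36 B per request if all data deduplicates
ratio = (ADDR + FP) / (BLOCK + ADDR + FP)       # remaining fraction of traffic, about 0.87 %
```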
The example described above is a case where the one embodiment is applied to a use case in which a large effect on reducing the data traffic is expected. The effect on reducing the data traffic by the scheme of the one embodiment varies with, for example, a use case, workload, and a data set. Thus, various conditions such as parameters for processes including sequential determination, compaction, prefetching, and the like according to the above-described one embodiment may be appropriately adjusted according to, for example, a use case, workload, and a data set.
The devices for achieving the above-described computing server 2 and storage server 4 may be virtual servers (Virtual Machines (VMs)) or physical servers. The functions of each of the computing server 2 and the storage server 4 may be achieved by one computer or by two or more computers. Further, at least some of the respective functions of the computing server 2 and the storage server 4 may be implemented using Hardware (HW) and Network (NW) resources provided by a cloud environment.
The computing server 2 and storage server 4 may be implemented by computers similar to each other. Hereinafter, the computer 10 is assumed to be an example of a computer for achieving the functions of each of the computing server 2 and the storage server 4.
As illustrated in
The processor 10a is an example of an arithmetic processing apparatus that performs various controls and arithmetic operations. The processor 10a may be connected to each block in the computer 10 so as to be mutually communicable via a bus 10i. The processor 10a may be a multiprocessor including multiple processors, or a multi-core processor including multiple processor cores, or may have a configuration including multiple multi-core processors.
An example of the processor 10a is an Integrated Circuit (IC) such as a Central Processing Unit (CPU), a Micro Processing Unit (MPU), a Graphics Processing Unit (GPU), an Accelerated Processing Unit (APU), a Digital Signal Processor (DSP), an Application Specific IC (ASIC), and a Field-Programmable Gate Array (FPGA). The processor 10a may be a combination of two or more ICs exemplified as the above.
The memory 10b is an example of a HW device that stores information such as various data and programs. An example of the memory 10b includes one or both of a volatile memory such as a Dynamic Random Access Memory (DRAM) and a non-volatile memory such as a Persistent Memory (PM).
The storing device 10c is an example of a HW device that stores information such as various data and programs. Examples of the storing device 10c include various storing devices exemplified by a magnetic disk device such as a Hard Disk Drive (HDD), a semiconductor drive device such as a Solid State Drive (SSD), and a non-volatile memory. Examples of a non-volatile memory are a flash memory, a Storage Class Memory (SCM), and a Read Only Memory (ROM).
The information on the contents cache 20a that the computing server 2 stores may be stored in one or more storing regions that one or both of the memory 10b and the storing device 10c include. Each of the storage 40c and the storing region 40a of the storage server 4 may be implemented by one or more storing regions that one or both of the memory 10b and the storing device 10c include. Furthermore, the information on the hit history table 46a and the FP history table 47a that the storage 40c stores may be stored in one or more storing regions that one or both of the memory 10b and the storing device 10c include.
The storing device 10c may store a program 10g (information processing program) that implements all or part of the functions of the computer 10. For example, the processor 10a of the computing server 2 can implement the function of the storage component 20 illustrated in
The IF device 10d is an example of a communication IF that controls connection to and communication over a network between the computing servers 2, a network between the storage servers 4, and a network between the computing server 2 and the storage server 4, such as the network 3. For example, the IF device 10d may include an adaptor compatible with a Local Area Network (LAN) such as Ethernet (registered trademark), optical communication such as Fibre Channel (FC), or the like. The adaptor may be compatible with one or both of wired and wireless communication schemes. For example, each of the network IF units 20b and 40e illustrated in
The I/O device 10e may include one or both of an input device and an output device. Examples of the input device are a keyboard, a mouse, and a touch screen. Examples of the output device are a monitor, a projector, and a printer.
The reader 10f is an example of a reader that reads information on data and programs recorded on a recording medium 10h. The reader 10f may include a connecting terminal or a device to which the recording medium 10h can be connected or inserted. Examples of the reader 10f include an adapter conforming to, for example, Universal Serial Bus (USB), a drive apparatus that accesses a recording disk, and a card reader that accesses a flash memory such as an SD card. The program 10g may be stored in the recording medium 10h. The reader 10f may read the program 10g from the recording medium 10h and store the read program 10g into the storing device 10c.
An example of the recording medium 10h is a non-transitory computer-readable recording medium such as a magnetic/optical disk and a flash memory. Examples of the magnetic/optical disk include a flexible disk, a Compact Disc (CD), a Digital Versatile Disc (DVD), a Blu-ray disk, and a Holographic Versatile Disc (HVD). Examples of the flash memory include a semiconductor memory such as a USB memory and an SD card.
The HW configuration of the computer 10 described above is merely illustrative. Accordingly, the computer 10 may appropriately undergo increase or decrease of HW (e.g., addition or deletion of arbitrary blocks), division, integration in an arbitrary combination, and addition or deletion of the bus. For example, at least one of the I/O device 10e and the reader 10f may be omitted in one or both of the computing server 2 and the storage server 4.
The technique according to the one embodiment described above can be implemented by changing or modifying as follows.
For example, the blocks 21 to 23 included in the computing server 2 illustrated in
Further, each of the block storage system 1, the computing server 2, and the storage servers 4 may be configured to achieve each processing function by mutual cooperation of multiple devices via a network. For example, each of the multiple functional blocks illustrated in
In one aspect, the one embodiment can reduce the data traffic when data is written into an information processing apparatus.
All examples and conditional language recited herein are intended for the pedagogical purposes of aiding the reader in understanding the invention and the concepts contributed by the inventor to further the art, and are not to be construed as limitations to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although one or more embodiments of the present invention have been described in detail, it should be understood that various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2021-003717 | Jan 2021 | JP | national |