The present invention relates to a file management method.
Recently, a technology called big data analysis, which creates new value by analyzing enormous amounts of data related to social infrastructure such as social networking services, banking, medical treatment and traffic, has been put into practice.
In big data analysis, both the input data collected from social infrastructure and the output data resulting from the analysis are very large in volume and continue to increase over time.
This problem is particularly pronounced when big data is analyzed utilizing a cloud service. A cloud service often charges for computing resources based upon the performance of a computer and usage time, and for storage resources based upon data capacity and a recording period. Therefore, as data capacity increases, the usage charge related to storage resources comes to dominate the usage charge related to computing resources in the total cost. As a result, when big data analysis is performed utilizing a cloud service, the cost of utilizing the cloud service is enormous.
One method of enhancing the usage efficiency of storage resources is the technique disclosed in Patent Literature 1, which allows resources allocated to users to be smoothly provided (sold) and utilized (bought) among the users. With this technique, unused resources of users are sold and bought, and the usage efficiency of the unused resources can be enhanced.
In addition, because collecting input data is also costly, the reliability of storage resources is important.
Patent Literature 1: Japanese Patent Application Laid-Open No. 2011-154532
The technique disclosed in Patent Literature 1 enables resources possessed by users to be provided and utilized among the users. In this case, a user who receives a resource acquires the ownership of the provided resource. Therefore, even if the user who provided the resource desires to recover it, the resource cannot be recovered because another user holds its ownership.
Accordingly, an object of the present invention is to enable an accommodated resource to be recovered while enabling an unused area to be accommodated.
A file management method according to the present invention is a file management method of storing a file from a client in a storage device in a state where the file is made redundant by a certain redundant number, and is characterized by including a first step of accepting an additional file from the client to the storage device, a second step of comparing the capacity of the additional file with the unused physical capacity of the storage device, and a third step of, when the capacity of the additional file is larger than the unused physical capacity, changing the redundant number of an already stored file to increase the unused physical capacity and storing the additional file in the storage device.
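As a rough illustration of the three steps above, the following Python sketch models a storage device with a fixed physical capacity. The class names, the default redundant numbers, and the rule of reducing the largest stored file first are assumptions made only for this example and are not taken from the method itself.

```python
from dataclasses import dataclass, field


@dataclass
class StoredFile:
    name: str
    size: int        # capacity of one copy
    redundancy: int  # redundant number (copies currently stored)


@dataclass
class Storage:
    physical_capacity: int
    files: list = field(default_factory=list)

    def unused(self) -> int:
        used = sum(f.size * f.redundancy for f in self.files)
        return self.physical_capacity - used


def store_additional_file(storage, name, size, redundancy=3, min_redundancy=1):
    # First step: accept the additional file and compute the capacity it needs.
    needed = size * redundancy
    # Second step: compare that capacity with the unused physical capacity.
    while needed > storage.unused():
        # Third step: lower the redundant number of an already stored file
        # (assumed policy: largest file first) to free physical capacity.
        candidates = [f for f in storage.files if f.redundancy > min_redundancy]
        if not candidates:
            raise RuntimeError("capacity is short even at the minimum redundant number")
        victim = max(candidates, key=lambda f: f.size)
        victim.redundancy -= 1
    storage.files.append(StoredFile(name, size, redundancy))
```

In this sketch a failed write may leave some redundant numbers already lowered; a fuller implementation would presumably check feasibility before changing anything.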
In addition, the file management method according to the present invention is a file management method of allocating physical allocation capacity of a storage device to plural users, enabling the allocated physical allocation capacity to be lent and borrowed, and storing a file from a client of a user in the storage device in a state where the file is made redundant by a certain redundant number, and is characterized by including a step of accepting an additional file from the client to the storage device, a step of adding the capacity of the additional file to the capacity of all the already stored files of a first user who possesses the additional file and comparing the added capacity with capacity acquired by subtracting lent capacity from the physical allocation capacity allocated to the first user, and a step of, when the added capacity is larger, changing the redundant number of an already stored file of a user at the destination of lending and storing the additional file in the capacity made free by changing the redundant number.
According to the present invention, the recovery of a provided area is enabled by changing a redundant number.
Embodiments in which the capacity of a storage is accommodated (lent and borrowed) among users will be described below, using as an example a system that stores a file in a storage via a computer (a gateway) with the file made redundant so as to protect it. The accommodation is outlined as follows. (1) Unused capacity within the capacity allocated to a user can be lent to another user who has a relation of accommodation with that user. (2) To enable recovery even if the lent capacity is being utilized by the other user, the data redundancy of the borrowing user is reduced so that the capacity to be recovered can be secured. (3) All of a user's data, when held at minimum redundancy, is kept within the capacity allocated to that user, so that the capacity to be recovered can be secured without deleting the user's own data.
In a first embodiment, a system that stores a file in three physical volumes in a storage device in a state where the file is made redundant will be described as an example. In this example, files having the same contents are stored in a number equal to the redundant number, and unused capacity is automatically distributed within a group having a relation of accommodation. In a second embodiment, an example in which unused capacity is lent to and recovered from an individual user, in place of the distribution within the group in the first embodiment, will be described. In a third embodiment, a system that stores a file in three cloud storages will be described as an example. In the first embodiment, the redundant number of the file is used as an index, while in the third embodiment, availability information provided by a Service Level Agreement (hereinafter called SLA) is used as the index of redundancy.
Referring to
This embodiment also includes (1) processing for storing a file by reducing redundancy within a group and securing an unused physical allocation area when an initial logical allocation area is exceeded and (2) processing for supplementing redundancy by borrowing an unused physical allocation area from another user when the redundancy has been reduced.
In No. 4, a case where the user B stores data for two blocks of the physical volume is shown. The user B has only 4.5 blocks of the physical volumes because the user B has lent 4.5 blocks to the user A, and therefore the user B recovers 1.5 blocks of the physical volumes from the user A. At this time, since the user A returns 1.5 blocks of the physical volumes, some of the blocks are duplicated instead of triplicated. In No. 5, the user B similarly recovers 3 blocks from the user A to store data for three blocks of physical volumes. Since the user B stores data within the initially allocated capacity, the data is all triplicated. Meanwhile, since the user A stores data in excess of the initially allocated capacity and has no area to borrow, the data of the user A is stored in duplicate. As described above, in this embodiment, a function for lending and recovering capacity between users is provided.
A gateway 100 is a computer that provides file storage service to a client 300. To do so, the gateway 100 transfers a file between the client 300 and the storage device 400. A CPU 110 executes a processor (a program) stored in a memory 140. The memory 140 stores processors (programs) and tables for the file storage service. In the memory 140, a data manager 141, a redundant number calculator 142 and a capacity recoverer 145 are stored. In the memory 140, a user management table 500 and a file management table 600 are also stored. Further, the memory 140 has an area such as a work area required for executing each processor. A volume 120 stores a stub file 121. The stub file 121 holds a file ID for every user. An interface (I/F) 130 transmits/receives a file to/from the client 300 and the storage device 400. The interface 130 also transmits/receives management information to/from a management terminal 200 and the client 300. Note that each processor need not be a program executed by the CPU 110 and may instead be independent hardware that performs the same operation as when the CPU 110 executes the program.
The management terminal 200 is a terminal for managing the gateway 100 and can acquire management information in the gateway 100 and the storage device 400 if necessary. The management terminal 200 is a computer provided with an interface (I/F) 230 for connecting to a network and an operational screen, a memory 240 and an internal communication line for connecting them. The memory 240 stores a processor (a program) and data. The processor is a gateway manager 241, for example. The operational screen 250 inputs and outputs the management information of the gateway 100 via the management terminal 200.
The client 300 is a computer used by a user who utilizes the file storage service provided by the gateway 100 and is provided with an interface (I/F) 330 for connecting to the network and the operational screen, a memory 340 and an internal communication line for connecting them. An operational screen 350 inputs and outputs the management information of the gateway 100 via the client 300.
The storage device 400 provides file storage service (writing, reading, updating, deletion and the like) in response to an instruction from the gateway 100. For this purpose, the storage device 400 has one or more volumes 401 for storing a file. A file ID for identifying a file is used for writing/reading the file. The gateway 100 allocates a unique file ID to each file.
A case where a file is written from the client 300 will be described below. The client 300 transmits a file to the gateway 100. The gateway 100 allocates a unique file ID to the received file and transmits the file to the storage device 400. The gateway 100 holds, for every user, the correlation between the file ID and path information showing the location in which the file is stored as stub information. Because the file is stored in the storage device in this way, the client 300 has only to refer to the stub information when reading the file.
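The write path described above can be sketched in Python as follows. This is only a minimal model: the `Gateway` class, the use of in-memory dictionaries for the storage device and the stub information, and the use of `uuid` to generate file IDs are assumptions for illustration.

```python
import uuid


class Gateway:
    """Toy model of a gateway that keeps stub information per user."""

    def __init__(self):
        self.storage = {}  # file ID -> file contents (stands in for the storage device)
        self.stub = {}     # (user ID, path) -> file ID (stands in for stub information)

    def write(self, user_id, path, data):
        file_id = uuid.uuid4().hex           # allocate a file ID to the received file
        self.storage[file_id] = data         # transmit the file to the storage device
        self.stub[(user_id, path)] = file_id # record the correlation of file ID and path
        return file_id

    def read(self, user_id, path):
        # Reading only needs the stub information to locate the file.
        return self.storage[self.stub[(user_id, path)]]
```

For example, after `g = Gateway()` and `g.write("userA", "/docs/report.txt", b"data")`, a call to `g.read("userA", "/docs/report.txt")` returns the stored bytes.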
Total file size 506 shows the total size of files excluding redundant copies. Physical used capacity in upper limit redundancy 507 shows the capacity that would be used if all files that a user possesses were held at the set upper limit redundant number. When this capacity exceeds the physical allocation capacity, capacity must be accommodated from another user or the redundant number of some files must be reduced so as to stay within the physical allocation capacity. Physical used capacity in lower limit redundancy 508 shows the capacity that would be used if all files that a user possesses were held at the set lower limit redundant number. These values function as limit values on the physical used capacity of a user, because all files that a user possesses are required to be held within the physical allocation capacity when no capacity can be borrowed from another user. An initial upper limit redundant number 509 and an initial lower limit redundant number 510 show the initial upper limit redundant number and the initial lower limit redundant number, respectively, that are set for an added file.
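The relationship between these items can be illustrated with a short Python sketch. The dictionary keys and the function name are hypothetical; only the arithmetic (file size multiplied by the upper or lower limit redundant number, summed over a user's files) follows the description above.

```python
def capacity_summary(files, physical_allocation):
    """Compute per-user totals from a list of {'size', 'upper', 'lower'} records."""
    total_size = sum(f["size"] for f in files)                # excludes redundant copies
    used_upper = sum(f["size"] * f["upper"] for f in files)   # all files at upper limit
    used_lower = sum(f["size"] * f["lower"] for f in files)   # all files at lower limit
    return {
        "total_file_size": total_size,
        "used_in_upper_limit_redundancy": used_upper,
        "used_in_lower_limit_redundancy": used_lower,
        # Above the allocation: borrow capacity or reduce some redundant numbers.
        "needs_accommodation_or_reduction": used_upper > physical_allocation,
        # Must hold when nothing can be borrowed from another user.
        "fits_without_borrowing": used_lower <= physical_allocation,
    }
```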
Whether a file is initially compressed or not 511 shows whether a file is automatically compressed when the file is written. Redundant number reduction priority 512 shows priority information for determining the sorting order used when a redundant number is reduced. For example, for the user A, the sorting keys are weighted in the order of priority information, size, access date and creation date. Size [large] shows that the redundant number is reduced first for larger files. An accommodation group 513 shows that accommodable (unused) capacity that a user possesses can be provided to a user who belongs to the same group. Total accommodated capacity in a belonging group 514 shows a value acquired by totaling the accommodable (unused) capacity of all users who belong to the same group. Used accommodated capacity in the belonging group 515 shows capacity that a user utilizing more than the physical allocation capacity borrows from the unused capacity of users who belong to the same group. Unused accommodated capacity in the belonging group 516 shows the part of the total accommodated capacity in the belonging group 514 that is not used by another user. Lent/borrowed capacity 517 is used in a second embodiment, not in the first embodiment, and will be described later.
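One way to picture the redundant number reduction priority is as a multi-key sort, sketched below in Python. The record fields are hypothetical, and the sort directions for priority information and dates (lower priority value first, older dates first) are assumptions; only the key order and "larger size first" come from the description above.

```python
from datetime import date


def reduction_order(files):
    """Return files in the order in which their redundant numbers would be reduced."""
    return sorted(
        files,
        key=lambda f: (
            f["priority"],   # priority information (assumed: lower value reduced first)
            -f["size"],      # Size [large]: larger files are reduced first
            f["accessed"],   # access date (assumed: least recently accessed first)
            f["created"],    # creation date (assumed: oldest first)
        ),
    )
```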
First, the data manager 141 (that is, the CPU 110 executing the program called the data manager 141) acquires a request for writing a file from the client 300 and the information of the file. In this case, the user ID shall be the user A and the file shall be an additional file (S10). The data manager 141 verifies whether the user A can store the additional file. Supposing that the existing files and the additional file are set to the lower limit redundant number, the data manager judges whether the resulting used capacity exceeds the value of the physical allocation capacity 503 of the user A. When the used capacity does not exceed the value, processing proceeds to a step S12, and when the used capacity exceeds the value, the processing proceeds to a step S17 (S11). The judgment of whether the used capacity exceeds the value is not limited to a mathematically strict comparison and also covers judging whether the existing files and the additional file can be stored in terms of capacity while allowing for errors and the like in numeric representation in a computer. The same applies to judgments in the following description.
The data manager 141 initializes a file referring to the user management table 500 (S12). Next, the data manager 141 sets a file redundant number of the user A and a user who belongs to the same group using the redundant number calculator 142 (S13). The data manager writes the file in the storage device based upon a changed redundant number of the file (S14) and updates the file management table and the user management table (S15). When the redundant number of the file is changed, the user is notified of it (S16). When the current used capacity exceeds the value of the physical allocation capacity 503 even if the existing file and the additional file are set to the lower limit redundant number, the capacity is short and writing fails (S17).
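The capacity check in S11, including the tolerance for numeric representation errors, might look like the following Python sketch. The function name, the parameters and the tolerance value are assumptions for illustration.

```python
import math


def can_accept(used_lower, add_size, add_lower_redundancy, physical_allocation,
               rel_tol=1e-9):
    """Return True when writing may proceed (S12), False when it fails (S17)."""
    # Capacity needed if existing files and the additional file are all held
    # at the lower limit redundant number.
    required = used_lower + add_size * add_lower_redundancy
    # "Does not exceed" is judged with a small tolerance rather than by a
    # mathematically strict comparison.
    return required < physical_allocation or math.isclose(
        required, physical_allocation, rel_tol=rel_tol
    )
```

With this check, `can_accept(0.1, 0.2, 1, 0.3)` is accepted even though `0.1 + 0.2` is slightly larger than `0.3` in floating-point arithmetic.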
When the additional file cannot be stored at a value of the upper limit redundant number, states are compared to recover capacity or to distribute accommodated capacity. In the case of recovering capacity, the processing proceeds to a step S23 and when accommodated capacity is distributed, the processing proceeds to a step S27 (S22). When it is determined that the processing proceeds to S23, a value of the physical used capacity in upper limit redundancy 507 and the value of the physical allocation capacity 503 are compared as to each user in the group and a user who exceeds the value of the physical allocation capacity 503 is acquired (S23). Capacity acquired by subtracting the capacity of the additional file set to upper limit redundancy from the total accommodated capacity is capacity before distribution (S24). Distributed capacity is determined based upon excess quantity of each user in excess of capacity and the capacity before distribution (S25). Capacity in which the file can be stored (target capacity) is specified for each user in excess of capacity and a redundant number adjustment process is executed. At this time, as the user A is not included in the redundant number adjustment process, the processing proceeds to S21 (S26).
When it is determined that the processing proceeds to S27 in S22, the similar processing to the processing in S23 is executed (S27) and total accommodated capacity is made the capacity before distribution (S28). Distributed capacity is determined based upon excess quantity of each user in excess of capacity and the capacity before distribution (S29). Capacity in which the file can be stored (target capacity) is specified for each user in excess of capacity and the redundant number adjustment process is executed. At this time, the user A is also included in the redundant number adjustment process (S30).
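The description states only that the distributed capacity is determined from the excess quantity of each user and the capacity before distribution. As one possible reading, the Python sketch below distributes the capacity in proportion to each user's excess; the proportional rule and all names are assumptions, not the determined method.

```python
def distribute(capacity_before_distribution, excess_by_user):
    """Split accommodated capacity among users who exceed their allocation."""
    total_excess = sum(excess_by_user.values())
    if total_excess <= capacity_before_distribution:
        # Enough capacity: every user in excess receives what is needed.
        return dict(excess_by_user)
    # Not enough capacity: share it in proportion to each user's excess (assumed rule).
    return {
        user: capacity_before_distribution * excess / total_excess
        for user, excess in excess_by_user.items()
    }
```

Each user's target capacity for the redundant number adjustment process would then be the user's physical allocation capacity plus the share returned here.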
As the redundant number of the file is not required to be changed when all the files of all the users in the group are stored at a value of the upper limit redundant number, file deletion setting is made and the process is finished (S52, S57, S15, S16). When all the files of all the users in the group are not stored at the value of the upper limit redundant number, an accommodated unused area newly caused is redistributed (S53 to S56). As processing in the steps S53, S55 and S56 is similar to the processing in S27, S29, S30 in
As described above, the redundancy of a file can be enhanced by accommodating unused capacity of each user in the belonging group and a user who uses excess capacity by borrowing can recover the capacity by reducing the redundancy of a file.
Referring to
Since files that a user possesses cannot be maintained by only capacity allocated to the user when the capacity exceeds the value of the physical allocation capacity 503, the additional file cannot be written because of the shortage of capacity (S62). When the abovementioned total capacity does not exceed the value of the physical allocation capacity 503, it is judged whether the capacity acquired by totaling the capacity in lower limit redundancy of the additional file and the value of the physical used capacity in lower limit redundancy 508 exceeds capacity acquired by subtracting capacity lent to another user from the value of the own physical allocation capacity 503 or the total (capacity after accommodation) of the value of the own physical allocation capacity 503 and capacity borrowed from another user. When the total capacity exceeds the capacity after accommodation, the processing proceeds to a step S64 and when the total capacity does not exceed the capacity after accommodation, the processing proceeds to a step S67 (S63).
When the total of the capacity in lower limit redundancy of the additional file and the value of the physical used capacity in lower limit redundancy 508 does not exceed the value of the physical allocation capacity 503 but exceeds the capacity after accommodation, storage capacity is short because the user has lent capacity to another user. Therefore, to recover the capacity, recovered capacity is set for the user to whom the capacity was lent (a user at a destination of recovery). At this time, the recovered capacity is required to be set to the total of the capacity in lower limit redundancy of the additional file and the value of the physical used capacity in lower limit redundancy 508 or more (S64). Processing for recovering the set capacity from the user at the destination of recovery is executed. Since a redundant number adjustment process is described in relation to
A case where the total of the capacity in lower limit redundancy of the additional file and the value of the physical used capacity in lower limit redundancy 508 exceeds neither the value of the physical allocation capacity 503 nor the capacity after accommodation will be described below. It is judged whether the total of the capacity in upper limit redundancy of the additional file and the value of the physical used capacity in upper limit redundancy 507 exceeds the capacity after accommodation. When the total capacity exceeds the capacity after accommodation, the processing proceeds to a step S68, and when the total capacity does not exceed the capacity after accommodation, the processing proceeds to a step S69 (S67). When the processing proceeds to S68, the recovered capacity can be set freely, unlike in S64. Whether the redundant numbers of the files of the user requesting writing all become maximum depends upon the size of the recovered capacity (S68). When the processing proceeds to S69, since writable capacity exists even if the redundant number of the additional file is maximum, the redundant number of the additional file is set to the maximum redundant number (S69). In S69, all the existing files may or may not be set to the maximum redundant number.
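The branching among S62, S64, S68 and S69 can be summarized in a small Python sketch. The function and its return labels are hypothetical, and combining lent and borrowed capacity into a single expression for the capacity after accommodation is a simplification that assumes at most one of the two is nonzero for a given user.

```python
def decide_write(add_lower, add_upper, used_lower, used_upper,
                 physical_allocation, lent=0, borrowed=0):
    """Classify a write request in the second embodiment."""
    capacity_after_accommodation = physical_allocation - lent + borrowed
    lower_total = add_lower + used_lower   # everything at lower limit redundancy
    upper_total = add_upper + used_upper   # everything at upper limit redundancy

    if lower_total > physical_allocation:
        return "S62: writing fails (capacity is short)"
    if lower_total > capacity_after_accommodation:
        return "S64: recovery of lent capacity is required"
    if upper_total > capacity_after_accommodation:
        return "S68: recovered capacity can be set freely"
    return "S69: store the additional file at the maximum redundant number"
```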
As described above, when capacity is short in writing the additional file, a user can write the additional file not by reducing the redundant numbers of every user in a belonging group to recover capacity but by reducing the redundant number of a specific user at the destination of recovery to recover capacity. Therefore, the range affected by recovery is small, and recovery may be processed at high speed.
Referring to
In a process using the user management table 1500 and the file management table 1600, the items handled in the process described in the first embodiment have only to be replaced with the corresponding items for the cloud. Availability information of SLA may also be defined by calculating (MTBF/(MTBF+MTTR))*100 or the like. Here, MTBF means mean time between failures and MTTR means mean time to repair.
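The availability formula above can be written directly as a small Python function. The function name and the choice of hours as the unit are assumptions; any unit works as long as both arguments use the same one.

```python
def availability_percent(mtbf, mttr):
    """Availability in percent from mean time between failures and mean time to repair."""
    return mtbf / (mtbf + mttr) * 100
```

For example, an MTBF of 3 hours and an MTTR of 1 hour gives 3 / (3 + 1) * 100 = 75 percent.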
As described above, in the cloud system as well, unused capacity of each user in a belonging group is accommodated so that the availability of a file can be enhanced, and capacity can be recovered by lowering the file availability of a user who uses excess capacity by borrowing. The usage efficiency of storage resources that store big data in the cloud system is thereby enhanced, and reliability can be secured.
100: Gateway, 141: Data manager, 142: Redundant number calculator, 143: Cloud information table, 144: Store combination table, 145: Capacity recoverer, 200: Management terminal, 300: Client, 400: Storage device, 500: User management table, 600: File management table, 900: Cloud storage
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2013/078374 | 10/18/2013 | WO | 00 |