The present invention relates to a computer system. Particularly, the invention relates to a computer system and file system management method suited for use in acquisition of statistic information of file systems.
Recently, storage capacity management and stored file management have become more difficult as the files and content stored in storage systems diversify. If a storage administrator is aware of the tendencies of stored files (for example, the types and sizes of the files, their update dates and times, and whether the number of files is increasing or decreasing), the storage administrator can efficiently manage storage capacity plans and file migration. An efficient method for obtaining the tendency of files stored in file systems, that is, statistic information, is therefore required.
The main types of statistic information used to find the tendency of files stored in the file storage system are as follows. A first type is the size distribution of the files and its chronological variation. A second type is the distribution of file extensions and its chronological variation. A third type is the distribution of the update dates and times and the access dates and times of the files and its chronological variation. A fourth type is a combination of these types, for example, how the access dates and times of files with a certain extension are distributed. The statistic information of files is stored in, for example, a database or a table, and users can refer to the statistic information via a GUI.
An administrative user of a storage system needs to refer to metadata of files and collect them in order to obtain the statistic information. Regarding the conventional technology, there is a method, which is implemented by a statistic information collection program, for collecting metadata of the entire file system as a target, storing the collected information in, for example, a database or a table, and calculating the statistic information. If this method is employed, the statistic information collection program needs to periodically refer to the metadata of all files existing in the file system and it takes a lot of time to collect the metadata, which is not efficient.
Therefore, there is a method executed by a metadata collection system for reporting a change of metadata to a system of a storage user when the metadata is changed, as a method for efficiently collecting the metadata (U.S. Pat. No. 7,890,551). According to this method, the latest metadata is efficiently collected by reporting an event of the metadata change to the user.
Another conventional technique is a system for efficiently collecting any change of metadata from snapshots of the metadata (U.S. Pat. No. 7,693,864). According to this system, snapshots of a file system are periodically obtained and any difference between metadata before the change and metadata after the change can be obtained by comparing the metadata in the past in the snapshots with the present metadata.
PTL 1: U.S. Pat. No. 7,890,551
PTL 2: U.S. Pat. No. 7,693,864
With the first conventional technique, the fact that the metadata has changed can be recognized, but how the metadata has changed cannot. Consequently, the system cannot determine how the changes of the metadata would influence the statistic information table. Therefore, every time the metadata changes, the system needs to refer to the metadata of the entire file system in order to update the statistic information table.
With the second conventional technique, the metadata can be obtained only when snapshots are obtained, so it is difficult to collect the statistic information of the file system in real time.
Therefore, it is an object of the present invention to provide a computer system capable of providing the latest statistics of a file system in real time. Another object of the present invention is to provide a computer system and file system management method capable of updating the statistics of the file system in real time without periodically collecting the statistic information of all files in the file system.
So, the present invention is characterized in that when the file system is accessed, first statistic information before access processing on the file system is compared with second statistic information after the access processing; and if any difference exists between the first statistic information and the second statistic information, statistic results of the file system are updated based on the difference.
According to the present invention, a computer system capable of providing the latest statistics of the file system in real time can be provided. Particularly, it is possible to provide a computer system capable of updating the statistics of the file system in real time without periodically collecting the statistic information of all files in the file system.
Next, an embodiment of the present invention will be explained.
The file storage system 20 includes a management interface 201 and a data interface 202. An interface 113 of the management computer 11 is connected to the management interface 201. The client computer 12 is connected to the data interface 202. The management computer 11 includes an arithmetic unit (CPU) 111, a memory 112 having a management program 1121 of the file storage system 20, and an I/O device 140 such as a screen for an administrative user.
The application program is constructed as a changed metadata collection program 2044 for collecting changed metadata. The metadata is management information of files constituting a file system. The computer system uses the management information of the files as statistic information of the file system. Incidentally, the changed metadata collection program 2044 may be constructed as part of functions of the OS 204 or the file system 2045.
The changed metadata collection program 2044 has a function 20441 as an application program interface (API), a function 20443 collecting metadata, a function 20444 temporarily storing the metadata, and a function 20445 reporting updates to the file(s). Incidentally, for the sake of convenience, the function as a changed metadata collection application program interface (API) will be described as a changed metadata API (20441), the function collecting the metadata will be described as a metadata collection unit 20443, the function temporarily storing the metadata will be described as a temporary metadata storage unit 20444, and the function reporting updates to the files will be described as a file change notification unit 20445.
The memory 204 has a metadata collection target table 80 for specifying the types of metadata, which are targets to be collected, and file systems which are metadata collection targets, as an information management table that is necessary for the changed metadata collection program 2044 to fulfill the above-described functions.
The memory 204 has a statistic information processing program as an application program of a different tier from that of the changed metadata collection program 2044 or as a function of the OS. The statistic information processing program executes various processing for, for example, generating and updating statistics of the file systems. For example, the statistic information processing program has a function setting a policy for collecting the statistic information of the file systems (a statistic information collection policy setting unit 30), a function collecting the statistic information (a statistic information collection unit 2042), a function collecting the statistic information and executing processing on the collected statistic information (the statistic information collection processing unit 2042), and a statistic information table 40.
The statistic results are displayed as shown in
Next, the operation of the computer system according to the present invention will be explained.
Firstly, as shown in
The statistic information collection unit 2042 notifies the changed metadata collection API (20441) of a file system(s) as a target(s) to collect the statistic information, which is set to the statistic information collection policy table 30A, as well as the type of the related metadata corresponding to a pattern of the statistic information to be collected from each file system (S1103). In step S1104, changed metadata collection processing by the changed metadata collection program is executed. The details of this processing will be explained below with reference to the other flowchart.
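The notification in step S1103 can be pictured as deriving, for each target file system, the set of related metadata types from the statistic patterns set in the policy. The following is a minimal sketch under stated assumptions: the pattern names, the metadata type names, and the function `build_collection_targets` are hypothetical and do not appear in the embodiment, and plain dictionaries stand in for the policy table 30A and the metadata collection target table 80.

```python
# Hypothetical mapping from a statistic pattern to the related metadata types
# needed to maintain it. The names are illustrative assumptions.
PATTERN_TO_METADATA = {
    "size_distribution": {"size"},
    "extension_distribution": {"name", "size"},
    "access_time_distribution": {"atime"},
}


def build_collection_targets(policy):
    """Return {file system: set of metadata types to collect}.

    `policy` maps each target file system to the statistic patterns
    requested for it (a stand-in for the policy table).
    """
    targets = {}
    for file_system, patterns in policy.items():
        types = set()
        for pattern in patterns:
            types |= PATTERN_TO_METADATA[pattern]
        targets[file_system] = types
    return targets


# Example policy with two hypothetical file systems.
policy = {
    "/fs1": ["size_distribution"],
    "/fs2": ["extension_distribution", "access_time_distribution"],
}
targets = build_collection_targets(policy)
```

Restricting collection to the metadata types that a pattern actually needs keeps the later before/after comparison limited to fields that can affect the statistics.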
If any change of the related metadata occurs with respect to a file belonging to a file system which is a statistic information collection target, the changed metadata collection API (20441) sends the related metadata before and after the change for that file to the statistic information collection unit 2042.
The statistic information collection unit 2042 compares the related metadata before the change (old metadata) with the related metadata after the change (new metadata) and extracts the information necessary to update the statistic information table (S1106). Then, the statistic information collection unit updates the statistic information table based on the extracted information (S1107). For example, if data is written to a certain file and its file size is changed (size before the change: 5 KB; and size after the change: 15 KB), the statistic information collection unit subtracts one from the file count, which is 2340, for the file size range (1 KB<file size<10 KB) belonging to the record [present] in the statistic information table.
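The size-range update in steps S1106 and S1107 can be sketched as follows. This is a minimal illustration under stated assumptions: only the 1 KB to 10 KB range and its count of 2340 come from the example above. The other ranges, their counts, the treatment of exact boundary values, and the increment of the range containing the new size are assumptions made for the sketch.

```python
from bisect import bisect_right

KB = 1024
# Assumed range boundaries in bytes; the embodiment only names 1 KB-10 KB.
BUCKET_EDGES = [1 * KB, 10 * KB, 100 * KB]
BUCKET_LABELS = ["<1KB", "1KB-10KB", "10KB-100KB", ">=100KB"]


def bucket_of(size_bytes):
    """Return the label of the size range containing size_bytes.

    A size exactly on an edge falls into the higher range (an assumption).
    """
    return BUCKET_LABELS[bisect_right(BUCKET_EDGES, size_bytes)]


def apply_size_change(counts, old_size, new_size):
    """Move one file from its old size range to its new one, in place."""
    old_bucket = bucket_of(old_size)
    new_bucket = bucket_of(new_size)
    if old_bucket != new_bucket:
        counts[old_bucket] -= 1   # file leaves its previous range
        counts[new_bucket] += 1   # assumed: file is counted in its new range
    return counts


# Counts other than 2340 are hypothetical placeholders.
counts = {"<1KB": 120, "1KB-10KB": 2340, "10KB-100KB": 815, ">=100KB": 42}
apply_size_change(counts, old_size=5 * KB, new_size=15 * KB)
```

Only two counters are touched per change, which is why the old metadata is needed: without the size before the change, the range to decrement could not be identified without rescanning the file system.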
Furthermore, if the change of the related metadata results from the creation of a new file, the statistic information collection unit 2042 identifies the extension of the newly created file, adds one to the file count in the record for that extension, and adds the file size to the total capacity of that record.
Furthermore, if the related metadata is the last file access time, the statistic information collection unit 2042 refers to the last file access time before the change and updates the statistic information table 40.
In step S1108, the statistic information collection unit checks whether or not an instruction to halt the statistic information collection has been issued from the administrative user; and if an affirmative judgment is returned, the instruction to halt the metadata collection is issued to the changed metadata collection API (S1109). If a negative judgment is returned, the processing returns to step S1104.
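The per-extension update for a newly created file can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the record layout (`count`, `total_bytes`), the function name `record_new_file`, the lowercasing of extensions, and the `"(none)"` key for files without an extension are all assumptions.

```python
import os


def record_new_file(ext_stats, filename, size_bytes):
    """Add one newly created file to the record for its extension."""
    # os.path.splitext returns ("report", ".TXT") for "report.TXT".
    ext = os.path.splitext(filename)[1].lower() or "(none)"
    record = ext_stats.setdefault(ext, {"count": 0, "total_bytes": 0})
    record["count"] += 1                 # one more file with this extension
    record["total_bytes"] += size_bytes  # grow the total capacity
    return record


# Hypothetical starting table with a single extension record.
ext_stats = {".txt": {"count": 10, "total_bytes": 5000}}
record_new_file(ext_stats, "report.TXT", 300)
record_new_file(ext_stats, "README", 50)
```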
Next, processing for collecting the changed metadata will be explained.
The metadata collection unit 20443 refers to the metadata collection target table 80 and sets the file change notification unit 20445 so that the statistic information collection target file system will be a monitor target (S1202). Next, the file change notification unit 20445 starts monitoring the statistic information collection target file system (S1203). This monitoring is to check whether or not access from the client computer 12 is made to a file belonging to the monitor target file system.
Then, the metadata collection unit 20443 checks, via the changed metadata collection API (20441), whether an instruction to halt the collection has been issued or not (S1204, S1109). If the halt instruction has been issued, the metadata collection unit 20443 terminates the processing of the flowchart. If the halt instruction has not been issued, the file change notification unit 20445 checks the file system storage information area of the memory and checks whether or not the monitor target file system exists in the file storage system, in step S1205. If the file system does not exist, the file change notification unit 20445 terminates the processing of the flowchart.
If a negative judgment is returned in step S1204 and an affirmative judgment is returned in step S1205, the processing of the flowchart proceeds to step S1206.
If the file change notification unit 20445 returns an affirmative judgment in this step, the file change notification unit 20445 issues a command to the OS to have the file system temporarily stop writing/reading data to/from the file and notifies the metadata collection unit 20443 that a request to write data to or read data from a file in the monitor target file system has occurred (S1207). If the file change notification unit 20445 returns a negative judgment in step S1206, the processing returns to S1204.
After receiving the notification from the file change notification unit 20445, the metadata collection unit 20443 refers to the metadata collection target table 80 and identifies the type of the collection target metadata that is set for the file system to which the write/read request target file belongs. Then, the metadata collection unit 20443 reads the identified type of the related metadata from the file and saves and records it to a temporary storage area of the temporary metadata storage unit 20444 (S1208). Furthermore, the metadata collection unit 20443 issues a command to the OS to cancel the file writing/reading halt. The related metadata recorded in the temporary metadata storage area is metadata before data was written to or read from the file according to the write/read command.
The file change notification unit 20445 judges whether writing/reading of data to/from the file for which the write/read request was made has been completed or not (S1209). If writing/reading of data to/from the file has been completed, the file change notification unit 20445 notifies the metadata collection unit 20443 that writing/reading of data to/from the file has been completed (S1210). To make this judgment, the file change notification unit 20445 monitors the write/read completion response that the OS, which controls commands from the client computer 12, returns to the client computer 12.
After receiving the notification, the metadata collection unit 20443 reads updated metadata from the file, to or from which data was written or read, reads old metadata, which has been saved to the temporary metadata storage area, and then compares the updated metadata with the old metadata (S1211).
Subsequently, the metadata collection unit 20443 refers to the metadata collection target table 80 and judges whether any difference exists between the updated metadata and the old metadata with respect to the metadata of the type defined as the metadata collection target (S1212). If any difference exists, the metadata collection unit 20443 outputs the updated metadata and the old metadata via the changed metadata collection API (20441) to the statistic information collection unit 2042 (S1213). If a negative judgment is returned in this step, the processing returns to S1204.
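Steps S1208 through S1213 can be sketched as a single function that saves the old metadata, lets the access proceed, reads the updated metadata, and reports both only when a collection target type differs. This is a minimal sketch under stated assumptions: in the embodiment the file change notification unit intercepts the client request and the OS halts and resumes the access, whereas here the access is simply passed in as a callable. The function names and the metadata type names are hypothetical, and `os.stat` stands in for reading the metadata from the file system.

```python
import os
import tempfile


def read_metadata(path, types):
    """Read only the metadata types that are collection targets."""
    st = os.stat(path)
    available = {"size": st.st_size, "mtime": st.st_mtime_ns, "atime": st.st_atime_ns}
    return {t: available[t] for t in types}


def collect_changed_metadata(path, target_types, access):
    """Return (old, new) metadata if a target type changed, else None."""
    old = read_metadata(path, target_types)  # saved before the access (S1208)
    access(path)                             # the client's write/read proceeds
    new = read_metadata(path, target_types)  # read after completion (S1211)
    if old != new:                           # difference in a target type (S1212)
        return old, new                      # passed on to the collector (S1213)
    return None


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "sample.txt")
    with open(path, "wb") as f:
        f.write(b"x" * 5)

    def append_ten_bytes(p):
        with open(p, "ab") as f:
            f.write(b"y" * 10)

    def read_only(p):
        with open(p, "rb") as f:
            f.read()

    changed = collect_changed_metadata(path, ("size",), append_ten_bytes)
    unchanged = collect_changed_metadata(path, ("size",), read_only)
```

Because only the size is a collection target in this usage, the read-only access produces no report even though the access time may have changed.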
As a result of the above-described processing, the computer system constantly monitors access to the file system and, whenever the statistic information about the file system differs before and after an access, updates the statistic results of the file system according to the difference. Therefore, it is possible to always maintain the statistics of the file system in the latest state. Because each update refers only to the metadata of the accessed file rather than to the metadata of all files in the file system, the load imposed by the updates on the computer system is small.
In the aforementioned embodiment, the changed metadata collection program and the statistic information processing program are located in the file storage system, but they may be located in the management computer. Furthermore, the statistic information table 40 is used to record the statistic results relating to one type of metadata, but it may be used to record the statistic results relating to a plurality of types of metadata, for example, in a case where distribution of last file access time is recorded with respect to one extension.
Furthermore, in the aforementioned embodiment, the difference in the metadata before and after the file access is checked by the changed metadata collection program; however, this check may be performed by the statistic information collection unit. Furthermore, the statistic information table is updated when a file is accessed; however, this update may be performed at certain time intervals, for example, every several hours.
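The variation in which the statistic information table is updated at certain time intervals can be sketched by accumulating the per-access differences and applying them in one pass. This is a minimal sketch under stated assumptions: the class `DeferredStats`, its methods, and the range labels are hypothetical, and the timer that would call `flush` every several hours is omitted.

```python
from collections import Counter


class DeferredStats:
    """Accumulate count differences and apply them to the table on flush."""

    def __init__(self, table):
        self.table = table        # stand-in for the statistic information table
        self.pending = Counter()  # net change per range since the last flush

    def record(self, old_bucket, new_bucket):
        """Note that one file moved from old_bucket to new_bucket."""
        if old_bucket != new_bucket:
            self.pending[old_bucket] -= 1
            self.pending[new_bucket] += 1

    def flush(self):
        """Apply all accumulated differences, then clear them."""
        for bucket, delta in self.pending.items():
            self.table[bucket] = self.table.get(bucket, 0) + delta
        self.pending.clear()


table = {"1KB-10KB": 10, "10KB-100KB": 5}
stats = DeferredStats(table)
stats.record("1KB-10KB", "10KB-100KB")
stats.record("1KB-10KB", "10KB-100KB")
stats.record("10KB-100KB", "1KB-10KB")
snapshot_before_flush = dict(table)
stats.flush()
```

The trade-off is that the table lags behind the file system between flushes, in exchange for fewer table writes when many accesses cancel each other out.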
11 Management computer
12 Client computer
20 File storage system
70 File
203 Arithmetic unit (controller)
205 Disk array system
2044 Changed metadata collection program
2045 File system
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2011/006744 | 12/1/2011 | WO | 00 | 12/12/2011 |