This application claims priority under 35 U.S.C. §119 to Korean Patent Application No. 10-2008-0129631, filed on Dec. 18, 2008, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein by reference in its entirety.
The following disclosure relates to a file storage system, and in particular, to a metadata server cluster and a metadata management method thereof, which distribute metadata for a file to a cluster including a plurality of metadata servers to replicate the metadata.
A storage system, which provides a large amount of shared space by using a plurality of computers that are connected to one another over a network, is configured with a data server storing data and a metadata server storing metadata. Typically, since the amount of relevant metadata is far less than that of stored data, the storage system may be configured with a plurality of data servers and one metadata server. Such a storage system is called an asymmetric storage system.
In an asymmetric storage system, many data servers are required when the amount of data to store increases exponentially, and consequently, the amount of metadata also increases greatly. If a single metadata server is used even in such an environment, a bottleneck may occur because metadata requests from all clients are directed to that one metadata server, and the performance of metadata operations is degraded. Moreover, when a failure occurs in the metadata server, availability is limited because all service is interrupted. Furthermore, if the amount of metadata to store increases exponentially, long-term scalability issues arise.
U.S. Patent Publication No. 20060026219, filed on Jul. 27, 2005 by Orenstein et al. and published on Feb. 2, 2006, discloses a metadata management method for distributing and storing data, such as video, that is not altered. According to the metadata management method, the attributes of metadata are applied to a hash function to classify the entire metadata into a plurality of subsets based on a hash value, and each classified subset of the metadata is referred to as a region. Each region separately maintains its own copy, and the regions are distributed to a plurality of nodes for storage. However, since a hash function is used to determine the node that stores metadata, the method has the limitation that a considerable amount of metadata must be moved whenever a change occurs in the cluster of metadata servers. Consequently, such a metadata management method is intended to be applied to systems for archiving data that will remain unchanged.
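The cost of hash-based placement when a cluster changes can be illustrated with a minimal sketch. It assumes the simplest possible scheme, a hash value reduced modulo the number of nodes, which is an assumption made here for illustration and is not taken from the cited publication; the names node_for and fraction_moved and the sample paths are hypothetical.

```python
import hashlib

def node_for(key: str, num_nodes: int) -> int:
    """Pick a node by hashing the key and reducing modulo the node count."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes

def fraction_moved(keys, old_nodes: int, new_nodes: int) -> float:
    """Fraction of keys whose node assignment changes when the cluster is resized."""
    moved = sum(1 for k in keys if node_for(k, old_nodes) != node_for(k, new_nodes))
    return moved / len(keys)

# Hypothetical metadata keys; any set of distinct strings would do.
keys = [f"/dir{i}/file{i}" for i in range(10_000)]
churn = fraction_moved(keys, old_nodes=4, new_nodes=5)
print(f"{churn:.0%} of metadata entries change nodes when growing from 4 to 5 servers")
```

Under this assumed scheme, a key keeps its node only when its hash gives the same remainder for both node counts, so most entries would be expected to move when a single server is added.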
U.S. Patent Publication No. 20060123024, filed on Dec. 1, 2005 by Sathyanarayan et al. and published on Jun. 8, 2006, discloses a method for providing consistent metadata to a cluster of nodes using a cluster database management system. According to the method for providing consistent metadata, when a cluster of LDAP servers exists, the cluster database management system manages the metadata, and each of the LDAP servers caches a portion of the metadata in its own heap memory. Accordingly, the LDAP servers are configured to respond to requests from clients by using the metadata cached in their heap memories. If a specific client accesses a specific LDAP server and changes the metadata cached in its heap memory, the changed contents of the metadata are reflected in the entire cluster through the use of a shared memory and a specific protocol. However, because specific environmental conditions such as the cluster database management system and the shared memory are assumed to be present, the method is difficult to apply to general environments lacking an LDAP service. Also, slight data inconsistency may occur until the changed content is reflected in all LDAP servers.
In one general aspect, a metadata server in a metadata server cluster including a plurality of metadata servers sharing a directory hierarchy includes: a directory hierarchy storage unit storing all directory hierarchies which are stored in the metadata server cluster; a metadata storage unit storing metadata for a data file; and a search unit searching the directory hierarchies and the metadata.
In another general aspect, a metadata management method in a metadata server cluster including a plurality of metadata servers includes: checking a metadata server storing metadata for a data file on which a client requests an inquiry; providing information of the checked metadata server to the client; searching, by the checked metadata server, the metadata for the requested data file in response to the inquiry request of the client; and providing the searched metadata to the client.
In another general aspect, a metadata management method in a metadata server cluster including a plurality of metadata servers includes: receiving a making request of a directory; selecting a metadata server to make the directory; making a directory in a metadata storage unit of the selected metadata server; and reflecting the directory making to update a directory hierarchy.
In another general aspect, a metadata management method in a metadata server cluster including a plurality of metadata servers includes: receiving a metadata file generation request from a client; checking a metadata server with a directory to which the data file will belong; notifying the client of information on the checked metadata server; and receiving, by the checked metadata server, the metadata file generation request from the client, and generating a metadata file in the metadata server in response to the generation request.
Hereinafter, exemplary embodiments will be described in detail with reference to the accompanying drawings. Throughout the drawings and the detailed description, unless otherwise described, the same drawing reference numerals will be understood to refer to the same elements, features, and structures. The relative size and depiction of these elements may be exaggerated for clarity, illustration, and convenience. The following detailed description is provided to assist the reader in gaining a comprehensive understanding of the methods, apparatuses, and/or systems described herein. Accordingly, various changes, modifications, and equivalents of the methods, apparatuses, and/or systems described herein will be suggested to those of ordinary skill in the art. Also, descriptions of well-known functions and constructions may be omitted for increased clarity and conciseness.
According to an exemplary embodiment, the directory hierarchy is shared by the metadata servers, and the contents of each directory are distributed to and stored in the metadata servers in units of directories. A directory hierarchy master metadata server exists for directory change operations, whereas directory inquiries and input/output operations on files may be distributed to all the metadata servers. Accordingly, a metadata server according to an exemplary embodiment enhances the performance of metadata operations, and also secures availability by maintaining copies of metadata.
In this specification, the term “metadata server cluster” denotes the set of a plurality of metadata servers sharing a directory hierarchy. Moreover, the directory hierarchy means information that represents which metadata server a directory including each file is stored in.
Referring to
The following description will be made in detail with reference to
Data 114 is stored in the data server (DS-1) 300-1 among the data servers 300-1 to 300-N, and metadata 107 linked to the data 114 is stored in the metadata server (M1) 210-1. That is, metadata for a file is placed, in the corresponding metadata server, in the directory that includes the file. A directory hierarchy 106, which is information that represents which metadata server stores the directory including each file, is shared by all the metadata servers of the metadata server cluster 200.
With reference to
When the client 100 intends to inquire about the contents of a file in a directory (/A), it queries one metadata server selected at random from the metadata servers 210-1 to 210-3 of the metadata server cluster 200, in operation S101, in order to find the metadata server (M1) 210-1 where the metadata 107 of the file (/A/file1) is stored.
Since all the metadata servers 210-1 to 210-3 store the same directory hierarchy 106, any metadata server of the metadata server cluster 200 can check that the contents of the directory (/A) are in the metadata server (M1) 210-1, and the queried metadata server notifies the client 100 of the check result in operation S102.
The client 100 requests the contents of the metadata from the metadata server (M1) 210-1 in operation S103, and the metadata server (M1) 210-1 notifies the client 100 that the actual data corresponding to the file (/A/file1) is stored as a chunk-1 114 in the data server 300-1 in operation S104. Accordingly, the client 100 accesses the data server 300-1 in operation S105, and acquires the data in operation S106.
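The sequence of operations S101 to S106 can be sketched as follows. This is a minimal in-memory illustration, not an implementation from this disclosure: the class MetadataServer, the function read_file, the dictionary layouts, and the stored byte string are all assumptions introduced here.

```python
import posixpath
import random

class MetadataServer:
    def __init__(self, name, hierarchy):
        self.name = name
        self.hierarchy = dict(hierarchy)  # every server holds a full copy
        self.metadata = {}                # metadata only for directories stored here

    def locate(self, path):
        """S102: report which metadata server stores the file's directory."""
        return self.hierarchy[posixpath.dirname(path)]

    def get_metadata(self, path):
        """S104: return the metadata kept locally for the file."""
        return self.metadata[path]

def read_file(path, servers, data_servers, rng=random):
    any_server = rng.choice(list(servers.values()))          # S101: random server
    owner = servers[any_server.locate(path)]                 # S102: learn the owner
    meta = owner.get_metadata(path)                          # S103, S104
    return data_servers[meta["data_server"]][meta["chunk"]]  # S105, S106

# Shared directory hierarchy: directory -> metadata server storing it.
hierarchy = {"/A": "M1", "/B": "M2"}
servers = {n: MetadataServer(n, hierarchy) for n in ("M1", "M2", "M3")}
servers["M1"].metadata["/A/file1"] = {"data_server": "DS-1", "chunk": "chunk-1"}
data_servers = {"DS-1": {"chunk-1": b"contents of file1"}}

result = read_file("/A/file1", servers, data_servers)
```

Because every server holds the same hierarchy, the lookup gives the same answer whichever server the client happens to pick first.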
Referring to
The directory hierarchy storage unit 211 stores at least one directory hierarchy. For example, a reference number “106” in
In an embodiment illustrated in
Herein, the metadata of the file (file1) represents position information indicating which of the data servers 300-1 to 300-N stores the data, as well as all the attributes of the file (file1), for example, the size of the file and its access permissions. Such contents are recorded as one file.
The metadata storage unit 212 stores the metadata. In
The directory hierarchy duplication unit 213 replicates a directory hierarchy from one metadata server to another metadata server, and relevant description will be made below.
A metadata server according to an exemplary embodiment, as illustrated in
That is, all the metadata servers 210-1 to 210-3 of the metadata server cluster 200 share the same directory hierarchy. For example, as illustrated in
In an exemplary embodiment, the directory hierarchy is replicated from a specific metadata server (i.e., a directory hierarchy master metadata server storing the master copy of the directory hierarchy) to the other metadata servers in the metadata server cluster 200. In this case, the directory hierarchy master metadata server includes the directory hierarchy duplication unit 213 to replicate the directory hierarchy to the other metadata servers. Accordingly, metadata operations such as directory making (mkdir) and directory removal (rmdir) are transferred only to the directory hierarchy master metadata server in the metadata server cluster 200.
In another exemplary embodiment, an arbitrary metadata server of the cluster 200 may be configured to replicate a directory hierarchy to the other metadata servers. In this case, the directory hierarchy duplication unit 213 is included in each metadata server.
The selection of configuration is easily made according to the size and operation characteristics of the cluster 200.
In the following description of directory making and file generation, it is assumed that a specific metadata server serves as the directory hierarchy master metadata server. However, even when exemplary embodiments are configured so that an arbitrary metadata server changes and replicates a directory hierarchy without a specific master metadata server in the cluster 200, it is apparent that one skilled in the art could easily embody such exemplary embodiments through changes and modifications, within their spirit and scope, from the following description.
Even though all the metadata servers of the cluster 200 share the directory hierarchy according to an exemplary embodiment, the number of directories among the information to be stored in the metadata server cluster 200 is far smaller than the amount of data, and thus sharing the directory hierarchy is unlikely to increase the load at the system level.
The metadata duplication unit 214 duplicates a directory, which is stored in a specific metadata server, in another metadata server for enhancing the availability of the metadata.
For example, the directory hierarchies 301, 311 and 321 in
In
In the duplication of the metadata, the cluster 200 selects metadata for duplicating and one or more metadata servers for storing a copy to duplicate the selected metadata in the selected one or more metadata servers, and thereafter reflects information (about which metadata server the master copy and copy of the metadata are stored in) in a directory hierarchy like the directory hierarchies 301, 311 and 321 in
Such a process is performed in the metadata duplication unit 214, and it is apparent that an exemplary embodiment may have a configuration where an arbitrary metadata server in the cluster 200 leads the process or the directory hierarchy master metadata server leads the process.
In this way, the same metadata may be duplicated in the plurality of metadata servers. In this case, the number of copies of the metadata may be adjusted in the environment settings.
The number of copies of the metadata may be designated as “0”, but it is designated as “1” to secure availability.
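The duplication procedure described above, in which target servers are selected, the metadata is copied, and the locations of the master copy and the copies are recorded in the directory hierarchy, might look like the following sketch. The function duplicate_directory, the dictionary layout, and the policy of preferring the server that stores the fewest directories are assumptions made for illustration only.

```python
def duplicate_directory(directory, hierarchy, stores, copies=1):
    """Copy a directory's metadata to `copies` other servers and record where."""
    entry = hierarchy[directory]
    master = entry["master"]
    candidates = [s for s in sorted(stores)
                  if s != master and s not in entry["copies"]]
    # Hypothetical policy: prefer servers currently storing the fewest directories.
    candidates.sort(key=lambda s: len(stores[s]))
    for target in candidates[:copies]:
        stores[target][directory] = dict(stores[master][directory])
        entry["copies"].append(target)  # reflect the copy in the hierarchy
    return entry

# Directory hierarchy entry: where the master copy and the copies are stored.
hierarchy = {"/A": {"master": "M1", "copies": []}}
# Per-server metadata storage: directory -> {file name: data server}.
stores = {"M1": {"/A": {"file1": "DS-1"}}, "M2": {}, "M3": {}}

entry = duplicate_directory("/A", hierarchy, stores, copies=1)
```

Passing copies=0 leaves the directory with only its master copy, which corresponds to the setting in which no duplication is performed.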
The search unit 215 queries the directory hierarchy storage unit 211 to find the metadata server where a queried directory is located, and queries the metadata storage unit 212 to find the queried metadata stored in the metadata server.
The recovery unit 216 recognizes a metadata server where a failure occurs among the metadata servers 210-1 to 210-3 constituting the metadata server cluster 200, and increases the duplication number of the metadata stored in the recognized metadata server, thereby enabling the metadata service to be provided without interruption by the remaining metadata servers alone.
For example, if a total of four metadata servers exist and the same metadata are stored in two of the four metadata servers, the duplication number of the metadata is “2”. When a failure occurs in one of the two metadata servers, the client 100 receives the corresponding metadata from the remaining metadata server. The available metadata of the remaining metadata server is then additionally copied to other metadata servers, so that the duplication number of the metadata is increased to prepare against the occurrence of an additional failure. The recovery unit 216 may be configured to perform the duplication of the metadata by itself, or to notify the metadata duplication unit 214 so that the metadata duplication unit 214 performs the duplication. The control unit 217 controls the overall functions of the elements of the metadata servers 210-1 to 210-3.
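The four-server example above can be sketched as follows. The function recover, the list-of-holders layout of the hierarchy, and the choice of the first spare server as the copy target are hypothetical and serve only to illustrate restoring the duplication number after a failure.

```python
def recover(failed, hierarchy, stores, alive, target_copies=2):
    """Re-replicate every directory that lost a replica on the failed server."""
    for directory, holders in hierarchy.items():
        if failed not in holders:
            continue
        holders.remove(failed)
        if not holders:
            continue  # no surviving replica to copy from
        source = holders[0]
        spare = [s for s in alive if s not in holders]
        while len(holders) < target_copies and spare:
            target = spare.pop(0)  # hypothetical policy: first spare server
            stores[target][directory] = dict(stores[source][directory])
            holders.append(target)
    return hierarchy

# Four metadata servers; directory /A is stored on two of them.
hierarchy = {"/A": ["M1", "M2"]}
stores = {
    "M1": {"/A": {"file1": "DS-1"}},
    "M2": {"/A": {"file1": "DS-1"}},
    "M3": {},
    "M4": {},
}

# M1 fails; the remaining servers restore the duplication number to 2.
recover("M1", hierarchy, stores, alive=["M2", "M3", "M4"])
```

In this sketch the surviving replica on M2 continues to serve requests while a new copy is placed on another server.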
The metadata duplication unit 214 and the recovery unit 216 may be disposed only in the specific metadata server (for example, the directory hierarchy master metadata server) of the cluster 200, and may also be disposed in all the metadata servers or some metadata servers in the cluster 200.
The following description will be made in detail on processes about the making of a directory, the generation of a file and the inquiry of the metadata of the file.
Hereinafter, a process of making a directory (/C) will be described with reference to
The client 100 transfers the making request of the directory (/C) to the directory hierarchy master metadata server (M1) 210-1 among the metadata servers (M1, M2 and M3) 210-1 to 210-3 of the metadata server cluster 200 in operation S710.
The directory hierarchy master metadata server (M1) 210-1 selects a metadata server to store the new directory (/C) from the metadata servers 210-1 to 210-3 in operation S720.
When the metadata server (M3) 210-3 is selected, as illustrated in
Subsequently, the directory hierarchy master metadata server (M1) 210-1 adds information C(M3) in its own directory hierarchy to update the directory hierarchy in operation S740. Afterwards, the directory hierarchy master metadata server (M1) 210-1 replicates information on the updated directory hierarchy 301 of
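The directory making process described above can be sketched as follows. The class MetadataServer, the function make_directory, and the selection policy (the server storing the fewest directories) are assumptions introduced for illustration; this description does not specify how the master selects a server.

```python
class MetadataServer:
    def __init__(self, name):
        self.name = name
        self.hierarchy = {}    # directory -> name of the server storing it
        self.directories = {}  # directory -> {file name: metadata}

def make_directory(path, master, servers):
    """Handle a directory making request received by the master (S710)."""
    # S720: hypothetical policy, pick the server storing the fewest directories.
    target = min(servers, key=lambda s: (len(s.directories), s.name))
    target.directories[path] = {}         # make the directory on that server
    master.hierarchy[path] = target.name  # S740: update the master's hierarchy
    for server in servers:                # replicate the updated hierarchy
        if server is not master:
            server.hierarchy = dict(master.hierarchy)
    return target.name

m1, m2, m3 = MetadataServer("M1"), MetadataServer("M2"), MetadataServer("M3")
servers = [m1, m2, m3]
m1.directories["/A"] = {}
m2.directories["/B"] = {}
for s in servers:
    s.hierarchy = {"/A": "M1", "/B": "M2"}

chosen = make_directory("/C", master=m1, servers=servers)
```

With the initial state above, the sketch places the directory (/C) on M3 and every server ends up holding the entry C(M3).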
A file generation process is classified into an operation of generating a file to record metadata for a corresponding file in a metadata server and an operation of generating a file to store actual data in a data server. The following description will be made with reference to
Referring to
When the metadata server (M2) 210-2 receives the generation request of the file (/C/file6), the metadata server (M2) 210-2 consults its own directory hierarchy 311, checks the information C(M3), i.e., the fact that the metadata server (M3) 210-3 stores the directory (/C), and instructs the client 100 to send the generation request of the file (/C/file6) to the metadata server 210-3 in operation S820.
At this point, the client 100 transfers the generation request of the file (/C/file6) to the metadata server (M3) 210-3 in operation S830.
Accordingly, as illustrated in
As a result, metadata for the file (/C/file6) are stored in the metadata server (M3) 210-3, and the actual data of the file (/C/file6) are stored as a file named chunk-6 in a data server (DS-4) 300-4. For brevity, a process of selecting a data server to store the actual data of the file (/C/file6) and a process of dividing the actual data into a plurality of chunks to duplicate and store the divided chunks will be omitted.
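The redirect-based file generation described above can be sketched as follows. The class MetadataServer, the function client_create, and the reply dictionaries are hypothetical; because the selection of a data server is omitted from this description, the data server and chunk name are simply passed in as parameters.

```python
import posixpath

class MetadataServer:
    def __init__(self, name, hierarchy):
        self.name = name
        self.hierarchy = dict(hierarchy)  # shared directory hierarchy
        self.metadata = {}                # metadata files stored on this server

    def create_file(self, path, data_server, chunk):
        owner = self.hierarchy[posixpath.dirname(path)]
        if owner != self.name:
            return {"redirect": owner}    # S820: point the client to the owner
        # The owner records the metadata file for the new data file.
        self.metadata[path] = {"data_server": data_server, "chunk": chunk}
        return {"created": path}

def client_create(path, first, servers, data_server, chunk):
    """Send the request to any server, then follow at most one redirect."""
    reply = servers[first].create_file(path, data_server, chunk)
    if "redirect" in reply:               # S830: resend to the indicated server
        reply = servers[reply["redirect"]].create_file(path, data_server, chunk)
    return reply

hierarchy = {"/A": "M1", "/B": "M2", "/C": "M3"}
servers = {n: MetadataServer(n, hierarchy) for n in ("M1", "M2", "M3")}

reply = client_create("/C/file6", "M2", servers, data_server="DS-4", chunk="chunk-6")
```

A single redirect suffices in this sketch because every server holds the same hierarchy, so the server named in the redirect is the one that stores the directory.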
Referring to
When the metadata server (M3) 210-3 receives the inquiry request of the metadata for the file (/B/file5), the metadata server (M3) 210-3 consults its own directory hierarchy 321, and thereafter checks information B(M2) in operation S920.
That is, the metadata server (M3) 210-3 checks the fact that the directory (/B) is stored in the metadata server 210-2, and instructs the client 100 to send the inquiry request of the metadata for the file (/B/file5) to the metadata server 210-2.
At this point, the client 100 transfers the inquiry request of the metadata for the file (/B/file5) to the metadata server 210-2 in operation S930.
Accordingly, by checking the metadata of the file (/B/file5) in the metadata storage unit 312, the metadata server 210-2 notifies the client 100 of the fact that the actual data of the file (/B/file5) are stored in a data server (DS-3) 300-3 in operation S940.
A new metadata server may be added to the metadata server cluster 200. In this case, information on the new metadata server is registered in the metadata server cluster, and a directory hierarchy stored in an existing metadata server of the metadata server cluster 200 is duplicated in the new metadata server, thereby adding the new metadata server to the cluster 200.
Registering information on the new metadata server in the cluster 200 may be performed through a process where the directory hierarchy master metadata server or an arbitrary metadata server of the cluster 200 adds, to the directory hierarchy, the information (for example, identifier and nickname) of the metadata server being added.
Subsequently, the directory hierarchy where the added metadata server information is reflected is replicated to all other metadata servers in the cluster 200, and thus, all the existing metadata servers can recognize the existence of the added metadata server.
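The addition of a metadata server can be sketched as follows. The function add_metadata_server and the layout of the hierarchy (a list of registered servers alongside the directory map) are assumptions made for illustration, and the sketch takes the case where the directory hierarchy master metadata server performs the registration.

```python
import copy

def add_metadata_server(new_name, master, cluster):
    """Register a new server in the hierarchy and replicate it cluster-wide."""
    master["hierarchy"]["servers"].append(new_name)  # register the new server
    cluster[new_name] = {"hierarchy": None, "directories": {}}
    for server in cluster.values():
        if server is not master:
            # Replicate the hierarchy, including to the newly added server.
            server["hierarchy"] = copy.deepcopy(master["hierarchy"])

def new_hierarchy():
    return {
        "servers": ["M1", "M2", "M3"],
        "directories": {"/A": "M1", "/B": "M2", "/C": "M3"},
    }

cluster = {
    "M1": {"hierarchy": new_hierarchy(), "directories": {"/A": {}}},
    "M2": {"hierarchy": new_hierarchy(), "directories": {"/B": {}}},
    "M3": {"hierarchy": new_hierarchy(), "directories": {"/C": {}}},
}

add_metadata_server("M4", master=cluster["M1"], cluster=cluster)
```

In this sketch no existing directory is moved when the server is added: only the hierarchy is copied, and the new server starts with no directories of its own.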
A number of exemplary embodiments have been described above. Nevertheless, it will be understood that various modifications may be made. For example, suitable results may be achieved if the described techniques are performed in a different order and/or if components in a described system, architecture, device, or circuit are combined in a different manner and/or replaced or supplemented by other components or their equivalents. Accordingly, other implementations are within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2008-0129631 | Dec 2008 | KR | national |