File systems may be used to organise data into computer file entities, namely directories and files, that may be stored, manipulated and retrieved using a computers operating system. For example, various versions of FAT (File Allocation Table) and NTFS (New Technology File System) ext (extended file system) are used with example operating systems. File systems relate the data of named files to locations in storage. The storage can comprise remote, physical storage devices such as, for example, hard disk drives, solid-state storage, tape storage, and CD-ROMs, and/or virtualised storage layered above such physical storage devices.
Virtual Tape Libraries (VTLs), for example, are connected to client computer systems via either internet Small Computer Systems Interface (iSCSI) or fibre channel (FC). With the arrival of compaction technology a large increase in the amount of stored data housed upon the VTL may occur.
For a more complete understanding, reference is now made to the following description taken in conjunction with the accompanying drawings in which:
Referring to
A network interface 207 is also included in the client computer system 110 for communicating over the network 130. The network interface 207 may, for example, comprise an adapter, for example an NIC (network interface controller), suited to the network.
The client computer system 110 further comprises a backup application 209 which is executed to provide backup copies of consumer data, a deduplication engine 211 for dividing the consumer data to be backed up into chunks and determining a hash function for each chunks for processing the consumer data for deduplication before backup copies of the consumer data are transferred to back up storage facilities on the mass storage 140.
The client computer system 110 further comprises a file system 215 for organising consumer data into file entities (or objects) in a directory tree structure, as shown for example in
The file system 215 includes a metadata generator 213 for generating metadata which includes information of the objects of the tree structure including the type of object and its relative relationship with the other objects within the tree structure. For example, the metadata may comprise a unique universal identifier (UUID) for each object and if that object has a parent object, the metadata for that object also includes the parent UUID. In the example shown in
The controller 120, as shown in
Operation of the system will now be described in more detail with reference to
The metadata generator 213 then creates, 603, the metadata based on the consumer directory tree structure. This is achieved by the notion of a parent UUID (unique universal identifier) and an object UUID for each object. These UUIDs may be stored in the ‘tags’ region 313 of the current Object store schema for each object. Although this example utilises an Object store schema, it can be appreciated that different unique storage schema may be utilised.
The UUID of the object may also be set as the key of the object, rather than an incremental datum. Along with the incremental notion of an object stored in an Object store having a ‘parent’, the notion of a ‘root’ object is provided having a NULL parent UUID. This provides a point to start navigating relationships between objects, and hence facilitating a file system type mapping.
Along with the parent UUID and own UUID of each object, additional states may be stored per object that allows specification of the type of objects in an object store. It is intended that the storage of such “type” information allows the client links, etc. Thus there is the use of an Object store object solely as a means of storing metadata about a presentation (e.g. file system in the most likely instance); the use of such objects being readily used to provide the presentation of directories (container objects), special files (symbolic links) etc.
The deduplicated data (or data to be further processed for deduplication) and metadata is then transferred, 605, over the network 130 to the controller 120. The metadata is stored in the tag regions 313 of one of the object stores 309_1 to 309_k. The deduplicated data is located and stored on the mass storage 140.
As a result, some processing of the data for deduplication is carried out on the client computer system to reduce the demand on the processor resource of the controller. Further, the bandwidth for transferring the data from the client computer system is not wasted by transferral of redundant data which, when it arrives at the controller 120, it is already found to have been stored since the consumer data may be deduplicated before transferral since the controller 120 may only transfer the non duplicated chunks. An update count of duplicated chunks is incremented such that no chunks are unreferenced. This update is transferred to the controller.
The tree structure can then be retrieved, 701, from the controller 130 by a client computer system 110 using the metadata stored in the object store and presented, 703 to the user via the user interface 205.
Referring to
Setting an object O, such as a file entity, to have a parent UUID, 1201, is shown in
It will be appreciated that the use of the metadata as described above allows the storage of multiple presentations within one Object store (and hence deduplication domain), hence allowing consumers the ability to deduplicate differing file systems against one another, and hence reduce overall stored data on the controller 120 and to reduce the bandwidth in transferring data across the network 130.
In order to navigate a set of objects, one starts at a known points in the relationship hierarchy (root for the sake of argument); and then the contents can be enumerated, 1001, by the technique shown in
In order to present a view of objects that a file system navigator might expect (typically what is provided in a Unix stat structure per file for example) in which case additional data over and above the UUIDs may be stored, to enable such a view per object to be derived (typically permissions bits, but by no means limited to that solely—may also include data fields for ACLs/extended attributes/leaf-name of object, etc).
Moving files, 1101, on the client computer system 110 around the presentation of the directory tree structure likewise becomes a simple matter as illustrated in
In another example, the techniques can handle a situation where a ‘valid’ container is suggested initially to be an object store object that has no backing data in the mass storage. The metadata can readily provide an indication of ‘containerness’ along with the other incremental data being stored per object.
A container can be created, 901, as shown in
As a result, the directory structure can be represented by metadata solely housed within the Object store, rather than requiring any client side storage. Therefore, metadata will not be lost following failure of the client computer system and therefore, the backup data and the directory tree structure are completely recoverable from the mass storage 140 and the object store.
As a result, a client computer system (or host) without any unique software other than the usual ISV (independent software vendor) application can perform a restore from the mass storage 140. Further, since the metadata is not stored on the client computer system more consumer usable disaster recovery solutions can be utilised in combination with the system described above.
Any of the features disclosed in this specification, including the accompanying claims, abstract and drawings, and/or any of the steps of any method or process so disclosed, may be combined in any combination, except combinations were the sum of such features and/or steps are mutually exclusive. Each feature disclosed in this specification, including the accompanying claims, abstract and drawings may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise. Thus, unless expressly stated otherwise, each feature disclosed is one example only of a generic series of equivalent or similar features. The techniques of the present application are not restricted to the details of any foregoing examples. The claims should not be construed to cover merely the foregoing examples, but also any examples which fall within the scope of the claims. The techniques of the present application extend to any novel one, or any novel combination, of the features disclosed in this specification, including the accompanying claims, abstract and drawings, or to any novel one, or any novel combination, of the steps of any method or process so disclosed.
It will be appreciated that examples can be realized in the form of hardware, software module or a combination of hardware and the software module. Any such software module, which includes machine-readable instructions, may be stored in the form of volatile or non-volatile storage such as, for example, a storage device like a ROM, whether erasable or rewritable or not, or in the form of memory such as, for example, RAM, memory chips, device or integrated circuits or on an optically or magnetically readable medium such as, for example, a CD, DVD, magnetic disk or magnetic tape. It will be appreciated that the storage devices and storage media are examples of a non-transitory computer-readable storage medium that are suitable for storing a program or programs that, when executed, for example by a processor, implement embodiments. Accordingly, embodiments provide a program comprising code for implementing a system or method as claimed in any preceding claim and a non-transitory computer readable storage medium storing such a program.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2013/050990 | 7/18/2013 | WO | 00 |