At least one embodiment of the present disclosure pertains to data storage systems, and more particularly, to performing deduplication across a data storage system.
Scalability is an important requirement in many data storage systems, particularly in network-oriented storage systems, e.g., network attached storage (NAS) systems and storage area network (SAN) systems. Different types of storage systems provide diverse methods of seamless scalability through storage capacity expansion, including virtualized volumes of storage spanning multiple storage servers (e.g., a server cluster containing multiple server nodes).
A process used in many storage systems that can affect scalability is data deduplication. Data deduplication is an important feature for data storage systems, particularly for distributed data storage systems. Data deduplication is a technique to improve data storage utilization by reducing data redundancy. A data deduplication process identifies duplicate data and replaces the duplicate data with references that point to data stored elsewhere in the data storage system. However, existing deduplication technologies for storage systems suffer from deficiencies in the scalability and flexibility of the storage system, including bottlenecking at specific server nodes in the I/O flow of the storage system.
The figures depict various embodiments of the present disclosure for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the disclosure described herein.
The technology introduced here includes a method of performing asynchronous global deduplication in a variety of storage architectures. Asynchronous deduplication here refers to performing deduplication of data outside of an I/O flow of a storage architecture. For example, the technology includes global deduplication in host-based flash caches, cache appliances, cloud backup, infinite volumes, centralized backup systems, object-based storage platforms, e.g., StorageGRID™, and enterprise file hosting and synchronization services, e.g., Dropbox™. The disclosed technology performs an asynchronous deduplication across a storage system for backing up data utilizing a global data fingerprint tracking structure (“global fingerprint store”). “Data fingerprint” refers to a value corresponding to a data chunk (e.g., a data block, a fixed-sized portion of data comprising multiple data blocks, or a variable-sized portion of data comprising multiple data blocks) that identifies the data chunk uniquely or with substantially high probability. For example, a data fingerprint can be the result of running a hashing algorithm on the data chunk. The global fingerprint store is “global” in the sense that it tracks fingerprint updates from every staging area in the storage system. For example, if each server node in a storage system has a staging area, then the global fingerprint store tracks fingerprint updates from every one of the server nodes.
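For illustration only, the following sketch (in Python, and not part of the disclosed embodiments) shows how a data fingerprint might be computed by chunking a data object and hashing each chunk; the fixed chunk size, the SHA-256 algorithm, and the function names are assumptions made for this sketch.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; embodiments may also use variable-sized chunks


def chunk_data(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split a data object into contiguous fixed-size data chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def fingerprint(chunk: bytes) -> str:
    """Return a data fingerprint: a hash value that identifies the chunk
    uniquely or with substantially high probability."""
    return hashlib.sha256(chunk).hexdigest()


# Example usage: compute the fingerprints of a data object's chunks.
fingerprints = [fingerprint(c) for c in chunk_data(b"example data object contents")]
```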
For example, the global data fingerprint tracking structure can be the global data structure disclosed in U.S. patent application Ser. No. 13/479,138, titled “DISTRIBUTED DEDUPLICATION USING GLOBAL CHUNK DATA STRUCTURE AND EPOCHS,” filed on May 23, 2012, which is incorporated herein by reference in its entirety. The incorporated subject matter is intended to provide examples of methods and data structures for implementing global deduplication consistent with various embodiments, and is not intended to redefine or limit elements or processes of the present disclosure.
The asynchronous deduplication can be realized through asynchronous updates of data fingerprints of incoming data from one or more staging areas of the storage system to a metadata server system. A staging area is a storage space for collecting and protecting data chunks to be written to a backing storage of the storage system. The staging area can begin to clear its contents when full by contacting the metadata server system with the data fingerprints of data chunks in the staging area. The metadata server system can then reply with a list of data fingerprints that are unique (i.e., not currently in the storage system). The staging area can then commit the unique data chunks to the backing storage system of the storage system and discard duplicate data chunks (i.e., data chunks corresponding to non-unique fingerprints). The metadata server system can also contain a list of unique data chunks that comprise each stored data object in the storage system.
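A minimal sketch of this flow, assuming hypothetical `metadata_server` and `backing_storage` interfaces that are not defined by the disclosure, might look as follows; it is intended only to illustrate the batch query, the commit of unique chunks, and the discarding of duplicates.

```python
def flush_staging_area(staging, metadata_server, backing_storage):
    """Clear a full staging area asynchronously (outside the I/O flow).

    `staging` maps data fingerprint -> data chunk; `metadata_server` and
    `backing_storage` are hypothetical interfaces assumed for this sketch.
    """
    # Send the fingerprints of all staged chunks as one batch message.
    unique = set(metadata_server.check_unique(list(staging.keys())))

    for fp, chunk in list(staging.items()):
        if fp in unique:
            # Commit unique data chunks to the backing storage.
            backing_storage.write_chunk(fp, chunk)
        # Duplicate chunks are discarded; only references to existing
        # chunks need to be recorded for the committed data objects.
        del staging[fp]
```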
The backing storage is a persistent portion of the storage system to store data. The backing storage can be a separate set of storage devices from the device(s) implementing the staging area, which can also be persistent storage. The backing storage can be distributed within a storage cluster implementing the storage system. The staging areas, the metadata server system, and the backing storage system can be part of the storage system. The staging areas, the metadata server system, and the backing storage system can be implemented on separate hardware devices. Any two or all three of the staging areas, the metadata server system, and the backing storage system can be implemented on or partially or completely share the same hardware device(s). In some embodiments, each node (e.g., virtual or physical server node in a storage system implemented as a storage cluster) includes a staging area.
The metadata server system maintains the global fingerprint store, e.g., a hash table, that tracks data fingerprints (e.g., hash values generated based on data chunks of data objects) corresponding to unique data chunks. The metadata server system can be scalable. The metadata server system can comprise one or more storage nodes, e.g., storage servers or other storage devices. The multiple metadata servers may be virtual or physical servers. Each of the multiple metadata servers can maintain a version of the global data structure tracking the unique fingerprints. The version may include a partitioned portion of the unique data fingerprints in the storage system or all of the known unique data fingerprints in the storage system at a specific time. The multiple metadata servers can update each other in an “epoch-based” methodology. An epoch-based update refers to freezing a consistent view of a storage system at points in time through versions of the global fingerprint store. The global fingerprint store allows a storage system to deduplicate data in an efficient manner. The asynchronous deduplication scales well to an arbitrary number of nodes in a cluster, enables a reduction in the amount of data required to be transferred from a staging area to a backing storage system in the storage system for persistent storage, and enables deduplication without delaying the I/O flow of the storage system.
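One way a metadata node's version of the global fingerprint store could be modeled is sketched below; the dictionary layout, the epoch counter, and the method names are assumptions for illustration rather than a definitive implementation.

```python
class GlobalFingerprintStore:
    """Sketch of one metadata node's version of the global fingerprint store."""

    def __init__(self):
        self.epoch = 0          # version indicator used for epoch-based updates
        self.fingerprints = {}  # data fingerprint -> storage location (or hint)

    def check_and_register(self, batch):
        """Return the fingerprints in `batch` that are unique (not yet tracked),
        registering them so that later queries treat them as duplicates."""
        unique = []
        for fp in batch:
            if fp not in self.fingerprints:
                self.fingerprints[fp] = None  # location is recorded after commit
                unique.append(fp)
        return unique

    def advance_epoch(self):
        """Freeze a consistent view of the store before exchanging updates with peers."""
        self.epoch += 1
        return self.epoch
```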
The asynchronous global deduplication technology enables a more efficient accumulation of data. For example, the staging area can accumulate data at high speed, without having to compute and look up each individual fingerprint in real time. The fingerprint lookup can be delayed and accomplished in a bulk/batch fashion, which is more efficient and reduces the number of messages between the staging area and the metadata server system that keeps track of the fingerprint list.
The disclosed technology leverages the advantages of a scalable metadata server system to provide the ability to have only a single instance of each data chunk (e.g., data block) that is shared across many storage server nodes (i.e., global deduplication) in many different deployment scenarios, exemplified by various system architectures of
The storage system 100 can service one or more clients, e.g., client 106A and client 106B (collectively as the “clients 106”), by storing, retrieving, maintaining, protecting, and managing data for the clients 106. Each of the staging areas 102 can service one or more of the clients 106. The storage system 100 can communicate with the clients 106 through a network channel 108. The network channel 108 can comprise one or more interconnects carrying data in and out of the storage system 100. The network channel 108 can comprise subnetworks. For example, a subnetwork can facilitate communication between the client 106A and the first staging area 102A while a different subnetwork can facilitate communication between the client 106B and the second staging area 102B. The clients 106, for example, can include application servers, application processes running on computing devices, or mobile devices. In some embodiments, the clients 106 can run on the same hardware appliance as the staging areas 102, where, for example, the client 106A can communicate directly with the first staging area 102A via an internal network on the computing device, without going through an external network.
A global deduplication process can operate on each of the staging areas 102. The global deduplication process can collect incoming data objects from the clients 106 to be written to the storage system 100 at each of the staging areas 102. The global deduplication process can divide the data objects into data chunks, which are fixed sized or variable sized contiguous portions of the data objects. The global deduplication process can also generate a data fingerprint for each of the data chunks. For example, the data fingerprint may be generated by running a hash algorithm on each of the data chunks. In response to a trigger event, the global deduplication process can send the data fingerprints corresponding to the data chunks to a metadata server system 110. For example, the data fingerprints may be sent over to the metadata server system 110 as a fingerprints message.
The trigger event can be based on a set schedule (i.e., a schedule indicated in the configuration of the global deduplication process). The set schedule may be based on a periodic schedule. The set schedule of each instance of the global deduplication process may be synchronized to each other by synchronizing with a system clock available to each instance operating on each of the staging areas 102. Alternatively, the trigger event may be based on a state of a staging area. For example, the trigger event can occur whenever a staging area is full (i.e., at its maximum capacity) or if the staging area reaches a threshold percentage of its maximum capacity. The trigger event may further be based on an external message, e.g., a message from one of the clients 106.
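As a hedged illustration of such trigger events, a staging area's controller might combine a capacity threshold with a periodic schedule as sketched below; the 80% threshold and five-minute period are placeholder values, not parameters prescribed by the disclosure.

```python
import time


def trigger_event_occurred(staged_bytes, capacity_bytes, last_batch_time,
                           threshold=0.8, period_seconds=300, now=None):
    """Return True when a trigger event has occurred for a staging area."""
    now = time.time() if now is None else now
    at_capacity = staged_bytes >= threshold * capacity_bytes   # state-based trigger
    schedule_due = (now - last_batch_time) >= period_seconds   # schedule-based trigger
    return at_capacity or schedule_due
```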
The metadata server system 110 includes one or more metadata nodes, e.g., a first metadata node 112A and a second metadata node 112B (collectively as metadata nodes 112). In some embodiments, each of the metadata nodes 112 can act on behalf of the metadata server system 110 to reply to a staging area of whether a data fingerprint is unique in the storage system 100. An instance of the global deduplication process may specifically select one of the metadata nodes 112 to send a specific data fingerprint based on a characteristic of the specific data fingerprint, e.g., a characteristic of a hash value representing the specific data fingerprint. An instance of the global deduplication process may also specifically select one of the metadata nodes 112 to send a specific data fingerprint based on a characteristic of the staging area (e.g., each staging area being assigned to a particular metadata node). In some embodiments, one of the metadata nodes 112 may be preselected to route the fingerprints message from the staging areas 102 to the other metadata nodes.
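For example, selecting a metadata node based on a characteristic of the fingerprint could be as simple as partitioning on its leading digits, as in the sketch below; the prefix-modulo scheme is an assumption for illustration, and an embodiment could instead assign each staging area to a particular node.

```python
def select_metadata_node(fingerprint_hex: str, metadata_nodes):
    """Route a data fingerprint to a metadata node based on the fingerprint itself."""
    prefix = int(fingerprint_hex[:4], 16)        # characteristic of the fingerprint
    return metadata_nodes[prefix % len(metadata_nodes)]
```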
Once a metadata node receives a fingerprints message, the metadata node can compare fingerprints in the fingerprints message against a version of a global fingerprint store available in the metadata node (e.g., a first version 114A of the global fingerprint store and a second version 114B of the global fingerprint store). The comparison can determine whether a particular fingerprint is unique or not in the storage system 100 according to the version of the global fingerprint store available in the metadata node. In some embodiments, the version of the global fingerprint store contains a portion of all unique fingerprints in the storage system 100, e.g., where the portion corresponds to a specific subset of the staging areas 102 or a particular group of the fingerprints according to a characteristic of the fingerprints. In other embodiments, the version of the global fingerprint store contains all unique fingerprints in the storage system 100 at a specific point in time. In some embodiments, unique fingerprints across the entire storage system 100, including the staging areas 102 and the backing storage 104, are tracked by the global fingerprint store. In other embodiments, unique fingerprints across only the backing storage 104 are tracked by the global fingerprint store. Here again, “unique fingerprints,” as determined by the metadata node, are defined according to the version of the global fingerprint store.
When a particular fingerprint is determined to be unique by a metadata node (i.e., not to exist in the version of the global fingerprint store in the metadata node), then the metadata node can modify its version of the global fingerprint store by adding the particular fingerprint to it. Aside from updating the version of the global fingerprint store according to the fingerprints messages from the staging areas 102, the version of the global fingerprint store may also be updated periodically from other metadata nodes in the metadata server system 110. For example, the metadata nodes 112 can be scheduled for a rolling update from one metadata node to another. The sequence of which metadata node to update first may be determined based on load-balancing considerations, the number of updates to the current version of the global fingerprint store, or other considerations related to a state of a metadata node or a state of the global fingerprint store. The sequence of which metadata node to update may also be determined arbitrarily. A version indicator (e.g., an epoch indicator) can be stored on the metadata node to facilitate the updating of the global fingerprint store.
The metadata node can generate a response message in response to receiving a fingerprints message from the staging area. When a particular fingerprint is determined to be not unique by the metadata node (i.e., the particular fingerprint exists in the version of the global fingerprint store in the metadata node), the response message may contain an indication that a data chunk corresponding to the particular fingerprint exists in the storage system 100 or in the backing storage 104. In some embodiments, the indication includes a specific storage location in the backing storage 104 of an existing data chunk corresponding to the same particular fingerprint. In other embodiments, the indication includes a hint or suggestion of where an existing data chunk corresponding to the same particular fingerprint can be found in the backing storage 104, or simply that the existing data chunk is in the backing storage 104. The specific storage location or the hint of where the existing data chunk may exist can be used to deduplicate a data chunk on the staging area corresponding to the particular fingerprint. A reference to the storage location can be mapped/linked to any data objects referencing the data chunk. For example, when committing the data chunk on the staging area corresponding to the particular fingerprint to the backing storage 104, instead of transferring the entire data chunk, a link referencing the storage location is transferred to the backing storage 104.
When a particular fingerprint is determined to be unique by the metadata node (i.e., not to exist in the version of the global fingerprint store in the metadata node), the response message may contain an indication that any data chunk on the staging area corresponding to the particular fingerprint is unique, and thus need not be deduplicated or need only be deduplicated against each other (i.e., amongst data chunks in the staging area with the same data fingerprint). When committing a data chunk corresponding to the particular fingerprint to the backing storage 104, the staging area may indicate to the backing storage 104 that the data chunk is unique and thus need not be deduplicated on the backing storage 104.
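The way a staging area might act on the response message when committing a data object is sketched below; the `responses` mapping (fingerprint to either None for a unique chunk or an existing storage location for a duplicate) and the `backing_storage` interface are assumptions for this sketch.

```python
def commit_data_object(ordered_chunks, responses, backing_storage):
    """Commit a data object using the metadata node's response message.

    `ordered_chunks` is a list of (fingerprint, data) pairs in object order;
    `responses` maps fingerprint -> None (unique) or an existing storage
    location (duplicate). Both names are hypothetical.
    """
    object_map = []  # ordered references that reconstruct the data object
    for fp, data in ordered_chunks:
        location = responses.get(fp)
        if location is None:
            # Unique chunk: transfer the data itself to the backing storage.
            location = backing_storage.write_chunk(fp, data)
        # Duplicate chunk: only a link to the existing location is recorded.
        object_map.append(location)
    return object_map
```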
The storage system 100 can be consistent with various storage architectures. For example, the storage system 100 can represent a host-based cache storage system, as further exemplified in
Each node 210A, 210B, 210C or 210D receives and responds to various read and write requests from clients, such as 230A or 230B, directed to data stored in or to be stored in the persistent storage 260. Each of the nodes 210A, 210B, 210C and 210D contains a persistent storage 260, which includes a number of nonvolatile mass storage devices 265. The nonvolatile mass storage devices 265 can be, for example, conventional magnetic or optical disks or tape drives; alternatively, they can be non-volatile solid-state memory, e.g., flash memory, or any combination of such devices. In some embodiments, the mass storage devices 265 in each node can be organized as a Redundant Array of Inexpensive Disks (RAID), in which the node 210A, 210B, 210C or 210D accesses the persistent storage 260 using a conventional RAID algorithm for redundancy.
Each of the nodes 210A, 210B, 210C or 210D may contain a storage operating system 270 that manages operations of the persistent storage 260. In certain embodiments, the storage operating systems 270 are implemented in the form of software. In other embodiments, however, any one or more of these storage operating systems may be implemented in pure hardware, e.g., specially-designed dedicated circuitry or partially in software and partially as dedicated circuitry.
Each of the data nodes 210A and 210B may be, for example, a storage server which provides file-level data access services to hosts, e.g., as is commonly done in a NAS environment, or block-level data access services, e.g., as is commonly done in a SAN environment, or it may be capable of providing both file-level and block-level data access services to hosts. Further, although the nodes 210A, 210B, 210C and 210D are illustrated as single units in
The processor(s) 310 is/are the central processing unit (CPU) of the storage controller 300 and, thus, control the overall operation of the node 300. In certain embodiments, the processor(s) 310 accomplish this by executing software or firmware stored in memory 320. The processor(s) 310 may be, or may include, one or more programmable general-purpose or special-purpose microprocessors, digital signal processors (DSPs), programmable controllers, application specific integrated circuits (ASICs), programmable logic devices (PLDs), trusted platform modules (TPMs), or the like, or a combination of such devices.
The memory 320 is or includes the main memory of the node 300. The memory 320 represents any form of random access memory (RAM), read-only memory (ROM), flash memory, or the like, or a combination of such devices. In use, the memory 320 may contain, among other things, code 370 embodying at least a portion of a storage operating system of the node 300. Code 370 may also include a deduplication application.
Also connected to the processor(s) 310 through the interconnect 330 are a network adapter 340 and a storage adapter 350. The network adapter 340 provides the node 300 with the ability to communicate with remote devices, e.g., clients 130A or 130B, over a network and may be, for example, an Ethernet adapter or Fibre Channel adapter. The network adapter 340 may also provide the node 300 with the ability to communicate with other nodes within the data storage cluster. In some embodiments, a node may use more than one network adapter to deal with the communications within and outside of the data storage cluster separately. The storage adapter 350 allows the node 300 to access a persistent storage, e.g., persistent storage 160, and may be, for example, a Fibre Channel adapter or SCSI adapter.
The code 370 stored in memory 320 may be implemented as software and/or firmware to program the processor(s) 310 to carry out actions described below. In certain embodiments, such software or firmware may be initially provided to the node 300 by downloading it to the node 300 from a remote system (e.g., via the network adapter 340).
The distributed storage system, also referred to as a data storage cluster, can include a large number of distributed data nodes. For example, the distributed storage system may contain more than 1000 data nodes, although the technique introduced here is also applicable to a cluster with a very small number of nodes. Data is stored across the nodes of the system. The deduplication technology disclosed herein applies to the distributed storage system by gathering deduplication fingerprints from distributed storage nodes periodically, processing the fingerprints to identify duplicate data, and updating a global fingerprint store consistently from a current version to the next version.
Once a certain number of data chunks has been collected, data fingerprints of the data chunks are generated in step 404. The data fingerprints may be generated at the staging area. For example, the data fingerprints may be generated by a host of the staging area. A data fingerprint requires less storage space than its corresponding data chunk and is used to identify the data chunk. The data fingerprints may be generated by executing a hash function on the data chunks. Each data chunk is then represented by a hash value (as its data fingerprint).
Then in step 406, a controller of the staging area (e.g., a storage controller or a processor of a host of the staging area) sends the data fingerprints in batch (e.g., including data fingerprints corresponding to data chunks collected at different times) to a metadata server system in the storage system. The metadata server system can receive batch data fingerprint updates from multiple staging areas. The metadata server system can be the metadata server system 110 of
Sending of the data fingerprints may be processed independently (i.e., asynchronously) from an I/O path of the staging area. Sending of the data fingerprints may be in response to a trigger event. For example, the data fingerprints may be sent when the staging area reaches its maximum capacity or if the staging area reaches a threshold percentage of its maximum capacity. As another example, the data fingerprints may be sent periodically based on a set schedule. When sending the data fingerprints, the controller of the staging area may determine a metadata node in the metadata server system to send the data fingerprints based on a characteristic of the data fingerprints. Alternatively, where to send the data fingerprints may be determined based on a characteristic of the staging area.
In response to the batch fingerprints update from the staging area, the metadata server system sends and the controller of the staging area receives an indication of whether each of the data fingerprints is unique in the storage system in step 408. When the indication indicates that the data fingerprint of a particular data chunk is not unique and exists in a global fingerprint store of the metadata server system, the metadata server system may also send a storage location identifier of an existing data chunk in the backing storage to the controller of the staging area, where the existing data chunk also corresponds to the data fingerprint of the particular data chunk in the staging area.
When committing a data object containing one of the data chunks in the staging area to the backing storage, the one data chunk is discarded when the indication indicates that the data fingerprint corresponding to the one data chunk is not unique in step 410. The staging area may begin to process the data object to commit the data object to the backing storage when the staging area is full, i.e., at its maximum capacity. The staging area may commit the data object in response to sending the data fingerprints in batch and receiving the indication of whether data fingerprints are unique. When committing the data object to the backing storage, the controller of the staging area may indicate to the backing storage which of the data chunks in the data object have been deduplicated. When committing the data object to the backing storage, the host of the staging area may logically map the storage location of the existing data chunk in place of the one data chunk determined not to be unique prior to or when discarding the one data chunk.
Steps 402 to 410 may correspond to the process 400 of processing of data objects and/or data chunks in write requests to the storage system.
In response to receiving the batch fingerprints message and determining the indication, the metadata server sends the indication of whether the data fingerprint is in the version of the global fingerprint store (i.e., whether the metadata server considers the data fingerprint to be unique) to the first staging area in step 506. As part of step 506, the metadata server may also send a storage location identifier of an existing data chunk in a backing storage, where the existing data chunk corresponds to the same data fingerprint corresponding to the indication.
Also in response to determining the indication (e.g., in parallel with step 506 or immediately before or after step 506), the metadata server updates the version of the global fingerprint store with the data fingerprint when the data fingerprint does not exist in the version in step 508. The metadata server may store a list of unique data chunks in each data object in the storage system that can be requested by any of the staging areas of the storage system. Thus, the updating of the data fingerprint may also include updating data chunk metadata associated with the data fingerprints and corresponding data objects. A benefit of storing and updating the version of the global fingerprint store in the metadata server system is that the global fingerprint store remains relevant even when data corresponding to data fingerprints of the global fingerprint store is moved in an arbitrary manner. After the update in step 508, the metadata server communicates with a peer metadata server in the metadata server system to update a peer version of the global fingerprint store in the peer metadata server in step 510.
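A sketch of steps 508 and 510, reusing the GlobalFingerprintStore sketch above and assuming a simple merge policy in which the peer registers any fingerprints it has not yet seen, might look as follows.

```python
def update_and_propagate(local_store, peer_store, new_fingerprints):
    """Update the local version of the global fingerprint store (step 508)
    and push the new fingerprints to a peer metadata server (step 510)."""
    local_store.check_and_register(new_fingerprints)   # step 508: update local version
    epoch = local_store.advance_epoch()                # freeze a consistent view
    peer_store.check_and_register(new_fingerprints)    # step 510: update peer version
    peer_store.epoch = max(peer_store.epoch, epoch)
    return epoch
```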
The disclosed global deduplication technology may be exemplified in a number of backup systems.
A storage system often includes a network storage controller that is used to store and retrieve data on behalf of one or more hosts on a network. The storage system may also include a cache to facilitate processing of large amounts of data I/O. Solid-state cache systems and flash-based cache systems enable the size of the cache memory utilized by a storage controller to grow relatively large, in many cases into terabytes. Furthermore, conventional storage systems are often configurable to provide a variety of cache memory sizes. Typically, the larger the cache size, the better the performance of the storage system. However, cache memory is expensive, and the performance benefits of additional cache memory can decrease considerably as the size of the cache memory increases, e.g., depending on the workload.
Without expensive and time-consuming simulations running on the storage systems, predicted statistics of how cache memories are used and the effectiveness of such cache memories are difficult to come by. A host-based cache system is a system architecture for a storage system that enables the hosts themselves to control the mechanisms that place data either in the cache or in a backing storage.
For example, a host-based flash cache system may provide a write-back cache (i.e., a cache implementing a write-back policy, where initially, writing is done only to the cache and the write to the backing storage is postponed until the cache blocks containing the data are about to be modified/replaced by new content) capability using peer-to-peer protocols. This makes the host cache a viable staging area (e.g., one of the staging areas 102 of
Optionally, a special protocol between the host cache and systems that support deduplication, e.g., Fabric-Attached Storage (FAS) made by NetApp, Inc. of Sunnyvale, Calif., can provide benefits to the systems by writing the unique data chunks to the backing storage and, for new data objects containing data chunks that are not unique, only logically mapping to the previously existing data chunks. The cache also optimizes the transfer of read data by returning requested data directly from the cache and only requesting the data from a backing storage when the cache determines that the requested data is not present in the cache.
The host storage controller 614 and/or the processor 602 may be coupled to a storage space 616. Storage devices within the storage space 616 may be on the same physical device as the processor 602 or the host storage controller 614, or on a separate device coupled to the host storage controller 614 and/or the processor 602 via a network. The storage space 616 may include flash-based memory, other solid-state memory, disk-based memory, tape-based memory, other types of memory, or any combination thereof. The storage space 616 may be accessible directly or indirectly to the host (e.g., the processor 602 and/or the host storage controller 614) to direct system input/output to various physical media regions on the storage space 616.
Most frequently used data may be directed to and stored on the fastest media portion of the storage space 616, which acts as a cache for the slower storage. For example, the storage space 616 includes a secondary cache system 618 and a persistent storage system 620. The secondary cache system 618 includes one or more solid-state memories 622, e.g., flash memories. The persistent storage system 620 includes one or more mass storages 624, including tape drives and disk drives. In some embodiments, the mass storages 624 may include solid-state drives as well. If one of the storages becomes filled, the caching process executed by the processor 602 can instruct the storage space 616 to move data from one region to another via internal instructions. The host-based caching process can provide a caching solution superior to other caching solutions that do not have real-time host knowledge and the richness of information needed to effectively control the different media types (e.g., faster solid-state or flash memory and slower disk memory) within the storage space 616.
The host-based caching technique has the ability to directly control the content of the cache (e.g., primary or secondary). By providing information about file types and process priority, the host (e.g., the processor 602 or the storage controller 614) can make decisions based on which logical addresses are touched. This more informed decision-making can lead to increased performance in some embodiments. Allowing the host to control the mechanisms that place data either in the faster solid-state media area or in the slower magnetic media area of the storage space may lead to better performance and lower power consumption in some cases. This is because the host may be aware of the additional information associated with inputs/outputs destined for the device and can make more intelligent caching decisions as a result. Thus, the host can control the placement of incoming input and output data within the storage.
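A minimal sketch of host-controlled placement, assuming hypothetical `flash_tier` and `disk_tier` interfaces and an access-count heuristic that is not part of the disclosure, is shown below.

```python
def place_data(logical_address, access_count, high_priority,
               flash_tier, disk_tier, hot_threshold=3):
    """Host-controlled placement between faster and slower media regions."""
    if access_count >= hot_threshold or high_priority:
        flash_tier.store(logical_address)   # frequently touched or high-priority data
    else:
        disk_tier.store(logical_address)    # colder data goes to the slower media area
```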
In this system architecture, either the system memory 608 or the secondary cache system 618 can be the staging area, e.g., one of the staging areas 102, in accordance with the disclosed global deduplication technology. The persistent storage system 620 can be the backing storage, e.g., the backing storage 104. When the processor 602 issues a write request to write a data object into the persistent storage system 620, the processor 602 can first store the data object in the system memory 608 or the secondary cache system 618, acting as a staging area. For example, when the system memory 608 serves as the staging area, contents of the system memory 608 can be mirrored into the secondary cache system 618 for protection as well. The system memory 608 can also be protected by error correcting code or erasure correcting code.
When the staging area is full, the processor 602 can contact a metadata server 630 (e.g., the metadata node 112A or the metadata node 112B) that maintains a global fingerprint store 632 by sending data fingerprints of data chunks in the data object. Generation of the data fingerprints may occur as a continuous process, in response to the write request, or in response to the staging area being full. The metadata server 630 may be implemented as an external system to the host (as shown) that communicates via a network. Alternatively, the metadata server 630 may be implemented as a service in the host-based cache system 600 (not shown) with the global fingerprint store 632 stored in the system memory 608 or the secondary cache system 618. The global deduplication process methods can be carried out in accordance with
The backup system may be a cloud-based backup, which is a feature that allows a storage device to send backup data directly to a cloud provider 704A. When a set of data chunks in the data objects to be backed up is determined, the host computer computes data fingerprints of the data chunks and queries a metadata server 706 (e.g., the metadata node 112A or the metadata node 112B of
The backup system may be an enterprise file share system 704B (e.g., Dropbox™) for synchronizing and hosting files, which works similarly to the cloud provider 704A. The enterprise file share system 704B may, for example, include a cloud storage. For example, the first host device 702A may be coupled to the enterprise file share system 704B through a file management application installed on the first host device 702A that enables a user to share and store a data object in the enterprise file share system 704B while the same data object is simultaneously accessed from multiple other host devices 702 (i.e., devices with the file management application installed).
The file management application usually includes a file sharing folder where a new data object can be added. The file sharing folder can serve as a local cache and therefore can be a staging area, e.g., one of the staging areas 102. Before syncing the new data object to the enterprise file share system 704B, the first host device 702A can communicate with the metadata server 706 to identify unique data chunks that should be sent to the enterprise file share system 704B.
The backup system may be a centralized backup service system 704C. For example, in large enterprises, it is common for laptops and desktops to run a backup application that periodically backs up users' home directories on the laptops or desktops to a central backup server, e.g., the centralized backup service system 704C. Frequently, up to 60% of home directory data can be deduplicated. The backup application can communicate with the metadata server 706 to identify unique data chunks and send only those to the centralized backup service system 704C.
Accordingly, when implementing the disclosed global deduplication technique in the cache appliance system 800, the cache appliances 802 may serve as staging areas, e.g., the staging areas 102 of
An expandable storage volume is a scalable storage volume including multiple flexible volumes. A “namespace” as discussed herein is a logical grouping of unique identifiers for a set of logical containers of data, e.g., volumes. A flexible volume is a volume whose boundaries are flexibly associated with the underlying physical storage (e.g., aggregate). The namespace constituent volume stores the metadata (e.g., inode files) for the data objects in the expandable storage volume. Various metadata are collected into this single namespace constituent volume.
Multiple client computing devices or systems 904A-904N may be connected to the storage server system 902 by a network 906 connecting the client systems 904A-904N and the storage server system 902. As illustrated in
The storage server 908 may be connected to the storage devices 912A-912M via the switching fabric 910, which can be a fiber distributed data interface (FDDI) network, for example. It is noted that, within the network data storage environment, any other suitable numbers of storage servers and/or mass storage devices, and/or any other suitable network technologies, may be employed. While the embodiment illustrated in
The storage server 908 can make some or all of the storage space on the storage devices 912A-912M available to the client systems 904A-904N in a conventional manner. For example, a storage device (one of 912A-912M) can be implemented as an individual disk, multiple disks (e.g., a RAID group) or any other suitable mass storage device(s). The storage server 908 can communicate with the client systems 904A-904N according to well-known protocols, e.g., the Network File System (NFS) protocol or the Common Internet File System (CIFS) protocol, to make data stored at storage devices 912A-912M available to users and/or application programs.
The storage server 908 can present or export data stored at the storage devices 912A-912M as volumes (also referred to herein as storage volumes) to one or more of the client systems 904A-904N. One or more volumes can be managed as a single file system. In various embodiments, a “file system” does not have to include or be based on “files” per se as its units of data storage. Various functions and configuration settings of the storage server 908 and the mass storage subsystem 914 can be controlled from a management console 916 coupled to the network 906. The clustered storage server system 902 can be organized into any suitable number of virtual servers (also referred to as “vservers”), in which one or more of these vservers represent a single storage system namespace with separate network access. In various embodiments, each of these vservers has a user domain and a security domain that are separate from the user and security domains of other vservers.
According to the system architecture of the cluster storage server system 902, the storage server 908 can be implemented as a staging area for global deduplication, e.g., one of the staging areas 102 of
The clients 1002 can communicate via a number of file access protocols 1008 with the object level management server 1006. For example, the file access protocols 1008 may include Common Internet File System (CIFS), Network File System (NFS), and Hyper Text Transfer Protocol (HTTP). The object level management server 1006 can stage I/O workload from the clients 1002 for one or more storage facilities (e.g., storage facility 1010A or storage facility 1010B, collectively as “storage facilities 1010”). The storage facility 1010A, for example, may be a main facility for the distributed object storage system 1000. The storage facility 1010B, for example, may be a disaster recovery facility for the distributed object storage system 1000. Each of the storage facilities 1010 may include one or more storage devices (e.g., storage devices 1012A, 1012B, 1012C, and 1012D, collectively as “storage devices 1012”). The storage devices 1012 may be accessible in the storage facilities 1010 via Serial Advanced Technology Attachment (SATA), Storage Area Network (SAN), Small Computer System Interface (SCSI), or other protocols and connections.
In the system architecture of the distributed object storage system 1000, the object level management server 1006 can be implemented as a staging area for global deduplication, e.g., one of the staging areas 102 of