This invention relates generally to storage networks, and more specifically, relates to file deduplication using copy-on-write storage tiers.
In enterprises today, employees tend to keep personal copies of the documents and data they access often. They do so partly because copies are easy to find (central locations tend to change every so often), and partly because they forget where a document was originally found in the central location, or never knew where it originated (having been sent a copy via email). Finally, multiple employees may each keep a copy of the latest mp3 or video file, even if doing so is against company policy.
This leads to duplicate copies of the same document or data residing in individually owned locations, so that the individuals themselves can easily find them. However, it also means a great deal of storage space is wasted holding these copies, and the copies are often stored on more expensive (and higher-performance) tiers of storage, since employees tend to focus not on cost but on performance (they store data in the location that is easiest to remember and gives the best retrieval performance).
Deduplication is a technique where files with identical contents are first identified and then only one copy of the identical contents, the single-instance copy, is kept in the physical storage while the storage space for the remaining identical contents is reclaimed and reused. Files whose contents have been deduped because of identical contents are hereafter referred to as deduplicated files. Thus, deduplication achieves what is called “Single-Instance Storage” where only the single-instance copy is stored in the physical storage, resulting in more efficient use of the physical storage space. File deduplication thus creates a domino effect of efficiency, reducing capital, administrative, and facility costs and is considered one of the most important and valuable technologies in storage.
U.S. Pat. Nos. 6,389,433 and 6,477,544 are examples of how a file system provides the single-instance-storage.
While single-instance storage is conceptually simple, implementing it without sacrificing read/write performance is difficult. Files are deduped without their owners being aware of it, so the owners of deduplicated files have the same performance expectations as for files that have no duplicate copies. Since many deduplicated files share one single-instance copy of the contents, it is important to prevent the single-instance copy from being modified. Typically, a file system uses the copy-on-write (COW) technique to protect the single-instance copy: when an update is pending on a deduplicated file, the file system creates a partial or full copy of the single-instance copy, and the update is allowed to proceed only after that (partial) copy has been created, and only on the copied data. Waiting for the creation of a (partial) copy of the single-instance data before an update can proceed introduces significant performance degradation. In addition, the process of identifying and deduping replicated files puts a strain on file system resources. Because of this performance degradation, deduplication or single-instance storage is often deemed unacceptable for normal use; in practice, deduplication offers no obvious benefit to the end-user. Thus, while deduplication or single-instance storage has been available in a few file systems, it is not commonly used, and many file systems do not even offer the feature due to its adverse performance impact.
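As a minimal illustration of this copy-on-write gating (a sketch only; the paths and the is_deduplicated check are hypothetical stand-ins, not any particular file system's implementation):

```python
import shutil


def write_to_deduped_file(path, single_instance_path, offset, data, is_deduplicated):
    """Illustrative copy-on-write gate: before an update to a deduplicated
    file, materialize a private copy of the shared single-instance contents,
    then apply the write to that copy."""
    if is_deduplicated(path):
        # The update must wait until the private copy exists; this copy step
        # is the source of the write-latency penalty described above.
        shutil.copyfile(single_instance_path, path)
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
```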
File system level deduplication offers many advantages for IT administrators. However, it generally offers no direct benefit to the users of the file system; instead, those users experience performance degradation for the files that have been deduped. Therefore, it would be desirable to reduce this performance degradation to an acceptable level.
Another aspect of file system level deduplication is that it is usually done on a per-file-system basis. It is more desirable for deduplication to be done jointly across two or more file systems: the more file systems that are deduped together, the greater the chance that files with identical contents will be found and the more storage space will be reclaimed. For example, if there is only one copy of file A in a file system, file A will not be deduped; if there is another copy of file A in a second file system, then taken together, the copies of file A in the two file systems can be deduped. Furthermore, since there is only one single-instance copy for all of the deduplicated files from one or more file systems, the more file systems that are deduped together, the more efficient the deduplication process becomes.
The related application entitled File Deduplication Using Storage Tiers discloses a method of deduplication in which duplicated files in one or more file servers in tier-1 storage are migrated to one or more file servers in tier-2 storage. As a result, the storage space occupied by duplicated files in tier-1 storage is reclaimed, while storage space in less expensive tier-2 storage is consumed to store the duplicated files migrated from tier-1. Furthermore, a mirror copy from each set of duplicated files is left in the tier-1 storage to maintain read performance. The performance degradation that otherwise exists on update operations on deduplicated files is eliminated, since COW is not needed. While the deduplication method specified in the co-pending application does not actually reduce the total storage space consumed by the duplicate files, it makes deduplication easier for end-users to accept, since they will experience at most a very minor inconvenience. Furthermore, the number of files in tier-1 storage is reduced by deduplication, resulting in faster backup of tier-1 file servers.
However, in some cases, the actual removal of all duplicated files is unlikely to cause any inconvenience to end-users. For example, the contents of music or image files are never changed once created and are therefore good candidates for deduplication. In another case, files that have not been accessed for a long time are also good candidates, since they are unlikely to be changed again any time soon.
Therefore, it would be desirable to provide deduplication of specified classes of files.
It would be desirable to achieve deduplication with acceptable performance. It is even more desirable to be able to dedupe across more file systems to achieve higher deduplication efficiency. Furthermore, to reduce inconvenience experienced by end-users due to the performance overhead of deduplication, deduplication itself should be able to be performed on a selected set of files, instead of on every file in one or more selected file servers. Finally, in the case where end-users are unlikely to experience inconvenience due to deduplication, deduplication should result in less utilization of storage space by eliminating the storage of identical file copies.
In accordance with one aspect of the invention there is provided a method and file virtualization appliance for deduplicating files using copy-on-write storage tiers. Deduplicating files involves associating a number of files from the primary storage tier with a copy-on-write storage tier having a designated mirror server and deduplicating the files associated with the copy-on-write storage tier, such deduplicating including storing in the designated mirror server of the copy-on-write storage tier a single copy of the file contents for each duplicate and non-duplicate file associated with the copy-on-write storage tier; deleting the file contents from each deduplicated file in the copy-on-write storage tier to leave a sparse file; and storing metadata for each of the files, the metadata associating each sparse file with the corresponding single copy of the file contents stored in the designated mirror server.
In various alternative embodiments, associating a number of files from the primary storage tier with a copy-on-write storage tier may involve maintaining the copy-on-write storage tier separately from the primary storage tier and migrating the number of files from the primary storage tier to the copy-on-write storage tier. Maintaining the copy-on-write storage tier separately from the primary storage tier may involve creating a synthetic namespace for the copy-on-write storage tier using file virtualization, the synthetic namespace associated with a number of file servers, and wherein migrating the number of files from the primary storage tier to the copy-on-write storage tier comprises migrating a selected set of files from the synthetic namespace to the copy-on-write storage tier. Associating a number of files from the primary storage tier with a copy-on-write storage tier alternatively may involve marking the number of files as being associated with the copy-on-write storage tier, wherein the copy-on-write storage tier is a virtual copy-on-write storage tier. Associating a number of files from the primary storage tier with a copy-on-write storage tier may involve maintaining a set of storage policies identifying files to be associated with the copy-on-write storage tier and associating the number of files with the copy-on-write storage tier based on the set of storage policies. Storing a single copy of the file contents for each duplicate and non-duplicate file may involve determining whether the file contents of a selected file in the copy-on-write storage tier match the file contents of a previously deduplicated file having a single copy of file contents stored in the designated mirror server and, when the file contents of the selected file do not match the file contents of any previously deduplicated file, storing the file contents of the selected file in the designated mirror server. Determining whether the file contents of a selected file in the copy-on-write storage tier match the file contents of a previously deduplicated file having a single copy of file contents stored in the designated mirror server may involve comparing a hash value associated with the selected file to hash values associated with the single copies of file contents for the previously deduplicated files stored in the designated mirror server.
Deduplicating files may further involve purging unused mirror copies from the designated mirror server. Purging unused mirror copies from the designated mirror server may involve suspending file deduplication operations; identifying mirror copies in the designated mirror server that are no longer in use; purging the unused mirror copies from the designated mirror server; and enabling file deduplication operations. Identifying mirror copies in the designated mirror server that are no longer in use may involve identifying mirror copies in the designated mirror server that are no longer associated with existing files associated with the copy-on-write storage tier. Identifying mirror copies in the designated mirror server that are no longer associated with existing files in the copy-on-write storage tier may involve constructing a list of hash values associated with existing files in the copy-on-write storage tier; and for each mirror copy in the designated mirror server, comparing a hash value associated with the mirror copy to the hash values in the list of hash values, wherein the mirror copy is deemed to be an unused mirror copy when the hash value associated with the mirror copy is not in the list of hash values.
The method may further involve processing open requests for files associated with the copy-on-write storage tier, such processing of open requests comprising:
receiving from a client an open request for a specified file associated with the copy-on-write storage tier;
when the specified file is a non-deduplicated file:
when the specified file is a deduplicated file having a mirror copy of the file contents stored in the designated mirror server:
The mirror file handle for the mirror copy may be obtained from the designated mirror server based on hash values associated with the specified file and the mirror copy.
The contents of the specified file may be filled from the copy of the file contents stored in the designated mirror server using a background task.
The method may further involve processing file requests for files associated with the copy-on-write storage tier. Such processing may involve:
receiving from the client a file request including the copy-on-write file handle;
when the copy-on-write file handle is marked as not ready:
when the copy-on-write file handle is marked as ready with error, returning an error indication to the client;
when the file request is a read operation and the copy-on-write file handle is associated with a mirror file handle:
when the file request is a read operation and the copy-on-write file handle is not associated with a mirror file handle:
when the file request is a write operation, using the copy-on-write file handle to write data to the file in the copy-on-write storage tier; and
otherwise sending the file request to the file virtualization appliance.
The foregoing features of the invention will be more readily understood by reference to the following detailed description, taken with reference to the accompanying drawings, in which:
Embodiments of the present invention relate generally to using a copy-on-write storage tier to reclaim storage space of all duplicated files and recreate the contents of a duplicated file from its mirror copy when an update is about to occur on the duplicated file.
A traditional file system manages storage space by providing a hierarchical namespace. The hierarchical namespace starts from the root directory, which contains files and subdirectories; each subdirectory may in turn contain files and further subdirectories. Data is stored in files. Every file and directory is identified by a name. The full name of a file or directory is constructed by concatenating the name of the root directory with the names of each subdirectory along the path leading to the directory that contains the identified file or directory, followed by the name of the file or directory itself.
The full name of a file thus carries with it two pieces of information: (1) the identification of the file and (2) the physical storage location where the file is stored. If the physical storage location of a file is changed (for example, moved from one partition mounted on a system to another), the identification of the file changes as well.
For ease of management, as well as for a variety of other reasons, the administrator would like to control the physical storage location of a file. For example, important files might be stored on expensive, high-performance file servers, while less important files could be stored on less expensive and less capable file servers.
Unfortunately, moving files from one server to another usually changes the full name of the files and thus, their identification, as well. This is usually a very disruptive process, since after the move users may not be able to remember the new location of their files. Thus, it is desirable to separate the physical storage location of a file from its identification. With this separation, IT and system administrators will be able to control the physical storage location of a file while preserving what the user perceives as the location of the file (and thus its identity).
File virtualization is a technology that separates the full name of a file from its physical storage location. File virtualization is usually implemented as a hardware appliance that is physically or logically located in the data path between users and the file servers. For users, a file virtualization appliance appears as a file server that exports the namespace of a file system. From the file servers' perspective, the file virtualization appliance appears as just a normal user. Attune System's Maestro File Manager (MFM) is an example of a file virtualization appliance.
As a result of separating the full name of a file from the file's physical storage location, file virtualization provides the following capabilities:
Deduplication is of no obvious benefit to the end users of a file system. Exemplary embodiments of the present invention use deduplication as a storage placement policy to intelligently manage the storage assets of an enterprise, with relatively little inconvenience to end users.
Embodiments of the present invention utilize a Copy-On-Write (COW) storage tier in which every file in any of the file servers in the storage tier is eventually deduplicated, regardless of whether any file in the storage tier has identical contents. This is in contrast with typical deduplication, where only files with identical contents are deduped.
Storage policies are typically used to limit deduplication to a set of files selected by the storage policies that apply to a synthetic namespace comprising one or more file servers. For example, one storage policy may migrate a specified class of files (e.g., all mp3 audio and jpeg image files) to a COW storage tier; another may migrate all files that have not been referenced for a specified period of time (e.g., over six months). Once the files are in the COW storage tier, deduplication is done on every file, regardless of whether any file with duplicated contents exists.
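For illustration only, storage policies of the two kinds mentioned above (by file type, or by time since last access) could be expressed as simple predicates; the extensions and the six-month threshold below are assumptions, not requirements of any embodiment:

```python
import os
import time

SIX_MONTHS = 180 * 24 * 3600  # assumed threshold in seconds


def policy_media_files(path):
    """Select mp3 audio and jpeg image files for the COW storage tier."""
    return os.path.splitext(path)[1].lower() in (".mp3", ".jpg", ".jpeg")


def policy_stale_files(path, now=None):
    """Select files that have not been accessed for roughly six months."""
    now = now if now is not None else time.time()
    return now - os.stat(path).st_atime > SIX_MONTHS
```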
In an exemplary embodiment, extending file virtualization to support deduplication using the COW storage tier operates generally as follows. First, a synthetic namespace is created via file virtualization, and is comprised of one or more file servers. A set of storage policies is created that selects a set of files from the synthetic namespace to be migrated to the COW storage tier.
A set of file servers is selected to be in the COW storage tier. One of the file servers in a COW storage tier also acts as a mirror server. In exemplary embodiments, a mirror server is storage that may contain the current, past, or both current and past mirror copies of the authoritative copy of files stored in the COW storage tier. In exemplary embodiments, each mirror copy in the mirror server is associated with a hash value, e.g., identified by a 160-bit number, which is the sha1 digest computed from the contents of the mirror copy. A sha1 digest value is, for practical purposes, globally unique for any given set of data (contents) of a file. Therefore, if two files are identical in contents (but not necessarily in name or location), they should always have the same sha1 digest values; conversely, if two files differ in contents, they should always have different sha1 digest values.
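A minimal sketch of computing such a digest over a file's contents, using Python's standard hashlib (the chunk size is arbitrary):

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the sha1 digest (a 160-bit value) of a file's contents.
    Two files with identical contents yield the same digest, which is
    how deduplication candidates are matched to mirror copies."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```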
The mirror server is a special device. While it can be written to, writes are performed only by the file virtualization appliance itself, and each write to a file is done only once. Users and applications only read the contents of files stored on the mirror server. The mirror server is thus essentially a write once, read many (WORM) device. Therefore, if the mirror server were replicated, users and applications could read from any of the available mirror servers. Replicating the mirror server increases availability (if one mirror server is unavailable, another can service the request) and performance (multiple mirror servers can respond to reads from users and applications in parallel, and the mirror server closest to the requester can service the request).
Once a file is stored in a COW storage tier, the file will eventually be deduplicated. For example, if there is no update made to any files in a COW storage tier, then after a certain duration, all files in the COW storage tier will be deduped. After a file is deduped, the file becomes a sparse file where essentially all of the file's storage space is reclaimed while all of the file's attributes, including its size, remain.
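One way to leave such a sparse placeholder on a POSIX file system is sketched below as an assumption rather than the claimed mechanism: truncating the data away and then extending the file back to its original length leaves a hole, so no data blocks remain allocated, while the logical size and timestamps are preserved:

```python
import os


def make_sparse(path):
    """Reclaim a deduplicated file's storage while keeping its size and
    timestamps: truncating to zero frees the data blocks, and truncating
    back to the original length leaves a hole of the same logical size."""
    st = os.stat(path)
    with open(path, "r+b") as f:
        f.truncate(0)
        f.truncate(st.st_size)
    os.utime(path, (st.st_atime, st.st_mtime))  # restore access/modify times
```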
A background deduplication process typically is run periodically within the file aggregation appliance to perform the deduplication. An exemplary deduplication process for a COW storage tier is as follows:
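Purely as an illustrative sketch of such a scan (the dedupe_file and is_deduplicated helpers stand in for appliance-specific operations and are not the claimed steps):

```python
import os


def dedupe_cow_tier(cow_tier_roots, dedupe_file, is_deduplicated):
    """Walk every file server path in the COW storage tier and dedupe
    each file that has not been deduplicated yet."""
    for root in cow_tier_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if not is_deduplicated(path):
                    dedupe_file(path)
```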
An exemplary process to dedupe a single file (called from the deduplication process for the namespace) is as follows:
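Continuing the illustrative sketches above (sha1_of_file and make_sparse are the hypothetical helpers shown earlier; mirror_dir and set_metadata are likewise assumptions standing in for the mirror server and the metadata store), deduplicating a single file might look like:

```python
import os
import shutil


def dedupe_file(path, mirror_dir, set_metadata):
    """Illustrative single-file dedupe: ensure the mirror server holds one
    copy of the contents keyed by its sha1 digest, record that digest in
    the file's metadata, then turn the original into a sparse placeholder."""
    digest = sha1_of_file(path)                     # hash of the file contents
    mirror_copy = os.path.join(mirror_dir, digest)  # mirror copy keyed by hash
    if not os.path.exists(mirror_copy):
        shutil.copyfile(path, mirror_copy)          # first and only write of this mirror copy
    set_metadata(path, sha1=digest)                 # associate the sparse file with its mirror
    make_sparse(path)                               # reclaim the storage space
```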
When a file in a COW storage tier is opened, the open request is actually sent to the MFM that manages the COW storage tier. An exemplary process to open a file is as follows:
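As a rough sketch only, one possible shape of such open handling is shown below; the metadata layout, the "not ready" marking, and the point at which the background fill is started are assumptions rather than the claimed steps:

```python
def handle_open(path, metadata, mirror_lookup, start_background_fill):
    """Illustrative open handling: a non-deduplicated file is opened
    directly; for a deduplicated (sparse) file, the mirror copy is located
    by hash and a background task begins refilling the contents, with the
    handle marked 'not ready' until that completes."""
    meta = metadata(path)
    if not meta.get("deduplicated"):
        return {"cow_handle": open(path, "r+b"), "state": "ready"}
    mirror_handle = mirror_lookup(meta["sha1"])       # find the mirror copy by hash
    handle = {"cow_handle": open(path, "r+b"),
              "mirror_handle": mirror_handle,
              "state": "not_ready"}                   # contents not yet restored
    start_background_fill(handle)                     # copy data back from the mirror
    return handle
```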
When a file request is sent to the MFM, it includes a COW file handle. Exemplary steps for handling a file identified by the COW file handle are as follows:
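A sketch of dispatching such a request along the branches summarized earlier is shown below; the handle and request layouts, the retry response for the "not ready" state, and the forward fallback are assumptions, not the claimed handling:

```python
def handle_request(handle, request, forward):
    """Illustrative dispatch of a file request carrying a COW file handle."""
    if handle["state"] == "not_ready":
        return {"status": "retry"}                    # assumed response while contents are restored
    if handle["state"] == "ready_with_error":
        return {"status": "error"}
    if request["op"] == "read":
        src = handle.get("mirror_handle") or handle["cow_handle"]
        src.seek(request["offset"])                   # deduped reads come from the mirror copy
        return {"status": "ok", "data": src.read(request["length"])}
    if request["op"] == "write":
        f = handle["cow_handle"]                      # writes go to the file in the COW tier
        f.seek(request["offset"])
        f.write(request["data"])
        return {"status": "ok"}
    return forward(request)                           # otherwise send to the virtualization appliance
```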
As more mirror file copies are added into the mirror server, the past mirror file copies will need to be purged from the mirror server or the mirror server will eventually run out of storage space. An exemplary process to purge past mirror copies from the mirror server is as follows:
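As an illustration of such a purge pass (the suspend_dedup, resume_dedup, and get_sha1 callables are hypothetical stand-ins for appliance operations, and mirror copies are assumed to be stored under their sha1 values):

```python
import os


def purge_unused_mirrors(cow_tier_roots, mirror_dir, get_sha1, suspend_dedup, resume_dedup):
    """Illustrative purge pass: while deduplication is suspended, collect
    the hashes still referenced by files in the COW tier, then delete any
    mirror copy whose hash is not in that in-use set."""
    suspend_dedup()
    try:
        in_use = set()
        for root in cow_tier_roots:
            for dirpath, _dirnames, filenames in os.walk(root):
                for name in filenames:
                    digest = get_sha1(os.path.join(dirpath, name))
                    if digest:
                        in_use.add(digest)
        for entry in os.listdir(mirror_dir):          # mirror copies keyed by their sha1
            if entry not in in_use:
                os.remove(os.path.join(mirror_dir, entry))
    finally:
        resume_dedup()
```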
Some enterprises or locations may not have multiple storage tiers available to set up a copy-on-write storage tier, or may not have enough available storage in an existing tier to hold the large number of mp3 and image files that a storage policy would dictate be stored on the copy-on-write storage tier. Moreover, a new storage tier is just that: another storage tier to create and manage.
Therefore, an alternative embodiment removes the restriction that the copy-on-write storage tier is a separate and real physical storage tier. The copy-on-write storage tier may just be some part of another storage tier, such as tier-1 or tier-2 storage, thus becoming a virtual storage tier. Rather than copying files to an actual storage tier, files could be marked as a part of the virtual storage tier by virtue of a metadata flag, hereafter referred to as the COW flag. If the COW flag is false, the file is just a part of the storage tier the file resides within. If the COW flag is true, the file is not part of the storage tier the file resides within. Rather, the file is part of the virtual copy-on-write storage tier.
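A minimal sketch of such per-file metadata, assuming a simple record kept by the MFM (the field names are illustrative only):

```python
from dataclasses import dataclass


@dataclass
class FileMetadata:
    """Hypothetical per-file metadata for the virtual COW storage tier."""
    tier: str                # physical tier the file resides in (e.g. "tier-1")
    cow_flag: bool = False   # True: file belongs to the virtual COW storage tier
    sha1: str = ""           # digest of the contents once the file is deduplicated


def effective_tier(meta: FileMetadata) -> str:
    """A file with the COW flag set is treated as part of the virtual COW
    tier rather than the physical tier it happens to reside in."""
    return "virtual-cow" if meta.cow_flag else meta.tier
```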
Some advantages of this approach are that the files need not be copied to a physical tier of storage first, before deduplication. Furthermore, the IT administrator continues to just manage a single tier (or the same number of tiers as they were managing previously).
In addition to these advantages, all of the advantages of a physically separate COW tier discussed above generally continue to hold: deduplication is achieved with acceptable performance; deduplication can be performed across more file systems to achieve higher efficiency; and the inconvenience experienced by end-users due to the performance overhead of deduplication is reduced by applying a storage policy that dedupes only a selected set of files, while storage utilization is still reduced by eliminating the storage of identical file copies.
As before, every file within the virtual copy-on-write storage tier will eventually be deduped, regardless of whether any file in the virtual storage tier has identical contents. This is in contrast with typical deduplication, where only files with identical contents are deduped.
As above, a set of storage policies is created that selects a set of files from the synthetic namespace to be migrated to the virtual COW storage tier. If a file already resides in the tier with which the virtual COW storage tier co-resides, then no actual migration is performed; the COW flag within the metadata, indicating that the file has been migrated to the virtual COW storage tier, is simply set. If the file resides on a different storage tier than the virtual COW storage tier, then a physical migration is performed to the COW storage tier and, again, the COW flag within the metadata indicating that the file has been migrated to the virtual COW storage tier is set.
Alternatively, there may be a single virtual COW storage tier for all physical storage tiers within the namespace. In this case, when a storage policy indicates that a file should be migrated to the virtual COW storage tier, no physical migration is ever performed; only the COW flag within the metadata, indicating that the file has been migrated to the virtual COW storage tier, is set. In this way, there generally is no need to select a set of file servers to be in the COW storage tier.
There is still the need to select one of the file servers to act as a mirror server.
Once a file is stored in the virtual COW storage tier, the file will eventually be deduped. In other words, if there is no update made to any files in a virtual COW storage tier, then after a certain duration, all files in the virtual COW storage tier will be deduped. After a file is deduped, the file becomes a sparse file where all of the file's storage space is reclaimed while all of the file's attributes, including its size, remain. Since the file just resides within a regular storage tier, the storage space that is reclaimed is the valuable tier storage space the file used to occupy.
As above, a background deduplication process typically is run periodically within the MFM to perform the deduplication. An exemplary deduplication process for a virtual COW storage tier is as follows:
An exemplary process to dedupe a single file (as called by the deduplication process above) is essentially unchanged from the process described above. An exemplary process to dedupe a single file is as follows:
When a file is opened, the open request is actually sent to an MFM that manages the partition of the namespace. An exemplary process to open a file is as follows:
When a file request is sent to the MFM, it must include a file handle. Exemplary steps for handling a file are as follows:
As more mirror file copies are added into the mirror server, the past mirror file copies will need to be purged from the mirror server or the mirror server will eventually run out of storage space. An exemplary process to purge past mirror copies from the mirror server is as follows:
It should be noted that the in-use mirror list in an actual embodiment may be implemented as a hash table, a binary tree, or another data structure commonly used by those skilled in the art to achieve acceptable find performance.
As described here, it is still possible that the mirror server completely fills up (even though past mirror copies are purged). Therefore, the mirror server should be as large as possible, to accommodate at least one copy of all files that can exist in the COW storage tier. Otherwise, the mirror server may run out of space, and further deduplication will not be possible.
The related application entitled Remote File Virtualization Data Mirroring discusses a mechanism to purge mirror copies from the mirror server, in which any mirror copy can be purged at any given time, since an authoritative copy exists elsewhere. Such purging of in-use mirror copies generally cannot be used in embodiments of the present invention. This is because a file that has been deduped in the COW storage tier exists only as a sparse file (with no data in the file) and as a mirror copy; the mirror copy is actually the authoritative copy of the data contents of the deduped file. An in-use mirror copy is not purged because, among other things, it would be difficult to locate and restore the contents of all the COW files that share the same mirror copy.
It should be noted that file deduplication as discussed herein may be implemented using file switches of the types described above and in the provisional patent application referred to by Attorney Docket No. 3193/114. It should also be noted that embodiments of the present invention may incorporate, utilize, supplement, or be combined with various features described in one or more of the other referenced patent applications.
It should be noted that terms such as “client,” “server,” “switch,” and “node” may be used herein to describe devices that may be used in certain embodiments of the present invention and should not be construed to limit the present invention to any particular device type unless the context otherwise requires. Thus, a device may include, without limitation, a bridge, router, bridge-router (brouter), switch, node, server, computer, appliance, or other type of device. Such devices typically include one or more network interfaces for communicating over a communication network and a processor (e.g., a microprocessor with memory and other peripherals and/or application-specific hardware) configured accordingly to perform device functions. Communication networks generally may include public and/or private networks; may include local-area, wide-area, metropolitan-area, storage, and/or other types of networks; and may employ communication technologies including, but in no way limited to, analog technologies, digital technologies, optical technologies, wireless technologies (e.g., Bluetooth), networking technologies, and internetworking technologies.
It should also be noted that devices may use communication protocols and messages (e.g., messages created, transmitted, received, stored, and/or processed by the device), and such messages may be conveyed by a communication network or medium. Unless the context otherwise requires, the present invention should not be construed as being limited to any particular communication message type, communication message format, or communication protocol. Thus, a communication message generally may include, without limitation, a frame, packet, datagram, user datagram, cell, or other type of communication message.
It should also be noted that logic flows may be described herein to demonstrate various aspects of the invention, and should not be construed to limit the present invention to any particular logic flow or logic implementation. The described logic may be partitioned into different logic blocks (e.g., programs, modules, functions, or subroutines) without changing the overall results or otherwise departing from the true scope of the invention. Often times, logic elements may be added, modified, omitted, performed in a different order, or implemented using different logic constructs (e.g., logic gates, looping primitives, conditional logic, and other logic constructs) without changing the overall results or otherwise departing from the true scope of the invention.
The present invention may be embodied in many different forms, including, but in no way limited to, computer program logic for use with a processor (e.g., a microprocessor, microcontroller, digital signal processor, or general purpose computer), programmable logic for use with a programmable logic device (e.g., a Field Programmable Gate Array (FPGA) or other PLD), discrete components, integrated circuitry (e.g., an Application Specific Integrated Circuit (ASIC)), or any other means including any combination thereof. In a typical embodiment of the present invention, predominantly all of the described logic is implemented as a set of computer program instructions that is converted into a computer executable form, stored as such in a computer readable medium, and executed by a microprocessor under the control of an operating system.
Computer program logic implementing all or part of the functionality previously described herein may be embodied in various forms, including, but in no way limited to, a source code form, a computer executable form, and various intermediate forms (e.g., forms generated by an assembler, compiler, linker, or locator). Source code may include a series of computer program instructions implemented in any of various programming languages (e.g., an object code, an assembly language, or a high-level language such as Fortran, C, C++, JAVA, or HTML) for use with various operating systems or operating environments. The source code may define and use various data structures and communication messages. The source code may be in a computer executable form (e.g., via an interpreter), or the source code may be converted (e.g., via a translator, assembler, or compiler) into a computer executable form.
The computer program may be fixed in any form (e.g., source code form, computer executable form, or an intermediate form) either permanently or transitorily in a tangible storage medium, such as a semiconductor memory device (e.g., a RAM, ROM, PROM, EEPROM, or Flash-Programmable RAM), a magnetic memory device (e.g., a diskette or fixed disk), an optical memory device (e.g., a CD-ROM), a PC card (e.g., PCMCIA card), or other memory device. The computer program may be fixed in any form in a signal that is transmittable to a computer using any of various communication technologies, including, but in no way limited to, analog technologies, digital technologies, optical technologies, wireless technologies (e.g., Bluetooth), networking technologies, and internetworking technologies. The computer program may be distributed in any form as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink wrapped software), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the communication system (e.g., the Internet or World Wide Web).
Hardware logic (including programmable logic for use with a programmable logic device) implementing all or part of the functionality previously described herein may be designed using traditional manual methods, or may be designed, captured, simulated, or documented electronically using various tools, such as Computer Aided Design (CAD), a hardware description language (e.g., VHDL or AHDL), or a PLD programming language (e.g., PALASM, ABEL, or CUPL).
Programmable logic may be fixed either permanently or transitorily in a tangible storage medium, such as a semiconductor memory device (e.g., a RAM, ROM, PROM, EEPROM, or Flash-Programmable RAM), a magnetic memory device (e.g., a diskette or fixed disk), an optical memory device (e.g., a CD-ROM), or other memory device. The programmable logic may be fixed in a signal that is transmittable to a computer using any of various communication technologies, including, but in no way limited to, analog technologies, digital technologies, optical technologies, wireless technologies (e.g., Bluetooth), networking technologies, and internetworking technologies. The programmable logic may be distributed as a removable storage medium with accompanying printed or electronic documentation (e.g., shrink wrapped software), preloaded with a computer system (e.g., on system ROM or fixed disk), or distributed from a server or electronic bulletin board over the communication system (e.g., the Internet or World Wide Web).
The present invention may be embodied in other specific forms without departing from the true scope of the invention. Any references to the “invention” are intended to refer to exemplary embodiments of the invention and should not be construed to refer to all embodiments of the invention unless the context otherwise requires. The described embodiments are to be considered in all respects only as illustrative and not restrictive.
This patent application claims priority from U.S. Provisional Patent Application No. 60/988,269 entitled FILE DEDUPLICATION USING COPY-ON-WRITE STORAGE TIERS filed on Nov. 15, 2007 (Attorney Docket No. 3193/125) and also claims priority from U.S. Provisional Patent Application No. 60/988,306 entitled FILE DEDUPLICATION USING A VIRTUAL COPY-ON-WRITE STORAGE TIER filed on Nov. 15, 2007 (Attorney Docket No. 3193/126). This patent application also may be related to one or more of the following patent applications: U.S. Provisional Patent Application No. 60/923,765 entitled NETWORK FILE MANAGEMENT SYSTEMS, APPARATUS, AND METHODS filed on Apr. 16, 2007 (Attorney Docket No. 3193/114). U.S. Provisional Patent Application No. 60/940,104 entitled REMOTE FILE VIRTUALIZATION filed on May 25, 2007 (Attorney Docket No. 3193/116). U.S. Provisional Patent Application No. 60/987,161 entitled REMOTE FILE VIRTUALIZATION METADATA MIRRORING filed Nov. 12, 2007 (Attorney Docket No. 3193/117). U.S. Provisional Patent Application No. 60/987,165 entitled REMOTE FILE VIRTUALIZATION DATA MIRRORING filed Nov. 12, 2007 (Attorney Docket No. 3193/118). U.S. Provisional Patent Application No. 60/987,170 entitled REMOTE FILE VIRTUALIZATION WITH NO EDGE SERVERS filed Nov. 12, 2007 (Attorney Docket No. 3193/119). U.S. Provisional Patent Application No. 60/987,174 entitled LOAD SHARING CLUSTER FILE SYSTEM filed Nov. 12, 2007 (Attorney Docket No. 3193/120). U.S. Provisional Patent Application No. 60/987,206 entitled NON-DISRUPTIVE FILE MIGRATION filed Nov. 12, 2007 (Attorney Docket No. 3193/121). U.S. Provisional Patent Application No. 60/987,197 entitled HOTSPOT MITIGATION IN LOAD SHARING CLUSTER FILE SYSTEMS filed Nov. 12, 2007 (Attorney Docket No. 3193/122). U.S. Provisional Patent Application No. 60/987,194 entitled ON DEMAND FILE VIRTUALIZATION FOR SERVER CONFIGURATION MANAGEMENT WITH LIMITED INTERRUPTION filed Nov. 12, 2007 (Attorney Docket No. 3193/123). U.S. Provisional Patent Application No. 60/987,181 entitled FILE DEDUPLICATION USING STORAGE TIERS filed Nov. 12, 2007 (Attorney Docket No. 3193/124). U.S. patent application Ser. No. 12/104,197 entitled FILE AGGREGATION IN A SWITCHED FILE SYSTEM filed Apr. 16, 2008 (Attorney Docket No. 3193/129). U.S. patent application Ser. No. 12/103,989 entitled FILE AGGREGATION IN A SWITCHED FILE SYSTEM filed Apr. 16, 2008 (Attorney Docket No. 3193/130). U.S. patent application Ser. No. 12/126,129 entitled REMOTE FILE VIRTUALIZATION IN A SWITCHED FILE SYSTEM filed May 23, 2008 (Attorney Docket No. 3193/131). All of the above-referenced patent applications are hereby incorporated herein by reference in their entireties.