Data transfer techniques within data storage devices, such as network attached storage performing data migration

Information

  • Patent Grant
  • 10547678
  • Patent Number
    10,547,678
  • Date Filed
    Wednesday, December 9, 2015
  • Date Issued
    Tuesday, January 28, 2020
Abstract
A stand-alone, network accessible data storage device, such as a filer or NAS device, is capable of transferring data objects based on portions of the data objects. The device transfers portions of files, folders, and other data objects from a data store within the device to external secondary storage based on certain criteria, such as time-based criteria, age-based criteria, and so on. A portion may be one or more blocks of a data object, or one or more chunks of a data object, or other segments that combine to form or store a data object. For example, the device identifies one or more blocks of a data object that satisfy a certain criteria, and migrates the identified blocks to external storage, thereby freeing up storage space within the device. The device may determine that a certain number of blocks of a file have not been modified or called by a file system in a certain time period, and migrate these blocks to secondary storage.
Description
BACKGROUND

Network attached storage (NAS) often refers to a computing system, attached to a network, that provides file-based data storage services to other devices on the network. A NAS system, or NAS device, may include a file system (e.g., under Microsoft Windows) that manages the data storage services, but is generally controlled by other resources via an IP address or other communication protocol. A NAS device may also include an operating system, although the operating system is often configured only to facilitate operations performed by the NAS system. Mainly, a NAS device includes one or more redundantly arranged hard disks, such as RAID arrays. A NAS device works with various file-based and/or communication protocols, such as NFS (Network File System) for UNIX or LINUX systems, SMB/CIFS (Server Message Block/Common Internet File System) for Windows systems, or iSCSI (Internet SCSI) for IP communications.


NAS devices provide a few similar functionalities to Storage Area Networks (SANs), although typical NAS devices only facilitate file level storage. Some hybrid systems exist, which provide both NAS and SAN functionalities. However, in these hybrid systems, such as Openfiler on LINUX, the NAS device serves the SAN device at the file level, and not at a file system level, such as at the individual file level. For example, the assignee's U.S. Pat. No. 7,546,324, entitled Systems and Methods for Performing Storage Operations Using Network Attached Storage, describes how individual files in a NAS device can be written to secondary storage, and are replaced in the NAS device with a stub having a pointer to the secondary storage location where the file now resides.


A NAS device may provide centralized storage to client computers on a network, but may also assist in load balancing and fault tolerance for resources such as email and/or web server systems. Additionally, NAS devices are generally smaller and easy to install to a network.


NAS device performance generally depends on traffic and the speed of the traffic on the attached network, as well as the capacity of a cache memory on the NAS device. Because a NAS device supports multiple protocols and contains reduced processing and operating systems, its performance may suffer when many users or many operations attempt to utilize the NAS device. The contained hardware intrinsically limits a typical NAS device, because it is self-contained and self-supported. For example, the capacity of its local memory may limit a typical NAS device's ability to provide data storage to a network, among other problems.


The need exists for a system that overcomes the above problems, as well as one that provides additional benefits. Overall, the examples herein of some prior or related systems and their associated limitations are intended to be illustrative and not exclusive. Other limitations of existing or prior systems will become apparent to those of skill in the art upon reading the following Detailed Description.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a block diagram illustrating components of a data stream utilized by a suitable data storage system.



FIG. 2 is a block diagram illustrating an example of a data storage system.



FIG. 3 is a block diagram illustrating an example of components of a server used in data storage operations.



FIG. 4 is a block diagram illustrating a NAS device within a networked computing system.



FIG. 5 is a block diagram illustrating the components of a NAS device configured to perform data migration.



FIGS. 6A and 6B are schematic diagrams illustrating a data store before and after a block-based data migration, respectively.



FIG. 7 is a flow diagram illustrating a routine for performing block-level data migration in a NAS device.



FIG. 8 is a flow diagram illustrating a routine for performing chunk-level data migration in a NAS device.



FIG. 9 is a flow diagram illustrating a routine for block-based or chunk-based data restoration and modification via a NAS device.





DETAILED DESCRIPTION

Overview


Described in detail herein is a system and method that transfers or migrates data objects within a stand-alone network storage device, such as a filer or network-attached storage (NAS) device. In some examples, a NAS device transfers segments, portions, increments, or proper subsets of data objects stored in local memory of the NAS device. The NAS device may transfer portions of files, folders, and other data objects from a cache to secondary storage based on certain criteria, such as time-based criteria, age-based criteria, and so on. A portion may be one or more blocks of a data object, or one or more chunks of a data object, or other data portions that combine to form, store, and/or contain a data object, such as a file.


In some examples, the NAS device performs block-based migration of data. A data migration component within the NAS device identifies one or more blocks of a data object stored in a cache or data storage that satisfy a certain criteria, and migrates the identified blocks. For example, the data migration component may determine that a certain number of blocks of a file have not been modified or called by a file system within a certain time period, and migrate these blocks to secondary storage. The data migration component then maintains the other blocks of the file in primary storage. In some cases, the data migration component automatically migrates data without requiring user input. Additionally, the migration may be transparent to a user.


In some examples, the NAS device performs chunk-based migration of data. A chunk is, for example, a group or set of blocks. One or more chunks may comprise a portion of a file, folder, or other data object. The data migration component identifies one or more chunks of a data object that satisfy a certain criteria, and migrates the identified chunks. For example, the data migration component may determine that a certain number of chunks of a file have not been modified or called by a file system in a certain time period, and migrate these chunks to secondary storage. The system then maintains the other chunks of the file in the cache or data storage of the NAS device.
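The relationship between blocks and chunks described above can be illustrated with a minimal sketch. The function names `chunk_blocks` and `stale_chunks` are hypothetical, fixed-size chunks are an assumption made only for illustration, and treating a chunk as unmodified only when every block in it is unmodified is likewise an assumption; timestamps are plain integers for brevity.

```python
def chunk_blocks(block_numbers, chunk_size):
    """Group consecutive block numbers into chunks (the last chunk may be shorter)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        block_numbers[i:i + chunk_size]
        for i in range(0, len(block_numbers), chunk_size)
    ]


def stale_chunks(chunks, last_modified, cutoff):
    """Return chunks in which every block was last modified before the cutoff.

    Assumption: one recently modified block keeps its whole chunk in place.
    """
    return [c for c in chunks if all(last_modified[b] < cutoff for b in c)]


# Six blocks of one file, grouped two per chunk; only block 3 changed recently.
chunks = chunk_blocks(list(range(6)), 2)
last_modified = {0: 10, 1: 10, 2: 10, 3: 50, 4: 10, 5: 10}
to_migrate = stale_chunks(chunks, last_modified, cutoff=30)
```

In this sketch the first and third chunks would be candidates for migration, while the second chunk remains in the cache or data storage of the NAS device.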


Network-attached storage, such as a filer or NAS device, and associated data migration components and processes, will now be described with respect to various examples. The following description provides specific details for a thorough understanding of, and enabling description for, these examples of the system. However, one skilled in the art will understand that the system may be practiced without these details. In other instances, well-known structures and functions have not been shown or described in detail to avoid unnecessarily obscuring the description of the examples of the system.


The terminology used in the description presented below is intended to be interpreted in its broadest reasonable manner, even though it is being used in conjunction with a detailed description of certain specific examples of the system. Certain terms may even be emphasized below; however, any terminology intended to be interpreted in any restricted manner will be overtly and specifically defined as such in this Detailed Description section. A suitable data storage system will first be described, followed by a description of suitable stand-alone devices. Following that, various data migration and data recovery processes will be discussed.


Suitable System


Referring to FIG. 1, a block diagram illustrating components of a data stream utilized by a suitable data storage system, such as a system that performs network attached storage, is shown. The stream 110 may include a client 111, a media agent 112, and a secondary storage device 113. For example, in storage operations, the system may store, receive and/or prepare data, such as blocks or chunks, to be stored, copied or backed up at a server or client 111. The system may then transfer the data to be stored to media agent 112, which may then refer to storage policies, schedule policies, and/or retention policies (and other policies) to choose a secondary storage device 113, such as a NAS device that receives data and transfers data to attached secondary storage devices. The media agent 112 may include or be associated with a NAS device, to be discussed herein.


The secondary storage device 113 receives the data from the media agent 112 and stores the data as a secondary copy, such as a backup copy. Secondary storage devices may be magnetic tapes, optical disks, USB and other similar media, disk and tape drives, and so on. Of course, the system may employ other configurations of stream components not shown in the Figure.


Referring to FIG. 2, a block diagram illustrating an example of a data storage system 200 is shown. Data storage systems may contain some or all of the following components, depending on the needs of the system. FIG. 2 and the following discussion provide a brief, general description of a suitable computing environment in which the system can be implemented. Although not required, aspects of the system are described in the general context of computer-executable instructions, such as routines executed by a general-purpose computer, e.g., a server computer, wireless device or personal computer. Those skilled in the relevant art will appreciate that the system can be practiced with other communications, data processing, or computer system configurations, including: Internet appliances, network PCs, mini-computers, mainframe computers, and the like. Indeed, the terms “computer,” “host,” and “host computer” are generally used interchangeably herein, and refer to any of the above devices and systems, as well as any data processor.


Aspects of the system can be embodied in a special purpose computer or data processor that is specifically programmed, configured, or constructed to perform one or more of the computer-executable instructions explained in detail herein. Aspects of the system can also be practiced in distributed computing environments where tasks or modules are performed by remote processing devices, which are linked through a communications network, such as a Local Area Network (LAN), Wide Area Network (WAN), Storage Area Network (SAN), Fibre Channel, or the Internet. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.


Aspects of the system may be stored or distributed on computer-readable media, including tangible storage media, such as magnetically or optically readable computer discs, hard-wired or preprogrammed chips (e.g., EEPROM semiconductor chips), nanotechnology memory, biological memory, or other data storage media. Alternatively, computer implemented instructions, data structures, screen displays, and other data under aspects of the system may be distributed over the Internet or over other networks (including wireless networks), on a propagated signal on a propagation medium (e.g., an electromagnetic wave(s), a sound wave, etc.) over a period of time, or they may be provided on any analog or digital network (packet switched, circuit switched, or other scheme). Those skilled in the relevant art will recognize that portions of the system reside on a server computer, while corresponding portions reside on a client computer, and thus, while certain hardware platforms are described herein, aspects of the system are equally applicable to nodes on a network.


For example, the data storage system 200 contains a storage manager 210, one or more clients 111, one or more media agents 112, and one or more storage devices 113. Storage manager 210 controls media agents 112, which may be responsible for transferring data to storage devices 113. Storage manager 210 includes a jobs agent 211, a management agent 212, a database 213, and/or an interface module 214. Storage manager 210 communicates with client(s) 111. One or more clients 111 may access data to be stored by the system from database 222 via a data agent 221. The system uses media agents 112, which contain databases 231, to transfer and store data into storage devices 113. The storage devices 113 may include network attached storage, such as the NAS devices described herein. Client databases 222 may contain data files and other information, while media agent databases may contain indices and other data structures that assist and implement the storage of data into secondary storage devices, for example.


The data storage and recovery system may include software and/or hardware components and modules used in data storage operations. The components may be storage resources that function to copy data during storage operations. The components may perform other storage operations (or storage management operations) other than operations used in data stores. For example, some resources may create, store, retrieve, and/or migrate primary or secondary data copies of data. Additionally, some resources may create indices and other tables relied upon by the data storage system and other data recovery systems. The secondary copies may include snapshot copies and associated indices, but may also include other backup copies such as HSM copies, archive copies, auxiliary copies, and so on. The resources may also perform storage management functions that may communicate information to higher level components, such as global management resources.


In some examples, the system performs storage operations based on storage policies, as mentioned above. For example, a storage policy includes a set of preferences or other criteria to be considered during storage operations. The storage policy may determine or define a storage location and/or set of preferences about how the system transfers data to the location and what processes the system performs on the data before, during, or after the data transfer. In some cases, a storage policy may define a logical bucket in which to transfer, store or copy data from a source to a data store, such as storage media. Storage policies may be stored in storage manager 210, or may be stored in other resources, such as a global manager, a media agent, and so on. Further details regarding storage management and resources for storage management will now be discussed.


Referring to FIG. 3, a block diagram illustrating an example of components of a server used in data storage operations is shown. A server, such as storage manager 210, may communicate with clients 111 to determine data to be copied to storage media. As described above, the storage manager 210 may contain a jobs agent 211, a management agent 212, a database 213, and/or an interface module 214. Jobs agent 211 may manage and control the scheduling of jobs (such as copying data files) from clients 111 to media agents 112. Management agent 212 may control the overall functionality and processes of the data storage system, or may communicate with global managers. Database 213 or another data structure may store storage policies, schedule policies, retention policies, or other information, such as historical storage statistics, storage trend statistics, and so on. Interface module 214 may interact with a user interface, enabling the system to present information to administrators and receive feedback or other input from the administrators or with other components of the system (such as via APIs).


Suitable Storage Devices


Referring to FIG. 4, a block diagram illustrating components of a networked data storage device, such as a filer or NAS device 440, configured to perform data migration within a networked computing system is shown. (While the examples below discuss a NAS device, any architecture or networked data storage device employing the following principles may be used, including a proxy computer coupled to the NAS device). The computing system 400 includes a data storage system 410, such as the tiered data storage system 200. Client computers 420, including computers 422 and 424, are associated with users that generate data to be stored in secondary storage. The client computers 422 and 424 communicate with the data storage system 410 over a network 430, such as a private network such as an Intranet, a public network such as the Internet, and so on. The networked computing system 400 includes network attached storage, such as NAS device 440. The NAS device 440 includes NAS-based storage or memory, such as a cache 444, for storing data received from the network, such as data from client computers 422 and 424. (The term “cache” is used generically herein for any type of storage, and thus the cache 444 can include any type of storage for storing of data files within the NAS device, such as magnetic disk, optical disk, semiconductor memory, or other known types of storage such as magnetic tape or types of storage hereafter developed.) The cache 444 may include an index or other data structure in order to track where data is eventually stored or the index may be stored elsewhere, such as on the proxy computer. The index may include information associating the data with information identifying a secondary storage device that stored the data, or other information. 
For example, as described in detail below, the index may include both an indication of which blocks have been written to secondary storage (and where they are stored in secondary storage), and a look up table that maps blocks to individual files stored within the NAS device 440.
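As a rough illustration of such an index, the sketch below keeps the two pieces of information named above: a look up table from blocks to files, and a record of which blocks have been written to secondary storage and where. The class `MigrationIndex`, its method names, and the location string are hypothetical and are not taken from any actual product.

```python
class MigrationIndex:
    """Hypothetical index: maps blocks to files and tracks migrated blocks."""

    def __init__(self):
        self.block_to_file = {}  # block number -> file name
        self.migrated = {}       # block number -> location in secondary storage

    def map_block(self, block, file_name):
        """Record that `block` holds data belonging to `file_name`."""
        self.block_to_file[block] = file_name

    def mark_migrated(self, block, location):
        """Record that a known block now resides at `location` in secondary storage."""
        if block not in self.block_to_file:
            raise KeyError(f"unknown block {block}")
        self.migrated[block] = location

    def locate(self, block):
        """Return ('secondary', location) for migrated blocks, else ('cache', None)."""
        if block in self.migrated:
            return ("secondary", self.migrated[block])
        return ("cache", None)

    def blocks_of(self, file_name):
        """Return the sorted block numbers that belong to `file_name`."""
        return sorted(b for b, f in self.block_to_file.items() if f == file_name)


index = MigrationIndex()
for block in (101, 102, 103):
    index.map_block(block, "File1")
index.mark_migrated(103, "secondary-456/offset-0")  # illustrative location label
```

A single file may therefore have some blocks resolved to the cache and others resolved to secondary storage.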


The NAS device 440 also includes a data migration component 442 that performs data migration on data stored in the cache 444. While shown in FIG. 4 as being within the NAS device 440, the data migration component 442 may be on a proxy computer coupled to the NAS device. In some cases, the data migration component 442 is a device driver or agent that performs block-level data migration of data stored in the cache. In some cases, the data migration component 442 performs chunk-based data migration of data stored in the cache. Additionally, in some cases the data migration component 442 may perform file-based data migration, or a combination of two or more types of data migration, depending on the needs of the system. During data migration, the NAS device transfers data from the cache of the device to one or more secondary storage devices 450 located on the network 430, such as magnetic tapes 452, optical disks 454, or other secondary storage 456. The NAS device may include various data storage components when identifying and transferring data from the cache 444 to the secondary storage devices 450. These components will now be discussed.


Referring to FIG. 5, a block diagram illustrating the components of a NAS device 440 configured to perform data migration is shown. In addition to a data migration component 442 and cache 444, the NAS device 440 may include an input component 510, a data reception component 520, a file system 530, and an operating system 540. The input component 510 may receive various inputs, such as via an iSCSI protocol. That is, the NAS device may receive commands or control data from a data storage system 410 over IP channels. For example, the data storage system 410 may send commands to a NAS device's IP address in order to provide instructions to the NAS device. The data reception component 520 may receive data to be stored over multiple protocols, such as NFS, CIFS, and so on. For example, a UNIX based system may send data to be stored to the NAS device over a NFS communication channel, while a Windows based system may send data to be stored to the NAS device over a CIFS communication channel.


Additionally, the NAS device 440 may include a number of data storage resources, such as a data storage engine 560 to direct reads from and writes to the data store 444, and one or more media agents 570. The media agents 570 may be similar to the media agents 112 described herein. In some cases, the NAS device 440 may include two or more media agents 570, such as multiple media agents 570 externally attached to the NAS device 440. The NAS device 440 may expand its data storage capabilities by adding media agents 570, as well as other components.


As discussed herein, the NAS device 440 includes a data migration component 442 capable of transferring some or all of the data stored in the cache 444. In some examples, the data migration component 442 requests and/or receives information from a callback layer 550, or other intermediate component, within the NAS device 440. Briefly, the callback layer 550 intercepts calls for data between the file system 530 and the cache 444, and tracks these calls to provide information to the data migration component 442 regarding when data is changed, updated, and/or accessed by the file system 530. Further details regarding the callback layer 550 and other intermediate components will now be discussed.


In some examples, the NAS device monitors the transfer of data from the file system 530 to the cache 444 via the callback layer 550. The callback layer 550 not only facilitates the migration of data portions from data storage on the NAS device to secondary storage, but also facilitates read back or callback of that data from the secondary storage back to the NAS device. While described at times herein as a device driver or agent, the callback layer 550 may be a layer, or additional file system, that resides on top of the file system 530. The callback layer 550 may intercept data requests from the file system 530, in order to identify, track and/or monitor data requested by the file system 530 and store information associated with these requests in a data structure, such as a bitmap similar to the one shown in Table 1. Thus, the callback layer stores information identifying when a data portion is accessed by tracking calls from the file system 530 to the cache 444. For example, Table 1 provides entry information that tracks calls to a data store:











TABLE 1

Chunk of File1    Access Time
File1.1           Sep. 5, 2008 @ 12:00
File1.2           Sep. 5, 2008 @ 12:30
File1.3           Sep. 5, 2008 @ 13:30
File1.4           Jun. 4, 2008 @ 12:30









In this example, the file system 530 creates a data object named “File1,” using a chunking component (described herein) to divide the file into four chunks: “File1.1,” “File1.2,” “File1.3,” and “File1.4.” The file system 530 stores the four chunks to the cache 444 on Jun. 4, 2008. According to the table, the file system can determine that it has not accessed chunk File1.4 since its creation, and most recently accessed the other chunks on Sep. 5, 2008. Of course, Table 1 may include additional, other or different information, such as information identifying a location of the chunks, information identifying the type of media storing the chunks, information identifying the blocks within the chunk, and/or other information or metadata.
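The tracking behavior described for Table 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the callback layer 550: the class `CallbackLayer`, the `on_access` hook, and the `stale_chunks` helper are hypothetical names, and a dictionary stands in for the bitmap or table.

```python
from datetime import datetime


class CallbackLayer:
    """Hypothetical stand-in for a callback layer: last access time per chunk."""

    def __init__(self):
        # Maps chunk name -> time of the most recent intercepted call.
        self.access_times = {}

    def on_access(self, chunk, when):
        """Record an intercepted file system call for `chunk` at time `when`."""
        self.access_times[chunk] = when

    def stale_chunks(self, cutoff):
        """Return, sorted by name, chunks last accessed before `cutoff`."""
        return sorted(c for c, t in self.access_times.items() if t < cutoff)


layer = CallbackLayer()

# The four chunks are stored to the cache on Jun. 4, 2008.
created = datetime(2008, 6, 4, 12, 30)
for name in ("File1.1", "File1.2", "File1.3", "File1.4"):
    layer.on_access(name, created)

# Later accesses, matching the entries in Table 1.
layer.on_access("File1.1", datetime(2008, 9, 5, 12, 0))
layer.on_access("File1.2", datetime(2008, 9, 5, 12, 30))
layer.on_access("File1.3", datetime(2008, 9, 5, 13, 30))

# The cutoff of Sep. 1, 2008 is an arbitrary value chosen for illustration.
stale = layer.stale_chunks(datetime(2008, 9, 1))
```

With these entries, only chunk File1.4 has a recorded access time earlier than the cutoff.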


Thus, providing data migration to the NAS device enables the device to provide inexpensive, transparent storage to a networked computing system and to free up storage space by migrating or archiving stale data to other locations, among other benefits. Of course, non-networked computing systems may also store data to the NAS devices described herein. Because the NAS devices described herein can be easily and quickly installed on networks, they provide users, such as network administrators, with a quick and efficient way to expand their storage capacity without incurring the typical costs associated with typical NAS devices that do not perform data migration.


For example, adding a NAS device described herein to an existing networked computing system can provide the computing system with expanded storage capabilities, but can also provide the computing system with other data storage functionality. In some examples, where the NAS device described herein includes a data storage engine (e.g., a common technology engine, or CTE, provided by Commvault Systems, Inc. of Oceanport, N.J.), the NAS device may act as a backup server. For example, such a device may perform various data storage functions normally provided by a backup server, such as single instancing, data classification, mirroring, content indexing, data backup, encryption, compression, and so on. Thus, in some examples, the NAS device described herein acts as a fully functional and independent device that an administrator can attach to a network to perform virtually any data storage function.


Also, in some cases, the NAS device described herein may act to perform fault tolerance in a data storage system. For example, the clustering of NAS devices on a system may provide a higher level of security, because processes on one device can be replicated on another. Thus, attaching two or more of the NAS devices described herein may provide an administrator with the redundancy or security required in some data storage systems.


Data Migration in Storage Devices


As described herein, in some examples, the NAS device leverages block-level or chunk-based data migration in order to provide expanded storage capabilities to a networked computing system.


Block-level migration, or block-based data migration, involves migrating disk blocks from the data store or cache 444 to secondary media, such as secondary storage devices 450. Using block-level migration, the NAS device 440 transfers blocks from the cache that have not been recently accessed to secondary storage, freeing up space on the cache.


As described above, the system can transfer or migrate certain blocks of a data object from one data store to another, such as from a cache in a NAS device to secondary storage. Referring to FIGS. 6A-6B, a schematic diagram illustrating contents of two data stores before and after a block-based data migration is shown. In FIG. 6A, a first data store 610 contains primary copies (i.e., production copies) of two data objects, a first data object 620 and a second data object 630. The first data object comprises blocks A and A′, where blocks A are blocks that satisfy or meet certain storage criteria (such as blocks that have not been modified since creation or not been modified within a certain period of time) and blocks A′ are blocks that do not meet the criteria (such as blocks that have been modified within the certain time period). The second data object comprises blocks B and B′, where blocks B satisfy the criteria and blocks B′ do not meet the criteria.



FIG. 6B depicts the first data store 610 after a block-based data migration of the two data objects 620 and 630. In this example, the system only transfers the data from blocks that satisfy a criteria (blocks A and B) from the first data store 610 to a second data store 640, such as secondary storage 642, 644. The secondary storage may include one or more magnetic tapes, one or more optical disks, and so on. The system maintains data in the remaining blocks (blocks A′ and B′) within the first data store 610.


The system can perform file system data migration at a block level, unlike previous systems that only migrate data at the file level (that is, they have a file-level granularity). By tracking migrated blocks, the system can also restore data at the block level, which may avoid cost and time problems associated with restoring data at the file level.
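The before-and-after arrangement of FIGS. 6A and 6B can be sketched as a partition of blocks between two data stores. The `Block` type and the `partition_blocks` function are hypothetical names introduced for illustration, and the dates are arbitrary; the criterion used here (last modification before a cutoff) is only one of the criteria the description mentions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Block:
    object_id: str      # data object the block belongs to, e.g. "620"
    index: int          # position of the block within the data object
    modified: datetime  # last modification time


def partition_blocks(blocks, cutoff):
    """Split blocks into (migrate, retain) by last-modified time."""
    migrate = [b for b in blocks if b.modified < cutoff]
    retain = [b for b in blocks if b.modified >= cutoff]
    return migrate, retain


old, new = datetime(2008, 1, 1), datetime(2008, 9, 1)
first_store = [
    Block("620", 0, old), Block("620", 1, new),  # blocks A and A'
    Block("630", 0, new), Block("630", 1, old),  # blocks B' and B
]

# After migration, the second store holds blocks A and B; the first keeps A' and B'.
second_store, first_store = partition_blocks(first_store, datetime(2008, 8, 1))
```

Because each block carries the identifier of its data object, a later restore could proceed block by block rather than file by file.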


Referring to FIG. 7, a flow diagram illustrating a routine 700 for performing block-level data migration in a NAS device is shown. In step 710, the NAS device, via the data migration component 442, identifies data blocks within a cache that satisfy a certain criteria. The data migration component 442 may compare some or all of the blocks (or, information associated with the blocks) in the cache with predetermined criteria. The predetermined criteria may be time-based criteria within a storage policy or data retention policy.


In some examples, the data migration component 442 identifies blocks set to be “aged off” from the cache. That is, the data migration component 442 identifies blocks created, changed, or last modified before a certain date and time. For example, the system may review a cache for all data blocks that satisfy a criterion or criteria. The data store may be an electronic mailbox or personal folders (.pst) file for a Microsoft Exchange user, and the criterion may define, for example, all blocks or emails last modified or changed thirty days ago or earlier. The component 442 compares information associated with the blocks, such as metadata associated with the blocks, to the criteria, and identifies all blocks that satisfy the criteria. For example, the component 442 identifies all blocks in the .pst file not modified within the past thirty days. The identified blocks may include all the blocks for some emails and/or a portion of the blocks for other emails. That is, for a given email (or data object), a first portion of the blocks that include the email may satisfy the criteria, while a second portion of the blocks that include the same email may not satisfy the criteria. In other words, a file or a data object can be divided into parts or portions, and only some of the parts or portions change.


To determine which blocks have changed, and when, the NAS device can monitor the activity of its file system 530 via the callback layer 550. The NAS device may store a data structure, such as a bitmap, table, log, and so on within the cache 444 or other memory in the NAS device or elsewhere, and update the data structure whenever the file system calls the cache 444 to access and update or change data blocks within the cache 444. The callback layer 550 traps commands to the cache 444, where each command identifies certain blocks on a disk for access or modification, and writes to the data structure the changed blocks and the time of the change. The data structure may include information such as an identification of changed blocks and a date and a time the blocks were changed. The data structure, which may be a table, bitmap, or group of pointers, such as a snapshot, may also include other information, such as information that maps file names to blocks, information that maps chunks to blocks and/or file names, and so on, and identify when accesses/changes were made. Table 2 provides entry information for tracking the activity of a file system with the “/users” directory:











TABLE 2

Blocks                    Date and Time Modified
/users/blocks1-100        Sep. 8, 2008 @ 14:30
/users/blocks101-105      Sep. 4, 2008 @ 12:23
/users2/blocks106-110     Sep. 4, 2008 @ 11:34
/users3/blocks110-1000    Aug. 5, 2008 @ 10:34









Thus, if a storage policy identified the time Aug. 30, 2008 @ 12:00 as a threshold time criterion, where data modified after that time is to be retained, the system would identify, in step 710, blocks 110-1000 as having satisfied the criterion. Thus, the system, via an intermediate component such as the callback layer 550, can monitor what blocks are requested by a file system, and act accordingly, as described herein.
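The threshold comparison in this example can be worked through with a short sketch. The entries are transcribed from Table 2; the function name `select_for_migration` is hypothetical, and treating a block range modified exactly at the threshold as eligible for migration is an assumption, since the description only states that data modified after the threshold is retained.

```python
from datetime import datetime

# Entries transcribed from Table 2: (block range, date and time modified).
TABLE_2 = [
    ("/users/blocks1-100", datetime(2008, 9, 8, 14, 30)),
    ("/users/blocks101-105", datetime(2008, 9, 4, 12, 23)),
    ("/users2/blocks106-110", datetime(2008, 9, 4, 11, 34)),
    ("/users3/blocks110-1000", datetime(2008, 8, 5, 10, 34)),
]


def select_for_migration(entries, threshold):
    """Return block ranges not modified after the threshold (step 710)."""
    return [blocks for blocks, modified in entries if modified <= threshold]


selected = select_for_migration(TABLE_2, datetime(2008, 8, 30, 12, 0))
```

Only the last entry has a modification time on or before Aug. 30, 2008 @ 12:00, so it is the only block range selected.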


In step 720, the NAS device transfers data within the identified blocks from the cache to a media agent 570, to be stored in a different data store. The system may perform some or all of the processes described with respect to FIGS. 1-3 when transferring the data to the media agent. For example, before transferring data, the system may review a storage policy as described herein to select a media agent, such as media agent 112, based on instructions within the storage policy. In step 725, the system optionally updates an allocation table, such as a file allocation table (FAT) for the file system 530 associated with the NAS device, to indicate the data blocks that no longer contain data and are now free to receive and store data from the file system.


In step 730, via the media agent 570, the NAS device 440 stores data from the blocks to a different data store. In some cases, the NAS device, via the media agent 570, stores the data from the blocks to a secondary storage device, such as a magnetic tape 452 or optical disk 454. For example, the NAS device may store the data from the blocks in secondary copies of the data store, such as a backup copy, an archive copy, and so on.


The NAS device may create, generate, update, and/or include an allocation table (such as a table for the data store) that tracks the transferred data and the data that was not transferred. The table may include information identifying the original data blocks for the data, the name of the data object (e.g., file name), the location of any transferred data blocks (including, e.g., offset information), and so on. For example, Table 3 provides entry information for an example .pst file:











TABLE 3

Name of Data Object           Location of data
Email1                        C:/users/blocks1-100
Email2.1 (body of email)      C:/users/blocks101-120
Email2.2 (attachment)         X:/remov1/blocks1-250
Email3                        X:/remov2/blocks300-500









In the above example, the data for “Email2” is stored in two locations, the cache (C:/) and an off-site data store (X:/). The system maintains the body of the email, recently modified or accessed, at a location within a data store associated with a file system, “C:/users/blocks101-120.” The system stores the attachment, not recently modified or accessed, in a separate data store, “X:/remov1/blocks1-250.” Of course, the table may include other information, fields, or entries not shown. For example, when the system stores data to tape, the table may include tape identification information, tape offset information, and so on.
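As an illustration only, a location table of the kind shown in Table 3 could be modeled as follows. This is a sketch under stated assumptions: the Location record, the LOCATION_TABLE mapping, and the helper functions parts_of and is_fully_local are hypothetical names, and the dotted naming of object parts (Email2.1, Email2.2) is taken from the table itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    store: str        # "C:" for the cache, "X:" for the off-site data store
    path: str         # directory portion of the location, as in Table 3
    first_block: int  # first block of the range
    last_block: int   # last block of the range


# Hypothetical in-memory form of Table 3.
LOCATION_TABLE = {
    "Email1": Location("C:", "/users", 1, 100),
    "Email2.1": Location("C:", "/users", 101, 120),
    "Email2.2": Location("X:", "/remov1", 1, 250),
    "Email3": Location("X:", "/remov2", 300, 500),
}


def parts_of(table, object_name):
    """Return the entries for an object and any of its numbered parts."""
    prefix = object_name + "."
    return {
        name: loc
        for name, loc in table.items()
        if name == object_name or name.startswith(prefix)
    }


def is_fully_local(table, object_name, local_store="C:"):
    """True only if the object is tracked and every part resides in the cache."""
    parts = parts_of(table, object_name)
    return bool(parts) and all(loc.store == local_store for loc in parts.values())
```

In this sketch, parts_of(LOCATION_TABLE, "Email2") returns both the body entry and the attachment entry, and is_fully_local reports that Email2 is only partly resident in the cache.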


Chunked file migration, or chunk-based data migration, involves splitting a data object into two or more portions of the data object, creating an index that tracks the portions, and storing the data object to secondary storage via the two or more portions. Among other things, the chunk-based migration provides for fast and efficient storage of a data object. Additionally, chunk-based migration facilitates fast and efficient recall of a data object, such as the large files described herein. For example, if a user modifies a migrated file, chunk-based migration enables a data restore component to only retrieve and migrate back to secondary storage the chunk containing the modified portion of the file, and not the entire file.


As described above, in some examples the NAS device migrates chunks of data (sets of blocks) that comprise a data object from the cache 444 to another data store. A data object, such as a file, may comprise two or more chunks. A chunk may be a logical division of a data object. For example, a .pst file may include two or more chunks: a first chunk that stores data associated with an index of a user's mailbox, and one or more chunks that store email, attachments, and so on within the user's mailbox. A chunk is a proper subset of all the blocks that contain a file. That is, for a file contained or defined by n blocks, the largest chunk of the file contains at most n−1 blocks.


In some cases, the data migration component 442 may include a chunking component that divides data objects into chunks. The chunking component may receive files to be stored in the cache 444, divide the files into two or more chunks, and store the files as two or more chunks in the cache. The chunking component may update an index that associates information about the files with the chunks of the files, the data blocks of the chunks, and so on.


The chunking component may perform different processes when determining how to divide a data object. For example, the chunking component may include indexing, header, and other identifying information or metadata in a first chunk, and include the payload in other chunks. The chunking component may identify and/or retrieve file format or schema information from an index, FAT, NFS, or other allocation table in the file system to determine where certain chunks of a data object reside (such as the first or last chunk of a large file). The chunking component may follow a rules-based process when dividing a data object. The rules may define a minimum or maximum data size for a chunk, a time of creation for data within a chunk, a type of data within a chunk, and so on.


For example, the chunking component may divide a user mailbox (such as a .pst file) into a number of chunks, based on various rules that assign emails within the mailbox to chunks based on the metadata associated with the emails. The chunking component may place an index of the mailbox in a first chunk and the emails in other chunks. The chunking component may then divide the other chunks based on dates of creation, deletion or reception of the emails, size of the emails, sender of the emails, type of emails, and so on. Thus, as an example, the chunking component may divide a mailbox as follows:

















User1/Chunk1      Index
User1/Chunk2      Sent emails
User1/Chunk3      Received emails
User1/Chunk4      Deleted emails
User1/Chunk5      All Attachments.










Of course, other divisions are possible. Chunks may not necessarily fall within logical divisions. For example, the chunking component may divide a data object based on information or instructions not associated with the data object, such as information about data storage resources, information about a target secondary storage device, historical information about previous divisions, and so on.
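The rules-based division of a mailbox described above can be sketched as follows. This is an illustrative Python example, not the chunking component itself: the RULES list, the item fields ("id", "kind", "folder"), and the chunk_mailbox function are assumptions made for the sketch, and the rule ordering (attachments first, then deleted, sent, and received emails) is one arbitrary choice among many.

```python
# Hypothetical rules, evaluated in order; the first matching rule wins.
RULES = [
    ("Chunk5", lambda item: item["kind"] == "attachment"),
    ("Chunk4", lambda item: item["folder"] == "deleted"),
    ("Chunk2", lambda item: item["folder"] == "sent"),
    ("Chunk3", lambda item: item["folder"] == "inbox"),
]


def chunk_mailbox(user, items):
    """Assign each mailbox item to a chunk and keep the index in the first chunk."""
    index = {}                           # item id -> chunk holding the item
    chunks = {f"{user}/Chunk1": index}   # first chunk stores the mailbox index
    for item in items:
        for name, matches in RULES:
            if matches(item):
                key = f"{user}/{name}"
                chunks.setdefault(key, []).append(item["id"])
                index[item["id"]] = key
                break
        else:
            # No rule applied; a real component would need a fallback rule.
            raise ValueError(f"no chunking rule matches item {item['id']!r}")
    return chunks


mailbox = [
    {"id": "m1", "kind": "email", "folder": "sent"},
    {"id": "m2", "kind": "email", "folder": "inbox"},
    {"id": "m3", "kind": "email", "folder": "deleted"},
    {"id": "a1", "kind": "attachment", "folder": "inbox"},
]
chunks = chunk_mailbox("User1", mailbox)
```

Keeping the index in its own chunk means a later lookup can read that chunk alone to find which other chunk holds a given email or attachment.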


Referring to FIG. 8, a flow diagram illustrating a routine 800 for performing chunk-level data migration in a NAS device is shown. In step 810, the system identifies chunks of data blocks within a data store that satisfy one or more criteria. The data store may store large files (>50 MB), such as databases associated with a file system, SQL databases, Microsoft Exchange mailboxes, virtual machine files, and so on. The system may compare some or all of the chunks (or, information associated with the chunks) of the data store with predetermined and/or dynamic criteria. The predetermined criteria may be time-based criteria within a storage policy or data retention policy. The system may review an index with the chunking component 815 when comparing the chunks with applicable criteria.


In step 820, the NAS device transfers data within the identified chunks from the data store to a media agent, to be stored in a different data store. The NAS device may perform some or all of the processes described with respect to FIGS. 1-3 when transferring the data to the media agent. For example, the NAS device may review a storage policy assigned to the data store and select a media agent based on instructions within the storage policy. In step 825, the system optionally updates an allocation table, such as a file allocation table (FAT) for a file system associated with the NAS device, to indicate the data blocks that no longer contain data and are now free to receive and store data from the file system.


In step 830, via one or more media agents 570, the NAS device 440 stores the data from the chunks to a different data store. In some cases, the system, via the media agent, stores the data to a secondary storage device, such as a magnetic tape or optical disk. For example, the system may store the data in secondary copies of the data store, such as a backup copy, an archive copy, and so on.
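The chunk-level routine can be summarized in a short Python sketch. It assumes, for illustration only, that the data store and the secondary storage are plain dictionaries, that a time-based criterion is a single threshold, and that a location index records where each chunk resides; migrate_chunks and all variable names are hypothetical.

```python
from datetime import datetime


def migrate_chunks(data_store, last_modified, threshold, secondary, location_index):
    """Move chunks last modified before the threshold out of the data store."""
    moved = []
    for chunk_id in sorted(data_store):          # sorted() copies the keys first
        if last_modified[chunk_id] < threshold:  # identify chunks meeting the criterion
            secondary[chunk_id] = data_store.pop(chunk_id)  # transfer and free space
            location_index[chunk_id] = "secondary"          # record the new location
            moved.append(chunk_id)
    return moved


data_store = {
    "User1/Chunk1": b"index",
    "User1/Chunk2": b"sent emails",
    "User1/Chunk5": b"attachments",
}
last_modified = {
    "User1/Chunk1": datetime(2008, 9, 8, 14, 30),
    "User1/Chunk2": datetime(2008, 9, 4, 12, 23),
    "User1/Chunk5": datetime(2008, 8, 5, 10, 34),
}
location_index = {chunk_id: "cache" for chunk_id in data_store}
secondary = {}

moved = migrate_chunks(
    data_store, last_modified, datetime(2008, 8, 30, 12, 0), secondary, location_index
)
```

In this example only the attachments chunk is older than the threshold, so it alone is moved, and the index and sent-email chunks remain in the data store.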


Data Recovery in Storage Devices


A data storage system, using a NAS device leveraging the block-based or chunk-based data migration processes described herein, is able to restore portions of files instead of entire files, such as individual blocks or chunks that comprise portions of the files. Referring to FIG. 9, a flow diagram illustrating a routine 900 for block-based or chunk-based data restoration and modification is shown. In step 910, the system, via a restore or data recovery component, receives a request to modify a file located in a cache of a NAS device or in secondary storage in communication with a NAS device. For example, a user submits a request to a file system to provide an old copy of a large PowerPoint presentation so the user can modify a picture located on slide 5 of 300 of the presentation.


In step 920, the system identifies one or more blocks or one or more chunks associated with the request. For example, the callback layer 550 of the system looks to a table similar to Table 3, identifies blocks associated with slide 5 of the presentation and blocks associated with a table of contents of the presentation, and contacts a NAS device that stored or migrated the blocks on secondary storage.


In step 930, the system, via the NAS device, retrieves the identified blocks or chunks from the secondary storage and presents them to the user. For example, the system retrieves only slide 5 and the table of contents of the presentation and presents them to the user.


In step 940, the system receives input from a user to modify the retrieved blocks or chunks. For example, the user updates the PowerPoint presentation to include a different picture. In step 950, the system transfers data associated with the modified blocks or chunks back to the NAS device, where it remains in a cache or is transferred to secondary storage. For example, the system transfers the modified slide 5 to the data store. The system may also update a table that tracks access to the data store, such as Table 1 or Table 3.
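The partial restoration in routine 900 can be illustrated with the following Python sketch. It is a simplified model under stated assumptions: portions are keyed by an (object name, part name) pair, the cache and the secondary storage are dictionaries, and the names restore_portions, location_index, and "deck.ppt" are hypothetical.

```python
def restore_portions(object_name, wanted_parts, location_index, cache, secondary):
    """Return the requested portions, reading back only those not in the cache."""
    fetched = []
    result = {}
    for part in wanted_parts:
        key = (object_name, part)
        if location_index[key] == "secondary":
            cache[key] = secondary[key]      # read back just this portion
            location_index[key] = "cache"    # track the portion's new location
            fetched.append(part)
        result[part] = cache[key]
    return result, fetched


# A presentation whose portions were all migrated to secondary storage.
secondary = {
    ("deck.ppt", part): f"data for {part}".encode()
    for part in ("toc", "slide4", "slide5", "slide6")
}
location_index = {key: "secondary" for key in secondary}
cache = {}

parts, fetched = restore_portions(
    "deck.ppt", ["toc", "slide5"], location_index, cache, secondary
)
```

Only the table of contents and slide 5 are read back in this sketch; the remaining slides stay on secondary storage until they are requested.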


Thus, the system, leveraging block-based or chunk-based data migration in a NAS device, restores only portions of data objects required by a file system. Such restoration can be, among other benefits, advantageous over systems that perform file-based restoration, because those systems restore entire files, which can be expensive, time consuming, and so on. Some files, such as .pst files, may contain large amounts of data. File-based restoration can therefore be inconvenient and cumbersome, among other things, especially when a user only requires a small portion of a large file.


For example, a user submits a request to the system to retrieve an old email stored in a secondary copy on removable media via a NAS device. The system identifies a portion of a .pst file associated with the user that contains a list of old emails in the cache of the NAS device, and retrieves the list. That is, the system has knowledge of the chunk that includes the list (e.g., a chunking component may always include the list in a first chunk of a data object), accesses the chunk, and retrieves the list. The other portions (e.g., all the emails within the .pst file) were transferred from the NAS device to secondary storage. The user selects the desired email from the list. The NAS device, via an index in the cache that associates chunks with data or files (such as an index similar to Table 3), identifies the chunk that contains the email, and retrieves the chunk from associated secondary storage for presentation to the user. Thus, the NAS device is able to restore the email without restoring the entire mailbox (.pst file) associated with the user.


As noted above, the callback layer 550 maintains a data structure that not only tracks where a block or chunk resides on secondary storage, but also which file was affected based on the migration of that block or chunk. Portions of large files may be written to secondary storage to free up space in the data store 444 of the NAS device 440. Thus, to the network, the total data storage of the NAS device is much greater than that actually available within the data store 444. For example, while the data store 444 may have only a 100 gigabyte capacity, its capacity may actually appear as 300 gigabytes, with over 200 gigabytes migrated to secondary storage.


To help ensure sufficient space to write back data from secondary storage to the data store 444 of the NAS device 440, the data store may be partitioned to provide a callback or read-back cache. For example, a disk cache may be established in the data store 444 of the NAS device 440 for the NAS device to write back data read from secondary storage. The amount of the partition is configurable, and may be, for example, between 5 and 20 percent of the total capacity of the data store 444. In the above example, with a 100 gigabyte data store 444, 10 gigabytes may be reserved (10 percent) for data called back from secondary storage to the NAS device 440. This disk partition or callback cache can be managed in known ways, such that data called back to this disk partition can have the oldest data overwritten when room is needed to write new data.
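One way to picture such a read-back cache is the following Python sketch, which reserves a configurable percentage of the data store and overwrites the oldest called-back data first. It is only one of the known ways such a partition might be managed; the ReadBackCache class, its parameters, and the use of gigabytes as the unit are assumptions made for illustration.

```python
from collections import OrderedDict


class ReadBackCache:
    """Reserved partition that evicts the oldest called-back data first."""

    def __init__(self, data_store_capacity_gb, reserved_percent=10):
        if not 0 < reserved_percent < 100:
            raise ValueError("reserved_percent must be between 0 and 100")
        # Integer arithmetic keeps the reserved size exact (100 GB at 10% -> 10 GB).
        self.capacity = data_store_capacity_gb * reserved_percent // 100
        self.used = 0
        self._entries = OrderedDict()   # key -> size, oldest entry first

    def write_back(self, key, size_gb):
        """Store called-back data; return the keys overwritten to make room."""
        if size_gb > self.capacity:
            raise ValueError("item is larger than the read-back cache")
        if key in self._entries:
            self.used -= self._entries.pop(key)
        evicted = []
        while self.used + size_gb > self.capacity:
            old_key, old_size = self._entries.popitem(last=False)  # oldest entry
            self.used -= old_size
            evicted.append(old_key)
        self._entries[key] = size_gb
        self.used += size_gb
        return evicted


cache = ReadBackCache(data_store_capacity_gb=100, reserved_percent=10)
first = cache.write_back("chunk-a", 4)
second = cache.write_back("chunk-b", 4)
third = cache.write_back("chunk-c", 4)   # needs room, so the oldest entry goes
```

With a 100 gigabyte data store and a 10 percent reservation, the sketch holds at most 10 gigabytes of called-back data, so the third 4 gigabyte write overwrites the oldest entry.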


CONCLUSION

From the foregoing, it will be appreciated that specific examples of the data recovery system have been described herein for purposes of illustration, but that various modifications may be made without deviating from the spirit and scope of the system. For example, although files have been described, other types of content such as user settings, application data, emails, and other data objects can be imaged by snapshots. Accordingly, the system is not limited except as by the appended claims.


Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense, as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to.” The word “coupled,” as generally used herein, refers to two or more elements that may be either directly connected, or connected by way of one or more intermediate elements. Additionally, the words “herein,” “above,” “below,” and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of this application. Where the context permits, words in the above Detailed Description using the singular or plural number may also include the plural or singular number respectively. The word “or,” in reference to a list of two or more items, covers all of the following interpretations of the word: any of the items in the list, all of the items in the list, and any combination of the items in the list.


The above detailed description of embodiments of the system is not intended to be exhaustive or to limit the system to the precise form disclosed above. While specific embodiments of, and examples for, the system are described above for illustrative purposes, various equivalent modifications are possible within the scope of the system, as those skilled in the relevant art will recognize. For example, while processes or blocks are presented in a given order, alternative embodiments may perform routines having steps, or employ systems having blocks, in a different order, and some processes or blocks may be deleted, moved, added, subdivided, combined, and/or modified. Each of these processes or blocks may be implemented in a variety of different ways. Also, while processes or blocks are at times shown as being performed in series, these processes or blocks may instead be performed in parallel, or may be performed at different times.


The teachings of the system provided herein can be applied to other systems, not necessarily the system described above. The elements and acts of the various embodiments described above can be combined to provide further embodiments.


These and other changes can be made to the system in light of the above Detailed Description. While the above description details certain embodiments of the system and describes the best mode contemplated, no matter how detailed the above appears in text, the system can be practiced in many ways. Details of the system may vary considerably in implementation details, while still being encompassed by the system disclosed herein. As noted above, particular terminology used when describing certain features or aspects of the system should not be taken to imply that the terminology is being redefined herein to be restricted to any specific characteristics, features, or aspects of the system with which that terminology is associated. In general, the terms used in the following claims should not be construed to limit the system to the specific embodiments disclosed in the specification, unless the above Detailed Description section explicitly defines such terms. Accordingly, the actual scope of the system encompasses not only the disclosed embodiments, but also all equivalent ways of practicing or implementing the system under the claims.


While certain aspects of the system are presented below in certain claim forms, the applicant contemplates the various aspects of the system in any number of claim forms. For example, while only one aspect of the system is recited as a means-plus-function claim under 35 U.S.C. sec. 112, sixth paragraph, other aspects may likewise be embodied as a means-plus-function claim, or in other forms, such as being embodied in a computer-readable medium. (Any claims intended to be treated under 35 U.S.C. § 112, ¶6 will begin with the words “means for”.) Accordingly, the applicant reserves the right to add additional claims after filing the application to pursue such additional claim forms for other aspects of the system.

Claims
  • 1. A network attached storage (NAS) device, wherein the network attached storage device is configured to be connected to a networked computing system, and wherein the networked computing system includes one or more non-volatile secondary data storage devices and one or more client computers connected via a network, the network attached storage device comprising: a housing containing one or more components, the components including: a data reception component, wherein the data reception component is configured to receive for storage multiple data files from the one or more client computers via the network; an operating system, wherein the operating system is configured to provide a computing environment for the network attached storage device; at least one processor, wherein the at least one processor is programmed to perform one or more data storage functions for the network attached storage device; a non-volatile data store, wherein the non-volatile data store is configured to store the data files received from the data reception component; a data interception component, wherein the data interception component is configured to intercept data transferred from the data reception component to the non-volatile data store and to update an index associating information identifying the transferred data with information identifying a time of transfer to the non-volatile data store; a file system, wherein the file system is configured to manage, for the network attached storage device, the writing of data files to, and the reading of data files from, the non-volatile data store; one or more media agents, wherein the one or more media agents are configured to receive instructions from the at least one processor and to transfer data stored in the non-volatile data store of the network attached storage device to the one or more non-volatile secondary data storage devices, wherein the one or more non-volatile secondary data storage devices are external to the network attached storage device but are accessible by the one or more media agents of the network attached storage device via the network; and a data migration component, wherein the data migration component is configured to identify portions of at least some of the data files within the non-volatile data store, and to migrate the identified data file portions from the network attached storage device to the one or more non-volatile secondary data storage devices, to thereby free up storage space in the non-volatile data store for the storage of one or more other data files, wherein each of the data files is an individual file, wherein the identified data file portions are less than all of the selected data file, wherein the identified data file portions are for storage by the one or more media agents to at least one of the one or more non-volatile secondary storage devices, wherein the data migration component is further configured to identify portions of a selected data file within the non-volatile data store based at least in part on a data storage criterion, wherein the data storage criterion is associated with writing data to, and reading data from, portions of the selected data file, wherein the data storage criterion includes data file portions that have not been modified within a predetermined period of time, and, wherein the data migration component maintains a data structure that: tracks a logical location of the identified data file portions stored in the one or more secondary storage devices, and maps the identified data file portions to the selected data file.
  • 2. The network attached storage device of claim 1, further comprising: a data interception component, wherein the data interception component is configured to intercept access requests from the file system to the non-volatile data store and to update an index associating information identifying the access requests with information identifying a time of the access requests.
  • 3. The network attached storage device of claim 1, wherein the data files consist of multiple blocks and wherein the data migration component identifies at least one block for migration from the network attached storage device, but not all of the multiple blocks, wherein the one block has an oldest access time as compared to other of the multiple blocks.
  • 4. The network attached storage device of claim 1, wherein the data migration component is configured to identify data blocks in the non-volatile data store that have not been accessed by the file system within a predetermined time period.
  • 5. The network attached storage device of claim 1 wherein the non-volatile data store further comprising a reserved read-back cache, and wherein the network attached storage device further comprises a callback component configured to read back the identified data file portions stored in the one or more secondary storage devices and to write the identified data file portions to the read-back cache, wherein the callback component is configured to read back the identified data file portions in response to a receipt of a request from the one or more client computers to access the identified data file portions.
  • 6. A computer-implemented method for tracking at least a first portion of at least one data object within a network attached storage (NAS) device coupled to a network, wherein the NAS device includes a NAS file system and a non-volatile data store, the method comprising: accessing calls to or from the NAS file system for reading of data from or writing of data to the non-volatile data store of the NAS device, wherein the at least one data object consists of multiple data blocks, wherein each data object is a single data object, wherein the non-volatile data store of the NAS device stores the multiple data blocks of the data object; wherein the NAS file system controls the reading of data from or the writing of data to the multiple data blocks of the data object, and wherein the accessing includes identifying individual blocks or groups of blocks within the multiple data blocks of the data object that the NAS file system reads data from or writes data to, wherein the identified individual blocks or groups of blocks are less that the entire data object; based on the accessing, identifying a portion of the multiple data blocks of the data object that satisfies a data storage criterion, wherein the data storage criterion filters for the portion of the multiple data blocks that has not been modified since creation or that has not been modified within a predetermined period of time; causing data stored in the portion of the multiple data blocks to be transferred to a separate non-volatile storage device, so as to free up available data storage on the NAS device for storage of other data objects, wherein the separate storage device is not contained by or within the NAS device but communicates with the NAS device over the network, wherein the network is a private network; based on the identifying, and independently of the NAS file system of the NAS device, updating a data structure, wherein the data structure, tracks the portion of the multiple data blocks, and provides an indication of the data object to which the portion of the multiple data blocks belongs; updating the data structure to include information associating the portion of the multiple data blocks with the separate storage device, wherein the data structure is an index is stored in the non-volatile data store of the NAS device; and removing information from an allocation table associated with the NAS file system of the NAS device, wherein the data object is a file, and wherein the portion of the multiple data blocks is less than all of the multiple data blocks for the file.
  • 7. The method of claim 6, wherein the separate storage device is not contained by or within the NAS device but communicates with the NAS device over the network, the method further comprising: updating the data structure to track a location of the portion of the multiple data blocks as being located in the separate storage device.
  • 8. The method of claim 6, wherein the data storage criterion comprises a time period in which to retain data in a cache of the NAS device, or a time period in which a recent access of the portion must occur of the multiple data blocks.
  • 9. A stand-alone data storage device, coupled to one or more external computing devices over a network, wherein at least one non-volatile external storage device is also connected to the data storage device via the network, the data storage device comprising: at least one processor; a communication component coupled to the at least one processor and associated with a network address for the data storage device, wherein the communication component receives data transfer commands from the one or more external computing devices on the network, wherein the one or more external computing devices direct the data transfer commands to the data storage device via the network address for the data storage device, and wherein the data transfer commands direct operation of the data storage device; a non-volatile, internal data store, coupled to the at least one processor, wherein the internal data store stores data objects, wherein the data objects are comprised of multiple data blocks; a data storage component that comprises program code, which when executed by the processor, performs data storage tasks with respect to the internal data store; a file system that comprises program code, which when executed by the processor, stores and organizes data objects stored in the internal data store; a call intercept layer, in communication with the file system, that comprises program code, which when executed by the processor, recognizes calls to or from the file system for reading of data from or writing of data to individual data blocks or groups of data blocks stored within the internal data store; a data block identification component, in communication with the call intercept layer, that comprises program code, which when executed by the processor, identifies data blocks of the data object that satisfy a criterion, wherein the criterion is associated with the recognized calls to or from the file system for the reading of data from or the writing of data to the individual data blocks or groups of data blocks, wherein the criterion defines data blocks of the data object that have not been modified since creation or data blocks of the data object that have not been modified within a predetermined period of time, wherein the identified data blocks of the data object are less than the whole data object; an index component that comprises program code, which when executed by the processor, updates an index to include information associating the identified data blocks with information identifying the data object; and a media agent, in communication with the data block identification component, that comprises program code, which when executed by the processor, copies or transfers, via the network and to the external storage device, data for no more than n−1 identified data blocks to free up storage space on the data storage device for one or more other data objects, wherein at least one data object includes n number of data blocks.
  • 10. The data storage device of claim 9 wherein at least one data object is a file having n number of data blocks, wherein the index component includes a bitmap or a data location table mapping the no more than n−1 identified data blocks to a logical location on the network.
  • 11. The data storage device of claim 9 wherein at least one data object is a file having n number of data blocks, and wherein the index component updates the index to include information associating the transferred data with information identifying tape offsets for the secondary storage device that contains the transferred data.
  • 12. The data storage device of claim 9, wherein the criterion defines a time period related to when the file system last read data from or wrote data to individual data blocks or groups of data blocks.
  • 13. The data storage device of claim 9, wherein the criterion defines a time period in which changes were made to data contained by blocks of the data object.
  • 14. A network attached storage device, comprising: a non-transitory cache, wherein the cache stores one or more data objects; a media agent, wherein the media agent is configured to transfer portions of the one or more data objects from the cache to associated non-volatile secondary storage devices, wherein the transferring frees up storage space in the non-transitory cache for one or more other data objects to be stored at a location previously occupied by the portions of the one or more data objects, wherein the portions of the one or more data objects are less than any one of the one or more data objects, and wherein the secondary storage devices are located external to the network attached storage device and configured to provide long term storage of data; and a data identification component, wherein the data identification component is configured to identify to the media agent the portions of the data objects to be transferred to the secondary storage devices based on one or more storage criteria, and wherein the one or more storage criteria are met by the portions of the data objects that have not been modified since their creation, or the portions of the data objects that have not been modified within a predetermined period of time; and an intermediate component, wherein the intermediate component tracks in an index all accesses of the one or more data objects by a file system within the network attached storage device, and wherein the data identification component identifies the portions of the data objects based on information within the index.
  • 15. The network attached storage device of claim 14, wherein the identified portions are proper subsets of data blocks of the data objects.
  • 16. The network attached storage device of claim 14, wherein the identified portions include chunks of the data objects created from a rule-based process of dividing the data object into two or more portions.
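
The selection logic recited in claims 14 through 16 can be illustrated with a minimal sketch. It is not the patented implementation: the names `BlockRecord`, `AccessIndex`, `record_write`, and `identify_blocks` are hypothetical, the index is assumed to be an in-memory dictionary keyed by object and block number, and only write accesses are tracked for brevity. The sketch shows one way a data identification component might consult an index to pick blocks that were never modified after creation or that have been idle for a predetermined period.

```python
from dataclasses import dataclass, field


@dataclass
class BlockRecord:
    """Timestamps kept per block (hypothetical layout, in seconds)."""
    created: float
    last_modified: float


@dataclass
class AccessIndex:
    """Hypothetical index maintained by the intermediate component."""
    # Maps (object_id, block_no) to the record for that block.
    records: dict = field(default_factory=dict)

    def record_write(self, object_id, block_no, now):
        """Note a file-system write to one block of a data object."""
        key = (object_id, block_no)
        rec = self.records.get(key)
        if rec is None:
            # The first write is treated as the block's creation.
            self.records[key] = BlockRecord(created=now, last_modified=now)
        else:
            rec.last_modified = now


def identify_blocks(index, object_id, now, max_idle_seconds):
    """Return the sorted block numbers of object_id that meet either criterion."""
    selected = []
    for (oid, block_no), rec in index.records.items():
        if oid != object_id:
            continue
        never_modified = rec.last_modified == rec.created
        idle_long_enough = (now - rec.last_modified) >= max_idle_seconds
        if never_modified or idle_long_enough:
            selected.append(block_no)
    return sorted(selected)
```

In this sketch a caller would pass the returned block numbers to whatever component performs the transfer to secondary storage. The claims also require the transferred portions to be less than the whole data object; that condition is not enforced here and would need a separate check.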
CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation of U.S. patent application Ser. No. 12/558,640 filed Sep. 14, 2009 (entitled DATA TRANSFER TECHNIQUES WITHIN DATA STORAGE DEVICES, SUCH AS NETWORK ATTACHED STORAGE PERFORMING DATA MIGRATION), which claims priority to U.S. Provisional Patent Application No. 61/097,176 filed Sep. 15, 2008 (entitled DATA TRANSFER TECHNIQUES WITHIN DATA STORAGE DEVICES, SUCH AS NETWORK ATTACHED STORAGE PERFORMING DATA MIGRATION), the entirety of each of which is incorporated by reference herein.

Related Publications (1)
Number Date Country
20160100007 A1 Apr 2016 US
Provisional Applications (1)
Number Date Country
61097176 Sep 2008 US
Continuations (1)
Number Date Country
Parent 12558640 Sep 2009 US
Child 14963954 US