A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosures, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
The invention disclosed herein relates generally to data storage systems in computer networks and, more particularly, to improvements in administering data on storage media by providing the capability of selectively deleting stored data.
There are many different computing architectures for storing electronic data. Individual computers typically store electronic data in volatile storage devices such as Random Access Memory (RAM) and one or more nonvolatile storage devices such as hard drives, tape drives, or optical disks, that form a part of or are directly connectable to the individual computer. In a network of computers such as a Local Area Network (LAN) or a Wide Area Network (WAN), storage of electronic data is typically accomplished via servers or stand-alone storage devices accessible via the network. These individual network storage devices may be networkable tape drives, optical libraries, Redundant Arrays of Inexpensive Disks (RAID), CD-ROM jukeboxes, and other devices.
Storage media used for storing electronic data in such computing architectures include, for example, tapes, hard drives, disks, optical disks and other storage media. In some storage operations performed by a system, data is written to storage media, which is stored in a storage device in the system. Often, there is no logic associated with selecting a particular piece of media that is appropriate for the storage operation. In addition, there is no post storage operation check to determine remaining media capacity. Thus, there are storage systems which maintain storage devices containing media that is not entirely filled with data, which results in the system having to utilize more storage media than is necessary to accommodate its data storage requirements.
Another problem with existing storage systems is that data stored to storage media may include more than one data file. Currently, to remove or otherwise delete data files from storage media, all files stored to the storage media must be deleted. As a result, storage systems maintain data files on storage media that are no longer needed, and cannot delete certain files when, for example, a certain data type no longer needs to be stored. There is therefore a need to remove one or more selected stored files from the storage media, while retaining some of the non-selected stored files, without removing all of the data stored on the storage media.
The present invention includes a method and system for allocating storage space in a storage network and, more particularly, for providing deletion of selected stored data.
According to one embodiment of the invention, a method is provided for selectively deleting data stored on storage media. A first data file is selected to be deleted in accordance with selection criteria, which can be, for example, a manual input from a user, or a storage policy or template, such as a policy relating to file type, file storage duration, storage preferences, storage characteristics, aging criteria, or other criteria. The first data file and its file location are identified by consulting index data, such as index data in a media management index or storage manager index. The first data file is generally located on an item of storage media which contains one or more other data files. The first data file is retrieved from the storage media together with one or more other data files that precede the first data file. The first data file and the one or more other data files that precede the first data file are retrieved to a media management component temporarily, such as in a buffer or cache. The one or more other data files are copied from the media management component or other temporary storage to storage media, such as the tape on which the files were originally stored, or other suitable media. The first data file is not copied to the storage media and is thus deleted. Data relating to the one or more other data files, e.g., location and other file information, is stored to an index, such as to the media management or storage manager index. Data relating to the first data file, which has been selectively deleted from the storage media, is removed from the index, e.g., by deleting pointers to the data reference.
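By way of illustration only, selecting data files for deletion in accordance with a storage policy might be sketched as follows. The `StoredFile` record, the `select_for_deletion` function, and the two criteria shown (an age limit and a list of file types) are hypothetical names and simplifications chosen for this sketch; they are not taken from the disclosure.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class StoredFile:
    """Hypothetical record describing one data file on storage media."""
    name: str
    file_type: str
    stored_on: date


def select_for_deletion(files, today, max_age_days=None, file_types=()):
    """Return the names of files that match any of the given criteria."""
    selected = []
    for f in files:
        # Aging criterion: the file has been stored longer than allowed.
        too_old = (
            max_age_days is not None
            and (today - f.stored_on) > timedelta(days=max_age_days)
        )
        # File type criterion: the type is no longer needed.
        unwanted_type = f.file_type in file_types
        if too_old or unwanted_type:
            selected.append(f.name)
    return selected


files = [
    StoredFile("a.doc", "doc", date(2004, 1, 10)),
    StoredFile("b.pst", "pst", date(2005, 6, 1)),
    StoredFile("c.doc", "doc", date(2005, 9, 15)),
]
selected = select_for_deletion(
    files, today=date(2005, 11, 7), max_age_days=365, file_types=("pst",)
)
```

A manual selection by a user would simply bypass this function and supply the file names directly.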
According to another embodiment of the invention, a method is provided for selectively deleting data stored on storage media, such as tape. The data selected for deletion is typically stored on storage media that contains more than one data file. Examples of storage media include a tape or other storage media. The storage system receives an instruction to delete at least one selected data file. In other embodiments of the invention, a storage manager receives the deletion instruction, which is input by a client, user, or other system component. The deletion instruction can be input manually by the user, or automatically generated in accordance with storage policies, user preferences, or other storage logic.
According to another embodiment of the invention, a media management component or storage manager, as further described herein, identifies a location of the selected data file. The location of the selected data file is typically on a storage medium. In some embodiments of the invention, the storage manager consults its index to determine the location of the selected data file. In yet another embodiment of the invention, the storage manager consults its index to determine the media management component(s) that is associated with the selected data file and queries the associated media management component to identify the location of the selected data file. In this embodiment, the media management component consults its index of storage data to determine the location of the selected data file.
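The two-level lookup described above, in which the storage manager index identifies the associated media management component and that component's index supplies the file location, might be sketched as follows. The dictionary layouts, identifiers such as `mmc_1` and `library_A`, and the `locate` function are assumptions made for this sketch only.

```python
# Hypothetical storage manager index: file name -> media management component.
storage_manager_index = {
    "file_210": "mmc_1",
    "file_220": "mmc_1",
    "file_230": "mmc_1",
}

# Hypothetical media management component indexes:
# file name -> (storage device, storage media, offset, length).
media_component_indexes = {
    "mmc_1": {
        "file_210": ("library_A", "tape_200", 0, 4096),
        "file_220": ("library_A", "tape_200", 4096, 2048),
        "file_230": ("library_A", "tape_200", 6144, 1024),
    }
}


def locate(file_name):
    """Resolve a file name to its location using both indexes in turn."""
    # Step 1: the storage manager determines the associated component.
    component = storage_manager_index[file_name]
    # Step 2: that component consults its own index for the location.
    device, media, offset, length = media_component_indexes[component][file_name]
    return {
        "component": component,
        "device": device,
        "media": media,
        "offset": offset,
        "length": length,
    }


location = locate("file_220")
```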
When the location of the selected data file is determined, the media management component retrieves the selected data file(s) and other data preceding the selected data file(s) from the storage media. In another embodiment, the media management component retrieves or copies the selected data file(s) and other data preceding the selected data file(s) to a cache, buffer or other temporary storage associated with the media management component. The retrieved data, except for the selected data file, is copied back to storage media, which can be the storage media from which the data was initially retrieved, or other storage media. In some embodiments, the copying of retrieved data results in the data being shifted to another position on the storage media on which it was initially stored. Each component of the system involved in the deletion, such as a storage manager and a media management component, updates its respective index to remove data relating to the selected data file that was deleted, and to add data relating to the data files copied to the storage media.
In another embodiment of the invention, a method is provided for selectively deleting data stored on storage media by identifying a storage media that contains more than one data file, including a data file that has been selected for deletion, for example, a data file selected manually by a user, or automatically in accordance with user preferences, storage policies or other storage logic. The storage media can be any media capable of storing data, such as a tape. In general, the storage media includes a data file in addition to the data file selected for deletion. The storage manager instructs a media management component to delete the selected data file. The deletion of the selected data file can be accomplished as described herein, such as by shifting the data file on a tape, writing zeros over the selected data file, removing pointers to the selected data file, or other deletion techniques.
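Two of the deletion techniques mentioned above, writing zeros over the selected data file and removing pointers to it, might be sketched as follows. The storage media is modeled here as an in-memory `bytearray` and the index as a dictionary of (offset, length) pairs; both are simplifying assumptions, and the function names are hypothetical.

```python
def overwrite_with_zeros(media: bytearray, offset: int, length: int) -> None:
    """Overwrite the region occupied by the selected file with zero bytes."""
    # Equal-length slice assignment keeps the media size unchanged.
    media[offset:offset + length] = bytes(length)


def remove_pointer(index: dict, name: str) -> None:
    """Drop the index entry so the file can no longer be located."""
    index.pop(name, None)


# Three 4-byte files stored back to back on a 12-byte media model.
media = bytearray(b"AAAABBBBCCCC")
index = {"a": (0, 4), "b": (4, 4), "c": (8, 4)}

offset, length = index["b"]
overwrite_with_zeros(media, offset, length)
remove_pointer(index, "b")
```

Neither technique reclaims the space in the middle of the media; only the shifting technique described herein consolidates the free space.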
The present invention further includes methods and systems operating in conjunction with a modular storage system to enable computers on a network to share storage devices on a physical and logical level. An exemplary modular storage system is the GALAXY™ storage management system and QiNetix™ available from CommVault Systems of New Jersey. The modular architecture underlying this system is described in the above referenced patent applications, each of which is incorporated herein.
The invention is illustrated in the figures of the accompanying drawings which are meant to be exemplary and not limiting, in which like references are intended to refer to like or corresponding parts, and in which:
In the following description of the preferred embodiment, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration a specific embodiment in which the invention may be practiced. It is to be understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present invention.
A client 50 may be any networked client 50 and may include at least one attached data store 70. The data store 70 may be any memory device or local data storage device known in the art, such as a hard drive, CD-ROM drive, tape drive, RAM, or other types of magnetic, optical, digital and/or analog local storage. In some embodiments of the invention, client 50 includes at least one data agent 60, which is a software module that is generally responsible for performing storage operations for data of a client 50 stored in data store 70 or other memory location. Storage operations include, but are not limited to, creation, storage, retrieval, migration, deletion, and tracking of primary or production volume data, secondary volume data, primary copies, secondary copies, auxiliary copies, snapshot copies, backup copies, incremental copies, differential copies, synthetic copies, HSM copies, archive copies, Information Lifecycle Management (“ILM”) copies, and other types of copies and versions of electronic data. In some embodiments of the invention, the system provides at least one, and typically a plurality of, data agents 60 for each client, and each data agent 60 is intended to back up, migrate, and recover data associated with a different application. For example, a client 50 may have different individual data agents 60 designed to handle Microsoft Exchange data, Lotus Notes data, Microsoft Windows file system data, Microsoft Active Directory Objects data, and other types of data known in the art.
The storage manager 80 is generally a software module or application that coordinates and controls the system; for example, storage manager 80 manages and controls storage operations performed by the system. The storage manager 80 communicates with all components of the system including client 50, data agent 60, media management components 100, and storage devices 120 to initiate and manage storage operations. The storage manager 80 typically has an index 90, further described herein, for storing data related to storage operations. In general, storage manager 80 communicates with storage devices 120 via a media management component 100. In another embodiment, not shown in
The system includes one or more media management components 100. The media management component 100 is generally a software module that conducts data, as directed by the storage manager 80, between client 50 and one or more storage devices 120, for example, a tape library, a hard drive, a magnetic media storage device, an optical media storage device, or other storage device. The media management component 100 is communicatively coupled with and controls the storage device 120. For example, the media management component 100 might instruct a storage device 120 to archive, migrate, or restore application specific data. The media management component 100 generally communicates with the storage device 120 via a local bus such as a Small Computer System Interface (SCSI) adaptor.
Each media management component 100 maintains an index cache 110 which stores index data that the system generates during storage operations as further described herein. For example, storage operations for Microsoft Exchange data generate index data. Media management component index data includes, for example, information regarding the location of the stored data on a particular media, information regarding the content of the data stored such as file names, sizes, creation dates, formats, application types, and other file-related criteria, information regarding one or more clients associated with the data stored, information regarding one or more storage policies, storage criteria, or storage preferences associated with the data stored, compression information, retention-related information, encryption-related information, stream-related information, and other types of information. Index data thus provides the system with an efficient mechanism for performing storage operations including locating user files for recovery operations and for managing and tracking stored data.
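For illustration, a small subset of the index data listed above might be represented as follows. The `IndexEntry` and `IndexCache` classes and their field names are hypothetical; an actual index would carry considerably more information and would be persisted rather than held in memory.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class IndexEntry:
    """Hypothetical index record for one stored data file."""
    file_name: str
    media_id: str          # which item of storage media holds the file
    offset: int            # location of the file on that media
    length: int
    client: str
    application_type: str
    storage_policy: str
    compressed: bool = False
    encrypted: bool = False


class IndexCache:
    """Hypothetical in-memory stand-in for an index cache."""

    def __init__(self) -> None:
        self._entries: Dict[str, IndexEntry] = {}

    def add(self, entry: IndexEntry) -> None:
        self._entries[entry.file_name] = entry

    def find(self, file_name: str) -> Optional[IndexEntry]:
        return self._entries.get(file_name)

    def files_on_media(self, media_id: str) -> List[str]:
        """List files on one item of media, ordered by position."""
        on_media = [e for e in self._entries.values() if e.media_id == media_id]
        return [e.file_name for e in sorted(on_media, key=lambda e: e.offset)]


cache = IndexCache()
cache.add(IndexEntry("file_220", "tape_200", 4096, 2048,
                     "client_50", "exchange", "policy_a"))
cache.add(IndexEntry("file_210", "tape_200", 0, 4096,
                     "client_50", "filesystem", "policy_a"))
```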
The system generally maintains two copies of the media management component index data regarding particular stored data. A first copy is generally stored with the data copied to a storage device 120. Thus, a tape may contain the stored data as well as index information related to the stored data. In the event of a system restore, the index data stored with the stored data may be used to rebuild a media management component index 110 or other index useful in performing storage operations. In addition, the media management component 100 that controls the storage operation also generally writes an additional copy of the index data to its index cache 110. The data in the media management component index cache 110 is generally stored on faster media, such as magnetic media, and is thus readily available to the system for use in storage operations and other activities without having to be first retrieved from the storage device 120.
The storage manager 80 also maintains an index cache 90. Storage manager index data is used to indicate, track, and associate logical relationships and associations between components of the system, user preferences, management tasks, and other useful data. For example, storage manager 80 might use its index cache 90 to track logical associations between media management components 100 and storage devices 120. The storage manager 80 may also use its index cache 90 to track the status of storage operations to be performed, storage patterns associated with the system components such as media use, storage growth, network bandwidth, service level agreement (“SLA”) compliance levels, data protection levels, storage policy information, storage criteria associated with user preferences, retention criteria, storage operation preferences, and other storage-related information.
A storage policy is generally a data structure or other information which includes a set of preferences and other storage criteria for performing a storage operation. The preferences and storage criteria may include, but are not limited to: a storage location, relationships between system components, network pathway to utilize, retention policies, data characteristics, compression or encryption requirements, preferred system components to utilize in a storage operation, and other criteria relating to a storage operation. A storage policy may be stored to a storage manager index, to archive media as metadata for use in restore operations or other storage operations, or to other locations or components of the system.
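As one possible illustration, a storage policy might be modeled as a simple data structure that can be serialized for storage as metadata and later read back. The `StoragePolicy` fields shown and the use of JSON as the metadata format are assumptions made for this sketch; the disclosure does not prescribe either.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class StoragePolicy:
    """Hypothetical subset of the preferences a storage policy may hold."""
    name: str
    storage_location: str
    retention_days: int
    compress: bool = False
    encrypt: bool = False
    preferred_components: List[str] = field(default_factory=list)


def to_metadata(policy: StoragePolicy) -> str:
    """Serialize the policy so it can be stored alongside archived data."""
    return json.dumps(asdict(policy), sort_keys=True)


def from_metadata(text: str) -> StoragePolicy:
    """Rebuild the policy from stored metadata, e.g. during a restore."""
    return StoragePolicy(**json.loads(text))


policy = StoragePolicy(
    name="exchange_weekly",
    storage_location="tape_library_1",
    retention_days=90,
    compress=True,
    preferred_components=["mmc_1"],
)
restored = from_metadata(to_metadata(policy))
```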
Index caches 90 and 110 typically reside on their corresponding storage component's hard disk or other fixed storage device. For example, jobs agent 85 of storage manager 80 may retrieve storage manager index 90 data regarding a storage policy and storage operation to be performed or scheduled for a particular client 50. The jobs agent 85, either directly or via another system module, communicates with data agent 60 for client 50 regarding the storage operation. In some embodiments, jobs agent 85 also retrieves from index cache 90 a storage policy associated with client 50 and uses information from the storage policy to communicate to data agent 60 one or more media management components 100 associated with performing storage operations for that particular client 50 as well as other information regarding the storage operation to be performed such as retention criteria, encryption criteria, streaming criteria, etc. The data agent 60 then packages or otherwise manipulates the client data stored in client data store 70 in accordance with the storage policy information and/or according to a user preference, and communicates this client data to the appropriate media management component(s) 100 for processing. The media management component(s) 100 store the data according to storage preferences associated with the storage policy including storing the generated index data with the stored data, as well as storing a copy of the generated index data in the media management component index cache 110.
In some embodiments, components of the system may reside and execute on the same computer. In some embodiments, a client component such as data agent 60, a media management component 100, or storage manager 80 coordinates and directs local archiving, migration, and retrieval application functions as further described in application Ser. No. 09/610,738. These client components may function independently or together with other similar client components.
The storage device 120 can be any storage device capable of storing data that is suitable for the purposes described herein. Some storage devices may have a robotic arm which is used to insert and remove storage media contained in the storage device or other technology to shuffle media in the storage device. The type of storage media used in the storage device may be, for example, magnetic tape, such as that depicted in
As shown in
Referring to
The selective deletion instruction is processed by the storage manager, which identifies the storage location, step 310. In general, the storage location is determined by storage manager 80 communicating an instruction to a media management component 100, to identify the storage location of data file 210, 220 or 230, or more specifically, the storage device and tape on which the data is stored. In some embodiments, the storage manager 80 consults its index 90 to determine which media management component 100 is associated with or controls data file 210, 220 or 230 selected for deletion, and communicates with the appropriate media management component 100 to obtain information or data relating to the selected data file to perform a selective deletion operation. The media management component 100 then consults its index 110 to determine in which storage device and storage media data file 210, 220 or 230 is located. In addition, the media management component index 110 includes data indicating the location of the data file on the storage media, such as the offset and length of the data file 210, 220 or 230 on the storage media 200. For example, the index data may indicate that the selected file 220 to be deleted is located in the middle of magnetic tape 200 as shown in
Referring back to
Thus, if the selected file is located at a point other than the starting point of the tape, all of the data files prior to and including the data file selected for deletion must be retrieved, step 330, such as to the media management component, tape or other suitable media. In another embodiment, the retrieved data is copied to the media management component buffer, cache or other temporary storage 110. Each of the media management component 100, tape or other suitable media has the capacity to store data, for example, at least on a temporary basis. In some embodiments, a data pipe is used to retrieve the data. An example of a preferred data pipe for this storage operation is described in U.S. Pat. No. 6,418,478.
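Determining which data files must be retrieved in step 330 might be sketched as follows, assuming the index supplies each file's offset and length on the tape. The tuple layout and the `files_to_retrieve` function are hypothetical.

```python
def files_to_retrieve(layout, selected):
    """Return the index entries for every file up to and including `selected`.

    `layout` is a list of (name, offset, length) tuples for one tape, in any
    order; the result is in tape order, starting from the beginning of tape.
    """
    ordered = sorted(layout, key=lambda entry: entry[1])
    for position, (name, _offset, _length) in enumerate(ordered):
        if name == selected:
            return ordered[:position + 1]
    raise KeyError(f"{selected!r} is not on this tape")


layout = [
    ("file_230", 6144, 1024),
    ("file_210", 0, 4096),
    ("file_220", 4096, 2048),
]
retrieved = files_to_retrieve(layout, "file_220")
```

When the selected file is the first file on the tape, the result contains only that file, since no data precedes it.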
The retrieved data, except the data file selected for deletion, is copied back to tape 200 on which it was originally stored, or to another item of storage media 200, in step 340. Thus, the data copied to tape 200 omits the selected data file, which is pruned from the new version of the tape. The new version of the tape contains only files to be retained and thus contains fewer data files. In addition, the data is copied to the tape sequentially so that the new version of the tape contains a section of empty tape, preferably at one end of a length of tape, so that the tape may be used to store new or additional data. In some embodiments, copying data to tape 200 (in step 340) sequentially causes a shift in the location of the data. The shift reduces the likelihood of file fragments on storage media. That is, there are generally no empty sections of tape between the retained data files.
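The sequential copy of step 340 might be sketched as follows. This is a simplified model only: the tape is represented as a byte string, and the sketch assumes that every data file to be retained is present in the retrieved set, which the disclosure does not require. The `copy_back` function and its return values are hypothetical.

```python
def copy_back(retrieved, selected, capacity):
    """Write retained files sequentially and report the new layout.

    `retrieved` is a list of (name, data) pairs in tape order. Returns the
    new tape contents, a new index of name -> (offset, length), and the
    number of free bytes left at the end of the tape.
    """
    tape = bytearray()
    new_index = {}
    for name, data in retrieved:
        if name == selected:
            continue  # the selected file is not copied, and is thus deleted
        new_index[name] = (len(tape), len(data))
        tape += data
    if len(tape) > capacity:
        raise ValueError("retained data does not fit on the target media")
    return bytes(tape), new_index, capacity - len(tape)


retrieved = [
    ("file_210", b"AAAA"),
    ("file_220", b"BB"),
    ("file_230", b"CCC"),
]
new_tape, new_index, free_bytes = copy_back(retrieved, "file_220", capacity=16)
```

In this example the third file moves from offset 6 to offset 4, so the space freed by the deletion is consolidated at the end of the tape rather than left as a gap.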
The media management component 100 communicates with storage manager 80 that the file deletion has occurred so that each of the media management component 100 and storage manager 80 may update its respective index 110 and 90 accordingly, step 350. The media management component and storage manager indexes 110 and 90 may be updated with the new data location or offset information for the remaining data files, and by deleting pointers to the deleted file and updating metadata and other storage data relating to the deleted data file.
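The index update of step 350 might be sketched as follows, with each index modeled as a dictionary. The index layouts and the `update_indexes` function are assumptions made for this sketch.

```python
def update_indexes(media_index, manager_index, deleted, new_locations):
    """Apply the result of a selective deletion to both indexes.

    `media_index` maps file name -> (offset, length); `manager_index` maps
    file name -> media management component; `new_locations` maps each
    shifted file name -> its new (offset, length).
    """
    # Delete pointers to the deleted file from both indexes.
    media_index.pop(deleted, None)
    manager_index.pop(deleted, None)
    # Record the new location of each remaining file that moved.
    for name, (offset, length) in new_locations.items():
        media_index[name] = (offset, length)


media_index = {"file_210": (0, 4), "file_220": (4, 2), "file_230": (6, 3)}
manager_index = {"file_210": "mmc_1", "file_220": "mmc_1", "file_230": "mmc_1"}

update_indexes(media_index, manager_index, "file_220", {"file_230": (4, 3)})
```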
Systems and modules described herein may comprise software, firmware, hardware, or any combination(s) of software, firmware, or hardware suitable for the purposes described herein. Software and other modules may reside on servers, workstations, personal computers, computerized tablets, PDAs, and other devices suitable for the purposes described herein. Software and other modules may be accessible via local memory, via a network, via a browser or other application in an ASP context, or via other means suitable for the purposes described herein. Data structures described herein may comprise computer files, variables, programming arrays, programming structures, or any electronic information storage schemes or methods, or any combinations thereof, suitable for the purposes described herein. User interface elements described herein may comprise elements from graphical user interfaces, command line interfaces, and other interfaces suitable for the purposes described herein. Screenshots presented and described herein may be displayed differently as known in the art to input, access, change, manipulate, modify, alter, and work with information.
While the invention has been described and illustrated in connection with preferred embodiments, many variations and modifications as will be evident to those skilled in this art may be made without departing from the spirit and scope of the invention, and the invention is thus not to be limited to the precise details of methodology or construction set forth above as such variations and modifications are intended to be included within the scope of the invention.
This application claims the benefit of U.S. provisional application No. 60/626,076 titled SYSTEM AND METHOD FOR PERFORMING STORAGE OPERATIONS IN A COMPUTER NETWORK, filed Nov. 8, 2004, and U.S. provisional application No. 60/625,746 titled STORAGE MANAGEMENT SYSTEM filed Nov. 5, 2004, each of which is incorporated herein by reference in its entirety. This application is related to the following applications, each of which is hereby incorporated herein by reference in its entirety: application Ser. No. 09/354,058, titled Hierarchical Backup And Retrieval System, filed Jul. 15, 1999, now U.S. Pat. No. 7,395,282; U.S. Pat. No. 6,418,478, titled Pipelined High Speed Data Transfer Mechanism, filed Mar. 11, 1998; application Ser. No. 09/610,738, titled Modular Backup And Retrieval System Used In Conjunction With A Storage Area Network, filed Jul. 6, 2000, now U.S. Pat. No. 7,035,880; application Ser. No. 10/818,749, titled System And Method For Dynamically Performing Storage Operations In A Computer Network, filed Apr. 5, 2004, now U.S. Pat. No. 7,246,207; application Ser. No. 10/877,831, titled Hierarchical System And Method For Performing Storage Operations In A Computer Network, filed Jun. 25, 2004, now U.S. Pat. No. 7,454,569; application Ser. No. 60/567,178, titled Hierarchical System And Method For Performing Storage Operations In A Computer Network, filed Apr. 30, 2004; application Ser. No. 11/269,520, titled System And Method For Performing Multistream Storage Operations, filed Nov. 7, 2005; application Ser. No. 11/269,512, titled System And Method To Support Single Instance Storage Operations, filed Nov. 7, 2005, now U.S. Patent Publication 2006-0224846; application Ser. No. 11/269,514, titled Method And System Of Pooling Storage Devices, filed Nov. 7, 2005; application Ser. No. 11/269,519, titled Method And System For Grouping Storage System Components, filed Nov. 7, 2005, now U.S. Pat. No. 7,500,053; application Ser. No. 11/269,515, titled Systems And Methods For Recovering Electronic Information From A Storage Medium, filed Nov. 7, 2005, now U.S. Pat. No. 7,472,238; and application Ser. No. 11/269,513, titled Method And System For Monitoring A Storage Network, filed Nov. 7, 2005.
Number | Name | Date | Kind |
---|---|---|---|
4686620 | Ng | Aug 1987 | A |
4995035 | Cole et al. | Feb 1991 | A |
5005122 | Griffin et al. | Apr 1991 | A |
5093912 | Dong et al. | Mar 1992 | A |
5133065 | Cheffetz et al. | Jul 1992 | A |
5193154 | Kitajima et al. | Mar 1993 | A |
5212772 | Masters | May 1993 | A |
5226157 | Nakano et al. | Jul 1993 | A |
5239647 | Anglin et al. | Aug 1993 | A |
5241668 | Eastridge et al. | Aug 1993 | A |
5241670 | Eastridge et al. | Aug 1993 | A |
5265159 | Kung | Nov 1993 | A |
5276860 | Fortier et al. | Jan 1994 | A |
5276867 | Kenley et al. | Jan 1994 | A |
5287500 | Stoppani, Jr. | Feb 1994 | A |
5321816 | Rogan et al. | Jun 1994 | A |
5333315 | Saether et al. | Jul 1994 | A |
5347653 | Flynn et al. | Sep 1994 | A |
5410700 | Fecteau et al. | Apr 1995 | A |
5448724 | Hayashi et al. | Sep 1995 | A |
5455926 | Keele et al. | Oct 1995 | A |
5491810 | Allen | Feb 1996 | A |
5495607 | Pisello et al. | Feb 1996 | A |
5504873 | Martin et al. | Apr 1996 | A |
5544345 | Carpenter et al. | Aug 1996 | A |
5544347 | Yanai et al. | Aug 1996 | A |
5559957 | Balk | Sep 1996 | A |
5619644 | Crockett et al. | Apr 1997 | A |
5638509 | Dunphy et al. | Jun 1997 | A |
5673381 | Huai et al. | Sep 1997 | A |
5677900 | Nishida et al. | Oct 1997 | A |
5699361 | Ding et al. | Dec 1997 | A |
5729743 | Squibb | Mar 1998 | A |
5751997 | Kullick et al. | May 1998 | A |
5758359 | Saxon | May 1998 | A |
5761677 | Senator et al. | Jun 1998 | A |
5764972 | Crouse et al. | Jun 1998 | A |
5778395 | Whiting et al. | Jul 1998 | A |
5812398 | Nielsen | Sep 1998 | A |
5813009 | Johnson et al. | Sep 1998 | A |
5813017 | Morris | Sep 1998 | A |
5875478 | Blumenau | Feb 1999 | A |
5875481 | Ashton et al. | Feb 1999 | A |
5887134 | Ebrahim | Mar 1999 | A |
5901327 | Ofek | May 1999 | A |
5924102 | Perks | Jul 1999 | A |
5950205 | Aviani, Jr. | Sep 1999 | A |
5958005 | Thorne et al. | Sep 1999 | A |
5974563 | Beeler, Jr. | Oct 1999 | A |
6021415 | Cannon et al. | Feb 2000 | A |
6026414 | Anglin | Feb 2000 | A |
6052735 | Ulrich et al. | Apr 2000 | A |
6076148 | Kedem et al. | Jun 2000 | A |
6094416 | Ying | Jul 2000 | A |
6131095 | Low et al. | Oct 2000 | A |
6131190 | Sidwell | Oct 2000 | A |
6137864 | Yaker | Oct 2000 | A |
6148412 | Cannon et al. | Nov 2000 | A |
6154787 | Urevig et al. | Nov 2000 | A |
6161111 | Mutalik et al. | Dec 2000 | A |
6167402 | Yeager | Dec 2000 | A |
6212512 | Barney et al. | Apr 2001 | B1 |
6260069 | Anglin | Jul 2001 | B1 |
6269431 | Dunham | Jul 2001 | B1 |
6275953 | Vahalia et al. | Aug 2001 | B1 |
6301592 | Aoyama et al. | Oct 2001 | B1 |
6304880 | Kishi | Oct 2001 | B1 |
6324581 | Xu et al. | Nov 2001 | B1 |
6328766 | Long | Dec 2001 | B1 |
6330570 | Crighton et al. | Dec 2001 | B1 |
6330642 | Carteau | Dec 2001 | B1 |
6343324 | Hubis et al. | Jan 2002 | B1 |
RE37601 | Eastridge et al. | Mar 2002 | E |
6353878 | Dunham | Mar 2002 | B1 |
6356801 | Goodman et al. | Mar 2002 | B1 |
6389432 | Pothapragada et al. | May 2002 | B1 |
6418478 | Ignatius et al. | Jul 2002 | B1 |
6421711 | Blumenau et al. | Jul 2002 | B1 |
6487561 | Ofek et al. | Nov 2002 | B1 |
6519679 | Devireddy et al. | Feb 2003 | B2 |
6538669 | Lagueux, Jr. et al. | Mar 2003 | B1 |
6542972 | Ignatius et al. | Apr 2003 | B2 |
6564228 | O'Connor | May 2003 | B1 |
6658526 | Nguyen et al. | Dec 2003 | B2 |
6789161 | Blendermann et al. | Sep 2004 | B1 |
6973553 | Archibald, Jr. et al. | Dec 2005 | B1 |
7035880 | Crescenti et al. | Apr 2006 | B1 |
7103731 | Gibble et al. | Sep 2006 | B2 |
7103740 | Colgrove et al. | Sep 2006 | B1 |
7107395 | Ofek et al. | Sep 2006 | B1 |
7155465 | Lee et al. | Dec 2006 | B2 |
7246140 | Therrien et al. | Jul 2007 | B2 |
7293133 | Colgrove et al. | Nov 2007 | B1 |
7467167 | Patterson | Dec 2008 | B2 |
20020040376 | Yamanaka et al. | Apr 2002 | A1 |
20020049778 | Bell et al. | Apr 2002 | A1 |
20020069324 | Gerasimov et al. | Jun 2002 | A1 |
20020087822 | Butterworth | Jul 2002 | A1 |
20030196036 | Gibble et al. | Oct 2003 | A1 |
20030225800 | Kavuri | Dec 2003 | A1 |
20040107199 | Dalrymple et al. | Jun 2004 | A1 |
20040193953 | Callahan et al. | Sep 2004 | A1 |
20050033755 | Gokhale et al. | Feb 2005 | A1 |
20090313448 | Gokhale et al. | Dec 2009 | A1 |
Number | Date | Country |
---|---|---|
0259912 | Mar 1988 | EP |
0405926 | Jan 1991 | EP |
0467546 | Jan 1992 | EP |
0774715 | May 1997 | EP |
0809184 | Nov 1997 | EP |
0899662 | Mar 1999 | EP |
0981090 | Feb 2000 | EP |
WO-9513580 | May 1995 | WO |
WO-9912098 | Mar 1999 | WO |
Number | Date | Country |
---|---|---|
60626076 | Nov 2004 | US |
60625746 | Nov 2004 | US |