A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The invention generally relates to data storage systems, and more specifically, relates to sanitization of disks.
When data is deleted from a magnetic disk storage device such as a hard drive, the data can often be recovered. A hard drive typically comprises many addressable data storage units known as “blocks.” A file (or other logical data storage unit) typically comprises data written to the blocks, and an entry in a file structure that includes pointers that point to the blocks storing the data. The “delete” function of many file systems only removes the pointers; the data itself remains intact. Even after a low-level formatting of a hard drive, data stored on the drive may be recoverable. In certain situations, such as when the data includes sensitive information, allowing the data to remain recoverable after it has been “deleted” may be undesirable.
Several techniques for “sanitizing” a magnetic disk exist. Generally, sanitization involves affecting a disk so that data previously stored on it is unrecoverable. One way to sanitize a hard drive is to physically destroy the drive. For example, the drive may be dismantled or otherwise physically altered. Another physical method is to degauss the disk by applying a powerful alternating magnetic field to the disk. The degaussing technique changes the orientation of the magnetic particles on the disk platter.
If the drive is to be reused, it can be sanitized by writing over the data already on the disk. This is known as “media overwrite” sanitization. Media overwrite sanitization may be as simple as writing zeros to every bit on a drive, or writing different predetermined or random patterns to the drive. Writing over the drive once is known as a “single pass” overwrite. Writing over the drive multiple times is known as “multiple pass” overwrite. Different users require different levels of sanitization. For example, a user storing sensitive information, such as confidential trade secrets, may want to perform a greater number of passes.
Several different “patterns” have been developed to perform overwrite sanitization. A pattern is the sequence of bits (ones and zeros) that is written over the data on the drive. In a multiple pass overwrite, a different pattern may be used for each pass. For example, the first pass may use a given pattern, the second pass may use the pattern's complement, and the third pass may use random data.
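By way of illustration only, the three-pass example above (a pattern, its complement, then random data) may be sketched as follows. The sketch operates on an in-memory buffer rather than a real drive, and the names `complement`, `fill`, and `three_pass_overwrite` are hypothetical and chosen for this example.

```python
import os


def complement(pattern: bytes) -> bytes:
    """Return the bitwise complement of a byte pattern."""
    return bytes(b ^ 0xFF for b in pattern)


def fill(pattern: bytes, length: int) -> bytes:
    """Repeat a non-empty pattern until it covers exactly `length` bytes."""
    repeats = -(-length // len(pattern))  # ceiling division
    return (pattern * repeats)[:length]


def three_pass_overwrite(block: bytearray, pattern: bytes) -> list:
    """Overwrite `block` in place: pattern, complement, then random data.

    Returns a snapshot of the block after each pass for inspection.
    """
    size = len(block)
    snapshots = []
    for data in (
        fill(pattern, size),              # pass 1: the pattern
        fill(complement(pattern), size),  # pass 2: the pattern's complement
        os.urandom(size),                 # pass 3: random data
    ):
        block[:] = data
        snapshots.append(bytes(block))
    return snapshots


# An in-memory buffer stands in for one block on the drive (assumption).
block = bytearray(b"secret data!")
passes = three_pass_overwrite(block, b"\x55")
```

With the pattern `0x55` (binary 01010101), the complement is `0xAA` (binary 10101010), so the first two passes together drive every bit to both values before the random pass.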
Sanitization is typically performed at the granularity of the entire storage medium. Usually, when a hard drive or other magnetic medium is being retired or removed from use, the entire drive is sanitized to protect the data. In other instances, though, it may be desirable to sanitize only a portion of the drive. For example, storage users that are subject to government regulations regarding the retention of data may want to delete and sanitize only the files that the users are permitted to delete. The regulations may require that the user retain the other files.
A file may be sanitized as soon as it is deleted. Sanitizing a file as soon as it is deleted typically requires performing a multiple pass overwrite before the operating system receives confirmation that the file has been deleted. However, this is extremely resource intensive, since the hard drive or other storage medium is typically required to write over the same blocks several times before the file is considered sanitized.
The present invention includes a method and a corresponding apparatus for sanitizing storage in a data storage system. In one embodiment, the method includes maintaining data in an active file system, and automatically sanitizing the data in the active file system according to a specified scheduling criterion. Other aspects of the invention will be apparent from the accompanying figures and from the detailed description which follows.
One or more embodiments of the present invention are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
Described herein are methods and apparatuses for disk sanitization using queues. Note that in this description, references to “one embodiment” or “an embodiment” mean that the feature being referred to is included in at least one embodiment of the present invention. Further, separate references to “one embodiment” or “an embodiment” in this description do not necessarily refer to the same embodiment. However, such embodiments are also not mutually exclusive unless so stated, and except as will be readily apparent to those skilled in the art from the description. For example, a feature, structure, act, etc. described in one embodiment may also be included in other embodiments. Thus, the present invention can include a variety of combinations and/or integrations of the embodiments described herein.
According to an embodiment of the invention, when a file or other logical storage unit in an active file system is deleted, the physical data storage units (e.g., blocks) comprising the file or other logical storage unit are moved into a queue. The queue may include several pointers (references) to blocks from different deleted files. The queue is processed (i.e., the blocks referenced by the queue are sanitized) according to a specified scheduling criterion. For example, the blocks referenced by the queue may be sanitized when the queue exceeds a certain size, or at a predetermined interval. Sanitizing blocks using a queue mechanism allows sanitization to be done within an active file system, while the active file system remains accessible to users (e.g., for non-sanitization operations), and makes more efficient use of system resources; the system can sanitize the blocks when resources are available. Examples of non-sanitization operations of the active file system include executing read and write operations on storage devices in response to client requests, maintaining directories, etc.
According to another embodiment, the queue is constantly processed, and the rate of sanitization of the queue may be increased or decreased depending on current system load. When a file is deleted, the operating system receives verification that the file has been deleted, and the blocks that comprised the file are made unavailable and inaccessible. The blocks may be sanitized at a later time, and then made available to be rewritten. This way, a large number of blocks can be sanitized at one time, improving the performance of the operating system.
According to one embodiment of the invention, a queue is a logical construct, such as a file, that includes pointers (or references) to blocks of a file that has in some way been altered, including being deleted or moved to another location. The queue is used to determine which blocks need to be sanitized.
When performing sanitization, a file system may parse the queue to determine an order of blocks to sanitize, so that the blocks in the queue may be sanitized at a later time, thereby consolidating sanitization activity and improving system performance. The data blocks still exist in their original location; however, their association with the file (or other logical construct) to which they originally belonged has been dissolved.
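By way of illustration only, the deferred sanitization described above may be sketched as a toy in-memory model: deleting a file returns immediately, the file's blocks are queued and withheld from reuse, and the queue is processed once a size criterion is met. The class `SanitizingStore`, its attributes, and the `queue_limit` parameter are hypothetical, and a single zero-fill stands in for a multiple pass overwrite.

```python
from collections import deque


class SanitizingStore:
    """Toy model: blocks move from a file, to a queue, to the free list."""

    def __init__(self, num_blocks, queue_limit):
        self.data = {n: b"" for n in range(num_blocks)}  # block number -> contents
        self.free = set(range(num_blocks))               # blocks available for writes
        self.files = {}                                  # file name -> block numbers
        self.queue = deque()                             # blocks awaiting sanitization
        self.queue_limit = queue_limit                   # hypothetical size criterion

    def write_file(self, name, chunks):
        """Store each chunk in the lowest-numbered free block."""
        blocks = []
        for chunk in chunks:
            n = min(self.free)
            self.free.remove(n)
            self.data[n] = chunk
            blocks.append(n)
        self.files[name] = blocks

    def delete_file(self, name):
        """Drop the file's pointers and queue its blocks; nothing is overwritten yet."""
        self.queue.extend(self.files.pop(name))
        if len(self.queue) >= self.queue_limit:
            self.process_queue()

    def process_queue(self):
        """Sanitize every queued block, then release it to the free list."""
        while self.queue:
            n = self.queue.popleft()
            self.data[n] = b"\x00" * len(self.data[n])   # stand-in for multiple overwrites
            self.free.add(n)


store = SanitizingStore(num_blocks=8, queue_limit=3)
store.write_file("A", [b"aa", b"bb"])
store.write_file("B", [b"cc", b"dd"])
store.delete_file("A")  # two blocks queued; below the limit, so not yet sanitized
```

After `delete_file("A")` the two blocks are neither part of any file nor on the free list, which models the blocks being unavailable until they are sanitized. Deleting file "B" would bring the queue to four blocks, meet the limit of three, and trigger processing of the whole queue.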
The client 104 accesses and uses a volume 106 for data service. The volume 106 may comprise one or more drives, including one or more magnetic disks such as hard drives. The client 104 may communicate with the storage server 102 over a network 108 using various communications protocols to store and access data stored on the volume 106. The storage server 102 may be any one of several different types of storage servers, including those that employ a Network Attached Storage (NAS) or Storage Area Network (SAN) approach, or both. For example, the storage server 102 may be a file server, or filer, that stores data in the form of files.
A queue file 110 may be stored on the volume 106. The queue file 110 may be a file that references the data storage units, such as blocks, that are waiting to be sanitized. For example, when a file is deleted, the storage server 102 may create pointers to the blocks belonging to the deleted file in the queue file 110. The process typically only requires adding pointers to the queue file that point to the deleted blocks and destroying the pointers of the deleted file. According to one embodiment, when a file is deleted, the blocks belonging to that file are added to the end of the queue file 110, so that the deleted blocks may be sanitized in the order in which they were deleted. Since the queue file 110 is persistent, its contents are retained even when power is disrupted. If the storage server 102 loses power, the blocks, and therefore the data belonging to the deleted files, will still be sanitized when power is restored.
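By way of illustration only, a persistent queue file of the kind described above may be sketched as an append-only text file of block numbers. The class `QueueFile`, the one-number-per-line format, and the file name are assumptions made for this example; they are not a description of any particular on-disk layout.

```python
import os
import tempfile


class QueueFile:
    """Append-only list of block numbers kept in a file, one per line."""

    def __init__(self, path):
        self.path = path

    def enqueue(self, block_numbers):
        """Append block numbers to the end of the file and force them to storage."""
        with open(self.path, "a", encoding="ascii") as f:
            for n in block_numbers:
                f.write(f"{n}\n")
            f.flush()
            os.fsync(f.fileno())

    def pending(self):
        """Return the queued block numbers in the order they were added."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="ascii") as f:
            return [int(line) for line in f if line.strip()]


path = os.path.join(tempfile.mkdtemp(), "sanitize.queue")
QueueFile(path).enqueue([7, 8, 9])  # blocks of the first deleted file
QueueFile(path).enqueue([2, 3])     # blocks of a file deleted later

# A fresh object stands in for the server restarting after a power loss.
recovered = QueueFile(path).pending()
```

Because entries are only ever appended, reading the file back yields the blocks in deletion order, and nothing held only in volatile memory is needed to resume sanitization.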
According to other embodiments, a queue 112 may comprise any other type of memory. The queue 112 may be used in place of, or in addition to, the queue file 110. For example, the queue 112 may be a persistent memory such as a flash memory or a battery powered memory. According to one embodiment, the queue 112 may store pointers to the deleted blocks physically located on the volume 106. If the queue 112 is persistent, it can easily be restored if power to the storage server 102 is interrupted. Although the following description refers to a queue file, it is understood that other types of queues may be used in place of a file.
The processor 202 is the central processing unit (CPU) of the filer 200 and, thus, controls the overall operation of the filer 200. In certain embodiments, the processor 202 accomplishes this by executing software stored in main memory 204. The processor 202 may be, or may include, one or more programmable general-purpose or special-purpose microprocessors, digital signal processors (DSPs), programmable controllers, application specific integrated circuits (ASICs), programmable logic devices (PLDs), or the like, or a combination of such devices.
The main memory 204, which is generally some form of random access memory (RAM), stores the operating system 208 of the filer 200. Techniques of the present invention may be implemented within the operating system 208, as described further below. The operating system 208 may be, for example, the ONTAP operating system by Network Appliance, Inc., of Sunnyvale, Calif. (NetApp®). Also connected to the processor 202 through the bus system 206 are a network adapter 210 and a storage adapter 212. The network adapter 210 provides the filer 200 with the ability to communicate with remote devices, such as clients and/or another filer, over a network and may be, for example, an Ethernet adapter. The storage adapter 212 allows the filer to access the external mass storage devices such as a volume 214, and may be, for example, a Fibre Channel (FC) adapter or SCSI adapter.
The volume 214 may, as described above regarding the volume 106 of
It is understood that although a filer 200 is described in
The operating system 208 also includes a user interface 306, through which a network administrator or other user can control and/or configure the filer (e.g., remotely from a management station). The user interface 306 may generate a command line interface and/or a graphical user interface for this purpose. On the client side the operating system 208 (see
On the storage device side, the operating system 208 includes a storage access layer 312 and, at the lowest level, a driver layer 314. The storage access layer 312 implements a disk storage protocol such as RAID, while the driver layer 314 implements a lower-level storage device access protocol, such as Fibre Channel or SCSI.
The operating system 208 also includes a sanitization module 316. The sanitization module 316 may be invoked by the file system 308 when the queue file 216 of
The deleted file 402 is labeled file ‘A’. The deleted file ‘A’ 402 includes five blocks 404a-e. The files 402 and 216, as shown in
According to other embodiments, the file system 308 may sanitize the blocks 406 and 402 using a different order. For example, if there are several blocks in the queue file 216 that are physically located near each other, the sanitization module 316 may sanitize those blocks at the same time, even if other blocks which were deleted earlier have not yet been sanitized. Additionally, other considerations may call for a different order to maintain efficiency.
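By way of illustration only, one way to pick out blocks that are located near each other is to group the queued block numbers into contiguous runs and handle the longest runs first. The sketch assumes that adjacent block numbers are physically adjacent, which need not hold on a real drive, and the function names and the `max_gap` parameter are hypothetical.

```python
def contiguous_runs(block_numbers, max_gap=1):
    """Group block numbers into runs whose neighbors differ by at most max_gap."""
    runs = []
    for n in sorted(set(block_numbers)):
        if runs and n - runs[-1][-1] <= max_gap:
            runs[-1].append(n)   # close enough to extend the current run
        else:
            runs.append([n])     # start a new run
    return runs


def locality_order(queue):
    """Order runs longest-first; runs of equal length keep ascending block order."""
    return sorted(contiguous_runs(queue), key=len, reverse=True)


queue = [90, 40, 12, 13, 41, 14, 15]  # block numbers in deletion order
order = locality_order(queue)
```

Here block 90 was deleted first, yet the run of blocks 12 through 15 is scheduled ahead of it because those four blocks can be handled together.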
According to another embodiment, the sanitization process may sanitize several blocks at once. For example, it may be more efficient to sanitize a group of blocks (say ten or twenty blocks) at one time using multiple overwrites. For example, a hard drive includes a magnetic head that scans across a rotating magnetic platter to read and write to the physical blocks on the platter. If several blocks are sanitized at once, the first of the multiple overwrites may be performed during a single pass of the magnetic head. For example, while the head is moving from one end of the disk to the other, the head may write ten blocks during that pass. Ten blocks are then partially sanitized. Further passes of the magnetic head may be used to perform the subsequent necessary multiple overwrites.
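By way of illustration only, sanitizing a group of blocks together may be sketched as a pass-major loop: each overwrite pattern is applied to every block in the group before the next pattern begins, so one sweep covers the whole group. The function `batched_overwrite`, the dictionary standing in for the disk, and the single-byte patterns are assumptions made for this example.

```python
def batched_overwrite(disk, batch, patterns):
    """Apply each overwrite pattern to the whole batch before starting the next.

    `disk` maps block number -> bytearray, and each pattern is a single byte.
    Returns the (pass_index, block_number) writes in the order performed.
    """
    writes = []
    for pass_index, pattern in enumerate(patterns):
        for n in sorted(batch):                  # one sweep across the batch
            disk[n][:] = pattern * len(disk[n])  # overwrite the block in place
            writes.append((pass_index, n))
    return writes


# In-memory buffers stand in for blocks on the platter (assumption).
disk = {n: bytearray(b"data") for n in (5, 3, 9)}
writes = batched_overwrite(disk, batch=[5, 3, 9], patterns=[b"\x55", b"\xaa", b"\x00"])
```

After the first sweep all three blocks have received their first overwrite and are partially sanitized; two further sweeps complete the remaining overwrites.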
After the blocks 402 and 406 are sanitized, they may be allocated to a free block list. The free block list includes a list of unused blocks that are available for use by the operating system 308 of
The user may select the radio button 502 to sanitize the queue file 216 based on the level of available system resources. For example, the sanitization module 316 (see
The user may also select the radio button 504 to clear the queue file 216 (see
The radio button 506 sanitizes the entire queue file 216 (see
According to another embodiment, a check box 520 may be selected to indicate that the user wishes to sanitize all ‘old’ blocks. ‘Old’ blocks may be defined as those that have been in the queue file 216 (see
It is understood that other configurations of the panel 500 may be chosen. For example, the pull-down menus 510-518 may be implemented as fill-in fields, the radio buttons 502-508 may be implemented as a pull-down menu, or the check box 520 may be implemented as a radio button. According to other embodiments, other criteria may be implemented in the panel 500. Further, according to another embodiment, the control panel 500 may be controlled by the file system 208 (see
In block 606, the reallocated blocks are made unavailable. The file system 308 (see
In block 702, the sanitization module 316 of
In block 704, it is determined whether the scheduling criterion is the system resources criterion indicated by the radio button 502 of
In block 706, the sanitization process begins in the system background. In other words, the system may sanitize blocks in the queue file 216 (see
In block 716, it is determined whether the specified scheduling criterion is the specified time criterion that may be chosen using the radio button 504 of
In block 722, it is determined whether the criterion is either the size or number of blocks criteria, chosen by either the radio buttons 506 or 508 (see
If, in block 722, it is determined that the criterion is not either the size or number of blocks criterion, in block 726, it is determined whether the age of the oldest blocks in the queue file 216 (see
The techniques introduced above have been described in the context of a network attached storage (NAS) environment. However, these techniques can also be applied in various other contexts. For example, the techniques introduced above can be applied in a storage area network (SAN) environment. A SAN is a highly efficient network of interconnected, shared storage devices. One difference between NAS and SAN is that in a SAN, the storage server (which may be an appliance) provides a remote host with block-level access to stored data, whereas in a NAS configuration, the storage server provides clients with file-level access to stored data. Thus, the techniques introduced above are not limited to use in a file server or in a NAS environment.
For example, using one embodiment of a SAN, the sanitization module 316 may be a part of a file system of the client 104. Using this embodiment, a client sanitizes blocks stored by the SAN. According to another embodiment, a “virtualized” SAN may be used. A virtualized SAN may include a file having a number of blocks that are available for use by clients. Using this virtual SAN, a client may access the pool of blocks in the file as though the file were a standard SAN device.
This invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident to persons having the benefit of this disclosure that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the invention. The specification and drawings are accordingly to be regarded in an illustrative rather than in a restrictive sense.
This application claims the benefit of U.S. Provisional Patent application No. 60/636,423, filed on Dec. 14, 2004 and entitled, “Disk Sanitization Using Queues,” which is incorporated herein by reference.