Conventional data storage systems include one or more storage devices connected to a controller or manager. As used herein, the term “data storage device” refers to any device or apparatus utilizable for the storage of data, e.g., a disk drive. For explanatory purposes only, and not as an intent to limit the scope of the invention, the term “disk drive” as used in this document is synonymous with the term “data storage device.”
To protect against the loss of data in the event of a disk drive failure, redundant copies of the data may be kept on multiple disks such that if a disk fails, its contents can be reconstructed from the redundant data on the other disks. Traditionally, a service person will physically replace the failed disk with a new one. However, this approach can cause undue delay in the restoration of data redundancy and may lead to the loss of the data entirely. Another common approach is to have a hot standby disk that is not in use, but can be used to automatically replace a disk that fails. The main drawback of having a hot standby disk is that additional idle hardware must be purchased, which is both expensive and inefficient.
A method and system for restoring data redundancy without the use of a hot standby disk is disclosed. Instead of having a hot standby disk, reserve storage space is maintained in the disk drives. In one embodiment, the reserve storage space comprises unallocated storage space on the disk drives. Once a disk drive failure is detected, data redundancy is restored on the reserve storage space.
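For explanatory purposes only, the following Python sketch models this approach at a high level. The Disk class, the reconstruct callback, and restore_redundancy are hypothetical names chosen for illustration; they are not taken from the specification.

```python
# Illustrative sketch only; Disk, reconstruct, and restore_redundancy
# are hypothetical names, not part of the specification.

class Disk:
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.extents = {}  # extent_id -> bytes

    def free_space(self):
        # Unallocated space doubles as the reserve storage space.
        used = sum(len(d) for d in self.extents.values())
        return self.capacity - used

def restore_redundancy(failed, survivors, reconstruct):
    """Rebuild each extent of a failed disk on a survivor's reserve space."""
    for extent_id in list(failed.extents):
        # Recompute the lost extent from a mirror copy or parity set.
        data = reconstruct(extent_id, survivors)
        # Place it on the survivor with the most unallocated space.
        target = max(survivors, key=Disk.free_space)
        if target.free_space() < len(data):
            raise RuntimeError("insufficient reserve storage space")
        target.extents[extent_id] = data
```

Because the reserve space is simply unallocated space on disks already in service, no idle standby hardware is required for this to run.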
Further details of aspects, objects, and advantages of the invention are described below in the detailed description, drawings, and claims. Both the foregoing general description and the following detailed description are exemplary and explanatory, and are not intended to be limiting as to the scope of the invention.
The accompanying drawings are included to provide a further understanding of the invention and, together with the Detailed Description, serve to explain the principles of the invention.
Restoring data redundancy in a storage system without a hot standby disk is disclosed. Instead of purchasing additional idle hardware, which is both expensive and inefficient, unallocated storage space on the disk drives in the storage system is used to restore data redundancy, thereby taking full advantage of existing hardware. Different redundancy methods, such as mirroring or parity protection, may be employed to ensure continued access to data. Mirroring involves the replication of data at two or more separate and distinct disk drives. When parity protection is employed, lost data may be recalculated from the parity piece and the remaining data pieces in the corresponding parity set.
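To make the parity recalculation concrete, the sketch below uses bytewise XOR, one common way to compute a parity piece (the specification does not prescribe a particular parity scheme): the parity piece is the XOR of equal-length data pieces, so any single lost piece equals the XOR of the parity piece and the surviving pieces.

```python
def xor_bytes(blocks):
    """Bytewise XOR of equal-length byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

# A parity set of three data pieces plus their parity piece.
data_pieces = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_bytes(data_pieces)

# Lose the second piece, then recover it from the parity piece
# and the remaining data pieces in the parity set.
recovered = xor_bytes([data_pieces[0], data_pieces[2], parity])
assert recovered == b"BBBB"
```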
In order to allow for the restoration of data redundancy in the event of a disk drive failure, a reserve storage space is maintained. In the embodiment shown in the accompanying figures, the reserve storage space comprises unallocated storage space distributed across the disk drives of the storage system; when a disk drive fails, its contents are reconstructed on that unallocated storage space.
Storage systems may associate each disk drive with a failure group. Two disk drives are in the same failure group if they share a common failure condition that is projected to affect both drives at the same time, e.g., if they depend upon a common piece of hardware which can fail without the entire system failing.
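As an illustrative sketch only, drives might be assigned to failure groups according to the shared hardware component on which they depend; the controller-based grouping and all names below are assumptions, not taken from the specification.

```python
from collections import defaultdict

def failure_groups(drives):
    """Group drives by the hardware component whose failure they share."""
    groups = defaultdict(list)
    for drive in drives:
        groups[drive["controller"]].append(drive["name"])
    return dict(groups)

drives = [
    {"name": "disk1", "controller": "ctrl-A"},
    {"name": "disk2", "controller": "ctrl-A"},  # same controller as disk1
    {"name": "disk3", "controller": "ctrl-B"},
]
print(failure_groups(drives))
# {'ctrl-A': ['disk1', 'disk2'], 'ctrl-B': ['disk3']}
```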
A storage system may also mirror at granularities much smaller than the entire disk drive; for example, individual data extents may be mirrored rather than the entire contents of a disk drive.
Examples of storage systems that utilize failure groups and mirror at a finer granularity are disclosed in co-pending U.S. application Ser. No. 09/178,387 (now U.S. Pat. No. 6,530,035), entitled “Method and System for Managing Storage Systems Containing Redundancy Data,” filed on Oct. 23, 1998, and U.S. Pat. No. 6,405,284, entitled “Distributing Data Across Multiple Data Storage Devices in a Data Storage System,” issued on Jun. 11, 2002, both of which are incorporated herein by reference in their entireties.
In the example described here, data and parity extents are distributed across disk drives 302, 304, 306, and 308. Disk drive 304 contains data extents B1, B2, and B3, as well as C2′, a mirror copy of data extent C2.
Upon failure of disk drive 304, data extents B1, B2, and B3 can be reconstructed from the remaining data and parity extents in the corresponding parity sets, and a mirror copy of data extent C2 can be reconstructed from the primary copy. However, in this example, the data and parity extents on the surviving disk drives are redistributed prior to restoring data redundancy, because merely reconstructing data extents B1, B2, B3, and C2′ on the unallocated storage space in the remaining disk drives may not restore data redundancy: some of the redundant data could end up on the same disk drive as the primary data it protects. In the embodiment shown, data extents A2 and A3 are reallocated to disk drives 306 and 308, respectively. Parity data extents P2 and P3 are reallocated to disk drives 308 and 302, respectively. Data extent C3 is reallocated to disk drive 302. Data extent B1 is then reconstructed on the unallocated storage space in disk drive 308. Data extent B2 is reconstructed on the unallocated storage space in disk drive 302. Data extents B3 and C2′ are reconstructed on the unallocated storage space in disk drive 306.
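A minimal sketch of the constraint that motivates the redistribution step, assuming hypothetical structures: a reconstructed extent may only be placed on a drive that holds no other member of its parity set or mirror pair.

```python
def pick_target(extent, candidates, placement):
    """Return a drive with free space that holds no related extent.

    placement maps an extent to the set of drive names already holding
    its primary copy or other members of its parity set (hypothetical).
    """
    related = placement[extent]
    # Prefer the drive with the most free space, for load balancing.
    for drive in sorted(candidates, key=lambda d: -d["free"]):
        if drive["name"] not in related and drive["free"] > 0:
            return drive
    return None  # no legal target: extents must be redistributed first

placement = {"X": {"diskA"}}  # extent X's primary copy lives on diskA
candidates = [{"name": "diskA", "free": 8}, {"name": "diskB", "free": 4}]
print(pick_target("X", candidates, placement))
# {'name': 'diskB', 'free': 4} -- diskA is excluded despite more space
```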
In one embodiment, mirror partners may be defined to limit the number of disk drives that protect data for redundancy purposes. Each disk drive is associated with one or more mirror partners. In another embodiment, the number of mirror partners for any particular disk drive is limited. Limiting the number of mirror partners for a disk drive reduces the number of disk drives that contain redundant copies of a particular piece of data, thereby reducing the probability of losing that data if a multiple disk drive failure occurs. The number of mirror partners for any particular disk drive can be different from that of other disk drives. In one embodiment, each disk drive is in a different failure group from its mirror partners.
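Purely for illustration, and assuming a fixed partner limit and a simple greedy choice (neither of which is specified here), mirror partners might be assigned as follows, never pairing drives from the same failure group.

```python
MAX_PARTNERS = 2  # hypothetical per-drive limit on mirror partners

def assign_mirror_partners(drives):
    """Greedily give each drive up to MAX_PARTNERS partners from
    other failure groups (illustrative policy only)."""
    partners = {d["name"]: [] for d in drives}
    for d in drives:
        for other in drives:
            if other is d or other["group"] == d["group"]:
                continue  # partners must be in different failure groups
            if len(partners[d["name"]]) < MAX_PARTNERS:
                partners[d["name"]].append(other["name"])
    return partners

drives = [
    {"name": "disk1", "group": "fg1"},
    {"name": "disk2", "group": "fg2"},
    {"name": "disk3", "group": "fg2"},
    {"name": "disk4", "group": "fg3"},
]
print(assign_mirror_partners(drives))
# disk1 -> ['disk2', 'disk3'], disk2 -> ['disk1', 'disk4'], ...
```

Note that the limit caps how many drives can hold redundant copies of any one drive's data, which is exactly what bounds the exposure to multiple simultaneous failures.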
Referring to the accompanying figure, each disk drive is associated with a set of mirror partners drawn from other failure groups.
Four separate parity sets are illustrated; each parity set comprises a parity piece and the data pieces from which it is calculated, stored on disk drives that are mirror partners of the disk drive containing the parity piece.
Disk drives that are listed as a mirror partner of another disk drive may not always be available to be used for mirroring or parity protection. This may occur, for example, if the disk drive selected to store redundancy data (e.g., mirrored data) does not contain sufficient unallocated storage space. If this occurs, then another mirror partner of the disk drive containing the primary data (e.g., the parity data) is selected. In one embodiment, if there is insufficient space on the mirror partners of a disk drive containing the primary data to allocate all of the required redundancy data, then the primary data is deallocated and a new disk drive is selected to allocate the primary data.
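A hedged sketch of this fallback, with hypothetical helpers: each mirror partner of the drive holding the primary data is tried in turn; if none has sufficient space, the tentative primary placement is abandoned and another drive is selected for the primary data.

```python
def place_redundancy(primary_drive, size, partners, free):
    """Return a mirror partner with room for the redundancy data, else None."""
    for p in partners[primary_drive]:
        if free[p] >= size:
            return p
    return None

def allocate(extent_size, drives, partners, free):
    """Allocate a primary extent plus its redundancy data (illustrative)."""
    for primary in drives:
        if free[primary] < extent_size:
            continue
        mirror = place_redundancy(primary, extent_size, partners, free)
        if mirror is not None:
            free[primary] -= extent_size  # allocate the primary data
            free[mirror] -= extent_size   # allocate the redundancy data
            return primary, mirror
        # No partner has room: abandon (deallocate) this tentative
        # primary placement and try the next drive for the primary data.
    return None
```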
As shown in this example, the failed disk drive contained Parity 2, Redundancy Data M, Data I, and Data L, each of which must be restored on the surviving disk drives.
In order to retain load balancing and to reconstruct Parity 2, Data C is deallocated from disk drive 404 and reallocated to disk drive 410. Parity 2 is then reconstructed on the unallocated storage space in disk drive 406, using Data C and Data D. Disk drives 410 and 420, which contain Data C and Data D, respectively, are mirror partners of disk drive 406. Data M is deallocated from disk drive 412 and reallocated to disk drive 414 to preserve load balancing. Redundancy Data M is then reconstructed on disk drive 420, a mirror partner of disk drive 414. Data I is reconstructed on disk drive 416, another mirror partner of disk drive 402, which contains Redundancy Data I. To maintain load balancing, Redundancy Data L is deallocated from disk drive 416 and reallocated to disk drive 418. Data L is then reconstructed on disk drive 424, a mirror partner of disk drive 418.
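The load-balancing choice recurring throughout this example might, as one illustrative possibility, simply prefer the eligible drive with the most unallocated space; the helper below is hypothetical.

```python
def least_loaded(eligible, free):
    """Pick the eligible drive with the most unallocated space
    (one possible load-balancing policy, not mandated here)."""
    return max(eligible, key=lambda d: free[d])

free = {"disk410": 40, "disk414": 25, "disk420": 30}
print(least_loaded(["disk410", "disk420"], free))  # -> disk410
```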
Computer system 800 may be coupled via bus 802 to a display 812, such as a cathode ray tube (CRT), for displaying information to a computer user. An input device 814, including alphanumeric and other keys, is coupled to bus 802 for communicating information and command selections to processor 804. Another type of user input device is cursor control 816, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 804 and for controlling cursor movement on display 812. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
In one embodiment, computer system 800 is used to restore data redundancy in a storage system without a hot standby disk. According to one embodiment, such use is provided by computer system 800 in response to processor 804 executing one or more sequences of one or more instructions contained in main memory 806. Such instructions may be read into main memory 806 from another computer-readable medium, such as storage device 810. Execution of the sequences of instructions contained in main memory 806 causes processor 804 to perform the process steps described herein. One or more processors in a multi-processing arrangement may also be employed to execute the sequences of instructions contained in main memory 806. In other embodiments, hard-wired circuitry may be used in place of or in combination with software instructions. Thus, embodiments of the invention are not limited to any specific combination of hardware circuitry and software.
The term “computer-readable medium” as used herein refers to any medium that participates in providing instructions to processor 804 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 810. Volatile media includes dynamic memory, such as main memory 806. Transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 802. Transmission media can also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.
Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave as described hereinafter, or any other medium from which a computer can read.
Various forms of computer-readable media may be involved in carrying one or more sequences of one or more instructions to processor 804 for execution. For example, the instructions may initially be carried on a magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 800 can receive the data on the telephone line and use an infrared transmitter to convert the data to an infrared signal. An infrared detector coupled to bus 802 can receive the data carried in the infrared signal and place the data on bus 802. Bus 802 carries the data to main memory 806, from which processor 804 retrieves and executes the instructions. The instructions received by main memory 806 may optionally be stored on storage device 810 either before or after execution by processor 804.
Computer system 800 also includes a communication interface 818 coupled to bus 802. Communication interface 818 provides a two-way data communication coupling to a network link 820 that is connected to a local network 822. For example, communication interface 818 may be an integrated services digital network (ISDN) card or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 818 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 818 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
Network link 820 typically provides data communication through one or more networks to other data devices. For example, network link 820 may provide a connection through local network 822 to a host computer 824 or to data equipment operated by an Internet Service Provider (ISP) 826. ISP 826 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the “Internet” 828. Local network 822 and Internet 828 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 820 and through communication interface 818, which carry the digital data to and from computer system 800, are exemplary forms of carrier waves transporting the information.
Computer system 800 can send messages and receive data, including program code, through the network(s), network link 820 and communication interface 818. In the Internet example, a server 830 might transmit a requested code for an application program through Internet 828, ISP 826, local network 822 and communication interface 818. In accordance with the invention, one such downloaded application provides for managing, storing, and retrieving data from a storage system containing multiple data storage devices. The received code may be executed by processor 804 as it is received, and/or stored in storage device 810, or other non-volatile storage for later execution. In this manner, computer system 800 may obtain application code in the form of a carrier wave.
In the foregoing specification, the invention has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention. For example, the above-described process flows are described with reference to a particular ordering of process actions. However, the ordering of many of the described process actions may be changed without affecting the scope or operation of the invention. The specification and drawings are, accordingly, to be regarded in an illustrative rather than restrictive sense.
This application is a continuation-in-part of U.S. patent application Ser. No. 09/178,387 (now U.S. Pat. No. 6,530,035), filed on Oct. 23, 1998, entitled “Method and System for Managing Storage Systems Containing Redundancy Data,” which is incorporated herein by reference in its entirety.
Number | Name | Date | Kind |
---|---|---|---|
5155845 | Beal et al. | Oct 1992 | A |
5258984 | Menon et al. | Nov 1993 | A |
5287459 | Gniewek | Feb 1994 | A |
5388108 | DeMoss et al. | Feb 1995 | A |
5469443 | Saxena | Nov 1995 | A |
5485571 | Menon | Jan 1996 | A |
5524204 | Verdoorn, Jr. | Jun 1996 | A |
5542065 | Burkes et al. | Jul 1996 | A |
5559764 | Chen et al. | Sep 1996 | A |
5574851 | Rathunde | Nov 1996 | A |
5615352 | Jacobson et al. | Mar 1997 | A |
5651133 | Burkes et al. | Jul 1997 | A |
5666512 | Nelson et al. | Sep 1997 | A |
5790774 | Sarkozy | Aug 1998 | A |
5862158 | Baylor et al. | Jan 1999 | A |
5875456 | Stallmo et al. | Feb 1999 | A |
5893919 | Sarkozy et al. | Apr 1999 | A |
5987566 | Vishlitzky et al. | Nov 1999 | A |
6000010 | Legg | Dec 1999 | A |
6035373 | Iwata | Mar 2000 | A |
6047294 | Deshayes et al. | Apr 2000 | A |
6058454 | Gerlach et al. | May 2000 | A |
6067199 | Blumenau | May 2000 | A |
6092169 | Murthy et al. | Jul 2000 | A |
6138125 | DeMoss | Oct 2000 | A |
6195761 | Kedem | Feb 2001 | B1 |
6223252 | Bandera et al. | Apr 2001 | B1 |
6233696 | Kedem | May 2001 | B1 |
6405284 | Bridge | Jun 2002 | B1 |
6425052 | Hashemi | Jul 2002 | B1 |
6915448 | Murphy et al. | Jul 2005 | B2 |
20030088803 | Arnott et al. | May 2003 | A1 |
20060277432 | Patel et al. | Dec 2006 | A1 |
Number | Date | Country | |
---|---|---|---|
Parent | 09178387 | Oct 1998 | US |
Child | 10327208 | US |