The present invention relates to the field of data management via data storage systems and particularly to a system and method for utilizing mirroring in a data storage system to promote improved data accessibility and improved system efficiency.
Currently available data storage systems/methods for providing data management in data storage systems may not provide a desired level of performance.
Therefore, it may be desirable to provide a data storage system/method(s) for providing data management in a data storage system which addresses the above-referenced shortcomings of currently available solutions.
Accordingly, an embodiment of the present invention is directed to a method for utilizing mirroring in a data storage system to promote improved data accessibility and improved system efficiency, comprising: establishing a first set of drives of the system in active mode; establishing a second set of drives of the system in passive mode, passive mode being a lower power mode than active mode; writing a first portion of data to a first drive, the first drive being included in the first set of drives; writing a copy of the first portion of data to a second drive, the second drive being included in the first set of drives; updating metadata of the system to indicate that the copy of the first portion of data is located on the second drive; activating a third drive, the third drive being included in the second set of drives, the third drive being activated from passive mode to active mode; writing a second copy of the first portion of data to the third drive; re-establishing the third drive in passive mode; updating the metadata of the system to indicate that the second copy of the first portion of data is located on the third drive; deleting the copy of the first portion of data from the second drive; and when the first drive fails, re-activating the third drive from passive mode into active mode to allow for host access to the second copy of the first portion of data.
A further embodiment of the present invention is directed to a computer-readable medium having computer-executable instructions for performing a method for utilizing mirroring in a data storage system to promote improved data accessibility and improved system efficiency, comprising: establishing a first set of drives of the system in active mode; establishing a second set of drives of the system in passive mode, passive mode being a lower power mode than active mode; writing a first portion of data to a first drive, the first drive being included in the first set of drives; writing a copy of the first portion of data to a second drive, the second drive being included in the first set of drives; updating metadata of the system to indicate that the copy of the first portion of data is located on the second drive; activating a third drive, the third drive being included in the second set of drives, the third drive being activated from passive mode to active mode; writing a second copy of the first portion of data to the third drive; re-establishing the third drive in passive mode; updating the metadata of the system to indicate that the second copy of the first portion of data is located on the third drive; deleting the copy of the first portion of data from the second drive; and when the first drive fails, re-activating the third drive from passive mode into active mode to allow for host access to the second copy of the first portion of data.
A still further embodiment of the present invention is directed to a data storage system, including: a first set of drives, the first set of drives being established in active mode, a first drive included in the first set of drives being configured for storing a portion of data, a second drive included in the first set of drives being configured for storing a first copy of the portion of data; and a second set of drives, the second set of drives being established in passive mode, passive mode being a lower power mode than active mode, a first drive included in the second set of drives being configured for being activated from passive mode to active mode, wherein, when the first drive included in the second set of drives is activated from passive mode to active mode, the system is configured for writing a second copy of the portion of data to the first drive included in the second set of drives, re-establishing the first drive included in the second set of drives into passive mode, updating metadata of the system to indicate that the second copy of the portion of data is located on the first drive included in the second set of drives, and deleting the first copy of the portion of data from the second drive included in the first set of drives, wherein Controlled Replication Under Scalable Hashing algorithms are implemented by the system for data mapping, and wherein, when the first drive included in the first set of drives fails, the system is further configured for re-activating the first drive included in the second set of drives from passive mode to active mode to allow for host access to the second copy of the portion of data.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not necessarily restrictive of the invention as claimed. The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and together with the general description, serve to explain the principles of the invention.
The numerous advantages of the present invention may be better understood by those skilled in the art by reference to the accompanying figures in which:
Reference will now be made in detail to the presently preferred embodiments of the invention, examples of which are illustrated in the accompanying drawings.
Power usage in data centers is becoming an increasingly important issue. Spinning disk drives account for a large proportion of data center power consumption. Spinning disk drives also produce heat, which results in increased cooling costs for the data centers. In a number of data centers/data storage systems, drives may consume power and produce heat even if: 1) data on said drives is not being accessed; and/or 2) said drives hold no data (e.g., hot spare drives).
Massive Array of Idle Disks (MAID) systems (such as disclosed in: The Case for Massive Arrays of Idle Disks (MAID), Colarelli et al., Dept. of Computer Science, Univ. of Colo., Boulder, pp. 1-6, Jan. 7, 2002, which is herein incorporated by reference in its entirety) may be implemented in an attempt to address the above-referenced issues. However, MAID systems do not take into account the amount of data that has been written to the system. In a MAID system, a fixed number of active drives and a fixed number of passive drives are allocated when the MAID system is initially configured, and these allocations do not change dynamically as the MAID system fills with data. MAID systems are further disadvantageous in that certain portions of data may not be accessible without incurring a drive spin-up delay. While such a delay may be acceptable for many workloads, such as backups and archives, said delay may not be tolerable for more active systems.
Referring to
In exemplary embodiments, the active drives 102 handle both reads and writes for the system 100. In current embodiments of the present invention, data/data segment(s) may be written to the active drive group 102, such that for each data segment (ex.—primary copy) written to/stored on a first active drive 106 included in the active drive group 102, a corresponding temporary secondary copy of the data segment may be written to and stored on a second active drive 108 included in the active drive group 102. Thus, by using unallocated space on an already active drive (as described above), host write operations do not need to activate/spin-up/switch to active mode any of the passive drives 104 in order to write both a primary copy and a temporary secondary copy of the data to the system 100. In
In additional embodiments, the system 100 is configured for flushing/copying the temporary secondary copy/copies of the data segment(s) from the active drive(s) 102 to the passive drive(s) 104, thereby creating a secondary copy/flushed secondary copy which is located/stored on the passive drive(s) 104.
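By way of illustration only, the write and flush behavior described above may be sketched in Python as follows. The names used (Drive, MirroredStore, spin_ups), the CRC-based selection of the two active drives, and the selection of the first passive drive as the flush target are assumptions made for this sketch and are not taken from the embodiments described herein.

```python
import zlib
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    ACTIVE = "active"
    PASSIVE = "passive"  # lower-power mode than ACTIVE


@dataclass
class Drive:
    name: str
    mode: Mode
    segments: dict = field(default_factory=dict)  # segment id -> data


class MirroredStore:
    """Hypothetical model of an active drive group and a passive drive group."""

    def __init__(self, active, passive):
        if len(active) < 2 or not passive:
            raise ValueError("need at least two active drives and one passive drive")
        self.active = list(active)
        self.passive = list(passive)
        self.drives = {d.name: d for d in self.active + self.passive}
        self.metadata = {}  # segment id -> {"primary": name, "secondary": name}
        self.spin_ups = 0   # number of times a passive drive was activated

    def write(self, seg_id, data):
        """Host write: both copies land on active drives, so nothing spins up."""
        i = zlib.crc32(seg_id.encode()) % len(self.active)
        primary = self.active[i]
        temp = self.active[(i + 1) % len(self.active)]  # always a different drive
        primary.segments[seg_id] = data
        temp.segments[seg_id] = data  # temporary secondary copy
        self.metadata[seg_id] = {"primary": primary.name, "secondary": temp.name}

    def flush(self):
        """Move every temporary secondary copy to a passive drive in one batch."""
        active_names = {d.name for d in self.active}
        pending = [s for s, m in self.metadata.items()
                   if m["secondary"] in active_names]
        if not pending:
            return 0
        target = self.passive[0]       # assumption: simplest possible target choice
        target.mode = Mode.ACTIVE      # activate from passive mode
        self.spin_ups += 1
        for seg_id in pending:         # write the flushed secondary copies
            temp = self.drives[self.metadata[seg_id]["secondary"]]
            target.segments[seg_id] = temp.segments[seg_id]
        target.mode = Mode.PASSIVE     # re-establish passive mode
        for seg_id in pending:         # update metadata, then free active space
            temp = self.drives[self.metadata[seg_id]["secondary"]]
            self.metadata[seg_id]["secondary"] = target.name
            del temp.segments[seg_id]
        return len(pending)
```

In this sketch, any number of host writes may be absorbed by the active drives, and a single activation of a passive drive moves all pending temporary secondary copies at once.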
In exemplary embodiments of the present invention, the system 100 may implement mirroring (as shown in
When the system 100 is in an optimal state (e.g., all of the drives in the active drive group 102 are functioning properly), as shown in
In further embodiments, as shown in
In exemplary embodiments of the present invention, the system 100 may map data locations/data by implementing any method which will distribute the data uniformly among the drive set(s) (102, 104). For example, the system 100 may divide data into mirrored chunks and spread said data uniformly among/across drives in the active drive group 102 and the passive drive group 104 via implementation of Controlled Replication Under Scalable Hashing (CRUSH) algorithms, which were developed at the University of California at Santa Cruz (such as disclosed in: CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data, Weil et al., Proceedings of SC '06, November 2006, which is herein incorporated by reference in its entirety).
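As a minimal sketch of a deterministic, uniform mapping of mirrored chunks, the following Python example places a primary copy in the active drive group and a secondary copy in the passive drive group. It is not an implementation of CRUSH; it substitutes a simpler technique, rendezvous (highest-random-weight) hashing, purely for illustration. The function names (place, _score) and the drive and chunk identifiers are hypothetical.

```python
import hashlib


def _score(chunk_id, drive):
    """Pseudo-random but repeatable weight for a (chunk, drive) pair."""
    digest = hashlib.sha256(f"{chunk_id}:{drive}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(chunk_id, active, passive):
    """Return (primary drive, secondary drive) for one mirrored chunk.

    The primary copy goes to the highest-scoring active drive and the
    secondary copy to the highest-scoring passive drive, so the two copies
    never share a drive and no lookup table is needed to find them.
    """
    primary = max(active, key=lambda d: _score(chunk_id, d))
    secondary = max(passive, key=lambda d: _score(chunk_id, d))
    return primary, secondary
```

A property of this stand-in technique is that adding a drive to a group changes the placement only of those chunks that move onto the new drive; all other chunks stay where they were.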
As mentioned above, metadata is implemented in the system 100 for tracking valid copies (ex.—primary copies, temporary secondary copies, and secondary copies) of data in both the active bucket/active drive group 102 and the passive bucket/passive drive group 104. When primary data/a primary copy is overwritten in the active bucket 102 (thereby generating an updated primary copy), any corresponding secondary data/secondary copy must be either overwritten or invalidated in the metadata. In instances when the system 100 has not flushed data to the passive bucket/passive drive group 104 and a temporary secondary copy exists in the active bucket/active drive group 102 which corresponds to the primary copy, the temporary secondary copy may be overwritten at the same time its corresponding primary copy is overwritten. In instances when the system 100 has flushed data (ex.—provided a secondary copy based on a temporary secondary copy) corresponding to the primary copy to the passive bucket/passive drive group 104, the metadata may be changed to invalidate the secondary copy (which is located on a drive included in the passive drive group 104), and a new temporary secondary copy may be written to another drive in the active drive group 102 (ex.—a drive in the active drive group 102 which is a different drive than the drive on which the updated primary copy is located).
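The two overwrite cases described above may be illustrated with the following Python sketch. The Entry record, its field names (including the stale list used to remember invalidated passive copies), and the rule of picking the first active drive other than the primary are assumptions introduced for this example only.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    primary: str    # name of the active drive holding the primary copy
    secondary: str  # name of the drive holding the secondary copy
    flushed: bool   # True once the secondary copy lives on a passive drive
    stale: list = field(default_factory=list)  # invalidated passive copies


def overwrite(entry, active_names):
    """Update metadata for an overwrite; return the drives to write new data to."""
    if not entry.flushed:
        # Temporary secondary copy is still on an active drive:
        # overwrite it together with the primary copy.
        return [entry.primary, entry.secondary]
    # Secondary copy was flushed to a passive drive: invalidate it in the
    # metadata rather than spinning the passive drive up, and place a new
    # temporary secondary copy on a different active drive.
    entry.stale.append(entry.secondary)
    new_temp = next(n for n in active_names if n != entry.primary)
    entry.secondary = new_temp
    entry.flushed = False
    return [entry.primary, new_temp]
```

In both cases the overwrite touches only active drives, so no passive drive needs to be activated at the time of the host write.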
In further embodiments, the system 100 of the present invention implements Thin Provisioning, thus there may be as few as two drives included in the first group of drives/the first pool of drives/the active group of drives/the active drives 102, while the rest of the drives of the system 100 may be drives included in the second group of drives/the second pool of drives/the passive group of drives/the passive drives 104. Thus, the first group of drives 102 includes at least two drives, while the second group of drives 104 also includes at least two drives. As more active storage capacity is needed by the system 100, one or more of the passive drives 104 may be relocated from the passive bucket 104 to the active bucket 102 to become an active drive, thereby expanding the storage capacity/number of drives in the active drive group 102. Further, as the new drive(s) is/are added to the active drive group 102, the system 100 may evenly redistribute data chunks stored by the system 100, thereby keeping all active drives 102 relatively equally populated.
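The expansion of the active drive group may be sketched in Python as follows. The utilization threshold of 0.8, the per-drive capacity expressed as a chunk count, the round-robin redistribution, and the function names are all assumptions chosen for illustration; the sketch also tracks chunks by identifier only and does not model the requirement that mirrored copies reside on different drives.

```python
def rebalance(active):
    """Spread all chunks evenly (round-robin) over the active drives."""
    chunks = sorted(c for held in active.values() for c in held)
    names = sorted(active)
    for name in names:
        active[name] = []
    for i, chunk in enumerate(chunks):
        active[names[i % len(names)]].append(chunk)


def grow_active(active, passive, capacity, threshold=0.8):
    """Move one passive drive into the active group when the group is nearly full.

    active:   dict mapping active drive name -> list of chunk ids
    passive:  list of passive drive names
    capacity: chunks each drive can hold (hypothetical unit)
    Returns True if a drive was relocated.
    """
    used = sum(len(held) for held in active.values())
    if used < threshold * capacity * len(active) or not passive:
        return False
    active[passive.pop(0)] = []  # relocate a drive from passive to active
    rebalance(active)            # keep active drives about equally populated
    return True
```

For example, with two active drives each holding four chunks and a capacity of four chunks per drive, one call relocates a passive drive and leaves the three active drives holding three, three, and two chunks.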
In further embodiments, the system 100 of the present invention may include a third drive group/bucket, configured for implementation/connection with the active bucket 102 and/or the passive bucket 104, which may be in a completely powered off mode until needed in either the active bucket 102 or the passive bucket 104, thereby allowing the system 100 to implement drive groups in multiple, low power modes.
In
It is to be noted that the foregoing described embodiments according to the present invention may be conveniently implemented using conventional general purpose digital computers programmed according to the teachings of the present specification, as will be apparent to those skilled in the computer art. Appropriate software coding may readily be prepared by skilled programmers based on the teachings of the present disclosure, as will be apparent to those skilled in the software art.
It is to be understood that the present invention may be conveniently implemented in the form of a software package. Such a software package may be a computer program product which employs a computer-readable storage medium including stored computer code which is used to program a computer to perform the disclosed function and process of the present invention. The computer-readable medium/computer-readable storage medium may include, but is not limited to, any type of conventional floppy disk, optical disk, CD-ROM, magnetic disk, hard disk drive, magneto-optical disk, ROM, RAM, EPROM, EEPROM, magnetic or optical card, or any other suitable media for storing electronic instructions.
It is understood that the specific order or hierarchy of steps in the foregoing disclosed methods is an example of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the method can be rearranged while remaining within the scope of the present invention. The accompanying method claims present elements of the various steps in a sample order, and are not meant to be limited to the specific order or hierarchy presented.
It is believed that the present invention and many of its attendant advantages will be understood by the foregoing description. It is also believed that it will be apparent that various changes may be made in the form, construction and arrangement of the components thereof without departing from the scope and spirit of the invention or without sacrificing all of its material advantages. The form hereinbefore described being merely an explanatory embodiment thereof, it is the intention of the following claims to encompass and include such changes.
The following patent application is incorporated by reference in its entirety:
Attorney Docket No.: LSI 09-0099
Express Mail No.: EM 316812549
Filing Date: Aug. 04, 2009
Ser. No.:
Further, U.S. patent application Ser. No. 12/288,037, entitled: Power and Performance Management Using MAIDx and Adaptive Data Placement, filed Oct. 16, 2008 (pending), is also hereby incorporated by reference in its entirety.