This invention relates to data storage subsystems and, in particular, to a dynamically mapped virtual data storage subsystem which includes a data storage manager that functions to combine the non-homogeneous physical devices contained in the data storage subsystem to create a logical device with new and unique quality of service characteristics that satisfy the criteria for the policies appropriate for the present data object.
It is a problem in the field of data storage subsystems to store the ever increasing volume of application data in an efficient manner, especially in view of the rapid changes in the data storage characteristics of the data storage elements that are used to implement the data storage subsystem and the increasingly specific needs of the applications that generate the data.
Data storage subsystems traditionally comprised homogeneous collections of data storage elements on which the application data was stored for a plurality of host processors. As the data storage technology changed and a multitude of different types of data storage elements became available, the data storage subsystem changed to comprise subsets of homogeneous collections of data storage elements, so that the application data could be stored on the most appropriate one of the plurality of subsets of data storage elements. Data storage management systems were developed to route the application data to a selected subset of data storage elements and a significant amount of processing was devoted to ascertaining the proper data storage destination for a particular data set in terms of the data storage characteristics of the selected subset of data storage elements. Some systems also migrate data through a hierarchy of data storage elements to account for the timewise variation in the data storage needs of the data sets.
In these data storage subsystems, the quality of service characteristics are determined by the unmodified physical attributes of the data storage elements that are used to populate the data storage subsystem. One exception to this rule is disclosed in U.S. Pat. No. 5,430,855 titled “Disk Drive Array Memory System Using Nonuniform Disk Drives,” which discloses a data storage subsystem that uses an array of data storage elements that vary in their data storage characteristics and/or data storage capacity. The data storage manager in this data storage subsystem automatically compensates for any nonuniformity among the disk drives by selecting a set of physical characteristics that define a common data storage element format. However, the data storage utilization of the redundancy groups formed by the data storage manager is less than optimal, since the least common denominator data storage characteristics of the set of disk drives are used as the common disk format. Thus, a disk drive whose data storage capacity far exceeds that of the smallest capacity disk drive in the redundancy group suffers a loss of utilization of its excess data storage capacity. Therefore, most data storage subsystems do not utilize this concept and simply configure multiple redundancy groups, with each redundancy group comprising a homogeneous set of disk drives. A problem with such an approach is that the data storage capacity of the data storage subsystem can only be increased by the addition of an entire redundancy group. Furthermore, the replacement of a failed disk drive requires the use of a disk drive that matches the characteristics of the remaining disk drives in the redundancy group, unless the loss of the excess data storage capacity of the newly added disk drive is accepted, as noted above.
Thus, it is a prevalent problem in data storage subsystems that the introduction of new technology is costly and typically must occur in fairly large increments, occasioned by the need for the data storage subsystem to be composed of homogeneous subsets of data storage devices, even in a virtual data storage subsystem. Therefore, data administrators find it difficult to cost-effectively manage the increasing volume of data that is being generated in order to meet the needs of the end users' business. In addition, the rate of technological innovation is accelerating, especially in the area of increases in data storage capacity, and the task of incrementally integrating these new solutions into existing data storage subsystems is difficult to achieve.
The above described problems are solved and a technical advance achieved by the present intelligent data storage manager that functions to combine the non-homogeneous physical devices contained in a data storage subsystem to create a logical device with new and unique quality of service characteristics that satisfy the criteria for the policies appropriate for the present data object. In particular, if there is presently no logical device that is appropriate for use in storing the present data object, the intelligent data storage manager defines a new logical device using existing physical and/or logical device definitions as component building blocks to provide the appropriate characteristics to satisfy the policy requirements. The intelligent data storage manager uses weighted values that are assigned to each of the presently defined logical devices to produce a best fit solution to the requested policies in an n-dimensional best fit matching algorithm. The resulting logical device definition is then implemented by dynamically interconnecting the logical devices that were used as the components of the newly defined logical device to store the data object.
If there is presently no logical device that satisfies the criteria for the policies appropriate for a user data object, the logical device manager 104 creates a new logical device definition with the appropriate data storage characteristics to satisfy the policy requirements using existing physical and/or logical device definitions as component building blocks. The logical device manager 104 uses weighted values that are assigned to each of the presently defined logical devices to produce a best fit solution to the requested policies in an n-dimensional best fit matching algorithm. Thus, the intelligent data storage manager 110 maps the virtual device to the user data object rather than mapping a data object to a predefined data storage device. The various data storage attributes that are used by the intelligent data storage manager 110 to evaluate the appropriateness of a particular virtual device include, but are not limited to: speed of access to first byte, level of reliability, cost of storage, probability of recall, and expected data transfer rate. The logical device manager 104 stores the mapping data, which comprises a real time definition of the available storage space in the data storage subsystem 100. Once one of the current logical device definitions meets the criteria required by a data object, the logical device manager 104 either allocates space on an existing instance of a logical device of that type or creates a new instance of that type of logical device.
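The final allocation step described above can be illustrated with a minimal sketch. It assumes that a matching logical device definition has already been selected; the class names, the fixed per-instance capacity, and the first-fit choice among instances are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field

@dataclass
class LogicalDeviceInstance:
    """One instance of a logical device type (hypothetical model)."""
    capacity: int
    used: int = 0

    def free(self) -> int:
        return self.capacity - self.used

@dataclass
class LogicalDeviceDefinition:
    """A logical device type together with its current instances."""
    name: str
    instance_capacity: int  # assumed: every instance has the same size
    instances: list = field(default_factory=list)

    def allocate(self, size: int) -> int:
        """Allocate space on an existing instance, else create a new one.

        Returns the index of the instance that holds the allocation.
        """
        if size > self.instance_capacity:
            raise ValueError("object does not fit in a single instance")
        # First try the existing instances of this logical device type.
        for index, instance in enumerate(self.instances):
            if instance.free() >= size:
                instance.used += size
                return index
        # No existing instance has room: create a new instance of this type.
        self.instances.append(LogicalDeviceInstance(self.instance_capacity, used=size))
        return len(self.instances) - 1
```

In this sketch a request is placed on the first instance with sufficient free space; any other placement rule could be substituted without changing the overall flow.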
The policy attributes and the potential algorithms that are used to map user requirements to storage devices are managed by the intelligent storage manager 110. A typical general set of attributes for storage devices is shown in Table 1:
Each of these attributes has a range or dimension of “values”. Each dimension needs to be relatively uniform in its numbering scheme. For example, each dimension could have a numeric value from 0.0 to 10.0. Some dimensions need to be logarithmic (lg) because of the inherent nature of the dimension. For example, Cost per MB can be defined as a logarithmic dimension that runs from $0.001 for tape storage to $10 for RAM. One approach is therefore to calculate the distance between the customer's policy requirements and each storage device's policy attributes. In addition, levels of priority among attributes can be specified, since certain dimensions may be more important than others (reliability, for example). When the intelligent storage manager 110 must resolve between conflicting priority levels, the logical device manager 104 tries to find ways to combine single devices into an optimal logical device using logical combining operators.
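The uniform numbering scheme and the priority levels described above can be illustrated as follows. This is a sketch under stated assumptions: the cost endpoints are the $0.001 and $10 figures given in the text, while the function names, the clamping of out-of-range costs, and the use of per-dimension multiplicative weights to express priority are illustrative choices rather than the method of the specification.

```python
import math

# Endpoints of the Cost per MB dimension as given in the text.
COST_MIN, COST_MAX = 0.001, 10.0

def cost_to_scale(cost_per_mb: float) -> float:
    """Map a cost per MB onto a logarithmic 0.0 to 10.0 dimension."""
    # Assumption: costs outside the stated range are clamped to its ends.
    clamped = min(max(cost_per_mb, COST_MIN), COST_MAX)
    span = math.log10(COST_MAX) - math.log10(COST_MIN)  # four decades
    return 10.0 * (math.log10(clamped) - math.log10(COST_MIN)) / span

def weighted_sq_distance(policy: dict, device: dict, weights: dict) -> float:
    """Sum of weighted squared differences over the policy's dimensions.

    A larger weight makes a mismatch in that dimension count for more,
    which is one way to express a higher priority level.
    """
    return sum(weights[k] * (policy[k] - device[k]) ** 2 for k in policy)
```

With this mapping a cost of $0.10 per MB lies two decades above the lower endpoint of a four decade range and therefore falls at the midpoint of the scale.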
The present intelligent data storage manager 110 is responsive to one of the host processors 111 initiating a data write operation by transmitting a predefined set of commands over a selected one of the communication links to the data storage subsystem 100. These commands include a definition of the desired device on which the present data object is to be stored, typically in terms of a set of data storage characteristics.
There are many instances of data file storage where the needs of the application and/or user do not correspond to the reality of the data storage characteristics of the various data storage elements 151-153 and virtual data storage elements 154-155 that are available in the data storage subsystem 100. For example, the application “video on demand” may require a high reliability data storage element and fast access to the initial portion of the file, yet not require fast access for the entirety of the file since the data is typically read out at a fairly slow data access rate. However, the required data transfer bandwidth may be large, since the amount of data to be processed is significant and having a slow speed access device as well as a narrow bandwidth would result in unacceptable performance. Furthermore, the cost of data storage is a concern due to the volume of data. The intelligent data storage manager 110 must therefore factor all of these data storage characteristics to determine a best fit data storage device or devices to serve these needs. In this example, the defined data storage characteristics may be partially satisfied by a Redundant Array of Inexpensive Tapes since the reliability of this data storage device is high as is the data bandwidth, yet the cost of implementation is relatively low, especially if the configuration is a RAIT-5 and the data access speed is moderate. In making a determination of the appropriate data storage device, the intelligent data storage manager 110 must review the criticality of the various data storage characteristics and the amount of variability acceptable for that data storage characteristic.
All devices support some form of quality of service, which can be described as attributes with certain fixed values. For example, they cost $xxx per megabyte of data or have nnn access speed. The intelligent storage manager 110 provides an algorithmic way to use these attributes to determine the perfect device, as specified by user policy. In some cases, the perfect device is a logical device that is constructed when the intelligent storage manager 110 rank orders the distance between 1) how the user would like to have data stored and 2) the storage devices that are available. This logical device can span both disk and tape subsystems and, therefore, blurs the distinction between disk and tape.
The diagram of
Table 2 provides a more complex comparison of device attributes versus attributes defined through user policy. In this example, the set of attributes of the following storage subsystems: single disk, RAID, single tape drive, and RAIT are listed. The intelligent storage manager 110 determines an optimal storage solution by doing a distance calculation between 1) the set of attributes for each device and 2) the set of attributes for a file (defined through user policy).
For example, the calculation below denotes the vector for point P by [x1(P), x2(P), x3(P)]. Then the distance between points 1 and 2 is
$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}$$
where $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ are the coordinates of points 1 and 2.
This example is for three dimensions. To extend it to more dimensions, take the difference between corresponding components of the two vectors, square this difference, add this square to all the other squares, and take the square root of the sum of the squares. Of course, the square root is unnecessary when simply looking for the point closest to a given point, since the ordering of the distances is the same with or without it.
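The n-dimensional calculation just described can be sketched as follows. The device names and attribute values in the usage note are hypothetical, and each attribute vector is assumed to list the same dimensions in the same order.

```python
def squared_distance(a, b):
    """Sum of squared component differences between two attribute vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum((x - y) ** 2 for x, y in zip(a, b))

def closest_device(policy, devices):
    """Return the name of the device whose attributes are nearest the policy.

    The square root is skipped because it does not change which
    device is closest.
    """
    return min(devices, key=lambda name: squared_distance(policy, devices[name]))
```

For example, with hypothetical devices `{"disk": (8, 5, 6), "tape": (2, 7, 1)}` and a policy of `(3, 7, 2)`, the squared distances are 45 and 2 respectively, so `closest_device` selects `"tape"`.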
In the present example, the realized data storage device can be a composite device or a collection of composite devices. For example, the video on demand file data storage requirements can be met by the virtual device illustrated in FIG. 3. The virtual device 300 can comprise several elements 301, 302, each of which itself comprises a collection of physical and/or virtual devices. The virtual device 300 comprises a first device 301 which comprises a set of parallel connected disk drives 310-314 that provides a portion of the data storage capability of the virtual device 300. These parallel connected disk drives 310-314 provide a fast access time for the application to retrieve the first segment of the video on demand data to thereby provide the user with a fast response time to the file request. The bulk of the video on demand data file is stored on a second element 302 that comprises a Redundant Array of Inexpensive Tapes device that implements a RAIT-5 storage configuration. The relative data storage capacity of the two data storage elements 301, 302 is determined by the amount of data that must be provided to the user on a priority basis and the length of time before the remainder of the file can be staged for provision to the user.
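One way to read the sizing rule in the preceding paragraph is that the disk-resident portion must cover playback until the remainder of the file can be staged from tape. The following sketch makes that reading explicit; the function name, the constant playback rate, and the single fixed staging delay are assumptions made for illustration only.

```python
import math

def disk_prefix_bytes(playback_bytes_per_s: float,
                      staging_delay_s: float,
                      file_bytes: int) -> int:
    """Bytes to hold on the fast element 301 (illustrative model).

    Assumes the viewer consumes data at a constant rate and that the
    tape element 302 needs staging_delay_s seconds before it can
    supply the rest of the file.
    """
    needed = math.ceil(playback_bytes_per_s * staging_delay_s)
    # The prefix can never exceed the file itself.
    return min(file_bytes, needed)
```

Under these assumptions, a file played at 500,000 bytes per second with a 60 second staging delay would keep 30,000,000 bytes on disk, and the remainder would reside on the RAIT-5 element.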
The data storage manager 110 implements devices that support some form of quality of service. These attributes have some type of fixed value, such as a given cost per megabyte or a given access speed. The data storage manager 110 can also rank order the distances between how the user wishes to have a data file stored and the storage devices that are in the data storage subsystem 100. From this ranking, the data storage manager 110 can also derive alternative storage methods. For example, the data storage manager 110 can use a mixture of disk and tape to achieve the qualities that the user is looking for: it can put some of the data file on disk for quick access and some of it on tape for cheap storage, as noted above. Another alternative arises when the user wants a file stored at a certain cost per megabyte: the file can be migrated from disk to tape over a certain period of weeks so that the average cost of storage complies with the user policy definition. The data storage manager 110 must therefore quickly evaluate which devices are available and compare them with how the user wants to store the data file. If the data storage manager 110 does not have a perfect match, the mixtures of devices are rank ordered and investigated to try to achieve the policy that is defined by the user.
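The migration alternative can be illustrated with a simple time-weighted average. This sketch assumes that the file resides entirely on disk for an initial period and entirely on tape thereafter, and that the costs are expressed per megabyte per week; the function names and these simplifications are hypothetical.

```python
def average_cost_per_mb(disk_cost: float, tape_cost: float,
                        weeks_on_disk: float, total_weeks: float) -> float:
    """Time-weighted average storage cost over the retention period."""
    if total_weeks <= 0 or not 0 <= weeks_on_disk <= total_weeks:
        raise ValueError("weeks_on_disk must lie within the retention period")
    weeks_on_tape = total_weeks - weeks_on_disk
    return (disk_cost * weeks_on_disk + tape_cost * weeks_on_tape) / total_weeks

def max_weeks_on_disk(disk_cost: float, tape_cost: float,
                      target_cost: float, total_weeks: float) -> float:
    """Longest disk residency that still meets the target average cost."""
    if target_cost >= disk_cost:
        return total_weeks  # disk alone already satisfies the policy
    if target_cost < tape_cost:
        raise ValueError("target is below the cost of tape storage")
    # Solve disk*w + tape*(T - w) = target*T for w.
    return total_weeks * (target_cost - tape_cost) / (disk_cost - tape_cost)
```

The second function inverts the first: holding the file on disk for the returned number of weeks produces an average cost equal to the target.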
The intelligent data storage manager functions to combine the non-homogeneous physical devices contained in a data storage subsystem to create a logical device with new and unique quality of service characteristics that satisfy the criteria for the policies appropriate for the present data object. The intelligent data storage manager uses weighted values that are assigned to each of the presently defined logical devices to produce a best fit solution to the requested policies in an n-dimensional best fit matching algorithm. The resulting logical device definition is then implemented by dynamically interconnecting the logical devices that were used as the components of the newly defined logical device to store the data object.
This is a continuation divisional of application(s) Ser. No. 09/232,431 filed on Jan. 15, 1999 now U.S. Pat. No. 6,330,621.
Number | Name | Date | Kind |
---|---|---|---|
3130387 | Wright et al. | Apr 1964 | A |
3699533 | Hunter | Oct 1972 | A |
3806888 | Brickman et al. | Apr 1974 | A |
3909799 | Recks et al. | Sep 1975 | A |
3949377 | O'Neill, Jr. | Apr 1976 | A |
3976977 | Porter et al. | Aug 1976 | A |
4021782 | Hoerning | May 1977 | A |
4040026 | Gernelle | Aug 1977 | A |
4054951 | Jackson et al. | Oct 1977 | A |
4080651 | Cronshaw et al. | Mar 1978 | A |
4080652 | Cronshaw et al. | Mar 1978 | A |
4084234 | Calle et al. | Apr 1978 | A |
4086629 | Desyllas et al. | Apr 1978 | A |
4110823 | Cronshaw et al. | Aug 1978 | A |
4123795 | Dean, Jr. et al. | Oct 1978 | A |
4126893 | Cronshaw et al. | Nov 1978 | A |
4126894 | Cronshaw et al. | Nov 1978 | A |
4158235 | Call et al. | Jun 1979 | A |
4215400 | Denko | Jul 1980 | A |
4228501 | Frissell | Oct 1980 | A |
4241420 | Fish et al. | Dec 1980 | A |
4246637 | Brown et al. | Jan 1981 | A |
4276595 | Brereton et al. | Jun 1981 | A |
4298932 | Sams | Nov 1981 | A |
4310883 | Clifton et al. | Jan 1982 | A |
4318184 | Millett et al. | Mar 1982 | A |
4327408 | Frissell et al. | Apr 1982 | A |
4412285 | Neches et al. | Oct 1983 | A |
4414644 | Tayler | Nov 1983 | A |
4533995 | Christian et al. | Aug 1985 | A |
4945429 | Munro et al. | Jul 1990 | A |
4974156 | Harding et al. | Nov 1990 | A |
5131087 | Warr | Jul 1992 | A |
5164909 | Leonhardt et al. | Nov 1992 | A |
5214768 | Martin et al. | May 1993 | A |
5303214 | Kulakowski et al. | Apr 1994 | A |
5386516 | Monahan et al. | Jan 1995 | A |
5388260 | Monahan et al. | Feb 1995 | A |
5412791 | Martin et al. | May 1995 | A |
5430855 | Walsh et al. | Jul 1995 | A |
5455926 | Keele et al. | Oct 1995 | A |
5475817 | Waldo et al. | Dec 1995 | A |
5504873 | Martin et al. | Apr 1996 | A |
5506986 | Healy | Apr 1996 | A |
5535322 | Hecht | Jul 1996 | A |
5537585 | Blickenstaff et al. | Jul 1996 | A |
5546557 | Allen et al. | Aug 1996 | A |
5560040 | Mizumachi | Sep 1996 | A |
5566331 | Irwin, Jr. et al. | Oct 1996 | A |
5613154 | Burke et al. | Mar 1997 | A |
5619690 | Matsumani et al. | Apr 1997 | A |
5625405 | DuLac et al. | Apr 1997 | A |
5630067 | Kindell et al. | May 1997 | A |
5640510 | Hanaoka et al. | Jun 1997 | A |
5664186 | Bennett et al. | Sep 1997 | A |
5671439 | Klein et al. | Sep 1997 | A |
5689481 | Tamura et al. | Nov 1997 | A |
5694550 | Takeda et al. | Dec 1997 | A |
5710549 | Horst et al. | Jan 1998 | A |
5740362 | Buickel et al. | Apr 1998 | A |
5751715 | Chan et al. | May 1998 | A |
5758050 | Brady et al. | May 1998 | A |
5758085 | Koucheris et al. | May 1998 | A |
5758125 | Misinai et al. | May 1998 | A |
5802258 | Chen | Sep 1998 | A |
5805864 | Carlson et al. | Sep 1998 | A |
5809285 | Hilland | Sep 1998 | A |
5828836 | Westwick et al. | Oct 1998 | A |
5829046 | Tzelnic et al. | Oct 1998 | A |
5829053 | Smith et al. | Oct 1998 | A |
5832527 | Kawaguchi | Nov 1998 | A |
5838891 | Mizuno et al. | Nov 1998 | A |
5845147 | Vishlitzky et al. | Dec 1998 | A |
5867648 | Foth et al. | Feb 1999 | A |
5881311 | Woods | Mar 1999 | A |
5884046 | Antonov | Mar 1999 | A |
5890203 | Aoki | Mar 1999 | A |
5960451 | Voigt et al. | Sep 1999 | A |
5963971 | Fosler et al. | Oct 1999 | A |
5996024 | Blumenau | Nov 1999 | A |
6111944 | Molin | Aug 2000 | A |
6128717 | Harrison et al. | Oct 2000 | A |
6141729 | Ishida et al. | Oct 2000 | A |
6457139 | D'Errico et al. | Sep 2002 | B1 |
Number | Date | Country |
---|---|---|
892798 | Feb 1972 | CA |
907211 | Aug 1972 | CA |
0 535 922 | Apr 1993 | EP |
0 689 125 | Dec 1995 | EP |
1167762 | Oct 1969 | GB |
1353770 | May 1974 | GB |
1359662 | Oct 1974 | GB |
1496779 | Jan 1978 | GB |
1496780 | Jan 1978 | GB |
1547381 | Jun 1979 | GB |
2063532 | Jun 1981 | GB |
51-18409 | Feb 1976 | JP |
52-106641 | Sep 1977 | JP |
53-22331 | Mar 1978 | JP |
53-84632 | Jul 1978 | JP |
53-98741 | Aug 1978 | JP |
53-108747 | Sep 1978 | JP |
55-153058 | Nov 1980 | JP |
55-164958 | Dec 1980 | JP |
4-48250 | Jun 1992 | JP |
98 40810 | Jul 1992 | WO |
97 07461 | Feb 1997 | WO |
98 33113 | Jul 1998 | WO |
Number | Date | Country |
---|---|---|
20020032816 A1 | Mar 2002 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 09232431 | Jan 1999 | US |
Child | 09966263 | | US |