1. Field of the Invention
The invention relates generally to management of logical volumes in a storage system, and more specifically relates to techniques for quickly transferring ownership of a logical volume from one storage controller to another storage controller.
2. Related Patents
This patent application is related to the following commonly owned United States patent applications, all filed on the same date herewith and all of which are herein incorporated by reference:
3. Discussion of Related Art
In the field of data storage, customers demand highly resilient data storage systems that also exhibit fast recovery times for stored data. One type of storage system used to provide both of these characteristics is known as a clustered storage system.
A clustered storage system typically comprises a number of storage controllers, wherein each storage controller processes host Input/Output (I/O) requests directed to one or more logical volumes. The logical volumes reside on portions of one or more storage devices (e.g., hard disks) coupled with the storage controllers. Often, the logical volumes are configured as Redundant Array of Independent Disks (RAID) volumes in order to ensure an enhanced level of data integrity and/or performance.
A notable feature of clustered storage environments is that the storage controllers are capable of coordinating processing of host requests (e.g., by shipping I/O processing between each other) in order to enhance the performance of the storage environment. This includes intentionally transferring ownership of a logical volume from one storage controller to another. For example, a first storage controller may detect that it is currently undergoing a heavy processing load, and may assign ownership of a given logical volume to a second storage controller that has a smaller processing burden in order to increase overall speed of the clustered storage system. Other storage controllers may then update information identifying which storage controller presently owns each logical volume. Thus, when an I/O request is received at a storage controller that does not own the logical volume identified in the request, the storage controller may “ship” the request to the storage controller that presently owns the identified logical volume.
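By way of illustration only, the shipping decision described above may be sketched in C as follows. This is a minimal sketch under stated assumptions: the `ownership_map_t` table, the `MAX_VOLUMES` limit, and the function names are hypothetical and are not drawn from any particular controller firmware.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VOLUMES 8   /* hypothetical limit chosen for this sketch */

/* Hypothetical per-controller view of which controller owns each volume. */
typedef struct {
    uint16_t self_id;                 /* identifier of this controller */
    uint16_t owner_of[MAX_VOLUMES];   /* owning controller, indexed by volume */
} ownership_map_t;

typedef enum { IO_PROCESS_LOCALLY, IO_SHIP_TO_OWNER } io_action_t;

/* Record the new owner once a transfer of ownership has been announced. */
void update_owner(ownership_map_t *map, uint16_t volume, uint16_t new_owner)
{
    assert(volume < MAX_VOLUMES);
    map->owner_of[volume] = new_owner;
}

/* Decide how to handle a host request; *target receives the controller
 * that should process the request. */
io_action_t route_request(const ownership_map_t *map, uint16_t volume,
                          uint16_t *target)
{
    assert(volume < MAX_VOLUMES);
    *target = map->owner_of[volume];
    return (*target == map->self_id) ? IO_PROCESS_LOCALLY : IO_SHIP_TO_OWNER;
}
```

In this sketch, a controller that receives a request for a volume it does not own simply learns which controller to ship the request to; the shipping transport itself is outside the scope of the example.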
While clustered storage systems provide a number of performance benefits over more traditional storage systems, the speed of the storage system typically remains a bottleneck to the overall speed of a processing system utilizing the storage system.
For example, in a storage system, data describing logical volumes provisioned on a plurality of storage devices may be stored in Disk Data Format (DDF) on the storage devices. DDF data (or other similar metadata) for a volume describes, for example, physical and virtual disk records for the volumes. Whenever a storage controller assumes ownership of a logical volume, DDF data is processed into a metadata format native to the storage controller. This is beneficial because relevant data is more easily accessible to the storage controller in the native format. Additionally, the native-format data is typically substantially smaller than the DDF data because it describes fewer logical volumes. Unfortunately, processing the DDF data is resource-intensive and delays the processing of incoming host I/O requests directed to the volume. Reading the DDF metadata from the storage devices can consume significant time. Further, extracting data from the DDF metadata to form the desired native-format metadata also consumes significant time. This in turn may undesirably reduce the speed of the clustered storage system.
Thus, it is an ongoing challenge to enhance the speed at which ownership of a logical volume can be transferred.
The present invention solves the above and other problems, thereby advancing the state of the useful arts, by providing methods and structure for transferring ownership of a logical volume from one storage controller to another storage controller. Specifically, according to the methods and systems, when ownership of a logical volume is passed from a first storage controller to a second storage controller, native-format metadata that describes the configuration of the logical volume is transferred from the first storage controller to the second storage controller. Thus, the second storage controller does not need to read DDF data (or similar metadata) from the storage devices implementing the volume, and the second storage controller has no need to process DDF data into a native format. This in turn increases the speed at which ownership of the logical volume is transferred to the second storage controller.
In one aspect hereof, a method is provided for transferring ownership of a logical volume in a storage system comprising multiple storage controllers, the storage controllers coupled for communication with a logical volume, wherein at least one storage device coupled with the storage controllers implements the logical volume. The method comprises identifying, at a first storage controller, a second storage controller to receive the logical volume. The method also comprises initiating a transfer of ownership of the logical volume from the first storage controller to the second storage controller by transferring metadata stored in a memory of the first storage controller to the second storage controller, the metadata existing in a native format that describes the configuration of the logical volume on the at least one storage device.
Another aspect hereof provides a clustered storage system. The clustered storage system comprises multiple storage controllers coupled for communication with a host system, and coupled for communication with a logical volume. The clustered storage system also comprises at least one storage device coupled with the storage controllers and implementing the logical volume. A first storage controller of the storage system is operable to initiate a transfer of ownership of the logical volume from the first storage controller to a second storage controller by transferring metadata stored in a memory of the first storage controller to the second storage controller, the metadata existing in a native format that describes the configuration of the logical volume on the at least one storage device.
Another aspect hereof provides a storage controller. The storage controller is coupled for communication with a host system, and the storage controller is operable to maintain ownership of a logical volume. The storage controller comprises a communication channel operable to couple for communication with at least one storage device implementing the logical volume, and also comprises a control unit. The control unit is operable to initiate a transfer of ownership of the logical volume to another storage controller by transferring metadata stored in a memory of the storage controller to the other storage controller, the metadata existing in a native format that describes the configuration of the logical volume on the at least one storage device.
Further, one of ordinary skill will understand that while logical volumes 330, 340, and 350 are depicted in
Enhanced storage controller 310 comprises control unit 314, communication channel 316, and memory 312 storing metadata 313. Metadata 313 comprises native-format metadata for storage controller 310 that describes the configuration of one or more logical volumes owned by storage controller 310. This native-format metadata may include information extracted from Disk Data Format (DDF) data for the logical volumes owned by storage controller 310, and may be dynamically generated based on DDF data for the logical volumes maintained at storage devices of clustered storage system 300.
Control unit 314 is operable to manage the operations of storage controller 310. Control unit 314 may be implemented, for example, as custom circuitry, as a special or general purpose processor executing programmed instructions stored in an associated program memory, or some combination thereof. Managing the operations of storage controller 310 includes processing host I/O requests directed to logical volumes 330, 340, and 350 implemented on storage devices 332, 342, and 352, respectively. Control unit 314 utilizes communication channel 316 in order to communicate with the storage devices implementing logical volumes 330, 340, and 350. Communication channel 316 may comprise, for example, a channel compliant with protocols for SAS, Fibre Channel, Ethernet, etc.
Control unit 314 is further operable to determine that it is appropriate to transfer ownership of a logical volume to storage controller 320. For example, the decision to transfer ownership may occur based on a load balancing determination made when control unit 314 determines that the workload (e.g., amount of queued host I/O commands) at storage controller 310 is greater than the workload at storage controller 320. In another example, the transfer of ownership may occur under any condition which may potentially impact the availability of data for the logical volume. For example, conditions such as high temperature or component failure (e.g., a battery below a minimum charge level) may indicate that storage controller 310 is likely to experience an unexpected failure and therefore trigger a transfer. In a still further example, the transfer of ownership may occur as a part of a planned shutdown of storage controller 310. A shutdown may be planned, for example, in order to update firmware resident on storage controller 310, to replace hardware components (e.g., a battery) for storage controller 310, etc. When a planned shutdown of a storage controller occurs, all volumes of the controller may be transferred to the other controllers of the system. For example, all volumes may be transferred to a single other storage controller, or the volumes may be distributed to a variety of other storage controllers. The following section describes how ownership of a single logical volume may be transferred between controllers. However, each of multiple volumes may be transferred in a similar fashion to that described below for a single logical volume. Further, multiple logical volumes may be transferred as part of a parallel or serial process.
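The conditions that may prompt a transfer, as described above, may be sketched as a single decision function. This is a minimal sketch under stated assumptions: the status fields, the threshold values, and the order in which the conditions are checked are all hypothetical, as the description above does not prescribe them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    TRANSFER_NONE,
    TRANSFER_LOAD_BALANCE,
    TRANSFER_HEALTH_RISK,
    TRANSFER_PLANNED_SHUTDOWN
} transfer_reason_t;

/* Hypothetical snapshot of a controller's own condition. */
typedef struct {
    uint32_t queued_io;        /* host I/O commands currently queued */
    int32_t  temperature_c;    /* controller temperature */
    uint8_t  battery_pct;      /* battery charge level */
    bool     shutdown_planned; /* e.g., pending firmware update */
} controller_status_t;

#define MAX_SAFE_TEMP_C 85   /* assumed threshold, for illustration */
#define MIN_BATTERY_PCT 20   /* assumed threshold, for illustration */

/* Return the reason, if any, for transferring ownership to a peer.
 * The priority order used here is an assumption of this sketch. */
transfer_reason_t should_transfer(const controller_status_t *self,
                                  uint32_t peer_queued_io)
{
    assert(self != NULL);
    if (self->shutdown_planned)
        return TRANSFER_PLANNED_SHUTDOWN;
    if (self->temperature_c > MAX_SAFE_TEMP_C ||
        self->battery_pct < MIN_BATTERY_PCT)
        return TRANSFER_HEALTH_RISK;
    if (self->queued_io > peer_queued_io)
        return TRANSFER_LOAD_BALANCE;
    return TRANSFER_NONE;
}
```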
During the transfer of ownership of a logical volume, control unit 314 is enhanced to provide native-format metadata 313 to storage controller 320 via channel 316 or any other suitable channel (e.g., a separate channel dedicated to intercontroller communications in system 300). The native-format metadata describes the configuration of the logical volume, and is native-format in that it exists in a format such that it is immediately usable by the receiving controller without requiring access to the original DDF metadata. Further, the “native” format may be significantly smaller than the DDF metadata in that a controller may condense the complete set of DDF metadata to only that information required for the controller to function. Upon acquiring the metadata, storage controller 320 is operable to assume ownership and manage the operations of the logical volume of clustered storage system 300 that used to be managed by storage controller 310. Specifically, storage controller 320 is operable to integrate metadata 313 into existing metadata structures used to manage logical volumes. In this manner, storage controller 320 does not need to acquire (or process) DDF data from storage devices 332, 342, or 352 before assuming ownership of the volume. Instead, native-format metadata for managing the logical volume is immediately available.
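The integration of received metadata into the existing metadata structures of the receiving controller may be sketched as follows. This is a minimal sketch under stated assumptions: the `native_volume_meta_t` record, the fixed-size table, and the rejection of duplicate entries are hypothetical choices made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OWNED_VOLUMES 4   /* hypothetical capacity for this sketch */

/* Hypothetical native-format record for one logical volume. */
typedef struct {
    uint32_t volume_id;
    uint8_t  raid_level;
    uint64_t size_blocks;
} native_volume_meta_t;

/* Hypothetical table of volumes owned by the receiving controller. */
typedef struct {
    native_volume_meta_t volumes[MAX_OWNED_VOLUMES];
    size_t count;
} controller_meta_t;

/* Add a received record to the table without consulting any DDF data.
 * Returns false if the table is full or the volume is already present. */
bool integrate_volume_meta(controller_meta_t *table,
                           const native_volume_meta_t *incoming)
{
    assert(table != NULL && incoming != NULL);
    if (table->count >= MAX_OWNED_VOLUMES)
        return false;
    for (size_t i = 0; i < table->count; i++) {
        if (table->volumes[i].volume_id == incoming->volume_id)
            return false;
    }
    table->volumes[table->count++] = *incoming;
    return true;
}
```

Because the received record is already in the form the table expects, integration in this sketch reduces to a bounds check and a copy, with no read of the storage devices.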
Using enhanced storage controllers 310 and 320 provides numerous benefits in terms of processing speed. For example, because storage devices are not accessed by storage controller 320 to build metadata for the transferred logical volume, the availability of the storage devices is increased. Similarly, because storage controller 320 does not need to actively generate native-format metadata from DDF data, storage controller 320 also experiences increased availability, thereby reducing the period during which host I/O requests are queued.
Step 402 includes identifying, at the first storage controller, a second storage controller to receive the logical volume. The second storage controller may be identified in a number of ways. For example, the second storage controller may be identified/selected because it is currently experiencing little in the way of processing load. In another example, a received request at the first storage controller explicitly indicates that the logical volume should be transferred to the second storage controller. In a still further example, the second storage controller is identified because it manages a smaller number of logical volumes and/or storage devices than the first storage controller. Whatever technique is used, the second storage controller is typically chosen/identified because transferring ownership of the logical volume will result in a processing benefit at the first and/or second storage controllers or will ensure continued availability of the data for the logical volume.
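One possible way of identifying the second storage controller in step 402 is sketched below. This is a minimal sketch under stated assumptions: the `peer_info_t` fields, the `NO_CONTROLLER` sentinel, and the tie-breaking rule (fewest queued requests, then fewest owned volumes) are hypothetical; an explicit request, where present, is honored first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of a candidate controller. */
typedef struct {
    uint16_t id;
    uint32_t queued_io;      /* current processing load */
    uint32_t owned_volumes;  /* number of logical volumes managed */
} peer_info_t;

#define NO_CONTROLLER 0xFFFFu  /* sentinel: no explicit request / no candidate */

/* Pick the controller to receive the logical volume.
 * requested_id is NO_CONTROLLER when no explicit target was requested. */
uint16_t pick_target(const peer_info_t *peers, size_t n, uint16_t requested_id)
{
    assert(peers != NULL || n == 0);
    const peer_info_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (peers[i].id == requested_id)
            return requested_id;  /* an explicit request takes precedence */
        if (best == NULL ||
            peers[i].queued_io < best->queued_io ||
            (peers[i].queued_io == best->queued_io &&
             peers[i].owned_volumes < best->owned_volumes))
            best = &peers[i];
    }
    return best ? best->id : NO_CONTROLLER;
}
```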
Step 404 comprises transferring ownership of the logical volume to the second storage controller. The transfer includes the first storage controller transferring native-format metadata for the logical volume from a memory of the first storage controller to the second storage controller. The second storage controller assumes ownership of the transferred logical volume. The first storage controller then quiesces processing by shipping host I/O requests directed to the logical volume to the second storage controller.
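The ordering of operations in step 404 may be sketched as follows. This is a minimal sketch under stated assumptions: the `controller_t` fields and both function names are hypothetical, and the metadata copy is reduced to a flag so that only the sequence of events is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical controller state, tracking a single logical volume. */
typedef struct {
    uint16_t id;
    bool     owns_volume;
    bool     has_native_meta;
    uint32_t processed;   /* host requests processed locally */
    uint32_t shipped;     /* host requests shipped to the owner */
} controller_t;

/* Transfer ownership from first to second. Returns false if first does
 * not own the volume or holds no native-format metadata for it. */
bool transfer_ownership(controller_t *first, controller_t *second)
{
    assert(first != NULL && second != NULL);
    if (!first->owns_volume || !first->has_native_meta)
        return false;
    second->has_native_meta = true;  /* metadata sent from first's memory */
    second->owns_volume = true;      /* second assumes ownership */
    first->owns_volume = false;      /* first stops processing locally */
    return true;
}

/* Handle one host request arriving at receiver; non-owners ship it. */
void handle_host_io(controller_t *receiver, controller_t *owner)
{
    assert(receiver != NULL && owner != NULL);
    if (receiver->owns_volume) {
        receiver->processed++;
    } else {
        receiver->shipped++;
        owner->processed++;
    }
}
```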
Simply copying DDF data from each storage device into a memory of a storage controller would be undesirable for a number of reasons. First, the DDF data is likely to include data describing multiple logical volumes, and the storage controller assuming ownership may only need data relating to one of the volumes being transferred. Additionally, DDF data stored across multiple storage devices is likely to include repetitive configuration data describing the logical volumes (i.e., certain configuration data for the logical volume such as the RAID configuration of the volume may be repeated in DDF data for each storage device). For these reasons, it is generally desirable for a storage controller transferring ownership in accordance with features and aspects hereof to utilize custom, native-format metadata 313 for storing logical volume configuration information.
In order to generate native-format metadata that may be transferred between controllers, a storage controller may engage in the following processes. The entire DDF portion of each storage device owned by the storage controller may be analyzed. Virtual disk (i.e., logical volume) records may be read from the DDF sections of each storage device and then merged together to form a final configuration structure. The merging process itself may utilize multiple iterations through the DDF data to derive a native format for the logical volume configuration information. This is because each DDF section of each storage device has some information relevant for defining properties of the logical volume.
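The merging of virtual disk records described above may be sketched in simplified form. This is a minimal sketch under stated assumptions: the `vd_record_t` layout is hypothetical and is not the actual DDF record layout, and a single pass is used here where a real implementation may require multiple iterations through the DDF data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MEMBERS 8   /* hypothetical limit for this sketch */

/* Hypothetical, simplified virtual disk record as read from the DDF
 * section of one storage device. */
typedef struct {
    uint32_t volume_id;
    uint8_t  raid_level;  /* repeated in the record on every member device */
    uint16_t device_id;   /* device whose DDF section holds this record */
} vd_record_t;

/* Hypothetical merged configuration for a single logical volume. */
typedef struct {
    uint32_t volume_id;
    uint8_t  raid_level;
    uint16_t members[MAX_MEMBERS];
    size_t   member_count;
} merged_volume_t;

/* Merge every record for volume_id into one structure, storing repeated
 * configuration once. Returns the number of member devices found. */
size_t merge_records(const vd_record_t *recs, size_t n,
                     uint32_t volume_id, merged_volume_t *out)
{
    assert(out != NULL && (recs != NULL || n == 0));
    out->volume_id = volume_id;
    out->raid_level = 0;
    out->member_count = 0;
    for (size_t i = 0; i < n; i++) {
        if (recs[i].volume_id != volume_id)
            continue;                          /* record for another volume */
        out->raid_level = recs[i].raid_level;  /* kept once, not per device */
        if (out->member_count < MAX_MEMBERS)
            out->members[out->member_count++] = recs[i].device_id;
    }
    return out->member_count;
}
```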
Native-format metadata 313 includes a specialized memory structure that describes the logical volumes and storage devices managed by the storage controller. For example, the native-format metadata may describe the RAID configurations of the storage devices that are managed by the storage controller (e.g., several storage devices may implement a RAID 5 configuration, several other storage devices may implement a RAID 0 configuration, etc.). This information may describe the number of RAID configurations, and for each set of storage devices implementing a RAID configuration: an identifier for the set of storage devices, identifiers for each storage device in the RAID configuration, total available space at the RAID configuration, remaining free space at the RAID configuration, a physical location (e.g., within an enclosure) at which to find each storage device of the RAID configuration, etc.
The native-format metadata may further describe the logical volumes managed by the storage controller. For example, the metadata may uniquely identify each logical volume, indicate a cache policy for the logical volume (e.g., write-back or write-through), describe an access policy for the logical volume (e.g., read/write, read, or write; the access policy may vary on a host-by-host basis), indicate a disk cache policy for the storage devices provisioning the volume (i.e., whether the storage devices are allowed to utilize their individual caches when storing data; using the caches may increase transfer speeds, but may also make the system more vulnerable to transfer errors if a power failure occurs), and indicate whether a power save policy is allowed for the storage devices (i.e., whether the storage devices provisioning the logical volume are allowed to go idle, thereby saving power but reducing performance). Further, the metadata may include information describing hot spares owned by the storage controller, whether the spares are reserved for specific RAID configurations or logical volumes, etc.
The following illustrates a memory structure that may be utilized to store the metadata:
Wherein the information is defined according to the C, C++, and/or C# family of languages, and wherein the term U32, U16, etc. indicates an unsigned integer of 32 bits, 16 bits, etc.
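By way of illustration only, a memory structure holding the kinds of information described above might be sketched in C as follows. This is a minimal sketch under stated assumptions: every type name, field name, enumerator, and array limit below is hypothetical and is not taken from the structure referenced above; only the `U16`/`U32` naming convention follows the description.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t  U8;
typedef uint16_t U16;
typedef uint32_t U32;
typedef uint64_t U64;

static_assert(sizeof(U16) == 2 && sizeof(U32) == 4, "unexpected integer widths");

/* Hypothetical limits chosen for this sketch. */
#define MAX_DEVICES_PER_ARRAY 32
#define MAX_ARRAYS            16
#define MAX_VOLUMES           64
#define MAX_HOT_SPARES         8

enum { CACHE_WRITE_THROUGH = 0, CACHE_WRITE_BACK = 1 };
enum { ACCESS_READ_WRITE = 0, ACCESS_READ_ONLY = 1, ACCESS_WRITE_ONLY = 2 };

/* One set of storage devices implementing a RAID configuration. */
typedef struct {
    U16 arrayId;                               /* identifier for the set */
    U8  raidLevel;                             /* e.g., 0 or 5 */
    U8  deviceCount;
    U16 deviceId[MAX_DEVICES_PER_ARRAY];       /* member storage devices */
    U16 enclosureSlot[MAX_DEVICES_PER_ARRAY];  /* physical location of each */
    U64 totalBlocks;                           /* total available space */
    U64 freeBlocks;                            /* remaining free space */
} NATIVE_ARRAY;

/* One logical volume owned by the controller. */
typedef struct {
    U32 volumeId;          /* unique identifier */
    U16 arrayId;           /* RAID configuration provisioning the volume */
    U8  cachePolicy;       /* CACHE_WRITE_BACK or CACHE_WRITE_THROUGH */
    U8  accessPolicy;      /* ACCESS_* value */
    U8  diskCacheAllowed;  /* nonzero if device caches may be used */
    U8  powerSaveAllowed;  /* nonzero if devices may go idle */
} NATIVE_VOLUME;

/* Complete native-format configuration held in controller memory. */
typedef struct {
    U16           arrayCount;
    U16           volumeCount;
    U16           hotSpareCount;
    U16           hotSpareDeviceId[MAX_HOT_SPARES];
    NATIVE_ARRAY  array[MAX_ARRAYS];
    NATIVE_VOLUME volume[MAX_VOLUMES];
} NATIVE_CONFIG;
```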
It may be immediately appreciated that the native-format metadata is stored at a memory of the storage controller and does not include duplicate data for a single logical volume. Furthermore, the native-format metadata does not include information describing logical volumes not owned by the storage controller. As the native-format metadata at the storage controller is smaller in size than the DDF data and more easily accessed, it is also easier to parse. Thus, using native-format metadata results in a substantial performance boost.
While the invention has been illustrated and described in the drawings and foregoing description, such illustration and description is to be considered as exemplary and not restrictive in character. One embodiment of the invention and minor variants thereof have been shown and described. In particular, features shown and described as exemplary software or firmware embodiments may be equivalently implemented as customized logic circuits and vice versa. Protection is desired for all changes and modifications that come within the spirit of the invention. Those skilled in the art will appreciate variations of the above-described embodiments that fall within the scope of the invention. As a result, the invention is not limited to the specific examples and illustrations discussed above, but only by the following claims and their equivalents.
This patent claims priority to U.S. provisional patent application No. 61/532,585, filed on 9 Sep. 2011 and titled “IO Shipping for RAID Virtual Disks Created On A Disk Group Shared Across Cluster,” which is hereby incorporated by reference.
Number | Date | Country
---|---|---
61532585 | Sep 2011 | US