The present application claims the benefit under 35 U.S.C. §119(a) of Indian Patent Application Serial Number 818/KOL/2013, filed Jul. 10, 2013, which is incorporated herein by reference.
In a redundant array of independent discs (RAID) storage system with large numbers of drives, the use of expanders is inevitable. Expanders spin up the drives during power up. If all the drives were spun up simultaneously the resulting power draw would overload the available power supply. To overcome this issue, expanders perform staggered spin up where predefined sets of drives are spun up in cycles until all drives are spun up. Multiple such cycles are required to spin up all the drives, and all the drives need to be spun up before reporting the completion of spin up because drive usage is completely hidden from the expander; the controller is the device that communicates with the user/operating system and designates drive usage.
Consequently, it would be advantageous if an apparatus existed that is suitable for prioritizing spin up in a data storage system according to the designated usage of connected drives.
Accordingly, the present invention is directed to a novel method and apparatus for prioritizing spin up in a data storage system according to the designated usage of connected drives.
In at least one embodiment of the present invention, a data storage system controller designates critical drives for staggered spin up and other, non-critical drives for spin up only when the controller notifies the appropriate expander. Each expander in the data storage system maintains configuration information for each PHY of the expander and reports completion of spin up when all of the drives designated “staggered spin up” have been spun up.
In another embodiment of the present invention, an expander maintains PHY configuration data, designating each PHY as “staggered spin up,” “host notify” or “disabled.” At boot time, only devices connected to PHYs designated “staggered spin up” are spun up in cycles before reporting spin up completion to a host device. Devices connected to PHYs designated “staggered spin up” could include drives that are part of a redundant array of independent discs, or drives that contain the host operating system. Furthermore, devices connected to PHYs designated “disabled” could include operable devices that may be used as hot spares if necessary, and failed devices that have not yet been removed from the system.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention claimed. The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate an embodiment of the invention and together with the general description, serve to explain the principles.
The numerous advantages of the present invention may be better understood by those skilled in the art by reference to the accompanying figures in which:
Reference will now be made in detail to the subject matter disclosed, which is illustrated in the accompanying drawings. The scope of the invention is limited only by the claims; numerous alternatives, modifications and equivalents are encompassed. For the purpose of clarity, technical material that is known in the technical fields related to the embodiments has not been described in detail to avoid unnecessarily obscuring the description.
Referring to
The memory 104 stores PHY configuration information associated with each of the PHYs 108, designating the spin up priority of the device 106 connected to each PHY 108. In at least one embodiment, PHYs 108 are designated “staggered spin up,” “host notify” or “disabled.”
At boot time, when the expander 100 receives an instruction to begin spinning up connected devices, the processor 102 identifies all PHYs 108 designated “staggered spin up” and begins spinning up the devices 106 attached to those PHYs 108 according to some predetermined priority schedule to avoid overloading the expander power supply. When all of the devices 106 attached to PHYs 108 designated “staggered spin up” have been spun up, the processor 102 sends a signal to a controller indicating spin up is complete, even though less than all of the attached devices have spun up. The expander 100 thereby improves boot up time and system availability by allowing a controller to communicate with devices 106 more rapidly after boot up.
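The boot-time behavior described above can be sketched in Python as follows. This is a minimal illustration, not the expander firmware: the names `PhyMode`, `boot_spin_up`, `max_per_cycle` and `spin_up` are hypothetical, the per-cycle limit stands in for whatever power constraint applies, and ascending PHY index is assumed as the predetermined priority schedule.

```python
from enum import Enum


class PhyMode(Enum):
    # The three designations named in the description.
    STAGGERED_SPIN_UP = "staggered spin up"
    HOST_NOTIFY = "host notify"
    DISABLED = "disabled"


def boot_spin_up(phy_config, max_per_cycle, spin_up):
    """Spin up only PHYs designated staggered spin up, in cycles.

    phy_config: dict mapping PHY index -> PhyMode (assumed layout).
    max_per_cycle: assumed limit on drives started in one cycle.
    spin_up: callable invoked once per PHY to start the attached device.
    Returns the list of cycles, each a list of PHY indexes.
    """
    if max_per_cycle < 1:
        raise ValueError("max_per_cycle must be at least 1")
    # Assumed priority rule: ascending PHY index.
    pending = sorted(
        phy for phy, mode in phy_config.items()
        if mode is PhyMode.STAGGERED_SPIN_UP
    )
    cycles = []
    for start in range(0, len(pending), max_per_cycle):
        batch = pending[start:start + max_per_cycle]
        for phy in batch:
            spin_up(phy)
        cycles.append(batch)
    # Here the expander would report spin up complete to the controller,
    # although host notify and disabled PHYs have not been spun up.
    return cycles


# Hypothetical five-PHY expander with a limit of two drives per cycle.
config = {
    0: PhyMode.STAGGERED_SPIN_UP,
    1: PhyMode.STAGGERED_SPIN_UP,
    2: PhyMode.STAGGERED_SPIN_UP,
    3: PhyMode.HOST_NOTIFY,
    4: PhyMode.DISABLED,
}
started = []
cycles = boot_spin_up(config, 2, started.append)
```

In this sketch three of the five PHYs are spun up in two cycles, and completion can be reported without waiting on PHYs 3 and 4.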
Referring to
In a redundant array of independent discs storage system, devices 206 connected to the expanders 200, 202, 204 may be organized into one data storage volume, and the individual devices are substantially invisible to the end user. Because of the nature of such storage systems, input/output operations cannot be processed until all of the devices 206 comprising the redundant array of independent discs are spun up and operable. However, each of the expanders 200, 202, 204 is unaware of which devices 206 comprise the redundant array of independent discs and which devices 206 comprise spare capacity. The controller, by contrast, is aware of which devices 206 are actually necessary to process input/output operations.
Referring to
Where the plurality of devices 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344 comprise a redundant array of independent discs, such that two or more of the devices 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344 are treated as a single data storage volume, the host 312 cannot process input/output requests until all of the devices 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344 comprising the redundant array of independent discs are spun up. However, devices 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344 comprising hot spares or otherwise unused drives are not necessary to process input/output requests.
When a redundant array of independent discs is initially established, the controller 310 may identify which devices 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344 will comprise portions of the storage volume, which will comprise hot spares and which will remain unutilized in anticipation of additional capacity needs. For example, the controller 310 can designate a first set 346 of devices 314, 316, 318 connected to the second expander 302 and a first set 352 of devices 338, 340, 342, 344 connected to the third expander 304 as part of a redundant array of independent discs. Those redundant array devices 314, 316, 318, 338, 340, 342, 344 must be spun up before the host 312 can begin servicing input/output requests. The controller 310 can also designate a second set 348 of devices 320, 322 connected to the second expander 302 and a second set 354 of devices 334, 336 connected to the third expander 304 as hot spares. Hot spare devices 320, 322, 334, 336 do not need to be spun up before the host 312 can begin servicing input/output requests but do need to be quickly available in the event of a disc failure. Finally, the controller 310 can designate a third set 350 of devices 324, 326, 328 connected to the second expander 302 and a third set 356 of devices 330, 332 connected to the third expander 304 as unconfigured or offline. Unconfigured devices 324, 326, 328, 330, 332 are initially unused and may be added to the redundant array of independent discs as more capacity becomes necessary; or they may be utilized as new hot spares as hot spare devices 320, 322, 334, 336 are utilized. Unconfigured devices 324, 326, 328, 330, 332 do not need to be spun up before the host 312 can begin servicing input/output requests.
Once the controller 310 determines an initial configuration for the data storage system topology, the function of each device 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344 is communicated to the corresponding expander 300, 302, 304. Each expander 300, 302, 304 then produces and stores a data structure correlating each PHY in the expander 300, 302, 304 with the designation of the device 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344 connected to that PHY.
Continuing the previous example, the second expander 302 includes a PHY configuration data structure 306 storing PHY configuration information for the devices 314, 316, 318, 320, 322, 324, 326, 328 connected to the second expander 302. In at least one embodiment, the first set 346 is designated “staggered spin up.” Such designation is stored in the PHY configuration data structure 306. Staggered spin up indicates to the expander 302 that such devices 314, 316, 318 should be spun up at boot time. Where the first set 346 designated staggered spin up includes more devices 314, 316, 318 than can be spun up in a single cycle, the expander 302 spins up the devices 314, 316, 318 according to some predetermined priority rule such as spinning up devices 314, 316, 318 according to the sequence of the connecting PHY or any other appropriate priority sequencing.
In at least one embodiment, the second set 348 is designated “host notify.” Such designation is stored in the PHY configuration data structure 306. Host notify indicates to the expander 302 that such devices 320, 322 should be spun up only when the host issues an appropriate command, and not at boot time. Where the second set 348 is designated host notify, the expander 302 does not wait for such devices 320, 322 to spin up at boot time before reporting to the controller 310 that spin up is complete.
In at least one embodiment, the third set 350 is designated “disabled.” Such designation is stored in the PHY configuration data structure 306. Disabled indicates to the expander 302 that such devices 324, 326, 328 should be disabled and require some change in designation before spin up can occur. Where the third set 350 is designated disabled, the expander 302 does not wait for such devices 324, 326, 328 to spin up at boot time before reporting to the controller 310 that spin up is complete.
Similarly, the third expander 304 includes a PHY configuration data structure 307 storing PHY configuration information for the devices 330, 332, 334, 336, 338, 340, 342, 344 connected to the third expander 304. In at least one embodiment, the first set 352 is designated “staggered spin up.” Such designation is stored in the PHY configuration data structure 307. Staggered spin up indicates to the expander 304 that such devices 338, 340, 342, 344 should be spun up at boot time. Where the first set 352 designated staggered spin up includes more devices 338, 340, 342, 344 than can be spun up in a single cycle, the expander 304 spins up the devices 338, 340, 342, 344 according to some predetermined priority rule such as spinning up devices 338, 340, 342, 344 according to the sequence of the connecting PHY or any other appropriate priority sequencing.
In at least one embodiment, the second set 354 is designated “host notify.” Such designation is stored in the PHY configuration data structure 307. Host notify indicates to the expander 304 that such devices 334, 336 should be spun up only when the host issues an appropriate command, and not at boot time. Where the second set 354 is designated host notify, the expander 304 does not wait for such devices 334, 336 to spin up at boot time before reporting to the controller 310 that spin up is complete.
In at least one embodiment, the third set 356 is designated “disabled.” Such designation is stored in the PHY configuration data structure 307. Disabled indicates to the expander 304 that such devices 330, 332 should be disabled and require some change in designation before spin up can occur. Where the third set 356 is designated disabled, the expander 304 does not wait for such devices 330, 332 to spin up at boot time before reporting to the controller 310 that spin up is complete.
At boot time, each of the second expander 302 and third expander 304 receives a boot signal from the controller 310, reads its corresponding PHY configuration data structure 306, 307 and spins up all devices 314, 316, 318, 338, 340, 342, 344 connected to PHYs designated “staggered spin up.” Where necessary, spin up occurs according to a staggered spin up schedule defined by each expander 302, 304. Each expander 302, 304 then reports spin up complete to the controller 310. Devices 320, 322, 324, 326, 328, 330, 332, 334, 336 connected to PHYs designated “host notify” or “disabled” are not spun up at this time.
In another exemplary embodiment, the host 312 operating system is stored on one or more devices 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344 connected to one of the expanders 302, 304. For example, the host 312 operating system is stored on a third set 356 of devices 330, 332 connected to the third expander 304. Because the host 312 operating system is critical to the operation of the host 312, the third set 356 must be spun up at boot time before any other operations can be performed. The third set 356 is therefore designated “staggered spin up.” At boot time, the third set 356 containing the host 312 operating system is spun up and the third expander 304 reports spin up complete to the controller 310. The host 312 then boots up.
In order to minimize the time to boot up the host 312, it is advantageous for the third expander 304 to report spin up complete as soon as the third set 356 is spun up; therefore, only the third set 356 is designated staggered spin up in the PHY configuration data structures 306, 307. Other devices 314, 316, 318, 320, 322, 324, 326, 328, 334, 336, 338, 340, 342, 344 are connected to PHYs designated either “host notify” or “disabled.” For example, where the first set 346 connected to the second expander 302 and the first set 352 connected to the third expander 304 are previously designated to comprise a redundant array of independent discs, the PHYs corresponding to such sets 346, 352 are designated “host notify.” After the host 312 has booted up, the controller 310 sends appropriate commands to instruct the second expander 302 and third expander 304 to spin up devices 314, 316, 318, 338, 340, 342, 344 comprising the redundant array of independent discs. In one embodiment, the controller 310 determines an acceptable spin up sequence; in another embodiment, each expander 302, 304 determines a spin up sequence where the number of spin up commands received from the controller 310 would exceed the available power supply.
During normal operation, a controller 310 can change the designation of a PHY in a PHY configuration data structure 306, 307. For example, where a second set 348 in the second expander 302 is designated “host notify,” and comprises devices 320, 322 operating as hot spares, one of the devices 320, 322 may be activated to compensate for some other failed device. In that case, the PHY connected to the newly activated device 320 is re-designated “staggered spin up.” Furthermore, the PHY connected to the failed device 316 is re-designated “disabled.” Also, a PHY connected to an operable but disabled device 324 is re-designated “host notify” in anticipation of use as a hot spare.
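The re-designation example above can be sketched in Python as follows. This is an illustrative assumption about how such an update might look, not a format defined by the description: the function `activate_hot_spare` is hypothetical, designations are plain strings, and the device reference numerals stand in for PHY identifiers.

```python
STAGGERED = "staggered spin up"
HOST_NOTIFY = "host notify"
DISABLED = "disabled"


def activate_hot_spare(phy_config, failed, spare, standby=None):
    """Return a new PHY configuration after a hot spare replaces a failed device.

    phy_config: dict mapping PHY identifier -> designation string.
    failed: PHY of the failed device, re-designated disabled.
    spare: PHY of the hot spare being activated; must be host notify.
    standby: optional PHY of an operable disabled device promoted to
        host notify so that it can serve as the next hot spare.
    """
    if phy_config[spare] != HOST_NOTIFY:
        raise ValueError("spare PHY is not designated host notify")
    if standby is not None and phy_config[standby] != DISABLED:
        raise ValueError("standby PHY is not designated disabled")
    updated = dict(phy_config)  # leave the caller's table untouched
    updated[spare] = STAGGERED
    updated[failed] = DISABLED
    if standby is not None:
        updated[standby] = HOST_NOTIFY
    return updated


# Stand-in for PHY configuration data structure 306, keyed by the
# reference numeral of the device attached to each PHY.
config_306 = {
    314: STAGGERED, 316: STAGGERED, 318: STAGGERED,
    320: HOST_NOTIFY, 322: HOST_NOTIFY,
    324: DISABLED, 326: DISABLED, 328: DISABLED,
}
after_failure = activate_hot_spare(config_306, failed=316, spare=320, standby=324)
```

Returning a new table rather than mutating in place is a design choice of this sketch only; it keeps the three re-designations atomic from the caller's point of view.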
Referring to
In at least one embodiment, the controller creates 404 one or more hot spares from one or more discs connected to the one or more expanders. The controller then sets 406 one or more data elements associated with expander PHYs corresponding to such discs in a PHY configuration data structure in the expander to some value indicating that the devices should not be spun up at boot time, but should be available to spin up based on a command from a host.
In at least one embodiment, the controller identifies 408 one or more unconfigured discs from one or more discs connected to the one or more expanders. The controller then sets 410 one or more data elements associated with expander PHYs corresponding to such discs in a PHY configuration data structure in the expander to some value indicating that the devices should be disabled.
Referring to
In at least one embodiment, the controller identifies 504 one or more redundant array of independent disc volumes and hot spares from one or more discs connected to the one or more expanders. The controller then sets 506 one or more data elements associated with expander PHYs corresponding to such discs in a PHY configuration data structure in the expander to some value indicating that the devices should not be spun up at boot time, but should be available to spin up based on a command from a host.
In at least one embodiment, the controller identifies 508 one or more unconfigured discs from one or more discs connected to the one or more expanders. The controller then sets 510 one or more data elements associated with expander PHYs corresponding to such discs in a PHY configuration data structure in the expander to some value indicating that the devices should be disabled.
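The two configuration approaches described above differ mainly in how a device's role maps to a PHY designation. The Python sketch below illustrates that difference under stated assumptions: the role labels, the policy tables `RAID_FIRST` and `OS_FIRST`, and the function `build_phy_config` are hypothetical names, and device reference numerals again stand in for PHY identifiers.

```python
STAGGERED = "staggered spin up"
HOST_NOTIFY = "host notify"
DISABLED = "disabled"

# Array members are needed before input/output can be serviced.
RAID_FIRST = {
    "raid": STAGGERED,
    "hot_spare": HOST_NOTIFY,
    "unconfigured": DISABLED,
}

# Only the operating system devices are spun up at boot time;
# array members and hot spares wait for a command after the host boots.
OS_FIRST = {
    "os": STAGGERED,
    "raid": HOST_NOTIFY,
    "hot_spare": HOST_NOTIFY,
    "unconfigured": DISABLED,
}


def build_phy_config(device_roles, policy):
    """Map each PHY identifier to a designation using the given policy."""
    config = {}
    for phy, role in device_roles.items():
        if role not in policy:
            raise KeyError(f"policy has no designation for role {role!r}")
        config[phy] = policy[role]
    return config


# Roles for the devices on the third expander 304 in the embodiment
# where the operating system resides on devices 330, 332.
roles_304 = {
    330: "os", 332: "os",
    334: "hot_spare", 336: "hot_spare",
    338: "raid", 340: "raid", 342: "raid", 344: "raid",
}
config_307 = build_phy_config(roles_304, OS_FIRST)
```

Under the `OS_FIRST` table only two PHYs on this expander are designated staggered spin up, so the expander can report completion after spinning up devices 330 and 332.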
Referring to
Once the host has booted up, the controller identifies 604 one or more devices connected to the one or more expanders that are required at boot time. The controller then sends 606 one or more commands to the expanders to spin up such required devices.
In at least one embodiment, the controller identifies 608 one or more discs in a redundant array of independent discs volume connected to the one or more expanders. The controller then sends 610 one or more commands to the expanders to spin up discs in the volume.
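The controller-side steps above can be sketched in Python as follows. The sketch assumes that the controller keeps a copy of each expander's PHY designations and a set of array members; the function `post_boot_spin_up` and the `send_command` callback are hypothetical and do not correspond to any particular command format.

```python
HOST_NOTIFY = "host notify"
STAGGERED = "staggered spin up"
DISABLED = "disabled"


def post_boot_spin_up(expander_configs, raid_members, send_command):
    """Send spin up commands for array members still waiting on the host.

    expander_configs: dict mapping expander id -> {phy id: designation}.
    raid_members: set of (expander id, phy id) pairs in the volume.
    send_command: callable(expander id, phy id) that issues one command.
    Returns the (expander id, phy id) pairs commanded, in order.
    """
    sent = []
    for expander in sorted(expander_configs):
        config = expander_configs[expander]
        for phy in sorted(config):
            # Only host notify PHYs need a command; staggered PHYs were
            # spun up at boot time and disabled PHYs are skipped.
            if (expander, phy) in raid_members and config[phy] == HOST_NOTIFY:
                send_command(expander, phy)
                sent.append((expander, phy))
    return sent


# Hypothetical state after the host has booted from devices on expander 304.
configs = {
    302: {314: HOST_NOTIFY, 316: HOST_NOTIFY, 318: HOST_NOTIFY,
          320: HOST_NOTIFY, 324: DISABLED},
    304: {330: STAGGERED, 334: HOST_NOTIFY, 338: HOST_NOTIFY, 340: HOST_NOTIFY},
}
members = {(302, 314), (302, 316), (302, 318), (304, 338), (304, 340)}
log = []
sent = post_boot_spin_up(configs, members, lambda e, p: log.append((e, p)))
```

Hot spares such as device 320 share the host notify designation but receive no command here because they are not members of the volume.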
It is believed that the present invention and many of its attendant advantages will be understood by the foregoing description of embodiments of the present invention, and it will be apparent that various changes may be made in the form, construction, and arrangement of the components thereof without departing from the scope and spirit of the invention or without sacrificing all of its material advantages. The form hereinbefore described being merely an explanatory embodiment thereof, it is the intention of the following claims to encompass and include such changes.
Number | Date | Country | Kind |
---|---|---|---|
818KOL2013 | Jul 2013 | IN | national |