The present disclosure relates generally to computer systems and information handling systems, and, more specifically, to a system and method for prioritizing disk access for shared-disk applications.
As the value and use of information continues to increase, individuals and businesses seek additional ways to process and store information. One option available to these users is an information handling system. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may vary with respect to the type of information handled; the methods for handling the information; the methods for processing, storing or communicating the information; the amount of information processed, stored, or communicated; and the speed and efficiency with which the information is processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include or comprise a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
Cluster database software can allow a collection, or “cluster,” of networked computing systems, or “nodes,” shared access to a single database. One example of cluster database software is the Real Application Cluster software of Oracle Corporation in Redwood Shores, Calif. The shared database may be located in a set of shared storage devices, such as a shared set of external disks. Although this shared-access feature offers many advantages, problems may arise if every node in the cluster attempts to access the shared external disks simultaneously. The resulting disk-access contentions could lead to timeout failures for the requested I/O operations. The extra time needed to retry failed I/O operations may put the operation of the cluster as a whole at risk. For example, the cluster database may designate one or more disks, or one or more partitions, in the shared external disks as a “voting disk,” which stores and provides cluster-status information. Timely access to the voting disk by the nodes is critical to the continued operation of the cluster. If the nodes cannot access the voting disk before the set timeout period for operations expires, certain processes performed by the cluster could fail.
A system and method for prioritizing disk access for a shared-disk storage system are disclosed. The system includes a cluster of computing systems coupled via a network, wherein the cluster of computing systems includes at least two nodes. A storage system is coupled to the network. The at least two nodes each may access the storage system. A high-priority buffer stores requests from the cluster of computing systems for high-priority information stored in the storage system, and a low-priority buffer stores requests from the cluster of computing systems for low-priority information stored in the storage system. A storage-system controller serves requests stored in the high-priority buffer before serving requests stored in the low-priority buffer.
The system and method disclosed herein are technically advantageous because they reduce the chances of timeout failures for I/O requests for critical information by allowing the cluster to serve requests for high-priority information before serving requests for low-priority information. Timeout failures can force the requesting node to reboot. Any other services performed by the rebooting node will be delayed until the reboot is complete, ultimately slowing the operation of the cluster of computing systems. Moreover, because timeout failures for critical I/O requests can lead to the failure of the entire cluster of computing systems in certain situations, the resulting reduction in timeout failures for critical requests improves the stability of the cluster as a whole.
A more complete understanding of the present embodiments and advantages thereof may be acquired by referring to the following description taken in conjunction with the accompanying drawings, in which like reference numbers indicate like features, and wherein:
For purposes of this disclosure, an information handling system may include any instrumentality or aggregate of instrumentalities operable to compute, classify, process, transmit, receive, retrieve, originate, switch, store, display, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data for business, scientific, control, or other purposes. For example, an information handling system may be a personal computer, a network storage device, or any other suitable device and may vary in size, shape, performance, functionality, and price. The information handling system may include random access memory (RAM), one or more processing resources such as a central processing unit (CPU) or hardware or software control logic, ROM, and/or other types of nonvolatile memory. Additional components of the information handling system may include one or more disk drives, one or more network ports for communication with external devices as well as various input and output (I/O) devices, such as a keyboard, a mouse, and a video display. The information handling system may also include one or more buses operable to transmit communications between the various hardware components.
As shown in
Example shared-disk storage system 108 may use a Redundant Array of Independent Disks (“RAID”) configuration to guard against data loss should any of the individual shared disks 112, 114, 116, or 118 fail. As such, cluster database system 100 may include a storage-system controller 120 to handle the management of disks 112, 114, 116, and 118. Storage-system controller 120 may perform any parity calculations that may be required to maintain the selected RAID configuration. Storage-system controller 120 may consist of software components and, in some cases, hardware components located in one or more of nodes 102 and 104. Alternatively, storage-system controller 120 may reside within example shared-disk storage system 108, if desired. Shared-disk storage system 108 may be configured according to any RAID level desired or may use an alternative redundant storage methodology. Also, shared-disk storage system 108 may be configured as a software-based RAID system that relies not on storage-system controller 120 but on a host-based volume manager for management commands and parity calculations.
Typically, storage-system controller 120 will treat requests for data from the different disks 112, 114, 116, and 118 equally and will process such requests in the order in which they were received. Thus, a request for data located in disk 112 will be given the same weight as a request for data located in disk 114, even if disk 112 contains data that is critical to the continued operation of cluster 100, while disk 114 contains only non-critical data. This equal treatment may cause problematic disk-access contentions. For example, the cluster database software may designate one disk as the voting disk, such as disk 112 in shared-disk storage system 108. Again, the voting disk will contain information, such as cluster-status information, that is critical to the continued function of the database. Should node 102 send a request for non-critical information stored on disk 114 and then node 104 send a request for critical information from disk 112, storage-system controller 120 will queue the requests in the order received, so the critical request must wait behind the non-critical one.
In certain embodiments of the system and method of the present invention, disks in shared-disk storage system 108 may be assigned a priority level based on the information stored on the disk. A disk containing critical information, such as the voting disk, could be assigned a higher priority level than disks containing non-critical information. All I/O requests would be queued in a high-priority buffer or a low-priority buffer according to the information they seek. Priority assignments could be made at the time the RAID system is created.
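The priority assignment and two-buffer queuing described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names `Request`, `Controller`, `submit`, and `next_request` are hypothetical, the disk and node numbers simply reuse the reference numerals from the description, and treating any unlisted disk as low priority is an assumption made for the example.

```python
from collections import deque
from dataclasses import dataclass

HIGH, LOW = "high", "low"


@dataclass(frozen=True)
class Request:
    """Hypothetical I/O request: which node asked, and for which disk."""
    node: int
    disk: int


class Controller:
    """Illustrative stand-in for a storage-system controller with two buffers."""

    def __init__(self, disk_priority):
        # Priority levels are fixed when the controller is created,
        # mirroring assignment at the time the RAID system is created.
        self.disk_priority = dict(disk_priority)
        self.buffers = {HIGH: deque(), LOW: deque()}

    def submit(self, request):
        # Assumption: a disk with no assigned level is treated as low priority.
        level = self.disk_priority.get(request.disk, LOW)
        self.buffers[level].append(request)

    def next_request(self):
        # Serve the high-priority buffer first; fall back to the low-priority
        # buffer only when no high-priority request is waiting.
        for level in (HIGH, LOW):
            if self.buffers[level]:
                return self.buffers[level].popleft()
        return None


controller = Controller({112: HIGH, 114: LOW, 116: LOW, 118: LOW})
controller.submit(Request(node=102, disk=114))  # non-critical, arrives first
controller.submit(Request(node=104, disk=112))  # voting disk, arrives second
first = controller.next_request()
second = controller.next_request()
```

In this sketch the request for disk 112 is served first even though it arrived second, and requests within the same buffer keep their arrival order.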
If data is present in the high-priority buffer, storage-system controller 120 may then move on to the step shown in block 406 and process a high-priority request or data block in the high-priority buffer. As shown by the arrows in
Although the present disclosure has described a shared-disk storage system with two buffers, a high-priority buffer and a low-priority buffer, the reader should recognize that the shared-disk storage system may incorporate any number of buffers of differing degrees of priority. That is, intermediate-priority buffers may be used between the low- and high-priority buffers, with requests in the intermediate-priority buffers served after requests in the high-priority buffer but before requests in the low-priority buffer. Moreover, the shared-disk storage system may use a less-rigid hierarchy for processing requests, if desired. For example, the shared-disk storage system may process higher-priority requests before lower-priority requests up until a threshold number of lower-priority requests builds up in the lower-priority buffers. At that point, the shared-disk storage system may service enough of the lower-priority requests to bring the request total below the threshold before returning to processing the higher-priority requests. Although the present disclosure has been described in detail, it should be understood that various changes, substitutions, and alterations can be made hereto without departing from the spirit and the scope of the invention as defined by the appended claims.
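The threshold-based, less-rigid hierarchy mentioned above can be sketched in Python as follows. This is one possible reading under stated assumptions, not the disclosed implementation: the class name `ThresholdScheduler`, its methods, and the `low_threshold` parameter are hypothetical, and the sketch assumes a single low-priority buffer whose backlog is compared against the threshold each time a request is selected.

```python
from collections import deque


class ThresholdScheduler:
    """Illustrative scheduler: high priority first, unless low-priority backlog is too large."""

    def __init__(self, low_threshold):
        # Assumption: the threshold counts requests waiting in the low-priority buffer.
        self.low_threshold = low_threshold
        self.high = deque()
        self.low = deque()

    def submit(self, request, high_priority):
        (self.high if high_priority else self.low).append(request)

    def next_request(self):
        # Once the low-priority backlog reaches the threshold, drain it
        # until it falls back below the threshold.
        if len(self.low) >= self.low_threshold:
            return self.low.popleft()
        # Otherwise serve higher-priority requests first.
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None


scheduler = ThresholdScheduler(low_threshold=3)
for name in ("a", "b", "c"):
    scheduler.submit(name, high_priority=False)
for name in ("x", "y"):
    scheduler.submit(name, high_priority=True)
order = [scheduler.next_request() for _ in range(5)]
```

With a threshold of three, the sketch serves one low-priority request to bring the backlog below the threshold, then returns to the high-priority requests, and serves the remaining low-priority requests only once the high-priority buffer is empty.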