This invention relates to an apparatus and method for allocating task control blocks in a data storage and retrieval system.
Data storage and retrieval systems are used to store information provided by one or more host computer systems, typically host computer systems organized into local or wide area networks. Such data storage and retrieval systems are typically composed of an array of host adapter cards that interface with host computers, a processor complex, and an array of device adapters that communicate with one or more disk drives. The processor complex typically includes a processor, a cache, a non-volatile storage device (“NVS”), and a backup power source that ensures continued operation of the processor and the cache in the event of a power failure. The processor typically runs several processes that direct the operation of the data storage and retrieval system. One of these processes, which manages communication between the processor complex and the host adapter cards, is defined by a Host Adapter Interface (“HAI”) code.
Conventional data storage and retrieval systems receive requests to write information to one or more secondary storage devices, and requests to retrieve information from those one or more secondary storage devices. Upon receipt of a write request, conventional systems store information received from a host computer in a data cache. In some cases, a copy of that information is also stored in NVS. NVS is used as temporary storage for data in the process of being written to secondary storage devices so that data will be available in the event that the host computer systems or the data storage and retrieval systems fail during the process of storing data. Upon receipt of a read request, the system recalls information from the one or more secondary storage devices and moves that information to the data cache and then to the host.
Conventional data storage and retrieval systems are continuously moving information to and from storage devices, to and from the data cache and in certain circumstances to and from the NVS. Conventionally, task control blocks (“TCBs”) are used to manage the movement of data within a data storage and retrieval system and between a host computer and the data storage and retrieval system. TCBs are passed between various processes within the data storage and retrieval system to clear space for and manage the movement of the data to be stored or retrieved.
The invention provides systems and methods whereby the Cache code (instead of the HAI code) controls the allocation of TCBs for new Host Adapter writes and reads. The Cache code's allocation of TCBs is based on the current knowledge of the number of existing TCBs already waiting to perform writes or reads, and the current knowledge of the number of existing TCBs already waiting to perform stage/destage work with the disk drives.
In one embodiment, methods and systems according to the invention allocate Task Control Blocks in information storage and retrieval systems that communicate with one or more host computers. Such information storage and retrieval systems comprise a host adapter interface, a cache code for issuing Task Control Blocks, a data cache, a non-volatile storage, a new read Task Control Block threshold, a device adapter, and one or more information storage devices. The device adapter interconnects the data cache and the one or more information storage devices.
In certain embodiments, systems according to the invention receive a new read request from the host adapter interface, call the cache code, determine the number of Task Control Blocks already issued for previous new reads, and compare the number of Task Control Blocks already issued for previous new reads with the new read Task Control Block threshold. If the number of Task Control Blocks already issued for previous new reads exceeds the new read Task Control Block threshold, systems according to the invention queue the new read request. If the number of Task Control Blocks already issued for previous new reads does not exceed the new read Task Control Block threshold, systems according to the invention issue a Task Control Block corresponding to the new read request from the cache code.
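The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of the invention: the class name `ReadGate`, its fields, and the threshold value of 2 are all hypothetical.

```python
from collections import deque


class ReadGate:
    """Hypothetical sketch of the cache code's check for new reads."""

    def __init__(self, new_read_tcb_threshold):
        self.new_read_tcb_threshold = new_read_tcb_threshold
        self.issued_read_tcbs = 0     # TCBs already issued for previous new reads
        self.queued_reads = deque()   # new read requests waiting for a TCB

    def on_new_read(self, request_id):
        # Queue only when the issued count exceeds the threshold,
        # following the "exceeds" / "does not exceed" wording of the text.
        if self.issued_read_tcbs > self.new_read_tcb_threshold:
            self.queued_reads.append(request_id)
            return "queued"
        self.issued_read_tcbs += 1
        return "issued"


gate = ReadGate(new_read_tcb_threshold=2)  # threshold value is illustrative
results = [gate.on_new_read(i) for i in range(5)]
print(results)
```

With a threshold of 2, the first three requests receive a TCB (the count is not yet above 2 when each arrives) and the remaining two are queued.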
In certain embodiments, systems according to the invention further comprise a queued stage work TCB threshold and perform the steps of determining a number of Task Control Blocks queued to perform staging of data from the one or more information storage devices to the cache, and comparing the number of Task Control Blocks queued to perform staging of data from the one or more information storage devices to the cache with the queued stage work TCB threshold. If the number of Task Control Blocks queued to perform staging of data from the one or more information storage devices to the cache exceeds the queued stage work TCB threshold, systems according to the invention queue the new read request. Alternatively, if the number of Task Control Blocks queued to perform staging of data from the one or more information storage devices to the cache does not exceed the queued stage work TCB threshold, systems according to the invention issue a Task Control Block corresponding to the new read request from the cache code.
In some embodiments, information storage and retrieval systems according to the invention comprise a current stage work TCB threshold. Such systems determine a number of Task Control Blocks currently performing staging of data from the one or more information storage devices to the cache, and compare the number of Task Control Blocks currently performing staging of data from the one or more information storage devices to the cache with the current stage work TCB threshold. If the number of Task Control Blocks currently performing staging of data from the one or more information storage devices to the cache exceeds the current stage work TCB threshold, the system queues the new read request. Alternatively, if the number of Task Control Blocks currently performing staging of data from the one or more information storage devices to the cache does not exceed the current stage work TCB threshold, the system issues a Task Control Block corresponding to the new read request from the cache code.
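The three read-side comparisons (new read TCBs, queued stage work TCBs, and current stage work TCBs) might be combined as in the following sketch. It assumes the checks are chained in the order in which they are introduced above; the names `ReadThresholds` and `admit_read` and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReadThresholds:
    new_read: int        # new read TCB threshold
    queued_stage: int    # queued stage work TCB threshold
    current_stage: int   # current stage work TCB threshold


def admit_read(issued_reads, queued_stage, current_stage, limits):
    """Return (decision, reason) for one new read request."""
    checks = [
        ("new read", issued_reads, limits.new_read),
        ("queued stage work", queued_stage, limits.queued_stage),
        ("current stage work", current_stage, limits.current_stage),
    ]
    for name, count, threshold in checks:
        if count > threshold:
            # The first exceeded threshold causes the request to be queued.
            return ("queue", name)
    return ("issue", None)


limits = ReadThresholds(new_read=8, queued_stage=4, current_stage=16)
print(admit_read(3, 2, 10, limits))
print(admit_read(3, 5, 10, limits))
print(admit_read(9, 5, 20, limits))
```

Returning the name of the exceeded threshold alongside the decision is a design choice made for this sketch so that a queued request can later be re-evaluated against the specific count that blocked it.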
In one embodiment, an information storage and retrieval system according to the invention communicates with one or more host computers. Such an information storage and retrieval system includes a host adapter interface, a cache code for issuing Task Control Blocks, a data cache, a non-volatile storage, a new write TCB threshold, a device adapter, and one or more information storage devices, said device adapter interconnecting said data cache and said one or more information storage devices. The information storage and retrieval system receives a new write request from the host adapter interface, calls the cache code, determines a number of Task Control Blocks already issued for previous new writes, and compares the number of Task Control Blocks already issued for previous new writes with the new write TCB threshold. If the number of Task Control Blocks already issued for previous new writes exceeds the new write TCB threshold, the system queues the new write request. Alternatively, if the number of Task Control Blocks already issued for previous new writes does not exceed the new write TCB threshold, the system issues a Task Control Block corresponding to the new write request from the cache code.
In another embodiment, the system further comprises a queued destage work TCB threshold. The system determines a number of Task Control Blocks queued to perform destaging of data from the cache to the one or more information storage devices, and compares the number of Task Control Blocks queued to perform destaging of data from the cache to the one or more information storage devices with the queued destage work TCB threshold. If the number of Task Control Blocks queued to perform destaging of data from the cache to the one or more information storage devices exceeds the queued destage work TCB threshold, the system queues the new write request. Alternatively, if the number of Task Control Blocks queued to perform destaging of data from the cache to the one or more information storage devices does not exceed the queued destage work TCB threshold, the system issues a Task Control Block corresponding to the new write request from the cache code.
In another embodiment, the system further comprises a current destage work TCB threshold. The system determines a number of Task Control Blocks currently performing destaging of data from the cache to the one or more information storage devices, and compares the number of Task Control Blocks currently performing destaging of data from the cache to the one or more information storage devices with the current destage work TCB threshold. If the number of Task Control Blocks currently performing destaging of data from the cache to the one or more information storage devices exceeds the current destage work TCB threshold, the system queues the new write request. Alternatively, if the number of Task Control Blocks currently performing destaging of data from the cache to the one or more information storage devices does not exceed the current destage work TCB threshold, the system issues a Task Control Block corresponding to the new write request from the cache code.
Embodiments of the invention can operate on new write requests that are cache fast writes, sequential fast writes or data storage device fast writes. In the event that a new write request is for a data storage device fast write, when determining the number of Task Control Blocks already issued for previous new writes, the system can determine the number of Task Control Blocks already issued for previous new writes in cache and non-volatile storage.
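The difference in counting between write types can be illustrated as follows. The text states only that, for a data storage device fast write, the count is determined in both cache and non-volatile storage; adding the two counts together is an assumption made for this sketch, as are the function name `issued_write_tcbs` and the short type labels.

```python
def issued_write_tcbs(write_type, cache_tcbs, nvs_tcbs):
    """Count TCBs already issued for previous new writes.

    Assumption: for a DFW the cache and NVS counts are simply added.
    The source does not specify how the two counts are combined.
    """
    if write_type == "DFW":
        return cache_tcbs + nvs_tcbs
    if write_type in ("CFW", "SFW"):
        # Cache fast writes and sequential fast writes involve the cache only.
        return cache_tcbs
    raise ValueError(f"unknown write type: {write_type}")


cfw_count = issued_write_tcbs("CFW", cache_tcbs=6, nvs_tcbs=3)
dfw_count = issued_write_tcbs("DFW", cache_tcbs=6, nvs_tcbs=3)
print(cfw_count, dfw_count)
```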
Methods and systems according to the current invention have a number of advantages over conventionally configured systems. For example, methods and systems according to the invention allow for more efficient, balanced allocation of task control blocks between read and write requests and between host adapter and stage/destage tasks, which provides for increased throughput.
The invention will be better understood from a reading of the following detailed description taken in conjunction with the drawings in which like reference designators are used to designate like elements, and in which:
The invention disclosed herein is based on systems and methods for issuing Task Control Blocks in data storage and retrieval systems with awareness of the allocation of existing Task Control Blocks for various tasks. The invention may be implemented as a method, instructions disposed on a computer readable medium for carrying out a method, apparatus or article of manufacture using standard programming or engineering techniques to produce software, firmware, hardware, or any combination thereof. The term “article of manufacture” as used herein refers to code or logic implemented in hardware or computer readable media such as optical storage devices, and volatile or non-volatile memory devices. Such hardware may include, but is not limited to, field programmable gate arrays (“FPGAs”), application specific integrated circuits (“ASICs”), complex programmable logic devices (“CPLDs”), programmable logic arrays (“PLAs”), microprocessors, or other similar processing devices.
This invention is described in preferred embodiments in the following description with reference to the Figures, in which like numbers represent the same or similar elements.
When a write request issues from a host computer to an information storage and retrieval system, the process defined by the Host Adapter Interface (“HAI”) code allocates a TCB from the operating system (“OS”) code. The TCB is used to maintain information about the write process from beginning to end as data to be written is passed from the host computer through the cache and/or the NVS to the secondary storage devices. In data storage and retrieval systems, once the HAI code has allocated a TCB, the TCB is passed to the cache code in order to ensure the allocation of space for the write in the cache. If the cache is full, it may queue the TCB until existing data in the cache can be destaged, or written to secondary storage devices, in order to free up space.
Different types of write processes require different levels of use for system resources. During a Cache Fast Write (CFW) and a Sequential Fast Write (SFW), for example, data is only written to the cache, and is destaged down to data storage devices such as disk drives at a later time. During a Data Storage Device Fast Write (DFW), on the other hand, data is written to both the cache and the NVS. In the event of a DFW, once cache space is allocated, the TCB is passed to the NVS code in order to allocate space in the NVS for the write. The NVS code may also queue the TCB if the NVS is full until data stored in the NVS can be destaged to make room for the write operation. Once space is allocated in the cache and NVS, the HAI code informs the Host Adapter that the write can proceed. Once data are written to the NVS and the cache, additional TCBs are generated that destage data from the cache and the NVS down to the disk drives.
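The conventional handling of the three write types can be summarized in a short sketch. The function name `place_write_tcb`, the boolean space flags, and the returned strings are hypothetical and serve only to show which code may queue the TCB for each write type.

```python
def place_write_tcb(write_type, cache_has_space, nvs_has_space):
    """Return where a write TCB ends up under the conventional flow."""
    if write_type not in ("CFW", "SFW", "DFW"):
        raise ValueError(f"unknown write type: {write_type}")
    if not cache_has_space:
        # Every write type needs cache space first.
        return "queued by cache code"
    if write_type == "DFW" and not nvs_has_space:
        # Only a DFW also needs space in the NVS.
        return "queued by NVS code"
    return "write proceeds"


outcomes = [
    place_write_tcb("CFW", cache_has_space=True, nvs_has_space=False),
    place_write_tcb("DFW", cache_has_space=True, nvs_has_space=False),
    place_write_tcb("DFW", cache_has_space=False, nvs_has_space=True),
]
print(outcomes)
```

A full NVS blocks only the DFW in this model, because CFW and SFW data is written to the cache alone.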
As is set forth above, when TCBs are allocated during a write, the TCBs can themselves be queued in either the cache or the NVS depending on space availability. Space availability is determined by the speed and capability of the data storage and retrieval system to destage data from cache and NVS down to the disk drives. The speed and capability to destage data from cache and NVS down to the disk drives are determined ultimately by the speed of the disk drives, the size of the disk drive array and the speed of the device adapter interface.
The number of TCBs available to the processor complex for all tasks is finite. As each new request arrives from the Host Adapter cards, the HAI code requests a new TCB from the OS code. If one is available, a TCB for the write is passed to the HAI code and the write is allowed to proceed. If a TCB is not available, the HAI code must wait for an available TCB before the write can proceed. When the HAI code requests a TCB from the OS code for a new write, the HAI code is typically not aware of whether the cache and the NVS are already full of data waiting to be destaged. This can result in a situation where the majority of TCBs are consumed by new write requests, leaving few TCBs for use in other tasks, such as destaging data from NVS and cache or servicing new read requests. Similar TCB bottlenecks can occur during read operations.
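The bottleneck described above can be reproduced with a toy model of a finite TCB pool. The class `TcbPool`, the pool size of 10, and the burst of 12 writes are hypothetical values chosen only to make the effect visible.

```python
class TcbPool:
    """Hypothetical fixed-size pool standing in for the OS code's TCB supply."""

    def __init__(self, size):
        self.free = size

    def allocate(self):
        # Hand out a TCB if any remain; no awareness of cache or NVS state.
        if self.free == 0:
            return False
        self.free -= 1
        return True


pool = TcbPool(size=10)

# Conventional behavior: every new write takes a TCB without any check.
granted_writes = sum(pool.allocate() for _ in range(12))

# A destage task that would free cache space now finds the pool empty.
destage_granted = pool.allocate()
print(granted_writes, destage_granted)
```

In this model the write burst takes all ten TCBs, so the destage task that would free cache space cannot obtain one.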
The present invention provides systems and methods that efficiently allocate TCBs by allocating TCBs for new reads and writes only after taking into account the number of TCBs that have already been allocated for other tasks.
Referring now to
Information storage and retrieval system 100 includes a processor complex 111. Processor complex 111 includes a power supply 112, a cache 113 and a processor 114. The power supply 112 optionally includes a battery and ensures the continued operation of the processor 114 and the cache 113 in the event of a power failure. Processor complex 111 also includes a non-volatile storage device (“NVS”) 115. Information storage and retrieval system 100 also includes an array of device adapter cards 116-119, which manage communication between the information storage and retrieval system 100 and an array of storage devices 120-123, for example disk drives arranged in a Redundant Array of Independent Disks (“RAID”) configuration.
Information storage and retrieval systems according to the invention may optionally be arranged in clusters having redundant parallel sets of host adapter cards, processor complexes and device adapter card arrays, where each cluster can communicate between host computers and a shared array of storage devices.
The processor 114 is capable of running a variety of processes defined by various pieces of code, which are illustrated in
If the current number of TCBs allocated for new reads is below the threshold, the system then interrogates the Device Adapter to determine the number of TCBs currently queued to perform staging of data from the disk drives (step 330). The system compares the number of TCBs queued for stage work with a queued stage work TCB threshold (step 335). The queued stage work TCB threshold (step 335) can be a user set parameter, or can be dynamically set according to system performance. The system determines whether the number of TCBs queued for stage work exceeds the queued stage work TCB threshold (step 340). If the queued stage work TCB threshold is exceeded, the system queues the new read request (step 345) until the number of TCBs already queued for stage work drops below the queued stage work TCB threshold.
If the number of TCBs currently queued for stage work is below the queued stage work TCB threshold, the system then interrogates the Device Adapter to determine the number of TCBs currently being used to perform staging of data from the disk drives (step 350). The system compares the number of TCBs currently being used for stage work with a current stage work TCB threshold (step 355). The current stage work TCB threshold (step 355) can be a user set parameter, or can be dynamically set according to system performance. The system determines whether the number of TCBs currently being used for stage work exceeds the current stage work TCB threshold (step 360). If the current stage work TCB threshold is exceeded, the system queues the new read request (step 365) until the number of TCBs already being used for stage work drops below the current stage work TCB threshold. If the current stage work TCB threshold is not exceeded, the cache code issues a new TCB and the read is allowed to proceed (step 375).
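The waiting behavior of steps 350 through 375 can be modeled as a queue that releases a read once stage work subsides. This is a simplified sketch: the class `CurrentStageGate`, its method names, and the numbers are hypothetical, and a waiting read is released as soon as the threshold is no longer exceeded, mirroring the comparison made at step 360.

```python
from collections import deque


class CurrentStageGate:
    """Hypothetical model of steps 350-375 for new read requests."""

    def __init__(self, current_stage_threshold, active_stage_tcbs):
        self.current_stage_threshold = current_stage_threshold
        self.active_stage_tcbs = active_stage_tcbs  # as reported by the Device Adapter
        self.waiting = deque()   # read requests queued at step 365
        self.issued = []         # read requests given a TCB at step 375

    def submit(self, request_id):
        self.waiting.append(request_id)
        self._release()

    def stage_finished(self):
        # One TCB completes its stage work.
        self.active_stage_tcbs -= 1
        self._release()

    def _release(self):
        # Release waiting reads once the threshold is no longer exceeded.
        while (self.waiting
               and self.active_stage_tcbs <= self.current_stage_threshold):
            self.issued.append(self.waiting.popleft())


gate = CurrentStageGate(current_stage_threshold=2, active_stage_tcbs=4)
gate.submit("read-A")
before = list(gate.issued)   # 4 active stage TCBs exceed the threshold of 2
gate.stage_finished()        # 3 active: threshold still exceeded
gate.stage_finished()        # 2 active: threshold no longer exceeded
after = list(gate.issued)
print(before, after)
```

In this model an issued read does not itself start stage work, so all waiting reads are released together; a fuller model would update the counts as each read is issued.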
The embodiment of
If the current number of TCBs allocated for new CFW/SFW requests is below the new CFW/SFW TCB threshold, the system then interrogates the Device Adapter to determine the number of TCBs currently queued to perform destaging of data to the disk drives (step 430). The system compares the number of TCBs queued for destage work with a queued destage work TCB threshold (step 435). The queued destage work TCB threshold (step 435) can be a user set parameter, or can be dynamically set according to system performance. The system determines whether the number of TCBs queued for destage work exceeds the queued destage work TCB threshold (step 440). If the queued destage work TCB threshold is exceeded, the system queues the new CFW/SFW request (step 445) until the number of TCBs already queued for destage work drops below the queued destage work TCB threshold.
If the number of TCBs currently queued for destage work is below the queued destage work TCB threshold, the system then interrogates the Device Adapter to determine the number of TCBs currently being used to perform destaging of data to the disk drives (step 450). The system compares the number of TCBs currently being used for destage work with a current destage work TCB threshold (step 455). The current destage work TCB threshold (step 455) can be a user set parameter, or can be dynamically set according to system performance. The system determines whether the number of TCBs currently being used for destage work exceeds the current destage work TCB threshold (step 460). If the current destage work TCB threshold is exceeded, the system queues the new CFW/SFW request (step 465) until the number of TCBs already being used for destage work drops below the current destage work TCB threshold. If the current destage work TCB threshold is not exceeded, the cache code issues a new TCB and the CFW/SFW is allowed to proceed (step 475).
The embodiment of
If the current number of TCBs allocated for new DFW requests is below the new DFW TCB threshold, the system then interrogates the Device Adapter to determine the number of TCBs currently queued to perform destaging of data to the disk drives (step 530). The system compares the number of TCBs queued for destage work with a queued destage work TCB threshold (step 535). The queued destage work TCB threshold (step 535) can be a user set parameter, or can be dynamically set according to system performance. The system determines whether the number of TCBs queued for destage work exceeds the queued destage work TCB threshold (step 540). If the queued destage work TCB threshold is exceeded, the system queues the new DFW request (step 545) until the number of TCBs already queued for destage work drops below the queued destage work TCB threshold.
If the number of TCBs currently queued for destage work is below the queued destage work TCB threshold, the system then interrogates the Device Adapter to determine the number of TCBs currently being used to perform destaging of data to the disk drives (step 550). The system compares the number of TCBs currently being used for destage work with a current destage work TCB threshold (step 555). The current destage work TCB threshold (step 555) can be a user set parameter, or can be dynamically set according to system performance. The system determines whether the number of TCBs currently being used for destage work exceeds the current destage work TCB threshold (step 560). If the current destage work TCB threshold is exceeded, the system queues the new DFW request (step 565) until the number of TCBs already being used for destage work drops below the current destage work TCB threshold. If the current destage work TCB threshold is not exceeded, the cache code issues a new TCB and the DFW is allowed to proceed (step 575).
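The destage portion of the DFW flow (steps 530 through 575) can be traced with a small function that records which steps are visited. The step numbers come from the description above; the function name `dfw_destage_flow`, its parameters, and the threshold values are hypothetical.

```python
def dfw_destage_flow(queued_destage, current_destage,
                     queued_limit, current_limit):
    """Return the list of step numbers visited for one new DFW request."""
    steps = [530, 535, 540]          # interrogate, compare, decide (queued work)
    if queued_destage > queued_limit:
        return steps + [545]         # queue the new DFW request
    steps += [550, 555, 560]         # interrogate, compare, decide (current work)
    if current_destage > current_limit:
        return steps + [565]         # queue the new DFW request
    return steps + [575]             # cache code issues a new TCB


trace_queued = dfw_destage_flow(5, 0, queued_limit=4, current_limit=16)
trace_busy = dfw_destage_flow(2, 20, queued_limit=4, current_limit=16)
trace_ok = dfw_destage_flow(2, 10, queued_limit=4, current_limit=16)
print(trace_queued)
print(trace_busy)
print(trace_ok)
```

The last element of each trace shows the outcome: 545 or 565 when the request is queued, 575 when a TCB is issued.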
The embodiment of
In addition to the methods set forth above, the invention includes an article of manufacture comprising a computer useable medium having computer readable program code disposed therein to efficiently allocate TCBs in a storage and retrieval system. The invention further includes a computer program product usable with a programmable computer processor having computer readable program code embodied therein to efficiently allocate TCBs in a data storage and retrieval system.
While the preferred embodiments of the present invention have been illustrated in detail, it should be apparent that modifications and adaptations to those embodiments may occur to one skilled in the art without departing from the scope of the present invention as set forth in the following claims.