1. Field of the Invention
The invention relates generally to storage systems and more specifically relates to a tree management assist circuit to manage tree data structures in a storage system.
2. Related Patents
This patent is related to commonly owned U.S. patent application Ser. No. 09/026,8 entitled APPARATUS AND METHODS FOR REGION LOCKING ASSIST CIRCUIT IN A STORAGE SYSTEM, which is hereby incorporated by reference.
3. Discussion of Related Art
Storage systems or devices typically include a storage controller allowing a host system to couple to the storage system/device. The storage device/system receives I/O requests through the controller from attached host systems. I/O requests received by the storage controller may be encoded, for example, as SCSI (Small Computer Systems Interface) commands. Processing of the I/O requests in the storage controller may involve a number of computations and significant data processing. Much of this computation and data processing may involve manipulation of tree data structures. For example, tree data structures may be used in processing of region locks as described in co-pending patent application Ser. No. 09/026,8, for cache-line lookup processing for data stored in a cache memory of the storage controller, and in other processing within the storage controller.
Processing of tree data structures may entail significant processing by a general-purpose processor of the storage controller. Further, some storage controllers may include customized circuits for faster processing of I/O requests (i.e., a “fast-path” I/O processor to improve performance of common read and write I/O request processing). Tree data structures utilized in processing of I/O requests present further problems for such “fast-path” I/O request processing in that the fast-path processing circuits may rely on the general-purpose processor to provide the required tree data structure processing even for the fast-path I/O request processing circuits. Relying on a tree processing algorithm that runs on a general-purpose processor involves substantial overhead for I/Os that are otherwise processed exclusively in hardware (i.e., in the “fast-path” I/O request processing circuit), thereby compromising the potential performance of the I/O processing subsystem.
Thus, it is an ongoing challenge to provide efficient processing of tree data structures in a storage controller.
The present invention solves the above and other problems, thereby advancing the state of the useful arts, by providing circuits and methods for fast processing of tree data structures. In one exemplary embodiment, a tree assist circuit in a storage controller provides tree management functions used by either a general-purpose processor of the storage controller or a “fast-path” I/O processor of the storage controller to process I/O requests received from an attached host system. Through an interface of the tree assist circuit, the I/O request processing may insert, delete, or modify nodes in a tree data structure. The tree assist circuit may also provide tree data structure rotation features to permit a tree data structure to be re-balanced as may be desired following insertion or deletion of nodes in a tree data structure. In one exemplary embodiment, the tree data structures may be AVL tree data structures.
In one aspect hereof, an apparatus is provided in a storage controller of a storage system for managing tree data structures. The apparatus comprises a memory adapted to store a plurality of tree node data structures. The memory is further adapted to store a plurality of tree information data structures, each tree information data structure being adapted to identify a root tree node data structure. The apparatus further comprises an application interface circuit adapted to couple with an application circuit (e.g., a “fast-path” I/O processor and/or a suitably programmed general-purpose processor) in the storage system, adapted to receive a request from the application circuit to access an identified tree data structure, and adapted to return information to the application circuit based on access to the identified tree data structure. The application interface circuit comprises a tree information configuration register adapted to store a starting address in the memory for the plurality of tree information data structures. The apparatus further comprises a tree management circuit coupled with the application interface circuit and coupled with the memory. The tree management circuit is adapted to access an identified tree data structure in the memory in response to a request received from an application circuit through the application interface circuit.
Another aspect hereof provides a storage controller. The storage controller comprises a front-end interface adapted for coupling the storage controller to a host system and a back-end interface adapted to couple the storage controller to a plurality of storage devices. The controller also comprises a general-purpose processor coupled with the back-end interface and coupled with the front-end interface. The general-purpose processor is adapted to receive a host system I/O request through the front-end interface and is adapted to process a received I/O request by accessing storage devices through the back-end interface. The storage controller further comprises an I/O request processor coupled with the back-end interface and coupled with the front-end interface and coupled with the general-purpose processor. The I/O request processor is adapted to receive a host system I/O request through the front-end interface and is adapted to process a received I/O request by accessing storage devices through the back-end interface. The controller further comprises a memory adapted to store a plurality of tree node data structures and further adapted to store a plurality of tree information data structures, each tree information data structure being adapted to identify a root tree node data structure. The controller also comprises a tree assist circuit coupled with the general-purpose processor and coupled with the I/O request processor and coupled with the memory. The tree assist circuit comprises a tree information configuration register adapted to store a starting address in the memory for the plurality of tree information data structures and a tree management circuit coupled with the tree information configuration register and coupled with the memory. The tree management circuit is adapted to access an identified tree data structure in the memory.
The tree assist circuit is adapted to receive a request from the general-purpose processor and/or from the I/O request processor to access an identified tree data structure and is adapted to return information to the requesting processor based on access to the identified tree data structure.
Still another aspect hereof provides a method operable in a storage controller, the storage controller comprising an I/O request processor and a tree assist circuit and a memory. The method includes receiving an I/O request from an attached host system and transmitting a tree management request from the I/O request processor to the tree assist circuit. The method also includes receiving in the tree assist circuit a request from the I/O request processor, the request for access to an identified tree data structure stored in the memory. The method then accesses, by operation of the tree assist circuit, the identified tree data structure. The access further comprises one or more of the steps of modifying the identified tree data structure and returning information from the identified tree data structure.
Storage controller 100 is enhanced in accordance with features and aspects hereof to include tree assist circuit 120 and associated tree memory 122. Tree assist circuit 120 and tree memory 122 may also be coupled to components within storage controller 100 via internal bus 150. Tree assist circuit 120 comprises logic circuits specifically adapted to perform tree data structure management in conjunction with I/O request processing by general-purpose processor 106 and/or I/O request processor 108. As noted above, tree data structures may be useful in processing of I/O requests for managing various aspects of the I/O request processing. For example, region locking features or cache-line lookup features may be processed utilizing tree data structures stored in tree memory 122 and managed with the assistance of tree assist circuit 120. In general, general-purpose processor 106 and/or I/O request processor 108 (collectively or individually referred to as I/O processors) interact with tree assist circuit 120 to access and/or modify tree data structures stored in tree memory 122. For example, tree assist circuit 120 may provide an application circuit interface to allow the I/O processors to insert or delete nodes in an identified tree data structure and/or to search an identified tree data structure for particular nodes of interest for the processing of one or more I/O requests. Utilizing the application interface of the tree assist circuit 120, I/O processors may define a new tree for a desired management function in processing of I/O requests directed to one or more volumes stored on storage devices 130. For example, in processing region lock capabilities of storage controller 100, a new tree data structure may be defined for each logical volume defined by the storage system. The tree data structure may then be used to identify regions of the storage volume locked by processing of one or another I/O request.
Such a tree data structure may then be used to determine whether a conflict may arise when a new region lock request is received. The nodes of the tree structure may represent granted region locks such that a new region lock request may search the tree to determine whether a new lock request overlaps, and thus, conflicts with, a previously granted region lock request.
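The conflict check described above may be sketched as follows. This is a minimal illustration only, assuming that each granted lock is recorded as a starting block and a block count. The names RegionLock, overlaps, and conflicts are hypothetical, and a flat list stands in for the tree data structure; the tree assist circuit would instead locate candidate nodes by key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionLock:
    """Hypothetical granted region lock covering blocks [start, start + count)."""
    start: int
    count: int

    @property
    def end(self) -> int:
        return self.start + self.count  # exclusive end block


def overlaps(a: RegionLock, b: RegionLock) -> bool:
    # Two half-open ranges overlap when each one starts before the other ends.
    return a.start < b.end and b.start < a.end


def conflicts(granted: list[RegionLock], request: RegionLock) -> bool:
    # Assumption for brevity: a linear scan replaces the tree search.
    return any(overlaps(lock, request) for lock in granted)


granted = [RegionLock(0, 100), RegionLock(200, 50)]
print(conflicts(granted, RegionLock(90, 20)))    # touches blocks 90..99 of the first lock
print(conflicts(granted, RegionLock(100, 100)))  # fits in the gap, blocks 100..199
```

Ordering the granted locks by starting block in a balanced tree lets the same test be applied only to the few nodes nearest the requested range rather than to every granted lock.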
The plurality of TREE_INFO data structures may be stored in contiguous memory starting from the base address stored in the tree information configuration register 204. Those of ordinary skill in the art will readily recognize other information that may be stored in the tree information configuration register. The following table exhibits exemplary fields of an exemplary tree information configuration register 204:
Each TREE_INFO data structure includes a pointer to a root tree node (TREE_NODE) data structure of the associated tree data structure. Other fields may be present in each TREE_INFO data structure as a matter of design choice. In one exemplary embodiment, the following table exhibits exemplary fields of an exemplary TREE_INFO data structure:
In an alternate embodiment, the above exemplary data structure may be represented more compactly by encoding the “Compare” and “Change Method” fields (comprising a total of 4 bits) as the low order bits of the otherwise 32-bit address of the root. In other words, a TREE_INFO data structure may be more compactly encoded as a single 32-bit word with the high order 28 bits representing the high order 28 bits of the address of the root of the tree (presuming the root tree node to be aligned at a 32 byte boundary). The low order 4 bits then represent the compare and change fields described above.
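The compact encoding may be illustrated with the following sketch. It assumes the four control bits are handled as a single combined value, since the division of those bits between the “Compare” and “Change Method” fields is not restated here. The function names pack_tree_info and unpack_tree_info are hypothetical.

```python
ADDR_MASK = 0xFFFFFFF0   # high-order 28 bits carry the root address
FIELD_MASK = 0x0000000F  # low-order 4 bits carry the compare and change fields


def pack_tree_info(root_addr: int, fields: int) -> int:
    """Combine a 32-byte-aligned root address and 4 control bits into one word."""
    if not 0 <= root_addr <= 0xFFFFFFFF:
        raise ValueError("root address must fit in 32 bits")
    if root_addr & 0x1F:
        raise ValueError("root TREE_NODE must be aligned at a 32 byte boundary")
    if not 0 <= fields <= FIELD_MASK:
        raise ValueError("control fields must fit in 4 bits")
    return root_addr | fields


def unpack_tree_info(word: int) -> tuple[int, int]:
    """Return (root address, combined 4-bit control fields)."""
    return word & ADDR_MASK, word & FIELD_MASK


word = pack_tree_info(0x10000020, 0b1010)
print(hex(word), [hex(v) for v in unpack_tree_info(word)])
```

Because the root is aligned at a 32 byte boundary, the low-order bits of its address are always zero, so overlaying the control bits loses no address information.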
Each TREE_NODE of a tree data structure includes a key field used for storing information encoded by the TREE_NODE. The key values are used to order the nodes of the tree such that the tree data structure may be “walked” or “traversed” in order of the key values. In addition, each TREE_NODE data structure may include pointers to a left branch sub-tree of the node and to a right branch sub-tree of the node. Still further, each TREE_NODE data structure may include a pointer to its parent TREE_NODE in the tree data structure. Lastly, where the tree structure is implemented as an AVL tree structure, each TREE_NODE data structure may include a balance field indicating the degree of balance or imbalance of the sub-trees descending from the corresponding TREE_NODE data structure. The balance field value may be used to guide modification operations of the tree assist circuit to minimize the height of a tree data structure as measured from the most distant leaf node to the root node of a tree. The following table describes exemplary fields of a TREE_NODE data structure in one exemplary embodiment hereof:
The root pointer field 354 of the identified TREE_INFO data structure 304 points to the root TREE_NODE 310 of the identified tree data structure to be accessed by a tree management request. The other TREE_NODEs (312 through 322) are “children” of the root TREE_NODE 310. Based on the exemplary TREE_NODE data structure described above, each TREE_NODE structure 310 through 322 includes a corresponding key field, a parent pointer field, left and right sub-tree pointer fields, and a balance field. As the root node of the identified tree, TREE_NODE 310 includes a left sub-tree pointer 356 pointing to its left child sub-tree starting with TREE_NODE 312 and a right sub-tree pointer 358 pointing to its right child sub-tree starting with TREE_NODE 314. These child TREE_NODEs 312 and 314 each include a parent pointer (370 and 372, respectively) pointing back to their parent TREE_NODE 310. TREE_NODE 312 includes a left sub-tree pointer 360 pointing to TREE_NODE 316. TREE_NODE 312 has no right sub-tree and thus the right sub-tree pointer value would be nil (e.g., zero). Since TREE_NODE 312 has no right sub-tree and the depth of its left sub-tree is only a single node, the balance field value of TREE_NODE 312 is −1. TREE_NODE 316 includes a parent pointer 374 pointing back to parent TREE_NODE 312. Still further, TREE_NODE 314 includes a left sub-tree pointer 364 pointing to TREE_NODE 320 and a right sub-tree pointer 366 pointing to TREE_NODE 322. TREE_NODEs 320 and 322 each include a parent pointer, 378 and 380, respectively, pointing back to parent TREE_NODE 314. As leaf nodes in the tree structure, the left and right child pointers of nodes 316, 320, and 322 are all nil (e.g., zero). TREE_NODEs 310, 314, 316, 320, and 322 all have balance field values of zero indicating equal balance on their respective left and right sub-trees.
Since no node of tree 300 has a balance field value other than −1, 0, and 1, tree 300 is deemed balanced and does not require any re-balancing operations. Tree 300 of
In addition, sync request 406 and sync response 408 comprise a synchronous interface whereby an application circuit may issue a request in the sync request interface 406 and await a corresponding response in the synchronous response interface 408 before continuing any further processing of an I/O request. A synchronous request and response may be performed when the application circuit cannot proceed further with processing until the tree management request is completed. By contrast, an asynchronous request and response may be appropriate where the application circuit is capable of proceeding with further processing while awaiting the completion of the tree management request. Those of ordinary skill in the art will recognize standard arbitration logic that may be associated with the application interface circuit 200 to help avoid conflicts from simultaneous requests. Such arbitration logic is well known to those of ordinary skill in the art and thus omitted for simplicity and brevity of this discussion. Other features and logic of the tree assist circuit 120 help avoid processing of conflicting or incoherent requests from multiple application circuits.
Tree management circuit 202 may include a tree search circuit 410 and a tree modification circuit 416. Tree search circuit 410 comprises logic circuits for searching an identified tree data structure based on a particular supplied key value. In addition, tree search circuit 410 may include tree successor search logic 412 and tree predecessor search logic 414 for locating a succeeding or preceding TREE_NODE in an identified tree data structure based on a provided key value. Tree modification circuit 416 may include a TREE_NODE insertion circuit 418 adapted to insert a provided new TREE_NODE into an identified tree data structure. Tree modification circuit 416 may include TREE_NODE deletion circuit 420 adapted to delete an identified TREE_NODE from an identified tree data structure. Tree rotation circuit 422 within tree modification circuit 416 provides functionality to rebalance or rotate an identified tree data structure. The rotation or rebalance function of tree rotation circuit 422 may be invoked directly by an application circuit or may be indirectly invoked as an aspect of processing an insertion or deletion of a TREE_NODE by circuit 418 and 420, respectively.
Rotation operations for AVL tree data structures are well known to those of ordinary skill in the art. After each insertion or deletion of a node, the tree should be checked for balance. If the tree is found to be out of balance, it is re-balanced using the appropriate rotation algorithm. An exemplary algorithm for required rotations to re-balance an AVL tree is as follows:
Step 1: Set up the pointers:
Exemplary functions provided by tree assist circuit 120 are discussed in the table below. The table indicates a particular type of request or function to be performed as may be entered in the async request FIFO 400 or the sync request interface 406 and corresponding response information that may be entered in the async response FIFO 402 or the sync response interface 408. A description of the processing performed for each tree management operation is also provided in the table below.
As noted, the requests and replies to tree management requests may be entered in corresponding FIFOs. In one exemplary embodiment, the request and response FIFO entries are pointers to corresponding request and response descriptors. In other embodiments, as a matter of design choice, the entries in the FIFOs are the actual request and response descriptors. An exemplary request/response descriptor is shown in the table below:
Application circuits specify the operation to be performed in the Command field, along with the Tree Index, Node Pointer, and Key, as applicable for the specified command. The Node Type field is specified by an application circuit in the request, and is reflected in the Node Type field of the AVL Change Notification response. The Node Type is used by the application circuit to resolve the usage of the TREE_NODE (e.g., for cache look-up, region locks, sorted writes, etc.). The Node Type field may overlay the least significant five bits of the Node Pointer. Because the TREE_NODE structures are 32-byte aligned, these least-significant five bits are masked off and set to zero when using the 32-bit value as a TREE_NODE pointer. A request may be posted either to the synchronous request queue or to the asynchronous request queue. Prior to posting a request, the application circuit sets the Status and V fields to an initialized value (e.g., 0 or −1). Reading a non-initialized value from the V or Status field indicates that the tree management circuit has completed processing the request. In one exemplary embodiment, only one request may be outstanding on the synchronous request interface at a time. Application circuits should service the response on the synchronous response queue before servicing any replies on the asynchronous response queue and before submitting any new AVL Tree requests.
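The overlay of the Node Type field on the Node Pointer may be sketched as follows. This is an illustrative assumption of how an application circuit might build and decode the combined 32-bit value; the function names are hypothetical.

```python
NODE_TYPE_MASK = 0x1F  # least significant five bits, free because nodes are 32-byte aligned


def pack_node_word(node_addr: int, node_type: int) -> int:
    """Overlay a five-bit Node Type on a 32-byte-aligned TREE_NODE address."""
    if node_addr & NODE_TYPE_MASK:
        raise ValueError("TREE_NODE address must be 32-byte aligned")
    if not 0 <= node_type <= NODE_TYPE_MASK:
        raise ValueError("Node Type must fit in five bits")
    return node_addr | node_type


def node_pointer(word: int) -> int:
    # Mask off the five type bits before using the value as a pointer.
    return word & 0xFFFFFFFF & ~NODE_TYPE_MASK


def node_type(word: int) -> int:
    return word & NODE_TYPE_MASK


w = pack_node_word(0x20000040, 3)
print(hex(w), hex(node_pointer(w)), node_type(w))
```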
Application circuits should be adapted to avoid overflowing the asynchronous request queue. The application interface 200 of
Prior to posting any response (synchronous or asynchronous), the tree management circuit 202 of
The following describes additional exemplary details of the logic in the tree management circuit 202 for processing each of the above exemplary tree management requests in the context of AVL tree management.
The requesting application circuit provides the Key and the Tree Index in the AVL Tree request descriptor (i.e., in the async request FIFO or the sync request interface). Starting at the root node of the identified tree, the tree management circuit walks (traverses in key value order) nodes in the AVL tree until it finds a matching key. The traversal is complete if a node is found with a matching key value, if two connecting nodes with keys that fall on either side of the specified key are encountered, or a node with a key that falls on one side of the provided key is encountered with no children in the direction of the specified key relative to the located node key. If a node is found matching the specified key the tree management circuit returns a pointer to the matching node in the Node Pointer of the response descriptor with the Status set to FOUND. If the tree does not contain a node matching the specified key, the tree management circuit sets the Node Pointer of the response descriptor to NULL and the Status to MISSING.
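The search described above may be modeled in software as follows. This is a sketch only: the TreeNode class, the integer keys, and the string status values are assumptions chosen for illustration, and the sample tree merely mirrors the shape of the exemplary tree with assumed key values.

```python
from dataclasses import dataclass
from typing import Optional

FOUND, MISSING = "FOUND", "MISSING"


@dataclass
class TreeNode:
    """Hypothetical software model of a TREE_NODE (key and child links only)."""
    key: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def search(root: Optional[TreeNode], key: int):
    """Walk from the root; return (node, FOUND) or (None, MISSING)."""
    node = root
    while node is not None:
        if key == node.key:
            return node, FOUND
        # Smaller keys lie in the left sub-tree, larger keys in the right.
        node = node.left if key < node.key else node.right
    return None, MISSING


# Assumed keys arranged in the same shape as the exemplary tree.
root = TreeNode(50,
                left=TreeNode(30, left=TreeNode(20)),
                right=TreeNode(70, left=TreeNode(60), right=TreeNode(80)))
node, status = search(root, 60)
print(status, node.key if node else None)
print(search(root, 65)[1])
```

The walk ends as soon as a nil link is reached in the direction of the specified key, which corresponds to the termination conditions listed above.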
The requesting application circuit provides the Key and the Tree Index in the AVL Tree request descriptor. If the identified tree is not empty (root≠NULL), starting at the root, the tree management circuit walks the AVL tree until it finds the node with a key value closest to, but not less than, the specified key. If a node with a key equal to or greater than the value of the specified key is found, the tree management circuit sets the Node Pointer of the response descriptor to the node with a key that is closest to, but not less than, the value of the specified key, and sets the Status field of the response descriptor to FOUND. If the tree contains no nodes with a key that is greater than or equal to the value of the specified key, the tree management circuit sets the Node Pointer to NULL and sets the Status field to MISSING.
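The “closest to, but not less than” search may be sketched as follows, again as an assumption-laden software model rather than the circuit logic itself. TreeNode, search_ge, and the status strings are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

FOUND, MISSING = "FOUND", "MISSING"


@dataclass
class TreeNode:
    """Hypothetical software model of a TREE_NODE (key and child links only)."""
    key: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def search_ge(root: Optional[TreeNode], key: int):
    """Return the node with the smallest key that is >= key, if any."""
    best = None
    node = root
    while node is not None:
        if node.key == key:
            return node, FOUND
        if node.key > key:
            best = node        # candidate; a closer one may lie to the left
            node = node.left
        else:
            node = node.right  # too small; only the right sub-tree can qualify
    return (best, FOUND) if best is not None else (None, MISSING)


root = TreeNode(50,
                left=TreeNode(30, left=TreeNode(20)),
                right=TreeNode(70, left=TreeNode(60), right=TreeNode(80)))
node, status = search_ge(root, 65)
print(status, node.key if node else None)
print(search_ge(root, 81)[1])
```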
The requesting application circuit provides, in the AVL Tree request descriptor, a pointer to a TREE_NODE that is already in the AVL tree and the Tree Index. The tree management circuit finds the node with the lowest key value that is greater than the key value of the specified node by traversing the left links to the first NULL link, starting with the right child of the specified node. This is used to facilitate identification of rotation nodes for rebalancing the tree.
The requesting application circuit provides, in the AVL Tree request descriptor, a pointer to a TREE_NODE that is already in the AVL tree and the Tree Index. The tree management circuit finds the node with the highest key value that is less than the key value of the specified node by traversing the right links to the first NULL link, starting with the left child of the specified node.
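The successor and predecessor traversals described above may be sketched as follows. The sketch covers only the traversal stated in the text (starting at the right or left child of the specified node) and returns None when that child is absent; how the circuit responds in that case is not described here, so that behavior is an assumption. TreeNode and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    """Hypothetical software model of a TREE_NODE (key and child links only)."""
    key: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def successor(node: TreeNode) -> Optional[TreeNode]:
    """Start at the right child and follow left links to the first nil link."""
    cur = node.right
    if cur is None:
        return None  # assumption: no right sub-tree means nothing to report
    while cur.left is not None:
        cur = cur.left
    return cur


def predecessor(node: TreeNode) -> Optional[TreeNode]:
    """Start at the left child and follow right links to the first nil link."""
    cur = node.left
    if cur is None:
        return None  # assumption: no left sub-tree means nothing to report
    while cur.right is not None:
        cur = cur.right
    return cur


root = TreeNode(50,
                left=TreeNode(30, left=TreeNode(20)),
                right=TreeNode(70, left=TreeNode(60), right=TreeNode(80)))
print(successor(root).key, predecessor(root).key)
```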
The requestor provides, in the AVL Tree request descriptor, a pointer to a TREE_NODE to be placed into the AVL tree identified by the supplied Tree Index. The tree management circuit finds the branch to insert the new node. The tree management circuit navigates left and right to find a leaf node such that the key value of the new node is between the key values of the leaf node and its parent (smaller keys go left, higher keys go right). The leaf becomes the inserted node's parent. The tree management circuit rebalances the tree towards the root (this should require no more than two node rotations).
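An insertion followed by rebalancing toward the root may be sketched as follows. This is not the circuit's algorithm: for brevity the sketch is recursive and stores a height in each node rather than the parent pointer and balance field described above. The balance value is computed as right height minus left height so that a left-heavy node yields −1, consistent with the exemplary tree. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1  # assumption: a stored height replaces the balance field


def height(n: Optional[Node]) -> int:
    return n.height if n is not None else 0


def update(n: Node) -> None:
    n.height = 1 + max(height(n.left), height(n.right))


def balance(n: Node) -> int:
    return height(n.right) - height(n.left)  # negative means left-heavy


def rotate_left(x: Node) -> Node:
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    update(y)
    return y


def rotate_right(y: Node) -> Node:
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    update(x)
    return x


def rebalance(n: Node) -> Node:
    update(n)
    b = balance(n)
    if b < -1:                       # left-heavy
        if balance(n.left) > 0:      # left-right case needs a double rotation
            n.left = rotate_left(n.left)
        return rotate_right(n)
    if b > 1:                        # right-heavy
        if balance(n.right) < 0:     # right-left case needs a double rotation
            n.right = rotate_right(n.right)
        return rotate_left(n)
    return n


def insert(root: Optional[Node], new: Node) -> Node:
    """Place the new node at a leaf position, then rebalance on the way back up."""
    if root is None:
        return new
    if new.key < root.key:
        root.left = insert(root.left, new)
    else:
        root.right = insert(root.right, new)
    return rebalance(root)


def inorder(n: Optional[Node]) -> list[int]:
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)


root = None
for k in range(1, 8):  # ascending keys would degenerate without rebalancing
    root = insert(root, Node(k))
print(root.key, root.height, inorder(root))
```

Inserting ascending keys is the worst case for an unbalanced binary tree; with the rotations applied, the seven keys form a tree only three levels deep.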
The requestor provides, in the AVL Tree request descriptor, a pointer to a TREE_NODE that is already in the AVL tree identified by the Tree Index. The tree management circuit deletes the specified node from the tree, swaps the “successor” branch into the deleted node's slot if the deleted node had two children, and then rebalances the tree upwards.
Those of ordinary skill in the art will recognize the desirability of arbitration logic and atomicity of operations in interfacing between the application circuits and the tree assist circuit. For example, the tree assist circuit's response FIFO may be a simple hardware FIFO circuit with a limited number of entries. To avoid overflowing the asynchronous response FIFO, application circuits should keep a count of outstanding requests, and suspend issuing new requests when there are as many requests outstanding as there are available entries in the response FIFO. When a queue-full condition occurs, the application circuits should queue asynchronous requests internally or otherwise delay issuing new requests. The request/response descriptor does not contain linking elements, so the linking used to queue the requests within the application circuits should be provided in an application circuit construct that includes or references the request/response descriptor.
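The outstanding-request accounting described above may be sketched as follows for a single application circuit. The class name AsyncSubmitter, the post callback standing in for a write to the asynchronous request FIFO, and the wrapper-level pending queue are all assumptions made for illustration.

```python
from collections import deque


class AsyncSubmitter:
    """Hypothetical single-core wrapper that never exceeds the response FIFO depth."""

    def __init__(self, fifo_depth: int, post):
        self.fifo_depth = fifo_depth
        self.post = post          # assumed callable that writes the request FIFO
        self.outstanding = 0      # requests issued but not yet answered
        self.pending = deque()    # requests held back by a queue-full condition

    def submit(self, descriptor) -> bool:
        """Issue the request if there is room; otherwise queue it internally."""
        if self.pending or self.outstanding >= self.fifo_depth:
            self.pending.append(descriptor)  # preserves submission order
            return False
        self.post(descriptor)
        self.outstanding += 1
        return True

    def on_response(self) -> None:
        """Call once for each response retrieved from the response FIFO."""
        self.outstanding -= 1

    def flush_pending(self) -> None:
        """Issue as many held requests as fit before the queue is full again."""
        while self.pending and self.outstanding < self.fifo_depth:
            self.post(self.pending.popleft())
            self.outstanding += 1


posted = []
sub = AsyncSubmitter(fifo_depth=2, post=posted.append)
print(sub.submit("a"), sub.submit("b"), sub.submit("c"))
sub.on_response()
sub.flush_pending()
print(posted, sub.outstanding)
```

The linking needed to hold deferred requests lives in the wrapper's own queue, in keeping with the observation that the request/response descriptor carries no linking elements.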
If more than one processing core (application circuit) issues requests to the same tree data structure, the algorithm used by the application circuits to access the tree should guarantee atomic access. This includes a requirement to use a memory semaphore to count the number of outstanding requests for detection of the queue-full condition. This should involve acquiring a memory semaphore and/or the use of atomic linked list updates.
Likewise, any processing core (application circuit) may service the asynchronous response queue. However, if a response is retrieved for an asynchronous request issued by a different processor core (indicated in the application specific context fields associated with the tree request), the application may forward the response to the other core via a message (e.g., by placing the context containing the request/response descriptor on a queue serviced by the other core).
When the tree assist circuit has multiple asynchronous replies pending, application circuits should retrieve all outstanding replies before issuing new tree management requests. After retrieving all outstanding replies, the application circuits should check any application circuit pending request queue (within the application circuits) and issue as many requests to the tree assist circuit as possible before encountering a queue-full condition.
If the tree assist circuit has more than one pending response available after the application circuits read the first response from the response FIFO register, the tree assist circuit will have the second response available in the register before the application circuit reads the response FIFO register again. A value of 0xFFFFFFFF read from the response FIFO indicates no more responses are available.
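Retrieval of all pending asynchronous replies may be sketched as follows. The read_response_fifo callable is a hypothetical stand-in for a read of the response FIFO register, simulated here with an iterator.

```python
NO_RESPONSE = 0xFFFFFFFF  # value read when no more responses are available


def drain_responses(read_response_fifo) -> list[int]:
    """Read the response FIFO register repeatedly until it reports empty."""
    responses = []
    while True:
        value = read_response_fifo()
        if value == NO_RESPONSE:
            return responses
        responses.append(value)


# Simulated register: two descriptor pointers, then the empty indication.
fake_register = iter([0x1000, 0x1020, NO_RESPONSE])
replies = drain_responses(lambda: next(fake_register))
print([hex(r) for r in replies])
```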
An application circuit may issue a synchronous tree management request in circumstances that require a response before processing can continue. There may be only one synchronous request pending at any point in time. After issuing a synchronous request, the application circuit should poll the synchronous response register before issuing any new tree management requests, and continue polling the synchronous response register until the response is available.
All processing cores (application circuits) may have access to the synchronous request register. To prevent multiple concurrent synchronous requests, access to the synchronous request and response registers should be protected using a memory semaphore to provide atomic access spanning both issuance of the request and fetching of the response.
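The protected synchronous exchange may be sketched as follows, with a lock standing in for the memory semaphore and held across both the request and the polling of the response. The SyncPort and FakeDevice classes are hypothetical, and the NOT_READY value returned while the response is unavailable is an assumption; the text does not state what the synchronous response register reads before completion.

```python
import threading

NOT_READY = 0xFFFFFFFF  # assumed "response not yet available" indication


class SyncPort:
    """Hypothetical wrapper serializing use of the synchronous registers."""

    def __init__(self, write_request, read_response):
        self._lock = threading.Lock()  # stands in for the memory semaphore
        self._write_request = write_request
        self._read_response = read_response

    def call(self, descriptor: int) -> int:
        # The lock spans issuance and response so no other core can interleave.
        with self._lock:
            self._write_request(descriptor)
            while True:
                response = self._read_response()
                if response != NOT_READY:
                    return response


class FakeDevice:
    """Simulated registers: the response appears on the third poll."""

    def __init__(self):
        self.request = 0
        self.polls = 0

    def write(self, descriptor: int) -> None:
        self.request = descriptor
        self.polls = 0

    def read(self) -> int:
        self.polls += 1
        return NOT_READY if self.polls < 3 else self.request + 1


device = FakeDevice()
port = SyncPort(device.write, device.read)
print(port.call(41), device.polls)
```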
While the invention has been illustrated and described in the drawings and foregoing description, such illustration and description is to be considered as exemplary and not restrictive in character. One embodiment of the invention and minor variants thereof have been shown and described. In particular, features shown and described as exemplary software or firmware embodiments may be equivalently implemented as customized logic circuits and vice versa. Protection is desired for all changes and modifications that come within the spirit of the invention. Those skilled in the art will appreciate variations of the above-described embodiments that fall within the scope of the invention. As a result, the invention is not limited to the specific examples and illustrations discussed above, but only by the following claims and their equivalents.
This patent application claims priority to U.S. provisional patent application Ser. No. 61/169,399, filed 15 Apr. 2009, which is hereby incorporated by reference.
Number | Date | Country
---|---|---
61169399 | Apr 2009 | US