In some computer systems, multiple requesters (e.g., processors) are permitted access to sharable data structures. To ensure coherency, only one requestor is permitted access to the sharable data structure at a time. In this way, a requestor can read the data structure, modify the data, and write the data back to its original location (e.g., memory), while all other requestors that also want to access the data structure are forced to wait. One of the waiting requesters may then be granted access to the data structure. The mechanism of forcing the other requestors to wait may be implemented as a “spin lock.” In a spin lock, a requestor repeatedly reads a lock flag associated with the data structure. The lock indicates whether the data is “free” or “busy.” The requester repeatedly reads the flag until the flag indicates the data structure is free thereby permitting the requester to access the data. Repeatedly issuing read cycles to read the lock flag undesirably consumes bus bandwidth. The problem is exacerbated as more requestors issue read cycles to read the lock flag associated with a particular lock data structure.
In at least some embodiments, a system comprises a plurality of requesters and control logic. The requesters are each capable of accessing a sharable data structure. The control logic causes a request for access to the sharable data structure to be deferred to permit only one requestor at a time to access the sharable data structure.
For a detailed description of exemplary embodiments of the invention, reference will now be made to the accompanying drawings in which:
Certain terms are used throughout the following description and claims to refer to particular system components. As one skilled in the art will appreciate, computer companies may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function. In the following discussion and in the claims, the terms “including” and “comprising” are used in an open-ended fashion, and thus should be interpreted to mean “including, but not limited to . . . .” Also, the term “couple” or “couples” is intended to mean either an indirect or direct electrical connection. Thus, if a first device couples to a second device, that connection may be through a direct electrical connection, or through an indirect electrical connection via other devices and connections. The term “system” refers to a collection of two or more parts and may be used to refer to a computer system, a portion of a computer system or any other combination of two or more parts.
The following discussion is directed to various embodiments of the invention. Although one or more of these embodiments may be preferred, the embodiments disclosed should not be interpreted, or otherwise used, as limiting the scope of the disclosure, including the claims. In addition, one skilled in the art will understand that the following description has broad application, and the discussion of any embodiment is meant only to be exemplary of that embodiment, and not intended to intimate that the scope of the disclosure, including the claims, is limited to that embodiment.
Each CPU 52 can access the system memory 64 or the I/O devices 66, 68 through the bridge 60. The bridge 60 thus functions to coordinate accesses (reads and writes) between the various devices shown in
Each CPU 52 is assigned an identifier that distinguishes that CPU from the other CPUs. In
Exemplary embodiments of the invention provide for control of access to one or more sharable data structures. In
The embodiments described herein implement a deferred cycle mechanism to control access to a sharable data structure (e.g., data structures 65 or 67). Access to a sharable data structure generally involves both software and hardware. For example, a device driver or operating system kernel may require access to a sharable data structure and the controller (e.g., controller 62) responds to the request. In at least some embodiments, the software may implement any suitable “spin lock” mechanism that comprises a control loop which, when executed, repeatedly reads a spin lock variable (in at least some contexts referred to as a “spin lock flag”) to determine whether the associated sharable data structure is free or busy. In response to an access to the lock variable, the controller implements a deferred cycle mechanism in which the controller issues a deferred-reply message to the requesting agent (e.g., a CPU 52) which indicates to the requesting agent that the requested information is not yet ready. At that point, the software's spin lock loop running on the requestor is caused to stop execution until the controller responds again that the requested lock variable is available. The controller, however, does not respond again until the lock variable is free. Thus, the software's spin lock loop always, in at least some embodiments, upon execution of the first spin lock variable access in the loop results in an indication that the sharable data structure is free. In this manner, the requesting agent generally will not have to repeatedly read the lock variable, reducing memory bandwidth consumption due to write-back cache activity to keep the lock variable consistent between CPUs. The software need not implement a spin lock, but software that already exists with spin locks implemented therein will function correctly with the hardware's deferred reply mechanism without requiring modifications to the software. In short, the software can implement a spin lock, but the hardware implements a deferred cycle mechanism to coordinate access to a sharable data structure. The following description explains the hardware's deferred cycle mechanism.
The TAGRAM and control logic 74 comprises logic which implements one or more of the features described herein for implementing a spinlock.
The embodiments described below refer to the actions performed by the memory controller 62, although other types of controllers including controllers that are not memory controllers per se can be used as well. In general, any controller that can be configured to implement a deferred cycle to control access to sharable data structures can be used.
The timeout field 84 is used to control a timer 72 associated with the lock variable. The timeout field 84 may comprise one or more bits. In some embodiments, a value of zero for the timeout field 84 disables the timer 72 and in other embodiments, other values disable the timer. A non-zero value is a threshold value for a timer state machine to execute to employ the agent ID queue 86 as will be explained below. The agent ID queue 86 is used to store identifiers associated with one or more requestors that are awaiting access to the associated sharable data structures.
The number of lock variables is set in accordance with user needs. Further, the number of AGENTIDQs 86 is set depending on the number of requesters. For example, because there can be at most three waiters in a four-requestor system, then only three AGENTIDQs are needed. Thus, in at least some embodiments, the number of AGENTIDQs is one less than the number of potential requesters.
The use of the controller 62 to control access to a shared data structure using a deferred cycle mechanism will now be described with regard to the exemplary method 100 depicted in
Block 102 may be performed by the controller 62 decoding the lock variable access. With regard to the memory controller embodiment of
If, however, the target tag is valid, which indicates that the sharable data structure is being accessed by another requestor, then the controller 62 determines if the agent ID queue 86 associated with the target lock variable is empty. If the associated agent ID queue 86 is not empty (i.e., there is at least one agent ID in the queue 86), then the agent ID of the requestor is inserted into the agent ID queue (block 114). If the agent ID queue 86 is empty, then, in addition to inserting the agent ID into the queue 86 (block 114), the timer 74 associated with the lock variable is caused to begin counting for that particular lock variable (block 112). The timer 74 begins counting towards its terminal count value (e.g., 0 if counting down or a non-zero number if counting up from 0). The timer is reset when the lock variable is released by the requestor (i.e., the requestor no longer needs access to the associated shared data structure). If the timer reaches its terminal count value, then all entities awaiting access to the lock variable (“waiters”), as indicated by their agent IDs, are removed from the associated agent ID queue 86 and issued a “busy” reply to cause the software spin lock acquire algorithm to retry its attempt to claim lock ownership. After inserting the agent ID of the requester in the agent ID queue 86, the controller 62 issues a deferred-reply message to the requester (block 116). This message indicates that another requester currently has control of the lock variable and thus access to the requested sharable data structure, and that the controller will again contact the requestor when the sharable data structure becomes available, or if the timer 74 expires indicating that the sharable data structure is still unavailable. The requestor receives the deferred reply which precludes the software process that issued the read (106) from continuing to run until subsequently receiving the requested data. The data will subsequently be provided to the requestor when the requestor is given control of the associated lock variable which will occur when the lock variable becomes available and the waiting requestor is next in the agent ID queue awaiting access to the lock variable. Essentially, the requestor issues a read request for a lock variable and the read request is not permitted by the controller to complete until the requested data is available. From the requestor's perspective, the requester issues a read request and the data is returned, albeit possibly after a delay while the controller coordinates access to the requested data.
If, at decision block 104, the lock variable access is decoded as a write (which indicates that the requester that currently has access to the lock variable and no longer needs access to the sharable data structure), the functions of blocks 120-140 are performed as described below. At block 120, the controller 62 examines the decode information 80 to determine if the tag associated with the target lock variable is present in tag field 82. If the tag is not present in table 80, then the process ends. If the target tag is present in the tag field 82 of the associated lock variable, then at 126 the controller 62 determines if the lock address is initializing. Initializing the lock variable comprises the software act of placing the lock variable into a known “free” state and ensuring that there are no waiters in the agent ID queue. If the lock address is initializing, the next agent ID is removed from the agent ID queue 86 (block 126) and, per decision 128, if that next agent ID is valid, the controller 62 issues a “not busy” reply message to the requestor associated with the next agent ID (block 130) removed from the agent ID queue and control loops back to block 126. If the next agent ID is not valid, as determined by decision 128, the timer 74 associated with the lock variable is stopped at block 132 and the tag is freed at block 134.
If, at decision 124, the lock address is not initializing, the next agent ID is removed from the agent ID queue (block 136) and the controller 62 further determines, at block 138, whether that next agent ID is valid. If the next agent ID is not valid, the timer 74 is stopped (block 132) and the tag is freed (block 134). If the next agent ID is valid, then the controller 62 issues a “not busy” reply to the requestor associated with the next agent ID (block 140) removed from the agent ID queue and the process ends.
Access to some shared data structures can be controlled by a deferred cycle mechanism such as that described above. Access to other shared data structures instead can be controlled via a conventional spin lock mechanism in which a requester repeatedly reads a lock variable to determine if the shared data structure is available. If desired, access to all shared data structures can be controlled by the disclosed deferred cycle mechanism. Identification of the data structures that are susceptible to control via the deferred cycle mechanism and those that are susceptible to control via a spin lock mechanism can be hard-coded or programmable. For example, the memory controller 62 shown in FIG. 2 shows an address registry 69 which is used to register address of lock variables that are used in conjunction with the deferred-reply mechanism described above. As such, when an access to a lock variable is requested, that access is processed according the deferred-reply mechanism of
At least some embodiments of the invention function with software that implements atomic or non-atomic software instructions to acquire the lock and access the shared data structure. An exemplary portion of software that does not use atomic instructions to acquire the lock is provided as follows:
Software having atomic instructions can be used as well as with memory controller described herein. At least some atomic instructions, such as atomic read-modify-write or “bit test set” instructions, include a write cycle as part of the atomic instruction's execution. In an embodiment of the invention, this write cycle is ignored. With reference to
A predefined number of deferred cycles may be outstanding at any point in time. If more than that number is exceeded, in at least some embodiments the oldest pending memory accesses may be continued using a spin lock mechanism (repeatedly checking the lock status), rather than using the disclosed deferred cycle mechanism (waiting on the memory controller to release the lock to the requester).
The above discussion is meant to be illustrative of the principles and various embodiments of the present invention. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.