The present invention pertains to electronic systems, and in some embodiments, to processing systems. Embodiments of the present invention also pertain to pipelined systems, to memory controllers, and to wireless communication devices.
Processing systems, including pipelined systems, generally utilize an arbiter to arbitrate among the devices requesting access to a shared resource over a system bus. When a requesting device desires access to the shared resource, the requesting device generally generates a request and waits for a grant. One problem with this approach is the latency involved with such requests. This is especially a problem for memory devices, such as memory controllers, because latency and bus-bandwidth limitations may result in delays that impact system-level operations.
Thus there are general needs for systems and methods that help reduce the effects of latency and increase bus-usage efficiency.
The appended claims are directed to some of the various embodiments of the present invention. However, the detailed description presents a more complete understanding of embodiments of the present invention when considered in connection with the figures, wherein like reference numbers refer to similar items throughout the figures and:
The following description and the drawings illustrate specific embodiments of the invention sufficiently to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process, and other changes. Examples merely typify possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in or substituted for those of others. The scope of embodiments of the invention encompasses the full ambit of the claims and all available equivalents of those claims.
Requestors 106 may include any device or element that requests use of a shared resource. Examples of requestors 106 include memory controllers, such as memory controller (MC) 104, processors and processing resources including cryptographic processors, direct memory access (DMA) units, network interfaces, digital signal processors (DSPs), network controllers including wireless local area network controllers, signal processors, floating-point units (FPUs), application accelerators, and/or data acquisition devices.
Requestors 106 generate bus requests 116 for bus arbiter 102 and may receive bus grants 118 from bus arbiter 102. Bus arbiter 102 may provide a bus grant for a request to a requestor in accordance with one or more arbitration schemes, including, for example, priority-based arbitration schemes or fixed arbitration schemes. In some embodiments, bus requests 116 and bus grants 118 may be communicated between arbiter 102 and requestors 106 over a grant/request bus (not illustrated).
When granted access, requestors 106 may access one or more of shared resources 108 over system bus 114. Shared resource 108 may include one or more resources that may be shared among requestors 106. Shared resource 108 may include a processor, a central processing unit, a particularly-configured processing engine (e.g., for cryptographic processing), or other resource that may be shared by one or more requestors 106 over bus 114. Bus 114 may be almost any type of data bus that supports multiple clients using some form of arbitration. In some embodiments, bus 114 may be a 32-bit or 64-bit bus including a PCI bus, a PCI-express (PCIX) bus or a third-generation input/output (3GIO) bus, although the scope of the invention is not limited in this respect.
In accordance with some embodiments, bus arbiter 102 generates bus-activity indicator (BAI) 120 for use by one or more of requestors 106. Bus-activity indicator 120 may be an indication of how busy system bus 114 has been during a recent system-bus observation window. The system-bus observation window may comprise a prior predetermined number of system-bus cycles. One or more of requestors 106 may predict when to generate a bus request based on bus-activity indicator 120, a bus-usage efficiency indicator and a bus-bandwidth usage indicator. The bus-usage efficiency indicator may be generated by one of requestors 106 based on a number of unused bus cycles that were granted to the requestor during a prior observation window. The bus-bandwidth usage indicator may be generated by the requestor based on a number of bus transactions effectively utilized by the requestor during the prior observation window.
In some embodiments, when bus-activity indicator 120 indicates that system bus 114 is not busy, a requester may engage in full speculation generating a bus request ahead-of-time, which may be a maximum predetermined number of bus cycles ahead-of-time. When bus-activity indicator 120 indicates that the system bus is busy, the requestor may predict how early to generate the bus request based on the bus-activity indicator, the bus-usage efficiency indicator and the bus-bandwidth usage indicator.
In some embodiments, a requestor may predict when to generate the bus request based on one of a plurality of speculation states, which may be at least initially determined by bus-activity indicator 120. In these embodiments, the requester may transition among the various speculation states based on the bus-usage efficiency indicator and the bus-bandwidth usage indicator. In some embodiments, the requestor may transition among the various speculation states based on changes in the bus-usage efficiency indicator and/or bus-bandwidth usage indicator. In some embodiments, the requestor may determine a number of bus cycles to generate the bus request ahead-of-time based on an imminence level of a transaction for which the bus request is to be generated.
In some embodiments, bus-activity indicator 120 may comprise a two-bit value broadcast by bus arbiter 102 to one or more requestors 106. The two-bit value may be broadcast over a two-wireline connection to one or more requestors 106, although the scope of the invention is not limited in this respect.
In wireless embodiments, system 100 may include wireless network interface 110, such as a network interface card (NIC). In these embodiments, interface 110 may operate as one of requestors 106 and may communicate RF signals with other networked devices, such as an access point, using antenna 112. In these embodiments, system 100, including wireless network interface 110 and antenna 112, may be part of a wireless communication device, such as a personal digital assistant (PDA), a laptop or portable computer with wireless communication capability, a web tablet, a wireless telephone, a wireless headset, a pager, an instant messaging device, an MP3 player, a digital camera, an access point, or other device that may receive and/or transmit information wirelessly. In these embodiments, wireless network interface 110 may receive RF communications in accordance with specific communication standards, such as the IEEE 802.11(a), 802.11(b) and/or 802.11(g) standards for wireless local area networks, although interface 110 may receive communications in accordance with other techniques including the Digital Video Broadcasting Terrestrial (DVB-T) broadcasting standard and the High performance radio Local Area Network (HiperLAN) standard. Antenna 112 may be almost any type of antenna including a dipole antenna, a monopole antenna, a loop antenna, a microstrip antenna or other type of antenna suitable for reception and/or transmission of RF signals, which may be processed by wireless network interface 110.
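By way of illustration only, the arbiter-side generation of a two-bit bus-activity indicator might be sketched as follows. The class name, the window length, and the mapping of busy fraction onto the four two-bit codes are assumptions made for this sketch; the description does not specify them.

```python
from collections import deque


class BusActivityMonitor:
    """Hypothetical arbiter-side tracker for a two-bit bus-activity indicator."""

    def __init__(self, window_cycles=1000):
        # Sliding observation window: one entry per bus cycle, True if busy.
        self.window = deque(maxlen=window_cycles)

    def record_cycle(self, busy):
        """Record whether the system bus was busy during one bus cycle."""
        self.window.append(bool(busy))

    def indicator(self):
        """Return a two-bit value (0..3); 0 = idle, 3 = heaviest load (assumed)."""
        if not self.window:
            return 0
        busy_fraction = sum(self.window) / len(self.window)
        # Assumed quantization: four equal bands of busy fraction.
        return min(3, int(busy_fraction * 4))
```

In such a sketch, the arbiter would call record_cycle once per bus cycle and broadcast the value of indicator to the requestors.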
Although system 100 is illustrated as having several separate functional elements, one or more of the functional elements may be combined and may be implemented by combinations of software-configured elements, such as processing elements including digital signal processors (DSPs), and/or other hardware elements. For example, processing elements may comprise one or more microprocessors, DSPs, application specific integrated circuits (ASICs), and combinations of various hardware and logic circuitry for performing at least the functions described herein.
Unless specifically stated otherwise, terms such as processing, computing, calculating, determining, displaying, generating, or the like, may refer to an action and/or process of one or more processing or computing systems or similar devices that may manipulate and transform data represented as physical (e.g., electronic) quantities within a processing system's registers and memory into other data similarly represented as physical quantities within the processing system's registers or memories, or other such information storage, transmission or display devices.
In some embodiments, logic circuitry 202 may generate the bus-usage efficiency indicator based on unused bus cycles that were granted to the requestor 200 during a prior observation window, and may generate the bus-bandwidth usage indicator based on a number of bus transactions effectively utilized by the requestor 200 during the prior observation window.
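As a non-limiting sketch, the two requestor-side indicators might be derived from simple per-window counters. The counter names and the efficiency formula (the fraction of granted cycles that were actually used) are assumptions; the description states only that the indicators are based on unused granted cycles and on effectively utilized transactions.

```python
from dataclasses import dataclass


@dataclass
class WindowCounters:
    """Hypothetical counters a requestor keeps over one observation window."""
    granted_cycles: int = 0       # bus cycles granted to this requestor
    unused_cycles: int = 0        # granted cycles in which no data moved
    useful_transactions: int = 0  # transactions effectively utilized


def bus_usage_efficiency(c):
    """Fraction of granted cycles that were used (assumed formula)."""
    if c.granted_cycles == 0:
        return 1.0  # assumption: nothing granted means nothing wasted
    return 1.0 - c.unused_cycles / c.granted_cycles


def bus_bandwidth_usage(c):
    """Number of transactions effectively utilized during the window."""
    return c.useful_transactions
```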
Requestor 200 may also include logic circuitry 206 to generate the bus request ahead-of-time based on a prediction received from logic circuitry 204. In some embodiments, logic circuitry 204 may receive bus-activity indicator (BAI) 220 from a bus arbiter indicating system bus activity during a prior system-bus observation window. In embodiments, bus-activity indicator 220 may correspond to bus-activity indicator 120 (
In some embodiments, when bus-activity indicator 220 indicates that the system bus is not busy, requester 200 may engage in full speculation generating bus request 216 ahead-of-time at a maximum predetermined number of bus cycles. In some embodiments, when bus-activity indicator 220 indicates that the system bus is busy, logic circuitry 204 may predict how early to generate the bus request based on bus-activity indicator 220, the bus-usage efficiency indicator and the bus-bandwidth usage indicator. In some embodiments, logic circuitry 204 may predict when to generate bus request 216 based on one of a plurality of speculation states, which may be at least initially determined by bus-activity indicator 220. Requestor 200 may transition among the speculation states based on the bus-usage efficiency indicator and the bus-bandwidth usage indicator, for example, as the bus-usage efficiency indicator and/or bus-bandwidth usage indicator may change. In some embodiments, logic circuitry 204 may determine the number of bus cycles to generate the bus request ahead-of-time based on an imminence level of a transaction for which the bus request is to be generated.
In some embodiments in which requester 200 comprises a memory controller, such as memory controller 104 (
Although requestor 200 is illustrated as having several separate functional elements (e.g., logic circuitry 202, 204 and 206, and OSEs 208), one or more of these functional elements may be combined in a single hardware element. In some embodiments, one or more of these elements may be implemented by combinations of software-configured elements, such as processing elements and/or other hardware elements and firmware.
In some embodiments, state diagram 300 may be applicable to situations when the bus-activity indicator indicates that the system bus has been busy during the prior observation window. When the bus-activity indicator indicates that the system bus has not been busy, or has been idle, a requester may remain in full-speculation state 302 regardless of the bus-usage efficiency indicator and the bus-bandwidth usage indicator.
When a requestor is in full-speculation state 302, the requestor may engage in full speculation and may generate bus requests ahead-of-time at a maximum number of bus cycles. In some embodiments, the number of bus cycles may be a predetermined number of bus cycles, while in other embodiments, the number of bus cycles may be based on the imminence of the request. The number of bus cycles ahead-of-time in which bus requests are generated may also depend on the type of memory being accessed (e.g., whether the memory is synchronous versus asynchronous, or static versus dynamic). The number of bus cycles ahead-of-time in which bus requests are generated may also depend on the width of the data-bus to an external-memory chip (e.g., 8-bit, 16-bit, 32-bit, 64-bit, etc.). The number of bus cycles ahead-of-time in which bus requests are generated may also depend on the size of the data being fetched (e.g., 1-byte, 6-bytes, 32-bytes, 1K-bytes, 1M-bytes, etc.).
When a requester is in delayed-speculation state 304, the requester may engage in a delayed speculation and may also generate bus requests a number of bus cycles ahead-of-time. In delayed-speculation state 304, however, the number of bus cycles ahead-of-time in which bus requests are generated may depend on the bus-usage efficiency indicator and/or the bus-bandwidth usage indicator generated by the requestor.
When a requestor is in no-speculation state 306, the requestor engages in no speculation and generates bus requests only when the requestor is ready to use the bus. In other words, in no-speculation state 306, bus requests are not generated ahead-of-time and the data is available in the buffer before the transaction starts.
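The three speculation states may be viewed as selecting how many bus cycles ahead-of-time a request is issued. The following sketch is illustrative only: the default maximum of eight cycles and the linear scaling by efficiency in the delayed-speculation state are assumptions, since the description leaves the exact dependence on the indicators open.

```python
from enum import Enum


class SpecState(Enum):
    FULL = 302     # full-speculation state
    DELAYED = 304  # delayed-speculation state
    NONE = 306     # no-speculation state


def lead_cycles(state, efficiency, max_lead=8):
    """Return how many bus cycles ahead-of-time to issue the bus request."""
    if state is SpecState.FULL:
        return max_lead  # maximum number of cycles ahead-of-time
    if state is SpecState.NONE:
        return 0         # request only when ready to use the bus
    # DELAYED: assumed linear scaling by efficiency, clamped to [0, 1].
    eff = min(1.0, max(0.0, efficiency))
    return int(max_lead * eff)
```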
In accordance with some embodiments, when a requester receives a bus-activity indicator from a bus arbiter indicating that the system bus is busy, the requester may go into one of the speculation states, such as full-speculation state 302. A bus-activity indicator may be broadcast by the arbiter on a regular basis, such as every 1000 bus cycles, or may be broadcast when the bus activity changes.
During a speculation state, such as full-speculation state 302, a requestor may measure its recent bus usage and generate the bus-usage efficiency indicator and the bus-bandwidth usage indicator. The bus-usage efficiency indicator may be based on a number of unused bus cycles that were granted to the requestor during a prior observation window. The bus-bandwidth usage indicator may be based on a number of bus transactions utilized by the requestor during the prior observation window. As indicated by transition 308, when the bus-usage efficiency is low, the requestor may transition from full-speculation state 302 to delayed-speculation state 304. This is because of the causal link between predictive requesting and data availability. The more aggressive the requesting algorithm, the more likely it may be that data is not available to complete a transaction. Efficiency is a proxy for data availability, and when the efficiency is low, the prediction algorithm may be relaxed in an effort to reduce the wasted cycles.
During delayed-speculation state 304, the requestor may measure its bus usage and generate a bus-usage efficiency indicator and the bus-bandwidth usage indicator. As indicated by transition 310, when the bus-usage efficiency is high, and the bandwidth usage (e.g., BWu) measured during delayed-speculation state 304 is less than the bandwidth usage measured during full-speculation state 302, the requestor may transition from delayed-speculation state 304 to full-speculation state 302. This is because efficiency may not be a complete measure of bus optimization. Due to the nature of a bus, such as a PCI bus and most other buses, it is possible to achieve higher efficiencies without getting useful transactions (e.g., in the case with retries). The bandwidth usage is a failsafe which helps improve some of these situations.
As indicated by transition 312, a requestor may transition from delayed-speculation state 304 to no-speculation state 306 when the bus-usage efficiency is low or remains low. During no-speculation state 306, the requestor may measure its bus usage and generate a bus-usage efficiency indicator and the bus-bandwidth usage indicator. As indicated by transition 314, when the bus-usage efficiency is high, and the bandwidth usage (e.g., BWu) measured during no-speculation state 306 is less than the bandwidth usage measured during delayed-speculation state 304, the requestor may transition from no-speculation state 306 to delayed-speculation state 304.
As indicated by transition 316, when the bus-usage efficiency is high, and the bandwidth usage (e.g., BWu) measured during no-speculation state 306 is less than the bandwidth usage measured during full-speculation state 302, the requester may transition from no-speculation state 306 to full-speculation state 302. In some embodiments, transition 316 may be performed when user programmability may bar transition 314.
In some embodiments, the requestor may remain in full-speculation state 302 as long as the bus-usage efficiency is high. In general, a requester may determine and operate in a particular speculation state for a given bus-activity indicator and may remain in that speculation state until the bus-activity indicator changes, which may occur after each system-bus observation window.
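The transitions 308 through 316 described above may be summarized in a small state-transition function. This is a sketch under stated assumptions: the numeric thresholds for "low" and "high" efficiency, the parameter names, and the allow_314 flag (standing in for user programmability that bars transition 314) are hypothetical and are not taken from the description.

```python
from enum import Enum


class SpecState(Enum):
    FULL = 302
    DELAYED = 304
    NONE = 306


def next_state(state, efficiency, bw_now, bw_by_state,
               bus_busy=True, low=0.5, high=0.8, allow_314=True):
    """Pick the next speculation state.

    bw_now is the bandwidth usage measured in the current state;
    bw_by_state maps a state to the bandwidth usage last measured there.
    """
    if not bus_busy:
        return SpecState.FULL  # idle bus: stay in full speculation
    if state is SpecState.FULL:
        # Transition 308 on low efficiency; otherwise remain in 302.
        return SpecState.DELAYED if efficiency < low else SpecState.FULL
    if state is SpecState.DELAYED:
        if efficiency < low:
            return SpecState.NONE  # transition 312
        if efficiency >= high and bw_now < bw_by_state.get(SpecState.FULL, 0):
            return SpecState.FULL  # transition 310
        return SpecState.DELAYED
    # state is SpecState.NONE
    if efficiency >= high:
        if allow_314 and bw_now < bw_by_state.get(SpecState.DELAYED, 0):
            return SpecState.DELAYED  # transition 314
        if not allow_314 and bw_now < bw_by_state.get(SpecState.FULL, 0):
            return SpecState.FULL  # transition 316
    return SpecState.NONE
```

The bandwidth comparison acts as the failsafe noted above: a return to more aggressive speculation is taken only when the less aggressive state delivered fewer useful transactions.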
In some embodiments, user settings may disallow certain speculation states and associated transitions. For example, in some embodiments, transition 318 from full-speculation state 302 to no-speculation state 306 may occur when user programmability bars transition 308. User programmability, including settings and selections, may be provided through an I/O device, such as I/O 122 (
In some embodiments, a requestor may further determine a number of bus cycles to generate the bus request ahead-of-time based on an imminence level of a transaction for which the bus request is to be generated. In these embodiments, an imminence bit may be set. In some embodiments, memory controller 104 (
For example, bus requests for a variable-latency input-output (VLIO) memory, such as command chips and card memory, are generally not considered imminent because the data arrival may be unpredictable. Bus requests for synchronous memory, for example, may be considered imminent when the transaction length exceeds a certain number of bytes (such as 16) or a predetermined percentage of bytes (e.g., 16 out of 32) has been received. In some embodiments, the transfer may be considered imminent upon receipt of the 16th byte. Bus requests for synchronous memory for transfers less than 16 bytes may be considered imminent and may have their imminence bit set when the transfer is initiated. For transactions involving other memory types (e.g., SRAM, FLASH), an imminence bit may be set upon the arrival of the last_but_one beat of data from an external source. In these embodiments, data may arrive in beats from an external memory chip. The size of the beat relates to the number of wires connected to the chip (e.g., 8, 16, 32, 64, etc.). Thus, it will take 16 beats to fetch 32 bytes from a memory over 16 wires, for example. In this case, the imminence bit may be set upon the arrival of the 15th beat (e.g., the last_but_one beat of data).
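The beat arithmetic in the preceding example may be checked with a short sketch. The function names are hypothetical, and the handling of single-beat transfers (setting the bit on the only beat) is an assumption not addressed in the description.

```python
def beats_needed(transfer_bytes, bus_width_bits):
    """Number of beats to move transfer_bytes over a bus of the given width."""
    bytes_per_beat = bus_width_bits // 8
    # Ceiling division: a partial final beat still counts as a beat.
    return -(-transfer_bytes // bytes_per_beat)


def imminence_beat(transfer_bytes, bus_width_bits):
    """1-based index of the beat whose arrival sets the imminence bit."""
    beats = beats_needed(transfer_bytes, bus_width_bits)
    # Last-but-one beat; assumed to be the only beat for one-beat transfers.
    return max(1, beats - 1)
```

For the 32-byte fetch over 16 wires discussed above, beats_needed(32, 16) gives 16 and imminence_beat(32, 16) gives 15.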
In operation 402, a bus-activity indicator, such as bus-activity indicator 120 (
In operation 404, a requester may enter an initial speculation state, such as full-speculation state 302 (
In operation 408, the requester determines the bus-usage efficiency for the requester, and in operation 410, the requester determines the bus-bandwidth usage for the requester. In operation 412, the requestor may transition among the various speculation states, such as speculation states 302, 304 and 306 (
In operation 414, the requester may predict when to generate a bus request depending on its speculation state. In some embodiments, the requester may predict when to generate a bus request depending on the bus-usage efficiency and the bus-bandwidth usage. In operation 416, the requester generates the bus request based on the prediction generated in operation 414. In some embodiments, the requester generates the bus request based on an imminence level of the transaction discussed above.
Accordingly, the use of speculation states may help reduce the effects of latency and increase bus-usage efficiency. Although the individual operations of procedure 400 are illustrated and described as separate operations, one or more of the individual operations may be performed concurrently and nothing requires that the operations be performed in the order illustrated.
It is emphasized that the Abstract is provided to comply with 37 C.F.R. Section 1.72(b) requiring an abstract that will allow the reader to ascertain the nature and gist of the technical disclosure. It is submitted with the understanding that it will not be used to limit or interpret the scope or meaning of the claims.
In the foregoing detailed description, various features are occasionally grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments of the subject matter require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment.
This application is a continuation of U.S. patent application Ser. No. 10/654,544, filed on Sep. 2, 2003, which is incorporated herein by reference.
 | Number | Date | Country
---|---|---|---
Parent | 10654544 | Sep 2003 | US
Child | 11460852 | Jul 2006 | US