This application relates to the following commonly assigned applications entitled:
“Apparatus And Method for Interfacing A High Speed Scan-Path With A Slow Speed Tester Equipment,” Ser. No. 09/653,642, filed Aug. 31, 2000, now U.S. Pat. No. 6,779,142; “Priority Rules For Reducing Network Message Routing Latency,” Ser. No. 09/652,322, filed Aug. 31, 2000, now U.S. Pat. No. 6,961,781; “Scalable Directory Based Cache Coherence Protocol,” Ser. No. 09/652,703, filed Aug. 31, 2000, now U.S. Pat. No. 6,633,960; “Scalable Efficient I/O Port Protocol,” Ser. No. 09/652,391, filed Aug. 31, 2000, now U.S. Pat. No. 6,738,836; “Efficient Translation Lookaside Buffer Miss Processing In Computer Systems With A Large Range Of Page Sizes,” Ser. No. 09/652,552, filed Aug. 31, 2000, now U.S. Pat. No. 6,715,057; “Fault Containment And Error Recovery Techniques In A Scalable Multiprocessor,” Ser. No. 09/651,949, filed Aug. 31, 2000, now U.S. Pat. No. 6,678,840; “Speculative Directory Writes In A Directory Based Cache Coherent Uniform Memory Access Protocol,” Ser. No. 09/652,834, filed Aug. 31, 2000, now U.S. Pat. No. 7,099,913; “Special Encoding of Known Bad Data,” Ser. No. 09/652,314, filed Aug. 31, 2000, now U.S. Pat. No. 6,662,319; “Broadcast Invalidate Scheme,” Ser. No. 09/652,165, filed Aug. 31, 2000, now U.S. Pat. No. 6,751,721; “Mechanism To Track All Open Pages In A DRAM Memory System,” Ser. No. 09/652,704, filed Aug. 31, 2000, now U.S. Pat. No. 6,662,265; “Programmable DRAM Address Mapping Mechanism,” Ser. No. 09/653,093, filed Aug. 31, 2000, now U.S. Pat. No. 6,546,453; “Computer Architecture And System For Efficient Management Of Bi-Directional Bus,” Ser. No. 09/652,323, filed Aug. 31, 2000, now U.S. Pat. No. 6,704,817; “An Efficient Address Interleaving With Simultaneous Multiple Locality Options,” Ser. No. 09/652,452, filed Aug. 31, 2000, now U.S. Pat. No. 6,567,900; “A High Performance Way Allocation Strategy For A Multi-Way Associative Cache System,” Ser. No. 09/653,092, filed Aug. 31, 2000, now abandoned; “Method And System For Absorbing Defects In High Performance Microprocessor With A Large N-Way Set Associative Cache,” Ser. No. 09/651,948, filed Aug. 31, 2000, now U.S. Pat. No. 6,671,822; “A Method For Reducing Directory Writes And Latency In A High Performance, Directory-Based, Coherency Protocol,” Ser. No. 09/652,324, filed Aug. 31, 2000, now U.S. Pat. No. 6,654,858; “Mechanism To Reorder Memory Read And Write Transactions For Reduced Latency And Increased Bandwidth,” Ser. No. 09/653,094, filed Aug. 31, 2000, now U.S. Pat. No. 6,591,349; “System For Minimizing Memory Bank Conflicts In A Computer System,” Ser. No. 09/652,325, filed Aug. 31, 2000, now U.S. Pat. No. 6,622,225; “Computer Resource Management And Allocation System,” Ser. No. 09/651,945, filed Aug. 31, 2000, now U.S. Pat. No. 6,754,739; “System for recovery data in a multi-processor system comprising a conduction path for each bit between processors where the paths are grouped into separate bundles and routed along different paths,” Ser. No. 09/653,643, filed Aug. 31, 2000, now U.S. Pat. No. 6,668,335; “Fast Lane Prefetching,” Ser. No. 09/652,451, filed Aug. 31, 2000, now U.S. Pat. No. 6,681,295; “Mechanism For Synchronizing Multiple Skewed Source-Synchronous Data Channels With Automatic Initialization Feature,” Ser. No. 09/652,480, filed Aug. 31, 2000, now U.S. Pat. No. 6,636,955; and “Chaining Directory Reads And Writes To Reduce DRAM Bandwidth In A Directory Based CC-NUMA Protocol,” Ser. No. 09/652,315, filed Aug. 31, 2000, now U.S. Pat. No. 6,546,465, all of which are incorporated by reference herein.
Not applicable.
1. Field of the Invention
The present invention generally relates to the distribution of buffer space between multiple sources. More particularly, the invention relates to a fair and efficient method of controlling allocation of a shared buffer pool.
2. Background of the Invention
In computer systems and networks, buffers are a convenient means of storing commands, requests, and data that are in transit from one location to another. Buffers may be used in a variety of applications. They may be used in network switching devices to temporarily hold data packets while network congestion subsides or while the switch determines the location to which the data must be forwarded. It is not uncommon for network switches to manage traffic for a plurality of sources. Buffers may also be used in memory and data allocation. An example of the latter would be a data read/write request buffer that must allocate requests from multiple sources. A common problem in systems using a shared buffer space is signal traffic that creates congestion and may lead to buffer overflow and monopolization by one or more of the buffer sources.
Ideally, a system comprising a buffer with multiple sources should accomplish several tasks. First, the system should not deliver data, commands, or requests to the buffer if the buffer does not have any free space. This prevents data loss or packet drops, which may require that the data packet be re-sent, resulting in even greater bandwidth loss than simply holding the data until buffer space becomes available. Second, access to buffer space by the multiple sources should preferably be allocated in a fair manner. The sources should have fair access to the buffer so that a source does not become backlogged while other sources are able to deliver data freely. This does not necessarily imply that the allocation needs to be equal for each source. For instance, one source may be given priority over the others. However, even in this scenario, it is important to prevent complete monopolization by the source that has priority.
One conventional solution to the problem of fair allocation of a shared buffer space is hard partitioning of the buffer space. For example, if a buffer has 16 data spaces and the buffer is shared among 4 sources, each source may be allocated four buffer slots. This method of allocation is certainly fair but may be horribly inefficient. If one of the sources has a string of data that ideally could be burst to the buffer, congestion may occur because the source only has four buffer spaces available. The remaining 12 buffers could be used to hold the burst of data, but instead may lie dormant because of the hard partitioning.
If prior knowledge exists about the type of traffic that can be expected from the sources, the hard partitioning may be altered to allocate more buffer space to one source or another. For instance, in the example given above, seven buffer spaces may be allocated to one source while the other three sources are allocated three spaces each. This allocation may alleviate congestion for the prioritized source, but does not prevent burst congestion for the other sources. In either case, hard partitioning tends to preclude use of at least a fixed percentage of the buffer space unless all sources are continuously accessing the buffer.
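The cost of hard partitioning can be illustrated with a minimal sketch. The class `HardPartitionedBuffer`, its method names, and the source labels are hypothetical and serve only as an illustration; the 16-space, four-source figures are taken from the example above.

```python
class HardPartitionedBuffer:
    """Illustrative model: each source owns a fixed number of buffer spaces."""

    def __init__(self, quotas):
        # quotas maps a (hypothetical) source name to its reserved space count.
        self.quotas = dict(quotas)
        self.used = {source: 0 for source in quotas}

    def try_store(self, source):
        """Store one item for `source`; return False if its partition is full."""
        if self.used[source] >= self.quotas[source]:
            return False
        self.used[source] += 1
        return True

    def idle_spaces(self):
        """Spaces that are empty, whether or not any source may use them."""
        return sum(self.quotas.values()) - sum(self.used.values())


# 16 spaces shared equally by 4 sources, as in the example above.
buf = HardPartitionedBuffer({"A": 4, "B": 4, "C": 4, "D": 4})
# Source A attempts a burst of 10 items while the other sources are idle.
accepted = sum(buf.try_store("A") for _ in range(10))
```

In this sketch only four items of the burst are accepted, and the twelve spaces reserved for the idle sources remain empty.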
Another conventional solution to the problem of fairly and efficiently allocating buffer space is with stop and go commands issued by the buffer. In this type of system, the buffer is configured to keep track of available spaces within the buffer. During normal operation with light traffic, each source receives a “go” signal from the buffer indicating that buffer space is available. As buffer space becomes limited, the buffer may send “stop” signals to the individual sources to halt data transmission to the buffer. This approach offers better use of the buffer space because the sources are not limited to a fixed percentage of the buffer space. However, some risk is involved in this type of embodiment because a finite time exists between the moment the buffer sends out a stop command and the moment the source receives and responds to the stop command. During this finite time, it is possible for additional data to be transmitted to the buffer from the various sources. If the buffer was sufficiently close to being full and enough data was sent to the buffer before the stop commands were received by the sources, buffer overflow may occur and data may be lost. To prevent this, stop commands are usually sent well in advance of the buffer filling to capacity. Thus, if all buffer sources are bursting data to the buffer, the stop command is preferably timed so that the sources stop sending data to the buffer before the buffer capacity is filled. Unfortunately, the side effect of sending the stop commands early is that the maximum capacity of the buffer will not be used when the buffer sources are not simultaneously sending bursts of data. The stop/go command solution to this buffer allocation problem is an improvement over the hard partitioning solution, but presents problems of either overflow or inefficient use of the whole buffer.
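The trade-off between overflow and unused capacity in a stop/go scheme can be sketched with a simple step model. The function `simulate_stop_go`, its parameters, and the numeric values below are assumptions made for illustration: each active source is assumed to deliver one item per step, and each source is assumed to deliver a fixed number of additional items before it reacts to a stop command.

```python
def simulate_stop_go(capacity, stop_at_free, senders, latency_items):
    """Return the final occupancy when `senders` sources burst until stopped.

    Assumed model: the buffer issues "stop" once its free space is no more
    than `stop_at_free`; every sender then still delivers `latency_items`
    items that were already on their way. A result above `capacity` means
    the buffer would have overflowed.
    """
    occupancy = 0
    while capacity - occupancy > stop_at_free:
        occupancy += senders               # one item per active sender per step
    occupancy += senders * latency_items   # items sent before "stop" takes effect
    return occupancy


early_all = simulate_stop_go(16, stop_at_free=8, senders=4, latency_items=2)
early_one = simulate_stop_go(16, stop_at_free=8, senders=1, latency_items=2)
late_all = simulate_stop_go(16, stop_at_free=2, senders=4, latency_items=2)
```

Under these assumed numbers, an early stop lets four bursting sources fill the 16-space buffer exactly, but a single bursting source is halted with six spaces still empty. A late stop uses more of the buffer under light traffic but lets four sources push the occupancy past capacity.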
It is desirable therefore to develop a fair and efficient means of allocating buffer space among several sources. The allocation preferably prevents monopolization by any of the buffer sources. The allocation method also preferably takes advantage of the full capacity of the buffer space without the possibility of buffer overflow. The system may advantageously be applied to a plurality of buffer sources and may also be applied to a variety of applications.
The problems noted above are solved in large part by a method and apparatus for ensuring fair and efficient use of a shared memory buffer. The invention uses a credit-based allocation scheme to prevent monopolization by one or more sources and permits efficient use of the entire buffer. A preferred embodiment comprises a shared memory request buffer in a multi-processor computer system. The shared memory request buffer is used to store requests from different processors. Memory requests from a local processor are delivered to the local memory controller by a cache control unit. Requests from other processors are delivered to the memory controller by an interprocessor router. The memory controller allocates the memory requests in a shared buffer using a credit-based allocation scheme. The cache control unit and the interprocessor router are each assigned a number of credits. Each must pay a credit to the memory controller when a request is allocated to the shared buffer. If a source does not have any available credits, that source may not send a request to the shared buffer. The number of credits assigned to each source is sufficient to enable each source to deliver an uninterrupted burst of memory requests to the buffer without having to wait for credits to return from the buffer. The total number of credits assigned to the sources is preferably small compared to the overall size of the buffer. If the number of filled spaces in the shared buffer is below a threshold, the buffer immediately returns the credits to the source from which the credit and memory request arrived. If the number of filled spaces in the shared buffer is above a threshold, the buffer holds the credits and returns a single credit in a round-robin manner only when a space in the shared buffer becomes free. 
The buffer threshold is the point when the number of free spaces available in the buffer is equal to the total number of credits assigned to the cache control unit and the interprocessor router. Since credits are not freely returned as the buffer gets full and since there are never any more credits available than spaces in the buffer, the buffer may reach capacity, but will not overflow.
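The credit scheme summarized above can be sketched as a small model. This is a minimal illustration under stated assumptions, not the implementation of the preferred embodiment: the class `SharedCreditBuffer`, its method names, the credit counts of four per source, and the handling of the exact threshold boundary (a credit is returned immediately while occupancy does not exceed the threshold) are all hypothetical choices.

```python
from collections import deque


class SharedCreditBuffer:
    """Toy model of a shared buffer guarded by per-source credits."""

    def __init__(self, capacity, credits):
        self.capacity = capacity
        self.available = dict(credits)             # credits each source may spend
        self.held = {s: 0 for s in credits}        # credits withheld by the buffer
        self.occupied = 0
        # Occupancy at which the free spaces equal the total number of credits.
        self.threshold = capacity - sum(credits.values())
        self._order = deque(credits)               # round-robin return order

    def allocate(self, source):
        """Spend one credit of `source` to store a request; False if it has none."""
        if self.available[source] == 0:
            return False
        self.available[source] -= 1
        self.occupied += 1
        if self.occupied <= self.threshold:
            self.available[source] += 1            # light load: return at once
        else:
            self.held[source] += 1                 # heavy load: hold the credit
        return True

    def release(self):
        """Free one space and return a held credit, if any, in round-robin order."""
        if self.occupied == 0:
            raise ValueError("buffer is already empty")
        self.occupied -= 1
        for _ in range(len(self._order)):
            source = self._order[0]
            self._order.rotate(-1)                 # next search starts after this source
            if self.held[source] > 0:
                self.held[source] -= 1
                self.available[source] += 1
                return source
        return None


buf = SharedCreditBuffer(capacity=32, credits={"Cbox": 4, "Rbox": 4})
# Both sources send whenever they hold a credit, and no space is ever freed.
while buf.allocate("Cbox") or buf.allocate("Rbox"):
    pass
```

In this model the loop ends with the buffer exactly full and both sources out of credits, so no further request can be sent until a space is released. Above the threshold, the number of credits still held by the sources never exceeds the number of free spaces, which is why the buffer can reach capacity without overflowing.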
For a detailed description of the preferred embodiments of the invention, reference will now be made to the accompanying drawings in which:
Certain terms are used throughout the following description and claims to refer to particular system components. As one skilled in the art will appreciate, computer companies may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function. In the following discussion and in the claims, the terms “including” and “comprising” are used in an open-ended fashion, and thus should be interpreted to mean “including, but not limited to . . . ”. Also, the term “couple” or “couples” is intended to mean either an indirect or direct electrical connection. Thus, if a first device couples to a second device, that connection may be through a direct electrical connection, or through an indirect electrical connection via other devices and connections.
The preferred embodiment of the shared buffer allocation scheme is directed toward application in a memory request buffer used in a multi-processor computer system. While the preferred embodiment may be implemented in a variety of applications, a detailed description in the context of the multi-processor computer system is provided herein. The descriptions herein are not intended to limit the scope of the claim limitations set forth in this specification.
Referring now to
As noted, each processor preferably has an associated I/O controller 104. The I/O controller 104 provides an interface to various input/output devices such as disk drives 105 and 106, as shown in the lower, left-hand corner of
Each processor also, preferably, has an associated memory 102. In accordance with the preferred embodiment, the memory 102 preferably comprises RAMbus™ memory devices, but other types of memory devices can be used, if desired. The capacity of the memory devices 102 can be any suitable size. Further, memory devices 102 preferably are implemented as Rambus Interface Memory Modules (“RIMM”). To aid in the control of distributed memory in the multiprocessor system, each processor includes a memory manager and directory structure for the local memory source. A preferred embodiment of the memory controller is shown in
In general, computer system 90 can be configured so that any processor 100 can access its own memory 102 and I/O devices, as well as the memory and I/O devices of all other processors in the system. Preferably, the computer system may have physical connections between each processor resulting in low interprocessor communication times and improved memory and I/O device access reliability. If physical connections are not present between each pair of processors, a pass-through or bypass path is preferably implemented in each processor that permits accesses to a processor's memory and I/O devices by another processor through one or more pass-through processors.
Referring now to
Each of the functional units 170, 180, 190, 200 contains control logic that communicates with the control logic of other functional units as shown. Other functional units may exist within the processor 100, but have been omitted from
The L2 instruction and data cache control unit (“Cbox”) 170 controls the L2 instruction and data cache 180 and handles data accesses from other functional units in the processor, other processors in the computer system, or any other devices that might need data out of the L2 cache.
The interprocessor and I/O router unit (“Rbox”) 200 provides the interfaces to as many as four other processors and one I/O controller 104 (
Processor 100 preferably includes at least one RAMbus™ memory controller 190 (“Zbox”). The Zbox 190 controls 4 or 5 channels of information flow with the main memory 102 (
The front-end DIFT 191 performs a number of functions such as managing the processor's directory-based memory coherency protocol, processing request commands from the Cbox 170 and Rbox 200, sending forward commands to the Rbox 200, and sending response commands to and receiving packets from the Cbox 170 and Rbox 200.
The DIFT 191 also comprises a shared request buffer for tracking up to thirty-two in-flight transactions. These transaction requests or packets are received from the Cbox 170 or the Rbox 200. When a request comes from either source, the DIFT must allocate an entry in the 32-space buffer. The requests from the Cbox 170 are from the local processor chip whereas requests from the Rbox 200 are off chip requests. Since each processor in the multi-processor system shown in
In the preferred embodiment, credits are returned from the DIFT 191 to the Cbox 170 and Rbox 200 in two distinct manners. The method of returning credits depends on the region of operation for the DIFT buffer 300. The first occurs under light loads when the DIFT buffer 300 is relatively empty. This condition is met when the number of occupied buffer spaces is below a threshold 360. In the example shown in
The number of credits assigned to the Cbox 170 and Rbox 200 is based, in part, on credit round trip times. Since credits are returned immediately during light load situations, the total time required to transmit a credit from the Cbox 170 to the DIFT 191 and for the DIFT 191 to return the credit to the Cbox 170 may be determined. The number of credits given to the Cbox 170 is based not only on this round trip time, but also on the speed with which the Cbox 170 may deliver a burst of requests. The preferred embodiment gives enough credits so that if the Cbox 170 has enough memory requests from the processor, it may continuously deliver these requests without having to wait for credits to return from the DIFT 191. Consider, for example, that the Cbox 170 has a string of requests that need to be sent yet only has 4 credits. Ideally, if the Cbox bursts these requests one after another as quickly as it can, then by the time the Cbox is ready to transmit the fifth request, the credit from the first request has preferably arrived back at the Cbox 170. This credit may then be used to transmit the fifth request. Subsequent requests may then be transmitted with other credits as they arrive back at the Cbox 170. This process may continue as long as the number of occupied spaces in the DIFT buffer remains under the threshold 360.
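The relationship between round trip time, issue rate, and credit count can be illustrated as follows. The function names, the use of clock cycles as the time unit, and the specific cycle counts are assumptions for illustration; the source does not give actual round trip times.

```python
def credits_for_uninterrupted_burst(round_trip_cycles, issue_interval_cycles):
    """Smallest credit count that lets a source issue requests back to back.

    With n credits, request n+1 is ready n * issue_interval_cycles after the
    first one, so the first credit must be back by then:
    n * issue_interval_cycles >= round_trip_cycles.
    """
    return (round_trip_cycles + issue_interval_cycles - 1) // issue_interval_cycles


def burst_stalls(credits, round_trip_cycles, issue_interval_cycles, requests):
    """Return True if a burst of `requests` ever has to wait for a credit.

    Assumes light load, so every spent credit returns after exactly
    `round_trip_cycles`.
    """
    pending = []   # cycles at which spent credits arrive back at the source
    now = 0
    for _ in range(requests):
        credits += sum(1 for t in pending if t <= now)   # collect returned credits
        pending = [t for t in pending if t > now]
        if credits == 0:
            return True
        credits -= 1
        pending.append(now + round_trip_cycles)
        now += issue_interval_cycles
    return False


needed = credits_for_uninterrupted_burst(round_trip_cycles=4, issue_interval_cycles=1)
```

With an assumed round trip of four cycles and one request issued per cycle, four credits are enough: the first credit is back just as the fifth request is ready, matching the example above. With three credits the burst stalls at the fourth request.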
The number of credits given to the Rbox 200 is determined in the same way as for the Cbox 170. The round trip times for the two sources should not be assumed to be the same. A source with longer credit round trip times will necessarily require more credits to guarantee uninterrupted burst requests. Thus the number of credits assigned to the Rbox 200 will probably, though not necessarily, differ from the number of credits given to the Cbox 170. In the system shown in
The second manner in which the DIFT 191 returns credits to the Cbox 170 and Rbox 200 occurs under heavier loads when the DIFT buffer 300 is relatively full. The threshold 360 between these two conditions is preferably defined as the difference between the size of the DIFT buffer 300 and the number of credits distributed among the sources. In the system shown in
In
In the second region of operation, each time a space in the DIFT buffer 300 becomes available, a credit is preferably returned to the Cbox 170 or Rbox 200. To ensure fairness, the credits are returned in a random, equally probable round-robin fashion to those sources that have spent credits. A suitable method of returning the credits may be to simply alternate credit returns. Other embodiments exist where statistical methods are used to determine which source receives a credit. For instance, if priority needs to be given to the Cbox 170, the credit returns may be based on a random generator that is statistically skewed to return more credits to the Cbox 170 than the Rbox 200. The preferred embodiment, however, returns credits on an equally likely basis to ensure fairness between the sources. It should also be noted that when the system operates in the first region of operation (during light loads), fairness between the sources is guaranteed because credits are returned immediately.
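The choice of which source receives a freed credit can be sketched as a weighted random selection. The function `pick_credit_recipient`, its parameters, and the 3:1 weighting are hypothetical. Equal weights correspond to the equally likely return of the preferred embodiment, while unequal weights illustrate the statistically skewed alternative.

```python
import random


def pick_credit_recipient(held, weights, rng=random):
    """Choose which source gets the credit for a newly freed buffer space.

    `held` maps each source to the number of its credits the buffer is
    holding; only sources that have spent credits are eligible.
    `weights` maps each source to its relative chance of being chosen.
    """
    eligible = [source for source, count in held.items() if count > 0]
    if not eligible:
        return None
    chosen = rng.choices(eligible, weights=[weights[s] for s in eligible], k=1)
    return chosen[0]


# Skewed example: the Cbox is three times as likely to be chosen as the Rbox.
rng = random.Random(0)
picks = [
    pick_credit_recipient({"Cbox": 5, "Rbox": 5}, {"Cbox": 3, "Rbox": 1}, rng)
    for _ in range(2000)
]
```

Restricting the choice to sources with held credits keeps a credit from being handed to a source that has not spent one, regardless of the weighting.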
An alternative embodiment exists whereby the credit allocation scheme may be used to send request or response commands to the DIFT buffer 300. In this embodiment, the Cbox 170 and Rbox 200 are assigned credits as discussed above, but each must reserve a final credit for a response only. Thus, all credits in the Cbox 170 or Rbox 200 may be spent in issuing requests or responses to the DIFT buffer 300, but the last credit must be spent on a response.
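The reserved-credit rule of this alternative embodiment can be expressed as a simple check. The function name `may_send` and the string labels for the two message kinds are hypothetical and used only for illustration.

```python
def may_send(kind, credits_left):
    """Return True if a message of `kind` may spend one of `credits_left` credits.

    Requests may use any credit except the last one, which is reserved
    for a response.
    """
    if kind not in ("request", "response"):
        raise ValueError(f"unknown message kind: {kind!r}")
    if credits_left <= 0:
        return False
    if kind == "response":
        return True
    return credits_left > 1   # a request must leave the final credit untouched
```

Under this rule a source that has issued only requests always retains one credit, so a response can still be delivered to the buffer.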
The above discussion is meant to be illustrative of the principles and various embodiments of the present invention. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. For example, the shared buffer space may be extended to include three or more buffer sources. It is intended that the following claims be interpreted to embrace all such variations and modifications.
Number | Name | Date | Kind |
---|---|---|---|
5261066 | Jouppi et al. | Nov 1993 | A |
5317718 | Jouppi | May 1994 | A |
5446734 | Goldstein | Aug 1995 | A |
5758183 | Scales | May 1998 | A |
5761729 | Scales | Jun 1998 | A |
5787480 | Scales et al. | Jul 1998 | A |
5802585 | Scales et al. | Sep 1998 | A |
5809450 | Chrysos et al. | Sep 1998 | A |
5875151 | Mick | Feb 1999 | A |
5890201 | McLellan et al. | Mar 1999 | A |
5893931 | Peng et al. | Apr 1999 | A |
5898671 | Hunt et al. | Apr 1999 | A |
5918250 | Hammond | Jun 1999 | A |
5918251 | Yamada et al. | Jun 1999 | A |
5923872 | Chrysos et al. | Jul 1999 | A |
5950228 | Scales et al. | Sep 1999 | A |
5958019 | Hagersten et al. | Sep 1999 | A |
5964867 | Anderson et al. | Oct 1999 | A |
5982771 | Caldara et al. | Nov 1999 | A |
5983325 | Lewchuk | Nov 1999 | A |
5999518 | Nattkemper et al. | Dec 1999 | A |
6000044 | Chrysos et al. | Dec 1999 | A |
6044406 | Barkey et al. | Mar 2000 | A |
6070227 | Rokicki | May 2000 | A |
6078565 | Ben-Michael et al. | Jun 2000 | A |
6085300 | Sunaga et al. | Jul 2000 | A |
6104727 | Moura et al. | Aug 2000 | A |
6115748 | Hauser et al. | Sep 2000 | A |
6256674 | Manning et al. | Jul 2001 | B1 |
6347337 | Shah et al. | Feb 2002 | B1 |
6359884 | Vincent | Mar 2002 | B1 |
6426957 | Hauser et al. | Jul 2002 | B1 |
6452903 | Peck et al. | Sep 2002 | B1 |
6493776 | Courtright et al. | Dec 2002 | B1 |
6515963 | Bechtolsheim et al. | Feb 2003 | B1 |
6532501 | McCracken | Mar 2003 | B1 |
6546453 | Kessler et al. | Apr 2003 | B1 |
6546465 | Bertone | Apr 2003 | B1 |
6567900 | Kessler | May 2003 | B1 |
6591349 | Steinman et al. | Jul 2003 | B1 |
6594701 | Forin | Jul 2003 | B1 |
6601084 | Bhaskaran et al. | Jul 2003 | B1 |
6622225 | Kessler et al. | Sep 2003 | B1 |
6633960 | Kessler et al. | Oct 2003 | B1 |
6636955 | Kessler et al. | Oct 2003 | B1 |
6646986 | Beshai | Nov 2003 | B1 |
6654858 | Asher et al. | Nov 2003 | B1 |
6662265 | Kessler et al. | Dec 2003 | B1 |
6662319 | Webb, Jr. et al. | Dec 2003 | B1 |
6668335 | Breach et al. | Dec 2003 | B1 |
6671822 | Asher et al. | Dec 2003 | B1 |
6674722 | Tiainen et al. | Jan 2004 | B1 |
6678840 | Kessler et al. | Jan 2004 | B1 |
6681295 | Root et al. | Jan 2004 | B1 |
6704817 | Steinman et al. | Mar 2004 | B1 |
6715008 | Shimizu | Mar 2004 | B2 |
6715057 | Kessler et al. | Mar 2004 | B1 |
6735174 | Hefty et al. | May 2004 | B1 |
6738836 | Kessler et al. | May 2004 | B1 |
6751698 | Deneroff et al. | Jun 2004 | B1 |
6751721 | Webb, Jr. et al. | Jun 2004 | B1 |
6754739 | Kessler et al. | Jun 2004 | B1 |
6779142 | Bhavsar et al. | Aug 2004 | B1 |
6961781 | Mukherjee et al. | Nov 2005 | B1 |
6992984 | Gu | Jan 2006 | B1 |
7099913 | Bertone et al. | Aug 2006 | B1 |