Claims
- 1. A computer system, comprising a plurality of processor clusters, each cluster including a plurality of local nodes and a cache coherence controller interconnected by a local point-to-point architecture, the computer system further comprising memory corresponding to a global memory space, each cluster corresponding to a contiguous portion of the global memory space, selected ones of the plurality of local nodes in each cluster having a memory controller associated therewith, each memory controller in each cluster being responsible for a memory range within the corresponding contiguous portion of the global memory space, the cache coherence controller in each cluster having a cache coherence directory associated therewith, entries in the cache coherence directory in each cluster corresponding to memory lines within the corresponding contiguous portion of the global memory space which are cached in remote clusters, the cache coherence controller being operable to initiate an eviction of a first one of the entries corresponding to an unmodified copy of a first memory line by sending a request to write to the first memory line to a first one of the memory controllers corresponding to the first memory line.
- 2. The computer system of claim 1 wherein the cache coherence directory is operable to designate the first entry to be evicted and maintain the first entry therein at least until the first memory controller allows the eviction to proceed.
- 3. The computer system of claim 1 wherein the cache coherence directory includes an eviction buffer, the cache coherence directory being operable to designate the first entry by placing the first entry in the eviction buffer.
- 4. The computer system of claim 3 wherein the cache coherence directory is further operable to invalidate the first entry in the eviction buffer in response to a communication from the first memory controller corresponding to the eviction.
- 5. The computer system of claim 1 wherein the first memory controller is operable to generate a plurality of invalidating probes in response to which all copies of the first memory line in the cache memories are invalidated.
- 6. The computer system of claim 5 wherein the cache coherence controller is operable to enable interaction by the associated processing nodes with processing nodes in others of the clusters in accordance with the associated cache coherence directory.
- 7. The computer system of claim 6 wherein the cache coherence controller includes the cache coherence directory.
- 8. The computer system of claim 6 wherein the cache coherence controller is operable using the cache coherence directory to forward the invalidating probes only to clusters having at least one copy of the first memory line in the associated cache memories.
- 9. A cache coherence controller for use in a computer system comprising a plurality of processor clusters, each cluster including a plurality of local nodes and an instance of the cache coherence controller interconnected by a local point-to-point architecture, the computer system further comprising memory corresponding to a global memory space, each cluster corresponding to a contiguous portion of the global memory space, selected ones of the plurality of local nodes in each cluster having a memory controller associated therewith, each memory controller in each cluster being responsible for a memory range within the corresponding contiguous portion of the global memory space, the cache coherence controller including a cache coherence directory, entries in the cache coherence directory in each cluster corresponding to memory lines within the corresponding contiguous portion of the global memory space which are cached in remote clusters, the cache coherence controller being operable to initiate an eviction of a first one of the entries corresponding to an unmodified copy of a first memory line by sending a request to write to the first memory line to a first one of the memory controllers corresponding to the first memory line.
- 10. An integrated circuit comprising the cache coherence controller of claim 9.
- 11. The integrated circuit of claim 10 wherein the integrated circuit comprises an application-specific integrated circuit.
- 12. At least one computer-readable medium having data structures stored therein representative of the cache coherence controller of claim 9.
- 13. The at least one computer-readable medium of claim 12 wherein the data structures comprise a simulatable representation of the cache coherence controller.
- 14. The at least one computer-readable medium of claim 13 wherein the simulatable representation comprises a netlist.
- 15. The at least one computer-readable medium of claim 12 wherein the data structures comprise a code description of the cache coherence controller.
- 16. The at least one computer-readable medium of claim 15 wherein the code description corresponds to a hardware description language.
- 17. A set of semiconductor processing masks representative of at least a portion of the cache coherence controller of claim 9.
- 18. A computer implemented method for evicting entries in a cache coherence directory, the cache coherence directory being associated with a computer system comprising a plurality of processor clusters, each cluster including a plurality of local nodes and a cache coherence controller interconnected by a local point-to-point architecture, the computer system further comprising memory corresponding to a global memory space, each cluster corresponding to a contiguous portion of the global memory space, selected ones of the plurality of local nodes in each cluster having a memory controller associated therewith, each memory controller in each cluster being responsible for a memory range within the corresponding contiguous portion of the global memory space, the cache coherence controller in a first cluster having the cache coherence directory associated therewith, entries in the cache coherence directory corresponding to memory lines within the contiguous portion of the global memory space corresponding to the first cluster which are cached in remote clusters, the method comprising:
determining that a first one of the entries in the cache coherence directory should be evicted, the first entry corresponding to an unmodified copy of a first memory line;
generating a request to write to the first memory line, the request being directed to a first one of the memory controllers corresponding to the first memory line;
in response to the request, generating a plurality of invalidating probes to all of the local nodes and the cache coherence controller in the first cluster;
invalidating the first entry in the cache coherence directory in response to a first one of the invalidating probes received by the cache coherence controller;
forwarding the first invalidating probe to the remote clusters having at least one copy of the first memory line; and
invalidating all copies of the first memory line in the cache memories.
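The eviction flow recited in claim 18, together with the eviction buffer of claims 3 and 4, can be illustrated with a small software model. This is only a sketch under stated assumptions and not the claimed hardware: the class names (`Cache`, `CoherenceController`, `MemoryController`), the `demo` function, and the memory-line addresses are hypothetical. Probes are modeled as direct method calls, and the write request is modeled purely as a trigger without any data payload.

```python
from dataclasses import dataclass, field


@dataclass
class Cache:
    """Hypothetical stand-in for the caches of a local node or a remote cluster."""
    name: str
    lines: set = field(default_factory=set)

    def invalidate(self, line):
        self.lines.discard(line)


@dataclass
class CoherenceController:
    """Tracks which remote clusters cache each memory line homed in this cluster."""
    remote_clusters: dict                                 # cluster id -> Cache
    directory: dict = field(default_factory=dict)         # line -> set of cluster ids
    eviction_buffer: dict = field(default_factory=dict)   # entries designated for eviction

    def initiate_eviction(self, line, memory_controller):
        # Designate the entry by moving it to the eviction buffer; it is
        # retained there until the memory controller's probe arrives.
        self.eviction_buffer[line] = self.directory.pop(line)
        # Start the eviction with a write request to the home memory controller.
        memory_controller.handle_write_request(line)

    def on_invalidating_probe(self, line):
        # Drop the buffered entry and forward the probe only to the remote
        # clusters that the entry recorded as holding a copy.
        sharers = sorted(self.eviction_buffer.pop(line, set()))
        for cluster_id in sharers:
            self.remote_clusters[cluster_id].invalidate(line)
        return sharers


@dataclass
class MemoryController:
    local_nodes: list
    controller: CoherenceController
    probe_log: list = field(default_factory=list)

    def handle_write_request(self, line):
        # Probe every local node, then the cache coherence controller.
        for node in self.local_nodes:
            node.invalidate(line)
            self.probe_log.append((node.name, line))
        forwarded = self.controller.on_invalidating_probe(line)
        self.probe_log.append(("controller", line, forwarded))


def demo():
    # Assumed example state: line 0x40 is cached by clusters 1 and 2.
    remotes = {
        1: Cache("cluster1", {0x40}),
        2: Cache("cluster2", {0x40, 0x80}),
        3: Cache("cluster3", {0x80}),
    }
    ccc = CoherenceController(remotes, directory={0x40: {1, 2}, 0x80: {2, 3}})
    nodes = [Cache("node0", {0x40}), Cache("node1")]
    mc = MemoryController(nodes, ccc)
    ccc.initiate_eviction(0x40, mc)
    return remotes, ccc, nodes, mc
```

In this model, calling `demo()` evicts the entry for line `0x40`: the local node and clusters 1 and 2 drop their copies, cluster 3 receives no probe, and the entry for line `0x80` is left untouched in the directory.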
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] The subject matter described in the present application is related to U.S. patent application Ser. No. 10/___,___ for METHODS AND APPARATUS FOR MANAGING PROBE REQUESTS filed on ______, 2002 (Attorney Docket No. NWISP024), the entire disclosure of which is incorporated herein by reference for all purposes.