Claims
- 1. A multi-processor unit comprising:
a set of processing clusters; a data ring coupled to each processing cluster in said set of processing clusters; a snoop controller adapted to process memory requests from processing clusters in said set of processing clusters; and a snoop ring coupled to said snoop controller and each processing cluster in said set of processing clusters.
- 2. The multi-processor unit of claim 1, including:
a set of point-to-point links, each coupling a respective processing cluster in said set of processing clusters to said snoop controller, wherein each processing cluster in said set of processing clusters provides said memory requests to said snoop controller via a point-to-point link in said set of point-to-point links.
- 3. The multi-processor unit of claim 1, wherein each processing cluster in said set of processing clusters includes:
a set of compute engines; and a set of cache memory coupled to said set of compute engines, wherein said set of cache memory includes:
a set of first tier cache memory coupled to said set of compute engines; and a second tier cache memory coupled to each first tier cache memory in said set of first tier cache memory, said data ring, and said snoop ring.
- 4. The multi-processor unit of claim 3, wherein said second tier cache memory includes:
a first set of request queues coupled to said set of first tier cache memory; and a snoop request queue coupled to said snoop ring.
- 5. The multi-processor unit of claim 4, wherein said second tier cache memory includes:
request selection logic coupled to said first set of request queues, said snoop request queue, and a fill request from said data ring.
- 6. The multi-processor unit of claim 4, wherein said first set of request queues maintains a set of memory request entries corresponding to memory requests from said set of first tier cache memory, wherein a memory request entry in said set of memory request entries includes:
a dependency field identifying memory access operations for said second tier cache memory to perform before servicing a memory access operation identified by said memory request entry.
- 7. The multi-processor unit of claim 4, wherein said first set of request queues maintains a set of memory request entries corresponding to memory requests from said set of first tier cache memory, wherein a memory request entry in said set of memory request entries includes:
a sleep field indicating whether said memory request entry has been placed in a sleep mode, wherein said sleep mode prevents said memory request entry from being serviced by said second tier cache memory.
- 8. The multi-processor unit of claim 4, wherein said second tier cache memory services memory request entries in said first set of request queues based on a set of programmable criteria.
- 9. The multi-processor unit of claim 8, wherein said set of programmable criteria includes a time period a memory request entry has been pending in said first set of request queues and a type of operation specified by a memory request entry in said first set of request queues.
- 10. The multi-processor unit of claim 4, wherein said snoop request queue maintains a set of snoop request entries corresponding to requests made by said snoop controller, wherein a snoop request entry in said set of snoop request entries includes:
a cluster field identifying a processing cluster that issued a first memory request corresponding to said snoop request entry, a memory request field identifying said first memory request, an identification field identifying said snoop request entry, an address identifying a requested memory location, and an opcode field identifying a type of snoop request.
- 11. The multi-processor unit of claim 10, wherein said type of snoop request is a snoop request type from a set of snoop request types consisting of:
an own snoop request instructing a processing cluster to transfer exclusive ownership of said memory location and transfer contents of said memory location to another processing cluster, a share snoop request instructing a processing cluster to transfer shared ownership of said memory location and transfer contents of said memory location to another processing cluster, and a kill snoop request instructing a processing cluster to release ownership of a memory location without performing any data transfers.
- 12. The multi-processor unit of claim 3, wherein said second tier cache memory includes an external request queue coupled to said snoop controller.
- 13. The multi-processor unit of claim 12, wherein said external request queue maintains entries requesting said snoop controller to issue a snoop request to said set of processing clusters via said snoop ring.
- 14. The multi-processor unit of claim 13, wherein said snoop request is a snoop request from a set of snoop requests consisting of:
an own snoop request instructing a processing cluster to transfer exclusive ownership of a first memory location and transfer contents of said first memory location to another processing cluster, a share snoop request instructing a processing cluster to transfer shared ownership of a second memory location and transfer contents of said second memory location to another processing cluster, and a kill snoop request instructing a processing cluster to release ownership of a third memory location without performing any data transfers.
- 15. The multi-processor unit of claim 3, wherein said set of first tier cache memory includes:
a first tier data cache coupled to a compute engine in said set of compute engines; and a first tier instruction cache coupled to said compute engine.
- 16. The multi-processor unit of claim 15, wherein said first tier data cache includes a fill buffer, wherein said fill buffer maintains a list of memory requests from said compute engine for load operations submitted to said second tier cache memory.
- 17. The multi-processor unit of claim 16, wherein said first tier data cache includes a data array and does not store in said data array data returned by said second tier cache memory in response to a cacheable load operation listed in said fill buffer if said compute engine issues a subsequent store operation to a location specified for said cacheable load operation.
- 18. The multi-processor unit of claim 1, further including:
external bus logic coupling a main memory to said data ring, wherein said external bus logic is coupled to said snoop controller to receive instructions.
- 19. The multi-processor unit of claim 1, wherein said set of processing clusters, said data ring, said snoop controller, and said snoop ring are formed together on a single integrated circuit.
- 20. The multi-processor unit of claim 1, wherein said multi-processor unit is formed on a single integrated circuit.
- 21. An apparatus comprising:
a set of cache memory systems, each cache memory system including:
a set of first tier cache memory, and a second tier cache memory coupled to said set of first tier cache memory; a data ring coupled to each second tier cache memory in said set of cache memory systems; a snoop controller; and a snoop ring coupled to said snoop controller and each second tier cache memory in said set of cache memory systems.
- 22. The apparatus of claim 21, wherein said second tier cache memory includes:
a first set of request queues coupled to said set of first tier cache memory; and a snoop request queue coupled to said snoop ring.
- 23. The apparatus of claim 22, wherein said second tier cache memory includes:
request selection logic coupled to said first set of request queues, said snoop request queue, and a fill request from said data ring.
- 24. The apparatus of claim 22, wherein said first set of request queues maintains a set of memory request entries corresponding to memory requests from said set of first tier cache memory, wherein a memory request entry in said set of memory request entries includes:
a dependency field identifying memory access operations for said second tier cache memory to perform before servicing a memory access operation identified by said memory request entry.
- 25. The apparatus of claim 22, wherein said second tier cache memory services memory request entries in said first set of request queues based on a set of programmable criteria, wherein said set of programmable criteria includes a time period a memory request entry has been pending in said first set of request queues and a type of operation specified by a memory request entry in said first set of request queues.
- 26. The apparatus of claim 22, wherein said snoop request queue maintains a set of snoop request entries corresponding to requests made by said snoop controller, wherein a snoop request entry in said set of snoop request entries includes:
a cluster field identifying a processing cluster that issued a memory request corresponding to said snoop request entry, a memory request field identifying said memory request, an identification field identifying said snoop request entry, an address identifying a requested memory location, and an opcode field identifying a type of snoop request.
- 27. The apparatus of claim 26, wherein said type of snoop request is a snoop request type from a set of snoop request types consisting of:
an own snoop request instructing a second tier cache memory to transfer exclusive ownership of a memory location and transfer contents of said memory location to another second tier cache memory, a share snoop request instructing a second tier cache memory to transfer shared ownership of a memory location and transfer contents of said memory location to another second tier cache memory, and a kill snoop request instructing a second tier cache memory to release ownership of a memory location without performing any data transfers.
- 28. The apparatus of claim 21, wherein said second tier cache memory includes an external request queue coupled to said snoop controller.
- 29. The apparatus of claim 28, wherein said external request queue maintains entries requesting said snoop controller to issue a snoop request to second tier cache memories in said set of cache memory systems via said snoop ring.
- 30. The apparatus of claim 21, wherein said set of first tier cache memory includes:
a first tier data cache coupled to a compute engine, wherein said first tier data cache includes a fill buffer, wherein said fill buffer maintains a list of memory requests for load operations submitted to said second tier cache memory; and a first tier instruction cache coupled to said compute engine.
- 31. The apparatus of claim 30, wherein said first tier data cache includes a data array and does not store in said data array data returned by said second tier cache memory in response to a cacheable load operation listed in said fill buffer if said compute engine issues a subsequent store operation to a location specified for said cacheable load operation.
- 32. The apparatus of claim 21, wherein said set of cache memory systems, said data ring, said snoop controller, and said snoop ring are formed together on a single integrated circuit.
- 33. A multi-processor system comprising:
a main memory; a set of processing clusters; a data ring coupled to each processing cluster in said set of processing clusters; buffer logic coupling said data ring and said main memory; a snoop controller adapted to process memory requests from processing clusters in said set of processing clusters; and a snoop ring coupled to said snoop controller and each processing cluster in said set of processing clusters, wherein each processing cluster in said set of processing clusters includes
a set of compute engines, a set of first tier cache memory coupled to said set of compute engines, and a second tier cache memory coupled to each first tier cache memory in said set of first tier cache memory, wherein said second tier cache memory includes:
a first set of request queues coupled to said set of first tier cache memory, a snoop request queue coupled to said snoop ring, wherein said snoop request queue maintains a set of snoop request entries corresponding to requests made by said snoop controller, and an external request queue coupled to said snoop controller and adapted to maintain entries requesting said snoop controller to issue a snoop request to said set of processing clusters via said snoop ring.
- 34. The multi-processor system of claim 33, including:
a set of point-to-point links, each coupling a respective processing cluster in said set of processing clusters to said snoop controller, wherein each processing cluster in said set of processing clusters provides said memory requests to said snoop controller via a point-to-point link in said set of point-to-point links.
- 35. The multi-processor system of claim 33, wherein said set of processing clusters, said data ring, said snoop controller, and said snoop ring are formed together on a single integrated circuit.
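The snoop request entry recited in claims 10 and 26 (cluster field, memory request field, identification field, address, and opcode) and the three snoop request types of claims 11 and 27 can be illustrated with a small data-structure sketch. This is a hypothetical model for exposition only, not part of the claimed invention; all names and example values are assumptions.

```python
from dataclasses import dataclass
from enum import Enum

class SnoopOp(Enum):
    """Snoop request types per claim 11."""
    OWN = 1    # transfer exclusive ownership and contents to another cluster
    SHARE = 2  # transfer shared ownership and contents to another cluster
    KILL = 3   # release ownership without any data transfer

@dataclass
class SnoopRequestEntry:
    """One entry in a second tier cache's snoop request queue (claim 10)."""
    cluster: int         # cluster field: cluster that issued the originating memory request
    memory_request: int  # memory request field: identifies that originating request
    entry_id: int        # identification field: identifies this snoop request entry
    address: int         # requested memory location
    opcode: SnoopOp      # opcode field: type of snoop request

# Example: the snoop controller places an "own" request on the snoop ring
# on behalf of processing cluster 2's memory request number 7.
entry = SnoopRequestEntry(cluster=2, memory_request=7, entry_id=0,
                          address=0x1000, opcode=SnoopOp.OWN)
```

In this sketch each second tier cache observing the snoop ring would match `entry.address` against its own tags and, for `SnoopOp.OWN` or `SnoopOp.SHARE`, forward the data on the data ring toward `entry.cluster`.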
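Likewise, the request-queue behavior of claims 6 through 9 (dependency field, sleep field, and servicing based on programmable criteria such as pending time and operation type) can be sketched as a simple selection routine. This is an illustrative assumption of one possible policy, not the claimed implementation; the field names and the oldest-first tie-break are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class MemRequestEntry:
    """A memory request entry in the first set of request queues (claims 6-7)."""
    op: str                                    # operation type, e.g. "load" or "store"
    address: int                               # target memory location
    age: int = 0                               # cycles this entry has been pending
    deps: list = field(default_factory=list)   # dependency field: operations to perform first (claim 6)
    asleep: bool = False                       # sleep field: sleep mode blocks servicing (claim 7)

def select_entry(queue, min_age_by_op):
    """Pick the next entry to service under programmable criteria (claims 8-9):
    skip entries in sleep mode or with unresolved dependencies, require the
    per-operation-type minimum pending time, and prefer the oldest candidate."""
    candidates = [e for e in queue
                  if not e.asleep and not e.deps
                  and e.age >= min_age_by_op.get(e.op, 0)]
    return max(candidates, key=lambda e: e.age, default=None)

# Example: sleeping and dependent entries are passed over.
q = [MemRequestEntry("load", 0x10, age=5, asleep=True),
     MemRequestEntry("store", 0x20, age=8, deps=[1]),
     MemRequestEntry("load", 0x30, age=3)]
chosen = select_entry(q, {"load": 2, "store": 10})
```

Here `chosen` is the third entry: the first is in sleep mode, the second has an unresolved dependency, and the third meets the assumed two-cycle minimum age for loads.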
Parent Case Info
[0001] This application is a continuation of, and claims priority under 35 U.S.C. §120 from, U.S. patent application Ser. No. 09/900,481, entitled “Multi-Processor System,” filed on Jul. 6, 2001, which is incorporated herein by reference.
Continuations (1)

|        | Number   | Date     | Country |
|--------|----------|----------|---------|
| Parent | 09900481 | Jul 2001 | US      |
| Child  | 10105993 | Mar 2002 | US      |