Claims
- 1. A method for performing block copy operations from a remote processing node to a local processing node in a multiprocessor computer system, comprising:
initiating a block copy write to at least one coherency unit within a destination block by a processor, wherein said processor is located within said local processing node;
detecting said block copy write within said local processing node;
generating a read request upon detection of said block copy write, wherein said read request identifies a corresponding coherency unit within a source block located by said remote processing node;
transmitting said read request to said remote processing node;
receiving data from said corresponding coherency unit into said local processing node; and
storing said data into said coherency unit within said destination block.
- 2. The method as recited in claim 1 wherein said generating includes translating a first address provided with said block copy write into a second address identifying said corresponding coherency unit.
- 3. The method as recited in claim 2 further comprising creating a translation from said first address to said second address prior to said generating.
- 4. The method as recited in claim 2 wherein said first address comprises a local physical address having a specific coding of a plurality of most significant bits of said local physical address.
- 5. The method as recited in claim 4 further comprising creating a second translation within a memory management unit included within said processor, wherein said second translation associates said local physical address with a virtual address formed via execution of said block copy write by said processor.
- 6. The method as recited in claim 4 further comprising accessing said data using said local physical address without said specific coding of said plurality of most significant bits.
- 7. The method as recited in claim 4 wherein said block copy write comprises a write stream instruction.
- 8. The method as recited in claim 2 wherein said second address comprises a global address.
- 9. The method as recited in claim 1 wherein said read request is a non-uniform memory architecture request.
- 10. The method as recited in claim 9 wherein said read request is performed by said local processing node regardless of a coherency state of said coherency unit within said local processing node.
- 11. The method as recited in claim 1 wherein said data is received from a third processing node different than said remote processing node and said local processing node.
- 12. An apparatus for performing block copy operations, comprising:
a processor including a memory management unit configured to translate a virtual address of a memory operation to a local physical address, wherein a block copy operation is specified if said local physical address resides in a specific predefined address space; and
a system interface coupled to receive said block copy operation from said processor, wherein said system interface is configured to perform a translation from said local physical address to a global address and is further configured to transmit a read request including said global address via a network in response to said block copy operation, and wherein said system interface includes a translation storage for storing information for performing said translation from said local physical address to said global address.
- 13. The apparatus as recited in claim 12 wherein said block copy operation comprises a write operation.
- 14. The apparatus as recited in claim 13 wherein said system interface is configured to discard write data corresponding to said write operation.
- 15. The apparatus as recited in claim 13 wherein said write operation comprises a write stream operation.
- 16. The apparatus as recited in claim 12 wherein said read request solicits data for a coherency unit identified by said local physical address of said block copy operation.
- 17. The apparatus as recited in claim 16 further comprising a memory coupled to said system interface, wherein said system interface stores said data into said memory at a time when said data is received from said network.
- 18. The apparatus as recited in claim 17 wherein said system interface stores said data into said memory at a memory location within said destination block.
- 19. A computer system comprising:
a first processing node including a request agent configured to perform a read request for a coherency unit upon execution of a block copy write to said coherency unit by a processor within said first processing node;
a second processing node including a home agent, wherein said second processing node is coupled to receive said read request from said first processing node, and wherein said second processing node is a home node for said coherency unit, and wherein said home agent is configured to identify an owner of said coherency unit upon receipt of said read request and further configured to transmit a demand; and
a third processing node including a slave agent, wherein said third processing node is coupled to receive said demand from said second processing node, and wherein said slave agent is configured to convey data corresponding to said coherency unit to said first processing node upon receipt of said demand.
- 20. The computer system as recited in claim 19 further comprising a network interconnecting said first processing node, said second processing node, and said third processing node.
- 21. The computer system as recited in claim 19 wherein said first processing node is further configured to transmit a completion to said second processing node upon receipt of said data from said third processing node.
- 22. The computer system as recited in claim 19 wherein said block copy write comprises a write to an address space which identifies said write as said block copy write.
- 23. An apparatus configured to perform efficient block copy operations, comprising:
a processor configured to initiate a block copy write to at least one coherency unit within a destination block, wherein said destination block is located within a local processing node which includes said processor; and
a system interface configured to detect said block copy write within said local processing node and to transmit a read request for a corresponding coherency unit within a source block located within a remote processing node, and wherein said system interface transmits said read request upon detection of said block copy write, and wherein said system interface is further configured to receive data from said corresponding coherency unit of said source block and to store said data into said coherency unit within said destination block.
- 24. The apparatus as recited in claim 23 wherein said block copy write is identified via a particular encoding upon a bus within said local processing node.
- 25. The apparatus as recited in claim 24 wherein said particular encoding includes a plurality of most significant bits of a local physical address corresponding to said block copy write, and wherein certain ones of said plurality of most significant bits, when encoded in a predetermined manner, identify a write transaction as said block copy write.
- 26. The apparatus as recited in claim 25 wherein said write transaction comprises a write stream transaction.
- 27. A method for performing block copies, comprising:
initiating a block copy command via a processor, wherein said block copy command identifies a first coherency unit within a source block and a second coherency unit within a destination block;
transmitting data corresponding to said first coherency unit from a first processing node storing said source block to a second processing node storing said destination block; and
storing said data into said second coherency unit.
- 28. An apparatus for performing block copies comprising:
a processor configured to execute a block copy command identifying a first coherency unit within a source block and a second coherency unit within a destination block; and
a system interface coupled to receive said block copy command, wherein said system interface is configured to transfer data from said first coherency unit to said second coherency unit in response to said block copy command.
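Illustrative sketch (not part of the claims): the flow recited in claims 1, 2, 4, 14 and 19 can be modeled, under loudly stated assumptions, as a small Python simulation. The address width, the number of most significant bits, the coding value `BLOCK_COPY_TAG`, and every class and function name (`Node`, `SystemInterface`, `block_copy_address`, and so on) are hypothetical and chosen only for illustration. The sketch also simplifies the three-node exchange: the owner's data is returned through the home node's call, whereas claim 19 recites the slave agent conveying data to the first processing node.

```python
from dataclasses import dataclass, field

# All constants are illustrative assumptions, not values taken from the claims.
LPA_BITS = 40                    # assumed width of a local physical address
TAG_BITS = 3                     # assumed count of most significant bits used as a coding
BLOCK_COPY_TAG = 0b101           # assumed coding that marks a block copy write
TAG_SHIFT = LPA_BITS - TAG_BITS
LOW_MASK = (1 << TAG_SHIFT) - 1  # removes the coding from an address


def is_block_copy_write(lpa: int) -> bool:
    """Detect a block copy write from the coding of the most significant bits."""
    return (lpa >> TAG_SHIFT) == BLOCK_COPY_TAG


def block_copy_address(dest_lpa: int) -> int:
    """Form the specially coded address a processor would write to."""
    return (BLOCK_COPY_TAG << TAG_SHIFT) | (dest_lpa & LOW_MASK)


@dataclass
class Node:
    """A processing node; it can act as a home node and as an owner (slave)."""
    name: str
    memory: dict = field(default_factory=dict)     # address -> data
    directory: dict = field(default_factory=dict)  # home role: global address -> owner

    def home_read(self, ga: int) -> bytes:
        """Home agent: identify the owner of the unit and send it a demand."""
        owner = self.directory.get(ga, self)  # the home itself owns unlisted units
        return owner.slave_supply(ga)

    def slave_supply(self, ga: int) -> bytes:
        """Slave agent: supply the data of the requested coherency unit."""
        return self.memory[ga]


@dataclass
class SystemInterface:
    """Local system interface holding the local-to-global translation storage."""
    local: Node
    home: Node
    translations: dict = field(default_factory=dict)  # destination LPA -> source GA

    def create_translation(self, dest_lpa: int, source_ga: int) -> None:
        self.translations[dest_lpa & LOW_MASK] = source_ga

    def write(self, lpa: int, data: bytes) -> None:
        if not is_block_copy_write(lpa):
            self.local.memory[lpa] = data      # ordinary write: store the data
            return
        dest = lpa & LOW_MASK                  # destination unit, coding removed
        source_ga = self.translations[dest]    # first address -> second address
        # The processor's write data is discarded; the unit is filled from the source.
        self.local.memory[dest] = self.home.home_read(source_ga)


# Usage: copy two coherency units of a source block into a local destination block.
owner = Node("owner", memory={0x9000: b"unit0"})
home = Node("home", memory={0x9040: b"unit1"}, directory={0x9000: owner})
si = SystemInterface(local=Node("local"), home=home)
for offset in (0x00, 0x40):
    si.create_translation(0x2000 + offset, 0x9000 + offset)
    si.write(block_copy_address(0x2000 + offset), b"discarded")
```

In this model the destination units at `0x2000` and `0x2040` end up holding the source data, and they are afterwards accessed through the same addresses without the coding bits, in the manner of claim 6.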
CROSS REFERENCE TO RELATED PATENT APPLICATIONS
[0001] This patent application is related to the following copending, commonly assigned patent applications, the disclosures of which are incorporated herein by reference in their entirety:
[0002] 1. “Extending The Coherence Domain Beyond A Computer System Bus” by Hagersten et al., filed concurrently herewith. (Reference Number P990)
[0003] 2. “Method And Apparatus Optimizing Global Data Replies In A Computer System” by Hagersten, filed concurrently herewith. (Reference Number P991)
[0004] 3. “Method And Apparatus Providing Short Latency Round-Robin Arbitration For Access To A Shared Resource” by Hagersten et al., filed concurrently herewith. (Reference Number P992)
[0005] 4. “Implementing Snooping On A Split-Transaction Computer System Bus” by Singhal et al., filed concurrently herewith. (Reference Number P993)
[0006] 5. “Split Transaction Snooping Bus Protocol” by Singhal et al., filed concurrently herewith. (Reference Number P989)
[0007] 6. “Interconnection Subsystem For A Multiprocessor Computer System With A Small Number Of Processors Using A Switching Arrangement Of Limited Degree” by Heller et al., filed concurrently herewith. (Reference Number P1609)
[0008] 7. “System And Method For Performing Deadlock Free Message Transfer In Cyclic Multi-Hop Digital Computer Network” by Wade et al., filed concurrently herewith. (Reference Number P1572)
[0009] 8. “Synchronization System And Method For Plesiochronous Signaling” by Cassiday et al., filed concurrently herewith. (Reference Number P1593)
[0010] 9. “Methods And Apparatus For A Coherence Transformer For Connecting Computer System Coherence Domains” by Hagersten et al., filed concurrently herewith. (Reference Number P1519)
[0011] 10. “Methods And Apparatus For A Coherence Transformer With Limited Memory For Connecting Computer System Coherence Domains” by Hagersten et al., filed concurrently herewith. (Reference Number P1530)
[0012] 11. “Methods And Apparatus For Sharing Stored Data Objects In A Computer System” by Hagersten et al., filed concurrently herewith. (Reference Number P1463)
[0013] 12. “Methods And Apparatus For A Directory-Less Memory Access Protocol In A Distributed Shared Memory Computer System” by Hagersten et al., filed concurrently herewith. (Reference Number P1531)
[0014] 13. “Hybrid Memory Access Protocol In A Distributed Shared Memory Computer System” by Hagersten et al., filed concurrently herewith. (Reference Number P1550)
[0015] 14. “Methods And Apparatus For Substantially Memory-Less Coherence Transformer For Connecting Computer System Coherence Domains” by Hagersten et al., filed concurrently herewith. (Reference Number P1529)
[0016] 15. “A Multiprocessing System Including An Enhanced Blocking Mechanism For Read To Share Transactions In A NUMA Mode” by Hagersten, filed concurrently herewith. (Reference Number P1786)
[0017] 16. “Encoding Method For Directory State In Cache Coherent Distributed Shared Memory Systems” by Guzovskiy et al., filed concurrently herewith. (Reference Number P1520)
[0018] 17. “Software Use Of Address Translation Mechanism” by Nesheim et al., filed concurrently herewith. (Reference Number P1560)
[0019] 18. “Directory-Based, Shared-Memory, Scaleable Multiprocessor Computer System Having Deadlock-free Transaction Flow Sans Flow Control Protocol” by Lowenstein et al., filed concurrently herewith. (Reference Number P1561)
[0020] 19. “Maintaining A Sequential Stored Order (SSO) In A Non-SSO Machine” by Nesheim, filed concurrently herewith. (Reference Number P1562)
[0021] 20. “Node To Node Interrupt Mechanism In A Multiprocessor System” by Wong-Chan, filed concurrently herewith. (Reference Number P1587)
[0022] 21. “Deterministic Distributed Multicache Coherence Protocol” by Hagersten et al., filed Apr. 8, 1996, Ser. No. 08/630,703.
[0023] 22. “A Hybrid NUMA Coma Caching System And Methods For Selecting Between The Caching Modes” by Hagersten et al., filed Dec. 22, 1995, Ser. No. 08/577,283.
[0024] 23. “A Hybrid NUMA Coma Caching System And Methods For Selecting Between The Caching Modes” by Wood et al., filed Dec. 22, 1995, Ser. No. 08/575,787.
[0025] 24. “Flushing Of Cache Memory In A Computer System” by Hagersten et al., filed concurrently herewith. (Reference Number P1416)
[0026] 25. “Efficient Allocation Of Cache Memory Space In A Computer System” by Hagersten et al., filed concurrently herewith. (Reference Number P1576)
[0027] 26. “Efficient Selection Of Memory Storage Modes In A Computer System” by Hagersten et al., filed concurrently herewith. (Reference Number P1726)
[0028] 27. “Skip-level Write-through In A Multi-level Memory Of A Computer System” by Hagersten et al., filed concurrently herewith. (Reference Number P1736)
[0029] 28. “A Multiprocessing System Configured to Perform Efficient Write Operations” by Hagersten, filed concurrently herewith. (Reference Number P1500)
[0030] 29. “A Multiprocessing System Including An Apparatus For Optimizing Spin-Lock Operations” by Hagersten, filed concurrently herewith. (Reference Number P1525)
[0031] 30. “A Multiprocessing System Configured to Detect and Efficiently Provide for Migratory Data Access Patterns” by Hagersten et al., filed concurrently herewith. (Reference Number P1555)
[0032] 31. “A Multiprocessing System Configured to Store Coherency State within Multiple Subnodes of a Processing Node” by Hagersten, filed concurrently herewith. (Reference Number P1527)
[0033] 32. “A Multiprocessing System Configured to Perform Prefetching Operations” by Hagersten et al., filed concurrently herewith. (Reference Number P1571)
[0034] 33. “A Multiprocessing System Configured to Perform Synchronization Operations” by Hagersten et al., filed concurrently herewith. (Reference Number P1551)
[0035] 34. “A Multiprocessing System Having Coherency-Related Error Logging Capabilities” by Hagersten et al., filed concurrently herewith. (Reference Number P1719)
[0036] 35. “Multiprocessing System Employing A Three-Hop Communication Protocol” by Hagersten, filed concurrently herewith. (Reference Number P1785)
[0037] 36. “A Multiprocessing System Configured to Perform Software Initiated Prefetch Operations” by Hagersten, filed concurrently herewith. (Reference Number P1787)
[0038] 37. “A Multiprocessing Computer System Employing Local and Global Address Spaces and Multiple Access Modes” by Hagersten, filed concurrently herewith. (Reference Number P1784)
[0039] 38. “Multiprocessing System Employing A Coherency Protocol Including A Reply Count” by Hagersten et al., filed concurrently herewith. (Reference Number P1570)
Continuations (1)
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 08674269 | Jul 1996 | US |
| Child | 09216506 | Dec 1998 | US |