Claims
- 1. A global interrupt and barrier network comprising:
means for generating global interrupt and barrier signals for controlling global asynchronous operations performed by processing elements at selected processing nodes of a computing structure in accordance with a processing algorithm; and means interconnecting said processing nodes for communicating said global interrupt and barrier signals to said elements via low-latency paths, said signals respectively initiating interrupt and barrier operations at said processing nodes at times selected for optimizing performance of said processing algorithm.
- 2. The global interrupt and barrier network as claimed in claim 1, implemented in a scalable, massively-parallel computing structure comprising a plurality of interconnected processing nodes each including a respective processing element, said nodes being interconnected by a first network type.
- 3. The global interrupt and barrier network as claimed in claim 1, wherein the global signals are generated and communicated asynchronously.
- 4. The global interrupt and barrier network as claimed in claim 1, wherein the global signals are communicated synchronously.
- 5. The global interrupt and barrier network as claimed in claim 3, wherein said barrier signal generating means includes means at each node for receiving signals from one or more connecting nodes and for performing a global logical AND operation upon said signals, said logical AND operation functioning to enable a global barrier operation at said selected nodes, wherein said selected nodes are prevented from processing beyond a certain operating state until said global barrier operation has completed.
- 6. The global interrupt and barrier network as claimed in claim 5, wherein said interrupt signal generating means includes means at each node for receiving signals from one or more connecting nodes and for performing a global logical OR operation upon said signals, said logical OR operation functioning to enable global notification operations at said selected nodes.
- 7. The global interrupt and barrier network as claimed in claim 6, wherein said means interconnecting said processing nodes forms a tree network comprising low-latency paths between nodes, wherein global interrupt and barrier signals flow upstream from children to parent nodes of said tree and downstream from parent nodes to child nodes of said tree.
- 8. The global interrupt and barrier network as claimed in claim 6, wherein said computing structure further includes an independent global tree network for enabling high-speed global tree communications among said processing nodes, wherein said means interconnecting said processing nodes forms a tree network operating in parallel with said independent global tree network for efficiently initiating said global asynchronous operations in coordination with said processing algorithm.
- 9. The global interrupt and barrier network as claimed in claim 7, wherein each said node includes an associated routing device for routing signals to other nodes, said router including an upstream port for routing said global asynchronous signals in a respective upstream direction toward a parent node of said tree network and downstream ports for broadcasting global asynchronous signals in a respective downstream direction toward child nodes of said tree network,
wherein each node includes a logic circuit responsive to signals received from processing elements included at said node and from other nodes connected therewith for initiating interrupt and barrier operations at parent and child nodes of said node in said tree.
- 10. The global interrupt and barrier network as claimed in claim 9, wherein each said logic circuit includes means for setting an asynchronous global signal in response to a control signal from said processing element.
- 11. The global interrupt and barrier network as claimed in claim 10, wherein each node includes a detection device for receiving an asynchronous global signal, said detection device comprising means for synchronizing said received asynchronous signal with a system clock signal for avoiding false detection of noise in said tree network.
- 12. The global interrupt and barrier network as claimed in claim 11, wherein said means for detecting an asynchronous global signal is edge sensitive.
- 13. The global interrupt and barrier network as claimed in claim 1, further including means utilizing said global interrupt and barrier network for synchronizing a global clock over a whole computer structure.
- 14. A method for implementing global asynchronous operations in a computing structure comprising a plurality of nodes interconnected by at least one high-speed network, said method comprising:
a) generating global interrupt and barrier signals for controlling global asynchronous operations performed by processing elements at selected processing nodes of a computing structure in accordance with a processing algorithm; and b) providing another high-speed network interconnecting said processing nodes for communicating said global interrupt and barrier signals to said elements via low-latency paths, said signals respectively initiating interrupt and barrier operations at said processing nodes at times selected for optimizing performance of said processing algorithm.
- 15. The method as claimed in claim 14, implemented in a scalable, massively-parallel computing structure comprising a plurality of interconnected processing nodes each including a respective processing element, said nodes being interconnected by an independent torus network.
- 16. The method as claimed in claim 14, wherein the global signals are generated and communicated asynchronously.
- 17. The method as claimed in claim 14, wherein the global signals are communicated synchronously.
- 18. The method as claimed in claim 16, wherein said generating step includes the step implemented at each node of:
receiving signals from one or more connecting nodes; and performing a global logical AND operation upon said signals, said logical AND operation functioning to enable a global barrier operation at said selected nodes, wherein said selected nodes are prevented from processing beyond a certain operating state until said global barrier operation has completed.
- 19. The method as claimed in claim 18, wherein said generating step further includes the step implemented at each node of:
receiving signals from one or more connecting nodes; and performing a global logical OR operation upon said signals, said logical OR operation functioning to enable global notification operations at said selected nodes.
- 20. The method as claimed in claim 19, wherein said processing nodes are interconnected to form a global tree network comprising low-latency paths between nodes, wherein global interrupt and barrier signals flow upstream from children to parent nodes of said tree and downstream from parent nodes to child nodes of said tree.
- 21. The method as claimed in claim 19, wherein said computing structure further includes an independent global tree network for enabling high-speed global tree communications among said processing nodes, said interconnecting of said processing nodes forming a tree network operating in parallel with said independent global tree network for efficiently initiating said global asynchronous operations in coordination with said processing algorithm.
- 22. The method as claimed in claim 20, wherein each said node includes an associated routing device for routing signals to other nodes, said router including an upstream port for routing said global asynchronous signals in a respective upstream direction toward a parent node of said tree network and downstream ports for broadcasting global asynchronous signals in a respective downstream direction toward child nodes of said tree network, said method further comprising:
implementing logic at each node in response to signals received from processing elements included at said node and from other nodes connected therewith for initiating interrupt and barrier operations at parent and child nodes of said node in said tree.
- 23. The method as claimed in claim 22, wherein said logic implemented includes the step of setting an asynchronous global barrier or interrupt signal in response to a control signal from said processing element.
- 24. The method as claimed in claim 23, wherein each node implements the steps of:
detecting receipt of an asynchronous global signal; and, synchronizing said received asynchronous signal with a system clock signal for avoiding false detection of noise in said tree network.
- 25. The method as claimed in claim 24, wherein said logic implemented for detecting an asynchronous global signal is edge sensitive.
- 26. The method as claimed in claim 14, further including the step of utilizing said global interrupt and barrier network for synchronizing a global clock over a whole computer structure.
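By way of illustration only, and forming no part of the claims, the global logical AND and OR operations recited in claims 5 to 7 and 18 to 20 may be sketched in software as a reduction over a tree. This is a minimal sketch under stated assumptions: the `Node` class, its `barrier_ready` and `interrupt` fields, and the functions `upstream_and`, `upstream_or`, and `broadcast` are hypothetical names introduced here, and the recursion merely models the combining of signals that the claims describe as flowing between parent and child nodes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical tree node. The two flags stand in for the control
    signals that a processing element would set locally."""
    name: str
    barrier_ready: bool = False   # this node has reached the barrier
    interrupt: bool = False       # this node raises a global notification
    children: List["Node"] = field(default_factory=list)


def upstream_and(node: Node) -> bool:
    """Combine barrier signals toward the root (global logical AND):
    true only when every node in the subtree has reached the barrier."""
    return node.barrier_ready and all(upstream_and(c) for c in node.children)


def upstream_or(node: Node) -> bool:
    """Combine interrupt signals toward the root (global logical OR):
    true when any node in the subtree has raised an interrupt."""
    return node.interrupt or any(upstream_or(c) for c in node.children)


def broadcast(node: Node, value: bool, out: Dict[str, bool]) -> Dict[str, bool]:
    """Send the combined result from the root downstream to every node."""
    out[node.name] = value
    for child in node.children:
        broadcast(child, value, out)
    return out


if __name__ == "__main__":
    a, b = Node("a"), Node("b")
    root = Node("root", barrier_ready=True, children=[a, b])

    a.barrier_ready = True
    print(upstream_and(root))   # barrier still held: node b has not arrived

    b.barrier_ready = True
    print(broadcast(root, upstream_and(root), {}))  # barrier released everywhere

    b.interrupt = True
    print(upstream_or(root))    # a single interrupt is visible at the root
```

In this sketch no node may proceed past the barrier until the AND result at the root becomes true and is broadcast downstream, whereas a single asserted interrupt is sufficient to make the OR result true.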
CROSS-REFERENCE TO RELATED APPLICATION
[0001] The present invention claims the benefit of commonly-owned, co-pending U.S. Provisional Patent Application Serial No. 60/271,124 filed Feb. 24, 2001 entitled MASSIVELY PARALLEL SUPERCOMPUTER, the whole contents and disclosure of which is expressly incorporated by reference herein as if fully set forth herein. This patent application is additionally related to the following commonly-owned, co-pending United States Patent Applications filed on even date herewith, the entire contents and disclosure of each of which is expressly incorporated by reference herein as if fully set forth herein. U.S. patent application Ser. Nos. (YOR920020027US1, YOR920020044US1 (15270)), for “Class Networking Routing”; U.S. patent application Ser. No. (YOR920020028US1 (15271)), for “A Global Tree Network for Computing Structures”; U.S. patent application Ser. No. (YOR920020029US1 (15272)), for “Global Interrupt and Barrier Networks”; U.S. patent application Ser. No. (YOR920020030US1 (15273)), for “Optimized Scalable Network Switch”; U.S. patent application Ser. Nos. (YOR920020031US1, YOR920020032US1 (15258)), for “Arithmetic Functions in Torus and Tree Networks”; U.S. patent application Ser. Nos. (YOR920020033US1, YOR920020034US1 (15259)), for “Data Capture Technique for High Speed Signaling”; U.S. patent application Ser. No. (YOR920020035US1 (15260)), for “Managing Coherence Via Put/Get Windows”; U.S. patent application Ser. Nos. (YOR920020036US1, YOR920020037US1 (15261)), for “Low Latency Memory Access And Synchronization”; U.S. patent application Ser. No. (YOR920020038US1 (15276)), for “Twin-Tailed Fail-Over for Fileservers Maintaining Full Performance in the Presence of Failure”; U.S. patent application Ser. No. (YOR920020039US1 (15277)), for “Fault Isolation Through No-Overhead Link Level Checksums”; U.S. patent application Ser. No. (YOR920020040US1 (15278)), for “Ethernet Addressing Via Physical Location for Massively Parallel Systems”; U.S. patent application Ser. No. (YOR920020041US1 (15274)), for “Fault Tolerance in a Supercomputer Through Dynamic Repartitioning”; U.S. patent application Ser. No. (YOR920020042US1 (15279)), for “Checkpointing Filesystem”; U.S. patent application Ser. No. (YOR920020043US1 (15262)), for “Efficient Implementation of Multidimensional Fast Fourier Transform on a Distributed-Memory Parallel Multi-Node Computer”; U.S. patent application Ser. No. (YOR9-20010211US2 (15275)), for “A Novel Massively Parallel Supercomputer”; and U.S. patent application Ser. No. (YOR920020045US1 (15263)), for “Smart Fan Modules and System”.
PCT Information
Filing Document | Filing Date | Country | Kind
PCT/US02/05567 | 2/25/2002 | WO |