The present application generally relates to arbitrating a shared resource in a computing environment. More particularly, the present application relates to detecting and/or correcting soft error(s) in an arbitration logic device in a digital circuit while the arbitration logic device continues to work correctly under the soft error(s).
In a digital circuit, it is common for multiple modules to compete for a single shared resource (e.g., bus, cache memory, etc.). Thus, an arbitration logic device is often used to resolve shared resource conflicts. An arbitration logic device selects a winner requestor among the multiple requestors (i.e., the competing multiple modules). Then, the winning requestor accesses the shared resource. In very large scale integrated (VLSI) circuits, a large number of requestors compete with each other. For example, there may be hundreds or even thousands of candidate requestors in such a competition.
An arbitration logic device memorizes the state of each requestor (e.g., whether each requestor has a pending request) by storing that state in storage elements, e.g., latches, registers, flip-flops, etc. However, these storage elements can flip their values due to soft errors. A soft error is an error on data stored in a computing system that does not damage the hardware of the computing system but corrupts the data. Because of the trend toward high density and low power consumption in semiconductor design and manufacturing technology (e.g., 20-nm CMOS technology), soft errors may occur more frequently in recent VLSI circuits. A soft error can occur not only in a memory device (e.g., SRAM, DRAM, SDRAM, etc.), but also in a register, for example, in a processor (core). Therefore, soft errors become a more significant problem as digital circuits are designed based on nanotechnology (e.g., 30-nm CMOS technology).
Traditionally, a duplication method has been used to detect and correct soft errors in a digital circuit. The duplication method uses multiple instances of storage elements to store the same data. Using two copies of data, the digital circuit can detect a single bit error: if the two copies have different values, there is a soft error on the data. Similarly, using three copies of data, the digital circuit can correct a single bit error, e.g., by treating the two copies that store the same data as the valid copies. Although this duplication method is simple and easy to implement, it increases the number of storage elements in the digital circuit to a degree that is unacceptable in terms of hardware size and power consumption.
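The duplication method described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the three-copy case uses a bitwise majority vote, which is one common way to realize "the two agreeing copies are valid" and is an assumption rather than something the text specifies.

```python
def detect_with_two_copies(copy_a: int, copy_b: int) -> bool:
    """Return True if the two stored copies disagree, i.e. an error is detected."""
    return copy_a != copy_b


def correct_with_three_copies(copy_a: int, copy_b: int, copy_c: int) -> int:
    """Bitwise majority vote: each output bit takes the value held by at least two copies."""
    return (copy_a & copy_b) | (copy_a & copy_c) | (copy_b & copy_c)


stored = 0b1011
flipped = stored ^ 0b0100  # one copy suffers a single-bit soft error

error_seen = detect_with_two_copies(stored, flipped)
recovered = correct_with_three_copies(stored, stored, flipped)
```

Two copies double the storage and three copies triple it, which is the hardware cost the paragraph above points to.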
ECC (Error Correcting Code) has also been a popular method to correct soft errors in digital circuits. By adding a small amount of extra information (e.g., an additional 10% of data) to the original information, hardware logic implementing an ECC scheme (e.g., multiple parity bits) can correct soft errors as long as the number of flipped bits is small enough (e.g., only one bit is corrupted).
Protecting memory cells (i.e., cells in a memory device) using ECC is widely practiced in current digital systems. However, a naïve ECC method is not efficient for the arbitration logic device. Because the arbitration logic device needs to know the states of all the requestors, it looks up all the memorized information at once. Therefore, all the memorized information has to be corrected at the same time. As a result, a significant number of ECC correction logic devices is necessary: traditionally, one ECC correction logic device is required per ECC word, so the number of ECC correction logic devices increases as the number of ECC words increases. This increase becomes unacceptable both in hardware size and in power consumption as digital circuits become denser and operate in a low-power environment (e.g., Vdd=1.6V). Furthermore, an ECC correction delay (i.e., the time that an ECC correction logic device takes to fix a soft error) is added to the critical path in the arbitration logic device, thus increasing the latency of the arbitration. A critical path in a digital circuit refers to the path that takes the longest time to operate in the digital circuit.
Other methods have been proposed to address soft errors. They include, but are not limited to: 1. exploiting time redundancy to tolerate soft errors; 2. using a known Delay-Assignment-Variation (DAV) methodology to mitigate soft errors; 3. optimizing internal structures of latches to make them tolerant to soft errors; etc. Though these methods have some effect on reducing the impact of soft errors, they lack generality because they rely on the semiconductor device technologies (e.g., 40 nm CMOS technology) and synthesis tools (e.g., synthesis tools from Cadence®, etc.) through which they are implemented on semiconductor devices. Sometimes, they are difficult or even impossible to implement.
There are also methods for fixing soft errors at a system level. For example, microprocessors can recover from soft errors by means of an additional system-level logic or process for soft error handling, e.g., adding check points. However, depending on the design of a digital circuit, it may not be easy to add such a mechanism to the digital circuit.
The present disclosure describes a method and computer program product for operating an arbitration logic device that controls a shared resource. The present disclosure also describes the arbitration logic device that detects and/or corrects soft error(s) after speculatively computing an arbitration result.
In one embodiment, there is provided an arbitration logic device for controlling access to a shared resource. The arbitration logic device comprises at least one storage element, a winner selection logic device, and an error detection logic device. The storage element stores a plurality of requestors' information received from a plurality of requestors. The winner selection logic device selects a winner requestor among the requestors based on the requestors' information. The winner selection logic device selects the winner requestor without checking whether there is a soft error in the winner requestor's information.
In a further embodiment, the arbitration logic device includes a result cancellation logic device and an error detection logic device. The error detection logic device detects a soft error on the winner requestor's information. The result cancellation logic device cancels the selection of the winner requestor in response to determining that there is a soft error on the winner requestor's information.
In a further embodiment, the error detection logic device resides outside of a critical path in the arbitration logic device.
In a further embodiment, the requestors' information is encoded with an error correcting code (ECC) that includes one or more of: Hamming code, Golay code, Reed-Muller code, BCH (Bose-Chaudhuri-Hocquenghem) code, Reed-Solomon code, self-dual code, convolutional code, SEC-DED code.
The accompanying drawings are included to provide a further understanding of the present invention, and are incorporated in and constitute a part of this specification.
In one embodiment,
Requestor status information includes, but is not limited to: one or more bits representing a requestor ID associated with a particular requestor, one or more bits indicating whether the particular requestor has a pending request to a shared resource controlled by the arbitration logic device 100, one or more bits indicating when the particular requestor issued the pending request, one or more bits indicating how many requests the particular requestor has issued so far or within a pre-determined time period, one or more bits indicating the total number of pending requests, one or more bits indicating how many requestors are waiting for access to the shared resource 50, etc.
Returning to
Traditional systems that use ECC method(s) correct data before processing it. In contrast, according to one embodiment, the arbitration logic device 100 processes data (e.g., requestor status information) before correcting the data, and subsequently checks its correctness (e.g., right before outputting a result of arbitration). The arbitration logic device 100 speculatively performs arbitration (i.e., selecting a pending request among a plurality of requests) using uncorrected requestor status information, and cancels the arbitration afterward if the corresponding arbitration result is incorrect because of a soft error. While processing the information for arbitration, the arbitration logic device 100 concurrently checks whether the requestor status information being used is correct, e.g., based on the ECC method(s) described above. Based on a result of the checking, the arbitration logic device 100 determines whether the arbitration result obtained from the requestor status information is correct. For example, if the requestor status information is determined to be incorrect due to a soft error according to the ECC method(s), the corresponding arbitration result is treated as incorrect. Because of the speculative arbitration (i.e., processing the requestor status information while checking the correctness of the information), the ECC correction delay does not add to the arbitration delay. ECC correction delay refers to the time required to fix a soft error on the requestor status information. Arbitration delay refers to the time required to make an arbitration decision in an arbitration logic device. The arbitration logic device 100 requires only a small amount of hardware (e.g., only one ECC correction logic device in an entire digital system/circuit) to check the correctness of the arbitration result.
In one embodiment, the arbitration logic device 100 does not check correctness of all the requestor status information, i.e., there is no need to have numerous ECC correction logic devices corresponding to numerous ECC words. The arbitration logic device 100 has only one ECC correction logic device to arbitrate pending requests that are included in numerous ECC words.
Thus, this arbitration logic device 100 provides an efficient way to detect and/or correct soft errors with small impact on hardware size, power consumption, and arbitration delay: only one ECC correction logic device is needed for correcting soft error(s) on a particular ECC word (i.e., a word (64-bit/128-bit data) encoded with ECC); the power consumption is also reduced since only one ECC correction logic device (e.g., an error detection logic device 175 in
The arbitration logic device 100 performs one or more of:
(a) Speculative arbitration with the ability to cancel due to a soft error: Instead of correcting requestor status information before processing it, the arbitration logic device 100 selects a requestor among a plurality of requestors based on uncorrected requestor status information, e.g., in round-robin fashion, randomly, in first come first served order, etc. If the concurrently running ECC correction logic device finds that there was a soft error on the information of the selected requestor, the arbitration logic device 100 cancels the selection, e.g., by setting an “invalid” flag bit associated with the selection.
(b) Status information correctness check is performed outside the critical path of the arbitration logic device 100: The arbitration logic device 100 checks whether requestor status information of the selected requestor has a soft error, e.g., by running an ECC method operated in the ECC correction logic device. If that requestor status information has a soft error, the ECC correction logic device sends a signal to the arbitration logic device 100 to cancel the selection. This correctness check is performed outside of the critical path of the arbitration logic device 100. For example, in
(c) Periodic scan and correction on requestor status information: Traditionally, if a requestor has a pending request but a soft error occurs on corresponding requestor status information associated with the requestor and/or pending request, a resulting bit pattern (e.g., a request cancellation signal 190 in
In one embodiment, requestor status information is encoded by one or more ECC methods. For example, an ECC word includes 64 bits of original data (requestor status information) and 8 bits of ECC (e.g., parity bits). The present invention is not limited to any particular ECC encoding scheme.
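As a rough illustration of how the size of the ECC field relates to the data width, the following Python sketch computes the number of check bits for a Hamming code extended with one overall parity bit (SEC-DED). The function name is hypothetical, and SEC-DED is assumed here only as one of the codes the text lists; other codes have different overheads.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits for a Hamming code plus one overall parity bit (SEC-DED).

    A Hamming code with r check bits can locate a single error among
    data_bits + r positions only if 2**r >= data_bits + r + 1.
    """
    r = 0
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r + 1  # one extra overall parity bit enables double-error detection


check_bits_for_64 = secded_check_bits(64)
check_bits_for_128 = secded_check_bits(128)
```

Under this assumption, 64 data bits need 8 check bits and 128 data bits need 9.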
The arbitration logic device 100 performs arbitration (i.e., selecting one requestor among M×N requestors) in a winner selection logic device 105 that includes an M-to-1 selector (e.g., ECC word selector logic device 155 in
In one embodiment, the winner selection logic device 105 is pipelined into two stages, e.g., the arbitration is performed in two processor clock cycles. In the first stage, the M-to-1 selector (e.g., ECC word selector logic device 155 in
After receiving the selected ECC word that includes status information of the winner requestor to be selected by the N-to-1 arbiter, ECC correction logic device (e.g., an error detection logic device 175 in
Table 1 illustrates an exemplary Hamming code. This exemplary Hamming code is obtained from http://www.hackersdelight.org/ecc.pdf, whose whole content is incorporated by reference as if set forth herein.
For example, suppose the selected ECC word is binary 1000011, encoded with the Hamming code shown in Table 1; this word represents the decimal value 3. If a soft error occurs in this selected word so that it becomes binary 1000111, then upon receiving 1000111 the ECC correction logic device may first count the number of zeros in the 1st, 3rd, 5th, and 7th bit positions. Since the number of zeros is odd, (1, 0, 1, 1), it determines that there is a soft error in the first parity bit (1st bit position), the fourth data bit (3rd bit position), the third data bit (5th bit position) or the first data bit (7th bit position). Then, the ECC correction logic device counts the number of zeros in the 2nd, 3rd, 6th, and 7th bit positions. Since the number of zeros is even, (0, 0, 1, 1), it determines that there is no error on the second parity bit (2nd bit position), the fourth data bit (3rd bit position), the second data bit (6th bit position) or the first data bit (7th bit position). Next, the ECC correction logic device counts the number of zeros in the 4th, 5th, 6th and 7th bit positions. Since the number of zeros is odd, (0, 1, 1, 1), it determines that there is a soft error in the third parity bit (4th bit position), the third data bit (5th bit position), the second data bit (6th bit position) or the first data bit (7th bit position). A first analysis combines the first counting and the second counting: the second counting clears the 3rd and 7th bit positions, so the first parity bit or the third data bit has the soft error. A second analysis combines the second counting and the third counting: the second counting clears the 6th and 7th bit positions, so the third data bit or the third parity bit has the soft error. The common factor between the two analyses is the third data bit (5th bit position). Thus, the ECC correction logic device detects the soft error on the third data bit and fixes the error, e.g., by converting “1” in the third data bit to “0”.
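The worked example above can be reproduced with a short Python sketch. It is a minimal illustration, not the device's actual logic: the function names are hypothetical, and it checks the parity of ones in each group rather than counting zeros, which gives the same odd/even answer for groups of four bits. The three checks are combined into a syndrome whose value is the 1-based position of the flipped bit.

```python
def hamming74_syndrome(word: str) -> int:
    """Return 0 if all three parity checks pass, else the 1-based error position."""
    bits = [int(ch) for ch in word]  # bits[0] is the 1st bit position

    def parity(positions):
        return sum(bits[p - 1] for p in positions) % 2

    c1 = parity((1, 3, 5, 7))  # first counting
    c2 = parity((2, 3, 6, 7))  # second counting
    c4 = parity((4, 5, 6, 7))  # third counting
    return c1 + 2 * c2 + 4 * c4


def hamming74_correct(word: str) -> str:
    """Flip the bit named by the syndrome; assumes at most one flipped bit."""
    position = hamming74_syndrome(word)
    if position == 0:
        return word
    bits = list(word)
    bits[position - 1] = "0" if bits[position - 1] == "1" else "1"
    return "".join(bits)


def hamming74_data(word: str) -> int:
    """Data bits sit at the 3rd, 5th, 6th and 7th positions, most significant first."""
    return int(word[2] + word[4] + word[5] + word[6], 2)


corrupted = "1000111"
error_position = hamming74_syndrome(corrupted)  # 5: the third data bit
repaired = hamming74_correct(corrupted)         # "1000011"
value = hamming74_data(repaired)                # 3
```

For the corrupted word, the first and third checks fail and the second passes, so the syndrome is 1 + 4 = 5, matching the conclusion that the 5th bit position holds the error.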
If the ECC correction logic device (e.g., the error detection logic 175 in
The control logic device 145 includes the scan and correct logic device 140 as well as the M-to-1 arbiter 135. The scan and correct logic device 140 periodically reads requestors' status information stored in storage element(s), checks whether there is a soft error in the requestors' information, and corrects any soft error found, e.g., by using the ECC correction logic device and the M-to-1 selector. Specifically, the scan and correct logic device 140 drives a select line 150 of the M-to-1 selector to select a desired ECC word (e.g., ECC word 1 120, ECC word 2 125, . . . , or ECC word M 130 in
At step 220, the N-to-1 arbiter selects one of the pending requests in the selected ECC word according to a known selection method (e.g., round-robin, randomly, first come first served, etc.). At step 230, while the N-to-1 arbiter selects one of the pending requests in the selected ECC word, the ECC correction logic device simultaneously detects whether the selected ECC word includes a soft error according to an ECC method adopted by the arbitration logic device 100.
If no soft error is detected in the selected ECC word, at step 240, the arbitration logic device 100 grants the request (e.g., access permission to a shared resource controlled by the arbitration logic device 100). Specifically, the N-to-1 arbiter outputs the arbitration result 197 (the selection of the winner requestor), e.g., by asserting the request grant flag bit 193 with the winner requestor ID to enable the winner requestor's access to the shared resource 50. Then, control returns to step 200.
If there is a soft error on the selected ECC word, at step 250, the ECC correction logic device evaluates whether the soft error is correctable. If the soft error is correctable, e.g., a single bit error within the selected ECC word, the ECC correction logic device corrects the soft error on the selected ECC word and writes the corrected ECC word back into its corresponding storage elements. While correcting the soft error, the ECC correction logic device sends the cancellation signal 190 to void the request selected at step 220. The arbitration logic device does not grant access permission to the shared resource to any requestor (including the winner requestor). Then, control returns to step 200 to redo the selection process (i.e., selecting a pending request among a plurality of pending requests).
Otherwise, if there is a detected soft error on the selected ECC word but the soft error is uncorrectable (e.g., double bit error within the selected ECC word), at step 260, the ECC correction logic device does not attempt to fix the error and stops the operation of the arbitration logic device 100, e.g., by setting a critical error flag bit.
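The flow of steps 220 through 260 can be sketched in Python as follows. This is a software model under stated assumptions, not the described hardware: the names `Check`, `CriticalError`, `check_word`, `pick_winner` and `arbitrate_once` are hypothetical, the ECC word is a toy 8-bit word (a 7-bit Hamming code plus one overall parity bit) holding four pending flags, the winner is chosen by fixed priority for brevity, and the check that hardware would run concurrently is run sequentially.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Check(Enum):
    NO_ERROR = 0
    CORRECTABLE = 1
    UNCORRECTABLE = 2


class CriticalError(Exception):
    """Stands in for the critical error flag bit set at step 260."""


DATA_POSITIONS = (3, 5, 6, 7)  # 1-based positions that hold the pending flags


def check_word(word: List[int]) -> Tuple[Check, List[int]]:
    """Toy SEC-DED check: Hamming syndrome over bits 1..7, overall parity in bit 8."""
    def parity(positions):
        return sum(word[p - 1] for p in positions) % 2

    syndrome = parity((1, 3, 5, 7)) + 2 * parity((2, 3, 6, 7)) + 4 * parity((4, 5, 6, 7))
    overall = sum(word) % 2
    if syndrome == 0 and overall == 0:
        return Check.NO_ERROR, word
    if overall == 1:  # odd overall parity: a single flipped bit
        fixed = list(word)
        position = syndrome if syndrome else 8
        fixed[position - 1] ^= 1
        return Check.CORRECTABLE, fixed
    return Check.UNCORRECTABLE, word  # syndrome set but parity even: double error


def pick_winner(word: List[int]) -> Optional[int]:
    """Fixed-priority choice among pending flags (an assumed policy)."""
    for index, position in enumerate(DATA_POSITIONS):
        if word[position - 1]:
            return index
    return None


def arbitrate_once(words: List[List[int]], word_index: int) -> Optional[int]:
    word = words[word_index]
    winner = pick_winner(word)              # step 220: uses the uncorrected word
    status, corrected = check_word(word)    # step 230: concurrent in hardware
    if status is Check.NO_ERROR:
        return winner                       # step 240: grant
    if status is Check.CORRECTABLE:
        words[word_index] = corrected       # step 250: write back the corrected word
        return None                         # selection cancelled, nothing granted
    raise CriticalError("uncorrectable soft error")  # step 260


words = [[1, 0, 0, 0, 1, 1, 1, 1]]          # 5th bit flipped by a soft error
first_try = arbitrate_once(words, 0)        # cancelled, word repaired
second_try = arbitrate_once(words, 0)       # granted to requestor index 2
```

In this model the first round speculatively picks the spurious requestor at the 5th bit position, the check cancels that selection and repairs the word, and the retry grants the genuine pending request.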
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with a system, apparatus, or device running an instruction.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with a system, apparatus, or device running an instruction.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may run entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which run via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which run on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more operable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be run substantially concurrently, or the blocks may sometimes be run in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The present invention is related to the following commonly-owned, co-pending United States patent applications, the entire contents and disclosure of each of which is expressly incorporated by reference herein as if fully set forth herein. Attorney docket No. (YOR920090171US1 (24255)), for “USING DMA FOR COPYING PERFORMANCE COUNTER DATA TO MEMORY”; Attorney docket No. (YOR920090169US1 (24259)) for “HARDWARE SUPPORT FOR COLLECTING PERFORMANCE COUNTERS DIRECTLY TO MEMORY”; Attorney docket No. (YOR920090168US1 (24260)) for “HARDWARE ENABLED PERFORMANCE COUNTERS WITH SUPPORT FOR OPERATING SYSTEM CONTEXT SWITCHING”; Attorney docket No. (YOR920090473US1 (24595)), for “HARDWARE SUPPORT FOR SOFTWARE CONTROLLED FAST RECONFIGURATION OF PERFORMANCE COUNTERS”; Attorney docket No. (YOR920090474US1 (24596)), for “HARDWARE SUPPORT FOR SOFTWARE CONTROLLED FAST MULTIPLEXING OF PERFORMANCE COUNTERS”; Attorney docket No. (YOR920090533US1 (24682)), for “CONDITIONAL LOAD AND STORE IN A SHARED CACHE”; Attorney docket No. (YOR920090532US1 (24683)), for “DISTRIBUTED PERFORMANCE COUNTERS”; Attorney docket No. (YOR920090529US1 (24685)), for “LOCAL ROLLBACK FOR FAULT-TOLERANCE IN PARALLEL COMPUTING SYSTEMS”; Attorney docket No. (YOR920090530US1 (24686)), for “PROCESSOR WAKE ON PIN”; Attorney docket No. (YOR920090526US1 (24687)), for “PRECAST THERMAL INTERFACE ADHESIVE FOR EASY AND REPEATED, SEPARATION AND REMATING”; Attorney docket No. (YOR920090527US1 (24688), for “ZONE ROUTING IN A TORUS NETWORK”; Attorney docket No. (YOR920090531US1 (24689)), for “PROCESSOR WAKEUP UNIT”; Attorney docket No. (YOR920090535US1 (24690)), for “TLB EXCLUSION RANGE”; Attorney docket No. (YOR920090536US1 (24691)), for “DISTRIBUTED TRACE USING CENTRAL PERFORMANCE COUNTER MEMORY”; Attorney docket No. (YOR920090538US1 (24692)), for “PARTIAL CACHE LINE SPECULATION SUPPORT”; Attorney docket No. (YOR920090539US1 (24693)), for “ORDERING OF GUARDED AND UNGUARDED STORES FOR NO-SYNC I/O”; Attorney docket No. 
(YOR920090540US1 (24694)), for “DISTRIBUTED PARALLEL MESSAGING FOR MULTIPROCESSOR SYSTEMS”; Attorney docket No. (YOR920090541US1 (24695)), for “SUPPORT FOR NON-LOCKING PARALLEL RECEPTION OF PACKETS BELONGING TO THE SAME MESSAGE”; Attorney docket No. (YOR920090560US1 (24714)), for “OPCODE COUNTING FOR PERFORMANCE MEASUREMENT”; Attorney docket No. (YOR920090579US1 (24731)), for “A MULTI-PETASCALE HIGHLY EFFICIENT PARALLEL SUPERCOMPUTER”; Attorney docket No. (YOR920090581US1 (24732)), for “CACHE DIRECTORY LOOK-UP REUSE”; Attorney docket No. (YOR920090582US1 (24733)), for “MEMORY SPECULATION IN A MULTI LEVEL CACHE SYSTEM”; Attorney docket No. (YOR920090583US1 (24738)), for “METHOD AND APPARATUS FOR CONTROLLING MEMORY SPECULATION BY LOWER LEVEL CACHE”; Attorney docket No. (YOR920090584US1 (24739)), for “MINIMAL FIRST LEVEL CACHE SUPPORT FOR MEMORY SPECULATION MANAGED BY LOWER LEVEL CACHE”; Attorney docket No. (YOR920090585US1 (24740)), for “PHYSICAL ADDRESS ALIASING TO SUPPORT MULTI-VERSIONING IN A SPECULATION-UNAWARE CACHE”; Attorney docket No. (YOR920090587US1 (24746)), for “LIST BASED PREFETCH”; Attorney docket No. (YOR920090590US1 (24747)), for “PROGRAMMABLE STREAM PREFETCH WITH RESOURCE OPTIMIZATION”; Attorney docket No. (YOR920090595US1 (24757)), for “FLASH MEMORY FOR CHECKPOINT STORAGE”; Attorney docket No. (YOR920090596US1 (24759)), for “NETWORK SUPPORT FOR SYSTEM INITIATED CHECKPOINTS”; Attorney docket No. (YOR920090597US1 (24760)), for “TWO DIFFERENT PREFETCH COMPLEMENTARY ENGINES OPERATING SIMULTANEOUSLY”; Attorney docket No. (YOR920090598US1 (24761)), for “DEADLOCK-FREE CLASS ROUTES FOR COLLECTIVE COMMUNICATIONS EMBEDDED IN A MULTI-DIMENSIONAL TORUS NETWORK”; Attorney docket No. (YOR920090631US1 (24799)), for “IMPROVING RELIABILITY AND PERFORMANCE OF A SYSTEM-ON-A-CHIP BY PREDICTIVE WEAR-OUT BASED ACTIVATION OF FUNCTIONAL COMPONENTS”; Attorney docket No. 
(YOR920090632US1 (24800)), for “A SYSTEM AND METHOD FOR IMPROVING THE EFFICIENCY OF STATIC CORE TURN OFF IN SYSTEM ON CHIP (SoC) WITH VARIATION”; Attorney docket No. (YOR920090633US1 (24801)), for “IMPLEMENTING ASYNCHRONOUS COLLECTIVE OPERATIONS IN A MULTI-NODE PROCESSING SYSTEM”; Attorney docket No. (YOR920090586US1 (24861)), for “MULTIFUNCTIONING CACHE”; Attorney docket No. (YOR920090645US1 (24873)) for “I/O ROUTING IN A MULTIDIMENSIONAL TORUS NETWORK”; Attorney docket No. (YOR920090646US1 (24874)) for ARBITRATION IN CROSSBAR FOR LOW LATENCY; Attorney docket No. (YOR920090647US1 (24875)) for EAGER PROTOCOL ON A CACHE PIPELINE DATAFLOW; Attorney docket No. (YOR920090648US1 (24876)) for EMBEDDED GLOBAL BARRIER AND COLLECTIVE IN A TORUS NETWORK; Attorney docket No. (YOR920090649US1 (24877)) for GLOBAL SYNCHRONIZATION OF PARALLEL PROCESSORS USING CLOCK PULSE WIDTH MODULATION; Attorney docket No. (YOR920090650US1 (24878)) for IMPLEMENTATION OF MSYNC; Attorney docket No. (YOR920090651US1 (24879)) for NON-STANDARD FLAVORS OF MSYNC; Attorney docket No. (YOR920090652US1 (24881)) for HEAP/STACK GUARD PAGES USING A WAKEUP UNIT; Attorney docket No. (YOR920100002US1 (24882)) for MECHANISM OF SUPPORTING SUB-COMMUNICATOR COLLECTIVES WITH O(64) COUNTERS AS OPPOSED TO ONE COUNTER FOR EACH SUB-COMMUNICATOR; and Attorney docket No. (YOR920100001US1 (24883)) for REPRODUCIBILITY IN BGQ.
This invention was made with Government support under Contract No. B554331 awarded by the Department of Energy. The Government has certain rights in this invention.