Claims
- 1. A data capture method to allow optimal sampling and capture of an asynchronous data stream without sending a clock signal with the data stream comprising:
capturing the data by sending serial data bits of the data stream down a clocked delay line with a series of delay taps;
sampling all of the delay taps with a clock;
comparing each delay tap output with a neighbor delay tap output to determine if it is the same;
using the comparisons to form a clocked string to generate a data history record;
examining the data history record to determine optimal data capture eyes by looking for data capture eyes where the data does not transition between adjacent delay taps, which are detected as optimal data capture eyes.
- 2. The method of claim 1, including periodically updating the data history record to compensate for changing parameters.
- 3. The method of claim 1, wherein the clocked string is combined with previous clocked strings to generate the data history record.
- 4. The method of claim 1, wherein the serial data enters the clocked delay line, and is clocked through a combinatorial series of inverters, each of which adds an increment of delay, and each inverter output is directed to history registers.
- 5. The method of claim 4, wherein each inverter output is directed to even history registers and odd history registers, the even history registers are clocked by a positive edge of the clock, and the odd history registers are clocked by a negative edge of the clock, to allow logic to capture the serial data at twice the clock rate, the even history registers are used to detect an even data capture eye for the positive clock phase, and the odd history registers are used to detect an odd data capture eye for the negative clock phase.
- 6. The method of claim 5, wherein an even eye multiplexer receives all of the outputs of the even history registers and an odd eye multiplexer receives all of the outputs of the odd history registers.
- 7. The method of claim 4, wherein the history registers include a first history register clocked at a first clock rate and serially arranged second, third and fourth data history eye registers serially receiving the output of the first history register and clocked at a second clock rate.
- 8. The method of claim 4, wherein
the clocked delay line comprises a delay line register at the output of each inverter, and the output of each delay line register is directed to an exclusive OR (XOR) gate which also receives an input from the next delay line register in the clocked delay line, and since the data bit is inverted by the next delay line inverter before it enters the next delay line register, if the data bit does not undergo a transition between the consecutive stages, each register and the next register will hold opposite values, such that the XOR gate will produce a 1, indicating there was no data transition between the consecutive stages, and conversely, if the data bit undergoes a transition between the consecutive stages, each register and the next register will hold the same value, such that the XOR gate will produce a 0, indicating there was a data transition between the consecutive stages.
- 9. The method of claim 8, wherein the output of each XOR gate is input to an AND gate, the output of which is input to a first history register, which is the first of a series of four history registers, and the first history register is sampled and reset to high at a first clock rate, and the second, third and fourth history registers are sampled and updated at a higher second clock rate.
- 10. The method of claim 5, including searching for an even data eye, by incrementally searching through the even history registers, to search for leading and ending edges of the even data eye, and the search for the odd eye starts at the detected center of the even eye, and then searches for the leading and ending edges of the second data eye.
- 11. The method of claim 5, wherein:
in a first stage, the history registers are periodically reset and flushed and a new history record is acquired, after which the best data eye is determined for each phase of the clock independently, the best data eyes are then used to send and capture data bits which are forwarded to a next stage every system clock; and in a second stage, the forwarded data bits are inserted into a shift register which is used along with a barrel shifter to select and pass correctly aligned data bits.
- 12. The method of claim 5, wherein the data sampling eyes are constantly being updated and realigned, which starts at existing even and odd data sampling eyes, and then looks left and right of the existing eyes to determine the left and right eye edges, and then realigns the center of the even and odd eyes between their left and right edges.
- 13. The method of claim 8, wherein the outputs of the XOR gates are sampled in sequence and examined one at a time to determine if each output is either a 0, indicating a data transition outside of an eye, or a 1, indicating no data transition possibly inside an eye, which sequential sampling and examination is repeated in sequence for each delay tap output of the shift register.
- 14. The method of claim 1, wherein the delay tap outputs are sampled by a first circuit clocked by a positive edge of the clock, and by a second circuit clocked by a negative edge of the clock, an even data capture eye is detected for the positive clock phase, and an odd data capture eye is detected for the negative clock phase independently of the detection of the even eye.
- 15. A mechanism for automatically adjusting transmission delays for optimal simultaneous bi-directional (SiBiDi) signaling between two nodes to improve the signal quality of the simultaneous bi-directional signaling over a communication line, wherein during a set-up sequence, parameter setting data is sent in a unidirectional communication over the communication line to allow the two nodes to more accurately exchange the parameter setting data during the set-up sequence, whereby the unidirectional communication has better signal quality to more accurately exchange the parameter setting data.
- 16. The mechanism of claim 15, wherein during the set-up sequence, data is sent at a slower data rate than during SiBiDi signaling, and in which a 1:n ratio is used, holding a ‘1’ or ‘0’ for n bit times.
- 17. The mechanism of claim 15, wherein each node has a possibility of n different delays ranging from a minimum or zero delay to a maximum delay in n steps, so the number of possible combinations of delays is n×n, so that n×n combinations are tested to select an optimum delay combination, and the mechanism cycles through all n×n combinations, one at a time.
- 18. The mechanism of claim 15, wherein one differential data line connects the two nodes, and each node operates with a 1-bit sender CPU and 1-bit capture CPU.
- 19. The mechanism of claim 15, wherein two differential data lines connect the two identical nodes, and each node operates with a 2-bit sender CPU and 2-bit capture CPU.
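The eye detection recited in claims 1, 3, 8 and 9 can be illustrated with a small software model. The following is a minimal sketch only, not the claimed hardware: it assumes an 8-tap inverting delay line, a history record that starts at all 1s, and a "widest run of 1s" rule for picking the eye. The function names `tap_samples`, `no_transition_flags`, `update_history` and `find_eye` are hypothetical and do not appear in the claims.

```python
from typing import List, Optional, Tuple


def tap_samples(raw: List[int]) -> List[int]:
    """Model the inverting delay line: tap k holds the data bit inverted k times.

    `raw` is the (hypothetical) un-inverted data value present at each tap
    at the sampling clock edge.
    """
    return [bit ^ (k % 2) for k, bit in enumerate(raw)]


def no_transition_flags(taps: List[int]) -> List[int]:
    """XOR each tap sample with its neighbor.

    Because each stage inverts the bit, neighbors that saw the same data bit
    hold opposite values: 1 means no transition, 0 means a transition.
    """
    return [a ^ b for a, b in zip(taps, taps[1:])]


def update_history(history: List[int], flags: List[int]) -> List[int]:
    """AND the new flags into the history record, so any transition sticks."""
    return [h & f for h, f in zip(history, flags)]


def find_eye(history: List[int]) -> Optional[Tuple[int, int, int]]:
    """Return (left, right, center) of the widest run of 1s, or None."""
    best: Optional[Tuple[int, int]] = None
    start: Optional[int] = None
    for i, bit in enumerate(history + [0]):  # trailing 0 closes a final run
        if bit and start is None:
            start = i
        elif not bit and start is not None:
            if best is None or i - start > best[1] - best[0] + 1:
                best = (start, i - 1)
            start = None
    if best is None:
        return None
    left, right = best
    return left, right, (left + right) // 2


# Two clock samples whose data edge lands on different taps (jitter).
history = [1] * 7
for raw in ([1, 1, 1, 0, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1, 1, 1]):
    history = update_history(history, no_transition_flags(tap_samples(raw)))

print(history)            # [1, 1, 0, 0, 1, 1, 1]
print(find_eye(history))  # (4, 6, 5)
```

In this model the two jittered edges clear history positions 2 and 3, and the widest remaining transition-free region spans positions 4 to 6, so position 5 would be the sampling point. Separate even and odd histories, as in claim 5, could be modeled by keeping two such records, one per clock edge.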
CROSS-REFERENCE
[0001] The present invention claims the benefit of commonly-owned, co-pending U.S. Provisional Patent Application Serial No. 60/271,124 filed Feb. 24, 2001 entitled MASSIVELY PARALLEL SUPERCOMPUTER, the whole contents and disclosure of which is expressly incorporated by reference herein as if fully set forth herein. This patent application is additionally related to the following commonly-owned, co-pending U.S. Patent Applications filed on even date herewith, the entire contents and disclosure of each of which is expressly incorporated by reference herein as if fully set forth herein. U.S. patent application Ser. No. (YOR920020027US1, YOR920020044US1 (15270)), for “Class Networking Routing”; U.S. patent application Ser. No. (YOR920020028US1 (15271)), for “A Global Tree Network for Computing Structures”; U.S. patent application Ser. No. (YOR920020029US1 (15272)), for “Global Interrupt and Barrier Networks”; U.S. patent application Ser. No. (YOR920020030US1 (15273)), for “Optimized Scalable Network Switch”; U.S. patent application Ser. No. (YOR920020031US1, YOR920020032US1 (15258)), for “Arithmetic Functions in Torus and Tree Networks”; U.S. patent application Ser. No. (YOR920020033US1, YOR920020034US1 (15259)), for “Data Capture Technique for High Speed Signaling”; U.S. patent application Ser. No. (YOR920020035US1 (15260)), for “Managing Coherence Via Put/Get Windows”; U.S. patent application Ser. No. (YOR920020036US1, YOR920020037US1 (15261)), for “Low Latency Memory Access And Synchronization”; U.S. patent application Ser. No. (YOR920020038US1 (15276)), for “Twin-Tailed Fail-Over for Fileservers Maintaining Full Performance in the Presence of Failure”; U.S. patent application Ser. No. (YOR920020039US1 (15277)), for “Fault Isolation Through No-Overhead Link Level Checksums”; U.S. patent application Ser. No. (YOR920020040US1 (15278)), for “Ethernet Addressing Via Physical Location for Massively Parallel Systems”; U.S. patent application Ser. No. (YOR920020041US1 (15274)), for “Fault Tolerance in a Supercomputer Through Dynamic Repartitioning”; U.S. patent application Ser. No. (YOR920020042US1 (15279)), for “Checkpointing Filesystem”; U.S. patent application Ser. No. (YOR920020043US1 (15262)), for “Efficient Implementation of Multidimensional Fast Fourier Transform on a Distributed-Memory Parallel Multi-Node Computer”; U.S. patent application Ser. No. (YOR9-20010211US2 (15275)), for “A Novel Massively Parallel Supercomputer”; and U.S. patent application Ser. No. (YOR920020045US1 (15263)), for “Smart Fan Modules and System”.
PCT Information
Filing Document | Filing Date | Country | Kind
PCT/US02/05568 | 2/25/2002 | WO |