The present invention relates generally to synchronous signal transmission between modules within a computer system, and more specifically, to systems and methods for maintaining synchronicity between multiple components within a fault-tolerant computer system.
As the speed and performance of digital computer systems increase, the demands on data interconnects that link the various components within these systems also increase. These interconnects, or communication links, connect computer systems, subsystems, chips or other components within a computer system, thereby enabling data exchange. Typically, this data is transferred as pulses of electrical energy through wires or other electrically conductive material. However, the data may also be conveyed wirelessly, via RF transmitters and receivers, as well as through pulses of coherent light via optical fibers.
Regardless of transmission medium, serial line protocols have increasingly been among the protocols of choice for communications links between internal system components. In theory, serial line protocols may be either synchronous or asynchronous. For synchronous communications, each connected component or device is typically connected to a common clock. The serial line also typically contains at least one wire or data path to transmit the common clock signal to interconnected components. In most asynchronous (or non-synchronous) serial line communications, the serial line does not have a wire dedicated to clock signal transmission. Instead, if a clock signal is transmitted, it is sent using the data wires, either separately or embedded within another signal. In many applications, asynchronous data is merely transmitted when possible, and is handled by any receiving component at the component's discretion.
In most typical computer applications, asynchronous serial links meet the needs of the hardware developers. These links transmit data quickly, efficiently, and inexpensively. As no dedicated clock signal wire is necessary, the datapaths can be one wire smaller, the I/O interconnects can be one pin shorter, and the dependent microcircuitry can be simplified. Additionally, for most applications, asynchronous data arrival is good enough, and most users will neither notice nor object to slight delays in processing caused by the asynchronous transmission. Consequently, most off-the-shelf computer systems today make use of asynchronous serial lines for internal data transfers.
In fault-tolerant applications, however, individual components must often operate in synchronized, or lock-step, operation in order to maintain system-wide determinism.
Thus, a need exists for improved methods and systems facilitating synchronous signal transfer among components over asynchronous serial lines. Further, a need exists to enable off-the-shelf computer systems with asynchronous internal serial lines to be used as fault-tolerant computer systems. Finally, within fault-tolerant computer systems, a need exists to enable deterministic computing among components, even as the signals are transmitted asynchronously between these components via high speed transmission channels.
In satisfaction of these needs, embodiments of the present invention provide systems and methods for transmitting high-speed signals while maintaining lock-step determinism using remote clock phase adjustments. Embodiments of the present invention also provide systems and methods for maintaining determinism through the use of synchronized time slice counters within the various components.
In accordance with one aspect of the invention, a synchronized communications system is provided. This system includes a transmitter, a receiver and an asynchronous communications link connecting the transmitter and the receiver. The transmitter includes a data clock and a round trip timer. The data clock preferably comprises a clock-forwarded clock which transmits a signal on its own data path. Preferably, the transmitter and the round trip timer are configured to measure the round trip time required to send a signal to the receiver over the communications link and to receive an acknowledgement back. Thereafter, the round trip time is used to calculate a transmission delay. In addition, the transmitter is further configured to establish an appropriate offset for the data clock in order to counteract the effect of the transmission delay and to facilitate synchronous processing between the transmitter and the receiver. This synchronized communications system may be located within a fault tolerant computer system. In various embodiments, the data clock may produce a signal that is transmitted over the communications link and used by the receiver in order to synchronize the receiver's operations with those of the transmitter.
In accordance with another aspect of the invention, a method is provided for synchronizing a transmitter and a receiver through the use of a signal. Preferably, the transmitter includes a transmitter clock and a data clock and the receiver includes a receiver clock. Under this method, a signal is transmitted from the transmitter to the receiver, an acknowledgement is sent from the receiver to the transmitter, and the round trip transit time is calculated and recorded. Thereafter, an offset is added to the data clock, and the procedure is repeated until a stopping condition has been reached. Thereafter, a preferred offset is selected and the data clock is adjusted accordingly. In various embodiments, a data clock signal generated by the data clock may be sent across the communications link from the transmitter to the receiver, which may in turn use the data clock signal to synchronize its operations with those of the transmitter.
These and other aspects of this invention will be readily apparent from the detailed description below and the appended drawings, which are meant to illustrate and not to limit the invention, and in which:
The claimed invention will be more completely understood through the following detailed description, which should be read in conjunction with the attached drawings. In this description, like numbers refer to similar elements within various embodiments of the present invention.
The claimed invention provides methods and systems for deterministic operation of computer components connected via an asynchronous communications link.
As discussed previously, most presently available computer systems rely upon high speed busses to transmit data among components within the computer system. These components may include low bandwidth items (e.g. mice, keyboards and joysticks) or high bandwidth components (e.g. processors, memory subsystems, graphics cards). Regardless of component type, the devices on either end of a communications link may be characterized as transmitters and receivers, where the transmitter is sending data to the receiver across a communications link.
In many modern computer systems 100, the communications link 106 comprises a high speed serial bus linking the transmitter 102 and the receiver 104. Set protocols and standards govern the manufacture and use of the link 106, so that various devices can communicate via the same link 106. One such protocol is the PCI-SIG's standard Peripheral Component Interconnect Express, or PCI-Express, protocol.
PCI-Express is a two-way, serial connection that carries data in packets along two pairs of point-to-point data lanes. Estimated bit rates for PCI-Express reach 2.5 Gigabits per second per lane direction, which is fast enough to provide an I/O architecture suitable for high speed data interconnects such as USB 2.0, InfiniBand and Gigabit Ethernet.
Typically, the PCI-Express serial connection, or bus, is clocked independently from the devices it connects. This facilitates isochronous and asynchronous communications. Isochronous communications are necessary for processes where data must be delivered within certain time constraints. For example, multimedia streams typically require an isochronous transport mechanism to ensure that data is delivered as fast as it is displayed and to ensure that the audio is synchronized with the video. Asynchronous communications refer to processes in which data streams can be broken by random intervals, where packets may arrive at their destinations at any point in time. Both asynchronous and isochronous communications may be contrasted with synchronous processes, in which data streams can be delivered only at specific intervals or according to a common clock signal.
Because PCI-Express is readily available and because most processes need only asynchronous or isochronous communications among components, the majority of computer systems produced today include internal busses which operate according to PCI-Express, or similar standards.
Conversely, fault tolerant computers typically require that their various components operate deterministically. This means that the output for each component must be able to be predicted with absolute certainty. As a component's output is necessarily a function of its input, asynchronous communications alone are insufficient for deterministic computing applications. Accordingly, existing deterministic computing systems have typically relied upon synchronous communications links between internal components in order to facilitate data transfer.
The claimed invention makes use of asynchronous and isochronous communications lines, such as PCI-Express busses, in order to facilitate deterministic processing. Accordingly, disclosed herein are at least two primary techniques which accomplish that goal. These techniques include Remote Clock Phase Determinism and Time Slice Determinism, which are discussed below.
Remote Clock Phase Determinism
Remote Clock Phase Determinism is a system and method by which a transmitter and receiver may operate deterministically, even when connected by an asynchronous bus. Embodiments of this technique are discussed below in reference to
In this embodiment, the transmitter 202 also comprises a data clock 118 and a timer 116. The timer 116 is used to calculate the round trip time necessary for the transmitter 202 to send a signal or packet across the communications link 106 to the receiver 204, and for the receiver 204 to reply with an acknowledgement. The timer 116 may calculate the round trip time in clock cycles, in real time, or via its own incremental counter. The data clock 118 is a second clock preferably located within the transmitter 202. Preferably, the data clock 118 is adjustable based upon instructions received from the transmitter logic 110 or other elements within the transmitter 202. In addition, the data clock 118 preferably generates a data clock signal or clock forwarded signal, which is preferably transmitted across the communications link 106 through the use of a dedicated line or datapath. In alternate embodiments, this data clock signal may be transmitted together with other data or instructions. The data clock signal may also be juxtaposed or embedded within the data and instructions transmitted across the communications link 106.
The operation of the communication system 200 depicted in
The training cycle begins at startup, or upon the occurrence of external events or signals which trigger initiation of the training cycle. Initially, the transmitter 202 sends a signal to the receiver 204 via the communications link 106 (Step 302). This signal may contain data packets, instructions or any other information which may be interpreted by the receiver 204 and which will cause the receiver 204 to send an acknowledgement to the transmitter 202.
Upon receipt of the signal, the receiver 204 replies to the transmitter 202 by sending an acknowledgement over the communications link 106 (Step 304). This acknowledgement may be a copy of the originally transmitted signal, a modified copy of the original signal, a simple “acknowledged” packet, or any other data stream known by those skilled in the art to indicate safe receipt of the originally transmitted signal.
The timer 116, running simultaneously with the send-receive-acknowledge process described above, then calculates and stores a round trip time (Step 306). If present, any offset currently applied to the data clock 118 is also stored and correlated with that particular round trip time.
At this point, the transmitter 202 determines whether or not the training cycle is complete (Step 308). Preferably, the training stage would be deemed complete upon the occurrence of one or more stopping conditions. These stopping conditions may include, without limitation:
Assuming a stopping condition was not met, an offset is preferably added to the data clock (Step 310). This offset preferably comprises an incremental adjustment forward or backward. In various embodiments, the offset may shift the phase of the data clock 118 with reference to the transmitter clock 108, the receiver clock 112 or other system-wide clocks (not illustrated). This offset may then be used to shift the time which the next signal is transmitted by the transmitter 202. Alternately, the data clock's signal may be included with the next signal transmitted to the receiver 204. The receiver 204, in turn, may receive the data clock signal, and may appropriately adjust the timing of its operations, and specifically, the processing of any data received over the communications link 106 and processed by the receiver logic 114. Thus, the receiver may use the data clock, with any present offset, to clock in and process data. The training cycle repeats, starting again with Step 302, until a stopping condition is met.
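The sweep performed by the training cycle (Steps 302 through 310) may be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation of any embodiment: the function `simulated_round_trip`, its 78 ns link delay, and the fixed iteration count used as the stopping condition are all hypothetical stand-ins for the hardware behavior.

```python
CLOCK_PERIOD_NS = 10  # assumed 100 MHz transmitter clock


def simulated_round_trip(offset_ns, link_delay_ns=78):
    """Hypothetical stand-in for Steps 302-306.

    Returns the number of whole transmitter clock periods that elapse before
    the acknowledgement is observed, given the current data clock offset.
    """
    return (offset_ns + link_delay_ns) // CLOCK_PERIOD_NS


def run_training(measure, step_ns=1, iterations=10):
    """Sweep the data clock offset and record (offset, round trip count) pairs."""
    records = []
    offset_ns = 0
    for _ in range(iterations):  # assumed stopping condition: fixed iteration count
        records.append((offset_ns, measure(offset_ns)))  # Step 306: store and correlate
        offset_ns += step_ns  # Step 310: add an incremental offset
    return records


records = run_training(simulated_round_trip)
for offset_ns, count in records:
    print(f"offset {offset_ns} ns -> round trip count {count}")
```

Each stored pair correlates an applied offset with the round trip count it produced, which is the information later examined when a preferred offset is selected.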
Once a stopping condition is met, the training cycle is deemed complete (Step 308) and normal operation begins. At this point, the transmitter 202 determines a preferred offset to apply to the data clock 118 (Step 314). In order to determine a preferred offset, the transmitter 202 examines all round trip times to assess which round trip time appeared most frequently during the training cycle. For this value, the corresponding minimum and maximum offsets are collated, and the average offset is deemed the preferred offset. This will be explained in more detail in connection with Table 1, below.
After a preferred offset has been established (Step 314), the data clock is adjusted using this preferred offset (Step 316). Thereafter, the signal generated by the adjusted data clock is transmitted, together with all subsequent data packets, from the transmitter 202 to the receiver 204 via the communications link 106. As before, the receiver 204 preferably uses the adjusted data clock signal in order to clock in data from the communications link 106. The adjusted data clock signal is then used for subsequent processing by the receiver 204 and the receiver logic 114. Thus, the receiver logic 114 will process instructions synchronously with the transmitter logic 110 because the adjusted data clock signal will compensate for any delay inherent in the communications link 106. With the ability to process data synchronously, the transmitter 202 and the receiver 204 will be able to proceed deterministically, and will thus enable fault-tolerant processing within the context of a standard, off-the-shelf computer system.
In order to facilitate this synchronous processing, it is important that the preferred offset be chosen properly. As described previously, Table 1 illustrates an exemplary calculation of a preferred offset in accordance with this embodiment of the invention.
In the exemplary embodiment illustrated by Table 1, assume that the transmitter clock 108 operates at 100 MHz and an offset of one nanosecond is applied to the data clock 118 through each iteration of the training cycle. With each iteration, the transmitter 202 calculates and stores the round trip time for each transmit-receive-acknowledge cycle, along with the offset applied (Step 306). In this example, the timer 116 increments an internal counter measuring this round trip time. Table 1 illustrates the values obtained for ten iterations of the training cycle. Thus, for the first iteration, no offset is applied to the data clock 118, and the round trip counter value is seven. For the second iteration, a one nanosecond delay is applied to the data clock 118, and the round trip counter value is also seven. The process continues until at the tenth iteration, a nine nanosecond delay is applied to the data clock, and the round trip counter value is nine.
As is evident from the table, the counter values which appeared most frequently throughout the ten iterations were counter values of eight. Thus, the transmitter 202 looks up the minimum and maximum delay values for a counter value of eight, which are 2 ns and 7 ns, respectively. The preferred offset for this example is the average offset, or:
Preferred Offset=(Min+Max)/2
Preferred Offset=(2 ns+7 ns)/2
Preferred Offset=4.5 ns.
Thus, the data clock 118 would be adjusted by the preferred offset of 4.5 ns, and normal operation would continue accordingly. Thereafter, all subsequent transmissions would include the data clock signal as adjusted by 4.5 ns.
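The selection of the preferred offset can be illustrated with a short Python sketch. The list `TABLE_1` is a hypothetical reconstruction chosen to agree with the values described above (seven at 0 ns and 1 ns, eight from 2 ns through 7 ns, nine at 9 ns); the entry for 8 ns is an assumption, as are the function and variable names.

```python
from collections import Counter

# Hypothetical (offset in ns, round trip counter) pairs consistent with the
# description of Table 1; the entry for 8 ns is an assumption.
TABLE_1 = [(0, 7), (1, 7), (2, 8), (3, 8), (4, 8),
           (5, 8), (6, 8), (7, 8), (8, 9), (9, 9)]


def preferred_offset(records):
    """Average of the min and max offsets for the most frequent counter value."""
    counts = Counter(counter for _, counter in records)
    most_frequent, _ = counts.most_common(1)[0]
    offsets = [offset for offset, counter in records if counter == most_frequent]
    return (min(offsets) + max(offsets)) / 2


print(preferred_offset(TABLE_1))  # prints 4.5
```

Taking the midpoint of the range keeps the data clock as far as possible from either edge at which the round trip counter changes value.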
In alternate embodiments, the time adjustment applied may be rounded to the nearest whole number, or five nanoseconds. Furthermore, in alternate embodiments, the preferred offset need not be the average offset, and may comprise the median offset, or an offset reasonably close to the average or median offset. Although other offsets may be used, they would be less desirable, as the likelihood of sending data across a clock boundary increases as the offsets push the data clock towards the edge values of the round trip counter and, by doing so, move closer to a non-determinism point.
Another example is illustrated by Table 2 below:
In the exemplary embodiment illustrated by Table 2, assume again that the transmitter clock 108 operates at 100 MHz and an offset of one nanosecond is applied to the data clock 118 through each iteration of the training cycle. With each iteration, the transmitter 202 again calculates and stores the round trip time for each transmit-receive-acknowledge cycle, along with the offset applied (Step 306). In this example, however, Table 2 illustrates a different set of values obtained for ten iterations of the training cycle. Thus, for the first iteration, no offset is applied to the data clock 118, and the round trip counter value is eight. For the second iteration, a one nanosecond delay is applied to the data clock 118, and the round trip counter value is also eight. The process continues until at the tenth iteration, a nine nanosecond delay is applied to the data clock, and the round trip counter value is eight. Under this scenario, the training cycle has presumably crossed a period boundary.
A period boundary exists when the training cycle crosses a period edge of the transmitter clock 108. If a period boundary is crossed during the training cycle, the round trip values measured are preferably shifted such that the data clock 118 can be adjusted relative to the transmitter clock 108. This situation is illustrated above in Table 2.
As is evident from Table 2, the counter values which appeared most frequently throughout the ten iterations were again counter values of eight. However, if a 4.5 ns offset were applied to the data clock 118, the round trip counter value would register nine, not eight. Thus, it is readily apparent that a period boundary was crossed during the training cycle. In this case, improper entries in the table must be shifted by one clock period, or 10 ns, in order to compensate. Thus, the nine nanosecond delay would remain the same, along with its round trip counter value of eight. The delays for other entries corresponding to a round trip counter value of eight would be shifted accordingly. Thus, 0 ns would become 10 ns, 1 ns would become 11 ns, and so on, as indicated in parentheses in Table 2.
Thereafter, the transmitter 202 again looks up the minimum and maximum delay values for a counter value of eight, which are 8 ns and 13 ns, respectively. The preferred offset for this example is the average offset, or:
Preferred Offset=(Min+Max)/2
Preferred Offset=(8 ns+13 ns)/2
Preferred Offset=10.5 ns
Preferred Offset=0.5 ns (subtracting 10 ns for one clock period)
With the preferred offsets so calculated, the transmitter 202 and receiver 204 would again be able to proceed deterministically, and will thus enable fault-tolerant processing within the context of a standard, off-the-shelf computer system.
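The period boundary correction may be sketched in Python as follows. The list `TABLE_2` is a hypothetical reconstruction consistent with the values described for Table 2 (eight at 0 ns through 3 ns and at 8 ns and 9 ns); the intermediate entries of nine are assumptions. Detecting the boundary by looking for a gap in the run of offsets is likewise only one plausible approach, not necessarily the one used by any embodiment.

```python
from collections import Counter

CLOCK_PERIOD_NS = 10  # 100 MHz transmitter clock, as in the example

# Hypothetical (offset in ns, round trip counter) pairs; entries of nine are assumed.
TABLE_2 = [(0, 8), (1, 8), (2, 8), (3, 8), (4, 9),
           (5, 9), (6, 9), (7, 9), (8, 8), (9, 8)]


def preferred_offset_with_wrap(records, step_ns=1, period_ns=CLOCK_PERIOD_NS):
    """Preferred offset, shifting entries by one period if a boundary was crossed."""
    counts = Counter(counter for _, counter in records)
    most_frequent, _ = counts.most_common(1)[0]
    offsets = sorted(o for o, c in records if c == most_frequent)

    # A gap in the run of offsets suggests the sweep crossed a period boundary.
    for i in range(1, len(offsets)):
        if offsets[i] - offsets[i - 1] > step_ns:
            # Shift the entries before the gap forward by one clock period.
            offsets = offsets[i:] + [o + period_ns for o in offsets[:i]]
            break

    midpoint = (min(offsets) + max(offsets)) / 2
    return midpoint % period_ns  # subtract whole clock periods


print(preferred_offset_with_wrap(TABLE_2))  # prints 0.5
```

When no gap is found, the function reduces to the plain average of the minimum and maximum offsets.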
One skilled in the art will recognize the many advantages inherent in this system. Specifically, embodiments of the claimed invention allow for deterministic processing by both a transmitter and a receiver, without any modifications to a receiver or the receiver's logic, and over an asynchronous communications line. Furthermore, this system allows off-the-shelf computer systems to serve as fault-tolerant computer systems, as they may now be operated deterministically.
With Remote Clock Phase Determinism thus described, we will now turn to the second technique for facilitating deterministic processing, namely Time Slice Determinism.
Time Slice Determinism
Time Slice Determinism is a related system and method by which a transmitter and receiver may operate deterministically, even when connected by an asynchronous bus. Embodiments of this system are built around knowing the total variance across a communications link a priori. By restricting the times that a transmitter and receiver process packets, one can create a deterministic transfer regardless of transmission medium. This may be done through the use of a time slice, or window of time, during which packets may be sent, received and processed. Each time slice is preferably the same length, and is preferably measured in real time. In alternate embodiments, however, time slices may be represented by a fixed number of clock cycles from a core clock or other clock, so long as the time slices at each component have the same period. Embodiments incorporating Time Slice Determinism are discussed below in reference to
As illustrated, the transmitter 402 and the receiver 404 are connected via a communications link 106. As before, the transmitter 402 preferably comprises transmitter logic 110 which operates and processes instructions at a frequency set by a transmitter clock 108. Similarly, the receiver 404 preferably comprises receiver logic 114 which operates and processes instructions at a frequency set by the receiver clock 112. The receiver also comprises a FIFO buffer 410, which serves to store signals received via the communications link 106 until such time as they can be processed.
In this embodiment, the transmitter 402 and receiver 404 each also comprise respective time slice counters 406, 408. The time slice counters 406, 408 operate synchronously, and measure slices of time in order to synchronize processing between the transmitter 402 and the receiver 404. The time slice counters 406, 408 are preferably initialized simultaneously via an optional shared reset signal (not illustrated) or common core clock 412. The time slice counters 406, 408 then increment their time slice periods as would any other clock, thereby facilitating synchronous transfer between the transmitter 402 and receiver 404.
Preferably, the time slice may be defined a priori and may be hardwired or pre-programmed into the time slice counters 406, 408. Optionally, the time slice counters 406, 408 may be re-programmed at a later time, and re-initialized simultaneously, so as to use the newly defined time slice.
The size of a time slice is first determined by establishing the maximum and minimum delays that a signal may encounter as it travels from the transmitter 402 to the receiver 404 across the communications link. The difference between the maximum and minimum delays is the link variance. Link variance can be determined by the designer a priori, or established later, through experimentation, according to techniques generally known by those skilled in the art. Preferably, link variance should account for asynchronous clock domain crossings, transmission variance and clock recovery effects. Preferably, the link variance should be calculated in real time, rather than clock cycles, due to the potential differences in clock frequencies encountered across the link. The time slice period must be greater than the link variance.
In addition to being greater than the link variance, the time slice period must be an integer number of clock cycles of both the transmitter clock 108 and the receiver clock 112. Preferably, the time slice period is defined as a common multiple of these two clocks' periods, beginning with the least common multiple. Notably, the clock frequency for the communications link 106 may be disregarded when establishing the time slice period. In sum, the time slice should be defined as the smallest common multiple of the periods of the transmitter clock 108 and receiver clock 112 which is still greater than the total link variance.
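By way of illustration only, the following Python sketch computes a time slice period under one reading of the above requirements: the smallest period that is a whole number of cycles of both clocks and that exceeds the link variance. The function name, the use of integer picoseconds, and the example clock periods and link variance are assumptions made for this sketch.

```python
import math


def time_slice_period_ps(tx_period_ps, rx_period_ps, link_variance_ps):
    """Smallest common multiple of both clock periods exceeding the link variance.

    All arguments are integers in picoseconds so that the arithmetic is exact.
    """
    base = tx_period_ps * rx_period_ps // math.gcd(tx_period_ps, rx_period_ps)
    multiples = link_variance_ps // base + 1  # strictly greater than the variance
    return base * multiples


# Assumed example: 100 MHz transmitter clock (10 ns), 125 MHz receiver clock (8 ns),
# and a link variance of 50 ns.
print(time_slice_period_ps(10_000, 8_000, 50_000))  # prints 80000
```

In this assumed example the least common multiple of the two periods is 40 ns, which does not exceed the 50 ns variance, so the next common multiple, 80 ns, is selected.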
In operation, the transmitter 402 will only allow packets to be sent on time slice boundaries. Preferably, the packets will travel across the communications link 106 and will be stored by the receiver 404 in the FIFO buffer 410 until they are ready to be processed. In the receiver 404, the time slice counter 408 may be offset slightly to account for any fixed delay present in the communications link 106. Such an offset will guarantee that the earliest a packet can be received will be early in the time slice, and consequently that packets transmitted during a particular time slice will be received during the same time slice, as the designated time slice is greater than the fixed delay.
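The effect of transmitting only on time slice boundaries, and of offsetting the receiver's time slice counter by the fixed link delay, can be sketched in Python as follows. The function names and the numeric values (an 80 ns time slice, a 30 ns fixed delay, and up to 50 ns of additional variable delay) are hypothetical and chosen only for illustration.

```python
def next_slice_boundary_ns(now_ns, slice_period_ns):
    """Earliest time slice boundary at or after now_ns (transmitter side)."""
    return -(-now_ns // slice_period_ns) * slice_period_ns  # ceiling to a boundary


def receiver_slice_index(arrival_ns, slice_period_ns, fixed_delay_ns):
    """Slice in which a packet arrives, as seen by the receiver's offset counter."""
    return (arrival_ns - fixed_delay_ns) // slice_period_ns


SLICE_NS = 80        # assumed time slice period
FIXED_DELAY_NS = 30  # assumed fixed (minimum) link delay
VARIANCE_NS = 50     # assumed link variance, smaller than the time slice

send_ns = next_slice_boundary_ns(95, SLICE_NS)  # packet ready at 95 ns
earliest = send_ns + FIXED_DELAY_NS
latest = send_ns + FIXED_DELAY_NS + VARIANCE_NS
print(send_ns // SLICE_NS,
      receiver_slice_index(earliest, SLICE_NS, FIXED_DELAY_NS),
      receiver_slice_index(latest, SLICE_NS, FIXED_DELAY_NS))
```

Under these assumptions the earliest and latest possible arrivals fall within the same time slice in which the packet was transmitted, because the variance is smaller than the time slice period.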
As is typical with asynchronous communications links, packets sent from the transmitter 402 to the receiver 404 will preferably include a packet start bit or sequence. To avoid confusion with other bits, the start bit is preferably twice the size of any other bit in the transmission. The end of a packet is also preferably followed by a stop bit, which tells the receiver 404 that the packet has come to an end, that it should begin looking for the next start bit, and that any bits it receives before getting the next start bit should be ignored. To ensure data integrity, a parity bit is often added between the last bit of data and the stop bit. The parity bit allows the receiver 404 to detect certain transmission errors, such as a single flipped bit, in the data received.
When a start bit or sequence is received, the total length of the packet is preferably sampled. Based upon the length of the packet, the receiver logic 114 will preferably calculate the number of time slices which will be required to receive the entire data stream. The receiver will then wait that number of slices and then declare the packet valid upon the next time slice. Finally, the packet will be released by the FIFO buffer 410, and the receiver will process it accordingly.
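The receiver's calculation may be sketched in Python as follows. This is a sketch under stated assumptions rather than a definitive implementation: the function names are hypothetical, the 400 ps bit time assumes a 2.5 Gigabit per second lane, the 80 ns time slice is an arbitrary example, and the packet is assumed to become valid at the boundary following the slices spent waiting.

```python
def slices_to_receive(packet_bits, bit_time_ps, slice_period_ps):
    """Whole time slices needed to clock in a packet of the given length."""
    duration_ps = packet_bits * bit_time_ps
    return -(-duration_ps // slice_period_ps)  # ceiling division


def valid_slice(start_slice, packet_bits, bit_time_ps, slice_period_ps):
    """Index of the time slice at whose start the FIFO releases the packet.

    Assumes the start sequence was seen during start_slice; the receiver waits
    the computed number of slices and declares the packet valid on the next one.
    """
    return start_slice + slices_to_receive(packet_bits, bit_time_ps, slice_period_ps) + 1


# Assumed example: 1024-bit packet, 400 ps per bit, 80 ns (80000 ps) time slice,
# start sequence observed during time slice 2.
print(slices_to_receive(1024, 400, 80_000))  # prints 6
print(valid_slice(2, 1024, 400, 80_000))     # prints 9
```

Because the release point depends only on the slice in which the start sequence was seen and on the packet length, it does not vary with where within that slice the packet happened to arrive.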
One skilled in the art will realize the invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting of the invention described herein. Scope of the invention is thus indicated by the appended claims, rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
This application is a continuation-in-part of U.S. Ser. No. 11/095,173 filed Mar. 31, 2005, the entire disclosure of which is incorporated by reference herein.
Relation | Number | Date | Country
---|---|---|---
Parent | 11095173 | Mar 2005 | US
Child | 11143259 | Jun 2005 | US