Clock-embedded vector signaling codes

Abstract
Vector signaling codes providing guaranteed numbers of transitions per unit transmission interval are described, along with methods and systems for their generation and use. The described architecture may include multiple communications sub-systems, each having its own communications wire group or sub-channel, clock-embedded signaling code, pre- and post-processing stages to guarantee the desired code transition density, and global encoding and decoding stages to first distribute data elements among the sub-systems, and then to reconstitute the received data from its received sub-system elements.
Description

The field of the invention generally relates to communications systems for conveying information with vector signaling codes.


REFERENCES

The following references are herein incorporated by reference in their entirety for all purposes:


U.S. Patent Publication No. 2011/0268225 of U.S. patent application Ser. No. 12/784,414, filed May 20, 2010, naming Harm Cronie and Amin Shokrollahi, entitled “Orthogonal Differential Vector Signaling”, hereinafter identified as [Cronie I];


U.S. patent application Ser. No. 13/030,027, filed Feb. 17, 2011, naming Harm Cronie, Amin Shokrollahi and Armin Tajalli, entitled “Methods and Systems for Noise Resilient, Pin-Efficient and Low Power Communications with Sparse Signaling Codes”, hereinafter identified as [Cronie II];


U.S. Provisional Patent Application No. 61/753,870, filed Jan. 17, 2013, naming John Fox, Brian Holden, Peter Hunt, John D Keay, Amin Shokrollahi, Richard Simpson, Anant Singh, Andrew Kevin John Stewart, and Giuseppe Surace, entitled “Chip-to-Chip Communication with Reduced SSO Noise”, hereinafter identified as [Fox I];


U.S. patent application Ser. No. 13/842,740, filed Mar. 15, 2013, naming Brian Holden, Amin Shokrollahi and Anant Singh, entitled “Methods and Systems for Skew Tolerance in and Advanced Detectors for Vector Signaling Codes for Chip-to-Chip Communication”, hereinafter identified as [Holden I];


U.S. Provisional Patent Application No. 61/934,804, filed Feb. 2, 2014, naming Ali Hormati and Amin Shokrollahi, entitled “Methods for Code Evaluation Using ISI Ratio”, hereinafter identified as [Hormati I];


U.S. Provisional Patent Application No. 61/934,807, filed Feb. 2, 2014, naming Amin Shokrollahi, entitled “Vector Signaling Codes with High pin-efficiency and their Application to Chip-to-Chip Communications and Storage”, hereinafter identified as [Shokrollahi I];


U.S. Provisional Patent Application No. 61/839,360, filed Jun. 23, 2013, naming Amin Shokrollahi, entitled “Vector Signaling Codes with Reduced Receiver Complexity”, hereinafter identified as [Shokrollahi II].


The following additional references to prior art have been cited in this application:


U.S. Pat. No. 7,053,802, filed Apr. 22, 2004 and issued May 30, 2006, naming William Cornelius, entitled “Single-Ended Balance-Coded Interface with Embedded-Timing”, hereinafter identified as [Cornelius];


U.S. Pat. No. 8,064,535, filed Mar. 2, 2007 and issued Nov. 22, 2011, naming George Wiley, entitled “Three Phase and Polarity Encoded Serial Interface”, hereinafter identified as [Wiley].


FIELD OF THE INVENTION

The present invention relates generally to the field of communications, and more particularly to the transmission of signals capable of conveying information within and between integrated circuit devices.


BACKGROUND

In communication systems, a goal is to transport information from one physical location to another. It is typically desirable that the transport of this information is reliable, is fast and consumes a minimal amount of resources. One common information transfer medium is the serial communications link, which may be based on a single wire circuit relative to ground or other common reference, or multiple such circuits relative to ground or other common reference. A common example uses single-ended signaling (“SES”). SES operates by sending a signal on one wire, and measuring the signal relative to a fixed reference at the receiver. A serial communication link may also be based on multiple circuits used in relation to each other. A common example of the latter uses differential signaling (“DS”). Differential signaling operates by sending a signal on one wire and the opposite of that signal on a matching wire. The signal information is represented by the difference between the wires, rather than their absolute values relative to ground or other fixed reference.


There are a number of signaling methods that maintain the desirable properties of DS while increasing pin efficiency over DS. Vector signaling is one such method. With vector signaling, a plurality of signals on a plurality of wires is considered collectively although each of the plurality of signals might be independent. Each of the collective signals is referred to as a component and the number of wires in the plurality is referred to as the “dimension” of the vector. In some embodiments, the signal on one wire is entirely dependent on the signal on another wire, as is the case with DS pairs, so in some cases the dimension of the vector might refer to the number of degrees of freedom of signals on the plurality of wires instead of exactly the number of wires in the plurality of wires.


With binary vector signaling, each component or “symbol” of the vector takes on one of two possible values. With non-binary vector signaling, each symbol has a value that is a selection from a set of more than two possible values. Any suitable subset of a vector signaling code denotes a “subcode” of that code. Such a subcode may itself be a vector signaling code.


A vector signaling code, as described herein, is a collection C of vectors of the same length N, called codewords. The ratio between the binary logarithm of the size of C and the length N is called the pin-efficiency of the vector signaling code.
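
As a non-limiting illustration of this definition, the following Python sketch computes the pin-efficiency from the number of codewords and the codeword length. The function name `pin_efficiency` and the four-wire, eight-codeword example are assumptions chosen for illustration only.

```python
import math


def pin_efficiency(num_codewords: int, num_wires: int) -> float:
    """Return log2(|C|) / N for a code C with |C| codewords of length N."""
    if num_codewords < 1 or num_wires < 1:
        raise ValueError("code size and length must both be positive")
    return math.log2(num_codewords) / num_wires


# Differential signaling: two codewords carried on two wires.
ds_efficiency = pin_efficiency(2, 2)

# An illustrative (assumed) code with eight codewords on four wires.
example_efficiency = pin_efficiency(8, 4)
```

Under this definition a differential pair has a pin-efficiency of 0.5, while the assumed eight-codeword, four-wire code reaches 0.75.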



FIG. 1 illustrates a prior art communication system employing vector signaling codes. Bits x0, x1, . . . enter block-wise 100 into an encoder 105. The size of the block may vary and depends on the parameters of the vector signaling code. The encoder generates a codeword of the vector signaling code for which the system is designed. In operation, the encoder may generate information used to control PMOS and NMOS transistors within driver 110, generating voltages or currents on the N communication wires 115. Receiver 120 reads the signals on the wires, possibly including amplification, frequency compensation, and common mode signal cancellation. Receiver 120 provides its results to decoder 125, which recreates the input bits 130.


Depending on which vector signaling code is used, there may be no decoder, or no encoder, or neither a decoder nor an encoder. For example, for the 8b8w code disclosed in [Cronie II], both encoder 105 and decoder 125 exist. On the other hand, for the Hadamard code disclosed in [Cronie I], an explicit decoder may be unnecessary, as the system may be configured such that receiver 120 generates output bits 130 directly.


The operation of the transmitter, consisting of elements 100, 105, and 110, and that of the receiver, consisting of elements 120, 125, 130 have to be completely synchronized in order to guarantee correct functioning of the communication system. In some embodiments, this synchronization is performed by an external clock shared between the transmitter and the receiver. Other embodiments may combine the clock function with one or more of the data channels, as in the well-known Biphase encoding used for serial communications.


One important example is provided by memory interfaces in which a clock is generated on the controller and shared with the memory device. The memory device may use the clock information for its internal memory operations, as well as for I/O. Because of the burstiness and the asynchronicity of memory operations, the I/O may not be active all the time. Moreover, the main clock and the data lines may not be aligned due to skew. In such cases, additional strobe signals are used to indicate when to read and write the data.


BRIEF DESCRIPTION

Vector signaling codes providing guaranteed numbers of transitions per unit transmission interval are described, along with a generalized system architecture. Elements of the architecture may include multiple communications sub-systems, each having its own communications wire group or sub-channel, clock-embedded signaling code, pre- and post-processing stages to guarantee the desired code transition density, and global encoding and decoding stages to first distribute data elements among the sub-systems, and then to reconstitute the received data from its received sub-system elements. Example embodiments of each architectural element are described, as well as example code embodiments suitable for sub-channel communication.





BRIEF DESCRIPTION OF FIGURES


FIG. 1 illustrates a prior art communication system employing vector signaling codes.



FIG. 2 shows an embodiment of a vector signaling communications system with embedded clock information.



FIG. 3 is a block diagram of one embodiment of the history pre-coder.



FIG. 4 is a block diagram of one embodiment of the history post-decoder.



FIG. 5 is a flow chart for one embodiment of the Global Encoder.



FIG. 6 is a flow chart for one embodiment of the pre-code unit.



FIG. 7 is a flow chart for one embodiment of the post-decoder unit.



FIG. 8 is a flow chart for one embodiment of the Global Decoder.



FIG. 9 is a block diagram of one embodiment of the transmitter encoding portions of an ENRZ3 communications system.



FIG. 10 is a block diagram of one embodiment of the receiver decoding portions of an ENRZ3 communications system.



FIG. 11 is a block diagram of one embodiment of the transmitter encoding portions of an S34 communications system.



FIGS. 12A and 12B show schematic diagrams of two circuits providing an embodiment of an encoder for S34.



FIG. 13 is a block diagram of one embodiment of the receiver decoding portions of an S34 communications system.



FIG. 14 shows one embodiment of an encoder for S4 vector signaling code.



FIG. 15 shows one embodiment of an encoder for P3 vector signaling code.



FIG. 16 shows an embodiment of clock extraction using Analog Hysteresis plus Decision Feedback High Pass Filter clocking.



FIG. 17 shows an embodiment of clock extraction using Digital hysteresis plus Decision Feedback High Pass Filter clocking.



FIG. 18 illustrates an embodiment of clock extraction using Analog XOR clocking.



FIG. 19 illustrates an embodiment of clock extraction using per-codeword detectors and digital hysteresis.



FIG. 20 is a block diagram of an encoder embodiment, highlighting its open- and closed-loop processing circuit portions.



FIG. 21 is a block diagram of an encoder embodiment as in FIG. 20, where multiple instantiations of the open-loop portion of the circuit are implemented in parallel.



FIG. 22 is a block diagram of a decoder embodiment, highlighting the open- and closed-loop processing circuit portions.



FIG. 23 is a block diagram of a decoder embodiment as in FIG. 22, where multiple instantiations of the open-loop portion of the circuit are implemented in parallel.



FIG. 24 is a flowchart of a transmission method.



FIG. 25 is a flowchart of a reception method.





DETAILED DESCRIPTION

An embodiment of a vector signaling communication system with embedded clock information is shown in FIG. 2. Elements of this system will be referenced and further described in descriptions of subsequent figures.


The communication system of FIG. 2 consists of k distinct communication sub-systems, each comprising a history pre-coder 220, encoder 105, driver 110, n[i] communication wires, receiver 120, clock-recovery unit 235, decoder 125, and history post-decoder unit 245. There are a total of n[1]+n[2]+ . . . +n[k] communication wires, subdivided into k groups having n[1], n[2], . . . , n[k] wires, respectively. Each communication sub-system i utilizes a vector signaling code in which the codewords have n[i] coordinates.


As exemplified in this figure, bits x(0), . . . , x(N−1) enter as a block into “Global Encoder” unit 205. In some embodiments, this unit may only forward the bits in subgroups, while in other embodiments this unit may perform further computations on the incoming bits 200. Global Encoder 205 outputs k groups of bits 210, one for each of the communication sub-systems.


The i-th group of bits 210 enters the i-th history pre-coder unit 220, which in turn outputs another group of bits 230 which is forwarded to encoder 105 of the communication sub-system. Encoder 105 generates a codeword of its corresponding vector signaling code, and driver 110 drives the coordinates of this codeword on the n[i] communication wires as voltages or currents.


The communication wire voltages or currents are received as signals by receiver 120, which may perform further equalization and processing of the received signals, and may generate information for the clock-recovery unit 235 which recovers the clock information from the received signals. The received signals are further forwarded to decoder 125, which generates a group of bits 240 forwarded to the corresponding history post-decoder unit 245. This unit calculates a possibly new set of bits 250 and forwards these to the Global Decoder unit 260. As with the corresponding Global Encoder, in some embodiments Global Decoder 260 simply concatenates or combines inputs 250 to obtain output bits 270, while in other embodiments Global Decoder 260 performs additional calculations on the bits received 250 from the various history post-decoder units to re-generate the bits x(0), . . . , x(N−1) output as 270. The number of codewords of the vector signaling codes used in the i-th communication sub-system of FIG. 2 is denoted by M(i) in the following.


In accordance with at least one embodiment, reception of distinct codewords in each unit interval provides a self-clocking capability. Thus, decoder 125 may consider a previous unit interval ended and a new unit interval (and thus, a new need to decode a codeword) begun each time a new codeword (i.e., one different from the preceding codeword) appears at its input. In such an embodiment, for every unit interval a codeword is transmitted on each communication sub-system that is different from the codeword sent in the previous unit interval. Thus, the number of possible codewords across all the communication sub-systems is

(M(1)−1)*(M(2)−1)* . . . *(M(k)−1)  (Eqn. 1)
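
As a non-limiting illustration of (Eqn. 1), the following Python sketch computes the number of distinct codeword combinations available per unit interval and the largest whole number of bits that can be mapped onto them. The function names are assumptions introduced for this example; the codebook sizes 12, 12, and 6 are those used in the numerical example given later in this description.

```python
def combinations_per_interval(codebook_sizes):
    """Product of (M(i) - 1) over all sub-systems, as in Eqn. 1."""
    product = 1
    for m in codebook_sizes:
        if m < 2:
            raise ValueError("each code needs at least two codewords")
        product *= m - 1
    return product


def whole_bits_per_interval(codebook_sizes):
    """Largest N such that 2**N does not exceed the product from Eqn. 1."""
    return combinations_per_interval(codebook_sizes).bit_length() - 1


sizes = [12, 12, 6]
total = combinations_per_interval(sizes)   # 11 * 11 * 5
bits = whole_bits_per_interval(sizes)
```

With these sizes there are 605 usable combinations per unit interval, which is sufficient to carry a block of 9 bits.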


An embodiment of the history pre-coder unit 220 is shown in FIG. 3. One task of this unit is to make sure that the same codeword of the vector signaling code is not sent on the corresponding communication wires (also referred to herein as a sub-channel) in two consecutive unit intervals. Where the vector signaling code receiver uses comparators for the detection of the codeword, that condition guarantees that the output of at least one of the comparators changes value from one unit interval to the next. This value change can then be used to recover the clock information, to be subsequently described in more detail.


As shown in FIG. 3, the history pre-coder unit comprises a pre-coder 305 and a history memory unit 320. Upon receiving the block of bits b(0), . . . , b(L−1) from the Global Encoder 205, the pre-coder 305 computes its output using these bits, and the history bits in 320. It forwards the resulting bits 230 to the encoder 105, and simultaneously replaces the value of the history memory 320 with these bits. In some embodiments described below, the history memory 320 may keep the vector signaling codeword that was transmitted in the previous clock cycle and use a pre-coder which makes sure that the next transmitted codeword differs from the previous one. Such examples are given below for various types of vector signaling codes.


Similarly, an embodiment of the history post-decoder unit 245 is shown in FIG. 4. It comprises a post-decoder unit 405 and a history memory unit 420. Upon receiving the block 240 of bits from decoder 125, the post-decoder calculates a possibly new block of bits from the bits in 240 and the bits in its history unit 420, forwards the new bits 250 to the Global Decoder 260, and replaces the bits in its history unit with the received bits 240.


A flow-chart of an exemplary embodiment of the Global Encoder 205 is given in FIG. 5. The main task of the Global Encoder is to compute from the given block of bits x(0), . . . , x(N−1) a number k of blocks of bits, one for every communication sub-system in FIG. 2, such that these blocks are uniquely determined by the incoming bits 200, and vice-versa. In the procedure described in FIG. 5, the incoming bits x(0), . . . , x(N−1) in 510 are used in Step 520 to compute bit-representations of reduced-modulus integers y(1), y(2), . . . , y(k), wherein each y(i) is an integer from 0 to M(i)−2 inclusive (note that y(i) is strictly less than M(i)−1, and hence referred to herein as having a reduced-modulus), and wherein M(i) is the number of codewords of the vector signaling code used in the i-th communication sub-system in FIG. 2.


It might be expected that when converting a number to a mixed-base representation (i.e., a mixed modulus), the digits in each position would range from 0 to M−1, where the modulus M is determined by the number of possible signals. That is, if there are M possible signals or codes available to represent the digits (e.g., base 10 uses ten digits: 0 through 9, and base 5 uses five digits: 0 through 4), a typical conversion might use M values: 0 to M−1. Note, however, that the conversions described herein use digits 0 through M−2, and thus use a reduced modulus of M−1 compared to what would normally be available with a set of M signals, or vector code codewords. The advantages of using the reduced modulus values are described below.


The particular way this calculation is done in Step 520 is by representing the integer X whose binary representation is x(0), . . . , x(N−1), with x(0) being the least significant and x(N−1) being the most significant bit, as

$$X = \sum_{i=1}^{k} y(i) \prod_{j=1}^{i-1} \bigl(M(j)-1\bigr) \qquad \text{(Eqn. 2)}$$


Many different algorithms may be used to perform this computation, as is known to those of skill in the art. For example, where 0≦X<257, so N=9, M(1)=M(2)=12, M(3)=6, then we have y(1)=X mod 11, y(2)=(X−y(1))/11 mod 11, and y(3)=(X−y(1)−11*y(2))/121.
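
As a non-limiting sketch of one such algorithm, the following Python function derives the reduced-modulus integers y(1), . . . , y(k) by repeated division, with y(1) as the least significant digit. The function name `global_encode` and its argument names are assumptions introduced for this example and do not appear in the figures.

```python
def global_encode(x, codebook_sizes):
    """Split integer x into digits y(i), each in the range 0 .. M(i)-2.

    The radix for position i is M(i) - 1 (the reduced modulus), and the
    first returned digit is the least significant one.
    """
    radices = [m - 1 for m in codebook_sizes]
    limit = 1
    for r in radices:
        limit *= r
    if not 0 <= x < limit:
        raise ValueError("x does not fit in the available combinations")

    digits = []
    for r in radices:
        x, digit = divmod(x, r)   # peel off one reduced-modulus digit
        digits.append(digit)
    return digits


# Codebook sizes from the example above: M(1)=M(2)=12, M(3)=6.
y = global_encode(300, [12, 12, 6])
```

For X=300 this yields y(1)=3, y(2)=5, and y(3)=2, which agrees with 3 + 11*5 + 121*2 = 300.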


One embodiment of a general procedure for pre-code unit 220 is outlined in FIG. 6. It is assumed that the bits in the history memory unit 320 of FIG. 3 represent an integer, called h, in this figure. Upon receiving the block of L bits y(0,i), . . . , y(L−1,i) as the i-th output 210 of Global Encoder 205, the pre-coder calculates in Step 620 the integer b=(y+1+h) mod M(i), wherein y is the integer with bit-representation y(0,i), . . . , y(L−1,i), and M(i) is the number of codewords of the i-th vector signaling code. It is assumed that the integer h is between 0 and M(i)−1, so it corresponds uniquely to a codeword of the i-th vector signaling code. Moreover, since the value of y is, by construction, smaller than M(i)−1 (i.e., ≦M(i)−2), we always have that b is not equal to h mod M(i). Since h corresponds to the index of the codeword in the i-th vector signaling code transmitted in the last unit interval, and b corresponds to the index of the codeword transmitted in the current unit interval, this type of calculation makes sure that no two consecutive codewords are the same. The use of the reduced modulus in calculating the integers y causes the encoder to generate an output codeword that is different from the immediately prior codeword based on the reduced modulus digit (y) and the prior codeword (h). In summary, after an initial codeword h, selected from M codewords (0 to M−1), has been sent in a first signaling interval, a subsequent codeword is selected based on h+1+y, where y is a data-dependent reduced-modulus (M−1) integer and is in the range 0 to M−2, such that no valid data-dependent reduced modulus integer will result in the subsequent codeword equaling the initial codeword h.
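
The calculation of Step 620 may be sketched, without limitation, as the following Python class. The class name `HistoryPreCoder`, its method names, and the choice of 0 as the initial history value are assumptions introduced for this example.

```python
class HistoryPreCoder:
    """Keeps the previous codeword index h and computes b = (y + 1 + h) mod M."""

    def __init__(self, num_codewords, initial_index=0):
        if num_codewords < 2:
            raise ValueError("at least two codewords are required")
        self.m = num_codewords
        self.h = initial_index % num_codewords  # assumed starting history

    def encode(self, y):
        # y is a reduced-modulus digit, so it must lie in 0 .. M-2.
        if not 0 <= y <= self.m - 2:
            raise ValueError("y must be in the range 0 .. M-2")
        b = (y + 1 + self.h) % self.m
        self.h = b  # the history now holds the index just transmitted
        return b


pre = HistoryPreCoder(num_codewords=12)
sent = [pre.encode(y) for y in [0, 10, 10, 5]]
```

Because y+1 is always between 1 and M−1, each transmitted index differs from its predecessor; here the sequence of indices is 1, 0, 11, 5 following the assumed initial index 0.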


Other types of operations for the pre-code unit are also possible. For example, where M(i) is a power of 2, it is possible to ensure the distinctness of b and h using simple XOR arithmetic, as will be shown in the subsequent example of an ENRZ encoder.


An embodiment of the operation of the post-decoder unit 245 is shown in FIG. 7. The input to this procedure is a block of bits b(0), . . . , b(R−1) in Step 710. This block may have been produced by the decoder 125 of the i-th communication sub-system illustrated in FIG. 2. In Step 720, the post-decoder unit may use the bits in its history memory unit, interpreted as an integer h, to calculate an integer y=(b−1−h) mod M(i), wherein b is the integer with bit-representation b(0), . . . , b(R−1). In Step 730 the history value h is replaced by b, and simultaneously, y is forwarded to the Global Decoder 260.
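
The calculation of Step 720 and the history update of Step 730 may be sketched, without limitation, as follows. The class name `HistoryPostDecoder` and the initial history value of 0 are assumptions introduced for this example; the initial value must match the one assumed at the transmitter.

```python
class HistoryPostDecoder:
    """Recovers y = (b - 1 - h) mod M from the received codeword index b."""

    def __init__(self, num_codewords, initial_index=0):
        if num_codewords < 2:
            raise ValueError("at least two codewords are required")
        self.m = num_codewords
        self.h = initial_index % num_codewords  # assumed starting history

    def decode(self, b):
        if not 0 <= b < self.m:
            raise ValueError("b must be a valid codeword index")
        y = (b - 1 - self.h) % self.m  # Python's % is non-negative here
        self.h = b  # the history holds the last received index
        return y


post = HistoryPostDecoder(num_codewords=12)
recovered = [post.decode(b) for b in [1, 0, 11, 5]]
```

Applied to the received index sequence 1, 0, 11, 5 with M=12, the sketch returns the reduced-modulus digits 0, 10, 10, 5.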


The operation of an embodiment of the Global Decoder 260 is given in FIG. 8. The inputs to this procedure are y(1), . . . , y(k), wherein each y(i) is a block of bits generated by the post-decoder unit of the i-th communication sub-system. In Step 820 an integer X is calculated from y(1), . . . , y(k) according to the formulation in (Eqn. 2). The bit representation of this integer is the desired sequence of bits 270 in FIG. 2.
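
As a non-limiting sketch of Step 820, the following Python functions rebuild X from the reduced-modulus digits and then list its bits with x(0) as the least significant bit. The function names `global_decode` and `to_bits` are assumptions introduced for this example.

```python
def global_decode(digits, codebook_sizes):
    """Rebuild X as the sum of y(i) times the product of (M(j)-1) for j < i."""
    x = 0
    weight = 1
    for y, m in zip(digits, codebook_sizes):
        if not 0 <= y <= m - 2:
            raise ValueError("digit outside the reduced-modulus range")
        x += y * weight
        weight *= m - 1
    return x


def to_bits(x, num_bits):
    """Return [x(0), ..., x(N-1)] with x(0) the least significant bit."""
    return [(x >> i) & 1 for i in range(num_bits)]


x = global_decode([3, 5, 2], [12, 12, 6])
bits = to_bits(x, 9)
```

With M(1)=M(2)=12 and M(3)=6, the digits 3, 5, 2 reconstruct X=300.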


As mentioned above, in some applications the Global Encoder 205 may only forward the incoming bits in subgroups to the corresponding communication sub-systems, and the Global Decoder 260 may just collect the incoming bit blocks and concatenate them to obtain the bits 270. Some such examples are discussed further below.


Clock Extraction


[Holden I] describes comparator-based detectors for vector signaling codes designed such that no comparator is presented with ambiguous decision conditions; that is, at all times each comparator output is either explicitly true, or explicitly false. An embodiment based on such codes and detectors may be combined with a simple transition detector to extract sub-system transition information (herein called the “edge signal”) to drive a clock extraction circuit, as in 235 of FIG. 2. Three circuits for these codes are detailed below. These are referred to in said descriptions as AH-DF-HPF, UDH-DF-HPF, and A-XOR.


A fourth type of clock extractor, referred to in said descriptions as PCD-DH, uses a per-codeword detector. This type of detector works with vector signaling codes in which comparators may have ambiguous outputs.


In general, clock extraction embodiments detect changes in sub-system detector outputs. In some embodiments, only changes from one valid codeword to another valid codeword are detected, and in other embodiments decision feedback and/or hysteresis is provided to the input signal comparators to avoid extraneous transitions caused by signal reflections and noise. Any of a number of methods may then be used to analyze the edge signal to eliminate artifacts caused by near-simultaneous detector output transitions, including methods known to the art, producing a reliable sampling clock derived from the detector edges. One such embodiment incorporates fixed or variable delay stages and a simple state machine configured such that a clock output is produced a fixed delay time after the last edge signal transition, suppressing the effect of multiple edge signal transitions within the delay interval.


As will be apparent to one of skill in the art, propagation delay differences (also known as skew) within a communications channel group will result in different arrival times for receive data. If the amount of this skew is significant (i.e. more than a transmit unit interval), the teachings of [Holden I] may be applied to permit the coherent reconstruction of aggregated receive data.


Similarly, a communications system utilizing multiple sub-systems may generate a global receive clock by applying the same edge signal generation and sampling clock derivation methods using the individual sub-system receive clocks as inputs, and producing a global sampling clock suitable for sampling the aggregated receive data as obtained at 270 of FIG. 2. As in sub-system clock extraction, embodiments presenting significant skew between sub-system results must carefully control generation of an aggregate or global decoder output clock, such that all of the global decoder's component inputs are valid and the result meets all necessary set-up and hold times for subsequent circuits. Some embodiments may require intermediary holding latches on the sub-system results and/or other skew mitigation measures as taught by [Holden I] or as generally applied in practice.


Code/Receiver Categories for Clock Extraction


The codes and the receivers that accompany them that are used with these clocking solutions can be divided into two categories. The first group of codes can be described as Unambiguous Comparator Output code/receiver (UCO). For these code/receiver combinations, the binary or multiwire comparator circuits used in the defined receiver have unambiguous outputs for every codeword in the code. An example of a code that is always UCO is the ENRZ code, also known as H4 code or Hadamard code of size 4, as described in [Cronie I].


The second group of codes can be called Ambiguous Comparator Output code/receiver (ACO). In these code/receiver combinations, a given comparator is sometimes presented with inputs at the same level and thus has an ambiguous output for some codewords. These ambiguous outputs are later resolved in a decoder stage. An example of a code that is always ACO is the 8b8w code described in [Cronie II].


In practical implementations, most codes are either UCO or ACO. There are a few codes that are ACO with one receiver implementation and UCO with another receiver implementation, typically with more complex multi-input analog detectors.


AH-DF-HPF—Analog Hysteresis plus Decision Feedback High Pass Filter Clocking Solution


The following clocking solution is only applicable to UCO code/receiver solutions.


The simplest clock extraction embodiment adds an analog hysteresis function to each of the comparators in order to filter out the multiple zero crossings on the wires that are caused by noise and reflections, as illustrated in FIG. 16. However, there are known disadvantages to such solutions. The maximum amplitude of any reflections on the communications channel must be known, so that the hysteresis offset value may be chosen correctly. Such embodiments are known to add jitter to the recovered clock, as noise or reflections on the leading edge can cause the transition to occur early, causing the effective eye opening in the timing dimension to close, and reducing the ability of the receiver to handle difficult channels. Similarly, the added hysteresis lowers the receive sensitivity of the comparators, reducing the eye opening in the amplitude dimension as well. Finally, such analog hysteresis embodiments contain a closed loop circuit that must be implemented carefully.


The function of the hysteresis comparator can be described as follows:


HysOut = Hysteresis(HysIn, HysOffset)
{
  If HysOut == 0
    If HysIn > HysOffset, HysOut = 1;
    Else HysOut = 0;
  Else
    If HysIn > −HysOffset, HysOut = 1;
    Else HysOut = 0;
  Endif;
}


For each detector, the hysteresis functions are applied to the comparators:


HysOffset=voltage value determined either statically or adaptively that exceeds the expected amplitude of reflections and other noise sources in the receive signal.


C(x)=Hysteresis(detector inputs(x), HysOffset)
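
The behavior of the Hysteresis function may be sketched, without limitation, as a small stateful Python class operating on sampled input values. The class name `HysteresisComparator`, the sampled-data model, and the numeric values used are assumptions introduced for this example; an actual embodiment operates on continuous analog signals.

```python
class HysteresisComparator:
    """Comparator whose threshold depends on its previous output."""

    def __init__(self, hys_offset, initial_output=0):
        self.hys_offset = hys_offset
        self.out = initial_output

    def step(self, hys_in):
        # A low output must be exceeded by +offset to go high;
        # a high output stays high until the input falls to -offset or below.
        threshold = self.hys_offset if self.out == 0 else -self.hys_offset
        self.out = 1 if hys_in > threshold else 0
        return self.out


comparator = HysteresisComparator(hys_offset=0.1)
samples = [-0.3, 0.05, -0.05, 0.2, 0.05, -0.05, -0.2, -0.05]
outputs = [comparator.step(v) for v in samples]
```

In this sampled model, excursions smaller than the offset (such as 0.05 and −0.05) do not change the output, while the larger swings to 0.2 and −0.2 do.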


In the following example, the value “x” is shown to range from 0 to 2 for clarity. This is the case for the ENRZ code. For other UCO codes, the value that “x” would range over is equal to the number of comparators.


The clock signal is created by using an exclusive-or function to look for changes on any of the wires. The code delivers a transition on one wire each clock:


Clock=(C(0) XOR Q(0)) OR (C(1) XOR Q(1)) OR (C(2) XOR Q(2))
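
The combinational expression above may be sketched, without limitation, as the following Python function. Only the logic equation is modeled; the delay line, flip-flop timing, and feedback dynamics of FIG. 16 are deliberately not simulated. The function name `edge_clock` is an assumption introduced for this example.

```python
def edge_clock(c, q):
    """Return 1 when any comparator output C(x) differs from its stored Q(x)."""
    if len(c) != len(q):
        raise ValueError("C and Q must have one entry per comparator")
    return int(any(ci != qi for ci, qi in zip(c, q)))


stored = (0, 1, 1)                             # Q(0), Q(1), Q(2)
unchanged = edge_clock((0, 1, 1), stored)      # same comparator outputs
one_change = edge_clock((1, 1, 1), stored)     # C(0) has toggled
```

Because consecutive codewords are guaranteed to differ, at least one term of the OR is true after every codeword change.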


For each comparator, the data is delayed by a delay line that has a nominal delay of one half of the unit interval (UI). The actual delay would depend on the implementation and may be somewhat less or more than one half the UI:


D(0) = HalfUIDelayLine(C(0))
D(1) = HalfUIDelayLine(C(1))
D(2) = HalfUIDelayLine(C(2))


For each comparator, recover each bit with a D Flip-Flop (DFF) or cascade of latches in some implementations:


Q(0) = DFF(Clock, D(0))
Q(1) = DFF(Clock, D(1))
Q(2) = DFF(Clock, D(2))

/* Decode and retime the data */
DecodedData = Decode(Q(0), Q(1), Q(2))
RetimedDecodedData = DFFs(Clock, DecodedData)


UDH-DF-HPF—Unrolled Digital hysteresis plus Decision Feedback High Pass Filter clocking solution


The following clocking solution is only applicable to UCO code/receiver solutions.


An embodiment of clocking solution UDH-DF-HPF, shown in FIG. 17, performs six additional binary comparisons, such that two values of a hysteresis comparison are provided along with each data comparison. This embodiment has the advantage that the closed loop portion of the hysteresis function is digital, and the data path portion of the circuit has better sensitivity than AH-DF-HPF. The disadvantages include greater implementation size and higher power consumption, because of the additional comparators needed to produce the required hysteresis comparisons.


One embodiment uses two extra separate comparators that add and subtract a fixed value from the analog inputs, rather than using analog hysteresis feedback. The hysteresis function may then be implemented digitally.


Another embodiment uses a combined comparator that delivers three outputs, the regular comparator output, an output with the comparison done with the offset added, and a third with the comparison done with the offset subtracted.


This example uses the embodiment with separate comparators. In this example, the function of the regular comparators is described as follows:


Comparator(Inputs)


The operation of the offset comparators adds the offset value to the comparator inputs before the comparison is done. It is described as follows:


OffComparator(Inputs, HysOffset)


For a three-comparator code/receiver solution such as for the ENRZ code, the comparators are:


HysCompOutHigh(0) = OffComparator(Inputs(0), HysOffset)
CompOut(0) = Comparator(Inputs(0))
HysCompOutLow(0) = OffComparator(Inputs(0), −HysOffset)
HysCompOutHigh(1) = OffComparator(Inputs(1), HysOffset)
CompOut(1) = Comparator(Inputs(1))
HysCompOutLow(1) = OffComparator(Inputs(1), −HysOffset)
HysCompOutHigh(2) = OffComparator(Inputs(2), HysOffset)
CompOut(2) = Comparator(Inputs(2))
HysCompOutLow(2) = OffComparator(Inputs(2), −HysOffset)


This circuit recovers the clock by comparing the flip-flop outputs with the comparator outputs from the opposite side of center:


Clock =
  ((NOT Q(0)) AND HysCompOutHigh(0)) OR (Q(0) AND (NOT HysCompOutLow(0))) OR
  ((NOT Q(1)) AND HysCompOutHigh(1)) OR (Q(1) AND (NOT HysCompOutLow(1))) OR
  ((NOT Q(2)) AND HysCompOutHigh(2)) OR (Q(2) AND (NOT HysCompOutLow(2)))


The rest is the same as in the AH-DF-HPF embodiment.
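
The clock expression of this embodiment may be sketched, without limitation, in Python as follows. The sketch assumes a particular threshold convention, namely that the high-offset comparator output is true when the detector input exceeds +HysOffset and the low-offset comparator output is true when it exceeds −HysOffset; that convention, the function names, and the numeric values are assumptions introduced for this example.

```python
def offset_comparators(value, hys_offset):
    """Return (high, low) offset comparator outputs for one detector input.

    Assumed convention: high is true above +offset, low is true above -offset.
    """
    return value > hys_offset, value > -hys_offset


def udh_clock(q, comp_out_high, comp_out_low):
    """OR over comparators of (NOT Q AND high) OR (Q AND NOT low)."""
    return int(any(
        (not qi and hi) or (qi and not lo)
        for qi, hi, lo in zip(q, comp_out_high, comp_out_low)
    ))


def clock_for_inputs(q, values, hys_offset):
    """Evaluate the clock expression for a set of sampled detector inputs."""
    pairs = [offset_comparators(v, hys_offset) for v in values]
    highs = [high for high, _ in pairs]
    lows = [low for _, low in pairs]
    return udh_clock(q, highs, lows)


q = (0, 1, 1)
small_wiggle = clock_for_inputs(q, (0.05, 0.3, 0.3), hys_offset=0.1)
real_edge = clock_for_inputs(q, (0.3, 0.3, 0.3), hys_offset=0.1)
```

Under the assumed convention, a stored 0 produces a clock only when its input rises above the positive offset, and a stored 1 only when its input falls below the negative offset, so excursions inside the hysteresis band are ignored.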


A-XOR—Analog XOR Clocking Solution


An embodiment of clock extraction using Analog XOR clocking is shown in FIG. 18. This embodiment is compatible with both UCO and ACO code/receiver solutions.


Each comparator function is divided into two halves. The first half of each comparator is a linear low gain comparator that performs the function of the comparator with a linear output. Each of these linear values is then passed through an analog low-pass filter. Each linear value is compared against the analog low-pass filtered version of itself by an analog XOR circuit, which serves as the second half of the comparison function. Analog XOR circuits are well known in the art. The analog XOR circuit will produce a voltage output that has a higher value if the inputs have different values than if they have the same value.


The outputs of the three analog XOR circuits are summed. The output of the summer is passed through a limiting gain stage to give the signal sharp edges. This signal then forms the clock.


In parallel to the clock path, in the data path, the output of the low gain comparator is passed through a gain stage to form a regular binary comparator. The clock is used to sample this data.


A challenge with this circuit is that the detected change is less for some code transitions than for others. This circuit is also sensitive to reflections and noise.


PCD-DH—Per Codeword Detectors, Digital Hysteresis Clocking Solution


This embodiment is compatible with both UCO and ACO code/receiver solutions.


As illustrated in FIG. 19, this embodiment of a clock extraction circuit does not use an analog hysteresis circuit. Instead it uses normal comparators 1910. A special unrolled and equal-delay digital detector is implemented that has one output for each of the allowed codewords.


These per-codeword outputs put out a high value if that codeword is present on the output of comparators 1910, and a low value if that codeword is not present. The circuit is implemented to have a roughly equal delay from the output of each of the comparators to the output of each of the per-codeword detectors. An example of such an equal-delay circuit is a circuit that has an AND gate 1920 per codeword. That AND gate has the same number of legs as the number of comparators. The inputs of the legs of the AND gates are wired to the appropriate true or complement outputs of the comparators, here shown as distinct true and complementary inputs to each AND gate 1920. The particular decoded values shown are exemplary and non-limiting.


When ACO codes are employed with this detector, the per-codeword detectors are only connected to those comparator outputs that are needed to detect that codeword and not to those that have an ambiguous value for that codeword.


The output of each of the per-codeword detectors is wired to the Set input of a per-codeword Resettable D Flip-Flop with the D input set to a high value (or an equivalent circuit). For purposes of illustration, the flip-flops 1930 are shown in FIG. 19 as edge triggered set/reset devices, with the output Q going true on a rising edge of input S, and going false on a rising edge of input R. Thus, any codeword detected by AND gates 1920 will cause the corresponding flip-flop 1930 to set. The outputs of all of these Flip-Flops 1930 are ORed together 1940 and delayed by a delay line 1950 that is statically or dynamically calibrated to create a rising edge in the middle of the data eye. Said rising edge signal is used as the clock in a data re-timer circuit. Said rising edge signal is also connected to the Reset input of each flip-flop 1930 to clear the detectors for the next clock cycle.


The described embodiment will catch the first instance within a clock cycle of a codeword being detected and will ignore subsequent reflections that cause zero-crossings.
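The behavior of the per-codeword detectors and flip-flops can be illustrated with a small behavioral model in Python. This is only a sketch under stated assumptions: the class name PerCodewordClock, the three example comparator patterns, and the discrete "sample" steps are hypothetical, and the delay line 1950 is reduced to an explicit reset() call.

```python
class PerCodewordClock:
    """Behavioral model: one AND detector and one set/reset latch per codeword."""

    def __init__(self, codeword_patterns):
        # Each pattern lists the comparator outputs (0 or 1) expected for one
        # allowed codeword; None marks a comparator left unconnected (ACO codes).
        self.patterns = list(codeword_patterns)
        self.latches = [False] * len(self.patterns)
        self.prev_hits = [False] * len(self.patterns)

    def _detect(self, comparator_bits):
        # AND gate per codeword over true or complemented comparator outputs.
        return [
            all(p is None or p == b for p, b in zip(pattern, comparator_bits))
            for pattern in self.patterns
        ]

    def sample(self, comparator_bits):
        """Apply one snapshot of comparator outputs; return the ORed latch output."""
        hits = self._detect(comparator_bits)
        for i, (hit, was_hit) in enumerate(zip(hits, self.prev_hits)):
            if hit and not was_hit:      # rising edge on the Set input
                self.latches[i] = True   # a set latch stays set until reset
        self.prev_hits = hits
        return any(self.latches)

    def reset(self):
        """Delayed clock edge fed back to the Reset inputs."""
        self.latches = [False] * len(self.patterns)


clk = PerCodewordClock([(0, 0, 1), (0, 1, 0), (1, 0, 0)])
first = clk.sample((0, 0, 1))       # first codeword detected
clk.reset()
held = clk.sample((0, 0, 1))        # same codeword still present, no new edge
second = clk.sample((0, 1, 0))      # new codeword detected
glitch = clk.sample((0, 1, 1))      # reflection disturbs the comparators
settled = clk.sample((0, 1, 0))     # codeword detected again in the same cycle
```

In this model the ORed output rises once per cycle and stays high through the glitch, so a reflection after the first detection does not produce a second clock edge.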


Memory Links


As one specific example applying the previously described systems and methods, an embodiment is described of links connecting one or more Dynamic Random Access Memory (DRAM) units to a memory controller.


Traditionally, such links are byte-oriented, with each data byte communicated over 8 wires in a single-ended manner, and a 9th wire communicating a write mask signal identifying whether the data byte is to be applied or ignored in the associated memory operation. Two more wires provide a strobe signal using differential signaling. As has been noted in prior art such as [Wiley] and [Cornelius], the ability to embed the clock information into the data and hence eliminate the need for separate strobe signals can be advantageous. The following examples present several vector signaling codes and show how they can be used in conjunction with the general principles described above.


In order to have a system according to FIG. 2 for such a memory link, the number of vector signaling codewords in these applications has to satisfy the inequality

257≦(M(1)−1)* . . . *(M(k)−1)  (Eqn. 3)

as 256 distinct codewords are required to communicate 8 bits of data, and at least a 257th codeword is required to communicate the notification provided by the write mask signal that this data byte is to be ignored for this memory operation.
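As a quick arithmetic check, the inequality of (Eqn. 3) can be evaluated for the codeword counts M(i) quoted in the examples of this section. The sketch below is illustrative only; the dictionary keys and the helper name usable_states are hypothetical labels.

```python
from math import prod

# Codeword counts M(i) per sub-system, as quoted in the examples of this section.
schemes = {
    "ENRZ3": [8, 8, 8],
    "S34": [6, 6, 6, 6],
    "S42xP3": [12, 12, 4],
    "OCT3": [8, 8, 8],
    "C182": [18, 18],
}


def usable_states(codeword_counts):
    """Right-hand side of Eqn. 3: the product of (M(i) - 1) over all sub-systems."""
    return prod(m - 1 for m in codeword_counts)


for name, counts in schemes.items():
    states = usable_states(counts)
    print(f"{name}: {states} states, satisfies Eqn. 3: {states >= 257}")
```

Each sub-system contributes M(i)−1 rather than M(i) states because the codeword sent in the previous interval may not be repeated.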


EXAMPLE 1: ENRZ3

ENRZ is a vector signaling code obtained from a 4×4 Hadamard transform, as described in [Cronie I]. It has eight codewords and transmits them on 4 wires. The eight codewords are the four permutations of the vector (1, −⅓, −⅓, −⅓) and the four permutations of (−1, ⅓, ⅓, ⅓). In this case, k=3, and M(1)=M(2)=M(3)=8. The inequality of (Eqn. 3) is satisfied, since 7*7*7=343. The resulting embodiment is hereinafter called ENRZ3, referring to its three sub-systems, each utilizing the ENRZ vector signaling code.


An exemplary operation of the encoder is detailed in FIG. 9. The input to the Global Encoder consists of 9 bits x0, x1, . . . , x8 corresponding to an integer between 0 and 256 inclusive (that is, 257 distinct values.) The Global Encoder may have an implementation as previously described in FIG. 5. It produces 3 groups of 3 bits, called (a0, a1, a2), (b0, b1, b2), and (c0, c1, c2), one group of bits for each ENRZ sub-system. Each of these vectors corresponds to the bit-representation of an integer modulo 7. This means that none of these vectors consists of three 1's. The history units 320 each contain 3 bits corresponding to the bit sequences transmitted in the previous unit interval, and called respectively h0, h1, and h2.


The pre-coding units 305 used in this example operate differently than the general pre-coding units described in FIG. 6, as the particular input characteristics permit simplification. Here, each pre-coding unit computes the XOR of the complement of the inputs 210 from the Global Encoder 205, with its corresponding history bits. Since none of the vectors 210 consists entirely of 1's, the complement of none of these vectors consists entirely of 0's, and hence the operation of the pre-coding unit ensures that the result of the operation is always different from the bits in the corresponding history units 320. Each of the pre-coding units forwards the computed bits to the corresponding ENRZ encoders 105, and simultaneously replaces the history bits with these bits.


Each communication sub-system in this embodiment transmits 3 bits on its corresponding 4-wire interface. The number of wires is therefore 12. Each sub-system uses 3 multi-input comparators (also known as generalized comparators, as described in [Holden I]) to recover its bits. The output of these comparators can be used to do a clock recovery on every one of the sub-systems, according to the teachings above. There are therefore a total of 9 comparators.



FIG. 10 is an exemplary embodiment of the receiver portion of the decoder for this communication system. In operation, the ENRZ decoders 125 forward a group 240 of three bits each to the post-decoder units 405. These units XOR the incoming bits with the 3 bits in their history units 420, complement the result, and forward it to the Global Decoder 260. Simultaneously, they replace their three history bits with the incoming bits.
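The pre-coding and post-decoding steps for one ENRZ sub-system can be sketched in Python as follows. The class names, the integer representation of the three bits, and the all-zero initial history on both sides are assumptions for illustration; the sketch also assumes that the post-decoder history holds the bits received in the previous unit interval.

```python
MASK = 0b111  # three bits per ENRZ sub-system


class PreCoder:
    def __init__(self, history=0):
        self.history = history  # bits sent in the previous unit interval

    def step(self, data):
        if not 0 <= data < 7:
            raise ValueError("the Global Encoder never emits three 1's")
        out = (~data & MASK) ^ self.history  # complement the input, XOR with history
        self.history = out
        return out


class PostDecoder:
    def __init__(self, history=0):
        self.history = history  # bits received in the previous unit interval

    def step(self, received):
        data = ~(received ^ self.history) & MASK  # XOR with history, then complement
        self.history = received
        return data


tx, rx = PreCoder(), PostDecoder()
data_in = [0, 0, 3, 3, 6, 1, 1, 1]
line = [tx.step(d) for d in data_in]    # bits handed to the ENRZ encoder
data_out = [rx.step(w) for w in line]   # bits handed to the Global Decoder
```

Because the complemented input is never all zeros, consecutive entries of line always differ, even when the same data value is sent repeatedly.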


The operation of the Global Decoder 260 in this embodiment may be as described in FIG. 8.


The ISI ratio of this coding system, as defined in [Hormati] is 1, which is the lowest ISI ratio possible. This means that this coding system has a low susceptibility to ISI noise. This communication system uses 12 signal wires, and 9 comparators. To enable operation at high data rates, the wires have to be routed in 3 low-skew groups of 4 wires each.


EXAMPLE 2: S34

S3 is a vector signaling code on three wires consisting of the 6 permutations of the vector (+1, 0, −1). In this case, we may choose k=4, corresponding to four communication sub-systems in FIG. 2, and M(1)=M(2)=M(3)=M(4)=6, satisfying the inequality of (Eqn. 3). The resulting embodiment is hereinafter called S34, referring to its four sub-systems, each utilizing S3 vector signaling code. This coding scheme is similar to the one reported in [Wiley], though the details of the encoding and decoding are different.


An embodiment of the encoder is detailed in FIG. 11. The input consists of the 9 bits x0, x1, . . . , x8 corresponding to an integer between 0 and 256 inclusive. This means that x0=x1= . . . =x7=0 if x8=1. In this communication system there is no Global Encoder unit. Instead, the incoming bits are subdivided into three groups (x0, x1), (x2, x3), (x4, x5) of two bits each, and a fourth group (x6, x7, x8) of three bits. Because of the restriction on the input bits, the fourth group corresponds to an integer between 0 and 4, inclusive.
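The subdivision of the nine input bits can be sketched as plain bit slicing. The function names split_s34 and join_s34 are hypothetical, and the sketch assumes that x0 is the least significant bit of the input and that the first bit of each group is its least significant bit.

```python
def split_s34(value):
    """Split a 9-bit input (0..256) into the four S3 sub-system inputs."""
    if not 0 <= value <= 256:
        raise ValueError("input must be between 0 and 256 inclusive")
    return [
        value & 0b11,          # (x0, x1)
        (value >> 2) & 0b11,   # (x2, x3)
        (value >> 4) & 0b11,   # (x4, x5)
        value >> 6,            # (x6, x7, x8), at most 4 because value <= 256
    ]


def join_s34(groups):
    """Concatenate the four groups back into the 9-bit value."""
    g0, g1, g2, g3 = groups
    return g0 | (g1 << 2) | (g2 << 4) | (g3 << 6)


mask_word = split_s34(256)  # the 257th value, signaling the write mask
```

Every group stays below 5, the number of values each S3 sub-system can carry once the previous codeword is excluded.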


The history units 320 each contain 3 bits corresponding to the bit sequences transmitted in the previous unit interval, and can be viewed as integers modulo 6, and called h0, h1, h2, and h3, respectively.


The pre-coding units 305 operate as described in FIG. 6. Each of the pre-coding units forwards the computed bits to the corresponding S3 encoders 105, and simultaneously replaces the history bits with these bits.


Each communication sub-system in this example transmits two or more bits on its corresponding 3-wire interface using ternary signaling. In preferred embodiments, the encoders 105 may conveniently represent their ternary output by generating two bit vectors of length 3 such that each bit vector has exactly one “1”, and the positions of the 1's in these vectors are disjoint. In operation, the first bit vector may encode the position of the +1 in the vector signaling codes S3, and the second bit vector may encode the position of the −1, in the sense that a +1 is transmitted on the wire where the first bit vector is 1, a −1 is transmitted on the wire where the second bit vector is 1, and a 0 is transmitted on the wire if neither bit vector is 1. It will be apparent to one familiar with the art that the described bit vectors may be used to drive transistors in an output line driver generating the desired +1 and −1 output signal values.


An example of the operation of such an encoder is described in FIGS. 12A and 12B, showing two logical circuits. The inputs to these circuits are three incoming bits a,b,c corresponding to an integer between 0 and 5, inclusive, where a is the least and c is the most significant bit of the integer. The circuit of FIG. 12A does not, in fact, use the input a, and computes its three outputs as NOR(b,c), b, and c. In operation, the output of this circuit may be interpreted as a mask for the position of +1 in the codeword of S3 chosen to be transmitted. The circuit in FIG. 12B uses all three of its inputs and outputs, from top to bottom, the logical functions (¬(a^c))&(a^b), (¬b)&(a^c), and NOR(c, a^b), where ¬x is the complement of x, x^y is the XOR of x and y, x&y is the logical AND of x and y, and NOR(x,y) is the NOR of x and y. The circuit described is only an example, and one of moderate skill in the art will be aware of many other solutions.
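The two logic functions described for FIGS. 12A and 12B can be evaluated directly. The following Python sketch (the function name s3_encode is hypothetical) computes both masks for each input 0 through 5 and combines them into the ternary wire values, with +1 where the first mask is set and −1 where the second mask is set.

```python
def s3_encode(n):
    """Map an integer 0..5 to a codeword of S3 using the two mask circuits."""
    if not 0 <= n <= 5:
        raise ValueError("input must be between 0 and 5 inclusive")
    a, b, c = n & 1, (n >> 1) & 1, (n >> 2) & 1  # a is the least significant bit

    def nor(x, y):
        return 1 - (x | y)

    # Circuit of FIG. 12A as described: position of the +1 (input a is unused).
    plus = (nor(b, c), b, c)
    # Circuit of FIG. 12B as described: position of the -1.
    minus = ((1 - (a ^ c)) & (a ^ b), (1 - b) & (a ^ c), nor(c, a ^ b))
    # +1 where the first mask is 1, -1 where the second is 1, 0 elsewhere.
    return tuple(p - m for p, m in zip(plus, minus))


codewords = [s3_encode(n) for n in range(6)]
# codewords == [(1, 0, -1), (1, -1, 0), (-1, 1, 0), (0, 1, -1), (0, -1, 1), (-1, 0, 1)]
```

The six inputs yield the six distinct permutations of (+1, 0, −1), and the two masks never select the same wire.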


An exemplary embodiment of decoder 125 of FIG. 1 for the case of S3 coding is given in FIG. 13. The three communication wires S3D01, S3D02, S3D03 enter a network of comparators S3D20, S3D25, and S3D30. In operation, S3D20 produces an output of “0” if the value on wire S3D01 is larger than the value on wire S3D02, and otherwise the output is “1”. Similarly, the output of S3D25 is “0” if and only if the value on wire S3D01 is larger than the value on wire S3D03, and the output of S3D30 is “0” if and only if the value on wire S3D02 is larger than the value on wire S3D03. Decoder 125 is a circuit that computes as its first output the value B&C, as its second output the value A^B^C, and as its third output the value A&(¬C), wherein A, B, and C are the outputs of units S3D20, S3D25, and S3D30, respectively.
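The decoder logic can be sketched in the same way. The function name s3_decode is hypothetical, and the sketch assumes that the three comparators cover the three distinct wire pairs, that is, that the second comparator S3D25 compares the first and third wires.

```python
from itertools import permutations


def s3_decode(wires):
    """Comparator network and output logic described for the S3 decoder."""
    w1, w2, w3 = wires
    A = 0 if w1 > w2 else 1  # S3D20
    B = 0 if w1 > w3 else 1  # S3D25 (assumed: first wire versus third wire)
    C = 0 if w2 > w3 else 1  # S3D30
    return (B & C, A ^ B ^ C, A & (1 - C))


decoded = {cw: s3_decode(cw) for cw in permutations((1, 0, -1))}
```

Under these assumptions each of the six codewords produces a different 3-bit output. If the encoder is the one described for FIGS. 12A and 12B, the first, second, and third outputs correspond to the encoder inputs c, a, and b, respectively.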


The post-decoder units in this embodiment operate as described in FIG. 7. No explicit Global Decoder is required, as the bits output by the post-decoder units may simply be concatenated together to re-create the output bits 270 of FIG. 2.


The ISI ratio of this coding system is 2. This means that this coding system has a higher susceptibility to ISI noise than the ENRZ3 scheme. This communication system uses 12 signal wires, and 12 comparators. The wires have to be routed in 4 low-skew groups of 3 wires each.


EXAMPLE 3: CODE S42×P3

The S4 code is a vector signaling code on four wires consisting of the 12 distinct permutations of the vector (+1, 0, 0, −1). This code can be detected using six pairwise comparators. The ISI ratio of this code is 2.


The P3 code is a vector signaling code on three wires consisting of the four codewords (1, 0, −1), (−1, 0, 1), (0, 1, −1), and (0, −1, 1). The codewords can be detected using the comparators x−y and (x+y)/2−z on the received signals (x,y,z) on the three wires. The ISI ratio of this code is 1.
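That two comparators suffice for P3 can be verified by evaluating them on the four codewords. In the sketch below the function name p3_comparators is hypothetical, and each comparator output is modeled as the sign of the stated expression.

```python
P3 = [(1, 0, -1), (-1, 0, 1), (0, 1, -1), (0, -1, 1)]


def p3_comparators(x, y, z):
    """Sign outputs of the two comparators x-y and (x+y)/2-z."""
    return (x - y > 0, (x + y) / 2 - z > 0)


patterns = {cw: p3_comparators(*cw) for cw in P3}
```

The four codewords give four different output pairs, and neither comparator ever sees a zero input, so no codeword is ambiguous.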


For the communication system in FIG. 2, we choose 3 communication sub-systems, i.e., k=3, wherein the first two communication sub-systems use the vector signaling code S4, and the third one uses the vector signaling code P3. We have M(1)=M(2)=12, and M(3)=4, so that the inequality of (Eqn. 3) is satisfied. The resulting code is called S42×P3.


The Global Encoder 205 of FIG. 2, and the Global Decoder 260 of FIG. 2 can operate according to the procedures in FIG. 5 and FIG. 8, respectively. The history pre-coding and post-decoding units 220 and 245 may also operate according to the procedures in FIG. 3 and FIG. 4, respectively.


One embodiment of an encoder for the S4 code is given in FIG. 14. The encoder produces two bit-vectors, (p0, p1, p2, p3) through the upper circuit and (m0, m1, m2, m3) through the lower circuit, from inputs a,b,c,d representing an integer between 0 and 11 inclusive, wherein a is the least and d is the most significant bit of this integer. The bit sequence (p0, p1, p2, p3) is a mask for the position of the +1 in the corresponding codeword of S4, and (m0, m1, m2, m3) is a mask for the position of −1 in that codeword.


One embodiment of an encoder for the code P3 is given in FIG. 15. Similar to the encoder for S4, this encoder produces two bit-vectors (p0, p1) and (m0, m1) from its inputs a and b. These vectors are masks for the positions of +1 and −1, respectively, in the corresponding codeword of P3.


These example embodiments are for illustrative purposes only. They can be further optimized using methods well-known to those of skill in the art.


The ISI ratio of this coding system is 2. This means that this coding system has a higher susceptibility to ISI noise than the ENRZ3 scheme, but a similar susceptibility to ISI noise as S34. This is confirmed by statistical simulation results reported below.


This communication system uses 11 signal wires, and 14 comparators. The wires have to be routed in 2 low-skew groups of 4 wires and one low-skew group of 3 wires each.


EXAMPLE 4: OCT3

OCT is a vector signaling code on three wires consisting of the 8 codewords ±(0.6, −1, 0.4), ±(−0.2, −0.8, 1), ±(−0.8, −0.2, 1), ±(1, −0.6, −0.4). This code can be detected using four comparators x−y, (x+2*z)/3−y, (y+2*z)/3−x, (x+y)/2−z on input (x,y,z), which represents the received values on the three wires of the interface. This code was first described in [Shokrollahi I].
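The four OCT comparators can be checked numerically. The sketch below assumes that the eight codewords are the four listed vectors together with their negatives, and that the fourth comparator is (x+y)/2−z; the function name oct_comparators is hypothetical.

```python
BASE = [(0.6, -1, 0.4), (-0.2, -0.8, 1), (-0.8, -0.2, 1), (1, -0.6, -0.4)]
# Assumption: the code consists of the four base vectors and their negatives.
OCT = BASE + [tuple(-v for v in cw) for cw in BASE]


def oct_comparators(x, y, z):
    """Sign outputs of the four comparators listed for the OCT code."""
    return (
        x - y > 0,
        (x + 2 * z) / 3 - y > 0,
        (y + 2 * z) / 3 - x > 0,
        (x + y) / 2 - z > 0,  # assumed form of the fourth comparator
    )


patterns = [oct_comparators(*cw) for cw in OCT]
```

Under these assumptions the eight codewords produce eight different comparator output patterns, with each negated codeword giving the complemented pattern.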


For the communication system in FIG. 2, we choose 3 communication sub-systems, i.e., k=3, each using the vector signaling code OCT. We have M(1)=M(2)=M(3)=8, so that the inequality of (Eqn. 3) is satisfied. The resulting code is called OCT3.


In a first embodiment, Global Encoder 205 of FIG. 2 and the Global Decoder 260 of FIG. 2 operate according to the procedures in FIG. 5 and FIG. 8, respectively, and the history pre-coding and post-decoding units 220 and 245 operate according to the procedures in FIG. 3 and FIG. 4, respectively. In an alternative embodiment, pre-coding 220 and post-decoding 245 units operate according to the procedure outlined for ENRZ3 in FIG. 9 and FIG. 10, respectively.


The ISI ratio of this coding system is 8/3. This means that this coding system has a higher susceptibility to ISI noise than all the previous systems. This is confirmed by statistical simulation results reported below. This communication system uses 9 signal wires, and 12 comparators. The wires have to be routed in 3 low-skew groups of 3 wires each.


EXAMPLE 5: C182

The code C18 is a vector signaling code on four wires consisting of the 18 codewords


(−1, ⅓, −⅓, 1), (−1, ⅓, 1, −⅓), (−1, 1, −⅓, ⅓), (−1, 1, ⅓, −⅓), (−⅓, 1, −1, ⅓), (−⅓, 1, ⅓, −1), (⅓, −1, −⅓, 1), (⅓, −1, 1, −⅓), (1, −1, −⅓, ⅓), (1, −1, ⅓, −⅓), (1, −⅓, −1, ⅓), (1, −⅓, ⅓, −1), (−1, −⅓, ⅓, 1), (−1, −⅓, 1, ⅓), (−⅓, ⅓, −1, 1), (−⅓, ⅓, 1, −1), (⅓, 1, −1, −⅓), (⅓, 1, −⅓, −1).


This code can be detected using five comparators x-z, x-u, y-z, y-u, z-u on input (x,y,z,u) which represent the received values on the four wires of the interface. This code was first disclosed in [Shokrollahi II].


For the communication system in FIG. 2, we choose 2 communication sub-systems, i.e., k=2, each using the vector signaling code C18. We have M(1)=M(2)=18, so that the inequality of (Eqn. 3) is satisfied. The resulting code is called C182.


This communication system can be made to work without a global encoder or a global decoder unit. The history pre-coding 220 and post-decoding 245 units may operate according to the procedures in FIG. 3 and FIG. 4, respectively.


The ISI ratio of this coding system is 3. This means that this coding system has a higher susceptibility to ISI noise than all the previous systems. This is confirmed by statistical simulation results reported below. This communication system uses 8 signal wires, and 10 comparators. The wires have to be routed in 2 low-skew groups of 4 wires each.


Statistical Simulations


For the simulations below, the peak-to-peak voltage between the top and bottom levels was chosen to be 200 mV, and a channel model was used that is based on conventional communications channel characteristics for microstrips routed between integrated circuit devices. The only equalization used is a Tx FIR with one pre- and one post-cursor. The channel represents a realistic mobile DRAM channel, operating at a signaling rate of 7 GBaud, with the interfaces transmitting one full byte (plus mask) in every unit interval. The total throughput is therefore 56 Gbps.


Simulations were done with statistical eye program software proprietary to Kandou Bus, called “KEYE”. For all the resulting eye diagrams the minimum horizontal and the minimum vertical eye openings as shown in Table I were recorded. Most of the time, these two minima don't occur within the same eye.













TABLE I

Code       #wires   #comp.   ISI ratio   Max # wires in group   Minimal horizontal opening   Minimal vertical opening
ENRZ3      12       9        1           4                      92 psec                      83 mV
S34        12       12       2           3                      50 psec                      35 mV
S42 × P3   11       14       2           4                      49 psec                      34 mV
OCT3       9        12       2.667       3                      16 psec                      2 mV
C182       8        10       3           4                      7 psec                       1 mV









As can be seen, and as is to be expected, the minimal horizontal eye opening is a decreasing function of the ISI ratio. Higher crosstalk and lower margin further reduce the vertical opening for all codes other than ENRZ3.


Multi-Phase Embodiments


For each of the examples shown, an alternate embodiment exists that can be made to run faster through parallel implementation, often called a multi-phase implementation. In some embodiments, the positions of the encoder and pre-coder as shown in FIG. 3 may be more conveniently reversed to facilitate loop unrolling.


In one embodiment, in which an example transmit encoding function is shown in FIG. 20 and an example receive decoding function is shown in FIG. 22, the coding functions are divided into open-loop and closed-loop portions. The goal of such a division is to make the closed-loop portion as small as possible in order to allow it to run at the highest speed possible. The closed loop portion works with historical information of what was sent on the line. In one embodiment, said closed-loop circuit works with the sample from the previous clock time. The open-loop portion of the circuit does not work on historical information from the line.


Because the open-loop portion of said circuit does not use historical information, an embodiment incorporating multiple instantiations of the circuit can be implemented in parallel, as illustrated in the example transmit encoding function shown in FIG. 21 and the example receive decoding function shown in FIG. 23. This is often referred to as a multi-phase circuit because the said parallel circuits are fed their inputs and produce their outputs offset in time from the other parallel circuits, e.g. in different circuit phases.


This parallel operation allows said open-loop encode circuit to have a markedly higher effective throughput. The outputs of said parallel circuits are then multiplexed back together into one output that said closed-loop encode circuit can operate on.


In the transmitter, the operation that said parallel open-loop encode circuit must perform is to break down the data input b(0) through b(L−1) into chunks that have M(K)−1 states.


The operation that said closed-loop encode circuit must perform is to compare the vector with the last vector that was sent. If said vectors are the same, the vector is replaced by the pre-defined repeat code.


In the receiver, the operation that said closed-loop decode circuit must perform is to compare the vector received with the repeat code. If said vectors are the same, said vector is then replaced by the vector that had been received immediately prior to the repeat code.
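The closed-loop repeat-code substitution on both ends of the link can be sketched as follows. The names closed_loop_encode and closed_loop_decode and the symbol "R" standing in for the pre-defined repeat code are hypothetical; data vectors are represented by plain integers, and the sketch assumes both ends start with no history.

```python
REPEAT = "R"  # hypothetical stand-in for the pre-defined repeat code


def closed_loop_encode(vectors):
    """Replace a vector equal to the last vector sent by the repeat code."""
    sent, last = [], None
    for v in vectors:
        out = REPEAT if v == last else v
        sent.append(out)
        last = out  # history holds what was actually sent on the line
    return sent


def closed_loop_decode(received):
    """Replace a received repeat code by the vector received just before it."""
    data, prior = [], None
    for v in received:
        out = prior if v == REPEAT else v
        data.append(out)
        prior = out
    return data


data = [5, 5, 5, 2, 2, 7]
sent = closed_loop_encode(data)       # [5, 'R', 5, 2, 'R', 7]
recovered = closed_loop_decode(sent)
```

No two adjacent entries of sent are equal, so every unit interval carries a transition, and the receiver still recovers runs of identical data vectors.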


The operation that said parallel open-loop decode circuit must perform is to reassemble the vectors that have M(K)−1 states back into the data output of b(0) through b(L−1).
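One possible way to break an input value into chunks with M(K)−1 states, and to reassemble it, is a mixed-radix expansion. The sketch below is an assumption made for illustration and is not the procedure of FIG. 5 or FIG. 8, which are not reproduced here; the function names split_states and join_states are hypothetical.

```python
def split_states(value, moduli):
    """Mixed-radix split; moduli[i] is M(i) - 1 for sub-system i."""
    digits = []
    for m in moduli:
        value, digit = divmod(value, m)
        digits.append(digit)
    if value:
        raise ValueError("value does not fit in the available states")
    return digits


def join_states(digits, moduli):
    """Inverse of split_states: reassemble the original value."""
    value = 0
    for digit, m in zip(reversed(digits), reversed(moduli)):
        value = value * m + digit
    return value


moduli = [7, 7, 7]                  # three sub-systems with M = 8 codewords each
chunks = split_states(256, moduli)  # [4, 1, 5]
```

Because neither function depends on what was previously sent on the line, several copies can run in parallel on consecutive inputs, which is the property the open-loop portion relies on.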


Generalized Open-Loop, Closed-Loop Operation


Said division of labor between the open-loop and closed-loop portions of the encoder and decoder circuits allows high speed implementations of vector signaling codes that modify the high frequency aspects of the interface. For example, embodiments utilizing the TL-3 and TL-4 codes of [Fox I] can be subdivided into their open-loop and closed-loop components and implemented at higher speed than would otherwise be possible. These two codes do not implement clock encoding, but rather lower the high-frequency spectral content of the vector signaling, thus reducing its power consumption.


Embodiments

In one transmitter embodiment, a transmitter comprises a global transmission encoder used for accepting input data to be partitioned across two or more sub-channels of a communications channel and generating a set of reduced-modulus sub-channel transmit data; a communications sub-system for each of the two or more sub-channels, each comprising a data history pre-coder for accepting a respective one of the set of reduced-modulus sub-channel transmit data from the global transmission encoder and producing sub-channel transmit data based on the reduced-modulus sub-channel transmit data and a prior codeword such that a signaling transition is provided by not retransmitting a given codeword in adjacent signaling intervals; a data encoder to encode the sub-channel transmit data into codewords of a vector signaling code; and a driver to produce physical signals representing the vector signaling code on the communications sub-channel.


In one such transmitter embodiment, the global transmission encoder performs a computation on the input data producing multiple results to be distributed among the two or more sub-channels.


In one such transmitter embodiment, each of the data coders maintains a history of at least one previous transmission interval to insure its sub-channel transmit data changes in each transmission interval.


In one such transmitter embodiment, the vector signaling code for each sub-channel is selected from a group consisting of: ENRZ, S3, OCT, C18, S4, and P3.


In one such transmitter embodiment, the vector signaling code for at least one sub-system is S4, and for at least one other sub-system is P3.


In one such transmitter embodiment, each of the data encoders maintains a history of at least one previous transmission interval to insure its transmit vector changes in each transmission interval. In a further such embodiment, the transmitter is implemented with parallel instantiations of the data history pre-coder.


In one receiver embodiment, a receiver comprises a circuit for receiving physical signals on a communications sub-channel; a data decoder for decoding the received signals representing a vector signaling code; a data post-decoder for accepting the decoded received signals and producing received sub-system data; a global decoder for accepting received sub-system data from each of the two or more communications sub-systems to be reconstituted into a received version of a set of input data.


In one such receiver embodiment, the timing of at least each communications sub-channel receiver is derived from signal transitions within its communications sub-channel.


In one such receiver embodiment, the global decoder performs a complementary computation on the received sub-system data to obtain the received version of the input data.


In one such receiver embodiment, each of the data post-decoders maintains a history of at least one previous reception interval to accurately produce its received sub-system data from the decoded received signals.


In one such receiver embodiment, the timing of at least one communications sub-channel receiver is derived from received signal transitions produced by the pre-coding of the corresponding sub-channel transmit data.


In one such receiver embodiment, the timing of the global decoder is obtained from the timing of at least one sub-channel receiver.


In one such receiver embodiment, each of the data decoders maintains a history of at least one previous reception interval to accurately deliver data to the post-decoder.


In one such receiver embodiment, the receiver is implemented with parallel instantiations of the post-decoder.


In one such receiver embodiment, the receiver further comprises a clock extraction circuit, wherein the clock extraction circuit further comprises one or more implementations from the group consisting of: analog hysteresis, decision feedback, digital decision feedback, offset comparators, analog XOR logic, per-codeword detector logic, and per-codeword flip-flops. In a further such embodiment, the outputs of the per-codeword flip-flops are combined together and passed through a delay line circuit. In a further such embodiment, the output of the delay line is used to clear the per-codeword flip-flops.


In one embodiment, a method 2400 as depicted by FIG. 24 comprises: at block 2402, input data is processed and partitioned to be distributed across two or more sub-channels, each sub-channel comprising a plurality of signal lines; performing substantially in parallel for each of the two or more sub-channels: at block 2406 a portion of input data is pre-encoded and distributed to the respective sub-channel to produce sub-channel transmit data; at block 2410 the sub-channel transmit data is encoded into a codeword of a vector signaling code; and, at block 2414 physical signals are driven representing the codeword on the communications sub-channel.


In one embodiment, a method 2500 as depicted by FIG. 25 comprises: at block 2505 physical signals are detected on two or more communications sub-channels to produce received signals, each sub-channel comprising a plurality of signal lines; at block 2510, timing information is derived for each of the two or more communication sub-channels from the respective sub-channel encoded vector signaling code; for each of the two or more communications sub-channels, at block 2515 the received signals are decoded as a representation of a vector signaling code having M elements; at block 2520, received sub-system data is produced representing reduced-modulus (M−1) data for each of the two or more communications sub-channels; and, at block 2525 received sub-system data from each of the two or more sub-channels is processed to produce a received version of the input data output.

Claims
  • 1. An apparatus comprising: two or more receiver sub-systems, each receiver sub-system comprising: a receive circuit configured to receive symbols of a codeword of a vector signaling code on a receive sub-channel, wherein codewords received in any adjacent signaling interval are different; a data decoder configured to decode the received symbols of the codeword into a set of bits; a data post-decoder configured to receive the set of bits, and to produce received reduced-modulus sub-system data based on the received set of bits and a set of prior received bits; and, a global decoder configured to receive the received reduced-modulus sub-system data from each of the two or more receiver sub-systems and to reconstitute the received reduced-modulus sub-system data into a set of output data.
  • 2. The apparatus of claim 1, further comprising a clock extraction circuit configured to obtain a clock signal based on symbol transitions between codewords received in adjacent signaling intervals.
  • 3. The apparatus of claim 2, wherein the clock extraction circuit comprises an analog hysteresis circuit configured to filter out zero crossings caused by noise and/or reflections.
  • 4. The apparatus of claim 2, wherein the clock extraction circuit comprises offset comparators configured to add and subtract fixed values from the received symbols of the codeword.
  • 5. The apparatus of claim 2, wherein the clock extraction circuit comprises a combined comparator configured to deliver three comparator outputs: a regular comparator output, an output with an offset added, and an output with an offset subtracted, wherein the clock extraction circuit obtains the clock signal based on the three comparator outputs.
  • 6. The apparatus of claim 2, wherein the clock extraction circuit comprises an analog XOR circuit.
  • 7. The apparatus of claim 2, wherein the clock extraction circuit comprises per-codeword flip-flops, and wherein the outputs of the per-codeword flip-flops are combined together and passed through a delay line circuit.
  • 8. The apparatus of claim 7, wherein an output of the delay line is used to clear the per-codeword flip-flops.
  • 9. The apparatus of claim 1, wherein the global decoder is configured to perform a complementary computation on the received reduced-modulus sub-system data to obtain the set of output data.
  • 10. The apparatus of claim 1, wherein the global decoder is configured to concatenate the received reduced-modulus sub-channel data received from each receiver sub-system to generate the set of output data.
  • 11. A method comprising: receiving, via two or more receiver sub-systems, a respective set of symbols of a codeword of a vector signaling code on a respective receive sub-channel, wherein codewords received in any adjacent signaling interval are different; decoding, at a respective data decoder, each respective received symbols of the codeword into a respective set of bits; receiving, at a respective data post-decoder, the respective set of bits, and producing respective received reduced-modulus sub-system data based on the respective set of bits and a respective set of prior received bits; and, reconstituting, at a global decoder, each respective received reduced-modulus sub-system data from each of the two or more receiver sub-systems into a set of output data.
  • 12. The method of claim 11, further comprising obtaining a clock signal based on symbol transitions between codewords received in adjacent signaling intervals on at least one receiver sub-system.
  • 13. The method of claim 12, wherein obtaining the clock signal comprises filtering out zero crossings caused by noise and/or reflections using an analog hysteresis circuit.
  • 14. The method of claim 12, wherein obtaining the clock signal comprises adding and subtracting fixed values from the respective set of received symbols of the codeword using offset comparators.
  • 15. The method of claim 12, wherein obtaining the clock signal comprises operating on three comparator outputs provided by a combined comparator, the three comparator outputs comprising: a regular comparator output, an output with an offset added, and an output with an offset subtracted.
  • 16. The method of claim 12, wherein obtaining the clock signal comprises using an analog XOR circuit.
  • 17. The method of claim 12, wherein obtaining the clock signal comprises using per-codeword flip-flops, and wherein the outputs of the per-codeword flip-flops are combined together and passed through a delay line circuit.
  • 18. The method of claim 17, further comprising using an output of the delay line to clear the per-codeword flip-flops.
  • 19. The method of claim 11, further comprising performing a complementary computation on each respective received reduced-modulus sub-system data to obtain the set of output data.
  • 20. The method of claim 11, further comprising concatenating each received reduced-modulus sub-channel data received from each receiver sub-system to generate the set of output data.
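The receive path recited in claims 11 and 19 can be illustrated with a minimal Python sketch. It rests on assumptions that the claims do not state: each sub-system's code has M codewords indexed 0 to M-1, the decoded bits are treated as that integer index, the transmitter chooses each index as (previous index + 1 + data) mod M so that adjacent codewords always differ, and the global decoder combines the per-sub-system values as digits of a mixed-radix number. The function names `pre_code`, `post_decode`, and `global_decode` are hypothetical.

```python
from typing import Sequence


def pre_code(data: int, prev_index: int, num_codewords: int) -> int:
    """Assumed transmit-side step: pick a codeword index that differs from prev_index.

    data is a reduced-modulus value in the range 0..M-2, where M = num_codewords.
    """
    if not 0 <= data < num_codewords - 1:
        raise ValueError("data must lie in the reduced-modulus range 0..M-2")
    return (prev_index + 1 + data) % num_codewords


def post_decode(index: int, prev_index: int, num_codewords: int) -> int:
    """Recover reduced-modulus data from the current and prior codeword indices."""
    if index == prev_index:
        # A repeated codeword would mean no transition, which the code forbids.
        raise ValueError("adjacent codewords must differ")
    return (index - prev_index - 1) % num_codewords


def global_decode(digits: Sequence[int], num_codewords: Sequence[int]) -> int:
    """Assumed reconstitution: treat each sub-system value as a mixed-radix digit.

    digits[0] is the least significant digit; digit i has radix num_codewords[i] - 1.
    """
    value = 0
    for digit, m in zip(reversed(digits), reversed(num_codewords)):
        value = value * (m - 1) + digit
    return value
```

Under these assumptions, two sub-systems with four codewords each carry one base-3 digit per signaling interval, so `global_decode([2, 1], [4, 4])` evaluates to 5, and `post_decode(pre_code(d, p, 4), p, 4)` returns `d` for every valid `d` and prior index `p`.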
CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 14/636,098, filed Mar. 2, 2015, entitled “CLOCK-EMBEDDED VECTOR SIGNALING CODES”, which claims the benefit of U.S. Provisional Application No. 61/946,574, filed Feb. 28, 2014, which is hereby incorporated by reference in its entirety.

Related Publications (1)
Number Date Country
20160294586 A1 Oct 2016 US
Provisional Applications (1)
Number Date Country
61946574 Feb 2014 US
Continuations (1)
Relation Number Date Country
Parent 14636098 Mar 2015 US
Child 15176085 US