In the depicted scheme, the data is “double-pumped,” which means that data bits are transmitted on both the rising and falling edges of a clock, thereby effectively doubling the data bit rate. Each data transmitter 105 has flops 111, 112 coupled to feed data bits from a data source (e.g., read FIFO) through a 2:1 multiplexer 109 to a differential output driver 107 (e.g., a current mode driver to generate a differential voltage across termination resistors, not shown). (Note that for simplicity's sake, the output drivers are represented simply using driver block symbols. Skilled persons will appreciate, however, that actual implementations may include other blocks such as transmitter equalization circuitry, finite state machines, and other non-timing-critical logic, as well as more timing-critical circuits such as transmitter serializers, pre-driver blocks, and/or current-steering current-mode digital-to-analog converter circuits. In some embodiments, such timing-critical circuits may be supplied with a separate supply such as a filtered analog supply.) The clocks for the flops are provided from a PLL 119, which also feeds a single-ended to differential buffer 116 to provide the clock to an output clock driver 117 in each clock transmitter 115 to be forwarded to an associated receiver.
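The double-pumped data path described above can be sketched behaviorally as follows. This is a minimal illustration, not the circuit itself: the function name `double_pump` and the choice of which flop's bit goes out on the rising edge versus the falling edge are assumptions made for the example.

```python
def double_pump(d0_bits, d1_bits):
    """Interleave two half-rate bit streams into one full-rate stream.

    d0_bits and d1_bits stand in for the outputs of the two flops that
    feed the 2:1 multiplexer. Which stream is selected on the rising
    edge is an assumption of this sketch.
    """
    if len(d0_bits) != len(d1_bits):
        raise ValueError("both half-rate streams must have the same length")
    serial = []
    for d0, d1 in zip(d0_bits, d1_bits):
        serial.append(d0)  # bit sent on the rising clock edge (assumed)
        serial.append(d1)  # bit sent on the falling clock edge (assumed)
    return serial


# Three clock cycles carry six bits, i.e. twice the clock rate.
serial_stream = double_pump([1, 0, 1], [0, 0, 1])
```

Each clock cycle contributes two entries to the output list, which is the sense in which the data bit rate is doubled.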
The clock receivers 135 comprise a differential receiver 137 to receive the clock from an associated clock transmitter. The receiver 137 provides the received clock to a delay locked loop (DLL) 139, which typically provides the clock at two or more different delay values (as dictated by control circuitry, not shown) to its associated N data receivers.
Each data receiver 125 has a differential receiver 127, a phase interpolator circuit (PI) 129, and a sampling latch 131. A transmitted data signal is received by the receiver 127, which provides it to the latch in a suitable form. The latch samples (or captures) a data value in each data signal phase off of an edge of a clock that is generated by the PI 129. From here, it is transferred downstream into a suitable memory buffer and/or into other memory (not shown).
The phase of the PI-generated clock is dictated by the clock phases provided to the PI from the DLL 139. During initialization or calibration, training sequences are transmitted from the transmitters in each agent to their associated receivers to, among other reasons, set the DLLs so that the data is sampled sufficiently close to the center of the data phase. This compensates for jitter due to factors such as process and temperature variations. Unfortunately, this calibration or training can be impaired by non-ideal conditions, such as when noise is generated in the transmitter. For example, simultaneous switching noise (SSN) may be generated within the transmitters when transitions occur at the same time over the several lanes in a link. To reduce the effects of noise, in some cases, separate supplies may be used for the more timing-critical blocks in a transmitter. Unfortunately, however, this may not be practical, or it may not be sufficient, e.g., to meet more ambitious jitter requirements. Accordingly, new solutions may be desired.
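One common way to place a sampling point near the center of the data phase is to sweep the available phase settings while a known training pattern is received, note which settings sample it correctly, and pick the middle of the widest passing window. The sketch below illustrates only that selection step. It is not taken from the description: the function name, the list of pass/fail results, and the assumption of discrete, non-wrapping phase settings are all hypothetical.

```python
def center_of_passing_window(results):
    """Return the index in the middle of the longest run of True values.

    results[i] is True when the training pattern was received correctly
    at phase setting i (a hypothetical pass/fail sweep). Wrap-around of
    the phase range is ignored in this sketch.
    """
    best_start, best_len = None, 0
    run_start = None
    # A trailing False closes any run that reaches the end of the sweep.
    for i, ok in enumerate(list(results) + [False]):
        if ok and run_start is None:
            run_start = i
        elif not ok and run_start is not None:
            run_len = i - run_start
            if run_len > best_len:
                best_start, best_len = run_start, run_len
            run_start = None
    if best_start is None:
        raise ValueError("no passing phase setting found")
    return best_start + (best_len - 1) // 2


# Settings 2 through 6 pass, so the chosen setting is 4.
chosen = center_of_passing_window(
    [False, False, True, True, True, True, True, False]
)
```

Noise that shifts or narrows the passing window during the sweep moves the selected setting, which is one way to see how transmitter noise can impair training.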
Embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings in which like reference numerals refer to similar elements.
It has been discovered that package resonance and other detrimental noise may be caused by data-dependent transient switching at the transmitters. This problem may be exacerbated, for example, during training sequences when all of an agent's (or port's) transmitter lanes drive out the same training sequence data pattern simultaneously. (In some embodiments, a pseudo-random bit sequence may be used for the training sequence to reduce simultaneous switching noise during training, retraining, or idle periods, but the same, albeit pseudo-random, sequence may still be used for each lane at the same time, which may result in excessive simultaneous switching noise, SSN.) Accordingly, to redress this and other problems, methods and circuits to mitigate such data-dependent noise are provided herein.
An activity generator 205 is coupled to the transmitter and latch circuitry, as shown, to generate transients in the power supply, VSupply, used for the identified transmitter portion 204 when a data transition would otherwise not occur on a clock event, so that transients are periodically generated, thereby making them data independent. (The term “clock event” refers to a time, usually periodic, when a data bit is to be transmitted out of a transmitter and onto a channel. It will usually coincide with a falling and/or rising edge of a clock used to “clock” data out of the transmitter.)
The activity generator 205 has a transition detector 206 to detect whether a transition will occur and a replica load 208 to sufficiently replicate transient characteristics of the identified transmitter portion 204. For each clock event, the transition detector 206 determines if a data transition will occur and if not, then it excites the replica load 208 for that clock event. Otherwise, data will transition naturally for the clock event. In this way, transients are generated for each clock event, and they are no longer data dependent. Essentially, the overall net effect is constant switching transients that have little or no data frequency content, regardless of the package resonance. This means that the activity generator can be effectively used with any package resonance frequency and be insensitive to resonance frequency shifts.
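The effect of the activity generator on supply activity can be illustrated with a small behavioral model. This is a sketch under simplifying assumptions: each clock event is treated as either producing a switching transient or not, the replica load is assumed to match a real data transition exactly, and the function name `supply_events` is made up for the example.

```python
def supply_events(bits, use_activity_generator):
    """Return, per clock event, whether a switching transient occurs.

    bits is the serial data stream. Each entry after the first is one
    clock event. A transient occurs when the data toggles or, if the
    activity generator is enabled, when the replica load is excited
    because the data did not toggle.
    """
    events = []
    prev = bits[0]
    for bit in bits[1:]:
        data_transition = bit != prev
        replica_fired = use_activity_generator and not data_transition
        events.append(data_transition or replica_fired)
        prev = bit
    return events


stream = [0, 0, 1, 1, 1, 0]
without_generator = supply_events(stream, use_activity_generator=False)
with_generator = supply_events(stream, use_activity_generator=True)
# without_generator follows the data: [False, True, False, False, True]
# with_generator is True on every clock event
```

With the generator enabled, the pattern of transients is the same for any input stream, which is what removes the data frequency content from the supply noise in this idealized model.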
The replica load 308 comprises first and second sets of tapered inverters 411 and 413. Each set is configured to sufficiently replicate at least the transient characteristics for the identified portion of the transmitter 304, and possibly multiplexer 303 as well. (Tapered inverters may be convenient, but any suitable circuit, even the relevant transmitter circuit portions, could also be used for the replica load(s).) The replica loads are powered by the same supply (VSupply) that supplies the identified problematic transmitter portion. Thus, when either load is pulsed, it causes dynamic switching noise sufficiently similar to that produced when a data transition actually occurs.
The transition detector 306 comprises latches 403, 404, XNOR gates 405, 406, and AND gates 407, 408, all coupled together as shown.
In operation, D0 and D1 are alternately provided to the output driver 304 by their respective clocks. The transition detector circuit is configured so that XNOR gate 406 compares a present D0 with a previous D1 and asserts if they are the same or de-asserts if they are not the same. Similarly, XNOR gate 405 compares a present D1 with a previous D0 and asserts if they are the same or de-asserts if they are not the same. Thus, for every clock event, if no data transition occurs, one of the two XNOR gates will assert. Each is coupled to an AND gate (407, 408), which serves to synchronize the assertion from its XNOR gate with the appropriate clock. Thus, the assertion (indicative of no data transition) causes one of the replica loads (411 or 413) to be excited.
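The XNOR comparisons described above can be modeled at the logic level as follows. This is a sketch only: the serial order is assumed to be D0 followed by D1 within each cycle, the AND-gate clock gating is modeled implicitly by evaluating each comparison once per clock event, and the names `transition_detector`, `pulse_406`, `pulse_405`, and `initial_d1` are hypothetical. The sketch does not say which replica load each gate drives.

```python
def xnor(a, b):
    """Return 1 when both bits are equal, else 0."""
    return int(a == b)


def transition_detector(d0_bits, d1_bits, initial_d1=0):
    """Return one (pulse_406, pulse_405) pair per clock cycle.

    pulse_406 models XNOR gate 406: present D0 versus previous D1.
    pulse_405 models XNOR gate 405: present D1 versus previous D0,
    taken here as the D0 bit sent just before it (an assumption).
    A value of 1 means no data transition, so a replica load is pulsed.
    """
    pulses = []
    prev_d1 = initial_d1  # bit assumed on the line before the first D0
    for d0, d1 in zip(d0_bits, d1_bits):
        pulse_406 = xnor(d0, prev_d1)  # evaluated at the D0 clock event
        pulse_405 = xnor(d1, d0)       # evaluated at the D1 clock event
        pulses.append((pulse_406, pulse_405))
        prev_d1 = d1                   # latch D1 for the next cycle
    return pulses


# Serial stream is 1,0,0,0,1,1 after an initial 0 on the line.
pulses = transition_detector([1, 0, 1], [0, 0, 1], initial_d1=0)
```

For every clock event in this model, exactly one of two things happens: the data toggles, or the corresponding XNOR output is 1 and a replica load is pulsed.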
The memory 806 comprises one or more memory blocks to provide additional random access memory to the processor(s) 802. It may be implemented with any suitable memory including but not limited to dynamic random access memory, static random access memory, flash memory, or the like. In some embodiments, one or more of the memory 806, control functionality 804, and I/O components comprise links to establish interconnections with transmitters having activity generation as discussed herein.
In the preceding description, numerous specific details have been set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known circuits, structures and techniques may not have been shown in detail in order not to obscure an understanding of the description. With this in mind, references to “one embodiment”, “an embodiment”, “example embodiment”, “various embodiments”, etc., indicate that the embodiment(s) of the invention so described may include particular features, structures, or characteristics, but not every embodiment necessarily includes the particular features, structures, or characteristics. Further, some embodiments may have some, all, or none of the features described for other embodiments.
In the preceding description and following claims, the following terms should be construed as follows: The terms “coupled” and “connected,” along with their derivatives, may be used. It should be understood that these terms are not intended as synonyms for each other. Rather, in particular embodiments, “connected” is used to indicate that two or more elements are in direct physical or electrical contact with each other. “Coupled” is used to indicate that two or more elements co-operate or interact with each other, but they may or may not be in direct physical or electrical contact.
The invention is not limited to the embodiments described, but can be practiced with modification and alteration within the spirit and scope of the appended claims. For example, it should be appreciated that the present invention is applicable for use with all types of semiconductor integrated circuit (“IC”) chips. Examples of these IC chips include but are not limited to processors, controllers, chip set components, programmable logic arrays (PLA), memory chips, network chips, and the like.
It should also be appreciated that in some of the drawings, signal conductor lines are represented with lines. Some may be thicker, to indicate more constituent signal paths, have a number label, to indicate a number of constituent signal paths, and/or have arrows at one or more ends, to indicate primary information flow direction. This, however, should not be construed in a limiting manner. Rather, such added detail may be used in connection with one or more exemplary embodiments to facilitate easier understanding of a circuit. Any represented signal lines, whether or not having additional information, may actually comprise one or more signals that may travel in multiple directions and may be implemented with any suitable type of signal scheme, e.g., digital or analog lines implemented with differential pairs, optical fiber lines, and/or single-ended lines.
It should be appreciated that example sizes/models/values/ranges may have been given, although the present invention is not limited to the same. As manufacturing techniques (e.g., photolithography) mature over time, it is expected that devices of smaller size could be manufactured. In addition, well-known power/ground connections to IC chips and other components may or may not be shown within the FIGS., for simplicity of illustration and discussion, and so as not to obscure the invention. Further, arrangements may be shown in block diagram form in order to avoid obscuring the invention, and also in view of the fact that specifics with respect to implementation of such block diagram arrangements are highly dependent upon the platform within which the present invention is to be implemented, i.e., such specifics should be well within the purview of one skilled in the art. Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the invention, it should be apparent to one skilled in the art that the invention can be practiced without, or with variation of, these specific details. The description is thus to be regarded as illustrative instead of limiting.