Multiple adjacent slicewise layout of voltage-controlled oscillator

Information

  • Patent Grant
  • 11777475
  • Patent Number
    11,777,475
  • Date Filed
    Tuesday, May 31, 2022
  • Date Issued
    Tuesday, October 3, 2023
Abstract
Methods and systems are described for generating multiple phases of a local clock at a controllable variable frequency, using loop-connected strings of active circuit elements. A specific embodiment incorporates a loop of four active circuit elements, each element providing true and complement outputs that are cross-coupled to maintain a fixed phase relationship, and feed-forward connections at each loop node to facilitate high frequency operation. A particular physical layout is described that maximizes operating frequency and minimizes clock perturbations caused by unbalanced or asymmetric signal paths and parasitic node capacitances.
Description
REFERENCES

The following prior applications are herein incorporated by reference in their entirety for all purposes:


U.S. patent application Ser. No. 16/107,822 filed Aug. 21, 2018, entitled “High Performance Phase Locked Loop”, naming Armin Tajalli, hereinafter referred to as [Tajalli I].


FIELD OF THE INVENTION

The present embodiments relate to communications systems circuits generally, and more particularly to obtaining a stable, correctly phased receiver clock signal from a high-speed multi-wire interface used for chip-to-chip communication.


BACKGROUND

In modern digital systems, digital information has to be processed in a reliable and efficient way. In this context, digital information is to be understood as information available in discrete, i.e., discontinuous values. Bits, collection of bits, but also numbers from a finite set can be used to represent digital information.


In most chip-to-chip or device-to-device communication systems, communication takes place over a plurality of wires to increase the aggregate bandwidth. A single wire or a pair of these wires may be referred to as a channel or link, and multiple channels create a communication bus between the electronic components. At the physical circuitry level, in chip-to-chip communication systems, buses are typically made of electrical conductors in the package between chips and motherboards, on printed circuit boards (“PCBs”), or in cables and connectors between PCBs. In high frequency applications, microstrip or stripline PCB traces may be used.


Common methods for transmitting signals over bus wires include single-ended and differential signaling methods. In applications requiring high-speed communications, those methods can be further optimized in terms of power consumption and pin-efficiency. More recently, vector signaling methods have been proposed to further optimize the trade-offs between power consumption, pin efficiency and noise robustness of chip-to-chip communication systems. In such vector signaling systems, digital information at the transmitter is transformed into a different representation space in the form of a vector codeword that is chosen in order to optimize the power consumption, pin-efficiency and speed trade-offs based on the transmission channel properties and communication system design constraints. Herein, this process is referred to as “encoding”. The encoded codeword is communicated as a group of signals from the transmitter to one or more receivers. At a receiver, the received signals corresponding to the codeword are transformed back into the original digital information representation space. Herein, this process is referred to as “decoding”.


Regardless of the encoding method used, the received signals presented to the receiving device must be sampled (or their signal value otherwise recorded) at intervals best representing the original transmitted values, regardless of transmission channel delays, interference, and noise. Such Clock and Data Recovery (CDR) not only determines the appropriate sample timing, but may continue to do so continuously, providing dynamic compensation for varying signal propagation conditions.


Many known CDR systems utilize a Phase-Locked Loop (PLL) or Delay-Locked Loop (DLL) to synthesize a local receive clock having an appropriate frequency and phase for accurate receive data sampling.


BRIEF DESCRIPTION

To reliably detect the data values transmitted over a communications system, a receiver must accurately measure the received signal value amplitudes at carefully selected times. Various methods are known to facilitate such receive measurements, including reception of one or more dedicated clock signals associated with the transmitted data stream, extraction of clock signals embedded within the transmitted data stream, and synthesis of a local receive clock from known attributes of the communicated data stream. In general, the receiver embodiments of such timing methods are described as Clock-Data Recovery (CDR) or alternatively as Clock-Data Alignment (CDA), often based on Phase-Locked Loop (PLL) or Delay-Locked Loop (DLL) synthesis of a local receive clock having the desired frequency and phase characteristics.


In both PLL and DLL embodiments, a phase comparator compares the relative phase (and in some variations, the relative frequency) of a received reference signal and a local clock signal to produce an error signal, which is subsequently used to correct the phase and/or frequency of the local clock source and thus minimize the error. As this feedback loop behavior will lead to a given PLL embodiment producing a fixed phase relationship (as examples, 0 degrees or 90 degrees of phase offset) between the reference signal and the local clock, an additional fixed or variable phase adjustment is often introduced to permit the phase offset to be set to a different desired value (as one example, 45 degrees of phase offset) to facilitate receiver data detection.


In some embodiments, synthesis of additional local clock phases is desirable to enable multi-phase or pipelined processing of received data values, facilitate phase interpolation, and/or provide additional inputs to the phase detection process to reduce clock jitter and/or improve PLL closed-loop bandwidth. As one example, [Tajalli I] describes an embodiment in which multiple VCO phases and (optionally, multiple delayed) instances of a received clock reference are compared using a matrix of phase comparator elements, the multiple partial phase error signals of which are combined in a weighted summation to provide a VCO phase error correction.


Voltage-controlled oscillators utilizing loop-connected strings of active circuit elements are well represented in the art. The basic oscillation frequency of such an oscillator is determined by the total propagation time of the string of active elements. Thus, to enable high-speed operation, a simple digital inverter having minimal propagation delay is typically used as the active element. Loop-connected strings of differential amplifiers or buffers are also known, with stable oscillation occurring as long as the overall phase shift at the desired oscillation frequency is an odd multiple of 180 degrees.


The loop oscillation frequency may be varied using a control signal that adjusts active circuit element delay, which may in turn be a function of circuit gain and switching threshold. Alternatively, the effective propagation delay may be increased by limiting the slew rate of signal transitions propagating between loop elements, either explicitly by constraining the output current drive of each active circuit element, or implicitly by constraining the supply voltage or current provided to each active circuit element.


Conveniently, an N-element loop oscillator inherently provides N multiple clock output phases, each typically offset by 180/N degrees of phase difference. Thus, a loop oscillator comprised of four stages can provide four distinct phases of oscillator output signal, and if each stage also provides inverted and non-inverted outputs, four additional phases from the complementary outputs.





BRIEF DESCRIPTION OF FIGURES


FIG. 1 is a block diagram showing one embodiment of a variable frequency loop oscillator.



FIG. 2 is a schematic detailing one embodiment of one of eight identical sections of the loop oscillator of FIG. 1.



FIG. 3 is a schematic illustrating two embodiments which may be used as part of a duty cycle controller.



FIG. 4 illustrates a physical integrated circuit layout for one particular embodiment having eight VCO stages.



FIG. 5 illustrates a physical integrated circuit layout for a second embodiment having eight VCO stages.



FIG. 6 illustrates a physical integrated circuit layout for an embodiment having four VCO stages.





DETAILED DESCRIPTION

To reliably detect the data values transmitted over a communications system, a receiver must accurately measure the received signal value amplitudes at carefully selected times. Various methods are known to facilitate such receive measurements, including reception of one or more dedicated clock signals associated with the transmitted data stream, extraction of clock signals embedded within the transmitted data stream, and synthesis of a local receive clock from known attributes of the communicated data stream. In general, the receiver embodiments of such timing methods are described as Clock-Data Recovery (CDR) or alternatively as Clock-Data Alignment (CDA), often based on Phase-Locked Loop (PLL) or Delay-Locked Loop (DLL) synthesis of a local receive clock having the desired frequency and phase characteristics.


A typical PLL is composed of a phase comparator that compares an external reference signal to an internal clock signal, a low pass filter that smooths the resulting error value to produce a clock control signal, and a variable frequency clock source (typically, a Voltage Controlled Oscillator or VCO) controlled by the smoothed error value, producing the internal clock signal presented to the phase comparator. In a well-known variation, such a PLL design may incorporate a clock frequency divider between the VCO and the phase comparator, allowing a higher-frequency clock output to be phase locked to a lower-frequency reference signal.
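The feedback arrangement of phase comparator, low pass filter, and VCO can be illustrated with a small behavioral model. This is only a sketch under simplifying assumptions and does not represent the circuit of any embodiment: the function name `simulate_pll`, the linear phase comparator, the one-pole filter, and all numeric gains are hypothetical choices made for illustration.

```python
def simulate_pll(ref_freq_hz, vco_center_hz, vco_gain_hz_per_v,
                 kp=0.5, filt_alpha=0.1, steps=2000, dt=1e-9):
    """Behavioral PLL: phase comparator -> low-pass filter -> VCO.

    Phases are tracked in cycles. All parameters are illustrative
    assumptions, not values taken from the described embodiments.
    Returns (final VCO frequency in Hz, final phase error in cycles).
    """
    ref_phase = 0.0
    vco_phase = 0.0
    control_v = 0.0
    vco_freq = vco_center_hz
    for _ in range(steps):
        # Phase comparator (idealized as a linear subtraction).
        error = ref_phase - vco_phase
        # One-pole low-pass filter smoothing the scaled error.
        control_v += filt_alpha * (kp * error - control_v)
        # VCO: frequency is the center frequency plus gain times control.
        vco_freq = vco_center_hz + vco_gain_hz_per_v * control_v
        # Advance both phases by one time step.
        ref_phase += ref_freq_hz * dt
        vco_phase += vco_freq * dt
    return vco_freq, ref_phase - vco_phase


if __name__ == "__main__":
    freq, err = simulate_pll(1.0e9, 0.98e9, 1.0e8)
    print(f"VCO frequency: {freq:.1f} Hz, residual phase error: {err:.3f} cycles")
```

In this simplified loop the VCO frequency settles to the reference frequency while a constant phase offset remains, which mirrors the fixed phase relationship that such a feedback loop produces between the reference and the local clock.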


In an alternative Delay-Locked Loop (DLL) embodiment, the variable frequency clock source is replaced by a variable delay element, its (optionally multiple tapped) outputs thus representing one or more successive time-delayed versions of the original input signal rather than successive cycles of an oscillator to be phase compared to the reference input signal. For the purposes of this document, the variable delay elements utilized in a DLL are considered functionally equivalent to the variable delay elements of a loop-connected oscillator VCO in a PLL embodiment.


In some embodiments, the PLL may synthesize additional local clock phases to enable multi-phase or pipelined processing of received data values, facilitate phase interpolation, and/or provide additional inputs to the phase detection process to reduce clock jitter and/or improve PLL closed-loop bandwidth. As one example, [Tajalli I] describes an embodiment in which multiple VCO phases and (optionally, multiple delayed) instances of a received clock reference are compared using a matrix of phase comparator elements, the multiple partial phase error signals of which are combined in a weighted summation to provide a VCO phase error correction.


A simple digital XOR gate may be used as a phase comparator. As a non-limiting example, an XOR between two square wave signals having a phase difference will result in a variable-duty-cycle waveform which, when low pass filtered into an analog error signal, results in a proportional error signal centered in its analog signal range when the two input signals have a 90-degree phase offset relationship. More complex finite state machine (FSM) phase comparators compare the relative arrival times of clock transitions, as one example using edge-triggered latches clocked respectively by the reference and internal clock signals, with the relative arrival time of the clock edges resulting in an “early” or “late” output signal that produces a corresponding correction of the VCO phase. Other FSM phase comparator embodiments are additionally sensitive to clock frequency differences, enabling faster initial loop lock at startup. Further embodiments accumulate multiple phase error measurements into an integral error result which may be used alone or in combination with a short-term proportional result as a VCO control value. Some embodiments incorporate some or all of these functions into firmware or software executing on a CPU, or implemented as a hardware state machine.
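The XOR comparator behavior described above can be checked numerically. The following is a minimal sketch assuming ideal 50% duty-cycle square waves and an ideal averaging filter; the function name `xor_phase_detector_average` and the sampling resolution are hypothetical.

```python
def xor_phase_detector_average(phase_offset_deg, samples_per_cycle=3600):
    """Average output (0.0 to 1.0) of an XOR of two ideal 50% square waves.

    The second wave is delayed by phase_offset_deg relative to the first.
    Averaging over one cycle stands in for an ideal low-pass filter.
    """
    shift = round(phase_offset_deg / 360.0 * samples_per_cycle) % samples_per_cycle
    half = samples_per_cycle // 2
    high = 0
    for n in range(samples_per_cycle):
        a = n < half                                   # reference wave
        b = ((n - shift) % samples_per_cycle) < half   # delayed wave
        high += (a != b)                               # XOR of the two levels
    return high / samples_per_cycle


if __name__ == "__main__":
    for offset in (0, 45, 90, 135, 180):
        print(offset, xor_phase_detector_average(offset))
```

Under these assumptions the averaged output rises linearly from 0 at zero offset to full scale at 180 degrees, so it sits at mid-range for a 90-degree offset, consistent with the text.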


In most PLL embodiments, the error signal produced by the phase comparator is low pass filtered and applied as an analog control voltage used to adjust the VCO operating frequency.


The control signal used to adjust the VCO frequency may be comprised of multiple components: phase comparison of a reference and a local clock signal by a phase comparator, a matrix of multiple such comparisons as taught in [Tajalli I], the output of an FSM performing frequency or phase comparisons, or an integral error term derived from an accumulated history of phase measurements. In some embodiments, a first control signal component may correspond to a long time constant correction and a second control signal component may correspond to a short time constant correction. Identical or different weights or scaling factors may be applied to said first and second control components when they are combined.


In an alternative embodiment, all or part of the filtering operation is subsumed into the same digital processing used for phase comparison, with the digital error result applied to a digital-to-analog converter (DAC) to obtain an analog VCO control signal. In further embodiments, all or a portion of the VCO control signal may be applied in the digital domain.


Voltage-controlled oscillators based on loop-connected strings of active circuit elements are well represented in the art. In such oscillators, an initial signal transition propagates down the string of connected elements, appearing at consecutive element outputs after a delay determined by the signal propagation delay of the active circuit element. Thus, as one example offered without limitation, the initial signal transition would appear at the end of a series-connected string of four active circuit elements after (4*prop_delay), corresponding to one half-cycle of the oscillator. If the output is inverted and applied to the input (thus, the term “loop-connected”), the oscillation will continue for another half period, resulting in a square wave output with a period of (2*Σprop_delay), determined by the total propagation time Σprop_delay of the string of active elements. The number of active circuit elements in the loop may vary, with oscillation occurring as long as the overall phase shift around the loop-connected string at the desired oscillation frequency is an odd multiple of 180 degrees. Thus, to enable high-speed oscillation, the string length is kept short and minimal-delay active elements such as simple digital inverters, amplifiers, or buffers are used.
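The period relationship above reduces to simple arithmetic. The sketch below assumes every element has the same propagation delay; the function name and the 10 ps delay used in the example are hypothetical values chosen only to make the numbers concrete.

```python
def loop_oscillator_frequency_hz(num_elements, prop_delay_s):
    """Oscillation frequency of a loop of identical delay elements.

    Period = 2 * (sum of element delays) = 2 * num_elements * prop_delay_s,
    assuming all elements share one propagation delay.
    """
    if num_elements < 1 or prop_delay_s <= 0:
        raise ValueError("need at least one element and a positive delay")
    period_s = 2 * num_elements * prop_delay_s
    return 1.0 / period_s


if __name__ == "__main__":
    # Hypothetical example: four elements with an assumed 10 ps delay each.
    f = loop_oscillator_frequency_hz(4, 10e-12)
    print(f"{f / 1e9:.2f} GHz")
```

With these assumed values the period is 2*4*10 ps = 80 ps, showing why a short string of minimal-delay elements is favored for high-speed operation.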


The loop oscillation frequency may be varied using a control signal that adjusts an active circuit element parameter affecting propagation delay, such as gain, switching threshold, or output drive; low frequency embodiments are also known that incorporate configurable passive delay elements between active stages to provide additional control. At high frequencies, a significant component of an active element's propagation time can be the node charge/discharge time required for an output state change in one element to charge or discharge the parasitic capacitance of the interconnecting node and reach the switching threshold of the subsequent element's input. Under these conditions, the effective propagation delay may be varied by limiting the slew rate of signal transitions propagating between sequential loop elements, either by explicitly adjusting the output current drive capability or output impedance of each active circuit element, or by implicitly making such adjustment by varying the supply voltage or current provided to the active circuit elements.


Conveniently, an N-element loop oscillator inherently generates N multiple clock output phases as the output of consecutive active elements, each typically offset by an additional 180/N degrees of phase difference. In embodiments based on inverting active elements, an additional 180 degree offset (i.e., inversion) will be seen at odd-numbered outputs, using the input of the first element as the reference.
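The set of available phases follows directly from the 180/N spacing. The sketch below lists the distinct phase values only; it does not model which physical tap carries which phase, and the function name `loop_phases_deg` is hypothetical.

```python
def loop_phases_deg(num_elements, differential=False):
    """Distinct output phases (in degrees) of an N-element loop oscillator.

    Consecutive elements are spaced by 180/N degrees. If each element also
    provides a complementary output, every phase gains a partner 180
    degrees away.
    """
    step = 180.0 / num_elements
    phases = [k * step for k in range(num_elements)]
    if differential:
        phases += [p + 180.0 for p in phases]
    return sorted(phases)


if __name__ == "__main__":
    print(loop_phases_deg(4))                     # single-ended outputs
    print(loop_phases_deg(4, differential=True))  # with complements
```

For a four-element loop with complementary outputs this yields eight phases at 45 degree spacing.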


One embodiment of such an oscillator consists of a loop of four elements, where each element is a differential digital inverter composed of a pair of identical single-ended inverters supporting the noninverted-input-to-inverted output, and inverted-input-to-noninverted output paths, respectively. To provide the required odd number of inversions around the loop, the outputs of the fourth element are cross-connected to the inputs of the first element. In the illustration of FIG. 1, the first element is comprised of inverters 120 and 160, the second element of 130 and 170, the third element of 140 and 180, and the fourth element of 150 and 190.


Even though the embodiment of FIG. 1 includes eight inverters, the length of the loop is four loop elements because of the double loop-back of the output of inverter 150 and of inverter 190. Thus, the natural oscillation period is (2*4*prop_delay), not twice that value.


To maintain tight phase matching between inverted and non-inverted outputs of each loop element, each loop element output is cross-connected to its complementary output using back-to-back digital inverters, maintaining the desired 180 degree phase offset between them. In this embodiment, the back-to-back inverters provide bidirectional synchronization between complementary outputs, as well as introducing a small amount of hysteresis into node switching transitions. Thus, the output of the first loop element 180 inverter 120 is cross-connected to the output of inverter 160 by cross-coupled inverters 125 and 126, the output of 130 is cross-connected to the output of 170 by cross-coupled inverters 135 and 136, the output of 140 is cross-connected to the output of 180 by cross-coupled inverters 145 and 146, and the output of 150 is cross-connected to the output of 190 by cross-coupled inverters 155 and 156.


As the desired loop oscillation frequency approaches the design limits of the embodiment's integrated circuit process, each loop node is also driven with a small amount of feed-forward signal from a node 45 degrees earlier in the oscillation cycle (which, in the example four loop element differential topology, may be obtained from the complementary input of the preceding loop element). This feed-forward signal begins to drive the node in anticipation of the switching transition, allowing operation at a higher frequency than would otherwise be possible. In FIG. 1, this feed-forward may be seen in inverter 138 driving an inverted and delayed version of ø180 (thus corresponding to a non-inverted ø000) onto output node ø090 in anticipation of primary inverter 130 driving an inverted and delayed version of ø225 (thus corresponding to a non-inverted ø045) onto the same node, which after accounting for the propagation delay will be the expected ø090 result. Thus, in this example the feed-forward signal anticipates each corresponding change of primary output signal by 45 degrees, providing a beneficial pre-charging action which accelerates the transition of the output node.
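The phase bookkeeping in this example can be written out explicitly. The sketch below generalizes the single worked example for node ø090 to every node by assuming the same offsets apply around the loop (primary input 135 degrees ahead of the node, feed-forward input 90 degrees ahead, cross-coupled partner 180 degrees away); that generalization and the function names are assumptions made for illustration.

```python
def node_drivers(output_phase_deg):
    """Input phases of the three inverters assumed to drive one loop node.

    Offsets are inferred from the worked example for node 090:
    primary input 225, feed-forward input 180, complementary node 270.
    """
    p = output_phase_deg % 360
    return {
        "primary": (p + 135) % 360,
        "feed_forward": (p + 90) % 360,
        "cross_coupled": (p + 180) % 360,
    }


def inverter_output_phase(input_phase_deg, delay_deg=0):
    """Phase after inversion (180 degrees) plus an optional delay."""
    return (input_phase_deg + 180 + delay_deg) % 360


if __name__ == "__main__":
    drivers = node_drivers(90)
    print(drivers)
    # Primary path: inverted 225 is 045, plus one 45 degree delay gives 090.
    print(inverter_output_phase(drivers["primary"], delay_deg=45))
    # Feed-forward path: inverted 180 is 000, 45 degrees ahead of the primary.
    print(inverter_output_phase(drivers["feed_forward"]))
```

Under this model the inverted feed-forward signal always leads the inverted primary signal by 45 degrees, which is the anticipation the text describes.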


Obviously, such anticipatory signaling cannot exceed that of the primary signal path, or spurious high-frequency oscillation can occur. Similarly, signaling on the cross-coupling path introduces hysteresis which delays switching transitions, so must also be constrained to be significantly less than that of the primary signal path. The amount of anticipatory and cross-coupled signaling may be controlled by scaling the size of the transistors (and thus their current drive capability) relative to the transistor size used on the primary signal path.


In one particular embodiment, feed-forward signaling was found to be of benefit at approximately 60% of the drive level of the primary signal path, with cross-coupling at approximately 40% of the drive level of the primary signal path. Smaller amounts of feed-forward signaling provided correspondingly smaller speed-up benefit. Larger amounts of cross-coupling increased the effective propagation delay of the active loop elements, and significantly smaller amounts reduced the desired locked phase relationship between the first and second loops.
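The drive ratios quoted above translate into relative transistor sizes. The following sketch assumes drive current scales linearly with channel width, as the scaling discussion implies; the function name and the 2.0 um primary width in the example are hypothetical.

```python
# Relative drive levels from the particular embodiment described above.
DRIVE_RATIOS = {"primary": 1.0, "feed_forward": 0.6, "cross_coupled": 0.4}


def scaled_widths_um(primary_width_um, ratios=DRIVE_RATIOS):
    """Channel widths for each signal path, relative to the primary path.

    Assumes drive current is proportional to channel width.
    """
    return {name: primary_width_um * r for name, r in ratios.items()}


if __name__ == "__main__":
    # Hypothetical primary-path width of 2.0 um, for illustration only.
    for name, width in scaled_widths_um(2.0).items():
        print(f"{name}: {width:.2f} um")
```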


Loop frequency adjustment is shown as being controlled by Frequency Control 210, which as an example produces a varying oscillator supply voltage VCOVdd, which powers all of the loop oscillator inverters.


To provide isolation and increased output drive capability, each clock phase produced by the loop oscillator is buffered by clock buffers 111 through 118, producing output clocks Clock1 through Clock8. To ensure full-swing clock outputs, these clock buffers are powered by the full Vdd supply voltage.


Observing the block diagram of FIG. 1, it is obvious that loop oscillator 100 may be partitioned into eight identical sections, 200 being one example, each such section incorporating one half of the inverters needed within each loop element, and producing one buffered clock output. As shown, the VCO of FIG. 1 contains eight VCO stages, one of which is outlined as VCO stage 200. The VCO stages of FIG. 1 may further be grouped into loop elements containing complementary VCO stages, as illustrated by loop element 180 which contains the two complementary VCO stages associated with outputs ϕ225 and ϕ045. Thus, the loop element containing the complementary VCO stages associated with phases ϕ000 and ϕ180 corresponds to a preceding loop element that provides primary signal path outputs ϕ000 and ϕ180 to inverters 120 and 160, respectively. Furthermore, the loop element containing the complementary VCO stages associated with ϕ315 and ϕ135 corresponds to a two-prior loop element providing the feed-forward outputs ϕ315 and ϕ135 to feed-forward inverters 128 and 168, respectively. Each VCO stage further includes a cross-coupled inverter 125/126 for receiving a complementary output from the complementary VCO stage within the loop element. For example, the VCO stage associated with ϕ225 includes cross-coupled inverter 126 for receiving complementary output ϕ000 and ϕ045 from the complementary VCO stage within loop element 180. Thus, loop elements as described below are defined by phase relationships between VCO stages of the VCO, and not necessarily physical layout.



FIG. 2 is a transistor-level illustration detailing one VCO stage 200 of the loop oscillator of FIG. 1, corresponding to primary path inverter 130, feed-forward inverter 138, and cross-coupled inverter 136. For purposes of illustration and without implying limitation, each inverter is drawn as a simple two transistor CMOS structure; thus primary inverter 130 of the first loop is composed of transistors 211 and 212, feed-forward inverter 138 is composed of transistors 230 and 231, and cross-coupled inverter 136 is composed of transistors 250 and 251, all driving common output node ø090. In the particular embodiment described above, transistors 211 and 212 associated with the primary signal path are sized to provide design-typical output drive currents, herein described as produced by transistors of 1× channel dimensions. Feed-forward transistors 221 and 222 are scaled to 0.6× channel dimensions, thus providing approximately 60% of the primary signal path output drive current. Cross-coupling transistors 231 and 232 are scaled to 0.4× channel dimensions, thus providing approximately 40% of the primary signal path output drive current.


As the output current and thus the output slew rate of a CMOS inverter varies with its supply voltage, the varying output slew rate of each inverter into its output node capacitance will result in a variation of propagation delay with supply voltage, providing a mechanism for adjusting the loop oscillation frequency. As detailed in FIG. 2, inverters 130, 138, and 136 are powered by a controlled voltage VCOVdd, produced by frequency control circuit 210. Within 210, pass transistor 270 controlled by a first VCO control signal VCOctrl1 contributes to the collective tail current output by 190 onto controlled voltage bus VCOVdd; similarly, pass transistor 271 controlled by a second VCO control signal VCOctrl2 also contributes to the collective tail current output by 190. The relative channel size and thus current capability of pass transistors 270 and 271 may differ, allowing the relative control effect of the two control signals to differ. In one embodiment, VCOctrl1 is produced by a PLL component providing a proportional phase error component associated with high-frequency phase comparisons, and VCOctrl2 is produced by a PLL component providing an integral component associated with an accumulation of phase comparisons. In one particular instance of that embodiment, pass transistor 270 is smaller than pass transistor 271, allowing VCOctrl2 to provide a higher-magnitude, and VCOctrl1 a lower-magnitude, effect on VCO frequency. In an alternative embodiment, each of pass transistors 270 and 271 is comprised of a set of one or more parallel instances of a pass transistor, the sets 270 and 271 thus collectively providing differing control effects in a ratio proportional to the number of transistors in set 270 versus the number of transistors in set 271. In other embodiments, a single control signal VCOctrl1 is sufficient, and only pass transistor 270 is required within 210.
In many embodiments, one instance of frequency control circuit 210 produces controlled voltage VCOVdd used by all instances of 200. In embodiments utilizing individual gate delay adjustment as taught in [Tajalli I], separate instances of 210 may be used for individual loop elements, or individual loop segments 200.
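The weighting of the two control signals by parallel transistor count can be sketched as a normalized linear combination. This is a deliberately idealized model: real pass transistors are nonlinear, and the function name and the 1:4 transistor ratio used as a default are hypothetical.

```python
def combined_control(ctrl1, ctrl2, n_set_270=1, n_set_271=4):
    """Idealized combined control effect of two pass-transistor sets.

    Each control signal is weighted in proportion to the number of
    parallel transistors in its set (linear model, an assumption).
    """
    total = n_set_270 + n_set_271
    if n_set_270 < 0 or n_set_271 < 0 or total == 0:
        raise ValueError("transistor counts must be non-negative and not both zero")
    w1 = n_set_270 / total   # weight of VCOctrl1 (proportional term)
    w2 = n_set_271 / total   # weight of VCOctrl2 (integral term)
    return w1 * ctrl1 + w2 * ctrl2


if __name__ == "__main__":
    print(combined_control(1.0, 0.0))  # effect of VCOctrl1 alone
    print(combined_control(0.0, 1.0))  # effect of VCOctrl2 alone
```

With the assumed 1:4 ratio, a unit change on VCOctrl2 has four times the effect of the same change on VCOctrl1, matching the case where pass transistor 270 is the smaller of the two.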


For the inverter structures shown in FIG. 2, the output voltage swing at each loop node will span Vss to VCOVdd, i.e. a varying range. Thus, one embodiment capacitively couples 290 the VCO output phase to a buffer/driver circuit 112, to provide a consistent output level and drive capability. A non-inverting buffer/driver incorporating two inverting buffers 240 and 250 is illustrated, without implying limitation.


In one particular embodiment, capacitively coupling 290 the input signal to only the PMOS buffer input 241 was found to be sufficient, with the NMOS buffer input 242 being driven directly. A bias resistor 280 and configurable voltage Vbias are shown providing an input bias to the PMOS gate. In other embodiments, the conventional practice of capacitively coupling a signal and bias voltage to both gate inputs may be used. Embodiments utilizing a different digital logic family (e.g. CML) for the VCO elements may incorporate other known solutions for logic family level conversion into the design of FIG. 2.


Asymmetrical switching thresholds between PMOS and NMOS inverter or buffer elements, or variations from one section of the loop to another may result in undesirable periodic modulations of the resulting clock outputs. In particular, deviations from exact 50% duty cycle outputs may be detrimental. FIG. 3 illustrates two methods of measuring VCO output duty cycle errors. Circuit 310 takes advantage of the expected balance between complementary output phases, and even spacing between adjacent phases to reduce the need for heavy low-pass filtering to obtain a consensus common mode voltage result. Isolation resistors 301 through 308 linearly sum the eight clock phases Clock1 through Clock8 to produce analog result DutyCycleDetect. If all phases are exactly symmetrical 50% duty cycle full-swing CMOS square waves with equal inter-phase phase differences, the DC level of DutyCycleDetect will be exactly Vdd/2. In some embodiments, filter capacitor 309 may be used to reduce ripple on the output signal. A control/measurement subsystem may observe the actual value, and then incrementally adjust 260 Vbias in one or more oscillator sections 200 to minimize deviation from the desired result.
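The summing behavior of circuit 310 can be illustrated with ideal square waves. The sketch below assumes equal summing resistors, full-swing clocks, and one shared duty cycle for all phases; the function name and the one-degree time resolution are hypothetical.

```python
def summed_clock_levels(phases_deg, duty=0.5, vdd=1.0):
    """Instantaneous level of equally weighted, summed square-wave clocks.

    Returns 360 samples (one per degree of the clock cycle). Each clock is
    high for duty * 360 degrees starting at its own phase.
    """
    high_deg = round(duty * 360)
    levels = []
    for t_deg in range(360):
        n_high = sum(1 for ph in phases_deg if (t_deg - ph) % 360 < high_deg)
        levels.append(vdd * n_high / len(phases_deg))
    return levels


if __name__ == "__main__":
    phases = [0, 45, 90, 135, 180, 225, 270, 315]
    ideal = summed_clock_levels(phases)
    print(min(ideal), max(ideal))            # constant level, no ripple
    skewed = summed_clock_levels(phases, duty=0.52)
    print(sum(skewed) / len(skewed))         # average shifts above Vdd/2
```

With ideal evenly spaced 50% clocks the summed level is constant at Vdd/2, which is why little filtering is needed; a duty cycle error shifts the average away from Vdd/2.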


Alternatively, a circuit similar to 320 may be used on each clock output, with Clock8 used here as one example. As before, if the full-swing CMOS clock is an exact 50% duty cycle square wave and is substantially low-pass filtered to obtain its average DC level, the result should be exactly Vdd/2. In one embodiment, a R/C time constant 321, 322 approximately 1000 times longer than the clock period was used. As before, a control/measurement subsystem may observe the actual value, and then incrementally adjust 260 Vbias in that clock output's section 200 to minimize deviation from the desired result.
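The effect of a time constant roughly 1000 times the clock period can be estimated with an ideal one-pole RC model. This is a sketch under assumptions: the clock is an ideal full-swing square wave, the filter is a single ideal R/C section, and the function name is hypothetical.

```python
import math


def rc_filtered_clock(duty=0.5, vdd=1.0, tau_in_periods=1000.0, periods=20000):
    """Settled (min, max) output of a square wave through a one-pole RC filter.

    Time is measured in clock periods. The exact exponential response is
    applied over each high and low interval until the output settles.
    """
    v = 0.0
    v_min = v_max = v
    k_high = math.exp(-duty / tau_in_periods)          # decay over high time
    k_low = math.exp(-(1.0 - duty) / tau_in_periods)   # decay over low time
    for _ in range(periods):
        v_min = v
        v = vdd + (v - vdd) * k_high   # clock high: charge toward vdd
        v_max = v
        v = v * k_low                  # clock low: discharge toward 0
    return v_min, v_max


if __name__ == "__main__":
    lo, hi = rc_filtered_clock()
    print(f"mean ~ {(lo + hi) / 2:.6f} * Vdd, ripple ~ {hi - lo:.2e} * Vdd")
```

In this model a 50% duty cycle settles to Vdd/2 with a peak-to-peak ripple of roughly Vdd/4000, so small duty cycle deviations remain observable in the DC level.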



FIG. 4 illustrates an exemplary physical integrated circuit layout of one embodiment as in FIGS. 1 and 2. Rather than grouping circuit elements by function, e.g. VCO stage elements together and then buffer elements, circuit elements are grouped first as identical VCO stages, and then within each VCO stage by component type, e.g. capacitors, PMOS transistors, NMOS transistors.


The floor plan of FIG. 4 is shown to be logically divided into eight identical strips, each strip containing all of the elements of one VCO stage. Specifically, each VCO stage includes a plurality of inverters, where each inverter is arranged along a second dimension (shown in FIG. 4 as the vertical axis). The plurality of inverters includes (i) an inverter configured to receive an output of a VCO stage in a preceding loop element, (ii) a feed-forward inverter configured to receive a feed-forward output of a VCO stage in a two-prior loop element, and (iii) a cross-coupled inverter configured to receive a complementary output from a complementary VCO stage within the same loop element. Furthermore, each VCO stage includes buffer inverters for generating the VCO outputs on the metallization layers of the interconnection grid. As shown, FIG. 4 includes an interconnection grid 405 having metallization layers configured to provide (i) interconnections between inverters in a plurality of VCO stages, the plurality of VCO stages repeated along a first dimension (shown in FIG. 4 as the horizontal axis) and (ii) a set of VCO outputs from the plurality of VCO stages. Vertically, each VCO stage is composed of a capacitor, PMOS transistors 410, a portion of interconnection wiring channel 405, and NMOS transistors 415. The PMOS transistors 410 and NMOS transistors 415 make up the plurality of inverters of each VCO stage. As shown in FIG. 4, each VCO stage of the plurality of VCO stages includes (i) a capacitive coupling element 407 and a PMOS portion 410 of each inverter of the plurality of inverters arranged on one side of the interconnection grid 405 and (ii) an NMOS portion 415 of each inverter of the plurality of inverters arranged on the other side of the interconnection grid. This unusual functional layout results in minimized wiring delay within each segment.
True and complementary segments are placed next to each other, again to minimize wiring delays in the cross-coupled structures that maintain tight 180 degree phase relationship between them. However, the order of VCO stages along the first direction is permuted from a strict sequential order, e.g. 0-45-90-135-180-225-270-315, to a gray counting sequence such as the nonlimiting example 0-90-180-225-270-135-45-315, as shown in FIG. 4. Another exemplary order is shown in FIG. 5. In FIG. 5, the sequence is 0-180-90-270-45-225-135-315. In the embodiment of FIG. 5, the VCO stages are grouped according to loop elements, and thus every VCO stage is adjacent to at least its complementary stage. Such a reordering balances the average segment interconnection length, so that rather than many segment interconnections being one unit long and the first-to-last loop closing connection being 8 units long, most interconnection lengths are 4-5 units. FIG. 6 illustrates a VCO layout having four VCO stages, in accordance with such an embodiment. In FIG. 6, the VCO stages are ordered 0-270-90-180.
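The adjacency property stated for the FIG. 5 ordering can be checked mechanically. The sketch below only tests whether each stage sits in a slot next to its complementary stage; it does not model routed wire lengths, and the function name is hypothetical.

```python
def complements_adjacent(order_deg):
    """True if every stage is in a slot next to its 180-degree complement."""
    slot = {phase: i for i, phase in enumerate(order_deg)}
    return all(abs(slot[p] - slot[(p + 180) % 360]) == 1 for p in order_deg)


if __name__ == "__main__":
    sequential = [0, 45, 90, 135, 180, 225, 270, 315]
    fig5_order = [0, 180, 90, 270, 45, 225, 135, 315]
    print(complements_adjacent(sequential))  # strict sequential order
    print(complements_adjacent(fig5_order))  # order given for FIG. 5
```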


The arrangement of components within a VCO stage also minimizes wiring delays and parasitic node capacitances. The physical size of the AC coupling capacitor 290 connecting an oscillator output to a buffer input is relatively large compared to the rest of the circuitry of FIG. 2. Thus, it is beneficial to locate it adjacent to the PMOS transistors it couples to, to avoid long wire runs. Reserving a generous area for horizontal inter-segment wires and vertical intra-segment connections permits efficient routing, with in-phase signals routed on top of each other to minimize Miller capacitance.
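The benefit of stacking in-phase signals can be seen from a standard first-order coupling model, offered here as a textbook approximation rather than an analysis from the described embodiments. The symbols are assumptions introduced for illustration: $C_c$ is the coupling capacitance between two overlapping wires, and $\Delta V_{\mathrm{victim}}$ and $\Delta V_{\mathrm{neighbor}}$ are their simultaneous voltage swings.

```latex
C_{\mathrm{eff}} \;=\; C_c \left( 1 - \frac{\Delta V_{\mathrm{neighbor}}}{\Delta V_{\mathrm{victim}}} \right),
\qquad
C_{\mathrm{eff}} \approx
\begin{cases}
0      & \text{in-phase neighbor } (\Delta V_{\mathrm{neighbor}} = \Delta V_{\mathrm{victim}}) \\
C_c    & \text{static neighbor } (\Delta V_{\mathrm{neighbor}} = 0) \\
2\,C_c & \text{anti-phase neighbor } (\Delta V_{\mathrm{neighbor}} = -\Delta V_{\mathrm{victim}})
\end{cases}
```

In this idealized picture, two wires that switch together place almost no charge across their mutual capacitance, whereas wires switching in opposite directions see roughly twice the physical coupling capacitance.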


Finally, half-width “dummy” segments 420 are placed on the outsides of the bordering VCO stages (shown as the VCO stages at the extreme left and right of the iterated array of VCO stages), so that the bordering VCO stages of the eight active VCO stages do not see a difference in inter-segment parasitics compared to the inner VCO stages.
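The effect of the dummy segments on the bordering stages can be illustrated with a small sketch. It is an illustration only: the names `neighbor_counts` and `row_width` are hypothetical, and "neighbor" here simply means an adjacent cell in the row, a crude stand-in for matched inter-segment parasitics.

```python
# Illustrative sketch only; names and the neighbor-counting model are assumptions.

DUMMY = "dummy"


def build_row(active: list[int], with_dummies: bool) -> list:
    """Lay out the active stages in a row, optionally flanked by dummy segments."""
    return ([DUMMY] + list(active) + [DUMMY]) if with_dummies else list(active)


def neighbor_counts(active: list[int], with_dummies: bool) -> dict[int, int]:
    """Count adjacent cells (active or dummy) on either side of each active stage."""
    row = build_row(active, with_dummies)
    counts = {}
    for i, cell in enumerate(row):
        if cell == DUMMY:
            continue
        counts[cell] = int(i > 0) + int(i < len(row) - 1)
    return counts


def row_width(active: list[int], with_dummies: bool, dummy_width: float = 0.5) -> float:
    """Total row width in stage-width units; dummies are half-width per the text."""
    return len(active) + (2 * dummy_width if with_dummies else 0.0)


if __name__ == "__main__":
    stages = [0, 180, 90, 270, 45, 225, 135, 315]
    print(neighbor_counts(stages, with_dummies=False))  # edge stages have one neighbor
    print(neighbor_counts(stages, with_dummies=True))   # every stage has two neighbors
    print(row_width(stages, with_dummies=True))
```

With the dummies in place, every active stage has a cell on both sides, so the two bordering stages see the same neighborhood as the inner stages, at the cost of one extra stage width in total.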


The particular examples of four inverters per VCO stage and two differential outputs per VCO stage do not imply a limitation in either minimum or maximum, although the available phase differences within a two-element loop will generally preclude use of feed-forward speedup as described herein. Similarly, the CMOS loop inverters used for descriptive purposes above may alternatively utilize CML or other digital design conventions, or equivalent analog amplifier/buffer conventions.

Claims
  • 1. An apparatus comprising: a plurality of voltage controlled oscillator (VCO) stages laterally arranged along a first dimension wherein each VCO stage is configured to provide a respective VCO phase and is arranged next to a corresponding complementary VCO stage that is configured to provide an opposite VCO phase, each VCO stage comprising inverters having groups of PMOS, NMOS, and capacitive coupling circuit components grouped by circuit component type along a second dimension, the groups of PMOS and NMOS circuit components separated by an interconnection grid and the capacitive coupling circuit components arranged adjacently to the group of PMOS circuit components; the interconnection grid located between the groups of PMOS and NMOS circuit components, and comprising inter-segment connections between circuit components of different VCO stages and intra-segment connections between circuit components within each VCO stage.
  • 2. The apparatus of claim 1, wherein each VCO stage comprises a primary path inverter, a cross-coupled inverter, and a buffer inverter.
  • 3. The apparatus of claim 1, wherein the cross-coupled inverter is 0.4× a size of the primary path inverter.
  • 4. The apparatus of claim 2, wherein the VCO output of each VCO stage is AC-coupled to the PMOS circuit component of the buffer inverter via the capacitive coupling circuit.
  • 5. The apparatus of claim 2, wherein each VCO stage further comprises a feed-forward inverter configured to receive a VCO output from a two-prior VCO stage.
  • 6. The apparatus of claim 5, wherein the feed-forward inverter is 0.6× a size of the primary path inverter.
  • 7. The apparatus of claim 1, further comprising dummy capacitors arranged next to the capacitive couple elements of a pair of bordering VCO stages of the plurality of VCO stages, the dummy capacitors matching inter-segment capacitive coupling of the bordering VCO stages to a plurality of inner VCO stages of the plurality of VCO stages.
  • 8. The apparatus of claim 1, wherein the interconnection grid is configured to output the respective VCO phase of each VCO stage.
  • 9. The apparatus of claim 1, wherein the plurality of VCO stages comprise a 0-degree stage, a 90-degree stage, a 180-degree stage, and a 270-degree stage.
  • 10. The apparatus of claim 1, wherein the plurality of VCO stages comprise a 0-degree stage, a 45-degree stage, a 90-degree stage, a 135-degree stage, a 180-degree stage, a 225-degree stage, a 270-degree stage, and 315-degree stage.
  • 11. The apparatus of claim 10, wherein the plurality of VCO stages are laterally arranged in a sequential order of the 0-degree stage, the 180-degree stage, the 90-degree stage, the 270-degree stage, the 45-degree stage, the 225-degree stage, the 135-degree stage, and the 315-degree stage.
  • 12. The apparatus of claim 10, wherein each inter-segment connection has a maximum length of five units, wherein a unit is a width of one VCO stage.
  • 13. The apparatus of claim 1, further comprising a frequency control circuit configured to power the plurality of VCO stages.
  • 14. The apparatus of claim 13, wherein the frequency control circuit is controlled via an integral component associated with an accumulation of phase comparisons and a proportional component associated with high-frequency phase comparisons.
  • 15. The apparatus of claim 14, wherein the frequency control circuit has a first PMOS transistor configured to receive the integral component and a second PMOS transistor configured to receive the proportional component.
  • 16. The apparatus of claim 15, wherein the second PMOS transistor has a smaller size relative to the first PMOS transistor.
  • 17. The apparatus of claim 16, wherein the first PMOS transistor comprises a plurality of equal-sized PMOS devices connected in parallel.
  • 18. The apparatus of claim 1, wherein in-phase VCO outputs are aligned on top of each other in the interconnection grid.
  • 19. The apparatus of claim 1, wherein the interconnection grid is made up of metallization layers to provide the inter-segment connections and the intra-segment connections.
  • 20. The apparatus of claim 1, wherein the interconnection grid comprises wires to provide the inter-segment connections and the intra-segment connections.
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 17/210,260, filed Mar. 23, 2021, naming Armin Tajalli, entitled “Multiple Adjacent Slicewise Layout of Voltage-Controlled Oscillator”, which is a continuation of U.S. application Ser. No. 16/843,785, filed Apr. 8, 2020, now U.S. Pat. No. 10,958,251 issued on Mar. 23, 2021, naming Armin Tajalli, entitled “Multiple Adjacent Slicewise Layout of Voltage-Controlled Oscillator”, which claims priority to U.S. Application No. 62/831,165, filed Apr. 8, 2019, entitled “Multiple Adjacent Slicewise Layout of Voltage-Controlled Oscillator”, naming Armin Tajalli, all of which are hereby incorporated by reference in their entirety for all purposes.

Related Publications (1)
Number Date Country
20220294432 A1 Sep 2022 US
Provisional Applications (1)
Number Date Country
62831165 Apr 2019 US
Continuations (2)
Number Date Country
Parent 17210260 Mar 2021 US
Child 17829126 US
Parent 16843785 Apr 2020 US
Child 17210260 US