Mechanism for Windaging of a Double Rate Driver

Information

  • Patent Application
  • Publication Number
    20070300098
  • Date Filed
    June 27, 2006
  • Date Published
    December 27, 2007
Abstract
A double data rate launch system and method in which the two-to-one multiplexer select signal delay is programmable and can be adjusted individually for each system. This allows the amount of delay to be minimized based on the actual set up time required, not the worst-case set-up time. The select signal to the multiplexer is delayed sufficiently to compensate for non-uniformity of duty cycle of data at the inputs to the multiplexer. Compensation of the non-uniformity allows the data on the wire to have a uniform duty cycle for all data transferred regardless of which latch is sourcing the data. The multiplexer that selects data from the two latches which are launching data on the edge of different clocks has a select line that is delayed by a variable amount to tune the select such that the data is clean at the input to the multiplexer on all ports.
Description

BRIEF DESCRIPTION OF THE DRAWINGS

The subject matter that is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:



FIG. 1 illustrates one example of the prior-art DDR driver design with a 2-to-1 multiplexer and a delay element to delay one of the two data input ports of the 2-to-1 multiplexer.



FIG. 2 illustrates one example of the signal timing in the prior-art DDR driver design of FIG. 1 in a timing diagram.



FIG. 3 illustrates one example of this invention in which the master latch drives one of the two data input ports of the 2-to-1 multiplexer and the programmable delay element is placed at the multiplexer select signal port.



FIG. 4 illustrates the signal timing in the DDR driver design of this invention shown in FIG. 3.



FIG. 5 illustrates one example of the programmable delay element used by this invention at the multiplexer select signal port that is controlled by either a register that can be scan-initialized and accessed by firmware or an edge detection and feedback circuitry.



FIG. 6 illustrates one example of the edge detection and feedback circuitry of this invention.



FIG. 7 illustrates one embodiment of a clock generator for generating separate clock signals for the DDR driver master-slave latches.





The detailed description explains the preferred embodiments of the invention.


DETAILED DESCRIPTION OF THE INVENTION

Referring now to FIGS. 3 and 4 of the drawings, in accordance with the teachings of this invention, the fixed delay element 220 of FIG. 1 has been replaced by a programmable delay element 120. The programmable delay element for each bus driver group can be adjusted individually on each system. Either registers or edge-detection circuitry can control the delay setting of these delay elements for each bus driver group. The delay settings of these select signal delay elements can be preset via scan initialization during the system bring-up phase, under firmware or software control; they can also be changed at any time, or periodically, while the system is running. This allows the amount of delay to be minimized based on the actual set-up time required, not the worst-case set-up time. Also, the select signal to the multiplexer is delayed sufficiently to compensate for non-uniformity of duty cycle of data at the inputs to the multiplexer. Compensation of the non-uniformity allows the data on the wire to have a uniform duty cycle for all data transferred regardless of which latch is sourcing the data. The multiplexer that selects data from the two latches that are launching data on the edge of different clocks has a select line that is delayed by a variable amount to tune the select such that the data is clean at the input to the multiplexer on all ports. In order to compensate for unbalanced duty cycle clocks, the multiplexer select signal is delayed and the duration of the delay can be varied in order to tune the select signal such that the data is clean at the input to the multiplexer on both ports. The data is held at the input to the multiplexer for a period longer than the duty cycle of the select signal, and the duty cycle of the select signal is uniform, shaping the data uniformly at the output of the multiplexer.
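
As a rough illustration of why a per-system setting helps, the following Python sketch compares the delay count needed for an assumed measured set-up requirement with the count needed for an assumed worst case. The function name, the tap delay, and all picosecond figures are hypothetical and do not come from the specification.

```python
import math


def min_delay_count(required_delay_ps: float, tap_delay_ps: float) -> int:
    """Smallest number of delay taps whose total delay covers the requirement."""
    if tap_delay_ps <= 0:
        raise ValueError("tap delay must be positive")
    return max(0, math.ceil(required_delay_ps / tap_delay_ps))


TAP_DELAY_PS = 25.0    # assumed delay contributed by one tap
WORST_CASE_PS = 300.0  # assumed worst-case requirement across all systems
MEASURED_PS = 130.0    # assumed requirement measured on one particular system

worst_case_count = min_delay_count(WORST_CASE_PS, TAP_DELAY_PS)
per_system_count = min_delay_count(MEASURED_PS, TAP_DELAY_PS)

# Delay removed from the launch path by tuning this system individually.
saved_ps = (worst_case_count - per_system_count) * TAP_DELAY_PS
```

Under these assumed numbers the individually tuned system needs fewer taps than a design fixed at the worst case, which is the saving the programmable element is meant to capture.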


In addition, the fixed delay element in one of the data paths (here element 210 in the odd data path) has been replaced by the master latch 115, providing a half cycle delay comparable to the fixed delay, but allowing for DDR drivers to operate at a much wider range of system cycle times with minimum delay at the 2-to-1 multiplexer data output port. As can be seen by comparing FIGS. 2 and 4, the odd data is, in accordance with the teachings of this invention, delayed by one half clock cycle, as shown in FIG. 4, so that the delay changes as the clock frequency changes, allowing a wide range of system cycle times with minimum delay at the 2-to-1 multiplexer data output port.
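
The difference between a fixed delay and a latch-derived half cycle delay can be sketched numerically. In the Python fragment below, the fixed delay value and the cycle times are assumptions chosen only for illustration; the point is that the latch delay follows the cycle time while the fixed delay matches it at a single operating point.

```python
FIXED_DELAY_PS = 500.0  # hypothetical fixed element, sized for a 1000 ps cycle


def latch_delay_ps(cycle_time_ps: float) -> float:
    """Half cycle delay provided by a latch clocked on the opposite phase."""
    return cycle_time_ps / 2.0


def fixed_delay_error_ps(cycle_time_ps: float) -> float:
    """How far the fixed delay is from the ideal half cycle at this cycle time."""
    return FIXED_DELAY_PS - latch_delay_ps(cycle_time_ps)


for cycle_time_ps in (800.0, 1000.0, 2000.0):
    print(cycle_time_ps,
          latch_delay_ps(cycle_time_ps),
          fixed_delay_error_ps(cycle_time_ps))
```

A nonzero error means the fixed element is either too long or too short for that cycle time, whereas the latch-derived delay needs no resizing.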


To further tune the system, and to eliminate problems that arise if an unbalanced duty cycle clock clocks the master-slave latches, one embodiment of the invention provides separate clocks for these latches. The clock generating circuitry (described in connection with FIG. 7) provides each latch with a 50% duty-cycle clock m1-s1 or m2-s2. The phase relation between these two clock signals can be adjusted so that the data arrival time to the 2-to-1 multiplexer data input ports is minimized in order to reduce delays. In the exemplary embodiment of FIG. 3, the latches 212 and 214 are driven by the m1-s1 clock and the m2 clock drives the latch 115. However, latch 214 can be clocked by clock_m2 and clock_s2 in some cases, such as to use the delayed clock_m2 falling edge and the delayed clock_s2 rising edge so that the odd data DATA_ODD can arrive late and still meet latch 214's set-up time.


Referring now to FIG. 5, it shows one embodiment of a typical prior art programmable delay element 120, which can be used in the practice of the invention. As will be appreciated by those skilled in the art, it is comprised of delay elements dly[0] through dly[n], such as, for example, series-connected inverters. A decoder 510 decodes a delay count input 512 and produces an output that determines the number of delay elements the select signal encounters between its input to the delay element and its output therefrom. Either registers or edge-detection circuitry can generate a delay count to control the delay setting of these delay elements for each bus driver group. The delay settings of these select signal delay elements can be preset in registers via scan initialization during the system bring-up phase, under firmware or software control; they can also be changed at any time, or periodically, while the system is running, using edge detection.
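
The behavior of such a delay element can be modeled in a few lines of Python. This is a behavioral sketch only: the class name, the one-hot decoder encoding, the uniform per-stage delay, and the convention that a count of k routes the signal through k stages are all assumptions made for illustration, not details taken from FIG. 5.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProgrammableDelay:
    stage_delay_ps: float  # assumed uniform delay of each dly[i] stage
    num_stages: int        # number of stages, dly[0] through dly[n]

    def decode(self, delay_count: int) -> List[int]:
        """Assumed one-hot decoder: bit k set means 'exit after k stages'."""
        if not 0 <= delay_count <= self.num_stages:
            raise ValueError("delay count out of range")
        return [1 if k == delay_count else 0
                for k in range(self.num_stages + 1)]

    def delay_ps(self, delay_count: int) -> float:
        """Total delay seen by the select signal for a given delay count."""
        stages_encountered = self.decode(delay_count).index(1)
        return stages_encountered * self.stage_delay_ps


select_delay = ProgrammableDelay(stage_delay_ps=25.0, num_stages=8)
print(select_delay.delay_ps(3))  # delay with three stages in the path
```

A register written during scan initialization, or the edge-detection circuitry, would supply the `delay_count` argument in this model.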


Referring now to FIG. 6, in order to determine a delay count by edge detection, the select input to the delay element 120 is coupled not only as a delayed select (select_delayed) input to the multiplexer, but also to the input of a comparator 610 and as an input to a second incremental delay element 612. The output of the second delay element 612 is coupled to the input of the comparator 610. The final input to the comparator 610 is the output of multiplexer 614, whose inputs are the corresponding even or odd half of the data, with a select input (set) that selects either data for the comparator 610. In operation, the comparator finds the edge of the even or odd data with the SELECT signal delayed by delay element 120, then by delay element 612. Using the output signals of delay elements 120 and 612 to sample the output of multiplexer 614, the transition edge of the thus sampled signal can be detected. By changing the settings of one or both of the DELAY_COUNT and DELAY_COUNT inputs to delay elements 120 and 612 respectively, the proper settings of these inputs can be determined to minimize the DDR driver delay with sufficient margins for the set-up and hold times of multiplexer 216 of FIG. 3.
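
One way to picture the search is as a sweep of the first delay count while the second delay element adds one further increment, with the comparator reporting when the two sample points straddle the data transition. The Python sketch below is a simplified behavioral model under stated assumptions: the data is an ideal step at `data_edge_ps`, each count adds one uniform tap delay, and the function names and the one-tap margin are hypothetical rather than taken from FIG. 6.

```python
from typing import Optional


def sampled(t_ps: float, data_edge_ps: float) -> int:
    """Value seen when sampling an ideal 0-to-1 data step at time t_ps."""
    return 1 if t_ps >= data_edge_ps else 0


def edge_between(count_first: int, count_second: int,
                 tap_ps: float, data_edge_ps: float) -> bool:
    """Comparator model: do the two delayed sample points see different data?"""
    t_first = count_first * tap_ps
    t_second = t_first + count_second * tap_ps
    return sampled(t_first, data_edge_ps) != sampled(t_second, data_edge_ps)


def find_delay_count(data_edge_ps: float, tap_ps: float,
                     max_count: int, margin_taps: int = 1) -> Optional[int]:
    """Sweep the first count; return a setting just past the edge plus margin."""
    for count in range(max_count + 1):
        if edge_between(count, 1, tap_ps, data_edge_ps):
            # The edge lies in the next tap interval, so count + 1 is the
            # first setting that samples settled data; then add the margin.
            return min(count + 1 + margin_taps, max_count)
    return None  # no transition found within the sweep range


print(find_delay_count(data_edge_ps=130.0, tap_ps=25.0, max_count=15))
```

In this model the returned count is the smallest setting that samples settled data with the assumed margin, rather than a value fixed in advance for the worst case.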


Referring now to FIG. 7, in order to generate two 50% duty cycle clocks m1/s1 and m2/s2 whose phase can be adjusted one relative to another, the local clock signal is coupled as an input to two programmable clock generators 170 and 172 of a suitable type known in the art. Inputs Adjust 1 and Adjust 2, applied respectively to the generators 170 and 172, adjust the duty cycle of the clock outputs and also the phase of these outputs, one relative to the other.
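
The effect of the two adjust inputs can be sketched with an idealized clock model in Python. The function below is an assumption made for illustration: it treats each generator output as a perfect square wave described by a period, a phase offset, and a duty cycle, and does not represent the internal design of generators 170 and 172.

```python
def clock_level(t_ps: float, period_ps: float,
                phase_ps: float = 0.0, duty: float = 0.5) -> int:
    """Level (1 or 0) of an ideal clock at time t_ps.

    phase_ps shifts the rising edge later; duty is the high fraction.
    """
    position = (t_ps - phase_ps) % period_ps
    return 1 if position < duty * period_ps else 0


PERIOD_PS = 1000.0      # assumed local clock period
M2_PHASE_PS = 250.0     # assumed phase adjustment of the second clock

# Sample both clocks at a few instants to show the adjustable phase relation.
for t_ps in (0.0, 300.0, 600.0, 800.0):
    m1 = clock_level(t_ps, PERIOD_PS)
    m2 = clock_level(t_ps, PERIOD_PS, phase_ps=M2_PHASE_PS)
    print(t_ps, m1, m2)
```

Both modeled clocks keep a 50% duty cycle; only the offset between their edges changes with the assumed phase setting.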


The capabilities of the present invention can be implemented in software, firmware, hardware or some combination thereof.


As one example, one or more aspects of the present invention can be included in an article of manufacture (e.g., one or more computer program products) having, for instance, computer usable media. The media has embodied therein, for instance, computer readable program code means for providing and facilitating the capabilities of the present invention. The article of manufacture can be included as a part of a computer system or sold separately.


Additionally, at least one program storage device readable by a machine, tangibly embodying at least one program of instructions executable by the machine to perform the capabilities of the present invention can be provided.


The flow diagrams depicted herein are just examples. There may be many variations to these diagrams or the steps (or operations) described therein without departing from the spirit of the invention. For instance, the steps may be performed in a differing order, or steps may be added, deleted or modified. All of these variations are considered a part of the claimed invention.


While the preferred embodiment to the invention has been described, it will be understood that those skilled in the art, both now and in the future, may make various improvements and enhancements which fall within the scope of the claims which follow. These claims should be construed to maintain the proper protection for the invention first described.

Claims
  • 1. A method for launching synchronous data from two sources on chip to a double data rate bus, including the steps of: coupling said two sources as inputs to a multiplexer that couples first one source then the other source to the bus on each edge of a select signal operating at a local clock signal rate; delaying the select signal with a programmable delay element.
  • 2. A method for launching synchronous data from two sources on chip to a double data rate bus as in claim 1 including the further step of delaying one of the input of one of sources with a delay element whose delay is a function of the local clock signal.
  • 3. A method for launching synchronous data from two registers on a chip to a double data rate bus as in claim 1 including the further step of generating from a local clock signal two clock signals, one to clock one of said two registers, and the other to clock the other one of said two registers.
  • 4. A method for launching synchronous data from two sources on chip to a double data rate bus as in claim 3, including the further step of delaying one of the input of one of sources with a delay element whose delay is a function of the local clock signal.
  • 5. A system for launching synchronous data from two sources on chip to a double data rate bus, comprising in combination: means for coupling said two sources as inputs to a multiplexer that couples first one source then the other source to the bus on each edge of a select signal operating at a local clock signal rate; programmable means for delaying the select signal.
  • 6. A system for launching synchronous data from two sources on chip to a double data rate bus as in claim 5 including means for delaying one of the input of one of sources with a delay element whose delay is a function of the local clock signal.
  • 7. A system for launching synchronous data from two registers on a chip to a double data rate bus as in claim 5 including means for generating from a local clock signal two clock signals, means to couple one clock signal to one of said two registers, and means to couple the other clock signal to clock the other one of said two registers.
  • 8. A system for launching synchronous data from two registers on a chip to a double data rate bus as in claim 6 including means for generating from a local clock signal two clock signals, means to couple one clock signal to one of said two registers, and means to couple the other clock signal to clock the other one of said two registers.
  • 9. A chip that launches synchronous data from two sources on the chip to a double data rate bus, comprising in combination: a multiplexer that couples first one source then the other source to the bus on each edge of a select signal operating at a local clock signal rate; a programmable delay element for delaying the select signal.
  • 10. A chip that launches synchronous data from two sources on the chip to a double data rate bus as in claim 9 including a delay element whose delay is a function of the local clock signal frequency that delays one of the input from one of sources.
  • 11. A chip that launches synchronous data from two registers on the chip to a double data rate bus as in claim 9 including a clock generator generating from a local clock signal two clock signals, means to couple one clock signal to one of said two registers, and means to couple the other clock signal to clock the other one of said two registers.
  • 12. A chip that launches synchronous data from two registers on the chip to a double data rate bus as in claim 10 including a clock generator generating from a local clock signal two clock signals, means to couple one clock signal to one of said two registers, and means to couple the other clock signal to clock the other one of said two registers.