This application claims foreign priority benefits under 35 U.S.C. §119 to co-pending German patent application number DE 10 2004 009 958.8-55, filed 1 Mar. 2004. This related patent application is herein incorporated by reference in its entirety.
1. Field of the Invention
The invention relates generally to the clock-controlled transmission of data and relates especially to a circuit arrangement for regulating the data transmission latency. A preferred, but not exclusive, area of application of the invention is the transmission of memory information (which has been read), within a memory module, from the output buffer of a memory bank to the data output of the module.
2. Description of the Related Art
In data processing systems, a reference clock is normally used as a time standard to control the operations. Accordingly, the time markers which are set in a control device to coordinate the operating cycle are not defined as units of absolute time (for example micro- or nanoseconds) but rather as units of the reference clock, that is to say, as a number of clock periods. Variations in the clock frequency can thus be allowed without having to change the specifications of the control device. These specifications also include the stipulation of the so-called “latency” for the operation of transmitting data over a data path. This latency is prescribed as a whole number “n” of clock periods which are intended to elapse, as of the time of a transmission command, before the data item to be transmitted appears at the end of the data path.
However, it must be taken into account that virtually every data path contains elements which, for physical reasons, give rise to an inevitable and essentially fixed “absolute” time delay (delay time) of the transmitted data. These include, for example, electrical components with inevitable response and transfer times (for instance, inverters and amplifiers) and delaying transmission lines. The sum “τf” of these fixed absolute delay times in the data path determines the lower limit nmin for selecting the latency n (defined above) for a given period duration T of the reference clock, since the product n*T must not be less than τf.
τf≦n*T Eq. (1)
If, in contrast, the latency n has been prescribed, the absolute time value τf determines the lower limit Tmin for the period duration T of the reference clock (and thus the upper limit fcmax=1/Tmin for selecting the reference clock frequency fc).
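The two limits that follow from Eq. (1) can be illustrated with a minimal sketch. The function names and the choice of units (nanoseconds for times, gigahertz for frequencies) are assumptions made for illustration only and do not appear in the original text.

```python
import math


def min_latency(tau_f: float, period: float) -> int:
    """Smallest whole number n for which n*T is not less than tau_f (Eq. (1))."""
    return math.ceil(tau_f / period)


def min_period(tau_f: float, n: int) -> float:
    """Lower limit Tmin of the clock period for a prescribed latency n."""
    return tau_f / n


def max_frequency(tau_f: float, n: int) -> float:
    """Upper limit fcmax = 1/Tmin of the reference clock frequency."""
    return n / tau_f


# Hypothetical values: tau_f = 5 ns, T = 2 ns, prescribed latency n = 4.
print(min_latency(5.0, 2.0))   # 3
print(min_period(5.0, 4))      # 1.25
print(max_frequency(5.0, 4))   # 0.8
```

With a fixed delay of 5 ns, a 2 ns clock therefore permits no latency below three periods, and a latency of four periods permits no clock period below 1.25 ns.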
To comply with a desired latency n in an accurate manner, a period of time τg that is exactly the same as a whole number n of clock periods T of the clock frequency fc must elapse between a reference time t0, at which the transmission of a data item (or of a burst of successive data) is requested, and the time tn, at which the data item appears at the output end of the data path. The following requirement thus exists.
tn−t0=τg=n*T Eq. (2)
Of course, it cannot be assumed that the fixed delay time τf of the data path corresponds exactly to the product n*T. Rather, the delay time τf will be some fraction of n*T, which fraction changes with the value T (that is to say, in a manner dependent on the clock frequency fc). It must therefore be ensured that the data experience an additional delay that is dependent on T to compensate for the difference between n*T and τf. With reference to
At the top left,
The pad 50 forms the end of a data path that begins at the output of the data source 10 and contains transmission elements which delay the data by a respective fixed time. In this example of data transmission from the read data buffer 10 of a memory bank to the pad 50, these transmission elements having a fixed delay time are, for example, various branches of a system of bus lines, elements of a data control logic unit for directing the data via selected branches of the bus system and, as the last element before the pad 50, a transmission driver (off-chip driver OCD) for amplifying the data before they are transmitted to the external data connection of the chip via the pad 50. Data amplifiers which respectively likewise give rise to a fixed delay may also be provided between individual branches or sections of the bus system. In addition to these elements (which are not shown individually in
Depending on the value n selected and depending on the clock frequency fc=1/T used, the fixed delay time τf may be less than T or equal to T or greater than T. To bring the total delay τg from the data source 10 to the pad 50 precisely to the value n*T, the prior art introduces an additional delay that is composed of a first part p*T corresponding to an integer multiple p of the clock period T and of a second part q*T corresponding to a fraction q of the clock period T, in accordance with the following equations,
p=INT(n−τf/T)≧0 Eq. (3)
q=(n−τf/T)−INT(n−τf/T) Eq. (4)
where INT denotes the integer part of the argument placed in brackets after it.
The part p*T of the additional delay is introduced by inserting a suitable number of shift register stages at the start of the data path, said shift register stages being clock-controlled at the frequency fc of the reference clock. The part q*T of the additional delay is introduced by delaying the phase of the clock control of the shift register stages by an appropriate degree with respect to the reference clock.
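The split of the additional delay into the parts p and q according to Eq. (3) and Eq. (4) may be sketched as follows. The function name and the numerical values are hypothetical, and INT is taken to be the floor of a non-negative argument.

```python
import math
from typing import Tuple


def split_additional_delay(n: int, tau_f: float, period: float) -> Tuple[int, float]:
    """Return (p, q) according to Eq. (3) and Eq. (4).

    p is the whole number of clock periods realised by shift register stages,
    q is the fraction of a clock period realised by shifting the clock phase.
    """
    x = n - tau_f / period
    if x < 0:
        # Eq. (1) is violated: n*T is less than tau_f.
        raise ValueError("n*T must not be less than tau_f")
    p = math.floor(x)   # Eq. (3)
    q = x - p           # Eq. (4)
    return p, q


# Hypothetical values: n = 4, tau_f = 2.5 ns, T = 2 ns, so n - tau_f/T = 2.75.
print(split_additional_delay(4, 2.5, 2.0))   # (2, 0.75)
```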
In accordance with
The integer part p of the difference between n and τf/T is ascertained, in a latency control logic unit 80, in accordance with Eq. (3) above by comparing the two clock signals CLK(0) and CLK(0−τf) and taking into account the desired latency n.
In addition, the shifted reference clock signal CLK(0−τf) is used to clock a shift register 20. As a result, the data are clocked along the shift register at the frequency fc=1/T of the reference clock, but with a clock phase that is effectively delayed by the fraction q (defined in Eq. (4)) of the clock period T with respect to the phase of the reference clock CLK(0).
The shift register 20 is shown in
The data which have been transmitted thus experience a total delay of:
τg=q*T+p*T+τf Eq. (5)
If the values for p and q from Eq. (3) and Eq. (4) are inserted into Eq. (5), Eq. (2) above is arrived at exactly. The imposed requirement is thus satisfied.
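The substitution of Eq. (3) and Eq. (4) into Eq. (5) can be checked numerically. The following sketch uses exact rational arithmetic so that no rounding error obscures the result; the function name and the grid of test values are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import floor


def prior_art_total_delay(n, tau_f, period):
    """Return (p, q, tau_g) with tau_g = q*T + p*T + tau_f as in Eq. (5)."""
    x = n - tau_f / period
    p = floor(x)            # Eq. (3)
    q = x - p               # Eq. (4)
    tau_g = q * period + p * period + tau_f
    return p, q, tau_g


def identity_holds() -> bool:
    """Check tau_g == n*T on a grid of hypothetical values with tau_f <= n*T."""
    period = Fraction(3, 2)
    for n in range(1, 7):
        for k in range(1, 80):
            tau_f = Fraction(k, 8)
            if tau_f > n * period:
                continue    # excluded by Eq. (1)
            _, _, tau_g = prior_art_total_delay(n, tau_f, period)
            if tau_g != n * period:
                return False
    return True


print(identity_holds())   # True
```

Because q*T + p*T is by construction equal to (n − τf/T)*T, the check succeeds for every admissible combination, which is the algebraic content of the statement above.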
The method of operation of the circuit arrangement shown in
The transmission command RDD is issued at the time t0 in synchronism with an edge of the reference clock CLK(0) and ensures that, as of this time, a connection is set up between the reference clock signal CLK(0) and a clock input of the data source 10, and a connection is set up between the shifted clock signal CLK(0−τf) and the clock connections of the shift register 20. This is symbolized in
τg=0.167*T+3*T+0.833*T=4*T.
τg=0.333*T+2*T+1.666*T=4*T.
The known circuit arrangement can be used to theoretically achieve a desired total delay time of exactly n*T given any desired values for T, n and τf, provided that T*n is not less than τf. However, one critical point in the known latency regulation method described is the latency control logic unit that ascertains the integer p in accordance with Eq. (3) above. This logic unit has decision-making problems when the quotient τf/T is equal to a whole number or comes very close to a whole number. Such a situation arises whenever the clock frequency fc has a value for which τf is equal to the clock period T=1/fc or to an integer multiple thereof, or comes very close to such values.
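The decision problem can be made concrete with a deliberately simplified model, which is an assumption and not a description of the actual logic unit: the fraction q is taken to be set by the true clock phase, whereas the integer p is decided from a value of τf that carries a small measurement error. The function name and all numerical values are hypothetical.

```python
import math


def realised_latency(n: int, tau_f: float, period: float, error: float) -> float:
    """Latency in clock periods under the simplified model described above.

    q follows the true delay tau_f (it is set by the shifted clock phase);
    p is decided from the erroneous value tau_f + error.
    """
    x_true = n - tau_f / period
    q = x_true - math.floor(x_true)
    p = math.floor(n - (tau_f + error) / period)
    return (q * period + p * period + tau_f) / period


# tau_f/T = 2 exactly: a tiny positive error flips p from 2 to 1.
print(realised_latency(4, 4.0, 2.0, +0.01))   # 3.0
print(realised_latency(4, 4.0, 2.0, -0.01))   # 4.0
# tau_f/T = 1.5: the same error leaves the decision untouched.
print(realised_latency(4, 3.0, 2.0, +0.01))   # 4.0
```

In this model, an error far smaller than a clock period changes the realised latency by a whole period as soon as τf/T lies at or near a whole number.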
One aspect of the invention is to provide a circuit arrangement for latency regulation that reliably avoids undesired latency jumps.
Accordingly, one embodiment of the invention is implemented using a circuit arrangement for regulating a latency that is defined as a whole number n of periods T of a reference clock of frequency fc and is intended to elapse, as of a data transmission command, before the data which are to be transmitted from a data source appear at the end of the data path that is to be passed through. The circuit arrangement contains a chain of transmission elements having fixed delay times, making it possible to set the frequency fc in a range from 1/Tmax to 1/Tmin, where Tmin is at least equal to τf/n and τf is the sum of the fixed delay times in the data path. One embodiment of the invention provides a device for subdividing the data path into n successive sections, each of which contains, at its input, a clock-controlled sampling element for accepting the data to be transmitted and has a propagation time that is considerably shorter than Tmin, the propagation time τn of the last section also being considerably (or substantially) greater than zero. A device for controlling the clock of the sampling elements using a version of the reference clock that has been delayed by T−τn is also provided.
The wording “considerably/substantially greater than zero” that is used above and in the claims is to be understood in the sense that τn must always be sufficiently greater than zero such that it can be unambiguously measured and reproduced, that is to say, by an amount that is greater than the tolerance or variation range of the measurement and setting parameters.
During operation of the circuit arrangement according to one embodiment of the invention, the data require the whole number n-1 of clock periods T to pass through the first n-1 sections of the data path, because of the section-by-section clock control that forms a pipeline architecture. The propagation time τn in the last section remains as a non-integer fraction of T. Only the difference T−τn needs to be ascertained to delay the clock control by this amount so that the total delay is exactly n*T.
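The timing of the sectioned data path may be sketched as follows. The sketch assumes that the data are present at the input of the first sampling element at its first shifted clock edge after the transmission command, and that each listed section delay already includes the transfer time of the associated sampling element; the function name and the delay values are hypothetical.

```python
from typing import Sequence


def pipelined_latency(section_delays: Sequence[float], period: float) -> float:
    """Time from the transmission command until the data reach the end of the path.

    section_delays holds the propagation times of the sections S1 .. Sn.
    """
    n = len(section_delays)
    if any(d >= period for d in section_delays):
        raise ValueError("every section must be shorter than the clock period")
    tau_n = section_delays[-1]
    if tau_n <= 0:
        raise ValueError("the last section must have a delay greater than zero")
    clock_shift = period - tau_n                       # delay of the sampling clock
    last_sample_time = clock_shift + (n - 1) * period  # edge at the n-th sampling element
    return last_sample_time + tau_n


delays_ns = [1.5, 1.0, 1.5, 0.5]          # hypothetical sections S1 .. S4
print(pipelined_latency(delays_ns, 2.0))  # 8.0, i.e. 4 periods of 2 ns
print(pipelined_latency(delays_ns, 2.5))  # 10.0, i.e. 4 periods of 2.5 ns
```

Only the last section delay enters the clock shift; the delays of the first n−1 sections affect the result only through the condition that each must fit within one clock period.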
The circuit arrangement according to one embodiment of the invention thus never needs to make a critical decision as to the size of the integer part of clock periods T in a fixed delay time. Latency jumps which are based on wrong decisions do not occur. No complicated latency control logic unit, as in the prior art, is required.
A further advantage of the invention is that only the delay time τn of the last section, rather than the delay time of the entire data path, must be measured or simulated to set the clock control delay on the data path. This reduces the circuit complexity and also increases accuracy.
So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
The exemplary embodiment shown in
The data path also contains the chain of conventional transmission elements which delay the data by a respective fixed time. In said example of data transmission from a read data buffer 10 to a pad 50, these transmission elements are, for example, the abovementioned bus line sections, amplifiers, etc., which together give rise to the abovementioned fixed delay time τf. The chain, which forms a cohesive structure in the prior art shown in
The positions of the flip-flops FF#1 . . . FF#n along the data path are selected in such a manner that the following conditions are satisfied:
The minimum number mmin of sub-blocks 140 (that is to say, the minimum number of those sections which must contain parts of the transmission elements which delay in a fixed manner) depends on the values Tmin and τf. Since each section also contains a flip-flop, the “transfer time” τFF of the respective associated flip-flop must also be included in the delay time of each of the sections S1 . . . Sn. This transfer time is always considerably shorter than Tmin. The following relationship thus applies.
mmin=INT(τf/Tmin+τFF)+1 Eq. (6)
The actual number m of sub-blocks 140 may, of course, be greater than mmin (however, at most equal to n). The number of sub-blocks and their respective length (delay time) may be selected as desired within said limits. If m is less than n, n-m sections of the data path are “empty” in the sense that they do not contain a sub-block of the transmission elements (which delay in a fixed manner) of the data path but rather only the respective associated flip-flop. The case of n=4 and m=3 is taken as a basis in the exemplary embodiment shown in
According to one embodiment of the invention, the flip-flops FF#1 . . . FF#n in the data path are controlled, in accordance with
The method of operation of the circuit arrangement shown in
The transmission command RDD is issued at the time t0 in synchronism with an edge of the reference clock CLK(0) and ensures that, as of this time, a connection is set up between the reference clock signal CLK(0) and a clock input of the data source 10 and that a connection is set up between the shifted clock signal CLK(0−τn) and the clock connections of the flip-flops FF#1 . . . FF#n. This is symbolized in
The above described transmission corresponds precisely to the relationship required in accordance with Eq. (2) above. This transmission applies to any desired clock frequencies fc, provided that none of the sections S1 . . . Sn has a delay time longer than 1/fc. The desired result is achieved using considerably simpler means than in the prior art. There is no need either for a complex latency control logic unit for ascertaining the integer part of a numerical value or for a multiplexer for selecting shift register taps in a manner dependent on the integer part ascertained. There is thus no risk of latency jumps.
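The frequency condition stated above can be expressed as a short check. The function names and the delay values (in nanoseconds, with frequencies in gigahertz) are assumptions for illustration; a practical design would keep an additional margin, since the section delays are intended to be considerably shorter than the shortest clock period.

```python
from typing import Sequence


def supports_frequency(section_delays: Sequence[float], fc: float) -> bool:
    """True if no section has a delay time longer than 1/fc."""
    return max(section_delays) <= 1.0 / fc


def limiting_frequency(section_delays: Sequence[float]) -> float:
    """Frequency at which the slowest section exactly fills one clock period."""
    return 1.0 / max(section_delays)


delays_ns = [1.5, 1.0, 1.5, 0.5]            # hypothetical sections S1 .. S4
print(supports_frequency(delays_ns, 0.5))   # True  (T = 2 ns)
print(supports_frequency(delays_ns, 0.8))   # False (T = 1.25 ns)
```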
The requisite shifting of the clock signal to control the flip-flops can also be accurately achieved, in the circuit arrangement according to one embodiment of the invention, in a simpler manner than hitherto. It is only necessary to simulate the delay time of the last section rather than that of the entire data path. This not only requires less circuit complexity but can also be implemented in a more precise manner. Possible variations in the simulated delay time are, when regarded in absolute terms, far smaller, with the result that the adjustment means require a much smaller absolute dynamic range, thus improving the adjustment fineness.
In the exemplary embodiment shown in
The empty sections, if required (that is to say, in the case of m<n), can be inserted at any desired places on the data path before the last section. If, however, one or more of the empty sections or all of the empty sections are positioned at the start of the data path, this advantageously makes it possible to vary the latency value n in a very simple manner. This shall be explained in more detail below using an exemplary embodiment and with reference to
In the example shown in
For this purpose, in accordance with
To set the latency to a desired value n, precisely that switching path Ki (of the switching paths K0 . . . Kr), whose ordinal number i within the sequence 0 . . . r is equal to n−m, is closed. If m=3 and n is supposed to be equal to 4, the switching path K1 is closed, so that only the first flip-flop FF#1 is inserted as an empty data path section. This then results in effectively the same circuit as shown in
The switching paths K0 . . . Kr may be implemented using any desired switching elements, for example, using fusible links (e.g., fuse elements) which can be “blown” electrically or by means of lasers or using open contact links which can be closed by means of metallization. Electronic switching devices, for example, a multiplexer that can be controlled using a suitable switching signal, may also be utilized to select the respective desired tap of the register 120. The electronic switching devices may be advantageous when it is not desired to permanently retain the selected latency setting. The delay time of the data in the switching device must not, however, be longer than 1/fcmax minus the flip-flop transfer time τFF because, otherwise, the condition that none of the sections of the data path should have a delay time longer than 1/fcmax is not satisfied.
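The selection of the switching path and the limit on the delay of the switching device may be sketched as follows. The function names and the numerical values are hypothetical; the sketch only restates the rule i = n − m and the delay limit given above.

```python
def select_switching_path(n: int, m: int, r: int) -> int:
    """Return the ordinal number i of the switching path Ki to be closed.

    n is the desired latency, m the number of sections containing sub-blocks,
    r the number of flip-flops available as empty sections in the register.
    """
    i = n - m
    if not 0 <= i <= r:
        raise ValueError("latency n cannot be set with the available taps")
    return i


def switch_delay_ok(switch_delay: float, fc_max: float, tau_ff: float) -> bool:
    """True if the switching device is not slower than 1/fcmax minus tau_FF."""
    return switch_delay <= 1.0 / fc_max - tau_ff


print(select_switching_path(4, 3, 3))   # 1, i.e. switching path K1 is closed
print(switch_delay_ok(0.5, 0.5, 0.2))   # True  (limit is 1.8 ns)
print(switch_delay_ok(2.0, 0.5, 0.2))   # False
```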
In the figures, the data-carrying connecting lines are depicted as thick lines. This is intended to indicate that these lines can be multicore to transmit a multibit data stream in parallel form. For this case, each element shown in the data path and at the ends of the data path may be considered to be a parallel circuit of a plurality of identical elements.
Whereas the preferred area of application of the invention is regulating the latency of the read data paths in memory modules (in particular in DRAMs), applications of latency regulation in other fields or in other data processing devices are also within the scope of the invention.
While the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
Number | Date | Country | Kind |
---|---|---|---|
10 2004 009 958 | Mar 2004 | DE | national |
Number | Date | Country |
---|---|---|
20050213417 A1 | Sep 2005 | US |