The disclosure of Japanese Patent Application No. 2023-188107 filed on Nov. 2, 2023, including the specification, drawings and abstract is incorporated herein by reference in its entirety.
The present invention relates to semiconductor devices and memory modules, for example, semiconductor devices such as data buffers mounted on memory modules.
There are disclosed techniques listed below.
For example, in cloud and enterprise systems, high-speed memory modules such as DDR5 (Double Data Rate 5) LRDIMM (Load Reduced Dual Inline Memory Module) are used. As shown in Non-Patent Document 1, such memory modules are equipped with a data buffer for driving data signals (DQ signals) and data strobe signals (DQS signals). The data buffer includes a data retention circuit called a slicer, which latches the input DQ signal at the edge of the DQS signal. The data buffer adjusts the phase of the DQS signal using a variable delay circuit, for example, in an initial sequence, to ensure such latching operation.
On the other hand, in the data buffer, for example, a decision feedback equalizer (DFE) or the like, which enhances signal quality, may be inserted in the transmission path of the input DQ signal. For example, the delay time of the DFE changes according to changes in the environment such as temperature and voltage. As a result, even if the phase of the DQS signal is adjusted by the initial sequence, the phase of the DQ signal may change according to changes in the environment, potentially failing to correctly maintain the phase relationship between the DQ signal and the DQS signal.
The embodiment described later is made in view of such matters, and other problems and novel features will become apparent from the description of this specification and the accompanying drawings.
A semiconductor device according to one embodiment includes first and second data retention circuits, a variable delay circuit, and a timing adjustment circuit. The first data retention circuit latches an input data signal in synchronization with a first data strobe signal. The second data retention circuit latches the input data signal in synchronization with a second data strobe signal. The variable delay circuit generates the first and second data strobe signals by delaying the input data strobe signal by a first delay amount and a second delay amount, respectively. The timing adjustment circuit determines, while changing the second delay amount, whether the first data signal from the first data retention circuit matches the second data signal from the second data retention circuit, and adjusts the first delay amount based on the determination result.
According to one embodiment, it is possible to correctly maintain the phase relationship between the data signal and the data strobe signal that defines the latching timing of the data signal.
In the following embodiments, for convenience, when necessary, they are described by dividing into multiple sections or embodiments. Except when specifically indicated, these are not unrelated to each other; one may be a part or all of a modified example, detail, supplementary explanation, etc., of the other. Furthermore, in the following embodiments, when referring to the number of elements, etc. (including the number of elements, numerical values, quantities, ranges, etc.), except when specifically indicated and when it is clearly limited to a specific number in principle, it is not limited to that specific number and may be more or less than the specific number.
Moreover, in the following embodiments, it goes without saying that the constituent elements (including element steps and the like) are not necessarily essential, except in cases where they are specifically indicated and in cases where they are considered to be obviously essential in principle. Similarly, in the following embodiments, when referring to the shapes, positional relationships, etc., of components and the like, it is assumed to include those that are substantially approximate to or similar to the shapes, etc., except in cases where they are specifically indicated and in cases where they are considered not to be so in principle. This applies to the above numerical values and ranges as well.
Furthermore, the circuit elements constituting each functional block of the embodiments are not particularly limited but are formed on a semiconductor substrate such as single-crystal silicon by a known integrated circuit technology such as CMOS (complementary MOS transistors).
Hereinafter, embodiments are described in detail with reference to the drawings. In all the drawings for explaining the embodiments, members having the same functions are denoted by the same reference numerals, and repetitive descriptions thereof are omitted. Moreover, in the following embodiments, descriptions of the same or similar parts will not be repeated in principle except when particularly necessary.
Moreover, the module wiring substrate MB is equipped with a plurality of external terminals. The plurality of external terminals include control external terminals PNc and data external terminals PNdA, PNdB. The control external terminals PNc are input terminals for memory control signals such as a clock signal CLK, a command signal CMD, and an address signal ADD. The data external terminals PNdA, PNdB are input/output terminals for a data signal DQ and a data strobe signal DQS.
The registered clock driver RCD is, for example, composed of a single semiconductor chip. The registered clock driver RCD re-drives the memory control signals input via the control external terminals PNc and outputs them to the plurality of memory chips MEM via the CA bus BS_CA. Furthermore, the registered clock driver RCD generates buffer control signals for the data buffers DB based on the input memory control signals and outputs them to the plurality of data buffers DB via the BCOM bus BS_BCOM.
Each of the plurality of data buffers DB, for example, is composed of a single semiconductor chip. The plurality of data buffers DB identify the write period and the read period for the memory chip MEM based on the buffer control signal from the registered clock driver RCD. During the write period, the plurality of data buffers DB re-drive the data signal DQ and the data strobe signal DQS inputted through the data external terminals PNdA, PNdB, and output them to the plurality of memory chips MEM. During the read period, the plurality of data buffers DB re-drive the data signal DQ and the data strobe signal DQS from the plurality of memory chips MEM, and output them to the data external terminals PNdA, PNdB.
Each of the plurality of memory chips MEM is, for example, a DDR-SDRAM chip, specifically a DDR5-SDRAM chip or the like. Each memory chip MEM performs a write operation or a read operation in response to the memory control signal from the CA bus BS_CA. During the write operation, each memory chip MEM takes in the data signal DQ from the data buffer DB using the data strobe signal DQS and writes it to the selected memory cell. During the read operation, each memory chip MEM outputs the data signal DQ read from the selected memory cell, along with the data strobe signal DQS, to the data buffer DB.
A host provided outside the memory module MDL inputs and outputs “2×m” bits of data signal DQ, which are inputted and outputted by the data buffer DB, through the m-bit data terminals included in the external terminals for data PNdA, PNdB. At this time, the host inputs and outputs m bits of data signal DQ in half the clock cycle “(½) Tck” based on twice the clock frequency “2×fck”, and inputs and outputs m bits of data signal DQ in the following half clock cycle. The Registered Clock Driver RCD and the plurality of data buffers DB absorb the speed difference between such memory interface MEM_IF and host interface HST_IF by buffering.
The drivers TXh_St, TXh_Sc output complementary data strobe signals DQSt, DQSc to the host interface HST_IF side. Receiver RXh_S differentially inputs the complementary data strobe signals DQSt, DQSc from the host interface HST_IF side and outputs a clock signal for latching to the slicer SLw. The driver TXh_D outputs data from the read buffer BUFR as data signals DQ[n] to the host interface HST_IF side.
The receiver RXh_D differentially inputs the data signal DQ[n] from the host interface HST_IF side and a pre-generated reference voltage Vref, and outputs the data signal DQ[n] to the slicer SLw. The slicer SLw is composed of flip-flops, among other components. The slicer SLw latches the data signal DQ[n] from the receiver RXh_D in synchronization with the clock signal from the receiver RXh_S.
Meanwhile, the drivers TXm_St, TXm_Sc output complementary data strobe signals MDQSt, MDQSc to the memory interface MEM_IF side. The read latch circuit RET includes receivers RXm_S, RXm_D, a variable delay circuit VDLYs, and a slicer SLr. The receiver RXm_S differentially inputs the complementary data strobe signals MDQSt, MDQSc from the memory interface MEM_IF side and outputs a clock signal for latching to the variable delay circuit VDLYs. The variable delay circuit VDLYs delays the inputted clock signal and outputs it to the slicer SLr.
The receiver RXm_D differentially inputs the data signal MDQ[n] from the memory interface MEM_IF side and a pre-generated reference voltage Vref, and outputs the data signal MDQ[n] to the slicer SLr. The slicer SLr is a data holding circuit composed of flip-flops, among other components. The slicer SLr latches the data signal MDQ[n] from the receiver RXm_D in synchronization with the clock signal from the variable delay circuit VDLYs.
Each of the read buffer BUFR and the write buffer BUFW is composed of, for example, FIFO (First In First Out) buffers. The read buffer BUFR stores the data signal MDQ[n] from the slicer SLr on the memory interface MEM_IF side. Then, the read buffer BUFR outputs the stored data signal MDQ[n] as data signal DQ[n] to the host interface HST_IF side through the driver TXh_D.
Meanwhile, the write buffer BUFW stores the data signal DQ[n] from the slicer SLw on the host interface HST_IF side. Then, the write buffer BUFW outputs the stored data signal DQ[n] as data signal MDQ[n] to the memory interface MEM_IF side through the driver TXm_D.
The buffer control circuit CTRL inputs a buffer control signal from the BCOM bus BS_BCOM and outputs an internal control signal CT for controlling each part within the data buffer DB based on the buffer control signal. The buffer control signal includes, for example, a clock signal BCK, a command signal BCOM, a chip select signal BCS, and a reset signal BRST. The internal control signal CT includes, for example, an enable signal to drivers and receivers, input and output clocks to the read buffer BUFR and the write buffer BUFW, and a setting value for the delay amount to the variable delay circuit VDLYs.
Furthermore, the buffer control circuit CTRL includes a phase-locked loop (PLL). The phase-locked loop (PLL) generates a new clock signal CK synchronized with the input clock signal BCK. Using the generated clock signal CK, the buffer control circuit CTRL produces complementary data strobe signals MDQSt, MDQSc for the memory interface MEM_IF side, and complementary data strobe signals DQSt, DQSc for the host interface HST_IF side.
Moreover, the buffer control circuit CTRLc shown in
The decoder QDEC sets the value “(¼) Tck”, which is one-quarter of the delay amount “Tck”, in the variable delay circuit VDLYs_C within the read latch circuit RLTc. As a result, the data strobe signal MDQS from the receiver RXm_S is input to the slicer SLr as the data strobe signal DQSin after a delay time of “(¼) Tck”. The slicer SLr latches the data signal DQin from the receiver RXm_D in synchronization with the data strobe signal DQSin.
Specifically, as shown in
The delay time td_dq is the delay time from the output terminal of the memory chip MEM to the data input node of the slicer SLr, including the delay time by the receiver RXm_D and the delay time by the wiring of the transmission path. Similarly, the delay time td_dqs is the delay time from the output terminal of the memory chip MEM to the clock input node of the slicer SLr, including the delay time by the receiver RXm_S and the delay time by the wiring of the transmission path.
Therefore, by adding the delay time of “(¼) Tck” from the decoder QDEC to the delay time td_dqs with the variable delay circuit VDLYs_C, it is possible to set the edge timing of the data strobe signal DQSin in the middle of the eye width W of the data signal DQin at the input node of the slicer SLr.
The worst case (WC) refers to, for example, the case where the delay time changes the most in response to changes in the environment such as temperature and voltage. Since the receivers RXm_D, RXm_S are usually composed of similar circuits, the delay times by each receiver RXm_D, RXm_S are equivalent even when the environment changes. Therefore, even in the worst case (WC), it is possible to set the edge timing of the data strobe signal DQSin in the middle of the eye width W of the data signal DQin.
For example, as shown in
However, in this case, the signal quality in the transmission path may deteriorate. Therefore, in the example shown in
However, when a Decision Feedback Equalizer (DFE) is provided, as shown in
On the other hand, the adder-subtractor DFE_SUM is composed of an analog circuit. Therefore, the delay time caused by the adder-subtractor DFE_SUM may change by time “dt” according to changes in the environment such as temperature and voltage. Consequently, time “dt” is added or subtracted only to the delay time td_dq. As a result, as shown in the worst case (WC), the phase relationship between the data signal DQin and the data strobe signal DQSin changes, and the slicer SLr may not be able to correctly latch the data signal DQin with the data strobe signal DQSin.
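As a non-limiting numerical illustration of this shift, the following Python sketch computes the setup-side and hold-side margins of the strobe edge. It assumes an ideal eye that opens at td_dq (plus the drift dt) and closes half a clock cycle later, and a strobe edge at td_dqs plus the quarter-cycle delay. The function name `strobe_margins` and the picosecond values are hypothetical and chosen only for illustration.

```python
def strobe_margins(tck, td_dq, td_dqs, dt=0.0):
    """Return (setup_margin, hold_margin) of the strobe edge inside the eye.

    Assumptions (not taken from the specification): the eye of DQin is
    ideal, opening at td_dq + dt and closing half a clock cycle later,
    and the strobe edge arrives at td_dqs + tck / 4.
    """
    eye_open = td_dq + dt            # DQin becomes valid
    eye_close = eye_open + tck / 2   # DQin changes to the next bit
    edge = td_dqs + tck / 4          # DQSin edge after the (1/4) Tck delay
    return edge - eye_open, eye_close - edge


# Hypothetical values in picoseconds.
TCK = 400.0
typ = strobe_margins(TCK, td_dq=150.0, td_dqs=150.0)
wc = strobe_margins(TCK, td_dq=150.0, td_dqs=150.0, dt=60.0)
print(typ)  # equal margins: the edge sits at the eye center
print(wc)   # margins differ by 2 * dt: the edge is off-center
```

In this simplified model, a drift of dt in td_dq alone reduces one margin by dt and increases the other by dt, which corresponds to the strobe edge moving away from the center of the eye width W.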
In such cases, for example, it is possible to return to a state similar to the typical case (typ) by re-setting the initial value of the variable delay circuit VDLYs_C through re-training. Specifically, for example, it is advisable to perform training periodically, taking into account changes in the environment. However, performing training periodically will reduce the operating time of the system for the duration of the training. Therefore, it is beneficial to use the method of the embodiment described below.
The read latch circuit RLTa shown in
Similarly to the case of
The main slicer (first data holding circuit) SLr latches the input data signal DQin transmitted from the memory chip MEM through the receiver RXm_D and the adder/subtractor DFE_SUM in synchronization with the main data strobe signal (first data strobe signal) DQSin. Meanwhile, the monitor slicer (second data holding circuit) SLr_M latches the input data signal DQin in synchronization with the monitor data strobe signal (second data strobe signal) DQSin_M.
The variable delay circuit VDLYs_A delays the input data strobe signal MDQS transmitted from the memory chip MEM through the receiver RXm_S by the main delay amount (first delay amount) ST1 and the monitor delay amount (second delay amount) ST2, respectively. Thus, the variable delay circuit VDLYs_A generates the main data strobe signal DQSin reflecting the main delay amount ST1 and the monitor data strobe signal DQSin_M reflecting the monitor delay amount ST2.
The variable delay circuit VDLYs_A, as shown in detail in
The selection circuit SEL1 selects one of the multiple outputs from the multiple delay elements DE[0]-DE[k] based on the main delay amount ST1 and outputs it as the main data strobe signal DQSin. Similarly, the selection circuit SEL2 selects one of the multiple outputs from the multiple delay elements DE[0]-DE[k] based on the monitor delay amount ST2 and outputs it as the monitor data strobe signal DQSin_M.
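As a non-limiting behavioral sketch, the two selection circuits can be modeled in Python as two independent tap selectors on one delay chain. The model assumes that the delay elements are connected in series and that each contributes the same unit delay; the class name `TappedDelayLine`, the element count, and the unit delay are hypothetical.

```python
class TappedDelayLine:
    """Behavioral model of a delay chain with two independent tap selectors."""

    def __init__(self, num_elements, unit_delay):
        # Assumed: cumulative delay at the output of DE[0] .. DE[k].
        self.tap_delays = [(i + 1) * unit_delay for i in range(num_elements)]

    def select(self, setting):
        """Model of one selection circuit: pick the tap named by the setting."""
        if not 0 <= setting < len(self.tap_delays):
            raise ValueError("delay setting out of range")
        return self.tap_delays[setting]

    def strobes(self, t_in, st1, st2):
        """Return edge times of (DQSin, DQSin_M) for an input edge at t_in."""
        return t_in + self.select(st1), t_in + self.select(st2)


# Hypothetical chain: 16 elements of 10 ps each.
line = TappedDelayLine(num_elements=16, unit_delay=10.0)
main_edge, monitor_edge = line.strobes(t_in=0.0, st1=7, st2=3)
print(main_edge, monitor_edge)
```

Because both selectors read the same chain, changing the monitor setting ST2 in this model has no effect on the main strobe timing, which is the property the monitor path relies on.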
Returning to
The search circuit SC, in general, monitors the monitor determination result MPF while changing the monitor delay amount ST2 and adjusts the main delay amount ST1 based on the monitor determination result MPF. As a prerequisite for starting such adjustment, the initial value of the main delay amount ST1 before adjustment needs to be reasonably appropriate. That is, the initial value of the main delay amount ST1 must be at least such that the main slicer SLr can correctly latch the input data signal DQin. Therefore, the search circuit SC determines the initial value of the main delay amount ST1 by reflecting “(¼) Tck” from the Delay Locked Loop (DLL).
The adder/subtractor DFE_SUM is a component of the Decision Feedback Equalizer (DFE) and is inserted into the transmission path of the input data signal DQin. The Decision Feedback Equalizer (DFE) performs waveform equalization of the input data signal DQin. Specifically, as shown in
Each of the plurality of delay circuits DLY2, DLY3, . . . is composed of, for example, flip-flops similar to the slicer SLr, achieving a delay of one clock cycle. The weighting circuit W1 applies weighting, that is, multiplication, to the output of the slicer SLr. Similarly, the weighting circuits W2, W3, . . . apply weighting to the outputs of the delay circuits DLY2, DLY3, . . . , respectively. The adder/subtractor DFE_SUM adds or subtracts the outputs from the plurality of weighting circuits W1, W2, W3, . . . to the input data signal MDQ[n].
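As a non-limiting illustration of this feedback structure, the following Python sketch models the slicer, the delay stages, the weighting circuits, and the adder/subtractor at one sample per bit. The bit encoding (+1 / -1), the sign convention (weighted decisions are subtracted), the channel model, and the weight values are assumptions made only for this example.

```python
def dfe_receive(samples, weights):
    """Slice a sequence of received samples with decision feedback.

    samples: received amplitudes, one per bit (bits encoded as +1 / -1).
    weights: feedback weights W1, W2, ... applied to the previous decisions.
    """
    history = [0] * len(weights)   # slicer output, then DLY2, DLY3, ... outputs
    decisions = []
    for x in samples:
        feedback = sum(w * d for w, d in zip(weights, history))
        corrected = x - feedback           # adder/subtractor stage
        d = 1 if corrected >= 0 else -1    # slicer decision
        decisions.append(d)
        history = [d] + history[:-1]       # shift decisions through the delay stages
    return decisions


def plain_slice(samples):
    """Reference: slicing without any feedback."""
    return [1 if x >= 0 else -1 for x in samples]


# Hypothetical channel: each bit leaks 70 % into the next bit and 50 % into
# the bit after that.
bits = [1, 1, -1, 1, -1, -1, 1]
prev1 = [0] + bits[:-1]
prev2 = [0, 0] + bits[:-2]
received = [b + 0.7 * p1 + 0.5 * p2 for b, p1, p2 in zip(bits, prev1, prev2)]

print(plain_slice(received) == bits)                 # False: two bits are misread
print(dfe_receive(received, [0.7, 0.5]) == bits)     # True: feedback removes the leakage
```

In this sketch the feedback weights match the assumed leakage exactly; in practice the weights would be set by some form of training, which is outside the scope of this example.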
Then, by sequentially changing the monitor delay amount ST2 in this manner and monitoring the monitor determination result MPF from the exclusive OR circuit EOR, the timing adjustment circuit TMCT can detect the eye width W of the input data signal DQin based on the edge timing at the point of change from pass (match) to fail (mismatch). The timing adjustment circuit TMCT adjusts the main delay amount ST1 so that the edge timing of the main data strobe signal DQSin is positioned in the center of the eye width W.
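As a non-limiting illustration of this scan, the following Python sketch sweeps the monitor setting, compares the monitor slicer output with the main slicer output by exclusive OR, and places the main setting at the center of the passing range. The slicer model (`sample`), the eye boundaries, and the bit pattern are hypothetical. The model also assumes that the read data contains transitions, since a mismatch can only be observed when adjacent bits differ.

```python
def sample(bits, index, edge_time, eye_open, eye_close):
    """Hypothetical slicer model: inside the eye the current bit is latched,
    before the eye the previous bit, after the eye the next bit."""
    if edge_time < eye_open:
        return bits[index - 1]
    if edge_time > eye_close:
        return bits[index + 1]
    return bits[index]


def scan_eye(bits, unit_delay, num_taps, st1, eye_open, eye_close):
    """Sweep the monitor setting ST2 and return the settings that pass."""
    passing = []
    for st2 in range(num_taps):
        mismatch = 0
        for i in range(1, len(bits) - 1):
            main = sample(bits, i, st1 * unit_delay, eye_open, eye_close)
            monitor = sample(bits, i, st2 * unit_delay, eye_open, eye_close)
            mismatch |= main ^ monitor   # exclusive OR: 1 on any mismatch
        if not mismatch:
            passing.append(st2)
    return passing


bits = [0, 1, 0, 0, 1, 1, 0, 1]   # hypothetical read data, 0/1 encoded
passing = scan_eye(bits, unit_delay=10.0, num_taps=32, st1=12,
                   eye_open=95.0, eye_close=265.0)
new_st1 = (passing[0] + passing[-1]) // 2   # center of the detected eye
print(passing[0], passing[-1], new_st1)
```

The first and last passing settings correspond to the pass-to-fail boundaries on either side of the eye, and their midpoint is used here as the adjusted main setting.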
By using such a method, it is possible to correctly maintain the phase relationship between the data signal (input data signal DQin) and the data strobe signal (main data strobe signal DQSin) that determines the latch timing of the data signal. That is, the edge timing of the data strobe signal can be fixed in the center of the eye width W of the data signal.
Furthermore, by providing a monitor slicer SLr_M and scanning the monitor data strobe signal DQSin_M, it is possible to detect the eye width W of the input data signal DQin without affecting the normal read operation in the memory chip MEM. That is, while transmitting the data signal DQo from the main slicer SLr as the read data signal from the memory chip MEM to the data external terminals PNdA, PNdB in the subsequent stage, as shown in
Accordingly, for example, during the period of normal reading operation, even if the environment such as temperature and voltage changes, and hence, the delay time of the transmission path changes due to a Decision Feedback Equalizer (DFE) or the like, it is possible to optimize the strobe timing while following the changes in real-time. As a result, a system robust to changes in the environment can be realized. Furthermore, for example, there is no need to interrupt the normal reading operation to retrain the strobe timing, etc. As a result, the operating time of the system can be secured.
Moreover, since it is possible to compensate for changes in the delay time of the transmission path according to changes in the environment, circuits with temperature dependency, such as a Decision Feedback Equalizer (DFE), can be easily mounted within the transmission path. By mounting a Decision Feedback Equalizer (DFE), it is possible to achieve higher data transfer speeds, lower power consumption as mentioned in
In the initial sequence period (Step S10), first, the Delay Locked Loop (DLL) becomes locked (Step S101). That is, as described in
Through read training, for example, as shown in
In the subsequent normal operation period (step S20), the timing adjustment circuit TMCT starts its operation by setting the main delay amount ST1 from the state determined by the initial value obtained during the initial sequence period (step S10). In step S20, in summary, the following processes are executed.
First, as shown in
The timing adjustment circuit TMCT judges pass/fail based on the monitor judgment result MPF with the edge timing of the monitor data strobe signal DQSin_M set to one of the edge timings th, ts, and then, with it set to the other, judges pass/fail based on the monitor judgment result MPF. Then, as long as the monitor judgment result MPF is a pass, the timing adjustment circuit TMCT sequentially increases the integer “N”, thereby gradually expanding the shift width of the edge timings th, ts based on the edge timing tm (steps S202-S206).
Here, the monitor judgment results MPF in steps S203, S205 are obtained using the normal read operation for the memory chip MEM. That is, as shown in
Thus, for example, the timing adjustment circuit TMCT obtains the monitor judgment result MPF at the hold side edge timing th during the data read period within the read cycle Trd1 in step S203. Subsequently, the timing adjustment circuit TMCT obtains the monitor judgment result MPF at the setup side edge timing ts during the data read period within the read cycle Trd2 in step S205.
On the other hand, if the shift width of the edge timings th, ts is sequentially expanded, at a certain point, a failure occurs at least on one of the edge timings th, ts. First, assume the case where a failure occurs at the hold side edge timing th, and also a failure occurs at the setup side edge timing ts (steps S202, S203, S207, S208, S211). This state indicates that the main edge timing tm is positioned in the center of the eye width W of the input data signal DQin. Therefore, the timing adjustment circuit TMCT maintains the main edge timing tm, that is, the main delay amount ST1 as it is (step S211), and returns to the state shown in
Next, assume a case where a failure occurs at the hold side edge timing th, and a pass occurs at the setup side edge timing ts (steps S202, S203, S207, S208, S210). This state indicates that the main edge timing tm is located on the hold side rather than the center of the eye width W. Therefore, the timing adjustment circuit TMCT shifts the main edge timing tm by a unit delay time “dTof” in the setup direction, that is, in the direction of delay reduction (step S210), and returns to step S201.
Finally, assume a case where a pass occurs at the hold side edge timing th, and a failure occurs at the setup side edge timing ts (steps S202, S203, S204, S205, S209). This state indicates that the main edge timing tm is located on the setup side rather than the center of the eye width W. Therefore, the timing adjustment circuit TMCT shifts the main edge timing tm by a unit delay time “dTof” in the hold direction, that is, in the direction of delay increase (step S209), and returns to step S201.
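As a non-limiting illustration, the three cases above can be sketched in Python as follows. The sketch assumes that the hold-side and setup-side edge timings are the main setting plus and minus N unit delays, respectively, and that a pass/fail callback stands in for the monitor judgment result MPF. The function names, the bound `max_shift`, and the eye range are hypothetical.

```python
def adjust_once(st1, passes, max_shift):
    """One pass through the centering loop; returns the updated main setting.

    st1:       current main delay setting (in units of dTof).
    passes:    callable(setting) -> True if the monitor slicer matches the
               main slicer when the monitor strobe uses that setting.
    max_shift: hypothetical bound on N so that the loop always terminates.
    """
    for n in range(1, max_shift + 1):
        hold_ok = passes(st1 + n)    # hold-side edge timing th (more delay)
        setup_ok = passes(st1 - n)   # setup-side edge timing ts (less delay)
        if hold_ok and setup_ok:
            continue                 # both pass: widen the shift and retry
        if not hold_ok and not setup_ok:
            return st1               # both fail together: already centered
        if not hold_ok:
            return st1 - 1           # too close to the hold side: reduce delay
        return st1 + 1               # too close to the setup side: add delay
    return st1


def track(st1, passes, max_shift=16, iterations=32):
    """Repeat the loop, as would happen over successive read cycles."""
    for _ in range(iterations):
        st1 = adjust_once(st1, passes, max_shift)
    return st1


def in_eye(setting):
    """Hypothetical eye covering delay settings 10 to 26."""
    return 10 <= setting <= 26


print(track(12, in_eye))  # starts near the setup side
print(track(24, in_eye))  # starts near the hold side
```

In this simplified model, both starting points converge to the middle setting of the eye. If the eye spans an even number of settings, the model alternates between the two settings adjacent to the center, which is a property of this sketch rather than a statement about the embodiment.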
For example, by using such a flowchart, during the normal operation period (step S20), the edge timing tm of the main data strobe signal DQSin can be fixed at the center of the eye width W of the input data signal DQin. Moreover, because the main delay amount ST1 is optimized during the normal operation period (step S20), the initial value of the main delay amount ST1 does not necessarily have to be determined precisely during the initial sequence period (step S10). As a result, for example, the delay-locked loop (DLL) can be simplified, and the time required to lock the delay-locked loop (DLL), that is, the time required for processing in step S101, can be shortened.
In DDR-SDRAM chips, more specifically, multiple data signals MDQ are assigned to a single data strobe signal MDQS. In this case, the timing adjustment circuit TMCT may, for example, execute processing as shown in
Alternatively, the timing adjustment circuit TMCT may execute processing as shown in
Furthermore, for example, in
Note that the application of the configuration of the read latch circuit RLTa and the buffer control circuit CTRLa shown in
As described above, in the method of the first embodiment, a monitor slicer SLr_M is provided, and using this monitor slicer SLr_M, the phase of the data strobe signal DQSin to the main slicer SLr is adjusted to the optimal value. This allows, typically, the phase relationship between the data signal and the data strobe signal to be correctly maintained. Especially, even when the environment changes, such as temperature and voltage, during the normal operation period, the phase of the data strobe signal can be adjusted in real-time in the background without retraining.
As a result of such a difference in configuration, the read latch circuit RLTb includes a variable delay circuit VDLYs_B, which is an expanded version of the variable delay circuit VDLYs_A shown in
Returning
On the other hand, the search circuit SC, similarly to the case shown in
By using such a configuration, it is possible to further improve the real-time performance when optimizing the main delay amount ST1 according to changes in the environment such as temperature and voltage. That is, in the examples shown in
As described above, by using the method of the second embodiment, it is possible to achieve effects similar to those described in the first embodiment. Furthermore, compared to the method of the first embodiment, although there is a slight increase in circuit area overhead, it is possible to further improve the real-time performance when optimizing the phase of the data strobe signal according to changes in the environment. As a result, a more robust system can be realized against changes in the environment.
Although the invention made by the present inventor has been specifically described based on the embodiment, the present invention is not limited to the embodiment described above, and it is needless to say that various modifications can be made without departing from the gist thereof.
Number | Date | Country | Kind |
---|---|---|---|
2023-188107 | Nov 2023 | JP | national |