BACKGROUND OF THE INVENTION
1. Field of the Invention
The invention generally relates to synchronizer modules and methods, and particularly to synchronizers which may be simulated in an RTL (Register Transfer Level) simulation.
2. Description of the Related Art
Many integrated circuit chips exist which have clock driven digital circuits which form more than one clock domain. In such devices, a first part of the digital circuitry is driven by a first clock, while a second part is driven by a second clock. The second clock may be different from the first clock and may even come from a different source. Examples of devices having multiple clock domains are computer chipsets, USB (Universal Serial Bus) host controllers, and WLAN (Wireless Local Area Network) receiver or transceiver devices. A number of other fields where an integrated circuit chip may have more than one clock domain exist in the state of the art.
In many applications, the digital circuits in the various clock domains are not independent of each other. For example, a circuit in one of the clock domains may receive a signal from a circuit in another clock domain for further processing. That is, such devices require digital signals to cross clock domains. These clock domain crossing signals may be single-bit signals or even multiple-bit bus signals.
Taking for example the arrangement shown in FIG. 1A, a first circuit 100 is shown to receive a first clock signal clk1 while a second digital circuit 110 receives a different clock signal clk2. Thus, the two digital circuits 100, 110 are located in different clock domains. The circuit 100 in the first clock domain receives the single-bit signal bit0, and performs some digital operations on this signal. In the example of FIG. 1A, the digital circuit 100 may be a flip-flop device. The output signal of circuit 100 then crosses the clock domain boundary as signal bit1, which reaches the second digital circuit 110. The circuit 110 generates the single-bit signal bit2 from bit1, driven by clock signal clk2.
As shown in FIG. 1B, the circuit 100 latches the incoming signal bit0 at a positive edge, i.e., when the clock signal clk1 rises from the low to the high level. The digital circuit 110, which may also be a flip-flop device, is driven by the clock signal clk2 which is of higher frequency in the present example. At the positive edge of the second clock cycle, the flip-flop 110 in the second clock domain registers a low signal. With the positive edge of the third clock cycle, a high signal is correctly latched. However, in the time between the positive edges of the second and third clock cycles, the output signal level may be undefined. This is commonly referred to as a metastable state and may be particularly the case when the positive edge of the second clk2 clock cycle happens to occur near the positive edge of the clk1 clock cycle.
FIGS. 2A and 2B show similar examples where multi-bit bus signals are crossing the clock domain. In the example depicted, the bus includes three separate bit lines. As may be seen from FIG. 2B, a problem similar to that discussed above with reference to FIG. 1B may occur. RTL simulation (in contrast to timing simulation at gate level) may often fail to identify incorrect bus synchronization such as that of FIG. 2A since the RTL simulator deals with all of the bus bits in the same manner, i.e. the bits are always “in phase”. This example also shows that a method according to FIG. 2A may not be suited to correctly synchronize a bus into another clock domain: as FIG. 2B shows, there may be value artefacts on bus2, since not all bits experience the same delay and hence bus2 may carry a value that never occurred on bus1.
To solve the clock domain crossing signal problem, some synchronization facility may be added to the circuits. For instance, an additional flip-flop device 310 may be put between both digital circuits 300, 320 but within the second clock domain. This is depicted in FIG. 3A. FIG. 3B then shows that the output signals out no longer have undefined levels.
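By way of illustration only, the arrangement of FIG. 3A may be modelled behaviourally as two registers in series which are both clocked by the second clock. The following Python sketch is not part of the described circuitry: the class name `TwoFlopSynchronizer` and its method are assumptions chosen for this example, and metastability itself is not modelled, only the latency that the extra register introduces.

```python
class TwoFlopSynchronizer:
    """Behavioural model: two registers in series in the receiving domain."""

    def __init__(self, reset_value=0):
        self.stage1 = reset_value  # first register; may go metastable in silicon
        self.stage2 = reset_value  # second register; settled value for the receiver

    def clock(self, d_in):
        """Apply one positive edge of the receiving clock and return the output."""
        # Both registers update simultaneously, like non-blocking assignments.
        self.stage1, self.stage2 = d_in, self.stage1
        return self.stage2


sync = TwoFlopSynchronizer()
outputs = [sync.clock(bit) for bit in [0, 1, 1, 1, 0, 0]]
print(outputs)
```

In this model a level sampled by the first register at one clock edge becomes visible at the output on the following edge, so the output only ever carries values that have had a full clock period to settle.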
FIG. 4A shows an example of correct clock domain crossing bus signals. A multiplexer 410 is put between the first and second digital circuits 400, 420 to either forward the output signals of the first digital circuit 400 to the second digital circuit 420, or feedback the output signal bus2 to the input port of the second circuit 420. The multiplexer 410 is driven by a capture signal. As may be seen from FIG. 4B, the output bus signals bus2 are correctly synchronized.
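The multiplexer arrangement of FIG. 4A may be sketched in the same behavioural style. The model below assumes that the capture signal has already been synchronized into the second clock domain and is asserted only while the source bus is stable; the name `CaptureBusRegister` and the sample values are hypothetical.

```python
class CaptureBusRegister:
    """Receiving register with a feedback multiplexer at its input."""

    def __init__(self, reset_value=0):
        self.bus2 = reset_value

    def clock(self, bus1, capture):
        """One positive edge of the receiving clock."""
        # capture = 1: forward bus1; capture = 0: feed bus2 back to the input.
        self.bus2 = bus1 if capture else self.bus2
        return self.bus2


reg = CaptureBusRegister()
bus1_trace = [0b011, 0b011, 0b100, 0b100, 0b100]
capture_trace = [0, 1, 0, 0, 1]
bus2_trace = [reg.clock(b, c) for b, c in zip(bus1_trace, capture_trace)]
print(bus2_trace)
```

Because all bits of the bus are taken over on the same edge, and only when capture is asserted, bus2 changes directly from one valid value to the next.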
Thus, while clock domain crossing signal synchronization is already possible in the prior art, there are a number of structural and functional issues that may be sources of potential errors. For instance, synchronization problems may still occur if the overall circuitry design includes errors or design flaws which are difficult to observe in advance. For instance, if a signal is taken from a specific source and is independently fed through two different paths which at the end are re-convergent, proper synchronization may depend on the delay behaviour of both paths. Another common design flaw is to use the correct bus synchronizer structure according to FIG. 4A, but to use an unsynchronized signal for capture2.
As digital circuitry usually becomes quite complex, it is often not possible to detect such design errors in advance. This may then lead to functional errors which are only detected late in the design cycle, or even worse, during post-silicon verification. Due to the generally unreliable nature of such errors, it is then even more difficult to find the source of the error, thus leading to increased circuit development costs.
SUMMARY OF THE INVENTION
An improved synchronization technique is provided that may allow real-silicon behaviour to be modelled more closely in simulation, so that signal synchronization problems may be detected earlier in the design flow.
In one embodiment, an RTL simulation apparatus is provided which is adapted to simulate bus synchronization across a clock domain boundary. The apparatus comprises a first RTL design element configured to simulate circuitry in a first clock domain, and a second RTL design element configured to simulate circuitry in a second clock domain. The apparatus further comprises a third RTL design element which is configured to simulate functionality of a multiple-stage synchronizer having multiple synchronizer stages, which are each capable of generating a synchronizer signal which is different from the synchronizer signals generated by other synchronizer stages of the multiple-stage synchronizer. The third RTL design element is coupled to the first and second RTL design elements. The RTL simulation apparatus is adapted to dynamically enable and disable at least one of the multiple synchronizer stages.
In another embodiment, a synchronizer module is provided which is arranged to be connected to a first latching register driven by a first clock, and a second latching register driven by a second clock. The first latching register outputs a first digital signal while the second latching register receives a second digital signal. The synchronizer module comprises a delay unit which is adapted to selectively delay the first digital signal by a variable delay to provide the second digital signal.
In another embodiment, there may be provided an HDL (Hardware Description Language) library comprising at least one synchronizer module as specified above.
Still a further embodiment relates to a computer readable storage medium which stores computer readable instructions that when executed by a processor, cause the processor to perform RTL simulation to simulate bus synchronization across a clock domain boundary. The computer readable storage medium comprises a first RTL design element which is configured to simulate circuitry in a first clock domain, and a second RTL design element which is configured to simulate circuitry in a second clock domain. The computer readable storage medium further comprises a third RTL design element which is configured to simulate functionality of a multiple-stage synchronizer having multiple synchronizer stages, which are each capable of generating a synchronizer signal different from the synchronizer signals generated by other synchronizer stages of the multiple-stage synchronizer. The third RTL design element is coupled to the first and second RTL design elements. The computer readable storage medium further comprises computer readable instructions to dynamically enable and disable at least one of the multiple synchronizer stages.
According to yet another embodiment, there is provided a synchronizer simulation method for simulating a digital electronic circuit forming a synchronizer module that can be connected to a first register driven by a first clock, and a second register driven by a second clock, wherein the first register outputs a first digital signal, and the second register receives a second digital signal. The method comprises selectively delaying the first digital signal by a variable delay, and providing the delayed signal as the second digital signal.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings are incorporated into and form a part of the specification for the purpose of explaining the principles of the invention. The drawings are not to be construed as limiting the invention to only the illustrated and described examples of how the invention can be made and used. Further features and advantages will become apparent from the following and more particular description of the invention, as illustrated in the accompanying drawings, wherein:
FIG. 1A is a block diagram illustrating digital circuitry of two different clock domains having a single-bit clock domain crossing signal;
FIG. 1B is a timing chart illustrating signal levels of the circuitry shown in FIG. 1A;
FIG. 2A is a block diagram illustrating digital circuitry of two different clock domains with multiple-bit clock domain crossing bus signals;
FIG. 2B is a timing chart corresponding to the circuitry of FIG. 2A;
FIG. 3A is a block diagram illustrating single-bit cross domain clock signal synchronization;
FIG. 3B is a timing chart illustrating operation of the circuitry of FIG. 3A;
FIG. 4A illustrates clock domain crossing synchronization for multiple-bit bus signals;
FIG. 4B is a timing chart relating to the arrangement of FIG. 4A;
FIG. 5 is a graph illustrating a technique for static verification of bus synchronization;
FIGS. 6A to 6C illustrate embodiments of multiple-stage synchronizers that may be used for dynamic verification of single-bit or bus synchronization;
FIG. 7 illustrates another embodiment of a multiple-stage synchronizer having two multiplexers;
FIG. 8 illustrates a multiple-stage synchronizer according to a further embodiment having a reduced number of elements;
FIG. 9A is a block diagram used to illustrate the real-time requirements of single-bit synchronization using a two-stage synchronizer;
FIG. 9B is a timing chart illustrating the operation of the arrangement shown in FIG. 9A;
FIG. 10A is a block diagram used to illustrate the real-time requirements of single-bit synchronization using a three-stage synchronizer;
FIG. 10B is a timing chart illustrating the operation of the arrangement shown in FIG. 10A;
FIG. 11A is a block diagram used to illustrate the real-time requirements of bus signal synchronization using two-stage synchronizers;
FIG. 11B is a timing chart illustrating the operation of the arrangement shown in FIG. 11A;
FIG. 12A is a block diagram used to illustrate the real-time requirements of bus signal synchronization using two-stage and three-stage synchronizers;
FIG. 12B is a timing chart illustrating the operation of the arrangement shown in FIG. 12A;
FIG. 13A illustrates an embodiment of a five-stage synchronizer;
FIG. 13B is a timing chart illustrating the switching of the synchronization signal from a lower to a higher rank;
FIG. 13C is a timing chart illustrating the switching of the synchronization signal from a higher to a lower rank;
FIG. 14 is a block diagram illustrating an implementation example of a multiple-stage synchronizer according to an embodiment;
FIG. 15 is a state diagram illustrating the states which the arrangement of FIG. 14 can have; and
FIG. 16 is a block diagram illustrating an interrupt generator as another implementation example.
DETAILED DESCRIPTION OF THE INVENTION
The illustrative embodiments of the present invention will be described with reference to the figure drawings wherein like elements and structures are indicated by like reference numbers.
Before discussing in more detail the synchronization modules of the embodiments which provide a dynamic verification of single-bit and bus synchronization, reference is made to FIG. 5, which illustrates a static verification technique that may be used in connection with the embodiments. In this approach, the whole design structure is mapped onto a graph model which is built from vertex and edge elements. Vertices may be flops (denoted as “f” in FIG. 5) and combinational elements (denoted as “c”). Edges are depicted as wires in the graph model.
In the static verification approach, the model is partitioned into clock domains and stage levels. As may be seen from FIG. 5, the present example illustrates four different levels. Taking the example of bus synchronization, buses may be identified based on the finding that combinational logic inputs have bus requirements. In the graph of FIG. 5, different bus requirements are illustrated by means of bold vertical lines labelled 1), 2) or 3). It is noted that this verification approach allows for detecting the number of buses needed to achieve proper bus synchronization.
As will be described in more detail below, the embodiments may make use of multiple-stage synchronizers in a manner that allows for dynamic verification of single-bit or bus synchronization. Examples of multiple-stage synchronizers that may be used in the embodiments are shown in FIGS. 6A to 6C. In FIG. 6A, four register sequences arranged in parallel are provided which all receive the same input signal. The register sequences used in FIG. 6A have two to five register elements which may be latches, flip-flop devices or other two-state systems capable of performing storage operations on clock edges. As may be seen from FIG. 6A, the output ports of each of the sequences are provided to a multiplexer which may be any selection element for selecting and outputting one of the signals.
The multiple-stage synchronizer shown in FIG. 6B differs from the one shown in FIG. 6A in that each of the register sequences is reduced in length by one element, and a single register element is added to the end of the synchronizer. It is noted that the mentioned single register element may likewise be added to the left side of the multi-stage synchronizer. The synchronizer of the embodiment of FIG. 6B may achieve the same functionality as that of FIG. 6A but has the total number of register elements reduced by three.
Referring to FIG. 6C, this concept is again applied by removing an additional one of the register elements in each of the sequences and adding a second register element at the end. Again, one or two of the extra register elements shown to be connected to the output port of the multiplexer of FIG. 6C could also be provided before the register sequences are branched out.
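The register savings described for FIGS. 6B and 6C can be checked with a few lines of arithmetic. The sketch below assumes, as stated above for FIG. 6A, parallel sequences of two to five registers, and assumes that in FIG. 6C the shortest branch degenerates to a plain wire; the function name `register_count` is hypothetical.

```python
def register_count(lengths, shared):
    """Registers needed when `shared` stages are factored out of every sequence.

    lengths: number of registers in each parallel sequence before factoring.
    shared:  number of registers moved to a common position after (or before)
             the multiplexer.
    """
    return sum(n - shared for n in lengths) + shared


lengths = [2, 3, 4, 5]                 # the four sequences of FIG. 6A
fig_6a = register_count(lengths, 0)    # nothing shared
fig_6b = register_count(lengths, 1)    # one common register at the end
fig_6c = register_count(lengths, 2)    # two common registers at the end
print(fig_6a, fig_6b, fig_6c)
```

Under these assumptions the count drops from 14 to 11 registers for FIG. 6B, which agrees with the reduction by three noted above, and to 8 registers for FIG. 6C.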
FIG. 7 illustrates a further embodiment that somewhat resembles the synchronizer of FIG. 6A but which differs in that there is provided a second multiplexer. That is, while the register sequences in FIG. 6A each receive the same input signal, the arrangement of FIG. 7 provides an extra multiplexer. This allows for better decoupling the register sequences to increase reliability and may be particularly useful for bus synchronization.
While not shown in FIG. 7, it is noted that embodiments similar to those of FIGS. 6B and 6C would also be possible as modifications from the arrangement of FIG. 7.
Turning now to FIG. 8, a multi-stage synchronizer according to an embodiment is depicted where the number of register elements is reduced as much as possible. In the embodiment of FIG. 8, only one physical register sequence remains. To achieve multiple logical register sequences of different lengths, signals are branched off and are separately provided to the multiplexer. That is, the uppermost signal provided to the multiplexer is the output signal of the entire register sequence. The second signal provided to the multiplexer is taken from the output port of the fourth register element so that it corresponds to the output signal of a register sequence having four elements. Similarly, the third and fourth signals provided to the multiplexer correspond to sequences of three and two register elements, respectively.
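The single physical register sequence of FIG. 8 may be modelled as a shift register with taps. The following Python sketch is illustrative only; the class `TappedSynchronizer` and its method names are assumptions, and a depth of five registers is taken from the description above.

```python
class TappedSynchronizer:
    """One physical register chain; each tap acts as a shorter logical chain."""

    def __init__(self, depth=5, reset_value=0):
        self.regs = [reset_value] * depth  # regs[0] is nearest the input

    def clock(self, d_in):
        """One positive clock edge: shift every register by one position."""
        self.regs = [d_in] + self.regs[:-1]

    def tap(self, rank):
        """Output of the logical sequence having `rank` registers."""
        if not 2 <= rank <= len(self.regs):
            raise ValueError("rank must lie between 2 and the chain depth")
        return self.regs[rank - 1]


# Send a single pulse through the chain and note when each tap sees it.
sync = TappedSynchronizer()
arrival = {}
for cycle, bit in enumerate([1, 0, 0, 0, 0, 0], start=1):
    sync.clock(bit)
    for rank in range(2, 6):
        if sync.tap(rank) == 1:
            arrival[rank] = cycle
print(arrival)
```

Each tap reproduces the input delayed by as many clock edges as its rank, which is the behaviour of the separate register sequences of FIG. 6A obtained with only five registers.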
It is noted that in a single-bit synchronization embodiment, each of the individual register elements shown in FIGS. 6A to 6C, 7 and 8 may be single-bit registers such as flip flops. In bus synchronization embodiments, each of the register elements may be configured to temporarily store multiple bits at a time, one for each line in the bus.
In an embodiment, the selection element, such as a multiplexer, is driven in a dynamic manner to change the register sequence used. The change may be done regularly or irregularly, in a reproducible manner or not. In an embodiment, the selection device may be driven by a random or pseudo-random control signal. Using a reproducible signal such as a pseudo-random control signal makes simulation runs repeatable, so that a design correction may be assessed by comparing the simulation results before and after the correction.
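The benefit of a reproducible control signal can be illustrated with a seeded pseudo-random generator. The sketch below uses Python's standard `random.Random` class purely as a stand-in for whatever generator an embodiment would use; the function name `rank_schedule`, the seed value and the set of ranks are assumptions, and any restriction on when a switch is permitted is ignored here.

```python
import random


def rank_schedule(seed, cycles, ranks=(2, 3, 4, 5)):
    """Return the rank selected by the multiplexer control at each cycle."""
    rng = random.Random(seed)  # private generator, so the seed fully determines it
    return [rng.choice(ranks) for _ in range(cycles)]


# Two simulation runs with the same seed see the same selection sequence.
first_run = rank_schedule(seed=7, cycles=16)
second_run = rank_schedule(seed=7, cycles=16)
print(first_run == second_run)
```

Since both runs select identical ranks, any difference between the simulation results before and after a design change can be attributed to the change itself rather than to the control signal.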
Before discussing this in more detail, the real-time requirements for single-bit signals and bus signals are discussed first.
FIGS. 9A and 10A show examples using two-stage and three-stage synchronizers for single-bit synchronization. As may be seen from FIGS. 9B and 10B, the synchronization is carried out properly in both cases, so that it may be concluded that single-bit signals usually do not have strict real-time requirements.
FIGS. 11A and 12A show similar arrangements for bus synchronization. The stage module 1100 of FIG. 11A applies two stages to each bit, leading to the timing chart of FIG. 11B. FIG. 12A shows an arrangement where one bit is processed by a two-stage synchronizer element while three stages are used for the second bit. It can be seen from FIG. 12B that the output signal may have undefined values leading to potential functional errors. From this it may be concluded that bus synchronization has increased real-time requirements compared with single-bit synchronization.
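The effect described for FIG. 12A may be reproduced with a small behavioural model in which every bit of the bus is delayed by its own number of clock cycles. This is a sketch under simplifying assumptions: the function `sample_bus`, the two-bit bus and the sample values are hypothetical, and each source value is taken to be stable for the cycles listed.

```python
def sample_bus(values, bit_delays):
    """Return the bus value seen in the receiving domain at each cycle.

    values:     bus1 value at each receiving-clock cycle.
    bit_delays: synchronizer depth per bit, index 0 = least significant bit.
    """
    out = []
    for t in range(len(values)):
        word = 0
        for bit, delay in enumerate(bit_delays):
            src = values[max(t - delay, 0)]  # clamp at the start of the trace
            word |= ((src >> bit) & 1) << bit
        out.append(word)
    return out


bus1 = [0b01, 0b01, 0b10, 0b10, 0b10, 0b10]
equal_stages = sample_bus(bus1, [2, 2])  # two stages on each bit, as in FIG. 11A
mixed_stages = sample_bus(bus1, [2, 3])  # two and three stages, as in FIG. 12A
print(equal_stages, mixed_stages)
```

With equal depths the receiving side sees only the values 1 and 2 that were actually sent. With mixed depths it additionally sees the value 0 for one cycle, a value that never appeared on the source bus.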
Turning to FIG. 13A, a multi-stage synchronizer is shown like that of FIG. 8. This synchronizer is now used to discuss the switching from a lower to a higher rank and vice versa, assuming that switching is possible at each positive edge clock event.
FIG. 13B shows an example where synchronization is switched from sync2 to sync5. It may be seen that the output signal properly reflects all of the incoming data although the data was crossing a clock domain.
The following is a brief summary of the timing model for switching from a lower to a higher rank, assuming that tswitch occurs at a positive edge of the clk signal, and Tclk=1/fclk.
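Switching sync2 → sync3 (hold sync2(tswitch) for 1 clk cycle before switching):

\[
\mathrm{out}(t) =
\begin{cases}
\mathrm{sync2}(t), & t \in (-\infty,\ t_{switch}] \\
\mathrm{sync2}(t_{switch}), & t \in (t_{switch},\ t_{switch} + T_{clk}] \\
\mathrm{sync3}(t), & t \in (t_{switch} + T_{clk},\ +\infty)
\end{cases}
\]

Switching sync2 → sync4 (hold sync2(tswitch) for 2 clk cycles before switching):

\[
\mathrm{out}(t) =
\begin{cases}
\mathrm{sync2}(t), & t \in (-\infty,\ t_{switch}] \\
\mathrm{sync2}(t_{switch}), & t \in (t_{switch},\ t_{switch} + 2T_{clk}] \\
\mathrm{sync4}(t), & t \in (t_{switch} + 2T_{clk},\ +\infty)
\end{cases}
\]

Switching sync2 → sync5 (hold sync2(tswitch) for 3 clk cycles before switching):

\[
\mathrm{out}(t) =
\begin{cases}
\mathrm{sync2}(t), & t \in (-\infty,\ t_{switch}] \\
\mathrm{sync2}(t_{switch}), & t \in (t_{switch},\ t_{switch} + 3T_{clk}] \\
\mathrm{sync5}(t), & t \in (t_{switch} + 3T_{clk},\ +\infty)
\end{cases}
\]

Switching sync3 → sync4 (hold sync3(tswitch) for 1 clk cycle before switching):

\[
\mathrm{out}(t) =
\begin{cases}
\mathrm{sync3}(t), & t \in (-\infty,\ t_{switch}] \\
\mathrm{sync3}(t_{switch}), & t \in (t_{switch},\ t_{switch} + T_{clk}] \\
\mathrm{sync4}(t), & t \in (t_{switch} + T_{clk},\ +\infty)
\end{cases}
\]

Switching sync3 → sync5 (hold sync3(tswitch) for 2 clk cycles before switching):

\[
\mathrm{out}(t) =
\begin{cases}
\mathrm{sync3}(t), & t \in (-\infty,\ t_{switch}] \\
\mathrm{sync3}(t_{switch}), & t \in (t_{switch},\ t_{switch} + 2T_{clk}] \\
\mathrm{sync5}(t), & t \in (t_{switch} + 2T_{clk},\ +\infty)
\end{cases}
\]

Switching sync4 → sync5 (hold sync4(tswitch) for 1 clk cycle before switching):

\[
\mathrm{out}(t) =
\begin{cases}
\mathrm{sync4}(t), & t \in (-\infty,\ t_{switch}] \\
\mathrm{sync4}(t_{switch}), & t \in (t_{switch},\ t_{switch} + T_{clk}] \\
\mathrm{sync5}(t), & t \in (t_{switch} + T_{clk},\ +\infty)
\end{cases}
\]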
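The up-switching timing model above may be checked with a discrete-time sketch in which time is counted in clock edges and the output of rank k at edge n is the input of k edges earlier. The function `switch_up`, its argument names and the sample data are assumptions made for this illustration; `None` marks edges at which the registers have not yet been filled.

```python
def switch_up(data, low, high, t_switch):
    """Output trace when the selected rank rises from `low` to `high`.

    data[n] is the input at clock edge n; rank k at edge n yields data[n - k].
    """
    def sync(k, n):
        return data[n - k] if n - k >= 0 else None

    hold = high - low  # number of clock cycles the old value is held
    out = []
    for n in range(len(data)):
        if n <= t_switch:
            out.append(sync(low, n))         # still following the lower rank
        elif n <= t_switch + hold:
            out.append(sync(low, t_switch))  # hold the last lower-rank value
        else:
            out.append(sync(high, n))        # now following the higher rank
    return out


data = list(range(10, 22))  # distinct values make gaps or repeats visible
out = switch_up(data, low=2, high=5, t_switch=5)
print(out)
```

For a switch from rank 2 to rank 5, the value 13 is held for three additional cycles and the output then continues with 14, so no input value is skipped or reordered.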
Turning now to FIG. 13C, an example is shown for switching the synchronizer from a higher to a lower rank. In the example shown in FIG. 13C, sync5 is switched to sync4. In this and other embodiments, switching may be possible at positive clock edges only, and when no data will be lost. The condition to switch one ranking level to the next lower one may be that both ranking levels must have the same value.
In the following, the timing model for down-switching the register sequences is briefly summarized, assuming tswitch to occur at positive clock edges.
Switching sync3 → sync2 (switch @(posedge clk) and (sync3 == sync2)):

\[
\mathrm{out}(t) =
\begin{cases}
\mathrm{sync3}(t), & t \in (-\infty,\ t_{switch}] \\
\mathrm{sync2}(t), & t \in (t_{switch},\ +\infty)
\end{cases}
\]

Switching sync4 → sync3 (switch @(posedge clk) and (sync4 == sync3)):

\[
\mathrm{out}(t) =
\begin{cases}
\mathrm{sync4}(t), & t \in (-\infty,\ t_{switch}] \\
\mathrm{sync3}(t), & t \in (t_{switch},\ +\infty)
\end{cases}
\]

Switching sync5 → sync4 (switch @(posedge clk) and (sync5 == sync4)):

\[
\mathrm{out}(t) =
\begin{cases}
\mathrm{sync5}(t), & t \in (-\infty,\ t_{switch}] \\
\mathrm{sync4}(t), & t \in (t_{switch},\ +\infty)
\end{cases}
\]
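The down-switching condition may likewise be sketched in discrete time. In the model below a switch to the next lower rank is requested at some clock edge but is carried out only at the first edge at which both ranks carry the same value. The function `switch_down`, the request mechanism and the sample data are hypothetical, and the registers are assumed to reset to 0.

```python
def switch_down(data, high, low, t_request):
    """Output trace and switching edge for a switch from `high` to `low`.

    data[n] is the input at clock edge n; rank k at edge n yields data[n - k].
    """
    def sync(k, n):
        return data[n - k] if n - k >= 0 else 0  # registers reset to 0

    out, rank, t_switch = [], high, None
    for n in range(len(data)):
        out.append(sync(rank, n))
        # Switch after this edge only if requested and both ranks agree.
        if rank == high and n >= t_request and sync(high, n) == sync(low, n):
            rank, t_switch = low, n
    return out, t_switch


data = [0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
out, t_switch = switch_down(data, high=5, low=4, t_request=6)
print(out, t_switch)
```

At edge 6 the two ranks differ, so the request is deferred. At edge 7 they agree and the switch takes place. The one input sample that is skipped equals the sample just output, so no information is lost.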
Turning now to FIG. 14, a multiple-stage synchronizer is shown as an implementation example according to an embodiment. This example switches between ranks two and three at a switching rate which may be varied by extending the bit width of the random variable sel.
FIG. 15 illustrates a state diagram that may be used with the embodiment of FIG. 14. As can be seen, the synchronizer cycles through two different synchronization stages in a manner driven by the random value sel. A CRC (Cyclic Redundancy Check) generator may be used to produce reproducible pseudo-random delays of two or three clock cycles for this purpose. In another exemplary embodiment, a linear feedback shift register may be used with the polynomial 1 + x^3 + x^10. In yet another embodiment, the CRC generator may use a linear feedback shift register.
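A linear feedback shift register for the polynomial mentioned above may be sketched as follows. The tap placement (stages 10 and 3 of a Fibonacci-style register), the seed and the mapping of the output bit to a delay of two or three cycles are assumptions for illustration; the names `lfsr_step` and `delay_sequence` are hypothetical.

```python
def lfsr_step(state):
    """Advance a 10-bit Fibonacci LFSR by one clock; return the new state."""
    bit = ((state >> 9) ^ (state >> 2)) & 1  # XOR of stages 10 and 3
    return ((state << 1) | bit) & 0x3FF      # shift left, keep 10 bits


def delay_sequence(seed, count):
    """Return `count` pseudo-random delays, each two or three clock cycles."""
    state = seed & 0x3FF
    if state == 0:
        raise ValueError("the all-zero state never leaves zero")
    delays = []
    for _ in range(count):
        state = lfsr_step(state)
        delays.append(2 + (state & 1))       # newest bit plays the role of sel
    return delays


delays = delay_sequence(seed=0x001, count=20)
print(delays)
```

The same seed always yields the same delay sequence, which is what makes the simulated delays reproducible. With these taps the register is expected to pass through all 1023 non-zero states before repeating.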
Referring back to FIG. 14, the following is exemplary Verilog code that may be used in an embodiment during RTL design:
    module generic_sync (
        dest_clk,   // destination clock
        reset,      // async reset
        d_i,        // asynchronous data input
        d_o         // synchronous data output
    );
        // synopsys template
        parameter CLKPOL = 1'b1;   // clock polarity: 1: posedge, 0: negedge
        parameter RSTPOL = 1'b1;   // reset polarity, 0: low active, 1: high active
        parameter RSTVAL = 1'b0;   // reset value
        parameter HASRST = 1'b1;   // has reset

        input  dest_clk;   // destination clock
        input  reset;      // async reset
        input  d_i;        // asynchronous data input
        output d_o;        // synchronous data output

        reg [1:0] sync_1;
        reg [1:0] sync_2;  // the actual flops

        wire int_clk   = dest_clk ^~ CLKPOL;  // internal clock, according to CLKPOL
        wire int_reset = reset ^~ RSTPOL;     // internal reset, according to RSTPOL

        // two-stage chain with asynchronous reset
        always @ (posedge int_clk or posedge int_reset)
            if (int_reset)
                begin
                    sync_1[0] <= RSTVAL;
                    sync_1[1] <= RSTVAL;
                end
            else
                begin
                    sync_1[0] <= d_i;
                    sync_1[1] <= sync_1[0];
                end

        // two-stage chain without reset
        always @ (posedge int_clk)
            begin
                sync_2[0] <= d_i;
                sync_2[1] <= sync_2[0];
            end

        assign d_o = HASRST ? sync_1[1] : sync_2[1];
    endmodule // generic_sync
By applying variable delays in the manner described above, the embodiments allow for modelling real-silicon behaviour for simulation purposes to detect signal synchronization problems very early in the design flow, for instance during RTL design. Generally, the embodiments may make use of synchronizer modules defined using any HDL (Hardware Description Language) syntax and semantics. The synchronizer modules may be separately defined, or provided as part of a library.
A simulation example of an interrupt generator is shown in FIG. 16. This circuitry may exhibit an incorrect bus synchronization which the technique of the embodiments can reveal as early as the RTL design phase. In FIG. 16, the synchronizer module is provided as block 1620. An impulse generator 1600 provides a signal to a 4-bit counter 1610 to enable the counter to count upwards. Further, an interrupt generation unit gen_int 1630 receives the output of the synchronizer 1620 to generate and output an interrupt. The impulse generator 1600 together with the 4-bit counter 1610 on the one side, and the synchronizer 1620 and the interrupt generation unit 1630 on the other side, form different clock domains. It is noted that separate reset synchronizers 1640 and 1650 may be provided for the clock domains.
As described above, a simulation technique is provided to simulate a flip-flop synchronizer having, e.g., two stages. In simulation (but not later in the implementation on silicon), switching logic switches between, e.g., two-cycle and three-cycle delays. This simulates the real silicon circuit behaviour, where signal delays may sometimes vary for many reasons. With respect to bus synchronization, embodiments may bring individual bus bits “out of phase” (in contrast to conventional RTL simulators which deal with all of the bus bits in the same manner) so that the designer may notice an incorrect RTL description early in the flow.
While the invention has been described with respect to the physical embodiments constructed in accordance therewith, it will be apparent to those skilled in the art that various modifications, variations and improvements of the present invention may be made in the light of the above teachings and within the purview of the appended claims without departing from the spirit and intended scope of the invention. In addition, those areas in which it is believed that those of ordinary skill in the art are familiar, have not been described herein in order to not unnecessarily obscure the invention described herein. Accordingly, it is to be understood that the invention is not to be limited by the specific illustrative embodiments, but only by the scope of the appended claims.