This disclosure relates generally to high-resolution timing measurement for integrated circuits.
As the miniaturization of integrated circuits progresses, measuring timing parameters becomes increasingly challenging. Two main challenges are measuring the width of a very small pulse with high accuracy in a general digital circuit and achieving a high resolution, e.g. around 1 ps.
The small pulse width measurement is an important step for many applications, for example: 1) exact timing characterization of a silicon standard cell library, 2) measuring the critical path delay on chip in silicon, 3) measuring the actual hold time on chip in silicon, 4) measuring rise and fall slew rates on chip in silicon, and 5) SRAM access time detection.
However, highly accurate, high-resolution cell delay measurement is very difficult to achieve, because cell timing values shrink as technology scales and because automatic tester equipment (ATE) has limitations such as coarse resolution. Conventional methods suffer from very low resolution and have difficulty obtaining on-chip digital data, capturing large volumes of data in a short time, and measuring rise and fall slew rates using normal ATE. They also require long delay chains or an averaging mechanism to overcome the coarse resolution.
Accordingly, new methods for high-resolution timing measurement with better accuracy are desired.
For a more complete understanding of the present disclosure, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
The making and using of the presently preferred embodiments are discussed in detail below. It should be appreciated, however, that the present disclosure provides many applicable inventive concepts that can be embodied in a wide variety of specific contexts. The specific embodiments discussed are merely illustrative of specific ways to make and use the disclosure, and do not limit the scope of invention.
An integrated circuit to achieve high-resolution timing measurement is provided. Throughout the various views and illustrative embodiments of the present disclosure, like reference numbers are used to designate like elements.
The delay pulse generator 104 can be any generic block that generates a very small pulse to be measured. For example, the pulse can be generated by two parallel paths, one of which contains an array of inverters in series while the other is connected directly to the input of an AND gate, as shown in
The tunable oscillator 106 is a digitally tunable oscillator whose frequency can be tuned directly from a digital control to match the frequency of the original oscillator 114. It can include coarse tuning as well as fine tuning functionality to provide good digital control for the tuning as shown in
The oscillator tuner 108 is a state machine that can automatically tune the tunable oscillator 106 toward the frequency of the original oscillator 114 until the highest possible resolution is achieved for a given oscillator. An automatic frequency-tuning algorithm can be used, and the state machine design can incorporate the functionality for automatic tuning to the highest resolution based on the input reference frequency. An exemplary algorithm for the state machine is described in
The counter 110 also functions as a statistical computational block. It is a generic block that has the functionality of basic counters along with additional functionality. Computational capabilities can be added to this counter block, such as 1) statistical computation by measuring the pulse width a given number of times, e.g. 100 times, 2) ignoring initial unstable pulses and waiting for stable pulses, 3) converting parallel counter data into serial output, and 4) using parallel counters to count not only the high pulse time but also the low pulse time.
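The statistical capabilities listed above can be illustrated with a small software model. This is only a sketch under stated assumptions: the function names `summarize` and `to_serial`, the number of skipped samples, and the sample counts are hypothetical and do not come from the disclosure, which describes a hardware block.

```python
from statistics import mean

def summarize(counts, skip=0):
    """Drop the first `skip` (unstable) counts and summarize the rest."""
    stable = counts[skip:]
    return {
        "n": len(stable),
        "mean": mean(stable),
        "min": min(stable),
        "max": max(stable),
    }

def to_serial(value, width):
    """Convert a parallel counter value into a list of bits, MSB first."""
    return [(value >> i) & 1 for i in reversed(range(width))]

# Hypothetical counter readings: the first two are taken before the
# pulses have become stable and are therefore ignored.
counts = [12, 31, 50, 51, 49, 50]
summary = summarize(counts, skip=2)
serial_bits = to_serial(50, 8)
```

In this model the averaged count, rather than any single reading, would be shifted out for further processing.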
The reset counter block 112 counts how many times the reset signal has been applied to the counter (and statistical computational block) 110 for resetting the sampling DFF 118 or the counter 110. Based on this value, certain computational decisions can be made in the oscillator tuner 108 and the counter 110. The DFF 118 can detect the phase of differential clocks by sampling one clock at the frequency of another clock.
There are two functional stages: an oscillator tuning stage and a measurement stage. At the oscillator tuning stage, when the system is initially turned on after the reset, the path I0 through the MUX 120 is turned on, which feeds the output of the tunable oscillator 106 to the DFF 118, which is clocked at the frequency of the reference original oscillator 114. Using the counter (and statistical computational block) 110 and the state machine of the oscillator tuner 108, the frequency of the tunable oscillator 106 is tuned to the highest possible resolution for the system (i.e. the minimal possible difference between the two clock frequencies). The design can also optionally choose a required resolution threshold in case it is not possible to achieve a stable frequency by tuning.
At the measurement stage, after the counter becomes stable (i.e. the sample counts are stable for the frequency difference of the reference oscillator 114 and the tunable oscillator 106), the oscillator tuner 108 turns on the delay pulse path through I1 of the MUX 120. The pulse width of the delay pulse is measured as a digital counter value using the equivalent time sampling (ETS) method based on differential clocks. The digital counter value can be shifted out from the counter (and statistical computational block) 110 for further processing.
The circuit 102 uses ETS to achieve high-resolution measurement using a lower-frequency clock. The pulse timing measurement requires a periodically repeating pulse whose width is to be measured. Two differential clocks with a small period difference are used to achieve ETS as described below.
When Clock 1 and Clock 2 align perfectly, Clock 1, which is the clock input to the DFF 118, is able to sample the DELAY_0 pulse, which is the input to the DFF 118. Hence QO, the output of the DFF 118, will be high as shown. QO enables the counter 110 to start counting on Clock 1.
Because Clock 2 differs very slightly from Clock 1 in period (or frequency), it shifts gradually in small steps Δt with each clock cycle. This small time shift Δt is determined by their frequency difference, which is made as small as possible using the oscillator tuner block 108. This shift is indicated by dotted vertical lines in
The resolution of the timing measurement is determined by the difference between the two clock periods, as defined by the following equation:
$$\Delta t = \text{period}_2 - \text{period}_1 = \frac{1}{f_2} - \frac{1}{f_1}$$
where period_2 and period_1 are the time periods of the two clocks, and f2 and f1 are the frequencies of the two clocks, measured directly or applied from a known source.
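As a worked example of the resolution relation, the sketch below computes Δt from two clock frequencies and converts a counter value into a pulse width. The frequencies (1 GHz and 1.001 GHz), the count of 50, and the function name `resolution_s` are illustrative assumptions, not values from the disclosure. The magnitude of the period difference is used so that the order of the two clocks does not matter.

```python
from fractions import Fraction

def resolution_s(f1_hz, f2_hz):
    """Magnitude of the period difference of two clocks, in seconds."""
    return abs(Fraction(1, f2_hz) - Fraction(1, f1_hz))

# Hypothetical clock frequencies, chosen only for illustration.
F1_HZ = 1_000_000_000        # 1.000 GHz
F2_HZ = 1_001_000_000        # 1.001 GHz

dt = resolution_s(F1_HZ, F2_HZ)          # just under 1 ps
count = 50                               # hypothetical counter value
width_ps = float(count * dt * 10**12)    # measured pulse width in ps

print(f"resolution = {float(dt) * 1e12:.3f} ps, width = {width_ps:.2f} ps")
```

Exact fractions are used here to avoid rounding when subtracting two nearly equal periods.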
As a result, a very small pulse DELAY_0 that was very difficult to measure has been translated into a long pulse QO using the differential clock ETS method. The counter 110 not only counts while QO is high, but can also have a parallel counter that counts until the next time QO goes high. This also makes it possible to calculate the time for which the DELAY_0 pulse is low.
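The stretching of a short delay pulse into a long QO pulse can be sketched with a simple behavioral simulation. This is a software model under stated assumptions, not the circuit itself: the periods, the 50 ps pulse width, and the function names are hypothetical, times are kept in integer femtoseconds, and the delay pulse is assumed to repeat at the Clock 2 period and to start aligned with Clock 1.

```python
def ets_sample(t1_fs, t2_fs, width_fs, n_samples):
    """QO at each Clock 1 edge: 1 if the edge lands inside the delay pulse."""
    return [1 if (k * t1_fs) % t2_fs < width_fs else 0
            for k in range(n_samples)]

def run_lengths(q):
    """Length of the first high run and of the low run that follows it."""
    high = 0
    while high < len(q) and q[high] == 1:
        high += 1
    low = 0
    while high + low < len(q) and q[high + low] == 0:
        low += 1
    return high, low

T1_FS = 1_000_000            # Clock 1 period: 1 ns (hypothetical)
DT_FS = 1_000                # period difference: 1 ps (hypothetical)
T2_FS = T1_FS - DT_FS        # Clock 2 period
WIDTH_FS = 50_000            # true pulse width: 50 ps (hypothetical)

q = ets_sample(T1_FS, T2_FS, WIDTH_FS, 1200)
high, low = run_lengths(q)
measured_high_fs = high * DT_FS   # estimate of the pulse high time
measured_low_fs = low * DT_FS     # estimate of the pulse low time
```

In this model each Clock 1 edge lands one Δt later within the pulse period, so the high count multiplied by Δt recovers the pulse width, and the parallel low count multiplied by Δt recovers the time for which the pulse is low.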
A single stage of the coarse tune bit module 502 has its input connected to a 2-to-1 MUX 508 both through two inverters 506 and directly. As a result, when the path through the inverters 506 is selected by the MUX selection bit, the delay of the stage increases, which lowers the frequency of the ring oscillator 106. When the direct path is selected by the MUX 508 selection bit, the delay through the path decreases, which increases the frequency of the ring oscillator 106. So when multiple such stages, e.g. 16 stages in
The 16 stages and the two inverters per stage used in this implementation are not fixed numbers. The number of inverters and the inverter size (drive strength) chosen for each design application depend on the step size required, and the number of stages depends on the frequency range required for the ring oscillator 106.
A single stage of the fine tune bit module 504 consists of two inverters 510 and 512 connected in parallel, with one arm connected through a CMOS pass transistor gate 514. When the pass gate 514 is turned on by setting the control pin to 1, both inverters are in parallel in the path, which effectively increases the drive strength of that single stage. Therefore, the delay for that stage is reduced, resulting in an increased frequency of the ring oscillator 106. Similarly, when the pass gate 514 is turned off by setting the control pin to 0, only the single inverter 512 is in the path, reducing the effective drive strength along the path. Therefore, the delay is increased for that stage, which decreases the frequency. In this way, the frequency of the ring oscillator 106 can be controlled in small steps, e.g. 1.7 ps for this implementation.
Also, the gate sizes of the inverters 510 and 512 can be chosen appropriately to achieve the desired resolution. After deciding the delay step size for fine tuning, the number of stages can be chosen such that the frequency/delay difference between all stages turned off and all stages turned on is approximately the same as a single step of the coarse tune bit module 502. For example, with 32 fine tune bit stages 504, each having a 1.7 ps step, the total range is 1.7 ps × 32 = 54.4 ps, which is approximately the same as the single step size of a coarse tune bit stage 502, e.g. 53.1 ps. The circuit of the oscillator tuner 108 can adjust the frequency of the tunable ring oscillator 106 automatically to achieve a small time/frequency difference between clock 1 and clock 2.
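The sizing rule for the fine tune stages can be checked with a few lines of arithmetic. The sketch below uses the 53.1 ps and 1.7 ps step sizes quoted in the text; the function names and the linear delay model in `added_delay_ps` are assumptions made only for illustration.

```python
import math

COARSE_STEP_PS = 53.1   # single coarse tune step quoted in the text
FINE_STEP_PS = 1.7      # single fine tune step quoted in the text

def fine_stages_needed(coarse_step_ps, fine_step_ps):
    """Smallest number of fine stages whose total range covers one coarse step."""
    return math.ceil(coarse_step_ps / fine_step_ps)

def added_delay_ps(coarse_on, fine_off):
    """Hypothetical linear model of the delay added to the ring.

    coarse_on: coarse stages with the inverter path selected
    fine_off:  fine stages with the pass gate turned off
    """
    return coarse_on * COARSE_STEP_PS + fine_off * FINE_STEP_PS

stages = fine_stages_needed(COARSE_STEP_PS, FINE_STEP_PS)
fine_range_ps = stages * FINE_STEP_PS

print(f"{stages} fine stages span {fine_range_ps:.1f} ps "
      f"versus a coarse step of {COARSE_STEP_PS} ps")
```

Because the full fine range slightly exceeds one coarse step in this model, every delay value between two adjacent coarse settings can be reached to within one fine step.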
The circuit design for each application should target the highest resolution desired; however, the resolution can be adjusted according to the technology node and various design constraints. Also, to reduce the impact of oscillator jitter, the mean value of multiple measurements, e.g. 100 measurements, can be calculated.
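The benefit of averaging can be stated with a standard statistical relation. As an assumption not made in the text, suppose the jitter contributions to the N individual measurements W_i are independent with a common standard deviation σ_W. Then:

```latex
\bar{W} = \frac{1}{N}\sum_{i=1}^{N} W_i,
\qquad
\sigma_{\bar{W}} = \frac{\sigma_W}{\sqrt{N}}
```

Under that assumption, averaging N = 100 measurements reduces the jitter-induced spread of the result by a factor of 10.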
At the first step of the coarse tune phase 602, it is necessary to check whether the counters are stable. If the counter 110 is not stable, the clocks are too close, and the tunable oscillator requires only a very small change in frequency, just enough to make the counter 110 stable. So if the counter 110 is not stable, the process moves directly to the fine tune phase 604. If the counter 110 is stable, the process first goes through the coarse tune phase 602, and after the coarse tune phase 602 is finished, the process moves to the fine tune phase 604 of the state machine.
In principle, the fine tune phase 604 and the coarse tune phase 602 are the same except for the delays they add for tuning the frequency of the tunable oscillator 106 and the entry and exit stages for each of them.
For the coarse tune phase 602, if the counter 110 is stable at 608 and it is the first phase of the tuning at 610, then by default some delay is added to the tunable ring stage at 612, and hence the frequency is decreased by one step. After the delay is added in the tunable ring stage and the frequency is decreased at 612, the new count value is compared to the previous count value at 614. If the addition of delay increases the count value, the frequencies have come closer together; if it decreases the count value, the frequencies have moved farther apart.
The coarse bit +1 at 616 reduces the delay by one step and hence increases the frequency, and the coarse bit −1 at 618 increases the delay by one step and hence decreases the frequency. After adjusting the frequency, the coarse bit +1 and coarse bit −1 steps also capture the new count values and certain flags used to decide in the following step, at 620, whether the coarse tune has finished.
If the old count value is larger, the coarse bit +1 step at 616 reduces the delay again to go back to the original frequency and sets the direction in which to adjust (decrease or increase) the frequency. After the adjustment, the coarse tune done step at 620 checks whether the coarse tune is done and determines whether further adjustment using the coarse tune is required. If further adjustment is not required, the process moves to the fine tune phase 604.
In the fine tune phase 604, the same procedure is repeated for the fine tune step. The last step, fine tune done at 632, also checks whether further fine tuning can be done. If it cannot, the state machine goes to the done stage at 634 and locks the frequency of the tunable ring oscillator. It sets the done flag high to indicate that the tuning is finished and that the tunable ring oscillator can now be used for small pulse width measurement.
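The coarse-then-fine procedure can be approximated in software as a simple hill climb on the counter value. The sketch below is not the state machine of the disclosure: it omits the stability check, the flags, and the reset counter, and all numeric values except the two step sizes are hypothetical. It also assumes that the count grows as the two periods approach each other, modeled here as half the reference period divided by the period difference, and it treats the delay steps directly as period steps.

```python
T_REF_FS = 1_000_500        # reference period (hypothetical)
T_BASE_FS = 600_000         # tunable period with no added delay (hypothetical)
COARSE_STEP_FS = 53_100     # coarse step, from the 53.1 ps example
FINE_STEP_FS = 1_700        # fine step, from the 1.7 ps example

def period_fs(coarse, fine):
    """Tunable oscillator period for a given number of added delay steps."""
    return T_BASE_FS + coarse * COARSE_STEP_FS + fine * FINE_STEP_FS

def count_for(coarse, fine):
    """Assumed counter model: larger count means closer periods."""
    diff = abs(T_REF_FS - period_fs(coarse, fine))
    if diff == 0:
        return 0            # identical periods: treated as an unstable count
    return (T_REF_FS // 2) // diff

def hill_climb(measure, setting, lo, hi):
    """Step one bit at a time while the count keeps increasing."""
    best = measure(setting)
    for direction in (+1, -1):          # try adding delay first
        moved = False
        while lo <= setting + direction <= hi:
            count = measure(setting + direction)
            if count <= best:
                break                   # count did not improve: keep old setting
            setting, best, moved = setting + direction, count, True
        if moved:
            break                       # this direction was the right one
    return setting, best

# Coarse phase first (fine bits held at mid-range), then fine phase.
coarse, _ = hill_climb(lambda c: count_for(c, 16), 5, 0, 16)
fine, count = hill_climb(lambda f: count_for(coarse, f), 16, 0, 32)
resolution_fs = abs(T_REF_FS - period_fs(coarse, fine))
```

With these hypothetical values the search settles on the setting whose period is closest to the reference, and the remaining period difference is the measurement resolution.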
In
The unique point of state machine and algorithm illustrated in
The advantageous features of the present disclosure include very high accuracy using Equivalent Time Sampling (ETS), where the resolution depends on the difference between two clocks, not on the frequency of an individual clock. The delay pulse is periodic with the ETS scheme, and there is no requirement to synchronize the phase of the two clocks. With the differential clock approach, a very small difference can result in a very high resolution that is easier to achieve even on slower clocks. Also, it is easy to complete the automatic placement and routing (APR) for the integrated circuit design, because no synchronization effort is required in APR for the two differential clocks used for measurement.
With the novel scheme to obtain a tunable (digitally programmable) ring oscillator with very small incremental steps, the frequency of one clock can be adjusted to be close to that of the other clock, which enhances the measurement resolution. Digital output for the measured value is available for on-chip post processing. With a built-in circuit to compute statistical data, higher accuracy can be achieved. For example, a built-in circuit can measure the delay/timing characteristics repeatedly to account for uncertainties in silicon. Statistical data can be processed with an on-chip circuit or directly shifted out. With this disclosure, large sample space data collection in a very short time is possible without using automatic tester equipment (ATE).
The present disclosure can not only detect the cell delay in integrated circuits with scaling technology, but can also measure other short pulses for a variety of applications. For example, pulse width measurement has many other applications such as 1) hold time for the FF (i.e., relatively fast NMOS/PMOS transistors), 2) cell rise time and fall time to characterize the FS/SF process corner (i.e., combinations of relatively fast and slow NMOS/PMOS transistors), etc. In addition, the present disclosure helps to save time compared to doing a direct measurement each time following conventional methodologies. A person skilled in the art will appreciate that there can be many embodiment variations of this disclosure.
Although the present disclosure and its advantages have been described in detail, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of invention as defined by the appended claims. Moreover, the scope of the present application is not intended to be limited to the particular embodiments of the process, machine, manufacture, and composition of matter, means, methods and steps described in the specification. As one of ordinary skill in the art will readily appreciate from the disclosure, processes, machines, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed, that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein may be utilized according to the present disclosure. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps.
The present application claims priority of U.S. Provisional Patent Application Ser. No. 61/234,052 filed on Aug. 14, 2009 which is incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
61234052 | Aug 2009 | US