The present disclosure relates to modeling and simulating intra-die variations in integrated circuits.
Manufacturing imperfections in the integrated circuit fabrication process result in undesired variations in the behavior of an integrated circuit from one die to another. These manufacturing or process variations can typically be classified into two types: global variations and intra-die variations. Global variations are variations from one die to another. In this case, the differences between transistors on the same die are as designed for the nominal case. For example, two transistors that are designed to be exactly identical will be identical on any one die, but all the transistors on a single die will shift together to some non-nominal process corner. In other words, the parameters of all the transistors within a single die are perfectly correlated. In the second case of intra-die variations, there is some independent variation from one transistor to another, resulting in partially correlated transistors on the same die. Furthermore, this correlation tends to be very high for neighboring transistors and decreases with the separating distance between any two transistors. As a result, this is spatially correlated intra-die variation. Typically all transistors within a single logic gate are assumed to be perfectly correlated, and this intra-die variation is modeled as inter-gate variation. In such a representation, if there are 1 million logic gates in an integrated chip, there will be 1 million random variables for each transistor parameter (e.g., channel length, threshold voltage, or the like).
Statistical static timing analysis (SSTA) is one critical application that has emerged as an essential tool for dealing with this statistical uncertainty in nanoscale designs. Prevailing SSTA methods, which are referred to as model based methods, typically exploit some simplifications to make the computation tractable. The most adopted implementations typically assume a linear dependence of gate delays (i.e., slew rates) on the statistical parameters (e.g., channel length, gate width, threshold voltage, or the like) for gates. This linear model enforces a severe approximation of the max( ) operator used to compute the worst case delay of any gate to maintain the linear dependence of the circuit delay on all statistical parameters. Early implementations also required the assumption of normally distributed statistical parameters. This leads to errors in the estimates of the circuit delay (or slack) distributions. Extensions to nonlinear gate models and/or non-normal distributions have been proposed. However, these extensions usually result in higher computation cost that might not scale cheaply with an increasing number of parameters, which is a trend expected for upcoming technologies. As a result, these extensions seem not to have yet found wide adoption in practice. Therefore, there is a need for a method and system for rapidly modeling and simulating intra-die variations in an integrated circuit.
A method and system for rapidly modeling and simulating intra-die variations in an integrated circuit are provided. In one embodiment, each logic gate in an integrated circuit has a characteristic to be simulated, where the characteristic of the gate is a function of one or more parameters having intra-die variations. For each parameter, a model of intra-die variation of the parameter is generated such that a number of random variables in the model is compressed to a reduced number (r) of random variables based on a spatial correlation of the intra-die variation of the parameter. In one embodiment, the model of the intra-die variation is a truncated Karhunen-Loève Expansion (KLE) of a function representing the intra-die variation of the parameter, wherein the KLE is truncated such that the number of random variables is compressed to the reduced number (r) of random variables. Then, using a Quasi Monte Carlo (QMC) technique, the integrated circuit is simulated based on the model of the intra-die variation of each of the one or more parameters.
In one embodiment, in order to simulate the integrated circuit using the QMC technique, values for the reduced number (r) of random variables are generated for each of the one or more parameters for one run of a simulation using a Low Discrepancy Sequence (LDS). Using the values generated for the reduced number (r) of random variables for the one or more parameters, values for the one or more parameters for all or a subset of the gates in the integrated circuit are calculated. Then, the run of the simulation of the integrated circuit is performed using the calculated values of the one or more parameters. This process is repeated for a desired number of runs of the simulation. Results of the simulation may then be output to a user or stored for subsequent use.
Those skilled in the art will appreciate the scope of the present invention and realize additional aspects thereof after reading the following detailed description in association with the accompanying drawings.
The accompanying drawings incorporated in and forming a part of this specification illustrate several aspects of the invention, and together with the description serve to explain the principles of the invention.
The embodiments set forth below represent the necessary information to enable those skilled in the art to practice the invention and illustrate the best mode of practicing the invention. Upon reading the following description in light of the accompanying drawings, those skilled in the art will understand the concepts of the invention and will recognize applications of these concepts not particularly addressed herein. It should be understood that these concepts and applications fall within the scope of the disclosure and the accompanying claims.
Once the modeling and simulation function 12 has obtained the integrated circuit layout 14, the modeling and simulation function 12 operates to simulate the integrated circuit defined by the integrated circuit layout 14 using models stored in a model library 16. In the preferred embodiment, the integrated circuit includes tens of thousands, hundreds of thousands, or millions of logic gates. The model library 16 includes a model for a characteristic (e.g., delay) of each of a number of gate types (e.g., NAND gates, NOR gates, and the like). For each gate type, the corresponding model defines the characteristic of the gate type as a function of a number of parameters including one or more parameters having intra-die variations. As discussed below, for each parameter of the one or more parameters having intra-die variations, the modeling and simulation function 12 operates to generate a model of the intra-die variation of the parameter. Then, using the models of the intra-die variations of the one or more parameters in combination with the models for the characteristic of the various gate types, the modeling and simulation function 12 is enabled to simulate the characteristic for all gates in the integrated circuit or a desired subset of the gates in the integrated circuit.
In one embodiment, the simulation is a Statistical Static Timing Analysis (SSTA) in which the modeling and simulation function 12 determines a statistical distribution of an input-to-output delay for one or more critical paths in the integrated circuit or for each path in the integrated circuit. The one or more critical paths may be the longest input-to-output paths or one or more critical paths defined by a user such as, for example, a designer of the integrated circuit. As will be appreciated by one of ordinary skill in the art, SSTA may be performed in order to determine a worst case maximum delay path within the integrated circuit, which in turn defines a maximum clock frequency that can be used to clock the integrated circuit. SSTA may also be performed to determine the minimum delay to ensure that data travels no faster than the clock period plus some necessary margin, which is referred to as “hold time.”
As a starting point, a grid-less stochastic process model for spatially correlated intra-die variations for each parameter (p) is used. This model does not rely on arbitrary partitioning of the chip area. Also, the spatial correlation of the intra-die variation of the parameter (p) is defined by a correlation kernel K(x,y). The correlation kernel K(x,y) is a known function that returns the covariance of the parameter (p) at locations x and y on the normalized chip area D=[−1,1]×[−1,1] (x, y ∈ D). Note that different parameters (p) typically have different correlation kernels K(x,y). Having normalized the parameters, the covariance is equal to the correlation. As such, K(x,y) is referred to herein as a correlation kernel K(x,y). While not essential, for a discussion of one exemplary technique for generating the correlation kernel K(x,y), the interested reader is directed to J. Xiong, “Robust Extraction of Spatial Correlation,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 4, pp. 619-631, April 2007, which is hereby incorporated herein by reference for its teachings relating to the generation of the correlation kernel K(x,y).
The parameter (p) is represented as a 2-dimensional function p(x), where x ∈ D. The values of p(x) across the chip area, or domain (D), follow the correlation kernel K(x,y). As such, p(x) is a stochastic process with the correlation kernel K(x,y). Using Karhunen-Loève Expansion (KLE), the parameter (p) can be represented by the following orthogonal decomposition:
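For a normalized, zero-mean process, the standard form of the KLE, written with the symbols defined in the following paragraph, is given below. It is offered as a reconstruction consistent with those definitions.

```latex
p(x) = \sum_{j=1}^{\infty} \sqrt{\lambda_j}\, \xi_j\, f_j(x) \tag{1}
```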
where λj is the j-th largest eigenvalue of the correlation kernel K(x,y) and fj(x) is the corresponding eigenfunction of the correlation kernel K(x,y). The eigenvalues λj and the eigenfunctions fj(x) are referred to as eigenpairs. ξj is the j-th random variable. The eigenpairs (λj,fj) are solutions of the integral equation:
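In its usual form (a homogeneous Fredholm integral equation of the second kind over the chip area D), and assuming this is the form intended as Equation (2), the eigenproblem reads:

```latex
\int_{D} K(x, y)\, f_j(y)\, dy = \lambda_j\, f_j(x) \tag{2}
```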
The eigenfunctions fj(x) are orthogonal, and the random variables ξj are uncorrelated; for a Gaussian stochastic process, the random variables ξj are also independent.
Each location, or logic gate, within the integrated circuit die has a random variable ξj for the parameter (p). In other words, without compression, the number of random variables ξj is equal to the total number of logic gates on the integrated circuit, which can be hundreds of thousands or millions of logic gates in modern integrated circuits. From Equation (1), it can be seen that the j-th eigenvalue λj is a measure of the contribution of the j-th random variable ξj to the overall variance of p(x). Thus, in order to compress the number of random variables ξj, the KLE of p(x) shown in Equation (1) is truncated at the first r eigenpairs, where r is substantially less than the total number of logic gates in the integrated circuit, in order to approximate the parameter (p) as:
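Truncating the standard KLE sum at the first r eigenpairs, as just described, would give the following approximation (a reconstruction under the same zero-mean assumption):

```latex
p(x) \approx \sum_{j=1}^{r} \sqrt{\lambda_j}\, \xi_j\, f_j(x) \tag{3}
```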
The approximation for the parameter (p) in Equation (3) is optimal in the sense that it minimizes the mean squared error resulting from a finite representation of the parameter (p). This implies that infinitely many random variables ξj (j=1 . . . ∞) spread over the domain (D), or chip area, can be represented using a potentially small number of uncorrelated random variables ξj (j=1 . . . r). For example, it has been found that, for ISCAS89 sequential benchmark circuits having 5,597 gates, using r=25 eigenpairs for each statistical parameter yielded errors of only 0.35% on average. This is a greater than 200× reduction in the number of random variables ξj. Note that the reduced number (r) may, in some embodiments, be variable such that an amount of error in the approximation of Equation (3) can be controlled. For example, the user operating the computing system 10 may be enabled to configure r or a maximum error.
In one embodiment, the value of r is in the range of 1<r<200 and is programmatically selected by the modeling and simulation function 12 such that:
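One inequality that matches the description in the following paragraph is sketched below. It is an assumption rather than a confirmed form: it takes the upper bound on the unused eigenvalues to be the computed-but-unused eigenvalues plus (n − 200) copies of the smallest computed eigenvalue.

```latex
\sum_{j=r+1}^{200} \lambda_j \;+\; (n - 200)\,\lambda_{200} \;<\; 0.01 \sum_{j=1}^{r} \lambda_j \tag{4}
```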
where λ200 is the 200-th eigenvalue and n is a total number of eigenvalues. The left side of the inequality of Equation (4) is an upper bound on the sum of all unused n-r eigenvalues λ, given that only the first 200 eigenvalues λ are computed. The inequality of Equation (4) states that the upper bound on the sum of all unused n-r eigenvalues λ is less than 1% of the sum of the first r eigenvalues. By selecting a value for r in this manner, the truncated KLE of Equation (3) accounts for most of the variation in the parameter (p). It should be noted that both the inequality criterion, which is 1% in Equation (4), and the maximum value of r and thus the number of computed eigenvalues λ may vary depending on the particular implementation. For example, in an implementation requiring greater accuracy, the inequality criterion may be reduced from 1% to 0.5%, and the maximum value of r and the total number of computed eigenvalues λ may also be increased to a point needed to satisfy the 0.5% inequality criterion.
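The selection of r described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `select_r` and its arguments are hypothetical, and the bound on the uncomputed eigenvalues (each is at most the smallest computed eigenvalue) is one reading of the description.

```python
def select_r(eigvals, n_total, rel_tol=0.01):
    """Return the smallest r that satisfies the truncation criterion, or None.

    eigvals : the m largest eigenvalues in descending order (m = 200 in the text)
    n_total : total number of eigenvalues n, with n >= m
    rel_tol : inequality criterion (0.01 for the 1% rule)
    """
    m = len(eigvals)
    if n_total < m:
        raise ValueError("n_total must be at least the number of computed eigenvalues")
    # Assumed bound: each uncomputed eigenvalue is no larger than the smallest computed one.
    uncomputed_bound = (n_total - m) * eigvals[-1]
    computed_sum = sum(eigvals)
    head = 0.0
    for r in range(1, m + 1):
        head += eigvals[r - 1]          # sum of the first r eigenvalues
        tail = computed_sum - head      # computed but unused eigenvalues
        if tail + uncomputed_bound < rel_tol * head:
            return r
    return None  # criterion not met; compute more eigenvalues or relax rel_tol
```

A `None` result corresponds to the case noted in the text where the maximum value of r and the number of computed eigenvalues would need to be increased.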
In order to complete the model for the parameter (p) given in Equation (3), Equation (2) must be solved to compute the first r eigenpairs (λj,fj). These first r eigenpairs (λj,fj) can then be used to compute values for p(x) for any desired locations, or logic gates, on the integrated circuit die. The following discussion provides a preferred technique for solving Equation (2). However, the present invention is not limited thereto. Other techniques for solving Equation (2) may be used, as will be apparent to one of ordinary skill in the art upon reading this disclosure.
In the preferred embodiment, Equation (2) is solved using a Galerkin technique. Let Vn be a finite-dimensional subspace, with a basis set {φi} (i=1 . . . n), of the (Hilbert) function space containing the solutions of Equation (2). Then, dropping the subscript j, any eigenfunction f(x) can be approximated as a linear combination of the basis functions φi:
Here, the subscript n indicates that an expansion in n basis functions φi is being used. The variable di is the i-th element of eigenvector d. Note that bold letters are used herein to indicate vectors or matrices.
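The linear combination just described can be written as follows (a reconstruction from the surrounding definitions of the coefficients di and basis functions φi):

```latex
f(x) \approx f_n(x) = \sum_{i=1}^{n} d_i\, \varphi_i(x) \tag{5}
```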
If Equation (5) is substituted into Equation (2), the two sides of the resulting equation will not match exactly, resulting in a residual R given by:
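A residual of this kind can be written as the difference between the two sides of the eigenproblem evaluated at the approximation fn(x). The sign convention below is an assumption; it does not affect the resulting matrix equation.

```latex
R_n(x) = \int_{D} K(x, y)\, f_n(y)\, dy \;-\; \lambda_n\, f_n(x) \tag{6}
```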
Substituting Equation (5) into Equation (6) gives:
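Carrying out that substitution under the assumed sign convention (kernel term minus eigenvalue term) would give:

```latex
R_n(x) = \sum_{i=1}^{n} d_i \left[ \int_{D} K(x, y)\, \varphi_i(y)\, dy \;-\; \lambda_n\, \varphi_i(x) \right] \tag{7}
```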
where λn and di are all the unknowns that need to be computed, given the known set of basis functions φi.
In order to estimate these unknowns, the residual R is minimized by making the residual R orthogonal to the basis:
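In the usual Galerkin formulation, this orthogonality condition is an inner product of the residual with each basis function. A form consistent with the text is:

```latex
\langle R_n, \varphi_k \rangle = \int_{D} R_n(x)\, \varphi_k(x)\, dx = 0, \qquad k = 1, \ldots, n \tag{8}
```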
This ensures that the basis functions φi are completely utilized to “explain” as much of the true solution as possible using this finite-dimensional space Vn. Equation (8) can then be manipulated into matrix form:
Kd=λnΦd, (9)
where
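Expanding the orthogonality condition term by term leads to the following matrix entries. These are reconstructed from the Galerkin derivation and assume the symmetric kernel K(x,y) and basis functions φi defined above.

```latex
K_{ik} = \int_{D}\!\int_{D} K(x, y)\, \varphi_i(x)\, \varphi_k(y)\, dy\, dx \tag{10}

\Phi_{ik} = \int_{D} \varphi_i(x)\, \varphi_k(x)\, dx \tag{11}
```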
Equation (9) defines the well known Generalized Eigenvalue Problem (GEP), λn being the eigenvalue and d being the eigenvector. Note that the j-th largest eigenvalue λn and its corresponding eigenvector d approximate the j-th eigenpair (λj,fj) of Equation (2).
Since the basis functions φi and the correlation kernel K(x,y) are known, Equations (10) and (11) can be used to compute the values for Kik and Φik. Then, the GEP problem of Equation (9) can be solved to determine the first (i.e., largest) r eigenvalues λn and corresponding eigenvectors d. Using Equation (5), the eigenvectors d can then be used to approximate corresponding eigenfunctions f(x).
In one embodiment, the basis functions φi are piecewise constant over a triangular mesh (i.e., a triangulation of the chip area (D)):
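A piecewise constant basis over a triangulation is conventionally the set of indicator functions of the triangles. Assuming that is the intended definition:

```latex
\varphi_i(x) =
\begin{cases}
1, & x \in \Delta_i \\
0, & \text{otherwise}
\end{cases} \tag{12}
```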
where Δi are triangles with a maximum overlap of one side.
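With indicator-function basis elements, the matrix entries of the generalized eigenvalue problem reduce to integrals over pairs of triangles, and Φ becomes diagonal. The expressions below are assumed to correspond to Equations (13) and (14). The centroid approximation on the right of the first line (triangle areas ai and centroids ci) is a common one-point quadrature and is an assumption, not taken from the text.

```latex
K_{ik} = \int_{\Delta_i}\!\int_{\Delta_k} K(x, y)\, dy\, dx \;\approx\; a_i\, a_k\, K(c_i, c_k) \tag{13}

\Phi_{ik} =
\begin{cases}
a_i, & i = k \\
0, & i \neq k
\end{cases} \tag{14}
```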
Once the matrices K and Φ are computed using Equations (13) and (14), the GEP of Equation (9) can be solved and used in combination with Equation (5) to provide the first r eigenpairs (λj,fj). Then, using the first r eigenpairs (λj,fj), the truncated KLE of p(x) in Equation (3) provides a model of the intra-die variation of the parameter (p) that enables the parameter (p) to be computed for any location x (i.e., logic gate) on the integrated circuit die.
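The Galerkin procedure above can be sketched numerically as follows. This is an illustrative sketch under several assumptions that are not taken from the text: a uniform triangulation of D, an exponential kernel exp(−c·|x−y|), one-point centroid quadrature for the entries of K, and a diagonal Φ holding the triangle areas. All function names are hypothetical.

```python
import numpy as np

def triangulate_square(m):
    """Cut D = [-1, 1] x [-1, 1] into m*m squares, each split into two triangles.

    Returns the triangle centroids (n x 2) and areas (n,), with n = 2*m*m.
    """
    h = 2.0 / m
    centroids, areas = [], []
    for ix in range(m):
        for iy in range(m):
            x0, y0 = -1.0 + ix * h, -1.0 + iy * h
            centroids.append((x0 + h / 3.0, y0 + h / 3.0))              # lower-left triangle
            centroids.append((x0 + 2.0 * h / 3.0, y0 + 2.0 * h / 3.0))  # upper-right triangle
            areas.extend([h * h / 2.0, h * h / 2.0])
    return np.array(centroids), np.array(areas)

def exp_kernel(centroids, c=1.0):
    """Hypothetical kernel K(x, y) = exp(-c * |x - y|) at all centroid pairs."""
    diff = centroids[:, None, :] - centroids[None, :, :]
    return np.exp(-c * np.linalg.norm(diff, axis=-1))

def galerkin_kle(kernel_at_centroids, areas, r):
    """Return the r largest eigenvalues and matching coefficient vectors d (columns)."""
    s = np.sqrt(areas)
    # With K_ik ~ K(c_i, c_k) * a_i * a_k and Phi = diag(a), the matrix
    # Phi^{-1/2} K Phi^{-1/2} has entries K(c_i, c_k) * sqrt(a_i * a_k),
    # which turns K d = lambda Phi d into a symmetric standard eigenproblem.
    sym = kernel_at_centroids * np.outer(s, s)
    lam, y = np.linalg.eigh(sym)          # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:r]     # keep the r largest
    d = y[:, order] / s[:, None]          # d = Phi^{-1/2} y, so d^T Phi d = 1
    return lam[order], d

centroids, areas = triangulate_square(4)  # 32 triangles
lam, d = galerkin_kle(exp_kernel(centroids), areas, r=5)
# With piecewise-constant basis functions, f_j(x) on triangle i is simply d[i, j].
```

Because Φ is diagonal in this setting, the sketch avoids a dedicated generalized eigensolver; a non-diagonal Φ would require one.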
Once the model of the intra-die variation for each of the one or more parameters (p) has been generated, the modeling and simulation function 12 generates sample values for the reduced set of random variables ξj for j=1 . . . r for each parameter (p) using a Quasi Monte Carlo (QMC) technique (step 102). Note that a standard Monte Carlo (MC) technique may alternatively be used. However, a QMC technique is preferable over a standard MC technique. More specifically, random sample values typically used for a standard MC technique suffer from clusters and empty spaces in their distribution over the sampling region (unit cube). In other words, the random sample values typically used for a standard MC technique are not very uniformly spread out, especially for smaller sample sizes. Although the details are outside the scope of this disclosure, this non-uniformity of any set of n points can be represented mathematically by a measure called star discrepancy D*n. The Koksma-Hlawka theorem shows that lower discrepancy sample sets can lead to lower Monte Carlo integration errors; specifically, the star discrepancy D*n bounds the integration error. The discrepancy of n uniformly distributed random points in s dimensions is:
$$D^*_n\big|_{\mathrm{MC}} = O\!\left(n^{-0.5}\,(\log\log n)^{0.5}\right), \qquad (15)$$
echoing the $O(n^{-0.5})$ convergence of the standard MC error.
With respect to QMC, point sequences with asymptotically superior discrepancy exist:
$$D^*_n\big|_{\mathrm{QMC}} = O\!\left(n^{-1}\,(\log n)^{s}\right), \qquad (16)$$
and are called Low Discrepancy Sequences (LDSs). Various types of LDSs are known. Some exemplary types of LDSs are Sobol' LDSs, Faure LDSs, Halton LDSs, Hammersley LDSs, Niederreiter LDSs, and van der Corput LDSs. The discrepancy behavior of LDSs suggests an asymptotic integration error rate of $O(n^{-1})$, which is much faster than the $O(n^{-0.5})$ of standard MC techniques. QMC techniques are Monte Carlo techniques that use samples from these deterministic LDSs rather than pseudo-randomly generated samples.
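As a small illustration of a deterministic LDS, the sketch below builds Halton points from van der Corput radical inverses and uses them to estimate an integral over the unit square. The Halton sequence is chosen here only because it is short to write out; the function names and the test integrand are hypothetical.

```python
def van_der_corput(i, base):
    """Radical inverse of the integer i >= 0 in the given base (a value in [0, 1))."""
    result, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        result += digit / denom
    return result

def halton(n, bases):
    """First n Halton points, one coordinate per base; index 0 (the origin) is skipped."""
    return [[van_der_corput(i, b) for b in bases] for i in range(1, n + 1)]

def qmc_mean(f, n, bases):
    """QMC estimate of the integral of f over the unit cube using n Halton points."""
    return sum(f(u) for u in halton(n, bases)) / n

# Integral of u1 * u2 over the unit square; the exact value is 0.25.
estimate = qmc_mean(lambda u: u[0] * u[1], 1024, (2, 3))
```

The bases should be distinct primes, one per dimension, so that the coordinates do not repeat the same pattern.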
One issue with LDSs is that for a large number of dimensions s, the number of points needed for uniformity in all dimensions can be large, which is reflected in Equation (16) where a large value of n is needed to make $(\log n)^s \ll n$. For practical sample sizes, this is manifested as undesirable patterns of empty regions in the low dimensional projections of the point set, which can lead to integration errors. However, current best LDSs, such as, for example, Sobol' LDSs, are able to achieve uniform filling in the early dimensions and suffer from patterns mainly in the projections of the later, or higher, dimensions. For example, a set of 10K Sobol' LDS points in 100 dimensions will have well distributed points in dimensions 1-10 and undesirable patterns in dimensions 91-100.
As such, with respect to step 102, in the preferred embodiment, the modeling and simulation function 12 generates sample values for the reduced set of random variables ξj for j=1 . . . r for each of the one or more parameters using an LDS. In an embodiment where there is only one parameter (p), the modeling and simulation function 12 maps coordinates from the first dimension in the LDS to the first random variable ξ1, coordinates from the second dimension in the LDS to the second random variable ξ2, coordinates from the third dimension in the LDS to the third random variable ξ3, and so on. In other words, for each random variable ξj, the modeling and simulation function 12 maps a point from the j-th lowest dimension of the LDS to the random variable ξj. As a result of this mapping, the early or low dimensions of the LDS are mapped to the most important random variables thereby providing better than standard MC convergence. Recall that the measure of importance of the random variables ξj are the corresponding eigenvalues λj.
In an embodiment where there are multiple parameters (p), the modeling and simulation function 12 preferably interlaces the parameters (p) when mapping LDS points to the random variables for the parameters (p). For example, if there are two parameters W and L having random variables ξW1, . . . , ξWr and ξL1, . . . , ξLr, respectively, the modeling and simulation function 12 may map the first coordinate in the LDS to ξW1, the second coordinate in the LDS to ξL1, the third coordinate in the LDS to ξW2, the fourth coordinate in the LDS to ξL2, and so on. In other words, if there are NP parameters pk (k=1, 2, . . . , NP), this interlacing of the multiple parameters when mapping LDS coordinates to the random variables for the parameters may be represented as:
LDSINDEX(j,k)=k+[NP·(j−1)], (17)
where LDSINDEX is an index for the LDS sequence (i.e., a dimension of the LDS sequence). Therefore, the point in the {k+[NP·(j−1)]}-th dimension of the LDS sequence is mapped to the j-th random variable ξj for the k-th parameter pk.
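The interlaced mapping of Equation (17) can be sketched as below. The conversion of each LDS coordinate to a sample value through the inverse standard normal CDF is an assumption (it presumes standard normal random variables ξj) and is not stated in the text; the function names are hypothetical.

```python
from statistics import NormalDist

def lds_index(j, k, num_params):
    """1-based LDS dimension for the j-th random variable of the k-th parameter."""
    return k + num_params * (j - 1)

def split_lds_point(point, num_params, r):
    """Map one LDS point (coordinates strictly inside (0, 1)) to samples xi[k-1][j-1].

    Assumes standard normal random variables, obtained via the inverse normal CDF.
    """
    inv_cdf = NormalDist().inv_cdf
    return [
        [inv_cdf(point[lds_index(j, k, num_params) - 1]) for j in range(1, r + 1)]
        for k in range(1, num_params + 1)
    ]

# Two parameters (k=1 for W, k=2 for L) with r = 2 random variables each:
# coordinates 1 and 3 feed W, coordinates 2 and 4 feed L.
xi = split_lds_point([0.5, 0.1, 0.9, 0.5], num_params=2, r=2)
```

With this ordering, the lowest LDS dimensions are always assigned to the largest-eigenvalue random variables of every parameter.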
Next, for each parameter (p), the modeling and simulation function 12 computes values for the parameter (p) for all gates on the integrated circuit, or a defined subset of the gates on the integrated circuit, using the sample values for the reduced set of random variables ξj for j=1 . . . r for the parameter (p) (step 104). Then, using the values computed for the one or more parameters, the modeling and simulation function 12 performs one run of a simulation for the integrated circuit (step 106). The modeling and simulation function 12 then determines if it has performed a last run of the simulation (step 108). For example, the modeling and simulation function 12 may be configured to perform 1,000 runs of the simulation. As such, the modeling and simulation function 12 may determine whether the run of the simulation performed in step 106 was the 1,000-th run. If not, the process returns to step 102 and is repeated for the next run of the simulation.
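The loop over steps 102 through 108 can be sketched as follows for a single parameter. Everything numerical here is made up for illustration: the two eigenpairs, the three-gate path, the linear delay model, and the use of a Halton-style LDS with an inverse normal CDF. A real run would call a timing engine in place of the toy delay sum.

```python
from math import sqrt
from statistics import NormalDist

def radical_inverse(i, base):
    """Van der Corput radical inverse of i >= 0 in the given base."""
    result, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        result += digit / denom
    return result

EIGVALS = [0.7, 0.2]                                # hypothetical eigenvalues, r = 2
EIGFUNCS = [[0.5, 0.6], [0.5, 0.0], [0.5, -0.6]]    # hypothetical f_j at each gate location
NOMINAL_DELAY = [10.0, 12.0, 8.0]                   # made-up nominal gate delays on one path
SENSITIVITY = 0.1                                   # made-up linear delay sensitivity

def run_simulation(num_runs):
    """Return one toy path delay per simulation run."""
    inv_cdf = NormalDist().inv_cdf
    bases = (2, 3)  # one LDS dimension per random variable
    path_delays = []
    for run in range(1, num_runs + 1):  # the loop bound plays the role of step 108
        # Step 102: sample the r random variables from the LDS point for this run.
        xi = [inv_cdf(radical_inverse(run, b)) for b in bases]
        # Step 104: evaluate the truncated KLE at every gate location.
        p = [
            sum(sqrt(lam) * x * f for lam, x, f in zip(EIGVALS, xi, f_gate))
            for f_gate in EIGFUNCS
        ]
        # Step 106: one run of a toy "simulation" (sum of gate delays along the path).
        delay = sum(d0 * (1.0 + SENSITIVITY * pg) for d0, pg in zip(NOMINAL_DELAY, p))
        path_delays.append(delay)
    return path_delays

delays = run_simulation(256)
```

The list of per-run delays is the raw material for the statistical distribution that is output once the last run completes.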
Once the last run of the simulation has been performed, in this embodiment, the modeling and simulation function 12 outputs results of the simulation to a user (step 110). The results of the simulation may be, for example, a statistical distribution of the simulated characteristic of the integrated circuit. For example, if the simulated characteristic is a delay for all input-to-output paths in the integrated circuit or one or more critical input-to-output paths in the integrated circuit, the results output to the user may be a statistical distribution of the delay of each of the simulated input-to-output paths. Note, however, that the results output to the user may vary depending on the particular implementation.
The computing system 10 may also include a secondary storage device 24 such as, for example, a hard disk drive, an optical storage device, flash memory, or the like. The secondary storage device 24 may be used to store, for example, the model library 16 (
The modeling and simulation function 12 provides a substantial improvement as compared to traditional modeling and simulation techniques for parameters having intra-die variation. In particular, for SSTA of the ISCAS89 circuits of the standard industry benchmark, it has been found that a few hundred, well-chosen sample points can achieve errors within 5%, with no assumptions on gate models, wire models, or the core Static Timing Analysis (STA) engine, with runtimes less than 90 seconds. This is 10×-100× faster than conventional Monte Carlo for statistical timing analysis, with none of the modeling restrictions required for traditional fast SSTA methods.
It should also be noted that while the discussion herein has focused on the modeling and simulation of integrated circuits having a number of logic gates, the present invention is not limited thereto. For example, the modeling and simulation function 12 may also be used to model and simulate integrated circuits including memory devices (e.g., Static Random Access Memory (SRAM)) having parameters with intra-die variation, integrated circuits including analog or Radio Frequency (RF) systems having active and/or passive devices with intra-die variation, or the like.
Those skilled in the art will recognize improvements and modifications to the embodiments of the present invention. All such improvements and modifications are considered within the scope of the concepts disclosed herein and the claims that follow.
This application claims the benefit of provisional patent application Ser. No. 61/059,557, filed Jun. 6, 2008, the disclosure of which is hereby incorporated by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
61059557 | Jun 2008 | US |