The present invention relates generally to analog circuit design simulations, and more specifically to analog circuit design simulations using a mixed frequency/time approach.
Using a description language, such as a netlist, together with device models, an analog circuit is first designed in terms of its predetermined inputs and expected outputs. The analog circuit design is then simulated before it is physically fabricated on a silicon chip.
One of the most difficult challenges in analog circuit simulation is the analysis of circuits that operate on multiple time scales. Typical examples of this type of circuit are switched-capacitor filters and circuits used in RF (radio frequency) communications systems. Applying standard transient analysis to a circuit of this type requires simulation of the detailed responses of the circuit over hundreds of thousands of clock cycles (millions of time points).
Many circuits of engineering interest are designed to operate near a time-varying, but quasi-periodic, operating point. Some of these circuits can be analyzed under the assumption that one of the circuit inputs produces a periodic response that can be directly calculated by steady-state algorithms, thus avoiding long transient simulation times. Under this assumption, all other time-varying circuit inputs are treated as small signals by linearizing the circuit around the periodic operating point.
Existing algorithms are able to find periodic operating points and to perform periodic time-varying small-signal analysis. However, many circuits cannot be analyzed with the periodic-operating-point-plus-small-signal approach, because the above-described assumption may not apply. For example, predicting intermodulation distortion of a narrowband circuit, such as a mixer-plus-filter circuit, involves calculating the nonlinear response of the mixer circuit, driven by an LO (local oscillator), to two high-frequency inputs that are closely spaced in frequency. The steady-state response of such a circuit is quasi-periodic.
The analog circuit simulation is further complicated by the fact that many multi-timescale circuits have a response (again mixers and switched-capacitor filters are typical examples) that is highly nonlinear with respect to at least one of the exciting inputs, and so steady-state approaches, such as the multi-frequency harmonic balance approach, do not perform well. To circumvent these difficulties, mixed frequency-time (MFT) algorithms have been proposed. Specifically, the MFT algorithms exploit the fact that many circuits of engineering interest have a strongly nonlinear response to only one input, such as the clock in the case of a switched-capacitor circuit, or local oscillator in the case of a mixer, but respond only in a weakly nonlinear manner to other inputs.
Unfortunately, existing MFT algorithms suffer from several drawbacks that prevent their application to practical circuits, particularly large circuits. In existing MFT algorithms, poor sample point selection leads to an ill-conditioned system of equations, which may not be solvable with acceptable accuracy. In addition, existing MFT algorithms are based on a matrix-explicit linear solver (via Gaussian elimination) whose computational cost (or time) is on the order of $N^3$ for each Newton iteration, where $N$ is the number of nodes of the circuit in simulation.
A new class of algorithms has been developed for simulating multi-timescale circuits by converting the circuit DAE (differential-algebraic equation) into an equivalent multi-variable partial differential equation (M-PDE). However, the effectiveness of the M-PDE method in simulating large circuits has yet to be proven. In addition, there is evidence that, for some circuits, the M-PDE method generates inaccurate simulation results.
There is, therefore, a need in the art for a method and apparatus that utilizes the MFT method to accurately simulate large circuits.
There is another need in the art for a method and apparatus utilizing the MFT method to simulate large circuits with reduced computational cost and increased speed.
There is still another need in the art for a method and apparatus for generating an efficient linear problem solver structured such that the MFT method can accurately simulate large circuits with improved convergence.
The present invention provides a method and apparatus to meet these and other needs.
To overcome the shortcomings of the available art, the present invention discloses a novel method and apparatus for simulating analog circuits by using a mixed frequency/time approach.
In broad terms, the present invention provides a method for simulating responses of a circuit, the circuit receiving a periodic sample signal and at least one information signal. The method comprises the steps of: selecting a set of distinct time points; defining a set of reference time points, wherein each of the reference time points is associated with one of the distinct time points; establishing a first set of relationships between the values at the distinct time points and the values at the reference time points; establishing a second set of relationships between the values at the distinct time points and the values at the reference points; combining the first and second relationships to establish a system of equations in terms of the values at the distinct time points; and finding responses of the circuit at the distinct time points by solving the established system of equations.
The present invention also provides a corresponding apparatus for performing steps in the method described above.
The above mentioned advantages of the present invention as well as additional advantages will be more clearly understood as a result of a detailed description of the preferred embodiments of the invention when taken in conjunction with the following drawing in which:
1. The improved MFT algorithm
The behavior of a circuit (such as circuit 102 in
where $Q(v(t)) \in \mathbb{R}^N$ is typically the vector of sums of capacitor charges at each node, $I(v(t)) \in \mathbb{R}^N$ is the vector of sums of resistive currents at each node, $u(t) \in \mathbb{R}^N$ is the vector of inputs, $v(t) \in \mathbb{R}^N$ is the vector of node voltages, and $N$ is the number of circuit nodes.
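For orientation, a charge-oriented nodal formulation that is consistent with the definitions of $Q$, $I$, $u$ and $v$ given here can be written as below. This is a sketch of the form that equation (1) presumably takes; the sign convention on $u(t)$ is an assumption and not taken from the text.

```latex
\frac{d}{dt}\,Q\bigl(v(t)\bigr) + I\bigl(v(t)\bigr) + u(t) = 0
```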
The present invention is particularly advantageous in situations where the input signal u(t) is quasiperiodic. A signal is L-quasiperiodic if it can be written as a Fourier series with L fundamental frequencies. RF circuits are generally influenced by one periodic timing signal, often referred to as the LO (local oscillator) or the clock, and one or more information signals. If fc denotes the clock signal frequency (such as the signal at input 106 shown in
In a preferred embodiment, the present invention utilizes two conditions to improve the simulation of quasi-periodic circuit operating conditions. The first condition is that the circuit of interest possesses a quasiperiodic steady-state response. That is, $v(t)$ is an $(S+1)$-quasiperiodic signal with fundamentals $f_1, \ldots, f_S, f_c$. The second condition is that all physical circuits have a finite bandwidth. Using these two conditions, the present invention selects only a finite number of Fourier series terms to approximate $v(t)$ while maintaining the necessary accuracy. Thus:
where $V(k_1, \ldots, k_S, k_c) \in \mathbb{C}^N$. (An interesting property of the MFT algorithm is that it is not necessary to truncate to a finite number of harmonics of $f_c$.)
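The truncated Fourier series described here, with coefficients $V(k_1, \ldots, k_S, k_c)$, would take the following form. The truncation orders $K_1, \ldots, K_S$ and $K_c$ follow the index ranges used later in the text; the exact layout of the original display is an assumption.

```latex
v(t) = \sum_{k_1=-K_1}^{K_1} \cdots \sum_{k_S=-K_S}^{K_S} \sum_{k_c=-K_c}^{K_c}
       V(k_1,\ldots,k_S,k_c)\,
       e^{\,j 2\pi (k_1 f_1 + \cdots + k_S f_S + k_c f_c)\,t}
```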
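Assume that $v(t)$ is sampled at a discrete set of points $t'_n = t_0 + nT_c$, where $T_c = 1/f_c$ is the clock period, $t_0 \in [0, T_c)$ and $n$ runs over the integers, to obtain a discrete signal $\hat{v}_n = v(t_0 + nT_c)$, the "envelope". Because $f_c\,nT_c$ is an integer, every harmonic of the clock contributes only a constant phase $e^{j2\pi k_c f_c t_0}$ at the sample points, so the envelope is an $S$-quasiperiodic signal with fundamentals $f_1, \ldots, f_S$:

$$\hat{v}_n = \sum_{k_1=-K_1}^{K_1} \cdots \sum_{k_S=-K_S}^{K_S} \hat{V}(k_1,\ldots,k_S)\, e^{j2\pi(k_1 f_1 + \cdots + k_S f_S)\,t'_n}, \qquad \hat{V}(k_1,\ldots,k_S) = \sum_{k_c} V(k_1,\ldots,k_S,k_c)\, e^{j2\pi k_c f_c t_0}$$

In principle, because there are only $K = \prod_{s=1}^{S}(2K_s+1)$ Fourier coefficients to represent the envelope, it is completely specified by its values at $K$ sample points.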
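The effect of sampling once per clock period can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical two-tone signal (the function `v`, the helper `sample_at_clock`, and the values of `f1`, `fc` and `t0` are illustrative and do not come from the specification). It shows that at the sample instants the clock-dependent factor is frozen, so the sampled sequence contains only the slow fundamental.

```python
import numpy as np

def sample_at_clock(v, t0, Tc, n):
    """Sample v(t) once per clock period: returns v(t0 + n*Tc)."""
    return v(t0 + np.asarray(n) * Tc)

# Hypothetical signal: a slow tone at f1 whose amplitude is modulated by the clock fc.
f1, fc = 100.0, 100e3
Tc, t0 = 1.0 / fc, 1e-6

def v(t):
    return np.cos(2 * np.pi * f1 * t) * (1.0 + 0.5 * np.cos(2 * np.pi * fc * t))

n = np.arange(0, 2000, 50)          # clock-cycle indices of the samples
envelope = sample_at_clock(v, t0, Tc, n)

# At t = t0 + n*Tc the clock factor equals its value at t0 for every n,
# so the envelope is a Fourier series in f1 alone.
t_n = t0 + n * Tc
expected = np.cos(2 * np.pi * f1 * t_n) * (1.0 + 0.5 * np.cos(2 * np.pi * fc * t0))
```

The arrays `envelope` and `expected` agree to rounding error, which is the property that lets the envelope be described by the $K$ coefficients discussed here.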
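Let us define the state transition function $\phi(v_0, t_k, t_f) = v(t_f)$, where $v(t)$ satisfies equation (1) for $t \in [t_k, t_f]$ and $v(t_k) = v_0$. In particular, define the vector

$$\hat{v}^0 = [v^T(t_1), \ldots, v^T(t_K)]^T$$

where superscript $T$ denotes matrix transpose, to contain the values at the $K$ sample points. One clock period later, the corresponding vector is

$$\hat{v}^T = [v^T(t_1+T_c), \ldots, v^T(t_K+T_c)]^T = [\phi(v(t_1), t_1, t_1+T_c)^T, \ldots, \phi(v(t_K), t_K, t_K+T_c)^T]^T \qquad (7)$$

where the superscript on $\hat{v}^T$ labels the one-period delay rather than a transpose. This may be written more compactly by introducing the multi-cycle transition function $\Phi_T$, that is, the collection of the $K$ transition functions from $t_k$ to $t_k+T_c$, as

$$\hat{v}^T = \Phi_T(\hat{v}^0) \qquad (8)$$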
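Note that for each node $n$, the vector of signals on that node at the sample times plus one clock cycle, $\hat{v}^T_n$, is related to the vector of signals on that node at the sample times, $\hat{v}^0_n$, by the delay matrix $D_T$:

$$\hat{v}^T_n = D_T\,\hat{v}^0_n \qquad (9)$$

Note that $D_T$ is a $K \times K$ matrix that is the same for every node. Combining equations (8) and (9) gives

$$(D_T \otimes I_N)\,\hat{v}^0 - \Phi_T(\hat{v}^0) = 0 \qquad (10)$$

where $\otimes$ is the Kronecker product and $I_N$ is the $N$ by $N$ identity matrix. Equation (10) is a system of $KN$ nonlinear equations in $KN$ unknowns that can be solved for the envelope sample points. From these sample points and the transition functions, the circuit's quasiperiodic operating point (in particular, the spectrum of $v$) can be recovered.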
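To construct the matrix $D_T$, note that in the frequency domain a delay of one clock period multiplies each Fourier coefficient of the envelope by a phase factor. The delay therefore corresponds to a diagonal matrix $\Omega_T$ whose entry for the frequency $\omega_i = k_1 f_1 + \cdots + k_S f_S$ is $e^{j2\pi\omega_i T_c}$.
Thus if $\Gamma$ is the matrix mapping sample points on the envelope to Fourier coefficients, then the delay matrix may be constructed as

$$D_T = \Gamma^{-1}\,\Omega_T\,\Gamma \qquad (13)$$

In particular $\Gamma$ may be constructed as the Kronecker product of one-dimensional $(2K_s+1)$-point Fourier-transform matrices

$$\Gamma^{(s)}_{mn} = e^{j2\pi m f_s t_n} \qquad (14)$$

as

$$\Gamma = \Gamma^{(1)} \otimes \cdots \otimes \Gamma^{(S)} \qquad (15)$$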
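The construction of the delay matrix can be checked numerically for a single information tone ($S = 1$). The sketch below is illustrative only: the function `delay_matrix`, the random coefficients and the numeric values are assumptions. It adopts the convention that the matrix `E` maps Fourier coefficients to samples, so that $\Gamma$ (samples to coefficients) is its inverse.

```python
import numpy as np

def delay_matrix(freqs, t_samples, Tc):
    """Build D_T = Gamma^{-1} Omega_T Gamma for the envelope frequencies `freqs`.

    E[k, m] = exp(j*2*pi*freqs[m]*t_samples[k]) maps Fourier coefficients to
    samples, so Gamma (samples -> coefficients) is taken as inv(E) here.
    """
    E = np.exp(2j * np.pi * np.outer(t_samples, freqs))
    Gamma = np.linalg.inv(E)
    Omega = np.diag(np.exp(2j * np.pi * np.asarray(freqs) * Tc))
    return E @ Omega @ Gamma

f1, fc, K1 = 100.0, 100e3, 3
Tc = 1.0 / fc
m = np.arange(-K1, K1 + 1)
freqs = m * f1
K = m.size

# Sample points: clock-aligned instants spread over one period of f1.
t_k = np.round(np.arange(K) * fc / (K * f1)) * Tc

# Hypothetical envelope with random Fourier coefficients.
rng = np.random.default_rng(0)
coeffs = rng.standard_normal(K) + 1j * rng.standard_normal(K)

def envelope(t):
    return np.exp(2j * np.pi * np.outer(t, freqs)) @ coeffs

D = delay_matrix(freqs, t_k, Tc)
delayed = D @ envelope(t_k)          # should equal the envelope one clock period later
```

Applying `D` to the samples at `t_k` reproduces the envelope at `t_k + Tc`, which is the defining property of the delay matrix in equation (9).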
From the properties of Kronecker products, Γ−1 is likewise a Kronecker product of the inverses of the Γ(s). In the existing MFT algorithms, no particular consideration was given to the choice of the sample points tk, so that the Γ(s)'s there are ill-conditioned matrices corresponding to an “almost-periodic” Fourier transform. By contrast, the improved MFT algorithm of the present invention performs a process of choosing well-conditioned sample points.
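Assume the $K$ sample points can be arranged into an $S$-dimensional array $\tau(k_1, \ldots, k_S)$, $-K_s \le k_s \le K_s$, $1 \le s \le S$, such that for a given dimension $s$, there exists an integer $p$, and

$$f_s\,\tau(k_1, \ldots, k_S) = p + \frac{k_s}{2K_s+1} \qquad (16)$$

holds. In this case, the entries of the $\Gamma^{(s)}$ matrices are:

$$\Gamma^{(s)}_{mn} = e^{j2\pi mn/(2K_s+1)} \qquad (17)$$

That is, they are the DFT matrices, and the matrix $\Gamma$ is a multidimensional DFT matrix, which is well conditioned.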
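The difference between poorly chosen and well-chosen sample points can be seen from the condition number of the one-dimensional Fourier matrix. The following sketch is a hedged illustration: the helper `fourier_matrix`, the "naive" choice of consecutive clock cycles, and the numeric values (borrowed loosely from a 100 kHz clock and 100 Hz input) are assumptions made for the example.

```python
import numpy as np

def fourier_matrix(f1, K1, t_samples):
    """Matrix with entries exp(j*2*pi*m*f1*t_n), m = -K1..K1."""
    m = np.arange(-K1, K1 + 1)
    return np.exp(2j * np.pi * np.outer(m * f1, t_samples))

f1, fc, K1 = 100.0, 100e3, 3
Tc = 1.0 / fc
K = 2 * K1 + 1

# Naive choice: K consecutive clock cycles, all clustered in a tiny
# fraction of the period of f1.
t_naive = np.arange(K) * Tc

# Spread choice: clock-aligned instants nearest to k / (K * f1), so that
# f1 * t_k is close to k / (2*K1 + 1) and the matrix is nearly a DFT matrix.
t_spread = np.round(np.arange(K) / (K * f1 * Tc)) * Tc

cond_naive = np.linalg.cond(fourier_matrix(f1, K1, t_naive))
cond_spread = np.linalg.cond(fourier_matrix(f1, K1, t_spread))
```

With these values `cond_spread` stays close to 1 while `cond_naive` is many orders of magnitude larger, which is why the sample-point selection matters for accuracy.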
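Newton's method can now be employed to solve

$$F(\hat{v}^0) = (D_T \otimes I_N)\,\hat{v}^0 - \Phi_T(\hat{v}^0) = 0 \qquad (18)$$

At iteration $i$, the Jacobian matrix is given by

$$\frac{\partial F}{\partial \hat{v}^0}\bigg|_{\hat{v}^{0,i}} = (D_T \otimes I_N) - J \qquad (19)$$

Recall from (13) that $D_T = \Gamma^{-1}\Omega_T\Gamma$. The matrix $J$ can be obtained from the multicycle transition function by

$$J = \frac{\partial \Phi_T}{\partial \hat{v}^0}\bigg|_{\hat{v}^{0,i}} \qquad (20)$$

Note that $J$ is block-diagonal, because the $k$-th transition function depends only on $v(t_k)$. Defining $b = -(D_T \otimes I_N)\,\hat{v}^{0,i} + \Phi_T(\hat{v}^{0,i})$, each Newton step consists of solving

$$\bigl((D_T \otimes I_N) - J\bigr)\,\Delta\hat{v} = b \qquad (21)$$

using an iterative Generalized Minimal Residual (GMRES) solver, and setting

$$\hat{v}^{0,i+1} = \hat{v}^{0,i} + \Delta\hat{v} \qquad (22)$$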
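The Newton iteration can be sketched for a toy problem. The example below is not the patented implementation: it assumes a hypothetical scalar circuit ($N = 1$) obeying $dv/dt = -av - bv^3 + A_1\cos(2\pi f_1 t) + A_c\cos(2\pi f_c t)$, integrates each clock cycle with a fixed-step Runge-Kutta scheme, and solves the linear system with a dense direct solve in place of GMRES. All function names and parameter values are illustrative.

```python
import numpy as np

def transition(v0, t_start, Tc, rhs, drhs_dv, steps=200):
    """RK4 over one clock cycle; returns phi(v0, t, t+Tc) and d(phi)/d(v0)."""
    h = Tc / steps
    y = np.array([v0, 1.0])                      # [state, sensitivity]
    def f(t, y):
        return np.array([rhs(t, y[0]), drhs_dv(t, y[0]) * y[1]])
    t = t_start
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y[0], y[1]

def mft_solve(rhs, drhs_dv, f1, fc, K1, tol=1e-10, max_iter=20):
    """Solve D x - Phi(x) = 0 for the K envelope samples of a scalar circuit."""
    Tc = 1.0 / fc
    m = np.arange(-K1, K1 + 1)
    K = m.size
    t_k = np.round(np.arange(K) / (K * f1 * Tc)) * Tc   # spread, clock-aligned
    E = np.exp(2j * np.pi * np.outer(t_k, m * f1))
    Omega = np.diag(np.exp(2j * np.pi * m * f1 * Tc))
    D = (E @ Omega @ np.linalg.inv(E)).real             # real for symmetric harmonics
    x = np.zeros(K)
    for _ in range(max_iter):
        cycles = [transition(x[k], t_k[k], Tc, rhs, drhs_dv) for k in range(K)]
        phi = np.array([c[0] for c in cycles])
        dphi = np.array([c[1] for c in cycles])          # diagonal of J when N = 1
        F = D @ x - phi
        if np.linalg.norm(F) < tol:
            break
        x = x + np.linalg.solve(D - np.diag(dphi), -F)   # Newton update
    return t_k, x, np.linalg.norm(F)    # residual of the last evaluated iterate

a, A1, Ac, f1, fc = 2.0, 1.0, 1.0, 1.0 / 21.0, 1.0

def make_circuit(b):
    rhs = lambda t, v: (-a * v - b * v**3 + A1 * np.cos(2 * np.pi * f1 * t)
                        + Ac * np.cos(2 * np.pi * fc * t))
    drhs = lambda t, v: -a - 3 * b * v**2
    return rhs, drhs

def linear_steady_state(t):
    """Analytic steady state of the linear case b = 0."""
    w1, wc = 2 * np.pi * f1, 2 * np.pi * fc
    return (A1 * np.exp(1j * w1 * t) / (a + 1j * w1)
            + Ac * np.exp(1j * wc * t) / (a + 1j * wc)).real

t_lin, x_lin, res_lin = mft_solve(*make_circuit(0.0), f1, fc, K1=3)
t_nl, x_nl, res_nl = mft_solve(*make_circuit(0.5), f1, fc, K1=3)
```

For the linear case the computed samples can be compared against the analytic steady state; for the nonlinear case the result is only as accurate as the harmonic truncation allows, so the sketch reports the Newton residual rather than an exact reference.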
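Each iteration of GMRES requires a matrix-vector multiplication. For a $KN$-vector $q$, the term $(D_T \otimes I_N)\,q$ can be computed from the factored form $D_T = \Gamma^{-1}\Omega_T\Gamma$.
Let $q$ be partitioned into $q = [q_1^T, \ldots, q_K^T]^T$, where $q_k$ is an $N$-vector for $1 \le k \le K$. Then, because $J$ is block-diagonal,

$$Jq = \left[\left(\frac{\partial\phi(v(t_1), t_1, t_1+T_c)}{\partial v(t_1)}\,q_1\right)^T, \ldots, \left(\frac{\partial\phi(v(t_K), t_K, t_K+T_c)}{\partial v(t_K)}\,q_K\right)^T\right]^T$$

The calculation of each product $\frac{\partial\phi(v(t_k), t_k, t_k+T_c)}{\partial v(t_k)}\,q_k$ can be carried through matrix-vector multiplication and backsolving without explicitly forming the matrix.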
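The structure of the matrix-vector product can be sketched without forming the $KN \times KN$ matrices. In the example below the functions `kron_delay_times` and `block_diag_times` are hypothetical helpers, and the diagonal blocks of $J$ are stored explicitly only so the result can be compared against a dense reference; in a matrix-implicit solver each block product would instead come from the transient integration.

```python
import numpy as np

def kron_delay_times(D, q, N):
    """Compute (D kron I_N) q without forming the Kronecker product."""
    K = D.shape[0]
    return (D @ q.reshape(K, N)).reshape(K * N)

def block_diag_times(J_blocks, q):
    """Compute J q for block-diagonal J given as an array of shape (K, N, N)."""
    K, N, _ = J_blocks.shape
    return np.einsum("kij,kj->ki", J_blocks, q.reshape(K, N)).reshape(K * N)

rng = np.random.default_rng(1)
K, N = 5, 3
D = rng.standard_normal((K, K))
J_blocks = rng.standard_normal((K, N, N))
q = rng.standard_normal(K * N)

# Jacobian-vector product ((D kron I_N) - J) q, computed block by block.
y = kron_delay_times(D, q, N) - block_diag_times(J_blocks, q)

# Dense reference, for checking only.
J_dense = np.zeros((K * N, K * N))
for k in range(K):
    J_dense[k * N:(k + 1) * N, k * N:(k + 1) * N] = J_blocks[k]
y_ref = (np.kron(D, np.eye(N)) - J_dense) @ q
```

The reshaping works because $q$ is ordered as $K$ consecutive blocks of length $N$, matching the partition $q = [q_1^T, \ldots, q_K^T]^T$.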
For many problems, the GMRES algorithm is not efficient for solving equation (21) without an effective preconditioner. To analyze the reason, consider the case where the state transition function of the circuit, over one clock cycle, is approximately linear, that is, $\phi(x, t, t+T_c) \cong Hx$. Linear circuits are an obvious example of a case where this is true, and while nonlinear circuits will have nonlinear state-transition functions, if the method performs poorly for linear circuits it surely will not work well for nonlinear circuits either. However, many nonlinear circuits have a state-transition function that is nearly linear, a fact which is exploited below to construct an effective preconditioner. The convergence of the GMRES method depends on the location of the eigenvalues of the Jacobian matrix, $(D_T \otimes I_N) - J$.
The following lemmas about the properties of Kronecker products are needed to perform the formal analysis.
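Lemma 5.1 If $A_1, A_2, \ldots, A_p \in F^{n \times n}$ and $B_1, B_2, \ldots, B_p \in F^{m \times m}$, then

$$(A_1 \otimes B_1)(A_2 \otimes B_2)\cdots(A_p \otimes B_p) = (A_1 A_2 \cdots A_p) \otimes (B_1 B_2 \cdots B_p)$$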
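The mixed-product property of Kronecker products stated in Lemma 5.1 can be verified numerically. This is a small sketch with random square matrices; the sizes `n`, `m`, `p` are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, p = 3, 2, 4
A = [rng.standard_normal((n, n)) for _ in range(p)]
B = [rng.standard_normal((m, m)) for _ in range(p)]

# Left-hand side: product of the Kronecker products.
lhs = np.eye(n * m)
for Ai, Bi in zip(A, B):
    lhs = lhs @ np.kron(Ai, Bi)

# Right-hand side: Kronecker product of the products.
rhs = np.kron(np.linalg.multi_dot(A), np.linalg.multi_dot(B))
```

Both sides agree to rounding error, which is the identity used to move $\Gamma$ and $\Gamma^{-1}$ outside the Jacobian in the analysis.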
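The proof is as follows. For linear circuits, the diagonal blocks of $J$ are the same. Denote a diagonal block as $H$, so that $J = I_K \otimes H$; then the Jacobian matrix is equal to

$$(D_T \otimes I_N) - (I_K \otimes H) = (\Gamma^{-1} \otimes I_N)\,\bigl\{(\Omega_T \otimes I_N) - (I_K \otimes H)\bigr\}\,(\Gamma \otimes I_N) \qquad (29)$$

The intermediate steps, equations (24) through (28), follow from $D_T = \Gamma^{-1}\Omega_T\Gamma$ and the lemmas: equation (24) to equation (25) holds because of $I_N = I_N I_N I_N$ and Lemma 5.1, equation (26) to equation (27) holds due to Lemma 5.2(b), and equation (27) to equation (28) holds due to Lemma 5.2(a). Since $(\Gamma^{-1} \otimes I_N)$ is unitary and is the inverse of $(\Gamma \otimes I_N)$, the right-hand side of equation (29) has the same spectrum as $(\Omega_T \otimes I_N) - (I_K \otimes H)$.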
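The preceding analysis suggests a good way of preconditioning for solving the Newton equation (21). Solving equation (21) is equivalent to solving

$$\bigl\{(\Omega_T \otimes I_N) - (\Gamma \otimes I_N)\,J\,(\Gamma^{-1} \otimes I_N)\bigr\}\,\gamma = (\Gamma \otimes I_N)\,b$$

where $\gamma = (\Gamma \otimes I_N)\,\Delta\hat{v}$. The preconditioner $P$ is obtained by replacing $J$ with $I_K \otimes \bar{J}$, that is, $P = (\Omega_T \otimes I_N) - (I_K \otimes \bar{J})$, where $\bar{J}$ is the average, for $i = 1, \ldots, K$, of the diagonal blocks $J_i$ of $J$.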
In particular, if the single-cycle state-transition function is linear and time invariant, then the Newton equation can be solved in a single GMRES iteration. Note that the preconditioner presented here is effective if the Jacobian of the state-transition function is nearly constant over multiple cycles. The circuit behavior inside each clock cycle is hidden from the preconditioner. This is not the case in, for example, the time- or frequency-averaged preconditioners typically used in modern harmonic balance codes. For this reason the preconditioner presented here may perform well under much weaker assumptions about the circuit behavior, in particular at higher power levels.
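For each GMRES iteration, a system $Pu = v$ has to be solved. Since $P$ is block diagonal, this requires solving a sequence of $K$ systems of size $N$, one per diagonal block of $P$, of the form $(e^{j2\pi\omega_i T_c} I_N - \bar{J})\,u_i = v_i$ for $i = 1, \ldots, K$.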
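The effect of the preconditioner on the spectrum can be explored with small dense matrices. The sketch below is illustrative and rests on stated assumptions: the function `build_jacobian_and_preconditioner` is hypothetical, $\Gamma$ is taken as a unitary DFT matrix, the block $\bar{J}$ is taken as the average of the diagonal blocks of $J$, and the preconditioner is expressed in the time domain as $(\Gamma^{-1} \otimes I_N)\{(\Omega_T \otimes I_N) - (I_K \otimes \bar{J})\}(\Gamma \otimes I_N)$ so that it can be compared directly with the Jacobian.

```python
import numpy as np

def build_jacobian_and_preconditioner(Gamma, omega_diag, J_blocks):
    """Return dense A = (D_T kron I_N) - J and a preconditioner P (checking only)."""
    K, N, _ = J_blocks.shape
    I_N = np.eye(N)
    G = np.kron(Gamma, I_N)
    G_inv = np.kron(np.linalg.inv(Gamma), I_N)
    Omega = np.kron(np.diag(omega_diag), I_N)
    J = np.zeros((K * N, K * N), dtype=complex)
    for k in range(K):
        J[k * N:(k + 1) * N, k * N:(k + 1) * N] = J_blocks[k]
    A = G_inv @ Omega @ G - J
    J_bar = J_blocks.mean(axis=0)            # assumption: averaged diagonal block
    P = G_inv @ (Omega - np.kron(np.eye(K), J_bar)) @ G
    return A, P

rng = np.random.default_rng(3)
K1, N = 2, 3
idx = np.arange(-K1, K1 + 1)
K = idx.size
Gamma = np.exp(2j * np.pi * np.outer(idx, idx) / K) / np.sqrt(K)   # unitary DFT
omega_diag = np.exp(2j * np.pi * idx * 1e-3)                       # f1*Tc = 1e-3

# Hypothetical per-cycle Jacobian block H: symmetric, spectral norm 0.3.
H0 = rng.standard_normal((N, N))
H0 = H0 + H0.T
H = 0.3 * H0 / np.linalg.norm(H0, 2)
same_blocks = np.repeat(H[None, :, :], K, axis=0)
perturbations = np.array([0.01 * M / np.linalg.norm(M, 2)
                          for M in rng.standard_normal((K, N, N))])

A_same, P_same = build_jacobian_and_preconditioner(Gamma, omega_diag, same_blocks)
A_pert, P_pert = build_jacobian_and_preconditioner(Gamma, omega_diag,
                                                   same_blocks + perturbations)

preconditioned_same = np.linalg.solve(P_same, A_same)
eigs_pert = np.linalg.eigvals(np.linalg.solve(P_pert, A_pert))
max_deviation = np.max(np.abs(eigs_pert - 1.0))
```

When all diagonal blocks are identical the preconditioned matrix is the identity, consistent with the single-iteration claim; with slightly different blocks the eigenvalues remain clustered near 1, which is the situation in which GMRES converges quickly.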
5. Improving Newton convergence
Rapid convergence of Newton's method can only be assured with a good initial estimate. To obtain one, the present invention first calculates the periodic steady-state response of the circuit with the clock signal applied, while suppressing other non-DC signals. Using the steady-state solution as an operating point, a small-signal analysis is performed by treating the non-clock fundamentals as small signals. As a result of the small-signal analysis, amplitudes at $f_s + k_s f_c$, for $-K_s \le k_s \le K_s$, $1 \le s \le S$, are generated. These amplitudes are transformed into time-domain initial conditions via an inverse multidimensional discrete Fourier transform (DFT). At higher input power levels, using a Newton continuation method, with the amplitude of the non-clock signals as the continuation parameter, is generally effective in securing convergence.
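The conversion from small-signal amplitudes to time-domain initial conditions can be sketched for a single tone ($S = 1$) with evenly spaced sample points. The helper `amplitudes_to_samples` and the amplitude values are hypothetical; for several tones, `numpy.fft.ifftn` would play the same role along each dimension.

```python
import numpy as np

def amplitudes_to_samples(c):
    """Inverse DFT of harmonic amplitudes.

    c[m + K1] is the amplitude of harmonic m = -K1..K1; the result is the
    envelope at K = 2*K1 + 1 evenly spaced points k = 0..K-1 of one period.
    """
    K = c.size
    return K * np.fft.ifft(np.fft.ifftshift(c))

K1 = 3
m = np.arange(-K1, K1 + 1)
K = m.size

# Hypothetical small-signal result: only the first harmonic is excited.
c = np.zeros(K, dtype=complex)
c[K1 + 1] = 0.5 - 0.2j              # harmonic m = +1
c[K1 - 1] = np.conj(c[K1 + 1])      # harmonic m = -1, so the samples are real

x0 = amplitudes_to_samples(c)

# Direct evaluation of the Fourier sum, for comparison.
k = np.arange(K)
x_direct = np.exp(2j * np.pi * np.outer(k, m) / K) @ c
```

The real part of `x0` would then serve as the starting vector for the Newton iteration.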
After the solution is converged, the values
Then for each $k_c$, where $-K_c \le k_c \le K_c$, consider the $KN$-vector $V(\cdot, k_c)$, which is the collection of all $N$-vectors $V(k_1, \ldots, k_S, k_c)$, where $-K_1 \le k_1 \le K_1, \ldots, -K_S \le k_S \le K_S$ (the actual order is determined by the Fourier transform),
Forming {(Ω(T)−1Γ)IN}
The synchronized time step requirement may not be easily met in practice. One alternative is to use interpolation schemes. However, these schemes potentially lose accuracy. Another alternative is to utilize integration instead of multidimensional discrete Fourier transforms. Specifically, it is easy to verify that
where $E_p$ is a $KN \times N$ block matrix whose $p$th $N \times N$ block is $I_N$ and whose other blocks are zero, and $p$ is determined by $(k_1, \ldots, k_S)$ from the Fourier transform. Calculating equation (34) does not require synchronized time points. The total cost of calculating $V(\cdot, k_c)$ is $K$ $KN$-vector integrations plus one final Fourier transform. However, it might be more expensive, since integrations normally cost more than Fourier transforms.
The first example is a low-pass switched-capacitor filter with a 4 kHz bandwidth and 238 nodes, resulting in 337 equations. To analyze this circuit, the improved MFT analysis of the present invention was performed with an 8-phase 100 kHz clock and a 1 V sinusoidal input at 100 Hz.
The 1000 to 1 clock to signal ratio makes this circuit difficult for traditional circuit simulators to analyze. In the improved MFT method, three harmonics were used to model the input signal. The eight-phase clock resulted in the need to use about 1250 timepoints in each transient integration. This brings the total number of variables solved by the analysis to slightly less than three million (337×(2×3+1)×1250=2,948,750). The simulation took a little less than 20 minutes CPU time to finish, on a Sun UltraSparc1 workstation with 128 Megabyte memory and a 167 MHz CPU clock.
The second example is a high-performance image rejection receiver. It consists of a low-noise amplifier, a splitting network, two double-balanced mixers, and two broadband Hilbert transform output filters combined with a summing network that is used to suppress the undesired side-band. A limiter in the LO path is used for controlling the amplitude of the LO. It is a rather large RF circuit that contains 167 bipolar transistors and uses 378 nodes. This circuit generates 987 equations in the simulator.
To determine the intermodulation distortion characteristics, the circuit was driven by a 780 MHz LO and two closely spaced 50 mV RF inputs, at 840 MHz and 840 MHz + 10 kHz, respectively. Three harmonics were used to model each of the RF signals. 200 time points were used in each transient clock-cycle integration, which is considered conservative in terms of accuracy for this circuit. As a result, nearly ten million unknowns (987×(2×3+1)^2×200=9,672,600) were generated. The simulation took 55 CPU minutes to finish on a Sun UltraSparc10 workstation with 128 Megabytes of physical memory and a 300 MHz CPU clock.
To understand the efficiency of the improved MFT method of the present invention, consider that traditional transient analysis would need at least 80,000 cycles of the LO to compute the distortion, a simulation time of over two days. In contrast, the MFT method of the present invention is able to resolve very small signal levels, such as the 5th order distortion products shown in
Solving the MFT equations by direct factorization methods is also impractical, as the storage needed for the factored rank-50,000 (987×(2×3+1)^2=48,363) MFT Jacobian of Equation 19 is several gigabytes. Forming the Jacobian matrix by direct methods would also require computation time proportional to the cost of 50,000 transient integration cycles, again a number on the order of days.
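The problem sizes quoted for the two examples follow from a simple product, which can be reproduced as below. The function `mft_unknowns` is a hypothetical helper, and it assumes the same number of harmonics for every information tone.

```python
def mft_unknowns(equations, harmonics, tones, timepoints_per_cycle=1):
    """Equations x (2*harmonics + 1)**tones sample points x time points per cycle."""
    sample_points = (2 * harmonics + 1) ** tones
    return equations * sample_points * timepoints_per_cycle

# Switched-capacitor filter: 337 equations, one tone, 3 harmonics, 1250 time points.
filter_unknowns = mft_unknowns(337, 3, 1, 1250)

# Image rejection receiver: 987 equations, two tones, 3 harmonics, 200 time points.
receiver_unknowns = mft_unknowns(987, 3, 2, 200)

# Rank of the receiver's MFT Jacobian (one unknown per equation per sample point).
receiver_jacobian_rank = mft_unknowns(987, 3, 2)
```

These evaluate to 2,948,750, 9,672,600 and 48,363 respectively, matching the figures given in the text.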
Step 804 selects a set of evenly spaced distinct time points shown as the circle dots in
Step 806 defines a set of reference time points shown as the square dots in
Step 808 establishes a first set of relationships between the values at the distinct time points and the values at the reference time points. The details of step 808 can be found in equation (8) and related descriptions.
Step 810 establishes a second set of relationships between the values at the distinct time points and the values at the reference time points. The details of step 810 can be found in equation (9) and related descriptions.
Step 812 combines the first and second sets of relationships to establish a system of equations that contains the values at the distinct time points only. The details of step 812 can be found in equations (10) and (18) and related descriptions.
Step 814 finds (or generates) the simulated responses of the circuit at the distinct time points by solving the established system of equations. If a circuit includes N internal circuit nodes and M outputs, step 814 can find (or generate) the simulated responses for all of the N internal circuit nodes and M outputs. The details of step 814 can be found in equations (18)-(22) and (30) and related descriptions.
Step 904 selects a set of estimated values to reflect estimated circuit responses at the distinct time points. The details of step 904 can be found in Section 5.
Step 906 establishes a system of linear equations at the estimated values. The details of step 906 can be found in equation (21) and related descriptions.
Step 908 preconditions the system of linear equations to improve the convergence of the solution to the system of linear equations. The details of step 908 can be found in Section 4.
Step 910 solves the system of linear equations to generate the correction values used to adjust the estimated circuit responses at the distinct time points. The details of step 910 can be found in equations (21)-(22) and related descriptions.
Step 912 adjusts the estimated values as newly estimated values to reflect the estimated circuit responses at the distinct time points. The details of step 912 can be found in equations (21)-(22) and related descriptions.
Step 914 determines whether the adjusted estimated values have an acceptable accuracy to represent the circuit responses. If the determination is negative, the process is led to step 906. If the determination is positive, the process is led to step 916. The estimated values and adjusted estimated values are in time domain.
Step 916 converts the estimated values from time domain to frequency domain. The details of step 916 can be found in Section 6, equations (31)-(34).
As shown in
The hard disk 1008 is coupled to disk drive interface 1006, monitor display 1012 is coupled to display interface 1010; and mouse 1016 and keyboard 1018 are coupled to bus interface 1014. Coupled to system bus 1001 are processing unit 1002, memory device 1004, disk drive interface 1006, display interface 1010, and network communication interface 1020.
The memory device 1004 stores data and programs. Operating together with disk drive interface 1006, hard disk 1008 also stores data and programs. However, memory device 1004 has faster access speed than hard disk 1008, while hard disk 1008 has higher capacity than memory device 1004.
Operating together with the display interface 1010, display monitor 1012 provides visual interfaces between the programs being executed and users, and displays the outputs generated by the programs.
Operating together with bus interface 1014, mouse 1016 and keyboard 1018 provide inputs to computer system 1000.
The network communication interface 1020 provides an interface between computer system 1000 and network 104 in accordance with predetermined networking protocols.
The processing unit 1002, which may include more than one processor, controls the operations of computer system 1000 by executing the programs stored in memory device 1004 and hard disk 1008. The processing unit also controls the transmissions of data and programs between memory device 1004 and hard disk 1008.
In the present invention, the program for performing the steps shown in
The present invention improves the existing MFT method. The MFT method of the present invention is an efficient approach to analyzing multi-frequency nonlinear effects such as intermodulation distortion. Making the MFT method computationally efficient on problems of engineering interest required careful construction of the delay matrix, matrix-implicit Krylov subspace iterative linear solvers, and a preconditioner tailored to the MFT method and the circuits it typically analyzes. As a result, nonlinear systems comprising tens of millions of unknowns can be solved in less than an hour with computational resources commonly available to engineering designers.
One salient advantage of the MFT method in the present invention is in computing the functions Φ and the product of the Jacobian of Φ with some vectors. Both computations are essentially the solution of an initial value problem. Each application of the operator DT
While the invention has been illustrated and described in detail in the drawing and foregoing description, it should be understood that the invention may be implemented through alternative embodiments within the spirit of the present invention. Thus, the scope of the present invention is not intended to be limited to the illustration in this specification, but is to be defined by the appended claims.
Number | Date | Country | |
---|---|---|---|
Parent | 09570709 | May 2000 | US |
Child | 12372608 | US |