Multi-Dimensional Error Definition, Error Measurement, Error Analysis, Error Function Generation, Error Information Optimization, and Error Correction for Communications Systems

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention is related to multi-dimensional error definition, error measurement, error analysis, error function generation, error information optimization, and error correction for communication systems.

2. Background Art

System non-linearities may be compensated in real time if the non-linearities are known and fully characterized a priori or if they can be approximated “on the fly.”

“On the fly” calculations consume significant power and require large silicon real estate. Compensation techniques based on a priori knowledge, on the other hand, typically rely heavily on memory and DSP (Digital Signal Processing) algorithms.

Feedback techniques compensate for long term effects such as process drift, temperature change, aging, etc. However, because a feedback loop has finite bandwidth, practical considerations of sample rate and stability limit the ability of feedback techniques to correct real time signal anomalies.

Feed forward techniques may, in theory, correct for virtually every non-linearity in a system. However, feed forward techniques come at a significant price in complexity as well as efficiency.

There is a need therefore for efficient error definition, error measurement, error analysis, error function generation, error information optimization, and error correction techniques.

BRIEF SUMMARY OF THE INVENTION

Novel techniques are provided herein. These techniques can be applied to a myriad of applications for which an input to output transfer characteristic must be corrected or linearized. This disclosure will show that systems that process signals or inputs, which can be described by complex math in 2 dimensions, can also possess a corresponding complex output signal description that may be compared to some ideal desired output signal. The difference between the measured output and the ideal output is associated with an error function or error data, D_ε_R, and can be exploited to compensate the system.

D_ε_R begins as a raw data set from the above described comparison and morphs into a compact mathematical description which eliminates the requirement to store all compared data points of the original difference measurements. This compact mathematical description may further be optimized in 2, 3, or higher order dimensions to minimize the amount of data stored using a variety of techniques and principles associated with approximation theory, data fitting, interpolation theory, information theory, tensor analysis, spline theory, as well as a plethora of disciplines related to geometric modeling.

This approach permits a convenient partition of resources in the VPA (Vector Power Amplification) application. Some extensive data processing can be accomplished off line in ‘non real time’, which characterizes the compact mathematical descriptions. Once the ‘off line’ number calculation is complete, the compact formulation is stored in the form of coefficients in a memory. The memory possesses a description of D_ε_R that hardware can utilize to provide corrective compensation in a ‘real time’ application very efficiently.
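The partition described above can be sketched as follows. This is a minimal illustration only, assuming a one-dimensional magnitude error fitted by an ordinary polynomial; the function names, the polynomial degree, and the synthetic compressive response are hypothetical and are not taken from this disclosure.

```python
import numpy as np

def characterize_offline(r_in, measured, ideal, degree=5):
    """Non-real-time step: reduce the raw error data to polynomial coefficients."""
    raw_error = measured - ideal                 # raw error data (measured minus ideal)
    return np.polyfit(r_in, raw_error, degree)   # only degree + 1 numbers are stored

def evaluate_realtime(coeffs, r):
    """Real-time step: evaluate the stored description with Horner's rule."""
    acc = 0.0
    for c in coeffs:          # np.polyfit returns the highest-order coefficient first
        acc = acc * r + c
    return acc

# Hypothetical compressive magnitude response used only to exercise the sketch.
r_in = np.linspace(0.0, 1.0, 200)
ideal = r_in
measured = r_in - 0.2 * r_in**3

coeffs = characterize_offline(r_in, measured, ideal)
estimated_error = evaluate_realtime(coeffs, 0.5)
```

In this sketch, 200 raw error samples are replaced by six stored coefficients, and the real-time step needs only one multiply and one add per coefficient.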

VPA based designs utilizing these techniques achieve excellent performance metrics while maintaining high power efficiency for many types of waveform standards in a unified design, with only a modest amount of memory and associated correction hardware. This blend of attributes is exceedingly desirable and can be flexibly tailored using the principles provided herein.

According to embodiments of the present invention, error can be described by D_ε_R, processed, and geometrically interpreted. Raw error data is assigned to coordinates of a 2, 3, or higher order dimensional space. Then contours, surfaces, and manifolds can be constructed within the space to fit the raw data in an optimal sense subject to some important criteria. These contours, surfaces, and manifolds can be described in terms of convenient functions with a minimum amount of memory over specified domains, and may be used to reconstruct interpolated data over a continuum of points, reconstructing new data sets many orders of magnitude larger than the original memory used to describe the various original geometrical structures and their domains. Furthermore, the memory based algorithm can execute in ‘real time’ hardware very efficiently, thereby rendering practical designs.

All manners of compensation requirements are contemplated by the techniques provided herein. Temperature effects, power supply drift, power gain variations, process variation and many other imperfections can be addressed as parameters of the compensation space or dimensions of the compensation space. In this manner, a high order geometrical interpretation of D_ε_R can characterize system errors for virtually all imperfections, including the non-linearities of the input/output transfer characteristic. This does not preclude the simultaneous use of slow feedback adjustments to assist the open loop algorithm to enhance yield or increase performance.
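As an illustration of treating an imperfection as an additional dimension of the compensation space, the following sketch fits a single error surface over input magnitude and a normalized temperature. It is a hedged example only: the bivariate polynomial basis, the least-squares fit, the function names, and the synthetic error model are all assumptions made for illustration.

```python
import numpy as np

def design_matrix(r, temp, deg_r=3, deg_t=1):
    """Columns r**i * temp**j for 0 <= i <= deg_r and 0 <= j <= deg_t."""
    return np.column_stack([r**i * temp**j
                            for i in range(deg_r + 1)
                            for j in range(deg_t + 1)])

def fit_error_surface(r, temp, error, deg_r=3, deg_t=1):
    """Least-squares fit of the error over the (magnitude, temperature) plane."""
    A = design_matrix(r, temp, deg_r, deg_t)
    coeffs, *_ = np.linalg.lstsq(A, error, rcond=None)
    return coeffs

def eval_error_surface(coeffs, r, temp, deg_r=3, deg_t=1):
    """Evaluate the fitted surface at one or more (r, temp) points."""
    A = design_matrix(np.atleast_1d(r), np.atleast_1d(temp), deg_r, deg_t)
    return A @ coeffs

# Hypothetical raw error samples: compression that worsens with temperature.
rr, tt = np.meshgrid(np.linspace(0.0, 1.0, 25), np.linspace(-1.0, 1.0, 9))
r, temp = rr.ravel(), tt.ravel()          # temp is normalized to [-1, 1]
error = -0.2 * r**3 * (1.0 + 0.1 * temp)

coeffs = fit_error_surface(r, temp, error)
predicted = eval_error_surface(coeffs, 0.5, 0.5)[0]
```

Here 225 raw samples collapse to eight coefficients, and further parameters (supply voltage, gain state, and so on) could be added as extra factors in the basis under the same assumptions.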

Embodiments of the present invention provided herein advocate the compact formulation of correction and calibration functions which reduces memory requirements as well as computational time and associated math operations related to VPA operation. Embodiments are addressed to VPA applications. However, they can be applied to any linear or non-linear system which can be represented in an N order geometrical description.

Embodiments of the present invention introduce methods for characterizing a distorted space by mathematical functions or describing non-linearities of trajectories through the space by functions, apply these descriptions to some definition of an error function D_ε_R, then use D_ε_R to either linearize the space or correct non-linearities within the functional trajectories. The proposed techniques are specifically applied to D2P VPA compensation and correction. Typically, RF power amplifiers are compensated with real time feedforward or feedback loops which seek to reduce intermodulation products by ‘on the fly analog circuits’ or DSP algorithms. Such approaches do not reconstruct general mathematical functional descriptions of errors prior to execution. Rather, errors are manipulated point by point in virtual real time on the fly with algorithmic convergence determined by control loops. In contrast, the algorithms provided herein fully describe an N-dimensional space by mathematical functions such that the distortion of the space is completely known a priori without storing raw data from the space. Only the data required to describe spatial distortion is stored. This provides a significant advantage in data storage requirements to achieve a specified performance for the VPA.

According to embodiments of the present invention, error information is related to a geometric shape, or contour of the space, and significantly related to the ‘tension’ or ‘latent energy’ of the space, not directly to point wise errors encountered in time as the application runs. This approach permits much more efficient calculations which transform scalars, vectors, tensors, and indeed the space itself, and can be accommodated by linear algebra.

BRIEF DESCRIPTION OF THE FIGURES

Embodiments of the present invention will be described with reference to the accompanying drawings, wherein generally like reference numbers indicate identical or functionally similar elements. Also, generally, the leftmost digit(s) of the reference numbers identify the drawings in which the associated elements are first introduced.

FIG. 1 illustrates an example VPA system.

FIG. 2 illustrates functional interrelationships between parameters that affect a VPA system.

FIG. 3 illustrates an example Volterra model.

FIG. 4 illustrates an example feedback system.

FIG. 5 illustrates an example feedforward system.

FIG. 6 illustrates example WCDMA constellations with and without distortion.

FIG. 7 illustrates example EDGE constellations with and without distortion.

FIG. 8 illustrates a linear stimulus-linear response within a complex signaling plane.

FIG. 9 illustrates a linear stimulus-non linear response within a complex signaling space.

FIG. 10 illustrates an example Star Burst input.

FIG. 11 illustrates an example response for an input stimulus that is reversed in the complex plane after sweeping outward to the unit circle.

FIG. 12 illustrates an example 2 dimensional complex signal constellation.

FIG. 13 illustrates an example sample-by-sample error correction system.

FIG. 14 is an example that illustrates error distortion.

FIG. 15 is an example that illustrates input pre-correction.

FIG. 16 illustrates an example feedback approach.

FIG. 17 is a block diagram that illustrates an example system according to an embodiment of the present invention.

FIG. 18 illustrates an example partitioning of real time and non-real time portions of error correction algorithms to support the implementation of the system of FIG. 17.

FIG. 19 illustrates an example in Cartesian space.

FIG. 20 illustrates an example linear sweep of a constellation.

FIG. 21 illustrates an example three-dimensional graphical error description.

FIG. 22 illustrates example error for an input circle constellation stimulus.

FIG. 23 illustrates example error for a radial constellation stimulus.

FIG. 24 is an example graphical illustration of a 6-dimensional calibration space.

FIG. 25 illustrates an example error surface representation.

FIG. 26 is an example 2-dimensional view of a starburst calibration pattern.

FIG. 27 illustrates a single radial arm of the starburst calibration pattern of FIG. 26.

FIGS. 28-35 illustrate example magnitude and phase error gradients.

FIG. 36 illustrates an example radial sampling path.

FIG. 37 illustrates an example multi-radial sweep technique.

FIG. 38 illustrates example input circle constellations.

FIG. 39 illustrates example output constellations that correspond to the input constellations of FIG. 38.

FIG. 40 illustrates an example input sampling web.

FIG. 41 illustrates an example input having radial sweeps.

FIG. 42 is a process flowchart that illustrates a methodology for VPA error function characterization.

FIG. 43 is an example fitting illustration.

FIG. 44 is a heuristic example.

FIG. 45 is a graphical example.

FIG. 46 is an example that illustrates piecewise fitting.

FIG. 47 illustrates an example single cubic B spline.

FIG. 48 illustrates an example dissected spline.

FIG. 49 illustrates various components of an example spline.

FIG. 50 illustrates overlapping splines.

FIG. 51 illustrates overlapping splines.

FIG. 52 illustrates an example cardinal series expansion of the function x(t)=cos²( ).

FIG. 53 illustrates an example system, including a low-pass anti-alias filter, a sampler, and an interpolation and reconstruction filter.

FIG. 54 illustrates an example uniformly sampled cubic B spline.

FIG. 55 illustrates example patch grids.

FIG. 56 illustrates how a regular rectangular grid in 2-D plane projects onto a 3-D surface.

FIG. 57 illustrates an example transformation of a region in 3-D space.

FIG. 58 illustrates two plots of example magnitude error surfaces in the fundamental 3-D spatial kernel.

FIG. 59 illustrates a process flowchart for information minimization.

FIG. 60 illustrates a process flowchart for information minimization.

FIG. 61 illustrates an example multi-option gradient based weighting algorithm.

FIGS. 62-63 illustrate example output radial excitation responses in the 2D complex plane.

FIGS. 64-65 illustrate examples of a single averaged radial.

FIG. 66 illustrates a conic magnitude error surface and bucket phase error surface for a full power output VPA condition.

FIGS. 67-68 illustrate example magnitude and phase curve fits for an averaged collapsed surface radial using a polynomial fitting technique.

FIG. 69 illustrates an example of resulting approximate error after correction using the polynomial fitting technique.

FIG. 70 illustrates example magnitude and phase error surfaces.

FIG. 71 illustrates an example magnitude curve fit using the polynomial fitting technique.

FIG. 72 illustrates an example phase curve fit using the polynomial fitting technique.

FIG. 73 illustrates an example of resulting approximate error after correction using the polynomial fitting technique.

FIGS. 74-79 illustrate example error compensation results using an extended polynomial fitting technique.

FIGS. 80-84 illustrate example results using an extended polynomial fitting technique.

FIGS. 85-90 illustrate example results using an I and Q explicit polynomial fitting technique.

FIG. 91 illustrates a comparison of bit resolution versus RF attenuation for three example algorithms.

FIG. 92 illustrates a comparison of bit resolution versus RF attenuation for three example algorithms.

FIGS. 93-94 illustrate example results using the Least Squares Fit algorithm for a 0 dB attenuation case.

FIGS. 95-96 illustrate example results using the Minimax Fit algorithm for a 0 dB attenuation case.

FIGS. 97-98 illustrate example results using the Chebyshev Fit algorithm for a 0 dB attenuation case.

FIGS. 99-100 illustrate example results using the Least Squares Fit algorithm for a 22 dB attenuation case.

FIGS. 101-102 illustrate example results using the Minimax algorithm for a 22 dB attenuation case.

FIGS. 103-104 illustrate example results using the Chebyshev Fit algorithm for a 22 dB attenuation case.

FIGS. 105-106 illustrate example results using the Least Squares Fit algorithm for a 40 dB attenuation case.

FIGS. 107-108 illustrate example results using the Minimax Fit algorithm for a 40 dB attenuation case.

FIGS. 109-110 illustrate example results using the Chebyshev Fit algorithm for a 40 dB attenuation case.

FIGS. 111-114 illustrate example results using an upgraded multi radial algorithm for a 0 dB attenuation case.

FIGS. 115-118 illustrate example results using an upgraded multi radial algorithm for a 22 dB attenuation case.

FIGS. 119-122 illustrate example results using an upgraded multi radial algorithm for a 31 dB attenuation case.

FIG. 123 provides a performance summary of the multi radial algorithm for a WCDMA1-1 waveform.

FIGS. 124-139 illustrate example performance results using a radial starburst technique.

FIG. 140 provides a performance summary of the radial starburst technique for an EDGE waveform.

FIG. 141 illustrates functional relationships of calibration parameters.

FIG. 142 illustrates example calibration memory estimates for D_ε_R for WCDMA and EDGE.

FIG. 143 illustrates example memory requirements for operational calibration parameters.

FIG. 144 illustrates example memory requirements for modulation calibration parameters.

FIG. 145 illustrates example total memory requirements for WCDMA and EDGE.

FIG. 146 illustrates an example partitioning of functions of a correction algorithm into non-real time and real-time portions.

FIG. 147 illustrates an example approach to combine real time portions of a correction algorithm within a VPA.

FIG. 148 illustrates an example Type I interpolator.

FIG. 149 illustrates an example Type II interpolator.

FIG. 150 illustrates a Type I spline generator core.

FIG. 151 illustrates an example Type I spline interpolator.

FIG. 152 illustrates an example Type II spline generator core.

FIG. 153 illustrates a Type II spline interpolator.

FIG. 154 illustrates a filter based spline interpolator.

FIG. 155 illustrates a family of radials that can be obtained from a line equation.

FIG. 156 illustrates polar transformation.

FIG. 157 is a process flowchart that provides a high-level description of methods according to the present invention.

DETAILED DESCRIPTION OF THE INVENTION
1. INTRODUCTION
(A) SYSTEM PARADIGM
2. VOLTERRA BASED MODELS
3. REAL TIME COMPENSATION
4. OTHER APPROACHES
5. VPA CALIBRATION THEORY
(A) CALIBRATION WAVEFORMS
(i) Star Burst
(ii) Signaling Memory
(iii) Comments on PAPR, Heating, Sweep Rate
(iv) Error Mapping
(v) Real Time Feed Forward Algorithm Interaction with VPA Signal Processing Flow
6. BUILDING ERROR FUNCTIONS
(A) D_ε_R DATA AND FUNCTION
(B) EXAMPLE VPA ERROR PLOT FOR D_ε_R
(C) EFFICIENT ERROR GRADIENTS
(i) Error in the Complex Plane
(D) EXAMPLE OF RADIAL SAMPLING CONTOUR IN COMPLEX PLANE
(i) Averaged Weighting
(E) MULTIPLE RADIAL SAMPLING ARMS
(F) COMPARISON OF CIRCULAR SAMPLING AND RADIAL
(G) JOINT SAMPLING APPROACH
(i) Cross Correlation
(ii) Radial to Radial Correlation
(iii) Circle to Circle Correlation
(H) SAMPLING DENSITY AND SAMPLE RATE
(i) Sampling Density
(I) INTERMEDIATE SUMMARY
7. APPROXIMATION THEORY
(A) FITTING
(i) Polynomial Fitting
(B) INTERPOLATION
(i) Newton's Formula
(ii) Lagrange Interpolation
(iii) Hermitian (Osculatory) Interpolation
(C) APPROXIMATION BY ORTHOGONAL FUNCTIONS
(D) LEAST SQUARES REVISITED
8. PIECEWISE POLYNOMIALS AND SPLINES
(A) CUBIC B SPLINE
(B) SMOOTHING
(C) POLAR SPLINES
(D) RELEVANCE OF THE SAMPLING THEOREM
(E) B SPLINE TRANSFORM
9. SURFACE FITTING
(A) BI-CUBIC SURFACE B-SPLINES
10. PATCHES AND SPLOTCHES
(A) POLYNOMIAL BASED PATCH INTERPOLATION
11. TENSOR APPLICATIONS
(A) BASIC TENSOR OPERATOR APPLICATION
12. MISCELLANEOUS RELATED TOPICS
(A) INFORMATION CONTENT
13. EXAMPLES FOR CONSTRUCTION OF D_ε_R
(A) 2D EXAMPLE USING STARBURST
(B) 3D D_ε_R EXAMPLE
(C) IMPROVEMENTS IN 3-D POLYNOMIAL FIT
(D) EXPANDED 2-D TOOL PERFORMANCE
(E) EXTENSION OF 2-D TECHNIQUE WITH EXPLICIT I, Q COMPONENT POLYNOMIAL FIT
(F) HEAD TO HEAD COMPARISON FOR THREE GENERATION I ALGORITHMS
(G) UPGRADED MULTI RADIAL ALGORITHMS FOR WCDMA1-1
(H) EDGE APPLICATIONS
(I) INTERIM SUMMARY
14. HARDWARE FOR ‘REAL TIME OPERATION’
(A) POLYNOMIAL INTERPOLATOR
(B) SPLINE INTERPOLATOR
(i) Filter Based Spline Interpolator
15. SUMMARY
16. APPENDIX A: BIBLIOGRAPHY
17. APPENDIX B
(A) EQUATION OF A RADIAL LINE IN RECTANGULAR AND POLAR COORDINATES
(B) EQUATION OF CIRCLE IN CARTESIAN AND POLAR COORDINATES
(C) CUBIC B SPLINE IN POLAR COORDINATES
(D) GRADIENTS IN CARTESIAN AND CYLINDRICAL COORDINATES
18. APPENDIX C
(A) METHODS
(i) Input Signals
(ii) Analysis and Weighting Functions
(iii) Minimize
(iv) Output Measurement
(v) Formulated 2-D D_ε_R Description
(vi) Create Efficient Multi-Dimensional Descriptions of D_ε_R
(vii) Store Coefficients and Domains
(viii) Real Time Application
(ix) Features and Advantages of the Invention
(x) Measurement Claims
(xi) 2-D Fundamental Kernel Claims
(xii) Create 2-D and 3-D Description Claims
(xiii) N-Dimensional Processing Claims
(B) PRE DISTORTION CASE COMPARISON
(C) SUMMARY

1. INTRODUCTION

The following sections provide an overview of Vector Power Amplification (VPA) compensation with a theoretical basis and a presentation of various solutions at a high level. Some of these solutions are described in relative detail. Primarily, techniques of approximating hyper geometric manifolds in multi dimensional space and subset solutions are provided. These techniques can be used to generate an efficient functional description of the VPA transfer characteristic that is being corrected or compensated. Systems and methods of RF power transmission, modulation, and amplification, also referred to herein as Vector Power Amplification (VPA), are described in related U.S. patent application Ser. No. 11/256,172, filed Oct. 24, 2005, now U.S. Pat. No. 7,184,723 and U.S. patent application Ser. No. 11/508,989, filed Aug. 24, 2006, now U.S. Pat. No. 7,355,470, both of which are incorporated herein by reference in their entireties.

Solutions in 2 or 3 dimensions are discussed and the higher order spatial solutions are extrapolated inductively from the easier to assimilate lower order cases. The low order N=2, 3 solutions are minimal spatial geometries and are considered fundamental spatial kernels for higher order geometries consisting of many dimensions or parameters.

Sections 1 to 4 set the stage for understanding the basic need for compensation. Section 5 provides some background theoretical insights. Section 6 provides some detail of error gradients related to the transfer characteristic of the VPA. Section 7 provides a survey of basic approximation theory, which is relevant to estimating error gradients while minimizing data requirements. Section 8 covers spline concepts. Section 9 addresses surface fitting using splines and other techniques.

Section 13 provides some detailed examples which apply various theoretical aspects to solving VPA compensation problems. These examples incorporate ideas of approximation, interpolation, fitting, smoothing as well as correction functions, which are applied to enhance the VPA performance.

Embodiments provided herein permit the compact formulation of correction and calibration functions that reduce memory requirements as well as computational time and associated math operations related to VPA operation. Although embodiments are described with respect to VPA applications, they can also be readily applied to any non-linear system that can be represented using an N order geometrical description. Embodiments herein illustrate this general idea and support it with an appropriate mathematical basis. The applications of the theories disclosed herein for compensating non-linear electronic systems described in N order geometrical terms, are believed to be new and unique.

Embodiments of the present invention enable the characterization of distorted space or non-linearities of trajectories through space using mathematical functions. Further, embodiments permit the application of these characterizations to a given error function D_ε_R and the use of D_ε_R to either linearize the distorted space or correct non-linearities within the functional trajectories. Proposed techniques are particularly applicable to Direct2Power (D2P) VPA compensation and correction.

Typically, radio frequency (RF) power amplifiers are compensated with real time feedforward or feedback loops, which seek to reduce intermodulation products using ‘on the fly analog circuits’ or digital signal processing (DSP) algorithms. Such approaches do not reconstruct general mathematical functional descriptions of errors prior to execution. Rather, errors are manipulated point by point in virtual real time. In contrast, embodiments of the present invention generate an N dimensional space using mathematical functions such that the distortion of the space is completely known a priori without storing raw data from the space. Only the data required to describe spatial distortion is stored. This provides a significant advantage in data storage requirements to achieve a specified performance of the VPA.

(a) System Paradigm

A portion of this disclosure focuses on obtaining descriptions for D_ε_R. Such descriptions are required to obtain memory efficient implementations in practice.

Among the factors that influence D_ε_R are the following:

- Architecture
- Process
- Operational Calibration Parameters
- Modulation Calibration Parameters
- Functional Parameters

Consider the following system paradigm, which is illustrated in FIG. 1. This paradigm is useful for interpreting the interaction of the VPA and the compensation or calibration process. The following basic definition applies:

A VPA Calibration State consists of all variables, functions, configurations, and operations which can affect the system in such a manner to impact the initial state of the VPA and ongoing calibration status. This definition includes consideration of hardware, software, firmware, and support equipment.

The VPA Calibration State is defined by its mathematical and configuration status, which is generally related to:

- Calibration Parameters
  - Operational
  - Functional
- Modulation Calibration Parameters
- System Error Function (D_ε_R)
- Stimulus Profile
- Response Profile

The Calibration Parameters and Modulation Parameters interact with D_ε_R. This interaction is important and affects various significant attributes of the VPA. Functional and Operational Parameters include several parameters as shown below.

Operational Calibration Parameters

- MISO Power Supply
- Driver Supply
- MISO Bias 1 and 2
- Driver Bias 1 and 2
- VGA 1 and 2
- Waveform State
- Out Phasing Offset
- Filter State
- Sample Rate
- Interpolation Time Delays
- Auto Bias Function

Functional Calibration Parameters

- Gain
- Frequency
- Temperature
- Battery Voltage
- Load Conditions

Modulation calibration parameters include, among others, factors within the system which are required to align modulation domain functions such as:

- Gain Balance (for all system symmetric functions)
- Phase Balance (for all system symmetric functions)
- Orthogonality
- DC Offset

The functional interrelationships between Functional Calibration Parameters, Operational Calibration Parameters, and Modulation Parameters are illustrated in FIG. 2.

The techniques provided herein which apply to descriptions of D_ε_R can also be applied to Operational, Functional, and Modulation parameters. In fact, some of the parameters are explicitly inducted to augment the spatial geometric description of D_ε_R. In particular, functional parameters are always implicitly assumed to be a part of the D_ε_R space for $R^N$, where N>3. Such partitioning is not entirely arbitrary since it can affect the solution for efficient implementation in hardware, software, and firmware. However, such partitions are not essential for the theoretical disclosures which follow.

$\tilde{\Im}_{CP_F}$, $\tilde{\Im}_{CP_O}$, and $\tilde{\Im}_{CP_M}$, illustrated in FIG. 2, are known to be quantities which relate to physical parametrics, adjustments, process variations, etc. Therefore, they are random variables. In their treatment below, it can be assumed that each parameter is an average or mean value.

2. VOLTERRA BASED MODELS

An ideal VPA would reproduce a complex passband waveform, at power, given its baseband complex description. That is, the VPA possesses a complex baseband input waveform description and renders an output RF waveform, on frequency, at power. The complex envelope of desired, processed waveforms should not be distorted in the process of frequency translation and power amplification. However, practical solutions do create distortions in the complex signal envelope which must be corrected to some prerequisite standard.

The input waveform is a composite of in phase (I) and quadrature phase (Q) components which can be represented as an ordering of complex values in time:

x(t)=I(t)+jQ(t)

The magnitude of x(t) can be calculated from

$|x(t)| = \sqrt{(I(t))^2 + (Q(t))^2} = r(t)$

The phase of the signal is found from

$\Theta(t) = \arctan\left(\frac{Q(t)}{I(t)}\right)$

Θ(t) requires additional processing to resolve the angular quadrant due to ambiguities of the arctan( ) function. This is accomplished by tracking the signs of I(t) and Q(t). The complex envelope signal representation after amplification and frequency translation of x(t) can be written as:

$y_P(t) = a_0 r(t) e^{j(\Theta(t) + \omega_c t)}$

A complex phasor notation is adopted for convenience.
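The magnitude and phase calculations above, including the quadrant resolution obtained by tracking the signs of I(t) and Q(t), can be sketched as follows. This is an illustrative example only; the function names are hypothetical, and the two-argument arctangent is used here as one common way to resolve the quadrant.

```python
import math

def to_polar(i, q):
    """Return (r, theta) for a single complex sample x = i + jq."""
    r = math.hypot(i, q)        # sqrt(i**2 + q**2)
    theta = math.atan2(q, i)    # uses the signs of both q and i to select the quadrant
    return r, theta

def naive_phase(i, q):
    """Single-argument arctan: cannot distinguish opposite quadrants."""
    return math.atan(q / i)     # also undefined for i == 0

# A third-quadrant sample: the naive result lands in the first quadrant.
r, theta = to_polar(-1.0, -1.0)
ambiguous = naive_phase(-1.0, -1.0)
```

For the sample I = -1, Q = -1, the single-argument arctangent returns the same angle as for I = 1, Q = 1, which is the ambiguity referred to above.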

y_P(t) is never achieved in practice as given by the simple formula. The reason is that the VPA is a non-linear device. These non-linearities are manifested in the magnitude and phase domains. That is, r(t) and Θ(t) are both typically distorted and cannot be represented purely as previously described.

A model of particular theoretical interest and historical significance provides insight into the distortion mechanisms. This model, the Volterra model, is presented graphically in FIG. 3. y_P(t) in the non-linear system model of FIG. 3 can be written as:

$y_P(t) = a_0 x(t) + a_1 \int_0^\infty h_1(\tau_1)\, x(t-\tau_1)\, d\tau_1 + a_2 \int_0^\infty \int_0^\infty h_2(\tau_1, \tau_2)\, x(t-\tau_1)\, x(t-\tau_2)\, d\tau_1\, d\tau_2 + \cdots$

Notice that the first branch, with gain a₀, produces the exact desired result, y_P(t), if frequency translation is assumed (note that frequency translation is not explicitly described in the convolution kernels provided in FIG. 3 for simplification). Harmonics of the bandpass do arise from the non-linearities and are manifest at bandpass center frequencies of m·ω_c, where m=2 . . . ∞. Nevertheless, these channel harmonics are implicitly assumed and not explicitly illustrated by the block diagram. h₁(t) . . . h_n(t) are impulse responses which are typically fixed but may or may not be known. They vary as a function of temperature, power level, and other parameters.

The block diagram of FIG. 3 illustrates some important aspects of the model:

- Multi convolution kernels in the parallel branches give rise to components in the output signal related to x(t), x²(t), x³(t) . . . xⁿ(t);
- The branches may possess memory;
- Each branch can operate independently; and
- All branches operate in parallel and obtain a result using superposition at the output.

It is readily apparent by inspection that r(t) and Θ(t) cannot be easily preserved unless the branches in parallel with the first branch are null, or unless the parallel impulse responses are constants. Neglecting these two degenerate cases, significant distortion of the complex envelope is likely due to the impulse responses h₁(t), h₂(t) . . . .
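The effect of the parallel branches can be illustrated with a discrete-time, truncated version of the model. The sketch below is an assumption-laden illustration and not the model of FIG. 3 itself: it keeps only the first three branches, replaces the integrals with finite sums over a short memory, omits frequency translation, and uses hypothetical kernel values.

```python
import numpy as np

def volterra_output(x, a, h1, h2):
    """y[n] = a0*x[n] + a1*sum_k h1[k]*x[n-k] + a2*sum_{k,l} h2[k,l]*x[n-k]*x[n-l]."""
    a0, a1, a2 = a
    m = len(h1)                                   # memory depth of the kernels
    # Zero-pad the start so that x[n-k] is defined for n < k.
    xp = np.concatenate([np.zeros(m - 1, dtype=x.dtype), x])
    y = np.zeros(len(x), dtype=complex)
    for n in range(len(x)):
        window = xp[n:n + m][::-1]                # window[k] == x[n-k]
        y[n] = (a0 * x[n]
                + a1 * (h1 @ window)              # first-order (linear, with memory)
                + a2 * (window @ h2 @ window))    # second-order branch
    return y

x = np.array([1.0, 2.0, 3.0], dtype=complex)
h1 = np.array([0.0, 1.0])                         # hypothetical one-sample echo
h2 = np.array([[1.0, 0.0], [0.0, 0.0]])           # hypothetical memoryless square term

y_distorted = volterra_output(x, (2.0, 1.0, 0.5), h1, h2)
y_ideal = volterra_output(x, (2.0, 0.0, 0.0), h1, h2)   # parallel branches nulled
```

With the parallel branches nulled, the output is simply a₀x, so the envelope is preserved; with them active, each output sample also depends on earlier samples and on products of samples, which is the distortion mechanism discussed above.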

The remainder of this disclosure focuses on the effect of the parallel branches on the envelope and on compensation techniques according to the present invention which can reduce the relative impact of the parallel branches on performance. Certain approaches attempt to directly measure all the elements in each branch of the model of FIG. 3 and generate the Volterra model explicitly. Once the model is generated, suitable compensation may be designed to counter the effects of each branch. The approach provided herein acknowledges the mathematical constructs and properties of the model, along with the associated performance impacts. However, compensation techniques presented herein do not require generation of the Volterra model explicitly. Rather, embodiments of the present invention focus on partitioning of algorithms, which can be implemented in hardware and software, to obtain a profile of VPA non-linearities and thereby also provide for appropriate compensation. An important aspect is to provide such a profile efficiently, with minimal data requirements.

3. REAL TIME COMPENSATION

Amplifier and other system non-linearities may be compensated in real time if the non-linearities are known and fully characterized a priori or if they can be approximated ‘on the fly’. ‘On the fly’ calculations may be power consuming and require large silicon real estate, while compensation techniques based on a priori knowledge typically rely on memory and DSP algorithms.

Feedback techniques compensate for long term effects such as process drift, temperature change, aging, etc. However, a feedback loop's bandwidth is finite and therefore practical considerations of sample rate and stability limit these techniques in terms of correction of real time signal anomalies. Essentially, the non-linear signal will be transmitted into the medium before a loop can compensate or correct.

On the other hand, feed forward techniques are causal and may correct for virtually every non-linearity. However, this comes at a significant price in complexity as well as efficiency.

FIGS. 4 and 5 illustrate the two approaches at a high level. FIG. 4 illustrates a feedback approach. FIG. 5 illustrates a feedforward approach. As would be understood by a person skilled in the art, many variations exist between digital, analog, and hybrid implementations. Some associated and relevant features of the two approaches are listed below.

Feedback Processing

- Requires a feedback monitoring point which is sensitive to VSWR effects.
- Complex implementation to correct VSWR effects.
- Typically requires down-conversion technology in the loop with polar or rectangular processing.
- Down-converter requires calibration.
- Cannot typically correct wide band signals (i.e., ACPR side bands) without unreasonable sample rates and complexity and significant power consumption.
- Suitable for long term slowly varying average effects like temperature, process, aging, etc.

Feed Forward Processing

- Excellent correction capability.
- Relatively Complex to implement.
- Inefficient feed forward delay element required (τ_D).
- Inefficient corrector block or coupler required.
- Feed forward amplifier is typically power consumptive to create A_ε.

As noted, both approaches have merits and drawbacks. At the same time, both approaches are at odds with small, low cost, efficient handheld mobile phone applications.

4. OTHER APPROACHES

Other potential VPA approaches, not specifically addressed above, can be partitioned into the following categories:

- Process based compensation.
- Calibration using non real time processing techniques.
- Hybrid approaches.

Process based approaches eliminate non-linearities by virtue of the circuit design via good design practice and topological compensation.

Calibration techniques fully characterize all anomalies of the VPA in all potential system states. Then, the waveform processor uses the characterization to compensate waveform parameters precisely to produce compliant signals at power.

Hybrid approaches use multiple compensation approaches to affect compensation.

5. VPA CALIBRATION THEORY

As presented in Section 2, a VPA is inherently non-linear. This presents a formidable challenge to create linear amplification from non-linear technology. Hence, other schemes, topologies, and compensation approaches must be employed along with good circuit design to deliver a competitive solution.

Calibration, as referred to herein, relates to measurement of VPA non-linearity and anomalies, characterization of these imperfections for each potential state of operation, and compensation of all waveform parameters. The process can be generally described by the following steps:

- 1. Stimulate the VPA with a pre-designed waveform file, for each applicable calibration system state;
- 2. Measure the response of the VPA to the pre-designed waveform for each system state;
- 3. Characterize the response;
- 4. Generate a correction file;
- 5. Minimize the correction file; and
- 6. Apply the correction and test standards based waveforms.

(a) Calibration Waveforms

It is desirable to produce compliant standards based waveforms consistent with Market Requirements Document (MRD) goals. The MRD goals include the following waveforms:

- WCDMA
- GSM
- EDGE
- HSUPA

These standard waveforms each possess an associated constellation within the complex signaling plane. A constellation is a two dimensional space which geometrically represents the signal, its signaling states, and state transitions. The properties of a constellation provide insight to the signaling attributes of the waveform which may relate to non-linear amplifier behavior in an adverse manner. As such, the accuracy of the signaling states and the character of transitions between signaling states are of concern. Non-linearities can distort the constellation in a manner to modify the average metric of the space, diminishing the signaling state distance, creating metric asymmetries, and modifying transitional trajectories through the signaling space. These distortions generally result in the following:

- Increased P_e due to signaling state metric reduction.
- Increased P_e due to metric asymmetries.
- Increased timing jitter in synchronization processes, which in turn produces a greater probability of bit error or symbol error.
- Spectral domain anomalies such as increased ACPR, spurs, etc., which reduce channel capacity.

Calibration waveforms are required which exercise the complex signal plane to create stimulus constellations such that each possible point within the constellation space can be fully characterized, through the VPA.

FIGS. 6 and 7 illustrate example constellations of WCDMA and EDGE, respectively, with and without distortion. It is noted that WCDMA possesses greater amplitude domain variation (radial distance of the signal from the origin) while EDGE possesses greater numbers of required phase states (angular perturbations about the origin). Generally, these two waveforms represent the extremums of requirements. The wideband CDMA waveform possesses a large peak to average power ratio (PAPR), particularly when fully loaded with the maximum number of available channels. The EDGE waveform possesses a modest PAPR but requires excellent phase resolution and definition of each separate phase state within the complex signaling plane. Hence, calibration stimulus waveforms should account for these amplitude domain and phase domain fluctuations.

Both constellation types have been accompanied by undistorted and distorted examples to illustrate comparative very linear and very non-linear behavior. Of course, the distorted constellation arises due to system non-linearities which must be corrected.

FIG. 8 illustrates a linear stimulus-linear response within a complex signaling plane. Notice that the input points lie along a straight line within the plane. If these signal states were utilized as inputs for a perfectly linear device then observation of the output would yield a similar constellation, a perfectly straight line, with some relative phase shift (angular rotation with respect to the input due to time delay within the device) and perhaps a stretching or compression of the line due to gain of the device. However, what is more likely to occur can be illustrated by FIG. 9, which illustrates a linear stimulus—non-linear response within a complex signaling space.

Note in FIG. 9 that the input points lie along a straight line by design and that the output points lie along a distorted line, which indicates gross non-linearity for this example. Amplitude domain non-linearities as well as phase domain non-linearities can be observed in FIG. 9. Distortions which are manifest as equal transport delays for the I and Q portions of a signal and which can change over the short term (within a symbol or a few symbols) cannot be detected in this manner. However, these distortions are rare and compensation of the dominant amplitude and phase distortions will suffice for all applications related to the MRD.

Accordingly, a known input may be compared to the output to determine the nature of the non-linearity. Other input trajectories can also be utilized provided they permit comparison with the output. For instance, a circular input can be used as the stimulus, which would have the advantage of exercising all phases within the complex plane, while a radial line exercises all amplitudes. Generally, a blend of both attributes is ultimately required. Actual WCDMA or EDGE waveforms can also be used to accomplish the calibration.

In making stimulus selection, practical considerations of detection, synchronization, statistical coverage, and overall demodulation complexity should be considered. Furthermore, it is preferable to stimulate the system in such a manner to facilitate easy input-output comparison and subsequent data processing. Systematic patterns solve a myriad of concerns, including:

- Input Output Comparison;
- Data Sample Description;
- Complex Plane Coverage (for universal modulation support);
- Synchronization and Detection; and
- Data Reduction.

Hence, it is desirable to use such patterns as lines, circles, spirals, pseudo random patterns, etc. One consideration though is that of sweep rate and pattern discontinuities. The pattern should be swept at a rate commensurate with normal operation. This permits stimulus of all dominant frequencies contemplated by the models described in Section 2. That is, it is important to exercise the appropriate bandwidth of the VPA relating to intended applications. Any stimulus should contemplate this requirement. When such stimulus signals are swept however, discontinuities may present special problems unless accommodated. For instance, straight lines do possess discontinuities at end points. This can be generally overcome by a number of techniques such as blanking, sample time expansion, shaping the transitions (bowtie) rather than permitting discontinuities, retracing, variable sweep rate setup, and so forth. Similarly, concentric circles possess discontinuities when the sweep jumps from the circle with radius r_b to a circle with radius r_a. One technique is to use continuous ‘spiral graph’ like functions to sweep out the entire complex plane and minimize discontinuities. This technique is effective but may be less efficient or less systematic from a coverage point of view and does not necessarily reveal circuit memory anomalies. Nevertheless, many approaches will work with some possessing better coverage or mathematical description, while some may provide a more rapid sweep capability. The ‘ideal’ waveform is a function of many multi discipline concerns and such optimizations are often complex.

The sweep speed at the input must also consider the instrumentation available. That is, the frequency response of the test set up can become a factor.

(i) Star Burst

A simple pattern used to stimulate the complex plane is the Star Burst approach. The star burst is a collection of radial lines which form bisecting segments of the ‘normalized’ unit circle in the complex plane. FIG. 10 illustrates an example Star Burst input. As shown, 8 radial lines are illustrated with angular resolution of 45° for this example. The number of radial sample points and the angular separations are generally selected based on the coverage required. Notice that the samples can be arranged into circles as well as lines. Generally, WCDMA requires less angular resolution than EDGE. However, WCDMA does require a modest resolution for the radial sample points. For a multi-mode application, the ‘worst case’ dominates the calibration routine waveform consideration. Ultimately, however, the stimulus must possess some minimum coverage so that subsequent analysis can extract all of the necessary information. The essential required information is the deviation from linear behavior throughout the complex signaling plane.
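By way of a non-limiting illustration, a Star Burst stimulus of the kind described above may be sketched as follows. The function name `starburst`, its parameters, and the choice of 16 evenly spaced radial samples are assumptions made for this example only; an actual calibration waveform would select the radial and angular resolution according to the coverage required.

```python
import cmath
import math


def starburst(n_radials=8, n_radial_samples=16):
    """Return one list of complex samples per radial line.

    Each radial is swept outward from near the origin to the normalized
    unit circle. Names and defaults are illustrative assumptions.
    """
    lines = []
    for r_idx in range(n_radials):
        # 8 radials give the 45 degree angular resolution of the example.
        angle = 2.0 * math.pi * r_idx / n_radials
        line = [
            cmath.rect((s + 1) / n_radial_samples, angle)
            for s in range(n_radial_samples)
        ]
        lines.append(line)
    return lines


pattern = starburst()
print(len(pattern), "radials,", len(pattern[0]), "samples per radial")
```

Because every radial uses the same set of radii, the same samples can also be read out ring by ring, which corresponds to the observation that the samples can be arranged into circles as well as lines.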

(1) Sample Distribution

Each waveform possesses a required sample distribution for coverage considerations. In addition, VPA non-linearities may strongly suggest sample spacing distribution based on location of the errors contemplated above. The ‘essential’ information exists only at points in the complex plane (normalized) where the output does not match the input. That is, points where non-linearities are significant possess the greatest information content. Hence, these places within the complex plane can and should possess greater densities of samples. In a scenario where samples are evenly distributed within the complex plane, the regions of greatest non-linearity will drive the minimum sample spacing consideration. One measure or metric for sample density adjustment is to adjust sample density based on input versus output data differences. This differentiation technique will automatically establish the regions of greatest error to be compensated.

The method of establishing input-output error metrics as a reference for sample distribution is a first order technique whenever the error file is generated from the difference signal. A difference of zero requires minimal sampling while a large difference may require greater sampling density.
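The first order technique described above may be sketched as a simple budget allocation, in which each region of the complex plane receives samples in proportion to its measured input-output difference. The function `allocate_samples`, the per-region minimum, and the example error values (in arbitrary units) are hypothetical and serve only to illustrate the idea.

```python
def allocate_samples(region_errors, total_samples, min_per_region=1):
    """Split a sample budget across regions in proportion to error size.

    region_errors: non-negative input-output difference per region.
    Every region keeps at least min_per_region samples, so regions with
    zero difference still receive minimal sampling.
    """
    n_regions = len(region_errors)
    spare = total_samples - min_per_region * n_regions
    if spare < 0:
        raise ValueError("budget too small for the per-region minimum")
    total_error = sum(region_errors)
    if total_error == 0:
        # No measured difference anywhere: spread the budget evenly.
        shares = [spare / n_regions] * n_regions
    else:
        shares = [spare * e / total_error for e in region_errors]
    counts = [min_per_region + int(s) for s in shares]
    # Samples lost to rounding go to the regions of greatest error first.
    leftover = total_samples - sum(counts)
    order = sorted(range(n_regions), key=lambda i: region_errors[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


counts = allocate_samples([0.0, 1.0, 2.0, 5.0], total_samples=44)
print(counts)
```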

In addition to input-output difference metric, the rate of change of the difference is also important. That is, the error surface generated by input-output comparison can be characterized by peaks, valleys, transitions, erratic contours, etc. Non-linearities will manifest as contours which are ill behaved and not always smooth. An n-fold directional derivative or gradient can characterize a differential portion of the error space in such a manner to relate the non-linearity of the space to a sample requirement. As a gradient changes from one point in the space to another, information is imparted concerning the number of samples required to define the error. Suppose that the error (perhaps an n-fold error surface) is a function of several variables, then the following definitions are created:

- D_ε_R Δ Error in region R
- $\frac{\partial D_{\varepsilon_{R}}}{\partial R}$ Δ Gradient of the error with respect to parameters of the region R, also represented as ∇D_ε_R

- D_ε_R is a function which characterizes the error surface of interest. The samples required for characterizing the space defined by D_ε_R are related to two basic principles.

Principle 1: The number of samples required to characterize D_ε_R to some desired accuracy, is proportional to the regional gradients and the rate of change of gradients over the region R^N, where N is the dimension of the space of interest. In addition, regional cross correlation characterizes the uniqueness properties of the D_ε_R space.

Principle 2: The number of samples required to characterize D_ε_R is inversely proportional to its probability or predictability. The more systematic D_ε_R is, the lower its information content.

Principle 1 can best relate to the realms of approximation theory, geometry, tensor calculus, etc., where characterization of the multi dimensional function is the prime concern.
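As a rough illustration of the regional gradients referred to in Principle 1, the partial derivatives of an error surface sampled on a radius-angle grid may be estimated by central differences. The grid layout, the function name `error_partials`, and the toy surface (error equal to r², independent of angle) are assumptions for this sketch; regions where the estimated partials are large or change quickly would call for denser sampling.

```python
import math


def error_partials(err, dr, dtheta):
    """Central-difference partials of err[i][j] (radius index i, angle index j).

    Only interior radii are filled in; the innermost and outermost rings
    are left at zero. The angle index wraps around the circle.
    """
    n_r, n_t = len(err), len(err[0])
    d_dr = [[0.0] * n_t for _ in range(n_r)]
    d_dtheta = [[0.0] * n_t for _ in range(n_r)]
    for i in range(1, n_r - 1):
        for j in range(n_t):
            d_dr[i][j] = (err[i + 1][j] - err[i - 1][j]) / (2.0 * dr)
            # err[i][j - 1] wraps to the last angle when j == 0.
            d_dtheta[i][j] = (err[i][(j + 1) % n_t] - err[i][j - 1]) / (2.0 * dtheta)
    return d_dr, d_dtheta


# Hypothetical error surface: magnitude grows as r**2, same at every angle.
n_r, n_t = 5, 8
dr, dtheta = 0.25, 2.0 * math.pi / n_t
err = [[(i * dr) ** 2 for _ in range(n_t)] for i in range(n_r)]
d_dr, d_dtheta = error_partials(err, dr, dtheta)
```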

Principle 2 is a loose statement of Shannon's original information theorem. That is, the greater the likelihood of predicting one point of the sub space defined by D_ε_R by knowing or observing another point in that sub space, then the lower the information content is for the region near those points and in between. When the information content is low, fewer samples are required. On the other hand, if D_ε_R is non-deterministic and almost random then the largest information content is stored in the space and a significant number of reconstruction samples may be required. The application of the notion of information here is similar to Shannon's context but not identical. First, Shannon's Theorem anticipates ‘white or random noise’, although this is not a strict requirement. Secondly, Shannon's Theorem contemplates an information source characterized by a probability density function (pdf) which is indeterminate from one instant to another. In this application the interference is not white, the channel possesses memory, and there is complete knowledge of the system input, although the non-linearity of devices (VPA) is typically unknown. A lesser concern is that some measurements may not be precise. Nevertheless, any apriori knowledge is very important.

It should be noted that the sample rate versus the number of samples is not the focus of this present discussion. Naturally, Nyquist's sampling theorem does apply for any waveform reconstruction. That consideration is one of bandwidth and sampling rate, not information density. Reconstruction theory shall be revisited in a manner consistent with function interpolation, and proper sampling principles in later sections.

The immediate primary focus is to relate the concept that the error space characterized by a function D_ε_R or data set D_ε_R can be described using some minimal amount of information or some minimal number of samples. If the behavior of D_ε_R can be described by a polynomial or some collection of functions, then information content can be minimized. This idea can be exploited in a manner to minimize the storage of calibration coefficients. Rather than storing all measured (observed) points within the space, a description is stored which characterizes the space. Once the function is known the continuum may be generated from it, provided the calculations and cost of calculation are reasonable.
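The storage saving described above may be illustrated with a minimal sketch in which an error profile along one radial is regenerated from four stored values instead of 64 stored points. Lagrange interpolation is used here purely as one convenient choice of describing function, and the cubic `toy_error` profile is a hypothetical stand-in for measured data; neither is prescribed by the foregoing description.

```python
def lagrange_eval(nodes, values, r):
    """Evaluate the unique polynomial through (nodes[i], values[i]) at r."""
    total = 0.0
    for i, (r_i, v_i) in enumerate(zip(nodes, values)):
        term = v_i
        for j, r_j in enumerate(nodes):
            if j != i:
                term *= (r - r_j) / (r_i - r_j)
        total += term
    return total


def toy_error(r):
    # Hypothetical smooth error along one radial (cubic in r), for illustration.
    return -0.2 * r ** 3 + 0.05 * r


nodes = [0.0, 1 / 3, 2 / 3, 1.0]
stored = [toy_error(r) for r in nodes]        # the only values kept in memory
dense = [i / 63 for i in range(64)]           # points that would otherwise be stored
worst = max(abs(lagrange_eval(nodes, stored, r) - toy_error(r)) for r in dense)
print(len(stored), "stored values, worst reconstruction error", worst)
```

The reconstruction is essentially exact here only because the toy profile is itself a cubic; a measured error surface would generally require more terms or a piecewise description.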

According to embodiments of the present invention, higher order complex spaces may be assigned tensor descriptions. Accordingly, the VPA compensation technique according to the present invention characterizes the signaling space, and any signal in the space may be predicted and therefore compensated, based on the knowledge of the regional non-linearities of the characterized space.

(2) Entropy

Suppose that the points stored in memory from the calibration operation are stored without any additional consideration or description. Each point then possesses some amount of information. If the points are completely unique and statistically independent of one another, then they would possess the following entropy as an ensemble:

- H=log₂m bits/point
- m Δ resolution of each acquired data point, in discrete levels

If the digitization process is a 1024 discrete level process then H=10 bits/point. The total required memory then is N_p·H=m_B, where N_p is the number of points required and m_B is the total memory in bits.
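The arithmetic above may be checked with a short sketch. The function names and the value N_p = 4096 are arbitrary assumptions chosen only to give a concrete total.

```python
import math


def bits_per_point(levels):
    """Entropy H = log2(m) for m equally likely, independent levels."""
    return math.log2(levels)


def memory_bits(n_points, levels):
    """Total memory m_B = N_p * H when every point is stored independently."""
    return n_points * bits_per_point(levels)


H = bits_per_point(1024)
m_B = memory_bits(4096, 1024)   # 4096 points is an illustrative N_p
print(H, "bits/point;", m_B, "bits total")
```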

Now suppose that each point is related in some way to adjacent points. As an example, for instance, let the previous two points relate to or be statistically dependent on the current point. Then, for this third order entropy model, the entropy is calculated from:

$H = -\sum_{i = 1}^{m} P_{i} \sum_{j = 1}^{m} P_{j / i} \sum_{k = 1}^{m} P_{k / j, i} \log_{2} P_{k / j, i}$

P_i Δ probability of a level i occurring

P_j/i Δ probability of level j given i occurred previously

P_k/j,i Δ probability of level k given i and j occurred

For the general case given n adjacent related points, the entropy can be written as:

$H = -\lim_{n \to \infty} \frac{1}{n} \sum^{m^{n}} p (β_{n}) \log_{2} p (β_{n})$

p(β_n) Δ probability of occurrence of the first n levels

Although extremely difficult to calculate, the equation illustrates that statistical dependence on the points can drastically reduce the values of H and N_p, and ultimately the required memory.
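The reduction in H caused by statistical dependence may be illustrated by estimating both entropies from an example sequence of levels. In the sketch below the probabilities are estimated from occurrence counts, and the repeating four-level ramp is a hypothetical, fully predictable data set; the function names are likewise assumptions.

```python
import math
from collections import Counter


def zeroth_order_entropy(seq):
    """Entropy in bits/point when each point is treated as independent."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def third_order_entropy(seq):
    """Entropy of the current level given the two previous levels.

    The joint probability of a triple (i, j, k) equals
    P_i * P_j/i * P_k/j,i and is estimated from counts of
    consecutive triples in seq.
    """
    triples = Counter(zip(seq, seq[1:], seq[2:]))
    contexts = Counter()
    for (i, j, _k), c in triples.items():
        contexts[(i, j)] += c
    n = sum(triples.values())
    h = 0.0
    for (i, j, _k), c in triples.items():
        h -= (c / n) * math.log2(c / contexts[(i, j)])
    return h


ramp = [0, 1, 2, 3] * 64          # each level is fixed by the two before it
h0 = zeroth_order_entropy(ramp)
h3 = third_order_entropy(ramp)
print(h0, h3)
```

For this sequence the independent-point estimate is 2 bits/point, whereas the conditional estimate collapses to zero because every point is determined by its predecessors.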

The VPA calibration case is a case which in fact possesses interrelated probabilities. In the limit, the functional description exists and is completely deterministic. That is, each point in the space cannot be treated as an independent random variable from a random process. The points should be considered as points in a space which under ideal conditions could be completely characterized, and under perfect conditions, completely determined. Under such conditions, H is minimized and the joint probabilities indicated above are large. Also N_p can be significantly reduced. Therefore, from an information theory perspective it is desirable to solve the following:

min{m_B}=min{H·N_p}, subject to Q

H can be minimized provided that appropriate sampling techniques are employed within the space (i.e., space is not under sampled). This leaves consideration of min{N_p}.

The non-linearity of the space defined by D_ε_R may not be known apriori. However, it is stable, once characterized, with modest to low variance. Hence, D_ε_R can be described by deterministic means rather than by pdf. The pdf description may be valuable however for certain production considerations and fine tuning associated yields.

A. Data Redundancy

Some potential for data compression exists due to certain symmetries within the sub space defined by D_ε_R. Although this is contemplated by the calculation of H in the previous section, for the most general conditions, the notion of data redundancy is described here for clarity. The type of data compression possible for D_ε_R characterization should not be confused with other classical problems associated with redundant data sources, i.e., media which possess memory. The information processing formulation in the VPA context is different for reasons already mentioned. Compression is best accomplished by appropriate sampling of the sub space and description of the sub space.

(ii) Signaling Memory

As illustrated in Section 2, there are signal constellation memory effects within the non-linear VPA which should be recognized. Although not dominant, these local memory phenomena affect calibration accuracy, as well as performance in an application. In general, the current state of a waveform at the output of the VPA is a function of previous waveform values. If the input and output waveforms are sampled appropriately the memory effect can be described by:

$y (t_{j}) = \sum_{i = 0}^{n} a_{i} \sum_{k_{i} = - \infty}^{j} h_{i} (t_{j - k}) x (t_{k})$

where;

y(t_j) Δ j^th Time Sample VPA Output

x(t_k) Δ k^th Input Time Sample of the Convolution Sum

h_i(t_j-k) Δ The Volterra based Impulse Response over Time in the i^th Branch Convolution Sum

a_i Δ Branch Weights

Each branch impulse response convolves with x(t) to produce a slightly prolonged time domain response according to the filtering characteristics of h_i(t). Also, non-linearities can arise in the amplitude domain for the most general case of each branch, due to the multiple convolution kernel k_i(t).
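The convolution sum above may be sketched directly for finite impulse responses. In this illustration the lower summation limit is truncated to the first available input sample (the input is assumed to be zero before that), each h_i is a short list of taps, and the function name `vpa_output` together with the example branch weights and taps are hypothetical values, not measured VPA parameters.

```python
def vpa_output(x, branches):
    """Evaluate y[j] = sum_i a_i * sum_{k<=j} h_i[j-k] * x[k].

    x:        list of (complex) input samples, assumed zero before x[0].
    branches: list of (a_i, h_i) pairs, h_i being a finite list of taps.
    """
    y = []
    for j in range(len(x)):
        acc = 0j
        for a_i, h_i in branches:
            for lag, tap in enumerate(h_i):
                k = j - lag
                if k < 0:
                    break  # no input samples exist before the start
                acc += a_i * tap * x[k]
        y.append(acc)
    return y


impulse = [1.0, 0.0, 0.0, 0.0]
branches = [(1.0, [1.0]), (0.1, [0.5, 0.25])]   # illustrative values only
y = vpa_output(impulse, branches)
print(y)
```

The second branch spreads the single input impulse over two output samples, which is the slightly prolonged time domain response referred to above.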

It is helpful to consider how this could affect the complex signal plane response for the following simple case of linear radial stimulus. Consider the case for a linear radial excitation where the input stimulus is reversed in the complex plane after sweeping outward to the unit circle. The reverse sweep covers the reverse complex plane ‘real estate’ in exactly the same input trajectory provided that the time delay is very small. This is illustrated in FIG. 11. The arrows along the dotted trajectory illustrate the outward bound radial (outward from the origin) and the inward bound radial.

Now consider the system output. Notice the distinct difference of the two output trajectories for outward and inward bound responses. As shown, the inward bound trajectory at the output is a function of the current input signal trajectory as well as the previous input signal trajectory. This phenomenon relates to energy transport through the VPA. The VPA translates real world signals which involve voltages and hence energies. Therefore, parasitic energy storage components such as capacitance, inductance, etc., redistribute the input energy to the VPA in time. It takes time to move charge through a system with such parasitics. Hence, the output energy transport is affected by broader swaths of time than a specific instant. In addition, heating effects within the VPA can produce temporal fluctuations in the parasitics. Also, the return path does not (in general) possess the same shape as the outward trajectory which implies a separate requirement for characterization.

FIG. 12 illustrates a 2 dimensional complex signal constellation (order 2 spatial kernel) with inputs illustrated by the linear radial traces and outputs illustrated by the curved traces. The input traces of this starburst pattern stimulate the VPA input, while the output traces illustrate the response at the VPA output.

Notice that the memory effect is clearly present at the output, as indicated from the bifurcation of the output traces.

If this memory effect is too significant then the calibration or compensation becomes complicated and requires the extraction of a model at an intricate level. In addition, it is not unusual to find that capacitance and inductance in such non-linear devices must be modeled as functionally related to the radial position of samples within the complex signaling plane, which further complicates the characterization. Again, the VPA heating anomalies play a role as well. For small perturbations of the radial sweeps a single directional sweep may suffice rather than the attempted retrace. Also, averaging the two sweeps outward—inward, may provide some benefit. This smoothing or interpolating theme will be explored repeatedly in subsequent sections according to embodiments of the present invention.

(iii) Comments on PAPR, Heating, Sweep Rate

Embodiments of the VPA typically comprise a power MISO (Multiple Input Single Output) amplifier. Since the amplifier is not perfectly efficient, some energy is dissipated in the form of heat. This heat in turn modifies the Volterra model description slightly. Thus, it is desirable to consider the heating effect in the context of calibration. In an embodiment, it is possible to create a calibration waveform with a specific peak to average power (PAPR) to reproduce the heating phenomena. In some cases this may improve the calibration process.

Generally, the calibration waveform could be pseudo-random by nature with a mean and a variance designed to deliver a particular PAPR. It may also be systematic and still approximate the heating caused by a WCDMA stimulus.

According to embodiments of the present invention, it is also possible to utilize a variable sweep rate which slows down the samples in the portion of the complex plane for a duration to create the heating effect. For instance, a starburst could possess a variable sweep rate along the linear radial stimulus. In fact, the waveform could even stall at a particular point. In addition, the stimulus could retrace specific small segments of the radial line over and over at fixed or variable sweep rates.

As would be understood by a person skilled in the art based on the teachings herein, any swept stimulus could incorporate these variations, whether the stimulus employs circles, radial lines, spirals, etc.

If variable sweeps or even fixed rate sweeps with retracing are employed, care must be exercised to maintain some reasonable power spectral density (psd) for the stimulus if possible. It is desirable to not only exercise the complex plane appropriately, but also the frequency domain performance of the device should be tested by the calibration waveform.

(iv) Error Mapping

An error mapping of the VPA relates the input to output response of the VPA D2P technology. An ideal response is a one-to-one correspondence within the complex signaling plane for input versus output given normalized scaling and time shifting between x(t) and y(t). That is, given normalized plots of input and output responses within the complex signaling plane, the two trajectories would exactly overlay. This in turn would imply a perfectly linear system. Since this is never the case in practice, except for a perfect wire, some technique is required to provide the measure for the error.

One proposed measure, disclosed above, can be described as:

$\vec{D}_{\varepsilon_{R_{j}}} = \vec{y}_{j} - \vec{x}_{j}$ (Vector Representation)

Since the space is, in general, multidimensional, a vector representation can be useful. Notice that the proposed error is on a sample by sample basis, comparing the j^th input (x_j) and output (y_j) samples. It is also easy to calculate the magnitude and the phase of the error based on the proposed measure. However, there is some inaccuracy in the metric as proposed. This is because the error $\vec{D}_{\varepsilon_{R_{j}}}$ is a vector which moves through a region of the distorted space that must be corrected. Therefore a simple comparison of input and output samples may not be accurate enough.
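The sample by sample error measure, together with its magnitude and phase, may be sketched using complex numbers to represent points in the signaling plane. The sample values below are invented for illustration, and the input and output are assumed to be already normalized and time aligned.

```python
import cmath


def error_vectors(x, y):
    """Per-sample error D_j = y_j - x_j for aligned, normalized samples."""
    return [y_j - x_j for x_j, y_j in zip(x, y)]


def to_polar(d):
    """Magnitude and phase (radians) of one error vector."""
    return abs(d), cmath.phase(d)


x = [0.5 + 0.0j, 0.0 + 0.5j]      # hypothetical input samples
y = [0.5 + 0.1j, -0.1 + 0.5j]     # hypothetical measured output samples
D = error_vectors(x, y)
for d in D:
    print(to_polar(d))
```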

The prior statement involves the notion that $\vec{D}_{\varepsilon_{R}}$ must also be utilized to facilitate the necessary correction operation. However, if $\vec{D}_{\varepsilon_{R}}$ is applied in front of a non-linearity then $\vec{D}_{\varepsilon_{R}}$ is also distorted. Thus, the metric $\vec{D}_{\varepsilon_{R}}$, measured at the VPA output, is an important metric but does require additional consideration if it is subtracted prior to the VPA input.

To illustrate further, consider the scenario illustrated in FIG. 13, which attempts to make use of $\vec{D}_{\varepsilon_{R_{j}}}$ on a sample to sample basis. VPA_A and VPA_B are identical. All signaling time axes are ideally aligned to facilitate proper cancellation of distortions in VPA_B based on measurement of $\vec{D}_{\varepsilon_{R_{j}}}$ in VPA_A.

By inspection, the perfectly corrected signal is:

$y_{0} (t_{j}) = y_{1} (t_{j}) - D_{\varepsilon_{R_{j}}}$ (Error @ Output)

Also, by inspection:

$y_{2} (t_{j}) = (k x (t_{j} + \tau) - D_{\varepsilon_{R_{j}}}) * \{V\}$ (Correction @ Input)

- (k an arbitrary scaling constant)
- {V} Δ Volterra kernel describing the VPA transfer characteristic

Also,

$y_{0} (t_{j}) \neq y_{2} (t_{j})$

in general, due to {V} unless {V} is a constant. This of course is a contradictory or degenerate case and cannot occur. The result of the above comparison may be in the ‘ball park’ but in general is not sufficiently accurate and must converge by trial and error, modifying D_ε_R as required to account for {V}.
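The difference between removing the error at the output and subtracting the same error at the input may be seen numerically with a toy example. The function `toy_vpa` below is a hypothetical memoryless non-linearity (mild gain compression plus amplitude dependent phase rotation, unity small-signal gain so that k = 1) standing in for {V}; it is not a model of any actual VPA, and the time shift τ is taken as zero.

```python
import cmath


def toy_vpa(x):
    """Hypothetical memoryless stand-in for {V}: AM-AM compression and AM-PM."""
    r2 = abs(x) ** 2
    return x * (1.0 - 0.2 * r2) * cmath.exp(1j * 0.3 * r2)


x = 0.4 + 0.3j            # desired (ideal) output, with k = 1
y1 = toy_vpa(x)           # uncorrected output
D = y1 - x                # error measured at the output
y0 = y1 - D               # error removed at the output
y2 = toy_vpa(x - D)       # same error subtracted ahead of the non-linearity
print(abs(y1 - x), abs(y0 - x), abs(y2 - x))
```

In this toy case the input-side correction lands closer to the ideal point than the uncorrected output but does not reach it, consistent with the ‘ball park’ observation above.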

Notice that this is equivalent to a feed back correction in many respects and that is why it is mathematically inferior, unless a control loop is present. If D_ε_R is simply carried forward (feed forward compensation) and subtracted from y₁(t_j) then a perfect result is possible. Generally, there are reasons why such an architecture is not acceptable (See Section 3). Therefore, only predistortion is contemplated in the current discussion and is considered as a constraint for VPA applications and subsequent arguments.

A graphical based commentary provides some other key insights and is illustrated in FIG. 14, where:

- $\vec{y}_{o_r}$ Δ Required output vector from origin to desired output point P_o_r
- $\vec{y}_{o_a}$ Δ Actual output vector, warped by VPA non-linearities, extending from origin to point P_o_a (actual point achieved)
- $\vec{\tilde{D}}_{\varepsilon}$ Δ Approximate Error Vector, measured in a warped space: $\vec{y}_{o_a} - \vec{\tilde{D}}_{\varepsilon} = \vec{y}_{o_r}$

As shown in FIG. 14, the actual output is compared to the desired output, resulting in an error vector $\vec{\tilde{D}}_{\varepsilon}$. $\vec{y}_{o_r}$ would be equal to k·x_i in a linear system, where k accounts for gain of the VPA. Given that equivalence, consider the input correction vector diagram illustrated in FIG. 15.

FIG. 15 illustrates an attempt to apply a scaled version of $\vec{\tilde{D}}_{\varepsilon}$ to the input vector $\vec{x}_{i}$ to obtain $\vec{x}_{i_c}$. $\vec{x}_{i_c}$ is the intended input with pre-correction or predistortion applied in such a manner as to achieve $\vec{y}_{o_r}$ shown in FIG. 14. If this space (R²) were linear (undistorted) then in fact the method of correction illustrated in FIG. 15 would work in the event of an error. However, the space R² is not linear and is in fact distorted. Thus, the error $\vec{\tilde{D}}_{\varepsilon}$ is measured in a distorted or warped space and is reapplied in a linear manner to the input of the warped space, which warps it ($\vec{\tilde{D}}_{\varepsilon}$) yet again. Furthermore, the distortion is non-homogeneous throughout the region (R²). The distortion is introduced by the VPA. Pre-correction in FIG. 15 can be described as:

$\vec{x}_{i} - \vec{\tilde{D}}_{\varepsilon} = \vec{x}_{i_c}$ (Approximation to required Predistorted Input)

If the vector is introduced into the VPA at the input, then $\vec{D}_{\varepsilon}$ does not possess the desired effect because it is distorted as it progresses through the region. If that distortion is known apriori, in the region where $-\vec{D}_{\varepsilon}$ must be applied, then a corrective calculation can be accomplished to augment $\vec{\tilde{D}}_{\varepsilon}$ to achieve the desired result. Pre-correction can then be written as:

{right arrow over (x)}
_i−({right arrow over ({tilde over (D)}_ε+{right arrow over (ε)})

where {right arrow over (ε)} is a correction term, which is a function of the local distortion within the complex plane.
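The effect of measuring the error in a warped space and re-applying it linearly can be illustrated numerically. The following is a minimal sketch only: the function `vpa` is a hypothetical memoryless model (gain, amplitude compression, amplitude-dependent phase rotation) chosen for illustration, the scaling of the error by 1/k is an assumed choice of the "scaled version" mentioned above, and the iterative refinement is simply one possible way to arrive at the correction term, not a method stated in this disclosure.

```python
import numpy as np

def vpa(x, k=2.0, sat=0.1, pm=0.2):
    """Hypothetical memoryless VPA model (an assumption, not the disclosed
    device): gain k, amplitude compression, amplitude-dependent phase."""
    r = np.abs(x)
    return k * x * (1.0 - sat * r**2) * np.exp(1j * pm * r**2)

k = 2.0
x_i = 0.8 * np.exp(1j * 0.3)      # intended input
y_r = k * x_i                      # desired output of a linear system

# Error vector measured at the output, i.e. in the warped space.
d_eps = vpa(x_i, k) - y_r

# Naive pre-correction: re-apply the (gain-scaled) error linearly at the input.
x_naive = x_i - d_eps / k
err_naive = abs(vpa(x_naive, k) - y_r)

# Refined pre-correction: keep re-measuring the residual and folding it back in.
x_c = x_i
for _ in range(50):
    x_c = x_c - (vpa(x_c, k) - y_r) / k
err_refined = abs(vpa(x_c, k) - y_r)

# The extra term needed on top of the naive correction plays the role of epsilon.
eps_vec = (x_i - x_c) - d_eps / k

print(f"residual after naive correction:   {err_naive:.3e}")
print(f"residual after refined correction: {err_refined:.3e}")
print(f"|epsilon| (input units):           {abs(eps_vec):.3e}")
```

In this sketch the naive correction leaves a residual because the applied vector is itself warped by the model, while the refined input effectively includes the additional correction term.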

FIG. 16 provides insight into an example feedback approach according to an embodiment of the present invention. In this example, D_ε_R may be driven to zero by adjusting or modifying the VPA input until the desired output is achieved and D_ε_R, or a related metric, is minimized.

Modifying x(t) can be done by choosing a new input value known apriori to produce a desired correspondence to the perfect result at the output. That is, the entire complex plane may be mapped apriori with a correspondence between input and output VPA samples. In this scenario, the correct numbers from the appropriate memory locations are summoned for x(t_j) to produce the intended and desired output point y(t_j), effectively building in {right arrow over (ε)}. This is a memory mapped form of pre-distortion. Since values may not lie on perfect sample grids, some interpolation of values may be required. Other forms of adaptation may also be possible. The important distinction is that pre-distortion is applied, not post distortion.
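A memory mapped pre-distortion of this kind, with interpolation between stored points, might be sketched as follows. This is an illustrative sketch under stated assumptions: `vpa` is a hypothetical memoryless model whose distortion depends only on input amplitude, the table size of 33 points is arbitrary, and linear interpolation via `numpy.interp` stands in for whatever interpolation an implementation would actually use.

```python
import numpy as np

def vpa(x, k=2.0, sat=0.1, pm=0.2):
    """Hypothetical memoryless VPA model, used only to generate cal data."""
    r = np.abs(x)
    return k * x * (1.0 - sat * r**2) * np.exp(1j * pm * r**2)

# Off-line calibration sweep: stored table of input amplitude versus
# output amplitude and output phase shift.
cal_r_in = np.linspace(0.0, 1.0, 33)
cal_out = vpa(cal_r_in.astype(complex))
cal_amp = np.abs(cal_out)      # increasing over this range for the model above
cal_phase = np.angle(cal_out)  # phase shift added by the model

def predistort(y_desired):
    """Look up (with interpolation) the input that yields y_desired."""
    r_c = np.interp(np.abs(y_desired), cal_amp, cal_r_in)
    phase_shift = np.interp(r_c, cal_r_in, cal_phase)
    return r_c * np.exp(1j * (np.angle(y_desired) - phase_shift))

x_i = 0.8 * np.exp(1j * 0.3)
y_desired = 2.0 * x_i
err_uncorrected = abs(vpa(x_i) - y_desired)
err_corrected = abs(vpa(predistort(y_desired)) - y_desired)
print(f"uncorrected error: {err_uncorrected:.3e}")
print(f"corrected error:   {err_corrected:.3e}")
```

The residual after correction in this sketch is set by the table density and the interpolation order, which is the trade between memory and accuracy discussed in this section.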

This feedback system is handled ‘off line’ in the proposed calibration process. Therefore, unlike the feedback illustrated in Section 3, this feedback is stable, causal, and bandwidth sufficient.

(v) Real Time Feed Forward Algorithm Interaction with VPA Signal Processing Flow

It is helpful to gain some insight into the relationship between the D2P VPA baseband (BB) algorithms and the processing of D_ε_R error surface information. The two work together to define and correct the distortions inflicted by non-linearities of analog and RF circuitry. FIG. 17 is a block diagram that illustrates an example solution according to an embodiment of the present invention. As shown, mathematical descriptions of the error space (direct or indirect) are stored in a cal coefficient memory. D_ε_R is re-generated by an interpolation algorithm which manipulates complex information at baseband in Cartesian and/or polar format. The interpolated N-dimensional error functions are then used to minimize the end-to-end error through the VPA circuits all the way to the RF passband signals.

In addition to cal coefficients, the algorithm can be affected, as illustrated in FIG. 17, by quasi-real time/feed back variables such as:

- Temperature
- Battery Voltage
- Gain Balance
- Phase Balance
- Process Variation
- Output Power

These real time variables are filtered and sampled at a rate which is insignificant compared to the signal rate of interest. This ensures stability and causality over reasonable epochs yet does influence the N-dimensional manifold description of the error surface D_ε_R.

Also, as shown in FIG. 17, four feed-forward signaling paths drive the VPA while additional signal space shaping can be accommodated via:

- 2 Independent Programmable Power Supplies;
- 4 BIAS Controls; and
- 1 AGC Control.

The entire feed-forward transfer function compensation is implemented by these 11 signals. It is possible to reduce the number of feedback and feed forward controls, trading off power consumption, silicon area, and performance, in the process.

FIG. 17, as described above, provides a snapshot of feed-forward algorithm interaction with VPA implementations. FIG. 18 illustrates the partitioning of real time and non-real time portions of the algorithms to support the implementation of FIG. 17.

6. BUILDING ERROR FUNCTIONS

As indicated previously, it is assumed that the VPA can be (in part) characterized via a function called D_ε_R. D_ε_R is some error measure (with respect to ideal) whose effect is to be minimized. Although D_ε_R is described as a function above, it is originally available as a raw data set in an N-dimensional space R^N. In addition to being characterized according to some robust metric, D_ε_R should be described in such a manner as to drastically reduce or minimize the amount of data required and to permit accurate and effective calculation of any value within the applicable portion of R^N which best reflects or estimates D_ε_R, even for values not available in the initial raw data set.

FIG. 18 is a process flowchart that illustrates the process of building an error function. As shown, the process flowchart is partitioned into real time and non real time applications.

(a) D_ε_R Data and Function

It is assumed that error data can be obtained by a measurement process. If the data set for D_ε_R is adequately sampled, then a function can be defined which interpolates or approximates the original data set and which also interpolates for points that were not captured in the original data set.

Consider the 2-D (R²) example illustrated in FIG. 19, which illustrates two points in a Cartesian space. In this example, the points D₁ and D₂ can be defined as:

D₁=(x₁,y₁), D₂=(x₂,y₂)

and lie on the line given by

y=mx+b

where,

$m = \frac{y_{2} - y_{1}}{x_{2} - x_{1}}$ (slope of the line)

b = intercept, i.e., the value of y evaluated at x=0

The dotted line in FIG. 19 illustrates an interpolation function based on a simple linear polynomial. Non-linear polynomials could also be used and might provide acceptable results, depending on additional data or constraining information which might be available. What can be observed from the example of FIG. 19 is that:

- 1) Only two data points were required to define the linear polynomial with order n=1.
- 2) An infinite number of points on the line can be derived from the equation for the line, while storing only the coefficients m, b.

This implies that significant savings in data storage can be realized if the appropriate function can be constructed. Similarly, the goal is to replace the data input for D_ε_R with functions approximating D_ε_R. This will permit all the calculations required to correct VPA transfer characteristics while simultaneously conserving memory.
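The storage argument for the two-point example can be made concrete with a short sketch. The point values used here are arbitrary illustrations, and the helper names `line_through` and `evaluate` are hypothetical.

```python
def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b through two points."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)   # slope
    b = y1 - m * x1             # intercept at x = 0
    return m, b

def evaluate(coeffs, x):
    """Evaluate the stored line at any x; only (m, b) are kept in memory."""
    m, b = coeffs
    return m * x + b

coeffs = line_through((1.0, 3.0), (3.0, 7.0))   # D1 and D2 (example values)
print(coeffs)                  # two stored numbers
print(evaluate(coeffs, 2.0))   # a point that was never measured
```

Only the pair (m, b) is retained, yet any point on the line can be regenerated on demand.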

(b) Example VPA Error Plot for D_ε_R

As previously described, errors are defined as differences between the input and output of the VPA as measured in some space whose basis is a signaling constellation. Also, extensions of the space must be contemplated to encompass all potential operational states. Power control, time, temperature, waveform type, etc., are but a few potential dimensional extensions.

Consider the linear sweep of a constellation illustrated in FIG. 20. The input to output relationship can be described by the equation below.

$I_{D_{ε_R}} = a_{0} + a_{1} R_{e} + a_{2} R_{e}^{2} + a_{3} R_{e}^{3} + a_{4} R_{e}^{4} + \dots ; \in R^{2}$

The equation above indicates that the output functional, given a linear stimulus at the input, is a non-linear function which relates the Imaginary and Real components of the signal constellation space and is described by some number of terms, weighted by coefficients a₀, a₁, a₂, a₃, etc., over some region R. If a₀, a₁, a₂ . . . a_n can be determined, then only those coefficients need to be maintained. It is not necessary to store all the points, or even many points, along the non-linear function.
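As a sketch of how a swept thread could be reduced to a handful of coefficients, the following fits a fourth-order polynomial to sampled data using NumPy. The "measured" curve is synthetic and its coefficients are arbitrary assumptions made for illustration; real calibration data would replace it.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Synthetic stand-in for one measured thread: Imaginary output component
# versus Real component for a linear input sweep (coefficients are made up).
true_coeffs = np.array([0.02, 0.10, 0.00, -0.30, 0.05])   # a0 .. a4
re_axis = np.linspace(-1.0, 1.0, 200)
im_measured = P.polyval(re_axis, true_coeffs)

# Least-squares fit; coefficients are returned lowest order first (a0, a1, ...).
fit_coeffs = P.polyfit(re_axis, im_measured, 4)

# Any point along the thread can now be regenerated from 5 numbers.
im_at_0p37 = P.polyval(0.37, fit_coeffs)

print("stored values:", fit_coeffs.size, "instead of", re_axis.size)
```

With noisy measurements the fit would no longer be exact, and the polynomial order becomes a trade between storage and residual error.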

The description above with respect to FIG. 20 involves a single thread through a two dimensional space. A three dimensional graphical error description for D_ε_R, where the third axis represents the magnitude/phase of the error within the complex signal space, is illustrated in FIG. 21. Error in this example is measured as a difference between the input and the normalized output of the system.

In this representation, the third axis is D_ε_R, and the two dimensional foundation plane is the real-imaginary complex signal plane. This permits a compact formulation of the error as:

$|D_{ε_R}| = f\{D_{ε_R}(R_{e}, I_{m})\}, \in R^{3}$

$\angle D_{ε_R} = f\{D_{ε_R}(R_{e}, I_{m})\}, \in R^{3}$

FIGS. 22 and 23 illustrate the behavior of the two dimensional complex plane error in the cases of an input circle constellation stimulus and a radial stimulus, respectively. In FIGS. 22 and 23, the top left and top right plots illustrate the magnitude and phase of the error, respectively. The bottom left and bottom right plots illustrate the input stimulus and phase distortion rings, respectively.

An extension of the error formulation above can also be generated for an N-dimensional complex plane as follows:

$\|D_{ε_R}\| = f\{D_{ε_R}[(R_{e}, I_{m}), PS, WF, T, f, G \dots]\}, \in R^{N}$

$\angle\angle D_{ε_R} = f\{D_{ε_R}[(R_{e}, I_{m}), PS, WF, T, f, G \dots]\}, \in R^{N}$

This illustrates that the error can relate to a large number of parameters as well as the complex signal state. Parameters illustrated in the formulation above include:

(R_e,I_m); Coordinates of the Complex Signal Plane

PS; Power Supply State

WF; Waveform State

T; Temperature

ƒ; Frequency of Operation

G; Gain of VPA

This is a hyper geometric space and the resultant error surface D_ε_R_N is a hyper geometric manifold. Calculations involving such surfaces typically involve tensor calculus. However, it is also possible to present the formulation as multiple parametric states of a third order geometry. This approach is often not as efficient but is easier to understand and visualize. One such graphical illustration of a third order geometry as a subset of R⁶ is illustrated in FIG. 24. Notice that time was inserted as a horizontal dimension in the graphical example of FIG. 24. Other parameters could have been selected as well.

D_ε_R in the complex plane is a fundamental 2-D spatial kernel which is transformed via geometric translation to other higher order coordinate systems of up to N dimensions or the higher order systems consisting of expanded dimensions as well as parametric space. The next fundamental kernel is a 3-D spatial kernel which is derived from the complex information contained in the 2-D kernel. Hence, the two are equivalent in that respect. These kernels are universal to all analysis that follows and all higher order spaces and representations can be reduced to these lower order kernels.

Stimulus functions which probe the distorted signal space of interest have been described above. Higher order multi dimensional error surfaces can also be helpful, but even simple error surfaces (3 dimensional representation) can require a substantial number of samples (on the order of 13K complex samples) to describe the error.

A theorem is disclosed as follows, with detailed proof provided in the Appendix:

Theorem for Efficient Error Gradient Calculation

- The most efficient error gradient is obtained by maximizing ∇D_ε_R_N for all samples within the region R^N, and is proportional to maximization of the joint directional derivative

$\frac{\partial D_{ɛ_{R} N}}{\partial (ζ_{N})}$

where ζ₁ . . . ζ_N are orthogonal dimensional parameters of the space.

Although D_ε_R_N can be represented in many ways, which include vector representation, scalar fields with parameters, etc., the visual aspect presented as a surface manifold is particularly useful. The illustrations of FIGS. 20 and 24, for example, should be considered as scalar fields. Hence, the theorem given above is related to the directional derivative applied along the error surface. The goal then is to maximize the directional derivative. By constraining the derivatives to orthogonal dimensions, independence is assured such that unique information is imparted along the directions of the dimensional unit vectors. In the case of a 3 dimensional Cartesian system, the unit vectors would be {right arrow over (i)}_x, {right arrow over (j)}_y, {right arrow over (k)}_z. In a cylindrical coordinate system, the unit vectors would be {right arrow over (a)}_r, {right arrow over (a)}_φ, {right arrow over (a)}_z. Both coordinate systems possess application for the problem at hand.

Consider the example error surface representation illustrated in FIG. 25. Notice that there are peaks, valleys, varying contours, etc. in the representation. Consider the point X₁, which is located on a relatively flat portion of the three dimensional rendering. No matter the direction selected within the immediate vicinity of X₁, the directional derivative or gradient is roughly zero. Therefore, deploying a lot of samples in this region is costly in that very little information is imparted per sample. Nonetheless, some minimal data is required to characterize even the flat area.

Now consider X₂, which is located on a portion of the surface which is highly irregular. In this volatile region, the directional derivative yields significant information. Visual inspection would indicate that the greatest rate of change is in the vertical direction. It follows that sampling in that direction near the locality of X₂ provides the maximum benefit for characterizing D_ε_R, with the fewest required samples.

Since the directional slope at various points along the surface changes, the optimal sample distribution should be biased as well. An optimal solution from this point of view consistent with the theorem requires:

max{∇D_ε_R}

Another method of illustrating the point is obtained by projecting a vector along the error surface denoting the sampling direction, then maximizing the gradient according to the sampling vector orientation in the region. This would be represented by:

$\left.\frac{\partial D_{ε_R}}{\partial s}\right|_{opt} = \max\{ |\vec{b}_{s}| \, |\nabla D_{ε_R}| \cos γ \}$

where γ is the angle between the sampling vector ({right arrow over (b)}_s) direction and the gradient, along the surface, s.
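The relationship between the gradient, the sampling direction, and sample deployment can be sketched numerically. Everything specific here is an assumption made for illustration: the Gaussian bump standing in for an error surface, the grid size, the 1% density floor (so that flat regions still receive some samples), and the left/right split used to summarize the allocation.

```python
import numpy as np

# Hypothetical scalar error surface over the complex plane: essentially flat
# on the left, with a steep bump centered at Re = 0.5, Im = 0.
re = np.linspace(-1.0, 1.0, 81)
im = np.linspace(-1.0, 1.0, 81)
RE, IM = np.meshgrid(re, im, indexing="xy")      # rows follow Im, columns follow Re
D = np.exp(-((RE - 0.5) ** 2 + IM ** 2) / 0.02)

# np.gradient returns one array per axis: axis 0 (Im) first, then axis 1 (Re).
dD_dim, dD_dre = np.gradient(D, im, re)
grad_mag = np.hypot(dD_dre, dD_dim)

def directional_derivative(i, j, angle):
    """Derivative of D at grid point (i, j) along a unit vector at `angle`."""
    return np.cos(angle) * dD_dre[i, j] + np.sin(angle) * dD_dim[i, j]

i_x1, j_x1 = 40, 20    # X1: Re = -0.5, Im = 0 (flat region)
i_x2, j_x2 = 40, 64    # X2: Re = 0.6, Im = 0 (on the slope of the bump)

# The directional derivative is largest when the sampling vector is aligned
# with the gradient (gamma = 0), where it equals the gradient magnitude.
best_angle = np.arctan2(dD_dim[i_x2, j_x2], dD_dre[i_x2, j_x2])
best_rate = directional_derivative(i_x2, j_x2, best_angle)

# Bias a fixed sample budget toward high-gradient regions, with a small floor.
density = grad_mag + 0.01 * grad_mag.max()
density /= density.sum()
budget = 1000
samples_left = budget * density[:, :40].sum()    # Re < 0
samples_right = budget * density[:, 40:].sum()   # Re >= 0
print(f"samples left: {samples_left:.0f}, right: {samples_right:.0f}")
```

In this sketch nearly the whole budget lands on the side containing the bump, while the floor keeps a minimal characterization of the flat side.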

(i) Error in the Complex Plane

The view presented in Section (c) above relates to a surface error formed from the magnitude and phase error of {right arrow over (D)}_ε_R within the complex plane. It is important to note the impact of sampling and the theorem presented in Section (c) when the space is constrained by the 2 dimensional complex signal plane rather than the 3 dimensional scalar cloud.

A polar representation is very convenient and will be used, occasionally. FIG. 26 is a 2-D view that illustrates a starburst calibration pattern in the complex plane with distortion. Notice the curvature in the radial arms. A single radial arm is illustrated in FIG. 27 and is examined below:

{right arrow over (r)} is a radial vector which intercepts the radial arm at some desired sample location. φ is the angle to that vector. {right arrow over (a)}_r and {right arrow over (a)}_φ are unit vectors for the polar representation and form an orthogonal basis. The test input to the VPA is x_r_i,φ_i and the output, the spiral arm, is y_r_i,φ_i.

The following equations relate to this discussion:

$\tilde{\vec{D}}_{ε_{r,φ}} = \vec{y}_{r,φ} - k\vec{x}_{r,φ}$ (k=desired VPA gain)

where

$\vec{y}_{r,φ} = f_{y}(r) \cdot \vec{a}_{r} + f_{y}(φ) \vec{a}_{φ}$

Since the unit vectors {right arrow over (a)}_r and {right arrow over (a)}_φ are orthogonal:

$\frac{\partial}{\partial r} \tilde{\vec{D}}_{ε_{r}} = \frac{\partial}{\partial r} (f_{y}(r) - k f_{x}(r)) \vec{a}_{r}$

$\frac{\partial}{\partial φ} \tilde{\vec{D}}_{ε_{r}} = \left(\frac{1}{r}\right) \frac{\partial}{\partial φ} (f_{y}(φ) - k f_{x}(φ)) \vec{a}_{φ}$

Notice that if the errors are constant in the radial direction then that partial derivative is zero. Similarly, the same is true for the {right arrow over (a)}_φ direction. Furthermore, it is possible to maximize these derivatives independently. If the radial derivative dominates the error gradient then it is better to sample in the radial direction. If the angular derivative dominates the error gradient then angular sampling may prove effective. However, as illustrated in FIG. 26, a blend of distortions is likely. In the case where distortion is distributed between the orthogonal dimensions, cross dimensional sampling may be optimized through proportional weighting. That is, some blend of sampling between the {right arrow over (a)}_r and {right arrow over (a)}_φ directions is warranted when star burst sampling is desired.

FIGS. 28-35 illustrate gradients generated for magnitude and phase of D_ε_R. The gradients are directionally calculated for the scalar fields and a magnitude of the gradient is also included for both |D_ε_R| and ∠D_ε_R.

(1) Higher Order Derivatives

Suppose that the error gradient is non-zero yet is constant. Accordingly, the directional derivative with the greatest first derivative would determine the sampling bias, i.e., the direction through the complex plane for which sampling should be applied, for most efficient characterization.

However, it should be noted that higher order derivatives are an important indicator as well. The rate of change of the error gradient is important because of considerations of entropy. Sections 5(a)(i)(1) and 5(a)(i)(2) described that the more erratic the behavior for D_ε_R, the greater the entropy for its description (i.e., the greater the information content). Thus, the second order (or higher) gradient can also be maximized to obtain the path (sampling path) through the space which produces the greatest information acquisition per sample. Also, the regions with such erratic gradient behavior warrant greater density of samples as well.

(2) Power Weighting Considerations

The efficiency theorem as stated relates to variations in the error surface because variations retain more information content which must be processed in some manner for effective correction algorithms. Additional considerations which weigh the value of the information content include:

1) Significance of higher order gradients; and

2) Position of the gradients within the complex R² base.

Consideration 2) implies that corrections for linearization near the origin in a polar or cylindrical coordinate system are not typically as critical as corrections applied at larger radial distances. Larger radials correspond to larger signal distances and energies. Corrections must ultimately minimize misplaced signal metrics such as energies and distances. Therefore, small signals with large errors are not always the dominant concern. A small error in a large signal (larger distance from the origin) may be more significant.

In addition, if errors are large in the vicinity of constellation decision points, then the EVM is more important in that vicinity and should be weighted accordingly. Some regions of the complex plane may experience signaling transitions rather than fixed constellation points or decision states. Transition regions should be weighted according to their effect on spectral domain compliance, spurious, ACPR, etc.

(d) Example of Radial Sampling Contour in Complex Plane

In this section, a solution to radial sampling in the complex plane is provided. To illustrate, consider the example radial sampling path of FIG. 36.

The following definitions relate to parameters illustrated in FIG. 36:

- k₀, k₁, . . . k_m ≜ sampling points within the complex plane
- w₁, w₂, w₃ . . . w_n ≜ weighting values determined from the grad {D_ε_R} and other considerations
- a₁, a₂, . . . a_n ≜ differential orthogonal components of the samples. Even order terms are in the {right arrow over (a)}_φ direction; odd order terms are in the {right arrow over (a)}_r direction
- {right arrow over (r)}₀, {right arrow over (r)}₁, . . . {right arrow over (r)}_m ≜ vectors from the origin to the sampling points or knots k₀, k₁ . . . in the direction of {right arrow over (a)}_R_n
- φ_m ≜ sample angle, i.e., the angle between the R_e axis and the sample radial vector {right arrow over (r)}_m.

Based on the above, the following equations can be written:

$r_{m}^{2} = (r_{m-1} + w_{2m-1})^{2} + (w_{2m} z_{2m})^{2}, \quad m = 0, 1, 2, \dots, \frac{n}{2}$

$φ_{m} = \begin{cases} φ_{m-1} + μ_{m} \sum_{v=0}^{\infty} (-1)^{v} \frac{μ_{m}^{2v}}{2v+1}, & |μ_{m}| < 1 \\ φ_{m-1} + \left(\frac{π}{2} - \frac{1}{μ_{m}} \sum_{v=0}^{\infty} \frac{(-1)^{v}}{(2v+1) μ_{m}^{2v}}\right), & |μ_{m}| \geq 1 \end{cases}$

where μ_m is defined as:

$μ_{m} = \frac{W_{2 m} a_{2 m}}{W_{2 m - 1} a_{2 m - 1}}$

φ_m must be calculated based on the quadrant of the complex plane by tracking the signs of the even and odd portions of the quotient for μ_m.
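The two series used for the angle increment are the standard expansions of the arctangent of μ_m for small and large arguments. The sketch below evaluates them with a finite number of terms and compares against the library arctangent. The truncation at 60 terms, the sign handling for negative μ_m in the large-argument branch, and the use of `math.atan2` for quadrant tracking are assumptions of this sketch; the series converge slowly when |μ_m| is close to 1.

```python
import math

def atan_series(mu, terms=60):
    """Truncated series for arctan(mu), using the small- or large-argument form."""
    if abs(mu) < 1.0:
        q, lead, offset = mu * mu, mu, 0.0
    else:
        # pi/2 - (1/mu) * sum(...), with the sign of pi/2 following mu.
        q, lead, offset = 1.0 / (mu * mu), -1.0 / mu, math.copysign(math.pi / 2, mu)
    total, power = 0.0, 1.0
    for v in range(terms):
        total += (-1) ** v * power / (2 * v + 1)
        power *= q
    return offset + lead * total

def angle_increment(even_part, odd_part):
    """Quadrant-aware increment from the weighted even (angular) and odd
    (radial) components, whose quotient is mu."""
    return math.atan2(even_part, odd_part)

for mu in (0.5, 2.0, -3.0):
    print(mu, atan_series(mu), math.atan(mu))

# Same quotient mu = -1, different quadrants:
print(angle_increment(1.0, -1.0), angle_increment(-1.0, 1.0))
```

The last line shows why the signs of the numerator and denominator must be tracked separately: the quotient alone cannot distinguish the two cases.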

Note that the function traced out by the example plot is monotonic and smooth. However, the differential components, a_n, can take on +/− values so that the sampling function can meander anywhere within the plane. If gradients of D_ε_R are large and negative in the radial direction, then the weights in the radial direction (odd weights) will be large and the a_odd will be negative. The same reasoning applies for the {right arrow over (a)}_φ direction as well.

The sample point locations within the output complex plane are r_m, φ_m. In order to ascertain the inputs, the distortion is reversed and the sample is mapped to the input since the input stimulus is known to be perfectly linear along {right arrow over (a)}_r. Therefore the cross coupling of components in the output (AM-PM conversion) can be easily detected by the gradient calculation.

This analysis reveals the following principles:

- 1) The analysis of the output samples for radial input sampling should occur on the natural trajectories or contours of the warped output radials;
- 2) Sampling radially can detect the distortions giving rise to D_ε_R only in the following manner, if a single radial thread is analyzed:
  - a) Local AM-AM distortions such as amplitude compression and expansion can be detected; and
  - b) Local AM-PM distortions can be detected, i.e., cross coupling from the radial {right arrow over (a)}_r direction to the {right arrow over (a)}_φ direction.
- 3) Excitation of the {right arrow over (a)}_r direction at the input is an efficient detector of radial distortion at the output; and
- 4) The density of samples along the output radial threads (Euclidean distance between k_m) is different than the assigned distance at the input as a function of a_n and w_n.

(i) Averaged Weighting

The equations provided above with respect to FIG. 36 can provide fine structure tracking of the sampling gradient if a_n is a small increment. In the limit, a_n may be a differential component. It is possible under certain conditions to average the weights w_n. The even weights could be averaged separately from the odd weights, for example.

If the radial gradients are averaged then each of the a_n for n even would be weighted by {tilde over (W)}_AV_E. The odd a_n could be weighted by the odd average, W_AV_O. As such:

$W_{AV_{E}} = \left\{\frac{1}{2n}\right\}_{Int} \sum_{n=even} W_{n}$

$W_{AV_{O}} = \left\{\frac{1}{2n}\right\}_{Int} \sum_{n=odd} W_{n}$

where { }_Int denotes the integer value of the number in brackets. This approach may prove very practical under a number of circumstances, especially if multiple radial arms are averaged and calculated or if many devices are to be characterized. Then it is desirable to provide a universal weighting value, where possible, to reduce the number of calculations. It may be possible to use the same even and odd averaged weighting to characterize entire lots of components. Similarly, the angles may be indirectly characterized by using:

${\tilde{μ}}_{m} = \frac{{\overline{W}}_{{av}_{E}} a_{2 m}}{{\overline{W}}_{{av}_{o}} a_{2 m - 1}}$

In later sections it is suggested that the errors between the raw data and a polynomial fit to the radial sample arm may be utilized to form a MMSE (Minimum Mean Square Error). This type of smoothing fit or averaging fit is often preferred.

(e) Multiple Radial Sampling Arms

The previous section described that when the radial sampling approach is used in a single thread manner:

1. PM to PM and PM to AM distortion cannot be easily detected; and

2. Non homogeneous distortions in R² cannot be characterized.

Both 1 and 2 represent conditions of under sampling. D_ε_R is a two dimensional function when restricted to the complex plane and a single contour cannot characterize the 2 dimensional topology, unless the topology is trivial. A single thread could characterize the topology if it is a spiral which swirls from the origin outward toward the unit circle. However, for reasons stated earlier this violates the maximum gradient theorem, and would typically be inefficient in terms of sample deployment.

A multi radial sweep technique, as illustrated in FIG. 37, could potentially characterize the entire complex plane, when taken as an ensemble. As shown in FIG. 37, the entire plane is covered by 48 radial sweep arms in this example star burst. Although swept in radial bursts, the adjacent radial arm samples can be arranged or organized to present a circle like contour at an approximate radial offset r₀ and analyzed along the {right arrow over (a)}_φ direction.

There is, however, another concern of practical importance. As described in Section 2, there are multiple parallel impulse responses that make up the VPA transfer characteristic. Hence, bandwidth (BW) is a concern. That is, AM-PM, AM-AM, PM-PM, and PM-AM are potentially bandwidth dependent to some degree. It is widely known that the first two (i.e., AM-PM and AM-AM) can be bandwidth dependent. The second two (PM-PM and PM-AM) are often overlooked. The radial arm technique permits efficient means of exciting the VPA for detection of AM-AM and AM-PM. However, the bandwidth of sampling the sweeps successively to reveal PM-PM and PM-AM is roughly reduced by the number of samples along a radial arm. In the nominal multiple radial arm scheme, a single radial arm would complete prior to sweeping another radial arm somewhere in the complex plane. Therefore a newly excited φ_i at the input is deferred, radial by radial, at a much slower pace than the amplitude sweep of a particular radial arm.

Whether or not this is a drawback should be determined prior to selecting a sweep technique. For instance, if PM-PM and PM-AM possess a low sensitivity to bandwidth then perhaps a radial sweep is sufficient. More importantly, if the rate of change of phase within the targeted application (WCDMA, EDGE, GSM, etc.) can be emulated by the calibration signal then the BW is sufficient by definition.

Techniques may be employed which order the radial arms to be sampled in such a manner as to increase bandwidth of the phasor through phase modulation, by various means such as alternating quadrants of the complex plane or jumping from one radial arm to another, eventually completing all sample locations required for coverage. These techniques trade off bandwidth expansion due to radial domain (amplitude) fluctuation for angular fluctuation of the complex signal.
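One possible ordering of this kind, hopping between quadrants so that successive radial arms are far apart in angle, can be sketched as follows. The specific ordering rule, the helper names, and the use of the mean arm-to-arm phase jump as a rough proxy for phase-domain activity are assumptions made for illustration; the 48-arm count follows the star burst example above.

```python
import numpy as np

def quadrant_hop_order(n_arms):
    """Visit one arm per quadrant in turn; n_arms must be divisible by 4."""
    per_quadrant = n_arms // 4
    return [q * per_quadrant + i for i in range(per_quadrant) for q in range(4)]

def mean_phase_jump(order, n_arms):
    """Mean absolute angle change (radians) between successively swept arms."""
    angles = 2.0 * np.pi * np.asarray(order) / n_arms
    jumps = np.angle(np.exp(1j * np.diff(angles)))   # wrap to (-pi, pi]
    return float(np.mean(np.abs(jumps)))

n_arms = 48
sequential = list(range(n_arms))
hopped = quadrant_hop_order(n_arms)

jump_sequential = mean_phase_jump(sequential, n_arms)
jump_hopped = mean_phase_jump(hopped, n_arms)
print(f"sequential: {np.degrees(jump_sequential):.1f} deg per arm")
print(f"hopped:     {np.degrees(jump_hopped):.1f} deg per arm")
```

Both orderings cover the same 48 arms; only the sequence, and therefore the rate of phase change seen by the device between arms, differs.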

(f) Comparison of Circular and Radial Sampling

Earlier sections included significant discussion of radial sampling. This section provides some insight into circular sampling. FIG. 38 illustrates example input circle constellations.

Referring to FIG. 38, suppose that the input samples X_i are arranged on circles within the complex plane and that many such circles are utilized to excite the input, successively. The entire complex plane can be characterized if the density of circles and samples located on the circles are sufficient. Furthermore, PM-PM conversion as well as PM-AM conversion can be characterized. Notice that the direction of stimulus is always orthogonal to the radial direction, according to {right arrow over (a)}_φ. Hence, the PM-PM and PM-AM effects are completely decoupled from the AM-PM and AM-AM effects uncovered by radial sampling.

Let X(t) represent some input function which is to be sampled, to create X_i, the input samples. X(t) can be written as:

X(t)=A(t)e^{j(2πft+Θ(t))}

The amplitude and phase components of the complex phasor X(t) are illustrated in the equation as decoupled from one another, and as such they can be modulated independently. Moreover, their vector representations within the polar plane are orthogonal. This is an important observation for resolving the non-linearity mechanisms.

FIG. 39 illustrates example output constellations that correspond to the input constellations of FIG. 38. The output constellations appear distorted because the VPA is non-linear.

Note that D_ε_R may be obtained from the x_i and y_i as previously discussed in other portions of this disclosure. Based on the above, the following can be said regarding sampling options discussed thus far:

- 1. In the most general case, both radial domain and phase domain sampling would be required because AM-AM, AM-PM must be decoupled from PM-PM and PM-AM, assuming all mechanisms are present and significant;
- 2. The degree of importance of these orthogonal sampling approaches is directly related to the sensitivity of ∇D_ε_R obtained by/from exciting the input in one or the other direction, i.e., {right arrow over (a)}_r, {right arrow over (a)}_φ; and
- 3. Whether or not the AM-AM, AM-PM and PM-PM, PM-AM components can be resolved by some other non orthogonal sampling trajectories on the input is related to the bandwidth dependency due to phase variation (Θ(t)) and amplitude (A(t)) variation at the input and the sensitivity of the VPA to these variables.

The implication is that D_ε_R is a function of sweep rate and direction of sweep through the complex plane.

(g) Joint Sampling Approach

As indicated in sections (e) and (f), there are cases for which radial (amplitude domain) and phase domain sampling are warranted. In those cases a polar mesh can be obtained. This forms an input sampling web, as illustrated in FIG. 40, for example.

In a general case, the circles are generated independent of the radials, their rates are independent, and their sampling densities can also be controlled independently. If so desired, the polar coordinates could be converted to rectangular coordinates. The number of radials, the number of circles, and the sample density on circles and radials are variable parameters according to this approach.
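A polar mesh with independently chosen numbers of radials and circles, and independent sample densities on each, might be generated as in the following sketch. The function name `polar_mesh`, its parameters, and the uniform spacing are illustrative assumptions; a gradient-biased (non-uniform) spacing could be substituted.

```python
import numpy as np

def polar_mesh(n_radials, n_circles, pts_per_radial, pts_per_circle, r_max=1.0):
    """Return (radials, circles) as complex arrays of input sample points.

    radials[k, :] sweeps amplitude along the k-th radial arm.
    circles[j, :] sweeps phase around the j-th circle.
    """
    arm_angles = 2.0 * np.pi * np.arange(n_radials) / n_radials
    arm_r = np.linspace(0.0, r_max, pts_per_radial)
    radials = arm_r[None, :] * np.exp(1j * arm_angles[:, None])

    circle_r = r_max * np.arange(1, n_circles + 1) / n_circles
    circle_phi = 2.0 * np.pi * np.arange(pts_per_circle) / pts_per_circle
    circles = circle_r[:, None] * np.exp(1j * circle_phi[None, :])
    return radials, circles

radials, circles = polar_mesh(n_radials=48, n_circles=8,
                              pts_per_radial=16, pts_per_circle=64)

# Conversion to rectangular coordinates is simply the real and imaginary parts.
radial_i, radial_q = radials.real, radials.imag
print(radials.shape, circles.shape)
```

Each of the four counts can be changed without affecting the others, which mirrors the independence of radial and circular sampling densities described above.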

(i) Cross Correlation

If circular sampling does not reveal any new and unique information compared to radial sampling, then it is not required. Similarly, if radial sampling does not reveal any new and unique information compared to circular sampling, then radial sampling is not required.

Suppose a fine input polar sampling mesh is generated. Then, output samples y_i({right arrow over (a)}_r) and y_i({right arrow over (a)}_φ) are compared (for common input sample locations) by correlating the data. The peak correlation coefficient can be represented by:

$ρ_{φ_{i},r_{i}} = E\{(\hat{y}_{φ_{i}})(\hat{y}_{r_{i}})\}$ (co-variance)

$\hat{y}_{φ_{i}} = y_{φ_{i}} - \bar{y}_{φ}$

$\hat{y}_{r_{i}} = y_{r_{i}} - \bar{y}_{r}$

The mean values ȳ_φ or ȳ_r are obtained on circle contours or radial contours (whichever is desired) at the mesh intersections (or interpolated intersections) so that the inputs, whether circular or radial, would be expected to give rise to the same output numbers. That is, mesh crossings on the input should give rise to mesh crossings on the output which contain information regarding the related correspondence of input and output data versus sweep method. The correlation coefficient is a measure of how much the sweep methods are similar in their results.

The hat symbol in the equations above denotes a normalization process such that the maximum cross correlation value is unity. The input mesh crossings are exactly identical for radial or circular sweeps and possess a cross correlation coefficient value of 1.

Consider exciting the input with radial sweeps. At some input radius, r_jx_k, there is a corresponding output radius value r_jk_k on each of the k^th radials.

As such, a set of constant r_j input points would spawn a set of constant r_j·c output radial points if the system were linear. These k points would plot out a circle if k radials were used to cover R², as illustrated in FIG. 41. Many such circles of varying r_j radii could be formed, or arranged. A set of such points can be assembled as:

Ψ_jk={(rr_jx,rφ_kx),(rr_jy,rφ_ky)}

- j=1, 2 . . . discrete radii
- k=1, 2 . . . radial angles

It is assumed that the means can be removed from these data sets. x's correspond to inputs and y's correspond to outputs.

Another set of data points can be collected by stimulating the input with circles, and organizing the data on circles for the input and corresponding output. This data is collected into the set (adjusted for zero mean):

Λ_jk={(φr_jx,φφ_kx),(φr_jy,φφ_ky)}

The coordinates for

$\overbrace{({rr}_{jx}, r φ_{kx})}^{\text{radial sweep}} = \overbrace{(φ r_{jx}, φ φ_{kx})}^{\text{circular sweep}}$

are at the input mesh intersection points. The difference is in how they are generated and organized. One set is generated along the {right arrow over (a)}_r direction and the other set is generated along the {right arrow over (a)}_φ direction. Now the output data generated by {right arrow over (a)}_r directional inputs are also collected into concentric rings. Then they are correlated with outputs which result when the input stimulus is along {right arrow over (a)}_φ. This can be written as:

$ρ_{Ψ Λ} = \frac{E \{(Ψ_{y}) (Λ_{y})\}}{K_{N}}$

Both Ψ and Λ are functions of r and φ but are obtained from sweeping the input in different experiments by radial ({right arrow over (a)}_r) or orthogonal ({right arrow over (a)}_φ) excitation. K_N is a normalization factor. The indices j, k can be tracked and assigned to ρ so that regional correlations can be assigned.

The above described process can also be accomplished by sweeping {right arrow over (a)}_φ, {right arrow over (a)}_r and organizing on radials, rather than circles.

If ρ_ΨΛ=1 at any radius then sampling in the {right arrow over (a)}_φ direction does not yield any additional data compared to input sampling in the {right arrow over (a)}_r direction. If ρ≠1 then some sampling in both directions is warranted.

ρ is a function of r and φ and therefore is complex. Thus |ρ| and ∠ρ can be obtained much in the same manner that D_ε_R is processed to form error surfaces with Z as the third dimension in cylindrical coordinates. The correlation surfaces are a metric for increasing or decreasing the orthogonal sampling densities. The natural tendency might be to reduce the number of orthogonal sampling contours until there is little cross correlation. This can only be accomplished provided that the remaining minimal mesh does not permit the error function and its gradient to sift through, undetected. That is, reducing sample contours and densities based on ρ_ΨΛ is acceptable, with the provision that D_ε_R can still be faithfully acquired by applying principles of the sampling theorem.

(ii) Radial to Radial Correlation

If a single radial could characterize the entire complex plane, then there would be no need for multiple radials. That is, if the amplitude distortion is not a function of φ then a single radial sweep is sufficient for characterizing AM-AM and AM-PM phenomena. If these distortions vary as a function of φ, however, then more than a single radial is required.

Within the set Ψ_jk the numbers exist, parsed in a different manner, to produce additional cross correlations of interest. All j samples of the k^th radial must be cross correlated with all j samples of the υ^th radial. Since each k^th radial possesses a unique spawning φ_k, this φ_k must be accounted for in the cross correlation process. That is, φ_k is a metric associated with the input sweep and must not bias the correlation data:

ρ_r(k,υ)=E{{circumflex over (ψ)}_rk·{circumflex over (ψ)}_rυ}

- k=1, 2, 3 . . .
- υ=1, 2, 3 . . .

Whenever k=υ, then ρ_r(k,υ)=1. It is assumed that the mean values have been extracted from the data set and that the data are suitably normalized for max{ρ_r(k,υ)}=1. If adjacent radials possess a correlation coefficient ρ≈1, the radials are too closely spaced. On the other hand, the radials cannot be so sparsely positioned that D_ε_R cannot be accurately reconstructed.

(iii) Circle to Circle Correlation

Concentric circle sweeps can be correlated to one another in a manner prescribed for the radials in Section (ii). The parsing of Λ_jk is required so that all k samples of the j^th ring or circle are correlated to all k samples of the l^th ring so that:

ρ_φ(j,l)=E{{circumflex over (Λ)}_cj·{circumflex over (Λ)}_cl}

- j=1, 2, . . .
- l=1, 2, . . .

ρ_φ(j,l)=1 for j=l and the data sets have been adjusted for zero means and are normalized. It is noted that very high values for ρ imply that circles are too closely spaced. In this case, sampling densities must be accounted for since they may vary from ring to ring. One method of accomplishing this is by interpolating the connecting rings prior to mean extraction, normalization, and correlation.

(h) Sampling Density and Sample Rate

There are several sampling constraints which must be applied to accomplish the characterization of the VPA, including:

- 1. Adequate sampling is required for constructing approximations to the raw sample data which are robust and which can be described by convenient continuous functions;
- 2. Sampling must be sufficient to cover the entire R^N region to efficiently reveal D_ε_R and ∇D_ε_R; and
- 3. Since the stimulus and the response of the VPA involve real signals as a function of time, they also possess Fourier Transforms and must be sampled according to Sampling Theorem principles subject to the frequency content revealed by the transform.

Item 3) is the concern of this section. The Fourier Transform of the input signal is given by:

$X (f) = \int_{- \infty}^{\infty} x (t) e^{- j 2 π f t} dt$

This form assumes knowledge of the continuous input signal function, x(t). Numerical computation usually demands the discrete transform:

$\tilde{X} (\frac{n}{N_{s} T_{s}}) = \sum_{k = 0}^{N_{s} - 1} x ({kT}_{s}) e^{- j 2 π nk / N_{s}}$

$n = 0, 1, 2, \dots (N_{s} - 1)$

This form relates N_s time samples to N_s frequency samples.

Shannon's Sampling Theorem:

The signal x(t) must be sampled at a rate T_s⁻¹>2ƒ_max, where 2ƒ_max, known as the Nyquist rate, is calculated from the maximum significant frequency content of the transform.

It follows that sample aliasing will be minimized and complete signal reconstruction is possible.
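As a minimal sketch of how the discrete transform and the sampling constraint might be checked numerically, the following assumes a real-valued test signal and a hypothetical significance threshold of 1% of the spectral peak. The helper names `dft` and `max_significant_frequency`, the threshold, and the 50 Hz test tone are illustrative assumptions only.

```python
import cmath
import math

def dft(samples):
    """Direct N_s-point discrete Fourier transform of a sample sequence."""
    n_s = len(samples)
    return [
        sum(samples[k] * cmath.exp(-2j * math.pi * n * k / n_s) for k in range(n_s))
        for n in range(n_s)
    ]

def max_significant_frequency(samples, t_s, rel_threshold=0.01):
    """Highest frequency (Hz) whose bin magnitude reaches rel_threshold * peak.

    Only bins 0..N_s/2 are inspected, since a real signal has a
    conjugate-symmetric spectrum.
    """
    spectrum = dft(samples)
    n_s = len(samples)
    half = [abs(v) for v in spectrum[: n_s // 2 + 1]]
    peak = max(half)
    significant = [n for n, mag in enumerate(half) if mag >= rel_threshold * peak]
    # Bin n corresponds to the frequency n / (N_s * T_s).
    return max(significant) / (n_s * t_s)

# Example: a 50 Hz tone sampled at 1 kHz over 5 whole cycles (no leakage).
t_s = 1e-3
x = [math.cos(2 * math.pi * 50.0 * k * t_s) for k in range(100)]
f_max = max_significant_frequency(x, t_s)
nyquist_rate = 2.0 * f_max
sample_rate_is_adequate = (1.0 / t_s) > nyquist_rate
```

In practice the threshold that defines "significant" frequency content would be chosen from the accuracy required of the reconstructed signal.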

x(t) is represented as some related function within the complex plane. This representation is not sufficient without the inclusion of the time variable in terms of sampling theorem consideration. That is, x(t) is actually of the form:

$x (t) = A (t) e^{- j (2 π f_{c} t + Θ (t))}$

The input samples discussed previously are samples of this function acquired at discrete intervals. The general exponential form of x(t) is used above though not required. x(t) in this form permits the following:

- x(t) can represent baseband and passband waveforms;
- x(t) can be resolved into complex (I & Q) components;
- A(t) permits amplitude domain variation;
- ƒ_c(t) can permit frequency variation if desired;
- Θ(t) can permit phase variation; and
- Any point in R² can be swept and acquired via x(t).

Now the output transform is obtained from;

$Y (\frac{n}{N_{s} T_{s}}) = \sum_{k = 0}^{N_{s} - 1} (x ({kT}_{s}) * V ({kT}_{s})) e^{- j 2 π nk / N_{s}}$

where * is a shorthand notation for convolution. {V} is the transfer characteristic defined by Volterra (see section 2) or approximation thereof. The transfer characteristic of the VPA is non-linear and therefore gives rise to additional frequency components not found within x(t). That is, y(t) possesses unwanted harmonics and intermodulation distortion which is revealed in the transform Y above.
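The appearance of frequency components at the output that are absent from x(t) can be illustrated with a deliberately simple stand-in. The sketch below assumes a memoryless cubic distortion, which is a hypothetical toy model and not the Volterra characterization of the VPA; the coefficient `a3` and the helper names are likewise assumptions.

```python
import cmath
import math

def dft_bin(samples, n):
    """Magnitude of the n-th DFT bin, normalized by the number of samples."""
    n_s = len(samples)
    acc = sum(samples[k] * cmath.exp(-2j * math.pi * n * k / n_s) for k in range(n_s))
    return abs(acc) / n_s

def toy_nonlinearity(v, a3=0.1):
    """Hypothetical memoryless cubic distortion; NOT the disclosed VPA model."""
    return v - a3 * v ** 3

n_s = 64
cycles = 4  # the input tone sits exactly on bin 4
x = [math.cos(2 * math.pi * cycles * k / n_s) for k in range(n_s)]
y = [toy_nonlinearity(v) for v in x]

# The third harmonic (bin 12) is absent at the input but present at the output.
input_third = dft_bin(x, 3 * cycles)
output_third = dft_bin(y, 3 * cycles)
```

Since cos³θ = (3 cos θ + cos 3θ)/4, the cubic term places energy at three times the input frequency, so the output must be sampled according to its own, wider spectrum.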

The error function D_ε_R requires accurate knowledge of x(t) and y(t). D_ε_R indirectly portrays V by revealing its impact. Shannon's requirements dictate the Nyquist rate for all the functions: x(t), y(t), and D_ε_R.

If the sampling rate is adequate for x(t) and y(t) then the difference;

$D_{ɛ_{R_{i}}} = {kx}_{i} - y_{i}$

will be adequate. However, ∇D_ε_R and higher order differential processing techniques could dictate higher sampling rates. As such, the sampling rate from a Shannon constraint will be driven from the output spectrum Y and the spectra related to the gradient of D_ε_R or n-th order directional derivatives of D_ε_R, if such gradients are required.

One approach is to oversample the input X_i sufficiently to account for all anomalies at the output. Another approach is to create greater density of samples in the areas of the output where D_ε_R is non-linear or gradients are more active. This forces non-uniform sampling intervals at the discrete input samples X_i, due to the effect of {V}. There are variations of these themes which permit convenient input sample definition while accommodating output bandwidth expansion using interpolation, high resolution sampling clocks combined with sparse sampling techniques, etc.

(i) Sampling Density

Based on the teachings herein, one skilled in the art would appreciate that the sampling densities are a function of the sweep rates and the information content of D_ε_R over the region of interest. As suggested above, the sample density can be required to meet the Nyquist rate at a minimum for all processed signals, given knowledge of their transforms. This rate is then translated to the distance along the output contour based on the rate of change of the sweep selected for that region.

(i) Intermediate Summary

The following statements summarize certain concepts disclosed thus far in this disclosure:

- 1. The VPA is a non-linear device with a transfer characteristic that is a function of power supply, gain, carrier frequency, temperature, waveform bandwidth, transitional state through the complex plane, etc.;
- 2. The VPA can be modeled by Volterra kernels;
- 3. Traditional ‘real time’ feed forward and feedback processing approaches can compensate for non-linearities to a degree, but suffer from a perspective of complexity, inefficiency, size and performance. A compensation technique which characterizes the VPA ‘off line’ and creates a manageable mathematical model may be executed in real time hardware, using a moderate amount of memory to store certain calibration factors;
- 4. The VPA is characterized in each state by error functions which compare a desired ideal response to actual responses. These error functions can be represented in R², R³ . . . R^N where R^N is an N dimensional space;
- 5. Correction can be applied based on the error functions, thereby compensating the VPA to obtain a nearly linear transfer characteristic for each VPA state;
- 6. Error functions are created from discrete measurements or samples which are applied to the VPA input and observed at the output. The systematic application of these input samples creates contours of output samples in the complex plane which can be approximated or fit by polynomials or combinations of polynomials and other mathematical descriptions. The polynomial descriptions permit significant reduction of stored data;
- 7. Specific input sampling trajectories are required to exercise the VPA for full characterization. Furthermore, the rate of sampling and the transitions through the complex plane help determine AM-AM, AM-PM, PM-PM, and PM-AM performance of the VPA;
- 8. Sampling densities along sampling trajectories and sampling densities within regions of the complex plane are proportional to the directional gradient of the error function and other weighting considerations. Also, cross correlation functions are inversely proportional to the required sampling densities.

FIG. 42 is a process flowchart that illustrates a methodology for VPA error function characterization according to an embodiment of the present invention.

7. APPROXIMATION THEORY

Classical approximation theory involves the mathematical description of special functions by other simpler functions. The purpose of replacing one functional description for another is often related to calculation efficiency and convenience.

At first glance it may seem that approximation in its classical form has little to offer to the problem at hand since the VPA error function D_ε_R does not possess a closed form functional description. Indeed D_ε_R begins with a raw data set description and needs to be morphed to a functional description. Nevertheless, many theorems and ideas originally established by approximation theory can be applied to the problem at hand. The reason that this occurs naturally is because functions by nature are constraints on numerical domains represented by variables. Since solutions are always implemented on machines, functions need to be defined numerically and hence the forced connection.

In this section some principles and theorems are introduced, which will be used in later discussion and analysis.

The cornerstone approximation theorem proven by Weierstrass and then generalized by Stone can be stated as follows:

- Given a function ƒ ∈ R^N, we can approximate ƒ by continuous functions which also exist within R^N, and;

∥ƒ−g∥=∫_−∞^∞|ƒ(x)−g(x)|dx<ε

- Where g approximates f and the error ε is bounded according to the quality of the approximating function g.

This theorem may restrict the domain for the candidate approximating functions but does not restrict the family or form of the functions. Nor does the theorem imply accuracy, how to find such functions, etc.

A corollary of the above theorem (also from Weierstrass) is that:

- “A continuous function defined on a closed and bounded interval can be approximated uniformly by means of polynomials.”

The quality of the approximation to ƒ(x) given by g(x) can be measured from:

$\| f - g \|_{γ} = \{\int_{a}^{b} | f (x) - g (x) |^{γ} dx\}^{1 / γ}$

- a≦x≦b
- 1≦γ≦∞

This metric is known as the Lebesgue norm.

The γ=1 application is well known in the literature as a measurement norm. However, γ=2 is probably the most famous norm, also known as the Least Squares Metric. When γ→∞ the norm is known as the Chebyshev or uniform norm or also as the min-max solution. γ=2 and γ=∞ are by far the two most important exponents for the Lebesgue integral equation. However, custom applications may work well with γ=1 or some other value.
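For illustration, the norm can be approximated numerically for several values of γ. The sketch below assumes a simple midpoint rule with a hypothetical step count, and uses e^x approximated by the line 1+x on the interval from 0 to 1 purely as an example; none of these choices come from the disclosure.

```python
import math

def lebesgue_norm(f, g, a, b, gamma, steps=2000):
    """Approximate ||f - g||_gamma on [a, b] with the midpoint rule.

    gamma = math.inf returns the largest sampled |f - g| (uniform norm).
    """
    h = (b - a) / steps
    errors = [abs(f(a + (i + 0.5) * h) - g(a + (i + 0.5) * h)) for i in range(steps)]
    if math.isinf(gamma):
        return max(errors)
    return (sum(e ** gamma for e in errors) * h) ** (1.0 / gamma)

# Example: approximate e^x on [0, 1] by the line 1 + x.
f = math.exp
g = lambda x: 1.0 + x
n1 = lebesgue_norm(f, g, 0.0, 1.0, 1)
n2 = lebesgue_norm(f, g, 0.0, 1.0, 2)
n_inf = lebesgue_norm(f, g, 0.0, 1.0, math.inf)
```

The γ=∞ value is dominated by the single worst point, whereas γ=1 and γ=2 average the error over the interval, which is why the choice of γ changes which approximation is judged best.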

Favard is usually credited with the so called saturation theorems or descriptions of the saturation phenomenon. Each functional class can be segregated to ascertain limits for the accuracy of approximations. Both the target function and the approximating function play a role. Once a particular class of approximation is identified, a limit in performance is also predictable and cannot be improved substantially beyond an asymptotic limit. This phenomenon is known as saturation. The phenomenon, if it arises, can often be recognized empirically in practice, then numerically analyzed more rigorously as required.

(a) Fitting

Fitting refers to construction of special functions which either ‘pass through’ data points within a particular region or provide some best estimate of a function which passes in the vicinity of the data. FIG. 43 provides an example illustration. Notice that in the top fit the polynomial P_a(x) forces the condition that each datum must explicitly be defined as a specific solution. This is also sometimes referred to as an interpolating polynomial for a specific data set. The second example is such that the polynomial P_b(x) appears to the eye to be of a lower order or at least ‘smoothed’ such that it passes to within some prescribed distance but is not constrained to possess D₁ . . . D₆ as specific explicit solutions, individually. Typically, there is a constraint applied to P_b(x) in terms of the average ‘distance’ or norm from the data set. As presented in the previous section, the Lebesgue norm is the metric often applied, and the γ=2 least squares case is the most popular for a variety of reasons. One reason that γ=2 is appealing for certain applications is that amplitude errors can be converted to energy or power errors which in turn are often more relevant to the solution of certain real world problems. In addition, γ=2 provides a metric which converges in the solution formulation for a wide class of problems and therefore is often robust.

Least squares formulation can be global or local and can be weighted, or not. This provides for significant flexibility. Note that fitting typically requires numerical input to the algorithm rather than the presumption of a specific function type, like e^−x. Nonetheless, one could make the argument that e^−x could have been evaluated at certain points x₀, x₁ . . . etc., and that data could be used as an input to an algorithm without regard for the presumption of e^−x. In this sense, approximation and fitting share some common ground.

(i) Polynomial Fitting

Justification for polynomial fitting was provided by Weierstrass and Stone in the approximation problem which of course extends to the fitting problem as well.

Consider the general Taylor Series expansion given by:

$f (x) = f (a) + (x - a) \frac{\partial f (a)}{\partial x} + \frac{{(x - a)}^{2}}{2!} \frac{\partial^{2} f (a)}{\partial x^{2}} + \dots$

If the function is smooth near a then the expansion exists and there is a remainder term given by;

$Rem = \frac{{(x - a)}^{n + 1}}{(n + 1)!} f^{(n + 1)} (a - Δ)$, where x < a − Δ < a

Whenever a=0 the series is a Maclaurin series. As an example, e^x can be estimated in the vicinity of zero from;

$e^{x} = 1 + x + \frac{x^{2}}{2!} + \frac{x^{3}}{3!} + \frac{x^{4}}{4!} + \frac{x^{5}}{5!} \dots \frac{x^{n}}{n!}$

If x=0.01 then the approximation (using only 5 terms) yields 1.010050167, which agrees with a calculator to the displayed precision. On the other hand, consider the calculation for x=0.1 and 5 terms;

e^0.1≈1.105170914

Calculator→1.105170918

Notice that the approximation is diverging slightly. Hence, the polynomial is most accurate over a limited domain under certain conditions. This simple example illustrates polynomial representation utility for reconstructing functions. However, expansions of this form are not robust for general application over large intervals and other methods are required.
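The loss of accuracy away from the expansion point can be reproduced with a short sketch. It assumes that "5 terms" means the terms through x⁴/4!; the exact digits obtained depend on that counting convention, and the function name `maclaurin_exp` is hypothetical.

```python
import math

def maclaurin_exp(x, terms):
    """Sum of the first `terms` terms of the Maclaurin series of e^x."""
    total = 0.0
    term = 1.0  # x^0 / 0!
    for n in range(terms):
        total += term
        term *= x / (n + 1)  # next term: x^(n+1) / (n+1)!
    return total

# Print the truncated series and its absolute error at increasing distances from zero.
for x in (0.01, 0.1, 1.0):
    approx = maclaurin_exp(x, 5)
    print(x, approx, abs(approx - math.exp(x)))
```

The absolute error grows roughly as x⁵/5! for small x, which is consistent with the remainder term given above.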

Consider the following theorem which is a restatement of the Weierstrass and Stone theorem presented in the previous section:

- If {x₀, x₁, . . . x_n} is a set of n+1 numbers in R², then there exists a polynomial p(x) such that p(x) yields values y_i. That is, the solutions of p(x) correspond precisely to (x_i, y_i) at n+1 distinct points.

Thus p_n(x) has a form;

$p_{n} (x) = \sum_{i = 0}^{n} c_{i} \cdot x^{i} .$

This is in fact a fundamental theorem of algebra. By inspection it should be obvious that such a polynomial possesses up to n distinct roots. This fact permits the function defined by such a polynomial to change direction (the sign of its slope) up to n−1 times along an interval containing the roots. Coefficients within the expansion for p_n(x) can also superimpose movement on the average or direction of this oscillating function within the domain. This is obviously restricted to 2 dimensions for the particular theorem presented here. However, there are methods which extend variations of this theorem to consider R^N.

In order to determine p_n(x) the c_i must be calculated. The typical formulation is usually presented in matrix form as follows;

$(\begin{matrix} 1 & x_{0} & x_{0}^{2} & \dots & x_{0}^{n} \\ 1 & x_{1} & x_{1}^{2} & \dots & x_{1}^{n} \\ ⋮ & ⋮ & ⋮ & & ⋮ \\ 1 & x_{n} & x_{n}^{2} & \dots & x_{n}^{n} \end{matrix}) (\begin{matrix} c_{0} \\ c_{1} \\ ⋮ \\ c_{n} \end{matrix}) = (\begin{matrix} y_{0} \\ y_{1} \\ ⋮ \\ y_{n} \end{matrix})$

This can be rewritten in compact form as

[V][c]=[Y]

Solving for [c] yields;

[c]=[V]⁻¹[Y].

This is a classical problem in matrix algebra where the inverse of the Vandermonde matrix, [V], is usually the issue. In some cases, the inverse matrix is ill defined and therefore considerable algorithmic investment is required to avoid singularities or computational issues.
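A minimal sketch of solving [V][c]=[Y] is given below. It assumes exact rational arithmetic and plain Gauss-Jordan elimination, which is one possible way to sidestep round-off for small data sets and is not presented as the preferred method; the function names are hypothetical.

```python
from fractions import Fraction

def vandermonde_coefficients(xs, ys):
    """Solve [V][c] = [Y] for c_0..c_n by Gauss-Jordan elimination.

    Exact rational arithmetic avoids the round-off that makes this
    matrix troublesome in floating point.
    """
    n = len(xs)
    if len(set(xs)) != n:
        raise ValueError("x values must be distinct, otherwise [V] is singular")
    # Augmented matrix: row i is [1, x_i, x_i^2, ..., x_i^(n-1) | y_i].
    rows = [[Fraction(x) ** p for p in range(n)] + [Fraction(y)] for x, y in zip(xs, ys)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def evaluate(coeffs, x):
    """Evaluate c_0 + c_1 x + ... + c_n x^n by Horner's rule."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result
```

For example, the three points (0,1), (1,3), (2,7) yield the coefficients of 1+x+x², and the resulting polynomial reproduces each datum exactly.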

It is important to note that the solution, if it exists, yields a polynomial function which passes through the points (x_i, y_i) exactly. Consider the following heuristic example, illustrated in FIG. 44.

Notice that two polynomials (one a simple line) pass through the data x₀, x₁, x₂. . . . Without additional constraining information the choice of solution may not be considered unique.

That is, the theorems are more complete if and only if the order n is prescribed in the solution or if other constraints are prescribed. However, sometimes efficiency demands that n be minimized. In this case further considerations and constraints are warranted.

(b) Interpolation

As a specialized branch or technique of mathematics, interpolation theory developed at a slower pace, following approximation. Theoreticians have been primarily interested in convergence or representation of formulas for functions, particularly infinite series. However, interpolation is a tool for the applied mathematician or engineer or scientist to calculate specific numbers to some desired accuracy.

Interpolation in an extended sense is a method of creating new data points from pre existing data points constrained on the interval of the pre existing data points. In addition, interpolation is often referred to as fitting a specific polynomial or other function description to a specific set of data over an interval. Calculating points outside of the interval is known as extrapolation or prediction and is not contemplated within interpolation theory.

Consider the graphical example illustrated in FIG. 45. If X₀, X₁ . . . X₄ are data points in R², then they exist as ordered pairs, say (x₀, y₀), (x₁, y₁) . . . (x₄, y₄). Suppose that we wish to calculate a point on the interval between X₃ and X₄, named X_I_3,4. This can be accomplished in a number of ways. There are some facts to observe:

- 1. A simple data set (x₀, y₀), (x₁, y₁) . . . (x_k, y_k) does not provide enough information to solve the interpolation problem uniquely as stated;
- 2. A functional relationship must be assumed for the set of numbers in order to calculate other numbers not originally provided in the data set;
- 3. Continuity of functions and the existence of functional derivatives is a requirement along the interval of interest;
- 4. The interpolation technique may include consideration of all or a portion of the data set (x₀, y₀) . . . (x_k, y_k).

In the simple case of linear interpolation all of the data points of the data set may be constrained such that each datum is assumed to be connected to the next datum via a straight line and the line is everywhere differentiable along the interval bounded by the end points x_i and x_i+1. Nevertheless, the derivatives of the piecewise reconstruction may or may not be well defined at the connecting nodes or knots.

Facts (1 through 4 above) provide the necessary constraints to calculate intermediate values by assigning coefficients (m, b) to the equation of a line,

$y_{I} = {mx}_{I_{3, 4}} + b = (\frac{y_{i + 1} - y_{i}}{x_{i + 1} - x_{i}}) x_{I_{3, 4}} + y_{int}$

where y_int is defined as the y intercept for this example.
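The line equation above can be sketched directly. The function name `linear_interpolate` and the decision to reject points outside the interval (which would be extrapolation) are illustrative assumptions.

```python
def linear_interpolate(x0, y0, x1, y1, x):
    """Value at x of the straight line through (x0, y0) and (x1, y1)."""
    if x0 == x1:
        raise ValueError("the two knots must have distinct x values")
    if not min(x0, x1) <= x <= max(x0, x1):
        raise ValueError("x lies outside the interval; that would be extrapolation")
    m = (y1 - y0) / (x1 - x0)  # slope between the two bounding data points
    y_int = y0 - m * x0        # y intercept of that line
    return m * x + y_int
```

For instance, the midpoint between the data (3, 4) and (4, 5) interpolates to 4.5.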

Given the five original data points X₀, X₁, X₂ . . . X₄, a polynomial fit can be defined which includes the data (x₀, y₀), (x₁, y₁) . . . (x₄, y₄) as solutions and everywhere possesses a derivative on the interval between (x₀, y₀) and (x₄, y₄). FIG. 45 illustrates that p₄(x) is a curvaceous trace given by the dotted trajectory. Conveniently, the datum X_I_3,4, for this graphic example alone, is a common solution point in the linear interpolator case and p₄(x) interpolator case. This is not the case in general and even for this example it is easy to recognize that if X_I_3,4 were decreased slightly or increased slightly the two interpolation techniques would yield different results at that point. Hence, it is important to recognize the degrees of freedom possible with a variety of classes of interpolation problems.

In the following sections, interpolation formulas and techniques are provided.

(i) Newton's Formula

Newton's interpolation formula using the method of divided differences is simply stated without proof as:

$f (x) = f (x_{0}) + (x - x_{0}) \underbrace{\frac{f (x_{0}) - f (x_{1})}{x_{0} - x_{1}}}_{\text{divided difference } f (x_{0}, x_{1})} + (x - x_{0}) (x - x_{1}) \underbrace{\frac{1}{x_{0} - x_{2}} [\frac{f (x_{0}) - f (x_{1})}{x_{0} - x_{1}} - \frac{f (x_{1}) - f (x_{2})}{x_{1} - x_{2}}]}_{\text{divided difference } f (x_{0}, x_{1}, x_{2})} + \dots + R$

where R is a remainder term that also may be approximated from:

$R = (x - x_{0}) (x - x_{1}) \dots (x - x_{k}) \overbrace{f (x, x_{0} \dots x_{k})}^{\text{kth divided difference}}$

R is easiest to calculate when the form of ƒ(x) is known. However, the formulas can be applied mechanically as well without apriori knowledge of ƒ(x), provided all the data is available, as a set. R can be estimated by the additional data available from an extended functional description.
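The mechanical application of the formula to a data set can be sketched as follows. This is a minimal illustration that builds the divided differences in place and evaluates the Newton form by nested multiplication; the remainder term R is not estimated, and the function names are hypothetical.

```python
def divided_differences(xs, ys):
    """Return f[x0], f[x0,x1], ..., f[x0..xn] (top edge of the difference table)."""
    coeffs = [float(y) for y in ys]
    n = len(xs)
    for order in range(1, n):
        # Work from the bottom up so lower-order entries are still available.
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs

def newton_evaluate(xs, coeffs, x):
    """Evaluate the Newton form with nested multiplication."""
    result = coeffs[-1]
    for i in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[i]) + coeffs[i]
    return result
```

As a check, data taken from x² at x = 0, 1, 3 give the divided differences 0, 1, 1, so the Newton form reduces to x(x−1)+x = x².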

Lagrange and Jensen taught methods of calculating Newton's remainder and the formula using divided differences. Stirling also taught a similar formula, as well as Bessel. Bessel's and Stirling's formulas were originally derived by Newton. Gauss and Everett also joined these formulas with their own. The difference between these formulas involves how other numbers within the data set are permitted to influence local calculations (variations primarily in the divided difference term). In some cases, offset intervals, half distance formulas between data, and arithmetic means of differences are utilized to augment the formula variations, which all have provision for remainder terms. Newton's original formula and the numerous variations are still used today and under certain circumstances considered to be very efficient, even for computational digital electronics.

(ii) Lagrange Interpolation

Lagrange had the idea that a single interpolation could be broken up into a set of (n+1) simple problems where n+1 is the number of data in the set. In previous sections it was stated that an n^th order polynomial can represent the function associated with a data set consisting of n+1 points. This is in fact a fundamental theorem. Lagrange supposed that a set of polynomials could be obtained, each of which is known as a cardinal function. Then, a linear combination of these Lagrangian polynomials is used to construct the desired polynomial. Once the final polynomial p_n(x) is constructed, then any value within this domain can be calculated.

The Lagrangian solution has the form:

$p_{n} (x) = \sum_{i = 0}^{n} f (x_{i}) L_{i} (x)$

The Lagrangian polynomials possess the following property:

$L_{i} (x_{j}) = \begin{cases} 1, & i = j \\ 0, & \text{elsewhere} \end{cases}$

The above restriction implies roots (i≠j) and leads to the conclusion that;

$L_{i} (x) = k (x - x_{0}) \dots (x - x_{i - 1}) (x - x_{i + 1}) \dots (x - x_{n})$

where k is a constant which can be calculated from the fact:

$L_{i} (x_{i}) = 1 = k \prod_{\underset{j \neq i}{j = 0}}^{n} (x_{i} - x_{j})$

Therefore:

$L_{i} (x) = \frac{(x - x_{0}) \dots (x - x_{i - 1}) (x - x_{i + 1}) \dots (x - x_{n})}{(x_{i} - x_{0}) \dots (x_{i} - x_{i - 1}) (x_{i} - x_{i + 1}) \dots (x_{i} - x_{n})}$

$L_{i} (x) = (\frac{\prod_{\underset{j \neq i}{j = 0}}^{n} (x - x_{j})}{\prod_{\underset{j \neq i}{j = 0}}^{n} (x_{i} - x_{j})}), \quad i = 0, 1, \dots n$

An example may provide some insight into the application of the equations. Consider the following data points:

- (0,2)(3,4)(4,5)(6,10)

Using the above equations;

$L_{0} (x) = \frac{(x - 3) (x - 4) (x - 6)}{(0 - 3) (0 - 4) (0 - 6)} = - \frac{1}{72} (x^{3} - 13 x^{2} + 54 x - 72)$

$L_{1} (x) = \frac{(x - 0) (x - 4) (x - 6)}{(3 - 0) (3 - 4) (3 - 6)} = \frac{1}{9} (x^{3} - 10 x^{2} + 24 x)$

$L_{2} (x) = \frac{(x - 0) (x - 3) (x - 6)}{(4 - 0) (4 - 3) (4 - 6)} = - \frac{1}{8} (x^{3} - 9 x^{2} + 18 x)$

$L_{3} (x) = \frac{(x - 0) (x - 3) (x - 4)}{(6 - 0) (6 - 3) (6 - 4)} = \frac{1}{36} (x^{3} - 7 x^{2} + 12 x)$

The final interpolating values are obtained by the function;

$\begin{matrix} p_{3} (x) = f (x_{0}) L_{0} + f (x_{1}) L_{1} + f (x_{2}) L_{2} + f (x_{3}) L_{3} \\ = - \frac{1}{36} (x^{3} - 13 x^{2} + 54 x - 72) + \\ \frac{4}{9} (x^{3} - 10 x^{2} + 24 x) - \\ \frac{5}{8} (x^{3} - 9 x^{2} + 18 x) + \\ \frac{5}{18} (x^{3} - 7 x^{2} + 12 x) \end{matrix}$

p₃(x) then is an interpolating polynomial formed from the Lagrange interpolating cardinal functions with solutions at the given data points. With this final solution, any point (x_k, y_k) on the interval of the function can be calculated and is said to also be an interpolated value. The functions L₀(x), . . . L_n(x) are linearly independent along the interval.
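The example data set (0,2), (3,4), (4,5), (6,10) can be checked with a short sketch of the cardinal functions. Exact rational arithmetic is assumed here only so that the results can be compared without round-off; the function names are hypothetical.

```python
from fractions import Fraction

def lagrange_basis(xs, i, x):
    """Cardinal function L_i(x): 1 at xs[i], 0 at every other node."""
    value = Fraction(1)
    for j, xj in enumerate(xs):
        if j != i:
            value *= Fraction(x - xj) / Fraction(xs[i] - xj)
    return value

def lagrange_interpolate(xs, ys, x):
    """p_n(x) as the linear combination of cardinal functions weighted by the data."""
    return sum(Fraction(y) * lagrange_basis(xs, i, x) for i, y in enumerate(ys))

xs = [0, 3, 4, 6]
ys = [2, 4, 5, 10]
p_at_5 = lagrange_interpolate(xs, ys, 5)  # interpolated value between x = 4 and x = 6
```

Evaluating at the nodes returns the original data, and at any x the cardinal functions sum to one, which is a convenient sanity check on an implementation.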

(iii) Hermitian (Osculatory) Interpolation

Hermite solved a similar interpolation problem (similar to Lagrange) but included an additional restriction, slope (derivative) at each coordinate. Thus there are n+1 distinct points given by:

(x₀,y₀),(x₁,y₁) . . . (x_n,y_n)

And their slopes are given by:

ƒ′₀,ƒ′₁, . . . ƒ′_n

This will spawn the natural requirement of a polynomial of order 2n+1, p_2n+1(x). The calculations are obtained from:

$p_{2 n + 1} (x) = \sum_{i = 0}^{n} α_{i} (x) f (x_{i}) + \sum_{i = 0}^{n} β_{i} (x) f^{'} (x_{i})$

$α_{i} (x) = (1 - 2 L_{i}^{'} (x_{i}) (x - x_{i})) {L_{i} (x)}^{2}$

$β_{i} (x) = (x - x_{i}) {L_{i} (x)}^{2}$

Notice that the Lagrange interpolation functions are part of the solution.
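A minimal sketch of these calculations follows. It assumes that the derivative of the cardinal function at its own node is obtained from the identity L_i′(x_i)=Σ_{j≠i} 1/(x_i−x_j), a standard result that is not stated in the text above; the function name is hypothetical.

```python
def hermite_interpolate(xs, ys, slopes, x):
    """Evaluate the Hermite polynomial matching values ys and slopes at nodes xs."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # L_i(x) and L_i'(x_i) for the Lagrange cardinal function of node i.
        l_i = 1.0
        dl_i = 0.0
        for j in range(n):
            if j != i:
                l_i *= (x - xs[j]) / (xs[i] - xs[j])
                dl_i += 1.0 / (xs[i] - xs[j])
        alpha = (1.0 - 2.0 * dl_i * (x - xs[i])) * l_i ** 2
        beta = (x - xs[i]) * l_i ** 2
        total += alpha * ys[i] + beta * slopes[i]
    return total
```

With two nodes the result is a cubic, so values and slopes sampled from x³ at x = 0 and x = 1 reproduce x³ everywhere on the interval.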

Another method of function construction is based on the sum of orthogonal polynomials. Chebyshev polynomials are an important class of such polynomials and can be used for functional reconstruction. The basic form of the solution is:

$f (x) = \sum_{i = 0}^{n} c_{i} T_{i} (x)$

T_i(x) are the polynomials which are defined as:

$T_{0} (x) = 1$

$T_{1} (x) = x$

$T_{2} (x) = 2 {xT}_{1} (x) - T_{0} (x) = 2 x^{2} - 1$

$T_{3} (x) = 2 {xT}_{2} (x) - T_{1} (x) = 4 x^{3} - 3 x$

$⋮$

$T_{n + 1} (x) = 2 {xT}_{n} (x) - T_{n - 1} (x)$

The even order polynomials are even functions while the odd order polynomials are odd functions. An alternate representation is often used:

T_n(x)=cos(n cos⁻¹(x))

The coefficients can be found from;

$c_{j} = \frac{2}{π} \int_{- 1}^{1} \frac{T_{j} (x) f (x)}{\sqrt{1 - x^{2}}} dx = \frac{2}{π} \int_{0}^{π} f (\cos (Θ)) \cos (j Θ) dΘ$
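The recurrence and the coefficient integral can be sketched numerically as follows. The sketch assumes a midpoint sum over Θ with a hypothetical node count, and applies the conventional factor of one half to the j=0 term when the series is evaluated; both are assumptions not spelled out above.

```python
import math

def chebyshev_t(n, x):
    """T_n(x) from the recurrence T_{n+1} = 2x T_n - T_{n-1}."""
    t_prev, t_curr = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

def chebyshev_coefficients(f, degree, nodes=64):
    """Approximate c_j = (2/pi) * integral_0^pi f(cos t) cos(j t) dt by a midpoint sum."""
    thetas = [math.pi * (k + 0.5) / nodes for k in range(nodes)]
    return [
        (2.0 / nodes) * sum(f(math.cos(t)) * math.cos(j * t) for t in thetas)
        for j in range(degree + 1)
    ]

def chebyshev_series(coeffs, x):
    """Evaluate the series; the j = 0 term carries the conventional factor 1/2."""
    return 0.5 * coeffs[0] + sum(c * chebyshev_t(j, x) for j, c in enumerate(coeffs) if j > 0)
```

As a check, x² equals (T₀+T₂)/2, so the computed coefficients for ƒ(x)=x² are approximately 1, 0, and 0.5, and the series reproduces x² on the interval from −1 to 1.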

An important theorem associated with Chebyshev's work, known as the Equi-Oscillation Theorem, is stated as follows:

- Let ƒ(x) be a continuous function defined on the interval [a, b] and p_n(x) be a polynomial approximation, of degree n, for ƒ(x) on [a, b]. Then an error function is defined as:

$E_{n_{i}} = f (x_{i}) - p_{n} (x_{i})$

- If there are at least n+2 points along the interval [a, b] where the error function is bounded by maximum values according to

$e_{n} (x_{i}) = (- 1)^{i} E_{n}, \quad i = 0, 1, \dots n + 1$

- Then p_n(x) is known as a minimax approximation. Such an approximation is unique for p_n(x) with degree ≦n.

The approximating function is such that it meanders around the desired ƒ(x), back and forth, with a defined and bounded maximum error. Furthermore, it is known to be a best approximation for p_n(x) whenever the degree is ≦n. Although this error metric is very different than the least squares, it is contemplated by the Lebesgue norm when γ=∞. The Chebyshev polynomials often produce a result approaching the minimax solution in performance, and are considered a minimax class solution with slight modifications. Cheney in 1966 provided an excellent discussion and analysis of the minimax problem. The reader is also referred to the Remes Algorithm which is popular for various signal processing applications. The basic Chebyshev approach has been refined using modified Chebyshev polynomials and explicit minimax criteria, using the Remes algorithm for example. The results achieved are incrementally more accurate than basic Chebyshev approximation, for some applications.

Generally, the Chebyshev solution is from a class of solutions known as orthogonal function solutions. However, many such polynomial sets are candidates. Consider φ_n(x) as some orthogonal set of functions. Then,

φ_n(x),−1≦x≦1,n=0, 1 . . .

∫₋₁¹w(x)φ_m(x)φ_n(x)dx=K_nδ_mn

That is, the functions are said to be orthogonal on the interval −1≦x≦1 and weighted according to w(x) for generality. The approximated function ƒ(x) is found from:

$f (x) = c_{0} φ_{0} + c_{1} φ_{1} + c_{2} φ_{2} \dots$

$f (x) = \sum_{i = 0}^{k} c_{i} φ_{i}$

And the c_n may be obtained from:

$c_{n} = \frac{1}{K_{n}} \int_{- 1}^{1} w (x) f (x) φ_{n} (x) dx$

Chebyshev, Legendre polynomials as well as Fourier series fit this solution class and are in wide use.
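As an illustration of the coefficient integral, the following sketch assumes the Legendre polynomials as the orthogonal set, for which w(x)=1 and K_n=2/(2n+1). The Simpson quadrature, the function names, and the test function x³ are choices made for this example only.

```python
def legendre(n, x):
    """Legendre polynomial P_n(x) from the three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def coefficient(f, n):
    """c_n = (1/K_n) * integral over [-1, 1] of w(x) f(x) phi_n(x), with w(x) = 1."""
    k_n = 2.0 / (2 * n + 1)  # K_n for the Legendre set
    return simpson(lambda x: f(x) * legendre(n, x), -1.0, 1.0) / k_n

# x^3 = (3/5) P_1(x) + (2/5) P_3(x), so only c_1 and c_3 should be non-zero.
coeffs = [coefficient(lambda x: x ** 3, n) for n in range(4)]
```

The same routine applies to any other orthogonal set once w(x), K_n, and the basis evaluation are substituted.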

(d) Least Squares Revisited

Suppose that a polynomial is constructed of a specific order n such that a data set is to be fit approximately. Suppose further that the error in actual value desired versus actual value rendered by the polynomial p_n(x) is bounded by the following norm:

$E = \sum_{i = 0}^{n} {[p (x_{i}) - f_{i}]}^{2}$

This norm is recognized as a variation of the Lebesgue norm, γ=2 condition. The estimating polynomial is everywhere compared to the exact data set values point by point with the errors accumulated as indicated.

E(p(x)) is then minimized to some needed accuracy. Legendre introduced this technique and called it the method of least squares.

The previous polynomial definitions are now recalled:

$p_{m} (x) = c_{0} + c_{1} x + c_{2} x^{2} + c_{3} x^{3} + c_{4} x^{4} + \dots + c_{m} x^{m}$

In order to minimize E(p) it is necessary to obtain the partial derivative with respect to each coefficient c_j and set it to zero,

$0 = \frac{\partial E (p)}{\partial c_{j}} = \sum_{i = 0}^{n} 2 [p (x_{i}) - f_{i}] \frac{\partial p (x_{i})}{\partial c_{j}}$

Evaluating this condition for each j=0 . . . m yields a family of equations which can be arranged in a convenient form:

$c_{0} \sum_{i = 0}^{n} x_{i}^{0} + c_{1} \sum_{i = 0}^{n} x_{i} + \dots c_{m} \sum_{i = 0}^{n} x_{i}^{m} = \sum_{i = 0}^{n} f_{i}$

$c_{0} \sum_{i = 0}^{n} x_{i} + c_{1} \sum_{i = 0}^{n} x_{i}^{2} + \dots c_{m} \sum_{i = 0}^{n} x_{i}^{m + 1} = \sum_{i = 0}^{n} x_{i} f_{i}$

$⋮ ⋮ ⋮$

$c_{0} \sum_{i = 0}^{n} x_{i}^{m} + c_{1} \sum_{i = 0}^{n} x_{i}^{m + 1} + \dots c_{m} \sum_{i = 0}^{n} x_{i}^{2 m} = \sum_{i = 0}^{n} x_{i}^{m} f_{i}$

Solving this set of equations for the coefficients c yields a minimum to E(p). It is important to recognize that the order m of the polynomial need not equal the number n of data points utilized as the data input set. Whenever m=n then minE(p)=0, and the problem is reduced to the simple polynomial interpolation problem presented earlier where the data points become exact solutions for the system of equations. Whenever m≠n, a solution is still possible. Usually, it is desirable for curve fitting problems to have m<n. Then, a solution is obtained and it appears ‘smoothed’. The resulting function passes in the vicinity of the data rather than directly through the data points.
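A minimal sketch of assembling and solving this set of equations follows. The elimination routine, the function names, and both data sets are hypothetical and are included only to show the m<n case.

```python
def solve(a, b):
    """Solve a*c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    c = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * c[k] for k in range(r + 1, n))
        c[r] = (m[r][n] - s) / m[r][r]
    return c

def least_squares_poly(xs, fs, m):
    """Coefficients c_0..c_m of the order-m least squares polynomial."""
    # Row j of the system: sum_k c_k * sum_i x_i^(j+k) = sum_i x_i^j * f_i
    a = [[sum(x ** (j + k) for x in xs) for k in range(m + 1)] for j in range(m + 1)]
    b = [sum((x ** j) * f for x, f in zip(xs, fs)) for j in range(m + 1)]
    return solve(a, b)

# Six samples of 1 + 2x + 3x^2 fitted with m = 2: the fit is exact.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
exact_fit = least_squares_poly(xs, [1 + 2 * x + 3 * x * x for x in xs], 2)

# Four scattered samples fitted with a line (m = 1): a smoothed result.
smooth_fit = least_squares_poly([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 1.0, 2.0], 1)
```

In the second case the line passes in the vicinity of the data rather than through each point, as described above.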

This technique (of least squares) is perhaps the most ubiquitous signal processing estimator or approximator in use today.

The system of equations is most often written in matrix form as:

[V]^T[V][c]=[V]^T[ƒ]

[V]^T[V] is notorious for producing extraordinarily difficult solutions. One technique is the method of Singular Value Decomposition. Another class of solutions involves Orthogonal Transforms or Orthogonal Decomposition. These methods seek to provide some alternate view, or representation of the vector space implied by the above matrix formulation, which substantially improves the individual matrix loading as well as the subsequent matrix operations.

Another variation of the least squares technique can best be illustrated by rewriting E(p) as:

$E (p) = \sum_{i = 0}^{n} {w_{i} (x) [p_{i} (x) - f_{i}]}^{2}$

The w_i are weighting values which weigh each of the sample points for the error calculation. Proceeding as before yields the system of equations:

$\sum^{n} (w_{i} x_{i}^{0}) c_{0} + \dots \sum^{n} (w_{i} x_{i}^{m}) c_{m} = \sum^{n} w_{i} f_{i}$

$\sum^{n} (w_{i} x_{i}) c_{0} + \dots \sum^{n} (w_{i} x_{i}^{m + 1}) c_{m} = \sum^{n} w_{i} x_{i} f_{i}$

$⋮ ⋮ ⋮$

$\sum^{n} (w_{i} x_{i}^{m}) c_{0} + \dots \sum^{n} (w_{i} x_{i}^{2 m}) c_{m} = \sum^{n} w_{i} x_{i}^{m} f_{i}$

The weighting values provide an increased ‘importance’ or impact of certain data points with respect to the others. This solution is known as the moving least squares fit to the data.
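The effect of the weighting values can be sketched for the simplest case, a line (m=1), where the two weighted equations can be solved in closed form. The data set, the outlier, and the choice of a zero weight are hypothetical.

```python
def weighted_line_fit(xs, fs, ws):
    """Solve the 2x2 weighted least squares system for p(x) = c0 + c1*x."""
    s0 = sum(ws)
    s1 = sum(w * x for w, x in zip(ws, xs))
    s2 = sum(w * x * x for w, x in zip(ws, xs))
    t0 = sum(w * f for w, f in zip(ws, fs))
    t1 = sum(w * x * f for w, x, f in zip(ws, xs, fs))
    det = s0 * s2 - s1 * s1
    c0 = (t0 * s2 - s1 * t1) / det
    c1 = (s0 * t1 - s1 * t0) / det
    return c0, c1

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
fs = [1.0, 3.0, 5.0, 20.0, 9.0]  # samples of 1 + 2x with one outlier at x = 3

plain = weighted_line_fit(xs, fs, [1.0] * 5)                    # outlier pulls the slope up
weighted = weighted_line_fit(xs, fs, [1.0, 1.0, 1.0, 0.0, 1.0])  # outlier ignored
```

Reducing the weight of the suspect sample restores the underlying line, which is the sense in which the weights assign 'importance' to individual data points.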

8. PIECEWISE POLYNOMIALS AND SPLINES

This section focuses on the technique of fitting data to multiple functions, which are cascaded to extend the interpolation interval, while enhancing accuracy of the fit and stabilizing solutions. A single polynomial describing function was discussed, in a number of formats, in Section 7. Whenever the data is erratic or requires high order polynomials for the fit, the solutions are often difficult to obtain and are numerically unstable. However, breaking up a large domain or interval into a series of smaller domains, addressed by multiple functions, usually results in accurate representations with well behaved solutions, which are often numerically efficient.

Without detailing the technique, it is assumed that polynomials of arbitrary order, and of differing order per sub-interval can be fit to a sequence of data. FIG. 46 illustrates this concept. p(x)_AB, p(x)_BC, p(x)_CD, p(x)_DE can be unique to each of the indicated intervals. Typically, additional constraints are applied at the interval end points defining continuity. Nevertheless, over the intervals, the solution forms are very similar to those previously represented. In fact, the techniques described in Section 7 can be applied to each interval shown in FIG. 46.

Various strategies can also be applied to permit practical solutions, including a restriction of the order and form of the interpolating function on each interval. A common approach is to employ linear interpolants at each sub-interval, connecting nodes by lines, called linear splines.

Another popular interpolant is the cubic spline. It is simple to calculate and possesses excellent performance when constrained by first and second derivatives at knots, the connection nodes along the data path. This section focuses on certain aspects of the cubic spline because of its ubiquitous application history. Of course the principles presented here can be extended to quadratics, quartics, etc.

The spline concept probably originates with artisans and engineers who construct smooth curves from a variety of materials in applications such as ship building, car body manufacturing, etc. For instance, in ship building the smooth hull shapes are often formed by positioning strakes over bulkheads, which are fit conformally, using the bulkhead edges (knots) as a constraint and the natural tension of the strake to provide smooth continuity along the strake interval. When a single strake is not long enough, several are joined end to end. At the joints (knots) they must fit smoothly without disruption of continuity. When lofting the lines for boat construction, battens (physical splines) are used for drawing the hull form. The battens are bent and twisted according to their natural curvature, often using some constraining mechanism at various intervals along the span.

Physicists and applied mathematicians have analyzed the ‘energy’ stored in the deflected spline. This self energy (potential energy) is of course related to the curvature and is proportional to:

$T_{E} \propto \int_{a}^{b} \frac{{(β_{s}^{″} (x))}^{2}}{{(1 + {(β_{s}^{'} (x))}^{2})}^{5 / 2}} \partial x$

The natural cubic spline tends to approximately minimize this energy over its span according to the imposed constraints. This minimization is due to distribution of the load in an optimal manner. This property can be exploited mathematically if the constraints such as continuity, locations of joins, knot locations, etc., are defined correctly. The optimal natural spline typically is not subject to undue force or torque at the constraining knots, and tends toward asymptotic behavior at the extremums. This type of behavior gives rise to functions, which are most likely to represent the best natural fit to various classes of topologies relating to physical application.

Although the VPA application is not mechanical by nature, it is likely, along certain portions of the transfer characteristic, that functional continuity is maintained due to the physics of the semiconductor. The threshold of the semiconductor may present anomalies. However, above and below the threshold, ideas of continuity, minimal energy splines, etc., fit well with the paradigm.

(a) Cubic B Spline

The equation for a cubic spline is:

$β (x) = \frac{1}{3} {Ax}^{3} + \frac{1}{2} {Bx}^{2} + Cx + D + \frac{1}{6} \sum_{i = 1}^{N - 1} a_{i} {\langle x - k_{i} \rangle}^{3}$

The knot locations k_i correspond to sub interval nodes, which are data from a super set and used to constrain the spline and its functional components. N+1 knots (N=4, that is, the five knots k₀ . . . k₄) form the spline interval. Seven constants (A, B, C, D, a₁, a₂, a₃) are required to define the cubic spline in this form. Once these constants are obtained, each number on the interval may be calculated from the spline.

The spline can be derived by twice integrating:

$β^{″} (x) = Ax + B + \sum_{i = 1}^{N - 1} a_{i} \langle x - k_{i} \rangle + \sum_{i = 1}^{N - 1} b_{i} J (x - ?)$

$? indicates text missing or illegible when filed$

and applying constraints on continuity at interval ends k₀, k₄ as well as restricting the value outside the intervals, usually requiring null performance there. The last component of the double prime equation defines the jump functions, which are associated with knot sub intervals.

A spline of the B form (B spline) requires the additional constraints of zero derivatives for the spline at k₀ and k₄ and is zero outside the interval. A single cubic B spline is illustrated in FIG. 47. FIG. 47 assumes equal spacing, Δ, between knots. This is however not a requirement for B splines. FIG. 48 illustrates a dissected spline to reveal the component or spline basis.

The j^th spline is given by:

$B_{j} (x) = B_{0} (x - j Δ), j = - 3, - 2, - 1, 0, 1 \dots N - 1$

The cubic B spline can be written as a piecewise continuous function over a limited domain in terms of its basis functions. A convenient matrix description is:

$β (x) = [x] [M_{B_{s}}] [k_{B_{i}}]$

Expanding the above matrix notation explicitly yields:

$[x] = [x^{3}, x^{2}, x, 1]$

$[M_{B_{s}}] = \frac{1}{6} [\begin{matrix} - 1 & 3 & - 3 & 1 \\ 3 & - 6 & 3 & 0 \\ - 3 & 0 & 3 & 0 \\ 1 & 4 & 1 & 0 \end{matrix}]$

$[k_{B_{i}}] = [\begin{matrix} k_{i - 3} \\ k_{i - 2} \\ k_{i - 1} \\ k_{i} \end{matrix}]$

${β (x)}_{k_{i}} = \overbrace{\frac{{(1 - x)}^{3}}{6}}^{B_{3}} k_{i - 3} + \overbrace{\frac{(3 x^{3} - 6 x^{2} + 4)}{6}}^{B_{2}} k_{i - 2} + \overbrace{\frac{(- 3 x^{3} + 3 x^{2} + 3 x + 1)}{6}}^{B_{1}} k_{i - 1} + \overbrace{\frac{x^{3}}{6}}^{B_{0}} k_{i}$

$0 \leq x \leq 1$

Notice that the convention B_i utilizes the subscript to represent a segment of the spline. In subsequent sections, B^n represents the n^th order B spline. Thus B³ refers to a third order B spline and B₀ refers to the 0^th segment or initial segment of the spline.

FIG. 49 illustrates the various components or basis for β(x) over the interval 0≦x≦1. These components of the spline are also known as blending functions.
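The blending functions can be evaluated directly, as in the sketch below. It uses the standard uniform cubic B-spline blending polynomials on 0≦x≦1, for which the four weights sum to one at every x; the function names and the sample control values are hypothetical.

```python
def blending(x):
    """Weights applied to (k_{i-3}, k_{i-2}, k_{i-1}, k_i) for 0 <= x <= 1."""
    b3 = (1.0 - x) ** 3 / 6.0
    b2 = (3.0 * x ** 3 - 6.0 * x ** 2 + 4.0) / 6.0
    b1 = (-3.0 * x ** 3 + 3.0 * x ** 2 + 3.0 * x + 1.0) / 6.0
    b0 = x ** 3 / 6.0
    return b3, b2, b1, b0

def spline_segment(x, k):
    """Value of one spline segment for the four control values in k."""
    return sum(b * c for b, c in zip(blending(x), k))

weights_at_start = blending(0.0)  # the knot weights 1/6, 4/6, 1/6, 0
flat = spline_segment(0.5, (2.0, 2.0, 2.0, 2.0))  # constant controls stay constant
```

Because the weights form a partition of unity, a segment never leaves the range spanned by its four control values.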

Successive applications of splines overlap knot intervals as illustrated in FIG. 50. These splines overlap in such a manner, given proper coefficients for each shifted B spline, that within the interval k₀<x<k₄, a normalized response is obtained, requiring:

$\sum_{i = - 3}^{N} B_{i} (x) = 1.$

In an application involving fitting to a curve the response over k₀ . . . k_N is tailored by:

$S_{a} (x) = \sum_{i = - 3}^{N} a_{i} B_{i} (x)$

where a_i are additional weighting factors.

De Boor derived a recursive method for calculating the a_i values. The technique requires sequential calculations beginning with splines of order n=1 and recursively progressing up to order n=4 (cubic spline):

${B (x)}_{i, n} = \frac{x - k_{i}}{k_{i + n - 1} - k_{i}} {B (x)}_{i, n - 1} + \frac{k_{i + n} - x}{k_{i + n} - k_{i + 1}} {B (x)}_{i + 1, n - 1}$

$i = - 3 \dots N - 1, \quad n = 1, 2, 3, 4$
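The recursion can be sketched as follows, assuming the usual starting condition that the order 1 spline equals one on its own knot interval and zero elsewhere. The knot list is a hypothetical uniform sequence, and the index i starts at zero here rather than at −3.

```python
def bspline(i, n, x, knots):
    """Order-n (degree n-1) B spline B_{i,n}(x) by recursion on the order."""
    if n == 1:
        # Assumed base case: indicator of the half-open interval [k_i, k_{i+1}).
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left_den = knots[i + n - 1] - knots[i]
    right_den = knots[i + n] - knots[i + 1]
    left = 0.0
    right = 0.0
    if left_den != 0:
        left = (x - knots[i]) / left_den * bspline(i, n - 1, x, knots)
    if right_den != 0:
        right = (knots[i + n] - x) / right_den * bspline(i + 1, n - 1, x, knots)
    return left + right

knots = [float(k) for k in range(9)]  # hypothetical uniform knots 0..8

peak = bspline(0, 4, 2.0, knots)      # center of the cubic spline on knots 0..4
shoulder = bspline(0, 4, 1.0, knots)  # value at the first interior knot
# Where four cubic splines overlap, their sum is normalized.
total = sum(bspline(i, 4, 3.5, knots) for i in range(5))
```

The guards on the denominators allow repeated knots, which the uniform example does not exercise.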

(b) Smoothing

The methods presented in section 7.5 may be applied to the spline problem as well. That is, the Lebesgue norm=2(L₂) constraint can be placed on a data set which possesses a greater number of knots over an interval than is strictly required. In this problem the same minimization is necessary:

$\min (E (f)) = \sum_{i = 0}^{n} {w_{i} [\tilde{f} (x_{i}) - f_{i}]}^{2}$

Each of the ƒ̃(x_i) data influences the spline along with the weighting values w_i. ƒ̃ represents the function interpolated by splines. Rather than interpolating the contour of all of the knots explicitly, a smoothed or averaged solution is obtained.

This approach can have a lot of merit for processing ensembles of data. If the ensemble possesses an optimal least squares solution then perhaps this single solution can be applied for every member experiment of the ensemble. That is, rather than requiring unique spline coefficients for each and every data set associated with a measurement, perhaps a single spline can be calculated with its corresponding single set of coefficients to apply for all separate experiments. This could reduce a significant amount of data.

Another variation can be done by averaging the X_i prior to calculating the spline. Suppose that many similar experiments are run which spawn data sets, each of which can be used to produce a separate experiment dependent spline or, using the L₂ norm, to derive a universally optimal spline. Rather than invoking the L₂ norm, it is possible under certain circumstances to approximate a solution by calculating an averaged X_i and therefore obtain a single averaged ƒ̃_i. This requires:

${\overline{X}}_{i} \underline{Δ} \frac{?}{n} \sum_{j = 1}^{n} X_{i, j} W_{j}$

$? indicates text missing or illegible when filed$

- n Δ Number of samples at the i^th data location due to n similar experiments.

- K Δ Normalization factor if so desired to create a unity form or other weight for the weighted average. This permits individual weighting of the j components for X_i separate from weighting the X_i.

The individually averaged X_i can then be used to obtain a spline. If the variance across the ensemble of data sets is not too great then this technique may be very efficient and very effective. The L₂ norm minimization can also be applied to a single experiment, which possesses a noisy data set. In addition, a ‘noisy’ data set may be pre convolved with a smoothing kernel (filtered) to smooth prior to fitting.

Interpolation in curvilinear coordinates is in general a more difficult problem to analyze than the simple Cartesian formulation presented earlier. In addition to the natural curvature of the spline, the space itself can be warped. As presented earlier, the ‘self energy’ of the spline is related to the curvature of the spline which is subject to a minimization procedure. Changing coordinate systems, and projecting the spline in curvilinear coordinates will change the minimization and potentially the spline formulation.

The previously written self energy minimization is once again presented:

$S_{2} \underline{Δ} \min {\int_{a}^{b} {\langle \frac{\partial^{2} β (x)}{\partial x^{2}} \rangle}^{2} \partial x}$

β(x) is a piecewise cubic polynomial with continuous derivatives at the joins/knots. Sometimes S_l splines find application but are not presented here because of certain uniqueness properties which they lack. This can present certain computational issues unless regularization terms are included within the context of minimization.

The equivalent polar spline minimization procedure involves:

$S_{2_{p}} \underline{Δ} \min {\int_{φ_{a}}^{φ_{b}} {(\frac{\partial^{2} r}{\partial φ^{2}})}^{2} \partial φ}$

where r(φ) are piecewise cubic polynomials and possess continuous derivatives at the nodes, φ_i.

The minimization can take on a form involving curvature:

$\min {\int_{φ_{a}}^{φ_{b}} {(K)}^{2} \partial φ} = \min \int_{φ_{a}}^{φ_{b}} (\frac{r^{2} + 2 {(\frac{\partial r}{\partial φ})}^{2} - r \frac{\partial^{2} r}{\partial φ^{2}}}{{(r^{2} + {(\frac{\partial r}{\partial φ})}^{2})}^{5 / 2}}) \partial φ$

r can be expressed as a Hermitian function in a form:

$r (φ) = r_{i} + \frac{\partial r (φ_{i})}{\partial φ} (φ - φ_{i}) + (\frac{1}{φ_{i + 1} - φ_{i}}) [(- (\frac{2 \partial r (φ_{i})}{\partial φ} + \frac{\partial r (φ_{i + 1})}{\partial φ})) + 3 Δ r_{i}] {(φ - φ_{i})}^{2} + {(\frac{1}{φ_{i + 1} - φ_{i}})}^{2} (\frac{\partial r (φ_{i})}{\partial φ} + \frac{\partial r (φ_{i + 1})}{\partial φ} - 2 Δ r_{i}) {(φ - φ_{i})}^{3}$

where

$Δ r_{i} = (\frac{r_{i + 1} - r_{i}}{φ_{i + 1} - φ_{i}})$

Calculating the cubic spline in polar coordinates then includes finding the derivatives dr(φ_i)/dφ subject to the minimization constraints, at each knot or join.
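The Hermitian form for r(φ) on a single interval can be sketched as below. The knot derivatives are simply supplied as inputs, since the minimization that selects them is not reproduced here; the function name and the numerical values are hypothetical.

```python
def hermite_segment(phi, phi_i, phi_next, r_i, r_next, d_i, d_next):
    """Cubic r(phi) on [phi_i, phi_next] from end values and end derivatives."""
    h = phi_next - phi_i
    delta_r = (r_next - r_i) / h  # the divided difference written as delta r_i
    t = phi - phi_i
    quad = (3.0 * delta_r - (2.0 * d_i + d_next)) / h
    cubic = (d_i + d_next - 2.0 * delta_r) / (h * h)
    return r_i + d_i * t + quad * t * t + cubic * t ** 3

# Hypothetical segment: r goes from 1.0 to 1.5 as phi goes from 0.2 to 0.9.
args = (0.2, 0.9, 1.0, 1.5, 0.3, -0.4)
start = hermite_segment(0.2, *args)
end = hermite_segment(0.9, *args)

# Central differences recover the derivatives imposed at the two knots.
eps = 1e-6
slope_start = (hermite_segment(0.2 + eps, *args) - hermite_segment(0.2 - eps, *args)) / (2 * eps)
slope_end = (hermite_segment(0.9 + eps, *args) - hermite_segment(0.9 - eps, *args)) / (2 * eps)
```

Matching both value and derivative at every knot is what keeps adjacent segments smoothly joined.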

(d) Relevance of the Sampling Theorem

Sampling theory has a long and famous history. Yet, until the 1990's the association of Splines and the classical sampling interpolants was overlooked. However, embodiments provided herein rely on the connection between the Cardinal series represented by a host of pioneers in mathematics, information theory, and signal processing, and the general theory of splines as applied to sampled time domain signals.

Cauchy developed some idea of the sampling theorem as far back as 1847. Borel repeated the theme with his Fourier proof in 1897. Whittaker provided a proof in 1915, which included the idea of cardinal functions associated with the samples and used as an interpolation formula. Kotelnikov in 1933 also presented a proof of the sampling theorem. In 1948 Shannon used the sampling theorem to relate the samples of information bearing analog signals to the Nyquist sampling rate. In 1962 Petersen and Middleton extended the sampling theorem to higher order spaces/dimensions. And finally, Papoulis published a generalization of the sampling theorem in 1968.

Schoenberg released a landmark paper on spline interpolants in 1946, prior to the Shannon paper. The relationship between Shannon's theorem and spline theory has been explored relatively recently even though splines have found increasing application since the 1960's. Perhaps this has been due to the lack of cross pollination between signal processing applications and the applications advancing spline theory. A common language was not developed early on and splines were considered as graphic interpolants rather than means of describing physical sampling phenomena.

The cardinal basis for sampling can be compared to the spline cardinal basis. The cardinal series originally identified with the sampling theorem is:

$\tilde{x} (t) = \frac{1}{π} \sum_{n = - \infty}^{\infty} x (\frac{n}{2 B_{w}}) \frac{\sin (π (2 B_{w} t - n))}{2 B_{w} t - n}$

- x(t) Δ Signal as a function of time
- B_w Δ Bandwidth of the signal x(t)
- n Δ Sample Index

This equation illustrates that any signal band limited to B_w can be represented from an infinite number of sample functions of the sinc form weighted by the original waveform at discrete intervals. The weighted samples are linearly combined to create the smoothed version of x(t). In fact, the time domain waveform and its discrete samples are related by means of this interpolation series. FIG. 52 illustrates this cardinal series expansion graphically for the function x(t)=cos²( ).
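A truncated version of the cardinal series can be sketched as follows. The test signal (a 0.25 Hz cosine), the bandwidth B_w=1 Hz, and the truncation to ±500 terms are assumptions made for illustration; the true series is infinite.

```python
import math

def sinc(u):
    """sin(pi*u)/(pi*u), with the limiting value 1 at u = 0."""
    return 1.0 if u == 0.0 else math.sin(math.pi * u) / (math.pi * u)

def cardinal_reconstruct(x, bw, t, n_terms=500):
    """Truncated cardinal series: samples at n/(2*bw) weighted by shifted sinc functions."""
    rate = 2.0 * bw
    return sum(x(n / rate) * sinc(rate * t - n) for n in range(-n_terms, n_terms + 1))

def signal(t):
    # Hypothetical band-limited test signal, well inside the 1 Hz bandwidth.
    return math.cos(2.0 * math.pi * 0.25 * t)

between_samples = cardinal_reconstruct(signal, 1.0, 0.3)  # t = 0.3 s is not a sample instant
at_sample = cardinal_reconstruct(signal, 1.0, 1.0)        # t = 1.0 s coincides with n = 2
```

At a sample instant only one term of the sum survives, while between samples every term contributes, which is why truncation introduces a small residual error.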

The Fourier Cardinal Series (sampling function) for a band limited sampled signal x(t) is given by:

$X (ω) = \sum_{n = - \infty}^{\infty} \frac{1}{2 B} x (\frac{n}{2 B}) Π (\frac{ω}{2 B})$

Schoenberg's form for the basic B-Spline expansion is:

$s (x) = \sum_{k} c (k) β^{n} (x - k)$

The basic rectangular spline is represented by:

$B^{0} (x) = \begin{cases} 1, & - \frac{1}{2} < x < \frac{1}{2} \\ \frac{1}{2}, & |x| = \frac{1}{2} \\ 0, & \text{otherwise} \end{cases}$

n+1 convolutions of the rectangular spline will spawn the n^th order spline.

$B^{n} (x) = \underbrace{B^{0} (x) * B^{0} (x) * \dots * B^{0} (x)}_{n + 1}$

The Fourier Transform can be calculated from:

$B^{n} (ω) = \int_{- \infty}^{\infty} \underbrace{B^{0} (t) * B^{0} (t) * \dots}_{n + 1 \text{ fold convolution}} e^{- j ω t} \partial t$

$B^{n} (ω) = {(\frac{\sin (ω / 2)}{ω / 2})}^{n + 1} = \frac{{(e^{jω / 2} - e^{- jω / 2})}^{n + 1}}{{(jω)}^{n + 1}}$

Schoenberg's form can also be rewritten in terms of the time variable, t, and a reconstructed function x(t) as:

$\tilde{x} (t) = \sum_{k = - \infty}^{\infty} c (k) B^{n} (t - k)$

Notice the similarity to the form given earlier for the Whittaker-Shannon Cardinal sampling series. Shannon's form depends on the notion of the ideal low pass “brick wall” to bandwidth limit the function x(t) prior to sampling. This demands convolution with a filter which possesses a sinc(τ) impulse response. If the pre-filtered waveform is subsequently sampled by an impulse sampler, then it can be reconstructed by post convolution with a reconstruction filter possessing the sinc(τ) impulse response.

Schoenberg's form given above is in fact a convolution sum involving B-Splines rather than sinc(τ) functions. As the B-Spline order n→∞, Schoenberg's spline reconstruction asymptotically approaches the Shannon reconstruction Cardinal series performance.

FIG. 52 illustrates the impulse response equivalent comparing the sinc interpolation and B-Spline kernels.

The sampling theorem dictates the use of anti alias filters, proper samplers, and reconstruction or interpolation filters. This is illustrated in FIG. 53, which illustrates a chain including a low-pass anti-alias filter, a sampler, and an interpolation and reconstruction filter.

The parallel has been substantiated for considering spline kernels as the interpolant. All that remains is to specify pre filtering and describe an efficient means of constructing the spline interpolation functions.

It can be shown that the ideal pre filter frequency response for the cubic spline case approaches:

$H (f) = \frac{{(\mathrm{sinc} (f))}^{4} \sum_{k = - \infty}^{\infty} {(\mathrm{sinc} (f - k))}^{4}}{\sum_{k = - \infty}^{\infty} {(\mathrm{sinc} (f - k))}^{8}}$

(e) B Spline Transform

The linear sum of cubic B-Splines weighted by the a_i,

$S_{a} (x) = \sum^{n} a_{i} B_{i} (x)$

can be formulated for all knots over a very large interval to construct very complicated functions. Another notation which is of a discretely sampled continuous form provides a useful interpretation of the spline sampled operation:

$S (i) = \sum_{i = - \infty}^{\infty} a (i) B_{j} (x - i)$

The shifted B-Splines B(x−i) constitute the basis for the system. The cubic B-Spline basis was previously described. The form for S(i) is a discrete sampling function similar to a convolution sum. If x were replaced by discrete samples then a convolution sum would result. This form is suggestive of a filter. In fact, this is identified as the discrete B-Spline transform:

a(k)=(b³)⁻¹*s(k)

where (b³)⁻¹ is the impulse response of the direct B-Spline filter (3^rd order B-Spline for current discussion). A Z-transform may be calculated for (b³) and results in:

$B^{3} (z) = \sum_{k = - n / 2}^{n / 2} b^{3} (k) z^{- k} = \frac{z + 4 + z^{- 1}}{6};$

$(roots @ α_{1} ≃ - .268, α_{2} ≃ - 3.732) .$

Replacing z → e^j2πƒ yields the Fourier Transform:

$B^{3} (f) = (b^{3} (0) + \sum_{k = 1}^{n / 2} 2 b^{3} (k) \cos (2 π fk)) \cdot {[\mathrm{sinc} (f)]}^{n + 1} .$

Notice the exponent (n+1) of the sinc(ƒ) function relating to the n fold convolution.

The transfer function of the cubic spline transform is:

${[B^{3} (z)]}^{- 1} = \frac{1}{b^{3} (0) + \sum_{k = 1}^{n / 2} b^{3} (k) [z^{k} + z^{- k}]}$

This form is known as the direct B spline transform while B³(z) is known as the indirect form. As such:

s(k)=b³*a(k)

where b³ is in fact the indirect B-Spline filter. The B-Spline filter can be implemented by a symmetric FIR filter. The direct B-Spline filter is naturally an IIR form, but can be converted to an FIR form.

The implications are that very fast real time algorithms can be obtained for interpolation of the spline polynomials under certain circumstances. For instance, running the signal samples through a filter, [B³(z)]⁻¹, yields the coefficients a(k). b³(k) is a discrete sampled function. The fundamental uniformly sampled cubic B-Spline is illustrated in FIG. 54.

| k_j | β(k_j) | t_k |
|---|---|---|
| k₀ | 0 | 0 |
| k₁ | 0.1666… | 1 |
| k₂ | 0.6666… | 2 |
| k₃ | 0.1666… | 3 |
| k₄ | 0 | 4 |

The discretely sampled values of β³(x) represent the unit pulse response of the indirect B-Spline filter whose coefficients b³(k) correspond to the tabulated values above. The direct filter is obtained as the normalized inverse, [b³(k)]⁻¹.
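The relationship between the direct and indirect transforms can be sketched on a finite block of samples. Instead of the recursive IIR filter, this example inverts the indirect filter (taps 1/6, 4/6, 1/6) by solving the equivalent tridiagonal system, and it assumes that the coefficients are zero outside the block. The function names and sample values are hypothetical.

```python
def indirect_bspline_transform(a):
    """s(k) = (a(k-1) + 4*a(k) + a(k+1)) / 6, with a assumed zero outside the block."""
    n = len(a)

    def get(k):
        return a[k] if 0 <= k < n else 0.0

    return [(get(k - 1) + 4.0 * get(k) + get(k + 1)) / 6.0 for k in range(n)]

def direct_bspline_transform(s):
    """Recover a(k) from s(k) by solving the tridiagonal system (Thomas algorithm)."""
    n = len(s)
    off = 1.0 / 6.0            # sub- and super-diagonal taps
    diag = [4.0 / 6.0] * n     # center tap
    rhs = list(s)
    for k in range(1, n):      # forward elimination
        w = off / diag[k - 1]
        diag[k] -= w * off
        rhs[k] -= w * rhs[k - 1]
    a = [0.0] * n
    a[-1] = rhs[-1] / diag[-1]
    for k in range(n - 2, -1, -1):  # back substitution
        a[k] = (rhs[k] - off * a[k + 1]) / diag[k]
    return a

samples = [0.0, 1.0, 4.0, 9.0, 16.0, 9.0, 4.0, 1.0, 0.0]
coefficients = direct_bspline_transform(samples)
round_trip = indirect_bspline_transform(coefficients)
```

Applying the indirect filter to the recovered coefficients returns the original samples, which is the defining property of the transform pair. A recursive filter built on the root near −0.268 is the streaming alternative when block processing is not acceptable.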

9. SURFACE FITTING

(a) Bi-Cubic Surface B-Splines

Birkhoff and de Boor (1965) are usually cited for presenting the idea of deriving surface splines for two independent directions, then interpolating the internal region of a MESH element using a form:

$f (x, y) = \sum_{i = 0}^{3} \sum_{j = 0}^{3} c_{i, j} x^{i} y^{j}$

It should be noted that the B-spline represented in 2D space requires 4 basis functions. Furthermore, 4 unique splines are required to characterize a curve within the interval a≦x≦b. A single B-spline consisting of the 4 basis functions requires 7 coefficients.

Independent basis functions in the x and y direction can be multiplied in a tensor product to obtain the bi-cubic surface B-spline form:

$s (x, y) = \sum_{i = 1}^{8} \sum_{j = 1}^{8} γ_{i, j} B_{i}^{3} (x) B_{j}^{3} (y)$

Thus, a total of 7×64=448 coefficients are required to describe the associated single patch of the meshed surface using this approach.

A least squares solution can then be applied to calculate γ_i,j for the surface. The calculation given above was based on the minimum number of knots assigned to the MESH boundaries. Usually, the problem is over determined and the minimization takes the form:

$\min \sum_{r = 1}^{m} {(s (x_{r}, y_{r}) - f_{r})}^{2}$

where m is the number of data points, and the number of knots is <<m.

448 coefficients is a significant calculation and memory load. However, Section 8 provides methods for implementing a B spline transform which can be applied recursively. This reduces the memory requirements. The example given above would be reduced dramatically to 64 memory locations. Furthermore, it should be understood that there is significant overhead in the example as given for a single MESH patch. Boundary conditions require additional ISI introduced by 4 adjacent splines. If an entire surface with several or many meshes is employed then the overhead for memory is distributed amongst all the MESH elements for the boundary conditions. Therefore, ignoring the additional boundary splines further reduces the memory required for a single MESH patch to a maximum of 16 coefficients. These coefficients correspond to weights related to the surface contour.

It is useful to use the parametric representation of a surface patch which is formed from the intersection of the successive spline contours B_i³(x), B_j³(y) which are arranged on an orthogonal grid. The patch is an important geometrical feature for CAGD (Computer Aided Geometric Design) application and can also play an important role for the techniques disclosed herein. Section 10 provides additional clarification of the patch concept.

The following matrix representation for the tensor product of splines is given as:

$P_{ij}^{ν} = {[X]}^{T} {[M_{B_{s}}]}^{T} [γ_{ij}^{ν}] [M_{B_{s}}] {[Y]}^{T}$ (Patch Representation)

This representation is built from the scalar field γ_ij^ν which is a tensor form whenever ν assumes one of the 3 space dimensional coordinates. That is γ_ij^ν is a 4×4×3 operator. The other components of the equation were described previously but are given here for convenience:

$x = [x^{3}, x^{2}, x, 1]$

$y = [y^{3}, y^{2}, y, 1]$

$M_{B_{s}} = \frac{1}{6} [\begin{matrix} - 1 & 3 & - 3 & 1 \\ 3 & - 6 & 3 & 0 \\ - 3 & 0 & 3 & 0 \\ 1 & 4 & 1 & 0 \end{matrix}]$ (Bi Cubic Spline Blending Function Matrix)

$γ_{i, j}^{v} = [\begin{matrix} γ_{i, j}^{v} & \dots & γ_{i, j + 3}^{v} \\ ⋮ & ⋱ & ⋮ \\ γ_{i + 3, j}^{v} & \dots & γ_{i + 3, j + 3}^{v} \end{matrix}]$

This formulation is based on the bi cubic spline but can easily be modified to represent arbitrary algebraic polynomials, orthogonal polynomials, etc.
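A single patch of the tensor product surface can be evaluated as in the sketch below. It applies the standard uniform cubic blending weights independently in x and y and then forms the double sum over a 4×4 block of coefficients; the function names and the coefficient values are hypothetical, and only one scalar component (one value of ν) is shown.

```python
def blend(u):
    """Uniform cubic B-spline blending weights for 0 <= u <= 1."""
    return [
        (1.0 - u) ** 3 / 6.0,
        (3.0 * u ** 3 - 6.0 * u ** 2 + 4.0) / 6.0,
        (-3.0 * u ** 3 + 3.0 * u ** 2 + 3.0 * u + 1.0) / 6.0,
        u ** 3 / 6.0,
    ]

def patch(x, y, gamma):
    """Tensor product value: sum over r, c of B_r(x) * gamma[r][c] * B_c(y)."""
    bx, by = blend(x), blend(y)
    return sum(bx[r] * gamma[r][c] * by[c] for r in range(4) for c in range(4))

flat_gamma = [[5.0] * 4 for _ in range(4)]                 # constant surface
ramp_gamma = [[float(r)] * 4 for r in range(4)]            # varies with the row index only

flat_value = patch(0.3, 0.8, flat_gamma)
ramp_value = patch(0.5, 0.2, ramp_gamma)
```

Sixteen coefficients are sufficient for the interior of one patch in this form, which is consistent with the reduced memory figure discussed above.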

10. PATCHES AND SPLOTCHES

Another representation to describe the error surface D_ε_R in the space R^N permits the sub division of the surface or manifold into local regions called patches for cases where the space is R² or R³. The patches may take on any convenient shape, vary in size and be mixed within an application (different shapes and sizes). Conceptually the patch has a high level of representation within the mathematical description and its own describing functions. These higher level structures may be manipulated to morph the overall representation of D_ε_R, patch by patch. If hardware accelerators are constructed which operate at this level of abstraction, speed, efficiency and utility can be blended to attain some desired performance.

In 2-D, the patches would represent regions as illustrated in FIG. 55, for example. The patches are connected in networks which form the surface within the region of interest. The edges of each patch form shared boundary conditions with adjacent patches. The boundary conditions are typically described by polynomials, splines, special functions, etc. Any point within the patch boundary may then be calculated by interpolation of functions which represent the patch boundaries. In this manner, modest amounts of data, describing the boundaries, permit calculation of a continuum of coordinate values.

Whenever the surface is represented in 3-D the boundaries are projected onto the surface to represent a conformal grid. FIG. 56 illustrates how a regular rectangular grid in the 2-D plane (base of figure) projects onto the 3-D surface. In 2-D, the grid is undistorted, while in 3-D the grid is distorted in proportion to the projected scalar field of values representing the surface contour.

Patch techniques are widely used in the CAGD industry and have been useful in accelerating graphically intensive processes. They have never been applied to the describing functions for calibration and compensation of RF power amplifiers.

Splotches are a new entity (newly created for this application) which can be manipulated much in the same way the patch can be manipulated. However, a splotch is a region extracted from the error surface manifold within R^N whenever N>3.

(a) Polynomial Based Patch Interpolation

Section 9 introduced the idea of the spline product and a matrix formulation of a rectangular patch element associated with the spline based tensor product surface MESH. A more fundamental tensor product is definable in terms of polynomials which are associated with the patch boundaries. This polynomial based patch has a surface description given by:

$P_{i} = c_{00} + c_{10} x + c_{01} y + c_{20} x^{2} + c_{11} xy + c_{02} y^{2} + c_{30} x^{3} + c_{21} x^{2} y + c_{12} xy^{2} + c_{03} y^{3} + c_{31} x^{3} y + c_{22} x^{2} y^{2} + c_{13} xy^{3} + c_{32} x^{3} y^{2} + c_{23} x^{2} y^{3} + c_{33} x^{3} y^{3} + \dots c_{mn} x^{m} y^{n}$

P_i is the amplitude or height above the x-y plane within the patch. x and y represent rectangular coordinate variables for the point and c_mn are the coefficients of the polynomial given above.

Simultaneous equations may be solved to determine the c_mn coefficients on a point by point basis if so desired and boundary conditions at the patch edge, corners, plus center (if known) can provide convenient constraints for calculating solutions.

Once a strategy is selected for obtaining a suitable set of coefficients which describes the patch they may be stored and recalled as required.

The matrix form is more compact:

$P_{i} = [x] [c_{mn}] {[y]}^{T}$

Some common schemes of interpolation with this patch topology can be as follows:

- Level Plane Interpolation;
- Linear Plane Interpolation;
- Double Linear Interpolation;
- Bi Linear Interpolation;
- Bi Quadratic Interpolation;
- Piecewise Cubic Interpolation; and
- Bi Cubic Interpolation.

There are other popular methods in addition to the list provided above. Tradeoffs of accuracy versus calculation complexity govern the choice of method.
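Bi linear interpolation is the simplest non-trivial case of the polynomial patch and can be sketched as follows. The sketch assumes a unit square patch whose four corner heights are known, derives the four c_mn coefficients from those corners, and evaluates the patch height in the matrix form with the powers of x and y in ascending order. All names and corner values are hypothetical.

```python
def bilinear_coefficients(p00, p10, p01, p11):
    """c[m][n] multiplies x^m * y^n; pXY is the corner height at (x=X, y=Y)."""
    return [
        [p00, p01 - p00],
        [p10 - p00, p00 - p10 - p01 + p11],
    ]

def patch_height(x, y, c):
    """Evaluate [x][c][y]^T with [x] = [1, x, ...] and [y] = [1, y, ...]."""
    xs = [x ** m for m in range(len(c))]
    ys = [y ** n for n in range(len(c[0]))]
    return sum(xs[m] * c[m][n] * ys[n]
               for m in range(len(c)) for n in range(len(c[0])))

c = bilinear_coefficients(1.0, 2.0, 3.0, 6.0)
corner = patch_height(1.0, 1.0, c)   # reproduces the corner height p11
center = patch_height(0.5, 0.5, c)   # average of the four corner heights
```

Higher order schemes from the list above enlarge the coefficient array in the same framework, at the cost of more stored values and more multiplications per evaluated point.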

11. TENSOR APPLICATIONS

A tensor is a mathematical entity, which when defined within a particular coordinate system, becomes defined also within a coordinate transformation of that system. Even when the tensor's elements are fixed in a particular frame of reference, they are defined in all other legitimate frames, albeit with perhaps new quantities according to some prescribed transformation.

One particularly useful application of tensor theory is now presented because of its utility and analogous characteristic to D_ε_R manipulation. Thus far, this disclosure has presented the error surface and focused on a ‘snapshot’ description. However, the error surface changes as a function of many variables. This is similar to pushing, pulling, stretching, and deforming the error surface relative to some baseline form.

FIG. 57 illustrates a case for which a change occurred in the relative position of points for a 3-D kernel space, transforming from region R to R′.

Region (R) possesses two points D_A and D_B which lie on a surface or are functionally related in some manner that has a specific mathematical description. The radial vectors $\vec{r}$ and $\vec{r} + Δ \vec{r}$ relate to the orientation of these points in the 3 space defined by x₁, x₂, x₃.

Region (R′) illustrates the original two points on the original surface, translated to region R′. The new point locations are labeled D_A′ and D_B′. Notice that

${\vec{D}}_{ɛ_{R_{(A - A^{'})}}}$

and

${\vec{D}}_{ɛ_{R_{(B - B^{'})}}}$

have been defined as error vectors. x₁, x₂, x₃ define a fundamental Euclidian spatial kernel in this example.

The error vector possesses components in this 3-D space written as:

$D_{ɛ_{R_{i}}} = ({\overset{⇀}{D}}_{ɛ_{R}} \cdot {\overset{⇀}{x}}_{1}, {\overset{⇀}{D}}_{ɛ_{R}} \cdot {\overset{⇀}{x}}_{2}, {\overset{⇀}{D}}_{ɛ_{R}} \cdot {\overset{⇀}{x}}_{3}) .$

The spatial components on the surface may be defined in shorthand notation as:

$D_{ɛ_{R_{i}}} = D_{ɛ_{R_{i}}} (x_{1}, x_{2}, x_{3}) .$

The coordinates in R′ are related to R in the following manner:

$Δ x_{i}^{'} = Δ x_{i} + D_{ɛ_{R_{i}}} (x_{1} + Δ x_{1}, x_{2} + Δ x_{2}, x_{3} + Δ x_{3}) - D_{ɛ_{R_{i}}} (x_{1}, x_{2}, x_{3}) .$

A differential equation is then formulated as follows:

$Δ x_{i}^{'} = Δ x_{i} + \frac{\partial D_{ɛ_{R_{i}}}}{\partial x_{k}} Δ x_{k} .$

For very small Δx_i and Δx′_i, variations in the radial vectors of interest can be approximated from:

Δx′_iΔx′_i≅(Δr′)²

Δx_iΔx_i≅(Δr)²

Notice the introduction of the k index along with the i index. This degree of freedom is required to address additional spatial perturbation. The radial variations of the translated regions can be expressed by:

$\begin{matrix} {(Δ r^{'})}^{2} - {(Δ r)}^{2} ≃ 2 \frac{\partial D_{ɛ_{R_{i}}}}{\partial x_{k}} Δ x_{i} Δ x_{k} + \frac{\partial D_{ɛ_{R_{i}}}}{\partial x_{k}} \frac{\partial D_{ɛ_{R_{i}}}}{\partial x_{l}} Δ x_{k} Δ x_{l} \\ = (\frac{\partial D_{ɛ_{R_{i}}}}{\partial x_{k}} + \frac{\partial D_{ɛ_{R_{k}}}}{\partial x_{i}} + \frac{\partial D_{ɛ_{R_{l}}}}{\partial x_{i}} \frac{\partial D_{ɛ_{R_{l}}}}{\partial x_{k}}) Δ x_{i} Δ x_{k} . \end{matrix}$

It should be noted that l has been introduced so that i, k, l account for all spatial perturbations in the Euclidian 3-D kernel. The second order tensor can be identified from:

$D_{ɛ_{R_{i_{k}}}}^{'} = \frac{1}{2} (\frac{\partial D_{ɛ_{R_{i}}}^{'}}{\partial x_{k}^{'}} + \frac{\partial D_{ɛ_{R_{k}}}^{'}}{\partial x_{i}^{'}} + \underset{\text{non-linear term}}{\underbrace{\frac{\partial D_{ɛ_{R_{l}}}^{'}}{\partial x_{i}^{'}} \frac{\partial D_{ɛ_{R_{l}}}^{'}}{\partial x_{k}^{'}}}})$

whenever transformation from region R to R′ is accomplished.

In the linear theory of elasticity, the second order tensor of interest is given by the first 2 terms as follows:

$D_{ɛ_{R_{i_{k}}}} = \frac{1}{2} (\frac{\partial D_{ɛ_{R_{i}}}}{\partial x_{k}} + \frac{\partial D_{ɛ_{R_{k}}}}{\partial x_{i}})$
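The relationship between the gradient of the error vector and the second order tensor can be checked numerically with a short sketch. The function names, the gradient matrix `grad_example`, and the increment `dx_example` are hypothetical; the gradient is assumed constant over the small region so that the expansion of (Δr′)² − (Δr)² holds exactly.

```python
def strain_tensors(grad):
    """grad[i][k] holds dD_i/dx_k. Returns (linear, full) second order tensors."""
    n = len(grad)
    # First two terms only: the linear elasticity form.
    linear = [[0.5 * (grad[i][k] + grad[k][i]) for k in range(n)]
              for i in range(n)]
    # Add the non-linear term, summed over the index l.
    full = [[linear[i][k] + 0.5 * sum(grad[l][i] * grad[l][k] for l in range(n))
             for k in range(n)] for i in range(n)]
    return linear, full


def radial_change(grad, dx):
    """(dr')^2 - (dr)^2 using dx'_i = dx_i + (dD_i/dx_k) dx_k."""
    n = len(dx)
    dxp = [dx[i] + sum(grad[i][k] * dx[k] for k in range(n)) for i in range(n)]
    return sum(v * v for v in dxp) - sum(v * v for v in dx)


# Hypothetical gradient of the error vector and a small coordinate increment.
grad_example = [[0.01, 0.02, 0.00],
                [-0.03, 0.04, 0.05],
                [0.00, 0.01, -0.02]]
dx_example = [1e-3, 2e-3, -1e-3]

linear_t, full_t = strain_tensors(grad_example)
quadratic_form = 2.0 * sum(full_t[i][k] * dx_example[i] * dx_example[k]
                           for i in range(3) for k in range(3))
print(quadratic_form, radial_change(grad_example, dx_example))
```

The two printed values agree to within rounding, while the linear tensor alone reproduces the radial change only to first order in the gradient.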

FIG. 58 illustrates two plots that show two separate magnitude error surfaces

$\langle D_{ɛ_{R_{A}}} \rangle$ and $\langle D_{ɛ_{R_{B}}} \rangle$

rendered in the fundamental 3-D spatial kernel. The plot to the right is obtained by some action on the plot to the left. This action could be temperature related, gain change, power supply change, frequency change, etc. If the tensor

$D_{ɛ_{R_{i_{k}}}}$

is known, then the surfaces in FIG. 58 are entirely defined by the transformation associated with the tensor description.

Several other tensor descriptions also find relevance and are presented subsequently.

(a) Basic Tensor Operator Application

The application for tensors in the VPA characterization is realized in the R³ space by operators which are of rank 2. Consider the following:

$[y] = {\tilde{ℑ}}_{D_{ɛ_{R}}} \cdot [x]$

The tensor ${\tilde{ℑ}}_{D_{ɛ_{R}}}$ maps the input vector x into an output vector y. The tensor

${\tilde{ℑ}}_{D_{ɛ_{R}}}$

in this case can represent an operator which carries or imparts information concerning the transformation which is inflicted by the VPA action. This tensor is also therefore directly related to the error surface, D_ε_R.

In practice, the mapping is non-linear, whereas the simple form above is linear. However, non-linear forms are readily accommodated as follows:

$[y] = ({\tilde{ℑ}}_{D_{ɛ_{R_{a}}}} \cdot [x]) + ({\tilde{ℑ}}_{D_{ɛ_{R_{b}}}} \cdot [x^{2}]) + ({\tilde{ℑ}}_{D_{ɛ_{R_{c}}}} \cdot [x^{3}]) + \dots$

The non-linear form is similar to a polynomial and is often used to describe non-linear elastic spaces and scalar fields. xⁿ denotes an input vector whose elements are individually raised to the power n. The second rank tensors can be constructed from 9 element matrices which relate well to the 3-D error surfaces. Multi dimensional geometries may be accommodated in this manner simply by parameterizing the space or by forming tensors of higher rank.
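A minimal sketch of the non-linear form follows, using 9 element (3 by 3) matrices as the second rank operators. The names `mat_vec` and `nonlinear_map` and the operator values are hypothetical and are chosen only to show the element-wise powers of the input vector.

```python
def mat_vec(t, x):
    """Apply a second rank operator (square matrix) to a vector."""
    return [sum(t[i][k] * x[k] for k in range(len(x))) for i in range(len(t))]


def nonlinear_map(operators, x):
    """y = T_a.[x] + T_b.[x^2] + T_c.[x^3] + ..., with powers taken element-wise."""
    y = [0.0] * len(x)
    for order, t in enumerate(operators, start=1):
        x_power = [v ** order for v in x]      # each element raised to `order`
        y = [a + b for a, b in zip(y, mat_vec(t, x_power))]
    return y


# Hypothetical operators: identity, no square term, small cubic distortion.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zero = [[0.0] * 3 for _ in range(3)]
cubic = [[0.1 * v for v in row] for row in identity]

print(nonlinear_map([identity, zero, cubic], [1.0, 2.0, 0.0]))
```

Each additional operator adds nine stored elements in this 3-D case, which is one way to view the memory versus real time hardware trade-off.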

For instance if power supply variation, gain variation, temperature, and VSWR are taken into account along with the 3-D amplitude and phase error surfaces, a rank 6 tensor would be required to describe all interactions of the mapping in a unified treatment. Such unification can lead to significant memory reduction, with a trade-off in real time hardware to execute the tensor operations. Thus, a trade-off is required to identify the optimal rank versus hardware and memory allocation.

The previous section indicates how the input/output map can be obtained via a tensor operator, even when the system is inherently non-linear. However, each tensor operator as represented previously, as related to powers of x, can experience transformation as well. This transformation can also be viewed as a tensor operation under certain conditions. That is, the unified tensor of say rank 6 (for R⁷) can be broken into multiple operators of lower rank if so desired. This partitioning enhances the hardware trade-off and permits significant design degree of freedom. For instance the functional calibration parameters of temperature, battery voltage, gain, load condition, and frequency can be unified or mathematically decoupled. When they are decoupled they still may be related via some transformation such as a tensor operator, for example.

12. MISCELLANEOUS RELATED TOPICS

(a) Information Content

As previously indicated in Section 6(c), the gradient of the error surface defined by D_ε_R conveys information content. Higher order gradients are possible and also convey information. Consider the 2-D spatial kernel in the complex plane:

$D_{ɛ_{R2}} = f (I (t)) + j g (Q (t)) .$

In some cases, the equation can be simplified to describe I(t) as a function of Q(t), in the complex plane. In most cases, suitable domains can be defined for which continuity is adequate and derivatives exist. If it is assumed that a finite number of unique and significant derivatives exist for D_ε_R2, then the information content of the function is bounded. Even for the case of exponential functions a suitable truncation of terms in the series provides a practical bound on valuable (significant) information content. A pathological case may seem to exist for periodic functions. However, the claim above demands significant and unique derivatives. Hence, simple sin and cos functions do not present a particularly difficult problem when the significance of higher order derivatives is weighed. Thus,

$\frac{\partial^{v} D_{ɛ_{R}}}{\partial I^{v}} = \frac{\partial^{v} f (I)}{\partial I^{v}}$

and

$\frac{\partial^{ζ} D_{ɛ_{R}}}{\partial Q^{ζ}} = \frac{\partial^{ζ} g (Q)}{\partial Q^{ζ}}$

give an indication of the information stored in the surface.

For instance, suppose over some domain, there exists a function that can be described as follows:

$D_{ɛ_{R}} = f \{ I, Q \} = a_{3} I^{3} + a_{2} I^{2} + a_{1} I + K_{1} + b_{2} Q^{2} + b_{1} Q + K_{Q} + c_{22} I^{2} Q^{2} + c_{21} I^{2} Q + c_{12} I Q^{2} + c_{11} I Q + d_{31} I^{3} Q \dots$

This example polynomial representation provides up to 3 derivatives in I and up to 2 derivatives in Q (as written). At each derivative stage, a certain number of coefficients of the polynomial survive. Even without complete knowledge of the function above, one could obtain some idea of the information content within the 2-D plane by performing successive gradient operations and weighting the area of the plane with surviving derivative magnitudes, and/or the area under the surviving gradient surfaces over specific domains. This provides an immediate methodology of emphasizing certain regions which may possess more information via weighting functions, if so desired.
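The survival of coefficients under successive derivatives can be sketched as follows. The polynomial is stored as a mapping from exponent pairs (power of I, power of Q) to coefficients; the representation, the function names, and the unit coefficient values are assumptions for illustration, and the two constant terms are merged into a single (0, 0) entry.

```python
def d_dI(poly):
    """Partial derivative with respect to I; terms without I vanish."""
    return {(p - 1, q): c * p for (p, q), c in poly.items() if p > 0}


def d_dQ(poly):
    """Partial derivative with respect to Q; terms without Q vanish."""
    return {(p, q - 1): c * q for (p, q), c in poly.items() if q > 0}


def surviving_counts(poly, derivative):
    """Number of surviving coefficients after each successive derivative."""
    counts = []
    while poly:
        poly = derivative(poly)
        counts.append(len(poly))
    return counts


# Exponent pairs of the example polynomial as written, with hypothetical
# unit coefficients.
example_poly = {
    (3, 0): 1.0, (2, 0): 1.0, (1, 0): 1.0, (0, 0): 1.0,
    (0, 2): 1.0, (0, 1): 1.0,
    (2, 2): 1.0, (2, 1): 1.0, (1, 2): 1.0, (1, 1): 1.0, (3, 1): 1.0,
}

print(surviving_counts(example_poly, d_dI))
print(surviving_counts(example_poly, d_dQ))
```

The lists show three non-zero derivative stages in I and two in Q, consistent with the statement above. Weighting each stage by coefficient magnitude over a domain of interest would be one way to build the regional emphasis described.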

The raw information content of the error surface or manifold is directly related to the form and content of the functional description. In the case of algebraic polynomials, information is stored in the coefficients as well as the description of the polynomial domain. Orthogonal polynomial expansions also possess coefficients and domain descriptions. Therefore, the information storage requirements of such descriptions are directly related to the number of bits required for each coefficient multiplied by the number of coefficients. It follows that minimizing the information of D_ε_R is related to minimizing the number of significant coefficients and therefore the number of significant derivatives of the manifold description, as well as the required coefficient resolution.

It can be advantageous to project the error function or translate the error function to dimensions or coordinate systems for which these derivatives are naturally minimal.

In the case of VPA compensation, the 2-D fundamental spatial kernel can be expanded to a surface in a 3-D fundamental kernel description with a magnitude surface and phase surface. The magnitude surface in particular takes on the features of a cone over significant portions of operational dynamic range. The side profile of the cone possesses a nearly linear slant description over most of its domain. Section 13 indicates that this feature is advantageous to exploit in the 3-D view. This correlates well with the minimized derivative or gradient over reasonable portions of the domain as opposed to some other view such as complex 2-D, etc.

FIG. 59 illustrates a process flowchart of a methodology for approaching the information minimization requirement. The actual minimization phase of the procedure can be further expanded as illustrated in FIG. 60. Note that the minimization procedure is fairly complex and affected by the branch from which it is spawned (see FIG. 60). Choosing the right branch would typically be a matter of experience based on laboratory empirical evidence combined with some applied math. The minimization procedure does require significant intelligent control. This control can be computer driven or interactive with a human interface. In an embodiment, the human interface would be desirable until all scenarios, waveforms, degrees of freedom, etc., have been characterized. Once ‘learning’ was complete, suitable test thresholds could be established for each procedural step.

Since there are multiple degrees of freedom for the mathematical, functional, and geometric descriptions, it is assumed that a particular methodology has been selected, i.e.:

- Algebraic Polynomials;
- Chebyshev and Orthogonal Expansions;
- Least Squares;
- Splines;
- Tensors;
- Bézier Curves;
- Hermitian Polynomials; or
- Hybrid Techniques.

Within the context of a particular genre or approach, the procedure would be applied iteratively, tweaking the domain descriptions, specifying the mathematical description, and modifying coefficients, subject to intermediate testing by MMSE criteria and ultimately final verification against waveform criteria.

It is anticipated that gradient algorithms combined with mean square error considerations as well as actual waveform sensitivity functions will drive the domain weighting function procedure in the first step. The algorithm is qualitatively described as illustrated in FIG. 61, as a multi-option gradient based weighting algorithm.

Each branch of the weighting algorithm could be exclusively selected. Each data domain or functional domain could be equally weighted to a constant or each data point or collection of points could be appropriately weighted according to certain criteria relating to the impact or value of the data and its accuracy on the overall performance of the final solution. Although this could have been subjected to iterative algorithmic cycles, fed back from the output tests, it has been assumed here that apriori knowledge was gleaned from bench testing and characterizations. This simplifies the subsequent algorithm significantly.

The most comprehensive algorithmic step is based on the principle that the information content of D_ε_R is related to the functional gradient and regional cross correlation functions, as well as the gradient positions relative to the constellation origin with final tweaks applied on a per waveform basis. The constellation requirement weighs the magnitude error of the gradient versus the magnitude squared position of the gradient value within the constellation. Typically, signals on the unit circle possess greater energy and therefore greater weighting than signals near the origin. However, trajectories of signals through the complex plane and their associated derivative properties are important and cannot be totally ignored. For this reason, the waveform sensitivity weights must be added in some cases.

Also, cross correlation functions such as those described in Sections 6.7.1, 6.7.2, 6.7.3, 6.7.4, etc., should be included in the mix of techniques since considerable redundancy of information exists in the D_ε_R descriptions for practical applications. Section 13 provides some examples where the cross correlation function properties are significantly exploited, thereby drastically reducing the requirements for stored data to describe certain 2-D and 3-D fundamental kernels.

13. EXAMPLES FOR CONSTRUCTION OF D_ε_R

This section provides some examples of forming and processing error functions which have been reduced in some fashion by a combination of the principles presented in this disclosure. In some cases, 2D processing is favored, and in other cases 3D processing is presented. In all cases, there is an emphasis on the reduction of memory required for characterizing and processing the error D_ε_R. Sections 14.7 and 14.8 provide some detailed examples for calculating the memory required (using algorithms) in the cases of WCDMA1_1 and EDGE waveforms. Section 14.9 provides a comprehensive summary. The examples illustrate the promise of the technology but do not represent the full potential.

(a) 2D Example Using Starburst

In this example, a starburst technique was employed to stimulate the VPA and to measure and characterize the associated non-linearities. The correction files generated were used to produce compliant WCDMA waveforms covering more than 40 dB of dynamic range starting at the maximum power output. The input baseline starburst normally consists of 48 radial arms, each composed of 128 outward bound and 128 inward bound samples uniformly distributed in time. This results in an input sample record size of 12288 complex points (48 × 256). Using principles outlined in Sections 6 and 7, the input sample record or output sample record was cut in half, because of minimal VPA memory effects. In addition, output sample record lengths were reduced by correlation consideration, averaging/smoothing, and/or polynomial fitting.
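The size of the baseline starburst record can be reproduced with a short sketch: 48 radials with 128 outward bound and 128 inward bound samples each give 48 × 256 = 12288 complex points. The function name `starburst`, the equal spacing of the radial angles, and the linear amplitude ramp along each arm are assumptions made for illustration. The halved record is modeled here by keeping only the outward bound samples, which is one plausible reading of the reduction and is likewise an assumption.

```python
import cmath
import math


def starburst(num_radials=48, samples_per_leg=128, max_amplitude=1.0,
              outward_only=False):
    """Generate complex excitation points along equally spaced radial arms."""
    points = []
    for k in range(num_radials):
        angle = 2.0 * math.pi * k / num_radials
        # Assumed linear amplitude ramp from near the origin to full scale.
        outward = [max_amplitude * (i + 1) / samples_per_leg
                   for i in range(samples_per_leg)]
        leg = outward if outward_only else outward + outward[::-1]
        points.extend(cmath.rect(r, angle) for r in leg)
    return points


full_record = starburst()
half_record = starburst(outward_only=True)
print(len(full_record), len(half_record))
```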

FIGS. 62 and 63 illustrate output radial excitation responses in the 2D complex plane, for full power and 22 dB attenuation conditions, respectively. Note that the ensemble of traces near φ=0° on the average are illustrated along with the 48 response radials (bottom left hand corner plot). These collapsed traces represent each original response radial collapsed to a zero offset angle. The collapsing process is accomplished by de-rotating the radial in such a manner to account for an averaged radial angle. Each radial possesses such an angle which is an average defined by:

$φ_{k_{av}} = \frac{1}{n} \sum_{i = 1}^{n} φ_{i}$

where

- k ≜ k-th radial
- i ≜ sample number along the radial
- (n max = 128 in nominal case)

The angle φ_k_av is subtracted as the bias for each radial. This creates an ensemble of radials near the real axis on the average. The radials are then averaged to produce an averaged radial. FIGS. 64 and 65 illustrate a single averaged radial as interpolated or actually fitted in a least squares sense to an inner and outer polynomial, for full calibration and 22 dB attenuation calibration, respectively. As illustrated in FIGS. 64 and 65, two separate polynomials are utilized to break up the radial dynamic range. The inner polynomial is intended to estimate the VPA behavior at and below threshold, while the outer polynomial estimates the behavior at and above threshold. The threshold is a programmable parameter of the fitting algorithm, as is the outer polynomial order. The inner polynomial is restricted to be a straight line. Once the new radial formed by joining the inner and outer polynomials is generated, it is redeployed to the 48 excitation angles to render a symmetric radial complex response (see the bottom right hand plot in FIG. 62, for example). Once this 2D contour is realized, a 3D surface error for magnitude and phase can be generated (see the top left and top right plots in FIG. 62, for example).
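The collapsing and averaging steps can be sketched as follows. Each radial is de-rotated by its averaged angle and the collapsed radials are then averaged sample by sample. The function names are hypothetical, and the phase of each sample is measured relative to the direction of the summed radial to avoid wrap-around at ±180°, which is an implementation choice of this sketch rather than a step stated in this disclosure. The inner and outer polynomial fits are not shown.

```python
import cmath
import math


def mean_angle(radial):
    """Average of the sample angles along one radial, in radians."""
    ref = cmath.phase(sum(radial))   # overall direction of the radial
    angles = [ref + cmath.phase(s * cmath.exp(-1j * ref)) for s in radial]
    return sum(angles) / len(angles)


def collapse(radial):
    """De-rotate a radial so that its averaged angle becomes zero."""
    rotation = cmath.exp(-1j * mean_angle(radial))
    return [s * rotation for s in radial]


def averaged_radial(radials):
    """Collapse every radial, then average them sample by sample."""
    collapsed = [collapse(r) for r in radials]
    count = len(collapsed)
    return [sum(r[i] for r in collapsed) / count
            for i in range(len(collapsed[0]))]


# Hypothetical response: amplitude ramp with a small amplitude-dependent phase,
# repeated at 48 excitation angles.
base = [cmath.rect(i / 8.0, 0.2 * i / 8.0) for i in range(1, 9)]
radials = [[s * cmath.exp(2j * math.pi * k / 48) for s in base]
           for k in range(48)]
average = averaged_radial(radials)
print(abs(mean_angle(average)) < 1e-9)
```

In this idealized case every collapsed radial is identical, so the averaged radial equals the base response rotated to a zero averaged angle. With measured data, the averaging would also suppress radial-to-radial variation.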

Based on this technique, generally:

- A 70 dB plus dynamic range is accommodated by this technique, although only 40 dB of dynamic range is required.
- 7 complex numbers are stored per power level. If 40 power levels are required then only 560 words must be stored (maximum).
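The word count above follows from simple arithmetic, sketched here under the assumption that each complex number occupies two real words (one for the real part and one for the imaginary part). The function name is hypothetical.

```python
def coefficient_words(complex_per_level, power_levels, words_per_complex=2):
    """Total stored words for polynomial coefficients across power levels."""
    return complex_per_level * power_levels * words_per_complex


print(coefficient_words(7, 40))
```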

The approach presented also assumes that null cal coefficients can be employed for the bottom 40 dB of dynamic range. This requires no additional memory for the bottom 40 dB in the WCDMA application.

Accuracy and coefficient resolution for the polynomials improve the performance of this approach. There are several techniques for reducing the sensitivity to coefficient accuracy, including:

- Smooth derivative boundary conditions at the join between the inner and outer polynomials.
- Higher order inner polynomials.
- Using 3 or more polynomials.
- Using spline reconstruction techniques.
- Using quadrant or sub quadrant characterization and averaging rather than modulo 360°.

(b) 3D D_ε_R Example

In this example, polynomial fitting is applied after the error surfaces are rendered. Again, the radial sample theme is exploited with projections of these sample paths deployed along the reconstructed error surfaces. The projected radials are then averaged or smoothed after collapsing and an inner as well as outer polynomial applied to fit the smoothed result. Once the polynomial is obtained it is redeployed to form a perfectly symmetrical surface and test waveforms are run to ascertain the effectiveness of the technique.

FIG. 66 illustrates a conic magnitude error surface and bucket phase error surface for a full power output VPA condition. Note the projected lines along the contour of the surfaces. These projected lines are images derived from the 2D complex plane radial spokes. These projected surface lines are collapsed, averaged, and then polynomial fitted to obtain a new surface defined by re-expanding the polynomial to each original φ prior to collapsing. The polynomial is obtained via a least squares fit to the collapsed projections at an averaged φ=0.

FIGS. 67 and 68 illustrate the curve fit for an averaged collapsed surface radial, which is rendered in 2-D for the curve fitting process. FIG. 67 illustrates a magnitude curve fit. FIG. 68 illustrates a phase curve fit. The solid lines indicate the averaged radial projections, also called spokes. The dots illustrate the curve fit results from the polynomial approximation. The polynomial approximation possesses an inner and outer polynomial fit as described before, but the inner fit can take on a higher order than that of the example in Section 13(a).

The resulting approximate error which remains after correction is illustrated in FIG. 69.

An additional example test case for a 22 dB attenuation is provided in FIGS. 70, 71, 72, and 73. FIG. 70 illustrates magnitude and phase error surfaces. FIG. 71 illustrates a magnitude curve fit. FIG. 72 illustrates a phase curve fit. FIG. 73 illustrates the resulting approximate error after correction.

Note that an important distinction exists between this approach and that of Section 13(a). In Section 13(a), the example requires a single curve fit in 2D within the complex plane. Then, the 2D polynomial is expanded to create the 3D error surface for magnitude and phase.

In this example, |D_ε_R| and ∠D_ε_R are created first and then the surface approximations are separately rendered, requiring at least 2 separate polynomial fit exercises (four total if inner and outer polynomials are considered separately).

The rough upper bound for coefficient memory over the 40 dB of dynamic range of interest is approximately 720 words for magnitude and phase combined, at <30 bits/word.

The technique described in Section 13(b) can be extended to include up to 5 joined polynomials, each with programmable order. In addition, a practical constraint was added to the quantization of the polynomial coefficients. 16 bits can be partitioned to the left and right of the decimal.

The following sequence of figures (FIGS. 74-80) and Tables 1-3 below provide examples of 3 power levels, compensated by this technique. In each case compliance was obtained with as few as 12 or 14 bits of quantization. Floating point (high resolution) coefficients as well as fixed point (quantized) coefficients are provided for comparison in Tables 1-3. The number of coefficients for each polynomial exceeds the polynomial order by 1.

The 3-D parameterized curves for magnitude and phase error are fit separately. The thresholds or joins of the piecewise polynomials are defined to restrict each polynomial's domain.
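Evaluation of such joined polynomials can be sketched as a lookup of the active segment followed by a polynomial evaluation. The function name `eval_piecewise`, the ascending-power ordering of the coefficients, and all numeric values below are assumptions for illustration and are not taken from the tables of this disclosure.

```python
import bisect


def eval_piecewise(boundaries, coeff_rows, r):
    """Evaluate joined polynomials at r.

    boundaries: sorted join locations (one fewer than the number of polynomials).
    coeff_rows[s]: coefficients of polynomial s, lowest power first (assumed).
    """
    segment = bisect.bisect_right(boundaries, r)   # index of the active polynomial
    value = 0.0
    for c in reversed(coeff_rows[segment]):        # Horner's rule
        value = value * r + c
    return value


# Hypothetical joins and coefficients for three joined polynomials.
joins = [1.0, 2.0]
rows = [[0.0, 1.0],    # r        on r < 1.0
        [0.5, 0.5],    # 0.5+0.5r on 1.0 <= r < 2.0
        [1.5]]         # constant on r >= 2.0

print([eval_piecewise(joins, rows, r) for r in (0.5, 1.5, 2.5)])
```

Because each polynomial is only evaluated inside its own domain, low orders can be used per segment, which is consistent with the reduced sensitivity to coefficient quantization noted for this approach.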

The 3 documented examples illustrate significant improvement in robust behavior and reduced sensitivity to coefficient quantization by increasing the number of polynomials and decreasing their orders.

An upper bound of coefficient memory to accommodate magnitude and phase over a 40 dB dynamic range is approximately 960 14-bit words.

FIGS. 74 and 75 and Table 1 below illustrate example results for a 0 dB attenuation example.

TABLE 1

Magnitude Boundaries: 0.20000 3.00000 3.00000 3.00000

Phase Boundaries: 0.30000 0.65000 1.10000 1.50000

Number of Magnitude Coefficients: 2 2 2 2 2

Number of Phase Coefficients: 3 3 2 2 2

Resolution for the Magnitude Coefficients = [12 9]

Resolution for the Phase Coefficients = [12 4]

High Resolution Magnitude Coefficients

0.85540354 −0.00338067 0.00000000 0.00000000 0.00000000
0.97280137 −0.02137875 0.00000000 0.00000000 0.00000000
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000

Quantized Magnitude Coefficients

0.85546875 −0.00390625 0.00000000 0.00000000 0.00000000
0.97265625 −0.02148438 0.00000000 0.00000000 0.00000000
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000

High Resolution Phase Coefficients

14.96697647 −9.79068913 0.63612456 0.00000000 0.00000000
−9.20465211 −2.35328758 0.06139486 0.00000000 0.00000000
−33.66942017 0.07013977 0.00000000 0.00000000 0.00000000
−48.16052162 0.61092995 0.00000000 0.00000000 0.00000000
−77.16833330 1.37159559 0.00000000 0.00000000 0.00000000

Quantized Phase Coefficients

14.93750000 −9.81250000 0.62500000 0.00000000 0.00000000
−9.18750000 −2.37500000 0.06250000 0.00000000 0.00000000
−33.68750000 0.06250000 0.00000000 0.00000000 0.00000000
−48.18750000 0.62500000 0.00000000 0.00000000 0.00000000
−77.18750000 1.37500000 0.00000000 0.00000000 0.00000000
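The relationship between the high resolution and quantized coefficients in Table 1 can be reproduced with a short sketch. It assumes that the second number in each resolution pair (for example, 9 in [12 9] and 4 in [12 4]) is the number of fractional bits and that rounding is to the nearest step; this reading is an inference from the tabulated values rather than a stated definition. The first number is presumably the total word length and is not modeled here. The function name is hypothetical.

```python
import math


def quantize(value, frac_bits):
    """Round a coefficient to the nearest multiple of 2**-frac_bits."""
    scale = 1 << frac_bits
    return math.floor(value * scale + 0.5) / scale


# Magnitude coefficients from Table 1, assumed 9 fractional bits.
print(quantize(0.85540354, 9), quantize(-0.00338067, 9))
# Phase coefficients from Table 1, assumed 4 fractional bits.
print(quantize(14.96697647, 4), quantize(-9.79068913, 4))
```

Under these assumptions the sketch returns 0.85546875, −0.00390625, 14.9375, and −9.8125, matching the corresponding quantized entries of Table 1.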


FIGS. 76, 77, and Table 2 below illustrate example results for a 22 dB attenuation example.

TABLE 2

Magnitude Boundaries: 0.25000 0.60000 1.05000 3.00000

Phase Boundaries: 0.30000 0.60000 1.10000 1.50000

Number of Magnitude Coefficients: 2 3 2 2 2

Number of Phase Coefficients: 3 3 2 2 2

Resolution for the Magnitude Coefficients = [14 11]

Resolution for the Phase Coefficients = [14 5]

High Resolution Magnitude Coefficients

−0.12295166 0.18226050 0.00000000 0.00000000 0.00000000
0.61223504 0.12268472 −0.00582460 0.00000000 0.00000000
1.63301276 −0.03114089 0.00000000 0.00000000 0.00000000
1.73008103 −0.03477335 0.00000000 0.00000000 0.00000000
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000

Quantized Magnitude Coefficients

−0.12304688 0.18212891 0.00000000 0.00000000 0.00000000
0.61230469 0.12255859 −0.00585938 0.00000000 0.00000000
1.63281250 −0.03125000 0.00000000 0.00000000 0.00000000
1.72998047 −0.03466797 0.00000000 0.00000000 0.00000000
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000

High Resolution Phase Coefficients

66.26567906 2.15845366 −0.22148454 0.00000000 0.00000000
148.97252757 −13.85828510 0.40500947 0.00000000 0.00000000
50.20379811 −1.36110692 0.00000000 0.00000000 0.00000000
32.56491983 −0.68370462 0.00000000 0.00000000 0.00000000
26.38508668 −0.50974078 0.00000000 0.00000000 0.00000000

Quantized Phase Coefficients

66.28125000 2.15625000 −0.21875000 0.00000000 0.00000000
148.96875000 −13.84375000 0.40625000 0.00000000 0.00000000
50.21875000 −1.37500000 0.00000000 0.00000000 0.00000000
32.56250000 −0.68750000 0.00000000 0.00000000 0.00000000
26.37500000 −0.50000000 0.00000000 0.00000000 0.00000000


FIGS. 78, 79, and Table 3 below illustrate example results for a 31 dB attenuation example.

TABLE 3

Magnitude Boundaries: 0.70000 0.90000 1.10000 3.00000

Phase Boundaries: 0.40000 0.85000 1.10000 1.50000

Number of Magnitude Coefficients: 2 2 2 2 2

Number of Phase Coefficients: 3 3 2 2 2

Resolution for the Magnitude Coefficients = [14 11]

Resolution for the Phase Coefficients = [14 6]

High Resolution Magnitude Coefficients

−0.13015737 0.04567030 0.00000000 0.00000000 0.00000000
0.46206359 0.01307941 0.00000000 0.00000000 0.00000000
1.11527556 −0.01650204 0.00000000 0.00000000 0.00000000
1.52015026 −0.03058306 0.00000000 0.00000000 0.00000000
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000

Quantized Magnitude Coefficients

−0.13037109 0.04589844 0.00000000 0.00000000 0.00000000
0.46191406 0.01318359 0.00000000 0.00000000 0.00000000
1.11523438 −0.01660156 0.00000000 0.00000000 0.00000000
1.52001953 −0.03076172 0.00000000 0.00000000 0.00000000
0.00000000 0.00000000 0.00000000 0.00000000 0.00000000

High Resolution Phase Coefficients

70.23409410 −3.07340941 0.18605940 0.00000000 0.00000000
16.05006950 6.27965393 −0.23170749 0.00000000 0.00000000
103.06551146 −2.79457112 0.00000000 0.00000000 0.00000000
67.40027620 −1.50747667 0.00000000 0.00000000 0.00000000
40.63303212 −0.78994105 0.00000000 0.00000000 0.00000000

Quantized Phase Coefficients

70.23437500 −3.07812500 0.18750000 0.00000000 0.00000000
16.04687500 6.28125000 −0.23437500 0.00000000 0.00000000
103.06250000 −2.79687500 0.00000000 0.00000000 0.00000000
67.40625000 −1.50000000 0.00000000 0.00000000 0.00000000
40.64062500 −0.79687500 0.00000000 0.00000000 0.00000000


(d) Expanded 2-D Tool Performance

The technique presented in Section 13(a) can be expanded to include consideration of 5 polynomials along with truncations of coefficients. This technique is of interest because the user simply fits to a single complex curve in 2-D space. This fit can then be expanded in a 3-D parametric space to reconstruct magnitude and phase error surfaces. This is equivalent to implicitly constructing I polynomials and Q polynomials. In certain embodiments, this technique requires up to 24 bit resolution for coefficients.

FIGS. 80-84 illustrate example results using this expanded technique. FIGS. 80 and 81 illustrate example results for a 0 dB attenuation example. The fixed point resolution is slightly changed between FIG. 80 and FIG. 81. FIG. 82 illustrates floating point (high resolution) coefficients as well as fixed point (quantized) coefficients for the example of FIG. 81. FIGS. 83-84 illustrate similar example results for a 22 dB attenuation example.

(e) Extension of 2-D Technique with Explicit I, Q Component Polynomial Fit

As presented in Section 13(d), the complex 2-D representation implicitly fits I and Q components of the function by approximating the single complex plane locus of points using up to 5 polynomials with truncated coefficients.

In this section, the I and Q functions are fit explicitly and separately. Then, the 3-D error representation is reconstructed. Typically, up to 16 bits are required for the coefficients to achieve average performance.

FIGS. 85-90 illustrate example results using this approach. FIGS. 85 and 86 illustrate example results for a 0 dB attenuation case. FIGS. 87 and 88 illustrate example results for a 22 dB attenuation case. FIGS. 89 and 90 illustrate example results for a 40 dB attenuation case.

(f) Head to Head Comparison for Three Generation I Algorithms

In this section, three separate algorithms (Least Squares Fit, Minimax Fit, and Chebyshev Fit) are compared in terms of coefficient resolution requirements for achieving acceptable performance at three distinct power output level attenuations: 0 dB, 22 dB, and 40 dB.

FIG. 91 illustrates a comparison of bit resolution versus RF attenuation for the three algorithms.

FIG. 92 illustrates a comparison of bit resolution versus RF attenuation for the three algorithms.