1. Field of the Invention
The present invention relates to signal processing. In particular, the present invention relates to applying crest factor reduction techniques to a digitally modulated signal used in communication applications.
2. Discussion of the Related Art
The crest factor of a modulated signal is the square root of the signal's peak-to-average power ratio (PAR). Signals with a large crest factor are widely used in communication systems. In many 3G/4G communication systems (e.g., WCDMA or LTE), when seen in the frequency domain, the signal band is divided into a number of non-overlapping sub-bands or carrier signals, with each carrier signal having its own multiple-access modulation format. The sampling rate of the signal is typically higher than the Nyquist rate, i.e., the signal's double-side bandwidth. In such systems, crest factor reduction (CFR) improves power efficiency in a wireless transmitter.
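As a minimal illustration of the definitions above, the following Python sketch computes the PAR and the crest factor of a block of complex baseband samples. The function names and the sample values are hypothetical and serve only to show the relationship between the two quantities.

```python
import math


def par_and_crest_factor(samples):
    """Return (PAR, crest factor) for a sequence of complex baseband samples."""
    powers = [abs(s) ** 2 for s in samples]
    peak_power = max(powers)
    average_power = sum(powers) / len(powers)
    par = peak_power / average_power
    # The crest factor is the square root of the peak-to-average power ratio.
    return par, math.sqrt(par)


def to_db(power_ratio):
    """Express a power ratio in decibels."""
    return 10.0 * math.log10(power_ratio)


# Hypothetical samples: three unit-magnitude samples and one peak of magnitude 5.
samples = [1 + 0j, 0 + 1j, -1 + 0j, 3 + 4j]
par, crest = par_and_crest_factor(samples)
print(f"PAR = {to_db(par):.2f} dB, crest factor = {crest:.3f}")
```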
In the prior art, various CFR methods have been developed. However, many of these methods require modifications of either the data symbols used or the modulation schemes. Such methods are unsuitable for post-processing multi-carrier modulated signals for CFR, because the data symbols and modulation details are not available to the CFR processor.
One method that is suitable for multi-carrier signals is known as the “windowing method.” However, the windowing method performs poorly due to fundamental drawbacks in the algorithm.
Other widely used methods are the “peak cancellation” methods, which use a number of pulse generators to create a cancellation signal. Peak cancellation methods have two drawbacks. First, such methods result in circuits that have high power consumption requirements, due to their computational complexity. For example, the GC1115 integrated circuit marketed by Texas Instruments, Inc. has a peak power consumption of 1.8 watts. Second, such methods result in circuits that have relatively low performance.
Achieving near optimal CFR in arbitrary multi-carrier signals without incurring high computational complexity is highly desired.
According to one embodiment of the present invention, a crest factor reduction (CFR) scheme for a digitally modulated signal in a complex baseband is achieved by post-processing the input signal. The present invention provides a digital CFR processor that reduces the signal's peak-to-average power ratio (PAR) with no increase, or only a minimal increase, in out-of-band emissions.
The present invention takes advantage of a procedure that solves for an optimum CFR using a constraint-optimization approach. In this approach, the CFR-induced distortion is measured using a weighted mean square error (MSE) adapted for use with arbitrary multi-carrier signals. The optimum CFR results from a procedure that either minimizes the crest factor, subject to a constraint on the weighted MSE, or minimizes the weighted MSE, subject to a constraint on the crest factor. In this regard, the weighted MSE is closely correlated with the error-vector-magnitude (EVM) specification.
In one embodiment, the crest factor reduction processor, which receives an input signal and provides an output signal, includes: (a) an error generation circuit that receives the input signal and provides an error signal that is indicative of a crest factor-induced distortion and a delayed input signal, the delayed input signal being the input signal delayed by a predetermined value; (b) a linear-phase filter receiving the error signal to provide a correction signal; and (c) a summer that subtracts the correction signal from the delayed input signal to provide the output signal. The linear-phase filter may have a frequency response that can be expressed as $A(\omega)e^{-j\omega D}$, where D is a delay of the linear-phase filter, A(ω) is a non-negative real-valued frequency response, and ω is the frequency variable.
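The error-filter-subtract structure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the claimed circuit: the error generator is taken to be a single polar-clipping-error function of the conventional form x(1 − ξ/|x|) for |x| > ξ, the linear-phase filter is a direct-form FIR with symmetric real taps of length 2D+1, and the input is delayed by D samples so that it aligns with the filtered error. All function names are hypothetical.

```python
def polar_clipping_error(x, threshold):
    """Assumed error function: the part of x that lies beyond the clipping radius."""
    magnitude = abs(x)
    if magnitude <= threshold:
        return 0j
    return x * (1.0 - threshold / magnitude)


def symmetric_fir(signal, taps):
    """Causal direct-form FIR; symmetric real taps of length 2D+1 give delay D."""
    output = []
    for k in range(len(signal)):
        acc = 0j
        for n, tap in enumerate(taps):
            if k - n >= 0:
                acc += tap * signal[k - n]
        output.append(acc)
    return output


def cfr_single_pass(signal, threshold, taps):
    """Error generation, linear-phase filtering, and subtraction from the delayed input."""
    delay = (len(taps) - 1) // 2
    error = [polar_clipping_error(x, threshold) for x in signal]
    correction = symmetric_fir(error, taps)
    delayed_input = [0j] * delay + list(signal[:len(signal) - delay])
    return [d - c for d, c in zip(delayed_input, correction)]


# With the pass-through taps [0, 1, 0] (A = 1 at all frequencies, D = 1),
# the sketch reduces to plain polar clipping of the delayed input.
signal = [1 + 0j, 3 + 4j, 1 + 0j, 0j]
output = cfr_single_pass(signal, threshold=2.5, taps=[0.0, 1.0, 0.0])
```

In this pass-through case the peak sample of magnitude 5 leaves the sketch with magnitude 2.5, one sample later. A band-limited set of taps would instead confine the correction to the signal band, at the cost of some peak regrowth.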
In one embodiment, the error generation circuit comprises one or more error generation stages for an input signal that includes multiple carrier signals. In one implementation, each error generation stage comprises a high-pass filter, a polar-clipping-error block and a delay circuit matching the delayed input signal to a delay of the polar-clipping-error block. The high-pass filter may have a frequency response that can be expressed as $[1-A(\omega)]e^{-j\omega D}$, where D is a delay of the linear-phase filter, A(ω) is a non-negative real-valued frequency response, and ω is the frequency variable. The polar-clipping-error block may implement a non-zero, non-linear function for an input complex sample having a magnitude greater than a predetermined value.
In one embodiment, each error generation stage may include (a) a quantized error block that receives a block input signal and provides a quantized output signal; (b) an error filter that provides a filtered quantized output signal from the quantized output signal; (c) a delay element that delays the block input signal by a time period matching a delay of the error filter; and (d) a summer that subtracts the filtered quantized output signal from the delayed block input signal to provide a block output signal. The quantized error block may include comparator logic that determines whether or not each sample in the block input signal satisfies a predetermined condition. When the comparator logic determines that the predetermined condition is satisfied, a quantizer circuit is enabled to provide non-zero samples in the quantized output signal. The quantized stage may further include a non-linear gain circuit that provides a non-linear gain to the block output signal.
In one embodiment, the error filter includes: (a) a tapped delay line that receives a number of complex samples from the quantized output signal over many periods of an input clock; (b) summers that combine the complex samples received over the clock periods to provide a plurality of complex sums; (c) multipliers for multiplying the complex sums with filter coefficients to provide complex products; and (d) an output summer for summing the complex products. The summers and the multipliers operate at a higher clock rate than the input clock, such that multiple summing and multiplication operations are performed within each input clock period.
In an alternative embodiment, the error filter includes: (a) registers, each provided to store a non-zero complex sample from the quantized output signal and a corresponding life time index representing the number of clock periods of the quantized output signal since the non-zero complex sample was stored into the register; (b) processing circuits that receive the non-zero complex samples and their corresponding life time indices stored in the registers; and (c) a summer for summing accumulated sums of the processing circuits to provide samples in the filtered quantized output signal. Each processing circuit may include: (a) a random access memory circuit that receives one or more of the life time indices as addresses to provide corresponding filter coefficients stored in the random access memory circuit; (b) a multiplier that receives in a predetermined order one or more of the non-zero complex samples and the corresponding filter coefficients to provide corresponding products; and (c) an accumulator for summing the corresponding products to provide an accumulated sum. Each processing circuit may operate at a higher clock rate so as to process a plurality of non-zero complex samples within each clock period of the quantized output signal.
According to one embodiment, the quantized output signals of the error generation stages may be delayed and summed to provide an error signal suitable for error vector magnitude monitoring.
In one embodiment, the CFR processor is adaptive, and includes a spectrum analysis circuit for determining power spectra of the input signal, and a crest factor reduction controller that receives the power spectra to provide a set of filter coefficients for the error filter. The spectrum analysis circuit may compute fast Fourier transforms. The crest factor reduction controller varies the set of filter coefficients adaptively based on the power spectra received. The crest factor reduction controller also extracts carrier-power distributions from the power spectra.
According to one embodiment of the present invention, a CFR processor is constructed that has close-to-optimum performance, as measured by the relevant PAR and EVM parameters, and low computational complexity. The low computational complexity results in digital circuit implementations that are low-power and have a relatively small footprint. For example, a method of the present invention can reduce the PAR of a conventional LTE signal to 6.0 dB with 6.5% EVM, or to 6.7 dB with 4.3% EVM, as compared to the performance of the CFR solution from Optichron in the prior art, which reduces that PAR to 6.7 dB with 6.5% EVM.
The present invention is better understood upon consideration of the detailed description below in conjunction with the accompanying drawings.
According to one embodiment of the present invention, crest factor reduction (CFR) of a digitally modulated signal in complex baseband is achieved by post-processing of the input signal. The present invention is applicable, for example, to an input signal that, as seen in the frequency domain, has a signal band that is divided into a number of non-overlapping subbands or carriers, with each carrier having its own multiple-access modulation format. Such an input signal is used in, for example, WCDMA or LTE signals of 3G/4G wireless systems.
The present invention takes advantage of a procedure that solves for an optimum CFR using a constraint-optimization approach. In this approach, the CFR-induced distortion is measured using a weighted mean square error (MSE) adapted for use with arbitrary multi-carrier signals. The optimum CFR may result from a procedure that either minimizes the crest factor, subject to a constraint on the weighted MSE, or minimizes the weighted MSE, subject to a constraint on the crest factor. In this regard, the weighted MSE is closely correlated with the relevant EVM specification.
where x is the input value to the $f_{\mathrm{PCE}}$ function, typically a complex number, and ξ is the threshold of polar clipping. For the error generation subsystem 101 implemented in circuit 200, delayed input signal 112 (associated with error generation block 201-n) is represented by a sequence of samples, $s=\{s_k\}$, and the error output signal 111 (also associated with block 201-n) is represented by $p=\{p_k\}$. The output error signal 111 of the multi-stage subsystem satisfies the equation
$$\{p_{k-D}\} = f_{\mathrm{PCE}}\big(s + H(p)\big) \qquad (2)$$
where H denotes the function implemented by H-FIR filter 202. Repeated application of error generation blocks 201-1 to 201-n provides an error signal that is used to obtain optimum CFR performance (i.e., the theoretical limit) by the method illustrated in
The present invention allows computational complexity of error generation to be reduced without incurring large degradation of CFR performance from the theoretical limit. According to one embodiment of the present invention,
where g(|x|) is a nonlinear gain function computed at non-linear gain block 408 based on instantaneous power signal 410, round( ) is the rounding-to-integer function, and $\rho_1$ and $\rho_2$ are quantization parameters. For example, g(|x|) can be a piece-wise linear approximation of $\max(1-\xi|x|^{-1},\,0)$.
In one embodiment, Boolean sample $b_k$ of signal 411 is given by
and, in one implementation, nonlinear gain g(|x|) is given by
$$g(|x|) = \min\big(0.7\,(\log_2|x| - \log_2\xi),\ 0.5\big) \quad \text{for } |x| > \xi \qquad (5)$$
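For illustration, equation (5) can be evaluated directly, as in the Python sketch below, and compared against the polar-clipping gain max(1 − ξ/|x|, 0) that it is said to approximate. Equation (5) is stated only for |x| > ξ; returning zero at or below the threshold is an assumption made here, and the function names are hypothetical.

```python
import math


def nonlinear_gain(magnitude, threshold):
    """Gain of equation (5); zero at or below the threshold (an assumption)."""
    if magnitude <= threshold:
        return 0.0
    return min(0.7 * (math.log2(magnitude) - math.log2(threshold)), 0.5)


def polar_clipping_gain(magnitude, threshold):
    """Reference gain max(1 - threshold/|x|, 0) for a non-zero magnitude."""
    return max(1.0 - threshold / magnitude, 0.0)


# Tabulate both gains for magnitudes between 1.0 and 2.0 times the threshold.
threshold = 1.0
for step in range(0, 11):
    magnitude = threshold * (1.0 + 0.1 * step)
    print(f"{magnitude:.1f}  "
          f"{nonlinear_gain(magnitude, threshold):.3f}  "
          f"{polar_clipping_gain(magnitude, threshold):.3f}")
```

One reason a base-2 logarithmic form may be attractive in hardware is that log2 of a fixed-point magnitude can be estimated cheaply from the position of its leading bit, although the source does not state this motivation.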
Some characteristics of QSE block 401 are:
CFR performance may be improved by cascading a number of CFR processors. Unlike a method that cascades multiple stages of PCE-based modified clip and filter, cascading 2 or 3 stages of a QSE-based CFR processor can achieve nearly optimum CFR performance (e.g., reducing the PAR to 0.1 dB higher than the theoretical limit).
A cascaded-CFR processor may be used to construct a multi-stage error generation subsystem, which may then be filtered to achieve filtered error cancellation, as illustrated by
As mentioned above, because the output signal of QSE block 401 is coarsely quantized and highly sparse in the time domain, computational complexity may be reduced substantially for FIR 404 (
$$y_k = \alpha_N q_{k-N} + \sum_{n=0}^{N-1} \alpha_n \left(q_{k-n} + q_{k+n-2N}\right) + j\sum_{n=0}^{N-1} \beta_n \left(q_{k-n} - q_{k+n-2N}\right) \qquad (6)$$
where coefficients $\{\alpha_n\}_{n=0}^{N}$ and $\{\beta_n\}_{n=0}^{N-1}$ are fixed-point real numbers. In
Coefficient multipliers 903 of FIR filter 900 may each be designed to consume no power when its input data is zero.
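The folded structure of equation (6) can be sketched in Python as follows. The sketch evaluates one output sample by pre-adding (and pre-subtracting) the two input samples that share a coefficient pair, so each pair needs one α multiplication and one β multiplication rather than two complex-coefficient multiplications. The function name is hypothetical, and treating samples outside the input sequence as zero is an assumption made for the illustration.

```python
def folded_fir_output(q, k, alpha, beta):
    """Evaluate y_k of equation (6) for input sequence q.

    alpha holds N+1 real coefficients, beta holds N real coefficients.
    Samples outside the range of q are treated as zero (an assumption).
    """
    half_length = len(beta)  # N
    if len(alpha) != half_length + 1:
        raise ValueError("alpha must have exactly one more entry than beta")

    def sample(index):
        return q[index] if 0 <= index < len(q) else 0j

    # Centre tap: alpha_N * q_{k-N}.
    y = alpha[half_length] * sample(k - half_length)
    for n in range(half_length):
        newer = sample(k - n)                    # q_{k-n}
        older = sample(k + n - 2 * half_length)  # q_{k+n-2N}
        y += alpha[n] * (newer + older)
        y += 1j * beta[n] * (newer - older)
    return y


# Impulse response for N = 1 with hypothetical coefficients.
impulse = [1 + 0j, 0j, 0j, 0j]
response = [folded_fir_output(impulse, k, [0.25, 0.5], [0.1]) for k in range(4)]
```

The impulse response of this example is conjugate-symmetric about the centre tap (0.25+0.1j, 0.5, 0.25−0.1j), which is the property that makes the amplitude response real-valued.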
Non-zero sample container 1102 may include M registers, where M is the maximum number of non-zero samples expected to appear in 2N+1 consecutive complex samples of input signal 1110 (i.e., $\{q_k\}$). With an appropriate quantization selected, non-zero sample container 1102 does not overflow. Should overflow occur, the oldest non-zero sample in non-zero sample container 1102 is forcibly expired.
The stored input samples (i.e., up to M complex samples) and their respective life time indices are provided on data bus 1103 and index bus 1104, respectively. Each bus is further divided into M/R sets of R values, with R being a resource reuse ratio. In other words, data bus 1103 and index bus 1104 are divided into M/R sets of buses, with each set of buses carrying values {Val(i)}, where i ≤ R and each Val(i) is a complex number, and values {idx(i)}, where i ≤ R and each idx(i) is an integer. Each set of buses is then fed into one of RAM-multiplier-accumulator circuits 1106-1 to 1106-(M/R). Each RAM-multiplier-accumulator circuit works at an internal clock rate that is R times the sampling data rate of input samples $\{q_k\}$. At the beginning of each input sample clock period, accumulator 1107 is set to zero. At each internal clock period s (each of which is 1/R of the input sample clock period), index idx(s) is used to fetch a FIR coefficient from RAM table 1121. The FIR coefficient is then multiplied with the corresponding Val(s) by multiplier 1108 and the result is accumulated in accumulator 1107. Therefore, at the end of each input data clock period (i.e., at the end of R internal clock periods), each RAM-multiplier-accumulator circuit has processed R samples from non-zero sample container 1102 and provides as output a complex number. In sparse-data FIR 1100, the linear-phase FIR coefficients are 2N+1 real numbers, rather than 2N+1 complex numbers. The M/R output complex values of RAM-multiplier-accumulator circuits 1106-1 to 1106-(M/R) are then summed in sum circuit 1131 to provide value $y_k$ of sparse-data FIR filter 1100 (see equation 6).
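The behavior of the sparse-data FIR described above can be modeled in software as shown in the Python sketch below. The model keeps only the non-zero samples together with their life time indices, uses each index to look up a coefficient (standing in for the RAM table), and forcibly expires the oldest entry on overflow. It is a functional sketch under stated assumptions: the class and attribute names are hypothetical, and the R-fold time multiplexing across M/R multiplier-accumulator circuits is collapsed into a single loop.

```python
from collections import deque


class SparseFir:
    """Functional model of an FIR that multiplies only the non-zero input samples."""

    def __init__(self, coefficients, capacity):
        self.coefficients = list(coefficients)  # 2N+1 real taps (RAM table stand-in)
        self.capacity = capacity                # M registers
        self.entries = deque()                  # [value, life_time_index], oldest first

    def step(self, sample):
        """Consume one input sample and return one output sample."""
        # Every stored sample ages by one input clock period.
        for entry in self.entries:
            entry[1] += 1
        # Samples older than the filter span no longer contribute.
        while self.entries and self.entries[0][1] >= len(self.coefficients):
            self.entries.popleft()
        if sample != 0:
            if len(self.entries) == self.capacity:
                self.entries.popleft()  # overflow: forcibly expire the oldest sample
            self.entries.append([sample, 0])
        # One multiply-accumulate per stored non-zero sample.
        return sum((self.coefficients[age] * value for value, age in self.entries), 0j)


# Hypothetical taps (N = 1) and a sparse input sequence.
fir = SparseFir(coefficients=[0.25, 0.5, 0.25], capacity=2)
outputs = [fir.step(x) for x in [2j, 0, 1, 0, 0]]
```

For this input the model produces the same outputs as a dense three-tap FIR, while performing at most two multiplications per output sample.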
One advantage of the present invention is the monitoring of each carrier's EVM in the output signal of the CFR processor. The error-signal output from a multi-stage error generation subsystem (e.g., any of the circuits of
where $P_{av}$ is the average power of the multi-carrier signal. The error-to-signal ratio, i.e., $E_{av}/P_{av}$, may be obtained from average power measurements.
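As a simple illustration of obtaining the error-to-signal ratio from average power measurements, the Python sketch below averages the squared magnitudes of an error sequence and of a signal sequence and forms their ratio. The function names and sample values are hypothetical, and the sketch does not reproduce the per-carrier EVM relationship referred to above.

```python
import math


def average_power(samples):
    """Mean squared magnitude of a sequence of complex samples."""
    return sum(abs(x) ** 2 for x in samples) / len(samples)


def error_to_signal_ratio(error, signal):
    """Ratio of average error power to average signal power (E_av / P_av)."""
    return average_power(error) / average_power(signal)


# Hypothetical error and signal samples.
error = [0.1 + 0j, 0.1j, -0.1 + 0j, -0.1j]
signal = [1 + 0j, 1j, -1 + 0j, -1j]
ratio = error_to_signal_ratio(error, signal)
print(f"E_av/P_av = {ratio:.4f} ({10.0 * math.log10(ratio):.1f} dB)")
```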
FFT subsystem 1203 may also be used in the computation of time-domain FIR coefficients by inverse FFT.
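One way such a computation could proceed is a frequency-sampling design: sample a desired real-valued amplitude response on 2N+1 frequency bins, take the inverse transform, and rotate the result so that the centre tap sits at index N. The Python sketch below illustrates this with a direct inverse DFT in place of an FFT. This is an assumed design method for illustration only; the source does not specify how the coefficients are derived, and the function names are hypothetical.

```python
import cmath


def inverse_dft(spectrum):
    """Direct inverse discrete Fourier transform (stand-in for an inverse FFT)."""
    size = len(spectrum)
    return [
        sum(spectrum[m] * cmath.exp(2j * cmath.pi * m * k / size) for m in range(size)) / size
        for k in range(size)
    ]


def taps_from_amplitude(amplitude):
    """Return 2N+1 causal taps whose response on the DFT bins is amplitude * linear phase.

    amplitude: real samples of the desired response on 2N+1 bins (odd length).
    """
    size = len(amplitude)
    if size % 2 == 0:
        raise ValueError("an odd number of bins (2N+1) is expected")
    half = size // 2  # N, the resulting filter delay in samples
    zero_phase = inverse_dft(amplitude)
    # Rotate so that the zero-phase centre tap lands at index N.
    return [zero_phase[(k - half) % size] for k in range(size)]


# A flat response yields a pure delay; an asymmetric real response yields
# complex taps that are conjugate-symmetric about the centre tap.
flat_taps = taps_from_amplitude([1.0, 1.0, 1.0])
asymmetric_taps = taps_from_amplitude([1.0, 1.0, 0.0])
```

The conjugate symmetry of the resulting taps matches the coefficient structure of equation (6), in which the samples q_{k−n} and q_{k+n−2N} are weighted by α_n + jβ_n and α_n − jβ_n, respectively.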
In one embodiment, adaptive CFR processor 1200 may include a 2x upsampler 1204. If the sampling rate of input signal 1210 is lower than twice the Nyquist rate, 2x upsampler 1204 is enabled to double the sampling rate. Otherwise, upsampling is not required. CFR datapath subsystem 1201 should operate at a sampling rate that is at least twice the Nyquist rate.
The detailed description above is provided to illustrate the specific embodiments of the present invention and is not intended to be limiting. Numerous variations and modifications within the scope of the present invention are possible. The present invention is set forth in the accompanying claims.