The present invention pertains generally to signal processing systems for extracting signals from interference and noise, and more particularly to signal processing systems that extract digitally modulated signals from noise and interference induced during the transmission of such digitally modulated signals.
Channel and noise effects in communication and imaging systems introduced during the transmission of digitally modulated signals compel the use of signal processing and multivariate statistical inference for extracting useful information from filtered and noisy data. The principles used in the design of such systems utilize modeled or measured properties of the signals, interference, noise, and signal-plus-interference-plus-noise, to design matrix signal processors. Such matrix processors use matrix inversion or matrix eigen-analysis for multi-user separation and interference cancellation, for channel estimation and equalization, and for timing acquisition. These matrix processors are indicated in most, if not all, modern standards for wireless communication, including, but not limited to, CDMA, OFDM, GSM, and UWB, and in most, if not all, systems for beam forming, imaging, and detection in radar, sonar and medical imaging. The matrix calculations are extensive and time consuming. They add to the complexity of the signal processing and reduce the real-time or lab-time speed at which the signal processor can run and produce results.
The present invention overcomes the disadvantages and limitations of the prior art by providing a multiple user communication system that is designed using resource management of spreading codes and power controls to reduce the number of distinct eigenvalues that would otherwise be created in the signal correlation matrix so that the conjugate direction calculations can be made in reduced rank Wiener filters to approximate de-correlation results in a substantially reduced number of steps (substantially equal to the reduced number of distinct eigenvalues) in comparison to inverse matrix calculations. The terminology de-correlation is used in a very general way, to include any matrix processor that uses a matrix inverse to design a receive filter.
The present invention may therefore comprise a method of designing a transmitter of a digital modulation communication system to simplify the process of extracting noise and interference from a received signal comprising: encoding an input for transmission in the digital modulation communication system with selected spreading codes to produce an encoded input; adjusting the amplitudes of the spreading codes to produce an encoded, weighted signal; the spreading codes and the amplitudes of the encoded input being selected so that the encoded, weighted signal has a correlation matrix with a designed number of distinct eigenvalues; receiving the digitally modulated signal; and filtering the encoded, weighted signal from noise and interference induced during transmission using a reduced-rank Wiener filter that approximates de-correlation results by using conjugate direction calculations that have a number of steps that is substantially equal to the designed number of distinct eigenvalues so that the number of steps of said conjugate direction calculations can be controlled by selection of the spreading codes and the amplitudes of the spreading codes.
The present invention may further comprise a method of providing noise and interference cancellation in a multi-user digital modulation communication system comprising: assigning symbols to a plurality of binary inputs from the multiple users; encoding the symbols with good spreading codes to create a plurality of encoded symbols; adjusting amplitudes of the encoded symbols to produce a plurality of encoded, weighted symbols such that the spreading codes used to encode the symbols and the amplitude of the encoded symbols cause a correlation matrix of the encoded, weighted symbols to have a designed number of distinct eigenvalues; digitally modulating the encoded, weighted symbols to produce a digitally modulated, encoded weighted signal; transmitting the digitally modulated, encoded weighted signal; receiving a detected signal having a digitally modulated, encoded weighted signal component and an interference and noise component; demodulating the detected signal; and filtering the detected signal to substantially remove the interference and noise component using a reduced-rank Wiener filter that approximates de-correlation results by using conjugate direction calculations that have a number of steps that is substantially equal to the designed number of distinct eigenvalues so that the number of steps of the conjugate direction calculations can be controlled by selection of the spreading codes and the amplitudes of the spreading codes.
The present invention may further comprise a method of canceling noise and interference in both passive and active scanning systems comprising: encoding an input signal of the active scanning system with good space-time codes and encoding symbols to create a good transmit signal; adjusting amplitudes of the plurality of encoding symbols to produce the space-time codes and the amplitudes of the encoding symbols so that the good transmit signal has a correlation matrix with a designed number of distinct eigenvalues; exploiting the small number of distinct eigenvalues in a passive system; generating a reduced-rank Wiener filter steering vector for beam forming, detection, or estimation, using a Wiener filter that approximates de-correlation results by using conjugate direction calculations that have a number of steps that is substantially equal to the designed and exploited number of distinct eigenvalues so that the number of steps of the conjugate direction calculations can be controlled in active scanning by selection of the good space-time codes and the amplitudes of the symbols, and exploited in passive scanning; and applying the reduced-rank Wiener filter steering vector to signals received by a space-time receiver of the scanning system to cancel interference and noise.
The advantages of the present invention are that information systems such as communication, radar, sonar, ultrasound, NMR, etc. can be designed using the techniques of the present invention to greatly reduce the complexity of these information systems. By reducing the complexity, faster and less expensive information systems can be provided, or systems of fixed complexity can be designed for higher bandwidth or larger multiuser capacity.
The foregoing and further objects, features, and advantages of the present invention will be understood more completely from the following detailed description of presently preferred, but nonetheless illustrative, embodiments in accordance with the present invention, with reference being had to the accompanying drawings, in which:
As indicated above, the binary inputs 102 are from multiple users. So, for example, if there are 64 users, there may be 64 separate binary input sequences 102. After these inputs are coded in coder 106, in this example, there are 64 coded binary sequences 108 that are applied to block 110. For each baud interval, which has 64 inputs, a separate symbol (a) is generated for each block of user inputs. For each baud interval, a group of symbols (a) can be assembled as a vector. Block transform 110 assigns a separate code, such as a Walsh code or a Gold code, to each user symbol by generating a matrix to convert the sequence of symbol vectors to the corresponding Walsh or Gold codes to produce a sequence of Walsh encoded, or Gold encoded, symbols. Although Gold codes and Walsh codes are mentioned herein as examples of “spreading codes” that can be used in accordance with various embodiments disclosed herein, any set of spreading codes can be used that is capable of achieving correlation matrices with a small number of distinct eigenvalues, or with some other desired eigenvalue shaping.
In this manner, block 110 multiplies each symbol, in each baud period, for each user, by the corresponding Walsh or Gold code assigned to that user, and generates a series of vectors in which each vector encodes a symbol (a) in a given baud interval for one of the different users. If there are 64 or fewer users and the spreading factor is 64, then the transmitted vector will be a 64-vector consisting of the sum of 64 or fewer vectors. This invention accommodates any integer spreading factor and any number of users less than or equal to the spreading factor.
Another function of block 110 is to multiply each of the symbols and each of the encoding vectors by an amplitude (A), which can be thought of as weighting the symbols. This is accomplished by creating a diagonal matrix of amplitude values and performing a matrix multiplication with the transmitted vector. The assignment of the spreading codes, and the multiplication of each of these symbols by a specific amplitude, shapes the eigenvalues of the signal correlation matrix and affects the results of the de-correlation process. The number of distinct eigenvalues of the signal correlation matrix can be greatly reduced. For example, in the case of 64 different users, the number of distinct eigenvalues, without code design and power control, can be as large as 64. By designing the system, through the application of good codes such as Walsh or Gold codes, and applying an amplitude to each of the symbols, the number of distinct eigenvalues can be as low as one or two. By comparison, a standard inverse matrix de-correlation process with 64 users requires on the order of 64 cubed (262,144) steps.
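The spreading and weighting performed by block 110 can be illustrated with a minimal sketch. The example below is an illustration only, not the disclosed implementation: the spreading factor of 8, the 5 users, the common amplitude of 2.0, the Sylvester construction of the Walsh codes, and all function names are assumptions chosen for brevity. It forms the transmitted vector as the sum of weighted, spread symbols, and then checks that with orthonormal codes and equal amplitudes the signal correlation matrix has only two distinct eigenvalues (the squared amplitude for every code in use, and zero for every unused code).

```python
import math

def walsh_matrix(n):
    """Sylvester construction of an n x n Walsh-Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def spread(symbols, amplitudes, codes):
    """Transmitted vector: sum over users of amplitude * symbol * code."""
    n = len(codes[0])
    x = [0.0] * n
    for a, amp, code in zip(symbols, amplitudes, codes):
        for i in range(n):
            x[i] += amp * a * code[i]
    return x

def signal_correlation(amplitudes, codes):
    """Signal correlation matrix S A^2 S^T for unit-variance, uncorrelated symbols."""
    n = len(codes[0])
    r = [[0.0] * n for _ in range(n)]
    for amp, code in zip(amplitudes, codes):
        for i in range(n):
            for j in range(n):
                r[i][j] += amp * amp * code[i] * code[j]
    return r

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

N, K = 8, 5                                   # assumed spreading factor and user count
H = walsh_matrix(N)
codes = [[h / math.sqrt(N) for h in H[k]] for k in range(K)]   # unit-norm codes
amps = [2.0] * K                              # assumed equal power for all users
symbols = [1, -1, 1, 1, -1]                   # one symbol per user in this baud interval

x = spread(symbols, amps, codes)
R = signal_correlation(amps, codes)

# A code in use is an eigenvector of R with eigenvalue amplitude^2 = 4.
used = matvec(R, codes[0])
# A Walsh code that is not assigned to any user is an eigenvector with eigenvalue 0.
unused_code = [h / math.sqrt(N) for h in H[K]]
unused = matvec(R, unused_code)
# Correlating against a user's code recovers amplitude * symbol for that user.
despread_user1 = sum(xi * ci for xi, ci in zip(x, codes[1]))
```

With unequal amplitudes or non-orthogonal codes the eigenvalues spread out, which is why the codes and amplitudes are treated here as joint design parameters.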
Conjugate direction calculation such as disclosed in
As explained in more detail below, the conjugate direction receiver 146 can be made one to two orders of magnitude simpler by applying designed codes and amplitudes in block 110 to create an encoded, weighted signal 112 having a correlation matrix with a small number of distinct eigenvalues. This is disclosed in more detail in the THEORY SECTION, Part A (below), entitled “Warp Converging Reduced-Rank Conjugate Gradient Wiener Filters for Multi-User Detection.” In addition, the THEORY SECTION, Part B, entitled “Warp Convergence in Conjugate Gradient Wiener Filters” discloses more details of the use of Gold codes.
As indicated above, specifically designing the encoded, weighted symbols 112 can also reduce the effects of interference through careful selection of the codes (see THEORY SECTION, Part B). By selecting codes in the presence of interference, such as multi-path interference, so that the codes remain good and maintain a small number of repeated eigenvalues in their correlation matrix, the effects of interference can be filtered in the reduced rank Wiener filter in a greatly reduced number of steps. In other words, the codes selected by the block transform in block 110 can be chosen so that the goodness of the code is not affected by interference. In this manner, the effects of interference can be minimized.
Selection of codes can be done empirically, or it can be accomplished in a feedback control system which detects the effects of interference and provides control signals to adjust the selection of codes in the block transform 110. A feedback control signal can also be used to modify the amplitude (A) matrix. Even with imperfect control of codes and amplitudes, the resulting signal correlation matrix will have eigenvalues clustered around a few distinct values, and the invention disclosed here will have the desired properties of rapid convergence.
The encoded weighted symbols 112 generated by block transform 110 are applied to a parallel to serial converter 114. The parallel to serial converter 114 generates a time sequence of the weighted and spread symbols in a serial stream 116. The stream 116 is then applied to a transmit filter 118. Transmit filter 118 is a standard transmit filter that convolves the time sequence of weighted and spread symbols with a waveform (g). The waveform 120 generated by the transmit filter 118 is applied to an RF modulator 122 that upconverts the waveform 120 to a desired carrier frequency. The upconverted waveform 124 is then applied to an antenna 126 for wireless transmission via an electromagnetic wave 128. Of course, the embodiment shown in
The electromagnetic wave 128 is received by the antenna 130 in the receiver section 156 of the embodiment of the wireless communication system 100. The electromagnetic wave 128, which is detected by the antenna 130, is transmitted as an electrical signal 132 to the RF demodulator 134. The RF demodulator 134 downconverts the receiver signal to a baseband signal 136. The baseband signal 136 is applied to a receive filter 138. The receive filter 138 can be matched to the transmit filter 118, or to the convolution of the transmit filter and the channel impulse response. The output 140 of the receive filter 138 therefore constitutes a discrete-time sequence (d) that includes noise and interference introduced in the transmit channel. The serial to parallel converter 142 receives the discrete-time sequence 140 and converts sequence 140 to a parallel sequence of symbols 144. The parallel sequence of symbols 144 is then applied to the conjugate direction receiver 146.
The conjugate direction receiver 146, that is illustrated in
The electromagnetic wave 220 is received at an antenna 222, which is provided to the RF modulator 224. The RF modulator 224 downconverts the signal to a baseband signal which is applied to a receive filter 226. The receive filter 226 can operate in the same manner as receiver filter 138 illustrated in
A code 621 for a particular look direction is also applied to the conjugate direction calculator. The functions of the conjugate direction calculator 620 are disclosed in more detail in the THEORY SECTION, Part D, Section 3.2, “Application and Array Signal Processing.” The conjugate direction calculator 620 generates a reduced-rank Wiener filter steering vector 622 that is applied to space-time decoder 624. The space-time decoder 624 applies the steering vector 622 to cancel interference and noise from signal 618 and generates an image output 626 for beam forming, detection, or estimation, in which noise and interference are substantially cancelled. As described in the THEORY SECTION, Part D, the calculations performed by the conjugate direction calculator 620 are greatly simplified because of the reduced number of distinct eigenvalues in the transmitted signal. In another embodiment of the invention, no signal is transmitted as the receiver passively listens for radiated signals. The disclosure of the THEORY SECTION, Part D shows that radiated signals typically have correlation matrices with special eigenvalue shaping so that the concepts disclosed herein can be utilized.
The THEORY SECTION, Part E, entitled “Reduced-Rank Filtering: Complexity and Performance Scalable Multiuser Detectors for Multi-Rate CDMA Systems,” discloses additional implementations of the conjugate direction filter. This section discloses the manner in which the conjugate direction filter operates with mixed rate signals in which longer codes are being used by some users, while other users are using shorter codes.
The present invention therefore provides a unique method of designing information systems, such as communication systems and imaging systems, to produce approximate de-correlation results with substantially reduced complexity, in comparison to inverse correlation matrix calculations. As a result, receivers can be constructed with substantially less complexity, at less cost, and can easily operate at higher speeds. Systems can be designed with spreading codes and amplitudes such that the signal correlation matrix has only a very few distinct eigenvalues. The reduced rank Wiener filter rapidly converges to provide approximate de-correlation results in a number of steps substantially equal to the number of distinct eigenvalues. Hence, eigenvalue shaping of the transmitted signal allows fast convergence of the reduced rank Wiener filter to substantially reduce the complexity of the receiver. In applications where no design is required to achieve the desired eigenvalue shaping, the invention exploits eigenvalue shaping that exists.
The foregoing description of the invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed, and other modifications and variations may be possible in light of the above teachings. The embodiment was chosen and described in order to best explain the principles of the invention and its practical application to thereby enable others skilled in the art to best utilize the invention in various embodiments and various modifications as are suited to the particular use contemplated. It is intended that the appended claims be construed to include other alternative embodiments of the invention except insofar as limited by the prior art.
Part A—Warp Converging Reduced-Rank Conjugate Gradient Wiener Filters for Multiuser Detection
Many signal processing problems, ranging from signal detection to interference suppression to beamforming and array signal processing, involve the filtering operation $s^T R_{yy}^{-1} y$. For example, in CDMA systems, the measurement $y$ is a vector of samples from the chip matched filter outputs, containing information from all active users, the matrix $R_{yy}$ is the correlation matrix of such a data vector, and the vector $s$ is the signature code of a desired user.
In a synchronous CDMA system, the real-valued data vector $y \in \mathbb{R}^N$, obtained in one symbol interval, can be modeled as
$$y = SAb + n = \sum_{k=1}^{K} A_k b_k s_k + n, \qquad (1)$$
where $K$ is the number of active users in the system. The matrix $A=\mathrm{diag}\{A_1, A_2, \ldots, A_K\}$ contains the users' amplitudes; the $N \times K$ matrix $S=[s_1\ s_2\ \ldots\ s_K]$ contains the $K$ linearly independent users' signature vectors (normalized spreading codes); and the vector $b=[b_1\ b_2\ \ldots\ b_K]^T$ contains the independent BPSK symbols from all users. The real-valued white noise vector, $n \sim \mathcal{N}(0, \sigma^2 I)$, is assumed Gaussian distributed.
Under equation (1), the data correlation matrix, the cross-correlation matrix between the data vector and the symbols from all users, and the cross-correlation vector between the data and a desired user (say user k) all have the structures,
$$R_{yy}=E\{yy^T\}=SA^2S^T+\sigma^2 I,\qquad R_{yb}=E\{yb^T\}=SA,\qquad R_{yb_k}=E\{y\,b_k\}=A_k s_k, \qquad (2)$$
where the fact that E{b}=0 and E{bbT}=I is used. It is well known that the linear MMSE (LMMSE, also the Wiener filter) multiuser detector (MUD) is optimum among all linear detectors, in terms of maximizing the output signal to interference-plus-noise ratio (SINR) or minimizing the mean square error (MSE). The decision rule for the LMMSE MUD is as follows: centralized decision for all K users:
$$\hat{b}=\mathrm{sgn}\{A S^T R_{yy}^{-1} y\}=\mathrm{sgn}\{S^T R_{yy}^{-1} y\}, \qquad (3)$$
decentralized decision for a desired user k (k=1, . . . , K):
$$\hat{b}_k=\mathrm{sgn}\{A_k s_k^T R_{yy}^{-1} y\}=\mathrm{sgn}\{s_k^T R_{yy}^{-1} y\}. \qquad (4)$$
Note that, from the right-hand-most equalities above, the decision rules are invariant to positive scaling by the diagonal amplitude matrix A or scalar Ak for a BPSK modulation.
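The correlation structures quoted for equation (2) follow in two lines from the stated assumptions. As a brief worked sketch, assuming the linear model $y = SAb + n$ implied by the definitions of $S$, $A$, $b$, and $n$, with $E\{bb^T\}=I$, $E\{nn^T\}=\sigma^2 I$, and symbols uncorrelated with the noise:

```latex
\begin{aligned}
R_{yy} &= E\{(SAb+n)(SAb+n)^T\}
        = SA\,E\{bb^T\}\,A S^T + E\{nn^T\}
        = SA^2S^T + \sigma^2 I, \\
R_{yb} &= E\{(SAb+n)\,b^T\}
        = SA\,E\{bb^T\}
        = SA .
\end{aligned}
```

The cross terms vanish because $E\{b\,n^T\}=0$, and $A^T=A$ because $A$ is diagonal. The $k$-th column of $R_{yb}$ is $A_k s_k$, which is the cross-correlation vector for user $k$.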
In practical systems, the high dimensionality of the received vector y often makes direct evaluation of the decision statistics in equations (3) and (4) unrealistic. First, with increasing dimensionality N, complexity associated with matrix inversion is of vital concern. Second, for a realistic scenario in which the statistics are not completely known beforehand but rather have to be estimated from data, there is a conflict between the dimension of Ryy and the number of measurements required for estimating the full rank Ryy. These design constraints commonly lead to situations in which low-complexity, reduced-rank approximations to the decision statistics of (3) and (4) are required. In this paper, we study two conjugate gradient (CG) reduced-rank Wiener filters: the matrix conjugate gradient for decoding all users in one shot, and the vector conjugate gradient for decoding each user one at a time. We prove several new convergence results for MUD in CDMA systems by exploiting the idea of expanding Krylov subspaces and the nature of invariant subspaces of reduced dimension. The results indicate that for many commonly encountered systems, the “full-rank” performance can be achieved by a “reduced-rank” receiver with a complexity only slightly higher than the matched filter or correlation receiver.
It was not until the work of Goldstein et al. [1], and its subsequent connection to conjugate gradients [2], that researchers in communication and signal processing understood the connection between filter design and optimization theory. We take this connection to be now known, and aim in this paper to establish that the convergence of conjugate gradient methods is actually dramatically faster than previously thought, for designed systems like CDMA systems.
The conjugate gradient method, first introduced by Hestenes and Stiefel [3], was developed for iteratively solving quadratic minimization problems. The objective function to be minimized can be formulated as
where the vector w stands for an N-tuple of real-valued variables; the N-tuple s stands for a pre-selected known vector; and the N×N matrix R is symmetric and positive definite. The solution to this quadratic minimization problem is
In designing a linear MUD for a desired user, say user 1 with signature vector s1, the expression for the mean squared error (MSE) of estimating b1 from the real-valued synchronous CDMA data in equation (1) falls into the class of problems in (5). Specifically, we have
where w stands for the filter vector and wN=w/A1 is its scaled version; and σb
$$\mathrm{MMSE}(\hat{b}_1)=1-A_1^2\, s_1^T R_{yy}^{-1} s_1,$$
is achieved when the filter is chosen as the Wiener filter $w_{opt}=A_1 R_{yy}^{-1} s_1$ or its scaled version $w_{N,opt}=R_{yy}^{-1} s_1$.
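The minimum MSE and the Wiener filter stated above can be checked with a short derivation. The following is a sketch under the assumptions already used in this section (unit-variance BPSK symbols, so $E\{b_1^2\}=1$, and $E\{b_1 y\}=A_1 s_1$); it is offered as a consistency check rather than as the original derivation.

```latex
\begin{aligned}
\mathrm{MSE}(w) &= E\{(b_1 - w^T y)^2\}
                 = 1 - 2A_1\, w^T s_1 + w^T R_{yy}\, w, \\
\nabla_w \mathrm{MSE}(w) &= -2A_1 s_1 + 2R_{yy} w = 0
  \;\;\Longrightarrow\;\; w_{opt} = A_1 R_{yy}^{-1} s_1, \\
\mathrm{MSE}(w_{opt}) &= 1 - 2A_1^2\, s_1^T R_{yy}^{-1} s_1 + A_1^2\, s_1^T R_{yy}^{-1} s_1
                       = 1 - A_1^2\, s_1^T R_{yy}^{-1} s_1 .
\end{aligned}
```

Because $R_{yy}$ is symmetric and positive definite, the stationary point is the unique minimizer.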
We note that the decision rule in equation (4) and performance measures such as the SINR and the bit error rate (BER) will not be affected by omitting the positive scalar A1 in the Wiener solution. This invariance property can be observed from the decision rule in (4) and from the expressions for SINR and BER [4]. The SINR for a desired user (say user 1) is
which is invariant to non-zero scaling on w. Here,
is the correlation matrix due to multiple access interference (MAI) and noise. The maximum SINR, which is related to the minimum MSE, is achieved when the Wiener filter wopt is used:
The BER (see [4], [5] for details) for the desired user (user 1) is
which is invariant to positive scaling of w. Here
is the Gaussian tail probability.
Note that the expressions for SINR in (8) and the approximate BER (as a function of SINR) in (9) apply to any linear filter $w$, whether or not it is a full or reduced-rank Wiener filter. When the optimal linear filter (full-rank Wiener filter $w_{opt}$ or its scaled version $w_{N,opt}$) is used, we reach the upper bound in SINR and the lower bound on the BER for linear receivers. As mentioned earlier, the “full-rank” solution is not practical for many real-time adaptive systems of large dimension. Hence, in the sequel, we will focus on different methods for efficiently computing the reduced-rank approximation, $w_k$, to the scaled full-rank Wiener solution $w_{NWF}=R_{yy}^{-1}s_1$. Here, computationally efficient methods refer to iterative procedures for approximating the solution of $R_{yy}w=s_1$ without using matrix inversion and/or eigen-decomposition. A reduced-rank solution is especially important for adaptive MUD for CDMA systems involving large spreading gains $N$ in dynamically changing wireless environments.
Using the vector conjugate gradient (V-CG) method (with zero for the initial filter vector, $w_0=0$), we start with an initial search direction $d_1=g_1$. The vector $g_1=s_1-R_{yy}w_0$ is the initial residue vector (negative gradient). During the iterative refinement stages, a rank-$k$ approximation to the Wiener filter is formed, using the conjugate direction vectors generated by the V-CG method, as
$$w_k=w_{k-1}+\alpha_k d_k, \qquad (10)$$
where the step size $\alpha_k=\|g_k\|^2/(d_k^T R_{yy} d_k)$ is chosen to optimize the step in direction $d_k$. Subsequent search directions are chosen as the $R_{yy}$-conjugate directions
where $d_k^T R_{yy} d_m=0$ for $k \neq m$. The residue vector is updated accordingly as
$$g_{k+1}=s_1-R_{yy}w_k=g_k-\alpha_k R_{yy} d_k, \qquad (12)$$
where the first equation may have numerical advantage, while the second equation may have some computational and storage advantage. The choice of conjugate direction vectors $d_k$ decorrelates the internal variables, $z_k=d_k^T y$, in the recursion and alleviates the oscillating slow convergence commonly observed in the simple steepest descent algorithms. Note that the step size coefficients can be further simplified as $\alpha_k=s_1^T d_k/(d_k^T R_{yy} d_k)$, which are simply the scalar Wiener filters working on the decorrelated internal variables $z_k$.
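The V-CG recursion described above can be sketched in a few lines of code. The example is illustrative only: the function name `vector_cg`, the stopping tolerance, and the small 2 x 2 test matrix are assumptions, and the direction update uses the textbook conjugate gradient coefficient (ratio of successive squared residue norms), which is assumed here to match the conjugate direction update referred to in the text.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(m, v):
    return [dot(row, v) for row in m]

def vector_cg(R, s, max_steps=None, tol=1e-10):
    """Iteratively approximate w = R^{-1} s for symmetric positive definite R.

    Returns the filter vector and the number of steps taken, so the
    rank-k approximation after k steps is available by setting max_steps=k.
    """
    n = len(s)
    w = [0.0] * n                 # w_0 = 0
    g = list(s)                   # g_1 = s - R w_0 = s (negative gradient)
    d = list(g)                   # d_1 = g_1
    gg = dot(g, g)
    limit = n if max_steps is None else max_steps
    steps = 0
    while steps < limit and math.sqrt(gg) > tol:
        Rd = matvec(R, d)
        alpha = gg / dot(d, Rd)                            # step size along d_k
        w = [wi + alpha * di for wi, di in zip(w, d)]      # filter update, eq. (10)
        g = [gi - alpha * ri for gi, ri in zip(g, Rd)]     # residue update, eq. (12)
        gg_new = dot(g, g)
        beta = gg_new / gg                                 # textbook CG coefficient (assumed)
        d = [gi + beta * di for gi, di in zip(g, d)]       # next R-conjugate direction
        gg = gg_new
        steps += 1
    return w, steps

# Small hypothetical example: R = [[4, 1], [1, 3]], s = [1, 2].
R_demo = [[4.0, 1.0], [1.0, 3.0]]
s_demo = [1.0, 2.0]
w_demo, steps_demo = vector_cg(R_demo, s_demo)
print(w_demo, steps_demo)
```

Each step costs one matrix-vector product, and no matrix inverse or eigen-decomposition is formed. For this 2 x 2 example the exact solution is (1/11, 7/11).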
We point out that there are two main ways to decorrelate the internal variables, $z_k=T_k^T y$, during the reduced-rank pre-processing of data $y$, namely the eigen-subspace approach (choosing columns of $T_k=[u_1\ u_2\ \ldots\ u_k]$ as the eigenvectors of $R_{yy}$) and the Krylov-subspace approach (choosing columns of $T_k=[d_1\ d_2\ \ldots\ d_k]$ as the $R_{yy}$-conjugate direction vectors). Examples of the eigen-subspace approach are the principal component (PC) [6] and the cross-spectral metric (CSM) methods [7]. Examples of the Krylov subspace approach are the conjugate gradient (CG), the conjugate direction (CD), and multistage Wiener filter (MSWF) methods [1], [2]. The mean squared error (MSE) expressions for the full-rank Wiener filter, the rank-$k$ conjugate gradient (CG) Wiener filter, and the rank-$k$ cross-spectral metric (CSM) Wiener filter for estimating $b_1$ from data $y$ in (1) are summarized as [8], [9]
where the vectors $d_l$ are the $R_{yy}$-conjugate direction vectors generated from the V-CG method, and the vectors $u_l$ and scalars $\lambda_l$ are the eigenvectors and eigenvalues of the data covariance matrix $R_{yy}$, respectively. There are several points to be made about these formulas. First, the direction vectors $d_l$ and $u_l$ are computed without regard for minimization of the MSE. But once chosen, the step sizes are optimized to minimize the mean squared error that can be achieved with an estimator in the Hilbert space spanned by the random variables $z_l=d_l^T y$ or $v_l=u_l^T y$. Second, the rank-$k$ CSM Wiener filter and the rank-$k$ PC Wiener filter have the same MSE expression; the difference lies in how they choose the eigenvectors $u_l$ $(l=1, \ldots, k)$. From the MSE expressions, we note that the best choice for the conjugate direction vector $d_l$ in the CG-WF to maximize the quantity
would be the full-rank WF $d_l=R_{yy}^{-1}s_1$. However, without resorting to matrix inversion, the CG-WF uses a few simpler steps to reach the full-rank WF solution. There exists a best choice of conjugate direction (CD) vectors $d_l$ such that the MSE decreases the most at each stage, and a rank-$k$ CG-WF simply uses one set of CD vectors initialized by the desired signal vector $s_1$ and the gradients. The computational advantage of the CG-WF (without eigen-analysis) over the CSM-WF (with eigen-analysis and sorting) is obvious. Using a block diagram, we present, in
In this section, we present convergence results, general and warp, for the reduced-rank V-CG WF for the CDMA application in equation (1).
It has been shown in [10] that for a general quadratic problem the V-CG algorithm converges in $N$ steps, where $N$ is the dimension of the measurement $y$ and the correlation matrix $R_{yy}$. However, when designing a MUD in a CDMA system, we show in this work that the V-CG converges in, at most, $K$ steps ($K \le N$), where $K$ is the number of active users in the system. This slightly improves on the result in [10], [2], in which convergence was guaranteed in at most $(K+1)$ steps. This convergence is due to the fact that the $R_{yy}$-conjugate direction $d_k$ formed at each stage of the V-CG belongs to an expanding Krylov subspace, as does the corresponding reduced-rank Wiener filter vector $w_k$. Specifically, we have
where $\mathcal{K}_k(R_{yy}, s_1)=\langle s_1, R_{yy}s_1, \ldots, R_{yy}^{k-1}s_1\rangle$ is the Krylov subspace of dimension $k$ [10]. Using the fact that,
we can further explicitly write the basis vectors of the expanding subspace Kk(Ryy, s1) as
where the coefficients βk,l depend on the signatures sk and the amplitudes Ak of all active users, and the noise power σ2.
We notice that the expanding subspace is contained in the signal subspace <S> spanned by the signature vectors of all active users. Each basis vector in the Krylov subspace is therefore a linear combination of the K code vectors {s1, s2, . . . , sK}, and there can be no more than K such linearly independent vectors. Therefore, in the general CDMA case, the subspace Kk(Ryy, s1) stops expanding in dimension after at most the k=K iteration. This concludes our sketch proof of the (at most) K-stage (rank-K) convergence of the V-CG MUD.
When implementing the MUD using the V-CG method, practical convergence occurs whenever the residue vector in equation (12) is smaller in norm than a preset threshold. Experimentally, we have observed that there exists an exact convergence in the V-CG method at warp speed, meaning that the subspace stops expanding at a stage well below $K$, for certain important applications in signal processing, adaptive beamforming, and wireless communications. In a CDMA system, under ideal near-far ratio (NFR) conditions among all users, using the same-length Gold codes, convergence can actually be reached either at the 2nd (for the use of a good set of Gold codes) or at the 4th (for the use of a bad set of Gold codes) iteration, independent of the number of users $K$ and the spreading code length $N$. Even further, with a group-wise power scheme in combination with spreading code design, we can limit the convergence steps to certain prescribed values. This result is very useful in the down-link of CDMA systems, where power control can be achieved almost ideally for all active users (in groups). Here the distributed receiver for each desired user only needs to use its own signature code to achieve a performance equal to the full-rank LMMSE MUD in just $L=2$ to $4$ iterations ($L \ll K \le N$), independent of the total number of active users $K$ and the spreading Gold code length $N$. The underlying reason for such a warp convergence in the reduced-rank Wiener filter is the small number of distinct eigenvalues of the data covariance matrix $R_{yy}$, which induces a matrix annihilating equation of order $L$ ($L \ll N$) [9]. We summarize, in a Theorem, the necessary and sufficient conditions for warp convergence in the V-CG reduced-rank Wiener filters.
Theorem (Warp Convergence in V-CG Reduced-Rank Wiener Filters):
The reduced-rank conjugate gradient Wiener filter $w_k$ that lies in the Krylov subspace $\mathcal{K}_k(R_{yy}, s_1)$ yields the full-rank Wiener filter $w_{opt}=R_{yy}^{-1}s_1$ in at most $k=L-1$ steps, where $L$ is the number of distinct eigenvalues of the matrix $R_{yy}$.
Proof: We use the concept of invariant subspace and orthogonal projections to prove the Theorem in a constructive way. A more general Theorem for applications with models other than (1) can be found in [9]. For a covariance matrix $R_{yy}$ of $L$ distinct groups of eigenvalues, using the spectral factorization theorem, for any integer $1 \le k \le L$, we have
$$R_{yy}^{k}=\lambda_1^{k}P_1+\cdots+\lambda_L^{k}P_L$$
where $P_l=Q_lQ_l^T$ is the orthogonal projection matrix onto the invariant subspace $\mathcal{Q}_l=\langle Q_l\rangle$. Columns of the rank-$m_l$ matrix $Q_l$ are the eigenvectors of $R_{yy}$ associated with the eigenvalue $\lambda_l$ (with a multiplicity $m_l$).
For a CDMA system in (1), we have $R_{yy}=SA^2S^T+\sigma^2 I$, then
$$\lambda_1=\sigma_1^2+\sigma^2,\ \ldots,\ \lambda_{L-1}=\sigma_{L-1}^2+\sigma^2,\ \lambda_L=\sigma^2.$$
Furthermore, we have $P_L S=0_{N\times K}$. Therefore, for every $s_1 \in \langle S\rangle$,
holds for every integer k. Therefore, we have,
Hence, the (L−1)-step convergence result for the V-CG method holds.
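The (L−1)-step result can be observed numerically with a small experiment. The sketch below is an assumption-laden illustration, not the simulation reported in this work: instead of Gold codes it uses a synthetic set of unit-norm signatures with a common pairwise correlation `RHO`, and the dimensions, noise power, amplitudes, and function names are all hypothetical. With equal amplitudes, the data correlation matrix then has L = 3 distinct eigenvalues (two from the signal part plus the noise floor), so conjugate gradient started from a signature vector should stop after L − 1 = 2 steps; with unequal amplitudes the eigenvalues spread out and more steps are needed.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(m, v):
    return [dot(row, v) for row in m]

def cg_steps(R, s, tol=1e-9):
    """Vector CG on R w = s from w = 0; returns (w, steps until residue norm <= tol)."""
    w = [0.0] * len(s)
    g = list(s)
    d = list(g)
    gg = dot(g, g)
    steps = 0
    while math.sqrt(gg) > tol and steps < len(s):
        Rd = matvec(R, d)
        alpha = gg / dot(d, Rd)
        w = [wi + alpha * di for wi, di in zip(w, d)]
        g = [gi - alpha * ri for gi, ri in zip(g, Rd)]
        gg_new = dot(g, g)
        d = [gi + (gg_new / gg) * di for gi, di in zip(g, d)]
        gg = gg_new
        steps += 1
    return w, steps

def equicorrelated_signatures(n, k, rho):
    """k unit-norm vectors in R^n with pairwise inner product rho (synthetic stand-in codes)."""
    sigs = []
    for j in range(1, k + 1):
        s = [0.0] * n
        s[0] = math.sqrt(rho)          # shared component creates the cross-correlation
        s[j] = math.sqrt(1.0 - rho)    # private component
        sigs.append(s)
    return sigs

def data_correlation(sigs, amps, noise_var):
    """R_yy = S A^2 S^T + noise_var * I."""
    n = len(sigs[0])
    R = [[noise_var if i == j else 0.0 for j in range(n)] for i in range(n)]
    for s, a in zip(sigs, amps):
        for i in range(n):
            for j in range(n):
                R[i][j] += a * a * s[i] * s[j]
    return R

N, K, RHO, NOISE = 10, 6, 0.3, 0.1      # hypothetical system parameters
S = equicorrelated_signatures(N, K, RHO)

# Equal power: three distinct eigenvalues in R_yy, so two CG steps are expected.
R_equal = data_correlation(S, [1.0] * K, NOISE)
w_equal, steps_equal = cg_steps(R_equal, S[0])

# Unequal power: eigenvalues no longer coincide, so convergence takes longer.
R_unequal = data_correlation(S, [1.0 + 0.2 * k for k in range(K)], NOISE)
w_unequal, steps_unequal = cg_steps(R_unequal, S[0])

print(steps_equal, steps_unequal)
```

The step count in the equal-power case does not change if `K` or `N` is increased, because it is governed by the number of distinct eigenvalues rather than by the matrix size.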
Note that our warp convergence theorem reveals important fundamentals behind the observation (based on the Cayley-Hamilton Theorem) that a small number of stages is needed in the polynomial expansion (PE) multiuser detector or the Cayley-Hamilton detector [11], [12]. However, we show that, contrary to common belief, it is the order, $L$, of the minimal polynomial, $g(\lambda)=(\lambda-\lambda_1)(\lambda-\lambda_2)\cdots(\lambda-\lambda_L)$, that determines convergence, and not the rank, $K$, of the signal covariance matrix
or the code Grammian $S^TS$. This order is determined by the number of distinct eigenvalues of the signal covariance matrix
which is related to the number of distinct eigenvalues of the signal Grammian $G=AG_{SS}A$, where $G_{SS}=S^TS$ is the code Grammian.
In a multiuser application of the V-CG algorithm, K V-CG algorithms, each using a different initial vector $s_k$, must be run in parallel. Moreover, in these parallel filters there will be correlation between the internal variables $u_l^{(k)} = d_l^{(k)T} y$ that is not exploited. Therefore, a bank of K uncoupled V-CG filters makes sub-optimum use of the data. The matrix conjugate gradient (M-CG) algorithm exploits these correlations to converge faster and, at the same time, to decode all active users in one shot instead of using K parallel V-CG procedures for all K users.
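The matrix Wiener filter applied to the problem of designing a MUD for the information vector of all K active users in a CDMA system is
$$W_{WF} = \arg\min_{W} J(W),$$
where the objective function is
$$J(W) = E\{\| b - W^{T} y \|_F^{2}\} = \mathrm{tr}\{E[(b - W^{T} y)(b - W^{T} y)^{T}]\},$$
where $\|\cdot\|_F$ stands for the Frobenius norm, and $\mathrm{tr}(\cdot)$ is the trace operator. The minimum MSE,
$$\mathrm{MMSE}(\hat{b}) = \mathrm{tr}\{I - R_{yb}^{T} R_{yy}^{-1} R_{yb}\},$$
is achieved when the matrix Wiener filter $W_{WF} = R_{yy}^{-1} R_{yb}$ is used.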
For CDMA applications, we have $R_{yb} = S A$, so the matrix Wiener filter, the associated MUD for decoding the BPSK symbol vector $b$ of all K users, and the MMSE are
$$W_{WF} = R_{yy}^{-1} S A, \quad \hat{b} = \mathrm{sgn}\{W_{WF}^{T} y\} = \mathrm{sgn}\{S^{T} R_{yy}^{-1} y\}, \quad \mathrm{MMSE}(\hat{b}) = \mathrm{tr}\{I - A S^{T} R_{yy}^{-1} S A\}.$$
Note again that the diagonal amplitude matrix $A$ may be omitted from $W_{WF}$ without affecting the decision rule, the SINR, or the BER of the MUD. Our aim is to implement a normalized Wiener filter $W_{NWF} = R_{yy}^{-1} S$ for MUD using an M-CG algorithm.
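The remark that the amplitude matrix A can be dropped from the filter without changing the decisions can be checked with a short sketch. The random binary signatures, the amplitude range, and the sizes below are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, sigma2 = 15, 10, 0.1                               # assumed illustrative sizes
S = rng.choice([-1.0, 1.0], size=(N, K)) / np.sqrt(N)    # assumed unit-norm signatures
A = np.diag(rng.uniform(0.5, 2.0, size=K))               # positive user amplitudes
R = S @ A @ A @ S.T + sigma2 * np.eye(N)                 # R_yy = S A^2 S^T + sigma^2 I

b = rng.choice([-1.0, 1.0], size=K)                      # BPSK symbols of all users
y = S @ A @ b + np.sqrt(sigma2) * rng.standard_normal(N) # one received vector

W_wf = np.linalg.solve(R, S @ A)                         # matrix Wiener filter R^{-1} S A
W_nwf = np.linalg.solve(R, S)                            # normalized filter R^{-1} S
b_wf = np.sign(W_wf.T @ y)                               # decisions with A included
b_nwf = np.sign(W_nwf.T @ y)                             # decisions with A omitted

# MMSE = tr{I - A S^T R^{-1} S A}
mmse = np.trace(np.eye(K) - A @ S.T @ np.linalg.solve(R, S @ A))
print("same decisions:", np.array_equal(b_wf, b_nwf))
print("MMSE:", mmse)
```

Because A is diagonal with positive entries, it only rescales each decision statistic, so the two sign vectors agree.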
The M-CG algorithm for implementing the matrix Wiener filter and the MUD starts with the initial matrix filter W0=0N×K, the search direction matrix D1=G1, and the initial residual matrix G1=S−RyyW0. The M-CG algorithm updates the filtering matrix as [13]
$$W_k = W_{k-1} + D_k V_k,$$
where the step size matrix is calculated as follows:
$$V_k = (D_k^{T} R_{yy} D_k)^{-1} G_k^{T} G_k.$$
Similarly the residual matrix is updated according to
$$G_{k+1} = G_k - R_{yy} D_k V_k; \quad G_1 = S.$$
The R-conjugate direction matrix is updated according to
$$D_{k+1} = G_{k+1} + D_k (G_k^{T} G_k)^{-1} G_{k+1}^{T} G_{k+1}; \quad D_1 = S.$$
As a matter of fact, when the initial values are chosen as
$$W_0 = 0_{N\times K}, \quad D_1 = G_1 = S,$$
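then we notice the important fact that the updated residual matrix $G_2$ is simply a zero matrix:
$$G_2 = G_1 - R_{yy} D_1 V_1 = S - R_{yy} S (S^{T} R_{yy} S)^{-1} S^{T} S = 0_{N\times K}.$$
The last equation can be seen from the orthogonality between subspaces $\langle G_2 \rangle$ and $\langle S \rangle$. That is,
$$S^{T} G_2 = 0_{K\times K} \;\Rightarrow\; \langle G_2 \rangle \perp \langle S \rangle,$$
$$G_2 = S T_R = T_L S \;\Rightarrow\; \langle G_2 \rangle \subset \langle S \rangle.$$
Combining these two results, we have $G_2 = 0_{N\times K}$.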
This result shows that the M-CG method converges to the full-rank normalized Wiener filter $W_{NWF} = R_{yy}^{-1} S = S (S^{T} S + \sigma^{2} A^{-2})^{-1} A^{-2}$ in just one step. That is, we can verify the fact that $W_1 = D_1 V_1 = W_{NWF}$. This result is especially useful in the up-link of a CDMA system, where knowledge of all active users' signatures is available, and one-step convergence for the M-CG MUD guarantees MMSE performance for all users in one shot. Here, we should point out that in decoding information for all K active users, the V-CG based MUD needs operations on the order of $O(k_{step} K N^{2})$, while the M-CG based MUD needs operations on the order of $O(K^{3} + K^{2} N) = O(K^{2} N)$.
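The one-step convergence of the M-CG recursion can be verified numerically. The sketch below performs a single M-CG step from $W_0 = 0$, $D_1 = G_1 = S$ and compares the result with the direct solution and with the closed form quoted above. The Gaussian signature matrix, the amplitudes, and the sizes are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, sigma2 = 15, 10, 0.1                       # assumed illustrative sizes
S = rng.standard_normal((N, K)) / np.sqrt(N)     # assumed full-column-rank signatures
A = np.diag(rng.uniform(0.5, 2.0, size=K))       # positive amplitudes
R = S @ A @ A @ S.T + sigma2 * np.eye(N)         # R_yy = S A^2 S^T + sigma^2 I

# M-CG initialization: W0 = 0, D1 = G1 = S.
W0 = np.zeros((N, K))
G1 = S - R @ W0
D1 = G1.copy()

# One M-CG step.
V1 = np.linalg.solve(D1.T @ R @ D1, G1.T @ G1)   # step size matrix V_1
W1 = W0 + D1 @ V1                                # updated filter matrix
G2 = G1 - R @ D1 @ V1                            # residual after one step

W_nwf = np.linalg.solve(R, S)                    # direct R_yy^{-1} S
A_inv2 = np.linalg.inv(A @ A)
W_closed = S @ np.linalg.inv(S.T @ S + sigma2 * A_inv2) @ A_inv2

print("||G2||          :", np.linalg.norm(G2))
print("||W1 - R^-1 S|| :", np.linalg.norm(W1 - W_nwf))
print("||W1 - closed|| :", np.linalg.norm(W1 - W_closed))
```

All three printed norms should be at rounding level, which is the numerical counterpart of $G_2 = 0_{N\times K}$.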
It has been shown [2] that the orthogonal multi-stage Wiener filter (OMSWF) constructs the same normalized rank-k filter vector $w_k$ as the V-CG procedure at each stage of iteration. In [14] this result is generalized to establish a one-to-one correspondence between the set of conjugate direction filters and (non-orthogonal) multi-stage Wiener filters [1], [15]. The only difference between the CGWF and the OMSWF is that the OMSWF successively uses a generalized sidelobe cancellation (GSC) idea to build a rank-k Wiener filter, $w_k$, to approximate the full-rank Wiener filter in nested stages. In doing so, the rank-k Wiener filter is refined in the expanding Krylov subspace $K_k(R_{yy}, s_1)$, spanned by a rank-k pre-processing matrix $T_k^{(OMSWF)}$ with orthogonal columns. The matrix $T_k^{(OMSWF)}$ (applied on data y) tri-diagonalizes the $R_{yy}$ matrix. With much simplified computation, the V-CG method successively constructs the rank-k Wiener filter that is refined in the same expanding Krylov subspace spanned by the $R_{yy}$-conjugate direction vectors $T_k^{(VCGWF)} = [d_1\; d_2\; \ldots\; d_k]$. The pre-processing matrix $T_k^{(VCGWF)}$ (applied on data y) diagonalizes the $R_{yy}$ matrix, hence bringing further simplification to the reduced-rank filter structure. Therefore, the warp convergence results obtained for the V-CG MUD designs also apply to the OMSWF MUD designs.
Computer simulations and numerical evaluations based on analytical results in our previous paper [4] are given in this section. In all the experiments, we designed a CDMA system with K=10 active users, each using a distinct length N=15 Gold code. The SNR for the desired user (user 1) is fixed at $\mathrm{SNR}_1 = A_1^{2}/\sigma^{2}$ = 11 dB. To verify our results on the convergence of the CG WF MUD, different near-far ratios (NFRs) are used in the experiments.
In
as well as the signal Grammian G contains only two distinct eigenvalues. Using this and the subspace expansion in (15) we can argue that in this case the V-CG WF MUD converges to the optimal full-rank MUD in just 2 steps.
In
We have introduced the matrix conjugate gradient (M-CG) method into the design of reduced-rank Wiener filters for multiuser detection in CDMA systems, and proved warp convergence for the vector conjugate gradient (V-CG) method. Using a subspace framework, we have shown that the V-CG WF MUD converges in at most K steps while the M-CG WF MUD converges in just one step. With power control among all users using Gold codes as their spreading codes, we have proved warp convergence for the reduced-rank V-CG WF MUD in 2 to 4 steps, independent of the number of users and the spreading factor. Furthermore, we can properly design a groupwise power allocation scheme among all users using Gold codes, so that the rank-4 V-CG WF MUD is guaranteed to deliver the performance of the optimal full-rank WF MUD. When the data correlation matrix is replaced by the code correlation matrix, the results obtained here for the CG WF MUD also apply to the design of decorrelating MUDs.
Many advanced signal processing systems, ranging from linear filters for space-time adaptive processing in radar and sonar systems, to signal separation and interference suppression in communications, involve such filtering operations as $s^{H} R_{yy}^{-1} s$ and $s^{H} R_{yy}^{-1} y$, where $R_{yy}$ is the covariance matrix of the measurement vector $y$, and the vector $s$ denotes the signal mode of interest. The signal mode $s$ can either be parameterized by the temporal/spatial frequencies in space-time adaptive processing, or as the signature vectors in a spread spectrum CDMA system. In practice, the dimension, N, of the vectors $s$, $y$ and the matrix $R_{yy}$ may be very large. This not only causes slow convergence during the real-time adaptation of the systems in dynamically changing interference environments, but also causes performance degradation of the system when only a small number of data samples are available for estimating the data statistics.
In this work, we study the conjugate gradient (CG) method [1] from the perspective of expanding subspaces, for iteratively designing reduced-rank (RR) Wiener filters (WF) [2], [3], [4] in array processing, robust beamforming and communications. At each iteration stage of the CG algorithm, a RR WF, contained in an expanding Krylov subspace, Kl(Ryy, s) [3], [5], is constructed. We study in-depth the cases when the Gram matrix of the signal modes has a structure that leads to warp convergence of the RRCG WF [6]. This finding will be useful during the adaptive implementation of systems for communication and array processing. Computer simulations verify the remarkably fast convergence of the RRCG Wiener filters predicted by our theoretical results. This warp convergence is much faster than predicted by standard results in [3], [5].
Let us assume that the measurement vector y follows a linear model,
y=Sθ+n, (15)
where the matrix $S = [s_1\; s_2\; \ldots\; s_K]$ contains signal modes for all sources; the random vector $\theta = [\theta_1\; \theta_2\; \ldots\; \theta_K]^{T}$ contains information carried by signal modes; the noise vector, $n \sim CN(0, \sigma^{2} I)$, is proper complex white Gaussian. In this work, we choose the mode $s_1$ as the desired signal mode for CDMA communications and beamforming on a single source. The rest of the modes, $s_k$ ($k = 2, 3, \ldots, K$), are treated as interference.
The full-rank data covariance matrix $R_{yy}$ often has the structure,
$$R_{yy} = S P S^{H} + \sigma^{2} I,$$
where the diagonal matrix $P = \mathrm{diag}\{P_1, P_2, \ldots, P_K\}$, with $P_k = E\{|\theta_k|^{2}\}$, contains the power for each of K uncorrelated sources. The Gram matrix of the signal mode matrix is $G_{SS} = S^{H} S$. Previous convergence results [3], [5] have been based on the rank K of the structured part of $R_{yy}$. Our results are based on a more refined analysis of the eigen-structure of $R_{yy}$.
We have been motivated in our study by the application of the RRCG WF to problems where the Grammian SHS has a small number of distinct eigenvalues, compared with its rank. Rather amazingly, this situation arises in many branches of engineering and statistics.
The equivalence between the vector conjugate gradient (V-CG) method and the orthogonal multi-stage (OMS) approach for Wiener filter design was reported in [3] and generalized in [4]. To approximate the full-rank Wiener filter, at the k-th stage of iteration, both approaches construct a rank-k Wiener filter, wk, that lies in the expanding Krylov subspace Kk(Ryy, s1). The OMS approach [3] uses the orthogonal gradient vectors to construct the RR WF, which relies on the nested scalar synthesis filters. With much simplified computation, the V-CG method uses the Ryy-conjugate direction vectors dk to construct the RR WF. Specifically, using the V-CG method with initial w0=0 and d0=s1, a rank-k Wiener filter is [6]
$$w_k = \sum_{l=0}^{k-1} \gamma_l d_l,$$
where the conjugate direction vectors satisfy $d_l^{H} R_{yy} d_m = 0$ for $l \neq m$. The coefficients $\gamma_l$ are simply the scalar Wiener filters working on the decorrelated internal variables, $d_l^{H} y$.
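The conjugate-direction form of the rank-k filter can be sketched as follows. For brevity the directions are obtained here by Gram-Schmidt conjugation of the Krylov vectors, which is a substitute for the V-CG recurrences and spans the same subspaces. The coefficient formula $\gamma_l = d_l^{T} s_1 / (d_l^{T} R_{yy} d_l)$, the real-valued model with unit powers, and the sizes are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
N, K, sigma2 = 8, 3, 0.5                     # assumed illustrative sizes
S = rng.standard_normal((N, K))              # assumed real-valued signal modes
R = S @ S.T + sigma2 * np.eye(N)             # R_yy = S S^T + sigma^2 I (unit powers)
s1 = S[:, 0]

# R-conjugate directions from the Krylov vectors s1, R s1, R^2 s1
# (Gram-Schmidt conjugation, used in place of the V-CG recurrences).
directions = []
v = s1.copy()
for _ in range(K):
    d = v.copy()
    for p in directions:
        d -= (p @ R @ v) / (p @ R @ p) * p   # remove the R-component along p
    directions.append(d)
    v = R @ v
    v /= np.linalg.norm(v)                   # rescale; the span is unchanged
D = np.column_stack(directions)

# Scalar Wiener coefficient for each direction.
gamma = (D.T @ s1) / np.einsum("ij,ij->j", D, R @ D)
w_k = np.cumsum(D * gamma, axis=1)           # column k-1 holds the rank-k filter

C = D.T @ R @ D                              # diagonal if the directions are R-conjugate
off_diag = np.abs(C - np.diag(np.diag(C))).max()
w_full = np.linalg.solve(R, s1)
print("largest off-diagonal of D^T R D:", off_diag)
print("error of rank-K filter:", np.linalg.norm(w_k[:, -1] - w_full))
```

Since $s_1$ lies in $\langle S \rangle$, the rank-K filter in this example already coincides with the full-rank solution, matching the K-step result cited for this model.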
In this work, we prove that the RRCG Wiener filter,
$$w_l^{(RRCG)} = S_l \mu_l \in \mathrm{Range}(S_l),$$
$$S_l = [s_1, R_{yy} s_1, \ldots, R_{yy}^{l-1} s_1],$$
converges to the full-rank Wiener filter, wWF=Ryy−1s1, in at most L steps, where the number of distinct eigenvalues of Ryy is L. It converges in at most (L−1) steps when Ryy has the structured form of (2) and the vector s1 is an element of <S>. This means a rank-L RRCG WF delivers the same performance as that of the full-rank WF. This is due to the fact that at each iteration of the V-CG method, a RR-WF is constructed from the expanding Krylov subspace Kl(Ryy, s1)=<Sl>, and the subspace does not expand past l=L. In other words, for applications where L<<K, we can reduce the rank down to L during the construction of the RR-WF using the V-CG method, yet still deliver the same performance as the full-rank WF.
Note that earlier results on the (K+1)-step [3], [5] or the K-step [6] convergence for Ryy of the form (2) still apply to problems without repeated eigenvalues.
We summarize our results for the warp convergence of the reduced-rank vector conjugate gradient (V-CG) Wiener filter using a Theorem and a Corollary.
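Theorem: For $1 \le l \le N$ and $1 \le L \le N$, the maximum dimension of the Krylov subspace $K_l(R_{yy}, s)$, over all $s \in \mathbb{C}^{N}$, equals $\min(l, L)$ if and only if the number of distinct eigenvalues of $R_{yy}$, denoted $\#\mathrm{dev}(R_{yy})$, is L. That is,
$$\max_{s \in \mathbb{C}^{N}} \dim K_l(R_{yy}, s) = \min(l, L) \iff \#\mathrm{dev}(R_{yy}) = L.$$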
Remarks: This is an if and only if statement that may be interpreted to read “The Krylov subspace stops expanding in a number of steps that cannot exceed the number of distinct groups of eigenvalues of Ryy”. For many applications, for which Ryy=Rss+σ2I, this number L is much smaller than rank(Rss)+1, which is the maximum dimension of Kl(Ryy, s) for covariance matrices Rss that have distinct eigenvalues. So, contrary to common belief, it is the minimal polynomial of Ryy that determines convergence, and not the characteristic polynomial, as our proof will show.
Proof: The proof consists of two parts.
(The sufficient condition or the “if part”)
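Assume that $\#\mathrm{dev}(R_{yy}) = L$. Then the characteristic polynomial of $R_{yy}$ is
$$\Delta_R(\lambda) = (\lambda-\lambda_1)^{m_1} (\lambda-\lambda_2)^{m_2} \cdots (\lambda-\lambda_L)^{m_L},$$
with $m_1 + m_2 + \cdots + m_L = N$. The polynomial $\Delta_R(\lambda)$ is divisible by the minimal polynomial
$$g(\lambda) = (\lambda-\lambda_1)(\lambda-\lambda_2) \cdots (\lambda-\lambda_L),$$
and $g(R_{yy}) = 0_{N\times N}$. Hence, the vectors $R_{yy}^{l} s$ cannot expand the dimension of the Krylov subspace $K_l(R_{yy}, s)$ for any $l \ge L$.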
(The necessary condition or the “only if part”)
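Let L be the integer for which
$$\max_{s \in \mathbb{C}^{N}} \dim K_l(R_{yy}, s) = \min(l, L)$$
holds. Then, L is the smallest integer such that
$$\sum_{l=0}^{L} \alpha_l R_{yy}^{l} = 0_{N\times N}$$
for some $\alpha_0, \alpha_1, \ldots, \alpha_{L-1}$, and $\alpha_L = 1$.
This makes
$$g(\lambda) = \sum_{l=0}^{L} \alpha_l \lambda^{l}$$
the minimal polynomial of $R_{yy}$. This minimal polynomial divides the characteristic polynomial, $\Delta_R(\lambda)$, where
$$\Delta_R(\lambda) = (\lambda-\lambda_1)^{m_1} (\lambda-\lambda_2)^{m_2} \cdots$$
and
$$g(\lambda) = (\lambda-\lambda_1)^{\mu_1} (\lambda-\lambda_2)^{\mu_2} \cdots$$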
with 1≦μl≦ml. The parameter μl, which is the size of the largest Jordan block corresponding to λl, is 1 for any diagonalizable matrix. Thus the minimal polynomial g(λ) has order μ=L, which is the number of distinct eigenvalues of Ryy. Q.E.D.
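The statement of the Theorem can be illustrated numerically. The sketch below builds a symmetric matrix with an assumed spectrum of three distinct eigenvalues, measures the dimension of the Krylov subspace for a generic starting vector, and evaluates the minimal polynomial at the matrix. The helper `krylov_dim`, the eigenvalues 1, 2, 4, their multiplicities, and N=12 are all hypothetical choices.

```python
import numpy as np

def krylov_dim(R, s, l, tol=1e-8):
    """Numerical dimension of span{s, R s, ..., R^(l-1) s}."""
    cols, v = [], s.copy()
    for _ in range(l):
        cols.append(v / np.linalg.norm(v))   # normalize; the span is unchanged
        v = R @ v
    return int(np.linalg.matrix_rank(np.column_stack(cols), tol=tol))

rng = np.random.default_rng(3)
N = 12
eigs = np.repeat([1.0, 2.0, 4.0], [5, 4, 3])          # L = 3 distinct eigenvalues (assumed)
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))      # random orthogonal eigenvectors
R = Q @ np.diag(eigs) @ Q.T                           # symmetric matrix with that spectrum

s = rng.standard_normal(N)                            # generic starting vector
dims = [krylov_dim(R, s, l) for l in range(1, 7)]
print("Krylov dimensions for l = 1..6:", dims)

# Minimal polynomial g(R) = (R - 1 I)(R - 2 I)(R - 4 I), of order L = 3.
I = np.eye(N)
g_of_R = (R - 1.0 * I) @ (R - 2.0 * I) @ (R - 4.0 * I)
print("||g(R)||:", np.linalg.norm(g_of_R))
```

The dimension grows as min(l, 3) and then stalls, while the order-3 polynomial annihilates the 12×12 matrix even though its characteristic polynomial has order 12.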
Note that for many applications, we have Ryy=Rss+σ2I, with Rss=SSH, and the initial vector s used to construct the Krylov subspace Kl(Ryy, s) belongs to the signal subspace <S>. For such a choice of sε<S>, we also develop a Corollary of the warp convergence Theorem.
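Corollary: Assume $R_{yy} = S S^{H} + \sigma^{2} I$. For $1 \le l \le N$ and $1 \le L \le N$, the maximum dimension of the Krylov subspace $K_l(R_{yy}, s)$, over all $s \in \langle S \rangle$, equals $\min(l, L-1)$ if and only if the number of distinct eigenvalues of $R_{yy}$, $\#\mathrm{dev}(R_{yy})$, is L. That is,
$$\max_{s \in \langle S \rangle} \dim K_l(R_{yy}, s) = \min(l, L-1) \iff \#\mathrm{dev}(R_{yy}) = L.$$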
Remarks: This Corollary says that if the initial vector satisfies $s \in \langle S \rangle$ instead of $s \in \mathbb{C}^{N}$, the warp convergence rank $(L-1)$ is determined by the number of distinct eigenvalues of the K×K Grammian $G_{SS} = S^{H} S$. The N×N signal covariance matrix $R_{ss} = S S^{H}$, which is of rank K, has the same number, L, of distinct eigenvalues as $R_{yy}$. This number, L, is one more than the number of distinct eigenvalues of the Grammian $G_{SS}$.
Proof: We use the concept of invariant subspace and orthogonal projections to prove the Corollary in a constructive way. For a covariance matrix Ryy of L distinct groups of eigenvalues, using the spectral factorization theorem, for any integer 1≦l≦L, we have
$$R_{yy}^{l} = \lambda_1^{l} P_1 + \cdots + \lambda_L^{l} P_L$$
where $P_l = Q_l Q_l^{H}$ is the orthogonal projection matrix onto the invariant subspace $\mathcal{Q}_l = \langle Q_l \rangle$. The columns of the rank-$m_l$ matrix $Q_l$ are the eigenvectors of $R_{yy}$ associated with the eigenvalue $\lambda_l$ (with multiplicity $m_l$).
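For $R_{yy} = S S^{H} + \sigma^{2} I$, we have
$$\lambda_1 = \sigma_1^{2} + \sigma^{2}, \; \ldots, \; \lambda_{L-1} = \sigma_{L-1}^{2} + \sigma^{2}, \; \lambda_L = \sigma^{2}.$$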
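Furthermore, we have $P_L S = 0_{N\times K}$. Therefore, for every $s \in \langle S \rangle$,
$$K_l(R_{yy}, s) = \langle s, R_{yy} s, \ldots, R_{yy}^{l-1} s \rangle \subseteq \langle P_1 s, \ldots, P_{L-1} s \rangle,$$
which has dimension $\dim \le L-1$.
Therefore, we have,
$$\max_{s \in \langle S \rangle} \dim K_l(R_{yy}, s) = \min(l, L-1).$$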
Q.E.D.
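The difference between the Theorem and the Corollary can be seen in a small complex-valued sketch. The signal matrix below is constructed so that its Grammian has two distinct eigenvalues (an assumed example with eigenvalues 3 and 1), giving $R_{yy}$ three distinct eigenvalues. The helper `krylov_dim` and all sizes are hypothetical.

```python
import numpy as np

def krylov_dim(R, s, l, tol=1e-8):
    """Numerical dimension of span{s, R s, ..., R^(l-1) s}."""
    cols, v = [], s.copy()
    for _ in range(l):
        cols.append(v / np.linalg.norm(v))   # normalize; the span is unchanged
        v = R @ v
    return int(np.linalg.matrix_rank(np.column_stack(cols), tol=tol))

rng = np.random.default_rng(4)
N, K, sigma2 = 12, 6, 0.5                                  # assumed illustrative sizes
Z = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))
Q, _ = np.linalg.qr(Z)                                     # orthonormal columns
S = Q @ np.diag(np.sqrt([3.0, 3.0, 3.0, 1.0, 1.0, 1.0]))   # S^H S has eigenvalues 3 and 1
R = S @ S.conj().T + sigma2 * np.eye(N)                    # eigenvalues 3.5, 1.5, 0.5: L = 3

c = rng.standard_normal(K) + 1j * rng.standard_normal(K)
s_in = S @ c                                               # starting vector inside <S>
s_any = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # generic starting vector

dims_in = [krylov_dim(R, s_in, l) for l in range(1, 6)]
dims_any = [krylov_dim(R, s_any, l) for l in range(1, 6)]
print("s in <S>  :", dims_in)
print("generic s :", dims_any)
```

With the starting vector in the signal subspace the expansion stops at $L-1 = 2$, whereas a generic starting vector reaches $L = 3$.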
We present a few applications from communications and array processing to demonstrate the warp convergence of the RRCG Wiener filter.
A. Application in CDMA Systems
For synchronous CDMA applications, the data model in (1) is real-valued, and columns of the S matrix stand for signature vectors (normalized spreading codes) of all active users in the system. When a set of K length-N Gold codes is chosen as spreading codes, we can always decompose the Gram matrix in the form of,
where the (K×1) vectors ul are linearly independent of each other; and the vectors ql are orthogonal. Specifically, when a good set of Gold codes is chosen, we have [6]
where 1 stands for a (K×1) vector of all ones, and when a bad set of Gold codes is chosen, we have [6]
The constant $t$ is related to the three-value cross-correlation property of Gold codes. The vectors $u$ and $v$ contain only elements drawn from {−1, 0, +1}. The vectors $q_2$ and $q_3$ are $q_2 = u + v$ and $q_3 = u - v$. The presence of the terms $u v^{T}$ and $v u^{T}$ is due to the inclusion of Gold codes with bad cross-correlation in the code signature matrix $S$. The effect of these terms is to sparsely place elements with opposite values $t/N$ and $-t/N$ at locations above and below the main diagonal of $G_{SS}$. Most importantly, these additional dyad terms in the Gram matrix $G_{SS}$ reveal the factors that cause the non-orthogonality among signal modes. They dictate the additional steps needed, in addition to the first stage matched filtering $w_1^{(RRCG)} = s_1$, for the V-CG method to converge to the full-rank Wiener solution $R_{yy}^{-1} s_1$. This observation implies that under ideal power control, the RRCG WF multiuser detector (MUD) should converge to its full-rank counterpart within only 2 to 4 iterations [6]. Furthermore, we can also design a groupwise power control scheme for users (using Gold codes) with four different transmission power levels, so that the 4-stage convergence property of the RRCG-WF still holds. Simulation results in
B. Application in Adaptive Array Signal Processing
In array processing applications, the complex-valued snapshot from a given N-element uniform linear array (ULA), in response to K signal modes, follows the model in (1). Here the signal modes (columns of the S matrix) are parameterized by the angles of arrival of different sources,
$$S = [s(v_1)\; s(v_2)\; \ldots\; s(v_K)].$$
When all the modes are orthogonal, we have $s^{H}(v_k) s(v_l) = N \delta_{k,l}$ and $G_{SS} = N I_{K\times K}$. However, when the signal of interest, $s_1 = s(v_1)$, and the interfering sources, $s(v_k)$ ($k = 2, 3, \ldots, K$), are not orthogonal, the RRCG WF can be used to filter out the desired signal mode. In such cases, we have
where the $2L_0$ unitary dyads account for $L_0$ modes that force $G_{SS}$ away from $N I_{K\times K}$. For many applications, we experience fast and early convergence in at most L steps with $L = 2L_0 + 1 \ll K$. In
Similar predictable convergence behavior can be observed for other experimental setups, where the spacing between array elements (due to the break-downs of some sensors in the array structure or the environmental constraints on sensor distribution), along with the arbitrary DOAs of sources, make the Gram matrix of signal modes deviate from the diagonal matrix. These results might clarify other findings for RRCG WF applied to multi-sensor arrays [7].
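The orthogonality condition on the array modes can be illustrated with a short sketch. It assumes the usual ULA steering vector $s(v) = [1, e^{j2\pi v}, \ldots, e^{j2\pi (N-1) v}]^{T}$ with $v$ a normalized spatial frequency, which is an assumption because the parameterization is not written out above. The function name `steering` and the chosen frequencies are hypothetical.

```python
import numpy as np

def steering(nu, N):
    """Assumed ULA steering vector for normalized spatial frequency nu."""
    return np.exp(2j * np.pi * nu * np.arange(N))

N, K = 16, 4
on_grid = np.arange(K) / N                               # frequencies on the 1/N grid
off_grid = on_grid + np.array([0.0, 0.02, 0.0, 0.0])     # second source moved off the grid

S_on = np.column_stack([steering(nu, N) for nu in on_grid])
S_off = np.column_stack([steering(nu, N) for nu in off_grid])
G_on = S_on.conj().T @ S_on                              # Gram matrix S^H S
G_off = S_off.conj().T @ S_off

is_scaled_identity = np.allclose(G_on, N * np.eye(K))
max_deviation = np.abs(G_off - N * np.eye(K)).max()
print("on-grid Gram equals N I:", is_scaled_identity)
print("largest deviation from N I, off-grid:", max_deviation)
```

Moving a single source off the grid already produces off-diagonal Gram entries, which is the situation in which the reduced-rank filter has to take steps beyond the matched filter.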
Conventional convergence analysis of the V-CG algorithm, whether for optimization or signal processing, has been based on the rank of a signal covariance matrix, or equivalently on the order of its characteristic polynomial. In this paper we show that it is the order of the minimal polynomial that determines convergence. This order is determined by the number of distinct eigenvalues of the signal covariance matrix, or its Grammian. For many problems in statistics, communication, and array processing, this number is much smaller than the rank of the Grammian, leading to warp convergence.
Synchronization or timing acquisition is an important aspect of every communication system. From a signal processing perspective, timing acquisition is related to the traditional time delay estimation problem in the presence of multiple access interference and ambient noise. The time delay parameters typically enter the data model non-linearly, resulting in a non-linear parameter estimation problem. The maximum likelihood estimation of the non-linear parameters reduces to a peak search of an objective function, commonly named the compressed likelihood function (CLF), over the parameter space. For our problem, the CLF is a ratio of quadratic forms involving the inverse of the data covariance matrix. When only a finite amount of data, rather than the true covariance matrix, is available for timing acquisition, a reliable low-complexity data-driven solution is needed. In this work, we develop a reduced-rank data-driven solution that avoids the matrix inversion and uses only very few data snapshots, for rapid timing acquisition in a multiple access communications system.
In an asynchronous multiple access system over the additive Gaussian noise channel, the baseband data $r(t)$ can be modeled as,
$$r(t) = \sum_{k=1}^{K} \sum_{i} A_k b_k(i) s_k(t - iT - \tau_k) + n(t),$$
where K is the number of active MA users; $i$ is the symbol index; $A_k$, $b_k(i)$, $\tau_k$, and $s_k(t)$ are the amplitude, BPSK information bit, propagation delay, and signature waveform (normalized within symbol interval T) of the kth user, respectively; $n(t)$ is a white Gaussian process with an average power of $\sigma^{2}$. The signature waveform is generated based on a binary signature sequence $s_k[l] \in \{-1, +1\}$ and the rectangular pulse of chip duration. That is,
with T=LTc. Denote the k-th user's propagation delay as τk=vkTc+γk, where vk and γk are the integer and the fractional parts of τk with respect to the chip duration Tc. In our further analysis and simulations, we choose a normalized chip interval, Tc=1. Within the i-th processing interval of length T, which is commonly not aligned with the unknown delay, the chip-rate matched filtered and sampled data in (17) can be written in matrix form as,
where the i.i.d. white noise vectors n(i)˜N(0, σ2IL). The (L×1) vectors uk(r) and uk(l) are the effective signature vectors of the kth user, parameterized by the delay τk, i.e.,
with sk(r)(vk) and sk(l)(vk) being the right and left portions of signature vector sk partitioned by the integer part of delay vk. That is,
In this work, we combine the MAI and WGN of (18) into one colored noise vector. Estimating delay of the desired user τ1 then becomes jointly estimating signal parameters v1 and γ1 in colored noise of unknown covariance structure. The colored noise has zero mean and covariance matrix
$$Q = U A_1^{2} U^{T} + \sigma^{2} I_L,$$
where $U = [u_2^{(r)}\; u_2^{(l)}\; \ldots\; u_K^{(r)}\; u_K^{(l)}]$, and $A_1 = I_2 \otimes \mathrm{diag}\{A_2, A_3, \ldots, A_K\}$. Note that the actual structure of the noise covariance matrix depends on the nuisance parameters of all interfering users.
In most communications systems, a fixed preamble (M bits) associated with each user is used for timing acquisition and system synchronization. Then the signal of interest as well as the data model in (18) associated with the preamble bits can be simplified to
$$r(i) = \beta_1 u_1(\tau_1) + e(i), \quad i = 1, 2, \ldots, M, \qquad (20)$$
where $\beta_1$ is an unknown scalar; the signal vector $u_1(\tau_1) = u_1^{(r)} + u_1^{(l)}$ is parameterized by the delay $\theta = [v_1\; \gamma_1]^{T}$; and the colored noise $e(i)$ is of zero mean and unknown covariance, $E\{e(i)\} = 0$, $\mathrm{cov}(e(i)) = Q$. Note that under the Gaussian assumption for the data vectors in (4), the sufficient statistics for estimating all the unknowns, $\tau_1$, $\beta_1$ and $Q$, are functions of the sample mean vector and the sample correlation matrix, i.e.
Hence, in this work, we use the Gaussian approximation to model the sample mean statistic. That is,
This turns out to be a reasonable assumption when the combined effect from M preamble bits and the number of MA users K is large. The delay estimate is then given by [7],
where
is the sample covariance estimate of the unknown $Q$ matrix. Note that without the Gaussian assumption, the above delay estimate is the non-linear weighted least squares solution for $\tau_1$ in the model of equation (20). Also note that normalizing the objective function $J(\tau)$ by a term $\hat{m}^{T} \hat{Q}^{-1} \hat{m}$ (not affecting the delay estimate) leads to the equivalent adaptive coherence estimator (ACE) [9], [10].
In communication applications, the preamble bits are scarce; therefore we propose to use a rank reduction technique to facilitate rapid timing acquisition. The rank reduction technique from our recent work [5], [6] can alleviate the problem encountered in sample covariance matrix inversion (especially when M<L), while at the same time delivering data-driven solutions for low-complexity timing acquisition. In doing so, the rank-r solution to timing acquisition is obtained from,
where the rank-r version of the objective function becomes,
In (6), the vector wu
where $K_r(\hat{Q}, u_1(\tau))$ denotes the rank-r Krylov subspace; the vectors $d_i \in K_r(\hat{Q}, u_1(\tau))$ are the $\hat{Q}$-conjugate directions; and the scalars $\alpha_i = u_1^{T}(\tau) d_i / (d_i^{T} \hat{Q} d_i)$ are the best linear combination coefficients (the scalar Wiener filters working on the decorrelated internal variables $z_i = d_i^{T} \hat{m}$) for the set of r given conjugate direction vectors. Specifically, at the r-th step of iteration, out of the rank-r Krylov subspace $K_r(\hat{Q}, u_1(\tau))$, a rank-r approximation to the Wiener filter is constructed as an optimal linear combination of the r conjugate direction vectors generated by the vector conjugate gradient (V-CG) method. The initial direction vector is chosen as $d_1 = u_1(\tau)$, and the subsequent direction vectors are chosen as the $\hat{Q}$-conjugate directions. That is [2, 5], the new direction vectors are updated using the innovation contained in the residue (gradient) vector,
where $d_k^{T} \hat{Q} d_m = 0$ for $k \neq m$. The residue vector is updated accordingly as
$$g_{k+1} = g_k - \alpha_k \hat{Q} d_k, \qquad (24)$$
Note that when the number of available preamble bits M is very small, making the sample covariance matrix $\hat{Q}$ rank deficient (M<L), the proposed reduced-rank version of the objective function can still be calculated using the above procedures. In such cases, we observe from the simulations the advantages of the low-rank timing acquisition scheme over the high-rank solutions.
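The recursions above can be exercised on a rank-deficient sample covariance with the following sketch. It uses synthetic data, a random binary vector as a stand-in for $u_1(\tau)$, and the standard conjugate gradient direction update $d_{k+1} = g_{k+1} + (g_{k+1}^{T} g_{k+1} / g_k^{T} g_k)\, d_k$, which is assumed here. The sizes, the scalar 0.8, and the white noise model are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(6)
L_sig, M, r = 31, 10, 3      # signature length, preamble bits, target rank (assumed)
u = rng.choice([-1.0, 1.0], size=L_sig) / np.sqrt(L_sig)    # stand-in for u_1(tau)
data = 0.8 * u[:, None] + rng.standard_normal((L_sig, M))   # columns r(i) = beta*u + e(i)
m_hat = data.mean(axis=1)                                   # sample mean vector
X = data - m_hat[:, None]
Q_hat = X @ X.T / M          # sample covariance; singular because M < L_sig

w = np.zeros(L_sig)
g = u.copy()                 # residue for w = 0
d = u.copy()                 # initial direction d_1 = u_1(tau)
dirs, alphas = [], []
for _ in range(r):
    Qd = Q_hat @ d
    alpha = (u @ d) / (d @ Qd)                   # alpha_i = u^T d_i / (d_i^T Q_hat d_i)
    w = w + alpha * d                            # rank-i filter, no inversion of Q_hat
    g_new = g - alpha * Qd                       # residue update
    dirs.append(d)
    alphas.append(alpha)
    d = g_new + (g_new @ g_new) / (g @ g) * d    # assumed standard CG direction update
    g = g_new

D = np.column_stack(dirs)
C = D.T @ Q_hat @ D                              # diagonal if directions are Q_hat-conjugate
off_diag = np.abs(C - np.diag(np.diag(C))).max()
z = D.T @ m_hat                                  # internal variables z_i = d_i^T m_hat
rank_Q = int(np.linalg.matrix_rank(Q_hat))
print("rank of Q_hat:", rank_Q, "of", L_sig)
print("largest off-diagonal of D^T Q_hat D:", off_diag)
print("w^T m_hat equals sum_i alpha_i z_i:", np.isclose(w @ m_hat, np.dot(alphas, z)))
```

The rank-r filter stays well defined although $\hat{Q}$ cannot be inverted, and its output on the sample mean is the weighted sum of the decorrelated internal variables.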
The application problem considered in this work is the propagation delay estimation of a desired user in a code-division multiple-access communication environment [7], [8]. In such applications, the presence of multiple-access interference (MAI) from other users renders the conventional correlator based delay estimator useless. As mentioned earlier, we treat the desired user as the signal of interest and other interfering users as interference of unknown covariance structure. Under the condition of near-Gaussian interference-plus-noise, the maximum likelihood estimate of $\tau_1$ simply corresponds to the maximum of an objective function $J(\tau_1)$ in (5) or its scaled version $J_{ACE}(\tau_1)$. This estimator is near-far resistant due to the fact that it makes use of the structure of the MAI. In our approximation of this objective function, we use a sample covariance matrix $\hat{Q}$ and an implicit approximation to its inverse, by using a sequence of recursively approximated Krylov subspaces. For the examples used, we choose a multiple access system with K=10 users. The signature length is chosen as L=31. The SNR for the desired user is fixed as SNR(1)=11 dB, and the SNRs for all the interfering users are chosen as SNR(k)=SNR(1)+NFRdB, k=2, . . . , K, with near-far ratio chosen as NFR=0, 10 dB. We vary the number of preamble bits M=10, 20, 31. In all the figures, we show the objective functions at different ranks and different NFRs as a function of delay parameter $\tau_1$. The true delay is marked by a dashed line.
This work demonstrates the applicability of the fast converging reduced-rank Wiener filter to low-complexity data-driven rapid timing acquisition in CDMA systems.
Many advanced signal processing systems, ranging from designing linear filters for space-time adaptive processing in radar and sonar systems, to signal separation and interference suppression in communications, involve such filtering operations as $s^{T} R_{yy}^{-1} s$ and $s^{T} R_{yy}^{-1} y$, where $R_{yy}$ is the correlation matrix of the measurement vector $y$, and the vector $s$ denotes the signal mode of interest. For different application scenarios, the signal mode vector can either be parameterized by the temporal/spatial frequency as the mode/steering vector in space-time adaptive processing, or be digitalized as the signature vector in a spread spectrum CDMA system. In practice, the dimension, N, of the vectors and matrices involved may be very large. This not only causes slow convergence during the real-time adaptation of the systems in dynamically changing interference environments, but also causes performance degradation of the system when only a small number of data samples are available for estimating the data statistics.
In this paper, we study the conjugate direction (CD) methods [1], [2], [3] applied to the iterative design of reduced-rank (RR) Wiener filters (WF) for applications in beamforming and communications. At each iteration of the algorithm, a RR WF that is contained in the expanding Krylov subspace, $K_l(s, R_{yy})$, is constructed. We study cases when the Gram matrix of the signal modes has structures that lead to fast convergence of the CDRR WF. This finding will be useful during the adaptive implementation of systems for communications and array processing. Examples of computer simulation are provided to demonstrate the remarkably fast convergence of the CDRR Wiener filters.
Let us assume that the measurement vector y follows a linear model,
y=Sθ+n, (25)
where the matrix $S = [s_1\; s_2\; \ldots\; s_K]$ contains signal modes of all sources; the random vector $\theta = [\theta_1\; \theta_2\; \ldots\; \theta_K]^{T}$ contains information carried by signal modes; the noise vector, $n \sim N(0, \sigma^{2} I)$, is white Gaussian. In this work, we choose mode $s_1$ as the desired signal mode. The rest of the signal modes, $s_k$ ($k = 2, 3, \ldots, K$), are treated as interference. The data correlation matrix $R_{yy}$ has the following structure,
$$R_{yy} = S P S^{T} + \sigma^{2} I,$$
where the diagonal matrix $P = \mathrm{diag}\{P_1, P_2, \ldots, P_K\}$, with $P_k = E\{|\theta_k|^{2}\}$, contains the power of each of the uncorrelated sources.
We studied the cases when the Gram matrix of signal modes, Gs=STS, can be decomposed as,
where the (K×1) vectors $q_l$ are linearly independent of each other. In most cases, we have $L \ll K \le N$. This decomposition represents the factors that cause the signal modes to deviate from a perfectly orthogonal set. In this work, we have proved that the CDRR Wiener filter,
$$w_l^{CDRR} = S_l \mu_l \in \mathrm{Range}(S_l),$$
with $$S_l = [s_1, R_{yy} s_1, \ldots, R_{yy}^{l-1} s_1],$$
converges to the full-rank Wiener filter, $w_{WF} = R_{yy}^{-1} s_1$, in at most (L+1) steps/iterations, when the powers $P_k$ are evenly distributed among all signal modes. This means that in such cases, a rank-(L+1) CDRR WF is guaranteed to deliver the same performance as that of the full-rank WF. This is due to the fact that at each iteration of the CD method, a RR-WF is constructed from the Krylov subspace $K_l(s_1, R_{yy}) = \langle S_l \rangle$. Using the $R_{yy}$ in (2), we can explicitly write the equivalent basis vectors of the expanding subspace $\langle S_l \rangle$ as
where the coefficients $\beta_k^{(l)}$ depend on the signal modes $s_k$ and their powers $P_k$, as well as the noise power $\sigma^{2}$. For the most general choices of the mode matrix $S$ and the parameters $P_k$ in (26), we have shown earlier [4] that the subspace $\langle S_l \rangle$ stops expanding after $l = K$ steps. This leads to the K-step (at most) convergence results in [4]. However, for some important cases when the Gram matrix satisfies (27), we further show in this work that,
$$\langle S_l \rangle \subset \langle s_1, S q_1, \ldots, S q_L \rangle, \quad \forall l.$$
In other words, for such applications (L+1<<K), we can further reduce the rank/stage down to (L+1) during the construction of the RR-WF using the CD method, yet still deliver the same performance as the full-rank WF.
We present two application scenarios originated from communications and array processing to demonstrate the fast convergence behavior of the CDRR Wiener filters. Other application examples will be included in the full paper.
A. Applications in CDMA Systems
For CDMA applications, columns of S matrix stand for signature vectors (normalized spreading codes) of all active users in the system. When a set of K length-N Gold codes is chosen as spreading codes, we can always decompose the Gram matrix in the form in equation (26). Specifically, when a good set of Gold codes is chosen, we have:
where 1 stands for a (K×1) vector of all ones.
When a bad set of Gold codes is chosen, we have:
The constant $t$ is related to the three-value cross-correlation property of Gold codes. The vectors $u$ and $v$ contain only elements drawn from {−1, 0, +1}. The vectors $q_2$ and $q_3$ are $q_2 = u + v$ and $q_3 = u - v$. The presence of the terms $u v^{T}$ and $v u^{T}$ is due to the inclusion of the Gold code with bad cross-correlation in the code signature matrix $S$. The effect of these terms is to sparsely place elements with opposite values $t/N$ and $-t/N$ at locations above and below the main diagonal of $G_S$. Most importantly, these additional dyad terms present in the Gram matrix $G_S$ reveal the factors that cause the non-orthogonality among signal modes. They dictate the additional steps needed, in addition to the first stage matched filtering $w_1^{(CDRR)} = s_1$, for the CD method to converge to the full-rank Wiener solution $R^{-1} s_1$. This observation implies that under ideal power control, the CDRR WF based multiuser detector (MUD) should converge to its full-rank counterpart within only 2 to 4 stages/iterations. Furthermore, we can also design a group-wise power control scheme for users (using Gold codes) with four different transmission power levels, so that the 4-stage convergence property of the CDRR-WF still holds. Simulation results in
B. Applications in Array Signal Processing
In array processing applications, the output data snapshot from a given N-element uniform linear array (ULA), in response to K signal modes, follows the same model as in (25). Here the signal modes (columns of the matrix S) are parameterized by the angles of arrival of the different sources,
$S=[\,s(v_1)\ s(v_2)\ \dots\ s(v_K)\,]$.
When all the modes are orthogonal, we have $s^H(v_k)s(v_l)=N\delta_{k,l}$ and $G_S=N I_{K\times K}$. However, when the signal of interest, $s(v_1)$, and the interfering sources, $s(v_k)$ (k=2, 3, . . . , K), are not aligned with the orthogonal modes as expected, the CDRR WF can be used to extract the desired signal mode. The L effective dyad terms present in the Gram matrix $G_S$ in (27) reflect the net independent factors that cause the non-orthogonality among the columns of S. For many applications, we experience fast and early convergence in a number of steps (at most L+1) smaller than K, the total number of signal plus interference modes. In
The same kind of predictable convergence behavior can be observed for other experimental setups, where the spacing between array elements (due to the breakdown of some sensors in the array structure), along with the DOAs of the sources, causes the Gram matrix of the mode matrix to deviate from a diagonal matrix.
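The orthogonality condition on the array modes can be illustrated with a short sketch. It assumes that each mode is a standard ULA steering vector parameterized by a normalized spatial frequency v (in cycles per element), which is one common way of encoding the angle of arrival; the function name `steering` and the sizes N and K are hypothetical.

```python
import numpy as np

def steering(v, N):
    """Steering vector of an N-element ULA at spatial frequency v (assumed form)."""
    n = np.arange(N)
    return np.exp(2j * np.pi * v * n)

N, K = 16, 4                                  # hypothetical array size and mode count

# Sources on the grid v_k = k/N give mutually orthogonal modes.
v_on = np.arange(K) / N
S_on = np.column_stack([steering(v, N) for v in v_on])
G_on = S_on.conj().T @ S_on                   # Gram matrix, expected to equal N * I

# Moving one source off the grid destroys the orthogonality.
v_off = v_on + np.array([0.0, 0.3, 0.0, 0.0]) / N
S_off = np.column_stack([steering(v, N) for v in v_off])
G_off = S_off.conj().T @ S_off
offdiag = np.abs(G_off - np.diag(np.diag(G_off))).max()

print("on-grid Gram equals N*I:", np.allclose(G_on, N * np.eye(K)))
print("largest off-diagonal magnitude, off-grid:", offdiag)
```

In this sketch a single off-grid source perturbs only one row and one column of the Gram matrix, which is the kind of low-rank deviation from a diagonal matrix that the dyad terms describe.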
Multimedia wireless communication systems need to accommodate a variety of potentially disparate information sources, such as voice, packet data, and video signals. These different information sources have different QoS requirements, which include bit rate, allowable transmission delay, source priority, and performance measures (the SINR and the BER). In this work, we propose a reduced-rank approach [1-4] to reduce the implementation complexity faced by traditional full-rank multiuser detectors for multi-rate CDMA systems. We study the detection performance of the reduced-rank linear minimum mean squared error (LMMSE) MUD. The emphasis on this MUD is motivated by the fact that the LMMSE MUD (in a single-rate CDMA system) not only provides a good compromise between computational simplicity and satisfactory system performance, but can also be implemented adaptively, requiring only the signature of the desired user and not those of the interfering users. For multi-rate CDMA systems involving data traffic of different data rates and QoS requirements, this work proposes fast-converging group-wise reduced-rank MUDs for the different user groups and studies their performance.
We assume that the considered multi-rate CDMA system uses the variable spreading length (VSL) access scheme to handle users of different data-rates. In such a system, users are grouped according to their data rates, and all the users with different data-rates use the signature codes of the same chip-rate but different spreading lengths. Hence, the multi-rate data traffic containing g+1 user groups, over the processing interval of duration T0 specified by the low-rate (LR) users' symbol interval can be modeled as
where users, each using a distinct signature, are grouped according to their data rates. The waveform $s_{k,l}(t)$ of duration $T_l$ is the signature waveform of the k-th user in the l-th group; $A_{k,l}$ and $b_{k,l}$ are the amplitude and BPSK symbol of that user; and n(t) is an AWGN process. The data vector $y \in \mathbb{R}^N$, obtained at the output of the chip-rate matched filtering and sampling unit over the processing interval $T_0$ (the LR users' bit interval), can be formulated in matrix form as
where the matrix Al=IM
Sl=[s1,l(0) . . . sK
The N×Kvirtual signature matrix S contains signatures of all Kvirtual virtual users,
S=[S0S1 . . . Sg].
where
The real-valued white noise vector, $n \sim \mathcal{N}(0, \sigma^2 I_N)$, is assumed to be Gaussian distributed.
We point out that for multi-rate CDMA systems, there exists additional design freedom that we can exploit in constructing signature codes with certain desired properties for user groups of different data rates and QoS requirements.
For example, given a set of
spreading codes of length L, there exist many different ways of constructing the variable length signature codes for all the Kvirtual virtual users. A straightforward way to construct a signature matrix Sl, out of a set of Kl spreading codes {s1,l, s2,l, . . . , sK
Sl=IM
where ql=Mg/Ml=Tl/Tg, and N=MgL. A more advanced way of constructing Sl is to utilize the idea of space-time block codes (STBC), such as the Alamouti code (orthogonal STBC) and other quasi-orthogonal STBCs, to map the L×Kl signature [s1,ls2,l . . . sK
Sl=C(s1,ls2,l . . . sK
where
Such additional design freedom comes from the difference in data rates and QoS requirements among different user groups. In section 4, we provide a few specific code design examples for simulation experiments to demonstrate the fast converging features of the MUD.
The general design strategy proposed here is to construct, on a group-wise basis, properly normalized (according to the data rates and QoS requirements) orthogonal or quasi-orthogonal composite signatures out of a given set of spreading codes (typically non-orthogonal) for users in the different data-rate groups. By doing so, we can effectively reduce the number of distinct eigenvalues present in the data covariance matrix Ryy and hence enable warp convergence [5-6] in the low-complexity reduced-rank MUD.
A. Distributed LMMSE MUD and the RR-MUD
The distributed LMMSE MUD for a desired user, say the user k, decodes the BPSK symbol according to the decision rule,
$\hat{b}_k=\mathrm{sgn}\{s_k^T R_{yy}^{-1} y\}=\mathrm{sgn}\{w^T y\}$, (29)
where the N×1 vector $s_k$ is the signature code of the desired user, and $R_{yy}=SA^2S^T+\sigma^2 I$ is the data covariance matrix. The vector
$w=R_{yy}^{-1}s_k$ (30)
is the scaled full-rank Wiener filter for the symbol bk. Due to the high dimensionality of the system (large N) and the changing dynamics of wireless systems, reduced-rank solutions of low computational complexity are preferred to the full-rank Wiener filter. Using the concept of expanding (with increasing rank r) Krylov subspaces Kr(Ryy, sk), we develop a computationally efficient approach [5] to constructing reduced-rank approximations to the full-rank Wiener filter in (30). The approach, which stems from the vector conjugate gradient (V-CG) method, consists of an iterative procedure that constructs a sequence of approximations to the full-rank Wiener filter. Specifically, at the r-th iteration, a rank-r approximation to the Wiener filter in (30) is constructed from the rank-r Krylov subspace Kr(Ryy, sk) as an optimal linear combination of the r conjugate direction vectors generated by the V-CG method. That is,
$w_r=\sum_{i=1}^{r}\alpha_i d_i$, (31)
where the vectors $d_i \in K_r(R_{yy}, s_k)$ are the $R_{yy}$-conjugate directions, and the scalars $\alpha_i=s_k^T d_i/(d_i^T R_{yy} d_i)$ are the best linear combination coefficients (the scalar Wiener filters working on the decorrelated internal variables $z_i=d_i^T y$) for the set of r given conjugate direction vectors.
In our earlier work [5-6], we have shown that the V-CG reduced-rank LMMSE MUD exhibits both warp convergence (L-step convergence, where L ≪ K ≤ N) and general convergence (K-step convergence). Here, by convergence we mean that a reduced-rank MUD delivers the same performance as the full-rank MUD.
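The construction of the rank-r filter from conjugate directions can be sketched as follows. This is a textbook conjugate gradient recursion used as a stand-in for the V-CG procedure of [5], not a reproduction of it; the function name `vcg_wiener`, the random unit-norm signatures, the equal unit amplitudes, and the sizes N, K and sigma2 are all assumptions made for illustration. The coefficient is computed as the ratio of $s_k^T d_i$ to $d_i^T R_{yy} d_i$, as in the text.

```python
import numpy as np

def vcg_wiener(R, s, r):
    """Run up to r CG steps; return the rank-1..r filters and the directions."""
    w = np.zeros_like(s)
    g = s.copy()                       # residual s - R w (w starts at zero)
    d = g.copy()                       # first direction d_1 = s (matched filter)
    filters, dirs = [], []
    for _ in range(r):
        Rd = R @ d
        alpha = (s @ d) / (d @ Rd)     # alpha_i = s^T d_i / (d_i^T R d_i)
        w = w + alpha * d              # w_i = w_{i-1} + alpha_i d_i
        filters.append(w.copy())
        dirs.append(d.copy())
        g_new = g - alpha * Rd
        if np.linalg.norm(g_new) < 1e-12 * np.linalg.norm(s):
            break                      # converged; avoid dividing by ~0 below
        d = g_new + (g_new @ g_new) / (g @ g) * d   # next R-conjugate direction
        g = g_new
    return filters, np.column_stack(dirs)

rng = np.random.default_rng(1)
N, K, sigma2 = 16, 6, 0.1              # hypothetical sizes and noise variance
S = rng.standard_normal((N, K))
S /= np.linalg.norm(S, axis=0)         # unit-norm stand-in signatures
R = S @ S.T + sigma2 * np.eye(N)       # equal unit amplitudes assumed
s_k = S[:, 0]

filters, D = vcg_wiener(R, s_k, K)
w_full = np.linalg.solve(R, s_k)
errs = [np.linalg.norm(w - w_full) / np.linalg.norm(w_full) for w in filters]
C = D.T @ R @ D                        # conjugacy: off-diagonal entries near zero

print("relative error by rank:", errs)
```

With K generic signatures the sketch shows the general (K-step) convergence; the matrix C being numerically diagonal corresponds to the internal variables $z_i=d_i^T y$ being decorrelated.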
B. Centralized LMMSE MUD and the RR-MUD
For MUD performed at base stations and relay stations, a centralized solution is desired. In the centralized MUD, which assumes knowledge of all active users' signature codes, the information of all active users is decoded in one shot. Therefore, the vector conjugate gradient (V-CG) reduced-rank MUD can be extended to the matrix conjugate gradient (M-CG) MUD using matrix Wiener filter theory. The centralized LMMSE MUD decodes the symbol vector of all active users according to the decision rule
$\hat{b}=\mathrm{sgn}\{S^T R_{yy}^{-1} y\}=\mathrm{sgn}\{W^T y\}$, (32)
where the matrix $W=R_{yy}^{-1}S$ is the scaled matrix Wiener filter for the symbol vector b in (28).
In [5], we have shown that the M-CG MUD converges to its full-rank counterpart in just one step, i.e.
$W_1=D_1V_1=R_{yy}^{-1}S$,
with the columns of the initial direction matrix $D_1=S$ selected as the signature vectors of all the users of interest, and the step size matrix $V_1=(D_1^T R_{yy} D_1)^{-1}G_1^T G_1$. The initial gradient matrix is chosen as $G_1=S$.
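The one-step property can be verified numerically. The reason it holds is that $R_{yy}S=S(A^2S^TS+\sigma^2 I)$, so the column space of S is invariant under $R_{yy}$ whenever S has full column rank. The sketch below assumes random unit-norm stand-in signatures, random amplitudes, and hypothetical sizes; only the formulas for $D_1$, $G_1$ and $V_1$ are taken from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, sigma2 = 16, 5, 0.2                      # hypothetical sizes and noise variance
S = rng.standard_normal((N, K))
S /= np.linalg.norm(S, axis=0)                 # unit-norm stand-in signatures
amps = rng.uniform(0.5, 2.0, K)                # hypothetical user amplitudes
R = S @ np.diag(amps**2) @ S.T + sigma2 * np.eye(N)

D1 = S                                         # initial direction matrix
G1 = S                                         # initial gradient matrix
V1 = np.linalg.solve(D1.T @ R @ D1, G1.T @ G1) # step size matrix
W1 = D1 @ V1                                   # one-step M-CG filter
W_full = np.linalg.solve(R, S)                 # full-rank matrix Wiener filter

# Decode one simulated snapshot with the one-step filter.
b = rng.choice([-1.0, 1.0], size=K)
y = S @ (amps * b) + np.sqrt(sigma2) * rng.standard_normal(N)
b_hat = np.sign(W1.T @ y)

print("max |W1 - W_full|:", np.max(np.abs(W1 - W_full)))
print("sent:", b, "decoded:", b_hat)
```

The cost of this single step is dominated by solving a K×K system rather than an N×N one, which is where the saving comes from when K is smaller than N.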
C. Group-Wise LMMSE MUD in Multi-Rate CDMA
The full-rank matrix Wiener filter for symbol vector bd of the desired group is
The reduced-rank MUD for either a desired user or a desired group of users can be constructed according to the procedures of the vector-CG (V-CG) or the matrix-CG (M-CG), respectively. That is,
Furthermore, we can use various quasi-orthogonal STBC schemes to construct the composite signature sets (see application examples) to enable partial orthogonality among the signature codes in a given user group, so that the M-CG computation can be further reduced through subspace decoupling. This helps to speed up the convergence of the reduced-rank group-wise MUD. In many applications, owing to the multiplicity relations existing among the data rates, we can construct quasi-orthogonal composite signatures for users in a given group so that the group-wise M-CG MUD constructed from the expanding Krylov subspaces
can be decoupled into pair-wise M-CG MUDs of much reduced dimension. The matrices Di and Vi in (32) contain the direction vectors and step-size vectors, respectively, during the i-th iteration of the M-CG. In addition, we can judiciously control the signature code and power allocation among groups of users with various data rates so that the covariance matrix of the data, to which the group-wise MUD is applied, has a reduced number of distinct eigenvalues, resulting in warp convergence of the reduced-rank MUD.
Application examples are chosen to demonstrate the fast converging property of the reduced-rank LMMSE MUD for multi-rate CDMA systems. We design a multi-rate CDMA system with variable spreading length sequences to accommodate multi-rate data traffic. We choose a system with the total number of users K=8. There are two user groups among all active users, K0=4 low-rate (LR) users and K1=4 high-rate (HR) users. In all examples, we assume that a total of K=8 distinct normalized Gold codes of length L=15 are available for constructing signatures for LR users and HR users. In all examples, we choose to use Gold codes g1, . . . , g4 for the LR user group and g5, . . . , g8 for the HR user group.
In the first example, we assume that the ratio between the data rates is M=2. Hence, within the processing interval T0, there are a total of Kvirtual=K0+MK1=12 virtual users. Hence, the signature matrices for all the Kvirtual virtual users (including the LR and HR user groups) are constructed respectively according to the O-STBC and repetition coding
In the second example, we assume that the rate ratio is M=4, resulting in a total of Kvirtual=K0+MK1=20 virtual users. Hence, the signature matrices for all the Kvirtual virtual users (including the LR and HR user groups) are constructed respectively according to the QO-STBC and repetition coding
Note that the columns of the matrices $S_0$ and $S_1$ are all normalized to unity, so that the SNR for a desired user, say user 1, is consistent with the definition of $E_b/N_0$ commonly used in communications. That is, $\mathrm{SNR}_1=A_1^2/\sigma^2=E_b/(T_cN_0)$.
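The dimensions and normalization in the two examples can be made concrete with a small sketch. It is only one plausible reading of the variable spreading length setup: each LR signature is formed by repeating a length-L code M times, each HR user contributes M virtual users (one per symbol slot), and random ±1 sequences stand in for the Gold codes. The STBC mappings are not reproduced here, and the function name `virtual_signatures` is hypothetical.

```python
import numpy as np

def virtual_signatures(C0, C1, M):
    """Stack LR and HR virtual-user signatures over one LR symbol interval.

    C0, C1: L x K0 and L x K1 matrices of unit-norm length-L codes.
    Assumed construction: repetition coding for LR, slot placement for HR.
    """
    S0 = np.kron(np.ones((M, 1)), C0) / np.sqrt(M)  # LR: code repeated M times
    S1 = np.kron(np.eye(M), C1)                     # HR: one copy per symbol slot
    return np.hstack([S0, S1])

rng = np.random.default_rng(3)
L, K0, K1 = 15, 4, 4
# Random +/-1 sequences used as stand-ins for the Gold codes g1..g8.
codes = rng.choice([-1.0, 1.0], size=(L, K0 + K1)) / np.sqrt(L)
C0, C1 = codes[:, :K0], codes[:, K0:]

S_ex1 = virtual_signatures(C0, C1, M=2)   # first example, rate ratio 2
S_ex2 = virtual_signatures(C0, C1, M=4)   # second example, rate ratio 4

print("example 1:", S_ex1.shape)          # N x Kvirtual with N = M*L
print("example 2:", S_ex2.shape)
print("unit-norm columns:", np.allclose(np.linalg.norm(S_ex1, axis=0), 1.0))
```

The column counts match Kvirtual=K0+MK1 for both rate ratios, and every column has unit norm, which is the normalization the SNR definition relies on.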
In both examples, one can see that rapid convergence occurs much earlier than traditionally predicted by the dimension of the total signal subspace Kvirtual.
This work demonstrates the applicability of the fast-converging reduced-rank LMMSE MUD to multi-rate CDMA systems carrying multi-rate data traffic. Given a set of fixed-length spreading codes, proper design of power allocation schemes, together with the construction of composite variable-length signature codes, allows us to accommodate different data rates and QoS requirements for different user groups while at the same time enabling warp convergence of the reduced-rank MUD. During the evolution stages of the reduced-rank MUDs, the scalable-rank MUDs provide a means of trading off implementation complexity against performance.
This application claims benefit of and priority to U.S. Provisional Patent Application Ser. No. 60/660,960 entitled “Code, Signal and Conjugate Direction design for Rapidly-Adaptive Communication Receivers and Electromagnetic, Acoustic and Nuclear Array Processors”, filed Mar. 11, 2005, the entire contents of which are specifically incorporated herein by reference.
This invention was made with government support under Contract No. N00014-04-1-0084 awarded by the Office of Naval Research; Contract Nos. CCR-0112573 and CCR-0085846 awarded by the National Science Foundation; and Contract No. F33615-02-C-1198 awarded by the Defense Advanced Research Projects Agency. The government may have certain rights in the invention.
Number | Date | Country
---|---|---
60660960 | Mar 2005 | US