This application claims priority under 35 U.S.C. §119 of Taiwanese Patent Application No. 094115211, dated May 11, 2005, the disclosure of which is hereby incorporated by reference.
This invention relates to communication systems and, more particularly, to multi-input multi-output (MIMO) communication systems.
Several prior techniques have been developed for MIMO communication systems, including:
(1) In MIMO systems, it is known that high spectral efficiency and high quality can be achieved by exploiting the spatial multiplexing (SM) scheme [G. D. Golden, G. J. Foschini, R. A. Valenzuela, and P. W. Wolniansky, “Detection algorithm and initial laboratory results using V-BLAST space-time communication structure,” Electronics Letters, vol. 35, no. 1, pp. 14-16, January 1999] (hereafter referred to as REF. 1) and the space-time coding (STC) scheme [V. Tarokh, H. Jafarkhani, and A. R. Calderbank, “Space-time block codes from orthogonal designs,” IEEE Trans. Inform. Theory, vol. 45, no. 7, pp. 1456-1467, July 1999] (hereafter referred to as REF. 2), respectively. Such schemes can be directly applied to multiuser (MU) systems, yielding an MU SM system or an MU STC system. In either system, however, the data streams of all users must be transmitted in the same mode, and the mode cannot be switched. This is inflexible and cannot achieve the best performance across general system link requirements and a wide range of channel conditions.
(2) Naguib's 2-step method [A. F. Naguib, N. Seshadri, and A. R. Calderbank, “Applications of space-time block codes and interference suppression for high capacity and high data rate wireless systems,” Proc. 32nd Asilomar Conf. Signals, Systems, and Computers, vol. 2, pp. 1803-1810, 1998] (hereafter referred to as REF. 3) can be directly implemented in an MU space-time block coded (STBC) system. In this scenario, the overall detection framework can be regarded as a parallel interference cancellation (PIC) scheme followed by a local ML search. With such processing, the signal detection cannot benefit from an increased receive diversity gain through the PIC step. Moreover, this method relies on the ML metric to decide the optimal detection order, which may improve detection performance but incurs a large computational cost.
(3) The method proposed in [V. Tarokh, A. Naguib, N. Seshadri, and A. R. Calderbank, “Combined array processing and space-time coding,” IEEE Trans. Inform. Theory, vol. 45, no. 4, pp. 1121-1128, May 1999] (hereafter referred to as REF. 4) applies the BLAST algorithm for signal detection followed by an ML search in single-user (SU) systems. However, the algorithm is mainly based on space-time trellis codes and does not exploit the algebraic structure of the codewords for decoding.
(4) Stamoulis's method [A. Stamoulis, N. Al-Dhahir, and A. R. Calderbank, “Further results on interference cancellation and space-time block codes,” Proc. 35th Asilomar Conf. Signals, Systems, and Computers, vol. 1, pp. 257-261, 2001] (hereafter referred to as REF. 5) is a pure interference cancellation scheme that applies an appropriate linear transformation, based on the algebraic structure of orthogonal space-time block coding (O-STBC), to decouple the users' data streams one at a time in MU STBC systems. At each stage, no additional degrees of freedom are retained after the interference cancellation step for further interference suppression or signal detection at the next stage. As a result, the method cannot benefit from an increase in receive diversity as the algorithm proceeds, even if it is combined with a power ordering strategy.
We propose a group successive interference cancellation (SIC) detection algorithm for general MIMO CDMA systems, in which each user's data stream can be either orthogonal space-time block encoded for transmit diversity or spatially multiplexed for high spectral efficiency, according to the channel conditions.
Based on the rich and distinctive structure embedded in the resulting channel matrix, we derive a high-performance and computationally efficient detector. The algorithm can be described as follows:
(1) A flexible MIMO transceiver is suggested for uplink CDMA systems over the frequency-selective channels as depicted in
(2) The data streams transmitted from each mobile terminal can be either spatially multiplexed (e.g., vertical Bell Laboratories layered space-time, V-BLAST) for achieving a high data rate or orthogonal space-time block encoded (e.g., orthogonal space-time block code, O-STBC) for transmit diversity.
(3) At the base station, the received data is despread and linearly combined with the channel matrix, followed by an ordered successive interference cancellation (SIC) algorithm to detect the transmitted symbols from each mobile terminal.
(4) In such a dual-signaling system, the receiver could suffer from large-dimension data processing. However, by judiciously exploiting the algebraic structure of the O-STBCs, an attractive block-wise implementation of the SIC algorithm can be achieved, which brings the algorithm complexity back down.
(5) The algebraic structure embedded in the channel matrix is further exploited to develop a low-complexity recursive detector. It is shown that the V-BLAST detection weights at each iteration need not be computed from scratch; they are obtained directly from the information of the previous iteration without any matrix inversion.
(6) To address the time delay problem caused by the STBC signals, this invention proposes a 2-stage group SIC detection algorithm for the dual-signaling system, which reduces the computational complexity.
(7) The proposed flexible MIMO transceiver can be applied to the B3G high-speed uplink communications.
A preferred embodiment of this invention is described as follows:
I. System Modeling
A. System Descriptions and Basic Assumptions
Consider a MIMO uplink CDMA system over frequency-selective multipath fading channels, as shown in
$L_T := PQ_D + NKQ_M$ (1)
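As a quick numerical check of (1), the following minimal sketch counts the sub-data streams for one hypothetical configuration. The function name and the example values are illustrative assumptions, and Q_D and Q_M are taken to be the numbers of STBC and SM users, consistent with how these symbols are used later in the description.

```python
def total_substreams(P, N, K, Q_D, Q_M):
    """Return L_T = P*Q_D + N*K*Q_M, following equation (1)."""
    return P * Q_D + N * K * Q_M


# Hypothetical example: N = 2 transmit antennas, K = P = 2,
# two STBC users (Q_D) and two SM users (Q_M).
L_T = total_substreams(P=2, N=2, K=2, Q_D=2, Q_M=2)
print(L_T)  # 2*2 + 2*2*2 = 12
```

With K = P this agrees with the form L_T = P(Q_D + NQ_M) used in the detection section.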
Concretely, these two space-time signal transmission mechanisms can be completely described by the associated N×K space-time codeword matrix. In this invention, the qth user's data stream sq(k) is split into groups of sub-data streams as sq,l(k):=sq(Lqk+l−1), where 1≦l≦Lq and Lq is the number of sub-data streams transmitted by the qth user, which depends on the signaling mode. When a user transmits using STBC, i.e., qεSD, then Lq=P; when a user transmits using SM, i.e., qεSM, then Lq=NK. Hence, the space-time codeword matrix of the qth user can be represented as:
Wherein Aq,lεN×K is a space-time modulation matrix. For qεSD, based on REF. 2, Aq,l possess the following characteristics: (1) Aq,lAq,lH=IN when k=l and (2)Aq,kAq,lH+Aq,lAq,kH=ON when k≠l, qεSD. Besides, {tilde over (s)}q,l(k):=Re{sq,l(k)} when 1≦l≦Lq, and {tilde over (s)}q,l(k):=Im{sq,l-L
Suppose that the receiving end uses M (≧N) antennas. Define y(k)εCM(G+L
wherein HqεM(G+Lc
B. Vectorized Signal Model
To facilitate the detection process and analysis based on the matrix linear model (3), the inventors propose to use an equivalent vectorized linear model. Suppose sq(k):=[sq,l(k), . . . , sq,Lq(k)]T is the transmission symbol block of the qth user. Without loss of generality, the inventors re-number the NK symbols sq,l(k) of each SM user (i.e., qεSM), so that the nth data group sq,l(k) of Kth symbol, for (n−1)K+1≦l≦nK, can be transmitted from the nth antenna. Define {tilde over (s)}q(k):=[Re{sqT(k)}Im{sqT(k)}]Tε2L
$y_c(k) := [\tilde{y}^T(k) \,\cdots\, \tilde{y}^T(k+K-1)]^T = H_c s_c(k) + v_c(k)$, (4)
Where Hcε2KM(G+L
sc(k):=[{tilde over (s)}1T(k) . . . {tilde over (s)}QT(k)]Tε2L
is the symbol vector transmitted by all user terminals, and vc(k) is the resulting noise term. Through despreading and linear combining of yc(k) with the channel matrix Hc, a matched-filtered (MF) data vector is obtained as follows:
z(k):=HcTyc(k)=Fsc(k)+
wherein
F:=HcTHcε2L
and
II. Matched Filter Channel Matrix
As this section shows, the matrix F has an appealing structure. To characterize it, the inventors first identify the elementary blocks of F and then assemble them to see what F actually looks like. Depending on the channel characteristics, each user's data stream can be either STBC encoded to obtain transmit diversity or spatially multiplexed to obtain high spectral efficiency. This leads to two signal signature prototypes, one for each signaling mode. In addition, among all interference signatures, three distinct canonical building blocks need to be identified: two reflect the “intra-class” interference between each pair of distinct SM users or distinct STBC users, and the third reflects the “inter-class” interference between an STBC user and an SM user.
To pin down these signature prototypes, recall that an STBC user terminal and an SM user terminal transmit P and NP symbols, respectively, during K (=P) consecutive symbol periods. Therefore, if $F_{p,q}$ is the sub-matrix of F representing the interference signature between the pth and the qth users' data streams, then $F_{p,q} \in \mathbb{R}^{P \times P}$ if $p, q \in S_D$, $F_{p,q} \in \mathbb{R}^{NP \times NP}$ if $p, q \in S_M$, and $F_{p,q} \in \mathbb{R}^{P \times NP}$ if $p \in S_D$ and $q \in S_M$. All three types of $F_{p,q}$, together with the signal signature of each STBC or SM user, are specified as follows. In the sequel, $\mathcal{O}(P)$ is defined as the set of all P×P real orthogonal designs with constant diagonal entries (sub-matrices that are scalar multiples of IP fall within this category), as described in REF. 2. With $F_{p,q}$ representing the mutual coupling between the pth and the qth users, the following results hold:
Some explanations and discussions related to above results are further described in the following (as to the drawings for the matrix structure, see
(a) Property (1) describes the particular situation in which the data of all users are STBC encoded to obtain diversity gain. In this case, each P×P diagonal sub-matrix of F is a scalar multiple of IP, and each P×P off-diagonal sub-matrix of F is an orthogonal design.
(b) For p, qεSM, because SM signaling does not impose any spatial or temporal correlation among the transmitted data streams, the interference between two SM data streams transmitted from different antennas appears spatially and temporally decoupled. In particular, the corresponding sub-matrix has constant diagonal entries, since the respective propagation channels are assumed to be static during the K signaling periods.
(c) Property (3) establishes an interesting result: the interference from SM data streams retains, rather than destroys, the orthogonal feature of the O-STBC signals. A rough heuristic justification is that a single-antenna SM data stream can interfere with an STBC signal only through the “time” dimension. Because the SM data stream is temporally decoupled, the incurred interference is likely to leave the embedded temporal correlation of the STBC data streams unchanged. The resultant interference signatures therefore still appear as orthogonal-type matrices. This property also holds for single-antenna transmission in the proposed dual-signaling system, since such a user can be regarded as a single-antenna SM data stream.
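To make the notion of an "orthogonal-type" block in items (a) to (c) concrete, the following minimal sketch checks the defining property of a 2×2 real orthogonal design with constant diagonal entries. It assumes P = 2 and the design form [[x, y], [−y, x]] from REF. 2; the helper names `design2`, `matmul` and `transpose` are hypothetical.

```python
def design2(x, y):
    """A 2x2 real orthogonal design with constant diagonal entries (P = 2 assumed)."""
    return [[x, y], [-y, x]]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


X = design2(3, 4)
gram = matmul(X, transpose(X))
print(gram)  # X X^T = (x^2 + y^2) * I_2

# A scalar multiple of the identity is the special case y = 0.
print(design2(5, 0))
```

The product X X^T is a scalar multiple of the identity, which is the property that lets a whole block be treated like a single scaled symbol channel.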
For complex-valued constellations, results analogous to those above hold, with only possible modifications of the matrix dimensions. The results are summarized in Table 1, wherein A(i,j) denotes the (i,j)th sub-matrix of A with the proper matrix dimension.
III. Block-Wise SIC Symbol Detection
To separate the cross-coupled symbol streams in equation (6), this invention adopts the SIC algorithm of REF. 1. In the case of all-SM signaling (i.e., QD=0), it is natural to perform the conventional one-symbol-per-layer (symbol-wise) SIC detection of REF. 1; in this case, only LT=NQM sub-data streams in z(k) need to be detected. When a few STBC users are present, leading to dual-mode signaling, the inherent time latency of symbol detection needed to exploit the diversity benefit for the STBC users means that the receiver collects more independent data symbols, and a total of LT=P(QD+NQM) sub-data streams in z(k) must be detected. The receiver thus faces large-dimension data processing and an unavoidable increase in detection complexity. However, by judiciously exploiting the algebraic structure of O-STBC, the corresponding SIC detector can be implemented in a block-wise manner. That is, in each SIC iteration, a block of P symbols, transmitted either from a particular STBC user or from one antenna of an SM user, can be “jointly” detected. As a result, only QD+NQM iterations are needed to detect all P(QD+NQM) transmitted symbols, which brings the algorithm complexity back down.
Zero-Forcing (ZF) Criterion: The inventors first consider the ZF-based SIC detection algorithm, in which the optimum detection order at each iteration is found based on the maximum SNR criterion of REF. 1. At the initial stage, the ZF decision vector is F−1z(k), which, from (6), is obtained as:
sd(k):=F−1z(k)=sc(k)+F−1
Equation (8) shows that, for 1≦l≦LT, the lth symbol decision statistics, that is, the lth element of sd(k) is simply the desired symbol contaminated by an additive noise elTF−1
Because all transmission symbols have the same variance, equation (9) means that the (average) SNR of the lth decision channel is completely determined by [F−1]l,l, the lth diagonal element of the noise covariance matrix F−1. A small [F−1]l,l implies a large SNR in the lth decision channel, and hence better detection accuracy for the lth symbol decision statistic. As a result, the optimum detection order at the initial stage is obtained by searching for the index 1≦l≦LT at which [F−1]l,l is minimal. Determining the optimal index requires explicit knowledge of the diagonal elements of F−1.
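The ordering rule described above can be sketched as follows. This is only an illustration under stated assumptions: the 3×3 matrix `F` is a hypothetical stand-in for a matched-filter channel matrix, `inverse` is a plain exact Gauss-Jordan routine rather than the low-complexity method developed later, and `best_index` is a hypothetical helper name.

```python
from fractions import Fraction


def inverse(M):
    """Exact Gauss-Jordan inverse of a small invertible square matrix."""
    n = len(M)
    A = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
         for r, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)  # first usable pivot row
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [row[n:] for row in A]


def best_index(F):
    """Return (l, diag): l minimises [F^{-1}]_{l,l}, i.e. the highest-SNR decision channel."""
    Finv = inverse(F)
    diag = [Finv[l][l] for l in range(len(F))]
    return min(range(len(diag)), key=diag.__getitem__), diag


# Hypothetical symmetric positive definite example.
F = [[4, 1, 0],
     [1, 3, 1],
     [0, 1, 2]]
idx, diag = best_index(F)
print(idx, diag)  # index 0 has the smallest diagonal entry, 5/18
```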
For a fixed parameter P, the inventors define $\mathcal{F}(L)$ as the set of all invertible real symmetric PL×PL matrices X such that (i) each P×P diagonal block of X is a (non-zero) scalar multiple of IP, and (ii) each P×P off-diagonal block of X belongs to $\mathcal{O}(P)$. Denote by [F−1]k,l the (k,l)th P×P block sub-matrix of F−1, 1≦k, l≦L, where L:=QD+NQM. Then it can be further proved that [F−1]l,l=β0,lIP and that $[F^{-1}]_{k,l} \in \mathcal{O}(P)$ when k≠l. This result shows that the P(QD+NQM) diagonal elements of F−1 take at most QD+NQM distinct values. A block of P symbols can thus be simultaneously detected at the initial stage, with the optimal detection order being given by
Besides, the ZF weight matrix can be calculated from the corresponding indexed columns of F−1, as W0=F−1[eP(
Through the detect-and-cancel process followed by an associated linear combining of the resultant data as in (6), it can be directly verified that, at the ith iteration, 1≦i≦L−1, the noise covariance matrix can be written as:
Fi−1:=(Hc,iTHc,i)−1ε(L
where Hc,i is obtained from Hc by removing i block(s) of P columns (corresponding to the previously detected signals). Since Fi is simply obtained from F by removing the corresponding i block(s) of P columns and rows, we have:
$F_i \in \mathcal{F}(L-i)$, (11)
and
$F_i^{-1} \in \mathcal{F}(L-i)$. (12)
Based on the foregoing discussion, the inventors conclude that block-wise detection can likewise be performed at each iteration. The corresponding detection order and weight matrix can be calculated in an analogous way as:
It should be noted that the joint detection of a block of P symbols per iteration benefits uniquely from the use of orthogonal codes (O-STBC). However, when the number of transmit antennas of the STBC users is greater than four, the appealing block detection property does not hold even if orthogonal codes are used. This is because F then loses the particular structure described above and, as a consequence, the assertion on the inverse matrix F−1 may no longer be true.
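The block-wise ZF SIC procedure described in this section can be sketched as below. This is a minimal illustration, not the patented implementation: it assumes P = 2, real ±1 symbols, a noise-free observation z = F s, and a hypothetical 6×6 matrix `F` built from scaled identities and 2×2 orthogonal designs. Each iteration inverts F_i directly for clarity, whereas the description later replaces this by a recursive update. The names `inverse` and `block_zf_sic` are hypothetical.

```python
from fractions import Fraction


def inverse(M):
    """Exact Gauss-Jordan inverse of a small invertible square matrix."""
    n = len(M)
    A = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
         for r, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [row[n:] for row in A]


def block_zf_sic(F, z, P):
    """Detect +/-1 symbols one block of P at a time from matched-filter outputs z."""
    n = len(F)
    active = list(range(n // P))           # blocks not yet detected
    z = [Fraction(v) for v in z]
    s_hat, order = [0] * n, []
    while active:
        idx = [b * P + t for b in active for t in range(P)]
        Finv = inverse([[F[r][c] for c in idx] for r in idx])   # F_i^{-1}
        zi = [z[r] for r in idx]
        # Block with the smallest (constant) diagonal entry of F_i^{-1}.
        k = min(range(len(active)), key=lambda j: Finv[j * P][j * P])
        b = active[k]
        for t in range(P):
            stat = sum(w * v for w, v in zip(Finv[k * P + t], zi))  # ZF statistic
            s_hat[b * P + t] = 1 if stat >= 0 else -1
        for r in range(n):                 # cancel the detected block from z
            z[r] -= sum(F[r][b * P + t] * s_hat[b * P + t] for t in range(P))
        active.remove(b)
        order.append(b)
    return s_hat, order


# Hypothetical F with L = 3 blocks: diagonal blocks 6*I, 7*I, 8*I and
# off-diagonal blocks of the form [[x, y], [-y, x]].
F = [[6, 0, 1, 2, 1, -1],
     [0, 6, -2, 1, 1, 1],
     [1, -2, 7, 0, 2, 1],
     [2, 1, 0, 7, -1, 2],
     [1, 1, 2, -1, 8, 0],
     [-1, 1, 1, 2, 0, 8]]
s = [1, -1, -1, 1, 1, 1]
z = [sum(F[r][c] * s[c] for c in range(6)) for r in range(6)]
s_hat, order = block_zf_sic(F, z, P=2)
print(s_hat, order)
```

Only L = 3 iterations are needed for the 6 symbols in this example, which mirrors the Q_D + NQ_M iteration count stated above.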
Minimum Mean Square Error (MMSE) Criterion: The MMSE-based SIC detector can also jointly detect a block of P symbols per iteration, in essentially the same block-wise manner as in the ZF case, as shown next. At the initial stage, the MMSE weight matrix minimizing E{∥sc(k)−W0Tz(k)∥2} is obtained as:
The lth symbol MSE, that is E{|elT[sc(k)−W0Tz(k)]|2}, is then computed as:
Because Fε(L), it is obvious that R0:=[(2/συ2)F+IL
From Table 1, it can be seen that F does consist of orthogonal-type block sub-matrices; therefore, block-wise SIC detection for the complex-valued constellation case can be established similarly. Following essentially the same arguments as in the real-symbol case, we can derive a block-based ZF/MMSE SIC detector in which, per iteration, 2P real symbols are detected for an STBC user and 2K real symbols for an antenna of an SM user.
IV. Low-Complexity Detector
The major computational load of the SIC algorithm lies in the successive matrix inversions throughout all iterations. The inventors will show how knowledge of the embedded structure of F and its inverse F−1 can further reduce this load. Specifically, with the special structure of F, there is an efficient way of finding F−1 by solving a set of linear equations of relatively small dimensions based on the Cholesky decomposition. Moreover, the inverse matrix required at each iteration can be calculated from the parameters available at the previous stage.
A. An Efficient Method for Calculating F−1 Using the Cholesky Decomposition
Recall that every P×P block sub-matrix of F−1 is (loosely stated) a P×P real orthogonal design. Each such sub-matrix is completely characterized by P independent variables. It thus suffices to determine, say, its last column, and the rest can be obtained through appropriate linear transformations. This a priori structural information shows that the matrix F−1 is completely specified by its (jP)th columns, for 1≦j≦L. Hence, the calculation of F−1 amounts to solving the following linear equation of reduced dimensions:
$FG = E$, (17)
wherein G and E are LT×L matrices whose jth columns are the (jP)th columns of F−1 and $I_{L_T}$, respectively. In solving (17) for the unknown G, the jth column gj must satisfy gi,j=0 for (j−1)P+1≦i≦jP−1, because these P−1 consecutive zero entries come from the last column of the jth P×P diagonal sub-block of F−1. Only the non-zero entries thus remain to be determined. The Hermitian property of F−1 further limits the number of “actual” non-zero unknowns in each gj: only those lying below gjP−1,j(=0) need to be computed. This analysis implies that, for the jth column gj, only the last P(L−j)+1 entries need to be determined, and the number of unknowns decreases by P as the index j increases to j+1.
To show how the above structural information about G simplifies the solution of (17), first perform the Cholesky factorization of F to obtain F=LLT, where L is an LT×LT lower triangular matrix (which also has the block structure of $\mathcal{F}(L)$). Hence, (17) can be equivalently rewritten as:
$LL^T g_j = e_j$, $1 \le j \le L$. (18)
Because L is a lower triangular matrix, a typical approach to solving (18) for gj is forward and back substitution. Because the unknown elements to be determined in each gj all lie below the entry gjP−1,j(=0), the substitution procedures do not have to run through all the entries of gj. The back substitution simply terminates as soon as gjP,j is calculated; there is no need to compute the remaining “upper” entries, owing to the Hermitian property of F−1.
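The truncated substitution described above can be sketched as follows. The sketch makes several assumptions: P = 2 and L = 3, a hypothetical matrix `F` with scaled-identity diagonal blocks and 2×2 orthogonal-design off-diagonal blocks, floating-point arithmetic, and a Cholesky factor named `C` in the code (to avoid a clash with the block count L). The function names are hypothetical.

```python
import math


def cholesky(F):
    """Return lower-triangular C with F = C C^T (F symmetric positive definite)."""
    n = len(F)
    C = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = F[i][j] - sum(C[i][k] * C[j][k] for k in range(j))
            C[i][j] = math.sqrt(acc) if i == j else acc / C[j][j]
    return C


def inverse_column_tail(C, col):
    """Entries col..n-1 (0-based) of column `col` of F^{-1}, where F = C C^T.

    Forward substitution starts at row `col`, because the right-hand side is
    zero above it; back substitution runs bottom-up and stops at row `col`.
    """
    n = len(C)
    u = [0.0] * n
    for i in range(col, n):                       # solve C u = e_col
        rhs = (1.0 if i == col else 0.0) - sum(C[i][k] * u[k] for k in range(col, i))
        u[i] = rhs / C[i][i]
    g = [0.0] * n
    for i in range(n - 1, col - 1, -1):           # solve C^T g = u, stop at row col
        rhs = u[i] - sum(C[k][i] * g[k] for k in range(i + 1, n))
        g[i] = rhs / C[i][i]
    return g[col:]


P, L = 2, 3
F = [[6, 0, 1, 2, 1, -1],
     [0, 6, -2, 1, 1, 1],
     [1, -2, 7, 0, 2, 1],
     [2, 1, 0, 7, -1, 2],
     [1, 1, 2, -1, 8, 0],
     [-1, 1, 1, 2, 0, 8]]
C = cholesky(F)
# The (jP)th column of F^{-1} (1-based) is column jP - 1 in 0-based indexing.
tails = [inverse_column_tail(C, j * P - 1) for j in range(1, L + 1)]
print([len(t) for t in tails])  # P(L - j) + 1 unknowns per column: [5, 3, 1]
```

For j = 1, prepending the single structural zero to the computed tail gives the complete second column of F−1, which is one way to sanity-check the result.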
B. The Method of Recursive Calculation of F−1
As mentioned above, the optimal detection order and the associated ZF weighting matrix must be determined at the ith iteration. With F−1 obtained, the inventors show in what follows how $F_i^{-1}$ at each iteration can be recursively computed from $F_{i-1}$ and $F_{i-1}^{-1}$.
From the construction of Hc,i, the inventors observe that the matrix $F_i = H_{c,i}^T H_{c,i}$ is simply obtained from $F_{i-1}$ ($= H_{c,i-1}^T H_{c,i-1}$) by deleting one block of P columns and the correspondingly indexed block of P rows. Without loss of generality, assume that the last column and row blocks of $F_{i-1}$ are to be deleted; otherwise, the blocks to be discarded can simply be permuted to the right and bottom ends of $F_{i-1}$ to fit the prescribed form. The inventors can then partition $F_{i-1}$ as:
Where Bi-1ε(L
Fi−1=
where Bi-1T
Equation (20) thus provides a simple recursive formula for calculating Fi−1, based on the block of sub-matrices Fi-1 and Fi-1−1, without any “direct” matrix inversion operations. The overall low-complexity implementation of Fi, for 1≦i≦L−1, is illustrated in
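The idea of obtaining the next inverse without a direct matrix inversion can be illustrated with the standard block-inverse (Schur complement) identity. This is an assumption made for illustration and may differ in form from equation (20): if $F_{i-1}^{-1}$ is partitioned as [[A, B], [B^T, D]] with D the last P×P block, then $F_i^{-1} = A - B D^{-1} B^T$, and because D = βI_P under the block structure discussed above, D^{-1} reduces to a scalar division. The matrix `F`, the choice P = 2 and the function names are hypothetical; `inverse` is used only to initialise and verify.

```python
from fractions import Fraction


def inverse(M):
    """Exact Gauss-Jordan inverse, used here only to initialise and to verify."""
    n = len(M)
    A = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
         for r, row in enumerate(M)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(n):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [row[n:] for row in A]


def downdate_inverse(G_prev, P):
    """Inverse of F_i from G_prev = F_{i-1}^{-1} after deleting the last block.

    Assumes the last P x P diagonal block of G_prev equals beta * I_P, so that
    A - B D^{-1} B^T needs only a division by the scalar beta.
    """
    m = len(G_prev) - P
    beta = G_prev[m][m]
    return [[G_prev[r][c]
             - sum(G_prev[r][m + t] * G_prev[c][m + t] for t in range(P)) / beta
             for c in range(m)] for r in range(m)]


P = 2
F = [[6, 0, 1, 2, 1, -1],
     [0, 6, -2, 1, 1, 1],
     [1, -2, 7, 0, 2, 1],
     [2, 1, 0, 7, -1, 2],
     [1, 1, 2, -1, 8, 0],
     [-1, 1, 1, 2, 0, 8]]
G0 = inverse(F)                      # F^{-1}, computed once
G1 = downdate_inverse(G0, P)         # inverse after removing the last block
F1 = [row[:4] for row in F[:4]]      # F with the last block row/column deleted
print(G1 == inverse(F1))
```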
V. Two-Stage Block-Wise SIC Detection
As mentioned above, when a few STBC users are present, the inherent time latency for those users may force the receiver to process large-dimension data if the conventional symbol-wise SIC algorithm [“A fast recursive algorithm for optimum sequential signal detection in a BLAST system,” IEEE Trans. Signal Processing, vol. 51, no. 7, pp. 1722-1730, July 2003] is used. To remedy this, a “two-stage group SIC” detector is proposed: it first detects the group of STBC streams with the block-wise SIC algorithm described above, since these streams tend to be relatively robust to channel conditions. With this done, and after removing the detected STBC streams from the data yc(k), the SIC algorithm reverts to the conventional symbol-wise realization to recover the group of remaining SM streams. In this case, however, the detection order of the 2-stage group SIC algorithm may not be optimal, leading to a possible performance loss. It is also noted that, even under optimal ordering, the symbol-wise SIC algorithm can in fact be used once all the STBC streams have been detected.
VI. Computer Simulation Results
To assess the performance of the dual-signaling scheme, we consider a four-user cellular system specified as follows: 1) two transmit antennas at each handset, 2) two receive antennas at the base station (BS), and 3) a processing gain of 16. All four access channels are assumed to have a delay spread of five chips; two of them are spatially correlated, with the direct path following the Ricean model with the same κ-factor, κ=10, and the others undergo independent Rayleigh fading. At the BS, the SIC detector with the minimum mean square error (MMSE) criterion is used for signal recovery. The mean BER, averaged over all the detected streams, is used as the overall cell performance measure.
The performance of the proposed block SIC detector is compared with two existing interference cancellation schemes introduced for multiuser space-time coded wireless systems, namely, Naguib's 2-step approach [REF. 3] and Stamoulis's method [REF. 5]. For the previously considered four-user platform,
Number | Date | Country | Kind |
---|---|---|---|
94115211 A | May 2005 | TW | national |

Number | Name | Date | Kind |
---|---|---|---|
7103325 | Jia et al. | Sep 2006 | B1 |
7120395 | Tong et al. | Oct 2006 | B2 |
7209522 | Shirali | Apr 2007 | B1 |
20040132496 | Kim et al. | Jul 2004 | A1 |
20060035639 | Etemad et al. | Feb 2006 | A1 |

Number | Date | Country |
---|---|---|
20060268809 A1 | Nov 2006 | US |