The aspects of the present disclosure relate generally to wireless communication systems and in particular to data detection in a wireless communications link.
The proliferation of modern wireless communications devices, such as cell phones, smart phones, and tablet devices, has seen an attendant rise in demand for high volume multimedia data capabilities for large populations of user equipment (UE) or mobile stations. These multimedia data capabilities may be used to provide services at the UE such as streaming radio, online gaming, music, and TV. To support this ever increasing demand for higher data rates, multiple-access networks are being deployed based on a variety of transmission techniques such as time division multiple access (TDMA), code division multiple access (CDMA), frequency division multiple access (FDMA), orthogonal frequency division multiple access (OFDMA), and single carrier FDMA (SC-FDMA). New standards for wireless networks are also being developed to provide ever increasing data rates. Examples of these newer standards include Long Term Evolution (LTE) and LTE-Advanced (LTE-A) being developed by the third generation partnership project (3GPP), the 802.11 and 802.16 family of wireless broadband standards maintained by the Institute of Electrical and Electronics Engineers (IEEE), WiMAX, an implementation of the IEEE 802.16 standard from the WiMAX Forum, as well as others. Networks based on these standards provide multiple-access to support multiple simultaneous users by sharing available network resources.
Many of these newer standards support multiple antennas at both the base station and the UE. These multi-antenna configurations, referred to as multi-input multi-output (MIMO), provide improved spectral efficiency resulting in increased data rates. However, the improved capacity comes at the cost of increased complexity and computational requirements at the transmitter and receiver. Detection of the transmitted data symbols at the receiver can be a difficult problem in systems with multiple transmit and receive antennas. Theoretically, maximum likelihood detection (MLD) is the optimal method of detecting the transmitted data symbols. Unfortunately, the computational complexity of MLD in large MIMO systems often exceeds the computational capabilities of the UE, preventing its use in low end UE. An alternative to MLD is a linear minimum mean square error (LMMSE) detector, which has low computational complexity but suffers from sub-optimal performance, especially when the condition number of the MIMO channel matrix is large. Another approach is the development of less complex maximum likelihood (ML) based methods, sometimes referred to as quasi-ML detection methods. The goal of these quasi-ML detection methods is to reduce the overall computational complexity while providing performance that is as close as possible to MLD.
A conventional method of approximating optimal MIMO detection is to reduce the size of the candidate set of symbol vectors that needs to be searched. The search size can be reduced by removing branches from the search tree, sometimes referred to as a pruning process, based on priority information obtained from a low complexity linear detector. Once the candidate set has been substantially reduced, a simplified or approximate ML detection can be implemented to refine the search results.
Another conventional approach, often referred to as the QR-M algorithm, applies QR decomposition to the channel matrix and then reduces the size of the tree search by retaining only the best candidate nodes. A variant of the QR-M algorithm is known as the K-Best algorithm, which employs detection similar to the Vertical Bell Labs Layered Space-Time (V-BLAST) structure. With these approaches, only a limited number of candidates are retained at each layer, and because the limited number is usually much smaller than the full possible set, the complexity is also reduced.
These approaches can significantly reduce the complexity as compared to MLD. However, in order to achieve near MLD performance the complexity is still too high for implementation in many UE designs. This is especially true in advanced communication systems, such as LTE or LTE-A, where large systems including 4×4 or 8×8 MIMO are applied with high order modulation schemes such as 64 symbol quadrature amplitude modulation (64QAM) or 256 symbol QAM (256QAM). The complexity of detectors in these systems grows exponentially with the number of MIMO layers and rises steeply with the modulation order.
Thus there is a need for improved methods and apparatus for detecting symbols in advanced communication networks.
It is an object of the present disclosure to provide an apparatus and methods to detect data in a wireless communication signal. A further object of the present disclosure is to provide methods and apparatus that can achieve near optimal data detection performance with significantly reduced computational complexity. Reducing computational complexity allows low cost UE to achieve significant improvements in data transmission rates.
According to a first aspect of the present disclosure the above and further objects and advantages are obtained by an apparatus for receiving wireless communication signals that includes a processor configured to receive a digital communication signal, where the digital communication signal has a plurality of transmitted layers. The processor is configured to determine an estimated channel matrix based on the digital communication signal. The processor then determines a first estimated transmitted symbol vector and a mean square error matrix based on a linear analysis of the received digital communication signal and determines a first set of bit log likelihood ratios by performing linear minimum mean square error detection based on the first estimated transmitted symbol vector. The processor is also configured to determine a second set of bit log likelihood ratios by performing a tree search for one or more layers in the plurality of transmitted layers in the digital communication signal, based on the first estimated transmitted symbol vector and the mean square error matrix. The processor is configured to determine a refined set of bit log likelihood ratios based on the first set of bit log likelihood ratios and the second set of bit log likelihood ratios, and to determine a second estimated transmitted symbol vector based on the refined set of bit log likelihood ratios. The processor determines the second set of bit log likelihood ratios by selecting a set of parent layers from the plurality of transmitted layers, wherein the number of layers in the set of parent layers is less than or equal to the number of layers in the plurality of transmitted layers. A shortened channel correlation matrix is then determined for each layer in the set of parent layers, based on the mean square error matrix. An optimal shortened channel matrix is determined based on each shortened channel correlation matrix and the estimated channel matrix. 
During each tree search a single child node is selected for each parent node in the tree search based on evaluation of a branch metric, and the second set of bit log likelihood ratios is determined based on the results of each tree search.
In a first possible implementation form of the apparatus according to the first aspect, improved data transmission rates are achieved with reduced computational complexity by configuring the processor to determine the first set of bit log likelihood ratios based on a detector comprising one or more of a linear minimum mean square error detector, successive interference cancellation, and parallel interference cancellation.
In a second possible implementation form of the apparatus according to the first aspect as such or to the first possible implementation form of the first aspect, improved data transmission rates are achieved with reduced computational complexity by configuring the processor to evaluate the branch metric based on the shortened channel correlation matrix and a single parent node.
In a third possible implementation form of the apparatus according to the first aspect as such or to the first or second possible implementation forms of the first aspect, improved data transmission rates are achieved with reduced computational complexity by configuring the processor to select the single child node as the node having a maximum value of the branch metric.
In a fourth possible implementation form of the apparatus according to the first aspect as such or to the first through third possible implementation forms of the first aspect, processing time is reduced by configuring the processor to perform the tree search for each parent layer in the set of parent layers in parallel.
In a fifth possible implementation form of the apparatus according to the first aspect as such or to the first through fourth possible implementation forms of the first aspect, improved data transmission rates are achieved with reduced computational complexity by configuring the processor to select, when the corresponding element of the shortened channel correlation matrix is positive, the child node having a peak value of the branch metric and, when the corresponding element of the shortened channel correlation matrix is negative, to select the child node based on a quadrant of a residual value.
In a sixth possible implementation form of the apparatus according to the first aspect as such or to the first through fifth possible implementation forms of the first aspect, improved data transmission rates are achieved with reduced computational complexity when the number of parent layers is smaller than the number of transmitted layers by configuring the processor to select the layers in the set of parent layers based on an amount of energy or a channel capacity of the plurality of transmitted layers.
In a seventh possible implementation form of the apparatus according to the first aspect as such or to the first through sixth possible implementation forms of the first aspect, improved data transmission rates are achieved with reduced computational complexity by configuring the processor to determine the refined set of bit log likelihood ratios when the second set of bit log likelihood ratios is missing a bit hypothesis by determining the sign of the bit log likelihood ratio corresponding to the missing bit hypothesis and determining the refined set of bit log likelihood ratios based on the determined sign and the first set of log likelihood ratios.
In an eighth possible implementation form of the apparatus according to the first aspect as such or to the first through seventh possible implementation forms of the first aspect, improved data transmission rates are achieved with reduced computational complexity by configuring the processor to determine the shortened channel correlation matrix based on a mismatched received signal probability density function.
In a ninth possible implementation form of the apparatus according to the first aspect as such or to the first through eighth possible implementation forms of the first aspect, improved data transmission rates are achieved with reduced computational complexity by configuring the processor to determine the shortened channel correlation matrix based on a factorization matrix. The factorization matrix has non-zero elements on its main diagonal, non-zero elements in its last column, and the remaining elements of the factorization matrix have a zero value.
In a tenth possible implementation form of the apparatus according to the first aspect as such or to the first through ninth possible implementation forms of the first aspect, improved data transmission rates are achieved with reduced computational complexity by configuring the processor to use a permutation matrix to switch a layer in the set of parent layers to be a parent layer for the tree search. The elements of the permutation matrix have a value of zero or one, pre or post multiplication of the permutation matrix by a transpose of the permutation matrix yields an identity matrix, and the permutation matrix is configured to switch a layer in the set of parent layers to be the parent layer for the corresponding tree search and to leave the remaining layers unchanged.
In an eleventh possible implementation form of the apparatus according to the first aspect as such or to the first through tenth possible implementation forms of the first aspect, improved data transmission rates are achieved with reduced computational complexity by configuring the processor to select the set of parent layers based on an amount of energy or a channel capacity of each layer in the plurality of transmitted layers.
In a twelfth possible implementation form of the apparatus according to the first aspect as such or to the first through eleventh possible implementation forms of the first aspect, improved data transmission rates are achieved with reduced computational complexity by configuring the processor to determine the shortened channel correlation matrix for a second layer in the set of parent layers based on computation results obtained from determining the shortened channel correlation matrix for a first layer in the set of parent layers.
According to a second aspect of the present disclosure the above and further objects and advantages are obtained by a method for detecting data in a wireless communication system. The method includes receiving a digital communication signal, where the digital communication signal has a plurality of transmitted layers. An estimated channel matrix is determined based on the digital communication signal and a first estimated transmitted symbol vector and a mean square error matrix are determined based on a linear analysis of the received digital communication signal. A first set of bit log likelihood ratios is determined by performing linear minimum mean square error detection based on the first estimated transmitted symbol vector, and a second set of bit log likelihood ratios is determined by performing a tree search for one or more layers in the plurality of transmitted layers in the digital communication signal, based on the first estimated transmitted symbol vector and the mean square error matrix. A refined set of bit log likelihood ratios is determined from the first set of bit log likelihood ratios and the second set of bit log likelihood ratios, and a second estimated transmitted symbol vector is determined based on the refined set of bit log likelihood ratios. Determination of the second set of bit log likelihood ratios is accomplished by selecting a set of parent layers from the plurality of transmitted layers, wherein a number of layers in the set of parent layers is less than or equal to a number of layers in the plurality of transmitted layers. A shortened channel correlation matrix is then determined for each layer in the set of parent layers, based on the mean square error matrix and an optimal shortened channel matrix is determined based on each determined shortened channel correlation matrix and the estimated channel matrix. 
A single child node is selected for each parent node in the tree search based on evaluation of a branch metric, and the second set of bit log likelihood ratios is determined based on the tree search.
According to a third aspect of the present disclosure the above and further objects and advantages are obtained by a computer program including non-transitory computer program instructions that when executed by a processor cause the processor to perform the method according to the second aspect as such or to the first possible implementation form of the second aspect.
These and other aspects, implementation forms, and advantages of the exemplary embodiments will become apparent from the embodiments described herein considered in conjunction with the accompanying drawings. It is to be understood, however, that the description and drawings are designed solely for purposes of illustration and not as a definition of the limits of the present disclosure, for which reference should be made to the appended claims. Additional aspects and advantages of the disclosure will be set forth in the description that follows, and in part will be obvious from the description, or may be learned by practice of the disclosure. Moreover, the aspects and advantages of the disclosure may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the appended claims.
In the following detailed portion of the present disclosure, embodiments of the disclosure will be explained in more detail with reference to the example embodiments shown in the drawings, in which:
In wireless receivers such as those used in UE employed as mobile devices, it is desirable to use detectors with low or reduced complexity to provide accurate symbol detection in lower cost UE. This goal can be achieved by using a technique employing a device according to an embodiment of the present disclosure which is configured to receive a digital communication signal that includes a plurality of layers. A channel matrix is estimated based on the digital communication signal, and a first estimated transmitted symbol vector and a mean square error matrix are determined based on a linear analysis of the received digital communication signal. A first set of bit log likelihood ratios is determined using a linear minimum mean square error type detector based on the first estimated symbol vector and the mean square error matrix, and a second set of bit log likelihood ratios is determined by performing a tree search for one or more of the transmitted layers based on the first estimated symbol vector and the mean square error matrix. A refined set of bit log likelihood ratios is determined based on both the first and second sets of bit log likelihood ratios. A final estimated transmitted symbol vector is determined based on the refined set of bit log likelihood ratios.
A second set of bit log likelihood ratios is determined using a tree search that begins by selecting a set of parent layers from the set of transmitted layers. The set of selected parent layers may include all of the transmitted layers or a subset of the transmitted layers. A special shortened channel correlation matrix is determined for each of the selected parent layers and an optimal shortened channel matrix is determined from each shortened channel correlation matrix. A tree search is performed for each layer in the set of parent layers where each tree search is performed by selecting a single child node for each parent node based on evaluation of a branch metric and the second set of bit log likelihood ratios is determined based on the tree search.
As an aid to understanding the reduced complexity detector according to an embodiment described above, begin with a conventional model for the received signal in a wireless MIMO communication system as shown in Equation 1:
$Y = HX + W$. Eq. 1
The model of Equation 1 represents a MIMO system where the number of receive antennas is represented by an integer M and the number of transmit antennas is represented by an integer N. The transmitted signal X is an N×1 column vector, $X = (x_1, x_2, \ldots, x_N)^T$, where $x_i$ ($1 \le i \le N$) represents the symbol transmitted on the ith antenna. The received signal Y is an M×1 column vector, $Y = (y_1, y_2, \ldots, y_M)^T$, where $y_i$ ($1 \le i \le M$) represents the symbol received on the ith antenna. The MIMO channel matrix H is an M×N matrix made up of N column vectors, $H = (h_1, h_2, \ldots, h_N)$, where $h_i$ ($1 \le i \le N$) represents the ith column vector in the channel matrix H. Thermal noise is represented in the system model illustrated in Equation 1 as a column vector $W = (w_1, w_2, \ldots, w_M)^T$ with dimension M×1.
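As an illustration of the model in Equation 1, the following minimal Python sketch generates one received vector. The function name `simulate_received`, the QPSK alphabet, the random channel draw, and the noise level are all assumptions chosen for the example and are not taken from the disclosure.

```python
import random

def simulate_received(H, X, sigma, rng):
    """Return Y = H X + W for an M x N channel H and a length-N symbol vector X."""
    M, N = len(H), len(X)
    Y = []
    for m in range(M):
        # Complex Gaussian noise sample with total variance sigma**2 (assumed noise model).
        w = complex(rng.gauss(0.0, sigma / 2 ** 0.5), rng.gauss(0.0, sigma / 2 ** 0.5))
        Y.append(sum(H[m][n] * X[n] for n in range(N)) + w)
    return Y

rng = random.Random(0)
M, N = 4, 2  # receive antennas, transmit antennas (illustrative sizes)
QPSK = [complex(a, b) / 2 ** 0.5 for a in (-1, 1) for b in (-1, 1)]
H = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) / 2 ** 0.5 for _ in range(N)]
     for _ in range(M)]
X = [rng.choice(QPSK) for _ in range(N)]
Y = simulate_received(H, X, sigma=0.1, rng=rng)
```

The result `Y` has one entry per receive antenna, matching the M×1 dimension stated above.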
The bit log likelihood ratio (bit LLR) may be calculated as shown in Equation 2:
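where $S_{b_k=i}$, $i = 0, 1$, denotes the set of symbol vectors with the kth bit equal to i. The Jacobian approximation replaces the summations with the maximum of the probability terms: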
Even with the Jacobian approximation, the complexity in large MIMO systems remains prohibitively high for implementation in many UE designs. For example, in a MIMO system where there are 4 transmit antennas and the data is modulated using a 64 symbol alphabet such as with 64QAM, the set of symbol vectors with the kth bit equal to one, $S_{b_k=1}$
MLD methods may be formulated as tree search problems as illustrated by the search tree 100 depicted in
For example, when both the first 108 and second 110 layers are transmitted using 64QAM, the second level 110 will include $64^2$ or 4096 nodes. For clarity, some of the nodes in each layer have been left out of the tree diagram 100 and replaced with dashed lines 120, where the dashed lines 120 are used to indicate a continuation of the adjacent pattern. In a full complexity design the MLD search pattern includes the entire tree 100. Each path from the root node 106 to a lowest level 112 child node represents a candidate path corresponding to a particular symbol vector $(x_{N-2}, x_{N-1}, x_N)$. For example, nodes 106, 114, 116, 118 represent a candidate path from the root node to the lowest level child node. In a full complexity MLD search all candidate paths are evaluated using a branch metric, also referred to herein as a path metric, to determine the best candidate path or symbol vector.
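The growth of the full search tree can be checked with a few lines of Python. This is only a counting sketch; the helper name `nodes_per_level` is hypothetical, and every layer is assumed to use the same alphabet.

```python
def nodes_per_level(alphabet_size, num_layers):
    """Number of nodes at each level of a full ML search tree (root excluded)."""
    return [alphabet_size ** level for level in range(1, num_layers + 1)]

levels = nodes_per_level(64, 3)
print(levels)      # [64, 4096, 262144]
print(64 ** 4)     # 16777216 candidate paths for four layers of 64QAM
```

The second entry reproduces the 4096 nodes quoted above for two 64QAM layers.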
A number of conventional methods may be used to reduce the complexity of maximum likelihood symbol detection while keeping performance close to that of MLD. One conventional approach, often referred to as the QR-M algorithm, begins by performing QR decomposition on the channel matrix H=QR and transforming the received signal model as shown in Equation 3:
Z=RX+V. Eq. 3
where the transformed received symbol vector Z is formed from the Hermitian transpose of the matrix Q times the received symbol vector Y: $Z = Q^H Y$. The Hermitian transpose, also known as the conjugate transpose, is denoted by a superscript H. The thermal noise W is transformed to a noise vector V where $V = Q^H W$, and the matrix R is an upper triangular matrix. The search process is based on the transformed system model illustrated in Equation 3 and starts from the bottom layer of the transmitted symbol vector X. The modified search tree 200 resulting from the QR-M algorithm is illustrated in
For each layer 208, 210, 212 in the search tree 200, a number of candidate nodes are preserved and subtracted from the transformed received signal Z when detecting the next layer. In the search tree 200 preserved candidate nodes are indicated by dark colored nodes, such as the dark color used to shade node 202, while light color nodes, such as the light color used to shade node 216, are pruned or removed from the search tree. In a typical implementation of the QR-M algorithm it is often necessary to retain a fairly large number of nodes in each layer in order to preserve near MLD performance. Therefore, because the total number of retained nodes remains relatively large, the total complexity is often still prohibitively high for implementation in many UE designs.
An exemplary embodiment of a detection method as used by a detector according to an embodiment of the present disclosure significantly reduces the complexity of symbol detection through the use of an optimal channel shortening procedure followed by a simplified tree search process. The optimal channel shortening procedure is used to determine an optimal shortened channel matrix
$\tilde{p}_{Y|X} \propto \exp\left(2\,\mathrm{Re}\{Y^H H_r X\} - X^H G_r X\right)$. Eq. 4
The transmitted data X and received data Y may be assumed to be jointly Gaussian. Eigenvalue decomposition allows the shortened channel correlation matrix $G_r$ to be decomposed into a unitary matrix U and a diagonal eigenvalue matrix $\Lambda_g$ as $G_r = U \Lambda_g U^H$, where $\Lambda_g = \mathrm{diag}(\lambda_1^g, \lambda_2^g, \ldots, \lambda_N^g)$ and $\lambda_i^g$ are the eigenvalues of the shortened channel correlation matrix $G_r$.
Let the transformed received symbol vector $Z = U^H X = (z_1, z_2, \ldots, z_N)^T$ denote the received data after preprocessing with the unitary matrix U. The probability function of the received data Y can then be described as shown in Equation 5:
where the vector $D = (Y^H H_r U)^H = (d_1, d_2, \ldots, d_N)^T$ is a column vector.
The expected value of the probability with respect to the received signal Y, denoted by EY, is shown in Equation 6:
By defining a matrix R as shown in Equation 7:
$R = E(DD^H) = U^H H_r^H E(YY^H) H_r U = U^H H_r^H (HH^H + \sigma^2 I) H_r U$, Eq. 7
the expected value EY can be re-written as shown in Equation 8:
It can be shown that the expected value relationship illustrated below in Equation 9 holds for the above described system:
$E_{X,Y}\left(\log_2(\tilde{p}_{Y|X})\right) = E_{X,Y}\left(2\,\mathrm{Re}\{Y^H H_r X\} - X^H G_r X\right) = 2\,\mathrm{Re}\{\mathrm{tr}(H_r H^H)\} - \mathrm{tr}(G_r)$. Eq. 9
The lower bound of the achievable information rate can be found as shown in Equation 10:
Applying the above definitions leads to the relationship shown in Equation 11:
The optimal shortened channel matrix
The optimal shortened channel matrix
$H_r = (HH^H + \sigma^2 I)^{-1} H (G_r + I)$. Eq. 13
Putting the optimal shortened channel matrix
$\tilde{I} = \log_2(\det(G_r + I)) + \mathrm{tr}\left((G_r + I) H^H (HH^H + \sigma^2 I)^{-1} H\right) - \mathrm{tr}(G_r)$. Eq. 14
Equation 14 can be solved to find the shortened channel correlation matrix Gr by assuming a decomposition of the shortened channel correlation matrix Gr based on a factorization matrix F as shown in Equation 15:
$G_r = F^H F - I$, Eq. 15
where the sum of the shortened channel correlation matrix Gr and the identity matrix I, (Gr+I) is positive definite.
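The decomposition in Equation 15 can be tried numerically. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the 3×3 matrix F is a made-up example whose non-zero entries lie only on the main diagonal and in the last column, the form this disclosure describes for the factorization matrix.

```python
def conj_transpose(A):
    """Hermitian (conjugate) transpose of a matrix stored as a list of rows."""
    return [[A[r][c].conjugate() for r in range(len(A))] for c in range(len(A[0]))]

def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def shortened_correlation(F):
    """G_r = F^H F - I, following Equation 15."""
    N = len(F)
    FhF = matmul(conj_transpose(F), F)
    return [[FhF[i][j] - (1 if i == j else 0) for j in range(N)] for i in range(N)]

# Made-up example: non-zero diagonal and last column, zeros elsewhere.
F = [[1.5, 0, 0.4 + 0.2j],
     [0, 1.2, -0.3j],
     [0, 0, 0.9]]
G = shortened_correlation(F)
for row in G:
    print(row)
```

With this choice of F, the only non-zero off-diagonal entries of G fall in the last row and last column, so in this example each remaining layer is coupled only to the last layer. Because F is invertible here, $G_r + I = F^H F$ is positive definite.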
A reduced complexity tree search, referred to herein as an alternative marginalized tree search (AMTS), may be facilitated by using a specially formed factorization matrix F, where the factorization matrix F is an N×N upper triangular matrix having the form illustrated in Equation 16, with non-zero elements on the main diagonal and in the last column and all other elements equal to zero:
The lower bound of the achievable information rate Ĩ can be re-written based on the factorization matrix F as shown in Equation 17:
A mean square error (MSE) matrix B can be derived from the channel matrix H as shown in Equation 18:
$B = I - H^H (HH^H + \sigma^2 I)^{-1} H$. Eq. 18
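For a small channel matrix the MSE matrix can be evaluated directly. The following sketch is illustrative only: the helper names, the 3×2 channel values, and the noise variance are assumptions, and the Gauss-Jordan inverse is a stand-in for whatever linear solver an implementation would actually use.

```python
def conj_transpose(A):
    """Hermitian (conjugate) transpose of a matrix stored as a list of rows."""
    return [[A[r][c].conjugate() for r in range(len(A))] for c in range(len(A[0]))]

def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def add_scaled_identity(A, s):
    """Return A + s*I for a square matrix A."""
    return [[A[i][j] + (s if i == j else 0.0) for j in range(len(A))]
            for i in range(len(A))]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting; intended for small matrices."""
    n = len(A)
    aug = [[complex(v) for v in row] + [complex(i == j) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def mse_matrix(H, sigma2):
    """B = I - H^H (H H^H + sigma^2 I)^-1 H for an M x N channel matrix H."""
    Hh = conj_transpose(H)
    inner = inverse(add_scaled_identity(matmul(H, Hh), sigma2))
    T = matmul(matmul(Hh, inner), H)
    N = len(H[0])
    return [[(1.0 if i == j else 0.0) - T[i][j] for j in range(N)] for i in range(N)]

# Made-up 3x2 channel and noise variance.
H = [[1 + 1j, 0.5], [0.2j, 1 - 0.5j], [0.3, -0.7 + 0.1j]]
sigma2 = 0.5
B = mse_matrix(H, sigma2)
```

A useful sanity check is the standard identity $B = \sigma^2 (H^H H + \sigma^2 I)^{-1}$, which gives the same N×N matrix while inverting an N×N rather than an M×M matrix.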
Because the factorization matrix F is an upper triangular matrix a relationship between the lower bound of the achievable information rate Ĩ and the MSE matrix B can be written as shown in Equation 19:
The kth diagonal element $(FBF^H)_k$ of the matrix $FBF^H$ can be calculated as shown in Equation 20:
where $b_{kj}$ represents the kth row and jth column element of the MSE matrix B, and $f_{kj}$ represents the kth row and jth column element of the factorization matrix F.
Taking the partial derivative of the lower bound of the achievable information rate Ĩ with respect to the complex conjugate $f_{kN}^*$ of the elements of the last column of the factorization matrix F and setting the result equal to zero as shown in Equation 21:
yields a relationship between the elements of the factorization matrix F and the elements of the MSE matrix B as shown in Equation 22:
Using the result found in Equation 22 in the lower bound of the achievable information rate Ĩ, i.e. putting $f_{kN}$ from Equation 22 into Equation 19, and taking the partial derivative of the lower bound of the achievable information rate Ĩ with respect to the complex conjugate $f_{kN}^*$ of the elements of the last column of the factorization matrix F and setting the result equal to zero as shown in Equation 23:
provides a relationship between the elements $f_{kj}$ of the factorization matrix F and the elements $b_{kj}$ of the MSE matrix B, as shown in Equation 24:
The factorization matrix F may be uniquely obtained from the MSE matrix B according to Equation 24. The shortened channel correlation matrix Gr may then be obtained using Equation 15, and the optimal shortened channel matrix
where bij is the element of the MSE matrix B at the ith row and jth column and as before N is the number of transmitted layers.
Using the shortened channel correlation matrix Gr obtained from Equation 25, the a priori probability can be rewritten as shown in Equation 26:
The pre-processed symbol vector ZH=(z(1),z(2), . . . z(N)) is equal to the LMMSE estimation of the transformed received symbol vector Z and may be defined as shown in Equation 27:
$Z = H^H (HH^H + \sigma^2 I)^{-1} Y$. Eq. 27
Once the optimal shortened channel matrix
A path metric for the kth layer can be defined as in Equation 29:
From the a priori probability shown in Equation 26 it can be seen that the best path is the one that maximizes the accumulated path metric γ. However, because of the special form of the shortened channel correlation matrix Gr, maximizing the accumulated path metric γ is equivalent to maximizing each path metric γk at the kth layer separately as illustrated in Equation 30:
The relationship illustrated in Equation 30 shows that the search of the optimal candidate x(k) for each layer may be done by independently maximizing an individual layer branch metric γk for each layer. This allows the selection of each candidate to be handled in parallel. The parallel structure of the AMTS is illustrated by the search tree 300 shown in
For example when the parent layer 304 is transmitted using 256QAM there will be 256 parent nodes in the parent layer 304. For clarity, some of the parent nodes and their associated child nodes have been omitted from the tree diagram 300 and replaced with dashed lines 310 indicating where tree branches have been omitted. As used herein, the term “branch” or “tree branch” refers to a node and its associated child nodes. For example the AMTS search tree 300 includes a plurality of parallel branches such as the branch made up of nodes 306, 312, 314, 318. In accordance with certain embodiments of the AMTS method described above, each parent node, such as parent node 306, in the parent layer 304 has a single child node, such as child node 312 and each child node, such as child nodes 312, 314, also has a single child node. As the search progresses a child node 312 is selected for the parent node 306. This child node 312 then becomes the parent node for selection of the child node in the next lower level. This process continues until a node has been selected for all layers in the tree search. Including only a single child node in each child layer significantly reduces the overall complexity of the AMTS as compared to MLD or the QR-M algorithm. While only three child layers 308 are illustrated in the search tree 300 it is understood that when the transmitted signal has more than four layers the search tree 300 will include additional child layers where each child layer 308 corresponds to a layer in the transmitted signal.
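The saving from keeping a single child node under each node can be illustrated by counting retained nodes. The sketch below assumes every layer uses the same alphabet and counts nodes for one search leg only; it does not model the arithmetic needed to select each child, and both function names are hypothetical.

```python
def amts_retained_nodes(alphabet_size, num_layers):
    """Nodes kept in one search leg: each parent node plus one child per child layer."""
    return alphabet_size * num_layers

def full_search_paths(alphabet_size, num_layers):
    """Candidate paths in an exhaustive ML search over the same layers."""
    return alphabet_size ** num_layers

print(amts_retained_nodes(256, 4))   # 1024
print(full_search_paths(256, 4))     # 4294967296
```

For four layers of 256QAM the single-child tree keeps 256 parent nodes and three child nodes under each, against more than four billion paths in the full tree.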
In alternate embodiments several candidate nodes may be selected at each child layer by selecting candidate nodes having the highest values of the individual branch metric γk. However, in embodiments designed to have the lowest possible complexity, a single best node is chosen under each parent node as illustrated in
To find the best candidate node in each child layer 308 the maximum value of the individual layer branch metric γk needs to be found. The maximum value can be found by taking the partial derivative of the individual branch metric γk with respect to each candidate and setting the result equal to zero as shown in Equation 31:
The maximum value can then be found as shown in Equation 32:
When the diagonal value $g_{kk}$ of the shortened channel correlation matrix is positive, the individual branch metric γk describes a concave surface for the candidate symbol x(k), and the peak value $\hat{x}(k)$ is the maximum point on the surface. In the case of a concave surface the best estimation is provided by Equation 32, and quantizing the peak value $\hat{x}(k)$ to the nearest constellation point in a QAM alphabet provides the best estimate of the candidate symbol x(k).
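The quantization step for the concave case can be sketched as follows. This is a minimal illustration under assumptions: the constellation is taken to be an unnormalized square QAM grid with odd-integer coordinates, the function name `nearest_qam_point` is hypothetical, and the peak value is supplied as an input rather than computed from Equation 32.

```python
def nearest_qam_point(x_hat, order=64):
    """Quantize complex x_hat to the nearest point of a square QAM grid.

    Assumes an unnormalized grid with odd-integer coordinates, e.g.
    -7, -5, ..., 5, 7 on each axis for 64QAM.
    """
    side = int(round(order ** 0.5))   # points per axis, e.g. 8 for 64QAM
    max_level = side - 1              # largest coordinate, e.g. 7

    def quantize(v):
        level = 2 * round((v - 1) / 2) + 1          # nearest odd integer
        return max(-max_level, min(max_level, level))  # clip to the grid edge

    return complex(quantize(x_hat.real), quantize(x_hat.imag))

print(nearest_qam_point(2.3 - 0.2j))             # (3-1j)
print(nearest_qam_point(9.5 - 8.1j))             # (7-7j)
print(nearest_qam_point(2.3 + 2.9j, order=16))   # (3+3j)
```

Because the in-phase and quadrature components are quantized independently, no enumeration over the alphabet is needed in this case.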
For example,
When the diagonal value $g_{kk}$ of the shortened channel correlation matrix is non-positive, the individual branch metric γk is a convex function and the maximal value is located along the boundaries, so the corners of the constellation map need to be considered. Further, when the modulus corresponds to Equation 33:
the best candidate depends on the quadrant in which the residual signal z*(k)−gkNx(N) is located.
As described above, a single best candidate is selected under each parent node for each layer, resulting in significantly lower complexity than MLD. However, preserving only a single candidate node at each layer is inherently sub-optimal. To compensate for this, each layer, or at least some of the weaker layers, is switched to be the parent layer and the AMTS process is repeated with each such layer as the parent layer. The results of each AMTS leg are then combined to obtain a more reliable result.
Switching of a layer to become the parent layer may be accomplished using a permutation matrix. In an exemplary embodiment, to switch a layer, designated as the jth layer in the following equation, to be the parent layer, a permutation matrix Pj is defined as shown in Equation 34:
where the 1 in the last column to the right corresponds to the jth element and the remaining elements in the last column are zero. The permutation matrix Pj may be used to permute the jth column of a matrix to the last column while keeping the rest of the columns in the same order.
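A permutation matrix with the properties described above can be sketched as follows. The helper names permutation_matrix, matmul and transpose are hypothetical, and plain Python lists are used only to keep the example self-contained.

```python
def permutation_matrix(n: int, j: int) -> list[list[int]]:
    """Build P_j (n x n) that moves column j (1-based) to the last position.

    Column c of P_j is the unit vector selecting the source column that lands
    in position c, so the last column has its single 1 in row j.
    """
    order = [i for i in range(n) if i != j - 1] + [j - 1]  # 0-based source columns
    return [[1 if order[c] == r else 0 for c in range(n)] for r in range(n)]


def matmul(a, b):
    """Plain-Python matrix product for small illustrative matrices."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))]
            for r in range(len(a))]


def transpose(a):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


if __name__ == "__main__":
    p2 = permutation_matrix(4, 2)
    h = [[11, 12, 13, 14],
         [21, 22, 23, 24]]            # columns h1..h4 of a toy 2 x 4 matrix
    print(matmul(h, p2))              # [[11, 13, 14, 12], [21, 23, 24, 22]]
    print(matmul(transpose(p2), p2))  # 4 x 4 identity matrix
```

Post-multiplying the toy matrix by P2 moves its second column to the end and leaves the remaining columns in their original order.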
The permutation matrix Pj also has the useful property that pre- or post-multiplying it by its transpose yields the identity matrix: PjTPj=PjPjT=I. The permutation matrix Pj can be used to switch the jth layer of the received signal model by rewriting Equation 1 as shown in Equation 35:
$Y = H P_j P_j^{T} X + N = H_j X_j + N$, Eq. 35
where Hj is the permuted channel matrix and Xj is the transmitted symbol vector permuted according to the permutation matrix Pj.
After post-multiplying with the permutation matrix Pj the column vectors of the permuted channel matrix Hj are re-ordered as shown in Equation 36:
$H_j = H P_j = (h_1, h_2, \ldots, h_{j-1}, h_{j+1}, \ldots, h_N, h_j)$, Eq. 36
and after pre-multiplying the transmitted symbol vector X with the transpose of the permutation matrix, PjT, its elements are re-ordered as shown in Equation 37:
$X_j = P_j^{T} X = (x(1), x(2), \ldots, x(j-1), x(j+1), \ldots, x(N), x(j))^{T}$. Eq. 37
An embodiment of the AMTS process can then be implemented for the jth layer based on the permuted channel matrix Hj and the permuted symbol vector Xj.
Much of the complexity of the channel shortening process can be shared by all the parallel searches of the AMTS, which significantly reduces the overall complexity. With the permuted received signal model described above in Equation 35, the permuted MSE matrix Bj is updated as shown in Equation 38:
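$B_j = I - H_j^{H}(H_j H_j^{H} + \sigma^2 I)^{-1} H_j = I - P_j^{T} H^{H}(H P_j P_j^{T} H^{H} + \sigma^2 I)^{-1} H P_j = P_j^{T}\left(I - H^{H}(H H^{H} + \sigma^2 I)^{-1} H\right) P_j = P_j^{T} B P_j$. Eq. 38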
The MSE matrix B is the original non-permuted MSE matrix defined above. Thus, since Pj is a permutation matrix, the permuted MSE matrix Bj may be obtained from the MSE matrix B by re-ordering its rows and columns, with a negligible increase in complexity.
Similarly the transformed received symbol vector shown in Equation 27 may be permuted to obtain a permuted transformed received symbol vector Zj as shown in Equation 39:
$Z_j = P_j^{T} H^{H}(H H^{H} + \sigma^2 I)^{-1} Y = P_j^{T} Z$. Eq. 39
The transformed received symbol vector Z and the corresponding MSE matrix B may be obtained from the initial LMMSE step. Therefore only the shortened channel correlation matrix Grj needs to be re-calculated based on the permuted MSE matrix Bj after the jth layer is switched to be the parent node.
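Because Pj is a permutation matrix, the products PjTBPj and PjTZ amount to re-indexing, which is why they add negligible complexity. The sketch below illustrates this; the function names are hypothetical and the toy values carry no physical meaning.

```python
def parent_last_order(n: int, j: int) -> list[int]:
    """0-based layer order with layer j (1-based) moved to the end."""
    return [i for i in range(n) if i != j - 1] + [j - 1]


def permute_mse_matrix(b, order):
    """B_j = P_j^T B P_j, computed by re-indexing the rows and columns of B."""
    return [[b[r][c] for c in order] for r in order]


def permute_vector(z, order):
    """Z_j = P_j^T Z, computed by re-indexing the entries of Z."""
    return [z[i] for i in order]


if __name__ == "__main__":
    order = parent_last_order(3, 1)       # switch layer 1 to be the parent layer
    b = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]                       # toy stand-in for the MSE matrix B
    print(order)                          # [1, 2, 0]
    print(permute_mse_matrix(b, order))   # [[5, 6, 4], [8, 9, 7], [2, 3, 1]]
    print(permute_vector([10, 20, 30], order))  # [20, 30, 10]
```

No multiplications are needed, so the results of the initial LMMSE step can be reused by every parallel leg.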
After application of the permutation, the updated branch metric may be defined as shown in Equation 40:
γ(Xj)=2Re{ZjH(Grj+I)Xj}−XjHGrjXj. Eq. 40
The remainder of the AMTS process described above remains unchanged for the permuted layer.
The illustrated embodiment of the AMTS detector 500 begins by inputting the estimated channel matrix H and received signal Y to an initial LMMSE based step 514 which assumes the noise component to be white. The LMMSE step 514 produces an MSE matrix B and an estimated transformed received symbol vector Z. When PIC is included in the LMMSE step 514, the symbol estimation is input to a soft symbol regeneration module 516 that produces a soft symbol estimation X̂^{u−1} and a corresponding covariance matrix C^{u−1}. Since the estimation process is iterative, the superscript u is used to indicate the current iteration number and the superscript u−1 is used to indicate that these estimations are for the previous iteration. The soft symbol estimation X̂^{u−1} and the corresponding covariance matrix C^{u−1} are input to a LMMSE-PIC process 518 to produce a first set of bit-LLR 510 which is fed back 520 to the soft symbol regeneration module 516. Once a desired number of iterations has been completed, the final first set of bit-LLR 510 is provided to the LLR post process 506. The self-iterative LMMSE-PIC detector 518 may be summarized as shown in Equation 41:
The bit-LLR can be calculated based on the symbol estimation X̂^{u} for a specific modulation type. The bit-LLR can then be used by the soft symbol regeneration process to create a soft symbol estimation X̂ and covariance matrix C for the next iteration.
The parallel marginalized tree search (MTS) process 504 has a number of legs 526, labeled as leg 1 through T, which can be processed in parallel by the detector 500. Each leg includes a channel shortening process 532 and an AMTS process 534. The channel shortening process 532 and AMTS process 534 of each parallel leg 526 use the same processing, but with a different transmitted layer switched to be the parent layer. Selection of the parent layers is described in more detail below. The outputs 528 from each parallel MTS leg 526 are combined in a candidate set combination and bit LLR calculation process 530 to produce a single output 512.
The estimated channel matrix H and MSE matrix B of
H=(h1,h2, . . . ,hj−1,hj,hj+1, . . . hN), Eq. 42
where hi(1≦i≦N) represents the ith column vector.
For energy based parent layer selection the layers chosen to be the parent layers of each parallel leg correspond to the channel vectors hK
∥hK
Alternatively, parent layer selection 524 may be based on the MSE matrix B obtained from the first LMMSE module 514. This approach is equivalent to basing selection on channel capacity. With channel capacity selection, the layers chosen to be parent layers correspond to the maximal diagonal elements of the MSE matrix B. Since the MSE matrix B is a square N by N matrix let its elements be represented by a lower case b as B=(bij)N×N where the subscripts i and j represent the row and column position of the element b respectively.
The layers chosen to be the parent layers correspond to the elements from the main diagonal of the MSE matrix B, $b_{K_i K_i}$, that satisfy Equation 44:
$b_{K_i K_i} \ge b_{jj}$, where $1 \le i \le T$ and $1 \le j \ne K_i \le N$. Eq. 44
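The MSE based parent layer selection can be sketched as picking the T largest main-diagonal entries of B. The function name select_parent_layers is hypothetical, and the sketch assumes the diagonal entries are available as real values.

```python
def select_parent_layers(mse_diagonal: list[float], num_legs: int) -> list[int]:
    """Return the 1-based indices K_1..K_T of the T largest diagonal MSE entries.

    Ties are resolved in favour of the lower layer index, which is a choice
    made for this illustration only.
    """
    if not 0 < num_legs <= len(mse_diagonal):
        raise ValueError("num_legs must be between 1 and the number of layers")
    ranked = sorted(range(len(mse_diagonal)),
                    key=lambda i: mse_diagonal[i], reverse=True)
    return [i + 1 for i in ranked[:num_legs]]


if __name__ == "__main__":
    diag_b = [0.10, 0.35, 0.05, 0.20]       # toy values of b_11 .. b_44
    print(select_parent_layers(diag_b, 2))  # [2, 4]
```

With T equal to two and the toy diagonal above, layers 2 and 4 have the largest MSE and are chosen as the parent layers of the two parallel legs.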
Once the parent layers have been selected 524 a permutation matrix Pj is used as described above to switch each selected layer to be the parent layer of one parallel leg 526. A channel shortening process 532 creates a shortened channel correlation matrix Gr corresponding to the parent layer selected for each parallel leg 526. As described above the channel shortening processes 532 all use the same process for creating the shortened channel correlation matrix Gr which allows a large portion of the computational complexity to be shared. The shortened channel correlation matrix Gr is then used in an AMTS 534 to obtain a candidate set of bit-LLR 528. The candidate sets of bit-LLR 528 are then combined 530 and a final set of bit-LLR 512 is calculated.
In certain embodiments the number of parallel legs is less than the total number of layers that need to be detected. In these embodiments, since not all layers have a chance to be a parent node, not all bit combination assumptions will occur as candidate paths in the parallel legs 526 and not all bit-LLR values can be calculated by the AMTS 534. This may be referred to as the missing bit problem.
For the layers that have been chosen to be a parent node of one of the parallel AMTS legs 526, the bit-LLR calculation is simply based on the corresponding AMTS leg, and since all assumptions for the parent node have been preserved there is no missing bit problem. For example, when the layer chosen to be the parent node is modulated with 64QAM, the bit-LLR is calculated among the final 64 surviving paths as illustrated in the tree diagram of
where bi is the i-th bit of the parent node x(N).
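One common way to compute a bit-LLR over a set of surviving paths is the max-log approximation, sketched below. The max-log form, the sign convention (positive values favour bit 0), the omission of any noise scaling, and the name max_log_bit_llr are all assumptions made for illustration and are not taken from the disclosure.

```python
def max_log_bit_llr(paths: list[tuple[tuple[int, ...], float]],
                    bit_index: int) -> float:
    """Max-log LLR of one parent-node bit over the surviving paths.

    Each path is (parent_bits, metric), where a larger metric means a more
    likely path. Assumed sign convention: a positive result favours bit 0.
    """
    best0 = max(m for bits, m in paths if bits[bit_index] == 0)
    best1 = max(m for bits, m in paths if bits[bit_index] == 1)
    return best0 - best1


if __name__ == "__main__":
    # Toy example: a parent node with two bits and all four hypotheses preserved.
    surviving = [((0, 0), 1.5), ((0, 1), 2.0), ((1, 0), -0.75), ((1, 1), 1.0)]
    print(max_log_bit_llr(surviving, 0))  # 1.0
    print(max_log_bit_llr(surviving, 1))  # -0.5
```

Since every parent hypothesis is preserved, both bit values are always represented among the paths. If one bit value had no surviving path, `max` would raise a ValueError on the empty set, which corresponds to the missing bit situation.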
It is often the case that an embodiment will use a number of parallel AMTS legs T that is less than the number of layers N in the transmitted signal that need to be detected. In such embodiments not all the possible bit combinations are included in the search process and the missing bit problem needs to be solved. A number of alternatives for solving the missing bit problem are presented in the following.
For example, in certain embodiments the sign of the bit-LLR output from the AMTS 534 may be used. Although the bit-LLR for the missing bit combinations cannot be calculated, the sign of the bit-LLR is known. Thus the sign of the bit-LLR may be used to reconstruct the missing bit-LLR values as follows: when the sign of the bit-LLR output from the AMTS 534 is the same as the sign of the bit-LLR 510 output from the SPIC module, the bit-LLR 510 output from the SPIC module is used as the final output; and when the sign of the bit-LLR output from the AMTS 534 is different from the sign of the bit-LLR 510 output from the SPIC module, the negative of the bit-LLR 510 output from the SPIC module is used as the final output.
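The sign based reconstruction can be sketched as follows. The function name reconstruct_missing_llr and the representation of the AMTS sign as +1 or −1 are assumptions made for illustration.

```python
import math


def reconstruct_missing_llr(spic_llr: float, amts_llr_sign: int) -> float:
    """Reuse the SPIC bit-LLR magnitude with the sign implied by the AMTS.

    amts_llr_sign is +1 or -1, the sign of the bit-LLR from the AMTS output.
    """
    if amts_llr_sign not in (1, -1):
        raise ValueError("amts_llr_sign must be +1 or -1")
    same_sign = math.copysign(1.0, spic_llr) == amts_llr_sign
    # Same sign: keep the SPIC value. Different sign: use its negative.
    return spic_llr if same_sign else -spic_llr


if __name__ == "__main__":
    print(reconstruct_missing_llr(2.5, 1))     # 2.5
    print(reconstruct_missing_llr(2.5, -1))    # -2.5
    print(reconstruct_missing_llr(-1.25, 1))   # 1.25
```

In effect the magnitude of the reconstructed value comes from the SPIC module while the hard decision comes from the AMTS.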
Finally the bit-LLR 512 of bits that do not have the “missing bit” issue from AMTS detection module 504 are combined with the bit-LLR 510 from the LMMSE or LMMSE-SPIC detector 502 in an LLR post process 506. In certain embodiments the LLR post process 506 combines the bit-LLR 510 from the linear detector 502 with the bit LLR outputs 512 from the AMTS detector 504 based on a simple linear averaging. Alternatively, embodiments of the LLR post process 506 can use adaptive averaging where the averaging factor can be based on the measured signal to noise ratio (SNR).
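The averaging performed in the LLR post process can be sketched as a weighted combination of the two bit-LLR sets. The function name combine_llr and the parameter amts_weight are hypothetical. How an adaptive averaging factor would be derived from the measured SNR is not specified here, so the factor is simply passed in as a parameter.

```python
def combine_llr(linear_llr: list[float], amts_llr: list[float],
                amts_weight: float = 0.5) -> list[float]:
    """Weighted average of two bit-LLR sets of equal length.

    amts_weight = 0.5 gives the simple linear average; other values in [0, 1]
    model an adaptive averaging factor supplied by the caller.
    """
    if len(linear_llr) != len(amts_llr):
        raise ValueError("both bit-LLR sets must have the same length")
    if not 0.0 <= amts_weight <= 1.0:
        raise ValueError("amts_weight must lie in [0, 1]")
    return [(1.0 - amts_weight) * a + amts_weight * b
            for a, b in zip(linear_llr, amts_llr)]


if __name__ == "__main__":
    print(combine_llr([2.0, -4.0], [4.0, -2.0]))        # [3.0, -3.0]
    print(combine_llr([2.0, -4.0], [4.0, -2.0], 0.25))  # [2.5, -3.5]
```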
The detection step 626 is based on a novel simplified tree search process described above. This novel simplified tree search process used in the detector step 626 begins with a parent layer selection process 610 where the layers in the received digital signal that are to be used as parent layers in the parallel legs, depicted as parallel legs 628-1 through 628-T in
Improved throughput obtained with the above described embodiments can be seen through simulations based on industry standard transmission modes, such as transmission mode 3 (TM3) of a LTE system as defined by the 3GPP.
The processor 802 may be a single processing device or may comprise a plurality of processing devices, including special purpose devices such as digital signal processing (DSP) devices, microprocessors, or other specialized processing devices, as well as one or more general purpose computer processors. The processor 802 is configured to perform the aforementioned processes. The processor 802 is coupled to a memory 804 which may be a combination of various types of volatile and/or non-volatile computer memory such as, for example, read only memory (ROM), random access memory (RAM), magnetic or optical disk, or other types of computer memory. The memory 804 stores computer program instructions that may be accessed and executed by the processor 802 to cause the processor 802 to perform a variety of desirable computer implemented processes or methods such as the detection methods described above. The program instructions stored in memory 804 may be organized as groups or sets of program instructions referred to by those skilled in the art with various terms such as programs, software components, software modules, units, etc., where each software component may be of a recognized type such as an operating system, an application, a device driver, or other conventionally recognized type of software component. Also included in the memory 804 are program data and data files which are stored and processed by the computer program instructions.
The RF Unit 806 is coupled to the processor 802 and configured to transmit and receive RF signals based on digital data 812 exchanged with the processor 802. The RF Unit 806 is configured to transmit and receive radio signals that may conform to one or more of the wireless communication standards in use today, such as, for example, LTE, LTE-A, and Wi-Fi, as well as many others. The RF Unit 806 may receive radio signals from one or more antennas, down-convert the received RF signal, perform appropriate filtering and other signal conditioning operations, then convert the resulting baseband signal to a digital signal by sampling with an analog to digital converter. The digitized baseband signal, also referred to herein as a digital communication signal, is then sent 812 to the processor 802.
The UI 808 may include one or more user interface elements such as a touch screen, keypad, buttons, voice command processor, as well as other elements adapted for exchanging information with a user. The UI 808 may also include a display unit 810 configured to display a variety of information appropriate for a mobile device or apparatus 800 and may be implemented using any appropriate display type such as for example organic light emitting diodes (OLED), liquid crystal display (LCD), as well as less complex elements such as LEDs or indicator lamps, etc. In certain embodiments the display unit 810 incorporates a touch screen for receiving information from the user of the mobile device 800. In certain embodiments the UI 808 may be omitted. The mobile device 800 is appropriate for implementing embodiments of the apparatus and methods disclosed herein.
Thus, while there have been shown, described and pointed out, fundamental novel features of the disclosure as applied to the exemplary embodiments thereof, it will be understood that various omissions, substitutions and changes in the form and details of devices and methods illustrated, and in their operation, may be made by those skilled in the art without departing from the spirit and scope of the disclosure. Further, it is expressly intended that all combinations of those elements, which perform substantially the same function in substantially the same way to achieve the same results, are within the scope of the disclosure. Moreover, it should be recognized that structures and/or elements shown and/or described in connection with any disclosed form or embodiment of the disclosure may be incorporated in any other disclosed or described or suggested form or embodiment as a general matter of design choice. It is the intention, therefore, to be limited only as indicated by the scope of the claims appended hereto.
This application is a continuation of International Application No. PCT/EP2015/052743, filed on Feb. 10, 2015, the disclosure of which is hereby incorporated by reference in its entirety.
| | Number | Date | Country |
|---|---|---|---|
| Parent | PCT/EP2015/052743 | Feb 2015 | US |
| Child | 15629181 | | US |