As consumer demand for high data rate applications, such as streaming video, expands, technology providers are forced to adopt new technologies to provide the necessary bandwidth. Multiple-input Multiple-Output (“MIMO”) is an advanced radio system that employs multiple transmit antennas and multiple receive antennas to simultaneously transmit multiple parallel data streams. Relative to previous wireless technologies, MIMO enables substantial gains in both system capacity and transmission reliability without requiring an increase in frequency resources.
MIMO systems exploit differences in the paths between transmit and receive antennas to increase data throughput and diversity. As the number of transmit and receive antennas is increased, the capacity of a MIMO channel increases linearly, and the probability of all sub-channels between the transmitter and receiver fading simultaneously decreases exponentially. As might be expected, however, there is a price associated with realization of these benefits. Recovery of transmitted information in a MIMO system becomes increasingly complex with the addition of transmit antennas.
Many MIMO detection algorithms have been proposed. The maximum-likelihood detector, while conceptually simple and exhibiting optimal detection performance, is often impractical because its complexity increases exponentially with the number of input channels. Consequently, a vast assortment of algorithms has been proposed to solve the detection problem with reduced complexity while sacrificing minimal performance. Many MIMO detectors have been proposed exclusively as hard detectors that give only the final estimate of the channel input. Most notable is the sphere detector because it can achieve Max-Log (“ML”) performance in an uncoded system with much less complexity on average. A summary of many other previously proposed MIMO detectors is given in Deric W. Waters, Signal Detection Strategies and Algorithms for Multiple-input Multiple-Output Channels (December 2005) (unpublished Ph.D. dissertation, Georgia Institute of Technology), including many variations of the sphere detector that have been proposed to minimize complexity without sacrificing performance. In Bernard M. Hochwald & Stephan ten Brink, Achieving Near-Capacity on a Multiple-Antenna Channel, 51 IEEE TRANSACTIONS ON COMMUNICATIONS 389-99 (2003), the list-sphere detector was proposed as a way to compute the log-likelihood ratio (LLR) for the channel input.
In at least some embodiments, a Multiple-input Multiple-Output (MIMO) receiver is provided. The MIMO receiver comprises a parameterized sphere detector having two search modes. During a first search mode, the parameterized sphere detector enumerates a number of best candidate vectors up to a fixed parameter value. During a second search mode, the parameterized sphere detector enumerates additional candidate vectors using a greedy search until a predetermined number of candidate vectors have been enumerated.
In other embodiments, a method is provided. The method comprises initializing a parameterized sphere detector to enumerate a fixed number of best candidate vectors before beginning a greedy search. The method further comprises enumerating additional candidate vectors with the greedy search.
In the following detailed description, reference will be made to the accompanying drawings, in which:
Certain terms are used throughout the following description and claims to refer to particular system components. As one skilled in the art will appreciate, companies may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function. In the following discussion and in the claims, the terms “including” and “comprising” and “e.g.” are used in an open-ended fashion, and thus should be interpreted to mean “including, but not limited to . . .”. The term “couple” or “couples” is intended to mean either an indirect or direct connection. Thus, if a first component couples to a second component, that connection may be through a direct connection, or through an indirect connection via other components and connections. The term “system” refers to a collection of two or more hardware and/or software components, and may be used to refer to an electronic device or devices, or a sub-system thereof. Further, the term “software” includes any executable code capable of running on a processor, regardless of the media used to store the software. Thus, code stored in non-volatile memory, and sometimes referred to as “embedded firmware,” is included within the definition of software.
It should be understood at the outset that although an exemplary implementation of one embodiment of the present disclosure is illustrated below, the present system may be implemented using any number of techniques, whether currently known or in existence. The present disclosure should in no way be limited to the exemplary implementations, drawings, and techniques illustrated below, including the exemplary design and implementation illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.
Embodiments of the disclosure implement a “Parameterized-Sphere” (P-SPH) detector to reduce complexity for many types of Multiple-input Multiple-Output (MIMO) detectors while maintaining the same performance. As used herein, a P-SPH detector refers to a sphere detector configured to search for a fixed number of candidate vectors. In at least some embodiments, the P-SPH detector operates in an initialization mode and in a completion mode to enumerate a fixed number of candidate vectors. During the initialization mode, the P-SPH detector enumerates a fixed number of “best” candidate vectors based on a parameter (“K”). For example, the best candidate vectors can be determined using “cost” and “score” calculations of a greedy tree-search algorithm as will be described herein. During the completion mode, the P-SPH detector enumerates additional candidate vectors, based on a parameter (“L”), until a fixed number of additional candidate vectors have been enumerated. In at least some embodiments, the additional candidate vectors are enumerated using a greedy search that starts with the leaf node having the lowest cost as determined during the initialization mode. In different embodiments, the P-SPH detector can be configured to approximate a truncated-sphere detector, a max-log detector, or a decision-feedback detector.
While MIMO systems may greatly increase spectral efficiency, the process of separating signals simultaneously transmitted from multiple antennas 106 may be burdensome for the MIMO receiver 104. To reduce the computational burden of separating signals, the MIMO receiver 104 implements a P-SPH detector 114 to determine a fixed number of candidate vectors, which are used to detect symbols (i.e. generate symbol estimates). The detected symbols are used to decode bits of a received signal. After decoding a signal, the MIMO receiver 104 can provide an output 118, which includes the decoded signal.
A MIMO narrowband channel model with N inputs $a = [a_1\ a_2\ \ldots\ a_N]^T$ and M outputs $r = [r_1\ \ldots\ r_M]^T$ can be written as:
r=Ha+w, (Equation 1)
where H is a complex M×N matrix and $w = [w_1\ \ldots\ w_M]^T$ is noise. The noise has the autocorrelation matrix $E[ww^*] = \Sigma^2$. This disclosure focuses on the case where $\Sigma^2 = I\sigma^2$, where I is the M×M identity matrix and where $\sigma^2$ is the variance of the noise. Those skilled in the art will recognize that the concepts are extendable to more general cases.
For example, the channel output could be left multiplied by the matrix $\Sigma^{-1}$ to achieve an effective channel whose noise has an identity autocorrelation matrix. This equation also applies to a single tone in an Orthogonal Frequency Division Multiplexing (OFDM) or Orthogonal Frequency Division Multiple Access (OFDMA) system after the cyclic prefix has been removed and the FFT has been applied. This equation also applies to single-tap code-division multiple access (CDMA) channels.
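As a minimal sketch of Equation 1, the following Python fragment forms one received vector for a 2×2 channel. The function name `channel_output` and all numeric values are illustrative assumptions rather than part of the disclosure, and the noise is set to zero so the result can be checked by hand.

```python
def channel_output(H, a, w):
    """Return r = H a + w for an M x N channel H given as a list of rows."""
    return [sum(H[m][n] * a[n] for n in range(len(a))) + w[m]
            for m in range(len(H))]

# Illustrative 2x2 example (values are assumptions chosen for readability).
H = [[1 + 0j, 0.5j],
     [0.25 + 0j, 1 - 1j]]
a = [1 + 1j, -1 + 1j]   # two QPSK-like channel inputs
w = [0j, 0j]            # noise-free case, so r equals H a exactly
r = channel_output(H, a, w)
```

With non-zero `w`, the same call models a noisy observation; the detector never sees `a` or `w` directly, only `r` and an estimate of `H`.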
In at least some embodiments, the detector/decoder 114 uses a QR decomposition of the channel. This decomposition is defined as follows:
$\begin{bmatrix} H \\ \alpha\hat{\sigma}I_N \end{bmatrix}\Pi = \begin{bmatrix} Q \\ \alpha\hat{\sigma}\Pi R^{-1} \end{bmatrix}R = \tilde{Q}R$, (Equation 2)
where $\tilde{Q}$ is an (M+N)×N matrix with orthonormal columns, R is an N×N triangular matrix with positive and real diagonals, Π is an N×N permutation matrix, $I_N$ is the N×N identity matrix, $\hat{\sigma}$ is an estimate of σ, and α is a chosen parameter that is a non-negative real number. A permutation matrix is defined as any matrix with a single element in each row and column that has the value of one, while all other elements have the value zero. This disclosure describes the algorithm assuming a lower triangular R matrix. Alternatively, embodiments can implement an algorithm for an upper triangular R matrix as would be understood by those skilled in the art.
The optimal value of the parameter α depends on the type of MIMO detector that is used. For example, α=1 is optimal for a linear receiver because it minimizes the error, $\|R^{-1}Q^H r - s\|^2$. On the other hand, α=0 is optimal for a max-log (ML) receiver. When α=1, the equalization is called minimum mean-squared error (MMSE) equalization. When α=0, the equalization is called zero-forcing (ZF) equalization. Thus, an appropriate parameter can be selected for the detector/decoder 114 depending on whether an MMSE equalizer or a ZF equalizer is being implemented. When α=0, the QR decomposition may be simplified to:
HΠ=QR. (Equation 3)
The way the permutation matrix Π is defined impacts performance for some MIMO detectors. For example, the BLAST ordering chooses Π to maximize the minimum diagonal of R. A less complex way to choose Π is the sorted-QR decomposition that attempts to maximize $R_{1,1}$ (lower triangular R).
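For the α=0 case, a decomposition with a lower triangular R can be sketched with Gram-Schmidt orthogonalization run from the last column to the first. The sketch below is only illustrative: it assumes Π is the identity (no ordering is attempted), assumes H has full column rank, and uses the hypothetical function name `qr_lower`.

```python
def qr_lower(H):
    """Gram-Schmidt QR with lower-triangular R: H = Q R, taking Pi = I.

    H is a list of M rows with N complex entries and full column rank.
    Returns Q (M x N, orthonormal columns) and R (N x N, lower triangular,
    real positive diagonal).
    """
    M, N = len(H), len(H[0])
    cols = [[H[m][n] for m in range(M)] for n in range(N)]
    q = [None] * N
    R = [[0j] * N for _ in range(N)]
    for j in range(N - 1, -1, -1):          # process the last column first
        v = cols[j][:]
        for i in range(j + 1, N):
            # R[i][j] = q_i^H h_j, then remove that component from v
            R[i][j] = sum(qi.conjugate() * h for qi, h in zip(q[i], cols[j]))
            v = [vm - R[i][j] * qm for vm, qm in zip(v, q[i])]
        norm = sum(abs(x) ** 2 for x in v) ** 0.5
        R[j][j] = complex(norm)
        q[j] = [x / norm for x in v]
    Q = [[q[n][m] for n in range(N)] for m in range(M)]
    return Q, R
```

A sorted-QR or BLAST ordering would additionally choose the order in which columns are processed, which is what the permutation Π records.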
The MIMO detector problem can be transformed into an easier problem by creating an effective channel that is triangular. One way to achieve this uses the conjugate transpose of Q as follows:
$y = Q^H r = Rs + n$, (Equation 4)
where $s = \Pi^{-1}a = [s_1\ s_2\ \ldots\ s_N]^T$ is a permutation of the channel input vector, and n is an effective noise. Note that n may be a function of α when α≠0. The constellation for the i-th symbol is defined as $s_i \in A_i$. The set containing all valid values of a subset of the channel inputs is denoted as AN
The output of an ML detector is the log-likelihood ratio (LLR) of each bit transmitted in the vector s, where the LLR value indicates the probability that a given bit was transmitted as a one or zero. The ML detector output for the j-th bit of the i-th symbol is described by a single equation:
$\lambda_{i,j} = \left(\|r - H\Pi s^{(0)}\|^2 - \|r - H\Pi s^{(1)}\|^2\right)/\hat{\sigma}^2$, (Equation 5)
where $\|r - H\Pi s^{(k)}\|^2$ is minimized under the constraint that $s^{(k)} \in A_1^N(k,i,j)$. The value $\|r - H\Pi x\|^2$ is defined as the cost of the vector x.
The ML detector may also be defined using the triangular channel model:
$\lambda_{i,j} = \left(\|y - Rs^{(0)}\|^2 - \|y - Rs^{(1)}\|^2\right)/\hat{\sigma}^2$, (Equation 6)
where $\|y - Rs^{(k)}\|^2$ is minimized subject to the constraints $s^{(k)} \in A_1^N(k,i,j)$ and α=0, and where Π can be any permutation matrix. Note that $\|y - Rx\|^2 = \|r - H\Pi x\|^2$ when α=0.
The P-SPH detector 114 is an example of a list detector. A list detector is any detector that generates a list of candidate vectors for the channel input. The set of candidate vectors is labeled as the set LS, and the number of candidates in the set is called the list length. The size of the set LS is labeled as L. The ML detector is an example of a list detector with an exhaustive list (i.e., LS = $A_1^N$ and L = $|A_1^N|$). Alternatively, many list detectors generate their lists to be as small as possible without sacrificing too much performance.
Given the set LS generated by any list detector, the LLR for the j-th bit of the i-th symbol may be computed in a manner similar to the ML detector in Equations 5 and 6:
$\lambda_{i,j} = \left(\|y - Rs^{(0)}\|^2 - \|y - Rs^{(1)}\|^2\right)/\hat{\sigma}^2$, (Equation 7)
where $\|y - Rs^{(k)}\|^2$ is minimized subject to the constraints $s^{(k)} \in A_1^N(k,i,j)$ and $s^{(k)} \in$ LS.
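The list-based LLR computation can be sketched in Python as follows. The function name `list_llr`, the representation of each candidate as a `(cost, bits)` pair, and the choice to return `None` when the list contains no candidate for one of the two bit values are assumptions made for illustration; the disclosure does not specify how that last case is handled.

```python
def list_llr(candidates, sigma2_hat, i, j):
    """LLR of bit j of symbol i from a candidate list (Equation 7).

    candidates: list of (cost, bits) pairs, where cost = ||y - R s||^2 and
    bits[i][j] is the j-th bit label of the i-th symbol of that candidate.
    Returns None when the list holds no candidate for one of the two bit
    values (how to handle that case is left open here).
    """
    best = {0: None, 1: None}
    for cost, bits in candidates:
        k = bits[i][j]
        if best[k] is None or cost < best[k]:
            best[k] = cost          # smallest cost seen for bit value k
    if best[0] is None or best[1] is None:
        return None
    return (best[0] - best[1]) / sigma2_hat
```

A positive value indicates that the best candidate with the bit equal to one has the lower cost, and a negative value indicates the opposite.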
An important problem for MIMO detection is finding the vector $\hat{a}$ that maximizes $\Pr[y \mid a = \hat{a}]$. In terms of the triangular channel model and the cost defined above, this can be written as:
$\hat{s} = \arg\min_{\hat{s} \in A_1^N} \|y - R\hat{s}\|^2$, (Equation 8)
where $\hat{a} = \Pi\hat{s}$ and $A_1^N$ is the set of all possible channel inputs. The detection problem of Equation 8 can be fully described in terms of a tree-search. The number of branches exiting the root node corresponds to the number of possible values for the first symbol. Likewise, the number of branches exiting the nodes preceding the i-th level corresponds to the number of possibilities for the i-th symbol. In the end, there are $\prod_{i=1}^{N} |A_i| = |A_1^N|$ total leaf nodes in the tree.
The sphere detector is an algorithm that finds the leaf node with the smallest cost in a computationally-efficient manner. The cost of any node is the sum of the scores of all the branches in the path back to the root node, where every branch in the tree is associated with a unique score. The score of a branch exiting a node at the i-th level can be written as:
$\text{Score} = |z_i - R_{i,i}\hat{s}_i|^2$, (Equation 9)
where $\hat{s} = [\hat{s}_1\ \ldots\ \hat{s}_N]^T$ and where $z_i$ is the result of an interference cancellation procedure. The interference cancellation procedure is defined as:
$z_i = y_i - \sum_{j=1}^{i-1} R_{i,j}\hat{s}_j$, (Equation 10)
where $y_i$ is defined by Equation 4, and $[\hat{s}_1\ \ldots\ \hat{s}_{i-1}]^T$ are the symbols from the path that connects the current branch back to the root node. For more information regarding representing detection problems as a tree-search, reference may be had to J. R. Barry, E. A. Lee, and D. G. Messerschmitt, Digital Communication, 3rd edition, Kluwer Academic Publishers, 2004, chapter 10.
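The accumulation of branch scores into a node cost can be sketched as below. The sketch assumes a lower triangular R and writes the interference cancellation as $z_i = y_i - \sum_{j<i} R_{i,j}\hat{s}_j$, which is the form implied by the triangular channel model; the function name `path_cost` and zero-based indexing are illustrative choices.

```python
def path_cost(y, R, s_hat):
    """Cost of the tree path s_hat[0..len-1] for lower-triangular R.

    Each level i contributes the branch score |z_i - R[i][i] s_i|^2
    (Equation 9), where z_i is y_i with the interference of the symbols
    already decided on the path removed.
    """
    cost = 0.0
    for i, s_i in enumerate(s_hat):
        z_i = y[i] - sum(R[i][j] * s_hat[j] for j in range(i))
        cost += abs(z_i - R[i][i] * s_i) ** 2
    return cost
```

Passing a partial path (fewer than N symbols) returns the cost of an interior node, and passing a full path returns the cost of a leaf node. Because every score is non-negative, the cost never decreases as the path is extended.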
The sphere detector uses a depth-first approach to find the leaf node with minimum cost. In other words, the sphere detector uses a “greedy” search approach to find the first leaf node, thereby establishing a cost threshold. The sphere detector then begins its depth-first search from that first leaf node. As the sphere detector seeks a lower cost leaf node, any branches leading to nodes whose cost exceeds the threshold are pruned. The pruning is possible because the cost of a particular path can only increase from one level to the next. If a leaf node with a cost below the threshold is found during the search, then that leaf node becomes the preliminary result of the search and the threshold is reduced accordingly. The search continues until each leaf node's path has either been pruned, or its cost has been computed.
All possible channel input vectors are points in space, and the tree-search technique searches through all points to find the one with the lowest cost. The search technique described above is called sphere detection because it does not search through those points outside the hypersphere defined by the value of the initial radius, or the cost of the first leaf node. As leaf nodes with lower costs are found, the radius shrinks thereby reducing complexity by excluding more points from its search. For more information regarding sphere detectors, reference may be had to A. Chan and I. Lee, “A new reduced-complexity sphere decoder for multiple antenna systems,” IEEE Conference on Communications, pp. 460-464, 2002.
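A depth-first search with threshold pruning of this kind can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the implementation of any cited reference: R is lower triangular, the initial radius is infinite, children are visited in order of increasing branch score, and the names `sphere_detect` and `alphabets` are hypothetical.

```python
def sphere_detect(y, R, alphabets):
    """Depth-first search for the minimum-cost leaf (lower-triangular R).

    alphabets[i] is the list of constellation points for symbol i.
    Returns (best_vector, best_cost).
    """
    N = len(alphabets)
    best = {"cost": float("inf"), "path": None}

    def visit(path, cost):
        i = len(path)
        if i == N:
            # A leaf is only reached with cost below the threshold,
            # so it becomes the new preliminary result (radius shrinks).
            best["cost"], best["path"] = cost, list(path)
            return
        z = y[i] - sum(R[i][j] * path[j] for j in range(i))
        # Visit children in order of increasing branch score (greedy first).
        scored = sorted(((abs(z - R[i][i] * s) ** 2, s) for s in alphabets[i]),
                        key=lambda t: t[0])
        for score, s in scored:
            if cost + score >= best["cost"]:
                break          # remaining siblings are at least as costly
            path.append(s)
            visit(path, cost + score)
            path.pop()

    visit([], 0.0)
    return best["path"], best["cost"]
```

The first leaf reached is the greedy (decision-feedback) solution, and every later leaf that is reached lowers the threshold, so the amount of work depends on the channel and noise realization.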
One way to reduce complexity of the sphere detector is to set an initial threshold, which is equivalent to establishing the radius of the hypersphere before any computations are done. For more information regarding setting an initial threshold, reference may be had to W. Zhou, G. B. Giannakis, “Sphere decoding algorithms with improved radius search,” IEEE Communications and Networking Conference, vol. 4, pp. 2290-2294, March 2004. The risk of setting an initial threshold is that if the hypersphere excludes all points, the radius would have to be expanded and the search continued. On the other hand, if the initial hypersphere contains only a few points then the sphere detector can find the solution with low complexity. The origin of the hypersphere is also important to reducing complexity. Ideally, the distance from the origin to the point with the lowest cost is minimized, so as to exclude as many points as possible from the search. The most common origin for the hypersphere is the point y.
Computing the LLR values for each bit requires a list of candidate vectors, or leaf nodes, not just the one with minimum cost. For example, implementing the ML detector exactly would require at least 2 candidate vectors, and at most $|A_1^N|/2 + 1$ candidate vectors in order for all possible bit values to be represented in the list of candidate vectors.
One practical problem with the sphere detector is that its complexity has a large variance depending on channel and noise realizations. Accordingly, embodiments of the disclosure involve a novel algorithm with fixed complexity (called the P-SPH detector algorithm) that achieves nearly the same performance as the sphere detector. The idea is to initialize or “prime” the sphere detector by enumerating some candidate vectors before beginning the greedy search. Then the greedy search will stop after it has enumerated a predetermined number of leaf nodes. The final outputs of the P-SPH detector are the list of leaf nodes enumerated and their costs. The P-SPH detector for the case when N=2 (where N is the number of inputs) is the building block for other embodiments when N>2. Therefore, this disclosure will first focus on the P-SPH for the case when N=2. For this case the scores are defined as follows:
$S_1(\hat{s}) = |y_1 - R_{1,1}\hat{s}|^2$, (Equation 11)
where the i-th best candidate for the first symbol is denoted as $\hat{s}_1(i)$. Similarly, $\hat{s}_2(i,j)$ is the j-th best candidate for the second symbol when the first symbol is equal to $\hat{s}_1(i)$. With this notation, the j-th leaf node enumerated below the i-th best candidate for the first symbol is $\hat{s}(i,j) = [\hat{s}_1(i)\ \hat{s}_2(i,j)]^T$.
During the initialization mode, K leaf nodes are enumerated corresponding to the K best candidates for the first symbol. The K candidates correspond to K branches from the root node to nodes at the second level. The best branch from each of the K second-level nodes to a leaf node is chosen, thereby enumerating K leaf nodes. This initialization mode as just described can be viewed as a modified breadth-first tree-search. Besides the set of candidate values for the first symbol, the vectors $c = [c_1\ c_2\ \ldots\ c_{K+1}]$ and $b = [b_1\ b_2\ \ldots\ b_K\ 0]$ are also computed during the initialization phase. The vector c contains the costs of each enumerated leaf node and the score of the (K+1)-th best branch at the first level:
$c_i = \|y - R\hat{s}(i,1)\|^2$ for i = 1, …, K, and $c_{K+1} = S_1(\hat{s}_1(K+1))$.
The vector b contains the indices of the last branches enumerated for each node belonging to one of the paths of the enumerated leaf nodes. After the initialization mode, b=[1 1 . . . 1 0]. In other words, for each of the first K candidates at the first level only one candidate at the second level has been enumerated.
During the second phase, L−K additional leaf nodes are enumerated using a greedy tree-search starting from the leaf node from the initial set with the minimum cost. In the end, the P-SPH detector enumerates L leaf nodes, or candidate vectors. This greedy search enumerates the next best sibling from the leaf node that has already been enumerated with minimum cost, or it enumerates a leaf node below another candidate for the first symbol if its first symbol score is smaller. In other words, the next best leaf node for the i-th candidate of the first symbol is enumerated where $c_i \le c_j$ for all j.
In order to reduce complexity further, the maximum number of branches that can be enumerated for the first symbol is Kmax and the most symbols that can be enumerated from a node at the second level connected to a leaf node is Bmax. Deciding which of the branches to enumerate and their enumeration order is determined by a constellation sorting algorithm. Ideally, the constellation sorting algorithm should output the symbol in the constellation that is the i-th nearest to the input value. In practice, some constellation sorting algorithms may reduce complexity by approximating the ideal algorithm. Constellation sorting algorithms either presently known or that are developed in the future could be implemented to determine the enumeration order.
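One possible reading of the two modes for N=2 is sketched below in Python. Several details are assumptions made so that the sketch is complete: the ideal constellation sorting is implemented by a plain sort on the branch score, the second-level score is taken as $|y_2 - R_{2,1}\hat{s}_1 - R_{2,2}\hat{s}_2|^2$ (Equation 9 applied at the second level of a lower triangular R), an exhausted sub-tree is retired by setting its entry of c to infinity, and the function name `psph_n2` and its argument names are hypothetical. Non-empty constellations and Kmax, Bmax of at least one are assumed.

```python
def psph_n2(y, R, A1, A2, K, L, Kmax, Bmax):
    """Enumerate up to L leaf nodes for N = 2 (lower-triangular R).

    Returns a list of ((s1, s2), cost) pairs in enumeration order.
    """
    inf = float("inf")
    # Ideal constellation sorting: first-symbol candidates by score S1.
    first = sorted(A1, key=lambda s: abs(y[0] - R[0][0] * s) ** 2)[:Kmax]
    s1_score = [abs(y[0] - R[0][0] * s) ** 2 for s in first]

    def children(s1):
        # Second-symbol candidates below s1, sorted by branch score.
        z2 = y[1] - R[1][0] * s1
        ranked = sorted(A2, key=lambda s: abs(z2 - R[1][1] * s) ** 2)[:Bmax]
        return [(s, abs(z2 - R[1][1] * s) ** 2) for s in ranked]

    kids, leaves, c, b = {}, [], [], []

    def enumerate_leaf(i):
        # Enumerate the next-best leaf below the i-th first-symbol candidate.
        if i not in kids:
            kids[i] = children(first[i])
        s2, score = kids[i][b[i]]
        b[i] += 1
        cost = s1_score[i] + score
        leaves.append(((first[i], s2), cost))
        # Assumption: a sub-tree with no siblings left is never selected again.
        c[i] = cost if b[i] < len(kids[i]) else inf

    def open_frontier():
        # Append the first-symbol score of the next unexplored candidate.
        if len(c) < len(first):
            c.append(s1_score[len(c)])
            b.append(0)

    # Initialization mode: best leaf below each of the K best first symbols.
    for i in range(min(K, len(first), L)):
        open_frontier()
        enumerate_leaf(i)
    open_frontier()      # c[K] holds the (K+1)-th score when one exists

    # Completion mode: greedy search until L leaves have been enumerated.
    while len(leaves) < L and min(c) < inf:
        i = min(range(len(c)), key=c.__getitem__)
        was_frontier = b[i] == 0
        enumerate_leaf(i)
        if was_frontier:
            open_frontier()
    return leaves
```

With K=Kmax=Bmax=L=1 the sketch returns only the greedy leaf, and with Kmax and Bmax equal to the constellation sizes and L equal to the total number of leaves it enumerates every leaf, which matches the limiting cases expected of a parameterized detector.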
In accordance with some embodiments, the flow of the P-SPH detector algorithm when N=2 to compute the l-th best leaf node is as follows. This processing would be repeated for l=1 to L, and the outputs would correspond to the l-th elements of the sets LS and LC, denoted LS(l) and LC(l), respectively.
Input Parameters:
Outputs:
Steps in the flow of the P-SPH detector.
For Initialization Mode:
For Completion Mode:
and its cost C=ci.
Another embodiment of the P-SPH detector reduces the amount of memory storage required by keeping the number of elements in {ck} constant. This may be implemented by changing Step 14 as follows:
Alternative Step 14:
As presented here the P-SPH detector uses a serial implementation. Alternatively, embodiments of the P-SPH detector could increase the amount of computations done in parallel. One way to accomplish this is to enumerate n leaf nodes at once, instead of the one-by-one enumeration described above (n=1). This would allow more operations to be done in parallel. Specifically, instead of enumerating the single next best branch for the i-th candidate of the first symbol (ci≦cj), enumerate n new leaf nodes. This could be done either by enumerating one new leaf node from each sub-tree corresponding to the n smallest elements in {cj}, or enumerating n new leaf nodes for the i-th sub-tree. For example, consider the P-SPH detector when N=2. In the serial implementation the L best leaf nodes are computed one-by-one. Alternatively, the initialization mode of the parallel implementation could simultaneously enumerate the n best candidates of the first symbol until K leaf nodes are generated. Then the completion mode of the parallel implementation could simultaneously enumerate the next best leaf node for the n candidates of the first symbol that correspond to the n smallest elements in the vector c. As the parameter n increases, the parallel implementation suffers more and more performance loss compared to the serial implementation. However, for small values of n the performance loss can be acceptable in order to achieve the complexity savings.
Instead of repeatedly searching for the maximum and/or minimum element in the set {cj}, another embodiment is to keep the elements sorted so that the index of the maximum and/or minimum are known. This requires maintaining the association between the sub-trees and the elements of the sets {cj} and {bj}.
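One way to avoid a repeated linear search for the minimum is sketched below using a binary heap from Python's standard `heapq` module. This is a substitute for a fully sorted array, chosen for brevity: it tracks only the minimum, not the maximum. Storing `(c_j, j)` pairs keeps each cost associated with its sub-tree index; the numeric values and the cost increment are illustrative assumptions.

```python
import heapq

# Each heap entry pairs a cost c_j with its sub-tree index j, so the
# association between costs and sub-trees survives reordering.
heap = []
for j, c_j in enumerate([0.9, 0.2, 1.4]):
    heapq.heappush(heap, (c_j, j))

# Select the sub-tree with the smallest c_j without scanning the whole set.
cost, j = heapq.heappop(heap)

# After enumerating the next leaf of sub-tree j, re-insert its updated cost
# (the increment of 0.5 stands in for the new leaf's cost).
heapq.heappush(heap, (cost + 0.5, j))
```

The vector b can be kept as an ordinary list indexed by j, since only the costs need to be ordered.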
In at least some embodiments, various parameters can be adapted to provide many different versions of the P-SPH detector that achieve various performance-complexity trade-offs. For example, four parameters (L, K, Kmax, {Bmax,i}) of the P-SPH detector and two parameters in the QR decomposition (α and Π) may be adapted. As an example, when K=1, Kmax=$|A_1|$ and Bmax=$|A_2^N|$, the P-SPH detector is equivalent to a sphere detector that stops after it has enumerated L leaf nodes. This detector is referred to as a truncated-sphere detector in D. W. Waters, “Signal Detection Strategies and Algorithms for Multiple-input Multiple-Output Channels”, Georgia Institute of Technology, PhD Dissertation, December 2005. Alternatively, when Kmax=$|A_1|$, Bmax=$|A_2^N|$, L=$|A_1^N|$, and α=0, the P-SPH detector is equivalent to an ML detector for any valid value of K (where $1 \le K \le |A_1|$) and Π. Alternatively, when K=Kmax=Bmax=L=1, the P-SPH detector is equivalent to a decision-feedback detector.
While several embodiments have been provided in the present disclosure, it should be understood that the disclosed systems and methods may be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein, but may be modified within the scope of the appended claims along with their full scope of equivalents. For example, the various elements or components may be combined or integrated in another system or certain features may be omitted, or not implemented.
Also, techniques, systems, subsystems and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as directly coupled or communicating with each other may be coupled through some interface or device, such that the items may no longer be considered directly coupled to each other but may still be indirectly coupled and in communication, whether electrically, mechanically, or otherwise with one another. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the spirit and scope disclosed herein.
The present application claims priority to U.S. provisional patent application Ser. No. 60/869,963, entitled “Low-Complexity MIMO Receiver and Method of Using Same”, filed on Dec. 14, 2006. The above-referenced application is incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
20080144746 A1 | Jun 2008 | US |
Number | Date | Country | |
---|---|---|---|
60869963 | Dec 2006 | US |