Multiple-Input Multiple-Output (MIMO) technology is employed in today's wireless digital communication systems to improve spectral efficiency and robustness to fading without increasing power or bandwidth. In many current wireless standards, MIMO may be combined with channel coding to further improve the system diversity. In addition, Quadrature Amplitude Modulation (QAM) may be utilized to further increase spectral efficiency. One challenge in MIMO is the detection stage, which is performed at the receiver and can require excessive computational complexity in order to achieve the optimal MIMO gain. To date, many detection techniques have been proposed to closely approach optimal performance with affordable complexity. Among them, K-best detection methods offer an excellent performance/complexity tradeoff.
K-best detection methods search a MIMO detection tree configuration in a breadth-first manner, wherein the tree configuration is formed by a plurality of nodes arranged in levels and connected via a plurality of branches. For each level in the tree configuration, the K-best detection algorithm expands only the paths emerging from the K nodes with the smallest metric.
Historically, the first K-best detector implementations were based on the Fincke-Pohst (FP) and Schnorr-Euchner (SE) strategies originally used in sphere decoding. However, these strategies may not yield the best complexity-efficiency since they involve complex operations such as matrix inversion in the preprocessing stage. More recently, it has been observed that, although utilized in sphere decoding, matrix inversions may be unnecessary in K-best detection. Accordingly, many implementations may utilize K-best detection algorithms involving no such inversions. Moreover, given the lattice property of QAM constellations, the computational complexity utilized to detect each signal can be further reduced by replacing some multiplications with shift/add operations.
Despite the momentum to reduce K-best detection complexity, all K-best detection techniques proposed so far can be identified with a K-best tree search, where the complexity associated with visiting a node in the tree configuration grows with the node depth (i.e., level) in the tree.
Included are embodiments of a method for computing metrics. At least one embodiment includes searching a MIMO detection tree, the detection tree configuration being formed by a plurality of nodes and a plurality of leaves connected via a plurality of branches, wherein the computational complexity associated with computing a node metric decreases with the node depth in the tree configuration, and providing an estimate of a transmitted signal.
Also included is an embodiment of a system for computing metrics, comprising: a searching component configured to search a MIMO detection tree, the detection tree configuration being formed by a plurality of nodes and a plurality of leaves connected via a plurality of branches, wherein the computational complexity associated with computing a node metric decreases with the node depth in the tree configuration; and a providing component configured to provide an estimate of a transmitted signal.
Other embodiments and/or advantages of this disclosure will be or may become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description and be within the scope of the present disclosure.
Many aspects of the disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views. While several embodiments are described in connection with these drawings, there is no intent to limit the disclosure to the embodiment or embodiments disclosed herein. On the contrary, the intent is to cover all alternatives, modifications, and equivalents.
Embodiments disclosed herein include methods for computing metrics in K-best MIMO detection such that the complexity associated with visiting a node in the tree configuration decreases with the node depth in the tree. Since the number of visited nodes increases with the depth, these embodiments better balance node complexity and the number of visited nodes at a given depth, thereby reducing complexity over existing (previously proposed) algorithms. Some embodiments include hard-output detection, while some embodiments include a complexity-efficient extension to soft-output MIMO detection in order to use MIMO jointly with soft-input channel decoding.
At least one embodiment included herein assumes a perfectly synchronized MIMO system with an identical number of transmit and receive antennas, as depicted in
More specifically, an information bit stream B may be demultiplexed into N substreams, each labeled by an index $n \in \{1, \ldots, N\}$, where N may be used to signify a MIMO signal dimension. At each instant in time, M bits $(b_{1,n}, \ldots, b_{m,n}, \ldots, b_{M,n})$ of the nth substream (where M may be used to indicate a number of bits mapped per dimension) may be modulated by the modulator 102 to a signal $x_n$ taken from an amplitude shift keying (ASK) signal set Ω, where Ω may be used to represent a set of $2^M$ elementary one-dimensional coordinates forming a MIMO signal. Each ASK signal $x_n$ may then be transmitted over an antenna and/or can be concatenated in quadrature with another ASK signal $x_{n'}$ to form a QAM signal $x_n + j \cdot x_{n'}$ transmitted over an antenna. One should note that $x = [x_1 \ldots x_N]^T$ may be the N-dimensional signal transmitted over all antennas at each time instant. On the receiver side, at each instant, the baseband signal input to the detector may be represented as:
y=Hx+w (1)
where H 104 represents an N×N transfer matrix of a channel (in at least one embodiment, H is real), and w is an N-dimensional additive noise vector. Each element of w may include an additive white Gaussian noise (AWGN) sample of power spectral density $N_0$. The receiver can switch to a hard-output lattice decoder (K-HOLD) 106 and/or a soft-output lattice decoder (K-SOLD) 108. Each dimension n maps the column vector $b_n$ (formed by the M bits $b_{1,n} \ldots b_{M,n}$) to a $2^M$-ary one-dimensional symbol $x_n$. The mapping operation is characterized by the one-to-one correspondence:
$x_n = \mu(b_n)$. (2)
The inverse operation is called demapping and may be characterized by the function:
$b_n = \mu^{-1}(x_n)$. (3)
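As a minimal sketch of the mapping and demapping operations, the following Python fragment builds one possible μ and its inverse for a $2^M$-ary ASK set. The Gray labeling, the odd-integer amplitude levels, and the name `build_ask_mapping` are assumptions made for illustration; the disclosure only requires that μ be a one-to-one correspondence.

```python
def build_ask_mapping(M):
    """Return (mu, mu_inv) for a 2**M-ary ASK set.

    Assumptions (not taken from the disclosure): Gray labeling and
    amplitude levels at the odd integers -(2**M - 1), ..., 2**M - 1.
    """
    size = 2 ** M
    levels = [2 * k - (size - 1) for k in range(size)]  # M=2 -> [-3, -1, 1, 3]
    mu = {}
    for k, level in enumerate(levels):
        gray = k ^ (k >> 1)  # binary-reflected Gray code of index k
        bits = tuple((gray >> (M - 1 - m)) & 1 for m in range(M))
        mu[bits] = level
    # mu is one-to-one, so the inverse (demapping) is a plain dictionary flip.
    mu_inv = {level: bits for bits, level in mu.items()}
    return mu, mu_inv


mu, mu_inv = build_ask_mapping(2)
```

With this labeling, neighboring amplitude levels differ in exactly one bit, which is a common design choice but only one of many valid choices for μ.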
As a member of the family of near-optimal MIMO detection techniques, K-best lattice decoding generally aims to approach a most likely (ML) transmitted signal:
$\hat{x}_{ML} = \arg\min_{x \in \Omega^N} \|y - Hx\|^2$ (4)
by limiting the minimization problem (4) to searching candidates x located in a non-exhaustive set $\chi \subset \Omega^N$. The construction of χ may be performed recursively and can be better understood in terms of a tree search. In one exemplary embodiment, the minimization problem (4) may be rewritten as:
$\hat{x}_{ML} = \arg\min_{x \in \Omega^N} \|\check{y} - Rx\|^2$ (5)
where $\check{y} = Q^T y$, and Q and R are, respectively, a rotation matrix and an upper-triangular matrix resulting from a QR decomposition of the channel matrix H. R may be obtained by a unitary rotation of H, and hence minimizing $\|\check{y} - Rx\|^2$ may be equivalent to minimizing $\|y - Hx\|^2$. Nonetheless, because of the upper-triangular property of R, the calculation of Rx may involve less of a computational burden. Generally, for each candidate point $\hat{x}$, the computation of $\|\check{y} - R\hat{x}\|^2$ can be rewritten as
$\|\check{y} - R\hat{x}\|^2 = \sum_{i=1}^{N} \left( \check{y}_i - \sum_{j=i}^{N} r_{i,j}\,\hat{x}_j \right)^2$ (6)
where the $r_{i,j}$'s denote the row-column entries of R.
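The equivalence between the two distances can be checked numerically. The sketch below is an illustration only: it uses a small classical Gram-Schmidt QR written in plain Python to stay dependency-free (a numerical library routine would normally be used instead), and the helper names and the 2×2 example values are hypothetical.

```python
import math


def qr_gram_schmidt(H):
    """Classical Gram-Schmidt QR of a square, full-rank matrix (list of rows).

    Returns (q_cols, R): the columns of Q and the upper-triangular R.
    """
    n = len(H)
    cols = [[H[i][j] for i in range(n)] for j in range(n)]
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j, v in enumerate(cols):
        u = v[:]
        for i, q in enumerate(q_cols):
            R[i][j] = sum(q[k] * v[k] for k in range(n))
            u = [u[k] - R[i][j] * q[k] for k in range(n)]
        R[j][j] = math.sqrt(sum(c * c for c in u))
        q_cols.append([c / R[j][j] for c in u])
    return q_cols, R


def metric_direct(H, y, x_hat):
    """Squared distance ||y - H x_hat||^2 using the full matrix H."""
    n = len(y)
    return sum((y[i] - sum(H[i][j] * x_hat[j] for j in range(n))) ** 2
               for i in range(n))


def metric_triangular(R, y_check, x_hat):
    """Squared distance ||y_check - R x_hat||^2; row i only needs columns j >= i."""
    n = len(y_check)
    return sum((y_check[i] - sum(R[i][j] * x_hat[j] for j in range(i, n))) ** 2
               for i in range(n))


H = [[2.0, 1.0], [1.0, 3.0]]   # hypothetical real channel matrix
y = [1.0, -2.0]                # hypothetical received signal
q_cols, R = qr_gram_schmidt(H)
y_check = [sum(q[k] * y[k] for k in range(len(y))) for q in q_cols]  # Q^T y

x_hat = [3.0, -1.0]            # an arbitrary candidate point
d_direct = metric_direct(H, y, x_hat)
d_triangular = metric_triangular(R, y_check, x_hat)
```

Both functions return the same value up to rounding, while the triangular form skips the entries of R below the diagonal.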
Besides, each node {circumflex over (x)}n . . . {circumflex over (x)}N may be associated with a weight Δ{circumflex over (x)}
Therefore, ∥{hacek over (y)}−R{circumflex over (x)}∥2 can be interpreted as the weight Δ{circumflex over (x)}
In K-best lattice decoding, the subset $\chi \subset \Omega^N$ of searched candidates is constructed recursively. More specifically, at each level in the tree search, only the K nodes yielding the smallest node metric (i.e., those most likely to lead to the ML leaf $\hat{x}_{ML}$) are selected for being extended to lower levels; other nodes are pruned (i.e., not further extended). Thus, although an exhaustive search could involve visiting $2^{M(N-n+1)}$ nodes at level n, a K-best search may involve visiting at most $2^M \cdot K$ nodes per level, thereby reducing the exhaustive search's exponentially growing complexity (as a function of n) to a linearly growing complexity. Pruning some nodes can lead to suboptimality. Yet, optimal performance can be closely approached for an appropriate choice of K.
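To make the node counts concrete, the following sketch compares the two searches for one hypothetical configuration (M=2 bits per dimension, N=4 dimensions, K=4). It assumes that level N is the first level below the root, so that an exhaustive search holds $2^{M(N-n+1)}$ nodes at level n; the function names are illustrative.

```python
def exhaustive_nodes(M, N, n):
    """Nodes at level n of the full tree (level N is the first level below the root)."""
    return 2 ** (M * (N - n + 1))


def k_best_nodes(M, N, n, K):
    """Upper bound on the nodes a K-best search visits at level n."""
    return min(exhaustive_nodes(M, N, n), (2 ** M) * K)


M, N, K = 2, 4, 4  # hypothetical example values
exhaustive_total = sum(exhaustive_nodes(M, N, n) for n in range(1, N + 1))
k_best_total = sum(k_best_nodes(M, N, n, K) for n in range(1, N + 1))
```

For these values the exhaustive search visits 4 + 16 + 64 + 256 = 340 nodes, whereas the K-best search visits at most 4 + 16 + 16 + 16 = 52.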
Embodiments disclosed herein include techniques for updating the node metric, such that the complexity for computing δ{circumflex over (x)}
As illustrated, the tree 300 can include a plurality of nodes 302-310, organized with levels N to N−3. As discussed above, at least one exemplary embodiment may be configured such that the weight of a given node may equal a sum of the metrics of the branches to reach that node from the root node 302. Consequently, embodiments disclosed herein can utilize a breadth first search based algorithm. More specifically, the weight of node 304a may have a value of 3 due to the fact that the metric of the branch between 304a and 302 has a value of 3. The same can be said for nodes 304b, 304c, and 304d. Similarly, the node 306 may have a weight of 4, which is determined from the sum of the metric of the branch between node 302 and node 304a (value of 3), and the metric of the branch between the node 304a and the node 306 (1). Depending on the determined value of K (in this nonlimiting example, K=4), a determination may be made of the K best results at each level. Accordingly, at each level of
In the following, a method is detailed mathematically for computing node and branch metrics such that the complexity associated with visiting a node in the tree configuration decreases with the node depth in the tree.
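Given the upper-triangular property of R, one can develop the vector $\check{y} - R\hat{x}$ as:
$\check{y} - R\hat{x} = \begin{bmatrix} \lfloor \check{y} \rfloor_{N-1} - \lfloor R \rfloor_{N-1} \lfloor \hat{x} \rfloor_{N-1} - \hat{x}_N \lfloor r_N \rfloor_{N-1} \\ \check{y}_N - r_{N,N}\,\hat{x}_N \end{bmatrix}$ (9)
where the operation $\lfloor \hat{x} \rfloor_{N-1}$ denotes the truncation of the vector $\hat{x}$ to its N−1 first coordinates, $\lfloor R \rfloor_{N-1}$ is an (N−1)×(N−1) submatrix of R obtained via the decomposition:
$R = \begin{bmatrix} \lfloor R \rfloor_{N-1} & \lfloor r_N \rfloor_{N-1} \\ 0 \cdots 0 & r_{N,N} \end{bmatrix}$ (10)
and $r_N$ denotes the Nth column vector of R. More generally, one denotes by $\lfloor R \rfloor_n$ the matrix R restricted to its first n rows and n columns. Thus, $\lfloor R \rfloor_n$ is upper-triangular.
Given equation (9), the square distance $\|\check{y} - R\hat{x}\|^2$ can be rewritten as:
$\|\check{y} - R\hat{x}\|^2 = \|\check{y}_{\hat{x}_N} - \lfloor R \rfloor_{N-1} \lfloor \hat{x} \rfloor_{N-1}\|^2 + \delta_{\hat{x}_N}$ (11)
with
$\check{y}_{\hat{x}_N} = \lfloor \check{y} \rfloor_{N-1} - \hat{x}_N \lfloor r_N \rfloor_{N-1}$ (12)
and
$\delta_{\hat{x}_N} = (\check{y}_N - r_{N,N} \cdot \hat{x}_N)^2$. (13)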
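The level-N step above extends to every level of the tree. The following is a hedged sketch of that generalization rather than the original equations: the subscripted residual $\check{y}_{\hat{x}_n \ldots \hat{x}_N}$, the bracket $[\cdot]_n$ for the nth coordinate, and the root conventions are assumed notation, chosen to agree with equation (13) and with the truncation operator defined above. For n = N, the empty label $\hat{x}_{N+1} \ldots \hat{x}_N$ stands for the root.

```latex
\begin{align*}
&\text{Root: } \check{y}_{\varnothing} = \check{y}, \qquad \Delta_{\varnothing} = 0. \\
&\text{For } n = N, N-1, \ldots, 1: \\
&\qquad \delta_{\hat{x}_n \ldots \hat{x}_N}
   = \bigl( [\check{y}_{\hat{x}_{n+1} \ldots \hat{x}_N}]_n - r_{n,n}\,\hat{x}_n \bigr)^2, \\
&\qquad \Delta_{\hat{x}_n \ldots \hat{x}_N}
   = \Delta_{\hat{x}_{n+1} \ldots \hat{x}_N} + \delta_{\hat{x}_n \ldots \hat{x}_N}, \\
&\qquad \check{y}_{\hat{x}_n \ldots \hat{x}_N}
   = \lfloor \check{y}_{\hat{x}_{n+1} \ldots \hat{x}_N} \rfloor_{n-1}
     - \hat{x}_n \lfloor r_n \rfloor_{n-1}. \\
&\text{At a leaf: } \Delta_{\hat{x}_1 \ldots \hat{x}_N}
   = \sum_{n=1}^{N} \delta_{\hat{x}_n \ldots \hat{x}_N}
   = \|\check{y} - R\hat{x}\|^2 .
\end{align*}
```

Under these assumptions, the residual update at level n costs n−1 multiplications, and the branch metric costs one multiplication and one squaring, so the per-node cost shrinks as the search moves from level N toward level 1.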
As illustrated in equation (11), the metric ∥{hacek over (y)}−R{circumflex over (x)}∥2 can be computed in a recursive manner by calculating a branch metric δ{circumflex over (x)}
δ{circumflex over (x)}
where [{hacek over (y)}{circumflex over (x)}
Such square distance computation is well suited to tree search algorithms, since the coordinates of the candidate {circumflex over (x)} are sequentially involved in the recursive construction of the metric. More specifically, in order to reduce the problem dimension, each recursion may involve a knowledge of at least one extra coordinate of {circumflex over (x)} relative to the previous recursion. Thus, each recursion can be interpreted as going one level deeper in the following tree where, at level-n, the node representing the path [xn . . . xN]T is labeled with xn . . . xN; the branch connecting the node xn . . . xN is weighted with the metric δ{circumflex over (x)}
In at least one exemplary embodiment, the branch metric computation involves knowledge of the nth coordinate of {hacek over (y)}{circumflex over (x)}
The complexity associated with each visited node can be approximated by the sum of the complexities for computing the branch metric from equation (17) and that for updating {hacek over (y)}{circumflex over (x)}
Like typical lattice decoders, K-HOLD may utilize some preprocessing and predecoding before starting the actual decoding. Performed after each (re)estimation of H, the preprocessing stage may involve computing Q and R via a QR decomposition of the channel matrix estimate. One should note that inverting H or R may not be needed.
The predecoding stage basically involves calculating $\check{y} = Q^T y$ and initializing memory. Recall that predecoding is performed more often than preprocessing (i.e., for each received signal y).
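Once preprocessing has produced R and predecoding has produced the rotated signal, a hard-output K-best search with a shrinking residual can be sketched as follows. This is an illustrative sketch under stated assumptions, not the flowchart of the disclosure: the function name `k_best_hold`, the list-based data layout, and the choice to update the residual only for surviving nodes are all hypothetical.

```python
def k_best_hold(R, y_check, omega, K):
    """Hard-output K-best search over an upper-triangular R (list of rows).

    y_check is the rotated received signal Q^T y, omega the one-dimensional
    signal set, and K the number of survivors kept per level.
    Returns (path, weight) with path = [x_1, ..., x_N].
    """
    N = len(y_check)
    # Each survivor: (node weight, residual of length n, path x_{n+1}..x_N).
    survivors = [(0.0, list(y_check), [])]
    for n in range(N, 0, -1):      # levels N down to 1
        i = n - 1                  # 0-based index of coordinate n
        children = []
        for weight, resid, path in survivors:
            for s in omega:
                # Branch metric: only the last residual coordinate is needed.
                branch = (resid[i] - R[i][i] * s) ** 2
                children.append((weight + branch, resid, s, path))
        children.sort(key=lambda c: c[0])
        # Residual update for the K kept nodes: n - 1 multiplications each,
        # so the cost per node decreases as the search goes deeper.
        survivors = [
            (w, [resid[k] - R[k][i] * s for k in range(i)], [s] + path)
            for w, resid, s, path in children[:K]
        ]
    best_weight, _, best_path = survivors[0]
    return best_path, best_weight


# Hypothetical noise-free example: y_check = R x with x = [1, -3].
R_example = [[2.0, 1.0], [0.0, 1.5]]
y_check_example = [-1.0, -4.5]
path, weight = k_best_hold(R_example, y_check_example, [-3, -1, 1, 3], K=2)
```

Deferring the residual update until after pruning is one possible design choice; it avoids spending the update on nodes that are discarded at that level.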
The decoding stage inputs $\check{y}$ and outputs an estimate of $\hat{x}_{ML}$. The decoding stage may involve a tree search in a K-best manner, which is summarized in the flowchart depicted in
Similarly,
At block 476, i may be assigned a value of 1. Additionally, [{hacek over (y)}{circumflex over (x)}
Additionally, at least one modification to the algorithm of
Similarly, to limit the number of branch metrics computed per level, one can limit the total number 2M of branches leaving each node to a smaller number J<2M. A good branch selection involves choosing the J branches most likely to lead to the most likely (ML) leaf, (typically, the J branches with the smallest metric). This may involve determining, for each node {circumflex over (x)}n . . . {circumflex over (x)}N, a subset Ω{circumflex over (x)}
Similarly, while some embodiments may be utilized as hard detection algorithms, at least one embodiment may be utilized as a soft detection algorithm. Embodiments of soft-detection algorithms may be configured to provide a bit-wise soft-reliability information approaching the log-likelihood ratio (LLR)
A (near-optimal) approximation of Λm,n may be given by
In at least one embodiment, Ω|b
A brute force technique to compute (19) may involve running a hard-output detection algorithm (such as, e.g., K-HOLD) a plurality of times to obtain {circumflex over (x)}|b
As opposed to the brute force technique, K-SOLD can approximate {circumflex over (x)}|b
Additionally, K-SOLD may be similar to K-HOLD in that K-SOLD may utilize a similar algorithm to search the tree and compute the branch and node metrics. In at least one exemplary embodiment, one difference relative to K-HOLD is the utilization of extra resources for computing the soft-output. The calculation of the soft-output $\Lambda_{m,n}$ expressed in equation (19) can be performed independently for each bit $b_{m,n}$. For each set {m,n}, $\Lambda_{m,n}$ can be obtained by updating (for each visited node) two soft-bit metrics $L_{m,n}^0$ and $L_{m,n}^1$, where $L_{m,n}^z$ corresponds to the smallest computed node metric associated with nodes representing the bit $b_{m,n} = z$. For a visited node with label $\hat{x}_n \ldots \hat{x}_N$, the update of $L_{m,n}^z$ can be written
$L_{m,n}^0 = \Delta_{\hat{x}_n \ldots \hat{x}_N}$ if $b_{m,n} = 0$ and $\Delta_{\hat{x}_n \ldots \hat{x}_N} < L_{m,n}^0$,
$L_{m,n}^1 = \Delta_{\hat{x}_n \ldots \hat{x}_N}$ if $b_{m,n} = 1$ and $\Delta_{\hat{x}_n \ldots \hat{x}_N} < L_{m,n}^1$,
where $b_{m,n}$ denotes the mth component bit mapped to $\hat{x}_n$. Once the entire tree has been scanned, the soft-output can be obtained by computing:
As stated above, the computation of Lm,nz may be performed recursively while visiting the tree by updating Lm,nz with the metric of any node representing the bit bm,n=z, if the latter metric is smaller than Lm,nz. Then, before starting the tree search, Lm,nz may be initialized with a value Linit that is greater than the metric Δ|b
Assuming the metrics of branches connecting a pruned node to a leaf are all-zero amounts to under-estimating the real leaf metric. Such under-estimation may become significant for nodes at higher levels, which can drastically degrade performance. A solution to reduce this degradation is to limit the node metrics used in the soft-output computation to the nodes at levels n≦N′, where N′<N. The more N′ approaches 1, the less impact the under-estimation has. Limiting nodes used for soft-output computation also lowers complexity, since the update in equation (21) is performed for fewer nodes.
The K-best search does not guarantee that the visited nodes at levels n<N represent $b_{m,n} = z$ for all combinations $\{m,n,z\} \in \{1, \ldots, M\} \times \{1, \ldots, N\} \times \{0,1\}$. Therefore, by limiting the nodes used for soft-output computation to levels $n \le N'$, there may exist sets {m,n,z} for which $L_{m,n}^z$ is never updated, and hence remains equal to $L_{init}$. In such a case, choosing $L_{init}$ too large may lead to performance degradation.
Accordingly, $L_{m,n}^z$ may be initialized with a more moderate value $L_{init} = R \ll +\infty$, which can be interpreted as a saturation threshold representing the fact that there may be no symbol representing the bit $b_{m,n} = z$ in a spherical region of radius R centered on $\check{y}$. The value of R should be jointly optimized with N′ for each particular MIMO system (i.e., for each particular signal set $\Omega^N$).
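The bookkeeping of the soft-bit metrics with a saturation threshold can be sketched as follows. Several points are assumptions made for illustration and are not taken from the disclosure: each visited node is assumed to update the bits of every coordinate on its path, the final output is formed as an unscaled difference of the two metrics (the sign and scaling depend on the LLR convention), and all function and parameter names are hypothetical.

```python
def init_soft_metrics(M, N, L_init):
    """One metric per (bit m, dimension n, bit value z), saturated at L_init."""
    return {(m, n, z): L_init
            for m in range(1, M + 1)
            for n in range(1, N + 1)
            for z in (0, 1)}


def update_soft_metrics(L, node_weight, path_bits, level, N_prime):
    """Update the metrics from one visited node.

    path_bits maps each dimension on the node's path to its tuple of M bits.
    Nodes above level N_prime are ignored, as described in the text.
    """
    if level > N_prime:
        return
    for n, bits in path_bits.items():
        for m, z in enumerate(bits, start=1):
            key = (m, n, z)
            if node_weight < L[key]:   # keep the smallest metric seen so far
                L[key] = node_weight


def soft_outputs(L, M, N):
    """Assumed convention: difference of the two metrics, without scaling."""
    return {(m, n): L[(m, n, 1)] - L[(m, n, 0)]
            for m in range(1, M + 1)
            for n in range(1, N + 1)}


# Hypothetical example: M=1 bit per dimension, N=2 dimensions, threshold 4.0.
L = init_soft_metrics(M=1, N=2, L_init=4.0)
update_soft_metrics(L, 0.5, {1: (0,), 2: (1,)}, level=1, N_prime=1)
update_soft_metrics(L, 0.1, {2: (0,)}, level=2, N_prime=1)  # ignored: level > N'
soft = soft_outputs(L, M=1, N=2)
```

In this example the metrics for the bit values that were never observed stay at the threshold, so the magnitude of each soft output is bounded by the chosen saturation value.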
Referring again to the drawings,
More specifically,
for all sets $\{m,n\} \in \{1, \ldots, M\} \times \{1, \ldots, N\}$ (block 564).
Because many MIMO systems utilize soft-input channel decoding, the following is based on K-SOLD performance. At least one nonlimiting example utilizes K-SOLD detection performance for unit-average-energy, convolutionally coded 16-QAM and 64-QAM transmissions over 2×2 and 4×4 uncorrelated Rayleigh fading channels with background AWGN. The considered convolutional code is the 64-state non-recursive convolutional code recommended in the IEEE standards 802.11n and 802.16e. In at least one exemplary embodiment, simulations may assume perfect knowledge of H and $N_0$. It may also be assumed that no channel matrix reordering is performed prior to the decoding stage (unless stated otherwise).
Additionally, the complexity of the K-best search is proportional to the total number of visited nodes per level in the tree. The complexity can then be lowered by reducing J and/or K. Some embodiments may implement $J = \min\{2^M, 4\}$. Simulation results for hard and soft output detection showed that using $J = \min\{2^M, 4\}$ yields no noticeable performance degradation as compared to using $J = 2^M$.
The effect of K is illustrated in
Additionally, other parameters, such as N′ and R can degrade K-SOLD performance if not aptly chosen. As a nonlimiting example,
Additionally, Table 1 lists the pairs {N′,R} yielding the best detection performance for 802.16e (2×2 QAM) and 802.11n (4×4 QAM) systems transmitting over an uncorrelated Rayleigh fading channel. Performance has been evaluated via Monte-Carlo simulations, for all $\{N', R\} \in \{1, 2, \ldots, N\} \times \{0.05, 0.1, \ldots, 4.95, 5\}$.
Performance results discussed so far assume no reordering of the channel matrix. Nonetheless, for a BER of $10^{-5}$ and K=8, when reordering H, little performance improvement may be seen for QPSK transmission, and around 0.2 to 0.3 dB performance improvement for 16 and 64-QAM transmission, as illustrated in
Included herein are embodiments of a novel approach for computing metrics in K-best MIMO detection algorithms. By identifying the detection process with a K-best tree search, these embodiments involve updating node metrics such that the complexity associated with each node decreases with its depth in the tree, thereby better balancing the node update complexity and the number of visited nodes at a given depth. Embodiments disclosed herein also include a plurality of new K-best lattice decoding algorithms. A first algorithm, referred to as K-HOLD, may be configured to generate a hard-output on the received signal with a complexity reduction as compared to existing K-best hard detectors with similar performance. A second algorithm, referred to as K-SOLD, is a low-complexity soft-output extension of K-HOLD.
Embodiments of these algorithms include a branch and node metric computation technique such that the complexity associated with each node decreases with the node depth in the tree. Similarly, embodiments of K-SOLD may be configured such that not only leaf metrics are used in the computation of the soft-output, but also the pruned node metrics. In other words, N′ can be greater than 1. Additionally included are one or more techniques to saturate the soft-bit metric, including the choice of R. The optimization of the parameters N′ and R is also provided in Table 1 for some particular exemplary MIMO systems.
Further, both K-HOLD and K-SOLD algorithms may be configured to achieve ML or near-ML performance with a significant complexity reduction as compared to brute force ML detection methods. Simulation results for various QAM transmissions over a 2×2 uncorrelated Rayleigh fading channel illustrate the near-optimality of these algorithms as compared to optimal (ML) performance. In addition, simulations showed that, for Wi-Fi and WiMAX applications, K-SOLD can approach ML performance within 1 dB for very reasonable complexity (K=8). One should also note that the embodiments for metric computation detailed herein can be easily applied to any tree search based decoding technique, which includes, as a nonlimiting example, sphere decoding.
The embodiments disclosed herein can be implemented in hardware, software, firmware, or a combination thereof. At least one embodiment disclosed herein may be implemented in software and/or firmware that is stored in a memory and that is executed by a suitable instruction execution system. If implemented in hardware, one or more of the embodiments disclosed herein can be implemented with any or a combination of the following technologies: a discrete logic circuit(s) having logic gates for implementing logic functions upon data signals, an application specific integrated circuit (ASIC) having appropriate combinational logic gates, a programmable gate array(s) (PGA), a field programmable gate array (FPGA), etc.
One should note that the flowcharts included herein show the architecture, functionality, and operation of a possible implementation of software. In this regard, each block can be interpreted to represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative implementations, the functions noted in the blocks may occur out of the order and/or not at all. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
One should note that any of the programs listed herein, which can include an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. In the context of this document, a “computer-readable medium” can be any means that can contain, store, communicate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. The computer readable medium can be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device. More specific examples (a nonexhaustive list) of the computer-readable medium could include an electrical connection (electronic) having one or more wires, a portable computer diskette (magnetic), a random access memory (RAM) (electronic), a read-only memory (ROM) (electronic), an erasable programmable read-only memory (EPROM or Flash memory) (electronic), an optical fiber (optical), and a portable compact disc read-only memory (CDROM) (optical). In addition, the scope of the certain embodiments of this disclosure can include embodying the functionality described in logic embodied in hardware or software-configured mediums.
One should also note that conditional language, such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more particular embodiments or that one or more particular embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment.
It should be emphasized that the above-described embodiments are merely possible examples of implementations, merely set forth for a clear understanding of the principles of this disclosure. Many variations and modifications may be made to the above-described embodiment(s) without departing substantially from the spirit and principles of the disclosure. All such modifications and variations are intended to be included herein within the scope of this disclosure.
This application claims the benefit of U.S. Provisional Application No. 61/035,472, filed Mar. 11, 2008, which is incorporated by reference in its entirety.
Number | Date | Country | |
---|---|---|---|
20090232232 A1 | Sep 2009 | US |
Number | Date | Country | |
---|---|---|---|
61035472 | Mar 2008 | US |