Field of the Invention
The present invention relates generally to methods, apparatus and systems for communication encoders, decoders, transmitters, receivers and infrastructure and/or user devices. More particularly, aspects of the invention relate to constrained turbo block convolutional codes, constrained interleaving, and related methods, apparatus, and systems for improved constrained interleaving, encoding, decoding, signal mapping, MIMO applications, spatial modulation, and rate matching. The present invention also relates to efficient parallel ASICs and VLSI architectures and optical integrated circuit architectures to implement these methods, apparatus, and systems.
Description of the Related Art
A large body of prior art includes technical publications, patents, and standards that relate to 4G LTE (fourth generation long term evolution) wireless systems. In particular, the relevant prior art relates to encoding and decoding architectures and algorithms for use with the CTC (convolutional turbo code) specified for use with 4G LTE. Specifically, important prior art relates to algorithms and high performance ASIC architectures for CTC encoding/decoding, deterministic contention-free interleavers such as the QPP (quadratic polynomial permutation) based interleavers, and rate matching/puncturing architectures.
A parallel decoding ASIC for the CTCs used in 4G LTE can be found in C. Studer, C. Benkeser, S. Belfanti, and Q. Huang, “Design and Implementation of a Parallel Turbo-Decoder ASIC for 3GPP-LTE,” IEEE J. Solid State Circuits, Vol. 46, No. 1, January 2011 (referred to as the “Studer reference” herein). A follow-on paper explains further improvements and details about efficient parallel decoding of the CTC used in 4G LTE. This second technical publication is: C. Roth, S. Belfanti, C. Benkeser, and Q. Huang, “Efficient parallel turbo-decoding for high throughput wireless systems,” IEEE Transactions on Circuits and Systems, 2012 (referred to as the “Roth reference” herein).
One of ordinary skill in the art would be familiar with the Studer reference, which explains how to design a highly optimized parallel real time ASIC to implement the CTC specified for use in 4G LTE. The Roth reference provides further details and optimizations to the same architecture as described in the Studer reference. One of ordinary skill in the art would also be familiar with the following prior art reference: A. Nimbalker, Y. Blankenship, B. Classon, T. K. Blankenship, “ARP and QPP Interleavers for LTE Turbo Coding,” WCNC 2008 proceedings (referred to as the “Nimbalker reference” herein). The architecture in the Studer reference uses the QPP interleaver as described in the Nimbalker reference. The QPP interleaver is important because it is used in the 4G LTE standard and because it can be described as a “contention free,” “vectorizable,” and “deterministic” interleaver.
As is well known “contention free”/“vectorizable” means that the permutation function has a particular property that aids in parallel processing implementations. Consider a case where there are N=8 parallel processors. Then, as long as N divides the frame size, K, the contention free interleaver places on a given row in memory all of the N elements to be processed by the N processors in a given clock cycle. The QPP only supports up to N=8 level vectorization.
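By way of non-limiting illustration, the contention free property can be checked numerically. The sketch below assumes the common windowed formulation (the frame is split into N windows of W=K/N addresses, one memory bank per window, and at each step the N processors must touch N distinct banks) and applies it to the QPP formula given as equation (1). The parameters K=40, f1=3, f2=10 and the function names are chosen only for this example and are not drawn from the cited references.

```python
def qpp(i, K, f1, f2):
    """Direct evaluation of the QPP address (f1*i + f2*i^2) mod K."""
    return (f1 * i + f2 * i * i) % K


def is_contention_free(K, N, f1, f2):
    """Return True if, at every step j, the N processors hit N distinct banks."""
    if K % N != 0:
        raise ValueError("N must divide the frame size K")
    W = K // N  # window length: number of addresses handled by each processor
    for j in range(W):
        # Processor p reads interleaved address qpp(j + p*W); its bank is address // W.
        banks = {qpp(j + p * W, K, f1, f2) // W for p in range(N)}
        if len(banks) != N:
            return False
    return True


# Illustrative parameters (assumed for this example only).
K, f1, f2 = 40, 3, 10
print(is_contention_free(K, N=8, f1=f1, f2=f2))
```

If the check passes, each of the N processors can read its operand from a different bank in every clock cycle, so no arbitration logic is needed between them.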
The Studer reference also points out a very efficient way to compute the QPP address sequence. As per the Nimbalker reference, the QPP interleaved address sequence can be written as
πQPP(i)=(f1·i+f2·i^2) mod K (1)
where f1 and f2 are suitably chosen interleaver parameters that depend on the code-block size K. Note that in this notation the sequentially incremented symbol i is used to denote a coded bit position in the transmitted frame, and the permuted version of the indexing sequence, πQPP(i), is used to look up a bit position in the non-permuted sequence of input bits. The Studer reference explains that a very efficient way to compute equation (1) is to use the following set of recursions. These recursions use only additions and modulo operations, both of which can be implemented very efficiently in hardware. Hence at runtime, in hardware, equation (1) is computed as
πQPP(i+1)=(πQPP(i)+δ(i))mod K (2)
and
δ(i+1)=(δ(i)+b)mod K (3)
where πQPP(0)=0, δ(0)=f1+f2, and b=2f2.
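The recursions (2)-(3) can be sketched in software as follows. This is a minimal illustration rather than the hardware of the Studer reference; the parameters K=40, f1=3, f2=10 are assumed example values and the function names are hypothetical.

```python
def qpp_direct(K, f1, f2):
    """Evaluate equation (1) directly for i = 0 .. K-1."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


def qpp_recursive(K, f1, f2):
    """Generate the same addresses with only additions and modulo reductions in the loop."""
    addresses = []
    pi = 0                  # pi(0) = 0
    delta = (f1 + f2) % K   # delta(0) = f1 + f2
    b = (2 * f2) % K        # constant second difference, b = 2*f2
    for _ in range(K):
        addresses.append(pi)
        pi = (pi + delta) % K     # equation (2)
        delta = (delta + b) % K   # equation (3)
    return addresses


# Illustrative parameters (assumed for this example only).
K, f1, f2 = 40, 3, 10
print(qpp_recursive(K, f1, f2) == qpp_direct(K, f1, f2))
```

The recursion follows from equation (1) because πQPP(i+1)−πQPP(i)=f1+f2+2·f2·i, so the first difference δ(i) itself grows by the constant 2·f2 at each step.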
Another prior art reference that is known to those of skill in the art and that goes into further detail about QPP recursions is: Y. Sun and J. Cavallaro, “Efficient hardware implementation of a highly-parallel 3GPP LTE/LTE-advanced turbo decoder,” Integration, the VLSI Journal, No. 44, 2011, pp. 305-315 (referred to as the “Sun reference” herein). This reference provides additional recursions that allow QPP addresses to be incremented by an integer, d=Δi, that can be any positive integer. This allows forward and backward sequences of QPP addresses to be generated for the forward and backward recursions used in decoding. It also allows recursions similar to equations (2)-(3) to increment by more than one element, for example, Δi=K/M, where K is the frame size and M is the number of processors in a system. The Sun reference also explains the prior art knowledge that a set of M different QPP address generators can be run in parallel with relative offsets of one and with Δi=K/M to generate a set of M consecutive QPP addresses in parallel. The Sun reference also provides efficient hardware circuits to implement such an addressing scheme.
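A stride-d version of the recursion can be sketched as follows. The update constants below are derived directly from equation (1) and are not taken from the circuits of the Sun reference; the function name and the parameters K=40, f1=3, f2=10 are assumptions made for this example. A negative stride yields a backward address sequence.

```python
def qpp_stride(K, f1, f2, start, d, count):
    """Return pi(start), pi(start+d), pi(start+2d), ... (count values).

    Only additions and modulo reductions are used inside the loop.
    """
    pi = (f1 * start + f2 * start * start) % K          # one-time initialization
    g = (f1 * d + f2 * d * d + 2 * f2 * d * start) % K  # first difference at `start`
    c = (2 * f2 * d * d) % K                            # constant second difference
    out = []
    for _ in range(count):
        out.append(pi)
        pi = (pi + g) % K
        g = (g + c) % K
    return out


# Illustrative parameters (assumed for this example only).
K, f1, f2 = 40, 3, 10
forward_by_five = qpp_stride(K, f1, f2, start=0, d=5, count=8)
backward_by_one = qpp_stride(K, f1, f2, start=K - 1, d=-1, count=K)
print(forward_by_five)
print(backward_by_one[:4])
```

The constants follow from πQPP(i+d)−πQPP(i)=f1·d+f2·d^2+2·f2·d·i, whose increment from one step to the next is the constant 2·f2·d^2.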
Another relevant field of art is called rate matching. Rate matching is also known as “puncturing.” The CTC mother code defined in the LTE standards is a rate 1/3 parallel concatenated turbo code. This CTC leads to very complicated rate matching circuits at both the encoder and the decoder, thus increasing the overall hardware complexity of 4G LTE CTC encoding and decoding. A reference that discusses rate matching for LTE turbo codes is C. Ma and P. Lin, “Efficient implementation of rate matching for LTE codes,” IEEE ICFCC 2010 international conference proceedings, pp. V1-704-708 (referred to as the “Ma reference” herein). FIG. 1 of the Ma reference shows the basic configuration of 4G LTE rate matching at the transmitter side. The data stream plus two streams of parity bits from the rate 1/3 parallel concatenated CTC pass through three parallel blocks labeled “sub-block interleaver.” That is, three interleavers are used, one each to process the total number of bits in a non-punctured frame. Another reference that explains the rate matching used in 4G LTE is L. Yu et al., “An improved rate matching algorithm for 3GPP LTE Turbo code,” Conference on Communications and Mobile Computing (CMC), pp. 345-348, April 2011. FIG. 2 of this article and the discussion thereof is very helpful in understanding the 4G LTE rate matching algorithm.
There also exists a vast body of literature related to OTN (optical transport network) applications. OTN applications are demanding because they require very high data rates and powerful codes, and the frame size used in coding/decoding is long (122,368 message bits plus coding overhead bits). OTN systems are either already available or still being researched and developed to support data rates of 100 Gbps (usually referred to as 100G), 400 Gbps and even up to 1000 Gbps (1 Terabit per second, 1 T). These very high speed systems demand very powerful codes to achieve specified high NCGs (net coding gains) at very low BERs (bit error rates) below 10^−15. High speed digital hardware that employs extensive parallel processing is needed to decode these powerful codes in real time.
It can be noted that in OTN applications, the codes being used/considered now correspond to LDPC (low density parity check) codes, concatenations of LDPC codes with one or more long block codes, or TPCs (turbo product codes). OTN applications cannot use CTCs like LTE does because the error floors required by OTN applications are far below those afforded by CTCs. Hence it would be desirable to have a much lower complexity parallel coding/decoding technique and parallel architecture than those that are currently proposed for use in or used in the OTN field. It would be desirable if this low complexity coding/decoding technique could meet the stringent NCG requirements at BERs of 10^−15 and outperform all known coding/decoding techniques that are currently proposed for use in or used in the OTN field.
The prior art also includes U.S. Pat. No. 8,537,919, “Encoding and decoding using constrained interleaving,” and its continuation-in-part, U.S. Pat. No. 8,532,209, “Methods, apparatus and systems for coding with constrained interleaving.” Both of these US patents are incorporated herein by reference in order to provide the reader with written description level details of known constrained interleaver design techniques, and known encoder/decoder structures that use constrained interleaving. These patents are incorporated by reference, but it is to be understood that for claim construction purposes, the instant written description should be used, and not any of the written description in the incorporated-by-reference patents. In this patent application, some terms are defined differently than in the US patents incorporated by reference herein. Therefore, it is to be understood that the interpretation of terms and phrases used in the claims herein should be taken in the context of the present application and not the references incorporated herein. The prior art also includes J. Fonseka, E. Dowling, S. I. Han and Y. Hu, “Constrained interleaving of serially concatenated codes with inner recursive codes,” IEEE Communications Letters, Vol. 17, No. 7, July 2013, referred to herein as “the Fonseka [1] reference.” The prior art also includes J. Fonseka, E. Dowling, T. Brown and S. I. Han, “Constrained interleaving of turbo product codes,” IEEE Communications Letters, vol. 16, 2012, pp. 1365-1368, September 2012, referred to herein as “the Fonseka [2] reference.” The prior art also includes S. I. Han, J. P. Fonseka and E. M. Dowling, “Constrained Turbo Block Convolutional Codes for 100G and Beyond Optical Transmissions,” IEEE Photonics Technology Letters, Vol. 26, No. 10, May 2014, referred to herein as “the Fonseka [3] reference.” The above-listed patents and technical publications also cite to related articles in the technical literature and to other U.S. Patent references, which are also part of the prior art. It can be noted that the above referenced patents and technical papers constitute at least a portion of what would be known to one of skill in the art of CTBC (constrained turbo block convolutional) codes.
Consider
Another characterizing feature of the CTBC encoder 400 is that it makes use of a constrained interleaver 410. Any specific CTBC code is defined in terms of the specifically selected outer block code B used in the OBC encoder 405, the specifically selected recursive convolutional code (RCC) used in the IRCC encoder 415, and a specifically selected constrained interleaver having a specified size and permutation function used in block 410. The constrained interleaver 410, and various forms of its interleaver constraints, are described in the above-cited prior art references. The constrained interleaver 410 can be designed to provide an interleaver gain, G, similar to uniform interleaving, but also can be designed to ensure that the net MHD of the entire CTBC code satisfies some target MHD, dt≧d0di. It can be noted that if the constrained interleaver used in the CTBC were to be replaced by a uniform interleaver of the same length, a “Uniform-interleaved Turbo Block Convolutional” (UTBC) code would result, and the MHD of this corresponding UTBC code would typically be close to MHD,
Various forms of constrained interleavers are defined in the above-referenced US patents and the three above-cited references related to constrained interleaving. A constrained interleaver type 2, i.e., the “CI-2” is introduced and used in the block 401 of
An objective of the CI-2 interleaver is to create CTBC codes that simultaneously provide a specified high MHD while achieving as high an interleaver gain as possible. The high MHD provides a lower error floor and has other desirable effects in various types of channels, and the high interleaver gain ensures a high coding gain for the CTBC code. However, the interleaver gain attainable by the CI-2 is limited to a large extent by the number of rows, L, in the CI-2 design matrix. The lower the number L, for a fixed frame size K, the higher the CI-2 interleaver gain. However, when CI-2 interleavers are used, lowering L will eventually limit the achievable MHD.
It would be desirable to have improved constrained interleavers that do not require a CI-2 design matrix, but instead use L=1, and can thus lead to improved CTBC codes that have higher interleaver gains as compared to a CI-2 interleaver of the same length. It would be desirable to further include improved signal mapping methods, apparatus and systems to map a CTBC code onto a target signal constellation in such a way as to provide a constellation mapping gain, similar to the kinds of gains provided by trellis coded modulation (TCM) and bit interleaved coded modulation (BICM). It would also be desirable to have new rate matching algorithms that could efficiently interoperate with these new and improved CTBC codes and signal mapping subsystems. It would also be desirable to have algorithms developed for applications in multiple input multiple output (MIMO) systems and spatial modulation, and subsystems for use in communications devices that include a multi-antenna subsystem.
Next consider
Block 1105 processes or otherwise demodulates a received signal r(t) to generate an initial vector rs, which preferably corresponds to a vector of bit metrics. The bit metrics are preferably used in decoding of the component codes using an a-posteriori probability (APP) decoding technique.
The IRCC soft in soft out (SISO) decoder 515 can implement a well known soft decoding algorithm such as the BCJR algorithm, a soft output Viterbi algorithm (SOVA), or the min-sum algorithm. Such algorithms are known to generate extrinsic information indicative of the reliability of the soft decoded results. The BCJR algorithm can be embodied using any of the MAP, Log-MAP, or Max-Log-MAP algorithms. For example, if the IRCC SISO decoder 515 involves the BCJR algorithm, then the IRCC SISO decoder 515 will need to compute a sequence of branch transition probabilities, γ's, that each are a function of a respective element of the received signal metrics, rs, and a corresponding respective element of updated or initial extrinsic information, the Le's. The IRCC SISO decoder 515 will use this sequence of branch transition probabilities, γ's, while making one forward recursion pass to update a set of state metrics, α's, and one backward recursion pass to update a set of state metrics, β's. Such concepts are well known in the art in the context of decoding convolutional turbo codes (CTCs). Using the calculated α, β and γ values, the BCJR decoding of the IRCC decoder calculates the extrinsic information of all its input bits. For example, see P. Robertson, et al., “A comparison of optimal and sub-optimal MAP decoding algorithms operating in the log domain,” IEEE ICC 1995, pp. 1009-1013.
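The forward pass, backward pass, and extrinsic computation described above can be illustrated with a short Max-Log-MAP sketch. It is an illustration only and is not asserted to be the implementation of decoder 515. The sketch assumes a two-state modulo-2 accumulator as the inner code, an unterminated trellis, and the LLR convention log P(bit=0) − log P(bit=1); all function and variable names are hypothetical.

```python
NEG_INF = float("-inf")


def accumulate(u):
    """Modulo-2 accumulator: v[k] = u[k] XOR v[k-1], starting from state 0."""
    v, state = [], 0
    for bit in u:
        state ^= bit
        v.append(state)
    return v


def max_log_map_accumulator(Lc, La):
    """Return extrinsic LLRs for the accumulator input bits.

    Lc[k]: channel LLR of coded bit v[k]; La[k]: a priori LLR of input bit u[k].
    """
    n = len(Lc)
    # Forward state metrics (alpha); the encoder starts in state 0.
    alpha = [[NEG_INF, NEG_INF] for _ in range(n + 1)]
    alpha[0][0] = 0.0
    for k in range(n):
        for s_prev in (0, 1):
            for u in (0, 1):
                s = s_prev ^ u  # next state equals the output bit v[k]
                gamma = -u * La[k] - s * Lc[k]  # branch metric in the log domain
                alpha[k + 1][s] = max(alpha[k + 1][s], alpha[k][s_prev] + gamma)
    # Backward state metrics (beta); both end states are allowed (no termination).
    beta = [[NEG_INF, NEG_INF] for _ in range(n + 1)]
    beta[n] = [0.0, 0.0]
    for k in range(n - 1, -1, -1):
        for s_prev in (0, 1):
            for u in (0, 1):
                s = s_prev ^ u
                gamma = -u * La[k] - s * Lc[k]
                beta[k][s_prev] = max(beta[k][s_prev], gamma + beta[k + 1][s])
    # Extrinsic LLR: best path with u=0 minus best path with u=1, excluding La[k].
    Le = []
    for k in range(n):
        best = [NEG_INF, NEG_INF]
        for s_prev in (0, 1):
            for u in (0, 1):
                s = s_prev ^ u
                metric = alpha[k][s_prev] - s * Lc[k] + beta[k + 1][s]
                best[u] = max(best[u], metric)
        Le.append(best[0] - best[1])
    return Le


# Noiseless example: LLR of +4 for a coded 0 and -4 for a coded 1.
u = [1, 0, 1, 1, 0, 0, 1, 0]
Lc = [4.0 if bit == 0 else -4.0 for bit in accumulate(u)]
Le = max_log_map_accumulator(Lc, [0.0] * len(u))
print([0 if llr > 0 else 1 for llr in Le])
```

In an iterative decoder the a priori values La would be the deinterleaved or interleaved extrinsic values produced by the other component decoder in the previous iteration.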
The IRCC SISO decoder 515 couples its extrinsic information output to a constrained deinterleaver 520 which deinterleaves the extrinsic information received from the IRCC SISO decoder 515, for example, in accordance with the inverse CI-2 permutation function. The OBC SISO decoder 525 is coupled to receive the deinterleaved extrinsic information from the constrained deinterleaver 520. The OBC SISO decoder 525 also preferably implements a known soft decoding algorithm such as the well known Chase-Pyndiah algorithm (also referred to as the Pyndiah algorithm), low complexity Chase-Pyndiah algorithm, the OSD algorithm and its low complexity variations, or any similar soft decoding algorithm for decoding of block codes, for example. In general, different well known (or proprietary) soft decoding algorithms can be used in the blocks 515 and 525. All such algorithms are well known to those of skill in the art, for example, see J. Cho and W. Sung, “Reduced complexity Chase-Pyndiah decoding for turbo product codes,” pp. 210-215, IEEE workshop on signal processing systems, October, 2011.
It would be desirable to have decoding architectures that could be used to efficiently decode the new improved CTBC codes. It would be desirable to have additional efficient algorithms and parallel architectures to decode the improved CTBC codes that have undergone additional constrained interleaving based signal mapping and/or rate matching and/or constrained interleaving based spatial modulation.
While the above mentioned prior art relating to constrained interleaving for use with an OBC and an IRCC provides very powerful CTBC codes, the CI-2 is based on the CI-2 design matrix, [A]L×ρn, and the concept of a random interleaver. The construction of the CI-2 requires many randomization operations performed in the CI-2 design matrix and a complicated process of ensuring that randomizations do not violate any constraints in the CI-2 design matrix. As discussed below, this CI-2 design matrix and design process actually limits BER performance. Also, the CI-2 is not a vectorizable/contention free interleaver. Herein a “random interleaver” is also defined in opposition to a “deterministic interleaver” that uses a mathematical formula to generate the deterministic interleaver permutation. A random interleaver is thus often implemented as a table look up or with a state-machine logic circuit whose sequencing logic does not use a fixed mathematical equation but whose state transition logic needs to be specifically designed for each frame size.
It would be desirable to have a family of contention free, vectorizable constrained interleavers, both deterministic and semi-random. It would be desirable to have an SCC that is constructed by coupling the output of the OBC to the IRCC via a contention free, vectorizable and deterministic version of a constrained interleaver. It would further be desirable to be able to design a system that could achieve the memory efficient benefits of the Studer reference, and to also greatly simplify the rate matching requirements of the system. It would be desirable to have a parallel architecture that could meet the encoding and decoding performance requirements of the 4G LTE CTC encoders and decoders, but with simpler computational functional units, less overall computational complexity, and thus lower power consumption. It would be desirable to have a CTBC encoder/decoder architecture that could eliminate the complicated and hardware intensive rate matching and inverse rate matching subsystems required by 4G LTE encoders and decoders. It would also be desirable if the parameters of this same CTBC encoder/decoder architecture could be scaled to higher values of N levels of parallelism and designed to provide the NCGs needed at BERs of 10^−15 for 400 GHz and beyond OTN applications. It would be desirable to also have new coded modulation techniques that could be used to map codes onto higher order constellations and to implement advanced functions such as rate matching, spatial modulation, and MIMO systems. It would be desirable if the advanced modulation technique could be used along with optical integrated circuits and similar technology to implement higher capacity optical communication channels, for example 400 GHz and beyond, and 1 Tera Hz and beyond. It would be desirable to have a constrained interleaver design process that did not rely on the CI-2 design matrix and was able to provide higher BER performance for random and deterministic constrained interleavers.
Using the abbreviations CI=“constrained interleaver” and CICM=“constrained interleaved coded modulation,” and other more common abbreviations that are all defined herein, the present patent application is organized and the present invention can be summarized into sub-invention categories as follows:
In accordance with a first aspect of the present invention, constrained interleavers are designed that only use a single row vector, as opposed to a CI-1 or CI-2 design matrix, which always needs more than one row to meet non-trivial MHD design objectives. For example, CI-3 and CI-4 constrained interleavers are designed by identifying restricted zones of numbers where a pseudo-random number generator cannot generate an output. These restricted zones correspond to sets of adjacent integers within the integer domain [0,K−1]. The constrained interleaver has length K and is used to permute the integer ring [0,K−1]=[0, . . . , K−1] to a permuted version of this integer ring, which can be denoted as π[0,K−1]. An unconstrained pseudo-random permutation can map [0,K−1] to any reordering of [0,K−1]. In contrast, a constrained interleaver in accordance with the present invention imposes constraints that eliminate any possible reordering that would cause a particular index of [0,K−1] to be mapped to a position (index) in π[0,K−1] that would violate a constraint. The constraints are implemented by sequentially pseudo-randomly permuting (placing) indices (positions) in the integer ring [0,K−1] to new positions (indices) in π[0,K−1] subject to the constraint of not permuting any index of [0,K−1] into any restricted zone in π[0,K−1]. The restricted zones are used to identify ranges of indices in π[0,K−1] where, if a particular index from [0,K−1] were to be placed, a low weight CTBC codeword would be/could be generated. Herein, the phrases “low weight codeword,” “low weight error sequence,” and “low distance error sequence” generally correspond to any possible low weight encoded bit sequences, iP, of weights dt≦d≦df where none of the possible low weight encoded bit sequences, iP, can have a weight less than dt. 
Here the weights dt≦d≦df correspond to Hamming distances, and the coded sequence can be a CTBC coded sequence or some other kind of encoded sequence encoded in accordance with a code for which the low weight error sequences can be identified and enumerated. The interleaver constraints are used to eliminate the possibility of the generation/existence of any low weight CTBC codewords that have weight below a target MHD value denoted as dt.
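The sequential placement with restricted zones can be sketched as follows. The actual CI-3 and CI-4 zone rules are not reproduced here. Instead, the sketch assumes one hypothetical constraint chosen purely for illustration: coded bits of the same outer codeword must be placed at least `sep` positions apart, so that each already-placed bit defines a restricted zone of adjacent integers around itself. The function name, its parameters, and the restart strategy are likewise assumptions.

```python
import random


def single_row_constrained_permutation(rho, n, sep, seed=0, max_attempts=10000):
    """Return pi, where pi[i] is the position in u of coded bit i = n*j + t.

    Hypothetical constraint (illustration only): bits of the same codeword j
    must land at least `sep` positions apart in the single-row vector u.
    """
    K = rho * n
    rng = random.Random(seed)
    for _ in range(max_attempts):
        free = set(range(K))
        pi = [None] * K
        stuck = False
        for i in range(K):
            j = i // n
            siblings = [pi[m] for m in range(j * n, i)]  # placed bits of codeword j
            # Positions outside every restricted zone around a sibling bit.
            allowed = sorted(p for p in free
                             if all(abs(p - q) >= sep for q in siblings))
            if not allowed:
                stuck = True  # dead end: restart with fresh random choices
                break
            choice = rng.choice(allowed)
            pi[i] = choice
            free.remove(choice)
        if not stuck:
            return pi
    raise RuntimeError("no valid placement found; relax the constraint")


pi = single_row_constrained_permutation(rho=12, n=4, sep=3)
print(pi[:8])
```

Restarting after a dead end is only one simple way to complete the placement; it is used here because it keeps the sketch short.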
It should be noted that while U.S. Pat. Nos. 8,537,919 and 8,532,209 disclose the general genus of constrained interleavers and certain species such as the CI-1 and CI-2 species of constrained interleavers, an aspect of the present invention discloses additional specific novel species that are members of the genus of constrained interleaver inventions. That is, the present invention specifically discloses the two new species of constrained interleavers, CI-3 and CI-4. Both CI-3 and CI-4 are members of the newly disclosed sub-genus class SRCI (single row constrained interleaving).
SRCI as performed in accordance with the present invention provides several advantages. First, the interleaver gain can be improved in comparison to the prior art CI-2 because the number of restricted-out permutation possibilities decreases with respect to the CI-2 design method. Second, it is possible to design CTBC codes that allow different target MHD values to be used for different categories of low weight error sequences. This is important because certain categories of low distance error sequences can be identified that are relatively much less likely. These categories of low distance error sequences have low associated error coefficients. Therefore, the overall probability of error can be reduced by allowing lower MHD to these less likely categories of low distance error sequences. This allows the overall probability of error to be reduced by balancing MHD and error coefficient products in the error probability expression as a function of distance spectra. A third advantage to the SRCI approach is that it provides additional flexibility that allows vectorizable (contention free) deterministic constrained interleavers to be designed using the single row type interleaver constraints.
Another aspect of the present invention focuses on parallel processing architectures that can be used to implement chips, systems of chips or chip subsystems for encoding and decoding of CTBC codes. These parallel architectures make use of the contention free deterministic constrained interleaver along with a parallel-access memory architecture as well as parallel processing units that perform SISO decoding in parallel.
Another aspect of the present invention centers around CICM (constrained interleaved coded modulation). CICM signal mapping is used to map coded sequences such as CTBC coded sequences and other coded sequences for which the low distance error sequences (lowest weight codewords) can be identified and tabulated. CICM signal mappers use a permutation, Γ, to permute the coded bit positions of a coded bit stream of frame size K onto a sequence of K/m groups of m bits, each of which will be mapped to a 2^m-ary symbol in accordance with a selected constellation mapping rule. The constellation mapping rule preferably uses RGC (reverse Gray coding) to map groups of m bits at a time onto the 2^m-ary signal constellation points. The combination of the CICM permutation Γ and the constellation mapping rule is preferably designed to ensure that at least one of a symbol Hamming distance and an MSED (minimum squared Euclidean distance) is achieved. This is achieved by keeping track of a set of low distance error sequences that can be generated at weights d, where dt≦d≦df. Similar to the CI permutation π[0,K−1], Γ is constrained to ensure that low distance error sequences are avoided, but now in terms of symbol Hamming distance and MSED on the transmitted sequence as opposed to the encoded bit stream itself. In preferred embodiments both the symbol Hamming distance and the MSED are jointly achieved. In unequal error protection embodiments, the symbol Hamming distance and a plurality of different MSEDs for different subsets of message bits are jointly achieved. In all such systems mentioned above, when AWGN (additive white Gaussian noise) channels are in use, it may not be necessary to maintain a given symbol Hamming distance, so that maintaining/achieving a given symbol Hamming distance becomes optional. In general, the CICM permutation and mapping is selected to improve or optimize the net probability of error on the channel.
CICM can be used to aid in a variety of areas. For example, CICM is used herein to implement rate matching/puncturing/variable redundancy. Also, CICM is used to implement improved SM (spatial modulation) and MIMO (multiple input multiple output) systems such as multiple-antenna wireless systems for potential use, for example, in 5G and beyond wireless systems.
Another aspect of the present invention involves the design of OTN (optical transport network) systems for 100G and beyond fiber optic or free space laser communication systems. The present invention shows how to design and implement optical subsystems using filter banks constructed using a plurality of known optical discrete-time filters that can be implemented in coupled fiber subsystems and/or optical integrated circuits. The optical discrete time filter banks are used to implement a transmit portion of a MIMO type channel matrix, H. The output of the optical discrete time filter banks is coupled onto a single fiber or free space optical laser channel for transmission. At the receiver, another optical filter bank is used to implement a receive portion of the MIMO type channel matrix, H. Both SM and MIMO type modulation formats are disclosed. The SM and MIMO type systems can be used to increase the performance and data rate of the optical communication system at a given noise level.
Further disclosure in this continuation in part application involves a set of embodiments called combined MIMO spatial modulation (C-MIMO-SM). Such embodiments generally combine aspects of MIMO systems with aspects of spatial modulation systems. Such embodiments as described and defined below include C-IC-SM, C-OFDM-SM and C-TDM-SM. Both encoders and decoders are described. For example, a method, apparatus or system can be constructed for use in a communications system that transmits groups of encoded bits via a group of nt number of independent channels during a plurality of respective symbol intervals. Here nt≧1 and each of the plurality of respective symbol intervals has associated therewith a respective group of one or more subsets of signal constellation (SC) bits and a respective subset of spatial modulation (SM) bits. Such methods, apparatus and systems preferably involve encoding a frame of bits in accordance with one or more outer codes to form a frame of encoded bits. The frame of encoded bits is interleaved into a symbol frame that includes a plurality of subsets of SC bits and a plurality of subsets of SM bits. During each respective symbol interval, to the group of nt number of independent channels is coupled a respective group of nt number of subsets of SC bits and SM-bit derived information determined in accordance with a respective subset of SM bits that is associated with the respective symbol interval. This is used to form a respective group of nt number of independent channel symbols. Each respective independent channel is configured to transmit a respective independent channel symbol that is determined in accordance with the respective subset of SC bits sent during the respective symbol interval.
The various novel features of the present invention are illustrated in the figures listed below and described in the detailed description that follows.
Throughout this written description various mathematical algorithms will be presented in the form of block diagrams. It is to be understood that in any such cases, the block diagrams can be viewed as hardware blocks or logic blocks that could be carried out in software. Likewise, especially in hardware implementations, a given block in any block diagram herein could be embodied using two or more separate hardware sub-blocks. Hence all such modifications are contemplated as ways to implement various aspects and embodiments of the present invention. Also, it should be recognized that any block diagram whose operation is described herein can be viewed as a flow chart, thereby describing a method in addition to a system or an apparatus.
A single frame of a CTBC code can be modeled starting from a set of ρ independent message blocks, each of length k, mj=(mj1, mj2, . . . mjk), j=0, 1, 2, . . . , ρ−1, where ρ is the integer number of message blocks in a frame. These message blocks are first individually encoded by an (n,k) outer block code (OBC) with minimum Hamming distance (MHD) d0 to form a sequence of codewords of the OBC. This sequence of codewords of the OBC will be placed into a vector, [c]ρn=[c]K, where K=ρn is the frame size. The elements of [c]ρn can be written in terms of the codeword positions, cj=(cj1, cj2, . . . cjn), for j=0, 1, 2, . . . , ρ−1, and/or in terms of the individual coded bit positions, c(i), for i=0, . . . , K−1, where i=nj+t, for j=0, 1, 2, . . . , ρ−1 and t=0, . . . , n−1. In this document, the term “codeword” specifically refers to a set of coded bits generated by applying the OBC to a message block, {mj}, while the term “codeword position” refers to physical memory locations where the coded bits of a corresponding codeword reside. The vector, [c]ρn, can be viewed as a memory array whose contents are the naturally ordered set of codewords, {cj}, or can be viewed as a bit-oriented memory array containing “coded bit positions” where the corresponding coded bits {(cj0, cj1, . . . , cjn-1)}, j=0, 1, 2, . . . , ρ−1, physically reside. Also, the term “coded bit position” can refer to a permuted location or address where the corresponding coded bit will reside after an interleaving operation has occurred as described below.
The contents of the vector, [c]ρn, can be permuted to form a constrained interleaved sequence, π:c→u, denoted as u=π[c]. In terms of a physical interleaver structure, the vector u can also be viewed as a vector of coded bit positions, where the coded bit positions (and/or their addresses) are in a permuted order with respect to the coded bit positions in the vector c. The sequence u is then encoded according to an inner recursive convolutional code (IRCC) to form the final coded sequence v=(v1, v2, . . . , vLρn+ν) of the CTBC code, where ν is the number of additional terminating bits added by the IRCC. In terms of the generator function G(D) of the IRCC, this conversion from u to v can also be described as v(D)=G(D)u(D), or, in vector notation, v=G[u]=G[π[c]].
In the analysis herein, the IRCC is assumed to be the modulo-2 accumulator, i.e., G(D)=1/(1+D), in which case ν=1. However, in the case of an accumulator this single termination bit can be eliminated, as it contributes the same bit metric resulting from the same coded bit for the two paths terminating at state zero. Using this modulo-2 accumulator, when the Hamming weight of u, W[u], is an even value d, the CTBC coded sequence v consists of d/2 number of disjoint segments of all ones. Similarly, when W[u] is an odd value d, v consists of ┌d/2┐=(d+1)/2 number of disjoint segments of all ones including one segment that ends at the last bit of the sequence v. The interleaver constraints developed herein put restrictions on the permutation π:c→u so as to eliminate selected categories of the low distance error sequences of the final CTBC codeword, v=G[u]=G[π[c]]. That is, constraints are placed on the permutation π:c→u to ensure that the minimum weight of v generated by any vector c is at least dt, where dt is a target MHD of the CTBC code. As discussed later, the constraints developed herein can be applied to any general IRCC with any arbitrary G(D).
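The relationship between the ones in u and the segments of ones in v can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the described apparatus; the sketch assumes the termination bit is omitted, as the text permits for an accumulator, and that the positions of the ones in u are given in increasing order.

```python
from itertools import accumulate
from operator import xor


def accumulate_mod2(u):
    """Encode u with the modulo-2 accumulator G(D) = 1/(1+D): v[i] = v[i-1] XOR u[i]."""
    return list(accumulate(u, xor))


def runs_of_ones(v):
    """Return (start, length) for every maximal run of ones in v."""
    runs, start = [], None
    for i, bit in enumerate(v):
        if bit and start is None:
            start = i
        elif not bit and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:  # a run that reaches the last bit of v
        runs.append((start, len(v) - start))
    return runs


def weight_from_positions(ones, length):
    """W[v] as the sum of gaps between consecutive pairs of ones in u.

    `ones` holds the sorted positions of the ones in u. When their number
    is odd, the final segment runs to the end of the sequence.
    """
    total = 0
    for k in range(0, len(ones), 2):
        end = ones[k + 1] if k + 1 < len(ones) else length
        total += end - ones[k]
    return total


# Even weight: ones of u at positions 2, 5, 6, 11 in a length-16 frame.
u = [0] * 16
for p in (2, 5, 6, 11):
    u[p] = 1
v = accumulate_mod2(u)
print(runs_of_ones(v), sum(v))
```

In this example the four ones in u produce two segments of ones in v, and the weight of v equals the sum of the two pair gaps.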
CTBC Code Encoder/Transmitter Using a CI with a Single Row:
The present invention introduces a new family of L=1 constrained interleavers (Single Row Constrained Interleavers, or SRCI) that are based on a new type of constraint that directly restricts (i.e., constrains out) a particular subset of zero or more indices in the vector u to which a given coded bit of an associated OBC codeword cannot be placed, given the previous placement of coded bits of the current codeword into u and possibly coded bits of other codewords of the OBC that have already been placed into the vector u. As stated above, the prior art CI-2 constrained interleaver required the use of [A]L×ρn that necessarily required L>1 in order to meet a specified target MHD requirement. Hence using the prior art techniques, it would be impossible to set L=1 in order to meet a set of interleaver constraints that would enforce a specified target MHD dt>d0di because both intra-codeword bit separations and the inter-row constraints would be needed, thus forcing L>1. Also, using prior art constrained interleaving techniques it would not be possible to design a deterministic constrained interleaver as defined below which has contention free properties and is compatible with, and can be used along with, a pre-defined deterministic contention free permutation such as the QPP permutation.
In a specific example, an L=1 constrained interleaver (also called a SRCI) is used to construct a transmitter to generate and transmit CTBC codes. In such a transmitter, an outer encoder is configured to transform a sequence of input bits to a sequence of outer-encoded bits. The sequence of outer-encoded bits is encoded in accordance with an outer code that can be a block code (which would include an LDPC code) or a non-recursive convolutional code, for example. A constrained interleaver would be configured to implement a permutation function to permute the order of the outer-encoded bits to produce a constrained-interleaved sequence of outer-encoded bits. The constrained interleaver implements at least one SRCI (single row constrained interleaver) constraint that prevents one or more low-distance error sequences from occurring. The permutation function also implements a pseudo-random reordering of the outer-encoded bits subject to the at least one SRCI constraint. An inner encoder is configured to encode the constrained-interleaved sequence of outer-encoded bits into a sequence of inner-encoded bits. A constellation mapper is used to map the sequence of inner-encoded bits to a transmission signal such as a BPSK signal, a QPSK signal, a 16-QAM signal, or a 16-PSK signal, for example. In this example, the sequence of inner-encoded bits constitutes a serially-concatenated sequence of bits that incorporates coding from both the inner code and the outer code in accordance with a serially-concatenated code that achieves a target minimum distance of dt. The outer code has a minimum distance of d0 and the inner code has a minimum distance of di. In this example, the permutation function implemented by the SRCI constrained interleaver is configured to implement the SRCI constraint in order to enforce dt>d0di.
The SRCI constraint ensures that the permutation function does not place any respective index from the integer ring [0,K−1] into any position in a permuted integer ring π[0,K−1] that corresponds to any identified respective restricted zone. Each identified respective restricted zone corresponds to a subset of one or more adjacent positions in π[0,K−1] that, if the respective index were to be placed into any one of the identified respective restricted zones, at least one error sequence of weight less than dt would become possible in the serially-concatenated code.
The design of the specific classes of permutation functions implemented by the L=1 constrained interleavers of
1. The weight of a CTBC codeword, W[v], is a sum of the distances between non-zero coded bits in u. Starting the count with zero, a string of ones in v begins at the position of each even numbered non-zero coded bit in u and ends at the position immediately before each odd numbered non-zero coded bit in u.
2. The effect of the parameter L on certain key error coefficients of CTBC codes constructed using a CI-2 can be seen directly in equations (2) and (6) of the Fonseka reference [1]. These error coefficients are minimized with respect to L when L=1. When the CI-2 interleaver is used to construct a CTBC code, increasing L leads to higher values of MHD, but also lower values of the interleaver gain. This is because when the CI-2 design matrix, [A]L×ρn, is read in column-major order to create the sequence u, any two coded bits of any given codeword of the OBC will have a separation of at least L bits in u. However, note that the frame size is K=Lρn, and for fixed values of K, and n, the value of ρ is maximized when L=1. Therefore, decreasing L increases the number of codewords of the OBC, ρ, that can be placed on any single row. The interleaver gain, which increases with the number of possible permutations of coded bits in u, thus increases as L is lowered and is maximized when L=1.
3. The inter-row constraints in CI-2 were introduced to ensure that two non-zero codewords of the OBC placed on two different rows of [A]L×ρn will generate a CTBC codeword, v, that has a weight, W[v], that is greater than or equal to the target MHD. With the CI-2 inter-row constraints, when coded bits of a codeword c1 on row i and a codeword c2 on row (i−l) are observed in pairs (with one coded bit from c1 and the other from c2) in the sequence u, the inter-row constraints ensure that only up to κ(l) such pairs are allowed to have a separation of l in u, for a set of considered/constrained row separations l=1, . . . , lmax. Further, the inter-row constraints and the reading of [A]L×ρn in column-major order ensure that all remaining pairs have at least a separation of (L−l) in u, up to a maximum of lmax. For example, consider the placement of coded bits of a codeword c2 in u when κ(l)=1, for l=1, 2, . . . , lmax. Then if codeword c1 has a coded bit with a l≦lmax bit separation from a coded bit of c2 in u, the inter-row constraints ensure that the separation between every other coded bit of c1 and every other coded bit of c2 has to be at least (L−l).
4. Additionally, the act of reading the CI-2 design matrix, [A]L×ρn, in column-major order introduces an inherent constraint. This inherent constraint deals with codewords separated by lmax+1 rows. In order to understand the inherent constraint, consider a typical example as provided in the Fonseka reference [1] where L is selected as L=2(lmax+1) and (lmax+1)=dt/d0 in order to achieve a target MHD of dt=d0^2. In particular, consider the specific case where d0=4, dt=d0^2=16, lmax=3, and L=8.
Given the above example, consider the case where three codewords c1, c2 and c3 of the OBC have been placed into consecutive rows of [A]L×ρn. When [A]L×ρn is read in column-major order, if the separation between a coded bit of c1 and a coded bit of c2 on u is one, and the separation between a coded bit of c2 and a coded bit of c3 is also one, then the row-column structure of [A]L×ρn ensures that the separation between any coded bit of c1 and a coded bit of c3 has to be at least 2. Similarly, when L=8 and lmax+1=4, if c1, c2, c3 and c4 are codewords of the OBC that are placed on consecutive rows of [A]L×ρn, then the minimum possible separation between each {ci,c(i+1)}, i=1, . . . , 3 is one, and the minimum possible separation between c1 and c3 and between c2 and c4 is at least two.
If the actual minimum separation between coded bit pairs in codewords c1 and c3, and the actual minimum separation between coded bit pairs in c2 and c4 are both 2, then the minimum separation between coded bit pairs in c1 and c4 will have to be at least 3. Due to the row-column structure of a CI-2, when κ(l)=1 for l=1, 2, . . . , lmax, this inherent constraint prevents the generation of coded sequences v with weight less than dt from three through lmax number of codewords of the OBC. This inherent constraint ensures the minimum weight of coded sequences v generated by three through lmax codewords of the OBC is dependent on L since all remaining n−1 pairs of coded bits have at least a separation of (L−l) in u. However, the minimum weight of sequences of v generated by (lmax−1) codewords of the OBC is independent of L. The act of (implicitly) reading the row-column matrix structure of [A]L×ρn in column-major order adds this inherent constraint, which is not explicitly called out as a separate constraint in the Fonseka reference [1], or in U.S. Pat. Nos. 8,537,919 and 8,532,209.
5. CI-2 is structured to maintain the same MHD for all codewords of the concatenation regardless of whether they are generated by one non-zero codeword of the OBC or whether they are generated by combinations of two or more non-zero codewords of the OBC. Observe that different categories (subsets) of CTBC codewords, {v} can be defined in terms of the number of codewords of the OBC that combine to form a potentially low weight CTBC codeword, v, at a given distance, d. Further, observe that the interleaver gain of CTBC codewords in each different category of codewords has a corresponding different category-level error coefficient.
To understand this further, note that the asymptotic bit error rate (BER) of any CTBC code is determined by the error contributions of the codewords according to
P_b \approx \sum_{d} A_d \, P(d, \gamma_b) \qquad (4)
where Ad is the error coefficient of the corresponding weight d codewords, and P(d,γb) is the probability of decoding in favor of a CTBC codeword with a weight d error sequence at a bit signal to noise ratio of γb=Eb/N0. As per Lemma 1 of the Fonseka reference [1], CTBC codes can be designed with a CI-2 to eliminate the error contributions in equation (4) associated with all CTBC codewords having weights d which are below a selected target MHD, dt. At the same time, the Ad values of the remaining terms in equation (4) can be reduced to be close to the Ad values associated with uniform interleaving. This simultaneous elimination of the lower weight error terms in equation (4) and the reduction of the remaining Ad values allows powerful CTBC codes to be constructed starting from simple component codes. However, it is further observed here that the individual Ad values at each given distance, d, can also be sub-divided down to a finer granularity by considering different categories of codewords that have the same distance, d, but different error coefficients.
6. CI-2 is structured to maintain the same MHD for all codewords of the concatenation regardless of their category, i.e., whether they are generated by one or two or more non-zero codewords of the OBC. Observe that the final bit error probability of (1) can be lowered by using different values of dt for different categories of codewords. For example, if the error coefficient for a certain category of codewords is much less than the error coefficient for another category of codewords, the error probability of equation (4) can be lowered by using a higher dt value for the category of codewords with the much lower error coefficient. This is because the number of possible error sequences in the category of codewords with the much lower error coefficient is much smaller.
7. The standard CI-2 construction treats any combination of d0 non-zero coded bits of a codeword of the (n,k) OBC as a codeword whether or not that combination is actually a codeword. Observe that the actual number of codewords of the OBC with weight d0 is usually lower than the total possible number of permutations of the d0 non-zero coded bit positions of each codeword position, cj. Hence, many combinations of one or more codewords, each containing d0 non-zero coded bits, will not actually correspond to valid combinations of one or more codewords of the OBC. Such invalid combinations should be ignored to increase interleaver gain whenever possible.
A CI-3 interleaver is defined in accordance with a set of Constraints 1-4 as defined in this section. Constraints 1-4 provide similar restrictions as the CI-2 constraints, but do so in a manner so as to avoid the use of the CI-2 design matrix, [A]L×ρn. This allows the use of L=1 and thus allows CTBC codes to be constructed with higher interleaver gains than can be achieved with a CI-2 interleaver. Just like the CI-2, Constraints 1-4 can be used to achieve a target MHD, dt, as high as dt=d0^2. In the next section, an additional constraint, Constraint 5, is defined that allows even higher target MHDs to be reached than is possible with CI-2 interleavers.
To start, a parameter s1 is defined to identify a spacing requirement between coded bits of a single non-zero codeword of the OBC. All of Constraints 1-5 put constraints directly in the sequence u without the use of [A]L×ρn. The parameter s1 performs a similar function as L in the CI-2 interleaver, but defines a separation requirement directly applied to the coded bit positions of codewords as opposed to defining the number of rows of the CI-2 design matrix.
Constraint 1:
Constraint 1 is used to prevent low distance error events/sequences from occurring among a first category of CTBC codewords, denoted Φ1, generated as v=G[u]=G[π[c]]∈Φ1, where c consists of a single non-zero codeword of the OBC having the minimum weight d0. Constraint 1 ensures that any two coded bits of every codeword of the OBC must have at least a separation of s1 positions between them on u, where s1 is chosen to satisfy:
With this constraint, the resulting sequence v∈Φ1 will contain ┌d0/2┐ segments of all ones, and each such segment will have at least weight s1. Therefore, W[v]≧┌s1*d0/2┐≧dt.
Constraint 1 can be easily handled using the pre-selected value of s1. For example, when finding a position for a coded bit (nj+t) of any codeword position, cj, all positions within s1 locations in u away from any already positioned coded bits, π(nj+t) for t∈{0, . . . n−1}, of that codeword correspond to restricted locations in the vector u where the bit (nj+t) cannot be placed. The term “restricted zone” is used herein to denote the set of restricted locations in the vector u where the bit (nj+t) cannot be placed. The interleaver gain associated with codewords in category 1 is the lowest relative to all the other categories of codewords discussed in this section. This is because there are more ways to generate codewords in category 1 than any other category identified herein.
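The restricted zone for Constraint 1 can be sketched in Python as follows. The function name and arguments are hypothetical. The sketch assumes that "separation of at least s1" means an index difference of at least s1, so that every position at distance less than s1 from an already placed bit of the same codeword is restricted.

```python
def constraint1_restricted_zone(placed, s1, K):
    """Positions of u that violate Constraint 1 for the next bit of a codeword.

    placed: positions in u already assigned to coded bits of this codeword
    s1:     required minimum separation between bits of one codeword
    K:      frame size (valid positions are 0 .. K-1)
    """
    zone = set()
    for q in placed:
        # Every position p with |p - q| < s1 is excluded, including q itself.
        zone.update(range(max(0, q - s1 + 1), min(K, q + s1)))
    return zone


# Two bits of a codeword already sit at positions 10 and 30; s1 = 4, K = 40.
zone = constraint1_restricted_zone([10, 30], 4, 40)
allowed = [p for p in range(40) if p not in zone]
print(sorted(zone))
```

A pseudo-random placement would then draw the next position from the allowed positions that are also unoccupied and not excluded by the other constraints.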
Constraint 2:
Constraint 2 is used to prevent low distance error events/sequences from occurring among a second category of CTBC codewords, denoted Φ2, generated as v=G[u]=G[π[c]]∈Φ2, where c consists of two non-zero codewords of the OBC, each having the minimum weight d0. Constraint 2 ensures that, if a coded bit of any codeword position cj and a coded bit of any other codeword position cj1 have a spacing of exactly (lmax+1), then all other coded bits of cj must have a separation of at least (lmax+1) positions from every other coded bit of each codeword position cj1 on u. The parameter, lmax, is chosen to satisfy (lmax+1)=┌dt/d0┐. With this constraint, the resulting sequence v∈Φ2 will have d0 segments of ones, each with weight of at least (lmax+1)=┌dt/d0┐. This ensures that W[v]≧(lmax+1)d0≧dt.
Constraint 2 can be handled by storing a list of pairs of codeword positions (cj, cj1) that have a coded bit of cj and a coded bit of cj1 separated by exactly (lmax+1) positions. When finding a position for a coded bit of cj, if cj happens to be on that list coupled with cj1, all positions within (lmax+1) from the remaining coded bit positions of cj1 need to be added to the restricted zone. Note that if all separations between coded bits of cj and cj1 are greater than (lmax+1), then Constraint 2 imposes no restriction.
The impact of the inter-row constraints related to rows i and (i−l), which is defined using κ(l) in traditional CI-2, is to ensure that (a) coded bits of a codeword c1 can pair up with only up to κ(l) number of pairs with coded bits of a codeword c2 on u, where a pair is formed by a coded bit of c1 and a coded bit of c2 at a separation of l, and (b) all remaining pairs of coded bits of c1 and c2 maintain at least a separation of (L−l), for l=1, 2, . . . lmax, where lmax is found according to d0(lmax+1)≧dt. In SRCI constraint 2, the same lmax value as in traditional CI-2 is used. Preserving the same impact, inter-row constraints of SRCI can be enforced as: (a) no more than κ(l) number of pairs of coded bits from any two codewords c1 and c2 are allowed to have a separation of l or less on u, and (b) if la (≦κ(l)) number of pairs have separations, l1, l2, . . . , lla, (where each lx≦l, x=1, 2, . . . , la), then all remaining pairs need to have a separation of more than
Constraint 3:
Constraint 3 is used to prevent low distance error events/sequences from occurring among a third category of CTBC codewords, denoted Φ3, generated as v=G[u]=G[π[c]]∈Φ3, where, similar to Constraint 2, c consists of two non-zero codewords of the OBC, each having the minimum weight d0. Constraint 3 ensures that, if the two nearest coded bits of a codeword position cj and a codeword position cj1 have a separation of l<(lmax+1), then only up to a total of κ(l) (<d0) such pairs of coded bits of cj and cj1 may have a separation of less than (lmax+1), and all the rest of the (n−κ(l)) coded bits of cj must have a separation of at least s2(l)=(s1−l) positions from every other coded bit of cj1. In the selection of the κ(l) values, l=1, 2, . . . , lmax, note that the lowest weight category 3 CTBC codewords will consist of (a) κ(l) number of segments each with weight between l and lmax and (b) (d0−κ(l)) number of additional segments, each with weight of at least s2(l)=(s1−l). Hence, to assure W[v]≧dt, κ(l) is selected to ensure that lκ(l)+(d0−κ(l))s2(l)≧dt. Equivalently, κ(l) is selected as
In the case κ(l)=1 for l=1, 2 . . . , lmax, when a coded bit of a codeword position cj is l (≦lmax) positions away from a coded bit of cj1, every remaining coded bit of cj has to be positioned at least (s1−l) positions away from every other coded bit of cj1. The case κ(l)=1 for l=1, 2 . . . , lmax, with the introduction of Constraint 4 below, can generate powerful concatenations with MHD values of dt=d0^2 as discussed in the Fonseka reference [1].
Constraint 3 can be implemented by checking whether a coded bit of codeword position cj and a coded bit of codeword position cj1 have a separation of l; if so, then when finding a position for a coded bit of cj, all positions within (s1−l) away from already placed other coded bits of cj1 should be designated as restricted zones.
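The selection of κ(l) from the inequality lκ(l)+(d0−κ(l))s2(l)≧dt, with s2(l)=(s1−l) and κ(l)<d0, can be illustrated by a direct search. This is only a sketch: the function name is hypothetical, and the numerical values (d0=4, dt=16, lmax=3, s1=8) are assumed for illustration by analogy with the CI-2 example where L=8.

```python
def max_kappa(l, s1, d0, dt):
    """Largest kappa in 0 .. d0-1 with l*kappa + (d0 - kappa)*(s1 - l) >= dt.

    Returns None when no value of kappa satisfies the inequality.
    """
    s2 = s1 - l  # separation required of the remaining pairs
    feasible = [k for k in range(d0) if l * k + (d0 - k) * s2 >= dt]
    return max(feasible) if feasible else None


# Assumed illustrative parameters: d0 = 4, dt = 16, lmax = 3, s1 = 8.
for l in (1, 2, 3):
    print(l, max_kappa(l, s1=8, d0=4, dt=16))
```

For these assumed values the search permits up to two close pairs at each l, so the choice κ(l)=1 discussed in the text also satisfies the inequality.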
Constraint 4:
Constraint 4 is used to prevent low distance error events/sequences from occurring among a fourth category of CTBC codewords, denoted Φ4, generated as v=G[u]=G[π[c]]∈Φ4, where c consists of 3, . . . , (lmax+1) non-zero codewords of the OBC. Constraint 4 ensures that, if a set of codewords cj_h, h=1, 2, . . . , p for each p≦(lmax+1) are placed on u in such a way that the minimum separation between two coded bits of cj_h and cj_h+1 is one for h=1, 2, . . . , (p−1), then the minimum separation between every coded bit of cj_h and every coded bit of cj_h+2 has to be at least 2 for h=1, 2, . . . (p−2). In addition, once the coded bits are randomly placed, if they are placed in such a way that the actual minimum separation of coded bits of codewords cj_x and cj_x+y is y for y=2, 3, . . . s, x=1, 2, . . . , (lmax+1−y), s+x<(lmax+1), then the minimum spacing between every coded bit of cj_x and cj_(x+y+1) has to be at least (y+1).
Constraint 4 is designed to implement the inherent constraint discussed above in connection with CI-2, but can be used when L=1, i.e., there is no CI-2 design matrix, [A]L×ρn, that is read in column-major order. With Constraint 4 as stated above, if the minimum separation of coded bits of codewords 1 and 2, codewords 2 and 3, and codewords 3 and 4 are all one, and the minimum separation of coded bits of codewords 1 and 3, and codewords 2 and 4 (each of which should be at least 2) happens to be actually 2, then the minimum separation of coded bits of codewords 1 and 4 has to be at least 3. Constraint 4 thus makes the L=1 implementation function like a standard CI-2 where the reading of [A]L×ρn in column major order automatically/inherently adds in the above-mentioned inherent constraint.
Constraint 4 can be efficiently implemented by monitoring neighboring codewords of every coded bit on u. Let us identify the n-bit codewords by their identification numbers, cj, for j=0, . . . , ρ−1. For each codeword cj, for j=0, 1, . . . , ρ−1, and for each s=1, 2, . . . , (lmax+1), prepare a respective list of neighboring codewords, Lnj(s) whose list entries identify all of the neighboring codewords of cj in u that have a coded bit at a minimum separation of s relative to any of the n coded bits of the codeword cj. Note that each of these lists is an array with at most 2n entries. Once the sequence u starts to fill up, these lists of neighbors begin to fill up to their maximum value of at most 2n entries. When selecting a position for a coded bit position (nj+t) of codeword position, cj, the lists Lnj(s) are consulted. Suppose that cjx is an entry of Lnj(1), and cjy is an entry of Lnjx(1). Then mark as a restricted zone one position around each coded bit of codeword cjy when placing coded bit position (nj+t).
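The neighbor lists and the resulting restricted zone can be sketched in Python as follows. This is a brute-force illustration rather than an efficient implementation, and the data layout (an `owner` array mapping each position of u to a codeword identifier or `None`) and all function names are assumptions made for the example. Only the first step of Constraint 4 is shown, i.e., the case of two chained separations of one.

```python
from collections import defaultdict


def neighbor_lists(owner, s_max):
    """Build positions[j] and lists[j][s] from a partially filled u.

    owner[p] is the identifier of the codeword whose bit occupies
    position p of u, or None if position p is still empty.
    lists[j][s] holds the codewords whose minimum separation from
    codeword j is exactly s, for s <= s_max.
    """
    positions = defaultdict(list)
    for p, j in enumerate(owner):
        if j is not None:
            positions[j].append(p)
    lists = defaultdict(lambda: defaultdict(set))
    for a in positions:
        for b in positions:
            if a == b:
                continue
            sep = min(abs(p - q) for p in positions[a] for q in positions[b])
            if sep <= s_max:
                lists[a][sep].add(b)
    return positions, lists


def constraint4_zone(j, positions, lists, K):
    """Positions adjacent to bits of any codeword two unit-separation hops from j."""
    zone = set()
    for jx in lists[j][1]:
        for jy in lists[jx][1]:
            if jy != j:
                for q in positions[jy]:
                    zone.update(p for p in (q - 1, q + 1) if 0 <= p < K)
    return zone


# Codewords 0, 1, 2 have bits at positions 0, 1 and (2, 6) of a length-8 frame.
owner = [0, 1, 2, None, None, None, 2, None]
positions, lists = neighbor_lists(owner, s_max=4)
print(sorted(constraint4_zone(0, positions, lists, K=8)))
```

Here codeword 1 is a separation-one neighbor of both codeword 0 and codeword 2, so the next bit of codeword 0 is kept at least two positions away from every bit of codeword 2.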
When κ(l)=1 for l=1, 2, . . . , lmax, Constraint 4 prevents the generation of coded sequences v with weight less than dt from three through (lmax+1) number of codewords of the OBC. Together, Constraints 1-4 ensure the minimum weight of coded sequences v generated by three through lmax codewords of the OBC is dependent on s1. However, the minimum weight of sequences of v generated by (lmax+1) or more codewords of the OBC is independent of s1. The minimum weight generated by a combination of (lmax+1) codewords of the OBC can be found by considering the worst case placement of coded bits of (lmax+1) codewords when placed in accordance with Constraints 1-4. As discussed in Lemma 1 of the Fonseka reference [1], the minimum weight of sequences of v generated by (lmax+1) or more codewords of the OBC limits the MHD that is achievable by a CTBC code constructed with a CI-2 to be dt=d0^2. Constraints 1-4 can be used to generate concatenations with MHD dt≦d0^2 but now with a higher interleaver gain since L=1.
Furthermore, note that when L=1 and Constraints 1-4 are applied as described above, the worst case placement of (lmax+1) weight d0 codewords of the OBC creates the following sequences of ones in v: one sequence with weight d0, two sequences with weight (d0−1) and so on up to d0 sequences of ones with weight 1. Therefore, the worst case weight generated by (lmax+1) codewords is,
W[v] = \sum_{i=1}^{d_0} i\,(d_0+1-i) = \frac{d_0(d_0+1)(d_0+2)}{6}
Note that the resulting MHD is thus dt=d0(d0+1)(d0+2)/6, and this is greater than d0^2 for d0>2. However, to reach this target MHD greater than d0^2 it is necessary to also satisfy Constraint 5 as provided in the next section to prevent the generation of any additional low weight error sequences that can give rise to a coded sequence v with weight less than dt=d0(d0+1)(d0+2)/6 that can arise from combinations of 2d0, (2d0+1), . . . , └(2dt−1)/d0┘ codewords of the OBC.
That is, if only Constraints 1-4 are applied, the resulting CI-3 interleaver can be designed to achieve the MHD that is achievable by a CTBC code constructed with a CI-2, i.e., dt=d0^2. Constraint 5 can additionally be enforced in order to reach dt=d0(d0+1)(d0+2)/6>d0^2. This causes the relatively few additional possible error sequences due to combinations of up to └(2dt−1)/d0┘ codewords to be eliminated by preventing the generation of sequences with weight between d0^2 and dt=d0(d0+1)(d0+2)/6.
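The worst case weight described above (one sequence of weight d0, two of weight (d0−1), and so on up to d0 sequences of weight 1) can be checked numerically against the closed form d0(d0+1)(d0+2)/6. The following Python sketch uses a hypothetical function name and simply evaluates the sum.

```python
def worst_case_weight(d0):
    """Sum of i sequences of weight (d0 + 1 - i), for i = 1 .. d0."""
    return sum(i * (d0 + 1 - i) for i in range(1, d0 + 1))


for d0 in range(1, 9):
    closed_form = d0 * (d0 + 1) * (d0 + 2) // 6
    # The sum matches the closed form, and exceeds d0**2 only when d0 > 2.
    print(d0, worst_case_weight(d0), closed_form, d0 ** 2)
```

For d0=4 the sum is 4+6+6+4=20, compared with d0^2=16.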
A CI-3 interleaver can also make use of Constraint 5 to provide still higher MHDs, i.e., with dt>d0(d0+1)(d0+2)/6. To do this, Constraints 1-4 are applied as outlined above, and Constraint 5 is applied using the CI-4 design approach of the next section, using a selected target MHD, dt>d0(d0+1)(d0+2)/6. This mixed CI-3/CI-4 interleaver can reduce the size of the restricted zones as compared to the straight CI-4 interleaver design approach described below that can reach the same target MHD. This can provide increased interleaver gain and make it easier to find CI-3 interleavers as compared to a CI-4 interleaver designed at a selected target MHD. Hence it is to be understood that CI-3 interleavers with dt>d0^2 and even dt>d0(d0+1)(d0+2)/6 can be constructed by additionally applying the CI-4 design approach of the next section, but also enforcing any restricted zones from Constraints 1-4 at the same time. Once Constraints 1-4 are enforced, the restricted zones that arise due to Constraint 5 will be greatly reduced, thereby shifting a large load of restricted zones from Constraint 5 to Constraints 1-4. Since Constraints 1-4 are more restrictive than Constraint 5, when applied along with Constraint 5, they will tend to make it easier to find CI-4 solutions and to potentially increase interleaver gain due to a smaller number of restricted zones when compared to applying CI-4 on its own.
As mentioned in observation 6 of the CI-2 as discussed above, it is possible to use different values for the target MHD, dt, for different categories of codewords. The categories of codewords whose category-level error coefficient is lowest, for example, can use a lower MHD value to cause the overall probability of error in (1) to be lowered. This use of multiple target MHD's is applied similarly in both of the CI-3 and CI-4 interleaver designs, so the multiple target MHD versions of the CI-3 and CI-4 interleavers will be described after the CI-4 interleaver is developed.
In the CI-4 interleaver design approach, the coded bits of codewords of the OBC are pseudo-randomly placed directly into the sequence u in such a way as to maintain a target MHD, dt, of a concatenation of non-zero codewords. The CI-4 interleaver is designed to be as close to a uniform interleaver as possible while simultaneously maintaining the MHD at dt. In the raw CI-4 approach, there is only one constraint, namely Constraint 5. Interleavers that only use Constraint 5 are called CI-4 interleavers. Interleavers that use one or more of Constraints 1-4 and additionally enforce Constraint 5 are called mixed CI-3/CI-4 interleavers.
Constraint 5:
Constraint 5 requires that the coded bits of combinations of an integer number, Nc, of nonzero codewords of the OBC are positioned in u such that W[v]≧dt, for some specified target MHD, dt, where v=G[u]=G[π[c]] is the CTBC codeword generated from the combination of codewords of the OBC in c.
In CI-4 interleavers, different categories of CTBC codewords that correspond to different types of error sequences are denoted, Φm(CI-4), m=1, . . . , Nc. The category Φm(CI-4) includes all weight d CTBC codewords formed by a combination of m≦Nc non-zero codewords of the OBC with the minimum weight d0. In mixed CI-3/CI-4 interleavers, all of the categories of codewords discussed in connection with the CI-3 interleaver can exist, and additionally, the categories Φm(CI-4), m=1, . . . , Nc are defined. If there is any overlap between the CI-3 categories and the CI-4 categories, the CI-3 categories take precedence and any remaining combinations of codewords that are not already accounted for by a CI-3 category are accounted for in the CI-4 categories. With this definition, no single CTBC codeword can fall into more than one category.
To understand how Constraint 5 can be implemented, consider the example where the OBC is an (8,4) extended Hamming code and the IRCC is an accumulator. Table 1 enumerates the 16 different codewords of the (8,4) extended Hamming code. Note that 14 of these codewords have weight d0=4, one has weight 8, and one has weight zero. Next recall that the vector c contains codeword positions cj=(cj1, cj2, . . . cjn), for j=0, 1, . . . , ρ−1. To start, consider the case where Nc=1, so that only one non-zero codeword need be considered. Let a1<a2<a3<a4 be the ordered set of indices of where the four ones of a corresponding weight d0=4 codeword are placed into u by the permutation, π. Then v=G[u] will have all zeros, except for a string of ones starting at a1 and terminating immediately before a2, and another string of ones starting at a3 and terminating immediately before a4. Hence for this codeword, Constraint 5 will require that all 14 of the weight d0=4 codewords in Table 1 satisfy W[v]=(a2−a1)+(a4−a3)≧dt. Constraint 5 will also require that W[v]=(a2−a1)+(a4−a3)+(a6−a5)+(a8−a7)≧dt is satisfied by codeword #16 in Table 1, whose eight ones are placed on the ordered set of indices in u, a1<a2<a3<a4<a5<a6<a7<a8.
The above expressions for W[v] form sums based upon “pairs” of indices of where the ones of a corresponding codeword of the OBC are located in the vector u. The locations of the ones that make up the pairs are identified starting from left to right in u. These pairs are important, because they each give rise to a respective string of ones in the vector v. Each string of ones in v begins at the location of the first one in each pair and ends at the location right before the second one in each pair of ones in u. In the above expressions for W[v], the weight d0=4 codeword has pairs given by (a1,a2) and (a3,a4), and the weight 8 codeword has pairs given by (a1,a2), (a3,a4), (a5,a6), and (a7,a8). A “doublet” is defined as a pair that generates weight one in v, e.g., the ordered indices a1 and a2 form a doublet if (a2−a1)=1. The condition that Constraint 5 avoids, i.e., W[v]<dt, can generally occur due to formation of low weight pairs, and in the worst case, doublets.
Note that when bit (nj+t) of the vector c is placed into a location π(nj+t) on u to maintain W[v]≧dt for all combinations of i=1, 2, . . . , Nc=└2dt/d0┘ (i.e., 2dt/d0 when that ratio is an integer, and 2dt/d0 rounded down to the nearest integer otherwise) number of the codewords, then additionally, all combinations of a total of Nx>Nc number of codewords of the OBC with cj will also satisfy W[v]≧dt. This is the case because when there are a total of Nc=└2dt/d0┘ codewords, each with the minimum weight d0, this combination of codewords generates at least a total of d0Nc=d0└2dt/d0┘=2dt ones in u. In the worst case, each of these ones will pair up to form 2dt/2=dt doublets in u, each generating a weight of 1 in v, and therefore W[v]≧dt. Increasing Nx beyond Nc can only increase W[v] beyond this worst case value. Also, if higher weight codewords are involved, this also will only increase W[v] beyond the worst case value.
The paragraph above shows that when Constraint 5 is implemented by checking all combinations of only up to Nc=└2dt/d0┘ non-zero codewords of the OBC, W[v]≧dt will be satisfied for all possible combinations of nonzero codewords in c. Additionally, note that if the target MHD of the CTBC codeword, v, is dt=20 and the MHD of the OBC is d0=4, then Nc=└40/4┘=10, while if dt=16, then Nc=└32/4┘=8. That is, the higher the target minimum distance, dt, of the CTBC codeword v relative to the MHD of the OBC, d0, the more combinations of codewords, Nc, need to be considered to maintain W[v]≧dt.
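The arithmetic in the two examples above can be reproduced with a minimal sketch. The function names are hypothetical, and the worst-case figure simply assumes that every one of the Nc minimum-weight codewords ends up in a doublet.

```python
def codewords_to_check(d_t, d_0):
    """Nc = floor(2*d_t / d_0): largest number of codewords whose combinations are checked."""
    return (2 * d_t) // d_0


def worst_case_weight(d_t, d_0):
    """Weight of v if all d_0*Nc ones of Nc minimum-weight codewords pair into doublets."""
    n_c = codewords_to_check(d_t, d_0)
    return (d_0 * n_c) // 2
```

With dt=20 and d0=4 this gives Nc=10 and a worst-case weight of 20; with dt=16 it gives Nc=8 and a worst-case weight of 16.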
The permutation π can be built up sequentially. For example, each bit for t=0, 1, . . . , 7 of any coded bit position within the codeword position, cj, in c, can be “placed” one bit at a time. The term “placed” refers to the action of identifying that a bit location nj+t in c will be permuted to a location π(nj+t) in u. A codeword such as codeword #13 in Table 1 is said to have been “completed” once enough coded bits of cj have been placed into u to allow all of the ones of the completed codeword to appear on u. For example, coded bit positions 0, . . . , 3 of the codeword position, cj, can be mapped to any set of permuted locations, π(nj+0), . . . , π(nj+3), without the possibility of having any weight d0=4 codeword of the OBC of Table 1 complete. This is because [1 1 1 1 0 0 0 0], or any other non-zero 8-bit sequence whose ones are confined to the first four positions, is not a codeword of the (8,4) OBC, as can be seen from Table 1. However, when the fifth bit of the codeword position, cj, is mapped to π(nj+4), then all of the ones needed to complete codeword #13, i.e., [1 1 0 1 1 0 0 0], will have been mapped from codeword position, cj, to u. In general, a maximum of μ bits can be placed from a given codeword position, cj, without completing any codeword of the OBC. Checks to ensure that Constraint 5 is satisfied must be made when placing the remaining n−μ coded bits of cj.
For the (8,4) OBC of Table 1, μ=4. Hence, a suitable ordering of t∈{0, . . . , 7} can be selected to allow a maximum of μ=4 coded bit positions to be placed into u freely and without restriction. The remaining n−μ=4 coded bit positions from each codeword position need to be checked to ensure Constraint 5 is satisfied. Note that the process of “placing” coded bits involves finding, one by one, a permuted ordering of the coded-bit locations in c to define the corresponding permuted ordering of coded bit positions in the vector u.
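The value μ=4 can be checked by brute force, as in the sketch below. The generator matrix G is a hypothetical systematic generator of an (8,4) extended Hamming code and is not taken from Table 1, so its codeword numbering and bit ordering may differ from the table. All function names are likewise illustrative.

```python
from itertools import combinations, product

# Hypothetical generator matrix (assumption, not Table 1): G = [I | J - I].
G = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]


def codewords(gen):
    """All 2^k codewords generated by gen over GF(2)."""
    k, n = len(gen), len(gen[0])
    return [
        tuple(sum(m * row[i] for m, row in zip(msg, gen)) % 2 for i in range(n))
        for msg in product((0, 1), repeat=k)
    ]


def supports(gen):
    """Index sets of the ones of every non-zero codeword."""
    return [frozenset(i for i, b in enumerate(w) if b) for w in codewords(gen) if any(w)]


def valid_subsets(gen, size):
    """Subsets of bit indices t that can be placed without completing any codeword."""
    n = len(gen[0])
    supp = supports(gen)
    return [t for t in combinations(range(n), size) if not any(s <= set(t) for s in supp)]


def max_free_bits(gen):
    """mu: the largest subset size for which at least one valid subset exists."""
    n = len(gen[0])
    return max(size for size in range(n + 1) if valid_subsets(gen, size))
```

For this assumed generator, `max_free_bits(G)` evaluates to 4, and both (0,1,2,3) and (4,5,6,7) appear among the valid subsets of size 4.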
The CI-4 interleaver design process can be started off by pseudo-randomly selecting and placing μρ coded bits that can be placed in u without any restriction. Next, the CI-4 design process can proceed by randomly selecting a remaining coded bit in c, one at a time, to be placed into the sequence u. At this point codewords will start to complete and care should be taken to ensure that Constraint 5 is satisfied. Note that in a brute force approach, there would be a large number of combinations of the OBC to consider to ensure that Constraint 5 is satisfied. This number of combinations is very large, especially at higher values of ρ. The CI-4 interleaver design algorithm presented below reduces the complexity greatly by only evaluating the relatively few codeword combinations that can potentially give rise to low weight CTBC codewords, v. In the process of sequentially placing the coded bits of codeword positions cj=(cj1, cj2, . . . , cjn), for j=0, 1, 2, . . . , ρ−1, consider the placement of a coded bit of codeword position cj on u that will end up completing one or more valid codewords in Table 1. At the point in time of placing each coded bit, it is assumed that any and all of the previously placed coded bits were placed into u in such a way as to meet Constraint 5. This condition is clearly met after placing the first μρ coded bits as described above, but from then forward, additional care needs to be taken to avoid placing any bit in a “restricted zone,” i.e., into any range of one or more locations that would cause Constraint 5 to be violated. As u fills up beyond the first μρ coded bits, a list L of already completed codewords is preferably maintained which includes the identification numbers in Table 1 of each completed codeword along with the positions of their respective coded bits in u. By the end of the CI-4 interleaver design process, the list L will have grown to include ρ*2^k entries, containing all codewords of the OBC mapped from all of the codeword positions, cj=(cj1, cj2, . . . , cjn), for j=0, 1, . . . , ρ−1.
Once μ bits from a codeword position have already been placed, when placing a “current coded bit” position (nj+t) of the codeword position cj, t∈{0, . . . , 7}, care needs to be taken to ensure that Constraint 5 is met for each codeword in Table 1 that completes upon this bit's placement. That is, for each of the codewords that complete, in accordance with Constraint 5, identify any restricted zones in u where coded bit (nj+t) cannot be placed due to the codewords that currently complete due to the placement of bit (nj+t). In addition, combinations of already completed codewords from codeword positions other than cj need to be considered to determine if additional restricted zones in u exist due to combinations of the currently completing codeword(s) with other already completed codewords from different codeword positions, e.g., cj2.
Consider the example of
To better understand Algorithm 1, consider
An adjustment will be needed prior to calling Algorithm 1 for cases when the length of the position vector, p, is even. For example, consider
To identify combinations of two codewords that potentially can give rise to low weight vectors, v, again consider the limiting worst case conditions. Assume that an already placed codeword exists whose ordered indices can be written as {b1, b2, b3, b4} as illustrated in
In the construction of sets as described in the various paragraphs below, the actions are to be carried out separately and as many times as is needed to account for each codeword that will be completed by placing the coded bit (nj+t). Also, the elements of each set, Si, will generally contain the identities from the list L of each codeword that has already completed and that will be used in a combination with the codeword currently under consideration that will complete due to the placement of coded bit (nj+t). Subsets of such elements are added separately for each codeword that completes due to the placement of coded bit (nj+t). The elements of each set, Si, can be considered to be vectors of (i−1)-tuples of previously completed codewords, where each element of the tuple corresponds to an already completed codeword on the list L. From each set, Si, a corresponding set of position vectors, {p}i, can be constructed, which will be evaluated by Algorithm 1 to find any restricted zones due to combinations of each currently completing codeword with already completed codewords on the list L that can potentially form the low-weight combinations that need to be avoided to satisfy Constraint 5.
The above example regarding
In the context of
Similarly,
In the context of placing a tth bit of a current codeword to satisfy Constraint 5 for combinations of three codewords, define the set, S3, as containing the indices of all the already placed bits of each 2-tuple of two codewords that can potentially form low weight vectors, v, when combined with each codeword that completes when the current bit, (nj+t), is placed. The definition of the set S3 can be considered a next action of an iterative set construction where each element of each set contains the combinations of other already placed codewords that need to be checked in combination with the currently being placed bit of a current codeword. That is, the elements of the set S3 contain the indices of the 2d0 already placed ones of the two codewords that need to be checked along with the d0−1 already placed bits of the current codeword. The d0−1 already placed bits of the current codeword can then be appended to each codeword 2-tuple contained in the set S3, and these indices can be sorted to form a set of positions vectors, {p}3. The position vectors in {p}3 can each be sent to Algorithm 1 with ni=3 to find any restricted zones that come up due to all possible combinations of three codewords. To identify each 2-tuple of codewords to be included in S3, first look for codewords on the list L of completed codewords that fall within the window w3 around already placed bits of each codeword of Table 1 that completes due to the placement of bit (nj+t). Call this set S2′ because it looks like the set S2 but uses the smaller window, w3. For example, in
Next consider an additional type of element for possible inclusion in the set S3, called a “chained tuple” of codewords. To understand the concept of chained tuples of codewords, consider the low weight example of
To identify the chained tuples for possible inclusion into the set S3, perform the following steps for each codeword in the set S2′: 1) Select a current codeword under consideration from the set S2′. 2) Create a new set of windows around each of the coded bit positions in u of the current codeword in S2′ under consideration. 3) For this current codeword under consideration, identify all of the other already completed codewords from the list L that have at least one coded bit positioned within the window, w3. 4) For each such identified codeword from the list L, identify a 2-tuple of codewords consisting of the current codeword under consideration and the newly identified codeword from the list L with at least one bit within the window, w3. For each chained tuple found that is not already in the set S3, add it to the set S3. At the end of this process the set S3 will be complete. If no chained tuples were found, no new elements will be added to the set S3. If S2′ only had one element and no chained tuples were found, then the set S3 is empty.
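The chaining steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the list L is modeled as a dictionary from a codeword identifier to its bit positions in u, a bit is taken to be “within the window” when its distance to a reference bit is at most w3, and the names `near` and `build_s3` are hypothetical.

```python
from itertools import combinations


def near(positions_a, positions_b, window):
    """True if any bit of one codeword lies within `window` of a bit of the other."""
    return any(abs(p - q) <= window for p in positions_a for q in positions_b)


def build_s3(completed, current_bits, w3):
    """Return (S2', S3) for one completing codeword.

    completed    -- dict: codeword id -> positions in u (stands in for the list L)
    current_bits -- positions of the already placed bits of the completing codeword
    w3           -- window width (assumed to be a maximum index distance)
    """
    # S2': completed codewords with a bit inside the window around the current bits.
    s2_prime = {cid for cid, pos in completed.items() if near(pos, current_bits, w3)}
    # Distinct 2-tuples drawn from S2'.
    s3 = {frozenset(pair) for pair in combinations(sorted(s2_prime), 2)}
    # Chained tuples: a codeword in S2' plus any other completed codeword near it.
    for cid in s2_prime:
        for other, pos in completed.items():
            if other != cid and near(pos, completed[cid], w3):
                s3.add(frozenset((cid, other)))  # duplicates collapse in the set
    return s2_prime, s3
```

Storing each 2-tuple as a `frozenset` means a chained tuple that is already present is not added twice, which mirrors the "not already in the set S3" condition.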
In the context of placing a tth bit of a current codeword to satisfy Constraint 5 for combinations of i>3 codewords, define the set, Si, as containing the indices of all the already placed bits of the (i−1)-tuples of codewords to be considered in combination with a current codeword when placing bit (nj+t) of the current codeword. If the set Si-1 is empty, then the set Si and all subsequent sets will also be empty. In combinations of i codewords, (i·d0/2)−1 doublets can be formed, so the worst case window is thus wi=[dt−((i·d0/2)−1)]. To identify each (i−1)-tuple of codewords to be included in Si, an iterative process is used that begins by first looking for codewords on the list L of completed codewords that fall within the window wni around coded bits of the current codeword. Also call this set S2′ because it looks like the set S2 but uses the smaller window, wni. Start by including any and all distinct (i−1)-tuples of codewords in S2′ into Si. If the set S2′ is empty, stop, because the set Si will also be empty. If the set S2′ contains fewer than i−1 elements, no (i−1)-tuples of codewords will be formed or added to the set Si at this time.
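As a small worked illustration of the window width, the sketch below reads the expression as dt minus the number of doublets, (i·d0/2)−1, which is an interpretation of the formula above. It also assumes i·d0 is even, and the function names are hypothetical.

```python
def worst_case_window(i, d_t, d_0):
    """w_i = d_t - ((i*d_0/2) - 1), assuming i*d_0 is even."""
    doublets = (i * d_0) // 2 - 1
    return d_t - doublets


def window_sequence(d_t, d_0):
    """Windows w_2 .. w_Nc, with Nc = floor(2*d_t/d_0)."""
    n_c = (2 * d_t) // d_0
    return {i: worst_case_window(i, d_t, d_0) for i in range(2, n_c + 1)}
```

For dt=20 and d0=4 this reading gives w2=17, w3=15, and a window of 1 at i=Nc=10, so the windows shrink as more codewords are combined.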
Next construct a set S3′. Start by including all of the distinct 2-tuples, if any, of codewords in the set S2′ into the set S3′. As in the construction of the set S3 as previously discussed, next identify chained 2-tuples of codewords using the codeword(s) in S2′ to identify the other potentially existing codewords from the list L within the newly introduced windows of size wi. Add any distinct chained 2-tuples found to the set S3′. If the set S3′ is empty, stop, because Si will also be empty. Next form the set S4′ by including any and all distinct 3-tuples formed as combinations of a codeword in S2′ with a 2-tuple in S3′. Next the chaining is applied as discussed in the context of S3 to form chained 3-tuples for further inclusion into S4′. This process of forming tuples and chaining is iteratively continued until the set Si is reached or until the set Si is found to be empty. The elements of the set Si will be (i−1)-tuples of the indices of the locations of the ones of i−1 other completed codewords to be considered in combination along with the indices of the d0−1 already placed ones of the currently being placed codeword.
That is, the elements of the set Si are used to construct a set of vectors, {p}i, where each vector p in this set contains an ordered list of the indices of all of the already placed ones from the current codeword along with the indices of all of the already placed ones from i−1 additional completed codewords that need to be considered in combinations of i codewords to ensure Constraint 5 is satisfied.
As illustrated in
In order to ensure that a situation like that shown in
It follows from the above discussion that in order to find a position in u, π(nj+t), where a currently being placed coded bit, (nj+t), in c can be placed in accordance with Constraint 5, perform the following actions:
1. For i=1, . . . , Nc,
2. Take the union of all restricted zones found by Algorithm 1 in action 1 above.
3. Randomly select π(nj+t) among the remaining available positions.
4. Open a sequence of windows of widths w2, w3, . . . wNc around π(nj+t). Identify any completed codewords on L within this newly introduced set of windows. If no such codewords are found, place bit (nj+t) at π(nj+t). If completed codewords are found within this newly introduced window, do action 5.
5. Without placing the coded bit in question at π(nj+t) do action 1 by augmenting each set Si for i=2, . . . Nc, with any newly identified elements due to the newly opened windows of action 4. Use Algorithm 1 to identify the newly selected positions to be restricted from this run. If π(nj+t) is not in the newly found positions to be restricted, stop. If π(nj+t) is in the set of newly found positions to be restricted, add the newly found positions to be restricted to the set of positions to be restricted and repeat actions 4-5 until a permutation position that can be confirmed is found. Once a position is confirmed, place the coded bit at the final π(nj+t).
6. In the event that the algorithm is having difficulty, perform a roll-back and continue. A “roll-back” is defined as undoing (and eventually re-placing) any number of already placed positions. A record is preferably kept as to the most recently placed positions so that the roll-back can remove any desired number of most recently placed positions. When these positions are removed, the restricted zones due to these already placed positions that are being undone will have been removed when placing subsequent positions, and the randomization in the placement process will provide opportunity to bypass the current problem that caused the roll-back to occur. Alternatively, positions that have caused a proportionally large number of restricted zones to appear while placing subsequent positions can be removed. A respective list is preferably kept for each respective placed position. Each respective list identifies all of the restricted zones that had to be removed for subsequent positions being placed due to the respective already-placed position. This allows the most troublesome positions to be intelligently selected for removal/unplacement in the roll-back. Further aspects of intelligent roll-backs are discussed below. Also discussed below is RCID (reverse constrained interleaver design), which is a general method of performing intelligent roll-backs, i.e., ensuring a set of interleaver constraints are met for an interleaver whose positions are already placed, but may not already meet the interleaver constraints.
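The random selection of actions 2-3 and the roll-back of action 6 can be sketched as a generic loop. This is a simplified illustration, not the full procedure: the `restricted` callback stands in for Algorithm 1 and the window checks of actions 4-5, and the toy rule `same_codeword_neighbours` is a hypothetical constraint used only to exercise the loop. The roll-back depth and step budget are also assumptions.

```python
import random


def place_with_rollback(bits, frame_len, restricted, rng, rollback_depth=2, max_steps=10_000):
    """Place every bit at a random non-restricted position in u, rolling back when stuck.

    restricted(bit, placement) must return the set of positions forbidden for `bit`
    given the current placement (dict: bit -> position).
    """
    placement, history, pending = {}, [], list(bits)
    for _ in range(max_steps):
        if not pending:
            return placement
        bit = pending.pop(rng.randrange(len(pending)))
        forbidden = set(placement.values()) | set(restricted(bit, placement))
        free = [p for p in range(frame_len) if p not in forbidden]
        if free:
            placement[bit] = rng.choice(free)
            history.append(bit)          # record of most recently placed positions
        else:
            pending.append(bit)          # could not be placed: undo recent placements
            for _ in range(min(rollback_depth, len(history))):
                undone = history.pop()
                del placement[undone]
                pending.append(undone)
    raise RuntimeError("no valid placement found within the step budget")


def same_codeword_neighbours(bit, placement):
    """Toy rule (assumption): a bit may not sit next to a bit of the same codeword."""
    j, _t = bit
    zone = set()
    for (jj, _tt), pos in placement.items():
        if jj == j:
            zone.update((pos - 1, pos + 1))
    return zone
```

A call such as `place_with_rollback([(j, t) for j in range(8) for t in range(2)], 16, same_codeword_neighbours, random.Random(0))` returns a dictionary mapping each bit to its position in u. Undoing only the most recent placements restores an earlier valid state, which corresponds to the simplest roll-back variant described in action 6.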
The above approach selectively considers only the necessary combinations of codewords to place coded bits on u. It is seen that the complexity of this method increases with increasing dt values. Even though the above approach still considers different combinations of codewords, the complexity of the above outlined algorithm is much lower than searching over all possible codeword combinations of all of the ρ codewords.
Referring now to
To understand the action 602, recall that the number μ represents the maximum number of coded bit positions that can be placed from each codeword position, cj, without completing a codeword. That is, for any fixed j, indicative of the set of coded bit positions (nj+t), the variable t may take on μ different selected values, {t}μ⊂{0, . . . , n−1}, (called a “μ-subset”), to identify a corresponding subset of μ different coded bit positions, {(nj+t)}μ that can be placed into any jth codeword position without completing any codeword of the OBC. For example, when the (8,4) Hamming code of Table 1 is used, μ=4, and {t}μ={0,1,2,3} or {t}μ={4,5,6,7} represent valid μ-subsets because if four ones were placed into the coded bit positions {(nj+t)} using the four t-values from either of these two μ-subsets, no codeword in the (8,4) Hamming code of Table 1 would complete. Depending on the code, there will be a fixed number of valid μ-subsets that applies to all codeword positions, cj. The action 602 generates a μ-set by selecting a respective μ-subset to be used in each codeword position. For example, if there are a total of 8 valid μ-subsets in a given code, then the action 602 could use a randomly selected one of these μ-subsets for use in each of the codeword positions, cj. With L=1 constrained interleavers, a CTBC code is typically constructed using ρ different (n,k) codewords of the OBC, and the frame size is K=ρn coded bits. Therefore, any complete μ-set of coded bit positions will contain up to a total of μ*ρ coded bit positions.
When the CI-4 design approach is in use, the action 602 preferably generates a complete μ-set, {(nj+t)μ: j=0, . . . , ρ−1}, with t taking on the μ different values in each jth selected μ-subset. Once the action 602 identifies a complete μ-set, a variable “PLACED” is set to the number of elements in the μ-set, e.g., PLACED=μ*ρ. In some embodiments, in order to provide additional flexibility in mapping, the action 602 creates a partial μ-set in order to loosen the requirements later while placing coded bits subject to the constraints. Whether a complete or incomplete μ-set is selected, the action 605 identifies a corresponding set of permutation locations, {π(nj+t)μ: j=0, . . . , ρ−1}, for the μ-set selected in the action 602. The “non-constrained” portion of action 605 refers to Constraint 5. Constraint 5 need not be evaluated while placing the bits in the action 605 because it is guaranteed that no codewords of the OBC will complete during the placement of these bits.
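Actions 602 and 605 for a complete μ-set can be sketched as below. The two μ-subsets are the ones named in the text for the (8,4) code; the function name, the use of a uniform random choice per codeword position, and the dictionary representation of π are assumptions made for illustration. Only Constraint 5 is treated as inactive here, as described above.

```python
import random

# The two valid mu-subsets named in the text for the (8,4) code.
MU_SUBSETS = [(0, 1, 2, 3), (4, 5, 6, 7)]


def place_mu_set(rho, n, mu_subsets, rng):
    """Pick one mu-subset per codeword position and place its bits anywhere in u.

    Returns (pi, placed): pi maps a coded bit position n*j+t in c to a location
    in u, and placed is the value of the PLACED counter (mu*rho for a complete set).
    """
    frame_len = rho * n
    chosen = {j: rng.choice(mu_subsets) for j in range(rho)}           # action 602
    bit_positions = [n * j + t for j in range(rho) for t in chosen[j]]
    targets = rng.sample(range(frame_len), len(bit_positions))          # action 605
    pi = dict(zip(bit_positions, targets))
    return pi, len(pi)
```

With ρ=5 and n=8, a call `place_mu_set(5, 8, MU_SUBSETS, random.Random(1))` places 20 of the 40 coded bits, so PLACED=μ*ρ=20 before any Constraint 5 check is needed.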
Control next passes to action 610 where one or more (Δ) remaining coded bit positions (i.e., Δ bit positions that have not yet been placed), {(nj+t)}, are selected. Often, Δ=1. In preferred embodiments, the selection 610 is generated using a pseudo random number generator that generates an index into a vector that includes all the indices of the bits that have not yet been placed, although other selection criteria can be used. When Δ>1, if a particular selected coded bit position, (nj+t), cannot be placed at a particular location, πcandidate, then a different selected bit position being analyzed at the same time can be checked to see if it can be permuted to the location πcandidate. This way, the action 610 can work to avoid or resolve potential conflicts and provide further flexibility in finding a valid permutation function, πCI−L=1:c→u.
Control next passes to an action 615 which performs an analysis function. Action 615 identifies any and all restricted zones associated with placing the selected bit, (nj+t). If a plurality of coded bit positions, {(nj+t)} (a “mapping group”) have been selected in the action 610, then all of the restricted zones associated with placing each of the plurality of coded bit positions are preferably identified in the action 615. In a preferred embodiment, the action 615 generates a set of positions vectors for each of the one or more coded bit positions in the mapping group and passes these positions vectors to Algorithm 1 in order to identify their respective restricted zones. As discussed below, different target MHDs may optionally be used in Algorithm 1 depending on LengthP of each positions vector.
Next control passes to an action 620 which starts by identifying one or more candidate permutation locations. In embodiments where the action 610 only selects one coded bit to be placed at a time, i.e., where each mapping group has only one coded bit, the action 620 identifies a candidate permutation position, π(nj+t), in which to place the coded bit position, (nj+t), from c to u. The permutation location, π(nj+t), is selected to be outside of any restricted zones identified for the selected coded bit position, (nj+t). As described above in connection with the CI-4 design algorithm, one or more verification actions are next taken to ensure that, once placed, no new constraint violations occur. This verification is preferably performed by opening a set of windows around the candidate bit placement location, π(nj+t), and determining whether any already completed codewords on the list L have any bits within these new windows. If not, the bit (nj+t) can be verified to be placeable at the candidate bit placement location, π(nj+t). If one or more coded bit positions from already placed codewords are found to be in the new windows, Algorithm 1 is preferably used to identify any new restricted zones. Next it is determined whether the candidate bit placement location, π(nj+t), is located within any new restricted zones. If not, the action 620 performs the placement (nj+t)→π(nj+t) and declares this placement to be verified. If the placement cannot be verified, then a new candidate placement is selected outside all identified restricted zones and the process is continued until a verified location can be found. If, for example, toward the very end of the method 600, no location π(nj+t) can be found to be verifiable, then a roll-back procedure as described above is invoked and the method 600 is reentered at action 610 using the rolled back state of the method 600.
In embodiments where the mapping group consists of a single coded bit position, then Δ=1, and control passes to action 625 where the variable PLACED is incremented by one.
In embodiments where the mapping group has more than one element, the actions 610-625 perform additional functions. For example, suppose that the action 610 selects a mapping group, {(nj+t)}, that contains ten different candidate coded bit positions from different codeword positions, cj, to be mapped together as a group. Then the action 610 would identify the ten different coded bit positions in the mapping group. For each coded bit position in the mapping group, the action 615 would identify a respective set of positions vectors and would use Algorithm 1 to identify ten respective sets of restricted zones. Next the action 620 would observe the ten sets of restricted zones and analyze additional information, such as overlapping restricted zones and zones where none of the bits in the mapping group had any restrictions. Such additional information could be used in a mapping group placement strategy to more intelligently place one or more of the coded bits in the mapping group. For example, permutation positions located outside of the union of these restricted zones would likely lead to verifiable placements. Also, for example, if the restricted zones of nine of the coded bit positions overlapped in a certain “crowded area(s),” but the tenth coded bit to be placed did not, it may be desirable to place the tenth coded bit position into an identified crowded area in order to fill a difficult position. The mapping group placement strategy is preferably organized to increase a measure of performance such as the probability of finding valid CI-4 interleaver solutions by eventually being able to place all of the K bits in the frame.
In the example where the mapping group has ten coded bit positions, suppose that a candidate target location in u, πcandidate-1, is selected by a random number generator during the action 620. In this example, assume that πcandidate-1 is not in any restricted zone of six of the ten coded bit positions in the mapping group, {(nj+t)}. Then the verification portion of the action 620 could be carried out for these six coded bit positions. Suppose that three of these six coded bit positions were verified to be placeable at πcandidate-1. This information can be recorded, and another candidate permutation location, for example πcandidate-2, could be similarly analyzed. This analysis can be continued up to πcandidate-10. Now, with the knowledge of the verifiable placements and the interactions between the different coded bit positions in the mapping group, the action 620 can determine an ordering in which to make the placements and final verifications of the coded bits in the mapping group. For example, the action 620 can recursively perform tentative placements and verifications among the remaining bit positions to be placed based on the first pass analysis above. The action 620 continues analyzing the effects of different placement strategies until all of the bits of the mapping group have been placed, in which case, the parameter Δ is set as Δ=10. Also, the action 620 preferably maintains data records indicating placements that could have been made but were not selected. This information can later be used if and when a roll-back is needed. When the method 600 is near completion, it is possible that Δ<10 placements can be made for a given mapping group. In such cases the roll-back process discussed above can be invoked and the method 600 reentered at action 610, or Δ<10 placements can be made and the parameter Δ can be set to the number of coded bit positions that have actually been placed. Control then passes to action 625 where the variable PLACED is incremented by Δ.
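The first-pass analysis of a mapping group can be sketched with plain set operations. The representation below, a dictionary from each coded bit to its set of restricted positions, is an assumption, as are the function names. The restricted zones themselves would come from Algorithm 1, which is not reproduced here.

```python
from collections import Counter


def analyse_mapping_group(zones, frame_len, occupied=()):
    """Summarise restricted zones for a mapping group.

    zones    -- dict: coded bit -> set of restricted positions in u
    occupied -- positions in u that are already taken
    Returns (unrestricted_for_all, crowding, allowed).
    """
    free = set(range(frame_len)) - set(occupied)
    union = set().union(*zones.values())
    unrestricted_for_all = sorted(free - union)      # likely verifiable for any bit
    # crowding[p]: how many bits of the group are restricted at free position p.
    crowding = Counter(p for z in zones.values() for p in z if p in free)
    allowed = {bit: sorted(free - set(z)) for bit, z in zones.items()}
    return unrestricted_for_all, crowding, allowed


def crowded_candidates(bit, crowding, allowed):
    """Allowed positions for `bit`, most crowded first, to fill difficult spots."""
    return sorted(allowed[bit], key=lambda p: (-crowding[p], p))
```

Sorting a bit's allowed positions by how many other bits are restricted there is one simple way to express the "crowded area" heuristic; other orderings are equally possible.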
In one type of embodiment, the mapping group can be selected to be all of the remaining bits to be placed outside of the originally selected μ-set. In such an embodiment, the action 615 analyzes all possible valid placement positions for each remaining coded bit outside the μ-set. Next, computer-chess forward looking trellis logic is used whereby each placement is considered to be a “move.” Using the same type of game theory forward looking analysis as is used in computer chess games, the action 620 could analyze all sequences of “moves” and identify a sequence of “moves” that caused the method 600 to “win” the game, i.e., to place all of the coded bits into a proper CI-4 interleaver design. While such an approach requires more computing time, such logic is well known, and the method 600 will be carried out off line, with the final result potentially used millions of times in the future or published in a standards document. Also, the computer-chess forward looking trellis logic can be applied during roll-backs to a smaller portion of the placement problem within which the trouble spots have been identified.
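A forward-looking search over placement “moves” can be approximated by depth-first backtracking, as in the sketch below. This is a stand-in for the computer-chess style logic described above rather than a reproduction of it. The bits are tried in a fixed order, the `allowed` predicate is a hypothetical placeholder for the constraint checks, and `not_adjacent_in_codeword` is a toy rule used only for illustration.

```python
def search_placements(bits, frame_len, allowed, placement=None):
    """Depth-first search for a full placement; returns a dict or None if none exists.

    allowed(bit, pos, placement) decides whether `bit` may go to `pos` given the
    placements (moves) already made on the current search path.
    """
    placement = {} if placement is None else placement
    if len(placement) == len(bits):
        return dict(placement)            # every bit placed: the "game" is won
    bit = bits[len(placement)]
    used = set(placement.values())
    for pos in range(frame_len):
        if pos not in used and allowed(bit, pos, placement):
            placement[bit] = pos          # make the move
            result = search_placements(bits, frame_len, allowed, placement)
            if result is not None:
                return result
            del placement[bit]            # undo the move and try the next one
    return None


def not_adjacent_in_codeword(bit, pos, placement):
    """Toy rule (assumption): bits of the same codeword may not be adjacent in u."""
    return all(abs(pos - p) > 1 for (j, _t), p in placement.items() if j == bit[0])
```

Because the search is exhaustive, its cost grows quickly with the number of bits, which is consistent with restricting it to a small trouble region during a roll-back.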
Control next passes from action 625 to action 630 which then passes control back to the action 610 until an error condition arises where certain placements cannot be verified, or until the condition PLACED=K is met, in which case the CI-4 permutation vector, π, is supplied as output. If control passed to the action 630 because of an error condition (e.g., Δ=0), then a roll-back as discussed above is performed and the process 610-630 is continued until the condition PLACED=K is met. Once this condition is met, the entire permutation vector, π, is output from the design algorithm 600.
The method 600 can also be configured to perform an “analysis run” that does not place coded bit positions from c to u in actions 605 and 620. In analysis runs, the coded bit positions are assumed to already have been placed by a previous run of the method 600, so the method 600 is configured only to identify and analyze restricted zones for a specified MHD>dt. Analysis runs are used to identify a set of positions vectors (and their respective lengths) that correspond to low weight CTBC coded sequences whose weights are dt<d≦df, for some specified weight, df. It is assumed that all CTBC coded sequences with weights d≦dt will already have been eliminated in the previous run of the method 600 that performed the placements subject to dt. If, as discussed below, multiple dt's are used in the method 600, then the set of weights identified in the analysis run, dt<d≦df, holds for the lowest value of dt used in the run of the method 600 that performed the placements. Analysis runs can be configured to provide additional information such as the restricted zones associated with each of the identified positions vectors in the higher weight regions, dt<d≦df. While the previous run of the method will have avoided all restricted zones for all weights d<dt, there will be new restricted zones that can be identified for remaining low weight CTBC codes whose weights are in the range dt<d≦df.
The method 600 can also be configured to perform a CI-3 interleaver design, or a mixed CI-3/CI-4 interleaver design. First consider configuring the method 600 to perform a CI-3 interleaver design. To start, the action 602 is configured by setting the parameter μ to μ=1. This causes action 605 to place one bit from each codeword without constraints. For example, action 605 can use a random number generator to place a total of ρ coded bit positions, one from each codeword position, from c onto u. Control next passes to action 610 which is configured to use a mapping group containing one bit, i.e., Δ=1.
In a CI-3 embodiment of the method 600, action 615 is configured to identify the restricted zones due to Constraints 1-4. For Constraint 1, assuming a value has been specified for s1, the restricted zones for a coded bit (nj+t) consist of all positions within s1 locations in u away from any already placed coded bits, π(nj+t) for t∈{0, . . . , n−1}. For Constraint 2, a list of pairs of codeword positions (cj, cj1) that have a coded bit from cj and a coded bit from cj1 separated by exactly (lmax+1) positions is maintained. When finding a position for a coded bit of cj, each list element that contains cj as a member of a pair of codeword positions (cj, cj1) is identified, and all positions within (lmax+1) of the remaining coded bit positions of each cj1 on the list are added to the Constraint-2 restricted zone for placing the current bit of cj. For Constraint 3, checks are performed to see that if a coded bit of codeword position cj and a coded bit of codeword position cj1 have a separation of l, then when finding a position for a coded bit of cj, the Constraint 3 restricted zones include all positions within (s1−1) away from already placed other coded bits of cj1. For Constraint 4, neighboring codewords of every coded bit on u are monitored. For each codeword cj, for j=0, 1, . . . , ρ−1, and for each s=1, 2, . . . , (lmax+1), a respective list of neighboring codewords, Lnj(s), is maintained whose list entries identify all of the neighboring codewords of cj in u that have a coded bit at a minimum separation of s relative to any of the n coded bits of the codeword cj. When selecting a position for a coded bit position (nj+t) of codeword position, cj, the lists Lnj(s) are consulted. Suppose that cjx is an entry of Lnj(1), and cjy is an entry of Lnjx(1). Then when placing coded bit position (nj+t), the Constraint 4 restricted zone includes one position around each coded bit of codeword cjy.
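As an illustration of the simplest of these zones, the sketch below builds a Constraint 1 restricted zone. It assumes that “within s1 locations” means an index distance of at most s1 from an already placed bit of the same codeword position, and that the zone is clipped to the frame; both readings, and the function name, are assumptions rather than definitions from the text.

```python
def constraint1_zone(placed, j, s1, frame_len):
    """Positions in u restricted for a new bit of codeword position j under Constraint 1.

    placed -- dict: (codeword index, bit index t) -> position in u
    s1     -- spacing parameter (assumed inclusive distance)
    """
    zone = set()
    for (jj, _t), pos in placed.items():
        if jj == j:
            # Every location within s1 of an already placed bit of the same codeword.
            zone.update(range(max(0, pos - s1), min(frame_len, pos + s1 + 1)))
    return zone
```

With a bit of codeword 0 at position 5 and s1=2, the zone for another bit of codeword 0 is {3, 4, 5, 6, 7}, while bits of other codeword positions are unaffected by this constraint.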
In the action 620, the coded bit position (nj+t) is placed at a selected permutation position, π(nj+t), that is outside of all restricted zones identified in the action 615. For example, a random number generator can select π(nj+t) from among the remaining non-restricted positions. In the action 625 the variable PLACED is incremented by Δ=1. In the action 630 the end conditions are checked and control is passed back to action 610 until a CI-3 interleaver is available. If needed, a roll-back can performed as needed prior to returning control to the action 610. Upon completion the method 600 will provide as output a full CI-3 interleaver, π.
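The placement loop of actions 610-630 can be illustrated with a minimal Python sketch. The sketch enforces only Constraint 1 (Constraints 2-4 are omitted for brevity), and it replaces the roll-back with a full restart, which is a simplification and not the roll-back procedure described herein. The function name design_ci_constraint1 and the parameters seed and max_tries are hypothetical and are introduced only for illustration.

```python
import random

def design_ci_constraint1(n, rho, s1, seed=0, max_tries=1000):
    """Return pi, where pi[n*j + t] is the position in u of coded bit (nj+t).

    Only Constraint 1 is enforced: two coded bits of the same codeword
    are never placed within s1 positions of each other in u.
    """
    rng = random.Random(seed)
    K = n * rho
    for _ in range(max_tries):
        pi = [None] * K
        free = set(range(K))
        # Analogue of action 605 with mu=1: one bit per codeword goes first.
        first = [n * j + rng.randrange(n) for j in range(rho)]
        first_set = set(first)
        rest = [i for i in range(K) if i not in first_set]
        rng.shuffle(rest)
        success = True
        for idx in first + rest:
            j = idx // n
            siblings = [pi[n * j + t] for t in range(n)
                        if pi[n * j + t] is not None]
            # Analogue of action 615: drop positions inside restricted zones.
            allowed = sorted(p for p in free
                             if all(abs(p - q) > s1 for q in siblings))
            if not allowed:
                success = False  # a roll-back would be needed; restart instead
                break
            # Analogue of action 620: random choice among allowed positions.
            pos = rng.choice(allowed)
            pi[idx] = pos
            free.remove(pos)
        if success:
            return pi
    raise RuntimeError("no interleaver found; relax s1 or increase rho")
```

For example, design_ci_constraint1(n=4, rho=16, s1=3) returns a length-64 permutation in which the four coded bits of every codeword are more than three positions apart in u.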
To configure the method 600 to design a mixed CI-3/CI-4 interleaver, the method 600 is instantiated twice, with one instantiation configured to design a CI-3 interleaver as described above (“the CI-3 instantiation,”) and the other instantiation configured to design a CI-4 interleaver as also described above (“the CI-4 instantiation.”) The two instantiations will work on the same problem together and communicate and synchronize with each other as described in the example embodiment provided immediately below.
To start, the CI-3 instantiation is allowed to execute actions 602-630 as described above until PLACED=μ*ρ, where the value of μ refers to the value of μ used in the CI-4 instantiation, e.g., μ=4 when the (8,4) Hamming code is used in the OBC. Now that the entire μ-set has been placed in accordance with Constraints 1-4, action 610 of the CI-3 instantiation is allowed to select the next coded bit position, (nj+t), to be placed. Action 615 of the CI-3 instantiation next identifies all restricted positions for all of Constraints 1-4 as described above. At this point, the coded bit position to be placed, (nj+t), and all the restricted zones for Constraints 1-4 are passed to the CI-4 instantiation. The CI-4 instantiation then executes action 615 using this selected coded bit position (nj+t) and identifies its CI-4 restricted zones using Constraint 5. The CI-4 instantiation then takes the union of all CI-3 and CI-4 restricted zones to form the final restricted zone for coded bit position (nj+t). Actions 620-630 are then performed just as in the CI-4 approach, except that once control passes back to action 610, the CI-3 instantiation is allowed to take over, and the cycle repeats in this way until the mixed CI-3/CI-4 interleaver, π, is available at the output.
To understand the concept of running the method 600 using multiple target MHDs, consider an example where the method 600 has already been run to determine a CI-4 interleaver with a target MHD of dt. Next, an analysis run of the method 600 is performed using a higher distance value, namely df+1 where dt≦df, so that the analysis run identifies all low weight CTBC codewords with weights dt<d≦df. In the analysis run, some of the additional information collected can include statistics that tabulate the low weight CTBC coded sequences and monitor how many weight d CTBC codewords there are in each of the above described categories of codewords. Specifically, the specific categories of the CTBC codewords whose weights are in the range dt<d≦df are evaluated and their positions vectors, p(d), are tabulated in respective tables, P(d). Note that the principal probability of error contributions are thus given by
where Pe,df denotes the error probability due to low weight CTBC codewords whose weights are in the range dt<d≦df and P(d,γb) is the probability of decoding in favor of a CTBC codeword with a weight d error sequence at a bit signal to noise ratio of γb=Eb/N0. A further granularity in the error coefficients, Ad, can be discerned using the statistics provided by the analysis run of the method 600. The analysis run preferably tabulates how many weight d CTBC sequences come from each of the categories of low weight error sequences as defined herein above, i.e., Φ1, . . . , Φ4 and Φm(CI-4), for m=1, . . . , Nc, at each weight, dt<d≦df. Define the category-error-coefficient expansion, for each dt<d≦df, as follows:
where the Ad(cat) values (“category-error-coefficients”) equal Ad times the percentage of low weight CTBC codewords at weight d that come from, respectively, categories Φ1, . . . , Φ4 and Φm(CI-4), for m=1, . . . , Nc, divided by 100. In a CI-3 design only the first four Ad(cat) values can be non-zero, and in a CI-4 design only the Ad(cat) values for cat=5, . . . , Nc+4 can be non-zero. In a mixed CI-3/CI-4 design, all the Ad(cat) values in equation (9) can be nonzero. As discussed earlier, in mixed CI-3/CI-4 interleaver designs, if a given positions vector determined in a CI-4 instantiation identifies a low weight error vector that has already been identified as a member of any of categories 1-4, that positions vector would be grouped into its respective category 1-4 and not counted a second time in a category cat≧5. It can be noted that combinatorics could alternatively be used to determine closed form expressions for each of the category-error-coefficients. With these definitions, equation (8) can be modified to take advantage of this additional information as
where d(cat) is a separate target minimum distance, dt<d(cat)≦df, defined for each category of codewords. The values of d(cat) are used as the multiple target MHDs in the method 600. The values of d(cat) are selected to lower the error probability of equation (10) below that of equation (8) or, preferably, to minimize the error probability of equation (10).
Note that the LengthP parameter sent to Algorithm 1 identifies each positions vector as belonging to a certain category, Φm(CI-4), for m=1, . . . , Nc. Hence the dt value used in Algorithm 1 can be changed to dt(LengthP), which corresponds to the d(cat) values for cat≧5 in equation (10). Different d(cat) values can be used in each of Constraints 1-5 so that each category of low distance error sequences is made to give rise to a lower overall contribution in (10) from the identified values of Ad(cat) and dt(cat).
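As a hedged illustration of the category-error-coefficients defined above, the following Python sketch converts per-category counts from an analysis run into Ad(cat) values by taking Ad times the category percentage divided by 100. The function name, the dictionary layout, and the sample counts are hypothetical; categories 1-4 stand for Φ1, . . . , Φ4 and categories 5 and above stand for Φm(CI-4). The sketch does not attempt to reproduce the error probability expressions themselves.

```python
def category_error_coefficients(A_d, counts):
    """Split the error coefficient A_d at one weight d across categories.

    counts maps a category number to the number of weight-d CTBC
    codewords the analysis run attributed to that category.
    """
    total = sum(counts.values())
    if total == 0:
        return {cat: 0.0 for cat in counts}
    coefficients = {}
    for cat, count in counts.items():
        percentage = 100.0 * count / total
        # A_d(cat) = A_d times the category percentage, divided by 100.
        coefficients[cat] = A_d * percentage / 100.0
    return coefficients


# Hypothetical tabulation at a single weight d (not measured data).
example = category_error_coefficients(12.0, {1: 3, 2: 0, 5: 9})
```

By construction the Ad(cat) values at a given weight d sum to Ad, so the category expansion redistributes the error coefficient without changing its total.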
In the CI-3 and CI-4 constrained interleaver design examples presented so far, the constrained interleaver, π:c→u, i.e., u=π[c], was created by sequentially placing coded bit positions, one at a time, from the coded sequence c into an initially empty interleaver vector u. The placement of the coded bits into the vector u was performed subject to a set of interleaver constraints that, when satisfied, ensure that a target minimum Hamming distance, dt, will be maintained at the output of the IRCC. This process of sequential placement was continued until the interleaver vector u was completely filled with K=nρ coded bits. RCID (reverse constrained interleaver design) takes the view that the coded bits from the vector c have initially been placed into the vector u. However, this initial placement may very well violate the interleaver constraints. RCID then applies a systematic approach to rearrange certain selected bits in u in order to arrive at a new vector u that gives rise to a vector v that does satisfy the constraints and thus achieves the target minimum Hamming distance, dt. In the CI-3 and CI-4 constrained interleaver design methods, as coded bits from the vector c were sequentially placed into the vector u, in some cases it was indicated that roll-backs may have been needed. As explained below, RCID can also be used at the time it is determined that a roll-back will be required. RCID is then used to convert a current interleaver u that requires a roll-back into an interleaver u that meets the interleaver constraints.
The RCID method starts by assuming all of the positions have already been placed into the interleaver u and thus the u vector is initially full. For example, the vector u may be initialized to the natural ordering of coded bits in the vector c by setting u=c. At this time the interleaver, π, amounts to an identity transformation and the coded bits of the concatenation, v=G[u]=G[π[c]], most likely violate the target MHD requirement dt. The RCID approach remedies this situation by removing a predetermined set of coded bits from the initial vector u until all sub-distance error sequences, denoted iP<, with weights d<dt are eliminated. This is preferably performed by removing the minimum possible number of bits from u so as to prevent the sub-distance error sequences, iP<, from completing. The act of removing already placed positions from the vector u creates a set of “holes” in the vector u that correspond to the now-vacated positions in u.
RCID next seeks to place the removed positions back in the interleaver in such a way as to achieve the target MHD, dt. An advantage of this approach is that there will be a smaller number of bits that need to be placed, thus making the interleaver design simpler. Also, using this approach, shorter interleavers (i.e., with a lower value of ρ) can be constructed that meet the MHD requirement, dt. However, when RCID is used to construct minimally short constrained interleavers that meet the MHD requirement, dt, the interleaver π will only permute the bits that have been removed and thus will not be nearly as random as the previously discussed CI-3 and CI-4 designs. Such RCID interleavers will thus sacrifice much of the interleaver gain. However, when RCID is used to resolve roll-back conditions, this disadvantage does not arise. Nor does it arise if the initial u vector is set as u=πrand[c], where πrand is a random interleaver.
To better understand RCID, consider a simple example. In this example, let the OBC be a (4,1) repetition code that is formed by repeating each message bit four times, and whose MHD is d0=4. That is, the c vector is equal to a vector of message bits with each message bit repeated four times. Therefore, each codeword consists of one message bit repeated four times, and each codeword requires four coded bits in order to complete. In this example, a concatenation v=G[u]=G[π[c]] will be formed using the accumulator IRCC for the inner code, and the RCID interleaver will be selected to achieve a target MHD of dt=16. Note that this target MHD corresponds to the maximum that can be achieved using a CI-2 since CI-2 can achieve dt=d0²di. Using RCID, set the initial condition u=c so that the interleaver π is initialized to the identity transformation. In this example, the goal is to modify π (i.e., the ordering in u) in a minimalistic way so that the concatenation, v=G[u]=G[π[c]], achieves dt=16. To achieve this MHD, the following steps may be used:
1. Start by placing all 4ρ coded bits into u by setting u=c. At this time the resulting MHD (from each single OBC codeword) is only 2 because the accumulator IRCC converts each sequence like “1111” into “1010”. Hence, changes are needed.
2. In order to ensure that none of the ρ codewords can complete, remove one coded bit from each codeword position. For example, remove the last coded bit position from every codeword position. This will result in removing a total of ρ coded bit positions at locations congruent to 3 modulo 4 (u(i), i Mod 4==3). Once these coded bit positions are removed, no codeword of the OBC will be completed in u. Hence, with these removals, there will be no sub-distance error sequences, iP<, and thus there will be no violations to the MHD objective, dt=16.
3. Next the RCID approach seeks to place only the ρ removed coded bit positions back into the ρ holes created in u, but in a different order. The removed positions need to be placed back into the holes in such a way as to achieve the target MHD, dt. In this example, this involves placing only ρ removed coded bit positions as opposed to placing the entire set of 4ρ as is needed in the above described CI-3 and CI-4 design methods. When placing this smaller set of ρ removed coded bit positions, the CI-3 or CI-4 constraints as described above can be used. For example, if Constraint 5 is used, and considering a single codeword, with the fourth bit position (i mod 4=3) taken out of codeword position zero, if the coded bit position removed from this first codeword is placed at position u(i) where i≧19, then the concatenation, v=G[u]=G[π[c]], will achieve dt≧16 for this single codeword error event. Similarly, any one or more of Constraints 1-5 can be checked/applied when placing the removed coded bit positions back into u to account for low distance error events involving multiple codewords as well.
In the above example, the parameter ρ need not be known ahead of time. Instead, the method can be carried out and the lowest value of ρ for which the MHD requirement can be met can be determined to be ρmin. This allows a minimum frame size, Kmin=nρmin, to be determined that is needed to meet the target MHD. Also, it should be noted that this simple example was provided to illustrate the main RCID concepts. RCID could be applied to the (8,4) Hamming code as well. In the case of the (8,4) Hamming code, for example, RCID can be applied by starting with u=c and then removing the last four bit positions from each codeword position in u. In this case, the vector u will hold the first four coded bit positions of each codeword position in natural order. The second four bit positions of each codeword position will include bit positions from other codeword positions, and in a more randomized order. However, this approach will lead to a lower interleaver gain due to less randomization as compared to the CI-3 and CI-4 design approaches. The RCID approach may be desirable if it is desired to use as small a frame size as possible to meet a given MHD requirement.
The RCID technique can more generally be described as follows:
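The single-codeword figures quoted in steps 1-3 above can be checked with a short Python sketch. It models the accumulator IRCC as a running XOR of the input bits, which is an assumption about the accumulator structure, and the helper names are hypothetical. Only a single OBC codeword is considered; error events involving multiple codewords are not covered.

```python
def accumulate(bits):
    """Accumulator modeled as a running XOR (1/(1+D)) of the input bits."""
    out, state = [], 0
    for b in bits:
        state ^= b
        out.append(state)
    return out


def single_codeword_weight(positions, length=32):
    """Weight at the accumulator output when u has ones only at `positions`."""
    u = [0] * length
    for p in positions:
        u[p] = 1
    return sum(accumulate(u))


# Step 1: codeword zero left in natural order, "1111" becomes "1010".
natural_weight = single_codeword_weight([0, 1, 2, 3])

# Step 3: bits 0, 1, 2 stay in place and the removed bit goes into a hole
# at a position congruent to 3 modulo 4; find the first hole reaching 16.
holes = range(3, 32, 4)
first_good_hole = min(i for i in holes
                      if single_codeword_weight([0, 1, 2, i]) >= 16)
```

Under these assumptions natural_weight evaluates to 2 and first_good_hole evaluates to 19, in agreement with the values given in the example.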
1. Starting with a given permutation, u=π[c], (where π may initially be the identity transformation or an interleaver being designed using one or both of the above-described CI-3 or CI-4 design methods, but at a point in need of a roll-back, for example), determine a (preferably minimal) set of positions that can be removed from u that will prevent any sub-distance error sequences, iP<, from occurring, so that no violations to the target MHD, dt, occur in the concatenation v=G[u]=G[π[c]]. This can be done by removing the minimum number of coded bit positions from u, or by removing bits in steps of a fixed number of bits at a time, or any other sequential or grouped manner. Also, particular coded bits that have already been placed but are causing too many restricted zones to be present in the remaining bits to be placed can also be removed as a part of this first step.
2. Place the removed positions back into the holes created in u, but in a different order, such that the required MHD condition of the concatenation is achieved. The re-ordering can be done by placing the removed positions subject to a selected set of constraints. Alternatively, each removed coded bit can be placed back into a randomly selected hole, followed by a check to see whether any constraint violations have been caused by that placement, with only valid placements being allowed. Similarly, any combination of the above mentioned placing or swapping methods can be used that checks for and prevents any sub-distance error sequences, iP<, from occurring.
For example, consider how to apply RCID to perform a roll-back when the CI-3 and/or CI-4 design algorithm reaches a point where a roll-back is required. The RCID approach is preferably applied by considering all of the remaining positions to be placed into the holes created in step 1 of the RCID approach as outlined above. Additionally, already placed locations that are identified to be giving rise to excessive restricted zones for the remaining positions can also optionally be removed prior to the reordering and replacement step 2. Then the CI-3 and/or CI-4 design approaches are continued as per step 2 to fill the holes and complete the CI-3 and/or CI-4 interleaver design.
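Steps 1 and 2 above can be sketched in Python for the (4,1) repetition code example. The sketch removes the last coded bit of every codeword, then puts each removed bit into a randomly ordered hole and accepts the placement only if the single-codeword weight at the output of an accumulator (modeled as a running XOR) reaches dt. The function names, the restart-on-failure policy, and the choice to check only single-codeword error events are assumptions made for illustration; a full design would also apply Constraints 1-5 for multi-codeword events.

```python
import random

def accumulator_weight(positions):
    """Output weight of a running-XOR accumulator for input ones at `positions`."""
    ones = set(positions)
    state = weight = 0
    for i in range(max(ones) + 1):
        state ^= int(i in ones)
        weight += state
    return weight


def rcid_refill(rho, d_t, seed=0, max_tries=10000):
    """Return hole_of, where hole_of[j] is the position in u that receives
    the removed (last) coded bit of codeword j."""
    rng = random.Random(seed)
    holes = [4 * k + 3 for k in range(rho)]  # step 1: vacated positions
    for _ in range(max_tries):
        free = holes[:]
        rng.shuffle(free)  # random hole order
        hole_of = []
        for j in range(rho):
            kept = [4 * j, 4 * j + 1, 4 * j + 2]
            # Step 2: only placements that pass the validity check are allowed.
            valid = [h for h in free if accumulator_weight(kept + [h]) >= d_t]
            if not valid:
                break  # dead end; restart with a new random hole order
            hole_of.append(valid[0])
            free.remove(valid[0])
        else:
            return hole_of
    raise RuntimeError("no valid refill found; try a larger rho")
```

For example, rcid_refill(rho=32, d_t=16) returns a reordering of the 32 holes in which every single-codeword error event has weight of at least 16 at the accumulator output.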
If necessary, the interleaver gain in RCID can be improved by removing more bits than are needed from u and then randomly selecting their reordering to place them back subject to the selected interleaver constraints. In this sense, the CI-3 and CI-4 design methods can be viewed as special cases of RCID where all of the coded bit positions are removed from u and are then placed back to maximize the interleaver gain.
Parallel Architectures with Deterministic Constrained Interleaver (DCI):
Interleavers are often required to exhibit the “contention free” property, also known as being “vectorizable.” Such interleavers have the advantage that they can be efficiently implemented in decoder chips that employ a set of M parallel processing engines that are able to make repeated parallel accesses to a bank of M parallel memories without any memory address conflicts or memory contentions. For example, the LTE standard uses a QPP interleaver which is 8-way vectorizable, and LTE decoder chips are often organized as 8-way parallel processing systems. OTN also uses M-way vectorizable interleavers and parallel processing chips, but, most usually, with M>8 due to the very high data rates used in OTN applications. In the context of constrained interleavers, the “contention free”/“vectorizable” property can be formulated as an additional interleaver constraint. Herein, an “M-way vectorized deterministic constrained interleaver” corresponds to a DCI (deterministic constrained interleaver) that typically implements a SRCI such as CI-3 and/or CI-4, and also implements the vectorization constraint below. An M-way vectorized deterministic constrained interleaver also uses a deterministic pseudo-randomization function (such as the QPP or other deterministic interleaver). M-way vectorized deterministic constrained interleavers are preferably used in transmitters to generate CTBC codes that have vectorizable constrained interleavers. Also, a certain class of M-way vectorized deterministic constrained interleavers is used in high speed real time parallel access/parallel processing implementations of SISO decoders as described in more detail below. Also, a permutation is said to be deterministic and vectorizable if it meets the vectorization constraint below and can be generated by a pre-determined DCI using one or more predetermined mathematical formulas as discussed in further detail below.
Constraint 6 (“Vectorization Constraint”): Given that the c and u vectors are of length K, in order for an interleaver to be M-way vectorizable, Constraint 6 requires that the permutation u=π[c] is selected to ensure that subsequences in c whose elements are spaced by multiples of K/M positions apart are permuted into re-ordered subsequences in u whose corresponding elements are also spaced by multiples of K/M positions apart.
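A minimal Python sketch of a Constraint 6 check is given below. It assumes the column-major convention in which element i of c lies on row i mod (K/M) and in column floor(i/(K/M)), and it treats pi[i] as the position in u of element i of c. The function names and the QPP parameters K=40, f1=3, f2=10 are illustrative values chosen for this sketch only.

```python
def qpp(K, f1, f2):
    """Quadratic polynomial permutation pi(i) = (f1*i + f2*i^2) mod K."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


def is_m_way_vectorizable(pi, M):
    """Check Constraint 6 for the permutation pi with M parallel memories."""
    K = len(pi)
    if K % M != 0:
        return False
    W = K // M  # number of rows, i.e., the spacing K/M
    for r in range(W):
        group = [pi[r + c * W] for c in range(M)]  # elements spaced K/M apart
        rows = {p % W for p in group}              # destination rows
        banks = {p // W for p in group}            # destination columns
        # All M elements must land on one row, in M distinct columns.
        if len(rows) != 1 or len(banks) != M:
            return False
    return True


pi = qpp(40, 3, 10)
vectorizable = is_m_way_vectorizable(pi, 8)

# Swapping two entries generally breaks the row structure.
broken = pi[:]
broken[0], broken[1] = broken[1], broken[0]
still_vectorizable = is_m_way_vectorizable(broken, 8)
```

In this sketch every group of M elements spaced K/M apart in c must map onto a single row of the memory, with one element per column, so that M processors can read the group in one parallel access without contention.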
Constraint 6 can be better understood by considering an example memory system 710 arranged as a K/M×M matrix as shown in
Let [C]K/M×M denote a K/M×M “vectorization matrix” into which the vector c is loaded in column-major order. Such a matrix C is shown as matrix 710 in
Any permutation that satisfies Constraint 6, π:c→u, can be factored as follows: u=π[c]=πLSBπi
The memory system 700 will be connected via the M×M interconnection network to a set of M processing engines, labeled as Proc(jcol), jcol=0, . . . , M−1 (not shown). The address generator 705 is configured to be able to count in natural order (to access any row of the C matrix in parallel), and also in permuted order (to access any row of the U matrix in parallel). For example, as discussed in further detail below in connection with
To understand how the memory 710 can be accessed in accordance with both the C and U matrices, consider an example involving a QPP interleaver as used in the current 4G LTE turbo code. During the second half of each SISO iteration, the block 705 acts as a sequential up/down counter that increments/decrements the row index, irow. During the first half of each SISO iteration, the block 705 performs QPP addressing as described above in connection with the Sun reference. A high speed M-way parallel hardware embodiment of the address counter 705 can be implemented to generate M consecutive QPP addresses in parallel. Inside the block 705, are M parallel QPP address generators that are configured to sequence through all of the addresses of all elements stored on each column of U. This way, all of the elements, (πMSB[irow],πLSBπi
In the operation of the system described above, while addressing the matrix C, all the up/down row counter in the block 705 needs to do is to provide a single row address, irow, because once this row is accessed in parallel, data elements C[irow,jcol], jcol=0, . . . , M−1 can then be passed directly to the set of processors Proc(jcol), jcol=0, . . . , M−1 (not shown in
Next consider an example where the memory system 700 is specifically used while decoding a CTBC code. To look at a larger example than that shown in the block 710, let the frame size be K=4096 (2^12) and let there be M=2^3=8-way vectorization, so that each column of the interleaver matrix 710 has 2^(12−3)=2^9=512 bits per column. Assuming that the (8,4) Hamming code of Table 1 is being used as the OBC, there will be 2^3=8 bits per OBC codeword, and thus each column of C will contain 2^(9−3)=2^6=64 codewords of the OBC. In the first half of a SISO iteration, each of the M=8 processors will need to access a respective column of the implicit permutation matrix U. Since the U matrix is never explicitly formed, the address generator 705 is preferably configured to generate a set of QPP permuted row and column addresses using the parallel configuration based on the Sun reference as described in detail above. During the first half of the CTBC code's SISO decoding operation, IRCC decoding is performed, so that each of the M=8 processors performs parallel decoding on a separate column of the U matrix in order to decode a respective length-K/M=512 subsequence of the CTBC codeword, v. During the second half of the SISO decoder cycle, the address counter 705 counts in natural order, 0, . . . , K/M−1, and the M×M interconnection network 730 performs a direct pass through, so that each of the M=8 processors can perform, in parallel, an OBC SISO decoding cycle on a subsequence of 64 codewords stored in each column of the matrix C.
Certain permutations like the QPP are already factorizable, in which case a set of MSBs extracted from the address generator 705 can be used to select a row, and the LSBs of each of M parallel-generated QPP addresses can be decoded and used to control the interconnection network 730 to apply the intra-row permutation to the elements of the selected row. However, an aspect of the present invention 700 contemplates that any valid permutation over the integer ring {0, . . . , K/M−1}, πMSB[], can be used to select rows in the memory 710, whether πMSB[] is vectorizable or not. Then any independent set of intra-row permutations, πLSBπi
As previously discussed, a “random interleaver” can be defined in opposition to a “deterministic interleaver” that uses a mathematical formula to generate the deterministic interleaver permutation. A random interleaver is thus often implemented as a table look up or with a state-machine logic circuit whose sequencing logic does not use a fixed mathematical equation but whose state transition logic needs to be specifically designed for each frame size. In this context, many of the deterministic interleavers defined herein can have some random components to them that rely on state transitions and state dependent logic that are different for each frame size. A design objective is to design and select DCI solutions that minimize the amount of hardware that needs to be specifically designed for each frame size.
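The layout arithmetic of this example can be reproduced with a short Python sketch. The function names are hypothetical; the sketch assumes the column-major loading described above, in which the element on row irow and column jcol of C holds coded bit index jcol*(K/M)+irow of c.

```python
def vectorization_layout(K, M, n):
    """Return (rows per column, OBC codewords per column) for a K/M x M matrix."""
    rows = K // M
    return rows, rows // n


def c_index(irow, jcol, rows):
    """Index into c of the element at (irow, jcol) under column-major loading."""
    return jcol * rows + irow


K, M, n = 4096, 8, 8
rows, codewords_per_column = vectorization_layout(K, M, n)  # 512 and 64

# Because n divides K/M, every element on a given row holds the same
# within-codeword bit position t = irow mod n, whatever the column.
same_t_on_each_row = all(
    c_index(irow, jcol, rows) % n == irow % n
    for irow in range(rows) for jcol in range(M)
)

# Column jcol holds whole codewords jcol*64, ..., jcol*64 + 63.
first_codeword_in_column = [c_index(0, jcol, rows) // n for jcol in range(M)]
```

The same-t property is what allows the OBC SISO decoders to work on whole codewords stored down each column while the rows are read in parallel.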
Referring now to
The method 800 is first described in the context of designing CI-4 DCIs. As discussed in further detail below, the method 800 can also be configured to design CI-3 and mixed CI-3/CI-4 DCIs. In the descriptions of certain preferred embodiments below, it is assumed that the block 705 is a QPP interleaver that generates both a sequence of permuted row addresses from the MSBs and also generates a set of M permuted column addresses using the LSBs from a set of M=8 QPP recursive permutation address generators as discussed above in connection with
Action 802 is similar to action 602 as discussed above, and when a CI-4 DCI is being designed, the action 802 preferably generates a complete μ-set, {(nj+t)μ: j=0, . . . , ρ−1}. However, in the method 800, elements of the μ-set are selected to be a subset of the rows in the vectorization matrix C. For example, if M=8, K=4096, K/M=512 and the (8,4) Hamming code of Table 1 is used, so that n=8, ρ=512, and μ=4, this means that half of the coded bits of c can be included in the μ-set. A selected μ-subset consisting of μ=4 bits will come from each codeword. Because the codewords of the OBC are loaded into the matrix C in column major order, the indices of all of the coded bits on each row of C will be congruent to the same value of t modulo n, given by π[irow] MOD n=t. This implies that half of the rows of the matrix C can be included in the μ-set. Hence in a typical embodiment, the action 802 randomly selects a sequence of ρ μ-subsets, until, in this example, a selected set of K/M/2=256 rows has been placed, and the placement counter is set to PLACED=μ*ρ=4*512=2048. In some embodiments, in order to provide additional flexibility in mapping, the action 802 creates a partial μ-set in order to loosen the requirements later while placing coded bits subject to the constraints. Whether a complete or incomplete μ-set is selected, the action 805 identifies a corresponding set of permutation locations, {π(nj+t)μ:j=0, . . . , ρ−1}, for the μ-set selected in the action 802. In the example of
Control next passes to action 810 where a remaining row of C, i.e., a row that has not yet been placed, is selected. In preferred embodiments, the selection 810 is generated using a pseudo random number generator that generates a row index into a vector that includes all the indices of the rows that have not been placed yet, although other selection criteria can be used. The currently selected row can be selected by first randomly selecting a value for the variable irow which is outside of the original μ-set, e.g., the 256-element μ-set denoted {irow}256. The selected row 810 is then given by πMSB [irow].
Control next passes to an action 815 which performs an analysis function. The action 815 views the currently selected row as a mapping group as discussed in connection with the actions 615 and 620 above. All of the restricted zones for each coded bit position on the currently selected row, πMSB[irow], of the matrix C are preferably evaluated in the action 815. In a preferred embodiment, the action 815 generates a set of positions vectors for each of the coded bit positions on the πMSB[irow]th row of the C matrix and passes these positions vectors to Algorithm 1 in order to identify their respective restricted zones. As discussed earlier, different target MHDs may be used in Algorithm 1 depending on LengthP of each positions vector. The action 815 preferably also maintains a data structure that records all of the positions vectors and restricted zones for all the elements that have been placed. As described below, the recorded information regarding the already placed positions vectors and restricted zones can be useful if a roll-back is required later, or for other purposes as discussed below in connection with
Control next passes to an action 820 which starts by identifying a preferred default candidate intra-row permutation, πLSBi
As shown in
In other kinds of embodiments, such as illustrated in
In all such embodiments of the method 800, control next passes to action 825 where the variable PLACED is incremented by M. If no valid placement could be found in the action 820, a flag is preferably set in the action 820 so that the action 825 will not increment the variable PLACED. Control next passes to action 830. If the variable PLACED has been incremented by M, and the value of PLACED is still less than K, then control passes from the action 830 back to the action 810. If an error condition has been marked in the action 820, then action 830 performs a roll-back and increments a roll-back counter. When a roll-back is carried out, the value of irow and thus πMSB[irow] is rolled back to a previous value, and the value of PLACED is set to a lower value indicative of the roll-back point. If a certain number of roll-back attempts fail, a new deterministic interleaver can be selected for use as πMSB[]. For example, if the deterministic interleaver is a QPP interleaver, the parameters f1 and f2 are adjusted and the method 800 is started over again using this new πMSB[] permutation at the action 802. Also, if a certain number of roll-back attempts fail, the method 1000 as discussed below can be executed, especially for cases where the value of PLACED is close to K=n*ρ.
To understand how roll-backs are performed, consider the above example where there are a total of 512 rows to be placed, and a “non placeable row” error condition is flagged when attempting to place row πMSB[irow], which corresponds to the 507th row in the row-placement sequence. The probability of having a “non placeable row” error condition increases toward the end of the method 800 when the majority of the rows have already been placed. There are various approaches that can be used to perform a roll-back. One approach is to not place the current non-placeable row, to flag the current row as not having been placed, but to then increment PLACED and continue to the next row, and to continue doing this until PLACED reaches its final value. Assume that this is attempted, and by completion of the method 800, all of the rows could be placed except for rows, πMSB[irow], for irow∈{14, 210, 507}. To perform the roll-back, the stored positions vectors and restricted zones are analyzed and the codewords involved in the positions vectors are identified. An analysis is performed to determine which already placed rows contain codeword positions that caused the difficulty in the placement of the codeword positions in the rows that could not be placed. The roll-back is then preferably performed by causing certain earlier-placed rows to be placed in such a way as to alleviate all of the placement problems in the problematic rows. This can even go as far as causing certain alternative permutations to be applied in the μ-set. Once the changes are made to the placement of a subset of already placed rows, the method 800 is restarted at the point after the earliest row that was changed (or after the μ-set if changes were made to the way any of the rows of the μ-set were placed). The method 800 is then allowed to run to completion or until another roll-back is needed. This process continues until the method 800 finds a solution or until a roll-back counter meets a threshold.
If the roll-back threshold is met, a new deterministic interleaver is used in the block 705, and the method 800 is started over with this new deterministic interleaver. For example, if the deterministic interleaver used in block 705 is a QPP interleaver, the parameters f1 and f2 are adjusted and the method 800 is started over again using this new πMSB[] permutation at the action 802. Also, the method 1000 as discussed below in connection with
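When the QPP parameters f1 and f2 are adjusted, the new pair must still define a valid permutation. The following Python sketch shows one possible way to pick the next candidate pair by brute-force checking. The lexicographic scan order and the function names are assumptions made for illustration; the selection rule actually used to adjust f1 and f2 is not specified here.

```python
def is_qpp_permutation(K, f1, f2):
    """Brute-force test that i -> (f1*i + f2*i^2) mod K is a permutation."""
    seen = [False] * K
    for i in range(K):
        address = (f1 * i + f2 * i * i) % K
        if seen[address]:
            return False  # two inputs collide, so this is not a permutation
        seen[address] = True
    return True


def next_qpp_parameters(K, f1, f2):
    """Return the next (f1, f2) pair after the current one, in lexicographic
    order, that yields a valid permutation, or None if there is none."""
    for g1 in range(f1, K):
        start = f2 + 1 if g1 == f1 else 1
        for g2 in range(start, K):
            if is_qpp_permutation(K, g1, g2):
                return g1, g2
    return None


# Illustrative small frame size; the starting pair is a hypothetical choice.
candidate = next_qpp_parameters(40, 3, 10)
```

Each candidate returned in this way would then be passed back to the action 802 as the new starting permutation.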
Like the method 600, the method 800 can be configured to perform an “analysis run” that does not place coded bit positions from c to u in actions 805 and 820. In analysis runs, the coded bit positions have all already been placed, so the method 800 is used instead to identify restricted zones for any specified MHD, e.g., MHD>dt. The output of an analysis run includes a set of positions vectors (and their respective lengths) that correspond to low weight CTBC coded sequences whose weights are dt<d≦df, where df corresponds to the highest weight of sequences that need to be identified in the analysis run. Other information such as restricted zones or other statistical information such as the category of each identified low weight positions vectors and counts of positions vectors in each category can also be provided.
The method 800 can also be configured to perform a CI-3 interleaver design, or a mixed CI-3/CI-4 interleaver design. First consider configuring the method 800 to perform a CI-3 interleaver design. To start, the action 802 is configured by setting the parameter μ to μ=1. This causes action 805 to place just one row without constraints. Control next passes to action 810 which can select a next remaining row, irow, for example, by using a random number generator, and can then select the next row of C to be placed to be πMSB[irow]. In CI-3 design embodiments, action 815 is configured to identify the restricted zones for all coded bit positions on the selected row due to Constraints 1-4. The same CI-3 checks (Constraints 1-4) are used as discussed in connection with action 615 of the method 600 when configured for CI-3 based designs as discussed in connection with
To configure the method 800 to design a mixed CI-3/CI-4 DCI, the method 800 is instantiated twice, with one instantiation configured to design a CI-3 interleaver as described above (“the CI-3 instantiation”) and the other instantiation configured to design a CI-4 interleaver as also described above (“the CI-4 instantiation”). These two instantiations work on the same mixed CI-3/CI-4 DCI design problem together and communicate and synchronize with each other as described above in connection with the method 600, which used a CI-3 instantiation of the method 600 in communication and in synchronization with the CI-4 instantiation of the method 600. The difference is that the mixed CI-3/CI-4 vectorizable DCI design uses a CI-3 instantiation of the method 800 in communication and in synchronization with the CI-4 instantiation of the method 800.
The method 800 can also be modified to operate with a plurality of different target MHDs. This type of operation is similar to the above described embodiments of the method 600, except the multiple target MHDs are applied in the action 820 instead of the action 620 as described above.
In an alternative embodiment of the method 800, the number of elements per row is set to M=1 so that there is only one column, which has K/M=K coded bits (i.e., an entire frame). The resulting length K DCIs have the disadvantage that they are not vectorizable, but have increased interleaver gain as compared to a vectorizable DCI. In such embodiments, the Constraint 6 will not be enforced. A starting deterministic interleaver, πMSB[], is provided that has a frame size of K. The actions 802 and 805 operate like actions 602 and 605 to select and place a set, and action 810 behaves similarly to action 610 to preferably select a single coded bit to be placed. In such embodiments action 820 places each coded bit in accordance with the deterministic interleaver, πMSB[]. A full data structure of related information such as positions vectors, restricted zones, and constraint violations in placing bits in accordance with πMSB[] is recorded. The final result is an analysis of πMSB[] to determine how close it is to a length K DCI. As discussed in connection with
A deterministic interleaver 905 generates a set of permuted indices using a deterministic formula-based calculation to generate the deterministic interleaver's address sequences under state machine or program control. The output of the deterministic interleaver 905 is coupled to a local constraint enforcer permutation block 910. As the name implies, the purpose of the local constraint enforcer permutation block 910 is to perform a local post-permutation to transform the deterministic interleaver's output to a valid constrained-interleaver permutation function. The local constraint enforcer permutation 910 takes as input the deterministically permuted sequence of indices and applies a predetermined set of correction permutations to ensure that the resulting sequence, u, meets a set of interleaver constraints such as any one or more of interleaver Constraints 1-6 as discussed above. For example, the local constraint enforcer permutation can apply a predetermined set of swaps or sub-permutations to convert the output of the deterministic interleaver 905 into a valid deterministic constrained interleaver permutation, 900.
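By way of a non-limiting illustration, the two-stage structure described above (a formula-based deterministic interleaver followed by a predetermined local correction permutation) can be sketched in Python as follows. The function names, the frame size K=40, the QPP parameters f1=3 and f2=10, and the swap list are assumptions chosen only for this sketch; in practice the swap list would be produced off line by a design procedure, and the swaps shown here are not derived from any actual constraint analysis.

```python
def qpp_permutation(K, f1, f2):
    """Formula-based deterministic interleaver: pi[i] = (f1*i + f2*i^2) mod K."""
    return [(f1 * i + f2 * i * i) % K for i in range(K)]


def apply_local_swaps(perm, swaps):
    """Apply a predetermined list of (a, b) index swaps to a permutation.

    Each swap exchanges the entries at indices a and b, so the result is
    still a permutation of the same index set.
    """
    out = list(perm)
    for a, b in swaps:
        out[a], out[b] = out[b], out[a]
    return out


# Hypothetical parameters and swap list (assumptions for illustration only).
K, f1, f2 = 40, 3, 10
pi_d = qpp_permutation(K, f1, f2)        # output of the deterministic stage
swap_list = [(0, 1), (5, 7)]             # assumed predetermined local swaps
pi_dci = apply_local_swaps(pi_d, swap_list)
```

Because each correction is a swap of two entries, the corrected sequence remains a permutation of the original index set regardless of which swaps are selected.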
Similar concepts as described in connection with
Referring now to
The method 1000 begins with an action 1002 that selects a deterministic interleaver, πD[], for use in the block 905 of
The deterministic interleaver 905, πD[], that is identified in action 1002 is then processed by action 1005 using an embodiment of the method 800. For example, any of the embodiments of the method 800 as described in connection with
Alternatively, the method 1000 can be used similar to a roll-back for cases where the method 800 was not able to find a deterministic constrained interleaver permutation, πDCI[], to meet all the specified constraints. In such cases, the method 1000 can be viewed as an outer control loop that calls the method 800 to design a DCI from the action 1005. If the method 800 is able to find a valid DCI, then the method 1000 can exit at action 1005. If the method 800 is not able to find a DCI to meet a specified set of Constraints 1-6, then the analysis information provided by method 800 can be used to identify one or more interleavers that are close to a DCI but still have one or more identified constraint violations. Preferably a complete record of the relevant information from the one or more attempted design runs of the method 800 that provided the one or more best/closest approximations to a DCI is recorded in the action 1005. For example, the μ-sets, the lists of positions vectors used when placing each position, (nj+t), the respective restricted zones identified when placing each bit (nj+t), and the orderings used to make the rest of the placements after the μ-set as identified by each pass through action 810 would be identified for each candidate deterministic permutation πD[] that is identified to be close to a deterministic constrained interleaver permutation πDCI[]. The data structure will preferably have recorded a set of one or more coded bit positions, (nj+t), that could not be placed in such a way as to meet the target MHD during the previous run of the method 800. This way, the complete state and history of the runs of the method 800 that resulted in each candidate πD[] could be made available to the method 1000. Depending on the embodiment of the method 1000, only a subset of all the recorded information described above may need to be recorded for use by the rest of the method 1000.
In embodiments where the action 1005 calls the method 800 to provide only an analysis run, a similar set of information would be provided by the analysis run of the method 800.
Control next passes to an action 1010 that places a μ-set similar to the action 805 as described above. The same placement as used in the action 805 when the method 800 was called from the action 1005 is preferably used. The previous run of the method 800 could have been an analysis run or a failed design run as discussed above. Control next passes to an action 1015 which selects a next coded bit position or a mapping group to be placed. This action can be performed similar to any of the embodiments of the action 810 as described above. In preferred embodiments the action 1010 follows the stored ordering that was generated by looping through the action 810 by the method 800 when it was called from the action 1005. When this ordering is used, the information recorded in the previously described data structure will be perfectly synchronized with the current run of the method 1000. In alternative embodiments, the mapping group is selected based upon, for example, a window of positions in the vector u where the state information data structure provided by the previous run of the method 800 indicates that constraint violations exist that need reconciliation.
Control next passes to an action 1020 that identifies one or more respective local swap lists associated with each of the one or more current bit positions in the current mapping group. For example, the action 1020 analyzes each bit position (nj+t) of the current mapping group to determine whether the deterministic interleaver 905's permutation location, πD[(nj+t)], is in a restricted zone. If πD[(nj+t)] does not correspond to a location in any restricted zone of (nj+t), then a respective local swap list, Lswap(πD[(nj+t)]), is left empty. In such a case, if there is only one element in the mapping group, then control passes to actions 1025 and 1030 where the variable PLACED is incremented and control is looped back to the action 1015 to select the next placed element to be analyzed. If πD[(nj+t)] is in an identified restricted zone of (nj+t), then the local swap list Lswap(πD[(nj+t)]) will need to be built. The local swap list will contain a list of positions that are both local to the mapped position πD[(nj+t)] and are outside any restricted zones of (nj+t).
The concept of “local” is relative to the underlying hardware on which the πDCI[] 900 is to be implemented. For example, if the interleaver πDCI[] 900 is not being designed to be a vectorizable interleaver, a local set of candidate swap locations can be defined as a window, πD[(nj+t)]±wd, where wd corresponds to a window distance and is used to define a window around the position πD[(nj+t)] in u. If the interleaver πDCI[] 900 is being designed to be a vectorizable interleaver, then “local” typically refers to a two-dimensional window area, given by U(πMSB[irow]±wd-row, πLSB[jcol]±wd-col). Typically the smaller the value of wd, wd-row, and/or wd-col, the lower the complexity that will be required to implement the local constraints enforcement permutation 910. The window size used in a swap zone is preferably made as small as possible, and the minimum possible window size is dependent on the distance to a nearest swappable position in u as discussed below. In many cases the window need not be centered on the position πD[(nj+t)], but as discussed below, the window edges will be determined by the edges of certain relevant restricted zones. When the current mapping group contains more than a single coded bit position, the minimum possible window size for use with a swap list can also be influenced by the other swap lists of other elements in the current mapping group.
Without loss of generality, assume for now that the DCI 900 being designed is not required to be contention free, so the simpler one-dimensional window, πD[(nj+t)]±wd in u is in use. Also assume that the current mapping group only has one element, (nj+t), and that πD[(nj+t)] has been placed into a restricted zone of (nj+t). In this example the local swap list, Lswap(πD[(nj+t)]), will need to be built. The local swap list is built by starting with the smallest window size possible. The smallest window size possible is influenced by the restricted zone in which the coded bit position, (nj+t), has been placed, i.e., the restricted zone of (nj+t) around πD[(nj+t)]. In this example, suppose that the restricted zone into which (nj+t) has been placed can be defined as the range [πD[(nj+t)]−rz1, πD[(nj+t)]+rz2], where rz1 and rz2 are parameters that define the restricted zone edges relative to the placed position, πD[(nj+t)].
Continuing with this example, and focusing on the increasing direction in u, there will be a neighboring bit position at πD[(nj2+t)]=πD[(nj+t)]+rz2+1. The action 1020 will next check to determine whether πD[(nj+t)] is in a restricted zone of bit position (nj2+t). If πD[(nj+t)] is not in a restricted zone of bit position (nj2+t), then πD[(nj2+t)] can be added to the swap list Lswap(πD[(nj+t)]). This is because, if the positions πD[(nj+t)] and πD[(nj2+t)] are swapped in u, then after the swap, πD[(nj+t)] will have moved out of its restricted zone and πD[(nj2+t)] will still be outside of its restricted zones. The local swap list Lswap(πD[(nj+t)]) can thus be built in this way in both the increasing and decreasing directions in u (or in a two-dimensional indexing area in U). The entire swap list, Lswap(πD[(nj+t)]), need not be built all at once, but can be expanded as needed to include more elements. The idea is to start with the closest elements in u and to grow the list as needed. If another restricted zone of (nj+t) is encountered while expanding outward in any direction from the center position, πD[(nj+t)], those points are skipped over to one point beyond the distant edge of the newly encountered restricted zone.
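By way of a non-limiting illustration, the one-dimensional swap list construction described above can be sketched in Python as follows. The data layout is an assumption made only for this sketch: `occupant[p]` is taken to hold the coded bit index currently placed at position p of u, and `zones[b]` is taken to hold the restricted zones of coded bit b as inclusive (low, high) position ranges. The function names are hypothetical.

```python
def in_any_zone(pos, zone_list):
    """True if position pos lies inside any inclusive (low, high) range."""
    return any(lo <= pos <= hi for lo, hi in zone_list)


def build_swap_list(bit, pos, occupant, zones, wd):
    """Collect candidate swap positions within +/- wd of pos, closest first.

    A candidate is kept only if (a) it lies outside every restricted zone of
    `bit`, and (b) pos lies outside every restricted zone of the bit that
    currently occupies the candidate position.
    """
    K = len(occupant)
    swap_list = []
    for d in range(1, wd + 1):               # grow outward from pos
        for cand in (pos + d, pos - d):      # increasing, then decreasing
            if not 0 <= cand < K:
                continue
            if in_any_zone(cand, zones.get(bit, [])):
                continue                     # still restricted for `bit`
            other = occupant[cand]
            if in_any_zone(pos, zones.get(other, [])):
                continue                     # swap would violate `other`
            swap_list.append(cand)
    return swap_list


# Assumed toy data: 12 positions, bit b placed at position b.
occupant = list(range(12))
zones = {5: [(3, 7)], 8: [(4, 6)]}           # hypothetical restricted zones
lswap = build_swap_list(5, 5, occupant, zones, wd=4)
```

In this toy case position 8 is rejected because moving its occupant to position 5 would place that occupant in its own restricted zone, while positions 2, 9, and 1 are retained as candidates. A later validity check against third coded bit positions would still be needed before any candidate is actually used.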
Once one or more elements have been added to Lswap(πD[(nj+t)]) in the action 1020, control next passes to an action 1025. Continuing with the simple example where there is only one element, (nj+t), in the mapping group, the action 1025 will typically start with the closest element of Lswap(πD[(nj+t)]) and analyze whether this swap is a valid swap. A swap is said to be valid if it swaps πD[(nj+t)] with πD[(nj2+t)] so as to eliminate the constraint violation in (nj+t) without introducing any new constraint violation associated with any (one or more) third coded bit position(s), πD[(nj3+t)]. To ensure the swap is valid, a check is first made by scanning each coded bit position in u in the vicinities of πD[(nj+t)]±wd and πD[(nj2+t)]±wd and identifying any placed coded bit position in u, πD[(nj3+t)], that is associated with a respective coded bit position (nj3+t) whose position vectors contain any coded bits from completed codewords associated with the codeword positions cj and/or cj2. If any such coded bit positions πD[(nj3+t)] are found, then a further check is made to determine whether the proposed swap would cause a constraint violation associated with coded bit position (nj3+t) to occur. If no such placements πD[(nj3+t)] in the local vicinity are found, the swap is determined to be valid, and the swap can be made or annotated in the list as a valid potential swap for later use. If the swap is not valid, then additional positions in the local swap list can be checked, or control can pass back to the action 1020 to identify more elements to add to the swap list, after which the action 1025 is repeated, looping in this way until at least one valid swap is found. Once one or more valid swaps are found, the counter PLACED is incremented and control passes to an action 1030.
As discussed above, in some cases a mapping group with Δ>1 element is selected in the action 1015. In such cases, the above process is carried out, but by additionally observing the interactions between making multiple swaps. For example, if four elements are in the mapping group, it could turn out that several different valid swaps could have been made, but a particular valid swap caused a problem later. Hence computer-chess logic (“look ahead logic”) is used to analyze a set of potential valid swaps (“moves”) several moves into the future. Such added logic of looking into a trellis of paths containing several moves into the future can be used to find a set of potential valid swaps that avoid having an earlier swap cause a problem for a later swap. In fact, this type of optimized forward looking trellis logic can be used with a mapping group that includes all of the bit positions that have constraint violations.
At times, an invalid swap may purposely be made. An invalid swap is made in order to be able to chain swaps. Chained swaps are used when the distance of the swap is too large for the underlying hardware, so that an actual swap is implemented as two sub-swaps, selected such that after the two sub-swaps there will be no constraint violations.
The method 1000 can also be used in conjunction with the method 600 that designs a CI-3 or CI-4 or mixed CI-3/CI-4 random constrained interleaver. If a random interleaver is being designed as per the method 600, then the method 1000 can have its action 1005 call the method 600 instead of the method 800. The method 1000 runs similarly as described above, but all of the swaps can be carried out off line and used to correct the random interleaver's permutation function so that all the constraints are enforced. In such embodiments, no separate constraint enforcer permutation 910 is needed because it is incorporated directly into the random interleaver's permutation function.
The receive metrics calculator 1105 calculates a set of input signal metrics. When optional rate matching is in use, the receive metrics calculator 1105 inserts dummy signal metrics to account for the bits that have been deleted due to the rate matching operation. Typically the signal metrics that are re-inserted based upon the puncture pattern generator 626 are set to zero, although other values could optionally be used. The receive metrics calculator 1105 couples these inverse rate matched receive signal metrics to a gamma and branch metrics initialization unit 1116.
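By way of a non-limiting illustration, the inverse rate matching step described above can be sketched in Python as follows. The function name and the representation of the puncture pattern as a list of booleans (True where a coded bit was transmitted, False where it was deleted) are assumptions made only for this sketch.

```python
def inverse_rate_match(received, keep_mask, fill=0.0):
    """Re-insert dummy metrics at the positions deleted by rate matching.

    received  : metrics for the transmitted bits, in transmission order
    keep_mask : one boolean per coded bit, True if the bit was transmitted
    fill      : dummy metric inserted for each deleted bit (zero by default)
    """
    if sum(keep_mask) != len(received):
        raise ValueError("puncture pattern does not match received length")
    it = iter(received)
    return [next(it) if keep else fill for keep in keep_mask]


# Assumed toy pattern: coded bits 1 and 4 were deleted in the transmitter.
metrics = inverse_rate_match([1.5, -0.5, 2.0], [True, False, True, True, False])
```

A zero metric carries no preference between the two bit values, which is why zero is the typical fill value.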
The gamma metrics initialization unit 1116 is configured to initialize the gamma metrics, typically by filling a gamma memory using the calculated received signal metrics coupled from the receive metrics calculator 1105. The gamma memory is coupled to (or built into as an integral part of) an inner code trellis SISO half iteration block 1117. The inner code trellis SISO half iteration block 1117 generally uses the initial gamma values to perform forward and backward state metrics recursions used to support trellis decoding operations used in SISO decoding. After the first iteration, during each inner code trellis SISO half iteration 1117, the gamma values are updated and then the forward and backward state recursions (forward alpha and backward beta recursions) are carried out to update the alphas and the betas in block 1117. To do these updates, a set of a-priori extrinsic LLR values are read from a 2D memory array, 1160. An “a priori extrinsic LLR value” refers to an extrinsic LLR value before an update occurs and an “a posteriori extrinsic LLR value” refers to an extrinsic LLR value after an update occurs. Hence depending on exactly where in the SISO iteration the SISO decoder 1100 is processing and from which point in the SISO decoder algorithms one is looking, a given extrinsic LLR in the 2D memory 1160 keeps switching from being an a-priori extrinsic LLR value to an a-posteriori extrinsic LLR value, and back to an a-priori extrinsic LLR value and so on.
The order in which the a-priori extrinsic LLR values are read into and processed by the block 1117 is determined by the L=1 deterministic constrained interleaver (DCI) or random constrained interleaver (RCI) address generator 1161. The address generator 1161 makes sure the a-priori extrinsic LLR values are sent to block 1117 in L=1 DCI or RCI interleaved order. After the inner code trellis SISO half iteration is complete, a set of updated (a posteriori) extrinsic LLR values are written back into the 2D memory array 1160 using the same interleaved ordering as discussed above, i.e., ordering determined by the DCI or RCI ordering used in the address generator 1161. The 2D memory 1160 can be viewed as holding the U matrix as described above and can be stored in the physical two-dimensional memory array 710 as discussed in connection with
It can be noted that the 2D memory array block 1160 appears twice in
As is common practice, a stopping criterion is used to stop iterations. Although not shown, the stopping criterion may be implemented, for example, in block 1126 to indicate when the total LLRs have converged. To do this, one or more total a-posteriori LLRs are checked for convergence. If the convergence criterion is not met, the a-priori extrinsic LLR received from the memory 1160 is subtracted from this total a-posteriori LLR to produce the a-posteriori extrinsic LLR that is written back to the 2D memory array 1160 so that SISO iterations can continue. In this exemplary embodiment, if the convergence criterion is met, a control signal is generated to the 2D memory array 1160's control logic, the block 1117 writes the total LLRs into the memory array 1160, and the control logic of the memory array 1160 causes the converged data values to be output from the SISO decoder 1100. Alternatively, a fixed number of iterations may be used as the stopping criterion in the above description.
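By way of a non-limiting illustration, the write-back decision described above can be sketched in Python as follows. The text does not specify the convergence criterion, so the test used here (every total LLR magnitude meeting an assumed threshold) is only a placeholder, and the function name and arguments are hypothetical.

```python
def write_back_values(total_llr, apriori_ext, threshold):
    """Decide what is written back to the extrinsic LLR memory.

    Returns (converged, values). If the assumed convergence test passes,
    the total LLRs themselves are returned for output. Otherwise the
    a-posteriori extrinsic LLRs (total minus a-priori extrinsic) are
    returned so that SISO iterations can continue.
    """
    # Placeholder criterion (an assumption): all magnitudes reach threshold.
    converged = all(abs(t) >= threshold for t in total_llr)
    if converged:
        return True, list(total_llr)
    extrinsic = [t - a for t, a in zip(total_llr, apriori_ext)]
    return False, extrinsic


apriori = [1.0, -2.0, 0.5]
done, out = write_back_values([6.0, -5.0, 7.5], apriori, threshold=4.0)
cont, ext = write_back_values([6.0, -1.0, 7.5], apriori, threshold=4.0)
```

A fixed iteration count, as mentioned above, could be combined with such a test by stopping at whichever condition is reached first.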
The memory architecture 700 can be used to support the memory accesses needed to support CTBC code SISO decoding. A discussion of the operation of the memory system 700 is provided in connection with
Referring now to
Before describing
With the above result in mind, consider the hardware and computational complexity needed to implement each of the half iterations of a SISO iteration to decode the 4G LTE CTC in the Studer reference. The Studer reference uses a radix-2 and a radix-4 Max-log-Map BCJR algorithm. The 4G LTE trellis code is an eight-state trellis code. Thus decoding such a trellis code requires performing eight gamma branch metrics calculations, one for each of the 4G LTE CTC's eight states, plus eight forward alpha state metrics recursions, and eight backward beta state metrics recursions, plus an LLR update to update the extrinsic information (3×8+1=25 vector operations of length Ksub, where Ksub is the length of each of the N=8 trellis subsequences). Therefore the order of complexity for decoding each of the N=8 trellis subsequences in each of the two half iterations used in the Studer reference's ASIC to perform the Max-log-Map BCJR algorithm is given by O(25 Ksub). As can be seen from equations (2), (3), and (4) in the Studer reference, each of these vector operations requires around 6 or so additions and compare-select-max type operations on average. Hence in terms of actual operations performed by adders and/or compare circuits, a closer estimate of complexity would be O(150 Ksub). Additionally some LLR based arithmetic is needed (scalar operations that do not add into the order of complexity calculations).
Next consider the complexity of the first half iteration 1117 of the CTBC code whose inner code has been selected to be the rate-1 accumulator. The rate-1 accumulator does not require an 8-state trellis decoding operation but instead requires a 2-state trellis decoding operation. Changing the number “eight” to the number “two” in the above analysis gives a complexity of O(6 Ksub) if the Max-log-Map BCJR algorithm is to be used. However, as mentioned above in connection with the Li reference, for the special case of decoding the rate-1 accumulator, the Max-log-Map BCJR algorithm used in the Studer reference is equivalent to the Min-sum algorithm, and the complexity of the Min-sum algorithm is roughly ⅛ that of the Max-log-Map BCJR algorithm operating on the same rate-1 accumulator. An inspection of tables I and II of the Li reference reveals that the comparative complexity to implement the Min-sum algorithm is O(3 Ksub). That is, the first half of the SISO iteration 1117 requires roughly (3/150)×100=2% as much work as is required to implement the first half iteration of the best current 4G hardware that relies on 8-state Turbo decoding.
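By way of a non-limiting illustration, the per-subsequence operation counts quoted above can be reproduced with the following short Python calculation. The variable names are hypothetical, and the figures are the approximate counts stated in the text (per unit of Ksub), not measured values.

```python
# Eight-state Max-log-Map BCJR, per unit of Ksub.
STATES_LTE = 8
PER_STATE_RECURSIONS = 3                    # gamma, alpha, and beta
vector_ops = PER_STATE_RECURSIONS * STATES_LTE + 1   # plus one LLR update
OPS_PER_VECTOR_OP = 6                       # approx. add/compare-select ops
maxlog_ops = vector_ops * OPS_PER_VECTOR_OP

# Min-sum decoding of the rate-1 accumulator, per unit of Ksub.
MINSUM_OPS = 3

ratio_percent = 100 * MINSUM_OPS / maxlog_ops
```

With these figures, `vector_ops` evaluates to 25, `maxlog_ops` to 150, and `ratio_percent` to 2.0, matching the O(25 Ksub), O(150 Ksub), and 2% values given above.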
When the OBC is selected to be a simple (8,4) Hamming code, this code will need to be soft decoded during the second half of the SISO iteration. As discussed in connection with
The first half iteration 1117 can thus be implemented using about 2% as much computational complexity, while the second half iteration of the SISO iteration can be implemented using about 23% as much computational complexity. It can also be noted that the hardware and memory requirements to implement both the first and second half iterations drop considerably. The Min-sum algorithm requires three recursions, but not on a per-state basis (see table I of the Li reference). Hence the state-metrics related memory requirement is reduced by roughly a factor of eight as well. As shown in
Next consider the receiver/decoder 1200 in further detail. A received signal is received and demodulated prior to being processed in a receiver metrics calculation block 1205. The block 1205 is typically preceded by a received signal demodulator to demodulate the received signal that has been modulated by a signal mapper that can include rate matching and spatial modulation components as are known in the art or as discussed in further detail below in the context of additional aspects of the present invention. The block 1205 can reside off chip from the rest of the decoder 1200, and can instead reside in one or more separate front-end circuits/chips designed to demodulate and preprocess the received signal.
The block 1205 computes a set of received signal metrics based upon the demodulated received signal. In embodiments where signal preprocessing includes rate matching, the receiver metrics calculation unit 1205 typically inserts a signal metric into the received signal metrics stream to compensate for a signal value that was deleted due to rate matching in the transmitter. In a preferred embodiment, the inserted signal metrics are set to zero, although other values could alternatively be used. To avoid cumbersome language, it is to be understood that from here forward, when the term “receive metrics” is used in describing the receiver/decoder 1200, this can refer to the inverse rate matched received signal metrics.
The receive metrics calculation block 1205 couples its output receive metrics to a receive metrics RAM block 1210. Associated with the receive metrics RAM 1210 is a gamma branch metrics RAM 1220. The receive metrics RAM 1210 and the gamma branch metrics RAM 1220 may be merged into one memory embodiment as the receive metrics are typically used to initialize the gamma metrics. The receive metrics/gamma metrics RAM 1220 typically holds sets of gamma values, alpha values, and beta values. The output of the RAM 1210/1220 is coupled to an M-level parallel gamma-branch metrics calculation engine 1215. In general, the blocks 1210 and 1220 may be implemented as distributed sets of sub-memories that are distributed and tightly coupled with (i.e., existing within) a set of specialized arithmetic-logic processing circuits within the M-level parallel gamma-branch metrics calculation engine 1215. For example, in the CTBC code example discussed in connection with the CTC currently in use in the 4G LTE standard, there would preferably be N=8 processing clusters inside the M-level parallel gamma-branch metrics calculation engine 1215. Each of these processing clusters would preferably contain three sets of arithmetic-logic processing circuits, one to update a set of alpha values (forward branch metric recursion), another to update a set of beta metrics (backward branch metric recursion) and another to update a gamma value (gamma update recursion). Given that only a two-state trellis typically needs to be decoded by the M-level parallel gamma-branch metrics calculation engine 1215, some of these hardware units could be eliminated. For example, one functional unit could be used to compute both the forward and the backward state metrics. If the Min-sum algorithm is used as discussed above to decode the rate-1 accumulator, even more reductions are possible. With the Min-sum algorithm, the work required is as if there were only one state in the trellis.
Hence significantly lower complexity hardware can be designed. See Table I of the Li reference for a comparison.
Therefore the block 1220 would preferably include small RAM blocks collocated with the alpha, beta, and gamma updating hardware. That is, the M-level parallel gamma-branch metrics calculation engine 1215 is preferably embodied using M sets of parallel processing circuits tightly coupled and integrated with M different sub-memory modules that make up the memory blocks 1210/1220. Methods of initializing the alpha, beta, and gamma values used in the various forms of the BCJR algorithm of each subsequence are well known to those of skill in the art. The receive metrics are used to initialize the gamma metrics, and the alpha and beta metrics are initialized in a selected way as is known to those of skill in the art for parallel SISO decoding of Turbo codes. For example, see the Studer and Roth references to understand some techniques that would be known to one of ordinary skill in the art as to how to initialize the parallel trellis subsequences. Many algorithms can be used to perform the soft trellis decoding on the parallel subsequences to decode the IRCC using M-level parallelism and finer grain sub-parallelism. That is, the blocks 1210, 1215, 1220 and 1225 can be configured to compute the operations of the first half SISO iteration as computed in block 1117 of the CTBC decoding algorithm of
The decoder 1200 uses the 2D-array extrinsic LLR RAM 1240 to hold the updated extrinsic LLR values similar to the 2D memory array 1160 of
Note that the M×M interconnect and constraint enforcer permutations block 1250 couples (optionally using the 2M lane bidirectional data busses as described above) to the 2D-Array extrinsic LLR RAM 1240 and also to a processing array unit 1235 that includes both the M-level parallel extrinsic LLR trellis update calculation engine 1225 and an M-level parallel extrinsic LLR soft block decode update calculation engine 1230. In ASIC designs, certain functional units that are used in trellis decoding are reconfigured or controlled by a different set of program instructions or control signals to switch over to a second mode where they become engaged in block decoding SISO iterations as described below. That is, blocks 1225 and 1230 are inside a general block 1235 in order to indicate that certain hardware resources like functional units can be shared in a time division multiplexed fashion during the first and second halves of the SISO iteration. Also, the reason that the optional LSBs of the current extrinsic LLR address are shown as coming into the processor block 1235 is to indicate that the processors themselves may be programmed or configured to perform constraint enforcement permutation operations that so far have been described as occurring in the block 1250. This LSBs path could optionally carry additional information besides the LSBs that relates to the interleaving function. Using this data/control path, the processors could be controlled to read/write data elements stored in a local register bank in a predetermined order in order to enforce a set of pre-defined interleaver constraints. A state machine generating control signals in the block 1235 could cause extrinsic LLR values to be read into a local buffer accessed by a functional unit, and that functional unit would process those buffered elements in the prescribed order in accordance with a set of program instructions or hardware control signals.
Again referring to the M-level parallel extrinsic LLR updating engine 1235, each of the M internal processing engines in the M-level parallel extrinsic LLR soft block decode update calculation engine 1230 may use one or more parallel functional units to also optimally soft decode a specified block code such as an (8,4) Hamming code to update an extrinsic LLR value in the second half-SISO iteration. The optimal soft block decoding update is similar to the type of update that would be carried out in a half iteration of a SISO decoder configured to decode a turbo product code (TPC) (also known as block turbo code (BTC)). As discussed in connection with
The exemplary short (8,4) Hamming code can be optimally decoded using the approach that is well known to those of skill in the art and which is outlined in C. Xu, Y-C Liang and W. S. Leon, “A low complexity decoding algorithm for turbo product codes,” IEEE Radio and Wireless Symposium, pp. 209-212, January 2007 (“the Xu reference” herein). Longer block codes can also be soft decoded according to the algorithms well known to those of skill in the art as taught in R. M. Pyndiah, “Near-optimum decoding of product codes: Block Turbo Codes,” IEEE Trans. Comm. Vol. 46, No. 8, August 1998, pp. 1003-1010 (“the Pyndiah reference” herein). Depending on the length of the codeword used and other implementational factors, the M-level OBC SISO decoder 1230 can be configured to implement various well known forms of the above approaches for soft decoding of block codes, for example, the Chase-Pyndiah algorithm (also referred to as the Pyndiah algorithm), the low complexity Chase-Pyndiah algorithm, the OSD algorithm and its low complexity variations, the sum-product algorithm (SPA), or any similar soft decoding algorithm for decoding of block codes, as are well known in the technical literature.
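By way of a non-limiting illustration, soft decoding of a short (8,4) code by exhaustive search over its 16 codewords can be sketched in Python as follows. This sketch uses a max-log approximation and is not asserted to be the algorithm of the Xu reference. The generator matrix G (one common systematic form of the extended Hamming code), the LLR sign convention (positive values favor bit 0), and all function names are assumptions made only for this sketch.

```python
from itertools import product

# Assumed systematic generator matrix of an (8,4) extended Hamming code.
G = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]


def build_codebook():
    """Enumerate all 16 codewords as XOR combinations of the rows of G."""
    words = []
    for msg in product((0, 1), repeat=4):
        word = [0] * 8
        for bit, row in zip(msg, G):
            if bit:
                word = [w ^ r for w, r in zip(word, row)]
        words.append(tuple(word))
    return words


CODEBOOK = build_codebook()


def soft_decode(llr_in):
    """Max-log soft decode: returns (total LLRs, extrinsic LLRs)."""
    # One metric per codeword: correlation of the +/-1 codeword with the LLRs.
    metrics = [0.5 * sum((1 - 2 * c) * l for c, l in zip(w, llr_in))
               for w in CODEBOOK]
    total, extrinsic = [], []
    for i in range(8):
        best0 = max(m for m, w in zip(metrics, CODEBOOK) if w[i] == 0)
        best1 = max(m for m, w in zip(metrics, CODEBOOK) if w[i] == 1)
        total.append(best0 - best1)
        extrinsic.append(best0 - best1 - llr_in[i])   # remove the input LLR
    return total, extrinsic


# Assumed toy input: the last bit is weakly received with the wrong sign.
total, ext = soft_decode([4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, -1.0])
```

In this toy case every total LLR is positive, so the weakly received last bit is corrected toward bit 0. Only 16 codeword metrics need to be held at a time, and the remaining work consists of additions, subtractions, and compare-and-select-max operations.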
In operation, the receiver and decoder 1200 performs as described above and performs the same CTBC code SISO iterations as described in detail in connection with
Referring now to
Also, as will be seen, because the memory and logic design of the functional unit 1300 is simple, a more powerful functional unit could be created by chaining five or so such functional units together into a parallel functional unit embodiment whereby each parallel functional unit can be loaded and unloaded in a circular buffer ordering. By the time the circle has completed in the circular buffer ordering, as one sub-functional unit 1300 is loaded, the next functional unit (last functional unit loaded mod 5) has its results ready to read out. This way, with very little hardware, the block decoding portion of the SISO iteration could be balanced with the IRCC decoding speed.
The design of methods and circuits to decode short codes such as the (8,4) Hamming code is well known. Such techniques can readily be used to design highly efficient soft decoders to decode one or more codewords of the OBC in parallel for use in each parallel processing channel of the M-level parallel LLR soft block code update calculation engine 1230 in
In
The extrinsic LLR input/output buffer 1305 is a very small RAM that only uses 16 RAM/register locations (the microsequencer can be configured so that only 8 RAM/register locations are needed as will become apparent below). The extrinsic LLR input/output buffer 1305 is coupled to a very simple arithmetic logic unit (ALU) that preferably performs, for example, additions, subtractions, and compare-and-select-max instructions. A predetermined pattern generator 1315 is controllably coupled to the ALU 1310. The ALU executes a small predetermined set of instructions to perform addition, subtraction, and compare-and-select-max instructions, preferably using signed-number fixed point arithmetic. The ALU executes these instructions in response to the signals provided by the pattern generator 1315. The output of the ALU 1310 is coupled to a dual accumulator/result register 1320. The dual accumulator/result register 1320 includes an A-accumulator register and a B-accumulator register. The A- and B-accumulator registers are more generally A- and B-result registers that can generally hold any intermediate results needed to be held in order to support computations. Another small RAM is the codeword metrics memory 1325. Because the (8,4) Hamming code only has 16 possible codewords, the codeword metrics memory 1325 only requires 16 memory locations (i.e., registers). As can be seen from
The functional unit 1300 also includes its own equivalent of a program memory, but this program memory is preferably implemented as a program logic microsequencer 1330. In some embodiments, some or all of this program logic microsequencer 1330 can be shared by all M of the functional units 1300 since most of the time they are executing exactly the same sequence of operations. In many embodiments, little or no instruction decoding is needed because the microsequencer 1330 can be configured to act as a pattern generator state machine that sequences through a set of states whose state outputs are a set of control signals that cause the different registers to be read and written in a specified order as discussed in more detail in connection with
To understand the operation of the (8,4) Hamming code soft decode functional unit 1300, consider the method/process 1400 of
Next at 1410 each of the eight LLR values is sent to the ALU in a circular buffer order (i.e., LLR1, LLR2, . . . LLR8, LLR1, LLR2, . . . LLR8, . . . ) until all eight extrinsic LLRs have been cycled out to the ALU sixteen times. Each time a set of the eight stored LLR values is received in sequence at the ALU 1310, the pattern generator 1315 generates a respective sequence of eight bits corresponding to a respective one of the sixteen possible (8,4) Hamming codewords. Before the first set of the eight extrinsic LLR values is sent to the ALU 1310, the A-accumulator of the accumulator/result register 1320 is set to zero. Next the eight extrinsic LLRs stored in the extrinsic LLR input/output buffer 1305 are sequenced in order to the ALU 1310. As each ith extrinsic LLR, for i=1, . . . , 8, is received at the ALU 1310, the corresponding ith bit of the first Hamming codeword is output from the pattern generator. If the ith bit of the first Hamming codeword is a one, the corresponding LLR is added by the ALU 1310 to the A-accumulator of the block 1320 and the result of the addition is stored back into the A-accumulator. If the ith bit of the first Hamming codeword is a zero, the LLR is subtracted from the A-accumulator by the ALU 1310, and the result of the subtraction is stored back into the A-accumulator. After all eight LLRs have been processed this way, the result of the A-accumulator is written into the first position in the codeword metrics memory 1325. This process, including the initial clearing of the accumulator, is then repeated for j=2, . . . , 16, once for each of the remaining fifteen of the sixteen unique 8-bit codewords of the (8,4) Hamming code. That is, as the above periodic sequence of extrinsic LLRs is clocked in a circular buffer fashion out of the extrinsic LLR input/output buffer 1305, the pattern generator, in synchronization, clocks out the set of sixteen (8,4) Hamming codewords, and the ALU responds to 1's as add commands and 0's as subtract commands.
The program logic microsequencer 1330 sends out control signals to control the circular-buffer reading order of the extrinsic LLR input/output buffer 1305, and to control the writing of the A-accumulator results to the codeword metrics memory 1325 after the eight extrinsic LLRs are processed this way each of the 16 times.
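The add/subtract accumulation described above can be illustrated in software. The following is only a minimal sketch under stated assumptions: the generator matrix G is one common systematic form of an (8,4) extended Hamming code chosen here for illustration (the specification does not fix a particular generator), and the names G, hamming84_codewords, and codeword_metrics are hypothetical. Plain Python integers stand in for the signed fixed point values handled by the ALU 1310.

```python
from itertools import product

# ASSUMPTION: one systematic generator matrix of an (8,4) extended Hamming
# code, used only to obtain sixteen codewords for this illustration.
G = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]

def hamming84_codewords():
    """Plays the role of the pattern generator: lists all sixteen codewords."""
    words = []
    for msg in product((0, 1), repeat=4):
        cw = tuple(sum(b * row[j] for b, row in zip(msg, G)) % 2 for j in range(8))
        words.append(cw)
    return words

def codeword_metrics(llrs, codewords):
    """For each codeword, add the LLR where the bit is 1 and subtract it where 0."""
    metrics = []
    for cw in codewords:
        acc_a = 0  # accumulator cleared before each pass over the eight LLRs
        for llr, bit in zip(llrs, cw):
            acc_a = acc_a + llr if bit else acc_a - llr
        metrics.append(acc_a)  # written to the next codeword-metric location
    return metrics

CODEWORDS = hamming84_codewords()
llrs = [3, -1, 2, 4, -2, 1, 5, -3]  # hypothetical extrinsic LLR values
metrics = codeword_metrics(llrs, CODEWORDS)
```

In this sketch the sixteen passes over the eight LLRs correspond to the sixteen circular-buffer cycles, and the list metrics plays the role of the sixteen locations of the codeword metrics memory.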
In the process above, if 2-lane bussing and dual ported register files are used inside the functional unit 1300, then the process can be sequenced to ping-pong between the A-accumulator and the B-accumulator so that a result can begin accumulating in the B-accumulator while the A-accumulator is being written out. Such lower level optimizations can be used throughout the decoder 1200 to save clock cycles wherever desired.
The process 1400, generally as carried out in accordance with the program logic microsequencer 1330, next advances to the sub-process 1415 in
Therefore, in accordance with 1415 as enforced by the microsequencer 1330 and the pattern generator 1315 (which in general may be implemented as a part of the microsequencer 1330), a set of total LLRs will be computed. To begin, the A-accumulator and the B-accumulator are set to the most negative number representable by the signed fixed point numbering system used by the ALU 1310. Starting with the first bit position, the eight codeword metrics corresponding to the eight (8,4) Hamming codewords that have a one in the first bit position are sequenced out of the codeword metrics block 1325 and are coupled to the ALU 1310. As each new codeword metric arrives at the ALU 1310, the pattern generator 1315 sends a control signal that causes the ALU to execute a compare-and-select-max instruction, comparing the incoming codeword metric with the contents of the A-accumulator and storing the max value back into the A-accumulator. After this has been performed eight times for all eight of the selected codeword metrics, the A-accumulator will be left with the maximum of the codeword metrics that correspond to codewords that have a one in their first bit position. Next, staying with the first bit position, the eight codeword metrics corresponding to the eight (8,4) Hamming codewords that have a zero in the first bit position are sequenced out of the codeword metrics block 1325 and are coupled to the ALU 1310. As each new codeword metric arrives at the ALU 1310, the pattern generator 1315 sends a control signal that causes the ALU to execute a compare-and-select-max instruction, comparing the incoming codeword metric with the contents of the B-accumulator and storing the max value back into the B-accumulator. After this has been performed eight times for all eight of the selected codeword metrics, the B-accumulator will be left with the maximum of the codeword metrics that correspond to codewords that have a zero in their first bit position.
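The compare-and-select-max passes of 1415 can be sketched as follows. This is an illustrative sketch only: the generator matrix G is an assumed systematic form of an (8,4) extended Hamming code, the 16-bit value used for the "most negative number" is an assumed word length, and the function names are hypothetical. The sketch stops at the two maxima held in the A- and B-accumulators and does not model any later sub-process.

```python
from itertools import product

# ASSUMPTION: illustrative systematic generator matrix of an (8,4) extended
# Hamming code; the specification does not mandate this particular matrix.
G = [
    [1, 0, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1, 1],
    [0, 0, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1, 1, 1, 0],
]

def hamming84_codewords():
    return [
        tuple(sum(b * row[j] for b, row in zip(msg, G)) % 2 for j in range(8))
        for msg in product((0, 1), repeat=4)
    ]

def codeword_metrics(llrs, codewords):
    # Add the LLR where the codeword bit is 1, subtract it where the bit is 0.
    return [sum(l if b else -l for l, b in zip(llrs, cw)) for cw in codewords]

def max_metrics_for_bit(metrics, codewords, pos, most_negative=-(1 << 15)):
    """Return (A, B): max metric over codewords with a 1, then with a 0, at pos."""
    acc_a = most_negative  # ASSUMPTION: 16-bit signed fixed point word
    acc_b = most_negative
    for metric, cw in zip(metrics, codewords):
        if cw[pos] == 1:
            acc_a = max(acc_a, metric)  # compare-and-select-max into A
    for metric, cw in zip(metrics, codewords):
        if cw[pos] == 0:
            acc_b = max(acc_b, metric)  # compare-and-select-max into B
    return acc_a, acc_b

CODEWORDS = hamming84_codewords()
METRICS = codeword_metrics([5] * 8, CODEWORDS)  # hypothetical LLR inputs
per_bit = [max_metrics_for_bit(METRICS, CODEWORDS, pos) for pos in range(8)]
```

With eight equal positive LLRs, the all-ones codeword dominates the A-accumulator at every bit position, while the best codeword having a zero at that position is one of the weight-four codewords.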
Next in accordance with the sub-process 1420 of
Next in accordance with the sub-process 1425 of
The sub-processes 1415, 1420, and 1425 have only been described for the first bit position. However, the same sub-processes 1415, 1420, and 1425 are also sequenced to be carried out for the remaining bit positions, i=2, . . . , 8. The a-posteriori extrinsic LLR values are sent back to the 2D memory 1240 to be used as a-priori extrinsic LLR values in the first half of the next SISO iteration. Additionally, the total LLR values may be used as a part of a stopping criterion. As SISO iterations continue, the total LLR values converge to the (8,4) Hamming codewords. The four parity bits can be discarded and the four data bits from each word correspond to the output sequence of the SISO decoder 1200.
In an alternative embodiment, one microsequencer is used. The K functional units 1300 are sequenced to generate K answers in parallel instead of using the pipelined mod 5 approach. Also, more parallelism can be extracted at the 1300 level; for example, 16 ALUs can be configured to operate in parallel. That is, both higher level parallelism and lower level parallelism within the functional units can be extracted using single instruction multiple data or multiple instruction multiple data control.
CTBC codes can be designed to provide both high MHD and high interleaver gain. When a CTBC code is transmitted through a Gaussian channel using BPSK signaling with constellation points at ±a or using Gray coded QPSK signaling with constellation points at {±a, ±a}, the CTBC code's MHD = d_t translates directly to a Minimum Squared Euclidean distance (MSED) of D_min^2 = 4a^2 d_t. When this same CTBC code is transmitted through a Gaussian channel using a larger Gray coded signal constellation where the minimum squared Euclidean distance between two constellation points is 4a^2, then the CTBC code's MHD = d_t also translates directly to an MSED of D_min^2 = 4a^2 d_t.
Bit interleaved coded modulation (BICM) as is known in the art can be used to map the coded bits of an underlying code via an interleaver in such a way as to spread neighboring coded bits onto different symbols. The BICM interleaver is typically selected to be a uniform interleaver. BICM is known to perform better in fading channels because it can spread the neighboring coded bits of the underlying code onto different symbols.
“Constrained interleaved coded modulation” (CICM) is developed herein in accordance with an aspect of the present invention to map CTBC codes onto various sized signal constellations. As can be seen from the CI-3 and CI-4 design approaches, the complete set of low weight error sequences that dominate error performance (e.g., CTBC codewords with weights d_t ≤ d ≤ d_f, which correspond to the sequences i_P in the tables P(d_f ≥ d ≥ d_t)) can be readily identified and enumerated. This allows CICM mapping rules to be designed to provide MSED advantages similar to Ungerboeck's trellis coded modulation (TCM). Also, similar to BICM, the CICM interleaver is preferably designed to spread the non-zero coded bits of the identified low weight CTBC codewords onto different symbols (i.e., constellation points) transmitted during different symbol intervals, and this leads to improved performance over fading channels.
CICM can be viewed as a two step mapping process. The first step involves identifying a constellation mapping rule to map subsets of m coded bits onto constellation points. The coding policy preferably assigns high distances between constellation points that differ by a single bit and progressively smaller distances between constellation points that differ by more bits, up to m bits. In a sense, this is the opposite of Gray coding, which assigns low distances between constellation points that differ by a single bit and progressively larger distances between constellation points that differ by more bits, up to m bits. For this reason, the constellation mapping policies discussed herein for use with CICM are called “Reverse Gray Coded” (RGC) constellation mapping policies. The second step involves determining a CICM permutation function (interleaver rule) for use within the CICM mapper. If the frame size is big enough, the CICM interleaver can be designed to spread each possible pattern of d_t non-zero coded bits of each of the identified lowest weight (weight d_t) CTBC codewords onto d_t different symbols. Also, the permutation can be designed to ensure that changes in the values of each of these d_t non-zero coded bits correspond to respective large Euclidean distances on the constellation. Thus a “CICM mapping rule” includes a CICM permutation rule followed by a selected constellation mapping rule. A “CICM signal mapper” includes a CICM permutation Γ (a different type of constrained interleaver as compared to the CI-3 or CI-4 type constrained interleavers, π) followed by a selected constellation mapper such as an RGC constellation mapper for a given 2^m-ary signal constellation.
To better understand the mapping rule, consider the QPSK example of
The minimum symbol Hamming distance, d_s, is the minimum number of symbols onto which the non-zero coded bits of any coded sequence, v, will be mapped. For example, if each of the d_t non-zero coded bits of a weight d_t CTBC codeword is mapped onto a separate respective symbol, then d_s = d_t. The maximum achievable d_s, denoted d_{s,max}, results when all the non-zero coded bits in every weight d_t sequence of v are placed into different symbol intervals, so that d_{s,max} = d_t. On the other hand, if the size of the signal constellation is M = 2^m, the lowest possible d_s, i.e., d_{s,min}, results if a coded sequence with weight d_t is allowed to feed all its d_t bits into only ⌈d_t/m⌉ m-bit symbols. In this worst case, the weight d_t sequence of v completely fills ⌈d_t/m⌉ − 1 symbols with its non-zero coded bits and feeds any of its remaining bits into one other symbol. Hence, d_{s,min} = ⌈d_t/m⌉, and the achievable d_s satisfies ⌈d_t/m⌉ ≤ d_s ≤ d_t. The CICM interleaver rule is designed to achieve the highest possible target value of d_s, denoted as d_{s,t}, subject to the constellation size, M, and the frame size, K.
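The bounds on the symbol Hamming distance can be checked numerically with a small sketch. The function names, the frame size K=8, and the two toy bit-to-symbol assignments below are hypothetical and are chosen only to show the lower bound ⌈d_t/m⌉ and the upper bound d_t being reached.

```python
import math

def symbol_hamming_weight(positions, symbol_of):
    """Number of distinct symbols hit by the non-zero coded bit positions."""
    return len({symbol_of[p] for p in positions})

def ds_bounds(d_t, m):
    """Return (d_s_min, d_s_max) = (ceil(d_t / m), d_t)."""
    return math.ceil(d_t / m), d_t

K, m = 8, 2                      # hypothetical frame size and bits per symbol
positions = [0, 1, 2, 3]         # non-zero bits of a hypothetical weight-4 sequence

# Packing neighbouring bits into the same symbol gives the worst case.
packed = [i // m for i in range(K)]
# Spreading neighbouring bits over different symbols gives the best case.
spread = [i % (K // m) for i in range(K)]

lo, hi = ds_bounds(len(positions), m)
ds_packed = symbol_hamming_weight(positions, packed)
ds_spread = symbol_hamming_weight(positions, spread)
```

Here the packed assignment yields the minimum of two symbols and the spread assignment yields the maximum of four, which is the behavior a CICM interleaver rule aims for on the identified low weight sequences.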
In order to achieve any target symbol Hamming distance d_{s,t}, in addition to observing only weight d_t sequences of v, it is also necessary to ensure that every higher weight sequence of v also results in a symbol Hamming weight of at least d_{s,t}. Specifically, if the size of the signal constellation is M = 2^m, to achieve a symbol Hamming distance of ⌈d_t/m⌉ < d_{s,t} ≤ d_t, it is necessary that all valid CTBC codewords, v, with Hamming weight up to d_w = m(d_{s,t} − 1) result in a symbol Hamming distance of at least d_{s,t}. Because every symbol is formed by m bits, a coded sequence v with weight d > d_w = m(d_{s,t} − 1) is guaranteed to feed its bits into at least d_{s,t} symbols. Therefore, to achieve the target value d_{s,t}, the non-zero coded bits of all low weight CTBC codewords with weight up to d_w need to be placed in such a way as to achieve the target symbol Hamming distance, d_{s,t}. All CTBC codewords with weight higher than d_w will thus be guaranteed to have a symbol Hamming distance greater than or equal to d_{s,t}.
Next consider how to achieve a target MSED. If the minimum squared Euclidean distance between any two constellation points is 4a^2, since a symbol is formed by m bits, every subset of m bits of v contributes at least 4a^2 to the squared Euclidean distance of that sequence, and thus any weight d sequence of v is guaranteed to have a squared Euclidean distance of at least 4a^2 ⌈d/m⌉. Therefore, at the sequence level, in order to maintain an MSED of D_min^2, it is necessary to make sure that all sequences of v with Hamming weight from d_t up to d_e = ⌊m D_min^2/(4a^2)⌋ achieve the selected MSED of D_min^2. Here “d_e” denotes the Hamming weight that is needed to meet the target MSED, and the subscript e denotes Euclidean. In order to ensure that the CICM mapping rule achieves a target symbol Hamming distance and a target MSED, it is necessary to consider all sequences v with weights starting from d_t and up to d_f = max{d_w, d_e}. Here “d_f” denotes the final Hamming weight that is needed to meet both the target minimum symbol Hamming distance d_{s,t} and the target MSED D_min^2, as described above in connection with d_w and d_e. In most practical cases, d_e > d_w so that d_f = d_e.
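The three weight limits can be computed directly from the formulas above. The following is a minimal sketch; the function name design_weights and the numeric inputs (m=2, a target symbol Hamming distance of 6, and a normalized target MSED of 12) are hypothetical values chosen only for illustration.

```python
import math

def design_weights(m, ds_target, dmin_sq_normalized):
    """Return (d_w, d_e, d_f) for m bits per symbol.

    dmin_sq_normalized is the target MSED divided by 4a^2, so that
    d_e = floor(m * D_min^2 / (4 a^2)).
    """
    d_w = m * (ds_target - 1)              # weights that can threaten d_{s,t}
    d_e = math.floor(m * dmin_sq_normalized)  # weights that can threaten D_min^2
    d_f = max(d_w, d_e)                    # final weight that must be examined
    return d_w, d_e, d_f

# Hypothetical example: 4-point constellation (m = 2), d_{s,t} = 6,
# and a target MSED equal to 12 times 4a^2.
d_w, d_e, d_f = design_weights(m=2, ds_target=6, dmin_sq_normalized=12)
```

In this example d_e exceeds d_w, so d_f equals d_e, which matches the observation that the Euclidean limit usually governs in practical cases.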
The CICM interleaver constraints assume that the low weight CTBC codewords can be enumerated according to their weights. Recall that the CI-4 design algorithm identifies and eliminates all low weight CTBC codewords whose weights are less than d_t. Similarly, an analysis run of the CI-4 design algorithm can be used to identify all of the low weight CTBC codewords at any desired Hamming weight d ≥ d_t. All such low weight CTBC codewords, enumerated as i_P = 0, . . . , N_P(d≥d_t)−1, where N_P(d≥d_t) is the number of positions vectors in the table P(d≥d_t), can thereby be identified by a listing of their respective positions vectors, p(i_P), in the table P(d≥d_t), where each positions vector, p(i_P), lists the positions of the “1”s (i.e., non-zero coded bits) of a respective weight d sequence, v(i_P). The table P(d≥d_t) can be viewed as being built up as a sequence of constituent tables, {P(d)}, where each constituent table tabulates all of the positions vectors, p(i_P), associated with the CTBC codewords of a respective weight, d. That is, P(d≥d_t) = {P(d_t), P(d_t+1), . . . , P(d)}. The number of elements in each positions vector, p(i_P), is equal to the weight of its associated CTBC codeword, v(i_P), which is denoted as d(i_P). In any constituent table P(d), the positions vectors, p(i_P), can be enumerated and referred to as i_P = 0, . . . , N_P(d)−1. Herein, the “sequence i_P” is used to generally refer to the positions vector, p(i_P), and/or the associated low weight coded sequence, v(i_P).
The CICM mapping rule involves: (a) selection of a constellation mapping policy to map each m-bit combination of coded bits onto a respective constellation point, and (b) selection of the CICM interleaver rule to permute the coded bits of the vector v, subject to the constraint that, once mapped, the CICM mapped sequence will exhibit the best set of target values of d_{s,t} and D_min^2 that can be achieved for a given frame size. The CICM interleaver rule can be viewed as a constrained interleaver whose constraints involve placing all of the non-zero coded bits of the low weight sequences identified in the Table P(d≥d_t) in such a way as to enforce: (a) the target minimum symbol Hamming distance d_{s,t}, and (b) the target MSED, D_min^2. In practice, an iterative algorithm will be used that is initialized with the maximum possible d_{s,t} = d_{s,max} = d_t and the maximum possible D_min^2 = D_{min,max}^2 for the selected signal constellation and its constellation mapping rule. Using these values of d_{s,max} and D_{min,max}^2, starting values for d_w, d_e, and d_f are next computed using the formulas provided above. Next, subject to the selected constellation mapping rule and the specified frame size, K, an attempt is made to construct a CICM interleaver rule that meets the interleaver constraints for d_{s,max} and D_{min,max}^2. If the frame size is too small, the target d_{s,t} and D_min^2 values will be incrementally lowered and the design process will be repeated until a valid CICM interleaver rule is found that achieves the final values of d_{s,t} and D_min^2.
To design the CICM interleaver rule, an m×K/m permutation matrix, Γ, is defined. Each column of Γ can be considered to correspond to a respective symbol interval. The individual elements of Γ can be considered to be permutation indices pointing back into the vector v. Each column of Γ thus contains the indices of the coded bits from v that need to be constellation-mapped onto a symbol in each symbol interval. Similar to the CI-4 design approach, a “coded bit position” in v identifies a physical memory location, i, in the vector v, where 0≦i≦K−1. A “position” typically is used to refer to an index, i, in v, where a respective nonzero coded bit (i.e., a “1”) occurs in a respective one of the low weight error sequences identified by the table P(d≧dt). Also, while the elements of the permutation matrix Γ are actually indices into the vector v, similar to the discussion of the CI-4 design process, the concept of “placing” a coded bit (position) from v into Γ will be used herein.
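The role of Γ as a matrix of indices into v can be illustrated with a short sketch. The function name symbols_from_gamma and the toy values (K=8, m=2) are hypothetical. The sketch treats the first row of Γ as holding the most significant bit of each symbol, and it produces only the m-bit symbol labels; the mapping of labels to constellation points is left out.

```python
def symbols_from_gamma(v, gamma):
    """Read coded bits of v through Γ, one m-bit symbol label per column.

    gamma is a list of m rows, each of length K/m, whose entries are
    indices into v. Row 0 is taken as the most significant bit.
    """
    m = len(gamma)
    n_sym = len(gamma[0])
    flat = sorted(idx for row in gamma for idx in row)
    if flat != list(range(len(v))) or any(len(row) != n_sym for row in gamma):
        raise ValueError("gamma must be an m x (K/m) permutation of 0..K-1")
    labels = []
    for col in range(n_sym):
        label = 0
        for row in range(m):
            label = (label << 1) | v[gamma[row][col]]
        labels.append(label)
    return labels

# Hypothetical frame of K = 8 coded bits mapped two bits per symbol.
v = [1, 0, 1, 1, 0, 1, 0, 0]
gamma = [
    [0, 1, 2, 3],   # first row: most significant bit of each symbol
    [4, 5, 6, 7],   # second row: least significant bit of each symbol
]
labels = symbols_from_gamma(v, gamma)
```

Each column of gamma names the coded bits that share one symbol interval, so "placing" a position into Γ amounts to writing its index into one of these entries.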
To begin, the same sequential bit placement approach as used in the CI-4 design algorithm can be used to identify all of the coded sequences v with weight d, starting with d = d_t. For example, once the CI-3 and/or CI-4 (or DCI) interleaver is designed, the same bit-placing ordering as used in the CI-4 design algorithm can be followed and Algorithm 1 can be called, but with d_t replaced by d ≥ d_t, to identify all of the CTBC codewords having weight d. That is, an analysis run as described above can be performed, and this analysis run will cause Algorithm 1 to enumerate all possible CTBC coded sequences with weights d ≥ d_t. The results of the analysis run can be used to create the table P(d≥d_t), which tabulates all of the positions vectors of all of the respective CTBC codewords of weights d ≥ d_t. The table P(d≥d_t) can be readily sub-divided into a set of constituent tables, P(d_t), P(d_t+1), . . . , P(d), each of which lists all of the positions vectors corresponding to the CTBC codewords that exist at the respective weight, d_t, d_t+1, . . . , d.
In the analysis runs of the CI-4 design algorithm, the bits of c will already have been placed into u in such a way as to ensure that no CTBC codewords with weight less than dt will exist. In each analysis run, no bits are placed, but all of the positions vectors identified by Algorithm 1 corresponding to the CTBC codewords with the weight d are tabulated into the Table P(d). As will be seen later, it is useful to also tabulate information that identifies the contents of the non-zero OBC codeword positions, {cj} of the c vector associated with each tabulated sequence, iP.
Given the table P(d), a set E(d) is defined to be a set whose members are the distinct positions that appear in any of the positions vectors contained in P(d). The number of elements in the set E(d) is denoted as N(d). The number of positions vectors in P(d) in which a given position, i, occurs is denoted as Popularity(i,d). For example, if position v(50) only occurs in one of the sequences in the Table P(d), then the index value i=50 would be included in E(d), the i=50 index would be counted once in N(d), and Popularity(i=50,d)=1. If position v(55) occurs in ten different ones of the sequences in Table P(d), then the index value i=55 would be included in E(d), the i=55 index would be counted once in N(d), and Popularity(i=55,d)=10. Note that if a given position, i=70, is not used to hold any non-zero coded bits of any low weight sequences listed in Table P(d), then the popularity of i=70 at this weight d is zero, i.e., Popularity(i=70,d)=0.
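The quantities E(d), N(d), and Popularity(i,d) can be derived from a table of positions vectors as sketched below. The function name popularity_stats and the three-entry toy table are hypothetical. Returning E(d) in descending order of popularity, with ties broken by index, is an assumption made for convenience.

```python
from collections import Counter

def popularity_stats(table):
    """Return (E, N, popularity) for a table of positions vectors.

    popularity[i] is the number of positions vectors containing position i;
    E lists the distinct positions, most popular first.
    """
    popularity = Counter(pos for vec in table for pos in set(vec))
    # ASSUMPTION: ties in popularity are broken by the smaller index.
    ranked = sorted(popularity, key=lambda i: (-popularity[i], i))
    return ranked, len(ranked), popularity

# Hypothetical constituent table P(d) with three weight-3 sequences.
P_d = [(50, 55, 60), (55, 61, 62), (55, 60, 63)]
E_d, N_d, pop = popularity_stats(P_d)
```

Because a Counter returns zero for keys it has never seen, a position such as i=70 that appears in no tabulated sequence has popularity zero without any special handling.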
The iterative CICM mapping rule design algorithm will attempt to place all the positions of v into Γ to achieve the maximum possible d_{s,t} = d_{s,max} = d_t and the maximum possible D_min^2 = D_{min,max}^2. However, the values of parameters such as d_t, the frame size, K, and the constellation size, M = 2^m, will determine the highest values of the targets d_{s,t} and D_min^2 that can actually be reached. Specifically, if the signal constellation size is M = 2^m, the CICM mapping rule design algorithm computes the associated value of d_f and then starts off by considering only Hamming weight d = d_t sequences in v. Next the design algorithm gradually increases d until a limiting condition is reached or until the d_{s,t} = d_{s,max} = d_t and D_min^2 = D_{min,max}^2 objectives are achieved with the final value of d = d_f. In the event that the d_{s,max} = d_t and D_{min,max}^2 objectives cannot be achieved, then d_{s,t} and/or D_min^2 are decreased to the next highest possible values. As discussed in further detail below, the amount by which d_{s,t} and/or D_min^2 are decreased depends on the maximum number of coded bits from a weight d sequence that will need to be loaded into any particular symbol, and the positioning of those bits on different symbols. Next a new (lower) value of d_f is calculated, and the process is repeated, building the table P(d≥d_t) for each d = d_t, d_t+1, . . . , d_f, and attempting to place all the positions of v from each constituent table P(d) into Γ to achieve the current (lowered) values of d_{s,t} and D_min^2. If the mapping is able to achieve the current values of d_{s,t} and D_min^2 all the way up to P(d_f), then the algorithm stops. Otherwise, d_{s,t} and/or D_min^2 are decreased again, and the design process is repeated until a valid CICM interleaver rule can be found that achieves a final pair of target values of d_{s,t} and D_min^2 at the specified frame size, K.
Without loss of generality, the CICM mapping rule design algorithm computes the normalized squared Euclidean distance by dividing the squared Euclidean distance by the MSED of the constellation itself (which is 4a^2), i.e., the normalized squared Euclidean distance is given by D_{en}^2 = D_e^2/(4a^2). This normalization is slightly different from the standard normalized squared Euclidean distance used in the literature, given by D^2 = D_e^2/(2E_{b,avg}), or the normalized squared MED d_min^2 = D_min^2/(2E_{b,avg}), which also takes into account the number of bits transmitted per interval, where E_{b,avg} is the average bit energy.
As the iterative design algorithm proceeds, certain quantities associated with the individual sequences, i_P = 0, . . . , N_P(d≥d_t)−1, as listed in each Table P(d≥d_t), can evolve. The quantities d_{s,temp}(i_P) and D_{en,temp}^2(i_P) respectively represent the contributions to the symbol Hamming distance and to the normalized squared Euclidean distance due to the already placed positions of i_P. The quantities d_s(i_P) and D_{en}^2(i_P) respectively represent the actual symbol Hamming distance and the actual normalized squared Euclidean distance of the low weight sequence, i_P, once it has finished being placed into Γ. The quantities d_{s,max}(i_P) and D_{en,max}^2(i_P) respectively represent the maximum values that d_s(i_P) and D_{en}^2(i_P) can possibly achieve for each low distance error sequence i_P as listed in the Table P(d). These maximum possible values, d_{s,max}(i_P) and D_{en,max}^2(i_P), are the values reached by the sequence i_P based on its already placed positions in Γ, assuming that its remaining positions can be placed in Γ so as to meet the CICM interleaver constraints. Once a sequence i_P is fully placed in accordance with the CICM interleaver constraints, d_{s,temp}(i_P) = d_s(i_P) = d_{s,max}(i_P) and D_{en,temp}^2(i_P) = D_{en}^2(i_P) = D_{en,max}^2(i_P). The maximum achievable d_{s,t} and D_{min,n}^2 can be calculated at any point as d_{s,t} = min{d_{s,max}(i_P)} and D_{min,n}^2 = min{D_{en,max}^2(i_P)}.
By way of example, consider the QPSK example using the coding policy as illustrated in
Step 1. Set d = d_t and perform an analysis run of the CI-4 design algorithm to identify all weight d_t CTBC codewords {v(i_P)}, for i_P = 0, . . . , N_P(d_t)−1, and tabulate their respective positions vectors, {p(i_P)}, into the table P(d_t). Form the set E(d_t) of all distinct positions of the set of all weight d_t CTBC codewords, and find the number of elements in E(d_t), N(d_t), and the Popularity(i, d_t) for each position, i, in E(d_t). Arrange the elements of E(d_t) in descending order of their popularity, i.e., the first element in the set E(d_t) appears in the most sequences in P(d_t) and the last element appears in the fewest. At this time, there is no information that suggests that d_{s,max} and D_{min,max}^2 cannot be reached, so in order to aim for the highest possible targets d_{s,t} and D_{min,n}^2, initialize each tabulated sequence, i_P, as follows: d_{s,max}(i_P) = d_{sm} = d_t, d_{s,t} = min{d_{s,max}(i_P)} = d_t, D_{en,max}^2(i_P) = D_{e,max}^2/(4a^2) = 8a^2 d_t/(4a^2) = 2d_t, and D_{min,n}^2 = min{D_{en,max}^2(i_P)} = 2d_t.
If K/2≧N(dt), then all elements of E(dt) can be placed on the first row of Γ. The first row of Γ contains the most significant bits of each of the K/m different m-bit symbols that are stored down the columns of Γ (m=2 in this example). Each of these most significant bits have a squared Euclidian distance of 8a2 in the example of
Step 2. However, if K/2<N(dt) some elements of E(dt) will need to be placed on the second row of Γ. This suggests that it will not be possible to achieve a Dmin,n2 of Dmin,n,max2=2dt because there will be at least one coded sequence with weight dt that cannot place all its non-zero positions on the first row (i.e., the most significant bit in
Next the positions of the subset H need to be placed onto the second row of Γ in such a way as to achieve the highest possible value for ds,t. In the specific example of
Starting with the column whose first element contains the position, i, in E(dt) whose popularity, Popularity(i, dt), is the highest, the design algorithm attempts to match this position with the position(s) in the subset H that have the highest popularity. This is because the higher a position's popularity, the more potential conflicts it will have when being considered for placement into any candidate column, and thus the more difficult it will be to place later when there are not too many vacant locations left in Γ. Therefore, the design algorithm places the more difficult positions first and leaves the easier to place positions with lower popularities for later.
To accomplish the above, because of the popularity-rank ordering in which the first K/2 positions of E(d_t) have been placed into the first row of Γ, the (1,1) position in Γ (the first position in the first row) will have the highest popularity. Next identify the position in the subset H with the highest popularity that is not a position of any sequence i_P that contains the (1,1) position. Due to the way that E(d_t) has been rank ordered according to popularity, this can be done by checking each position of H from left to right and selecting the first position in H that is not a position of any sequence, i_P, that contains the position stored in the (1,1) location of Γ. Place the identified position of the subset H below the (1,1) location of Γ (i.e., in the (2,1) location). Continue in this way by moving from column to column along the first row of Γ until all positions in H are placed into the second row of Γ in such a way that no column contains more than one position from any given sequence i_P in P(d_t). If this can be successfully done, it is still possible to achieve d_{s,t} = d_{sm} = d_t based on all weight d_t coded sequences. If all positions of H cannot be placed in such a way that no column contains more than one position from any given sequence i_P, one or more roll-back attempts as discussed above in connection with the CI-4 design algorithm can be made, but if the roll-back attempts fail, two positions of the same sequence i_P will have been placed in at least one column of Γ, and thus d_{s,t} must be lowered. Therefore, if two positions of the same sequence had to be placed in at least one column of Γ, update all of the d_s(i_P), D_{en,max}^2(i_P), d_{s,t}, D_{min,n}^2 and d_f values.
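The column-by-column placement of the subset H can be sketched as a greedy loop. This is a simplified illustration and not the full design algorithm: it omits the roll-back attempts and the updating of the distance targets, it leaves a column vacant when no remaining position of H is compatible with it (an assumption of this sketch), and the function name and toy table are hypothetical.

```python
def place_second_row(first_row, h_positions, table):
    """Greedily fill the second row of Γ from H, left to right.

    first_row and h_positions are popularity-ranked lists of positions.
    A candidate may share a column only with a first-row position that
    never appears together with it in any positions vector of the table.
    Returns (second_row, unplaced), with None marking a vacant location.
    """
    # conflicts[p] = every position sharing at least one sequence with p
    conflicts = {}
    for vec in table:
        for p in vec:
            conflicts.setdefault(p, set()).update(vec)

    remaining = list(h_positions)
    second_row = [None] * len(first_row)
    for col, top in enumerate(first_row):
        for cand in remaining:
            if cand not in conflicts.get(top, set()):
                second_row[col] = cand      # first compatible position in H
                remaining.remove(cand)
                break
        if not remaining:
            break
    return second_row, remaining

# Hypothetical table P(d_t) of weight-3 sequences; positions ranked by
# popularity are [0, 1, 2, 3, 5, 4], so with K/2 = 4 the first row holds
# [0, 1, 2, 3] and H = [5, 4].
P_dt = [(0, 1, 4), (0, 2, 5), (1, 2, 3), (0, 3, 5)]
second_row, unplaced = place_second_row([0, 1, 2, 3], [5, 4], P_dt)
```

In this toy case position 0 shares a sequence with every other position, so its column stays vacant on the second row, and positions 5 and 4 land under positions 1 and 2 without any sequence having two positions in one column.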
Next step 3 is executed to place any remaining positions of v. If d_{s,t} had been lowered below d_{sm} = d_t, then if necessary, some of the positions that were initially placed on the first row with the aim of achieving d_{s,t} = d_t can be judiciously removed from Γ to create room for the remaining positions of v as discussed in further detail below.
Step 3. Set d = d+1 and perform an analysis run of the CI-4 design algorithm to identify all of the CTBC codewords having weight d, tabulate their respective positions vectors, {p(i_P)}, i_P = 0, . . . , N_P(d)−1, into the table P(d), and use these identified sequences to update the table P(d≥d_t). Next identify the positions that have already been placed in Γ, and using these already-placed positions, calculate d_{s,temp}(i_P) and D_{en,temp}^2(i_P) for each sequence, i_P = 0, . . . , N_P(d)−1, listed in the table P(d). The d_{s,temp}(i_P) and D_{en,temp}^2(i_P) values represent the symbol Hamming distance and the Euclidean distance contributions respectively made by the already placed positions of each of the N_P(d) weight d CTBC codewords identified by table P(d). Furthermore, the values of D_{en,temp}^2(i_P) are calculated only using positions from the positions vectors p(i_P) of table P(d) that have already been placed on fully completed columns of Γ. For example, if a non-zero coded bit from a CTBC codeword, i_P, has been placed onto the first row, this would indicate a squared Euclidean distance contribution of 8a^2 for that coded bit. However, it may be necessary to later place another coded bit of the same sequence onto the second row of the same column. If that happens, that 8a^2 contribution would be lowered to 4a^2. For this reason, D_{en,temp}^2(i_P) is only updated based upon completed columns. If any sequence, i_P, has all of its positions placed into Γ, then d_{s,temp}(i_P) and D_{en,temp}^2(i_P) will have reached their highest values, so in such cases set d_s(i_P) = d_{s,temp}(i_P) and D_{en}^2(i_P) = D_{en,temp}^2(i_P). Note that if d_{s,temp}(i_P) ≥ d_{s,t} and D_{en,temp}^2(i_P) ≥ D_{min,n}^2, then any remaining position of the entry i_P in P(d) can be placed at any available place in Γ because that placement will not lower the targets d_{s,t} or D_{min,n}^2. This is because, if the “temp” values are already at or above the target values, there is no need to consider the additional weight or distance above the threshold target values.
If ds,temp(iP)<ds,t and/or Den,temp2(iP)<Dmin,n2, record ds,max(iP) and Den,max2(iP). These values indicate the best-case numbers for the weight d sequences that still need to be placed.
In order to systematically place the additional positions of the set P(d), identify the subset of sequences, P′(d)⊂P(d), for which ds,temp(iP)<ds,t or Den,temp2(iP)<Dmin,n2. The set P′(d) thus contains the sequences, iP, that still need to be placed so as to meet the target values, ds,t and Dmin,n2. For sequences that already have ds,temp(iP)≧ds and Den,temp2(iP)≧Dmin,n2, there is no need to waste key positions in Γ on the additional positions of the sequences in P(d) that have already satisfied Γ's interleaver constraints. Such positions can be placed later after all of Γ's interleaver constraints have been met.
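The bookkeeping described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the design algorithm itself: Γ is assumed to have two rows (the QPSK case) and is stored as two lists with None marking a vacant location, each sequence iP is represented as a set of position indices, and the normalized contributions are assumed to be 2 for a completed column occupied only on the first row and 1 otherwise (the 8a2 and 4a2 values above divided by 4a2). The names temp_distances and select_p_prime are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# Gamma is modeled as two rows of equal length; None marks a vacant location.
Gamma = List[List[Optional[int]]]


def temp_distances(seq: Set[int], gamma: Gamma) -> Tuple[int, int]:
    """Return (ds_temp, den_temp2) for one sequence, given the current Gamma."""
    ds_temp = 0
    den_temp2 = 0
    for top, bottom in zip(gamma[0], gamma[1]):
        on_top = top is not None and top in seq
        on_bottom = bottom is not None and bottom in seq
        if on_top or on_bottom:
            ds_temp += 1  # each occupied column adds one to the symbol Hamming distance
        column_complete = top is not None and bottom is not None
        if column_complete and (on_top or on_bottom):
            # Assumed normalized contributions: first row only -> 2, otherwise -> 1.
            den_temp2 += 2 if (on_top and not on_bottom) else 1
    return ds_temp, den_temp2


def select_p_prime(table: List[Set[int]], gamma: Gamma,
                   ds_t: int, dmin_n2: int) -> List[int]:
    """Indices iP of the sequences that still fall short of either target."""
    pending = []
    for i_p, seq in enumerate(table):
        ds_temp, den_temp2 = temp_distances(seq, gamma)
        if ds_temp < ds_t or den_temp2 < dmin_n2:
            pending.append(i_p)
    return pending
```

Only completed columns contribute to den_temp2 in this sketch, which mirrors the reason given above for deferring the Euclidean distance update.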
Next construct a set E′(d) consisting of the popularity-ranked unique positions in P′(d), and construct a set H′(d) by removing all of the positions in E′(d) that have already been placed into Γ. Next identify a candidate position from H′(d), starting from left to right (highest popularity to lowest popularity) and attempt to place this candidate position from H′(d) into the left-most column of Γ that has a vacant position on the second row. Similar to step 2, before the placement can be made, it should be verified that the position already occupying the first row of the same column is not a position associated with any sequence iP in P′(d) that contains the candidate position. If the first row does not contain any position associated with any sequence iP that contains the candidate position, the candidate position is placed into the left most column of Γ that has a vacant position on the second row. If not, the process is repeated by attempting to place the candidate position into the next left most column of Γ with a vacant position on the second row and ensuring that the above described constraint is satisfied. This process is repeated until the candidate position is placed. Once the candidate position from H′(d) is placed in Γ, for all affected sequences, iP, in P(d), update ds,temp(iP), Den,temp2(iP), ds,max(iP) and Den,max2(iP). Continue placing the remaining elements of H′(d), one at a time, until ds,temp(iP)≧ds and Den,temp2(iP)≧Dmin,n2, for all iP in P′(d) or until it is determined that it is impossible to do so. The above process will ensure that the elements of E′(d) will be placed in such a way that no two positions of any sequence, iP, in P′(d) will be placed into the same column, thereby maximizing the ds,temp(iP) values, and thereby achieving the highest value of ds,t.
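The popularity ranking and the left-to-right placement rule described above might be sketched as follows. The sketch assumes a two-row Γ stored as two lists with None marking vacancies, sequences stored as sets of position indices, and ties in popularity broken by position index (a choice the text does not specify). The helper names popularity_ranked and place_on_second_row are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Set


def popularity_ranked(sequences: List[Set[int]]) -> List[int]:
    """Distinct positions ordered from most to least popular (ties by index)."""
    counts = Counter(p for seq in sequences for p in seq)
    return sorted(counts, key=lambda p: (-counts[p], p))


def place_on_second_row(candidate: int,
                        gamma: List[List[Optional[int]]],
                        sequences: List[Set[int]]) -> Optional[int]:
    """Place candidate in the left-most admissible second-row vacancy.

    A column is admissible when no sequence contains both the candidate and
    the position already on the first row of that column. Returns the column
    index used, or None when no admissible column exists.
    """
    for col, top in enumerate(gamma[0]):
        if gamma[1][col] is not None:
            continue  # second row already occupied
        shares_sequence = top is not None and any(
            candidate in seq and top in seq for seq in sequences
        )
        if not shares_sequence:
            gamma[1][col] = candidate
            return col
    return None
```

A return value of None corresponds to the case in which the placement is found to be impossible and a roll-back would be attempted.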
In the case where it is possible to meet these conditions for all sequences iP in P′(d), it may also be the case that these conditions are met before all of the positions in H′(d) have been placed. If there are such additional unconstrained positions in H′(d), do not place them at this time, so as to leave as many vacant locations in Γ as possible for the later placement of positions from higher weight sequences of v subject to Γ's interleaver constraints.
On the other hand, if it was not possible to place all positions of H′(d) to satisfy ds,temp(iP)≧ds and Den,temp2(iP)≧Dmin,n2 for all the sequences iP in P′(d), then a roll-back can be attempted. Start by identifying positions on the first row (as mentioned at the end of step 2) that can be moved to the second row (or lower rows) without lowering the targets ds,t or Dmin2. Note that, if the values of ds,t and Dmin,n2 had to be lowered one or more times, there will have been positions placed not only on the first row but also on the second row (or other rows) for some sequences that are no longer needed to maintain the now less restrictive interleaver constraints, ds,temp(iP)≧ds and Den,temp2(iP)≧Dmin,n2. In order to systematically identify the positions on rows that can be removed without violating constraints, identify the subset PQ(d≧dt)⊂P(d≧dt) whose elements are sequences, iP, which have values of ds(iP)>ds,t and Den2(iP)>Dmin,n2 that are high enough so that at least one position of each of these sequences can afford to be moved out of Γ to create a vacancy in Γ while still maintaining ds(iP)≧ds and Den2(iP)≧Dmin,n2 for all affected sequences, iP, in P(d≧dt).
Next form a set EQ(d≧dt) containing a popularity-ranked (descending order of the popularity of its distinct entries in PQ(d≧dt)) set of positions that can be removed from Γ without lowering the targets ds,t or Dmin2. That is, all of the positions in the subset EQ(d≧dt) can be removed from Γ while still maintaining ds(iP)≧ds and Den2(iP)≧Dmin,n2 for all sequences in P(d≧dt). Note that sequences in PQ(d≧dt) can afford to lower their distances whereas the sequences in P′(d) have to increase their distances. With that in mind swap positions of the sequences of P′(d) placed in H′(d) from left-to-right with the positions of EQ(d≧dt) from right-to-left. By doing so, the distances of the least number of sequences that can afford to lower their distances are lowered while the highest number of sequences that are in need of increasing their distances are increased. After every swap, update ds(iP) and Den2(iP) of all affected sequences in P(d≧dt), and update P′(d), PQ(d≧dt), and EQ(d≧dt). Continue this process to try to make ds(iP)≧ds and Den2(iP)≧Dmin,n2 for all sequences iP in P(d). If all sequences in P′(d) can be made to satisfy the constraints ds(iP)≧ds and Den2(iP)≧Dmin,n2, repeat step 3 until d=df. If any potential swap would cause any CICM interleaver constraint to be violated for any sequence iP, the swap is not made.
Note that during step 3, some (or all) of the positions that were moved out of Γ for later placement could be picked up by the next set (or sets) of weights d≦df. Hence, it is possible to get Γ mostly (or totally) filled by different positions before reaching d=df. At this point, in order to guarantee the target ds,t and Dmin,n2 values, it is still necessary to keep generating coded sequences of v until we reach d=df. In that process, it is possible to find sequences of v whose positions are already almost or fully placed in Γ. If all of the positions of a newly identified sequence iP have already been fully placed, then their ds(iP) and Den2(iP) can be directly calculated. For other sequences for which the positions are partially placed in Γ, ds,max(iP) and Den,max2(iP) can be calculated. If any of the ds(iP), Den2(iP), ds,max(iP) and Den,max2(iP) values happen to fall below their corresponding target values (ds and Dmin,n2), it is necessary to make changes in Γ by swapping already placed positions in it until all constraints are met by all sequences in P(d≧dt). Any violation of a constraint can result from either category (a): the sharing of columns of Γ by the coded bits of a given sequence iP, and/or category (b): positions of v that are mostly placed on the second row and therefore make lower contributions to Den2(iP). As discussed below, the additional constraints can be enforced during the placement of positions in Γ to avoid the sharing of columns of Γ by the coded bits of a given sequence iP.
In fact, if all the additional conditions such as inter-column conditions and inter-sequence constraints as described below can be fully satisfied during the placement of positions in Γ, violations caused by sequences that fall under category (a) can be completely eliminated. In situations where all inter-column conditions and inter-sequence constraints as described below cannot be fully satisfied and the case of category (a) occurs for any given sequence iP, then it becomes necessary to move some of the positions, preferably starting from the positions of the sequence iP that share the same columns in Γ. In any such sequence iP, it is first desirable (and may even be sufficient) to move positions that share the same columns of Γ. This change can be done by swapping with positions on the same row. For example, if such a sequence iP currently has four of its positions in two columns of Γ, a position from one of the rows from each column can be swapped with a different position on the same row. In the selection of the row of the selected column, it is preferable to select the row that contains the position that has the lower popularity to minimize the number of affected sequences. Such a swap increases the ds(iP) and Den2(iP) values of that sequence. Further, it is desirable to find a position that can be swapped without lowering ds(iP) and Den2(iP) of any other sequence, including the sequences that contain the position selected for the swap on the same row. It is also desirable to select a position on the same row that is contained only by sequences in P(d≧dt) that barely satisfy the two constraints. This is because if only such a sequence (or sequences) is involved, it is not really necessary to use up the other sequences that can afford to lower their ds(iP) and Den2(iP) values on these swaps, and instead, it is better to save them to form EQ(d≧dt).
Any sequence that satisfies the two constraints and does not qualify to feed positions to EQ(d≧dt) can be considered as such a sequence that barely satisfies the constraints. Hence, it is helpful to form a set, ĒQ(d≧dt), for each row separately that lists the positions on the respective row, in decreasing rank popularity order, that are not in the set EQ(d≧dt) and can thus be used for the swaps. In order to systematically handle sequences that fall under category (a), (i) identify the set of all sequences that do not satisfy one or both constraints and come under category (a), (ii) identify the set of distinct columns that are shared by the sequences found in (i), (iii) identify each position on the least popular row of each of the selected columns, and (iv) find the least popular position from ĒQ(d≧dt) of the same row that can be used for the swap with each selected position in step (iii). After every swap, re-calculate ds(iP) and Den2(iP) of all completed sequences, and ds,max(iP) and Den,max2(iP) of all partially completed sequences, to identify the sequences that still do not satisfy one or both of the two constraints. If the above steps (i) through (iv) can be successfully completed, all ds(iP) and ds,max(iP) values will be guaranteed to satisfy the condition ds(iP)≧ds,t. However, some sequences may still need to increase their Den2(iP) values. For those sequences, it is necessary to swap selected positions of them on the second row with positions on the first row as mentioned in category (b). The sequences under the category (b) mentioned above can be handled by using the same approach used to find places in Γ for positions in H′(d) using the set EQ(d≧dt). In place of H′(d), the set of distinct positions of all the sequences iP that fall under category (b), in their descending rank popularity order, is used.
However, at any value of d, if step 3 fails to make all sequences of P′(d) to satisfy the constraints, then:
(a) lower ds,t and/or Dmin2 and recalculate all needed parameters such as df as discussed above for these lower values.
(b) repeat step 3 with these new lower values. If that fails,
(c) go back to step 1 with these lower values of ds,t and Dmin2. If that also fails,
(d) repeat (a)-(c) until all sequences iP in P(d≧dt) can satisfy ds(iP)≧ds and Den2(iP)≧Dmin,n2.
Once all of the interleaver constraints have been met for all dt≦d≦df as described above, any and all of the remaining positions, i=0, . . . K−1 that have not already been placed into Γ can be placed anywhere in Γ without violating the interleaver constraints. The values of ds,t and Dmin,n2 at the point of stopping are the values that can be finally reached.
It is important to note here that throughout the above discussion we have used one ds,t and one Dmin2 value for all sequences of v. However, because v is generated from a concatenated code that achieves different interleaver gains for different sequences, it can be desirable to employ different ds(iP) and Dmin2(iP) values for different categories of sequences of v(iP) as described above in connection with the CI-3 and CI-4 design algorithms. Since all calculations that are used to design Γ employ ds(iP) and Dmin2(iP) values individually on sequences, the above method can be directly used for varying sets of ds,t and Dmin2 values for different sequences if desired. Since the higher weight sequences v of the concatenation usually achieve higher interleaver gains, even though it is in principle necessary to consider all weights up to df, it may be sufficient to only consider weights up to a weight less than df to achieve good performance for selected CTBC codes. Hence it is to be understood that in any of the algorithms and examples presented herein, the interleaver constraints can be modified to employ different ds(iP) and Dmin2(iP) constraint thresholds depending upon the category to which any given sequence v(iP) belongs.
A. Inter-Column Constraints:
In order to understand the impact that the constellation mapping of bits onto symbols has on Dmin2, consider a given sequence iP listed in P(d). Let x(iP) be the number of columns of Γ that contain one position from the sequence iP on the first row and zero positions of iP on the second row. Let y(iP) be the number of columns of Γ that contain zero positions from the sequence iP on the first row and one position from the sequence iP on the second row. Let z(iP) be the number of columns of Γ that contain two positions from the sequence iP, one on the first row and another on the second row. With these definitions and the constellation mapping rule of
$D_{en}^2(i_P) = 2x(i_P) + y(i_P) + z(i_P)$. (11)
Further, since the sequence iP, is taken from P(d) and thus has weight d=d(iP), the parameters x(iP), y(iP) and z(iP) will necessarily satisfy
x(iP)+y(iP)+2z(iP)=d(iP) (12)
where d(iP) is the weight of the sequence iP. It follows from (11) and (12), that
$D_{en}^2(i_P) = [x(i_P) - z(i_P)] + d(i_P)$ (13)
and ds(iP)=x(iP)+y(iP)+z(iP). Further, from equations (11) through (13) it follows that for any pre-selected pair of values ds,t and Dmin,n2, if the sequence iP satisfies both constraints, then x(iP) and z(iP) must satisfy,
z(iP)≦(d(iP)−ds,t), (14a)
and
x(iP)≧[Dmin,n2−d(iP)+z(iP)]. (14b)
Also, it follows from (14) that the maximum allowable value of z(iP), zmax(iP), and the minimum required value of x(iP), xmin(iP), can be computed as
$z_{max}(i_P) = d(i_P) - d_{s,t}$ (15a)
and
$x_{min}(i_P) = D_{min,n}^2 - d(i_P) + z(i_P)$. (15b)
As can be seen from equations (14) and (15), it is desirable to have a low value of z(iP) (such as z(iP)=0). This is because a lower value of z(iP) can increase the value of ds,t and decrease the required value of x(iP). However, when placing the positions of any sequence iP into Γ, a potential current lack of available locations in Γ may give rise to the requirement that z(iP)>0. Hence, as each of the main algorithmic steps 1 through 3 described above is executed, for each sequence iP in the table P(d≧dt), it is desirable to compute and record xmin(iP) and zmax(iP) and to then use these values to guide the placement of the positions of each sequence iP into Γ.
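The relationships in equations (11) through (15) can be checked numerically with a short sketch such as the one below. It assumes the two-row (QPSK) case and integer-valued targets; the function names are hypothetical and chosen only to mirror the symbols x, y, z, ds,t and Dmin,n2 used above.

```python
def den2(x: int, y: int, z: int) -> int:
    """Normalized squared Euclidean distance, equation (11)."""
    return 2 * x + y + z


def weight(x: int, y: int, z: int) -> int:
    """Sequence weight d(iP), equation (12)."""
    return x + y + 2 * z


def symbol_distance(x: int, y: int, z: int) -> int:
    """Symbol Hamming distance ds(iP): number of occupied columns."""
    return x + y + z


def z_max(d: int, ds_t: int) -> int:
    """Largest admissible z(iP), equation (15a)."""
    return d - ds_t


def x_min(d: int, z: int, dmin_n2: int) -> int:
    """Smallest admissible x(iP) for the current z(iP), equation (15b)."""
    return dmin_n2 - d + z


def meets_targets(x: int, y: int, z: int, ds_t: int, dmin_n2: int) -> bool:
    """True when both the ds,t and Dmin,n2 targets are met."""
    return symbol_distance(x, y, z) >= ds_t and den2(x, y, z) >= dmin_n2
```

For instance, a weight 4 sequence with x=2, y=2 and z=0 gives den2 equal to 6 and a symbol distance of 4, so it meets targets of ds,t=4 and Dmin,n2=6 with z_max equal to 0 and x_min equal to 2.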
When the CICM mapping rule design algorithm begins to execute, or at each new pass through step 3, for each identified sequence iP in P(d), z(iP) is initialized to z(iP)=0, and the starting value of xmin(iP) is computed from equation (15b). However, as each additional position from the set E′(d) or H′(d) is placed into Γ, the values of z(iP) for all affected sequences, iP, may need to be increased, and at such time, any such affected value of xmin(iP) is then updated in accordance with equation (15b). In order to monitor the progress of the values of x(iP) and z(iP), as each one of the positions from the set E′(d) or H′(d) is placed into Γ, for all affected sequences, iP, first update xmin(iP) and zmax(iP) using equation (15), and additionally monitor the current x(iP) value, xtemp(iP), and the current z(iP) value, ztemp(iP). Initially, before any of the positions of any such sequence iP have been placed, initialize xtemp(iP)=ztemp(iP)=0. Note that all of these parameter updates are computed only when both rows of a column in Γ are filled. This is because the above parameters are only fixed (not subject to future change) after both of the locations of a column in Γ have been filled.
In order to satisfy the inter-column constraints, the goal is to achieve xtemp(iP)≧xmin(iP) and ztemp(iP)≦zmax(iP) when all positions of any sequence, iP, have been placed, thereby satisfying (14a) and (14b). If this cannot be done for all sequences, iP, all the way up to weight df, then a roll-back can be attempted and steps 1 through 3 of the CICM mapping rule design algorithm can be executed again to see if a valid Γ can be found. If the roll-back attempt fails, then lower ds,t and/or Dmin,n2 as discussed above, and keep executing the CICM mapping rule design algorithm until equation (14) can be satisfied for all sequences, iP, in P(d≦df).
B. Inter-Sequence Constraints:
The above steps 1 through 3 of the CICM mapping rule design algorithm implement constraints and perform processing based upon each of the individual coded sequences, iP, in P(d), for d=dt, . . . , df. In addition to considering single sequences, additional inter-sequence constraints are needed to avoid conditions that can arise where multiple different sequences interact to cause the Dmin2 value of the constellation to decrease. There is no need to implement inter-sequence constraints to ensure that the target symbol Hamming distance is maintained at ds,t because the target symbol Hamming distance is only affected by the placement of the positions in each sequence, iP, when considered alone.
To understand the inter-sequence constraints that ensure that Dmin,n2 is not lowered due to combinations of sequences, consider a specific example involving a weight dt sequence, iP1, from the table P(dt). In this example it is desired to maintain Dmin,n2=1.5dt by placing x(iP1)=dt/2 positions on the first row, y(iP1)=dt/2 positions on the second row, and to have z(iP1)=0. Next consider a second weight dt sequence, iP2, also from the table P(dt). In this example, dt/2 positions of the second sequence iP2 are also placed on the first row, and in the same columns where dt/2 of the positions of the sequence iP1 have been placed on the second row. Also, dt/2 of the positions of iP2 are placed on the second row in the same columns where dt/2 positions of the sequence iP1 have been placed on the first row. In the end the combination of the two sequences still maintains ds,t=dt, but Dmin,n2 for the combination is now lowered to dt because the two sequences in combination generate dt “11” QPSK symbols. The easiest way to prevent these types of undesirable inter-sequence interactions is to limit the number of columns any two different sequences can share. In the above mentioned example, sequences iP1 and iP2 are allowed to share dt columns. However, such cases can be prevented by imposing an inter-sequence constraint to limit the number of columns that certain potentially troublesome pairs of sequences, iP1 and iP2, can share.
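The interaction described in this example can be reproduced with a small numerical sketch. The sketch assumes dt=4, a two-row Γ, and the normalized contributions of equation (11), and it places each sequence with two positions on the first row and two on the second so that each individually achieves 1.5dt. Sequences are represented by the (row, column) locations of their nonzero coded bits, which is a representation chosen here purely for illustration.

```python
from typing import Dict, Set, Tuple

Location = Tuple[int, int]  # (row, column); row 0 is the first row of Gamma


def distances(locations: Set[Location]) -> Tuple[int, int]:
    """Return (ds, den2) for a set of occupied locations in a two-row Gamma."""
    rows_by_col: Dict[int, Set[int]] = {}
    for row, col in locations:
        rows_by_col.setdefault(col, set()).add(row)
    ds = len(rows_by_col)
    # First row only -> 2; second row only or both rows ("11") -> 1.
    den2 = sum(2 if rows == {0} else 1 for rows in rows_by_col.values())
    return ds, den2


d_t = 4
seq1 = {(0, 0), (0, 1), (1, 2), (1, 3)}  # first row in columns 0-1, second row in 2-3
seq2 = {(0, 2), (0, 3), (1, 0), (1, 1)}  # the mirror-image placement
combined = seq1 ^ seq2                   # modulo-2 sum of two non-overlapping sequences

print(distances(seq1))      # each sequence alone: ds = 4, den2 = 6 = 1.5 * d_t
print(distances(combined))  # together: ds = 4, den2 = 4 = d_t
```

Every column of the combined sequence carries a “11” symbol, which is why the combined distance falls to dt even though each sequence individually meets the 1.5dt target.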
Specifically, the potentially troublesome pairs of sequences, iP1 and iP2, are called “disjoint sequences” herein. To understand what a pair of disjoint sequences is, first note that the sequences iP1 and iP2 are each low weight sequences as listed in the table P(d) or P(d≧dt) and thus have respective weights d(iP1) and d(iP2). Next note that each of the sequences, iP1 and iP2, is associated with a CTBC codeword, v(iP1) and v(iP2), respectively. Also, due to the way the CTBC coded v-sequences are formed, i.e., v=G[u]=G[π[c]], each of the positions vectors, iP1 and iP2, is also associated with an OBC coded sequence, c(iP1) and c(iP2), respectively. Recall that each sequence c can be viewed as a naturally ordered set of OBC codeword positions, {cj}, for j=0, 1, 2, . . . , ρ−1. Therefore, two sequences, iP1 and iP2, are said to be disjoint sequences if the nonzero codeword positions in the vectors c(iP1) and c(iP2) are disjoint, i.e., non-overlapping. Inter-sequence constraints are developed below to specifically eliminate potential ill effects due to higher weight CTBC codewords, v, that include at least two disjoint low weight vectors c(iP1) and c(iP2) that correspond to sequences, iP1 and iP2, in the table P(d≧dt).
In order to identify disjoint sequences at any weight d, start by performing an analysis run of the CI-4 design algorithm in order to construct the table, P(d). Next, for each positions vector, iP in P(d), identify the corresponding vectors v and c, such that v=G[π[c]]. Next, for each sequence iP1 in the table P(d≧dt), identify a corresponding set of disjoint sequences Δdis(iP1), where if iP2 is disjoint relative to iP1 (i.e., there is no overlap in c(iP1) and c(iP2)), then iP2 is included as a member of Δdis(iP1). Each time step 3 is entered so that the weight d is incremented, if iP was already in the set {P(dt), . . . , P(d−1)}, then the set Δdis(iP) need only be updated by adding any of the weight d sequences in P(d) that are disjoint to iP. Also, by observing the associated vectors c(iP1), c(iP2), and c(iP3), it can be readily seen that if iP2 and iP3 are two entries of Δdis(iP1), then even though iP2 and iP3 are disjoint relative to iP1, iP2 and iP3 are not necessarily disjoint sequences relative to each other.
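A minimal sketch of how the sets Δdis(iP) might be tabulated is given below. It assumes that each sequence iP is summarized by the set of nonzero OBC codeword indices of its associated vector c(iP); the function name disjoint_sets and this representation are illustrative assumptions only.

```python
from typing import Dict, List, Set


def disjoint_sets(c_supports: List[Set[int]]) -> Dict[int, Set[int]]:
    """Map each sequence index to the indices of the sequences disjoint from it.

    c_supports[i] holds the nonzero OBC codeword indices of c(iP) for sequence i.
    """
    delta_dis: Dict[int, Set[int]] = {i: set() for i in range(len(c_supports))}
    for i, support_i in enumerate(c_supports):
        for j in range(i + 1, len(c_supports)):
            if support_i.isdisjoint(c_supports[j]):
                delta_dis[i].add(j)
                delta_dis[j].add(i)
    return delta_dis


# Sequences 1 and 2 are both disjoint from sequence 0 but overlap each other
# (they share OBC codeword index 3), illustrating that disjointness is not transitive.
supports = [{0, 1}, {2, 3}, {3, 4}]
print(disjoint_sets(supports))
```

When the weight d is incremented, only the pairs involving the newly added weight d sequences would need to be examined; the sketch recomputes all pairs for simplicity.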
Define the quantity, sh(iP1,iP2), to be the highest number of columns that any two selected sequences iP1 and iP2 can share without causing the MED to be lowered below Dmin2. Recall that the weight of the sequence iP1 is d(iP1) and the weight of the sequence iP2 is d(iP2). Then, for the two sequences to achieve an MED of at least Dmin2, the following inequality must be satisfied,
[x(iP1)+x(iP2)]−[z(iP1)+z(iP2)]+[d(iP1)+d(iP2)]−2sh(iP1,iP2)≧Dmin,n2 (16)
and hence,
sh(iP1,iP2)≦└[(x(iP1)+x(iP2))−(z(iP1)+z(iP2))+(d(iP1)+d(iP2))−Dmin,n2]/2┘ (17)
To better understand the inequality (16), note that the two sequences iP1 and iP2 each individually satisfy equation (12) at their respective weights, d(iP1) and d(iP2). Also, the combination of the two sequences forms (x(iP1)+x(iP2)) columns with a one on the first row and a zero on the second, (y(iP1)+y(iP2)) columns with a zero on the first row and a one on the second, and (z(iP1)+z(iP2)+sh(iP1,iP2)) columns with a one on both rows. The sum of squared Euclidean distance contributions from all rows is at least Dmin2.
Since the two sequences can have only (x(iP1)+x(iP2)) number of free ones on the first row, and y(iP1)+y(iP2) number of free ones on the second row, the maximum number of columns the combination can form with ones on both rows, i.e., sh(iP1,iP2), is further restricted by
sh(iP1,iP2)≦min{(x(iP1)+x(iP2)),(y(iP1)+y(iP2))}. (18)
However, the sh(iP1,iP2) values can be calculated only after all of the values x(iP1), x(iP2), z(iP1) and z(iP2) are known, and these values only become available after all the positions of iP1 and iP2 are placed. Hence, the following temporary values are calculated
$x_t(i_{P1}) = \max\{x_{temp}(i_{P1}),\, x_{min}(i_{P1})\}$ (19a)
$z_t(i_{P1}) = \min\{z_{temp}(i_{P1}),\, z_{max}(i_{P1})\}$ (19b)
$x_t(i_{P2}) = \max\{x_{temp}(i_{P2}),\, x_{min}(i_{P2})\}$ (19c)
$z_t(i_{P2}) = \min\{z_{temp}(i_{P2}),\, z_{max}(i_{P2})\}$ (19d)
to use in place of the respective values, x(iP1), x(iP2), z(iP1) and z(iP2), in equation (17). Note that for any two disjoint sequences iP1 and iP2, it is also possible to use the parameters of equation (9) to calculate a temporary sh(iP1,iP2) value, sh(iP1,iP2)temp, that indicates the highest allowable number of columns the two sequences can share based on the currently available information. Once the positions are placed in Γ, these sh(iP1,iP2)temp values can be updated to ensure that in the end equation (17) is satisfied.
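The column-sharing limit of equations (17) and (18), evaluated with the temporary values of equation (19), might be computed as in the sketch below. Integer-valued targets are assumed, and the function names temp_params and sh_limit are hypothetical. The optional y_sum argument stands for y(iP1)+y(iP2) and is only applied when that sum is known.

```python
from typing import Optional, Tuple


def temp_params(x_temp: int, x_min: int, z_temp: int, z_max: int) -> Tuple[int, int]:
    """Temporary (x_t, z_t) values for one sequence, equations (19a)-(19d)."""
    return max(x_temp, x_min), min(z_temp, z_max)


def sh_limit(x1: int, z1: int, d1: int,
             x2: int, z2: int, d2: int,
             dmin_n2: int, y_sum: Optional[int] = None) -> int:
    """Largest number of columns two sequences may share.

    Applies the floor in equation (17) and, when y_sum is supplied, the
    additional cap of equation (18). A negative result means the target
    cannot be met by this pair even without any shared column.
    """
    bound = ((x1 + x2) - (z1 + z2) + (d1 + d2) - dmin_n2) // 2  # floor division
    if y_sum is not None:
        bound = min(bound, x1 + x2, y_sum)
    return bound
```

For two weight 4 sequences with x=2 and z=0 each and a target of Dmin,n2=6, the limit evaluates to 3, so the fully overlapping placement of four shared columns considered in the earlier example would be excluded.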
Further, it is noted that if Dmin,n2≦dt, inter-sequence constraints are not needed. This is because two sequences cannot interact to lower Dmin,n2 below dt: for any two sequences, each with weight d, the worst case for Dmin,n2 is to have both sequences completely aligned in d columns, in which case Den2(iP)=d. Similarly, if dt<Dmin,n2≦1.5dt, equation (17) must be satisfied by every pair of disjoint sequences, but combinations of more than two sequences need not be considered. This is because more than two sequences cannot interact to lower Dmin,n2 below 1.5dt.
However, if 1.5dt<Dmin,n2≦2dt, it should be ensured that equation (17) is satisfied for all pairs, and in addition, it should be ensured that no combination of three mutually disjoint sequences achieves a normalized squared MED below Dmin,n2. A set of sequences is called mutually disjoint if every two sequences of that set are disjoint. Consider three mutually disjoint sequences v(iP1), v(iP2) and v(iP3) with respective weights d(iP1), d(iP2) and d(iP3) and with their respective parameters, x(iP1) & z(iP1), x(iP2) & z(iP2), and x(iP3) & z(iP3). Following the same logic, equation (17) can be extended to three mutually disjoint sequences as
Hence, when updating any sh(iP1,iP2) value, it is not only necessary to ensure that it satisfies equation (17) but also it is necessary that it satisfies (20) based on the already available values of sh(iP2,iP3) and sh(iP1,iP3). Since the highest achievable Dmin,n2 of the QPSK constellation in
In order to reduce the complexity of implementing inter-sequence constraints, Method 1 and Method 2 are presented. Under broad conditions, Methods 1 and 2 can reduce or altogether avoid the need to check equation (20) and similar equations that deal with combinations of more than three disjoint sequences.
Method 1: This method imposes stronger restrictions on pairs of sequences given by (17) so that multiple combinations automatically satisfy the MED condition. For example, if the division by 2 in (17) is changed to a division by 4, equation (20) will be guaranteed to be always satisfied.
In order to better understand why, consider a combination of weight d mutually disjoint sequences in an application, similar to the 16-QAM example discussed later, that maintains the same Euclidean distance for all one-bit, two-bit, and so on up to m-bit differentials on the constellation. Let us also consider the case when x1(iP)=d(iP), i.e., all coded bits of every weight d sequence iP are placed in d(iP)=d different columns and on the first row, whose associated distance is D12. Then each such sequence iP individually achieves a normalized squared Euclidean distance of Den2(iP)=d(iP)D12. Next consider the case in which any two disjoint sequences, iP1 and iP2, both have weight d(iP1)=d(iP2)=d and can share only one column, i.e., sh(iP1,iP2)=1. In this case the highest possible minimum squared Euclidean distance achieved by any two sequences iP1 and iP2 is Den2(iP1,iP2)=2(d−1)D12+D22 (where D22 is the distance associated with a two-bit differential on the constellation). Similarly, the highest possible minimum squared Euclidean distance achieved by any three weight d sequences iP1, iP2, and iP3 is Den2(iP1,iP2,iP3)=3(d−2)D12+3D22. Continuing in this manner, the minimum squared Euclidean distance achieved by the worst case of (d+1) sequences iP1, iP2, iP3, . . . , iP(d+1), in which every column of every sequence is shared and there are d(d+1)/2 shared columns in total, is Den2(iP1,iP2,iP3, . . . , iP(d+1))=[d(d+1)/2]D22. Depending on the values of d, D12 and D22, [d(d+1)/2]D22 can be larger than dD12. Note that if sh(iP1,iP2)>1, fewer than (d+1) disjoint sequences can result in a squared Euclidean distance that is dependent only on D22. Hence, depending on d, D12, and D22, the highest possible value of sh(iP1,iP2) between any two sequences iP1 and iP2 can be chosen to make the squared Euclidean distance of any combination of disjoint sequences larger than dD12.
Therefore, by choosing sh(iP1,iP2) values to ensure Den2≧Dmin,n2, it is possible to guarantee that any combination of disjoint sequences generates the desired MED. When Method 1 is compared to equation (17), equation (17) allows two sequences iP1 and iP2 to share more columns. Note that equation (20) must then also be satisfied separately, and based upon the number of columns shared by the pairs iP1 & iP2 and iP2 & iP3, equation (20) determines how many columns iP1 & iP3 can be allowed to share. This suggests that it is always desirable to limit the number of columns any two disjoint sequences can share even if it imposes additional restrictions on finding locations in Γ to place the positions contained in E(d≧dt).
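The two- and three-sequence expressions discussed under Method 1 follow a pattern that can be tabulated with a short sketch. The sketch assumes n mutually disjoint weight d sequences in which every pair shares exactly one column and no column is shared by more than two sequences; the closed form used here is an extrapolation of the two cases stated above rather than a formula taken from the text, and the name combined_distance is hypothetical.

```python
from math import comb


def combined_distance(n: int, d: int, d1_sq: float, d2_sq: float) -> float:
    """Squared Euclidean distance of n weight-d sequences sharing one column per pair.

    Each sequence keeps d - (n - 1) unshared columns contributing d1_sq each,
    and each of the comb(n, 2) shared columns contributes d2_sq.
    """
    if not 1 <= n <= d + 1:
        raise ValueError("n must lie between 1 and d + 1 under these assumptions")
    unshared_columns = n * (d - (n - 1))
    shared_columns = comb(n, 2)
    return unshared_columns * d1_sq + shared_columns * d2_sq
```

Comparing the values returned for increasing n against the single-sequence value d·D12 indicates how many columns pairs of sequences could be allowed to share before a combination becomes the limiting case.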
Method 2: Method 2 is based on the fact that equations (17) and (20) apply only to pairs of disjoint sequences. Hence, if each column of Γ is constrained to hold a subset of positions whose associated subset of sequences, {iP1}, does not include any disjoint sequences, this would avoid all of these problematic situations completely. Hence, for example, when placing a position into a candidate location on the second row of a column of Γ, first identify all of the sequences in P(d≧dt), {iP1}, that contain the position already placed in the first row above the candidate location. Then identify all of the Δdis(iP1) sets corresponding to the sequences {iP1} that contain the position in the row above, and find a position to load into the candidate location that does not belong to any sequence that is a member of any of these identified sets, {Δdis(iP1)}.
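The admissibility test of Method 2 could be expressed as in the following sketch. It assumes that sequences are stored as sets of position indices, that delta_dis maps each sequence index to the indices of the sequences disjoint from it, and that a candidate is rejected whenever any sequence containing it is disjoint from any sequence containing the first-row position. The name method2_allows is hypothetical.

```python
from typing import Dict, List, Set


def method2_allows(candidate: int, top: int,
                   sequences: List[Set[int]],
                   delta_dis: Dict[int, Set[int]]) -> bool:
    """True when placing candidate under top pairs no two disjoint sequences."""
    top_seqs = [i for i, seq in enumerate(sequences) if top in seq]
    cand_seqs = {i for i, seq in enumerate(sequences) if candidate in seq}
    return all(cand_seqs.isdisjoint(delta_dis.get(i, set())) for i in top_seqs)
```

If no candidate passes this test for a given location, the fallback to Method 1 would apply.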
In practice, the following sequence can be used. First attempt to place a position into the current candidate location using Method 2. Failing that, it will be necessary to place a position of a sequence that is disjoint relative to a sequence whose position is already placed above in the same column of Γ. In that case, use Method 1 with the smallest possible sh(iP1,iP2) values as discussed in Method 1. Failing that, increase the sh(iP1,iP2) values up to the levels required to meet equation (17) and then make sure that (20) is also satisfied.
When the inter-sequence constraints are used in any of the ways discussed above, it is desirable to create a second table P+(d≧dt) along with P(d≧dt) to list the linear combinations of the disjoint coded sequences. The table P+(d≧dt) can be derived from all the Δdis(iP) entries. If iP2 is an entry of Δdis(iP1), then the modulo 2 addition of v(iP1) and v(iP2) is listed in P+(d≧dt). Before adding a new weight d sequence iP into P(d), check to see if iP corresponds to any entry of P+(d≧dt). If so, it is not necessary to add that sequence to P(d).
In accordance with the definition of de, in order to maintain the normalized squared MED at Dmin,n2, it is necessary to ensure that no coded sequence of weight up to de=2Dmin,n2 can generate a normalized squared MED less than Dmin,n2. Hence, even if the inter-sequence constraints are ignored in the design stage, all problematic cases will be found in that search. However, it is highly desirable to impose the inter-sequence constraints during the design. This is because otherwise once troublesome combinations of disjoint sequences that reduce Dmin,n2 are found, it is necessary to swap positions when all positions are placed and it becomes harder to keep doing it for a larger number of cases. Instead if the inter-sequence constraints are imposed it is not even necessary to check for any combination of already checked disjoint sequences as they are guaranteed to satisfy the MED requirement. Hence, one good way to implement this procedure is to create a second Table P+(d≧dt) along with P(d≧dt) to list the linear combinations of the disjoint coded sequences. The set P+(d≧dt) can be derived from all the Δdis(iP1) entries. If iP2 is an entry of Δdis(iP1) then the modulo 2 addition of v(iP1) and v(iP2) are added into the table P+(d≧dt). Hence, when a new candidate sequence is identified for inclusion into P(d≧dt), before entering it into P(d≧dt), check to see if it is an entry of P+(d≧dt). If so, do not add this new candidate sequence into P(d≧dt). This way, P(d≧dt) can be made shorter and checks for higher weights can be made easier.
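The construction of the table P+(d≧dt) and the membership check performed before a new sequence is added might look like the sketch below. It assumes that each codeword v(iP) is represented by the frozen set of its nonzero coded-bit positions, so that modulo 2 addition becomes a symmetric difference; the names build_p_plus and should_add are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set


def build_p_plus(v_supports: List[FrozenSet[int]],
                 delta_dis: Dict[int, Set[int]]) -> Set[FrozenSet[int]]:
    """Collect the modulo-2 sums of every pair of disjoint sequences."""
    p_plus: Set[FrozenSet[int]] = set()
    for i, partners in delta_dis.items():
        for j in partners:
            # Symmetric difference of supports equals bitwise modulo-2 addition.
            p_plus.add(v_supports[i] ^ v_supports[j])
    return p_plus


def should_add(candidate: FrozenSet[int], p_plus: Set[FrozenSet[int]]) -> bool:
    """A candidate already listed in P+ need not be entered into P."""
    return candidate not in p_plus
```

Storing the sums in a hash-based set keeps the membership check inexpensive as the number of tabulated combinations grows.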
With the knowledge gained from the previous example, we now consider the systematic construction of Γ with a 16-QAM constellation. First a constellation mapping rule is chosen to maximize the Euclidean distance for single bit separations, and decreasing Euclidean distances for higher bit separations.
In order to establish the mathematical background for analysis in a general manner, let us consider the placement of the positions associated with a weight d sequence iP into Γ. In general, consider an M=2^m-ary constellation that has minimum Euclidean distance Db for every b-bit separation, b=1, 2, . . . , m. Now let us say that this weight d sequence iP is placed in Γ such that it occupies xb columns with weight b, b=1, 2, . . . , m. For example, when d=dt, it is desired to have x1=dt and all other xb=0 to achieve ds,max=dt and Dmin,max=dtD12. However, for a general distribution of the xb values, the resulting symbol Hamming distance, ds(iP), and the squared normalized Euclidean distance, Den2(iP), of this general weight d(iP) sequence iP can be written as
subject to the constraint imposed by its weight d(iP)
In order to ensure that the target symbol Hamming distance, ds,t, is always met, it needs to be ensured that no sequence iP of weight up to dw=m(ds,t−1) can create a symbol Hamming weight, ds(iP), less than ds,t. Similarly, in order to ensure a normalized squared Euclidean distance of Dmin,n2, no sequence iP of weight up to de=└mDm2/4a2┘ can generate a Den2(iP) value less than Dmin,n2. Again, df=max {dw,de} is used to identify the highest weight of sequences iP that eventually need to be included into P(d≧dt) in order to ensure that both the selected target ds,t and Dmin,n2 values are achieved.
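The quantities in the two preceding paragraphs can be illustrated with a short sketch. The expressions for ds(iP), Den2(iP) and the weight constraint are not reproduced in this text, so the code assumes the natural reading of the definitions: ds is the number of occupied columns, the squared Euclidean distance is bounded below by the sum of xb·Db2, and the weight equals the sum of b·xb. Any normalization of Den2 is omitted, the numeric Db2 values in the usage note are illustrative only, and de is taken as an input rather than computed.

```python
def column_profile_metrics(x, d_sq):
    """Summarize a placement given its column-weight profile.

    x[b-1]    : number of columns of Gamma in which the sequence has exactly b bits
    d_sq[b-1] : assumed minimum squared Euclidean distance D_b^2 for a b-bit separation
    Returns (weight, symbol Hamming distance, lower bound on squared distance).
    """
    weight = sum(b * xb for b, xb in enumerate(x, start=1))
    ds = sum(x)                                   # one symbol per occupied column
    den_sq = sum(xb * d2 for xb, d2 in zip(x, d_sq))
    return weight, ds, den_sq

def highest_weight_to_check(m, ds_t, de):
    """Return df = max(dw, de) with dw = m * (ds_t - 1).

    A sequence of weight above m * (ds_t - 1) cannot fit into fewer than ds_t
    columns of m bits each, so only weights up to dw can violate the ds_t target.
    """
    dw = m * (ds_t - 1)
    return max(dw, de)
```

With two-bit symbols and illustrative values D12=8 and D22=4, a weight 5 sequence occupying three single-bit columns and one two-bit column yields `column_profile_metrics([3, 1], [8, 4]) == (5, 4, 28)`.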
Referring again to the 16-QAM constellation with the constellation mapping rule shown in
$$D_{en}^2(i_P) \ge x_1(i_P)\,D_1^2 + x_a(i_P)\,D_a^2 \qquad (24)$$
where Da2=min {Dj2}, j=1, 2, . . . m, and j=jm minimizes over j, i.e., Da2=Djm2
$$x_a(i_P) = \left\{ \big(d(i_P) - x_1(i_P)\big)\ \left\lceil \big(d(i_P) - x_1(i_P)\big)/j_m \right\rceil \right\}. \qquad (25)$$
The bound in (24) derived from (21) observes the following facts: (a) the highest contribution of (21) comes from the first term of it, (b) there are at least xa(iP) number of additional columns occupied by any sequence iP, and (c) each of these xa(iP) additional columns contributes at least Dm2(iP) to Den2(iP).
By considering Den2(iP) in terms of its bound in (24), the design of Γ in the 16-QAM constellation can be easily related to that of the QPSK constellation in
The algorithm starts off, similar to the QPSK example, by attempting to achieve ds,t=dt and Dmin,n2=D12dt, and then, if necessary, these targets are lowered gradually until a solution, Γ, is found. The steps can be summarized as:
Step 1. Find all weight dt sequences iP of the concatenation and load their position vectors into P(dt). The information of every sequence iP includes x1,min(iP) and xa,max(iP) (like xmin(iP) and zmax(iP) of the QPSK example), and c(iP). Once all weight dt sequences are loaded into P(dt), find (or update) Δdis(iP) for each sequence iP in P(dt) and form the set P+(dt) using the modulo 2 addition of the disjoint sequences in P(dt) as previously discussed. Check every new candidate sequence, iP, to see if it is already in P+(dt), and if this sequence already appears in P+(dt), do not enter it in P(dt).
Step 2. Identify E(dt), the set of distinct positions of the weight dt sequences. Rearrange E(dt) in the order of descending popularity of the positions among the sequences in P(dt). Place as many positions of E(dt) as possible on the first row of Γ. If E(dt) has any remaining positions, in contrast to the QPSK case, this alone does not mean that a target value needs to be lowered. Place the remaining positions of E(dt) (set H) by filling in columns one at a time. Try to fill in the leftmost possible columns first and, failing that, move to the right. Try to place the most popular positions first, and try to maintain ds,t and x1,min(iP) for all sequences. In addition, by using Δdis(iP) and following the analysis in (17)-(20), and Methods 1 and 2, make sure that all the inter-sequence constraints are satisfied. If it is necessary to have disjoint sequences share columns, try to avoid the situation where mutually disjoint sequences share any of the same columns. If all positions of E(dt) cannot be placed in Γ to satisfy the above conditions, first try swapping positions. If that fails too, lower ds,t and/or Dmin,n2 until all positions of E(dt) can be placed in Γ while meeting the CICM interleaver constraints for ds,t and Dmin,n2.
Step 3. Set d=d+1. For every sequence iP in every set of sequences P(d≧dt), record ds,temp(iP) and Den2(iP). As described in step 3 of the QPSK case, identify the sets P(d≧dt), P′(d≧dt), P+(d≧dt), E′(d), H′(d), EQ(d≧dt), and ĒQ(d≧dt). Place the positions of E′(d) and H′(d) similarly as described in step 3 of the QPSK example, but without favoring the first row, in order to best meet the ds,t and Dmin,n2 constraints while also satisfying the inter-sequence constraints. If needed, swap positions until all sequences can maintain ds(iP)≧ds,t and Den2(iP)≧Dmin,n2. If d<df, repeat step 3.
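The popularity ordering used in step 2 can be sketched as follows. The helper name `order_by_popularity` is hypothetical, each sequence is assumed to be given as a collection of its non-zero positions, and ties are broken by position index purely for reproducibility (the text does not specify a tie-break rule).

```python
from collections import Counter

def order_by_popularity(sequences):
    """Return the distinct positions of the given sequences, most popular first.

    A position's popularity is the number of sequences in which it is non-zero.
    Ties are broken by ascending position index (an arbitrary choice).
    """
    counts = Counter(pos for seq in sequences for pos in set(seq))
    return sorted(counts, key=lambda pos: (-counts[pos], pos))
```

For instance, for the sequences {1, 4, 7}, {4, 7, 9} and {2, 3, 7}, position 7 is shared by three sequences and position 4 by two, so the ordering begins with 7 and then 4.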
It is interesting to compare the performance of the QPSK and 16-QAM examples with the same concatenated code with BPSK transmission. For that comparison, we use the standard normalized squared Euclidean distance dmin2=Dmin2/2Eb,avg that considers the average bit energy Eb,avg and observe that Eb,avg in the QPSK and the 16-QAM schemes are respectively, Eb,avg,QPSK=a2 and Eb,avg,16-QAM=5a2/2. Hence, the highest achievable dmin2 for the QPSK and 16-QAM schemes (assuming that the interleaver Γ can be designed to achieve the highest possible Dmin2) are dmin,QPSK2=4Rdt and dmin,16-QAM2=4Rdt respectively, where R is the rate of the CTBC code. Note that with BPSK signaling (or QPSK with standard Gray mapping) dmin,BPSK2=2Rdt. Hence, the CICM design of the QPSK is clearly better than the usual QPSK that uses Gray mapping. Interestingly, even the CICM-16-QAM which transmits 4 bits per interval has a higher value of dmin2 than the standard QPSK with Gray mapping.
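The energy figures and distance coefficients quoted above can be checked with a short calculation. The sketch below assumes constellation coordinates of ±a for BPSK and QPSK and ±a, ±3a per axis for 16-QAM. The per-bit squared separations passed to `dmin2_coefficient` (4a2 for BPSK, 8a2 for the QPSK diagonal, 20a2 for 16-QAM) are assumptions chosen to be consistent with the coefficients stated in the text; they are not taken from the constellation figures.

```python
from fractions import Fraction
from itertools import product

def avg_bit_energy(points, bits_per_symbol):
    """Average energy per transmitted bit for equiprobable 2-D points."""
    es = Fraction(sum(x * x + y * y for x, y in points), len(points))
    return es / bits_per_symbol

def dmin2_coefficient(per_bit_sq_distance, eb_avg):
    """Coefficient c in dmin^2 = c * R * dt, assuming Dmin^2 = dt * per_bit_sq_distance."""
    return Fraction(per_bit_sq_distance) / (2 * eb_avg)

a = 1  # amplitude unit; results scale out of the coefficients
bpsk = [(-a, 0), (a, 0)]
qpsk = list(product((-a, a), repeat=2))
qam16 = list(product((-3 * a, -a, a, 3 * a), repeat=2))

eb_bpsk = avg_bit_energy(bpsk, 1)     # a^2
eb_qpsk = avg_bit_energy(qpsk, 2)     # a^2
eb_qam16 = avg_bit_energy(qam16, 4)   # 5a^2/2
```

Under these assumptions, `dmin2_coefficient(8, eb_qpsk)` and `dmin2_coefficient(20, eb_qam16)` both evaluate to 4, while `dmin2_coefficient(4, eb_bpsk)` evaluates to 2, matching the 4Rdt and 2Rdt values given above.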
The next example demonstrates how to determine a CICM mapping policy using PSK constellations. Similar to set partitioning in Ungerboeck's TCM, it is shown how to systematically expand an M=2^m point PSK constellation to form a 2M=2^(m+1) point PSK constellation. With the CICM mapping rule, the MSED of the constellation at the sequence level does not reduce each time the constellation size is doubled.
To begin, consider the construction of a reverse Gray coded 8-ary PSK constellation whose phase angles are in their standard positions, {0, ±π/4, ±π/2, ±3 π/4, π}, as shown in
Similarly, two copies of the 8-ary constellation in
It can be seen from
It is interesting to compare the above CICM mapped 16-ary PSK constellation with the above CICM mapped 16-QAM constellation that is capable of achieving dmin2=4Rdt. The CICM-mapped 16-PSK constellation can be designed to achieve dmin2=8Rdt and thus to perform better than the 16-QAM constellation over both Gaussian channels and fading channels. In fact, if the frame size is large enough so that Γ can be designed to meet the CICM interleaver constraints, then dmin2 of CTBC codes can be increased by increasing the order of signaling M. With the above construction for building reverse Gray coded PSK constellations, each time the PSK constellation size, M, is doubled, the resulting 2M-ary PSK constellation will maintain the same Dmin2 value as the original 4-PSK constellation. However, as M increases, both Dmin,n2 and de also increase, thereby adding more and more sequences iP into the table P(d≧dt), and the number of available columns of Γ, K/m, decreases. As a result, it becomes more difficult to design a valid Γ to achieve higher values of dmin2 without increasing the frame size. Hence, in practice, different orders of signaling can be tested and the best possible order in terms of dmin2 can be chosen. In addition, compared with the 16-QAM constellation, the PSK constellations come with additional advantages due to their constant envelope property. The constant envelope property allows a simpler (less expensive) power amplifier at the transmitter and simpler CSI recovery and equalization at the receiver.
The construction of the interleaver Γ of the 8-ary constellation in
Alternatively, if the reverse Gray coded bit vector of
In the CICM mapping rule design algorithms discussed above, the permutation matrix, Γ, was designed to map coded bits of a CTBC code onto a higher order constellation. While the above CICM mapping rule design algorithms map the coded bits from a CTBC code onto a target constellation, many aspects of CTBC codes were not required in the above presented design algorithms. The CICM mapping rule design algorithm made use of the tables P(d≧dt), d=dt, dt+1, . . . , df. Hence, the CICM mapping rule design algorithm can be easily extended to work with any type of outer code for which the tables P(d≧dt) can be prepared. All that is needed to do this is the ability to identify the low weight sequences. Note that typical BICM systems can be viewed as an outer code that feeds into a uniform interleaver, where the output of the interleaver feeds into a constellation mapper that takes the place of an inner code. Therefore, many different types of codes can be used as outer codes, and if these outer codes can be used to prepare the tables P(d≧dt), then the uniform interleaver in the BICM can be replaced by the CICM interleaver, Γ. These outer codes include block codes, convolutional codes, turbo product codes, and others.
For example, consider a system that involves a simple (8,4) extended Hamming code with d0=4, and that feeds ρ codewords of this code into an interleaver before constellation-mapping onto a QPSK symbol stream. The (8,4) code has the all zero codeword, the all ones codeword and 14 weight d=4 codewords. Therefore, the table P(d≧dt) of this (8,4) code up to weight d=8 will contain (a) 14 codewords of each of the ρ codewords (14ρ in total), (b) all ones codewords of each of the ρ codewords (ρ in total), and
combinations of two codewords each with weight d=4. If the weight on P(d≧dt) needs to increase, we can extend the table to a desired weight.
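The weight distribution quoted for the (8,4) extended Hamming code can be confirmed by enumeration. The generator matrix below is one standard systematic form of this code and is an assumption, since the text does not specify a generator. The count of two-codeword combinations, taken here as C(ρ,2)·14·14, is likewise an assumed reading of the expression that is not reproduced in the text.

```python
from collections import Counter
from itertools import product
from math import comb

# One systematic generator matrix of the (8,4) extended Hamming code (assumed form).
G = [
    (1, 0, 0, 0, 0, 1, 1, 1),
    (0, 1, 0, 0, 1, 0, 1, 1),
    (0, 0, 1, 0, 1, 1, 0, 1),
    (0, 0, 0, 1, 1, 1, 1, 0),
]

def codewords():
    """Yield all 16 codewords as tuples of bits."""
    for msg in product((0, 1), repeat=4):
        yield tuple(sum(b * g for b, g in zip(msg, col)) % 2 for col in zip(*G))

weight_distribution = Counter(sum(cw) for cw in codewords())

def table_entries_up_to_weight_8(rho):
    """Entries of weight at most 8 when rho codewords are framed together.

    Returns (single weight-4 codewords, single all-ones codewords,
    pairs of weight-4 codewords taken from two different codeword slots).
    """
    return 14 * rho, rho, comb(rho, 2) * 14 * 14
```

The enumeration gives one codeword of weight 0, fourteen of weight 4 and one of weight 8. For ρ=3, the helper returns (42, 3, 588).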
Similarly, if there is a way to identify the lowest weight sequences and to thus prepare a corresponding table P(d≧dt), the same method can be applied to other kinds of codes such as various types of convolutional codes. Additional gains can be achieved by using the permutation Γ that is chosen in accordance with the CICM interleaver constraints to rearrange the coded bits of the outer code to form symbols for transmission. It is interesting to note that, even with a relatively simple outer code, Γ can be designed to work with a target signal constellation in order to achieve very good performance. The systematic design of Γ, together with a signal constellation mapped according to a properly identified constellation mapping rule (such as a reverse Gray coded (RGC) constellation mapping rule), allows the CICM mapping approach to be applied in a variety of situations beyond CTBC encoded applications.
At present, it is difficult to enumerate all of the low weight codewords of turbo codes and LDPC codes that have large frame sizes. However, the CICM approach can be applied to certain turbo codes and LDPC codes with small to moderate frame sizes where the low weight error sequences can be enumerated and thus the table P(d≧dt) can be found. All that is needed to apply CICM is that the table P(d≧dt) can be built. Also, for larger frame sizes, if exhaustive algorithms or other kinds of long-running, off-line algorithms are used to identify the low weight error sequences to build the table P(d≧dt), then a CICM mapper can be designed for any such turbo code or LDPC code for which the table P(d≧dt) has been constructed.
Two example approaches are provided below in order to achieve variable redundancy (also known as Rate Matching) in systems that use CTBC codes.
1. Puncturing: In this approach, we first consider a concatenation with a low rate OBC and an accumulator. Then in order to adjust the rate, puncturing is performed at the output of the accumulator. It is well known that a low rate OBC usually comes with a high MHD d0. Even a standard CI-2 would square the effect of this increase in the MHD. For example, consider a (8,4) OBC with d0=4, which can be used with an accumulator and a CI-2 interleaver to construct a concatenation with rate 1/2 that has MHD=16. Consider a (12,4) shortened BCH code derived from a (15,7) BCH code with d0=5. If this (12,4) OBC is used the same way to construct a concatenation, and if the frame size is large enough, the resulting concatenation can achieve a MHD of 25. However, to bring the rate up to 1/2, puncturing can be applied to puncture out, on average, one bit out of every three bits at the output of the accumulator. This puncturing can be done in an optimal manner by following the construction of the interleaver Γ. A set of K/3 coded bits is selected at the output of the accumulator that would maintain the highest MHD of the punctured code. This can be done by trying to preserve most of the coded bits of low weight coded sequences by monitoring the non-zero positions of the low weight sequences as in the construction of Γ. Also, during the execution of the CICM mapping rule design algorithm, the orderings of the sets E(d) and H(d) and/or the constraints used to place the positions in these sets and related sets of positions can preferably be chosen to maintain higher values of ds,t and Dmin,n2 within a subset containing K/m/3 columns of Γ than are achieved in the rest of the columns of Γ.
That is, the puncturing and the design of Γ to assign the punctured coded bits to bit positions within symbols can be done jointly. As an example, consider the above stated concatenation of a (12,4) OBC and an accumulator. Following steps 1 and 2, we can first form the set of distinct positions of all lowest weight sequences, E(d). If the length of E(d) is more than 2K/3, there is no way to remove K/3 bits at the output of the accumulator while preserving the overall MHD of the punctured code at d=dt=25. However, if the length of E(d) is less than 2K/3, there is a chance. The goal is to identify a set of positions that can later be punctured so as to maintain the highest possible MHD after puncturing. This can be done using the sequences that are needed to build Γ, that is, the sequences iP in the tables P(d≧dt), i.e., the sequences in the tables P(d) where dt≦d≦df. If the MHD is to be maintained at dt, no positions of E(dt) can be removed. Similarly, one position from each of the sequences in E(dt+1) can be removed. It will need to be checked that any position removed from E(dt+1) is not also a member of E(dt). So, in general, if the target MHD after puncturing is dt′, then up to (d−dt′) positions can be removed from every sequence in P(d), d>dt′. When selecting positions to remove, always try to find the least popular newly added positions at every weight d, to thereby affect the least number of sequences in P(d) while also maintaining the desired MHD for lower weights.
2. Use of a SPC code with the inner code: In this approach, a high rate OBC is used. Then to adjust the rate lower, a (λ+1, λ) SPC encoder is used to further encode the output of the accumulator. As a result, the IRCC is formed by the concatenation of the accumulator and the SPC code. With this construction, the rate of the overall CTBC code can be readily adjusted by adjusting λ.
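The two rate matching approaches above can be illustrated with a small sketch. The first helper checks the puncturing rule that at most (d−dt′) positions may be removed from any weight d sequence, which is equivalent to requiring that every listed sequence keeps at least dt′ non-zero positions. The remaining helpers compute the resulting code rates, treating the accumulator as rate 1. All function names are hypothetical, and sequences are assumed to be represented as sets of non-zero positions.

```python
from fractions import Fraction

def puncture_is_admissible(table, punctured, dt_target):
    """Check that every listed sequence keeps at least dt_target non-zero bits.

    table     : iterable of sets of non-zero positions (sequences of P(d), dt <= d <= df)
    punctured : positions removed at the accumulator output
    """
    punctured = set(punctured)
    return all(len(set(seq) - punctured) >= dt_target for seq in table)

def punctured_rate(k, n, kept_fraction):
    """Rate of an (n, k) outer code after keeping kept_fraction of the coded bits."""
    return Fraction(k, n) / Fraction(kept_fraction)

def spc_rate(k, n, lam):
    """Rate of an (n, k) outer code followed by a (lam + 1, lam) SPC code."""
    return Fraction(k, n) * Fraction(lam, lam + 1)
```

For the (12,4) example, keeping two out of every three accumulator output bits gives `punctured_rate(4, 12, Fraction(2, 3)) == Fraction(1, 2)`. As a purely illustrative second case, a rate 11/15 outer code followed by a (4,3) SPC code gives `spc_rate(11, 15, 3) == Fraction(11, 20)`.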
Referring now to
The CICM based transmitter involves a CTBC encoder 1905 that is coupled to a CICM signal mapper that includes a CICM interleaver 1910 that is in turn coupled to a reverse Gray coded (RGC) constellation mapper 1915. The CTBC encoder block can be implemented using any of the valid CTBC encoder embodiments discussed herein. The CICM interleaver performs interleaving in accordance with a CICM interleaver rule Γ that is designed as discussed herein to meet one or more CICM interleaver constraints. The CICM interleaver rule permutes the coded bits of the vector v subject to the constraint that, once mapped, the CICM mapped sequence will exhibit the best set of values of ds,t and Dmin2 that can be achieved for a given frame size, for a given constellation size, and for the RGC constellation mapping rule. Also, Γ can be designed to meet subordinate types of constraints such as inter-column and inter-sequence constraints as discussed herein. The RGC constellation mapper maps, for example in accordance with the QPSK or the 16-QAM constellations or 8-PSK constellations as shown in
The CICM based receiver involves an RGC constellation demapper 1920 that is coupled to a CICM deinterleaver 1925 that is coupled to a CTBC decoder 1930. The RGC constellation demapper 1920 performs the inverse operation of the RGC constellation mapper 1915, and in practical embodiments is used to compute a set of bit metrics for later decoding in a SISO decoder. The CICM deinterleaver 1925 performs deinterleaving in accordance with the inverse of the CICM interleaver rule Γ, which is denoted as Γ−1. The output of the CICM deinterleaver 1925 is typically a set of bit metrics that are coupled to the CTBC decoder 1930. The CTBC decoder 1930 can be implemented in accordance with any valid CTBC decoder embodiment discussed herein. However, as each pass is made through the SISO algorithm implemented in the CTBC decoder, in order to compute new bit metrics based on the updated extrinsic information, the bits from the v sequence will need to be mapped via the CICM interleaver 1935 to the RGC signal constellation information so that the bit metrics can be updated. The updated bit metrics then pass back through the CICM deinterleaver 1925 for further SISO decoding in the block 1930.
In the transmitter and/or the receiver 1900, rate matching and other forms of variable redundancy can be implemented using the (λ+1, λ) SPC encoder at the output of the accumulator inside the CTBC encoder/decoder blocks 1905, 1930. In such embodiments, the IRCC in the CTBC code is formed by the concatenation of the accumulator and the SPC code as discussed above. In systems where rate matching and/or other forms of variable redundancy functions are designed into the CICM permutation rule Γ, the blocks 1910 and 1925 can be implemented as discussed above to cause a subset containing less than the full K/m columns of Γ to be transmitted in any given variable redundancy frame or sub-frame.
For example, consider a case where the full CTBC code, v, will be transmitted as three sub-frames. In this case, the permutation Γ can be arranged to send the first set of K/m/3 columns in a first sub-frame, the second K/m/3 columns in a second sub-frame, and the third K/m/3 columns in a third sub-frame. Preferably the columns of Γ are organized so that the first K/m/3 columns of Γ contain a carefully constructed set of columns that maximize a given performance measure, such as the MHD of the CTBC coded vector v, in light of the fact that only the first K/m/3 columns of Γ will be available to the SISO decoder 1930. The second K/m/3 columns of Γ preferably contain a carefully constructed set of columns that maximize the MHD of the CTBC coded vector v in light of the fact that only the first 2K/m/3 columns of Γ will be available to the SISO decoder 1930. When the final K/m/3 columns of Γ have been transmitted, all of the elements of v will be available to the SISO decoder 1930. If further redundancy is then needed, a retransmission protocol can be used so that any specified subset of the columns of Γ can be retransmitted to further increase the probability of correct decoding of the vector, v.
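The sub-frame partitioning in this example can be sketched as follows. The helpers are hypothetical and assume that the number of columns K/m is an exact multiple of the number of sub-frames; how a remainder would be distributed is not addressed in the text.

```python
def subframe_columns(num_columns, num_subframes=3):
    """Split column indices 0..num_columns-1 of Gamma into equal consecutive sub-frames."""
    if num_columns % num_subframes:
        raise ValueError("number of columns must be a multiple of the sub-frame count")
    size = num_columns // num_subframes
    return [list(range(i * size, (i + 1) * size)) for i in range(num_subframes)]

def columns_available_after(num_columns, received, num_subframes=3):
    """Columns available to the decoder once the first `received` sub-frames have arrived."""
    parts = subframe_columns(num_columns, num_subframes)
    return [col for part in parts[:received] for col in part]
```

With K/m=12 columns, the decoder sees columns 0 to 3 after the first sub-frame, columns 0 to 7 after the second, and all twelve columns after the third.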
In embodiments where the CICM interleaver 1910 and deinterleaver 1925 are designed to work in variable redundancy systems, there will be additional control logic associated with the blocks 1910 and 1925 to implement the variable redundancy protocol. Information at a control channel level or some other higher layer such as a radio link layer or a radio physical layer control entity or data stream will be coupled to a control element associated with each of the blocks 1910 and 1925, and these control elements can be considered to be a part of the blocks 1910 and 1925 in such embodiments involving rate matching or other forms of CTBC/CICM adaptive modulation and coding.
In embodiments as mentioned above where some other form of coding is used besides CTBC codes, the blocks 1905 and 1930 can be configured to encode and decode in accordance with any selected form of coding for which the table P(d≧dt) can be constructed for d=dt, . . . , df. For example, any type of block code, and most types of trellis codes, convolutional codes, and certain turbo codes and LDPC codes can be used in the blocks 1905 and 1930 in these types of embodiments to achieve the benefits of CICM and CICM based variable redundancy as described herein.
In typical CICM communications embodiments, an encoder will be used that converts a sequence of input bits to an encoded bit sequence in accordance with an encoding rule. The encoding rule can be CTBC encoding or could be any other coding rule such as a block code or a convolutional code or any other code that produces a frame of K encoded bits, and where the encoding rule has the property that, for all possible sequences of input bits, all possible low weight encoded bit sequences, iP, of weights dt≦d≦df can be identified and enumerated, where none of the possible low weight encoded bit sequences, iP, can have a weight less than dt, and the weights dt≦d≦df correspond to Hamming distances. Such embodiments will also include a constrained interleaver that is configured to implement an m×K/m permutation rule. The m×K/m permutation rule is configured to permute the K encoded bits of the encoded bit sequence to a sequence of K/m subsets that each contain m encoded bits. This permutation can be optionally/preferably implemented using the CICM permutation matrix, Γ. A constellation mapper will then receive the sequence of K/m subsets and use a pre-defined constellation mapping rule to convert the sequence of K/m subsets to a sequence of K/m 2^m-ary signal constellation points. The m×K/m permutation rule and the constellation mapping rule are jointly selected to ensure that a pre-defined target value of MSED is maintained for all of the possible low weight encoded bit sequences, iP, of weights dt≦d≦df. The m×K/m permutation rule and the constellation mapping rule are preferably also jointly selected to ensure that a pre-defined target value of minimum symbol Hamming distance, ds, is maintained for all of the possible low weight encoded bit sequences, iP, of weights dt≦d≦df.
The Hamming distance df is preferably selected to ensure that any possible encoded bit sequence, iP, that has a weight d>df will be guaranteed to have at least the pre-defined target value of MSED and the pre-defined target value of minimum symbol Hamming distance, ds. In typical embodiments, the constellation signal mapper uses either anti-Gray coding or RGC.
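The data path just described (K encoded bits, an m×K/m permutation, then one 2^m-ary point per column) can be sketched as below. The function name `cicm_map` is hypothetical, and the QPSK labeling in the usage note is only an illustrative assignment in which flipping the first bit alone moves to the diagonally opposite point; it is not taken from the constellation figures.

```python
def cicm_map(encoded_bits, gamma, constellation):
    """Map K encoded bits to K/m constellation points through an m x K/m index matrix.

    gamma[r][c]   : index into encoded_bits of the bit placed at row r, column c
    constellation : dict from m-bit tuples to complex points
    """
    m, ncols = len(gamma), len(gamma[0])
    indices = sorted(i for row in gamma for i in row)
    if indices != list(range(len(encoded_bits))) or m * ncols != len(encoded_bits):
        raise ValueError("gamma must use every bit index exactly once")
    symbols = []
    for c in range(ncols):
        label = tuple(encoded_bits[gamma[r][c]] for r in range(m))
        symbols.append(constellation[label])
    return symbols

# Illustrative QPSK labeling (an assumption): the first bit selects between
# diagonally opposite points, the second bit between adjacent points.
QPSK = {(0, 0): 1 + 1j, (1, 0): -1 - 1j, (0, 1): -1 + 1j, (1, 1): 1 - 1j}
```

For example, `cicm_map([1, 0, 0, 1], [[2, 0], [1, 3]], QPSK)` forms the column labels (0, 0) and (1, 1) and returns the points `[1 + 1j, 1 - 1j]`.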
As is discussed in further detail below in connection with
Referring now to
Next control passes to an action 2010 which initializes d to d=dt. Control next passes to an action 2015 which determines a set of sequences, {iP}=P(d), which includes the position vectors p(iP) for the weight d CTBC coded sequences, iP=0, . . . , NP(d)−1. Other information can be optionally included in the table P(d) such as the sets of associated vectors v(iP) and c(iP), for example. Also, the newly identified constituent table P(d) can be used to update an aggregate table, P(d≧dt), and the sequences iP=0, . . . , NP(d)−1 can be added to a larger set of sequences with weights d≧dt, iP=0, . . . , NP(d≧dt)−1.
Control next passes to an action 2020 which attempts to place into Γ any and all of the positions associated with the sequences iP in P(d) that have not already been placed. As discussed above, all such placements are made in accordance with the CICM constraints, i.e., ds(iP)≧ds,t and Den2(iP)≧Dmin,n2. The placements of these positions can also optionally be made in accordance with the inter-column and inter-sequence constraints as discussed above. Moreover, any of the swaps discussed above or similar types of swaps can be made with the goal of enforcing the CICM interleaver constraints on all sequences in P(d≧dt) for the current value of d as determined by the action 2010 or 2035.
Control next passes to an action 2025 which determines whether the CICM interleaver constraints were able to be achieved in the action 2020. If the CICM interleaver constraints were achieved, control passes to the action 2035 where the distance, d, is incremented as d=d+1. If the CICM interleaver constraints were not achieved, control passes to the action 2030 where the target minimum symbol Hamming distance, ds,t, and the target normalized minimum Euclidean distance, Dmin,n2, are decreased to their next lower values, which preferably correspond to the highest possible values below their current values. Control first passes out of action 2030 to action 2015 to allow the design algorithm to attempt to place the current set of {iP} sequences in P(d) using these lowered values. When this branch is taken out of the action 2030, the action 2030 preferably removes already placed positions that can be removed from Γ without violating the CICM interleaver constraints subject to the lowered ds,t and Dmin,n2 values. If the action 2030 is reentered after this attempt fails, the second branch out of the action 2030 will be taken to restart the algorithm at the action 2010 using the original d=dt value.
Control passes out of action 2035 to action 2040 where it is determined if the incremented value of d is greater than df. If d is not greater than df, then control passes from the action 2040 to the action 2015. If d is greater than df, then control passes from the action 2040 to the action 2045. This logic ensures that the algorithm is allowed to run for d=dt, dt+1, . . . , df. The action 2045 provides a valid CICM permutation matrix, Γ, which identifies the CICM interleaver rule.
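The control flow of the actions 2010 through 2045 can be sketched as a simple loop. This is a simplified skeleton under stated assumptions: the sequence search and the placement logic are supplied by the caller as the hypothetical callbacks `find_sequences` and `try_place`, the candidate targets are given as a pre-sorted list, and on any failure the sketch always takes the restart branch (lower the targets and begin again at d=dt) rather than first retrying at the current d.

```python
def design_gamma(dt, df, targets, find_sequences, try_place):
    """Skeleton of the design loop.

    targets        : list of (ds_t, Dmin_n2) pairs, most demanding first
    find_sequences : callable d -> list of weight-d sequences (action 2015)
    try_place      : callable (gamma, sequences, target) -> bool; places positions
                     into gamma and reports whether the constraints hold (2020/2025)
    Returns (gamma, target) for the first target that succeeds (action 2045).
    """
    for target in targets:              # each new iteration plays the role of action 2030
        gamma = {}                      # restart with an empty placement
        d = dt                          # action 2010
        succeeded = True
        while d <= df:                  # action 2040
            if not try_place(gamma, find_sequences(d), target):
                succeeded = False
                break
            d += 1                      # action 2035
        if succeeded:
            return gamma, target
    raise RuntimeError("no candidate target could be met for this frame size")
```

The skeleton only fixes the order in which weights and targets are visited; all of the substance of the design lies in the placement callback.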
Next consider the problem of designing vectorizable permutations for the CICM permutation, Γ. The CICM permutation has been defined in terms of the m×K/m permutation matrix, Γ. If the m×K/m permutation matrix, Γ, is viewed as
$$\Gamma = [\Gamma_1\ \Gamma_2\ \cdots\ \Gamma_{K/m}] \in \mathbb{Z}^{m \times K/m}, \qquad (26)$$
where each Γj∈Z^m, for j=1, . . . , K/m, then one can define
The elements of the matrix Γ′DCI as defined in equation (27) correspond to permutation indices that point back into the vector v. In terms of the CI-2, CI-3, or CI-4 type permutations, c=π−1[v], so that indirection can be used (r→v→c) to construct another permutation matrix, ΓDCI. That is, ΓDCI is defined to be a matrix just like Γ′DCI of equation (27), but whose elements correspond to permutation indices pointing back into the vector c instead of the vector v. The elements of the matrix ΓDCI are related to the elements of the matrix Γ′DCI via the constrained interleaver permutation u=π[c]. The reason ΓDCI is defined in terms of the coded bit positions of the vector c is because of the way data is stored in the parallel access 2D memory 710, 1160, 1240 used within the above described SISO decoders described in connection with
As discussed in connection with
If the CICM permutation matrix, Γ, is already known, then the construction of ΓDCI via Γ′DCI of equation (27) is straightforward. However, in practice it will often be desirable to select the permutation πCICM[] to satisfy Constraint 6, so that it can be factored as ΓDCI=πCICM[c]=πLSB,CICMπi
To understand how the memory 710 and related 2D memories used in the SISO decoder (1160, 1240) can be accessed in accordance with all of the C, U, and ΓDCI matrices, consider an example where the constrained interleaver, U=π[C] is a DCI with πMSB[] selected to correspond to the MSBs of a QPP interleaver as discussed above in connection examples discussed using
As is known to those of ordinary skill in the art, M. Isaka et al., “On the iterative decoding of multilevel codes,” IEEE JSAC, Vol. 19, No. 5, May 2001, pp. 935-943 (“the Isaka reference”) teaches known ways to update a set of bit metrics when higher order constellations are in use. Using this as a starting point, an aspect of the present invention uses this concept to update a set of bit metrics after the soft decoding of the inner code. Similar to calculating the extrinsic information (LLR values) of the input bits of the inner code, the extrinsic information of the output bits of the inner code can also be found at the same time during the soft decoding of the inner code. Just like the calculation of the extrinsic information of the input bits, the extrinsic information of the output bits can again be calculated by considering the transitions that favor bit 0 and bit 1 separately for the output bit in consideration. Using these extrinsic LLR values of the output bits of the inner code which form the M-ary transmitted symbols, the probability of each of the M symbols during every interval can be calculated.
For example, consider the case where 8-PSK is used for transmission. In such a system, during any interval, three output bits of the CTBC code, v1, v2 and v3, are used to form an 8-PSK constellation point. Let Le1, Le2 and Le3 be the extrinsic information of these three bits found in the decoding of the inner code. In order to calculate the updated bit metric of v1, identify a set of constellation points, S0, that favor the event that v1=0 and another set of constellation points, S1, that favor the event that v1=1. In this 8-PSK example, the sets, S0 and S1, will contain four constellation points each. Note that the ith extrinsic LLR value, denoted Lei, can be expressed in terms of the probabilities as Lei=ln{{P(vi=0)}/{P(vi=1)}}. Hence, P(vi=0) and P(vi=1) can be expressed as,
$$P(v_i=0) = \frac{e^{L_{ei}}}{1+e^{L_{ei}}}, \qquad P(v_i=1) = \frac{1}{1+e^{L_{ei}}} \qquad (28)$$
Therefore, for every constellation point sj in S0 and S1, the probability contribution to constellation point sj can be found using the extrinsic information from the remaining bit positions 2 and 3 by multiplying the respective probabilities of the bit positions 2 and 3 obtained according to equation (28). Then the bit metric for v1, b(v1), can be updated by following equation (27) of the Isaka reference as,
The same process can be continued to calculate the updated bit metric of the other two bit positions v2 and v3 as well. For example, to calculate b(v2) the sets S0 and S1 will be defined in accordance with bit v2 instead of bit v1. The value of b(v1) in the above equation can be approximately calculated by considering only the significant term in each summation which results in equation (18) of the Isaka reference as
where x20 and x21 represent the second bits (in natural binary) of the constellation points Sa and Sb chosen in the sets S0 and S1, respectively, in the maximization. Similarly, x30 and x31 represent the third bit positions of Sa and Sb respectively.
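The bit metric update described above can be sketched numerically. The expressions from the Isaka reference are not reproduced in this text, so the code below follows the verbal description only: bit probabilities are obtained from the extrinsic LLRs, each constellation point is weighted by its channel likelihood and by the probabilities of the other bit positions, and the metric is the log ratio of the sums over S0 and S1. The Gaussian likelihood exp(−|r−s|²/N0), the natural binary 8-PSK labeling and all names are assumptions made for illustration.

```python
import cmath
import math

def bit_probs(le):
    """Return (P(v=0), P(v=1)) for an extrinsic LLR le = ln(P(v=0)/P(v=1))."""
    p0 = 1.0 / (1.0 + math.exp(-le))   # equals e^le / (1 + e^le)
    return p0, 1.0 - p0

def updated_bit_metrics(r, ext_llrs, points, labels, noise_var):
    """Updated metric ln(sum over S0 / sum over S1) for each bit of one symbol.

    r         : received complex sample
    ext_llrs  : extrinsic LLRs of the m bits forming the symbol
    points    : constellation points; labels[k] is the m-bit tuple of points[k]
    noise_var : N0 in the assumed likelihood exp(-|r - s|^2 / N0)
    """
    m = len(ext_llrs)
    probs = [bit_probs(le) for le in ext_llrs]
    metrics = []
    for i in range(m):
        sums = [0.0, 0.0]                       # accumulators for S0 and S1
        for s, label in zip(points, labels):
            weight = math.exp(-abs(r - s) ** 2 / noise_var)
            for j in range(m):
                if j != i:                      # only the other bit positions contribute
                    weight *= probs[j][label[j]]
            sums[label[i]] += weight
        metrics.append(math.log(sums[0] / sums[1]))
    return metrics

# Assumed 8-PSK layout: point k at phase 2*pi*k/8, labeled by k in natural binary.
PSK8_POINTS = [cmath.exp(2j * math.pi * k / 8) for k in range(8)]
PSK8_LABELS = [tuple((k >> (2 - j)) & 1 for j in range(3)) for k in range(8)]
```

Replacing each sum by its largest term gives the max-log style approximation mentioned in the text. With a received sample at the origin and zero extrinsic information, all three metrics are zero, as expected by symmetry.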
Note that the elements of each Γj∈Z^m in equation (26) have different weights, e.g., 4E, 3.8478E, 3.4142E, 2E. As discussed above, it is important that certain specified elements of various low weight sequences listed in the relevant tables P(d) map to rows of Γ that correspond to the higher weights. Therefore, certain permutations πMSB,CICM[] will be favored over others. For example, if a particular permutation πMSB,CICM[] maps the bulk of the elements of the low weight sequences listed in the relevant tables P(d) to the higher weighted rows of Γ, this permutation will be favored over other candidate permutations. This criterion can be used to select a good candidate permutation, πMSB,CICM[], over other candidates. For example, if a QPP permutation is being used to implement πMSB,CICM[], a set of QPP parameters can be selected based on a measure of how well the permutation maps the coded bits of the low weight error sequences to the higher weighted rows in ΓDCI. Also, the permutation πMSB,CICM[] can be specially designed as a deterministic permutation rule that provides a good measure of the mapping of the coded bits of the low weight error sequences to the higher weighted rows in ΓDCI.
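The example weights quoted above coincide numerically with squared chord lengths on a circle of energy E, namely 2E(1−cos θ) for θ=π, 7π/8, 3π/4 and π/2. Reading them as per-row squared separations of a 16-PSK constellation is an inference from these numbers, not something stated in the text, and the helper name `chord_sq` is hypothetical.

```python
import math

def chord_sq(theta, energy=1.0):
    """Squared distance between two points of energy `energy` separated by angle theta."""
    return 2.0 * energy * (1.0 - math.cos(theta))

# Angles whose squared chords reproduce the quoted weights (in units of E).
ROW_ANGLES = (math.pi, 7 * math.pi / 8, 3 * math.pi / 4, math.pi / 2)
ROW_WEIGHTS = [round(chord_sq(theta), 4) for theta in ROW_ANGLES]
```

Evaluating `ROW_WEIGHTS` gives 4.0, 3.8478, 3.4142 and 2.0, which shows how strongly the contribution of a coded bit depends on the row of Γ in which it is placed.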
It is also possible to define a modified permutation πMSB,CICM[] that is modified to perform local inter-row permutations, e.g., if m=3, [irow, irow+1, irow+2]→[irow+2, irow, irow+1]. The local inter-row permutations can be used to find more favorable permutations to be applied to the columns in accordance with the weighting of the rows of ΓDCI. Such permutations could be applied per m rows and per column. That is, different groups of m rows and different columns could be modified on an individual basis. All such modifications are contemplated; however it is realized that such embodiments involve more hardware complexity. In the discussion below, a simpler permutation rule design example is provided, but it is understood that such additional modifications can be made to improve the ability to find good permutations at the expense of additional real-time hardware requirements and complexity.
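As a hedged illustration of the local inter-row permutation example given above for m=3, the following Python sketch reorders the entries of one group of m rows within a single column. The list-of-lists layout for Γ, the function name, and the reading of the arrow (the entry listed first in the new order is moved into the first row of the group) are assumptions made only for this illustration.

```python
def permute_rows_in_column(gamma, col, first_row, order):
    """Reorder one group of rows inside one column of gamma (in place).

    gamma     : list of rows, each row a list of coded-bit positions (assumed layout)
    col       : column to modify
    first_row : index irow of the first row in the group of m rows
    order     : offsets such that the new entry at row first_row+k is the
                old entry at row first_row+order[k]
    """
    old = [gamma[first_row + k][col] for k in range(len(order))]
    for k, src in enumerate(order):
        gamma[first_row + k][col] = old[src]
    return gamma

# Hypothetical 3x2 matrix of coded-bit positions (m = 3 rows, 2 columns).
gamma = [[0, 1],
         [2, 3],
         [4, 5]]

# [irow, irow+1, irow+2] -> [irow+2, irow, irow+1], applied to column 0 only.
permute_rows_in_column(gamma, col=0, first_row=0, order=[2, 0, 1])
```

Because the reordering is applied per column and per group of m rows, different columns can receive different local permutations, which is the source of the additional hardware complexity noted above.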
To understand how to design ΓDCI, refer again to
Once ΓDCI is designed and is available, the method, apparatus and systems of 700 and 1100 of
Hence it should be understood that an aspect of the present invention is a parallel, vectorizable DCI that is able to provide three or more different permutation sequences from the same memory, 710. The present invention contemplates that such structure and functionality can lead to improved joint coding and modulation/signal mapping systems with improved coding performance that is derived from improved Euclidean distance and/or symbol Hamming distance.
Multimedia applications usually require unequal error protection for different types of information streams. For example, in systems where both data and Voice over IP (VoIP) packet streams are present, the data streams and the VoIP streams can have different required levels of error probability/error rates. Similarly, in live streaming video, the video stream and the audio stream can have different required levels of error probability/error rates. Multilevel codes are often used to provide unequal levels of error protection to different data streams by employing a more powerful code for the data stream(s) that require a higher level of protection.
While the CICM mapping rule design algorithms described above were used to design mapping rules that had the same error probability for all message bits, CICM mapping rules can also be designed to provide different levels of error protection for different subsets of the message bits. For example, before being passed over a data link connection, a network layer or link layer interface unit can be used to examine a packet stream to be sent over the data link/physical channel. The bits in the packet stream may be categorized as packet header bits and according to packet payload type as indicated by the header bits. In a given example, header bits and TCP packet payloads could be assigned a first error protection level, while VoIP and audio payloads could be assigned a second error protection level and video payloads a third error protection level. In broadcasting applications, different error protection levels could be assigned for use with control bits, audio data stream bits, and video data stream bits.
To understand how a CICM mapping rule can be designed to provide unequal levels of error protection, consider a specific example where an (8,4) block code is used to perform the coding and the coded bits of this (8,4) code are then constellation mapped onto a QPSK constellation using a reverse Gray code mapping. In this example, assume that each frame of N message bits can be divided into a first stream that has N/2 message bits and a second stream that also has N/2 message bits. The first stream is assumed to require a lower error probability (higher error protection), while the second stream requires a lower level of error protection (allows a higher error probability) as compared to the first stream. In this example, K/2=N coded bits from the first stream and K/2 coded bits from the second stream are to be transmitted jointly in a frame of K coded bits using QPSK modulation while maintaining the lower error rate for the first stream. This can be accomplished by placing all of the K/2 coded bits of the first stream on row 1 of the CICM interleaver, Γ, and all of the K/2 coded bits of the second stream on row 2. In this example using the (8,4) Hamming code and the reverse Gray coded QPSK constellation, the minimum squared Euclidean distance of the first stream will be Dmin,12=4*8a2=32a2, while the minimum squared Euclidean distance of the second stream will be Dmin,22=4*4a2=16a2. Hence, instead of using two different codes as is performed in MLC systems, the CICM interleaver rule can be used to produce unequal error protection while using just a single block code applied separately to both of the bit streams associated with each of the reverse Gray coded bits of the QPSK constellation.
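The two distances in this example can be checked numerically. The following Python sketch is illustrative only: the specific bit labeling of the QPSK points is a hypothetical reverse Gray labeling chosen so that a flip of the row 1 bit moves to the diagonally opposite point and a flip of the row 2 bit moves to an adjacent point, and the function names are not taken from the specification. The sketch assumes a=1 and the minimum Hamming distance of 4 of the (8,4) code.

```python
def reverse_gray_qpsk(a=1.0):
    # Hypothetical labeling (row-1 bit, row-2 bit) -> constellation point.
    return {
        (0, 0): complex(a, a),
        (0, 1): complex(a, -a),
        (1, 0): complex(-a, -a),
        (1, 1): complex(-a, a),
    }

def per_row_sed(constellation, row):
    """Smallest squared distance between two points whose labels differ only in bit `row`."""
    seds = []
    for label, point in constellation.items():
        flipped = list(label)
        flipped[row] ^= 1
        d = point - constellation[tuple(flipped)]
        seds.append(d.real ** 2 + d.imag ** 2)
    return min(seds)

def stream_msed(hamming_dmin, row_sed):
    # Each of the dmin differing coded bits sits in its own symbol on the same row.
    return hamming_dmin * row_sed

qpsk = reverse_gray_qpsk(a=1.0)
row1_sed = per_row_sed(qpsk, 0)   # 8a^2 for the row 1 bit
row2_sed = per_row_sed(qpsk, 1)   # 4a^2 for the row 2 bit
d_min1_sq = stream_msed(4, row1_sed)
d_min2_sq = stream_msed(4, row2_sed)
```

With a=1 the sketch reproduces the values 32a2 and 16a2 stated above for the first and second streams.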
The above example can be extended to any constellation with any number of data streams. The codewords of the block code of any identified stream are permuted to a specified row of Γ so as to meet a desired minimum squared Euclidean distance. For example, in the above case of two streams, if 16PSK or 16-QAM is used, then the first and second rows of the CICM interleaver matrix may be primarily used for the first stream that requires higher protection while the last two rows can be primarily used for the second stream.
While the previous example used a single block code, this same basic approach can also be extended to applications where a convolutional code or a concatenated code, like a CTBC code, is used in lieu of the above-described (8,4) block code to supply the coded bits to be constellation mapped, e.g., to a reverse-Gray coded constellation, via CICM. In general, any of the above-mentioned codes or any other code whose P(d) tables can be identified can be used. In such situations, CICM is generally designed using a respective set of P(d) tables and by then ensuring that different low distance error sequences as listed in the P(d) tables end up achieving corresponding different desired MSEDs while also maintaining a corresponding symbol Hamming distance, ds. The discussion below explains how this is achieved in the context of a CTBC code.
When the coding is performed in accordance with a CTBC code, and when it is desired to provide unequal error protection to different sub-streams of message bits, it is necessary to first identify a subset of codewords of the OBC to be used to encode the message bits from the different sub-streams. For any given sub-stream, there will be an associated set of OBC codewords that will correspond to the inputs of the IRCC that end up generating a corresponding subset of coded bits, vs, of the entire sequence, v. Next consider the error sequences that involve any of the coded bit positions that correspond to elements of vs. All low distance error sequences that need to be considered will be listed in the P(d) tables used to design the CICM interleaver, Γ. If all of the low distance error sequences involving coded bit positions from the subset vs can be ensured to have a specified higher Euclidean distance, then it is possible to maintain the specified higher level of protection for the message bits associated with the corresponding subset of codewords of the OBC that correspond to the coded bit positions of vs.
Hence, when designing a CICM mapping rule, it is desirable to use rows of Γ with a higher Euclidean distance for the coded bit positions involved in the error sequences that include any of the elements of vs. Depending on the constellation and the desired number of streams, the low distance error sequences listed in the P(d≧dt) tables can be used to form different groups of coded bits that will need respective unequal error protection levels.
In order to systematically select the sets of codewords of the OBC for different levels of error protection, let us consider the case where we have already constructed a Γ for equal error protection. At this point the P(d≧dt) table that lists all the coded sequences of v up to weight df will have already been prepared. The sequences in the P(d≧dt) table can then be used to calculate the actual Squared Euclidean Distance (SED) of each coded sequence in the table P(d≧dt), each of which, by construction, must be at least as high as Dmin2. The goal is to next identify two sets of codewords, CW1, which contains codewords that have a higher level of protection, i.e., of at least Dmin,12, and CW2, which contains the remaining codewords that have a lower level of protection, i.e., of at least Dmin,22, where Dmin,22<Dmin,12.
At this point some observations will be made that will help to develop an algorithm to identify the sets CW1 and CW2 given a particular code and given a starting CICM permutation matrix, Γ, that was developed for equal error protection. The CICM permutation matrix, Γ, will then need to be modified/adjusted in a way that maintains the symbol Hamming distance at ds and achieves the targets Dmin,12 and Dmin,22 for the identified sets CW1 and CW2. In order to determine how to identify the sets CW1 and CW2 and to modify Γ to achieve these goals, the following observations are made:
Observation 1: Consider any codeword cj=(cj0, cj1, . . . cjn-1) of the OBC that places its tth coded bit, cjt, at the ith position of u at the input of the IRCC, i.e., u(i)=cjt. Identify the corresponding v(i) (output of the IRCC) for the corresponding u(i)=cjt. For this cjt, identify each sequence iP listed on P(d≧dt) that contains the corresponding position i. Note that the position i can be listed in multiple sequences contained in the tables P(d≧dt). Next calculate the SED, denoted as SED(iP), for each identified sequence iP that contains position i. Using all the identified sequences iP that contain the position i, find the minimum of all the SED(iP) values. Denote the minimum SED(iP) value for position i which corresponds to cjt as D2(cjt). Continue this process for all of the bit positions in the codeword cj for t=0, . . . n−1. Repeat the same process for all codewords cj, j=0, . . . , ρ−1. At this point the squared Euclidean distance of each coded bit position of each of the codewords cj will have been computed for all coded sequences in the tables P(d≧dt). Next find the minimum of D2(cjt) among all t=0, . . . , n−1, for each codeword cj as D2(cj)=Mint {D2(cjt)}. This calculation implies that the current CICM permutation Γ will cause each codeword cj to have a MSED of D2(cj).
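The bookkeeping of Observation 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the dictionary `codeword_positions` (mapping each codeword index j to the positions i in v of its coded bits t=0, . . . , n−1) stands in for the constrained interleaver and IRCC mapping, the SED of each listed sequence is taken as already computed for the current Γ, and a coded bit whose position appears in no listed sequence is treated as unconstrained (infinite distance). All names and numeric values are hypothetical.

```python
def min_sed_per_codeword(codeword_positions, low_weight_seqs, sed):
    """Return (D2 per coded bit, D2 per codeword) as described in Observation 1."""
    d2_bit = {}
    d2_codeword = {}
    for j, positions in codeword_positions.items():
        per_bit = []
        for t, i in enumerate(positions):
            # SED values of every listed sequence that contains position i.
            hits = [sed[s] for s in low_weight_seqs if i in s]
            value = min(hits) if hits else float("inf")
            d2_bit[(j, t)] = value
            per_bit.append(value)
        # D2(cj) is the minimum of D2(cjt) over t.
        d2_codeword[j] = min(per_bit)
    return d2_bit, d2_codeword

# Hypothetical toy data: two codewords of n = 2 coded bits each.
codeword_positions = {0: [0, 3], 1: [1, 2]}
low_weight_seqs = [(0, 1), (2, 3), (1, 2)]
sed = {(0, 1): 40.0, (2, 3): 32.0, (1, 2): 24.0}

d2_bit, d2_codeword = min_sed_per_codeword(codeword_positions, low_weight_seqs, sed)
```

In this toy case codeword 0 ends up with a larger D2(cj) than codeword 1, so codeword 0 would be the natural candidate for the better protected set.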
At this point, it may be possible to choose the group of codewords cj with higher D2(cj) for the set CW1 and the rest for CW2, without making any changes to Γ. If this is not the case, modification/adjustments can be made to Γ in order to increase the MSED separation between the sets CW1 and CW2.
Observation 2: As stated in observation 1, D2(cjt) is the minimum taken over all sequences on P(d≧dt) that contain position v(i), and D2(cj) is the minimum of D2(cjt) over all bit positions t in the codeword. Hence, in order to increase D2(cj), one needs to focus on the cjt,min that determined D2(cj), i.e., D2(cj)=D2(cjt,min(1)). Note that D2(cjt,min(1)) will have been determined by one or a few of the low weight coded sequences listed in P(d≧dt). Further, if D2(cjt,min(1)) can be increased up to the next lowest D2(cjt) value among t=0, . . . , n−1, denoted by D2(cjt,min(2)), then D2(cj) will have been increased up to D2(cjt,min(2)). In order to realize an increase in D2(cj), it will be necessary to judiciously swap some positions in Γ, preferably with the smallest number of swaps. If possible, one can attempt to increase each D2(cj) value gradually in n steps up to D2(cjt,max) for all codewords in a set which would become CW1, where D2(cjt,max) is the maximum D2(cjt) among all t=0, . . . , n−1. Each such increase will come about as a result of a modification/adjustment in Γ.
Observation 3: It was seen in observation 2 above that any D2(cjt) can be adjusted by performing a sequence of swaps that cause the SED of selected corresponding coded sequences in P(d≧dt) to be adjusted. Hence, next consider how to perform swaps to change the SED(iP) value corresponding to any low weight coded sequence iP listed in P(d≧dt). Due to the assumed previous construction of Γ, all of the positions of iP will already have been placed into Γ. With respect to the previously discussed QPSK example, some of the positions of iP could have been placed on row 1 while the rest were placed on row 2. Denote the portion of iP on row 1 by iP-1 and the portion of iP on row 2 by iP-2. Hence, iP-1 represents the positions of iP that can be swapped to lower the corresponding SED(iP) value, while iP-2 represents the positions of iP that can be swapped to increase the SED(iP) value. If either of the iP-1 or iP-2 sets is empty, then the corresponding coded sequence iP can only increase (if iP-1 is empty) or decrease (if iP-2 is empty) the SED(iP) value. Further, for every coded sequence in the tables P(d≧dt), one can also determine the maximum possible SED that coded sequence can achieve, SEDmax(iP), which will be realized if all of the positions in iP-2 can be moved to row 1.
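For the QPSK example, the quantities in Observation 3 can be sketched in Python as shown below. The sketch assumes that each position of a listed sequence falls in a distinct symbol (as the symbol Hamming distance constraint is intended to ensure), so that a position on row 1 contributes 8a2 and a position on row 2 contributes 4a2 to the SED. The `row_of` mapping and the function names are hypothetical, and a2 is set to 1.

```python
A2 = 1.0                                  # a^2, assumed normalization
ROW_SED = {1: 8 * A2, 2: 4 * A2}          # per-position contribution by row

def split_by_row(seq, row_of):
    """Return (iP-1, iP-2): the positions of seq on row 1 and on row 2."""
    on_row1 = [i for i in seq if row_of[i] == 1]
    on_row2 = [i for i in seq if row_of[i] == 2]
    return on_row1, on_row2

def sed(seq, row_of):
    """SED(iP) under the current row placement."""
    return sum(ROW_SED[row_of[i]] for i in seq)

def sed_max(seq):
    """SEDmax(iP): every position of the sequence moved to row 1."""
    return ROW_SED[1] * len(seq)

# Hypothetical weight-4 sequence with two positions on each row.
row_of = {0: 1, 1: 2, 2: 2, 3: 1}
seq = (0, 1, 2, 3)
ip1, ip2 = split_by_row(seq, row_of)
current = sed(seq, row_of)
best = sed_max(seq)
```

Here the sequence currently has an SED of 24a2 and could reach 32a2 if both of its row 2 positions were moved up to row 1.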
Observation 4: Based on observations 1-3 above, the MSED of any OBC codeword cj will be determined by the D2(cjt) values of one or a few of its coded bits cjt. Further, each of the D2(cjt) values will be determined by the SED(iP) values associated with each coded sequence iP∈{iP(cjt)}={iP,cjt(1), . . . , iP,cjt(kjt)}, where kjt is the total number of coded sequences that can influence D2(cjt). Hence, it is seen that the MSED of codeword cj, D2(cj), will be determined by the SED of a particular coded sequence, for example, iP,cjt(l). The value of D2(cj) can be increased by swapping one or more positions of iP,cjt-2(l) with iP,cjt-1(l). This will have the effect of swapping positions of the coded sequence iP,cjt(l) that are currently placed on row 2 with positions not directly related to iP that are currently placed on row 1. Before making such a swap, a check can be made to determine whether the movement of the position currently in row 1 to row 2 will violate any prescribed conditions. Further, the highest SED that codeword cj can reach, Dmax2(cj), can also be found by assuming that the SED of the worst case coded sequences related to cj can be increased up to its maximum by successfully moving all of the associated positions from row 2 to row 1. That is, for every coded sequence iP listed in P(d≧dt), SED(iP) could be increased up to SEDmax(iP) if all its positions on the second row can be successfully swapped.
Consider a set of ns sequences on P(d≧dt), iP,ns={iP1, iP2, . . . , iPns} for which SEDmax(iPi)≧Dmin,12 for i=1, 2, . . . ns. Note that some of these coded sequences may have an SED(iPi) value that is above the threshold i.e. SED(iPi)≧Dmin,12. Because the CI will define a bidirectional permutation, π, between cjt and its corresponding position i as per v(i), a reverse permutation (inverse constrained interleaving operation, π−1), i.e., v(i)→cjt, can be defined which is referred to as “de-permuting” herein. Using this de-permuting process, next find the corresponding coded bit positions, cjt, of each and every position found in any of the low weight sequences in the set iP,ns. After that de-permuting process, if all the coded bits of a set of ρ1 codewords can be found whose SED(iPi) values all satisfy SED(iPi)≧Dmin,12, then those ρ1 codewords can be used to form a set like CW1 to maintain a MSED of Dmin,12.
Therefore, to identify a set of ρ1 codewords of the OBC for a level of protection determined by Dmin,12, first find the SEDmax(iP) values for every coded sequence iP listed in the tables P(d≧dt). Next form the set of sequences iP,ns by considering the set of coded sequences iP for which SEDmax(iP)≧Dmin,12. Next de-permute all the positions of all sequences in iP,ns. At this point, it is determined whether at least ρ1 codeword positions satisfy D2(cj)=Mint {D2(cjt)}≧Dmin,12. If not, this means no modifications/adjustments can be made to the current CICM permutation matrix Γ in order to cause ρ1 codewords to achieve a MSED of Dmin,12. In addition, for all sequences in iP,ns, let ch(iP) denote the number of positions of the sequence iP-2 that need to be moved from the second row of Γ up to the first row in order to enforce SED(iP)≧Dmin,12. This parameter can be calculated as ch(iP)=(Dmin,12-SED(iP))/4a2, because moving each position from row 2 to row 1 increases the SED by 4a2 (i.e., from 4a2 to 8a2).
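The screening step and the ch(iP) calculation can be sketched in Python as follows. The formula for ch(iP) is the one given above; rounding up to a whole number of positions and clamping at zero for sequences that already meet the target are assumptions added for this sketch, as are the sequence labels and numeric values (with a2 set to 1).

```python
import math

def ch(sed_value, target_sq, a2=1.0):
    """Number of row-2 positions to move up so that SED(iP) >= target_sq.

    Follows ch(iP) = (Dmin,1^2 - SED(iP)) / 4a^2; ceil and the clamp at 0
    are assumptions of this sketch.
    """
    deficit = target_sq - sed_value
    return max(0, math.ceil(deficit / (4 * a2)))

def candidate_sequences(seqs, sed_max, target_sq):
    """Sequences that could reach the target at all, i.e. SEDmax(iP) >= target_sq."""
    return [s for s in seqs if sed_max[s] >= target_sq]

# Hypothetical table entries, identified by label only.
seqs = ["p1", "p2", "p3"]
sed = {"p1": 24.0, "p2": 32.0, "p3": 20.0}
sed_max = {"p1": 32.0, "p2": 40.0, "p3": 24.0}
target_sq = 32.0                          # Dmin,1^2 with a^2 = 1

candidates = candidate_sequences(seqs, sed_max, target_sq)
moves = {s: ch(sed[s], target_sq) for s in candidates}
```

In this toy case sequence p3 can never reach the target and is excluded, p1 needs two positions moved up, and p2 already satisfies the target.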
Next consider methods 1-4 below that can be used to select the candidate subsets of codeword positions to be used to construct set CW1 for unequal error protection. All of methods 1-4 below can also be used to construct the set CW2 instead of CW1. In such cases, the methods 1-4 are modified by starting with the lowest SEDs instead of the highest SEDs. Depending on the code and the parameters, it is sometimes easier to construct CW2 as opposed to CW1. The methods are:
1. De-permute the positions of the coded sequences for which SED(iP)≧Dmin,12. If at least ρ1 codeword positions satisfy D2(cj)=Mint {D2(cjt)}≧Dmin,12, identify these ρ1 codeword positions and stop. No additional work is needed. The current design of Γ for equal error protection can also be used for unequal error protection.
2. Identify the coded sequences iP with the highest SEDmax(iP) values and de-permute all positions in these coded sequences. Then do the same for the sequence with the next highest SEDmax(iP) values. Continue the process by de-permuting sequences one by one, each time selecting the sequence with the highest remaining SEDmax(iP) value. Stop when ρ1 such codeword positions have been identified. At that point, ρ1 codeword positions for the set CW1 will have been identified. Also identify the highest possible Dmin,12 value, which will be equal to the last SEDmax(iP) used to construct the set CW1. This method can be used when the highest possible Dmin,12 is required.
3. De-permute the positions of the coded sequences for which SED(iP)>Dmin,12. If no set of ρ1 codeword positions satisfies the condition of method 1, de-permute one coded sequence at a time starting from the coded sequence with the lowest ch(iP) and moving to next coded sequence with the next lowest ch(iP). Continue this process until all coded bits of ρ1 codeword positions are observed in the set of de-permuted coded bits. The purpose of this approach is to find the set CW1 to lower the number of swaps needed.
4. De-permute the positions of the coded sequences for which SED(iP)>Dmin,12. If no set of ρ1 codeword positions satisfies the condition of method 1, find the codeword position that satisfies SED(iP)>Dmin,12 and has the highest number of coded bit positions. Permute the remaining coded bits of that codeword onto v (i.e., find the corresponding v(i)'s). Find the sequence iP that contains each v(i) with the smallest ch(iP) value. De-permute all of the positions, i, in that iP. Note that de-permuting of iP can fill in other coded bits of remaining codewords too. Continue the process until ρ1 codewords have been filled. This approach tries to identify codeword positions cj that have most of their coded bit positions satisfying SED(iP)>Dmin,12.
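Method 2 above can be sketched in Python as a greedy loop. This is only one possible reading of the method: a codeword is counted as identified once all n of its coded bits have been observed among the de-permuted positions, and the `owner_of` dictionary (position i in v mapped to the pair (j, t)) is a hypothetical stand-in for the inverse constrained interleaving operation. The function name and toy data are likewise assumptions.

```python
def select_cw1_by_sedmax(seqs, sed_max, owner_of, n, rho1):
    """Greedy sketch of method 2.

    Returns (list of rho1 codeword indices, last SEDmax used), or
    (None, None) if the table does not cover rho1 complete codewords.
    """
    seen = {}        # codeword index j -> set of observed bit indices t
    complete = []    # codewords whose n coded bits have all been observed
    for s in sorted(seqs, key=lambda q: sed_max[q], reverse=True):
        last_used = sed_max[s]
        for i in s:
            j, t = owner_of[i]                 # de-permute position i
            seen.setdefault(j, set()).add(t)
            if len(seen[j]) == n and j not in complete:
                complete.append(j)
        if len(complete) >= rho1:
            return complete[:rho1], last_used
    return None, None

# Hypothetical toy data: three codewords with n = 2 coded bits each.
owner_of = {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1), 4: (2, 0), 5: (2, 1)}
seqs = [(1, 4, 5), (0, 1, 2, 3, 4), (2, 3, 4, 5)]
sed_max = {(0, 1, 2, 3, 4): 40.0, (2, 3, 4, 5): 32.0, (1, 4, 5): 24.0}

cw1_two, dmin1_two = select_cw1_by_sedmax(seqs, sed_max, owner_of, n=2, rho1=2)
cw1_three, dmin1_three = select_cw1_by_sedmax(seqs, sed_max, owner_of, n=2, rho1=3)
```

Requesting a larger set CW1 forces the loop to consume sequences with smaller SEDmax(iP) values, which lowers the highest Dmin,12 that can be supported.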
Using a candidate set of codeword positions identified in one of the methods 2-4, next consider how to identify swaps that are used to modify/adjust Γ so as to cause at least ρ1 codeword positions to satisfy D2(cj)=Mint {D2(cjt)}≧Dmin,12. Note that when a position that is placed on the first row is moved down to the second row, all coded sequences iP that include the coded bit position being swapped will lower their SEDs by 4a2 (from 8a2 to 4a2). Hence, to identify a position that can afford to tolerate that swap, look at all coded sequences in P(d≧dt) that include the candidate position to be swapped and make sure that all such sequences can afford to lower their SED by 4a2 and still maintain the required MSED values of CW1 and CW2 (Dmin,12 and Dmin,22 respectively). Therefore, to prepare a list of valid positions to swap:
However, it is also important to note that each coded sequence iP can only afford to move up to a maximum number of positions from the first row to the second row of Γ. Based on the positions involved in the coded sequence, that coded sequence will need to maintain a SED of Dmin,12 or Dmin,22. This is because when positions of the sequence iP are de-permuted, if all of the positions in iP fall into CW2, then iP needs to only maintain a SED(iP) of at least Dmin,22. However, if even a single de-permuted position falls into CW1, then iP needs to maintain a SED(iP) of at least Dmin,12. Hence, if the current SED of the coded sequence is SED(iP) and the sequence is required to maintain a SED of Dmin,12, then it can only afford to move npos(iP)=[SED(iP)−Dmin,12]/4a2 of its positions. Once CW1 and CW2 are identified, it is possible to find all npos(iP) values for all coded sequences iP in P(d≧dt). Note that if npos(iP) positions of a sequence iP are swapped, then no more positions of that sequence can be swapped and all remaining positions of that sequence should be discarded from the list of valid positions. Also note that the valid list of positions to swap is formed by positions from sequences iP that de-permute to coded bit positions in CW1 and/or CW2 that have relatively high SED values. Further, when swapped with the positions from the list, it is seen that the D2(cj) values that are lower will increase while those that are higher (which are likely to represent the list of valid positions) will start to decrease. Note that when Dmin,12 is higher, more swaps will likely be needed. That means a longer list of valid positions will be needed. The longer the list of valid positions is, the more likely that Dmin,22 will need to be lowered so as to create more possibilities for positions to be moved from row 1 to row 2 in CW2.
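The swap budget npos(iP) and the check on whether a row 1 position can tolerate being moved down to row 2 can be sketched in Python as follows. The sketch is illustrative only: the `owner_of` mapping (position in v to codeword index), the function names, and the toy values are hypothetical, a2 is set to 1, and the budget is floored to a whole number of positions and clamped at zero as an added assumption.

```python
def required_sed(seq, owner_of, cw1, d1_sq, d2_sq):
    """Dmin,1^2 if any position of seq de-permutes into CW1, else Dmin,2^2."""
    touches_cw1 = any(owner_of[i] in cw1 for i in seq)
    return d1_sq if touches_cw1 else d2_sq

def npos(sed_value, required, a2=1.0):
    """How many positions the sequence can afford to move from row 1 to row 2."""
    return max(0, int((sed_value - required) // (4 * a2)))

def can_move_down(position, seqs, sed, required, a2=1.0):
    """True if every listed sequence containing `position` stays at or above
    its required SED after losing 4a^2."""
    return all(sed[s] - 4 * a2 >= required[s] for s in seqs if position in s)

# Hypothetical toy table: three codewords with four coded bits each.
owner_of = {i: i // 4 for i in range(12)}
cw1 = {0, 1}
seq_a, seq_b, seq_c = (0, 1, 2, 3, 4), (4, 5, 6, 7), (8, 9, 10, 11)
seqs = [seq_a, seq_b, seq_c]
sed = {seq_a: 40.0, seq_b: 32.0, seq_c: 24.0}

required = {s: required_sed(s, owner_of, cw1, d1_sq=32.0, d2_sq=16.0) for s in seqs}
budget = {s: npos(sed[s], required[s]) for s in seqs}
```

In this toy case position 4 cannot be moved down because it appears in a sequence that is already at its required distance, whereas positions 0 and 8 can.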
One other important point is that when Dmin,12 is higher than the Dmin2 used to construct the initial Γ for equal protection (which is usually the case), df will also need to be adjusted according to Dmin,12. Each time df is increased, this will cause more sequences to be added to P(d≧dt). All added sequences need to be considered while Γ is modified/adjusted to accommodate unequal error protection. Hence, a method to design Γ for unequal error protection can be outlined as follows:
Similarly, in the unequal error protection applications, a good Γ design should maintain similar SED(cj) values close to Dmin,12 for all cj in the set CW1 and similar SED(cj) values close to Dmin,22 for all cj in CW2. The above fine tuning process outlined for the design of two sets can be continued until similar SEDs in the two sets are reached.
It can be noted that other variations are possible. For example, if a constellation is being used that has four levels, it may be desirable to apply a strong code such as a CTBC code to only encode the first level or the first two levels, for example. Then weaker codes such as block codes could be used to encode the third and/or fourth levels. By doing so, we can use both the codes and the design of Γ to generate a bigger separation between the levels of protection than by using the same code and using only Γ to provide different levels of protection.
In the previous discussion, we considered how to design Γ to provide unequal error protection using an already-designed constrained interleaver of a given CTBC code. It is also possible to design the constrained interleaver used in the CTBC code from the outset in order to make the design of the corresponding Γ simpler. For example, the constrained interleaver used in the CTBC code can be specifically designed using one or more additional constraints that cause each of the sequences listed in the P(d≧dt) tables to have all of their positions de-permute to either CW1 or CW2, but not both. If the constrained interleaver of the CTBC code is constrained in this way, the Γ design for unequal error protection as described above becomes much simpler.
Hence, the unequal error protection constraints used in the CTBC code's constrained interleaver design will ensure that no low Hamming weight sequences listed in the P(d≧dt) tables are generated by combinations of codewords from CW1 and CW2 jointly. That is, combinations of codewords from only CW1 and only CW2 are allowed to generate the low weight sequences of v, but combinations from both CW1 and CW2 are not. With this additional constraint, the constrained interleaver will not allow any combination of codewords from CW1 and CW2 to generate sequences of v with weight less than df. This ensures that every sequence listed in the P(d≧dt) tables will have all of its positions de-permute to either only CW1 or only CW2.
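Whether a given constrained interleaver achieves this separation can be checked with a simple classification of the listed sequences, sketched below in Python. The `owner_of` mapping (position in v to OBC codeword index), the function name, and the toy data are hypothetical and serve only to illustrate the check.

```python
def classify_sequences(seqs, owner_of, cw1):
    """Split listed sequences into those that de-permute only into CW1,
    only into CW2, or into both (mixed)."""
    only_cw1, only_cw2, mixed = [], [], []
    for s in seqs:
        in_cw1 = [owner_of[i] in cw1 for i in s]
        if all(in_cw1):
            only_cw1.append(s)
        elif not any(in_cw1):
            only_cw2.append(s)
        else:
            mixed.append(s)
    return only_cw1, only_cw2, mixed

# Hypothetical toy data: codeword 0 forms CW1, codewords 1 and 2 form CW2.
owner_of = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
cw1 = {0}
seqs = [(0, 1), (2, 4), (1, 5)]

only_cw1, only_cw2, mixed = classify_sequences(seqs, owner_of, cw1)
separated = not mixed
```

An empty `mixed` list corresponds to the fully separated case; a short `mixed` list corresponds to the partially separated case in which only a few sequences involve both sets.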
One way to implement this additional interleaver constraint is to start by arbitrarily selecting ρ1 codewords for CW1. Then, instead of placing one coded bit of every codeword (as described before in the CTBC code's constrained interleaver design), place all coded bits of the ρ1 codewords in CW1 into v to maintain the desired MHD dt. This can be done by placing one coded bit at a time of the codewords in CW1 into v as described above in connection with the CI-3 and CI-4 constrained interleaver design methods, for example. Next, place the coded bits of CW2 into v in such a way as to maintain the desired MHD (preferably dt, or lower if necessary, since the MHD of CW2 will be lower than dt). However, while placing the coded bits of CW2, ensure that any combination of codewords that involves codewords from both CW1 and CW2 ends up generating a high MHD (preferably at least df, calculated according to Dmin,12). If necessary, allow as few sequences of v as possible with lower weights (lower than df) to arise from combinations of positions that de-permute into both CW1 and CW2. In such cases, where it was not possible to completely separate CW1 and CW2, most of the sequences on P(d≧dt) will come from only CW1 or only CW2, with only a few from both CW1 and CW2. With such a P(d≧dt) table, it becomes easier to design Γ using the approach explained above for unequal error protection. If it was possible to completely separate CW1 and CW2, then it becomes much easier to design Γ.
Alternatively, starting from any CTBC code's constrained interleaver, positions can be swapped on u (therefore on v also in the corresponding locations) to try to move towards a situation where the sequences on P(d≧dt) are completely separated in accordance with CW1 and CW2 as described immediately above. That is, for each low weight sequence on P(d≧dt), swaps are performed to move positions in u so that a given low weight sequence either becomes a high weight sequence or becomes a low weight sequence but whose positions come from only CW1 or CW2.
CTBC codes that use CICM with unequal protection can thus be designed by designing both the CTBC code's constrained interleaver (u=π[c]) and Γ as discussed above. Also, using the same concepts, the CTBC code's constrained interleaver and the Γ constrained interleaver can be designed jointly. If the CTBC code's constrained interleaver cannot be designed to have all of the respective positions of each respective low weight sequence in the P(d≧dt) tables de-permute to completely separated sets CW1 and CW2, then the separation that could not be carried out in the CTBC code's constrained interleaver can be carried out during the design of the Γ constrained interleaver. Similarly, if the Γ design becomes difficult with not enough positions to swap, then the CTBC code's constrained interleaver can be adjusted to help the Γ design. This way, the CTBC code's constrained interleaver and the Γ constrained interleaver can be designed and adjusted jointly. Swaps or other design steps can be carried out in one constrained interleaver design algorithm until a limiting condition is encountered. At that point, the joint design algorithm switches over to the other interleaver and performs adjustments there, until another limiting condition is encountered. Then the joint design algorithm switches back to the first constrained interleaver design algorithm, and so on, until both the CTBC code's constrained interleaver and the Γ constrained interleaver have been jointly designed/adjusted to meet all of the interleaver constraints. Once the interleaver constraints are met, the symbol Hamming distance and the multiple MHD requirements of the unequal error protection coding scheme will be satisfied.
It should be noted that any of the embodiments that use unequal error protection as described above can be used in accordance with
MIMO systems employ nt>1 transmitting antennas at the transmitter and nr>1 receiving antennas at the receiver. A fading or stationary channel can be described by an nr by nt channel matrix H. Any jth column of H represents the channels from the jth transmitting antenna to each of the receiving antennas. The channel matrix H can be transformed to show that the channel can be represented in terms of nmin=min(nt, nr) independent data streams. MIMO modulation rules allow nmin constellation points to be transmitted simultaneously on the MIMO channel. Since each such data stream is capable of carrying m bits per interval using a signal constellation with M=2^m constellation points, the resulting MIMO system using the MIMO modulation rule is capable of transmitting m·nmin bits per interval. Hence, a MIMO system can transmit m·nt bits per interval as long as nr≧nt. Therefore, by increasing the number of antennas, it is possible to increase the throughput of the system by increasing the transmitted data rate of a MIMO system. The V-BLAST system developed by Bell Labs is such a system that can increase the data rate. In V-BLAST, the transmitted signal during any interval is a combination of symbols, s1, s2, . . . , snt, which can be represented using a vector x=[s1, s2, . . . , snt], where each sj, j=1, 2, . . . , nt, is an independent symbol selected for transmission from the set of symbols {s}={s1, s2, . . . , sM} used in the M=2^m-ary constellation. In matrix-vector notation, the received column vector y is formed by all received signals from all nr antennas at the receiver during a frame interval. Hence, in the presence of a noise vector w, y can be expressed in matrix-vector notation as
y=Hx+w (31)
where H corresponds to a MIMO channel matrix. Various forms of mathematical MIMO channel models are well known to those of skill in the art. For a more detailed discussion of the MIMO channel model, see G. R. Raleigh and J. M. Cioffi, “Spatio-temporal coding for wireless communication,” IEEE Trans. Commun., Vol. 46, No. 3, March 1998, pp. 357-366 (“the Raleigh reference”). In the Raleigh reference, the MIMO channel matrix is representative of a channel for sending an entire frame of information. This channel matrix type can be used to model the transmit and receive filter pairs that effectively exist among and between the different transmit and receive antenna channels. In such channel models, the vector y corresponds to an entire frame of data. The goal at the detector is to estimate x from y, thereby estimating each transmitted symbol sj from antenna j. It is assumed that the channel state information (CSI) is available at the receiver, i.e., the receiver knows the channel matrix, H. Channel state information derived at the receiver is used to estimate the channel matrix, H.
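Equation (31) and the bit-rate relation m·nmin can be illustrated with a short Python sketch for a single symbol interval. The sketch is a simplification made for illustration: H is a small nr by nt matrix held as a list of rows rather than a frame-level channel matrix, the noise vector is set to zero so that the result is exact, and the function names and numeric values are hypothetical.

```python
def mimo_receive(H, x, w):
    """Compute y = Hx + w for one interval; H is a list of nr rows of nt gains."""
    return [sum(h * s for h, s in zip(row, x)) + noise
            for row, noise in zip(H, w)]

def bits_per_interval(m, nt, nr):
    """m * nmin bits per interval, with nmin = min(nt, nr)."""
    return m * min(nt, nr)

# Hypothetical 2x2 channel, QPSK symbols on both antennas, no noise.
H = [[1, 0.5j],
     [0.5, 1]]
x = [1 + 1j, -1 + 1j]        # x = [s1, s2], one symbol per transmit antenna
w = [0, 0]

y = mimo_receive(H, x, w)
rate = bits_per_interval(m=2, nt=2, nr=2)
```

Each entry of y is a superposition of the symbols sent from all transmit antennas, which is why the detector must estimate x jointly from y using knowledge of H.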
In V-BLAST channel coding is not applied across data streams. Instead, the data streams coming from each antenna can be viewed as being stacked up in time domain vertically, which is what the V stands for in V-BLAST. As a result, V-BLAST can suffer under slow flat fading as some of the data streams can be severely faded. In order to overcome this drawback of V-BLAST, D-BLAST (diagonal BLAST) has been proposed. In D-BLAST every coded block of a data stream is spread over all the antennas. D-BLAST also does not transmit from the beginning of each frame so that interference between coded blocks can be cancelled at the receiver more effectively. While not transmitting from certain antennas at the beginning of selected intervals lowers the throughput of D-BLAST compared with V-BLAST, it makes successive interference cancellation (SIC) in the receiver/decoder as discussed in further detail below, easier and more efficient.
Spatial modulation (SM) (also called SM-MIMO) is a technique that uses the spatial domain to transmit information. As opposed to V-BLAST which assumes all of the transmitting antennas are transmitting simultaneously, SM selects one antenna among nt available transmitting antennas for transmission during each symbol interval. SM uses log2(nt) number of bits from the data stream to select one antenna out of the nt antennas to transmit an m-bit symbol during each symbol interval. Therefore, SM is able to transmit a total of [log2(nt)+m] number of bits each symbol interval. That is, log2(nt) bits are transmitted in the spatial domain (antenna selection) while m bits are transmitted in the signal domain (symbol selection). Stated another way, SM transmits log2(nt) bits on the spatial constellation (which is the available set of antennas) while simultaneously transmitting m bits on the signal constellation (which is the available set of signaling points).
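By way of a non-limiting illustration, the SM bit-to-antenna and bit-to-symbol split described above can be sketched in Python as follows. The function names (`bits_to_int`, `sm_map`, `sm_transmit_vector`, `bits_per_interval`) are hypothetical. The sketch assumes that nt is a power of two, that the first log2(nt) bits of each group select the antenna, and that a natural binary labeling is used on a 2^m-ary PSK constellation. It does not implement the CICM reverse Gray mapping discussed elsewhere herein.

```python
import cmath
import math

def bits_to_int(bits):
    """Interpret a list of 0/1 values as an unsigned integer (MSB first)."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value

def bits_per_interval(nt, m):
    """Total bits per symbol interval: log2(nt) spatial bits plus m signal bits."""
    return (nt.bit_length() - 1) + m

def sm_map(bits, nt, m):
    """Map one group of log2(nt)+m bits to (antenna index, PSK symbol).

    Assumption: the leading bits pick the antenna and the trailing m bits
    pick the signal point; the document leaves this assignment to the
    design of the interleaver rule.
    """
    m_spatial = nt.bit_length() - 1
    assert 2 ** m_spatial == nt, "nt is assumed to be a power of two"
    assert len(bits) == m_spatial + m
    antenna = bits_to_int(bits[:m_spatial])   # spatial constellation
    index = bits_to_int(bits[m_spatial:])     # signal constellation
    symbol = cmath.exp(2j * math.pi * index / (2 ** m))
    return antenna, symbol

def sm_transmit_vector(bits, nt, m):
    """Build the length-nt transmit vector x with a single active antenna."""
    antenna, symbol = sm_map(bits, nt, m)
    x = [0j] * nt
    x[antenna] = symbol
    return x
```

For example, with nt=4 and m=2, each call consumes four bits, and the returned vector has exactly one nonzero entry, reflecting that only one antenna is active per symbol interval.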
The SM receiver is capable of identifying, for each symbol interval, both the transmitting antenna and the transmitted symbol. This is accomplished by observing and processing the received signal array y over an entire coding frame of length K. The receiver is able to determine the transmitting antenna because each transmitting antenna, as viewed by the full set of nr receive antennas, has its own electronic signature that can be used to differentiate between antennas. The signature of a transmitting antenna as observed by the full set of nr receiving antennas comes from the transmitted signal and the fading or other channel effects from the channel matrix, H, between the transmitting antenna and each receiving antenna. One significant advantage of SM over regular MIMO is that it uses only one RF signal during any interval. As a result, unlike in MIMO, there is no interference between the different transmitting antennas in SM. This lack of inter-antenna interference and the lack of a need to compensate for it at the receiver is the reason that SM receivers are much simpler than MIMO receivers such as V-BLAST. SM thus trades off some spectral efficiency for a much simpler receiver design and much better energy efficiency in terms of battery life and the like once processing-related power consumption is taken into account. The SM receiver must detect which of the transmit antennas sent the symbol (estimation of a spatial constellation coordinate) and must detect the symbol that was transmitted from that antenna (estimation of one or more signal constellation coordinates). For example, the ML (maximum likelihood) detection of SM requires minimization of the squared Euclidean distance, i.e., min∥y−Hx̂∥^2, over the set of antennas, {c}={c1, c2, . . . , cnt}, and the set of symbols {s}={s1, s2, . . . , sM}, where x̂ is the ML estimate of x in equation (31).
Hence, ML detection of SM requires only M·nt Euclidean distance checks, as opposed to the M^nt checks needed in a full MIMO system, thereby greatly reducing receiver complexity.
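The exhaustive search over antennas and symbols can be illustrated with the following minimal Python sketch. It assumes a single received vector y per symbol interval, a known channel matrix H stored as a list of nr rows with nt entries each, and an unlabeled M-ary PSK signal set. The names `psk` and `sm_ml_detect` are hypothetical and are introduced only for this illustration.

```python
import cmath
import math

def psk(M):
    """Unit-energy M-ary PSK signal set (labeling is not considered here)."""
    return [cmath.exp(2j * math.pi * i / M) for i in range(M)]

def sm_ml_detect(y, H, symbols):
    """Hard ML detection for SM with one active transmit antenna.

    y       : list of nr received samples
    H       : nr x nt channel matrix as a list of rows
    symbols : list of M candidate signal points
    Returns (antenna index, symbol index, number of distance checks).
    """
    nr, nt = len(H), len(H[0])
    best = None
    checks = 0
    for j in range(nt):                       # spatial constellation
        for i, s in enumerate(symbols):       # signal constellation
            # Squared Euclidean distance between y and column j of H times s.
            d = sum(abs(y[r] - H[r][j] * s) ** 2 for r in range(nr))
            checks += 1
            if best is None or d < best[0]:
                best = (d, j, i)
    return best[1], best[2], checks
```

The loop performs M·nt distance evaluations, because each hypothesis involves only one column of H. A joint search over nt simultaneously active antennas would instead enumerate M^nt candidate vectors.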
Various types of the SM systems are known to those of skill in the art. For example, space shift keying (SSK), space time shift keying (STSK), and generalized spatial modulation (GSM) are all reported in the literature. In SSK, only one signal is transmitted making m=0. That is, the information in SSK is transmitted completely from the selection of the antenna. In STSK, the role of the selection of antenna is generalized to a role of selecting one of a pre-selected set of dispersion matrices that can have channel response effects that span multiple symbol intervals. When Q number of dispersion matrices are used, STSK can transmit log2(Q) bits over the channel response duration of the dispersion matrices. In GSM, more than one antenna is selected for transmission during each symbol interval, thereby increasing number of bits that can be transmitted over the spatial domain. Hence, GSM can be viewed as a combination of SM and MIMO. If nm<nt number of antennas out of all nt antennas are selected for transmission, GSM is capable of transmitting up to
⌊log2(C(nt, nm))⌋ bits from the spatial constellation, where C(nt, nm) denotes the number of ways of choosing nm antennas out of nt, and nm*m bits from the signal constellation in any given interval. However, as in MIMO, the signals transmitted from the different antennas interfere in GSM, and hence GSM requires a more complex receiver as compared to pure SM where only one transmit antenna is active at a time.
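As a rough numerical illustration of the GSM bit budget, the following Python sketch assumes the commonly cited spatial bit count of floor(log2(C(nt, nm))), where C(nt, nm) is the number of ways of selecting nm active antennas out of nt. That expression is an assumption made for this sketch, and the function name `gsm_bits_per_interval` is hypothetical.

```python
import math

def gsm_bits_per_interval(nt, nm, m):
    """Return (spatial bits, signal bits) per symbol interval for GSM.

    Assumption: the spatial bit count is floor(log2(C(nt, nm))), since only
    a power-of-two number of antenna combinations can be addressed by bits.
    """
    combos = math.comb(nt, nm)          # number of antenna subsets of size nm
    spatial = combos.bit_length() - 1   # exact floor(log2(combos)) for combos >= 1
    signal = nm * m                     # m bits on each active antenna
    return spatial, signal
```

With nt=4, nm=2, and m=2 this yields 2 spatial bits and 4 signal bits. Setting nm=1 reduces to the pure SM case of log2(nt) spatial bits and m signal bits when nt is a power of two.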
Consider the case of SM where only one transmit antenna is active at any given time. Assume that CSI can be estimated at the receiver and made available to the receiver signal processor. Besides ML detection, a simpler two-step detection process is known that first detects the transmitting antenna in the spatial constellation and then detects the signal transmitted in the signal constellation. Similarly, the above receiver structures can be extended for soft detection. For example, in ML soft decoding, all the (ntM) Euclidean distances that correspond to different transmitted bit combinations can be used to calculate the L values (i.e., the log likelihood values, where, for the jth bit, L(bj)=log(Pr(bj=1)/Pr(bj=0))) of each bit. This is done as in Pyndiah's soft decoding of block codes. For soft decoding of any bit position bj, j=1, 2, . . . , (m+log2(nt)), we first identify the ntM/2 distances (which can be called metrics) in favor of bj=1 and the others in favor of bj=0. Using these two groups, we find the L value of bj.
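The grouping of the ntM distances into those favoring bj=1 and those favoring bj=0 can be sketched as follows. This sketch makes several assumptions that are not taken from the text: additive Gaussian noise with a known variance `noise_var`, equally likely hypotheses, a natural binary labeling in which hypothesis h carries the bits of the integer h, and an exact log-sum-exp combination of each group. The names `logsumexp` and `sm_bit_llrs` are hypothetical.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))

def sm_bit_llrs(distances, n_bits, noise_var):
    """Compute L(bj) = log(Pr(bj=1)/Pr(bj=0)) for each of n_bits positions.

    distances[h] is the squared Euclidean distance for hypothesis h, where
    h ranges over all nt*M = 2**n_bits (antenna, symbol) combinations.
    """
    assert len(distances) == 2 ** n_bits
    # Log-likelihood of each hypothesis, up to a common additive constant.
    metrics = [-d / noise_var for d in distances]
    llrs = []
    for j in range(n_bits):
        shift = n_bits - 1 - j  # bit j counted from the most significant bit
        ones = [v for h, v in enumerate(metrics) if (h >> shift) & 1]
        zeros = [v for h, v in enumerate(metrics) if not (h >> shift) & 1]
        llrs.append(logsumexp(ones) - logsumexp(zeros))
    return llrs
```

Each of the two groups contains ntM/2 metrics, as described above. A lower-complexity variant could replace each log-sum-exp with the maximum metric of the group.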
Performance analysis of the ML detection of SM as described in the literature identifies three components that limit ML performance. These components are: (a) Psignal, a probability of error component that depends on the signal domain and is similar to the contribution when transmitting from a single antenna; (b) Pspatial, a probability of error component that depends on the spatial domain; and (c) Pjoint, a joint probability of error component that depends on both the signal and spatial domains. The lower the values of Psignal, Pspatial, and Pjoint, the lower the total probability of error and the higher the performance. This analysis suggests that the signal constellations that are used in normal communications may not be the best in SM. Instead, constellations where all the signal points have high relative amplitudes are preferred. The best such constellation is a PSK constellation where all signal points have the same amplitude. Intuitively, this makes sense because if it is necessary to identify which transmitting antenna transmitted the signal, it is desirable for that signal to have as high an amplitude as possible. In the absence of noise, the received signal from the transmitting antenna will be high while that from all the other antennas will be zero. In addition to Pspatial and Pjoint, it is also necessary to reduce Psignal. In order to reduce Psignal, it is necessary to maintain a high Euclidean distance of the signal constellation. Up until now, a star constellation was shown to perform the best in SM systems as compared to other known constellations. However, with CICM, the present invention identifies that a CICM encoded PSK constellation is able to provide higher performance than the star constellation.
Next consider designing an SM system that makes use of a CTBC code and maps the CTBC coded bits using a CICM mapping rule. The CICM interleaver rule, Γ, can be designed to maintain as high a Euclidean distance and a symbol Hamming distance as possible. Hence, the design of Γ along with a reverse Gray coded constellation mapping onto symbols of the selected constellation ends up achieving as low a value for Psignal as possible. That means the remaining aspects of the constellation should be selected by focusing on the other two contributions, Pspatial and Pjoint. However, because the CICM-mapped-PSK constellations are all constant-envelope constellations, CICM-mapped PSK constellations are also optimal in terms of Pspatial and Pjoint. As described above, CICM-mapped QPSK, 8-PSK, and 16-PSK have been developed herein, and a general approach was provided to derive higher order CICM-mapped PSK constellations, e.g., 32-PSK, 64-PSK, and the like.
As stated before, SM transmits bits both on the signal constellation and on the spatial constellation. Specifically, during every interval, m coded bits are transmitted on the signal constellation while mspatial=log2(nt) bits are transmitted on the spatial constellation. Hence, when designing Γ with CTBC codes for SM, it is first important to form groups of coded bits, each with mtotal=(m+mspatial) CTBC-encoded bits (or bits of whatever other underlying code is being used to encode the bit stream). A group of mtotal bits is transmitted during every interval by feeding m of its bits to the signal constellation and the remaining mspatial bits to the spatial constellation. The task of designing Γ in SM is to (a) best form K/mtotal groups, each with mtotal bits, from the K coded bits coming out of the CTBC code, and (b) identify which m bits in a group should be fed to the signal constellation and which bits should be fed to the spatial constellation.
When dealing with only the signal constellation, Γ was designed to permute the coded bits onto symbols to maximize the symbol Hamming distance and the Euclidean distance on the signal constellation. Hence, when designing Γ in SM, it is important to first get an idea about the symbol Hamming distance measure and the Euclidean distance measure on the spatial constellation. Symbol Hamming distance is straightforward, as it is equal to the minimum number of symbol intervals that the positions of any given coded sequence listed on P(d≧dt) map into.
However, the Euclidean distance measure on the spatial constellation is not as straightforward. In SM, during each symbol interval, one selected antenna will effectively transmit a signal constellation point. In order to roughly estimate a Euclidean distance type measure for the spatial constellation, consider that the squared distance separation between an antenna that transmits an energy E during a symbol interval and an antenna that stays idle is D^2=E. Hence, the energy E of a signal point can be used as an approximate measure of the squared Euclidean distance in the spatial domain. However, the actual impact of the selected Euclidean distance in the spatial domain is also dependent on the channel matrix H (to include the fading model in wireless systems). When CTBC codes or other codes are used with the CICM-PSK constellations and mappings discussed above, this PSK modulation will maintain the highest possible energy E for all possible transmitted symbols. The approximate Euclidean distance E on the spatial constellation is also comparable with that in the CICM-PSK signal constellations. For example, the QPSK constellation shown in
Because SM treats the [m+log2(nt)] bits transmitted during a symbol interval as mapping to a single SM symbol, the CICM mapping rule design algorithm discussed above can be directly applied to design of Γ and to the design of the mapping policy. For example, assume that 16-PSK is to be used as the signal constellation using the reverse Gray coded mapping policy of
However, another variation is to split the design of Γ into two parts: design of Γ1 for the signal constellation and design of Γ2 for the spatial constellation. Γ1 can be designed as an array with m rows and K/mtotal columns while Γ2 can be designed with mspatial rows and the same number of K/mtotal columns. The idea is to form groups of bits for transmission during each interval by combining columns of Γ1 with columns of Γ2. Every group is constructed by merging one column of Γ1 with one column of Γ2. Combining columns of Γ1 and Γ2 will merge Γ1 and Γ2 to form the final array Γ with mtotal=m+mspatial rows and K/mtotal columns for transmission. Designs of Γ1 and Γ2 are similar to the design of Γ for a signal constellation previously discussed.
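The merging of Γ1 and Γ2 into a single array can be sketched in Python as follows. The sketch assumes each array is stored as a list of rows whose entries are coded-bit indices, and that column c of Γ1 is paired with column c of Γ2. The actual pairing and the constraint-driven choice of entries are left to the design procedure, and the names `merge_interleaver_arrays` and `column` are hypothetical.

```python
def merge_interleaver_arrays(gamma1, gamma2):
    """Stack gamma1 (m rows) on top of gamma2 (m_spatial rows).

    Both arrays must have the same number of columns (K/mtotal). The result
    has mtotal = m + m_spatial rows, and each column is one transmit group.
    """
    n_cols = len(gamma1[0])
    assert all(len(row) == n_cols for row in gamma1 + gamma2)
    return [list(row) for row in gamma1] + [list(row) for row in gamma2]

def column(gamma, c):
    """Return the bit indices transmitted together in symbol interval c."""
    return [row[c] for row in gamma]
```

In a toy case with K=8, m=2 and mspatial=2, the merged array has four rows and two columns. The first two entries of each column would feed the signal constellation and the last two would select the antenna.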
For example, if the signal constellation chosen is the 4-ary PSK constellation shown in
The same steps described above and as shown in
Referring now to
The output of the constellation and spatial mapper Γ 2105 is typically sent to a set of radio frequency circuits which include modulator circuits, transmitter amplifiers, and nt different transmit antennas. These transmit antennas are represented as the triangles to the right side of the block 2105. In a typical SM-CICM embodiment only one transmit antenna is active at any given time. The input coded bit stream is separated into groups of [m+log2(nt)] bits, and during each symbol interval, m of those bits are used to select a signal constellation point and the remaining log2(nt) bits are used to select a transmit antenna. In the example of CICM-16-PSK and four transmit antennas, the matrix Γ will have [m+log2(nt)] rows and K/(m+log2(nt)) columns.
The ultimate output of the block 2105 is an SM transmission signal that is sent out over the nt different antennas as a function of time and the input coded bit stream. This multi-antenna output is then processed via a channel matrix, H. The channel matrix H is actually a mathematical representation of a combination of transmit and receive signal processing in addition to a stationary or time-varying fading channel. All of these channel effects are termed the channel matrix, H, herein. In practice the channel matrix, H, is embodied as a physical multiple input-multiple output transmission channel. The output of the channel matrix, H, is coupled to a multi-antenna receiver front end 2115. The multi-antenna receiver front end 2115 then performs front end receiver processing and baseband processing in order to provide a detection signal. In practice the detection signal is digitized on the I and Q channels and is then processed to form a set of bit metrics that are to be used in conjunction with a SISO decoder 2125.
The output, typically in the form of computed bit metrics, is passed via a CICM deinterleaver block 2120 to the SISO decoder 2125. As the SISO decoder 2125 performs SISO iterations, extrinsic information will be updated. Each time a SISO iteration completes, new updated bit metrics will be needed. Hence the SISO decoder sends a subset of its extrinsic information via a CICM interleaver 2130 back to the multi-antenna receiver front end 2115 (specifically, to a memory structure therein that holds information associated with the received digitized I/Q signal points derived from the multi-antenna receiver front end 2115). The updated bit metrics are derived using the information associated with the received signal points from the multi-antenna receiver front end 2115 and the available extrinsic information. The updated bit metrics are sent via the CICM deinterleaver 2120 back to the SISO decoder 2125. SISO iterations continue in this way until the coded bit stream has been decoded and the original information bits become available. The output of the SISO decoder is then provided at the output arrow to the right of the SISO decoder block 2125.
It can be noted that the system 2100 has many key advantages. First of all, in embodiments that use CICM-PSK, the constellation is constant envelope and is able to accommodate multiple bits while maintaining a high MSED at the coded sequence or codeword level. Secondly, the constellation and spatial mapper Γ 2105 will maintain as high of a symbol Hamming distance as is possible. Thirdly, because the PSK constellation is constant envelope, all signal points will have the same amplitude, and this will cause the performance of the spatial constellation aspects of the SM constellation to be maximized. If CTBC coding is used, the SISO decoder will reap all of the benefits discussed above in connection with CTBC encoding and decoding. Also, as in SM systems, because only one transmit antenna is active at any given time, the receiver complexity is greatly reduced relative to traditional MIMO systems.
It should be noted that aspects of the present invention can alternatively be used with MIMO modulation rules that are used to transmit a plurality of different constellation points through a plurality of different spatial channels simultaneously. For example, the system 2100 can be embodied to use GSM and MIMO systems such as V-BLAST, D-BLAST and the like. During each symbol interval, from two up to nt CICM-mapped symbols (such as CICM-mapped 16-PSK) can be transmitted via nt separate antennas. In such systems, the class of CICM-PSK type constellations is considered to be optimal because all of the signal points have the same highest energy values.
In such systems, to maintain a system-wide symbol Hamming distance, the CICM mapping rule design algorithm can be designed to avoid allowing more than one antenna to transmit a bit from a given low weight sequence during a given symbol interval. In such cases, a single CICM permutation matrix, Γ, is designed with between two and m*nt number of columns, depending on the number of signal points that will be simultaneously transmitted in a single symbol interval. In full V-BLAST/D-BLAST type embodiments there will be nt number of columns that correspond to each Euclidean distance. The symbol Hamming distance can be computed as being effective only within each one of the nt separate channels, or can be considered per symbol interval. In such embodiments, a single CICM permutation matrix, Γ, can be designed that has m rows and K/m columns as in the single-channel case. Now, however, a set of up to nt columns of Γ will be mapped to separate antenna channels during each symbol interval.
For use in MIMO modulation embodiments of the system 2100 where more than one antenna is transmitting at the same time, e.g., STSK, GSM, V-BLAST and D-BLAST systems, the SISO decoder/interference canceller 2125 is preferably implemented to detect the multiple symbols that were transmitted from different antennas at the same time. In such embodiments, the SISO decoder/interference canceller block 2125 can be configured to perform ML estimation or can be augmented to additionally perform interference cancellation type functions as described below.
The block 2125 can be configured to perform optimal maximum likelihood (ML) detection. ML detection detects all streams jointly by searching over all possible x vectors to determine the best estimate of x, denoted x̂, that minimizes ∥y−Hx̂∥^2, which is equivalent to minimizing the Euclidean distance. Hence, the ML decision rule is to find the vector x̂ that solves
$$\min_{\hat{x}} \|y - H\hat{x}\|^2. \tag{32}$$
Since the above minimization requires checking Mn
The block 2125 can also be configured to function as a linear decorrelator detector. This technique detects streams individually. From equation (31) it can be seen that every element of y (the signal received by any single receive antenna) has contributions from every signal transmitted from every transmit antenna. Hence, when detecting sj from antenna j, it is necessary to remove the interference on y caused by the signals transmitted from all other antennas. The removal of interference is done by decorrelation, and the decorrelation can be done by using a transformation on y. Specifically, when detecting sj, the decorrelator maps y onto a space that is orthogonal to h1, h2, . . . , hj−1, hj+1, . . . , hnt, to form a new signal y′. As a result, the mapped signal y′ does not have any interference from signals transmitted from any antenna other than the desired signal transmitted from the jth antenna. The mapped signal y′ is then passed through a matched filter to detect sj. The combination of the mapper that maps y to y′ and the following matched filter is the decorrelator. The decorrelator detector consists of a bank of nt decorrelators, with one for each antenna j, j=1, 2, . . . , nt.
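One way to illustrate the projection-plus-matched-filter structure of a single decorrelator is the following Python sketch. It assumes that H has full column rank with nr≥nt, that the columns of H are the vectors h1 through hnt, and that a Gram-Schmidt procedure is an acceptable way to build the orthogonal projection. The source does not specify how the projection is computed. All function names are hypothetical.

```python
def inner(u, v):
    """Complex inner product <u, v> = sum(conj(u_i) * v_i)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def project_out(v, basis):
    """Remove from v its components along each orthonormal basis vector."""
    for q in basis:
        c = inner(q, v)
        v = [vi - c * qi for vi, qi in zip(v, q)]
    return v

def orthonormal_basis(vectors):
    """Gram-Schmidt orthonormalization; near-dependent vectors are skipped."""
    basis = []
    for v in vectors:
        w = project_out(list(v), basis)
        norm = abs(inner(w, w)) ** 0.5
        if norm > 1e-12:
            basis.append([wi / norm for wi in w])
    return basis

def decorrelate(y, H, j):
    """Estimate s_j by projecting y away from all columns of H except j.

    y : list of nr received samples
    H : nr x nt channel matrix as a list of rows
    j : zero-based index of the antenna being detected
    """
    nr, nt = len(H), len(H[0])
    cols = [[H[r][c] for r in range(nr)] for c in range(nt)]
    others = orthonormal_basis([cols[c] for c in range(nt) if c != j])
    y_proj = project_out(list(y), others)        # the mapped signal y'
    h_proj = project_out(list(cols[j]), others)  # signature of antenna j in y'
    # Matched filter on the projected signal, normalized to unit gain.
    return inner(h_proj, y_proj) / inner(h_proj, h_proj)
```

In the noiseless case the returned value equals the transmitted sj exactly, up to rounding. A symbol decision would then be made by slicing this value to the nearest constellation point.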
The block 2125 can also be configured to perform Successive Interference Cancellation (SIC) coupled with a decorrelator. In this method already decoded symbols are used to cancel out the interference caused by the already decoded symbols on a symbol that is currently being decoded. When decoding s1 through snt in a predetermined order, decoding of sj can be assisted by removing the interference caused by s1 through sj−1 on y. This is done before the mapping of y to y′ thereby making the mapping process easier.
The block 2125 can also be configured to act as a minimum mean squared error (MMSE) receiver. The above described decorrelator performs well at high SNR when the interference is dominant, but it does not perform well at low SNR when the noise is dominant. Hence, in order to perform well at all SNR values, each decorrelator in the receiver can be replaced by an MMSE receiver. The MMSE receiver can be constructed as a transformation of y using the MIMO channel matrix, H. The block 2125 can also be configured to act as an MMSE receiver combined with SIC. This is very similar to the SIC described above, with the only difference being that each decorrelator is replaced by an MMSE receiver as discussed above. In addition, the block 2125 can also be configured to perform other MIMO detection algorithms such as sphere detection (SD). SD is a simplification of ML detection that limits the search to a sphere around the received vector y. Other techniques that could be implemented in the block 2115 include a developed matched filter (DMF) as is known in the art and signal vector based decoding (SVD) as is also known to those of skill in the art.
SIC plays an important role in the detection of the individual data streams associated with each component of the vectors x and y. However, it is known that SIC can introduce error propagation by passing incorrectly decoded symbols of different antennas for the detection of the signals on successive antennas. An aspect of the present invention is based on the observation that the type of error propagation that occurs in V-BLAST and D-BLAST in MIMO transmission systems is structurally similar to the error propagation that occurs in the multi-stage decoding (MSD) of multi-level codes (MLCs). U.S. Pat. No. 8,532,229, “Hard iterative decoder for multilevel codes” to E. M. Dowling and J. P. Fonseka (“the Dowling reference”) describes a hard iterative decoding (IHID) technique that improves upon MSD decoding. U.S. Pat. No. 8,532,229 is incorporated by reference herein to provide the full details of how to implement the IHID algorithm in the MIMO receiver algorithms and system described below which can be implemented in the block 2125 of the system 2100.
An aspect of the present invention is to first follow the steps of SIC (such as an MMSE based approach that uses SIC as described above). Start by using the IHID algorithm to decode at least a subset of signals from antenna 1 and, using the symbol decisions from antenna 1, remove/cancel the interference caused by the signal on antenna 1 while decoding the signal on antenna 2. Next, use the IHID algorithm to decode the signal on antenna 2 and use the symbol decisions from antennas 1 and 2 to remove/cancel the interference caused by antennas 1 and 2 while decoding the signal on antenna 3. This process can be continued until the signal on the last antenna nt is decoded, in each case removing interference from previously decoded signals on antennas 1 through (i−1) while decoding the signal on antenna i. In addition, as in IHID, once the normal SIC steps are complete, loop back to antenna 1 with all the currently known information about decoded signals and repeat the process. In the second pass through the loop, any or all of the nt−1 decoded symbols from the previous pass through the loop can be used to remove interference from an antenna's signal stream that is currently being decoded. That is, in the IHID based SIC approach, hard decisions of the decoded symbols on a given antenna can be used to remove the interference caused by those symbols on the remaining antennas. If the transmitter staggered the transmission of the signals on the different antennas during a start up phase similar to D-BLAST, the earlier iterations can process fewer symbols than the later iterations. This IHID based interference cancellation and decoding algorithm is continued by repeating the SIC steps in an iterative manner. This process can be stopped as soon as no change is seen in the decoded sequences on all antennas.
Soft interference cancellation with SISO decoding works because after an initial number of SISO iterations, the correct decoded sequence will begin to emerge. At that point, using the received signal and the regenerated signal based on the currently decoded message, the interference and the level of interference can be estimated. This estimated interference can be used to cancel out the interference for the next iteration. In general, interference can come from other sources such as ISI, IQ imbalance, and polarization interference in optics. Also, soft interference cancellation can be used to estimate the carrier phase in a non-coherent system for use during joint soft decoding and carrier phase tracking. In soft interference cancellation with soft decoding, the interference and the level of interference are estimated, updated, and used to perform interference cancellation for use in each SISO iteration.
For use with CTBC codes or other codes that are soft decoded, the present invention contemplates methods, apparatus and systems for soft interference cancellation to be used with soft iterative decoding. Consider an example where a CTBC code is constructed using either CI-3 or CI-4 and is then mapped onto symbols using a CICM interleaver rule, Γ. In this example, there will be K/m number of 2^m-ary symbols ready for transmission. In accordance with CICM based MIMO transmission of the present invention, split these K/m symbols into K/(mnt) segments of symbols and feed those segments one by one to each antenna. These segments can be formed by simply dividing the symbol sequence in an orderly manner starting from the first symbol. With this construction, there will be a set of data streams available, placed vertically one below the other in time, as in V-BLAST. These segments can then be simultaneously transmitted from the respective antennas. In effect all K bits of the CTBC coded frame are transmitted during K/(mnt) intervals, achieving the same data rate as V-BLAST. However, unlike V-BLAST and similar to D-BLAST, the above scheme has coding across different data streams.
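The orderly division of the symbol sequence among antennas can be sketched as follows. This sketch adopts one possible reading of the passage above, in which each of the nt antennas receives one contiguous run of K/(m·nt) symbols so that the whole frame is sent in K/(m·nt) intervals. That reading is an assumption, as are the names `split_into_antenna_streams` and `interval_vector`.

```python
def split_into_antenna_streams(symbols, nt):
    """Divide a symbol sequence into nt equal contiguous streams.

    Assumption: len(symbols) = K/m is a multiple of nt, and stream a holds
    the symbols that antenna a transmits over K/(m*nt) intervals.
    """
    seg = len(symbols) // nt
    assert seg * nt == len(symbols), "symbol count must be a multiple of nt"
    return [symbols[a * seg:(a + 1) * seg] for a in range(nt)]

def interval_vector(streams, t):
    """Symbols sent simultaneously from all antennas during interval t."""
    return [stream[t] for stream in streams]
```

With eight symbols and nt=4, each antenna carries two symbols, and the frame occupies two symbol intervals.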
In the decoding, bit metrics related to coded bits of the CTBC codes need to be extracted from the received vector y. This can be done by modifying the last step of any of the above receivers to extract soft information or by using any other known soft detection method described in the literature for MIMO applications. Next run the first SISO iteration on a frame of the CTBC code. At that point there will be the log likelihood ratio values (L values) of each coded bit. These L values indicate the best estimates of the bit values (1 or 0) of the CTBC coded sequence along with the reliabilities (which are the probabilities of these bit value decisions). Using the L values of these coded bits, next identify the corresponding decoded symbols 1 through K/m and the probability that the decision on each of those symbols is correct. At this point, based on the current information, the algorithm has identified each of the most likely symbols transmitted from each antenna and their probabilities. In any normal iterative decoding process that involves higher-order symbols (m>1) and has no interference, these probabilities can be used to better estimate the bit metrics from the received signal. However, in MIMO systems, since inter-antenna interference is present, the present invention introduces an additional step to be used to remove the interference in a soft manner. This is accomplished using the estimated probabilities of the symbols before updating the bit metrics. In the above IHID based SIC approach, hard decisions of the decoded symbols on a given antenna were used to remove the interference caused by those symbols on the remaining antennas. In soft decoding, the probabilities of the decisions of the decoded symbols are also available. Therefore, the interference caused by these soft decoded symbols can be removed/cancelled in a soft manner by using the estimated probabilities of the symbols.
Specifically, in soft interference cancellation, the interference from every symbol is first calculated as in any of the SIC approaches described above, and is then multiplied by the probability of that symbol found from the L values to estimate the interference contribution from that symbol. If sm is the symbol having the highest L value, then L=log(P(symbol=sm)/(1−P(symbol=sm))); therefore, P(symbol=sm) can be easily found from the L value. Next, after the soft interference cancellation operation, all interference contributions can be subtracted to update the bit metrics using the signal constellation. The updated bit metrics can then be used for the next iteration. Since the decisions made at the beginning of the iterations can be rather unreliable, the soft interference cancellation procedure can be started after a preselected initial number of iterations, ninit. As the iterations proceed, they will typically converge to a solution, the reliability of most symbols will become high, and the soft interference cancellation will become similar to the above described SIC solutions.
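The inversion of the L value and the probability-weighted subtraction can be illustrated with the following short Python sketch. It assumes a single interfering symbol whose channel signature is the column `h_col` of H, and it treats the product of that column and the decided symbol as the hard interference term. The names `prob_from_l` and `soft_cancel` are hypothetical.

```python
import math

def prob_from_l(L):
    """Invert L = log(P / (1 - P)) to obtain P = 1 / (1 + exp(-L)).

    Note: very large negative L (below about -700) would overflow exp().
    """
    return 1.0 / (1.0 + math.exp(-L))

def soft_cancel(y, h_col, symbol, L):
    """Subtract the probability-weighted interference of one decided symbol.

    y      : received samples, one per receive antenna
    h_col  : channel column of the interfering transmit antenna
    symbol : current decision on the interfering symbol
    L      : L value of that decision
    """
    p = prob_from_l(L)
    return [yr - p * hr * symbol for yr, hr in zip(y, h_col)]
```

When L is large, p approaches 1 and the operation reduces to hard cancellation as in SIC. When L is near zero, only half of the estimated interference is removed, which limits the damage from an unreliable decision.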
Joint soft interference cancellation and soft decoding initially performs one or more soft decoding iterations to form an initial estimate of a set of interference parameters. In some embodiments, the initial interference estimate for use in a current frame of data can be based upon interference parameter estimates from the immediately preceding frame of data. Once the initial interference cancellation parameters are available, and from then forward, soft interference cancellation subtracts an estimate of the interference from the received signal to perform a current SISO iteration. The estimate of the interference will be based upon information from the previous SISO iteration.
Consider an example that involves the transmission of CTBC signals using the QPSK constellation in
$$y_I(k) = \alpha_I\, a_I(k) + \beta_Q\, a_Q(k) + n_I(k) \tag{33a}$$
and
$$y_Q(k) = \alpha_Q\, a_Q(k) + \beta_I\, a_I(k) + n_Q(k) \tag{33b}$$
for k=1, 2, . . . , K/2, where (aI(k), aQ(k))=(±a, ±a) represents the transmitted symbol, αI and αQ are interference parameters that account for amplitude distortion of the I and Q signal components (diagonal components of a 2×2 rotation/distortion matrix), βI and βQ are interference parameters that account for the interference from the I channel to the Q channel and from the Q channel to the I channel respectively (off-diagonal components of the 2×2 rotation/distortion matrix), and nI(k) and nQ(k) are the I and Q channel noise components. Due to the 2×2 distortion matrix, even in absence of noise, the received signal, (yI(k), yQ(k)), will not necessarily match the transmitted sequence, (aI(k), aQ(k)).
Initially, the SISO iterations can start off by assuming that αI=αQ=1 and βI=βQ=0. After the decoded sequence starts to emerge, estimates for aI(k) and aQ(k), for all k, become available. As the estimates for aI(k) and aQ(k) become more reliable, equations (33a) and (33b) can be used to estimate the α and β values. Upon estimating the α and β values, the yI(k) and yQ(k) estimates can be modified to cancel out the I/Q imbalance and to thereby form still more reliable estimates for (aI(k), aQ(k)). These improved estimates can then be used to calculate the bit metrics for the next iteration. If desired, the α and β values can be estimated every iteration or once every few iterations. Hence, in this example the joint soft interference cancellation and soft decoding forms initial estimates of the transmitted sequence using soft decoding, estimates the interference due to the I/Q imbalance, cancels the interference, and then continues to iteratively improve the reliability of both the interference estimates and the SISO decoded bit stream until convergence.
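One way the α and β values could be estimated from the decoded symbols and then used to cancel the I/Q imbalance is sketched below. The source does not state which estimator is used. This sketch assumes an ordinary least-squares fit of the model in equations (33a) and (33b), namely yI(k)=αI·aI(k)+βQ·aQ(k)+nI(k) and yQ(k)=αQ·aQ(k)+βI·aI(k)+nQ(k), followed by inversion of the resulting 2×2 distortion matrix. All function names are hypothetical.

```python
def fit_two_params(u, v, y):
    """Least-squares fit of y[k] ~ p*u[k] + q*v[k]; returns (p, q)."""
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    suy = sum(a * c for a, c in zip(u, y))
    svy = sum(b * c for b, c in zip(v, y))
    det = suu * svv - suv * suv
    if abs(det) < 1e-12:
        raise ValueError("I and Q decisions are too correlated to separate")
    return (suy * svv - svy * suv) / det, (svy * suu - suy * suv) / det

def estimate_iq_imbalance(a_i, a_q, y_i, y_q):
    """Estimate (alpha_I, alpha_Q, beta_I, beta_Q) from decisions and samples."""
    alpha_i, beta_q = fit_two_params(a_i, a_q, y_i)  # model of equation (33a)
    alpha_q, beta_i = fit_two_params(a_q, a_i, y_q)  # model of equation (33b)
    return alpha_i, alpha_q, beta_i, beta_q

def cancel_iq_imbalance(y_i, y_q, alpha_i, alpha_q, beta_i, beta_q):
    """Invert the 2x2 distortion matrix [[alpha_I, beta_Q], [beta_I, alpha_Q]]."""
    det = alpha_i * alpha_q - beta_q * beta_i
    out_i = [(alpha_q * yi - beta_q * yq) / det for yi, yq in zip(y_i, y_q)]
    out_q = [(alpha_i * yq - beta_i * yi) / det for yi, yq in zip(y_i, y_q)]
    return out_i, out_q
```

In an iterative receiver, `estimate_iq_imbalance` would be called with the current symbol decisions after each SISO iteration or once every few iterations. The corrected samples from `cancel_iq_imbalance` would then be used to recompute the bit metrics.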
In situations where some or many decoded symbols have low probabilities, only the intervals that have higher probabilities of the decoded symbols can be made to contribute significantly to the estimates formed from equations (33a) and (33b). In some embodiments the α and β parameters of the 2×2 distortion matrix of (33a) and (33b) preferably are calculated/updated based only upon a subset of decoded symbols whose reliabilities are above some threshold or relative measure.
The algorithm described above to cancel I/Q imbalance distortion can be viewed as a hard interference cancellation approach because the hard decoded symbols (aI(k), aQ(k)) are used in equations (33a) and (33b). A soft interference cancellation approach can be obtained by replacing aI(k) and aQ(k) in equations (33a) and (33b) by
$$a_I'(k) = \sum_{i=1}^{M} p(i,k)\, a_I^{(i)}, \qquad a_Q'(k) = \sum_{i=1}^{M} p(i,k)\, a_Q^{(i)}$$
where p(i,k) represents the probability of symbol i during symbol interval k, and $(a_I^{(i)}, a_Q^{(i)})$ denotes the I and Q components of the ith of the M constellation points. This way, all M symbols are taken into account in each symbol interval in accordance with their respective probabilities. However, as the SISO iterations converge to a solution, these summations will converge to the contribution from only the correct symbol. Soft decoded data estimates weighted by their probabilities are used in many preferred embodiments.
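The probability-weighted symbol estimates, and the restriction of parameter estimation to the more reliable intervals, can be sketched as follows. The QPSK labeling with a=1, the threshold value of 0.9, and the function names soft_symbol and reliable_intervals are assumptions made only for this illustration.

```python
# Assumed QPSK point ordering with a = 1; the actual labeling depends on the mapper.
QPSK = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]


def soft_symbol(probs, constellation=QPSK):
    """Probability-weighted average of the M constellation points for one interval.

    probs[i] plays the role of p(i, k) for a fixed symbol interval k.
    """
    soft_i = sum(p * point[0] for p, point in zip(probs, constellation))
    soft_q = sum(p * point[1] for p, point in zip(probs, constellation))
    return soft_i, soft_q


def reliable_intervals(prob_rows, threshold=0.9):
    """Indices k whose most probable symbol has probability >= threshold."""
    return [k for k, probs in enumerate(prob_rows) if max(probs) >= threshold]


if __name__ == "__main__":
    frame = [
        [0.97, 0.01, 0.01, 0.01],  # nearly converged interval
        [0.40, 0.30, 0.20, 0.10],  # still ambiguous interval
    ]
    print([soft_symbol(p) for p in frame])
    print(reliable_intervals(frame))
```

As the probabilities sharpen over the SISO iterations, the value returned by soft_symbol approaches a single constellation point, which corresponds to the convergence behavior described above.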
Referring to
During joint SISO iterations with soft interference cancellation, a block 2210 performs soft decoding using a modified set of bit metrics that have been updated in a block 2255. These modified bit metrics are processed through a SISO half iteration involving the inner code in the block 2210. When CTBC codes are in use, the output of the block 2210 is then deinterleaved in accordance with a CI-2, CI-3 or CI-4 or any other constrained interleaver that implements a set of constraints as needed to support the underlying CTBC code. The deinterleaved sequence is then coupled to a block 2220 that performs soft decoding in accordance with the outer code. The soft decoded outputs of the block 2220 are then coupled to a block 2225 that performs interleaving, and the interleaved sequence is fed to a block 2230 that performs soft decoding in accordance with the inner code, and which may be a software instantiation that shares some or all of the same hardware as the block 2210. The inner decoded sequence is then passed to a block 2235, which computes symbol estimates along with their probabilities. The probabilities of all M symbols are calculated during each interval, and this calculation can be based on the likelihood (L) values of the coded bits generated during the SISO iterations.
Once the symbol estimates and their probabilities are known, hard type decisions (aI(k), aQ(k)) or soft type decisions (aI′(k), aQ′(k)) can be made similar to those discussed in connection with equations (33a) and (33b). In a block 2247, certain operations can be performed every iteration or every few iterations, depending on the embodiment and signal conditions. In a block 2240, interference parameters are calculated. For example, depending on the embodiment, the interference parameters could be the components of a 2×2 rotation/distortion matrix, as in the I/Q imbalance correction example of equations (33a) and (33b) or in polarized-channel type embodiments where channel imperfections cause the horizontal and vertical polarized channels to have a degree of cross talk. Other examples include V-BLAST, D-BLAST, GSM and other MIMO communications systems where there is more than one active transmitter at any given time. In such embodiments, the interference parameters computed in the block 2240 will be used to cancel interference due to other simultaneously transmitted channels (off-diagonal terms in an nr×nr inter-channel distortion matrix) from a selected channel (on-diagonal terms in the nr×nr inter-channel distortion matrix). If desired, only a subset of intervals that have higher probabilities for the decoded symbols can be used to estimate the interference parameters. Once the interference parameters are available, the block 2245 is used to compute a new interference-cancelled signal estimate vector, y′. The signal estimate vector, y′, is then stored in memory in a block 2250. The sequence y′, along with the estimated probabilities of the symbols, is then used to compute a set of interference-cancelled bit metrics in the block 2255. The output of the block 2255 is then used in the next SISO iteration.
In many prior systems, interference cancellation is performed using interference cancellation parameters that have been estimated during previous frames. However, the method 2200 estimates the parameters based upon information from the frame being currently decoded. Even if there are slow time variations in the parameters inside of the current frame, the method 2200 will be able to track those slowly time-varying interference cancellation parameters. Since the number of estimated parameters is typically low, the method/system 2200 can be modified to calculate interference cancellation parameters for a variety of different types of interference that occur in the same frame being soft decoded. For example, the approach 2200 can be used in partially coherent or non-coherent systems to jointly perform SISO decoding and carrier phase recovery.
In many embodiments, the estimates that were made in the previous frame can also be used to provide a set of starting parameters to be used in the beginning of the current frame. That is, information from blocks 2240 and/or 2245 from the previous frame can be provided to block 2205 to start off the iterations in the current frame, v, using a vector y′ based upon the received signal information in the current frame and the interference cancellation parameters computed based on information from the previous frame.
A main benefit of the SISO decoder/soft interference cancellation is that the block 2125 of
The above soft interference cancellation approach 2200 can also be applied to other forms of soft interference cancellation that do not involve multi-antenna MIMO systems. For example, the soft interference cancelling approach of the present invention can be applied in systems where there are other forms of MIMO processing and multiple data streams. Consider a specific example where a single antenna transmits on both the horizontal and vertical polarizations. In such a case the channel can introduce a rotation so that the horizontal and vertical polarization channels interfere with one another. In such an example, the same above-described soft interference cancellation technique could be used to cancel the interference between the horizontal and vertical polarizations as an integral part of soft decoding with soft interference cancellation. The method/apparatus/system 2200 can be applied to many other kinds of codes besides CTBC codes. This would include other types of serially concatenated codes, parallel concatenated codes such as turbo codes, convolutional codes, block codes, or generally any kind of code that is soft decoded using a SISO decoder. Hence it is to be understood that the soft interference cancellation technique described above could be applied to a variety of different communications systems where there is more than one data stream being sent simultaneously, there is cross talk between channels, and the channels are encoded in such a way that a SISO decoder is located in a receiver that is designed to decode the plurality of received signals.
Referring now to
A constellation and spatial mapper 2310 is provided to receive an input bit stream, which is presented to the block 2310 on the input arrow to the left. For example, the input bit stream can be a CTBC encoded bit stream. As discussed earlier in connection with CICM, the input bit stream can be any coded bit stream for which a set of tables P(d), for d=dt, dt+1, . . . , df can be constructed. This would include, in addition to CTBC codes, block codes, convolutional codes, turbo product codes, and other codes like selected turbo codes and LDPC codes for which these tables can be constructed.
The output of the laser 2305 couples to a first input of an optical modulator 2315. The optical modulator receives at a second input m bits representative of a signal constellation point input that tells the optical modulator how to modulate the laser input to produce a modulated laser output. The modulation is performed in accordance with the signal constellation point supplied as a column of the submatrix Γ1 by the constellation and spatial mapper 2310. For example, when m=4, each column of the submatrix Γ1 could contain four coded bits that identify a 16-PSK signal point to which the four coded bits will be mapped in a given symbol interval. In this example a particular CTBC code is used to encode the bit stream input to the block 2310, and the encoded bit stream is then mapped to a sequence of constellation points using a CICM-16-PSK mapping as previously described.
The modulated laser output of the optical modulator 2315 is coupled to a single input multiple output (SIMO) active optical filter bank/combiner 2320. The SIMO active optical filter bank/combiner 2320 receives a spatial constellation point input that tells the SIMO active optical filter bank how to configure its SIMO transfer function so that the single input containing the output of the optical modulator 2315 is coupled through one of nt internal optical signature filters that exist inside the SIMO active optical filter bank. The outputs of the nt different internal optical signature filters within the SIMO active optical filter bank are sent to a combiner. The combiner can be implemented using known optical technology, including merging optical paths in an optical integrated circuit or fiber couplers that have multiple optical input fibers which are length matched and combined to form a single output.
The selection of which one of the nt optical signature filters the modulated laser signal is coupled through is performed in accordance with the log2(nt) bits that correspond to a spatial constellation point supplied as a column of the submatrix Γ2. For example, if there are sixteen possible optical signature filters inside of the SIMO active optical filter bank, then nt=16 and the submatrix Γ2 will have log2(nt)=4 bits per column. Therefore, in each symbol interval, m=4 bits will be coupled from a selected column of the submatrix Γ1 to the optical modulator to identify a 16-PSK signal point, and log2(nt)=4 bits will be mapped from the corresponding column of Γ2 to identify the selected one of the nt=16 optical signature filters through which the modulated laser signal will be coupled during that same symbol interval. Note that in this example, where sixteen selectable optical signature filters exist within the SIMO active optical filter bank, the line rate is doubled over what is sent by the CICM-16-PSK portion, because in addition to the four bits sent each symbol interval to select a 16-PSK constellation point, four more bits are sent each symbol interval to select a spatial constellation point (i.e., to select an optical signature filter through which to couple the 16-PSK modulated laser signal). The output of the SIMO active optical filter bank/combiner 2320 is output to an optical channel. The optical channel is typically implemented as a fiber optic communication channel, although free space laser communication channels could also be used.
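The per-interval bit accounting in this example can be illustrated with the short sketch below. It assumes a natural-binary labeling of both the 16-PSK points and the filter indices, which stands in for the actual CICM mapping and is not taken from the description; the function names are hypothetical.

```python
import cmath
import math


def bits_to_index(bits):
    """Interpret a list of 0/1 bits as an unsigned integer, MSB first."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return index


def map_symbol_interval(gamma1_col, gamma2_col):
    """Map one column of Gamma1 and one column of Gamma2 for a symbol interval.

    Returns the complex M-PSK point (M = 2**m, natural-binary labeling assumed)
    and the index of the selected signature filter.
    """
    m = len(gamma1_col)
    psk_index = bits_to_index(gamma1_col)
    point = cmath.exp(2j * math.pi * psk_index / (1 << m))
    filter_index = bits_to_index(gamma2_col)
    return point, filter_index


def bits_per_interval(m, nt):
    """Coded bits carried per symbol interval: m signal bits plus log2(nt) spatial bits."""
    if nt < 1 or nt & (nt - 1):
        raise ValueError("nt must be a power of two")
    return m + nt.bit_length() - 1


if __name__ == "__main__":
    print(bits_per_interval(4, 16))
    print(map_symbol_interval([0, 1, 0, 0], [1, 1, 1, 1]))
```

With m=4 and nt=16, bits_per_interval returns 8, matching the doubling of the line rate relative to the 16-PSK portion alone.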
To better understand the structure and function of the SIMO active optical filter bank/combiner 2320, let the output of the optical modulator 2315 be denoted as s(t), let the vector x=ei be a standard unit basis vector of all zeros except for a “1” in the ith component, and let Ht be a MIMO channel sub-matrix associated with a single symbol interval. Then the output of the SIMO active optical filter bank, yt(t)∈C^nt, is
$$\mathbf{y}_t(t) = H_t\,\mathbf{x}\,s(t). \tag{34}$$
The combiner effectively creates a single output signal st(t) to couple to and through the optical channel. The signal st(t) is created by summing all of the elements of the vector signal yt(t) at each point in time. Because the vector x is equal to a standard unit basis vector, ei, where the subscript, i, corresponds to a currently selected one of the nt possible spatial constellation indices (each identified by log2(nt) bits), the output of the combiner will be equal to the signal s(t) filtered by the ith optical signature filter transfer function. In this model, it can be noted that the Ht sub-matrix can be described as a matrix whose elements correspond to filter transfer functions. These transfer functions are optical transfer functions and are applied during each symbol interval. As the constellation and spatial mapper 2310 outputs each new pair of signal and spatial constellation points, a new coherently modulated laser signal s(t) is generated, the spatial constellation point is mapped to a selected index, i, for the corresponding symbol interval, and the output of the SIMO active optical filter bank/combiner 2320 corresponds to the ith optical-signature-filtered and modulated laser signal, st(t).
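A discrete-time sketch of the model of equation (34) followed by the combiner is given below. It assumes that each signature filter is an FIR filter described by a list of taps of equal length; the function names fir_filter and filter_bank_combiner are hypothetical and do not correspond to any particular optical implementation.

```python
def fir_filter(taps, signal):
    """Full linear convolution of a sampled signal with FIR filter taps."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, sample in enumerate(signal):
        for k, tap in enumerate(taps):
            out[n + k] += tap * sample
    return out


def filter_bank_combiner(bank, selected, signal):
    """Apply y_t = H_t x s with x = e_selected, then sum the branch outputs.

    bank is a list of nt tap lists (assumed to have equal length); only the
    selected branch receives a non-zero input, as in spatial modulation.
    """
    x = [1.0 if j == selected else 0.0 for j in range(len(bank))]
    branches = [
        fir_filter(taps, [xj * sample for sample in signal])
        for taps, xj in zip(bank, x)
    ]
    # The combiner sums the nt branch outputs sample by sample.
    return [sum(samples) for samples in zip(*branches)]


if __name__ == "__main__":
    bank = [[1.0, 1.0], [1.0, -1.0]]
    s = [1.0, 2.0, 3.0]
    print(filter_bank_combiner(bank, 1, s))
    print(fir_filter(bank[1], s))
```

Both printed sequences coincide, which reflects the statement that the combiner output equals s(t) passed through the ith signature filter alone.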
Each of the active optical signature filters can be implemented in accordance with known technology, such as by using optical integrated circuit technology or fiber gratings and the like. For further details of the technology used to design and implement active optical filters, see U.S. Pat. No. 6,687,461: “Active optical lattice filters,” D. L. MacFarlane and E. M. Dowling, and U.S. Pat. No. 7,042,657: “Filter for selectively processing optical and other signals,” D. L. MacFarlane both of which are incorporated by reference herein. In addition to the active components, which can include semiconductor optical amplifier regions (SOARs), the active optical signature filters can be designed to include passive optical filter sections. To see a number of optical filter architectures that are known to those of skill in the art and can be used inside the SIMO active optical filter bank, also see C. K. Madsen and J. H. Zhao “Optical filter design and analysis: a signal processing approach,” Wiley, 1999 (“the Madsen reference”).
The SIMO active optical filter bank/combiner 2320 will include a set of active components which can include voltage-controlled reflection coefficients and voltage-controlled SOARs, for example. These active components can be used to alter the transfer functions of the optical signature filters inside of the SIMO active optical filter bank. For example, at the line rate, and in response to the log2(nt) bits that correspond to a spatial constellation point supplied as a column of the submatrix Γ2, the active components can be used to cause the optically modulated laser signal from the block 2315 to be coupled through a selected one of the optical signature filters that are otherwise implemented using passive optical components. In other embodiments, the active components in the SIMO active optical filter bank can be used to alter the transfer function of an active optical filter to select one or more transfer functions of one or more corresponding optical signature filters. In other embodiments, the gains of certain particular SOARs could be used to select a sub-bank in the filter bank, and then inside that sub-bank, a single active optical filter could be responsive to one or more sub-components of the spatial constellation point to select from a plurality of pre-designated optical signature filter transfer functions that can be realized by the single active optical filter to realize a subset of transfer functions associated with the sub-bank.
As discussed in the Madsen reference, optical filters can be designed using multi-stage moving average (MA), multi-stage auto-regressive (AR) and multi-stage auto-regressive moving average (ARMA) based architectures. MA filters are also known as FIR (finite impulse response) filters, AR filters are also known as all-pole IIR (infinite impulse response) filters, and ARMA filters are also known as IIR filters with arbitrary poles and zeros. Therefore, any of the nt different optical signature filters in the SIMO active optical filter bank/combiner 2320 can be implemented as any of these filter types. Other filter types are known, such as ring resonators and multi-port couplers, 2D lattice filters, N×M 2D lattice filters, higher dimensional lattice filters, and 2D active lattice filters, and such architectures could also be used to construct the entire SIMO active optical filter bank/combiner 2320, or sub-portions thereof.
Each optical signature filter used within the SIMO active optical filter bank/combiner 2320 can be designed to form a portion of the information contained in the channel matrix, H. It can be desirable to design the optical signature filters to be an orthogonal basis set. For example, the filter bank may preferably be constructed using discrete-time optical FIR filters that are preferably implemented using a multistage MA architecture as described in the Madsen reference. In such an example, if the filter coefficients of the FIR filters form a set of orthogonal basis vectors, then the optical signature filters correspond to a set of orthogonal filters. However, as discussed below, the total channel matrix, H, can include additional filters at the receiver, potentially matched to those in the transmitter, in which case the combination of the transmit and receive filter banks may be designed to be an orthogonal basis set. Also, for example, if space time shift keying (STSK) is being used in the system 2300, then the optical signature filters used within the SIMO active optical filter bank/combiner 2325 could implement a portion of the dispersion matrix associated with each optical signature filter in the filter bank 2320. Likewise, instead of having the signature filters within the discrete-time optical filter bank implement an orthonormal basis set, it is possible to use different types of optical filters such as all-pole optical lattice filters as described in chapter 5 of the Madsen reference. Such filters are easy to implement and could be used to provide a significant amount of signal separation as opposed to orthonormality. This type of design could lead to more compact and efficient optical signature filter banks at the expense of requiring the SISO decoder or other related signal processing hardware to work harder. That is, it is not required to implement an orthonormal basis set of filters, nor an approximation thereto.
All that is really needed is to ensure that the filters are selected so that the overall channel matrix H is invertible, full rank, or has enough rank so that the SM-modulated or MIMO-modulated signals can be suitably recovered/reconstructed after passing through the channel H. The transmit and receive filters will influence H as will any noise and distortion effects of the optical channel itself.
The output of the optical communication channel is coupled to a receiver subsystem whose front end comprises a single input multiple output (SIMO) active optical receive filter bank 2330. The SIMO active optical receive filter bank 2330 includes internal active optical components that effectively split the received signal into nr different receiver channels. In preferred embodiments of the system 2300, at this time, it is deemed desirable to set nr=nt. For example, if the SIMO active optical receive filter bank 2330 includes nr=nt=16 internal receive filters, then a single input, multiple output SOAR can be designed to distribute the single input from the optical channel to the inputs of a set of nr optical receive filters arranged into a parallel filter bank. While an architecture involving a SIMO SOAR that distributes the optical receive signal from the optical channel to nr different optical receive filters arranged in parallel can be desirable, other optical filter architectures could alternatively be used. For example, 2D active optical lattice filters, N×M 2D active optical lattice filters, higher dimensional active optical lattice filters, or other architectures such as SIMO optical ring resonators and the like could be used to implement the voltage-controlled SIMO transfer function of the SIMO active optical receive filter bank 2330. The SIMO transfer function can also be viewed as a set of transfer functions in parallel from the single input to the multiple outputs.
It should be noted that the combination of the SIMO active optical filter bank/combiner 2320 and the SIMO active optical receive filter bank 2330 provides an optical computing structure/architecture to emulate/perform the mathematical operation of the channel matrix, H. The system 2300 is able to implement the equivalent of a MIMO system, but makes use of the fact that a form of spatial modulation is used where a non-zero modulated signal is only applied to one of the active optical signature filters in the SIMO active optical filter bank/combiner 2320 at a time. As a symbol passes through filter number i in the optical filter bank 2320, the filters will be designed so that as much energy as possible of this symbol will pass through filter i in the receive filter bank 2330, while as much energy as possible is blocked from passing through the other filters, j≠i. The optical signature channel number corresponding to the receive filter that has the highest energy generally corresponds to the spatial constellation point's coordinate.
In preferred embodiments, the SIMO active optical filter bank/combiner 2320 and the SIMO active optical receive filter bank 2330 act as a set of matched filters that are maximally orthogonal. That is, if the laser modulated signal s(t) is coupled to channel i of the SIMO active optical signature filter bank 2325, then the optical receive filter i of the SIMO active optical receive filter bank 2330 will be matched to provide at its output as much of the signal s(t) as is possible. Also, the rest of the optical receive filters in the active optical receive filter bank 2330 will be designed to provide at their output as little of the signal s(t) as is possible. This can be achieved, for example, by designing the cascade of each optical signature filter/optical receive filter pair i to be as close as possible to an orthogonal basis vector relative to all the other filter channels, j≠i. For systems like STSK, where dispersion matrices are used, the filters in the signature and receive filter banks can be designed in accordance with a desirable and selected set of fixed dispersion matrices, for example.
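The matched, mutually orthogonal filter pairs and the highest-energy rule for recovering the spatial constellation index can be illustrated with a simple sampled model. The use of Sylvester Hadamard rows as FIR signature taps is an assumption chosen only because the rows are exactly orthogonal; the description does not prescribe any particular tap set, and the function names are hypothetical.

```python
def hadamard(n):
    """Rows of an n x n Sylvester Hadamard matrix (n must be a power of two)."""
    rows = [[1]]
    while len(rows) < n:
        rows = [r + r for r in rows] + [r + [-v for v in r] for r in rows]
    return rows


def transmit(symbol, taps):
    """Signature-filtered symbol: one complex symbol spread over the filter taps."""
    return [symbol * tap for tap in taps]


def detect(received, bank):
    """Correlate with every receive filter and pick the one with the highest energy."""
    energies = [
        abs(sum(r * tap for r, tap in zip(received, taps))) ** 2
        for taps in bank
    ]
    best = max(range(len(bank)), key=energies.__getitem__)
    return best, energies


if __name__ == "__main__":
    bank = hadamard(16)              # nt = nr = 16 assumed signature filters
    rx = transmit(1j, bank[5])       # noise-free channel assumed
    index, energies = detect(rx, bank)
    print(index, energies[index])
```

In this noise-free sketch the receive branches j≠i collect zero energy because the tap sets are orthogonal; with channel distortion or non-orthogonal filters the unselected branches would carry residual energy that the decoder has to resolve.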
The output of the SIMO active optical receive filter bank 2330 is coupled to a coherent detector and processor/memory interface 2335. The coherent detector and processor/memory interface 2335 uses a set of nr coherent detectors to convert the nr different (multiple) outputs from the SIMO active optical receive filter bank 2330 from optical signals to electrical signals. The outputs of the nr coherent detectors are sampled at a sample instant and converted into nr different respective digital signals, each with real and imaginary components (complex numbers) corresponding to the I and Q components. The set of nr received complex-number signal points is then stored in a memory. The memory is preferably arranged in an ordering related to the ordering of the signal points as observed at the receiver front end where the signal points are sampled. This memory preferably is double buffered and keeps track of all of the information related to the received signal that was received in each symbol interval of a coding frame from each of the spatial channels (outputs of each of the optical receive filters in the SIMO active optical receive filter bank 2330). While one memory is being processed by a SISO decoder 2340, another memory is being loaded with new information received from the optical channel.
The information that is stored in the memory associated with the block 2335 will be used to compute an initial set of bit metrics to be used in the SISO decoder 2340 that is operably coupled to the memory. The information stored in the processor/memory interface portion of the block 2335 is processed via a CICM deinterleaver 2350 and used to compute the initial set of bit metrics used by the SISO decoder 2340. Each time the SISO decoder 2340 executes a SISO decoding iteration, the updated extrinsic information is processed via a CICM interleaver 2345 and used to compute updated bit metrics. The updated bit metrics are then passed through the CICM deinterleaver 2350 for use in the next SISO iteration. The SISO decoder is allowed to compute SISO iterations until a convergence criterion or stopping condition is met. The output of the SISO decoder is a decoded frame of information bits which exits the SISO decoder on the output arrow to the right of the SISO decoder block 2340. In a preferred embodiment the SISO decoder is designed to decode a CTBC code. As discussed above, the CICM approach can also be used with other types of codes such as block codes, convolutional codes, turbo product codes, and, depending on the particular selected code, certain turbo codes and LDPC codes where the P(d) table can be determined.
In some alternative embodiments the constellation and spatial mapper 2310 can be implemented using other spatial modulation techniques instead of CICM. In non-CICM SM embodiments of the system 2300, the P(d), for d=dt, dt+1, . . . , df tables do not need to be determinable. For example, if the bit stream is encoded using an LDPC code for which these tables cannot be determined, possibly concatenated with large block codes or the like as is sometimes used in OTN, or if for any other reason CICM is not desired to be used in a given embodiment, then block 2310 would perform any selected SM algorithm other than CICM-SM to constellation map and spatially map the coded input bits coming into the left of block 2310 onto a sequence of constellation/spatial constellation points. That is, the system 2300 is general enough to be used with CICM or any other identified SM technique. Key novel features beyond SRCI CTBC codes and CICM include the use of the blocks 2320 and 2330 and other blocks in the system 2300 that allow the optical signal that traverses the optical channel to be processed as an SM type signal similar to the way shown in
That is, an aspect of the present invention relates to the use of an optical filter bank 2320 to transform an optical-modulated laser signal (output of block 2315) to an SM-multichannel signal. The SM-multichannel signal can be viewed as a collection of nt optical filter bank channel outputs of the optical filter bank inside block 2320. The optical filter bank preferably includes a collection of discrete-time optical filters arranged in parallel, and the implementation preferably uses multistage architectures as are known to those of skill in the art via the Madsen reference and the numerous citations to related work provided therein. The optical filter bank inside block 2320 uses the same structure as shown in blocks 2105, 2110, but instead of the SM-multichannel signal exiting from multiple antennas and passing through the transmit portion, Ht, of the channel matrix H, the optical SM-multichannel signal is the multichannel output of the optical filter bank (Ht) inside block 2320. The block 2320 also includes a combiner that is coupled to receive the SM-multichannel signal and combine the nt multichannel component signals to form a single modulated laser signal. The combining operation is equal to or similar to a summation operation and is preferably carried out/implemented/embodied using one or more optical combiners. The output of the block 2320, i.e., the single modulated laser signal, is then transmitted onto the optical channel. At a receiver 2330, 2335, 2340, 2345, 2350, a noisy and optical-channel-distorted version of the single modulated laser signal is then received from the optical channel. At the receiver, the SIMO optical filter bank 2330 or a variation thereof is used to decompose the single modulated laser signal into a plurality of multichannel component signals. This plurality of multichannel component signals can be viewed as a reconstruction of an estimated version of the SM-multichannel signal.
While the preferred embodiment uses a multichannel SISO decoder 2340, 2345, 2350 to decode the estimated version of the SM-multichannel signal, other types of decoders can alternatively be used with the present invention. Other types of decoders would include hard iterative decoders, or any other type of decoder used to decode any kind of code, such as an LDPC decoder, possibly in operation with block code decoders, or turbo product decoders as are commonly used in OTN applications, or the like. Even simple channel decoders such as a multichannel equalizer followed by a conventional slicer/decision circuit could be used in place of the blocks 2340, 2345, 2350 in
It can also be noted that an aspect of the present invention as per the system 2300, in its broader context, teaches a broader genus of inventions that need not necessarily be implemented in optics. As is well known, digital filter banks can be readily implemented. For example, fred j harris, “Multirate signal processing for communication systems,” Prentice-Hall, 2004 (“the harris reference”) describes how multirate digital signal processing techniques can be used to construct digital filter banks that involve sub-band processing. Multirate signal processing makes extensive use of bandpass sampling and can be useful to lower the computational load associated with filtering a bandpass modulated signal, especially in embodiments where the modulated signal output of 2315 is centered at an RF (radio frequency) carrier frequency. Single rate digital filter banks can also be readily constructed and used in embodiments that do not perform resampling but have all parallel filters operating in all filter channels at a single sampling rate that is the same as that of the input and/or output signals. Also, MISO (multiple input, single output) type digital filter banks are well known that can be viewed as having multiple parallel inputs that feed multiple parallel filter channels, and a summing junction (digital combiner) that is used to add the outputs of the multiple parallel filters to provide a single output. Therefore, the block 2320 could be implemented as a MISO digital filter bank using purely digital hardware. Similarly, a SIMO digital filter bank could be used to implement the block 2330. A SIMO digital filter bank sends a single input stream to multiple parallel digital filter channels and provides a multi-channel output signal. Being digital, digital filter banks can be implemented using one or more instruction set processors coupled to memory. This could be dedicated hardware or hardware shared with other digital signal processing functions in the system.
In all-digital embodiments, the laser 2305 and the optical modulator 2315 are replaced by a standard digital physical layer channel interface such as a BPSK, QPSK, QAM, OFDM, or any other kind of modulator for a given channel that can be used with spatial modulation. The modulated signal output of the non-optical version of the physical layer block 2315 is passed in digitized form to the block 2320, which performs SIMO filter bank operations. Since in SM only one transmit channel is used at a time, the spatial modulation constellation point coming from Γ2 of block 2310 will select a filter from the SIMO filter bank to be applied to the modulated signal during a given symbol interval. The output of the block 2320 will thus be a filtered version of the modulated signal, where a selected filter from the digital filter bank 2320 is applied each symbol interval. The filters inside the digital filter bank 2320 are preferably selected to allow the spatial modulation constellation point to be resolved at the receiver. The output of the block 2320 can be sent directly to a digital to analog converter (DAC), or to a line/channel interface that includes an analog reconstruction filter. In 5G wireless and similar types of wireless embodiments, the channel interface could be an air interface such as used in 3G, 4G or 5G cellular, or as used in WiFi wireless local area networks, or as used in 802.16 type WiMAX systems. In other types of embodiments, the line interface could correspond to a DSL (digital subscriber line) broadband twisted pair telephone line, or could correspond to a cable modem type channel interface.
An advantage to using this alternative MISO/SIMO approach 2300 as opposed to a multi-antenna embodiment is that the filters in the digital filter bank can be made to be adaptive. The filter response of adaptive filters can be varied as a function of current channel conditions. Therefore, adaptive filters can be adjusted or otherwise changed or updated to improve the properties of the overall channel matrix, H. While multi-antenna embodiments rely on a complicated MIMO type channel model, the above described MISO/SIMO approach 2300 can better select and control the overall channel matrix, H. Another advantage is that while current SM systems as used in cellular networks can have a large number of antennas in the downlink from the base station to the mobile, the mobile unit (handset) itself can only have a small number of antennas due to mobile-unit size constraints. When the above described alternative MISO/SIMO approach 2300 is used, the mobile unit can have a large number of equivalent SM channels. Also, even in the base station, it can be more cost effective to implement the multiple downlink channels using a digital filter bank because this eliminates extra antennas and also provides more control in selecting and maintaining a desired channel matrix, H.
The digital, analog, or discrete time filter banks can be used to implement the transmitter and/or receiver portions of the channel matrix H. That is, the SM/MIMO channel matrices Ht and Hr (where H=f(Ht, Hch, Hr)) can be implemented using the filter banks in the transmitter and/or receiver, and Hch would be the actual communication channel. The actual communications channel matrix Hch may be a scalar if both the transmitter and the receiver include the above-described filter banks and only one antenna or only one line interface is used. Mixed systems that use filter banks to implement Ht and Hr but also use multiple antennas/physical channels may also be constructed. Similar to the optical embodiment described above, the filters in these digital, analog or discrete-time filter banks can be selected to be orthonormal, square-root orthonormal, or some other type of non-orthogonal basis functions such as all-pole filters or filters with poles and zeros. That is, it is not required to implement an orthonormal basis set of filters, nor an approximation thereto. All that is really needed is to ensure that the filters are selected so that the overall channel matrix H is invertible, full rank, or has enough rank so that the SM-modulated or MIMO-modulated signals can be suitably recovered/reconstructed after passing through the channel H. The transmit and receive filters will influence H as will any noise and distortion effects of the optical channel itself.
In yet another embodiment, instead of implementing the blocks 2320 and 2330 using parallel MISO and SIMO digital filter banks, either analog filter banks or discrete-time filter banks (such as tapped delay lines or SAW (surface acoustic wave) filter banks) are used. That is, the modulated signal such as the BPSK, QPSK, QAM, OFDM, or any other kind of modulated signal for a given channel is generated at the block 2315 and the block 2320 is operative to pass the modulated signal to a selected analog or discrete-time filter, where the selection is made in accordance with the spatial constellation point supplied by the spatial modulator during the given symbol interval. That is, all operations are similar to the above-described digital filter bank embodiment/approach, except the DAC operation is performed before the block 2320 instead of after it, and ADC (analog to digital conversion) is applied after the block 2330 instead of before it. Such embodiments are practical in some cases because the center frequency and/or the bandwidth of the modulated signal can make the digital filter bank operations require very high processing speeds. It is envisioned that a single chip could be used to implement the blocks 2320 and 2330 for use in a handset using either analog filter technology or discrete-time filter technology, and in the case of block 2320, under digital selective control in accordance with the spatial modulation constellation point coming in each sampling interval from the block 2310 (Γ2).
In the discussion below,
Referring now to
A constellation and spatial mapper 2410 is provided to receive an input bit stream which is presented to the block 2410 on the input arrow to the left. For example, the input bit stream can be a CTBC encoded bit stream. As discussed earlier in connection with CICM, the input bit stream can be any coded bit stream for which a set of tables P(d), for d=dt, dt+1, . . . , df can be constructed. In the MIMO embodiment 2400, the signal mapper 2410 can optionally include a spatial constellation mapper component. This optional spatial mapping component is shown in dotted lines as an optional output from the mapper 2410 to a MIMO optical signature filter bank/combiner block 2420. In embodiments where the optional spatial mapping component is supplied by the signal mapper 2410, the signal mapper 2410 becomes a constellation and spatial mapper 2410 (as shown in
In embodiments that use MIMO vector modulations such as V-BLAST and D-BLAST, the constellation and spatial mapper 2410 performs spatial mapping by providing nt number of signal constellation points to be transmitted via the vector, x, each symbol interval. In such modulations the dotted arrow coming from the spatial modulation sub-matrix, Γ2, is empty and there is no separate spatial modulation matrix Γ2. Instead, in this case, the spatial portion of the modulation is performed by virtue of mapping nt number of signal constellation points to be transmitted via the vector, x, each symbol interval.
The outputs of the lasers 2405 couple to the laser-inputs of a bank of optical modulators 2415. The optical modulators each receive a second input of m bits each, where each m-bit input is representative of a signal constellation point that tells each respective optical modulator how to modulate its laser input to produce a modulated laser output. The modulation is performed in accordance with the signal constellation point supplied as a column of the submatrix Γ1 by the constellation and spatial mapper 2410. For example, when m=4, each column of the submatrix Γ1 could identify nt groups of four coded bits each that respectively identify a respective 16-PSK signal point to which each respective group of m=4 coded bits will be mapped in a given symbol interval. In this example a particular CTBC code is used to encode the bit stream input to the block 2410. During each symbol interval, the encoded bit stream is mapped to a vector comprising nt CICM-16-PSK constellation points. For example, if nt=4, this will increase the data rate to nt=4 times the data rate of a single CICM-16-PSK channel. If nt=16, this will increase the data rate to nt=16 times the data rate of a single CICM-16-PSK channel. In general, this type of transmission can provide a speed up of nt times the data rate of a conventional system that does not use the MIMO processing of the system 2400.
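By way of non-limiting illustration, the grouping of m·nt coded bits into nt constellation points per symbol interval can be sketched as follows. The function name `map_interval_to_psk` and the natural-binary labeling of the PSK points are assumptions made only for this sketch; they are not the CICM mapping rule, which would be determined by the permutation Γ and the selected constellation mapper.

```python
import cmath

def map_interval_to_psk(coded_bits, m, nt):
    """Split m*nt coded bits into nt groups of m bits; map each group to one 2^m-PSK point."""
    if len(coded_bits) != m * nt:
        raise ValueError("expected exactly m*nt coded bits per symbol interval")
    points = []
    for channel in range(nt):
        group = coded_bits[channel * m:(channel + 1) * m]
        label = int("".join(str(bit) for bit in group), 2)
        # Natural-binary labeling is an assumption of this sketch,
        # not the CICM constellation mapping.
        points.append(cmath.exp(2j * cmath.pi * label / 2 ** m))
    return points

# One symbol interval with m=4 (16-PSK) and nt=4 parallel channels.
bits = [0, 0, 0, 0,  0, 1, 0, 0,  1, 0, 0, 0,  1, 1, 0, 0]
x = map_interval_to_psk(bits, m=4, nt=4)
print(len(x), "constellation points sent in one symbol interval")
```

With nt=4 the interval carries 16 coded bits rather than the 4 bits of a single 16-PSK channel, which is the nt-fold speed up noted above.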
In the system 2400, to maintain a system-wide symbol Hamming distance, the CICM mapping rule design algorithm can be configured to create a CICM constellation and spatial mapper that does not allow more than one bit from any given low weight error sequence to be transmitted during any given symbol interval. In such embodiments, a single CICM permutation matrix, Γ, is designed with m*nt number of bits per column. In such embodiments there will be nt number of columns that correspond to each Euclidian distance in the signal constellation. In other embodiments, the symbol Hamming distance will only be effective within each one of the nt different channels. In such embodiments, a single CICM permutation matrix, Γ, can be designed that has m rows and K/m columns as in the single-channel case. Now, however, a set of nt columns of Γ will be mapped to separate filter channels during each symbol interval.
Also, as applies to the system 2300 as well, if transmission is occurring on both the vertical and horizontal polarizations, the CICM mapping rule can be designed to treat the horizontal and vertical polarizations as being imperfectly coupled, in which case it is desired to view the symbol Hamming distance as involving the bits sent on both the horizontal and vertical polarizations during a given symbol interval. However, because it is common to apply a small 2×2 rotation matrix to correct for imperfections in the horizontal and vertical polarizations, the CICM mapping rule can be alternatively designed to treat the horizontal and vertical polarizations as being perfectly isolated, in which case it is desired to view the symbol Hamming distance as involving the bits sent only on the horizontal or the vertical polarization during a given symbol interval. If the vertical and horizontal polarizations are considered to be perfectly isolated, then one column of Γ will be mapped to the horizontal polarization and another column of Γ will be mapped to the vertical polarization during each symbol interval. Similarly, when the systems 2300 and 2400 are applied in DWDM systems, since the different wavelengths can generally be considered to be isolated, and assuming isolated/corrected polarizations, a full 160 columns of Γ can be mapped each symbol interval. As mentioned above, soft interference cancellation can also be used with the SISO decoder to cancel the effect of the polarization cross talk.
The MIMO optical signature filter bank/combiner 2420 receives a length-nt vector of signal constellation points. If the submatrix Γ2 is in use, a column from Γ2 indicates a subset of the nt vector inputs to process during a given symbol interval. The nt outputs of the internal optical signature filters within the MIMO optical signature filter bank are sent to a combiner that is located within the block 2420. The combiner can be implemented using known optical technology to include merging optical paths in an optical integrated circuit or fiber couplers that have multiple optical input fibers which are length matched and combined to form a single output. The single output is coupled to an optical channel such as a fiber optic cable or a free space laser channel. The output of the optical channel is coupled to a SIMO active optical receive filter bank 2430. Each of the active optical signature filters 2420 and optical receive filters 2430 can be implemented in accordance with known technology, as described above in relation to the Dowling, MacFarlane and Madsen references, for example. The structure and operation of the optical receive filter bank 2430 is largely the same as described in connection with
Because in this case the vector x can be carrying information symbols on up to all nt channels per symbol interval, the action of the combiner in the block 2420 will be to form a linear combination of all of the columns of the submatrix Ht each interval and eventually the matrix H each frame. These signals can be separated in the receiver as long as all of the columns of Ht and/or H are linearly independent. The more orthogonal the columns of Ht and/or H are, the easier it will be to effectively invert these matrices using a predetermined and matched orthogonal matrix in the receiver. The transfer functions inside the Ht submatrix are optical transfer functions and are applied during each symbol interval.
If the portion of the channel matrix H of equation (31) that is active during any symbol interval is factored as H=HrHt, then the MIMO optical signature filter bank can be viewed as having a matrix transfer function of Ht while the SIMO active optical receive filter bank 2430 can be viewed as having the transfer function Hr. In certain preferred embodiments, the channel matrices are constructed using orthogonal basis sets so that the matrix H is, or approximates, a constant times an identity matrix, and the matrices Hr and Ht can be viewed as orthogonal filter matrices.
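By way of non-limiting illustration, the case in which H=HrHt is a constant times an identity matrix can be sketched numerically. A 4×4 Hadamard matrix is used here purely as a hypothetical stand-in for the transmit signature matrix Ht, and Hr is taken as its transpose; real-valued entries and an ideal, noiseless channel are assumptions of this sketch.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

# Hypothetical orthogonal transmit signature matrix (4x4 Hadamard).
HT = [[1,  1,  1,  1],
      [1, -1,  1, -1],
      [1,  1, -1, -1],
      [1, -1, -1,  1]]
HR = transpose(HT)      # matched receive filter matrix
H = matmul(HR, HT)      # overall channel matrix, here 4 times the identity

x = [1, 0, -1, 2]                    # transmitted vector (illustrative values)
y = matvec(H, x)                     # received vector after Ht and Hr
x_hat = [value / 4 for value in y]   # undo the constant gain
print(H)
print(x_hat)
```

Because the columns of the stand-in Ht are mutually orthogonal, the receiver recovers x with a single scaling and no general matrix inversion.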
The blocks labeled 2435, 2445 and 2450 perform similar functions and have similar structures to the corresponding blocks 2335, 2345 and 2350 in
Another class of embodiments contemplated by the present invention involves OFDM (orthogonal frequency division multiplex) systems, also known as DMT (discrete multitone) type systems. These systems map an entire frame of data onto a set of N carriers, usually using a DFT (discrete Fourier transform) that is implemented using an FFT (fast Fourier transform) and its inverse transform. In such systems, each sub carrier is typically viewed as carrying one QAM type data symbol each frame. The collection of all QAM symbols on all sub carriers per OFDM frame is called an OFDM symbol. The OFDM symbol interval is thus the OFDM frame size plus possibly a cyclic prefix duration that is used as a guard interval to separate the OFDM symbols in time enough so that FFT processing can be employed in a demodulator. As is known in the art, the collection of QAM data symbols on the various subcarriers can be encoded using FEC and/or TCM.
In accordance with an aspect of the present invention, the OFDM symbol is formed by modulating the carriers using a CTBC encoded data sequence. The CTBC code's frame size may be the same as the number of bits mapped per OFDM symbol interval, or for example, one CTBC frame of data can be mapped to an integer or fractional number of OFDM symbols. For example, if K=1024 and the number of subcarriers is 256, then the CTBC frame would be carried by four OFDM symbols. If K=1024+128=1152, then the CTBC frame would be carried by four and a half OFDM symbols. In actual OFDM systems, certain sub carriers may be used as reference tones for synchronization purposes, but the general idea is that CTBC encoded data may be mapped to one or more OFDM symbols.
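By way of non-limiting illustration, the frame-to-symbol arithmetic in the example above can be sketched as follows. The figures quoted in the text are consistent with one coded bit per subcarrier per OFDM symbol; that is an inference made for this sketch, and the function name and the `bits_per_subcarrier` parameter are hypothetical.

```python
from fractions import Fraction

def ofdm_symbols_per_frame(frame_bits, num_subcarriers, bits_per_subcarrier=1):
    """Number of OFDM symbols (possibly fractional) needed to carry one coded frame."""
    bits_per_ofdm_symbol = num_subcarriers * bits_per_subcarrier
    return Fraction(frame_bits, bits_per_ofdm_symbol)

print(ofdm_symbols_per_frame(1024, 256))   # 4
print(ofdm_symbols_per_frame(1152, 256))   # 9/2, i.e. four and a half symbols
```

A larger constellation per subcarrier would reduce the count proportionally, for example `ofdm_symbols_per_frame(1024, 256, bits_per_subcarrier=4)` gives one OFDM symbol per frame.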
Also, in the OFDM transmitter, instead of using TCM-QAM, for example, CICM-PSK could be used to modulate each sub-carrier. In systems like DSL (digital subscriber line) modems, where different numbers of bits are mapped to different sub-carriers depending on channel conditions, different sized PSK constellations could be used at different subcarriers. The CICM permutation and constellation encoding could be carried out separately for each subcarrier, or could be carried out across subcarriers, k, depending on the embodiment. Hence the present invention specifically contemplates all variations using CICM in the time and subcarrier domains or a combination of both.
Certain embodiments of the present invention would encode a frame of data into a K-bit CTBC encoded frame, then apply CICM interleaving, and then apply a reverse Gray coded or other similar constellation mapping such as anti-Gray coding to each subcarrier. For example, a CICM-16-PSK could be used to modulate each subcarrier. Depending on the embodiment, each subcarrier could carry a separate CTBC/CICM encoded data frame, or a single data frame could be spread across the entire set or a subset of the subcarriers. A 5G LTE system could be designed using a CTBC code and CICM-PSK type modulation.
Also, embodiments of the present invention are envisioned that do not use CTBC codes but instead use any of the types of codes that are discussed above that can be used with CICM. For example, as discussed in the unequal error protection section above, a single (8,4) or longer block code could be used with CICM to provide equal or unequal error protection. In an OFDM embodiment, the CICM can be used with any suitable code as discussed above, and the code does not need to be a CTBC code. That is, any suitable code can be used to create a valid CICM signal mapper with a CICM permutation and a selected constellation mapper, and this CICM can then be used to modulate a single subcarrier or can be spread across multiple sub-carriers. If CICM is used with a block code or a convolutional code, something similar to TCM results; however, CICM can perform better than TCM over fading channels. Therefore, the current TCM-QAM used in various standards to modulate subcarriers can be substituted with an appropriate CICM scheme, such as a CICM-PSK scheme that is derived from a block code or a convolutional code. Other codes such as turbo product codes, turbo codes and LDPC codes can also be used as long as their P(d) tables can be constructed as discussed above.
As is discussed in R. Y. Mesleh et al., “Spatial Modulation,” IEEE Trans. Vehicular Technology, Vol. 57, No. 4, July 2008, pp. 2228-2241 (“the Mesleh reference”), both SM-OFDM and VBLAST-OFDM (MIMO) are known. The present invention contemplates that known SM-OFDM and MIMO-OFDM can also be improved by using CTBC coding of the bit stream, and/or CICM encoding of each subcarrier, preferably using CICM-PSK for each subcarrier's constellation mapping. The present invention also contemplates using the optical and/or non-optical versions of the SM and MIMO system configurations as shown in
Referring now to
Similar to how the bits in the Γ2 matrix are used in
To understand the function of the block 2505 to perform the mapping Q(k)→{X1(k), . . . , XNt(k)}, consider all of the elements of the vectors {X1(k), . . . , XNt(k)} to be initially set to zero. Note that the kth column of Q1(k) corresponds to a complex-valued signal constellation point to be transmitted onto the kth subcarrier and to be carried by the spatial transmit signature channel number given by the kth column of Q2(k). Hence, similar to blocks 2310 and 2410, block 2505 will sequentially map each kth column of Q1(k) to a constellation point in the kth position of a selected channel vector, Xsp(k), where the subscript sp ∈ {1, 2, . . . , Nt} is equal to the binary value of the kth column of Q2(k). Since all of the elements of all of the vectors {X1(k), . . . , XNt(k)} were initialized to zero, by the time each column of Q1(k) has been mapped to a complex number in the kth position of the Q2(k)-selected channel vector, Xsp(k), the vectors {X1(k), . . . , XNt(k)} will contain all zeros except for the frequency bins k into which a complex number corresponding to a signal constellation point has been inserted. Because each frequency bin k is mapped to one channel, it will never be the case that any one spatial channel sends more than one signal constellation point on any one subcarrier at a time. If the {X1(k), . . . , XNt(k)} are viewed as row vectors all stacked into a matrix X(k), the matrix X(k) will be of size Nt×N, and each kth column of X(k) will contain all zeros, except for the spth row, which will contain a complex number. Here the spatial channel number, sp, corresponds to the binary value of the kth column of Q2(k) and the complex number corresponds to the signal constellation point determined by constellation mapping the kth column of Q1(k).
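By way of non-limiting illustration, the mapping performed by the block 2505 can be sketched as follows. The function name is hypothetical, the constellation points are assumed to have already been obtained by constellation mapping the columns of Q1(k), and the spatial channel indices are taken as zero-based (that is, sp−1 in the notation above), which is a convention adopted only for this sketch.

```python
def map_to_spatial_channels(q1_points, q2_channels, num_channels):
    """Build the Nt x N matrix X: column k is zero except in the row selected by Q2.

    q1_points[k]   : complex constellation point for subcarrier k (from Q1)
    q2_channels[k] : zero-based spatial channel index for subcarrier k (from Q2)
    """
    n = len(q1_points)
    x = [[0j] * n for _ in range(num_channels)]   # all entries start at zero
    for k, (point, sp) in enumerate(zip(q1_points, q2_channels)):
        if not 0 <= sp < num_channels:
            raise ValueError("spatial channel index out of range")
        x[sp][k] = point
    return x

# N=4 subcarriers, Nt=2 spatial signature channels (illustrative values).
points = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
channels = [0, 1, 1, 0]
X = map_to_spatial_channels(points, channels, num_channels=2)
for row in X:
    print(row)
```

Each column of the resulting X holds exactly one nonzero entry, matching the property described above.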
Next the frequency domain set of spatial channel vectors is sent to a frequency domain filter bank 2510 where the equivalent of time-domain filtering (convolution) is implemented using point-wise multiplications in the frequency domain. If needed, a cyclic prefix or guard interval can be used to ensure that the convolutional tails between OFDM symbols are maintained. The signature filter bank 2510 thereby applies a frequency domain spatial-channel signature value to each frequency bin in each transmit spatial channel. To keep the guard band short, the signature filters can IFFT back to zero padded impulse-response vectors in the time domain. The outputs of the frequency domain filter bank are then summed together in a summing junction, and the sum is sent to an inverse FFT/OFDM modulator 2515. The time-domain output signal from this OFDM modulator is then converted to analog or otherwise coupled onto a physical channel for transmission. The physical channel can be wireless, wireline, or optical, and in general, can involve one or more antennas, although a preferred embodiment only uses one antenna at the transmitter and one antenna at the receiver to implement the equivalent of the matrix MIMO type channel, H.
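By way of non-limiting illustration, the point-wise frequency domain filtering, summing junction, and inverse transform described above can be sketched as follows. The signature values, the function names, and the use of a direct (non-fast) inverse DFT are all assumptions of this sketch; a practical modulator would use an IFFT and would typically append a cyclic prefix.

```python
import cmath

def apply_signature_bank(x, signatures):
    """Multiply each channel vector by its signature bin-by-bin, then sum over channels."""
    n = len(x[0])
    return [sum(x[c][k] * signatures[c][k] for c in range(len(x)))
            for k in range(n)]

def idft(spectrum):
    """Direct inverse DFT; stands in for the inverse FFT of the OFDM modulator."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

# Frequency domain spatial channel vectors (Nt=2, N=4), illustrative values.
X = [[1 + 1j, 0j, 0j, 1 - 1j],
     [0j, -1 + 1j, -1 - 1j, 0j]]
# Hypothetical frequency domain signatures, one row per spatial channel.
S = [[1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j],
     [1 + 0j, -1j, -1 + 0j, 1j]]

combined = apply_signature_bank(X, S)   # output of the summing junction
time_signal = idft(combined)            # one OFDM symbol in the time domain
print(combined)
print(time_signal)
```

Only one antenna or line interface is needed to send `time_signal`, since the spatial channel information is carried by the signature applied in each frequency bin.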
In the embodiment shown in
In the alternative embodiment shown in
Also, while
At this point, additional disclosure is provided to explain further details of mathematical design criteria for encoding, constrained interleaving, and decoding used by combined MIMO-SM (C-MIMO-SM) embodiments of the present invention as depicted
In connection with
As discussed in connection with
All embodiments of the system of
For example, consider an embodiment of either
The mutual information properties among the coded bits will be exploited in order to allow the decoder 2125, 2440, 2535, 2635 to SISO decode (or to optimally ML decode) the C-MIMO-SM symbol stream and to converge to a correct decoded sequence solution with high probability as long as the signal to noise ratio and channel modeling errors are within prescribed limits.
For example, consider a MIMO embodiment of
If optimal decoding such as ML decoding is used, then the full set of mspatial number of SM bits can be transmitted and decoded during every C-MIMO-SM symbol interval. However, when iterative decoding such as SISO decoding is used, C-MIMO-SM symbol streams are more difficult to decode due to the lack of a reliable set of bit metrics for use in starting the SISO iterations. Hence, with iterative SISO decoding, C-MIMO-SM embodiments can be better designed by only transmitting SM bits during selected C-MIMO-SM symbol intervals. This way, the symbol intervals where no SM bits are transmitted can be used to provide reliable bit metrics at the beginning (in the first iteration) to provide a reliable starting point for the SISO iterations so that the SISO iterations will converge to the correct decoded sequence with high probability as long as channel modeling estimation errors and the signal to noise ratio are within prescribed limits. The C-MIMO-SM symbol intervals where SM bits are transmitted are referred to herein as “SM intervals,” while the C-MIMO-SM symbol intervals where no SM bits are transmitted are referred to as “non-SM intervals.”
In a frame consisting of K outer-encoded bits, the frame can be considered to hold a plurality of subsets of SC bits (where each subset of SC bits preferably corresponds to an m-bit column of Γ1), and a plurality of subsets of spatial modulation SM bits (where each subset of SM bits preferably corresponds to an mspatial-bit column of Γ2). Groups of encoded bits of the frame are transmitted in a plurality of symbol intervals (such as C-MIMO-SM symbol intervals). These symbol intervals include a plurality of SM intervals and a plurality of non-SM intervals. Each SM interval is preferably used to transmit a respective group of SC bits that can correspond to nt columns of Γ1 and a respective subset of SM bits that can correspond to one mspatial-bit column of Γ2. Each non-SM interval is preferably used to transmit a respective group of SC bits that can correspond to nt columns of Γ1 but no bits from Γ2. During each respective non-SM interval, a respective group of SC bits is coupled in a pre-determined order to a plurality of independent channels such as MIMO channels in a multi-antenna system. During each respective SM interval, a respective group of SC bits is also coupled to the plurality of independent channels, but in an order that is dependent on (determined by) a respective subset of SM bits. For example, the respective subset of SM bits can be a respective mspatial-bit column of Γ2. Each respective independent channel is configured to transmit a respective 2^m-ary independent channel symbol that is determined in accordance with a respective subset of m SC bits. The SM intervals are identified to ensure that a measure of extrinsic information will meet a specified constraint so that both the SC bits and the SM bits will be decodable by a SISO decoder based upon receiving a sequence of the independent channel symbols that have been corrupted at most by a predetermined level of channel distortion.
The channel distortion is typically additive noise (e.g., additive white Gaussian noise) plus optionally channel modeling errors or other forms of channel distortions such as channel fading and non-linear effects.
Note that each subset of SC bits contains m number of outer-encoded bits, each subset of SM bits includes mspatial m number of outer-encoded bits, and the frame of encoded bits can be viewed as including N1 number of SM intervals and N2 number of non-SM intervals. Each respective independent channel symbol typically corresponds to a 2^m-ary signal constellation point that is generated in response to a respective subset of m SC bits. Each SM interval and each non-SM interval include nt number of m-length subsets of SC bits, and each SM interval also includes a respective subset of mspatial m number of SM bits. With these parameters, the frame length is given by K = m(N1+N2)nt + N1·mspatial, where K, m, N1, N2, nt and mspatial are all positive integers. Alternative embodiments are possible where this formula does not hold, but instead numbers such as m can vary adaptively within a frame. Also, embodiments are possible where different numbers, nt1 or nt2, of SC bits are sent to different numbers of independent channels respectively during the SM intervals and the non-SM intervals. In preferred embodiments, there are nt number of independent channels, and nt·m number of SC bits are coupled to the nt number of independent channels during each respective SM interval and during each respective non-SM interval.
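By way of non-limiting illustration, the frame length relation K = m(N1+N2)nt + N1·mspatial can be evaluated as follows. The function name and the numerical values of N1 and N2 are hypothetical and are chosen only to show the arithmetic.

```python
def frame_length(m, n1, n2, nt, m_spatial):
    """K = m*(N1+N2)*nt + N1*m_spatial, with all arguments positive integers."""
    sc_bits = m * (n1 + n2) * nt   # SC bits carried in all intervals
    sm_bits = n1 * m_spatial       # SM bits carried only in the SM intervals
    return sc_bits + sm_bits

# Illustrative values: m=2 SC bits per channel symbol, nt=4 channels,
# mspatial=4 SM bits per SM interval, N1=16 SM and N2=48 non-SM intervals.
K = frame_length(m=2, n1=16, n2=48, nt=4, m_spatial=4)
print(K)   # 512 SC bits + 64 SM bits
```

In this illustration one quarter of the intervals are SM intervals, so the SM bits add 64 coded bits to the 512 SC bits of the frame.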
As noted above, optimal decoding can find the decoded sequence even if all intervals employ SM bits. In such embodiments, all C-MIMO-SM symbol intervals can be SM intervals, so that N2=0. Hence, if a practical implementation of an optimal decoder becomes feasible for large frames in the future (which is foreseen to be possible either with future modifications to SISO iterative decoding or by other methods), SM bits can be employed during all intervals making all intervals SM intervals. This allows the transmission of the highest possible number of SM bits (mspatial bits) per interval rather than an average of rmspatial SM bits per interval, where r=N1/(N1+N2) is the fraction of SM intervals among all intervals. In the rest of the embodiments described herein, it is assumed that SISO decoding is the best/most efficient decoding technology available and thus the following embodiments are discussed assuming SISO decoding is in use. While it is to be understood that other embodiments that make use of non-SISO decoding can also be implemented in accordance with the present invention, EXIT (Extrinsic Information Transfer) chart analysis that starts with mutual information and then uses extrinsic information in the actual EXIT chart is directed more specifically for design and analysis of embodiments that use SISO decoding. Such EXIT chart analysis, as discussed below, serves as a mathematical basis for C-MIMO-SM and related embodiments (e.g., C-IC-SM, C-OFDM-SM and C-TDM-SM) of the present invention that rely on SISO decoding.
Preferably, C-MIMO-SM embodiments are designed to use the highest number of SM intervals that are possible subject to the specified constraint (interleaver constraint) that all of the SM intervals will collectively include at most (d−1) coded bits of each weight d valid coded sequence as listed in the Table P(d≧dt) as discussed above. This interleaver constraint assumes that the coded bits in r are encoded in accordance with a selected outer code. The outer code can be any of the above mentioned codes for which a Table P(d≧dt) can be constructed, or any other code for which a Table P(d≧dt) can be constructed. For example, if the outer code is a (n,k) OBC with MHD d0, then the maximum number of SM intervals can be chosen by ensuring that no more than (d0−1) coded bits from any weight d0 codeword of the OBC will be placed in those selected SM intervals. The above choice ensures that the bit-wise mutual information is distributed so that the remaining non-SM intervals can provide useful extrinsic information from every codeword of the OBC in the first iteration thereby ensuring that the SISO iterations 2125, 2200, 2440, 2530, 2535 will converge towards the correct solution with high probability under the conditions described above.
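By way of non-limiting illustration, a check of the interleaver constraint described above can be sketched as follows. Each low weight sequence is represented simply as the set of coded-bit positions at which it is nonzero, so that its weight d is the size of the set. The toy list of sequences stands in for the Table P(d≧dt) and is hypothetical, as are the function name and the position values.

```python
def satisfies_sm_constraint(low_weight_sequences, sm_positions):
    """True if the SM intervals collectively hold at most (d-1) bits of every weight-d sequence."""
    sm = set(sm_positions)
    for sequence in low_weight_sequences:
        d = len(sequence)                      # weight of this sequence
        bits_in_sm = len(sm & set(sequence))   # its bits that fall in SM intervals
        if bits_in_sm > d - 1:
            return False
    return True

# Toy stand-in for the table of low weight sequences (positions are illustrative).
table = [{0, 3, 5, 6}, {1, 2, 4, 7}]

print(satisfies_sm_constraint(table, {0, 3, 5, 1}))   # at most 3 bits of each weight-4 sequence
print(satisfies_sm_constraint(table, {0, 3, 5, 6}))   # all 4 bits of the first sequence
```

A design procedure could grow the set of SM intervals one interval at a time and stop when this check would fail, although the search strategy itself is outside the scope of this sketch.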
The concept of mutual information is known to those of ordinary skill in the art of channel coding, and optimal decoding, and/or SISO decoding. As is known in the art, EXIT charts are used to analyze and predict how mutual information needs to be distributed in order for SISO decoders to converge to a correct solution with high probability under the above-described conditions. The EXIT chart includes two curves, one that relates to extrinsic information evolution of the outer code and another that relates to extrinsic information evolution of the inner code (e.g., the bit metrics related to the constellation mapper). EXIT chart based analysis suggests that mutual information related to these two component codes be distributed so that the bit-wise extrinsic information will evolve as SISO iterations proceed to provide positive information thus to guarantee convergence. Stated another way, the two curves on the EXIT chart should not cross over, but should be separated by a tunnel with one curve always above the other. For example, see Stephan ten Brink, “Designing iterative decoding schemes with extrinsic information transfer chart,” AEU Int. J. Electronic Communication, Vol. 54, No. 6, pp. 389-398, 2000 (“the Brink reference”). In the Brink reference, see
An alternative embodiment of
1. In the first iteration of SISO decoding, starting with the inner code soft decoder, calculate bit metrics only during the non-SM intervals and then pass the calculated bit metrics to the outer code soft decoder used in the SISO decoder 2125, 2440 through the CICM de-interleaver 2120, 2450. Recall that the outer code can be a block code, a convolutional code, a CTBC code, or in general any code for which a Table P(d≧dt) can be constructed. The outer code decoder in the SISO decoder 2125, 2440 soft decodes based on the received bit metrics and provides extrinsic information of all bits including those of the SM bits. This extrinsic information is passed back via the CICM interleaver 2130, 2445 to the inner code decoder for the second iteration. The SISO decoders 2125, 2440 are typically arranged similarly to any of
2. During the second through the last iteration, the SISO decoder will update the bit metrics during all intervals including SM intervals. During all iterations, the outer code portion calculates the extrinsic information associated with both the SC and SM bits in the normal manner. If the outer code is a block code, this outer code portion of the SISO iteration will be as described above in accordance with CTBC codes. If the outer code is a convolutional code, a BCJR decoder can be used, or if the outer code is an LDPC code, a sum product algorithm (SPA) type decoder can be used, etc. However, the processing used during the inner code portion of each SISO iteration involves an additional step to decode the information in the SM bits during SM intervals, i.e., the permutation reordering within the SM intervals. During the inner code decoding portion of the SISO iteration, only extrinsic information associated with the SC bits is updated. During the non-SM intervals, the bit metrics updates of SC bits are done in the normal way using the received signal and the extrinsic information provided by the outer code. However, during SM intervals, the bit metrics updates of SC bits are more involved and require the additional step. This is because the SM bits associated with each SM interval also provide additional extrinsic information in addition to the extrinsic information of the SC bits as normally provided by the outer code soft decoder. This additional extrinsic information provided by the SM bits can be calculated similarly to the decoding of block codes. That is, a “soft permutation decoder” similar to a block code decoder is used, but the soft permutation decoder views all combinations of the mspatial m number of SM bits which correspond to valid permutations similarly to valid codewords of a block code. Note that when sub-groups (as described below) are used, the number of valid permutations decreases in accordance with the selection of the sub-groups.
As in the previous example, if nt=4, it is possible to transmit mspatial=└log2(nt!)┘=4 SM bits during each SM interval. Hence, during any SM interval, there are 16 possible ways that the four subsets of m=2 bits each can be mapped to the nt=4 spatial transmission channels. Each such arrangement corresponds to a particular 4-bit combination of SM bits that are used during the corresponding C-MIMO-SM symbol interval. The extrinsic information of any SC bit provided by the SM bits during any SM interval can be calculated similarly to the soft decoding of a block code by (a) calculating a metric for each of the 2m
Referring now to
As seen from the previous section, as nt increases, the number of SM bits that can be sent during SM intervals, └log2(nt!)┘, also increases. However, as the number of possible combinations of SM bits (nt!) increases, the decoding complexity can increase significantly during SM intervals. For example, if nt=16, since 2^45 > 16! > 2^44, a maximum of mspatial=44 SM bits can be transmitted during each SM interval. In this example, during SISO decoding, the 44 SM bits associated with all nt=16 corresponding columns of Γ1 will be decoded similarly to bits of a block code. In such situations with large mspatial values, it may be desirable to optionally simplify the decoding during SM intervals by using a Pyndiah type decoder in the soft permutation decoder portion 2245 of the SISO decoder. Pyndiah type decoders eliminate unlikely codewords from the SISO decoding search process to reduce decoding complexity when the number of possible codewords is large. Therefore, Pyndiah type decoders are usually applied to decode large block codes. In certain embodiments of the present invention, a soft-permutation-Pyndiah type decoder can be applied when the number of possible permutations to be decoded is large. A Pyndiah-type soft permutation decoder eliminates unlikely permutations to reduce the search complexity of soft permutation decoding.
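The values of mspatial quoted in this description can be checked with a short sketch. The function name `m_spatial` is illustrative only. Exact integer arithmetic is used so that no floating-point rounding occurs near a power of two.

```python
from math import factorial


def m_spatial(nt: int) -> int:
    """Return floor(log2(nt!)), the maximum number of SM bits per SM interval."""
    if nt < 1:
        raise ValueError("nt must be a positive integer")
    # For a positive integer x, x.bit_length() - 1 equals floor(log2(x)).
    return factorial(nt).bit_length() - 1


if __name__ == "__main__":
    for nt in (4, 8, 16, 64):
        print(nt, m_spatial(nt))
```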
To better understand how a preferred embodiment of a “Pyndiah type soft permutation decoder” as defined herein would operate, consider the above example with nt=16 and mspatial=44. Initially the inner code soft decoder associated with the constellation mapper would compute the initial set of bit metrics and associated extrinsic information related to the SC bits in the non-SM intervals. Next the outer code soft decoder would provide extrinsic information of both the SC bits and the SM bits in accordance with soft decoding of the outer code. This way, the outer code soft decoder portion of the SISO decoder would update the extrinsic information of all mspatial=44 SM bits. As discussed in connection with the alternative embodiment of
While the above approach represents a preferred C-MIMO-SM embodiment, C-MIMO-SM embodiments that employ various types of modifications or simplifications can also be constructed. One type of alternative embodiment involves forming sub-groups of spatial transmission channels for use in simplified SM bit mapping and decoding. For example, in an embodiment where nt=16, four sub-groups of spatial transmission channels (e.g., each with four antennas or four channel signature filters) can be formed and each one referred to as a level-1 sub-group. Each of these level-1 sub-groups can transmit four SM bits per symbol interval. In addition, the four level-1 sub-groups can form one level-2 sub-group, thereby transmitting an additional four SM bits. In such embodiments, where nt=16, using two levels makes it possible to transmit twenty SM bits in total. Even though this embodiment lowers the total number of SM bits, the grouping of SM bits in this way makes the decoding much simpler, as each individual level is much smaller and each level-1 and level-2 sub-group has only sixteen possible combinations. Similarly, if nt=64, sixteen level-1 sub-groups can be formed, four level-2 sub-groups can be formed, and one level-3 sub-group can be formed, so that it is possible to transmit (16×4)+(4×4)+4=84 SM bits during SM intervals while keeping the decoding complexity at a much lower level. If each level-1 sub-group uses nt1 columns of Γ1, the highest number of spatial bits transmitted by that level-1 sub-group is └log2(nt1!)┘. Similarly, if a level-k(>1) sub-group is formed by ntk number of level-(k−1) sub-groups, then the highest number of spatial bits transmitted by that level-k sub-group is └log2(ntk!)┘. However, as the number of levels increases, the decoding of SM bits gets more complicated because the decoding of any level requires consideration of all of its higher levels.
Even though decoding of levels can be done separately, every lower level needs to consider all (or the most significant) decoding results of the higher levels. Hence, to lower the decoding complexity, only a limited number of all possible levels can be used, at the cost of lowering the number of SM bits. The decoding complexity is the lowest when only level-1 is used. When only level-1 is used, if SM1 number of bits are transmitted through each level-1 sub-group involving nt1 columns of Γ1, and if L1 number of such level-1 sub-groups are formed, one for each group of nt1 columns, so that nt=L1·nt1, then all of the SM1·L1 number of SM bits of each C-MIMO-SM symbol can be divided into L1 sub-groups. Each of these L1 sub-groups can then be soft decoded completely independently in each iteration. If sub-group simplifications are applied, the number of SM bits will decrease accordingly. In such a case, soft permutation decoding can be done within individual sub-groups. If mspatial is small enough, all valid permutations can be considered in the soft permutation decoder, but if mspatial is large and the sub-group size is still large enough, then the above described Pyndiah type soft permutation decoder can be used by only considering the most significant permutations at the sub-group level.
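The sub-group bit counts above can be reproduced with a small sketch. It assumes that a sub-group formed from n units (channels at level 1, lower-level sub-groups at higher levels) carries floor(log2(n!)) SM bits, which matches the four bits per four-channel sub-group quoted above. The function name and the `sizes` argument are illustrative.

```python
from math import factorial
from typing import List


def subgroup_sm_bits(nt: int, sizes: List[int]) -> int:
    """Total SM bits per SM interval for a multi-level sub-group layout.

    sizes[k] is the number of units merged into each level-(k+1) sub-group.
    Assumes a sub-group of n units carries floor(log2(n!)) bits.
    """
    units, total = nt, 0
    for size in sizes:
        if size < 1 or units % size != 0:
            raise ValueError("each level must split the units evenly")
        groups = units // size
        total += groups * (factorial(size).bit_length() - 1)
        units = groups  # the sub-groups become the units of the next level
    return total
```

For example, `subgroup_sm_bits(16, [4, 4])` corresponds to the two-level nt=16 layout and `subgroup_sm_bits(64, [4, 4, 4])` to the three-level nt=64 layout.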
Assuming that nt number of independent channels are in use, up to mspatial=└log2(nt!)┘ number of SM bits can be sent during each C-MIMO-SM symbol interval. Again consider the example where nt=16 so that mspatial=44. Note that in this example, this implies that each group of nt=16 columns of Γ1 can be permuted in 2^44 different ways each SM interval. When nt! is large, it will be important for the encoder to implement an efficient “variable deterministic interleaver” that will interleave each of the nt columns of Γ1 in each SM interval in one of 2m
In another type of alternative embodiment, instead of constructing a C-MIMO-SM system that transmits SM bits only during a selected set of SM intervals, SM bits are sent in all C-MIMO-SM symbol intervals. However, in such embodiments the SM bits are restricted to choose from among only a selected “SM group” of independent channels and the number of channels in the SM group, denoted nSMG, is typically less than nt, unless optimal decoding (or, e.g. signature-stamping as described below) is used. For example, when nt=4, if necessary, the first and second subsets of m coded bits can always be transmitted to channels 1 and 2 respectively, and then the third and fourth subsets of m coded bits can be sent to a selected one of the 3rd or the 4th channel, depending on a single SM bit. In this example, channels 3 and 4 correspond to the SM group, nt=4 and nSMG=2. In this type of embodiment, the SM bits are implicitly sent by permuting the order of the SC bits transmitted in all intervals, but only on the channels in the SM group. If desired, this alternative SM group based SM bit mapping policy can be used during all C-MIMO-SM symbol intervals, or SM bits can be transmitted to the SM group only during SM intervals. The subset of channels that define the SM group used during each SM interval can optionally vary from one SM interval to the other. Usually, to maximize the number of SM bits, if nt is small compared with the number of intervals (as is usually the case in multiple-antenna MIMO embodiments), it is better to use the SM interval/non-SM interval type of embodiment described above. However, if nt is large, then the SM group type of embodiment or the hybrid SM interval/SM group type of embodiment can lead to a more efficient SISO decoder implementation.
Next consider the issue of designing the CICM interleaver as used, for example, in blocks 2130, 2410 and 2445. As in SM, the CICM interleaver for C-MIMO-SM embodiments can be designed by designing Γ1 for the signal constellation and Γ2 for the spatial constellation. Consider the transmission of a frame that consists of (mC+mspatialrC) coded bits, where C represents the total number of columns of Γ1. Similarly, Γ2 is formed by mspatial rows and rC columns to carry the SM bits. Preferably the values of r and/or mspatial are chosen to be the highest possible values that can satisfy the interleaver constraint that no more than (d−1) coded bits of any valid coded sequence of weight d can fall into Γ2. As in SM, both Γ1 and Γ2 are preferably designed to satisfy the ds and Dmin2 interleaver constraints as well. After constructing Γ1 and Γ2, combine them by joining columns of Γ2 with columns of Γ1 to form the final Γ as discussed in this patent application above. Note that nt columns of Γ1 typically need to be combined with a column of Γ2 to form bits in a single SM interval. As discussed in connection with CICM interleaver design above, Γ1 and Γ2 can be merged to form the matrix Γ and to ensure that the ds and Dmin2 constraints are met. In order to maintain the same number of columns in Γ1 and Γ2 and thus properly form the matrix Γ as defined hereinabove, the merged Γ can be formed by placing the matrix Γ1 over a zero-padded version of Γ2 whose columns are all initialized to zero, after which every ntth column (and only those corresponding to SM intervals) is set to the corresponding column of the non-zero-padded version of Γ2. At run time only the non-zero-padded columns of Γ2 would be used with a corresponding set of nt columns of Γ1. In practical embodiments, Γ would be formed using a data structure that simply holds Γ1 and the non-zero-padded columns of Γ2 to the same effect.
When checking for ds of the coded bits placed in a non-zero column of Γ2, all of the nt columns of Γ1 that combine with that column of Γ2 will need to be considered as one single symbol interval. Hence, when a column of Γ2 is merged with nt columns of Γ1, it is not desirable to have coded bits of any given low weight sequence of the outer code in both that column of Γ2 and any of those nt columns of Γ1, as this would lower the symbol Hamming distance ds. Also, it is understood that the columns of Γ2 can be set to zero or eliminated altogether during all non-SM intervals, and coded bits from the outer code are only mapped to the columns of Γ2 that correspond to SM intervals. Embodiments that map SM bits to only an SM group of spatial transmission channels will add similar types of interleaver constraints to Γ2.
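The zero-padded merge of Γ1 and Γ2 described above can be sketched as follows. The sketch assumes matrices stored as lists of rows, that symbol interval k covers columns k·nt through k·nt+nt−1 of Γ1, and that the SM column for an interval is placed at the first column of its group. The placement choice and all names are illustrative assumptions.

```python
from typing import List

Matrix = List[List[int]]  # a matrix stored as a list of rows


def merge_gamma(gamma1: Matrix, gamma2: Matrix, nt: int,
                sm_intervals: List[int]) -> Matrix:
    """Stack gamma1 over a zero-padded copy of gamma2.

    gamma1: m rows by C columns of SC bits.
    gamma2: m_spatial rows by len(sm_intervals) columns of SM bits.
    sm_intervals: indices of the symbol intervals that are SM intervals.
    """
    n_cols = len(gamma1[0])
    if n_cols % nt != 0:
        raise ValueError("gamma1 columns must form whole groups of nt")
    if any(len(row) != len(sm_intervals) for row in gamma2):
        raise ValueError("gamma2 needs one column per SM interval")
    padded = [[0] * n_cols for _ in gamma2]
    for j, k in enumerate(sm_intervals):
        col = k * nt  # assumption: SM column sits at the start of the group
        if not 0 <= col < n_cols:
            raise ValueError("SM interval index out of range")
        for row_idx, row in enumerate(gamma2):
            padded[row_idx][col] = row[j]
    return [list(row) for row in gamma1] + padded
```

A practical implementation could hold Γ1 and the non-zero columns of Γ2 separately instead of materializing the padded rows.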
In C-MIMO-SM type embodiments, it is also noted that the CICM interleaver Γ can be specially designed to select the number of SM bits that can be transmitted based on a given signal to noise ratio or a signal to noise plus channel modeling error ratio or similar measure of channel quality/channel distortion. That is, a set containing different CICM interleavers, {Γi}, can be designed, where in this context, the subscript i denotes an index to a particular set of channel conditions, such as signal to noise ratio or signal to noise plus channel modeling error ratio or similar measure. Each different CICM interleaver in the set {Γi} would be able to carry a respective number of SM bits. For example, CICM interleavers designed assuming a higher value of signal to noise plus channel modeling error ratio will typically be able to carry more SM bits than the CICM interleavers designed assuming a lower such value. At run time, the signal to noise plus channel modeling error ratio can be estimated, and the appropriate CICM interleaver, Γi, from the set {Γi} can be selected for use in accordance with the estimation of the current channel conditions. If the channel conditions change slowly over time, the channel conditions can be determined at start-up time and can also optionally be monitored/tracked, and the appropriate CICM interleaver from the set {Γi} can be adaptively selected in accordance with the current channel conditions. This provides adaptive C-MIMO-SM embodiments that can select a CICM interleaver, for example, at start-up time of a modem or during a transceiver handshake sequence used for parameter exchange and negotiation at start-up/channel training time. Also, the CICM interleaver in use can optionally be changed, and thus the number of SM bits transmitted can be changed, as a function of time-varying channel conditions.
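The run-time selection of an interleaver from the set {Γi} can be sketched as a threshold lookup. The record layout, the field names, and the fallback to the most conservative design are assumptions made for illustration. This description does not prescribe a particular selection rule.

```python
from typing import NamedTuple, Sequence


class InterleaverEntry(NamedTuple):
    min_quality_db: float  # lowest channel-quality estimate the design assumes
    name: str              # label of the pre-designed interleaver
    sm_bits: int           # SM bits carried per frame by this design


def select_interleaver(quality_db: float,
                       table: Sequence[InterleaverEntry]) -> InterleaverEntry:
    """Pick the design with the highest threshold not exceeding the estimate."""
    if not table:
        raise ValueError("need at least one pre-designed interleaver")
    ordered = sorted(table, key=lambda e: e.min_quality_db)
    chosen = ordered[0]  # fall back to the most conservative design
    for entry in ordered:
        if quality_db >= entry.min_quality_db:
            chosen = entry
    return chosen
```

Calling the function again whenever a new channel-quality estimate becomes available gives the adaptive behavior described above.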
In all C-MIMO-SM (and C-IC-SM and C-OFDM-SM and C-TDM-SM) embodiments as discussed herein, the adaptive design techniques as described in this paragraph can optionally be employed. Also, as discussed below, in embodiments that do not use a CICM interleaver but instead directly use EXIT charts to determine the number of SM intervals and to design the (non-CICM) interleaver constraints, the adaptive design techniques as described in this paragraph can also be optionally employed to select a pre-designed interleaver, Γi.
In accordance with
As discussed above, certain embodiments, especially wireless embodiments of
In still other embodiments, especially wireless embodiments,
Using the above analysis, a class of C-MIMO-SM embodiments can be constructed using multichannel systems. Multichannel systems may be viewed as sending a high data rate bit stream over a set of multiple lower data rate ICs (“independent channels”). Therefore, a class of embodiments called C-IC-SM embodiments can be constructed where the channel matrix, H={hij}, has orthogonal or near-orthogonal channel paths, i.e., the off-diagonal elements (hij, i≠j) are equal to or are nearly equal to zero. Multichannel systems include FDM (frequency division multiplexing), WDM (wavelength division multiplexing), CDM (code division multiplexing as used in CDMA systems), TDM (time division multiplexing), and SDM (space division multiplexing as used in multi-antenna or multi-free-space laser or multiple fiber optic laser communication systems). In general, any multichannel communication system that multiplexes one or more bit streams onto a plurality of independent channels to be transmitted in parallel to a set of one or more parallel receivers can be used as a basis for constructing C-IC-SM embodiments. The independent channels can be separated in space, frequency (or wavelength), polarization, time or any other means possible, such as separate CDMA code channels in CDMA systems. The independent channels in the channel matrix typically have little or no cross talk, but need not be 100% orthogonal. It is important that the channel matrix, H={hij}, is full rank, and in practical situations, each channel coefficient hij will be zero or nearly zero whenever i≠j.
For example, consider a 100G OTN type embodiment that uses WDM with four wavelengths and two polarizations per wavelength. This system provides a total of eight near-orthogonal channels, each of which can be considered to be an independent channel. In this example embodiment, there are eight lasers 2405, one per polarization at each of the four wavelengths. The outputs of these eight lasers 2405 are passed to a respective set of eight optical modulators 2410, one per polarization at each of the four wavelengths. Also no signature filters are needed, but only a combiner is used in block 2420, so that the optical channel thus carries the nt=8 near-orthogonal channel signals. In this C-IC-SM embodiment, each separate WDM wavelength (line frequency), and, if present, each separate WDM polarization, is referred to as a (near-orthogonal) WDM channel (independent channel). Each C-IC-SM symbol interval, the block 2410 is used to cause a group of nt subsets of m bits each to be fed from nt respective columns of Γ1 into the nt different WDM polarization channels based upon a set of SM bits taken from Γ2. In this example, nt is the number of independent channels. The independent channels in this example correspond to near orthogonal channels that use standard adaptive 2×2 polarization rotation orthogonalization techniques so that each WDM polarization acts as a (near orthogonal) independent channel. Note that in this type of embodiment the dotted lines from block 2410 to 2420 need not be used. Instead, the block 2410 performs the Γ1 column permutation and the block 2420 is just a standard optical channel interface for the eight-polarization channel WDM system previously described.
In the above-described example, when nt=8 and no sub-groups are used, mspatial=15. However, different embodiments could also be formed by selecting sub-groups of size two or four. One option is to define two level-1 sub-groups, each with four symbols per sub-group, and one level-2 sub-group formed using the two level-1 sub-groups. Each level-1 sub-group can transmit four SM bits while the level-2 sub-group can transmit one additional bit during each SM interval, for a total of nine SM bits. Optionally, eight SM bits could be transmitted during each C-IC-SM symbol interval using only the level-1 sub-groups to further simplify the decoding. Everything discussed in connection with determining Γ1 and Γ2 and the number and positions of SM intervals in connection with C-MIMO-SM also applies to C-IC-SM. While this example was provided in connection with WDM, both wireless and wireline C-IC-SM embodiments that do not use WDM can be constructed. Note that horizontal and vertical polarizations can optionally be used with a number of types of antennas at microwave and other frequency ranges, for example. All such embodiments, with or without separate polarizations, and using optical or non-optical frequencies, are contemplated. Any coded sequence can be split up and sent over nt number of independent (or nearly orthogonal) channels, and the C-IC-SM technique described herein can be used to send additional bits from Γ2 by determining SM intervals and sending SM bits implicitly during SM intervals to selectively permute SC bits during the SM intervals. Additionally, C-IC-SM embodiments could be constructed that transmit SM bits during all C-IC-SM symbol intervals, but only to a specified SM group of the independent channels. All variations and hybrids of the sub-group and SM group approaches as discussed above can be readily applied to C-IC-SM (and to C-OFDM-SM and C-TDM-SM) embodiments.
C-MIMO-SM/C-IC-SM concepts can be applied to increase the data rate in OFDM embodiments as well. In such systems, the “independent channels” can be viewed as individual OFDM tones in each group of nt tones (channel numbers correspond to the congruence classes of OFDM tone numbers modulo nt). In
Another key difference in the OFDM type embodiments 2500 and 2600 is that the different tones used in OFDM generally correspond to the different independent channel symbol intervals in the C-MIMO-SM embodiments as described in connection with the systems 2100, 2200, and 2400. That is, in C-OFDM-SM embodiments of the systems 2500 and 2600, every m-bit column of the interleaver Γ1 corresponds to a 2^m-ary constellation point to be mapped onto a corresponding tone. Selected groups of nt tones can be identified as “SM tone groups.” SM tone groups can be viewed as a special case of SM intervals for use in OFDM applications. The value of nt can be judiciously selected by the designer in embodiments which use a single physical channel (such as an antenna/spatial transmission channel or optical channel), and additional SM bits from Γ2 can be transmitted by directing a group of nt different m-bit subsets of SC bits (i.e., corresponding columns of Γ1) onto a permuted set of SM tones (a permuted-order version of the same SM tone group). The CICM interleaver design and the formation of different SM bit groups discussed above with C-MIMO-SM as described in connection with the systems 2100, 2200, and 2400 can be directly applied to OFDM by simply treating SM intervals as SM tone groups, and non-SM intervals as non-SM tone groups. That is, the same basic design approach and mathematics apply, but the C-MIMO-SM concepts are applied to increase the data rate over and above that which can be achieved by pure OFDM systems.
C-OFDM-SM embodiments as illustrated in the OFDM system 2600 or the MIMO embodiment of the OFDM system 2500 preferably use a coupling similar to the connection shown using the dotted lines in
To better understand C-OFDM-SM embodiments, consider
Hence in the simplified example where the number of columns of Γ is equal to N, the C-OFDM-SM design problem is to determine an appropriate matrix, Γ2, that defines a permutation of the columns of Γ1. Note that with optimal decoding, at most nt=N blocks of m bits (N columns of Γ1) can be ordered in (N!) different ways. Therefore, in C-OFDM-SM embodiments, at most mspatial=└log2 (N!)┘ number of SM bits can be sent during each C-OFDM-SM symbol interval. This maximum value for mspatial can be achieved assuming optimal decoding but such decoding will be complex.
Next assume that SISO decoding will be used to decode the C-OFDM-SM signal. To create more practical embodiments, nt will typically be chosen to be significantly smaller than N. However, mspatial=└log2 (nt!)┘ can still be selected to be large enough that it is often desirable to form sub-groups as discussed above in connection with C-MIMO-SM embodiments. This is because in C-OFDM-SM embodiments the channel numbers of the nt independent channels correspond to congruence classes of individual OFDM tone numbers modulo nt, so that the size of the SM and non-SM tone groups can be selected arbitrarily without affecting any specific channel filter or antenna based hardware block. However, the value of nt will affect decoding complexity. To reduce decoding complexity, similar to the above discussion in connection with C-MIMO-SM embodiments, level-1 sub-groups can be formed consisting of a set of, for example, four tones each. Level-2 sub-groups can be formed as permutations of level-1 sub-groups, level-3 sub-groups can be formed as permutations of level-2 sub-groups, and so on. Sub-groups can be used to lower the decoding complexity for any selected value of nt. Note that the SM tone groups need not be made up of consecutive tones. If each level-1 sub-group uses nt1 columns of Γ1, the highest number of spatial bits transmitted by that level-1 sub-group is └log2 (nt1!)┘. Similarly, if a level-k(>1) sub-group is formed by ntk number of level-(k−1) sub-groups, then the highest number of spatial bits transmitted by that level-k sub-group is └log2 (ntk!)┘. As before, larger values of nt or larger sub-groups can be used with a Pyndiah type soft permutation decoder. C-OFDM-SM can thus be viewed as sending additional SM bits associated with the SM tone groups. The SM bits become decoded once the permuted ordering of the decoded columns of Γ1 of each respective SM tone group becomes known via SISO decoding as described above in connection with the C-MIMO-SM decoder.
Note that the set of SM intervals can best be determined by selecting the highest number of SM bits and the highest number of SM intervals or SM tones that can maintain the currently highest possible ds and Dmin2 values in the CICM design process. Hence, in a given situation the number of SM bits and the number of SM intervals/tone groups can be selected based on the outer code and other parameters of the application, which include the frame size, selected signal constellation, SNR, etc. Also note that in the C-OFDM-SM construction, Γ1 and Γ2 can be constructed identically to how they were constructed in the previously described C-MIMO-SM embodiments. For example, suppose that N=512. Then the designer could arbitrarily define nt=16, so that mspatial=44. SM tone groups could then be formed, each using nt=16 columns of Γ1 plus a corresponding mspatial=44 SM bit column of Γ2. The rest of the CICM interleaver design is performed the same way Γ is constructed in SM-only embodiments as described above. The SM bits are chosen based on the Table P(d≧dt) to maintain the highest ds and Dmin2. Assuming the columns hij to be orthogonal (or sufficiently so), then similarly to SM-only embodiments, during CICM interleaver design, the SM bits held in Γ2 will each provide an SED contribution of approximately Dmin2=E=2a^2. The same steps used to design Γ as described in connection with C-MIMO-SM can thus be used to design Γ for C-OFDM-SM (and also for C-IC-SM and C-TDM-SM).
The C-OFDM-SM design process can thus be viewed as 1) identifying the SM bits and the SM tone groups to be used for a given frame size and channel quality measure, and 2) defining a restriction on the type of permuted orderings the constellation points in the SM tone groups can have (e.g., employing different levels of sub-groups as described in connection with the C-MIMO-SM embodiments above). Note that the first step can be performed using the above approach provided for designing C-MIMO-SM embodiments, but now substituting SM tone groups for SM intervals. The second step is used to further reduce complexity for any chosen value of nt so as to be able to match SISO decoding complexity to the available hardware resources. At run time, the C-OFDM-SM spatial mapper 2505 will simply need to permute the columns of Γ1 that are in SM tone groups and to feed them into different tones depending on the SM bits. Block 2510 will perform a simple pass-through operation and thus the summing junction will also not be needed. Then the IFFT will be applied in the block 2515 to form a time-domain OFDM symbol.
At the receiver, block 2520 will convert the received time-domain and noise-corrupted OFDM symbol back into the frequency domain, and the iterations in blocks 2530 and 2535 will apply SISO decoding as described above to jointly decode both the coded bits of Γ1 and the bits of Γ2 that are responsible for determining the permutation ordering(s) applied to the SM tone group(s). Blocks 2525 and 2620 can be eliminated or viewed as merely performing a pass-through/no-operation. The resolved valid permutation ordering of the SM tone groups will correspond to a unique set of SM bits. That is, the SM bits are carried in the valid permutation ordering applied to the SC bits (columns of Γ1 in SM tone groups), and thus a SISO decoder as previously discussed that uses a soft permutation decoder can determine the SM bits that determine the permutation ordering of the columns of Γ1 in the SM tone groups while also jointly decoding the SC bits using SISO iterations. The convergence of this SISO decoding operation 2530, 2535 will be ensured to high probability using the same type of Table P(d≧dt) analysis, SNR, and specified interleaver constraints as discussed above in connection with the C-MIMO-SM embodiments.
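The correspondence between tone numbers and independent-channel numbers can be sketched as below. The sketch groups consecutive tones purely for simplicity, although SM tone groups need not be consecutive. The function names are illustrative.

```python
from typing import List


def channel_number(tone: int, nt: int) -> int:
    """Independent-channel number of a tone: its congruence class modulo nt."""
    return tone % nt


def consecutive_tone_groups(n_tones: int, nt: int) -> List[List[int]]:
    """Partition tones 0..n_tones-1 into groups of nt consecutive tones."""
    if nt < 1 or n_tones < nt or n_tones % nt != 0:
        raise ValueError("n_tones must be a positive multiple of nt")
    return [list(range(start, start + nt))
            for start in range(0, n_tones, nt)]
```

With N=512 tones and nt=16, this yields 32 tone groups, and every group contains exactly one tone from each of the 16 congruence classes.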
As can be seen from the above discussion, C-OFDM-SM can be viewed as simply using an additional set of SM bits to define one or more valid permutations of OFDM tones that are associated with respective SM tone sub-groups. The SISO decoder will then jointly decode the bits in Γ1 while also decoding bits of Γ2 that determine the permutation ordering applied to the columns of Γ1 that are in SM tone groups. By determining the applied permutation ordering, the SM bits can be resolved by assigning a unique set of SM bit values to each valid permutation ordering.
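One way to assign a unique set of SM bit values to each valid permutation ordering is the factorial number system (Lehmer code), sketched below. This particular mapping is an assumption chosen for illustration and is not specified in this description. With mspatial=└log2(nt!)┘ bits, only 2^mspatial of the nt! permutations are valid, and the inverse function rejects the rest.

```python
from math import factorial
from typing import List


def bits_to_permutation(bits: List[int], nt: int) -> List[int]:
    """Map SM bits (most significant first) to a permutation of range(nt)."""
    value = 0
    for b in bits:
        value = (value << 1) | (b & 1)
    if value >= factorial(nt):
        raise ValueError("bit pattern does not index a permutation")
    pool = list(range(nt))
    perm = []
    for i in range(nt, 0, -1):
        # Peel off one factorial-base digit and pick that remaining element.
        digit, value = divmod(value, factorial(i - 1))
        perm.append(pool.pop(digit))
    return perm


def permutation_to_bits(perm: List[int], n_bits: int) -> List[int]:
    """Recover the SM bits from a resolved permutation ordering."""
    pool = sorted(perm)
    value = 0
    for i, p in enumerate(perm):
        digit = pool.index(p)
        pool.pop(digit)
        value += digit * factorial(len(perm) - 1 - i)
    if value >= 1 << n_bits:
        raise ValueError("permutation is not in the valid set")
    return [(value >> s) & 1 for s in range(n_bits - 1, -1, -1)]
```

For nt=4 and four SM bits, the sixteen bit patterns map to sixteen distinct orderings of the four columns, and the remaining eight orderings are never transmitted.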
In OFDM processing, the IFFT 2515 and FFT 2520 cancel each other. Therefore, it is further observed that the same C-OFDM-SM concepts can be applied directly in the time domain (TD), resulting in C-TDM-SM embodiments. C-TDM-SM embodiments are designed and operated at run time similarly to C-OFDM-SM embodiments, except the IFFT 2515 and FFT 2520 blocks are removed from the systems 2500, 2600. Note that the discussion of C-OFDM-SM above was provided in the context of the columns of Γ1, so that all of the above discussion applies directly to C-TDM-SM embodiments as well. In such systems, the “independent channels” can be viewed as individual time slots or independent channel symbol intervals in which the 2^m-ary independent channel symbols are transmitted. In these embodiments the channel numbers of the “independent channels” correspond to congruence classes of individual OFDM tone numbers or of time slot indices modulo nt. This is very similar to C-OFDM-SM, except that the tones onto which a respective symbol is modulated are replaced by time slots into which a respective symbol is modulated. In both cases, the “independent channels” (tones or time slots) can actually be used to send a single data stream (e.g., a sequence of QPSK or 16-QAM symbols) on a single physical channel. In reality C-OFDM-SM is more complex because of the presence of the IFFT 2515 and FFT 2520 blocks and the need to fit the columns of Γ1 into one or more matrices Q1(k) for C-OFDM-SM embodiments when the number of columns of Γ is not equal to N. C-TDM-SM embodiments perform the same above-described C-OFDM-SM permutations determined by Γ2, but these permutations are applied directly to the columns of Γ1, which are sent during time domain symbol intervals as opposed to being mapped to OFDM tones. Hence in C-TDM-SM embodiments, any of the above mentioned uses of SM tone groups can be replaced with corresponding SM intervals or SM interval groups.
Note that neither C-OFDM-SM nor C-TDM-SM embodiments by themselves require any successive interference cancellation (SIC). Also, neither C-OFDM-SM nor C-TDM-SM require any separate antennas or signature filters. Only one physical channel is used, unlike C-IC-SM embodiments that use a number of different channels in a multichannel type system.
C-IC-SM, C-OFDM-SM, and C-TDM-SM embodiments can be used to increase the data rate of an existing multichannel system or an existing single (physical) channel system using existing hardware. In such embodiments, no additional hardware is needed to support multiple antennas or multiple spatial signature filters. The only modification needed is additional or otherwise altered processing and memory requirements. That is, any existing system such as a modem or physical layer transceiver that uses OFDM or ordinary single channel TD transmission such as QPSK or QAM, can be modified at the coding layer using existing physical layer hardware to implement a corresponding C-OFDM-SM or C-TDM-SM embodiment as described above. Also the data rate of an existing multichannel system like the WDM system with four wavelengths with two polarizations can be increased by modifying the encoding/decoding subsystem to perform C-IC-SM based processing. Hence the present invention contemplates logic software upgrades or new design revisions that change the digital signal processing software or FPGA or ASIC programmable logic and possible the underlying programmable or custom hardware to change the coding layer functionality to improve an existing physical layer transceiver. Hardware changes at the coding layer may be needed due to higher processing requirements that require certain chips or subsystems to be upgraded to support the additional processing methods and apparatus of the present invention. This allows a previously designed physical layer transceiver to be improved to implement a chosen one of a C-IC-SM, C-OFDM-SM, or C-TDM-SM embodiment. For example, digital signal processing software or FPGA/ASIC logic programs (e.g., VHDL programs that includes logic and state machines descriptions) and/or other decoder layer hardware structures can be modified to perform any of the coding layer functions of any selected C-IC-SM, C-OFDM-SM, or C-TDM-SM embodiments of the present invention.
To better understand the possible achievable increases in data rate, consider an example. Suppose that, depending on the frame size and the SNR and other factors, that the CICM interleaver design and/or EXIT chart analysis determines that r, the fraction of SM intervals can be achieved in the range 0.25≦r≦0.5. Suppose that QPSK is used in each independent channel symbol interval (e.g., per symbol interval per spatial channel in C-MIMO-SM systems, per symbol interval independent channel in a C-IC-SM system, or per symbol per tone in C-OFDM-SM systems, or per symbol per time slot in C-TDM-SM systems). If nt=8 and no sub-groups are used, then mspatial=15, and assuming QPSK, there will be 16 SC bits sent per SM interval. For example, if r=⅓, then this example sends on average 16+15/3=21 coded bits during all SM and non-SM intervals, a speed up of 21/16, or 31.25%. Similarly, if nt=16 and no sub-groups are used, then mspatial=44. In this example, also assuming r=⅓, then this example sends on average 16+44/3=30.66 coded bits during all SM and non-SM intervals, a speed up of 30.66/16, or 91.66%. Similarly, if nt=64 (as would be practical in C-OFDM-SM and C-TDM-SM embodiments) and no sub-groups are used, then mspatial=295. In this example, assuming r= 1/20, then this example sends on average 16+295/20=30.75 coded bits during all SM and non-SM intervals. This represents a speed up of 30.75/16, or a data rate increase of 92%. The achievable value of r in an actual system will be determined by the SNR and the Shannon limit.
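The averaging used in this example can be written out as a short sketch. It takes the number of SC bits per interval, the number of SM bits per SM interval, and the fraction r as explicit inputs. The function names are illustrative, and the figures in the usage note are those of the nt=8 QPSK case above.

```python
def average_coded_bits(sc_bits: float, sm_bits: float, r: float) -> float:
    """Average coded bits per interval when a fraction r are SM intervals."""
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie between 0 and 1")
    # Every interval carries the SC bits; only SM intervals add the SM bits.
    return sc_bits + r * sm_bits


def rate_increase_percent(sc_bits: float, sm_bits: float, r: float) -> float:
    """Data rate increase relative to sending the SC bits alone."""
    return 100.0 * (average_coded_bits(sc_bits, sm_bits, r) / sc_bits - 1.0)
```

With 16 SC bits, 15 SM bits, and r=1/3, the average is 21 coded bits per interval, an increase of 31.25%.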
In all the C-MIMO-SM, C-IC-SM C-OFDM-SM, and C-TDM-SM embodiments discussed so far, the SM concept has been used to increase the data rate by implicitly transmitting additional SM bits by permuting the order of an associated group of SC bits. Also, all these embodiments assumed the use of an outer code which could be any code for which a Table P(d≧dt) can be constructed, for example a block code, a convolutional code, or a CTBC code. Also, all C-MIMO-SM, C-IC-SM, C-OFDM-SM, and C-TDM-SM embodiments discussed so far assumed that this outer encoded sequence would then be used with a properly designed CICM interleaver T. However, alternative C-MIMO-SM, C-IC-SM, C-OFDM-SM, and C-TDM-SM embodiments can also be constructed using any coding scheme (even a code for which the Table P(d≧dt) cannot be constructed, such as the majority of long LDPC codes) and can be used with any interleaver, or can even be used without an interleaver. In such embodiments, the highest number of SM intervals (or SM tone groups in OFDM) and the highest number of SM bits in one SM interval (or in one OFDM SM tone group) can be selected using the EXIT chart analysis for iterative decoding as discussed in the Brink reference and the EXIT chart literature in general as is known to those of ordinary skill in the art of channel encoding/decoding. As stated before, all intervals can be made to be SM intervals if optimal decoding can be used. In the above embodiments, the specified constraint ensured that no more than (d-1) coded bits of a coded sequence of weight d can be transmitted during the SM intervals. This specified constraint ensures that an acceptable EXIT chart will result so that SISO decoding convergence can be assured to within a high probability under the conditions discussed above. 
However, it is not always required to satisfy the (d−1) coded bit condition for all sequences of weight d because in general the EXIT chart is developed using the extrinsic information calculated among all coded bits, and the EXIT chart can be computed starting with the mutual information and using the extrinsic information of the inner code (e.g., constellation mapper/bit metrics) and the outer code (e.g., a long LDPC code) even for codes where the Table P(d≧dt) cannot be constructed. That is, for some coded sequences the specified constraint involving the (d−1) condition could be relaxed while still resulting in an acceptable EXIT chart. Hence, the specified constraint can alternatively involve the particular way in which the SM intervals are chosen for a coded scheme that uses any arbitrary interleaving by using the EXIT chart analysis and finding the highest number of SM bit positions and/or SM intervals possible that ensures that the non-crossover condition of FIG. 5 of the Brink reference is met. The specified constraint can also involve selecting the subsets of bits to be used as SC bits and SM bits within each SM interval. Further, the modulator can employ any constellation coding policy besides reverse Gray coding. As discussed earlier, if SIC is used, the performance with (or without) SM bits can be improved by running more iterations using the best known current results using either an IHID interference canceller or an interference canceller that uses soft interference cancellation.
In light of the discussion above, alternative embodiments of the systems 2100, 2200, 2400, 2500, and 2600 can be constructed even when the outer code is, for example, a long LDPC code or other code for which the Table P(d≧dt) cannot be constructed. In these alternative embodiments, the C-MIMO-SM, C-IC-SM, C-OFDM-SM, and C-TDM-SM concepts described herein are used, but the constellation and spatial mappers 2105, 2410, 2505, 2605 and the CICM interleavers/deinterleavers 2120, 2130, 2445, 2450 do not use CICM. In such embodiments, instead of the specified constraint using CICM concepts using the Table P(d≧dt), the specified constraint is instead based upon the positive extrinsic information conditions as per the EXIT chart analysis of FIG. 5 of the Brink reference. The specified constraint thus involves applying EXIT chart analysis directly to determine the interleaver constraints used to govern what bits are mapped as SM bits and SC bits into SM intervals. For example, consider a coding scheme like a long LDPC code for which the Table P(d≧dt) cannot be constructed. In such situations, to determine the specified constraint, SM bits, SM intervals, and SC bits grouped with SM bits during SM intervals can be randomly chosen. The highest number of SM bits can be found by directed searching. For example, a computer program can start with a low number of SM bits and SM intervals and this low number can then be verified to ensure convergence according to EXIT charts. Once this initial convergence is verified, the number of SM bits and/or SM intervals can be gradually increased until the EXIT charts indicate that the number of SM bits and/or SM intervals cannot be increased further without the cross-over condition being reached. Alternatively, the computer program could be configured to start with a higher number of SM bits and SM intervals that would result in crossing curves on the EXIT charts indicating that convergence is not possible.
Next the computer program would reduce the number of SM bits and/or SM intervals until convergence is indicated by EXIT chart analysis. In another type of alternative embodiment, a combination of the above two methods could be used where the number of SM bits can be varied in amounts that would change the convergence to non-convergence and this approach could be applied back and forth until the highest number of SM bits is found. In all of these embodiments and in similar embodiments, a mixture of randomization and EXIT chart testing could be used to identify the number of SM intervals, the number of SM bits, and the actual positions, that is, which intervals (or tone groups) should be SM intervals and which should be non-SM intervals. That is, interleaver constraints of T are introduced that directly ensure that the crossover condition in the EXIT chart of the C-MIMO-SM embodiment is avoided. This way, interleaver constraints based directly upon meeting the EXIT chart conditions using mutual information metrics are used instead of the CICM type interleaver constraints as described in the C-MIMO-SM, C-IC-SM, C-OFDM-SM, and C-TDM-SM embodiments above. It is noted that an interleaver used in a C-MIMO-SM, C-IC-SM, C-OFDM-SM, or C-TDM-SM embodiment to directly ensure that the EXIT chart crossover condition is avoided is a different type of constrained interleaver in accordance with the present invention. Other variations such as CICM interleavers that do not fully meet the ds and/or Dmin2 targets could also alternatively be used.
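The directed searches described above can be sketched as follows. This is a minimal illustration, not the claimed method: `converges` is a hypothetical predicate standing in for a full EXIT chart evaluation (true when the curves do not cross for the given number of SM bits), and the back-and-forth variant is written as a bisection, which additionally assumes that convergence is monotone in the number of SM bits.

```python
from typing import Callable


def search_upward(converges: Callable[[int], bool], start: int, limit: int) -> int:
    """Start from a low, verified count and increase until the next step would cross."""
    if not converges(start):
        raise ValueError("starting point must already converge")
    n = start
    while n + 1 <= limit and converges(n + 1):
        n += 1
    return n


def search_downward(converges: Callable[[int], bool], start: int) -> int:
    """Start from a high, non-converging count and reduce until convergence is indicated."""
    n = start
    while n > 0 and not converges(n):
        n -= 1
    return n


def search_bisect(converges: Callable[[int], bool], low: int, high: int) -> int:
    """Back-and-forth search between a converging low and a non-converging high."""
    while high - low > 1:
        mid = (low + high) // 2
        if converges(mid):
            low = mid
        else:
            high = mid
    return low


# Toy stand-in for the EXIT chart test: pretend at most 37 SM bits converge.
toy = lambda n_sm_bits: n_sm_bits <= 37
print(search_upward(toy, 1, 100), search_downward(toy, 100), search_bisect(toy, 1, 100))
```

In practice each call to the predicate would involve simulating or computing the transfer curves, so the bisection form would need far fewer evaluations than the step-by-step forms when its monotonicity assumption holds.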
It should also be noted that while a number of embodiments based on the C-MIMO-SM concept have been provided, such embodiments could be used to form hybrid systems. For example, a C-MIMO-SM system could be formed where each independent MIMO channel sends a C-OFDM-SM or C-TDM-SM signal. Similarly, a C-IC-SM system could be used where each independent channel sends a C-OFDM-SM or C-TDM-SM signal. In such systems, a SISO decoder could be used to individually (sequentially) or to jointly decode both the C-MIMO-SM or C-IC-SM signal and the individual C-OFDM-SM or C-TDM-SM signal on each independent channel of the hybrid signal.
All of the embodiments so far have used a single outer code. Multi-media type applications or applications with multi-level codes employ different codes for different types of signals. In such cases, as discussed in the unequal error protection embodiments of CICM hereinabove, the CICM design approach can be used. In such applications selected level (or levels) of the MLC code, or selected portion of the transmitted stream can be uncoded. It is noted here that additional SM bits can also be transmitted in such embodiments using any of C-MIMO-SM, C-IC-SM, C-OFDM-SM, or C-TDM-SM. As an example, consider a case where a plurality of coded streams are mapped to a respective plurality of independent channels that employ a 16-ary constellation. The four bits selected for transmission can be coming from a single coded stream or up to four different coded streams. In this example, assume that there are four separate coded streams that use four separate respective outer codes. These streams can have their own ds and Dmin2 values thereby targeting unequal error protection. In such an application, it is possible to design an interleaver Γ with four rows to maintain the desired ds(i) and Dmin2(i) values for each of the four streams, i=1, 2, 3, 4.
Next consider how the C-MIMO-SM related principles can be applied to the above example with four streams. As before, it is possible to transmit additional SM bits over selected SM intervals. Note first that each interval transmits a column of Γ1 by mapping that column to a signal constellation point. Hence, additional SM bits can be used to determine a permutation to be applied to the contents of such columns of Γ1 during SM intervals. In this case, since there are 4 bits used to form symbols (similar to the 4 antenna example in MIMO), an additional four SM bits can be used to determine a permutation ordering in which the contents of each of the columns in the SM interval are permuted before mapping the four bits to a 2^4=16-ary constellation point. However, by doing this, the original design objectives of Γ to reach ds(i) and Dmin2(i) can be altered, and thus the respective ds(i) and Dmin2(i) values can change. In fact, if different rows of Γ1 have different SED contributions, then there is no point in placing bits on different rows if they are later to be permuted to make different SED contributions. Hence, the target Dmin2 values should be calculated for the worst case. The permutation Γ1 can be designed by considering that the bits placed on a column in an SM interval can be permuted to any other row. Further, if desired, these columns can even employ a different constellation mapping from that used in non-SM intervals. For example, RGC can be used during non-SM intervals and a mapping that provides more equalized SED values can be used during SM intervals. However, if all rows of Γ1 have the same SED contribution, the original design method used to design Γ can be readily applied to deal with permutations that also permute the contents of columns of Γ1.
In another class of hybrid embodiments of all other C-MIMO-SM related embodiments considered above, re-ordering of the contents of the columns of Γ1 can be combined with the re-ordering of columns of Γ1. For example, consider the example of a MIMO system that employs four transmitting antennas and QPSK modulation (2 bits per column) on all channels. When only re-ordering of the columns of Γ1 within each SM interval is used as before, then four SM bits can be transmitted during each SM interval. However, if this example is expanded to also allow re-ordering of the contents of the columns as well, then there will be a total of eight positions that can be reordered during each selected SM interval. In this example, there are nt=4 spatial transmission channels, but instead of having 4! possible permutations during each SM interval and thus being able to transmit mspatial=4 SM bits during each SM interval, there are now 8! possible permutations. Therefore, the number of SM bits that can be transmitted in each SM interval increases to 15 (2^15<8!<2^16). In general, by combining the permutation of columns with the permutation of the contents of columns (in other words, by jointly permuting all the contents of nt columns), the number of SM bits during an SM interval can be increased up to mspatial=⌊log2((m·nt)!)⌋. Hence, by combining the permutation of columns with the permutation of the contents of the columns, it is possible to increase the number of SM bits that can be sent in an SM interval made up of nt columns. Alternatively, if the number of SM bits is held constant, this hybrid permutation technique can be used to decrease the number of SM intervals, thereby allowing the SISO iterations to start from a better point and to thus improve performance.
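The bit counts above, and one possible way of turning a block of SM bits into a joint permutation, can be sketched as follows. The capacity formula is the one stated in the text, with m taken to be the number of bits per column. The mapping from SM bits to a permutation is not specified in this passage; the factorial number system (Lehmer code) used here is only an assumed, illustrative choice, and all function names are hypothetical.

```python
from math import factorial


def sm_bit_capacity(m: int, n_t: int) -> int:
    """floor(log2((m*n_t)!)): SM bits carried by jointly permuting m*n_t positions."""
    return factorial(m * n_t).bit_length() - 1


def index_to_permutation(index: int, n: int) -> list:
    """Unrank 0 <= index < n! into a permutation of range(n) (assumed mapping)."""
    items = list(range(n))
    perm = []
    for k in range(n, 0, -1):
        q, index = divmod(index, factorial(k - 1))
        perm.append(items.pop(q))
    return perm


def permutation_to_index(perm: list) -> int:
    """Inverse of index_to_permutation, as a receiver would need."""
    items = sorted(perm)
    index = 0
    for k, p in enumerate(perm):
        q = items.index(p)
        items.pop(q)
        index += q * factorial(len(perm) - 1 - k)
    return index


def bits_to_int(bits: list) -> int:
    """Read a list of SM bits as an unsigned integer, most significant bit first."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


print(sm_bit_capacity(1, 4), sm_bit_capacity(2, 4))      # column-only vs. joint
print(index_to_permutation(bits_to_int([1, 0, 1, 1]), 4))  # 4 SM bits -> ordering
```

Because only 2^mspatial of the (m·nt)! orderings are used, every SM bit pattern maps to a distinct ordering and the inverse mapping recovers the SM bits.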
Many variations and hybrid embodiments are possible and are available to the designer. For example, the contents of all columns can be permuted the same way during each SM interval, or the contents of one column, each of a subset of columns, or each individual column in an SM interval can be individually permuted. Such selections offer a number of design points, each of which trades off a successively higher amount of additional coding complexity for a successively higher data rate. Note that in such cases the number nt stays constant, and the numbers N1 of SM intervals and N2 of non-SM intervals can be unaffected or, if necessary, changed. However, as mentioned earlier, the permutation of the contents of columns can adversely affect the highest achievable Dmin2 because the rows of Γ1 with higher distances cannot be counted upon and a worst case distance is used instead.
To better maximize the Dmin2 that can be achieved, consider a 16-PSK example. In this example the top two rows of Γ1 provide relatively higher distances as compared to lower two rows of Γ1. Therefore, in SM intervals, inter-column type permutations can be applied to just the lower two bits of each column (or in each of a defined set of columns) in each SM interval. During non-SM intervals, all of the SED contributions will be the same as can be achieved in ordinary 16-PSK CICM embodiments because no contents of any columns will be permuted. Also, if less than all of the columns in the SM intervals are to be permuted, then only those columns that would provide the lower SED (worst case) contributions would be permuted. Hence CICM design principles can be readily applied to design good constrained interleavers that increase the data rate while maintaining a high value of Dmin2. The permutation will permute all columns in an SM interval and the contents of one or a set of identified columns in the SM interval. This is preferably achieved using one joint permutation applied in each SM interval. Any columns whose contents are not permuted are allowed to contribute to SED without being reduced to worst case. Also, if less than 100% of all combinations are to be used, only the contents of columns corresponding to rows of Γ1 with lower SEDs are permuted. Such optimizations can be applied to unequal error protection embodiments or any of the C-MIMO-SM related embodiments such as C-IC-SM, C-OFDM-SM, and C-TDM-SM as discussed herein. A key design objective is to further increase the data rate while maintaining as high of performance as is possible for a given frame size.
Another type of hybrid embodiment uses SM groups of independent channels. This mainly applies to C-MIMO-SM and C-IC-SM embodiments. In such embodiments, certain spatial channels or independent channels are not permuted, but only the independent channel symbols being passed to an SM group of nSMG<nt number of channels is permuted in every C-MIMO-SM or C-IC-SM symbol interval. In such cases, similar permutations are applied to the contents of the columns as described above. Also, embodiments with N2=0 but that permute column contents can be constructed for any or all of C-MIMO-SM, C-IC-SM, C-OFDM-SM, and C-TDM-SM embodiments. In such embodiments, there can be no SM intervals, SM groups, or SM tone groups. Instead, only a subset of the SC bits are permuted in all intervals. For example, in 64-PSK CICM type embodiments, the upper four columns of Γ1 are never permuted and the bottom two bits of Γ1 are permuted in every independent channel symbol to send SM bits. All such hybrid combinations of all the various design methods and encoder and decoder methods and apparatus can be constructed with modifications that would be readily determinable to those of skill in the art given the current disclosure of the present invention.
In all of the above embodiments, whether they use any form of C-MIMO-SM or not, and whether they use just CTBC codes, or CICM, or a combination, or whether they are based on direct EXIT chart analysis, or even in other kinds of systems where SISO decoding is employed, e.g., turbo codes, BICM, and the like, when SISO iterations converge to the correct solution, the extrinsic values from different component codes will increase as the number of iterations increases. However, when SISO iterations do not converge to the correct solution, the extrinsic information will stay significantly lower and will not continuously increase as more SISO iterations are carried out. Hence, at the end of the last iteration, by considering the extrinsic information, it is possible to determine reasonably well whether the frame has been resolved so that the solution can be accepted or rejected. For example, whether a frame is in error or not can be decided by considering the number of bits at the outer code (the component code at which an iteration ends) for which the magnitude of the extrinsic value is below a threshold. If that number is large, the solution obtained by SISO iterations should be considered incorrect. Alternatively, if the sum of the magnitudes of the extrinsic value components is below a threshold, the solution can be considered to be incorrect. If the system uses a cyclic redundancy check (CRC) prior to the channel code, that CRC can also be used to check if the solution is acceptable or not.
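The two acceptance tests described above can be sketched as a single check on the final extrinsic values. All threshold values and the function name are assumptions for illustration; the text does not give numerical values, and in practice they would be tuned for the code and the operating SNR.

```python
from typing import Optional, Sequence


def frame_accepted(extrinsic: Sequence[float],
                   bit_threshold: float,
                   max_weak_bits: int,
                   sum_threshold: Optional[float] = None) -> bool:
    """Accept a decoded frame based on the outer code's final extrinsic values.

    Rejects when too many bits have a small extrinsic magnitude, or (optionally)
    when the sum of the magnitudes is below sum_threshold.
    """
    weak_bits = sum(1 for e in extrinsic if abs(e) < bit_threshold)
    if weak_bits > max_weak_bits:
        return False
    if sum_threshold is not None and sum(abs(e) for e in extrinsic) < sum_threshold:
        return False
    return True


converged = [9.5, -11.2, 8.7, -10.1, 12.3, -9.9]
stalled = [0.4, -1.1, 6.0, -0.2, 0.9, -0.3]
print(frame_accepted(converged, bit_threshold=2.0, max_weak_bits=1))
print(frame_accepted(stalled, bit_threshold=2.0, max_weak_bits=1))
```

A CRC check, where available, would be applied in addition to (not instead of) this test.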
Scaling is commonly used in SISO iterative decoding by multiplying the extrinsic information of a component code by a scaling factor before passing to the next component code. The scaling factor can be varied from component code to component code and from iteration to iteration. These scaling factors are usually optimized numerically to achieve best performance. Besides such scaling methods, selective scaling can be used in SISO decoding of schemes that employ a higher order constellation, particularly when the constellation mapping is different from Gray coding. “Selective scaling” in accordance with an aspect of the present invention is a general technique that can be used with any concatenation where the inner code is associated with a higher order modulator such as is used in BICM and/or CICM. The scheme may or may not include SM bits. Hence, selective scaling can be employed in iterative decoding of turbo codes, serially concatenated codes, BICM, CICM, multi-level codes, etc. when the constellation is a higher order constellation.
For example, consider a CICM scheme constructed where the outer code is an (8,4) OBC and the constellation is a 16-PSK constellation that employs RGC (reverse Gray coding). Denote the scaling factor normally used during the SISO decoding for both the outer OBC and the inner mapper by scale1. Note that RGC ensures that all single bit differences are farther apart on the constellation, and further, the CICM interleaver is constructed to spread the non-zero bits of any codeword of the OBC into different symbols. Consider a decoding error made by the OBC in one of its codewords. Once extrinsic information of that decoded codeword is passed back to the inner code (16-PSK modulator and the calculation of the bit metrics and extrinsic information associated therewith) for the next iteration, it is likely that the extrinsic information received from the OBC would suggest that a few symbols (those that contain errors in the decoding of the OBC) are farther away from their respective received signals. This suggests that if the hard decoded extrinsic information from the OBC points to a symbol that is far off from the actual received signal of that interval, then the received extrinsic information will appear to be unreliable. Hence, if that happens, in selective scaling, a lower scaling factor is employed for those bits that carry the extrinsic information from the OBC for that symbol.
Selective scaling can be easily implemented as: (a) hard decode the symbol indicated by the extrinsic information received from the OBC, e.g., v̂, (b) calculate the SED between the received symbol y and v̂, denoted as D2(y, v̂), and then (c) if D2(y, v̂)>th (where th is a threshold), then use a lower scaling factor, scale2 (<scale1), for all bits that form that symbol and multiply the extrinsic information received from the OBC for those bits by scale2. If not, use the scaling factor scale1. Selective scaling can be extended to include any number, xi, of scaling factors scale2(i), i=1, 2, . . . , xi, in decreasing order (i.e., scale2(i+1)<scale2(i), i=1, 2, . . . , xi−1) in addition to scale1, which is higher than any scale2(i). These scaling factors can be used along with a corresponding set of properly chosen threshold values, th(i), i=1, 2, . . . , xi, in increasing order (i.e., th(i+1)>th(i), i=1, 2, . . . , xi−1). Selective scaling can be used only for a few initial iterations or throughout all iterations.
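Steps (a) through (c), including the multi-threshold extension, can be sketched for a single symbol as follows. The sign convention for the extrinsic values (non-negative means bit 0), the dictionary-based constellation, and the function name are assumptions for illustration; a QPSK labeling is used in the usage lines only to keep the example short.

```python
from bisect import bisect_left
from typing import Dict, List, Sequence, Tuple


def selective_scale(y: complex,
                    ext: Sequence[float],
                    constellation: Dict[Tuple[int, ...], complex],
                    thresholds: Sequence[float],
                    scales: Sequence[float]) -> List[float]:
    """Scale the outer decoder's extrinsic values for the bits of one symbol.

    thresholds: th(1) < th(2) < ... (increasing).
    scales: scales[0] is scale1; scales[i] is scale2(i), in decreasing order,
            so len(scales) == len(thresholds) + 1.
    """
    # (a) hard decode the symbol indicated by the extrinsic information
    bits = tuple(0 if e >= 0 else 1 for e in ext)
    v_hat = constellation[bits]
    # (b) squared Euclidean distance between the received symbol and v_hat
    sed = abs(y - v_hat) ** 2
    # (c) count the thresholds strictly exceeded and pick the matching factor
    level = bisect_left(thresholds, sed)
    return [scales[level] * e for e in ext]


qpsk = {(0, 0): 1 + 1j, (0, 1): -1 + 1j, (1, 1): -1 - 1j, (1, 0): 1 - 1j}
print(selective_scale(0.9 + 1.1j, [2.0, 3.0], qpsk, [1.0], [0.75, 0.5]))  # close: scale1
print(selective_scale(-1 - 1j, [2.0, 3.0], qpsk, [1.0], [0.75, 0.5]))     # far: scale2
```

With a single threshold this reduces to the two-factor rule of step (c); longer threshold and scale lists give the extended form.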
Another approach is that once it is noticed that the SISO iterations have failed to converge to a correct solution, the received signal can be perturbed by adding a pseudo random (or otherwise randomized or selected) noise signal (perturbation signal) to the received signal. This will have the effect of causing the iterations to get started off in a different direction. Preferably, the variance of the perturbation signal can be gradually increased until an acceptable solution is found from SISO iterations. The acceptable solution is identified based upon the increasing values (or values above a threshold) of the extrinsic information generated during SISO iterations as discussed above.
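The perturbation approach can be sketched as a retry loop. Here `try_decode` is a hypothetical callable that runs the SISO iterations and returns the decoded frame when the extrinsic-based acceptance test passes, or None otherwise. Gaussian noise, real-valued samples, and perturbing the original received signal afresh on each retry (rather than cumulatively) are all assumptions of this sketch.

```python
import random
from typing import Callable, Optional, Sequence


def decode_with_perturbation(received: Sequence[float],
                             try_decode: Callable[[Sequence[float]], Optional[object]],
                             sigma_steps: Sequence[float],
                             seed: int = 0) -> Optional[object]:
    """Retry decoding with perturbation noise of gradually increasing spread."""
    rng = random.Random(seed)  # pseudo random source for the perturbation signal
    result = try_decode(received)
    if result is not None:
        return result
    for sigma in sigma_steps:  # e.g. an increasing list of standard deviations
        perturbed = [x + rng.gauss(0.0, sigma) for x in received]
        result = try_decode(perturbed)
        if result is not None:
            return result
    return None  # no acceptable solution found within the allowed retries
```

A caller would typically pass a short increasing list such as `[0.05, 0.1, 0.2]` for `sigma_steps`, with the values chosen relative to the channel noise level.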
In another type of alternative embodiment of the present invention, instead of using soft iterative decoding to jointly decode the SC bits and the SM bits, hard decoding is applied to decode/extract the SC bits and the SM bits that were encoded into the received symbol frame by the encoder/transmitter.
To better understand the concepts of encoding and hard decoding, consider an example embodiment where C-TDM-SM is used. In this example, the outer code is a (7,4) single error correcting Hamming code (error correcting capability of one bit error) with MHD=3, denoted HC(7,4). The outer encoded bits are passed through a CICM constrained interleaver followed by a BPSK modulator that employs constellation points at +a and −a. In this example, a frame consists of seven codewords of the HC(7,4) code, so that the number of coded bits per frame is given by K=7×7=49. Let Cwj denote the jth codeword in each frame of seven codewords, and let cbjt denote the tth coded bit of the jth codeword in the frame (t also ranges as t=0, 1, . . . 6 over the seven coded bits per codeword, Cwj, in the frame of seven codewords). With this notation, the jth codeword in each frame can be written as Cwj=(cbj0, cbj1, . . . , cbj6), and this holds for all codewords in the frame, i.e., for j=0, 1, 2, . . . , 6. In this example, the last coded bit, cbj6, of every one of the seven codewords in the frame will be designated as an SM bit. In this example, each SM bit, cbj6, for j=0, 1, 2, . . . , 6, will be used to determine whether the fifth and sixth coded bits of the next codeword Cwjp are swapped or not, where jp (j plus) is given by jp=(j+1) mod 6. Specifically, in this example, when cbj6=0, the positions of cbjp4 and cbjp5 are not swapped and when cbj6=1, the positions of cbjp4 and cbjp5 are swapped. In general, cbj6 or some other selected bit in each codeword can be used to influence different codewords other than Cwjp, and different coded bits other than the fifth and the sixth can be chosen as the SC bits in each codeword to be swapped or not swapped based on the chosen SM bit.
In the context of a CICM based C-TDM-SM constrained interleaver design, the above example has nt=6. The SC bits are interleaved using a Γ1 interleaver that is designed with one row and C=6×7=42 columns. In this design example, the first six coded bits, cbj0, cbj1, . . . , cbj5, of all codewords are placed onto the single row of Γ1 in a natural numeric order manner, for all codewords in the frame, j=0, 1, . . . , 6. Similarly, Γ2 is constructed with one row and 7 columns. These seven columns contain the SM bits, cbj6, j=0, 1, . . . , 6, but in an ordering as discussed below. In this example, because the MHD of the HC(7,4) code (OBC) is d0=3, the target values of ds and Dmin2 used in the CICM based design are ds=3 and Dmin2=3×4a2=12a2. In order to maintain these target ds and Dmin2 values, it is necessary to ensure that the six columns of Γ1 that correspond to any codeword Cwja are coupled with an SM bit from a different codeword Cwjb, where ja≠jb. Further, in this example only two selected columns of Γ1, i.e., the fifth and sixth coded bit positions of every block of six columns, which correspond to coded bits cbja4 and cbja5, will be selectively permuted based upon an associated SM bit, cbjb6. In this C-TDM-SM example, the first four time slots of every nt=6 time slots are transmitted in natural order. The last two time slots are used to transmit cbja4 and cbja5, but in an order that is dependent on the element of Γ2 that is associated with these nt=6 columns of Γ1.
In this C-TDM-SM embodiment, during every nt=6 consecutive time slots (six columns of Γ1), the TDM system transmits SC bits cbjp0, cbjp1, . . . , cbjp5 of a respective codeword Cwjp. In this sub-frame of nt=6 time slots, coded bit positions cbjp4 and cbjp5 are either swapped or are not swapped depending on the value of coded bit cbj6 (SM bit from codeword Cwj, i.e., the element of Γ2 that is associated with these nt=6 columns of Γ1/sub-frame, i.e., the six SC bits from codeword Cwjp). Hence all of the coded bits cbj6 in the frame j=0, 1, 2, . . . , 6 are only implicitly transmitted as SM bits. In this example, since cbj6 influences the jpth codeword, i.e., codeword number jp=(j+1) mod 6, all of the seven codewords in the frame are chained together via their respective last coded bits, cbj6. As a result, only six out of seven bits of every codeword of Cwj are actually transmitted as SC bits while every seventh coded bit per codeword is transmitted implicitly as a SM bit to determine the permutation ordering of the fifth and sixth coded bit of a different codeword which are transmitted as SC bits. Hence this example achieves a coded-bit data rate increase of 7/6, i.e., 16.67%. Stated another way, the effective overhead of the HC(7,4) code drops from ¾ (75% increase in coded bits to message bits) to 2/4 (50% increase in transmitted coded bits to message bits).
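The encoding side of this example can be sketched as follows. Several details are assumptions of the sketch rather than statements of the text: the particular systematic generator matrix (the text does not fix one), transmitting the codewords in index order, and taking the chaining index modulo the number of codewords in the frame (seven) so that every codeword has exactly one predecessor, whereas the text writes the modulus as 6. BPSK mapping to +a and −a is omitted; the output is the list of 42 SC bits.

```python
# Assumed systematic generator for a (7,4) Hamming code (minimum distance 3).
G = [[1, 0, 0, 0, 1, 1, 0],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]


def hamming_encode(msg):
    """Encode 4 message bits into a 7-bit codeword (cb0 .. cb6)."""
    return [sum(m * g for m, g in zip(msg, col)) % 2 for col in zip(*G)]


def ctdm_sm_encode(messages):
    """Return the SC bits of a frame; each cb6 is sent only implicitly.

    The SM bit cb6 of the preceding codeword in the chain decides whether
    cb4 and cb5 of the current codeword are swapped before transmission.
    """
    codewords = [hamming_encode(m) for m in messages]
    n = len(codewords)
    frame = []
    for j, cw in enumerate(codewords):
        sm_bit = codewords[(j - 1) % n][6]  # assumption: chain closes over all n codewords
        sc = cw[:6]                         # the six explicitly transmitted SC bits
        if sm_bit == 1:
            sc[4], sc[5] = sc[5], sc[4]
        frame.extend(sc)
    return frame


frame = ctdm_sm_encode([[0, 1, 0, 0]] * 7)
print(len(frame), frame[:6])
```

Seven codewords of seven coded bits are thus carried in 42 transmitted bits, which is the 7/6 coded-bit rate increase noted above.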
Next consider how to hard decode a frame of codewords of an OBC that has been formed using C-TDM-SM as in the example above. In general, the hard decoding steps can be listed as: (a) pass the received signal (in case of BPSK) or the bit metrics (in case of higher order modulation) to a hard decoder that is configured to decode the OBC, (b) using the hard decoder, decode each codeword of the OBC separately and list the decoded sequences corresponding to all possible (or most likely) valid permutations and SM bit combinations of that codeword, and (c) use a search algorithm to select a full frame of valid decoded codewords of the OBC (decoded sequence of K coded bits) that minimizes the Hamming distance to the entire received sequence. This search algorithm is preferably implemented to find valid solutions starting from every codeword separately and then performing additional searching to find the best collective solution among all of the codewords in the frame that minimizes the Hamming distance between the received sequence and the decoded sequence. As discussed below, special care will need to be taken in steps (a)-(c) to deal with the fact that the SM bits are transmitted implicitly by reordering specified SC bits.
Next apply the above hard decoding approach to hard decode the above described C-TDM-SM frame of seven HC(7,4) codewords transmitted as a frame of 6×7=42 SC bits. In this frame, only the SC bits cbj0, cbj1, . . . , cbj5 are actually received for each codeword, Cwj. Also, coded bits cbj4 and cbj5 of every codeword may have been either swapped or not based on cbjm6, where jm (j minus) is given by jm=(j−1) mod 6. The bit cbj6 is not explicitly received by the hard decoder. Therefore, every codeword Cwj can be initially decoded using the received SC bits cbj0, cbj1, . . . , cbj5, and by considering the separate events (possibilities) where the coded bits cbj4 and cbj5 were swapped, were not swapped, and where cbj6=0 and cbj6=1. To explicitly enumerate all combinations of these possibilities, note that cbjm6=0 implies that coded bit positions cbj4 and cbj5 were not swapped, and cbjm6=1 implies that coded bit positions cbj4 and cbj5 were swapped. Therefore, the above-mentioned four possibilities can be enumerated by defining Si,j, i=1, 2, 3, 4, where (i) S1,j={cbjm6=0, cbj6=0}, (ii) S2,j={cbjm6=1, cbj6=0}, (iii) S3,j={cbjm6=0, cbj6=1}, and (iv) S4,j={cbjm6=1, cbj6=1}. In accordance with step (b) above, these individual events can be analyzed separately for each codeword of the OBC in the received frame. That is, taken together with the received coded bits, cbj0, cbj1, . . . , cbj5, these four events {Si,j} can be associated with four different candidate codewords from among the sixteen possible codewords of the HC(7,4) code. For each of these four events, a hard decoder will determine a valid candidate codeword with minimum Hamming distance to the associated received coded bits, including by checking the permutations, and that codeword and its associated minimum Hamming distance are recorded for each Cwj in the frame. Note that in step (b) of the hard decoder algorithm, every codeword Cwj is preferably separately decoded without consideration of any other codeword.
However, one of the four potential solutions Si,j (based on the permutations and chaining) may turn out to be a better solution for use with the jth codeword Cwj when the entire decoded sequence (frame) is hard decoded as a whole.
The next task is to search for the combination of the solutions Si,j for i=1, . . . , 4 and j=0, 1 . . . , 6 (one per each codeword Cwj) that would provide the best sequence of valid OBC codewords to minimize the Hamming distance between the decoded sequence and the complete set of 42 SC bits of all seven codewords in the entire frame. Many different search algorithms can be used to determine the overall ML solution. One preferred method is to start from every codeword Cwj with its best solution Si,j (the one with the minimum Hamming distance from the received signal computed as per the paragraph above). Note that once Si,j is selected, the above-described permutations and chaining impose conditions on the selection of the solutions for Cwjp and Cwjm. The best solutions of Cwjp and Cwjm can then be selected to match the solution selected for Cwj and to minimize the overall separation in Hamming distance of all codewords Cwjm, Cwj and Cwjp. This process can be continued as a sliding window until it completes looping through all seven codewords in the frame. When the loop completes, the last two selected codewords impose conditions on each other and the last two selected codewords may not be valid, causing a “mismatch” condition. If there is such a mismatch, then select the best valid solution among the last two codewords that minimizes the total Hamming distance between the received and the decoded sequences of those two last codewords subject to the constraint that the selected solution must correspond to a frame of valid codewords of the HC(7,4) code. This process can be performed seven times by starting the loop from every codeword Cwj, j=0, 1, . . . , 6. After these seven passes through the loop have completed, the hard decoder will have found seven valid candidate solutions as candidate decoded sequences for the entire frame. From among these seven frame-level candidate decoded solutions, select the candidate that has the lowest Hamming distance to the received sequence.
This simple example demonstrates how permutation based encoding and hard decoding can be performed to reduce the effective coding overhead of simple codes such as block codes.
In case of higher values of i, similar to the above described Pyndiah decoding, only the most significant solutions (the ones with lower Hamming distance separations) need to be considered. In such cases with larger number of codewords, longer lists of candidate solutions per codeword, multiple loops, and interconnected loops can be present in the decoding. In all such cases the best valid solution can be sub-optimally searched using efficient search algorithms that are also well known to those of skill in the art.
The hard decoder as described above can be viewed as an alternative embodiment of a method and/or apparatus 2200 as depicted in
In this alternative type of embodiment, block 2210 is operative to first decode each segment to provide different potential solutions for each possible event, Si,j for i=1, . . . , Nevents and j=0, 1 . . . , Nseg (one per each segment). In the OBC type example described above, each “segment” corresponds to a codeword Cwj, but in other embodiments segments generally just correspond to groups of nt columns of Γ1. Also, Nevents corresponds to the number of different permutations that can be applied to the SC bits in each segment based upon the SM bits. Suppose that there are nt bits per segment, and that there are !ism SM bits that are used to apply permutations to selected SC bits within each segment. Then there are Nevents=Npermutations×2n
In the hard decoder type embodiment of the decoder 2200, block 2215 is typically removed. Also, block 2220 performs different outer code decoding functions as compared to the SISO decoder type embodiments of the decoder 2200. For each event identified in block 2210, block 2220 will record the corresponding distance measure (Hamming distance in hard decoding and Euclidean distance in soft decoding) between the received SC bits and the valid codeword that is closest to the received bit metrics in the segment for that event. That is, block 2220 will decode each respective segment and save the candidate solutions associated with each event and their respective distances to the sequence of received bit metrics. Block 2220 will typically search for the combination of the solutions Si,j for i=1, . . . , Nevents and j=0, 1 . . . , Nseg (one per each segment) that would provide the best sequence of Nseg number of segments (e.g., valid OBC codewords) to minimize the Hamming distance between the received sequence and the complete set of SC bits of all Nseg number of segments in the entire frame. In this type of embodiment, blocks 2225-2240, 2250-2255 are also not implemented (or can be viewed as no-operations/pass-through). The results of block 2220 are thus passed directly to block 2240. Block 2240 in this type of embodiment performs a search algorithm to determine the best valid combination of candidate solutions from each segment (one from each segment) that collectively form an entire decoded sequence for the frame that minimizes the distance measure (e.g., Hamming or Euclidean distance) between the entire decoded sequence and the received signal of the entire frame. Note that this type of hard decoder embodiment of the decoder 2200 is not limited to block codes; the outer code can be any general code, and the scheme can employ any number of SM bits. However, as the number of SM bits and the frame size increase, the complexity increases.
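The frame-level search described above can be illustrated with a minimal sketch. It is not the sliding-window method of the preferred embodiment but a plain exhaustive search, which is practical only for very small frames. The names `best_frame`, `candidates`, `received` and `compatible` are hypothetical; in particular `compatible` stands in for whatever validity test the permutations and chaining impose, which is not modeled here.

```python
from itertools import product

def hamming(a, b):
    """Number of positions in which two equal-length bit sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def best_frame(candidates, received, compatible):
    """Exhaustively pick one candidate solution per segment.

    candidates: one list per segment, each holding candidate bit tuples
    received:   one hard-decided received bit tuple per segment
    compatible: hypothetical predicate on a full choice (one candidate per
                segment) that is True when the chaining constraints hold
    Returns (total_hamming_distance, choice), or None if nothing is valid.
    """
    best = None
    for choice in product(*candidates):
        if not compatible(choice):
            continue  # skip combinations that violate the chaining constraints
        dist = sum(hamming(c, r) for c, r in zip(choice, received))
        if best is None or dist < best[0]:
            best = (dist, choice)
    return best

# Toy usage: two segments, and an assumed constraint that both must agree.
cands = [[(0, 0), (1, 1)], [(0, 0), (1, 1)]]
rx = [(0, 0), (0, 1)]
result = best_frame(cands, rx, lambda ch: ch[0] == ch[1])
```

The cost grows as the product of the candidate list lengths, which is why the text limits the lists to the most significant solutions and suggests sub-optimal searches for larger frames.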
Note that decoding of individual segments (which may or may not correspond to codewords of a block code as in this example, depending on the embodiment) can be done in parallel to reduce decoding delay. Also note that special decoder hardware similar to
The above method and the above discussion of the alternative embodiment of the decoder 2200 can also be converted into (non-iterative) soft decoding instead of hard decoding. This can be easily done by using the above described hard decoding steps (and blocks of
In previously-described embodiments, signature filters were used in
In the discussion that follows, all the discussion assumes that permutations will involve nt number of symbols. In all of the discussion below, nt may alternatively be broken down into sub-groups as discussed above to simplify the implementation. To keep the discussion focused, without loss of generality, it is to be understood that the permutations discussed in these signature-waveform based embodiments can be implemented using sub-groups. When sub-groups are used, the permutation size in the discussion below, nt, would instead apply to the level-1 sub-group size.
The present invention contemplates an alternative class of embodiments where the signature filters pi are used to provide explicit SM-bit information. To see how this is done, first consider an embodiment such as the above described C-IC-SM or C-TDM-SM embodiments that did not make use of the signature filters, {pi}. Without loss of generality, to keep the discussion simpler, consider a C-TDM-SM embodiment that uses the same pulse shape, p(t), in all independent channels. This pulse shape can be written in vector form as p, and the independent channels correspond to time slots in the TDM case. When a single pulse shape is used, the complex-valued signal transmitted into the ith independent channel during any non SM interval can be denoted as sN-SMI(i,t), where the subscript N-SMI denotes “non SM Interval” and the variable t represents continuous time or a sampled-type discrete time variable within each independent channel's symbol interval. Using this notation, the complex-valued signal transmitted over the ith independent channel during any non SM interval can be written as
$$s_{\text{N-SMI}}(i,t) = s_i\,p(t - iT), \tag{34}$$
where the subscript N-SMI denotes a non SM interval and the values {si} in equation (34) denote the 2m-ary symbols, numbered as {i=0, 1, . . . , nt−1}, that are sent during the non SM interval. In this discussion, if the columns of Γ1 are numbered as k=0, . . . C−1, then i=k mod nt. Let {i}nt denote the integer ring, {0, 1, . . . nt−1}, and let {φ}nt denote some arbitrary permutation of {i}nt. The SM bits associated with any particular SM interval are used to select a specific permutation {φ}nt to be used during that particular SM interval. Then, assuming the same single pulse shape is used at the modulator, the transmitted signal sent during time slot φ during SM intervals can be written in the form,
$$s_{\text{SMI}}(\varphi,t) = s_{\varphi}\,p(t - \varphi T) \tag{35}$$
where the subscript SMI denotes an SM interval and {φ}nt is the SM-bit selected permutation ordering during that SM interval. In C-IC-SM embodiments the equation changes to the form sSMI(φ,t)=sφp(t−kntT) where knt=k div nt, and div represents integer division. As stated before, to keep matters simple, the present discussion will continue to focus on C-TDM-SM by way of example.
Next consider this same example, but where the transmit signature filters as described above in connection with blocks 2320 and 2420 are used (whether in optical, analog, digital, or modulator-integrated form). In what shall be called SS1 (“signature-stamping technique 1”), the complex-valued signal transmitted during the ith time slot of any SM interval can be written in the form,
$$\psi_{ss1}(i,t) = s_i\,p_{\varphi}(t - iT) \tag{36}$$
where the corresponding φ values are taken as the ith integer in the permuted integer ring, {φ}nt, and pφ(t−iT) corresponds to the signature waveform vector pφ, which is applied in the ith time slot of the SM interval. Note that in SS1, equation (36) implies that the symbols in the transmitted symbol frame (corresponding to the columns of Γ1) are sent in their natural order. However, the signature waveforms {pi} are permuted to {pφ}, and these permuted signature waveforms are explicitly transmitted along with the SC-bit information contained in the columns of Γ1. This implies that the transmitted symbol frame will contain explicit SM bit information as opposed to only containing implicit SM bit information as in the previously-described embodiments. This explicit SM bit information can be exploited by a modified decoder to be described below to improve performance.
In what shall be called SS2, (“signature-stamping technique 2”), using similar notation, the complex-valued signal transmitted during the φth time slot of any SM interval can be written in the form,
$$\psi_{ss2}(\varphi,t) = s_i\,p_i(t - \varphi T). \tag{37}$$
Note that in SS2, equation (37) implies that the symbols in the transmitted symbol frame (corresponding to the columns of Γ1) are sent in permuted order. However, the signature waveforms {pi} are explicitly transmitted along with the SC-bit information contained in the columns of Γ1. This explicit SM bit information can be exploited by a modified decoder to be described below to improve performance. This is because both types of received symbols, ψss1(i,t) and ψss2(φ,t) have been effectively “stamped” with a signature. This signature stamp can be exploited by the SISO decoder (or the ML decoder) to help directly extract information about the permutation ordering (SM bit information) from the symbols in the received symbol frame.
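The difference between equations (36) and (37) can be made concrete with a small discrete-time sketch. It assumes, purely for illustration, that each signature waveform is a short list of samples occupying one time slot (so the delay by iT or φT is represented by the slot position), and that the permutation is given as a list `phi` whose ith entry is the ith integer of the permuted ring. The function names and sample values are hypothetical.

```python
def ss1_frame(symbols, signatures, phi):
    """SS1, eq. (36): s_i stays in slot i and is shaped by signature p_{phi[i]}."""
    return [[symbols[i] * x for x in signatures[phi[i]]]
            for i in range(len(symbols))]

def ss2_frame(symbols, signatures, phi):
    """SS2, eq. (37): s_i is shaped by its own p_i, then sent in slot phi[i]."""
    slots = [None] * len(symbols)
    for i, s in enumerate(symbols):
        slots[phi[i]] = [s * x for x in signatures[i]]
    return slots

# Illustrative values only: three slots, two samples per signature.
symbols = [1, -1, 1j]
signatures = [[1, 1], [1, -1], [1, 1j]]
phi = [2, 0, 1]

frame_ss1 = ss1_frame(symbols, signatures, phi)
frame_ss2 = ss2_frame(symbols, signatures, phi)
```

In `frame_ss1` the symbols appear in natural order with permuted signatures, while in `frame_ss2` each symbol keeps its own signature and the slot order is permuted. In both cases the signature observed in a slot carries information about `phi`.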
In either or both of equations (36) and (37) the signature filters can be viewed as a point-wise multiplication of each of a sequence of 2m-ary signal constellation points, {si}, by a respective complex-numbered pulse shape, pi(t)=[r(i,t),θ(i,t)], where r(i,t) and θ(i,t) are radius and phase functions applied in the ith time slot of a selected SM interval. In this context, ψss1(i,t)=[r(φ,t), θ(φ,t)].*si and ψss2(φ,t)=[r(i,t), θ(i,t)].*si, where “.*” denotes point-wise multiplication of two sequences. This type of notation will be helpful in the context of C-OFDM-SM embodiments to be described below. In the decoder, the joint estimation problem includes jointly estimating the sequence of transmitted 2m-ary signal constellation points, {si}, and the sequence of time domain signature waveforms, {[r(i,t), θ(i,t)]} in all SM intervals.
Consider a specific example of a C-TDM-SM embodiment where both SM groups and non-SM groups each contain nt=4 time slots, and each time slot is used to transmit a subset of m=2 SC bits as a QPSK symbol. In any given SM interval, the group of nt=4 QPSK symbols are permuted in accordance with mspatial=4 SM bits from a column of Γ2 associated with the SM interval. In an SS1 type embodiment, during each SM interval in accordance with equation (36) the SC bits will be used to identify four consecutive QPSK symbols si, i=0, . . . , 3, and these QPSK symbols will be pulse-shaped by the signature waveforms, pi, i=0, . . . , 3, but in their permuted order {pφ}. During the SM interval, the resulting sequence of symbols {ψss1(i,t)} will be transmitted into the symbol frame in their natural order, {i}. In an SS2 type embodiment, during each SM interval in accordance with equation (37) the SC bits will be used to identify four consecutive QPSK symbols si, i=0, . . . , 3, and these QPSK symbols will be pulse-shaped by the signature waveforms, pi, i=0, . . . , 3, in their natural order. During the SM interval, the resulting sequence of symbols {ψss2(φ,t)} will be transmitted into the symbol frame in their permuted order, {φ}. In the previously described C-TDM-SM embodiment of
The same ideas as used in the above-described C-TDM-SM embodiment can be applied in OTN systems such as shown in
Referring again to the above-described C-IC-SM embodiment of
A C-MIMO-SM system that employs different transmitting antennas using a configuration similar to blocks 2405, 2410, 2415, and 2420 but implemented in electronic circuits instead of optical circuits and inserted into block 2105 of
In what shall be called SS3, (“signature-stamping technique 3”), equation (36) is used to form the transmitted symbols, but in SS3 there will be nsm>nt number of signature filters available, that is, {pi, i=0, . . . , (nsm−1)}. For that reason, SS3 can be viewed as a close cousin to SS1. During any given SM interval, nt of these signature filters will be chosen in some selected permuted ordering to be used with nt columns of Γ1 as per equation (36). There is no need for any additional permutation because the assignment of the selected nt signature filters from the total set of nsm signature filters is all that is needed to explicitly transmit all of the information carried by the SM bits. Let (n)k=n!/(n−k)! denote the number of k-permutations that can be formed from n objects. Then since (nsm)nt>nt!, the use of nsm>nt number of signature waveforms can increase the number of SM bits that can be transmitted. When nsm=nt, SS3 reduces to SS1 as per equation (36).
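The counting argument above can be checked numerically. The following is a minimal sketch, assuming that only a whole number of SM bits is used per SM interval (so the count of ordered selections is rounded down to a power of two); the function name is hypothetical.

```python
import math

def sm_bits_per_interval(n_sm, n_t):
    """Ordered selections of n_t signatures out of n_sm, and the whole SM bits they carry."""
    arrangements = math.perm(n_sm, n_t)       # (n_sm)_{n_t} = n_sm! / (n_sm - n_t)!
    whole_bits = arrangements.bit_length() - 1  # floor(log2(arrangements))
    return arrangements, whole_bits

ss1_case = sm_bits_per_interval(4, 4)   # n_sm = n_t = 4: plain permutations (SS1)
ss3_case = sm_bits_per_interval(6, 4)   # n_sm > n_t: more arrangements (SS3)
```

With n_t=4, moving from n_sm=4 to n_sm=6 raises the number of arrangements from 24 to 360, that is, from 4 to 8 whole SM bits per SM interval under the stated assumption.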
In what shall be called SS4, (“signature-stamping technique 4”), all independent channel symbol intervals, i, are typically designated to be SM intervals. SS4 is a special case of SS3 with nt=1. In SS4, the complex-valued signal transmitted during the ith time slot of any SM interval can be written in the form,
$$\psi_{ss4}(i,t) = s_i\,p_{SM(i)}(t - iT) \tag{38}$$
where the subscript SM(i) denotes a signature waveform selection index, SM(i)∈{0, . . . , nsm−1}, that is designated during the ith interval according to a corresponding set of SM bits. In such embodiments, there will be a set of nsm number of possible signature waveforms, {p0, p1, . . . , pnsm-1}, that can be sent during each SM interval. SS4 selects a signature waveform from among all nsm available signature waveforms based on the SM bits transmitted during that interval and uses that signature on the SC bits transmitted during the same interval. As a result, ⌊log2(nsm)⌋ number of SM bits can be transmitted during every interval i that is a designated SM interval. It is preferable to keep the number of different signature waveforms, nsm, at a manageable number so that the decoder can differentiate between the different possible signature waveforms, which in turn limits the number of SM bits transmitted. At the receiver, the likelihood values of every SM bit can be extracted because soft information about the waveform that was selected during any SM interval can be extracted by soft detecting the signature waveform and then using the one-to-one correspondence between this signature waveform and the corresponding SM bit combination. SS4 can be used to construct a variety of different types of embodiments, including C-MIMO-SM systems that make use of a single physical channel such as an optical channel or a single antenna wireless channel. In most embodiments of SS4 all intervals are designated as SM intervals because there is no uncertainty as to which set of SC bits correspond to any column of Γ1.
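Equation (38) can be sketched in discrete time as follows. This is an illustration under several assumptions that are not taken from the text: the SC bits and SM bits are mapped by natural binary indexing, the signatures are four length-4 Walsh sequences, and the receiver makes a hard minimum-Euclidean-distance decision over all (symbol, signature) pairs rather than the soft detection used by the SISO decoder. All names are hypothetical.

```python
def bits_to_index(bits):
    """Natural binary mapping of a bit list to an integer (an assumed mapping)."""
    return int("".join(str(b) for b in bits), 2)

def ss4_modulate(sc_bits, sm_bits, constellation, signatures):
    """SC bits pick the symbol s_i; SM bits pick the signature p_SM(i)."""
    sym = constellation[bits_to_index(sc_bits)]
    sig = signatures[bits_to_index(sm_bits)]
    return [sym * x for x in sig]

def ss4_detect(rx, constellation, signatures):
    """Hard joint decision: closest (symbol index, signature index) pair."""
    best = None
    for s_idx, sym in enumerate(constellation):
        for p_idx, sig in enumerate(signatures):
            d = sum(abs(r - sym * x) ** 2 for r, x in zip(rx, sig))
            if best is None or d < best[0]:
                best = (d, s_idx, p_idx)
    return best[1], best[2]

qpsk = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
walsh = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]

tx = ss4_modulate([1, 0], [1, 1], qpsk, walsh)
decided = ss4_detect(tx, qpsk, walsh)
```

In this noiseless usage the detector recovers both indices, so two SC bits and two SM bits are conveyed in a single interval with nsm=4.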
To summarize, SS1 uses the SM bits to permute nt number of signature filters that are then applied to a corresponding nt number of columns of Γ1 in each SM interval. SS2 applies nt number of signature waveforms to nt correspondingly numbered columns of Γ1 to form transmission symbols, and then uses the SM bits to permute the transmission symbols within each SM interval. SS3 is similar to SS1 but differs from it by selecting a subset of nt number of signature filters from a bank of nsm>nt number of possible/available signature filters for permutation and use as per equation (36) in a corresponding SM interval. SS4 does no permuting. Instead, SS4 uses the SM bits to select a signature filter/waveform from a bank of nsm number of possible/available signature filters. This selected signature filter/waveform is then applied to a single corresponding independent channel symbol in the corresponding SM interval, and typically all intervals are SM intervals. As discussed below in connection with the decoders that operate with various forms of signature stamping such as the examples SS1-SS4, if the SNR (and/or other channel impairments) permit, all intervals can be made to be SM intervals. Non SM intervals can be eliminated because of the presence of the explicitly transmitted SM bit information as described in connection with SS1-SS4.
To demonstrate how SS3 can be applied, consider an example where seven codewords of the (7,4) Hamming code make up a frame of K=7×7=49 coded bits. In this example, assume that BPSK signaling is used so that one bit is transmitted per symbol, so that a traditional symbol frame would require C=7×7=49 symbol intervals. In this example, there are three signature waveforms, pi, i=1, 2, 3, and for each respective one of the seven codewords in the frame, a respective two of these signature waveforms are to be selected to be applied to the 4th and 5th coded bits of each codeword in the frame. Note that there are (3)2=3×2=6>2^2 ways to select these signature waveforms on the 4th and 5th coded bits. This implies that in total it is possible to transmit two SM bits with every codeword. However, to meet the constraint on ds, the SM bits for use with each codeword in the frame should be selected from two other codewords. On average, then, five coded SC bits from every codeword need to be explicitly transmitted and two SM bits from each codeword can be explicitly transmitted as SM bits via signature stamping. As a result, this example only needs to transmit a symbol frame of BPSK symbols that has 7×5=35 binary symbol intervals. Because 1−35/49=0.2857, this provides roughly a 29% reduction in bandwidth to send the coded frame in this simple example.
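The arithmetic of this SS3 example can be reproduced with a few lines of code. The helper below is a hypothetical convenience, not part of the described apparatus; it simply counts the symbol intervals needed when some coded bits per codeword are carried as SM bits by signature selection rather than as explicitly transmitted SC bits.

```python
def symbol_frame_length(n_codewords, n_coded, sm_bits_per_cw, sc_bits_per_symbol):
    """Symbol intervals needed to carry only the SC bits of the frame."""
    sc_bits = n_codewords * (n_coded - sm_bits_per_cw)
    return -(-sc_bits // sc_bits_per_symbol)   # ceiling division

baseline = symbol_frame_length(7, 7, 0, 1)   # no SM bits, BPSK
stamped = symbol_frame_length(7, 7, 2, 1)    # two SM bits per codeword, BPSK
saving = 1 - stamped / baseline
```

Here `baseline` is 49, `stamped` is 35, and `saving` is 14/49, which is the roughly 29% reduction stated above.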
To demonstrate how SS4 can be applied, consider a C-TDM-SM example that uses as the outer code the (8,4) Hamming code, whose MHD is d0=4. In this example, SS4 is applied with nsm=4 (or SS3 with nt=1 and nsm=4). Also, QPSK will be used to transmit two SC bits in each time slot (which is also an nt=1 SM interval). In this example, all time slots are designated to correspond to SM intervals. If no SM bits were to be sent, then, on average, it would take four QPSK symbol intervals (time slots) to send each (8,4) codeword as SC bits. However, it is possible to additionally send two SM bits per time slot by selecting one of nsm=4 possible signature waveforms {pi, i=0, . . . 3} for use in each time slot. As a result, 2 SC bits and 2 SM bits can be sent every interval, thereby on average transmitting all coded bits of a codeword in just 2 QPSK intervals. For example, a frame of eight codewords could be constructed and four coded bits from each OBC codeword could be used for partial signature waveform selection in four different codewords to meet the constraint on ds. As a result, all eight coded bits of every codeword can be transmitted by only transmitting four bits explicitly as SC bits in two QPSK time slots, and the remaining four bits would be allocated as SM bits to be used for signature waveform selection as described above. Each frame of K=8×8=64 coded bits can thus be sent using a symbol frame consisting of C=64/4=16 transmitted QPSK symbols, instead of the 64/2=32 QPSK symbols needed when no SM bits are sent. This provides a 50% reduction in the required bandwidth to transmit the symbol frame that carries the information of the eight OBC codewords in this example, in effect making the rate 1/2 (8,4) code a full rate code.
As discussed before, signature waveforms can be selected in the form of baseband pulse shapes and/or filters. Another way to realize a bank of signature filters or waveforms is to embody them as a bank of constellations. That is, a set of modulators/mappers are used that map SC bits to a base constellation and use the SM bits to select a modulation variation of this base constellation. One way to generate such a bank of nsm number of constellations is to map the SC bits to a base constellation and then map the SM bits to a magnitude scaling factor ri and a phase shift, θi, (e.g., clockwise), for i=0, 1, . . . , nsm−1 to be applied to the base constellation to form a set of scaled and/or phase shifted replica constellations. In this example, i=0 corresponds to the base constellation (r0=1 and θ0=0), and (ri, θi), i=1, 2 . . . , (nsm−1), define a set of (nsm−1) scaled and phase shifted copies of the base constellation, all of which correspond to a set of replica constellations. Such a set of replica constellations can be overlaid and viewed as a larger overall constellation. For example, if the above approach is applied when the base constellation is a QPSK, 8-PSK or a 16-PSK constellation, then the copied and complex-scaled constellations will have all of their constellation points located on concentric rings of radius ri and phase shifted relative to the r0=1 ring by θi degrees or radians (e.g., clockwise). With these replicas, including the base constellation, the larger constellation will become a multi-ring QAM constellation. If preferred, all of the ri values can be set at 1, thereby only using the phase to differentiate the base and replica constellations. Design variables such as the total number of different constellations (nsm), the number of SM intervals, and the interleaving rule used can be chosen based on the SNR, frame size, and the other system parameters.
Some or all of these design variables can be selected by examining the EXIT charts and making selections to ensure that the corresponding SISO decoder will converge.
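The construction of a bank of replica constellations can be sketched as follows. The specific radii and phase shifts used here are illustrative assumptions chosen for the usage example (all radii equal to 1 and phase steps of π/8), not values prescribed by the text, and the function name is hypothetical.

```python
import cmath
import math

def constellation_bank(base, scales, phases):
    """One replica of `base` per (r_i, theta_i); theta_i in radians, applied clockwise."""
    return [[r * cmath.exp(-1j * th) * pt for pt in base]
            for r, th in zip(scales, phases)]

# Base QPSK on the unit circle; index 0 of the bank is the base itself (r0=1, theta0=0).
qpsk = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
bank = constellation_bank(
    qpsk,
    [1.0, 1.0, 1.0, 1.0],
    [0.0, math.pi / 8, math.pi / 4, 3 * math.pi / 8],
)

# Overlaying all replicas gives the larger overall constellation.
overall = [pt for replica in bank for pt in replica]
```

With these assumed parameters the SM bits select one of four rotated QPSK replicas and the overlay is a 16-point single-ring constellation; choosing distinct ri values instead would place each replica on its own ring.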
Note that an RGC mapping can be determined to map a group that contains nt number of subsets of SC bits and a corresponding subset of SM bits directly onto nt number of copies of the above-described type of larger constellation. In such an RGC mapping, the nt number of subsets of SC bits and a corresponding subset of SM bits are all associated with a respective SM interval. Once such an RGC mapping rule is available, the CICM design rules can be directly applied to find a permutation rule to permute the coded bits within a frame of coded bits onto a symbol frame in such a way as to meet the ds and Dmin2 constraints used in CICM interleaving. Alternatively, a constrained interleaver can be designed by using the EXIT charts and making selections to ensure that the corresponding SISO decoder will converge. Alternatively, the bank of constellations itself can be designed by using the EXIT charts and making selections to ensure that the corresponding SISO decoder will converge.
Such constellation coding schemes are similar to multi-dimensional constellation encoding schemes (as described in further detail below), where nt number of 2D constituent constellations are mapped onto a 2nt-dimensional constellation. The difference is that the number of possible valid multi-dimensional constellation points is determined by the subset of SM bits associated with each SM interval. This SM bit induced limitation in valid multi-dimensional constellation points can be exploited in the encoder by reducing memory requirements and in the decoder reducing complexity by limiting the number of possible multi-dimensional constellation points that need to be considered/checked.
Examples similar to the above two examples can be constructed for convolutional codes and other types of codes such as CTBC codes, turbo codes and LDPC codes, but the concepts and reductions are easier to understand with these simple examples that use simple block codes (e.g., the (7,4) and (8,4) Hamming codes) as the outer code. Also, even though the use of signature waveforms was discussed in connection with coded schemes that employ CICM interleaving, the same concepts can be used with any coded scheme. If a CICM interleaver cannot be constructed due to the unavailability of the P(d) table (as with most long LDPC codes), the columns of Γ1 and Γ2 can be constructed using the technique described above using EXIT charts and/or using computer based searching that randomly selects coded bits of the code for SM intervals and SM bits and loops to find an acceptable result in light of a performance criterion. In fact, many of the above-described uses of signature waveforms can even be used for the transmission of uncoded bits.
OFDM embodiments can also be readily constructed using frequency domain versions of equations (36)-(39) and/or any of the techniques associated with banks of constellations as discussed above. For example, by denoting in this context, k, to be the OFDM frequency index, frequency domain versions of equations (36)-(37) can be constructed where the signature filters can be viewed as a point-wise multiplication of each of a sequence of 2m-ary signal constellation points, {si}, by a respective complex-numbered frequency domain filter, [r(i,k), θ(i,k)], where r(i,k) and θ(i,k) are radius and phase values point-wise multiplied into the ith time slot of a selected SM interval. In this context, ψss1(i,k)=[r(φ,k), θ(φ,k)].*si and ψss2(φ,k)=[r(i,k), θ(i,k)].*si. Note that point-wise multiplication in the frequency domain may be viewed as cyclic-convolution based signature filtering in the time domain.
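The closing remark, that point-wise multiplication in the frequency domain corresponds to cyclic-convolution based filtering in the time domain, is the standard DFT convolution property and can be verified numerically. The sketch below uses a direct (slow) DFT to stay self-contained; the tone symbols and per-tone signature values are illustrative assumptions only.

```python
import cmath

def dft(x):
    """Forward DFT (no scaling)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse DFT with 1/n scaling."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def cyclic_convolve(a, b):
    """Length-n cyclic (circular) convolution of two equal-length sequences."""
    n = len(a)
    return [sum(a[m] * b[(t - m) % n] for m in range(n)) for t in range(n)]

tones = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]   # illustrative QPSK symbols on 4 tones
sig_f = [1.0, 0.8j, -1.2, -0.6j]             # illustrative complex value per tone

stamped_time = idft([s * g for s, g in zip(tones, sig_f)])
via_convolution = cyclic_convolve(idft(tones), idft(sig_f))
```

The two sequences `stamped_time` and `via_convolution` agree to within floating-point rounding, so stamping the tones in the frequency domain is equivalent to cyclically filtering the time-domain OFDM symbol with the inverse transform of the signature.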
To see an example of how the encoding can be performed, consider the same example where each frame contains K=8×8=64 coded bits, but this time, the frame is to be transmitted using OFDM with QPSK modulation on each tone. In this example, Γ1=Q1(k) has two bits per column and is mapped using CICM-QPSK, and Q2(k) also has two bits per column using the same approach as discussed above for the (8,4) Hamming code example with signature stamping. Following the above discussion on how to systematically construct any 2m+1=2M-ary PSK constellation, in this example the bit encoding of Q2(k) can be performed as follows:
The r-values could then be normalized to provide unit average power. In this example, SS4 is applied to the (8,4) extended Hamming code similarly to the previous C-TDM-SM example, but the pulse shaping selected by the two SM bits is applied in the frequency domain. However, it can be noted that this example can also be viewed as a 16-ary constellation with four rings at the radii listed in the above table, and with the added constellation points shifted by the phases listed in the table. In OFDM, the constant modulus property is not important, and such multi-ring constellations can be useful. The difference between the above example and traditional multi-ring constellation type embodiments as known in the art has primarily to do with the coding of the SC bits and the SM bits in accordance with one or more outer codes, followed by constrained interleaving of the SC bits and SM bits in accordance with Γ1 and Γ2 as described in accordance with CICM and/or other constrained interleaving strategies such as those based on EXIT charts. Also, the novel SISO decoder described below for use in SS1-SS4 embodiments would preferably be applied to decode this type of multi-ring QAM signal. Such encoding, constrained interleaving, and decoding would provide performance advantages of this type of multi-ring QAM embodiment over known multi-ring QAM constellation mapping, encoding and decoding techniques such as those based upon BICM or TCM.
Improvements to the above example would include using radii that have the same separation in terms of Dmin2. The CICM interleaver could be designed for any given set of radii. Also, a 4-ring 16-ary constellation that uses a QPSK constellation on each of the four rings of radii 0.6, 0.8, 1.0 and 1.2 could be used, and the circles could be rotated so that the constellation points are placed evenly at angles of π/8 radians or 22.5 degrees. In other examples, only a single radius r=1 may be used and all the constellation differentiation information can be represented by the phase. Another alternative embodiment of the above example would have just two rings, the second carrying a phase shifted version of the starting SC-bit based constellation. Then only one SM bit would be sent with 2 SC bits each interval or at least each SM interval. The number of SM bits can be increased if desired by increasing the number of rings and phase shifts beyond the four rings in the above example.
C-OFDM-SM embodiments can be further improved by designing the frequency domain, SM-bit-selected pulse shaping filters to be non-zero at more than a single tone frequency. Let sgi be an index defined for each SM interval that identifies level-1 sub-groups of tones within the SM interval. Embodiments can be constructed where the same selected parameters [r(sgi), θ(sgi)] are applied to each tone in an entire level-1 sub-group of tones, and no permutations are performed within the level-1 subgroups. The SM bits in this type of example are then used to determine which parameters [r(sgi), θ(sgi)] are applied to each level-1 sub-group and/or SM bits may optionally be used to permute the level-1 sub-groups as in SS2. That is, additional SM bits can optionally be used to select level-2 permutations and can also optionally be used to select level-3 and higher permutations as previously described. In other embodiments, the level-1 sub-groups need not be permuted, but the extra SM bits may be encoded into the sequence of the [r(sgi), θ(sgi)] parameters that are applied to the individual subgroups similar to SS1, SS3 and SS4. In still other embodiments, [r(sgi), θ(sgi)] can be applied to all the tones in each sgith sub-group, and then the tones within the sub-group can be permuted. All such variations and combinations are contemplated.
The act of multiplying the same parameters, [r(sgi), θ(sgi)], onto all of the tones in a sub-group effectively provides a square window noise averaging filter whose filter length is the size of the level-1 sub-group. The SC bits in these sub-groups need not be adjacent (in Γ1 or in the symbol frame) but rather the SC bits in each sub-group can be selected to be spread out in the symbol frame to combat the effects of local noise events, as is common in standard interleaving. This type of window averaging and optional spread-out subgroup selection improves the reliability of the estimates of the [r(sgi), θ(sgi)] parameters in the presence of channel noise and other impairments.
A similar approach is to point-wise multiply an SM-bit selected order of pre-determined sequence of the parameters {[r(sgi,k), θ(sgi,k)]} onto the tones within the sgith (non-permuted) level-1 sub-group in each corresponding SM interval. That is, each respective SM bit combination would be used to select a respective sequence {[r(sgi,k), θ(sgi,k)]} that would be point wise multiplied with a corresponding non-permuted level-1 subgroup. This way, the decoder can be looking for sequence based information, very similar to the original concept of e.g., pi(t) i=0, . . . , 3, and t=0, . . . 3, but now, for example, {[r(sgi,k), θ(sgi,k)]}, sgi=0, . . . 3, and k=0, . . . , 3. These complex-numbered sequences, {[r(sgi,k), θ(sgi,k)]} can preferably be chosen to be orthogonal to one another or to maximize the Dmin2 of the SM bits in Γ2. Again, additional SM bits can be used to specify level-2 and higher level permutations that involve permuting the order of the level-1 sub-groups themselves, or by using additional SM bits at the sequence level to permute the integer ring {sgi=0, . . . Nsg−1}, where Nsg is the number of level-1 subgroups per SM interval. This type of encoding and decoding can also be performed in the time domain where the columns of Γ1 correspond to a time slot index or other type of independent channel index as opposed to the frequency index, k. Similarly, if oversampling is used, any of the above-described time domain pulse shaping techniques of equations (36)-(38) can be directly applied in the frequency domain, although a larger FFT will be required.
To better understand how these techniques can be applied in
When signature-stamping is employed in the encoder and modulator, the received symbol frame will contain signature-stamped symbols or signature-stamped subsequences of symbols. This allows the SISO decoders 2200, 2530, 2535 to start to decode the SM bits from the very first iteration. In the discussion below, to keep the discussion of the decoder simple, and without loss of generality, it is assumed that signature stamping is being performed with the signature waveforms pi, for i=1, 2, . . . nt. However, it is to be understood in the discussion of the decoder below that these signature waveforms could also be applied using sub-groups, where the level-1 sub-groups contain less than nt number of independent channels. Also, in the discussion of the decoder below, any mention of the use of signature waveforms for signature-stamping of symbols also directly applies to embodiments where parameter sets or sequences of parameters such as [r(k), θ(k)], [r(sgi), θ(sgi)], {[r(sgi,k), θ(sgi,k)]}, or [r(t), θ(t)] or {[r(sgi,t), θ(sgi,t)]} are used in place of pi, for i=1, 2, . . . nt.
In a signature-stamped type decoder, a signature-stamped symbol frame is received and decoded. In such decoders, the constraint that no more than (d−1) coded bits of any weight d sequence can be SM bits is no longer needed because reliable enough extrinsic information of SM bits can be extracted directly from the received symbol frame. As an EXIT chart analysis would show, this pi, i=1, 2, . . . nt, information having been stamped onto the symbols in the received symbol frame (e.g., prior to permutation as per equation (37) or in any other way as per SS1-SS4) in accordance with the SM bits can be exploited to increase the number of SM intervals. In fact, if the SNR is high enough, all intervals can be made to be SM intervals.
If all the signatures pi, i=1, 2, . . . nt, are orthogonal to each other, then the SM bits in Γ2 can contribute a SED of 2a2 (similar to SM bits in SM-only embodiments). For this reason, the design of Γ will be exactly the same or very similar to the design of Γ in the above-described SM-only embodiments. If the signature waveforms are not orthogonal to each other, the above SED contributions will be modified based on their correlation. In fact, if the correlation between any two of the signature waveforms is negative, then the SED between these two signature waveforms can even be made higher than 2a2. In embodiments where signature-stamped symbols are used, two different outer codes can optionally be used as well. A first outer code can be used to encode the SC bits and a second and distinct outer code can be used to separately encode the SM bits. When using QPSK signaling, since SC bits have higher SED contributions (4a2 and 8a2), a more powerful outer code can be used for SM bits to achieve a higher Dmin2 value. This is a concept similar to the balanced distances rule in multilevel codes. Other criteria like the capacity criterion or the coding exponent criterion used with an MLC could optionally be used as an alternative to a balanced distance type approach when selecting the outer codes to be applied to the SC and SM bits or various rows of the associated interleavers Γ1 and Γ2. In some embodiments, if only a lower power code is used for SM bits, SC bits could be left uncoded. It is possible to select two or more separate levels of protection for SC and SM bits or subsets thereof by properly choosing the strengths of the two or more outer codes.
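The dependence of the SED on signature correlation can be made explicit with a short worked equation. As an assumption made only for this illustration, take the signatures to be real, unit-energy waveforms scaled by the amplitude a, and let λ_ij denote their normalized correlation (a symbol introduced here, not defined in the text above):

```latex
\begin{aligned}
\lVert a\,p_i - a\,p_j \rVert^2
  &= a^2\left(\lVert p_i\rVert^2 + \lVert p_j\rVert^2 - 2\langle p_i, p_j\rangle\right) \\
  &= 2a^2\,(1-\lambda_{ij}), \qquad \lambda_{ij} = \langle p_i, p_j\rangle .
\end{aligned}
```

Under these assumptions, orthogonal signatures (λ_ij=0) give the SED of 2a2 stated above, and a negative correlation gives a value larger than 2a2.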
Next consider the decoding of a symbol frame that has been constructed in accordance with any of SS1-SS4. All iterations of SISO decoding are conducted by following the same basic set of steps with any minor modifications as needed to account for differences among the SS1-SS4 embodiments. That is, in each SISO iteration, extrinsic information provided by the outer code for SC bits is used, extrinsic information of SM bits provided by the soft permutation decoder is used, and information extracted directly from the symbols in the received symbol frame is also used. In the first iteration, the extrinsic information of the SC bits and the SM bits from the outer code is initialized to zero. At the beginning of each SISO iteration, the inner code soft decoder starts off by extracting the bit metrics of SC bits, and then calculating the probability that the signature waveform pi, for each i=1, 2, . . . nt, was used by each symbol in a given SM interval after permutation. In this context, let k denote the column number of the matrix Γ1, where k=0, . . . , C-1, and this applies in both OFDM and time domain embodiments.
Let P(k,i) denote the probability that column k of Γ1 has been signature stamped by pi and let P(k,inot) denote the probability that column k has not been stamped by pi. Next define Lio(k,i) to be a log-likelihood ratio, where the subscript “io” denotes information transfer from the inner code to the outer code, k is the column index into Γ1, and i is the signature waveform index, pi, for each i=1, 2, . . . nt. Then Lio(k,i)=log(P(k,i)/P(k,inot)). With these definitions, the inner code soft decoder starts by computing the bit metrics of the SC bits in column k as in the earlier-described non-signature-stamped decoder embodiments. Such SC-only based bit metrics are computed without any knowledge of any decoded permutation information and without knowledge of any signature stamp related information. Next, the inner code soft decoder performs computations related to the columns of Γ1 (which correspond to the symbols in the received symbol frame and to the columns of Γ1 after permutation). That is, the previously computed SC bit metrics related to column k of Γ1 are used to perform a correlation of the kth received symbol with each pi, for each i=1, 2, . . . nt. In order for the inner code soft decoder to calculate the extrinsic information (the bit metrics) of the SM bits, the Lio(k,i) values as described above are calculated. An inner-code signature-stamp soft permutation decoder is implemented that operates similarly to the soft permutation decoder previously described in detail, but is configured to process the above-mentioned Lio(k,i) values. The signature-stamp soft permutation decoder calculates a metric for each combination of the SM bits that corresponds to a specific permutation of the columns of Γ1, and then calculates the extrinsic information of each SM bit using these metrics, which are then passed to the outer code decoder.
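The following Python fragment is a minimal sketch of the two computations just described: forming Lio(k,i) from P(k,i), and turning per-combination metrics into soft information for each SM bit. Several details are assumptions made for illustration and are not taken from the specification: P(k,inot) is taken as 1−P(k,i), the metric of a combination is taken to be the sum of the Lio(k,i) values it selects, the marginalization uses a log-sum-exp, and the `selections` table and the sign convention of the output are hypothetical.

```python
import math

def llr_io(p_col):
    """Lio(k,i) = log(P(k,i) / P(k,i_not)) for one column k.

    p_col[i] is P(k,i); P(k,i_not) is taken as 1 - P(k,i).
    """
    return [math.log(p / (1.0 - p)) for p in p_col]

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def sm_bit_soft_info(lio, selections):
    """Soft information for each SM bit from the Lio(k,i) values.

    lio[k][i]  : Lio(k,i) for column k and signature index i
    selections : hypothetical table mapping an SM-bit tuple to the signature
                 index used on each column, e.g. (0,): (0, 1)
    Returns log(P(bit=0) / P(bit=1)) per SM bit (assumed sign convention).
    """
    # Assumed additive metric for each SM-bit combination.
    metric = {bits: sum(lio[k][i] for k, i in enumerate(sel))
              for bits, sel in selections.items()}
    n_bits = len(next(iter(selections)))
    out = []
    for j in range(n_bits):
        zero = logsumexp([m for b, m in metric.items() if b[j] == 0])
        one = logsumexp([m for b, m in metric.items() if b[j] == 1])
        out.append(zero - one)
    return out
```

For example, with two columns, two signatures and one SM bit that either keeps or swaps the two signatures, a column-0 preference for the first signature and a column-1 preference for the second both push the SM bit toward the "keep" hypothesis.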
Using the bit metrics of SC bits passed by the inner code and the soft information of SM bits passed by the signature-stamp soft permutation decoder, the outer code soft decoder then calculates the extrinsic information of both SC bits and SM bits. When the columns of Γ1 are not permuted as per equation (37), but instead any of SS1, SS3 or SS4 was applied, then the signature-stamp soft permutation decoder of this decoder example (which is based on SS2) similarly computes the extrinsic information related to the permutation ordering (SS1, SS3) or selection of the signature waveforms (SS4).
During the second half (outer code decoding portion) of the SISO iteration, the outer code soft decoder uses the extrinsic information (bit metrics) related to all of the SC bits received from the inner code and the soft information of SM bits passed by the signature-stamp soft permutation decoder, and performs outer code soft decoding in the usual way in accordance with the selected outer code(s) to update the extrinsic information of both SC and SM bits. The extrinsic information of SC bits provided by the outer code is passed back to the inner code for the next iteration in the normal SISO decoding way. However, the extrinsic information of SM bits will now be used to calculate a set of Loi(k,i) values that are similar to the above-described Lio(k,i) values, but the “oi” subscript denotes extrinsic information transfer from the outer code to the inner code. The Loi(k,i) values are calculated similarly to the Lio(k,i) values, but using the SM bit metrics after being updated in accordance with an outer-code signature-stamp soft permutation decoder to calculate the soft information of different permutations or signature selections, e.g., in accordance with any of SS1-SS4. The outer-code signature-stamp soft permutation decoder calculates a metric for each combination of SM bits corresponding to a permutation or signature selection using the extrinsic information of the SM bits provided by the outer code. The extrinsic information about column k using pi is then extracted by subtracting the original Lio(k,i) value from the newly calculated Loi(k,i) as [Loi(k,i)−Lio(k,i)]. In essence, the information about SM bits is passed between the inner and outer codes in both directions (i.e., from the inner code soft decoder to the outer code soft decoder and from the outer code soft decoder to the inner code soft decoder) via the two signature-stamp soft permutation (or selection) decoders as in a double concatenation.
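The outer-to-inner direction can be sketched in Python as follows. This is only one plausible reading of the step described above: the way combination probabilities are built from per-bit values, the clamping constant, the `selections` table, and the sign convention log(P(bit=0)/P(bit=1)) are all assumptions introduced here for illustration.

```python
import math

def combo_probs(bit_llrs, selections):
    """Probability of each SM-bit combination from per-bit LLRs.

    bit_llrs[j] = log(P(bit j = 0) / P(bit j = 1)) (assumed convention).
    """
    probs = {}
    for bits in selections:
        p = 1.0
        for b, llr in zip(bits, bit_llrs):
            p0 = 1.0 / (1.0 + math.exp(-llr))
            p *= p0 if b == 0 else 1.0 - p0
        probs[bits] = p
    total = sum(probs.values())
    return {bits: p / total for bits, p in probs.items()}

def l_oi(bit_llrs, selections, n_cols, n_sigs):
    """Loi(k,i) = log(P(k,i) / P(k,i_not)) implied by the outer-code SM-bit LLRs."""
    probs = combo_probs(bit_llrs, selections)
    table = []
    for k in range(n_cols):
        row = []
        for i in range(n_sigs):
            p = sum(pr for bits, pr in probs.items() if selections[bits][k] == i)
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp so the log stays finite
            row.append(math.log(p / (1.0 - p)))
        table.append(row)
    return table

def extrinsic_oi(loi, lio):
    """Element-wise [Loi(k,i) - Lio(k,i)] for all columns k and signatures i."""
    return [[oi - io for oi, io in zip(row_oi, row_io)]
            for row_oi, row_io in zip(loi, lio)]
```

The subtraction in `extrinsic_oi` is the only part stated directly in the text; it removes what the inner code already contributed so that only new information is fed back.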
The signature-stamping technique allows for explicit transmission of SM bit information in addition to implicit SM bit information as in previously described embodiments. This explicit SM-bit-related information can result in an increased number of SM intervals without reducing the chances of convergence according to EXIT charts. While signature-stamping has been described mainly in the context of SISO decoding, signature-stamping can be used with MAP based soft decoding, optimal ML based hard decoding, or suboptimal (i.e., near-optimal/near-ML) hard decoding. For example, the hard decoded (7,4) Hamming code example with seven codewords per frame can be similarly improved by employing two signature waveforms for the fifth and sixth coded bits of every codeword of the Hamming code. Also, the signature-stamped embodiments described using the (8,4) Hamming code could be hard decoded or soft decoded in accordance with the general concepts illustrated by the various above-described examples. Any of the above aspects of the above-described decoders such as the Pyndiah type soft permutation decoders can be applied with signature stamping as well, although adapted to SS1-SS4 or similar type decoding. Similarly, the concepts of the above-described hard decoders can be applied with signature-stamp based decoding as described herein to form hard decoders to decode symbol frames to which signature stamping had been applied in the transmitter.
Also, many other kinds of embodiments are possible. For example, there are various forms of modulation known as “filter bank modulation” (FBM). See, for example, A. M. Tonello et al, “A novel multi-carrier scheme: cyclic block filtered multitone modulation,” IEEE ICC 2013, pp. 3856-3860, and the references cited therein. A type of modulation called filtered multitone (FMT) is described therein, and a cyclic-block FMT (CB-FMT) is also described. The present invention contemplates that all of the permutation based encoding and hard and soft permutation decoding techniques, including signature-stamped permutation based encoding and hard and soft permutation decoding, can be used with FMT, CB-FMT and other forms of FBM. This is because all of these FBM systems make use of multiple independent channels as defined herein, and FBM systems use filter banks that can be alternatively implemented using the signature filter banks as described in connection with
If, as discussed in further detail below, a 2Nd-dimensional mapping is used along with a corresponding CICM interleaver to map SC bits, then each column of Γ1 will correspond to Nd number of 2-dimensional symbol intervals. Hence, when SM bits are also transmitted with those SC bits, nt effectively becomes Nd×nt. Hence, more SM bits can be transmitted during an SM interval in the SS1-SS3 techniques. In the case of SS4, if desired, (a) the same selected signature can be applied to an entire column of Γ1, or (b) Nd number of signatures can be applied to each 2D sub-portion along each column. However, due to the multidimensional signaling, the inner decoding (i.e., updating bit metrics) requires consideration of a 2^(m×Nd)-ary constellation, thereby increasing the complexity.
Reverse Gray coding (RGC) and CICM were introduced above for mapping the coded bits of an outer code onto a 2^m-ary constellation. In RGC the goal was to achieve the highest possible separation between constellation points that differ by one bit and then to use lower separations between constellation points that differ by a higher number of bits. If the constellation size permits, the separation between constellation points is lowered progressively as the number of bits that differ increases. So far, RGC was discussed for mapping onto a 2-dimensional (2D) constellation. Higher order constellations can be formed by using two or more 2D constellations as constituent constellations. When ND-number of 2D symbols are joined together into an ND-tuple, this causes 2ND-dimensional symbols to be formed on a 2ND-dimensional signal constellation. Once the 2ND-dimensional constellation is built up as an ND-tuple of selected 2D constituent constellations, the RGC design rules will be applied to determine the RGC mapping for this 2ND-dimensional constellation. Next the CICM design rules will be applied to design a CICM interleaver, Γ, for use with the resulting 2ND-dimensional RGC-mapped constellation.
To understand the advantages of using higher-dimensional constellations with RGC and CICM, consider the 2D RGC QPSK constellation that was developed in detail above in connection with
Next consider how to construct a 4D RGC constellation that uses QPSK as the constituent constellations. That is, two QPSK constellation points are used to construct a single 4D symbol. With this construction there will be a total of four dimensions and sixteen possible 4D symbols, so that the 4D constellation is a 16-ary constellation. The RGC mapping rules can next be applied to find an RGC mapping to map coded bits directly onto this 16-ary 4D constellation. For example, all one-bit differentials from (0000), which are (0001), (0010), (0100) and (1000), can be separated on this 4D constellation by the highest possible separation. The SED contribution of the highest possible separation on the 4D constellation is 4×4a2=16a2 as opposed to 8a2 on the 2D constellation. The next highest SED contribution on the 4D constellation is 3×4a2=12a2. As a result, the resulting 4D RGC QPSK constellation will be able to achieve a Dmin2 value given by Dmin2=(16a2+3×12a2)=52a2, which represents an additional 160% increase over the Dmin2=20a2 that could be achieved with 2D RGC QPSK.
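The distance figures quoted above can be checked numerically with a short Python sketch. It assumes, purely for illustration, that the QPSK constituent points sit at (±a, ±a) with a=1, so that all values are in units of a squared. The expression `largest + 3 * second` simply mirrors the expression in the text, and the 2D value of 20 is taken from the text rather than derived here.

```python
from collections import Counter
from itertools import product

a = 1  # amplitude; all distances below are in units of a^2

# 2D QPSK constituent points, assumed to lie at (+/-a, +/-a).
qpsk = [(x, y) for x in (a, -a) for y in (a, -a)]

# 4D constellation: every ordered pair of QPSK points (4 coordinates each).
points_4d = [p + q for p, q in product(qpsk, repeat=2)]

def sed(u, v):
    """Squared Euclidean distance between two points."""
    return sum((ui - vi) ** 2 for ui, vi in zip(u, v))

# Distance spectrum seen from one reference point.
ref = points_4d[0]
spectrum = Counter(sed(ref, p) for p in points_4d if p != ref)

largest, second = sorted(spectrum, reverse=True)[:2]
d2_4d = largest + 3 * second   # mirrors 16a2 + 3*12a2 in the text
d2_2d = 20                     # 2D RGC QPSK value quoted in the text
increase = 100 * (d2_4d - d2_2d) / d2_2d
```

Under these assumptions the two largest separations on the 4D constellation are 16 and 12 (in units of a squared), giving 52 and a 160% increase over 20, in agreement with the figures above.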
A CICM interleaver can be designed for use with the above 4D RGC QPSK using a normal CICM interleaver with four rows. Each column of that CICM interleaver would correspond to a constellation point on the above-described 4D 16-ary constellation. If desired, the above example can be extended to 6 dimensions by using a triplet of 2D QPSK constituent constellation points. This would create a 4^3=64-ary 6D constellation and would provide an even higher value of Dmin2. However, the mapping and the CICM interleaver design would require 6 rows and likely a higher frame size.
Recall that CICM interleavers with four rows each were designed above to map coded bits onto a 2D 16-QAM constellation and onto a 2D 16-PSK constellation. Compared with the above 4D RGC QPSK system, these 2D 16-QAM and 2D 16-PSK systems are twice as bandwidth efficient because they transmit four bits in every 2D signaling interval, as opposed to 4D RGC QPSK, which only transmits two bits in every 2D signaling interval. However, either of the above-described 2D 16-QAM or 2D 16-PSK constellations can be used as constituent constellations to form higher dimensional constellations. For example, a 4D 16-PSK constellation can be built up as a 2-tuple of 2D 16-PSK constituent constellations. The RGC design rules would then be applied to determine the RGC mapping for this 4D, 16×16=256-ary constellation. The CICM design rules would then be applied to design an eight-row CICM interleaver for this 4D RGC-mapped 256-ary constellation. In general, for a 2ND-dimensional constellation formed as an ND-tuple of constituent constellations, the CICM interleaver will require ND times the number of rows as required by the 2D constituent constellations. Once such a CICM interleaver is designed to achieve a symbol Hamming distance of ds at the 2ND-dimensional symbol level, the value of ds will automatically be maintained at the 2D constituent constellation level as well.
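The sizing rules in the preceding paragraphs can be collected into a small Python helper. The function name and dictionary keys are hypothetical; the sketch assumes the constituent constellation size is a power of two.

```python
def multidim_parameters(m_2d, n_d):
    """Size and CICM row count for an ND-tuple of M-ary 2D constellations.

    m_2d : number of points in the 2D constituent constellation (power of 2)
    n_d  : number of 2D constituent symbols joined into one symbol (ND)
    """
    bits_2d = m_2d.bit_length() - 1  # log2 of a power of two
    return {
        "dimensions": 2 * n_d,            # 2ND-dimensional constellation
        "points": m_2d ** n_d,            # total constellation size
        "cicm_rows": n_d * bits_2d,       # ND times the 2D row count
        "bits_per_2d_interval": bits_2d,  # unchanged by the tuple construction
    }
```

For instance, `multidim_parameters(4, 2)` describes the 4D RGC QPSK case (16 points, four rows), and `multidim_parameters(16, 2)` describes the 4D 16-PSK case (256 points, eight rows).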
When RGC and CICM are used to design systems that involve the use of 2ND-dimensional constellations, such systems can be used with or without SM bits. When transmitting additional SM bits, the 2ND-dimensional RGC-mapped constellation is preferably used to transmit SC bits as per Γ1. A subset of nt×ND number of SM bits will be needed for each group of nt×ND number of columns of Γ1, because each nt×ND number of columns of Γ1 corresponds to a group of nt number of 2ND-dimensional channel symbols. The SM bits of Γ2 can be transmitted as described above in connection with any of the embodiments discussed of
Finally, it is noted that while all of C-IC-SM, C-OFDM-SM, and C-TDM-SM have been developed as special cases of the C-MIMO-SM embodiments presented herein and as illustrated in
The user terminal 2710 similarly includes a physical layer interface 2732, a protocol stack 2734 and an application layer module 2736 which may include user interface devices as well as application software. The user terminal 2710 also may optionally include a packet processor 2738 which can be connected to a local area network, for example. The user terminal 2710 may also act as an IP switching node or router in addition to user functions in some embodiments.
Another type of embodiment replaces the headend system 2705 with another user device 2710 in which case direct peer-to-peer communications is enabled. In many applications, though, the headend can act as an intermediary between two user devices to enable indirect peer-to-peer communications using the same headend-to/from-user device uplink/downlink architecture illustrated in
In a preferred embodiment as directly illustrated by
Another aspect of the present invention contemplated by
Although the present invention has been described with reference to specific embodiments, other embodiments may occur to those skilled in the art without deviating from the intended scope. Figures showing block diagrams also identify corresponding methods as well as apparatus. All transmitted signals shown in the Figures can be applied to various types of systems, such as cable modem channels, digital subscriber line (DSL) channels, individual orthogonal frequency division multiplexed (OFDM) sub-channels, wireless channels, SM and MIMO channels, optical channels and the like. In general, more than two component codes can be concatenated together, and embodiments can be created that mix parallel and serial concatenation to form mixed parallel/serial concatenated codes. In such cases the constrained interleaving can be performed on any component-encoded or concatenated encoded bit stream to be interleaved within the mixed encoder structure to satisfy a constraint that is designed to jointly optimize or otherwise improve bit error rate performance by jointly increasing a measure of minimum distance and reducing the effect of one or more dominant error coefficients of the mixed encoded bit stream. The concepts presented herein can be extrapolated to these higher order cases by induction. This patent application contains various block diagrams and flow charts. It is to be understood that sub-portions of any of the block diagrams or flow charts can be used to extract apparatus, systems and methods that correspond to just the sub-portion of the block diagram or flow chart. Block diagrams in many cases can be indicative of all of methods, apparatus, and systems. Also, it is understood that an inner code in a concatenation can be replaced in many cases by a modulator such as a TCM, BICM, or CICM. That is, a serial concatenated code may be formed by an outer encoder followed by a constrained interleaver, followed by a signal mapper such as TCM, BICM, or CICM. Such embodiments of CTBC codes are contemplated herein.
Also
Also, it is to be noted that much of the description herein relates to computer, digital communications, and digital signal processing technology, and all of the block diagrams and flowcharts and related description herein can, in whole or in part, be implemented using processor technology. For example, apparatus and systems can comprise one or more processors coupled to one or more memories, and also coupled to other input/output devices such as channel interfaces, line interfaces, communication protocol stack upper layers, user interfaces, user input/output devices, switching fabrics, OTN backbone links, optical LAN interfaces, and the like. In such systems, instructions can be stored in the one or more memories to cause one or more functional units in one or more of the processors to carry out actions or steps to implement any aspects of the block diagrams or flow charts herein. Also, special hardware can be hardwired, so that no instruction stream is needed to carry out certain actions such as highly repetitive/periodic processing. In such cases microsequencing logic can be built into dedicated control circuits to cause the hardware to loop through each frame of encoding, decoding, modulation, demodulation, and the like. The apparatus, systems and methods presented herein can be configured to perform computerized sequences of operations; however, the operations themselves are provided to solve problems that are necessarily rooted in computer and electronic communications technology in order to overcome specific problems that specifically arise in the realm of computer networks, local area networks, wide area networks, link layer communications, and physical layer communications. For example, errors naturally occur due to noise, distortion, and other impairments physically introduced by a communication channel. The techniques developed herein provide solutions to recovering a message sequence at a receiver with error recovery and error avoidance in light of these physical technology-induced channel impairments.
Finally, it is recalled that U.S. Pat. No. 8,537,919 and U.S. Pat. No. 8,532,209, by the same inventors and dealing with constrained interleaving related technology, are incorporated herein by reference. In these incorporated-by-reference patents, CI-1 and CI-2 are presented. Likewise, a number of specific systems are presented therein, such as constrained turbo product codes (both the outer code and the inner codes are block codes), multiple concatenations, and the like. Hence it is to be understood that the present invention also contemplates modifying any specific embodiment (e.g., block diagram, flow chart, or written description portion) of these incorporated-by-reference patents by making any modification as disclosed in the instant patent application. For example, any place a signal mapper is discussed in the incorporated-by-reference patents, CICM or a version of CICM-SM or CICM-MIMO would be used as the signal mapper. Any time BICM is mentioned in the incorporated-by-reference patents, CICM could be substituted to obtain an embodiment in accordance with the present invention. Likewise, any time CI-1 or CI-2 is mentioned in any disclosed embodiment in these incorporated-by-reference patents, a new embodiment in accordance with the present invention could be obtained by specifically reciting the new specific species SRCI, CI-3, or CI-4 of the more general genus of inventions, CI, as disclosed in these incorporated-by-reference patents.
As a specific example of how this would occur in practice, consider the steps involved in the construction of TPCs (turbo product codes) that are constructed in accordance with CI-1, which used a constrained interleaver design matrix with randomization along the rows and columns. Note that the CI-1 interleaver matrix ensures that every coded bit of a codeword of the OBC is fed into different codewords of the IBC (inner block code). In addition, the randomizations along all L=ki rows and then all nρ′ columns guarantee that coded bits are placed with the highest possible level of randomness, allowing any coded bit of any OBC codeword to be placed anywhere in the interleaved sequence u subject to the above constraint. In other words, a TPC designed according to CI-1 uniformly randomizes positions subject to the constraint that no two coded bits of any codeword of the OBC are allowed to be fed into the same codeword of the IBC. With that observation and with the intention of feeding blocks of ki bits of interleaved bits into the IBC, the SRCI counterpart can be designed by considering a block structure in the interleaver and constraining that any two coded bits of a codeword of the OBC cannot be placed in the same block of ki bits of the interleaved sequence. Hence, in SR-CTPC, for every coded bit cjt, which is at position i=(jn+t) on c, where j=0, 1, . . . , (ρ−1) and t=0, 1, . . . , (n−1), an interleaved position π(i) can be found on u by using the following steps:
(a) for each q, 0≦q≦p, the restricted zone is from X(q) to Y(q) (including X(q) and Y(q)) on u, where X(q)=ki⌊π(q)/ki⌋ and Y(q)=X(q)+ki−1, and
(b) randomly select a position among the remaining vacant positions on u as π(i). In order to treat all ρ codewords in the same manner, every selected coded bit position p (0≦p<n) of all codewords can be placed on u, one coded bit position at a time starting from p=0 and moving up to p=(n−1).
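Steps (a) and (b) can be sketched in Python as follows. This is an illustrative reading rather than the specification's procedure: the restricted zones for a coded bit are taken to be the blocks of ki positions already occupied by previously placed bits of the same codeword, and the restart on a dead end, the seed, and all function and parameter names are assumptions added so that the sketch always terminates with a valid placement or an explicit error.

```python
import random

def srci_permutation(rho, n, k_i, seed=0, max_attempts=1000):
    """Place rho codewords of n coded bits so that no two bits of the same
    codeword land in the same block of k_i consecutive positions of u.

    Returns pi as a list with pi[j*n + t] = position on u of coded bit c_jt.
    """
    total = rho * n
    if total % k_i != 0 or total // k_i < n:
        raise ValueError("need rho*n divisible by k_i and at least n blocks")
    rng = random.Random(seed)
    for _ in range(max_attempts):
        pi = [None] * total
        vacant = set(range(total))
        used_blocks = [set() for _ in range(rho)]  # restricted zones per codeword
        ok = True
        for t in range(n):          # one coded-bit position at a time
            for j in range(rho):    # the same position of every codeword
                allowed = sorted(u for u in vacant
                                 if u // k_i not in used_blocks[j])
                if not allowed:     # dead end: restart (assumed strategy)
                    ok = False
                    break
                pos = rng.choice(allowed)
                pi[j * n + t] = pos
                vacant.remove(pos)
                used_blocks[j].add(pos // k_i)
            if not ok:
                break
        if ok:
            return pi
    raise RuntimeError("no valid placement found; try another seed")
```

A call such as `srci_permutation(rho=8, n=7, k_i=4)` returns a permutation of the 56 positions in which the seven coded bits of each codeword fall in seven different blocks of four positions.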
As another example, consider FIG. 18 of U.S. Pat. No. 8,532,209. Using this figure, the present invention would include an embodiment that could be described as having all of blocks 1810, 1010A and 1820 specifically recited to have their interleavers implemented using a SRCI such as CI-3 and/or CI-4. While U.S. Pat. No. 8,532,209 described the CI genus, it only described the CI-1 and CI-2 species. Hence while inventions in U.S. Pat. No. 8,532,209 could be described to recite embodiments using the CI genus, the current patent application specifically contemplates all recitable inventions in U.S. Pat. No. 8,532,209, but with the specific new SRCI, CI-3 and CI-4 species. Also and alternatively, for example, in the class of concatenation shown in FIG. 18 of U.S. Pat. No. 8,532,209, the blocks 1820 and 1825 can be implemented using a CICM signal mapper that comprises a CICM permutation 1820 followed by a constellation mapper 1825 that acts as the inner code in a double concatenation. It is also noted that r2p2 in FIG. 18 of U.S. Pat. No. 8,532,209 may be set to one. All such variations and embodiments are specifically contemplated by the present invention.
It can also be noted that in all of the optical and non-optical SM embodiments discussed herein, instead of jointly SISO decoding both the spatial constellation point (i.e., the channel number) and the signal constellation point, a two-step process may be used instead. That is, a first detector/channel estimator can be used to identify one or a sequence of channel numbers through which the SM signal was transmitted in a given one or a sequence of symbol intervals, and a second detector/decoder can be used to estimate one or a sequence of signal constellation points. For example, the SISO decoder can be broken up into first and second SISO decoders (spatial constellation decoder followed by signal constellation decoder), or other types of arrangements can be used. For example, a hard decoder or a hard iterative decoder can be used to estimate the sequence of spatial constellation points (sequence of channel numbers) and a SISO decoder can then be used to estimate the sequence of signal constellation points. Other types of arrangements that iterate between these two decoders can also be configured. Also, a channel pre-estimator portion could be used to narrow the search space and simplify the complexity of the SISO decoder. For example, if there are a total of 256 possible channels through which the SM signal can be transmitted each symbol interval, the channel pre-estimator could be configured to identify a sequence of the 16 or fewer most likely channels through which the SM signal was transmitted each symbol interval over a given frame interval. A reduced-complexity joint spatial-signal constellation SISO decoder could then be configured to use this channel pre-estimation information to narrow the search space when iterating to jointly find the sequence of spatial and signal constellation points that were transmitted during each symbol interval. To simplify further, for example, if out of the 16 channel estimates, only four are above a threshold during a given symbol interval, the joint SISO decoder could reduce its complexity further by only operating on the metrics that are above the threshold. All such variations are contemplated by the present invention.
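The search-space narrowing described above can be sketched in a few lines of Python. The function name, the form of the per-channel metric (any value where larger means more likely), and the default of 16 candidates are illustrative assumptions; the specification does not prescribe how the pre-estimator computes its metrics.

```python
def pre_estimate(channel_metrics, keep=16, threshold=None):
    """Return the indices of the most likely channels for one symbol interval.

    channel_metrics : one likelihood-type metric per possible channel
    keep            : maximum number of candidate channels to retain
    threshold       : if given, also drop candidates whose metric is not above it
    """
    ranked = sorted(range(len(channel_metrics)),
                    key=lambda ch: channel_metrics[ch], reverse=True)
    candidates = ranked[:keep]
    if threshold is not None:
        candidates = [ch for ch in candidates if channel_metrics[ch] > threshold]
    return candidates
```

A joint SISO decoder built along these lines would then evaluate only the returned channel indices in each symbol interval, for example 16 of 256 channels, or only the four whose metrics exceed the threshold.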
It is therefore noted that any specific embodiment recited in any specifically drafted claim is what governs the claim scope of all recited claims in this application and any continuations, divisionals, or international filings derived herefrom. The disclosure provided herein is meant to explain how to construct all of these voluminous different types of embodiments and to explicitly show one of ordinary skill in the art how to readily construct them using standard levels of engineering creativity and engineering know-how as would be expected by one of ordinary skill in the art. The recited claims are provided to identify the scope of the claimed inventions.
This application is a continuation-in-part of co-pending U.S. patent application Ser. No. 14/545,588, entitled “Constrained interleaving for 5G wireless and optical transport networks,” filed May 27, 2015.
| | Number | Date | Country |
|---|---|---|---|
| Parent | 14545588 | May 2015 | US |
| Child | 14756308 | | US |