Digital modulation systems use a finite number of distinct signals to represent digital data. Phase-shift keying (PSK) is a digital modulation scheme that conveys data by changing, or modulating, the phase of a reference signal (the carrier wave). PSK systems use a finite number of phases, each assigned a unique pattern of binary bits. Each pattern of bits forms a symbol that is represented by the particular phase. A demodulator, which is designed for the symbol-set used by a modulator, determines the phase of the received signal and maps it back to the symbol it represents, thus recovering the original data. This requires the receiver to be able to compare the phase of a received signal to a reference signal.
Alternatively, instead of using the bit patterns to set the phase of a wave, a PSK system can instead change the phase of the wave by a specified amount. The demodulator determines the changes in the phase of a received signal rather than the phase itself. Since such a system relies on the difference between successive phases, it is termed differential PSK (DPSK). DPSK can be significantly simpler to implement than ordinary PSK since there is no need for the demodulator to have a copy of the reference signal to determine the exact phase of the received signal (it is a non-coherent scheme), but produces more erroneous demodulations.
Binary PSK (BPSK) uses two phases which are separated by 180°. Constellation points can be treated as if at 0° and 180° on an axis because it does not particularly matter exactly where the constellation points are positioned. Variations include quadrature PSK (QPSK), offset QPSK (OQPSK), π/4-QPSK, higher order PSK, etc.
Quadrature amplitude modulation (QAM) is both an analog and a digital modulation scheme. QAM conveys two analog message signals, or two digital bit streams, by modulating the amplitudes of two carrier waves, using the amplitude-shift keying (ASK) digital modulation scheme or amplitude modulation (AM) analog modulation scheme. These two waves are out of phase with each other by 90° and are thus called quadrature carriers or quadrature components. The modulated waves are summed, and the resulting waveform is a combination of both PSK and amplitude-shift keying, or in the analog case of phase modulation (PM) and AM. In the digital QAM case, a finite number of at least two phases and at least two amplitudes are used. PSK modulators are often designed using the QAM principle, but are not considered as QAM since the amplitude of the modulated carrier signal is constant.
A low-density parity-check (LDPC) code is a linear error correcting code, a method of transmitting a message over a noisy transmission channel, and is constructed using a sparse bipartite graph. LDPC codes are capacity-approaching codes, which means that practical constructions exist that allow a noise threshold to be set very close to the theoretical maximum (the Shannon limit) for a memoryless channel. The noise threshold defines an upper bound for the channel noise up to which the probability of lost information can be made as small as desired. Using iterative belief propagation techniques, LDPC codes can be decoded in time linear in their block length. R. Gallager, Low-Density Parity-Check Codes, Number 21 in Research Monograph Series, MIT Press, Cambridge, Mass., 1963, is incorporated by reference.
The following is described and illustrated in conjunction with systems, tools, and methods that are meant to be exemplary and illustrative, not limiting in scope. Techniques are described to address one or more deficiencies in the state of the art.
A technique for low-density parity-check (LDPC) coding involves utilizing a fixed point implementation in order to reduce or eliminate reliance on floating point operations. The fixed point implementation can be used to calculate check node extrinsic L-value as part of an LDPC decoder in an LDPC system. The technique can include one or more of linear approximations, offset approximations, and node-limiting approximation. A system constructed according to the technique implements one or more of linear approximations, offset approximations, and node-limiting approximation.
Examples of the claimed subject matter are illustrated in the figures.
In the following description, several specific details are presented to provide a thorough understanding of examples of the claimed subject matter. One skilled in the relevant art will recognize, however, that one or more of the specific details can be eliminated or combined with other components, etc. In other instances, well-known implementations or operations are not shown or described in detail to avoid obscuring aspects of the claimed subject matter.
A station, as used in this paper, may be referred to as a device with a media access control (MAC) address and a physical layer (PHY) interface to a wireless medium that complies with the Institute of Electrical and Electronics Engineers (IEEE) 802.11 standard. In alternative embodiments, a station may comply with a different standard than IEEE 802.11, or no standard at all, may be referred to as something other than a “station,” and may have different interfaces to a wireless or other medium. IEEE 802.11a-1999, IEEE 802.11b-1999, IEEE 802.11g-2003, IEEE 802.11-2007, and IEEE 802.11n-2009 are incorporated by reference. As used in this paper, a system that is 802.11 standards-compatible or 802.11 standards-compliant complies with at least some of one or more of the incorporated documents' requirements and/or recommendations, or requirements and/or recommendations from earlier drafts of the documents.
The system 100 is depicted in the example of
In the example of
In the example of
An (N,K) linear block code can be specified by its K×N generator matrix G ε{0, 1}K×N or by its M×N parity check matrix H ε{0, 1}M×N, where M=N−K. The rate of the code is R=K/N. Let
u=[u1, u2, . . . , uK]ε{0, 1}K (1)
denote a message to be encoded (information bits). The codeword c=[c1, c2, . . . , cN]ε{0, 1}N corresponds to the message u, and the parity check equation is HcT=0, where the matrix multiplication is in GF(2). The parity check matrix can be split into two parts, H1 and H2, according to
The parity check equation can then be written as
Equations (1) and (2) yield the expression pT=H2−1H1uT. Hence, the systematic form of G can be expressed in terms of H1 and H2:
where I is the identity matrix of size K×K. The codeword can be generated by c=uG, where the first K bits in c are equal to u and the last M=N−K bits in c are the parity bits c=[u, p]=[u1, u2, . . . , uK, p1, p2, . . . , pM]ε{0, 1}N.
Before transmission, a codeword is mapped to a symbol constellation, e.g., BPSK, QPSK, QAM, etc. Assuming BPSK symbols Xε{−1, +1}N, X is defined as 2c−1. The symbols are transmitted over an additive white Gaussian noise (AWGN) channel, where the received matched filter outputs are Y=aX+σW. The elements in W are zero-mean Gaussian random variables with unit variance, a is the received signal amplitude, and σ is the standard deviation of the additive noise. Let Es and Eb denote the symbol and information bit energy, respectively. This means that the amplitude is a=(Es)1/2=(REb)1/2. Let the variance σ2=N0/2, where N0 is the one-sided power spectral density of the additive noise.
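As an illustration of the channel model, the following C sketch maps one codeword bit to a BPSK symbol, passes it through the AWGN channel Y=aX+σW, and forms a channel L-value. The Box-Muller noise generator and the L-value scaling 2aY/σ2, the standard expression for a BPSK-AWGN channel, are assumptions for illustration and are not taken from this paper.

#include <math.h>
#include <stdlib.h>

/* Map a codeword bit c in {0,1} to a BPSK symbol x = 2c - 1, pass it
 * through an AWGN channel y = a*x + sigma*w, and return the channel
 * L-value. The scaling 2*a*y/sigma^2 is the standard BPSK-AWGN
 * expression (an assumption here, not quoted from this paper). */
double awgn_llr(int c, double a, double sigma)
{
    /* Box-Muller: w is a zero-mean, unit-variance Gaussian sample. */
    double u1 = (rand() + 1.0) / ((double)RAND_MAX + 2.0);
    double u2 = (rand() + 1.0) / ((double)RAND_MAX + 2.0);
    double w  = sqrt(-2.0 * log(u1)) * cos(2.0 * 3.14159265358979 * u2);

    double x = 2.0 * c - 1.0;             /* BPSK symbol X = 2c - 1 */
    double y = a * x + sigma * w;         /* matched filter output  */
    return 2.0 * a * y / (sigma * sigma); /* channel L-value Ln     */
}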
A parity check matrix can also be visualized by a Tanner graph with variable nodes (on the left) representing the coded bits and check nodes (on the right) representing the parity checks. A Tanner graph is a bipartite graph used to specify constraints or equations that specify error correcting codes. Tanner graphs can be used to construct longer codes from smaller ones. A Recursive Approach to Low Complexity Codes, Tanner, R., IEEE Transactions on Information Theory, Volume 27, Issue 5, pp. 533-547 (September 1981) is incorporated by reference.
Channel L-values can be denoted by Ln and come from the channel to the variable node edges of a Tanner graph. The L-values generated by variable node, n, for check node, m, can be denoted by Vm,n. The L-values generated by check node, m, for variable node, n, can be denoted by Cm,n. Let dv(n) denote the degree of (number of edges connected to) variable node, n, and dc(m) the degree of check node, m. When dv(n) and dc(m) are not the same for all n and m, the code can be referred to as a variable and check irregular code. A variable regular code, on the other hand, has dv(n)=dv for all n=1, 2, . . . , N. A check regular code has dc(m)=dc for all m=1, 2, . . . , M.
The 12 LDPC codes used in 802.11n are variable and check irregular codes. Each of the parity-check matrices for the LDPC codes used in IEEE 802.11n can be partitioned into square subblocks (submatrices) of size Z×Z. These submatrices are either cyclic-permutations of the identity matrix or null submatrices. The cyclic-permutation matrix xk is obtained from the Z×Z identity matrix by cyclically shifting the columns to the right by k elements. The matrix x0 is the Z×Z identity matrix. The H1 matrices for the 12 LDPC codes in 802.11n are all different (see IEEE 802.11n-2009 for the matrix prototypes for all 12 LDPC codes in 802.11n). However, H2 (and consequently H2−1) only depend on the code rate R.
In the example of
In the example of
In the example of
In the example of
Precoding, as used in this paper, is used in conjunction with multi-stream transmission in MIMO radio systems. In precoding, the multiple streams of the signals are emitted from the transmit antennas with independent and appropriate weighting per antenna such that some performance metric, such as the link throughput, is maximized at the receiver output. Note that precoding may or may not require knowledge of channel state information (CSI) at the transmitter. Some benefits of precoding include increasing signal gain on one or more streams through diversity combining, reducing delay spread on one or more streams, and providing unequal signal-to-noise ratio (SNR) per stream for different quality of service (QoS) requirements.
Beamforming, as used in this paper, is a special case of precoding for a single-stream so that the same signal is emitted from each of the transmit antennas with appropriate weighting such that some performance metric such as the signal power is maximized at the receiver output. Some benefits of beamforming include increasing signal gain through diversity combining and reducing delay spread.
A MIMO antennae configuration can be used for spatial multiplexing. In spatial multiplexing, a high rate signal is split into multiple lower rate streams, which are mapped onto the Tx antennae array. If these signals arrive at an Rx antennae array with sufficiently different spatial signatures, the receiver can separate the streams, creating parallel channels. Spatial multiplexing can be used to increase channel capacity. The maximum number of spatial streams is limited by the lesser of the number of antennae at the transmitter and the number of antennae at the receiver. Spatial multiplexing can be used with or without transmit channel knowledge.
In the example of
The encoding of the LDPC codes used in, for example, 802.11n can be expressed recursively due to the structured form of H2−1. The LDPC codes in 802.11n have Zε{27, 54, 81} and Sε{12, 8, 6, 4}, such that N=24Z and M=SZ. The encoding can be expressed as
wT=H1uT=[w1, w2, . . . , wS]T, (3)
pT=H2−1wT=[p1, p2, . . . , pS]T, (4)
where wk and pk are binary row vectors of size Z for k=1, 2, . . . , S.
The matrix multiplication in (3) can easily be implemented since the nonzero entries of H1 are Z×Z right circulant permutation matrices. Each wi is obtained by cyclically shifting each uj by the corresponding power of x in the (i, j) entry of the matrix prototype H1 and summing the results over j. If the (i, j) entry of the matrix H1 is zero, there is no contribution of uj to the sum. This can be implemented by, for example, barrel shift registers.
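A minimal C sketch of this shift-and-accumulate computation of w in (3) follows. The prototype convention (−1 marking a null subblock, otherwise the cyclic shift k) mirrors the usual 802.11n matrix prototype notation; the dimensions and names are illustrative assumptions.

#include <string.h>

#define Z  27    /* subblock size, Z in {27, 54, 81}                   */
#define S  12    /* number of block rows in H1                         */
#define KB 12    /* number of block columns; K = KB*Z information bits */

/* proto[i][j] = -1 for a null Z x Z subblock, otherwise the cyclic
 * shift k of the identity matrix (the power of x in the prototype). */
void encode_w(const int proto[S][KB], const unsigned char u[KB * Z],
              unsigned char w[S * Z])
{
    memset(w, 0, S * Z);
    for (int i = 0; i < S; i++) {
        for (int j = 0; j < KB; j++) {
            int k = proto[i][j];
            if (k < 0)
                continue;            /* null subblock: no contribution */
            for (int n = 0; n < Z; n++)
                /* the cyclic shift by k acts as a barrel shifter on uj */
                w[i * Z + n] ^= u[j * Z + (n + k) % Z];
        }
    }
}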
In the example of
In the example of
In the example of
In the example of
In the example of
The vector x is converted to analog waveforms at the D/A converters 214. The number of D/A converters 214 will typically correspond to the number of Tx antennae in the Tx antennae array 218, though it is conceivable that a system could have more or fewer D/A converters. In certain special cases of MIMO (i.e., SISO or SIMO), there may be only a single D/A converter 214 for the single spatial stream.
In the example of
In the example of
In the example of
In the example of
The antennae weighting engine 310 receives the demodulated received vector y, and can receive other values, as well, such as, e.g., CSI (shown) and interference measurements (not shown). The antennae weighting engine 310 can receive, compute, and/or have stored a weighting matrix, M, and generates a weighted vector y′=M1/2 y. In an alternative embodiment, the weighted vector y′ could combine M and y in some other manner than matrix-vector multiplication. The CSI can be used to compute a channel estimate, H′, (or the CSI could include the channel estimate, H′). Using these values, a precoding matrix, Q, could be represented as Q=M1/2H′. In an alternative embodiment, the matrix Q could combine M and H′ in some other manner than matrix multiplication. Regardless of the implementation-specific details, the antennae weighting engine 310 provides the precoding matrix, Q, to the MIMO equalizer 312.
In the example of
In the example of
In the example of
The LC LDPC decoder 318 can have an initialization stage before the decoding of each code word starts, where the check node L-values Cm,n are reset to zero. A posteriori probability (APP) L-values, Pn, are therefore initialized with the channel L-values, Ln, since Cm,n=0. After each layer, the N sign bits of an APP Pn (codeword) can be found and stored: ĉn=0 if Pn≧0, 1 if Pn<0, n=1, 2, . . . , N. When the same codeword has been found after T consecutive layers have been decoded, it can optionally be checked whether it is a valid codeword according to the parity check equation HĉT=0. If one or both criteria are fulfilled, decoding of the codeword can optionally stop. Note that T can be larger than, smaller than, or equal to S.
The extrinsic L-value for a variable node sent to check node, k, is the sum of all connections except the connection to check node k: Vk=L+Σ(i≠k)Ci=P−Ck, k=1, 2, . . . , dv. The LC LDPC decoder 318 iterates between updating the variable nodes, Vk, and the check nodes, Ck. One iteration is defined as one update of all variable nodes and one update of all check nodes. The iterations between variable nodes and check nodes continue until a predetermined number of iterations is reached or until some other criterion is met (for example, a valid codeword is found, or the same valid codeword is found after T consecutive decoding layers). After finishing the iterations, the code word is simply the MSBs of the APP L-values based on the most recent check node L-values according to P=L+Σ(i=1 . . . dv)Ci and û=0 if P≧0, 1 if P<0. The APP L-value for coded bit n can be calculated as the sum of all check node L-values connected to variable node n plus the channel L-value for coded bit n: Pn=Ln+Σ(m=1 . . . M)Cm,nHm,n, n=1, 2, . . . , N, where Hm,n is used to pick out the check nodes connected with variable node, n. The decision for the information bits is based on the first K APP values according to ûk=0 if Pk≧0, 1 if Pk<0, k=1, 2, . . . , K. For simplicity of notation, the index, n, can be dropped such that P=L+Σ(i=1 . . . dv)Ci and û=0 if P≧0, 1 if P<0.
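A minimal C sketch of this variable node update and hard decision follows; the function name is illustrative, and floating point is used for clarity where a fixed point implementation would saturate the sums to the chosen word length.

/* One variable node of degree dv: channel L-value L and incoming check
 * node L-values C[0..dv-1]. Computes the APP L-value P = L + sum(Ci),
 * the extrinsic values Vk = P - Ck, and the hard decision bit. */
void variable_node_update(double L, const double *C, double *V,
                          int dv, double *P_out, int *bit_out)
{
    double P = L;
    for (int i = 0; i < dv; i++)
        P += C[i];
    for (int k = 0; k < dv; k++)
        V[k] = P - C[k];            /* all connections except check node k */
    *P_out   = P;
    *bit_out = (P >= 0.0) ? 0 : 1;  /* c = 0 if P >= 0, 1 otherwise */
}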
The extrinsic L-value for a check node sent to variable node, k, is based on all connections except the connection to variable node k:
Ck=2 tanh−1(Π(i≠k)tanh(Vi/2)); (5.1)
Ck=ln {[1−Π(i≠k)((1−exp(Vi))/(1+exp(Vi)))]/[1+Π(i≠k)((1−exp(Vi))/(1+exp(Vi)))]}; (5.2)
Ck=Π(j≠k)αjφ(Σ(i≠k)φ(βi))=Ukφ(Ek); (5.3)
Ck=Ukφ(Ek)=Π(j≠k)αj ⊞(i≠k)βi=⊞(j≠k)Vj; (5.4)
where k=1, 2, . . . , dc.
The equations (5.1) to (5.4), which can be referred to collectively as the equation (5), can be solved to obtain Ck. A first potential approach to solving for Ck (5.1) requires two look-up tables, one for tanh and one for tanh−1, together with many multiplications.
The equation (5.1) can be rewritten as equation (5.2). A second potential approach to solving for Ck (5.2) requires two look-up tables, one for ln and one for exp, together with many multiplications and divisions.
The L-values from the variable nodes can be represented by sign-abs where Vi=αiβi, where αi is defined as sign (Vi), and where βi is defined as |Vi|. Using the sign-abs representation the equation (5.2) can be rewritten as equation (5.3), where φ(x)=ln [(exp(x)+1)/(exp(x)−1)], and the expression can be conceptually divided into a first part, Uk, and a second part, Fk, where Uk is defined as Π(j≠k)αj, Fk is defined as φ(Ek), and Ek is defined as Σ(i≠k)φ(βi). A third potential approach (5.3) is referred to in this paper as the sum-product (SP) approach (SPA).
To solve for the second part, Fk, the values for each φ(βk) are obtained from a φ lookup table. Since there are defined to be a total of dc edges going into the check node, a total of dc lookups are necessary to obtain the values. The values for each of the dc summations are then obtained from the φ lookup table, bringing the total number of lookup operations to 2dc.
To solve for the first part, Uk, it is first noted that the sign bits αjε{+1,−1}, and Uk=Π(j≠k)αj=αkU, where U, defined as Π(j=1 . . . dc)αj, is identical for all k=1, 2, . . . , dc. An equivalent solution is to use the MSB of Vj represented by αj*ε{0,1}, Uk*=αk* ⊕U*, where ⊕ is the XOR operation and U*, defined as Σ(j=1 . . . dc)⊕αj*, is identical for all k=1, 2, . . . , dc.
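Putting the two parts together, a C sketch of the SPA check node update (5.3) follows. The φ function is evaluated directly here, where a hardware implementation would use the look-up table described above; names are illustrative, and the input magnitudes are assumed nonzero.

#include <math.h>

/* phi(x) = ln((exp(x)+1)/(exp(x)-1)); positive and self-inverse. */
static double phi(double x) { return log((exp(x) + 1.0) / (exp(x) - 1.0)); }

/* SPA check node of degree dc: V[0..dc-1] in, C[0..dc-1] out. */
void spa_check_node(const double *V, double *C, int dc)
{
    double sum   = 0.0;
    int    u_all = 0;                       /* U* = XOR of all sign bits */
    for (int i = 0; i < dc; i++) {
        u_all ^= (V[i] < 0.0);
        sum   += phi(fabs(V[i]));           /* first dc table lookups    */
    }
    for (int k = 0; k < dc; k++) {
        double Ek = sum - phi(fabs(V[k]));  /* leave-one-out sum Ek      */
        int    uk = u_all ^ (V[k] < 0.0);   /* Uk* = alpha_k* XOR U*     */
        C[k] = (uk ? -1.0 : 1.0) * phi(Ek); /* second dc table lookups   */
    }
}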
The function φ is positive and self-invertible:
φ(x)>0, x<∞. (6)
φ(φ(x))=x. (7)
A fourth potential approach (5.4) is referred to in this paper as the box-plus (BP) approach (BPA). It may be noted that the third and fourth approaches have Uk in common. Fk, defined as φ(Ek), can be calculated recursively as Fk=φ(Σ(i≠k)φ(βi))=⊞(i≠k)βi.
The BP operation can be represented as:
V1 ⊞ V2=α1α2[min(β1, β2)−δ(β1, β2)] (8)
The bias term in the BP operation (8) is δ(β1, β2)=μ(β1−β2)−μ(β1+β2), where the correction terms are defined as μ(x)=ln(1+exp(−|x|)).
BPA can be implemented using a forward-backward algorithm. Backward recursion:
Forward recursion:
The backward recursion requires (dc−2) BP operations. The forward recursion requires 2(dc−2) BP operations. One look-up table is needed to calculate the correction term. To calculate all dc extrinsic Fk, 3(dc−2) BP operations are required. Each operation finds the minimum of the two inputs, one subtraction for the argument to the first correction term, one addition for the argument to the second correction term, one subtraction to calculate the bias term, and one subtraction to remove the bias term.
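A C sketch of the forward-backward BPA follows, under the illustrative assumption dc≦32; box_plus() implements the BP operation (8) exactly.

#include <math.h>

static double mu(double x) { return log(1.0 + exp(-fabs(x))); }

/* Exact pairwise BP operation (8): V1 [+] V2 = a1*a2*[min(b1,b2) - delta],
 * with delta = mu(b1 - b2) - mu(b1 + b2). */
static double box_plus(double v1, double v2)
{
    double s  = ((v1 < 0.0) ^ (v2 < 0.0)) ? -1.0 : 1.0;
    double b1 = fabs(v1), b2 = fabs(v2);
    return s * (fmin(b1, b2) - (mu(b1 - b2) - mu(b1 + b2)));
}

/* Forward-backward BPA for one check node of degree dc (2 <= dc <= 32).
 * C[k] combines all inputs except V[k], using 3(dc-2) BP operations. */
void bpa_check_node(const double *V, double *C, int dc)
{
    double fwd[32], bwd[32];
    bwd[dc - 1] = V[dc - 1];
    for (int i = dc - 2; i >= 1; i--)        /* backward recursion */
        bwd[i] = box_plus(V[i], bwd[i + 1]);
    fwd[0] = V[0];
    for (int i = 1; i <= dc - 2; i++)        /* forward recursion  */
        fwd[i] = box_plus(fwd[i - 1], V[i]);

    C[0]      = bwd[1];
    C[dc - 1] = fwd[dc - 2];
    for (int k = 1; k <= dc - 2; k++)
        C[k] = box_plus(fwd[k - 1], bwd[k + 1]);
}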
The BP operations used in the BPA can be approximated by using the dominant term in the bias term δ(β1, β2)=μ(β1−β2)−μ(β1+β2)≈μ(β1−β2). With the approximated bias, the BP operation can be simplified to β1 ⊞ β2=min(β1, β2)−μ(β1−β2)+μ(β1+β2)≈max(min(β1, β2)−μ(β1−β2), 0), and the look-up table for μ is only needed once per BP operation. The maximization is needed to avoid negative values of Fk*. This variant of the BPA is referred to in this paper as the simplified BPA (SBPA).
Another approach, referred to as the APP approach (APPA), is similar to the calculation of the APP L-values Pn=Ln+Σ(m=1 . . . M)Cm,nHm,n, n=1, 2, . . . , N, that uses all check-node L-values. Instead of using the extrinsic values that disregard βk to calculate Fk, all βk can be used: Fk≈F=⊞(i=1 . . . dc)βi, for all k≧1, where F is defined as φ(Σ(i=1 . . . dc)φ(βi)). Note that F=Fk ⊞ βk, for all k≧1, if Fk is the true extrinsic L-value calculated as in Fk=φ(Σ(i≠k)φ(βi))=⊞(i≠k)βi. Note also that even if Fk≈F is identical for all k, Ck will still depend on the signs: Ck≈UkF, for all k≧1.
The techniques described with reference to the SBPA are also applicable to the APPA, and such an approach is referred to as a simplified APPA (SAPPA) in this paper.
|βi| can be ordered such that β1*≦β2*≦ . . . ≦βdc*. It follows from equation (6) that Σ(i=1 . . . dc)φ(βi*)≧Σ(i=1 . . . dc−1)φ(βi*)≧ . . . ≧φ(β1*)+φ(β2*)≧φ(β1*). Since φ(x+ε)≦φ(x) for any ε≧0 it follows that φ(βdc*)≦φ(βdc−1*)≦ . . . ≦φ(β1*). So Ek* can be defined as Σ(i=1 . . . k−1)φ(βi*)+Σ(i=k+1 . . . dc)φ(βi*). Note that Ei is only equal to Ek* if βi=βk*. Since φ(β1*) is the largest term, it can be concluded that
E1*≦E2*≦ . . . ≦Edc*. (9)
An upper bound to Ek* is to replace every term in Σ(i=1 . . . k−1)φ(βi*)+Σ(i=k+1 . . . dc)φ(βi*) with the largest term:
E1*≦(dc−1)φ(β2*), (10.1)
Ek*≦(dc−1)φ(β1*), for all k≧2. (10.2)
Another upper bound to Ek* is to add the missing term in Σ(i=1 . . . k−1)φ(βi*)+Σ(i=k+1 . . . dc)φ(βi*):
Ek*≦Ek*+φ(βk*)=Σ(i=1 . . . dc)φ(βi*). (11)
A lower bound to Ek* is to keep only the largest term in Σ(i=1 . . . k−1)φ(βi*)+Σ(i=k+1 . . . dc)φ(βi*):
E1*≧φ(β2*), (12.1)
Ek*≧φ(β1*), for all k≧2. (12.2)
Disregarding the last summation in Σ(i=1 . . . k−1)φ(βi*)+Σ(i=k+1 . . . dc)φ(βi*) gives a tighter lower bound to Ek*
Ek*≧Σ(i=1 . . . k−1)φ(βi*), for all k≧2. (13)
It can be shown that lim (x→∞)[x−φ((dc−1)φ(x))]=ln(dc−1).
Applying φ to the upper bounds in (10) therefore gives lower bounds to Fk*:
F1*≧φ((dc−1)φ(β2*))≧max(β2*−ln(dc−1), 0), (14.1)
Fk*≧φ((dc−1)φ(β1*))≧max(β1*−ln(dc−1), 0), for all k≧2. (14.2)
Where Fk is defined as φ(Ek), (11) gives another lower bound to Fk*:
Fk*≧φ(Σ(i=1 . . . dc)φ(βi*)). (15)
A tighter lower bound is to always choose the larger of the two lower bounds F1*≧max[φ((dc−1)φ(β2*)), φ(Σ(i=1 . . . dc)φ(βi*))] and Fk*≧max[φ((dc−1)φ(β1*)), φ(Σ(i=1 . . . dc) φ(βi*))], for all k≧2.
Where Fk is defined as φ(Ek), since φ is self-invertible (7), combining equations (12) and (13) gives upper bounds to Fk*:
F1*≦φ(φ(β2*))=β2*, (16.1)
Fk*≦φ(Σ(i=1 . . . k−1)φ(βi*))≦φ(φ(β1*))=β1*, for all k≧2, (16.2)
Fk*≦φ(Σ(i=1 . . . Q, i≠k)φ(βi*)), for all 2≦Q≦dc and k≧2. (17)
The upper bound (UB) for F1* in (16.1) is shown in
Where Fk is defined as φ(Ek), combining (9) and (15) gives F1*≧F2*≧ . . . ≧Fdc*≧φ(Σ(i=1 . . . dc)φ(βi*)). It follows from equation (8) that the lower bound to Fk* can therefore be rewritten as Fk*≧Fk+1*≧⊞(i=1 . . . dc)βi*=Fj* ⊞ βj*, for all 1≦k≦dc−1 and 1≦j≦dc. In a similar way, the upper bounds to Fk* in equations (16), (17) can be written as F1*≦β2*≦ . . . ≦βdc*; Fk*≦⊞(i=1 . . . k−1)βi*≦β1*≦ . . . ≦βdc*, for all k≧2; Fk*≦⊞(i=1 . . . Q, i≠k)βi*, for all 2≦Q≦dc and k≧2.
Due to the complexities of the four approaches described above with reference to equation (5), it is considered desirable in practice to replace equation (5) with an equation that provides similar results with fewer and/or less complex calculations. The min-sum approximation (MSA) can be used to approximate Fk* using the simple upper bounds F1*≦β2*≦ . . . ≦βdc*; Fk*≦⊞(i=1 . . . k−1)βi*≦β1*≦ . . . ≦βdc*, for all k≧2; Fk*≦⊞(i=1 . . . Q, i≠k)βi*, for all 2≦Q≦dc and k≧2, where the approximation is set equal to the minimum βj: F1*≈β2*, Fk*≈β1*, for all k≧2. This can also be expressed as Fk≈β2*, if βk=β1*, β1* otherwise, from which it can be seen that the check node needs to find the two smallest βj to update all dc extrinsic L-values. By definition, this approximation always over-estimates Fk. MSA is equal to the upper bounds in equations (16) and is shown for F1* in
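A C sketch of the MSA check node update follows; as noted above, only the two smallest magnitudes (and the XOR of the sign bits) need to be tracked. Names are illustrative.

#include <math.h>

/* Min-sum check node: Fk = beta2* on the edge holding the overall
 * minimum magnitude, beta1* on every other edge. */
void msa_check_node(const double *V, double *C, int dc)
{
    double b1 = INFINITY, b2 = INFINITY;  /* smallest, second smallest */
    int    min_idx = 0, u_all = 0;
    for (int i = 0; i < dc; i++) {
        double b = fabs(V[i]);
        u_all ^= (V[i] < 0.0);
        if (b < b1)      { b2 = b1; b1 = b; min_idx = i; }
        else if (b < b2) { b2 = b; }
    }
    for (int k = 0; k < dc; k++) {
        double f  = (k == min_idx) ? b2 : b1;
        int    uk = u_all ^ (V[k] < 0.0);  /* extrinsic sign Uk */
        C[k] = uk ? -f : f;
    }
}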
A linear approximation (LA) entails using a fixed attenuation, A: F1*≈Aβ2*, Fk*≈Aβ1*, for all k≧2. From (16.1) and (16.2), it follows that the attenuation should be chosen in the interval 0<A≦1. A disadvantage of the LA approach is that whichever A is chosen, there will be a range of β1* where Fk*≈Aβ1* is below its lower bound β1*−ln(dc−1). If A=1, the LA approach is equivalent to MSA. The LA for F1* using A=0.50 is shown in
The greater the number of iterations, the greater the reliability of the L-values from the variable nodes, Vk, used in the check node update. When β1* becomes larger, the approximation of Fk* changes. Advantageously, the parameters used in the approximations, A, B, q1, and q2, can be made to depend on the iteration number. Note that the bounds for Fk* are still independent of iterations. Note also that some of the above-mentioned check node update approaches, for example, SPA, BPA, and MSA, have no parameters and are identical for each iteration.
For irregular LDPC codes, an attenuation that varies with the iterations (properly optimized) can remove the error floor and be almost as good as a properly optimized non-linear approximation. Usually the attenuation grows with iterations such that A(i)≧A(i−1), where A(i) denotes the attenuation at iteration, i. An ad hoc approach is to switch between MSA and LA. This can be referred to as a non-constant parameter approach (NCPA). For example, A(i)=A if mod(i, I)≠0 (LA), 1 otherwise (MSA). For example, I=3 and A=0.7 give the sequence A=[0.7, 0.7, 1.0, 0.7, 0.7, 1.0, . . . , 0.7, 0.7, 1.0].
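A minimal C sketch of this NCPA schedule, assuming iterations are counted from i=1:

/* Attenuation schedule for the MSA/LA switching example: every I-th
 * iteration reverts to pure MSA (A(i) = 1.0), otherwise LA with A(i) = A.
 * With I = 3 and A = 0.7 this yields 0.7, 0.7, 1.0, 0.7, 0.7, 1.0, ... */
double ncpa_attenuation(int i, int I, double A)
{
    return (i % I == 0) ? 1.0 : A;
}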
In the example of
In the example of
Variables can include the current iteration of a number of iterations in a computation, dc, characteristics of the check node, or some other factor. For example, the variable could be an iteration counter, i. For such a variable, it may be desirable to initialize it to a starting value and perhaps also to set an iteration threshold, I, that causes the variable to trigger periodically, (pseudo-)randomly, or with some other frequency or consistency. The iteration threshold, in this example, could be set once and reused over time, or it could itself be subject to variation based upon changed conditions. One possible triggering mechanism could be a function that takes the variable as input and either selects or causes the selection of the ELC parameter value. With reference once again to the iteration counter example, the triggering mechanism could be a mod function that takes i and I as input: mod(i, I).
In the example of
In the example of
If, on the other hand, it is determined that there are no additional iterations (808-N), then the flowchart 800 continues to module 810 with calculating APP values and extracting information bits. The logic of the flowchart 800 in this example requires that if there are no additional iterations to be performed, there is by definition no new ELC parameter (808-N). It may be noted that there may be no actual “determination” at decision point 808, but it rather could be that a function that provides the ELC parameter values updates automatically. For example, if the parameter value changes at iteration, i, the parameter could simply be updated without a determination as to whether a new parameter value is needed.
In the example of
Advantageously, an offset approximation (OA) uses a fixed bias, B, as opposed to over-estimating Fk as with MSA, or using a fixed attenuation, A, as with LA:
F1*≈max(β2*−B, 0), (18.1)
Fk*≈max(β1*−B, 0), for all k≧2. (18.2)
From (14.1) and (14.2) it follows that the bias should be chosen in the interval 0≦B≦ln(dc−1). It may be noted that if B=0, the OA reduces to the MSA. The OA for F1* using B=1.30 is shown in
Linear OA (LOA) combines OA and LA by using the larger of the two: F1*≈max(Aβ2*, β2*−B), Fk*≈max(Aβ1*, β1*−B), for all k≧2. Since A and B are restricted to intervals, this can be written as
F1*≈max(q1β2*, β2*−(1−q2) ln(dc−1)), (19.1)
Fk*≈max(q1β1*, β1*−(1−q2) ln(dc−1)), for all k≧2, (19.2)
where 0≦q1<1 and 0≦q2≦1 are design parameters. q1 specifies the attenuation for small β1* (and β2*) and q2 specifies how far Fk* is from the lower bound for large β1* (and β2*). Some examples of q1 and q2 for k≧2 are: Fk*≈max(0, β1*−(1−q2) ln(dc−1)), for q1=0 (OA); Fk*≈max(q1β1*, β1*−ln(dc−1)), for q2=0 (LA with bound); Fk*≈max(0, β1*−ln(dc−1)), for q1=0, q2=0 (OA with max bias); Fk*≈max(β1*, β1*−(1−q2) ln(dc−1))=β1*, for q1=1 (MSA); Fk*≈max(q1β1*, β1*)=β1*, for q2=1 (MSA).
The LOA for F1* using q1=0 and q2=0 is the lower bound (14.1) and shown in
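A C sketch of the LOA magnitude computation (19) follows; the argument b stands for β2* on the least reliable edge and β1* on every other edge, an illustrative convention.

#include <math.h>

/* LOA (19): Fk* ~ max(q1*b, b - (1-q2)*ln(dc-1)). q1 = 0 reduces to OA,
 * q2 = 0 to LA with the bound, and q1 = 1 or q2 = 1 to MSA. */
double loa_magnitude(double b, int dc, double q1, double q2)
{
    double offset = (1.0 - q2) * log((double)(dc - 1));
    return fmax(q1 * b, b - offset);
}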
As an example of an NCPA for LOA depending on dc and/or k,
F1*≈max(A1β2*, β2*−B1 ln(dc−1)), (20.1)
Fk*≈max(A2β1*, β1*−B2 ln(dc−1)), for all k≧2. (20.2)
Single linear OA (SLOA) uses one linear curve clipped to the upper and lower bounds:
F1*≈max(min(Aβ2*−B, β2*), 0), (21.1)
Fk*≈max(min(Aβ1*−B, β1*), 0), for all k≧2. (21.2)
In this case, A can be larger than 1.0. The SLOA for F1* using A=1.5 and B=1.8 is shown in
Double linear OA (DLOA) uses two linear curves clipped to the upper and lower bounds:
F1*≈max(min(max(A1β2*−B1, A2β2*−B2), β2*), 0), (22.1)
Fk*≈max(min(max(A1β1*−B1, A2β1*−B2), β1*), 0), for all k≧2. (22.2)
OA, LOA, and SLOA are special cases of DLOA. DLOA can also be reduced to LA and MSA. The DLOA for F1* using A1=0.5, B1=0.25, A2=2.0, and B2=3.0 is shown in
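A C sketch of the DLOA magnitude computation (22) follows; because OA, LOA, and SLOA are special cases, the same routine covers them by collapsing the two lines. The argument convention (b is β2* for the least reliable edge, β1* otherwise) is illustrative.

#include <math.h>

/* DLOA (22): the larger of two lines A1*b - B1 and A2*b - B2, clipped
 * to the upper bound b and to zero. */
double dloa_magnitude(double b, double A1, double B1, double A2, double B2)
{
    double two_lines = fmax(A1 * b - B1, A2 * b - B2);
    return fmax(fmin(two_lines, b), 0.0);
}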
In operation, the codeword buffer 902 contains at least a subset of a codeword. It may be noted that any known or convenient computer-readable storage medium could be used, regardless of whether it is referred to as a “buffer.” In at least some implementations, the entire codeword is contained in the codeword buffer 902 prior to decoding by the LDPC decoding engine 904.
The LDPC decoding engine 904 performs operations to obtain data bits from the codeword contained in the codeword buffer 902. Depending upon the implementation, configuration, and/or embodiment, the LDPC decoding engine 904 can implement one of OA, LOA, SLOA, DLOA, NLA, LBPA, SLBPA, LAPPA, SLAPPA, BPA-APPA, SBPA-SAPPA, LBPA-LAPPA, SLBPA-SLAPPA, ASPA, ABPA, ASBPA, ALBPA, ASLBPA, AAPPA, ASAPPA, ALAPPA, ASLAPPA, MSA-AAPPA, MSA-ASAPPA, MSA-ALAPPA, MSA-ASLAPPA, combinations, variations, and other applicable approaches and approximations. If the LDPC decoding engine 904 is capable of a specific approach or approximation, it can be referred to as, for example, an OA LDPC decoding engine. Where the LDPC decoding engine 904 is capable of switching between approaches and approximations, the LDPC decoding engine 904 can be referred to as an approach-configurable LDPC decoding engine.
The attenuation engine 906 can provide an attenuation parameter, A, to the decoding engine 904 that is useful in bounding the operations. The attenuation engine 906 can either be provided the value of A or generate the attenuation parameter using rules based upon operational variables. The attenuation engine 906 may or may not be able to modify the value of A, depending upon the implementation. The attenuation parameter, A, can be used by the decoding engine 904 to perform an LA on the codeword. If an NCPA is used, A can vary depending upon, for example, the iteration, dc, the check node, or some other factor. Depending upon the implementation and/or configuration, the attenuation engine 906 can provide multiple attenuation parameters, e.g., A1 and A2, and/or intervals of A, e.g., q1. It should be noted that the attenuation engine 906 could be implemented as “part of” the LDPC decoding engine 904, and the two components could be referred to together as an attenuation-sensitive LDPC decoding engine.
The offset engine 908 can provide a bias parameter, B, to the decoding engine 904 that is useful in bounding the operations. The offset engine 908 can either be provided the value of B or generate the bias parameter using rules based upon operational variables. The offset engine 908 may or may not be able to modify the value of B, depending upon the implementation. The bias parameter, B, can be used by the decoding engine 904 to perform an OA on the codeword. If an NCPA is used, B can vary depending upon, for example, the iteration, dc, the check node, or some other factor. Depending upon the implementation and/or configuration, the offset engine 908 can provide multiple bias parameters, e.g., B1 and B2, and/or intervals of B, e.g., q2. It should be noted that the offset engine 908 could be implemented as “part of” the LDPC decoding engine 904, and the two components could be referred to together as an offset-sensitive LDPC decoding engine.
The φ lookup table 910 includes multiple values of φ. If the decoding engine 904 needs to know the value of φ, using a lookup table is a common technique for providing the values. The decoding engine 904 includes an interface for the φ lookup table 910, to the extent a given implementation requires such an interface.
The approximation engine 912 can provide an approximation parameter, Q, to the decoding engine 904 that is useful in reducing the amount of processing required by the decoding engine 904. If an NCPA is used, Q can vary depending upon, for example, the iteration, dc, the check node, or some other factor. It should be noted that the approximation engine 912 could be implemented as “part of” the LDPC decoding engine 904, and the two components could be referred to together as a node-limiting LDPC decoding engine.
The LDPC decoding engine 904 puts data bits associated with the decoded codeword in the data buffer 914.
In the example of
In the example of
In the example of
If, on the other hand, it is determined that there are no additional iterations (1008-N), then the flowchart 1000 continues to module 1010 with calculating APP values and extracting information bits. The logic of the flowchart 1000 in this example requires that if there are no additional iterations to be performed, there is by definition no new ELC parameter (1008-N). It may be noted that there may be no actual “determination” at decision point 1008, but it rather could be that a function that provides the ELC parameter values updates automatically. For example, if the parameter value changes at iteration, i, the parameter could simply be updated without a determination as to whether a new parameter value is needed.
In the example of
Non-linear approximation (NLA) uses a weighted sum of the lower bound in equations (14) and the upper bound in equations (16):
F1*≈(1−q1)φ((1+(dc−2)q3)φ(β2*))+q1β2*, (23.1)
Fk*≈(1−q1)φ((1+(dc−2)q3)φ(β1*))+q1β1*, for all k≧2, (23.2)
where 0≦q1≦1 and 0≦q3≦1 are design parameters that assure that the approximation is within its upper and lower bound. As with LOA, q1 specifies the attenuation for small β1*. If q3 is defined as [(dc−1)^((1−q2)/(1−q1))−1]/(dc−2), NLA will be similar to LOA for large β1* if they are using the same q1 and q2. Some examples of q1 and q2 for k≧2 are: Fk*≈φ((1+(dc−2)q3)φ(β1*)), for q1=0 (non-linear bias); Fk*≈φ((dc−1)φ(β1*)), for q1=0, q2=0 (tight lower bound); Fk*≈(1−q1)φ((dc−1)φ(β1*))+q1β1*, for q1=q2 (weighted bounds); Fk*≈β1*, for q1=1 (MSA); Fk*≈(1−q1)φ(φ(β1*))+q1β1*=β1*, for q2=1 (MSA).
The NLA for F1* using q1=0 and q2=0 is the tight lower bound in (14.1) and shown in
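A C sketch of the NLA magnitude computation (23) follows; φ is evaluated directly here, whereas the fixed point decoder described in this paper would read it from a look-up table. The argument convention is as in the earlier sketches.

#include <math.h>

static double phi(double x) { return log((exp(x) + 1.0) / (exp(x) - 1.0)); }

/* NLA (23): weighted sum of the upper bound b and the non-linear term
 * phi((1 + (dc-2)*q3)*phi(b)), with weights q1 and 1 - q1. */
double nla_magnitude(double b, int dc, double q1, double q3)
{
    double nonlinear = phi((1.0 + (dc - 2) * q3) * phi(b));
    return (1.0 - q1) * nonlinear + q1 * b;
}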
In the example of
In the example of
In the example of
Table 1 illustrates the relative complexity of MSA, LA, the OA variants, and NLA. The table lists the number of different operations required to decode one row in H. For one iteration, these numbers need to be multiplied by M=ZS; however, Z of them will be performed in parallel. In Table 1, PAR lists the parameters used in the algorithm; FIND is the number of smallest βk that have to be found out of the dc edges connected to the check node; MIN is the number of two-input min or max operations (min or max operations with one input equal to zero are disregarded); LUT is the number of times a look-up table is needed; ADD is the number of additions/subtractions; MULT is the number of multiplications. Note that choosing the scaling Aε{2^−j, 1−2^−j}, for j=0, 1, 2, . . . , simplifies the multiplications significantly.
Advantageously, with linear BPA (LBPA), the BP operation can be approximated by using the dominant bias term approximated with a linear function: μ(x)≈max(B−A|x|, 0), where A>0 and B>0 are design parameters.
Where the SBPA and LBPA are known, the simplified LBPA (SLBPA) can be obtained by combining the techniques. The BP operation can be approximated using the dominant bias term approximated with a linear function: β1 ⊞ β2≈max(min(β1, β2)−max(B−A|β1−β2|, 0), 0). If the two inputs are ordered, the operation simplifies to β1* ⊞ β2*≈max(β1*−max(B−A(β2*−β1*), 0), 0).
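A C sketch of the SLBPA operation on ordered magnitudes b1≦b2 follows; names are illustrative.

#include <math.h>

/* SLBPA: b1 [+] b2 ~ max(b1 - max(B - A*(b2 - b1), 0), 0), where the
 * inner max is the linearized correction term and the outer max keeps
 * the result nonnegative. */
double slbpa_op(double b1, double b2, double A, double B)
{
    double bias = fmax(B - A * (b2 - b1), 0.0);
    return fmax(b1 - bias, 0.0);
}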
Any of the approximations described earlier in SBPA, LBPA, and SLBPA can be used as the BP operation in the APPA. These approaches will then be referred to as SAPPA, LAPPA, and SLAPPA, respectively.
Instead of calculating the true extrinsic L-values for all check nodes (as in BPA), the extrinsic value for the least reliable check node can be calculated. The least reliable check node is the check node with the smallest βk, i.e., check node F1*. The other check nodes can be under-estimated using the lower bound as with APPA described earlier: F1*=⊞(i=2 . . . dc)βi*, Fk*≈F1* ⊞ β1*, for all k≧2. This is a combination of BPA and APPA, here denoted by BPA-APPA. The number of BP operations in BPA-APPA is reduced from 3(dc−2) in the full BPA to dc−1. The BPA-APPA can be described by the following procedure, where β1* is found at the same time as F1* is calculated:
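The referenced procedure is not reproduced above, so the following C sketch is a reconstruction from the surrounding text: β1* is located while the remaining magnitudes are box-plus combined into F1*, for a total of dc−1 BP operations. Names are illustrative.

#include <math.h>

static double mu(double x) { return log(1.0 + exp(-fabs(x))); }

/* Magnitude-only BP operation (8) for nonnegative inputs. */
static double box_plus_mag(double b1, double b2)
{
    return fmin(b1, b2) - (mu(b1 - b2) - mu(b1 + b2));
}

/* BPA-APPA: exact extrinsic F1* for the least reliable edge, and the
 * under-estimate Fk* = F1* [+] beta1* for every other edge. */
void bpa_appa_check_node(const double *beta, double *F, int dc)
{
    int min_idx = 0;
    for (int i = 1; i < dc; i++)
        if (beta[i] < beta[min_idx]) min_idx = i;

    int    seed = (min_idx == 0) ? 1 : 0;
    double F1   = beta[seed];               /* start of the dc-2 combines */
    for (int i = 0; i < dc; i++) {
        if (i == min_idx || i == seed) continue;
        F1 = box_plus_mag(F1, beta[i]);
    }
    double Fk = box_plus_mag(F1, beta[min_idx]); /* one final BP operation */
    for (int k = 0; k < dc; k++)
        F[k] = (k == min_idx) ? F1 : Fk;
}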
Any of the approximations described earlier in SBPA, LBPA, and SLBPA can be used as the BP operation in the BPA-APPA. These approaches will then be referred to as SBPA-SAPPA, LBPA-LAPPA, and SLBPA-SLAPPA, respectively.
Table 2 illustrates the relative complexity of SPA/BPA/APPA and BPA/APPA variants. The table lists the number of different operations required to decode one row in H. For one iteration, these numbers need to be multiplied by M=ZS; however, Z of them will be performed in parallel. In Table 2, PAR lists the parameters used in the algorithm; FIND is the number of smallest βk that have to be found out of the dc edges connected to the check node; MIN is the number of two-input min or max operations (min or max operations with one input equal to zero are disregarded); LUT is the number of times a look-up table is needed; ADD is the number of additions/subtractions; MULT is the number of multiplications. Note that choosing the scaling Aε{2^−j, 1−2^−j}, for j=0, 1, 2, . . . , simplifies the multiplications significantly.
Advantageously, using an approximation engine, such as the approximation engine 912 (
Fk*≈φ(Σ(i=1 . . . Q, i≠k)φ(βi*)), where Qε{2, 3, . . . , dc}. (25)
Advantageously, using an approximation engine, such as the approximation engine 912 (
Fk*≈⊞(i=1 . . . Q, i≠k)βi*, where Qε{2, 3, . . . , dc}. (26)
These approaches can be denoted as ABPA, ASBPA, ALBPA, and ASLBPA, where the approximation technique is applied to BPA, SBPA, LBPA, and SLBPA, respectively. If Q=dc, ABPA, ASBPA, ALBPA, and ASLBPA reduce to BPA, SBPA, LBPA, and SLBPA, respectively. When Q=2, F1*≈β2*, F2*≈β1*, Fk*≈β1* ⊞ β2*, for all k≧3. When Q=3, F1*≈β2* ⊞ β3*, F2*≈β1* ⊞ β3*, F3*≈β1* ⊞ β2*, Fk*≈β1* ⊞ β2* ⊞ β3*, for all k≧4.
The APPA can be approximated by only using the Q first βi* in the calculation of Fk*. Note that the Q smallest βk have to be found and sorted beforehand, in contrast to APPA where the βk do not need to be sorted. For approximated APPA (AAPPA),
Fk*≈⊞(i=1 . . . Q)βi*, for all k≧1, where Qε{2, 3, . . . , dc}. (27)
In contrast to the APPA (that is a lower bound), the AAPPA is not a lower or an upper bound. Any of the approximations described earlier in SBPA, LBPA, and SLBPA can be used as the BP operation in the AAPPA. These approaches will then be referred to as ASAPPA, ALAPPA, and ASLAPPA, respectively.
As in BPA-APPA, where a combination of BPA and APPA is used, other combinations can be formulated. For example the combination of MSA and AAPPA according to
F1*≈β2*, Fk*≈⊞(i=1 . . . Q)βi*, for all k≧2, where Qε{2, 3, . . . , dc}. (28)
Note that the Q smallest βk have to be found and sorted beforehand. When Q=2, F1*≈β2*, Fk*≈β1* ⊞ β2*, for all k≧2. Note that MSA-AAPPA is similar to ABPA when Q=2 (but not the same, since F2* is different). When Q=3, F1*≈β2*, Fk*≈β1* ⊞ β2* ⊞ β3*, for all k≧2. Any of the approximations described earlier in SBPA, LBPA, and SLBPA can be used as the box-plus operation in the MSA-AAPPA. These approaches will then be referred to as MSA-ASAPPA, MSA-ALAPPA, and MSA-ASLAPPA, respectively. Other combinations of the above approaches can also be constructed.
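A C sketch of the MSA-AAPPA update (28) follows; a full sort is used for brevity, whereas a decoder would track only the Q smallest magnitudes. Names are illustrative, and dc is assumed to be at most 64.

#include <math.h>
#include <stdlib.h>

static double mu(double x) { return log(1.0 + exp(-fabs(x))); }

/* Magnitude-only BP operation used by the approximated update. */
static double bp_mag(double b1, double b2)
{
    return fmin(b1, b2) - (mu(b1 - b2) - mu(b1 + b2));
}

static int cmp_dbl(const void *a, const void *b)
{
    double d = *(const double *)a - *(const double *)b;
    return (d > 0) - (d < 0);
}

/* MSA-AAPPA (28): F1* ~ beta2* on the least reliable edge, and
 * Fk* ~ beta1* [+] ... [+] betaQ* on every other edge. */
void msa_aappa_check_node(const double *beta, double *F, int dc, int Q)
{
    double sorted[64];
    for (int i = 0; i < dc; i++) sorted[i] = beta[i];
    qsort(sorted, dc, sizeof(double), cmp_dbl);

    double acc = sorted[0];                 /* beta1* [+] ... [+] betaQ* */
    for (int i = 1; i < Q; i++)
        acc = bp_mag(acc, sorted[i]);

    int min_idx = 0;
    for (int i = 1; i < dc; i++)
        if (beta[i] < beta[min_idx]) min_idx = i;

    for (int k = 0; k < dc; k++)
        F[k] = acc;
    F[min_idx] = sorted[1];                 /* MSA value beta2* */
}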
In the example of
In the example of
In the example of
In the example of
Table 3 illustrates the relative complexity of ASPA/ABPA/AAPPA and variants. The table lists the number of different operations required to decode one row in H. For one iteration, these numbers need to be multiplied by M=ZS; however, Z of them will be performed in parallel. In Table 3, PAR lists the parameters used in the algorithm; FIND is the number of smallest βk that have to be found out of the dc edges connected to the check node; MIN is the number of two-input min or max operations (min or max operations with one input equal to zero are disregarded); LUT is the number of times a look-up table is needed; ADD is the number of additions/subtractions; MULT is the number of multiplications.
As used in this paper, A, B, q1, q2, and Q are identified as parameters for use in computation of an extrinsic L-value for a check node. Accordingly, the parameters can be referred to as extrinsic L-value computation parameters or check node parameters. Where one parameter is specifically identified, it can be referred to as an attenuation parameter, A, a bias parameter, B, a design parameter, qn, or an approximation parameter, Q. Q can also be referred to as node-limiting parameter because it reduces the number of computations associated with a node when computing an extrinsic L-value for a check node. The parameters A, B, qn can be referred to in the aggregate as check node bounding parameters. Where qn is associated with attenuation, A and qn can be referred to as attenuation parameters. Where qn is associated with bias, B and qn can be referred to as bias parameters.
The device 1302 interfaces to external systems through the communications interface 1310, which may include a modem, network interface, or some other known or convenient interface. It will be appreciated that the communications interface 1310 can be considered to be part of the system 1300 or a part of the device 1302. The communications interface 1310 can be an analog modem, ISDN modem, cable modem, token ring interface, ethernet interface, wireless 802.11 interface, satellite transmission interface (e.g. “direct PC”), or other interfaces for coupling a computer system to other computer systems.
The processor 1308 may be, for example, a microprocessor such as an Intel Pentium microprocessor or a Motorola PowerPC microprocessor. The memory 1312 is coupled to the processor 1308 by a bus 1320. The memory 1312 can be Dynamic Random Access Memory (DRAM) and can also include Static RAM (SRAM). The bus 1320 couples the processor 1308 to the memory 1312, the non-volatile storage 1316, the display controller 1314, and the I/O controller 1318.
The I/O devices 1304 can include a keyboard, disk drives, printers, a scanner, and other input and output devices, including a mouse or other pointing device. The display controller 1314 may control in the conventional manner a display on the display device 1306, which can be, for example, a cathode ray tube (CRT) or liquid crystal display (LCD). The display controller 1314 and the I/O controller 1318 can be implemented with conventional well known technology.
The non-volatile storage 1316 can include a magnetic hard disk, an optical disk, or another form of storage for large amounts of data. Some of this data is often written, by a direct memory access process, into memory 1312 during execution of software in the device 1302.
Clock 1320 can be any kind of oscillating circuit creating an electrical signal with a precise frequency. In a non-limiting example, clock 1320 could be a crystal oscillator using the mechanical resonance of a vibrating crystal to generate the electrical signal.
The system 1300 is one example of many possible computer systems which have different architectures. For example, personal computers based on an Intel microprocessor often have multiple buses, one of which can be an I/O bus for the peripherals and one that directly connects the processor 1308 and the memory 1312 (often referred to as a memory bus). The buses are connected together through bridge components that perform any necessary translation due to differing bus protocols.
Network computers are another type of computer system that can be used in conjunction with the teachings provided herein. Network computers do not usually include a hard disk or other mass storage, and the executable programs are loaded from a network connection into the memory 1312 for execution by the processor 1308. A Web TV system is also considered to be a computer system, but it may lack some of the features shown in
In addition, the system 1300 is controlled by operating system software which includes a file management system, such as a disk operating system, which is part of the operating system software. One example of operating system software with its associated file management system software is the family of operating systems known as Windows® from Microsoft Corporation of Redmond, Wash., and their associated file management systems. Another example of operating system software with its associated file management system software is the Linux operating system and its associated file management system. The file management system is typically stored in the non-volatile storage 1316 and causes the processor 1308 to execute the various acts required by the operating system to input and output data and to store data in memory, including storing files on the non-volatile storage 1316.
Some portions of the detailed description are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the relevant art. An algorithm is here, and generally, conceived to be a self-consistent sequence of operations leading to a desired result. The operations are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussion, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
The device 1300 may be specially constructed for required purposes, or it may comprise a general purpose computer specially purposed by a computer program to perform certain tasks. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, any type of disk including floppy disks, optical disks, CD-ROMs, and magnetic-optical disks, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.
Systems described in this paper may be implemented on any of many possible hardware, firmware, and software systems. Algorithms described herein are implemented in hardware, firmware, and/or software that is implemented in hardware. The specific implementation is not critical to an understanding of the techniques described herein and the claimed subject matter.
As used in this paper, an engine includes a dedicated or shared processor and hardware, firmware, or software modules that are executed by the processor. Depending upon implementation-specific or other considerations, an engine can be centralized or its functionality distributed. An engine can include special purpose hardware, firmware, or software embodied in a computer-readable medium for execution by the processor. As used in this paper, the term “computer-readable storage medium” is intended to include only physical media, such as memory. As used in this paper, a computer-readable medium is intended to include all mediums that are statutory (e.g., in the United States, under 35 U.S.C. 101), and to specifically exclude all mediums that are non-statutory in nature to the extent that the exclusion is necessary for a claim that includes the computer-readable medium to be valid. Known statutory computer-readable mediums include hardware (e.g., registers, random access memory (RAM), non-volatile (NV) storage, to name a few), but may or may not be limited to hardware. Data stores, including tables, are intended to be implemented in computer-readable storage media. Though the data would be stored on non-ephemeral media in order for the media to qualify as “storage media,” it is possible to store data in temporary (e.g., dynamic) memory.
As used in this paper, the term “embodiment” means an embodiment that serves to illustrate by way of example but not necessarily by limitation.
It will be appreciated by those skilled in the relevant art that the preceding examples and embodiments are exemplary and not limiting to the scope of the present invention. It is intended that all permutations, enhancements, equivalents, and improvements thereto that are apparent to those skilled in the art upon a reading of the specification and a study of the drawings are included within the true spirit and scope of the present invention. It is therefore intended that the following appended claims include all such modifications, permutations and equivalents as fall within the true spirit and scope of the present invention.
This application claims priority to U.S. Provisional Patent Application No. 61/122,648, filed on Dec. 15, 2008, and which is incorporated by reference.