SYSTEMS AND METHODS FOR REDUCED COMPLEXITY LDPC DECODING

Description

BRIEF DESCRIPTION OF THE DRAWINGS

Features, aspects, and embodiments of the inventions are described in conjunction with the attached drawings, in which:

FIG. 1 is a diagram illustrating an example communication system that uses LDPC codes;

FIG. 2 is a diagram illustrating the operation of an exemplary parity check matrix;

FIG. 3 is a diagram illustrating an exemplary parity node processor;

FIG. 4 is a diagram illustrating the operation of an exemplary parity node processor;

FIG. 5 is a diagram illustrating the operation of an exemplary variable node processor;

FIG. 6 is a diagram illustrating an example parity node processor configured in accordance with one embodiment;

FIG. 7 is a diagram illustrating an example parity node processor configured in accordance with another embodiment;

FIGS. 8 and 9 are graphs showing respectively the simulated frame error rate (FER) and bit error rate (BER) performance for the irregular, ½-rate LDPC codes defined in 802.16eD12 under AWGN channel for various decoding algorithms;

FIGS. 10 and 11 are graphs showing respectively the simulated frame error rate (FER) and bit error rate (BER) performance for the irregular, ¾-rate LDPC codes defined in 802.16eD12 under AWGN channel for various decoding algorithms;

FIG. 12 is a flow chart illustrating an example method for performing LDPC decoding using the parity node processors of FIG. 6 or 7;

FIG. 13 is a diagram illustrating a portion of an example LDPC decoder that includes degree reduction in accordance with one embodiment;

FIG. 14 is a diagram illustrating an example embodiment of a degree reducing unit that can be included in the LDPC decoder of FIG. 13 in accordance with one embodiment;

FIG. 15 is a diagram illustrating an example comparator that can be included in the degree reducing unit of FIG. 14;

FIG. 16 is a diagram illustrating an example embodiment of a degree reducing unit that can be included in the LDPC decoder of FIG. 13 in accordance with another embodiment; and

FIG. 17 is a graph illustrating the FER performance for the LDPC decoder of FIG. 13 with a degree reduction of 6/3 and 7/3.

DETAILED DESCRIPTION

In the descriptions that follow, certain example parameters, values, etc., are used; however, it will be understood that the embodiments described herein are not necessarily limited by these examples. Accordingly, these examples should not be seen as limiting the embodiments in any way. Further, the embodiments of an LDPC decoder described herein can be applied to many different types of systems implementing a variety of protocols and communication techniques. Accordingly, the embodiments should not be seen as limited to a specific type of system, architecture, protocol, air interface, etc. unless specified.

A check node processor 302 of degree n is shown in FIG. 3. At each iteration, the outgoing soft messages {λ_i, i=1, 2, . . . n} are updated with the incoming soft messages {u_i,i=1,2 . . . n}. The outgoing soft message is defined as the logarithm of the ratio of probability that the corresponding bit is 0 or 1.

With the standard sum-product algorithm, the outgoing message is determined as follows:

$\begin{matrix} λ_{i} = 2 \tanh^{- 1} \prod_{\underset{j \neq i}{j = 1}}^{n} \tanh \frac{u_{j}}{2}, i = 1, 2 \dots n & (1) \end{matrix}$

The outgoing soft messages are then fed back to the variable node processors for use in generating the outputs u_i during the next iteration; however, a soft message λ_i that is based on the output of a particular variable node is not returned to that node. Hence the j≠i constraint in the following term of (1):

$\prod_{\underset{j \neq i}{j = 1}}^{n} \tanh \frac{u_{j}}{2}, i = 1, 2 \dots n .$
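By way of illustration only, equation (1) can be sketched in software as follows. The function name and the sample inputs are hypothetical and are not taken from any figure; the sketch assumes finite, moderate message values so that the product of tanh terms stays strictly inside (−1, 1).

```python
import math

def spa_check_node_update(u):
    """Sketch of equation (1): lambda_i = 2*atanh(prod_{j != i} tanh(u_j / 2))."""
    out = []
    for i in range(len(u)):
        prod = 1.0
        for j, u_j in enumerate(u):
            if j != i:  # the j != i constraint: skip the message from node i itself
                prod *= math.tanh(u_j / 2.0)
        # math.atanh raises ValueError if |prod| rounds to 1 (very large inputs).
        out.append(2.0 * math.atanh(prod))
    return out

# Hypothetical incoming messages for a degree-3 check node.
lam = spa_check_node_update([1.5, -2.0, 0.8])
```

Each outgoing message takes its sign from the product of the other inputs' signs, and its magnitude is smaller than the smallest of the other inputs' magnitudes.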

This can also be illustrated with the aid of FIG. 4, which is a diagram illustrating the operation of parity node processor 202. First, the LDPC decoder will initialize the variable data bits u₀, u₁, u₂, . . . u₆ of variable node processors 208, 210, 212, 214, 216, and 218 with r₀, r₁, r₂, . . . r₆. Referring to FIG. 4, u₀^{k−1}, u₂^{k−1}, and u₄^{k−1} are the variable messages sent from variable nodes 208, 212, and 216 to parity node processor 202. Parity node processor 202 operates on these messages and computes its messages λ^k. For example, λ^k(0→2) represents the message sent from parity node 202 to variable node 212 at the kth iteration.

The messages produced by parity node processor 202 can be defined using the following equations:

$\begin{matrix} \begin{matrix} λ^{k} (0 \to 0) = 2 \tanh^{- 1} [\tanh (\frac{u_{2}^{k - 1}}{2}) \tanh (\frac{u_{4}^{k - 1}}{2})] \\ λ^{k} (0 \to 2) = 2 \tanh^{- 1} [\tanh (\frac{u_{0}^{k - 1}}{2}) \tanh (\frac{u_{4}^{k - 1}}{2})] \\ λ^{k} (0 \to 4) = 2 \tanh^{- 1} [\tanh (\frac{u_{0}^{k - 1}}{2}) \tanh (\frac{u_{2}^{k - 1}}{2})] \end{matrix} & (2) \end{matrix}$

Thus parity node processor 202 can be configured to implement the above equations (2). The soft messages produced by the parity nodes, e.g., parity node 202, are then fed back to variable nodes 208, 210, 212, 214, 216, and 218, for use in the next iteration.

For example, FIG. 5 is a diagram illustrating the operation of variable node processor 208. Referring to FIG. 5, variable node processor 208 receives as inputs messages from parity node processors 202 and 206 and produces variable messages to be sent back to the same parity node processors 202 and 206. In the example of FIG. 4 and FIG. 5, hard decisions are taken on the multilevel variable u_n^k and checked to see if they meet the parity node equations defined above. If there is a match, or if a certain defined number of iterations is surpassed, then the decoder can be stopped.

Variable node processor 208 can be configured to implement the following equation:

$\begin{matrix} u_{0}^{k} = u_{ch,0} + λ^{k} (0 \to 0) + λ^{k} (2 \to 0), & (3) \end{matrix}$

where u_{ch,0} is the message from the channel, which does not change from iteration to iteration.
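As a minimal sketch, the variable node update of equation (3) and the associated hard decision can be expressed as follows. The function names are hypothetical, and the decision rule assumes the soft value is ln(P(bit=0)/P(bit=1)), so that a non-negative value maps to bit 0; the description does not state this mapping explicitly.

```python
def variable_node_total(u_ch, check_msgs):
    """Equation (3): channel message plus all incoming parity node messages."""
    return u_ch + sum(check_msgs)

def hard_decision(llr):
    """Assumed convention: llr = ln(P(0)/P(1)), so llr >= 0 decides bit 0."""
    return 0 if llr >= 0 else 1

# Hypothetical values: channel message 0.5, messages 1.0 and -2.0 from two parity nodes.
u0 = variable_node_total(0.5, [1.0, -2.0])
bit0 = hard_decision(u0)
```

Because the channel term is fixed, only the parity node messages need to be refreshed at each iteration.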

It will be understood that the decoder described above can be implemented using hardware and/or software configured appropriately and that while separate parity check processors and variable node processors are described, these processors can be implemented by a single processor, such as a digital signal processor, or circuit, such as an Application Specific Integrated Circuit (ASIC); however, as mentioned above, implementation of an LDPC processor such as that described with respect to FIGS. 2-5 can result in large complexity, stringent memory requirements, and interconnect complexity that can lead to bottlenecks. These issues can be exacerbated if multiple data rates are to be implemented. In other words, practical implementation of such a decoder can be limited.

As noted above, the sum-product algorithm of equation (1) can be prohibitive in terms of practical and cost effective implementation. Approximations have been proposed with the aim of reducing this complexity. For example, it can be shown that (4) is equivalent to (1):

$\begin{matrix} λ_{i} = u_{1} \oplus u_{2} \oplus \dots \oplus u_{i - 1} \oplus u_{i + 1} \oplus \dots \oplus u_{n}, & (4) \end{matrix}$

where the operator ⊕ is defined as:

$\begin{matrix} x \oplus y \overset{Δ}{=} \ln \frac{1 + e^{x + y}}{e^{x} + e^{y}} . & (5) \end{matrix}$
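The equivalence between the operator of equation (5) and the tanh form of equation (1) can be checked numerically for a pair of inputs. The following is a sketch only; the function names and test values are hypothetical.

```python
import math

def boxplus(x, y):
    """Equation (5): ln((1 + e^(x+y)) / (e^x + e^y))."""
    return math.log((1.0 + math.exp(x + y)) / (math.exp(x) + math.exp(y)))

def tanh_form(x, y):
    """Two-input case of equation (1): 2*atanh(tanh(x/2) * tanh(y/2))."""
    return 2.0 * math.atanh(math.tanh(x / 2.0) * math.tanh(y / 2.0))

# Hypothetical inputs with opposite signs.
diff = abs(boxplus(1.2, -0.7) - tanh_form(1.2, -0.7))
```

For moderate inputs the two forms agree to within floating-point rounding.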

Using the approximation formula:

$\begin{matrix} e^{x} + e^{y} \approx \max (e^{x}, e^{y}) = e^{\max (x, y)} . & (6) \end{matrix}$

Or equivalently,

$\begin{matrix} \ln (e^{x} + e^{y}) \approx \max (x, y) & (7) \end{matrix}$

in both numerator and denominator of (5), then the following can be obtained:

$\begin{matrix} \begin{matrix} x \oplus y \approx \max (0, x + y) - \max (x, y) \\ = \operatorname{sgn} (x) \operatorname{sgn} (y) \min (\left| x \right|, \left| y \right|) . \end{matrix} & (8) \end{matrix}$

Repeatedly substituting (8) into (4), the min-sum algorithm (MSA) can be obtained as follows:

$\begin{matrix} λ_{i} \approx \prod_{j \neq i} \operatorname{sgn} (u_{j}) \times \min_{j \neq i} \left| u_{j} \right|, \quad i = 1, 2, \dots n & (9) \end{matrix}$
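A direct, unoptimized sketch of the min-sum update of equation (9) is given below for illustration. The function names and sample inputs are hypothetical, a zero input is treated as positive (an assumption), and the check node degree is assumed to be at least 2.

```python
def sign(x):
    """Sign convention assumed here: zero counts as positive."""
    return -1.0 if x < 0 else 1.0

def min_sum_update(u):
    """Equation (9): product of the other signs times the minimum other magnitude."""
    out = []
    for i in range(len(u)):
        others = [u[j] for j in range(len(u)) if j != i]
        s = 1.0
        for v in others:
            s *= sign(v)
        out.append(s * min(abs(v) for v in others))
    return out

# Hypothetical incoming messages for a degree-3 check node.
lam_msa = min_sum_update([1.5, -2.0, 0.8])
```

Only comparisons and sign logic are needed; no exponential or logarithm is evaluated.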

It will be apparent that equation (9) is much simpler to implement than (1) or (4), but the cost of this simplification is a significant performance penalty, generally about 0.3 to 0.4 dB, depending on the specific code structure and code rate. To reduce this performance loss, some modifications have been proposed. The performance loss of MSA comes from the approximation error of (9) relative to (1). Accordingly, to reduce the performance loss, the approximation error should be reduced. It can be shown that (9) is always larger than (1) in magnitude. Thus, normalized-MSA and offset-MSA use scaling or offsetting to force the magnitude to be smaller.

With the normalized min-sum algorithm, (9) is scaled by a factor α:

$\begin{matrix} λ_{i} \approx α \prod_{j \neq i} \operatorname{sgn} (u_{j}) \times \min_{j \neq i} \left| u_{j} \right|, \quad \text{where } 0 < α \leq 1. & (10) \end{matrix}$

The offset min-sum algorithm reduces the magnitude by a positive constant β:

$\begin{matrix} λ_{i} \approx \prod_{j \neq i} \operatorname{sgn} (u_{j}) \cdot \max (\min_{j \neq i} \left| u_{j} \right| - β, 0) & (11) \end{matrix}$

But these approaches again increase the complexity. Thus, as mentioned above, there is a constant trade-off between complexity and performance.
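The two corrections of equations (10) and (11) can be sketched as post-processing applied to a single min-sum message. The function names are hypothetical; the values α = 0.8 and β = 0.15 are the factors used in the simulations reported in this description and are not intended as recommendations.

```python
import math

def normalize_msg(lam, alpha=0.8):
    """Equation (10): scale a min-sum message by alpha, with 0 < alpha <= 1."""
    return alpha * lam

def offset_msg(lam, beta=0.15):
    """Equation (11): reduce the magnitude by beta, never below zero, keeping the sign."""
    return math.copysign(max(abs(lam) - beta, 0.0), lam)

# Applied to a hypothetical min-sum message of -0.8.
lam_norm = normalize_msg(-0.8)
lam_off = offset_msg(-0.8)
```

The normalized form needs one multiplication per outgoing message, and the offset form needs one subtraction and one comparison against zero.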

The embodiments described below use a new approach for the check node update in the decoding of LDPC codes. The approach is based on a new approximation of the SPA that can reduce the approximation error of the MSA and has almost the same performance as the SPA under both floating-point operation and fixed-point operation. Moreover, the new approximation can be implemented in simple structures, the complexity of which is on par with MSA implementations.

The approximation error of MSA comes from the approximation error of equation (7). Note that equation (7) is coarse when x and y are close. MSA uses equation (7) in both numerator and denominator of equation (5). If the value of |x| and |y| is close, then either the numerator or the denominator can introduce large approximation error. Thus, to improve the accuracy of the outgoing message, equation (7) can be used in (5) only when the numerator or denominator of (5) will produce a small approximation error.
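The coarseness of equation (7) for nearly equal arguments can be seen with a short numerical sketch; the function name and the sample points are hypothetical.

```python
import math

def max_approx_error(x, y):
    """Error of equation (7): ln(e^x + e^y) - max(x, y), which equals ln(1 + e^(-|x - y|))."""
    return math.log(math.exp(x) + math.exp(y)) - max(x, y)

err_close = max_approx_error(2.0, 2.0)   # equal arguments: error is ln 2
err_far = max_approx_error(0.0, 10.0)    # well separated arguments: error is tiny
```

The error is largest, ln 2, when the two arguments are equal and decays exponentially as they separate.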

For example, when both x and y have the same sign, then using the approximation 1+e^{x+y}≈max(e^0, e^{x+y}) in the numerator will produce better results than using e^x+e^y≈e^{max(x,y)} in the denominator. Similarly, when x and y have opposite signs, then only approximating the denominator of (5) using e^x+e^y≈e^{max(x,y)} can produce better results. Thus, a better approximation of (5), for x, y>0, can be generated using the following:

$\begin{matrix} x \oplus y \approx \ln \frac{e^{x + y}}{e^{x} + e^{y}} = - \ln (e^{- x} + e^{- y}) . & (12) \end{matrix}$

For all combinations of the signs of x and y, the following general expression can be used:

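$\begin{matrix} x \oplus y \approx - \operatorname{sgn} (x) \operatorname{sgn} (y) \ln (e^{- \left| x \right|} + e^{- \left| y \right|}) & (13) \end{matrix}$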
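For illustration, the pairwise approximation of equation (13) can be compared against the exact operator of equation (5) and the min-sum approximation of equation (8). The function names and the test points are hypothetical.

```python
import math

def sign(x):
    """Sign convention assumed here: zero counts as positive."""
    return -1.0 if x < 0 else 1.0

def boxplus_exact(x, y):
    """Equation (5)."""
    return math.log((1.0 + math.exp(x + y)) / (math.exp(x) + math.exp(y)))

def boxplus_min_sum(x, y):
    """Equation (8)."""
    return sign(x) * sign(y) * min(abs(x), abs(y))

def boxplus_proposed(x, y):
    """Equation (13)."""
    return -sign(x) * sign(y) * math.log(math.exp(-abs(x)) + math.exp(-abs(y)))

exact = boxplus_exact(2.0, 3.0)
err_msa = abs(boxplus_min_sum(2.0, 3.0) - exact)
err_new = abs(boxplus_proposed(2.0, 3.0) - exact)
```

At this test point the error of equation (13) is well below that of equation (8). For small magnitudes the argument of the logarithm can exceed 1, in which case the result of equation (13) would carry the wrong sign unless it is limited.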

Iteratively substituting (13) into (4) produces:

$\begin{matrix} λ_{i} = - \prod_{\underset{j \neq i}{j = 1}}^{n} \operatorname{sgn} (u_{j}) \times \ln (\sum_{\underset{j \neq i}{j = 1}}^{n} e^{- \left| u_{j} \right|}) . & (14) \end{matrix}$

Note that (14) only holds when

$\sum_{\underset{j \neq i}{j = 1}}^{n} e^{- \left| u_{j} \right|} < 1.$

If this condition is not satisfied, then the sum can be limited to 1, resulting in the following:

$\begin{matrix} λ_{i} = - \prod_{\underset{j \neq i}{j = 1}}^{n} \operatorname{sgn} (u_{j}) \times \ln (\min (\sum_{\underset{j \neq i}{j = 1}}^{n} e^{- \left| u_{j} \right|}, 1)) . & (15) \end{matrix}$

Now, let

$A = \sum_{j = 1}^{n} e^{- \left| u_{j} \right|},$

then (15) can be expressed as:

$\begin{matrix} \begin{matrix} λ_{i} = - \prod_{j = 1, j \neq i}^{n} \operatorname{sgn} (u_{j}) \times \ln (\min (A - e^{- \left| u_{i} \right|}, 1)) \\ = - \prod_{j = 1, j \neq i}^{n} \operatorname{sgn} (u_{j}) \times \min \{ \ln (A - e^{- \left| u_{i} \right|}), 0 \} . \end{matrix} & (16) \end{matrix}$

The sign of (16) can be realized in the same way as in an MSA implementation, e.g., with a binary exclusive-OR (XOR) logic circuit. The kernel of the approximation has the invertibility property, which allows the aggregate soft message to be computed first, followed by an intrinsic back-out to produce the extrinsic updates.
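The aggregate-then-back-out computation can be sketched in software as follows, with the magnitude written in the equivalent max{−ln(·), 0} form. The function name and sample inputs are hypothetical, and the small floor applied before the logarithm is an added safeguard against floating-point cancellation, not part of the described approximation.

```python
import math

def proposed_check_node_update(u):
    """Sketch of equation (16): one aggregate sum A, then a per-output back-out."""
    exp_terms = [math.exp(-abs(v)) for v in u]
    A = sum(exp_terms)                          # aggregate over all inputs
    negatives = sum(1 for v in u if v < 0)
    total_sign = -1.0 if negatives % 2 else 1.0
    out = []
    for v, e in zip(u, exp_terms):
        own_sign = -1.0 if v < 0 else 1.0
        extrinsic = max(A - e, 1e-300)          # assumed guard against log(0)
        magnitude = max(-math.log(extrinsic), 0.0)
        # Multiplying by own_sign removes input i from the total sign product.
        out.append(total_sign * own_sign * magnitude)
    return out

lam_new = proposed_check_node_update([1.5, -2.0, 0.8])
```

For these inputs the first output is approximately −0.537, compared with approximately −0.596 from equation (1) and −0.8 from equation (9).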

The amplitude of equation (16) can be realized with a serial structure or a parallel structure, shown in FIGS. 6 and 7 respectively. Thus, FIG. 6 is a diagram illustrating a serial implementation of a parity node processor 600. As can be seen, the variable node outputs are first processed by processing block 602 and then accumulated in accumulator 604. Each input is then stored in shift register 606 and subtracted from the output of accumulator 604 in adder 610. The natural log of the resulting difference is then taken in processing block 608 in order to produce the soft outputs.

FIG. 7 is a diagram illustrating a parallel implementation of a parity node processor 700. Here, the inputs from the variable node processors are processed in parallel in processing blocks 702, 704, and 706 and then summed in summer 708. Each input is then subtracted from the output of summer 708 in parallel in adders 710, 712, and 714. The natural logs of the outputs of adders 710, 712, and 714 are then taken in parallel in processing blocks 716, 718, and 720 to produce the soft outputs.

Both structures 600 and 700 have the same computation load. Serial structure 600 requires smaller hardware size, but needs 2n clock cycles to get all outgoing soft messages. Parallel structure 700 requires only 1 clock cycle, but needs larger hardware size than serial structure 600. Parallel structure 700 is attractive when the decoding speed is the primary concern. It will be understood that the exponential and logarithm operations in FIGS. 6 and 7 can be realized in any way, such as look-up tables, software, or hardware, etc.

The ln(•) operation can include the min(•,0) operation, which can be implemented by simply using the sign bit of the logarithm result to clear the output. In particular, if the logarithm is realized with a look-up table, this can be done by simply setting the content of the table to 0 for all inputs greater than 1 or by simply limiting the range of the address used to pick up the table content.

The implementations of FIGS. 6 and 7 can be included in a receiver such as receiver 110. Such a receiver can be included in a device configured to operate in, e.g., a wireless Wide Area Network (WAN) or Metropolitan Area Network (MAN), a wireless Local Area Network (LAN), or a wireless Personal Area Network (PAN).

The computational complexity of the proposed implementations is similar to that of an MSA implementation. Table 1 compares the computation load of parity node processing for various decoding algorithms, where it has been assumed that SPA, MSA, normalized-MSA, and offset-MSA are implemented in a known forward-backward manner.

TABLE 1

Algorithm                  e^( )       ln( )       +            ×
SPA Eq. (4)                9(n − 2)    6(n − 2)    12(n − 2)    -
Eq. (14)                   n           n           2n − 1       -
MSA Eq. (9)                -           -           3(n − 2)     -
Normalized-MSA Eq. (10)    -           -           3(n − 2)     n
Offset-MSA Eq. (11)        -           -           4n − 6       -

FIGS. 8 and 9 are graphs showing respectively the simulated frame error rate (FER) and bit error rate (BER) performance for the irregular, ½-rate LDPC codes defined in 802.16eD12 under an AWGN channel for various decoding algorithms, including SPA, the proposed algorithm under both floating-point and fixed-point operation, MSA, normalized-MSA, and offset-MSA. For normalized-MSA and offset-MSA, a normalization factor of 0.8 and an offset factor of 0.15 are used. The check node degree distribution of the code is p(x)=0.6667x⁶+0.3333x⁷. The decoder uses layered decoding with a maximum of 30 iterations.

FIGS. 10 and 11 are graphs showing the corresponding simulation results for the irregular, ¾-rate LDPC codes with check node degree distribution p(x)=0.8333x¹⁴+0.1667x¹⁵. All the curves are simulated with floating-point operations except the curve labeled “proposed-quantization,” which shows the results of an implementation of equation (16) with a fixed-point decoder. In the simulation of the fixed-point decoder, the channel inputs are quantized to 8-bit binary integers, where 1 bit is used for the sign and the other 7 bits for the absolute value.

In the simulations, the variable node updates are integer summations with results ranging from −128 to +128. The exponential operation, e.g., in FIGS. 6 and 7, is implemented using a look-up table with 128 entries, each of which has 9 bits representing a quantized value in [0,1]. The summation and subtraction, e.g., in FIGS. 6 and 7, are 9-bit integer operations. The logarithm is a table with 512 entries, each of which has 7 bits representing the quantized absolute value to be sent to the variable nodes together with the sign bits.
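One possible table-based realization consistent with the table sizes given above is sketched below. The quantization step STEP, the rounding rule, and the clamping of the table address are assumptions introduced for illustration; the description does not specify them, and all names are hypothetical.

```python
import math

STEP = 0.125                  # assumed quantization step of the soft values (not given in the text)
FULL = (1 << 9) - 1           # 511: the 9-bit code that represents the value 1.0

# Exponential table: 128 entries addressed by the 7-bit magnitude, 9-bit output.
EXP_LUT = [round(FULL * math.exp(-k * STEP)) for k in range(128)]

# Logarithm table: 512 entries, 7-bit output magnitude. Address 0 saturates at 127.
LOG_LUT = [127] + [min(127, round(-math.log(s / FULL) / STEP))
                   for s in range(1, FULL + 1)]

def fixed_point_magnitudes(mags):
    """Outgoing magnitudes for integer input magnitudes in the range 0..127."""
    exp_terms = [EXP_LUT[m] for m in mags]
    total = sum(exp_terms)
    # Clamping the address to FULL plays the role of the min(., 0) limiting step.
    return [LOG_LUT[min(total - e, FULL)] for e in exp_terms]

out_fixed = fixed_point_magnitudes([12, 16, 6])
```

With these assumptions, any back-out sum that reaches or exceeds the code for 1.0 addresses the last table entry, whose content is 0.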

It can be seen from the graphs of FIGS. 8-11 that an implementation of equation (16) with floating-point operation can have almost the same performance as standard SPA, and performance that is better than that produced using MSA by 0.3-0.4 dB.

Moreover, although it can be challenging to meet the dynamic range requirements for the exp( ) operation, the simulation results show that the fixed-point operation has hardly any performance loss relative to the floating-point operation. Note that the number of quantization bits can be greatly reduced with non-uniform quantization, at the cost of increased complexity. With non-uniform quantization, the size of the logarithm and exponential tables can be reduced, but the quantized values should first be mapped to linearly quantized values before the summation operation in FIG. 6.

FIG. 12 is a flow chart illustrating an example method for performing LDPC decoding as described above. First, in step 1202, a wireless signal can be received, and the signal can be demodulated in step 1204. In step 1206, variable messages can be generated from the demodulated signal. An exponential operation can be performed on the variable messages in accordance with equation (16) in step 1208. In step 1210, the resulting exponential data can be summed, and the variable messages can be subtracted from the summed data in step 1212, again in accordance with equation (16). Finally, and again in accordance with equation (16), a logarithmic operation can be performed, in step 1214, on the difference produced in step 1212.

Accordingly, using the systems and methods described above, the resources, i.e., complexity, required to implement a parity node can be reduced, while still maintaining a high degree of precision. In certain embodiments, the complexity can be reduced even further through degree reduction techniques. In other words, the number of inputs to the parity node can be reduced, which can reduce the resources required to implement the parity node. It should also be noted that in many parity node implementations, the sign and the absolute value of the outgoing soft message are calculated separately.

FIG. 13 is a diagram illustrating a portion of an example LDPC decoder 1300 that includes degree reduction. In LDPC decoder 1300, the absolute values of the variable messages {u_i, i=1, 2, . . . n}, i.e., {|u₁|, |u₂|, . . . |u_n|}, are first input to Degree Reduction Unit (DRU) 1302, which produces a reduced number of outputs {u′₁, u′₂, . . . u′_m}, where m<n. In other words, DRU 1302 is configured to select m inputs out of n total inputs, where normally m<n. In certain embodiments, the inputs with the smallest absolute values can be chosen. The selected inputs {u′₁, u′₂, . . . , u′_m} are then a subset of {|u₁|, |u₂|, . . . |u_n|} such that no element of the set {|u₁|, |u₂|, . . . |u_n|}\{u′₁, u′₂, . . . , u′_m} is smaller than any element of {u′₁, u′₂, . . . , u′_m}.
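Functionally, the selection performed by the DRU can be sketched as follows. This is a behavioral model only, not the comparator network of the figures; the function name and sample magnitudes are hypothetical.

```python
def degree_reduce(mags, m):
    """Return the positions and values of the m smallest input magnitudes."""
    order = sorted(range(len(mags)), key=lambda k: mags[k])  # stable sort on magnitude
    idx = order[:m]
    return idx, [mags[k] for k in idx]

# Hypothetical magnitudes |u_1|..|u_8| (stored zero-based), reduced from degree 8 to 3.
idx, selected = degree_reduce([3.0, 0.5, 2.0, 4.0, 1.0, 5.0, 6.0, 0.7], 3)
```

Keeping the positions alongside the values lets a later stage route each result back to the matching output port.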

The outputs of DRU 1302 can then be provided to parity node processor 1304. Parity node processor 1304 can be implemented using either the serial configuration of FIG. 6 or the parallel configuration of FIG. 7.

Similarly, depending on the embodiment, DRU 1302 can be implemented in parallel or serial structures. FIG. 14 is a diagram illustrating a parallel configuration for DRU 1302. In the example of FIG. 14, DRU 1302 comprises 12 comparators configured to reduce the degree from 8 to 3. In other words, 8 input variable messages are reduced to three output messages to be passed to parity node processor 1304. It will be understood, of course, that different input and output degrees can be accommodated depending on the requirements of a particular implementation. It will also be understood that the greater the degree reduction, the greater the reduction in complexity of parity node processor 1304; however, this can also lead to reduced precision. Accordingly, the level of degree reduction should be chosen to balance resource savings against precision.

An example implementation for the comparators of FIG. 14 is illustrated in FIG. 15. As can be seen, the S output is the smaller of the two inputs, while the L output is the larger of the two.

In the example of FIG. 14, DRU 1302 is configured to select the smallest inputs. Thus, the comparators are configured to select the smallest input from each input pair. In this case, five levels of comparators are used to produce the 8 to 3 degree reduction. Comparators 1402a-1402d select the smallest input from the input pairs. These are then compared to the largest inputs from the input pairs in the second level of comparators, comprising comparators 1404a-1404d, in the manner shown. One of the outputs is then dropped, and the remaining inputs are compared in the third level of comparators 1406a-1406c. Two more outputs are then dropped, and the remaining inputs are compared in level four, comparator 1408, and level five, comparator 1410.

FIG. 16 is a diagram illustrating an example serial implementation of DRU 1302 in accordance with one embodiment. As can be seen, in the example of FIG. 16, serial DRU 1302 reduces the degree from n to 3. In this example embodiment, DRU 1302 comprises serial comparators, e.g., comparators 1608, 1610, and 1612, which can be implemented as illustrated in FIG. 15 and described above. Delay units 1602, 1604, and 1606 are included and correspond to one clock cycle. The inputs {|u₁|, |u₂|, . . . |u_n|} arrive sequentially, one input per clock cycle.
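A behavioral sketch of a serial n-to-3 reduction built from two-output comparators is given below. The wiring shown, a chain in which each stage keeps the smaller value and passes the larger one on, is an assumption made for illustration and may differ from the actual arrangement of FIG. 16; the function names are hypothetical.

```python
def comparator(a, b):
    """Two-input comparator in the manner of FIG. 15: returns (S, L)."""
    return (a, b) if a <= b else (b, a)

def serial_degree_reduce(mags, m=3):
    """Process one input per step; each register keeps the smaller value it sees."""
    regs = [float("inf")] * m          # registers start empty (modeled as +infinity)
    for x in mags:
        carry = x
        for k in range(m):
            regs[k], carry = comparator(regs[k], carry)
        # Whatever is carried out of the last stage is dropped.
    return regs

smallest = serial_degree_reduce([3.0, 0.5, 2.0, 4.0, 1.0, 5.0, 6.0, 0.7])
```

After all n inputs have been presented, the registers hold the m smallest magnitudes in ascending order.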

Parity node processor 1304 can be configured to calculate the absolute value of outgoing messages with equation (16), i.e., the second term of equation (16). In other words, the sign and absolute value for equation (16) can be determined separately using the following:

$\begin{matrix} λ_{i} = \prod_{j = 1, j \neq i}^{n} \operatorname{sgn} (u_{j}) \times \max \{ - \ln (A - e^{- \left| u_{i} \right|}), 0 \} & (16) \\ \operatorname{sgn} (λ_{i}) = \prod_{j = 1, j \neq i}^{n} \operatorname{sgn} (u_{j}) = \operatorname{sgn} (u_{i}) B & (17) \\ \left| λ_{i} \right| = \max \{ - \ln (A - e^{- \left| u_{i} \right|}), 0 \} & (18) \end{matrix}$

Thus, parity node processor 1304 can be used to calculate the absolute value in accordance with equation (18) for a check node of degree m. Parity node processor 1304 can be implemented as a serial or parallel parity node processor as described above.
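The sign path of equation (17) can be sketched with exclusive-OR operations on sign bits. The sketch assumes that B denotes the product of the signs of all n inputs, which is what equation (17) implies but is not stated explicitly, and it encodes a negative value as sign bit 1; the function name is hypothetical.

```python
def outgoing_sign_bits(u):
    """Sign bit of each outgoing message: total parity XOR the input's own sign bit."""
    bits = [1 if v < 0 else 0 for v in u]   # 1 encodes a negative input
    total = 0
    for b in bits:
        total ^= b                           # parity of all sign bits (assumed role of B)
    return [total ^ b for b in bits]

sign_bits = outgoing_sign_bits([1.5, -2.0, 0.8])
```

Computing the parity once and backing out each input's own bit avoids recomputing the sign product for every output.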

Output unit (OU) 1306 can be configured to simply connect the outputs of parity node processor 1304, i.e., {|λ′₁|, |λ′₂|, . . . |λ′_m|}, to the output ports {|λ₁|, |λ₂|, . . . |λ_n|}. For example, suppose there are 8 inputs {|u₁|, |u₂|, . . . |u₈|} and DRU 1302 selects m=3 of them. The selection result depends on the specific data values of {|u₁|, |u₂|, . . . |u₈|}. Suppose that for some specific inputs, the selection result is {u′₁=|u₂|, u′₂=|u₈|, u′₃=|u₅|}; then OU 1306 should connect |λ′₁|, |λ′₂|, and |λ′₃| to |λ₂|, |λ₈|, and |λ₅|, respectively, and connect −ln A to |λ₁|, |λ₃|, |λ₄|, |λ₆|, |λ₇|.

For this to be feasible, OU 1306 should be configured to operate in coordination with DRU 1302. For example, if the k-th input of DRU 1302, i.e., |u_k|, is selected by DRU 1302 as the j-th input of parity node processor 1304, i.e., u′_j, then OU 1306 can be configured to correspondingly connect the j-th output of parity node processor 1304 to |λ_k|.
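The cooperation of the DRU, the parity node processor, and the OU on the magnitude path can be sketched end to end as follows. This is a behavioral model under stated assumptions: A is taken as the sum over the m selected inputs only, the value routed to the unselected ports is limited at zero in the same way as equation (18), and m is assumed to be at least 2. The function name and sample magnitudes are hypothetical.

```python
import math

def reduced_check_node_magnitudes(mags, m):
    """Outgoing magnitudes for all n ports using only the m smallest inputs."""
    # DRU: positions of the m smallest magnitudes.
    idx = sorted(range(len(mags)), key=lambda k: mags[k])[:m]
    # Parity node processor of degree m.
    exp_terms = [math.exp(-mags[k]) for k in idx]
    A = sum(exp_terms)
    # OU: unselected ports receive -ln A (limited at zero, an assumption).
    out = [max(-math.log(A), 0.0)] * len(mags)
    # OU: the j-th processor output is routed back to the port it was selected from.
    for k, e in zip(idx, exp_terms):
        out[k] = max(-math.log(A - e), 0.0)
    return out

out_mags = reduced_check_node_magnitudes(
    [3.0, 1.5, 2.5, 4.0, 2.0, 5.0, 6.0, 1.8], 3)
```

In this example the ports at zero-based positions 1, 7, and 4 receive individually computed magnitudes, while the remaining five ports share the single value derived from A.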

It should be noted that while a parallel implementation of DRU 1302 can be paired with a parallel implementation of parity node processor 1304, and a serial implementation of DRU 1302 can be paired with a serial implementation of parity node processor 1304, such is not required. In other words, a parallel implementation of DRU 1302 can be paired with a serial implementation of parity node processor 1304 and vice versa. Moreover, it may be better, depending on the requirements of a particular implementation, to forgo the inclusion of DRU 1302 and OU 1306. For example, if decoding speed is of the most concern, then a combination of a parallel DRU 1302 and a parallel parity node processor 1304 can be the best choice. On the other hand, if hardware size and resources are the most important issue, then a serial parity node processor 1304 without any DRU 1302 or OU 1306 can be preferred. If the LDPC decoder is implemented, e.g., with a Digital Signal Processor (DSP), as in Software Defined Radio (SDR) terminals, a serial DRU 1302 and a serial parity node processor can be preferred because this combination provides the least decoding delay.

FIG. 17 is a graph illustrating simulation results for the decoder of FIG. 13, showing that such an embodiment can reduce the degree to 3 while causing a performance loss of less than 0.05 dB compared with SPA. The check node degrees of the simulated LDPC code are 6 and 7. Similar performance can be observed for a ¾-rate LDPC code whose check node degrees are 14 and 15.

Table 2 compares the LDPC decoding complexity with and without the degree reduction of FIG. 13. The data in Table 2 is for n=8 and m=3. The “Comparison” operation is normally less complex than the “Add” operation; thus the overall complexity with degree reduction is much less than without.

TABLE 2

                            exp    log    Add    Comparison
Without degree reduction    8      8      15     -
With degree reduction       3      4      4      13

While certain embodiments of the inventions have been described above, it will be understood that the embodiments described are by way of example only. Accordingly, the inventions should not be limited based on the described embodiments. Rather, the scope of the inventions described herein should only be limited in light of the claims that follow when taken in conjunction with the above description and accompanying drawings.

Claims

1. A receiver, comprising: a demodulator configured to receive a wireless signal comprising an original data signal, remove a carrier signal from the wireless signal and produce a received signal; and a Low Density Parity Check (LDPC) processor coupled with the demodulator, the LDPC processor configured to recover the original data signal according to the received signal, the LDPC processor comprising: a plurality of variable node processors configured to generate variable messages based on the received signal, and a parity node processor coupled with the plurality of variable node processors, the parity node processor configured to implement an approximation of a sum product algorithm (SPA) based on the signs of the variable messages resulting in soft outputs representing estimates of the variable messages.
2. The receiver of claim 1, wherein the parity node processor is configured to implement the following:
3. The receiver of claim 1, wherein the parity node processor comprises: a plurality of input processing blocks configured to receive the plurality of variable messages in parallel and perform an exponential operation on the variable messages in order to generate exponential terms for use in generating the soft messages; a summer coupled with the plurality of input processing blocks, the summer configured to sum the exponential terms generated by the plurality of input processing blocks in order to generate sum terms for use in generating the soft messages; a plurality of adders coupled with the summer and the plurality of input processing blocks, the plurality of adders configured to subtract the exponential terms from the sum terms in order to generate a difference term for use in generating the soft messages; and a plurality of output processing blocks coupled with the plurality of adders, the plurality of output processing blocks configured to perform a logarithm function on the outputs of the plurality of adders in order to produce the soft messages.
4. The receiver of claim 3, wherein the parity node processor further comprises a sign processing block coupled with the plurality of output processing blocks, the sign processing block configured to determine a sign associated with the outputs of the plurality of output processing blocks.
5. The receiver of claim 1, wherein the parity node processor comprises: an input processing block configured to serially receive the variable messages and perform an exponential operation on the variable messages in order to produce exponential terms for use in generating the soft messages; an accumulator coupled with the input processing block, the accumulator configured to accumulate the exponential terms in order to generate sum terms for use in generating the soft messages; a shift register coupled with the input processing block, the shift register configured to store the variable messages for one clock cycle; an adder coupled with the accumulator and the shift register, the adder configured to subtract the output of the shift register from the sum terms in order to produce difference terms for use in generating the soft messages; and an output processing block coupled with the adder, the output processing block configured to perform a logarithm function on the difference terms in order to generate the soft messages.
6. The receiver of claim 5, wherein the parity node processor further comprises a sign processing block coupled with the output processing block, the sign processing block configured to determine a sign associated with the output of the output processing block.
7. A LDPC decoder comprising a parity node processor configured to generate soft messages that are estimates of variable messages received from a plurality of variable nodes, the parity node processor comprising: a plurality of input processing blocks configured to receive the plurality of variable messages in parallel and perform an exponential operation on the variable messages in order to generate exponential terms for use in generating the soft messages; a summer coupled with the plurality of input processing blocks, the summer configured to sum the exponential terms generated by the plurality of input processing blocks in order to generate sum terms for use in generating the soft messages; a plurality of adders coupled with the summer and the plurality of input processing blocks, the plurality of adders configured to subtract the exponential terms from the sum terms in order to generate a difference term for use in generating the soft messages; and a plurality of output processing blocks coupled with the plurality of adders, the plurality of output processing blocks configured to perform a logarithm function on the outputs of the plurality of adders in order to produce the soft messages.
8. The parity node processor of claim 7, further comprising a sign processing block coupled with the plurality of output processing blocks, the sign processing block configured to determine a sign associated with the soft messages.
9. The parity node processor of claim 8, wherein the sign processing block is implemented using a binary ex-or logic circuit; wherein the plurality of input processing blocks are implemented as look up tables, wherein the plurality of output processing blocks are implemented as look up tables.
10. A LDPC decoder comprising a parity node processor configured to generate soft messages that are estimates of variable messages received from a plurality of variable nodes, the parity node processor comprising: an input processing block configured to serially receive the variable messages and perform an exponential operation on the variable messages in order to produce exponential terms for use in generating the soft messages; an accumulator coupled with the input processing block, the accumulator configured to accumulate the exponential terms in order to generate sum terms for use in generating the soft messages; a shift register coupled with the input processing block, the shift register configured to store the variable messages for one clock cycle; an adder coupled with the accumulator and the shift register, the adder configured to subtract the output of the shift register from the sum terms in order to produce difference terms for use in generating the soft messages; and an output processing block coupled with the adder, the output processing block configured to perform a logarithm function on the difference terms in order to generate the soft messages.
11. The parity node processor of claim 10, further comprising a sign processing block coupled with the output processing block, the sign processing block configured to determine a sign associated with the soft messages.
12. The parity node processor of claim 11, wherein the sign processing block is implemented using a binary ex-or logic circuit, and wherein the input processing block is implemented as a look up table, and wherein the output processing block is implemented as a look up table.
13. A method for processing a received wireless signal using a parity node processor included in an LDPC decoder, the method comprising: receiving the wireless signal; removing a carrier signal from the wireless signal to produce a received signal; generating variable messages from the received signal; performing an exponential operation on the variable messages to generate exponential data; summing the exponential data; subtracting the variable messages from the summed exponential data to form a difference; and performing a logarithmic operation on the difference.
14. The method of claim 13, wherein said summing the exponential data comprises accumulating the exponential data.
15. The method of claim 14, wherein said subtracting the variable messages from the summed exponential data comprises subtracting a time shifted version of a variable message from the accumulated exponential data.
16. An LDPC decoder comprising a parity node processor configured to generate soft messages that are estimates of variable messages received from a plurality of variable nodes, the LDPC decoder comprising: a plurality of variable node processors configured to generate variable messages based on the received signal; a degree reducing unit coupled with the plurality of variable node processors, the degree reducing unit configured to receive the plurality of variable messages and to reduce the degree of the variable messages prior to generation of the soft messages; and a parity node processor coupled with the degree reducing unit, the parity node processor configured to implement an approximation of a sum product algorithm (SPA) based on the signs of the reduced degree variable messages resulting in soft outputs representing estimates of the variable messages.
17. The LDPC decoder of claim 16, wherein the parity node processor comprises: a plurality of input processing blocks configured to receive the plurality of reduced degree variable messages in parallel and perform an exponential operation on the reduced degree variable messages in order to generate exponential terms for use in generating the soft messages; a summer coupled with the plurality of input processing blocks, the summer configured to sum the exponential terms generated by the plurality of input processing blocks in order to generate sum terms for use in generating the soft messages; a plurality of adders coupled with the summer and the plurality of input processing blocks, the plurality of adders configured to subtract the exponential terms from the sum terms in order to generate a difference term for use in generating the soft messages; and a plurality of output processing blocks coupled with the plurality of adders, the plurality of output processing blocks configured to perform a logarithm function on the outputs of the plurality of adders in order to produce the soft messages.
18. The LDPC decoder of claim 17, further comprising an output unit coupled with the parity node processor, the output unit comprising a plurality of output ports, the output unit configured to receive the soft messages and couple each of the soft messages to the appropriate output port.
19. The LDPC decoder of claim 18, further comprising a sign processing block coupled with the output unit, the sign processing block configured to determine a sign associated with the outputs of the output unit.
20. The LDPC decoder of claim 16, wherein the parity node processor comprises: an input processing block configured to serially receive the reduced degree variable messages and perform an exponential operation on the reduced degree variable messages in order to produce exponential terms for use in generating the soft messages; an accumulator coupled with the input processing block, the accumulator configured to accumulate the exponential terms in order to generate sum terms for use in generating the soft messages; a shift register coupled with the input processing block, the shift register configured to store the variable messages for one clock cycle; an adder coupled with the accumulator and the shift register, the adder configured to subtract the output of the shift register from the sum terms in order to produce difference terms for use in generating the soft messages; and an output processing block coupled with the adder, the output processing block configured to perform a logarithm function on the difference terms in order to generate the soft messages.
21. The LDPC decoder of claim 20, further comprising an output unit coupled with the parity node processor, the output unit comprising a plurality of output ports, the output unit configured to receive the soft messages and couple each of the soft messages to the appropriate output port.
22. The LDPC decoder of claim 21, further comprising a sign processing block coupled with the output unit, the sign processing block configured to determine a sign associated with the outputs of the output unit.
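
By way of illustration only, the exponential, summing, subtracting, and logarithm stages recited in the parallel and serial parity node processors above can be sketched in software as follows. The sketch is not the claimed hardware and rests on several assumptions that the claims do not state: the exponential operation is taken as exp(-|u|) on the magnitude of each variable message, the logarithm stage as the negated natural logarithm, the result is clamped at zero, and the serial version delays the exponential terms (rather than the raw variable messages) in a queue standing in for the shift register. The function names are hypothetical. At least two inputs of moderate magnitude are assumed so that every difference term stays positive.

```python
import math
from collections import deque


def _signs_excluding_self(u):
    """Sign of each output: XOR of the sign bits of all *other* inputs."""
    bits = [1 if x < 0 else 0 for x in u]
    parity = 0
    for b in bits:
        parity ^= b
    # XOR-ing the overall parity with an input's own bit removes that input.
    return [-1.0 if (parity ^ b) else 1.0 for b in bits]


def _output_block(diff):
    """Logarithm stage. Clamping at zero is an illustrative assumption."""
    return max(0.0, -math.log(diff))


def parity_node_parallel(u):
    """All inputs at once: exp -> sum -> subtract own term -> log."""
    exp_terms = [math.exp(-abs(x)) for x in u]   # input processing blocks
    total = sum(exp_terms)                       # summer
    diffs = [total - e for e in exp_terms]       # one adder per input
    mags = [_output_block(d) for d in diffs]     # output processing blocks
    return [s * m for s, m in zip(_signs_excluding_self(u), mags)]


def parity_node_serial(u):
    """One input per step: accumulate, then subtract the delayed terms."""
    acc = 0.0
    delayed = deque()                            # stand-in for the shift register
    for x in u:
        e = math.exp(-abs(x))                    # input processing block
        acc += e                                 # accumulator
        delayed.append(e)
    mags = []
    while delayed:
        mags.append(_output_block(acc - delayed.popleft()))  # adder + log
    return [s * m for s, m in zip(_signs_excluding_self(u), mags)]
```

Calling either function with, for example, the three variable messages `[2.0, -1.0, 3.0]` returns one soft message per input, each computed from the other two inputs only. Under the stated assumptions the two functions produce the same values, which reflects that the serial arrangement trades parallel hardware for additional clock cycles.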

RELATED APPLICATION INFORMATION

This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application Ser. No. 60/820,729, filed Jul. 28, 2006, entitled “Reduced-Complexity Algorithm for Decoding LDPC Codes,” which is incorporated herein by reference in its entirety as if set forth in full.

Provisional Applications (1)

	Number	Date	Country
	60820729	Jul 2006	US
