The present invention relates to an encoding and decoding method for digital signals, and more specifically, to a decoding method for low-density parity-check (LDPC) codes based on the belief propagation (BP) algorithm.
A low-density parity-check (LDPC) code is a linear block code defined by a very sparse parity-check matrix or, equivalently, by a bipartite graph. Since it was first discovered by Gallager, it is also called a Gallager code. After decades of neglect, and with the development of computer hardware and the related theory, MacKay and Neal rediscovered it and demonstrated that its performance approaches the Shannon limit. Recent research shows that LDPC codes have the following advantages: an LDPC code with a long code length can achieve error-free transmission at an extremely low signal-to-noise ratio, approaching the Shannon limit; LDPC codes are usually decoded with the BP algorithm, whose decoding complexity is proportional to the number of non-zero elements in the parity-check matrix, which in turn is proportional to the code length, so that linear-time decoding is possible for long codes and the approach to the Shannon limit is not only theoretical but also realizable; and the BP algorithm is inherently parallel, so it can be implemented with a highly parallel structure and can achieve very high decoding throughput.
The BP algorithm uses the following log likelihood ratios:
And the standard BP algorithm in log domain specifically comprises:
(1) Initializing: initializing LLR (qmn) according to the following equation:
(2) Updating the check node: updating LLR(rmn) according to the following equation:
Where:
(3) Updating the variable node: updating LLR(qmn) according to the following equation:
(4) Updating LLR(qn): updating LLR(qn) according to the following equation:
(5) Iteration termination judgment: performing hard decisions for the decoded output data according to the following equation:
In this step, if the condition Hx̂ᵀ=0 is satisfied or the maximum number of iterations has been reached, the whole decoding process terminates; otherwise, the process returns to (2) and continues iterating.
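The equations referred to in steps (1) to (5) are not reproduced in this text. As a hedged reference only, the form in which log-domain BP is commonly written is sketched below. It assumes BPSK transmission over an AWGN channel with noise variance σ², the mapping bit 0 → +1, and an LLR defined as ln(P(0)/P(1)); these conventions are assumptions and may differ from the original equations. The symbols α, β, N(m) and M(n) are those defined later in this description.

```latex
\begin{align*}
\text{(1)}\quad & \mathrm{LLR}(q_{mn}) = \frac{2y_n}{\sigma^2} \\
\text{(2)}\quad & \mathrm{LLR}(r_{mn}) =
  \Big(\prod_{n' \in N(m)\setminus n} \alpha_{mn'}\Big)\,
  \Phi\Big(\sum_{n' \in N(m)\setminus n} \Phi(\beta_{mn'})\Big),
  \qquad \Phi(x) = -\ln\tanh\frac{x}{2} \\
\text{(3)}\quad & \mathrm{LLR}(q_{mn}) = \frac{2y_n}{\sigma^2}
  + \sum_{m' \in M(n)\setminus m} \mathrm{LLR}(r_{m'n}) \\
\text{(4)}\quad & \mathrm{LLR}(q_n) = \frac{2y_n}{\sigma^2}
  + \sum_{m \in M(n)} \mathrm{LLR}(r_{mn}) \\
\text{(5)}\quad & \hat{x}_n =
  \begin{cases} 0, & \mathrm{LLR}(q_n) \ge 0 \\ 1, & \mathrm{LLR}(q_n) < 0 \end{cases}
\end{align*}
```

Here α_{mn'} = sign(LLR(q_{mn'})) and β_{mn'} = |LLR(q_{mn'})|.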
This standard BP algorithm decodes iteratively between the check node LLR(rmn) and the variable node LLR(qmn). The number of iterations is relatively large and the calculation is relatively complicated, so for a mobile terminal the silicon area and power consumption are relatively large, which hinders reducing the size of the terminal and extending battery life.
In view of the shortcomings of the prior art, the present invention offers a decoding method for LDPC codes based on the BP algorithm, which speeds up decoding convergence, decreases the number of iterations, and further simplifies the calculation so as to decrease complexity.
In order to achieve the above object, the present invention offers a decoding method for LDPC codes based on the BP algorithm, which performs iterative decoding between the check node LLR(rmn) and the variable node LLR(qmn) and comprises the following steps:
110) Initializing LLR (qmn) using the received LDPC bit stream;
120) Updating the check node LLR (rmn) and variable node LLR (qmn): after updating the LLR (rmn) corresponding to nonzero elements in each row in check matrix H, immediately updating the LLR (qmn) corresponding to all the nonzero elements in the column which corresponds to the nonzero elements in the row, and repeating the update row by row;
130) Updating LLR (qn) on the basis of LLR (rmn);
140) Performing the iteration termination judgment according to a maximum number of allowed iterations or whether the equation Hx̂ᵀ=0 is satisfied, and outputting the final decoding result xn;
wherein qmn is extra-decoding information, qn is a bit likelihood ratio, rmn is extrinsic information, LLR is a log likelihood ratio and x̂n is the decision result corresponding to LLR(qn). Furthermore, the method applies a simplified decoding algorithm to alleviate the heavy computational burden:
(1) Using the equation
to
perform simplified calculation, and specifically, in said step 120), performing updating according to the following pseudo code:
Where,
αmn′=sign(LLR(qmn′)), βmn′=|LLR(qmn′)|; N(m)={n: Hmn=1} means the set of the subscripts of all bits participating in the mth check equation; M(n)={m: Hmn=1} means the set of all check equations in which the nth bit participates; N(m)\n means the set of subscripts of all bits participating in the mth check equation except the nth bit; M(n)\m means the set of all check equations in which the nth bit participates except the mth check equation;
where yn is the channel output soft information obtained on the basis of the received LDPC bit stream at the time instant n, A is a constant which is determined through simulation.
(2) using yn to approximately replace 2yn/σ2, in said step 110), specifically performing initializing according to the following pseudo code:
Where, yn is the channel output soft information at the time instant n, σ2 is the channel noise variance.
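The reason the factor 2/σ² can be dropped may be illustrated as follows. This is a minimal sketch, assuming the check node update is a normalized min-sum of the form A × (product of signs) × (minimum magnitude); the function name `min_sum_row` and the sample values are hypothetical. Scaling every input by a positive constant scales every output by the same constant, so the signs, and hence the hard decisions, are unchanged.

```python
def min_sum_row(q, A=0.75):
    """Normalized min-sum update for one check row (illustrative only).

    For each position, combine the signs and the minimum magnitude of all
    *other* inputs in the row, then scale by the constant A.
    """
    out = []
    for i in range(len(q)):
        others = q[:i] + q[i + 1:]
        sign = -1.0 if sum(v < 0 for v in others) % 2 else 1.0
        out.append(A * sign * min(abs(v) for v in others))
    return out

y = [0.9, -0.3, 1.4, -1.1]      # hypothetical channel soft values y_n
sigma2 = 0.5                    # hypothetical noise variance
scale = 2.0 / sigma2            # the factor that the simplified method omits

r_from_y = min_sum_row(y)
r_from_scaled = min_sum_row([scale * v for v in y])
```

Every entry of `r_from_scaled` equals `scale` times the corresponding entry of `r_from_y`, which is why no signal-to-noise estimate is needed for this kind of update.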
Furthermore, using the equation LLR(qmn)=LLR(qn)−LLR(rmn) to simplify the storage of LLR(qmn) so as to reduce the required memory, in said steps 120) and 130), specifically performing the updating according to the following pseudo code:
Furthermore, the method is suitable for simultaneously supporting several communication standards: said check matrix is one of several LDPC basic matrices corresponding to a plurality of different communication standards and bit rates; the LDPC decoding method stores several LDPC basic matrices, selects from them the matrix corresponding to said received LDPC bit stream, and performs decoding with it.
Said communication standards include, but are not limited to, the Mobile TV standard or IEEE standards, wherein each communication standard may have one or more code rates.
Compared with the decoding method corresponding to the present standard BP algorithm, the LDPC decoding method based on BP algorithm offered by the present invention has the following advantages:
(1) In each iteration, after updating the LLR(rmn) corresponding to the nonzero elements in each row of the check matrix H, the LLR(qmn) corresponding to all the nonzero elements in the columns that contain those nonzero elements are updated immediately. This speeds up decoding convergence and reduces the number of iterations, which increases throughput; in the best case, only half as many iterations are needed as with decoding methods based on the standard BP algorithm;
(2) directly using the channel soft information yn to initialize the log likelihood ratio of the code bits without the channel noise variance σ2, that is, there is no need to estimate the signal to noise ratio corresponding to the codeword;
(3) Approximating the
to reduce the computation complexity;
(4) Calculating and obtaining LLR(qmn) according to LLR(qmn)=LLR(qn)−LLR(rmn) instead of storing it directly, thus reducing the number of storage units and the complexity.
A mobile terminal applying the method of the present invention can effectively decrease the power consumption and the silicon area of the decoding apparatus. By combining the method with several available basic matrices H, it can efficiently decode, at high speed, LDPC codes of multiple protocol standards, which also helps reduce the size of the terminal and extend battery life.
The present invention will be described in detail with reference to the accompanying figures and the specific implementation.
The present invention will be described in further detail in combination with hardware apparatus applying the method of the present invention.
The LDPC code in IEEE 802.16e is defined by six basic matrices. It comprises 4 code rates: 1/2, 2/3, 3/4 and 5/6, wherein code rate 1/2 and code rate 5/6 each have only one basic matrix, while code rate 2/3 and code rate 3/4 each have two basic matrices (A and B). For each code rate, the LDPC code in IEEE 802.16e has 19 different code lengths ranging from 576 to 2304 in steps of 96, corresponding respectively to 6 to 24 sub-channels in QPSK modulation. The Mobile TV standard and the IEEE 802.11n standard also employ LDPC codes whose basic matrices differ from those of the LDPC codes in IEEE 802.16e, and the apparatus of the present invention can decode LDPC codes with these different basic matrices. The LDPC basic matrix in IEEE 802.16e has 24 columns.
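The code-length figures above can be checked with a short calculation. The sketch below simply enumerates the lengths from 576 to 2304 in steps of 96 and divides each by the 24 base-matrix columns; the variable names are illustrative.

```python
N_B = 24  # number of columns in the basic matrix, as stated in the text

# Code lengths from 576 to 2304 inclusive, in steps of 96.
code_lengths = list(range(576, 2304 + 1, 96))

# Extension factor for each code length: z = n / 24.
extension_factors = {n: n // N_B for n in code_lengths}
```

This yields 19 code lengths with extension factors running from 24 (n = 576) to 96 (n = 2304).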
The function and connection of each module in the apparatus of the present invention which supports multi-protocol standard to decode LDPC codes is shown as
Wherein, the data input processing module is used to store the ping-pong RAM array of the bit likelihood ratio, including the input interface of the LDPC decoding apparatus;
The data shuffle module is used to perform the data shuffle function between the RAM read-write data and the check node processing module, and to make hard decision on the bit likelihood information before sending them to the data output processing module;
The check node processing module is used to calculate the corresponding extrinsic information rmn according to the bit likelihood ratio qn and the decoding extrinsic information qmn, and calculate the new qmn with modified min-sum algorithm, and use the new qmn to calculate the new qn;
The extrinsic information storage module is used to store the extrinsic information rmn during the decoding process;
The basic matrix storage module is used to store several kinds of information of the basic matrix;
The data output processing module is used to store the output ping-pong RAM array for outputting the hard decision bits and to packetize the decoded data;
The main control module is used to generate suitable control signals according to the condition of the data input/output ping-pong buffer and the parameters of the data packet, and the control signals are used to control the operation of the basic matrix storage module, the data input processing module, the extrinsic information storage module, the check node processing module and the data output processing module.
The main control module reads out the basic matrix data from the basic matrix storage module row by row, in particular the cyclic shift coefficients (which determine the generation of the RAM read-write addresses) and the connections between the cyclic shift coefficients (which determine the configuration of the data shuffle network), and enables the data input processing module and the extrinsic information storage module. The input data dat_in is sent to the data input processing module, and the main control module reads out the data qn and rmn from the memories of the data input processing module and the extrinsic information storage module and sends them through the data shuffle module to the check node processing module to be decoded. After a complete iteration, the main control module determines whether the check has passed or the maximum number of allowed iterations has been reached; if so, it stops iterating and outputs the present data; otherwise, it issues an enabling signal and starts a new iteration. After the decoding iterations are complete, the decoding result is sent to the data output processing module, which outputs the decoding result dat-out.
As shown in
It is mainly a ping-pong RAM array for storing the bit likelihood ratios, and includes the input interface of the LDPC decoding apparatus. To keep the input and output of the decoding apparatus smooth, a ping-pong buffer is used at the decoding input port, so that the data of the next frame can be received while the present frame is being processed. The input data of the decoding apparatus are bit likelihood ratios; the input interface data width is 32-b, so four bit likelihood ratios can be input at a time. Applying the ping-pong RAM array increases throughput.
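The ping-pong principle can be sketched in software as two banks whose roles are swapped at each frame boundary. This is an illustrative model only, not the hardware design; the class and method names are hypothetical.

```python
class PingPongBuffer:
    """Two-bank buffer: one bank is written while the other is read."""

    def __init__(self, size):
        self.banks = [[0] * size, [0] * size]
        self.write_bank = 0  # index of the bank currently receiving input

    def write_frame(self, frame):
        # Store the incoming frame in the bank reserved for writing.
        self.banks[self.write_bank][:len(frame)] = frame

    def swap(self):
        # At a frame boundary the two banks exchange roles.
        self.write_bank ^= 1

    def read_frame(self):
        # The decoder reads from the bank that is not being written.
        return self.banks[self.write_bank ^ 1]


buf = PingPongBuffer(4)
buf.write_frame([1, 2, 3, 4])      # frame 0 arrives
buf.swap()
buf.write_frame([5, 6, 7, 8])      # frame 1 arrives while frame 0 is decoded
frame_being_decoded = list(buf.read_frame())
buf.swap()
next_frame_being_decoded = list(buf.read_frame())
```

Because writing and reading never touch the same bank, input of the next frame does not stall processing of the present one.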
The size of each RAM is (36×96/4)-b = 864-b. The parameter register in
Each WORD in the RAM stores 4 bit likelihood ratios, and every read or write operates on a whole WORD. The check node processing module processes 4 rows simultaneously, so in each cycle 4 consecutive bit likelihood ratios must be read from each RAM. Because the cyclic shift coefficients of the basic matrix are effectively random, the start position of these 4 consecutive bit likelihood ratios within a WORD is also random; the data may therefore span two different WORDs, in which case the data read from the two WORDs must be merged. To realize this, a 36-bit data read-write alignment unit is added to each RAM. The data read-write alignment unit is inactive while interface data are being input and operates only in decoding mode.
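The merging performed by the alignment unit can be modelled as follows. This is a minimal sketch, assuming the RAM is a list of 4-entry WORDs and that a read running past the last WORD wraps to the first (mirroring a cyclic shift); the function name `read_four` and the wrap-around behaviour are assumptions.

```python
LLRS_PER_WORD = 4  # each RAM WORD holds 4 bit likelihood ratios (from the text)

def read_four(ram, start):
    """Return 4 consecutive LLRs beginning at LLR index `start`.

    `ram` is a list of WORDs, each a list of 4 LLRs. When `start` is not
    WORD-aligned, two WORDs are read and merged.
    """
    word = start // LLRS_PER_WORD
    offset = start % LLRS_PER_WORD
    if offset == 0:
        return list(ram[word])              # aligned: one WORD suffices
    next_word = (word + 1) % len(ram)       # assumed cyclic wrap-around
    merged = ram[word] + ram[next_word]     # 8 LLRs from two WORDs
    return merged[offset:offset + LLRS_PER_WORD]

# 24 WORDs of 4 LLRs = 96 LLRs; each entry holds its own index for clarity.
ram = [[4 * w + k for k in range(4)] for w in range(24)]
aligned = read_four(ram, 8)
straddling = read_four(ram, 6)
wrapped = read_four(ram, 94)
```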
As shown in
The data shuffle module performs the function of data shuffle between the RAM read-write data and the check node processing module, and makes hard decisions on the shuffled bit likelihood ratios before sending them to the data output module. The data shuffle network is configured by the matrix row connection coefficient read out from the basic matrix ROM.
The consumed resources (the number of switches) is 24×20×24×4=46080, this number is based on all-connection switching matrix, and actually, many connections will not be used according to the present matrix, and the required switches should be configured according to the practical connections in terms of the basic matrix so as to save resources.
The data shuffle module is configured according to the maximum row weight (20) of the IEEE 802.16e basic matrix. In a specific application the row weight may be lower than this maximum; in that case the idle qn is set to the maximum value and qmn is set to 0, so they do not affect the decoding result of the min-sum algorithm. In order to support the LDPC codes of multiple protocols, the data shuffle module can be configured according to the maximum row weight of the basic matrix in each protocol.
As shown in
The check node processing module uses the bit likelihood ratio qn and the extra-decoding information qmn to calculate the corresponding extrinsic information rmn, uses the layered modified min-sum algorithm to calculate the new qmn, and then uses the new qmn to calculate the new qn. The standard decoding algorithm for LDPC codes is the belief propagation (BP) algorithm, that is, a message passing algorithm on the bipartite graph of the code. The algorithm is defined on the LDPC parity-check matrix H(m×n), where n is the LDPC code length and m is the number of check bits.
The present invention applies the layered modified min-sum algorithm to calculate the new qmn and uses the new qmn to calculate the new qn. It differs from the standard BP algorithm in the log domain in three respects:
1. Directly use the channel soft information yn to initialize the log likelihood ratio of the code bits without the need for the channel noise variance σ2, thus without the need for estimating the channel signal to noise ratio corresponding to the codeword;
2. Use the following equation to approximate the value of
to reduce computational complexity:
Where,
and the constant A is related to the row weight of the check matrix H, and its value range is 0.6˜0.9, and the exact value should be determined through simulation.
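The effect of the constant A can be illustrated numerically. The sketch below assumes that the quantity being approximated is the usual log-domain magnitude Φ(Σ Φ(β)) with Φ(x) = −ln tanh(x/2), and that the approximation is A × min(β); both forms are assumptions, since the equations are not reproduced in this text, and the sample values are hypothetical.

```python
import math

def phi(x):
    """phi(x) = -ln(tanh(x/2)); for x > 0 this function is its own inverse."""
    return -math.log(math.tanh(x / 2.0))

def exact_magnitude(betas):
    # Magnitude of the check-node message in log-domain BP (assumed form).
    return phi(sum(phi(b) for b in betas))

def normalized_min(betas, A=0.8):
    # Modified min-sum: minimum magnitude scaled by the constant A.
    return A * min(betas)

betas = [1.2, 0.7, 2.5]            # hypothetical input magnitudes
exact = exact_magnitude(betas)
plain_min = min(betas)             # unscaled min-sum
scaled_min = normalized_min(betas)
```

For these inputs the plain minimum overestimates the exact magnitude, and scaling by A < 1 reduces the overestimate, which is the motivation for choosing A by simulation.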
3. In each iteration, after updating the LLR(rmn) corresponding to the non-zero elements in a row of the H matrix, update the LLR(qmn) corresponding to all non-zero elements in the columns containing each non-zero element of that row, and then decode the next row of the H matrix. Compared with the standard BP algorithm, this algorithm speeds up decoding convergence and reduces the number of iterations, which increases throughput, and in the best case it halves the number of iterations.
The layered modified min-sum algorithm specifically comprises:
(1) Initializing LLR(qmn) according to the following pseudo code:
Where, yn is the channel soft information, qmn is the extra-decoding information, qn is the bit likelihood ratio; rmn is the extrinsic information, LLR(qmn) is the log likelihood ratio of qmn; LLR(qn) is the log likelihood ratio of qn; LLR(rmn) is the log likelihood ratio of rmn.
(2) Updating the check node and the variable node: updating LLR(rmn) according to the following pseudo code:
Where:
αmn′=sign(LLR(qmn′))
βmn′=|LLR(qmn′)|
A: const, 0.0<A<1.0
N(m)={n:Hmn=1} represents the set of all bit subscripts joining in the mth check function;
M(n)={m:Hmn=1} represents the set of all check functions joined by the nth bit;
N(m)\n means the set of subscripts of all bits joining in the mth check function except the nth bit;
M(n)\m means the set of all the check functions joined by the nth bit except the mth check function;
yn is the channel output soft information at the time instant n;
xn is the codeword sent at the time instant n;
qmn is the extra-decoding information, rmn is the extrinsic information; LLR(rmn) is the log likelihood ratio of rmn; LLR(qmn) is the log likelihood ratio of qmn.
(3) Updating LLR(qn) according to the following pseudo code:
(4) Iteration termination judgment: hard decision on the decoding output according to the following pseudo code:
Where x̂n is the decoding output hard decision, qn is the bit likelihood ratio, and LLR(qn) is the log likelihood ratio of qn.
If the condition Hx̂ᵀ=0 is satisfied or the maximum number of allowed iterations is reached, the whole decoding process terminates; otherwise, the process returns to (2) and continues iterating.
During implementation, in order to decrease complexity, the above algorithm is transformed: LLR(qmn) is not stored directly but is computed according to the following equation:
LLR(qmn)=LLR(qn)−LLR(rmn)
And new processing procedures after transformation are as follows:
(11) Initializing LLR(qn) and LLR(rmn) according to the following pseudo code:
(12) Updating the check node and bit soft information: updating LLR(rmn) and LLR(qn) according to the following pseudo code:
Where:
A: const, 0.0<A<1.0
k: the kth iteration;
(13) The iteration termination judgment: hard decision on the decoding output according to the following pseudo code:
If the condition Hx̂ᵀ=0 is satisfied or the maximum number of allowed iterations is reached, the whole decoding process terminates; otherwise, the process returns to (12) and continues iterating.
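Steps (11) to (13) can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the pseudo code of the original: it assumes a normalized min-sum check update with constant A, a positive LLR meaning bit 0, and every row of H having at least two non-zero entries. The function name `layered_min_sum` and the example matrix (a small (7,4) Hamming parity-check matrix, chosen only for brevity) are hypothetical.

```python
def layered_min_sum(H, y, A=0.8, max_iter=20):
    """Layered normalized min-sum decoding (illustrative sketch)."""
    rows = [[j for j, h in enumerate(row) if h] for row in H]
    q = list(y)                                   # (11) LLR(q_n) = y_n
    r = [dict.fromkeys(cols, 0.0) for cols in rows]   # (11) LLR(r_mn) = 0
    x_hat = [0 if v >= 0 else 1 for v in q]
    for _ in range(max_iter):
        for m, cols in enumerate(rows):           # (12) one row at a time
            # LLR(q_mn) = LLR(q_n) - LLR(r_mn), computed instead of stored
            q_mn = {n: q[n] - r[m][n] for n in cols}
            for n in cols:
                others = [q_mn[k] for k in cols if k != n]
                sign = -1.0 if sum(v < 0 for v in others) % 2 else 1.0
                r[m][n] = A * sign * min(abs(v) for v in others)
                q[n] = q_mn[n] + r[m][n]          # immediate update of q_n
        x_hat = [0 if v >= 0 else 1 for v in q]   # (13) hard decision
        if all(sum(x_hat[n] for n in cols) % 2 == 0 for cols in rows):
            break                                 # H x_hat^T = 0
    return x_hat

H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
# All-zero codeword sent as +1; the third sample is received with wrong sign.
y = [1.0, 1.0, -0.4, 1.0, 1.0, 1.0, 1.0]
decoded = layered_min_sum(H, y)
```

Because q_n is refreshed as soon as each row is processed, later rows in the same iteration already use the updated values; in this small example the error is corrected within the first iteration.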
The difference between the layered decoding algorithm and the standard BP algorithm is as follows: in each iteration, after updating the LLR(rmn) corresponding to the non-zero elements in a row of the H matrix, the LLR(qmn) corresponding to all non-zero elements in the columns containing each non-zero element of that row are updated, and then the next row of the H matrix is decoded. Compared with the standard BP algorithm, this algorithm speeds up decoding convergence and reduces the number of iterations, which increases throughput, and in the best case it halves the number of iterations.
The check node processing module comprises four check node processing sub-units, that is, the module can process four rows of the LDPC parity-check matrix in parallel. The circuit implementation of the check node processing unit is shown in
As shown in
The main control module controls the operation of the whole decoder, and it generates a suitable control signal according to the condition of the data input/output ping-pong buffer and the parameters of the data packet. The output control signal will control the operation of the following modules: the data input processing module, the extrinsic information storage module, the basic matrix storage module and the data output processing module.
The main control module will generate all levels of control signals, including the level of data packet (the packet processing start signal, the packet processing end signal), the level of iteration (iteration processing start signal) and layer level (layer processing start signal). Under the control of these signals, the data input processing module reads and writes the bit likelihood information layer by layer, clears the input ping-pong buffer state register; the extrinsic information storage module initializes the extrinsic information storage module (reset to be zero), reads and writes the extrinsic information storage module layer by layer; the basic matrix storage module reads out the information in the basic matrix ROM row by row; the data output processing module writes the hard decision bit layer by layer, resets the output ping-pong buffer register.
As shown in
For the coefficient of the basic matrix, the following processing should be performed according to the parameters for the code length after reading out the coefficient from the ROM: for the codes with the code rate being 1/2, 2/3B, 3/4A and B, 5/6, their corresponding shift factors {p(f,i,j)} can be obtained by operating {p(i,j)} according to the following equation:
Where └x┘ means the round-down operation for x.
For the code with the code rate being 2/3A, the corresponding shift factor {p(f,i,j)} can be obtained by processing {p(i,j)} according to the following equation:
Where each basic matrix has nb=24 columns, and its extension factor is zf=n/24 when the code length is n. The subscript f (f=0, 1, 2, …, 18) is the index of the code length for each code rate. For the code length of 2304, the extension factor is z0=96.
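The two shift-factor equations are not reproduced in this text. The sketch below assumes they take the form commonly given for IEEE 802.16e: non-positive entries are left unchanged; positive entries are scaled proportionally and rounded down for rates 1/2, 2/3B, 3/4A, 3/4B and 5/6, and reduced modulo zf for rate 2/3A. The function name `shift_factor` and its parameters are hypothetical.

```python
def shift_factor(p, z_f, z_0=96, rate_2_3A=False):
    """Derive p(f,i,j) from the base shift p(i,j) (assumed 802.16e rule).

    p    : base-matrix entry defined for the largest code length (z_0 = 96)
    z_f  : extension factor of the target code length, z_f = n / 24
    """
    if p <= 0:
        return p                       # zero and negative entries are kept
    if rate_2_3A:
        return p % z_f                 # modulo rule for rate 2/3A
    return (p * z_f) // z_0            # round-down scaling for other rates

scaled = shift_factor(94, 24)                    # floor(94 * 24 / 96)
modulo = shift_factor(94, 24, rate_2_3A=True)    # 94 mod 24
unchanged = shift_factor(-1, 24)
```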
After calculation, the coefficient and the in-row ROM needs a RAM of 16K, and storing the shuffle coefficient also needs a 16K RAM. The amount of data in the ROM storing the decoding correction coefficient is relatively small, and it can be implemented by combinational logic.
The width of the coefficient ROM is 64-b (two 32-b words), so 6 coefficients together with their in-column index offsets (10-b×6 in total) can be read out at a time, and 4 reads are needed to read out the 24 coefficients; the same applies to the 48 data shuffle indexes.
The readout coefficients are buffered in a two-stage register. The purpose of the two stages is that the configuration coefficients of the next layer can be read out while those of the current layer are in use, and the current configuration coefficients are updated at the layer switch.
The basic LDPC matrices of different protocols are different, and increasing the number of basic matrices stored in the basic matrix storage module will provide more support for the LDPC code of new standard.
As shown in
The output ping-pong RAM array makes the decoding process and the data output process independent of each other, so they can proceed concurrently. Applying the ping-pong RAM array increases throughput. The size of each RAM is 96 bits, and the width of the read-write data is 4-b. The IEEE 802.16e LDPC basic matrix has 24 columns, so 48 RAMs are needed. The RAM read-write addresses are generated according to the basic matrix coefficients read out row by row from the basic matrix storage module. The parameter register in
The present invention applies the layered modified min-sum algorithm. Compared with the standard LDPC decoding algorithm (the BP algorithm), this algorithm speeds up decoding convergence and effectively reduces the number of iterations, which increases throughput, and in the best case it halves the number of iterations.
The basic matrix storage module in the apparatus of the present invention can store the basic matrices of different protocols and can perform decoding operation on the basis of the stored basic matrices. Therefore, the apparatus of the present invention can support multi-bit-rate and multi-protocol LDPC codes and effectively decrease the consumed power and the silicon area in the decoding apparatus, thus can high-efficiently implement the decoding of the multi-protocol LDPC with high speed.
Number | Date | Country | Kind |
---|---|---|---|
200610167367.8 | Dec 2006 | CN | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/CN2007/001175 | 4/11/2007 | WO | 00 | 6/26/2009 |