Decoding q-ary low-density parity-check (LDPC) codes includes check node processing, which is a bottleneck in LDPC decoder operation. Check node processing includes setting up a trellis to search for the most likely combinations of the q-ary levels for all symbols involved in a check. It would be desirable to develop new techniques for performing check node processing, at least some embodiments of which reduce or mitigate complexity and/or memory space requirements.
Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. As used herein, the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.
A dynamic programming approach can also compute, to a good approximation, the log-likelihood ratio vector (LLRV) of $x_k$ by finding the best forward and backward paths that can be connected with a given level of $x_k$ at symbol stage $k$. Assuming the LLRV is truncated to have only $n_m$ ($< q$) levels, the trellis has $n_m$ nodes in a given stage with $n_m$ branches coming out of each node. In either the forward or the backward trellis scan, at each symbol stage, $n_m$ distinct LLR values associated with the branches are added to each of the $n_m$ LLR values associated with the $n_m$ nodes, giving rise to $n_m^2$ possible levels and corresponding LLR values. At the end of the symbol processing stage, only the $n_m$ best distinct nodes and the corresponding accumulated LLR values are kept. Note that the trellis is dynamic in that the set of selected nodes is in general different from one stage to the next. For the truncated system, the LLRs for a symbol are stored in a vector of size $n_m$. Each element in the vector, however, needs to be specified with the corresponding q-ary level. It is convenient to imagine, for example, that each element in the LLR vector now has a wider, say, 13-bit representation, with the first 8 bits indicating the q-ary level and the last 5 bits the corresponding LLR.
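The 13-bit element layout mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the LLR is taken to be an unsigned 5-bit quantized value, smaller values are taken to mean more likely levels (matching the min-based metrics used in this description), and the helper names `pack_entry`, `unpack_entry`, and `truncate_llrv` are hypothetical.

```python
LEVEL_BITS = 8  # first 8 bits: q-ary level (enough for q = 256)
LLR_BITS = 5    # last 5 bits: quantized LLR (assumed unsigned, 0..31)


def pack_entry(level: int, llr: int) -> int:
    """Pack a (q-ary level, quantized LLR) pair into one 13-bit word."""
    if not 0 <= level < (1 << LEVEL_BITS):
        raise ValueError("level does not fit in 8 bits")
    if not 0 <= llr < (1 << LLR_BITS):
        raise ValueError("LLR does not fit in 5 bits")
    return (level << LLR_BITS) | llr


def unpack_entry(word: int) -> tuple[int, int]:
    """Recover the (level, LLR) pair from a packed 13-bit word."""
    return word >> LLR_BITS, word & ((1 << LLR_BITS) - 1)


def truncate_llrv(full_llrs: list[int], n_m: int) -> list[int]:
    """Keep the n_m levels with the smallest LLR (assumption: smaller = more likely).

    full_llrs[level] holds the quantized LLR of that level; ties are broken
    by level index so the result is deterministic.
    """
    best_levels = sorted(range(len(full_llrs)), key=lambda lvl: (full_llrs[lvl], lvl))[:n_m]
    return [pack_entry(lvl, full_llrs[lvl]) for lvl in best_levels]
```

Storing the level alongside each LLR is what lets a truncated vector of size $n_m$ stand in for the full q-entry vector.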
At 100, $f(x(s_{k-1}, s_k)) = A(s_{k-1}) + B(s_k)$ is calculated for $n_m^2$ pairs of consecutive state variables $\{s_{k-1}, s_k\}$, where $A$ and $B$ are the forward and backward node metrics of Equations (1) and (2).
The LLR values for a given symbol $x_k$ are obtained as follows. For each of the $n_m$ trellis nodes $s_k \in GF(q)$, a metric $A(s_k)$ is computed:
$$A(s_k) = \min_{s_{k-1}} \{A(s_{k-1}) + \Gamma(x_k = x)\} \qquad (1)$$
where $\Gamma(x_k = x)$ is the metric associated with a branch $x_k = x \in GF(q)$ connecting the trellis nodes $s_{k-1}$ and $s_k$ (the connection implies the constraint $s_{k-1} + h_k x = s_k$). $s_k$ is referred to as a state variable or a node and may represent an iteration of LDPC processing. Note that $\Gamma(x_k = x)$ is the LLR $\ln\{P(x_k = x)/P(x_k = 0)\}$ obtained in the previous LDPC decoder iteration. Equation (1) depicts forward trellis scanning. The initial node is set to $s_0 = 0$ with its probability or likelihood measure set to zero, i.e., $A(s_0) = 0$. Beyond the initial stage, the $n_m$ values of $s_{k-1}$ and the $n_m$ values of $x_k = x$, which all belong to GF(q), may all be different, meaning that finding the best $n_m$ trellis nodes at time $k$ may need to consider up to $n_m^2$ pairs of $\{s_{k-1}, x_k\}$ values. At the last stage, the check node constraint requires that sd
For each node $s_k \in GF(q)$, a second metric $B(s_k)$ is computed:
$$B(s_k) = \min_{s_{k+1}} \{B(s_{k+1}) + \Gamma(x_{k+1} = x)\} \qquad (2)$$
where $s_k$ and $x$ are such that $s_{k+1} + h_{k+1} x = s_k$. This represents backward trellis scanning. Again, finding the best $n_m$ nodes at time $k$ may require examination of up to $n_m^2$ pairs of $\{s_{k+1}, x_{k+1}\}$. There is also no need for this minimization at the initial and final trellis stages of the backward scan. In some embodiments, the node metric B(sd
The LLR values for $x_k = x \in GF(q)$ are then obtained using the stored values of the trellis node metrics:
$$L_k(x) = \min_{s_{k-1},\, s_k} \{A(s_{k-1}) + B(s_k)\} \qquad (3)$$
with the minimization taken under the constraint $s_{k-1} + h_k x + s_k = 0$. Equation (3) computes the LLR vector for symbol $x_k$ based on examination of up to $n_m^2$ pairs of forward and backward nodes $\{s_{k-1}, s_k\}$ and selection of up to $n_m$ best pairs (with the connecting branches corresponding to up to $n_m$ distinct levels for $x_k$). This procedure can be stated as first computing
$$f(s_{k-1}, s_k) = A(s_{k-1}) + B(s_k) \qquad (3a)$$
for $n_m^2$ pairs of $\{s_{k-1}, s_k\}$. For example, if $n_m = 4$ then 16 values of $f(s_{k-1}, s_k)$ would be computed; if $n_m = 2$ then 4 values of $f(s_{k-1}, s_k)$ would be computed.
At 102, the $n_m$ lowest values are selected from the $n_m^2$ calculated values of $f(x(s_{k-1}, s_k)) = A(s_{k-1}) + B(s_k)$, and the log likelihood ratios (LLRs) are set to those lowest values. For example, if $n_m^2 = 16$ then the 4 lowest values are selected and 4 LLRs are set to those 4 values.
At 104, the $n_m$ values of $x$ that correspond to the $n_m$ lowest values are determined. For example, the $n_m$ q-ary levels are $x_k = h_k^{-1}(s_{k-1} + s_k)$. Each of these levels corresponds to one of the LLR values obtained at 102. It is possible that only $n'_m < n_m$ pairs of $\{s_{k-1}, s_k\}$ are connected to each other. In this case, only the $n'_m$ q-ary levels are used to represent the LLRV. The corresponding probability mass function (PMF) is set such that the remaining $q - n'_m$ levels for $x_k = x$ are assigned an equal low probability value. In this way the overall probability sum is maintained at 1.
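Steps 100 through 104 can be sketched in a few lines. The sketch below is illustrative only and rests on several assumptions not stated in the text: the field is taken to be GF(8) built from the polynomial $x^3 + x + 1$ (so that addition is XOR), node metrics are held in plain dictionaries, and the names `gf_mul`, `gf_inv`, and `llrv_for_symbol` are hypothetical. When two node pairs map to the same level, the smaller $f$ value is kept, as in Equation (3).

```python
def gf_mul(a: int, b: int, m: int = 3, poly: int = 0b1011) -> int:
    """Multiply two elements of GF(2^m); poly is an assumed field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << m):  # reduce when the degree reaches m
            a ^= poly
    return result


def gf_inv(a: int, m: int = 3, poly: int = 0b1011) -> int:
    """Brute-force multiplicative inverse; adequate for a small example field."""
    for cand in range(1, 1 << m):
        if gf_mul(a, cand, m, poly) == 1:
            return cand
    raise ZeroDivisionError("0 has no multiplicative inverse")


def llrv_for_symbol(a_prev: dict[int, float], b_cur: dict[int, float],
                    h_k: int, n_m: int) -> list[tuple[int, float]]:
    """Return up to n_m (level, LLR) pairs for symbol x_k.

    a_prev maps forward nodes s_{k-1} to A(s_{k-1});
    b_cur maps backward nodes s_k to B(s_k).
    """
    h_inv = gf_inv(h_k)
    best: dict[int, float] = {}
    for s_prev, a in a_prev.items():          # step 100: all node pairs
        for s_cur, b in b_cur.items():
            f = a + b                         # Equation (3a)
            x = gf_mul(h_inv, s_prev ^ s_cur) # step 104: x = h_k^{-1}(s_{k-1} + s_k)
            if x not in best or f < best[x]:
                best[x] = f
    ranked = sorted(best.items(), key=lambda item: (item[1], item[0]))
    return ranked[:n_m]                       # step 102: keep the lowest values
```

If fewer than $n_m$ distinct levels are reachable, the function simply returns fewer pairs, which corresponds to the $n'_m < n_m$ case described above.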
In summary, finishing the trellis search based on the steps of (1), (2) and (3) above generates messages (in the form of LLRVs) from a check to all its $d_c$ symbol neighbors. In comparison, for a binary LDPC decoder, this operation is analogous to $d_c$ min-sum operations (each applied to $d_c - 1$ members). Let $P$ denote the amount of computation required to process one full symbol stage, which involves symbol and LLR additions and selection of the $n_m$ smallest values out of $n_m^2$ metrics. The overall complexity of the above algorithm can be expressed in terms of $P$. It requires $(d_c - 2) \times P$ computations for either the forward scan (1) or the backward scan (2) and another $(d_c - 2) \times P$ computations for the LLRV computation step (3). The overall complexity is thus $3(d_c - 2) \times P$. As for the storage space needed, other than the input $\Gamma(x_k = x)$ and the output $L_k(x)$ LLRV sequences, the algorithm needs to store all $(d_c - 2) \times n_m$ node metrics for either the forward scan (A's in Equation 1) or the backward scan (B's in Equation 2).
In some embodiments, to reduce complexity and/or memory requirements, the symbols are first rearranged so that each of the first (e.g., 15) symbols is characterized by one large LLR value (i.e., these symbols are already an LLRV of size nm). In at least some cases, arranging symbols based on the probability distribution improves the complexity/performance tradeoff. Some examples are described in further detail below.
In some embodiments, the process shown in
In some embodiments, although there is only one branch shown in tail portion 300 (and thus only one q-ary level specified) in a given stage, a check node processor is configured to generate LLRVs based on nm levels for all symbols, so that all the branch metrics used in the next iteration of check node processing are expressed using nm levels. In such embodiments, all nm survivors are carried along from the head section into the tail during the backward scan. Although these nm survivors may not necessarily be the best paths to date in the backward scan, they represent reasonably good paths.
For the symbols in the tail region of the forward trellis (i.e., region 300), the forward/backward dynamic programming reduces to a simple version. For forward scan of the tail, one survivor path is maintained by computing:
$$A(s_k) = A(s_{k-1}) + \Gamma(x_k), \quad s_{k-1} + h_k x_k = s_k \qquad (4)$$
along the path defined by the most likely level sequence (for $x_1$ through $x_6$ in
$$B(s_k^i) = B(s_{k+1}^i) + \Gamma(x_{k+1}), \quad s_{k+1}^i + h_{k+1} x_{k+1} = s_k^i, \quad 1 \le i \le n_m \qquad (5)$$
where $s_k^i$ is the i-th node at symbol stage $k$ in the backward scan and $x_{k+1}$ corresponds to the single path in the tail of
$$L(x(s_{k-1}, s_k^i)) = A(s_{k-1}) + B(s_k^i), \quad 1 \le i \le n_m \qquad (6)$$
where $x(s_{k-1}, s_k^i)$, a particular q-ary level for $x_k$, is determined by the pair of consecutive nodes $s_{k-1}$ and $s_k^i$ as $x(s_{k-1}, s_k^i) = h_k^{-1}(s_{k-1} + s_k^i)$. Now that the LLR values for symbols $x_1$ through $x_6$ in
In some embodiments, the process shown in
The backward scan depicted in
The minimization technique via forward-backward dynamic programming is not optimal in that only one path is considered in computing the message from the check node to a given symbol. In some embodiments, to recover some of the lost performance, a scaling factor and/or offset term is used in the computation of the LLRs. In particular, the LLR value obtained above can be modified with:
$$L_k(x) \leftarrow \max\{\alpha L_k(x) - \beta,\, 0\}$$
where α is a scaling constant between 0 and 1, and β is a positive offset. The constants α and β can be determined empirically and may be allowed to change from one iteration stage to the next.
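The scale-and-offset correction is a one-line operation per LLR. The sketch below is a minimal illustration; the function name `scale_and_offset` and the default values of `alpha` and `beta` are placeholders chosen for the example, not values given in this description, which states only that the constants are determined empirically.

```python
def scale_and_offset(llrv: list[float], alpha: float = 0.75, beta: float = 0.5) -> list[float]:
    """Apply L <- max(alpha * L - beta, 0) to every LLR in the vector.

    alpha must lie strictly between 0 and 1 and beta must be positive;
    the defaults are illustrative placeholders only.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if beta <= 0.0:
        raise ValueError("beta must be positive")
    return [max(alpha * value - beta, 0.0) for value in llrv]
```

Because the constants may change from one iteration stage to the next, a decoder could pass a different `alpha` and `beta` on each iteration.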
The complexity level of the proposed scheme based on Equations (4)-(6) is dominated by $(n_c - 2) \times P$ computations for the forward scan, $(n_c - 1) \times P$ computations for the backward scan and $(n_c - 2) \times P$ computations for the LLRV computation step, all for the head portion of the trellis. Letting $n_c = \gamma d_c$, so that $\gamma$ represents the portion of the symbols that retain $n_m$-level PMFs, the complexity savings relative to uniform-level PMFs is given by
$$\frac{[(n_c - 2) + (n_c - 1) + (n_c - 2)]\, P}{3(d_c - 2)\, P} = \frac{3 n_c - 5}{3 d_c - 6} \approx \gamma.$$
So if γ=0.4, for example, the complexity is reduced to 40%.
The storage space is again dominated by the head section, requiring roughly $(n_c - 2) \times n_m$ node metrics to be stored during the forward scan. The storage savings relative to the forward-backward trellis search with the uniform PMF truncation level $n_m$ is thus
$$\frac{(n_c - 2)\, n_m}{(d_c - 2)\, n_m} \approx \gamma,$$
which is the same as the relative complexity savings.
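The operation counts quoted above can be checked numerically. The following sketch simply evaluates them in units of $P$ (and in node metrics for storage); the function names are hypothetical, and rounding $n_c = \gamma d_c$ to an integer is an assumption made for the example.

```python
def full_cost(d_c: int) -> int:
    """Forward, backward and LLRV steps with uniform n_m-level PMFs, in units of P."""
    return 3 * (d_c - 2)


def head_cost(n_c: int) -> int:
    """Dominant cost of the head/tail scheme, in units of P."""
    return (n_c - 2) + (n_c - 1) + (n_c - 2)


def relative_complexity(d_c: int, gamma: float) -> float:
    """Ratio of head/tail cost to full cost for n_c = round(gamma * d_c)."""
    n_c = round(gamma * d_c)
    return head_cost(n_c) / full_cost(d_c)


def relative_storage(d_c: int, gamma: float) -> float:
    """Ratio of stored node metrics; the common factor n_m cancels."""
    n_c = round(gamma * d_c)
    return (n_c - 2) / (d_c - 2)
```

For small check degrees the constant terms matter, so the ratios fall somewhat below $\gamma$; they approach $\gamma$ as $d_c$ grows.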
Let us now take a closer look at the computational requirement associated with $P$. A straightforward way to do processing for one full symbol stage is basically adding $n_m$ different branch values to each of $n_m$ distinct node values (path metrics updated in the previous stage) and then finding $n_m$ survivor paths out of $n_m^2$ candidates for the $n_m$ distinct nodes. The set of nodes changes dynamically from one stage to the next, so hardware implementation is different from the case of the Viterbi algorithm operating on a fixed trellis structure with $n_m$ states and $n_m$ branches out of (and into) each state. Some techniques can roughly accomplish this task in $2 n_m \log_2 n_m$ max operations (the max operation compares two values and selects the larger) and the same number of q-ary symbol additions as well as LLR additions.
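The straightforward version of one full symbol stage can be sketched as below. This is the plain $n_m^2$ enumeration, not the faster technique with roughly $2 n_m \log_2 n_m$ max operations. As assumptions for the example, the field is GF(8) with polynomial $x^3 + x + 1$, smaller metrics are treated as better (as in Equation (1)), and the names `gf_mul` and `process_stage` are hypothetical.

```python
def gf_mul(a: int, b: int, m: int = 3, poly: int = 0b1011) -> int:
    """Multiply two elements of GF(2^m); poly is an assumed field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << m):  # reduce when the degree reaches m
            a ^= poly
    return result


def process_stage(nodes: dict[int, float], branches: dict[int, float],
                  h_k: int, n_m: int) -> dict[int, float]:
    """One forward stage: extend every node by every branch, keep n_m survivors.

    nodes maps s_{k-1} to its path metric A(s_{k-1});
    branches maps a level x of x_k to its branch metric Gamma(x_k = x).
    """
    candidates: dict[int, float] = {}
    for s_prev, path_metric in nodes.items():
        for x, branch_metric in branches.items():
            s_cur = s_prev ^ gf_mul(h_k, x)      # s_{k-1} + h_k x = s_k
            metric = path_metric + branch_metric
            if s_cur not in candidates or metric < candidates[s_cur]:
                candidates[s_cur] = metric       # best path into each distinct node
    survivors = sorted(candidates.items(), key=lambda kv: (kv[1], kv[0]))[:n_m]
    return dict(survivors)
```

Calling `process_stage` repeatedly, starting from `{0: 0.0}`, reproduces the dynamic trellis described above: the surviving node set generally differs from stage to stage.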
In order to use the technique above, which requires $2 n_m \log_2 n_m$ max operations in carrying out $P$, sorting is needed for each symbol's LLR values as well, which requires $\log_2 n_m$ operations per symbol. In some embodiments, to accommodate this, symbols are rearranged into the tail and head sections of the trellis. This requires looking (for example) at the difference between the highest and second highest LLR values for every symbol and then selecting the $n_c$ symbols with the smallest differences to form the head section of the trellis. It is also possible to look at the cumulative probability of a few most likely levels in determining the number of levels that account for most of the probability mass. This would be a relatively low-complexity operation.
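The head/tail rearrangement based on the gap between the two largest LLR values can be sketched as follows. The sketch follows the wording above literally (highest and second highest values); the function name `split_head_tail` and the list-of-lists input format are assumptions made for illustration, and each symbol is assumed to carry at least two LLR values.

```python
def split_head_tail(llrvs: list[list[float]], n_c: int) -> tuple[list[int], list[int]]:
    """Choose the n_c symbols with the smallest top-two LLR gap as the head.

    llrvs[i] holds the LLR values of symbol i. Returns (head, tail) as
    sorted lists of symbol indices.
    """
    def gap(values: list[float]) -> float:
        top, second = sorted(values, reverse=True)[:2]
        return top - second

    # Smallest gap first: these symbols are the least decided, so they keep
    # the full n_m-level representation in the head section.
    order = sorted(range(len(llrvs)), key=lambda i: (gap(llrvs[i]), i))
    return sorted(order[:n_c]), sorted(order[n_c:])
```

Symbols placed in the tail have one clearly dominant level, which is why truncating them to a single level costs relatively little.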
The idea of truncating the LLR vector to just one level for a number of symbols can be extended to a general case where different subsets of symbols undergo different levels of LLR vector truncation. In some embodiments, the process shown in
Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.
This application claims priority to U.S. Provisional Patent Application No. 61/197,495 entitled LOW-COMPLEXITY Q-ARY LDPC DECODER filed Oct. 27, 2008 which is incorporated herein by reference for all purposes.