The disclosure relates generally to minimum value determination capabilities and, more specifically but not exclusively, to implementation of minimum value determination capabilities for a check node unit (CNU) of a low-density parity-check (LDPC) decoder.
In information theory, a low-density parity-check (LDPC) code is a linear error correcting code for use in transmission of a message over a typically noisy transmission channel. For example, LDPC codes are a powerful technique for forward error-correction (FEC). LDPC codes are finding increasing use in applications requiring reliable and highly-efficient information transfer over bandwidth or return channel-constrained links in the presence of corrupting noise. For example, at least partially due to the parallel structure of the LDPC decoders, LDPC decoders are well-suited for multi-gigabit communications. Disadvantageously, however, soft-decision LDPC decoders are typically relatively large, complex, and power-hungry circuits. For example, a 48 Gb/s LDPC decoder might consume 2.8 Watts and occupy more than 5 mm² of chip area in 65 nm complementary metal-oxide-semiconductor (CMOS) technology. Accordingly, there is a need for LDPC decoders that support reliable and highly-efficient information transfer while consuming less power and occupying less chip area. Furthermore, and more generally, there is a need for improved minimum determination capabilities for use within various contexts.
Various deficiencies in the prior art are addressed by embodiments for minimum value determination.
In at least some embodiments, an apparatus includes a set of modules configured to receive a set of values, evaluate a first portion of the values to determine a magnitude of a minimum value of the first portion of the values, evaluate a second portion of the values to determine a magnitude of a minimum value of the second portion of the values, and determine, based on a comparison of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values, a first minimum value representing a magnitude of a smallest value of the set of values and a second minimum value representing an approximation of a magnitude of a next-smallest value of the set of values.
In at least some embodiments, a method includes using a set of modules for receiving a set of values, evaluating a first portion of the values to determine a magnitude of a minimum value of the first portion of the values, evaluating a second portion of the values to determine a magnitude of a minimum value of the second portion of the values, and determining, based on a comparison of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values, a first minimum value representing a magnitude of a smallest value of the set of values and a second minimum value representing an approximation of a magnitude of a next-smallest value of the set of values.
In at least some embodiments, an apparatus includes a set of modules configured to receive a set of values from a set of variable node units (VNUs), evaluate a first portion of the values to determine a magnitude of a minimum value of the first portion of the values, evaluate a second portion of the values to determine a magnitude of a minimum value of the second portion of the values, and compute a set of responses for the set of VNUs based on a comparison of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values.
In at least some embodiments, a method includes using a set of modules for receiving a set of values from a set of variable node units (VNUs), evaluating a first portion of the values to determine a magnitude of a minimum value of the first portion of the values, evaluating a second portion of the values to determine a magnitude of a minimum value of the second portion of the values, and computing a set of responses for the set of VNUs based on a comparison of the magnitude of the minimum value of the first portion of the values and the magnitude of the minimum value of the second portion of the values.
In at least some embodiments, an apparatus includes a module configured to receive a set of values where each of the values includes a respective set of bits associated with a set of bit positions, and determine, based on a set of bitwise comparisons performed for the respective bit positions, at least one characteristic of a minimum value of the set of values.
In at least some embodiments, a method includes receiving, at a module, a set of values where each of the values includes a respective set of bits associated with a set of bit positions, and determining, at the module based on a set of bitwise comparisons performed for the respective bit positions, at least one characteristic of a minimum value of the set of values.
In at least some embodiments, an apparatus includes a module configured to receive a set of values wherein each of the values includes a respective set of bits associated with a set of bit positions, and determine, based on a set of bitwise comparisons performed for the respective bit positions of the values based on the bits of the values associated with the respective bit positions, at least one of a magnitude of a minimum value of the set of values or an identification of one of the values of the set of values having the magnitude of the minimum value of the set of values.
In at least some embodiments, a method includes receiving, at a module, a set of values wherein each of the values includes a respective set of bits associated with a set of bit positions, and determining, at the module based on a set of bitwise comparisons performed for the respective bit positions of the values based on the bits of the values associated with the respective bit positions, at least one of a magnitude of a minimum value of the set of values or an identification of one of the values of the set of values having the magnitude of the minimum value of the set of values.
In at least some embodiments, an apparatus is configured to evaluate a set of values organized based on a set of bit positions wherein each of the values includes a respective set of bits associated with the respective bit positions. The apparatus includes a set of modules associated with the respective bit positions, the set of modules configured to determine, based on a set of bitwise comparisons performed for the respective bit positions based on the bits of the values associated with the respective bit positions, a magnitude of a minimum value of the set of values. For each of the modules associated with the respective bit positions, the respective module includes a respective bit detector module configured to receive a respective set of input bits for the respective bit position and configured to generate a respective output bit indicative as to whether at least one of the input bits for the respective bit position is a first bit value. For each of a subset of the modules associated with the respective bit positions, the respective module includes a respective mask generation module configured to generate, based on the respective set of bits associated with the respective bit position and based on the respective output bit generated by the respective bit detector module for the respective bit position, a respective disable signal comprising a respective set of disable bits associated with the respective values of the set of values, wherein, based on the respective output bit generated by the respective bit detector module for the respective bit position being indicative that at least one of the input bits for the respective bit position is the first bit value, each of the disable bits associated with a respective one of the values for which the bit in the respective bit position of the value is a second bit value and the bit in a next bit position of the value is the first bit value is configured to change the bit in the next bit position of the value from the first bit value to the second bit value for processing by the bit detector module associated with the next bit position.
In at least some embodiments, an apparatus is configured to evaluate a set of values organized based on a set of bit positions wherein each of the values includes a respective set of bits associated with the respective bit positions. The apparatus includes a first bit detector module associated with a first bit position of the set of bit positions, the first bit detector module configured to receive a respective set of input bits associated with the respective values and generate, for the first bit position, a respective output bit indicative as to whether at least one of the input bits is a first bit value. The apparatus includes a mask generation module configured to generate a disable signal based on the set of bits associated with the first bit position and based on the respective output bit generated for the first bit position, the disable signal comprising a set of disable bits associated with the respective values of the set of values, wherein, based on the respective output bit generated for the first bit position being indicative that at least one of the input bits of the first bit position is the first bit value, each of the disable bits associated with a respective one of the values for which the bit in the first bit position of the value is a second bit value and the bit in a second bit position of the value is the first bit value is configured to change the bit in the second bit position of the value from the first bit value to the second bit value for processing by a second bit detector module associated with the second bit position.
In at least some embodiments, an apparatus is configured to evaluate a set of values. The apparatus includes a module. The module is configured to receive a set of values organized based on a set of bit positions, where each of the values includes a respective set of bits associated with the respective bit positions. The module is configured to determine at least one characteristic of a minimum value of the set of values. The module is configured to determine the at least one characteristic of the minimum value of the set of values based on a set of bitwise comparisons performed for the respective bit positions based on the bits of the values associated with the respective bit positions. The module is configured to determine the at least one characteristic of the minimum value of the set of values based on generation of disable signals configured to prevent select bits of the bit positions from being evaluated during select ones of the bitwise comparisons based on determinations that the select bits are associated with respective ones of the values already disqualified from being the minimum value of the set of values.
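By way of illustration only, the bitwise minimum determination summarized above may be modeled in software as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes unsigned magnitudes of a fixed width, processing from the most significant bit position to the least significant, a "first bit value" of 0, and a "second bit value" of 1. The function name `bitwise_min` and the per-value `disabled` list (standing in for the accumulated disable bits) are hypothetical.

```python
def bitwise_min(values, width):
    """Return (minimum, index of one value holding it) using per-bit tests.

    Assumptions for illustration: unsigned inputs that fit in `width` bits,
    MSB-first processing, first bit value = 0, second bit value = 1.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if any(v < 0 or v >= (1 << width) for v in values):
        raise ValueError("values must fit in `width` unsigned bits")

    disabled = [False] * len(values)  # one disable bit per value
    minimum = 0
    for pos in range(width - 1, -1, -1):  # most significant bit first
        # A disqualified value presents a 1, so it can never look like
        # a candidate to the bit detector at this position.
        bits = [1 if disabled[k] else (v >> pos) & 1
                for k, v in enumerate(values)]
        zero_seen = 0 in bits  # bit detector: is any input bit a 0?
        if zero_seen:
            # Mask generation: values showing a 1 here cannot be the minimum.
            disabled = [d or b == 1 for d, b in zip(disabled, bits)]
        else:
            # Every remaining candidate has a 1 here, so the minimum does too.
            minimum |= 1 << pos
    # At least one value always survives; report the first survivor.
    return minimum, disabled.index(False)
```

For example, `bitwise_min([5, 3, 6, 3], 3)` returns `(3, 1)`: the values 5 and 6 are disqualified at the most significant bit position, and the first of the two remaining values equal to 3 is reported.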
The teachings herein can be readily understood by considering the detailed description in conjunction with the accompanying drawings, in which:
To facilitate understanding, identical reference numerals have been used, where possible, to designate identical elements common to the figures.
In general, a minimum determination capability, adapted for determining one or more minimum values from a set of values, is provided. In at least some embodiments, the minimum determination capability enables, for a set of values, determination of (1) at least one of a magnitude or an identification of a first minimum value representing a smallest value of the set of values and (2) at least one of a magnitude or an identification of a second minimum value representing an approximation of a next-smallest value of the set of values. While such embodiments may be used within various contexts, such embodiments may be well-suited for use by check node units (CNUs) of a low-density parity-check (LDPC) decoder, where the CNUs are configured to determine a magnitude (typically denoted as Min1) and an identification (typically denoted as Ind1) of a first minimum value representing a smallest value of the set of values and to approximate a magnitude (typically denoted as Min2) of a second minimum value representing an approximation of a next-smallest value of the set of values, in order to provide an LDPC decoder having a bit error rate (BER) performance comparable to that of conventional LDPC decoders while also reducing the complexity, power consumption, and chip area of the LDPC decoder. In at least some embodiments, the minimum determination capability enables, for a set of values where each of the values is represented as a respective set of bits at a respective set of bit positions, determination of at least one characteristic of a minimum value of the set of values (e.g., at least one of a magnitude of the minimum value of the set of values or an identification of one of the values of the set of values having the magnitude of the minimum value of the set of values) based on a set of bitwise comparisons performed for the respective bit positions of the values. 
These and various other embodiments of the minimum determination capability may be better understood by way of reference to an exemplary system including an LDPC decoder, as depicted in
As depicted in
As further depicted in
The operation of LDPC decoder 122 in decoding LDPC codewords received at LDPC decoder 122 may be better understood by first considering a commonly used LDPC decoding algorithm known as the normalized min-sum algorithm (MSA). In MSA, if a CNU is connected to M VNUs and, thus, receives M input messages (denoted as βi, i=1, . . . , M), the CNU then computes M output messages (denoted as αi, i=1, . . . , M); i.e., one for each of the connected VNUs. In MSA, the output message αi to VNU i involves the computation of the minimum of incoming messages from all the remaining VNUs j=1, . . . , M where j≠i. More specifically, for i=1 to M,
$$\alpha_i = S_{norm} \cdot \prod_{j=1,\, j \neq i}^{M} \operatorname{sign}(\beta_j) \cdot \min_{j \neq i} |\beta_j|,$$
where βi and αi are the ith input and output messages of the CNU, the sign function sign(.) returns the arithmetic sign (i.e., outputs +1 or −1 depending on the sign), the absolute-value function |.| returns the arithmetic magnitude, the minimum function min(.) returns the minimum value of the arguments, and Snorm is a normalization factor. The input and output messages have a word-length of w, including the sign bit (i.e., the magnitude portion of the messages is w-1 bits). It will be appreciated that, typically, the complexity of the CNU is primarily in evaluating the minimum function. In order to evaluate this minimum for each of the output messages, the computation essentially reduces to computing the first and second minimums among all input messages, as explained next. If the first and second minimums, amongst all |βi| values, are indicated by Min1=|βInd1| and Min2, where Ind1 is the index of the first minimum, then the output of the min(.) function in the equation above is Min2 for i=Ind1, and Min1 otherwise. If more than one input has a magnitude equal to Min1 (i.e., multiple inputs have the same smallest value), then Min2 is equal to Min1 and Ind1 essentially plays no role.
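By way of illustration only, the reduction of the per-output minimum to Min1, Min2, and Ind1 may be sketched in Python as follows. This is a sketch under stated assumptions rather than a definitive implementation: the function name `cnu_outputs` is hypothetical, a zero-valued input is assumed to have sign +1, and the normalization factor used in the example (0.75) is an arbitrary illustrative value.

```python
def cnu_outputs(betas, s_norm=1.0):
    """Normalized min-sum check-node update built from Min1, Min2 and Ind1.

    Assumption for illustration: sign(0) is taken to be +1.
    """
    if len(betas) < 2:
        raise ValueError("a check node needs at least two input messages")

    mags = [abs(b) for b in betas]
    ind1 = min(range(len(mags)), key=mags.__getitem__)  # index of first minimum
    min1 = mags[ind1]
    # Exact second minimum: smallest magnitude among all inputs other than Ind1.
    min2 = min(m for k, m in enumerate(mags) if k != ind1)

    # Product of the signs of all inputs.
    total_sign = 1
    for b in betas:
        if b < 0:
            total_sign = -total_sign

    outputs = []
    for i, b in enumerate(betas):
        own_sign = -1 if b < 0 else 1
        # Multiplying by the input's own sign removes it from the product,
        # leaving the product of the signs of the other inputs.
        sign_others = total_sign * own_sign
        magnitude = min2 if i == ind1 else min1
        outputs.append(s_norm * sign_others * magnitude)
    return outputs
```

For example, `cnu_outputs([3, -1, 4, -2], s_norm=0.75)` returns `[0.75, -1.5, 0.75, -0.75]`: only the output at index 1 (the position of Min1) uses Min2, and every other output uses Min1.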
In at least some embodiments, the LDPC decoder 122 is configured to use a justified approximation of the minimum computation of MSA which still achieves good BER performance compared to use of MSA. More specifically, each CNU 125 of the LDPC decoder 122 may be configured to use a justified approximation of the minimum computation of MSA. The use of the justified approximation of the minimum computation of MSA by a CNU 125 obviates the need for the CNU 125 to be configured to compute the second minimum at each stage of the CNU 125. Thus, use of the justified approximation of the minimum computation of MSA provides BER performance comparable to that of conventional LDPC decoders while reducing the complexity, power consumption, and chip area of such conventional LDPC decoders.
In at least some embodiments, a CNU 125i of LDPC decoder 122 may be configured to determine the magnitude of the first minimum value (Min1) and to approximate the magnitude of the second minimum value (Min2). The CNU 125i is configured to receive M input messages from M VNUs 124 to which the CNU 125i is connected, where the M messages convey M |βi| values from the M VNUs 124 to which the CNU 125i is connected. The CNU 125i is configured to evaluate or process a first portion of the M messages to determine a magnitude of a minimum value from among the |βi| values of the messages of the first portion of the M messages, evaluate or process a second portion of the M messages to determine a magnitude of a minimum value from among the |βi| values of the messages of the second portion of the M messages, and compare the magnitude of the minimum value from among the |βi| values of the messages of the first portion of the M messages and the magnitude of the minimum value from among the |βi| values of the messages of the second portion of the M messages to determine the magnitude of the first minimum value (Min1) and an approximation of the magnitude of the second minimum value (Min2). In at least some embodiments, the first and second portions of the M messages may include equal numbers of messages (namely, M/2 messages per portion). In at least some embodiments, the first portion of M messages may include the first M/2 messages (including the |β1| through |βM/2| values) and the second portion of M messages may include the second M/2 messages (including the |β(M/2)+1| through |βM| values), although it will be appreciated that the M messages (and, thus, the |βi| values of the M messages) may be evaluated or processed in various other combinations in order to determine the magnitude of the first minimum value (Min1) and to approximate the magnitude of the second minimum value (Min2). 
In at least some embodiments, the first and second portions of the M messages may include arbitrarily selected portions of the M messages. The magnitude of the first minimum value (Min1) is the lesser of the minimum |βi| value from among the |βi| values of the messages of the first portion of the M messages and the minimum |βi| value from among the |βi| values of the messages of the second portion of the M messages. The magnitude of the second minimum value (Min2) is the greater of the minimum |βi| value from among the |βi| values of the messages of the first portion of the M messages and the minimum |βi| value from among the |βi| values of the messages of the second portion of the M messages. The CNU 125i also may be configured to determine the index of the first minimum value (Min1), which provides an indication of the location of the first minimum value (Min1) within the |βi| values of the M messages (i.e., identification of which of the |βi| values of the M messages has the magnitude given by the first minimum value (Min1)). The CNU 125i also may be configured to determine the index of the second minimum value (Min2), which provides an indication of the location of the second minimum value (Min2) within the |βi| values of the M messages (i.e., identification of which of the |βi| values of the M messages has the magnitude given by the second minimum value (Min2)).
Accordingly, it will be appreciated that the first minimum value (Min1) that is output will be the magnitude of the smallest value of all of the |βi| values of the M messages received by CNU 125i, and that the second minimum value (Min2) that is output may or may not be the magnitude of the true next smallest value of all of the |βi| values of the M messages received by CNU 125i (and, thus, is described herein as providing an approximation of the magnitude of the second minimum value (Min2)). For example, if the smallest |βi| value of the M messages is in the first portion of messages evaluated or processed and the next smallest |βi| value of the M messages is in the second portion of messages evaluated or processed, then the first minimum value (Min1) will be the magnitude of the smallest value of all of the |βi| values of the M messages received by CNU 125i and the second minimum value (Min2) will be the magnitude of the true next smallest value of all of the |βi| values of the M messages received by CNU 125i. By contrast, for example, if the smallest |βi| value of the M messages and the next smallest |βi| value of the M messages are in the same portion of messages evaluated or processed, then the first minimum value (Min1) will be the magnitude of the smallest value of all of the |βi| values of the M messages received by CNU 125i but the second minimum value (Min2) will not be the magnitude of the true next smallest value of all of the |βi| values of the M messages received by CNU 125i (and, thus, the second minimum value (Min2) overestimates the magnitude of the true next smallest value of all of the |βi| values of the M messages). In other words, the first minimum value (Min1) will always be computed correctly, and the second minimum value (Min2) may or may not be computed correctly (and, thus, again, is considered to be an approximation of the second minimum value (Min2)). 
However, since M-1 outputs of a CNU depend only on the first minimum value (Min1), only one output of the CNU might suffer from error. Thus, given only a potential for a minimal increase in error resulting from approximating the second minimum value (Min2), it is possible to achieve BER performance comparable to that of conventional LDPC decoders while reducing chip area, complexity, and power consumption of conventional LDPC decoders.
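By way of illustration only, the effect of approximating the second minimum value may be sketched in Python as follows. This is a sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, the two portions are assumed to be the first and second halves of the inputs, and signs and normalization are omitted so that only output magnitudes are compared.

```python
def split_min1_min2(mags):
    """Min1 and approximate Min2 from the minima of two halves of the inputs."""
    if len(mags) < 2:
        raise ValueError("need at least two magnitudes")
    half = len(mags) // 2
    min_a = min(mags[:half])   # minimum of the first portion
    min_b = min(mags[half:])   # minimum of the second portion
    return min(min_a, min_b), max(min_a, min_b)


def exact_min1_min2(mags):
    """True smallest and next-smallest magnitudes, for comparison."""
    ordered = sorted(mags)
    return ordered[0], ordered[1]


def output_magnitudes(mags, min1, min2):
    """Per-output magnitudes: Min2 at the position of Min1, Min1 elsewhere."""
    ind1 = mags.index(min1)
    return [min2 if i == ind1 else min1 for i in range(len(mags))]


# Smallest (1) and next smallest (2) fall in the same half, so Min2 is
# overestimated as 4, the minimum of the other half.
mags = [1, 2, 7, 5, 6, 4, 8, 9]
approx = output_magnitudes(mags, *split_min1_min2(mags))
exact = output_magnitudes(mags, *exact_min1_min2(mags))
differing = [i for i, (a, e) in enumerate(zip(approx, exact)) if a != e]
```

In this example, `split_min1_min2(mags)` gives `(1, 4)` while `exact_min1_min2(mags)` gives `(1, 2)`, and `differing` is `[0]`: only the single output at the position of Min1 is affected by the approximation, consistent with the observation above that M-1 outputs depend only on Min1.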
It will be appreciated that, although primarily depicted and described with respect to operation of the CNU 125i of LDPC decoder 122 in computing the magnitudes of the minimum values, the CNU 125i of LDPC decoder 122 may be configured to perform various other functions (e.g., sign calculation and the like, as will be understood by one skilled in the art) which have been omitted herein for purposes of clarity.
It will be appreciated that, following calculation of first minimum value (Min1) and the second minimum value (Min2) as discussed above, decoding may proceed in the normal manner, a description of which has been omitted herein for purposes of clarity.
An exemplary embodiment of a minimum determination module 126i of a CNU 125i of LDPC decoder 122 is depicted and described with respect to
The processing modules 210 each are configured to receive a set of M/2 |βi| values (from among the set of M |βi| values received in the M messages from the M VNUs to which the CNU is connected) and to determine a magnitude of a minimum |βi| value from among the set of M/2 |βi| values and an associated index associated with the minimum |βi| value from among the set of M/2 |βi| values, respectively. More specifically, processing modules 210 are configured such that (1) first processing module 210A is configured to receive a first half of the M |βi| values (e.g., including the |β1| through |βM/2| values) and to determine a magnitude of a minimum |βi| value from among the set of M/2 |βi| values (denoted as min_A) and an associated index associated with the minimum |βi| value from among the set of M/2 |βi| values (denoted as Ind_A) which indicates which of the M/2 |βi| values has the magnitude of the minimum |βi| value from among the set of M/2 |βi| values and (2) second processing module 210B is configured to receive a second half of the M |βi| values (e.g., including the |β(M/2)+1| through |βM| values) and to determine a magnitude of a minimum |βi| value from among the set of M/2 |βi| values (denoted as min_B) and an associated index associated with the minimum |βi| value from among the set of M/2 |βi| values (denoted as Ind_B) which indicates which of the M/2 |βi| values has the magnitude of the minimum |βi| value from among the set of M/2 |βi| values. The minimum value outputs of the processing modules 210 (min_A, min_B) are provided as inputs to comparator 220 and as inputs to value multiplexer 230. The index outputs of the processing modules 210 (Ind_A, Ind_B) are provided as inputs to index multiplexer 240.
The comparator 220 is configured to compare the minimum value outputs of the processing modules 210 (min_A, min_B) to determine which of the minimum values is smaller, and to generate a select signal for the value multiplexer 230 on the basis of which of the minimum values (min_A, min_B) is smaller. The value multiplexer 230 is configured to receive the minimum value outputs of the processing modules 210 (min_A, min_B) and, under control of the select signal from comparator 220, to output the minimum value outputs of the processing modules 210 in a manner for indicating which of the minimum value outputs of the processing modules 210 is output as the first minimum value (Min1) of the set of M |βi| values and which of the minimum value outputs of the processing modules 210 is output as the second minimum value (Min2) of the set of M |βi| values. The value multiplexer 230 is configured to pass the select signal from the comparator 220 through to index multiplexer 240 for controlling outputting of the first index (Ind1) of the first minimum value (Min1) in accordance with outputting of the minimum value outputs of the value multiplexer 230. The index multiplexer 240 is configured to receive the indexes of the processing modules 210 (Ind_A, Ind_B), and to output the indexes of the processing modules 210 in a manner for associating the first index (Ind1) of the first minimum value (Min1) with the first minimum value (Min1) output from value multiplexer 230.
In this manner, the smallest of the minimum value outputs of the processing modules 210 may be output as the first minimum value (Min1) of the set of M |βi| values and the next smallest of the minimum value outputs of the processing modules 210 may be output as the second minimum value (Min2) of the set of M |βi| values and, further, the first index (Ind1) of the first minimum value (Min1) may be associated with the first minimum value (Min1) to indicate the location of the first minimum value (Min1) within the set of M |βi| values (and, optionally, the second index (Ind2) of the second minimum value (Min2) may be associated with the second minimum value (Min2) to indicate the location of the second minimum value (Min2) within the set of M |βi| values).
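By way of illustration only, the data path formed by the processing modules 210, comparator 220, value multiplexer 230, and index multiplexer 240 may be modeled in Python as follows. This is a behavioral sketch under stated assumptions, not a definitive implementation: the function names are hypothetical, ties are assumed to resolve in favor of the first processing module, and the conversion of the local index Ind_B into a position within the full set of M values (by adding M/2) is an assumption about how the index is formed.

```python
def processing_module(mags):
    """Model of one processing module: (minimum magnitude, local index)."""
    ind = min(range(len(mags)), key=mags.__getitem__)
    return mags[ind], ind


def minimum_determination_module(mags):
    """Return (Min1, approximate Min2, Ind1) for an even-length input list."""
    if len(mags) < 2 or len(mags) % 2:
        raise ValueError("expected an even number of at least two magnitudes")
    half = len(mags) // 2
    min_a, ind_a = processing_module(mags[:half])   # first half of the values
    min_b, ind_b = processing_module(mags[half:])   # second half of the values

    # Comparator: select signal is True when min_B is strictly smaller
    # (ties go to min_A; this tie-break is an assumption).
    select_b = min_b < min_a

    # Value multiplexer: order the two minima as (Min1, Min2).
    min1, min2 = (min_b, min_a) if select_b else (min_a, min_b)

    # Index multiplexer: pick the index that belongs to Min1. The offset
    # of `half` maps Ind_B to a position in the full list (an assumption).
    ind1 = half + ind_b if select_b else ind_a
    return min1, min2, ind1
```

For example, `minimum_determination_module([6, 4, 8, 9, 1, 2, 7, 5])` returns `(1, 4, 4)`: min_A is 4, min_B is 1, the comparator selects the second processing module, and Ind1 points at position 4 of the full list.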
It will be appreciated that, although primarily depicted and described with respect to embodiments in which processing modules 210 are configured such that (1) first processing module 210A is configured to receive and process a first half of the M |βi| values (e.g., including the specific |β1| through |βM/2| values) and (2) second processing module 210B is configured to receive and process a second half of the M |βi| values (e.g., including the specific |β(M/2)+1| through |βM| values), the processing modules 210 may be configured such that the processing modules 210 receive respective halves of the M |βi| values but the specific |βi| values that are provided to the processing modules 210 are arranged differently (e.g., first processing module 210A is configured to receive and process the |β1| through |βM/4| values and the |β(3M/4)+1| through |βM| values and second processing module 210B is configured to receive and process the |β(M/4)+1| through |β3M/4| values), the processing modules 210 may be configured such that the processing modules 210 receive different-sized portions of the |βi| values (e.g., first processing module 210A receives and processes more than M/2 |βi| values and second processing module 210B receives and processes fewer than M/2 |βi| values), or the like, as well as various combinations thereof. Accordingly, it will be appreciated that, although primarily depicted and described with respect to a specific arrangement of functions using specific numbers, types, and arrangements of modules, functions of minimum determination module 200 of
Referring again to
As depicted in
The minimum determination module 300 includes X-1 stages of 2-input minimum modules 310. The 2-input minimum modules 310 each include two value inputs for receiving two values to be compared and one value output for outputting the minimum value of the two compared values. In the case of the first stage of the tree structure, the two value inputs of the 2-input minimum module 310 receive two |βi| values received by minimum determination module 300 from the VNUs to which the CNU is connected. In the case of any additional stages of the tree structure other than the first stage of the tree structure (e.g., a k-th stage), the two value inputs of the 2-input minimum module 310 receive two |βi| values output from two 2-input minimum modules 310 at the previous stage (e.g., a (k−1)th stage) of the tree structure. Additionally, in the case of any additional stages of the tree structure other than the first stage of the tree structure (e.g., a k-th stage), the 2-input minimum module 310 also includes (1) two index inputs for receiving indexes associated with the two values received via the two value inputs and (2) one index output for outputting the one of the two received indexes associated with the minimum value output from the value output. As discussed further below, the index may be propagated in various ways (e.g., the index may be log2(M) bits long, in which case one of the two received indexes is simply output at each stage; the index may grow by 1 bit at each stage, in which case one of the two received indexes is output and an extra bit is appended depending on which input was the minimum; or the like). An exemplary 2-input minimum module 310′ is depicted in
The 2-input minimum module 310′ is configured for use at a stage other than the first stage of the tree structure. The 2-input minimum module 310′ includes a minimum determination element 311, a value multiplexer 312, and an index multiplexer 313. The minimum determination element 311 receives the two values (denoted as A and B) from the previous stage of the tree structure, the value multiplexer 312 also receives the two values (again, denoted as A and B) from the previous stage of the tree structure, and the index multiplexer 313 receives the two indexes (denoted as IndA and IndB, which are associated with values A and B, respectively) from the previous stage of the tree structure. The minimum determination element 311 compares the two values to determine which of the two values is smaller, and outputs a signal indicative as to which of the two values is smaller. The indication as to which of the two values is smaller is used as a control signal for both the value multiplexer 312 and the index multiplexer 313. If a determination is made that value A is less than value B, the signal indicative as to which of the two values is smaller that is output from minimum determination element 311 causes value multiplexer 312 to select the input corresponding to value A and, similarly, causes index multiplexer 313 to select the input corresponding to IndA. Alternatively, if a determination is made that value B is less than value A, the signal indicative as to which of the two values is smaller that is output from minimum determination element 311 causes value multiplexer 312 to select the input corresponding to value B and, similarly, causes index multiplexer 313 to select the input corresponding to IndB. 
In this manner, the minimum values and associated indexes for the minimum values may be propagated toward the 2-input min-max module 320 for a final determination of the first minimum value (Min1) which is the smallest of the M |βi| values received by the minimum determination module 300 and the second minimum value (Min2) which is an approximation of the next smallest of the M |βi| values received by the minimum determination module 300. It will be appreciated that a 2-input minimum module 310′ for use at the first stage of the tree structure may omit the index multiplexer 313 and, rather, may simply output an index associated with the minimum value for use at the next stage of the tree structure. It will be appreciated that, although depicted and described with respect to a specific embodiment of a 2-input minimum module 310 (illustratively, exemplary 2-input minimum module 310′), the 2-input minimum module 310 may be implemented in various other ways in order to provide functions of the 2-input minimum module 310 as presented herein.
The minimum determination module 300 includes a 2-input min-max module 320 in the Xth stage of the tree structure. The 2-input min-max module 320 includes two value inputs for receiving two values from the (X-1)th stage of the tree structure and two value outputs for outputting the two values based on comparison of the two values. The 2-input min-max module 320 is configured to compare the two values received via the two value inputs, and to output the two values from the two value outputs in a manner for indicating (1) the first minimum value (Min1), which is the smaller of the two values received by the 2-input min-max module 320 and provides the magnitude of the smallest value of the M |βi| values received by the minimum determination module 300 and (2) the second minimum value (Min2), which is the larger of the two values received by the 2-input min-max module 320 and provides an approximation of the magnitude of the next-smallest value of the M |βi| values received by the minimum determination module 300. Additionally, the 2-input min-max module 320 also includes (1) two index inputs for receiving indexes associated with the two values received via the two value inputs and (2) one index output for outputting the one of the two received indexes associated with the first minimum value (Min1) determined by 2-input min-max module 320 (which, as discussed herein, is indicative of a location, within the M |βi| values received by the minimum determination module 300, of the |βi| value providing the magnitude of the smallest value of the M |βi| values received by the minimum determination module 300). As discussed further below, the index may be propagated in various ways (e.g., the index is log2(M) bits long, in which case one of the 2 received indexes is simply output at each stage; the index grows by 1 bit at each stage, in which case one of the 2 received indexes is output and an extra bit is appended depending on which input was the minimum; or the like).
An exemplary 2-input min-max module 320′, which is suitable for use as a 2-input min-max module 320, is depicted in
The 2-input min-max module 320′ includes a minimum determination element 321, a minimum value multiplexer 322min and a maximum value multiplexer 322max, and an index multiplexer 323. The minimum determination element 321 receives the two values (denoted as A and B) from the (X-1)th stage of the tree structure, the minimum value multiplexer 322min and the maximum value multiplexer 322max each also receive the two values (again, denoted as A and B) from the (X-1)th stage of the tree structure, and the index multiplexer 323 receives the two indexes (denoted as IndA and IndB, which are associated with values A and B, respectively) from the (X-1)th stage of the tree structure. The minimum determination element 321 compares the two values to determine which of the two values is smaller, and outputs a signal (e.g., typically “1” or “0”, although any suitable signal may be used) indicative as to which of the two values is smaller. The indication as to which of the two values is smaller is used as a control signal for both the minimum value multiplexer 322min and the maximum value multiplexer 322max, as well as for the index multiplexer 323. If a determination is made that value A is less than value B, the signal indicative as to which of the two values is smaller that is output from minimum determination element 321 causes minimum value multiplexer 322min to select the input corresponding to value A and causes the maximum value multiplexer 322max to select the input corresponding to value B and, further, causes index multiplexer 323 to select the input corresponding to IndA. 
Alternatively, if a determination is made that value B is less than value A, the signal indicative as to which of the two values is smaller that is output from minimum determination element 321 causes minimum value multiplexer 322min to select the input corresponding to value B and causes the maximum value multiplexer 322max to select the input corresponding to value A and, further, causes index multiplexer 323 to select the input corresponding to IndB. In this manner, the 2-input min-max module 320 is able to output the first minimum value (Min1) which is the magnitude of the smallest of the M |βi| values received by the minimum determination module 300 and the second minimum value (Min2) which is an approximation of the magnitude of the next smallest of the M |βi| values received by the minimum determination module 300. It will be appreciated that, although depicted and described with respect to a specific embodiment of a 2-input min-max module 320 (illustratively, exemplary 2-input min-max module 320′), the 2-input min-max module 320 may be implemented in various other ways in order to provide functions of the 2-input min-max module 320 as presented herein.
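The reason that Min2 is only an approximation in this arrangement may be seen from a short behavioral model. The following is a minimal sketch, assuming M is a power of 2 and that each entry carries a full-length index; the names min_select, min_max, and min_tree_300 are hypothetical and are introduced only for illustration.

```python
from typing import List, Tuple

Entry = Tuple[int, int]  # (magnitude, index)


def min_select(a: Entry, b: Entry) -> Entry:
    """2-input minimum module: forward the smaller value with its index."""
    return a if a[0] < b[0] else b


def min_max(a: Entry, b: Entry) -> Tuple[int, int, int]:
    """2-input min-max module: return (Min1, Min2, index of Min1)."""
    if a[0] < b[0]:
        return a[0], b[0], a[1]
    return b[0], a[0], b[1]


def min_tree_300(mags: List[int]) -> Tuple[int, int, int]:
    """Tree of 2-input minimum modules followed by one min-max module."""
    n = len(mags)
    if n < 2 or (n & (n - 1)) != 0:
        raise ValueError("this sketch assumes M is a power of 2 and M >= 2")
    level: List[Entry] = [(m, i) for i, m in enumerate(mags)]
    # Stages 1 .. X-1: each stage halves the number of candidates.
    while len(level) > 2:
        level = [min_select(level[k], level[k + 1])
                 for k in range(0, len(level), 2)]
    # Stage X: the single min-max module.
    return min_max(level[0], level[1])


if __name__ == "__main__":
    # The true second-smallest value (2) sits in the same half as the
    # smallest value (1), so it is discarded before the last stage.
    print(min_tree_300([5, 1, 2, 7, 6, 4, 3, 8]))  # (1, 3, 1)
    # Here the two smallest values sit in different halves, so Min2 is exact.
    print(min_tree_300([1, 9, 9, 9, 2, 9, 9, 9]))  # (1, 2, 0)
```

In this model Min1 is always exact, while Min2 is the minimum of whichever half of the inputs does not contain Min1, and therefore is never smaller than the true next-smallest value.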
As discussed above, minimum determination module 300, in addition to supporting propagation of |βi| values, also supports propagation of associated index values. In at least some embodiments (as presented in
It will be appreciated that, although primarily depicted and described with respect to embodiments of the minimum determination module 200 in which only a single 2-input min-max module 320 is used in the tree structure and the remaining 2-input elements of the tree structure are 2-input minimum modules 310, in at least some embodiments the minimum determination module may be configured to use 2-input min-max modules at one or more earlier stages of the tree structure, in which case the stage(s) of the tree structure preceding the 2-input min-max modules may include 2-input minimum modules 310 (i.e., such that fewer 2-input minimum modules 310 would be used) and the stage(s) of the tree structure following the 2-input min-max modules may include 4-input min1-min2 modules (discussed further below). An exemplary embodiment of the minimum determination module 200 in which multiple 2-input min-max modules are used in the tree structure is depicted and described with respect to
The minimum determination module 400 includes X-3 stages of 2-input minimum modules 410 (in the first (X-3) stages). The 2-input minimum modules 410 of
The minimum determination module 400 includes one stage of 2-input min-max modules 420 (in the (X-2)th stage). The 2-input min-max modules 420 of
The minimum determination module 400 includes two stages of 4-input min1-min2 modules 430 (in the (X-1)th and Xth stages). The 4-input min1-min2 modules 430 each include two sets of value inputs for receiving four values from the previous stage of the tree structure and two value outputs for outputting two values based on comparison of the four input values. The 4-input min1-min2 modules 430 each are configured to compare two pairs of values received via the two sets of value inputs, and to output the two values from the two value outputs in a manner for indicating (1) the first minimum value (Min1), which is the smallest of two of the values in a first pair of values received by the 4-input min1-min2 module 430 and provides the magnitude of the smallest value of the M |βi| values received by the minimum determination module 400 and (2) the second minimum value (Min2), which is the smallest of the remaining three values received by the 4-input min1-min2 module 430 and provides an approximation of the magnitude of the next-smallest value of the M |βi| values received by the minimum determination module 400. Additionally, each 4-input min1-min2 module also includes (1) two index inputs for receiving two indexes associated with values received via the two sets of value inputs, respectively, and (2) one index output for outputting the one of the two received indexes associated with the first minimum value (Min1) determined by the 4-input min1-min2 module 430 (which, as discussed herein, is indicative of a location, within the M |βi| values received by the minimum determination module 400, of the |βi| value providing the magnitude of the smallest value of the M |βi| values received by the minimum determination module 400).
As discussed further below, the index may be propagated in various ways (e.g., the index is log2(M) bits long, in which case one of the 2 received indexes is simply output at each stage; the index grows by 1 bit at each stage, in which case one of the 2 received indexes is output and an extra bit is appended depending on which input was the minimum; or the like). An exemplary 4-input min1-min2 module 430′, which is suitable for use as a 4-input min1-min2 module 430, is depicted in
The 4-input min1-min2 module 430′ includes a first minimum determination element 4311, a second minimum determination element 4312, four multiplexers 4321-4324 (collectively, multiplexers 432), and an index multiplexer 433. The first minimum determination element 4311 receives the two values (denoted as Min1A and Min1B) from the previous stage of the tree structure, compares the two values to determine which of the two values is smaller, and outputs a signal (denoted as Ind[q], e.g., typically “1” or “0” although any suitable signal may be used) indicative as to which of the two values is smaller. The first multiplexer 4321 receives two values (denoted as Min1A and Min1B) from the previous stage of the tree structure, selects the smaller of the two values, and outputs the smaller value as Min1. The second multiplexer 4322 receives two values (denoted as Min1A and Min1B) from the previous stage of the tree structure, selects the larger of the two values, and outputs the larger value as an input to the second minimum determination element 4312 and as an input to the fourth multiplexer 4324. The third multiplexer 4323 receives two values (denoted as Min2A and Min2B) from the previous stage of the tree structure and provides an appropriate second input for second minimum determination element 4312. The second minimum determination element 4312, if Min1A<Min1B, compares Min1B with Min2A (which is output by third multiplexer 4323) to determine Min2. The second minimum determination element 4312, if Min1B<Min1A, compares Min1A with Min2B (which is output by third multiplexer 4323) to determine Min2. The fourth multiplexer 4324 receives the same two input values as the second minimum determination element 4312 and selects the smaller of the two values based on the control signal received from second minimum determination element 4312 and outputs the smaller value as Min2. It is noted that in each stage the Ind[q] bit is appended to the preceding Ind[q-1] . . . 
Ind[1] to form the entire index of the true minimum value from the set of |βi| values input to minimum determination module 400, which provides an identification of which of the |βi| values input to minimum determination module 400 has a magnitude that corresponds to the true minimum value from the set of values input to minimum determination module 400. It will be appreciated that, although depicted and described with respect to a specific embodiment of a 4-input min1-min2 module 430 (illustratively, exemplary 4-input min1-min2 module 430′), the 4-input min1-min2 module 430 may be implemented in various other ways in order to provide functions of the 4-input min1-min2 module 430 as presented herein.
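The data flow through the two minimum determination elements and four multiplexers described above may be sketched as follows. This is a minimal behavioral sketch, not a definitive implementation: the function name min1_min2_module, the argument prev_bits (the number of index bits accumulated in earlier stages), the encoding of Ind[q] as 0 when the A side holds the smaller Min1, and the placement of Ind[q] as the most significant index bit are all assumptions made for illustration.

```python
from typing import Tuple


def min1_min2_module(min1_a: int, min2_a: int, ind_a: int,
                     min1_b: int, min2_b: int, ind_b: int,
                     prev_bits: int) -> Tuple[int, int, int]:
    """Behavioral model of a 4-input min1-min2 module.

    Returns (Min1, Min2, index), where the new index bit Ind[q] is placed
    above the prev_bits index bits inherited from the winning side.
    """
    # First minimum determination element: compares Min1A with Min1B.
    a_wins = min1_a < min1_b
    ind_q = 0 if a_wins else 1  # assumed encoding of Ind[q]

    # First multiplexer: the smaller of Min1A and Min1B becomes Min1.
    min1 = min1_a if a_wins else min1_b
    # Second multiplexer: the larger of Min1A and Min1B is a Min2 candidate.
    losing_min1 = min1_b if a_wins else min1_a
    # Third multiplexer: Min2 from the side that supplied Min1.
    winning_min2 = min2_a if a_wins else min2_b

    # Second minimum determination element and fourth multiplexer.
    min2 = losing_min1 if losing_min1 < winning_min2 else winning_min2

    # Index: inherited bits from the winning side with Ind[q] placed on top.
    index = (ind_q << prev_bits) | (ind_a if a_wins else ind_b)
    return min1, min2, index


if __name__ == "__main__":
    print(min1_min2_module(1, 3, 0b01, 2, 4, 0b10, prev_bits=2))  # (1, 2, 1)
    print(min1_min2_module(5, 6, 0b00, 2, 9, 0b11, prev_bits=2))  # (2, 5, 7)
```

Only two comparisons are needed per module because the Min2 value on the losing side can never be smaller than the Min1 value on that side and so need not be examined.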
It will be appreciated that, although primarily depicted and described with respect to specific embodiments of the minimum determination module 200 (illustratively, minimum determination module 300 of
Referring again to
The processing module 500 processes the group of 8 |βi| values from the 8 messages received from 8 VNUs 124 in order to determine the minimum |βi| value from among the |βi| values of the group (which gives the magnitude of the minimum |βi| value from among the |βi| values of the group, but does not indicate which of the 8 |βi| values of the group corresponds to the minimum |βi| value from among the 8 |βi| values of the group) and the index of the minimum |βi| value from among the |βi| values of the group (which identifies which of the 8 |βi| values of the group corresponds to the minimum |βi| value from among the 8 |βi| values of the group (e.g., a location of minimum |βi| value from among the 8 |βi| values of the group within the 8 |βi| values of the group), but does not indicate the magnitude of the minimum |βi| value from among the 8 |βi| values of the group). The 8 MSBs of bit set 5013 are provided to zero detector module 5103, which performs a logical AND operation on the 8 MSBs to produce a corresponding found signal f(bit 3). The found signal f(bit 3) is (a) output as the bit for the 3rd bit position (MSB) of the minimum |βi| value from among the 8 |βi| values being processed by processing module 500 and (b) fed back as an input to the mask generation module 5203 that is associated with the 8 MSBs of bit set 5013. The found signal f(bit 3) is set equal to “0” based on detection of the presence of at least one zero bit among the 8 bits of bit set 5013, and is set to “1” otherwise. The mask generation module 5203 receives the 8 MSBs of bit set 5013 and the found signal f(bit 3) output from zero detector module 5103, and uses the 8 MSBs of bit set 5013 and the found signal f(bit 3) to produce a disable signal (denoted as ds(bit 2) which is an 8-bit signal defined as ds(bit 2)=ds1(bit 2), ds2(bit 2), . . . 
, ds8(bit 2) where the subscripts correspond to the 8 |βi| values) for use by mask application module 5302 associated with the bit set 5012 including the next MSBs of the 8 |βi| values. The mask generation module 5203 produces the disable signal ds(bit 2) by, based on a determination that the found signal f(bit 3) is active (e.g., found signal f(bit 3)=‘0’), setting a corresponding bit of the disable signal ds(bit 2) equal to “1” for each bit of bit set 5013 that is equal to “1”. For example, if the 8 MSBs of bit set 5013 are 10011011 and f(bit 3) is “0” (indicative that the 8 MSBs of bit set 5013 included at least one “0”), then ds(bit 2) will be 10011011. The 8 bits of bit set 5012, rather than being provided directly to zero detector module 5102, are provided to the mask application module 5302 associated with the bit set 5012. The mask application module 5302 associated with the bit set 5012 receives the 8 bits of bit set 5012 and the disable signal ds(bit 2) generated by mask generation module 5203, and masks the 8 bits of bit set 5012 with the 8 bits of the disable signal ds(bit 2) to produce a masked bit set 5022 (including 8 masked bits, denoted as δ1(bit 2)-δ8(bit 2)) which is provided to the zero detector module 5102 instead of the 8 bits of bit set 5012. The disable signal ds(bit 2) forces to ‘1’ those bits of bit set 5012 (namely, bits |βk|(bit 2)) for which the corresponding bits of bit set 5013 (namely, bits |βk|(bit 3)) are equal to ‘1’, provided that, for at least one value of ‘m’ (where m≠k), |βm|(bit 3) equals ‘0’. The 8 masked bits of masked bit set 5022 are provided to zero detector module 5102 associated with the bit set 5012, which performs a logical AND operation on the 8 masked bits to produce a corresponding found signal f(bit 2).
The found signal f(bit 2) is (a) output as the bit for the 2nd bit position (second MSB) of the minimum |βi| value from among the 8 |βi| values being processed by processing module 500 and (b) fed back as an input to the mask generation module 5202 that is associated with the 8 bits of bit set 5012. The found signal f(bit 2) is set equal to “0” based on detection of the presence of at least one zero bit among the 8 bits of masked bit set 5022, and is set to “1” otherwise. The mask generation module 5202 receives the 8 bits of bit set 5012 and the found signal f(bit 2) output from zero detector module 5102, and uses the 8 bits of bit set 5012 and the found signal f(bit 2) to produce a disable signal (denoted as ds(bit 1) which is an 8-bit signal defined as ds(bit 1)=ds1(bit 1), ds2(bit 1), . . . , ds8(bit1) where the subscripts correspond to the 8 |βi| values) for use by mask application module 5301 associated with the bit set 5011 including the next MSBs of the 8 |βi| values. The processing then continues as discussed above in order to produce a corresponding found signal f(bit 1) which is output as the bit for the 1st bit position (third MSB) of the minimum |βi| value from among the 8 |βi| values being processed by processing module 500 and to produce a corresponding found signal f(bit 0) which is output as the bit for the 0th bit position (LSB) of the minimum |βi| value from among the 8 |βi| values being processed by the processing module 500. In this manner, the concatenation of the found signals f(bit 3)-f(bit 0) provides the minimum |βi| value from among the 8 |βi| values of the group. Additionally, the index determination module 540, which is associated with the bit position of the LSB, is configured to determine the index of the minimum |βi| value from among the 8 |βi| values of the group. 
The index determination module 540 receives the 8 bits of bit set 5010 and the 8 bits of the masked bit set 5020, and uses a truth table to determine the index of the minimum |βi| value from among the 8 |βi| values of the group. Accordingly, as depicted in
The mask generation module 710 receives the 8 bits of the bit set for bit position p (denoted as |β1|(bit p)-|β8|(bit p)) and the found signal f(bit p) output from the zero detector of bit position p, and produces a disable signal ds(bit p-1) for use by mask application module 720 associated with bit position p-1. The mask generation module 710 includes 8 AND gates 7111-7118 (collectively, AND gates 711) and an inverter 712. The AND gates 711 each include two inputs and one output. The inverter 712 includes a single input and a single output. The 8 bits of the bit set (namely, |β1|(bit p)-|β8|(bit p)) are input into first inputs of the 8 AND gates 7111-7118, respectively. The input of the inverter 712 receives found signal f(bit p) and outputs an inverted found signal f′(bit p). The inverted found signal f′(bit p) is input into each of the second inputs of the 8 AND gates 7111-7118, respectively. If the found signal f(bit p) for bit position p is a “0” (indicative that at least one of the bits at bit position p was a “0”) then the inverted found signal f′(bit p) is a “1” such that, for each of the |βi|(bit p) values of bit position p that were “1”, the associated AND gate 711i will ensure that the corresponding disable signal dsi (bit p-1) for the next bit position p-1 is also a “1” since those |βi| values cannot be the minimum value of the set of |βi| values and, thus, should not be evaluated as part of the zero detection performed at the next bit position p-1. If the found signal f(bit p) is a “1” (indicative that all of the bits at bit position p were “1”) then the inverted found signal f′(bit p) is a “0” such that, regardless of the |βi|(bit p) values of bit position p that were “1”, the associated AND gates 711 will ensure that the corresponding disable signals ds(bit p-1) for the next bit position p-1 are “0”.
The outputs of the 8 AND gates 7111-7118 form the disable signal ds(bit p-1) for use by mask application module 720 associated with bit position p-1.
The mask application module 720 receives the 8 bits of the bit set for bit position p-1 (denoted as |β1|(bit p-1)-|β8|(bit p-1)) and the disable signal ds(bit p-1) from mask generation module 710, and produces the 8 bits of the masked bit set for bit position p-1 (denoted as masked bits δ1(bit p-1)-δ8(bit p-1)). The mask application module 720 includes 8 OR gates 7211-7218 (collectively, OR gates 721), each of which includes two inputs and one output. The 8 bits of the bit set (namely, |β1|(bit p-1)-|β8|(bit p-1)) are input into first inputs of the 8 OR gates 7211-7218, respectively. The 8 bits of the disable signal ds(bit p-1) are input into second inputs of the 8 OR gates 7211-7218, respectively. If the disable signal dsi (bit p-1) for the bit position p-1 is a “1” (indicative that the bit |βi|(bit p) of the previous bit position was “1” even though at least one other bit |βj|(bit p) of the previous bit position was “0”), then the associated OR gate 721i ensures that the corresponding masked bit δi(bit p-1) is a “1” (regardless of whether the associated bit |βi|(bit p-1) of bit position p-1 is “1” or “0”) and, thus, that the associated |βi| value cannot be the minimum value of the set of |βi| values (i.e., even though the current bit |βi|(bit p-1) of bit position p-1 is a “0”, it was previously determined that the |βi| value cannot be the minimum value of the set of |βi| values since at least one other |βi| value from the set of |βi| values has a “0” at a more significant bit position while the |βi| value has a “1” at that more significant bit position).
In other words, even though the current bit |βi|(bit p-1) of bit position p-1 of the given |βi| value is a “0”, this “0” value is blocked, or masked, from being considered by the zero detector module for bit position p-1 since, as noted above, it was previously determined that the given |βi| value cannot be the minimum value of the set of |βi| values since at least one other |βi| value from the set of |βi| values has a “0” at a more significant bit position while the given |βi| value has a “1” at that more significant bit position. The outputs of the 8 OR gates 7211-7218 form the masked bit set (namely, δ1(bit p-1)-δ8(bit p-1)) which is provided to the zero detector module for bit position p-1.
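The gate-level behavior of the zero detector, mask generation, and mask application modules for a single pair of adjacent bit positions may be sketched as follows. This is a minimal illustration only; the function names are hypothetical, and the bit pattern used for bit position 2 is an arbitrary example that is not taken from the disclosure (the pattern 10011011 for bit position 3 is the example given above).

```python
from typing import List


def zero_detector(bits: List[int]) -> int:
    """Logical AND of all bits: 0 if at least one bit is 0, otherwise 1."""
    return int(all(bits))


def mask_generation(bits_p: List[int], found_p: int) -> List[int]:
    """One AND gate per value, each fed by the inverted found signal."""
    inverted_found = 1 - found_p  # inverter
    return [bit & inverted_found for bit in bits_p]


def mask_application(bits_p_minus_1: List[int], ds: List[int]) -> List[int]:
    """One OR gate per value: a disabled value is forced to 1."""
    return [bit | d for bit, d in zip(bits_p_minus_1, ds)]


if __name__ == "__main__":
    bits_3 = [1, 0, 0, 1, 1, 0, 1, 1]        # the 8 MSBs (example from the text)
    f_3 = zero_detector(bits_3)              # 0: at least one MSB is 0
    ds_2 = mask_generation(bits_3, f_3)      # equals bits_3 because f_3 == 0

    bits_2 = [0, 1, 0, 0, 0, 1, 0, 0]        # arbitrary illustrative pattern
    masked_2 = mask_application(bits_2, ds_2)
    f_2 = zero_detector(masked_2)

    print(f_3, ds_2)      # 0 [1, 0, 0, 1, 1, 0, 1, 1]
    print(masked_2, f_2)  # [1, 1, 0, 1, 1, 1, 1, 1] 0
```

In this example only the third value contributes a 0 to the zero detector for bit position 2, since the zeros of the first, fourth, fifth, and seventh values are masked.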
It will be appreciated that, although primarily depicted and described herein with respect to embodiments in which numbers are represented using a sign-magnitude representation and the magnitude portions of the numbers are compared for determining the first minimum value and an approximation of the second minimum value, in at least some embodiments various modules depicted and described herein may be used (or adapted for use) for direct comparisons of the numbers (i.e., direct comparisons of the sign-magnitude representations of the numbers, including both the sign and magnitude portions of the numbers). For example, if, by the used sign-magnitude convention, the sign bit value ‘0’ represents a negative number, then the sign bit can be considered to be the MSB and processed according to the description of minimum determination module 400. Alternatively, for example, if the sign bit value ‘1’ represents a negative number, then the sign bits of all input numbers should be inverted and the inverted sign bits can be processed by the minimum determination module 400 as their MSBs. It will be appreciated that other mechanisms for handling sign-magnitude representations of numbers may be supported for use in determining the first minimum value and an approximation of the second minimum value.
It will be appreciated that, although primarily depicted and described herein with respect to embodiments in which the numbers that are processed for determining the first minimum value and an approximation of the second minimum value include sign and magnitude portions, in at least some embodiments various modules depicted and described herein may be used (or adapted for use) for determining the first minimum value and an approximation of the second minimum value for numbers that do not include a sign portion or for determining the first minimum value and an approximation of the second minimum value for numbers independent or irrespective of whether the numbers include a sign portion or merely represent magnitudes.
It will be appreciated that, although primarily depicted and described herein as determining an approximation of the second minimum value (Min2) given a set of values, the determination of the second minimum value (Min2) also may be said to be a determination of at least an approximation of the second minimum value (Min2) since the reference to “at least” may be used to cover the fact that, in at least some cases, the second minimum value (Min2) will be the true second-smallest value of all of the values in the set of values.
It will be appreciated that, although primarily depicted and described herein with respect to embodiments applied within the context of an LDPC decoder (e.g., in which evaluation of a set of input values to determine a smallest value of the set of input values and an approximation of a next-smallest value of the set of input values is performed for identifying the first minimum value (Min1) and second minimum value (Min2) for use by a CNU in computing a set of responses to a set of VNUs from which the input values were received), various embodiments depicted and described herein may be used within various other contexts (e.g., other devices, environments, technologies, or the like) for evaluating a set of input values to determine a smallest value of the set of input values and an approximation of a next-smallest value of the set of input values. Accordingly, a more general embodiment of a method for evaluating a set of input values to determine a smallest value of the set of input values and an approximation of a next-smallest value of the set of input values is depicted and described in
At step 1101, method 1100 begins.
At step 1110, a set of values is received. The set of values may be received from any suitable source of values.
At step 1120, a minimum value of the set of values is determined. The minimum value of the set of values may be determined based on bitwise comparisons of bits of the values on a per bit position basis. The minimum value of the set of values may be determined based on bitwise comparisons of bits of the values on a per bit position basis beginning with the most significant bit position of the values (and, thus, the most significant bits of the values) and proceeding toward the least significant bit position (and, thus, the least significant bits of the values). The bitwise comparisons on a bit position basis may be performed as depicted and described with respect to
The minimum value of the set of values may be determined based on bitwise comparisons by using the bitwise comparisons to determine at least one of a magnitude of the minimum value of the set of values or an indication of which of the values of the set of values has a magnitude of the minimum value of the set of values.
In at least some embodiments, only the magnitude of the minimum value of the set of values may be determined without determining which of the values in the set of values has that minimum magnitude (e.g., for a set of input values including 2, 6, 7, 1, 4, determination of only the magnitude may only provide an indication that the minimum value has a magnitude of “1” without an indication that the fourth value in the set of values is the value which has that minimum magnitude). In at least some embodiments, a determination of which of the values in the set of values has that minimum magnitude (e.g., in the above example, determining that the fourth value in the set of values is the value which has the determined minimum magnitude) also may be performed (e.g., based on bitwise comparisons, by searching the set of values to identify which of the values has that determined minimum magnitude, or the like).
In at least some embodiments, only an indication of which of the values of the set of values has a magnitude of the minimum value of the set of values may be determined without determining the magnitude of the minimum value of the set of values that is associated with that indicated value of the set of values (e.g., for a set of input values including 2, 6, 7, 1, 4, determination of only an indication of which of the values of the set of values has a magnitude of the minimum value of the set of values provides an indication that the fourth value in the set of values has the minimum magnitude without an indication that the magnitude of the fourth value is “1”). In at least some embodiments, a determination of the magnitude of the identified value of the set of values (e.g., in the above example, determining that the magnitude of the fourth value in the set of values is “1”) also may be performed (e.g., based on bitwise comparisons, by reading or accessing the identified value of the set of values to determine the magnitude of the identified value of the set of values, or the like).
In at least some embodiments, both the magnitude of the minimum value of the set of values and an indication of which of the values in the set of values has that minimum magnitude may be determined (e.g., for a set of input values including 2, 6, 7, 1, 4, determination that the magnitude of the minimum value in the set of values is “1” and an indication that the fourth value in the set of values is the value which has that minimum magnitude). The use of bitwise comparisons of bits of a set of values on a per bit position basis to determine a minimum value of the set of values (both the magnitude of the minimum value of the set of values and an indication of which of the values in the set of values has that minimum magnitude) may be further understood by way of reference to the following example. In this example, assume that there are three values (value v1=100, value v2=011, value v3=010) that need to be evaluated in order to determine the minimum value (which will be value v3=010). A first bitwise comparison is performed at the MSB position for the MSBs of the three values (namely, “1” from value v1, “0” from value v2, and “0” from value v3) to determine whether any of the bits of the three values are “0”. Here, since two of the values (value v2 and value v3) have a “0” in the MSB position, it is known that the minimum value of the three values begins with a “0” and, further, that one of the values (namely, value v1) cannot be the minimum value. Accordingly, an output is provided which may be used to indicate that the minimum value of the three values begins with a “0”. In the exemplary embodiment of
At step 1199, method 1100 ends.
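The examples discussed in connection with step 1120 (the three values 100, 011, and 010, and the five values 2, 6, 7, 1, 4) may be traced with a short behavioral model of the MSB-first bitwise comparison. The following is a minimal sketch under stated assumptions: the function name bitwise_min is hypothetical; a value that has been ruled out at a more significant bit position is assumed to remain ruled out at all less significant bit positions; and the position of the minimum is taken to be the first value that is never ruled out, which is a simplification of the index determination described herein.

```python
from typing import List, Tuple


def bitwise_min(values: List[int], width: int) -> Tuple[int, int]:
    """Return (minimum value, zero-based position of the minimum)."""
    disabled = [0] * len(values)  # 1 marks a value that cannot be the minimum
    minimum = 0
    # Proceed from the most significant bit position to the least significant.
    for p in range(width - 1, -1, -1):
        bits = [(v >> p) & 1 for v in values]
        # Mask application: ruled-out values are forced to 1.
        masked = [b | d for b, d in zip(bits, disabled)]
        # Zero detection: found is 0 if any remaining candidate has a 0 here.
        found = int(all(masked))
        # The found signal is the bit of the minimum at bit position p.
        minimum = (minimum << 1) | found
        # Mask generation: when a 0 was found, candidates showing 1 are ruled
        # out (assumption: earlier exclusions are retained).
        if found == 0:
            disabled = masked
    return minimum, disabled.index(0)


if __name__ == "__main__":
    print(bitwise_min([0b100, 0b011, 0b010], width=3))  # (2, 2): v3 = 010
    print(bitwise_min([2, 6, 7, 1, 4], width=3))        # (1, 3): fourth value
```

At least one value always remains a candidate in this model, because a value is ruled out only at a bit position where some other remaining candidate has a 0.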
It will be appreciated that, although primarily depicted and described herein with respect to embodiments in which the number of input values (M) in the set of input values being evaluated is a power of 2 (e.g., for determining first and second minimum values, for determining a single minimum value, or the like), various embodiments depicted and described herein may be configured for evaluating a set of input values where the number of input values (M) in the set of input values is not a power of 2. In at least some such embodiments, the module or modules used for evaluating the set of input values may be configured based on a next-higher power of 2 (e.g., for M=12 the module or modules used for evaluating the set of 12 input values may be based on evaluation of a set of 16 input values, for M=60 the module or modules used for evaluating the set of 60 input values may be based on evaluation of a set of 64 input values, or so forth). In at least some such embodiments, configuration of the module or modules used for evaluating the set of input values based on a next-higher power of 2 may use open input connections, dummy variables, or the like.
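One possible way of using dummy variables for this purpose may be sketched as follows. This is a minimal sketch, assuming that each dummy input is set to the largest magnitude representable in the given word length so that it cannot be smaller than any real input; that choice of dummy value, and the function names, are assumptions made for illustration and are not taken from the disclosure.

```python
from typing import List


def next_power_of_two(m: int) -> int:
    """Smallest power of 2 that is greater than or equal to m (m >= 1)."""
    return 1 << (m - 1).bit_length()


def pad_inputs(mags: List[int], width: int) -> List[int]:
    """Pad the magnitudes to a power-of-2 count with all-ones dummy values."""
    dummy = (1 << width) - 1  # assumed dummy value: largest width-bit magnitude
    return mags + [dummy] * (next_power_of_two(len(mags)) - len(mags))


if __name__ == "__main__":
    print(next_power_of_two(12), next_power_of_two(60))  # 16 64
    padded = pad_inputs([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8], width=4)
    print(len(padded), padded[-4:])                      # 16 [15, 15, 15, 15]
```

With this choice a dummy value can tie with, but never be smaller than, a real input, so the smallest magnitude is unaffected by the padding.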
As discussed herein, various embodiments of the LDPC decoding capability presented herein provide an approximation of conventional LDPC decoders that has bit error rate (BER) performance comparable to that of conventional LDPC decoders while reducing chip area and power consumption (as discussed further below with respect to Table 1) and complexity (as discussed further below with respect to Table 2).
As discussed herein, various embodiments of the LDPC decoding capability presented herein provide an approximation of conventional LDPC decoders that has BER performance comparable to that of conventional LDPC decoders while reducing chip area and power consumption. A logic synthesis tool may be used to quantify the chip area and power consumption benefits of at least some embodiments of the LDPC decoding capability. For example, assuming an 8-input CNU with word length w=4 bits where each design is synthesized in 90 nm CMOS for minimum area at VDD=1.2 V, results for a conventional LDPC decoder, an LDPC decoder designed based on the paper entitled “A Bit-Serial Approximate Min-Sum LDPC Decoder and FPGA Implementation” by Darabiha et al., and an LDPC decoder based on various embodiments presented herein are presented in Table 1.
As discussed herein, various embodiments of the LDPC decoding capability presented herein provide an approximation of conventional LDPC decoders that has BER performance comparable to that of conventional LDPC decoders while reducing complexity. The complexity of a conventional LDPC decoder and an LDPC decoder based on various embodiments presented herein is presented in Table 2. The number of 1-bit 2-to-1 multiplexers (MUX2s) and the number of (w-1)-bit comparators (COMPs) are reduced by a factor of about 2 and a factor of about 1.5, respectively. The improvement in the propagation delay depends on log2 M. The number of operations that are not related to finding the minimums (e.g., XOR operations) is not expected to be affected by embodiments presented herein. It is noted that, knowing the area, power dissipation, and delay of the cells (e.g., MUXs, comparators, and other elements), it is possible to estimate the benefits of various embodiments presented herein using Table 2 for any given LDPC code in a particular CMOS technology. In Table 2, tMUX corresponds to the delay of a MUX2, tCOMP corresponds to the delay of a comparator (COMP), and tadd corresponds to the delay of an adder.
Various advantages of embodiments of the LDPC decoding capability presented herein may be further understood by way of simulations related to a conventional LDPC decoder and an LDPC decoder based on various embodiments presented herein. A simulation was performed using, as test benches, an LDPC(2048,1723) code defined in the 10GBASE-T Ethernet standard and an LDPC(576,288) code defined in the WiMAX standard. The system-level characterization of the decoder was performed in MATLAB. Encoded data was sent through an additive white Gaussian noise (AWGN) channel using non-return-to-zero (NRZ) signaling. For MSA decoding, Snorm is 0.75, and w is either 4 or 5 bits. To evaluate the performance of the conventional CNU circuits and CNU circuits designed based on various embodiments presented herein, combinational logic including sign and normalization calculations was simulated. The CNU circuits of the simulation were implemented in Verilog and then synthesized in a 90-nm CMOS technology. SPICE simulation using the same technology was used to find the relationship between supply voltage, power dissipation, and propagation delay. With respect to BER performance between a conventional LDPC decoder and an LDPC decoder based on various embodiments presented herein, the simulation indicated that (1) for LDPC(2048,1723), SNR penalties of 0.1 dB and 0.2 dB were observed for w equal to 4 and 5 bits, respectively, and (2) for LDPC(576,288), the SNR penalty remained below 0.1 dB.
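The kind of CNU computation being simulated can be illustrated with a minimal floating-point sketch of a normalized min-sum check node update that uses an approximate second minimum. The function names `approx_two_min` and `cnu_update` are hypothetical. The sketch assumes that the set of magnitudes is split into two halves and that the approximate second minimum is taken to be the larger of the two half-minima; this is one reading of determining the two minimum values from a comparison of the minima of two portions, and is not asserted to be the exact circuit of any embodiment. Treating a zero input as positive is also an assumption.

```python
def approx_two_min(magnitudes):
    """Return (min1, min2_approx) for a list of at least two magnitudes.

    min1 is the exact smallest magnitude. min2_approx is the larger of
    the two half-minima (assumed approximation); it matches the true
    second minimum only when the two smallest values lie in different
    halves, and otherwise overestimates it.
    """
    if len(magnitudes) < 2:
        raise ValueError("need at least two magnitudes")
    half = len(magnitudes) // 2
    min_a = min(magnitudes[:half])
    min_b = min(magnitudes[half:])
    return min(min_a, min_b), max(min_a, min_b)


def cnu_update(values, s_norm=0.75):
    """Normalized min-sum check node update on signed input values.

    Each output magnitude is s_norm times the minimum over the other
    inputs (min1, or the approximate min2 at the position of min1).
    Each output sign is the product of the signs of the other inputs.
    """
    mags = [abs(v) for v in values]
    min1, min2 = approx_two_min(mags)
    idx1 = mags.index(min1)  # position of the (first) smallest magnitude

    total_sign = 1
    for v in values:
        if v < 0:
            total_sign = -total_sign

    outputs = []
    for i, v in enumerate(values):
        own_sign = -1 if v < 0 else 1
        mag = min2 if i == idx1 else min1
        outputs.append(total_sign * own_sign * s_norm * mag)
    return outputs


# Both smallest values (1 and 2) sit in the first half, so the
# approximate second minimum is 4 rather than the true value 2.
exact_case = approx_two_min([5, 3, 7, 6, 4, 2, 9, 8])
approx_case = approx_two_min([1, 2, 7, 6, 4, 5, 9, 8])
messages = cnu_update([4, -2, 6, 8])
```

The second example shows where the approximation departs from an exact two-minimum search, which is the source of the small SNR penalty that the simulation quantifies.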
With respect to post-FEC BER versus SNR for word lengths of 4 and 5 bits in an LDPC(2048,1723) code, the simulation compared a conventional LDPC decoder, an LDPC decoder designed based on the paper entitled “A Bit-Serial Approximate Min-Sum LDPC Decoder and FPGA Implementation” by Darabiha et al., and an LDPC decoder based on various embodiments presented herein. The simulation indicated that (1) an LDPC decoder based on various embodiments presented herein may have a negligible increase in the required SNR for a given BER over the conventional LDPC decoder and the Darabiha et al. decoder and (2) for a BER lower than 10^-4, the average number of iterations to finish decoding (assuming that early termination is utilized) is about 7% higher and 3% higher for an LDPC decoder based on various embodiments presented herein as compared with the conventional LDPC decoder and the Darabiha et al. decoder, respectively. Although the power dissipation of an LDPC decoder increases with the number of iterations, the power saving in the CNU due to various embodiments presented herein is much larger than 7% and, thus, in total, use of various embodiments presented herein results in lower power dissipation.
As discussed herein, various embodiments of the LDPC decoding capability presented herein provide various advantages over various conventional LDPC decoder designs (as discussed further below with respect to Table 3). Table 3 corresponds to implementations of a CNU of an LDPC(2048,1723) code in a fully-parallel decoder implementation, with M=32 inputs and word length w=5 bits. Comparing designs 1 and 3 of Table 3, which are both optimized for chip area, a CNU according to various embodiments presented herein occupies 37% less area than a conventional CNU, and also has lower power dissipation and lower propagation delay than a conventional CNU. In order to compare the two circuits with the same propagation delay, and hence throughput, the design (i.e., design 1) of the conventional CNU was re-synthesized for a higher speed (i.e., design 2). Comparing designs 2 and 3 of Table 3, a CNU according to various embodiments presented herein occupies 44% less area than a conventional CNU. Design 4, which was optimized for the highest throughput, has an area and power dissipation close to that of design 2, but it provides a throughput two times higher than that of design 2. If throughput is not the main concern, but area and power dissipation are the most critical, voltage scaling (VS) can be considered. The supply voltage (VDD) of design 3 was lowered to a point where a propagation delay equal to that of design 1 was obtained (i.e., design 5), and a comparison of the results (i.e., design 5) with design 1 shows a three-fold reduction in power dissipation.
As discussed herein, various embodiments of the LDPC decoding capability presented herein provide reduced power dissipation as compared to that of conventional LDPC decoders. It is noted that the average total power dissipation of the entire LDPC(N,K) decoder (not just the CNU) with early termination can be expressed as:
where Iavg and Imax are the average and maximum number of iterations, respectively. PVNU and PCNU are the power dissipation of a VNU and a CNU at a clock frequency of fCK, respectively. CINT is the total capacitance of the interconnect wires between CNUs and VNUs, and a is the signal activity factor. fCK is the clock frequency at which the decoder provides the desired throughput after Imax iterations. CINT is proportional to the total length of the interconnect wires and, thus, approximately proportional to the square root of the total area. In a fully-parallel implementation, the total area is proportional to N·AVNU+(N-K)·ACNU, where AVNU and ACNU are the chip area of a VNU and a CNU, respectively. As a result, the average power dissipation can be written as
where parameter γ is a function of technology, chip area utilization factor, and average signal activity factor. The larger γ is, the greater the impact of the interconnects on the average power dissipation. In order to evaluate the impact of various embodiments of the LDPC decoding capability on the power dissipation of an LDPC(2048,1723) decoder, a VNU was synthesized (having a dynamic power dissipation of 3.05 μW/MHz, a leakage power of 14 μW, and an area of 4760 μm²) and the total power dissipation of the decoder was evaluated assuming design 1 (conventional) and design 5 (various embodiments presented herein) for the CNUs. The use of various embodiments presented herein resulted in lower power dissipation for the LDPC(2048,1723) decoder.
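The area and interconnect relationships stated above (total area proportional to N·AVNU+(N-K)·ACNU, and CINT approximately proportional to the square root of the total area) can be illustrated with a short sketch. The function names are hypothetical, and the CNU areas used in the example are placeholder values chosen only for illustration; only N, K, and the VNU area of 4760 μm² come from the description.

```python
import math


def total_area_um2(n, k, a_vnu_um2, a_cnu_um2):
    """Area proxy for a fully-parallel LDPC(n, k) decoder.

    Implements N*A_VNU + (N-K)*A_CNU, the quantity to which the total
    area is stated to be proportional.
    """
    return n * a_vnu_um2 + (n - k) * a_cnu_um2


def interconnect_ratio(n, k, a_vnu_um2, a_cnu_old_um2, a_cnu_new_um2):
    """Estimate C_INT(new) / C_INT(old).

    Uses the stated approximation that C_INT is proportional to the
    square root of the total area, so proportionality constants cancel.
    """
    old = total_area_um2(n, k, a_vnu_um2, a_cnu_old_um2)
    new = total_area_um2(n, k, a_vnu_um2, a_cnu_new_um2)
    return math.sqrt(new / old)


# N, K, and the VNU area are taken from the description.
N, K, A_VNU = 2048, 1723, 4760.0
# Hypothetical placeholder CNU areas (not from the description): a
# baseline CNU and one that occupies 37% less area.
A_CNU_OLD = 10000.0
A_CNU_NEW = A_CNU_OLD * (1 - 0.37)

ratio = interconnect_ratio(N, K, A_VNU, A_CNU_OLD, A_CNU_NEW)
```

Because the VNU term N·AVNU is unchanged, a given percentage reduction in CNU area yields a smaller percentage reduction in total area, and the square root reduces the effect on CINT further.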
The computer 1200 includes a processor 1202 (e.g., a central processing unit (CPU) and/or other suitable processor(s)) and a memory 1204 (e.g., random access memory (RAM), read only memory (ROM), and the like).
The computer 1200 also may include a cooperating module/process 1205. The cooperating process 1205 can be loaded into memory 1204 and executed by the processor 1202 to implement functions as discussed herein and, thus, cooperating process 1205 (including associated data structures) can be stored on a computer readable storage medium, e.g., RAM memory, magnetic or optical drive or diskette, and the like.
The computer 1200 also may include one or more input/output devices 1206 (e.g., a user input device (such as a keyboard, a keypad, a mouse, and the like), a user output device (such as a display, a speaker, and the like), an input port, an output port, a receiver, a transmitter, one or more storage devices (e.g., a tape drive, a floppy drive, a hard disk drive, a compact disk drive, and the like), or the like, as well as various combinations thereof).
It will be appreciated that computer 1200 depicted in
It will be appreciated that the functions depicted and described herein may be implemented in software (e.g., via implementation of software on one or more processors, for executing on a general purpose computer (e.g., via execution by one or more processors) so as to implement a special purpose computer, and the like) and/or may be implemented in hardware (e.g., using a general purpose computer, one or more application specific integrated circuits (ASIC), and/or any other hardware equivalents).
It will be appreciated that some of the steps discussed herein as software methods may be implemented within hardware, for example, as circuitry that cooperates with the processor to perform various method steps. Portions of the functions/elements described herein may be implemented as a computer program product wherein computer instructions, when processed by a computer, adapt the operation of the computer such that the methods and/or techniques described herein are invoked or otherwise provided. Instructions for invoking methods described herein may be stored in fixed or removable media (e.g., non-transitory computer-readable storage media), transmitted via a data stream in a broadcast or other signal bearing medium, and/or stored within a memory within a computing device operating according to the instructions.
It will be appreciated that the term “or” as used herein refers to a non-exclusive “or,” unless otherwise indicated (e.g., use of “or else” or “or in the alternative”).
It will be appreciated that, although various embodiments which incorporate the teachings presented herein have been shown and described in detail herein, those skilled in the art can readily devise many other varied embodiments that still incorporate these teachings.
Number | Name | Date | Kind |
---|---|---|---|
7395490 | Richardson | Jul 2008 | B2 |
7962830 | Eroz | Jun 2011 | B2 |
8140930 | Maru | Mar 2012 | B1 |
8291292 | Varnica | Oct 2012 | B1 |
20050210366 | Maehata | Sep 2005 | A1 |
20050257106 | Luby | Nov 2005 | A1 |
20060026486 | Richardson | Feb 2006 | A1 |
20070195894 | Shokrollahi | Aug 2007 | A1 |
20080288846 | Kyung | Nov 2008 | A1 |
20090327800 | Kim | Dec 2009 | A1 |
20120054576 | Gross | Mar 2012 | A1 |
20140281794 | Sakaue | Sep 2014 | A1 |
20150227419 | Sakaue | Aug 2015 | A1 |
Entry |
---|
Zhang et al., “An Efficient 10GBASE-T Ethernet LDPC Decoder Design With Low Error Floors,” IEEE Journal of Solid-State Circuits, vol. 45, Issue 4, Mar. 24, 2010, pp. 843-855. |
Fossorier et al., “Reduced Complexity Iterative Decoding of Low-Density Parity Check Codes Based on Belief Propagation,” IEEE Transactions on Communications, vol. 47, Issue 5, May 1999, pp. 673-680. |
Chen et al., “Near Optimum Universal Belief Propagation Based Decoding of Low-Density Parity Check Codes,” IEEE Transactions on Communications, vol. 50, Issue 3, Mar. 2002, pp. 406-414. |
Darabiha et al., “A Bit-Serial Approximate Min-Sum LDPC Decoder and FPGA Implementation,” 2006 IEEE International Symposium on Circuits and Systems (ISCAS'06), May 21, 2006, pp. 21-24. |
Darabiha et al., “Power Reduction Techniques for LDPC Decoders,” IEEE Journal of Solid-State Circuits, vol. 43, Issue 8, Aug. 2008, pp. 1835-1845. |
Number | Date | Country | |
---|---|---|---|
20160233884 A1 | Aug 2016 | US |