Embodiments of the present disclosure are directed to syndrome calculations in generalized concatenated error correction codes.
Reed-Solomon (RS) codes are a group of error-correcting codes used for correcting errors in data transmitted over unreliable or noisy communication channels. The transmitted message (c0, . . . , ci, . . . , cn−1) can be viewed as the coefficients of a polynomial:
An RS syndrome calculation can be described as follows:
Embodiments of the disclosure can calculate D syndromes for each Reed-Solomon decoding.
Embodiments of the disclosure can receive the symbols in any order, and calculate the additive contribution of each symbol to all syndromes.
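A standard way to write this is sketched below. It assumes the received word has symbols Rj (the notation used in this description for the j-th RS-word symbol), a word length n, and D syndromes indexed from i=0; the index ranges are assumptions rather than values stated here.

```latex
R(x) = \sum_{j=0}^{n-1} R_j\, x^{j},
\qquad
S_i = R(\alpha^{i}) = \sum_{j=0}^{n-1} R_j\, \alpha^{i \times j},
\quad i = 0, \dots, D-1 .
```

Each term Rj×α^(i×j) is the contribution of the j-th symbol to the i-th syndrome, so the sum can be accumulated in any symbol order.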
Embodiments of the disclosure can maintain a Reed-Solomon syndrome (RSS) for RS decoding between activations.
According to an embodiment of the disclosure, there is provided a hardware circuit for calculating syndromes in Reed-Solomon (RS) error correction codes. The hardware circuit comprises a plurality of p multiplexors, wherein p is a positive integer, wherein each multiplexor receives α^i powers that are selected by j, wherein α is a primitive point of an RS generator polynomial and j is an index of an RS symbol, wherein i and j are positive integers, wherein 1≤i≤p, and outputs α^(i×j); and a plurality of p first multipliers, wherein each first multiplier is associated with a multiplexor and receives α^(i×j) from the associated multiplexor, multiplies the α^(i×j) by a jth RS-word symbol Rj and outputs Rj×α^(i×j). The hardware circuit calculates and outputs p products of the form Rj×α^(i×j), wherein 1≤i≤p.
According to a further embodiment of the disclosure, the hardware circuit comprises a plurality of p second multipliers, where each second multiplier is associated with a multiplexor, and receives α^(i×j) from the associated multiplexor and Rj×α^(p×j) from a pth first multiplier of the plurality of first multipliers, and multiplies the Rj×α^(p×j) by the α^(i×j). The hardware circuit further calculates and outputs p products of the form Rj×α^((i+p)×j), wherein 1≤i≤p.
According to a further embodiment of the disclosure, for n syndromes, where n=2×p×m, where n and m are positive integers, the circuit repeats calculating and outputting a next 2×p products m times, wherein the jth RS-word symbol Rj for a kth iteration is replaced by Rj×α^(2×p×j) from a (k−1)th iteration, wherein 1≤k≤m.
According to a further embodiment of the disclosure, the hardware circuit comprises a plurality of p second multipliers, where each second multiplier receives the product Rj×α^(i×j), wherein 1≤i≤p, multiplies the product Rj×α^(i×j) by α^(p×j) and outputs a result Rj×α^((i+p)×j).
According to a further embodiment of the disclosure, for n syndromes, where n=2×p×m, where n and m are positive integers, the circuit repeats calculating and outputting a next p products m times, wherein the jth RS-word symbol Rj for a kth iteration is replaced by Rj×α^(2×p×j) from a (k−1)th iteration, wherein 1≤k≤m, and multiplies each of the p products by α^(p×j) and outputs a result thereof.
According to a further embodiment of the disclosure, the hardware circuit comprises a first GF-square hardware unit connected to an output of each multiplexor, where the GF-square hardware unit squares the α^(i×j) received from the connected multiplexor; a first register, a second register, and a third register, where the first register stores a result α^(2×i×j) received from the first GF-square hardware unit, the second register stores a result Rj×α^(i×j) received from a first multiplier associated with the connected multiplexor, and the third register stores the jth RS-word symbol Rj; a first iterative multiplier that calculates a first product of the result stored in the first register and the result Rj×α^(i×j) stored in the second register, and outputs the first product; and a second iterative multiplier that calculates a second product of the result stored in the first register and the jth RS-word symbol Rj and outputs the second product. The result α^(2×i×j) stored in the first register is kept stable for all subsequent calculations of the syndrome.
According to a further embodiment of the disclosure, the hardware circuit comprises a first pipeline multiplexor disposed between the first multiplier of each multiplexor and the second register; and a second pipeline multiplexor disposed between an input line for the jth RS-word symbol Rj and the third register. The first pipeline multiplexor receives the first product from the first iterative multiplier, the second pipeline multiplexor receives the second product from the second iterative multiplier, and each of the first and second pipeline multiplexors respectively selects the first product and the second product for a new calculation of Rj×α^(i×j) and Rj×α^((i+p)×j) while a previous calculation of Rj×α^(i×j) and Rj×α^((i+p)×j) is still in progress.
According to a further embodiment of the disclosure, the hardware circuit comprises a second GF-square hardware unit connected to an output of each first GF-square hardware unit, where the second GF-square hardware unit squares the α^(2j) received from the connected first GF-square hardware unit; a third iterative multiplier that calculates a third product of a result α^(4j) received from the second GF-square hardware unit and the result Rj×α^(i×j) stored in the second register, and outputs the third product; and a fourth iterative multiplier that calculates a fourth product of the result α^(4j) received from the second GF-square hardware unit and the jth RS-word symbol Rj stored in the third register, and outputs the fourth product.
According to a further embodiment of the disclosure, the hardware circuit comprises a fourth register disposed between the second GF-square hardware unit and the third iterative multiplier, and that stores the result α^(4j) received from the second GF-square hardware unit.
Hereinafter, embodiments of the present disclosure will be described with reference to accompanying drawings.
Two options for calculating syndromes are as follows.
Option 1: Horner Rule:
The Horner rule is very efficient when symbols are received in order. For each syndrome Si (i=0, . . . , D−1), the syndrome value is the evaluation of the input polynomial at α^i. The input symbols are treated as polynomial coefficients, received starting from the highest degree. The syndrome calculation for all sequential symbols (RS word) can be implemented by the Horner rule, optionally with parallelism. When the codeword symbols are received in order, the accumulated result is multiplied by α^i, and the new symbol is added each time. When all symbols have been received, the syndrome is ready.
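The Horner-rule update can be illustrated in software. The following is a minimal sketch, not the disclosed hardware: it assumes the field GF(2^8) with primitive polynomial 0x11D and α=2, and the names gf_mul and horner_syndromes are hypothetical helpers introduced only for illustration.

```python
# Assumption: GF(2^8) built from the primitive polynomial x^8+x^4+x^3+x^2+1 (0x11D).
PRIM_POLY = 0x11D
FIELD_ORDER = 255  # 2^8 - 1, the multiplicative order of alpha

# Antilog (EXP) and log (LOG) tables; EXP is doubled so index sums need no modulo.
EXP = [0] * (2 * FIELD_ORDER)
LOG = [0] * (FIELD_ORDER + 1)
_x = 1
for _k in range(FIELD_ORDER):
    EXP[_k] = _x
    LOG[_x] = _k
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY
for _k in range(FIELD_ORDER, 2 * FIELD_ORDER):
    EXP[_k] = EXP[_k - FIELD_ORDER]


def gf_mul(a, b):
    """Multiply two GF(2^8) elements using the log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def horner_syndromes(symbols_high_first, num_syndromes):
    """Compute S_0..S_{D-1} from symbols received highest degree first.

    For each new symbol, the accumulated value of S_i is multiplied by the
    constant alpha^i and the symbol is added (XOR in GF(2^m)).
    """
    syndromes = [0] * num_syndromes
    for symbol in symbols_high_first:
        for i in range(num_syndromes):
            syndromes[i] = gf_mul(syndromes[i], EXP[i % FIELD_ORDER]) ^ symbol
    return syndromes
```

Because each S_i is multiplied by a fixed constant α^i, a constant multiplier suffices, but the result is only correct when the symbols arrive in polynomial order.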
Option 2: General HW for Syndrome Si from Symbol j:
Another option for syndrome calculation is to calculate the contribution of each symbol, rj, to the syndrome and accumulate it, as shown in
A direct implementation may require more hardware. This method uses a general Galois field (GF) multiplier per syndrome, whereas the Horner implementation uses a constant multiplier, and option 2 also requires a preliminary calculation of i×j mod (2^m−1) per syndrome.
However, the second option has a few advantages over Horner-rule hardware: when symbols arrive out of order, option 2 is easier and faster to use, and hardware (HW) parallelism for different syndromes Si is possible. An embodiment may also calculate about 70 RS syndromes for a word in the first column using fewer than 70 hardware “boxes”, by just updating the “i×j” result. Further, there is a latency, since the syndrome update calculation (per frame) takes 1 cycle.
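The per-symbol accumulation of option 2 can be sketched in software as follows. This is an illustrative model under stated assumptions, not the disclosed circuit: the field is taken to be GF(2^8) with primitive polynomial 0x11D, and gf_mul, symbol_contribution and accumulate are hypothetical names.

```python
# Assumption: GF(2^8) with primitive polynomial 0x11D, so 2^m - 1 = 255.
PRIM_POLY = 0x11D
FIELD_ORDER = 255

EXP = [0] * (2 * FIELD_ORDER)
LOG = [0] * (FIELD_ORDER + 1)
_x = 1
for _k in range(FIELD_ORDER):
    EXP[_k] = _x
    LOG[_x] = _k
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY
for _k in range(FIELD_ORDER, 2 * FIELD_ORDER):
    EXP[_k] = EXP[_k - FIELD_ORDER]


def gf_mul(a, b):
    """General GF(2^8) multiplier (one per syndrome in a direct design)."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def symbol_contribution(i, j, r_j):
    """Contribution of symbol j with value r_j to syndrome S_i."""
    exponent = (i * j) % FIELD_ORDER  # preliminary i*j mod (2^m - 1)
    return gf_mul(r_j, EXP[exponent])


def accumulate(syndromes, j, r_j):
    """Add the contribution of symbol j to every syndrome, in place."""
    for i in range(len(syndromes)):
        syndromes[i] ^= symbol_contribution(i, j, r_j)
```

Since addition in GF(2^m) is a commutative XOR, calling accumulate for the symbols in any order leaves the same final syndromes, which is the property that makes this option suitable for out-of-order arrival.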
Generalized concatenated codes (GCCs) are two-dimensional structures in which rows of codewords are bound together by a different code. In other words, the codewords are tied together by a joint constraint.
All rows are of the same length n but different dimensions k. In the encoding process, the code words are grouped to S stages by their dimensions, so that each stage s, where s=1, . . . , S−1, contains all words c(n, ks), where s refers to a specific stage, and S is the number of stages. In
After all rows of stage s are encoded, the encoder computes their P transform and obtains all the information symbols of the relevant columns in the auxiliary matrix.
Referring to
Traditional methods of decoding GCCs decode all rows, and then calculate RS syndromes for the first column. However, it is time consuming to calculate the delta-syndromes, and storing them somewhere may also increase the overhead. It would be more efficient to calculate syndromes on-the-fly, and to have a greedy algorithm that, when enough rows are decoded, performs RS decoding and proceeds to a higher stage of the decoding, without waiting for all rows to be decoded. This involves decoding the rows out of order. Note that Horner's method uses the locality property of polynomials, and multiplies the coefficient by the same power of α at each step. This calculation will not work for out-of-order α's.
Embodiments of the disclosure use iterative parallel hardware to handle a single RSS column. In a first step, the general multiplier is eliminated by defining Si-specific hardware. For a known syndrome, Si, the contribution of the j-th frame will be Rj×α^(i×j), where Rj is the j-th RS-word symbol. An exemplary hardware for specific syndrome i is illustrated in
A next step involves an iterative calculation, illustrated in
Notice that the ith instance of the hardware (HWi), which calculates the ith RS syndrome, uses an alpha power of i×j. Taking the last product Rj×α^(p×j) for some parallelism p and multiplying it by the same alpha powers yields a second set of syndromes p+1, . . . , 2p. In the subsequent cycle(s), the last syndrome (S2p) is multiplexed to the R inputs, and the next 2p results are obtained.
In an alternative embodiment, note that the same solution can be obtained if α^(p×j) is taken out of the last module, and multiplied by the calculated syndromes S1 to Sp. For an SPolar implementation, both embodiments are substantially the same, although they may differ very slightly in area/power/timing, depending on other design parameters.
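The iteration described above can be modeled in software. The sketch below is a behavioral illustration under stated assumptions, not the disclosed hardware: it assumes GF(2^8) with primitive polynomial 0x11D, and gf_mul and iterative_contributions are hypothetical names.

```python
# Assumption: GF(2^8) with primitive polynomial 0x11D.
PRIM_POLY = 0x11D
FIELD_ORDER = 255

EXP = [0] * (2 * FIELD_ORDER)
LOG = [0] * (FIELD_ORDER + 1)
_x = 1
for _k in range(FIELD_ORDER):
    EXP[_k] = _x
    LOG[_x] = _k
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM_POLY
for _k in range(FIELD_ORDER, 2 * FIELD_ORDER):
    EXP[_k] = EXP[_k - FIELD_ORDER]


def gf_mul(a, b):
    """Multiply two GF(2^8) elements using the log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def iterative_contributions(r_j, j, p, m):
    """Return the contributions r_j * alpha^(i*j) for i = 1 .. 2*p*m.

    Only the p powers alpha^(i*j), i = 1..p, are selected up front; every
    further product is derived by reusing them.
    """
    powers = [EXP[(i * j) % FIELD_ORDER] for i in range(1, p + 1)]
    results = []
    r = r_j
    for _ in range(m):
        first = [gf_mul(r, a) for a in powers]      # R * alpha^(i*j)
        last = first[-1]                            # R * alpha^(p*j)
        second = [gf_mul(last, a) for a in powers]  # R * alpha^((i+p)*j)
        results.extend(first + second)
        r = second[-1]  # R * alpha^(2*p*j) is fed back as the next R input
    return results
```

For example, iterative_contributions(r_j, j, p=4, m=2) yields the 16 contributions of symbol j to S1 through S16 while selecting only four alpha powers.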
Embodiments provide further area optimizations.
Using the alternative embodiment described above, and sampling the results to registers before updating the RSS column, only α^(p×j) should be kept, and 2-3 syndromes per cycle can be calculated in the background.
Another iterative unit of multiplying by α^(4j) can be added in parallel to α^(2j), shown in
The number of GF multipliers configured in the first and in the second phase, where the first phase includes the alpha powers and the second phase is the iterative HW, depends on the expected throughput (TP). Planning the logic of the RSS update should take into account the throughput configuration: the rate at which updates are received, and the number of syndromes to be calculated.
While the present disclosure has been described in detail with reference to exemplary embodiments, those skilled in the art will appreciate that various modifications and substitutions can be made thereto without departing from the spirit and scope of the disclosure as set forth in the appended claims.