1. Field of the Invention
The present invention relates to a processing module, an error correction decoding circuit, and a processing method for an error locator polynomial. More particularly, it relates to a processing module in which Euclid's algorithm specialized to binary BCH code is improved and which implements Euclid's algorithm processing within an error correction decoding circuit (decoder), the error correction decoding circuit, and a processing method for an error locator polynomial.
2. Description of the Related Art
An error correction decoding circuit in optical communications is used in order that original data may be restored by correcting errors mixed with transmission degradation on an optical fiber as shown in
Error correction circuits are generally employed in the fields of communications, computers, audios/videos, etc. For the purpose of making error corrections, data need to be turned into codes, the representative ones of which are Bose-Chaudhuri-Hocquenghem (BCH) code. With the BCH code, as shown in
One code length is constituted by several words, and the BCH code in which one word is of one bit as shown in
Both the BCH code and the RS code conform to processing rules on Galois fields. A Galois field GF(16), for example, has sixteen elements: 0, 1, α^1, ..., α^{14}. The results of the additions and multiplications of these elements are likewise among the sixteen elements 0, 1, α^1, ..., α^{14}, as in the addition (or subtraction) table of the Galois field GF(16) exemplified in
α^{14} + α^8 = α^6 ∈ GF(16) (1)
α^{14} × α^8 = α^7 ∈ GF(16) (2)
In the case of the Galois field GF(16), the addition and multiplication tables each have 16 rows × 16 columns = 256 entries. Incidentally, in some cases, the element “0” of the Galois field is expressed as “α^∞”, and the element “1” as “α^0”. Besides, subtraction in the Galois field is the same processing as addition, as follows:
α^{14} − α^8 = α^{14} + α^8 ∈ GF(16) (3)
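The arithmetic of formulas 1 to 3 can be checked with a small sketch. It assumes that GF(16) is generated by the primitive polynomial x^4 + x + 1, which the text does not state but which reproduces formulas 1 and 2. The names `EXP`, `LOG`, `gf_add` and `gf_mul` are illustrative only.

```python
# GF(16) sketch. Assumption: primitive polynomial x^4 + x + 1 (not stated in the text).
PRIM = 0b10011

# EXP[i] is the 4-bit vector form of alpha^i; LOG is the inverse mapping.
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ PRIM if v & 0b10000 else v)
LOG = {v: i for i, v in enumerate(EXP)}

def gf_add(a, b):
    """Addition and subtraction are both a bitwise exclusive OR."""
    return a ^ b

def gf_mul(a, b):
    """Multiply by adding exponents modulo 15; zero absorbs."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 15]

a14, a8 = EXP[14], EXP[8]
print(LOG[gf_add(a14, a8)])   # exponent of alpha^14 + alpha^8, prints 6
print(LOG[gf_mul(a14, a8)])   # exponent of alpha^14 x alpha^8, prints 7
```

A hardware implementation would typically use combinational logic rather than lookup tables; the tables here only make the exponent notation of the text easy to follow.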
In the field of optical networks, the International Telecommunication Union (hereinbelow termed “ITU-T”) formally stipulated the addition of an error correction function to an information transmission frame in Recommendation G.709 (Non-patent Document 1) in 2003. In Recommendation G.709, the RS code indicated in Table 1 was first employed. The RS code conforms to processing rules on the Galois field GF(256). More specifically, the Galois field GF(256) has 256 elements, and the addition and multiplication tables of the elements each have 256 rows × 256 columns = 65536 entries.
In recent years, however, transmission capacities have rapidly increased with the spread of Internet communications and the enhancements of optical fiber communication technology, and the attendant degradations of signal qualities have become serious, so that error correction codes the correction rate of which is higher than that of the RS code have been required.
A description of concatenated codes formed of two different BCH codes is contained in Recommendation G.975.1 (Non-patent Document 2) of the ITU-T, and the concatenated codes have a correction capability higher than that of the RS code of the Galois field GF(256) indicated before. Here, the two BCH codes shall be written as “BCH_1” and “BCH_2”, respectively, for the sake of convenience. The BCH code BCH_1 is encoded as indicated in Table 2 by way of example.
Besides, the BCH code BCH_2 is encoded as indicated in Table 3.
For the purpose of making efficient corrections with the concatenated codes, a plurality of identical decoding circuits needs to be used in view of the characteristics of the algorithms and the transmission data. In the example of Recommendation G.975.1, the decoding circuit of the concatenated codes incorporates 8×3=24 decoding circuits for the code BCH_1 and 16×3=48 decoding circuits for the code BCH_2. A configurational example of the decoding circuit of the concatenated codes is shown in
Further, in packaging the decoding circuit of the concatenated codes into an LSI, the operating frequency also needs to be considered in addition to the logic scale and the memory capacity. The reason is that, when the operating frequency attainable by the circuit is low, wiring delays arise when the LSI is operated at high speed, so that the circuit fails to operate normally.
In order to design the decoding circuit of the concatenated codes up to an actual level in view of these facts, the optimization of the decoding circuit of the BCH code becomes a very important problem. However, most of the BCH code heretofore proposed have been encoded with a small number of elements in such a Galois field as GF (16) or GF (32), and also the number of correction words of the decoding circuit has been 1 to 3 words or so. In contrast, in the above case of Table 2 or Table 3, the number of elements is large as in the Galois field GF (2048) or GF (4096), and further, as many words as 10 are corrected. In general, as the number of the elements of a Galois field becomes larger, and as the number of correctable words increases more, a decoding circuit becomes more complicated, and the logic scale of the decoding circuit enlarges more. It is the actual situation that the optimization of such a decoding circuit of the BCH code has hardly progressed as compared with that of the decoding circuit of RS code.
In this specification, JP-A-5-165662 is mentioned as Patent Document 1, and JP-A-7-240692 is mentioned as Patent Document 2.
In view of the above point, the present invention has for its object to simplify a controlling circuit configuration for the purpose of raising an operating frequency with the minimum latency or a small latency, and to decrease a resource quantity for the purpose of reducing a logic scale. Besides, another object of the invention is to execute the substitution control of a polynomial and a protection control in the case where the difference of degrees is larger than one.
An object of the invention is to provide a processing module for obtaining an error locator polynomial, an error correction decoding circuit, and a processing method in which the candidate Bi(z) of the necessary error locator polynomial can be calculated without executing part of the processing of the candidate Ri(z) of an error evaluator polynomial. Another object of the invention is to provide a processing module, etc. in which the processing of the unnecessary candidate Ri(z) is stopped in accordance with the number of steps, and the Galois field multiplications and additions freed thereby are transferred to the processing of the candidate Bi(z). A further object of the invention is, in a memory having a predetermined storage area, to decrease the area where the coefficients of the candidate Ri(z) are stored in accordance with the number of steps, and to assign the freed area to the storage of the coefficients of the candidate Bi(z). Yet another object of the invention is to derive the candidate Bi(z) in Euclid's algorithm processing without computing the low-degree terms of the candidate Ri(z).
According to the first solving means of this invention, there is provided a processing module for obtaining an error locator polynomial of BCH code in an error correction decoding circuit in which error corrections of t words (where t denotes a predetermined integer) are performed using the error locator polynomial, comprising:
a first register which includes 0th to 2tth storage areas, and in which coefficients of a syndrome polynomial are stored in the first to 2tth storage areas beforehand;
a second register which includes 0th to 2tth storage areas;
a Galois field division unit which subjects the coefficient stored in the 2tth storage area of said second register, to a Galois field division by the coefficient stored in the 2tth storage area of said first register;
a group of Galois field multiplication units which subject a result of the division of said Galois field division unit and the 0th to (2t−1)th coefficients of said first register to Galois field multiplications, respectively;
a group of Galois field addition units which subject the coefficients obtained by said group of Galois field multiplication units and the 0th to (2t−1)th coefficients of said second register to Galois field additions, respectively;
a first selector which selects either outputs from said group of Galois field addition units or the coefficients stored in said first register, thereby to select the coefficients stored in either said first or second register;
a shifter which is for storing outputs from said first selector, in the predetermined storage areas of said first or second register;
an insertion unit which substitutes into zero or deletes one of the coefficients outputted from said first selector; and
a second selector which is for storing outputs from said shifter into one of said first register and said second register, and storing outputs from said insertion unit into the other of said first register and said second register;
wherein coefficients of the error locator polynomial are obtained by repeating steps which include the calculations by said Galois field division unit, said group of Galois field multiplication units and said group of Galois field addition units.
According to the second solving means of this invention, there is provided an error correction decoding circuit wherein error corrections of t words (where t denotes a predetermined integer) are performed using an error locator polynomial of BCH code, comprising:
a syndrome calculation unit which obtains a syndrome polynomial from an input signal;
a processing module which obtains an error locator polynomial; and
an error correction unit which corrects an error of the input signal on the basis of a coefficient of the error locator polynomial outputted from said processing module;
wherein said processing module includes:
a first register which includes 0th to 2tth storage areas, and in which coefficients of the syndrome polynomial obtained by said syndrome calculation unit are stored in the first to 2tth storage areas;
a second register which includes 0th to 2tth storage areas;
a Galois field division unit which subjects the coefficient stored in the 2tth storage area of said second register, to a Galois field division by the coefficient stored in the 2tth storage area of said first register;
a group of Galois field multiplication units which subject a result of the division of said Galois field division unit and the 0th to (2t−1)th coefficients of said first register to Galois field multiplications, respectively;
a group of Galois field addition units which subject the coefficients obtained by said group of Galois field multiplication units and the 0th to (2t−1)th coefficients of said second register to Galois field additions, respectively;
a first selector which selects either outputs from said group of Galois field addition units or the coefficients stored in said first register, thereby to select the coefficients stored in either said first or second register;
a shifter which is for storing outputs from said first selector, in the predetermined storage areas of said first or second register;
an insertion unit which substitutes into zero or deletes one of the coefficients outputted from said first selector; and
a second selector which is for storing outputs from said shifter into one of said first register and said second register, and storing outputs from said insertion unit into the other of said first register and said second register;
wherein coefficients of the error locator polynomial are obtained by repeating steps which include the calculations by said Galois field division unit, said group of Galois field multiplication units and said group of Galois field addition units, and are outputted to said error correction unit.
According to the third solving means of this invention, there is provided a processing method for an error locator polynomial of BCH code for performing error corrections of t words (where t denotes a predetermined integer) with the error locator polynomial, including:
a Galois field division step of inputting from a second register which includes 0th to 2tth storage areas, a first coefficient stored in the 2tth storage area of the second register, inputting from a first register which includes 0th to 2tth storage areas and in which coefficients of a syndrome polynomial are stored in the first to 2tth storage areas beforehand, a second coefficient stored in the 2tth storage area of the first register, and subjecting the first coefficient to a Galois field division by the second coefficient;
a Galois field multiplication step of subjecting a result of the division of said Galois field division step and 0th to (2t−1)th coefficients of the first register to Galois field multiplications, respectively;
a Galois field addition step of subjecting the coefficients obtained at said Galois field multiplication step and 0th to (2t−1)th coefficients of the second register to Galois field additions, respectively;
a shift step of shifting processed results of said Galois field addition step or the 0th to (2t−1)th coefficients of the first register so as to be stored in the predetermined storage areas of the first or second register;
an insertion step of substituting into zero or deleting one of the coefficients stored in the first or second register; and
a storage step of storing a processed result of said shift step into one of the first register and the second register, and
storing a processed result of said insertion step into the other of the first register and the second register;
wherein coefficients of the error locator polynomial are obtained by repeating steps which include said Galois field division step, said Galois field multiplication step and said Galois field addition step.
The present invention can simplify a controlling circuit configuration for the purpose of raising an operating frequency with the minimum latency or a small latency, and can decrease a resource quantity for the purpose of reducing a logic scale. Besides, the present invention can execute the substitution control of a polynomial and a protection control in the case where the difference of degrees is larger than one.
The present invention can provide a processing module for obtaining an error locator polynomial, an error correction decoding circuit, and a processing method in which the candidate Bi(z) of the necessary error locator polynomial can be calculated without executing part of the processing of the candidate Ri(z) of an error evaluator polynomial. Besides, the present invention can provide a processing module, etc. in which the processing of the unnecessary candidate Ri(z) is stopped in accordance with the number of steps, and the Galois field multiplications and additions freed thereby are transferred to the processing of the candidate Bi(z). In a memory having a predetermined storage area, the present invention can decrease the area where the coefficients of the candidate Ri(z) are stored in accordance with the number of steps, and can assign the freed area to the storage of the coefficients of the candidate Bi(z). Besides, in Euclid's algorithm processing, the present invention can derive the candidate Bi(z) without computing the low-degree terms of the candidate Ri(z).
Besides, in accordance with the invention, a configuration is simplified in spite of the fact that three controls (1)-(3) to be stated later are performed. Therefore, an operating frequency is higher than in the prior art, and a resource quantity becomes as follows by way of example: The resource quantity is smaller even when compared with any of a basic resource quantity to be detailed later, a resource quantity in JP-A-5-165662, and a resource quantity in JP-A-7-240692.
1. Error Corrections and Concrete Examples of Problems
Improvements have been made with notice taken of Euclid's processing module a22 (refer to
As the circuit configuration of the Euclid processing module, the optimum one can be selected in adaptation to a purpose. In a case, for example, where four decoding circuits are operating simultaneously, there is a method which directly uses four Euclid's processing modules as shown in
In general, a logic scale and a latency have the relation of trade-off, and when the logic scale is made small, the latency lengthens, whereas when the logic scale is enlarged, the latency shortens. Here in the current case, however, the latency may well be shortened even if the logic scale becomes somewhat large. The reason therefor is that, as shown in
In the case of the Euclid processing module for the code BCH_2, the memory is about 22.5 kilobits. In an example of concatenated codes, 48 decoding circuits for the code BCH_2 are mounted in all, and hence memories totaling as much as 22.5×48=1080.0 kilobits exist. From the viewpoint of the whole decoding circuit of the concatenated codes, therefore, it is more efficient to decrease the number of ROMs as shown in
The Euclid processing is a method for obtaining an error locator polynomial σ(z) and an error evaluator polynomial ω(z) from a syndrome polynomial S(z) as stated below. The syndrome polynomial can be obtained by a syndrome processing module a21 on the basis of an input signal a1 of one code length. When a case of making the error corrections of t words is exemplified, the syndrome polynomial S(z) is expressed by the following formula:
S(z) = s_{2t} z^{2t−1} + s_{2t−1} z^{2t−2} + ... + s_2 z + s_1 (4)
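The error locator polynomial σ(z) becomes:
σ(z) = (1 − α^{j_1} z)(1 − α^{j_2} z) ... (1 − α^{j_L} z) (5)
where j_1 to j_L denote the error positions and L denotes the number of errors (L = 1 to t).
The error evaluator polynomial ω(z) becomes:
ω(z) = Σ_{i=1}^{L} e_i α^{j_i} Π_{k=1, k≠i}^{L} (1 − α^{j_k} z) (6)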
The following limitations are imposed on the degrees of the respective polynomials:
deg σ(z)≦t (7)
deg ω(z)≦t−1 (8)
The error locator polynomial σ(z) and the error evaluator polynomial ω(z) can be obtained in the course of finding the greatest common divisor of the known polynomial z^{2t} and the syndrome polynomial S(z) as stated below. Here, Ri(z) is defined to be the candidate polynomial of the error evaluator polynomial ω(z), Bi(z) is defined to be the candidate polynomial of the error locator polynomial σ(z), and the suffix i denotes the degree of each polynomial.
The initial values of Ri(z) are set as:
R_{2t}(z) = z^{2t}, R_{2t−1}(z) = S(z) (9)
and the following computations are executed until (the degree of Ri(z))≦(t−1) holds.
Here, “÷g” indicates the division of the Gauss algorithm, and “Qi(z)” indicates a polynomial which becomes a quotient. Incidentally, in the division of the Gauss algorithm, the degree of the remainder polynomial is always lower than that of the divisor polynomial.
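The formulas that the surrounding text cites as formulas 10 to 13 do not appear in this text. By analogy with formulas 23 to 25 of the concrete example, they presumably take the following form. The individual formula numbers are omitted because their assignment would be an assumption.

```latex
\begin{aligned}
R_{2t}(z) \div_g R_{2t-1}(z) &= Q_0(z) && \text{with remainder } R_{2t-2}(z) \\
R_{2t-1}(z) \div_g R_{2t-2}(z) &= Q_1(z) && \text{with remainder } R_{2t-3}(z) \\
&\ \ \vdots \\
R_{t+1}(z) \div_g R_{t}(z) &= Q_{t-1}(z) && \text{with remainder } R_{t-1}(z)
\end{aligned}
```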
In the division of the above formula 13,
deg R_{t−1}(z) ≦ t−1 (14)
holds, and Rt−1(z) becomes the error evaluator polynomial ω(z). Besides, the division ÷g of the polynomial Ri(z) is executed t times.
On the other hand, the polynomial Bi(z) has its initial values put as:
B_{−1}(z) = 0, B_0(z) = 1 (15)
The computations of the polynomial Bi(z) are executed as stated below until (the degree of Ri(z))≦(t−1) holds. That is, the number of times of the computations of the polynomial Bi(z) depends upon the polynomial Ri(z).
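The formulas that the surrounding text cites as formulas 16 to 19 do not appear in this text. By analogy with formulas 26 to 28 of the concrete example, they presumably take the following form. The individual formula numbers are again omitted as an assumption would be needed to assign them.

```latex
\begin{aligned}
B_{1}(z) &= B_{-1}(z) + Q_0(z)\,B_{0}(z) \\
B_{2}(z) &= B_{0}(z) + Q_1(z)\,B_{1}(z) \\
&\ \ \vdots \\
B_{t}(z) &= B_{t-2}(z) + Q_{t-1}(z)\,B_{t-1}(z)
\end{aligned}
```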
“Qi(z)” in the above formulas 16 to 19 indicates the polynomials derived in the formulas 10 to 13 of the polynomial Ri(z). Bt(z) obtained by the computation of the formula 19 becomes the error locator polynomial σ(z), the degree on this occasion becomes:
deg B_t(z) ≦ t (20)
and also the computations of Bi(z) are executed t times.
Here, it is to be noted that Qi(z) needs to be computed for computing Bi(z), and that Ri(z) needs to be computed for computing Qi(z). That is, the computation of the error evaluator polynomial ω(z) is required for computing the error locator polynomial σ(z).
Concrete examples will be explained under the conditions of the following Table 4:
As the syndrome polynomial, the following is used by way of example:
S(z) = α^2 z^5 + α^{10} z^4 + α^2 z^3 + α^1 z^2 + α^1 z + α^8 (21)
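As a plausibility check on formula 21, the sketch below computes binary BCH syndromes s_k as the sum of α^(j·k) over assumed error positions j. The error positions 7, 10 and 14, the primitive polynomial x^4 + x + 1, and the function name `syndromes` are assumptions made for illustration and are not given in the text; under them, the computed exponents coincide with the coefficients of formula 21.

```python
# Assumption: GF(16) generated by x^4 + x + 1 (not stated in the text).
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = {v: i for i, v in enumerate(EXP)}

def syndromes(error_positions, t):
    """Return [s_1, ..., s_2t] with s_k = sum over errors of alpha^(j*k)."""
    result = []
    for k in range(1, 2 * t + 1):
        s = 0
        for j in error_positions:
            s ^= EXP[(j * k) % 15]   # every error value is 1 in a binary code
        result.append(s)
    return result

# Hypothetical error positions chosen so that formula 21 is reproduced (t = 3).
s = syndromes([7, 10, 14], t=3)
print([LOG[x] for x in s])   # exponents of s_1 .. s_6, prints [8, 1, 1, 2, 10, 2]
```

The result also shows the binary-code property s_2k = (s_k)^2, for example s_2 = α^1 = (α^8)^2.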
The initial values are set as follows:
R_6(z) = z^6, R_5(z) = S(z), B_{−1}(z) = 0, and B_0(z) = α^0 = 1 (22)
Then, as the computations of Ri(z), as shown in
R_6(z) ÷g R_5(z) = Q_0(z) with remainder R_4(z) (23)
R_5(z) ÷g R_4(z) = Q_1(z) with remainder R_3(z) (24)
R_4(z) ÷g R_3(z) = Q_2(z) with remainder R_2(z) (25)
are executed three times in total, and R2(z) becomes the error evaluator polynomial ω(z).
On this occasion, as the computations of Bi(z), as shown in
B_1(z) = B_{−1}(z) + Q_0(z) B_0(z) (26)
B_2(z) = B_0(z) + Q_1(z) B_1(z) (27)
B_3(z) = B_1(z) + Q_2(z) B_2(z) (28)
are executed three times in total, and B3(z) becomes the error locator polynomial σ(z).
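The three divisions of formulas 23 to 25 and the three updates of formulas 26 to 28 can be reproduced with the following sketch. It assumes GF(16) generated by x^4 + x + 1 and stores each polynomial as a list whose index is the degree; the function names are illustrative and do not describe the circuit of the invention.

```python
# Assumption: GF(16) generated by x^4 + x + 1; polynomials are lists with index = degree.
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = {v: i for i, v in enumerate(EXP)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]

def div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 15]

def trim(p):
    """Drop zero coefficients above the true leading term."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def poly_divmod(num, den):
    """Ordinary polynomial division; den must be nonzero."""
    num, den = list(num), trim(list(den))
    quo = [0] * max(len(num) - len(den) + 1, 1)
    for shift in range(len(num) - len(den), -1, -1):
        q = div(num[shift + len(den) - 1], den[-1])
        quo[shift] = q
        for i, d in enumerate(den):
            num[shift + i] ^= mul(q, d)
    return trim(quo), trim(num)

def poly_mul_add(acc, q, b):
    """Return acc + q * b."""
    out = list(acc) + [0] * max(0, len(q) + len(b) - 1 - len(acc))
    for i, x in enumerate(q):
        for j, y in enumerate(b):
            out[i + j] ^= mul(x, y)
    return trim(out)

def euclid(S, t):
    r_prev, r_cur = [0] * (2 * t) + [1], trim(list(S))   # z^2t and S(z)
    b_prev, b_cur = [0], [1]                              # B_-1(z) and B_0(z)
    while len(r_cur) - 1 > t - 1:                         # until deg R <= t-1
        q, rem = poly_divmod(r_prev, r_cur)
        r_prev, r_cur = r_cur, rem
        b_prev, b_cur = b_cur, poly_mul_add(b_prev, q, b_cur)
    return b_cur, r_cur

S = [EXP[e] for e in (8, 1, 1, 2, 10, 2)]   # s_1 .. s_6 of formula 21
B3, R2 = euclid(S, t=3)
print([LOG[c] for c in B3])   # exponents of B_3(z), constant term first: [3, 11, 4, 4]
```

Under these assumptions, B_3(z) = α^4 z^3 + α^4 z^2 + α^{11} z + α^3 and R_2(z) = α^4 z^2 + α^{11}, and the first quotient equals α^{13} z + α^6.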
Mathematically, as indicated above, the computations of Bi(z) and Ri(z) are executed three times (in general, t times), whereby the error locator polynomial σ(z) and error evaluator polynomial ω(z) to be found can be obtained. However, in a case where actual packaging into an LSI is intended, the above computations are simplified by, for example, a method as stated below.
First, the computations of Ri(z) will be explained. Formula 23 can be divided into the following two formulas by newly defining “Temp_R5(z)”:
R_6(z) ÷ R_5(z) = α^{13} z with remainder Temp_R5(z) (29)
Temp_R5(z) ÷ R_5(z) = α^6 with remainder R_4(z) (30)
Here, Q0(z) can be obtained as follows from the quotients of Formulas 29 and 30:
Q_0(z) = α^{13} z + α^6 (31)
Incidentally, the symbol ÷ is used for the divisions here instead of the symbol ÷g because the degree of the remainder polynomial is not always lower than that of the divisor polynomial.
In this manner, the division indicated by Formula 23 is not executed in a single step in the circuit, but is calculated in the two separate steps of Formulas 29 and 30. In the case of t=3 on the current occasion, the computations of Ri(z) are completed in 2×3=6 steps as shown in
On this occasion, “Temp_B1(z)” is likewise defined for Bi(z), whereby Formula 26 can be divided into the following two formulas, and the computation of Formula 26 is executed in two steps:
Temp_B1(z) = B_{−1}(z) + α^{13} z B_0(z) (32)
B_1(z) = Temp_B1(z) + α^6 B_0(z) (33)
In the case of t=3 on the current occasion, the computations of Bi(z) are ended after 6 steps as shown in
The present invention has for its object to minimize the latency, and the minimum latency becomes the 6 steps (in general, 2t steps) because the circuit is configured by the above method.
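The 2t-step schedule described above, with one Galois field division per step, can be sketched as follows. The sketch assumes GF(16) generated by x^4 + x + 1, stores polynomials as lists indexed by degree, and covers only the ordinary case in which the degree falls by exactly one per pair of steps; the names `add_scaled` and `euclid_2t_steps` are hypothetical.

```python
# Assumption: GF(16) generated by x^4 + x + 1; polynomials are lists with index = degree.
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = {v: i for i, v in enumerate(EXP)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]

def div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 15]

def add_scaled(acc, q, poly, shift):
    """Return acc + q * z^shift * poly."""
    out = list(acc) + [0] * max(0, len(poly) + shift - len(acc))
    for i, c in enumerate(poly):
        out[i + shift] ^= mul(q, c)
    return out

def euclid_2t_steps(S, t):
    """One Galois field division per step, 2t steps in total.

    Ordinary case only: no protection for degree differences above one.
    """
    r_hi, r_lo = [0] * (2 * t) + [1], list(S)   # dividend z^2t, divisor S(z)
    b_hi, b_lo = [0], [1]                        # B_-1(z), B_0(z)
    quotients = []
    for step in range(1, 2 * t + 1):
        shift = step % 2                         # odd step: q*z, even step: q
        q = div(r_hi[-1], r_lo[-1])
        quotients.append(LOG[q])
        r_hi = add_scaled(r_hi, q, r_lo, shift)[:-1]   # leading term cancels
        b_hi = add_scaled(b_hi, q, b_lo, shift)
        if shift == 0:                           # substitution of the polynomials
            r_hi, r_lo = r_lo, r_hi
            b_hi, b_lo = b_lo, b_hi
    return quotients, b_lo

S = [EXP[e] for e in (8, 1, 1, 2, 10, 2)]        # s_1 .. s_6 of formula 21
quotients, B3 = euclid_2t_steps(S, t=3)
print(quotients)               # prints [13, 6, 13, 13, 8, 1]
print([LOG[c] for c in B3])    # prints [3, 11, 4, 4]
```

The first two quotient exponents, 13 and 6, correspond to formulas 29 and 30, and the result after six steps matches the polynomial-level computation of B_3(z).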
Next, the number of calculations executed on each coefficient at the first step will be stated, beginning with Ri(z). First, the case of calculating without decreasing the coefficients will be explained. As shown in
α^{13} z = z^6 / (α^2 z^5) (34)
Subsequently, six Galois field multiplications are executed, one for each coefficient of R5(z), between the found quotient α^{13} z and R5(z). This is because R5(z) has six coefficients. Subsequently, a Galois field addition (exclusive OR) is executed for each coefficient, whereby Temp_R5(z) is found. Although R6(z) appears to have only the one coefficient of z^6, six additions (in general, 2t additions) are actually executed in total by handling the coefficients of z^5 to z^1 as zero.
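The operation counts of the first step can be tallied with a short sketch. It assumes GF(16) generated by x^4 + x + 1 and uses an illustrative counter dictionary `ops`; it models only the arithmetic, not the circuit.

```python
# Assumption: GF(16) generated by x^4 + x + 1; lists are indexed by degree.
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = {v: i for i, v in enumerate(EXP)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]

def div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 15]

S = [EXP[e] for e in (8, 1, 1, 2, 10, 2)]   # R_5(z) = S(z) of formula 21
R6 = [0, 0, 0, 0, 0, 0, 1]                   # z^6; lower coefficients are zero
ops = {"div": 0, "mul": 0, "add": 0}

q = div(R6[6], S[5])                          # formula 34: alpha^13
ops["div"] += 1

temp = {}                                     # degree -> coefficient of Temp_R5(z)
for k in range(6):                            # six coefficients of R_5(z)
    prod = mul(q, S[k])                       # q*z moves degree k to degree k + 1
    ops["mul"] += 1
    temp[k + 1] = R6[k + 1] ^ prod            # Galois field addition is XOR
    ops["add"] += 1

print(LOG[q], ops)   # prints 13 {'div': 1, 'mul': 6, 'add': 6}
```

The coefficient of z^6 in `temp` comes out as zero, which is the cancellation of the leading term, and the coefficient of z^5 is α^8, the numerator of the next division.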
The processing number of times of the polynomial Bi(z) needs to correspond to the number of the coefficients. In the case of t=3, the numbers of times of the additions and multiplications of the Galois fields are both four times (in general, (t+1) times). Originally, as shown in
The Euclid processing module requires circuit arrangements for the divisions, additions and multiplications of the Galois fields, and besides, circuit arrangements for controlling these calculations. The principal controls are, for example, the two of (1) the substitution control of the polynomial, and (2) the protection control of (the difference of degrees)>1.
First, (1) the substitution of the polynomial will be explained. At the second step in
Next, (2) “the protection control of (the difference of degrees)>1” is a control for a situation where the difference of the degrees of the dividend polynomial and the divisor polynomial has become at least two. In the Euclid calculations, in most cases, the difference of the degrees of the dividend polynomial and the divisor polynomial is at most one at any of the first to sixth steps as shown in
One conceivable method for performing these two controls is to count the degrees of the dividend polynomial and the divisor polynomial as the processing proceeds. With this method, however, the operating frequency is drastically lowered in some cases. Therefore, one of the problems addressed by the Euclid processing module improvements in the invention is to realize these controls with a simple circuit arrangement and to raise the operating frequency.
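A degree difference larger than one can arise, for example, when fewer than t errors are present. The sketch below illustrates only this mathematical situation, not the protection control itself. It assumes GF(16) generated by x^4 + x + 1 and a hypothetical single error at position 0, for which every syndrome equals 1.

```python
# Assumption: GF(16) generated by x^4 + x + 1; polynomials are lists with index = degree.
EXP = [1]
for _ in range(14):
    v = EXP[-1] << 1
    EXP.append(v ^ 0b10011 if v & 0b10000 else v)
LOG = {v: i for i, v in enumerate(EXP)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]

def div(a, b):
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 15]

def degree(p):
    """Degree of a polynomial; the zero polynomial is reported as -1."""
    return max((i for i, c in enumerate(p) if c), default=-1)

def remainder(num, den):
    """Remainder of num divided by den (den must be nonzero)."""
    num, d = list(num), degree(den)
    for shift in range(degree(num) - d, -1, -1):
        q = div(num[shift + d], den[d])
        for i in range(d + 1):
            num[shift + i] ^= mul(q, den[i])
    return num

t = 3
S = [1] * (2 * t)                  # hypothetical single error: s_1 = ... = s_6 = 1
R_2t = [0] * (2 * t) + [1]         # z^6
rem = remainder(R_2t, S)
print(degree(R_2t), degree(S), degree(rem))   # prints 6 5 0
```

After the first division, the next dividend has degree 5 and the next divisor has degree 0, so the difference of degrees is five rather than one.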
The Euclid processing module improvements in the invention address the further problem of decreasing the logic scale. In order to package the Euclid processing module into an LSI, a method is considered in which registers for the coefficients of Ri(z) and Bi(z) are provided in two sorts, for the dividend polynomial and the divisor polynomial, and are cascade-connected to each other. The resource quantity required for the calculations of both Bi(z) and Ri(z) based on this technique, that is, the numbers of registers and processing modules, becomes as follows on the side of Ri(z) in the case of making the error corrections of t words:
To sum up, the improvements in the Euclid processing module on this occasion have for their objects to simplify the controlling circuit configuration for raising the operating frequency with, for example, the minimum latency (2t steps), and to decrease the resource quantity for the reduction of the logic scale.
Heretofore, a large number of circuit configurations have been contrived as the Euclid processing modules. However, when the Euclid processing module is created for RS code, it is applicable for BCH code directly without requiring any circuit alteration, so that most inventions are for the RS code. Therefore, when these inventions are viewed as being dedicated to the BCH code, they are circuit configurations which have room for the optimization yet. In order to explain this point, the two of JP-A-5-165662 (Patent Document 1) and JP-A-7-240692 (Patent Document 2) will be used as examples.
First, the technique of JP-A-5-165662 will be stated. Patent Document 1 contains statements concerning a non-multiplexed case (of short latency) and a multiplexed case (of long latency). In this embodiment, however, a target is to shorten a latency (to make the minimum 2t steps sufficient), and hence, the non-multiplexed case will be described.
The technique of JP-A-5-165662 realizes Euclid calculations with a small quantity of resources as indicated below by way of example.
For the purpose of effectively using the small quantity of resources, however, signals need to be sent to individual switches (selectors) in a control circuit by executing complicated processes as indicated in Table 5.
Here, a suffix “j” denotes the No. of the processing module, and 1≦j≦(2t−1) holds. This technique is configured of (2t−1) processing modules, and each processing module has three sorts of selectors. In consequence, the selectors in (2t−1)×3 sorts in total are controlled in six calculation modes. When the control circuit is made complicated by employing a large number of inequality signs in this manner, the operating frequency becomes low. Moreover, with this technique, the (2t−1) processing modules are individually controlled. Therefore, the circuit scale increases though not in Galois field calculations.
The technique of JP-A-7-240692, the other example, is an improvement on a systolic algorithm (pp. 420-428, 1st Issue, Vol. J69-A No. 3 in Proceedings '86 of the Institute of Electronics and Communication Engineers: “Method for configuring Decoder of Reed-Solomon codes based on Systolic algorithm”). In this technique, the protection for the case of (difference of degrees)>1 is considered, and the controlling circuit is simplified. Therefore, the operating frequency is comparatively high, and it is supposed that, with this technique, a delay problem is unlikely to occur even in a low-priced, low-performance device.
The technique of JP-A-7-240692, however, requires a large quantity of resources as indicated below.
Capacity of registers: corresponding to (11t−3) coefficients
Further, with this technique, the calculation of the error locator polynomial σ(z) is started after the calculation of the error evaluator polynomial ω(z), so that the latency becomes long. Incidentally, when improvements are added so as to make the latency 2t steps with this technique, the resource quantity increases as indicated below.
The Euclid processing module for the RS code has the circuit configuration which calculates the error locator polynomial σ(z) and the error evaluator polynomial ω(z) as its functions. As shown in
However, whether the Euclid processing module is dedicated to the RS code or to the BCH code, the calculation circuit of the error evaluator polynomial ω(z) cannot be indiscriminately detached from the Euclid processing module. Although repeatedly explained, this is ascribable to the fact that the quotient polynomial Qi(z) which is calculated in the course of computing Ri(z) to become the candidate of the error evaluator polynomial ω(z) is required for calculating Bi(z) to become the candidate of the error locator polynomial σ(z).
Therefore, when a circuit configuration which calculates only the necessary error locator polynomial σ(z) can be formed by any technique, resources which are originally surplus for the BCH code can be decreased.
For complying with the requirement, the invention has for its object to provide an apparatus, a method and an algorithm which can calculate the necessary Bi(z) without executing some of the calculations of Ri(z).
2. Calculations of Embodiment
In the computations between polynomials, the computations of Ri(z) are necessary for obtaining the certainly necessary Bi(z). In actuality, however, when the computations are viewed in terms of the calculations of individual coefficients, it is understood that the calculations of all the coefficients of the Ri(z) are not necessary. For the brevity of description, the above computational example of Ri(z) and Bi(z) at t=3 will be referred to and will now be described reversely from the sixth step in conjunction with
It is as stated before that the quotients Qi(z) are necessary for the calculations of Bi(z). Conversely speaking, when only the values of the necessary quotients Qi(z) are found, all the Ri(z) calculations need not be executed. At the sixth step of this example, the following calculation is executed on the side of Bi(z) (refer to
B_3(z) = Temp_B3(z) + α^1 B_2(z) (35)
On the side of Ri(z), the quotient α^1 = α^{12}/α^{11} necessary for Bi(z) has already been calculated by a Galois field division. At the sixth step, therefore, the multiplication and addition of a Galois field need not be executed on the Ri(z) side. Parts where the calculations of the sixth step are unnecessary are marked with double lines as shown in
At the fifth step, the following calculation is executed on the Bi(z) side:
Temp_B3(z) = B_1(z) + α^8 z B_2(z) (36)
On the Ri(z) side, the quotient α^8 = α^4/α^{11} has already been calculated by a Galois field division. Besides, since α^{12} is used in the Galois field division of the next, sixth step, only the Galois field multiplication and addition for the third-degree term need to be executed. The Galois field calculations for terms other than the third-degree term are not required. At the fifth step in
Similarly, the fourth step will be described. Here, only the calculations for the third-degree and second-degree terms are required. The reason is that the result α^{11} of the third-degree calculation becomes the denominator of the Galois field division α^4/α^{11} at the fifth step, and that the result α^{11} of the second-degree calculation is used in the third-degree calculation α^{12} = α^6 + α^8 × α^{11} at the fifth step. In the same manner, working back to the first step, a coefficient which need not be calculated exists at every step. In
As a result, the respective numbers of times of the calculations of the multiplications and additions of the Galois fields of Ri(z) as are required for Bi(z) become 5 at the first step, 4 at the second step . . . , one at the fifth step, and zero at the sixth step.
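The Galois field values quoted in this walk-through can be checked numerically. The following Python sketch is only an illustration: it assumes that GF(16) is generated by the primitive polynomial x^4+x+1 (an assumption, although it reproduces the relations α14+α8=α6 and α14×α8=α7 given earlier), and the helper names `alpha`, `gf_mul` and `gf_div` are hypothetical. The expression `2 * t - step` simply reproduces the counts listed above for t=3; its validity for other values of t is not confirmed by the text.

```python
# GF(16) exponent/log tables; the primitive polynomial x^4 + x + 1 is an assumption.
EXP = []
value = 1
for _ in range(15):
    EXP.append(value)
    value <<= 1
    if value & 0x10:          # degree-4 overflow: reduce by x^4 + x + 1 (0b10011)
        value ^= 0x13
LOG = {v: k for k, v in enumerate(EXP)}


def alpha(k):
    """Return the field element alpha^k as a 4-bit integer."""
    return EXP[k % 15]


def gf_mul(a, b):
    """Galois field multiplication through the log tables."""
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def gf_div(a, b):
    """Galois field division a / b (b must be nonzero)."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(16)")
    return 0 if a == 0 else EXP[(LOG[a] - LOG[b]) % 15]


# Quotients and the single multiply-add quoted for the fifth and sixth steps.
q5 = gf_div(alpha(4), alpha(11))                 # quotient of the fifth step
q6 = gf_div(alpha(12), alpha(11))                # quotient of the sixth step
r5 = alpha(6) ^ gf_mul(alpha(8), alpha(11))      # addition in GF(2^m) is XOR

# Multiply-add counts on the Ri(z) side per step for t = 3, as listed in the text.
t = 3
needed = [2 * t - step for step in range(1, 2 * t + 1)]
```

With these tables, `q5` equals α8, `q6` equals α1 and `r5` equals α12, in agreement with the values stated for the fifth and sixth steps.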
On the other hand, as shown in
However, (3) the adjustments of the processing modules of Ri(z) and Bi(z) have been added anew to the two controls having heretofore been performed; (1) the substitution of polynomials and (2) the protection of the (difference of degrees)>1. Therefore, it is naturally possible that a circuit configuration will become complicated to hamper the optimization conversely. Nevertheless, in this embodiment, this control can be realized with a simple configuration by a technique concretely indicated below, and a circuit which decreases resources still further can be provided.
3. Hardware Configuration
An error correction decoding circuit a3 in this embodiment includes, for example, a configuration shown in
The error correction decoding circuit may well have a configuration shown in
The Euclid processing module includes a register 10 (first register), a register 20 (second register), a processing module 30, a sequencer (controller) 40, a shifter 50, a zero insertion unit 60, and selectors 70 and 80.
First, the registers 10 and 20 will be described. In actuality, it is more convenient to put the coefficients of Ri(z) and Bi(z) together and to handle them. Therefore, those coefficients of a virtual polynomial into which the above coefficients are put together will be discussed. First, initial values as indicated in
The coefficients of Ri(z) and Bi(z) are collected in this manner, whereby the capacity of necessary registers is decreased from (6t+4) to (4t+2) as compared with the basic resource quantity, and also the number of times of the multiplications and additions of Galois fields is decreased from (3t+1) to (2t+1). In actuality, however, the multiplication and addition of the coefficients of the highest degree #2t of the polynomial are unnecessary as shown in
Next, the circuit configuration of the processing module 30 in
α13z=z6/α2z5 (37)
Temp_R5(z)=R6(z)+α13zR5(z) (38)
Temp_B1(z)=B−1(z)+α13zB0(z) (39)
As shown in
α0×(1/α2) (40)
Here, 1/α2 is the inverse element value of α2. As the inverse element values, values which are previously registered in a memory as shown in the table of
Subsequently, the following multiplication parts of Formulas 38 and 39 are executed by the group of Galois field multiplication units in
α13z×R5(z) (41)
α13z×B0(z) (42)
The multiplication units exist, for example, in the number of 2t in the current case.
Lastly, R6(z) and B−1(z) are added to α13zR5(z) and α13zB0(z) outputted from the Galois field multiplication unit group 320, by the Galois field addition unit group 330 in
Here, #5 to #1 in Table 6 indicate Temp_R5(z), and #0 indicates Temp_B1(z).
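The data path just described (inverse element lookup, one multiplication for the quotient, a group of 2t multiplications, and a group of additions) can be sketched in Python as follows. This is a minimal model under stated assumptions, not the circuit itself: GF(16) with the primitive polynomial x^4+x+1 is assumed, `INV_ROM` and `processing_module` are hypothetical names, and the register contents in the example, other than the highest coefficient α2 of the register 10, are made-up values.

```python
# GF(16) tables (primitive polynomial x^4 + x + 1 assumed).
EXP = []
value = 1
for _ in range(15):
    EXP.append(value)
    value <<= 1
    if value & 0x10:
        value ^= 0x13
LOG = {v: k for k, v in enumerate(EXP)}

# Stand-in for the inverse element ROM 310: address = element, data = its inverse.
# Address 0 has no inverse; 0 is stored there only as a placeholder.
INV_ROM = [0] + [EXP[(-LOG[a]) % 15] for a in range(1, 16)]


def gf_mul(a, b):
    """Galois field multiplication through the log tables."""
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def processing_module(reg10, reg20):
    """Return (quotient, sums) for one step; registers are lists indexed #0..#2t."""
    top = len(reg10) - 1                                    # index of #2t
    quotient = gf_mul(reg20[top], INV_ROM[reg10[top]])      # unit 340 with ROM 310
    products = [gf_mul(quotient, c) for c in reg10[:top]]   # multiplier group 320
    sums = [p ^ c for p, c in zip(products, reg20[:top])]   # adder group 330
    return quotient, sums


# Illustrative first step for t = 3: #6 of register 10 holds alpha^2 as in Formula 37;
# the other entries of register 10 are hypothetical.
reg10 = [1, EXP[5], EXP[3], EXP[9], EXP[1], EXP[7], EXP[2]]
reg20 = [0, 0, 0, 0, 0, 0, 1]
quotient, sums = processing_module(reg10, reg20)
```

Here `quotient` comes out as α13, the value of α0×(1/α2) in Formula 40, and `sums[0]` corresponds to the coefficient of Temp_B1(z) held in #0.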
Next, the sequencer 40 and the selectors 70 and 80, which are controlled by it, will be described. The sequencer 40 holds the virtual degrees of the dividend polynomial and the divisor polynomial as internal variables (these degrees are purely virtual and differ from the actual values), and it receives the value of #2t of the register 10 as an input signal. Here, variables (parameters) are defined as follows:
dV=virtual degree of dividend polynomial (initial value is 0); (43)
dW=virtual degree of divisor polynomial (initial value is 0); (44)
R102t=value of #2t of register 10; (45)
The sequencer 40 creates the following three processing modes from dV, dW and R102t, and it controls the selectors 70 and 80 as in Table 7 in accordance with the processing modes:
Here, “SEL_1” indicates the control signal of the selector 70, and “SEL_2” that of the selector 80. The control circuit of JP-A-5-165662 as indicated in Table 5 has six processing modes and controls (2t−1)×3 sorts of selectors. In contrast, in the control circuit of this embodiment, the number of processing modes is three, and the selectors controlled are of only two sorts. In this respect as well, the control circuit of this embodiment is greatly simplified.
Incidentally, dV and dW change in values every mode as follows:
At processing mode 1: dV=dV+1; (46)
At processing mode 2: dV=dW; (47)
At processing mode 3: dW=dW+1; (48)
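The mode decision and the parameter updates of Formulas 46 to 48 can be sketched as below. The selection conditions are taken from the flow description of this section (the highest coefficient #2t of the register 10 being zero leads to mode 1; otherwise dV being at least dW leads to mode 3, and the remaining case to mode 2). The function names are hypothetical, and the selector signals SEL_1 and SEL_2 are deliberately not modeled.

```python
def select_mode(dv, dw, r10_top):
    """Choose the processing mode from dV, dW and the value of #2t of register 10."""
    if r10_top == 0:
        return 1                      # shift only, no division possible
    return 3 if dv >= dw else 2


def update_degrees(mode, dv, dw):
    """Apply Formulas 46-48 and return the new (dV, dW)."""
    if mode == 1:
        return dv + 1, dw             # (46)
    if mode == 2:
        return dw, dw                 # (47)
    return dv, dw + 1                 # (48)


# Trace for 2t = 6 steps when the highest coefficient is never zero.
dv, dw, modes = 0, 0, []
for _ in range(6):
    mode = select_mode(dv, dw, r10_top=1)
    modes.append(mode)
    dv, dw = update_degrees(mode, dv, dw)
```

In this trace the modes alternate between 3 and 2, and dW ends at 3, which is the value later used as the degree of the error locator polynomial for t=3.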
In the operation mode 2, the sequencer 40 outputs SEL_1=0 so as to assert “0” of the selector 70, and it outputs SEL_2=1 so as to assert “1” of the selector 80. Consequently, an output from the processing module 30 is stored in the register 10 through the shifter 50, and the value of the register 10 is stored in the register 20 through the zero insertion unit 60.
In the operation mode 1, the sequencer 40 outputs SEL_1=1 so as to assert “1” of the selector 70, and it outputs SEL_2=1 so as to assert “1” of the selector 80. Consequently, the value of the register 10 is stored in the register 10 again through the shifter 50, and the value of the register 20 is stored in the register 20 again through the zero insertion unit 60.
Next, the shifter 50 in
Next, the zero insertion unit 60 in
In this manner, the substitution controls of the dividend polynomial, the divisor polynomial and the remainder polynomial are performed in accordance with the processing mode 2 and the processing mode 3. In case of a control where the difference of degrees of the dividend polynomial and the divisor polynomial is, at least, 2, the control of the processing mode 1 is performed. In the case where the difference of degrees has become, at least, 2, the highest degree coefficient of the dividend polynomial seems as if it were zero (as if R102t=0 were held), when viewed from the circuit. The processing mode 1 is executed to perform the shifts until the (highest degree coefficient of the dividend polynomial)≠0 is reached, and to control the degree difference of the respective polynomials stored in the registers 10 and 20.
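As a rough illustration of the shifter 50 and the zero insertion unit 60, the sketch below models both as list operations and applies them as in the processing mode 1. The behavior is inferred from the flow description of this section (values moved into #1 to #2t with “0” stored in #0, and the area #(cnt+1) forced to “0”); the function names and the example register contents are hypothetical.

```python
def shifter(values):
    """Shifter 50: place the 2t input values in #1..#2t and store 0 in #0."""
    return [0] + list(values)


def zero_insertion(register, cnt):
    """Zero insertion unit 60: copy the register, forcing area #(cnt+1) to 0."""
    out = list(register)
    out[cnt + 1] = 0
    return out


def apply_mode1(reg10, reg20, cnt):
    """Mode 1: register 10 is shifted by one area, register 20 is zero-inserted."""
    new_reg10 = shifter(reg10[:-1])          # R10[i] = R10[i-1], R10[0] = 0
    new_reg20 = zero_insertion(reg20, cnt)   # R20[cnt+1] = 0, others unchanged
    return new_reg10, new_reg20


# Illustrative contents for t = 2 (five areas #0..#4) at counter value 2.
new_reg10, new_reg20 = apply_mode1([0, 1, 1, 0, 0], [1, 0, 0, 1, 1], cnt=2)
```

Because the highest area of the register 10 is zero whenever mode 1 is selected, dropping it in `reg10[:-1]` loses no information.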
Concrete examples of the operation of this embodiment will be described with reference to the flow chart of
First, the Euclid processing module a22 is initialized (step S1). By way of example, the sequencer 40 (or the zero insertion unit 60) sets the counter at zero (cnt=0). Besides, the sequencer 40 sets the parameters dV and dW at zero (dV=0, and dW=0). In addition, it stores the coefficients SDi of the inputted syndrome polynomial in the storage areas #1-#2t (hereinbelow, written as “#i”) of the register 10 (R10i=SDi−1 (i=1, . . . , and 2t)). An initial value “1” (α0) is stored in the #0 of the register 10 (R100=1). Besides, the initial value “1” (α0) is stored in the highest-degree #2t of the register 20 (R202t=1). An initial value “0” is stored in the #0-#2t−1 of the register 20 (R20i=0 (i=0, . . . , 2t−1)). Owing to the above processing, the registers 10 and 20 are set as the initial values of
Subsequently, whether the coefficient of the highest degree #2t of the register 10 is nonzero is judged by, for example, the sequencer 40 (step S13). If the coefficient is not zero, the flow shifts to a step S15, and if the coefficient is zero, the flow shifts to a step S29.
At the step S15, the processing module 30 obtains a quotient Q2t (Q2t=R202t/R102t) in such a way that the coefficient stored in the highest degree #2t of the register 20 is Galois-field-divided by the coefficient stored in the highest degree #2t of the register 10. By way of example, the Galois field division is executed by the inverse element ROM 310 and the Galois field multiplication unit 340 in
Besides, the processing module 30 Galois-field-multiplies the respective coefficients of the #0-#2t−1 of the register 10 and the obtained quotient Q2t, and it Galois-field-adds the results of the multiplications and the respective coefficients of the #0-#2t−1 of the register 20 so as to output the results of the additions (M30i=R20i+Q2t·R10i (i=0, . . . , 2t−1)). By way of example, the Galois field multiplications and additions are executed by the group 320 of Galois field multiplication units and the group 330 of Galois field addition units in
At a step S17, whether the parameter dV is at least the parameter dW (dV≥dW) is judged by, for example, the sequencer 40. If dV is at least dW, the flow shifts to a step S19 corresponding to the operation mode 3, whereas if dV is less than dW, the flow shifts to a step S25 corresponding to the operation mode 2. Incidentally, at the first step the flow shifts to the step S19.
At the step S19, the sequencer 40 increases the parameter dW by one (dW=dW+1). Besides, among the storage areas of the register 10, one corresponding to (the value of the counter)+1 (cnt+1) is rewritten into “0” (R10i=0 (i=cnt+1)). By way of example, among the respective coefficients from the register 10, the coefficient of the area corresponding to (the value of the counter)+1 is zeroized by the zero insertion unit 60, and the other coefficients are left intact and are written into the register 10 again. In the example of
Besides, the output of the processing module 30 is stored in the register 20. By way of example, the outputs M30i−1 (i=1, . . . , and 2t) of the processing module 30 are stored in the #1-#2t of the register 20, and “0” is stored in the #0 of the register 20 (R20i=M30i−1 (i=1, . . . , and 2t), and R200=0). By way of example, the shifter 50 shifts the results of additions with the respective coefficients of the #0 to #(2t−1) of the register 20 in the Galois field addition unit group 330, so as to be stored in the storage areas #1 to #2t of the register 10 or the register 20. Thus, the respective coefficients of Temp_R5(z) and the coefficient of Temp_B1(z) are stored in the register 20 (refer to “after 1 step” in
At a step S21, whether the value of the counter has reached the predetermined number (2t−1) is judged, by way of example. That is, whether the steps have been repeated 2t times is judged. If the steps have been repeated 2t times, the flow proceeds to a step S27. Otherwise, the counter is increased by one at a step S23 (cnt=cnt+1), and the step S13 and the subsequent steps are repeated.
Also at the second step, calculations are executed in the same manner as at the above steps S13 and S15 (refer to the second step in
At the step S25, the sequencer 40 substitutes the value of the parameter dW into the parameter dV (dV=dW). Besides, among the coefficients of the register 10, the one in the area corresponding to (the value of the counter)+1 (cnt+1) is rewritten into “0”, and the result is stored in the register 20 (R20i=R10i (i=0, . . . , cnt, cnt+2, . . . , and 2t) and R20i=0 (i=cnt+1)). By way of example, among the respective coefficients from the register 10, the coefficient of the area corresponding to (the value of the counter)+1 is zeroized by the zero insertion unit 60, and the other coefficients are left intact and are written into the register 20. In the example of
Besides, the output of the processing module 30 is stored in the register 10. By way of example, the outputs M30i−1 (i=1, . . . , and 2t) of the processing module 30 are stored in the #1-#2t of the register 10, and “0” is stored in the #0 of the register 10 (R10i=M30i−1 (i=1, . . . , and 2t) and R100=0). Thus, at the second step, the respective coefficients of R4(z) as have been obtained and the respective coefficients of B1(z) are stored in the register 10 (refer to “after 2 steps” in
Also at the odd-numbered steps of the third step et seq., processing is executed in the same manner as at the first step stated above, and also at the even-numbered steps of the fourth step et seq., processing is executed in the same manner as at the second step stated above.
Incidentally, at the respective steps, when the coefficient of the highest degree #2t of the register 10 is “0” at the step S13, the flow shifts to the step S29 corresponding to the operation mode 1. At the step S29, the sequencer 40 increases the parameter dV by one (dV=dV+1). Besides, the storage areas of the coefficients of the register 10 are shifted (R10i=R10i−1 (i=1, . . . , and 2t)), and “0” is stored in the #0 of the register 10 (R100=0). Besides, among the storage areas of the register 20, the area corresponding to (the value of the counter)+1 (cnt+1) is rewritten into “0”, which is stored in the register 20 (R20i=R20i (i=0, . . . , cnt, cnt+2, . . . , and 2t) and R20i=0 (i=cnt+1)).
When the above processing has been repeated for 2t steps, the flow shifts to the step S27, at which the coefficients of the error locator polynomial among the coefficients stored in the register 10 are outputted. By way of example, the parameter dW is substituted into the degree dσ of the error locator polynomial. Besides, for i=0 to dσ, a parameter “j” is obtained from j=2t−dσ+i, and the coefficient of the #j of the register 10 is outputted as the coefficient of zi of the error locator polynomial. In the example of
Here, the calculated results of from the initial values to the sixth step are shown in
By way of example, the Euclid processing module a22 outputs the number of coefficients corresponding to the degree of the error locator polynomial among the coefficients outputted from the Galois field addition unit group 330 after the 2t steps or among the coefficients stored in the register 10 after the 2t steps, as the coefficients of the error locator polynomial. Besides, by way of example, among the coefficients stored in the first register after the 2t steps, (t+1) coefficients stored in storage areas #2t to #t are outputted as the coefficients of the error locator polynomial.
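The whole flow from the step S1 to the step S29 can be traced with a short Python model. The following is a sketch transcribed from the textual description only, not from the figures, and it rests on several assumptions: GF(16) with the primitive polynomial x^4+x+1, syndrome coefficients SDi taken as the sum of the (i+1)-th powers of the error locations, and the hypothetical function name `euclid_bch`. The returned coefficients are those of the error locator polynomial up to a common scalar factor.

```python
# GF(16) tables (primitive polynomial x^4 + x + 1 assumed).
EXP = []
value = 1
for _ in range(15):
    EXP.append(value)
    value <<= 1
    if value & 0x10:
        value ^= 0x13
LOG = {v: k for k, v in enumerate(EXP)}


def gf_mul(a, b):
    """Galois field multiplication through the log tables."""
    return 0 if a == 0 or b == 0 else EXP[(LOG[a] + LOG[b]) % 15]


def gf_inv(a):
    """Inverse element, playing the role of the inverse element ROM."""
    return EXP[(-LOG[a]) % 15]


def euclid_bch(syndromes):
    """Trace steps S1-S29; syndromes = [SD0, ..., SD(2t-1)]; returns sigma coefficients."""
    n = len(syndromes)                    # n = 2t
    reg10 = [1] + list(syndromes)         # S1: R10[0] = 1, R10[i] = SD(i-1)
    reg20 = [0] * n + [1]                 # S1: R20[2t] = 1, others 0
    dv = dw = 0
    for cnt in range(n):                  # S21/S23: 2t steps in total
        if reg10[n] == 0:                 # S13 -> S29 (mode 1)
            dv += 1
            reg10 = [0] + reg10[:n]       # shift register 10, 0 into #0
            reg20[cnt + 1] = 0            # zero insertion on register 20
            continue
        q = gf_mul(reg20[n], gf_inv(reg10[n]))                     # S15: quotient
        m30 = [reg20[i] ^ gf_mul(q, reg10[i]) for i in range(n)]   # S15: sums
        if dv >= dw:                      # S17 -> S19 (mode 3)
            dw += 1
            reg10[cnt + 1] = 0
            reg20 = [0] + m30
        else:                             # S17 -> S25 (mode 2)
            dv = dw
            reg20 = list(reg10)
            reg20[cnt + 1] = 0
            reg10 = [0] + m30
    d_sigma = dw                          # S27: degree of the locator polynomial
    return [reg10[n - d_sigma + i] for i in range(d_sigma + 1)]


# t = 2, two errors at locations alpha^0 and alpha^1: SDi = 1 + alpha^(i+1).
sigma = euclid_bch([EXP[4], EXP[8], EXP[14], EXP[1]])
```

For this input, `sigma` is [α6, α10, α7], that is, α6·(1+α4z+α1z2), whose roots are the inverses of the two error locations. A single error at location α0 with t=2 (all syndromes equal to 1) exercises the mode 1 path and yields 1+z.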
Owing to this embodiment, the three controls, namely, (1) the substitution of the polynomial, (2) the protection of (the difference of degrees)>1, and (3) the adjustments of the processing modules of Ri(z) and Bi(z) can be realized by the simplified circuit configuration, and the operating frequency can be heightened, while at the same time, the decreases of the Galois field calculations and the registers can be realized.
4. Performance Comparisons
Here will be discussed how much the improvements in the Euclid processing module of this embodiment have contributed to the decoding circuit of the concatenated codes in Recommendation G.975.1 of the ITU-T as stated at the beginning. Since, however, the statement of the entirety leads to a complicated explanation, comparisons will be confined to only a part at which the decoding circuits of BCH_2 are used in the number of 48.
Table 12 below indicates circuit scales, memories and operating frequencies after one Euclid processing module has been packaged into an LSI, before the improvements and after the improvements. Since the actual values of the logic scale and the operating frequency differ depending upon LSIs on which the processing module is mounted, only the comparisons of rough proportions will be made here. As a criterion, the logic scale of a unit of long latency before the improvements is set at 100. Incidentally, since packaging results are indicated, the unit of the latency has changed from steps to clocks, but the clocks have the same significance.
In the specifications of the recommendation, the Euclid processing module of the BCH_2 must end processing within 255 clocks. In order to limit the latency of the Euclid processing module within 255 clocks, a contrivance is required to a corresponding extent, but a method therefor shall not be stated here.
Next, regarding these Euclid processing modules, let's consider a case where four decoding circuits are operating at one time as seen in
Table 14 indicates comparisons at the part at which the decoding circuits of the BCH_2 are used in the number of 48. Unless 255 clocks are exceeded, Euclid's calculation process can be repeatedly used. Therefore, effects in the case of causing units of short latencies to execute the processing eight times are indicated on this occasion. In Table 14, “other units” correspond to all units other than the Euclid processing module as have been added up. Likewise, “other memories” signify memories used in others than the Euclid processing module as have been added up. By the way, in the inventor's design, the logic scale of the other unit logic becomes 125, and the memory thereof becomes about 22.3 kilobits, per decoding circuit.
In this manner, it is understood that, at the part at which the 48 BCH_2 decoding circuits are used, the logic scale and the memories have been respectively decreased about 30% and about 60% as a whole. Since, however, the values are somewhat different depending upon devices, actually the decrease rate of the logic scale is 20% to 30% in some cases.
Besides, as stated before, the Euclid processing module is more complicated as compared with any other unit, and hence, it is liable to form a factor for lowering the operating frequency of the whole circuit. Therefore, the performance of all of the 48 BCH_2 decoding circuits has also been bettered by the improvements of the Euclid processing module on this occasion.
In this manner, the improvements according to the invention are very useful for putting into practical use the concatenated codes that are becoming the mainstream of optical fiber communications, and they are expected to become still more important in the future. Moreover, not only in communications but in other fields as well, the number of elements of the BCH code and the number of error correction words are expected to keep increasing. Therefore, the invention is expected to be effectively utilized in all fields where error correction technology is employed.
The invention is applicable to, for example, error correction decoding in optical communications. Concretely, the invention is applicable in a field where the error correction decoding is performed using BCH code. Besides, the invention is applicable to error corrections in the fields of communications, computers, audios/videos, etc.
Number | Date | Country | Kind |
---|---|---|---|
2008-008425 | Jan 2008 | JP | national |
Number | Name | Date | Kind |
---|---|---|---|
4833678 | Cohen | May 1989 | A |
5699368 | Sakai et al. | Dec 1997 | A |
6304994 | Oh et al. | Oct 2001 | B1 |
6487691 | Katayama et al. | Nov 2002 | B1 |
6732325 | Tash et al. | May 2004 | B1 |
Number | Date | Country |
---|---|---|
05-165662 | Jul 1993 | JP |
07-240692 | Sep 1995 | JP |
Number | Date | Country | |
---|---|---|---|
20090187812 A1 | Jul 2009 | US |