This application is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 2010-28523, filed on Feb. 12, 2010, the entire contents of which are incorporated herein by reference.
Embodiments relate to an error detection and correction system.
Along with miniaturization and increase in capacity of memory, data retention characteristics of individual memory cells deteriorate. Data retention characteristics become a particular problem when the memory is made multilevel. At the same time, phase change memory and resistance varying memory, of which there are great expectations as the next generation of NAND type flash memory, suffer a problem in stability of data state which makes it difficult to assure reliability of data retention.
Consequently, increasing importance is given to technology for installing an ECC (Error Correcting Code) system which detects and corrects errors prior to read data of the memory actually being used. Mounting an ECC circuit in a flash memory chip or in a memory controller controlling the flash memory chip is conventionally proposed (refer, for example, to Patent Document 1: U.S. Pat. No. 6,185,134 B1).
When error correction of two or more bits is performed in a BCH-ECC system utilizing a finite field $GF(2^n)$, the method of finding the solution of the error position search equation by sequentially substituting finite field elements, and selecting as the solution those elements satisfying the equation, requires an enormous amount of calculation time. In the case of on-chip mounting, the read and write specification of the memory is thereby significantly degraded.
In contrast, the inventor of the present invention proposes an on-chip ECC circuit configured to enable high speed detection and correction of up to four bit errors without prior execution of any such sequential search (refer, for example, to Patent Document 2: JP 2009-43385 A).
An error detection and correction system in accordance with an embodiment comprises: an encoding unit operative to generate a check bit based on an information bit, the check bit and the information bit to be stored in a memory cell array; a syndrome calculating unit operative to calculate a syndrome based on read data from the memory cell array; a syndrome element calculating unit operative to perform a calculation for expressing coefficients of an error search equation corresponding to the read data by Galois field elements; an error search unit operative to solve the error search equation based on a calculation result of the syndrome element calculating unit, and thereby obtain an error bit position; and an error correction unit operative to perform an error bit correction of the read data, read and write of the memory cell array being assumed to be performed concurrently for m bits, and error detection and correction being assumed to be performed in data units of M bits (where M is an integer multiple of m), and the encoding unit and the syndrome calculating unit sharing a time-division decoder for performing data bit selection according to respective tables of check bit generation and syndrome generation, the time-division decoder being operative to repeat multiple cycles of m bit concurrent data input.
Next, an error detection and correction system in accordance with embodiments of the present invention is described with reference to the drawings.
Prior to describing the embodiments, aims of proposal of this invention are described.
In the case of installing an ECC circuit on-chip in a memory, and in application fields where processing speed is not a concern, it becomes top priority from the point of view of lowering costs of the memory to adopt a method satisfying the conditions of reducing as far as possible the degree of redundancy in a region occupied by check bits and so on, and furthermore of reducing as far as possible circuit scale of the data processing system.
In view of the above conditions, this invention proposes a BCH-ECC system (hereinafter referred to as “4EC-BCH-ECC system”) operative to perform four-bit error correction in units of 512 bytes, for example. The Galois field capable of covering 512 bytes=4096 bits of data is $GF(2^{13})$=GF(8192), in which the number of finite field elements excluding the zero element is 8191, which is itself a prime number. Hence this invention does not allow utilization of a method for achieving high speed by parallelizing a calculation step for error detection and correction by prime-factor-decomposing the number of finite field elements, as in, for example, the proposal of previously mentioned Patent Document 2 or the like, in which 512 bits is adopted as a unit of error correction.
Accordingly, this invention is assumed appropriate to memory applications requiring reduction of circuit scale rather than speeding up of error correction, and proposes the 4EC-BCH-ECC system operative to perform four-bit error correction in units of 512 bytes. In this case, the minimum required number of check bits is 13×4=52 bits, thereby allowing the degree of redundancy in the memory to be considerably reduced.
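The figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch only; the helper `is_prime` and the constant names are illustrative and do not appear in the embodiment.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

DATA_BITS = 512 * 8                        # one ECC unit: 512 bytes
FIELD_DEGREE = 13                          # GF(2^13)
NONZERO_ELEMENTS = 2**FIELD_DEGREE - 1     # elements excluding zero
CORRECTABLE_ERRORS = 4
MIN_CHECK_BITS = FIELD_DEGREE * CORRECTABLE_ERRORS

# The code length (data plus check bits) must fit within the nonzero elements.
fits = DATA_BITS + MIN_CHECK_BITS <= NONZERO_ELEMENTS

print(DATA_BITS, NONZERO_ELEMENTS, is_prime(NONZERO_ELEMENTS), MIN_CHECK_BITS, fits)
```

Because 8191 has no nontrivial factors, the index of an element cannot be split into smaller residues, which is why the factorization-based parallelization is unavailable here.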
In addition, since error correction processing is performed in units of 512 bytes of data, there is to some extent a need for time to transfer the data required for one processing, thus making it difficult to perform ECC processing in real time of data input/output. Accordingly, on the assumption that ECC processing is performed within a time that is approximately the time used in data transfer, it is proposed to use time-division multiplexing of partial calculating circuits and thereby reduce circuit scale. Specifically, it is aimed to reconsider the method of calculation processing to configure a 4EC system capable of data processing in a few μs.
In the interest of circuit scale reduction, if an index expression is used for elements, then a calculation involving multiplication of elements becomes one of addition by an adder, allowing scale to be reduced. However, when elements are added, the elements are converted to polynomial coefficient expressions and a parity check of each coefficient is performed. This leads to the need for a decoder to convert between the binary expression of the index and the coefficient expression, and, if the number of elements in the Galois field is large, scale of the decoder becomes enormous. Accordingly, a configuration of time-division operation is proposed in order that the circuit requiring this conversion be made as small as possible.
Next, aims of the 4EC-BCH-ECC system in this invention are clarified a little more specifically in relationship to conventionally known methods. Performing syndrome generation with respect to errors is similar for all methods, basic processing comprising constructing an error search equation Λ(x) from a syndrome, and next finding elements of a Galois field satisfying Λ(x)=0. The solution obtained is utilized in the correction, and, if all that is required is error correction up to four bits, then the following Method 1 and Method 2 become the principal methods.
Method 1 is known as the Euclid method or Berlekamp-Massey method, and is established as an iteration method algorithm. The solution search of Λ(x)=0 involves successive iteration of Galois field elements to find the elements that are a solution, and this too is established as an iteration method algorithm known as the Chien search method. If the number of elements in the Galois field used becomes large, the calculation time increases accordingly, leading to an enormous calculation time being consumed by this search.
That is, with Method 1, the problem in on-chip processing lies in calculation time, leading to operation specification of the memory being considerably degraded. In contrast, the following Method 2 is proposed as a method enabling high speed error correction processing.
Method 2 solves Λ(x)=0 by an algebraic method. At this time, calculation is performed rendering elements of the Galois field in index display, thereby simplifying calculation. A result of the polynomial coefficient expression of the syndrome is rendered in index display and used in the computation processing, but in order to convert from index display back into a polynomial coefficient expression when addition of elements is performed, a decoder for these conversions is required.
Moreover, a decoder is used also in the solution search of Λ(x)=0 to obtain a solution instantaneously. If the Galois field is small in scale, size of the decoder is also small, but if data handled in one lot is increased, circuit scale of the decoder becomes enormous. Consequently, this Method 2 emphasizes circuit speed and is appropriate to on-chip processing in applications where data scale of batch processing is small enabling size of the Galois field to be reduced and where scale of the ECC circuit is not a great concern.
This invention proposes a novel BCH-ECC system, suitable in fields of memory requiring a reduction in circuit scale rather than processing speed of the ECC in the case that memory capacity is further increased and cells also miniaturized leading to inclusion of unstable cells.
First, an outline of a 4EC-BCH-ECC system in an embodiment is described.
Encoding of Data
Generally, in a 4EC-BCH-ECC system using a finite field $GF(2^{13})$, four irreducible polynomials m1(x), m3(x), m5(x), and m7(x) shown in Expression 1 are used to create a code generating polynomial g(x) of order 52.
Using this code generating polynomial allows a minimum of 13×4 (=52) check bits b0-b51 to be created from information bits d0-d4095.
This code generating polynomial is the minimum polynomial required for four-bit error correction. However in this case, a minimum Hamming distance is nine bits, and, although in the case of five-bit errors there will be no mistaken correction, in the case of six-bit errors or more there is a chance that a mistaken correction corrected by another code will occur.
Accordingly, in this embodiment, in order to further increase the Hamming distance of coded data and thereby lower the probability of mistaken correction, another irreducible polynomial m9(x) and (x+1) are added to create a code generating polynomial g1(x) as shown in Expression 2.
This code generating polynomial g1(x) is used to generate 66 check bits b0-b65, which are stored as redundant bits along with information bits.
That is, as shown in Expression 3, r(x) is assumed to be a remainder obtained by dividing an information polynomial $f(x)x^{66}$ by the code generating polynomial g1(x). The 66 coefficients b0-b65 of this remainder polynomial r(x) become the check bits.
$f(x)x^{66} = q(x)g_1(x) + r(x)$
$r(x) = b_{65}x^{65} + b_{64}x^{64} + \dots + b_1 x + b_0$ [Expression 3]
If this code generating polynomial g1(x) is used, the minimum Hamming distance becomes 12, whereby mistaken correction can be prevented from occurring in the case of up to eight-bit error correction.
Specifically, the encoding processing for generating the check bits is performed by selecting the data bits d0-d4095 through XOR logic in accordance with a table of previously created check bits b0-b65.
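The remainder definition of Expression 3 can be illustrated by polynomial division over GF(2). The sketch below is not the embodiment's circuit: it uses a toy generator polynomial, $x^8+x^7+x^6+x^4+1$ of the double-error-correcting BCH(15,7) code, as a stand-in for the degree-66 g1(x), and the function names `gf2_mod` and `encode` are hypothetical.

```python
def gf2_mod(dividend: int, divisor: int) -> int:
    """Remainder of polynomial division over GF(2).

    Bit k of each integer is the coefficient of x^k.
    """
    while dividend.bit_length() >= divisor.bit_length():
        shift = dividend.bit_length() - divisor.bit_length()
        dividend ^= divisor << shift      # cancel the leading term
    return dividend

def encode(info: int, generator: int) -> int:
    """Return the codeword f(x)*x^r + r(x), with r = deg(generator)."""
    r = generator.bit_length() - 1
    shifted = info << r                   # f(x) * x^r
    return shifted | gf2_mod(shifted, generator)

# Toy generator (assumption for illustration only): x^8 + x^7 + x^6 + x^4 + 1.
G_TOY = 0b111010001

info = 0b1011001                          # seven hypothetical information bits
codeword = encode(info, G_TOY)
check_bits = codeword & 0xFF              # the eight remainder coefficients

print(bin(codeword), bin(check_bits), gf2_mod(codeword, G_TOY))
```

Since addition and subtraction coincide over GF(2), appending the remainder makes the codeword divisible by the generator, which is the property the syndrome calculation relies on. The table-driven XOR selection described above computes the same remainder bits without performing the division step by step.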
Decoding of Data
A polynomial ν(x) corresponding to data read from the memory is expressed using an error polynomial e(x) as in Expression 4.
The term of coefficient 1 in the error polynomial e(x) is the error bit, and error detection thus involves obtaining this term.
As a first stage, remainders when ν(x) is divided by m1(x), m3(x), m5(x), and m7(x) are defined, respectively, as S1(x), S3(x), S5(x), and S7(x), these remainders also being the respective remainders of e(x). These remainder polynomials are syndrome polynomials, and are as shown below in Expression 5.
$\nu(x) \equiv S_1(x) \bmod m_1(x) \rightarrow e(x) \equiv S_1(x) \bmod m_1(x)$
$\nu(x) \equiv S_3(x) \bmod m_3(x) \rightarrow e(x) \equiv S_3(x) \bmod m_3(x)$
$\nu(x) \equiv S_5(x) \bmod m_5(x) \rightarrow e(x) \equiv S_5(x) \bmod m_5(x)$
$\nu(x) \equiv S_7(x) \bmod m_7(x) \rightarrow e(x) \equiv S_7(x) \bmod m_7(x)$ [Expression 5]
If the four-bit errors are in the i, j, k, and l orders, the error polynomial e(x) is as follows in Expression 6.
$e(x) = x^i + x^j + x^k + x^l$ [Expression 6]
Obtaining these orders i, j, k, and l of the error polynomial allows an error position, that is, which bit is an error, to be determined.
Accordingly, i, j, k, and l are obtained by calculation of an index of roots of m1(x)=0 in $GF(2^n)$. As a result, if a remainder when $x^n$ is divided by m1(x) is defined as $p_n(x)$, then $\alpha^n = p_n(\alpha)$, hence X1, X2, X3, and X4, and syndromes S1, S3, S5, and S7 are defined as below in Expression 7.
$X_1 = p_i(\alpha) = \alpha^i$, $S_1 = S_1(\alpha) = \alpha^{\sigma_1}$
$X_2 = p_j(\alpha) = \alpha^j$, $S_3 = S_3(\alpha^3) = \alpha^{\sigma_3}$
$X_3 = p_k(\alpha) = \alpha^k$, $S_5 = S_5(\alpha^5) = \alpha^{\sigma_5}$
$X_4 = p_l(\alpha) = \alpha^l$, $S_7 = S_7(\alpha^7) = \alpha^{\sigma_7}$ [Expression 7]
The above definitions allow the following relationships in Expression 8 to be obtained.
$e(\alpha) = X_1 + X_2 + X_3 + X_4 = S_1$
$e(\alpha^3) = X_1^3 + X_2^3 + X_3^3 + X_4^3 = S_3$
$e(\alpha^5) = X_1^5 + X_2^5 + X_3^5 + X_4^5 = S_5$
$e(\alpha^7) = X_1^7 + X_2^7 + X_3^7 + X_4^7 = S_7$ [Expression 8]
Now the indexes of X1, X2, X3, and X4 are i, j, k, and l, respectively, and the indexes of S1, S3, S5, and S7 are σ1 (=σ), σ3, σ5, and σ7, respectively.
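The relationship between the remainder polynomials of Expression 5 and the power sums of Expression 8 can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the small field GF(2^4) with primitive polynomial $x^4+x+1$ instead of GF(2^13), the corresponding minimal polynomials of α, α^3, α^5, and α^7 in that toy field, and hypothetical error positions. All function names are illustrative.

```python
# Toy field GF(2^4), primitive polynomial x^4 + x + 1 (assumption for
# illustration; the embodiment uses GF(2^13)).
PRIM, N = 0b10011, 15

EXP = []                      # EXP[n] = alpha^n in coefficient form
value = 1
for _ in range(N):
    EXP.append(value)
    value <<= 1
    if value & 0b10000:
        value ^= PRIM

def gf2_mod(a: int, m: int) -> int:
    """Remainder of a(x) divided by m(x) over GF(2)."""
    while a.bit_length() >= m.bit_length():
        a ^= m << (a.bit_length() - m.bit_length())
    return a

def evaluate(poly: int, k: int) -> int:
    """Evaluate a polynomial with GF(2) coefficients at alpha^k."""
    acc = 0
    for power in range(poly.bit_length()):
        if (poly >> power) & 1:
            acc ^= EXP[(power * k) % N]
    return acc

# Minimal polynomials of alpha^1, alpha^3, alpha^5, alpha^7 in the toy field.
MIN_POLYS = {1: 0b10011, 3: 0b11111, 5: 0b111, 7: 0b11001}

def check_syndromes(positions):
    """Return (k, S_k from the remainder, S_k as a power sum) for each k."""
    e_poly = sum(1 << p for p in positions)     # e(x) = x^i + x^j + ...
    rows = []
    for k, m_k in MIN_POLYS.items():
        s_k = evaluate(gf2_mod(e_poly, m_k), k)  # S_k(alpha^k)
        power_sum = 0
        for p in positions:
            power_sum ^= EXP[(k * p) % N]        # X^k with X = alpha^p
        rows.append((k, s_k, power_sum))
    return rows

for row in check_syndromes([1, 4, 9, 13]):       # hypothetical i, j, k, l
    print(row)
```

In each printed row the second and third entries agree, because m_k(x) vanishes at α^k and the remainder therefore takes the same value there as e(x) itself.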
As a second stage, a polynomial ΛR(x) in $GF(2^{13})$ shown in Expression 9, in which X1, X2, X3, and X4 are unknowns, is considered.
Parameters S, D, T, and Q of each of the coefficients in Expression 9 form basic symmetrical polynomials of X1, X2, X3, and X4 as shown below in Expression 10.
$S = S_1 = X_1 + X_2 + X_3 + X_4$
$D = X_1X_2 + X_2X_3 + X_3X_4 + X_4X_1 + X_1X_3 + X_2X_4$
$T = X_1X_2X_3 + X_2X_3X_4 + X_3X_4X_1 + X_4X_1X_2$
$Q = X_1X_2X_3X_4$ [Expression 10]
A relationship exists between these coefficient parameters and the syndromes, that is, the symmetrical polynomials S1=S, S3, S5, and S7, which can be expressed as equations, thereby configuring the simultaneous equations in Expression 11 below.
$SD + T = \zeta$
$(\zeta + S^3)D + S^2T + SQ = \eta$
$(\eta + S^5)D + S^4T + (\zeta + S^3)Q = \theta$, [Expression 11]
provided that $\zeta = S^3 + S_3$, $\eta = S^5 + S_5$, and $\theta = S^7 + S_7$.
The above three simultaneous equations are solved to express D, T, and Q by syndromes. Assuming the coefficient determinant to be Γ, solutions can be obtained as in Expression 12 below.
$\Gamma = S^3\zeta + S\eta + \zeta^2$
$\Gamma D = S^3\eta + S^2\zeta^2 + S\theta + \zeta\eta$
$\Gamma T = S^4\eta + S^2\theta + \zeta^3$
$\Gamma Q = S^4\zeta^2 + S^2\zeta\eta + \zeta\theta + \eta^2$ [Expression 12]
Now, if Γ≠0, D, T, and Q are determined, whereby the process of solving the following error search equation in Expression 13 begins.
$\Lambda_R(x) = x^4 + Sx^3 + Dx^2 + Tx + Q = 0$ [Expression 13]
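The role of the error search equation can be seen in a small numerical experiment: when S, D, T, and Q are the elementary symmetric polynomials of the error locators, the roots of ΛR(x) are exactly those locators. The following sketch assumes the toy field GF(2^4) with primitive polynomial $x^4+x+1$ and hypothetical error positions; it finds the roots by exhaustive substitution purely for checking, not by the decoder-based method of the embodiment.

```python
# Toy field GF(2^4) (assumption for illustration; the embodiment uses GF(2^13)).
PRIM, N = 0b10011, 15
EXP, LOG = [], {}
value = 1
for n in range(N):
    EXP.append(value)         # index -> coefficient form
    LOG[value] = n            # coefficient form -> index
    value <<= 1
    if value & 0b10000:
        value ^= PRIM

def mul(a: int, b: int) -> int:
    """Multiply two elements by adding their indexes modulo N."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % N]

def search_coefficients(x1, x2, x3, x4):
    """Elementary symmetric polynomials S, D, T, Q of the four locators."""
    s = x1 ^ x2 ^ x3 ^ x4
    d = (mul(x1, x2) ^ mul(x2, x3) ^ mul(x3, x4)
         ^ mul(x4, x1) ^ mul(x1, x3) ^ mul(x2, x4))
    t = (mul(mul(x1, x2), x3) ^ mul(mul(x2, x3), x4)
         ^ mul(mul(x3, x4), x1) ^ mul(mul(x4, x1), x2))
    q = mul(mul(x1, x2), mul(x3, x4))
    return s, d, t, q

def lambda_r(x, s, d, t, q):
    """Evaluate x^4 + S x^3 + D x^2 + T x + Q."""
    x2 = mul(x, x)
    return mul(x2, x2) ^ mul(s, mul(x2, x)) ^ mul(d, x2) ^ mul(t, x) ^ q

positions = [1, 4, 9, 13]                    # hypothetical i, j, k, l
locators = [EXP[p] for p in positions]       # X1..X4
coeffs = search_coefficients(*locators)
roots = [x for x in range(1, N + 1) if lambda_r(x, *coeffs) == 0]
found_positions = sorted(LOG[x] for x in roots)
print(found_positions)
```

The indexes of the roots reproduce the error positions, which is why solving Expression 13 is equivalent to locating the error bits.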
Due to the solving method for the simultaneous equations, when Γ=0, one unknown Q can be arbitrarily set to obtain D and T.
When four errors occur, an arbitrary relationship cannot exist between S, D, T, and Q; hence, in this case, either five errors or more or three errors or less have occurred. Hereinafter, in accordance with branch conditions of the solution search, a solution for four errors, three errors, two errors, or one error is obtained, a warning is issued that there are five errors or more, or a signal indicating no errors or the like is given.
Outline of 4EC-BCH-ECC System
As previously mentioned, the remainder when $f(x)x^{66}$ is divided by the code generating polynomial g1(x) is assumed to be r(x), and coefficients of the polynomial $f(x)x^{66}+r(x)$ are written as data bits to a memory core 22. The memory core 22 herein is one that includes a memory cell array and a decoder circuit and sense amplifier circuit, specifically, is a large capacity memory such as a NAND type flash memory, a resistance varying type memory (Resistive RAM: ReRAM), or a phase change memory (Phase Change RAM: PCRAM), and has a configuration where bit errors are inevitable.
Regarding data read, data read from the memory core 22 is treated as coefficients of the data polynomial ν(x) of order 8190 shown in Expression 4.
The syndromes S (=S1), S3, S5, and S7 are obtained based on this read data polynomial ν(x) by a syndrome calculating unit 23. The syndromes S, S3, S5, and S7 are obtained from remainders when ν(x) is divided by m1(x), m3(x), m5(x), and m7(x), respectively.
In the calculation hereafter, the syndromes S, S3, S5, and S7 are expressed as indexes, and an adder circuit and a parity checker are employed. That is, multiplication of elements is calculated as an addition of their indexes, as binary numbers, in the adder circuit. Addition of elements is performed by decoding the indexes to express the elements as polynomials of order 12, obtaining a polynomial of the added elements as a result of the parity check of coefficients of each order, and then decoding this result back into an index.
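The two kinds of arithmetic on index expressions can be sketched as follows. This is a minimal illustration, assuming the toy field GF(2^5) with primitive polynomial $x^5+x^2+1$, chosen because its 31 nonzero elements are, like 8191, a prime number. The names `TO_COEFF`, `TO_INDEX`, `multiply`, and `add` are hypothetical and stand in for the two converting decoders, the adder, and the parity checker.

```python
# Toy field GF(2^5), primitive polynomial x^5 + x^2 + 1 (assumption for
# illustration; the embodiment uses GF(2^13) with 13-bit indexes).
PRIM, DEGREE = 0b100101, 5
N = 2**DEGREE - 1             # number of nonzero elements (31)

TO_COEFF = []                 # decoder: index -> coefficient expression
value = 1
for _ in range(N):
    TO_COEFF.append(value)
    value <<= 1
    if value >> DEGREE:
        value ^= PRIM
# decoder: coefficient expression -> index
TO_INDEX = {coeff: n for n, coeff in enumerate(TO_COEFF)}

def multiply(idx_a: int, idx_b: int) -> int:
    """Product of alpha^a and alpha^b: add the indexes modulo N."""
    return (idx_a + idx_b) % N

def add(idx_a: int, idx_b: int):
    """Sum of alpha^a and alpha^b: XOR the coefficient expressions.

    Returns None when the sum is the zero element, which has no index.
    """
    total = TO_COEFF[idx_a] ^ TO_COEFF[idx_b]
    return TO_INDEX.get(total)

print(multiply(30, 5), add(0, 1), add(3, 3))
```

Multiplication touches only the adder, while each addition passes through both lookup tables; in the full-size field those tables have 8191 entries, which is the decoder scale the time-division configuration is meant to contain.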
Subsequent to the syndromes being obtained, syndrome elements undergo computation by a syndrome element calculating unit 24 (SEC: Syndrome Element Calculation). Here, syndrome elements undergo computation required to create the error search equation shown in Expression 13, the computation result being stored in a register (not shown).
An error position search is performed, on the basis of an amount obtained by the syndrome element calculating unit 24, by an error search unit 25 (ES: Error Search). These calculation processes are controlled by a clock generator 27, utilizing clocks ck0-ck17 frequency divided from an external clock CL.
The syndrome element calculating unit 24 and the error search unit 25 are configured to pass data back and forth and use each other's circuit blocks in a multiplexing manner, whereby circuit scale is reduced. Hence, the syndrome element calculating unit 24 and the error search unit 25 in
The result obtained by the error search unit 25 is utilized by an error correcting unit 26 (EC: Error Correction) in correction of data read from the memory. In the error correcting unit 26, the information data polynomial f(x) inputted to the memory from external is restored and outputted as information data.
That concludes description of the outline of the 4EC-BCH-ECC system of the embodiment. Next, a configuration method of each kind of calculating circuit used in this ECC system is described.
Encoding Circuit
That is, this encoding circuit comprises: a time-division decoder 31 (time sharing decoder: TSC) operative to time-division operate with a 64 clock cycle; a parallel parity check circuit 32 (parallel parity checker: PPC) operative to perform a parity check of the decoding result at each timing; and a serial parity check circuit 33 (serial parity checker: SPC).
Note that the time-division decoder 31, parallel parity check circuit 32, and serial parity check circuit 33 are employed not only in encoding but also in the syndrome calculating circuit as described later. Consequently they are configured to have modes changeable by mode selecting signals /en and /de.
Data transfer with the memory cell array is performed in units of eight bytes, that is, groupings of 64 bits. The fact that data is processed on this 64 bit data grouping basis to generate the check bits makes it possible to reduce scale to about one sixty-fourth (1/64) in comparison with the case where check bits are created by processing in a 512 byte batch.
The check bits can be obtained by a calculation selecting and performing XOR on the data bits d0-d4095 in accordance with an encoding table of each of the check bits b0-b65. The selection in this XOR computation is performed in a time-division manner by the time-division decoder 31. That is, the initial state of the circuit is set in cycle τ0, and 64-bit parallel input XOR computation is repeated for the 64 cycles τ1-τ64 shown in
Since there are 64 data bits in one cycle for each of the check bits bm, the total number of decoders is 4224, the product of this 64 and the number 66 of check bits bm (0≦m≦65).
That is, the time-division decoder 31 includes a decoding-dedicated PMOS transistor P1 operative, on each clock cycle, to receive a data bit di at its source, to receive a clock at its gate, and thereby perform data bit selection. If there is a PMOS transistor P1 which, in cycle τ1, receives data bit di (0≦i=n≦63), the data di being “1”, and which has its gate inputted with a clock signal /τ1 which is “L” in cycle τ1, the path between data node di and output node di(bm) is turned on, whereby di=di(bm), and if there is no such PMOS transistor P1, then di(bm)=“0”.
Here, di is assumed to change synchronously with the cycle clock τi from “L” to “H” in accordance with the data bit. In the encoding, as previously mentioned, a PMOS transistor P2 inputted with the mode selecting signal /en causes a path to be formed between di and di(bm).
Generally, if a data bit di {i=n+64(k−1); 0≦n≦63} is received in a cycle τk (1≦k≦64), the data di being “1”, and the clock signal /τk which is “L” in cycle τk becomes the gate signal of a PMOS transistor P1 existing between data node di and output node di(bm), then di=di(bm), and, if not, then di(bm)=“0”. When di=“0”, an NMOS transistor N causes the output node di(bm) to be “L” (=“0”).
In this way, whether or not to select a data bit in each cycle is determined by the presence/absence of a PMOS transistor P1 to which the clock signal /τk of that cycle is inputted.
In the case of the time-division decoder 31 of
The data bits di(bm) selected by the time-division decoder 31 are next inputted to the parallel parity check circuit 32 in a 64-bit parallel manner corresponding to n=0-63. The parallel parity check circuit 32 uses 66 parity checker ladders PCL existing in the respective check bits b0-b65 to obtain parity, that is, XOR, of 64 bits of data in one lot.
The parity obtained in each cycle is integrated with the parity result up to the previous cycle by the serial parity check circuit 33. That is, the serial parity check circuit 33 includes a latch LT for retaining a parity result, and a gate XOR for performing XOR logic with the retained data.
The latches LT are all reset to “0” in the first clock cycle τ0, and, at each cycle, re-perform XOR between output of the parallel parity check circuit 32 and the retained data. This results in the retained data when all cycles are completed being, respectively, the check bits b0-b65.
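The division of labor between the time-division decoder, the parallel parity check, and the serial parity check can be modeled in software. The sketch below is a behavioral illustration only: the encoding table is random and hypothetical (the real table is derived from g1(x)), and the function names are illustrative. It shows that accumulating a 64-bit parity per cycle over 64 cycles gives the same check bits as a single 4096-bit parity.

```python
import random

WIDTH = 64          # bits transferred per cycle
CYCLES = 64         # cycles tau_1 .. tau_64
NUM_CHECK = 66      # check bits b0 .. b65

def encode_time_division(data_bits, table):
    """table[m] is the set of data-bit positions selected for check bit b_m."""
    latches = [0] * NUM_CHECK                    # reset in cycle tau_0
    for k in range(1, CYCLES + 1):
        base = WIDTH * (k - 1)                   # i = n + 64(k - 1)
        chunk = data_bits[base:base + WIDTH]
        for m in range(NUM_CHECK):
            parity = 0                           # parallel parity of one cycle
            for n, bit in enumerate(chunk):
                if (base + n) in table[m]:       # decoder selects this bit
                    parity ^= bit
            latches[m] ^= parity                 # serial accumulation
    return latches

def encode_batch(data_bits, table):
    """Reference: parity over all selected bits at once."""
    return [sum(data_bits[i] for i in table[m]) % 2 for m in range(NUM_CHECK)]

rng = random.Random(0)
# Hypothetical selection table, for illustration only.
table = [{i for i in range(WIDTH * CYCLES) if rng.random() < 0.5}
         for _ in range(NUM_CHECK)]
data = [rng.randint(0, 1) for _ in range(WIDTH * CYCLES)]

serial_result = encode_time_division(data, table)
batch_result = encode_batch(data, table)
print(serial_result == batch_result)
```

Only one 64-input parity stage per check bit is active at any time, which is the source of the roughly 1/64 reduction in scale relative to batch processing.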
Syndrome Calculating Circuit
Concerning units of processing the coded and stored data, 512 bytes are supplemented with the 66 check bits, and since data from the memory cell array is transferred in 64 bit lots which are parallel-inputted, the syndromes are calculated by processing in 66 clock cycles, as shown in
The time-division decoder 31 is configured such that mode is changeable by the mode selecting signals /en and /de as previously mentioned, and, in the current case, /en=“H” and /de=“L” causes it to be a syndrome calculating circuit. The circuit is set to an initial state in cycle τ0, and performs a time-division operation which repeats over the 66 cycles τ1-τ66.
The syndromes are obtained as remainders of the irreducible polynomials m1 (x), m3 (x), m5 (x), and m7 (x) as previously mentioned, and hence can be calculated by selecting and performing XOR on each bit of the data bits d0-d4095 and check bits b0-b65 in accordance with a table of relationships of coefficient bits of each of the syndromes S1, S3, S5, and S7.
There are 64 data bits in one cycle for each coefficient bit Sαm of the syndromes, hence the number of decoders required is 3328, which is the product of this 64 and the number of coefficient bits Sαm of the syndromes, that is, 4×13=52. As a result, the syndrome calculating circuit can be handled by utilizing a portion of the 4224 decoders that make up the previously described encoding circuit.
Specifically, it is sufficient to use the encoding circuit portion obtaining b0-b51. Operation of the time-division decoder 31 is exactly the same as in the case of the encoding circuit, but the relationship between data bits and output bits differs, and is hence described.
That is, in the syndrome calculating circuit, if a data bit di (0≦i=n≦63) is received in cycle τ1, the data di being “1”, and the clock signal /τ1 which is “L” in cycle τ1 becomes the gate signal of a PMOS transistor P1 existing between data node di and output node di(Sαm), then di=di(Sαm), and, if not, then di(Sαm)=“0”.
Generally, if a data bit di {i=n+64(k−1); 0≦n≦63} is received in a cycle τk (1≦k≦64), the data di being “1”, and the clock signal /τk which is “L” in cycle τk becomes the gate signal of a PMOS transistor existing between data node di and output node di(Sαm), then di=di(Sαm), and, if not, then di(Sαm)=“0”.
Cycles τ65 and τ66 are special, input data bits to the decoder being set such that $d_{i'} = b_{i'-4096}$ for 4096≦i′≦4161, and $d_{i'} = 0$ for i′≧4162.
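The framing of the read data into 66 cycles can be sketched as below. This is an illustrative model; the function name `syndrome_input_cycles` and the test patterns are hypothetical.

```python
def syndrome_input_cycles(data_bits, check_bits, width=64):
    """Split data bits followed by check bits into cycles of `width` bits.

    Positions beyond the last check bit are padded with zeros.
    """
    stream = list(data_bits) + list(check_bits)
    cycles = -(-len(stream) // width)             # ceiling division
    stream += [0] * (cycles * width - len(stream))
    return [stream[c * width:(c + 1) * width] for c in range(cycles)]

data = [1] * 4096                                 # hypothetical d0 .. d4095
check = [1] * 66                                  # hypothetical b0 .. b65
frames = syndrome_input_cycles(data, check)

print(len(frames))            # number of cycles tau_1 .. tau_66
print(sum(frames[64]))        # cycle tau_65 carries b0 .. b63
print(sum(frames[65]))        # cycle tau_66 carries b64, b65 and padding
```

With 4096 data bits and 66 check bits the stream is 4162 bits long, so 65 full cycles are not enough and a 66th cycle, mostly zero padding, is needed.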
In the case of the time-division decoder 31 of
Subsequent to data bit selection, syndrome coefficient bits (S1)0, (S1)1, . . . , (S7)12 are obtained by the parallel parity check circuit 32 and serial parity check circuit 33, similarly to the encoding circuit.
Syndrome Element Calculating Circuit
Next, a calculating circuit employed in the syndrome element calculating unit 24 is described. In the case where 512 bytes of data is adopted as unit data for ECC processing, a Galois field $GF(2^{13})$ is used. This Galois field has a number of elements, excluding the zero element, which is a prime number, hence a calculation method employing an expression index using factorization cannot be used. Accordingly, a method is adopted that uses the indexes as is, and, since the total number of indexes is 8191, a binary expression of 13 bits is used.
When ECC processing is performed using a conventionally proposed calculation method, multiplication of elements is performed by addition of indexes and addition of elements is performed by XOR computation of coefficient expressions, hence it becomes necessary to perform conversion between coefficient expressions and binary expressions of indexes. Scale of the decoding circuit for this conversion between binary expressions of indexes and coefficient expressions becomes large, and a description is provided of a portion related to this conversion circuit.
The coefficient expression is expressed by the 13 bits of coefficient $P^n_j$ of the term of order j in the polynomial $p_n(x)$ of order 12 on GF(2). The index refers to the exponent n in $x^n \equiv p_n(x) \bmod m_1(x)$, and is expressed by the 13 bits of coefficient $B^n_j$ of $2^j$ in the binary expression of n.
In the decoder for coefficient expression→binary expression of index, input is Ii=Pni, and output is Oi=Bni, as shown in Expression 14.
In the decoder for binary expression of index→coefficient expression, input is Ii=Bni, and output is Oi=Pni, as shown in Expression 15.
Correspondence between input and output bits is provided by these converting decoders.
Scale of these decoders increases substantially in proportion to the square of bit number, necessitating simplification of the circuit using partial decoding.
13 Bit Code Converting Decoder: DEC
This code converting decoder DEC includes NAND gates G1-G3 operative to partially decode input data bits I0-I11 of the input data bits I (=I0-I12) four bits at a time, and thereby create signals /A0-/A15, /B0-/B15, and /C0-/C15, and further includes a NOR gate G4 operative to create a NOR signal AiBk from /Ai and /Bk.
The code converting decoder circuit DEC configures a NOR gate having 16×16=256 discharge paths PDC parallel-disposed on an output bit Om basis, the discharge paths PDC being controlled by these decoding signals /A0-/A15, /B0-/B15, and C0-C15 (inverted /C0-/C15). There is an output latch LT2 operative to latch data of a NOR node ND common to each of the blocks. One clock τ0 is used in resetting of this latch circuit LT2.
Each of the 256 discharge paths PDC is gated by the signal AiBk (i=0-15; k=0-15), and, moreover, branches into input bits I12 and /I12. Disposed below this branch is an NMOS transistor OR connection configured having Cmi (mi=0-15) as gate inputs in accordance with correspondence of the conversion. These branches are connected to Vss by the second clock τ1.
Disposition of the NMOS transistors having Cmi gates is performed by first carrying out an AiBk sort of the input bits corresponding to the output bits Om in the conversion, then sorting by the “1” and “0” of I12 to select those included from within C0-C15.
The above-mentioned code converting decoder DEC is applied also as a solving method decoder for obtaining roots of quadratic equations or cubic equations in a Galois field. In this case, it is required only to previously select elements configuring roots of the equations and match the Cmi gate transistor selection to these elements.
Here, the relationship between constant terms and variable terms in the equations to be solved during ECC processing and the relationship between input and output of the decoder is described.
First, a solving method decoder SLVu for the quadratic equation $u^2+u=g$ is described. Input is the constant term g, and obtained are the index un of this constant term g or each bit of a polynomial pun(x). Conversion between elements corresponds to selecting pn(x) such that the quadratic equation $\{p_n(x)\}^2 + p_n(x) = p_{un}(x)$ is satisfied.
This correspondence is obtained by calculating pn(x) for all elements g beforehand, and there are two elements pn1(x) and pn2(x) for one g. Output of the decoder SLVu is the index or coefficient expression of these elements, such that first root Oi=Bn1i or Pn1i, and second root Oi=Bn2i or Pn2i.
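The precomputed correspondence behind such a solving method decoder can be sketched as a lookup table. The example below is an illustration under stated assumptions: it uses the toy field GF(2^5) with primitive polynomial $x^5+x^2+1$ rather than GF(2^13), works in the coefficient expression, and the names `gf_mul` and `build_slvu_table` are hypothetical.

```python
def gf_mul(a: int, b: int, prim: int, degree: int) -> int:
    """Multiply two field elements in coefficient form (shift and add)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> degree:
            a ^= prim                 # reduce modulo the field polynomial
    return result

def build_slvu_table(prim: int, degree: int):
    """Map each constant term g to the list of u with u^2 + u = g."""
    table = {}
    for u in range(2**degree):
        g = gf_mul(u, u, prim, degree) ^ u
        table.setdefault(g, []).append(u)
    return table

# Toy field GF(2^5) (assumption for illustration only).
PRIM, DEGREE = 0b100101, 5
SLVU = build_slvu_table(PRIM, DEGREE)

g = gf_mul(0b00110, 0b00110, PRIM, DEGREE) ^ 0b00110   # a solvable g
print(len(SLVU), SLVU[g])
```

Each g that appears in the table has exactly two roots, which differ by the element 1 because $(u+1)^2+(u+1)=u^2+u$ in characteristic 2. Constant terms absent from the table have no root in the field.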
Cubic equation solving method decoders include a solving method decoder SLVw for solving $w^3+w=h$, and a solving method decoder SLVcube for solving $w^3=c$. Input is the constant term h or c, and obtained are the index wn of this constant term h or c or each bit of a polynomial pwn(x). Conversion between elements corresponds to selecting pn(x) such that the cubic equation $\{p_n(x)\}^3 + p_n(x) = p_{wn}(x)$ or $\{p_n(x)\}^3 = p_{wn}(x)$ is satisfied.
This correspondence is obtained by calculating pn(x) for all elements h or c beforehand, and there is one element or three elements pn1(x), pn2(x), and pn3(x) as a solution. Used in the calculation process is one of the roots, and output of the solving method decoders SLVw and SLVcube is the index or coefficient expression of this element, such that Oi=Bn1i or Pn1i. The second root (Oi=Bn2i or Pn2i) and third root (Oi=Bn3i or Pn3i) are not used.
The error search (ES) unit 25 is provided with the above three equation solving method decoders SLVu, SLVw, and SLVcube, which are time-division operated by the control clock.
Element Adding Circuit (13-Concurrent Two-Input Parity Checker): PC
A calculating circuit for addition of finite field elements frequently used in an ECC processing process is described with reference to
Furthermore, if the next computation on the obtained element is a multiplication, a binary expression of index is required. Accordingly, a code converting decoder DECo operative to perform conversion of coefficient expression→binary expression of index is disposed on an output side. Hereinafter, the 13-concurrent two-input parity checker including the converting decoders of its input/output units is referred to simply as a parity checker PC.
As previously mentioned, two clock cycles are required in the code converting decoders DEC, hence a maximum of four cycles are required in this parity checker PC for addition of elements from index to index. Circuit scale becomes large since three code converting decoders DEC are employed.
It is also possible to adopt one code converting decoder on the input side and use it in a time-division manner for the two inputs, and, although, when doing so a further two cycles are required such that six cycles become necessary for addition of elements from index to index, this adoption of one code converting decoder on the input side is advantageous since only two code converting decoders DEC are required in the parity checker PC, whereby circuit scale is reduced. However, in the following description, the case is assumed of a parity checker PC employing three code converting decoders.
Such an adding circuit (parity checker) is provided as a shared circuit of the SEC unit and ES unit and performs addition of elements by time-division drive in the SEC unit, ES unit and between these units.
Element Multiplying Circuit (13-Bit Adder: AD)
This adder includes: a 13-bit first stage adding circuit 131; a carry correcting circuit 132 operative to detect if the sum is 8191 or more and perform a carry; and a second stage adding circuit 133 operative, along with the carry correcting circuit, to add 1, which is the complement of 8191, in the case that the sum is 8191 or more.
The carry correcting circuit 132 judges whether or not a result S0′-S12′ of the first stage adding circuit 131 is 8191 or more. If all bits S0′-S12′ are “1” or a carry C12′ occurs, then the sum is 8191 or more and a signal PF0 is outputted, and 1, the complement of 8191 (that is, 8192−8191), is added to S0′-S12′ by the second stage adding circuit 133. As a result, a sum Sm of Am and Bm with modulus 8191 is obtained.
This adder does not require synchronization of a clock or the like and is configured such that if the input is determined the output is also determined, thereby reducing the load of timing control in the system.
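The behavior of the 13-bit adder may be sketched as below. The function name add_mod_8191 and the variable names are assumptions for illustration; only the first stage sum, the detection of "8191 or more", and the addition of 1 in the second stage follow the description above.

```python
MASK = (1 << 13) - 1   # 8191, i.e. all of S0'-S12' equal to "1"


def add_mod_8191(am, bm):
    """Add two 13-bit indices (each 0..8190) with modulus 8191."""
    first = am + bm                       # first stage adding circuit
    carry = first >> 13                   # carry C12'
    s_prime = first & MASK                # S0'-S12'
    pf0 = carry == 1 or s_prime == MASK   # sum is 8191 or more
    if pf0:
        s_prime = (s_prime + 1) & MASK    # second stage adds the complement 1
    return s_prime
```

Because multiplication of finite field elements corresponds to addition of their indices with modulus 8191, this single combinational sum is all that an element multiplication requires.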
A plurality of such multiplying circuits (adders AD) are provided as shared circuits of the SEC unit and ES unit and perform multiplication of finite field elements by time-division drive in the SEC unit, ES unit and between these units.
ECC Processing
Up to now, various kinds of calculating circuits used in computation have been described. Next, a description is provided of ECC processing using these calculating circuits. Various quantities including coefficients of the error search equation are used in calculation branching and calculation processes, these quantities all being calculated from syndromes. The syndromes S (=S1), S3, S5, and S7 are quantities able to be directly calculated from data stored in the memory cell array, and other quantities are obtained by performing calculations of multiplication, exponentiation, and addition based on these syndromes.
In next stage 1, ζ=S^3+S3, η=S^5+S5, θ=S^7+S7 are calculated, where S^3, S^5, and S^7 denote powers of S and S3, S5, and S7 denote the syndromes. In next stage 2, ζ^3 is calculated from this calculation result, then in next stage 3, ζ, η, θ, and S are used to calculate various products and quotients of these various exponents.
First, quantities required when calculating sums are the following four, namely S^3ζ, Sη, S^4η, and S^2θ. Two quantities are calculated from a product of these four quantities in next stage 4, and Γ and ΓT, along with the four quantities S^3η, Sζ, Sθ, and ζη required in next stage 5 are calculated.
In next stage 5, ΓD, which is the sum of the quantities in the previous stage, and the three quantities required in the sum of the next stage are calculated. Next stage 6 calculates D and T using ΓQ, which is the sum of the obtained quantities, and a product and quotient.
Furthermore, in next stage 7, the two quantities Q and ST required in the next stage are calculated by product and quotient computation, in stage 8, b, ζT, and S^2Q are calculated, and in final stage 9, c is calculated. In order to reduce the number of additions in calculation of c, the relationship ζ=SD+T is used for transformation, giving c=S^2Q+ζT. There are a total of ten calculating stages counting from the syndrome stage.
Number of errors and method of error search branch according to handling of the zero member in the various elements obtained from the syndromes. Those conditions of branching are summarized in
Syndromes and coefficients of the error search equation in 4EC form the base, and it is defined from search conditions of each error number that searching in 4EC becomes searching in 3EC or less in case of Γ=0 or cQ=0; searching in 3EC becomes searching in 2EC or less in case of ζT3=Γ=0; and searching in 2EC becomes searching in 1EC or less in case of ζ=0. In consideration of this, the conditions of exclusive branching are obtained in order of increasing number of errors.
Condition of Branching to 1EC or Less
From ζ=0 under the condition of Γ=0 and ζ=0, relationships between syndromes and coefficients in 4EC become as in the following Expression 16.
SD+T=ζ → SD+T=0
(ζ+S^3)D+S^2T+SQ=η → SQ=η
(η+S^5)D+S^4T+(ζ+S^3)Q=θ → ηD+S^3Q=η(S^2+D)=θ
Γ=S^3ζ+Sη+ζ^2 → Γ=Sη
c=S^2Q+SDT+T^2 → c=S^2Q=Γ [Expression 16]
From cQ=0, SQ=0 is obtained, and Γ=0 is satisfied. Further, if Γ=0, and c=0, then cQ=0 is satisfied.
Therefore, only the case of Γ=0 need be considered. In this case, η=0 or S=0, and, in case of S=0, η=0 at ζ=0, hence θ=0 is always satisfied. Conversely, from ζ=η=0 (θ=0), Γ=0 and cQ=0 are satisfied. This is the condition of 1EC or less in the 4EC calculation.
The exclusive condition of branching to 1EC or less is defined by: ζ=η=0 (θ=0). In this case, if S≠0, then 1 error, X1=S; and if S=0, then “no error”.
Corresponding to the error search of 1EC or less resulting in “no solution” is the case where contradiction is generated under the branching condition. That is, if η=0 and θ≠0, and if S=0 and η≠0, a contradiction is considered to be generated based on the relationship between the syndromes and the coefficients. In this case, there are judged to be five errors or more.
Condition of Branching to 2EC
In case of Γ=0, 4EC becomes 3EC or less. The requirement to branch to 2EC or less is that ζT3=Γ=0. However, if ζ=0, it is 1EC or less, hence the condition of branching to 2EC is ζ≠0 and T3=0. Therefore, the relationships between the syndromes and the coefficients in 2EC are the simultaneous equations in Expression 11 with Q=0 and the ζ and η equations having coefficients of 3EC and may be rewritten assigning a subscript 3 to those coefficients as in the following Expression 17.
SD3+T3=ζ → SD3=ζ
(ζ+S^3)D3+S^2T3=η → ζ(S^2+D3)=η
Γ=S^3ζ+Sη+ζ^2 → Γ=0 [Expression 17]
Since Γ=0, i.e. the condition of 2EC in 4EC, is always satisfied, there is no need to consider cQ=0. Therefore, the condition of branching to 2EC is: Γ=0 and ζ≠0.
Condition of Branching to 3EC
The condition of 3EC or less is Γ=0 or cQ=0, and Γ=0 is the condition of 2EC or less. Therefore, the condition of branching to 3EC is: c=0 or Q=0 in case of Γ≠0.
Condition of Branching to 4EC
The condition of 4EC is: Γ≠0 and cQ≠0. In other words, the condition of branching to 4EC is: Γ≠0 and c≠0 and Q≠0.
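The exclusive branching conditions described above may be collected into a single decision routine, sketched below. The function name classify_errors, its argument order, and the return value "5+" for the contradictory cases are assumptions for illustration; the routine only tests the already calculated quantities against the zero member and does not itself perform any finite field computation.

```python
def classify_errors(S, zeta, eta, theta, Gamma, c, Q):
    """Select the search branch from zero tests on the calculated quantities.

    Returns 0 (no error), 1, 2, 3, or 4 for the branch to be searched,
    or "5+" when the quantities contradict each other.
    """
    if Gamma != 0:
        # 4EC requires c != 0 and Q != 0; otherwise branch to 3EC.
        return 4 if (c != 0 and Q != 0) else 3
    if zeta != 0:
        # Gamma = 0 and zeta != 0: branch to 2EC.
        return 2
    # Gamma = 0 and zeta = 0: 1EC or less requires eta = 0 (and theta = 0).
    if eta != 0 or theta != 0:
        return "5+"
    return 1 if S != 0 else 0
```

A branch selected here may still end in "no solution" during the subsequent search, which this sketch does not cover.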
Next described is a method in which the number of sum computations between elements, which increase circuit scale, is reduced as far as possible; this is an improvement directed to reducing circuit scale at the expense of processing time. First, conditions for satisfying the search polynomials are described. In each branch of the solving method, an error search equation is obtained whose coefficients are elements obtained from the syndromes and whose degree accords with the number of errors, and the solving method must be applied after judging whether or not the roots of that equation all differ from the zero member and from each other.
Accordingly, conditions are described for a method where the solution search is performed while making a judgment for multiple roots from the coefficients of the obtained error search equation. Conditions that error search equations are satisfied having different roots are summarized for each error number as follows.
Condition for Satisfying Fourth-Degree Error Search Equation x^4+Sx^3+Dx^2+Tx+Q=0
If the zero member is the solution, Q=0 and it becomes third-degree. Moreover, in the case of a multiple root, i.e. x^4+Sx^3+Dx^2+Tx+Q=(x^2+α)(x^2+βx+γ), then (c=):S^2Q+SDT+T^2=0, hence the condition for satisfying the fourth-degree search equation having different roots is: Q≠0 and c≠0.
Condition for Satisfying Cubic Error Search Equation x^3+Sx^2+Dx+T=0
If the zero member is the solution, T=0 and it becomes quadratic. Moreover, in the case of a multiple root, i.e. x^3+Sx^2+Dx+T=(x+α)(x^2+β), then (ζ=):SD+T=0, hence the condition for satisfying the cubic search equation having different roots is: T≠0 and ζ≠0.
Condition for Satisfying Quadratic Error Search Equation x^2+Sx+D=0
If the zero member is the solution, D=0 and it becomes linear. Moreover, in the case of a multiple root, i.e. x^2+Sx+D=x^2+α, then S=0, hence the condition for satisfying the quadratic search equation having different roots is: D≠0 and S≠0.
Condition for Satisfying Linear Error Search Equation x+S=0
If the zero member is the solution, S=0 and it becomes no error. Hence the condition for satisfying this linear search equation is: S≠0.
The solving method under these conditions for each equation when configured to reduce sum computation is described below.
Solving Method for Fourth-Degree Error Search Equation (4EC): x^4+Sx^3+Dx^2+Tx+Q=0
The case of 4EC, i.e. the solving method for solving the fourth-degree error search equation, is described by dividing into the four cases of relationships of the quantities determined from the syndromes.
(1) Case 1 (case of S≠0 and b≠0)
Here, it is assumed that b=D^2+ST and c=S^2Q+SDT+T^2. The error search equation is factorized into a product of quadratic expressions as in Expression 18.
x^4+Sx^3+Dx^2+Tx+Q=(x^2+α1x+α0)(x^2+β1x+β0)=0 [Expression 18]
From the relationship of quantities derived from the coefficients α0, α1, β0, and β1 of the factorized quadratic expressions and the syndromes, an unknown quantity δ=α0+β0 is introduced to obtain a cubic equation of Expression 19 satisfied by δ.
δ^3+Dδ^2+STδ+S^2Q+T^2=0 [Expression 19]
Here a further conversion δ=ψ+D is performed to obtain a cubic equation related to ψ as in Expression 20.
(ψ/b^(1/2))^3+(ψ/b^(1/2))+c/b^(3/2)=0 [Expression 20]
When deriving this equation, it is required that b≠0.
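The conversion δ=ψ+D can be checked numerically: substituting it into the cubic equation of Expression 19 should give ψ^3+bψ+c, which becomes Expression 20 on division by b^(3/2). The sketch below performs that check exhaustively in the small field GF(2^3) with the polynomial x^3+x+1; the field, the function names, and the exhaustive loop are assumptions for illustration, the identity itself being independent of the field size.

```python
import itertools

POLY, BITS = 0b1011, 3   # assumed small field GF(2^3), x^3 + x + 1


def gf_mul(a, b):
    """Multiply two field elements given in coefficient expression."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> BITS:
            a ^= POLY
    return r


def cubic_in_delta(S, D, T, Q, delta):
    """Left side of Expression 19 evaluated at delta."""
    d2 = gf_mul(delta, delta)
    return (gf_mul(d2, delta) ^ gf_mul(D, d2) ^ gf_mul(gf_mul(S, T), delta)
            ^ gf_mul(gf_mul(S, S), Q) ^ gf_mul(T, T))


def cubic_in_psi(S, D, T, Q, psi):
    """psi^3 + b*psi + c with b = D^2 + ST and c = S^2 Q + SDT + T^2."""
    b = gf_mul(D, D) ^ gf_mul(S, T)
    c = gf_mul(gf_mul(S, S), Q) ^ gf_mul(gf_mul(S, D), T) ^ gf_mul(T, T)
    return gf_mul(gf_mul(psi, psi), psi) ^ gf_mul(b, psi) ^ c


def conversion_holds():
    """True if both cubics agree for every S, D, T, Q and psi in the field."""
    size = 1 << BITS
    return all(
        cubic_in_delta(S, D, T, Q, psi ^ D) == cubic_in_psi(S, D, T, Q, psi)
        for S, D, T, Q, psi in itertools.product(range(size), repeat=5)
    )
```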
This equation is solved to select one root ψ. δ is calculated to obtain the quadratic equations of Expression 21 having an unknown ε to be satisfied by the factorizing coefficients.
If ψ+D≠0, then (ε/δ)^2+(ε/δ)+Q/δ^2=0 (δ≠0)
If ψ+D=0, then ε^2+Q=0 (δ=0) [Expression 21]
These are solved to obtain the coefficients α0 and β0. If δ is zero, then α0=β0 is obtained.
Moreover, obtained from ψ is one more quadratic equation, that of Expression 22, having an unknown ε to be satisfied by the factorizing coefficients.
(ε/S)^2+(ε/S)+ψ/S^2=0 [Expression 22]
This is solved to obtain α1 and β1.
The factorized quadratic equations shown in Expression 23 having coefficients of these solutions α0, β0, α1, and β1 are solved, setting X=x, to obtain the four solutions of the fourth-degree error search equation.
(x/α1)^2+(x/α1)+α0/α1^2=0
(x/β1)^2+(x/β1)+β0/β1^2=0 [Expression 23]
Note that, when solving each of the equations, the unknown coefficients are converted to be elements of GF (2) so that a table of solutions can be used.
In the cubic equation for obtaining ψ, the unknown variable is defined as ψ/b^(1/2); in the quadratic equation for obtaining α0 and β0, the unknown variable is defined as ε/δ; in the quadratic equation for obtaining α1 and β1, the unknown variable is defined as ε/S; and in the factorized quadratic equations, the unknown variables are set to x/α1 and x/β1.
(2) Case 2 (case of S≠0 and b=0)
Here, b=D^2+ST=0 and c=S^2Q+SDT+T^2. (if c=0, multiple root, go to 3EC.) This fourth-degree error search equation is factorized into a product of quadratic expressions as in Expression 24.
x^4+Sx^3+Dx^2+Tx+Q=(x^2+α1x+α0)(x^2+β1x+β0)=0 [Expression 24]
From the relationship of quantities derived from the coefficients α0, α1, β0, and β1 of the factorized quadratic expressions and the syndromes, an unknown quantity δ=α0+β0 is introduced to obtain a cubic equation of Expression 25 satisfied by δ.
δ^3+Dδ^2+STδ+S^2Q+T^2=0 [Expression 25]
Here a further conversion δ=ψ+D is performed to obtain a cubic equation related to ψ as in Expression 26.
ψ^3+c=0 (c≠0) [Expression 26]
This is solved to select one root.
δ is calculated to obtain the quadratic equations of Expression 27 having an unknown ε to be satisfied by the factorizing coefficients.
If ψ+D≠0, then (ε/δ)^2+(ε/δ)+Q/δ^2=0 (δ≠0)
If ψ+D=0, then ε^2+Q=0 (δ=0) [Expression 27]
These are solved to obtain the coefficients α0 and β0. If δ is zero, then α0=β0 is obtained.
Moreover, obtained from ψ is one more quadratic equation, that of Expression 28, having an unknown ε to be satisfied by the factorizing coefficients.
(ε/S)^2+(ε/S)+ψ/S^2=0 [Expression 28]
This is solved to obtain α1 and β1.
The factorized quadratic equations shown in Expression 29 having coefficients of these solutions α0, β0, α1, and β1 are solved.
(x/α1)^2+(x/α1)+α0/α1^2=0
(x/β1)^2+(x/β1)+β0/β1^2=0 [Expression 29]
If this unknown quantity x is obtained, setting X=x, the four solutions of the fourth-degree error search equation are obtained.
Note that, when solving each of the equations, the unknown coefficients are converted to be elements of GF (2) so that a table of solutions can be used. The cubic equation for obtaining ψ is solved directly as ψ=c^(1/3), since b=0; in the quadratic equation for obtaining α0 and β0, the unknown variable is defined as ε/δ; in the quadratic equation for obtaining α1 and β1, the unknown variable is defined as ε/S; and in the factorized quadratic equations, the unknown variables are set to x/α1 and x/β1.
(3) Case 3 (case of S=0 and b≠0)
Here, b=D^2+ST=D^2 and c=S^2Q+SDT+T^2=T^2. (if c=0, multiple root, go to 3EC.) The fourth-degree error search equation is factorized into a product of quadratic expressions as in Expression 30.
x^4+Dx^2+Tx+Q=(x^2+α1x+α0)(x^2+β1x+β0)=0 [Expression 30]
From the relationship of quantities derived from the coefficients α0, α1, β0, and β1 of the factorized quadratic expressions and the syndromes, an unknown quantity δ=α0+β0 is introduced to obtain a cubic equation of Expression 31 satisfied by δ.
δ^3+Dδ^2+T^2=0 [Expression 31]
Here a further conversion δ=ψ+D is performed to obtain a cubic equation of Expression 32 related to ψ, which is solved to select one root ψ.
(ψ/b^(1/2))^3+(ψ/b^(1/2))+c/b^(3/2)=0 (b≠0) [Expression 32]
Since δ never becomes zero, the quadratic equation of Expression 33 is obtained which has an unknown ε to be satisfied by the factorizing coefficients.
(ε/δ)^2+(ε/δ)+Q/δ^2=0 (δ≠0, T^2≠0) [Expression 33]
This is solved to obtain the coefficients α0 and β0.
Moreover, obtained from ψ is one more quadratic equation, that of Expression 34, having an unknown ε to be satisfied by the factorizing coefficients, the quadratic equation being solved to obtain α1=β1.
ε^2+ψ=0 (S=0) [Expression 34]
The factorized quadratic equations of Expression 35 having coefficients of the above α0, β0, α1, and β1 are solved.
(x/α1)^2+(x/α1)+α0/α1^2=0
(x/β1)^2+(x/β1)+β0/β1^2=0 [Expression 35]
If this unknown quantity x is obtained, the four solutions of the fourth-degree error search equation are obtained.
Note that, when solving each of the equations, the unknown coefficients are converted to be elements of GF (2) so that a table of solutions can be used. In the quadratic equation for obtaining α0 and β0, the unknown variable is defined as ε/δ; and in the factorized quadratic equations, the unknown variables are set to x/α1 and x/β1.
(4) Case 4 (case of S=0 and b=0)
Here, b=D^2+ST=0 and c=S^2Q+SDT+T^2=T^2. (if c=0, multiple root, go to 3EC.) In this case, the fourth-degree error search equation is factorized into a product of quadratic expressions as in Expression 36.
x^4+Tx+Q=(x^2+α1x+α0)(x^2+β1x+β0)=0 [Expression 36]
An unknown quantity δ=α0+β0 is introduced to obtain a cubic equation of Expression 37 satisfied by δ.
δ^3+T^2=0 (δ≠0) [Expression 37]
This equation is solved to select one root δ. δ is not zero. Further, the quadratic equations of Expression 38 are obtained which have an unknown ε to be satisfied by the factorizing coefficients.
(ε/δ)^2+(ε/δ)+Q/δ^2=0 (δ≠0)
ε^2+δ=0 (S=0) [Expression 38]
These are solved to obtain the coefficients α0 and β0, and α1=β1.
The factorized quadratic equations of Expression 39 having coefficients of these α0, β0, α1, and β1 are solved.
(x/α1)^2+(x/α1)+α0/α1^2=0
(x/β1)^2+(x/β1)+β0/β1^2=0 [Expression 39]
If this unknown quantity x is obtained, setting X=x, the four solutions of the fourth-degree error search equation are obtained.
Note that, when solving each of the equations, the unknown coefficients are converted to be elements of GF (2) so that a table of solutions can be used.
In the quadratic equation for obtaining α0 and β0, the unknown variable is defined as ε/δ; and in the factorized quadratic equations, the unknown variables are set to x/α1 and x/β1.
The processes in the fourth-degree error search 4EC described so far may be summarized as follows.
(a) In the case of S≠0 and b≠0, one root w is selected from the cubic equation w^3+w=c/b^(3/2) by decoding, and ψ=b^(1/2)w and δ=ψ+D are defined. Next, with δ≠0, roots of the quadratic equation u^2+u=Q/δ^2 are decoded and obtained, and defined as u1 and u2, and α0=δu1 and β0=δu2 are defined. If δ=0, then α0=β0=u=Q^(1/2).
In addition, roots of v^2+v=ψ/S^2 are decoded and obtained, and defined as v1 and v2, and α1=Sv1 and β1=Sv2 are defined. Further, roots of the quadratic equations y^2+y=α0/α1^2 and z^2+z=β0/β1^2 are decoded and the obtained results defined as y1 and y2, and z1 and z2, and finite field elements indicating four error positions are obtained as follows: X1=α1y1, X2=α1y2, X3=β1z1, and X4=β1z2.
(b) In the case of S≠0 and b=0, ψ=c^(1/3) and δ=ψ+D are defined. Next, with δ≠0, roots of the quadratic equation u^2+u=Q/δ^2 are decoded and obtained, and defined as u1 and u2, and α0=δu1 and β0=δu2 are defined. If δ=0, then α0=β0=u=Q^(1/2).
In addition, roots of v^2+v=ψ/S^2 are decoded and obtained, and defined as v1 and v2, and α1=Sv1 and β1=Sv2 are defined. Further, roots of the quadratic equations y^2+y=α0/α1^2 and z^2+z=β0/β1^2 are decoded and the obtained results defined as y1 and y2, and z1 and z2, and finite field elements indicating four error positions are obtained as follows: X1=α1y1, X2=α1y2, X3=β1z1, and X4=β1z2.
(c) In the case of S=0 and b≠0, one root w is selected from the cubic equation w^3+w=c/b^(3/2) by decoding, and ψ=b^(1/2)w and δ=ψ+D are defined. Next, roots of the quadratic equation u^2+u=Q/δ^2 are decoded and obtained, and defined as u1 and u2, and α0=δu1, β0=δu2, and α1=β1=ψ^(1/2) are defined.
In addition, roots of the quadratic equations y^2+y=α0/α1^2 and z^2+z=β0/β1^2 are decoded and the obtained results defined as y1 and y2, and z1 and z2, and finite field elements indicating four error positions are obtained as follows: X1=α1y1, X2=α1y2, X3=β1z1, and X4=β1z2.
(d) In the case of S=0 and b=0, then δ=c^(1/3). Roots of the quadratic equation u^2+u=Q/δ^2 are decoded and obtained, and defined as u1 and u2, and α0=δu1, β0=δu2, and α1=β1=δ^(1/2) are defined. Further, roots of the quadratic equations y^2+y=α0/α1^2 and z^2+z=β0/β1^2 are decoded and the obtained results defined as y1 and y2, and z1 and z2, and finite field elements indicating four error positions are obtained as follows: X1=α1y1, X2=α1y2, X3=β1z1, and X4=β1z2.
Solving Method for Cubic Error Search Equation (3EC): x^3+Sx^2+Dx+T=0
Next, the solving method for solving the cubic error search equation in the case of 3EC is described. Since T≠0 and ζ≠0, then if Q=0, c≠0 is satisfied. This results in b=D^2+ST and c=S^2Q+ζT=S^2Q+SDT+T^2.
If this cubic equation is expressed as a product of the linear expression x+α and the quadratic expression x^2+β1x+β0, then coefficients of these expressions play the role of intermediate variables, whereby the following relationships can be obtained between the original coefficients.
That is, if δ=β0 is defined, then, since αβ0=T, αβ1+β0=D, and α+β1=S, the δ cubic equation shown in Expression 40 is obtained.
δ^3+Dδ^2+STδ+T^2=0 (since T≠0, δ≠0) [Expression 40]
Here a further conversion δ=ψ+D is performed to obtain a cubic equation of Expression 41 related to ψ.
(ψ/b^(1/2))^3+(ψ/b^(1/2))+ζT/b^(3/2)=0 [Expression 41]
When deriving this equation, it is required that b≠0.
This equation is solved to select one root. If b=0, then ψ=(ζT)^(1/3). Setting β0=ψ+D from this ψ, β1=β0ψ/T and α=T/β0 are obtained. Subsequently, x=α and the factorized quadratic equation (x/β1)^2+(x/β1)+β0/β1^2=0 having coefficients β0 and β1 are solved, setting X=x to obtain the three solutions.
Note that, when solving each of the equations, the unknown coefficients are converted to be elements of GF (2) so that a table of solutions can be used. In the cubic equation for obtaining ψ, the unknown variable is defined as ψ/b^(1/2); and in the factorized quadratic equation, the unknown variable is set to x/β1.
The above solving method for the cubic error search equation is restated as follows to allow mapping onto an actual circuit system.
If b≠0, the cubic equation to be solved becomes w^3+w=ζT/b^(3/2). One root w is selected, and ψ=b^(1/2)w is defined. If b=0, then ψ=(ζT)^(1/3) is defined. β0=ψ+D, β1=β0ψ/T, and α=T/β0 are set, and the quadratic error search equation becomes z^2+z=β0/β1^2. The two roots z1 and z2 of this equation are decoded and obtained, thereby obtaining the three solutions for the error positions as follows: X1=α, X2=β1z1, and X3=β1z2.
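The 3EC procedure restated above may be traced in software as in the following sketch. It is not the circuit of the embodiment: the small field GF(2^5) with the polynomial x^5+x^2+1, the dictionaries CUBIC and QUAD standing in for the solution tables, and every function name are assumptions chosen so that the example stays short.

```python
POLY, BITS = 0b100101, 5        # assumed stand-in field GF(2^5)
SIZE = 1 << BITS


def mul(a, b):
    """Multiply two field elements given in coefficient expression."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> BITS:
            a ^= POLY
    return r


def power(a, n):
    r = 1
    for _ in range(n):
        r = mul(r, a)
    return r


def inv(a):
    return power(a, SIZE - 2)    # valid for a != 0


CUBE_ROOT_EXP = pow(3, -1, SIZE - 1)   # exponent e with 3e = 1 mod 31

# Solution tables: right-hand side -> list of roots.
CUBIC, QUAD = {}, {}
for e in range(SIZE):
    CUBIC.setdefault(mul(mul(e, e), e) ^ e, []).append(e)   # w^3 + w = H
    QUAD.setdefault(mul(e, e) ^ e, []).append(e)            # z^2 + z = J


def solve_3ec(S, D, T):
    """Return (X1, X2, X3) for x^3+Sx^2+Dx+T=0, or None if no solution."""
    zeta = mul(S, D) ^ T
    if T == 0 or zeta == 0:
        return None
    b = mul(D, D) ^ mul(S, T)
    c = mul(zeta, T)
    if b != 0:
        root_b = power(b, SIZE // 2)                  # b^(1/2)
        ws = CUBIC.get(mul(c, inv(mul(b, root_b))))   # w^3+w = c/b^(3/2)
        if not ws:
            return None
        psi = mul(root_b, ws[0])                      # select one root
    else:
        psi = power(c, CUBE_ROOT_EXP)                 # psi = (zeta T)^(1/3)
    beta0 = psi ^ D
    if beta0 == 0 or psi == 0:
        return None
    beta1 = mul(mul(beta0, psi), inv(T))
    alpha = mul(T, inv(beta0))
    zs = QUAD.get(mul(beta0, inv(mul(beta1, beta1))))  # z^2+z = beta0/beta1^2
    if not zs:
        return None
    return alpha, mul(beta1, zs[0]), mul(beta1, zs[1])
```

For example, building S, D, and T from three distinct nonzero elements X1, X2, X3 as S=X1+X2+X3, D=X1X2+X1X3+X2X3, and T=X1X2X3 and passing them to solve_3ec is expected to return the same three elements in some order.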
Solving Method for Quadratic Error Search Equation (2EC): x^2+Sx+D=0
Next, the case of 2EC, i.e. the method for solving the quadratic error search equation is described. In this case, D≠0 and S≠0. Hence, the quadratic equation is transformed as in the next Expression 42.
(x/S)^2+(x/S)+D/S^2=0 [Expression 42]
This is decoded for the unknown x/S and, multiplying each decoded root by S, the two solutions are obtained.
Formally, the two roots of z^2+z=D/S^2 are decoded and obtained, these being defined as z1 and z2, and solutions obtained as follows: X1=Sz1 and X2=Sz2.
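The 2EC search by table lookup may be sketched as below. As a simplification the sketch uses the small field GF(2^5) with the polynomial x^5+x^2+1 instead of the 13-bit field; this choice, the dictionary QUAD standing in for the solution table, and the function names are all assumptions for illustration.

```python
POLY, BITS = 0b100101, 5        # assumed stand-in field GF(2^5)
SIZE = 1 << BITS


def mul(a, b):
    """Multiply two field elements given in coefficient expression."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> BITS:
            a ^= POLY
    return r


def inv(a):
    """Inverse of a nonzero element, computed as a^(SIZE-2)."""
    r = 1
    for _ in range(SIZE - 2):
        r = mul(r, a)
    return r


# Solution table: J -> the two roots z of z^2 + z = J (absent if no root).
QUAD = {}
for z in range(SIZE):
    QUAD.setdefault(mul(z, z) ^ z, []).append(z)


def solve_2ec(S, D):
    """Return (X1, X2) for x^2+Sx+D=0 with D != 0 and S != 0, else None."""
    if S == 0 or D == 0:
        return None
    s_inv = inv(S)
    J = mul(D, mul(s_inv, s_inv))     # J = D / S^2
    roots = QUAD.get(J)
    if roots is None:
        return None                   # "no solution"
    z1, z2 = roots
    return mul(S, z1), mul(S, z2)     # X1 = S z1, X2 = S z2
```

Since z and z+1 give the same value of z^2+z, every entry of the table holds exactly two roots, which is why a single decode yields both error positions.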
Next, a specific configuration of circuits of each unit of the ECC system in
Configuration of SEC Unit
For example, taking the case of adder AD1, this is utilized by clocks ck2, ck3, ck4, ck6, and ck7 but these show the same circuit being repeatedly used at each of the clock timings. Regarding the parity checker also, the six positions of the clocks ck1, ck4, ck5, ck6, ck8, and ck9 aligned in the horizontal direction, and, moreover, the three positions aligned in the vertical direction at the timing of clock ck1, for example, show the same parity checker PC being repeatedly used.
Although not shown in the drawing, syndromes S, S3, S5, and S7 are assumed to be calculated and stored in a register at a clock prior to clock ck0.
Numbers of the clocks ck0-ck9 correspond to the numbers of stages 0-9 indicating the procedure for finite field element calculation described previously at
Elements used only along the way in calculations in the parity checker PC are expressed by p1-p6. The basic processing is completed in 53 steps, including internal processing of the parity checker PC.
As previously mentioned, the adder AD processes in two steps, and the parity checker includes the code converting decoder DEC for converting to a coefficient expression and thereby processes in four steps. In reality, there are also cases where internal data only is used along the way, hence there are also computations where clock ck1 is three PC cycles and becomes 12 steps, clock ck4 is four PC cycles yet becomes 12 steps, and clocks ck5 and ck6 are three PC cycles yet six steps without passing the decoder.
Configuration of ES Unit (4EC-Case 1)
At clock ck10, adder AD1 is used to obtain b^3, and at clock ck11, adder AD1 is similarly used to calculate H=cb^(−3/2). The cubic equation solving method decoder SLVw employing the code converting decoder is used to decode w from w^3+w=H, and at clock ck12, ψ=wb^(1/2) is calculated by adder AD4. In addition, at clock ck13, δ=ψ+D is calculated by the parity checker PC, and at clock ck14, J=Qδ^(−2) and K=ψS^(−2) are calculated, respectively, by adders AD1 and AD4.
If δ≠0, then the quadratic equation solving method decoder SLVu is used in a time-division manner to decode respectively from u^2+u=J and v^2+v=K two of u and v, i.e. u1 and u2, and v1 and v2. At clock ck15, α0=δu1, β0=δu2, α1=Sv1, and β1=Sv2 are calculated by adders AD1-AD4. However, time-division in clock ck15 is performed in coordination with time-division in the solving method decoder SLVu.
If δ=0, then it is assumed that α0=β0=Q^(1/2). At clock ck16, L=α0α1^(−2) is calculated by adder AD1, and M=β0β1^(−2) is calculated by adder AD2. The quadratic equation solving method decoder SLVu is used in a time-division manner to decode respectively from y^2+y=L and z^2+z=M two of y and z, i.e. y1 and y2, and z1 and z2.
At clock ck17, X1=α1y1, X2=α1y2, X3=β1z1, and X4=β1z2 are calculated by adders AD1-AD4. However, time-division in clock ck17 is performed in coordination with time-division in the solving method decoder SLVu.
All processing is completed in 28 steps. Generation of the control clock for those steps is shown in
Two steps are used in the adders AD, and four steps are used in the parity checker PC including the code conversion decoding to a coefficient expression. The equation solving method decoder SLV requires two steps in code conversion decoding, hence resulting in six steps of processing with the adder AD and decoding.
Configuration of ES Unit (4EC-Case 2)
At clock ck10, the cubic equation solving method decoder SLVcube employing the code converting decoder is used to define ψ=c^(1/3). At clock ck11, the parity checker PC is used to calculate δ=ψ+D, and at clock ck12, the adders AD1 and AD4 are used to calculate, respectively, J=Qδ^(−2) and K=ψS^(−2).
If δ≠0, then the quadratic equation solving method decoder SLVu is used in a time-division manner to decode respectively from u^2+u=J and v^2+v=K two of u and v, i.e. u1 and u2, and v1 and v2.
At clock ck13, adders AD1-AD4 are used to calculate α0=δu1 and β0=δu2, and α1=Sv1 and β1=Sv2. However, time-division in clock ck13 is performed in coordination with time-division in the decoder SLVu.
If δ=0, then it is assumed that α0=β0=Q^(1/2). At clock ck14, L=α0α1^(−2) is calculated by adder AD1, and M=β0β1^(−2) is calculated by adder AD2. The quadratic equation solving method decoder SLVu is used in a time-division manner to decode respectively from y^2+y=L and z^2+z=M two of y and z, i.e. y1 and y2, and z1 and z2.
Then, at clock ck15, X1=α1y1, X2=α1y2, X3=β1z1, and X4=β1z2 are calculated by adders AD1-AD4. However, time-division in clock ck15 is performed in coordination with time-division in the decoder SLVu.
All processing is completed in 22 steps. Generation of the control clock for those steps is shown in
Two steps are used in the adders AD and the cubic equation solving method decoder SLVcube, and the parity checker PC processes in four steps including the code conversion decoding. The SLV requires two steps in decoding, hence resulting in six steps of processing with the adder AD and decoding.
Configuration of ES Unit (4EC-Case 3)
At clock ck10, adder AD1 is used to obtain b^3, and at clock ck11, adder AD1 is used to calculate H=cb^(−3/2). The cubic equation solving method decoder SLVw is used to decode w from w^3+w=H, and at clock ck12, ψ=wb^(1/2) is calculated by adder AD4.
At clock ck13, δ=ψ+D is calculated by the parity checker PC, and at clock ck14, J=Qδ^(−2) is calculated by adder AD1. The quadratic equation solving method decoder SLVu is used to decode both of u1 and u2 from u^2+u=J, and at clock ck15, α0=δu1 and β0=δu2 are calculated by adders AD1 and AD2, and α1=β1=ψ^(1/2) is defined.
At clock ck16, L=α0α1^(−2) is calculated by adder AD1, and M=β0β1^(−2) is calculated by adder AD2. The quadratic equation solving method decoder SLVu is used in a time-division manner to decode respectively from y^2+y=L and z^2+z=M two of y and z, i.e. y1 and y2, and z1 and z2.
At clock ck17, X1=α1y1, X2=α1y2, X3=β1z1, and X4=β1z2 are calculated by adders AD1-AD4. However, time-division in clock ck17 is performed in coordination with time-division in the decoder SLVu.
All processing is completed in 26 steps. Generation of the control clock for those steps is shown in
Two steps are used in the adders AD, and four steps are used in the parity checker PC including the code conversion decoding. The equation solving method decoder SLV requires two steps in decoding, hence resulting in four or six steps of processing with the adder AD and decoding.
Configuration of ES Unit (4EC-Case 4)
At clock ck10, the cubic equation solving method decoder SLVcube is used to set δ=c^(1/3). At clock ck11, J=Qδ^(−2) is calculated by adder AD1. The quadratic equation solving method decoder SLVu is used to decode both of u1 and u2 from u^2+u=J, and at clock ck12, α0=δu1 and β0=δu2 are calculated by adders AD1 and AD2, and α1=β1=δ^(1/2) is defined.
At clock ck13, L=α0α1^(−2) is calculated by adder AD1, and M=β0β1^(−2) is calculated by adder AD2. The quadratic equation solving method decoder SLVu is used in a time-division manner to decode respectively from y^2+y=L and z^2+z=M two of y and z, i.e. y1 and y2, and z1 and z2. At clock ck14, X1=α1y1, X2=α1y2, X3=β1z1, and X4=β1z2 are calculated by adders AD1-AD4. However, time-division in clock ck14 is performed in coordination with time-division in the decoder SLVu.
All processing is completed in 16 steps. Generation of the control clock for those steps is shown in
Two steps are used in the adders AD and the cubic equation solving method decoder SLVcube, and four steps are used in the parity checker PC including the code conversion decoding. The decoder SLV requires two steps in decoding, hence resulting in four or six steps of processing with the adder AD and decoding.
Configuration of ES Unit (3EC)
If b≠0, then at clock ck10, adder AD1 is used to obtain b^3, and at clock ck11, H=ζTb^(−3/2) is calculated. The cubic equation solving method decoder SLVw is used to decode w from w^3+w=H, and at clock ck12, ψ=wb^(1/2) is calculated by adder AD4.
If b=0, then the cubic equation solving method decoder SLVcube is used to set ψ=(ζT)^(1/3). At clock ck13, β0=ψ+D is calculated by the parity checker PC, and at clock ck14, β0ψ is calculated by adder AD1 and X1=Tβ0^(−1) is calculated by adder AD3. At clock ck15, β1=β0ψT^(−1) is calculated.
At clock ck16, J=β0β1^(−2) is calculated by adder AD2. The quadratic equation solving method decoder SLVu is used to decode both of u1 and u2 from u^2+u=J, and at clock ck17, X2=β1u1 and X3=β1u2 are calculated by adders AD1 and AD2.
All processing is completed in a maximum of 23 steps. Generation of the control clock for those steps is shown in
Two steps are used in the adders AD and cubic equation solving method SLVcube, and four steps are used in the parity checker PC including the code conversion decoding. The decoder SLV requires two steps in decoding, hence resulting in four steps of processing with the adder AD and decoding.
Configuration of ES Unit (2EC)
At clock ck10, adder AD4 is used to calculate J=DS^(−2). The quadratic equation solving method decoder SLVu is used to decode both of u1 and u2 from u^2+u=J, and at clock ck11, X1=Su1 and X2=Su2 are calculated by adders AD1 and AD2.
All processing is completed in a maximum of six steps. Generation of the control clock for those steps is shown in
Two steps are required for the adders AD and two steps are required for the decoder SLV, hence resulting in four steps of processing with the adder AD and decoder.
As the above makes clear, each solution search computation processing has fewer step numbers than the SEC unit. When processing in the SEC unit is completed, judgment for five bit errors or more can be performed, hence time taken for error position search processing is less than time taken for judgment of error warning. Specifically, processing from error judgment to obtaining result of error position search can be completed in a time which is less than the time taken to obtain judgment of presence/absence of five bit or more error.
It has been seen how using identical circuits in a time-division manner in each computational processing allows circuit scale to be reduced; however, what is important in time-division processing is the passing of processing results. Consequently, a data latch is provided and utilized to retain computation results for a required period. Since that latch circuit also is utilized in a time division manner for scale reduction, it is necessary for the latch circuit to be provided to control input and output of data in a way that prevents any clashing of retained data.
The relationship between processing data outputted by each computation circuit in the SEC and ES units and the time-division-utilized latch circuit operative to retain that processing data is shown below in table format along with the clock.
Note that syndrome calculation processing is performed prior to the start clock ck0 of subsequent computation processing, and is indicated as clock cycle ck-1.
Describing specifically from the top left section in the table of
Hereinafter, the same applies also to other element outputs. The latch circuits are used in a time-division manner up to the clock for which element data needs ultimately to be fixedly held, thereby reducing circuit scale.
There are 18 kinds of latch circuits required in the SEC unit, namely A1-A3, B1-B4, and R1-R11. Each of these has a 13-bit configuration, thereby necessitating 18×13=234 bits of latch circuit. The latch circuits R1-R11 shown in doubled outline are latch circuits for which data needs ultimately to be retained for the duration of the ECC period after the corresponding clock. These latch circuits R1-R11 configure the register REG in previously described
The above-mentioned latch circuits required in computation in the SEC unit are utilized also in the individual cases of branching in the ES unit. Examples of this latch circuit time-division use in the ES unit are shown in
There is no overlapping of each of the cases as they proceed, hence use of latch circuits can be set independently for each. Note that the computation circuit “SLVc” in
Similarly,
As shown above, when time-division use of the latch circuits is determined, the latch circuits to which each of the computation circuits is to be connected during time-division operation are also determined. The application of the latch circuits A1-A3, B1-B4, and R1-R11 corresponding to input/output element data of the calculating circuits in the previously described calculating circuit configurations of the SEC and ES units is shown specifically in
That is,
Galois elements can have powers of two and reverse elements handled by bit position shift of the binary expression of index, hence these elements are the same as latch data and are connected to an identical latch.
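The index manipulation mentioned above can be illustrated with a short Python sketch. It assumes that indices of GF(2^13) are taken modulo 2^13 − 1 = 8191 and held as 13-bit binary numbers. Under that assumption, raising an element to the power 2^k multiplies its index by 2^k modulo 8191, which is a cyclic shift of the 13 index bits. The inverse element is modeled here as the bitwise complement of the index, which is an arithmetic identity for this modulus and is an assumption about how the embodiment wires it. The function names are illustrative only.

```python
N = 13
MOD = (1 << N) - 1  # 8191, order of the multiplicative group of GF(2^13)


def pow2_index(idx, k):
    """Index of (alpha^idx)^(2^k): a cyclic left shift of the 13-bit index by k."""
    k %= N
    return ((idx << k) | (idx >> (N - k))) & MOD


def inverse_index(idx):
    """Index of the inverse of alpha^idx: bitwise complement of the 13-bit index."""
    complemented = ~idx & MOD
    # All ones (8191) is congruent to 0 modulo 8191, so normalize it.
    return 0 if complemented == MOD else complemented


# The shifted index equals idx * 2^k mod 8191, so no arithmetic circuit is needed.
idx = 1234
print(pow2_index(idx, 1), (2 * idx) % MOD)
print(inverse_index(idx), (-idx) % MOD)
```

Because both operations only rearrange or invert wires of the same 13 bits, the resulting element can be read from the latch that already holds the original index.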
When two inputs are connected to the same latch circuit (for example, adders AD2, AD3, and AD4 at clock ck0 in
In each of the ES unit configurations of
In
A central unit of the circuit layout in
Disposed in parallel with each other to sandwich the arithmetic circuit unit 401 and the latch circuit unit 402 are: an input side data bus 403 operative to supply index data configuring computation element input at one side of the arithmetic circuit unit 401 and the latch circuit unit 402; and an output side data bus 404 operative to receive computation element output index data obtained from the arithmetic circuit unit 401 at the other side of the arithmetic circuit unit 401 and the latch circuit unit 402. Disposed between the input side data bus 403 and the arithmetic circuit unit 401 are transfer gate circuits 406 and 407 each configured from a MUX operative to perform control of data transfer to the arithmetic circuit unit 401; and disposed between the output side data bus 404 and the arithmetic circuit unit 401 are transfer gate circuits 408 and 409 each configured from a MUX operative to perform control of data transfer from the arithmetic circuit unit 401 to the output side data bus 404.
Moreover, disposed between the latch circuit unit 402 and the input side data bus 403 is a transfer gate unit 405 operative to control transfer of latch circuit data to the input side data bus 403; and disposed between the latch circuit unit 402 and the output side data bus 404 is a transfer gate unit 412 operative to control transfer of bus data to the latch circuit unit 402.
At the left side ends of the data buses 403 and 404 there is a timing signal generating unit 411 operative to generate timing signals such as the clock, case selection signals, and the like, which control time-division operation and exchange of data with the buses for each of the computation elements. The clock signal is supplied to the transfer gate circuits 405-409 and 412.
An outline circuit operation is described sequentially in accordance with a flow of data.
First, the result obtained in the syndrome calculating unit 410 at top right is sent, either via the computation element PC or directly, to the code converting decoder, i.e. the “pn(x) to index” decoder, to become a Galois field element index. Then, it passes through the output multiplexer 409 to become data in the output side data bus 404.
This data on-board the output side data bus 404 passes through the transfer gate circuit 412 to be loaded into the latch circuits (A, B, R) 402. The data in the latch circuits 402 passes through the transfer gate circuit 405 with an appropriate timing to become index data in the input side data bus 403. The data on-board this bus either passes through the transfer gate circuit 406 to be loaded directly as an index into the computation elements AD1-AD4, or passes through the transfer gate circuit 407 and through the code converting decoder, i.e. the “index to pn(x)” decoder, to be temporarily converted to coefficient display and sent to the computation element PC.
On completion of computation, output of the computation either passes through the output transfer gate circuit 408 or, in the case of the computation element PC, is first converted to an index by the code converting decoder, i.e. the “pn(x) to index” decoder and then passes through the output transfer gate circuit 409, to be sent to the output side data bus 404. That data passes again through the transfer gate circuit 412 to be loaded into the latch circuits (A, B, R) 402 and sent with an appropriate timing to the input side data bus 403, although a part of the latch data passes through the equation solving method decoder “solver” with an appropriate timing of an internal multiplexer MUX to be sent again to the bus.
As described above, computation processing proceeds through data in the input/output buses being utilized by the latches and computation elements with appropriate timings. Data remaining in the output side data bus 404 at the point of completion of ECC processing is the error position information X1-X4.
Data transfer from the bus to this data-retention-buffer latch is controlled by in-clocks “in clocks”, generated with an appropriate timing and connected to corresponding bus lines. Data transfer output from the latch is controlled by out-clocks “out clocks”, generated with an appropriate timing and connected to corresponding bus lines. This portion of the one-bit latch “b” can be expressed by circuit symbols as in the figure.
Furthermore, the index of an element in the Galois field GF(2^13) is 13 bits, and elements can be retained by 13 one-bit latches. Hence, as shown in
The clock signals “in ck” and “out ck” inputted to the input/output multiplexers MUX instruct data allocation to the input/output buses from the respective latches, in the same manner as described for the connection relationship diagram of latches and computation elements in
As described above, innovation in summing computation enables the number of converting decoders for converting between expressions of Galois field elements used in the ECC circuit system to be kept to the minimum of one each of a “pn(x) to index” decoder, i.e. a decoder for converting from a polynomial coefficient expression to an index expression, and an “index to pn(x)” decoder, i.e. a decoder for converting from an index expression to a polynomial coefficient expression. Other similar decoders include the three equation solving method decoders, namely SLVcube, SLVw, and SLVu, the circuit system being configured such that these three decoders are all utilized in a time-division manner.
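The role of the two code converting decoders can be sketched in Python as a pair of lookup tables. This is a minimal sketch under stated assumptions: the primitive polynomial x^13 + x^4 + x^3 + x + 1 is chosen for illustration and is not taken from the embodiment, the zero element is represented by None, and all names are hypothetical. Products are formed by adding indices, whereas a sum requires conversion to the polynomial coefficient expression, a bitwise exclusive OR, and conversion back to an index.

```python
N = 13
ORDER = (1 << N) - 1   # 8191 nonzero elements of GF(2^13)
PRIM_POLY = 0x201B     # x^13 + x^4 + x^3 + x + 1 (assumed for illustration)


def build_tables():
    """Build the 'index to pn(x)' and 'pn(x) to index' conversion tables."""
    index_to_poly = [0] * ORDER
    poly_to_index = {}
    value = 1  # alpha^0
    for i in range(ORDER):
        index_to_poly[i] = value
        poly_to_index[value] = i
        value <<= 1            # multiply by alpha
        if value >> N:         # reduce modulo the primitive polynomial
            value ^= PRIM_POLY
    return index_to_poly, poly_to_index


INDEX_TO_POLY, POLY_TO_INDEX = build_tables()


def gf_mul(i, j):
    """Product of alpha^i and alpha^j, computed purely on indices."""
    if i is None or j is None:
        return None
    return (i + j) % ORDER


def gf_add(i, j):
    """Sum of alpha^i and alpha^j, computed via the coefficient expression."""
    a = 0 if i is None else INDEX_TO_POLY[i]
    b = 0 if j is None else INDEX_TO_POLY[j]
    total = a ^ b
    return None if total == 0 else POLY_TO_INDEX[total]


print(gf_mul(8000, 500))   # index addition modulo 8191
print(gf_add(0, 1))        # index of 1 + alpha
```

Since every sum goes through the same two tables, a single decoder of each kind suffices provided that the sums are scheduled at different clocks.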
To further reduce circuit scale, innovation is required in the configuration of the decoders themselves. One such innovation, the modified example of a 13-bit code converting decoder DEC is shown in
In other words,
That is, the 13 steps τ0-τ12 shown in
In steps τ0-τ12, discharge switches of the partial decoding signals C0-C15 (inverted /C0-/C15) in the final branching of the discharge are turned on to perform decoding of the 13 bits (m=0-12). These switches are realized by disposing NMOS transistors in accordance with decoding of bit m of the expression. For resetting of output latches LT21, steps immediately prior to the setting steps are respectively used.
Unchanged from
The above configuration allows a code converting decoder to be configured having a circuit block number of a one-bit amount. If a compact integrated circuit layout can be achieved for the transistor array of the clock-operated discharge circuit PDC, then circuit scale can be significantly reduced. However, a decoding time several times longer than in the batch method is required.
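The trade-off between the batch method and the time-division method can be illustrated behaviorally. In the Python sketch below, the conversion table, the function names and the input values are hypothetical, and the reset of the output latches is not modeled. The batch decoder produces all 13 output bits at once, whereas the time-division decoder reuses a one-bit decoding path and fixes output bit m in an output latch at step τm.

```python
N_BITS = 13


def batch_decode(table, x):
    """Batch method: all 13 output bits are obtained in a single step."""
    return table[x]


def time_division_decode(table, x):
    """Time-division method: one output bit per step tau0..tau12.

    Returns the assembled 13-bit value and the number of steps used.
    """
    output_latches = [0] * N_BITS
    steps = 0
    for m in range(N_BITS):
        # A single one-bit decoding path is reused for every bit position m.
        output_latches[m] = (table[x] >> m) & 1
        steps += 1
    value = sum(bit << m for m, bit in enumerate(output_latches))
    return value, steps


# Hypothetical 4-bit-input table standing in for a code conversion table.
toy_table = {x: (x * 37 + 5) & 0x1FFF for x in range(16)}
print(batch_decode(toy_table, 9), time_division_decode(toy_table, 9))
```

Both methods give the same 13-bit result; the time-division version needs 13 steps instead of one in exchange for roughly one thirteenth of the decoding circuit blocks.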
A time-division circuit system is controlled by clock cycle steps, and a simple configuration method for the clock generating circuit is described below.
This circuit generates a required number of separate clock pulses in synchronization with the basic clock CL, these becoming the step clocks τi configuring the previously described clocks ck0-ck17.
Specifically, one of the blocks Block_s has seven units, and the other of the blocks Block_t has three units.
External clock CL is loaded internally while signal “gate” is H to obtain clocks clk and /clk. In the first block Block_s, the operation is repeated in which, at clock clk, the boundary in the unit shifts to the right, and on reaching the right end is reflected and shifts to the left, and on reaching the left end is reflected and shifts to the right.
Assuming boundary numbers of the shifting unit “1”, “0” are expressed as 2i−2i+2 (i=0−3), and setting 2i=0 as Vdd, and 2i+2=8 as Vss, clock step signals /si and /s(7−i) are synthesized by the logic gates G11 and G12 as in the figure. Note that signal R indicates that the unit boundary is in the middle of shifting to the right, and that signal L indicates that the unit boundary is in the middle of shifting to the left. This block Block_s causes the internal clock ck and /ck to be generated each time a cycle of boundary shift is completed.
In the second block Block_t, the boundary of the unit state shifts due to the clock ck, whereby a similar operation to that in block Block_s is performed. The step signal generated in this block Block_t does not have a gap provided between clock pulses, hence unit number is approximately half that of block Block_s.
Assuming boundary numbers of the shifting unit “1”, “0” are expressed as j (j=0−3), and setting j=0 as Vdd, and j+1=4 as Vss, clock step signals /tj and /t(7−j) are synthesized by the logic gates G13 and G14 as in the figure. The number of step signals /tj logic-synthesized from the state of the boundary is the same as /si, but the way of synthesizing differs from that in block Block_s. Signal RSF indicates that the unit boundary is in the middle of shifting to the right, and signal LSF indicates that the unit boundary is in the middle of shifting to the left.
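The reflecting boundary shift of block Block_t can be modeled behaviorally as follows. This Python sketch is an interpretation rather than a description of the circuit: it assumes that the boundary number j runs from 0 to 3, that the step signal with number j is active while the boundary shifts to the right, and that the step signal with number 7−j is active while it shifts to the left, so that eight step signals are produced per full cycle. The function name and parameters are hypothetical.

```python
def block_t_steps(n_clocks, n_boundaries=4):
    """Return the active step number q for each of n_clocks clock ticks.

    The boundary number j moves 0, 1, 2, 3, is reflected, moves 3, 2, 1, 0,
    and is reflected again. Shifting right selects step j; shifting left
    selects step (2 * n_boundaries - 1) - j.
    """
    j, shifting_right = 0, True
    steps = []
    for _ in range(n_clocks):
        steps.append(j if shifting_right else 2 * n_boundaries - 1 - j)
        if shifting_right:
            if j == n_boundaries - 1:
                shifting_right = False   # reflected at the right end
            else:
                j += 1
        else:
            if j == 0:
                shifting_right = True    # reflected at the left end
            else:
                j -= 1
    return steps


print(block_t_steps(10))
```

Under these assumptions four boundary positions are enough to generate eight consecutive step signals, since each position is used once in each shift direction.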
Step clock τ(p+8q), that is, the step clock whose number is p+8q, is obtained by gate G15 operative to perform NOR logic on step signals /sp and /tq.
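The NOR synthesis of the step clock with number p+8q can be checked with a small Python sketch. It assumes that the step signals of both blocks are active-low and one-hot, meaning that exactly one of /s0-/s7 and exactly one of /t0-/t7 is at the low level at any time. The function name and the list representation are illustrative only.

```python
def step_clocks(active_p, active_q, n_s=8, n_t=8):
    """Return all step clock levels for the active steps p (fast) and q (slow)."""
    # Active-low one-hot step signals: 0 means the step is selected.
    s_bar = [0 if p == active_p else 1 for p in range(n_s)]
    t_bar = [0 if q == active_q else 1 for q in range(n_t)]

    tau = [0] * (n_s * n_t)
    for q in range(n_t):
        for p in range(n_s):
            # NOR gate: the output is high only when both inputs are low.
            tau[p + n_s * q] = int(not (s_bar[p] or t_bar[q]))
    return tau


tau = step_clocks(active_p=3, active_q=2)
print(tau.index(1), sum(tau))
```

With p = 3 and q = 2 the only step clock at the high level is number 3 + 8×2 = 19, which shows how two short step sequences combine into one long sequence of separate clock pulses.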
A logic circuit 511 is a step synthesis circuit operative to synthesize step signals /s0-/s7 from unit boundary signals B1-B7 and the shift signals R and L. Nodes B1-B6 and clock node ck in the circuit are initialized to Vss at initialization signal rs, and unit state shifts due to clock clk.
A shift signal generating circuit 512 generates the shift signals R and L based on signals F and C at both end nodes of block Block_s. A clock generating circuit 513 operative to generate the clock ck generates a clock ck which changes state each time a cycle of shifting is completed in response to the signals F and C at both end nodes of block Block_s and the shift signals R and L.
A logic gate circuit 521 is a step synthesis circuit operative to synthesize step signals /t0-/t7 from unit boundary signals C1-C3 and the shift signals RSF and LSF. Circuit nodes C1 and C2 are initialized to Vss at initialization signal rs, and unit state shifts due to clock ck.
A shift signal generating circuit 522 generates the shift signals RSF and LSF determining boundary shift direction based on signals D and E at both end nodes of the block.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel methods and systems described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the methods and systems described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Number | Date | Country | Kind |
---|---|---|---|
2010-028523 | Feb 2010 | JP | national |
Number | Name | Date | Kind |
---|---|---|---|
4030067 | Howell et al. | Jun 1977 | A |
4556977 | Olderdissen et al. | Dec 1985 | A |
4694655 | Seidel et al. | Sep 1987 | A |
5107503 | Riggle et al. | Apr 1992 | A |
5216676 | Kimura | Jun 1993 | A |
6185134 | Tanaka | Feb 2001 | B1 |
6347389 | Boyer | Feb 2002 | B1 |
6643819 | Weng | Nov 2003 | B1 |
6651214 | Weng et al. | Nov 2003 | B1 |
6725416 | Dadurian | Apr 2004 | B2 |
RE40252 | Tanaka | Apr 2008 | E |
7890846 | Lee et al. | Feb 2011 | B2 |
7978972 | Ohira et al. | Jul 2011 | B2 |
20090049366 | Toda | Feb 2009 | A1 |
20090198881 | Toda | Aug 2009 | A1 |
20100115383 | Toda | May 2010 | A1 |
Number | Date | Country |
---|---|---|
2000-173289 | Jun 2000 | JP |
Entry |
---|
U.S. Appl. No. 13/327,418, filed Sep. 20, 2011, Toda. |
Number | Date | Country | |
---|---|---|---|
20110202815 A1 | Aug 2011 | US |