1. Field of the Invention
The present invention relates generally to the field of solid state flash disks and particularly to a method and apparatus for providing error detection and recovery on solid state disks.
2. Description of the Prior Art
Flash memory storage disk is gaining more in popularity as consumers look for more compact and versatile means of storing information. Solid state disk (SSD) can be organized by several storage units called solid state array being linked together where the disks may be identical in size or have different sizes but will be allowed to have a minimum capacity. Total capacity of SSD may be the total of all disk capacities being summed together for fast performance but without disk recovery capacity. To ensure data correctness by fault-tolerant feature, an overhead should be involved for adding parity byte operation.
However, as density of flash chips is increasing defect and fail rate are also increasing especially when Multi-Level Cell (MLC) technology is introduced in the flash manufacturing process even though most of the SSD are made from Single-Level Cell (SLC) process flash chips.
Reed Solomon (RS) algorithm is popularly applied in industrial applications for error detection and recovery. Prior art methods to find roots of error polynomial include Berlekamp-Messay and Euclidean methods after which Chien's method is applied to search for error locations. However, since implementation requires complex hardware and lengthy calculations, cost of error correction code (ECC) is higher using the above prior art methods.
A second method utilizes look up table. A pre-calculated value is stored in Read Only Memory (ROM) for error searching. Advantage of this method is the speed and substantial time saved for calculations, however, for long size codes ROM could occupy expensive silicon areas and directly increase controller chip cost.
Normally, four procedures are involved in Reed Solomon decoding process:
Procedure (2) above normally uses Berlekamp-Messay recursive method or Euclidean matrix method the complexity of which depends on the code length independently of error numbers. Procedure (3) to find roots of error polynomial adopts Chien's searching method with a calculation time depending on the code length.
The above two methods do not fully utilize the characteristic of low error counts of flash memory disk and requires sophisticated hardware and relatively long calculation time.
It is therefore desirable to develop a method and procedure for performing disk crash (erasure) calculations which can save hardware cost of silicon area ROM look up while eliminating lengthy calculations based on Chien's searching algorithm. In addition, the new method should only depend on the syndrome result to find out error location which is the most difficult process in the RS algorithm.
Briefly, an embodiment of the present invention includes an electronic data storage device having a Reed Solomon (RS) decoder including a syndrome calculator block responsive to information including data and overhead and operative to generate a syndrome. The electronic data storage device further includes a root finder block coupled to receive said syndrome and operative to generate at least two roots, said RS decoder for processing said two roots to generate at least one error address identifying a location in said data wherein said error lies; and an erasure syndrome calculator block responsive to said information and operative to generate an erasure syndrome, said RS decoder responsive to said information identifying a disk crash, said RS decoder for processing said erasure syndrome to generate an erasure error to recover the data in said disk crash.
The foregoing and other objects, features and advantages of the present invention will be apparent from the following detailed description of the preferred embodiments which make reference to several figures of the drawing.
a) shows an electronic data storage device 10 coupled to a host device 12, in accordance with an embodiment of the present invention.
b) shows four storage disks or solid disk arrays (SDA), 54-60, for storing data and four Reed Solomon (RS) parity disks 62-70 for storing overhead including the RS parity bytes, in accordance with an embodiment of the present invention.
a) shows a block diagram 220 of an RS encoder in a write process and a main memory 224, in accordance with an embodiment of the present invention.
b) shows a block diagram of an RS decoder 250 in the read direction, in accordance with an embodiment of the present invention.
a) shows a write operation block diagram 199, in accordance with an embodiment of the present invention.
b) shows a read operation block diagram 305, in accordance with an embodiment of the present invention.
In one embodiment of the present invention, a Reed Solomon (RS) error detection and correction (ECC) or coding and decoding method is used in conjunction with non-volatile memory with disk crash (erasure) calculations capability which in turn simplifies the circuitry employed. Simplification in circuitry results in hardware costs of silicon area read only memory (ROM) look up. Other advantages of the present invention include eliminating lengthy calculations based on Chien's searching algorithm.
The embodiment of the present invention is based on low error count characteristic of solid state disk (SSD) storage disk array for only two or less error counts per codeword. A relatively simple method depending on syndrome result is needed to find error location which is the most part of the RS algorithm. In addition. The present invention uses the knowledge of disk crash (erasure) as a pre-known condition to inform the RS controller that the error location is known whence does not need to be found by the RS controller. Thus error recovery capability is enhanced by fully utilizing the pre-known disk crash knowledge. Additional advantages of the present invention include simple RS logic design structure, reduced hardware complexity, simpler control signal handling, faster speed and not being influenced by the code length.
Two-error counts format RS code based on Galois Field (GF) (28) is used in the embodiment of the present invention wherein each symbol or byte is 8 bits. Also N is the code length and 4<N≦255, N−4 is the length of message unit which is the read out code in the data area of flash memory, 4 bytes comprise the parity bytes generated from the RS encoder to correct up to two errors per codeword.
If s denotes error bytes and r denotes the number of disk crash or erasure then 2s+r≦2t. Since the parity bytes are four then 2t=4 and from the above inequality it follows that the total error bytes may be 2, i.e. s=2. However, with only one error byte, i.e. s=1, up to disk crashes (erasure cases) may be corrected by the RS encoder/decoder.
Referring now to
In one embodiment of the present invention, the electronic data storage device 10 is a flash memory storage solid state disk (SSD) comprising a number of disks. The total storage capacity of the SSD is approximately the sum of storage capacities of the disks. To ensure data correctness by fault-tolerant feature, overhead is included as parity bytes. An example of the electronic data storage device 10 is the redundant array of independent disks (RAID).
Flash, which is one type of non-volatile memory, is known, at times, to have bit errors according to manufacturing defect(s) or repeated write/read/erase operation. Error correction coding (ECC) methods and techniques are commonly employed in flash applications. However, increased defect bits, in page access operations, require powerful ECC methods. An embodiment of the present invention addresses such a need and provides an effective and cost sensitive solution thereof.
The flash memory storage SSD 10 includes an interface circuit 13, a solid disk array (SDA) controller 14, storage disks 26-32 and RS parity storage disks 36-46, in accordance with an embodiment of the present invention. The number of storage disks is N−4 where N might be as high as 255. In
The interface circuitry 13 is coupled to the host device 12 and to the data stripping unit 24, disk command register 22 and the central controller 16. The data stripping unit 24 is coupled to the disk command register 22 and the central controller 16 and the storage disks 26-32. The disk command register 22 is coupled to the data stripping unit 24 and central controller 16 which is coupled to the buffer 20 and the RS calculation unit 21. The RS calculation unit 21 is coupled to the RS parity storage disks 36-46.
Data stripping unit 24 divides the data and overhead, which is then distributed between various storage disks 26-32 so that a data file may be spread between one or more of storage disks. However, the location of the bytes in each sector in which a data file is stored is the same. For example, a data file may be divided into four different bytes and each byte stored in the identical location in four different disks. Spreading of data and overhead enables the RS calculation unit 21 to perform parallel processing, which reduces the amount of delay. As in the last example, a single data file may be stored in four different sectors in a plurality of storage disks. The RS calculation unit 21 processes the whole file as a block length data without having to wait for processing one whole sector to be finished before moving to the next sector. Thus, the speed of error detection and correction is increased.
The RS algorithm stored in the controller 16 is used to perform encoding and decoding. The RS calculation unit 21 includes an RS decoder, which is a fixed block length decoder. The data stripping unit 24 divides the data and overhead sent by the host 12 to generate a block length data which is distributed between the storage disks 26-32. The RS calculation unit 21 includes an RS encoder, which generates RS parity bytes needed by the RS decoder to detect possible error generated in retrieving the data from the electronic storage device 10.
There are four storage disks dedicated to storing the RS parity bytes. The RS calculation unit 21 calculates the parity bytes which are stored in the RS parity storage disks 36-44. The RS parity storage disks 36-44 comprise the overhead in the total storage data array and there are N−4 storage disks (only four of which are shown in
The disk command register 22 operates to transfer host commands to the solid disk array controller 14. The information sent by the host 12 is stored temporarily in the buffer 20 while RS calculations are being performed.
Referring now to
The data stripping unit 50 by using the stripping algorithm 52 divides the information provided by the host 12 into one or more parts to generate a block length data. Each data part consists of data sent to a storage disk 54, 56, 58 or 60 to be distributed almost evenly between the aforementioned storage disks. The overhead part of the block length data consisting of overhead is sent to the RS parity disks 62-70 to be distributed almost evenly therebetween. Accordingly, the RS calculation unit 21 decodes various data parts in parallel without having to finish processing one disk before moving to the next one thereby reducing the delay time. For example, if approximately 0.2 milliseconds is required to read each sector, by parallel processing the embodiment of the present invention is almost 16 (2×8 for 2 errors and 8 disks) times faster than without data stripping.
Referring now to
The encoder 86 receives information provided by the host 12 in the form of bytes 96-106. Each sector within a disk comprises 512 bytes and the Nth byte for storage in disks 1 to N−4 as bytes 96-106, respectively. In the particular example shown in
The decoder 88 receives the data stored in the storage disks 94 for processing thereof to identify the magnitude and address of the disk crash or erasure. Specifically, output signals 118 and 120 indicate the number of errors and disk crashes or erasures, respectively. Output signals 112 and 126 indicate the location of the possible first and second errors, respectively, and signals 124 and 128 indicate the magnitudes or values of the first and second errors, respectively. Output signal 130 indicates the disk crash (or erasure) address and signal 132 indicates the value of erasure.
Referring now to
At the step 146 of the write process flowchart 140, the host 12 sends information to be stored in the storage disks through the interface circuitry 13. The central controller 16 has a direct memory access (DMA) engine which transfers the information bytes to the buffer 20 or to the local controllers in each of the solid disk arrays 54-60, as indicated at step 148.
For the purpose of encoding, the information provided by the host is processed by the data stripping unit 24 to generate the block length data. As a sector of the block length data is updated by writing new information, the remaining related sectors of the block length data need to be updated. At step 152, central controller 16 fetches the related sector bytes located at different storage disks through the DMA engine by whole sectors. At step 154, the RS calculation unit 21 generates parity bytes which are stored into different RS parity disks 62-70 at step 156. The whole process is thereby completed at step 158.
At step 160 of the read process flowchart 142, the host 12 sends new read sector address through the interface circuitry 13 to the flash solid state disk (SSD) 10. The storage disk having the appropriate sector determined by the address responds to the host read request by sending the sector to the DMA engine at step 162.
Step 164 indicates the activity of the local controller checking the overhead bytes, also known as error correction coding (ECC) bytes, in the storage disk for possible errors in reading the bytes. This is possible since each storage disk has an RS encoder/decoder whereby error detection and recovery is performed. At step 166 the relate sectors which comprise the block length data are fetched by the DMA engine and stored in a temporary buffer. In addition, the RS parity bytes associated with the block length data and stored in the RS parity storage disks are fetched by the DMA engine and stored in a temporary buffer. At step 168 the ECC bytes of the related sectors are checked for possible errors in reading the sectors.
At step 172 it is checked whether the ECC bytes indicate any errors in reading the data. The errors are indicated when the outcome of the RS decoding process is not compatible with the data. If the RS algorithm can correct possible errors in reading the data then no error is generated indicating the disk has not crashed and the data stored therein is correct. The block length data in the temporary buffers is sent back to the host at step 180 thereby completing the read process at step 182.
However, the case wherein the RS algorithm detects error and cannot recover the data is reported to the host as a disk crash or erasure. At step 174, the host is notified that the disk has failed which is reported back to the central controller 16 as a disk crash with a known position. As a result, the RS calculation unit 21 does not perform any error detection and recovery on the crashed disk. This case is referred to as erasure error. Instead, at step 176 the RS calculation unit 21 recovers the lost bytes due to erasure error and subsequently at step 178 the correct bytes are restored to the storage disks previously identified as crashed disks and stored in temporary buffers. The block length data in the temporary buffers is sent back to the host at step 180 thereby completing the read process at step 182.
The schematic 144 of data storage organization includes four storage disks 194-200, each of which has 512 bytes of data and associated ECC bytes 210-216, respectively. Also included are four RS parity storage disks 202-208. Also shown in the schematic 144 is a block length data written to four different sectors of the four storage disks 194-200. The block length data divided by the data stripping unit 24 is stored in byte x of each of the disks 194-200. Thus, the block length data is stored in the bytes with the same location in different storage disks 194-200. Similarly, the RS parity bytes are stored in the same relative location, i.e. byte x, of the four RS parity storage disks 202-208.
Referring now to
The RS encoder block diagram 220 indicates the DMA engine 226 which is included in the central controller 16. At step 234, the host 12 sends in new sector to be stored in the flash SSD 10 through the interface circuitry 13. In other embodiments of the present invention the host 12 sends one or more sectors for storage in the flash SSD 10. At step 238, the stripping algorithm is used to identify the related sectors which together with the new sector comprise the block length data. The new sector is stored in the storage disk determined by the data stripping unit 24 at step 236 through the DMA engine 226. The new sector is also stored in the temporary location 240 which is the same as temporary location 228 in the host.
At step 242 the related sectors included in the block length data are read and stored in the temporary location at step 244 through the DMA engine 226. The temporary locations 240 and 244 are combined at step 248 to perform RS encoding calculations, as described hereinabove in relation to
Referring now to
The related RS parity bytes are fetched from the RS parity storage disks at step 258 and sent trough a third pipeline to a third temporary location at step 270. The data and overhead in the first, second and third temporary locations are combined for RS decoding calculations to be performed thereon at step 272. The output of the RS decoding calculations is stored in a fourth temporary location at step 274. The DMA engine 264 sends the data stored in the fourth temporary location to the host 12 as the new sector data at step 262. The RS parity bytes stored in the fourth temporary location are sent to the RS parity storage disks for storage at step 260.
Referring now to
The related data sector bytes and the new write sector bytes, comprising the block length data, are fetched by the DMA engine 294 from the storage disk 295 and stored in the second temporary location at step 288. The related data sector bytes are also multiplied by the GF(28) coefficients at step 286. The output of multiplications at step 286 is a message polynomial which is processed by the RS encoder at step 290 to generate RS parity bytes as described hereinbelow. The stripping algorithm is employed at step 292 to divide the message polynomial and parity bytes into parts to generate the block length data. Each part of the block length data is transferred to the storage disk 295 through the DMA engine 294. Included in the solid storage disks are data sectors 298-304 located within data storage disks for storing data portion of the block length data. Also in the storage disk 295 are RS sectors 296 located within the RS parity storage disks for storing RS parity bytes of the block length data.
Referring now to
Following transfer of a new sector read request from the host 12 to the flash SSD 10, the DMA engine 332 transfers new data bytes located in the data sectors 336 to the temporary data sectors 328 at step 330. The related RS parity bytes stored in the RS parity storage disks 338 are transferred by the DMA engine 332 to the temporary RS parity sectors 328 at step 330. The related RS parity bytes are transferred to a new temporary location at step 318.
At step 324 GF(28) coefficients are calculated and at step 322 the data bytes are multiplied by the GF(28) coefficients to generate a message polynomial. The message polynomial is processed by the RS engine at step 320 to generate calculated RS parity bytes. The calculated RS parity bytes are compared with the RS parity bytes at step 316. If the two RS parity bytes are identical no error is detected in the parity bytes and the data bytes are transferred to the main memory 306 at step 342. However, if the calculated RS parity bytes and the RS parity bytes associated with the data bytes do not match one or more errors are detected at step 312. The errors are corrected at step 308 to generate corrected data which is transferred to the main memory 306 at step 340.
Referring now to
The erasure position block 352 receives input from the host operating system indicating the disk crash position and is operative to generate a symbol value input signal 354 which is sent to the syndrome calculate block 358. The syndrome calculate block 358 receives a clock signal 356 and is operative to generate and couple a calculated syndrome onto the syndrome signals 362 which are received by the blocks 360, 364 and 370. Further shown in
In
The error address signals 366 and 372 and the error value signals 368 and 374 serve as inputs to the erasure syndrome calculate block 376.
The erasure syndrome calculate block 376 operates to generate and couple a calculated erasure syndrome onto the erasure syndromes 378 which are received by the block 382, 384 and 386. The erasure syndrome calculate block 376 generates an error number signal (Error_Number) 400. The block 384 generates a first erasure error address signal (Erasure_Addr_1) 388 and the block 386 generates a second erasure error address signal (Erasure_Addr_2) 392. Moreover, a first erasure error value signal (Erasure_Value_1) 390 is generated by the block 384 and a second erasure error value signal (Erasure_Value_2) 394 is generated by the block 386. The block 386 calculates Y1 and Y2, in accordance with equations presented hereinbelow.
Referring now to
Z
2
+Z+K=0 Eq. (0)
In the interest of further clarification, an example of error count using block 412 is now presented. The example is to be used for flash or non-volatile memory error recovery and in the case where a page is 528 bytes in length. A page is 512 bytes of data and 16 bytes of spare or overhead which are used for RS encoding/decoding or ECC. Each page is divided or organized into three sections.
Also it is assumed two error counts format Reed Solomon code (RS) based on Galois Field GF(28), where each symbol size is 8 and it is easier to adapt byte indications. RS(N, N−4) which N is code length, and 4<N≦255, N−4 is the length of message unit and it is the read out code in data area of flash memory, 4 bytes is the parity bytes generated from RS encoder for four crash disks or to correct two error sector with disk array. The data reside in sector (page) areas of SD array as long as
N: total Storage Data Array size 2s+r<=2t where s denotes error bytes and r denotes disk crash (Erasure) cases.
2(error byte)+(Erasure disk)≦2t. In the simple example, N=8 disks, and message bytes can be 4 bytes, 2t=4; total error bytes may be 2 bytes, however if only there is one byte of error, and two more disk crashes (erasure case), the RS encoder/decoder is still capable of detecting and correcting errors.
It is assumed r(x) is the receiving polynomial wherein errors might have occurred along with possible disk crashes.
Let rsup(x) denote the polynomial that suppress those disk crash bytes,
then r(x)=c(x)+e(x) Eq. (1)
Since two error bytes are assumed to be correctable by the RS encode/decoder and erasure crash disk cases are also allowed, four syndromes Si (i=0, 1, 2, 3) are needed for error byte location as well as erasure bytes value recovery. It is assumed two error positions are i1, i2; error symbol correct values are Y1, Y2;
S
i
=Y
1
*X
1
j
+Y
2
*X
2
j; Eq. (2)
In Eq. (2), X1=αi
β1=αi
σ(x)=(x−X1)*(x−X2)=x2+σ1x+σ0; Eq. (3)
σ(x) is error location polynomial, and σ1 and σ0 are two error locations. From Eq. (3) it follows that
σ1=X1+X2, σ0=X1*X2. Eq. (4)
Let
r(x)=X2t*(a0+a1x+a2x2+ . . . +ak-2xk-2+ak-1xk-1)+(b2t-1X2t-1+b2t-2x2t-2+ . . . +b1+b0); (4A);
where r(x) is the receiving polynomial, and each coefficient aj is associated with disk position Xj. t is number of error bytes that may be recovered. In this example t=2 but t may be any number. Also (a0+a1x+a2x2+ . . . ) is the receiving data polynomial or M(x) obtained by reading each byte from different SD Array, Xi is the disk position; (b2t-1 X2t-1+b2t-2 x2t-2+ . . . +b1+b0) is the RS parity bytes stored in the SD Array, which is generated from the remainder when x2t*M(x) is divided by
The crash disk positions are first found out and then the corresponding terms in Eq. (4A) are zeroed out since the coefficients associated are no longer valid. As an example, r(x)=α6+α5x+α6x2+αx3, if f2 is the only crash disk and is reported by the Host operating system (OS) probing before accessing required data, then rsup(x)=α6+α5x+0+αx3;
Next, Γ(X)=(1−αZi X)*(1−αZj X) is considered where Γ(X) is the erasure (crash disk) locator polynomial. In the above example Γ(X)=(1−α2X); since f2 disk is crashed. Following is Modified Syndrome equations S(x): S(x)=[Γ(X)*rsup(X)] mod x2t; where 2t=4 and rsup(x) is the suppressed r(x) by setting the crash disk positions equal to zero as previous example.
from above equations, σ1 and σ0 may be calculated as
σ1=(S1S2+S0S3)/(S12+S0S2)=B/A=X1+X2; Eq. (9)
σ0=(S22+S1S3)/(S12+S0S2)=C/A=X1*X2; Eq. (10)
and error values may be obtained by following terms:
Y
2=(S0X1+S1)/σ1; Eq. (11)
Y
1
=S
0
+Y
2; Eq. (12)
The terms A, B, C may be calculated in terms of Si to serve as error number indicator:
A=S
1
2
+S
0
*S
2; Eq. (14)
B=S
1
*S
2
+S
0
*S
3; Eq. (15)
C=S
2
2
+S
1
*S
3; Eq. (16)
Now, a number of error and disk crash cases are considered.
(1) If no errors occurred and no disk crash reported by host OS, then
S0=S1=S2=S3=0; Eq. (13)
No recovery is needed as perfect case.
(2) No disk crash reported by the host operating system, A, B, C from Eqs. (14-16) are all non-zero values:
Two error bytes case with no disk crash needs to be recovered sine 2t=4 reaches the maximum correction amount. If two errors occur in read data, and assuming i1, i2 are the error locations then Y1, Y2 are two error symbol values.
A≠0 since arbitrary different numbers squared and added together must be greater than zero if X1, X2 are not zero and not identical (as indicated in Table 1);
B=S
1
S
2
+S
0
S
3=(X1+X2)*(X12+X22)*Y1*Y2≠0; Eq. (29)
C=S
2
2
+S
1
S
3
=X
1
*X
2*(X12+X22)*Y1*Y2≠0; Eq. (30)
Using cyclic characteristic of Galois Fields (GF), we may assume
X
i=σ1*Zi; Eq. (31)
where i may be 1 or 2 in the example above.
In order to make σ(x)=x2+σ1x+σ1x+σ0 simple, it is easier to consider
σ(z)=z2+z+K; Eq. (31A)
where K=σ0/σ12.
Once the root of above equation is found, we may recover X=σ1*Z again.
It may be assumed x=σ0*z also, however, no benefit results for doing so since it cannot simplify σ(x) equation.
Roots of σ(x) are error locations X1, X2, where the error symbol locations occur. Most of the RS decoding problem is centered around finding these two roots.
Now an easy way and simple hardware is introduced to find the roots. It is assumed Z1, and Z2 are roots of σ(z). Substituting Z1, Z2 into Eq (31A), it is found that
Z
1
2
+Z
1
+K=0; Eq. (32)
Z
2
2
+Z
2
+K=0; Eq. (33)
Subtraction of these two equations, results in
(Z12−Z22)+(Z1−Z2)=0, Eq. (34)
in Galois field operation “−” is identical with “+” so it is found that
(Z12+Z22)+(Z1+Z2)=0, Eq. (35)
Since 2*Z1*Z2=Z1*Z2+Z1*Z2=0; Eq. (36)
because two identical terms added together equal zero under Galois operation. It follows that
(Z12+Z22+2Z1*Z2)+(Z1+Z2)=0,
(Z1+Z2)2+(Z1+Z2)=0; Eq. (37)
(Z1+Z2)*(Z1+Z2+1)=0; Eq. (38)
which implies
Z
1
+Z
2=0; or Z1+Z2+1=0; Eq. (39)
However Z1=Z2 is not possible, as two error locations should not be same, the only choice is
Z
1
=Z
2+1; or Z2=Z1+1; or Z1+Z2=1; Eq. (40)
The above three equations are valid at the same time under Galois operation.
Also 1 in above equation means (1000 0000) if GF(28), and Z1 and Z2 highest bit (bit position 0) should be toggled to each other.
Examples like
Again Z12+Z1+K=0;
Z
1*(Z1+1)+K=0; Eq. (41)
Z
1*(Z1+1)=K; Eq. (42)
It is assumed
Z
1=β1*α+β2*α2+β3*α3+β4*α4+β5*α5+β6*α6+β7*α7; Eq. (43)
where βi are coefficients of the above Z equation.
Z
1+1=1+β1*α+β2*α2+β3*α3+β4*α4+β5*α5+β6*α6+β7*α7; Eq. (44)
βj is 1 or 0 only in the above derivation, so equalities hold for βj*βj=βj, βj+βj=0.
These Two terms may be swapped without influencing the final result,
Multiply two terms together in Eq. (42), it is found
(β7*α14+β6*α12+β5*α10+β4*α8)+β7*α7+(β6+β3)*α6+β5*α5+(β4+β2)*α4+β3*α3+(β2+β1)*α2+β1*α=K; Eq. (45)
where the fact that β7*β7=β7 is used.
Also β7*β6+β7*β6=0; as same terms add together to yield zero in Galois Field (GF) operation.
If Galois Field GF(28) is referenced it is found that
α14=(1100 1000)=1+α+α4;
α12=(1011 0011)=1+α2+α3+α6+α7;
a10=(0010 1110)=α2+α4+α5+α6;
α8=(1011 1000)=1+α2+α3+α4;
Substitute these four values into the above equation, it is found that
β7+β6=K7; Eq. (46)
β3+β5=K6; Eq. (47)
β7+β5+β2=K4; Eq. (48)
β6+β4+β3=K3; Eq. (49)
β6+β5+β4+β2+β1=K2; Eq. (50)
β7+β1=K1; Eq. (51)
β7+β3+β4=K0; Eq. (52)
K=σ
0/σ12=K7*+7+K6*α6+K5*α5+K4*α4+K3*α3+K2*α2+K1*α1+K0 Eq. (53)
Kj (j=7 . . . 0) are coefficients of Eq (53) 8 bit symbol value, and K may be obtained from once Syndrome Si is calculated, and σ0, σ1 values are also calculated after Si is found;
Comparing Eq (46) and (52), it is found that
β4=K7+K0; Eq. (54)
adding Eq (48), (50), and (51), and substituting β4, all double terms are eliminated because of Galois “+” is actually exclusive OR function, it follows that
β6=K7+K4+K2+K1+K0; Eq. (55)
From Eq (46), it follows that
β7=K4+K2+K1+K0; Eq. (56)
From Eq (51), it follows that
β1=K4+K2+K0; Eq. (57)
From Eq (49) and (52), it follows that
β7+β3=K3+K0; Eq. (58)
β3=K4+K3+K2+K1; Eq. (59)
From Eq (47), it follows that
β5=K6+K4+K3+K2+K1; Eq. (60)
From Eq (48), it follows that
β2=K6+K4+K3+K0; Eq. (61)
After all βj (j=1 . . . 7) are found from four syndrome formulas, Z1 is found from Eq (40), Z2 may also be found by adding 1(1000 0000) to it.
X1, X2 values are recovered by using Eq (31),
X
1=σ1*Z1;
X
2=σ1*Z2;
Y
2=(S0X1+S1)/σ1; Eq. (11)
Y
1
=S
0
+Y
2; Eq. (12)
Above Y1 and Y2 are error symbol value, and
e(x)=X1*Y1+X2*Y2; Eq. (62)
Correct code word c(x) may be obtained from Eq (1) by adding r(x) and e(x), in two error byte case, since already maximum number of error recovery limit is rerached, no erasure case is allowed, r(x)=rsup(x) c(x)=r(x)+e(x) as shown in Eq. (1).
As described hereinabove explanation, error locations X1, X2 need only be calculated from Kj, which in turn comes from syndrome value σ0/σ12 with very simple exclusive operations. It does not need either ROM expensive silicon area which is proportional to code size, or complex operations that requires lots of hardware for implementation.
It is noted that case (2) is a two error location case, so there is no room for disk crash recovery, since 2s=2t.
Exemplary implementation of the foregoing is shown relative to
After step 462 σ0 and σ1 are calculated from the syndromes S0-S3, in accordance with the foregoing equations, for the first section, at step 464 and similarly at step 475, after the step 488, the σ0 and σ1 are calculated from the syndromes S0-S3, in accordance with the foregoing equations, for the second section and at step 492, after step 490, the σ0 and σ1 are calculated from the syndromes S0-S3, for the third section of the page.
After step 464, K is calculated at step 466 based on the foregoing equations, for the first section and after the step 475, K is calculated, at step 476, for the second section of the page and after the step 492, at step 494, K is calculated for the third section of the page.
After step 466, Z1 and Z2 are calculated for the first section of the page at step 114, based on the foregoing equations, for the first section and are used to calculate X1 and X2, at step 470 after which, at step 472, Y1 and Y2 are calculated, after which at step 474, X1*Y1 is added to X2*Y2 and the sum thereof is added to R1(x). In this manner the first data segment is recovered. The notation “*” refers to the multiplication function or operator.
After step 476, Z1 and Z2 are calculated, for the second section of the page, at step 478, based on the foregoing equations, and used to calculate X1 and X2, at step 482 after which, at step 484, Y1 and Y2 are calculated, after which, at step 486, X1*Y1 is added to X2*Y2 and the sum thereof is added to R2(x). In this manner the second data segment is recovered.
After step 494, Z1 and Z2 are calculated for the third section of the page, at step 496, based on the foregoing equations, and are used to calculate X1 and X2, at step 498 after which, at step 500, Y1 and Y2 are calculated, after which, at step 502, X1*Y1 is added to X2*Y2 and the sum thereof is added to R3(x). In this manner the third data segment is recovered.
(3) K=0 and Si≠0: One error byte case, and involving several different sub-cases.
(3A) Single error byte without any disk crashes.
(3B) Single error byte with one or two disk crash which might coexist together as worst case;
First, the error byte disk position is found, and the correct value for this byte is recovered according to description hereinbelow, then the crash disk correct byte value is found. It is noted that error byte position should not be overlapped with crash disk values, since crashed disk has all sector value wrong.
(3A) If only single error occurs (No erasure occurring), and assuming error location is i1, error value is Y1, since only one error value is associated with X1,
S
0
=Y
1≠0; Eq. (17)
S
1
=Y
1
*X
1≠0; Eq. (18)
S
2
=Y
1
*X
1
2≠0; Eq. (19)
S
3
=Y
1
*X
1
3≠0; Eq. (20)
All Si≠0 Does not imply two errors, but implies that at least one error occurs.
Y1=S0; Eq. (17)
and simply changing sides of equation in equation (18) X1=S1/Y1=S1/S0; it is found that
σ1=(S1*S2+S0*S3)/(S12+S0*S2)=X13Y12+X13Y12=X1+X2=0; Eq. (21)
σ0=(S22+S1*S3)/(S12+S0*S2)=X14Y12+X14Y12=X1*X2=0; Eq. (22)
A=X
1
2
Y
1
2
+X
1
2
Y
1
2=0<=A; Eq. (23)
So σ1=σ0=0 with no erasure case implies only one error occurs, Table 1 explains σ1=0 case.
Y1=S0;
X
1
=S
1
/Y
1; From Eq. (17-18)
(3B) Single error byte with one or two disk crashes which might coexist together as worst case:
The terms for r(x) zeroed out since failing disk value is not meaningful, then Eq (23G) is used to find out one error byte location and value, first temporary syndrome Si is calculated from rsup as seen from Eq (23AF-DF), then another syndrome formula is formed as Eq (23H) by combining Γ(x) and rsup0(1), rrsup(α), rsup2(α2) and rsup3(α3), again this syndrome may be used to find out error byte position first. Details will be shown hereinbelow by way of an example. Once X1, Y1 is known,
r
new sup(x)=rSup+e(x)=rSup+XX1*Y1; Eq. (23A)
Whole error position polynomial is calculated as
Ψ(x)=Λ(x)*Γ(x)=(1−αix)*(1−βix)*(1−βjx); Eq. (23B)
where Λ(x) is the error byte location polynomial (1−αi x), and Γ(x) is the disk crash location polynomial (1−βix)*(1−βjx).
An erasure syndrome value may be found by
S(x)={Ψ(x)*[rnewsup0(1)+rnewsup1(α)x+rnewsup2(α2)x2+rnewsup3(α3)x3]} mod x4 (23C)
Using the four Si(1, α, α2, α3), the disk crash value is found easily, but maximum disk crash number is limited to two. Now an example is considered.
c(x)=α4x6+α5x5+α2x4+α4x3+α2x2+α3x+α2; Eq. (23D)
where c(x) is the golden coding example in GF(23) for RS(7,3) two error correction case for easier illustration. RS bytes in correct coding is a α4x3+α2x2+α3x+α2 as RS parity symbols to correct two errors maximum, and (α2x4+α5x5+a4x6) is true data code for storage. Symbol size is 3 bit per symbol for easy demonstration. In this example, there are a total of seven disks, 3 disks are data storage disks, and 4 disks are for RS recovery, also known as RS parity disks. In real application, since GF(256) or GF(28) approach is used, total disk size can be a maximum of 255 disks with only four disks for RS recovery purposes. The overhead is 1.57%, but correction capability is much better than current applications.
To verify the theory, if the four roots (1, α, α2, α3) are inserted into c(x), they should be all zeroes.
C(1)=0; Eq. (23AD)
C(α)=0; Eq. (23BD)
C(α2)=0; Eq. (23CD)
C(α3)=0; Eq. (23DD)
However due to presence of disk fail, f1 and f6 are failing, which are x6 and x terms relating to Eq (23C) and two disk crashes happen during disk locations 1 and 6, which are a x and a4x6 and should be ignored. The SSD disk controller reads Eq (23D) as
r(x)=α4x6+α5x5+α2x4+x3+α2x2+α3x+α2 Eq. (23E)
with x3 term italicized as error and two terms underlined due to disk crash.
r
sup(x)=0+α5x5+α2x4+x3+α2x2+0+α2 Eq. (23F)
is used to find temporary syndromes ri, i ranging from 0 to 3
r
sup 0(1)=α; Eq. (23AF)
r
sup 1(α)=α5; Eq. (23BF)
r
sup 2(α2)=α6; Eq. (23CF)
r
sup 3(α3)=α4; Eq. (23DF)
Then syndrome equations are used for the first error byte location calculation,
Γ(x)=(x−βi)*(x−βj)=(x+α)*(x+α6)=1+α5x+x2; Eq. (23G)
where i, j reflect 1 and 6 disk crash position inversion as disk 6 is α and disk 1 is α6. The syndrome S(x) is given as,
S(x)=[Γ(x)*(r0+r1x+r2x2+r3x3)] mod x4=[(1+α5x+x2)(α+α5x+α6x2+α4x3)] mod x4=α5x3+α2x2+αxαα; Eq. (23H)
S
0(1)=α5+α2+α+α=α3
S
1(α)=α
S
2(α2)=α4
S
3(α3)=α2
A=S
1
2
+S
0
S
2=α2+1=α6
B=S
1
S
2
+S
0
S
3=α5+α5=0
C=S
2
2
+S
1
S
3=α+α3=1
σ1=(S1*S2+S*S3)/(S12+S0*S)=B/A=X1+X2=0
σ0=(S22+S1*S3)/(S12+S0*S2)=c/A=α
σ1=0 implies this example is a single error case, because from below Table 1 only two identical roots can cause zero. Table 1 indicates addition of GF(8) combinations.
We know only one error occurs during receiving and
Then 1, α, α2, α3 are inserted into above Eq (23I) to find the following 3rd syndrome terms,
Λ(x)*Γ(x)*(s′0+s′1(α)x+s′2(α2)x+s′3(α3)x)mod x4=[(1−αix)*(1−βix)*(1−βjx)*(s′0+s′1(α)x+s′2(α2)x+s′3(α3)x)] mod x4;
=[(1−α3x)(1−αx)(1−α6x)(α6+α6x+α3x2+α5x3)] mod x4
S′1(x) is from Eq. (23I)
S′
0(1)=α6;
S′
1(α)=α6;
S′
2(α2)=α3;
S′
3(α3)=α5;
A′=S′
1
2
+S′
0
S′
2=α3
B′=S′
1
S′
2
+S′
0
S′
3=α
C′=S′
2
2
+S′
1
S′
3=α3
σ1=B′/A′=α5=X1+X2;
σ0=c′/A′=1=X1*X2;
Once the two disk crash positions are substituted, the failing disk bytes are recovered easily as:
f
6=(S0X1+S1)/σ1=(α6*α+α6)/α6=(1+α6)/α5=α2/α5=α4;
f
1
=S
0
+Y
2=α3;
e(x)=α3x+α4x6; from above derivations since
r
new sup(x)=α2+α2x2+α4x3+α2x4+α5x5 Eq. (23I)
where α4x3 is the error byte term, and
Eq. (23J) is identical to Eq. (23D). In total three syndrome calculations are performed for the syndrome, temporary syndrome and erasure syndrome in the embodiment of the present invention described hereinabove.
Conventional calculations for erasure byte values of disk crash cases are given by:
where i is the fail disk number and Ψ′(x) is the differentiation of two location polynomial product Eq (23B). Since the GF(x) differentiation is difficult to carry out, as can be demonstrated with a simple example given by:
Ψ(x)=α6+α5x+α6x2+αx3,
Ψ′(x)=0+α5+0+αx2,
odd xn coefficient of Ψ(x) need to be extracted and the original coefficients copied down as one order down Xn-1, since even xn must be zero by GF algebra; However this differentiation is difficult for hardware ASIC to implement;
For simple calculation without involving differential of Ψ(x), the syndrome calculation is repeated again based on Eq. (23A), and the same algorithm is applied to find sets of S0, S1, S2, S3 on Eq. (23A) to determine Xi, Xj and corresponding fail bytes fi and fj as disk crash byte values.
C(x)=rsup(x)+XX1*Y1+fi*Xi+fj*Xj; Eq. (23K)
where rsup(x) is obtained from Eq. (23F) by suppressing failing disk terms. C(X) is correct bytes that should be returned to the host as requested value.
Failing disks may be known easily once diagnosed, before the host talks to the SD Array. Once it is known which disk is failing, an erasure disk number is reported to the RS algorithm if flash memory sector ECC (or sector parity on each page within flash memory block) is not consistent with sector data.
Initially two disk crash positions are calculated and crash bytes are found by suppressing other two disk positions by zeroing out.
Then same theory is applied again to find out the rest of two disk crash byte values again. This is easier understood by an example:
c(x)=α4x6+α5x5+α2x4)+α4x3+α2x2+α3x+α2; Eq. (23D)
where c(x) is the golden code without any errors, and parenthesis contains the message data disk value.
Assuming f1, f2, f3, f4, i.e. four disks have failed (underlined), and rsup(x) is received as the partial value besides the four disk crash cases that have already occurred.
S
1(x)=rsup(x)=α4x6+α5x5+0+0+0+0+α2; Eq. (24)
1st syndrome value S1(x) may be obtained as
r
sup 0(1)=α6; Eq. (24A)
r
sup 1(α)=α2; Eq. (24B)
r
sup 2(α2)=α; Eq. (24C)
r
sup 3(α3)=α3; Eq. (24D)
where disk crash location polynomial equals
where i reflects four disk crash position inversions for example disk 3 is α3 and disk 4 is α4, and 2nd syndrome S(x) is given by:
Γ′(x)=α3*/ΣΠβ≠γ(1/β−1/γ), (24H),
x is location of crash disk position,
for example:
Γ′(α)=α3*/(1/α−1/α2)*(1/α−1/α3)*(1/α−1/α4);
Γ′(α2)=α3*/(1/α−1/α2)*(1/α2−1/α3)*(1/α2−1/α4);
Γ′(α3)=α3*/(1/α3−1/α2)*(1/α3−1/α)*(1/α3−1/α4);
Γ′(α4)=α3*/(1/α4−1/α2)*(1/α4−1/α3)*(1/α4−1/α);
and fj=[αj*S(αj)]/Γ′(x) (24I),
where β, γ are location of the four disk crash locations, fj is each disk error magnitude recovered.
f
1=(α*α6)/α4=α3;
f
2=(α2*α6)/α6=α2;
f
3=(α3*α6)/α5=α4;
f
4=(α4*α6)/α=α2;
and e(x)=f1x+f2x+f3x+f4x
=α3x+α2x2+α4x3+α2x4;
adding to Eq (24)
r
sup(x)=α4x6+α5x5+0+0+0+0+α2; Eq. (24)
adding e(x) it is found that
c(x)=α4x6+α5x5+α2x4+α4x3+α2x2+α3x+α2; Eq. (23D)
where c(x) is golden code without any errors.
Apparently, the differential value is needed to be calculated so as to find Γ′(x), however it is only needed to feed into the error locations reported back by the host,
It may be done with a table look up, or simple logic addition can fulfill the task.
Although the present invention has been described in terms of specific embodiment, it is anticipated that alterations and modifications thereof will no doubt become apparent to those more skilled in the art. It is therefore intended that the following claims be interpreted as covering all such alterations and modification as fall within the true spirit and scope of the invention.
This application is a continuation-in-part (CIP) of the co-pending application “Electronic Data Flash Card with Various Flash Cells”, U.S. application Ser. No. 11/864,671 filed Sep. 28, 2007, which is a CIP of “Flash Memory Controller for Electronic Data Flash Card”, U.S. application Ser. No. 11/466,759 filed Aug. 23, 2006. This application is also a continuation-in-part (CIP) of the co-pending application for “Electronic Data Storage Medium with Fingerprint Verification Capability”, U.S. Ser. No. 11/624,667 filed Jan. 18, 2007, which is a divisional application of U.S. patent application Ser. No. 09/478,720, filed on Jan. 6, 2000, which has been petitioned to claim the benefit of CIP status of one of inventor's earlier U.S. patent applications for “Integrated Circuit Card with Fingerprint Verification Capability”, U.S. application Ser. No. 09/366,976, filed Aug. 4, 1999, now issued as U.S. Pat. No. 6,547,130.
Number | Date | Country | |
---|---|---|---|
Parent | 09478720 | Jan 2000 | US |
Child | 11624667 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 11864671 | Sep 2007 | US |
Child | 11932461 | US | |
Parent | 11466759 | Aug 2006 | US |
Child | 11864671 | US | |
Parent | 11624667 | Jan 2007 | US |
Child | 11466759 | US | |
Parent | 09366976 | Aug 1999 | US |
Child | 09478720 | US |