Embodiments described herein relate generally to an error correction circuit of a nonvolatile semiconductor memory device, for example, a NAND flash memory.
For example, as a NAND flash memory, a multilevel NAND flash memory, which can store data of a plurality of bits in one memory cell, has been developed with an increase in storage capacity. In addition, in accordance with an increase in storage capacity, a data error correction technique for the NAND flash memory has become important.
In general, according to one embodiment, an error correction circuit includes a first memory module, a read-out module, a first arithmetic module, a first register, a detector, a second arithmetic module, and a transfer module. The first memory module is configured to store logarithmic likelihood ratio data to which low density parity check codes (LDPC) data has been converted. The read-out module is configured to read out, from the first memory module, the logarithmic likelihood ratio data of a plurality of variable nodes which are connected to a selected check node, based on a check matrix. The first arithmetic module is configured to calculate a plurality of second reliability data, based on the logarithmic likelihood ratio data, which is read out of the first memory module, of the plurality of variable nodes connected to the selected check node, and first reliability data. The first register is configured to store the plurality of second reliability data. The detector is configured to detect a minimum value of the plurality of second reliability data stored in the first register. The second arithmetic module is configured to execute an arithmetic operation of the second reliability data and the minimum value which is output from the detector, and to output an arithmetic result as the logarithmic likelihood ratio data which has been updated. The transfer module is configured to transfer the updated logarithmic likelihood ratio data, which is supplied from the second arithmetic module, to the first memory module.
For example, a NAND flash memory includes a low density parity check (LDPC) decoder for error correction. The LDPC decoder has such a feature that the decoding capability improves as the code length increases. Thus, the code length of the LDPC codes, which are used in, for example, a NAND flash memory, is on the order of, e.g. 10 Kbits.
Referring to
To begin with, a description is given of LDPC codes and partial parallel processing in an embodiment. LDPC codes are linear codes which are defined by a very sparse check matrix, that is, a check matrix including a small number of non-zero elements in the matrix, and can be represented by a Tanner graph. An error correction process corresponds to updating by exchanging locally estimated results between bit nodes (also referred to as “variable nodes vn”), which correspond to bits of a code word, and check nodes corresponding to respective parity check formulae, the bit nodes and the check nodes being connected on the Tanner graph.
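By way of a non-limiting illustration, the relationship between a sparse check matrix, the edges of the Tanner graph, and the parity check formulae can be sketched as follows. The matrix `H` and the code word below are hypothetical values chosen only for this sketch and are not taken from any embodiment; the function names are likewise illustrative.

```python
# Hypothetical 3x6 check matrix (illustrative only, not from the embodiments).
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def tanner_edges(h):
    """For each check node (row), list the connected bit nodes (columns)."""
    return [[col for col, bit in enumerate(row) if bit] for row in h]

def syndrome(h, word):
    """Evaluate every parity check formula over GF(2); 0 means the check passes."""
    return [sum(word[vn] for vn in vns) % 2 for vns in tanner_edges(h)]

codeword = [1, 0, 1, 1, 1, 0]    # satisfies all three checks of H
corrupted = [1, 1, 1, 1, 1, 0]   # same word with bit node 1 inverted

print(tanner_edges(H))
print(syndrome(H, codeword))
print(syndrome(H, corrupted))
```

In this sketch, each non-zero element of `H` corresponds to one edge of the Tanner graph, and a non-zero syndrome entry marks a check node whose parity check formula is not satisfied.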
As illustrated in
Decoding of LDPC encoded data is executed by repeatedly updating reliability (probability) information, which is allocated to the edges of the Tanner graph, at the nodes. The reliability information is classified into two kinds, i.e. probability information from a check node to a bit node (hereinafter also referred to as “external value” or “external information”, and expressed by symbol “α”), and probability information from a bit node to a check node (hereinafter also referred to as “prior probability”, “posterior probability”, or simply “probability”, or “logarithmic likelihood ratio (LLR)”, and expressed by symbol “β” or “λ”). The reliability update process comprises a row process and a column process. A unit of execution of a single row process and a single column process is referred to as “1 iteration (round) process”, and a decoding process is executed by a repetitive process in which the iteration process is repeated.
As described above, the external value α is the probability information from the check node to the bit node at a time of the LDPC decoding process, and the probability β is the probability information from the bit node to the check node. These terms are well known to a person skilled in the art.
In a semiconductor memory device, threshold determination information is read out from a memory cell which stores encoded data. The threshold determination information comprises a hard bit (HB) which indicates whether the stored data is “0” or “1”, and a plurality of soft bits (SB) which indicate the likelihood of the hard bit. The threshold determination information is converted to an LLR by an LLR table which is prepared in advance, and becomes an initial LLR of the iteration process.
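The conversion from threshold determination information to an initial LLR may be pictured, purely as a sketch, as a table lookup. The table contents, the use of two soft bits, and the sign convention (positive LLR for a hard bit of 0) are assumptions made for illustration; an actual LLR table would be prepared in advance for the specific device.

```python
# Hypothetical magnitude table: soft-bit pattern -> |LLR|.
# Assumption: more soft bits set means a more reliable hard bit.
MAGNITUDE_TABLE = {
    (0, 0): 1,
    (0, 1): 5,
    (1, 0): 5,
    (1, 1): 15,
}

def initial_llr(hard_bit, soft_bits):
    """Convert one cell's (HB, SB) read result into an initial LLR.

    Assumed sign convention: hard bit 0 -> positive LLR, hard bit 1 -> negative.
    """
    magnitude = MAGNITUDE_TABLE[tuple(soft_bits)]
    return magnitude if hard_bit == 0 else -magnitude

print(initial_llr(0, (1, 1)))
print(initial_llr(1, (0, 0)))
```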
A decoding process by parallel processing can be executed in a reliability update algorithm (decoding algorithm) at bit nodes and check nodes, with use of a sum product algorithm or a mini-sum product algorithm.
However, in the case of LDPC encoded data with a large code length, a complete parallel processing, in which all processes are executed in parallel, is not practical since many arithmetic circuits need to be mounted.
By contrast, if a check matrix, which is formed by combining a plurality of unit matrices (hereinafter also referred to as “blocks”), is used, the circuit scale can be reduced by executing partial parallel processing by arithmetic circuits corresponding to a bit node number p of a block size p.
As illustrated in
As shown in
A bit, which has been shifted out of a block by a shift process, is inserted in a leftmost column in the block. In the decoding process using the check matrix H3, necessary block information, that is, information of nodes to be processed, can be obtained by designating shift values. In the meantime, in the check matrix H3 comprising blocks each with 5×5 elements, the shift value is any one of 0, 1, 2, 3 and 4, except for the 0 matrix which has no direct relation to the decoding process.
In the case of using the check matrix H3 in which square matrices each having a block size 5×5 (hereinafter referred to as “block size 5”) shown in
When decoding is executed by using the check matrix H3 which is formed by combining a plurality of unit matrices, if plural TMEM variables, which are read from the TMEM, are rotated by a rotater 113A in accordance with shift values, there is no need to store the entirety of the check matrix H3.
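The reason why a shift value can stand in for a stored block may be sketched as follows. The sketch assumes that a block with shift value s is a unit matrix whose rows are cyclically shifted to the right by s, with bits leaving the right edge re-entering at the leftmost column; the function names are illustrative and do not denote circuits of the embodiments.

```python
def shifted_unit_block(p, shift):
    """p x p unit matrix cyclically shifted right by `shift` columns."""
    return [[1 if col == (row + shift) % p else 0 for col in range(p)]
            for row in range(p)]

def rotate(values, shift):
    """Rotate a list of variables so that entry `row` receives values[(row + shift) % p]."""
    p = len(values)
    return [values[(row + shift) % p] for row in range(p)]

def select_with_block(block, values):
    """Pick, for each row of the block, the variable marked by its single non-zero element."""
    return [sum(bit * v for bit, v in zip(row, values)) for row in block]

values = [10, 11, 12, 13, 14]
block = shifted_unit_block(5, 2)

# Rotating by the shift value gives the same result as applying the stored block.
print(select_with_block(block, values))
print(rotate(values, 2))
```

Because both calls yield the same ordering, only the shift value of each block needs to be retained in this model, rather than the block itself.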
For example, as illustrated in
As illustrated in
On the other hand, as shown in
As illustrated in
As has been described above, before variables which have been read out of the LMEM 112 or TMEM 114, are input, the rotater 113A rotates the variables with a rotate value corresponding to the shift value of the block. In the case of the memory controller 103 using the check matrix H3 of the block size 8, the maximum rotate value of the rotater 113A is “7” that is “block size−1”. If the quantifying bit number of reliability is “u”, the bit number of each variable is “u”. Thus, the input/output data width of the rotater 113A is “8×u” bits.
In the meantime, the memory (LMEM) that stores a logarithmic likelihood ratio (LLR), which represents the likelihood of data read out of the NAND flash memory by quantizing the likelihood by 5 to 6 bits, needs to have a memory capacity which corresponds to a code length×a quantizing bit number. From the standpoint of optimization of cost, the LMEM functioning as a large-capacity memory is necessarily implemented with a static RAM (SRAM). Accordingly, the arithmetic algorithm and hardware of the LDPC decoder for a NAND flash memory are optimized, in general, on the presupposition of the LMEM that is implemented with an SRAM. As a result, a unit block base parallel method, in which the LLRs are accessed by sequential addresses, is generally used.
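The sizing relations stated above (rotater data width of block size × quantizing bit number, and LMEM capacity of code length × quantizing bit number) can be checked with a short calculation. The concrete figures used below, such as a 6-bit quantization and 10 Kbits taken as 10,000 bits, are assumptions for a round-number example only.

```python
def rotater_width_bits(block_size, quant_bits):
    """Input/output data width of a rotater handling one block of variables."""
    return block_size * quant_bits

def lmem_capacity_bits(code_length_bits, quant_bits):
    """Capacity needed to hold one quantized LLR per code bit."""
    return code_length_bits * quant_bits

print(rotater_width_bits(8, 6))          # block size 8, 6-bit variables
print(lmem_capacity_bits(10_000, 6))     # 10 Kbits assumed to be 10,000 bits
```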
However, the unit block base parallel method has a complex arithmetic algorithm, and requires a plurality of rotaters of large-scale logics (large-scale wiring areas). The provision of plural rotaters poses a problem in increasing the degree of parallel processing and the processing speed.
Referring to
As illustrated in
An arithmetic module 13 reads LLRs of unit blocks from the LMEM 12, executes an arithmetic operation on the LLRs, and writes the LLRs back into the LMEM 12. There are provided arithmetic modules 13 corresponding to the unit block size (i.e. corresponding to four variable nodes (hereinafter also referred to simply as “vn”)). In this example, the frame length is 12 bits and is short. However, for example, if the frame length increases to as large as 10 Kbits, because of the address management of the LMEM 12, such an architecture is adopted that LLRs of variable nodes vn with sequential addresses are accessed together from the LMEM 12 and the accessed LLRs are subjected to arithmetic operations. When the LLRs of variable nodes vn with sequential addresses are accessed together, the LLRs are accessed in units of a base block and processing is executed (“unit block parallel method”). At this time, in order to programmably select 4 variable nodes vn belonging to a basic block connected to a check node cn, the above-described rotater is provided.
The rotater includes a function of arbitrarily selecting four 6-bit LLRs with respect to a certain check node cn, if the quantizing bit number is 6 bits. Since the block size of an actual product is, e.g. 128×128 to 256×256, the circuit scale and wiring area of the rotater become enormous.
In loop 2, β is read out from the LMEM 12, α1 and α2, which have been calculated in loop 1, are added to the read-out β, and the resultant is written back to the LMEM 12 as a new LLR. This operation is executed in parallel for four vn at a time, and the parallel processing is repeatedly executed three times for the process of one row. Thereby, the update of LLRs of all vn is completed.
By executing the processes of the loop 1 and loop 2 for one row, one iteration (hereinafter also referred to as “ITR”) is finished. At a stage at which 1 ITR is finished, if the parity of all check nodes cn passes, the correction process is successfully finished. If the parity is NG, the next 1 ITR is executed. If the parity fails to pass even if ITR is executed a predetermined number of times, the correction process terminates in failure.
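The iteration control described above (repeat 1 ITR until the parity of all check nodes passes or a predetermined number of iterations is reached) may be sketched as follows. The callbacks `run_iteration` and `all_parity_pass` are hypothetical placeholders for the row/column processing and the parity check; they are not defined by the embodiments.

```python
def run_decoding(llrs, run_iteration, all_parity_pass, max_iterations):
    """Repeat 1-ITR processing until all parity checks pass or the limit is hit.

    Returns (success, iterations_executed, final_llrs).
    """
    for itr in range(1, max_iterations + 1):
        llrs = run_iteration(llrs)        # one iteration (loop 1 and loop 2)
        if all_parity_pass(llrs):
            return True, itr, llrs        # correction finished successfully
    return False, max_iterations, llrs    # correction terminates in failure

# Toy stand-ins, used only to exercise the control flow.
toy_iteration = lambda values: [v + 1 for v in values]
toy_parity = lambda values: all(v >= 0 for v in values)

print(run_decoding([-2, 1], toy_iteration, toy_parity, max_iterations=5))
print(run_decoding([-2, 1], toy_iteration, toy_parity, max_iterations=1))
```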
(1) Row process of vn0, 1, 2, 3 belonging to column block 0 (calculation of β, α1 and α2 and parity check of cn0, 1, 2, 3).
(2) Row process of vn4, 5, 6, 7 belonging to column block 1.
(3) Row process of vn8, 9, 10, 11 belonging to column block 2.
(4) Column process of vn0, 1, 2, 3 belonging to column block 0 (LLR update).
(5) Column process of vn4, 5, 6, 7 belonging to column block 1.
(6) Column process of vn8, 9, 10, 11 belonging to column block 2.
The processing efficiency of the above-described unit block parallel method is low, since LLR update processes for all vn are not completed unless the column process and row process are executed by different loops. The essential reason for this is that a retrieval process of the LLR minimum value of variable nodes vn belonging to a certain check node, and a retrieval process of the next minimum value cannot be executed at the same time as the LLR update process. As a result, the circuit scale increases, the power consumption increases, and the cost performance deteriorates.
In addition, in order to access LLRs of vn of one block, it is necessary to access the large-capacity LMEM each time, and the power consumption by the LMEM 12 increases. Since the LMEM 12 is constructed by the SRAM, power is consumed not only at a time of write but also at a time of read.
Furthermore, since the LMEM 12 is read twice and written twice, power consumption increases.
Besides, an LDPC decoder circuit for a multilevel (MLC) NAND flash memory, which stores data of plural bits in one memory cell, is designed on the presupposition of a defective model in which a threshold voltage of a cell shifts. Thus, such an error (hereinafter referred to as “hard error (HE)”) is not assumed that a threshold voltage shifts beyond 50% of an interval between threshold voltages, or a threshold voltage shifts beyond a distribution of neighboring threshold voltages. If such defects occur frequently, the correction capability lowers. The reason for this is that since a threshold voltage at a time of read does not necessarily exist near a boundary of a determination area, such a case occurs that the logarithmic likelihood ratio absolute value (|LLR|), which is the index of likelihood of a determination result of the threshold voltage, increases, despite the data read being erroneous.
In a first embodiment, the efficiency of an arithmetic process is improved, cost performance is improved, and degradation of correction capability by a hard error is suppressed.
The first embodiment relates to an LDPC decoder circuit for a NAND flash memory, which includes a memory (LMEM) which stores logarithmic likelihood ratio conversion data (LLR) of LDPC frame data. A check matrix is composed of M*N unit blocks with M rows and N columns. The LDPC decoder circuit includes a process unit for pipeline-processing an LLR update process (vn process of cn base) of variable nodes vn which are connected to a selected check node cn. The LDPC decoder circuit further includes a process unit for parallel-processing vn processes of a cn base of some check nodes cn. At a time of parallel processing, vn processes per 1 cn can be executed by one cycle.
On the other hand, in the first embodiment, all variable nodes vn, which are connected to a check node cn, are simultaneously read out. Specifically, LLRs of variable nodes vn, which are connected to a check node cn belonging to row i=1, are read out of the LMEM, and a matrix process is executed. Specifically, a β arithmetic operation and an α arithmetic operation are simultaneously executed (steps S11, S12). Then, the value of row “i” is incremented, and the process of step S12 is executed for all rows (steps S13, S14, S12).
The present embodiment differs from the example of
As illustrated in
In the case where the LMEM 12 is composed of a single module, as shown in
(1) A matrix process (LLR update) of vn0, 5, 10 connected to cn0
(2) A matrix process (LLR update) of vn1, 6, 11 connected to cn1
(3) A matrix process (LLR update) of vn2, 7, 8 connected to cn2
(4) A matrix process (LLR update) of vn3, 4, 9 connected to cn3.
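The connections listed in (1) to (4) are consistent with one row of three column blocks of block size 4 having shift values 0, 1 and 2. These shift values are inferred here only because they reproduce the listed connections; they are an assumption of this sketch, as is the function name.

```python
def connected_vns(cn, shifts, block_size):
    """Variable nodes connected to check node `cn` in one row of shifted unit blocks.

    Column block `col` holds vn numbers col*block_size .. col*block_size + block_size - 1.
    """
    return [col * block_size + (cn + shift) % block_size
            for col, shift in enumerate(shifts)]

SHIFTS = (0, 1, 2)   # assumed shift values of column blocks 0, 1, 2
BLOCK_SIZE = 4

for cn in range(BLOCK_SIZE):
    print(f"cn{cn}:", connected_vns(cn, SHIFTS, BLOCK_SIZE))
```

Under this model, each check node draws exactly one variable node from each non-zero block, which is why the number of LLRs read per check node equals the row weight.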
With substantially the same circuit scale as in the prior art, about 1.5 times to 2 times higher speed can be achieved, and the cost performance can greatly be improved.
In the meantime, the decoding algorithm of the first embodiment becomes the same as in the example of
In the case of the first embodiment, the order of update of LLRs is different from the example of
Specifically, as illustrated in
In
The LMEMs 12-1 to 12-n are configured as modules for respective columns. The number of LMEMs 12-1 to 12-n, which are disposed, is equal to the number of columns. The LMEMs 12-1 to 12-n are implemented, for example, as registers, and each of the LMEMs 12-1 to 12-n is composed with, for example, a block size×6 bits.
The arithmetic units 13-1 to 13-m are arranged in accordance with not the number of columns but the row weight number m. The number of blocks (non-zero blocks), in which a shift value is not “0”, corresponds to the row weight number. Specifically, since the LLR of one variable node vn is read out from one non-zero block, it should suffice if the number of arithmetic units is m.
The data bus control circuit 32 executes dynamic allocation as to which of LLRs of variable nodes vn of column blocks is to be taken into which of the arithmetic units 13-1 to 13-m, according to which of the sequentially ordered rows is to be processed by the arithmetic units 13-1 to 13-m. By this dynamic allocation, the circuit scale of the arithmetic units 13-1 to 13-m can be reduced.
The column-directional logic circuit 15 includes, for example, a controller 15-1, an intermediate value memory such as TMEM 15-2, and a memory 15-3. The controller 15-1 controls the operation of the LDPC decoder 21, and is composed of a sequencer.
The intermediate value memory (TMEM) 15-2 stores intermediate value data, for instance, α (α1, α2) of ITR, a sign of α of each vn (sign information of α, which is added to all vn connected to check node cn), INDEX, and a parity check result of each check node cn. Incidentally, the α sign of each vn will be described later.
The memory 15-3 stores, for example, a check matrix or an LLR conversion table (to be described later).
The controller 15-1 delivers vn addresses to the LMEM 12-1 to LMEM 12-n in accordance with a block shift value. Thereby, LLRs of variable nodes vn corresponding to the weight number of the row, which is connected to the check node cn, can be read out from the LMEM 12-1 to LMEM 12-n.
The minimum value detection circuit 14-1, which is provided in the row-directional logic circuit 14, retrieves, from the arithmetic results of the arithmetic units 13-1 to 13-m, the minimum value and next minimum value of the absolute values of the LLRs connected to the check node cn. The parity check circuit 14-2 checks the parity of the check node cn. The LLRs of all variable nodes vn, which are connected to the read-out check node cn, are supplied to the minimum value detection circuit 14-1 and parity check circuit 14-2.
The arithmetic units 13-1 to 13-m generate β (logarithmic likelihood ratio) by calculation using the LLR data read out of the LMEMs 12-1 to 12-n, an intermediate value, for instance, α (α1 or α2) of the previous ITR, and the sign of α of each vn, and further calculate updated LLR′ from the generated β and the intermediate value (output data α of the minimum value detection circuit 14-1 and the cn parity check result). The updated LLR′ is written back to the LMEMs 12-1 to 12-n.
Data, which has been read out of a NAND flash memory (not shown), is delivered to a data buffer 30. This data is data to which parity data is added, for example, in units of a frame, by an LDPC encoder (not shown). The data stored in the data buffer 30 is delivered to an LLR conversion table 31. The LLR conversion table 31 converts the data, which has been read out of the NAND flash memory, to logarithmic likelihood ratio data. The data, which has been output from the LLR conversion table 31, is supplied to the LMEMs 12-1 to 12-n.
The LMEMs 12-1 to 12-n are connected to first input terminals of β arithmetic circuits 13a, 13b and 13c via the data bus control circuit 32. The data bus control circuit 32 is a circuit which executes dynamic allocation, and executes control as to which of LLRs of variable nodes vn of column blocks is to be supplied to which of the arithmetic units.
The β arithmetic circuits 13a, 13b and 13c constitute parts of the arithmetic units 13-1 to 13-m. In the case of the example shown in
The TMEM 15-2 stores intermediate value data, for instance, α1 and α2 of the previous ITR, a sign of α of each variable node vn, INDEX, and a parity check result of each check node cn.
The β arithmetic circuits 13a, 13b and 13c execute arithmetic operations between the LLR data, which is supplied from the LMEMs 12-1 to 12-n, and the intermediate value data which is supplied from the TMEM 15-2.
Output terminals of the β arithmetic circuits 13a, 13b and 13c are connected to a first β register 34. The first β register 34 stores output data of the β arithmetic circuits 13a, 13b and 13c.
Output terminals of the first β register 34 are connected to the minimum value detection circuit 14-1 and parity check circuit 14-2. Output terminals of the minimum value detection circuit 14-1 and parity check circuit 14-2 are connected to the TMEM 15-2 via a register 35.
The output terminals of the first β register 34 are connected to one-side input terminals of LLR′ arithmetic circuits 13d, 13e and 13f via a second β register 36 and a third β register 37. The second β register 36 stores output data of the first β register 34, and the third β register 37 stores data of the second β register 36.
The second β register 36 and third β register 37 are disposed in accordance with the number of stages of the pipeline which is constituted by the minimum value detection circuit 14-1, parity check circuit 14-2 and register 35.
The LLR′ arithmetic circuits 13d, 13e and 13f constitute parts of the arithmetic units 13-1 to 13-m, and are composed of three arithmetic circuits, like the β arithmetic circuits 13a, 13b and 13c. The other-side input terminals of the LLR′ arithmetic circuits 13d, 13e and 13f are connected to an output terminal of the register 35.
The LLR′ arithmetic circuits 13d, 13e and 13f execute an arithmetic operation between the data β, which is output from the third β register 37, and the intermediate value which is supplied from the register 35, and output updated LLR's.
First output terminals of the LLR′ arithmetic circuits 13d, 13e and 13f are connected to input terminals of an LLR′ register 39, and second output terminals thereof are connected to the TMEM 15-2 via a register 38.
The LLR′ register 39 stores updated LLR's which are output from the LLR′ arithmetic circuits 13d, 13e and 13f. Output terminals of the LLR′ register 39 are connected to the LMEMs 12-1 to 12-n.
The register 38 stores INDEX data which is output from the LLR′ arithmetic circuits 13d, 13e and 13f. The register 38 is connected to the TMEM 15-2.
The above-described LMEMs 12-1 to 12-n, the β arithmetic circuits 13a, 13b and 13c functioning as first arithmetic modules, the first β register 34, the register 35, the second β register 36, the third β register 37, the LLR′ arithmetic circuits 13d, 13e and 13f functioning as second arithmetic modules, and the LLR′ register 39 are included in each stage of the pipeline, and these circuits are operated by clock signals (not shown).
The LDPC decoder 21 executes, in a 1-row process, processes of check nodes cn, the number of which corresponds to the block size number. To begin with, LLR data of variable nodes vn is read out of the LMEMs 12-1 to 12-n, a matrix process is executed on the LLR data, and the content of the LLR data is updated. The updated LLR data is written back to the LMEMs 12-1 to 12-n. This series of processes is successively executed on the plural check nodes cn by a pipeline. In this embodiment, 1-row blocks are processed by five pipeline stages.
Next, referring to
To start with, LLR data is read out of the LMEMs 12-1 to 12-n. Specifically, LLR data of variable nodes vn, which are connected to a selected check node cn, is read out of the LMEMs 12-1 to 12-n. In the case of the present embodiment, three LLR data are read out of the LMEMs 12-1 to 12-n.
Further, intermediate value data is read out of the TMEM 15-2. The intermediate value data includes α1 and α2 of the previous ITR, the sign of α of each variable node vn, INDEX, and a parity check result of each check node cn. The intermediate value data is stored in the register 33. In this case, α is probability information from a check node to a bit node and is indicative of an absolute value of β in the previous ITR, α1 is a minimum value of the absolute value, and α2 is a next minimum value (α1<α2). INDEX is an identifier of a variable node vn having a minimum absolute value of β.
The β arithmetic circuits 13a, 13b and 13c, which function as first arithmetic modules, execute arithmetic operations between the LLR data from the LMEMs 12-1 to 12-n and the intermediate value data which has been read out of the TMEM 15-2, thereby calculating β (logarithmic likelihood ratio). Specifically, each of the β arithmetic circuits 13a, 13b and 13c executes an arithmetic operation of β=(LLR data)−(intermediate value data). In this arithmetic operation, with respect to a certain variable node vn, if the absolute value of β is minimum in the previous ITR, the next minimum value α2 is subtracted from β, and if the absolute value of β is not minimum, the minimum value α1 is subtracted from β. Incidentally, the sign of the intermediate value data is determined by the sign of α for each vn.
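The β arithmetic operation may be sketched as follows. The sketch assumes that a stored sign of “0” means the intermediate value was added in the previous ITR and a stored sign of “1” means it was subtracted, so that the previous contribution is removed by the opposite operation; this encoding and the function name are assumptions for illustration.

```python
def compute_beta(llr, vn_id, alpha1, alpha2, index, alpha_sign):
    """Remove the previous ITR's contribution from an LLR to obtain beta.

    alpha2 is used for the vn that held the minimum |beta| (vn_id == index),
    alpha1 for every other vn. alpha_sign is the stored 1-bit sign of alpha
    (assumed: 0 -> alpha was added, 1 -> alpha was subtracted).
    """
    magnitude = alpha2 if vn_id == index else alpha1
    signed_alpha = -magnitude if alpha_sign == 1 else magnitude
    return llr - signed_alpha

# vn0 was the minimum in the previous ITR, so alpha2 is removed from its LLR.
print(compute_beta(llr=10, vn_id=0, alpha1=2, alpha2=5, index=0, alpha_sign=0))
# vn1 was not the minimum, so alpha1 is removed, with the stored sign applied.
print(compute_beta(llr=-7, vn_id=1, alpha1=2, alpha2=5, index=0, alpha_sign=1))
```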
The results of the arithmetic operations of the β arithmetic circuits 13a, 13b and 13c are stored in the first β register 34.
The minimum value detection circuit 14-1 calculates, from the arithmetic operation result β stored in the first β register 34, the minimum value α1 of the absolute value of β, the next minimum value α2, and the identifier INDEX of a variable node vn having a minimum absolute value of β. In addition, the parity check circuit 14-2 executes a parity check of all check nodes cn.
The detection result of the minimum value detection circuit 14-1 and the check result of the parity check circuit 14-2 are stored in the register 35.
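The detection of the minimum value, the next minimum value and INDEX, together with the parity check of one check node, may be sketched as follows. The sketch assumes that a negative β represents a hard decision of “1” and that the parity passes when the number of such decisions is even; these conventions, and the function names, are assumptions for illustration.

```python
def detect_minimum(betas):
    """Return (alpha1, alpha2, index) for the betas of one check node.

    alpha1: smallest |beta|, alpha2: next smallest |beta|,
    index: position of the vn holding alpha1. Requires at least two betas.
    """
    order = sorted(range(len(betas)), key=lambda i: abs(betas[i]))
    index = order[0]
    return abs(betas[index]), abs(betas[order[1]]), index

def parity_passes(betas):
    """Assumed convention: negative beta -> bit 1; parity passes on an even count."""
    return sum(1 for b in betas if b < 0) % 2 == 0

betas = [5, -3, 8]
print(detect_minimum(betas))
print(parity_passes(betas))
```

The two functions have no data dependency on each other, which is in keeping with operating the two circuits side by side on the same register contents.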
In addition, when the minimum value detection circuit 14-1 and parity check circuit 14-2 execute processes and the results of the processes are stored in the register 35, the data of the first β register 34 is successively transferred to the second β register 36 and a third β register 37.
Based on the check result of the parity check circuit 14-2, the LLR′ arithmetic circuits 13d, 13e and 13f functioning as second arithmetic modules execute arithmetic operations of the arithmetic operation result β, which is stored in the third β register 37, and the detection result which has been detected by the minimum value detection circuit 14-1, and generate updated LLR′ data. Specifically, the LLR′ arithmetic circuits 13d, 13e and 13f execute LLR′=β+intermediate value data (α1 or α2 calculated in stage 3). Furthermore, the LLR′ arithmetic circuits 13d, 13e and 13f generate the sign of α of each variable node vn. The sign of α of each vn is generated as follows.
If the LLR code is “0” and the result of the parity check of the check node cn is OK, β+α is calculated and the sign of α of each vn becomes “0”.
If the LLR code is “0” and the result of the parity check of the check node cn is NG, β−α is calculated and the sign of α of each vn becomes “1”.
If the LLR code is “1” and the result of the parity check of the check node cn is OK, β−α is calculated and the sign of α of each vn becomes “1”.
If the LLR code is “1” and the result of the parity check of the check node cn is NG, β+α is calculated and the sign of α of each vn becomes “0”.
The sign of α of each vn is stored in the register 38.
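The four cases above reduce to a single rule: α is added when the LLR code and the parity check result “agree” (code “0” with OK, or code “1” with NG), and subtracted otherwise. A minimal sketch follows; the function name and the representation of the parity result as a boolean are assumptions for illustration.

```python
def update_llr(beta, llr_code, parity_ok, alpha):
    """Return (updated LLR', stored sign of alpha) for one variable node.

    llr_code:  the 1-bit LLR code ("0" or "1") used in the four-case rule.
    parity_ok: True when the parity check of the check node is OK, False when NG.
    alpha:     the intermediate value (minimum or next minimum) applied to this vn.
    """
    add = (llr_code == 0) == parity_ok
    if add:
        return beta + alpha, 0   # beta + alpha is calculated, sign becomes "0"
    return beta - alpha, 1       # beta - alpha is calculated, sign becomes "1"

for code, ok in [(0, True), (0, False), (1, True), (1, False)]:
    beta = 5 if code == 0 else -5
    print(code, "OK" if ok else "NG", update_llr(beta, code, ok, alpha=3))
```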
Along with the above-described operation, the intermediate value data stored in the register 35 (α1, α2, INDEX data, and the parity check result of each check node cn), and the sign of α of each vn stored in the register 38 are written in the TMEM 15-2.
The LLR′ data updated by the LLR′ arithmetic circuits 13d, 13e and 13f is stored in the LLR′ register 39, and the data stored in the LLR′ register 39 is written in the LMEMs 12-1 to 12-n.
In the case of the architecture shown in
By contrast, according to the first embodiment, it should suffice if the capacity of each of the first β register 34, second β register 36 and third β register 37, which function as buffers for temporarily storing β, is such a capacity as to correspond to the number of variable nodes vn which are connected to the check node cn. Accordingly, the capacity of each of the first β register 34, second β register 36 and third β register 37 can be reduced.
Moreover, according to the first embodiment, since the first, second and third β registers 34, 36 and 37, which temporarily store β are provided, accesses to the LMEMs 12-1 to 12-n can be halved to one-time read and one-time write. Therefore, the power consumption can greatly be reduced.
Besides, since the accesses to the LMEMs 12-1 to 12-n are halved, it is possible to avoid butting of accesses to the LMEMs 12-1 to 12-n in the pipeline process in the same row process. Thus, the apparent execution cycle number per 1 cn can be set at “1” (1 clock), and the processing speed can be increased.
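The effect of a one-clock apparent execution cycle per check node can be illustrated with a simple occupancy model of a five-stage pipeline. The model below is a sketch only: it assumes that one check node enters the pipeline per clock and advances one stage per clock, and the function name is illustrative.

```python
def pipeline_schedule(num_check_nodes, num_stages=5):
    """Per-clock occupancy: schedule[cycle][stage] is the cn in that stage, or None."""
    schedule = []
    for cycle in range(num_check_nodes + num_stages - 1):
        occupancy = []
        for stage in range(num_stages):
            cn = cycle - stage   # cn issued at clock `cn` reaches `stage` at clock cn + stage
            occupancy.append(cn if 0 <= cn < num_check_nodes else None)
        schedule.append(occupancy)
    return schedule

for cycle, occupancy in enumerate(pipeline_schedule(4)):
    print(cycle, occupancy)
```

In this model, N check nodes complete in N + 4 clocks, so for a large block size the cost approaches one clock per check node.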
Furthermore, the minimum value detection circuit 14-1 and parity check circuit 14-2 are implemented in parallel in the third stage, and the minimum value detection circuit 14-1 and parity check circuit 14-2 are operated in parallel. Thus, for example, with 1 clock, the detection of the minimum value and the parity check can be executed.
The LDPC decoder shown in the first embodiment can flexibly select the degree of parallel processing of the circuits which are needed for arithmetic operations of the check nodes cn, in accordance with the required capability.
In the meantime, it is possible to double the number of input/output ports of the LMEMs 12-1 to 12-n, instead of doubling the number of modules of the LMEMs 12-1 to 12-n.
According to the above-described second embodiment, since the parallel processing degree of check nodes cn is set at “2”, as illustrated in
Incidentally, the parallel processing degree of check nodes cn is not limited to “2”, and may be set at “3” or more.
In the above-described first and second embodiments, in order to make the description simple, the check matrix is set to be one row. However, an actual check matrix comprises a plurality of rows, for example, 8 rows, and the column weight is 1 or more, for instance, 4.
Referring to
The LDPC decoder 21 updates LLR data which has been read out of the LMEMs 12-1 to 12-n, and writes the LLR data back to the LMEMs 12-1 to 12-n.
In the case where a process has been executed by the LDPC decoder 21 by using the check matrix shown in
Specifically, in the check matrix shown in
In this case, as illustrated in
In this manner, by inserting idle cycles between row processes, butting of vn access can be avoided.
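The number of idle cycles needed between two consecutive row processes can be estimated with a simple hazard model. The model below is an assumption made for illustration and is not taken from the embodiments: each check node is issued one clock after the previous one, reads its LLRs in the clock in which it is issued, and writes them back `depth - 1` clocks later, and a read must occur strictly after the write-back of the same variable node.

```python
def idle_cycles_needed(row_a, row_b, depth=5):
    """Minimum idle clocks between row process A and row process B.

    row_a, row_b: lists of vn sets, one set per check node, in issue order.
    depth: number of pipeline stages (read in the first, write-back in the last).
    """
    last_write = {}
    for issue, vns in enumerate(row_a):
        for vn in vns:
            last_write[vn] = issue + depth - 1      # clock of write-back in row A
    needed = 0
    for j, vns in enumerate(row_b):
        read_cycle = len(row_a) + j                 # clock of read with no idle cycles
        for vn in vns:
            if vn in last_write:
                needed = max(needed, last_write[vn] - read_cycle + 1)
    return needed

row_a = [{0}, {1}, {2}, {3}]
print(idle_cycles_needed(row_a, [{3}, {0}, {1}, {2}]))   # last vn of A is needed first
print(idle_cycles_needed(row_a, [{0}, {1}, {2}, {3}]))   # same order as A
```

The two results differ only because of the order in which row B touches the variable nodes, which is the same lever that adjusting the block shift values provides.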
On the other hand, as illustrated in
For example, the block shift values of the check matrix shown in
According to the above-described third embodiment, by inserting idle cycles between row processes or by adjusting the shift values of the check matrix, the butting of variable node vn access in the LMEMs 12-1 to 12-n can be avoided.
In the fourth embodiment, LDPC correction is made with a plurality of decoding algorithms by using a result of parity check.
In the fourth embodiment, for example, when decoding is executed with a Mini-SUM algorithm, LLR is updated by making additional use of bit flipping (BF). Correction is made with a plurality of algorithms by using an identical parity check result detected from an intermediate value of LLR. Thereby, the capability can be improved without lowering an encoding ratio or greatly increasing the circuit scale.
Next, the fourth embodiment is described with reference to
In the LDPC decoder 21 shown in
When the parity check circuit 14-2 has executed parity check of check nodes cn, the flag register 41 stores a parity check result of the check nodes cn as a 1-bit flag (hereinafter also referred to as “parity check flag”) with respect to each variable node vn.
As shown in
As illustrated in
For example, in the check matrix shown in
In the second and subsequent ITR, the LLR′ arithmetic circuits 13d, 13e and 13f execute arithmetic processes in accordance with the parity check flag supplied from the flag register 41, with respect to each row block process (S42, S43).
Specifically, the LLR′ arithmetic circuits 13d, 13e and 13f execute, in addition to a normal LLR update process, a unique LLR correction process for, for example, a variable node vn with a parity check flag “1” (S44).
As the unique correction process, for example, a process according to a bit flipping (BF) algorithm is applied. Specifically, when all parity check results of three check nodes cn, which are connected to the variable node vn0, fail to pass, it is highly probable that the variable node vn0 is erroneous. Thus, correction is made in a manner to lower the absolute value of the LLR of the variable node vn0. To be more specific, the LLR′ arithmetic circuit 13d, 13e, 13f increases, by several times, the value of α which is supplied from the register 35, and updates the LLR by using this α. In this manner, the LLR of the variable node vn, which is highly probably erroneous, is further lowered.
In addition, the LLR′ arithmetic circuit 13d, 13e, 13f does not execute the unique correction process for the variable node vn with a parity check flag “0”.
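The unique correction process may be sketched as follows. The sketch assumes that the parity check flag of a variable node is “1” when every check node connected to it fails the parity check, and it uses a hypothetical multiplier of 2 for “several times”; both are assumptions for illustration, as are the function names.

```python
def parity_check_flag(cn_failed):
    """1 when all check nodes connected to the vn failed the parity check, else 0."""
    return 1 if cn_failed and all(cn_failed) else 0

def corrected_alpha(alpha, flag, boost=2):
    """Enlarge alpha for a vn whose flag is 1; leave it unchanged otherwise.

    `boost` is a hypothetical multiplier standing in for "several times".
    """
    return alpha * boost if flag == 1 else alpha

flag_vn0 = parity_check_flag([True, True, True])    # all three connected cn fail
flag_vn1 = parity_check_flag([True, False, True])   # at least one connected cn passes

print(flag_vn0, corrected_alpha(3, flag_vn0))
print(flag_vn1, corrected_alpha(3, flag_vn1))
```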
The above-described unique correction process means that a single parity check is used in the LDPC decoder, and a decoding process is executed by using both the mini-sum algorithm and applied BF algorithm.
In the meantime, in the BF decoding that is one of decoding algorithms of LDPC, LLR is not used and only the parity check result of the check node cn is used. Thus, the BF decoding has a feature that it has a high tolerance to a hard error (HE) on data with an extremely shifted threshold voltage, which has been read out of a NAND flash memory. Therefore, the BF decoding process can be added to the LDPC decoder which determines the check node cn for which parallel processing is executed by the variable node vn base, as described above.
As shown in
Incidentally, the BF decoding can be executed by using the arithmetic circuits for mini-sum as such. In the ordinary mini-sum arithmetic circuit, only the most significant bit (sign bit) of the LLR is input, and the calculation of β or the detection of the minimum value of β is not executed. It should suffice if the parity check of all check nodes cn and the update of the parity check flag are executed.
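A single round of BF decoding that uses only sign bits and parity check results may be sketched as follows. The check matrix `H_BF` is a hypothetical example with a column weight of 2, and the flipping rule (invert a bit when every check node connected to it fails) is an assumption of this sketch rather than a rule stated for the embodiments.

```python
# Hypothetical 4x6 check matrix; every column has weight 2 (illustrative only).
H_BF = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]

def bf_round(h, sign_bits):
    """One bit-flipping round. Returns (updated bits, parity check flags per vn)."""
    # Parity check of all check nodes, using only the sign bits.
    failed = [sum(sign_bits[vn] for vn, e in enumerate(row) if e) % 2 == 1
              for row in h]
    # Flag a vn when every check node connected to it failed (assumed rule).
    flags = []
    for vn in range(len(sign_bits)):
        checks = [failed[cn] for cn, row in enumerate(h) if row[vn]]
        flags.append(1 if checks and all(checks) else 0)
    # Invert the flagged bits.
    return [bit ^ flag for bit, flag in zip(sign_bits, flags)], flags

received = [0, 0, 0, 1, 0, 0]   # all-zero code word with bit node 3 in error
print(bf_round(H_BF, received))
```

No β calculation or minimum value detection appears in this sketch; only the parity checks and the flag update are exercised.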
For example, as shown in
With the above-described fourth embodiment, too, the same advantageous effects as with the first embodiment can be obtained. Moreover, according to the fourth embodiment, check nodes cn, which are connected to the same variable node vn, are processed batchwise, and sequential processes in the row direction are also executed, and furthermore the LLR is updated with an addition of the bit flipping (BF) algorithm. In this manner, by correcting an error with use of plural algorithms, the capability can be improved without lowering an encoding ratio or greatly increasing the circuit scale.
In the BF decoding, LLR is not used, and only the parity check result of the check node cn is used. Thus, since the tolerance to data with an extremely shifted threshold voltage, which has been read out of a NAND flash memory, is high, it is possible to realize ECC of a multilevel (MLC) NAND flash memory which stores plural bits in one memory cell.
The LDPC decoders described in the first to fourth embodiments process data of NAND flash memories. However, the embodiments are not limited to this example, and are applicable to data processing in communication devices, etc.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
This application claims the benefit of U.S. Provisional Application No. 61/782,919, filed Mar. 14, 2013, the entire contents of which are incorporated herein by reference.
Number | Date | Country
---|---|---
61782919 | Mar 2013 | US