CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of Korean Patent Application No. 10-2013-0143913, filed on Nov. 25, 2013, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
BACKGROUND
Inventive concepts relate to methods and apparatuses for encoding and decoding data in a memory system.
An error-correcting code (ECC) used in a memory system may be used for detecting errors, correcting the errors, and restoring data. The errors may include random errors that may not be predicted in advance and errors that may be predicted in advance. For example, if a portion of a memory is no longer functional due to a structural defect of the memory, errors may always occur in data stored in the corresponding portion of the memory. A location of the portion of the memory may be determined through a test.
If large resources of a memory system are dedicated to performing a particular ECC, the ECC may deteriorate the performance of the memory system.
SUMMARY
Provided are methods and apparatuses for encoding and decoding data in a memory system based on information regarding coordinates and values of stuck cells.
Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the presented embodiments.
According to an aspect of the inventive concept, an encoder includes a first vector generating unit configured to generate a vector u corresponding to data to be stored in a plurality of memory cells including t stuck cells, the vector u having n−s entries, where n, s, and t are integers, n is greater than or equal to s, s is greater than or equal to t, and t is greater than or equal to zero, a matrix generating unit configured to generate an encoding matrix G, a second vector generating unit configured to receive information α and information μ and generate a vector w having s entries based on the information α, the information μ, and the encoding matrix G, the information α including coordinates of the stuck cells, and the information μ including values of the stuck cells, and a third vector generating unit configured to generate a vector v having n entries based on the encoding matrix G and a vector x, the vector x being a concatenation of the vector w and the vector u, and entries of the vector v correspond to the information α and include the values of the stuck cells according to the information μ.
In an example embodiment, α={α(1), . . . , α(t)}, 1≦α(1)< . . . <α(t)≦n, μ={μ(1), . . . , μ(t)}, u=[u1 . . . un−s]T, w=[w1 . . . ws]T, x=[w1 . . . ws u1 . . . un−s]T, and v=[v1 . . . vn]T, vα(j)=μ(j), 1≦j≦t where j is an integer, the matrix generating unit is configured to generate the encoding matrix G such that G=[G1, G2]=[G11, 0s×(n−s); G21, I(n−s)×(n−s)], G1 is a n×s first encoding matrix, G2 is a n×(n−s) second encoding matrix, G11 is a s×s invertible matrix, G21 is a (n−s)×s matrix, 0s×(n−s) is a zero matrix, and I(n−s)×(n−s) is a unit matrix, the second vector generating unit is configured to generate the vector w such that [G1·w]α(j)=μ(j)−[G2·u]α(j) (1≦j≦t), which are t linear equations, and the third vector generating unit is configured to determine a vector y such that y=[y1 . . . yn]T=G1·w and v=[v1 . . . vn]T=y+[0 . . . 0 u1 . . . un−s]T.
The encoder further includes a header generating unit configured to generate a header including a label of the encoding matrix G and s.
The s×s invertible matrix G11 is a unit matrix and the (n−s)×s matrix G21 includes independently and identically distributed (i.i.d.) entries having values 0 or 1, each of the i.i.d. values having a ½ probability of being 1.
The matrix generating unit is configured to generate rows of the first encoding matrix G1 based on sample row vectors that are independently and identically distributed.
The matrix generating unit is configured to generate the first encoding matrix G1 as a transposed matrix HT of a check matrix H for binary erasure channel (BEC) codes.
According to another aspect of the inventive concept, a method for encoding data includes generating a vector u corresponding to data to be stored in n memory cells including t stuck cells, the vector u having n−s entries, where n, s, and t are integers, n is greater than or equal to s, s is greater than or equal to t, and t is greater than or equal to zero, generating a vector w having s entries based on information α and information μ, the information α including coordinates of the stuck cells and the information μ including values of the stuck cells, generating an n×n encoding matrix G, and generating a vector v having n entries based on the encoding matrix G and a vector x, the vector x being a concatenation of the vector w and the vector u, and entries of the vector v correspond to the information α and include the values of the stuck cells according to the information μ.
The encoding matrix G includes a n×s first encoding matrix G1 and a n×(n−s) second encoding matrix G2, the generating of the vector w includes determining t linear equations and employing s roots as entries of the vector w, coefficients of the t linear equations are entries included in rows of the first encoding matrix G1 corresponding to the information α, and determining constant terms of the t linear equations based on calculations of entries in rows of the second encoding matrix G2.
The generating the vector v adds a product of the first encoding matrix G1 and the vector w to a product of the second encoding matrix G2 and the vector u.
The generating the vector w further includes replacing the encoding matrix G with another encoding matrix G if s roots do not exist. The method includes generating a header including a label of the encoding matrix G.
The method further includes generating the encoding matrix G by using at least one matrix that is randomly or pseudo-randomly sampled.
The generating of the encoding matrix G includes generating a first encoding matrix G1, the first encoding matrix G1 including a s×s invertible matrix G11 and a (n−s)×s matrix G21; and generating a second encoding matrix G2 by combining a zero matrix 0s×(n−s) and a unit matrix I(n−s)×(n−s).
The generating of the first encoding matrix G1 includes configuring the s×s invertible matrix G11 as a s×s unit matrix Is×s; and configuring the (n−s)×s matrix G21 such that the (n−s)×s matrix G21 includes independently and identically distributed (i.i.d.) entries having values 0 or 1, each of the i.i.d. entries having a ½ probability of being 1.
The generating of the first encoding matrix G1 includes generating a row vector including s entries having values of 0 or 1, each of the s entries having a corresponding index, only one of the s entries having a corresponding index from 1 through s′/2 is 1, only one of the s entries having a corresponding index from (s′/2)+1 through s′ is 1, and each entry of the row vector having an index from s′+1 through s is 1 with a probability of ½, and generating the first encoding matrix G1 based on the generated row vector, s′ is an even number and s′≦s≦t.
The generating of the first encoding matrix G1 includes generating the first encoding matrix G1 as a transposed matrix HT of a check matrix H for binary erasure channel (BEC) codes.
At least one example embodiment discloses a method of processing data to be stored in a plurality of memory cells including a plurality of stuck cells, each of the stuck cells storing a value. The method includes obtaining input data and stuck cell information, the stuck cell information indicating locations of the plurality of stuck cells in the plurality of memory cells and the values of the plurality of stuck cells, generating a first vector based on the input data, generating an encoding matrix, generating a second vector based on the stuck cell information and the encoding matrix, a number of entries of the second vector being greater than or equal to a number of the plurality of stuck cells, and generating encoded data based on the first vector, the encoding matrix and the second vector, the encoded data including values corresponding to the locations of the plurality of stuck cells in the plurality of memory cells and the values of the plurality of stuck cells.
In an example embodiment, the method includes generating label information, the label information identifying the encoding matrix and the number of entries of the second vector.
In an example embodiment, the method includes decoding the encoded data using the label information.
In an example embodiment, the encoding matrix includes a first encoding matrix part and a second encoding matrix part, the second encoding matrix part including a zero matrix and a unit matrix.
In an example embodiment, the method includes storing the first encoding matrix part and an inverse of a portion of the first encoding matrix part.
BRIEF DESCRIPTION OF THE DRAWINGS
Example embodiments of inventive concepts will be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings in which:
FIG. 1 is a diagram showing a memory system according to an example embodiment of inventive concepts.
FIG. 2 is a diagram showing an example of stuck cell information storing units according to an example embodiment of inventive concepts.
FIG. 3 is a schematic block diagram showing the structure of an encoder according to an example embodiment of inventive concepts.
FIG. 4 is a diagram showing operation of the encoder according to an example embodiment of inventive concepts as equations.
FIG. 5 is a diagram for describing operation of a generating unit according to an example embodiment of inventive concepts.
FIG. 6 is a diagram for describing operations of the v generating unit and the decoder according to an example embodiment of inventive concepts.
FIG. 7 is a diagram showing an encoding matrix according to an example embodiment of inventive concepts.
FIGS. 8A through 8C are diagrams showing encoding matrixes according to example embodiments of inventive concepts.
FIG. 9 is a flowchart showing an encoding method according to an example embodiment of inventive concepts.
FIG. 10 is a flowchart showing a method of generating a column vector according to an example embodiment of inventive concepts.
FIG. 11 is a flowchart showing a method of generating a column vector according to an example embodiment of inventive concepts.
FIG. 12 shows a decoding method according to an example embodiment of inventive concepts.
FIG. 13 is a schematic block diagram showing an encoder and a decoder according to an example embodiment of inventive concepts.
FIG. 14 is a block diagram showing a computer system including a memory system according to example embodiments of inventive concepts.
FIG. 15 is a block diagram showing a memory card according to an example embodiment of inventive concepts.
FIG. 16 is a block diagram showing an example network system including a memory system according to an example embodiment of inventive concepts.
DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS
Example embodiments will now be described more fully with reference to the accompanying drawings. Many alternate forms may be embodied and example embodiments should not be construed as limited to example embodiments set forth herein. In the drawings, like reference numerals refer to like elements.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
It will be understood that when an element is referred to as being “connected” or “coupled” to another element, it can be directly connected or coupled to the other element or intervening elements may be present. In contrast, when an element is referred to as being “directly connected” or “directly coupled” to another element, there are no intervening elements present. Other words used to describe the relationship between elements should be interpreted in a like fashion (e.g., “between” versus “directly between,” “adjacent” versus “directly adjacent,” etc.).
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood, that the terms “comprises,” “comprising,” “includes” and/or “including,” when used herein, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
Unless specifically stated otherwise, or as is apparent from the discussion, terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical, electronic quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
Specific details are provided in the following description to provide a thorough understanding of example embodiments. However, it will be understood by one of ordinary skill in the art that example embodiments may be practiced without these specific details. For example, systems may be shown in block diagrams so as not to obscure the example embodiments in unnecessary detail. In other instances, well-known processes, structures and techniques may be shown without unnecessary detail in order to avoid obscuring example embodiments.
In the following description, illustrative embodiments will be described with reference to acts and symbolic representations of operations (e.g., in the form of flow charts, flow diagrams, data flow diagrams, structure diagrams, block diagrams, etc.) that may be implemented as program modules or functional processes including routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types and may be implemented using existing hardware in existing electronic systems (e.g., nonvolatile memories, universal flash memories, universal flash memory controllers, nonvolatile memories and memory controllers, digital point-and-shoot cameras, personal digital assistants (PDAs), smartphones, tablet personal computers (PCs), laptop computers, etc.). Such existing hardware may include one or more Central Processing Units (CPUs), digital signal processors (DSPs), application-specific-integrated-circuits (ASICs), field programmable gate arrays (FPGAs), computers, or the like.
Although a flow chart may describe the operations as a sequential process, many of the operations may be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. A process may be terminated when its operations are completed, but may also have additional steps not included in the figure. A process may correspond to a method, function, procedure, subroutine, subprogram, etc. When a process corresponds to a function, its termination may correspond to a return of the function to the calling function or the main function.
As disclosed herein, the term “storage medium”, “computer readable storage medium” or “non-transitory computer readable storage medium” may represent one or more devices for storing data, including read only memory (ROM), random access memory (RAM), magnetic RAM, core memory, magnetic disk storage mediums, optical storage mediums, flash memory devices and/or other tangible machine readable mediums for storing information. The term “computer-readable medium” may include, but is not limited to, portable or fixed storage devices, optical storage devices, and various other mediums capable of storing, containing or carrying instruction(s) and/or data.
Furthermore, example embodiments may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine or computer readable medium such as a computer readable storage medium. When implemented in software, a processor or processors may be programmed to perform the necessary tasks, thereby being transformed into special purpose processor(s) or computer(s).
A code segment may represent a procedure, function, subprogram, program, routine, subroutine, module, software package, class, or any combination of instructions, data structures or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, etc.
Hereinafter, the term entry refers to an element of a matrix or a vector, where an element and an entry of a matrix or a vector have the identical meaning. Furthermore, when A∈F^(N×N), B∈F^(N×M), C∈F^(M×N), and D∈F^(M×M), the following equation may be established:
FIG. 1 is a diagram showing a memory system 900 according to an example embodiment of inventive concepts. As shown in FIG. 1, the memory system 900 may communicate with a host 800 in compliance with a protocol. For example, the memory system 900 may support protocols including an advanced technology attachment (ATA) interface, a serial advanced technology attachment (SATA) interface, a parallel advanced technology attachment (PATA) interface, a universal serial bus (USB) interface, a serial attached SCSI (SAS) interface, a small computer system interface (SCSI), an embedded multimedia card (eMMC) interface, and a universal flash storage (UFS) interface. However, the above-stated interfaces are merely examples, and example embodiments of inventive concepts are not limited thereto.
As shown in FIG. 1, the memory system 900 may include a memory controller 1000 and a memory device 2000. The memory controller 1000 may control the memory device 2000, may receive a request from the host 800 outside the memory system 900, and may transmit a response to the request to the host 800. The memory device 2000 may include a non-volatile memory device, e.g., a flash memory, a phase change RAM (PRAM), a ferroelectric RAM (FRAM), a magnetic RAM (MRAM), etc.
As shown in FIG. 1, the memory device 2000 may include a cell array 2100, and the cell array 2100 may include a plurality of memory cells 2110 arranged in an array shape. For example, a flash memory may include a plurality of blocks, each of the blocks may include a plurality of pages, and the cell array 2100 may correspond to a page or a block of the flash memory.
The memory cell 2110 is capable of storing the smallest unit of data stored in the memory device 2000 and may have different states according to data stored therein. The memory device 2000 may write data by changing a state of the memory cell 2110 and may output data according to the state of the memory cell 2110. Data corresponding to the state of the memory cell 2110 is referred to as a value of the memory cell 2110. Although not shown in FIG. 1, the memory device 2000 may include peripheral circuits for controlling the cell array 2100, and the memory controller 1000 may write data to the cell array 2100 or read data from the cell array 2100 by using the peripheral circuits.
Each of the memory cells 2110 may have two or more different states, and the number of bits of data that may be stored in each of the memory cells 2110 may be determined based on the number of states each of the memory cells 2110 may have. For example, in case of a flash memory, the memory cell 2110 may include a single level cell (SLC) capable of storing 1-bit data or a multi level cell (MLC) capable of storing 2-bit data or more data, according to a distribution of threshold voltages of a transistor included in the memory cell 2110. Hereinafter, it is assumed that the memory cell 2110 may have a value of 0 or 1 and is capable of storing 1-bit data. However, example embodiments of inventive concepts are not limited thereto.
Meanwhile, as shown in FIG. 1, the cell array 2100 may include not only normal cells 2111, whose states may be changed according to data stored therein, but also stuck cells 2112, whose states may not be changed according to data stored therein. Therefore, the stuck cell 2112 may cause errors in data. The stuck cell 2112 may be formed due to manufacturing process of the memory device 2000, abnormal electric signals applied from outside, or an end-of-life of the memory cell 2110.
As shown in FIG. 1, the memory controller 1000 may include an encoder 1100, a decoder 1200, and a stuck cell information storing unit 1300. The encoder 1100 may encode data for performing an error correction code (ECC), whereas the decoder 1200 may receive encoded data, that is, a codeword and decode the codeword. According to an example embodiment of inventive concepts, the stuck cell information storing unit 1300 may store information regarding the stuck cell 2112 included in the cell array 2100 of the memory device 2000. For example, the stuck cell information storing unit 1300 may store the coordinate and the value of the stuck cell 2112 included in the cell array 2100. Detailed descriptions thereof will be given below.
The encoder 1100 may receive stuck cell information from the stuck cell information storing unit 1300 and may generate a codeword by encoding data based on the received stuck cell information. The encoder 1100 and the decoder 1200 may share encoding information (e.g., encoding matrix G) used for encoding data. In other words, the encoder 1100 may add encoding information to the header of a codeword, and the decoder 1200 may decode the codeword by using the encoding information added to the header.
The encoder 1100 and the decoder 1200 may be embodied as hardware, firmware, hardware executing software, or any combination thereof included in the memory controller 1000.
When the encoder 1100 and the decoder 1200 are hardware, such hardware may include one or more Central Processing Units (CPUs), digital signal processors (DSPs), application-specific-integrated-circuits (ASICs), field programmable gate arrays (FPGAs), computers, or the like configured as special purpose machines to perform the functions of the encoder 1100 and the decoder 1200. CPUs, DSPs, ASICs and FPGAs may generally be referred to as processors and/or microprocessors.
In the event that the encoder 1100 and the decoder 1200 are processors executing software, the processors are configured as special purpose machines to execute the software to perform the functions of the encoder 1100 and the decoder 1200. In such an embodiment, the memory controller 1000 may include one or more Central Processing Units (CPUs), digital signal processors (DSPs), application-specific-integrated-circuits (ASICs), field programmable gate arrays (FPGAs), computers, or the like.
Furthermore, although FIG. 1 shows that the memory controller 1000 includes the independent encoder 1100 and the independent decoder 1200, the encoder 1100 and the decoder 1200 may be embodied as a single element.
FIG. 2 is a diagram showing an example of the stuck cell information storing unit 1300 according to an example embodiment of inventive concepts. The cell array 2100 may include the plurality of memory cells 2110 arranged in an array shape, and the stuck cell information storing unit 1300 may store information regarding stuck cells included in the cell array 2100. For example, as shown in FIG. 2, the cell array 2100 may include the eight memory cells 2110, and two from among the eight memory cells 2110 may be stuck cells. The eight memory cells 2110 may have coordinates 1 through 8, respectively, and the coordinates of the memory cells 2110 may be addresses of the memory cells 2110.
As shown in FIG. 2, when the eight memory cells 2110 have values v1 through v8, respectively, data stored in the cell array 2100 may be expressed as a column vector v=[v1 . . . v8]T. However, as shown in FIG. 2, when the memory cells respectively corresponding to the coordinates 2 and 5 are stuck cells and respectively have fixed values of 0 and 1, the entries v2 and v5 are fixed at 0 and 1, and the column vector v stored in the cell array 2100 may be expressed as v=[v1 0 v3 v4 1 v6 v7 v8]T.
According to an example embodiment of inventive concepts, the stuck cell information storing unit 1300 may store information α regarding coordinates of stuck cells included in the cell array 2100 and information μ regarding values of the stuck cells. For example, in case of the cell array 2100 shown in FIG. 2, the stuck cell information storing unit 1300 may store information α={2, 5} as coordinates of stuck cells and may store information μ={0, 1} as values of the stuck cells corresponding to the coordinates. The encoder 1100 of FIG. 1 may receive the information α and μ from the stuck cell information storing unit 1300, encode data based on the information α and μ, and generate a column vector v corresponding to data to be stored in the cell array 2100. The column vector v generated by the encoder 1100 may include v2=0 and v5=1.
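The effect of the stuck cells in the FIG. 2 example may be illustrated with a minimal sketch. The function names write_to_cells and satisfies_stuck_cells, and the sample vector v, are hypothetical and chosen only for illustration; the sketch models a stuck cell as a cell that ignores whatever bit is written to it.

```python
def write_to_cells(v, alpha, mu):
    """Return the values the cell array actually holds after writing v.

    alpha: 1-based coordinates of the stuck cells.
    mu:    fixed values of the stuck cells, in the same order as alpha.
    """
    stored = list(v)
    for coord, value in zip(alpha, mu):
        stored[coord - 1] = value  # a stuck cell keeps its fixed value
    return stored


def satisfies_stuck_cells(v, alpha, mu):
    """True if v already carries the stuck values at the stuck coordinates."""
    return all(v[coord - 1] == value for coord, value in zip(alpha, mu))


# Stuck cell information from the FIG. 2 example.
alpha = [2, 5]
mu = [0, 1]

# A hypothetical vector written without regard to the stuck cells.
v = [1, 1, 0, 0, 0, 1, 0, 1]
stored = write_to_cells(v, alpha, mu)
```

A vector for which satisfies_stuck_cells returns True is read back unchanged, which is the property the encoder 1100 aims to give the column vector v.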
Referring to FIGS. 1 and 2, when the cell array 2100 includes n memory cells 2110 and t memory cells 2110 from among the n memory cells 2110 are stuck cells 2112 (n>t≧0), the information α regarding coordinates of the stuck cells 2112, the information μ regarding values of the stuck cells 2112, and a column vector v corresponding to data stored in the cell array 2100 may be expressed as shown below.
v=[v1 . . . vn]T ∈ F^n (F=GF(2))
α={α(1), . . . , α(t)} (α(1)< . . . <α(t))
μ={μ(1), . . . , μ(t)}
vα(j)=μ(j) (1≦j≦t)
where G is an encoding matrix generated by the encoder 1100.
FIG. 3 is a schematic block diagram showing the structure of the encoder 1100 according to an example embodiment of inventive concepts. According to an example embodiment of inventive concepts, the encoder 1100 may include a first vector generating unit, a second vector generating unit, a third vector generating unit, a matrix generating unit, and a header generating unit. For example, FIG. 3 shows an embodiment in which the first vector generating unit is a u generating unit 1111, the second vector generating unit is a w generating unit 1112, the third vector generating unit is a v generating unit 1114, and the matrix generating unit is a G generating unit 1113.
The encoder 1100 may receive input data DATA_IN and stuck cell information SC_INFO and may generate and output codeword DATA_CW and header DATA_HD. The input data DATA_IN may include data the host 800 requested the memory system 900 to write, that is, user data and metadata generated by the memory controller 1000 to manage the memory device 2000. Furthermore, the stuck cell information SC_INFO received by the encoder 1100 from the stuck cell information storing unit 1300 may include the information α regarding coordinates of stuck cells and information μ regarding values of the stuck cells.
The encoder 1100 may generate the codeword DATA_CW, and the codeword DATA_CW may include values of stuck cells at addresses corresponding to coordinates of the stuck cells. Furthermore, the encoder 1100 may generate the header DATA_HD including encoding information regarding the codeword DATA_CW, and the header DATA_HD may be stored in the memory controller 1000 or the memory device 2000 separately from the codeword DATA_CW.
According to an example embodiment of inventive concepts as shown in FIG. 3, the encoder 1100 may include the u generating unit 1111, the w generating unit 1112, the G generating unit 1113, the v generating unit 1114, and a header generating unit 1115. The u generating unit 1111 may receive the input data DATA_IN and generate a column vector u=[u1 . . . un−s]T (0≦t≦s<n). The u generating unit 1111 may select data to be stored in the cell array 2100 including the n memory cells 2110 from among the input data DATA_IN and generate the column vector u. Since the cell array 2100 includes t stuck cells, the u generating unit 1111 is unable to store data in all of the n memory cells 2110. Therefore, the u generating unit 1111 may generate the column vector u including n−s entries (s≧t). For example, in the embodiment shown in FIG. 2, the u generating unit 1111 may select data corresponding to a column vector u=[u1 u2 u3 u4 u5]T including five entries from among the input data DATA_IN and output the column vector u (s=3).
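As a rough illustration of how input data might be divided into column vectors u of n−s entries each, consider the following sketch. The function name make_u_vectors is hypothetical, and zero-padding the final vector is an assumption made here for illustration; the description does not state how a partial final vector is handled.

```python
def make_u_vectors(bits, n, s):
    """Split a bit sequence into vectors u of n - s entries each.

    Assumption: the last vector is zero-padded when the input length is
    not a multiple of n - s.
    """
    k = n - s
    if k <= 0:
        raise ValueError("n must be greater than s")
    padded = list(bits) + [0] * (-len(bits) % k)
    return [padded[i:i + k] for i in range(0, len(padded), k)]


# With n = 8 and s = 3, as in the FIG. 2 example, each u has five entries.
u_vectors = make_u_vectors([1, 0, 1, 1, 0, 1, 1], n=8, s=3)
```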
The G generating unit 1113 may generate the encoding matrix G. The encoding matrix G is an n×n matrix including n rows and n columns and may be used for generating a column vector v. The G generating unit 1113 may receive signals from the w generating unit 1112, generate a new encoding matrix G based on the signals, and output the new encoding matrix G. The G generating unit 1113 may generate an encoding matrix G randomly or pseudo-randomly. For example, the G generating unit 1113 may generate an encoding matrix G by combining at least one from among matrixes stored in a memory randomly or pseudo-randomly. Detailed descriptions thereof will be given below.
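One possible pseudo-random construction of the encoding matrix G is sketched below. It assumes the structure G=[G11, 0; G21, I] described in the summary, with G11 taken as a unit matrix and the entries of G21 drawn independently with probability ½ of being 1. The function name make_encoding_matrix and the use of an integer label as the seed of the pseudo-random generator are assumptions made for illustration only.

```python
import random


def make_encoding_matrix(n, s, label):
    """Build an n x n matrix G = [[I_s, 0], [G21, I_(n-s)]] over GF(2).

    label: hypothetical integer used to seed the generator, so that the
           same label always reproduces the same matrix.
    """
    rng = random.Random(label)
    G = [[0] * n for _ in range(n)]
    for i in range(s):
        G[i][i] = 1                      # G11: s x s unit matrix
    for i in range(s, n):
        for j in range(s):
            G[i][j] = rng.randint(0, 1)  # G21: i.i.d. entries, P(1) = 1/2
        G[i][i] = 1                      # lower-right unit matrix
    return G
```

Because the matrix is fully determined by n, s, and the label in this sketch, an encoder and a decoder that agree on the construction would need to exchange only those values rather than the matrix itself.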
The w generating unit 1112 may receive the stuck cell information SC_INFO from the stuck cell information storing unit 1300, may receive the column vector u from the u generating unit 1111, and may receive the encoding matrix G from the G generating unit 1113. A column vector w generated by the w generating unit 1112 is auxiliary data based on stuck cells included in the cell array 2100, where the column vector w may be added to the column vector u generated by the u generating unit 1111 and may be used for generating the column vector v. The column vector w generated by the w generating unit 1112 may be expressed as w=[w1 . . . ws]T. The s entries w1 through ws included in the column vector w may be determined based on the column vector u, the information α, the information μ, and the encoding matrix G that are received by the w generating unit 1112. If the w generating unit 1112 is unable to determine s entries w1 through ws based on the column vector u, the information α, the information μ, and the encoding matrix G, the w generating unit 1112 may receive a new encoding matrix G from the G generating unit 1113 and determine s entries w1 through ws by using the new encoding matrix G. Furthermore, the w generating unit 1112 may generate and output a label for the encoding matrix G used for generating the column vector w. The label is for the decoder 1200 to recognize the encoding matrix G used by the encoder 1100.
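The determination of the column vector w may be sketched as solving the t linear equations [G1·w]α(j)=μ(j)−[G2·u]α(j) over GF(2). The sketch below uses Gauss-Jordan elimination, which is one standard way to solve such a system and is not necessarily the method used by the w generating unit 1112. The function name solve_w and the sample 8×8 matrix G are hypothetical. A return value of None stands for the case in which w cannot be determined and a new encoding matrix G would be requested.

```python
def solve_w(G, u, alpha, mu, s):
    """Solve [G1.w]_alpha(j) = mu(j) - [G2.u]_alpha(j) over GF(2).

    Returns a list of s bits, or None if the equations are inconsistent.
    Free variables, if any, are set to 0.
    """
    rows = []
    for coord, value in zip(alpha, mu):
        g = G[coord - 1]
        # Over GF(2), subtraction equals addition (XOR).
        rhs = (value + sum(g[s + k] & u[k] for k in range(len(u)))) % 2
        rows.append(g[:s] + [rhs])

    pivots, r = [], 0
    for col in range(s):
        p = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1

    if any(row[s] for row in rows[r:]):
        return None  # an all-zero row with constant term 1: no solution

    w = [0] * s
    for i, col in enumerate(pivots):
        w[col] = rows[i][s]
    return w


# Hypothetical encoding matrix with n = 8, s = 3 (G11 = I, lower-right = I).
G = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 0, 0, 1],
]
u = [1, 0, 1, 1, 0]
w = solve_w(G, u, alpha=[2, 5], mu=[0, 1], s=3)
```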
The v generating unit 1114 may receive the column vector u from the u generating unit 1111 and may receive the column vector w from the w generating unit 1112. Furthermore, the v generating unit 1114 may receive the encoding matrix G used by the w generating unit 1112 to generate the column vector w from the w generating unit 1112. The v generating unit 1114 may generate the codeword DATA_CW by using the column vectors u and w and the encoding matrix G, where the codeword DATA_CW may be expressed as a column vector v=[v1 . . . vn]T. In detail, the v generating unit 1114 may generate a column vector x by concatenating the column vector w with the column vector u and may generate the column vector v by multiplying the encoding matrix G by the column vector x.
The header generating unit 1115 may receive the label of the encoding matrix G and s from the w generating unit 1112 and may generate the header DATA_HD. The header DATA_HD generated by the header generating unit 1115 may include information required for the decoder 1200 to decode the codeword DATA_CW, and the decoder 1200 may decode the codeword DATA_CW by using the information included in the header DATA_HD. For example, the decoder 1200 may recognize the encoding matrix G used for encoding the codeword DATA_CW based on the label of the encoding matrix G included in the header DATA_HD and may generate a column vector x′ by multiplying the codeword DATA_CW by an inverse matrix of the encoding matrix G. Furthermore, the decoder 1200 may extract a column vector u′ included in the column vector x′ by using s and may restore data based on the column vector u′. The header DATA_HD generated by the header generating unit 1115 may be stored in a storage space accessible by the encoder 1100 and the decoder 1200 separately from the codeword DATA_CW. For example, the header DATA_HD may be stored in a storage space included in the memory controller 1000 or a cell array included in the memory device 2000.
FIG. 4 is a diagram showing operation of the encoder 1100 according to an example embodiment of inventive concepts as equations. Referring to FIGS. 1 through 3, the u generating unit 1111 may generate a column vector u having n−s entries, the G generating unit 1113 may generate an n×n encoding matrix G, and the w generating unit 1112 may generate a column vector w having s entries. The v generating unit 1114 may generate a column vector v having n entries. As shown in FIG. 4, a column vector x having n entries may be formed by concatenating the column vector w having s entries with the column vector u having n−s entries. The column vector v may be calculated by multiplying the encoding matrix G by the column vector x.
As shown in FIG. 4, the n×n encoding matrix G may include an n×s first encoding matrix G1 and an n×(n−s) second encoding matrix G2. In the multiplication of the encoding matrix G by the column vector x, the first encoding matrix G1 may correspond to the column vector w, whereas the second encoding matrix G2 may correspond to the column vector u. Therefore, as shown in FIG. 4, the column vector v may be calculated by adding the product of the first encoding matrix G1 and the column vector w to the product of the second encoding matrix G2 and the column vector u. A column vector y refers to the product of the first encoding matrix G1 and the column vector w and may be expressed as y=[y1 . . . yn]T.
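The split of the product G·x into the two partial products described above can be illustrated with a minimal Python sketch. It assumes binary entries with arithmetic modulo 2; the function names `mat_vec_gf2` and `encode_v` and the small example matrix are hypothetical and are not taken from the figures.

```python
def mat_vec_gf2(M, x):
    """Multiply matrix M (a list of rows) by column vector x, modulo 2."""
    return [sum(m & b for m, b in zip(row, x)) % 2 for row in M]


def encode_v(G, w, u):
    """Compute v = G1*w + G2*u, where G = [G1, G2] is split column-wise."""
    s = len(w)
    G1 = [row[:s] for row in G]   # n x s block that multiplies w
    G2 = [row[s:] for row in G]   # n x (n - s) block that multiplies u
    y = mat_vec_gf2(G1, w)        # the column vector y = G1*w
    z = mat_vec_gf2(G2, u)
    return [(a + b) % 2 for a, b in zip(y, z)]


# Hypothetical 3x3 example with s = 1.
G = [[1, 0, 0],
     [1, 1, 0],
     [0, 0, 1]]
v = encode_v(G, [1], [1, 0])
```

Under these assumptions the result equals the direct product of G and the concatenation x of w and u, which is the relationship stated for FIG. 4.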
FIG. 5 is a diagram for describing operation of the w generating unit 1112 according to an example embodiment of inventive concepts. In the example embodiment shown in FIG. 2, the cell array 2100 may include the eight memory cells 2110, the memory cells 2110 corresponding to the coordinates 2 and 5 from among the memory cells 2110 may be stuck cells (α={2, 5}), and the stuck cells may have values of 0 and 1, respectively (μ={0, 1}). Therefore, a column vector v corresponding to data to be stored in the cell array 2100 may be expressed as v=[v1 0 v3 v4 1 v6 v7 v8]T. As shown in FIG. 5, the encoding matrix G is a 8×8 matrix (n=8), and a column vector x may be a concatenation of a column vector w having three entries and a column vector u having five entries (s=3). The u generating unit 1111 may generate the column vector u=[1 1 1 0 1]T having five entries from the input data DATA_IN. Furthermore, the encoding matrix G may be a combination of a 8×3 first encoding matrix G1 and a 8×5 second encoding matrix G2, where the first encoding matrix G1 and the second encoding matrix G2 may include entries of which values are 0 or 1.
According to an example embodiment of inventive concepts, a linear equation including entries w1 through ws of the column vector w as variables may be induced from the entries included in rows of the encoding matrix G corresponding to coordinates of stuck cells and entries of the column vector x. Since there are t stuck cells, t linear equations may be induced. For example, as shown in FIG. 5, since the cell array 2100 of FIG. 2 includes stuck cells respectively corresponding to the coordinates of 2 and 5, two linear equations may be induced from entries included in the second row and fifth row of the encoding matrix G and entries of the column vector x. In detail, as shown in FIG. 5, a linear equation w3=0 may be induced from a stuck cell whose coordinate is 2 and value is 0 and the second row of the encoding matrix G, whereas a linear equation w2=0 may be induced from a stuck cell whose coordinate is 5 and value is 1 and the fifth row of the encoding matrix G. Therefore, the column vector w may be either [0 0 0]T or [1 0 0]T.
According to an example embodiment of inventive concepts, the w generating unit 1112 may calculate s roots satisfying t linear equations and generate a column vector w by employing the roots as entries of the column vector w. Therefore, the column vector w may include information regarding stuck cells in relation to the column vector u generated by the u generating unit 1111. Meanwhile, if the w generating unit 1112 is unable to calculate s roots satisfying t linear equations or s roots satisfying t linear equations do not exist, the w generating unit 1112 may receive a new encoding matrix G from the G generating unit 1113.
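One possible way to obtain the entries of the column vector w is Gaussian elimination modulo 2 on the t equations taken from the stuck-cell rows of G. The following Python sketch is an illustration under that assumption, not the claimed implementation; `solve_gf2` and `generate_w` are hypothetical names, coordinates are 0-based here (the description uses 1-based coordinates), and free variables are set to 0.

```python
def solve_gf2(A, b, ncols):
    """Return one solution x of A*x = b modulo 2, or None if none exists."""
    rows = [r[:] + [c] for r, c in zip(A, b)]   # augmented matrix [A | b]
    pivots, r = [], 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if p is None:
            continue                            # no pivot in this column
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    if any(row[-1] for row in rows[r:]):        # a row reading 0 = 1
        return None
    x = [0] * ncols                             # free variables stay 0
    for i, c in enumerate(pivots):
        x[c] = rows[i][-1]
    return x


def generate_w(G, u, alpha, mu, s):
    """Build the t equations from rows alpha of G and solve them for w."""
    A, b = [], []
    for a, m in zip(alpha, mu):
        const = sum(g & x for g, x in zip(G[a][s:], u)) % 2  # G2 row times u
        A.append(G[a][:s])                                   # G1 row
        b.append(m ^ const)
    return solve_gf2(A, b, s)
```

A return value of None corresponds to the case in which no roots exist, in which a new encoding matrix G would be requested.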
FIG. 6 is a diagram for describing operations of the v generating unit 1114 and the decoder 1200 according to an example embodiment of inventive concepts. According to an example embodiment of inventive concepts, the decoder 1200 may receive a column vector v′ based on data stored in the cell array 2100. The decoder 1200 may determine an inverse matrix G−1 of the encoding matrix G based on the label of the encoding matrix G included in the header DATA_HD and may generate a column vector x′ by multiplying the inverse matrix G−1 by the received column vector v′. Furthermore, the decoder 1200 may extract a column vector u′ having n−s entries from the column vector x′ based on s included in the header DATA_HD.
Referring to FIGS. 5 and 6, if the w generating unit 1112 generates a column vector w=[1 0 0]T according to the embodiment shown in FIG. 5, the v generating unit 1114 may generate a column vector v=[1 0 1 0 1 1 0 1]T as shown in FIG. 6. The cell array 2100 may store data according to the column vector v. When the column vector v generated by the encoder 1100 is stored in the cell array 2100 and the decoder 1200 receives a column vector v′ according to the data, the condition v′=v may be satisfied. As shown in FIG. 6, the decoder 1200 may generate a column vector x′ by multiplying the inverse matrix G−1 of the encoding matrix G by the column vector v′ and may generate a column vector u′ by extracting lower five entries from among entries of the x′.
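The decoding step described above (multiplying by the inverse matrix and keeping the lower n−s entries) can be sketched in Python as follows. The names `mat_vec_gf2` and `decode` are hypothetical, arithmetic is assumed to be modulo 2, and the 3×3 matrix is a made-up example that happens to be its own inverse modulo 2; it is not the matrix of FIG. 5 or FIG. 6.

```python
def mat_vec_gf2(M, x):
    """Multiply matrix M (a list of rows) by column vector x, modulo 2."""
    return [sum(m & b for m, b in zip(row, x)) % 2 for row in M]


def decode(G_inv, v_prime, s):
    """Recover u' as the lower n - s entries of x' = G^-1 * v'."""
    x_prime = mat_vec_gf2(G_inv, v_prime)
    return x_prime[s:]


# Hypothetical example with n = 3 and s = 2; G equals its own inverse.
G = [[1, 0, 0],
     [0, 1, 0],
     [1, 1, 1]]
x = [1, 0, 1]                 # w = [1, 0] concatenated with u = [1]
v = mat_vec_gf2(G, x)         # codeword written to the cells
u_prime = decode(G, v, 2)     # read back with v' = v
```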
When the encoder 1100 and the decoder 1200 perform the operations described above with reference to FIGS. 5 and 6, the encoding operation may have a complexity of $O(n^2+s\cdot t^2)$ for an arbitrary encoding matrix G, whereas the decoding operation may have a complexity of $O(n^2)$. Furthermore, the encoding matrix G and the inverse matrix G−1 of the encoding matrix G may be stored in memories respectively accessible by the encoder 1100 and the decoder 1200 and may be recognized via labels. According to an example embodiment of inventive concepts, if the encoding matrix G is generated as a sparse random matrix, the complexity may be lowered, and capacities of the memories for storing the encoding matrix G and the inverse matrix G−1 of the encoding matrix G may be adjusted.
FIG. 7 is a diagram showing an encoding matrix G according to an example embodiment of inventive concepts. As shown in FIG. 7, the encoding matrix G may include an n×s first encoding matrix G1 and an n×(n−s) second encoding matrix G2, where the first encoding matrix G1 may include an s×s matrix G11 and an (n−s)×s matrix G21. Furthermore, the second encoding matrix G2 may include a zero matrix 0s×(n−s) and a unit matrix I(n−s)×(n−s). According to an example embodiment of inventive concepts, if the encoding matrix G is generated as shown in FIG. 7, the inverse matrix G−1 of the encoding matrix G may have the same second encoding matrix G2 and a different first encoding matrix G1 as compared to the encoding matrix G. In other words, as shown in FIG. 7, the inverse matrix G−1 of the encoding matrix G includes the second encoding matrix G2 as-is and includes, instead of the first encoding matrix G1, an n×s matrix which is a combination of the inverse matrix G11−1 of the matrix G11 and the matrix −G21·G11−1. Therefore, a memory for storing the encoding matrix G may store the (n−s)×s matrix G21, the s×s matrix G11, and the inverse matrix G11−1 of the matrix G11 instead of storing the n×n matrix G and the inverse matrix G−1 of the encoding matrix G, and thus the required capacity of the memory may be reduced.
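The block structure of the inverse described above can be checked numerically. The Python sketch below is only an illustration: it assumes binary entries with arithmetic modulo 2 (so the sign of −G21·G11−1 vanishes), and the helper names `mat_mul_gf2`, `invert_gf2`, `assemble`, and `build_G_and_inverse` are hypothetical.

```python
def mat_mul_gf2(A, B):
    """Matrix product A*B modulo 2 (matrices are lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in cols]
            for row in A]


def invert_gf2(M):
    """Gauss-Jordan inverse modulo 2; raises ValueError if M is singular."""
    n = len(M)
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        p = next((i for i in range(c, n) if aug[i][c]), None)
        if p is None:
            raise ValueError("matrix is singular modulo 2")
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(n):
            if i != c and aug[i][c]:
                aug[i] = [x ^ y for x, y in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]


def assemble(top_left, bottom_left, s, n):
    """Build the n x n matrix [[top_left, 0], [bottom_left, I]]."""
    top = [row + [0] * (n - s) for row in top_left]
    bottom = [row + [int(i == j) for j in range(n - s)]
              for i, row in enumerate(bottom_left)]
    return top + bottom


def build_G_and_inverse(G11, G21):
    """Return G and its inverse, storing only G11, G21, and G11^-1."""
    s, n = len(G11), len(G11) + len(G21)
    G11_inv = invert_gf2(G11)
    lower = mat_mul_gf2(G21, G11_inv)   # -G21*G11^-1; the sign vanishes mod 2
    return assemble(G11, G21, s, n), assemble(G11_inv, lower, s, n)
```

In this sketch only the s×s and (n−s)×s blocks are ever inverted or multiplied, which mirrors the storage saving described for FIG. 7.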
FIGS. 8A through 8C are diagrams showing encoding matrices G according to example embodiments of inventive concepts. Each of the encoding matrices G shown in FIGS. 8A through 8C may include the second encoding matrix G2, which may include a zero matrix 0s×(n−s) and a unit matrix I(n−s)×(n−s), as shown in FIG. 7.
FIG. 8A is a diagram showing the encoding matrix G including a first encoding matrix G1 including a combination of the unit matrix Is×s and the (n−s)×s matrix G21 (first combination). If the matrix G11 of FIG. 7 is the unit matrix Is×s as shown in FIG. 8A, the encoding matrix G may be identical to the inverse matrix G−1 of the encoding matrix G. Therefore, the encoder 1100 and the decoder 1200 may share a memory in which the (n−s)×s matrix G21 is stored, thereby saving memory space for storing the encoding matrix G.
Referring to FIGS. 3 and 8A, the G generating unit 1113 may generate the (n−s)×s matrix G21 randomly or pseudo-randomly. According to the (n−s)×s matrix G21 generated by the G generating unit 1113, the w generating unit 1112 may calculate s roots of t linear equations and generate a column vector w. However, if the w generating unit 1112 is unable to calculate s roots from t linear equations, the w generating unit 1112 may receive a new (n−s)×s matrix G21 randomly or pseudo-randomly generated by the G generating unit 1113 and may obtain new t linear equations. For example, the probability that the value of an entry of the (n−s)×s matrix G21 becomes 1 is ½, and the (n−s)×s matrix G21 may include independently and identically distributed (i.i.d.) entries having values of 0 or 1. Referring to Lemma 1 below, the probability that the w generating unit 1112 is able to calculate s roots from t linear equations and generate a column vector w is at least $1-2^{t-s}$.
Lemma 1.
- (i) Let $1\le t<s$ and let V be a t×s random matrix whose entries are sampled independently and identically distributed (i.i.d.) from $Z_2$ (where $Z_2$ is the field consisting of the two integers 0 and 1), each entry being 1 with probability ½. It then holds that the probability that V's rows are linearly dependent is bounded from above by $2^{t-s}$.
- (ii) For $1\le k\le t<s$, let V be a t×s matrix and let $(v_1,\ldots,v_t)$ be V's rows. Suppose that $(v_1,\ldots,v_k)$ are linearly independent and that each component of $v_i$, $k<i\le t$, is sampled independently and identically distributed (i.i.d.) from $Z_2$ with probability ½ of being 1. It then holds that the probability that V's rows are linearly dependent is bounded from above by $2^{t-s}$.
Proof. (i) Let $(v_1,\ldots,v_t)$ be V's rows. The event U that V's rows are linearly dependent is a non-disjoint union of the events $U_\alpha$, $\alpha\subset[t]$, $\alpha\neq\emptyset$, that $v_\alpha=\sum_{i\in\alpha}v_i=0$. Note that each component of $v_\alpha$ has a ½ probability of being 1 and that the components are independent. It follows that $\Pr(v_\alpha=0)=2^{-s}$. Since there are $2^t-1$ events $U_\alpha$, $\alpha\subset[t]$, $\alpha\neq\emptyset$, the probability that at least one of them occurs is bounded from above, by the union bound, by $(2^t-1)\cdot 2^{-s}<2^{t-s}$.
- (ii) In this case the event U that V's rows are linearly dependent is a non-disjoint union of the events $U_\alpha$ that $v_\alpha=\sum_{i\in\alpha}v_i=0$, where $\alpha\subset[t]$ and $\alpha$ is not a subset of $[k]$. Here too each component of $v_\alpha$ ($\alpha\subset[t]$, $\alpha\not\subset[k]$) has a ½ probability of being 1, and the components are independent, which implies $\Pr(v_\alpha=0)=2^{-s}$. Since there are $2^t-2^k$ such events $U_\alpha$, the probability that at least one of them occurs is bounded from above by $(2^t-2^k)\cdot 2^{-s}<2^{t-s}$.
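As a numerical sanity check of Lemma 1(i), the bound can be compared with the exact dependence probability of t uniformly random binary rows of length s. The Python sketch below relies on the standard counting fact that such rows are linearly independent with probability (1−2^(−s))·(1−2^(1−s))·…·(1−2^(t−1−s)); this closed form and the function names are not part of the description and are used here only for illustration.

```python
from fractions import Fraction


def prob_dependent(t, s):
    """Exact probability that t uniform random rows of length s over the
    two-element field are linearly dependent (standard counting formula)."""
    p_indep = Fraction(1)
    for i in range(t):
        p_indep *= 1 - Fraction(2 ** i, 2 ** s)
    return 1 - p_indep


def lemma1_bound(t, s):
    """Upper bound 2^(t - s) stated in Lemma 1."""
    return Fraction(2 ** t, 2 ** s)


# Compare the exact value with the bound for a few small (t, s) pairs.
for t, s in [(1, 2), (2, 4), (3, 8)]:
    print(t, s, float(prob_dependent(t, s)), float(lemma1_bound(t, s)))
```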
FIG. 8B is a diagram showing a first encoding matrix G1 generated by the G generating unit 1113 according to an example embodiment of inventive concepts. Referring to FIG. 7, according to an example embodiment of inventive concepts, the G generating unit 1113 may generate a second encoding matrix G2 including a zero matrix 0s×(n−s) and a unit matrix I(n−s)×(n−s). Furthermore, the G generating unit 1113 may generate a first encoding matrix G1 including a row vector r satisfying equations below.
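$1\le t\le s'<s\le n/2$ (s′ even)
$d=s-s'$
$e_i=[0\ \ldots\ 0\ 1\ 0\ \ldots\ 0]$, a row vector of length s whose only nonzero entry is the i-th entry ($0<i<s$)
$q=[0_{1\times s'}\ \ q_{s'+1}\ \ldots\ q_s]$
$r=e_i+e_j+q$ ($1\le i\le s'/2$ and $(s'/2)+1\le j\le s'$)
where $e_i$ and q are row vectors and r is a row vector sampled with uniform probability.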
In other words, according to an example embodiment of inventive concepts, a row of the first encoding matrix G1 generated by the G generating unit 1113 may be a row vector in which exactly one of the entries with indexes 1 through s′/2 has a value of 1 and exactly one of the entries with indexes s′/2+1 through s′ has a value of 1. For example, as shown in FIG. 8B, when s=14 and s′=10, the second row of the first encoding matrix G1 may be a row vector r, which is the sum of row vectors e3, es, and q.
Meanwhile, the G generating unit 1113 may generate the row vector r at a uniform probability in the probability space. In other words, the probability that a value of each of entries of the row vector r of which indexes are from 1 through s′/2 becomes 1 may be 1/(s′/2), whereas the probability that a value of each of entries of the row vector r of which indexes are from s′/2+1 through s′ becomes 1 may be 1/(s′/2), independent of the probability regarding the entries of the row vector r of which indexes are from 1 through s′/2. Furthermore, the probability that a value of each of entries of the row vector r of which indexes are from s′+1 through s becomes 1 may be ½, independent of one another. The row vector r in the probability space (Ω, P) may be expressed as shown below, where the first encoding matrix G1 may include row vectors r sampled independently and identically distributed from the probability space (Ω, P).
Ω={r=ei+ej+q: 1≦i≦s′/2 and ((s′/2)+1)≦j≦s′}
- P is a uniform probability and Ω is a set of possible outcomes.
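Sampling a row vector r from the probability space (Ω, P) can be sketched in Python as follows. This is a minimal illustration under the stated distribution; the function name `sample_row` is hypothetical, indexes are 0-based, and Python's `random` module is assumed as the pseudo-random source.

```python
import random


def sample_row(s, s_prime, rng):
    """Sample r = e_i + e_j + q: one 1 in each half of the first s' entries,
    and independent fair bits in the last d = s - s' entries."""
    if s_prime % 2 or not 0 < s_prime < s:
        raise ValueError("s' must be even and satisfy 0 < s' < s")
    half = s_prime // 2
    i = rng.randrange(half)           # position of the 1 in the first sector
    j = half + rng.randrange(half)    # position of the 1 in the second sector
    r = [0] * s
    r[i] = 1
    r[j] = 1
    for k in range(s_prime, s):       # the q part: fair coin per entry
        r[k] = rng.randrange(2)
    return r


rng = random.Random(0)
row = sample_row(14, 10, rng)         # s = 14 and s' = 10, as in the FIG. 8B example
```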
From among the n rows of the first encoding matrix G1, if the t rows corresponding to coordinates of stuck cells are not linearly dependent on one another, a column vector w may be generated from an encoding matrix G based on the first encoding matrix G1. In other words, since entries included in the t rows of the first encoding matrix G1 are coefficients regarding s variables in t linear equations, there may be s roots satisfying the t linear equations if the t rows are not linearly dependent on one another. Therefore, when a t×s matrix including row vectors r sampled independently and identically distributed from the probability space (Ω, P) is referred to as a matrix V, the probability that rows of the matrix V become linearly dependent on one another may be adjusted as desired based on the Lemmas below.
Lemma 2. Let $M,K\ge 1$. Denote the M-dimensional unit vectors by $e_i=[0,\ldots,0,1,0,\ldots,0]^T$, in which the 1 is preceded by i−1 zeros, and let $E_M=\{e_i: i\in[M]\}$.
- (i) Let $i(1),\ldots,i(K)\in[M]$; then the sequence of vectors $(e_{i(k)})_{k\in[K]}$ is linearly dependent if there exist $k,k'\in[K]$, $k\neq k'$, such that $e_{i(k)}=e_{i(k')}$.
Define:
- $V_K=\{[u_1,\ldots,u_K]: u_k\in E_M \text{ for } k\in[K]\}$
- $U_K=\{[u_1,\ldots,u_K]\in V_K: u_1+\ldots+u_K=0\}$
It then holds that:
- (ii) $|V_K|=M^K$
- (iii) If K is odd, $U_K=\emptyset$
- (iv) If K is even, then $|U_K|\le M^{K/2}\cdot(K-1)\cdot(K-3)\cdots 1\le M^{K/2}\cdot(K!)^{1/2}$
- (v) If K is even, then $|U_K|\ge M(M-1)\cdots(M-K/2+1)\cdot(K-1)\cdot(K-3)\cdots 1$
- (vi) If we sample with uniform probability an element $[u_1,\ldots,u_K]\in V_K$ (K even), then the probability of a zero sum is bounded from above by:
$\Pr(u_1+\ldots+u_K=0)\le M^{-K/2}\cdot(K-1)\cdot(K-3)\cdots 1=f(M,K)$
- (vii) If we sample with uniform probability an element $[u_1,\ldots,u_K]\in V_K$ (K even), then the probability of a zero sum is bounded from below by:
$\Pr(u_1+\ldots+u_K=0)\ge M^{-K}\cdot M(M-1)\cdots(M-K/2+1)\cdot(K-1)\cdot(K-3)\cdots 1$
For example
- for K=2: $\Pr(u_1+u_2=0)\le M^{-1}$
- for K=4: $\Pr(u_1+u_2+u_3+u_4=0)\le 3\cdot M^{-2}$
- for K=6: $\Pr(u_1+u_2+u_3+u_4+u_5+u_6=0)\le 15\cdot M^{-3}$
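The counting bounds of Lemma 2 can be checked by brute force for small M and K. The Python sketch below is an illustration only: it enumerates all K-tuples of unit vectors by their indexes and uses the observation that the sum is zero modulo 2 exactly when every index occurs an even number of times. All function names are hypothetical.

```python
from collections import Counter
from itertools import product


def count_zero_sum(M, K):
    """Count K-tuples of M-dimensional unit vectors whose sum is 0 modulo 2."""
    count = 0
    for idx in product(range(M), repeat=K):
        if all(c % 2 == 0 for c in Counter(idx).values()):
            count += 1
    return count


def odd_product(K):
    """(K-1)*(K-3)*...*1 for even K."""
    out = 1
    for m in range(K - 1, 0, -2):
        out *= m
    return out


def upper_bound(M, K):
    """Bound of item (iv): M^(K/2) * (K-1)(K-3)...1."""
    return M ** (K // 2) * odd_product(K)


def lower_bound(M, K):
    """Bound of item (v): M(M-1)...(M-K/2+1) * (K-1)(K-3)...1."""
    falling = 1
    for m in range(M, M - K // 2, -1):
        falling *= m
    return falling * odd_product(K)


for M, K in [(3, 2), (3, 4), (4, 4)]:
    print(M, K, lower_bound(M, K), count_zero_sum(M, K), upper_bound(M, K))
```

Dividing each count by M^K gives the probabilities of items (vi) and (vii).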
Proof of Lemma 2: (i) and (ii) are trivial. To prove (iii), note first that it holds for K=1. Let K>1 be odd and suppose inductively that for all odd K′, $1\le K'<K$, and all $[u_1,\ldots,u_{K'}]\in V_{K'}$: $u_1+\ldots+u_{K'}\neq 0$. Take $[u_1,\ldots,u_K]\in V_K$ and suppose, to get a contradiction, that $u_1+\ldots+u_K=0$. Note that there must be some k>1 such that $u_1=u_k$, which implies that $u_1+u_k=0$ and hence $u_2+\ldots+u_{k-1}+u_{k+1}+\ldots+u_K=0$. This is a zero sum of K−2 unit vectors with K−2 odd, which contradicts the induction assumption.
To prove (iv), we construct for every element $u=[u_1,\ldots,u_K]\in U_K$ a 1:1 mapping $\varphi=\varphi_u:[K]\to[K]$, defined inductively, which has the following properties for all $a\in[K]$:
- $u_a=u_{\varphi(a)}$
- $\varphi(a)\neq a$
- $\varphi(\varphi(a))=a$
Such a mapping is called a pairing, for obvious reasons. φ is defined inductively as follows. For the first step, let $b_1=1$, let $\varphi(1)$ be the minimal index in $\{2,\ldots,K\}$ such that $u_1=u_{\varphi(1)}$, and define $\varphi(\varphi(1))=1$. Suppose inductively that after the r-th step, r<K/2, there is a subset $A_r\subset[K]$ with $|A_r|=2r$ on which we defined a 1:1 mapping $\varphi:A_r\to A_r$ such that for all $a\in A_r$ the above three properties hold:
- $u_a=u_{\varphi(a)}$
- $\varphi(a)\neq a$
- $\varphi(\varphi(a))=a$
Note that $\sum_{a\in A_r}u_a=0$ and hence for $B=[K]\setminus A_r$:
- $\sum_{b\in B}u_b=0$.
For the (r+1)-th step, let $b=b_{r+1}$ be the minimal element in B and let b′ be the minimal element in $B\setminus\{b\}$ such that $u_b=u_{b'}$, and define $\varphi(b)=b'$ and $\varphi(b')=b$. Set $A_{r+1}=A_r\cup\{b,b'\}$. It is clear that the induction assumption carries over to r+1, and hence the final φ satisfies the three requirements listed above. We will now be interested in the mapping $u\mapsto\varphi_u$, where $u=[u_1,\ldots,u_K]\in U_K$.
Let $F=\{\varphi_u: u\in U_K\}$. The number of such pairings, |F|, is found via the following combinatorial observation: there are K−1 possibilities for $\varphi(b_1)$, independently K−3 possibilities for $\varphi(b_2)$, independently K−5 possibilities for $\varphi(b_3)$, and so on. It follows that:
$|F|=(K-1)\cdot(K-3)\cdot(K-5)\cdots 1$
For each $\varphi\in F$ define
$U_{K,\varphi}=\{u\in U_K:\varphi_u=\varphi\}$.
Note that $U_K$ is a disjoint union of the sets $U_{K,\varphi}$, $\varphi\in F$. There are at most $M^{K/2}$ elements in each $U_{K,\varphi}$, which covers all the possible ways to choose the unit vectors at the positions $b_1,\ldots,b_{K/2}$, and hence $|U_K|\le M^{K/2}\cdot(K-1)\cdot(K-3)\cdots 1$. ♦
Lemma 3: For k≥1, if $u_1,\ldots,u_k$ are sampled independently from (Ω,P) and $u=u_1+\ldots+u_k$, then
- If k is odd, $u\neq 0$
- If k is even, $\Pr(u=0)\le(2/s')^k\cdot[(k-1)\cdot(k-3)\cdots 1]^2\cdot 2^{-d}$
- $\le(2/s')^k\cdot k!\cdot 2^{-d}$
Define $q_k=(2/s')^k\cdot[(k-1)\cdot(k-3)\cdots 1]^2$; note that
- $q_2=(2/s')^2$
Proof: Let $T_i:F^s\to F^{s'/2}$, i=1,2, and $T_3:F^s\to F^d$ be projections, defined for $w=[w_1,\ldots,w_s]^T\in F^s$ by:
- $T_1(w)=[w_1,\ldots,w_{s'/2}]^T\in F^{s'/2}$
- $T_2(w)=[w_{s'/2+1},\ldots,w_{s'}]^T\in F^{s'/2}$
- $T_3(w)=[w_{s'+1},\ldots,w_s]^T\in F^d$
Take $u_1,\ldots,u_k$ which are sampled independently from (Ω,P) and let $u=u_1+\ldots+u_k$; then the above-mentioned independence of the three sectors and the symmetry of the first two sectors yield that:
$\Pr(T_1(u)=0)=\Pr(T_2(u)=0)$
$\Pr(T_3(u)=0)=2^{-d}$
$\Pr(u=0)=\Pr(T_1(u)=0)\cdot\Pr(T_2(u)=0)\cdot\Pr(T_3(u)=0)=\Pr(T_1(u)=0)^2\cdot 2^{-d}$.
To assess the probability of the event ($T_1(u)=0$), we go back to Lemma 2, which says: if we sample K unit vectors $[u_1,\ldots,u_K]$ of $F^M$ independently and identically distributed (i.i.d.) with uniform probability, then
$\Pr(u_1+\ldots+u_K=0)\le M^{-K/2}\cdot(K-1)\cdot(K-3)\cdots 1$ and
$\Pr(u_1+\ldots+u_K=0)\ge M^{-K}\cdot M(M-1)\cdots(M-K/2+1)\cdot(K-1)\cdot(K-3)\cdots 1$
It follows, with M=s′/2 and K=k, that:
$\Pr(T_1(u)=0)\le(2/s')^{k/2}\cdot(k-1)\cdot(k-3)\cdots 1$
And hence
$\Pr(u=0)=\Pr(T_1(u)=0)^2\cdot 2^{-d}\le(2/s')^k\cdot((k-1)\cdot(k-3)\cdots 1)^2\cdot 2^{-d}$. ♦
Lemma 4. Recall that $1\le t\le s'\le s\le n/2$ and s′+d=s. Let V be a t×s random matrix whose rows are sampled independently and identically distributed (i.i.d.) from (Ω,P). It then holds that:
- I. The probability $P_0$ that V's rows are linearly dependent is bounded from above by:
$P_0\le 2^{-d}\cdot\sum_{k\ \mathrm{even},\ 2\le k\le t}\binom{t}{k}\cdot k!\cdot(2/s')^k$
- II. Let γ>0 be a constant. If we add the assumption that (1−γ)·s′=2·t, then:
$P_0\le 2^{-d}/\gamma$.
Thus the failure probability can be arbitrarily small for s=(2+δ)·t with δ>0 as small as we wish.
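Proof. I. Let $(v_1,\ldots,v_t)$ be V's rows, $v_i=(v_{ij})_{1\le j\le s}$, i=1, . . . , t. The event U that V's rows are linearly dependent is a non-disjoint union of the events $U_\alpha$, $\alpha\subset[t]$, $\alpha\neq\emptyset$, that $\sum_{i\in\alpha}v_i=0$.
By Lemma 3 above, for k≥1 and $\alpha\subset[t]$ with |α|=k:
- If k is odd: $\Pr(U_\alpha)=0$
- If k is even: $\Pr(U_\alpha)\le(2/s')^k\cdot k!\cdot 2^{-d}$
- Since $U=\bigcup_{\alpha\subset[t],\,\alpha\neq\emptyset}U_\alpha$ and there are $\binom{t}{k}$ subsets α with |α|=k, it holds by the union bound that:
$P_0=\Pr(U)\le 2^{-d}\cdot\sum_{k\ \mathrm{even},\ 2\le k\le t}\binom{t}{k}\cdot k!\cdot(2/s')^k$
II. Define
$a_k=\binom{t}{k}\cdot k!\cdot(2/s')^k=t\cdot(t-1)\cdots(t-k+1)\cdot(2/s')^k$
It then holds that $a_{k+1}/a_k=(t-k)\cdot 2/s'<2t/s'=1-\gamma$. So
$P_0\le 2^{-d}\cdot\sum_{k\ge 1}a_k\le 2^{-d}\cdot\sum_{k\ge 1}(1-\gamma)^k<2^{-d}/\gamma$
Therefore, the probability that the encoder is unable to calculate s roots satisfying t linear equations may be reduced as desired by adjusting δ>0.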
FIG. 8C is a diagram showing an example of a first encoding matrix G1 generated by the G generating unit 1113 according to an example embodiment of inventive concepts. Referring to FIG. 7, according to an example embodiment of inventive concepts, the G generating unit 1113 may generate a second encoding matrix G2 including a zero matrix 0s×(n−s) and a unit matrix I(n−s)×(n−s). Furthermore, the G generating unit 1113 may generate a first encoding matrix G1 including a transposed matrix HT of a check matrix H for binary erasure channel (BEC) codes. For example, as shown in FIG. 8C, the first encoding matrix G1 may include a transposed matrix of an s×n check matrix. The check matrix H is for checking errors in BEC codes, e.g., a low-density parity-check (LDPC) matrix, which is a sparse matrix.
According to an example embodiment of inventive concepts, an ε-failure-probability costs only a small header and the complexity of having to solve again t equations with s unknowns with probability ε. Reducing the computational complexity of both the encoder and the decoder for a higher stuck-cell probability p is accomplished by specially designed ensembles of sparse random G1 matrices. A sparse matrix is a matrix which has a small number of nonzero elements in each row. More precisely, the mean number of nonzero elements per row is bounded by a positive constant. As a result, the complexities of the decoder and of the encoder (the linear system solver) are reduced.
There are several algorithms in this specification which allow a solution of a sparse linear system of t×s with O(t·s) operations. A well-known family of codes which are based on sparse matrices is referred to as low-density parity-check (LDPC) codes. It is known that those codes can achieve the capacity of many discrete memoryless channels (DMC) under maximum likelihood (ML) decoding.
In BEC, there are 3 symbols at the output of the channel: 0, 1 and erasure (E). The BEC is characterized by its erasure probability p>0, and it is denoted by BEC(p). For this channel, it is known that there exist codes which approach the BEC capacity: 1−p with linear decoding complexity.
The capacity of the stuck-cell (SC) channel is also 1−p. There is another resemblance: the BEC decoder also tries to solve a linear system of t×s equations.
There is a fundamental difference between BEC and SC. The former requires an overdetermined full ranked system of linear equations (more equations than unknowns) to guarantee decoding, while the latter requires an underdetermined full ranked set of linear equations to guarantee encoding. Nevertheless, both share a common property: a failure occurs if the associated matrix is not full ranked. This led the inventors to investigate the possibility of some relation between the parity-check matrix of a BEC code and G1 of the generator matrix of SC.
We proved that encoding and decoding BEC(p) algorithms with ε-failure-probability can be translated into dual encoding and decoding SC(p) algorithms with the same complexity, the same rate, and the same ε-failure-probability. The proof is presented in the sequel. The significance of this duality result is that the 1−p capacity of the SC(p) channel can be approached with linear-complexity encoding and decoding algorithms. In other words, the above s and t (t is the number of stuck cells and s is the SC redundancy, s>t) can be chosen (for a long enough codeword) with s/t close to 1.
Consider now the BEC and a linear code C with codewords of length n, where C has the ability to correct t erasures with probability 1−ε. The code C can be represented by an s×n parity check matrix H, s>t. Let y=A·x denote the linear system that needs to be solved in a BEC decoding scheme with t erasures. Thus A is an s×t sub-matrix of H composed of the t columns of H which correspond to the positions of the erased bits.
In BEC decoding, the iterative algorithm searches for a row with a single one (the jth row, where the ith position is 1, for example), solves its associated unknown (xi) immediately, and then substitutes the solution into all the equations which involve xi. This is equivalent to removing the jth row and the ith column.
This process is continued until the entire matrix is diminished. If at some stage of the decoding no row with a single one is found, a decoding failure is declared. Good BEC codes have the property that for almost any set of t erasures the above process succeeds; that is, at every stage until the linear system is solved, there is at least one row with a single 1 in it. It is known from the vast literature on BEC codes that there exist codes that can achieve capacity using this simple algorithm of low linear complexity. Unfortunately, this algorithm cannot be applied straightforwardly to the SC case, since there we deal with a transposed version of A. It is easy to prove that the probability of finding a row with a single 1 in AT (a column with a single 1 in A) approaches 0 for good codes. We propose instead a decoding algorithm for SC, based on BEC-SC duality, which is composed of two steps. In the first step, we use an iterative algorithm to turn AT into an upper triangular matrix. Then, in the second step, we solve the linear system using the back-substitution method. In the first step only row and column permutations are done, hence the number of ones in AT stays the same. Thus, the number of operations in the second step is linear in t.
Iterative Triangulation, an outline of Algorithm that solves the linear system of equations y=A·x
- Input: sparse matrix A of size t×s, the vector y of size t.
- Output: Upper triangular matrix A, a rearrangement of the s-dimensional vector x, and a rearrangement of the t-dimensional vector y.
- Notation: Let I and J be sets of natural numbers. We denote by A[I,J] the sub-matrix of A which is composed of the rows indexed by I and the columns indexed by J.
- Algorithm 1:
Initialization: Set n=1.
Steps:
- 1. Search for a column in A[n:t,n:s] with a single 1. Denote the column position in A by i, and the (row) position of the 1 by j. If none was found→Declare an error
- 2. Swap between the ith and nth columns of A. Swap between the ith and nth elements of x
- 3. Swap between the jth and nth rows of A. Swap between the jth and nth elements of y
- 4. Set n=n+1
- 5. If n==t, proceed, else go to step 1
- 6. Set A#=A x#=x (# signifies the permuted matrix/vector) End.
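The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration and not a definitive implementation: the function name `triangulate` is hypothetical, indexes are 0-based, the matrix is stored densely for readability, and the loop is assumed to process all t rows (including the last one) so that every diagonal entry of the result is 1.

```python
def triangulate(A, y):
    """Permute rows and columns of the t x s binary matrix A so that the
    result is upper triangular with ones on the diagonal.

    Returns (A_perm, y_perm, col_order), where col_order[k] is the original
    index of the unknown that ends up at position k of the permuted x.
    """
    t, s = len(A), len(A[0])
    A = [row[:] for row in A]          # work on copies
    y = y[:]
    col_order = list(range(s))
    for n in range(t):
        found = None
        for c in range(n, s):          # step 1: column with a single 1
            ones = [r for r in range(n, t) if A[r][c]]
            if len(ones) == 1:
                found = (c, ones[0])
                break
        if found is None:
            raise ValueError("no column with a single 1: declare an error")
        i, j = found
        for row in A:                  # step 2: swap columns i and n
            row[i], row[n] = row[n], row[i]
        col_order[i], col_order[n] = col_order[n], col_order[i]
        A[j], A[n] = A[n], A[j]        # step 3: swap rows j and n
        y[j], y[n] = y[n], y[j]
    return A, y, col_order
```

Only swaps are performed, so the number of ones in the matrix is unchanged, which is what keeps the subsequent back-substitution cheap for sparse matrices.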
- Back-Substitution
After the first decoding stage we have the linear equation system A#·x#=y#, where A# is an upper triangular matrix. In the back-substitution stage we find x# iteratively, and hence also the original unknowns vector x.
- Algorithm 2:
Initialization: Set $x^\#_i=0$ for $i\in\{t+1,t+2,\ldots,s\}$ (any other initialization vector will also work). Fix n=t.
Steps:
- 1. Compute: $x^\#_n=y^\#_n-\sum_{n+1\le i\le t}a^\#_{n,i}\cdot x^\#_i$
- 2. Set n=n−1
- 3. If n=0, end successfully, else go to step 1
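Algorithm 2 can be sketched in Python as follows, assuming binary entries so that subtraction coincides with XOR. The function name `back_substitute` is hypothetical, indexes are 0-based, and the inner sum is taken over all remaining columns (an assumption that makes no difference when the free entries are initialized to 0, as here).

```python
def back_substitute(A_tri, y):
    """Solve A_tri * x = y modulo 2 for an upper triangular t x s matrix
    with ones on the diagonal; the last s - t unknowns are set to 0."""
    t, s = len(A_tri), len(A_tri[0])
    x = [0] * s                        # initialization of the free unknowns
    for n in range(t - 1, -1, -1):     # from the last row up to the first
        acc = y[n]
        for i in range(n + 1, s):
            acc ^= A_tri[n][i] & x[i]  # subtraction is XOR modulo 2
        x[n] = acc
    return x


# Hypothetical 3 x 4 upper triangular system.
A_tri = [[1, 1, 0, 0],
         [0, 1, 1, 1],
         [0, 0, 1, 1]]
x_sol = back_substitute(A_tri, [1, 0, 1])
```

For a sparse matrix stored by its nonzero column indexes, the inner loop would visit only the ones of each row, giving a number of XOR operations proportional to the number of ones.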
Computational Complexity
In order to take advantage of the large number of zero elements, special storage schemes are required to store sparse matrices. An example of a simple and efficient scheme to store an n×n sparse matrix which has at most r ones in a row is to store an n×r matrix, where each row contains the column indices of the 1 elements in the corresponding row of the original matrix. For rows with fewer than r nonzero elements, the remaining positions should be filled with repetitions of the last nonzero index. Another version that can save storage space is to store the deltas of the indexes.
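The index-based storage scheme described above can be sketched in Python as follows. The function names are hypothetical, column indexes are 0-based, and rows without any nonzero element are rejected because the description does not say how they would be padded.

```python
def to_index_rows(M, r):
    """Store each row as r column indexes of its ones, padding short rows
    by repeating the last nonzero index."""
    packed = []
    for row in M:
        idx = [c for c, bit in enumerate(row) if bit]
        if not idx or len(idx) > r:
            raise ValueError("each row needs between 1 and r ones")
        packed.append(idx + [idx[-1]] * (r - len(idx)))
    return packed


def from_index_rows(packed, ncols):
    """Rebuild the dense binary matrix; repeated indexes are harmless here
    because each listed position is simply set to 1."""
    M = []
    for idx in packed:
        row = [0] * ncols
        for c in idx:
            row[c] = 1
        M.append(row)
    return M


def delta_encode(idx):
    """Alternative form: the first index followed by successive differences."""
    return [idx[0]] + [b - a for a, b in zip(idx, idx[1:])]


M = [[1, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 1, 1, 0]]
packed = to_index_rows(M, 2)
```

A consumer that XORs entries by index would need to skip the repeated padding index, since a position counted twice cancels modulo 2.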
The first decoding stage requires at most 2·(t−1) swap operations. The swap operation can be done without any change in the original matrix storage, by using two auxiliary memories of n elements each, one for column ordering and one for rows ordering. The content of the i-th memory element contains the index of the swapped (permuted) row (column).
The second decoding stage needs only a linear (in t) number of XOR operations, since the first stage does not change the number of ones in the matrix. To conclude, the suggested decoding algorithm is very simple and easy to implement in HW/SW.
According to an example embodiment of inventive concepts, when each row of the first encoding matrix G1 includes 3 nonzero elements, the encoding matrix G may satisfy the equation below (third combination). An encoding code according to the encoding matrix G satisfying the below equation enables transmission at a rate of 1−1.21p (the channel capacity is 1−p).
$G=[G_1,G_2]=\begin{bmatrix}G_{11}&0_{s\times(n-s)}\\G_{21}&I_{(n-s)\times(n-s)}\end{bmatrix}\in F^{n\times n}$
The storage and manipulation of such a code is highly efficient due to the regularity of the code (no fill-ins are needed since any row contains exactly 3 ones). The second decoding stage requires only 3·t XOR operations.
FIG. 9 is a flowchart showing an encoding method according to an example embodiment of inventive concepts. An encoder for performing the encoding method according to an example embodiment of inventive concepts may generate codeword DATA_CW corresponding to data to be stored in a cell array in which n memory cells including t stuck cells are arranged.
As shown in FIG. 9, an encoder may generate a column vector u including n−s entries based on data to be stored in a memory device, in correspondence to a cell array including n memory cells (operation E10). The column vector u may be concatenated with a column vector w to be generated later, thereby forming a column vector x.
An encoder may generate an encoding matrix G, which is an n×n matrix, by combining at least one matrix that is generated randomly or pseudo-randomly (operation E20). For example, the encoder may generate the encoding matrix G as shown in FIG. 7. In other words, the encoder may generate an n×s first encoding matrix G1 and an n×(n−s) second encoding matrix G2 and may generate the encoding matrix G by combining the first encoding matrix G1 with the second encoding matrix G2. Here, the encoder may generate the second encoding matrix G2 by combining a zero matrix 0s×(n−s) and a unit matrix I(n−s)×(n−s).
The encoder may generate the column vector w having s entries based on the information α regarding coordinates of stuck cells, information μ regarding values of the stuck cells, the column vector u, and the encoding matrix G (operation E30). As described above with reference to FIG. 5, the encoder may obtain t linear equations based on the information α, the information μ, the column vector u, and the encoding matrix G and may calculate s roots satisfying the t linear equations as entries of the column vector w.
The encoder may determine that the encoder is unable to generate the column vector w based on the information α, the information μ, the column vector u, and the encoding matrix G (operation E40). For example, if the encoder is unable to calculate s roots satisfying the t linear equations, the encoder may determine that there is no column vector w corresponding to the information α, the information μ, the column vector u, and the encoding matrix G. If the encoder determines that there is no column vector w, the encoder may generate a new encoding matrix G by combining at least one matrix that is randomly or pseudo-randomly sampled (the operation E20).
If the encoder generates the column vector w, the encoder may generate a column vector x by concatenating the column vector w with the column vector u and may generate a column vector v having n entries by multiplying the encoding matrix G by the column vector x (operation E50). The encoding matrix G used for generating the column vector v may be identical to the encoding matrix G used for generating the column vector w in the operation E30. The column vector v generated by the encoder corresponds to data to be stored in a cell array including n memory cells, and entries of the column vector v respectively corresponding to t stuck cells may have values of the corresponding stuck cells, respectively.
If the column vector w is successfully generated, the encoder may generate a header including the label of the encoding matrix G used for generating the column vector w (or used for generating the column vector v) and s, which is the number of entries of the column vector w (operation E60). The header generated by the encoder may be stored in a storage space separate from a storage space in which data corresponding to the column vector v is stored, and a decoder may later access the storage space and determine the encoding matrix G of a column vector v′ and s corresponding to data received from memory cells based on the header.
FIG. 10 is a flowchart showing a method of generating a column vector w according to an example embodiment of inventive concepts. An encoder may obtain t linear equations by using information α, information μ, a column vector u, and an encoding matrix G (operation E31). The linear equations may include s entries of the column vector w as variables, entries included in rows of a first encoding matrix G1 corresponding to the information α as coefficients, and mathematically-calculated values of the information μ, the column vector u, and a second encoding matrix G2 as constants. As described above with reference to FIGS. 7 and 8A through 8C, the encoder may generate the second encoding matrix G2 by combining a zero matrix with a unit matrix and may generate the first encoding matrix G1 according to any of example embodiments of inventive concepts described above.
The encoder may calculate s roots satisfying t linear equations and generate a column vector w by employing the roots as entries of the column vector w (operation E32). Based on the encoding matrix G according to example embodiments of inventive concepts as described above, the encoder may calculate the s roots and generate the column vector w at low complexity.
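By way of non-limiting illustration only, operations E31 and E32 may be sketched as follows for binary data, that is, with all arithmetic performed over GF(2). The sketch assumes 0-based coordinates, assumes that the second encoding matrix G2 is the combination of a zero matrix and a unit matrix shown in FIG. 7, and uses plain Gaussian elimination with free variables set to zero. The function names `solve_gf2` and `generate_w` are hypothetical and are not part of the described embodiments.

```python
def solve_gf2(rows, rhs, s):
    """Solve rows * w = rhs over GF(2); return one solution or None."""
    t = len(rows)
    aug = [list(row) + [b] for row, b in zip(rows, rhs)]
    pivots = []
    r = 0
    for c in range(s):
        pr = next((i for i in range(r, t) if aug[i][c]), None)
        if pr is None:
            continue  # no pivot in this column: w[c] is a free variable
        aug[r], aug[pr] = aug[pr], aug[r]
        for i in range(t):
            if i != r and aug[i][c]:
                aug[i] = [a ^ b for a, b in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
    # An all-zero row with right-hand side 1 means there is no solution.
    if any(aug[i][s] for i in range(r, t)):
        return None
    w = [0] * s  # free variables are set to zero (an assumption)
    for i, c in enumerate(pivots):
        w[c] = aug[i][s]
    return w


def generate_w(G1, alpha, mu, u, s):
    """Build the t equations (E31) and solve them for w (E32)."""
    # Coefficients: the rows of G1 selected by the stuck-cell coordinates.
    rows = [G1[a] for a in alpha]
    # Constants: stuck value minus the contribution of G2*u, where
    # (G2*u)[a] is 0 for a < s and u[a - s] otherwise.
    rhs = [m ^ (u[a - s] if a >= s else 0) for a, m in zip(alpha, mu)]
    return solve_gf2(rows, rhs, s)
```

A return value of `None` corresponds to the case of operation E40 in which no column vector w exists for the current encoding matrix G.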
FIG. 11 is a flowchart showing a method of generating a column vector v according to an example embodiment of inventive concepts. An encoder may generate a column vector y having n entries by multiplying a first encoding matrix G1 by a column vector w (operation E51). The encoder may generate a column vector v by adding the column vector y to a product of a second encoding matrix G2 and a column vector u (operation E52).
According to example embodiments of inventive concepts, the multiplication of the second encoding matrix G2 and the column vector u may have low complexity. For example, if the second encoding matrix G2 is a combination of a zero matrix and a unit matrix as shown in FIG. 7, the product of the second encoding matrix G2 and the column vector u may be a column vector having n entries, which includes s zero entries followed by the entries of the column vector u. In other words, without further mathematical calculation, the product of the second encoding matrix G2 and the column vector u may be obtained based on arrangement of the column vector u.
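By way of non-limiting illustration only, operations E51 and E52 may be sketched as follows over GF(2), assuming the FIG. 7 structure of the second encoding matrix G2 so that the product of G2 and u is obtained by arrangement alone. The function names `mat_vec_gf2` and `encode_v` are hypothetical.

```python
def mat_vec_gf2(M, x):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, x)) % 2 for row in M]


def encode_v(G1, w, u):
    """Compute v = G1*w + G2*u, where G2 stacks a zero matrix on a unit matrix."""
    s = len(w)
    y = mat_vec_gf2(G1, w)          # operation E51: y = G1 * w
    g2u = [0] * s + list(u)         # G2 * u by arrangement, no multiplication
    return [a ^ b for a, b in zip(y, g2u)]  # operation E52: v = y + G2 * u
```

Under these assumptions only the n×s multiplication by G1 involves arithmetic; the remaining n−s columns of G contribute a copy of u.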
FIG. 12 is a flowchart showing a decoding method according to an example embodiment of inventive concepts. The decoding method may be similar to a reverse of the encoding method described above with reference to FIG. 9. A decoder for performing the decoding method may use a column vector v′ including n entries corresponding to data stored in a cell array including n memory cells and a header corresponding to the data as inputs and may output a column vector having (n−s) entries.
As shown in FIG. 12, a decoder may receive a column vector v′ including n entries corresponding to data stored in a cell array and a header corresponding to the data (operation D10). The column vector v′ may include values corresponding to stuck cells as entries. The header may be separately stored in a storage space included in the memory controller 1000 of FIG. 1 or may be separately stored in a cell array included in the memory device 2000, where the decoder may receive the header from the storage space or the cell array in which the header is stored.
The header may include the label of an encoding matrix G used by an encoder to generate the column vector v′ and may include s, which is the number of entries of the column vector w (operation D20). The decoder may obtain an inverse matrix G⁻¹ of the encoding matrix G based on the label of the encoding matrix G included in the header (operation D30). For example, the decoder may access a memory in which the inverse matrix G⁻¹ is stored based on the label of the encoding matrix G and read out the inverse matrix G⁻¹. According to the structure of the encoding matrix G, the inverse matrix G⁻¹ may be identical or similar to the encoding matrix G.
The decoder may generate a column vector x′ including n entries by multiplying the inverse matrix G⁻¹ by the column vector v′ and may generate a column vector u′ by extracting (n−s) entries from the column vector x′ (operation D40). The decoder or a component following the decoder may restore data prior to being encoded by the encoder from the column vector u′.
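By way of non-limiting illustration only, operations D30 and D40 may be sketched as follows over GF(2). The description above has the decoder read a stored inverse matrix by label; the sketch instead computes the inverse by Gauss-Jordan elimination, which is an assumption made only to keep the example self-contained. It also assumes that the vector x′ is the concatenation of w followed by u, so that the last n−s entries are extracted. The function names `inverse_gf2` and `decode` are hypothetical.

```python
def inverse_gf2(G):
    """Invert a square matrix over GF(2) by Gauss-Jordan elimination."""
    n = len(G)
    aug = [list(row) + [int(i == j) for j in range(n)]
           for i, row in enumerate(G)]
    for c in range(n):
        pr = next((i for i in range(c, n) if aug[i][c]), None)
        if pr is None:
            raise ValueError("matrix is singular over GF(2)")
        aug[c], aug[pr] = aug[pr], aug[c]
        for i in range(n):
            if i != c and aug[i][c]:
                aug[i] = [a ^ b for a, b in zip(aug[i], aug[c])]
    return [row[n:] for row in aug]


def decode(G_inv, v_prime, s):
    """Compute x' = G^-1 * v' and return u', the last n - s entries of x'."""
    x_prime = [sum(a & b for a, b in zip(row, v_prime)) % 2 for row in G_inv]
    return x_prime[s:]
```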
FIG. 13 is a schematic block diagram showing an encoder 1100a and a decoder 1200a according to an example embodiment of inventive concepts. According to an example embodiment of inventive concepts, the encoder 1100a may not only perform the operation described above with reference to FIG. 9, but also perform additional ECC encoding. Furthermore, according to an example embodiment of inventive concepts, the decoder 1200a may not only perform the operation described above with reference to FIG. 12, but also perform additional ECC decoding. In other words, an encoding method according to an example embodiment of inventive concepts may be used for encoding data output by another ECC encoder, and data generated by using the encoding method according to an example embodiment of inventive concepts may be encoded by another ECC encoder. Furthermore, a decoding method according to an example embodiment of inventive concepts may be used for decoding data output by another ECC decoder, and data generated by using the decoding method according to an example embodiment of inventive concepts may be decoded by another ECC decoder. As shown in FIG. 13, the encoder 1100a may include a SC encoder 1110 and first and second ECC encoders 1120 and 1130, whereas the decoder 1200a may include a SC decoder 1210 and first and second ECC decoders 1220 and 1230.
The first ECC encoder 1120 may receive raw data DATA_RAW from outside the encoder 1100a. The raw data DATA_RAW may include user data to be written to the memory system 900 by the host 800 of FIG. 1 and metadata generated by the memory controller 1000 to manage the user data. The first ECC encoder 1120 may encode the raw data DATA_RAW and output input data DATA_IN with respect to the SC encoder 1110.
According to an example embodiment of inventive concepts, the SC encoder 1110 may include the components shown in FIG. 3 and may perform the operations shown in FIG. 9. The SC encoder 1110 may perform an encoding method according to an example embodiment of inventive concepts based on input data DATA_IN and stuck cell information SC_INFO and output codeword DATA_CW and header DATA_HD.
The second ECC encoder 1130 may receive the codeword DATA_CW from the SC encoder 1110 and encode the codeword DATA_CW. Data encoded by the second ECC encoder 1130 may be transmitted to the decoder 1200a via a noise channel 2000a. The noise channel 2000a is a channel in which errors may occur between received data and transmitted data and may include a memory device and a cell array, for example. The second ECC decoder 1230 may receive data from the noise channel 2000a and output codeword DATA_CW′ by decoding the received data.
The second ECC encoder 1130 and the second ECC decoder 1230 may perform an encoding operation and a decoding operation, respectively, for correcting possible errors in the noise channel 2000a. According to an example embodiment of inventive concepts, the second ECC encoder 1130 and the second ECC decoder 1230 may perform an encoding operation and a decoding operation while information regarding stuck cells is maintained, so that errors regarding the stuck cells may be corrected.
According to an example embodiment of inventive concepts, the SC decoder 1210 may perform the operation shown in FIG. 12. The SC decoder 1210 may receive the codeword DATA_CW′ and the header DATA_HD and generate output data DATA_OUT. The header DATA_HD is a header generated by the SC encoder 1110 when the input data DATA_IN is encoded, is stored in a separate storage space 910 in the memory system 900, and is transmitted to the SC decoder 1210. The first ECC decoder 1220 may receive the output data DATA_OUT from the SC decoder 1210, decode the output data DATA_OUT by using a decoding method corresponding to the encoding method used by the first ECC encoder 1120, and output raw data DATA_RAW′.
The first ECC encoder 1120 and the first ECC decoder 1220 may perform an encoding operation and a decoding operation in correspondence to each other, whereas the second ECC encoder 1130 and the second ECC decoder 1230 may also perform an encoding operation and a decoding operation in correspondence to each other. For example, the ECC encoders 1120, 1130 and the ECC decoders 1220, 1230 may use LDPC.
Although FIG. 13 shows an example embodiment in which all ECC encoders and all ECC decoders are arranged around the SC encoder 1110 and the SC decoder 1210, the encoder 1100a and the decoder 1200a may include a pair of an encoder and a decoder corresponding to each other from among the ECC encoders and the ECC decoders stated above.
FIG. 14 is a block diagram showing a computer system 3000 including a memory system according to an example embodiment of inventive concepts. The computer system 3000, such as a mobile device, a desktop computer, or a server, may employ a memory system 3400 according to an example embodiment of inventive concepts.
The computer system 3000 may include a central processing unit 3100, a RAM 3200, a user interface 3300, and the memory system 3400, which are electrically connected to buses 3500. The host as described above may include the central processing unit 3100, the RAM 3200, and the user interface 3300 in the computer system 3000. The central processing unit 3100 may control the entire computer system 3000 and may perform calculations corresponding to user commands input via the user interface 3300. The RAM 3200 may function as a data memory for the central processing unit 3100, and the central processing unit 3100 may write/read data to/from the memory system 3400.
As in example embodiments of inventive concepts described above, the memory system 3400 may include a memory controller 3410 and a memory device 3420. The memory controller 3410 may include an encoder, a decoder, and a stuck cell information storing unit, the memory device 3420 may include a cell array including a plurality of memory cells, and the cell array may include stuck cells. The encoder may receive information regarding stuck cells from the stuck cell information storing unit, encode data to be stored in the cell array, generate a codeword, and generate a header corresponding to the codeword. The codeword generated by the encoder may include values of the stuck cells included in the cell array. The decoder may extract encoding information from the header and decode the data stored in the cell array based on the encoding information.
FIG. 15 is a block diagram showing a memory card 4000 according to an example embodiment of inventive concepts. A memory system according to example embodiments of inventive concepts described above may be the memory card 4000. For example, the memory card 4000 may include an embedded multimedia card (eMMC) or a secure digital (SD) card. As shown in FIG. 15, the memory card 4000 may include a memory controller 4100, a non-volatile memory 4200, and a port region 4300. A memory device according to example embodiments of inventive concepts may be the non-volatile memory 4200 shown in FIG. 15.
The memory controller 4100 may include an encoder, a decoder, and a stuck cell information storing unit according to example embodiments of inventive concepts as described above. The encoder and the decoder may perform an encoding method and a decoding method according to example embodiments of inventive concepts, whereas the stuck cell information storing unit may store information regarding stuck cells included in the non-volatile memory 4200. The memory controller 4100 may communicate with an external host via the port region 4300 in compliance with a pre-set protocol. The protocol may be eMMC protocol, SD protocol, SATA protocol, SAS protocol, or USB protocol. The non-volatile memory 4200 may include memory cells which retain data stored therein even if power supplied thereto is blocked. For example, the non-volatile memory 4200 may include a flash memory, a magnetic random access memory (MRAM), a resistance RAM (RRAM), a ferroelectric RAM (FRAM), or a phase change memory (PCM).
FIG. 16 is a block diagram showing an example network system 5000 including a memory system according to an example embodiment of inventive concepts. As shown in FIG. 16, the network system 5000 may include a server system 5100 and a plurality of terminals 5300, 5400, and 5500 that are connected via a network 5200. The server system 5100 may include a server 5110 for processing requests received from the plurality of terminals 5300, 5400, and 5500 connected to the network 5200 and a SSD 5120 for storing data corresponding to the requests received from the terminals 5300, 5400, and 5500. Here, the SSD 5120 may be a memory system according to an example embodiment of inventive concepts.
Meanwhile, a memory system according to example embodiments of inventive concepts may be mounted via any of various packages. For example, a memory system according to an example embodiment of inventive concepts may be mounted via any of packages including package on package (PoP), ball grid arrays (BGAs), chip scale packages (CSPs), plastic leaded chip carrier (PLCC), plastic dual in-line package (PDIP), die in waffle pack, die in wafer form, chip on board (COB), ceramic dual in-line package (CERDIP), plastic metric quad flat pack (MQFP), thin quad flatpack (TQFP), small outline (SOIC), shrink small outline package (SSOP), thin small outline (TSOP), system in package (SIP), multi chip package (MCP), wafer-level fabricated package (WFP), wafer-level processed stack package (WSP), etc.
It should be understood that example embodiments described herein should be considered in a descriptive sense only and not for purposes of limitation. Descriptions of features or aspects within each embodiment should typically be considered as available for other similar features or aspects in other example embodiments.
Example Embodiments of the Fast Linear Equations Solution Used by the SC Encoder
- Introduction
Let $F$ be any field and $m,n\ge 1$. A matrix $A\in F^{n\times m}$ is called here $C$-sparse for some $C>0$ if the number of non-zero entries in $A$ is upper bounded by $C\cdot n$ (though the reference to $C$ will usually be omitted here). The goal of this section is low-complexity algorithms that solve the set of linear equations $A\cdot x=y$, where $A\in F^{n\times m}$ is a sparse matrix and $x\in F^{m}$, $y\in F^{n}$ are column vectors. This goal is achieved by an algorithm that first diagonalizes $A$ and then (or simultaneously) solves the set of linear equations.
- Define for $i,j\in[n]$, $E'_{i,j,n}\in F^{n\times n}$ to be the matrix whose entries are all zeros except for the $(i,j)$ entry, which is 1. Elementary matrices are matrices of the type $(I_{n\times n}+x\cdot E'_{p,q,n})$ and also simple permutation matrices whose product swaps 2 coordinates such as the matrix
This section recalls that every matrix $A\in F^{n\times m}$ can be diagonalized, that is, decomposed (as is done in SVD with Householder matrices) as follows:
$A=E_1^{-1}\cdots E_{N-1}^{-1}\cdot E_N^{-1}\cdot I_{k,n\times m}\cdot (E'_M)^{-1}\cdot (E'_{M-1})^{-1}\cdots (E'_1)^{-1}$,
where $I_{k,n\times m}$ is diagonal with $(1,1,\ldots,1,0,\ldots,0)$ on the diagonal. The section proposes a greedy algorithm that at each subroutine minimizes (in some sense) the field operations done by the algorithm. Each subroutine reduces the remaining matrix by one column and one row, that is, reduces $n$ and $m$ by one. At this point it is a heuristic procedure aimed at reducing (and nearly minimizing) operations in the field $F$ for general sparse matrices.
- Define:
$M_{r,d,n,m}=\{A=(a_{i,j})_{i\in[n],j\in[m]}\in F^{n\times m}:\ \text{for all } 1\le i\le n:\ |\{j\in[m-d]: a_{i,j}\ne 0\}|\le r\}$.
Matrices of this type with constant $d$ correspond to our random coding, and hence they are our main focus in the effort of developing fast algorithms that solve $A\cdot x=y$. It will be shown that for $r=2$ there is an $O(n)$-complexity solution.
Left Product by Matrix of the Type $(I_{n\times n}+x\cdot E'_{p,q,n})$ is Equivalent to Adding a Scalar Product of a Row to Another Row
- Lemma 1: For $A=(a_{i,j})_{i\in[n],j\in[m]}\in F^{n\times m}$, $x\in F$ and $p,q\in[n]$,
- (i) $E'_{p,q,n}\cdot A=B=(b_{i,j})_{i\in[n],j\in[m]}$ where for $i\in[n]$, $j\in[m]$:
- $b_{i,j}=\delta_{i,p}\cdot a_{q,j}$ ($\delta_{i,p}$ is Kronecker's $\delta$).
That is, $B$ is the matrix whose rows are all zeros except for the $p$ row, which is equal to the $q$ row of $A$.
- (ii) $(I_{n\times n}+x\cdot E'_{p,q,n})\cdot A=C=(c_{i,j})_{i\in[n],j\in[m]}$ where for $i\in[n]$, $j\in[m]$:
- $c_{i,j}=a_{i,j}+x\cdot\delta_{i,p}\cdot a_{q,j}$ ($\delta_{i,p}$ is Kronecker's $\delta$).
That is, $C$ is the matrix whose rows are all equal to $A$'s rows except for the $p$ row, which is equal to the $p$ row of $A$ plus $x$ multiplied by the $q$ row of $A$.
Right Product by Matrix of the Type $(I_{m\times m}+x\cdot E'_{p,q,m})$ is Equivalent to Adding a Scalar Product of a Column to Another Column
- Lemma 2: For $A=(a_{i,j})_{i\in[n],j\in[m]}\in F^{n\times m}$, $x\in F$ and $p,q\in[m]$,
- (i) $A\cdot E'_{p,q,m}=B=(b_{i,j})_{i\in[n],j\in[m]}$ where for $i\in[n]$, $j\in[m]$:
- $b_{i,j}=\delta_{j,q}\cdot a_{i,p}$.
That is, $B$ is the matrix whose columns are all zeros except for the $q$ column, which is equal to the $p$ column of $A$.
- (ii) $A\cdot(I_{m\times m}+x\cdot E'_{p,q,m})=C=(c_{i,j})_{i\in[n],j\in[m]}$ where for $i\in[n]$, $j\in[m]$:
- $c_{i,j}=a_{i,j}+x\cdot\delta_{j,q}\cdot a_{i,p}$ ($\delta_{j,q}$ is Kronecker's $\delta$).
That is, $C$ is the matrix whose columns are all equal to $A$'s columns except for the $q$ column, which is equal to the $q$ column of $A$ plus $x$ multiplied by the $p$ column of $A$.
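By way of non-limiting illustration only, Lemma 1(ii) and Lemma 2(ii) may be checked numerically with the following sketch. It uses integer entries and 0-based indices, both of which are assumptions for the example, and the helper names `elementary` and `matmul` are hypothetical.

```python
def elementary(p, q, n, x):
    """Return I + x * E'_{p,q,n} as a list of rows (0-based p, q, p != q)."""
    E = [[int(i == j) for j in range(n)] for i in range(n)]
    E[p][q] += x
    return E


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2, 3],
     [4, 5, 6]]

# Lemma 1(ii): (I + x*E'_{p,q}) * A adds x times row q to row p.
rows_result = matmul(elementary(0, 1, 2, 10), A)

# Lemma 2(ii): A * (I + x*E'_{p,q}) adds x times column p to column q.
cols_result = matmul(A, elementary(0, 2, 3, 10))
```

Here `rows_result` has row 0 replaced by row 0 plus 10 times row 1, and `cols_result` has column 2 replaced by column 2 plus 10 times column 0.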
- Permutation and Swap Permutation Matrix
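Denote as usual the set of all permutations on $[n]$ by:
- $S_n=\{\pi:\ \pi:[n]\to[n]\ \text{is 1:1}\}$.
Let $\pi\in S_n$ and denote $\mathrm{Ord}(\pi)=\min\{k\ge 1: \pi^k=1\}$, where 1 denotes the identity permutation.
- Fix $F=Z_2$. For $\pi\in S_n$ define the associated permutation matrix $P_{\pi,n}=(a_{ij})_{1\le i,j\le n}\in F^{n\times n}$ by:
- (1) $a_{ij}=1_{\{j=\pi(i)\}}$ for all $i,j\in[n]$
Note the equivalence (1) $\Leftrightarrow$ (2) $\Leftrightarrow$ (3) $\Leftrightarrow$ (4), where:
- (2) $a_{ij}=\delta_{\pi(i),j}$ for all $i,j\in[n]$
- (3) $a_{ij}=1_{\{i=\pi^{-1}(j)\}}$ for all $i,j\in[n]$
- (4) $a_{ij}=\delta_{i,\pi^{-1}(j)}$ for all $i,j\in[n]$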
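- Proposition 1. For $\pi\in S_n$ and $v=[v_1,\ldots,v_n]^T\in F^n$:
- $P_{\pi,n}\cdot v=[v_{\pi(1)},\ldots,v_{\pi(n)}]^T$
- Proof. Let $u=[u_1,\ldots,u_n]^T=P_{\pi,n}\cdot v$; then for $i\in[n]$: $u_i=\sum_{1\le j\le n}\delta_{i,\pi^{-1}(j)}\cdot v_j=v_{\pi(i)}$ ♦
- Proposition 2. For $\pi,\sigma\in S_n$: $P_{\sigma,n}\cdot P_{\pi,n}=P_{\pi\circ\sigma,n}$, which implies $P_{\pi,n}^{-1}=P_{\pi^{-1},n}$.
- Proof. Take $v=[v_1,\ldots,v_n]^T\in F^n$, let $u=[u_1,\ldots,u_n]^T=P_{\pi,n}\cdot v$, and let $w=[w_1,\ldots,w_n]^T=P_{\sigma,n}\cdot u$.
That is, $w=P_{\sigma,n}\cdot P_{\pi,n}\cdot v$. Then for $i\in[n]$, $u_i=v_{\pi(i)}$, and hence for $k\in[n]$, $w_k=u_{\sigma(k)}=v_{\pi(\sigma(k))}$. Thus $P_{\sigma,n}\cdot P_{\pi,n}=P_{\pi\circ\sigma,n}$.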
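It follows from Proposition 1 that:
- Proposition 3. For $A=(a_{i,j})_{i\in[n],j\in[m]}\in F^{n\times m}$, $P_{\pi,n}\cdot A=B=(b_{i,j})_{i\in[n],j\in[m]}$ where for $i\in[n]$, $j\in[m]$: $b_{i,j}=a_{\pi(i),j}$
That is, a $\pi$-permutation of each column.
- Proposition 4. For $A=(a_{i,j})_{i\in[n],j\in[m]}\in F^{n\times m}$, $A\cdot P_{\pi,m}=B=(b_{i,j})_{i\in[n],j\in[m]}$ where for $i\in[n]$, $j\in[m]$: $b_{i,j}=a_{i,\pi^{-1}(j)}$
That is, a permutation of each row by $\pi^{-1}$ (for a swap permutation, $\pi^{-1}=\pi$).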
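By way of non-limiting illustration only, Propositions 1 through 3 and the effect of a right-hand swap permutation matrix may be checked with the following sketch. Permutations are represented as 0-based Python lists with `pi[i]` standing for $\pi(i)$; this representation and the helper names `perm_matrix`, `matmul`, and `apply` are assumptions for the example.

```python
def perm_matrix(pi):
    """Permutation matrix with a_ij = 1 exactly when j == pi[i]."""
    n = len(pi)
    return [[1 if j == pi[i] else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def apply(P, v):
    """Matrix-vector product."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]


pi = [1, 2, 0]
sigma = [0, 2, 1]

# Proposition 1: (P_pi * v)_i = v_{pi(i)}.
permuted = apply(perm_matrix(pi), [10, 20, 30])

# Proposition 2: P_sigma * P_pi = P_{pi o sigma}.
composed = [pi[sigma[i]] for i in range(3)]
product = matmul(perm_matrix(sigma), perm_matrix(pi))

# Proposition 3: P_pi * A permutes the rows, b_ij = a_{pi(i), j}.
row_permuted = matmul(perm_matrix(pi), [[1, 2], [3, 4], [5, 6]])

# A swap permutation matrix on the right exchanges two columns.
col_swapped = matmul([[1, 2, 3], [4, 5, 6]], perm_matrix([2, 1, 0]))
```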
Definition: for $p,q\in[n]$ define the swap permutation $\pi_{p,q}=\pi[p,q]\in S_n$ by:
- $\pi_{p,q}(p)=q$
- $\pi_{p,q}(q)=p$
- $\pi_{p,q}(i)=i$ for all $i\in[n]$, $i\ne p$ and $i\ne q$
And define the swap permutation matrix by
- $P_{p,q,n}=P_{\pi[p,q],n}$
When $p=q$, $P_{p,q,n}=I_{n\times n}$. Obviously the interesting case is when $p\ne q$. Right product with $P_{p,q,n}$ (that is, $P_{p,q,n}\cdot A$) switches the $p$ and $q$ rows. Left product with $P_{p,q,n}$ (that is, $A\cdot P_{p,q,n}$) switches the $p$ and $q$ columns.
- Definition—Elementary Matrix
Define for $p,q\in[n]$, $p\ne q$ and $x\in F$:
$E_{p,q,n,x}=I_{n\times n}+x\cdot E'_{p,q,n}$
Recall that right product with this matrix (that is, $E_{p,q,n,x}\cdot A$) amounts to adding $x$ multiplied by the $q$ row to the $p$ row. Left product with this matrix (that is, $A\cdot E_{p,q,n,x}$) amounts to adding $x$ multiplied by the $p$ column to the $q$ column. Define for $p\in[n]$ and $x\in F$:
$E_{p,n,x}=I_{n\times n}+(x-1)\cdot E'_{p,p,n}$
Note that right product with this matrix amounts to multiplying the $p$ row by $x$, while leaving all other rows unchanged. Left product with this matrix amounts to multiplying the $p$ column by $x$, while leaving all other columns unchanged. Elementary matrices of this type are not relevant to our $Z_2$ applications, but they are relevant to other applications.
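Set
$E^*_{0,n}=\{E_{p,n,x}:\ p\in[n],\ x\in F\setminus\{0\}\}$
$E^*_{1,n}=\{E_{p,q,n,x}:\ p,q\in[n],\ p\ne q,\ x\in F\setminus\{0\}\}$
$E^*_{2,n}=\{P_{p,q,n}:\ p,q\in[n],\ p\ne q\}$
$E^*_n=E^*_{0,n}\cup E^*_{1,n}\cup E^*_{2,n}$.
$E^*_n$ is called here the set of elementary matrices in $F^{n\times n}$.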
Inverse of Elementary Matrix is Elementary Matrix of the Same Type
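- Proposition: For $p,q\in[n]$, $p\ne q$, $x\in F\setminus\{0\}$:
- (i) $E_{p,n,x}^{-1}=E_{p,n,x^{-1}}$
- (ii) $P_{p,q,n}^{-1}=P_{p,q,n}$
- (iii) $E_{p,q,n,x}^{-1}=E_{p,q,n,-x}$
- Proof: (i) is obvious, (ii) follows from Proposition 2 above, and (iii) follows from:
$(I_{n\times n}+x\cdot E'_{p,q,n})\cdot(I_{n\times n}-x\cdot E'_{p,q,n})=I_{n\times n}+x\cdot E'_{p,q,n}-x\cdot E'_{p,q,n}-x^2\cdot E'_{p,q,n}\cdot E'_{p,q,n}=I_{n\times n}$, since $E'_{p,q,n}\cdot E'_{p,q,n}=0$ for $p\ne q$.
- Diagonalization: Definition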
For $k\le\min(n,m)$ define the $k$ unit diagonal matrix of $F^{n\times m}$, $I_{k,n\times m}$, by:
- $[I_{k,n\times m}]_{ii}=1$ for all $1\le i\le k$ and $[I_{k,n\times m}]_{ij}=0$ for all $i\in[n]$ and $j\in[m]$ where $i\ne j$ or $i>k$.
Let $A\in F^{n\times m}$. The sequences of matrices $E_1,\ldots,E_N\in E^*_n$ and $E'_1,\ldots,E'_M\in E^*_m$ are said to diagonalize $A$ if for some $k\le\min(n,m)$:
- $I_{k,n\times m}=E_N\cdot E_{N-1}\cdots E_1\cdot A\cdot E'_1\cdot E'_2\cdots E'_M$
Note that this is equivalent to the following (SVD-like) decomposition of $A$:
- $A=E_1^{-1}\cdots E_{N-1}^{-1}\cdot E_N^{-1}\cdot I_{k,n\times m}\cdot (E'_M)^{-1}\cdot (E'_{M-1})^{-1}\cdots (E'_1)^{-1}$
- Diagonalization Theorem
For every $A\in F^{n\times m}$ there exist $E_1,\ldots,E_N\in E^*_n$ and $E'_1,\ldots,E'_M\in E^*_m$ and a unique $k$ such that:
- $I_{k,n\times m}=E_N\cdot E_{N-1}\cdots E_1\cdot A\cdot E'_1\cdot E'_2\cdots E'_M$ ♦
Observe that $k=\mathrm{rank}(A)$ and hence $k$ is unique. We will prove this well-known proposition via a constructive proof that provides a greedy minimization of the number of arithmetic operations and of $N+M$. To this end we must first define the associated complexity norm.
- Diagonalization with Given Bracket
For $K\ge 1$, $\theta=(\theta_1,\ldots,\theta_K)$ is called here a brackets sequence if $\theta_1\in\{0,1\}$ and for all $i\in\{2,\ldots,K\}$, $\theta_i-\theta_{i-1}\in\{0,1\}$. Let $A\in F^{n\times m}$ and suppose $E_1,\ldots,E_N\in E^*_n$ and $E'_1,\ldots,E'_M\in E^*_m$ diagonalize $A$ for some $k\le\min(n,m)$, that is
- $I_{k,n\times m}=E_N\cdot E_{N-1}\cdots E_1\cdot A\cdot E'_1\cdot E'_2\cdots E'_M$.
Let $\theta=(\theta_1,\ldots,\theta_{M+N})$ be a brackets sequence that satisfies $\theta_{M+N}=N$. We define the $\theta$-ordered diagonalization process (that is, the diagonalization process with the elementary matrices $E_1,\ldots,E_N\in E^*_n$ and $E'_1,\ldots,E'_M\in E^*_m$ whose brackets are determined by the sequence $\theta$) to be the sequence
- $A=A_0, A_1, A_2,\ldots,A_{M+N}=I_{k,n\times m}$
where for $k\in[M+N]$:
- $A_k=E_{\theta(k)}\cdot E_{\theta(k)-1}\cdots E_1\cdot A\cdot E'_1\cdot E'_2\cdots E'_{k-\theta(k)}$.
This implies that for $k\in[M+N]$ (with the convention $\theta_0=0$), when $\theta_k-\theta_{k-1}=1$:
- $A_k=E_{\theta(k)}\cdot A_{k-1}$,
and when $\theta_k-\theta_{k-1}=0$:
- $A_k=A_{k-1}\cdot E'_{k-\theta(k)}$.
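By way of non-limiting illustration only, the way a brackets sequence $\theta$ fixes the order in which left and right elementary matrices are applied may be sketched as follows. The sketch assumes the convention $\theta_0=0$ and reports, for each step $k$, whether the step applies $E_{\theta(k)}$ on the left or $E'_{k-\theta(k)}$ on the right. The function name `bracket_steps` and the tuple encoding are hypothetical.

```python
def bracket_steps(theta):
    """Expand a brackets sequence into the ordered list of applied matrices.

    Each result item is ("E", i) for a left product with E_i or
    ("E'", i) for a right product with E'_i (1-based indices).
    """
    steps = []
    prev = 0  # assumed convention: theta_0 = 0
    for k, t in enumerate(theta, start=1):
        if t - prev not in (0, 1):
            raise ValueError("not a brackets sequence")
        if t - prev == 1:
            steps.append(("E", t))        # A_k = E_theta(k) * A_{k-1}
        else:
            steps.append(("E'", k - t))   # A_k = A_{k-1} * E'_{k-theta(k)}
        prev = t
    return steps
```

For example, with $N=2$ and $M=2$, the sequence $\theta=(0,1,1,2)$ applies $E'_1$, then $E_1$, then $E'_2$, then $E_2$.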
- The Complexity of a Single Product with Elementary Matrix and Complexity of Diagonalization
Let $A=(a_{i,j})_{i\in[n],j\in[m]}\in F^{n\times m}$, and define for $i\in[n]$, $j\in[m]$:
- $X(A,i)=|\{j\in[m]: a_{ij}\ne 0\}|$
- $Y(A,j)=|\{i\in[n]: a_{ij}\ne 0\}|$
- $Z(A,i,j)=\max(X(A,i)-1,\,0)\cdot\max(Y(A,j)-1,\,0)$.
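By way of non-limiting illustration only, the quantities $X$, $Y$, and $Z$ may be computed as in the following sketch, which stores the matrix as a dense list of rows and uses 0-based indices. Both choices, and the function names `row_weight`, `col_weight`, and `z_cost`, are assumptions for the example; a sparse representation would normally be preferred for the matrices considered here.

```python
def row_weight(A, i):
    """X(A, i): number of non-zero entries in row i."""
    return sum(1 for a in A[i] if a != 0)


def col_weight(A, j):
    """Y(A, j): number of non-zero entries in column j."""
    return sum(1 for row in A if row[j] != 0)


def z_cost(A, i, j):
    """Z(A, i, j) = max(X(A, i) - 1, 0) * max(Y(A, j) - 1, 0)."""
    return max(row_weight(A, i) - 1, 0) * max(col_weight(A, j) - 1, 0)
```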
Let $p,q\in[n]$, $p\ne q$, $x\in F\setminus\{0\}$. We define the complexity of the product $E_{p,q,n,x}\cdot A$ by:
- $C(E_{p,q,n,x},A)=c_1\cdot(X(A,q)-1)$
where $c_1>0$ is a constant that reflects the complexity of one scalar product and one scalar addition. Recall that right product with $E_{p,q,n,x}$ amounts to adding to the $p$ row $x$ multiplied by the $q$ row. In the case $F=Z_2$ the product is not relevant, so $c_1$ depends (of course) on $F$. The reason for defining this complexity as $c_1\cdot(X(A,q)-1)$ and not $c_1\cdot X(A,q)$ is that in our algorithms, when we add a scalar product of one row to another, the cancellation at the “leading 1” column is obtained without a need for computation. Likewise we define the complexity of the product $A\cdot E_{p,q,n,x}$ by:
- $C(A,E_{p,q,n,x})=c_1\cdot(Y(A,p)-1)$
Recall that left product with this matrix amounts to adding $x$ multiplied by the $p$ column to the $q$ column. Let $p\in[n]$, $x\in F\setminus\{0\}$. We define the complexity of the product $E_{p,n,x}\cdot A$ by:
- $C(E_{p,n,x},A)=c_0\cdot X(A,p)$,
where $c_0>0$ is a constant that reflects the complexity of one scalar product. Likewise we define the complexity of the product $A\cdot E_{p,n,x}$ by:
- $C(A,E_{p,n,x})=c_0\cdot Y(A,p)$.
Let $p,q\in[n]$, $p\ne q$. We define the complexity of the products $P_{p,q,n}\cdot A$ and $A\cdot P_{p,q,n}$ by:
- $C(P_{p,q,n},A)=C(A,P_{p,q,n})=c_2$
where $c_2>0$ is a constant.
- The Complexity of Diagonalization
Let $A\in F^{n\times m}$ and suppose $E_1,\ldots,E_N\in E^*_n$ and $E'_1,\ldots,E'_M\in E^*_m$ diagonalize $A$ for some $k\le\min(n,m)$, that is
- $I_{k,n\times m}=E_N\cdot E_{N-1}\cdots E_1\cdot A\cdot E'_1\cdot E'_2\cdots E'_M$.
Let $\theta=(\theta_1,\ldots,\theta_{M+N})$ be a brackets sequence that satisfies $\theta_{M+N}=N$. We define the complexity of the diagonalization process that corresponds to $E_1,\ldots,E_N\in E^*_n$ and $E'_1,\ldots,E'_M\in E^*_m$ and $\theta$ to be:
- $C=C_1+\ldots+C_{M+N}$
where $C_1,\ldots,C_{M+N}$ are defined as follows. For $k\in[M+N]$, set as before:
- $A_k=E_{\theta(k)}\cdot E_{\theta(k)-1}\cdots E_1\cdot A\cdot E'_1\cdot E'_2\cdots E'_{k-\theta(k)}$.
For $k\in[M+N]$, if $\theta_k-\theta_{k-1}=1$ then $A_k=E_{\theta(k)}\cdot A_{k-1}$, and we define $C_k=C(E_{\theta(k)},A_{k-1})$; if $\theta_k-\theta_{k-1}=0$, then $A_k=A_{k-1}\cdot E'_{k-\theta(k)}$, and we define $C_k=C(A_{k-1},E'_{k-\theta(k)})$.
k-Unit Matrix
- Definition. Let $A\in F^{n\times m}$ and $\min(m,n)\ge k\ge 1$. $A$ is called a $k$-unit matrix if $[A]_{i,j}=\delta_{ij}$ for all $(i,j)\in[n]\times[m]$ such that $i\le k$ or $j\le k$. This means that the first $k$ rows and the first $k$ columns are zero except for the diagonal, which is all ones.
- The Generic Greedy Diagonalization Algorithm and a Proof to the Diagonalization Theorem
The following algorithm is effective for sparse matrices. A variant of this algorithm that will be described later seems to be particularly effective for our ensembles of matrices. This target algorithm, which is designed for our ensembles, will evolve from this generic algorithm. Let $A\in F^{n\times m}$ be a non-zero matrix; $A$ is the input of the following algorithm. Its output is $E_1,\ldots,E_N\in E^*_n$ and $E'_1,\ldots,E'_M\in E^*_m$ that diagonalize $A$ for some $K\le\min(n,m)$, that is
- $I_{K,n\times m}=E_N\cdot E_{N-1}\cdots E_1\cdot A\cdot E'_1\cdot E'_2\cdots E'_M$.
A by-product of the algorithm is $\theta=(\theta_1,\ldots,\theta_{M+N})$, a brackets sequence that satisfies $\theta_{M+N}=N$ and which corresponds to the diagonalization process executed by the algorithm.
The algorithm is executed by $K\le\min(n,m)$ subroutines. It is supposed inductively that the input of the $k$-subroutine ($1\le k\le\min(n,m)$) of the algorithm is a matrix $B_k$ such that $B_1=A$ and for $k>1$ $B_k$ is a $(k-1)$-unit matrix, and in addition the input of the $k$-subroutine includes $E_1,\ldots,E_{N(k)}\in E^*_n$ and $E'_1,\ldots,E'_{M(k)}\in E^*_m$ such that:
- $B_k=E_{N(k)}\cdot E_{N(k)-1}\cdots E_1\cdot A\cdot E'_1\cdot E'_2\cdots E'_{M(k)}$
If $B_k=I_{k-1,n\times m}$ then the algorithm terminates and its output is the input of the $k$-subroutine. Otherwise the $k$-subroutine computes the matrix $B_{k+1}$ (which is a $k$-unit matrix and the input of the next subroutine). It is also supposed inductively that the input of the $k$-subroutine of the algorithm includes for all $i\in[n]$, $j\in[m]$ the values
- $X(B_k,i)$, $Y(B_k,j)$
and, when $[B_k]_{i,j}\ne 0$, also the value $Z(B_k,i,j)$. The additional memory required for this storage is $M(B_k)$, which is defined by:
- $M_1(B_k)=\{X(B_k,i),\ Y(B_k,j):\ i\in[n],\ j\in[m]\}$
- $M_2(B_k)=\{Z(B_k,i,j):\ i\in[n],\ j\in[m] \text{ and } [B_k]_{i,j}\ne 0\}$
- $M(B_k)=M_1(B_k)\cup M_2(B_k)$.
Note that the memory size required to store $M(B_k)$ is proportional to the memory size required to store the nonzero entries of $B_k$. The $k$-subroutine of the algorithm starts by listing the coordinates $(i,j)$, $i,j\ge k$, for which $[B_k]_{i,j}\ne 0$ and $Z(B_k,i,j)$ is minimal, that is, obtaining $\min(M_2(B_k))$ and forming the set
- $M_3(B_k)=\{(i,j):\ k\le i\le n,\ k\le j\le m,\ [B_k]_{i,j}\ne 0 \text{ and } Z(B_k,i,j)=\min(M_2(B_k))\}$
Next the $k$-subroutine selects the $(i^*,j^*)\in M_3(B_k)$ for which $X(B_k,i)+Y(B_k,j)$ is maximal. The swap permutation elementary matrices will now be utilized to “move” $(i^*,j^*)$ to $(k,k)$. This is done by computing (where required):
- $B_{k,1}=P_{k,i^*,n}\cdot B_k\cdot P_{k,j^*,m}$
(Note that if $k=i^*$ then $P_{k,i^*,n}=I_{n\times n}$, so this product is not required, and if $k=j^*$ then $P_{k,j^*,m}=I_{m\times m}$ and again this product is not required.) Observe that now $(k,k)$ took the place of $(i^*,j^*)$: $(k,k)\in M_3(B_{k,1})$ and $X(B_{k,1},k)+Y(B_{k,1},k)$ is the maximal value of $X(B_{k,1},i)+Y(B_{k,1},j)$ for $(i,j)\in M_3(B_{k,1})$. At this point the algorithm updates $X(,)$, $Y(,)$, $Z(,)$. Observe that:
- 1) $X(B_{k,1},k)=X(B_k,i^*)$
- 2) $X(B_{k,1},i^*)=X(B_k,k)$
- 3) $X(B_{k,1},i)=X(B_k,i)$ for all $i\in[n]$, $i\ne i^*$ and $i\ne k$
- 4) $Y(B_{k,1},k)=Y(B_k,j^*)$
- 5) $Y(B_{k,1},j^*)=Y(B_k,k)$
- 6) $Y(B_{k,1},j)=Y(B_k,j)$ for all $j\in[m]$, $j\ne j^*$ and $j\ne k$
And hence $Z$ is updated accordingly by a swap on its definitions in the rows $i^*$ and $k$ and the columns $j^*$ and $k$.
Let $x=([B_{k,1}]_{k,k})^{-1}$. If $x=1$ (which is always the case when $F=Z_2$) set $B_{k,2}=B_{k,1}$. Otherwise multiply the $k$ row by $x$, i.e., compute the corresponding product with an elementary matrix:
- $B_{k,2}=E_{k,n,x}\cdot B_{k,1}$
Note again that $X(B_{k,2},k)=X(B_k,i^*)$ and $Y(B_{k,2},k)=Y(B_k,j^*)$. Now $[B_{k,2}]_{k,k}=1$. If $X(B_{k,2},k)=Y(B_{k,2},k)=1$, then the $k$-step ends here and $B_{k+1}=B_{k,2}$. Suppose now that this is not the case, and assume for example that $X(B_{k,2},k)\ge Y(B_{k,2},k)$, which implies that $X(B_{k,2},k)>1$. To simplify notation write $B=(b_{ij}=b(i,j))_{i\in[n],j\in[m]}=B_{k,2}$. Then $d=X(B,k)-1=X(B_k,i^*)-1$ is the number of nonzero elements in the $k$ row of $B$, except for $b_{kk}$, and $B$ is a $(k-1)$-unit matrix. That is, there are $k<j(1)<j(2)<\ldots<j(d)\le m$ such that $b_{k,j(l)}\ne 0$ for $l\in[d]$, and all the rest of the $k$-row, except for $b_{kk}$, is zeros. It is also provided that the first $k-1$ elements of the $k$-column are zeros. The $k$-subroutine will now apply left product with $d$ elementary matrices of the type $E^*_{1,m}=\{E_{p,q,m,x}\}$ to zero the $k$ row, that is, the entries $b_{k,j(l)}\ne 0$ for $l\in[d]$. Thus it computes $B_{k,3}$ as follows:
- (i) $B_{k,3}=B_{k,2}\cdot E_{k,j(1),m,-b(k,j(1))}\cdot E_{k,j(2),m,-b(k,j(2))}\cdots E_{k,j(d),m,-b(k,j(d))}$
Recall that right product with $E_{p,q,n,x}$ (that is, $E_{p,q,n,x}\cdot A$) amounts to adding $x$ multiplied by the $q$ row to the $p$ row, and left product with a matrix of this type (that is, $A\cdot E_{p,q,m,x}$) amounts to adding $x$ multiplied by the $p$ column to the $q$ column.
Each product with $E_{k,j(l),m,-b(k,j(l))}$, $l\in[d]$, is equivalent to adding a scalar product of the $k$ column of $B$ to each one of the $d$ columns of $B$ that intersect the $k$ row at a non-zero entry. This is done in such a way that it cancels the $k$ element in these columns. Following this procedure the only non-zero element in the $k$ row of $B_{k,3}$ is $[B_{k,3}]_{k,k}$. Note also that each product with $E_{k,j(l),m,-b(k,j(l))}$ implies adding a column whose first $(k-1)$ entries are zero to another column of this type. The result is that the first $(k-1)$ entries of each such column of $B_{k,3}$ are zero. Since $k<j(1)<j(2)<\ldots<j(d)\le m$, the first $(k-1)$ entries of each row of $B_{k,3}$ with index at least $k$ are also zero. Thus $B_{k,3}$ is a $(k-1)$-unit matrix whose $k$-row is all zeros except for $[B_{k,3}]_{k,k}$, which is one. At this point the algorithm updates $X(,)$, $Y(,)$, $Z(,)$.
Now $Y(B_{k,3},k)=Y(B_k,j^*)$ and $X(B_{k,3},k)=1$. If $Y(B_{k,3},k)=1$, then the $k$-step ends here and $B_{k+1}=B_{k,3}$. Suppose now that this is not the case, and $Y(B_{k,3},k)>1$. For notational simplicity set $C=(c_{ij}=c(i,j))_{i\in[n],j\in[m]}=B_{k,3}$. There are exactly $d'=Y(C,k)-1$ nonzero elements in the $k$ column of $C$, except for $c_{kk}$. That is, there are $k<i(1)<i(2)<\ldots<i(d')\le n$ such that $c_{i(l),k}\ne 0$ for $l\in[d']$, and all the rest of the $k$-column, except for $c_{kk}$, is zeros. The $k$-subroutine now applies right product with $d'$ elementary matrices of the type $E^*_{1,n}$ to zero the $k$ column (off the $(k,k)$ entry) in the following way:
- (ii) $B_{k,4}=E_{i(1),k,n,-c(i(1),k)}\cdot E_{i(2),k,n,-c(i(2),k)}\cdots E_{i(d'),k,n,-c(i(d'),k)}\cdot B_{k,3}$
Note that no arithmetic field operations are involved here; in fact the relative complexity of (ii) is marginal. Set $B_{k+1}=B_{k,4}$. A key point with respect to complexity is that almost all the complexity of the $k$-subroutine is in (i). Now $Z(B_k,i^*,j^*)=Z(B_{k,2},k,k)$ multiplications and additions were involved in (i). This is so since we added the $k$-column to another column in (i) $X(B_{k,2},k)-1$ times, performing $Y(B_{k,2},k)-1$ products and additions at each of these times. Thus the motivation in choosing $(i^*,j^*)$ is greedy minimization of the bulk complexity of the $k$-subroutine. Also $Z(B_k,i^*,j^*)-X(B_{k,1},k)-Y(B_{k,1},k)+2$ is an upper bound to the number of non-zero entries that were added to $B_k$ via (i) and (ii). At this stage the algorithm updates $X(,)$, $Y(,)$, $Z(,)$ as follows:
- 1) $Y(B_{k+1},k)=1$
- 2) $X(B_{k+1},k)=1$
- 3) $Y(B_{k+1},j(l))=Y(B_{k,1},j(l))-1$ for $l\in[d]$
- 4) $X(B_{k+1},i(l))=X(B_{k,1},i(l))-1$ for $l\in[d']$
And $Z(,)$ is updated accordingly. Note that the supposition $X(B_{k,2},k)\ge Y(B_{k,2},k)$ led us to start the $k$ subroutine by an elimination of the $k$-row in (i). Likewise, if we had $X(B_{k,2},k)\le Y(B_{k,2},k)$ we would start the $k$ subroutine by an elimination of the $k$-column and then eliminate the $k$ row.
The algorithm terminates at $k$ if $B_k=I_{k-1,n\times m}$ for some $k<\min(n,m)$. Otherwise it terminates at $k=\min(n,m)$.
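By way of non-limiting illustration only, the generic greedy diagonalization may be sketched as follows for $F=Z_2$. The sketch makes several simplifying assumptions that are not part of the description above: the matrix is stored densely, indices are 0-based, $X$ and $Y$ are recomputed at every step instead of being updated incrementally, and the $k$ row is always cleared before the $k$ column regardless of which of $X$ and $Y$ is larger. The function name `greedy_diagonalize_gf2` and the operation tuples are hypothetical.

```python
def greedy_diagonalize_gf2(A):
    """Diagonalize A over GF(2); return (B, ops, K) with B = I_{K, n x m}."""
    B = [list(row) for row in A]
    n, m = len(B), len(B[0])
    ops = []  # recorded elementary operations, in the order applied
    k = 0
    while k < min(n, m):
        X = [sum(row) for row in B]
        Y = [sum(B[i][j] for i in range(n)) for j in range(m)]
        cands = [(i, j) for i in range(k, n) for j in range(k, m) if B[i][j]]
        if not cands:
            break  # remaining block is zero: B is already I_{k, n x m}
        # Greedy choice: minimal Z, ties broken by maximal X + Y.
        i_s, j_s = min(
            cands,
            key=lambda c: ((X[c[0]] - 1) * (Y[c[1]] - 1),
                           -(X[c[0]] + Y[c[1]])))
        if i_s != k:  # swap permutation on the rows
            B[k], B[i_s] = B[i_s], B[k]
            ops.append(("swap_rows", k, i_s))
        if j_s != k:  # swap permutation on the columns
            for row in B:
                row[k], row[j_s] = row[j_s], row[k]
            ops.append(("swap_cols", k, j_s))
        # Step (i): add column k to every column j > k with B[k][j] = 1.
        for j in range(k + 1, m):
            if B[k][j]:
                for i in range(n):
                    B[i][j] ^= B[i][k]
                ops.append(("add_col", k, j))
        # Step (ii): add row k to every row i > k with B[i][k] = 1.
        # Row k now has a single 1 at (k, k), so only entry (i, k) changes.
        for i in range(k + 1, n):
            if B[i][k]:
                B[i][k] = 0
                ops.append(("add_row", k, i))
        k += 1
    return B, ops, k
```

The returned `K` equals the rank of the input over GF(2), and the recorded operations correspond to the elementary matrices $E_1,\ldots,E_N$ (row operations) and $E'_1,\ldots,E'_M$ (column operations).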
- An Algorithm that Solves A·x=y when Diagonalization is Provided
Let A∈Fn×m\{0} and y∈Fn, and suppose that we want to solve the system of linear equations A·x=y for x∈Fm. Our first step is to find E1, . . . , EN∈E*n and E′1, . . . , E′M∈E*m that diagonalize A for some k≦min(n,m), that is
- (1) $I_{k,n\times m}=E_N\cdot E_{N-1}\cdots E_1\cdot A\cdot E'_1\cdot E'_2\cdots E'_M.$
In the case k<min(n,m) our algorithm declares failure. We note, however, that in some uncommon cases a solution to the original equation A·x=y might still exist, and even if not, there are alternatives to declaring failure, such as a small ad-hoc correction to A when min(n,m)−k is small. We assume henceforth that k=min(n,m), which is equivalent to A being full ranked. It follows from (1) that
- (2) $A=E_1^{-1}\cdot E_2^{-1}\cdots E_{N-1}^{-1}\cdot E_N^{-1}\cdot I_{k,n\times m}\cdot (E'_M)^{-1}\cdot (E'_{M-1})^{-1}\cdots (E'_2)^{-1}\cdot (E'_1)^{-1}.$
Hence it is desired to solve
- (3) $y=E_1^{-1}\cdot E_2^{-1}\cdots E_{N-1}^{-1}\cdot E_N^{-1}\cdot I_{k,n\times m}\cdot (E'_M)^{-1}\cdot (E'_{M-1})^{-1}\cdots (E'_2)^{-1}\cdot (E'_1)^{-1}\cdot x,$
which is equivalent to
- (4) $z\equiv E_N\cdot E_{N-1}\cdots E_1\cdot y=I_{k,n\times m}\cdot\left((E'_M)^{-1}\cdot (E'_{M-1})^{-1}\cdots (E'_2)^{-1}\cdot (E'_1)^{-1}\cdot x\right).$
Write
- (5) $u=(E'_M)^{-1}\cdot (E'_{M-1})^{-1}\cdots (E'_2)^{-1}\cdot (E'_1)^{-1}\cdot x.$
So solving A·x=y is equivalent to solving
- (6) $z=I_{k,n\times m}\cdot u.$
Note that z is computed from y via N successive coordinate switches, products, and additions in the field F. This can be done simultaneously with the diagonalization of A. In the equal-dimensional and under-determined case, where m≧n,
- $u=[z,\,0_{F^{m-n}}]^T\in F^m$
solves the equation z=Ik,n×m·u. In the over-determined case, where m<n, u∈Fm solves the equation z=Ik,n×m·u if zi=0 for all m<i≦n and ui=zi for all i∈[m]. Finally, whenever the equation z=Ik,n×m·u has a solution, the original equation A·x=y has a solution, since by (5) in this case:
- $x=E'_1\cdot E'_2\cdots E'_M\cdot u$
Note again that in the under-determined case, where m≧n, a solution exists when A is full ranked, and in the over-determined case, where m<n, a solution exists when A is full ranked and y is in the subspace spanned by A's columns.
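The procedure above can be sketched in code. The following is a minimal illustration only, assuming F=Z2 (so every non-zero scalar is 1 and subtraction is XOR), 0-based indices, and a plain first-found pivot instead of the greedy choice. The names `solve_by_diagonalization` and `mat_vec` are hypothetical and do not come from the text. Row operations are mirrored on y as they are applied to A, and column operations are recorded and replayed in reverse order to recover x from u.

```python
def mat_vec(A, x):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, x)) % 2 for row in A]


def solve_by_diagonalization(A, y):
    """Solve A.x = y over GF(2); return x, or None on failure."""
    n, m = len(A), len(A[0])
    B = [row[:] for row in A]   # working copy, reduced to I_{k,n x m}
    z = list(y)                 # row operations are mirrored on y
    col_ops = []                # column operations E'_1, ..., E'_M in order
    k = 0
    while k < min(n, m):
        pivot = next(((i, j) for i in range(k, n)
                      for j in range(k, m) if B[i][j]), None)
        if pivot is None:
            return None         # rank < min(n, m): declare failure
        i, j = pivot
        B[k], B[i] = B[i], B[k]             # row swap
        z[k], z[i] = z[i], z[k]
        if j != k:                          # column swap
            for row in B:
                row[k], row[j] = row[j], row[k]
            col_ops.append(("swap", k, j))
        for r in range(n):                  # clear column k by row additions
            if r != k and B[r][k]:
                B[r] = [a ^ b for a, b in zip(B[r], B[k])]
                z[r] ^= z[k]
        for c in range(k + 1, m):           # clear row k by column additions
            if B[k][c]:
                B[k][c] = 0                 # column k equals e_k here
                col_ops.append(("add", k, c))
        k += 1
    if any(z[k:]):
        return None             # over-determined and inconsistent
    u = z[:k] + [0] * (m - k)   # u solves z = I_{k,n x m}.u
    for op, a, b in reversed(col_ops):      # x = E'_1 ... E'_M . u
        if op == "swap":
            u[a], u[b] = u[b], u[a]
        else:
            u[a] ^= u[b]
    return u


A = [[1, 1, 0], [0, 1, 1]]
y = [1, 0]
x = solve_by_diagonalization(A, y)
assert mat_vec(A, x) == y
```

Recording only the type and indices of each column operation, instead of storing full matrices, mirrors the remark that z can be computed alongside the diagonalization.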
Diagonalizing ensemble-2 matrix & solving A·x=y in O(n)-worst-case-complexity
(Demonstrating the technique when d=0)
To demonstrate the concept let, at first, A=(aij)iε[n],jε[m]εFn×m be a matrix with no more than 2 ones per row, that is AεM2,d,n,m where d=0. It would be assumed without loss of generality that m≧n. The proposed algorithm is a slight variation on the above greedy algorithm. It is characterized by the fact that in the above generic algorithm if at the k-subroutine X(Bk,2,k)≦Y(Bk,2,k) we would carry out the k subroutine by first eliminating the k-column in (i) and then eliminate the k row.
Fix now d≧0. The passage to the general case, where A∈M2,d, in the next section implies a rather straightforward scaling of the complexity with d, i.e. O(n·(d+1)) complexity. Observe that every matrix in M2,d,n,m can be formed by taking a matrix A∈M2,0 and adding to it d random columns (that is, with independent and identically distributed (i.i.d.) entries, each F-scalar having probability 1/|F| per entry). We now add a stopping rule to our algorithm: if at some point in the process it is found that the rank of A is below n−d, the algorithm halts.
A key observation is that when we subtract a multiple of one row from another row with the same position of leading non-zero entry, and each of these 2 rows has at most 2 non-zero entries, the result is also a row with at most 2 non-zero entries. Let ei be the i unit row of Fm and take two such rows:
- u=ek+x·ei and v=y·ek+z·ej, 1≦k<i,j≦m,
and note that
- v−y·u=z·ej−x·y·ei
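Spelled out term by term, using only the definitions of u and v above, the cancellation of the ek component reads:

```latex
\begin{aligned}
v - y\cdot u &= (y\cdot e_k + z\cdot e_j) - y\cdot(e_k + x\cdot e_i) \\
             &= y\cdot e_k - y\cdot e_k + z\cdot e_j - x\cdot y\cdot e_i \\
             &= z\cdot e_j - x\cdot y\cdot e_i .
\end{aligned}
```

The result is supported on positions i and j only, so it has at most two non-zero entries (a single, possibly zero, entry when i=j).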
Before describing the algorithm's steps, we state an inductive assumption that at the beginning of the k subroutine each row of Bk (see generic algorithm above) has no more than 2 non-zero entries. It will be shown that this assumption carries over to k+1. At the beginning of each subroutine we go to a non-zero column with index j*∈{k,k+1, . . . , m} that has a minimal number of non-zero elements. (With high probability w.r.t. our ensembles, there will be a single one in the j* column.) Let i* be a row having a minimal number of non-zero entries under the constraint [Bk]i*,j*≠0. Set:
- Bk,1=Pk,i*,n·Bk·Pk,j*,n
Let x=([B′k]k,k)−1 and compute (when required):
- Bk,2=Ek,n,x·Bk,1
As a result row i* was swapped with row k and column j* was swapped with column k. Next, if Y(Bk,2,k)=Y(Bk,j*)=1 then the k subroutine starts by eliminating the k row as in (i). Otherwise, if Y(Bk,2,k)=Y(Bk,j*)>1, the k subroutine starts by eliminating the k column as in (ii). In both cases, after (i) & (ii) are performed there is one non-zero entry in the k row (in fact 1 in the (k,k) entry) and one non-zero entry in the k column. In addition the resulting matrix, Bk+1, has no more than 2 non-zero entries in each row. Also, the total number of non-zero entries cannot increase in such a subroutine. Note that Y(Bk,j*)−1 is an upper bound on the number of field operations in the k-subroutine. This is so since X(Bk,i*)−1≦1 and hence Y(Bk,j*)−1≧(Y(Bk,j*)−1)·(X(Bk,i*)−1)=Z(Bk,i*,j*). Recall that Z(Bk,i*,j*) is the number of field operations in the k-subroutine. Note also that Y(Bk,j*)+3 is an upper bound on the number of products with elementary matrices in the k-subroutine.
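The pivot choice at the start of each subroutine can be sketched as follows. This is an illustration under stated assumptions, not the definitive procedure: indices are 0-based, column weights are counted over rows k and below (the trailing submatrix), ties are broken by the smallest index, and the name `choose_pivot` is hypothetical.

```python
def choose_pivot(B, k):
    """Return (i_star, j_star) for step k, or None if no non-zero column remains.

    j_star: a non-zero column with index >= k having minimal weight.
    i_star: a row with B[i_star][j_star] != 0 having minimal row weight.
    """
    n, m = len(B), len(B[0])
    weights = {}
    for j in range(k, m):
        wj = sum(1 for i in range(k, n) if B[i][j] != 0)
        if wj > 0:
            weights[j] = wj
    if not weights:
        return None
    j_star = min(weights, key=lambda j: (weights[j], j))
    rows = [i for i in range(k, n) if B[i][j_star] != 0]
    i_star = min(rows, key=lambda i: (sum(1 for e in B[i] if e != 0), i))
    return i_star, j_star


B = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]
print(choose_pivot(B, 0))
```

In this example the first column contains a single one, which is the situation the text describes as occurring with high probability in the target ensembles.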
Define Ck=([Bk]i,j)k≦i≦n,k≦j≦m. It follows from the above stopping rule that if the number of non-zero columns in Ck is below n−k−d+1 then the algorithm comes to halt. Observe also that by the above, Ck has no more than 2 nonzero entries at each row. It follows that Ck has at most 2(n−k+1) non-zero entries. Consider the set of relevant indexes of non-zero columns:
- Jk={j∈{k, . . . , m}: the j column of Bk is a non-zero column (i.e. Y(Bk,j)>0)}.
Note that
- 2(n−k+1)≧ΣjεJkY(Bk,j)
Now, if the algorithm did not halt until the k-subroutine started,
- |Jk|≧n−k−d+1.
It follows that for j*εJk that minimizes Y(Bk,j),
- Y(Bk,j*)≦2(n−k+1)/(n−k−d+1)
- ≦2(n−k)/(n−k−d)
- ≦2/(1−d/(n−k)).
Thus when n−k≧2d (equivalent to n−2d≧k, recall that there are n subroutines) Y(Bk,j*)≦4.
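The bound can be checked numerically. The snippet below is only a sanity check of the inequality chain for one arbitrary choice of n and d (both values are assumptions picked for illustration), using exact fractions; the function name `min_column_weight_bound` is hypothetical.

```python
from fractions import Fraction


def min_column_weight_bound(n, k, d):
    """Upper bound 2(n-k+1)/(n-k-d+1) on Y(B_k, j*), as an exact fraction."""
    return Fraction(2 * (n - k + 1), n - k - d + 1)


n, d = 20, 3
for k in range(1, n - 2 * d + 1):          # all k with n - k >= 2d
    first = min_column_weight_bound(n, k, d)
    second = Fraction(2 * (n - k), n - k - d)
    assert first <= second <= 4
```

The middle step holds because adding 1 to both numerator and denominator of a fraction that is at least 1 cannot increase it, and the last step uses d/(n−k)≦1/2.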
When n−2d<k≦n we are left with the diagonalization of the last 2d rows. Note that each row of Bk (n−2d<k≦n) has no more than 2 non-zero entries, that Y(Bk,j*)−1≦2d is an upper bound on the number of field operations in the k-subroutine, and that Y(Bk,j*)+3≦2d+3 is an upper bound on the number of products with elementary matrices in the k-subroutine.
It follows that the number of field operations over the whole diagonalization process is upper-bounded by 3·n+4·d^2.
Note that when we find E1, . . . , EN∈E*n and E′1, . . . , E′M∈E*m which satisfy the diagonalization requirement:
- $A=E_1^{-1}\cdot E_2^{-1}\cdots E_{N-1}^{-1}\cdot E_N^{-1}\cdot I_{k,n\times m}\cdot (E'_M)^{-1}\cdot (E'_{M-1})^{-1}\cdots (E'_2)^{-1}\cdot (E'_1)^{-1},$
it holds that N+M≦6·n+4·d^2+6d. Note that the upper bounds here are very far from tight.
(Appendix: . . . (A+Ek,n,x)−1=(A·(I+A−1·Ek,n,x))−1 . . . )
- The general case, when d>0:
Fix d>0. For B=(bi,j)iε[n],jε[m]εFn×m, and iε[n] and jε[m] define:
- Xd(B,i)=|{jε[m−d]: bij≠0}|
- Y(B,j)=|{iε[n]: bij≠0}|
- Zd(B,i,j)=max(Xd(B,i)−1, 0)·max(Y(B,j)−1, 0).
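The three counts can be written directly from their definitions. The sketch below assumes 0-based row and column indices and a matrix stored as a list of rows; the function names `x_d`, `y_count` and `z_d` are hypothetical.

```python
def x_d(B, i, d):
    """X_d(B, i): non-zero entries of row i within the first m - d columns."""
    m = len(B[0])
    return sum(1 for j in range(m - d) if B[i][j] != 0)


def y_count(B, j):
    """Y(B, j): non-zero entries of column j."""
    return sum(1 for row in B if row[j] != 0)


def z_d(B, i, j, d):
    """Z_d(B, i, j) = max(X_d(B, i) - 1, 0) * max(Y(B, j) - 1, 0)."""
    return max(x_d(B, i, d) - 1, 0) * max(y_count(B, j) - 1, 0)


B = [[1, 0, 1, 1],
     [0, 1, 1, 0],
     [1, 1, 0, 1]]
print(x_d(B, 0, 1), y_count(B, 2), z_d(B, 0, 2, 1))
```

Here the last column plays the role of the single (d=1) unrestricted column, so it is ignored by `x_d` but still counted by `y_count`.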
Take A∈M2,d,n,m. It would be assumed without loss of generality that m≧n. The proposed algorithm is a variation of the above greedy algorithms. The stopping rule here is that if at some point in the process it is found that A is not full ranked, the algorithm halts.
Set W={w=[w1, . . . , wm]∈Fm: wi=0 for i∈[m−d]}. A key observation is that when we subtract a multiple of one row from another row with the same position of leading non-zero entry, and each of these 2 rows has at most 2 non-zero entries in the first m−d components, the result is also a row with at most 2 non-zero entries in the first m−d components. Let ei be the i unit row vector of Fm. Take two such rows:
- u=ek+x·ei+w and v=y·ek+z·ej+w′, 1≦k<i,j≦m, w,w′∈W
and note that
- v−y·u=z·ej−x·y·ei+(w′−y·w), where (w′−y·w)∈W.
Here too, before describing the algorithm's steps, we state an inductive assumption that at the beginning of the k subroutine each row of Bk (see generic algorithm above) has no more than 2 non-zero entries in the first m−d coordinates. It will be shown that this assumption carries over to k+1. At the beginning of each subroutine we go to a non-zero column with index j*∈{k,k+1, . . . , m−d} that has a minimal number of non-zero entries. Let i* be a row having a minimal number of non-zero entries in the first m−d coordinates under the constraint [Bk]i*,j*≠0. Set:
- Bk,1=Pk,i*,n·Bk·Pk,j*,n
Let x=([B′k]k,k)−1 and compute (when required):
- Bk,2=Ek,n,x·Bk,1
As a result row i* was swapped with row k and column j* was swapped with column k. Next, if Yd(Bk,2,k)=Yd(Bk,j*)=1 then the k subroutine starts by eliminating the k row as in (i) [an alternative: do the same when Yd(Bk,2,k)=Yd(Bk,j*)≦2]. Otherwise, if Yd(Bk,2,k)=Yd(Bk,j*)>1, the k subroutine starts by eliminating the k column as in (ii). In both cases, after (i) & (ii) are performed there is one non-zero entry in the k row (1 in the (k,k) entry) and one non-zero entry in the k column. In addition the resulting matrix, Bk+1, has no more than 2 non-zero entries in the first m−d coordinates of each row. Also, the total number of non-zero entries cannot increase in such a subroutine. It is not hard to compute that d·Y(Bk,j*) is an upper bound on the number of field operations in the k-subroutine and that d·Yd(Bk,j*)+4 is an upper bound on the number of products with elementary matrices in the k-subroutine.
Define
- Ck=([Bk]i,j)k≦i≦n,k≦j≦m−d.
Note the difference from the above definition in the case d=0. It follows from the above stopping rule that if the number of non-zero columns in Ck is below n−k−d+1 then the algorithm comes to halt. Observe also that by the above Ck has no more than 2 nonzero entries at each row. It follows that Ck has at most 2(n−k+1) non-zero entries. Thereby, consider the set of relevant indexes of non-zero columns:
- Jk={j∈{k, . . . , m−d}: the j column of Bk is a non-zero column (i.e. Y(Bk,j)>0)}.
Note again that
- 2(n−k+1)≧Σj∈JkY(Bk,j)
Now, if the algorithm did not halt until the k-subroutine started,
- |Jk|≧n−k−d+1.
It follows that for j*εJk that minimizes Y(Bk,j),
- Y(Bk,j*)≦2(n−k+1)/(n−k−d+1)
- ≦2(n−k)/(n−k−d)
- ≦2/(1−d/(n−k)).
Thus when n−k≧2d (equivalent to n−2d≧k, recall that there are n subroutines) Y(Bk,j*)≦4.
When n−2d<k≦n we are left with diagonalization of the last 2d rows. Recall that Bk (n−2d<k≦n) has no more than 2 non-zero entries in the first m−d coordinates. Also note that by the above d·2d is an upper bound to the number of field operations in the k-subroutine and that d·2d+4 is an upper bound to the number of products with elementary matrix in the k-subroutine.
It follows that the number of field operations over the whole diagonalization process is upper-bounded by 4·d·n+4·d^3.
Finally, when we find elementary matrices E1, . . . , EN∈E*n and E′1, . . . , E′M∈E*m which satisfy the diagonalization requirement:
- $A=E_1^{-1}\cdot E_2^{-1}\cdots E_{N-1}^{-1}\cdot E_N^{-1}\cdot I_{k,n\times m}\cdot (E'_M)^{-1}\cdot (E'_{M-1})^{-1}\cdots (E'_2)^{-1}\cdot (E'_1)^{-1},$
it holds that N+M≦(4·d+4)·n+4·d^3+8d.
Note that the bounds are very far from being tight.
- Extending a Solution of Linear Equations to Larger Matrix
Take a matrix A∈Fn×m and suppose that the sequences of matrices E1, . . . , EN∈E*n and E′1, . . . , E′M∈E*m diagonalize A for some k<min(n,m)=L, that is:
- $I_{k,n\times m}=E_N\cdot E_{N-1}\cdots E_1\cdot A\cdot E'_1\cdot E'_2\cdots E'_M$
or equivalently
- $A=E_1^{-1}\cdot E_2^{-1}\cdots E_{N-1}^{-1}\cdot E_N^{-1}\cdot I_{k,n\times m}\cdot (E'_M)^{-1}\cdot (E'_{M-1})^{-1}\cdots (E'_2)^{-1}\cdot (E'_1)^{-1}.$
Let B∈Fn×q and consider the following set of n equations with m+q unknowns: $z=[A,B]\cdot[x,y]^T=E_1^{-1}\cdot E_2^{-1}\cdots E_{N-1}^{-1}\cdot E_N^{-1}\cdot I_{k,n\times m}\cdot (E'_M)^{-1}\cdot (E'_{M-1})^{-1}\cdots (E'_2)^{-1}\cdot (E'_1)^{-1}\cdot x+B\cdot y.$
Detailed Description and Proof of the Duality of SC and BEC
- Introduction
Let n>s>t>0. This section proves that every SC linear code with a code block of length n encompassing n−s information bits and having the ability to accommodate t stuck bits with success probability 1−δ has a dual BEC code with equivalent properties: a code block of length n encompassing n−s information bits and the ability to fix t erasures with success probability 1−δ.
- Some Introductory Definitions
For α⊂[n] we denote the complement of α w.r.t. [n] by: αc=[n]\α. For n≧s,k≧1 define:
- Ω(n,k)={α: α={α(1), . . . , α(k)}⊂[n], α(1)< . . . <α(k)}.
For GεFn×m and αεΩ(n,k) define: Gα,s=(gα(i),j)iε[k],jε[s].
For v∈Fn and α∈Ω(n,k) define: vα=[vα(1), . . . , vα(k)].
- Definition of the Set of SC Encoder Matrix
Fix F=Z2, fix n>s>t≧1, and small δ>0. Define first:
- SC(n,s)={G∈Fn×n: G is invertible and G is of the form:
- G=[G1,G2]=(gi,j)1≦i,j≦n=[G11, 0s×(n−s); G21, I(n−s)×(n−s)]}
Next define SC(n,s,t,δ), the set of SC encoder matrices that are capable of processing t stuck bits with δ error probability and redundancy (t,s). Take G=[G1,G2]∈Fn×n such that G1∈Fn×s and G2∈Fn×(n−s). Then SC(n,s,t,δ) is the set of matrices G∈SC(n,s) such that when we sample α∈Ω(n,t) with uniform probability, the probability that Gα,s is not full ranked is <δ.
- The Erasure Channel, the Subspace of Code Words and the Duality Theorem
The parameters of the binary erasure channel include ε>0, the independent probability of erasure per coordinate. We fix the block length n; then for a transmitted vector u=[u1, . . . , un]T∈Fn every ui has independent probability ε of being erased, and consequently for t∈[n] there is probability δ(t,ε)>0 that more than t coordinates of u were erased. (Let t0, a positive integer, be such that the probability of more than t0 erased coordinates when v=[v1, . . . , vn]T∈Fn is transmitted is smaller than δ′.)
Let now,
- Vs={v=[v1, . . . , vn]εFn: vi=0 for all iε[s]}.
For invertible GεFn×n define: UG,s={v·G−1: vεVs}.
Take an invertible matrix G=[G1,G2]∈Fn×n such that G1∈Fn×s and G2∈Fn×(n−s). Note that for u∈Fn: u∈UG,s ⇔ v=u·G∈Vs ⇔ u·G1=01×s. Thus G1 is the parity check matrix of the code UG,s.
- The Duality Theorem
Take G∈SC(n,s,t,δ), G=[G11, 0s×(n−s); G21, I(n−s)×(n−s)]∈Fn×n. It then holds that:
- (i) If there are t stuck bits in a block of n bits, with locations α∈Ω(n,t) that are chosen randomly with uniform probability, then with probability at least 1−δ the n−s information bits can be stored via the SC code associated with G.
- (ii) If any u∈UG,s is transmitted via the above erasure channel and x is received with t erasures, then with probability at least 1−δ the vector u can be decoded.
While (i) is proved in [ ], the proof of (ii) will be presented here with the complexity cost.
- The Fundamental Lemma
Let u=[u1, . . . , un]∈U and y=[y1, . . . , yn]∈U. If α⊂[n] satisfies |α|≦s, Gα,s is full ranked, and
- yi=ui for all i∈αc
then y=u.
- Proof. Since u,y∈U, y·G1=u·G1=01×s. Since (y−u)i=0 for all i∈αc, it follows that:
- 01×s=(y−u)·G1=(y−u)α·Gα,s.
- Now since Gα,s is full ranked and |α|≦s, this yields (y−u)α=0. Hence yi=ui for all i∈α. Thus y=u.
The Erasure Channel Encoder Scheme which Yields a Systematic Code
- The input is the above G21∈F(n−s)×s, and G11−1∈Fs×s, which is computed once offline, and v′=[vs+1, . . . , vn]∈F1×(n−s), which is the un-coded information vector. Set v=[01×s,v′]∈Vs.
- 1. Compute u=v·G−1.
- u=v·G−1=[01×s, v′]·[G11−1, 0s×(n−s); −G21·G11−1, I(n−s)×(n−s)]=[−v′·G21·G11−1, v′] εF1×n
This is doable in a low complexity mode, and it yields a systematic code. The steps are:
- (a) Compute w′=−v′·G21 εFs. The arithmetic complexity is upper bounded by (n−s)·s, however due to utilization of sparse matrix in our target ensembles, w′ will be computed with O(n) complexity.
- (b) Compute w=−v′·G21·G11−1=w′·G11−1. The arithmetic complexity is upper bounded by s^2. In our sparse ensembles, it might be reduced to O(s).
- 2. The output is u=[w,v′]. Transmit u=[u1, . . . , un]εFn.
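Steps (a) and (b) can be sketched as follows. This is a toy illustration assuming F=Z2 (so the minus sign has no effect), dense list-of-rows matrices rather than the sparse ensembles, and a precomputed inverse of G11; the function names and the small example matrices are hypothetical.

```python
def vec_mat(v, M):
    """Row vector times matrix over GF(2)."""
    return [sum(v[i] & M[i][j] for i in range(len(v))) % 2
            for j in range(len(M[0]))]


def encode_systematic(v_info, G21, G11_inv):
    """Return u = [w, v'] with w = v'.G21.G11^{-1} over GF(2)."""
    w_prime = vec_mat(v_info, G21)      # step (a): w' = v'.G21
    w = vec_mat(w_prime, G11_inv)       # step (b): w = w'.G11^{-1}
    return w + list(v_info)             # systematic: v' appears unchanged


# Toy parameters (assumed): n = 5, s = 2.
G11 = [[1, 1], [0, 1]]
G11_inv = [[1, 1], [0, 1]]              # G11 is its own inverse over GF(2)
G21 = [[1, 0], [0, 1], [1, 1]]
u = encode_systematic([1, 0, 1], G21, G11_inv)

# Parity check: u.G1 = 0 with G1 = [G11; G21].
G1 = G11 + G21
assert vec_mat(u, G1) == [0, 0]
```

The final assertion reflects the earlier observation that G1 acts as the parity check matrix of the code.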
The Decoder Scheme
- The input. One part of the input is the received vector x=[x1, . . . , xn]εFn which was obtained from the transmit vector u=[u1, . . . , un]εFn via the erasure channel. Another part of the input is the above G=[G1,G2]εFn×n.
- 1) The erasures are counted and their indexes are stored. Let α={α(1), . . . , α(t)}⊂[n], α(1)< . . . <α(t) be the indexes of erasures. If t>t0 the decoding is terminated with a statement that u cannot be computed in an unambiguous manner. Due to our assumptions the probability of the event (t>t0) is smaller than δ′. We would suppose henceforth that t≦t0.
- 2) Diagonalize the matrix Gα,s=(gα(i),j)i∈[t],j∈[s] with the algorithm described above (Greedy algorithm for matrix-diagonalization). If in the process it is found that Gα,s is not full ranked, the decoding is terminated with a statement that u cannot be computed in an unambiguous manner. According to our assumptions the probability of such an event is <δ. We would suppose henceforth that Gα,s is full ranked.
- A Non-Operative Observation: Set
- Vx={y=[y1, . . . , yn]εFn: yi=xi for all iεαc}
By the above lemma, if y∈Vx∩U then, since Gα,s is full ranked, y=u. We are thus looking for y∈Vx such that y=[y1, . . . , yn]∈Vx∩U, which implies
01×s=y·G1=yαc·Gαc,s+yα·Gα,s=uαc·Gαc,s+yα·Gα,s.
- 3) Compute w1×s=uαc·Gαc,s. The arithmetic complexity is upper bounded by (n−s)·s; however, due to the utilization of sparse matrices in our target ensembles, w will be computed with O(n) complexity.
- 4) It now holds that −w=yα·Gα,s. This is a set of s linear equations with t unknowns, which has a unique solution due to the assumptions that
- (i) Gα,s is full ranked
- (ii) t≦s
- (iii) uα solves this set of equations
The solution can be found via the algorithm of section 6 below that solves such set of equations once the respective matrix was diagonalized. The solution to this set of linear equations is uα which yields that the receiver knows u.
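Steps 1) to 4) can be sketched as follows. This is a toy illustration assuming F=Z2, erasures marked as `None`, 0-based indices, and plain Gauss-Jordan elimination standing in for the greedy diagonalization; the function names and the small matrix G1 are hypothetical.

```python
def solve_unique_gf2(M, b):
    """Solve M.x = b over GF(2); None unless M has full column rank
    and the system is consistent."""
    rows, cols = len(M), len(M[0])
    aug = [M[i][:] + [b[i]] for i in range(rows)]
    for c in range(cols):
        p = next((i for i in range(c, rows) if aug[i][c]), None)
        if p is None:
            return None                     # not full ranked
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(rows):
            if i != c and aug[i][c]:
                aug[i] = [a ^ e for a, e in zip(aug[i], aug[c])]
    if any(row[-1] for row in aug[cols:]):
        return None                         # inconsistent equations
    return [aug[i][-1] for i in range(cols)]


def decode_erasures(x, G1):
    """Fill the erased coordinates of x so that the result y has y.G1 = 0."""
    n, s = len(G1), len(G1[0])
    alpha = [i for i, xi in enumerate(x) if xi is None]     # step 1)
    # Step 3): w = u_{alpha^c} . G_{alpha^c,s}
    w = [sum(x[i] & G1[i][j] for i in range(n) if x[i] is not None) % 2
         for j in range(s)]
    # Step 4): y_alpha . G_{alpha,s} = -w = w over GF(2), solved in transposed form.
    G_alpha_T = [[G1[i][j] for i in alpha] for j in range(s)]
    y_alpha = solve_unique_gf2(G_alpha_T, w)
    if y_alpha is None:
        return None
    u = list(x)
    for idx, val in zip(alpha, y_alpha):
        u[idx] = val
    return u


G1 = [[1, 1], [0, 1], [1, 0], [0, 1], [1, 1]]   # toy G1 = [G11; G21], n = 5, s = 2
received = [0, None, 1, 0, None]                # codeword [0,1,1,0,1] with 2 erasures
print(decode_erasures(received, G1))
```

When the rows of G1 at the erased positions are linearly dependent, the function returns `None`, which corresponds to terminating the decoding in step 2).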
Detailed Description of the SC Algorithm Over Noisy Channel
- Preface
Let F=Z2, and let the algorithm's dimensions be, n>s>t. The scheme is based on a code represented by a matrix G
- G=[G1,G2]=(gi,j)1≦i,j≦n=[G11, 0s×(n−s); G21, I(n−s)×(n−s)]εFn×n
where G11∈Fs×s is invertible, and in some schemes G11=Is×s. The inverse is given by:
- G−1=[G11−1, 0s×(n−s); −G21·G11−1, I(n−s)×(n−s)]εFn×n
Note that when G11=Is×s, G−1=G.
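Both statements can be verified by a direct block multiplication, using only the block forms of G and G−1 given above:

```latex
G\cdot G^{-1}
=\begin{bmatrix} G_{11} & 0 \\ G_{21} & I \end{bmatrix}
 \begin{bmatrix} G_{11}^{-1} & 0 \\ -G_{21}G_{11}^{-1} & I \end{bmatrix}
=\begin{bmatrix} G_{11}G_{11}^{-1} & 0 \\ G_{21}G_{11}^{-1}-G_{21}G_{11}^{-1} & I \end{bmatrix}
=\begin{bmatrix} I_{s\times s} & 0 \\ 0 & I_{(n-s)\times(n-s)} \end{bmatrix},
\qquad
G_{11}=I_{s\times s}\;\Rightarrow\;
G^{-1}=\begin{bmatrix} I & 0 \\ -G_{21} & I \end{bmatrix}
=\begin{bmatrix} I & 0 \\ G_{21} & I \end{bmatrix}=G .
```

The last equality uses −1=1 in F=Z2.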
In the channel under consideration, noise is introduced to the received data vector after its storage in the NAND flash. This is dealt with by the above outlined suboptimal scheme of an SC solution merged with an ECC solution. The post-storage noisy channel is a binary symmetric channel (BSC) with error probability δ>0 and independent coordinates. The problem's parameters include small ε1, ε2>0 that bound the block error probabilities (e.g. ε1=10^−15, ε2=10^−20).
Suppose that for k∈[n], i=1,2 the encoder and decoder are provided with systematic linear codes which correspond to G′k,i∈Fd(k,i)+k and have rate k/(d(k,i)+k), such that for all w∈Fk, if we store gi(w)=[w,G′k(w)]T (the notation implies one column vector on top of another), the block error probability of decoding is bounded by εi. The channel when i=1 is the BSC(δ) channel, and the channel when i=2 is the combination of BSC(δ) with the SC channel when no precoding is done. In practice we would utilize G′k,1 only for k=n−s and G′k,2 only for k=s, so the index k is not essential and is required only for the sake of the current presentation. Thus the code can be represented by the subspace Vk={(w, G′k(w)): w∈Fk}.
Accordingly we formulate the SC noisy channel problem in the following way: we would like to convey (i.e. transmit) information via the vector v=[v1, . . . , vN]∈FN under the constraint that for some set of N>t≧0 indexes α={α(1), . . . , α(t)}⊂[N], α(1)< . . . <α(t), it holds that (vα(1),vα(2), . . . , vα(t)) are prescribed.
The Transmitter's Algorithm
- The input is N>s>t≧1 (s is chosen w.r.t. N), together with the δ>0 of the BSC, d(s), n=N−d(s), d(n−s), G′n−s,1, and G′s,2. The transmitter's input also includes the matrix G1=(gi,j)1≦i≦n,1≦j≦s∈Fn×s, and α={α(1), . . . , α(t)}⊂[n], α(1)< . . . <α(t), and (v′α(1),v′α(2), . . . , v′α(t))∈Ft, the vector of stuck cells.
- 1) Select a vector u=[u1, . . . , un−s]εVn−s of information+ECC where Vn−s⊂Fn−s is the above described BSC(δ) linear code (that corresponds to G′n−s,1).
- Note that G2·uT=uT
- Define;
- Gα=(gα(i)j)iε[t],jε[s],
- v′α=[v′α(1),v′α(2), . . . , v′α(t)]T
- uα=[uα(1),uα(2), . . . , uα(t)]T
- 2) Compute, when possible, the unique w∈Fs that satisfies Gα·w=v′α−uα. In case of failure, i.e. a non-full-rank Gα, go to (F) below; otherwise, continue with the subsequent flow. The straightforward complexity of finding w is O(s^3). The upper bound on the complexity in our target ensembles, which admit only sparse G, is O(s^2), and it goes down to O(s) in some cases.
- 3) Compute G1·wT. The straightforward complexity is s·(n−s). However, due to the utilization of sparse matrices in our target ensembles, G1·wT will be computed with O(n) complexity.
- 4) Compute y=G·[w,u]T=G1·wT+G2·uT=G1·wT+uT with n additions. Set y1=[y1, . . . , ys] and y2=[ys+1, . . . , yn].
- 5) Compute y1ecc=G′s,2·y1
- The output is: v=[y1ecc,y]T
(F) In case of failure to compute w∈Fs that satisfies Gα·w=v′α−uα in step 2), restart the entire flow with a random permutation on the rows of G1, which is taken from a list known to the transmitter and receiver. A header is attached that conveys the index of the permutation. If the failure probability of step 2) is a small ε>0 (e.g. ε=10^−6 as a work point) and we do k restarts, the overall failure probability is ε^k. In practice we can use pseudo-random permutations that are either cyclic in Zn−s with a pseudo-random shift or are XOR with pseudo-random sequences modulo n−s.
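Steps 2) to 4) of the transmitter can be sketched as follows. This is a toy illustration under several assumptions: F=Z2, 0-based indices, the ECC steps 1) and 5) and the permutation restart of (F) are omitted, u is read as the zero-padded vector G2·uT=[0s; u], and when t<s the solver simply returns one solution with the free variables set to zero. The function names and the small matrix G1 are hypothetical.

```python
def solve_gf2(M, b):
    """Return one solution of M.x = b over GF(2), or None if M is not
    full row rank (the failure case of step 2)."""
    rows, cols = len(M), len(M[0])
    aug = [M[i][:] + [b[i]] for i in range(rows)]
    pivots, r = [], 0
    for c in range(cols):
        p = next((i for i in range(r, rows) if aug[i][c]), None)
        if p is None:
            continue
        aug[r], aug[p] = aug[p], aug[r]
        for i in range(rows):
            if i != r and aug[i][c]:
                aug[i] = [a ^ e for a, e in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
        if r == rows:
            break
    if r < rows:
        return None
    x = [0] * cols
    for i, c in enumerate(pivots):
        x[c] = aug[i][-1]
    return x


def encode_stuck(u, G1, alpha, stuck_vals):
    """Return y = G1.w + [0_s; u] whose entries at alpha equal stuck_vals."""
    n, s = len(G1), len(G1[0])
    g2u = [0] * s + list(u)                         # G2.u^T = [0_s; u]
    G_alpha = [G1[a] for a in alpha]
    rhs = [v ^ g2u[a] for v, a in zip(stuck_vals, alpha)]
    w = solve_gf2(G_alpha, rhs)                     # step 2)
    if w is None:
        return None                                 # go to (F)
    G1w = [sum(G1[i][j] & w[j] for j in range(s)) % 2 for i in range(n)]  # step 3)
    return [a ^ b for a, b in zip(G1w, g2u)]        # step 4)


G1 = [[1, 1], [0, 1], [1, 0], [0, 1], [1, 1]]       # toy G1, n = 5, s = 2
y = encode_stuck([1, 0, 1], G1, alpha=[0, 3], stuck_vals=[1, 1])
assert [y[a] for a in (0, 3)] == [1, 1]
```

The assertion checks the defining property of the scheme: the stored vector agrees with the stuck cells at the positions in α.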
The Receiver's Algorithm
- The BSC channel. For every vector z∈Fk that is transmitted via the noisy channel BSC(δ), we denote by z+ze the resulting received vector. Note that each coordinate of ze is selected independently and identically distributed (i.i.d.) with probability δ>0 to be 1.
- The input is the above G21, and G11−1, which is computed once offline (not required for G11=Is×s), and v+ve. Note that each coordinate of ve is i.i.d. with probability δ>0 to be 1.
Compute with high success probability (ε2 failure probability):
y1=DecoderBSC(δ)(y1+y1e, y1ecc+y1ecc,e). As a result the processor now “knows” [y1,y2+y2e] (that is with ε2 failure probability):
Following the decoding of (1), compute
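- $G^{-1}\cdot[y_1,\,y_2+y_{2e}]^T=\begin{bmatrix}G_{11}^{-1}&0_{s\times(n-s)}\\-G_{21}\cdot G_{11}^{-1}&I_{(n-s)\times(n-s)}\end{bmatrix}\cdot[y_1,\,y_2+y_{2e}]^T=[G_{11}^{-1}\cdot y_1,\;-G_{21}\cdot G_{11}^{-1}\cdot y_1+y_2+y_{2e}]^T=G^{-1}\cdot[y_1,y_2]^T+[0_{1\times s},\,y_{2e}]^T=[w,u]^T+[0_{1\times s},\,y_{2e}]^T.$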
The main steps are:
- Compute ξ=G11−1·y1,
- Compute z=−G21·(G11−1·y1)=−G21·ξ,
- Compute with high success probability (ε1 failure probability):
- u=DecoderBSC(δ)(u+y2e).
- The output is u.
Note the substantial complexity reduction when G11=Is×s.