The Hamming weight of a string or word is the number of symbols in the string or word that differ from the zero-symbol of the alphabet that is used to express the string or word, and particularly, the Hamming weight of a binary word is the number of ones in the binary word. Constant Hamming weight coding is a type of encoding that encodes input data into codewords that all have the same Hamming weight. Constant Hamming weight coding has many uses particularly in error detection and correction strategies, where an error in a constant Hamming weight binary codeword can be easily detected if the codeword does not contain the correct number of ones. Hamming weight coding has also been used in data storage such as in memory systems including cross-point arrays of memristor memory elements, where encoding of raw binary data produces codewords that when stored in the cross-point memory array may limit the numbers of low resistance states in rows or columns of the array. Encoded storage using constant Hamming weight codes may thus avoid data-dependent memory performance issues that may arise if rows or columns of a memory array are allowed to contain a large number of memory elements in a low resistance state.
Use of the same reference symbols in different figures indicates similar or identical items.
Constant Hamming weight encoding or decoding for codewords of any size may be implemented using a single input parameter $\hat{N}(n,k)$ associated with a codeword length n and a Hamming weight k, recursive relations for determination of values of a function $\hat{N}(m,l)$, and modest floating point processing capabilities to evaluate the recursive relations. The following describes examples of encoding of binary data to produce binary codewords and examples of decoding binary codewords to recover binary data, although the techniques disclosed may be extended or modified for data or codes using other representations. Also, in the following examples, binary values 0 and 1 may be interpreted as placeholders for two arbitrary distinct values, in which case the Hamming weight refers to the occurrence count of one of these values that would be substituted wherever “1” occurs. One specific example is a case in which the roles of 0 and 1 are reversed.
For binary coding, a set A(n,k) of words containing exactly n bits and having a Hamming weight k can be defined as shown in Equation 1.
Equation 1: $A(n,k) = \{ z^n \in \{0,1\}^n : \sum_{i=1}^{n} z_i = k \}$
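As an illustration only, the set of Equation 1 can be enumerated directly for small parameters. The following Python sketch uses a hypothetical helper name, `constant_weight_words`, that does not appear in the description, and it is practical only for small n because the set grows quickly.

```python
from itertools import combinations
from math import comb


def constant_weight_words(n, k):
    """Return all n-bit words of Hamming weight k as '0'/'1' strings, sorted."""
    words = []
    for one_positions in combinations(range(n), k):
        bits = ["0"] * n
        for i in one_positions:
            bits[i] = "1"
        words.append("".join(bits))
    return sorted(words)


words = constant_weight_words(4, 2)
print(words)                     # the words of A(4, 2)
print(len(words) == comb(4, 2))  # the count matches the binomial coefficient
```

For n=4 and k=2 the set contains six words, which is the value N(4,2) discussed in the following paragraph.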
An encoding system that aims at encoding data units of a specific length, e.g., a specific number of bits, needs to use encoding parameters n and k such that a number N(n,k) of words available in the set A(n,k) is greater than or equal to the number of different input data values that may need to be encoded. For example, encoding p-bit binary data values requires a set A(n,k) containing at least $2^p$ words. In general, an encoding process may not use all of the words in the set A(n,k) for encoding of data. Even so, for reasonably sized input data values, the codewords that are used for encoding data values may be so numerous that storing a look-up table that maps data values to codewords is impractical in many physical systems. Accordingly, encoding and decoding processes may perform computations to avoid the need for large look-up tables. Further, the encoding and decoding process may employ storage and processing hardware having specific capability and accuracy, and a practical encoding and decoding process should be executable in such hardware without concern that round-off or other errors, which may be inherent in the available hardware, may create coding or decoding errors.
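The size requirement above can be checked numerically. The sketch below is one possible way to pick parameters and is not taken from the description: the function name `smallest_length` and the choice of weight k = n // 2 (the weight that maximizes the number of words for a given length) are assumptions made for illustration.

```python
from math import comb


def smallest_length(p):
    """Smallest n, with weight k = n // 2, such that N(n, k) >= 2**p.

    The weight rule k = n // 2 is an illustrative assumption; a real
    design may fix k for other reasons.
    """
    n = 1
    while comb(n, n // 2) < 2 ** p:
        n += 1
    return n, n // 2


print(smallest_length(8))  # length and weight sufficient for 8-bit data
```

For 8-bit data, N(10,5)=252 falls just short of 256, so this rule settles on n=11 and k=5, for which N(11,5)=462.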
Encoding process 100 uses a single input parameter $\hat{N}(n,k)$, rather than a lookup table. Parameter $\hat{N}(n,k)$ is considered herein to be the value of a function $\hat{N}(m,l)$ when m=n and l=k. Further, the function $\hat{N}(m,l)$ may in some sense approximate the function N(m,l), which gives the total number of words in the set A(m,l), i.e., the total number of binary words having length m and Hamming weight l, but function $\hat{N}(m,l)$ is required to be less than or equal to function N(m,l) for all relevant values of m and l as indicated in Equation 2.
Equation 2: $\hat{N}(m,l) \le N(m,l)$
A sufficiently powerful computing system could calculate the values of function N(m,l), e.g., using the formula $m!/((m-l)!\,l!)$, and function N(m,l) could be used for function $\hat{N}(m,l)$. But exact computation of a value N(m,l) generally requires calculations of large factorials, making exact calculations impractical in many computing systems or impractical within desired encoding times. Further, data units normally used in hardware executing process 100 may not be large enough to represent an exact value for function N(m,l). For one specific case given length n and given Hamming weight k, the approximation $\hat{N}(n,k)$ may be as close to (but smaller than) the exact value N(n,k) as can be conveniently stored or manipulated in the hardware executing process 100. Whether approximation $\hat{N}(n,k)$ and exact value N(n,k) are equal or not, process block 110 can compute $\hat{N}(n,k)$, e.g., using a high-power processing system during design of an encoding or decoding system for implementing process 100, and the pre-computed parameter $\hat{N}(n,k)$ determined in process block 110 can be stored in encoding hardware for later use when encoding process 100 receives and encodes data values.
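To give a sense of scale, the following sketch reports how many bits an exact value N(n,k) occupies. Python integers have arbitrary precision, so the exact value is easy to obtain here; the helper name `bits_needed` and the sample parameters are assumptions chosen for illustration.

```python
from math import comb


def bits_needed(n, k):
    """Number of bits required to store the exact count N(n, k)."""
    return comb(n, k).bit_length()


for n in (16, 64, 128, 256):
    print(n, n // 2, bits_needed(n, n // 2))
```

Already at n=128 and k=64 the exact count exceeds a 64-bit data unit, which is one reason an approximation held in a small floating-point format may be attractive.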
A process block 120 receives a data value D to be encoded. Data value D may be, for example, an integer greater than or equal to 1 and less than or equal to $\hat{N}(n,k)$ and may be further limited to a value within a range from 1 to $2^p$, where $2^p$ is less than or equal to parameter $\hat{N}(n,k)$ and therefore less than or equal to the total number N(n,k) of codes in code set A(n,k). In this manner, D may be represented as 1 added to a p-bit binary number. For encoding of data value D, a process block 122 can set process variables m and l to initially equal the number n of bits in the codeword X being generated and the Hamming weight k of the codeword X being generated. As described further below, process 100 uses n iterations that generate codeword X one bit at a time. A process block 126 computes $\hat{N}(m-1,l)$ using a known value or values $\hat{N}(m,l)$ and recursive relations. (See the example of Equations 7, 8, and 9, below.) Initially, m=n and l=k, so that $\hat{N}(m,l)$ is equal to the pre-computed parameter value $\hat{N}(n,k)$. The computed value $\hat{N}(m-1,l)$ may be an estimate of the number of codewords of length m−1 and Hamming weight l used as the m−1 least significant or lower indexed bits of a codeword X having a bit $X_m$ equal to zero, wherein the bits in codeword X from left (most significant) to right (least significant) are indexed n to 1, as in $X = X_n, X_{n-1}, \ldots, X_2, X_1$.
Process 100 in the implementation shown in
If decision block 130 determines that data value D is greater than the last computed integer value $\lceil \hat{N}(m-1,l) \rceil$, the remaining portion of the codeword X is larger than any of the m-bit codewords with zero as the most significant bit, and a process block 150 selects 1 for the value of bit $X_m$ of codeword X. Process 100 then needs to determine a word of length m−1 and Hamming weight l−1 that distinguishes data values having codewords X with one or more most significant bits as already set. A block 152 decreases data value D by computed integer value $\lceil \hat{N}(m-1,l) \rceil$. A process block 154 decrements Hamming weight variable l, and process block 142 decreases code length variable m.
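The bit-by-bit selection described above may be sketched in Python for the special case in which the exact counts N(m,l) are used as the function values, which the description permits when the hardware can handle them. The function name `encode_exact` is hypothetical, and the floating-point recursions described later are deliberately not used here.

```python
from math import comb


def encode_exact(d, n, k):
    """Map a data value d in 1..N(n, k) to an n-bit word of weight k.

    Uses exact binomial counts, i.e. the special case where the
    approximating function equals N(m, l).
    """
    if not 1 <= d <= comb(n, k):
        raise ValueError("data value out of range")
    bits = []
    m, l = n, k
    while m > 0:
        zero_count = comb(m - 1, l)  # words of length m-1 and weight l
        if d <= zero_count:
            bits.append("0")         # bit X_m is 0; weight budget unchanged
        else:
            bits.append("1")         # bit X_m is 1
            d -= zero_count          # skip the words that start with 0
            l -= 1                   # one fewer 1 left to place
        m -= 1
    return "".join(bits)


print([encode_exact(d, 4, 2) for d in range(1, 7)])
# -> ['0011', '0101', '0110', '1001', '1010', '1100']
```

With exact counts every word of the set is used, and the codewords come out in increasing numeric order of the data value.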
After process block 142, a decision block 160 ends an iteration that selected whether bit Xm was 0 or 1 and determines whether any further codeword bits need to be selected. In the implementation of
An implementation of process 100 using particular hardware needs to provide a technique for calculation of $\hat{N}(m-1,l)$ in process block 126. As noted above, the number N(m,l) of words in set A(m,l) may be difficult or impractical to exactly calculate, store, or use for values of function $\hat{N}(m,l)$. Instead, a function $\hat{N}(m,l)$ may be sought that satisfies Equation 2 (above) and Equation 3 (below) and that is easily computed using available hardware. Equation 2 ensures that function $\hat{N}(m,l)$, which indicates a number of words of length m and Hamming weight l that may be used within codewords, is less than or equal to the number of all words of length m and Hamming weight l. Equation 3 ensures that when a most significant bit of an m-bit code is selected, as in process blocks 140 and 150 of encoding process 100, the words of length m−1 and Hamming weight l or l−1 that may be used in a sub-word are sufficient to distinguish distinct data values.
Equation 3: $\hat{N}(m,l) \le \hat{N}(m-1,l) + \hat{N}(m-1,l-1)$
In one specific implementation, function $\hat{N}(m,l)$ is further required to provide values within a set F(b) given in Equation 4. In Equation 4, a “precision” b is a positive integer, and $\mathbb{Z}^+$ represents the set of positive integers. Elements or values $M \cdot 2^{e-b}$ in set F(b) can be interpreted as b-bit floating-point values, where $M \cdot 2^{-b}$ is a binary mantissa that is at least one half (½) and strictly smaller than one (1) and e is a base-two exponent. The requirement that function $\hat{N}(m,l)$ provide values within a set F(b) suggests that values of function $\hat{N}(m,l)$ can be stored or calculated using hardware that supports b-bit floating point arithmetic.
Equation 4: $F(b) = \{0\} \cup \{ M \cdot 2^{e-b} : e \in \mathbb{Z}^+,\ M \in \mathbb{Z}^+ \cap [2^{b-1}, 2^b - 1] \}$
Operations $\lceil x \rceil_b$ and $\lfloor x \rfloor_b$ on any real number x can be based on a set F(b) for a precision b and can be defined as shown in Equations 5 and 6. (If a real value x is less than zero, $\lceil x \rceil_b = \lfloor x \rfloor_b = 0$.) Operations $\lceil x \rceil_b$ and $\lfloor x \rfloor_b$ thus identify b-bit floating point values close to real value x. The values $\lceil x \rceil_b$ and $\lfloor x \rfloor_b$ are distinct from values $\lceil x \rceil$ and $\lfloor x \rfloor$. A value $\lceil x \rceil$ is the smallest integer greater than or equal to real value x, and a value $\lfloor x \rfloor$ is the largest integer less than or equal to real value x.
Equation 5: $\lceil x \rceil_b = \min \{ y \in F(b) : y \ge x \}$
Equation 6: $\lfloor x \rfloor_b = \max \{ y \in F(b) : y \le x \}$
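The rounding operations of Equations 5 and 6 can be emulated exactly in software. The sketch below is an illustrative assumption rather than a description of any particular hardware: it uses Python's `fractions.Fraction` so that no native floating-point error enters, and the names `floor_b`, `ceil_b`, and `_scale` are hypothetical. It relies on the observation that the smallest positive element of F(b) is 1 (M equal to $2^{b-1}$ with e=1).

```python
from fractions import Fraction
from math import ceil, floor


def _scale(x, b):
    """Return 2**(b - e), where e >= 1 is the exponent with 2**(e-1) <= x < 2**e."""
    e = floor(x).bit_length()  # valid for x >= 1
    return Fraction(2) ** (b - e)


def floor_b(x, b):
    """Largest element of F(b) not exceeding x (0 for x below 1, including x < 0)."""
    x = Fraction(x)
    if x < 1:
        return Fraction(0)
    s = _scale(x, b)
    return floor(x * s) / s


def ceil_b(x, b):
    """Smallest element of F(b) not below x (0 for x <= 0)."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x < 1:
        return Fraction(1)  # smallest positive element of F(b)
    s = _scale(x, b)
    return ceil(x * s) / s


print(floor_b(9, 3), ceil_b(9, 3))  # -> 8 10
```

With b=3 the elements of F(b) near 9 are 8 and 10, so 9 rounds down to 8 and up to 10.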
An encoding system capable of storing and manipulating b-bit floating-point numbers can use Equation 7 to compute parameter $\hat{N}(n,k)$ as in process block 110 of process 100, and parameter $\hat{N}(n,k)$ may be represented and stored using a floating-point value from set F(b). Recursive relationships defined for function $\hat{N}(m,l)$ as shown in Equations 8 and 9 can then be used to compute $\hat{N}(m-1,l)$ and $\hat{N}(m-1,l-1)$ when needed, e.g., in process block 126 of process 100. In particular, an initial execution of process block 126 can use parameter $\hat{N}(m,l)=\hat{N}(n,k)$ in Equation 8 to compute $\hat{N}(m-1,l)$. Subsequently, if the prior iteration set a bit to 0, only length variable m was decremented, and the previous execution of process block 126 provided a new value for $\hat{N}(m,l)$, which can again be used in Equation 8 to determine $\hat{N}(m-1,l)$. If a prior iteration set a bit to 1, length variable m and Hamming weight variable l were both decremented, and the previous value computed by process block 126 is used as $\hat{N}(m,l)$ in Equation 8 and the value from Equation 8 is used in Equation 9.
Equation 7: $\hat{N}(n,k) = \lfloor N(n,k)\,(1 + 2^{-(b-1)})^{-(n-1)} \rfloor_b$
Equation 8: $\hat{N}(m-1,l) = \lceil (m-l) \cdot \hat{N}(m,l) / m \rceil_b$
Equation 9: $\hat{N}(m-1,l-1) = \lceil \hat{N}(m,l) - \hat{N}(m-1,l) \rceil_b$
Encoding process 100 using Equations 7, 8, and 9 to recursively define the required values of function $\hat{N}(m,l)$ can be mathematically proven to provide a function $\hat{N}(m,l)$ that satisfies Equations 2 and 3 and that allows encoding of data values D to unique codewords X containing n bits and having a Hamming weight k. Such proofs are, however, unnecessary for construction or use of implementations of encoding process 100 and therefore not described in detail here.
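A software emulation of encoding with the recursions of Equations 7, 8, and 9 might look as follows. This is a sketch under stated assumptions and not a definitive implementation: b-bit values are emulated exactly with `fractions.Fraction`, the ceiling in Equation 9 is assumed to be the b-bit operation so that all values stay in F(b), and the names `_round_b`, `initial_parameter`, and `encode` are hypothetical.

```python
from fractions import Fraction
from math import ceil, comb, floor


def _round_b(x, b, up):
    """Round x into F(b): upward if up is True, otherwise downward."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x < 1:
        return Fraction(1 if up else 0)  # 1 is the smallest positive element
    s = Fraction(2) ** (b - floor(x).bit_length())
    return (ceil(x * s) if up else floor(x * s)) / s


def initial_parameter(n, k, b):
    """Equation 7: the exact count scaled down, then rounded down into F(b)."""
    slack = (1 + Fraction(1, 2 ** (b - 1))) ** (n - 1)
    return _round_b(comb(n, k) / slack, b, up=False)


def encode(d, n, k, b):
    """Encode data value d (1 <= d <= initial_parameter) as an n-bit, weight-k word."""
    n_hat = initial_parameter(n, k, b)
    if not 1 <= d <= n_hat:
        raise ValueError("data value out of range")
    bits = []
    m, l = n, k
    while m > 0:
        zero_branch = _round_b((m - l) * n_hat / m, b, up=True)  # Equation 8
        threshold = ceil(zero_branch)        # integer value compared with d
        if d <= threshold:
            bits.append("0")
            n_hat = zero_branch
        else:
            bits.append("1")
            d -= threshold
            n_hat = _round_b(n_hat - zero_branch, b, up=True)    # Equation 9 (assumed b-bit)
            l -= 1
        m -= 1
    return "".join(bits)


print(initial_parameter(4, 2, 4))                  # -> 4
print([encode(d, 4, 2, 4) for d in range(1, 5)])   # -> ['0011', '0101', '1001', '1010']
```

In this small case only four of the six words of A(4,2) are used, which illustrates the earlier remark that an encoding process may not use every word in the set.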
A process block 220 computes a value $\hat{N}(m-1,l)$ from a known value or known values of function $\hat{N}(m,l)$. Such computation may be based on the recursive relations defined by Equations 8 and 9 above, which may be implemented using hardware providing b-bit floating point arithmetic using values in set F(b) as in Equation 4 above. A decision block 240 determines whether a codeword bit $X_m$ for the current value of variable m is equal to 1. If so, the tentative decoded data value D is smaller than the original encoded data value by more than the calculated value $\hat{N}(m-1,l)$, and the next m−1 bits of codeword X have Hamming weight l−1. Accordingly, a process block 250 adds a calculated integer value $\lceil \hat{N}(m-1,l) \rceil$ to data value D, a process block 252 decrements Hamming weight variable l, and process block 260 decrements length variable m. If decision block 240 determines that code bit $X_m$ is zero, the data value D is not increased and Hamming weight variable l is not changed, but process block 260 decrements length variable m for a next iteration.
Decision block 270 determines whether further iteration is needed, i.e., whether more bits of the codeword remain to be processed. If so, process 200 returns to process block 230, where $\hat{N}(m-1,l)$ is computed for current values of variables m and l as modified in process block 252 or 260. Iterations continue until all of the bits of codeword X have been evaluated, e.g., when decision block 270 determines that m is equal to 0, at which point data D has the decoded value, which a process block 280 outputs.
Decoding process 200 using Equations 7, 8, and 9 to recursively define the function $\hat{N}(m,l)$ can be mathematically proven to decode each codeword X containing n bits and having a Hamming weight k to produce the original data that process 100 using the same recursive relations encoded as codeword X. Such proofs are, however, unnecessary for construction or use of implementations of decoding process 200 and therefore not described in detail here.
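The decoding iterations can be sketched in the same style. As before, this is an illustrative emulation under assumptions: b-bit values are modeled exactly with `fractions.Fraction`, the ceiling in Equation 9 is assumed to be the b-bit operation, the decoded value is assumed to start at 1 (consistent with data values beginning at 1), and the names `_round_b`, `initial_parameter`, and `decode` are hypothetical.

```python
from fractions import Fraction
from math import ceil, comb, floor


def _round_b(x, b, up):
    """Round x into F(b): upward if up is True, otherwise downward."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x < 1:
        return Fraction(1 if up else 0)  # 1 is the smallest positive element
    s = Fraction(2) ** (b - floor(x).bit_length())
    return (ceil(x * s) if up else floor(x * s)) / s


def initial_parameter(n, k, b):
    """Equation 7: the exact count scaled down, then rounded down into F(b)."""
    slack = (1 + Fraction(1, 2 ** (b - 1))) ** (n - 1)
    return _round_b(comb(n, k) / slack, b, up=False)


def decode(codeword, n, k, b):
    """Recover the data value from an n-bit, weight-k codeword string."""
    if len(codeword) != n or codeword.count("1") != k or set(codeword) - {"0", "1"}:
        raise ValueError("not an n-bit word of Hamming weight k")
    n_hat = initial_parameter(n, k, b)
    d = 1                       # assumed starting value; data values begin at 1
    m, l = n, k
    for bit in codeword:        # leftmost bit X_n first, down to X_1
        zero_branch = _round_b((m - l) * n_hat / m, b, up=True)  # Equation 8
        if bit == "1":
            d += ceil(zero_branch)  # skip the data values whose bit X_m is 0
            n_hat = _round_b(n_hat - zero_branch, b, up=True)    # Equation 9 (assumed b-bit)
            l -= 1
        else:
            n_hat = zero_branch
        m -= 1
    return d


print([decode(x, 4, 2, 4) for x in ("0011", "0101", "1001", "1010")])  # -> [1, 2, 3, 4]
```

The result is meaningful only for words that the corresponding encoder can produce; a word of the right weight that the encoder never emits still yields a number, but that number may coincide with the value of a used codeword.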
Encoding and decoding processes such as described illustrated in
Some encoded data storage systems are further described, for example, in U.S. Pat. App. Pub. No. 2013/0097396, entitled “Method and System for Encoding Data for Storage in a Memory Array” and U.S. Pat. App. Pub. No. 2013/0121062, entitled “Rewriting a Memory Array,” which are incorporated by reference herein to the extent legally permitted.
All or portions of some of the above-described systems and methods can be implemented in computer-readable media, e.g., non-transient media, such as an optical or magnetic disk, a memory card, or other solid state storage containing instructions that a computing device can execute to perform specific processes that are described herein. Such media may further be or may be contained in a server or other device connected to a network such as the Internet that provides for the downloading of data and executable instructions.
Although the invention has been described with reference to particular implementations, the disclosed implementations are only examples and should not be taken as limitations. Various other adaptations and combinations of features of the implementations disclosed are within the scope defined by the following claims.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2014/014219 | 1/31/2014 | WO | 00 |