This invention relates to the storage of data in computer readable form, and more specifically to the efficient storage of such data on memory devices known as flash memory.
References considered to be relevant as background to the presently disclosed subject matter are listed below:
[1] A
[2] A
[3] C
[4] F
[5] F
[6] F
[7] G
[8] H
[9] I
[10] J
[11] K
[12] K
[13] K
[14] K
[15] K
[16] P
[17] R. L. Rivest and A. Shamir, How to reuse a write-once memory, Information and Control 55 (1982) 1-19.
[18] S
[19] W
[20] Y
[21] Y
[22] E. Zeckendorf, Représentation des nombres naturels par une somme de nombres de Fibonacci ou de nombres de Lucas, Bull. Soc. Roy. Sci. Liège 41 (1972) 179-182.
The advent of flash memory [2, 7] in the early 1990s had a major impact on industries depending on the availability of cheap, massive storage space. Flash memory is now omnipresent in our personal computers, mobile phones, digital cameras, and many more devices. There are, however, several features that are significantly different for flash memory, when compared to previously known storage media.
Without going into the technical details leading to these changes, to appreciate the present invention, it suffices to know that, contrarily to conventional storage, writing a 0-bit and writing a 1-bit on flash are not considered to be symmetrical tasks. If a block contains only 0s (consider it as a freshly erased block), individual bits can be changed to 1. However, once a bit is set to 1, it can be changed back to the value 0 only by erasing entire blocks (of size 0.5 MB or more). Therefore, while one can randomly access and read any data in flash memory, overwriting or erasing it cannot be performed in random access, only blockwise.
The problem of compressing data in the context of flash memory has been addressed in the literature and in many patents, see [3, 10, 21, 19, 8] to cite just a few, but they generally refer to well known compression techniques that can be applied to any storage device. The current invention focuses on changing the coding method used on the device and obtaining thereby a compression gain, as also done in [17, 20].
Consider then the problem of reusing a piece of flash memory, after a block of r bits has already been used to encode some data in what we shall call a first round of encoding. Now some new data is given to be encoded in a second round, and the challenge is to reuse the same r bits, or a subset thereof, without incurring the expensive overhead of erasing the entire block before rewriting.
There might, of course, be a possibility of recoding data using only changes from 0-bits to 1-bits, but not vice versa. For example, a data block containing 00110101 could be changed to 10111101 or to 00111111, but not to 00100100. The problem here is that since every bit encoded in the first round can a priori contain either 0 or 1, only certain bit patterns can be encoded in the second round, and even if they can be adapted to the new data, there needs to be a way of knowing which bits have been modified in the passage from the first to the second round.
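The overwrite constraint can be made concrete by the following short sketch (illustrative only, not part of the claimed method; the helper name can_overwrite is ours):

```python
# Illustrative sketch: under the document's convention, an in-place
# overwrite is legal iff it only turns 0-bits into 1-bits.

def can_overwrite(old: int, new: int) -> bool:
    """True if 'new' keeps every 1-bit of 'old', i.e. only 0 -> 1 changes."""
    return old & new == old

# The examples from the text, read as binary numbers:
assert can_overwrite(0b00110101, 0b10111101)      # legal
assert can_overwrite(0b00110101, 0b00111111)      # legal
assert not can_overwrite(0b00110101, 0b00100100)  # would need 1 -> 0 changes
```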
Actually, the problem of devising such special codes has been treated long before flash memory became popular, under the name of Write-Once Memory (WOM). Rivest and Shamir (RS) suggested a simple way to use 3 bits of memory to encode two rounds of the four possible values of 2 bits [17]. This work has been extended over the years, see, e.g., [4, 10, 15, 17, 18, 20], and the corresponding codes are called rewriting codes.
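For concreteness, the following sketch renders the classical RS scheme as described in the literature (our own rendering, not code from the patent): a value is first written as a codeword of Hamming weight at most 1, and a later, different value is written as the bitwise complement of its own codeword, which is reachable by 0-to-1 changes only.

```python
# Rivest-Shamir WOM code: two writes of a 2-bit value into 3 memory bits.
FIRST = {0: '000', 1: '001', 2: '010', 3: '100'}
VALUE = {c: v for v, c in FIRST.items()}

def rs_decode(cell: str) -> int:
    if cell.count('1') <= 1:        # still a first-round codeword
        return VALUE[cell]
    return VALUE[''.join('1' if b == '0' else '0' for b in cell)]

def rs_rewrite(cell: str, w: int) -> str:
    if rs_decode(cell) == w:        # same value: no bits need to change
        return cell
    return ''.join('1' if b == '0' else '0' for b in FIRST[w])

cell = FIRST[2]                     # first round: store 2 as '010'
cell = rs_rewrite(cell, 1)          # second round: store 1 as '110'
assert rs_decode(cell) == 1
```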
As a baseline against which the compression efficiency of the new method can be compared, we use the compression ratio defined as the number of information bits divided by the number of storage bits. The number of information bits is in fact the information content of the data, whereas the number of storage bits depends on the way the data is encoded. For example, consider a 3-digit decimal number, with each digit being encoded in a 4-bit binary coded decimal, that is, the digits 0, 1, . . . , 9 are encoded as 0000, 0001, . . . , 1001, respectively. The information content of the three digits is $\lceil\log_2 1000\rceil=10$ and the number of storage bits is 12, which yields the ratio 10/12=0.833. For a standard binary encoding, information and storage bits are equivalent, giving a baseline of 1. For rewriting codes, we use the combined number of information bits of all writing rounds, thus the above mentioned RS-code yields a ratio of $4/3\approx 1.333$. The theoretical best possible ratio is $\log_2 3\approx 1.585$ and the best achieved ratio so far is 1.49, see [18].
For the RS-code, every bit-triplet is coded individually, independently of the preceding ones. The code is thus not context sensitive, and this is true also for many of its extensions. One of the innovations of the present invention is to exploit context sensitivity by using a special encoding in the first round that might be more wasteful than the standard encoding, but has the advantage of allowing the unambiguous reuse of a part of the data bits in the second round, such that the overall number of bits used in both rounds together is increased. This effectively increases the storage capacity of the flash memory between erasing cycles. Taken as a stand-alone rewriting technique, the compression ratio of the basic scheme suggested in this invention is shown to vary between 1.028 in the worst case and 1.194 at best, with an average of 1.162. This is less than the performance of the RS-code.
The new method has, however, other advantages. It can be generalized to yield various partitions between the first and the second rounds, while the RS-code is restricted to use the same number of bits in both rounds. More importantly, the suggested codes can be used as compression boosters, transforming any context insensitive k-rewriting system (with k≧2 writing rounds) into a (k+1)-rewriting system, which may lead to an improved overall compression ratio. One of the variants transforms the RS-code into a 3-rewriting code with compression ratio 1.456, a 9.2% increase of storage space over using RS as a stand-alone encoding. Note that these numbers, as well as those above, are analytically derived, and not experimental estimates.
Consider an unbounded stream of data bits to be stored. It does not matter what these input bits represent and various interpretations might be possible. For example, the binary string 010011100101000101111000 could represent the 24-bit integer 5132664, the three 8-bit ASCII characters NQx, or any other interpretation one may choose to give to 24 bits.
For technical reasons, it is convenient to break the input stream into successive blocks of n bits, for some constant n. This may help limit the propagation of errors and set an upper bound on the numbers that are manipulated. In any case, this does not limit the scope of the method, as the processed blocks can be concatenated to restore the original input. To continue the above example, if n=8, the input blocks are 01001110, 01010001 and 01111000, the first of which represents the character N or the number 78. The description below concentrates on the encoding of a single block of length n.
A block of n bits can be used to store numbers between 0 and $2^n-1$ in what is commonly called the standard binary representation, based on a sum of different powers of 2. Any number x in this range can be uniquely represented by the string $b_{n-1}b_{n-2}\cdots b_1b_0$, with $b_i\in\{0,1\}$, such that $x=\sum_{i=0}^{n-1}b_i 2^i$. But this is not the only possibility. Actually, there are infinitely many binary representations for a given integer, each based on a different numeration system [5]. The numeration system used for the standard representation is the sequence of powers of 2: {1, 2, 4, 8, . . . }. Another popular and useful numeration system in this context is based on the Fibonacci sequence: {1, 2, 3, 5, 8, 13, . . . }.
Fibonacci numbers are defined by the following recurrence relation:

$$F_i = F_{i-1} + F_{i-2} \quad \text{for } i\ge 1,$$

and the boundary conditions $F_0=1$ and $F_{-1}=0$. The number $F_i$, for $i\ge 1$, can be approximated by $\phi^{i+1}/\sqrt{5}$, rounded to the nearest integer, where $\phi=\frac{1+\sqrt{5}}{2}\approx 1.618$ is the golden ratio.
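As a quick sanity check of the recurrence and the closed-form approximation (a sketch; the function names are ours):

```python
# Check the recurrence and the approximation F_i ~ phi^(i+1)/sqrt(5),
# with the document's indexing (F_1 = 1, F_2 = 2, F_3 = 3, ...).
from math import sqrt

PHI = (1 + sqrt(5)) / 2  # the golden ratio

def fib(i: int) -> int:
    """F_i under the boundary conditions F_0 = 1, F_{-1} = 0."""
    a, b = 0, 1            # F_{-1}, F_0
    for _ in range(i):
        a, b = b, a + b
    return b

for i in range(1, 20):
    assert fib(i) == round(PHI ** (i + 1) / sqrt(5))
```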
Any integer x can be decomposed into a sum of distinct Fibonacci numbers; it can therefore be represented by a binary string $c_r c_{r-1}\cdots c_2 c_1$ of length r, called its Fibonacci or Zeckendorf representation [22], such that $x=\sum_{i=1}^{r}c_iF_i$. This can be seen from the following procedure producing such a representation: given the integer x, find the largest Fibonacci number $F_r$ smaller than or equal to x; then continue recursively with $x-F_r$. For example, $49=34+13+2=F_8+F_6+F_2$, so its binary Fibonacci representation would be 10100010. Moreover, the use of the largest possible Fibonacci number in each iteration implies the uniqueness of this representation. Note that as a result of this encoding procedure, there are never consecutive Fibonacci numbers in any of these sums, implying that in the corresponding binary representation, there are no adjacent 1s.
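The greedy procedure just described can be sketched as follows (one possible implementation, not the patent's):

```python
# Greedy Zeckendorf encoding, using the document's indexing F_1 = 1, F_2 = 2.

def zeckendorf(x: int) -> str:
    fibs = [1, 2]
    while fibs[-1] < x:
        fibs.append(fibs[-1] + fibs[-2])
    if fibs[-1] > x:
        fibs.pop()                      # largest F_r <= x
    bits = []
    for f in reversed(fibs):            # F_r down to F_1
        if f <= x:
            bits.append('1')
            x -= f
        else:
            bits.append('0')
    return ''.join(bits)

assert zeckendorf(49) == '10100010'     # 49 = F_8 + F_6 + F_2
assert '11' not in zeckendorf(1000)     # never two adjacent 1s
```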
This property, that the appearance of a 1-bit implies that the following bit must be a zero, has been exploited in several useful applications: robustness to errors [1], the design of Fibonacci codes [6], fast decoding and compressed search [13], compressed matching in dictionaries [14], faster modular exponentiation [11], etc. The present invention is yet another application of this idea.
The repeated encoding will be performed in three steps:
1. Encoding the data of the first round;
2. Preparing the data block for a possible second encoding;
3. Encoding the (new) data of the second round, overwriting the previous data.
In the first step, the n bits of the block are transformed into a block of size r by recoding the integer represented in the input block into its Fibonacci representation. The resulting block will be longer, since more bits are needed, but also sparser, because of the property prohibiting adjacent 1s. To get an estimate of the increase in the number of bits, note that the largest number that can be represented is $y=2^n-1$. The index of the largest Fibonacci number $F_r\approx\phi^{r+1}/\sqrt{5}$ needed to represent y is $r=\lfloor\log_\phi(\sqrt{5}\,y)-1\rfloor=\lfloor 1.44n+0.67\rfloor$. The storage penalty incurred by passing from the standard to the Fibonacci representation is thus about 44%, for any block size n.
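Assuming the zeckendorf() helper from the sketch above, the length estimate can be checked empirically for small block sizes:

```python
# Empirical check of the length bound: the longest Fibonacci representation
# of an n-bit number equals floor(1.44n + 0.67) for the n tested here.
from math import floor

for n in range(2, 16):
    longest = max(len(zeckendorf(x)) for x in range(1, 2 ** n))
    assert longest == floor(1.44 * n + 0.67)
```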
The second step is supposed to be performed after the data written in the first round has finished its life cycle and is not needed any more, but instead of overwriting it by first erasing the entire block, we wish to be able to reuse the block subject to the update constraints of flash memory. The step is optional and not needed for the correctness of the procedure, but it may increase the number of data bits that can be stored in the second round. In the second step, a maximal number of 1-bits is added without violating the non-adjacency property of the Fibonacci encoding. This means that short runs of zeros delimited by 1-bits, like 101 and 1001, are not touched, but the longer ones, like 100001 or 1000001, are changed to 101001 and 1010101, respectively, the added bits being every second zero of the run. In general, in a run of zeros of odd length 2i+1, every second zero is turned on, and this is true also for a run of zeros of even length 2i, except that for the even length the last bit is left as zero, since it is followed by a 1. A similar strategy is used for a run of leading zeros in the block: a run of length 1 is left untouched, but longer runs, like 001, 0001 or 00001, are changed to 101, 1001 and 10101, respectively. As a result of this filling strategy, the data block still does not have any adjacent 1s, but the lengths of the 1-limited zero-runs are now either 1 or 2, and the length of the leading run is either 0 or 1.
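One consistent reading of this filling step is sketched below (the function fill and its loop structure are ours; it reproduces the examples from the text):

```python
# Filling step for the Fibonacci case (m = 2): the bit right after a 1 is
# never touched, and one 0 is kept before the 1 closing an interior run.

def fill(block: str) -> str:
    out = list(block)
    i = 0
    while i < len(out):
        if out[i] == '1':
            i += 2                  # skip the forced 0 after a 1
            continue
        j = i
        while j < len(out) and out[j] == '0':
            j += 1                  # fillable zeros occupy i .. j-1
        stop = j - 1 if j < len(out) else j   # keep a 0 before a closing 1
        for k in range(i, stop, 2):
            out[k] = '1'
        i = j + 2
    return ''.join(out)

assert fill('100001') == '101001'   # examples from the text
assert fill('1000001') == '1010101'
assert fill('00001') == '10101'
assert fill('001') == '101'
```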
In the third step, new data is encoded in the bits immediately to the right of every 1-bit. Since it is known that these positions contained only zeros at the end of step 2, they can be used at this stage to record new data, and their location can be identified. The data block at the end of the third step thus contains bits of three kinds: separator bits (S), data bits (D) and extension bits (E). The first bit of the block is either an S-bit, if it is 1, or an E-bit, if it is 0 (which can only occur if the leading zero-run was of length 1).
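A sketch of this third step, assuming the fill() and zeckendorf() helpers defined above; writing a D-slot only ever turns a 0 into a 1, respecting the flash constraint:

```python
# Second-round write: every position right after a 1-bit is a D-slot that
# holds 0 after step 2, so a new data bit can be recorded there.

def write_round2(block: str, data: str) -> str:
    out, d, i = list(block), 0, 0
    while i < len(out) and d < len(data):
        if out[i] == '1' and i + 1 < len(out):
            if data[d] == '1':
                out[i + 1] = '1'    # record one second-round bit
            d += 1
            i += 2                  # move past the D-slot just used
        else:
            i += 1                  # E-bit or leading 0: carries no data
    return ''.join(out)

filled = fill(zeckendorf(49))                  # '10101010'
assert write_round2(filled, '1011') == '11101111'
```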
Decoding at the end of the first step is according to Fibonacci codes, as in [13], and decoding of the data of the second round at the end of the third step can be done using the decoding automaton appearing in the accompanying drawings.
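The referenced automaton is in a drawing not reproduced here; the following is a plausible reconstruction of a second-round decoder (our own, consistent with the S/D/E structure described above):

```python
# Two-state decoding automaton for the second round, m = 2: state S expects
# a separator, state D reads the data bit that follows it.

def read_round2(block: str) -> str:
    data, state = [], 'S'
    for bit in block:
        if state == 'S':
            if bit == '1':
                state = 'D'         # separator found; next bit is data
            # a 0 in state S is an E-bit (or a leading 0) and is skipped
        else:
            data.append(bit)        # D-slot: one second-round data bit
            state = 'S'
    return ''.join(data)

assert read_round2('11101111') == '1011'   # inverts the example above
```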
Since at the end of the second step no run of zeros can be longer than 2, the worst case scenario is when every third bit is a separator. Any block is then of the form SDESDE . . . , and one third of the bits are data-bits. The number of data bits in the third step is thus 1.44n/3 = 0.48n, which, together with the n bits encoded in the first step, yields 1.48n, 2.76% more than the 1.44n storage bits used. Thus even in the worst case there is a gain, albeit a small one.
The maximal possible benefit will be in the case when there are no E-bits at all, that is, the block is of the form SDSDSD . . . . In this case, half of the bits are D-bits, and the compression ratio will be $(n+1.44n/2)/1.44n\approx 1.194$.
The constraint of the Fibonacci encoding implies that the probabilities of occurrence of 0s and 1s are not the same, as would be the case in the standard binary encoding, when all possible inputs are supposed to be equi-probable. Under such an assumption, the probability of a 1-bit is shown in [11] to tend to a constant when the block size n tends to infinity. From this, one can derive the expected distance between consecutive S-bits, denoted E(2), which is the expected length of a zero-run including the terminating 1-bit in the data block at the end of the second step. This yields an average compression ratio of

$$\frac{1.44n/E(2)+n}{1.44n}=1.162.\qquad(1)$$
Summarizing, the new code effectively expands the storage capacity of flash memory by 3 to 19%, and on average by 16%.
The basic idea leading to the possibility above of multiple encoding rounds is the use of a code in which certain bits are guaranteed to be 0. This is true for the Fibonacci coding, in which every 1-bit is followed by a 0-bit, and it can be extended to a code in which every 1-bit is followed by at least m 0-bits, for m>1. Such a code for m=2 has been designed for the encoding of data on CD-ROMs [9] and is known as Eight-to-Fourteen Modulation (EFM); every byte of 8 bits is mapped to a bit-string of length 14 in which there are at least two zeros between any two 1s.
These properties are obtained by representing numbers according to the basis elements of numeration systems which are extensions of the Fibonacci sequence. To get sparser strings, use the numeration systems based on the following recurrences, see [12]:
$$A_k^{(m)} = A_{k-1}^{(m)} + A_{k-m}^{(m)} \quad \text{for } k>m+1,$$

and the boundary conditions $A_k^{(m)} = k-1$ for $1<k\le m+1$.
In particular, $A_k^{(2)}=F_{k-1}$ are the standard Fibonacci numbers. The first few elements of the sequences $A^{(m)}\equiv\{1=A_2^{(m)}, A_3^{(m)}, A_4^{(m)},\ldots\}$ for 2≦m≦8 are listed in the right part of Table 1 below.
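The recurrence can be sketched directly (the helper name a_sequence is ours); the asserted values match the Fibonacci case and the $A^{(3)}$ elements used in the example of the next paragraph:

```python
# Generate A^(m) from the recurrence and boundary conditions above.

def a_sequence(m: int, count: int) -> list[int]:
    """The elements A_2, A_3, ..., i.e. the sequence indexed from k = 2."""
    seq = [k - 1 for k in range(2, m + 2)]    # A_k = k-1 for 1 < k <= m+1
    while len(seq) < count:
        seq.append(seq[-1] + seq[-m])         # A_k = A_{k-1} + A_{k-m}
    return seq[:count]

assert a_sequence(2, 8) == [1, 2, 3, 5, 8, 13, 21, 34]    # Fibonacci numbers
assert a_sequence(3, 9) == [1, 2, 3, 4, 6, 9, 13, 19, 28]
```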
A closed form expression for the elements of the sequence $A^{(m)}$ can be obtained by considering the characteristic polynomial $x^m-x^{m-1}-1=0$ and finding its m roots $\phi_{m,1},\phi_{m,2},\ldots,\phi_{m,m}$. The element $A_k^{(m)}$ is then a linear combination of the k-th powers of these roots. For these particular polynomials, when m>2, there is only one root, say $\phi_{m,1}\equiv\phi_m$, which is real and larger than 1; all the other roots are complex numbers with norm strictly smaller than 1. For m=2, the second root, $(1-\sqrt{5})/2=-1/\phi$, is also real, but its absolute value is < 1. It follows that with increasing k, all the terms $\phi_{m,j}^k$, $1<j\le m$, quickly vanish, so that the elements $A_k^{(m)}$ can be accurately approximated by powers of the dominant root $\phi_m$ alone, with appropriate coefficients, $A_k^{(m)}\approx a_m\phi_m^{k-1}$. The constants $a_m$ and $\phi_m$ are listed in Table 1.
For a given m, any integer x can be decomposed into a sum of distinct elements of the sequence $A^{(m)}$; it can therefore be uniquely represented by a binary string $c_rc_{r-1}\cdots c_3c_2$ of length r-1, such that $x=\sum_{i=2}^{r}c_iA_i^{(m)}$, using the recursive encoding method presented in the previous section, based on finding in each iteration the largest element of the sequence fitting into the remainder. For example, $36=28+6+2=A_{10}^{(3)}+A_6^{(3)}+A_3^{(3)}$, so its representation according to $A^{(3)}$ would be 100010010. As a result of the encoding procedure, the indices $i_1<i_2<\cdots$ of the elements in the sum for which $c_i=1$ satisfy $i_{k+1}\ge i_k+m$ for $k\ge 1$. In the above example x=36 these indices are 3, 6 and 10. This implies that in the corresponding binary representation, there are at least m-1 zeros between any two 1s.
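The greedy decomposition for general m can be sketched as follows (assuming the a_sequence() helper from the previous sketch):

```python
# Greedy encoding according to A^(m), largest element first.

def a_encode(x: int, m: int) -> str:
    seq = a_sequence(m, m)              # boundary values A_2 .. A_{m+1}
    while seq[-1] < x:
        seq.append(seq[-1] + seq[-m])   # grow until the largest element >= x
    bits = []
    for v in reversed(seq):
        if v <= x:
            bits.append('1')
            x -= v
        else:
            bits.append('0')
    return ''.join(bits).lstrip('0')

assert a_encode(36, 3) == '100010010'   # 36 = A_10 + A_6 + A_3
assert a_encode(49, 2) == '10100010'    # m = 2 recovers the Fibonacci case
```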
Using the same argument as above for the Fibonacci numbers, the length r-1 of the representation according to $A^{(m)}$ of an integer smaller than $2^n$ will be about $(\log_{\phi_m}2)\,n$. These numbers represent the storage penalty paid for the passage to $A^{(m)}$ and are listed in the 4th column of Table 1.
The encoding procedure is similar to the three-step procedure described earlier.
In the first step, the n bits of the block are transformed into a block of size $r\approx(\log_{\phi_m}2)\,n$ by recoding the integer represented in the input block into its representation according to $A^{(m)}$. The resulting block will be longer, since more bits are needed, but also sparser: the larger m, the sparser the representation, because of the property forcing at least m-1 zeros between any two 1s.
In the second step, as above, a maximal number of 1-bits is added without violating the property of having at least m-1 zeros after each 1. This means that in a run of zeros of length j, limited on both sides by 1s, with j≧2m-1, the zeros in positions m, 2m, . . . , $\lfloor(j-m+1)/m\rfloor\,m$ are turned on. For a run of leading zeros of length j (limited by a 1-bit only at its right end), for j≧m, the zeros in positions 1, m+1, 2m+1, . . . , $\lfloor(j-m)/m\rfloor\,m+1$ are turned on. For example, for $A^{(3)}$, 100000000001 is turned into 100100100001, and 0000001 is turned into 1001001. As a result of this filling strategy, the data block still has at least m-1 zeros between 1s, but the lengths of the 1-limited zero-runs are now between m-1 and 2m-2, and the length of the leading run is between 0 and m-1.
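One consistent reading of this general filling step is sketched below (the stopping points implement the requirement of keeping m-1 zeros before the 1 closing a run):

```python
# Filling step for general m: the m-1 positions after a 1 are never touched,
# and m-1 zeros are kept before the 1 closing an interior run.

def fill_m(block: str, m: int) -> str:
    out = list(block)
    i = 0
    while i < len(out):
        if out[i] == '1':
            i += m                      # the m-1 positions after a 1 stay 0
            continue
        j = i
        while j < len(out) and out[j] == '0':
            j += 1                      # fillable zeros occupy i .. j-1
        limit = j - m + 1 if j < len(out) else j
        for p in range(i, limit, m):
            out[p] = '1'
        i = j + m
    return ''.join(out)

assert fill_m('100000000001', 3) == '100100100001'   # example from the text
assert fill_m('0000001', 3) == '1001001'             # leading-run rule
assert fill_m('1000001', 2) == '1010101'             # agrees with m = 2
```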
In the third step, new data is encoded in the m-1 bits immediately to the right of every 1-bit. Since it is known that these positions contained only zeros at the end of step 2, they can be used at this stage to record new data, and their location can be identified. To continue the analogy with the case m=2, there are now data bits of different kinds, $D_1$ to $D_{m-1}$, and similarly extension bits $E_1$ to $E_{m-1}$.
The decoding of the data of the second round at the end of the third step for $A^{(3)}$ can be done using the decoding automaton appearing in the accompanying drawings.
Since at the end of the second step no run of zeros can be longer than 2m-2, the worst case scenario is when every (2m-1)th bit is a separator. Any block is then of the form SDD . . . DEE . . . ESDD . . . DEE . . . , where all the runs of Ds and Es are of length m-1 and (m-1)/(2m-1) of the bits are data-bits. The worst case compression factor is thus

$$\frac{1}{\log_{\phi_m}2}+\frac{m-1}{2m-1},$$

which for m=2 evaluates to the 1.028 derived earlier.
The maximal possible benefit will be in the case when there are no E-bits at all, that is, the block is of the form SDD . . . DSDD . . . DSD . . . , where all the runs of Ds are of length m-1 and the proportion of data-bits is (m-1)/m. In this case, the compression ratio will be

$$\frac{1}{\log_{\phi_m}2}+\frac{m-1}{m},$$

evaluating, for instance, to 1.194 for m=2 and to 1.218 for m=3.
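These figures can be reproduced numerically (a sketch; the bisection-based root finder is ours):

```python
# Worst- and best-case ratios per m: phi_m is the dominant root of
# x^m - x^{m-1} - 1 = 0, found here by simple bisection on [1, 2].
from math import log

def phi_m(m: int) -> float:
    lo, hi = 1.0, 2.0
    for _ in range(60):                 # the polynomial is increasing on [1, 2]
        mid = (lo + hi) / 2
        if mid ** m - mid ** (m - 1) - 1 > 0:
            hi = mid
        else:
            lo = mid
    return lo

for m in (2, 3):
    expansion = log(2) / log(phi_m(m))          # first-round storage factor
    worst = 1 / expansion + (m - 1) / (2 * m - 1)
    best = 1 / expansion + (m - 1) / m
    print(m, round(worst, 3), round(best, 3))   # 2: 1.028 1.194;  3: 0.952 1.218
```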
As to the average compression ratio, we omit here the details but list all the results, the best, worst, and average compression ratios for 2≦m≦6, in Table 2. For each case, the columns headed "ratio" show the proportion of data-bits relative to the total number of bits used in the second round. The denominator in the ratio column for the average case is the expected distance between 1-bits, E(m). As can be seen, for the average case there is always a gain relative to the baseline, and in the worst case only for m=2.
It should be noted that the present invention is relevant only for applications in which the data to be encoded can be partitioned into several writing rounds, and under the assumption that in any round, the data of the previous rounds is not accessible any more. If these assumptions do not apply, the second and subsequent rounds can be skipped, which corresponds to extending the definition of the sequence $A^{(m)}$ also to m=1. Indeed, for m=1, one gets the sequence of powers of 2, that is, the standard binary numeration system, with no restrictions on the appearance of 1-bits. The compression ratio in that case will be 1. For higher values of m, the combined compression ratio will be higher, but the proportion of the first round data will be smaller. Table 3 lists these proportions for 1≦m≦8.
One way to look at these results is thus to choose the order m of the encoding according to the partition between first- and second-round data one may be interested in.
The above ideas can be used to build a compression booster in the following way. Suppose we are given a rewriting system S allowing k rounds. This can be turned into a system with k+1 rounds by using, in a first round, the new encoding as described earlier, which identifies a subset of the bits in which the new data can be recorded. These bits are then used in k additional rounds according to S. Note that only context-insensitive systems, like the RS-code, can be extended in that way. Since the first round recodes the data using more bits, the extension with an additional round of rewriting will not always improve the compression. For example, for the Fibonacci code, even if the first term of the numerator of equation (1), representing the number of bits used in the second round, is multiplied by 4/3, the compression factor of the RS-code, one still gets only 1.318, about 1.1% less than the RS-code used alone. However, using $A^{(m)}$ codes with m>2 in the first round, followed by two rounds of RS, may yield better codes than RS, as can be seen in the last two columns of Table 2, giving the compression ratios and the relative improvement over RS.
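The 1.318 figure follows from equation (1) by the substitution just described; a short check, using the stated average of 1.162:

```python
# The second round stores x bits per first-round input bit; replaying those
# bits through two RS rounds multiplies their contribution by 4/3.
x = 1.162 * 1.44 - 1             # from (x + 1) / 1.44 = 1.162
boosted = (4 / 3 * x + 1) / 1.44
assert round(boosted, 3) == 1.318
```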
This application claims the benefit of U.S. Provisional Application No. 61/844,443, filed July 2013.