Tunnel injection and tunnel release are respectively used to program and erase NAND Flash storage. Both types of operations are stressful to NAND Flash cells, causing the electrical insulation of NAND Flash cells to break down over time (e.g., the NAND Flash cells become “leaky,” which is bad for data that is stored for a long period of time). For this reason, it is generally desirable to keep the number of program and erase cycles down. New techniques for managing NAND Flash storage which reduce the total number of programs and erases would be desirable.
Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. As used herein, the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.
Various embodiments of a NAND Flash storage system which reduces the number of programs and/or erases are described herein. First, some examples of previous and modified versions of logical data chunks stored in NAND Flash are discussed. Then, some examples of how the various versions of the logical data chunks may be used to assist in error correction decoding are described. Finally, some examples of a relocation process (e.g., to consolidate the information stored in the NAND Flash and/or free up blocks) are described.
At 100, one or more write requests which include a plurality of logical data chunks are received. In some cases, the logical data chunks which are received at step 100 are all associated with or part of the same write request. Alternatively, each of the logical data chunks may be associated with its own write request. In some embodiments, the write request(s) is/are received from a host.
At 102, the plurality of logical data chunks are distributed to a plurality of physical pages on Flash such that data from different logical data chunks are stored in different ones of the plurality of physical pages, wherein a logical data chunk is smaller in size than a physical page. For example, by storing each logical data chunk on its own physical page, subsequent updates of those logical data chunks result in fewer total programs and/or erases. In some embodiments, the logical data chunks are distributed to physical pages on different blocks and/or different (e.g., NAND) Flash integrated circuits. Alternatively, the logical data chunks may be distributed to physical pages on the same block and/or same (e.g., NAND) Flash integrated circuit.
In one example, the NAND Flash is used in a hyperscale data center which runs many applications. At least some of those applications have random writes with a relatively small block size (e.g., 512 Bytes) where the small blocks or chunks are updated frequently. This disclosure presents a novel scheme to mitigate the write amplification caused by small chunks of data which are frequently updated.
The following figures show some examples of how the plurality of logical data chunks are distributed to a plurality of physical pages.
In the example shown, NAND Flash integrated circuit (IC) 200 includes multiple blocks, including block j (202). Each block, including block j (202), includes multiple physical pages such as physical page 1 (204), physical page 2 (206), and physical page 3 (208).
In this example, three logical data chunks are received: chunk 1.0 (210), chunk 2.0 (212), and chunk 3.0 (214). These are examples of logical data chunks which are received at step 100 in
In contrast, some other storage systems may choose to group the chunks together and store all of them on the same physical page. For example, some other storage systems may choose to append chunk 1.0, chunk 2.0, and chunk 3.0 to each other (not shown) and store them on the same physical page. As will be described in more detail below, when updates to chunk 1.0, chunk 2.0, and/or chunk 3.0 are subsequently received, the total number of programs and erases is greater (i.e., worse) when the exemplary chunks are stored on the same physical page compared to when they are stored on different physical pages (one example of which is shown here).
In this example, the three chunks (210, 212, and 214) are written to NAND Flash IC 200 by NAND Flash controller 220. NAND Flash controller 220 is one example of a component which performs the process of
The following figure shows another example where chunks are stored on different physical pages but those pages are in different blocks and different NAND Flash integrated circuits.
As before, three logical data chunks have been received and are to be stored in this example. Chunk 1.0 (300) is stored on NAND Flash integrated circuit A (302) in block X (304) in page 1 (306). Chunk 2.0 (310) is stored on NAND Flash integrated circuit B (312) in block Y (314) in page 2 (316). Chunk 3.0 (320) is stored on NAND Flash integrated circuit C (322) in block Z (324) in page 3 (326).
Like the previous example, the three chunks are stored on different physical pages. Unlike the previous example, however, the three chunks are stored on different NAND Flash integrated circuits and in different blocks (e.g., with different block numbers).
The writes of the chunks (300, 310, and 320) to the pages, blocks, and NAND Flash integrated circuits shown here are performed by NAND Flash controller 330, which is one example of a component which performs the process of
The following figures discuss examples of how logical data chunks are updated.
At 400, an additional write request comprising a modified version of one of the plurality of logical data chunks is received. For example, suppose the write request received at step 100 in
At 402, the modified version is stored in a physical page that also stores a previous version of said one of the plurality of logical data chunks. For example, assuming space on the physical page permits, the modified version is written next to the previous version (i.e., on the same physical page as the previous version).
The following figure describes an example of this.
When writing to NAND Flash, pages are typically written as a whole. However, during a write operation, each bitline has its own program and verify check. When one cell reaches its expected programmed state, this bitline is shut down, and no further program pulse will be applied onto this cell (i.e., no more charge will be added to that cell). The other cells in this page that have not reached their expected states will continue the program and verify check until each cell's threshold voltage reaches the individual, desired charge level. In some embodiments, only part of a page is programmed by turning off other bitlines (e.g., to only program chunk 2.0). The underlying physics are not novel. For convenience and brevity, a single bitline is shown for each chunk, but in practice a single bitline corresponds to a single cell, so a chunk actually spans many bitlines.
Diagram 520 shows the same pages at a second point in time after a second (i.e., updated) version of the first chunk is received and stored. In this example, chunk 1.1 (522) is stored next to chunk 1.0 (502b) in page A (504b) because chunk 1.1 is an updated version of chunk 1.0 which replaces chunk 1.0. To write chunk 1.1 (522) to page A (504b), the second-from-left bitline (512b) is selected. The other bitlines (i.e., bitlines 510b, 514b, 516b, and 518b) are not selected since nothing is being written to those locations at this time.
In some embodiments, a NAND Flash controller or other entity performing the process of
In some embodiments, a NAND Flash controller knows where to write chunk 1.1 in page A because each physical page has a write pointer (shown with arrows) that tracks the last chunk written to that page and thus where the next chunk should be written. Chunk 1.1 (522) is one example of a modified version of a logical data chunk which is received at step 400 in
One reason why distributing logical data chunks across different physical pages (e.g., per
Write amplification is the amount of data written to the NAND Flash divided by the amount of data written by a host or other upper-level entity. If chunk 1.0 and chunk 2.0 were stored together on the same physical page (as described above), then the write amplification for updating chunk 1.0 to be chunk 1.1 would be 2/1=2 since the host writes or otherwise updates chunk 1.1 (i.e., 1 chunk of data) but what is actually written to the NAND Flash is chunk 1.1 and chunk 2.0 (i.e., 2 chunks of data).
In contrast, the write amplification associated with diagram 520 is 1/1=1. This is because the host writes chunk 1.1 (i.e., 1 chunk of data) and the actual amount of data written to the NAND Flash is chunk 1.1 (i.e., 1 chunk of data). For example, this may be enabled by selecting the appropriate bitlines (e.g., corresponding to the (next) empty space in the page after the previous version).
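For purposes of illustration only, the write amplification for the two layouts can be computed directly. The following Python sketch (the 512-byte chunk size is merely an assumed value) reproduces the 2/1 versus 1/1 comparison above.

    def write_amplification(bytes_written_to_flash, bytes_written_by_host):
        # Write amplification = data physically written to Flash / data written by the host.
        return bytes_written_to_flash / bytes_written_by_host

    CHUNK = 512  # assumed chunk size in bytes

    # Chunks 1.0 and 2.0 share a page: updating chunk 1.0 forces chunk 2.0 to be rewritten as well.
    shared_page_wa = write_amplification(2 * CHUNK, 1 * CHUNK)    # 2.0

    # Each chunk has its own page: only chunk 1.1 is written (next to chunk 1.0 on the same page).
    separate_page_wa = write_amplification(1 * CHUNK, 1 * CHUNK)  # 1.0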
Keeping the write amplification performance metric down is desirable because extra writes to the NAND Flash delay the system's response time to instructions from the host. Also, as described above, programs (i.e., writes) gradually damage the NAND Flash over time, so it is desirable to keep the number of writes to the NAND Flash to a minimum. For these reasons, it is desirable to keep write amplification down.
Diagram 540 shows the pages at a third point in time. In the state shown, page A (504c) has been filled with different versions of the first chunk (i.e., chunks 1.0-1.4) and is now full. The most recent version of chunk 1.X (i.e., chunk 1.5 (542)) is written to a new physical page because page A is full. In this example, the new page (i.e., page C (546)) is specifically selected to be part of a new or different block (i.e., block Y (544) instead of block X (542)). This is because garbage collection (e.g., a process to copy out any remaining valid data and erase any stored information in order to free up space) is performed at the block level. By writing chunk 1.5 to a new or different block (in this example, block Y (544)), block X (542) can more quickly be garbage collected.
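For purposes of illustration only, the update behavior described above may be sketched as follows. This is a simplified Python model; the PhysicalPage class, the five-chunk page capacity, and the allocate_page_in_different_block helper are assumptions made for the sketch rather than the controller's actual interface.

    class PhysicalPage:
        def __init__(self, block_id, capacity_chunks=5):
            self.block_id = block_id
            self.capacity = capacity_chunks
            self.write_pointer = 0  # offset, in chunks, of the next free slot in the page

        def is_full(self):
            return self.write_pointer >= self.capacity

        def append(self, chunk_version):
            # Program only the bitlines for this chunk; the rest of the page is left untouched.
            slot = self.write_pointer
            self.write_pointer += 1
            return slot

    def store_updated_chunk(page, chunk_version, allocate_page_in_different_block):
        # Append the new version next to the previous ones; once the page is full,
        # continue on a page in a different block so the old block can be garbage
        # collected sooner.
        if page.is_full():
            page = allocate_page_in_different_block(page.block_id)
        page.append(chunk_version)
        return page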
Another benefit to this technique is that there are fewer updates to the Flash translation layer which stores logical to physical mapping information. The following figure illustrates an example of this.
Row 604a in table 600 shows the mapping information for chunk 2.0 (506) in diagram 500 in
Table 610 also corresponds to diagram 500 in
Table 620 and table 630 correspond to diagram 520 in
Table 630 shows the write pointers updated to reflect the new position of the write pointer for chunk 1.X (now chunk 1.1). Row 612b, for example, notes that the write pointer for chunk 1.X is located at an offset of 2 chunks. See, for example, write pointer 550b in
Table 620 and table 630 correspond to diagram 540 in
As shown here, it is not until the page is completely filled that the FTL information for a particular chunk (in this example, chunk 1.X) is updated. In this example, where 5 chunks fit into a page, the FTL information is updated one-fifth as often as it otherwise would be.
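One way to picture the reduced FTL update rate is a mapping entry that records the block, the page, and a per-page write pointer, where the block/page portion of the mapping changes only when the page fills. The following Python sketch is illustrative only; the FTLEntry fields, the five-chunks-per-page figure, and the allocate_new_page helper are assumptions.

    from dataclasses import dataclass

    CHUNKS_PER_PAGE = 5  # matches the example above

    @dataclass
    class FTLEntry:
        block: int
        page: int
        write_pointer: int  # offset, in chunks, where the next version will be written

    def record_update(entry, allocate_new_page):
        # The block/page mapping is rewritten only when the current page is already full,
        # i.e., roughly one out of every CHUNKS_PER_PAGE updates.
        if entry.write_pointer >= CHUNKS_PER_PAGE:
            new_block, new_page = allocate_new_page()
            return FTLEntry(block=new_block, page=new_page, write_pointer=1)
        return FTLEntry(block=entry.block, page=entry.page,
                        write_pointer=entry.write_pointer + 1)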
The benefits associated with the storage technique described herein tend to be most apparent when the chunks are relatively small. In some embodiments, the process of
At 100′, one or more write requests which include a plurality of logical data chunks are received, wherein the size of each logical data chunk in the plurality of logical data chunks does not exceed a size threshold. For example, prior to step 100′, the logical data chunks may be pre-screened by comparing the size of the logical data chunks against the size threshold, and therefore all logical data chunks that reach step 100′ are at or below the size threshold.
At 102, the plurality of logical data chunks are distributed to a plurality of physical pages on the Flash such that data from different logical data chunks are stored in different ones of the plurality of physical pages, wherein a logical data chunk is smaller in size than a physical page.
To illustrate what might happen to logical data chunks which do exceed the size threshold, in one example those larger chunks are grouped or otherwise aggregated together and written to the same physical page. This is merely exemplary and other storage techniques for larger chunks may be used.
In one example, the size of a physical page is 16 or 32 kB but the NAND Flash storage system is used with a file system (e.g., ext4) which uses 512 Bytes as the size of a logical block address. In one example, logical data chunks which are 512 Bytes or smaller are distributed to a plurality of physical pages where each page is 16 or 32 kB. This size threshold is merely exemplary and is not intended to be limiting.
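For purposes of illustration only, the pre-screening and distribution of steps 100′ and 102 might be sketched as follows in Python. The 512-byte threshold matches the example above, while the round-robin placement and the store_large_chunks_together helper are assumptions made for the sketch.

    SIZE_THRESHOLD = 512  # bytes; matches the logical block address size in the example above

    def distribute_chunks(chunks, pages, store_large_chunks_together):
        # Give each small chunk its own physical page (simple round-robin placement here);
        # chunks above the threshold are handled separately, e.g., grouped onto shared pages.
        small = [c for c in chunks if len(c) <= SIZE_THRESHOLD]
        large = [c for c in chunks if len(c) > SIZE_THRESHOLD]

        placements = {}
        for i, chunk in enumerate(small):
            page = pages[i % len(pages)]  # ideally a page on a different block and/or IC per chunk
            placements[i] = page
        store_large_chunks_together(large)
        return placements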
Since older copies of a given logical data chunk are not overwritten until the block is erased, one or more previous versions of the logical data chunk may be used to assist in error correction decoding when decoding fails (e.g., for the most recent version of that logical data chunk). The following figures describe some examples of this.
At 800, a trial version of a logical data chunk is obtained that is based at least in part on a previous version of the logical data chunk, wherein the previous version is stored on a same physical page as a current version of the logical data chunk. For example, suppose chunk 1.0, chunk 1.1, and chunk 1.2 are all different versions of the same logical data chunk, from oldest to most recent. In one example described below, chunk 1.1 (one example of a previous version) and chunk 1.2 (one example of a current version) are stored on the same physical page. As will be described in more detail below, the trial version is generated by copying parts of chunk 1.1 into the trial version.
At 802, error correction decoding is performed on the trial version of the logical data chunk. Conceptually, the idea behind a trial version is to use a previous version to (e.g., hopefully) reduce the number of errors in the failing/current version to be within the error correction capability of the code. For example, suppose that the code can correct (at most) n errors in the data and CRC portions. If there are (n+1) errors in the current version, then error correction decoding will fail. By generating a trial version using parts of the previous version, it is hoped that the number of errors in the trial version will be reduced so that it is within the error correction capability of the code (e.g., reduce the number of errors to n errors or (n−1) errors, which the decoding would then be able to fix). That is, it is hoped that copying part(s) of the previous version into the trial version eliminates at least one existing error and does not introduce new errors.
At 804, it is checked whether error correction decoding is successful. If so, a cyclic redundancy check (CRC) is performed using a result from the error correction decoding on the trial version of the logical data chunk at 806. For example, there is the possibility of a false positive decoding scenario where decoding is successful (e.g., at steps 802 and 804) but the decoder output or result does not match the original data. To identify such false positives, a CRC is used.
After performing the cyclic redundancy check at step 806, it is checked whether the CRC passes at 808. For example, all versions of the logical data chunk include a CRC which is based on the corresponding original data. If the CRC recomputed from the data output by the decoder matches the CRC output by the decoder, then the CRC is declared to pass.
If the CRC passes at step 808, then the result of the error correction decoding on the trial version of the logical data chunk is output at 810. A trial version may fail to produce the original data for a variety of reasons (e.g., copying part of the previous version does not remove existing errors, copying part of the previous version introduces new errors, decoding produces a result which satisfies the error correction decoding process but which is not the original data, etc.), and therefore the decoding result is only output if error correction decoding succeeds and the CRC check passes.
If decoding is not successful at step 804, then a next trial version is obtained at step 800. For example, a different previous version of the logical data chunk may be used. In some embodiments, the process ends if the check at step 804 fails more than a certain number of times.
If the CRC does not pass at step 808, then a next trial version is obtained at step 800. As described above, multiple tries and/or trial versions may be attempted before the process decides to quit.
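For purposes of illustration only, the retry loop of steps 800-810 might be sketched as follows in Python. The generate_trial_versions and ecc_decode helpers are hypothetical, and zlib.crc32 merely stands in for whatever CRC the system actually stores with each chunk.

    import zlib

    def decode_with_trial_versions(current, previous_versions, ecc_decode,
                                   generate_trial_versions, max_tries=8):
        # Try trial versions built from previous versions until one both decodes and
        # passes the CRC check; give up (return None) after max_tries attempts.
        for attempt, trial in enumerate(generate_trial_versions(current, previous_versions)):
            if attempt >= max_tries:
                break
            decoded = ecc_decode(trial)            # steps 802/804: decode the trial version
            if decoded is None:
                continue                           # decoding failed; obtain the next trial version
            data, stored_crc = decoded
            if zlib.crc32(data) == stored_crc:     # steps 806/808: guard against false positives
                return data                        # step 810: output the decoding result
        return None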
In some embodiments, the process of
In order to have a convenient fork or branch point, step 804 and step 808 are included in
It may be helpful to illustrate the process of
A trial version of the logical data chunk (which is based on a previous version of the logical data chunk) is used to assist with decoding because error correction decoding for chunk 1.2 has failed. Diagram 910 shows an example of how the trial version (930) may be generated. In this example, chunk 1.0 (902) and chunk 1.1 (904) are the previous versions of the logical data chunk which are used to generate the trial version. In some embodiments, the two most recent versions of the logical data chunk which pass error correction decoding are used to generate the trial version. Using two or more previous versions (as opposed to a single previous version) may be desirable because if the current version (e.g., chunk 1.2) and a single previous version do not match, it may be difficult to decide whether the mismatch is a genuine change to the data or an error.
In this example, the chunks contain three portions. A data portion (e.g., data 1.0 (911), data 1.1 (912), and data 1.2 (914)) contains the payload data. A cyclic redundancy check (CRC) portion is generated from the corresponding data portion (e.g., CRC 1.0 (915) is based on data 1.0 (911), CRC 1.1 (916) is based on data 1.1 (912), and CRC 1.2 (918) is based on data 1.2 (914)). A parity portion is generated from the corresponding data portion and CRC portion (e.g., parity 1.0 (919) is based on data 1.0 (911) and CRC 1.0 (915), parity 1.1 (920) is based on data 1.1 (912) and CRC 1.1 (916), and parity 1.2 (922) is based on data 1.2 (914) and CRC 1.2 (918)).
The data portions (i.e., data 1.0 (911), data 1.1 (912), and data 1.2 (914)) are compared using a sliding window (e.g., where the length of the sliding window is shorter than the length of the data portion) to obtain similarity values for each of the comparisons. For brevity, only three comparisons are shown here: a comparison of the beginning of the data portions, a comparison of the middle of the data portions, and a comparison of the end of the data portions. These comparisons yield exemplary similarity values of 80%, 98%, and 100%, respectively. For example, each time all of the corresponding bits are the same, it counts toward the similarity value, and each time the corresponding bits do not match (e.g., one of them does not match the other two), it counts against the similarity value.
In some embodiments, the length of a window is relatively long (e.g., 50 bytes) where the total length of the data portion is orders of magnitude larger (e.g., 2 KB). Comparing larger windows and setting a relatively high similarity threshold (e.g., 80% or higher) may better identify windows where any difference between the current version and the previous version is due to errors and not due to some update of the data between versions.
The similarity values (which in this example are 80%, 98%, and 100%) are compared to a similarity threshold (e.g., 80%) in order to identify windows which are highly similar but not identical. In this example, that means identifying those similarity values which are greater than or equal to 80% similar but strictly less than 100% similar. The similarity values which meet these criteria are the 80% and 98% similarity values, which correspond respectively to the beginning window and middle window. Therefore, two trial versions may be generated: one using the beginning window and one using the middle window.
Trial version 930 (i.e., before decoding) shows one example of a trial version which is obtained at step 800 in
The data portion (932) is generated using that part of the previous version which is highly similar to (but not identical to) the current version which failed error correction decoding. In this example, that means copying the middle part of data 1.1 (912b) to be the middle part of trial data 1.2 (932). The beginning part of trial data 1.2 (932) is obtained by copying the beginning part of data 1.2 (914a), and the end part of trial data 1.2 (932) is obtained by copying the end part of data 1.2 (914c).
Copying part of a previous version into a trial version is conceptually the same thing as guessing or hypothesizing about the location of error(s) in the current version and attempting to fix those error(s). For example, if a window of the current version is 0000 and is 1000 in the previous version, then copying 1000 into the trial version is the same thing as guessing that the first bit is an error and fixing it (e.g., by flipping that first bit, 0000→1000).
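For purposes of illustration only, the window comparison and copy-in described above might be sketched as follows in Python. The fixed 50-byte window, the byte-level similarity measure, the 80% threshold, and the use of a single previous version (rather than the two previous versions discussed above) are simplifying assumptions made for the sketch.

    def similarity(a, b):
        # Fraction of byte positions that match between two equal-length windows.
        return sum(1 for x, y in zip(a, b) if x == y) / len(a)

    def trial_versions(current, previous, window=50, threshold=0.80):
        # For each window that is highly similar but not identical, copy the previous
        # version's window into the failing current version; yield the most similar first.
        candidates = []
        for start in range(0, len(current) - window + 1, window):
            s = similarity(current[start:start + window], previous[start:start + window])
            if threshold <= s < 1.0:   # similar enough to be the same data, but not identical
                candidates.append((s, start))
        for s, start in sorted(candidates, reverse=True):
            trial = bytearray(current)
            trial[start:start + window] = previous[start:start + window]
            yield bytes(trial)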
Error correction decoding is then performed on the trial version (930) which produces a trial version after decoding (940). This is one example of the error correction decoding performed at step 802 in
To ensure that the error correction decoding process decoded or otherwise mapped trial data 1.2 (932) to the proper corrected data 1.2 (942) (that is, that the corrected data matches the original data), a double check is performed: the CRC recomputed from the corrected data (942) is compared against the corrected CRC (944) to ensure that they match. This is one example of step 806 in
In some embodiments, multiple trial versions are tested where the various trial versions use various windows and/or various previous versions copied into them (e.g., because trial versions continue to be tested until one passes both error correction decoding and the CRC check). In some embodiments, if there are multiple trial versions, the one with the highest similarity measurement is tested first. For example, if the trial version generated from the middle window with 98% similarity (930) had failed error correction decoding and/or the CRC check, then a trial version generated from the beginning window with 80% similarity (not shown) may be put through error correction decoding and the CRC check next.
In some embodiments, a fragment in a window (e.g., within the 80%, 98%, or 100% similar windows shown here) is ignored when calculating a similarity value and/or generating a trial version. The following figure shows one example of this.
If a similarity value is calculated without ignoring the fragment, then the similarity value is 12/20 or 60%. If, however, the fragment is ignored, then the similarity value is 11/12 or 91.6%.
When generating the trial version, the fragment (952) would be ignored. For example, if the trial version is thought of as the current version with some bits flipped, then the trial version would be the current version flipped only at the last bit location (954) but the bits in the fragment (952) would not be flipped.
In some embodiments, fragments with a high concentration of differences may be identified and ignored when calculating a similarity measurement because those fragments are suspected updates and are not errors. If a trial version is generated using this window, this would correspond to not flipping the bits of the current version (which failed error correction decoding) at the bit locations corresponding to the fragment. In some embodiments, fragments always begin and end with a difference (e.g., shown here with a “≠”) and fragments are identified by starting at some beginning bit location (e.g., a difference) and adding adjacent bit locations (e.g., expanding leftwards or rightwards) so long as the difference value stays above some threshold (e.g., a fragment difference threshold). Once the difference value drops below that threshold, the end(s) may be trimmed to begin/end with a difference. For example, fragment 952 may be identified in this manner.
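For purposes of illustration only, one possible way to identify such a fragment is sketched below in Python, under the assumption that the "difference value" is the fraction of differing bit positions inside the candidate fragment; the exact definition is left open above.

    def find_fragment(diffs, start, fragment_difference_threshold=0.5):
        # diffs[i] is True where the current and previous versions differ at bit i.
        # Grow the fragment outward from the differing bit at `start` while the fraction
        # of differing bits inside it stays above the threshold, then trim the ends so
        # the fragment both begins and ends with a difference.
        lo = hi = start
        grew = True
        while grew:
            grew = False
            for new_lo, new_hi in ((lo - 1, hi), (lo, hi + 1)):
                if 0 <= new_lo and new_hi < len(diffs):
                    density = sum(diffs[new_lo:new_hi + 1]) / (new_hi - new_lo + 1)
                    if density > fragment_difference_threshold:
                        lo, hi, grew = new_lo, new_hi, True
                        break
        while not diffs[lo]:
            lo += 1
        while not diffs[hi]:
            hi -= 1
        return lo, hi  # bit positions to ignore when computing the similarity value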
The following flowcharts more generally and/or formally describe the process of generating a trial version illustrated above.
At 1000, a plurality of windows of the previous version are compared against a corresponding plurality of windows of the modified version in order to obtain a plurality of similarity measurements. See, for example, the three windows in
At 1002, one or more windows are selected based at least in part on the plurality of similarity measurements and a similarity threshold. In some embodiments, only one window is selected and that window is the one with the highest similarity measurement that exceeds the similarity threshold but is not a perfect match. In some embodiments, multiple windows are selected (e.g., all windows that exceed a similarity threshold).
At 1004, the selected windows of the previous version are included in the trial version. For example, in
At 1006, the current version is included in any remaining parts of the trial version not occupied by the selected windows of the previous version. In
At 1000′, a plurality of windows of the previous version are compared against a corresponding plurality of windows of the modified version in order to obtain a plurality of similarity measurements, including by ignoring a fragment within at least one of the plurality of windows which has a difference value which exceeds a fragment difference threshold. See, for example, fragment 952 in
At 1002, one or more windows are selected based at least in part on the plurality of similarity measurements and a similarity threshold.
At 1004′, the selected windows of the previous version are included in the trial version except for the fragment. As described above, this means leaving those bits of the current version which fall into the fragment alone (i.e., not flipping them). Other bit locations outside of the fragment (e.g., isolated difference 954 in
At 1006, the current version is included in any remaining parts of the trial version not occupied by the selected windows of the previous version.
Returning to
At 1100, a metric associated with write frequency is obtained for each of a plurality of logical data chunks, wherein the plurality of logical data chunks are distributed to a plurality of physical pages in a first block such that data from different logical data chunks are stored in different ones of the plurality of physical pages in the first block and a logical data chunk is smaller in size than a physical page. To put it another way, the first block is a source block which is input to the relocation process. Each of the logical data chunks in the plurality gets its own page (e.g., the various versions of a first logical data chunk (e.g., chunk 1.X) do not have to share the same physical page with the various versions of a second logical data chunk (e.g., chunk 2.X)).
At 1102, the plurality of logical data chunks are divided into a first group and a second group based at least in part on the metrics associated with write frequency. In some embodiments, division criteria used at step 1102 are adjusted until some desired relocation outcome is achieved. For example, the write frequency metrics may be compared against division criteria such as a write pointer position threshold or a percentile cutoff (e.g., associated with a distribution) at step 1102. If the desired relocation outcome is n total pages split amongst some number of shared pages (e.g., pages on which logical data chunks share a page) and some number of dedicated pages (e.g., pages on which logical data chunks have their own page), then the division criteria may be adjusted until the desired total number of pages (or, more generally, the desired relocation outcome) is reached.
At 1104, the plurality of logical data chunks in the first group are distributed to a plurality of physical pages in a second block such that data from different logical data chunks in the first group are stored in different ones of the plurality of physical pages in the second block. For example, the current version of the logical data chunks in the first group may be copied from the first block (i.e., a source block) into a second block (i.e., a destination block) where each logical data chunk gets its own page in the second block.
At 1106, the plurality of logical data chunks in the second group are stored in a third block such that data from at least two different logical data chunks in the second group are stored in a same physical page in the third block. For example, the current version of the logical data chunks in the second group may be copied from the first block (i.e., a source block) to the third block (i.e., a destination block) where the logical data chunks share pages in the third block.
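For purposes of illustration only, steps 1100-1106 might be sketched as follows in Python, using the write pointer position as the write-frequency metric; the copy_to_own_page and copy_to_shared_page helpers are assumptions made for the sketch.

    def relocate_block(chunks, write_pointer_threshold,
                       copy_to_own_page, copy_to_shared_page):
        # Split the source block's chunks into frequently and infrequently updated groups
        # (steps 1100-1102), then relocate each group to its own destination block.
        hot = [c for c in chunks if c.write_pointer > write_pointer_threshold]
        cold = [c for c in chunks if c.write_pointer <= write_pointer_threshold]

        for chunk in hot:                     # step 1104: one chunk per page in the second block
            copy_to_own_page(chunk.current_version())

        copy_to_shared_page(                  # step 1106: cold chunks share pages in the third block
            [c.current_version() for c in cold])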
The following figures show some examples of this.
In this example, the write pointers (shown as an arrow after each current version of each logical data chunk) are compared against a write pointer position threshold (1220). If the write pointer exceeds the threshold, then the current version of the corresponding logical data chunk is copied to block p (1222) where each logical data chunk gets its own physical page. For example, logical data chunks A (1202a), C (1206a), 3 (1216a), and 4 (1218a) meet this division criterion and are copied to block p where each gets its own page (see, e.g., how chunks A (1202b), C (1206b), 3 (1216b), and 4 (1218b) are on different physical pages by themselves). The older versions are not copied to block p in this example.
If a write pointer does not exceed the threshold, then the current version of the corresponding logical data chunk is copied to block q (1224) where logical data chunks share physical pages. For example, logical data chunks B (1204a) and D (1208a) have write pointers which are less than the threshold (1220), and current versions of those logical data chunks are copied to the same physical page in block q (see chunk B (1204b) and chunk D (1208b)). Similarly, logical data chunks 1 (1212a) and 2 (1214a) have write pointers which do not exceed the threshold, and current versions of those logical data chunks share the same physical page in block q (see chunk 1 (1212b) and chunk 2 (1214b)).
As described above, after relocation has completed, garbage collection (not shown) may be performed on block i (1200) and block j (1210).
As shown here, the relocation process divides the logical data chunks into two groups: more frequently updated chunks and less frequently updated chunks. During relocation, the more frequently updated chunks are given their own physical page. See, for example, block p (1222). The less frequently updated chunks share physical pages with other less frequently updated chunks. See, for example, block q (1224). This may be desirable for a number of reasons. For one thing, the more frequently updated chunks are given more space for updates (e.g., roughly an entire page of space for updates instead of roughly half a page of space of updates). Also, separating more frequently updated chunks from less frequently updated chunks may reduce write amplification and/or increase the number of free blocks available at any given time.
In some embodiments, the threshold (1220) is set or tuned to a value based on some desired relocation outcome. For example, if free blocks are at a premium and it would be desirable to pack the logical data chunks in more tightly, the threshold may be set to a higher value (e.g., so that fewer logical data chunks get their own physical page). That is, any threshold may be used and the value shown here is merely exemplary.
Referring back to
Diagram 1310 shows this same process applied to a different distribution. Note, for example, that the shape of the distribution and the mean/median of the distribution are different. As before, logical data chunks in the bottom 50% of the distribution (1312) are relocated to shared pages and logical data chunks in the upper 50% of the distribution (1314) are relocated to their own pages.
As shown here, using or otherwise taking a distribution into account may be desirable because it is adaptive to various distributions. For example, if a write pointer position threshold of 6.5 had been used instead, then in the example of diagram 1300, all of the logical data chunks would be assigned to shared pages. In contrast, with a write pointer position threshold of 6.5 applied to diagram 1310, all of the logical data chunks would be assigned their own page.
Although a percentile cutoff of 50% is shown here, any percentile cutoff may be used.
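For purposes of illustration only, a percentile-based division might be sketched as follows in Python; the 50% cutoff and the use of the write pointer position as the metric mirror the example above, and the boundary calculation shown is one simple choice among many.

    def split_by_percentile(chunks, cutoff=0.50):
        # Chunks above the cutoff percentile of write pointer position get their own pages;
        # the rest share pages. Because the boundary is taken from the observed distribution,
        # the division adapts to distributions like those in diagrams 1300 and 1310.
        positions = sorted(c.write_pointer for c in chunks)
        boundary = positions[int(cutoff * (len(positions) - 1))]
        own_page = [c for c in chunks if c.write_pointer > boundary]
        shared_page = [c for c in chunks if c.write_pointer <= boundary]
        return own_page, shared_page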
Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.