The field relates generally to data compression techniques.
Data often contains redundancy in the form of repeated patterns in a data frame, such as repeated bits, repeated bytes, repeated strings of bits and repeated strings of bytes. Compression represents frequently repeated data patterns with shorter binary codes, thus reducing the total number of bits used to send out the entire data frame. For example, the letter “e” appears most frequently in English text, so the letter “e” is represented by one dot in Morse code for transmission efficiency.
Data compression techniques aim to identify such repeated data patterns and then replace them with shorter strings. Many data compression techniques rely on one or more search engines to find such redundancy. There is a tradeoff between compression efficiency and the size of the search engines. Generally, larger search engines exhibit better compression efficiency, but also bring larger area and power consumption requirements for a hardware implementation, as well as longer latency in performing the search.
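The tradeoff between search scope and compression efficiency can be illustrated in software. The following is a minimal sketch, assuming that the window size of a DEFLATE compressor (the `wbits` parameter of Python's standard `zlib` module) stands in for the size of a hardware search engine; the frame contents and the helper name `deflate_size` are hypothetical.

```python
import random
import zlib


def deflate_size(data: bytes, wbits: int) -> int:
    """Return the zlib-compressed size of data using a 2**wbits-byte search window."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, wbits)
    return len(compressor.compress(data) + compressor.flush())


# Hypothetical frame: a 4 KB incompressible block followed by an exact copy of it.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(4096))
frame = block + block

# 512-byte window: the copy lies 4096 bytes back, outside the search scope.
small_window = deflate_size(frame, wbits=9)
# 32 KB window: the copy is within the search scope and can be replaced by references.
large_window = deflate_size(frame, wbits=15)

print(small_window, large_window)
```

In this sketch the larger window removes the repeated block while the smaller window does not, which mirrors the efficiency side of the tradeoff; the area, power and latency costs of a larger hardware search engine are not modeled.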
For a given search engine with a predefined boundary or a target engine with a smaller boundary size, a need exists for improved data compression techniques that remove redundancy across the boundary of compression search engines.
In one embodiment, a method comprises splitting the data frame into a plurality of sub-chunks; comparing at least two of the plurality of sub-chunks to one another to remove at least one sub-chunk from the plurality of sub-chunks that substantially matches at least one other sub-chunk in the plurality of sub-chunks to generate a remaining plurality of sub-chunks; generating matching sub-chunk information for data reconstruction identifying the at least one removed sub-chunk and the corresponding substantially matched at least one other sub-chunk; grouping the remaining plurality of sub-chunks into sub-units; removing substantially repeated patterns within the sub-units to generate corresponding compressed sub-units; and combining the compressed sub-units with the matching sub-chunk information to generate a compressed data frame.
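The sub-chunk comparison and the matching sub-chunk information can be sketched as follows. This is an illustrative sketch only, assuming exact (rather than substantial) matching, zero-based sub-chunk indices, and a hypothetical function name `remove_identical_sub_chunks`; it is not presented as the claimed implementation.

```python
from typing import Dict, List, Tuple


def remove_identical_sub_chunks(
    frame: bytes, sub_chunk_size: int
) -> Tuple[List[bytes], List[Tuple[int, int]]]:
    """Split frame into sub-chunks and drop exact duplicates.

    Returns (unique_sub_chunks, matching_info), where each matching_info entry
    (removed_index, source_index) names the earlier sub-chunk to copy from.
    """
    sub_chunks = [
        frame[i:i + sub_chunk_size] for i in range(0, len(frame), sub_chunk_size)
    ]
    first_seen: Dict[bytes, int] = {}  # sub-chunk content -> index of first occurrence
    unique: List[bytes] = []
    matching_info: List[Tuple[int, int]] = []
    for index, chunk in enumerate(sub_chunks):
        if chunk in first_seen:
            # Duplicate: remove it and record where to copy it from.
            matching_info.append((index, first_seen[chunk]))
        else:
            first_seen[chunk] = index
            unique.append(chunk)
    return unique, matching_info
```

For a frame whose fourth sub-chunk repeats the first, the sketch yields the entry `(3, 0)`, that is, the fourth sub-chunk is to be copied from the first. Only the unique sub-chunks are then grouped into sub-units and passed to the compressor.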
In one or more embodiments, the step of removing substantially repeated patterns comprises applying the plurality of sub-units to a compressor comprising compression search engines that identify the substantially repeated patterns in the plurality of sub-units to generate the compressed sub-units. The step of removing at least one sub-chunk from the plurality of sub-chunks removes redundancy from the plurality of sub-chunks across a boundary of a plurality of the compression search engines.
In some embodiments, the data frame is reconstructed by decompressing the compressed sub-units and restoring the at least one removed sub-chunk using the matching sub-chunk information.
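A minimal reconstruction sketch follows, assuming that `zlib` stands in for the compressor, that the frame length is a multiple of the sub-chunk size, that the matching sub-chunk information is a list of zero-based `(removed_index, source_index)` pairs, and that each source sub-chunk precedes the sub-chunks copied from it. The function name `reconstruct_frame` is hypothetical.

```python
import zlib
from typing import List, Tuple


def reconstruct_frame(
    compressed_sub_units: List[bytes],
    matching_info: List[Tuple[int, int]],
    sub_chunk_size: int,
) -> bytes:
    """Decompress the sub-units, then restore the removed sub-chunks."""
    # Step 1: decompress every sub-unit to recover the unique sub-chunks in order.
    unique_data = b"".join(zlib.decompress(unit) for unit in compressed_sub_units)
    unique = [
        unique_data[i:i + sub_chunk_size]
        for i in range(0, len(unique_data), sub_chunk_size)
    ]

    # Step 2: walk the original positions; a removed position is filled by
    # copying the already-restored source sub-chunk named in matching_info.
    removed = dict(matching_info)  # removed_index -> source_index
    remaining = iter(unique)
    restored: List[bytes] = []
    for index in range(len(unique) + len(matching_info)):
        if index in removed:
            restored.append(restored[removed[index]])
        else:
            restored.append(next(remaining))
    return b"".join(restored)
```

The total number of sub-chunks is recovered as the number of unique sub-chunks plus the number of entries in the matching information, so no separate count needs to be stored in this sketch.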
In one illustrative embodiment, the data frame comprises one or more host pages compressed substantially simultaneously. The plurality of sub-chunks each comprise (i) one of the host pages, such that substantially duplicated host pages are removed, and/or (ii) a portion of one of the host pages, such that substantial duplication across a plurality of host pages is removed. The compressed data frame for a plurality of the host pages compressed substantially simultaneously comprises a host page address for each host page in the plurality of the host pages.
Other illustrative embodiments include, without limitation, apparatus, systems, methods and computer program products comprising processor-readable storage media.
Illustrative embodiments will be described herein with reference to exemplary solid state storage devices and associated storage media, controllers, and other processing devices. It is to be appreciated, however, that these and other embodiments are not restricted to the particular illustrative system and device configurations shown. Accordingly, the term “solid state storage device” as used herein is intended to be broadly construed, so as to encompass, for example, any storage device implementing the cross-boundary compression techniques described herein. Numerous other types of storage systems are also encompassed by the term “solid state storage device” as that term is broadly used herein.
In one or more embodiments, improved data compression techniques are provided that remove redundancy across the boundary of search engines. Among other benefits, the disclosed cross-boundary data compression techniques employ a plurality of smaller search engines to improve data processing throughput, while also further improving compression efficiency. In addition, in some embodiments, the interface of the disclosed cross-boundary data compression techniques can be the same as that of conventional compression implementations. Thus, the disclosed cross-boundary data compression techniques can be employed wherever data compression is used, including, for example, storage devices and file archiving software.
After compression is performed by the compressor 120 to generate compressed sub-units 130-1 through 130-4, a combiner 150 combines the four compressed sub-units 130 into one compressed Host Page, which is written to the flash storage medium and mapped in the Flash Translation Layer (FTL).
Generally, the scope of the search is bounded by the size, or the boundary, of the search engines 125. Each search engine 125 removes substantially repeated patterns within a given sub-unit 115, such as repeated bits, repeated bytes, repeated strings of bits and repeated strings of bytes, to generate a corresponding compressed sub-unit 130. One or more aspects of the disclosure recognize that redundancy beyond the search boundary of the search engines 125 (e.g., a given sub-unit 115 searched by a search engine 125) cannot be compressed. In the exemplary embodiment of
The size of the Host Page in
Since one Host Page is one unit to be mapped and stored into the physical flash media of the SSD, it is usually compressed as one unit of data. However, the compression unit size and the search engine size are not expected to grow to accommodate large Host Pages. The reason is to avoid increased cost in area, power and delay from the compressor hardware. As a result, one Host Page is often split using splitter 110 into multiple smaller sub-pages that fit a conventional compressor design.
The sub-chunks 220 are applied to an identical sub-chunk remover (ISR) 230 within the CBC 205. Generally, the CBC 205 removes redundancy across the boundary of the search engines 125 of
The ISR 230 compares the sub-chunks 220 to one another to remove any duplicate sub-chunk 220 that substantially matches at least one other sub-chunk 220, as discussed further below in conjunction with
In addition, the ISR 230 generates identical sub-chunk information (ISI) that is applied to the combiner 150. Generally, the ISI provides matching sub-chunk information for data reconstruction identifying the removed sub-chunk(s) and the corresponding substantially matched sub-chunk, as discussed further below in conjunction with
The remaining unique sub-chunks (sub-chunks 220-1, 220-7, 220-8 and 220-20 in
It is again noted that the redundancy detected by the ISR process 300 cannot be identified by the search engines 125 in the Host Page compressor 100 of
It is noted that a smaller sub-chunk size will require more sub-chunks given the same Host Page size, and more bits to represent the Identical Sub-Chunk Information (ISI) of
The ISI table 450 contains all of the information needed for the reconstruction: the size of the table 450 indicates how many sub-chunks are removed, and each entry in the table 450 indicates which unique sub-chunk to copy to restore the removed sub-chunks. For example, the entry (S4, S1) means that sub-chunk S4 should be copied from sub-chunk S1. Since there are nine sub-chunks in total in the input sub-chunks 220 shown in
The ISI information in the ISI table 450 of
In one or more embodiments, the decompression process will first generate decompressed unique data corresponding to the unique sub-chunks 350 of
In the embodiment of
In various flexible grouping embodiments, the sub-chunks can also be grouped into fewer and larger sub-units so that the search engine can get more repeated patterns and a better compression efficiency.
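The effect of grouping can be sketched as follows, assuming `zlib` as a stand-in for one search engine per sub-unit. The data and the helper name `compressed_size` are hypothetical; the sub-chunks are constructed so that none is identical to another (an identical sub-chunk remover would keep all of them), yet they share most of their content.

```python
import random
import zlib
from typing import List


def compressed_size(sub_chunks: List[bytes], chunks_per_unit: int) -> int:
    """Total size after grouping sub-chunks into sub-units and compressing each separately."""
    total = 0
    for start in range(0, len(sub_chunks), chunks_per_unit):
        sub_unit = b"".join(sub_chunks[start:start + chunks_per_unit])
        total += len(zlib.compress(sub_unit))
    return total


# Four 1 KB sub-chunks: a distinct leading tag byte followed by shared random content.
rng = random.Random(1)
base = bytes(rng.getrandbits(8) for _ in range(1023))
sub_chunks = [bytes([tag]) + base for tag in range(4)]

# One sub-chunk per sub-unit: the shared content is invisible to each search.
one_chunk_per_unit = compressed_size(sub_chunks, chunks_per_unit=1)
# All four sub-chunks in one larger sub-unit: the shared content is found.
four_chunks_per_unit = compressed_size(sub_chunks, chunks_per_unit=4)

print(one_chunk_per_unit, four_chunks_per_unit)
```

In this sketch the single larger sub-unit compresses to a smaller total, at the cost of a larger search scope per sub-unit.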
In the embodiment of
The sub-chunks can also be rearranged and/or reordered if better compression efficiency can be achieved. The reordering information should be included in the compressed data, together with the ISI information, for original data reconstruction. How the unique sub-chunks 350 are grouped and/or reordered into sub-units 115 depends on the data pattern and/or performance requirement.
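One possible form of the reordering information is a list giving the original position of each reordered sub-chunk. The sketch below assumes, purely for illustration, that sub-chunks are sorted by content so that similar sub-chunks become adjacent; the ordering criterion and the function names are hypothetical and are not specified by the disclosure.

```python
from typing import List, Tuple


def reorder_sub_chunks(sub_chunks: List[bytes]) -> Tuple[List[bytes], List[int]]:
    """Reorder sub-chunks (here: sorted by content) and record the order used.

    order[k] is the original position of the k-th reordered sub-chunk.
    """
    order = sorted(range(len(sub_chunks)), key=lambda i: sub_chunks[i])
    return [sub_chunks[i] for i in order], order


def undo_reorder(reordered: List[bytes], order: List[int]) -> List[bytes]:
    """Place each reordered sub-chunk back at its original position."""
    original: List[bytes] = [b""] * len(reordered)
    for position, source_index in enumerate(order):
        original[source_index] = reordered[position]
    return original
```

The `order` list would be stored alongside the ISI information so that the decompressor can undo the reordering before restoring the removed sub-chunks.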
In a cross-boundary compression implementation, the sub-chunks SB1 through SB8 are applied to the CBC 205 (as discussed above in conjunction with
Note that in the case of
It has been found that the compression shown in
In the example of
The 16 KB data is applied to the CBC 205 of
The unique Host Pages (e.g., 8 KB of data) are applied as one or more sub-units to the search engines 125 of the compressor 120, which generates compressed versions of the unique Host Pages. The combiner (not shown in
Since one compressed data frame 730 should be decompressed together to recover original content, the compressed data frame 730 comprises Host Page Address (HPA) information for each compressed Host Page HP in order to distinguish data for different Host Pages.
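A compressed data frame carrying HPA information might be laid out as in the following sketch. The `CompressedFrame` structure, its field names, the function names, and the use of `zlib` and exact page matching are all assumptions made for illustration; the disclosure does not prescribe this layout.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CompressedFrame:
    hpas: List[int]             # Host Page Address of every page, in input order
    isi: List[Tuple[int, int]]  # (removed page position, position of page to copy)
    payload: bytes              # compressed unique Host Pages


def compress_host_pages(pages: Dict[int, bytes]) -> CompressedFrame:
    """Remove duplicated Host Pages, then compress the unique pages together."""
    hpas = list(pages)
    first_seen: Dict[bytes, int] = {}
    unique: List[bytes] = []
    isi: List[Tuple[int, int]] = []
    for position, hpa in enumerate(hpas):
        page = pages[hpa]
        if page in first_seen:
            isi.append((position, first_seen[page]))
        else:
            first_seen[page] = position
            unique.append(page)
    return CompressedFrame(hpas, isi, zlib.compress(b"".join(unique)))


def read_host_page(frame: CompressedFrame, hpa: int, page_size: int) -> bytes:
    """Decompress the whole frame and return the page stored under hpa."""
    data = zlib.decompress(frame.payload)
    unique = iter(data[i:i + page_size] for i in range(0, len(data), page_size))
    removed = dict(frame.isi)
    restored: List[bytes] = []
    for position in range(len(frame.hpas)):
        if position in removed:
            restored.append(restored[removed[position]])
        else:
            restored.append(next(unique))
    # The HPA list distinguishes which restored page belongs to which address.
    return restored[frame.hpas.index(hpa)]
```

Because the whole frame is decompressed to serve any one page, the HPA list is what allows the requested Host Page to be picked out of the recovered content.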
Without the CBC 205 in the embodiment of
If the sub-chunk size in the CBC 205 of
If the compression of multiple Host Pages as one compression unit is not supported in an SSD controller, the Flash Translation Layer (FTL), which keeps a record of which Host Page is written to which location in the flash memory, is optionally modified so that multiple Host Pages can be mapped to the same flash memory location.
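Such a modified mapping can be sketched as a table in which several Host Page Addresses share one flash location. The class name, method names and the notion of a per-frame `slot` are hypothetical and serve only to illustrate the many-to-one mapping.

```python
from typing import Dict, List, Tuple


class FlashTranslationLayer:
    """Toy mapping table: Host Page Address -> (flash location, slot within the frame)."""

    def __init__(self) -> None:
        self._table: Dict[int, Tuple[int, int]] = {}

    def map_frame(self, hpas: List[int], flash_location: int) -> None:
        # Every Host Page in one compressed frame points at the same flash location.
        for slot, hpa in enumerate(hpas):
            self._table[hpa] = (flash_location, slot)

    def lookup(self, hpa: int) -> Tuple[int, int]:
        return self._table[hpa]
```

A read of any mapped Host Page then fetches the shared compressed frame from the one flash location and extracts the requested page from it.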
The remaining sub-chunks are grouped into sub-units during step 840. Repeated patterns within the sub-units are removed during step 850 to generate corresponding compressed sub-units. Finally, the compressed sub-units are combined with the matching sub-chunk information during step 860 to generate a compressed data frame.
The particular processing operations and other system functionality described in conjunction with the flow diagram of
Functionality such as that described in conjunction with the flow diagram of
As shown in
The solid state storage media 950 comprises a memory array, such as a single-level or multi-level cell flash memory, a NAND flash memory, a phase-change memory (PCM), a magneto-resistive random access memory (MRAM), a nano RAM (NRAM), a NOR flash memory, a dynamic RAM (DRAM) or another non-volatile memory (NVM). While the invention is illustrated primarily in the context of a solid state storage device (SSD), the disclosed cross-boundary compression techniques can be applied in hard disk drives (HDD) and other storage devices, as would be apparent to a person of ordinary skill in the art based on the present disclosure.
It should be understood that the particular cross-boundary compression arrangements illustrated in
Illustrative embodiments disclosed herein can provide a number of significant advantages relative to conventional arrangements.
For example, one or more embodiments provide significantly improved redundancy detection by splitting one data frame into multiple smaller sub-chunks, so that redundant sub-chunks can be removed beyond the boundary of one compression search engine. As a result, compression efficiency is improved and searching latency is reduced.
In some embodiments, where multiple Host Pages are compressed and/or mapped together in SSDs, duplicated Host Pages are removed before compression to achieve data deduplication (e.g., redundancy removal at the Host Page level). When such deduplication techniques are combined with smaller sub-chunks, redundancy is removed both inside and across Host Pages.
It is to be appreciated that the particular advantages described above and elsewhere herein are associated with particular illustrative embodiments and need not be present in other embodiments. Also, the particular types of cross-boundary compression features and functionality as illustrated in the drawings and described above are exemplary only, and numerous other arrangements may be used in other embodiments.
As mentioned previously, at least portions of the disclosed cross-boundary compression system may be implemented using one or more processing platforms. A given such processing platform comprises at least one processing device comprising a processor coupled to a memory. The processor and memory in some embodiments comprise respective processor and memory elements of a virtual machine or container provided using one or more underlying physical machines. The term “processing device” as used herein is intended to be broadly construed so as to encompass a wide variety of different arrangements of physical processors, memories and other device components as well as virtual instances of such components. For example, a “processing device” in some embodiments can comprise or be executed across one or more virtual processors. Processing devices can therefore be physical or virtual and can be executed across one or more physical or virtual processors. It should also be noted that a given virtual device can be mapped to a portion of a physical one.
Some illustrative embodiments of a processing platform that may be used to implement at least a portion of an information processing system comprise cloud infrastructure including virtual machines. The cloud infrastructure further comprises sets of applications running on respective ones of the virtual machines. These and other types of cloud infrastructure can be used to provide what is also referred to herein as a multi-tenant environment. One or more system components such as cross-boundary compressor 205 and/or compressor 120, or portions thereof, are illustratively implemented for use by tenants of such a multi-tenant environment.
The disclosed cross-boundary compression arrangements may be implemented using one or more processing platforms. One or more of the processing modules or other components may therefore each run on a computer, storage device or other processing platform element. A given such element may be viewed as an example of what is more generally referred to herein as a “processing device.”
Referring now to
The processing device 1002-1 in the processing platform 1000 comprises a processor 1010 coupled to a memory 1012. The processor 1010 may comprise a microprocessor, a microcontroller, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other type of processing circuitry, as well as portions or combinations of such circuitry elements. The memory 1012 may comprise random access memory (RAM), read only memory (ROM) or other types of memory, in any combination. The memory 1012 and other memories disclosed herein should be viewed as illustrative examples of what are more generally referred to as “processor-readable storage media” storing executable program code of one or more software programs.
Also included in the processing device 1002-1 is network interface circuitry 1014, which is used to interface the processing device with the network 1004 and other system components, and may comprise conventional transceivers.
The other processing devices 1002, if any, of the processing platform 1000 are assumed to be configured in a manner similar to that shown for processing device 1002-1 in the figure.
Again, the particular processing platform 1000 shown in the figure is presented by way of example only, and the given system may include additional or alternative processing platforms, as well as numerous distinct processing platforms in any combination, with each such platform comprising one or more computers, storage devices or other processing devices.
Multiple elements of the system may be collectively implemented on a common processing platform of the type shown in
Articles of manufacture comprising such processor-readable storage media are considered illustrative embodiments. A given such article of manufacture may comprise, for example, a storage array, a storage disk or an integrated circuit containing RAM, ROM or other electronic memory, or any of a wide variety of other types of computer program products. The term “article of manufacture” as used herein should be understood to exclude transitory, propagating signals. Numerous other types of computer program products comprising processor-readable storage media can be used.
Again, the particular processing platform 1000 shown in
It should therefore be understood that in other embodiments different arrangements of additional or alternative elements may be used. At least a subset of these elements may be collectively implemented on a common processing platform, or each such element may be implemented on a separate processing platform.
Also, numerous other arrangements of computers, servers, storage devices or other components are possible in the cross-boundary compression system. Such components can communicate with other elements of the cross-boundary compression system over any type of network or other communication media.
As indicated previously, components of an information processing system as disclosed herein can be implemented at least in part in the form of one or more software programs stored in memory and executed by a processor of a processing device. For example, at least portions of the functionality of the cross-boundary compression process 800 of
It should again be emphasized that the above-described embodiments are presented for purposes of illustration only. Many variations and other alternative embodiments may be used. For example, the disclosed techniques are applicable to a wide variety of other types of information processing systems and cross-boundary compression systems. Also, the particular configurations of system and device elements and associated processing operations illustratively shown in the drawings can be varied in other embodiments. Moreover, the various assumptions made above in the course of describing the illustrative embodiments should also be viewed as exemplary rather than as requirements or limitations of the disclosure. Numerous other alternative embodiments within the scope of the appended claims will be readily apparent to those skilled in the art.