The present invention relates generally to a system and method for communications, and, in particular, to a system and method for multi-stream compression.
The effective throughput of a network can be improved, for example, using data compression. Data compression involves encoding information using fewer bits than the original representation. Data may be compressed, transmitted, and then decompressed. Compression reduces required data storage space and transmission capacity, with a tradeoff of increased computation.
Data compression may be lossy or lossless. In lossless data compression, statistical redundancy is identified and eliminated, and no information is lost. Examples of lossless compression include Lempel-Ziv (LZ) compression, DEFLATE compression, LZ-Renau compression, Huffman coding, compression based on probabilistic models such as prediction by partial matching, grammar-based coding, and arithmetic coding. In lossy compression, marginally important information is identified and removed. Lossy data compression schemes are based on how people perceive the data in question. For example, the human eye is more sensitive to subtle variations in luminance than to variations in color. Examples of lossy compression include JPEG compression, MPEG compression, and MP3 compression. Different coding methods are more efficient in compressing different data types. For example, JPEG compression is best used to compress images, MPEG compression is best used to compress video, MP3 compression is best used to compress audio, a lossless compression scheme is best used to compress a text file, and no compression is best for an already compressed file.
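By way of illustration only, the following Python sketch (not part of any claimed embodiment) shows a lossless round trip using the DEFLATE algorithm available in the standard zlib module; the decompressed output is bit-identical to the input, whereas a lossy scheme would not guarantee this.

```python
import zlib

# Lossless (DEFLATE) round trip: the original bytes are recovered exactly.
original = b"text compresses well because it contains statistical redundancy " * 32
compressed = zlib.compress(original, level=9)
restored = zlib.decompress(compressed)

assert restored == original                  # no information is lost
print(len(original), "->", len(compressed))  # fewer bits than the original representation
```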
In accordance with an embodiment, a method for decompressing data includes receiving, by a network element, a first plurality of packets. Also, the method includes receiving, by the network element, a second plurality of packets. Additionally, the method includes decompressing the first plurality of packets by a first decompressor using a first compression scheme and decompressing the second plurality of packets by a second decompressor using a second compression scheme.
In accordance with another embodiment, a method for compressing data includes determining that a first packet is in a first plurality of packets and determining that a second packet is in a second plurality of packets. Also, the method includes compressing, by a network element, the first plurality of packets by a first compressor using a first compression scheme, and compressing, by the network element, the second plurality of packets by a second compressor using a second compression scheme. Additionally, the method includes transmitting the first plurality of packets and transmitting the second plurality of packets.
In accordance with yet another embodiment, a network element for decompressing data includes a receiver configured to receive a first plurality of packets, route the first plurality of packets to a first decompressor, receive a second plurality of packets, and route the second plurality of packets to a second decompressor. Also, the network element includes the first decompressor, configured to decompress the first plurality of packets using a first decompression scheme, and the second decompressor, configured to decompress the second plurality of packets using a second decompression scheme.
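By way of a non-limiting illustration, the following Python sketch shows a receiver that routes two pluralities of packets to two decompressors using different decompression schemes; the one-byte stream tag and the particular schemes (DEFLATE and bzip2) are assumptions made for this example only.

```python
import bz2
import zlib

# Hypothetical network element: routes each packet to the decompressor for its stream.
# Here the stream is identified by a one-byte tag prepended to each packet payload.
DECOMPRESSORS = {
    0x01: zlib.decompress,   # first decompression scheme (DEFLATE)
    0x02: bz2.decompress,    # second decompression scheme (bzip2)
}

def receive(packets):
    """Route each tagged packet to its decompressor and return the decompressed payloads."""
    outputs = []
    for packet in packets:
        tag, payload = packet[0], packet[1:]
        outputs.append(DECOMPRESSORS[tag](payload))
    return outputs

# Usage: a first and a second plurality of packets, compressed with different schemes.
first = [bytes([0x01]) + zlib.compress(b"stream one data")]
second = [bytes([0x02]) + bz2.compress(b"stream two data")]
print(receive(first + second))
```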
The foregoing has outlined rather broadly the features of an embodiment of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of embodiments of the invention will be described hereinafter, which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiments disclosed may be readily utilized as a basis for modifying or designing other structures or processes for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.
For a more complete understanding of the present invention, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawing, in which:
Corresponding numerals and symbols in the different figures generally refer to corresponding parts unless otherwise indicated. The figures are drawn to clearly illustrate the relevant aspects of the embodiments and are not necessarily drawn to scale.
It should be understood at the outset that although an illustrative implementation of one or more embodiments is provided below, the disclosed systems and/or methods may be implemented using any number of techniques, whether currently known or in existence. The disclosure should in no way be limited to the illustrative implementations, drawings, and techniques illustrated below, including the exemplary designs and implementations illustrated and described herein, but may be modified within the scope of the appended claims along with their full scope of equivalents.
In an example, multiple data streams are compressed separately, where each data stream is compressed, transmitted, and decompressed. Alternatively, multiple streams may be merged into one transmission stream and compressed together prior to transmission. The multiple streams are then decompressed together after transmission.
Adaptive tunneling is a method of compressing, transmitting, and decompressing multiple streams of data. In adaptive tunneling, data from multiple data streams is directed to multiple tunnels, and a different compression scheme is used for each tunnel, where the tunnels are first in first out (FIFO) tunnels.
Next, the data is compressed by compressor 130 and compressor 132. Two compressors are illustrated for clarity, but a greater number of compressors may be used. There may be more compressors than data streams, the same number of compressors as data streams, or fewer compressors than data streams. Finally, in step 134, data packets compressed by the first compressor are transmitted in a first tunnel, and in step 136, data packets compressed by the second compressor are transmitted in a second tunnel. Additionally, there may be a tunnel for uncompressed data, which, for example, carries data that was previously compressed. In an example, the mapping of the tunnels is semi-static and, by default, based on an IP address or port number. The tunnels may be layer 2 tunneling protocol (L2TP) over IP, generic routing encapsulation (GRE) over IP, or multiprotocol label switching (MPLS) tunnels. In tunneling, packets are encapsulated into some form of lower layer payload and transmitted to another node, where they are converted back to their native form and forwarded. A tunnel may use a reliable protocol, such as TCP, to ensure that the data arrives without loss or a change in ordering. Alternatively, an unreliable protocol, such as UDP, may be used, which has less delay, but the compression layer should then account for possible losses. The compression streams are continuous and non-branching in adaptive tunneling.
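The following Python sketch illustrates, under assumed details, the semi-static mapping described above: packets are assigned to tunnels based on a port number, each tunnel applies its own compressor, and one tunnel carries previously compressed data without further compression. The port-to-tunnel table and the particular compressors are hypothetical.

```python
import bz2
import zlib

# Hypothetical semi-static mapping from destination port to tunnel.
PORT_TO_TUNNEL = {80: 1, 443: 3, 25: 2}       # e.g., port 443 traffic is already compressed/encrypted

TUNNEL_COMPRESSORS = {
    1: lambda data: zlib.compress(data),       # first compression scheme
    2: lambda data: bz2.compress(data),        # second compression scheme
    3: lambda data: data,                      # tunnel for uncompressed (already compressed) data
}

def send(packets):
    """Assign each (port, payload) packet to a FIFO tunnel and compress it for that tunnel."""
    tunnels = {1: [], 2: [], 3: []}
    for port, payload in packets:
        tunnel = PORT_TO_TUNNEL.get(port, 1)   # default tunnel for unmapped ports
        tunnels[tunnel].append(TUNNEL_COMPRESSORS[tunnel](payload))
    return tunnels                             # each list is transmitted on its own tunnel

print(send([(80, b"web page " * 50), (443, b"opaque payload"), (25, b"mail body " * 50)]))
```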
Another way to compress multiple data streams together is to use a with respect to (wrt) field. A header is created, which contains a wrt field that explicitly informs the decompressor where this packet falls in the compressed data stream. The wrt field indicates what the packet is compressed with respect to. A protocol layer is generated for separating streams. The wrt field may be part of a compression layer. Streams may branch, and new packet streams may be created on the fly.
It should be noted that a compressor/decompressor may include several compression stages and that the wrt field may be applied independently or jointly to the different stages. For example, DEFLATE consists of two stages of compression, one adding backward references and the other applying statistical compression. These two stages can be treated separately, with possibly different wrt fields.
In one embodiment, the wrt field indicates one of several transmission streams, which may include upload streams, download streams, and TCP/IP streams. When a new packet goes through the compressor, a new compression stream is created, consisting of the previous compression stream plus the new packet. The reference in the wrt field may be a hash value, for example a hash of the compression stream; a random number; a label and a state, such as download stream state x; or a label and a distance, for example 200 packets from the beginning of the compression stream or from the last acknowledgement. In an example, the wrt field is highly predictable. For instance, a compression stream may reference the same transmission stream that this transmission flow previously referenced. Such a default may be predetermined. Alternatively, the default may be configured with minimal signaling to indicate a confirmation. Using a default can thus reduce the size of the wrt field. Standard header compression schemes, like those used for TCP/IP header compression, can be used.
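As a non-limiting sketch of the references described above, the following Python fragment derives a wrt reference either as a truncated hash of the compression stream or as a label plus a distance; the hash function, truncation, and field width are assumptions made for illustration.

```python
import hashlib

def wrt_from_hash(compression_stream: bytes, bits: int = 16) -> int:
    """Derive a short wrt reference as a truncated hash of the compression stream."""
    digest = hashlib.sha256(compression_stream).digest()
    return int.from_bytes(digest[:4], "big") % (1 << bits)

def wrt_from_label_and_distance(label: int, distance: int) -> tuple:
    """Alternative reference: a stream label plus a distance (e.g., packets since the last acknowledgement)."""
    return (label, distance)

# A new packet extends the compression stream; the next wrt field may reference the new stream state.
stream = b"previous compression stream"
packet = b"new packet"
print(wrt_from_hash(stream + packet))
print(wrt_from_label_and_distance(label=2, distance=200))
```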
In another embodiment, the wrt field refers to a hash of a dictionary. As the dictionary is updated, an agreed-upon hash of that dictionary is stored, which can be referred to later. In an example, the dictionary is Huffman coded. A header field may indicate when dictionaries will be stored or overwritten. In an embodiment, a wrt field does not affect references occurring before a particular point (i.e., the beginning of a packet or flow). In another embodiment, the wrt field may be included in the middle of a block if the data types change.
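The following Python sketch illustrates one possible realization of a wrt reference to a hash of a dictionary, using the preset-dictionary support in the standard zlib module; the hash algorithm and its truncation are assumptions.

```python
import hashlib
import zlib

# Compressor and decompressor maintain the same dictionaries, keyed by an agreed-upon hash.
dictionary = b"common headers and phrases seen in this traffic"
dict_hash = hashlib.sha256(dictionary).hexdigest()[:16]   # wrt reference to this dictionary
stored_dictionaries = {dict_hash: dictionary}

# Compress against the preset dictionary and carry the hash in the wrt field.
comp = zlib.compressobj(zdict=dictionary)
payload = comp.compress(b"common headers and phrases appear again here") + comp.flush()

# The decompressor looks up the dictionary by the wrt hash before decompressing.
decomp = zlib.decompressobj(zdict=stored_dictionaries[dict_hash])
print(decomp.decompress(payload))
```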
In an example, packet losses occur in data transmission, but the compressor is aware of the losses with a delay X. The wrt field refers to a stream of packets X packets back. To implement this, the location in the stream of a few values may be stored, along with the stream itself. In another example, no packet losses occur, but the transmission stream contains unlike elements. Thus, the wrt field organizes packets into like elements. The wrt field may build up combinations of like packets.
In one embodiment, hashing and storing are performed using a small chunk size. Only a fraction of the chunks may be stored. The similarity between different streams can be detected. Then, the wrt field is set to reference the chunk with the most similarity, considering the possibility of errors.
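A minimal Python sketch of this chunk-based approach follows, assuming a 32-byte chunk size and a simple rule that retains every fourth chunk hash; the wrt field would then reference the stored stream with the greatest overlap.

```python
import hashlib

CHUNK_SIZE = 32   # small chunk size

def chunk_hashes(data: bytes, keep_every: int = 4) -> set:
    """Hash fixed-size chunks and keep only a fraction of them (every keep_every-th chunk)."""
    hashes = set()
    for i in range(0, len(data) - CHUNK_SIZE + 1, CHUNK_SIZE):
        if (i // CHUNK_SIZE) % keep_every == 0:
            hashes.add(hashlib.sha1(data[i:i + CHUNK_SIZE]).digest()[:8])
    return hashes

def most_similar_stream(new_data: bytes, stored_streams: dict) -> str:
    """Pick the stored stream whose retained chunk hashes overlap the new data the most."""
    new_hashes = chunk_hashes(new_data)
    return max(stored_streams, key=lambda name: len(new_hashes & stored_streams[name]))

streams = {"download": chunk_hashes(b"abcdefgh" * 64), "upload": chunk_hashes(b"zyxwvuts" * 64)}
print(most_similar_stream(b"abcdefgh" * 16, streams))   # the wrt field would reference this stream
```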
A preferred size of the wrt field is a function of how much gain can be achieved and what the wrt field does. In an example, the size is dynamic, depending on the feature used. For example, the actual number of bits used for each wrt field may be decided based on expected parameters, or it may be dynamically determined. Alternatively, the size of the wrt field is fixed. If there is no compression, there may not be a sub-field. In another example, the wrt field indicates a lack of compression. When there is a new packet flow, a sub-field may indicate the compression type, for example gzip or 7zip. For new packet flows, the type of compression applied to the packet flow is decided. The number of options or the meaning of the data can be set semi-statically or hardwired. Also, for a new packet flow, a sub-field may include the packet flow name, which may be generated implicitly by some agreed-upon labeling method (e.g., a hash of the packet or a simple counter).
There may be a continuation of the previous packet flow. X bits may indicate the packet flow options. For example, a name may be global, or it may be dependent on variables within the packet itself, for example combined with an IP address. A sub-field may indicate a fixed stream, with X bits explicitly indicating a packet flow name, and optionally with Y bits representing the location within the stream, which may be a hash value, a relative value, or an absolute value. Also, a sub-field may indicate that a new flow is generated. This sub-field indicates whether the new packet stream is referenceable, for example whether the hash of the old stream and the present packet are valid hashes. In another example, the sub-field is the hash of the packet flow, where X bits represent the hash of the previously used hash. Alternatively, the sub-field could be a fixed predefined packet stream, which enables dictionaries or compression schemes to be applied to the traffic.
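By way of illustration, the following Python sketch packs such sub-fields into a single wrt field; all field widths and value assignments are assumptions chosen for this example only.

```python
# Hypothetical wrt field layout (all widths are assumptions for illustration):
#   2 bits:  flow action (0 = new flow, 1 = continue flow, 2 = no compression)
#   3 bits:  compression type for new flows (e.g., 0 = gzip-style, 1 = 7zip-style)
#   8 bits:  packet flow name (e.g., truncated hash or simple counter)
#   16 bits: location within the stream (hash, relative, or absolute value)

def pack_wrt(action: int, comp_type: int, flow_name: int, location: int) -> int:
    return (action << 27) | (comp_type << 24) | (flow_name << 16) | location

def unpack_wrt(field: int) -> tuple:
    return (field >> 27) & 0x3, (field >> 24) & 0x7, (field >> 16) & 0xFF, field & 0xFFFF

wrt = pack_wrt(action=1, comp_type=0, flow_name=0x2A, location=200)   # 200 packets from last ack
print(unpack_wrt(wrt))   # (1, 0, 42, 200)
```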
The wrt field may indicate a state of a flow or a cyclic redundancy check (CRC). A sub-field may be responsible for verifying the state of the packet stream. For example, it may indicate that there are no missing or reordered packets. This may be the CRC of the uncompressed data, the XOR of all the remaining bits of the hash values, or a combination of both. Lower layer fields, such as port numbers, can be reused for a wrt field.
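A non-limiting Python sketch of such a verification sub-field follows, computing the CRC of the uncompressed data and optionally combining it with an XOR of bits taken from per-packet hash values; using the first four bytes of each hash is an assumption standing in for the "remaining bits" described above.

```python
import hashlib
import zlib
from functools import reduce

def stream_check_value(uncompressed_packets: list, use_hash_xor: bool = True) -> int:
    """Verification sub-field: CRC of the uncompressed data, optionally XORed with per-packet hash bits."""
    crc = zlib.crc32(b"".join(uncompressed_packets))
    if not use_hash_xor:
        return crc
    hash_words = [int.from_bytes(hashlib.sha256(p).digest()[:4], "big") for p in uncompressed_packets]
    return crc ^ reduce(lambda a, b: a ^ b, hash_words, 0)

packets = [b"first packet", b"second packet"]
print(hex(stream_check_value(packets)))
# A missing or reordered packet at the decompressor yields a different check value.
print(hex(stream_check_value(list(reversed(packets)))))
```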
In an example, there are multiple wrt fields, which subsequent metadata distinguishes between. For example, the wrt field may describe two compression streams, and the metadata includes an additional bit to distinguish between the two streams. Alternatively, there could be a switch wrt field message that indicates a change in the wrt field within the data packet. The switch may be differential, for example it could refer back to the wrt field used previously. Alternatively, the switch may be absolute.
In an example, at a regular interval, for example every millisecond, the decompressor acknowledges the packets that it has received and assigns them an order. These received packets may be decompressed in several orders. Once an order is chosen, the decompressor acknowledges, to the compressor, that the packets were received. The decompressor also indicates, using a decompression state, the order in which the packets were processed. The new packets, combined with the previous state, result in a unique state value. The state value may be a hash of the packet IDs and the previous state hash, or a function of an acknowledgment packet, such as an index number. The compressor uses the decompressor state in encoding additional packets. Also, the compressor prepends the state value to packets to indicate that the decompressor should use this state for decompression of these packets. If any packets arrive outside the acknowledgements, they are still processed, but they are not used in compression.
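The following Python sketch illustrates, under assumed details, how the decompression state may be formed as a hash of the acknowledged packet IDs combined with the previous state hash, and how the compressor may prepend that state value to subsequent packets; the state length and packet ID encoding are assumptions.

```python
import hashlib

def next_state(previous_state: bytes, acked_packet_ids: list) -> bytes:
    """Combine the previous decompression state with the IDs of newly acknowledged packets."""
    h = hashlib.sha256(previous_state)
    for packet_id in acked_packet_ids:      # in the order the decompressor processed them
        h.update(packet_id.to_bytes(4, "big"))
    return h.digest()[:8]                   # short state value; the length is an assumption

# Decompressor acknowledges packets 7, 8, and 10 (in that order) against the previous state.
state = next_state(previous_state=b"\x00" * 8, acked_packet_ids=[7, 8, 10])

# Compressor prepends the state so the decompressor knows which state to decompress against.
compressed_payload = b"payload encoded relative to the acknowledged state"
packet = state + compressed_payload
print(packet[:8].hex())
```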
Compression may be performed at the object level. The compression streams may branch and merge, but at the object level rather than at the packet level. When a compression stream branches, inputs of an initial state and a label inform the decompressor that a new compression stream branch is to be generated based on the state, which will be referred to by the label. If available, the size of the object (i.e., this stream) can also be sent. If the label was previously used, it will be overwritten. The acknowledgment of the received packets will then contain the new label. When compression streams merge, inputs of a first state and a second state inform the decompressor to create a new state, which is a combination of the first state and the second state. The first state and the second state must have previously branched. The response is the new state, or an error message. This new state could be as simple as the first stream followed by the second stream.
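By way of illustration only, the following Python sketch models object-level branching and merging of compression stream states as labeled byte strings; representing a state as the raw stream bytes is a simplification made for this example.

```python
# Hypothetical object-level branch/merge of compression stream states.
# A "state" here is simply the bytes of the stream seen so far; labels identify branches.
states = {"root": b"shared prefix of all objects "}

def branch(source_label: str, new_label: str, size_hint: int = None) -> str:
    """Create a new branch from an existing state; an existing label is overwritten.

    size_hint is the optional object size, informational only in this sketch."""
    states[new_label] = states[source_label]
    return new_label                          # acknowledgements then carry the new label

def merge(first_label: str, second_label: str, new_label: str) -> str:
    """Merge two previously branched states: the new state is the first followed by the second."""
    if first_label not in states or second_label not in states:
        return "error: unknown state"         # the states must have previously branched
    states[new_label] = states[first_label] + states[second_label]
    return new_label

branch("root", "object-a")
branch("root", "object-b")
states["object-a"] += b"object A data"
states["object-b"] += b"object B data"
print(merge("object-a", "object-b", "merged"))
```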
Two-stage redundancy elimination (RE) uses compression to improve performance.
In one example, the order in which chunks arrive is taken into account when signaling their transmission. If an order has occurred previously, only a first portion of the bits, not all the bits, is transmitted. In an example, two or three bits are transmitted. In another example, metadata is compressed using standard software. For example, a back reference may indicate to go back 1000 bits and use the next 80 bits. In an example, TCP is used. A protocol that detects and resolves missed back references may be used. Back references may either refer to bits in the same packet or to bits from some agreed-upon point, such as the last packet for which an acknowledgment was received under TCP, or with an acknowledgment from UDP without retransmission. Alternatively, a dictionary approach is used. The dictionary can be adaptive or static, and may use a hash of the dictionary. The compressor can choose which dictionary a reference comes from. Compression ratios greater than 10:1 may be found for a chunk size of 32 bytes. Alternatively, LZP may be used.
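The following Python sketch illustrates resolving back references of the form "go back N, use the next M" against an agreed-upon point in the stream; for simplicity it operates on bytes rather than bits, and overlapping copies or references beyond the available history are not handled.

```python
# Minimal sketch of resolving back references expressed as (distance, length),
# measured here in bytes for simplicity (the text above expresses them in bits).
# References may point within the same packet or back to an agreed-upon point in the stream.

def expand(tokens, history: bytearray) -> bytes:
    """tokens are either literal bytes objects or (distance, length) back references."""
    out = bytearray()
    for token in tokens:
        if isinstance(token, tuple):
            distance, length = token
            start = len(history) + len(out) - distance
            out += (history + out)[start:start + length]
        else:
            out += token
    return bytes(out)

history = bytearray(b"the quick brown fox jumps over the lazy dog. ")
tokens = [b"again: ", (52, 45)]   # go back 52 bytes, copy the next 45 bytes
print(expand(tokens, history))
```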
The bus may be one or more of any type of several bus architectures including a memory bus or memory controller, a peripheral bus, a video bus, or the like. CPU 274 may comprise any type of electronic data processor. Memory 276 may comprise any type of system memory such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous DRAM (SDRAM), read-only memory (ROM), a combination thereof, or the like. In an embodiment, the memory may include ROM for use at boot-up, and DRAM for program and data storage for use while executing programs.
Mass storage device 278 may comprise any type of storage device configured to store data, programs, and other information and to make the data, programs, and other information accessible via the bus. Mass storage device 278 may comprise, for example, one or more of a solid state drive, a hard disk drive, a magnetic disk drive, an optical disk drive, or the like.
Video adapter 280 and I/O interface 288 provide interfaces to couple external input and output devices to the processing unit. As illustrated, examples of input and output devices include the display coupled to the video adapter and the mouse/keyboard/printer coupled to the I/O interface. Other devices may be coupled to the processing unit, and additional or fewer interface cards may be utilized. For example, a serial interface card (not pictured) may be used to provide a serial interface for a printer.
The processing unit also includes one or more network interfaces 284, which may comprise wired links, such as an Ethernet cable or the like, and/or wireless links to access nodes or different networks. Network interface 284 allows the processing unit to communicate with remote units via the networks. For example, the network interface may provide wireless communication via one or more transmitters/transmit antennas and one or more receivers/receive antennas. In an embodiment, the processing unit is coupled to a local-area network or a wide-area network for data processing and communications with remote devices, such as other processing units, the Internet, remote storage facilities, or the like.
Advantages of an embodiment include good compression rates without added loss. Another advantage of an embodiment is good results with small packet or stream sizes or with a lossy transmission medium. Other advantages of an embodiment include a method that is simple and expandable, that scales easily to large sliding windows, and that reuses well tested and designed algorithms to find matches on larger window sizes. An advantage of an embodiment is good performance at small chunk sizes. In an embodiment, errors in packet ordering may be detected. Also, in an embodiment, the impact of packet reordering or loss on compression and decompression is minimized. In another example, the overall performance is improved by allowing a single compression scheme to access the data from more than one stream. Advantages of another embodiment include improved performance by moving like data closer together. Also, decoding errors may be detected by an embodiment. In an example, different streams can be efficiently mixed. In another embodiment, the total number of independent streams is decreased, leading to a reduction in the total memory requirement.
While several embodiments have been provided in the present disclosure, it should be understood that the disclosed systems and methods may be embodied in many other specific forms without departing from the spirit or scope of the present disclosure. The present examples are to be considered as illustrative and not restrictive, and the intention is not to be limited to the details given herein. For example, the various elements or components may be combined or integrated in another system or certain features may be omitted, or not implemented.
In addition, techniques, systems, subsystems, and methods described and illustrated in the various embodiments as discrete or separate may be combined or integrated with other systems, modules, techniques, or methods without departing from the scope of the present disclosure. Other items shown or discussed as coupled or directly coupled or communicating with each other may be indirectly coupled or communicating through some interface, device, or intermediate component whether electrically, mechanically, or otherwise. Other examples of changes, substitutions, and alterations are ascertainable by one skilled in the art and could be made without departing from the spirit and scope disclosed herein.