Claims
- 1. A method for data compression comprising the steps of:
receiving a data file having data patterns; storing received data patterns in a dictionary; assigning an index to each data pattern in the dictionary; storing the index of each data pattern in the dictionary; accumulating statistical information about each index; encoding each index using the statistical information; and clearing stored indices and stored data patterns in the dictionary when another data file is received.
- 2. The method of claim 1, wherein the step of accumulating statistical information is performed by a statistical model.
- 3. The method of claim 2, wherein each index is encoded by an encoder.
- 4. The method of claim 3, wherein if the received data pattern does not match any of the stored data patterns in the dictionary, and if the dictionary is not full, then the dictionary sends the index assigned to the received data pattern to the encoder and the statistical model.
- 5. The method of claim 3, wherein if the received data pattern does not match any of the stored data patterns in the dictionary, and if the dictionary is full, then the statistical model instructs the dictionary to replace a stored data pattern with the received data pattern, and the dictionary sends the index associated with the stored data pattern to the encoder and the statistical model.
- 6. The method of claim 3, wherein if the received data pattern matches the stored data pattern in the dictionary, then the dictionary sends the index associated with the stored data pattern to the encoder and the statistical model.
- 7. The method of claim 2, wherein the step of accumulating statistical information comprises the steps of:
receiving indices from the dictionary; recording the frequency of occurrence of each index within a set of frequency counters; and updating the dictionary.
- 8. The method of claim 7, wherein the statistical model resets the set of frequency counters when another data file is received.
- 9. The method of claim 7, wherein the set of frequency counters contains a distinct and unique counter for each distinct and unique pair of context indices and a current pattern index.
- 10. The method of claim 7, wherein the set of frequency counters contains a distinct and unique counter for each distinct and unique tuple of arbitrary context indices and a current pattern index.
- 11. The method of claim 9, wherein a context index of the current pattern index is another index received just prior to the current pattern index.
- 12. The method of claim 10, wherein context indices of the current pattern index are other indices received just prior to the current pattern index.
- 13. The method of claim 11, wherein upon receiving index n after receiving context index m, where n and m are integers, a frequency counter associated with an element {m, n} is incremented.
- 14. The method of claim 12, wherein upon receiving index n after receiving context indices m_k, m_{k-1}, ..., m_1, m_0, where n and m_j are integers, a frequency counter associated with an element {m_k, ..., m_0, n} is incremented.
- 15. The method of claim 13, wherein if the frequency counter exceeds a threshold value, then the statistical model sends index n and context index m to the dictionary.
- 16. The method of claim 14, wherein if the frequency counter exceeds a threshold value, then the statistical model sends index n and context indices m_k, m_{k-1}, ..., m_1, m_0 to the dictionary.
- 17. The method of claim 15, wherein the dictionary stores a new data pattern associated with context index m and index n, and assigns the new data pattern a new index.
- 18. The method of claim 16, wherein the dictionary stores a new data pattern associated with context indices m_k, m_{k-1}, ..., m_1, m_0 and index n, and assigns the new data pattern a new index.
- 19. The method of claim 3, wherein the encoder is an arithmetic encoder.
- 20. The method of claim 3, wherein the encoder is a Huffman encoder.
- 21. The method of claim 19, wherein the encoder receives statistical information from the statistical model and indices from the dictionary.
- 22. The method of claim 21, wherein the statistical information includes frequency of occurrence of each index.
- 23. The method of claim 22, wherein the encoder uses fewer bits to encode a first index with a higher frequency of occurrence than to encode a second index with a lower frequency of occurrence.
- 24. A system for data compression, comprising:
a data buffer for storing data; a data compressor configured to compress data from the data buffer, comprising:
a dictionary configured to determine an index for one or more patterns; a statistical model configured to measure the frequency of occurrence of the one or more patterns; and an encoder configured to use statistical information from the statistical model to encode indices received from the dictionary.
- 25. The system of claim 24, further comprising:
a data transformer configured to apply a transform function to data in the data buffer; and a quantizer configured to quantize the data in the data buffer.
- 26. The system of claim 24, wherein the dictionary includes a bounded number of indices and corresponding data locations.
- 27. The system of claim 26, wherein the dictionary is a one-dimensional array.
- 28. The system of claim 26, wherein the dictionary is tree based.
- 29. The system of claim 26, wherein the dictionary is a hash table.
- 30. The system of claim 24, wherein the statistical model is a two-dimensional array.
- 31. The system of claim 24, wherein the statistical model is a tree.
- 32. The system of claim 24, wherein the statistical model is a list.
- 33. The system of claim 24, wherein the statistical model is a hash table.
- 34. The system of claim 24, wherein the encoder is an arithmetic encoder.
- 35. The system of claim 24, wherein the encoder is a Huffman encoder.
- 36. A system for data compression, comprising:
a data compressor configured to compress data, comprising:
a dictionary configured to determine an index for one or more patterns; a statistical model configured to measure the frequency of occurrence of the one or more patterns; and an encoder configured to use statistical information from the statistical model to encode indices received from the dictionary.
- 37. The system of claim 36, wherein the dictionary includes a bounded number of indices and corresponding data locations.
- 38. The system of claim 37, wherein the dictionary is a one-dimensional array.
- 39. The system of claim 37, wherein the dictionary is tree based.
- 40. The system of claim 37, wherein the dictionary is a hash table.
- 41. The system of claim 36, wherein the statistical model is a two-dimensional array.
- 42. The system of claim 36, wherein the statistical model is a tree.
- 43. The system of claim 36, wherein the statistical model is a list.
- 44. The system of claim 36, wherein the statistical model is a hash table.
- 45. The system of claim 36, wherein the encoder is an arithmetic encoder.
- 46. The system of claim 36, wherein the encoder is a Huffman encoder.
- 47. A computer-readable medium storing instructions for causing a computer to compress data, by performing the steps of:
receiving a data file having data patterns; storing received data patterns in a dictionary; assigning an index to each data pattern in the dictionary; storing the index of each data pattern in the dictionary; accumulating statistical information about each index; encoding each index using the statistical information; and clearing stored indices and stored data patterns in the dictionary when another data file is received.
- 48. A system for data compression, comprising:
means for receiving a data file having data patterns; means for storing received data patterns in a dictionary; means for assigning an index to each data pattern in the dictionary; means for storing the index of each data pattern in the dictionary; means for accumulating statistical information about each index; means for encoding each index using the statistical information; and means for clearing stored indices and stored data patterns in the dictionary when another data file is received.
- 49. A computer-readable medium storing instructions for causing a computer to compress data, by performing the steps of:
receiving a data file having data patterns; storing received data patterns in a dictionary; assigning an index to each data pattern in the dictionary; storing the index of each data pattern in the dictionary; accumulating statistical information about each index; and encoding each index using the statistical information.
- 50. A system for data compression, comprising:
means for receiving a data file having data patterns; means for storing received data patterns in a dictionary; means for assigning an index to each data pattern in the dictionary; means for storing the index of each data pattern in the dictionary; means for accumulating statistical information about each index; and means for encoding each index using the statistical information.
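The cooperation of dictionary, statistical model, and encoder recited in the claims above can be illustrated with a minimal Python sketch. It is not the claimed implementation. The names `Dictionary`, `PairModel`, `index_stream`, and `ideal_bits`, the default capacity of 4096, and the threshold of 2 are all hypothetical. The sketch assumes byte-string patterns, greedy longest-match parsing, and that the "new data pattern" for a frequent pair {m, n} is the concatenation of the two stored patterns. The replacement policy for a full dictionary and the decoder are not modelled, and the ideal code length -log2(p) stands in for an arithmetic or Huffman encoder.

```python
import math
from collections import Counter


class Dictionary:
    """Bounded table of byte patterns and their indices (hypothetical layout)."""

    def __init__(self, capacity=4096):
        self.capacity = capacity
        self.clear()

    def clear(self):
        # Drop all stored patterns and indices, as when another file arrives.
        self.index_of = {}   # pattern -> index
        self.patterns = []   # index -> pattern
        self.max_len = 0     # length of the longest stored pattern

    def add(self, pattern):
        # Return the existing index, a newly assigned one, or None when full.
        if pattern in self.index_of:
            return self.index_of[pattern]
        if len(self.patterns) >= self.capacity:
            return None
        index = len(self.patterns)
        self.index_of[pattern] = index
        self.patterns.append(pattern)
        self.max_len = max(self.max_len, len(pattern))
        return index


class PairModel:
    """One frequency counter per (context index m, current index n)."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.reset()

    def reset(self):
        # Forget all counts and the previous index (new file).
        self.counts = Counter()
        self.prev = None

    def observe(self, n):
        # Increment the counter for {m, n}; return the pair whenever
        # its count exceeds the threshold, otherwise None.
        promote = None
        if self.prev is not None:
            key = (self.prev, n)
            self.counts[key] += 1
            if self.counts[key] > self.threshold:
                promote = key
        self.prev = n
        return promote


def index_stream(data, dictionary, model):
    """Turn one file (bytes) into a list of dictionary indices."""
    dictionary.clear()
    model.reset()
    indices = []
    pos = 0
    while pos < len(data):
        # Greedy longest match against stored patterns (an assumption).
        for length in range(min(dictionary.max_len, len(data) - pos), 0, -1):
            index = dictionary.index_of.get(data[pos:pos + length])
            if index is not None:
                break
        else:
            # No stored pattern matches: store the single byte as a new pattern.
            length = 1
            index = dictionary.add(data[pos:pos + 1])
            if index is None:
                raise RuntimeError("dictionary full; replacement is not sketched")
        indices.append(index)
        pair = model.observe(index)
        if pair is not None:
            m, n = pair
            # New pattern for the frequent pair: concatenation of both patterns.
            dictionary.add(dictionary.patterns[m] + dictionary.patterns[n])
        pos += length
    return indices


def ideal_bits(frequencies):
    """Ideal code length -log2(p) per index, a stand-in for a real entropy coder."""
    total = sum(frequencies.values())
    return {i: -math.log2(c / total) for i, c in frequencies.items()}
```

Under these assumptions, `index_stream(b"abababababab", Dictionary(), PairModel(threshold=2))` yields `[0, 1, 0, 1, 0, 1, 2, 2, 2]`: once the counter for the pair {0, 1} exceeds the threshold, the pattern `b"ab"` receives index 2 and later occurrences are emitted as that single index. `ideal_bits` then assigns fewer bits to an index with a higher frequency of occurrence than to one with a lower frequency.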
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] The present application claims the benefit of priority from U.S. Provisional Patent Application No. 60/301,926, entitled “System and Method for Data Compression Using a Hybrid Coding Scheme” filed on Jun. 29, 2001, which is incorporated by reference herein.
Provisional Applications (1)
Number | Date | Country
60301926 | Jun 2001 | US