Claims
- 1. A method for audio compression comprising:
receiving an audio signal; applying transform coding to the audio signal to generate a sequence of transform frequency coefficients; partitioning the sequence of transform frequency coefficients into a plurality of non-uniform width frequency ranges; inserting zero value frequency coefficients at the boundaries of the non-uniform width frequency ranges; and dropping certain of the transform frequency coefficients that represent high frequencies.
- 2. The method of claim 1 further comprising separately applying a transform to each of the plurality of non-uniform width frequency ranges.
- 3. The method of claim 2 wherein application of the transform is in parallel.
- 4. The method of claim 1 further comprising varying length of transform operations applied to each of the plurality of non-uniform width frequency ranges.
- 5. The method of claim 1 wherein the number of dropped transform frequency coefficients is equal to the number of inserted zero value frequency coefficients.
- 6. The method of claim 1 further comprising:
constructing a psycho-acoustic model with the plurality of non-uniform width frequency ranges with inserted zero value frequency coefficients; and quantizing the plurality of non-uniform width frequency ranges with inserted zero value frequency coefficients.
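The zero insertion and coefficient dropping recited in claims 1 and 5 can be illustrated with a minimal sketch. The function name `stuff_zeros`, the `guard` parameter, and the choice to place the same number of zeros at both edges of every range are assumptions made for illustration; the claims do not fix how many zeros are inserted or on which side of a boundary.

```python
from typing import List, Sequence


def stuff_zeros(coeffs: Sequence[float], widths: Sequence[int],
                guard: int = 1) -> List[List[float]]:
    """Split `coeffs` into ranges of the given widths with zeros at each edge.

    Hypothetical helper: every range starts and ends with `guard` zeros.
    Coefficients displaced by the zeros shift toward higher frequencies,
    and those pushed past the last range are dropped, so the number of
    dropped coefficients equals the number of inserted zeros.
    """
    if sum(widths) != len(coeffs):
        raise ValueError("range widths must cover the whole coefficient sequence")
    if any(w <= 2 * guard for w in widths):
        raise ValueError("every range must be wider than its guard zeros")

    bands: List[List[float]] = []
    pos = 0  # index of the next unused input coefficient
    for width in widths:
        payload = width - 2 * guard  # slots left for real coefficients
        band = [0.0] * guard + list(coeffs[pos:pos + payload]) + [0.0] * guard
        bands.append(band)
        pos += payload
    # coeffs[pos:] are the high-frequency coefficients that are dropped
    return bands


if __name__ == "__main__":
    for band in stuff_zeros(list(range(1, 17)), [4, 4, 8]):
        print(band)
```

With sixteen input coefficients and range widths of 4, 4, and 8, six zeros are inserted and the six highest coefficients are dropped, so the total coefficient count is unchanged.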
- 7. A method for audio compression comprising:
generating a plurality of frequency coefficients representing an audio signal; grouping the plurality of frequency coefficients into frequency ranges of non-uniform width; determining if a sound attack occurs in any one of the non-uniform width frequency ranges; and performing transform length switching separately on each of the frequency ranges based on determining occurrence of a sound attack.
- 8. The method of claim 7 further comprising stuffing zeros at the boundaries of the non-uniform width frequency ranges and dropping certain of the plurality of frequency coefficients that represent higher end frequencies.
- 9. The method of claim 8 wherein stuffing zeros at the boundaries comprises:
inserting zeros at the boundaries of the frequency ranges; and shifting those of the plurality of frequency coefficients that are displaced by the inserted zeros into the next frequency range.
- 10. The method of claim 7 further comprising separately performing transforms on each of the plurality of non-uniform width frequency ranges based on their width.
- 11. The method of claim 10 wherein the transforms are inverse modified discrete cosine transforms.
- 12. The method of claim 7 wherein the performed long and short transforms are modified discrete cosine transforms.
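The per-range transform length switching of claims 7 through 12 can be sketched as follows. The claims do not state how a sound attack is detected, so the energy-ratio test, the names `has_attack` and `choose_lengths`, and the `blocks`, `ratio`, and `short_div` parameters are all assumptions chosen only to make the example concrete.

```python
from typing import List, Sequence


def has_attack(samples: Sequence[float], blocks: int = 4,
               ratio: float = 8.0) -> bool:
    """Hypothetical detector: flag an attack when the energy of one
    sub-block exceeds `ratio` times the energy of the sub-block before it."""
    size = len(samples) // blocks
    if size == 0:
        return False
    energies = [
        sum(x * x for x in samples[i * size:(i + 1) * size])
        for i in range(blocks)
    ]
    eps = 1e-12  # keeps pure silence from counting as an attack
    return any(cur > ratio * (prev + eps)
               for prev, cur in zip(energies, energies[1:]))


def choose_lengths(band_signals: Sequence[Sequence[float]],
                   short_div: int = 8) -> List[int]:
    """Pick a transform length separately for each frequency range:
    a short transform where an attack is detected, a long one otherwise."""
    lengths = []
    for signal in band_signals:
        long_len = len(signal)
        short_len = max(1, long_len // short_div)
        lengths.append(short_len if has_attack(signal) else long_len)
    return lengths


if __name__ == "__main__":
    steady = [0.1] * 16
    onset = [0.0] * 8 + [1.0] * 8
    print(choose_lengths([steady, onset]))
```

Because the decision is made independently for each range, an attack confined to one range shortens the transform only there and leaves the other ranges on long transforms.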
- 13. A method for audio compression comprising:
applying a transform to a plurality of audio samples to generate a sequence of transform frequency coefficients; and partitioning the sequence of transform frequency coefficients into varying width frequency subbands with zero value frequency coefficients at the boundaries of the frequency subbands.
- 14. The method of claim 13 further comprising dropping a set of one or more transform frequency coefficients in the highest frequency subband.
- 15. The method of claim 14 wherein the number of dropped transform frequency coefficients corresponds to the number of zero value frequency coefficients stuffed at the boundaries of the frequency subbands.
- 16. The method of claim 13 further comprising:
constructing a psycho-acoustic model with the varying width subbands; and quantizing the varying width subbands.
- 17. The method of claim 13 further comprising applying transforms of varying length to each of the varying width subbands.
- 18. A method for audio compression comprising:
partitioning an audio input into a plurality of non-uniform frequency subbands, each of the plurality of non-uniform frequency subbands including a set of one or more frequency coefficients; displacing those of the set of frequency coefficients at the boundary of each subband with zeros; and dropping those of the set of frequency coefficients that fall outside of the plurality of frequency subbands after the displacing.
- 19. The method of claim 18 further comprising separately applying a transform to each of the plurality of non-uniform frequency subbands.
- 20. The method of claim 19 wherein application of the transform is in parallel.
- 21. The method of claim 18 further comprising varying length of transform operations applied to each of the plurality of non-uniform frequency subbands.
- 22. The method of claim 18 wherein the number of dropped frequency coefficients is equal to the number of inserted zeros.
- 23. The method of claim 18 further comprising:
constructing a psycho-acoustic model with the plurality of non-uniform frequency subbands; and quantizing the plurality of non-uniform frequency subbands.
- 24. A method for audio compression comprising:
generating a plurality of non-uniform frequency subbands, each of the plurality of non-uniform frequency subbands including a set of one or more frequency coefficients, from an audio input signal; displacing those of the set of frequency coefficients at the boundary of each non-uniform frequency subband with zeros; separately normalizing the non-uniform frequency subbands, including the zeros; varying transform length applied to each of the plurality of non-uniform frequency subbands based on the detection of a sound attack within the plurality of non-uniform frequency subbands; and multiplexing the plurality of non-uniform frequency subbands.
- 25. The method of claim 24 wherein inverse modified discrete cosine transform is applied to the plurality of non-uniform frequency subbands after normalizing.
- 26. The method of claim 24 wherein the varied transform is modified discrete cosine transform.
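The separate normalization of subbands recited in claim 24 can be sketched as below. The claims do not specify the normalization rule, so peak-magnitude scaling, the function name `normalize_bands`, and the returned list of scale factors are assumptions for illustration only.

```python
from typing import List, Sequence, Tuple


def normalize_bands(
    bands: Sequence[Sequence[float]],
) -> Tuple[List[List[float]], List[float]]:
    """Scale each subband independently by its own peak magnitude.

    Hypothetical rule: the boundary zeros are normalized along with the
    other coefficients and stay zero. An all-zero subband keeps a scale
    of 1.0 so that no division by zero occurs.
    """
    normalized: List[List[float]] = []
    scales: List[float] = []
    for band in bands:
        peak = max((abs(x) for x in band), default=0.0)
        scale = peak if peak > 0 else 1.0
        scales.append(scale)
        normalized.append([x / scale for x in band])
    return normalized, scales


if __name__ == "__main__":
    out, scales = normalize_bands([[0.0, 2.0, -4.0, 0.0], [0.0, 0.0]])
    print(out, scales)
```

Returning the scale factors alongside the normalized subbands lets a later stage undo the scaling for each subband independently.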
- 27. A method for audio decompression comprising:
receiving a bitstream; extracting a sequence of transform frequency coefficients from the bitstream; demultiplexing the sequence of transform frequency coefficients into a plurality of frequency ranges; removing boundary transform frequency coefficients that were originally zeros from the plurality of frequency ranges; shifting the remaining transform frequency coefficients to fill in for the removed boundary transform frequency coefficients; and inserting zeros into vacancies in the higher range of the plurality of frequency ranges caused by said shifting.
- 28. The method of claim 27 further comprising applying inverse modified discrete cosine transform to the plurality of frequency ranges.
- 29. The method of claim 27 further comprising decoding and dequantizing the sequence of transform frequency coefficients.
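The zero removal, shifting, and high-range zero insertion of claim 27 can be sketched as the inverse of boundary zero stuffing. The function name `unstuff_zeros`, the `guard` parameter, and the assumption that each range carries the same number of boundary zeros at both edges are illustrative and are not taken from the claims.

```python
from typing import List, Sequence


def unstuff_zeros(bands: Sequence[Sequence[float]],
                  guard: int = 1) -> List[float]:
    """Rebuild a full-length coefficient sequence from zero-stuffed ranges.

    Hypothetical helper: the `guard` boundary positions at each edge of
    every range are removed, the remaining coefficients shift down to
    fill the gaps, and the vacancies left at the high-frequency end are
    filled with zeros.
    """
    total = sum(len(band) for band in bands)
    kept: List[float] = []
    for band in bands:
        kept.extend(band[guard:len(band) - guard])  # strip boundary zeros
    kept.extend([0.0] * (total - len(kept)))        # pad the high range
    return kept


if __name__ == "__main__":
    stuffed = [[0, 1, 2, 0], [0, 3, 4, 0], [0, 5, 6, 7, 8, 9, 10, 0]]
    print(unstuff_zeros(stuffed))
```

The coefficients that an encoder dropped from the high range cannot be recovered, which is why the restored sequence ends in zeros.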
- 30. A method for audio compression comprising:
partitioning an audio signal into a plurality of non-uniform width frequency ranges, each of the plurality of non-uniform width frequency ranges including a set of one or more transform frequency coefficients; indicating the width of each of the plurality of frequency ranges; separately processing each of the plurality of non-uniform width frequency ranges; and encoding the plurality of frequency ranges and their width indications.
- 31. The method of claim 30 further comprising separately performing transform length switching on one of the plurality of frequency ranges based on detection of a sound attack within the one of the plurality of frequency ranges.
- 32. The method of claim 30 further comprising:
stuffing zeros at the boundaries of the plurality of frequency ranges; shifting those transform frequency coefficients displaced by the stuffed zeros; and dropping those transform frequency coefficients that fall outside of the plurality of frequency ranges from said shifting.
- 33. The method of claim 30 wherein the processing comprises normalizing and transforming.
- 34. The method of claim 33 wherein the transforming is modified discrete cosine transforming.
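The encoding of frequency ranges together with their width indications, as recited in claim 30, can be sketched as a simple byte layout. The layout itself (a 16-bit range count, then one 16-bit width per range, then the quantized coefficients as signed 16-bit integers) and the names `pack_bands` and `unpack_bands` are assumptions; the claims do not define a bitstream format.

```python
import struct
from typing import List, Sequence


def pack_bands(bands: Sequence[Sequence[int]]) -> bytes:
    """Serialize quantized ranges with their widths (hypothetical layout)."""
    header = struct.pack("<H", len(bands))
    header += b"".join(struct.pack("<H", len(band)) for band in bands)
    body = b"".join(struct.pack(f"<{len(band)}h", *band) for band in bands)
    return header + body


def unpack_bands(data: bytes) -> List[List[int]]:
    """Read the width indications first, then split the coefficients."""
    (count,) = struct.unpack_from("<H", data, 0)
    widths = struct.unpack_from(f"<{count}H", data, 2)
    offset = 2 + 2 * count
    bands: List[List[int]] = []
    for width in widths:
        bands.append(list(struct.unpack_from(f"<{width}h", data, offset)))
        offset += 2 * width
    return bands


if __name__ == "__main__":
    encoded = pack_bands([[0, 5, -3, 0], [0, 7, 0]])
    print(len(encoded), unpack_bands(encoded))
```

Carrying the widths in the stream lets a decoder recover the non-uniform partition without assuming a fixed range layout.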
- 35. An apparatus comprising:
an adaptive non-uniform filterbank to represent an audio input with a number of transform frequency coefficients that is less than the audio input's number of samples; a quantization unit coupled with the adaptive non-uniform filterbank, the quantization unit to receive transform frequency coefficients from the adaptive non-uniform filterbank; and a lossless encoding unit coupled with the quantization unit, the lossless encoding unit to receive quantized transform coefficients from the quantization unit.
- 36. The apparatus of claim 35 wherein the adaptive non-uniform filterbank comprises:
a non-uniform frequency range transform function flattening filterbank to partition a sequence of transform frequency coefficients generated from the audio input into frequency ranges of non-uniform width and to flatten a transfer function of the sequence of transform frequency coefficients; an adaptive sound attack based transform length varying filterbank coupled with the non-uniform frequency range transform function flattening filterbank; a sound attack detection unit coupled with the adaptive sound attack based transform length varying filterbank; and a multiplexer coupled with the adaptive sound attack based transform length varying filterbank.
- 37. The apparatus of claim 36 wherein the non-uniform frequency range transform function flattening filterbank comprises:
a modified discrete cosine transform unit; a frequency range boundary zero stuffing unit coupled with the transform unit; and a plurality of parallel inverse modified discrete cosine transform units coupled with the frequency range boundary zero stuffing unit.
- 38. The apparatus of claim 36 wherein the adaptive sound attack based transform length varying filterbank comprises a plurality of parallel multi-length transform units.
- 39. The apparatus of claim 35 further comprising a psycho-acoustic model computing unit coupled with the adaptive non-uniform filterbank and the quantization unit.
- 40. An apparatus comprising:
a non-uniform frequency range transform function flattening filterbank to receive an audio signal, to partition the audio signal into varying frequency ranges of frequency coefficients, and to perform zero bit stuffing at the boundaries of the frequency ranges and to drop certain high frequency coefficients; a sound attack detection unit coupled with the non-uniform frequency range transform function flattening filterbank, the sound attack detection unit to locate sound attacks within the audio signal; an adaptive sound attack based transform length varying filterbank coupled with the non-uniform frequency range transform function flattening filterbank and the sound attack detection unit, the adaptive sound attack based transform length varying filterbank to perform varying length transforms on the audio signal based on sound attack detection indicated by the sound attack detection unit; a multiplexer coupled with the adaptive sound attack based transform length varying filterbank; a quantization unit coupled with the multiplexer; a psycho-acoustic model (PAM) computing unit coupled with the multiplexer; and a lossless coding unit coupled with the quantization unit and the PAM computing unit, the lossless coding unit to losslessly code transform coefficients received from the quantization unit.
- 41. The apparatus of claim 40 wherein the non-uniform frequency range transform function flattening filterbank comprises:
a modified discrete cosine transform unit; a frequency range boundary zero stuffing unit coupled with the transform unit; and a plurality of parallel inverse modified discrete cosine transform units coupled with the frequency range boundary zero stuffing unit.
- 42. The apparatus of claim 40 wherein the adaptive sound attack based transform length varying filterbank comprises a plurality of parallel multi-length transform units.
- 43. An audio decoder comprising:
a demultiplexer to receive a bitstream and to extract a sequence of transform frequency coefficients; and an inverse adaptive non-uniform filterbank coupled with the demultiplexer, the inverse adaptive non-uniform filterbank to partition a sequence of transform frequency coefficients into a plurality of non-uniform width frequency ranges, to remove certain boundary transform frequency coefficients originally based on zeros, and to insert zeros for previously removed high range transform frequency coefficients.
- 44. The audio decoder of claim 43 wherein the inverse adaptive non-uniform filterbank includes:
a plurality of parallel inverse modified discrete cosine transform units; a plurality of parallel modified discrete cosine transform units coupled with the plurality of parallel inverse modified discrete cosine transform units; a plurality of parallel de-normalization units coupled with the plurality of parallel modified discrete cosine transform units; a zero removing unit coupled with the plurality of de-normalization units; and an inverse modified discrete cosine transform unit coupled with the zero removing unit.
- 45. The audio decoder of claim 43 further comprising a decoder and dequantizer unit coupled with the demultiplexer.
- 46. A machine-readable medium having a set of instructions stored thereon, which when executed by a set of one or more processors causes the set of processors to perform the operations comprising:
receiving an audio signal; applying transform coding to the audio signal to generate a sequence of transform frequency coefficients; partitioning the sequence of transform frequency coefficients into a plurality of non-uniform width frequency ranges; inserting zero value frequency coefficients at the boundaries of the non-uniform width frequency ranges; and dropping certain of the transform frequency coefficients that represent high frequencies.
- 47. The machine-readable medium of claim 46 further comprising separately applying a transform to each of the plurality of non-uniform width frequency ranges.
- 48. The machine-readable medium of claim 47 wherein application of the transform is in parallel.
- 49. The machine-readable medium of claim 46 further comprising varying length of transform operations applied to each of the plurality of non-uniform width frequency ranges.
- 50. The machine-readable medium of claim 46 wherein the number of dropped transform frequency coefficients is equal to the number of inserted zero value frequency coefficients.
- 51. The machine-readable medium of claim 46 further comprising:
constructing a psycho-acoustic model with the plurality of non-uniform width frequency ranges with inserted zero value frequency coefficients; and quantizing the plurality of non-uniform width frequency ranges with inserted zero value frequency coefficients.
- 52. A machine-readable medium having a set of instructions stored thereon, which when executed by a set of one or more processors causes the set of processors to perform the operations comprising:
generating a plurality of frequency coefficients representing an audio signal; grouping the plurality of frequency coefficients into frequency ranges of non-uniform width; determining if a sound attack occurs in any one of the non-uniform width frequency ranges; and performing short transforms on those non-uniform frequency ranges that have a sound attack and long transforms on those non-uniform frequency ranges that do not have a sound attack.
- 53. The machine-readable medium of claim 52 further comprising stuffing zeros at the boundaries of the non-uniform width frequency ranges and dropping certain of the plurality of frequency coefficients that represent higher end frequencies.
- 54. The machine-readable medium of claim 53 wherein stuffing zeros at the boundaries comprises:
inserting zeros at the boundaries of the frequency ranges; and shifting those of the plurality of frequency coefficients that are displaced by the inserted zeros into the next frequency range.
- 55. The machine-readable medium of claim 52 further comprising separately performing transforms on each of the plurality of non-uniform width frequency ranges based on their width.
- 56. The machine-readable medium of claim 55 wherein the transforms are inverse modified discrete cosine transforms.
- 57. The machine-readable medium of claim 52 wherein the performed long and short transforms are modified discrete cosine transforms.
- 58. A machine-readable medium having a set of instructions stored thereon, which when executed by a set of one or more processors causes the set of processors to perform the operations comprising:
applying a transform to a plurality of audio samples to generate a sequence of transform frequency coefficients; and partitioning the sequence of transform frequency coefficients into varying width frequency subbands with zero value frequency coefficients at the boundaries of the frequency subbands.
- 59. The machine-readable medium of claim 58 further comprising dropping a set of one or more transform frequency coefficients in the highest frequency subband.
- 60. The machine-readable medium of claim 59 wherein the number of dropped transform frequency coefficients corresponds to the number of zero value frequency coefficients stuffed at the boundaries of the frequency subbands.
- 61. The machine-readable medium of claim 58 further comprising:
constructing a psycho-acoustic model with the varying width subbands; and quantizing the varying width subbands.
- 62. The machine-readable medium of claim 58 further comprising applying transforms of varying length to each of the varying width subbands.
- 63. A machine-readable medium having a set of instructions stored thereon, which when executed by a set of one or more processors causes the set of processors to perform the operations comprising:
partitioning an audio input into a plurality of non-uniform frequency subbands, each of the plurality of non-uniform frequency subbands including a set of one or more frequency coefficients; displacing those of the set of frequency coefficients at the boundary of each subband with zeros; and dropping those of the set of frequency coefficients that fall outside of the plurality of frequency subbands after the displacing.
- 64. The machine-readable medium of claim 63 further comprising separately applying a transform to each of the plurality of non-uniform frequency subbands.
- 65. The machine-readable medium of claim 64 wherein application of the transform is in parallel.
- 66. The machine-readable medium of claim 63 further comprising varying length of transform operations applied to each of the plurality of non-uniform frequency subbands.
- 67. The machine-readable medium of claim 63 wherein the number of dropped frequency coefficients is equal to the number of inserted zeros.
- 68. The machine-readable medium of claim 63 further comprising:
constructing a psycho-acoustic model with the plurality of non-uniform frequency subbands; and quantizing the plurality of non-uniform frequency subbands.
- 69. A machine-readable medium having a set of instructions stored thereon, which when executed by a set of one or more processors causes the set of processors to perform the operations comprising:
generating a plurality of non-uniform frequency subbands, each of the plurality of non-uniform frequency subbands including a set of one or more frequency coefficients, from an audio input signal; displacing those of the set of frequency coefficients at the boundary of each non-uniform frequency subband with zeros; separately normalizing the non-uniform frequency subbands, including the zeros; varying transform length applied to each of the plurality of non-uniform frequency subbands based on the detection of a sound attack within the plurality of non-uniform frequency subbands; and multiplexing the plurality of non-uniform frequency subbands.
- 70. The machine-readable medium of claim 69 wherein inverse modified discrete cosine transform is applied to the plurality of non-uniform frequency subbands after normalizing.
- 71. The machine-readable medium of claim 69 wherein the varied transform is modified discrete cosine transform.
- 72. A machine-readable medium having a set of instructions stored thereon, which when executed by a set of one or more processors causes the set of processors to perform the operations comprising:
receiving a bitstream; extracting a sequence of transform frequency coefficients from the bitstream; demultiplexing the sequence of transform frequency coefficients into a plurality of frequency ranges; removing boundary transform frequency coefficients that were originally zeros from the plurality of frequency ranges; shifting the remaining transform frequency coefficients to fill in for the removed boundary transform frequency coefficients; and inserting zeros into vacancies in the higher range of the plurality of frequency ranges caused by said shifting.
- 73. The machine-readable medium of claim 72 further comprising applying inverse modified discrete cosine transform to the plurality of frequency ranges.
- 74. The machine-readable medium of claim 72 further comprising decoding and dequantizing the sequence of transform frequency coefficients.
- 75. A machine-readable medium having a set of instructions stored thereon, which when executed by a set of one or more processors causes the set of processors to perform the operations comprising:
partitioning an audio signal into a plurality of non-uniform width frequency ranges, each of the plurality of non-uniform width frequency ranges including a set of one or more transform frequency coefficients; indicating the width of each of the plurality of frequency ranges; separately processing each of the plurality of non-uniform width frequency ranges; and encoding the plurality of frequency ranges and their width indications.
- 76. The machine-readable medium of claim 75 further comprising separately performing transform length switching on one of the plurality of frequency ranges based on detection of a sound attack within the one of the plurality of frequency ranges.
- 77. The machine-readable medium of claim 75 further comprising:
stuffing zeros at the boundaries of the plurality of frequency ranges; shifting those transform frequency coefficients displaced by the stuffed zeros; and dropping those transform frequency coefficients that fall outside of the plurality of frequency ranges from said shifting.
- 78. The machine-readable medium of claim 75 wherein the processing comprises normalizing and transforming.
- 79. The machine-readable medium of claim 78 wherein the transforming is modified discrete cosine transforming.
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority from U.S. Provisional Patent Application Serial No. 60/450,943, entitled “Method and Apparatus for Audio Compression,” filed Feb. 28, 2003.
Provisional Applications (1)

| Number | Date | Country |
| --- | --- | --- |
| 60450943 | Feb 2003 | US |