The present disclosure relates generally to speech and audio processing and, more particularly, to a decoder for processing an audio signal including generic audio and speech frames.
Many audio signals may be classified as having more speech-like characteristics or more generic audio characteristics typical of music, tones, background noise, reverberant speech, etc. Codecs based on source-filter models that are suitable for processing speech signals do not process generic audio signals as effectively. Such codecs include Linear Predictive Coding (LPC) codecs like Code Excited Linear Prediction (CELP) coders. Speech coders tend to process speech signals well at low bit rates. Conversely, generic audio processing systems such as frequency domain transform codecs do not process speech signals very well. It is well known to provide a classifier or discriminator to determine, on a frame-by-frame basis, whether an audio signal is more or less speech-like and to direct the signal to either a speech codec or a generic audio codec based on the classification. An audio signal processor capable of processing different signal types is sometimes referred to as a hybrid core codec.
However, transitioning between the processing of speech frames and generic audio frames using speech and generic audio codecs, respectively, is known to produce discontinuities in the form of audio gaps in the processed output signal. Such audio gaps are often perceptible at a user interface and are generally undesirable.
U.S. Publication No. 2006/0173675 entitled “Switching Between Coding Schemes” (Nokia) discloses a hybrid coder that accommodates both speech and music by selecting, on a frame-by-frame basis, between an adaptive multi-rate wideband (AMR-WB) codec and a codec utilizing a modified discrete cosine transform (MDCT), for example, an MPEG 3 codec or an Advanced Audio Coding (AAC) codec, whichever is most appropriate. Nokia ameliorates the adverse effect of discontinuities that occur as a result of un-canceled aliasing error arising when switching from the AMR-WB codec to the MDCT based codec by using a special MDCT analysis/synthesis window with a near perfect reconstruction property, which is characterized by minimization of aliasing error. The special MDCT analysis/synthesis window disclosed by Nokia comprises three constituent overlapping sinusoidal based windows, H0(n), H1(n) and H2(n), that are applied to the first input music frame following a speech frame to provide an improved processed music frame. This method, however, may be subject to signal discontinuities that may arise from under-modeling of the associated spectral regions defined by H0(n), H1(n) and H2(n). That is, the limited number of bits that may be available needs to be distributed across the three regions, while still being required to produce a nearly perfect waveform match between the end of the previous speech frame and the beginning of region H0(n).
The various aspects, features and advantages of the invention will become more fully apparent to those having ordinary skill in the art upon careful consideration of the following Detailed Description thereof with the accompanying drawings described below. The drawings may have been simplified for clarity and are not necessarily drawn to scale.
In order to ensure proper alias cancellation, the following properties must be exhibited by the complementary windows within the M sample overlap-add region:
$$w_{m-1}^2(M+n) + w_m^2(n) = 1, \quad 0 \le n < M, \text{ and} \tag{1}$$
$$w_{m-1}(M+n)\,w_{m-1}(2M-n-1) - w_m(n)\,w_m(M-n-1) = 0, \quad 0 \le n < M, \tag{2}$$
where m is the current frame index, n is the sample index within the current frame, $w_m(n)$ is the corresponding analysis and synthesis window at frame m, and M is the associated frame length. A common window shape which satisfies the above criteria is given as:
However, it is well known that many window shapes may satisfy these conditions. For example, in the present disclosure, the algorithmic delay of the generic audio coding overlap-add process is reduced by zero-padding the 2M frame structure as follows:
This reduces algorithmic delay by allowing processing to begin after acquisition of only 3M/2 samples, or 480 samples for a frame length of M=320. Note that while w(n) is defined for 2M samples (which is required for processing an MDCT structure having 50% overlap-add), only 480 samples are needed for processing.
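Conditions (1) and (2) can be checked numerically for a candidate window. The sketch below is illustrative only: it assumes the standard MDCT sine window as the "common window shape", and it assumes one plausible zero-padded shape (M/4 zeros, an M/2-sample sine rise, M/2 ones, an M/2-sample cosine fall, M/4 zeros). Neither is taken from the equations of this disclosure, and the function names are hypothetical.

```python
import math

def sine_window(M):
    """Assumed standard sine window of length 2M: sin((n + 1/2) * pi / (2M))."""
    return [math.sin((n + 0.5) * math.pi / (2 * M)) for n in range(2 * M)]

def zero_padded_window(M):
    """Assumed low-delay shape (M divisible by 4): M/4 zeros, M/2-sample sine
    rise, M/2 ones, M/2-sample cosine fall, M/4 zeros."""
    q = M // 4
    w = []
    for n in range(2 * M):
        if n < q or n >= 7 * q:
            w.append(0.0)
        elif n < 3 * q:
            w.append(math.sin((n - q + 0.5) * math.pi / M))
        elif n < 5 * q:
            w.append(1.0)
        else:
            w.append(math.cos((n - 5 * q + 0.5) * math.pi / M))
    return w

def max_violation(w_prev, w_cur, M):
    """Largest absolute deviation from conditions (1) and (2) over 0 <= n < M."""
    worst = 0.0
    for n in range(M):
        c1 = w_prev[M + n] ** 2 + w_cur[n] ** 2 - 1.0
        c2 = (w_prev[M + n] * w_prev[2 * M - n - 1]
              - w_cur[n] * w_cur[M - n - 1])
        worst = max(worst, abs(c1), abs(c2))
    return worst

M = 320
w = zero_padded_window(M)
print(max_violation(sine_window(M), sine_window(M), M))  # close to zero
print(max_violation(w, w, M))                            # close to zero
print(sum(1 for x in w if x != 0.0))                     # non-zero span of the padded window
```

Under these assumptions the non-zero span of the padded window is 3M/2 samples, which matches the 480-sample figure quoted for M=320.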
Returning to Equations (1) and (2) above, if the previous frame (m−1) were a speech frame and the current frame (m) were a generic audio frame, then there would be no overlap-add data and essentially the window from frame (m−1) would be zero, or $w_{m-1}(M+n) = 0$, $0 \le n < M$. Equations (1) and (2) would therefore become:
$$w_m^2(n) = 1, \quad 0 \le n < M, \text{ and} \tag{5}$$
$$w_m(n)\,w_m(M-n-1) = 0, \quad 0 \le n < M. \tag{6}$$
From these revised equations it is apparent that the window function in Equations (3) and (4) does not satisfy these constraints, and in fact the only possible solution for Equations (5) and (6) holds on the interval $M/2 \le n < M$, as:
$$w_m(n) = 1, \quad M/2 \le n < M, \text{ and} \tag{7}$$
$$w_m(n) = 0, \quad 0 \le n < M/2. \tag{8}$$
So, in order to ensure proper alias cancellation, the speech-to-audio frame transition window in the present disclosure follows Equations (7) and (8) over the first half of the window: it is zero for $0 \le n < M/2$ and unity for $M/2 \le n < M$.
In one embodiment, the parameters include a first weighting parameter and a first index for a weighted segment of the first frame, e.g., the speech frame, of coded audio samples, and a second weighting parameter and a second index for a weighted segment of the portion of the second frame, e.g., the generic audio frame, of coded audio samples. The parameters may be constant values or functions. In one implementation, the first index specifies a first time offset from a reference audio gap sample in the sequence of input frames to a corresponding sample in the segment of the first frame of coded audio samples (e.g., the coded speech frame), and the second index specifies a second time offset from the reference audio gap sample to a corresponding sample in the segment of the portion of the second frame of coded audio samples (e.g., the coded generic audio frame). The first weighting parameter comprises a first gain factor that is applied to the corresponding samples in the indexed segment of the first frame. Similarly, the second weighting parameter comprises a second gain factor that is applied to the corresponding samples in the indexed segment of the portion of the second frame.
The parameters are generally selected to reduce distortion between the audio gap filler samples that are generated using the parameters and a set of samples, sg(n), in the sequence of frames corresponding to the audio gap, wherein the set of samples are referred to as a set of reference audio gap samples. Thus generally the parameters may be based on a distortion metric that is a function of a set of reference audio gap samples in the sequence of input frames. In one embodiment, the distortion metric is a squared error distortion metric. In another embodiment, the distortion metric is a weighted mean squared error distortion metric.
In one particular implementation, the first index is determined based on a correlation between a segment of the first frame of coded audio samples and a segment of reference audio gap samples in the sequence of frames. The second index is also determined based on a correlation between a segment of the portion of the second frame of coded audio samples and the segment of reference audio gap samples.
The details for determining the parameters associated with the audio gap filler samples are discussed below. Let $s_g$ be an input vector of length L=80 representing a gap region. The gap region is coded by generating an estimate $\hat{s}_g$ from the speech frame output $\hat{s}_s$ of the previous frame (m−1) and the portion of the generic audio frame output $\hat{s}_a$ of the current frame (m). Let $\hat{s}_s(-T)$ be a vector of length L starting from the T-th past sample of $\hat{s}_s$ and $\hat{s}_a(T)$ be a vector of length L starting from the T-th future sample of $\hat{s}_a$. The estimate $\hat{s}_g$ is then formed as:
$$\hat{s}_g = \alpha \cdot \hat{s}_s(-T_1) + \beta \cdot \hat{s}_a(T_2), \tag{10}$$
where $T_1$, $T_2$, α, and β are obtained to minimize a distortion between $s_g$ and $\hat{s}_g$. $T_1$ and $T_2$ are integer valued, where $160 \le T_1 \le 260$ and $0 \le T_2 \le 80$. Thus the total number of combinations of $T_1$ and $T_2$ is 101×81 = 8181 < 8192, and hence they can be jointly coded using 13 bits. A 6-bit scalar quantizer is used for coding each of the parameters α and β. The gap is thus coded using 13 + 6 + 6 = 25 bits.
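The 13-bit joint coding of the two offsets can be illustrated with a simple mixed-radix index. This is only a sketch: the text states that the 8181 combinations fit in 13 bits, but the particular index layout below, and the names `pack_lags` and `unpack_lags`, are assumptions made for illustration.

```python
T1_MIN, T1_MAX = 160, 260      # 101 candidate values for T1
T2_MIN, T2_MAX = 0, 80         # 81 candidate values for T2
N_T2 = T2_MAX - T2_MIN + 1

def pack_lags(t1, t2):
    """Map (T1, T2) to a single index in [0, 8180] (assumed layout)."""
    if not (T1_MIN <= t1 <= T1_MAX and T2_MIN <= t2 <= T2_MAX):
        raise ValueError("lag out of range")
    return (t1 - T1_MIN) * N_T2 + (t2 - T2_MIN)

def unpack_lags(index):
    """Inverse of pack_lags."""
    q, r = divmod(index, N_T2)
    return q + T1_MIN, r + T2_MIN

largest = pack_lags(T1_MAX, T2_MAX)
print(largest, largest.bit_length())   # the largest index and the bits it needs
print(unpack_lags(pack_lags(200, 40)))
```

With two 6-bit gain indices added, the total comes to the 25 bits stated above.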
A method for determining these parameters is given as follows. A weighted mean squared error distortion is first given by:
$$D = \left|s_g - \hat{s}_g\right|^T \cdot W \cdot \left|s_g - \hat{s}_g\right|, \tag{11}$$
where W is a weighting matrix used for finding optimal parameters, and T denotes the vector transpose. W is a positive definite matrix and is preferably a diagonal matrix. If W is an identity matrix, then the distortion is a mean squared distortion.
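We can now define the self- and cross-correlations between the various terms of Equation (11) as:
$$R_{gs} = s_g^T \cdot W \cdot \hat{s}_s(-T_1), \tag{12}$$
$$R_{ga} = s_g^T \cdot W \cdot \hat{s}_a(T_2), \tag{13}$$
$$R_{aa} = \hat{s}_a(T_2)^T \cdot W \cdot \hat{s}_a(T_2), \tag{14}$$
$$R_{ss} = \hat{s}_s(-T_1)^T \cdot W \cdot \hat{s}_s(-T_1), \text{ and} \tag{15}$$
$$R_{as} = \hat{s}_a(T_2)^T \cdot W \cdot \hat{s}_s(-T_1). \tag{16}$$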
From these, we can further define the following:
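$$\delta(T_1, T_2) = R_{ss} R_{aa} - R_{as} R_{as}, \tag{17}$$
$$\eta(T_1, T_2) = R_{aa} R_{gs} - R_{as} R_{ga}, \tag{18}$$
$$\gamma(T_1, T_2) = R_{ss} R_{ga} - R_{as} R_{gs}. \tag{19}$$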
The values of $T_1$ and $T_2$ which minimize the distortion in Equation (11) are the values of $T_1$ and $T_2$ which maximize:
$$S = (\eta \cdot R_{gs} + \gamma \cdot R_{ga}) / \delta. \tag{20}$$
Now let $T_1^*$ and $T_2^*$ be the optimum values which maximize the expression in Equation (20). The coefficients α and β in Equation (10) are then obtained as:
$$\alpha = \eta(T_1^*, T_2^*) / \delta(T_1^*, T_2^*) \text{ and} \tag{21}$$
$$\beta = \gamma(T_1^*, T_2^*) / \delta(T_1^*, T_2^*). \tag{22}$$
The values of α and β are subsequently quantized using six-bit scalar quantizers. In the unlikely case where, for certain values of $T_1$ and $T_2$, the determinant δ in Equation (20) is zero, the expression in Equation (20) is evaluated as:
$$S = R_{gs} R_{gs} / R_{ss}, \quad R_{ss} > 0, \tag{23}$$
or
$$S = R_{ga} R_{ga} / R_{aa}, \quad R_{aa} > 0. \tag{24}$$
If both $R_{ss}$ and $R_{aa}$ are zero, then S is set to a very small value.
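The search described by Equations (12) through (24) can be sketched as follows. Several details are assumptions made for illustration rather than statements of the disclosed method: W is taken as diagonal and passed as a list of weights, the buffers `s_s` and `s_a` are plain lists indexed so that `past_segment` and `future_segment` return the L-sample vectors, a small relative tolerance stands in for "δ is zero", and the single-segment gains returned in the degenerate cases are the ordinary least-squares gains (the text specifies only S there). All function names are hypothetical.

```python
def wdot(x, y, w):
    """Weighted inner product x^T W y for a diagonal W given as a list."""
    return sum(wi * xi * yi for xi, yi, wi in zip(x, y, w))

def past_segment(s_s, t, length):
    """Assumed indexing: `length` samples starting t samples before the end of s_s."""
    start = len(s_s) - t
    return s_s[start:start + length]

def future_segment(s_a, t, length):
    """Assumed indexing: `length` samples starting at offset t into s_a."""
    return s_a[t:t + length]

def evaluate(s_g, seg_s, seg_a, w):
    """Return (S, alpha, beta) for one candidate pair of segments."""
    r_gs = wdot(s_g, seg_s, w)
    r_ga = wdot(s_g, seg_a, w)
    r_aa = wdot(seg_a, seg_a, w)
    r_ss = wdot(seg_s, seg_s, w)
    r_as = wdot(seg_a, seg_s, w)
    delta = r_ss * r_aa - r_as * r_as
    if delta > 1e-12 * r_ss * r_aa:          # tolerance is an assumption
        eta = r_aa * r_gs - r_as * r_ga
        gamma = r_ss * r_ga - r_as * r_gs
        return (eta * r_gs + gamma * r_ga) / delta, eta / delta, gamma / delta
    if r_ss > 0:                             # Equation (23)
        return r_gs * r_gs / r_ss, r_gs / r_ss, 0.0
    if r_aa > 0:                             # Equation (24)
        return r_ga * r_ga / r_aa, 0.0, r_ga / r_aa
    return 1e-30, 0.0, 0.0                   # "a very small value"

def search_gap_parameters(s_g, s_s, s_a, t1_range, t2_range, w=None):
    """Joint exhaustive search; returns (S, T1, T2, alpha, beta)."""
    length = len(s_g)
    if w is None:
        w = [1.0] * length                   # identity weighting
    best = None
    for t1 in t1_range:
        seg_s = past_segment(s_s, t1, length)
        for t2 in t2_range:
            seg_a = future_segment(s_a, t2, length)
            s, alpha, beta = evaluate(s_g, seg_s, seg_a, w)
            if best is None or s > best[0]:
                best = (s, t1, t2, alpha, beta)
    return best
```

For the ranges given earlier, a call would look like `search_gap_parameters(s_g, s_s, s_a, range(160, 261), range(0, 81))`, with `s_s` holding at least 260 past samples and `s_a` at least 160 samples.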
A joint exhaustive search method for $T_1$ and $T_2$ has been described above. The joint search is generally complex; however, various relatively low complexity approaches may be adopted for this search. For example, the search for $T_1$ and $T_2$ can first be decimated by a factor greater than 1 and then the search can be localized. A sequential search may also be used, where a few optimum values of $T_1$ are first obtained assuming $R_{ga} = 0$, and then $T_2$ is searched only over those values of $T_1$.
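One way to read the sequential search is sketched below. It is an interpretation, not the disclosed procedure: the first stage ranks the speech-side offsets with the single-segment score of Equation (23), taken here as the measure that applies when the audio segment is ignored; the second stage evaluates Equation (20) only for a shortlist of `keep` offsets. The shortlist size, the identity weighting, the buffer indexing, and the names are all assumptions.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def sequential_search(s_g, s_s, s_a, t1_range, t2_range, keep=3):
    """Two-stage search; returns (S, T1, T2) or None if every pair is degenerate."""
    length = len(s_g)

    def past(t):
        start = len(s_s) - t          # assumed indexing of the speech buffer
        return s_s[start:start + length]

    def future(t):
        return s_a[t:t + length]      # assumed indexing of the audio buffer

    def stage1_score(t1):
        seg = past(t1)
        r_ss = dot(seg, seg)
        return dot(s_g, seg) ** 2 / r_ss if r_ss > 0 else 0.0

    # Stage 1: keep the few best speech-side offsets.
    shortlist = sorted(t1_range, key=stage1_score, reverse=True)[:keep]

    # Stage 2: full score of Equation (20), restricted to the shortlist.
    best = None
    for t1 in shortlist:
        seg_s = past(t1)
        r_ss, r_gs = dot(seg_s, seg_s), dot(s_g, seg_s)
        for t2 in t2_range:
            seg_a = future(t2)
            r_aa, r_ga, r_as = dot(seg_a, seg_a), dot(s_g, seg_a), dot(seg_a, seg_s)
            delta = r_ss * r_aa - r_as * r_as
            if delta <= 1e-12 * r_ss * r_aa:
                continue              # degenerate pairs are skipped in this sketch
            eta = r_aa * r_gs - r_as * r_ga
            gamma = r_ss * r_ga - r_as * r_gs
            score = (eta * r_gs + gamma * r_ga) / delta
            if best is None or score > best[0]:
                best = (score, t1, t2)
    return best
```

With the ranges given earlier, the second stage evaluates `keep × 81` pairs instead of 8181, at the cost of possibly missing the joint optimum.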
Using a sequential search as described above also gives rise to the case where either the first weighted segment $\alpha \cdot \hat{s}_s(-T_1)$ or the second weighted segment $\beta \cdot \hat{s}_a(T_2)$ alone may be used to construct the coded audio gap filler samples represented by $\hat{s}_g$. That is, in one embodiment, it is possible that only one set of parameters for the weighted segments is generated and used by the decoder to reconstruct the audio gap filler samples. Furthermore, there may be embodiments which consistently favor one weighted segment over the other. In such cases, the distortion may be reduced by considering only one of the weighted segments.
If an audio coder could generate all the samples of the current frame without any loss, then a window with the left end having a rectangular shape is preferred. However, using a window with a rectangular shape may result in more energy in the high frequency MDCT coefficients, which may be more difficult to code without significant loss using a limited number of bits. Thus, to have a proper frequency response, a window having a smooth transition (with an M1=50 sample sine window on left and M/2 samples cosine window on right) is used. This is described as:
In the present example, a gap of 80+M1 samples is coded using an alternative method to that described previously. Since a smooth window with a transition region of 50 samples is used instead of a rectangular or step window, the gap region to be coded using an alternate method is extended by M1=50 samples, thereby making the length of the gap region 130 samples. The same forward/backward prediction approach discussed above is used for generating these 130 samples.
Weighted mean square methods are typically good for low frequency signals and tend to decrease the energy of high frequency signals. To decrease this effect, the signals $\hat{s}_s$ and $\hat{s}_a$ may be passed through a first-order pre-emphasis filter (pre-emphasis filter coefficient = 0.1) before generating $\hat{s}_g$ in Equation (10) above.
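A first-order pre-emphasis filter is conventionally written as y(n) = x(n) − μ·x(n−1). The sketch below assumes that conventional form with μ = 0.1; the text gives only the filter order and the coefficient, so the exact filter structure and the handling of the first sample (the `prev` argument) are assumptions.

```python
def pre_emphasis(x, coeff=0.1, prev=0.0):
    """Assumed form y[n] = x[n] - coeff * x[n-1]; `prev` is the sample before x[0]."""
    y = []
    for sample in x:
        y.append(sample - coeff * prev)
        prev = sample
    return y

# A constant (low frequency) input is attenuated slightly, while an
# alternating (high frequency) input is boosted.
print(pre_emphasis([1.0, 1.0, 1.0, 1.0]))
print(pre_emphasis([1.0, -1.0, 1.0, -1.0]))
```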
The audio mode output $\hat{s}_a$ may have a tapering analysis and synthesis window, and hence, for some values of the delay $T_2$, the segment $\hat{s}_a(T_2)$ may overlap with the tapering region of $\hat{s}_a$. In such situations, the gap region $s_g$ may not have a very good correlation with $\hat{s}_a(T_2)$. In such a case, it may be preferable to multiply $\hat{s}_a$ with an equalizer window E to get an equalized audio signal:
$$\hat{s}_{ae} = E \cdot \hat{s}_a. \tag{26}$$
Instead of using $\hat{s}_a$, this equalized audio signal may now be used in Equation (10) and in the discussion following Equation (10).
The Forward/Backward estimation method used for coding of the gap frame generally produces a good match for the gap signal, but it sometimes results in discontinuities at both end points, i.e., at the boundary between the speech part and the gap region as well as at the boundary between the gap region and the generic audio coded part.
For the smoothed transition at the boundary of the gap and the MDCT output of the speech-to-audio switching frame, the last 50 samples of $\hat{s}_g$ are first multiplied by $(1 - w_m^2(n))$ and then added to the first 50 samples of $\hat{s}_a$.
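The smoothing at the gap-to-audio boundary can be sketched as a complementary overlap-add. The sketch assumes that the first 50 samples of the audio output already carry the squared rising taper (analysis and synthesis windowing), that the taper is a 50-sample quarter-period sine, and that the last 50 gap-filler samples coincide in time with those first 50 audio samples. The taper shape, the alignment, and the function name are assumptions for illustration.

```python
import math

def smooth_gap_to_audio(gap, audio, m1=50):
    """Overlap the last m1 gap-filler samples onto the first m1 audio samples."""
    # Assumed rising taper over the first m1 samples of the audio frame.
    w = [math.sin((n + 0.5) * math.pi / (2 * m1)) for n in range(m1)]
    tail = gap[-m1:]
    blended = [(1.0 - w[n] ** 2) * tail[n] + audio[n] for n in range(m1)]
    return gap[:-m1] + blended + audio[m1:]
```

Because the gap tail is weighted by 1 − w² while the audio head is weighted by w², the two weights sum to one at every sample of the overlap, so a signal that is continuous across the boundary passes through unchanged.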
At 730, audio gap filler samples are generated based on parameters representative of a weighted segment of the first frame of coded audio samples and/or a weighted segment of the portion of the second frame of coded audio samples.
The audio gap frame fills at least a portion of the audio gap between the first frame of coded audio samples and the portion of the second frame of coded audio samples, thereby eliminating or at least reducing any audible noise that may be perceived by the user. A switch 370 selects either the output of the speech decoder 320 or the combiner 360 based on the codeword, such that the decoded frames are recombined in an output sequence.
While the present disclosure and the best modes thereof have been described in a manner establishing possession and enabling those of ordinary skill to make and use the same, it will be understood and appreciated that there are equivalents to the exemplary embodiments disclosed herein and that modifications and variations may be made thereto without departing from the scope and spirit of the inventions, which are to be limited not by the exemplary embodiments but by the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
218/KOL/2010 | Mar 2010 | IN | national |
Number | Name | Date | Kind |
---|---|---|---|
4560977 | Murakami et al. | Dec 1985 | A |
4670851 | Murakami et al. | Jun 1987 | A |
4727354 | Lindsay | Feb 1988 | A |
4853778 | Tanaka | Aug 1989 | A |
5006929 | Barbero et al. | Apr 1991 | A |
5067152 | Kisor et al. | Nov 1991 | A |
5327521 | Savic et al. | Jul 1994 | A |
5394473 | Davidson | Feb 1995 | A |
5956674 | Smyth et al. | Sep 1999 | A |
6108626 | Cellario et al. | Aug 2000 | A |
6236960 | Peng et al. | May 2001 | B1 |
6253185 | Arean et al. | Jun 2001 | B1 |
6263312 | Kolesnik et al. | Jul 2001 | B1 |
6304196 | Copeland et al. | Oct 2001 | B1 |
6453287 | Unno et al. | Sep 2002 | B1 |
6493664 | Uday Bhaskar et al. | Dec 2002 | B1 |
6504877 | Lee | Jan 2003 | B1 |
6593872 | Makino et al. | Jul 2003 | B2 |
6658383 | Koishida et al. | Dec 2003 | B2 |
6662154 | Mittal et al. | Dec 2003 | B2 |
6691092 | Udaya Bhaskar et al. | Feb 2004 | B1 |
6704705 | Kabal et al. | Mar 2004 | B1 |
6775654 | Yokoyama et al. | Aug 2004 | B1 |
6813602 | Thyssen | Nov 2004 | B2 |
6940431 | Hayami | Sep 2005 | B2 |
6975253 | Dominic | Dec 2005 | B1 |
7031493 | Fletcher et al. | Apr 2006 | B2 |
7130796 | Tasaki | Oct 2006 | B2 |
7161507 | Tomic | Jan 2007 | B2 |
7180796 | Tanzawa et al. | Feb 2007 | B2 |
7212973 | Toyama et al. | May 2007 | B2 |
7230550 | Mittal et al. | Jun 2007 | B1 |
7231091 | Keith | Jun 2007 | B2 |
7414549 | Yang et al. | Aug 2008 | B1 |
7461106 | Mittal et al. | Dec 2008 | B2 |
7761290 | Koishida et al. | Jul 2010 | B2 |
7840411 | Hotho et al. | Nov 2010 | B2 |
7885819 | Koishida et al. | Feb 2011 | B2 |
7889103 | Mittal et al. | Feb 2011 | B2 |
20020052734 | Unno et al. | May 2002 | A1 |
20030004713 | Makino et al. | Jan 2003 | A1 |
20030009325 | Kirchherr et al. | Jan 2003 | A1 |
20030220783 | Streich et al. | Nov 2003 | A1 |
20040252768 | Suzuki et al. | Dec 2004 | A1 |
20050261893 | Toyama et al. | Nov 2005 | A1 |
20060022374 | Chen et al. | Feb 2006 | A1 |
20060047522 | Ojanpera | Mar 2006 | A1 |
20060173675 | Ojanpera | Aug 2006 | A1 |
20060190246 | Park | Aug 2006 | A1 |
20060241940 | Ramprashad | Oct 2006 | A1 |
20060265087 | Philippe et al. | Nov 2006 | A1 |
20070171944 | Schuijers et al. | Jul 2007 | A1 |
20070239294 | Brueckner et al. | Oct 2007 | A1 |
20070271102 | Morii | Nov 2007 | A1 |
20080065374 | Mittal et al. | Mar 2008 | A1 |
20080120096 | Oh et al. | May 2008 | A1 |
20090024398 | Mittal et al. | Jan 2009 | A1 |
20090030677 | Yoshida | Jan 2009 | A1 |
20090076829 | Ragot et al. | Mar 2009 | A1 |
20090100121 | Mittal et al. | Apr 2009 | A1 |
20090112607 | Ashley et al. | Apr 2009 | A1 |
20090234642 | Mittal et al. | Sep 2009 | A1 |
20090259477 | Ashley et al. | Oct 2009 | A1 |
20090276212 | Khalil et al. | Nov 2009 | A1 |
20090306992 | Ragot et al. | Dec 2009 | A1 |
20090326931 | Ragot et al. | Dec 2009 | A1 |
20100088090 | Ramabadran | Apr 2010 | A1 |
20100169087 | Ashley et al. | Jul 2010 | A1 |
20100169099 | Ashley et al. | Jul 2010 | A1 |
20100169100 | Ashley et al. | Jul 2010 | A1 |
20100169101 | Ashley et al. | Jul 2010 | A1 |
20110161087 | Ashley et al. | Jun 2011 | A1 |
Number | Date | Country |
---|---|---|
0932141 | Jul 1999 | EP |
1533789 | May 2005 | EP |
1619664 | Jan 2006 | EP |
1483759 | Sep 2006 | EP |
1845519 | Sep 2009 | EP |
1959431 | Jun 2010 | EP |
9715983 | May 1997 | WO |
03073741 | Sep 2003 | WO |
2007063910 | Jun 2007 | WO |
2008063035 | May 2008 | WO |
2010003663 | Jan 2010 | WO |
Entry |
---|
Office Action for U.S. Appl. No. 12/345,141, mailed Sep. 19, 2011. |
Office Action for U.S. Appl. No. 12/345,165, mailed Sep. 1, 2011. |
Office Action for U.S. Appl. No. 12/047,632, mailed Oct. 18, 2011. |
Office Action for U.S. Appl. No. 12/187,423, mailed Sep. 30, 2011. |
Office Action for U.S. Appl. No. 12/099,842, mailed Oct. 12, 2011. |
Patent Cooperation Treaty, “PCT Search Report and Written Opinion of the International Searching Authority” for International Application No. PCT/US2011/0266400 Aug. 5, 2011, 11 pages. |
Neuendorf, et al., “Unified Speech Audio Coding Scheme for High Quality at Low Bitrates” IEEE International Conference on Acoustics, Speech and Signal Processing, 2009, Apr. 19, 2009, 4 pages. |
Mexican Patent Office, 2nd Office Action, Mexican Patent Application MX/a/2010/004479 dated Jan. 31, 2012, 5 pages. |
Bruno Bessette: “Universal Speech/Audio Coding using Hybrid ACELP/TCX Techniques”, Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference, Mar. 18-23, 2005, ISSN : III-301-III-304, Print ISBN: 0-78. |
United States Patent and Trademark Office, “Non-Final Rejection” for U.S. Appl. No. 12/047,632 dated Mar. 2, 2011, 20 pages. |
United States Patent and Trademark Office, “Non-Final Rejection” for U.S. Appl. No. 12/099,842 dated Apr. 15, 2011, 21 pages. |
Ramo et al. “Quality Evaluation of the G.EV-VBR Speech Codec” Apr. 4, 2008, pp. 4745-4748. |
Jelinek et al. “ITU-T G.EV-VBR Baseline Codec” Apr. 4, 2008, pp. 4749-4752. |
Jelinek et al. “Classification-Based Techniques for Improving the Robustness of CELP Coders” 2007, pp. 1480-1484. |
Fuchs et al. “A Speech Coder Post-Processor Controlled by Side-Information” 2005, pp. IV-433-IV-436. |
J. Fessler, “Chapter 2; Discrete-time signals and systems” May 27, 2004, pp. 2.1-2.21. |
Patent Cooperation Treaty, “PCT Search Report and Written Opinion of the International Searching Authority” for International Application No. PCT/US2011/026660 Jun. 15, 2011, 10 pages. |
Virette et al “Adaptive Time-Frequency Resolution in Modulated Transform at Reduced Delay” ICASSP 2008; pp. 3781-3784. |
Edler “Coding of Audio Signals with Overlapping Block Transform and Adaptive Window Functions”; Journal of Vibration and Low Voltage fnr; vol. 43, 1989, Section 3.1; pp. 252-256. |
Princen et al., “Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation” IEEE 1987; pp. 2161-2164. |
Ramprashad: “High Quality Embedded Wideband Speech Coding Using an Inherently Layered Coding Paradigm” Proceedings of International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2000, vol. 2, Jun. 5-9, 2000 pp. 1145-1148. |
Patent Cooperation Treaty, “PCT Search Report and Written Opinion of the International Searching Authority” for International Application No. PCT/US2008/077693 Dec. 15, 2008, 12 pages. |
Kovesi et al.; “A Scalable Speech and Audio Coding Scheme with Continuous Bitrate Flexibility” Proceeding of International Conference on Acoustics, Speech, and Signal Processing, 2004, Piscataway, JY vol. 1, May 17, 2004 pp. 273-276. |
Ramprashad: “Embedded Coding Using a Mixed Speech and Audio Coding Paradigm” International Journal of Speech Technology Kluwer Academic Publishers Netherlands, Vo. 2, No. 4, May 1999, pp. 359-372. |
3GPP TS 26.290 v7.0.0 (Mar. 2007) 3rd Generation Partnership Project; Technical Specification Group Service and System Aspects; Audio codec processing functions; Extended Adaptive Multi-Rate - Wideband (AMR-WB+) codec; Transcoding functions (Release 7). |
International Telecommunication Union, G.729.1, Series G: Transmission Systems and Media, Digital Systems and Networks, Digital Terminal Equipments—Coding of analogue signals by methods other than PCM, G.729 based Embedded Variable bit-rate coder. |
Chen et al.; “Adaptive Postfiltering for Quality Enhancement of Coded Speech” IEEE Transactions on Speech and Audio Processing, vol. 3, No. 1, Jan. 1995, pp. 59-71. |
Chan et al.; “Frequency domain postfiltering for multiband excited linear predictive coding of speech” Electronics Letters, Jun. 6, 1996, vol. 32 No. 12; pp. 1061-1063. |
Andersen et al.; “Reverse Water-Filling in Predictive Encoding of Speech” IEEE 1999 pp. 105-107. |
Makinen et al., “AMR-WB+: a new audio coding standard for 3rd generation mobile audio service”, In 2005 Proceedings IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 2, pp. ii/1109-ii/1112, Mar. 18, 2005. |
Faller et al., “Technical advances in digital audio radio broadcasting”, Proceedings of the IEEE, vol. 90, No. 8, pp. 1303-1333, Aug. 1, 2002. |
Salami et al., “Extended AMR-WB for High-Quality Audio on Mobile Devices”, IEEE Communications Magazine, pp. 90-97, May 1, 2006. |
Hung et al., Error-Resilient Pyramid Vector Quantization for Image Compression, IEEE Transactions on Image Processing, 1994 pp. 583-587. |
Hung et al., Error-resilient pyramid vector quantization for image compression, IEEE Transactions on Image Processing, vol. 7, No. 10, Oct. 1, 1998. |
Daniele Cadel, et al. “Pyramid Vector Coding for High Quality Audio Compression”, IEEE 1997, pp. 343-346, Cefriel, Milano, Italy and Alcatel Telecom, Vimercate Italy. |
Mittal et al., Coding unconstrained FCB excitation using combinatorial and Huffman codes, Speech Coding 2002 IEEE Workshop Proceedings, Oct. 1, 2002, pp. 129-131. |
Ashley et al., Wideband coding of speech using a scalable pulse codebook, Speech Coding 2000 IEEE Workshop Proceedings, Sep. 1, 2000, pp. 148-150. |
Patent Cooperation Treaty, “PCT Search Report and Written Opinion of the International Searching Authority” for International Application No. PCT/US07/74222 Jul. 23, 2008, 9 pages. |
Patent Cooperation Treaty, “PCT Search Report and Written Opinion of the International Searching Authority” for International Application No. PCT/US2009/036479 Jul. 28, 2009, 15 pages. |
Markas et al. “Multispectral Image Compression Algorithms”; Data Compression Conference, 1993; Snowbird, UT USA Mar. 30-Apr. 2, 1993; pp. 391-400. |
Mittal et al., Low complexity factorial pulse coding of MDCT coefficients using approximation of combinatorial functions, Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on, Apr. 1, 2007, pp. I-289-I-292. |
“Enhanced Variable Rate Codec, Speech Service Options 3, 68, and 70 for Wideband Spread Spectrum Digital Systems”, 3GPP2 TSG-C Working Group 2, XX, XX, No. C. S0014-C, Jan. 1, 2007, pp. 1-5. |
Boris Ya Ryabko et al.: “Fast and Efficient Construction of an Unbiased Random Sequence”, IEEE Transactions on Information Theory, IEEE, US, vol. 46, No. 3, May 1, 2000, ISSN: 0018-9448, pp. 1090-1093. |
Ratko V. Tomic: “Quantized Indexing: Background Information”, May 16, 2006, URL: http://web.archive.org/web/20060516161324/www.1stworks.com/ref/TR/tr05-0625a.pdf, pp. 1-39. |
Ido Tal et al.: “On Row-by-Row Coding for 2-D Constraints”, Information Theory, 2006 IEEE International Symposium on, IEEE, PI, Jul. 1, 2006, pp. 1204-1208. |
Patent Cooperation Treaty, “PCT Search Report and Written Opinion of the International Searching Authority” for International Application No. PCT/US2009/036481 Jul. 20, 2009, 15 pages. |
Tancerel, L. et al., “Combined Speech and Audio Coding by Discrimination,” In Proceedings of IEEE Workshop on Speech Coding, pp. 154-156, (2000). |
Udar Mittal et al., “Decoder for Audio Signal Including Generic Audio and Speech Frames”, U.S. Appl. No. 12/844,199, filed Jul. 27, 2010. |
Patent Cooperation Treaty, “PCT Search Report and Written Opinion of the International Searching Authority” for International Application No. PCT/US2009/039984 Aug. 13, 2009, 14 pages. |
United States Patent and Trademark Office, “Non-Final Office Action” for U.S. Appl. No. 12/196,414 dated Jun. 4, 2012, 9 pages. |
Ratko V. Tomic: “Fast, Optimal Entropy Coder” 1stWorks Corporation Technical Report TR04-0815, Aug. 15, 2004, pp. 1-52. |
European Patent Office, Supplementary Search Report for EPC Patent Application No. 07813290.9 dated Jan. 4, 2013, 8 pages. |
Cover, T.M., “Enumerative Source Encoding” IEEE Transactions on Information Theory, IEEE Press, USA vol. IT-19, No. 1; Jan. 1, 1973, pp. 73-77. |
MacKay, D., “Information Theory, Inference, and Learning Algorithms” in: “Information Theory, Inference, and Learning Algorithms”, Jan. 1, 2004; pp. 1-10. |
Korean Intellectual Property Office, Notice of Preliminary Rejection for Korean Patent Application No. 10-2010-0725140 dated Jan. 4, 2013. |
Chinese Patent Office (SIPO), 1st Office Action for Chinese Patent Application No. 200980153318.0 dated Sep. 12, 2012, 6 pages. |
Number | Date | Country | |
---|---|---|---|
20110218799 A1 | Sep 2011 | US |