The present disclosure relates to the field of coding and decoding audio signals. More specifically, the present disclosure relates to time-domain aliasing cancellation in a coded audio signal.
State-of-the-art audio coding uses time-frequency decomposition to represent the signal in a meaningful way for data reduction. More specifically, audio coders use transforms to perform a mapping of the time-domain samples into frequency-domain coefficients. Discrete-time transforms used for this time-to-frequency mapping are typically based on kernels of sinusoidal functions, such as the Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT). It can be shown that such transforms achieve energy compaction of the audio signal. Energy compaction means that, in the transform (or frequency) domain, the energy distribution is localized on fewer significant frequency-domain coefficients than in the time-domain samples. Coding gains can then be achieved by applying adaptive bit allocation and suitable quantization to the frequency-domain coefficients. At the receiver, the bits representing the quantized and coded parameters (including the frequency-domain coefficients) are used to recover the quantized frequency-domain coefficients (or other quantized data such as gains), and the inverse transform generates the time-domain audio signal. Such coding schemes are generally referred to as transform coding.
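By way of non-limiting illustration, the energy compaction property mentioned above can be sketched numerically. The example below is only a sketch under stated assumptions: the helper names `dct2` and `count_for_energy`, the 99% energy threshold and the test signal are hypothetical choices made for illustration and are not part of the disclosure.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def count_for_energy(values, fraction=0.99):
    """Smallest number of entries whose squared values hold `fraction` of the energy."""
    energies = sorted((v * v for v in values), reverse=True)
    total = sum(energies)
    running = 0.0
    for count, energy in enumerate(energies, start=1):
        running += energy
        if running >= fraction * total:
            return count
    return len(energies)

# Illustrative signal (an assumption): a single cosine matching one DCT-II basis function.
N = 32
signal = [math.cos(math.pi * (i + 0.5) * 3 / N) for i in range(N)]

time_count = count_for_energy(signal)        # samples needed in the time domain
freq_count = count_for_energy(dct2(signal))  # coefficients needed in the transform domain
```

For this particular signal, the transform domain concentrates the energy in far fewer values than the time domain, which is what makes adaptive bit allocation effective.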
By definition, transform coding operates on consecutive blocks (usually called “frames”) of samples of the input audio signal. Since quantization introduces some distortion in each synthesized block of audio signal, using non-overlapping blocks may introduce discontinuities at the block boundaries, which may degrade the audio signal quality. Hence, in transform coding, to avoid discontinuities, the coded blocks of audio signal are overlapped prior to applying the transform, and appropriately windowed in the overlapping segment to allow a smooth transition from one decoded block of samples to the next. Using a transform such as the DFT (or its fast equivalent, the Fast Fourier Transform (FFT)) or the DCT and applying it to overlapped blocks of samples unfortunately results in what is called “non-critical sampling”. For example, taking a typical 50% overlap condition, coding a block of N consecutive time-domain samples actually requires taking a transform on 2N consecutive samples, including N samples from the present block and N samples from the overlapping parts of the preceding and next blocks. Hence, for every block of N time-domain samples, 2N frequency-domain coefficients are coded. By contrast, critical sampling in the frequency domain implies that N input time-domain samples produce only N frequency-domain coefficients to be quantized and coded.
Specialized transforms have been designed to allow the use of overlapping windows and still maintain critical sampling in the transform domain. With such specialized transforms, the 2N time-domain samples at the input of the transform result in N frequency-domain coefficients at the output of the transform. To achieve this, the block of 2N time-domain samples is first reduced to a block of N time-domain samples through special time inversion, summation of specific parts of the 2N-sample long windowed signal at one end of the window, and subtraction of specific parts of the 2N-sample long windowed signal from each other at the other end of the window. These time inversion, summation and subtraction operations introduce what is called “time-domain aliasing” (TDA). Once TDA is introduced in the block of samples of the audio signal, it cannot be removed using only that block. It is this time-domain aliased signal that is the input of a transform of size N (and not 2N), producing the N frequency-domain coefficients of the transform. To recover the N time-domain samples, the inverse transform uses the transform coefficients from two consecutive and overlapping frames or blocks to cancel out the TDA, in a process called time-domain aliasing cancellation (TDAC).
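By way of non-limiting illustration, the reduction of 2N windowed samples to N aliased samples, and the cancellation of that aliasing by overlap-add of two consecutive frames, can be sketched as follows. This is only a sketch: the sine window, the sign convention of `fold` and `unfold`, and all function names are assumptions chosen for illustration. The size-N transform and its inverse are omitted since an invertible transform does not change the aliasing behavior.

```python
import math

def sine_window(length):
    """Symmetric sine window satisfying w[n]^2 + w[n + length/2]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]

def fold(block):
    """Reduce 2N windowed samples (quarters a, b, c, d) to N aliased samples."""
    n = len(block) // 2
    h = n // 2
    a, b, c, d = block[:h], block[h:n], block[n:n + h], block[n + h:]
    first = [-cr - dd for cr, dd in zip(reversed(c), d)]    # one end: summation
    second = [aa - br for aa, br in zip(a, reversed(b))]    # other end: subtraction
    return first + second

def unfold(folded):
    """Expand N aliased samples back to 2N samples that still contain TDA."""
    h = len(folded) // 2
    u1, u2 = folded[:h], folded[h:]
    return (u2 + [-v for v in reversed(u2)]
            + [-v for v in reversed(u1)] + [-v for v in u1])

def tdac_middle(signal, n):
    """Overlap-add two frames with 50% overlap; return the N samples they share."""
    w = sine_window(2 * n)
    out = [0.0] * (3 * n)
    for start in (0, n):
        block = [s * wi for s, wi in zip(signal[start:start + 2 * n], w)]
        aliased = unfold(fold(block))
        for i in range(2 * n):
            out[start + i] += aliased[i] * w[i]
    return out[n:2 * n]

N = 8  # N is assumed even so that the block splits into four quarters
signal = [math.sin(0.3 * i) + 0.1 * i for i in range(3 * N)]
middle = tdac_middle(signal, N)
max_error = max(abs(m - s) for m, s in zip(middle, signal[N:2 * N]))
```

In this sketch, only the N samples covered by both frames are recovered; the outer halves of each frame still contain aliasing, which is why a single block cannot remove its own TDA.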
An example of such a transform applying TDAC, which is widely used in audio coding, is the Modified Discrete Cosine Transform (MDCT). Actually, the MDCT introduces TDA without explicit folding in the time domain. Rather, time-domain aliasing is introduced when considering both the direct MDCT and inverse MDCT (IMDCT) of a single block of samples. This comes from the mathematical construction of the MDCT and is well known to those of ordinary skill in the art. But it is also known that this implicit time-domain aliasing can be seen as equivalent to first inverting parts of the time-domain samples and adding (or subtracting) these inverted parts to other parts of the signal. This is known as “folding”.
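By way of non-limiting illustration, the equivalence between the implicit aliasing of the MDCT and explicit folding can be checked numerically. The sketch below assumes one common, unnormalized definition of the MDCT and of the DCT-IV; the function names and the folding sign convention are assumptions made for illustration, and other conventions exist.

```python
import math

def mdct_direct(block):
    """Direct MDCT of 2N samples, producing N coefficients (no explicit folding)."""
    n = len(block) // 2
    return [sum(block[i] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for i in range(2 * n))
            for k in range(n)]

def fold(block):
    """Explicit folding of quarters (a, b, c, d) into (-c_r - d, a - b_r)."""
    n = len(block) // 2
    h = n // 2
    a, b, c, d = block[:h], block[h:n], block[n:n + h], block[n + h:]
    return ([-cr - dd for cr, dd in zip(reversed(c), d)]
            + [aa - br for aa, br in zip(a, reversed(b))])

def dct4(u):
    """Unnormalized DCT-IV of N samples."""
    n = len(u)
    return [sum(u[i] * math.cos(math.pi / n * (i + 0.5) * (k + 0.5))
                for i in range(n))
            for k in range(n)]

# Arbitrary illustrative block of 2N = 16 samples.
block = [math.sin(0.7 * i) + 0.05 * i * i for i in range(16)]
direct = mdct_direct(block)
via_folding = dct4(fold(block))
max_diff = max(abs(p - q) for p, q in zip(direct, via_folding))
```

Under these assumptions the two paths agree up to rounding error, which is the sense in which folding followed by a size-N transform is equivalent to the direct MDCT.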
A problem arises when an audio coder switches between two coding modes, one using TDAC and the other not. Suppose for example that a codec switches from a TDAC coding mode to a non-TDAC coding mode. The side of the block of samples coded using the TDAC coding mode, and which is common to the block coded without using TDAC, contains TDA which cannot be cancelled out using the block of samples coded using the non-TDAC coding mode.
A first solution is to discard the samples which contain aliasing that cannot be cancelled out.
This first solution results in an inefficient use of transmission bandwidth because the block of samples for which TDA cannot be cancelled out is coded twice, once by the TDAC-based codec and a second time by the non-TDAC based codec.
A second solution is to use specially designed windows which do not introduce TDA in at least one part of the window when the time inversion and summation/subtraction process is applied.
As illustrated in
Therefore, there is a need for an improved TDAC technique usable, for example, in the multi-mode Moving Picture Experts Group (MPEG) Unified Speech and Audio Codec (USAC), to manage the different transitions between frames using rectangular, non-overlapping windows and frames using non-rectangular, overlapping windows, while ensuring proper spectral resolution, data overhead reduction and smoothness of transition between these different frame types.
Therefore, there is a need for an aliasing cancellation technique for supporting switching between coding modes, wherein the technique compensates for aliasing effects at a switching point between these modes.
Therefore, according to a first aspect, there is provided a method for producing forward aliasing cancellation (FAC) parameters for cancelling time-domain aliasing caused to a coded audio signal in a first transform-coded frame by a transition between the first transform-coded frame using a first coding mode with overlapping window and a second frame using a second coding mode with non-overlapping window, comprising: calculating a FAC target representative of a difference between the audio signal of the first frame prior to coding and a synthesis of the coded audio signal of the first transform-coded frame; and weighting the FAC target to produce the FAC parameters.
According to a second aspect, there is provided a method for forward cancelling time-domain aliasing caused to a coded audio signal in a first transform-coded frame by a transition between the first transform-coded frame using a first coding mode with overlapping window and a second frame using a second coding mode with non-overlapping window, comprising: receiving weighted forward aliasing cancellation (FAC) parameters; inverse weighting the weighted FAC parameters to produce a FAC synthesis; and upon synthesis of the coded audio signal in the first frame, cancelling the time-domain aliasing from the audio signal synthesis using the FAC synthesis.
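By way of non-limiting illustration, the weighting of a FAC target at the coder and the inverse weighting at the decoder can be sketched as a pair of complementary filters. This is only a sketch under stated assumptions: the filter coefficients are hypothetical, the filters start from zero state, and the quantization and coding of the FAC parameters are omitted.

```python
def weight(signal, w):
    """FIR weighting filter W(z) = sum_k w[k] z^-k, zero initial state (assumption)."""
    return [sum(w[k] * signal[n - k] for k in range(len(w)) if n - k >= 0)
            for n in range(len(signal))]

def inverse_weight(weighted, w):
    """All-pole inverse filter 1/W(z), zero initial state; requires w[0] != 0."""
    out = []
    for n, value in enumerate(weighted):
        feedback = sum(w[k] * out[n - k] for k in range(1, len(w)) if n - k >= 0)
        out.append((value - feedback) / w[0])
    return out

# Hypothetical weighting filter coefficients, chosen only for illustration.
w = [1.0, -0.9, 0.2]

# Coder side: the FAC target is the difference between the original signal
# and the synthesis of the transform-coded frame.
original = [1.0, 2.0, 1.5, -0.5, -1.0]
tc_synthesis = [0.5, 2.25, 0.75, -0.6, -0.4]
fac_target = [o - s for o, s in zip(original, tc_synthesis)]
fac_params = weight(fac_target, w)          # would then be quantized and transmitted

# Decoder side: inverse weighting yields the FAC synthesis, which is added
# to the transform-coded synthesis to cancel the aliasing.
fac_synthesis = inverse_weight(fac_params, w)
corrected = [s + f for s, f in zip(tc_synthesis, fac_synthesis)]
```

Without quantization, the corrected synthesis matches the original signal; in a real codec the quantization error introduced on the weighted FAC parameters is shaped by the inverse filter.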
According to a third aspect, there is provided a device for producing forward aliasing cancellation (FAC) parameters for cancelling time-domain aliasing caused to a coded audio signal in a first transform-coded frame by a transition between the first transform-coded frame using a first coding mode with overlapping window and a second frame using a second coding mode with non-overlapping window, comprising: a calculator of a FAC target representative of a difference between the audio signal of the first frame prior to coding and a synthesis of the coded audio signal of the first transform-coded frame; and a weighting filter supplied with the FAC target to produce the FAC parameters.
According to a fourth aspect, there is provided an audio signal coder, comprising: a first coder of the audio signal in a first transform coding mode using frames with overlapping windows; a second coder of the audio signal in a second coding mode using frames with non-overlapping windows; and a device as defined hereinabove for producing FAC parameters for cancelling time-domain aliasing caused to the audio signal coded in the first coding mode in a first frame with overlapping window by a transition between the first frame using the first coding mode with overlapping window and a second frame using the second coding mode with non-overlapping window.
According to a fifth aspect, there is provided a device for forward cancelling time-domain aliasing caused to a coded audio signal in a first transform-coded frame by a transition between the first transform-coded frame using a first coding mode with overlapping window and a second frame using a second coding mode with non-overlapping window, comprising: an input for receiving weighted forward aliasing cancellation (FAC) parameters; an inverse weighting filter supplied with the weighted FAC parameters to produce a FAC synthesis; and a decoder of the coded audio signal responsive to the FAC synthesis to produce in the first frame an audio signal synthesis with cancelled time-domain aliasing.
According to a sixth aspect, there is provided an audio signal decoder, comprising: a first decoder of the audio signal coded in a first transform coding mode using frames with overlapping windows; a second decoder of the audio signal coded in a second coding mode using frames with non-overlapping windows; and a device as defined hereinabove for forward cancelling time-domain aliasing caused to the audio signal coded using the first coding mode in a frame with overlapping window by a transition between the first frame using the first coding mode with overlapping window and a second frame using the second coding mode with non-overlapping window.
The foregoing and other features will become more apparent upon reading of the following non-restrictive description of illustrative embodiments of the device and method for forward cancelling time-domain aliasing, given by way of example only with reference to the accompanying drawings.
In the appended drawings:
The following disclosure addresses the problem of cancelling the effects of time-domain aliasing and non-rectangular windowing when an audio signal is coded using both overlapping and non-overlapping windows in contiguous frames. Using the technology described herein the use of special, non-optimal windows may be avoided while still allowing proper management of frame transitions between coding modes using both rectangular, non-overlapping windows and non-rectangular, overlapping windows.
Linear Predictive (LP) coding, for example ACELP (Algebraic Code-Excited Linear Prediction) coding, is an example of a coding mode in which a frame is coded using rectangular, non-overlapping windowing. Alternatively, an example of a coding mode using non-rectangular, overlapping windowing is Transform Coded eXcitation (TCX) coding as applied in the MPEG Unified Speech and Audio Codec (USAC). Another example of a coding mode using non-rectangular, overlapping windowing is perceptual transform coding as in the FD mode of USAC, where an MDCT is also used as a transform and a perceptual model is used to dynamically allocate the bits to the transform coefficients. In USAC, TCX frames use both overlapping windows and the Modified Discrete Cosine Transform (MDCT), which introduces Time Domain Aliasing (TDA). USAC is also a typical example where contiguous frames can be coded using either rectangular, non-overlapping windows, such as in ACELP frames, or non-rectangular, overlapping windows, such as in TCX frames. Without loss of generality, the present disclosure thus considers the specific example of USAC to illustrate the benefits of the device and method for forward cancelling time-domain aliasing.
Two distinct cases are addressed in the present disclosure. The first case is concerned with a transition from a frame using a rectangular, non-overlapping window to a frame using a non-rectangular, overlapping window. The second case is concerned with a transition from a frame using a non-rectangular, overlapping window to a frame using a rectangular, non-overlapping window. For the purpose of illustration and without suggesting limitation, frames using a rectangular, non-overlapping window may be coded using the ACELP coding mode, and frames using a non-rectangular, overlapping window may be coded using the TCX coding mode. Further, specific durations may be used for some frames, for example 20 milliseconds for a TCX frame, noted TCX20. However, it should be kept in mind that these examples are used only for illustration purposes, and that other frame lengths and coding modes other than ACELP and TCX can be contemplated.
The case of a transition from a frame with rectangular, non-overlapping window to a frame with non-rectangular, overlapping window will now be addressed in relation to the following description taken in conjunction with
More specifically,
To code the TCX20 frame 203 of
After the combination of time-reversed and shifted portions of the window described in
Techniques to manage this type of transition were presented hereinabove. The present disclosure proposes an alternative approach to managing these transitions. This approach does not use non-optimal and asymmetric windows in the frames where MDCT-based transform-domain coding is used. Instead, the device and method introduced herein allow the use of symmetric windows, centered at the middle of the coded frame, such as for example the TCX20 frame of
In
The device and method introduced herein propose to transmit additional information in the form of Forward Aliasing Cancellation (FAC) parameters, for cancelling these effects and for properly recovering TCX frames.
An embodiment of particular interest uses Frequency-Domain Noise Shaping (FDNS), for example as in PCT application No. PCT/CA2010/001649 filed on Oct. 15, 2010 and entitled “SIMULTANEOUS TIME-DOMAIN AND FREQUENCY-DOMAIN NOISE SHAPING FOR TDAC TRANSFORMS”, to shape the quantization noise in transform-coded frames such as TCX frames. In this embodiment, FAC correction may be applied directly in the original signal domain, such as an audio signal having no weighting applied thereto. In a multi-mode switched codec such as USAC, this implies that quantization noise shaping is performed in the transform domain, for example using MDCT, in all coding modes involving a transform. Specifically, in TCX frames using FDNS, the transform (MDCT) is applied directly to the original signal (as in perceptual transform coding mode) instead of the weighted residual. FDNS operates in such a way as to obtain a noise shaping in TCX frames which is essentially equivalent to using the time-domain perceptual weighting filter but by only operating on the transform (MDCT) coefficients. The FAC correction may then be applied with the procedure described hereinbelow.
The USAC audio codec is used herein as a non-limiting example of a codec. Three coding modes have been proposed for the USAC codec, as follows:
Coding mode 1: Perceptual transform coding of the original audio signal;
Coding mode 2: Transform coding of the weighted residual of an LPC filter;
Coding mode 3: ACELP coding.
In coding mode 1, quantization noise shaping is already accomplished in the transform domain through the application of scale factors derived from a perceptual model, as is well known by those of ordinary skill in the art of audio coding. In coding mode 2, however, quantization noise shaping is typically applied in the time domain using a perceptual, or weighting, filter W(z) derived from a linear-predictive coding (LPC) filter calculated for the current frame. A transform, for example a DCT transform, is applied after this time-domain filtering to obtain a FAC target to be quantized and coded as FAC parameter. This prevents joining successive frames coded in modes 1 and 2 directly using Time-Domain Aliasing Cancellation (TDAC) properties of the MDCT since the MDCT is not applied in the same domain for coding modes 1 and 2.
Consequently, in an embodiment of the device and method for forward cancelling time-domain aliasing, quantization noise shaping for coding mode 2 is made through frequency-domain filtering using the FDNS process of PCT application No. PCT/CA2010/001649, rather than time-domain filtering. Hence, the transform, which is for example MDCT in the case of USAC, is applied to the original audio signal rather than a weighted version of that audio signal at the output of the filter W(z). This ensures uniformity between coding mode 1 and coding mode 2 and allows joining successive frames coded in modes 1 and 2 using the TDAC property of MDCT.
However, applying the quantization noise shaping in the transform domain for coding mode 2 uses special processing when handling transitions from and to ACELP mode.
There are four lines (line 1 to line 4) in
Line 1 of
Line 2 of
Then, specifics of the exemplary ACELP coding may be used to alleviate at least in part the transform coding error signal induced at the beginning of the synthesis signal 412. A prediction for use in reducing an energy of the transform coding error signal is shown on line 3 of
At the beginning of the frame 402 between markers LPC1 and LPC2 of line 3, two contributions from the ACELP synthesis filter states immediately at the left of marker LPC1 may be positioned. A first contribution 422 comprises a windowed and time-reversed, or folded, version of the last ACELP synthesis samples of frame 404. The window length and shape for this time-reversed signal 422 is the same as the windowed and folded ACELP synthesis portion 418 on the left side of the decoded Transform Coding (TC) frame 402 on line 2. This contribution 422 represents a good approximation of the time-domain aliasing present in the TC frame of line 2. A second contribution 424 comprises a windowed zero-input response (ZIR) of the ACELP synthesis filter, with initial states taken as the final states of this filter at the end of the ACELP synthesis frame 404, immediately at the left of marker LPC1. The window length and shape of this second contribution 424 is taken as the complement of the square of the transform window used in the transform-coded frame which, in the exemplary case of USAC, is the MDCT.
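By way of non-limiting illustration, the two prediction contributions described above can be sketched for the samples immediately following marker LPC1. This is only a sketch under stated assumptions: a sine window with the folding point at LPC1 is assumed, the window applied to the time-reversed ACELP synthesis is taken as the product of the window halves on either side of the folding point (a choice derived from that assumed folding convention), and all names are hypothetical. The ideal case is shown, where the ACELP synthesis and the ZIR exactly match the signal on either side of LPC1.

```python
import math

def prediction_contributions(acelp_tail, zir, w_before, w_after):
    """Return (windowed folded ACELP synthesis, windowed ZIR) after LPC1."""
    folded = [wa * wb * s for wa, wb, s in
              zip(w_after, reversed(w_before), reversed(acelp_tail))]
    # Window of the ZIR term: complement of the square of the transform window.
    zir_term = [(1.0 - wa * wa) * z for wa, z in zip(w_after, zir)]
    return folded, zir_term

H = 4  # number of samples on each side of the folding point (assumption)
window = [math.sin(math.pi * (n + 0.5) / (4 * H)) for n in range(4 * H)]
w_before, w_after = window[:H], window[H:2 * H]

past = [0.9, -0.4, 0.3, 0.8]      # signal just before LPC1 (ACELP frame)
current = [0.5, 0.1, -0.7, 0.2]   # signal just after LPC1 (TC frame)

# Decoded TC output after LPC1: windowed signal minus windowed, folded aliasing.
tc_output = [w_after[i] * (w_after[i] * current[i]
                           - w_before[H - 1 - i] * past[H - 1 - i])
             for i in range(H)]

# Ideal prediction: ACELP synthesis equals `past`, ZIR equals `current`.
folded, zir_term = prediction_contributions(past, current, w_before, w_after)
restored = [t + f + z for t, f, z in zip(tc_output, folded, zir_term)]
```

In practice the ACELP synthesis and the ZIR only approximate the signal, so a residual error remains; that residual is what the FAC target carries at the beginning of the frame.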
Then, having optionally positioned these two prediction contributions (windowed and folded ACELP synthesis 422 and windowed ACELP ZIR 424) on line 3, line 4 is obtained by subtracting line 2 and line 3 from line 1, using adders 426 and 427. It should be noted that the difference computed during this operation stops at marker LPC2. An approximate view of the expected time-domain envelope of the transform coding error signal is shown on line 4. The time-domain envelope of an ACELP coding error 430 in the ACELP frame 404 is expected to be approximately flat in amplitude, provided that the coded signal is stationary for this duration. Then the time-domain envelope of the transform coding error in the TC frame 402, between markers LPC1 and LPC2, is expected to exhibit the general shape as shown in this frame on line 4. This expected shape of the time-domain envelope of the transform coding error is only shown here for illustration purposes and can vary depending on the signal coded in the TC frame between markers LPC1 and LPC2. This illustration of the time-domain envelope of the transform coding error expresses that it is expected to be relatively large near the beginning and end of the TC frame 402, between markers LPC1 and LPC2. At the beginning of the frame 402, where a first FAC target part 432 is shown, the transform coding error is reduced using the two ACELP prediction contributions 422, 424, shown on line 3. This reduction is not present at the end of the TC frame 402, where a second FAC target part 434 is shown. In the second FAC target part 434, the windowing and time-domain aliasing effects cannot be reduced using the synthesis from the next frame, which begins after marker LPC2, since the TC frame 402 needs to be decoded before the next frame can be decoded.
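By way of non-limiting illustration, the subtraction that yields line 4 can be sketched as a sample-by-sample difference stopping at marker LPC2. The function name, the sample values and the position of LPC2 are hypothetical and given only for illustration.

```python
def fac_target(original, tc_synthesis, prediction, lpc2):
    """Line 4 = line 1 - line 2 - line 3, computed only up to marker LPC2."""
    return [original[i] - tc_synthesis[i] - prediction[i] for i in range(lpc2)]

# Illustrative values; the last two samples lie beyond LPC2 and are ignored.
line1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]   # original signal
line2 = [0.2, 1.5, 3.0, 4.0, 4.0, 3.0, 0.0, 0.0]   # decoded transform-coded frame
line3 = [0.7, 0.4, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # ACELP-based prediction at frame start
LPC2 = 6

line4 = fac_target(line1, line2, line3, LPC2)
without_prediction = fac_target(line1, line2, [0.0] * len(line1), LPC2)
```

With these values, the error at the beginning of the frame is smaller with the prediction than without it, while the error at the end of the frame is unchanged, mirroring the difference between the first and second FAC target parts.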
The quantization noise may be typically as the expected envelope of the error signal shown on line 4 of
It is thus understood that parameters for the FAC correction are to be sent to the decoder to compensate for this coding error signal, which affects the beginning and end of the TC frame 402. Windowing and aliasing effects are cancelled in a manner that maintains the quantization noise at a proper level, similar to that of the ACELP frame, and that avoids discontinuities at the boundaries between the TC frame 402 and frames coded in other modes such as 404 and 406. These effects can be cancelled using FAC in the frequency-domain. This is accomplished by filtering the MDCT coefficients using information derived from the first and second LPC filters calculated at the boundaries LPC1 and LPC2, although other Frequency-Domain Noise Shaping (FDNS) can also be used.
To efficiently compensate the windowing and time-domain aliasing effects at the beginning and end of the TC frame 402 on line 4 of
To compensate for the windowing and time-domain aliasing effects around marker LPC1, the processing can be as described at the top of
Now, turning to the processing for the windowing and time-domain aliasing correction at the end of the TC frame 402, before marker LPC2, the bottom part of
The entire process of
The received MDCT-coded TC frame 402 is decoded by IMDCT and a resulting time-domain signal 608 is produced between markers LPC1 and LPC2 as shown on line B of
The FAC synthesis signal 506, 512 as in
The windowed and folded (time-inverted) ACELP synthesis 618 from the ACELP frame 404 preceding the TC frame 402 and the ZIR 620 of the ACELP synthesis filter are positioned at the beginning of the TC frame 402. This is shown on line C.
Lines A, B and C are added through adders 622 and 624 to form the synthesis signal 602 for the TC frame in the original domain on line D. This processing has produced, in the TC frame 402, the synthesis signal 602 where time-domain aliasing and windowing effects have been cancelled at the beginning and end of the frame 402, and where the potential discontinuity at the frame boundary around marker LPC1 may further have been smoothed and perceptually masked by the filters 1/W1(z) 505 and 1/W2(z) 511 of
Of course, the addition of the signals from lines A to C can be performed in any order without changing the result of the processing described.
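By way of non-limiting illustration, the addition of lines A, B and C at the decoder can be sketched as follows. The placement of the FAC synthesis parts at the two ends of the frame, the function name and the sample values are assumptions made for illustration only.

```python
def synthesize_tc_frame(fac_start, fac_end, imdct_out, acelp_prediction):
    """Form line D = line A + line B + line C over one transform-coded frame."""
    n = len(imdct_out)
    # Line A: FAC synthesis at the beginning and end of the frame, zero elsewhere.
    line_a = fac_start + [0.0] * (n - len(fac_start) - len(fac_end)) + fac_end
    # Line C: windowed, folded ACELP synthesis plus ZIR at the beginning only.
    line_c = acelp_prediction + [0.0] * (n - len(acelp_prediction))
    # Line B is the IMDCT output itself.
    return [a + b + c for a, b, c in zip(line_a, imdct_out, line_c)]

# Illustrative values that are exactly representable in binary floating point.
imdct_out = [0.5, 1.0, 2.0, 2.0, 1.5, 0.5]
fac_start = [0.25, 0.25]
fac_end = [0.5, 1.0]
acelp_prediction = [0.75, 0.5]

line_d = synthesize_tc_frame(fac_start, fac_end, imdct_out, acelp_prediction)
```

In floating-point arithmetic, different summation orders may differ in the last bits for arbitrary values, although the result is the same in exact arithmetic.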
FAC may also be applied directly to the synthesis output of the TC frame without any windowing at the decoder. In this case, the shape of the FAC is adapted to take into account the different windowing (or lack of windowing) of the decoded TC frame 402.
The length of the FAC frame can be changed during coding. For example, exemplary frame lengths may be 64 or 128 samples depending on the nature of the signal. For example a shorter FAC frame may be used in case of unvoiced signals. Information about the length of the FAC frame can be signaled to the decoder, using for example a 1-bit indicator, or flag, to indicate 64 or 128-sample frames. An example of transmission sequence with signaling FAC length may comprise the following suite:
Further signaling information may be transmitted to indicate certain processing functions to be performed by the decoder. An example is the signaling of the activation of post-processing, specific to ACELP frames. The post-processing can be switched on or off for a certain period consisting of several consecutive ACELP frames. In a transition from TC to ACELP, a 1-bit flag may be included within the FAC information to signal the activation of post-processing. In an embodiment, this flag is only transmitted in a first frame in a sequence of several ACELP frames. Thus the flag may be added to the FAC information, which is also sent for the first ACELP frame.
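By way of non-limiting illustration, the signaling of the FAC length and of the post-processing activation can be sketched with two 1-bit flags. The bit ordering, the mapping of flag values to lengths and the function names below are hypothetical; the disclosure only states that such 1-bit indicators may be used.

```python
def pack_fac_header(fac_length, post_processing=None):
    """Pack signaling bits: FAC length flag, then optional post-processing flag."""
    if fac_length not in (64, 128):
        raise ValueError("FAC length must be 64 or 128 samples")
    bits = [0 if fac_length == 64 else 1]   # assumed mapping: 0 -> 64, 1 -> 128
    if post_processing is not None:         # only sent for the first ACELP frame
        bits.append(1 if post_processing else 0)
    return bits

def parse_fac_header(bits, first_acelp_frame):
    """Recover the FAC length and, when present, the post-processing flag."""
    fac_length = 128 if bits[0] else 64
    post_processing = bool(bits[1]) if first_acelp_frame else None
    return fac_length, post_processing

# First ACELP frame of a sequence: both flags are present.
first_bits = pack_fac_header(64, post_processing=True)
first_parsed = parse_fac_header(first_bits, first_acelp_frame=True)

# Other transitions: only the FAC length flag is present.
later_bits = pack_fac_header(128)
later_parsed = parse_fac_header(later_bits, first_acelp_frame=False)
```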
The device 700 comprises a receiver 710 for receiving a bitstream 701 representative of a coded audio signal including the FAC parameters representative of the FAC target.
Parameters (prm) for ACELP frames from the bitstream 701 are supplied from the receiver 710 to an ACELP decoder 711 including an ACELP synthesis filter. The ACELP decoder 711 produces a zero-input-response (ZIR) 704 of the ACELP synthesis filter. Also, the ACELP synthesis decoder 711 produces an ACELP synthesis signal 702. The ACELP synthesis signal 702 and the ZIR 704 are concatenated to form an ACELP synthesis signal followed by the ZIR. A FAC window 703, having characteristics matching the windowing applied on
Parameters (prm) for TCX 20 frames from the bitstream 701 are supplied to a TCX decoder 706, followed by an IMDCT transform 713 and a window 714 for the IMDCT, to produce a TCX 20 synthesis signal 702 (see 608, 610 and 612 of line B
However, upon a transition between coding modes (for example from an ACELP frame to a TCX 20 frame), a part of the audio signal would not be properly decoded without the use of a FAC processor 715. In the example of
The global output of the adder 716 represents the FAC cancelled synthesis signal (602 of
An audio signal 801 to be coded is applied to the device 800. A logic (not shown) applies ACELP frames of the audio signal 801 to an ACELP coder 810. An output of the ACELP coder 810, the ACELP-coded parameters 802, is applied to a first input of a multiplexer (MUX) 811 for transmission to a receiver (not shown). Another output of the ACELP coder is an ACELP synthesis signal 860 followed by the zero-input response (ZIR) 861 of an ACELP synthesis filter forming part of the ACELP coder 810. A FAC window 805 having characteristics matching the windowing applied on
The logic (not shown) also applies TCX 20 frames (see frame 402 of
Upon a transition between coding modes (for example from an ACELP frame to a TCX 20 frame), some of the audio frames coded by the MDCT module 812 may not be properly decoded without additional information. A calculator 813 provides this additional information, more specifically a coded and quantized FAC target. All components of the calculator 813 may be viewed as a producer of FAC parameters 806. The output of adder 851 is the FAC target (corresponding to line 4 of
The signal at the output of the multiplexer 811 represents the coded audio signal 855 to be transmitted to a receiver (not shown) through a transmitter 856 in a coded bitstream 857.
Those of ordinary skill in the art will realize that the description of the device and method for forward cancelling time-domain aliasing in a coded signal is illustrative only and is not intended to be in any way limiting. Other embodiments will readily suggest themselves to such persons with ordinary skill in the art having the benefit of this disclosure. Furthermore, the disclosed device and method can be customized to offer valuable solutions to existing needs and problems of cancelling time-domain aliasing in a coded signal.
Those of ordinary skill in the art will also appreciate that numerous types of terminals or other apparatuses may embody both aspects of coding for transmission of coded audio, and aspects of decoding following reception of coded audio, in a same device.
In the interest of clarity, not all of the routine features of the implementations of forward cancellation of time-domain aliasing in a coded signal are shown and described. It will, of course, be appreciated that in the development of any such actual implementation of the audio coding, numerous implementation-specific decisions must be made in order to achieve the developer's specific goals, such as compliance with application-, system-, network- and business-related constraints, and that these specific goals will vary from one implementation to another and from one developer to another. Moreover, it will be appreciated that a development effort might be complex and time-consuming, but would nevertheless be a routine undertaking of engineering for those of ordinary skill in the field of audio coding systems having the benefit of this disclosure.
In accordance with this disclosure, the components, process steps, and/or data structures described herein may be implemented using various types of operating systems, computing platforms, network devices, computer programs, and/or general purpose machines. In addition, those of ordinary skill in the art will recognize that devices of a less general purpose nature, such as hardwired devices, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), or the like, may also be used. Where a method comprising a series of process steps is implemented by a computer or a machine and those process steps can be stored as a series of instructions readable by the machine, they may be stored on a tangible medium.
Systems and modules described herein may comprise software, firmware, hardware, or any combination(s) of software, firmware, or hardware suitable for the purposes described herein. Software and other modules may reside on servers, workstations, personal computers, computerized tablets, PDAs, and other devices suitable for the purposes described herein. Software and other modules may be accessible via local memory, via a network, via a browser or other application in an ASP context or via other means suitable for the purposes described herein. Data structures described herein may comprise computer files, variables, programming arrays, programming structures, or any electronic information storage schemes or methods, or any combinations thereof, suitable for the purposes described herein.
Although the present disclosure has been described hereinabove by way of non-restrictive illustrative embodiments thereof, these embodiments can be modified at will within the scope of the appended claims without departing from the spirit and nature of the present disclosure.
Number | Name | Date | Kind |
---|---|---|---|
5297236 | Antill et al. | Mar 1994 | A |
6134518 | Cohen et al. | Oct 2000 | A |
6246645 | Tsutsui | Jun 2001 | B1 |
6314393 | Zheng et al. | Nov 2001 | B1 |
6327691 | Huang | Dec 2001 | B1 |
6475245 | Gersho et al. | Nov 2002 | B2 |
7873227 | Geiger et al. | Jan 2011 | B2 |
20010023396 | Gersho et al. | Sep 2001 | A1 |
20040024588 | Watson et al. | Feb 2004 | A1 |
20050185850 | Vinton et al. | Aug 2005 | A1 |
20080195383 | Shlomot et al. | Aug 2008 | A1 |
20090299757 | Guo et al. | Dec 2009 | A1 |
20110173011 | Geiger et al. | Jul 2011 | A1 |
20110257981 | Beack et al. | Oct 2011 | A1 |
Number | Date | Country |
---|---|---|
1 672 418 | Sep 2005 | CN |
1 954 367 | Apr 2007 | CN |
101 231 850 | Jul 2008 | CN |
2 144 171 | Jan 2010 | EP |
2012118517 | Apr 2002 | JP |
2007526691 | Sep 2007 | JP |
2 323 469 | Apr 2008 | RU |
2 326 449 | Jun 2008 | RU |
2004006226 | Jan 2004 | WO |
2005114654 | Dec 2005 | WO |
2008089705 | Jul 2008 | WO |
2009093466 | Jul 2009 | WO |
2009113316 | Sep 2009 | WO |
2010148516 | Dec 2010 | WO |
2011044700 | Apr 2011 | WO |
2011048117 | Apr 2011 | WO |
2012004349 | Jan 2012 | WO |
Entry |
---|
Bessette et al., “Universal Speech/Audio Coding Using Hybrid ACELP/TCX Techniques”, IEEE Conference, Philadelphia, PA, Mar. 18, 2005, pp. 301-304. |
Lecomte et al., “Efficient cross-fade windows for transitions between LPC-based and non-LPC based audio coding”, 126th Audio Engineering Society Convention Paper 7712, Munich, Germany, May 7, 2009, pp. 1-9. |
Princen et al., “Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation”, IEEE Transactions on Acoustics, Speech and Signal Processing, vol. ASSP-34, No. 5, Oct. 1986, pp. 1153-1161. |
Princen et al., “Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation”, IEEE International Conference on Acoustics, Speech and Signal Processing, vol. 12, 1987, pp. 2161-2164. |
Ferreira, “Convolutional Effects in Transform Coding with TDAC: An Optimal Window”, IEEE Transactions on Speech and Audio Processing, vol. 4, No. 2, Mar. 1996, pp. 104-114. |
Neuendorf et al., “A Novel Scheme for Low Bitrate Unified Speech and Audio Coding-MPEG RM0”, 126th Audio Engineering Society Convention Paper 7713, Munich, Germany, May 7, 2009, pp. 1-13. |
Neuendorf et al., “Unified Speech and Audio Coding Scheme for High Quality at Low Bitrates”, IEEE International Conference on Acoustics, Speech and Signal Processing, 2009, pp. 1-4. |
Information Technology—MPEG audio technologies—Part 3: Unified Speech and Audio Coding, ISO/IEC JTC 1/SC 29N, 2010, pp. 1-158. |
3rd Generation Partnership Project; Technical Specification Group Service and System Aspects; Audio codec processing functions; Extended AMR Wideband codec; Transcoding functions (Release 6), Jun. 2004, pp. 1-72. |
International Standard, ISO/IEC 14496-3, Information technology—coding of audio-visual objects—Part 3 Audio, 2005, pp. 1-1178. |
Neuendorf et al., “Completion of Core Experiment on Unification of USAC Windowing and Frame Transitions”, 91. MPEG Meeting; 18-1-2010-22-1-2010; Kyoto; (Motion Picture Expert Group or ISO/IEC JTC1/SC29/WG11), No. M17167, Jan. 16, 2010, XP030045757, 52 pgs. |
Bessette et al., “Alternatives for windowing in USAC”, 89., MPEG Meeting; Jun. 29, 2009-Jul. 3, 2009; London; (Motion Picture Expert Group or ISO/IEC JTC1/SC29/WG11), No. M16688, Jun. 29, 2009, (XP030045285), 12 pgs. |
Number | Date | Country | |
---|---|---|---|
20120022880 A1 | Jan 2012 | US |
Number | Date | Country | |
---|---|---|---|
61294688 | Jan 2010 | US |