1. Field
The following description generally relates to encoders and decoders and, in particular, to filterbank implementations for Advanced Audio Coding (AAC) and AAC Enhanced Low Delay (ELD).
2. Background
One goal of audio coding is to compress an audio signal into a desired limited information quantity while keeping as much of the original sound quality as possible. In an encoding process, an audio signal in the time domain is transformed into the frequency domain.
Advanced Audio Coding (AAC) is a standardized, lossy compression and encoding scheme for digital audio that is specified as part of the Moving Picture Experts Group (MPEG) standards. AAC is a wideband audio coding algorithm that exploits two primary coding strategies to dramatically reduce the amount of data needed to represent high-quality digital audio. First, signal components that are perceptually irrelevant are discarded. Second, redundancies in the coded audio signal are eliminated. In order to apply these techniques, the signal is first processed by a modified discrete cosine transform (MDCT). The MDCT is a Fourier-related transform based on the type-IV discrete cosine transform (DCT-IV), with the additional property of being lapped. The relation of the MDCT to the DCT-IV and Fourier transforms allows such filterbanks to be implemented very efficiently by using so-called “fast” algorithms related to the Fast Fourier Transform (FFT) algorithm (see K. R. Rao and P. Yip, “Discrete Cosine Transform: Algorithms, Advantages, Applications”, Academic Press, 1990, ISBN 012580203X).
An emerging MPEG AAC-ELD (Enhanced Low-Delay) codec is designed to combine the advantages of perceptual audio coding with the low delay necessary for two-way communication. However, AAC-ELD uses a different filterbank structure than the traditional AAC codec. This filterbank is not compatible with MDCT or DCT-IV transforms and cannot be directly computed by existing fast algorithms. This increases the complexity and cost of implementing AAC-ELD. This also increases complexity and cost when both types of algorithms are to be implemented on the same DSP core. Therefore, there is a need for a simpler way to implement AAC-ELD or both AAC and AAC-ELD codec algorithms on the same DSP core.
The following presents a simplified summary of one or more embodiments in order to provide a basic understanding of some embodiments. This summary is not an extensive overview of all contemplated embodiments, and is intended to neither identify key or critical elements of all embodiments nor delineate the scope of any or all embodiments. Its sole purpose is to present some concepts of one or more embodiments in a simplified form as a prelude to the more detailed description that is presented later.
An encoder is provided that includes a core MDCT analysis filterbank that can be used to implement an advanced audio coding (AAC) algorithm, an AAC-enhanced low delay (ELD) algorithm or both algorithms. For the AAC algorithm, input samples are sent directly to the MDCT analysis filterbank to obtain output samples. For the AAC-ELD algorithm, the signs of a first set of alternating input samples are inverted, the MDCT analysis filterbank is applied to obtain spectral coefficient output samples, the order of the spectral coefficient output samples is reversed, and the signs of a second set of alternating spectral coefficient output samples are inverted.
According to one example, an encoder is provided that implements an analysis filterbank using a common core modified discrete cosine transform. A sequence of input samples is obtained and the signs of a first set of alternating input samples are inverted. Spectral coefficient output samples are generated by applying a modified discrete cosine transform (MDCT) to the sequence of input samples. The order of the spectral coefficient output samples is reversed and the signs of a second set of alternating spectral coefficient output samples are then inverted.
In one example, the sequence of input samples is N samples long, and inverting the signs of the first set of alternating input samples includes: (a) inverting the signs of the even-indexed input samples of the sequence if N/4 is an even number; and (b) inverting the signs of the odd-indexed input samples of the sequence if N/4 is an odd number.
In another example, the sequence of input samples is N samples long, and inverting the signs of the second set of alternating spectral coefficient output samples includes: (a) inverting the signs of the odd-indexed spectral coefficient output samples if N/2 is an even number; and (b) inverting the signs of the even-indexed spectral coefficient output samples if N/2 is an odd number. In one mode of operation, the MDCT may operate as an advanced audio coding (AAC) filterbank. In another mode of operation, the analysis filterbank may operate as an AAC enhanced low-delay (ELD) filterbank.
Similarly, a decoder is provided that implements a synthesis filterbank using a common core inverse modified discrete cosine transform. A sequence of input spectral coefficients is obtained and the signs of a first set of alternating spectral coefficients are inverted. The order of the input spectral coefficients is reversed. Output samples are generated by applying an inverse modified discrete cosine transform (IMDCT) to the spectral coefficients. The signs of a second set of alternating output samples are then inverted.
In one example, the sequence of input spectral coefficients is N samples long, and inverting the signs of the first set of alternating input spectral coefficients includes: (a) inverting the signs of the odd-indexed spectral coefficients if N/2 is an even number; and (b) inverting the signs of the even-indexed spectral coefficients if N/2 is an odd number.
In another example, the sequence of input spectral coefficients is N samples long, and inverting the signs of the second set of alternating output samples includes: (a) inverting the signs of the odd-indexed output samples if N/4 is an odd number; and (b) inverting the signs of the even-indexed output samples if N/4 is an even number. In one mode of operation, the IMDCT may operate as an advanced audio coding (AAC) filterbank. In another mode of operation, the synthesis filterbank may operate as an AAC enhanced low-delay (ELD) filterbank.
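The parity rules stated in the encoder and decoder examples above can be tabulated with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, N is assumed to be a multiple of 4 so that N/4 is an integer, and the spectral sequence is assumed to hold N/2 coefficients.

```python
def alternating_signs(length, invert_even):
    """Return +1/-1 factors that invert either the even- or the odd-indexed entries."""
    return [-1 if (i % 2 == 0) == invert_even else 1 for i in range(length)]


def encoder_sign_patterns(n):
    """Sign factors for the encoder: (input samples, spectral outputs after reversal)."""
    if n % 4 != 0:
        raise ValueError("this sketch assumes N is a multiple of 4")
    # First set: even-indexed inputs if N/4 is even, odd-indexed if N/4 is odd.
    pre = alternating_signs(n, invert_even=(n // 4) % 2 == 0)
    # Second set: odd-indexed outputs if N/2 is even, even-indexed if N/2 is odd.
    # (With N a multiple of 4, N/2 is always even, so the odd-indexed case applies.)
    post = alternating_signs(n // 2, invert_even=(n // 2) % 2 == 1)
    return pre, post


def decoder_sign_patterns(n):
    """Sign factors for the decoder: (spectral inputs before reversal, output samples)."""
    if n % 4 != 0:
        raise ValueError("this sketch assumes N is a multiple of 4")
    # First set: odd-indexed coefficients if N/2 is even, even-indexed if N/2 is odd.
    pre = alternating_signs(n // 2, invert_even=(n // 2) % 2 == 1)
    # Second set: odd-indexed outputs if N/4 is odd, even-indexed if N/4 is even.
    post = alternating_signs(n, invert_even=(n // 4) % 2 == 0)
    return pre, post
```

For N=1024 and N=960, both N/4 and N/2 are even, so under these rules the encoder inverts the even-indexed inputs and the odd-indexed outputs in either case.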
Various features, nature, and advantages may become apparent from the detailed description set forth below when taken in conjunction with the drawings in which like reference characters identify correspondingly throughout.
Various embodiments are now described with reference to the drawings, wherein like reference numerals are used to refer to like elements throughout. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of one or more embodiments. It may be evident, however, that such embodiment(s) may be practiced without these specific details. In other instances, well-known structures and devices are shown in block diagram form in order to facilitate describing one or more embodiments.
One feature provides a way to implement AAC-ELD or both AAC and AAC-ELD algorithms using the same core MDCT analysis filterbank and core IMDCT synthesis filterbank.
An encoder may include a core MDCT analysis filterbank that can be used to implement AAC-ELD or both AAC and AAC-ELD algorithms. For the AAC algorithm, input samples are sent directly to the MDCT analysis filterbank to obtain output samples. For the AAC-ELD algorithm, a vector of residual values of input samples is formed and the signs of a first set of alternating input samples are inverted. Spectral coefficient output samples are generated by applying a modified discrete cosine transform (MDCT) to the sequence of input samples. The order of the spectral coefficient output samples is then reversed and the signs of a second set of alternating spectral coefficient output samples are inverted.
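The encoder steps above can be checked numerically with a small, unoptimized sketch. Several things here are assumptions rather than statements from the text: the AAC-ELD analysis filterbank is taken to be a sum over 2N windowed samples z(−N)…z(N−1) with scale −2 and offset n0=(−N/2+1)/2, the MDCT is taken to have scale 2 and offset p0=(N/2+1)/2, the residual vector is taken to be z(n)−z(n−N), and all function names are hypothetical. A production implementation would replace the direct O(N²) `mdct` with a fast algorithm.

```python
import math


def mdct(z):
    """Direct MDCT of N samples into N/2 coefficients (assumed scale 2, offset p0)."""
    n_len = len(z)
    p0 = (n_len / 2 + 1) / 2
    return [2.0 * sum(z[n] * math.cos(2 * math.pi / n_len * (n + p0) * (k + 0.5))
                      for n in range(n_len))
            for k in range(n_len // 2)]


def eld_analysis_direct(z2):
    """Reference AAC-ELD analysis; z2[n + N] holds z(n) for -N <= n < N (assumed form)."""
    n_len = len(z2) // 2
    n0 = (-n_len / 2 + 1) / 2
    return [-2.0 * sum(z2[n + n_len] * math.cos(2 * math.pi / n_len * (n + n0) * (k + 0.5))
                       for n in range(-n_len, n_len))
            for k in range(n_len // 2)]


def eld_analysis_via_mdct(z2):
    """AAC-ELD analysis built from the core MDCT plus sign inversion and reversal."""
    n_len = len(z2) // 2
    # Residual vector z(n) - z(n - N) for 0 <= n < N (assumed definition).
    r = [z2[n + n_len] - z2[n] for n in range(n_len)]
    # First set: invert even-indexed samples if N/4 is even, else odd-indexed.
    invert_even_in = (n_len // 4) % 2 == 0
    r = [-v if (n % 2 == 0) == invert_even_in else v for n, v in enumerate(r)]
    # Core transform, then reverse the order of the spectral coefficients.
    c = mdct(r)
    c.reverse()
    # Second set: invert odd-indexed outputs if N/2 is even, else even-indexed.
    invert_even_out = (n_len // 2) % 2 == 1
    return [-v if (k % 2 == 0) == invert_even_out else v for k, v in enumerate(c)]
```

Calling both `eld_analysis_direct` and `eld_analysis_via_mdct` on the same list of 2N samples (with N a multiple of 4) and comparing the results element by element is one way to confirm that the wrapper reproduces the reference under the stated assumptions.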
Similarly, a decoder may include a core IMDCT synthesis filterbank that can be used to implement AAC-ELD or both AAC and AAC-ELD algorithms. For the AAC algorithm, input samples are sent directly to the IMDCT synthesis filterbank to obtain output samples. For the AAC-ELD algorithm, a sequence of input spectral coefficients is obtained and the signs of a first set of alternating spectral coefficients are inverted. The order of the input spectral coefficients is reversed. Output samples are generated by applying an inverse modified discrete cosine transform (IMDCT) to the spectral coefficients. The signs of a second set of alternating output samples are then inverted.
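The decoder steps above admit a similar numerical check. As before, this is a sketch under assumptions not stated in the text: the AAC-ELD synthesis filterbank is taken to produce 2N samples with scale −2/N and offset n0=(−N/2+1)/2, the IMDCT is taken to have scale 2/N and offset p0=(N/2+1)/2, and the function names are hypothetical.

```python
import math


def imdct(spec):
    """Direct IMDCT of N/2 coefficients into N samples (assumed scale 2/N, offset p0)."""
    n_len = 2 * len(spec)
    p0 = (n_len / 2 + 1) / 2
    return [2.0 / n_len * sum(spec[k] * math.cos(2 * math.pi / n_len * (n + p0) * (k + 0.5))
                              for k in range(len(spec)))
            for n in range(n_len)]


def eld_synthesis_direct(spec):
    """Reference AAC-ELD synthesis producing 2N samples (assumed form)."""
    n_len = 2 * len(spec)
    n0 = (-n_len / 2 + 1) / 2
    return [-2.0 / n_len * sum(spec[k] * math.cos(2 * math.pi / n_len * (n + n0) * (k + 0.5))
                               for k in range(len(spec)))
            for n in range(2 * n_len)]


def eld_synthesis_via_imdct(spec):
    """AAC-ELD synthesis built from the core IMDCT plus sign inversion and reversal."""
    half = len(spec)
    n_len = 2 * half
    # First set: invert odd-indexed coefficients if N/2 is even, else even-indexed.
    invert_even_in = half % 2 == 1
    y = [-v if (k % 2 == 0) == invert_even_in else v for k, v in enumerate(spec)]
    # Reverse the order of the spectral coefficients, then apply the core transform.
    y.reverse()
    x = imdct(y)
    # Second set: invert even-indexed outputs if N/4 is even, else odd-indexed.
    invert_even_out = (n_len // 4) % 2 == 0
    x = [-v if (n % 2 == 0) == invert_even_out else v for n, v in enumerate(x)]
    # The second half of the 2N outputs is the negated first half.
    return x + [-v for v in x]
```

The last line relies on the antisymmetry of the assumed synthesis formula, under which the output at index n+N is the negative of the output at index n.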
Because both AAC and AAC-ELD filterbanks may be implemented using the same MDCT and IMDCT core modules, this allows reusability of existing code with only a few minor modifications. If only an AAC-ELD filterbank is to be implemented, the disclosed methods offer a simple solution utilizing known fast MDCT filterbank implementations.
The AAC ELD core coder analysis (Equation 1) and synthesis (Equation 2) filterbanks can be defined as follows:
where z(n) denotes windowed input data samples, X(k) denotes subband coefficients, and x(n) denotes reconstructed samples (prior to aliasing cancellation). In one example, N may be 1024 or 960.
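Equations 1 and 2 can be sketched in LaTeX as follows. This is a reconstruction, not a reproduction: the scale factors and summation limits are assumed to follow the published MPEG-4 AAC-ELD low-delay filterbank, and n_0 = (−N/2+1)/2 is the offset defined later in this description.

```latex
\begin{align}
X(k) &= -2 \sum_{n=-N}^{N-1} z(n)\,
        \cos\!\left[\frac{2\pi}{N}\,(n+n_0)\left(k+\tfrac{1}{2}\right)\right],
        & 0 \le k < \tfrac{N}{2}, \tag{1}\\
x(n) &= -\frac{2}{N} \sum_{k=0}^{N/2-1} X(k)\,
        \cos\!\left[\frac{2\pi}{N}\,(n+n_0)\left(k+\tfrac{1}{2}\right)\right],
        & 0 \le n < 2N. \tag{2}
\end{align}
```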
The Modified Discrete Cosine Transform (MDCT) (Equation 3) and the Inverse MDCT (IMDCT) (Equation 4) are usually defined as follows:
where z(n) denotes windowed input data samples, X̃(k) denotes MDCT spectral coefficients, and x̃(n) denotes reconstructed samples (prior to aliasing cancellation).
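Equations 3 and 4 can likewise be sketched in LaTeX. The scale factors below are an assumption taken from the commonly published AAC form of the MDCT and IMDCT, with the offset p_0 = (N/2+1)/2 that appears in the definitions of this description.

```latex
\begin{align}
\tilde{X}(k) &= 2 \sum_{n=0}^{N-1} z(n)\,
        \cos\!\left[\frac{2\pi}{N}\,(n+p_0)\left(k+\tfrac{1}{2}\right)\right],
        & 0 \le k < \tfrac{N}{2}, \tag{3}\\
\tilde{x}(n) &= \frac{2}{N} \sum_{k=0}^{N/2-1} \tilde{X}(k)\,
        \cos\!\left[\frac{2\pi}{N}\,(n+p_0)\left(k+\tfrac{1}{2}\right)\right],
        & 0 \le n < N. \tag{4}
\end{align}
```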
where:
z_{i,n}=windowed input sequence
X_{i,k}=output spectral coefficients
n=sample index
k=spectral coefficient index
i=block index
N=window length based on the window_sequence value
p0=(N/2+1)/2.
The analysis filterbank in AAC-ELD output X_{i,k} can be represented by:
where:
z_{i,n}=windowed input sequence
X_{i,k}=output spectral coefficients
n=sample index
k=spectral coefficient index
i=block index
N=window length based on the window_sequence value
n0=(−N/2+1)/2.
In the case of an AAC-ELD analysis filterbank, for
it can be shown that:
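which allows reuse of the core MDCT filterbank for its implementation. Note that the summation on the right hand side is an MDCT (e.g., as in Equation 3). The algorithm for the analysis filterbank may include: forming the vector of residual values of the input samples; inverting the signs of a first set of alternating samples of that vector; applying the MDCT to obtain spectral coefficient output samples; reversing the order of the spectral coefficient output samples; and inverting the signs of a second set of alternating spectral coefficient output samples.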
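The relation referred to above can be derived as sketched below. The derivation assumes that the AAC-ELD analysis filterbank has the form X_{i,k} = −2 Σ_{n=−N}^{N−1} z_{i,n} cos[(2π/N)(n+n_0)(k+½)], that the MDCT has scale 2 and offset p_0, and that N/4 is an integer. The symbol \tilde{z}_{i,n} for the residual sequence is introduced here for illustration.

```latex
\begin{align*}
\tilde{z}_{i,n} &= z_{i,n} - z_{i,n-N}, \qquad 0 \le n < N, \\
X_{i,k} &= -2 \sum_{n=0}^{N-1} \tilde{z}_{i,n}\,
           \cos\!\left[\frac{2\pi}{N}\,(n+n_0)\left(k+\tfrac{1}{2}\right)\right], \\
X_{i,\,N/2-1-k} &= (-1)^{k+1}\; 2 \sum_{n=0}^{N-1} (-1)^{\,n+N/4+1}\,\tilde{z}_{i,n}\,
           \cos\!\left[\frac{2\pi}{N}\,(n+p_0)\left(k+\tfrac{1}{2}\right)\right].
\end{align*}
```

The second line holds because shifting n by N changes the cosine argument by 2π(k+½), which negates the cosine. The third line follows from n_0 = p_0 − N/2 and the substitution k → N/2−1−k. The factor (−1)^{n+N/4+1} corresponds to the first set of sign inversions, the index N/2−1−k to the order reversal, and the factor (−1)^{k+1} to the second set of sign inversions.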
where:
X=spectral coefficients
n=sample index
i=window index
k=spectral coefficient index
N=window length
p0=(N/2+1)/2
with N=1920 or 2048 (for example).
The AAC-ELD synthesis filterbank output x_{i,n} can be represented by:
where:
X=spectral coefficients
n=sample index
i=window index
k=spectral coefficient index
N=window length
n0=(−N/2+1)/2
with N=960 or 1024 (for example).
In the case of AAC-ELD synthesis filterbank, it can be shown that,
x_{i,n+N} = −x_{i,n} for 0 ≤ n < N.
Consequently, for 0 ≤ n < N, the filterbank output x_{i,n} can be represented as:
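which allows reuse of the core IMDCT filterbank for its implementation. Note that the summation on the right hand side is an IMDCT (e.g., as in Equation 4). The algorithm for the synthesis filterbank may include: inverting the signs of a first set of alternating input spectral coefficients; reversing the order of the spectral coefficients; applying the IMDCT to obtain output samples; and inverting the signs of a second set of alternating output samples.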
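The synthesis relation can be sketched in the same way. The form below assumes that the AAC-ELD synthesis filterbank is x_{i,n} = −(2/N) Σ_{k=0}^{N/2−1} X_{i,k} cos[(2π/N)(n+n_0)(k+½)] for 0 ≤ n < 2N, that the IMDCT has scale 2/N and offset p_0, and that N/4 is an integer.

```latex
\begin{align*}
x_{i,n} &= (-1)^{\,n+N/4+1}\;\frac{2}{N} \sum_{k=0}^{N/2-1} (-1)^{k+1}\, X_{i,\,N/2-1-k}\,
           \cos\!\left[\frac{2\pi}{N}\,(n+p_0)\left(k+\tfrac{1}{2}\right)\right],
           \qquad 0 \le n < N, \\
x_{i,n+N} &= -\,x_{i,n}, \qquad 0 \le n < N.
\end{align*}
```

Here the factor (−1)^{k+1} applied to the reversed coefficients corresponds to the first set of sign inversions followed by the order reversal, and the factor (−1)^{n+N/4+1} to the second set of sign inversions applied to the IMDCT output.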
Consequently, both AAC and AAC-ELD filterbanks can be implemented by using the same N-point MDCT core transform or IMDCT core transform. Support for both types of filterbanks is possible by using only order-reversal and sign-inversion operations, with minimal impact on the overall complexity of the implementation.
Information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals and the like that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles or any combination thereof.
The various illustrative logical blocks, modules and circuits and algorithm steps described herein may be implemented or performed as electronic hardware, software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. It is noted that the configurations may be described as a process that is depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination corresponds to a return of the function to the calling function or the main function.
When implemented in hardware, various examples may employ a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core or any other such configuration.
When implemented in software, various examples may employ firmware, middleware or microcode. The program code or code segments to perform the necessary tasks may be stored in a computer-readable medium such as a storage medium or other storage(s). A processor may perform the necessary tasks. A code segment may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, etc.
As used in this application, the terms “component,” “module,” “system,” and the like are intended to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, software, or software in execution. For example, a component may be, but is not limited to being, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. By way of illustration, both an application running on a computing device and the computing device can be a component. One or more components can reside within a process and/or thread of execution and a component may be localized on one computer and/or distributed between two or more computers. In addition, these components can execute from various computer readable media having various data structures stored thereon. The components may communicate by way of local and/or remote processes such as in accordance with a signal having one or more data packets (e.g., data from one component interacting with another component in a local system, distributed system, and/or across a network such as the Internet with other systems by way of the signal).
In one or more examples herein, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media include both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a computer. By way of example, and not limitation, such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media. Software may comprise a single instruction, or many instructions, and may be distributed over several different code segments, among different programs and across multiple storage media.
An exemplary storage medium may be coupled to a processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor.
The methods disclosed herein comprise one or more steps or actions for achieving the described method. The method steps and/or actions may be interchanged with one another without departing from the scope of the claims. In other words, unless a specific order of steps or actions is required for proper operation of the embodiment that is being described, the order and/or use of specific steps and/or actions may be modified without departing from the scope of the claims.
One or more of the components, steps, and/or functions illustrated in
It should be noted that the foregoing configurations are merely examples and are not to be construed as limiting the claims. The description of the configurations is intended to be illustrative, and not to limit the scope of the claims. As such, the present teachings can be readily applied to other types of apparatuses and many alternatives, modifications, and variations will be apparent to those skilled in the art.
The present Application for Patent claims priority to U.S. Provisional Application No. 60/980,418, entitled “Efficient Joint Implementation of analysis and Synthesis Filterbanks For MPEG AAC and MPEG AAC ELD Encoders/Decoders” filed Oct. 16, 2007, and assigned to the assignee hereof and hereby expressly incorporated by reference herein.
Number | Date | Country
---|---|---
60980418 | Oct 2007 | US