This application claims the benefit under 35 U.S.C. §119(a) of Korean Patent Application No. 10-2010-0091040, filed on Sep. 16, 2010, in the Korean Intellectual Property Office, the entire disclosure of which is incorporated herein by reference for all purposes.
1. Field
The following description relates to a method of encoding and decoding a multi-channel audio signal, and more particularly, to a method and apparatus for encoding and decoding a high-frequency signal of the multi-channel audio signal.
2. Description of Related Art
Multi-channel audio coding schemes may generally include a waveform multi-channel audio coding scheme and a parametric multi-channel audio coding scheme.
The waveform multi-channel audio coding scheme may be classified as a moving picture expert group (MPEG)-2 multi channel extension (MC) audio coding scheme, an advanced audio coding (AAC) MC audio coding scheme, a bit sliced arithmetic coding/audio video standard MC (BSAC/AVS MC) audio coding scheme, and the like.
The parametric multi-channel audio coding scheme may include an MPEG surround scheme, and the MPEG surround scheme may restore a multi-channel audio signal using a downmixed signal and spatial information.
The MPEG surround scheme may down-mix the multi-channel audio signal and parameterize the spatial information to compress the multi-channel audio signal, and may restore the multi-channel audio signal with only a small amount of information. The MPEG-surround scheme may be used together with a Spectral Band Replication (SBR) coding scheme to increase compression efficiency.
In one general aspect there is provided a multi-channel audio signal encoding apparatus including a downmixer configured to downmix a multi-channel audio input signal, a channel decorrelator configured to expand a number of channels of the downmixed signal thereby providing an expanded channel signal, a parameter estimator configured to select at least one signal from among the expanded channel signal, and to extract a parameter indicating a characteristic relation between the selected signal and the multi-channel audio input signal and a bitmuxer configured to encode the downmixed signal and the extracted parameter.
The channel decorrelator may expand the number of channels of the downmixed signal through linear combination or decorrelation.
The bitmuxer may encode the extracted parameter and a signal associated with a high frequency band signal of the multi-channel audio input signal from among the downmixed signal.
The parameter estimator may select, from among the downmixed signal and the expanded channel signal, at least one signal having a maximal value when a match function is applied to the downmixed signal and the expanded channel signal with each input signal of the multi-channel audio input signal, and may extract a parameter indicating a characteristic relation between the selected signal and the multi-channel audio input signal.
In another aspect, there is provided a multi-channel audio signal decoding apparatus including a bitdemuxer configured to restore, from an input bitstream that is obtained by encoding a multi-channel audio signal, a downmixed signal of the multi-channel audio signal, a parameter decoder configured to restore, from the input bit stream, a parameter to be used for restoring a channel signal included in the multi-channel audio signal, and a channel decorrelator configured to expand a number of channels of the restored downmixed signal. The multi-channel audio decoding apparatus further includes a high-frequency signal synthesizer configured to select, from the downmixed signal of which the number of channels is expanded, a channel signal to be patched using the restored parameter and a spatial synthesizer configured to restore the channel signal included in the multi-channel audio signal using the selected channel signal and the restored parameter information.
The channel decorrelator may expand the number of channels of the downmixed signal, through linear combination or decorrelation.
In another aspect, there is provided a multi-channel audio signal encoding method of a transmitter including downmixing a multi-channel audio input signal, expanding a number of channels of the downmixed signal, selecting at least one signal from among the expanded channel signal, extracting a characteristic relation between the selected signal and the multi-channel audio input signal, and encoding the downmixed signal and the extracted parameter.
The expanding may include expanding the number of channels of the downmixed signal through linear combination or decorrelation.
The encoding may include encoding the extracted parameter and a signal associated with a high frequency band signal of the multi-channel audio input signal from among the downmixed signal.
The selecting and extracting may include selecting, from among the downmixed signal and the expanded channel signal, at least one signal having a maximal value when a match function is applied to the downmixed signal and the expanded channel signal with each input signal of the multi-channel audio input signal and extracting a parameter indicating a characteristic relation between the selected signal and the multi-channel audio input signal.
A non-transitory computer readable storage medium may store a program to implement the multi-channel audio encoding method.
In another aspect there is provided a multi-channel audio signal decoding method of a receiver including restoring, from an input bitstream that is obtained by encoding a multi-channel audio signal, a downmixed signal of the multi-channel audio signal, restoring, from the input bitstream, a parameter to be used for restoring a channel signal included in the multi-channel audio signal, expanding a number of channels of the restored downmixed signal, selecting, from the downmixed signal of which the number of channels is expanded, a channel signal to be patched using the restored parameter, and restoring the channel signal included in the multi-channel audio signal using the selected channel signal and the restored parameter information.
The expanding may include expanding the number of channels of the downmixed signal through linear combination or decorrelation.
In still another general aspect, there is provided a transmitter having a multi-channel audio signal encoding apparatus, the multi-channel audio signal encoding apparatus including a downmixer configured to downmix a multi-channel audio input signal received at the transmitter and a channel decorrelator configured to expand a number of channels of the downmixed signal thereby providing an expanded channel signal. The encoding apparatus further includes a parameter estimator configured to select at least one signal from among the expanded channel signal, and to extract a parameter indicating a characteristic relation between the selected signal and the multi-channel audio input signal and a bitmuxer configured to encode the downmixed signal and the extracted parameter. The transmitter transmits the encoded downmixed signal and extracted parameter.
In another general aspect, there is provided a receiver having a multi-channel audio signal decoding apparatus, the multi-channel audio signal decoding apparatus including a bitdemuxer configured to restore, from an input bitstream that is obtained by encoding a multi-channel audio signal, a downmixed signal of the multi-channel audio signal, a parameter decoder configured to restore, from the input bit stream, a parameter to be used for restoring a channel signal included in the multi-channel audio signal, and a channel decorrelator configured to expand a number of channels of the restored downmixed signal. The signal decoding apparatus further includes a high-frequency signal synthesizer configured to select, from the downmixed signal of which the number of channels is expanded, a channel signal to be patched using the restored parameter and a spatial synthesizer configured to restore the channel signal included in the multi-channel audio signal using the selected channel signal and the restored parameter information.
Other features and aspects may be apparent from the following detailed description, the drawings, and the claims.
Throughout the drawings and the detailed description, unless otherwise described, the same drawing reference numerals should be understood to refer to the same elements, features, and structures. The relative size and depiction of these elements may be exaggerated for clarity, illustration, and convenience.
The following detailed description is provided to assist the reader in gaining a comprehensive understanding of the methods, apparatuses and/or systems described herein. Accordingly, various changes, modifications, and equivalents of the systems, apparatuses and/or methods described herein may be suggested to those of ordinary skill in the art. Also, descriptions of well-known functions and constructions may be omitted for increased clarity and conciseness.
In this example, multi-channel signals y1, y2, . . . , yN are inputted to a downmixer 110.
The downmixer 110 down-mixes, based on a moving picture expert group (MPEG) surround scheme, the multi-channel signals into 2-channel signals x1 and x2.
A spatial parameter extractor 120 expresses low frequency band signals of the multi-channel signals y1, y2, . . . , yN by spatial parameters indicating spatial correlations between channels.
A channel decorrelator 140 generates additional signals x3, x4, and the like by expanding channels using high frequency band signals of the downmixed signals x1 and x2, and may generate base signal sets.
The parameter estimator 150 generates parameters corresponding to envelopes of the high frequency band signals, based on correlation between signals x1, x2, x3, x4, and the like corresponding to the base signal sets and high-frequency band signals of the inputted multi-channel signals y1, y2, . . . , yN.
The above described process will be described with reference to the examples in
In the process, when the high-frequency band signals corresponding to a jth subband of the inputted multi-channel signals y1, y2, . . . , yN are Y0j, Y1j, Y2j, Y3j, and Y4j, downmixed signals X0j and X1j may be calculated as expressed by Equation 1.
In Equation 1, the downmixed signals X0j and X1j are calculated in the same manner as a downmixing process based on an MPEG surround scheme.
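As a non-limiting illustration, the downmixing of one subband may be sketched as follows. Equation 1 is not reproduced in this text, so the channel grouping (Y0j and Y1j contributing to X0j, Y3j and Y4j contributing to X1j, and Y2j shared between both) and the weight `center_gain` are assumptions made for the sketch, and the function name `downmix_subband` is hypothetical.

```python
import math


def downmix_subband(y, center_gain=1.0 / math.sqrt(2.0)):
    """Downmix five subband channel signals [Y0, Y1, Y2, Y3, Y4] to (X0, X1).

    Assumption (not taken from Equation 1): Y0 and Y1 feed X0, Y3 and Y4
    feed X1, and Y2 is added to both outputs scaled by center_gain.
    """
    if len(y) != 5:
        raise ValueError("expected five channel signals")
    n = len(y[0])
    if any(len(ch) != n for ch in y):
        raise ValueError("all channels must have the same length")
    x0 = [y[0][i] + y[1][i] + center_gain * y[2][i] for i in range(n)]
    x1 = [y[3][i] + y[4][i] + center_gain * y[2][i] for i in range(n)]
    return x0, x1
```

Other weights or channel groupings may be substituted without changing the structure of the sketch.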
The high frequency signals may be restored based on a conventional Spectral Band Replication (SBR) coding scheme.
High-frequency signals X2j and X3j that are additionally generated based on the downmixed signals X0j and X1j are calculated as expressed by Equation 2.
In Equation 2, the additional high-frequency signals X2j and X3j may be generated by the channel decorrelator 140.
The base signal sets that are generated after the additional high-frequency signals are generated are expressed below in Equation 3.
In Equation 3, signals X0j, X1j, X2j and X3j are candidate values for an optimal signal to be used for extracting the parameters indicating a characteristic relation between the multi-channel audio input signals and a signal selected by the parameter estimator 150.
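As a non-limiting illustration, the generation of the base signal set by linear combination may be sketched as follows. Equation 2 is not reproduced in this text, so the particular combinations used here (a scaled sum for X2j and a scaled difference for X3j) are assumptions, and the function name `expand_channels` is hypothetical.

```python
def expand_channels(x0, x1):
    """Return a base signal set [X0, X1, X2, X3] for one subband.

    Assumption (not taken from Equation 2): X2 is the scaled sum and X3 is
    the scaled difference of the two downmixed signals.
    """
    if len(x0) != len(x1):
        raise ValueError("downmixed signals must have the same length")
    x2 = [0.5 * (a + b) for a, b in zip(x0, x1)]
    x3 = [0.5 * (a - b) for a, b in zip(x0, x1)]
    return [list(x0), list(x1), x2, x3]
```

A decorrelation filter may be used in place of, or together with, the linear combination shown in the sketch.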
The high-frequency signals of the multi-channel signals may be restored by selecting a signal to be patched from signals X0j, X1j, X2j, and X3j, in the same manner as selecting a signal to be patched from a low frequency signal during a bandwidth extension process.
The high frequency signals of the multi-channel signals may be restored by selecting, from among the signals, a signal that is most similar to a high frequency signal of an original signal.
In this example, the parameter estimator 150 selects an optimal signal from among the expanded channel signals.
The optimal signal may be a channel signal having a maximal value among the downmixed signals and the expanded channel signals, when a match function is applied to the downmixed signals and the expanded channel signals with each input signal of the multi-channel signals.
As for signals X0j, X1j, X2j, and X3j, a characteristic of a signal (Y0j+Y1j) may be dominant in a signal X0j or a signal X3j, and a characteristic of a signal (Y3j+Y4j) may be dominant in a signal X1j or a signal X3j.
A signal component Y2j may be dominant in a signal X2j.
An energy matching equation is applied to the candidate signals, and a signal having a maximal value is selected, from among the candidate signals, as a signal to be patched, that is, the optimal signal.
The process will be described with reference to the example in
Referring to
A match function calculator 220 receives the generated channel signals X0j, X1j, X2j, and X3j, and calculates a matching function value of each of the signals as expressed by Equation 4.
A signal having a maximal matching function value R(Ysj,Xkj) is determined as an optimal channel signal.
A base signal selector 210 selects a base signal based on Equation 5.
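As a non-limiting illustration, the calculation of the matching function value and the selection of the base signal may be sketched as follows. Equations 4 and 5 are not reproduced in this text, so the use of a normalized cross-correlation magnitude as the matching function is an assumption, and the function names `match_value` and `select_base_signal` are hypothetical.

```python
import math


def match_value(y, x):
    """Matching function value R(Y, X) for one input signal and one candidate.

    Assumption (not taken from Equation 4): R is the magnitude of the
    normalized cross-correlation between the two subband signals.
    """
    num = abs(sum(a * b for a, b in zip(y, x)))
    den = math.sqrt(sum(a * a for a in y) * sum(b * b for b in x))
    return num / den if den > 0.0 else 0.0


def select_base_signal(y, candidates):
    """Return (k, values), where k indexes the candidate with the maximal R."""
    values = [match_value(y, x) for x in candidates]
    best = max(range(len(values)), key=values.__getitem__)
    return best, values
```

In the sketch, the selection is repeated for each input channel signal Ysj, and the returned index corresponds to the channel signal to be patched.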
A gain estimator 230 generates information associated with gain values corresponding to envelopes of an SBR coding scheme with respect to high-frequency band signals of multi-channel audio input signals.
As an example, a gain value may be calculated based on an energy ratio of a signal to be patched with an original signal as expressed by Equation 6.
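As a non-limiting illustration, the gain calculation may be sketched as follows. Equation 6 is not reproduced in this text, so taking the square root of the energy ratio (so that the scaled patch signal has the energy of the original signal) is an assumption, and the function name `estimate_gain` is hypothetical.

```python
import math


def estimate_gain(original, patch):
    """Gain that scales the patch signal to the energy of the original signal.

    Assumption (not taken from Equation 6): the gain is the square root of
    the ratio of the original energy to the patch energy in the subband.
    """
    e_orig = sum(v * v for v in original)
    e_patch = sum(v * v for v in patch)
    if e_patch <= 0.0:
        return 0.0  # nothing to scale when the patch signal is silent
    return math.sqrt(e_orig / e_patch)
```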
Referring again to
Here, a multi-channel decoding process is performed in reverse order of the multi-channel encoding process described with reference to
First, a bitdemuxer 310 demuxes a transmitted bit stream.
A waveform decoder 320 decodes the waveform of the demuxed bit stream received from the bitdemuxer 310.
According to one example, multi-channel signals in a low frequency are restored using the transmitted downmixed signals and spatial parameters extracted by the spatial parameter extractor 120.
A spatial synthesizer 340 synthesizes multi-channel signals corresponding to a low frequency based on the downmixed signals and information associated with the spatial parameter.
The channel decorrelator 330 generates additional signals from the downmixed signals in the same manner as the multi-channel audio signal encoding apparatus 100 of
The multi-channel decoding process proceeds using the spatial synthesizer 340, the parameter decoder 350, and the high-frequency synthesizer 360, and produces a multi-channel output voice signal that is similar to the multi-channel input voice signal. That is, a signal close to the original signal may be generated.
In this example, a downmixed signal 401 is inputted to a channel decorrelator 410, and the channel decorrelator 410 generates an additional signal from a downmixed signal in the same manner as the multi-channel audio signal encoding apparatus 100 of
A high-frequency generator 420 selects a target signal to be patched from the base signal set based on patching channel index information, and may generate a high-frequency band signal based on generated gain information.
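As a non-limiting illustration, the operation of the high-frequency generator may be sketched as follows. The representation of the patching channel index information and the gain information as one index and one gain per output channel is an assumption, and the function name `generate_high_band` is hypothetical.

```python
def generate_high_band(base_set, patch_indices, gains):
    """Generate one high-frequency subband signal per output channel.

    base_set      : list of candidate signals, e.g. [X0, X1, X2, X3]
    patch_indices : for each output channel, the index of the signal to patch
    gains         : for each output channel, the gain applied to that signal
    (Assumption: one index and one gain per output channel and subband.)
    """
    if len(patch_indices) != len(gains):
        raise ValueError("one gain is required per patching index")
    out = []
    for idx, g in zip(patch_indices, gains):
        if not 0 <= idx < len(base_set):
            raise IndexError("patching index outside the base signal set")
        out.append([g * v for v in base_set[idx]])
    return out
```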
The multi-channel audio encoding apparatus may be implemented in a transmitter into which a multi-channel audio signal is input. As such, various aspects of the multi-channel audio encoding apparatus described above, for example, the downmixer, channel decorrelator, parameter estimator and bitmuxer, may be implemented in a transmitter as well. As noted above, and shown in
The multi-channel audio decoding apparatus may be implemented in a receiver which receives a transmitted bit stream. As such, various aspects of the multi-channel audio decoding apparatus described above, for example, the bitdemuxer, parameter decoder, channel decorrelator, high-frequency signal synthesizer and spatial synthesizer, may be implemented in the receiver as well.
The transmitter and receiver may be implemented in various electronic devices.
The processes, functions, methods and/or software described herein may be recorded, stored, or fixed in one or more computer-readable storage media that includes program instructions to be implemented by a computer to cause a processor to execute or perform the program instructions. The media may also include, alone or in combination with the program instructions, data files, data structures, and the like. The media and program instructions may be those specially designed and constructed, or they may be of the kind well-known and available to those having skill in the computer software arts. Examples of computer-readable media include magnetic media, such as hard disks, floppy disks, and magnetic tape; optical media such as CD ROM disks and DVDs; magneto-optical media, such as optical disks; and hardware devices that are specially configured to store and perform program instructions, such as read-only memory (ROM), random access memory (RAM), flash memory, and the like. Examples of program instructions include machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter. The described hardware devices may be configured to act as one or more software modules that are recorded, stored, or fixed in one or more computer-readable storage media, in order to perform the operations and methods described above, or vice versa. In addition, a computer-readable storage medium may be distributed among computer systems connected through a network and computer-readable codes or program instructions may be stored and executed in a decentralized manner.
A number of examples have been described above. Nevertheless, it should be understood that various modifications may be made. For example, suitable results may be achieved if the described techniques are performed in a different order and/or if components in a described system, architecture, device, or circuit are combined in a different manner and/or replaced or supplemented by other components or their equivalents. Accordingly, other implementations are within the scope of the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2010-0091040 | Sep 2010 | KR | national |
Number | Date | Country |
---|---|---|
20120070007 A1 | Mar 2012 | US |