This application is a U.S. national phase under the provisions of 35 U.S.C. §371 of International Application No. PCT/SG08/00036 filed Jan. 31, 2008 in the names of Te Li, et al. for “METHOD AND DEVICE OF BITRATE DISTRIBUTION/TRUNCATION FOR SCALABLE AUDIO CODING.” The disclosure of such international application is hereby incorporated herein by reference in its entirety, for all purposes.
Embodiments of the invention relate generally to scalable audio coding. Specifically, embodiments of the invention relate to bitrate distribution and/or bitrate truncation for scalable audio coding.
Because application scenarios vary widely, a scalable audio coding system is highly desirable, that is, a system capable of producing a hierarchical bitstream whose bitrate can be changed dynamically during transmission.
For example, MPEG-4 scalable lossless (SLS) coding provides a gradual refinement, from perceptually weighted reconstruction levels provided by the perceptual audio coding (e.g., advanced audio coding, AAC) core bitstream up to the resolution of the original signal. The original signal is transformed by an integer modified discrete cosine transform (IntMDCT), and the resultant IntMDCT spectral data is coded with two complementary layers, including a core MPEG-4 AAC layer which generates an AAC compliant bit-stream at a pre-defined bitrate which constitutes the minimum rate/quality of the lossless bitstream, and a lossless enhanced layer that makes use of bit-plane coding method to produce fine grain scalable to lossless portion of the lossless bitstream.
In the MPEG-4 SLS encoder, the bitrate for different channels of the audio signal is equally distributed for lossy coding. For example, the bitrate assigned to each frame, Br/f, is calculated as
wherein Br is the total bitrate (kbps), Ns/f is the sample number/frame and S is the sampling rate. If there are two channels, Br/f is evenly distributed to the two channels as
For example, if the mid/side joint stereo coding (M/S stereo coding) is utilized, the bitrates assigned to the mid channel and the side channel are identical according to the equation above. The mid channel represents the Average of Left and Right channel data, and the side channel represents the Difference between Left and Right channel data. In another example, the first and the second channels are the left channel and the right channel, and the bitrate is then assigned to the left and right channel according to the above equation.
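The equal distribution described above can be illustrated with a short sketch. The per-frame equation itself is not reproduced in this text, so the sketch assumes that the frame budget equals the total bitrate multiplied by the frame duration (Ns/f divided by S); the factor of 1000 for kbps and all function names are illustrative assumptions, not part of the SLS specification.

```python
def bits_per_frame(total_bitrate_kbps, samples_per_frame, sampling_rate_hz):
    """Assumed frame budget in bits: bitrate (bits/s) times frame duration (s)."""
    return total_bitrate_kbps * 1000 * samples_per_frame / sampling_rate_hz


def split_equally(frame_bits, num_channels=2):
    """Conventional behaviour: every channel receives the same share."""
    return [frame_bits / num_channels] * num_channels


# Hypothetical example values: 96 kbps, 1024 samples per frame, 48 kHz.
frame_bits = bits_per_frame(96, 1024, 48000)
per_channel = split_equally(frame_bits)
```

With these example values each of the two channels receives half of the frame budget, regardless of how much data each channel actually carries.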
The lossless bitstream resulting from the SLS encoder can be directly decoded or can be truncated by a truncator. The lossless bitstream is truncated, e.g. for low bitrate applications, wherein the lossless bitstream may be truncated for each frame based on the target bitrate. For a frame, the original lossless bitstream lengths for the first and second channels are represented as BS1 and BS2, respectively. The target bitstream length is denoted as BST. In a standard SLS truncator, the truncated bitrates are allocated as
M/S stereo coding can be used in lossy audio coding as well as lossless audio coding, for example, in MPEG-4 audio scalable lossless coding (SLS). In most cases, there is comparatively little difference between the audio data for the left and right channels; whereas in some other cases, there is much difference between the audio data for the left and right channels. Accordingly, encoding the data into mid and side channels usually results in a situation where the mid channel is much different from the side channel. In this case, evenly distributing bitrates between the mid channel and the side channel in the audio encoding, or evenly distributing truncated bitrates between the mid channel and the side channel, becomes inefficient.
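The imbalance between the mid and side channels can be seen in a minimal sketch. The text states only that the mid channel represents the average and the side channel the difference of the left and right channels; the scaling of the difference by one half, the function names, and the sample values below are assumptions chosen so that the transform is easy to invert.

```python
def mid_side_encode(left, right):
    """Mid = average of L and R; side = half the difference (assumed scaling)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def mid_side_decode(mid, side):
    """Inverse of the assumed transform: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Hypothetical samples from two highly correlated channels.
left = [100, -80, 60, 40]
right = [98, -82, 61, 38]
mid, side = mid_side_encode(left, right)

# The side channel carries far smaller magnitudes than the mid channel.
peak_mid = max(abs(v) for v in mid)
peak_side = max(abs(v) for v in side)
```

When the left and right channels are similar, the side channel values are small, which is why giving it the same bitrate as the mid channel wastes bits.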
Various embodiments of the invention provide an efficient method and device for bitrate assignment in the scalable audio encoding process.
An embodiment of the invention provides a method for assigning bitrates to a plurality of channels in a scalable audio encoding process. The method includes assigning different bitrates to different channels in the scalable audio encoding process.
Another embodiment of the invention provides a method for assigning truncated bitrates to a plurality of channels in a scalable audio truncation process. The method includes assigning different truncated bitrates to different channels in the scalable audio truncation process.
Other embodiments of the invention provide an encoder for scalable audio encoding, a computer readable medium for scalable audio encoding, a computer program element for scalable audio encoding, a scalable audio encoder, a truncator for scalable audio truncation, a computer readable medium for scalable audio truncation, and a computer program element for scalable audio truncation.
In the drawings, like reference characters generally refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead generally being placed upon illustrating the principles of the invention. In the following description, various embodiments of the invention are described with reference to the following drawings, in which:
Various embodiments of the invention are based on the finding that the mid channel data amount differs considerably from the side channel data amount in most cases. Therefore, the smaller channel can be accurately encoded using a lower bitrate, thereby freeing up resources which can be employed more efficiently on the larger channel.
An embodiment of the invention provides a method for assigning bitrates to a plurality of channels in a scalable audio encoding process. The method may include assigning different bitrates to different channels in the scalable audio encoding process.
In one embodiment, the plurality of channels may include a mid channel and a side channel of a mid/side stereo encoding process. A first bitrate is assigned to the mid channel, and a second bitrate, which is different from the first bitrate, is assigned to the side channel. In another embodiment, the plurality of channels may include a left channel and a right channel.
According to an embodiment of the invention, the different bitrates are determined based on psychoacoustic information. For example, the different bitrates may be determined based on the ratio of psychoacoustic information in the different channels.
The different bitrates may be assigned to different channels of each audio frame in a bit-plane encoding process. In one embodiment, the different bitrates are assigned to different channels based on bit-plane values for different channels. In another embodiment, the different bitrates are assigned to different channels based on the ratio of bit-plane values for different channels.
In a further embodiment, the different bitrates are assigned to different channels based on the ratio of maximum bit-plane values for the different channels. In another embodiment, the different bitrates are assigned to different channels based on the ratio of average maximum bit-plane values for all the scalefactor bands (sfb) for the different channels. For example, the different bitrates may be assigned to different channels based on the ratio of a first average maximum bit-plane value and a second average maximum bit-plane value. The first average maximum bit-plane value may include an average value of a plurality of maximum bit-plane values for a first channel of the plurality of channels, and the second average maximum bit-plane value may include an average value of a plurality of maximum bit-plane values for a second channel of the plurality of channels.
Based on the different bitrates assigned to different channels, the audio signal is scalable encoded, e.g. to form a scalable lossless bitstream. The scalable lossless bitstream may be used in different applications, which may have different available/target bitrates. The scalable lossless bitstream may be truncated to cater for different applications according to the embodiment of the invention.
According to one embodiment, it is further determined as to whether a target total bitrate is smaller than or equal to the sum of a first perceptual core bitrate for a first channel of the plurality of channels and a second perceptual core bitrate for a second channel of the plurality of channels.
If the target total bitrate is smaller than or equal to the sum of a first perceptual core bitrate for a first channel of the plurality of channels and a second perceptual core bitrate for a second channel of the plurality of channels, different truncated bitrates may be assigned to different channels in a scalable audio truncation process based on the total bitrate, the first perceptual core bitrate, and the second perceptual core bitrate, in one embodiment. In another embodiment, if the target total bitrate is smaller than or equal to the sum of the first perceptual core bitrate and the second perceptual core bitrate, the different truncated bitrates may be assigned to different channels in the scalable audio truncation process based on the total bitrate, and a ratio between the first perceptual core bitrate and the second perceptual core bitrate.
In a further embodiment, if the target total bitrate is smaller than or equal to the sum of the first perceptual core bitrate and the second perceptual core bitrate, a first truncated bitrate may be assigned to the first channel of the plurality of channels in accordance with the following equation:
and a second truncated bitrate is assigned to a second channel of the plurality of channels in accordance with the following equation:
Wherein
It is to be understood that the above equations for the first channel and the second channel may be modified accordingly if the plurality of channels include more than two channels.
According to another embodiment, if it is determined that the target total bitrate is greater than the sum of the first perceptual core bitrate for the first channel of the plurality of channels and the second perceptual core bitrate for the second channel of the plurality of channels, different truncated bitrates may be assigned to different channels in the scalable audio truncation process based on the first perceptual core bitrate, the second perceptual core bitrate, a first enhancement bitrate for an enhancement layer of the first channel, and a second enhancement bitrate for an enhancement layer of the second channel. In another embodiment, if the target total bitrate is greater than the sum of the first perceptual core bitrate and the second perceptual core bitrate, the different truncated bitrates may be assigned to different channels in the scalable audio truncation process based on the first perceptual core bitrate, the second perceptual core bitrate, and a ratio between the first enhancement bitrate assigned to the enhancement layer of the first channel and the second enhancement bitrate assigned to the enhancement layer of the second channel.
In a further embodiment, if the target total bitrate is greater than the sum of the first perceptual core bitrate and the second perceptual core bitrate, a first truncated bitrate may be assigned to the first channel in accordance with the following equation:
a second truncated bitrate may be assigned to the second channel in accordance with the following equation:
wherein
It is to be understood that the above equations for the first channel and the second channel may be modified accordingly if the plurality of channels include more than two channels.
Another embodiment of the invention provides a method for assigning truncated bitrates to a plurality of channels of a bitstream in a scalable audio truncation process. The method includes assigning different truncated bitrates to different channels in the scalable audio truncation process.
In one embodiment, the plurality of channels includes a mid channel and a side channel of a mid/side stereo decoding process. A first truncated bitrate may be assigned to the mid channel, and a second truncated bitrate, which is different from the first truncated bitrate, may be assigned to the side channel. In another embodiment, the plurality of channels may include a left channel and a right channel. The bitstream may be a scalable lossless bitstream derived by scalable encoding an audio signal, for example. The bitstream may also be a lossy bitstream derived by lossy encoding an audio signal, in another example.
According to one embodiment, it is determined as to whether a target total bitrate is smaller than or equal to the sum of a first perceptual core bitrate for a first channel of the plurality of channels and a second perceptual core bitrate for a second channel of the plurality of channels.
If the target total bitrate is smaller than or equal to the sum of a first perceptual core bitrate for a first channel of the plurality of channels and a second perceptual core bitrate for a second channel of the plurality of channels, different truncated bitrates may be assigned to different channels in the scalable audio truncation process based on the total bitrate, the first perceptual core bitrate, and the second perceptual core bitrate, in one embodiment. In another embodiment, if the target total bitrate is smaller than or equal to the sum of the first perceptual core bitrate and the second perceptual core bitrate, the different truncated bitrates may be assigned to different channels in the scalable audio truncation process based on the total bitrate, and a ratio between the first perceptual core bitrate and the second perceptual core bitrate.
In a further embodiment, if the target total bitrate is smaller than or equal to the sum of the first perceptual core bitrate and the second perceptual core bitrate, a first truncated bitrate may be assigned to the first channel of the plurality of channels in accordance with the following equation:
and a second truncated bitrate is assigned to a second channel of the plurality of channels in accordance with the following equation:
Wherein
It is to be understood that the above equations for the first channel and the second channel may be modified accordingly if the plurality of channels include more than two channels.
According to another embodiment, if it is determined that the target total bitrate is greater than the sum of the first perceptual core bitrate for the first channel of the plurality of channels and the second perceptual core bitrate for the second channel of the plurality of channels, different truncated bitrates may be assigned to different channels in the scalable audio truncation process based on the first perceptual core bitrate, the second perceptual core bitrate, a first enhancement bitrate for an enhancement layer of the first channel, and a second enhancement bitrate for an enhancement layer of the second channel. In another embodiment, if the target total bitrate is greater than the sum of the first perceptual core bitrate and the second perceptual core bitrate, the different truncated bitrates may be assigned to different channels in the scalable audio truncation process based on the first perceptual core bitrate, the second perceptual core bitrate, and a ratio between the first enhancement bitrate assigned to the enhancement layer of the first channel and the second enhancement bitrate assigned to the enhancement layer of the second channel.
In a further embodiment, if the target total bitrate is greater than the sum of the first perceptual core bitrate and the second perceptual core bitrate, a first truncated bitrate may be assigned to the first channel in accordance with the following equation:
a second truncated bitrate may be assigned to the second channel in accordance with the following equation:
wherein
It is to be understood that the above equations for the first channel and the second channel may be modified accordingly if the plurality of channels include more than two channels.
According to an embodiment of the invention, the bitstream may be truncated based on the assigned truncated bitrates, such that a prioritized truncation is performed on different channels.
Another embodiment of the invention relates to a method of decoding a bitstream in a scalable audio decoding process. In one embodiment, a bitrate assignment information may be received from another device, e.g. a scalable audio encoder. The bitrate assignment information may be embedded in an encoded bitstream in another embodiment. The bitrate assignment information indicates the different bitrates assigned to the different channels of the bitstream in the scalable audio encoding process. Based on the received bitrate assignment information, the bitstream is decoded in the scalable audio decoding process.
In another embodiment, the bitrate assignment information indicates the different truncated bitrates for different channels used to truncate the encoded bitstream. Based on the bitrate assignment information, the encoded bitstream which is further truncated in a scalable audio truncation process may be decoded in the scalable audio decoding process.
Other embodiments of the invention provide an encoder for scalable audio encoding, a computer readable medium for scalable audio encoding, a computer program element for scalable audio encoding, a scalable audio encoder, a truncator for scalable audio truncation, a computer readable medium for scalable audio truncation, a computer program element for scalable audio truncation, which will be described in more detail in the examples below.
At 101, different bitrates are assigned to different channels of a signal. For example, different bitrates may be assigned to mid and side channels of an audio signal. At 103, the signal is scalable encoded based on the different bitrates assigned to different channels. In one example, the mid channel may be assigned a higher bitrate such that the mid channel data is encoded with more accuracy.
At 201, bit-plane values for different channels of a signal, e.g. for different channels of each frame of an audio signal, are determined. Different bitrates are assigned to different channels based on the bit-plane values for different channels at 203. For example, different bitrates may be assigned to mid and side channels of an audio signal. The bitrates may be assigned based on the ratio of bit-plane values for the different channels in one embodiment, and may be assigned based on the ratio of maximum bit-plane values for the different channels in another embodiment. In a further embodiment, the different bitrates may be assigned based on the ratio of average maximum bit-plane values assigned to the different channels. The signal is bit-plane encoded based on the different bitrates assigned to different channels at 205. For example, the mid channel may be assigned a higher bitrate such that the mid channel data is encoded with higher accuracy.
It is to be noticed that a circuit as described in this description may be hard wired logic, a controller, a microcontroller, or a microprocessor (including e.g. a complex instruction set computer (CISC) processor or a reduced instruction set computer (RISC) processor).
In
The SLS encoder 300 further includes a mid/side encoding circuit 305 configured to encode the transformed signal to form a mid/side encoded signal. For example, if the transformed signal has left and right channels, the mid/side encoded signal is encoded to have mid and side channels.
An error mapping circuit 307 is included to perform an error mapping process based on the mid/side encoded signal and the core-layer bitstream. The information which has already been encoded by the encoding circuit 303 is removed from the transformed signal, resulting in an error signal.
The SLS encoder also includes a bit-plane encoding circuit 309 configured to bit-plane encode the error signal based on different bitrates to form an enhancement-layer bitstream. The bit-plane encoding circuit 309 may include an assignment circuit configured to assign the different bitrates to different channels of a plurality of channels in the bit-plane coding process. For example, the different bitrates may be assigned based on the bit-plane values for different channels, as explained in the embodiments above.
A bitstream multiplexing circuit 311 is configured to multiplex the core-layer bitstream and the enhancement-layer bitstream, thereby generating the scalable encoded bitstream, which is a lossless bitstream.
It is noticed that the above encoding circuit 303 of the SLS encoder 300 is used to generate the core-layer bitstream from the transformed audio signal in accordance with the embodiment of the invention.
The SLS encoder 350 includes a domain transform circuit 351 configured to transform an audio signal to form a transformed signal. The domain transform circuit 351 may be an integer modified discrete Cosine transform (IntMDCT), for example.
The SLS encoder 350 further includes a mid/side encoding circuit 353 configured to encode the transformed signal to form a mid/side encoded signal. For example, if the transformed signal has left and right channels, the left and right channel information is encoded to become mid and side channel information.
A bit-plane encoding circuit 355 is included to bit-plane encode the mid/side encoded signal based on different bitrates for different channels. The bit-plane encoding circuit 355 may include an assignment circuit configured to assign the different bitrates to different channels of a plurality of channels in the bit-plane coding process. For example, the different bitrates may be assigned based on the bit-plane values assigned to different channels, as explained in the embodiments above. After the mid/side encoded signal is encoded through the bit-plane encoding circuit 355, a lossless bitstream is formed.
The non-core SLS encoder 350 may be used such that perceptual information of the audio signal is not used to determine the different bitrates for different channels in the bit-plane coding process.
The non-core SLS encoder 350 may also have a structure of the SLS encoder 300 of
The assignment of different bitrates to different channels in the method of
For an input n-dimensional data vector $x = \{x_0, x_1, \ldots, x_{n-1}\}$, each element $x_i$, $i = 0, \ldots, n-1$, can be represented in a binary format
that includes a sign symbol
and the bit-plane symbols $b_{i,j} \in \{0, 1\}$. The bit-plane symbols usually start from a maximum bit-plane $M_i$ that satisfies
$2^{M_i - 1} \le |x_i| < 2^{M_i}$
In bit-plane coding, the input data vector is first scanned into sign and bit-plane symbols, usually from MSB to LSB. The resultant binary string is then entropy coded with a properly assigned statistical model. In the decoder, the data flow is reversed where the sign and amplitude symbols are decoded to reconstruct the original data vectors. The compressed bitstream resultant from the bit-plane coding can be arbitrarily truncated to lower rates which still can be decoded to a coarse reconstruction that comprises partial bit-plane symbols. Thus, bit-plane coding provides a convenient way to implement an embedded code with sequentially refined step size.
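The scanning and truncation behaviour described above can be sketched as follows. This is a minimal illustration, not the SLS bit-plane coder: the entropy coding stage is omitted, a single maximum bit-plane is assumed for the whole vector, the sign convention (1 for non-negative values) is an assumption, and all function names are hypothetical.

```python
def max_bit_plane(values):
    """Smallest M such that every |x| < 2**M (0 for an all-zero vector)."""
    return max(abs(v) for v in values).bit_length()


def to_bit_planes(values):
    """Scan a vector into sign symbols and bit-planes ordered from MSB to LSB."""
    signs = [1 if v >= 0 else 0 for v in values]  # assumed sign convention
    m = max_bit_plane(values)
    planes = [[(abs(v) >> j) & 1 for v in values] for j in range(m - 1, -1, -1)]
    return signs, planes


def from_bit_planes(signs, planes, total_planes):
    """Rebuild values from the planes received; missing low planes count as zero."""
    magnitudes = [0] * len(signs)
    for k, plane in enumerate(planes):
        j = total_planes - 1 - k  # bit position of the k-th received plane
        for i, bit in enumerate(plane):
            magnitudes[i] |= bit << j
    return [m if s == 1 else -m for s, m in zip(signs, magnitudes)]


x = [13, -6, 3, 0]
signs, planes = to_bit_planes(x)
full = from_bit_planes(signs, planes, len(planes))        # all planes kept
coarse = from_bit_planes(signs, planes[:2], len(planes))  # two MSB planes only
```

Keeping every plane reproduces the input exactly, while keeping only the most significant planes gives a coarser reconstruction, which is the embedded-code property that allows the bitstream to be truncated.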
In one embodiment, the bitrates for different channels used in the bit-plane coding process may be assigned/distributed based on the average values of the maximum bit-planes (MBP) for each channel. The average MBP value for each channel is calculated based on the MBP for each scalefactor bands as shown in
wherein MAverage,1 and MAverage,2 are the average MBP values for the first and the second channel of the frame, respectively. N is the number of total scalefactor bands (sfbs) in the frame. M1,i and M2,i denote the MBP of the bit-planes for the sfb i in the first channel and the second channel, respectively. Then, the ratio of the average values in the first and the second channel, r is computed as
and the bitrate assigned for each channel is then assigned according to the following equations
wherein Br/f is the total bitrate for each frame.
From the above equations, it can be seen that a higher bitrate is assigned to the channel with the higher average maximum bit-plane value.
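The distribution based on average maximum bit-plane (MBP) values can be sketched as below. The assignment equations are not reproduced in this text, so the sketch assumes a proportional split in which the first channel receives r/(1+r) of the frame budget and the second channel receives 1/(1+r); that split, the function names, and the example MBP values are assumptions that merely satisfy the stated property that the channel with the higher average MBP receives more bits.

```python
def average_mbp(mbp_per_sfb):
    """Plain average of the maximum bit-plane values over all scalefactor bands."""
    return sum(mbp_per_sfb) / len(mbp_per_sfb)


def distribute_by_ratio(frame_bits, avg1, avg2):
    """Assumed proportional split of the frame budget using r = avg1 / avg2."""
    if avg1 == 0 and avg2 == 0:
        return frame_bits / 2, frame_bits / 2  # nothing to distinguish the channels
    if avg2 == 0:
        return float(frame_bits), 0.0
    r = avg1 / avg2
    return frame_bits * r / (1 + r), frame_bits / (1 + r)


# Hypothetical MBP values for four scalefactor bands in each channel.
mid_mbp = [12, 10, 8, 6]
side_mbp = [4, 3, 3, 2]
avg_mid = average_mbp(mid_mbp)
avg_side = average_mbp(side_mbp)
bits_mid, bits_side = distribute_by_ratio(2048, avg_mid, avg_side)
```

In this example the ratio r is 3, so under the assumed split the mid channel receives three quarters of the frame budget.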
In another embodiment, the bitrates for different channels used in the bit-plane coding process may be assigned/distributed based on the average maximum bit-plane values for each channel, wherein the average maximum bit-plane values for each channel is determined in consideration of the number of spectrum coefficients in each scale factor band.
For each frame, the average MBP values are calculated as follows
wherein $\hat{M}_{\text{Average},1}$ and $\hat{M}_{\text{Average},2}$ are the average total MBP values for the first and the second channel of the frame, respectively. N is the total number of scalefactor bands (sfbs) in the frame, and $W_i$ denotes the number of spectrum coefficients for the sfb i. $M_{1,i}$ and $M_{2,i}$ denote the MBP of the bit-planes for the sfb i in the first channel and the second channel, respectively. Then, the ratio of the average values in the first and the second channel, r, is computed as
and the bitrate assigned for each channel is then assigned according to the following equations
wherein Br/f is the total bitrate for each frame.
From the above equations, it can be seen that a higher bitrate is assigned to the channel with the higher average maximum bit-plane value.
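The variant that takes the number of spectrum coefficients per scalefactor band into account can be sketched in the same way. The exact normalisation of the weighted average is not reproduced in this text; the sketch assumes each band's MBP is weighted by its coefficient count Wi, and it relies on the fact that any common normalisation cancels in the ratio r because both channels share the same band widths. The proportional split and all names and values are illustrative assumptions.

```python
def weighted_mbp_total(mbp_per_sfb, widths):
    """Sum of MBP values, each weighted by its band's coefficient count."""
    return sum(w * m for w, m in zip(widths, mbp_per_sfb))


def distribute_weighted(frame_bits, mbp1, mbp2, widths):
    """Assumed proportional split based on the ratio of the weighted totals."""
    total1 = weighted_mbp_total(mbp1, widths)
    total2 = weighted_mbp_total(mbp2, widths)
    if total1 + total2 == 0:
        return frame_bits / 2, frame_bits / 2
    share1 = total1 / (total1 + total2)  # equals r / (1 + r) with r = total1 / total2
    return frame_bits * share1, frame_bits * (1 - share1)


# Hypothetical band widths (coefficients per sfb) and MBP values.
widths = [4, 4, 8, 16]
mid_mbp = [12, 10, 8, 6]
side_mbp = [4, 3, 3, 2]
bits_mid, bits_side = distribute_weighted(2048, mid_mbp, side_mbp, widths)
```

Wide bands now influence the split more than narrow ones, so the result generally differs from the unweighted average whenever the MBP values vary across bands of different width.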
At 501, it is determined whether a target total bitrate BST is smaller than or equal to the sum of a first perceptual core bitrate BS1P for a first channel and a second perceptual core bitrate BS2P for a second channel of a plurality of channels.
If yes, different truncated bitrates are assigned to different channels at 503 based on the target total bitrate BST, the first perceptual core bitrate BS1P and the second perceptual core bitrate BS2P. In one example, the target total bitrate BST may be divided into two different truncated bitrates based on the ratio between the first perceptual core bitrate and the second perceptual core bitrate.
If it is determined at 501 that the target total bitrate is greater than the sum of the first perceptual core bitrate BS1P for the first channel and the second perceptual core bitrate BS2P for the second channel, different truncated bitrates may be assigned to different channels at 505 based on the target total bitrate BST, the first perceptual core bitrate BS1P, the second perceptual core bitrate BS2P, a first enhancement bitrate for an enhancement layer of the first channel, and a second enhancement bitrate for an enhancement layer of the second channel. In one example, the target total bitrate BST may be divided into two different truncated bitrates based on the ratio between the first enhancement bitrate and the second enhancement bitrate.
After the different truncated bitrate is determined for different channels at 503 or 505, a bitstream may be scalable truncated based on the different truncated bitrates. In one example, an input audio signal has been encoded into a lossless bitstream by the SLS encoder 300, 350 described above. The resultant lossless bitstream is then truncated/compressed using the different truncated bitrates as assigned in 503 or 505 above, so that a truncated bitstream may be formed for situations with only limited target total bitrate.
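The decision at 501 and the two assignment branches at 503 and 505 can be sketched as follows. The truncation equations are not reproduced in this text, so the proportional formulas below are assumptions built from the two examples given: a split by the ratio of the perceptual core bitrates when the target does not exceed their sum, and otherwise full core bitrates plus a split of the remaining budget by the ratio of the enhancement bitrates. The function and parameter names are hypothetical.

```python
def assign_truncated_bitrates(target, core1, core2, enh1, enh2):
    """Return assumed truncated bitrates (channel 1, channel 2) for one frame."""
    core_sum = core1 + core2
    enh_sum = enh1 + enh2
    if target >= core_sum + enh_sum:
        # Target covers the whole lossless bitstream: nothing is truncated.
        return float(core1 + enh1), float(core2 + enh2)
    if core_sum > 0 and target <= core_sum:
        # Step 503: split the target by the ratio of the core bitrates.
        first = target * core1 / core_sum
        return first, target - first
    # Step 505: keep both cores, split the remainder by the enhancement ratio.
    remaining = target - core_sum
    first_enh = remaining * enh1 / enh_sum
    return core1 + first_enh, core2 + (remaining - first_enh)


# Hypothetical per-frame values: cores of 600 and 200, enhancements of 3000 and 1000.
low = assign_truncated_bitrates(400, 600, 200, 3000, 1000)
high = assign_truncated_bitrates(1800, 600, 200, 3000, 1000)
```

In both branches the two truncated bitrates sum to the target, and the channel with the larger core or enhancement bitrate keeps the larger share.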
The embodiments of assigning different truncated bitrates for different channels are described in
In one embodiment, a target total bitrate BST is smaller than or equal to the sum of the first perceptual core bitrate BS1P and the second perceptual core bitrate BS2P, i.e., BST≤BS1P+BS2P. In order to optimize the basic perceptual quality, the truncated bitrates are allocated as shown in
As seen from the resultant bitstream in
In another embodiment, the target total bitrate BST is greater than the sum of the first perceptual core bitrate BS1P and the second perceptual core bitrate BS2P, i.e., BST>BS1P+BS2P. In this case, the perceptual core bitstream may be retained, and the enhancement bitstream may be truncated. The resultant truncated bitstream for each channel as shown in
As seen from
It is to be noticed that the lossless bitstream may be a non-core bitstream without the first perceptual core bitstream and the second perceptual core bitstream. The different truncated bitrate may be assigned based on the ratio between the first bitstream for the first channel and the second bitstream for the second channel.
In other embodiments, the truncated bitrates for different channels may be assigned such that the bitrate for one or some of the plurality of channels is truncated more. For example, a higher truncated bitrate may be assigned to the mid channel than to the side channel, such that the side channel bitstream is truncated more than the mid channel bitstream. Illustratively, this means that the bitrate is truncated with priority given to the mid channel.
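One possible reading of such a prioritized truncation is sketched below. The text does not specify a rule, so the strict priority used here (the mid channel is served first up to its full length and the side channel receives whatever remains) is purely an assumption, as are the function name and the example values.

```python
def prioritized_truncation(target, mid_length, side_length):
    """Assumed strict priority: serve the mid channel first, then the side channel."""
    mid_bits = min(mid_length, target)
    side_bits = min(side_length, target - mid_bits)
    return mid_bits, side_bits


# Hypothetical frame: mid bitstream of 3600 bits, side bitstream of 1200 bits.
mid_bits, side_bits = prioritized_truncation(4000, 3600, 1200)

# Fraction of each channel's bitstream that survives truncation.
mid_kept = mid_bits / 3600
side_kept = side_bits / 1200
```

In this example the mid channel is left intact while the side channel loses two thirds of its bits, so the side channel is truncated more than the mid channel.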
The audio signal is encoded through the SLS encoder 710, resulting in a lossless bitstream 712. The lossless bitstream 712 includes header information, side information, and the data for each channel of the plurality of channels. In this example, the SLS encoder 710 may be the SLS encoder 300, 350 of
A truncator 720 is included to assign different truncated bitrates to different channels, such that the lossless bitstream 712 is truncated to form the truncated bitstream 722 based on the assigned different truncated bitrate. A target bitrate 724 is used by the truncator to determine the different truncated bitrates for different channels. And the different truncated bitrates may be assigned according to the embodiments described with reference to
According to the above embodiments of the invention for the assignment of different bitrates and/or different truncated bitrates for different channels, no additional side information or complexity is involved, as the bitrate per channel is already encoded in the bitstream in the original codec.
A lossless bitstream 812 may be truncated by a truncator 820 to form a truncated bitstream 822, similar to
An SLS decoder 810 decodes the truncated bitstream 822 to form a reconstructed audio signal. The reconstructed audio signal may be a lossy signal as the truncated bitstream 822 is a lossy bitstream.
The method of scalable decoding a bitstream and the corresponding SLS decoder according to the embodiments of the invention are described in the following.
At 901, bitrate assignment information of a bitstream is determined. The bitrate assignment information may be received from another device, e.g. a scalable audio encoder, or may be embedded in the bitstream.
In one embodiment, the bitstream may be a lossless bitstream encoded by the scalable lossless encoder 300, 350 of
In another embodiment, the bitstream may be a truncated bitstream derived from a truncator 720, 820 of
Based on the determined bitrate assignment information, the bitstream is decoded in a scalable audio decoding process at 903.
In
The decoder 1000 further includes a perceptual decoding circuit 1003 for decoding the core-layer bitstream to form a core-layer signal, which may constitute the minimum rate/quality unit of the original audio signal. The perceptual decoding circuit 1003 may also be called the core-layer decoding circuit. In one example, the decoding circuit 1003 is an MPEG-4 AAC (advanced audio coding) decoder.
The SLS decoder 1000 includes a bit-plane decoding circuit 1005 configured to bit-plane decode the enhancement-layer bitstream to form a bit-plane decoded enhancement-layer signal. The bit-plane decoding circuit 1005 may be configured to decode the enhancement-layer bitstream based on a bitrate assignment information, which indicates different bitrates assigned to different channels of the enhancement-layer bitstream, for example.
An inverse error mapping circuit 1007 is included to perform an inverse error mapping process based on the core-layer signal and the bit-plane decoded enhancement-layer signal, resulting in an error corrected signal.
The SLS decoder 1000 further includes a mid/side decoding circuit 1009 configured to decode the error corrected signal to form a mid/side decoded signal. For example, if the error corrected signal has mid and side channels, the error corrected signal is decoded into left and right channels, which form the mid/side decoded signal.
The mid/side decoded signal is then input to an inverse domain transform circuit 1011 to be inversely transformed to a decoded audio signal. The inverse domain transform circuit 1011 may perform an inverse integer modified discrete cosine transform (inverse IntMDCT), for example. The decoded audio signal may be a lossless reconstruction of the original encoded audio signal.
It is noted that the above perceptual decoding circuit 1003 of the SLS decoder 1000 is used to decode the core-layer bitstream in accordance with the above embodiment.
The SLS decoder 1050 includes a bit-plane decoding circuit 1051 configured to bit-plane decode a lossless bitstream to form a bit-plane decoded signal. The bit-plane decoding circuit 1051 may be configured to decode the lossless bitstream based on bitrate assignment information, which indicates different bitrates assigned to different channels of the lossless bitstream, for example.
The SLS decoder 1050 further includes a mid/side decoding circuit 1053 configured to decode the bit-plane decoded signal to form a mid/side decoded signal. For example, if the bit-plane decoded signal has mid and side channels, the bit-plane decoded signal is decoded into left and right channels, which form the mid/side decoded signal.
The mid/side decoded signal is then input to an inverse domain transform circuit 1055 to be inversely transformed to a decoded audio signal. The inverse domain transform circuit 1055 may perform an inverse integer modified discrete cosine transform (inverse IntMDCT), for example. The decoded audio signal may be a lossless reconstruction of the original encoded audio signal.
The non-core SLS decoder 1050 may be used such that perceptual information of the encoded lossless bitstream is not used to determine the different bitrates for different channels in the bit-plane decoding process.
The non-core SLS decoder 1050 may also have a structure of the SLS decoder 1000 of
While the invention has been particularly shown and described with reference to specific embodiments, it should be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. The scope of the invention is thus indicated by the appended claims and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/SG2008/000036 | 1/31/2008 | WO | 00 | 10/19/2010 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2009/096898 | 8/6/2009 | WO | A |
Number | Date | Country |
---|---|---|
20110046945 A1 | Feb 2011 | US |