The invention relates to coding a signal, in particular an audio signal.
Audio coding schemes are known which use frames that include a set of values representing (a component of) the audio signal in the time interval to which the frame relates. At least some frames relate to time intervals having an overlap in time. In order to achieve a low bit-rate, the redundancy between values obtained at successive time-instants can be exploited by employing, e.g. differential, coding techniques.
An object of the invention is to provide advantageous coding. To this end, the invention provides a method of coding, an encoder, a bit-stream, a storage medium, a method of decoding, a decoder, a transmitter, a receiver and a system as defined in the independent claims. Advantageous embodiments are defined in the dependent claims.
A first aspect of the invention provides coding a signal, the coding comprising providing a first set of values related to subsequent times in a first time interval of the signal, providing a second set of values related to subsequent times in a second time interval of the signal, wherein the first time interval has an overlap (in time) with the second time interval, the overlap including at least two subsequent times of the second interval, wherein at least one of the values of the second set related to the at least two subsequent times in the overlap is encoded with reference to a value of the first set which is closer in time to the at least one value of the second set than any other value in the second set. By encoding at least one value of the second set with reference to such a value of the first set, a better exploitation of redundancy in the values is achieved. This aspect of the invention is based on the insight that, when overlapping time intervals are used, the first set may contain a value related to a time that is closer to the time of the current value of the second set to be encoded than the time of any other value available in the second set. Because values are in general more strongly correlated when they are closer in time, this stronger correlation can be used to code the signal more efficiently.
The subsequent times may be time instants (or points) or time spans smaller than the time interval (e.g. related to sub-frames). The second time interval will usually be subsequent in time to the first time interval, but may also be preceding the first time interval.
The overlapping times are not necessarily identical: the times of the second time interval may have an offset relative to the times of the first time interval. In the case that the times are time instants, the differences in time between subsequent time instants in the first time interval are not necessarily the same as the differences in time between the subsequent time instants in the second time interval. Further, if the times are time spans, they do not necessarily have the same length within the respective time interval or relative to the other time interval. In preferred embodiments the number of times per time interval is the same for the first time interval and the second time interval and the times are (substantially) evenly distributed over the respective time intervals.
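The choice of reference can be illustrated with a minimal Python sketch. It assumes evenly spaced, increasing time instants, a hypothetical offset of a quarter step between the two sets, and it compares each value of the second set only against the value that precedes it in the second set. The function name `choose_references` and the time values are illustrative assumptions, not part of the described scheme.

```python
def choose_references(times_first, times_second):
    """For each time in the second set, pick the set holding the closest reference.

    Returns a list of ("first", index) or ("second", index) tuples.
    Assumes both time lists are sorted in increasing order.
    """
    refs = []
    for j, t in enumerate(times_second):
        # Closest value of the first set.
        i_first = min(range(len(times_first)),
                      key=lambda n: abs(times_first[n] - t))
        d_first = abs(times_first[i_first] - t)
        # Closest preceding value of the second set (none for the first value).
        d_second = abs(times_second[j - 1] - t) if j > 0 else float("inf")
        if d_first < d_second:
            refs.append(("first", i_first))
        else:
            refs.append(("second", j - 1))
    return refs


# Hypothetical time grid: 7 values per set, second set starts 4.25 steps later.
times_first = [float(n) for n in range(7)]
times_second = [4.25 + n for n in range(7)]
refs = choose_references(times_first, times_second)
print(refs)
```

With these assumed times, the first three values of the second set are referenced to the first set, and the remaining values are referenced to their predecessor in the second set.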
The sets of values may be included in frames or sub-frames.
Although the invention is applicable to any coding scheme which uses frames related to overlapping time intervals and any kind of values, the invention is advantageously applied in a parametric audio coding scheme, wherein the values are e.g. gains of a noise component in the audio signal.
These and other aspects of the invention will be apparent from and elucidated with reference to the accompanying drawings.
In the drawings:
The drawings only show those elements that are necessary to understand the embodiments of the invention. The numbers in the drawings denote serial numbers of the values in a given sub-frame, subsequent serial numbers being related to subsequent times in the respective time interval to which the given sub-frame relates.
In a preferred parametric coding scheme, the input signal is typically dissected into transient signal components, sinusoidal signal components and noise components. Reference is made to WO 01/69593-A1. The parameters representing the sinusoidal components are typically chosen to be amplitude, frequency and phase. For the transient components, extending such parameters with an envelope description gives an efficient representation of the transient component. With respect to the noise, the spectral shape and a gain parameter controlling a random noise generator represent an efficient parametric representation. In order to encode all these parameters at a sufficiently low bit-rate, redundancy between these parameters at successive time instants must be exploited. For example, in the case of the sinusoidal components, the amplitude and frequency parameters of a single component are slowly varying in time. It is therefore beneficial to encode the changes in amplitude and frequency. Per analysis frame a single parameter for frequency and amplitude is to be encoded.
In the case of the parameterization of the noise signal, a number of gain parameter values, e.g. 7, is obtained per sub-frame, each gain value representing the power in the sub-sub-frame to which it relates. A number of sub-frames are included in a noise frame. The analysis frames are e.g. 50% overlapping. This is visualized in
Due to the slowly varying nature of the gain parameters, redundancy is exploited by encoding these parameters differentially. For that purpose the estimated gain parameters are organized sequentially:
. . . g(i−1,7) g(i,1) g(i,2) . . . g(i,6) g(i,7) g(i+1,1) g(i+1,2) . . . g(i+1,6) g(i+1,7) . . .
where g(a, b) denotes the bth noise gain representation level of sub-frame a. The differences between successive representation levels are then entropy encoded using a Huffman table.
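As a minimal sketch of this conventional scheme, the following Python fragment concatenates the gains of two consecutive sub-frames and takes successive differences. The gain levels are hypothetical integers chosen for illustration, the helper name `differential` is an assumption, and the entropy-coding stage is omitted.

```python
def differential(levels):
    """Return the first level followed by the successive differences."""
    return [levels[0]] + [b - a for a, b in zip(levels, levels[1:])]


# Hypothetical representation levels g(i,1..7) and g(i+1,1..7).
g_i = [12, 13, 13, 14, 18, 19, 21]
g_i1 = [20, 22, 23, 24, 24, 23, 22]

# Conventional ordering: all gains of sub-frame i, then all gains of sub-frame i+1.
sequential = g_i + g_i1
deltas = differential(sequential)
print(deltas)
```

In this ordering the first gain of sub-frame i+1 is always coded relative to g(i,7), the last gain of sub-frame i.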
According to embodiments of the invention, the estimated parameter values, in this example the gain parameters, are organized such that the redundancy is even better exploited. With respect to conventional coding, a simple change to the bit-stream syntax results in an improvement in coding efficiency.
Approach 1
In the parametric coding example the estimated noise gains are organized as follows (see also FIG. 2):
. . . g(i,3) g(i,4) g(i,5) g(i+1,1) g(i,6) g(i+1,2) g(i,7) g(i+1,3) g(i+1,4) g(i+1,5) . . .
The thus obtained sequence of gain parameters is preferably differentially encoded.
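The interleaved ordering of Approach 1 can be sketched as a merge by time. The sketch assumes that sub-frame i+1 starts four gain positions after sub-frame i and that, where two gains share a nominal position, the gain from the earlier sub-frame comes first. The names `interleave` and `hop` and the gain levels are illustrative assumptions.

```python
def interleave(subframes, hop=4):
    """Merge the gains of overlapping sub-frames into one time-ordered sequence."""
    keyed = [
        (hop * i + k, i, gain)           # (nominal position, sub-frame index, gain)
        for i, frame in enumerate(subframes)
        for k, gain in enumerate(frame)  # k is zero-based, so g(i,1) has k = 0
    ]
    # Sort by position; on ties the earlier sub-frame goes first.
    keyed.sort(key=lambda item: (item[0], item[1]))
    return [gain for _, _, gain in keyed]


# Hypothetical representation levels g(i,1..7) and g(i+1,1..7).
g_i = [12, 13, 13, 14, 18, 19, 21]
g_i1 = [20, 22, 23, 24, 24, 23, 22]

ordered = interleave([g_i, g_i1])
print(ordered)
```

The resulting list follows the order g(i,5) g(i+1,1) g(i,6) g(i+1,2) g(i,7) g(i+1,3) g(i+1,4) in the overlap and can then be passed to a differential encoder.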
Approach 2
A second approach, which proved to be slightly more efficient in the case of the parametric coding example, is as follows (see also
Step A) First, for frame i the gains are organized as: g(i,3) g(i,4) g(i,5) g(i,6) g(i,7), which are then (preferably differentially) encoded.
Step B) Then the pairs g(i,5) g(i+1,1), g(i,6) g(i+1,2) and g(i,7) g(i+1,3) are (preferably differentially) encoded.
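Steps A and B can be sketched in Python as follows. This is only an illustration: the function name, the parameters `n_overlap` and `offset`, and the gain levels are assumptions, and entropy coding is left out.

```python
def encode_with_cross_reference(prev, cur, n_overlap=3, offset=4):
    """Differences for sub-frame `cur`, given the preceding sub-frame `prev`.

    The first `n_overlap` gains are referenced to prev[offset + k], i.e.
    g(i+1,1) to g(i,5), g(i+1,2) to g(i,6) and g(i+1,3) to g(i,7).
    The remaining gains are referenced to their predecessor in `cur`.
    """
    diffs = []
    for k, gain in enumerate(cur):
        reference = prev[offset + k] if k < n_overlap else cur[k - 1]
        diffs.append(gain - reference)
    return diffs


# Hypothetical representation levels g(i,1..7) and g(i+1,1..7).
g_i = [12, 13, 13, 14, 18, 19, 21]
g_i1 = [20, 22, 23, 24, 24, 23, 22]

diffs_i1 = encode_with_cross_reference(g_i, g_i1)
print(diffs_i1)
```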
Approach 3
Further investigation showed that the three inter-frame differences g(i+1,1)-g(i,5), g(i+1,2)-g(i,6) and g(i+1,3)-g(i,7) are very similar. Therefore, it is even more efficient to encode the mean m of these differences and then code each difference with respect to this mean. This means that an extra parameter, the mean difference, is included in the bit-stream.
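The mean-removal step can be sketched as below. The text does not state how the mean is quantized, so rounding to the nearest integer is an assumption here, as are the function name and the gain levels.

```python
def mean_removed_differences(prev, cur, n_overlap=3, offset=4):
    """Return the mean m of the inter-frame differences and the differences minus m."""
    diffs = [cur[k] - prev[offset + k] for k in range(n_overlap)]
    # Assumption: the mean is rounded to an integer level before coding.
    mean = round(sum(diffs) / n_overlap)
    residuals = [d - mean for d in diffs]
    return mean, residuals


# Hypothetical representation levels g(i,1..7) and g(i+1,1..7).
g_i = [12, 13, 13, 14, 18, 19, 21]
g_i1 = [20, 22, 23, 24, 24, 23, 22]

m, residuals = mean_removed_differences(g_i, g_i1)
print(m, residuals)
```

When the three inter-frame differences are similar, the residuals cluster around zero, which is the property the approach relies on.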
As a comparison of the different approaches consider the following example:
For the different approaches as described above using differential encoding this would deliver the sequences:
Note that even though in approach 3 an extra parameter is added the resulting sequence can be encoded more efficiently.
In a practical embodiment of noise frame encoding, each sub-frame defines or updates filter parameters which remain constant over the sub-frame. Per sub-frame several subsequent gain parameter values are given which relate to subsequent times in the time interval to which the sub-frame relates. The sub-frames overlap in time. A refresh noise frame is defined which starts with a sub-frame comprising refresh filter parameters which are encoded as absolute filter parameters. Filter parameters in other sub-frames are mainly differentially encoded.
In a preferred practical embodiment, the following coding strategy is used: For the first sub-frame of a ‘refresh-frame’ the first noise gain is coded absolutely. All following noise gains of that sub-frame are encoded differentially. For all other sub-frames instead of encoding the difference g(i+1,1)-g(i,7) the difference g(i+1,1)-g(i,5) is encoded, thus exploiting the redundancy that is apparent between noise-gains that are analyzed at similar time-instances. The same is repeated for g(i+1,2) and g(i+1,3). So, instead of encoding the difference g(i+1,2)-g(i+1,1) respectively g(i+1,3)-g(i+1,2), the difference g(i+1,2)-g(i,6) respectively g(i+1,3)-g(i,7) is encoded (see also
In an even more preferred practical embodiment, the following coding strategy is used:
For the first sub-frame of a ‘refresh-frame’ the first noise gain is coded absolutely. All following noise gains of that sub-frame are encoded differentially. For any other sub-frame i+1 the differences g(i+1,1)-g(i,5), g(i+1,2)-g(i,6) and g(i+1,3)-g(i,7) and the mean value m(i+1) of these differences are calculated. First the mean value m(i+1) is encoded into the bit-stream, followed by the differences g(i+1,1)-g(i,5)-m(i+1), g(i+1,2)-g(i,6)-m(i+1) and g(i+1,3)-g(i,7)-m(i+1), which represent the differences to the mean value. Finally the values g(i+1,4)-g(i+1,3), g(i+1,5)-g(i+1,4), g(i+1,6)-g(i+1,5) and g(i+1,7)-g(i+1,6) are encoded into the bit-stream.
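The coding strategy of this embodiment can be sketched for the gain parameters alone. The sketch produces the list of symbols that would be passed to the entropy coder; filter parameters are omitted, the mean is rounded to an integer as an assumption, and all names and gain levels are hypothetical.

```python
def encode_noise_gains(subframes, n_overlap=3, offset=4):
    """Flat list of gain symbols for the sub-frames of one refresh frame."""
    symbols = []
    for i, cur in enumerate(subframes):
        if i == 0:
            # First sub-frame: absolute first gain, then intra-frame differences.
            symbols.append(cur[0])
            symbols.extend(cur[k] - cur[k - 1] for k in range(1, len(cur)))
            continue
        prev = subframes[i - 1]
        # Inter-frame differences for the overlapping gains.
        diffs = [cur[k] - prev[offset + k] for k in range(n_overlap)]
        mean = round(sum(diffs) / n_overlap)  # assumption: integer mean
        symbols.append(mean)                         # m(i+1)
        symbols.extend(d - mean for d in diffs)      # differences to the mean
        # Non-overlapping gains: differences to the previous gain in the sub-frame.
        symbols.extend(cur[k] - cur[k - 1] for k in range(n_overlap, len(cur)))
    return symbols


# Hypothetical representation levels g(i,1..7) and g(i+1,1..7).
g_i = [12, 13, 13, 14, 18, 19, 21]
g_i1 = [20, 22, 23, 24, 24, 23, 22]

symbols = encode_noise_gains([g_i, g_i1])
print(symbols)
```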
Except for the first sub-frame of a refresh noise frame, first the mean m(i+1) of the overlapping differences is inserted just after the differential parameters representing the filter. Immediately after the mean m(i+1), the differences to the mean value m(i+1) are inserted into the bit-stream. For the non-overlapping gain values the parameters are encoded differentially. This embodiment results in the following bit-stream syntax:
The mean differential gain coefficient m(i+1) is preferably encoded by using a Huffman table. The differences to the mean m(i+1) are also preferably encoded by using a Huffman table. The other differential noise parameters are also preferably encoded by use of a Huffman table.
In a decoder, the noise gain parameter values in sub-frame i+1 relating to the overlap are obtained by adding the mean m(i+1) and the respective ‘difference to the mean value’ to the noise gain parameter value of the sub-frame i which value is used as reference value. For example in the above example (see
Especially speech excerpts which may be critical for parametric encoding benefit from embodiments of the invention. The extra decoder complexity caused by the embodiments of the invention is negligible.
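A minimal decoder sketch illustrates how little extra work is involved. It assumes the gain symbols arrive in the order described for the preferred embodiment (for the first sub-frame an absolute gain followed by differences; for every other sub-frame the mean, three differences to the mean and four intra-frame differences). Function and parameter names and the symbol values are hypothetical.

```python
def decode_noise_gains(symbols, n_subframes, n_gains=7, n_overlap=3, offset=4):
    """Rebuild the gain levels of each sub-frame from a flat symbol list."""
    it = iter(symbols)
    subframes = []
    for i in range(n_subframes):
        if i == 0:
            # Absolute first gain, then accumulate the differences.
            cur = [next(it)]
            for _ in range(n_gains - 1):
                cur.append(cur[-1] + next(it))
        else:
            prev = subframes[-1]
            mean = next(it)
            # Overlapping gains: reference value + mean + difference to the mean.
            cur = [prev[offset + k] + mean + next(it) for k in range(n_overlap)]
            # Remaining gains: previous gain of this sub-frame + difference.
            for _ in range(n_gains - n_overlap):
                cur.append(cur[-1] + next(it))
        subframes.append(cur)
    return subframes


# Hypothetical symbol stream for two sub-frames.
symbols = [12, 1, 0, 1, 4, 1, 2, 2, 0, 1, 0, 1, 0, -1, -1]
decoded = decode_noise_gains(symbols, n_subframes=2)
print(decoded)
```

Compared with plain differential decoding, the only additions are one extra symbol per sub-frame and one extra addition for each of the three overlapping gains.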
Application areas of embodiments of the invention are: Internet download, Internet Radio, Solid State audio.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design many alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of other elements or steps than those listed in a claim. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
Number | Date | Country | Kind |
---|---|---|---|
01204653 | Nov 2001 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB02/04776 | 11/13/2002 | WO | 00 | 5/25/2004 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO03/046889 | 6/5/2003 | WO | A |
Number | Date | Country | |
---|---|---|---|
20050021326 A1 | Jan 2005 | US |