The present invention relates to signal processing, in particular the processing of digital signals in the field of telecommunications; these signals may be, for example, speech signals, music signals, video signals, or the like.
Generally, the bit rate required to pass an audio and/or video signal with sufficient quality is an important parameter in telecommunications. In order to reduce this parameter and thus increase the number of possible communications via one and the same network, audio coders have been developed in particular for compressing the quantity of information required to transmit a signal.
Certain coders make it possible to achieve particularly high information compression factors. Such coders generally use advanced information modeling and quantization techniques. Thus, these coders only transmit models or partial data of the signal.
The decoded signal, although it is not identical to the original signal (since part of the information has not been transmitted on account of the quantization operation), nevertheless remains very close to the original signal (at least from the perception point of view). The difference, in the mathematical sense, between the decoded signal and the original signal is then called “quantization noise”.
Signal compression processes are often designed to minimize quantization noise and, in particular, to render this quantization noise as inaudible as possible when an audio signal is processed. Thus, techniques exist which take into account the psycho-acoustic characteristics of hearing, with the aim of “masking” this noise. However, at the lowest bit rates, the quantization noise may sometimes be difficult (or indeed impossible) to mask totally, which, in certain circumstances, degrades the intelligibility and/or the quality of the signal.
In order to reduce this quantization noise and hence improve quality, two families of techniques can be used on decoding.
It is possible, firstly, to use an adaptive post-filter, of the type described in the article by Chen and Gersho:
“Adaptive post filtering for quality enhancement of coded speech”, IEEE Transactions on Speech and Audio Processing, Vol. 3, No. 1, January 1995, pages 59-71, and employed in particular in the speech decoders of CELP (“Code Excited Linear Prediction”) type.
This involves performing a filtering which improves subjective quality by attenuating the signal in the zones where the quantization noise is most audible (in particular between the formants and between the harmonics of the fundamental period or “pitch”). Current adaptive post-filters give good results for speech signals, but poorer results for other types of signals (music signals, for example).
Another family comprises the conventional noise reduction processings which distinguish the useful signal from spurious noise and which can be applied as post-processing to reduce the quantization noise after decoding. This type of processing was originally designed to reduce the noise related to the signal capture environment, and it is often used for speech signals. However, such processing cannot be made transparent with respect to the noise of the sound pick-up environment, which poses a problem for music signal coding in particular. Thus, in coding/decoding, one might want to transmit the “atmospheric” noise, and it is then desirable for the noise reduction to apply not to this “atmospheric” noise but solely to the quantization noise, in particular in the context of post-processing on decoding aimed at reducing quantization noise.
Nevertheless, these various types of quantization noise reduction methods deform the signal to a greater or lesser extent. For example, a post-filter (denoising) that is too aggressive for a speech signal could completely eliminate the quantization noise, but the resulting voice would sound less natural and/or muffled. Optimization of these various types of methods is therefore difficult, and a compromise must systematically be found between:
The present invention aims to improve the situation.
To this end it proposes a method for processing a digital signal, arising from a decoder and from a noise reduction post-processing. The method within the meaning of the invention proposes a limitation of a distortion introduced by the post-processing so as to deliver a corrected output signal, by assigning the corrected output signal:
Advantageously, a delay line is provided so as to ensure a temporal correspondence between the current amplitude of the post-processed signal and the corresponding current amplitude of the decoded signal.
In a particular embodiment, the method comprises the steps:
Thus, the present invention proposes that the post-processed signal not deviate from the decoded signal beyond a certain tolerance.
It is then possible, in one embodiment, to assign a span of amplitude values to each possible amplitude value of the decoded signal so as to define this tolerance quantitatively, in such a way that the aforesaid lower and upper bounds are chosen so that the difference between the upper bound and the lower bound is equal to this span of values.
This embodiment can advantageously be implemented in the case where the signal received has been coded by a scalar quantization coding, the decoder delivering quantized amplitude values which vary from one to another in a discrete manner, the successive deviations between the quantized values defining successive quantization stepsizes. Thus:
An exemplary scalar quantization coding is so-called “pulse code modulation” coding, delivering a coded index. In this case, it is possible to determine respective current values of the lower and upper bounds simply on the basis of the current coded index, received at the decoder. Moreover, provision may be made for a correspondence table giving, for a current index received, a corresponding quantized value and a half of a corresponding quantization stepsize, on the basis of which the respective current values of the lower and upper bounds can then be determined.
Other characteristics and advantages of the invention will be apparent on examining the description detailed hereinbelow and the appended drawings in which:
The present invention advantageously intervenes in the context of a coding/decoding of the scalar quantization type. For example, in the case of PCM (“Pulse Code Modulation”) type coding, each input sample is individually coded, without prediction. The principle of such a codec is recalled with reference to
This type of coding, within the meaning of the ITU-T G.711 standard, carries out a compression of the signals sampled at 8 kHz, typically defined in a minimum band of frequencies from 300 to 3400 Hz, by a logarithmic curve which makes it possible to obtain a nearly constant signal-to-noise ratio for a wide dynamic range of signals.
More precisely, the quantization stepsize is approximately proportional to the amplitude of the signals. The initial signal S is firstly coded (module 10) in a coder 13 and the resulting sequence of indices IPCM is represented on 8 bits per sample (see the reference 15 of
For example, an original sample of the signal S to be coded has an amplitude equal to −75. Consequently, this amplitude lies in the interval [−80, −65] of row 123 (or “level” 123) of the chart. The coding of this information consists in delivering a coded final index, referenced I′Pcm in
To facilitate its implementation, the PCM compression is carried out by a segment-based linear amplitude compression. In the ITU-T G.711 standard, the bits characterizing 256 quantized values are thus distributed in the following manner:
In the G.711 standard according to the A-law in particular, the quantization stepsize is multiplied by two (16, 32, 64, . . . ) on passing from one segment to the next, doing so from the second segment onward. This coding law therefore makes it possible to have a quantization precision of 12 bits (with a quantization stepsize of 16) on the first two segments of indices 0 and 1 (chart 2). Then, the precision decreases by 1 bit at each incrementation of the segment index (the quantization stepsize being multiplied by two at each incrementation), as shown by chart 2 hereinbelow.
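By way of illustration only, the progression of the stepsize described above can be sketched as follows. The function names `a_law_step` and `precision_bits` are hypothetical, and the sketch assumes eight segments (indices 0 to 7) on the amplitude scale used in the text, where the smallest stepsize is 16.

```python
def a_law_step(segment: int) -> int:
    """Quantization stepsize of an A-law segment (hypothetical helper)."""
    if not 0 <= segment <= 7:
        raise ValueError("segment index must lie in 0..7")
    # Segments 0 and 1 share the stepsize 16; the stepsize then doubles
    # at each incrementation of the segment index.
    return 16 << max(segment - 1, 0)


def precision_bits(segment: int) -> int:
    """Quantization precision: 12 bits on segments 0 and 1, then one bit less per segment."""
    return 12 - max(segment - 1, 0)


# One row per segment: index, stepsize, precision, half-stepsize.
for seg in range(8):
    step = a_law_step(seg)
    print(seg, step, precision_bits(seg), step // 2)
```

The last column is half the stepsize, that is, the largest deviation that the quantization alone can introduce in the segment concerned.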
Chart 2 is interpreted as follows. By way of example, if the amplitude of an original sample equals −30000:
Likewise, if the amplitude of an original sample equals +4000:
Chart 3 hereinbelow is the equivalent of chart 2, but for the G.711 standard such as it is practiced in particular in the United States of America or in Japan (termed the “μ-law”), with in particular the quantization stepsizes and the maximum possible deviations EMAX between the quantized value QV and the real value of the amplitude of the original sample.
Returning to row 123 of chart 1, all the 16 values of the interval [−80, −65] are represented by the code word of 0x51 which, once decoded, gives the quantized value −72. However, it should be noted that conversely, by obtaining a decoded value −72, it is certain that the original value which has been coded was in the interval [−80, −65]. It is therefore known that the maximum amplitude of the coding error for this sample is EMAX=8, this corresponding to half the quantization stepsize.
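The example of row 123 can be checked with a short sketch. The dictionary `TABLE` below is a hypothetical one-entry excerpt of a correspondence table, holding only the values stated above (code word 0x51, quantized value −72, EMAX = 8), and `interval_bounds` is an illustrative name.

```python
# Hypothetical excerpt: code word -> (quantized value QV, maximum error EMAX).
TABLE = {0x51: (-72, 8)}


def interval_bounds(index: int) -> tuple[int, int]:
    """Return the bounds of the interval in which the original sample lay."""
    qv, emax = TABLE[index]
    # The interval spans one full stepsize (2 * EMAX values) around QV.
    return qv - emax, qv + emax - 1


low, high = interval_bounds(0x51)
print(low, high, high - low + 1)
```

For the code word 0x51 this gives the bounds −80 and −65, an interval of 16 values, in agreement with row 123.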
In what follows, it will be supposed that the final index I′Pcm received at the decoder makes it possible to determine, on the one hand, the quantized value QV and, on the other hand, the segment index ID-SEG on the basis of which the quantization stepsize can be deduced and, from this, the maximum amplitude of the coding error EMAX. It will also be noted that the index of the segment ID-SEG can also be found as a function of the position of the highest-order bit of the amplitude of the signal in the case of a G.711 coding according to the A-law (chart 2). As a general rule, it will also be supposed that a specific feature of PCM coding is that the original sample and the decoded sample always have their amplitude in one and the same quantization interval:
Again with reference to
Indeed, as indicated previously, the post-processing 16 (even if it is in general of linear phase type so as to preserve the waveform) may be too aggressive and impair in particular the natural aspect of a speech signal. At the decoder, information about the original signal is nevertheless available and can be utilized, within the meaning of the present invention, to limit the deviation between the decoded and post-filtered signal SPOST, on the one hand, and the original signal S, on the other hand. Thus the module 20 (
A possible exemplary embodiment, described in detail further on, is to require that the distortion introduced by the post-processing 16 with respect to the decoded signal S′Pcm cannot be greater than the maximum amplitude of the coding error EMAX. This therefore ensures that the post-filtered signal remains in the same quantization interval as the original signal. The overall distortion due to the coding/decoding processing and post-processing is limited, and in particular very close to the maximum distortion of the coding EMAX. This measure also ensures that the energy distribution between successive samples and the overall waveform are well preserved.
An exemplary implementation of the invention is illustrated in
An exemplary embodiment of the delay line 23 can be the following. Assuming that the post-processing 22 introduces a delay of 16 samples, the module 23 then comprises, in an advantageous manner, a memory MEM of 16 samples, with shift register. For example, the index 0 of this memory corresponds to the oldest sample, whereas the index 15 corresponds to the last sample stored. Thus, when a new index arrives at the input of the module 23, the following operations are carried out:
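A shift-register memory of this kind can be sketched as follows, purely by way of illustration. The class name `DelayLine`, the initial fill value and the exact order of the operations are assumptions; only the size of 16 samples and the convention that index 0 holds the oldest sample come from the description.

```python
class DelayLine:
    """Shift-register delay of a fixed number of samples (illustrative sketch)."""

    def __init__(self, delay: int = 16, fill: int = 0):
        # mem[0] is the oldest sample, mem[-1] the last sample stored.
        self.mem = [fill] * delay

    def push(self, new_index: int) -> int:
        oldest = self.mem[0]             # the oldest sample is output
        self.mem[:-1] = self.mem[1:]     # the memory is shifted by one position
        self.mem[-1] = new_index         # the new index takes the last position
        return oldest


line = DelayLine(16)
outputs = [line.push(i) for i in range(1, 40)]
```

With a memory of 16 samples, the first 16 outputs are the initial fill value, after which each index reappears 16 calls after it was stored.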
On the basis of the delayed index I′Pcm
Here, the information given by chart 4 evolves as a function of the quantized value QV to show that this chart 4 is derived from chart 1 given above. However, in practice and as explained further on, it is preferable to use a table 24 which, as input, catalogs the received and delayed indices I′Pcm
Thus, chart 5 presents the respective parameters QV and EMAX as a function of a given index I′Pcm
Of course, it would be possible, as a variant, to present the decoded signal S′Pcm (before post-processing) to the input of the delay line 23 and, on the basis of the quantized value QV assigned to each sample, deduce therefrom the corresponding parameter EMAX. A table 24 laid out according to chart 4 given above would then be used.
However, this embodiment is less advantageous in particular in the coding according to the μ-law, for which the equivalent of chart 1, given for the A-law, is given hereinbelow in chart 6.
It will indeed be noted in chart 6 that one and the same quantized value QV=0 is assigned for different received indices: I′Pcm=0x7f and I′Pcm=0xff. Thus, in the case of a coding according to the μ-law, when the module 25 operates on the basis of the index received (and not on the basis of the quantized value), the bounds of the intervals in which the amplitude of an original sample could have lain can be more finely determined.
The data that a table 24 can comprise in a processing of the type represented in
The table 24 (which can therefore include the data of charts 5 or 7) can be hard-stored in a memory of a module 20 (
Actually, the identifier of the segment ID-SEG is coded on three bits in the index received and delayed I′Pcm
in accordance with charts 2 and 3 given previously.
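By way of illustration only, the derivation of EMAX from the received index can be sketched for the A-law. The bit layout used here (inversion of the even bits by an exclusive OR with 0x55, segment identifier in bits 4 to 6) is the usual G.711 A-law format and is an assumption not spelled out in the text; the function names are hypothetical.

```python
def a_law_segment(index: int) -> int:
    """Extract the 3-bit segment identifier from an A-law code word (assumed layout)."""
    return ((index ^ 0x55) >> 4) & 0x07


def a_law_emax(index: int) -> int:
    """Half the quantization stepsize of the segment to which the index belongs."""
    seg = a_law_segment(index)
    step = 16 << max(seg - 1, 0)   # 16 on segments 0 and 1, doubled afterwards
    return step // 2


print(a_law_segment(0x51), a_law_emax(0x51))
```

For the code word 0x51 of row 123, this gives the segment 0 and EMAX = 8, consistent with the example given previously.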
Thereafter, the module 26 verifies whether the deviation between the post-processed sample SPOST and the sample just decoded without post-processing S′Pcm exceeds the value found for the parameter EMAX, in which case the post-processing has induced distortions that it is appropriate to limit. In an exemplary embodiment, the value of the sample SPOST is then brought back to a value closer to the quantized value QV, so that the deviation between the values SPOST and QV remains below an authorized threshold.
Accordingly, the module 26 operates, as follows, on the basis:
The same check is performed in step 35, but for the upper limit LimSUP. Finally the output SOUT gives:
Thus the output signal SOUT always remains in the same quantization interval as the original signal S.
In this exemplary embodiment, the output signal is strictly reduced to the quantization interval of the original signal, delimited by:
[S′Pcm−EMAX, S′Pcm+EMAX−1].
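The limitation to this interval can be sketched as follows. This is a minimal illustration, assuming integer amplitudes; the function name `limit_distortion` and its argument names are hypothetical.

```python
def limit_distortion(s_post: int, s_dec: int, emax: int) -> int:
    """Keep the post-processed sample inside [s_dec - emax, s_dec + emax - 1]."""
    lim_inf = s_dec - emax
    lim_sup = s_dec + emax - 1
    if s_post < lim_inf:
        return lim_inf   # below the interval: the lower limit is assigned
    if s_post > lim_sup:
        return lim_sup   # above the interval: the upper limit is assigned
    return s_post        # inside the interval: the sample is left unchanged


print(limit_distortion(-100, -72, 8))
print(limit_distortion(-60, -72, 8))
print(limit_distortion(-70, -72, 8))
```

With the decoded value −72 and EMAX = 8 of the earlier example, a post-processed sample of −100 is brought back to −80, a sample of −60 to −65, and a sample of −70 is output unchanged.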
Of course, the interval in which it is desired to preserve the amplitude of the output signal with respect to the quantized value found could be defined otherwise. For example provision may be made for:
In the last two examples, the distortion of the post-processing is limited with respect to the decoded signal, and not necessarily with respect to the original signal, according to the type of coding/decoding employed.
In the exemplary embodiment illustrated in
The signal-to-noise ratio (denoted SNR hereinbelow), obtained by the PCM coding/decoding, is substantially constant (of a level of about 38 dB) for a wide dynamic range of signals. On the other hand, for the low signal levels (in the first identifier segment 0 typically) the SNR is low and may even be negative at the start of the segment of the amplitude compression law. The output of the PCM decoder is then very “noisy” for the signals of low amplitude (for example in the cases of silence between two sentences of a speech signal). Moreover, it is difficult to suppress the PCM coding/decoding noise simply with a post-filter, having regard to the very low SNR. A solution often consists in modifying the post-processing of signals of very low amplitude by greatly decreasing the amplitude of the decoded signal. The amplitude of the signal resulting from this type of post-processing is therefore not at all faithful to the amplitude of the original signal. Under these conditions, it is preferable to disable the limitation of distortion due to the post-processing and steps 32 to 35 of the processing within the meaning of the invention (
Thus, with reference to
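A disabling of the limitation for low signal levels can be sketched as follows, by way of illustration only. The criterion used here (the limitation is skipped when the segment identifier is 0) is an assumption drawn from the remark above that low levels typically lie in segment 0; the function name and the parameter `bypass_segments` are hypothetical.

```python
def limit_with_bypass(s_post: int, s_dec: int, emax: int, segment: int,
                      bypass_segments: tuple = (0,)) -> int:
    """Limit the distortion except in the segments where limitation is disabled."""
    if segment in bypass_segments:
        # Low-level signal: the post-processed sample is delivered as is.
        return s_post
    lim_inf = s_dec - emax
    lim_sup = s_dec + emax - 1
    return min(max(s_post, lim_inf), lim_sup)


print(limit_with_bypass(0, -72, 8, segment=0))
print(limit_with_bypass(-300, -264, 8, segment=1))
```

In the first call the sample lies in segment 0 and the strongly attenuated value 0 is kept; in the second call the limitation applies and the output is brought back to the lower limit −272.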
Of course, the present invention is not limited to the form of embodiment described above by way of example; it extends to other variants.
For example, the distortion limitation module 20 is represented in
Moreover, an exemplary embodiment has been described above in which intervals were defined around the decoded value S′ (which can be the quantized value QV in the case of a scalar quantization coding/decoding of the type described above). However, this embodiment was described by way of nonlimiting example. Provision may be made, as a variant, to assign to the amplitude of the output signal SOUT the mean (or more generally a weighted mean) between the decoded value S′ and the post-processed amplitude value SPOST, while authorizing the direct assignment of the post-processed amplitude value SPOST if, for example, this latter SPOST is still in a chosen interval. Thus, by defining lower LimINF and upper LimSUP limits of intervals, or by defining means (optionally weighted) between the decoded value S′ and the post-processed amplitude SPOST, a possible intermediate value that can be taken by the output signal SOUT, corrected within the meaning of the invention, is always defined.
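The weighted-mean variant can be sketched as follows. This is an illustration under stated assumptions: the chosen interval is taken to be the one of the embodiment described above, and the function name `soft_limit` and the parameter `weight` (the weight given to the decoded value) are hypothetical.

```python
def soft_limit(s_post: float, s_dec: float, emax: float, weight: float = 0.5) -> float:
    """Assign SPOST directly inside the interval, a weighted mean otherwise."""
    lim_inf = s_dec - emax
    lim_sup = s_dec + emax - 1
    if lim_inf <= s_post <= lim_sup:
        return s_post                                  # direct assignment
    # Outside the interval: intermediate value between S' and SPOST.
    return weight * s_dec + (1.0 - weight) * s_post


print(soft_limit(-70, -72, 8))
print(soft_limit(-100, -72, 8))
print(soft_limit(-100, -72, 8, weight=0.75))
```

Unlike a strict limitation, the intermediate value obtained in this way is not guaranteed to lie within the interval; a weight closer to 1 brings the output closer to the decoded value.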
More generally, the present invention applies to any type of coding/decoding, beyond a coding according to the G.711 standard, and for example the embodiment described in detail above can be applied in particular in the case of a scalar quantization coding/decoding with any number of levels, followed, on decoding, by a linear-phase type post-processing.
The present invention is also aimed at a digital signal processing module 20, this signal being decoded by an upstream decoder 14 (
A storage memory of such a module 20 can advantageously also comprise a computer program comprising instructions for implementing the method within the meaning of the invention, when these instructions are executed by a processor μP of the module 20. Typically,
Number | Date | Country | Kind |
---|---|---|---|
07/04901 | Jul 2007 | FR | national |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/FR08/51246 | 7/4/2008 | WO | 00 | 5/14/2010 |