Controlling bandwidth in encoders and/or decoders

Information

  • Patent Number
    11,462,226
  • Date Filed
    Monday, May 4, 2020
  • Date Issued
    Tuesday, October 4, 2022
Abstract
There are disclosed apparatus and methods for encoding and/or decoding information signals (e.g., audio signals). An encoder apparatus includes a plurality of frequency domain (FD) encoder tools for encoding an information signal, and an encoder bandwidth detector and controller configured to select a bandwidth for at least a subgroup of the FD encoder tools. The subgroup includes fewer FD encoder tools than the plurality of FD encoder tools. The selection is based on information signal characteristics, so that one of the FD encoder tools of the subgroup has a different bandwidth with respect to at least one of the FD encoder tools which are not in the subgroup.
Description
1. CONVENTIONAL TECHNOLOGY

The present examples relate to encoders and decoders and methods for these apparatus, in particular for information signals, such as audio signals.


BACKGROUND OF THE INVENTION

General audio codecs need to transmit music and speech signals in a very good quality. Such audio codecs are for instance used in Bluetooth where the audio signals are transmitted from the mobile phone to a headset or headphone and vice versa.


Quantizing parts of a spectrum to zeros often leads to a perceptual degradation. Therefore, it is possible to replace zero-quantized spectral lines with noise using a noise filler tool operating in the frequency domain (FD).


Temporal noise shaping (TNS) uses open-loop linear prediction in the frequency domain (FD). This predictive encoding/decoding process over frequency effectively adapts the temporal structure of the quantization noise to that of the time signal, thereby efficiently using the signal to mask the effects of noise. In the MPEG-2 Advanced Audio Coding (AAC) standard, TNS is currently implemented by defining one filter for a given frequency band, and then switching to another filter for the adjacent frequency band when the signal structure in the adjacent band is different from the one in the previous band.


Especially for speech signals, the audio content may be bandlimited, meaning that the active audio bandwidth covers only 4 kHz (narrow band, NB), 8 kHz (wide band, WB) or 16 kHz (super wide band, SWB). Audio codecs need to detect the active audio bandwidth and control the coding tools accordingly. As the detection of the bandwidth is not 100% reliable, technical issues may arise.


Some audio coding tools, e.g. Temporal Noise Shaping (TNS) or noise filling (NF), may cause annoying artefacts when operating on bandlimited audio files, e.g., if the tool is not aware of the active signal part. Assuming that a WB signal is coded at 32 kHz, the tools might fill the upper spectrum (8-16 kHz) with artificial noise.



FIG. 1 shows artificial noise generated by unguided tools: line 11 is the active signal up to WB while the signal 12 is artificially generated by a parametric tool, e.g. by noise filling, which is not aware of the active audio bandwidth.


Therefore, the tools need to be restricted to operate only on the active frequency regions.


Some codecs like AAC are configured so as to send the information on active spectrum per scale factor band. This information is also used to control the coding tools. This provides precise results but involves a significant amount of side information to be transmitted. As speech is usually just transmitted in NB, WB, SWB and FB, this limited set of possible active bandwidths is advantageously used to limit the side information.


It is unavoidable that a bandwidth detector returns wrong results from time to time. For instance, a detector may see the fade-out of a music signal and interpret this as a low-bandwidth case. For codecs which switch between the different bandwidth modes (NB, WB, SWB, FB) in a hard manner, e.g. the 3GPP EVS codec [1], this results in a rectangular spectral hole. Hard manner means that the complete coding operation is limited to the detected bandwidth. Such a hard switch can result in audible artefacts. FIG. 2 outlines the spectral hole 22 resulting from a wrong detection.



FIG. 2 shows a schematic outline of wrong bandwidth detection: all coding tools work on lower audio bandwidth, leading to rectangular spectral hole 22.


It is therefore desired to overcome or reduce impairments such as those identified above.


1.1 References



  • [1] 3GPP EVS Codec, http://www.3gpp.org/ftp//Specs/archive/26_series/26.445/26445-e10.zip, Section 5.1.6 “Bandwidth detection”



2. SUMMARY

According to an embodiment, an encoder apparatus may have: a plurality of frequency domain, FD, encoder tools for encoding an information signal, the information signal presenting a plurality of frames; and an encoder bandwidth detector and controller configured to select a bandwidth for at least a subgroup of the plurality of FD encoder tools, the subgroup including fewer FD encoder tools than the plurality of FD encoder tools, on the basis of information signal characteristics so that at least one of the FD encoder tools of the subgroup has a different bandwidth with respect to at least one of the FD encoder tools which are not in the subgroup.


According to another embodiment, a decoder apparatus may have: a plurality of FD decoder tools for decoding an information signal encoded in a bitstream, wherein: the FD decoder tools are divided:

    • in a subgroup including at least one FD decoder tool;
    • in remaining FD decoder tools including at least one FD decoder tool;
    • wherein the decoder apparatus is configured so that at least one of the plurality of decoder tools of the subgroup performs signal processing at a different bandwidth with respect to at least one of the remaining FD decoder tools of the plurality of decoder tools.


According to another embodiment, a system may have: an inventive encoder apparatus and an inventive decoder apparatus.


According to another embodiment, a method for encoding an information signal according to at least a plurality of operations in the frequency domain, FD, may have the steps of: selecting a bandwidth for a subgroup of FD operations; performing first signal processing operations at the bandwidth selected for the subgroup of FD operations; performing second signal processing operations at a different bandwidth for FD operations which are not in the subgroup.


According to yet another embodiment, a method for decoding a bitstream with an information signal and control data, the method including a plurality of signal processing operations in the frequency domain, FD, may have the steps of: choosing a bandwidth selection for a subgroup of FD operations on the basis of the control data; performing first signal processing operations at the bandwidth chosen for the subgroup of FD operations; performing second signal processing operations at a different bandwidth for FD operations which are not in the subgroup.


In accordance with examples, there is provided an encoder apparatus comprising:

    • a plurality of frequency domain, FD, encoder tools for encoding an information signal, the information signal presenting a plurality of frames; and
    • an encoder bandwidth detector and controller configured to select a bandwidth for at least a subgroup of the plurality of FD encoder tools, the subgroup including less FD encoder tools than the plurality of FD encoder tools, on the basis of information signal characteristics so that at least one of the FD encoder tools of the subgroup has a different bandwidth with respect to at least one of the FD encoder tools which are not in the subgroup.


Accordingly, it is possible to avoid spectral holes even in case of wrong detection of the bandwidth.


In accordance with examples, at least one FD encoder tool of the subgroup may be a temporal noise shaping, TNS, tool and/or a noise level estimator tool.


In accordance with examples, at least one FD encoder tool which is not in the subgroup is chosen among at least one of a linear predictive coding, LPC, based spectral shaper, a spectral noise shaper, SNS, tool, a spectral quantizer, and a residual coder.


In accordance with examples, the encoder bandwidth detector and controller is configured to select the bandwidth of the at least one FD encoder tool of the subgroup between at least a first bandwidth common to at least one of the FD encoder tools which are not in the subgroup and a second bandwidth different from the bandwidth of the at least one of the FD encoder tools which are not in the subgroup.


In accordance with examples, the encoder bandwidth detector and controller is configured to select the bandwidth of the at least one of the plurality of FD encoder tools on the basis of at least one energy estimate on the information signal.


In accordance with examples, the encoder bandwidth detector and controller is configured to compare at least one energy estimation associated to a bandwidth of the information signal to a respective threshold to control the bandwidth for the at least one of the plurality of FD encoder tools.


In accordance with examples, the at least one of the plurality of FD encoder tools of the subgroup comprises a TNS configured to autocorrelate a TNS input signal within the bandwidth chosen by the encoder bandwidth detector and controller.


In accordance with examples, the at least one of the FD encoder tools which are not in the subgroup is configured to operate at a full bandwidth.


Therefore, the bandwidth selection operates only for the tools of the subgroup (e.g., TNS, noise estimator tool).


In accordance with examples, the encoder bandwidth detector and controller is configured to select at least one bandwidth which is within the full bandwidth at which the at least one of the FD encoder tools which are not in the subgroup is configured to operate.


In accordance with examples, the at least one of the remaining FD encoder tools of the plurality of FD encoder tools is configured to operate in open chain with respect to the bandwidth chosen by the encoder bandwidth detector and controller.


In accordance with examples, the encoder bandwidth detector and controller is configured to select a bandwidth among a finite number of bandwidths and/or among a set of pre-defined bandwidths.


Therefore, the choice is limited and there is no necessity of encoding too complicated and/or long parameters. In examples, only one single parameter (e.g., encoded in 0-3 bits) may be used for the bitstream.


In accordance with examples, the encoder bandwidth detector and controller is configured to perform a selection among at least one or a combination of: 8 kHz, 16 kHz, 24 kHz, 32 kHz, and 48 kHz, and/or NB, WB, SSWB, SWB, FB, etc.


In accordance with examples, the encoder bandwidth detector and controller is configured to control the signalling of the bandwidth to a decoder.


Therefore, the bandwidth of signals processed by some tools at the decoder may also be controlled (e.g., using the same bandwidth).


In accordance with examples, the encoder apparatus is configured to encode a control data field including an information regarding the chosen bandwidth.


In accordance with examples, the encoder apparatus is configured to define a control data field including:

    • 0 data bits corresponding to NB bandwidth;
    • 1 data bit corresponding to NB, WB bandwidth;
    • 2 data bits corresponding to NB, WB, SSWB bandwidth;
    • 2 data bits corresponding to NB, WB, SSWB, SWB bandwidth;
    • 3 data bits corresponding to NB, WB, SSWB, SWB, FB bandwidth.


In accordance with examples, in the encoder apparatus at least one energy estimation is performed by:











$$E_B(n) = \frac{\sum_{k=I_{fs}(n)}^{I_{fs}(n+1)-1} X(k)^2}{I_{fs}(n+1)-I_{fs}(n)} \qquad \text{for } n = 0 \ldots N_B-1$$

where X(k) are MDCT (or MDST . . . ) coefficients, NB is the number of bands and Ifs(n) are the indices associated to the band.


In accordance with examples, the encoder apparatus comprises a TNS tool which may be configured to perform a filtering operation including the calculation of an autocorrelation function. One of the possible autocorrelation functions may be in the following form:












for each $k = 0 \ldots 8$:

$$r(k) = \begin{cases} r_0(k), & \text{if } \sum_{s=0}^{2} e(s) = 0 \\ \displaystyle\sum_{s=0}^{2} \frac{\sum_{n=\mathrm{sub\_start}(f,s)}^{\mathrm{sub\_stop}(f,s)-1-k} X_s(n)\,X_s(n+k)}{e(s)}, & \text{otherwise} \end{cases}$$

with

$$r_0(k) = \begin{cases} 1, & \text{if } k = 0 \\ 0, & \text{otherwise} \end{cases}$$

and

$$e(s) = \sum_{n=\mathrm{sub\_start}(f,s)}^{\mathrm{sub\_stop}(f,s)-1} X_s(n)^2 \qquad \text{for } s = 0 \ldots 2$$

where Xs(n) is the TNS input signal (e.g., the MDCT coefficients), and sub_start(f,s) and sub_stop(f,s) are associated with the particular bandwidth as detected by the encoder bandwidth detector and controller.
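
For illustration only, the normalized autocorrelation above may be sketched in Python/NumPy as follows; the function name tns_autocorrelation and the array arguments are placeholders chosen for this sketch (the sub_start/sub_stop indices would come from a table such as Table 2 further below), and this is not a normative implementation of the claimed apparatus.

import numpy as np

def tns_autocorrelation(Xs, sub_start, sub_stop, max_lag=8):
    """Sketch of the normalized autocorrelation r(k) for one TNS filter f.

    Xs        : TNS input spectrum (1-D array of MDCT coefficients)
    sub_start : start indices sub_start(f, s) of the sub-blocks
    sub_stop  : stop indices sub_stop(f, s) of the sub-blocks
    """
    num_sub = len(sub_start)
    # e(s): energy of each sub-block
    e = np.array([np.sum(Xs[sub_start[s]:sub_stop[s]] ** 2)
                  for s in range(num_sub)])

    r = np.zeros(max_lag + 1)
    if np.sum(e) == 0:               # all sub-blocks empty: r(k) = r0(k)
        r[0] = 1.0
        return r

    for k in range(max_lag + 1):
        acc = 0.0
        for s in range(num_sub):
            if e[s] > 0:             # guard against an empty sub-block (not part of the formula)
                lo, hi = sub_start[s], sub_stop[s] - k
                acc += np.sum(Xs[lo:hi] * Xs[lo + k:hi + k]) / e[s]
        r[k] = acc
    return r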


In accordance with examples, the encoder apparatus may comprise a noise estimator tool which may be configured to estimate a noise level. One of the procedures used for such an estimation may be in the form of







$$L_{NF} = \frac{\sum_{k=0}^{N_E-1} I_{NF}(k) \cdot \frac{\left|X_f(k)\right|}{gg}}{\sum_{k=0}^{N_E-1} I_{NF}(k)}$$

where gg refers to the global gain, INF(k) to the identification of the spectral lines on which the noise level is to be estimated, and Xf(k) is the signal (e.g., the MDCT or MDST or another FD spectrum after TNS).


In examples, INF(k) may be obtained with:








$$I_{NF}(k) = \begin{cases} 1, & \text{if } 24 \le k < bw_{stop} \text{ and } X_q(i) = 0 \text{ for all } i = k-3 \ldots \min(bw_{stop},\, k+3) \\ 0, & \text{otherwise} \end{cases}$$

where bwstop depends on the bandwidth detected by the encoder bandwidth detector and controller.
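
As a purely illustrative sketch (assuming NumPy arrays for the spectra and hypothetical argument names), the index selection and the noise level estimation described above could look as follows:

import numpy as np

def noise_level(Xf, Xq, gg, bw_stop, NE):
    """Sketch of the encoder-side noise level estimation L_NF.

    Xf      : FD spectrum after TNS
    Xq      : quantized spectrum (zero where no line was coded)
    gg      : global gain
    bw_stop : last spectral line of the detected bandwidth
    NE      : number of spectral lines considered
    """
    # I_NF(k): 1 where the neighbourhood k-3 .. min(bw_stop, k+3) is entirely zero-quantized
    I_NF = np.zeros(NE, dtype=int)
    for k in range(24, min(bw_stop, NE)):
        lo, hi = k - 3, min(bw_stop, k + 3)
        if np.all(Xq[lo:hi + 1] == 0):
            I_NF[k] = 1

    if I_NF.sum() == 0:              # nothing to estimate on (guard, not in the formula)
        return 0.0
    return np.sum(I_NF * np.abs(Xf[:NE]) / gg) / I_NF.sum()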


In accordance with examples, there may be provided a decoder apparatus comprising a plurality of FD decoder tools for decoding an information signal encoded in a bitstream, wherein:


the FD decoder tools are subdivided:

    • in a subgroup comprising at least one FD decoder tool;
    • in remaining FD decoder tools comprising at least one FD decoder tool;


wherein the decoder apparatus is configured so that the at least one of the plurality of decoder tools of the subgroup performs signal processing a different bandwidth with respect to at least one of the remaining FD decoder tools of the plurality of decoder tools.


In accordance with examples, the decoder apparatus may comprise a bandwidth controller configured to choose the bandwidth on the basis of the bandwidth information.


In accordance with examples, the decoder apparatus may be such that the subgroup comprises at least one of a decoder noise estimator tool and/or a temporal noise shape, TNS, decoder.


In accordance with examples, the at least one of the remaining FD decoder tools is at least one of a linear predictive coding, LPC, decoder tool, spectral noise shaper decoder, SNS, tool, a decoder global gain tool, an MDCT or MDST shaping tool.


In accordance with examples, the decoder apparatus may be configured to control the bandwidth of the at least one of the plurality of decoder tools in the subgroup between:

    • at least a first bandwidth common to at least one of the remaining FD decoder tools; and
    • at least a second bandwidth different from the first bandwidth.


In accordance with examples, the at least one of the FD remaining decoder tools is configured to operate at a full bandwidth.


In accordance with examples, the at least one of the remaining FD decoder tools is configured to operate in open chain with respect to the bandwidth (e.g., chosen by the bandwidth controller).


In accordance with examples, the bandwidth controller is configured to choose a bandwidth among a finite number of bandwidths and/or among a set of pre-defined bandwidths.


In accordance with examples, the bandwidth controller is configured to perform a choice among at least one or a combination of: a 8 KHz, 16 KHz, 24 KHz, 32 KHz, and 48 KHz and/or NB, WB, SSWB, SWB, FB.


In accordance with examples, the decoder may be further comprising a noise filling tool (46) configured to apply a noise level using indices. A technique for obtaining the indices may provide, for example:








$$I_{NF}(k) = \begin{cases} 1, & \text{if } 24 \le k < bw_{stop} \text{ and } \widehat{X}_q(i) = 0 \text{ for all } i = k-3 \ldots \min(bw_{stop},\, k+3) \\ 0, & \text{otherwise} \end{cases}$$

where bwstop is obtained on the basis of bandwidth information in the bitstream.


In accordance with examples, the decoder apparatus may comprise a TNS decoder tool configured to perform at least some of the following operations:

s0(start_freq(0)−1) = s1(start_freq(0)−1) = . . . = s7(start_freq(0)−1) = 0

for f = 0 to num_tns_filters−1 do
    for n = start_freq(f) to stop_freq(f)−1 do
        t8(n) = X̂f(n)
        for k = 7 to 0 do
            tk(n) = tk+1(n) − rcq(k)·sk(n−1)
            sk+1(n) = rcq(k)·tk(n) + sk(n−1)
        X̂s(n) = s0(n) = t0(n)

where X̂s(n) is the output of the TNS decoder, X̂f(n) is the input of the TNS decoder, and num_tns_filters, start_freq, stop_freq are obtained on the basis of bandwidth information in the bitstream.
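
A minimal, non-normative sketch of this lattice filtering in Python is given below; the array names and the per-filter layout of rc_q are assumptions of this sketch only.

import numpy as np

def tns_decode(X_f, rc_q, start_freq, stop_freq, num_tns_filters):
    """Sketch of the TNS decoder lattice filter over the active filters."""
    X_s = X_f.copy()
    s = np.zeros(8)                           # lattice states s_0 .. s_7, zeroed once per frame
    for f in range(num_tns_filters):
        for n in range(start_freq[f], stop_freq[f]):
            t = X_f[n]                        # t_8(n): TNS decoder input line
            s_new = np.zeros(8)
            for k in range(7, -1, -1):        # k = 7 .. 0
                t = t - rc_q[f][k] * s[k]     # t_k(n) = t_{k+1}(n) - rc_q(k) * s_k(n-1)
                if k < 7:
                    s_new[k + 1] = rc_q[f][k] * t + s[k]   # s_{k+1}(n)
            s_new[0] = t                      # s_0(n) = t_0(n)
            X_s[n] = t                        # output spectrum line
            s = s_new
    return X_s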


Coding tools like TNS or noise filling can create unwanted artificial noise in the silent sections of band-limited signals. Therefore, bandwidth detectors are usually incorporated to control the bandwidth all coding tools should work on. As bandwidth detection might lead to uncertain results, a wrong detection might lead to audible artefacts such as a sudden limitation of the audio bandwidth.


To overcome the problem, in some examples some tools, e.g., the quantizer, are not controlled by the bandwidth detector. In case of misdetection, the quantizer can code the upper spectrum, even though in low quality, to compensate the problem.





3. BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention will be detailed subsequently referring to the appended drawings, in which:



FIG. 1 shows artificial noise generated by unguided tools;



FIG. 2 shows a schematic outline of wrong bandwidth detection;



FIGS. 3a and 3b show encoder apparatus according to examples;



FIGS. 4a and 4b show decoder apparatus according to examples;



FIG. 5 shows a scheme in case of wrong bandwidth detection;



FIGS. 6a-6c show methods according to examples;



FIG. 7 shows a method for TNS at the encoder according to an example;



FIGS. 8a and 8b show apparatus according to examples.





4. DETAILED DESCRIPTION OF THE INVENTION

The invention described in this document makes it possible to avoid the occurrence of spectral holes even when the bandwidth detector returns a wrong result. In particular, soft band switching for audio coding applications may be obtained.


A key aspect is that parametric coding tools, e.g. TNS and NF, may be strictly controlled by the bandwidth detector and controller 39, while the remaining coding tools, i.e. the LPC-based spectral shaper or spectral noise shaper (SNS), the spectral quantizer and the residual coder, still work on the full audio bandwidth up to the Nyquist frequency.



FIGS. 3b and 3a outline examples of an encoder apparatus 30 and 30a where a bandwidth (BW) detector and controller 39 estimates the current audio bandwidth in the frame based on energies derived from an MDCT or MDST spectrum (or other FD spectrum).


On the decoder side (FIGS. 4b and 4a), the guiding bandwidth information for TNS and NF is extracted from the bitstream and the tools are controlled accordingly.


As a result, artificially generated noise in non-active spectral regions is avoided due to the bandwidth parameter used to control the TNS and NF coding tools (unguided tools). The tools just work on the active audio part and therefore do not generate any artificial noise.


On the other side, the audible effect of wrong detections (false bandwidth detection) can be reduced significantly as the remaining coding tools, e.g. spectral quantizer, LPC shaper or SNS (spectral noise shaper) and residual coder, still work up to the Nyquist frequency. In case of wrong detections, these tools can code the upper frequency—at least with some more distortions compared to a regular coding—and therefore avoid the more severe impression that the audio bandwidth suddenly drops.



FIG. 5 shows a new scheme in case of wrong bandwidth detection: spectral hole is quantized sparsely but avoids an audible bandwidth drop.


In case the region outlined in the figure above contains mostly zero values, the arithmetic coder does not need to code those as the information on the last non-zero spectral tuple is transmitted as side information for the arithmetic coder. This means there is no overhead involved for the arithmetic coder.


The side information that may be used for the transmitted bandwidth is also minimized. Due to the robust switching behavior, a signaling of the typically used communication audio bandwidths, i.e. NB, WB, SSWB and SWB, is appropriate.


This technique also allows building less complex bandwidth detectors which do not need frame dependencies and long history memories to get stable decisions (see the EVS codec [1], Section 5.1.6). This means that the new technique allows the bandwidth detector and controller 39 to react very fast to any audio bandwidth change.


Accordingly, bandwidth information is used to control only specific tools of a codec (e.g., an audio codec) while keeping the remaining tools in another operation mode (e.g., full bandwidth).


5. EXAMPLES
5.1 The Bandwidth Detection and Control of the Tools

An information signal (e.g., an audio signal) may be described in the time domain, TD, as a succession of samples (e.g., x(n)) acquired at different discrete time instants (n). The TD representation may be made of a plurality of frames, each associated to a plurality of samples (e.g., 2048 samples per frame). In the frequency domain, FD, a frame may be represented as a succession of bins (e.g., X(k)), each associated to a particular frequency (each frequency being associated to an index k).



FIGS. 3b and 3a show encoder apparatus 30 and 30a, respectively, each comprising an encoder bandwidth detector and controller 39 which is capable of selecting a bandwidth for some tools (a subgroup at the encoder) of the encoder apparatus 30 or 30a, so that other tools operate at different bandwidth. The encoder bandwidth detector and controller 39 is also capable of selecting the bandwidth for at least some of the tools of a decoder (a subgroup at the decoder). 39a refers to the bandwidth selection information provided by the encoder bandwidth detector and controller 39 to the tools of the subgroup (e.g., 33, 36) and/or to a decoder.


Each of the encoder apparatus 30 and 30a may comprise a low delay modified discrete cosine transform, MDCT, tool 31 or low delay modified discrete sine transform, MDST, tool 31 (or a tool based on another transformation, such as a lapped transformation) which may convert an information signal (e.g., an audio signal) from a time domain, TD, representation to a frequency domain, FD, representation (e.g., to obtain MDCT, MDST, or, more in general, FD coefficients).


The encoder apparatus 30 may comprise a linear predictive coding, LPC, tool 32 for performing an LPC analysis in the FD.


The encoder apparatus 30a may comprise an SNS tool 32a for performing an SNS analysis in the FD.


Each of the encoder apparatus 30 and 30a may comprise a temporal noise shaping, TNS, tool 33, to control the temporal shape of noise within each window of the information signal (e.g., as output by the MDCT or MDST tool) in the FD.


Each of the encoder apparatus 30 and 30a may comprise a spectral quantizer 34 processing signals in the FD. The signal as output by the TNS tool 33 may be quantized, e.g., using dead-zone plus uniform thresholds scalar quantization. A gain index may be chosen so that the number of bits needed to encode the quantized FD signal is as close as possible to an available bit budget.


Each of the encoder apparatus 30 and 30a may comprise a coder 35 processing signals in the FD, for example, to perform entropy coding, e.g., to compress a bitstream. The coder 35 may, for example, perform residual coding and/or arithmetic coding.


Each of the encoder apparatus 30 and 30a may comprise, for example, a noise level estimator tool 36, processing signals in the FD, to estimate the noise, quantize it, and/or transmit it in a bitstream.


In examples, the level estimator tool 36 may be placed upstream or downstream to the coder 35.


Each of the encoder apparatus 30 and 30a may comprise tools which process signals in the time domain, TD. For example, the encoder apparatus 30 or 30a may comprise a re-sampling tool 38a (e.g., a downsampler) and/or a long term postfiltering, LTPF, tool 38b, for controlling an LTPF active in TD at the decoder.


Each of the encoder apparatus 30 and 30a may comprise a bitstream multiplexer tool 37 to prepare a bitstream with data obtained from TD and/or FD tools placed upstream. The bitstream may comprise a digital representation of an information signal together with control data (including, for example, a bandwidth information for selecting the bandwidth at some tools of the decoder) to be used at the decoder. The bitstream may be compressed or include portions which are compressed.


Therefore, each of the encoder apparatus 30 and 30a may comprise FD tools (e.g., 31-36) and, in case, TD tools (e.g., 38a, 38b).


The encoder bandwidth detector and controller 39 may control the bandwidth of FD tools forming a first group (subgroup), such as the temporal noise shaping, TNS, tool 33, and/or the noise estimator tool 36. The TNS tool 33 may be used to control the quantization noise. The bandwidth at which FD tools which are not in the subgroup (such as at least one of the LPC tool 32 and/or the SNS tool 32a, the spectrum quantizer 34, and the coder 35) perform signal processing may therefore be different from the bandwidth at which the tools of the subgroup (e.g., 33, 36) perform signal processing. For example, the bandwidth for the FD tools which are not in the subgroup may be greater, e.g., may be a full bandwidth.


In examples, the encoder bandwidth detector and controller 39 may be a part of a digital signal processor which, for example, implements also other tools of the encoder apparatus.



FIGS. 4b and 4a show decoder apparatus 40 and 40a, respectively, each of which may decode a digital representation of an information signal as encoded by the encoder 30 or 30a, for example. Each of the decoder apparatus 40 and 40a may comprise FD tools and, in case, TD tools.


Each of the decoder apparatus 40 and 40a may comprise a bitstream multiplex tool 41 to obtain a bitstream (e.g., by transmission) from an encoder apparatus (e.g., the apparatus 30 or 30a). For example, an output from the encoder apparatus 30 or 30a may be provided as an input signal to the decoder apparatus 40 or 40a.


Each of the decoder apparatus 40 and 40a may comprise a decoder 42 which may, for example, decompress data in the bitstream. Arithmetic decoding may be performed. Residual decoding may be performed.


Each of the decoder apparatus 40 and 40a may comprise a noise filling tool 43 processing signals in the FD.


Each of the decoder apparatus 40 and 40a may comprise a global gain tool 44 processing signals in the FD.


Each of the decoder apparatus 40 and 40a may comprise a TNS decoder tool 45 processing signals in the FD. TNS can be briefly described as follows. At the encoder-side and before quantization, a signal is filtered in the frequency domain (FD) using linear prediction, LP, in order to flatten the signal in the time-domain. At the decoder-side and after inverse quantization, the signal is filtered back in the frequency-domain using the inverse prediction filter, in order to shape the quantization noise in the time-domain such that it is masked by the signal.


Each of the decoder apparatus 40 and 40a may comprise an MDCT or MDST shaping tool 46 (other kinds of shaping tools may be used). Notably, the MDCT or MDST shaping tool 46 may process signals by applying scale factors (or quantized scale factors) obtained from the encoder SNS tool 32a or gain factors computed from decoded LP filter coefficients (obtained from an LPC decoding tool 47) transformed to the MDCT or MDST spectrum.


Each of the decoder apparatus 40 and 40a may comprise a low delay inverse MDCT or MDST tool 48a to transform signal representations from FD to TD (tools based on other kinds of inverse transform may be used).


Each of the decoder apparatus 40 and 40a may comprise an LTPF tool 48b for performing a postfilter in the TD, e.g., on the basis of the parameters provided by the encoder component 38b.


Each of the decoder apparatus 40 and 40a may comprise a decoder bandwidth controller 49 configured to select the bandwidth of at least one of the FD tools. In particular, the bandwidth of a subgroup (e.g., formed by the tools 43 and 45) may be controlled so as to be different from the bandwidth at which other FD tools (42, 44, 46, 47) process signals. The bandwidth controller 49 may be input with a signal 39a which has been prepared at the encoder side (e.g., by the bandwidth detector and controller 39) to indicate the selected bandwidth for at least one of the subgroups (33, 36, 43, 45).


In examples, the decoder bandwidth controller 49 may perform operations similar to those processed by the encoder bandwidth detector and controller 39. However, in some examples, the decoder bandwidth controller 49 may be intended as a component which obtains control data (e.g., encoded in a bitstream) from the encoder bandwidth detector and controller 39 and provides the control data (e.g., bandwidth information) to the tools of the subgroup (e.g., decoder noise filling tool 43 and/or TNS decoder tool 45). In examples, the controller 39 is a master and the controller 49 is a slave. In examples, the decoder bandwidth controller 49 may be a part or a section of a digital signal processor which, for example, implements also other tools of the decoder.


In general, the bandwidth controllers 39 and 49 may operate so that the FD tools of the subgroups (e.g., 33 and 36 for the encoder apparatus and/or 43 and 45 for the decoder apparatus) have a same frequency band, while the other FD tools of the decoder and/or encoder have another frequency band (e.g., a broader band).


It has been noted that, in this way, it is possible to reduce impairments of conventional technology. While for some FD tools (e.g., TNS tools, noise filling tools) it may be advantageous to actually perform a band selection, for other FD tools (e.g., 32, 34, 35, 42, 44, 46, 47) it may be advantageous to process signals at a broader band (e.g., full band). Accordingly, it is possible to avoid spectral holes that would be present in case of hard selection of the bandwidth for all the tools (in particular when a wrong band is selected).


In examples, the bandwidth that is selected by the decoder bandwidth controller 49 may be one of a finite number of choices (e.g., a finite number of bandwidths). In examples, it is possible to choose among narrow band NB (e.g., 4 kHz), wide band WB (e.g., 8 kHz), semi-super wide band SSWB (e.g., 12 kHz), super wide band SWB (e.g., 16 kHz) or full band FB (e.g., 20 kHz).


The selection may be encoded in a data field by the encoder apparatus, so that the decoder apparatus knows which bandwidths have been selected (e.g., according to a selection performed by the encoder bandwidth detector and controller 39).



FIG. 6a shows a method 60. The method 60 may comprise steps which may be performed, at least in part, by at least one of the controllers 39 and 49. The method 60 may be looped so as to perform operations in association to each frame of the information signal.


At step S61, an energy per band may be estimated (e.g., by the bandwidth detector and controller 39).


At step S62, the bandwidth may be detected (e.g., by the bandwidth detector and controller 39).


At step S63, the detected bandwidth may be selected for at least one of the TNS tool 33 and noise estimation tool 36: these tools will perform their processes at the bandwidth detected at S62.


In addition or in alternative, at step S64 parameters may be defined (and/or encoded) in the bitstream to be stored and/or transmitted and to be used by a decoder. Among the parameters, a bandwidth selection information (e.g., 39a) may be encoded, so that the decoder will know the detected and selected bandwidth for the subgroup (e.g., TNS and noise filling/estimation).


Then, a new frame of the information signal may be examined. Method 60 may therefore cycle by moving to S61. Therefore, a decision may be carried out frame by frame.


Notably, in accordance with the detected bandwidth, a different number of bits may be encoded in the bitstream. In examples, if the sampling rate is 8 kHz (NB), no bits are encoded in the bitstream. However, the decoder will understand that the bandwidth is NB.


Each of the encoder apparatus 30 and 30a of FIGS. 3b and 3a may comprise:

    • a plurality of frequency domain, FD, encoder tools (31-36) for encoding an information signal, the information signal presenting a plurality of frames; and
    • an encoder bandwidth detector and controller 39 configured to select a bandwidth (e.g., at S63) for at least a subgroup (e.g., TNS tool 33, and noise level estimator tool 36) of the plurality of FD encoder tools on the basis of information signal characteristics so that at least one (e.g., 33, 36) of the FD encoder tools of the subgroup has a different bandwidth with respect to at least one of the FD encoder tools (e.g., 31, 32, 34, 35) which are not in the subgroup.


In particular, the encoder bandwidth detector and controller 39 may be configured to select the bandwidth of the at least one FD encoder tool of the subgroup (33, 36) between at least a first bandwidth (e.g., Nyquist frequency) common to at least one (or more) of the FD encoder tools which are not in the subgroup and a second bandwidth (e.g., NB, WB, SSWB, SWB) different from the bandwidth of the at least one (or more) of the FD encoder tools which are not in the subgroup.


Therefore, some tools may operate at bandwidths different from each other and/or perform signal processing using bandwidths different from each other.


The tools which are not in the subgroup (e.g., global gain, spectral noise shaping, and so on) may operate in open chain with respect to the bandwidth selection.


In examples, the encoder bandwidth detector and controller 39 is configured to select (e.g., at S62) the bandwidth of the at least one of the plurality of FD encoder tools (31-36) on the basis of at least one energy estimation (e.g., at S61) on the information signal.


The decoder apparatus 40 of FIG. 4b comprises a plurality of FD decoder tools (43-48a) for decoding an information signal encoded in a bitstream, wherein:

    • the FD decoder tools are divided:
      • in a subgroup comprising at least one FD decoder tool (e.g., 43, 45);
      • in remaining FD decoder tools comprising at least one FD decoder tool (e.g., 44, 46, 48a);
    • wherein the decoder apparatus 40 or 40a is configured so as to choose a bandwidth for at least one of the plurality of decoder tools of the subgroup (e.g., 43, 45) on the basis of bandwidth information included in the bitstream, so that the at least one of the plurality of decoder tools of the subgroup (e.g., 43, 45) performs signal processing at a different bandwidth with respect to at least one of the remaining FD decoder tools of the plurality of decoder tools (e.g., 44, 46, 48a).



FIG. 6b shows a method 60b. The method 60b may be a method for encoding an information signal according to at least a plurality of operations in the frequency domain, FD, the method comprising:

    • selecting a bandwidth for a subgroup of FD operations (e.g., S61b);
    • performing first signal processing operations at the bandwidth selected for the subgroup of FD operations (e.g., S62b);
    • performing second signal processing operations at a different bandwidth for FD operations which are not in the subgroup (e.g., S63b).


It is not necessary, e.g., to perform the steps S61b and S62b in this temporal order. For example, S62b may be performed before S61b. S61b and S62b may also be performed in parallel (e.g., using time-sharing techniques or similar).



FIG. 6c shows a method 60c. The method 60c may be a method for decoding a bitstream with an information signal and control data (e.g., 39a), the method comprising a plurality of signal processing operations in the frequency domain, FD, the method comprising:

    • choosing a bandwidth selection for a subgroup of FD operations on the basis of the control data (S61c);
    • performing first signal processing operations at the bandwidth chosen for the subgroup of FD operations (S62c);
    • performing second signal processing operations at a different bandwidth for FD operations which are not in the subgroup (S63c).


It is not necessary, e.g., to perform the steps S61c and S62c in this temporal order. For example, S62c may be performed before S61c. S61c and S62c may also be performed in parallel (e.g., using time-sharing techniques or similar).


According to an example, the encoder bandwidth detector and controller 39 may detect the energy per band, e.g., using an equation such as:








$$E_B(n) = \frac{\sum_{k=I_{fs}(n)}^{I_{fs}(n+1)-1} X(k)^2}{I_{fs}(n+1)-I_{fs}(n)} \qquad \text{for } n = 0 \ldots N_B-1$$

where X(k) are the MDCT or MDST coefficients (or any other representation of the signal in the FD), NB (e.g., 64) is the number of bands and Ifs(n) are the indices associated to the band (each index being associated to a bin).
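
For illustration, a direct NumPy sketch of this per-band energy computation is given below; the argument names are hypothetical, and I_fs is assumed to hold the N_B+1 band boundary indices.

import numpy as np

def energy_per_band(X, I_fs):
    """Sketch of the per-band energy E_B(n) used by the bandwidth detector.

    X    : MDCT (or MDST) coefficients of one frame
    I_fs : band boundary indices, len(I_fs) = N_B + 1
    """
    NB = len(I_fs) - 1
    EB = np.zeros(NB)
    for n in range(NB):
        lo, hi = I_fs[n], I_fs[n + 1]
        EB[n] = np.sum(X[lo:hi] ** 2) / (hi - lo)   # mean energy of band n
    return EB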


It is therefore possible to detect (e.g., at S62) the bandwidth (e.g., among a finite number of bandwidths). The encoder bandwidth detector and controller 39 may be able to detect the bandwidths commonly used in speech communication, i.e. 4 kHz, 8 kHz, 12 kHz and 16 kHz. For example, it is possible to detect the quietness of each bandwidth. In case of a positive detection of quietness for a bandwidth, a dedicated cut-off characteristic of the spectrum is further detected. For example, a flag (or, in any case, data) regarding the detection of quietness may be obtained as:








$$F_Q(bw) = \left[\frac{\sum_{n=I_{bw\_start}(bw)}^{I_{bw\_stop}(bw)} E_B(n)}{I_{bw\_stop}(bw) - I_{bw\_start}(bw) + 1} < T_Q(bw)\right] \qquad \text{for } bw = N_{bw}-1 \ldots 0$$

FQ(bw) is a binary value which is 1 if the normalized summation is less than TQ(bw), and 0 otherwise. FQ(bw), associated with a particular bandwidth bw, thus indicates quietness (e.g., with logical value “1”) when the summation of the energy values (e.g., energy per band), divided by the number of bands, is less than a threshold for the particular bandwidth bw (and is “0” otherwise). The summation relates to the sum of energy values at the indices n from the first index Ibw start(bw) to the last index Ibw stop(bw) of the bandwidth. The number of examined bandwidths is Nbw.
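
A possible sketch of this quietness scan in Python is given below; the thresholds T_Q and the index tables are assumed inputs of this sketch, and the early stop described in the next paragraph is also reflected.

import numpy as np

def detect_quiet_bandwidths(EB, I_bw_start, I_bw_stop, T_Q):
    """Sketch of the quietness flags F_Q(bw), scanned from the highest
    candidate bandwidth downwards."""
    Nbw = len(I_bw_start)
    FQ = np.zeros(Nbw, dtype=int)
    for bw in range(Nbw - 1, -1, -1):        # bw = Nbw-1 .. 0
        lo, hi = I_bw_start[bw], I_bw_stop[bw]
        mean_energy = np.sum(EB[lo:hi + 1]) / (hi - lo + 1)
        FQ[bw] = int(mean_energy < T_Q[bw])
        if FQ[bw] == 0:                      # region is active: stop scanning
            break
    return FQ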


The procedure may stop when FQ(bw)==0 (energy greater than the threshold for the bandwidth bw). In case FQ(bw+1)==1, the flags FC(b) indicating the cut-off characteristic of the spectrum may be detected by

FC(b)=[10 log10(Eb(b−D))−10 log10(Eb(b))]<TC(bw)

    • for b=Ibw start(bw) . . . Ibw start(bw)−D


where D defines the distance between the bands where the cut-off characteristic should be checked, i.e. D(bw).


Then, it is possible to define final information (bandwidth information or bandwidth selection information) to be used to control a subgroup (e.g., the TNS tool 33 and/or the noise level estimation tool 36 and/or the TNS decoder tool 45 and/or the noise filling tool 43). The final information may be, for example, encoded in some bits and may take a form such as







$$P_{bw} = \begin{cases} bw, & \text{if } F_C(b) > 0 \\ N_{bw}-1, & \text{else} \end{cases}$$

The parameter bandwidth Pbw (bandwidth selection information) may be used to control the TNS and the noise filling tool, e.g., at the decoder, and may embody the signal 39a. The parameter Pbw may be stored and/or transmitted in a bitstream using the number of bits nbitsbw. Notably, the number of bits is not necessarily constant and may vary according to the chosen sample rate fs, hence reducing the payload of the bitstream where not necessary.


A table such as the following one may be used:














TABLE 1

    fs       Nbw    Ibw start           Ibw stop            Bandwidth (Pbw) 39a        nbits bw
     8000    0      -                   -                   {NB}                       0
    16000    1      {53, 0, 0, 0}       {63, 0, 0, 0}       {NB, WB}                   1
    24000    2      {47, 59, 0, 0}      {56, 63, 0, 0}      {NB, WB, SSWB}             2
    32000    3      {44, 54, 60, 0}     {52, 59, 63, 0}     {NB, WB, SSWB, SWB}        2
    48000    4      {41, 51, 57, 61}    {49, 55, 60, 63}    {NB, WB, SSWB, SWB, FB}    3

fs is a given sampling rate (e.g., 8 kHz, 16 kHz, 24 kHz, 32 kHz, and/or 48 kHz) and, for each fs, the number of possible modes is Nbw+1.


Therefore, it is possible to encode a control data field including:

    • 0 data bits corresponding to (signalling the choice of) NB bandwidth;
    • 1 data bit corresponding to (signalling the choice of one of) NB and WB bandwidth;
    • 2 data bits corresponding to (signalling the choice of one of) NB, WB, and SSWB bandwidth;
    • 2 data bits corresponding to (signalling the choice of one of) NB, WB, SSWB, and SWB bandwidth;
    • 3 data bits corresponding to (signalling the choice of one of) NB, WB, SSWB, SWB, and FB bandwidth.


An electronic version of at least some portions of Table 1 may be stored in the encoder and/or the decoder. Accordingly, when the parameter bandwidth Pbw is known, it is possible to automatically derive the control information for the TNS and noise filling operations. For example, Ibw start may refer to the start index associated with the lower end of the bandwidth, and Ibw stop may refer to the final index associated with the higher end of the bandwidth. The bandwidth choice and the parameters based on this choice may, therefore, be derived from a table such as Table 1.


In examples, when fs=8000, the bandwidth detector is not needed and we have Pbw=0 and nbitsbw=0, i.e. the parameter Pbw is not placed in the bitstream. However, the decoder will understand that the chosen bandwidth is NB (e.g., on the basis of electronic instruments such as an electronic version of Table 1).
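
The following sketch mirrors Table 1 in a small Python dictionary to show how Pbw and nbitsbw could be written to and read from the bitstream; BW_TABLE and read_bits are hypothetical names introduced only for this example.

# Hypothetical constant mirroring Table 1: sampling rate -> (bandwidths, nbits_bw)
BW_TABLE = {
     8000: (["NB"], 0),
    16000: (["NB", "WB"], 1),
    24000: (["NB", "WB", "SSWB"], 2),
    32000: (["NB", "WB", "SSWB", "SWB"], 2),
    48000: (["NB", "WB", "SSWB", "SWB", "FB"], 3),
}

def encode_bandwidth(fs, P_bw):
    """Return (value, nbits) to be written into the bitstream for this frame."""
    bandwidths, nbits = BW_TABLE[fs]
    assert 0 <= P_bw < len(bandwidths)
    return P_bw, nbits               # at fs = 8000, nbits = 0: nothing is transmitted

def decode_bandwidth(fs, read_bits):
    """read_bits(n) is a hypothetical bitstream reader returning an n-bit value."""
    bandwidths, nbits = BW_TABLE[fs]
    P_bw = read_bits(nbits) if nbits > 0 else 0
    return bandwidths[P_bw]

At fs = 8000 the encoder writes nothing and the decoder nevertheless resolves the bandwidth to NB, matching the behaviour described above.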


Other methods may be used. One of the bandwidths NB, WB, SSWB, SWB, FB may be identified and transmitted to the FD tools of the encoder subgroup, such as the TNS tool 33 and the noise estimator tool 36. Information such as the parameter Pbw (39a) may be encoded and transmitted to the decoder apparatus 40 or 40a, so that the decoder noise filling tool 43 and the TNS decoder tool 45 make use of the information regarding the selected bandwidth.


In general terms, the information signal characteristics which constitute the basis for the selection of the bandwidth may comprise, inter alia, one or more of the signal bandwidth, at least one energy estimation of the information signal, cut-off characteristics on the spectrum, information on the detection of quietness in some particular bands, FQ(bw), etc.


The examples above permit obtaining a soft bandwidth switching.


5.2 MDCT or MDST (or Other Transform) at the Encoder

A modified discrete cosine transform (MDCT) or modified discrete sine transform (MDST) (or another modulated lapped transform) tool 31 may convert a digital representation in the TD into a digital representation in the FD. Other examples (for instance based on other transformations, such as lapped transformations) may nevertheless be used. An example is provided here.


The input signal x(n) of a current frame b in the TD may consist of NF audio samples, where the newest one is located at x(NF−1). Audio samples of past frames are accessed by negative indexing, e.g. x(−1) is the newest of the previous frame.


The time input buffer for the MDCT t may be updated according to

    • t(n)=x(Z−NF+n) for n=0 . . . 2NF−1−Z
    • t(2NF−Z+n)=0 for n=0 . . . Z−1 (initialization may be used only for consistency)


A block of NF time samples may be transformed to the frequency coefficients X(k) using the following equation:







$$X(k) = \sqrt{\frac{2}{N_F}} \cdot \sum_{n=0}^{2N_F-1} w_N(n) \cdot t(n) \cdot \cos\!\left[\frac{\pi}{N_F}\left(n + \frac{1}{2} + \frac{N_F}{2}\right)\left(k + \frac{1}{2}\right)\right] \qquad \text{for } k = 0 \ldots N_F-1$$

where wN is the Low Delay MDCT window according to the used frame size. The window may be optimized for NF=480 and other versions for different frame sizes may be generated by means of interpolation. The window shape may be the result of an optimization procedure and may be provided point by point.


It is also possible to apply MDST or other transformations.
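
For illustration, a direct (non-optimized) NumPy evaluation of the MDCT formula above could be written as follows; real implementations would use a fast transform, and the window wN is assumed to be given externally.

import numpy as np

def low_delay_mdct(t, w_N):
    """Direct evaluation of the MDCT formula above (O(N^2), for illustration only).

    t   : time input buffer of length 2*N_F
    w_N : low delay MDCT window of length 2*N_F
    """
    NF = len(t) // 2
    n = np.arange(2 * NF)
    X = np.zeros(NF)
    for k in range(NF):
        X[k] = np.sqrt(2.0 / NF) * np.sum(
            w_N * t * np.cos(np.pi / NF * (n + 0.5 + NF / 2) * (k + 0.5)))
    return X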


5.3.1 LPC at the Encoder


A linear predictive coding (LPC) analysis may be performed by an LPC tool 32. LPC is a technique used for representing the spectral envelope of a digital signal in compressed form, using the information of a linear predictive model.


An LPC filter may be derived in a warped frequency domain and therefore psychoacoustically optimized. To obtain the autocorrelation function, the Energy EB(b), as defined above, may be pre-emphasized by








$$E_{Pre}(b) = E_B(b) \cdot 10^{\frac{b \cdot g_{tilt}}{10\,(N_B-1)}} \qquad \text{for } b = 0 \ldots N_B-1$$

where

    fs       gtilt
    16000    18
    24000    22
    32000    26
    48000    30

and transformed to time domain using, for example, an inverse odd DFT








$$R_{Pre}(n) = \operatorname{Re}\!\left(\sum_{b=0}^{N_B-1} E_{Pre}(b) \cdot e^{\,j\frac{\pi \cdot n}{N_B}\left(b+\frac{1}{2}\right)}\right) \qquad \text{for } n = 0 \ldots N_B-1$$

$$R_{Pre}(0) = R_{Pre}(0) \cdot 1.0001$$

In case RPre(0)=0, set RPre(0)=1 and RPre(1 . . . NB−1)=0. The first NL samples are extracted into the vector RL=RPre(0 . . . NL−1), where NL stands for the LP filter order, i.e. NL=16.


The LP filter coefficients may be calculated, for example, based on the vector RL through the Levinson-Durbin procedure. This procedure may be described by the following pseudo code:
















e = RL(0)
a0(0) = 1
for k = 1 to NL do
    rc = −( Σn=0…k−1 ak−1(n)·RL(k−n) ) / e
    ak(0) = 1
    for n = 1 to k − 1 do
        ak(n) = ak−1(n) + rc·ak−1(k − n)
    ak(k) = rc
    e = (1 − rc^2)·e

where a(k)=aNL(k), k=0 . . . NL, are the estimated LPC coefficients and e is the prediction error.
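
The recursion above can be sketched in Python/NumPy as follows; the function name and the return convention are choices of this sketch, not of the described codec.

import numpy as np

def levinson_durbin(RL, NL=16):
    """Sketch of the Levinson-Durbin recursion from the pseudo code above."""
    a = np.zeros(NL + 1)
    a[0] = 1.0
    e = RL[0]
    for k in range(1, NL + 1):
        # rc = -( sum_{n=0}^{k-1} a_{k-1}(n) * RL(k-n) ) / e
        rc = -np.sum(a[:k] * RL[k:0:-1]) / e
        a_prev = a.copy()
        for n in range(1, k):
            a[n] = a_prev[n] + rc * a_prev[k - n]
        a[k] = rc
        e = (1.0 - rc * rc) * e
    return a, e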


The LPC coefficients may be weighted, in examples, by equation such as:

aw(k)=a(k)·0.94^k for k=0 . . . NL


The LPC coefficients may be quantized.


For example, the weighted LPC coefficients aw(k) are first convolved with the coefficients b(i) using








$$a_c(k) = \sum_{i=0}^{2} a_w(k-i) \cdot b(i) \qquad \text{for } k = 0 \ldots N_L+2$$

with

$$a_w(k) = \begin{cases} a_w(k), & \text{if } 0 \le k \le N_L \\ 0, & \text{otherwise} \end{cases}$$

and

$$b(i) = \begin{cases} \displaystyle\sum_{k=0}^{N_L} a_w(k) - \sum_{k=0}^{N_L} (-1)^k a_w(k), & \text{if } i = 0 \text{ or } i = 2 \\ \displaystyle -2\left(\sum_{k=0}^{N_L} a_w(k) + \sum_{k=0}^{N_L} (-1)^k a_w(k)\right), & \text{if } i = 1 \end{cases}$$

The coefficients ac(k) may then be transformed to the frequency domain using







$$A(k) = \sum_{n=0}^{N_L+2} a_c(n)\, e^{-i\,2\pi \frac{k\left(n-\frac{N_L+1+2}{2}\right)}{N_T}} \qquad \text{for } k = 0 \ldots N_T-1$$

where NT=256 is the transform length. Note that this transform can be efficiently implemented using a pruned FFT. The real and imaginary parts of A(k) are then extracted








$$A_r(k) = \operatorname{Re}\big(A(k)\big) \qquad \text{for } k = 0 \ldots \frac{N_T}{2}$$

$$A_i(k) = \operatorname{Im}\big(A(k)\big) \qquad \text{for } k = 0 \ldots \frac{N_T}{2}$$

LSFs may be obtained by a zero-crossing search of Ar(k) and Ai(k) that can be described with the following pseudo-code
















specix = 1;
lsfix = 0;
while ((specix <= 128) && lsfix <= 15)
{
  while (specix <= 128 && Ar[specix-1] * Ar[specix] >= 0)
  {
    specix++;
  }
  if (specix <= 128)
  {
    tmp = specix-1 + Ar[specix-1] / (Ar[specix-1] - Ar[specix]);
    lsf[lsfix++] = tmp/128;
  }
  while (specix <= 128 && Ai[specix-1] * Ai[specix] >= 0)
  {
    specix++;
  }
  if (specix <= 128)
  {
    tmp = specix-1 + Ai[specix-1] / (Ai[specix-1] - Ai[specix]);
    lsf[lsfix++] = tmp/128;
  }
}

If less than 16 LSFs are found, the LSFs are set according to







$$lsf(k) = \frac{k+1}{N_L+1} \qquad \text{for } k = 0 \ldots N_L-1$$

An LPC shaping may be performed in the MDCT or MDST (FD) domain by applying gain factors computed from the weighted and quantized LP filter coefficients transformed to the MDCT or MDST spectrum.


To compute NB=64 LPC shaping gains, weighted LP filter coefficients a are first transformed into the frequency domain using an odd DFT.








$$G_{LPC}(b) = \sum_{k=0}^{N_L} \tilde{a}(k) \cdot e^{-j\frac{\pi k}{N_B}\left(b+\frac{1}{2}\right)} \qquad \text{for } b = 0 \ldots N_B-1$$

LPC shaping gains gLPC(b) may then be obtained as the absolute values of GLPC(b).

gLPC(b)=|GLPC(b)| for b=0 . . . NB−1


The LPC shaping gains gLPC(b) may be applied on the MDCT or MDST frequency lines for each band separately in order to generate the shaped spectrum Xs(k) as outlined by the following code.




















for b = 0 to NB − 1 do
    for k = Ifs(b) to Ifs(b + 1) − 1 do
        Xs(k) = X(k) · gLPC(b)

As can be seen from above, the LPC tool, for performing the LPC analysis, is not controlled by the controller 39: for example, there is no selection of a particular bandwidth.
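
For illustration only, the odd DFT of the coefficients and the band-wise shaping described above might be sketched as follows; the argument names are placeholders, a_tilde stands for the weighted (and quantized) LP coefficients, and X is assumed to be a floating-point spectrum.

import numpy as np

def lpc_shaping_gains(a_tilde, NB=64):
    """Sketch: odd DFT of the LP coefficients and the per-band gains g_LPC(b)."""
    NL = len(a_tilde) - 1
    b = np.arange(NB)
    G = np.zeros(NB, dtype=complex)
    for k in range(NL + 1):
        G += a_tilde[k] * np.exp(-1j * np.pi * k / NB * (b + 0.5))
    return np.abs(G)                          # g_LPC(b) = |G_LPC(b)|

def apply_lpc_shaping(X, g_lpc, I_fs):
    """Apply the gains band-wise, as in the code block above."""
    Xs = X.copy()
    for band in range(len(g_lpc)):
        Xs[I_fs[band]:I_fs[band + 1]] *= g_lpc[band]
    return Xs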


5.3.2 SNS at the Encoder


With reference to FIG. 4a, it is possible to use a spectral noise shaper tool 32a.


Spectral noise shaping (SNS) shapes the quantization noise in the frequency domain such that it is minimally perceived by the human ear, maximizing the perceptual quality of the decoded output.


Spectral noise shaping may be performed using, for example, 16 scaling parameters. These parameters may be obtained in the encoder by first computing the energy of the MDCT (or MDST, or another transform) spectrum in 64 non-uniform bands, then by applying some processing to the 64 energies (smoothing, pre-emphasis, noise floor, log-conversion), then by downsampling the 64 processed energies by a factor of 4 to obtain 16 parameters which are finally normalized and scaled. These 16 parameters may then be quantized using vector quantization. The quantized parameters may then be interpolated to obtain 64 interpolated scaling parameters. These 64 scaling parameters are then used to directly shape the MDCT (or MDST . . . ) spectrum in the 64 non-uniform bands. The scaled MDCT (or MDST . . . ) coefficients may then be quantized using a scalar quantizer with a step size controlled by a global gain. At the decoder, inverse scaling is performed in every one of the 64 bands, shaping the quantization noise introduced by the scalar quantizer.

An SNS technique as disclosed here may use, for example, only 16+1 parameters as side information, and the parameters can be efficiently encoded with a low number of bits using vector quantization. Consequently, the number of side-information bits is reduced, which may lead to a significant advantage at low bitrate and/or low delay. A non-linear frequency scaling may be used. In these examples, none of the LPC-related functions are used, which reduces complexity. The processing functions involved (smoothing, pre-emphasis, noise floor, log-conversion, normalization, scaling, interpolation) need very small complexity in comparison. Only the vector quantization still has relatively high complexity; however, some low complexity vector quantization techniques can be used with small loss in performance (multi-split/multi-stage approaches). This SNS technique does not rely on an LPC-based perceptual filter. It uses 16 scaling parameters which can be computed with a lot of freedom. Flexibility is therefore increased.


At the encoder 30a, the SNS tool 32a may perform at least one of the following steps:


Step 1: Energy Per Band


The energy per band EB(n) may be computed as follows








$$E_B(b) = \frac{\sum_{k=\mathrm{Ind}(b)}^{\mathrm{Ind}(b+1)-1} X(k)^2}{\mathrm{Ind}(b+1)-\mathrm{Ind}(b)} \qquad \text{for } b = 0 \ldots N_B-1$$

where X(k) are the MDCT (or MDST, or another transform) coefficients, NB=64 is the number of bands and Ind(b) are the band indices. The bands may be non-uniform and follow the perceptually relevant Bark scale (smaller in the low frequencies, larger in the high frequencies).


Step 2: Smoothing


The energy per band EB (b) is smoothed using








$$E_S(b) = \begin{cases} 0.75 \cdot E_B(0) + 0.25 \cdot E_B(1), & \text{if } b = 0 \\ 0.25 \cdot E_B(62) + 0.75 \cdot E_B(63), & \text{if } b = 63 \\ 0.25 \cdot E_B(b-1) + 0.5 \cdot E_B(b) + 0.25 \cdot E_B(b+1), & \text{otherwise} \end{cases}$$

This step may be mainly used to smooth the possible instabilities that can appear in the vector EB(b). If not smoothed, these instabilities are amplified when converted to log-domain (see step 5), especially in the valleys where the energy is close to 0.


Step 3: Pre-Emphasis


The smoothed energy per band ES(b) is then pre-emphasized using











$$E_P(b) = E_S(b) \cdot 10^{\frac{b \cdot g_{tilt}}{10 \cdot 63}} \qquad \text{for } b = 0 \ldots 63$$

where gtilt controls the pre-emphasis tilt and depends on the sampling frequency; it is, for example, 18 at 16 kHz and 30 at 48 kHz. The pre-emphasis used in this step has the same purpose as the pre-emphasis used in the LPC-based perceptual filter of conventional technology: it increases the amplitude of the shaped spectrum in the low frequencies, resulting in reduced quantization noise in the low frequencies.


Step 4: Noise Floor


A noise floor at −40 dB is added to EP(b) using

EP(b)=max(EP(b),noiseFloor) for b=0 . . . 63


with the noise floor being calculated by






$$\mathrm{noiseFloor} = \max\!\left(\frac{\sum_{b=0}^{63} E_P(b)}{64} \cdot 10^{-\frac{40}{10}},\; 2^{-32}\right)$$

This step improves quality of signals containing very high spectral dynamics such as e.g. glockenspiel, by limiting the amplitude amplification of the shaped spectrum in the valleys, which has the indirect effect of reducing the quantization noise in the peaks (an increase of quantization noise in the valleys is not perceptible).


Step 5: Logarithm


A transformation into the logarithm domain is then performed using











$$E_L(b) = \frac{\log_2\!\big(E_P(b)\big)}{2} \qquad \text{for } b = 0 \ldots 63$$

Step 6: Downsampling


The vector EL (b) is then downsampled by a factor of 4 using








$$E_4(b) = \begin{cases} w(0)\,E_L(0) + \displaystyle\sum_{k=1}^{5} w(k)\,E_L(4b+k-1), & \text{if } b = 0 \\ \displaystyle\sum_{k=0}^{4} w(k)\,E_L(4b+k-1) + w(5)\,E_L(63), & \text{if } b = 15 \\ \displaystyle\sum_{k=0}^{5} w(k)\,E_L(4b+k-1), & \text{otherwise} \end{cases}$$

with

$$w(k) = \left\{\frac{1}{12},\, \frac{2}{12},\, \frac{3}{12},\, \frac{3}{12},\, \frac{2}{12},\, \frac{1}{12}\right\}$$

This step applies a low-pass filter (w(k)) on the vector EL(b) before decimation. This low-pass filter has a similar effect as the spreading function used in psychoacoustic models: it reduces the quantization noise at the peaks, at the cost of an increase of quantization noise around the peaks where it is anyway perceptually masked.


Step 7: Mean Removal and Scaling


The final scale factors are obtained after mean removal and scaling by a factor of 0.85










scf(n) = 0.85 · ( E4(n) − ( Σ_{b=0}^{15} E4(b) ) / 16 )   for n = 0 . . . 15








Since the codec has an additional global gain, the mean can be removed without any loss of information. Removing the mean also allows more efficient vector quantization. The scaling by 0.85 slightly compresses the amplitude of the noise shaping curve. It has a similar perceptual effect as the spreading function mentioned in Step 6: reduced quantization noise at the peaks and increased quantization noise in the valleys.
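
A minimal C sketch of the mean removal and scaling (helper name assumed):

void sns_scale_factors(const float E4[16], float scf[16])
{
    float mean = 0.0f;
    for (int b = 0; b < 16; b++) {
        mean += E4[b];
    }
    mean /= 16.0f;
    for (int n = 0; n < 16; n++) {
        scf[n] = 0.85f * (E4[n] - mean);   /* slight compression of the noise shaping curve */
    }
}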


Step 8: Quantization


The scale factors are quantized using vector quantization, producing indices which are then packed into the bitstream and sent to the decoder, and quantized scale factors scfQ(n).


Step 9: Interpolation


The quantized scale factors scfQ(n) are interpolated using:

scfQint(0)=scfQ(0)
scfQint(1)=scfQ(0)
scfQint(4n+2)=scfQ(n)+⅛(scfQ(n+1)−scfQ(n)) for n=0 . . . 14
scfQint(4n+3)=scfQ(n)+⅜(scfQ(n+1)−scfQ(n)) for n=0 . . . 14
scfQint(4n+4)=scfQ(n)+⅝(scfQ(n+1)−scfQ(n)) for n=0 . . . 14
scfQint(4n+5)=scfQ(n)+⅞(scfQ(n+1)−scfQ(n)) for n=0 . . . 14
scfQint(62)=scfQ(15)+⅛(scfQ(15)−scfQ(14))
scfQint(63)=scfQ(15)+⅜(scfQ(15)−scfQ(14))

and transformed back into linear domain using

gSNS(b)=2scfQint(b) for b=0 . . . 63


Interpolation may be used to get a smooth noise shaping curve and thus to avoid any big amplitude jumps between adjacent bands.
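
A minimal C sketch of the interpolation and the conversion back to the linear domain (helper name assumed; powf from <math.h>):

#include <math.h>

void sns_interpolate(const float scfQ[16], float gSNS[64])
{
    float scfQint[64];
    scfQint[0] = scfQ[0];
    scfQint[1] = scfQ[0];
    for (int n = 0; n < 15; n++) {
        float d = scfQ[n + 1] - scfQ[n];
        scfQint[4 * n + 2] = scfQ[n] + 0.125f * d;   /* 1/8 */
        scfQint[4 * n + 3] = scfQ[n] + 0.375f * d;   /* 3/8 */
        scfQint[4 * n + 4] = scfQ[n] + 0.625f * d;   /* 5/8 */
        scfQint[4 * n + 5] = scfQ[n] + 0.875f * d;   /* 7/8 */
    }
    scfQint[62] = scfQ[15] + 0.125f * (scfQ[15] - scfQ[14]);   /* extrapolate the last two bands */
    scfQint[63] = scfQ[15] + 0.375f * (scfQ[15] - scfQ[14]);

    for (int b = 0; b < 64; b++) {
        gSNS[b] = powf(2.0f, scfQint[b]);   /* gSNS(b) = 2^scfQint(b) */
    }
}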


Step 10: Spectral Shaping


The SNS scale factors gSNS(b) are applied on the MDCT (or MDST, or another transform) frequency lines for each band separately in order to generate the shaped spectrum Xs(k)









Xs(k) = X(k) / gSNS(b)   for k = Ifs(b) . . . Ifs(b+1)−1, for b = 0 . . . 63







5.4 TNS at the Encoder


FIG. 7 shows a method 70 indicating operations of a TNS tool such as the TNS tool 33 of the encoder 30 or 30a.


At step S71, selection information regarding the selected bandwidth (e.g., parameter Pbw) may be obtained from the encoder bandwidth detector and controller 39, for example.


According to the selection information (bandwidth information), the behaviour of the TNS is different for different bandwidths (NB, WB, SSWB, SWB, FB). An example is provided by the following table:














TABLE 2

Bandwidth  num_tns_filters  start_freq(f)  stop_freq(f)  sub_start(f, s)                    sub_stop(f, s)
NB         1                {12}           {80}          {{12, 34, 57}}                     {{34, 57, 80}}
WB         1                {12}           {160}         {{12, 61, 110}}                    {{61, 110, 160}}
SSWB       1                {12}           {240}         {{12, 88, 164}}                    {{88, 164, 240}}
SWB        2                {12, 160}      {160, 320}    {{12, 61, 110}, {160, 213, 266}}   {{61, 110, 160}, {213, 266, 320}}
FB         2                {12, 200}      {200, 400}    {{12, 74, 137}, {200, 266, 333}}   {{74, 137, 200}, {266, 333, 400}}









For example, when the selection information is SWB, the TNS will perform the filtering twice (see num_tns_filters). As can be seen from the table, different indices are associated with different bandwidths (e.g., for NB the stop frequency is different than for WB, and so on). A sketch of how such a mapping may be stored is given below.
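
The mapping from the selected bandwidth to the TNS configuration can be kept, for instance, in a small constant table mirroring Table 2; the structure and names below are illustrative assumptions only (sub_start/sub_stop omitted for brevity).

typedef struct {
    int num_tns_filters;
    int start_freq[2];
    int stop_freq[2];
} TnsConfig;

static const TnsConfig tns_config[5] = {
    { 1, { 12,   0 }, {  80,   0 } },   /* NB   */
    { 1, { 12,   0 }, { 160,   0 } },   /* WB   */
    { 1, { 12,   0 }, { 240,   0 } },   /* SSWB */
    { 2, { 12, 160 }, { 160, 320 } },   /* SWB  */
    { 2, { 12, 200 }, { 200, 400 } },   /* FB   */
};   /* indexed by the bandwidth information Pbw (0 = NB . . . 4 = FB) */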


Therefore, as can be seen, the TNS tool 33 may operate at a different bandwidth on the basis of the selection set out by the controller 39. Notably, other FD tools of the same encoder apparatus 30 or 30a may continue to perform processes at a different bandwidth.


The TNS encoding steps are described below. First, an analysis estimates a set of reflection coefficients for each TNS filter (step S72). Then, these reflection coefficients are quantized (step S73). And finally, the MDCT- or MDST-spectrum is filtered using the quantized reflection coefficients (step S74).


With reference to step S72, a complete TNS analysis described below may be repeated for every TNS filter f, with f=0 . . . num_tns_filters−1 (num_tns_filters is given in Table 2). Other TNS analysis operations may be performed, which provide reflection coefficients.


The TNS tool may be configured to perform an autocorrelation on a TNS input value. A normalized autocorrelation function may be calculated as follows, for each k=0 . . . 8 (for example)







r(k) =
    r0(k),                                                                             if Π_{s=0}^{2} e(s) = 0
    Σ_{s=0}^{2} ( Σ_{n=sub_start(f,s)}^{sub_stop(f,s)−1−k} Xs(n) · Xs(n+k) ) / e(s),   otherwise

with

r0(k) =
    1,    if k = 0
    0,    otherwise

and

e(s) = Σ_{n=sub_start(f,s)}^{sub_stop(f,s)−1} Xs(n)²   for s = 0 . . . 2













with sub_start(f,s) and sub_stop(f,s) given in Table 2. e(s) is an energy sum over a spectral subsection (a normalization factor between the start and the stop frequency of each filter).
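
A minimal C sketch of this normalized autocorrelation for one TNS filter (helper name and float precision assumed):

void tns_autocorrelation(const float *Xs, const int sub_start[3],
                         const int sub_stop[3], float r[9])
{
    float e[3];
    int zero_energy = 0;
    for (int s = 0; s < 3; s++) {                         /* e(s): energy of each subsection */
        e[s] = 0.0f;
        for (int n = sub_start[s]; n < sub_stop[s]; n++) {
            e[s] += Xs[n] * Xs[n];
        }
        if (e[s] == 0.0f) zero_energy = 1;
    }
    for (int k = 0; k <= 8; k++) {
        if (zero_energy) {                                /* degenerate case: r(k) = r0(k) */
            r[k] = (k == 0) ? 1.0f : 0.0f;
            continue;
        }
        float acc = 0.0f;
        for (int s = 0; s < 3; s++) {
            float ac = 0.0f;
            for (int n = sub_start[s]; n < sub_stop[s] - k; n++) {
                ac += Xs[n] * Xs[n + k];
            }
            acc += ac / e[s];                             /* per-subsection normalization */
        }
        r[k] = acc;
    }
}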


The normalized autocorrelation function may be lag-windowed using, for example:







r(k) = r(k) · exp[ −(1/2) · (0.02 · π · k)² ]   for k = 0 . . . 8






In some examples, the decision to turn the TNS filter f on or off in the current frame may be based on the prediction gain:


If predGain>thresh, then turn on the TNS filter f


with thresh=1.5; the prediction gain may be computed by






predGain = r(0) / e

where e is the prediction error resulting from a Levinson-Durbin recursion performed on the lag-windowed autocorrelation r(k); the same recursion also provides the LPC coefficients a(k) used below.






The additional steps described below are performed only if the TNS filter f is turned on (or in the examples which do not use the turning on/off).


In some examples, a weighting factor may be computed by






γ =
    1 − (1 − γmin) · (thresh2 − predGain) / (thresh2 − thresh),    if tns_lpc_weighting = 1 and predGain < thresh2
    1,                                                             otherwise








with thresh2=2, γmin=0.85 and







tns_lpc_weighting =
    1,    if nbits < 480
    0,    otherwise








The LPC coefficients may be weighted using the factor γ

aw(k) = γ^k · a(k) for k = 0 . . . 8


The weighted LPC coefficients may be converted to reflection coefficients using the following procedure:










aK(k) = aw(k),   k = 0, . . . , K

for k = K to 1 do
    rc(k) = ak(k)
    e = ( 1 − rc(k)² )
    for n = 1 to k−1 do
        ak−1(n) = ( ak(n) − rc(k) · ak(k−n) ) / e






wherein rc(k,f)=rc(k) are the final estimated reflection coefficients for the TNS filter f.


If the TNS filter f is turned off, then the reflection coefficients may be simply set to 0: rc(k,f)=0, k=0 . . . 8.


At step S73, a quantization step may be performed. For example, for each TNS filter f, the reflection coefficients (e.g., as obtained at step S72) may be quantized. For example, scalar uniform quantization in the arcsine domain may be used:











rci(k, f) = nint[ arcsin( rc(k, f) ) / Δ ] + 8   for k = 0 . . . 8








and/or

rcq(k,f)=sin[Δ(rci(k,f)−8)] for k=0 . . . 8

with






Δ = π / 17


and nint(.) being the rounding-to-nearest-integer function, for example; rci(k,f) the quantizer output indices; and rcq(k,f) the quantized reflection coefficients.
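
A minimal C sketch of this arcsine-domain quantization (helper name assumed; asinf, sinf and lrintf from <math.h>):

#include <math.h>

void tns_quantize_rc(const float rc[9], int rc_i[9], float rc_q[9])
{
    const float delta = 3.14159265358979f / 17.0f;         /* Δ = π/17 */
    for (int k = 0; k <= 8; k++) {
        rc_i[k] = (int)lrintf(asinf(rc[k]) / delta) + 8;   /* quantizer output index, nint(.) */
        rc_q[k] = sinf(delta * (float)(rc_i[k] - 8));      /* quantized reflection coefficient */
    }
}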


An order of the quantized reflection coefficients may be calculated using

k = 7
while k ≥ 0 and rcq(k, f) = 0 do
    k = k − 1
rcorder(f) = k + 1


A total number of bits consumed by TNS in the current frame may be computed as follows








nbitsTNS = Σ_{f=0}^{num_tns_filters−1} ⌈ ( 2048 + nbitsTNSorder(f) + nbitsTNScoef(f) ) / 2048 ⌉








with





nbitsTNSorder(f) =
    ac_tns_order_bits[ tns_lpc_weighting ][ rcorder(f) − 1 ],    if rcorder(f) > 0
    0,                                                           otherwise






and/or

nbitsTNScoef(f) =
    Σ_{k=0}^{rcorder(f)−1} ac_tns_coef_bits[ k ][ rci(k, f) ],    if rcorder(f) > 0
    0,                                                            otherwise











⌈ . . . ⌉ denotes rounding up to the nearest integer (ceiling).


The tables ac_tns_order_bits and ac_tns_coef_bits may be pre-defined.


At step S74, a digital representation of an information signal in the FD (e.g., as provided by the LPC tool 32 or SNS tool 32a) may be filtered. This representation may be, in examples, in the form of a modified discrete cosine or sine transform (MDCT or MDST). The MDCT spectrum Xs(n) may be filtered using the following algorithm, for example:

s0(start_freq(0)−1) = s1(start_freq(0)−1) = . . . = s7(start_freq(0)−1) = 0

for f = 0 to num_tns_filters−1 do
    for n = start_freq(f) to stop_freq(f)−1 do
        t0(n) = s0(n) = Xs(n)
        for k = 0 to 7 do
            tk+1(n) = tk(n) + rcq(k)·sk(n−1)
            sk+1(n) = rcq(k)·tk(n) + sk(n−1)
        Xf(n) = t8(n)


where Xf(n) is the TNS filtered MDCT or MDST spectrum.


Other filtering techniques may be used. However, it may be seen that the TNS is applied to the particular bandwidth (e.g., NB, WB, SSWB, SWB, FB) chosen by the controller 39 on the basis of the signal characteristics.
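
A minimal C sketch of the lattice analysis filter for a single TNS filter is given below; it processes only the frequency range selected for the detected bandwidth. The variable names and the state handling (states simply zeroed at start_freq, one filter at a time) are assumptions made for illustration.

void tns_analysis_filter(const float *Xs, float *Xf, const float rcq[8],
                         int start_freq, int stop_freq)
{
    float s[8] = { 0 };                       /* s_k(n−1), zero at start_freq−1 */
    for (int n = start_freq; n < stop_freq; n++) {
        float t = Xs[n];                      /* t_0(n) */
        float s_prev = t;                     /* s_0(n) = Xs(n) */
        for (int k = 0; k < 8; k++) {
            float t_next = t + rcq[k] * s[k];        /* t_{k+1}(n) = t_k(n) + rcq(k)·s_k(n−1) */
            float s_next = rcq[k] * t + s[k];        /* s_{k+1}(n) = rcq(k)·t_k(n) + s_k(n−1) */
            s[k] = s_prev;                    /* keep s_k(n) for the next sample */
            s_prev = s_next;
            t = t_next;
        }
        Xf[n] = t;                            /* Xf(n) = t_8(n) */
    }
}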


5.5 Spectral Quantization at the Encoder

A spectrum quantizer tool 34 is here discussed. The MDCT or MDST spectrum after TNS filtering (Xf(n)) may be quantized using dead-zone plus uniform threshold scalar quantization and the quantized MDCT or MDST spectrum Xq(n) may then be encoded using arithmetic encoding. A global gain gg may control the step size of the quantizer. This global gain is quantized with 7 bits and the quantized global gain index ggind is then an integer, for example, between 0 and 127. The global gain index may be chosen such that the number of bits needed to encode the quantized MDCT or MDST spectrum is as close as possible to the available bit budget.


In one example, a number of bits available for coding the spectrum may be given by







nbitsspec = nbits − nbitsbw − nbitsTNS − nbitsLTPF − nbitsLPC/SNS − nbitsgain − nbitsnf − ⌈ log2( NE / 2 ) ⌉ − 1





with nbits being the number of bits available in one TD frame for the original information signal, nbitsbw provided in Table 1, nbitsTNS provided by the TNS (total number of bits consumed by TNS in a current frame), nbitsLTPF being associated with the LTPF 38b (number of bits consumed by LTPF), nbitsLPC/SNS=38, nbitsgain=7 and nbitsnf=3, for example. In examples, also protection bits (e.g., cyclic redundancy check, CRC, bits) may be taken into consideration.


An offset may first be computed using

nbitsoffset=0.8*nbitsoffsetold+0.2*min(40,max(−40,nbitsoffsetold+nbitsspecold−nbitsestold))


where nbitsoffsetold is the value of nbitsoffset in the previous frame, nbitsspecold is the value of nbitsspec in the previous frame and nbitsestold is the value of nbitsest in the previous frame.


This offset may then be used to adjust the number of bits available for coding the spectrum

nbitsspec=nint(nbitsspec+nbitsoffset)


A global gain index may then be estimated such that the number of bits needed to encode the quantized MDCT or MDST spectrum is as close as possible to the available bit budget. This estimation is based on a low-complexity bisection search which coarsely approximates the number of bits needed to encode the quantized spectrum. The algorithm can be described as follows
















fac = 128;
ggind = 127;
for (iter = 0; iter < 7; iter++)
{
    fac >>= 1;
    ggind −= fac;
    tmp = 0;
    for (i = 0; i < NE/4; i++)
    {
        if (E[i]*28/20 < ggind)
        {
            tmp += 2.7*28/20;
        }
        else
        {
            tmp += E[i]*28/20 − ggind + 7*28/20;
        }
    }
    if (tmp > nbitsspec*1.4*28/20)
    {
        ggind += fac;
    }
}









with E[k] being the energy (in dB) of blocks of 4 MDCT or MDST coefficients given by







E(k) = 10 · log10( Σ_{n=0}^{3} Xf(4·k + n)² )   for k = 0 . . . NE/4 − 1







The global gain index above is first unquantized using







gg = 10^( ggind / 28 )







The spectrum Xf may then be quantized using, for example:








Xq(n) =
    min( ⌊ Xf(n)/gg + 0.375 ⌋, 32767 ),      if Xf(n) ≥ 0
    max( ⌈ Xf(n)/gg − 0.375 ⌉, −32768 ),     otherwise
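
A minimal C sketch of this dead-zone quantizer with the unquantized global gain (helper name assumed; powf, floorf and ceilf from <math.h>):

#include <math.h>

void quantize_spectrum(const float *Xf, int NE, int ggind, int *Xq)
{
    float gg = powf(10.0f, (float)ggind / 28.0f);          /* gg = 10^(ggind/28) */
    for (int n = 0; n < NE; n++) {
        if (Xf[n] >= 0.0f) {
            int v = (int)floorf(Xf[n] / gg + 0.375f);
            Xq[n] = (v < 32767) ? v : 32767;               /* clip to the positive limit */
        } else {
            int v = (int)ceilf(Xf[n] / gg - 0.375f);
            Xq[n] = (v > -32768) ? v : -32768;             /* clip to the negative limit */
        }
    }
}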








The number of bits nbitsest needed to encode the quantized MDCT or MDST (or, anyway, FD) spectrum Xq(n) can be accurately estimated using the algorithm below.


A bitrate flag is first computed using, for example:
















get_rateFlag(fs, nbits)
if (nbits > (160 + min(4, (fs/8000 − 1)) * 160))
{
    rateFlag = 512;
}
else
{
    rateFlag = 0;
}
return rateFlag;









Then the index of the last non-zeroed 2-tuple is obtained by
















get_lastnz(Xq[ ], NE)
lastnz = NE;
while (lastnz > 2 && Xq[lastnz−1] == 0 && Xq[lastnz−2] == 0)
{
    lastnz −= 2;
}
return lastnz;









The number of bits nbitsest may be then computed as follows
















nbitsest = 0;
c = 0;
for (k = 0; k < lastnz; k += 2)
{
    t = c + rateFlag;
    if (k > NE/2)
    {
        t += 256;
    }
    a = abs(Xq[k]);
    b = abs(Xq[k+1]);
    nbitsest += (min(a,1) + min(b,1)) * 2048;
    lev = 0;
    while (max(a,b) >= 4)
    {
        pki = ac_spec_lookup[t + lev*1024];
        nbitsest += 2*2048 + ac_spec_bits[pki][16];
        a >>= 1;
        b >>= 1;
        lev = min(lev+1, 3);
    }
    pki = ac_spec_lookup[t + lev*1024];
    sym = a + 4*b;
    nbitsest += ac_spec_bits[pki][sym];
    if (lev <= 1)
    {
        t = 1 + (a+b)*(lev+1);
    }
    else
    {
        t = 12 + lev;
    }
    c = (c&15)*16 + t;
}
nbitsest = ceil(nbitsest/2048);









where ac_spec_lookup and ac_spec_bits are tables which may be predefined.


The number of bits nbitsest may be compared with the available bit budget nbitsspec. If they are far from each other, then the quantized global gain index ggind is adjusted and the spectrum is requantized. A procedure used to adjust the quantized global gain index ggind is given below
















if ((ggind < 127 && nbitsest > nbitsspec) ||
    (ggind > 0 && nbitsest < nbitsspec − 20))
{
    if (nbitsest < nbitsspec − 20)
    {
        ggind −= 1;
    }
    else if (ggind == 126 || nbitsest < nbitsspec + 20)
    {
        ggind += 1;
    }
    else
    {
        ggind += 2;
    }
}









As can be seen from above, the spectral quantization is not controlled by the controller 39: there is no restriction to a particular band.


5.6 Entropy Coding

All or part of the encoded data (TNS data, LTPF data, global gain, quantized spectrum . . . ) may be entropy coded, e.g., by compression according to any algorithm.


A portion of this data may be composed by pure bits which are directly put in the bitstream starting from the end of the bitstream and going backward.


The rest of data may be encoded using arithmetic encoding starting from the beginning of the bitstream and going forward.


The two data fields above may be exchanged regarding starting point and direction of reading/writing of the bit stream.


An example in pseudo code may be:














bp = 0;
bp_side = nbytes − 1;
mask_side = 1;
nbits_written = 2 << 11;
c = 0;
lastnz = get_lastnz(Xq, NE);
rateFlag = get_rateFlag(fs, nbits);
/* Bandwidth */
if (nbitsbw > 0)
{
    write_uint_backward(bytes, &bp_side, &mask_side, Pbw, nbitsbw);
    nbits_written += nbitsbw << 11;
}
/* Global Gain */
write_uint_backward(bytes, &bp_side, &mask_side, ggind, 7);
nbits_written += 7 << 11;
/* Noise Factor */
write_uint_backward(bytes, &bp_side, &mask_side, 0, 3);
nbits_written += 3 << 11;
/* TNS activation flag */
for (f = 0; f < num_tns_filters; f++)
{
    write_bit_backward(bytes, &bp_side, &mask_side, min(rcorder(f), 1));
    nbits_written += 1 << 11;
}
/* LTPF data */
write_bit_backward(bytes, &bp_side, &mask_side, pitch_present);
nbits_written += 1 << 11;
if (pitch_present != 0)
{
    write_uint_backward(bytes, &bp_side, &mask_side, pitch_index, 9);
    write_uint_backward(bytes, &bp_side, &mask_side, ltpf_active, 1);
    nbits_written += 10 << 11;
}
/* Env-VQ integer bits */
write_uint_backward(bytes, &bp_side, &mask_side, L_lsf_idx[0], 10 >> 1);
write_uint_backward(bytes, &bp_side, &mask_side, L_lsf_idx[1], 10 >> 1);
write_bit_backward(bytes, &bp_side, &mask_side, lsf_submode_flag);
write_uint_backward(bytes, &bp_side, &mask_side, L_lsf_idx[3], fgBits);
write_bit_backward(bytes, &bp_side, &mask_side, L_lsf_idx[4]);
nbits_written += (12 + fgBits) << 11;
/* Last non-zero tuple */
nbits_lastnz = ceil(log2(NE/2));
bp_side_lastnz = bp_side;
mask_side_lastnz = mask_side;
write_uint_backward(bytes, &bp_side_lastnz, &mask_side_lastnz, (lastnz >> 1) − 1, nbits_lastnz);
nbits_written += nbits_lastnz << 11;









5.7 Noise Estimation at the Encoder

A noise estimation tool 36 (noise level estimator) may control the noise filling on the decoder side. At the encoder side, the noise level parameter may be estimated, quantized and transmitted or stored in a bitstream.


The noise level may be estimated based on the spectral coefficients which have been quantized to zero, i.e. Xq(k)==0. The indices for the relevant spectral coefficients are given by








INF(k) =
    1,    if 24 ≤ k < bwstop and Xq(i) == 0 for all i = k−3 . . . min(bwstop, k+3)
    0,    otherwise








where bwstop may depend on the bandwidth detected at step S62 and/or by the bandwidth detector and controller 39 as defined, for example, in the following table:












TABLE 3

            Bandwidth (Pbw, 39a)
            NB     WB     SSWB    SWB    FB
bw_stop     80     160    240     320    400










For the identified indices, the mean level of missing coefficients is estimated based on the spectrum after TNS filtering (Xf(k)), for example, and normalized by the global gain.







LNF = ( Σ_{k=0}^{NE−1} INF(k) · |Xf(k)| / gg ) / ( Σ_{k=0}^{NE−1} INF(k) )








The final noise level may be quantized to eight steps:

FNF = min(max(⌊8 − 16·LNF⌋, 0), 7)


Therefore, the noise level estimator tool 36 may be controlled by the controller 39, e.g., on the basis of bandwidth information 39a.


For example, an electronic version of Table 3 may be stored in a storage unit so that, when the bandwidth selection for a particular bandwidth is obtained, the parameter bwstop is easily derived.
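
A minimal C sketch of the encoder-side noise level estimation and its 3-bit quantization (helper name assumed; bw_stop taken from Table 3):

#include <math.h>

int estimate_noise_factor(const float *Xf, const int *Xq, int NE, int bw_stop, float gg)
{
    float num = 0.0f;
    int   cnt = 0;
    for (int k = 24; k < bw_stop && k < NE; k++) {
        int hi = (k + 3 < bw_stop) ? k + 3 : bw_stop;      /* min(bw_stop, k+3) */
        if (hi >= NE) hi = NE - 1;                         /* guard against reading past the spectrum */
        int is_hole = 1;                                   /* INF(k): neighbourhood quantized to zero? */
        for (int i = k - 3; i <= hi && is_hole; i++) {
            if (Xq[i] != 0) is_hole = 0;
        }
        if (is_hole) {
            num += fabsf(Xf[k]) / gg;                      /* level of the zero-quantized line */
            cnt++;
        }
    }
    float LNF = (cnt > 0) ? num / (float)cnt : 0.0f;
    int FNF = (int)floorf(8.0f - 16.0f * LNF);             /* quantize to eight steps */
    if (FNF < 0) FNF = 0;
    if (FNF > 7) FNF = 7;
    return FNF;
}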


5.8 Entropy Decoding at the Decoder

All the encoded data (TNS data, LTPF data, global gain, quantized spectrum . . . ) may be entropy decoded at the decoder side, e.g., using the decoder tool 42. A bitstream provided by an encoder may, therefore, be decompressed according to any algorithm.


5.9 Noise Filling at the Decoder

A decoder noise filling tool 43 is here discussed. The decoder noise filling tool 43 may be controlled, inter alia, by the decoder bandwidth controller 49 (and/or by the controller 39 via information 39a encoded in the bitstream, such as the control data field Nbw and/or Pbw of Table 1).


The indices for the relevant spectral coefficients may be given by








INF(k) =
    1,    if 24 ≤ k < bwstop and X̂q(i) == 0 for all i = k−3 . . . min(bwstop, k+3)
    0,    otherwise








where bwstop may be given in Table 3.


The noise filling may be applied on the identified relevant spectral lines INF(k) using a transmitted noise factor FNF obtained from the encoder. FNF may be calculated at the noise estimator on encoder side. FNF may be a 3 bit value coded as side information in the bit stream. FNF may be obtained, for example, using the following procedure:














/* Bandwidth */
if (nbitsbw > 0)
{
    Pbw = read_uint(bytes, &bp_side, &mask_side, nbitsbw);
}
else
{
    Pbw = 0;
}
/* Global Gain */
ggind = read_uint(bytes, &bp_side, &mask_side, 7);
/* Noise Level */
FNF = read_uint(bytes, &bp_side, &mask_side, 3);
/* TNS activation flag */
if (Pbw < 3)
{
    num_tns_filters = 1;
}
else
{
    num_tns_filters = 2;
}
for (f = 0; f < num_tns_filters; f++)
{
    rcorder(f) = read_bit(bytes, &bp_side, &mask_side);
}
/* LTPF data */
pitch_present = read_bit(bytes, &bp_side, &mask_side);
if (pitch_present != 0)
{
    pitch_index = read_uint(bytes, &bp_side, &mask_side, 9);
    ltpf_active = read_uint(bytes, &bp_side, &mask_side, 1);
}
else
{
    pitch_index = 0;
    ltpf_active = 0;
}
/* LSF-VQ integer bits */
L_lsf_idx[0] = read_uint(bytes, &bp_side, &mask_side, 10 >> 1);
L_lsf_idx[1] = read_uint(bytes, &bp_side, &mask_side, 10 >> 1);
lsf_submode_flag = read_bit(bytes, &bp_side, &mask_side);
L_lsf_idx[3] = read_uint(bytes, &bp_side, &mask_side, fgBits);
L_lsf_idx[4] = read_bit(bytes, &bp_side, &mask_side);
/* Last non-zero tuple */
nbits_lastnz = ceil(log2(NE/2));
lastnz = read_uint(bytes, &bp_side, &mask_side, nbits_lastnz);
lastnz = (lastnz + 1) << 1;









A procedure is here provided:

















L̂NF = (8 − FNF)/16;
for k = 0 . . . bwstop−1
    if INF(k) == 1
        nf_seed = (13849 + nf_seed*31821) & 0xFFFF;
        if nf_seed >= 0x8000
            X̂q(k) = L̂NF;
        else
            X̂q(k) = −L̂NF;









How to obtain the nf_seed may be described, for example, by the following pseudocode:
















{
    X̂q[k] = 0;
}
/* Noise Filling Seed */
tmp = 0;
for (k = 0; k < NE; k++)
{
    tmp += abs( X̂q[k] ) * k;
}
nf_seed = tmp & 0xFFFF;









As can be seen from above, the decoder noise filling tool 43 may make use of the parameter bwstop.


In some examples, the parameter bwstop is explicitly obtained as a value in the bitstream. In examples, the parameter bwstop is obtained by the controller 49 on the basis of the bandwidth information 39a (Pbw) in a control field of the bitstream encoded by the encoder. The decoder may have an electronic version of Table 3 stored in a non-transitory storage unit. Accordingly, the bitstream length is reduced.


Therefore, the bandwidth controller 49 (and/or the bandwidth detector and controller 39 of the decoder via the control data 39a) may control the decoder noise filling tool 43.


5.9 Global Gain at the Decoder

A global gain may be applied on the spectrum after the noise filling has been applied using, for example, a formula such as









X̂f(k) = X̂q(k) · 10^( ggind / 28 )   for k = 0 . . . NE−1






where ggind is a global gain index, e.g., obtained from the encoder.


5.10 TNS at the Decoder

A TNS decoder tool 45 is here discussed. The quantized reflection coefficients may be obtained for each TNS filter f using

rcq(k,f)=sin[Δ(rci(k,f)−8)] k=0 . . . 8

where rci(k,f) are the quantizer output indices.


The MDCT or MDST spectrum X̂f(n) (e.g., as generated by the global gain tool) may then be filtered using a procedure such as the following:














s0(start_freq(0)−1) = s1(start_freq(0)−1) = . . . = s7(start_freq(0)−1) = 0

for f = 0 to num_tns_filters−1 do
    for n = start_freq(f) to stop_freq(f)−1 do
        t8(n) = X̂f(n)
        for k = 7 to 0 do
            tk(n) = tk+1(n) − rcq(k)·sk(n−1)
            sk+1(n) = rcq(k)·tk(n) + sk(n−1)
        X̂s(n) = s0(n) = t0(n)









where X̂s(n) is the output of the TNS decoder.


The parameters num_tns_filters, start_freq and stop_freq may be provided, on the basis of control information provided by the encoder.


In some examples num_tns_filters, start_freq and/or stop_freq are not explicitly provided in the bitstream. In examples, num_tns_filters, start_freq and stop_freq are derived on the basis of the Nbw value in a control field of the bitstream encoded by the encoder. For example, the decoder may have an electronic version of Table 2 (or at least a portion thereof) stored therein. Accordingly, the bitstream length is reduced.


Therefore, the TNS decoder tool 45 may be controlled by the bandwidth detected at the encoder side.


5.11.1 MDCT or MDST Shaping at the Decoder


An MDCT or MDST shaping tool 46 is here discussed. The LPC or SNS shaping may be performed in the MDCT (FD) domain by applying, to the MDCT or MDST spectrum, gain factors computed from the decoded LP filter coefficients.


To compute the NB LPC shaping gains, the decoded LP filter coefficients ã may be first transformed into the frequency domain using an odd DFT.








GLPC(b) = Σ_{k=0}^{NL} ã(k) · exp( −j · (π·k / NB) · (b + 1/2) )   for b = 0 . . . NB−1






The LPC shaping gains gLPC(b) may then be computed as the reciprocal absolute values of GLPC(b).








gLPC(b) = 1 / | GLPC(b) |   for b = 0 . . . NB−1






The LPC shaping gains gLPC(b) may be applied on the TNS-filtered MDCT frequency lines for each band separately in order to generate the shaped spectrum X̂(k), as outlined, for example, by the following code:
















for (b = 0; b < NB; b++) {
    for (k = Ifs(b); k < Ifs(b+1); k++) {
        X̂(k) = X̂s(k) · gLPC(b)
    }
}
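
The shaping gains themselves may be computed, for instance, by directly evaluating the odd DFT given above; the following C sketch spells out the complex exponential with real arithmetic (helper name assumed; cosf, sinf and sqrtf from <math.h>):

#include <math.h>

void lpc_shaping_gains(const float *a, int NL, int NB, float *gLPC)
{
    for (int b = 0; b < NB; b++) {
        float re = 0.0f, im = 0.0f;
        for (int k = 0; k <= NL; k++) {
            float phi = -3.14159265358979f * (float)k / (float)NB * ((float)b + 0.5f);
            re += a[k] * cosf(phi);                /* real part of a(k)·e^(−j·phi) */
            im += a[k] * sinf(phi);                /* imaginary part */
        }
        gLPC[b] = 1.0f / sqrtf(re * re + im * im); /* reciprocal magnitude of GLPC(b) */
    }
}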









As can be seen above, the MDCT or MDST shaping tool 46 does not need to be restricted to a particular bandwidth and, therefore, does not need to be controlled by the controller 49 or 39.


5.11.2 SNS at the Decoder


The following steps may be performed at the noise shaper decoder, SNS, tool 46a:


Step 1: Quantization


The vector quantizer indices produced in encoder step 8 (see section 5.3.2) are read from the bitstream and used to decode the quantized scale factors scfQ (n).


Step 2: Interpolation


Same as Step 9 at section 5.3.2.


Step 3: Spectral Shaping


The SNS scale factors gSNS(b) are applied on the quantized MDCT (or MDST, or another transform) frequency lines for each band separately in order to generate the decoded spectrum {circumflex over (X)}(k) as outlined by the following code.

X̂(k) = X̂s(k) · gSNS(b) for k = Ifs(b) . . . Ifs(b+1)−1, for b = 0 . . . 63


5.12 MDCT or MDST Synthesis at the Decoder

An inverse MDCT or MDST tool 48a is here discussed (other tools based on other transformations, such as lapped transformations, may be used).


A reconstructed spectrum {circumflex over (X)}(k) may be transformed to time domain by the following steps:


1. Generation of time domain aliasing buffer {circumflex over (t)}(n)








t̂(n) = √(2/NF) · Σ_{k=0}^{NF−1} X̂(k) · cos[ (π/NF) · ( n + 1/2 + NF/2 ) · ( k + 1/2 ) ]   for n = 0 . . . 2NF−1





2. Windowing of time-aliased buffer

t̂(n) = wN(2NF−1−n) · t̂(n) for n = 0 . . . 2NF−1


3. Conduct overlap-add operation to get reconstructed time samples {circumflex over (x)}(n)

{circumflex over (x)}(n)=mem_ola_add(n)+{circumflex over (t)}(Z+n) for n=0 . . . NF−Z−1
{circumflex over (x)}(n)={circumflex over (t)}(Z+n) for n=NF−Z . . . NF−1
mem_ola_add(n)={circumflex over (t)}(NF+Z+n) for n=0 . . . NF−Z−1


where mem_ola_add(n) is initialized to 0 before decoding the first frame.
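
A minimal C sketch of the windowing and overlap-add steps 2 and 3 (buffer layout and helper name assumed; wN is the synthesis window of length 2·NF):

void imdct_window_ola(float *t_hat, const float *wN, int NF, int Z,
                      float *mem_ola_add, float *x_hat)
{
    for (int n = 0; n < 2 * NF; n++) {
        t_hat[n] = wN[2 * NF - 1 - n] * t_hat[n];     /* step 2: time-reversed window */
    }
    for (int n = 0; n < NF - Z; n++) {
        x_hat[n] = mem_ola_add[n] + t_hat[Z + n];     /* step 3: overlap-add with previous frame */
    }
    for (int n = NF - Z; n < NF; n++) {
        x_hat[n] = t_hat[Z + n];
    }
    for (int n = 0; n < NF - Z; n++) {
        mem_ola_add[n] = t_hat[NF + Z + n];           /* memory for the next frame */
    }
}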


With reference to step 1, an MDST may be performed by exchanging the cos function by a sine function, e.g., to have:








t̂(n) = √(2/NF) · Σ_{k=0}^{NF−1} X̂(k) · sin[ (π/NF) · ( n + 1/2 + NF/2 ) · ( k + 1/2 ) ]   for n = 0 . . . 2NF−1





As can be seen above, the inverse MDCT or MDST tool 48a is not controlled on the basis of the bandwidth determined at the encoder side.


6. OTHER EXAMPLES


FIG. 8a shows an apparatus 110 which may implement at least some tools of the encoder apparatus 30 or 30a and/or perform at least some steps of the method 60 and/or 70. The apparatus 110 may comprise a processor 111 and a non-transitory memory unit 112 storing instructions which, when executed by the processor 111, may cause the processor 111 to implement at least one of the TD and/or FD tools of the encoder apparatus 30 or 30a. In particular, the instructions may implement a subgroup of FD tools (e.g., TNS and/or noise filling) and other FD tools which are not in the subgroup (e.g., 31, 32, 34, 35). The instructions may also comprise instructions which, when executed by the processor 111, perform a selection of the bandwidth so that the bandwidth of the signals processed by the tools in the subgroup of FD tools (e.g., TNS and/or noise filling) differs from the bandwidth of the signals processed by the other FD tools which are not in the subgroup (e.g., 31, 32, 34, 35). The instructions may be such as to control the bandwidth selection based on energy detections associated to the different bandwidths. The instructions may also comprise instructions which, when executed by the processor 111, permit to control a decoder and, in particular, permit to control the bandwidth of a subgroup of FD tools (e.g., 43, 45) which may be different from the bandwidth of other FD tools. The bandwidth chosen for the subgroup at the encoder may be the same chosen for the subgroup at the decoder. The non-transitory memory unit 112 may also comprise other data, such as at least portions electronic versions of Tables 1, 2, and/or 3. The apparatus 110 may comprise a storage space 118 for storing, for example, a bitstream obtained from an information signal (e.g., an audio signal). The apparatus 110 may comprise an output unit 117 for transmitting data, e.g., wirelessly, e.g., using a particular protocol, such as Bluetooth. For example, the apparatus 110 may define, by executing the instructions stored in the non-transitory memory unit 112, a bitstream to be transmitted to a decoder. The apparatus 110 may also comprise an input unit 116 for obtaining data, e.g., wirelessly, e.g., using a particular protocol, such as Bluetooth.



FIG. 8b shows an apparatus 120 which may implement at least some tools of the decoder apparatus 40 or 40a. The apparatus 120 may comprise a processor 121 and a non-transitory memory unit 122 storing instructions which, when executed by the processor 121, may cause the processor 121 to implement at least one of the TD and/or FD tools of the decoder apparatus 40 or 40a. In particular, the instructions may implement a subgroup of FD tools (e.g., TNS and/or noise filling) and other FD tools which are not in the subgroup (e.g., 44, 46, etc.). The instructions may also comprise instructions which, when executed by the processor 121, perform a selection of the bandwidth so that the bandwidth of the signals processed by the tools in the subgroup of FD tools (e.g., TNS and/or noise filling) differs from the bandwidth of the signals processed by the other FD tools which are not in the subgroup (e.g., 44, 46, etc.). The instructions may be such as to control a bandwidth selection based on energy detections associated to the different bandwidths, as, for example, performed by an encoder. The instructions may also comprise instructions which, when executed by the processor 121, permit to operate as a encoder and, in particular, permit to control the bandwidth of a subgroup of FD tools (e.g., 43, 45) which may be different from the bandwidth of other FD tools. The bandwidth chosen for the subgroup at the encoder may be the same chosen for the subgroup at the decoder. The non-transitory memory unit 122 may also comprise other data, such as at least portions electronic versions of Tables 1, 2, and/or 3. The apparatus 120 may comprise a storage space 128 for storing, for example, a bitstream obtained from an information signal (e.g., an audio signal).


The apparatus 120 may comprise an output unit 127 for transmitting data, e.g., wirelessly, e.g., using a particular protocol, such as Bluetooth. The apparatus 120 may also comprise an input unit 126 for obtaining data, e.g., wirelessly, e.g., using a particular protocol, such as Bluetooth. For example, the apparatus 120 may obtain, by executing the instructions stored in the non-transitory memory unit 122, a bitstream transmitted by an encoder.


In examples, the apparatus 110 and 120 may be the same device. In examples, the composition of different apparatus 110 and 120 form a system.


Depending on certain implementation requirements, examples may be implemented in hardware. The implementation may be performed using a digital storage medium, for example a floppy disk, a Digital Versatile Disc (DVD), a Blu-Ray Disc, a Compact Disc (CD), a Read-only Memory (ROM), a Programmable Read-only Memory (PROM), an Erasable and Programmable Read-only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM) or a flash memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.


Generally, examples may be implemented as a computer program product with program instructions, the program instructions being operative for performing one of the methods when the computer program product runs on a computer. The program instructions may for example be stored on a machine readable medium.


Other examples comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier. In other words, an example of method is, therefore, a computer program having a program instructions for performing one of the methods described herein, when the computer program runs on a computer.


A further example of the methods is, therefore, a data carrier medium (or a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein. The data carrier medium, the digital storage medium or the recorded medium are tangible and/or non-transitionary, rather than signals which are intangible and transitory.


A further example comprises a processing unit, for example a computer, or a programmable logic device performing one of the methods described herein.


A further example comprises a computer having installed thereon the computer program for performing one of the methods described herein.


A further example comprises an apparatus or a system transferring (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.


In some examples, a programmable logic device (for example, a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some examples, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods may be performed by any appropriate hardware apparatus.


While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations and equivalents as fall within the true spirit and scope of the present invention.

Claims
  • 1. An encoder apparatus comprising: a plurality of frequency domain (FD) encoder tools for encoding an information signal, the information signal presenting a plurality of frames; andan encoder bandwidth detector and controller configured to select a bandwidth for at least a subgroup of the plurality of FD encoder tools, the subgroup comprising less FD encoder tools than the plurality of FD encoder tools, on a basis of information signal characteristics so that at least one of the FD encoder tools of the subgroup comprises a different bandwidth with respect to at least one of FD encoder tools which are not in the subgroup,wherein at least one FD encoder tool of the subgroup is a temporal noise shaping (TNS) tool,wherein at least one FD encoder tool which is not in the subgroup is connected upstream to the TNS tool and is chosen between a linear predictive coding (LPC) based spectral shaper and a spectral noise shaping (SNS) tool, andwherein the at least one FD encoder tool which is not in the subgroup is configured to operate at a full bandwidth or at a bandwidth broader than the selected bandwidth.
  • 2. The encoder apparatus of claim 1, wherein: the at least one FD encoder tool of the subgroup is a noise level estimator tool.
  • 3. The encoder apparatus of claim 1, wherein: the at least one FD encoder tool which is not in the subgroup is chosen between a spectral quantizer and a residual coder.
  • 4. The encoder apparatus of claim 1, wherein: the encoder bandwidth detector and controller is configured to select a bandwidth of the at least one FD encoder tool of the subgroup between at least a first bandwidth common to the at least one of the FD encoder tools which are not in the subgroup and a second bandwidth different from a bandwidth of the at least one of the FD encoder tools which are not in the subgroup.
  • 5. The encoder apparatus of claim 1, wherein: the encoder bandwidth detector and controller is configured to select a bandwidth of at least one of the plurality of FD encoder tools on a basis of at least one energy estimation on the information signal.
  • 6. The encoder apparatus of claim 1, wherein: the encoder bandwidth detector and controller is configured to compare at least one energy estimate associated to a bandwidth of the information signal to a respective threshold to control a bandwidth for at least one of the plurality of FD encoder tools.
  • 7. The encoder apparatus of claim 1, wherein: the TNS tool is configured to autocorrelate a TNS input signal within a bandwidth chosen by the encoder bandwidth detector and controller.
  • 8. The encoder apparatus of claim 1, wherein the encoder bandwidth detector and controller is configured to select at least one bandwidth which is within the full bandwidth at which the at least one of the FD encoder tools which are not in the subgroup is configured to operate.
  • 9. The encoder apparatus of claim 1, wherein at least one of remaining FD encoder tools of the plurality of FD encoder tools is configured to operate in open chain with respect to a bandwidth chosen by the encoder bandwidth detector and controller.
  • 10. The encoder apparatus of claim 1, wherein: the encoder bandwidth detector and controller is configured to select a bandwidth among a finite number of bandwidths and/or among a set of pre-defined bandwidths.
  • 11. The encoder apparatus of claim 1, wherein: the encoder bandwidth detector and controller is configured to perform a selection among at least one or a combination of the following bandwidths: a 4 KHz, 8 KHz, 12 KHz, 16 KHz, and 24 KHz, and/or NB, WB, SSWB, SWB, FB.
  • 12. The encoder apparatus of claim 1, wherein: the encoder bandwidth detector and controller is configured to control the signalling of the bandwidth to a decoder.
  • 13. The encoder apparatus of claim 1, configured to: encode a control data field comprising information regarding a chosen bandwidth.
  • 14. The encoder apparatus of claim 1, configured to: encode a control data field comprising at least one of: 0 data bits corresponding to NB bandwidth;1 data bit corresponding to NB, WB bandwidth;2 data bits corresponding to NB, WB, SSWB bandwidth;2 data bits corresponding to NB, WB, SSWB, SWB bandwidth; and3 data bits corresponding to NB, WB, SSWB, SWB, FB bandwidth.
  • 15. An encoder apparatus comprising: a plurality of frequency domain (FD) encoder tools for encoding an information signal, the information signal presenting a plurality of frames; andan encoder bandwidth detector and controller configured to select a bandwidth for at least a subgroup of the plurality of FD encoder tools, the subgroup comprising less FD encoder tools than the plurality of FD encoder tools, on a basis of information signal characteristics so that at least one of the FD encoder tools of the subgroup comprises a different bandwidth with respect to at least one of FD encoder tools which are not in the subgroup,wherein the encoder bandwidth detector and controller is configured to select a bandwidth of at least one of the plurality of FD encoder tools on a basis of at least one energy estimation on the information signal, andwherein the at least one energy estimation is performed as:
  • 16. An encoder apparatus comprising: a plurality of frequency domain (FD) encoder tools for encoding an information signal, the information signal presenting a plurality of frames; andan encoder bandwidth detector and controller configured to select a bandwidth for at least a subgroup of the plurality of FD encoder tools, the subgroup comprising less FD encoder tools than the plurality of FD encoder tools, on a basis of information signal characteristics so that at least one of the FD encoder tools of the subgroup comprises a different bandwidth with respect to at least one of FD encoder tools which are not in the subgroup,the encoder apparatus further comprising a TNS tool configured to perform a filtering operation comprising a calculation of the filtering operation:
  • 17. An encoder apparatus comprising: a plurality of frequency domain (FD) encoder tools for encoding an information signal, the information signal presenting a plurality of frames; andan encoder bandwidth detector and controller configured to select a bandwidth for at least a subgroup of the plurality of FD encoder tools, the subgroup comprising less FD encoder tools than the plurality of FD encoder tools, on a basis of information signal characteristics so that at least one of the FD encoder tools of the subgroup comprises a different bandwidth with respect to at least one of FD encoder tools which are not in the subgroup,the encoder apparatus further comprising a noise estimator configured to estimate a noise level using
  • 18. A decoder apparatus comprising a plurality of FD decoder tools for decoding an information signal encoded in a bitstream, wherein: the plurality of FD decoder tools are divided among: a subgroup comprising at least one FD decoder tool, the subgroup comprising a temporal noise shape (TNS) decoder; andremaining FD decoder tools comprising at least one FD decoder tool which includes a spectral noise shaping (SNS) tool, and an MDCT or MDST shaping tool, downstream to the TNS decoder,wherein the decoder apparatus is configured to control a bandwidth of the at least one FD decoder tool in the subgroup between a first bandwidth common to the remaining FD decoder tools and a second bandwidth different from the first bandwidth, wherein the first bandwidth is a full bandwidth or a bandwidth broader than the second bandwidth.
  • 19. The decoder apparatus of claim 18, further comprising a bandwidth controller configured to: choose the bandwidth of the at least one FD decoder tool in the subgroup on a basis of bandwidth information in the bitstream.
  • 20. The decoder apparatus of claim 18, wherein: the subgroup comprises a decoder noise filling tool.
  • 21. The decoder apparatus of claim 18, wherein: the at least one of the remaining FD decoder tools comprises an MDCT or MDST shaping tool or another shaping tool based on another transformation.
  • 22. The decoder apparatus of claim 18, wherein the remaining FD decoder tools are configured to operate in open chain with respect to a chosen bandwidth.
  • 23. The decoder apparatus of claim 18, further configured to: choose a bandwidth among a set of pre-defined bandwidths.
  • 24. The decoder apparatus of claim 18, further configured to: perform a choice among at least one or a combination of: a 8 KHz, 16 KHz, 24 KHz, 32 KHz, and 48 KHz and/or NB, WB, SSWB, SWB, FB.
  • 25. A decoder apparatus comprising a plurality of FD decoder tools for decoding an information signal encoded in a bitstream, wherein: the plurality of FD decoder tools are divided: in a subgroup comprising at least one FD decoder tool; andin remaining FD decoder tools comprising at least one FD decoder tool,wherein the decoder apparatus is configured so that at least one of the plurality of FD decoder tools of the subgroup performs signal processing a different bandwidth with respect to at least one of the remaining FD decoder tools of the plurality of FD decoder tools,the decoder apparatus further comprising a noise filling tool configured to apply a noise level using indices given by
  • 26. A decoder apparatus comprising a plurality of FD decoder tools for decoding an information signal encoded in a bitstream, wherein: the plurality of FD decoder tools are divided: in a subgroup comprising at least one FD decoder tool; andin remaining FD decoder tools comprising at least one FD decoder tool,wherein the decoder apparatus is configured so that at least one of the plurality of FD decoder tools of the subgroup performs signal processing a different bandwidth with respect to at least one of the remaining FD decoder tools of the plurality of FD decoder tools,the decoder apparatus further comprising a TNS decoder configured to perform: s0(start_freq(0)−1)=s1(start_freq(0)−1)= . . . =s7(start_freq(0)−1)=0
  • 27. A method for encoding an information signal according to at least a plurality of operations in a frequency domain (FD), the method comprising: selecting a bandwidth for a subgroup of FD operations, the subgroup including a temporal noise shaping (TNS) operation;performing first signal processing operations at the selected bandwidth for the subgroup of FD operations;upstream, performing second signal processing operations at a different bandwidth for FD operations which are not in the subgroup, the FD operations which are not in the subgroup including at least one of a linear predictive coding (LPC) based spectral shaping operation and a spectral noise shaping (SNS) operation, wherein the different bandwidth is a full bandwidth or a bandwidth broader than the selected bandwidth.
  • 28. A method for decoding a bitstream with an information signal and control data, the method comprising a plurality of signal processing operations in a frequency domain (FD), the method comprising: choosing a bandwidth selection for a subgroup of FD operations on a basis of the control data;performing first signal processing operations at a selected bandwidth for the subgroup of FD operations;downstream, performing second signal processing operations at a different bandwidth for FD operations which are not in the subgroup,wherein FD operations of the subgroup are at a bandwidth which is a full bandwidth or a broader bandwidth than the selected bandwidth.
Priority Claims (1)
Number Date Country Kind
17201082 Nov 2017 EP regional
CROSS-REFERENCES TO RELATED APPLICATIONS

This application is a continuation of copending International Application No. PCT/EP2018/080335, filed Nov. 6, 2018, which is incorporated herein by reference in its entirety, and additionally claims priority from European Application No. EP 17201082.9, filed Nov. 10, 2017, which is incorporated herein by reference in its entirety.

US Referenced Citations (158)
Number Name Date Kind
4972484 Link et al. Nov 1990 A
5012517 Chhatwal et al. Apr 1991 A
5581653 Todd Dec 1996 A
5651091 Chen et al. Jul 1997 A
5781888 Herre Jul 1998 A
5812971 Herre Sep 1998 A
5819209 Inoue Oct 1998 A
5909663 Iijima et al. Jun 1999 A
5999899 Robinson Dec 1999 A
6018706 Huang et al. Jan 2000 A
6148288 Park Nov 2000 A
6167093 Tsutsui et al. Dec 2000 A
6507814 Gao Jan 2003 B1
6570991 Scheirer et al. May 2003 B1
6665638 Kang et al. Dec 2003 B1
6735561 Johnston et al. May 2004 B1
7009533 Wegener Mar 2006 B1
7302396 Cooke Nov 2007 B1
7353168 Chen et al. Apr 2008 B2
7395209 Dokic et al. Jul 2008 B1
7539612 Chen et al. May 2009 B2
7546240 Wei-Ge et al. Jun 2009 B2
8015000 Chen et al. Sep 2011 B2
8095359 Boehm et al. Jan 2012 B2
8280538 Kim et al. Oct 2012 B2
8473301 Chen et al. Jun 2013 B2
8543389 Ragot et al. Sep 2013 B2
8554549 Oshikiri et al. Oct 2013 B2
8612240 Fuchs et al. Dec 2013 B2
8682681 Fuchs et al. Mar 2014 B2
8738385 Chen May 2014 B2
8751246 Bayer et al. Jun 2014 B2
8847795 Faure et al. Sep 2014 B2
8891775 Mundt et al. Nov 2014 B2
8898068 Fuchs et al. Nov 2014 B2
9026451 Kleijn et al. May 2015 B1
9123350 Zhao et al. Sep 2015 B2
9489961 Kovesi et al. Nov 2016 B2
9595262 Fuchs et al. Mar 2017 B2
10296959 Chernikhova et al. May 2019 B1
10726854 Ghido et al. Jul 2020 B2
20010026327 Schreiber et al. Oct 2001 A1
20030088408 Thyssen et al. May 2003 A1
20030101050 Vladimir et al. May 2003 A1
20040158462 Rutledge et al. Aug 2004 A1
20040162866 Malvar et al. Aug 2004 A1
20050010395 Chiu et al. Jan 2005 A1
20050015249 Wei-Ge et al. Jan 2005 A1
20050192799 Kim et al. Sep 2005 A1
20050246178 Fejzo Nov 2005 A1
20060288851 Naoki et al. Dec 2006 A1
20070033056 Groeschl et al. Feb 2007 A1
20070078646 Lei et al. Apr 2007 A1
20070118361 Sinha et al. May 2007 A1
20070118369 Chen May 2007 A1
20070124136 Den Brinker et al. May 2007 A1
20070127729 Breebaart et al. Jun 2007 A1
20070129940 Geyersberger et al. Jun 2007 A1
20070154031 Carlos et al. Jul 2007 A1
20070276656 Solbach et al. Nov 2007 A1
20080033718 Zopf et al. Feb 2008 A1
20080091418 Laaksonen et al. Apr 2008 A1
20080126086 Kandhadai et al. May 2008 A1
20080126096 Ki-Hyun et al. May 2008 A1
20090076805 Zhengzhong et al. Mar 2009 A1
20090076830 Taleb Mar 2009 A1
20090089050 Mo et al. Apr 2009 A1
20090138267 Davidson et al. May 2009 A1
20090248424 Koishida et al. Oct 2009 A1
20090254352 Zhao Oct 2009 A1
20100010810 Morii Jan 2010 A1
20100070270 Gao Mar 2010 A1
20100094637 Vinton Apr 2010 A1
20100115370 Sakari et al. May 2010 A1
20100198588 Masataka et al. Aug 2010 A1
20100223061 Ojanpera Sep 2010 A1
20100312552 Kandhadai et al. Dec 2010 A1
20100312553 Fang et al. Dec 2010 A1
20100324912 Mi et al. Dec 2010 A1
20110015768 Soo et al. Jan 2011 A1
20110022924 Malenovsky et al. Jan 2011 A1
20110035212 Briand et al. Feb 2011 A1
20110060597 Wei-Ge et al. Mar 2011 A1
20110071839 Budnikov et al. Mar 2011 A1
20110095920 Ashley et al. Apr 2011 A1
20110096830 Ashley et al. Apr 2011 A1
20110116542 Marc et al. May 2011 A1
20110125505 Philleppe et al. May 2011 A1
20110145003 Bessette Jun 2011 A1
20110196673 Jin et al. Aug 2011 A1
20110200198 Stefan et al. Aug 2011 A1
20110238425 Jeremie et al. Sep 2011 A1
20110238426 Borsum et al. Sep 2011 A1
20120010879 Kei et al. Jan 2012 A1
20120022881 Geiger et al. Jan 2012 A1
20120072209 Krishnan et al. Mar 2012 A1
20120109659 Guoming et al. May 2012 A1
20120214544 Rodriguez et al. Aug 2012 A1
20120245947 Neuendorf et al. Sep 2012 A1
20120265540 Fuchs et al. Oct 2012 A1
20120265541 Geiger et al. Oct 2012 A1
20130030819 Pontus et al. Jan 2013 A1
20130096912 Resch et al. Apr 2013 A1
20130226594 Fuchs et al. Aug 2013 A1
20130282369 Sang-Ut et al. Oct 2013 A1
20140052439 Tejaswi et al. Feb 2014 A1
20140067404 Baumgarte Mar 2014 A1
20140074486 Martin et al. Mar 2014 A1
20140108020 Yang et al. Apr 2014 A1
20140142957 Nam-Suk et al. May 2014 A1
20140172141 Mangold Jun 2014 A1
20140223029 Bhaskar et al. Aug 2014 A1
20140358531 Vos Dec 2014 A1
20150010155 Yue et al. Jan 2015 A1
20150081312 Fuchs et al. Mar 2015 A1
20150142452 Nam-Suk et al. May 2015 A1
20150154969 Craven et al. Jun 2015 A1
20150162011 Zexin et al. Jun 2015 A1
20150170668 Kovesi et al. Jun 2015 A1
20150221311 Jeon et al. Aug 2015 A1
20150228287 Bruhn et al. Aug 2015 A1
20150255079 Huang et al. Sep 2015 A1
20150302859 Aguilar et al. Oct 2015 A1
20150302861 Salami et al. Oct 2015 A1
20150325246 Philip et al. Nov 2015 A1
20150371647 Faure et al. Dec 2015 A1
20160019898 Schreiner et al. Jan 2016 A1
20160027450 Gao Jan 2016 A1
20160078878 Ravelli et al. Mar 2016 A1
20160111094 Martin et al. Apr 2016 A1
20160163326 Resch et al. Jun 2016 A1
20160189721 Johnston et al. Jun 2016 A1
20160225384 Kristofer et al. Aug 2016 A1
20160285718 Bruhn Sep 2016 A1
20160293174 Atti Oct 2016 A1
20160293175 Atti et al. Oct 2016 A1
20160307576 Stefan et al. Oct 2016 A1
20160365097 Guan et al. Dec 2016 A1
20160372125 Atti et al. Dec 2016 A1
20160372126 Atti et al. Dec 2016 A1
20160379649 Lecomte et al. Dec 2016 A1
20160379655 Truman et al. Dec 2016 A1
20170011747 Faure et al. Jan 2017 A1
20170053658 Atti et al. Feb 2017 A1
20170078794 Bongiovi et al. Mar 2017 A1
20170103769 Laaksonen Apr 2017 A1
20170110135 Disch et al. Apr 2017 A1
20170133029 Markovic et al. May 2017 A1
20170140769 Ravelli et al. May 2017 A1
20170154631 Bayer et al. Jun 2017 A1
20170154635 Doehla et al. Jun 2017 A1
20170221495 Sung Aug 2017 A1
20170236521 Venkatraman et al. Aug 2017 A1
20170249387 Hatami-Hanza Aug 2017 A1
20170256266 Sung Sep 2017 A1
20170294196 Bradley et al. Oct 2017 A1
20170303114 Johansson Oct 2017 A1
20190027156 Sung Jan 2019 A1
Foreign Referenced Citations (95)
Number Date Country
101140759 Mar 2008 CN
102779526 Nov 2012 CN
107103908 Aug 2017 CN
0716787 Jun 1996 EP
0732687 Sep 1996 EP
1791115 May 2007 EP
2676266 Dec 2013 EP
2980796 Feb 2016 EP
2980799 Feb 2016 EP
3111624 Jan 2017 EP
2944664 Oct 2010 FR
H05-281996 Oct 1993 JP
H07-28499 Jan 1995 JP
H0811644 Jan 1996 JP
H9-204197 Aug 1997 JP
H10-51313 Feb 1998 JP
H1091194 Apr 1998 JP
H11-330977 Nov 1999 JP
2004-138756 May 2004 JP
2006-527864 Dec 2006 JP
2007519014 Jul 2007 JP
2007-525718 Sep 2007 JP
2009-003387 Jan 2009 JP
2009-008836 Jan 2009 JP
2009-538460 Nov 2009 JP
2010-500631 Jan 2010 JP
2010-501955 Jan 2010 JP
2012-533094 Dec 2012 JP
2016-523380 Aug 2016 JP
2016-200750 Dec 2016 JP
2017-522604 Aug 2017 JP
2017-528752 Sep 2017 JP
100261253 Jul 2000 KR
20030031936 Apr 2003 KR
1020050007853 Jan 2005 KR
1020090077951 Jul 2009 KR
10-2010-0136890 Dec 2010 KR
20130019004 Feb 2013 KR
10-2016-0079056 Jul 2016 KR
1020160144978 Dec 2016 KR
20170000933 Jan 2017 KR
2337414 Oct 2008 RU
2376657 Dec 2009 RU
2413312 Feb 2011 RU
2419891 May 2011 RU
2439718 Jan 2012 RU
2483365 May 2013 RU
2520402 Jun 2014 RU
2568381 Nov 2015 RU
2596594 Sep 2016 RU
2596596 Sep 2016 RU
2015136540 Mar 2017 RU
2628162 Aug 2017 RU
2016105619 Aug 2017 RU
200809770 Feb 2008 TW
201005730 Feb 2010 TW
201126510 Aug 2011 TW
201131550 Sep 2011 TW
201207839 Feb 2012 TW
201243832 Nov 2012 TW
201612896 Apr 2016 TW
201618080 May 2016 TW
201618086 May 2016 TW
201642246 Dec 2016 TW
201642247 Dec 2016 TW
201705126 Feb 2017 TW
201711021 Mar 2017 TW
201713061 Apr 2017 TW
201724085 Jul 2017 TW
201732779 Sep 2017 TW
9916050 Apr 1999 WO
2004072951 Aug 2004 WO
2005086138 Sep 2005 WO
2005086139 Sep 2005 WO
2007073604 Jul 2007 WO
2007138511 Dec 2007 WO
2008025918 Mar 2008 WO
2008046505 Apr 2008 WO
2009066869 May 2009 WO
2011048118 Apr 2011 WO
2011086066 Jul 2011 WO
2011086067 Jul 2011 WO
2012000882 Jan 2012 WO
2012000882 Jan 2012 WO
2012126893 Sep 2012 WO
2014165668 Oct 2014 WO
2014202535 Dec 2014 WO
2014202535 Dec 2014 WO
2015063045 May 2015 WO
2015063227 May 2015 WO
2015071173 May 2015 WO
2015174911 Nov 2015 WO
2016016121 Feb 2016 WO
2016142002 Sep 2016 WO
2016142337 Sep 2016 WO
Non-Patent Literature Citations (85)
Entry
Dietz, Martin, et al. “Overview of the EVS codec architecture.” 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2015. (Year: 2015).
Tetsuyuki Okumachi, “Office Action for JP Application 2020-118837”, dated Jul. 16, 2021, JPG, Japan.
Tetsuyuki Okumachi, “Office Action for JP Application 2020-118838”, dated Jul. 16, 2021, JPO, Japan.
John Tan, “Office Action for SG Application 11202004173P”, dated Jul. 23, 2021, IPOS, Singapore.
Guojun Lu et al., “A Technique towards Automatic Audio Classification and Retrieval, Forth International Conference on Signal Processing”, 1998, IEEE, Oct. 12, 1998, pp. 1142 to 1145.
Hiroshi Ono, “Office Action for JP Application No. 2020-526135”, dated May 21, 2021, JPO Japan.
“Decision on Grant Patent for Invention for RU Application No. 2020118949”, dated Nov. 11, 2020, Rospatent, Russia.
Takeshi Yamashita, “Office Action for JP Application 2020-524877”, dated Jun. 24, 2021, JPO, Japan.
P.A. Volkov, “Office Action for RU Application No. 2020120251”, dated Oct. 28, 2020, Rospatent, Russia.
P.A. Volkov, “Office Action for RU Application No. 2020120256”, dated Oct. 28, 2020, Rospatent, Russia.
D.V.Travnikov, “Decision on Grant for RU Application No. 2020118969”, dated Nov. 2, 2020, Rospatent, Russia.
Lakshmi Narayana Chinta, “Office Action for IN Application No. 202037018098”, dated Jul. 13, 2021, Intellectual Property India, India.
ETSI TS 126 445 V13.2.0 (Aug. 2016), Universal Mobile Telecommunications System (UMTS); LTE; Codec for Enhanced Voice Services (EVS); Detailed algorithmic description (3GPP TS 26.445 version 13.2.0 Release 13) [Online]. Available: http://www.3gpp.org/ftp/Specs/archive/26_series/26.445/26445-d00.zip.
Geiger, “Audio Coding based on integer transform”, Ilmenau: https://www.db-thueringen.de/receive/dbt_mods_00010054, 2004.
Henrique S Malvar, “Biorthogonal and Nonuniform Lapped Transforms for Transform Coding with Reduced Blocking and Ringing Artifacts”, IEEE Transactions on Signal Processing, IEEE Service Center, New York, NY, US, (Apr. 1998), vol. 46, No. 4, ISSN 1053-587X, XP011058114.
Anonymous, "ISO/IEC 14496-3:2005/FDAM 9, AAC-ELD", 82nd MPEG Meeting, Oct. 22-26, 2007, Shenzhen (Motion Picture Expert Group or ISO/IEC JTC1/SC29/WG11), (Feb. 21, 2008), No. N9499, XP030015994.
Virette, “Low Delay Transform for High Quality Low Delay Audio Coding”, Université de Rennes 1, (Dec. 10, 2012), pp. 1-195, URL: https://hal.inria.fr/tel-01205574/document, (Mar. 30, 2016), XP055261425.
ISO/IEC 14496-3:2001; Information technology—Coding of audio-visual objects—Part 3: Audio.
3GPP TS 26.403 v14.0.0 (Mar. 2017); General audio codec audio processing functions; Enhanced aacPlus general audio codec; Encoder specification; Advanced Audio Coding (AAC) part; (Release 14).
ISO/IEC 23003-3; Information technology—MPEG audio technologies—Part 3: Unified speech and audio coding, 2011.
3GPP TS 26.445 V14.1.0 (Jun. 2017), 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Codec for Enhanced Voice Services (EVS); Detailed Algorithmic Description (Release 14), http://www.3gpp.org/ftp//Specs/archive/26_series/26.445/26445-e10.zip, Section 5.1.6 “Bandwidth detection”.
Eksler Vaclav et al, “Audio bandwidth detection in the EVS codec”, 2015 IEEE Global Conference on Signal and Information Processing (GLOBALSIP), IEEE, (Dec. 14, 2015), doi:10.1109/GLOBALSIP.2015.7418243, pp. 488-492, XP032871707.
Oger M et al, "Transform Audio Coding with Arithmetic-Coded Scalar Quantization and Model-Based Bit Allocation", International Conference on Acoustics, Speech, and Signal Processing, IEEE, Apr. 15, 2007, p. IV-545, XP002464925.
Asad et al., “An enhanced least significant bit modification technique for audio steganography”, International Conference on Computer Networks and Information Technology, Jul. 11-13, 2011.
Makandar et al, “Least Significant Bit Coding Analysis for Audio Steganography”, Journal of Future Generation Computing, vol. 2, No. 3, Mar. 2018.
ISO/IEC 23008-3:2015; Information technology—High efficiency coding and media delivery in heterogeneous environments—Part 3: 3D audio.
ITU-T G.718 (Jun. 2008): Series G: Transmission Systems and Media, Digital Systems and Networks, Digital terminal equipments—Coding of voice and audio signals, Frame error robust narrow-band and wideband embedded variable bit-rate coding of speech and audio from 8-32 kbit/s.
3GPP TS 26.447 V14.1.0 (Jun. 2017), Technical Specification, 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Codec for Enhanced Voice Services (EVS); Error Concealment of Lost Packets (Release 14).
DVB Organization, "ISO-IEC 23008-3_A3_(E)_(H 3DA FDAM3).docx", DVB, Digital Video Broadcasting, c/o EBU, 17A Ancienne Route, CH-1218 Grand Saconnex, Geneva, Switzerland, (Jun. 13, 2016), XP017851888.
Hill et al., “Exponential stability of time-varying linear systems,” IMA J Numer Anal, pp. 865-885, 2011.
3GPP TS 26.090 V14.0.0 (Mar. 2017), 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Mandatory Speech Codec speech processing functions; Adaptive Multi-Rate (AMR) speech codec; Transcoding functions (Release 14).
3GPP TS 26.190 V14.0.0 (Mar. 2017), Technical Specification, 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Speech codec speech processing functions; Adaptive Multi-Rate—Wideband (AMR-WB) speech codec; Transcoding functions (Release 14).
3GPP TS 26.290 V14.0.0 (Mar. 2017), Technical Specification, 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Audio codec processing functions; Extended Adaptive Multi-Rate—Wideband (AMR-WB+) codec; Transcoding functions (Release 14).
Edler et al., “Perceptual Audio Coding Using a Time-Varying Linear Pre- and Post-Filter,” in AES 109th Convention, Los Angeles, 2000.
Gray et al., “Digital lattice and ladder filter synthesis,” IEEE Transactions on Audio and Electroacoustics, vol. vol. 21, No. No. 6, pp. 491-500, 1973.
Lamoureux et al., “Stability of time variant filters,” CREWES Research Report—vol. 19, 2007.
Herre et al., “Enhancing the performance of perceptual audio coders by using temporal noise shaping (TNS).” Audio Engineering Society Convention 101. Audio Engineering Society, 1996.
Herre et al., “Continuously signal-adaptive filterbank for high-quality perceptual audio coding.” Applications of Signal Processing to Audio and Acoustics, 1997. 1997 IEEE ASSP Workshop on. IEEE, 1997.
Herre, “Temporal noise shaping, quantization and coding methods in perceptual audio coding: A tutorial introduction.” Audio Engineering Society Conference: 17th International Conference: High-Quality Audio Coding. Audio Engineering Society, 1999.
Fuchs Guillaume et al, “Low delay LPC and MDCT-based audio coding in the EVS codec”, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, (Apr. 19, 2015), doi: 10.1109/ICASSP.2015.7179068, pp. 5723-5727, XP033187858.
Niamut et al, "RD Optimal Temporal Noise Shaping for Transform Audio Coding", Acoustics, Speech and Signal Processing, 2006, ICASSP 2006 Proceedings, 2006 IEEE International Conference, Toulouse, France, May 14-19, 2006, IEEE, Piscataway, NJ, USA, (Jan. 1, 2006), doi:10.1109/ICASSP.2006.1661244, ISBN 978-1-4244-0469-8, pp. V-V, XP031015996.
ITU-T G.711 (Sep. 1999): Series G: Transmission Systems and Media, Digital Systems and Networks, Digital transmission systems—Terminal equipments—Coding of analogue signals by pulse code modulation, Pulse code modulation (PCM) of voice frequencies, Appendix I: A high quality low-complexity algorithm for packet loss concealment with G.711.
Cheveigne et al., "YIN, a fundamental frequency estimator for speech and music." The Journal of the Acoustical Society of America 111.4 (2002): 1917-1930.
Ojala P et al, “A novel pitch-lag search method using adaptive weighting and median filtering”, Speech Coding Proceedings, 1999 IEEE Workshop on Porvoo, Finland Jun. 20-23, 1999, Piscataway, NJ, USA, IEEE, US, (Jun. 20, 1999), doi:10.1109/SCFT.1999.781502, ISBN 978-0-7803-5651-1, pp. 114-116, XP010345546.
"5 Functional description of the encoder", Dec. 10, 2014, 3GPP Standard 26445-C10_1_S05_S0501, 3rd Generation Partnership Project (3GPP), Mobile Competence Centre, 650 Route des Lucioles, F-06921 Sophia-Antipolis Cedex, France. Retrieved from the Internet: URL http://www.3gpp.org/ftp/Specs/2014-12/Rel-12/26_series/ XP050907035.
Hiroshi Ono, “Office Action for JP Application No. 2020-526081”, dated Jun. 22, 2021, JPO, Japan.
Hiroshi Ono, “Office Action for JP Application No. 2020-526084”, dated Jun. 23, 2021, JPO, Japan.
Tomonori Kikuchi, "Office Action for JP Application No. 2020-524874", dated Jun. 2, 2021, JPO, Japan.
O.E. Groshev, “Office Action for RU Application No. 2020118947”, dated Dec. 1, 2020, Rospatent, Russia.
O.I. Starukhina, “Office Action for RU Application No. 2020118968”, dated Dec. 23, 2020, Rospatent, Russia.
Sujoy Sarkar, “Examination Report for IN Application No. 202037018091”, dated Jun. 1, 2021, Intellectual Property India, India.
Miao Xiaohong, “Examination Report for SG Application No. 11202004228V”, dated Sep. 2, 2021, IPOS, Singapore.
Miao Xiaohong, “Search Report for SG Application No. 11202004228V”, dated Sep. 3, 2021, IPOS, Singapore.
Nam Sook Lee, “Office Action for KR Application No. 10-2020-7015512”, dated Sep. 9, 2021, KIPO, Republic of Korea.
Santosh Mehtry, “Office Action for IN Application No. 202037019203”, dated Mar. 19, 2021, Intellectual Property India, India.
Khalid Sayood, "Introduction to Data Compression", Elsevier Science & Technology, 2005, Section 16.4, Figure 16.13, p. 526.
Patterson et al., “Computer Organization and Design”, The hardware/software Interface, Revised Fourth Edition, Elsevier, 2012.
Nam Sook Lee, "Office Action for KR Application No. 10-2020-7016424", dated Feb. 9, 2022, KIPO, Republic of Korea.
Nam Sook Lee, "Office Action for KR Application No. 10-2020-7016503", dated Feb. 9, 2022, KIPO, Republic of Korea.
International Telecommunication Union, "G.729-based embedded variable bit-rate coder: An 8-32 kbit/s scalable wideband coder bitstream interoperable with G.729", ITU-T Recommendation G.729.1, May 2006.
3GPP TS 26.445, "Universal Mobile Telecommunications System (UMTS); LTE; Codec for Enhanced Voice Services (EVS); Detailed algorithmic description (3GPP TS 26.445 version 13.4.1 Release 13)", ETSI TS 126 445 V13.4.1, Apr. 2017.
Nam Sook Lee, “Office Action for KR Application No. 10-2020-7016100”, dated Jan. 13, 2022, KIPO, Republic of Korea.
Nam Sook Lee, “Office Action for KR Application No. 10-2020-7016224”, dated Jan. 13, 2022, KIPO, Republic of Korea.
Nam Sook Lee, “Office Action for KR Application No. 10-2020-7015835”, dated Jan. 13, 2022, KIPO, Republic of Korea.
Kazunori Mochimura, “Decision to Grant a Patent for JP application No. 2020-524579”, dated Nov. 29, 2021, JPO, Japan.
ETSI TS 126 445 V12.0.0, “Universal Mobile Telecommunications System (UMTS); LTE; EVS Codec Detailed Algorithmic Description (3GPP TS 26.445 version 12.0.0 Release 12)”, Nov. 2014.
ETSI TS 126 403 V6.0.0, "Universal Mobile Telecommunications System (UMTS); General audio codec audio processing functions; Enhanced aacPlus general audio codec; Encoder specification; Advanced Audio Coding (AAC) part (3GPP TS 26.403 version 6.0.0 Release 6)", Sep. 2004.
ETSI TS 126 401 V6.2.0, “Universal Mobile Telecommunications System (UMTS); General audio codec audio processing functions; Enhanced aacPlus general audio codec; General description (3GPP TS 26.401 version 6.2.0 Release 6)”, Mar. 2005.
3GPP TS 26.405, “3rd Generation Partnership Project; Technical Specification Group Services and System Aspects General audio codec audio processing functions; Enhanced aacPlus general audio codec; Encoder specification parametric stereo part (Release 6)”, Sep. 2004.
3GPP TS 26.447 V12.0.0, “3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Codec for Enhanced Voice Services (EVS); Error Concealment of Lost Packets (Release 12)”, Sep. 2014.
ISO/IEC FDIS 23003-3:2011(E), "Information technology—MPEG audio technologies—Part 3: Unified speech and audio coding", ISO/IEC JTC 1/SC 29/WG 11, Sep. 20, 2011.
Valin et al., “Definition of the Opus Audio Codec”, Internet Engineering Task Force (IETF) RFC 6716, Sep. 2012.
Nam Sook Lee, “Decision to Grant a Patent for KR Application No. 10-2020-7015511”, dated Apr. 19, 2022, KIPO, Republic of Korea.
Nam Sook Lee, “Decision to Grant a Patent for KR Application No. 10-2020-7016100”, dated Apr. 21, 2022, KIPO, Republic of Korea.
Nam Sook Lee, “Decision to Grant a Patent for KR Application No. 10-2020-7015836”, dated Apr. 28, 2022, KIPO, Republic of Korea.
Nam Sook Lee, “Decision to Grant a Patent for KR Application No. 10-2020-7015512”, dated Apr. 20, 2022, KIPO, Republic of Korea.
Nam Sook Lee, “Decision to Grant a Patent for KR Application No. 10-2020-7015835”, dated Apr. 22, 2022, KIPO, Republic of Korea.
Xiong-Malvar, “A Nonuniform Modulated Complex Lapped Transform”, IEEE Signal Processing Letters, vol. 8, No. 9, Sep. 2001. (Year: 2001).
Raj et al., “An Overview of MDCT for Time Domain Aliasing Cancellation”, 2014 International Conference on Communication and Network Technologies (ICCNT). (Year: 2014).
Malvar, “Biorthogonal and Nonuniform Lapped Transforms for Transform Coding with Reduced Blocking and Ringing Artifacts”, IEEE Transactions on Signal Processing, vol. 46, No. 4, Apr. 1998. (Year: 1998).
Malvar, “Lapped Transforms for Efficient Transform/Subband Coding”, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, No. 6, Jun. 1990. (Year: 1990).
Malvar, “Fast Algorithms for Orthogonal and Biorthogonal Modulated Lapped Transforms”, Microsoft Research, 1998. (Year: 1998).
Princen-Bradley, “Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation”, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-34, No. 5, Oct. 1986. (Year: 1986).
Shlien, “The Modulated Lapped Transform, Its Time-Varying Forms, and Its Applications to Audio Coding Standards”, IEEE Transactions on Speech and Audio Processing, vol. 5, No. 4, Jul. 1997. (Year: 1997).
Nam Sook Lee, “Decision to Grant a Patent for KR Application No. 10-2020-7016224”, dated Jul. 25, 2022, KIPO, Republic of Korea.
Related Publications (1)
Number Date Country
20200265852 A1 Aug 2020 US
Continuations (1)
Number Date Country
Parent PCT/EP2018/080335 Nov 2018 US
Child 16866280 US