The present invention relates to audio signal processing and, in particular, to a bandwidth extension encoder, a method for encoding an audio signal, a bandwidth extension decoder, a method for decoding an encoded audio signal, a phase vocoder and an audio signal.
Moreover, embodiments of the present invention relate to an application of a phase vocoder for pure time stretching, independent of a bandwidth extension.
Storage or transmission of audio signals is often subject to strict bit rate constraints. These constraints are usually accounted for by the use of encoders/decoders (“codecs”) that efficiently compress the audio signal in terms of the information rate needed to store or transmit the signal. In the past, coders were forced to drastically reduce the audio bandwidth when only a very low bit rate was available. Modern audio codecs are able to code wide-band signals by using bandwidth extension (BWE) methods, as described in M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, May 2002; S. Meltzer, R. Böhm and F. Henn, “SBR enhanced audio codecs for digital broadcasting such as “Digital Radio Mondiale” (DRM),” in 112th AES Convention, Munich, May 2002; T. Ziegler, A. Ehret, P. Ekstrand and M. Lutzky, “Enhancing mp3 with SBR: Features and Capabilities of the new mp3PRO Algorithm,” in 112th AES Convention, Munich, May 2002; International Standard ISO/IEC 14496-3:2001/FPDAM 1, “Bandwidth Extension,” ISO/IEC, 2002; “Speech bandwidth extension method and apparatus”, Vasu Iyengar et al. U.S. Pat. No. 5,455,888; E. Larsen, R. M. Aarts, and M. Danessis. Efficient high-frequency bandwidth extension of music and speech. In AES 112th Convention, Munich, Germany, May 2002; R. M. Aarts, E. Larsen, and O. Ouweltjes. A unified approach to low- and high frequency bandwidth extension. In AES 115th Convention, New York, USA, October 2003; K. Käyhkö. A Robust Wideband Enhancement for Narrowband Speech Signal. Research Report, Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing, 2001; E. Larsen and R. M. Aarts. Audio Bandwidth Extension—Application to psychoacoustics, Signal Processing and Loudspeaker Design. John Wiley & Sons, Ltd, 2004; J. Makhoul. Spectral Analysis of Speech by Linear Prediction. IEEE Transactions on Audio and Electroacoustics, AU-21(3), June 1973; U.S. patent application Ser. No. 08/951,029, Ohmori, et al. Audio band width extending system and method; U.S. Pat. No. 6,895,375, Malah, D & Cox, R. V.: System for bandwidth extension of Narrow-band speech and Frederik Nagel, Sascha Disch, “A harmonic bandwidth extension method for audio codecs,” ICASSP International Conference on Acoustics, Speech and Signal Processing, IEEE CNF, Taipei, Taiwan, April 2009.
These algorithms rely on a parametric representation of the high-frequency content (HF). This representation is generated from the low-frequency part (LF) of the decoded signal by means of transposition into the HF spectral region (“patching”) and application of a parameter driven post processing.
In the art, methods of bandwidth extension such as spectral band replication (SBR) or harmonic bandwidth extension (HBE) are known. In the following, these two BWE methods are briefly described.
On the one hand, spectral band replication (SBR), as described in M. Dietz, L. Liljeryd, K. Kjörling and O. Kunz, “Spectral Band Replication, a novel approach in audio coding,” in 112th AES Convention, Munich, May 2002, uses a quadrature mirror filterbank (QMF) for generating the HF information. Applying a so-called “patching” algorithm, lower QMF band signals are copied into higher QMF bands, leading to a replication of the information of the LF part in the HF part. Subsequently, the generated HF part is adapted to closely match the original HF part with the help of parameters that adjust the spectral envelope and the tonality.
On the other hand, harmonic bandwidth extension (HBE) is an alternative bandwidth extension scheme based on phase vocoders. HBE enables a harmonic continuation of the spectrum as opposed to SBR, which relies on a non-harmonic spectral shift. It may be utilized to replace or amend the SBR patching algorithm.
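The difference between the non-harmonic spectral shift of SBR and the harmonic continuation of HBE can be illustrated with a small numeric sketch. It assumes an idealized harmonic tone with a 440 Hz fundamental, a hypothetical shift of 4000 Hz for the copy-up patch and a transposition factor of 2 for the harmonic patch; the function names and all numeric values are illustrative assumptions and are not taken from any codec.

```python
def sbr_shift(partials_hz, shift_hz):
    """Copy-up style patch: every LF partial is shifted by a constant offset."""
    return [f + shift_hz for f in partials_hz]


def hbe_transpose(partials_hz, factor):
    """Harmonic patch: every LF partial is multiplied by an integer factor."""
    return [f * factor for f in partials_hz]


def is_harmonic(freqs_hz, f0_hz, tol=1e-9):
    """True if every frequency is an integer multiple of the fundamental f0_hz."""
    return all(abs(f / f0_hz - round(f / f0_hz)) < tol for f in freqs_hz)


# Assumed example values (hypothetical, for illustration only).
F0 = 440.0
lf_partials = [F0 * k for k in range(1, 10)]      # partials up to 3960 Hz

shifted = sbr_shift(lf_partials, 4000.0)          # e.g. 4440 Hz, 4880 Hz, ...
transposed = hbe_transpose(lf_partials, 2)        # e.g. 880 Hz, 1760 Hz, ...

print(is_harmonic(shifted, F0))
print(is_harmonic(transposed, F0))
```

In this toy case the shifted partials no longer lie on the harmonic grid of the fundamental, whereas the transposed partials do, which is the property the harmonic bandwidth extension aims to preserve.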
U.S. Provisional Patent Application with the application No. 61/079,841 discloses a BWE method, which may choose between alternative patching algorithms that operate either in frequency domain or in time domain. In the time-frequency transform by the filterbank, a certain predetermined analysis window is applied. Moreover, classic phase vocoder implementations according to the state-of-the-art use one predefined window shape such as a raised-cosine window or a Bartlett window.
However, choosing one predetermined analysis window for vocoder applications encompasses a trade-off to be made by the application designer in terms of overall perceptual audio quality achieved for different classes of audio signals. Thus, although the mean audio quality can be optimized by the initial choice of a certain window, the audio quality for each individual class of signals remains sub-optimal.
Moreover, it was found that certain signals benefit from using specialized analysis windows for a phase vocoder, which may especially be used for temporally spreading the audio signal without modifying the pitch of the same.
Therefore, a concept for selecting the optimal analysis window, for example within a BWE scheme, is needed. However, measures against the just-mentioned degradation of the perceptual audio quality should advantageously not result in a significantly increased computational complexity of the employed codecs.
According to an embodiment, a bandwidth extension encoder for encoding an audio signal, the audio signal having a low frequency signal having a core frequency band and a high frequency signal having an upper frequency band, may have a signal analyzer for analyzing the audio signal, the audio signal having a block of audio samples, the block having a specified length in time, wherein the signal analyzer is configured for determining, from a plurality of analysis windows, an analysis window to be used for performing a bandwidth extension in a bandwidth extension decoder; a core encoder for encoding the low frequency signal to obtain an encoded low frequency signal; and a parameter calculator for calculating bandwidth extension parameters from the high frequency signal.
According to another embodiment, a bandwidth extension decoder for decoding an encoded audio signal, the encoded audio signal having an encoded low frequency signal and upper band parameters, may have a core decoder for decoding the encoded low frequency signal, wherein the decoded low frequency signal has a core frequency band; a patch module which is configured to generate a patched signal based on the decoded low frequency signal and the upper band parameters, wherein the patched signal has an upper frequency band generated from the core frequency band; and a combiner which is configured to combine the patched signal and the decoded low frequency signal to acquire a combined output signal.
According to another embodiment, a phase vocoder processor for processing an audio signal may have an analysis windower for applying a plurality of analysis window functions to the audio signal or a signal derived from the audio signal, the audio signal having a block of audio samples, the block having a specified length in time, to acquire a plurality of windowed audio signals; a time/spectrum converter for converting the windowed audio signals into spectra; a frequency domain processor for processing the spectra in a frequency domain to acquire modified spectra; a frequency/time converter for converting the modified spectra into modified time domain signals; a synthesis windower for applying a plurality of synthesis window functions to the modified time domain signals, wherein the synthesis window functions are matched to the analysis window functions, to acquire windowed modified time domain signals; a comparator which is configured to determine a plurality of comparison parameters based on a comparison of the plurality of windowed modified time domain signals and the audio signal or a signal derived from the audio signal, wherein the plurality of comparison parameters corresponds to the plurality of analysis window functions, and wherein the comparator is furthermore configured to select an analysis window function and a synthesis window function for which a comparison parameter satisfies a predetermined condition; and an overlap adder for adding overlapping blocks of a windowed modified time domain signal to acquire a temporally spread signal, wherein the overlap adder is configured for processing blocks of the windowed modified time domain signal having been modified by an analysis window function and a synthesis window function selected by the comparator.
According to another embodiment, a method for encoding an audio signal, the audio signal having a low frequency signal having a core frequency band and a high frequency signal having an upper frequency band, may have the steps of analyzing the audio signal, the audio signal having a block of audio samples, the block having a specified length in time, for determining, from a plurality of analysis windows, an analysis window to be used for performing a bandwidth extension in a bandwidth extension decoder; encoding the low frequency signal to acquire an encoded low frequency signal; and calculating bandwidth extension parameters from the high frequency signal.
According to another embodiment, a method for decoding an encoded audio signal, the encoded audio signal having an encoded low frequency signal and upper band parameters, may have the steps of decoding the encoded low frequency signal, wherein the decoded low frequency signal has a core frequency band; generating a patched signal based on the decoded low frequency signal and the upper band parameters, wherein the patched signal has an upper frequency band generated from the core frequency band; and combining the patched signal and the decoded low frequency signal to acquire a combined output signal.
According to another embodiment, an encoded audio signal may have an encoded low frequency signal; bandwidth extension parameters; and an analysis window to be used for performing a bandwidth extension in a bandwidth extension decoder.
According to another embodiment, a computer program may have a program code for performing one of the above-mentioned methods, when the computer program is executed on a computer.
An idea underlying the present invention is that an improved perceptual quality can be achieved when the audio signal having a block of audio samples with a specified length in time is analyzed in order to determine from a plurality of analysis windows an analysis window to be used for performing a bandwidth extension in a bandwidth extension decoder. By this measure, the reduction of the audio quality resulting from the application of a predetermined analysis window may be prevented and, consequently, the perceptual audio quality may be improved with relatively low effort as compared to standard BWE methods.
According to an embodiment of the present invention, a bandwidth extension encoder for encoding an audio signal comprises a signal analyzer, a core encoder and a parameter calculator. The audio signal comprises a low frequency signal comprising a core frequency band and a high frequency signal comprising an upper frequency band. The signal analyzer is configured for analyzing the audio signal, the audio signal having a block of audio samples, the block having a specified length in time. The signal analyzer is furthermore configured for determining from a plurality of analysis windows an analysis window to be used for performing a bandwidth extension in a bandwidth extension decoder. The core encoder is configured for encoding the low frequency signal to obtain an encoded low frequency signal. The parameter calculator is configured for calculating bandwidth extension parameters from the high frequency signal.
According to another embodiment of the present invention, a bandwidth extension decoder for decoding an encoded audio signal comprises a core decoder, a patch module and a combiner. The encoded audio signal comprises an encoded low frequency signal and upper band parameters. The core decoder is configured for decoding the encoded low frequency signal, wherein the decoded low frequency signal comprises a core frequency band. The patch module is configured to generate a patched signal based on the decoded low frequency signal and the upper band parameters, wherein the patched signal comprises an upper frequency band generated from the core frequency band. The combiner is configured to combine the patched signal and the decoded low frequency signal to obtain a combined output signal.
According to another embodiment, a phase vocoder processor for processing an audio signal comprises an analysis windower, a time/spectrum converter, a frequency domain processor, a frequency/time converter, a synthesis windower, a comparator and an overlap adder. The analysis windower is configured for applying a plurality of analysis window functions to the audio signal or a signal derived from the audio signal, the audio signal having a block of audio samples, the block having a specified length in time, to obtain a plurality of windowed audio signals. The time/spectrum converter is configured for converting the windowed audio signals into spectra. The frequency domain processor is configured for processing the spectra in a frequency domain to obtain modified spectra. The frequency/time converter is configured for converting the modified spectra into modified time domain signals. The synthesis windower is configured for applying a plurality of synthesis window functions to the modified time domain signals, wherein the synthesis window functions are matched to the analysis window functions, to obtain windowed modified time domain signals. The comparator is configured to determine a plurality of comparison parameters based on a comparison of the plurality of windowed modified time domain signals and the audio signal or a signal derived from the audio signal, wherein the plurality of comparison parameters corresponds to the plurality of analysis window functions. The comparator is furthermore configured to select an analysis window function and a synthesis window function for which a comparison parameter satisfies a predetermined condition. The overlap adder is configured for adding overlapping blocks of a windowed modified time domain signal to obtain a temporally spread signal. 
The overlap adder is furthermore configured for processing blocks of the windowed modified time domain signal having been modified by an analysis window function and a synthesis window function selected by the comparator.
Embodiments of the present invention are based on the concept that a plurality of patched signals may be generated from a plurality of analysis window functions applied to the audio signal comprising the core frequency band. The plurality of patched signals may be compared with a reference signal being the original audio signal or a signal derived from the audio signal. This will result in a plurality of comparison parameters, which may be related to measures of the audio quality. Furthermore, from the plurality of analysis window functions, an analysis window function may be selected for which a comparison parameter satisfies a predetermined condition. Therefore, the use of the selected analysis window function may ensure minimal reduction of the audio quality, leading to optimal perceptual audio quality in the context of a BWE scenario.
Other embodiments of the present invention relate to a signal analyzer comprising a signal classifier, wherein the signal classifier is configured to analyze/classify the audio signal or a signal derived from the audio signal. In this case, the analysis window function to be used for performing a bandwidth extension in the bandwidth extension decoder is selected based on a signal characteristic of the analyzed/classified signal.
Therefore, embodiments provide a method of selecting the optimal analysis window for the bandwidth extension in the decoder. Control parameters may be evaluated in order to decide which analysis window is the most appropriate. To achieve this, an analysis-by-synthesis scheme may be used, i.e. a set of windows may be applied and the best one according to a suitable objective is chosen. In the advantageous mode of the invention, the objective is to ensure optimal perceptual audio quality of the restitution. In alternative modes, an objective function may be optimized. For example, the objective may be to preserve the spectral flatness of the original HF as closely as possible.
On the one hand, the window selection can be done only at the encoder by considering the original signal, the synthesized signal or both of them. A decision (window indication) is then transmitted to the decoder. On the other hand, the selection may be performed synchronously at the encoder and the decoder side considering only the core bandwidth of the decoded signal. The latter method does not need to generate additional side information, which is favorable in terms of bitrate efficiency of the codec.
The invention is advantageous in that it optimizes the perceptual quality of the vocoder output signal. Embodiments provide a signal adaptive choosing of appropriate analysis and synthesis windows for the vocoding process, wherein different time responses or frequency responses of the analysis and/or synthesis windows are possible.
Another advantage of the invention is that it enables a better trade-off between reduction of the above-mentioned degradation and the computational complexity such as within a BWE scheme.
In the following, embodiments of the present invention are explained with reference to the accompanying drawings, in which:
In the embodiment shown in
Furthermore, the signal analyzer 110 of the bandwidth extension encoder 300 comprises a comparator 340, which is configured to determine a plurality 341-2 of comparison parameters based on a comparison of the patched signals 331-1 and a reference signal being the audio signal 101-1 or a signal derived from the audio signal such as the high frequency signal 101-4 indicated by the dashed line, wherein the plurality 341-2 of comparison parameters corresponds to the plurality 111-1 of analysis window functions. The comparator 340 is furthermore configured to provide a window indication 341-1 corresponding to an analysis window function 111-2, for which a comparison parameter satisfies a predetermined condition. Finally, the bandwidth extension encoder 300 comprises an output interface 350 for providing an encoded audio signal 351, the encoded audio signal 351 comprising the window indication 341-1.
With regard to an implementation of the above comparison,
Specifically, the input signals 701-1 may correspond to the patched signals 331-1, the patched signals 331-1 having been obtained after applying the plurality 111-1 of analysis window functions to the audio signal 101-1 or a signal derived from the audio signal 101-1 such as the low frequency signal 101-2, while the reference input signal 701-2 may correspond to the original audio signal 101-1. Furthermore, the plurality 705 of comparison parameters of the comparator 700 may correspond to the plurality 341-2 of comparison parameters of the bandwidth extension encoder 300. Therefore, an analysis window function 111-2 may be selected corresponding to the selected comparison parameter in that a deviation in the SFM parameters of the patched signals 331-1 and the original audio signal 101-1, for example, will be minimal. The selected analysis window function 111-2 may also be referenced by a window indication 707, which may correspond to the window indication 341-1, provided at the output of the comparator 700 or the comparator 340, respectively. Consequently, the perceptual audio quality as measured by a spectral flatness, for example, will be changed or reduced as little as possible when the selected analysis window function 111-2 is chosen for performing a bandwidth extension such as within a bandwidth extension decoder.
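The comparison described above can be sketched as follows. This is a minimal illustration and not the comparator 700 itself: it assumes that the spectral flatness measure (SFM) is computed as the ratio of the geometric mean to the arithmetic mean of the power spectrum, that the comparison parameter is the absolute SFM deviation from the reference, and that the predetermined condition is "minimal deviation". The function names, the candidate labels and the synthetic test signals are hypothetical stand-ins for the patched signals and the reference signal.

```python
import cmath
import math


def power_spectrum(x):
    """One-sided power spectrum via a direct DFT (adequate for short blocks)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]


def spectral_flatness(x, eps=1e-12):
    """SFM = geometric mean / arithmetic mean of the power spectrum (0..1)."""
    p = [v + eps for v in power_spectrum(x)]   # eps avoids log(0)
    geometric = math.exp(sum(math.log(v) for v in p) / len(p))
    arithmetic = sum(p) / len(p)
    return geometric / arithmetic


def select_window(candidates, reference):
    """candidates maps a window label to the signal synthesized with that window.

    Returns the label with the smallest SFM deviation from the reference,
    together with all deviations (the comparison parameters).
    """
    ref_sfm = spectral_flatness(reference)
    deviations = {
        label: abs(spectral_flatness(signal) - ref_sfm)
        for label, signal in candidates.items()
    }
    best = min(deviations, key=deviations.get)
    return best, deviations


# Synthetic stand-ins (assumed for illustration): a spectrally flat reference,
# one tonal candidate and one spectrally flat candidate.
N = 32
reference = [1.0] + [0.0] * (N - 1)
candidates = {
    "window_a": [math.sin(2 * math.pi * 4 * t / N) for t in range(N)],
    "window_b": [0.0] * 3 + [1.0] + [0.0] * (N - 4),
}
best_label, deviations = select_window(candidates, reference)
print(best_label)
```

The returned label plays the role of the window indication. Other comparison parameters could be substituted for the SFM deviation without changing the selection logic.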
Moreover, the plurality 111-1 of analysis window functions indicated by the window control information 311 at the output of the window controller 310 may comprise different analysis window functions having different window characteristics having the same window length as the block 101-6 in time. In particular, the different analysis window functions may be characterized by different frequency response functions (“transfer functions”) obtained from a spectral analysis. The transfer functions, in turn, can be distinguished by characteristic features such as their main lobe widths, side lobe levels or side lobe fall-offs. The different analysis window functions may also be divided into several groups with regard to their performance characteristics such as spectral resolution or dynamic range. For example, high and moderate resolution windows may be represented by rectangular, triangular, cosine, raised-cosine, Hamming, Hann, Bartlett, Blackman, Gaussian, Kaiser or Bartlett-Hann window functions, while low resolution or high dynamic range windows may be represented by flat-top, Blackman-Harris or Tukey window functions. In alternative embodiments, it may also be possible to use window functions having a different number of samples (i.e. windows of different window lengths).
Specifically, applying different analysis window functions 111-1, which may belong to different groups of analysis window functions, to the block 101-6 of audio samples by the use of the patch module 330, for example, will result in patched signals 331-1 having different characteristic features such as different SFM parameters.
With regard to
As shown in
The analysis windower 610 is configured for applying a plurality of analysis window functions such as the analysis window functions 111-1 in the embodiments of the bandwidth extension encoders 300; 500 to the decoded low frequency signal 681-1 to obtain a plurality 611 of windowed low frequency signals. The time/spectrum converter 620 is configured for converting the windowed low frequency signals 611 into spectra 621. The frequency domain processor 630 is configured for processing the spectra 621 in a frequency domain to obtain modified spectra 631. The frequency/time converter 640 is configured for converting the modified spectra 631 into modified time domain signals 641. The synthesis windower 650 is configured for applying a plurality of synthesis window functions to the modified time domain signals 641, wherein the synthesis window functions are matched to the analysis window functions, to obtain windowed modified time domain signals 651. In particular, the synthesis window functions can be matched to the analysis window functions such that applying the synthesis window functions will compensate for the effect of the corresponding analysis window functions. The comparator 660 is configured to determine a plurality of comparison parameters based on a comparison of the plurality 651 of windowed modified time domain signals and the decoded low frequency signal 681-1, wherein the plurality of comparison parameters corresponds to the plurality 111-1 of analysis window functions having been applied to the decoded low frequency signal 681-1 by the analysis windower 610. The comparator 660 is furthermore configured to select an analysis window function and a synthesis window function for which a comparison parameter satisfies a predetermined condition. Here, the comparator 660 may especially be configured as discussed before in the context of
In the embodiments of the bandwidth extension encoders/decoders presented before, the employed comparators may correspond to the comparator 700 as described in
In the previous embodiments, the window selection is performed by a signal analysis in that a plurality of different analysis window functions is applied to the audio signal or a signal derived from the audio signal, generating a plurality of different patched (synthesized) signals. From this plurality of synthesized signals, an optimum window function is selected based on a predefined criterion based on a comparison of the synthesized signals with the original audio signal or a signal derived from the audio signal. The selected window function is then applied to the audio signal or a signal derived from the audio signal such as within a bandwidth extension scheme so that a specific patched (synthesized) signal will be generated. The above procedure, in particular, corresponds to a closed loop and can be referred to as an ‘analysis-by-synthesis’ scheme. Alternatively, the window selection can also be performed by a direct analysis of an input signal being the audio signal or a signal derived from the audio signal, wherein the original input signal is analyzed/classified with regard to a certain signal characteristic such as a measure of the tonality. This alternative analysis scheme corresponding to an open loop will be presented in the following embodiments.
The signal analyzer 110 of the bandwidth extension encoder 800 comprises a signal classifier 810, wherein the signal classifier 810 is configured to classify the audio signal 101-1 or a signal derived from the audio signal such as the high frequency signal 101-4 (dashed line) for determining a window indication 811 corresponding to an analysis window function based on a signal characteristic of the classified signal. For example, the signal classifier 810 may be implemented to determine the window indication 811 by calculating a tonality measure from the audio signal 101-1 or the high frequency signal 101-4, wherein the tonality measure may indicate how the spectral energy is distributed in their bands. In case the spectral energy is distributed relatively uniformly in a band, a rather non-tonal signal (‘noisy signal’) exists in this band and the window indication 811 may be related to a first window function having a first characteristic adapted to be applied to the non-tonal signal, while in case the spectral energy is relatively strongly concentrated at a certain location in this band, a rather tonal signal exists for this band and the window indication 811 may be related to a second window function having a second characteristic adapted to be applied to the tonal signal. Furthermore, the encoder 800 comprises a window controller 820 for providing window control information 821 based on the window indication 811 determined by the signal classifier 810. The parameter calculator 830 of the encoder 800 comprises a windower controlled by the window controller 820, wherein the windower of the parameter calculator 830 is configured to apply an analysis window function based on the window control information 821 to the high frequency signal 101-4 to obtain BWE parameters 831. 
The window controller 820 may, for example, be implemented to provide the window control information 821 for the parameter calculator 830 so that a first window characterized by a transfer function with a first width of a main lobe will be applied by the windower of the parameter calculator 830, when the determined tonality measure is below a predefined threshold, or a second window characterized by a transfer function with a second width of a main lobe will be applied by the windower of the parameter calculator 830, when the determined tonality measure is equal to or above the predefined threshold, wherein the first width of the main lobe of the transfer function is larger than the second width of the main lobe of the transfer function. In particular, in the context of a bandwidth extension scheme, it may be advantageous to use a window function having a rather large main lobe of the transfer function in case of a non-tonal signal and a rather small main lobe of the transfer function in case of a tonal signal.
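The threshold decision described above can be sketched in a few lines. The sketch assumes, purely for illustration, that the tonality measure is defined as one minus the spectral flatness of the block and that the threshold is 0.5; neither choice is prescribed by the description, and the function names and returned labels are hypothetical.

```python
import cmath
import math


def power_spectrum(x):
    """One-sided power spectrum via a direct DFT (adequate for short blocks)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]


def tonality_measure(x, eps=1e-12):
    """Assumed tonality measure: 1 - spectral flatness.

    Close to 0 when the energy is spread uniformly over the band (noisy),
    close to 1 when the energy is concentrated at one location (tonal).
    """
    p = [v + eps for v in power_spectrum(x)]
    geometric = math.exp(sum(math.log(v) for v in p) / len(p))
    arithmetic = sum(p) / len(p)
    return 1.0 - geometric / arithmetic


def window_indication(x, threshold=0.5):
    """Open-loop selection: wide main lobe below the threshold, narrow otherwise."""
    if tonality_measure(x) < threshold:
        return "first_window_wide_main_lobe"
    return "second_window_narrow_main_lobe"


N = 32
flat_block = [1.0] + [0.0] * (N - 1)                              # uniform spectrum
tonal_block = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]  # single partial

print(window_indication(flat_block))
print(window_indication(tonal_block))
```

Because the decision depends only on the analyzed signal, no synthesis step is needed, which is what distinguishes this open-loop scheme from the analysis-by-synthesis scheme.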
The core encoder 120 of the bandwidth extension encoder 800 is configured to encode the low frequency signal 101-2 to obtain an encoded low frequency signal 121. As in the embodiment shown in
In this case, the window indication is not contained in the encoded audio signal on the encoder side (
The analysis-by-synthesis scheme of the previous embodiments may also be used in the context of a phase vocoder implementation. Accordingly,
In particular, the temporally spread signal 1271 can be obtained by spacing the overlapping consecutive blocks of the windowed modified time domain signal 1255 further apart from each other than the corresponding blocks of the original audio signal 1201 or the decoded low frequency signal 1202. Additionally, the overlap adder 1270 here acting as a signal spreader may also be configured to temporally spread the audio signal 1201 or the decoded low frequency signal 1202 in that the pitch of the same will not be changed, leading to a scenario of “pure time stretching”.
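The spacing of overlapping blocks can be sketched with a plain overlap-add loop. This is a simplified illustration under stated assumptions: it uses a single periodic Hann window, omits the frequency domain phase adaptation that a phase vocoder performs between analysis and synthesis, and the names `analysis_hop` and `synthesis_hop` are hypothetical. Without the phase adaptation the result is not a high-quality time stretch; the sketch only shows how the block spacing determines the output duration.

```python
import math


def hann(block_len):
    """Periodic Hann window; adjacent windows at 50% overlap sum to one."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / block_len) for n in range(block_len)]


def ola_stretch(signal, block_len, analysis_hop, synthesis_hop):
    """Read blocks every analysis_hop samples, write them every synthesis_hop samples.

    A synthesis hop larger than the analysis hop spaces the blocks further
    apart than in the input and therefore lengthens the output.
    """
    window = hann(block_len)
    num_blocks = 1 + (len(signal) - block_len) // analysis_hop
    out = [0.0] * ((num_blocks - 1) * synthesis_hop + block_len)
    for b in range(num_blocks):
        start_in = b * analysis_hop
        start_out = b * synthesis_hop
        for n in range(block_len):
            out[start_out + n] += window[n] * signal[start_in + n]
    return out


signal = [1.0] * 64

# Equal hops: the blocks land where they came from (no stretching).
unstretched = ola_stretch(signal, block_len=16, analysis_hop=8, synthesis_hop=8)

# Synthesis hop twice the analysis hop: roughly doubles the duration.
stretched = ola_stretch(signal, block_len=16, analysis_hop=4, synthesis_hop=8)

print(len(unstretched), len(stretched))
```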
Alternatively, the comparator 1260 may also be placed after the overlap adder 1270 in the processing chain such that the latter will also be included in the analysis-by-synthesis scheme, which may be advantageous insofar as in this case, effects of the different windowed modified time domain signals 1251 processed by the overlap adder 1270 may also be accounted for by a subsequent comparison/window selection.
In further alternative embodiments, the phase vocoder processor 1200 may also comprise a decimator in the form of, for example, a simple sample rate converter, wherein the decimator may be configured to decimate (compress) the spread signal such that a decimated signal in a target frequency range of a bandwidth extension algorithm will be obtained.
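The effect of the decimator can be illustrated with a minimal sketch that keeps every second sample of a sinusoid. It is an idealized example under stated assumptions: the decimation factor of 2 and the test signal are chosen for illustration, the function name is hypothetical, and no anti-aliasing filter is applied, which a practical sample rate converter would typically include.

```python
import math


def decimate(signal, factor):
    """Keep every factor-th sample (no anti-aliasing filter in this sketch)."""
    return signal[::factor]


# A sinusoid with a period of 16 samples, standing in for a stretched signal.
stretched = [math.sin(2 * math.pi * n / 16) for n in range(64)]

# After decimation by 2 the period is 8 samples: when played back at the
# original sample rate, the frequency is doubled and the duration halved.
decimated = decimate(stretched, 2)
print(len(decimated))
```

Stretching a signal by a given factor and then decimating it by the same factor therefore restores the original duration while raising the frequency content into the target range.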
In further alternative embodiments, a phase vocoder processor may also be implemented to perform a direct analysis of an input audio signal with the aim to select an optimal analysis window function adapted to the signal characteristic of the analyzed audio signal. Particularly, it was found that certain signals benefit from using specialized analysis windows for the phase vocoder. For instance, noisy signals are better analyzed by application of, for example, a Tukey window, while predominantly tonal signals benefit from a small main lobe of the transfer function as provided by, e.g., the Bartlett window.
In summary, it can be seen that the procedure of selecting the optimum window function can either be performed only on the encoder side such as within the bandwidth extension encoders 300 and 800 of
In this context, it may be of advantage that in the latter case, the window indication need not be stored as additional side-information within the encoded audio signal such that the bit rate for storage or transmission of the encoded audio signal may be reduced.
wherein the window evolves from the rectangular window to the Hanning window as the parameter α varies from 0 to unity. The Bartlett window representing a triangular window may be defined as
In Eqs. (1) and (2), n is an integer value and N the width (in samples) of the time-discrete window functions w(n).
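The two window families named above can be sketched as follows, assuming the common symmetric textbook definitions with n running from 0 to N-1. The exact indexing convention of Eqs. (1) and (2) is an assumption here, and the function names are illustrative.

```python
import math


def tukey(N, alpha):
    """Tapered-cosine (Tukey) window, assumed symmetric textbook form.

    alpha = 0 gives the rectangular window, alpha = 1 gives the Hann window.
    """
    if alpha <= 0.0:
        return [1.0] * N
    edge = alpha * (N - 1) / 2.0          # length of each cosine taper
    w = []
    for n in range(N):
        if n < edge:                      # rising taper
            w.append(0.5 * (1.0 + math.cos(math.pi * (n / edge - 1.0))))
        elif n <= (N - 1) - edge:         # flat section
            w.append(1.0)
        else:                             # falling taper
            w.append(0.5 * (1.0 + math.cos(math.pi * ((n - (N - 1)) / edge + 1.0))))
    return w


def bartlett(N):
    """Triangular (Bartlett) window with zero end points, assumed textbook form."""
    if N == 1:
        return [1.0]
    return [1.0 - abs(2.0 * n / (N - 1) - 1.0) for n in range(N)]


def hann_symmetric(N):
    """Symmetric Hann window, used only to check the alpha = 1 limit."""
    return [0.5 * (1.0 - math.cos(2 * math.pi * n / (N - 1))) for n in range(N)]


print(tukey(8, 0.0))
print(bartlett(5))
```

With these definitions, `tukey(N, 1.0)` coincides with `hann_symmetric(N)`, matching the statement that the window evolves from the rectangular window to the Hanning window as α varies from 0 to unity.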
The windowed audio signal obtained after applying the analysis window 1311-1 may further be transformed in a block 1320 denoted by “time-frequency transformation” from the time domain to a frequency domain. The obtained spectrum may then be processed in a block 1330 denoted by “frequency domain processing”. In particular, the block 1330 may comprise a phase modifier for modifying phases of spectral values of the spectrum. Then, the processed spectrum may be transformed in a block 1340 denoted by “frequency-time transformation” back into the time domain to obtain a modified time domain signal. Finally, depending on the control information 1301-2, a synthesis window 1351-1 from a plurality of synthesis windows 1351-2 denoted by “synthesis window 1” to “synthesis window 4”, wherein the synthesis window 1351-1 compensates for the effect of the analysis window 1311-1, may be applied to the modified time domain signal to obtain, after adding contributions from all possible signal paths in a block 1360 indicated by a plus symbol, the windowed modified time domain signal 1361 at the output of the apparatus 1300.
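The processing chain for one block and one selected window pair can be sketched as follows. This is a minimal illustration under stated assumptions: the transforms are a direct DFT and inverse DFT, the phase modifier simply scales the phase of each spectral value by a hypothetical `phase_factor`, and the window pair in the example (a periodic Hann analysis window with an all-ones synthesis window) is chosen only to make the result easy to check. None of these choices is prescribed by the description.

```python
import cmath
import math


def dft(x):
    """Direct discrete Fourier transform (time-frequency transformation)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT (frequency-time transformation); returns the real part."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def process_block(block, analysis_win, synthesis_win, phase_factor=1.0):
    """Analysis window -> DFT -> phase modification -> inverse DFT -> synthesis window."""
    windowed = [w * s for w, s in zip(analysis_win, block)]
    spectrum = dft(windowed)
    # Assumed phase modifier: keep the magnitude, scale the phase.
    modified = [abs(v) * cmath.exp(1j * phase_factor * cmath.phase(v)) for v in spectrum]
    time_signal = idft(modified)
    return [w * s for w, s in zip(synthesis_win, time_signal)]


N = 16
block = [float(n % 5) for n in range(N)]
analysis_win = [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]
synthesis_win = [1.0] * N

# With phase_factor = 1 the spectrum is unchanged, so the output equals the
# block weighted by both windows.
unchanged = process_block(block, analysis_win, synthesis_win, phase_factor=1.0)

# With phase_factor = 2 the phases are doubled, as a stand-in for the phase
# modification used in a transposition by a factor of 2.
modified = process_block(block, analysis_win, synthesis_win, phase_factor=2.0)
print(len(unchanged), len(modified))
```

Running the same function once per candidate window pair yields the set of windowed modified time domain signals from which a selection can then be made.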
In case of the HBE patching algorithm (block 1460-2), the decoded low frequency signal 1421 may be down-sampled by a down sampler 1490 by, for example, a factor of 2 to obtain a down-sampled version of the decoded low frequency signal 1491. The down-sampled signal 1491 may further be processed in an advanced processing scheme of a harmonic bandwidth extension algorithm using a phase vocoder.
On the one hand, a signal-dependent processing scheme may be employed that switches between a standard algorithm and an advanced algorithm. The standard algorithm, illustrated by a signal path 1500 denoted by “no”, is used when a transient detector 1485 does not detect a transient event in a block of the decoded low frequency signal 1421. The advanced algorithm, illustrated by a signal path 1510 denoted by “yes” and starting with a zero padding operation (block 1515), is used when a transient event is detected in the block.
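The switching between the “no” path and the “yes” path may be sketched as follows. The description does not specify how the transient detector 1485 operates or how many zeros are added, so the sketch assumes a simple block-energy ratio test and symmetric padding. The function names, the threshold and the padding length are hypothetical.

```python
def is_transient(block, prev_energy, threshold=4.0):
    """Assumed detector: flag a transient when the block energy exceeds
    `threshold` times the energy of the previous block.

    Returns the decision together with the current block energy.
    """
    energy = sum(s * s for s in block)
    return energy > threshold * max(prev_energy, 1e-12), energy

def zero_pad(block, pad):
    """Add `pad` zeros before and after the block (assumed symmetric padding)."""
    return [0.0] * pad + list(block) + [0.0] * pad

def prepare_block(block, prev_energy, pad):
    """Select the signal path for one block.

    Returns ("no", block, energy) for the standard path and
    ("yes", zero-padded block, energy) for the advanced path.
    """
    transient, energy = is_transient(block, prev_energy)
    if transient:
        return "yes", zero_pad(block, pad), energy
    return "no", list(block), energy
```

The returned energy can be passed as prev_energy when the next block is prepared, so that each block is compared with its predecessor.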
On the other hand, essentially, a signal dependent switching of analysis window characteristics within a phase vocoder in a time-frequency transform implementation may be performed as has been described in detail before. In particular, in
Here, the blocks denoted by “FFT” (Fast Fourier Transform), “phase adaption” and “iFFT” (inverse Fast Fourier Transform) may correspond to the blocks 1320, 1330 and 1340 shown in
It is to be noted that, with the above concept, it is possible to switch between different windows at arbitrary positions in the audio signal.
Although the present invention has been described in the context of block diagrams where the blocks represent actual or logical hardware components, the present invention can also be implemented by a computer-implemented method. In the latter case, the blocks represent corresponding method steps where these steps stand for the functionalities performed by corresponding logical or physical hardware blocks.
The described embodiments are merely illustrative of the principles of the present invention. It is understood that modifications and variations of the arrangements and the details described herein will be apparent to others skilled in the art. It is the intent, therefore, to be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.
Depending on certain implementation requirements of the inventive methods, the inventive methods can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, in particular a disc, a DVD or a CD having electronically readable control signals stored thereon, which co-operate with programmable computer systems such that the inventive methods are performed. Generally, the present invention can therefore be implemented as a computer program product with the program code stored on a machine-readable carrier, the program code being operated for performing the inventive methods when the computer program product runs on a computer. In other words, the inventive methods are, therefore, a computer program having a program code for performing at least one of the inventive methods when the computer program runs on a computer. The inventive encoded audio signal can be stored on any machine-readable storage medium, such as a digital storage medium.
The advantage of the novel processing is that the above-mentioned embodiments, i.e. apparatus, methods or computer programs, described in this application allow improving the perceptual audio quality of bandwidth extension applications. In particular, the novel processing utilizes a signal-dependent switching of analysis window characteristics, for example within a phase vocoder driven bandwidth extension.
The novel processing can also be used in other phase vocoder applications such as pure time stretching whenever it is beneficial to take into account signal characteristics for the choice of an optimal analysis or synthesis window.
The presented concept allows the bandwidth extension to take into account signal characteristics for the patching process. The decision for the best-suited analysis window can be made in an open loop or in a closed loop. Therefore, the restitution quality can be optimized and, thus, further enhanced.
Most prominent applications are audio decoders based on bandwidth extension principles. However, the inventive processing may also enhance phase vocoder applications for music production or audio post-processing.
While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations and equivalents as fall within the true spirit and scope of the present invention.
Number | Date | Country | Kind |
---|---|---|---|
EP 10153530 | Feb 2010 | EP | regional |
This application is a continuation of copending International Application No. PCT/EP2010/059025, filed Jun. 24, 2010, which is incorporated herein by reference in its entirety, and additionally claims priority from U.S. Application No. 61/221,442, filed Jun. 29, 2009 and European Application No. EP 10153530.0, filed Feb. 12, 2010, which are all incorporated herein by reference in their entirety.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/EP2010/059025 | Jun 2010 | US |
Child | 13335096 | | US |