The present application claims priority to German Patent Application No. 102019124285.1, entitled “INPUT SIGNAL DECORRELATION”, and filed on Sep. 10, 2019. The entire contents of the above-listed application are hereby incorporated by reference for all purposes.
The disclosure relates to a system and method (generally referred to as a “system”) for decorrelating an input signal.
In some cases, for example in multichannel adaptive systems, it may be beneficial for the reference or input signals used to be statistically independent of each other, i.e. to have a high degree of decorrelation. For example, changes in a room may be automatically recognized and compensated for based on continuously estimated room impulse responses (RIRs) of a multi-channel adaptive system for suppressing acoustic echoes (AEC). When doing so, the RIRs, represented by room transfer functions between loudspeakers and microphones installed in the room, are determined (e.g., calculated, estimated, etc.) and compared to stored reference data previously determined in a reference room. The resulting spectral deviation then forms the basis for determining the compensation filter, which may make it possible to create a sound impression that is subjectively consistent, independent of the currently existing acoustic conditions in the room. As long as the multi-channel adaptive system uses mono signals, e.g. emits sound omnidirectionally, determining or using the adaptively estimated RIRs will be straightforward. However, if the device is operated in stereo or, in general, in a multichannel playback mode (in which, for example, numerous different signals that might be spatially vectored are played back), ambiguities may arise among the adaptively determined RIRs, depending on the degree of correlation between the signals used. In this case it may be more difficult to use the method for automatically compensating for room changes discussed above, which relies on continuously determined RIRs.
Such ambiguities in the estimation of the RIRs may be addressed by ensuring that the various input signals to be played back are sufficiently decorrelated from each other. In general, both channels of a stereo system are sufficiently decorrelated from each other and thus, in the case of a pure stereo playback, this problem may not arise. It does indeed arise, however, when so-called “upmixing” algorithms, such as, for example, Logic7 or Dolby Pro Logic are used. These generate a multichannel signal (e.g. a 5.1 signal from a stereo input signal), wherein the generated additional signals may no longer possess a high degree of decorrelation from each other, which may increase a probability of ambiguity in the estimation of the RIRs. For this reason, employing a decorrelator may be beneficial. Therefore it is generally desirable to explore systems and methods for reliably decorrelating multi-channel audio signals.
An example decorrelator for decorrelating an input signal includes a controllable allpass filter arrangement configured to phase shift the first input signal by a phase shift, the allpass filter arrangement comprising one or more controllable allpass filter stages connected in series, and each controllable allpass filter stage having a filter quality and a cut-off frequency. The decorrelator further includes a filter controller operatively connected to the controllable allpass filter arrangement and configured to control at least one of the filter quality and the cut-off frequency of the controllable allpass filter stages to change over time.
An example decorrelation method for decorrelating an input signal includes allpass filtering to phase shift the first input signal by a phase shift, the allpass filtering comprising filtering with one or more subsequent controllable allpass filter stages, each controllable allpass filter stage having a filter quality and a cut-off frequency. The method further includes controlling at least one of the filter quality and the cut-off frequency of the controllable allpass filter stages to change over time.
Other systems, methods, features and advantages will be, or will become, apparent to one with skill in the art upon examination of the following detailed description and appended figures (FIGs.). It is intended that all such additional systems, methods, features and advantages be included within this description, be within the scope of the invention, and be protected by the following claims.
The system and method may be better understood with reference to the following drawings and description. The components in the figures are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention. Moreover, in the figures, like referenced numerals designate corresponding parts throughout the different views.
Additionally or alternatively, in one embodiment, filter base frequencies with a maximum frequency of fs/4 may be chosen. This ensures that the resulting group delay of the allpass filter chain does not only rise up to this frequency, due to the accumulation of the individual, constantly falling phase responses, but also begins to fall again after the maximum frequency of fs/4 has been reached, thus avoiding an excessive and unwanted build-up of the group delay. Regardless of this, any of the options mentioned above, as well as an option in which both filter parameters, i.e. the cutoff frequencies fcn(n) and the quality factors Qn(n), are time-variable, may be used.
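As a minimal sketch of the base-frequency choice described above, the following Python function distributes the base frequencies of the allpass stages between a lower bound and fs/4. The function name `base_frequencies`, the lower bound of 100 Hz and the logarithmic spacing are assumptions made purely for illustration; the description only specifies the upper limit of fs/4.

```python
def base_frequencies(num_stages, fs_hz, f_min_hz=100.0):
    """Return num_stages base frequencies in Hz, none above fs/4.

    Logarithmic spacing and the default lower bound are illustrative
    assumptions; only the fs/4 upper limit comes from the description.
    """
    f_max_hz = fs_hz / 4.0
    if num_stages == 1:
        return [f_max_hz]
    ratio = (f_max_hz / f_min_hz) ** (1.0 / (num_stages - 1))
    # Clamp to guard against floating-point overshoot at the last stage.
    return [min(f_min_hz * ratio ** i, f_max_hz) for i in range(num_stages)]


if __name__ == "__main__":
    for f in base_frequencies(8, 48000.0):
        print(round(f, 1))
```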
A simple way of implementing parametric allpass filter stages of M-th order is, for example, provided by lattice ladder filters, of which various designs exist such as, for example, the one-multiplier, two-multipliers and four-multipliers designs. In allpass filters, the attenuation of the filter is constant at all frequencies but the relative phase between input and output varies with frequency.
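For illustration, a 2nd order allpass stage in a two-multiplier lattice form might be sketched in Python as follows. The class name `LatticeAllpass2`, the state variable names and the choice of the two-multiplier variant are assumptions of this sketch, not a definitive implementation of the described stages. Here f denotes forward-path signals and g backward-path signals.

```python
class LatticeAllpass2:
    """2nd order allpass stage, two-multiplier lattice form (illustrative)."""

    def __init__(self, k1, k2):
        self.k1 = k1          # first reflection coefficient
        self.k2 = k2          # second reflection coefficient
        self.g0_prev = 0.0    # g0(n-1), delayed backward signal of section 0
        self.g1_prev = 0.0    # g1(n-1), delayed backward signal of section 1

    def process(self, x):
        """Filter one input sample x(n) = f2(n) and return g2(n)."""
        f1 = x - self.k2 * self.g1_prev
        f0 = f1 - self.k1 * self.g0_prev
        g1 = self.k1 * f0 + self.g0_prev
        g2 = self.k2 * f1 + self.g1_prev
        self.g0_prev = f0     # g0(n) equals f0(n) at the end of the lattice
        self.g1_prev = g1
        return g2


def impulse_response(k1, k2, length):
    """Impulse response of a freshly initialised stage."""
    stage = LatticeAllpass2(k1, k2)
    return [stage.process(1.0 if n == 0 else 0.0) for n in range(length)]


if __name__ == "__main__":
    h = impulse_response(0.3, 0.5, 2000)
    # An allpass has unit magnitude at all frequencies, so the
    # energy of its impulse response should be close to 1.
    print(sum(v * v for v in h))
```

Because both reflection coefficients enter the recursion directly, they can be changed between calls to `process` without recomputing any other coefficients.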
The forward path input of stage 201 receives a filter input signal x(n)=fM(n) and provides a filter output signal x′(n)=gM(n) at its backward path output. Further, the backward path input of stage 201 receives a signal gM-1(n) and provides a signal fM-1(n) at its forward path output. For example, if M=3, the signal gM-1(n) is g2(n) and the signal fM-1(n) is f2(n). In the example shown in
An advantage of lattice ladder filters is that their filter coefficients correspond to the reflection coefficients which, for example, may be determined using the Levinson-Durbin recursion. One of the properties of the reflection coefficients is that they ensure that the filter is stable as long as their magnitude stays smaller than 1, i.e. as long as |Km|<1, wherein m=1, . . . , M, and M is the order of the filter.
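The stability property can be checked numerically for the 2nd order case. The sketch below assumes a two-multiplier lattice allpass whose denominator polynomial is 1 + K1(1+K2)z^-1 + K2 z^-2; the function names are hypothetical and serve only to illustrate the criterion.

```python
import cmath


def reflection_coefficients_stable(coefficients):
    """True if every reflection coefficient has a magnitude below 1."""
    return all(abs(k) < 1.0 for k in coefficients)


def pole_magnitudes(k1, k2):
    """Pole magnitudes of an assumed 2nd order lattice allpass.

    The poles are the roots of z^2 + k1*(1 + k2)*z + k2.
    """
    b = k1 * (1.0 + k2)
    c = k2
    root = cmath.sqrt(b * b - 4.0 * c)
    return abs((-b + root) / 2.0), abs((-b - root) / 2.0)


if __name__ == "__main__":
    for k1, k2 in [(0.3, 0.5), (0.3, 1.2)]:
        print(k1, k2,
              reflection_coefficients_stable([k1, k2]),
              max(pole_magnitudes(k1, k2)))
```

In the first case both poles lie inside the unit circle; in the second, K2 exceeds 1 in magnitude and the poles move outside it.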
In the case of a 2nd order lattice ladder allpass filter, the first filter (or reflection) coefficient K1 corresponds to the filter cutoff frequency fc and the second filter coefficient K2 corresponds to the filter quality factor Q. With this, filter coefficients Kc can be easily generated over time, e.g. by way of an ordinary pseudo random number generator (white noise generator) which provides quasi-random values from the range of [−1, . . . , +1]. The range of values used can be further limited, e.g. in order to prevent the filter quality factor from becoming too large, according to:
$$K_{2,m}(n) \in [0, K_{2\mathrm{Max}}], \quad m = 1, \ldots, M,$$
with $K_{2\mathrm{Max}} \le 1$, where $M$ is the number of allpass filters in the chain.
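A minimal sketch of generating such coefficients, under stated assumptions: the second coefficient of each stage is drawn uniformly from the limited range [0, K2Max] with Python's standard pseudo random number generator, and the first coefficient is derived from a cutoff frequency using the mapping K1 = -cos(2π·fc/fs). That mapping is a common parameterisation of 2nd order allpass sections and is an assumption here; the description only states that K1 corresponds to the cutoff frequency. The function names are hypothetical.

```python
import math
import random


def k1_from_cutoff(fc_hz, fs_hz):
    """Assumed mapping from cutoff frequency to the first coefficient."""
    return -math.cos(2.0 * math.pi * fc_hz / fs_hz)


def draw_k2(num_stages, k2_max, rng):
    """Draw one second coefficient per stage from [0, k2_max]."""
    return [rng.uniform(0.0, k2_max) for _ in range(num_stages)]


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed only to make the sketch repeatable
    print(k1_from_cutoff(1000.0, 48000.0))
    print(draw_k2(4, 0.8, rng))
```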
In order to prevent the generation of disturbing acoustic artefacts, the dynamics over time of the time-variable filter parameter(s) or filter coefficient(s) are limited, i.e. the time-variable filter parameter(s) or filter coefficient(s) are not allowed to change too quickly. To achieve this, either the range within which the filter parameter(s) in question (fc and/or Q) may change from one sample to the next is limited accordingly (for example, fc may not change from one sample to the next by more than Δfc=1 [Hz]), or the time interval between unrestricted changes of the filter parameter(s) is made very long, in which case interpolation may be performed in between.
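The first of these two options, limiting the change per sample, might be sketched as follows. The function name `limit_step` is hypothetical, and the example values (a jump from 1000 Hz to 1005 Hz with a limit of 1 Hz per sample) are chosen only for illustration.

```python
def limit_step(current, target, max_step):
    """Move current towards target by at most max_step per call."""
    delta = target - current
    if delta > max_step:
        delta = max_step
    elif delta < -max_step:
        delta = -max_step
    return current + delta


if __name__ == "__main__":
    fc = 1000.0          # current cutoff frequency in Hz
    fc_target = 1005.0   # newly requested cutoff frequency in Hz
    for _ in range(7):
        fc = limit_step(fc, fc_target, 1.0)
        print(fc)
```

The parameter reaches the requested value after five samples and then stays there.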
Here the advantage of employing lattice ladder filters for implementing the allpass filters and the accompanying reflection filter coefficients once again becomes apparent as using such a structure allows the parameter changes to be carried out directly in the filter coefficients. As opposed to this, when common allpass filters are used, e.g. in a direct form structure, the filter coefficients must be constantly calculated anew from the limited or interpolated filter parameters, which entails a considerable computational effort that is not needed with lattice ladder filters.
In practice, an update time of approximately tud=1 [s] may be useful. For example, every tud, new values for the time-variable filter coefficients K2c, wherein c=1, . . . , C, and C is the number of 2nd order allpass filters, are generated by way of a pseudo random number generator from a range of K2c∈[0, . . . , K2Max] and applied as new target values. Within the time period determined by tud, the coefficients are then interpolated (e.g. linearly), so that, by the end of tud, all time-variable filter coefficients K2c(n) correspond to the new values generated by the pseudo random number generator. In this simple manner, and without an undue increase of the computational effort, disturbing acoustic artefacts can be reduced so greatly that they no longer present an acoustic problem.
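The update-and-interpolate scheme can be sketched for a single coefficient as shown below. The class name `InterpolatedK2`, the uniform distribution and the expression of the update time as a number of samples (e.g. 48000 samples for tud=1 s at an assumed sampling rate of 48 kHz) are assumptions of this sketch.

```python
import random


class InterpolatedK2:
    """Linearly interpolates one coefficient between random targets."""

    def __init__(self, k2_max, update_samples, rng):
        self.k2_max = k2_max
        self.update_samples = update_samples  # update time in samples
        self.rng = rng
        self.start = rng.uniform(0.0, k2_max)
        self.target = rng.uniform(0.0, k2_max)
        self.pos = 0

    def next_value(self):
        """Return the coefficient value for the next sample."""
        self.pos += 1
        frac = self.pos / self.update_samples
        value = self.start + (self.target - self.start) * frac
        if self.pos == self.update_samples:
            # Target reached: draw the next one and restart the ramp.
            self.start = self.target
            self.target = self.rng.uniform(0.0, self.k2_max)
            self.pos = 0
        return value


if __name__ == "__main__":
    coeff = InterpolatedK2(0.8, 48000, random.Random(0))
    values = [coeff.next_value() for _ in range(3 * 48000)]
    print(min(values), max(values))
```

With this scheme the change per sample never exceeds K2Max divided by the number of samples per update interval.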
Referring to
In a further example, the allpass filter parameters, cut-off frequencies and/or quality factors, are controlled dependent on a correlation analysis of the input signal and at least one comparison signal (e.g., the other input or reference signals) so that decorrelation is only applied (e.g. in certain spectral ranges) if a certain correlation between reference signals is detected. The filter controller 102 shown in Figure may be adapted to perform this procedure, e.g., a processor that implements the filter controller 102 includes software that allows for assessing a value corresponding to a degree of correlation and comparing this value with a threshold.
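One way to sketch such a correlation check is a normalized correlation coefficient computed over a block of samples and compared with a threshold. The broadband, zero-lag measure, the function names and the threshold value of 0.9 are assumptions chosen for illustration; the description does not specify how the degree of correlation is assessed.

```python
import math


def correlation_coefficient(x, y):
    """Normalized zero-lag correlation of two equally long blocks."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                    * sum((b - mean_y) ** 2 for b in y))
    return num / den if den > 0.0 else 0.0


def needs_decorrelation(x, y, threshold=0.9):
    """True if the blocks are correlated strongly enough to act on."""
    return abs(correlation_coefficient(x, y)) >= threshold


if __name__ == "__main__":
    n = 1000
    sine = [math.sin(2.0 * math.pi * 10.0 * i / n) for i in range(n)]
    cosine = [math.cos(2.0 * math.pi * 10.0 * i / n) for i in range(n)]
    print(needs_decorrelation(sine, sine))
    print(needs_decorrelation(sine, cosine))
```

A per-band variant of the same comparison could restrict decorrelation to certain spectral ranges, as mentioned above.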
In some applications, e.g. in multi-channel adaptive systems such as a multi-channel acoustic echo canceller (MCAEC), it may be beneficial to decorrelate the reference signals so that they become statistically independent and hence allow for a distinct, i.e. unambiguous, estimation of the “real” room impulse responses (RIRs). This is, for example, applicable in an automatic equalization system designed to compensate for different room characteristics in order to ideally achieve a subjectively similar tonal balance, independent of the room where the device is used and/or the position of the device in the room.
The drawback described above does not exist if a mono signal is used as a reference. If a stereo signal is used as a reference, there are usually also no negative effects, since a typical stereo input signal offers a sufficiently high degree of decorrelation between its left and right channels. However, if an up-mixing algorithm is used to create several signals based on a (mainly) stereo input, the problem of ambiguity arises if no further actions are taken to decorrelate the output signals of the up-mixer, which may be used as reference signals for the MCAEC. In such cases, it may be beneficial to introduce additional decorrelation to one or more output signals of the up-mixer before they are used as references for the MCAEC.
The systems and methods described above provide a simple and efficient way to implement a decorrelator that, in addition, does not create significant additional acoustical artifacts. An allpass filter (AP) chain is used including, for example, parametric filters in order to enable a simple time-variation of certain parameters, such as their filter qualities and/or their cut-off frequencies. Further, a fixed set of cut-off frequencies, distributed over a certain, restricted frequency range, may be used in combination with time-varying quality factors, where the latter are also restricted to a defined, adjustable range in order to avoid acoustical artifacts, which may occur if, e.g., too high quality factor values are employed.
The method described above may be encoded in a computer-readable medium such as a CD ROM, disk, flash memory, RAM or ROM, an electromagnetic signal, or other machine-readable medium as instructions for execution by a processor. Alternatively or additionally, any type of logic may be utilized and may be implemented as analog or digital logic using hardware, such as one or more integrated circuits (including amplifiers, adders, delays, and filters), or one or more processors executing amplification, adding, delaying, and filtering instructions; or in software in an application programming interface (API) or in a Dynamic Link Library (DLL), functions available in a shared memory or defined as local or remote procedure calls; or as a combination of hardware and software.
The method may be implemented by software and/or firmware stored on or in a computer-readable medium, machine-readable medium, propagated-signal medium, and/or signal-bearing medium. The media may comprise any device that contains, stores, communicates, propagates, or transports executable instructions for use by or in connection with an instruction executable system, apparatus, or device. The machine-readable medium may selectively be, but is not limited to, an electronic, magnetic, optical, electromagnetic, or infrared signal or a semiconductor system, apparatus, device, or propagation medium. A non-exhaustive list of examples of a machine-readable medium includes: a magnetic or optical disk, a volatile memory such as a Random Access Memory “RAM,” a Read-Only Memory “ROM,” an Erasable Programmable Read-Only Memory (i.e., EPROM) or Flash memory, or an optical fiber. A machine-readable medium may also include a tangible medium upon which executable instructions are printed, as the logic may be electronically stored as an image or in another format (e.g., through an optical scan), then compiled, and/or interpreted or otherwise processed. The processed medium may then be stored in a computer and/or machine memory.
The systems may include additional or different logic and may be implemented in many different ways including a controller that implements the filter chain and/or the filter controller. A controller may be implemented as a microprocessor, microcontroller, application specific integrated circuit (ASIC), discrete logic, or a combination of other types of circuits or logic. Similarly, memories may be DRAM, SRAM, Flash, or other types of memory. Parameters (e.g., conditions and thresholds) and other data structures may be separately stored and managed, may be incorporated into a single memory or database, or may be logically and physically organized in many different ways. Programs and instruction sets may be parts of a single program, separate programs, or distributed across several memories and processors.
The description of embodiments has been presented for purposes of illustration and description. Suitable modifications and variations to the embodiments may be performed in light of the above description or may be acquired from practicing the methods. For example, unless otherwise noted, one or more of the described methods may be performed by a suitable device and/or combination of devices. The described methods and associated actions may also be performed in various orders in addition to the order described in this application, in parallel, and/or simultaneously. The described systems are exemplary in nature, and may include additional elements and/or omit elements.
As used in this application, an element or step recited in the singular and preceded by the word “a” or “an” should be understood as not excluding plural of said elements or steps, unless such exclusion is stated. Furthermore, references to “one embodiment” or “one example” of the present disclosure are not intended to be interpreted as excluding the existence of additional embodiments that also incorporate the recited features. The terms “first,” “second,” and “third,” etc. are used merely as labels, and are not intended to impose numerical requirements or a particular positional order on their objects.
While various embodiments of the invention have been described, it will be apparent to those of ordinary skill in the art that many more embodiments and implementations are possible within the scope of the invention. In particular, the skilled person will recognize the interchangeability of various features from different embodiments. Although these techniques and systems have been disclosed in the context of certain embodiments and examples, it will be understood that these techniques and systems may be extended beyond the specifically disclosed embodiments to other embodiments and/or uses and obvious modifications thereof.
Number | Date | Country | Kind |
---|---|---|---|
102019124285.1 | Sep 2019 | DE | national |
Number | Date | Country | |
---|---|---|---|
20210076133 A1 | Mar 2021 | US |