The present embodiments relate generally to signal processing, and more particularly to the implementation of audio processing systems having a parallel bank of resonant filters, including modal reverberators and modal processors.
While acoustic spaces and vibrating systems have long been studied using modal analysis (see, e.g., N. H. Fletcher and T. D. Rossing, Physics of Musical Instruments, Springer, 2nd edition, 2010; A. Benade, Fundamentals of Musical Acoustics, Oxford University Press, 1976, pg. 172 et seq.; and P. Morse and K. Ingard, Theoretical Acoustics, Princeton University Press, 1987, pg. 576 et seq.), such systems have only recently been synthesized using modal structures. In J. Abel et al., “A modal architecture for artificial reverberation,” The Journal of the Acoustical Society of America, vol. 134, no. 5, p. 4220, 2013 and J. Abel et al., “A modal architecture for artificial reverberation with application to room acoustics modeling,” in Audio Engineering Society Convention, vol. 137, (Los Angeles, Calif.), Oct. 9-12, 2014, the so-called “modal reverberator” was introduced, implementing reverberation as the sum of resonant filters, one for each mode of the system. The resulting parallel structure allows accurate modeling of the acoustic space or object, and provides explicit, interactive control over its features with no computational latency.
One drawback of the modal reverberator can be its computational cost. There are many acoustic spaces and vibrating objects with a large number of modes in the audio band, and several thousand resonant filters may be needed to accurately implement the desired system. For instance,
In J. Abel and K. Werner, “Distortion and Pitch Processing Using a Modal Reverberator Architecture,” in Proceedings of the 18th International Conference on Digital Audio Effects (DAFx-15), Trondheim, Norway, 30 Nov.-3 Dec. 2015 (hereinafter “[3]”), the implementation of the mode filters is manipulated to incorporate other audio processors into the modal reverberator, including pitch shifting and distortion. Accordingly, there is also a need to provide an efficient implementation of the “modal processor” structures.
According to certain aspects, the present embodiments provide a method and system for efficiently implementing a parallel sum of resonant filters that is at the core of the modal reverberator and modal processor systems.
In one embodiment, the input is processed in frequency subbands defined by a bank of filters, at least two of which have non-overlapping pass bands, transition bands, and stop bands, and with each frequency in the audio band appearing in either a single pass band or a pair of transition bands. Each mode filter is assigned to a single subband or a pair of subbands according to its mode frequency. Downsampled subband signals are formed and processed with their associated downsampled mode filters. In a related embodiment, some of the pass bands overlap in frequency, and mode filters that would have been assigned to two adjacent subbands are instead assigned to separate subbands.
In another embodiment, downsampled subbands are formed through a process that includes the steps of heterodyning, low-pass filtering, and downsampling. In this case, the mode frequencies are adjusted according to the heterodyning frequency and downsampling factor.
In a further embodiment, the mode gains are adjusted according to the characteristics of the filters used to split the signal into subbands, including adjustments to accommodate pass band ripple.
In still another embodiment, a wideband “residual” filter is applied to the input in parallel with the subband processing.
In yet another embodiment, the mode frequencies are used to design subband frequency ranges and associated downsampling factors so as to reduce the computation needed to implement the modal filter.
In a further embodiment, the downsampling and subband frequencies are designed to accommodate the increased bandwidth from pitch processing. In another embodiment, distortion products or frequency shifted components initially processed in one subband are added into other subbands, preferably at the same downsampling factor.
These and other aspects and features of the present embodiments will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments in conjunction with the accompanying figures, wherein:
The present embodiments will now be described in detail with reference to the drawings, which are provided as illustrative examples of the embodiments so as to enable those skilled in the art to practice the embodiments and alternatives apparent to those skilled in the art. Notably, the figures and examples below are not meant to limit the scope of the present embodiments to a single embodiment, but other embodiments are possible by way of interchange of some or all of the described or illustrated elements. Moreover, where certain elements of the present embodiments can be partially or fully implemented using known components, only those portions of such known components that are necessary for an understanding of the present embodiments will be described, and detailed descriptions of other portions of such known components will be omitted so as not to obscure the present embodiments. Embodiments described as being implemented in software should not be limited thereto, but can include embodiments implemented in hardware, or combinations of software and hardware, and vice-versa, as will be apparent to those skilled in the art, unless otherwise specified herein. In the present specification, an embodiment showing a singular component should not be considered limiting; rather, the present disclosure is intended to encompass other embodiments including a plurality of the same component, and vice-versa, unless explicitly stated otherwise herein. Moreover, applicants do not intend for any term in the specification or claims to be ascribed an uncommon or special meaning unless explicitly set forth as such. Further, the present embodiments encompass present and future known equivalents to the known components referred to herein by way of illustration.
According to certain aspects, the present embodiments build upon and extend principles of the modal reverberator architecture described in U.S. Pat. Nos. 10,262,645 and 10,019,980, the contents of which are incorporated herein by reference in their entireties.
The modal reverberator architecture is shown in
The system output y(t) in response to an input x(t), that is the convolution
y(t)=h(t)*x(t), (2)
is then seen to be the sum of mode outputs
$y(t) = \sum_{m=1}^{M} y_m(t), \qquad y_m(t) = h_m(t) * x(t)$, (3)
where the mth mode output ym(t) is the mth mode response convolved with the input. The modal reverberator 200 simply implements this parallel combination by 204 of mode responses 202 determined by (3), as shown in
Typically, the mode responses hm(t) are complex exponentials, each characterized by a mode frequency ωm, mode damping αm and complex mode amplitude γm,
$h_m(t) = \gamma_m \exp\{(j\omega_m - \alpha_m)\,t\}$. (4)
The mode frequencies and dampings are properties of the room or object. They describe, respectively, the various mode oscillation frequencies and decay times. The mode amplitudes are determined by the sound source and listener positions (e.g., driver and pick-up positions for an electro-mechanical device, or loudspeaker and microphone positions for a room), according to the mode spatial patterns.
Note that even for short reverberation times of just a few hundred milliseconds, the mode responses hm(t) are very resonant, and last many thousands of samples at typical audio sampling rates. Example transfer function magnitudes are shown in
Such strongly resonant systems present numerical implementation challenges, which standard biquad forms might not overcome. One method with good numerics is the phasor filter (see, e.g., M. Mathews, J. O. Smith III, “Methods for synthesizing very high Q parametrically well behaved two pole filters,” in Stockholm Musical Acoustics Conference (SMAC), Stockholm, Sweden, Aug. 6-9, 2003 and D. Massie, “Coefficient interpolation for the Max Mathews phasor filter,” in Audio Engineering Society Convention, San Francisco, Calif., Oct. 26-29, 2012, vol. 133), in which each mode filter is implemented as a complex first-order update,
$y_m(t) = \gamma_m\, x(t) + e^{(j\omega_m - \alpha_m)}\, y_m(t-1)$. (5)
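By way of non-limiting illustration, the complex first-order update may be sketched as follows. The function names, the conversion from a 60 dB decay time to a per-sample damping, and the numerical values are assumptions introduced for this example; the recursion y_m(t) = γm x(t) + exp(jωm − αm) y_m(t−1) is assumed so that the impulse response matches (4).

```python
import cmath
import math

def phasor_filter(x, gamma, omega, alpha):
    """Run one mode filter: y[t] = gamma * x[t] + exp(j*omega - alpha) * y[t-1]."""
    pole = cmath.exp(complex(-alpha, omega))  # complex pole e^{(j*omega - alpha)}
    y, state = [], 0j
    for sample in x:
        state = gamma * sample + pole * state
        y.append(state)
    return y

def t60_to_damping(t60_seconds, fs):
    """Per-sample damping alpha giving a 60 dB decay over t60_seconds (assumed convention)."""
    return math.log(1000.0) / (t60_seconds * fs)

fs = 48000.0                                # sampling rate in Hz (example value)
omega = 2.0 * math.pi * 440.0 / fs          # mode frequency in radians per sample
alpha = t60_to_damping(0.5, fs)             # hypothetical 0.5 s decay time
impulse = [1.0] + [0.0] * 7
response = phasor_filter(impulse, 1.0 + 0j, omega, alpha)
```

Driving the filter with a unit impulse reproduces the complex exponential of (4) sample by sample, which is one way to check a given implementation.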
Another approach is to implement the mode filtering as shown in
$y_m(t) = e^{j\omega_m t}\left(\gamma_m e^{-\alpha_m t} * \left[e^{-j\omega_m t}\, x(t)\right]\right)$. (6)
The heterodyning and modulation steps implement the mode frequency, and the smoothing filter generates the mode envelope, in this case an exponential decay.
As described in [3], if the modulator $e^{j\omega_m t}$ were replaced with one at a different frequency, $e^{j v_m t}$, the mode filter output would be frequency shifted. Similarly, by resampling the term in parentheses in (6), the filter output will be time stretched. Also, by applying a memoryless nonlinearity to or otherwise replacing the modulator, the filter output will be distorted.
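As a hedged sketch of the heterodyne, smooth, and modulate form, the following example heterodynes the input to baseband at the mode frequency, smooths it with a one-pole exponential decay, and modulates the result back up. The optional out_freq argument, the function names, and the test values are assumptions of this example; setting out_freq away from the mode frequency illustrates the frequency shifting described above.

```python
import cmath
import math

def heterodyne_mode_filter(x, gamma, omega, alpha, out_freq=None):
    """Heterodyne by omega, smooth with gamma*exp(-alpha*t), modulate by out_freq."""
    if out_freq is None:
        out_freq = omega                      # no frequency shift by default
    decay = math.exp(-alpha)
    y, envelope = [], 0j
    for t, sample in enumerate(x):
        baseband = cmath.exp(-1j * omega * t) * sample    # heterodyne to baseband
        envelope = gamma * baseband + decay * envelope    # smoothing filter (mode envelope)
        y.append(cmath.exp(1j * out_freq * t) * envelope) # modulate to the output frequency
    return y

def phasor_mode_filter(x, gamma, omega, alpha):
    """Reference complex first-order update for comparison."""
    pole = cmath.exp(complex(-alpha, omega))
    y, state = [], 0j
    for sample in x:
        state = gamma * sample + pole * state
        y.append(state)
    return y

x = [1.0, 0.5, -0.25, 0.0, 0.75, 0.0, 0.0, 0.0]   # arbitrary test input
gamma, omega, alpha = 0.8 + 0.3j, 0.3, 0.01        # hypothetical mode parameters
unshifted = heterodyne_mode_filter(x, gamma, omega, alpha)
reference = phasor_mode_filter(x, gamma, omega, alpha)
shifted = heterodyne_mode_filter([1.0] + [0.0] * 7, gamma, omega, alpha, out_freq=0.45)
```

With out_freq left at the mode frequency, the two forms produce the same output; with a different out_freq, the impulse response keeps its envelope but oscillates at the new frequency.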
The present applicant recognizes that there are a number of settings in which it is desired to process signals in separate frequency subbands. For example, in audio data compression, high-frequency bands are allocated fewer bits than low-frequency bands, according to the sensitivity of human hearing. In the present embodiments, the modal filter is processed in subbands, with each mode of the system appearing in only a few, preferably just one or two, frequency bands. This is made possible since mode filters have such strong resonances. For a typical filterbank, energy at the mode output would appear in at most two adjacent bands. By separating the input signal into downsampled subbands, the cost of implementing the mode filters may be significantly reduced. This is because the computational cost of the recursive mode filters is proportional to the sampling rate, and therefore inversely proportional to the downsampling ratio. For instance, at a sampling rate of 48 kHz, a filterbank having 16 bands could produce more than a factor of eight reduction in computational cost.
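The cost argument can be illustrated with a rough count of recursive mode filter updates per second. This is a sketch only: the mode count, the even spread of modes across bands, and the per-band downsampling ratio of 8 are hypothetical values chosen for the example, and the cost of the band splitting and reconstruction filters is ignored.

```python
def mode_filter_cost(modes_per_band, downsampling_ratios, fs):
    """Mode filter updates per second when band n runs at fs / R_n."""
    return sum(count * fs / ratio
               for count, ratio in zip(modes_per_band, downsampling_ratios))

fs = 48000.0
modes_per_band = [125] * 16      # hypothetical: 2000 modes spread evenly over 16 bands
ratios = [8] * 16                # hypothetical downsampling ratio in every band

fullband = mode_filter_cost([sum(modes_per_band)], [1], fs)   # all modes at the full rate
subband = mode_filter_cost(modes_per_band, ratios, fs)        # modes at the subband rates
reduction = fullband / subband
```

Under these assumptions the reduction equals the common downsampling ratio; with unequal bands, the bands holding the most modes dominate the total.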
An example system according to embodiments is shown in
In this system, the modal filter
$h(t) = \sum_{m=1}^{M} h_m(t)$, (7)
is divided into N frequency subbands according to the band splitting filters bn(t). In the nth band, the filtered modal response gn(t) is
$g_n(t) = b_n(t) * h(t) = \sum_{m=1}^{M} b_n(t) * h_m(t)$. (8)
Assuming that each resonant mode filter hm(t) contains significant energy in only a few bands, and that the band splitting filter is approximately flat over its pass band, the band filtered modal response is approximately
$g_n(t) \approx \sum_{p} h_p(t)$, (9)
where the summation is over those mode filters having mode frequencies in the range
$[f_n^-, f_n^+]$
according to the corresponding frequency subband. Because each subband n occupies only a limited frequency range, the subband signal may be downsampled, for example by a factor of Rn, without loss of information. The downsampling ratio Rn is typically the largest integer less than the Nyquist limit fs/2 divided by the nth subband bandwidth. To avoid aliasing artifacts, while maintaining computational efficiency, that integer might be reduced. By processing the mode filters at the downsampled sampling rate, computational cost will be reduced by roughly the downsampling ratio Rn.
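A minimal sketch of the band assignment and downsampling ratio selection is given below. The band edges, mode frequencies, and function names are hypothetical, each mode is assigned to exactly one band (the transition band case of two adjacent bands is not handled), and the ratio follows the rule stated above literally, as the largest integer strictly less than the Nyquist limit divided by the band's width.

```python
import math

def downsampling_ratio(fs, f_lo, f_hi):
    """Largest integer strictly less than (fs / 2) / bandwidth, and at least 1."""
    limit = (fs / 2.0) / (f_hi - f_lo)
    return max(1, math.ceil(limit) - 1)

def assign_modes(mode_freqs_hz, band_edges_hz):
    """Map each band index to the indices of modes with f_lo <= f < f_hi."""
    num_bands = len(band_edges_hz) - 1
    bands = {n: [] for n in range(num_bands)}
    for m, f in enumerate(mode_freqs_hz):
        for n in range(num_bands):
            if band_edges_hz[n] <= f < band_edges_hz[n + 1]:
                bands[n].append(m)
                break
    return bands

fs = 48000.0
edges = [0.0, 1500.0, 3000.0, 6000.0, 12000.0, 24000.0]   # hypothetical band edges in Hz
modes = [110.0, 440.0, 1760.0, 5000.0, 9000.0]            # hypothetical mode frequencies in Hz
assignment = assign_modes(modes, edges)
ratios = [downsampling_ratio(fs, edges[n], edges[n + 1]) for n in range(len(edges) - 1)]
```

In practice the ratio might be reduced further to leave room for the transition bands, as noted above.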
The output of the system (ignoring aliasing effects from the downsampling and upsampling, which may be minimized by choice of bn(t) and cn(t)) is
$\hat{h}(t) = \sum_{n=1}^{N} c_n(t) * g_n(t)$, (10)
which approximates h(t). The approximation may be made more accurate by adjusting the mode amplitudes in each subband according to the band splitting and reconstruction filters, respectively, bn(t) and cn(t), to account for pass band ripple and other effects. For instance if the pass band through the system is down 1 dB at the mode frequency of a given mode filter, then its associated mode amplitude could be increased by 1 dB. Similarly, if multiple subbands contain a given mode filter, and their complex amplitudes don't sum to the desired complex amplitude (or magnitude, for instance) at the system output, one or more of the complex amplitudes could be adjusted to produce the desired sum.
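The amplitude adjustments described above may be sketched as follows. Both helper names are hypothetical, the filterbank gain at the mode frequency is assumed to be known in decibels, and the second helper applies one possible policy (a common complex scale factor) for modes carried by more than one subband.

```python
def compensate_mode_amplitude(gamma, passband_gain_db):
    """Scale gamma to undo the filterbank gain (in dB) at the mode frequency."""
    return gamma * 10.0 ** (-passband_gain_db / 20.0)

def rescale_shared_mode(gammas, target):
    """Scale the per-band amplitudes of one mode so that they sum to target."""
    total = sum(gammas)   # assumed nonzero in this sketch
    return [g * target / total for g in gammas]

gamma = 0.5 + 0.2j
adjusted = compensate_mode_amplitude(gamma, -1.0)   # pass band is down 1 dB at this mode

# A mode split across two adjacent bands whose amplitudes do not sum to the desired 0.6.
shared = rescale_shared_mode([0.3 + 0.1j, 0.25 - 0.05j], 0.6 + 0j)
```

A mode whose band is down 1 dB is raised by 1 dB, leaving its phase unchanged.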
The approximation may also be improved by including a parallel wideband filter 514, r(t),
$\hat{h}(t) = r(t) + \sum_{n=1}^{N} c_n(t) * g_n(t)$. (11)
This additional wideband filter would account for mode filter energy outside of their assigned subbands, as well as subtle differences in their assigned subbands. This “residual” filter may be designed using the difference between the desired modal response h(t) and the subband processing response $\hat{h}(t)$ as a target impulse response.
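As a simple illustration, the target impulse response for the residual filter may be formed as a sample-by-sample difference. The function name, the optional truncation length, and the short example sequences are assumptions of this sketch; in practice both responses would be measured or computed impulse responses of the desired modal filter and of the subband system.

```python
def residual_target(h, h_hat, length=None):
    """Target impulse response r = h - h_hat, zero padded and optionally truncated."""
    n = max(len(h), len(h_hat))
    h_pad = list(h) + [0.0] * (n - len(h))
    h_hat_pad = list(h_hat) + [0.0] * (n - len(h_hat))
    r = [a - b for a, b in zip(h_pad, h_hat_pad)]
    return r if length is None else r[:length]

h = [1.0, 0.5, 0.25, 0.125]        # hypothetical desired modal response
h_hat = [0.9, 0.5, 0.25]           # hypothetical subband processing response
r = residual_target(h, h_hat)
r_short = residual_target(h, h_hat, length=2)   # truncated target for a short FIR design
```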
It is instructive to further consider the residual filter r(t). As illustrated in
As noted above, the residual filter r(t) has little energy away from time t = 0, as is very often the case. Therefore, in implementing the residual filter, both FIR and IIR filter designs could provide efficient implementations. In particular, FFT-based methods can be computationally efficient for implementing the residual filter as an FIR filter in cases when IIR designs prove inefficient or numerically problematic.
A related architecture is shown in
The architecture of
The subband processing may be done using filtering and downsampling, as shown in
$\tilde{h}_p(t)$,
where the mode frequencies and dampings are adjusted according to the subband n downsampling ratio Rn,
$\tilde{h}_p(t) = R_n \gamma_p \exp\{(j\,\mathrm{rem}(R_n \omega_p, 2\pi) - R_n \alpha_p)\,\tilde{t}\}$, (12)
where rem( ) represents the remainder function, and $\tilde{t}$ is the downsampled sample number. Note that the scaling of the decay rate by the downsampling ratio is due to the downsampled samples covering a factor of Rn more time. Similarly, the modal frequency is increased by a factor of Rn, and aliased to the downsampled unit circle. The amplitude is likewise increased by the factor Rn, as the downsampling and upsampling reduce the amplitude of the subband signal by the downsampling ratio. After applying the subband mode filters, the signal is upsampled at 608, filtered at 610, and summed to form the output.
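The parameter adjustment of (12) can be sketched as a small helper. The function names and numerical values are assumptions of this example, and math.fmod stands in for rem( ). The check compares the downsampled mode response at sample number k with the full-rate response at sample Rn·k, scaled by Rn.

```python
import cmath
import math

def downsample_mode(gamma, omega, alpha, ratio):
    """Return (gamma, omega, alpha) for a mode filter run at fs / ratio, per (12)."""
    return (ratio * gamma,
            math.fmod(ratio * omega, 2.0 * math.pi),  # frequency aliased to the unit circle
            ratio * alpha)                            # each sample covers ratio times more time

def mode_response(gamma, omega, alpha, t):
    """Sample t of gamma * exp((j*omega - alpha) * t)."""
    return gamma * cmath.exp(complex(-alpha, omega) * t)

gamma, omega, alpha, ratio = 0.7 - 0.2j, 2.9, 0.002, 8   # hypothetical mode and ratio
gamma_ds, omega_ds, alpha_ds = downsample_mode(gamma, omega, alpha, ratio)
```

Because the downsampled sample number is an integer, wrapping the frequency by multiples of 2π leaves the sampled response unchanged.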
Note that the band filters bn(t) may be designed so that their transition bands overlap, while their pass bands don't overlap, as shown in
Similarly, the band filters bn(t) may be designed to have overlapping pass bands, as seen in the example of
For a given modal filter having a set of mode frequencies, and processed according to the subband processing of
The subband processing may also be done using heterodyning and modulation, as illustrated in
$\tilde{\omega}_p = \mathrm{rem}(R_n(\omega_p - \eta_n), 2\pi)$, (13)
as the original mode frequency ωp becomes ωp−ηn when it is heterodyned to baseband. The processed downsampled signal is then upsampled at 910, filtered at 912, and modulated at 914 to its output frequency band, typically by ηn, the same frequency used to heterodyne the band.
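The frequency adjustment of (13) may be checked numerically as sketched below. The names and values are hypothetical, math.fmod stands in for rem( ), and the low-pass filtering step is omitted because a single mode response is used as the test signal. The full-rate mode response is heterodyned by the band frequency, every Rn-th sample is kept, and the result is compared with a first-order response at the adjusted frequency and damping.

```python
import cmath
import math

def heterodyned_mode_frequency(omega, eta, ratio):
    """Adjusted mode frequency after heterodyning by eta and downsampling by ratio, per (13)."""
    return math.fmod(ratio * (omega - eta), 2.0 * math.pi)

gamma, omega, alpha = 1.0 + 0j, 1.2, 0.001   # hypothetical mode parameters
eta, ratio = 0.3, 10                         # hypothetical heterodyne frequency and ratio
omega_ds = heterodyned_mode_frequency(omega, eta, ratio)

# Heterodyne the full-rate mode response by eta, then keep every ratio-th sample.
full_rate = [gamma * cmath.exp(complex(-alpha, omega) * t) * cmath.exp(-1j * eta * t)
             for t in range(ratio * 8)]
decimated = full_rate[::ratio]

# The same samples generated directly at the downsampled rate.
subband = [gamma * cmath.exp(complex(-ratio * alpha, omega_ds) * k) for k in range(8)]
```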
Another approach to band processing using real-valued signals is shown in
As an example of the processing performed by the embodiment of
In certain implementations of the modal filter, a frequency shifting is implemented, in effect modulating the mode envelope by an output frequency vm that is different from the mode frequency ωm (e.g., [3]). In the context of subband processing, a group of modes that are shifted to higher frequencies will occupy a greater output bandwidth, as shown in
In the presence of distortion processing, additional partials are generated. Such additional partials would significantly increase the bandwidth of the subband output, and reduce the computational savings afforded by the subband architecture. In this case, as seen in
Note that this architecture is also applicable to any case in which a mode frequency is shifted out of its respective band. In such cases, the matrix M could be sparse, having only one nonzero entry for each mode filter p.
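One way to picture the sparse case is as a routing table with a single entry per mode filter. The sketch below is illustrative only: it assumes that M maps each mode output to one destination subband with a unit weight, chosen as the band containing the mode's shifted output frequency, and all names and values are hypothetical.

```python
def build_routing(shifted_freqs_hz, band_edges_hz):
    """One (band, weight) entry per mode: the band containing its shifted frequency."""
    routing = {}
    for p, f in enumerate(shifted_freqs_hz):
        for n in range(len(band_edges_hz) - 1):
            if band_edges_hz[n] <= f < band_edges_hz[n + 1]:
                routing[p] = (n, 1.0)
                break
    return routing

def route(mode_outputs, routing, num_bands):
    """Add each mode output sample into its destination band."""
    bands = [0j] * num_bands
    for p, (n, weight) in routing.items():
        bands[n] += weight * mode_outputs[p]
    return bands

edges = [0.0, 1500.0, 3000.0, 6000.0]        # hypothetical band edges in Hz
shifted = [200.0, 1600.0, 1700.0]            # hypothetical shifted mode frequencies in Hz
routing = build_routing(shifted, edges)
bands = route([1 + 0j, 2j, 3 + 0j], routing, len(edges) - 1)
```

Storing only the nonzero entries keeps the routing cost proportional to the number of modes rather than to the number of modes times the number of bands.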
While the processor of
Although the present embodiments have been particularly described with reference to preferred examples thereof, it should be readily apparent to those of ordinary skill in the art that changes and modifications in the form and details may be made without departing from the spirit and scope of the present disclosure. It is intended that the appended claims encompass such changes and modifications.
The present application is a continuation of U.S. patent application Ser. No. 17/087,407, filed Nov. 2, 2020, now U.S. Pat. No. 11,488,574, which application is a continuation-in-part of U.S. patent application Ser. No. 16/432,866 filed Jun. 5, 2019, now U.S. Pat. No. 10,825,443, which application is a continuation-in-part of U.S. patent application Ser. No. 16/384,266 filed Apr. 15, 2019, which application is a continuation of U.S. patent application Ser. No. 15/796,327 filed Oct. 27, 2017, now U.S. Pat. No. 10,262,645, which application is a continuation of U.S. patent application Ser. No. 14/558,531, filed Dec. 2, 2014, now U.S. Pat. No. 9,805,704, which application claims priority to U.S. Provisional Patent Application Nos. 62/061,219 filed Oct. 8, 2014, 61/913,093 filed Dec. 6, 2013 and 61/910,548 filed Dec. 2, 2013. U.S. patent application Ser. No. 16/432,866 is also a continuation-in-part of U.S. patent application Ser. No. 16/030,789 filed Jul. 9, 2018, which application is a continuation of U.S. patent application Ser. No. 15/201,013 filed Jul. 1, 2016, now U.S. Pat. No. 10,019,980, which application claims priority to U.S. Provisional Patent Application No. 62/188,299 filed Jul. 2, 2015. U.S. patent application Ser. No. 16/432,866 also claims priority to U.S. Provisional Patent Application No. 62/732,574 filed Sep. 17, 2018. The contents of all the above applications are incorporated herein by reference in their entirety.
Provisional applications:

Number | Date | Country
---|---|---
62732574 | Sep 2018 | US
62188299 | Jul 2015 | US
61910548 | Dec 2013 | US
61913093 | Dec 2013 | US
62061219 | Oct 2014 | US

Continuations:

Relation | Number | Date | Country
---|---|---|---
Parent | 17087407 | Nov 2020 | US
Child | 17978900 | | US
Parent | 15201013 | Jul 2016 | US
Child | 16030789 | | US
Parent | 15796327 | Oct 2017 | US
Child | 16384266 | | US
Parent | 14558531 | Dec 2014 | US
Child | 15796327 | | US

Continuations-in-part:

Relation | Number | Date | Country
---|---|---|---
Parent | 16432866 | Jun 2019 | US
Child | 17087407 | | US
Parent | 16030789 | Jul 2018 | US
Child | 16432866 | | US
Parent | 16384266 | Apr 2019 | US
Child | 16432866 | | US