The proposed technology generally relates to a method and system for determining filter coefficients of an audio precompensation controller for the compensation of an associated sound system, a corresponding computer program and carrier for a computer program, and an apparatus for determining filter coefficients of an audio precompensation controller, a corresponding audio precompensation controller, and an audio system comprising a sound system and an improved audio precompensation controller in the input path to the sound system, as well as a digital audio signal.
Multichannel sound reproduction systems, including amplifiers, cables, loudspeakers and room acoustics, will always affect the spectral, transient and spatial properties of the reproduced sound, typically in unwanted ways. Whereas the technical quality of the components, such as amplifiers and loudspeakers, can generally be assumed to be high nowadays, sound reproduction nevertheless suffers from sound quality degradation for multiple reasons. Some of them will be discussed in the following.
First, the acoustic reverberation of the room where the equipment is placed has a considerable and often detrimental effect on the perceived audio quality of the system. The effect of reverberation is often described differently depending on which frequency region is considered. At low frequencies, reverberation is often described in terms of resonances, standing waves, or so-called room modes, which affect the reproduced sound by introducing strong peaks and deep nulls at distinct frequencies in the low end of the spectrum. At higher frequencies, reverberation is generally thought of as reflections arriving at the listener's ears some time after the direct sound from the loudspeaker itself. Reverberation at high frequencies cannot be generally assumed to have a detrimental effect on sound quality. Nevertheless, reverberation definitely has an effect on timbral and spatial sound reproduction.
Second, established loudspeaker-based multichannel sound reproduction standards, such as stereo or 5.1 surround (e.g., home cinema systems), generally assume a symmetric setup of the sound system. It is assumed that multichannel signals, which are in some way coded in the recording, are reproduced via loudspeakers that are placed at defined angles and distances from the listener. Such a symmetric setup is usually found in, for example, professional recording studios. In reality, however, such a symmetric setup is unrealistic in typical listening environments such as consumer homes. In these environments, other factors, such as the furniture, determine the location of the loudspeakers and the listener, rather than the placement suggested in the standards. This leads to impaired sound reproduction and consequently degraded sound quality.
Third, these multichannel standards generally assume one listener, who is located in a certain position, often referred to as the sweet spot. This sweet spot is typically rather small and corresponds to a limited region in space. Yet high-fidelity sound reproduction, that is, sound reproduction with a high degree of accuracy and trueness with respect to the recording, is only provided in the sweet spot. Outside this limited region, sound reproduction is severely deteriorated. This also yields impaired sound quality for one or several listeners who are located outside the given sweet spot.
Last, sound reproduction by means of multiple loudspeakers always has a conceptual problem of identity. Exact reproduction of recorded sound by means of multiple loudspeakers in other than the genuine recording environment must be considered an impossible task. This is particularly valid for the spatial aspects of multichannel sound reproduction, which will always correspond to an approximation of the recorded sound field rather than true (high fidelity) reproduction of it. Therefore, sound quality also depends on human expectation and experience with regard to the presented multichannel method. Whereas sound reproduction may not be accurate in many cases, it may still be plausible to the listener, and thus perceived as proper spatial sound reproduction. Therefore, the fidelity of sound reproduction can generally be improved relative to a given listening situation.
It is an object to provide an improved method of determining filter coefficients of an audio precompensation controller for the compensation of an associated sound system.
It is another object to provide a system configured to determine filter coefficients of an audio precompensation controller for the compensation of an associated sound system.
It is also an object to provide a computer program for determining, when executed by a processor, filter coefficients of an audio precompensation controller.
Yet another object is to provide a carrier comprising a computer program.
Still another object is to provide an apparatus for determining filter coefficients of an audio precompensation controller.
It is also an object to provide an improved audio precompensation controller.
Yet another object is to provide an audio system comprising a sound system and an improved audio precompensation controller in the input path to the sound system.
It is a further object to enable generation of a digital audio signal by an improved audio precompensation controller.
These and other objects are met by embodiments of the proposed technology.
According to a first aspect, there is provided a method for determining filter coefficients of an audio precompensation controller for the compensation of an associated sound system, comprising N≧2 loudspeakers. The method comprises the following steps:
In this way, an audio precompensation controller for an associated sound system can be obtained that enables improved and/or customized sound reproduction in two or more listening zones simultaneously.
By way of example, by using zone-dependent target transfer functions, the sound reproduction can be made similar in the different zones, depending on the listening environment, or at least partly individualized or customized.
According to a second aspect, there is provided a system configured to determine filter coefficients of an audio precompensation controller for the compensation of an associated sound system. The sound system comprises N≧2 loudspeakers. The system is configured to estimate, for each one of at least a pair of the loudspeakers, a model transfer function at each of a plurality M of control points distributed in Z≧2 spatially separated listening zones in a listening environment of the sound system based on a model of acoustic properties of the listening environment. The system is also configured to determine, for each of the M control points, a zone-dependent target transfer function, at least based on the zone affiliation of the control point and the model of acoustic properties. The system is further configured to determine the filter coefficients of the audio precompensation controller at least based on the model transfer functions and the target transfer functions of the M control points.
According to a third aspect, there is provided a computer program for determining, when executed by a processor, filter coefficients of an audio precompensation controller for the compensation of an associated sound system, comprising N≧2 loudspeakers. The computer program comprises instructions, which when executed by the processor cause the processor to:
According to a fourth aspect, there is provided a carrier comprising the computer program of the third aspect.
According to a fifth aspect, there is provided an apparatus for determining filter coefficients of an audio precompensation controller for the compensation of an associated sound system, comprising N≧2 loudspeakers. The apparatus comprises an estimating module for estimating, for each one of at least a pair of the loudspeakers, a model transfer function at each of a plurality M of control points distributed in Z≧2 spatially separated listening zones in a listening environment of the sound system. The apparatus also comprises a defining module for defining, for each of the M control points, a zone-dependent target transfer function at least based on the zone affiliation of the control point. The apparatus further comprises a determining module for determining the filter coefficients of the audio precompensation controller at least based on the model transfer functions and the target transfer functions of the M control points.
According to a sixth aspect, there is provided an audio precompensation controller determined by using the method of the first aspect.
According to a seventh aspect, there is provided an audio system comprising a sound system and an audio precompensation controller in the input path to the sound system.
According to an eighth aspect, there is provided a digital audio signal generated by an audio precompensation controller determined by using the method of the first aspect.
Other advantages will be appreciated when reading the detailed description.
The embodiments, together with further objects and advantages thereof, may best be understood by making reference to the following description taken together with the accompanying drawings, in which:
Throughout the drawings, the same reference designations are used for similar or corresponding elements.
For a better understanding of the proposed technology, it may be useful to begin with a brief overview of an example of an embodied sound system and precompensation controller with reference to
Standardized multichannel audio systems, such as stereo or 5.1 surround, which are represented by the sound system shown in
For a single listener, the sweet spot can be placed at different locations, for example, by the use of appropriate delay and gain adjustments to the loudspeaker channels. Traditionally, the sweet spot is positioned in a location equidistant to the loudspeakers at a certain distance and height, see the grey seat in
In multichannel audio systems, virtual sound sources are created by multiple loudspeakers simultaneously radiating sound. In a traditional stereo setup, two loudspeakers are placed equidistant in front of the listener, with an angle of typically 30° to the left and right of the listener, see
Outside the sweet spot, the intensity and time of arrival differences at the listener's ears differ from those in the sweet spot, resulting in differently perceived virtual sources. In listening positions outside the sweet spot, the precedence effect causes a shift of the sound image towards the nearest loudspeaker [5]. However, multichannel audio productions are produced with one listener in mind who is sitting in the sweet spot. Hence, spatial reproduction of multichannel sound is severely deteriorated in other listening positions than the sweet spot and spatial fidelity in several listening positions is in general not attainable by the use of standard multichannel audio systems.
In the following we shall discuss the challenge of using standard multichannel audio systems as a means for creating multiple sweet spots, which are separated in space and subject to spatial fidelity. Numerous attempts to solve this challenge have been reported in the literature, and we will discuss a number of them next.
In automotive audio systems, a dashboard center speaker is frequently used to create spatial fidelity, especially in the two front seats, see, e.g., [11][17][23]. However, placing a loudspeaker in the center of the dashboard is rather costly and it is sometimes also infeasible due to space constraints. Nowadays, the majority of all standard automotive sound systems are four-channel systems without a center speaker.
Passive solutions, such as loudspeaker placement, or controlling reflections and loudspeaker radiation, are proposed in the literature, see, e.g., [11][16][33]. These solutions are however limited to higher frequencies, above the important frequency range where voices and the fundamentals of many instruments lie, but they may serve as complementary means to improve spatial fidelity in multiple zones.
Binaural systems constitute another proposed solution. In [4], transaural stereo is discussed as a means to generate desired sound fields at the listeners' ears. Transaural stereo is a signal processing technique that precisely controls the sound field at the listeners' ears based on cross-talk cancellation. Several scenarios with different combinations of loudspeakers, inputs and listeners (ears) are discussed. It is argued that, in general, the reproduction of virtual sources with transaural stereo is potentially superior to standard stereo. Example solutions are derived for most scenarios. However, the case with two loudspeakers, two inputs, and two listeners (four ears) is not discussed. According to [4], the system is in this case overdetermined and an exact solution does not exist. Further, including room correction in the design is subject to enormous complications, and some of the signal processing techniques bring potential problems due to, among other things, non-causal filters and unstable head-related transfer functions.
Further, sound field synthesis techniques are an option to create multiple sweet spots. In [25], an Ambisonics system approach to multiple off-center listeners is presented. The usage of wave-field synthesis (WFS), vector-based amplitude panning and Ambisonics is described in [15][28]. In general however, Ambisonics solutions require several loudspeakers positioned in a circular, or spherical, layout around the listeners. WFS approaches also require a small spacing between the involved loudspeakers, and thus high numbers of loudspeakers are required. These approaches have therefore, so far, been of limited use for many applications.
A related method is proposed in [7], where sound field control is proposed for multiple listening regions. The basic idea is to make the listeners in different positions of, e.g., a car compartment perceive a sound field similar to what would have been the case if the listeners had been sitting in an ordinary listening room. The method presented in [7] is thus a matter of creating virtual sound sources. If, for example, stereo or surround source material is to be presented in a car compartment, then the method presented in [7] transforms the sound field so that all the listeners in the car compartment perceive the same sound experience as if they were sitting in an ordinary living room with a stereo or surround set-up. While this sound field transformation is excellent for the posed problem, it does not consider the situation where all listeners experience personal stereo or surround in all listening zones simultaneously. As is the case when listening to, e.g., uncompensated stereo or surround in an ordinary car compartment, the sound field transformation is still subject to the precedence effect. In other words, even though the sound field transformation proposed in [7] gives a living room experience, it does not solve the problem of creating several sweet spots. The novel methodology proposed in this invention solves this problem. A related solution is suggested in [20].
Another approach is to control the sound field, generated by two loudspeakers in two listening positions, by controlling delay as a function of frequency, aiming at the theoretically optimal inter-loudspeaker differential phase (IDP) in two separated zones, see, e.g., [12][24][27][29][30] or other methods related to adjusting the phase responses [10][19]. Based on the relative propagation delay difference, given by the distance between the two loudspeakers to the center of each zone, the resulting IDP in each zone can be determined. The uncompensated IDP in both zones is zero at 0 Hz and varies between ±180° for increasing frequencies. The IDP in the first zone is hereby inverse to the IDP in the second zone. For an example see the grey lines in the bottom diagram of
Allpass filters can be used to compensate the IDP in each zone such that the compensated IDP is mainly in phase for all frequencies, i.e., that the compensated system has a maximum relative phase difference of ±90° in both zones, see the black lines in
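The delay-based reasoning above can be illustrated numerically. The following is a minimal sketch, assuming a free-field model in which only the path lengths matter, a speed of sound of 343 m/s, and hypothetical distances; the function names are illustrative and the 180° shifts are idealized as instantaneous.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s


def wrap_deg(phase_deg):
    """Wrap a phase in degrees to the interval [-180, 180)."""
    return (phase_deg + 180.0) % 360.0 - 180.0


def idp_deg(freqs_hz, dist_s1_m, dist_s2_m):
    """Uncompensated IDP (degrees) at a zone center, from path lengths only."""
    tau = (dist_s1_m - dist_s2_m) / C  # relative propagation delay in seconds
    return wrap_deg(-360.0 * freqs_hz * tau)


freqs = np.linspace(0.0, 2000.0, 401)
# Hypothetical symmetric geometry: zone 1 is closer to s1, zone 2 closer to s2.
idp_zone1 = idp_deg(freqs, 1.0, 1.4)
idp_zone2 = idp_deg(freqs, 1.4, 1.0)

# Apply an idealized 180-degree shift wherever the IDP is mainly out of phase.
# One mask serves both zones because the two IDPs are mirror images.
mask = np.abs(idp_zone1) > 90.0
comp_zone1 = wrap_deg(idp_zone1 + 180.0 * mask)
comp_zone2 = wrap_deg(idp_zone2 + 180.0 * mask)
```

In this idealized setting the compensated IDP stays within ±90° in both zones, which is the behavior the allpass approach aims for.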
According to a first aspect, there is provided a method for determining filter coefficients of an audio precompensation controller for the compensation of an associated sound system, comprising N≧2 loudspeakers. With reference to
S1: estimating, for each one of at least a pair of the loudspeakers, a model transfer function at each of a plurality M of control points distributed in Z≧2 spatially separated listening zones in a listening environment of the sound system.
For example, a model transfer function can here be any model, which is represented in transfer function form, representing the sound propagation from a loudspeaker to a measurement point, or a so-called control point. A listening zone comprises a subset of the M control points and can be located anywhere in the listening environment.
S2: determining, for each of the M control points, a zone-dependent target transfer function at least based on the zone affiliation of the control point.
By way of example, the target transfer function is a description of the desired behaviour of the received sound in each of the M control points. The targets may be set differently for the different control points belonging to, or affiliated with, the different zones.
S3: determining the filter coefficients of the audio precompensation controller at least based on the model transfer functions and the target transfer functions of the M control points.
For example, the filter coefficients of the precompensation controller, which determine the precompensation controller's characteristics, can be adjustable parameters of any filter structure, e.g., a Finite Impulse Response (FIR) or an Infinite Impulse Response (IIR) filter.
In this way, an audio precompensation controller for an associated sound system can be obtained that enables improved and/or customized sound reproduction in two or more listening zones simultaneously.
By way of example, by using zone-dependent target transfer functions, the sound reproduction can be made similar in the different zones, depending on the listening environment, or at least partly individualized or customized.
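Steps S1 to S3 can be sketched end to end as follows. This is only an illustration under stated assumptions: the model impulse responses are random stand-in data rather than measurements, the zone-dependent targets are delayed pulses with hypothetical delays, and S3 is replaced by a per-frequency regularized least-squares fit. That fit is a swapped-in simplification that does not enforce causality or stability of the controller; the function name `design_filters` and the parameter `reg` are illustrative.

```python
import numpy as np


def design_filters(h, d, reg=1e-3):
    """Per-frequency regularized least squares (a simplified stand-in for S3).

    h: model impulse responses, shape (M, N, taps)
    d: target impulse responses, shape (M, taps_d)
    Returns one FIR filter per loudspeaker, shape (N, n_fft).
    """
    n_spk = h.shape[1]
    n_fft = 4 * max(h.shape[2], d.shape[1])  # even length, zero-padded
    H = np.fft.rfft(h, n_fft, axis=2)        # (M, N, bins)
    D = np.fft.rfft(d, n_fft, axis=1)        # (M, bins)
    R = np.zeros((n_spk, H.shape[2]), dtype=complex)
    for k in range(H.shape[2]):
        Hk = H[:, :, k]
        lhs = Hk.conj().T @ Hk + reg * np.eye(n_spk)
        R[:, k] = np.linalg.solve(lhs, Hk.conj().T @ D[:, k])
    return np.fft.irfft(R, n_fft, axis=1)


rng = np.random.default_rng(1)
M, N, taps = 4, 2, 32
zone_of_point = np.array([0, 0, 1, 1])  # Z = 2 zones, two control points each

# S1: model transfer functions (random decaying stand-in data).
h = rng.standard_normal((M, N, taps)) * np.exp(-0.2 * np.arange(taps))

# S2: zone-dependent targets, chosen by the zone affiliation of each point.
d = np.zeros((M, taps))
d[zone_of_point == 0, 8] = 1.0   # zone 0: pulse delayed by 8 samples
d[zone_of_point == 1, 10] = 1.0  # zone 1: pulse delayed by 10 samples

# S3: filter coefficients from the models and the targets.
filters = design_filters(h, d)
```

The point of the sketch is the data flow: the targets are indexed by zone affiliation, and the filter computation sees all M control points of all zones at once.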
Normally, the listening zones correspond to different human listening positions.
As an example, an objective may be to create similar sound fields in several, spatially separated, listening zones, where at least one of the zones is located outside the traditional so-called sweet spot. It may, for example, be desirable to obtain spatial and timbral fidelity in all zones simultaneously, regardless of their location. This can neither be achieved by standard multichannel sound systems such as, for example, stereo or 5.1 surround, nor can it be obtained for realistic listening environments and a reasonable number of loudspeakers by any method proposed in the literature. In standard multichannel systems, proper sound reproduction with high fidelity is only provided in a single, well-defined listening position: the sweet spot.
The concept of setting a zone-dependent target which differs between zones may for example be used to provide one or more of the following features:
As a non-limiting example, the filter coefficients may be determined based on optimizing a criterion function, where the criterion function at least comprises a target error related to the model transfer functions and the target transfer functions and optionally also differences between representations of compensated model transfer functions of at least a pair of the loudspeakers.
For example, the model transfer functions and the target transfer functions may be representing impulse responses at the considered control points.
It should be understood that the proposed technology may be applied to more than two listening zones, i.e., Z≧3.
Similarly, the proposed technology may be applied to more than two loudspeakers, i.e., N≧3. In this scenario, the proposed technology may, for example, be applied to the loudspeakers in a pair-wise manner, or by simply considering a pair of loudspeakers and using the remaining loudspeaker(s) as optional support loudspeaker(s).
If the optional support loudspeakers are to be used with the current method, then the method comprises the following optional steps of:
Furthermore, it should be understood that the L≧1 input signals may be created by upmixing or downmixing source signals to match a desired sound recording standard. By way of example, a single mono source signal may be upmixed to, e.g., stereo (L=2) or surround 5.1 (L=6). Similarly, a 7.1 surround source signal can be downmixed to stereo (L=2) or 5.1 (L=6) surround. It is furthermore obvious for a person skilled in the art that a single mono source signal can be used as an input signal (L=1), subsequently compensated and fed to at least a pair of loudspeakers.
In a particular example, the model transfer functions are acoustically unsymmetrical for both symmetrical and unsymmetrical setups in relation to the position of the loudspeakers and the listening zones. As an example, a symmetric car sound system setup with two loudspeakers and two control points, one in each listening zone, is shown in
Optionally, the target transfer function in each control point is determined based on phase differences between at least a pair of said loudspeakers in the control point. The phase differences are, for example, defined by the model transfer function in the control point, and the phase characteristics of the zone-dependent target transfer functions typically differ between control points affiliated with different listening zones.
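The phase difference between a loudspeaker pair in a control point can be read from the two model impulse responses. The sketch below assumes the sign convention "phase of the first loudspeaker minus phase of the second" and uses two hypothetical impulse responses that differ by one sample of delay; the function name is illustrative.

```python
import numpy as np


def idp_from_impulse_responses(h1, h2, n_fft=1024):
    """Inter-loudspeaker phase difference (degrees) per frequency bin."""
    H1 = np.fft.rfft(h1, n_fft)
    H2 = np.fft.rfft(h2, n_fft)
    # angle(H1 * conj(H2)) equals phase(H1) - phase(H2), wrapped to +/-180.
    return np.degrees(np.angle(H1 * np.conj(H2)))


h_s1 = np.array([1.0, 0.0, 0.0])  # hypothetical response of loudspeaker s1
h_s2 = np.array([0.0, 1.0, 0.0])  # same pulse, arriving one sample later
idp = idp_from_impulse_responses(h_s1, h_s2)
```

With a one-sample delay difference the phase difference grows linearly with frequency and reaches 90° at a quarter of the sampling rate.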
In one example, the step of estimating a model transfer function at each of a plurality M of control points may be based on estimating an impulse response at each of the control points, based on sound measurements of the sound system.
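As an illustration of measurement-based estimation, an FIR model can be fitted by least squares from a test signal and the recorded response. The sketch below assumes a noise-free recording and a hypothetical true response `h_true`; the helper name `estimate_fir` is illustrative.

```python
import numpy as np


def estimate_fir(u, y, n_taps):
    """Least-squares FIR estimate h such that y is approximately u convolved with h."""
    n = len(u)
    # Column j of the regression matrix holds the test signal delayed by j samples.
    U = np.column_stack(
        [np.concatenate([np.zeros(j), u[: n - j]]) for j in range(n_taps)]
    )
    h, *_ = np.linalg.lstsq(U, y[:n], rcond=None)
    return h


rng = np.random.default_rng(0)
u = rng.standard_normal(2000)                   # white-noise test signal
h_true = np.array([0.0, 0.0, 1.0, 0.5, -0.25])  # hypothetical response
y = np.convolve(u, h_true)[: len(u)]            # simulated noise-free recording
h_est = estimate_fir(u, y, n_taps=8)
```

In practice the procedure would be repeated per loudspeaker and per control point, and measurement noise would make the fit approximate rather than exact.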
In another example, the step of estimating a model transfer function at each of a plurality M of control points may be based on simulation of an impulse response at each of the control points, wherein the simulation includes first order reflections and/or higher order reflections.
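A simulation with first-order reflections can be sketched with the image-source idea for a rectangular room. This is an assumption-laden illustration: the room is a shoebox, all walls share one frequency-independent reflection coefficient `beta`, delays are rounded to the nearest sample, and the geometry is hypothetical.

```python
import numpy as np


def first_order_ir(src, mic, room, fs=48000, c=343.0, beta=0.7, n_taps=2048):
    """Direct sound plus the six first-order wall reflections of a shoebox room."""
    src, mic, room = (np.asarray(v, dtype=float) for v in (src, mic, room))
    images = [(src, 1.0)]  # (position, gain) of the direct path
    for axis in range(3):
        for wall in (0.0, room[axis]):
            img = src.copy()
            img[axis] = 2.0 * wall - src[axis]  # mirror the source in this wall
            images.append((img, beta))
    h = np.zeros(n_taps)
    for pos, gain in images:
        dist = np.linalg.norm(pos - mic)
        tap = int(round(dist / c * fs))  # nearest-sample propagation delay
        if tap < n_taps:
            h[tap] += gain / dist        # spherical spreading loss
    return h


src = (1.0, 1.0, 1.0)    # hypothetical loudspeaker position in meters
mic = (3.0, 2.0, 1.2)    # hypothetical control point
room = (5.0, 4.0, 2.5)   # hypothetical room dimensions
h_sim = first_order_ir(src, mic, room)
```

Higher-order reflections would be added by mirroring the image sources again, at the cost of a rapidly growing number of paths.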
In a particular example, the filter coefficients may be determined based on optimizing a criterion function under the constraint of stability of the dynamics of the audio precompensation controller. For example, the criterion function may include at least a weighted summation of powers of differences between compensated model impulse responses and target impulse responses over said M control points, and optionally a weighted summation of powers of differences between representations of compensated model transfer functions of at least a pair of the loudspeakers.
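A criterion of this kind can be written down as a small function. The sketch below assumes squared differences as the "powers", per-control-point weights `v`, and a scalar similarity weight `rho`; these names and the exact weighting are illustrative assumptions, not the criterion of the proposed design.

```python
import numpy as np


def criterion(y, d, v=None, y_pair=None, rho=0.0):
    """Weighted target error plus an optional pairwise similarity penalty.

    y, d:   compensated model and target impulse responses, shape (M, taps)
    v:      optional weights, one per control point
    y_pair: optional tuple (y1, y2) of compensated responses of a loudspeaker pair
    """
    y, d = np.asarray(y, dtype=float), np.asarray(d, dtype=float)
    v = np.ones(y.shape[0]) if v is None else np.asarray(v, dtype=float)
    cost = np.sum(v[:, None] * (y - d) ** 2)  # target error term
    if y_pair is not None:
        y1, y2 = (np.asarray(a, dtype=float) for a in y_pair)
        cost += rho ** 2 * np.sum((y1 - y2) ** 2)  # similarity term
    return float(cost)
```

Setting `rho` to zero, or omitting `y_pair`, reduces the criterion to the target error alone.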
If such an optional similarity requirement is to be used with the current method, then the method comprises the following optional steps of:
Optionally, the method may further comprise the step of merging the filter coefficients, determined for the Z listening zones, into a merged set of filter parameters for the audio precompensation controller.
In the following, the proposed technology will be described with reference to non-limiting examples of a filter design based on equally non-limiting examples of a model framework.
The objective in the following non-limiting example is to simultaneously create a true personal multichannel audio experience in multiple listening zones. The different zones are spatially separated and at least one of them is located outside the default sweet spot. In the example we suggest the use of a general filter design framework based on MIMO feedforward control with three basic features: (1) Pairwise channel similarity equalization; (2) Possible use of support loudspeakers; (3) Equalization of the model transfer functions to the respective zones based on target transfer functions, which are individually selected for each control point. The characteristics of the phase responses of these target transfer functions differ between zones. The magnitude responses of the target transfer functions are not restricted.
If one only considers phase differences due to the distances from two loudspeakers to the center of two zones, then one already knows the answer to the current problem description. As discussed above, certain allpass filters provide means to equalize the system such that the compensated IDP is predominantly in phase in both zones for a defined range of frequencies. The design of such allpass filters is fairly straightforward. Based on the assumption that the system is entirely described by the distances between the loudspeakers to the center of each zone, the system's behavior can be described by comb filters. Such a comb filter has dips at equally spaced frequencies, where the IDP is maximum, i.e., ±180°, and peaks at equally spaced frequencies where the IDP is completely in phase, i.e., 0°, see
Based on this assumption, the desired allpass filters can be readily designed [12][24][27][29][30]. The basic idea is to apply a 180° phase shift at frequencies where the inter-loudspeaker differential phase (IDP) in a zone is mainly out of phase, i.e., |IDP|>90°. A sound system with two loudspeakers, s1 and s2, and two control points, r1 and r2, one in each of the two zones, is illustrated in
When considering measured RTFs in typical listening environments, the IDP between two loudspeakers in a control point does not follow such systematic patterns, which are easy to determine. We shall clarify this by means of an example.
Another limitation of allpass filter methods lies in their design. Instantaneous phase shifts, as depicted by the black lines in the bottom diagram of
We shall in the following non-limiting example introduce the novel method, which is proposed, and highlight its advantages over previous work.
The acoustic signal path from loudspeaker input to microphone will by way of example be modeled as a linear time-invariant (LTI) system, which is fully described by its room transfer function (RTF). The room-acoustic impulse responses of each of N loudspeakers are estimated from measurements at M control points, which are distributed over the spatial regions of the intended Z listening zones, such that each listening zone is covered by $M_Z$ control points. For simplicity, we assume in this example that the number of control points $M_Z$ in each zone is equal, such that the total number M of control points is given by the sum of all $M_Z$ control points. It is recommended that the number of control points M is equal to or larger than the number of loudspeakers N. The dynamic acoustic responses can then be estimated by sending out test signals from the loudspeakers, one loudspeaker at a time, and recording the resulting acoustic signals at all M measurement positions. Test signals such as white or colored noise or swept sinusoids may be used for this purpose. Models of the linear dynamic responses from one loudspeaker to M outputs can then be estimated in the form of, e.g., FIR or IIR filters with one input and M outputs. Various system identification techniques such as the least squares method or Fourier-transform based techniques can be used for this purpose. The measurement procedure is repeated for all loudspeakers, finally resulting in a model transfer function that is represented by an $M \times N$ matrix of dynamic models. The multiple-input multiple-output (MIMO) model may alternatively be represented by a state-space description. An example of a mathematically convenient, although very general, MIMO model for representing a sound reproduction system is by means of a right Matrix Fraction Description (MFD) [18] with diagonal denominator,
which is the type of MFD that will be utilized in the following. An even more general model can be obtained if the matrix $A_n^{-1}(q^{-1})$ is allowed to be a full polynomial matrix, and there is nothing in principle that prohibits the use of such a structure. However, we shall adhere to the structure (1) in the following, as it allows a more transparent mathematical derivation of the optimal controller. Note that the model transfer functions $H_n$ as defined in (1) may include a parameterization that describes model errors and uncertainties, as given by the following example.
Considering a feasible number M of control points, resulting in models obtained from spatially sparse measurement data, we shall employ the stochastic uncertainty model presented in [6][26][32]. Hence, the linear system model is decomposed into a sum of two parts, one deterministic nominal part and one stochastic uncertainty part, where the uncertainty part is partly parameterized by random variables. The nominal part will here represent those components of the model transfer functions that are known to be varying only slowly with respect to space (and which therefore are well captured by spatially sparse RTF measurements), whereas the uncertainty part represents components that are not fully captured by such measurements. Typically, these spatially complex components consist of late room reflections and reverberation at high frequencies. Accordingly, $H_n(q^{-1})$ in (1) is decomposed as
$$H_n(q^{-1}) = H_{0n}(q^{-1}) + \Delta H_n(q^{-1}), \qquad \text{Eq. (2.1)}$$
where $H_{0n}(q^{-1})$ is the nominal model and $\Delta H_n(q^{-1})$ constitutes the uncertainty model. Writing out the matrix fractions for $H_{0n}(q^{-1})$ and $\Delta H_n(q^{-1})$, the decomposition of Eq. (2.1) can be expressed as
Consider a multichannel audio sound system comprising N loudspeakers, $N \geq 2$ and $1 \leq N_n \leq N$, around Z bounded three-dimensional listening areas $\Omega_Z \subset \mathbb{R}^3$ in a room. Here, $N_n$, $n \in \{1, 2\}$, represents the total number of loudspeakers used for each of the loudspeakers in a considered loudspeaker pair which creates virtual sources. As an example, consider a 4-channel automotive loudspeaker system with two listening zones, located in the two front seats. The total number of loudspeakers (called 1, 2, 3, and 4) is then N=4. The total number of listening zones is then Z=2. Suppose that the front left (FL) and front right (FR) loudspeakers reproduce a stereo recording. Further suppose that all N=4 loudspeakers are used in order to improve the sound reproduction of the FL and FR loudspeakers, which yields that the total number of loudspeakers associated with FL and FR is $N_n = N_1 = N_2 = 4$, because the total number of loudspeakers corresponds to the union of $N_1$ and $N_2$: $N = N_1 \cup N_2$. The acoustic output of the system is measured in M control points, or measurement positions, where $M_Z$ control points are uniformly distributed within each listening zone $\Omega_Z$. Let the $N_n$ input signals of the above sound system be represented by a signal vector $u_{1n}(k) = [u_{11n}(k) \; \ldots \; u_{1N_n n}(k)]^T$. The output at the M control points is then
$$y_n(k) = H_n(q^{-1})\,u_{1n}(k), \qquad \text{Eq. (3)}$$
where $H_n(q^{-1})$, given by Eq. (1) and Eq. (2.1), is a rational matrix of dimension $M \times N_n$, with elements that are scalar stable rational functions $H_{ijn}(q^{-1})$; $i = 1, \ldots, M$; $j = 1, \ldots, N_n$.
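For the special case where every element of the model is an FIR filter (that is, the denominators equal one, which is an assumption made here for simplicity), Eq. (3) amounts to summing the convolutions of each input signal with the corresponding impulse response. The function name and the small numerical example below are illustrative.

```python
import numpy as np


def mimo_fir_output(h, u):
    """Apply an M x N matrix of FIR filters to N input signals.

    h: impulse responses, shape (M, N, taps)
    u: input signals, shape (N, samples)
    Returns the outputs at the M control points, shape (M, samples + taps - 1).
    """
    n_points, n_inputs, taps = h.shape
    y = np.zeros((n_points, u.shape[1] + taps - 1))
    for i in range(n_points):
        for j in range(n_inputs):
            y[i] += np.convolve(h[i, j], u[j])  # contribution of input j at point i
    return y


h = np.zeros((2, 2, 2))
h[0, 0] = [1.0, 0.0]   # input 1 reaches point 1 without delay
h[0, 1] = [0.0, 1.0]   # input 2 reaches point 1 one sample later
h[1, 0] = [0.5, 0.0]   # input 1 reaches point 2 attenuated
u = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
y = mimo_fir_output(h, u)
```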
A target transfer function $D_n$, of dimension $M \times 1$, can for example be parameterized as
In $\tilde{D}_n(q^{-1})$ above, at least one of the polynomial elements is assumed to have a non-zero leading coefficient; the second equality in Eq. (4) is included to emphasize that $D_n$ contains an initial modeling delay of $d_0$ samples. In this example, we will use a FIR model for the targets, and we therefore have that E=1. Further, each control point has an individual target transfer function, which contains an allpass filter. The phase characteristics of the allpass filters differ significantly between control points that are affiliated with different zones. The target D can then be expressed as
where DΩ
A similarity requirement can optionally be included in the proposed method. If it is desired to minimize the differences between the loudspeakers of a selected loudspeaker pair, then a similarity matrix P, which is a part of the design, can be included. When P is set to a matrix containing only zeros, similarity is not taken into account. We will show how to include a similarity requirement by means of an example. The similarity matrix P is defined as follows:
$$P = [\,\mathrm{diag}(D) \mid -\mathrm{diag}(D)\,], \qquad \text{Eq. (6)}$$
where D is given by Eq. (5) and where diag(D), for the column vector D, represents a diagonal matrix of appropriate dimensions with the elements of D along the diagonal, i.e., diag(D1, . . . , Dm) represents a diagonal matrix with D1, . . . , Dm on the diagonal. The polynomial matrix P also contains a scalar similarity weighting factor ρ, which allows for adjustments of the degree of desired similarity based on a given sound system and listening environment. The proposed design of the similarity matrix is in general significantly different from the design suggested in [3], where identity matrices and permutations are suggested (the similarity matrix is in [3] referred to as a permutation matrix). A design according to Eq. (6) considers differences in the target transfer functions between different zones, which is not anticipated in [3]. By invoking Eq. (5) and (6), we then obtain the following formulation for the last terms on the right hand side of Eq. (11) (the criterion function which is to be minimized)
This means that when the difference between the representations of the compensated model transfer functions of a pair of the loudspeakers, represented by y1 and y2, is minimized in each control point, each compensated model transfer function is multiplied with the target transfer function in each control point. The difference is thus minimized under consideration of the desired target transfer functions in each control point.
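The effect of the similarity matrix of Eq. (6) can be illustrated with a small numerical sketch at a single frequency. It is assumed here, purely for illustration, that the target D and the compensated responses y1 and y2 are given as complex frequency-response values per control point, and that the weighting factor ρ multiplies the whole matrix; the function name `similarity_matrix` and all values are hypothetical.

```python
import numpy as np

def similarity_matrix(target, rho=1.0):
    """Build P = rho * [diag(D) | -diag(D)] for one frequency (assumption:
    rho is applied as a plain scalar factor on the whole matrix)."""
    diag_target = np.diag(target)
    return rho * np.hstack([diag_target, -diag_target])

# Three control points; the targets differ in phase between points.
D = np.array([np.exp(1j * 0.3), np.exp(-1j * 1.1), 0.5])
y1 = np.array([1.0 + 0.2j, 0.4, -0.3j])   # compensated response, loudspeaker 1
y2 = np.array([0.9, 0.5 + 0.1j, 0.1])     # compensated response, loudspeaker 2

P = similarity_matrix(D, rho=2.0)
# Applying P to the stacked responses weights the pairwise difference
# in every control point by the target of that control point.
similarity_error = P @ np.concatenate([y1, y2])
```

In this sketch `similarity_error` equals `rho * D * (y1 - y2)` element by element, which is the target-weighted difference described above.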
For the suggested method, at least one loudspeaker pair must be selected amongst the N loudspeakers. The selected pair should correspond to two of the L input signals, where the two selected inputs are used for the creation of virtual sound sources, or optionally each loudspeaker in the pair should correspond to the same mono (single signal) input. If, for example, a stereo recording shall be reproduced, then the left and right front loudspeakers define the loudspeaker pair. If, in another example, a 5.1 surround sound recording (home cinema) is to be reproduced, the left and right front loudspeakers should be primarily chosen as the loudspeaker pair. The remaining loudspeakers can then be equalized according to the proposed method by selecting further loudspeaker pairs, or by combination with other equalizers whenever desired.
Optional support loudspeakers must be carefully selected. For example, if the left front primary loudspeaker of a stereo system is fully supported by the right front loudspeaker, then both loudspeakers will reproduce both the left and right channel. This inevitably leads to a mono effect, because both loudspeakers will reproduce a very similar signal, which corresponds to the sum of the left and right channel. This mono effect can be avoided by applying either one of the following two optional design strategies: (a) Only support loudspeakers which are associated with the input channel of the primary loudspeaker are allowed. (b) The position of sound sources is typically not localizable by human hearing at low frequencies in small rooms, especially for off-center listening positions [35]. Therefore, a low-pass filter with cut-off frequency of about 180 Hz can be applied to support loudspeakers that are not associated with the primary input channel, referred to as constrained support loudspeakers. Then support loudspeakers at arbitrary positions may be used without creating a mono effect, because the sum of the left and right channel is then only reproduced by the loudspeakers for low frequencies, which will not affect the localization.
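Design strategy (b) can be sketched as a band limitation of the constrained support loudspeakers. The fragment below designs a linear-phase low-pass FIR filter with a cut-off of about 180 Hz by the windowed-sinc method. The sampling rate of 44.1 kHz, the filter length, the Hamming window and the helper names `lowpass_fir` and `gain_at` are assumptions made for the example; in the design described further on, the band limitation may instead be realized through the control weighting.

```python
import numpy as np

def lowpass_fir(cutoff_hz, fs_hz, num_taps):
    """Windowed-sinc (Hamming) linear-phase low-pass FIR with unit DC gain."""
    n = np.arange(num_taps) - (num_taps - 1) / 2.0
    fc = cutoff_hz / fs_hz  # cut-off in cycles per sample
    h = 2.0 * fc * np.sinc(2.0 * fc * n) * np.hamming(num_taps)
    return h / np.sum(h)

def gain_at(h, freq_hz, fs_hz):
    """Magnitude of the frequency response of h at a single frequency."""
    k = np.arange(len(h))
    return abs(np.sum(h * np.exp(-2j * np.pi * freq_hz * k / fs_hz)))

fs = 44100.0                                   # assumed sampling rate
h_support = lowpass_fir(180.0, fs, 2047)       # assumed filter length
passband_gain = gain_at(h_support, 50.0, fs)   # well below the cut-off
stopband_gain = gain_at(h_support, 1000.0, fs) # well above the cut-off
```

A constrained support loudspeaker fed through `h_support` would then contribute only at low frequencies, where localization is typically not affected.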
Consider the MIMO system introduced in Eq. (1)-(6) consisting of at least one loudspeaker pair. Let n∈{1, 2} describe the two loudspeakers of the pair and recall that the total number of loudspeakers N is given by N=N1∪N2, where N1 and N2 are the number of loudspeakers used for each of the loudspeakers of the pair that are required to be similar. Note that each loudspeaker of the pair has Nn−1 optional support loudspeakers, and let us introduce the signals, see
$$
\begin{aligned}
z_1^n(k) &= V^n(q^{-1})\big(D^n(q^{-1})\,w(k) - H^n(q^{-1})\,u_1^n(k)\big)\\
z_2^n(k) &= W^n(q^{-1})\,u_2^n(k)\\
y^n(k) &= H_0^n(q^{-1})\,u_1^n(k). \qquad \text{Eq. (7)}
\end{aligned}
$$
Here, w(k) is a stationary white noise with zero mean and covariance E{w²(k)}=ψ. The filters Vn(q−1) and Wn(q−1), of dimensions M×M and Nn×Nn, respectively, constitute weighting matrices for the error and control signals, respectively. Furthermore, Hn(q−1) and H0n(q−1), both of dimension M×Nn, are given by Eq. (1)-(3). The control signals u1n(k) and u2n(k), of dimension Nn×1, are given by
Here, Rtot(q−1, q) is an (optionally noncausal) feedforward compensator whereas Δ̃n(q−1), Fn
Here, d0 is the same as in Eq. (4) and represents the primary bulk delay (or smoothing lag) of the compensated system, whereas djn, j=1, . . . , Nn, are delays that can be used to compensate for individual deviations in distances among the different loudspeakers. According to [8][9], Fn
Since Δ̃n(q−1) Fn
where Hn(q−1) is given by Eq. (1)-(3).
The objective is now to design the controllers Rn(q−1) so as to attain the targets of the respective channels while making the nominal compensated channel responses, see
Here Ē and E denote, respectively, expectation with respect to the uncertain parameters in ΔBn, see (3), and the driving noise w(k). The matrix Pn, of dimension M×M, constitutes a similarity matrix, which can be used to define how to minimize the third term on the right hand side of Eq. (11) with regard to the involved loudspeaker pair. Furthermore, Pn constitutes a weighting matrix to regulate the control points that take similarity into account in both frequency and space.
The criterion Eq. (11), which constitutes a squared 2-norm, or other forms of criteria, based e.g., on other norms, can be optimized in several ways with respect to the adjustable parameters of the precompensator R. It is also possible to impose structural constraints on the precompensator, such as e.g., requiring its elements to be FIR filters of certain fixed orders, and then perform optimization of the adjustable parameters under these constraints. Such optimization can be performed with adaptive techniques, or by the use of FIR Wiener filter design methods. However, as all structural constraints lead to a constrained solution space, the attainable performance will be inferior compared with problem formulations without such constraints. Hence, the optimization should preferably be performed without structural constraints on the precompensator, except for causality of the precompensator and stability of the compensated system. With the optimization problem stated as above, the problem becomes a Linear Quadratic Gaussian (LQG) design problem for the multivariable feedforward compensator R.
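The structurally constrained alternative mentioned above, in which the precompensator is required to be an FIR filter of fixed order, can be sketched as an ordinary least-squares problem. The example treats a single loudspeaker and a scalar FIR filter, with a target error summed over the control points and an optional penalty on the filter coefficients standing in for the control weighting. This is a simplified stand-in, not the LQG-optimal solution of the description; all names (`convolution_matrix`, `design_fir_precompensator`, `target_error`) and values are hypothetical.

```python
import numpy as np

def convolution_matrix(h, n_r):
    """Matrix C such that C @ r equals np.convolve(h, r) for len(r) == n_r."""
    h = np.asarray(h, dtype=float)
    C = np.zeros((len(h) + n_r - 1, n_r))
    for k in range(n_r):
        C[k:k + len(h), k] = h
    return C

def design_fir_precompensator(h_list, d_list, n_r, control_weight=0.0):
    """Least-squares FIR filter r minimizing sum_i ||d_i - h_i * r||^2
    plus control_weight * ||r||^2 (all h_i must have equal length)."""
    A = np.vstack([convolution_matrix(h, n_r) for h in h_list])
    b = np.concatenate([np.asarray(d, dtype=float) for d in d_list])
    if control_weight > 0.0:
        # Append the control penalty as extra least-squares rows.
        A = np.vstack([A, np.sqrt(control_weight) * np.eye(n_r)])
        b = np.concatenate([b, np.zeros(n_r)])
    r, *_ = np.linalg.lstsq(A, b, rcond=None)
    return r

def target_error(h_list, d_list, r):
    """Summed squared target error over all control points."""
    return sum(np.sum((np.convolve(h, r) - d) ** 2)
               for h, d in zip(h_list, d_list))

# Two control points, target: pure delay of d0 samples in both points.
h_list = [np.array([1.0, 0.5]), np.array([1.0, 0.4])]
n_r, d0 = 16, 2
delayed_impulse = np.zeros(len(h_list[0]) + n_r - 1)
delayed_impulse[d0] = 1.0
d_list = [delayed_impulse, delayed_impulse]
r = design_fir_precompensator(h_list, d_list, n_r)
```

Because a single filter has to serve both control points, the target cannot be reached exactly in both, which is the kind of remaining approximation error the criterion trades off.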
Linear quadratic theory provides optimal linear controllers, or precompensators, for linear systems and quadratic criteria, see e.g., [1][21][22][31]. If the involved signals are assumed to be Gaussian, then the LQG precompensator, obtained by optimizing the criterion Eq. (11) can be shown to be optimal, not only among all linear controllers but also among all nonlinear controllers, see e.g., [1]. Hence, optimizing the criterion Eq. (11) with respect to the adjustable parameters of R, under the constraint of causality of R and stability of the compensated system HR, is very general. With H and D assumed stable, stability of the compensated system, or error transfer operator, D-HR, is thus equivalent to stability of the controller R.
We will now present the LQG-optimal precompensator for the problem defined by equations Eq. (1)-(10) and the criterion Eq. (11). The solution is given in transfer operator, or transfer function form, using polynomial matrices. Techniques for deriving such solutions have been presented in e.g., [31]. Alternatively, the solution can be derived by means of state space techniques and the solution of Riccati equations, see e.g., [1][22].
Given the system H̃(q−1) above, with the fixed and known delay polynomial matrix Δ̃n(q−1), the all-pass rational matrix Fn
where β, of dimension (N1+N2)×(N1+N2), is the unique (up to a unitary constant matrix) stable spectral factor given by
$$\beta_*\beta = \bar{E}\{\tilde{B}_* V_* V \tilde{B} + A_* W_* W A + \check{B}_{0*} P_* P \check{B}_0\} \qquad \text{Eq. (13)}$$
with B̃(q−1), of dimension 2M×(N1+N2), being as in Eq. (10). B̌0(q−1) in Eq. (13) is given by invoking Eq. (10) and (2.2)
$$\tilde{B} = \check{B}_0 + \Delta B\,\check{B}_1 = (\hat{B}_0 + \Delta B\,\hat{B}_1)\,\tilde{\Delta}F_* = B\,\tilde{\Delta}F_*.$$
Note here that the scalar weighting factor ρ is included in P, such that ρ² scales the similarity term in Eq. (13) with respect to the target requirement. The polynomial matrix Q, together with a polynomial matrix L*, both of dimension (N1+N2)×1, constitute the unique solution to the Diophantine equation
B̃=(B̂+ΔBB̂)Δ̃*. Eq. (14):
with generic degrees
$$\deg Q = \max(\deg V + \deg D,\ \deg E - 1)$$
$$\deg L_* = \max(\deg \tilde{E}\{\hat{\tilde{B}}_{0*}\} + \deg V_*,\ \deg \beta_*) - 1. \qquad \text{Eq. (15)}$$
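The notion of a stable spectral factor, as used in Eq. (13), can be illustrated in the scalar, deterministic case. The sketch below takes one FIR numerator b, forms the symmetric polynomial corresponding to b*b plus a constant control penalty, and extracts the factor β whose zeros all lie inside the unit circle by root selection. It ignores the matrix structure, the averaging over model uncertainties and the similarity term of the actual design; the function name `scalar_spectral_factor` and the numbers are hypothetical.

```python
import numpy as np

def scalar_spectral_factor(b, control_penalty):
    """Return (beta, r) where r holds the coefficients of b_* b + penalty and
    beta is the stable factor with beta_* beta = r (scalar, real case).

    Assumes b[0] and b[-1] are non-zero and control_penalty > 0, so that r
    has no zeros on the unit circle.
    """
    b = np.asarray(b, dtype=float)
    r = np.convolve(b, b[::-1])          # coefficients of b(q^-1) b(q)
    r[len(b) - 1] += control_penalty     # constant term sits in the middle
    roots = np.roots(r)                  # zeros come in reciprocal pairs
    inside = roots[np.abs(roots) < 1.0]  # keep the stable half
    beta = np.real(np.poly(inside))      # monic polynomial with those zeros
    # Scale so that the middle coefficient of beta_* beta matches r.
    beta *= np.sqrt(r[len(b) - 1] / np.sum(beta ** 2))
    return beta, r

# b has one zero outside the unit circle (at 2) and one inside (at 0.5).
beta, r = scalar_spectral_factor([1.0, -2.5, 1.0], control_penalty=0.1)
reconstructed = np.convolve(beta, beta[::-1])
```

For polynomial matrices, dedicated spectral factorization algorithms or the state space route via Riccati equations mentioned above would be used instead of root selection.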
When a sound system is reproducing music, it is mostly preferable that the magnitude spectrum of the system's transfer functions is smooth and well balanced, at least on average over the listening zones. If the compensated system perfectly attains the desired target D and similarity at all positions, then the average magnitude response of the compensated system will be as desired. However, since the designed controller R cannot be expected to fully reach the target response D and similarity at all frequencies, e.g., due to very complex room reverberation that cannot be fully compensated for, there will always be some remaining approximation errors in the compensated system. These approximation errors may have different magnitude at different frequencies, and they may affect the quality of the reproduced sound. Magnitude response imperfections are generally undesirable and the controller matrix should preferably be adjusted so that an overall target magnitude response is reached on average in all the listening zones.
A final design step is therefore preferably added after the criterion minimization with the aim of adjusting the controller response so that a target magnitude response is well approximated on average over the measurement positions. To this end, the magnitude responses of the overall system (i.e., the system including the controller R) can be evaluated in the various listening positions, based on the design models or based on new measurements. A minimum phase filter can then be designed so that on average (in the root mean square (RMS) sense) the target magnitude response is reached in all listening regions. As an example, variable fractional octave smoothing based on the spatial response variations may be employed in order not to overcompensate in any particular frequency region. The result is one scalar equalizer filter that adjusts all the elements of R by an equal amount.
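A minimal sketch of this final step is given below. It computes the RMS magnitude response over the measurement positions, forms the correction needed to reach a target magnitude, and converts that magnitude into a minimum phase impulse response with the real-cepstrum (homomorphic) method. The cepstral method is one common choice and is an assumption here, as are the random test responses, the flat target and the function names; the fractional octave smoothing mentioned above is omitted.

```python
import numpy as np

def rms_magnitude(H):
    """RMS magnitude over positions; H has shape (positions, n_fft)."""
    return np.sqrt(np.mean(np.abs(H) ** 2, axis=0))

def minimum_phase_from_magnitude(mag):
    """Minimum phase impulse response whose FFT magnitude equals `mag`.

    `mag` must be positive and given on a full, even-length FFT grid with
    the symmetry of a real signal (mag[k] == mag[n - k]).
    """
    n = len(mag)
    cepstrum = np.fft.ifft(np.log(mag)).real
    # Fold the anti-causal part of the cepstrum onto the causal part.
    fold = np.zeros(n)
    fold[0] = 1.0
    fold[1:n // 2] = 2.0
    fold[n // 2] = 1.0
    return np.fft.ifft(np.exp(np.fft.fft(fold * cepstrum))).real

rng = np.random.default_rng(0)
n_fft = 512
# Hypothetical compensated impulse responses at three measurement positions.
impulse_responses = rng.standard_normal((3, 64)) * np.exp(-np.arange(64) / 10.0)
H = np.fft.fft(impulse_responses, n_fft, axis=1)

target_mag = np.ones(n_fft)  # assumed flat target magnitude
eq_mag = target_mag / np.maximum(rms_magnitude(H), 1e-9)
h_eq = minimum_phase_from_magnitude(eq_mag)
```

Applying the single filter `h_eq` to every element of the controller leaves the relative behavior between channels unchanged while moving the average magnitude toward the target.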
We shall now present results of an evaluation based on real measurements acquired in the two front seats of a four channel automotive sound system, which consists of four broadband loudspeakers, which are located in the doors. The car used is a Ford Mondeo sedan, where all loudspeakers are part of the delivered standard sound system. An automotive after-market amplifier was fitted in order to have access to the channels and bypass the head unit. This sound system corresponds to a typical standard automotive sound system.
In this filter design, the matrix V contains identity matrices of appropriate dimensions. Hence we will not use any frequency weighting of the target error. The matrix W contains frequency weightings, which penalize the control actions so that the involved loudspeakers are not driven outside their operating ranges. Furthermore, this weighting matrix also controls the operating frequency range of the support loudspeakers, e.g., by limiting their impact to lower frequencies. We here make use of all available loudspeakers as support loudspeakers, with the following restriction: support loudspeakers associated with input signals other than that of the considered loudspeaker of the chosen pair are only used up to 180 Hz, see the fine dotted line in
The similarity matrix P also contains a frequency weighting, preferably used to focus the similarity efforts to lower frequencies. The weighting consists of a shelving filter with a cut-off frequency of 4 kHz, see
In order to assess the spatial performance of the compared methods under reverberant conditions, we shall use a cross-correlation measure, which evaluates the cross-correlation between the loudspeakers in the loudspeaker pair, which creates virtual sources, in narrow frequency bands.
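One possible form of such a measure is sketched below. The description does not state the exact formula, so the following choices are assumptions: the two signals (for instance the compensated impulse responses of the two loudspeakers of the pair at one control point) are restricted to a narrow band with a rectangular mask in the frequency domain, and the measure is the peak of the normalized cross-correlation over all lags. The function name `band_cross_correlation` and the test signals are hypothetical.

```python
import numpy as np

def band_cross_correlation(x1, x2, fs_hz, f_lo_hz, f_hi_hz):
    """Peak normalized cross-correlation of x1 and x2 within one band."""
    n_fft = 1
    while n_fft < len(x1) + len(x2):  # zero-pad to avoid wrap-around
        n_fft *= 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs_hz)
    band = (freqs >= f_lo_hz) & (freqs < f_hi_hz)
    X1 = np.fft.rfft(x1, n_fft) * band
    X2 = np.fft.rfft(x2, n_fft) * band
    cross = np.fft.irfft(X1 * np.conj(X2), n_fft)
    # Band energies are the zero-lag values of the autocorrelations.
    energy1 = np.fft.irfft(X1 * np.conj(X1), n_fft)[0]
    energy2 = np.fft.irfft(X2 * np.conj(X2), n_fft)[0]
    return np.max(np.abs(cross)) / np.sqrt(energy1 * energy2)

rng = np.random.default_rng(1)
a = rng.standard_normal(4096)
b = rng.standard_normal(4096)
a_delayed = np.concatenate([np.zeros(25), a])

same_source = band_cross_correlation(a, a_delayed, 8000.0, 500.0, 1000.0)
independent = band_cross_correlation(a, b, 8000.0, 500.0, 1000.0)
```

Values close to one indicate that the two loudspeakers deliver nearly identical band-limited signals up to a delay, while lower values indicate decorrelation in that band.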
It will be appreciated that the methods and devices described herein can be combined and re-arranged in a variety of ways.
For example, embodiments may be implemented in hardware, or in software for execution by suitable processing circuitry, or a combination thereof.
The steps, functions, procedures, modules and/or blocks described herein may be implemented in hardware using any conventional technology, such as discrete circuit or integrated circuit technology, including both general-purpose electronic circuitry and application-specific circuitry.
Particular examples include one or more suitably configured digital signal processors and other known electronic circuits, e.g., discrete logic gates interconnected to perform a specialized function, or Application Specific Integrated Circuits (ASICs).
Alternatively, at least some of the steps, functions, procedures, modules and/or blocks described herein may be implemented in software such as a computer program for execution by suitable processing circuitry such as one or more processors or processing units.
Examples of processing circuitry include, but are not limited to, one or more microprocessors, one or more Digital Signal Processors (DSPs), one or more Central Processing Units (CPUs), video acceleration hardware, and/or any suitable programmable logic circuitry such as one or more Field Programmable Gate Arrays (FPGAs), or one or more Programmable Logic Controllers (PLCs).
It should also be understood that it may be possible to re-use the general processing capabilities of any conventional device or unit in which the proposed technology is implemented. It may also be possible to re-use existing software, e.g., by reprogramming of the existing software or by adding new software components.
According to a second aspect, there is provided a system configured to determine filter coefficients of an audio precompensation controller for the compensation of an associated sound system. The sound system comprises N≥2 loudspeakers. The system is configured to estimate, for each one of at least a pair of the loudspeakers, a model transfer function at each of a plurality M of control points distributed in Z≥2 spatially separated listening zones in a listening environment of the sound system based on a model of acoustic properties of the listening environment. The system is also configured to determine, for each of the M control points, a zone-dependent target transfer function at least based on the zone affiliation of the control point and the model of acoustic properties. The system is further configured to determine the filter coefficients of the audio precompensation controller at least based on the model transfer functions and the target transfer functions of the M control points.
By way of example, the system may be configured to determine the filter coefficients based on optimizing a criterion function, where the criterion function at least comprises a target error related to the model transfer functions and the target transfer functions and optionally also differences between representations of compensated model transfer functions of at least a pair of the loudspeakers.
For example, the system may be configured to operate based on model transfer functions and target transfer functions that are representing impulse responses at the control points.
In a particular example, the system is configured to determine model transfer functions that are acoustically unsymmetrical for both symmetrical and unsymmetrical setups in relation to the position of the loudspeakers and the listening zones.
As an example, the system may be configured to determine the target transfer function in each control point based on phase differences between at least a pair of the loudspeakers in the control point. The phase differences may for example be defined by the model transfer function(s) in the control point, and the phase characteristics of said zone-dependent target transfer functions normally differ between control points affiliated with different listening zones.
In one example, the system may be configured to estimate a model transfer function at each of the control points based on estimating an impulse response at each of the control points, based on sound measurements of the sound system.
In another example, the system is configured to estimate a model transfer function at each of the control points based on a simulation of an impulse response at each of the control points, wherein the simulation includes first order reflections and/or higher order reflections.
Optionally, the system may be configured to determine the filter coefficients based on optimizing a criterion function under the constraint of stability of the dynamics of the audio precompensation controller. For example, the criterion function may at least include a weighted summation of powers of differences between compensated model impulse responses and target impulse responses over the M control points, and optionally a weighted summation of powers of differences between representations of compensated model transfer functions of at least a pair of the loudspeakers.
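The structure of such a criterion function can be sketched as follows. The example evaluates a weighted sum of squared differences between compensated and target impulse responses over the control points, plus an optional similarity term for one loudspeaker pair. The squared (power two) differences, the per-point weights, the single scalar similarity weight and the name `criterion` are assumptions made for the illustration.

```python
import numpy as np

def criterion(compensated, targets, point_weights, pair=None, similarity_weight=0.0):
    """Weighted target error plus optional pairwise similarity term.

    compensated, targets: arrays of shape (M, T), impulse responses per
        control point.
    point_weights: array of shape (M,), one weight per control point.
    pair: optional tuple of two (M, T) arrays, the compensated responses
        of the two loudspeakers of a pair.
    """
    compensated = np.asarray(compensated, dtype=float)
    targets = np.asarray(targets, dtype=float)
    point_weights = np.asarray(point_weights, dtype=float)
    target_term = np.sum(point_weights[:, None] * (compensated - targets) ** 2)
    similarity_term = 0.0
    if pair is not None:
        first, second = (np.asarray(p, dtype=float) for p in pair)
        similarity_term = similarity_weight * np.sum((first - second) ** 2)
    return target_term + similarity_term

compensated = [[1.0, 0.0], [0.0, 1.0]]
targets = [[1.0, 0.0], [0.0, 0.5]]
weights = [1.0, 2.0]
j_target_only = criterion(compensated, targets, weights)
j_with_similarity = criterion(
    compensated, targets, weights,
    pair=([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 0.0]]),
    similarity_weight=0.25,
)
```

Setting `similarity_weight` to zero, or omitting `pair`, reduces the sketch to the pure target error.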
By way of example, as illustrated in the overview of
In a particular example, the system comprises a processor and a memory. The memory comprises instructions executable by the processor, whereby the processor is operative to determine the filter coefficients of the audio precompensation controller.
In this particular example, at least some of the steps, functions, procedures, modules and/or blocks described herein are implemented in a computer program 25; 45, which is loaded into the memory 20 for execution by processing circuitry including one or more processors. The processor(s) 10 and memory 20 are interconnected to each other to enable normal software execution. An optional input/output device may also be interconnected to the processor(s) 10 and/or the memory 20 to enable input and/or output of relevant data such as input parameter(s) and/or resulting output parameter(s).
The term ‘processor’ should be interpreted in a general sense as any system or device capable of executing program code or computer program instructions to perform a particular processing, determining or computing task.
The processing circuitry including one or more processors is thus configured to perform, when executing the computer program, well-defined processing tasks such as those described herein.
The processing circuitry does not have to be dedicated to only execute the above-described steps, functions, procedure and/or blocks, but may also execute other tasks.
In a particular embodiment, the computer program comprises instructions, which when executed by at least one processor, cause the processor(s) to:
The proposed technology also provides a carrier 20; 40 comprising the computer program 25; 45, wherein the carrier is one of an electronic signal, an optical signal, an electromagnetic signal, a magnetic signal, an electric signal, a radio signal, a microwave signal, or a computer-readable storage medium.
By way of example, the software or computer program 25; 45 may be realized as a computer program product, which is normally carried or stored on a computer-readable medium 20; 40, in particular a non-volatile medium. The computer-readable medium may include one or more removable or non-removable memory devices including, but not limited to a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc (CD), a Digital Versatile Disc (DVD), a Blu-ray disc, a Universal Serial Bus (USB) memory, a Hard Disk Drive (HDD) storage device, a flash memory, a magnetic tape, or any other conventional memory device. The computer program may thus be loaded into the operating memory of a computer or equivalent processing device for execution by the processing circuitry thereof.
The flow diagram or diagrams presented herein may therefore be regarded as a computer flow diagram or diagrams, when performed by one or more processors. A corresponding apparatus may be defined as a group of function modules, where each step performed by the processor corresponds to a function module. In this case, the function modules are implemented as a computer program running on the processor. Hence, the system or apparatus for filter design may alternatively be defined as a group of function modules, where the function modules are implemented as a computer program running on at least one processor.
The computer program residing in memory may thus be organized as appropriate function modules configured to perform, when executed by the processor, at least part of the steps and/or tasks described herein.
Alternatively it is possible to realize the modules in
Typically, the design equations are solved on a separate computer system to produce the filter parameters of the precompensation filter. The calculated filter parameters are then normally downloaded into a digital filter, for example, realized by a digital signal processing system or similar computer system, such as, e.g., smartphones, tablets, laptop computers, which executes the actual filtering.
Although the invention can be implemented in software, hardware, firmware or any combination thereof, the filter design scheme proposed by the invention is preferably implemented as software in the form of program modules, functions or equivalent. The software may be written in any type of computer language, such as C, C++ or even specialized languages for digital signal processors (DSPs). In practice, the relevant steps, functions and actions of the invention are mapped into a computer program, which when being executed by the computer system effectuates the calculations associated with the design of the precompensation filter. In the case of a PC-based system, the computer program used for the design of the audio precompensation filter is normally encoded on a computer-readable medium such as a DVD, CD, USB flash drive, or similar structure for distribution to the user/filter designer, who then may load the program into his/her computer system for subsequent execution. The software may even be downloaded from a remote server via the Internet.
The determined filter parameters are then normally transferred from the RAM 24 in the system memory 20 via an I/O interface 70 of the system 100 to a precompensation controller, also referred to as a precompensation filter system 200. Preferably, the precompensation controller or filter system 200 is based on a digital signal processor (DSP) or similar central processing unit (CPU) 202, or equivalent processor, and one or more memory modules 204 for holding the filter parameters and the required delayed signal samples. The memory 204 normally also includes a filtering program, which when executed by the processor 202, performs the actual filtering based on the filter parameters.
Instead of transferring the calculated filter parameters directly to a precompensation controller or filter system 200 via the I/O system 70, the filter parameters may be stored on a peripheral memory card or memory disk 40 for later distribution to a precompensation controller or filter system, which may or may not be remotely located from the filter design system 100. The calculated filter parameters may also be downloaded from a remote location, e.g. via the Internet, and then preferably in encrypted form.
In order to enable measurements of sound produced by the audio equipment under consideration, any conventional microphone unit(s) or similar recording equipment 80 may be connected to the computer system 100, typically via an analog-to-digital (A/D) converter 80. Based on measurements of (conventional) audio test signals made by the microphone unit 80, the system 100 can develop a model of the audio system, using an application program loaded into the system memory 20. The measurements may also be used to evaluate the performance of the combined system of precompensation filter and audio equipment. If the designer is not satisfied with the resulting design, he/she may initiate a new optimization of the precompensation filter based on a modified set of design parameters.
Furthermore, the system 100 typically has a user interface 50 for allowing user-interaction with the filter designer. Several different user-interaction scenarios are possible.
For example, the filter designer may decide that he/she wants to use a specific, customized set of design parameters in the calculation of the filter parameters of the controller or filter system 200. The filter designer then defines the relevant design parameters via the user interface 50.
It is also possible for the filter designer to select between a set of different preconfigured parameters, which may have been designed for different audio systems, listening environments and/or for the purpose of introducing special characteristics into the resulting sound. In such a case, the preconfigured options are normally stored in the peripheral memory 40 and loaded into the system memory during execution of the filter design program.
The filter designer may also define the model transfer functions by using the user interface 50. Instead of determining a system model based on microphone measurements, it is also possible for the filter designer to select a model of the audio system from a set of different preconfigured system models. Preferably, such a selection is based on the particular audio equipment with which the resulting precompensation filter is to be used.
Preferably, the audio filter is embodied together with the sound generating system so as to enable generation of sound influenced by the filter.
In an alternative implementation, the filter design is performed more or less autonomously with no or only marginal user participation. An example of such a construction will now be described. The exemplary system comprises a supervisory program, system identification software and filter design software. Preferably, the supervisory program first generates test signals and measures the resulting acoustic response of the audio system. Based on the test signals and the obtained measurements, the system identification software determines a model of the audio system. The supervisory program then gathers and/or generates the required design parameters and forwards these design parameters to the filter design program, which calculates the precompensation filter parameters. The supervisory program may then, as an option, evaluate the performance of the resulting design on the measured signal and, if necessary, order the filter design program to determine a new set of filter parameters based on a modified set of design parameters. This procedure may be repeated until a satisfactory result is obtained. Then, the final set of filter parameters is downloaded/implemented into the precompensation controller or filter system.
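The control flow of such a supervisory program can be sketched as a simple loop. The measurement, identification, design, evaluation and parameter-adjustment steps are passed in as callables, since their actual implementation is not specified here; the function name `autonomous_design`, its arguments and the scalar toy model used to exercise it are all hypothetical.

```python
def autonomous_design(measure, identify, design, evaluate, adjust, params, max_rounds=10):
    """Measure once, then redesign until the evaluation is satisfactory.

    Returns the final filter parameters, the design parameters that
    produced them, and the number of design rounds that were run.
    """
    measurements = measure()           # test signals and recorded response
    model = identify(measurements)     # system identification
    filter_params = None
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        filter_params = design(model, params)
        if evaluate(model, filter_params):
            break                      # satisfactory result obtained
        params = adjust(params)        # modified set of design parameters
    return filter_params, params, rounds

# Toy stand-ins: the "audio system" is a plain gain of 2.0 and the
# "filter" is a regularized inverse of that gain.
final_filter, final_params, rounds_used = autonomous_design(
    measure=lambda: 2.0,
    identify=lambda measured_gain: measured_gain,
    design=lambda gain, p: gain / (gain ** 2 + p["regularization"]),
    evaluate=lambda gain, filt: abs(gain * filt - 1.0) < 0.05,
    adjust=lambda p: {"regularization": p["regularization"] / 10.0},
    params={"regularization": 1.0},
)
```

In the toy run the first design is rejected by the evaluation and the second, with a smaller regularization, is accepted.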
It is also possible to adjust the filter parameters of the precompensation filter adaptively, instead of using a fixed set of filter parameters. During the use of the filter in an audio system, the audio conditions may change. For example, the position of the loudspeakers and/or objects such as furniture in the listening environment may change, which in turn may affect the room acoustics, and/or some equipment in the audio system may be exchanged by some other equipment leading to different characteristics of the overall audio system. In such a case, continuous or intermittent measurements of the sound from the audio system in one or several positions in the listening environment may be performed by one or more microphone units or similar sound recording equipment. The recorded sound data may then be fed into a filter design system, such as system 100 of
Naturally, the invention is not limited to the arrangement of
A sound generating or reproducing system 400 incorporating a precompensation controller or filter system 200 according to the present invention is schematically illustrated in
The digital or digitized input signal w(k) is then precompensated by the precompensation filter 200, basically to take the effects of the subsequent audio system equipment into account.
The resulting compensated signal u(k) is then forwarded, possibly through a further I/O unit 230, for example, via a wireless link, to a D/A-converter 240, in which the digital compensated signal u(k) is converted to a corresponding analog signal. This analog signal then enters an amplifier 250 and a loudspeaker 260. The sound signal ym(t) emanating from the set of N loudspeakers 260 then has the desired audio characteristics, giving a close to ideal sound experience. This means that any unwanted effects of the audio system equipment have been eliminated through the inverting action of the precompensation filter.
The precompensation controller or filter system may be realized as standalone equipment in a digital signal processor or computer that has an analog or digital interface to the subsequent amplifiers, as mentioned above. Alternatively, it may be integrated into the construction of a digital preamplifier, a D/A converter, a computer sound card, a compact stereo system, a home cinema system, a computer game console, a TV, an MP3 player docking station, a smartphone, a tablet, a laptop computer, or any other device or system aimed at producing sound. It is also possible to realize the precompensation filter in a more hardware-oriented manner, with customized computational hardware structures, such as FPGAs or ASICs.
It should be understood that the precompensation may be performed separate from the distribution of the sound signal to the actual place of reproduction. The precompensation signal generated by the precompensation filter does not necessarily have to be distributed immediately to and in direct connection with the sound generating system, but may be recorded on a separate medium for later distribution to the sound generating system. The compensation signal u(k) in
The embodiments described above are merely given as examples, and it should be understood that the proposed technology is not limited thereto. It will be understood by those skilled in the art that various modifications, combinations and changes may be made to the embodiments without departing from the present scope as defined by the appended claims. In particular, different part solutions in the different embodiments can be combined in other configurations, where technically possible.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/SE2014/050956 | 8/21/2014 | WO | 00 |