The present invention relates to the technical field of acoustics and acoustic signal processing.
It is in particular aimed at a method and a system for estimating a quantity representative of sound energy.
It has already been contemplated to estimate a quantity representative of sound energy at at least one point of a three-dimensional space by means of a plurality of arrays each comprising several acoustic sensors and located in this three-dimensional space.
The article “Localization of Multiple Acoustic Sources with a Distributed Array of Unsynchronized First-Order Ambisonics Microphones”, by C. Schörkhuber, P. Hack, M. Zaunschirm, F. Zotter and A. Sontacchi, in Proceedings of the 6th Congress of the Alps-Adria Acoustics Association, October 2014, Graz, Austria, proposes a solution in which, under the hypothesis of temporal and spectral disjointness, a direction-of-arrival histogram covering a set of directions around the array is built for each array, then a spatially discretized probability function is calculated by combining the histograms obtained for the various arrays.
In this context, the present invention proposes a method for estimating a quantity representative of the sound energy at at least one point of a three-dimensional space where a plurality of arrays are located, each comprising at least K acoustic sensors, K being higher than or equal to 2, comprising the following steps:
The use of arrays each comprising at least 2 acoustic sensors (and preferably at least 4 acoustic sensors) allows a fine analysis of the sound field at the array. The various signals resulting from this analysis allow generating a matrix that renders accurately the sound field present at the array. The sound field analysis is thus both rich and made in a compact way, so that it is possible to map correctly the sound field at the array.
In the case where 2nd-order ambisonic signals are used, for example, the number K of acoustic sensors per array is higher than or equal to 9. In the case where 3rd-order ambisonic signals are used, the number K of acoustic sensors per array is higher than or equal to 16.
The step of determining a raw value for a given array may comprise the following sub-steps:
In order to cover a set of directions about each array, the estimation method may comprise, for each array of the plurality of arrays, a step of determining, based on said matrix, a plurality of directional values of the quantity representative of the sound energy received at the array in question, from a plurality of directions respectively.
Moreover, a step of refining the directional values by means of a beamforming technique may be provided for at least one array of the plurality of arrays.
The estimation method may further comprise, in this case, for each array of the plurality of arrays, a step of determining raw values of said quantity at a plurality of points based on the directional values determined for the array in question.
The method may then comprise, for each point of said plurality of points, a step of determining an estimated value of said quantity at the point in question by combining the raw values determined for the various arrays of the plurality of arrays at the point in question.
A mapping of the quantity representative of the sound energy is thus carried out.
The method may further comprise a step of refining the raw values by means of a beamforming technique using the values estimated for the various points of the plurality of points.
In practice, the estimated value of said quantity may be determined by applying to the raw values a multi-variable function whose image is zero for any antecedent comprising at least one zero variable, which makes it possible to determine relatively simply the estimated value of said quantity on the basis of the raw values.
The estimated value of said quantity may for example be equal to the inverse of the sum of the inverses of the raw values.
According to another conceivable possibility, the estimated value of said quantity may be equal to the M-th root of the product of the raw values, where M is the number of arrays of the plurality of arrays.
The pairwise combinations of representative signals are for example each an estimation of the mathematical expectation of the product of the representative signals in question.
The above-mentioned representative signals may be produced by processing measurements respectively acquired by the acoustic sensors of the array in question.
The above-mentioned quantity is for example the acoustic power. As an alternative, it could be the acoustic pressure (defined on the basis of the square root of the acoustic power).
The present invention also relates to a system for estimating a quantity representative of the sound energy at at least one point of a three-dimensional space comprising:
Of course, the different features, alternatives and embodiments of the invention can be associated with each other according to various combinations, insofar as they are not mutually incompatible or exclusive.
Moreover, various other features of the invention will be apparent from the appended description made with reference to the drawings that illustrate non-limiting embodiments of the invention, and wherein:
The system shown in
As schematically shown in
Each array Am comprises several acoustic sensors Si each capable of making a measurement of a sound field present at the acoustic sensor Si in question. In
In the example described herein, each array Am comprises exactly K acoustic sensors (K being higher than or equal to 2, preferably K being higher than or equal to 4), for example 35 acoustic sensors. As an alternative, however, certain acoustic arrays could comprise more than K acoustic sensors.
Each array Am also comprises a processing unit U adapted to process the signals measured by the acoustic sensors Si of the array in question, as explained hereinafter.
Each array Am can moreover communicate with processor P (for example by means of a wireless link or, as an alternative, a wire link) in order to allow data exchanges between the processing unit U of this array Am and processor P.
Steps E2 to E8 that will now be described are implemented in each of the arrays A1, Am, AM. However, for the sake of brevity, a single array reference is given below: Am.
The method starts with a step E2 of acquiring respective measurements by the K acoustic sensors Si of each array Am of the plurality of arrays.
In the example described hereinafter, step E2 further comprises a processing (by the processing unit U of each array Am) of the measurements acquired by the K acoustic sensors Si of the array Am in question in order to produce signals sk(t) representative of the sound field at the array Am in question. According to the representation used, these signals sk(t) may be complex signals (i.e. represented as complex numbers in order to define a modulus, or amplitude, and a phase) or real signals.
These signals sk(t) are for example L-order ambisonic signals. The L-order ambisonic representation indeed allows representing the sound field at the array Am in question by means of N signals sk(t) with N=(L+1)^2. Generally, the number K of acoustic sensors is higher than or equal to the number N of signals sk(t) produced.
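The relation between the ambisonic order L and the number N of signals can be checked with a short sketch. The helper name `ambisonic_channel_count` is hypothetical and only illustrates the count given above, together with the resulting minimum number K of sensors when K is at least N.

```python
def ambisonic_channel_count(order: int) -> int:
    """Number N of signals in an L-order ambisonic representation: N = (L+1)^2."""
    if order < 0:
        raise ValueError("ambisonic order must be non-negative")
    return (order + 1) ** 2


# Minimum number of sensors K per array for orders 1 to 3, assuming K >= N.
min_sensors = {order: ambisonic_channel_count(order) for order in (1, 2, 3)}
print(min_sensors)  # {1: 4, 2: 9, 3: 16}
```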
The method then continues, at each array Am (and by means of the processing unit U of the array Am in question), by a step E4 of determining directional values p(m)(Ω) of the acoustic power received at the array Am from a plurality of directions Ω.
As schematically shown in
For each direction Ω (and at each time for which the estimation is made), each processing unit U determines for that purpose the elements of a covariance matrix Css in which:
The covariance matrix Css provides a set of statistical information about the spatial properties of the sound field, in particular about the position of the sound sources and the more or less strong correlation of the signals transmitted by them. From this point of view, each element of the matrix enriches the information and thus allows refining the analysis performed.
In the case where ambisonic signals sk(t) with real values are used as described here, the covariance matrix Css is written:
where E is a function estimating the mathematical expectation of the signal in question.
In practice, the function E may be an indicator of central tendency of the signal in question over a predetermined number of samples of this signal (the samples used in the calculation of the central tendency indicator being generally the last samples produced). The function E is for example the (sliding) average of the signal over this predetermined number of (last) samples.
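A minimal sketch of such an estimator is given below. It assumes that element (i, j) of Css is the sliding average of the product of signals si(t) and sj(t) over the last samples, with the second factor conjugated for complex signals. The function name `sliding_covariance` and the window length are hypothetical choices made for illustration.

```python
import numpy as np


def sliding_covariance(signals: np.ndarray, window: int) -> np.ndarray:
    """Estimate Css from the last `window` samples.

    signals: array of shape (N, T), one row per representative signal sk(t).
    Returns an (N, N) matrix whose element (i, j) is the average of
    si(t) * conj(sj(t)) over the window (assumed form of the estimator E).
    No mean is subtracted, since E is applied to the product itself.
    """
    if window < 1 or window > signals.shape[1]:
        raise ValueError("window must be between 1 and the number of samples")
    recent = signals[:, -window:]
    return recent @ recent.conj().T / window


# Example with N = 4 synthetic real signals (first-order case).
rng = np.random.default_rng(0)
s = rng.standard_normal((4, 1000))
css = sliding_covariance(s, window=256)
```

The result is symmetric for real signals (Hermitian for complex ones), and its diagonal holds the average power of each signal over the window.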
The directional value p(m)(Ω) of the acoustic power received from a direction Ω is then written:
where (.)H is the transpose-conjugate operator and a(Ω) is a steering vector of the direction Ω defined as follows in the case of ambisonic signals sk(t):
where Ylq is the spherical harmonic function of real value of order l and degree q and the variables θ and φ represent the direction Ω in spherical coordinates. (The number of spherical harmonic functions of order lower or equal to L being equal to (L+1)^2, the vector a(Ω) is of dimension N=(L+1)^2 and the above-mentioned covariance matrix Css of size N×N.)
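The directional value can be illustrated for the first-order case (N=4). The sketch below assumes the conventional beamformer form a(Ω)^H Css a(Ω), unnormalized real spherical harmonics, and a direction given as azimuth and elevation. The channel ordering and the function names are assumptions made for illustration, not conventions stated in the description.

```python
import numpy as np


def steering_vector_first_order(azimuth: float, elevation: float) -> np.ndarray:
    """Real first-order steering vector (N = 4).

    Assumption: unnormalized real spherical harmonics ordered as
    [omnidirectional, y, z, x]; no normalization convention is implied.
    """
    cos_el = np.cos(elevation)
    return np.array([
        1.0,
        cos_el * np.sin(azimuth),
        np.sin(elevation),
        cos_el * np.cos(azimuth),
    ])


def directional_power(css: np.ndarray, azimuth: float, elevation: float) -> float:
    """Directional value a(Ω)^H Css a(Ω) (assumed form)."""
    a = steering_vector_first_order(azimuth, elevation)
    return float(np.real(a.conj() @ css @ a))


# Single unit-power source at azimuth 0.5 rad, elevation 0.2 rad.
src = steering_vector_first_order(0.5, 0.2)
css = np.outer(src, src)

# Scan azimuths at the source elevation and locate the maximum.
azimuths = np.linspace(-np.pi, np.pi, 361)
powers = [directional_power(css, az, 0.2) for az in azimuths]
best_azimuth = azimuths[int(np.argmax(powers))]
```

With these assumptions the scan peaks at the grid azimuth closest to the source, which is the behaviour expected of the directional values.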
The directional values p(m)(Ω) obtained for a given array Am can potentially be refined by means of a beamforming technique, as described hereinafter with reference to
The processing unit U of each array Am then performs a step E6 of determining raw values p(m)(r) of the acoustic power at a plurality of points of the three-dimensional space E, the position of a point being given by a coordinate vector r.
The points where the raw values p(m)(r) of acoustic power are determined are for example predefined and are the same for all the arrays Am. These points form for example a meshing of the area of interest of the three-dimensional space (area that thus comprises all the arrays Am).
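A meshing of this kind, and the direction under which each point is seen from an array, can be sketched as follows. The grid bounds, the step, the array position and both function names are hypothetical. One possible use, stated here as an assumption, is to read the raw value at a point from the directional value obtained for the direction pointing from the array to that point.

```python
import numpy as np


def mesh_points(lower, upper, step):
    """Regular grid of points r = (x, y, z) covering the box [lower, upper]."""
    axes = [np.arange(lo, hi + step / 2, step) for lo, hi in zip(lower, upper)]
    grid = np.meshgrid(*axes, indexing="ij")
    return np.stack([g.ravel() for g in grid], axis=1)


def directions_from_array(array_position, points):
    """Azimuth and elevation of each point as seen from the array position."""
    offsets = points - np.asarray(array_position, dtype=float)
    azimuth = np.arctan2(offsets[:, 1], offsets[:, 0])
    horizontal = np.hypot(offsets[:, 0], offsets[:, 1])
    elevation = np.arctan2(offsets[:, 2], horizontal)
    return azimuth, elevation


# Same points for every array; only the array position changes.
points = mesh_points((0.0, 0.0, 0.0), (2.0, 2.0, 1.0), 0.5)
azimuth, elevation = directions_from_array((1.0, 1.0, 0.5), points)
```

A point that coincides with the array position has no defined direction; `np.arctan2` returns 0 for it, so such points would need separate handling in practice.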
A portion of this meshing around a particular array Am is schematically shown in
For each point indicated by its coordinates r (comprising for example three coordinates (x,y,z) as shown in
For each array Am, the raw values p(m)(r) determined by this array Am (precisely by the processing unit U of this array Am) are then transmitted at step E8 to processor P.
Processor P thus receives at step E10 all the raw values p(m)(r) determined by all the arrays Am of the plurality of arrays.
Processor P can hence determine, at step E12, for each of the points considered, an estimated value p(all)(r) of the acoustic power at the point in question by combining the raw values p(m)(r) for this point received from the various arrays Am.
The estimated value p(all)(r) for a given point (of coordinates r) is for example determined by applying, to the raw values p(m)(r) for this given point, a multi-variable function f (the number of variables x1, x2, . . . , xM being equal to the number of arrays) whose image f(x1, x2, . . . , xM) is equal to zero for any antecedent (x1, x2, . . . , xM) comprising at least one zero variable xi.
In other words, the function f verifies: f(x1, x2, . . . , xM)=0 if (at least) one index i exists between 1 and M such that xi=0.
The estimated value p(all)(r) for a given point is then equal to:
$$p^{(\mathrm{all})}(r) = f\left(p^{(1)}(r), p^{(2)}(r), \ldots, p^{(M)}(r)\right).$$
The use of such a function f is advantageous in that it yields an estimated value p(all)(r) that is zero (or very low in practice) whenever one of the raw values p(m)(r) is zero (or very low in practice), which tends to reduce the occurrence of noise in the estimation process.
In practice, the estimated value p(all)(r) for a given point may be determined as follows:
$$p^{(\mathrm{all})}(r) = \left(\sum_{m=1}^{M} \frac{1}{p^{(m)}(r)}\right)^{-1}.$$
In other words, in this case, the estimated value p(all)(r) is equal to the inverse of the sum of the inverses of the M raw values p(m)(r).
This solution, based on the hypothesis of absence of interaction between the various arrays, is simple to implement and gives good results in practice.
It can be noticed that this possible embodiment corresponds to the case of a function f as proposed hereinabove due to the fact that the above expression of p(all)(r) may also be written:
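A minimal sketch of this combination follows. It assumes that the equivalent form is the product of the M raw values divided by the sum of the products of M−1 raw values, which is algebraically equal to the inverse of the sum of the inverses and is zero as soon as one raw value is zero. Both function names are hypothetical.

```python
from math import prod


def combine_inverse_sum(raw_values):
    """Inverse of the sum of the inverses; returns 0 if any raw value is 0."""
    values = list(raw_values)
    if any(v == 0 for v in values):
        return 0.0
    return 1.0 / sum(1.0 / v for v in values)


def combine_product_form(raw_values):
    """Assumed equivalent form: product of all values over the sum of
    the leave-one-out products (no division by an individual raw value)."""
    values = list(raw_values)
    denominator = sum(
        prod(values[:i] + values[i + 1:]) for i in range(len(values))
    )
    return prod(values) / denominator if denominator else 0.0


# Raw values reported by M = 3 arrays for the same point.
print(combine_inverse_sum([2.0, 4.0, 4.0]))   # 1.0
print(combine_product_form([2.0, 4.0, 4.0]))  # 1.0
print(combine_product_form([0.0, 3.0, 5.0]))  # 0.0
```

The product form makes the zero property explicit without a special case for a single zero raw value, whereas the direct form needs a guard to avoid dividing by zero.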
According to a conceivable alternative, the processor determines the estimated value p(all)(r) for a given point as follows:
$$p^{(\mathrm{all})}(r) = \left(\prod_{m=1}^{M} p^{(m)}(r)\right)^{1/M}.$$
In other words, in this case, the estimated value p(all)(r) is equal to the M-th root of the product of the raw values (p(m)(r)).
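This alternative is the geometric mean of the raw values, and it also vanishes whenever one raw value is zero. The sketch below assumes non-negative raw values (acoustic powers); the function name is hypothetical.

```python
from math import isclose, prod


def combine_geometric_mean(raw_values):
    """M-th root of the product of the M raw values (assumed non-negative)."""
    values = list(raw_values)
    if not values:
        raise ValueError("at least one raw value is required")
    if any(v < 0 for v in values):
        raise ValueError("raw values are expected to be non-negative")
    return prod(values) ** (1.0 / len(values))


# Raw values reported by M = 3 arrays for the same point.
estimate = combine_geometric_mean([1.0, 8.0, 8.0])
print(isclose(estimate, 4.0))                   # True
print(combine_geometric_mean([0.0, 8.0, 8.0]))  # 0.0
```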
Once the estimated values p(all)(r) have been determined for the plurality of considered points, it is possible to refine the corresponding raw values p(m)(r) associated with the various arrays Am, and thus the estimated values p(all)(r) themselves, by means of a beamforming technique, as described hereinafter with reference to
As indicated hereinabove, the method for refining the directional values may take place when a set of D directional values p(m)(Ωi) has been determined (as indicated hereinabove in relation with step E4) for D directions Ω1, . . . , ΩD, respectively.
This refining method starts with a step E20 of determining a matrix V(m) defined as follows:
$$V^{(m)} = W^{(m)} A^{H} \left(A\, W^{(m)} A^{H} + R\right)^{-1},$$ with
A a matrix obtained by concatenating the steering vectors a(Ωi) defined as hereinabove, each for a direction Ωi, and associated with the D directions Ω1, . . . , ΩD, respectively,
W(m) the diagonal matrix comprising (on its diagonal) the directional values p(m)(Ωi) previously determined for the D directions Ω1, . . . , ΩD,
R a regularization matrix that allows taking account of the presence of diffuse noise in the measured signals.
Reference may be made to the book “Geophysical Data Analysis: Discrete Inverse Theory, 4th Edition”, Academic Press, 2008, p. 62 for more details about this technique, presented there as the solution to the “weighted damped least-squares problem”.
The refining method continues with a step E22 in which the processing unit U in question determines a refined version of the matrix W(m) (and therefore of the directional values p(m)(Ωi) present on the diagonal of this matrix) as follows:
$$Z^{(m)} = V^{(m)}\, C_{ss}^{(m)}\, V^{(m)H}$$
$$W^{(m)} = \operatorname{diag}\left(Z^{(m)}\right)$$
where Css(m) is the covariance matrix determined (as indicated hereinabove at step E4) for the array Am in question (and at the moment in question),
where diag is the operator that, with matrix Z(m), associates the diagonal matrix W(m), whose diagonal elements are identical to those of matrix Z(m) (and whose other elements are zero).
The new directional values p(m)(Ωi) present on the diagonal of the so-obtained matrix W(m) may be used in the remainder of the method.
Steps E20 and E22 may in practice be repeated several times to further refine the directional values p(m)(Ωi).
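Steps E20 and E22 can be sketched with NumPy as follows. This is an illustration under stated assumptions: the steering vectors are first-order and horizontal (unnormalized, ordered as omnidirectional, y, z, x), the regularization matrix R is a small multiple of the identity, and the function name `refine_directional_values` is hypothetical.

```python
import numpy as np


def refine_directional_values(A, css, p_dir, R, iterations=1):
    """Iterate steps E20 and E22.

    A:     (N, D) matrix whose columns are the steering vectors a(Ω_i).
    css:   (N, N) covariance matrix of the array.
    p_dir: (D,) initial directional values, i.e. the diagonal of W.
    R:     (N, N) regularization matrix.
    """
    p = np.asarray(p_dir, dtype=float)
    for _ in range(iterations):
        W = np.diag(p)
        AH = A.conj().T
        # Step E20: V = W A^H (A W A^H + R)^-1, obtained through a linear
        # solve on the conjugate transpose rather than an explicit inverse.
        G = A @ W @ AH + R
        V = np.linalg.solve(G.conj().T, (W @ AH).conj().T).conj().T
        # Step E22: keep only the diagonal of Z = V Css V^H.
        Z = V @ css @ V.conj().T
        p = np.real(np.diag(Z)).copy()
    return p


# D = 8 horizontal directions, first-order steering vectors (assumption).
D = 8
az = np.arange(D) * (2 * np.pi / D)
A = np.stack([np.ones(D), np.sin(az), np.zeros(D), np.cos(az)])

# Single unit-power source in direction index 2.
source_index = 2
css = np.outer(A[:, source_index], A[:, source_index])
p0 = np.real(np.diag(A.conj().T @ css @ A))  # initial directional values
R = 1e-2 * np.eye(4)                         # hypothetical regularization

p1 = refine_directional_values(A, css, p0, R, iterations=1)
```

In this example one iteration keeps the maximum in the source direction and increases its contrast with respect to the neighbouring direction. The choice of R and of the number of iterations is left open here.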
The method for refining the raw values p(m)(r) starts with a step E30 in which processor P determines, for each array Am, a matrix V(m) as follows:
$$V^{(m)} = W^{(\mathrm{all})} A^{(m)H} \left(A^{(m)}\, W^{(\mathrm{all})} A^{(m)H} + R\right)^{-1},$$ with
A(m) a matrix obtained by concatenating the steering vectors a(Ωi) defined, for a set of T points (of the area of interest) indicated by vectors r1, r2, . . . , rT, by the direction Ωi connecting the array Am to the point ri in question (the steering vector associated with a particular direction being defined hereinabove),
W(all) the diagonal matrix comprising (on its diagonal) the estimated values p(all)(ri) previously determined for the T points of coordinates r1, r2, . . . , rT,
R a regularization matrix that allows taking account of the presence of diffuse noise in the measured signals and of sound sources present out of the area of interest.
This solution is of the same type as that proposed hereinabove for refining the directional values and reference can therefore also be made to the above-mentioned book for more details on this subject.
The refining method continues with a step E32 of determining, for each array Am, refined raw values p(m)(ri). For that purpose, processor P determines the matrix V(m)Css(m)V(m)H, the refined raw values p(m)(r1), p(m)(r2), . . . , p(m)(rT) then being the diagonal elements of this matrix V(m)Css(m)V(m)H (matrix Css(m) being as hereinabove the covariance matrix determined at step E4 for the array Am in question).
Processor P can then obtain at step E34, for each point ri of the plurality of T points of coordinates r1, r2, . . . , rT, a refined estimated value p(all)(ri) by combining the M refined raw values p(m)(ri) obtained for this point ri for the various arrays Am, respectively, for example by the combination method described hereinabove at step E12:
$$p^{(\mathrm{all})}(r_i) = f\left(p^{(1)}(r_i), p^{(2)}(r_i), \ldots, p^{(M)}(r_i)\right).$$
Steps E30 to E34 may in practice be repeated several times to further refine the raw values p(m)(ri) and the estimated values p(all)(ri).
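The loop formed by steps E30 to E34 can be sketched as follows for M arrays and T points. The sketch assumes horizontal first-order steering vectors, a regularization matrix R proportional to the identity, and the inverse-of-sum-of-inverses combination for f. All function names and the geometry are hypothetical.

```python
import numpy as np


def inverse_sum(values):
    """Combination f: inverse of the sum of the inverses (0 if any value <= 0)."""
    values = np.asarray(values, dtype=float)
    if np.any(values <= 0):
        return 0.0
    return float(1.0 / np.sum(1.0 / values))


def steering(array_xy, point_xy):
    """Horizontal first-order steering vector from an array to a point (assumption)."""
    az = np.arctan2(point_xy[1] - array_xy[1], point_xy[0] - array_xy[0])
    return np.array([1.0, np.sin(az), 0.0, np.cos(az)])


def refine_estimates(A_list, css_list, p_all, R, combine, iterations=1):
    """Iterate steps E30 (matrix V), E32 (refined raw values), E34 (combination)."""
    if iterations < 1:
        raise ValueError("at least one iteration is required")
    p_all = np.asarray(p_all, dtype=float)
    for _ in range(iterations):
        W_all = np.diag(p_all)
        refined_raw = []
        for A_m, css_m in zip(A_list, css_list):
            AH = A_m.conj().T
            # Step E30: V = W_all A^H (A W_all A^H + R)^-1
            V = W_all @ AH @ np.linalg.inv(A_m @ W_all @ AH + R)
            # Step E32: refined raw values are the diagonal of V Css V^H
            # (clipped at 0 to guard against rounding errors).
            diag = np.real(np.diag(V @ css_m @ V.conj().T))
            refined_raw.append(np.maximum(diag, 0.0))
        raw = np.stack(refined_raw)  # shape (M, T)
        # Step E34: combine the M refined raw values at each point.
        p_all = np.array([combine(raw[:, i]) for i in range(raw.shape[1])])
    return p_all, raw


arrays = [(0.0, 0.0), (3.0, 0.0)]              # M = 2 array positions
points = [(1.0, 1.0), (2.0, 1.0), (1.5, 2.0)]  # T = 3 points, source at index 0
A_list = [np.stack([steering(a, p) for p in points], axis=1) for a in arrays]
css_list = [np.outer(A[:, 0], A[:, 0]) for A in A_list]
raw0 = np.stack([np.real(np.diag(A.conj().T @ C @ A))
                 for A, C in zip(A_list, css_list)])
p_all0 = np.array([inverse_sum(raw0[:, i]) for i in range(len(points))])

p_all1, raw1 = refine_estimates(A_list, css_list, p_all0, 1e-2 * np.eye(4), inverse_sum)
```

The same matrix W(all) is shared by all arrays, while A(m) and Css(m) are specific to each array, which is what couples the per-array refinements through the combined estimate.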
Number | Date | Country | Kind
---|---|---|---
FR1915670 | Dec 2019 | FR | national

Filing Document | Filing Date | Country | Kind
---|---|---|---
PCT/EP2020/087212 | 12/18/2020 | WO |