The present application claims priority to Japanese Patent Application Number 2019-149812, filed on Aug. 19, 2019. The contents of this application are incorporated herein by reference in their entirety.
The present invention relates to a method for determining positions of a plurality of microphones in a microphone array including the plurality of microphones, and a microphone system including the microphone array.
Conventionally, a microphone array installed in a conference room or the like is known. In the conventional microphone array disclosed in U.S. Pat. No. 9,565,493, a plurality of microphones are provided on a plurality of concentric circles.
In the conventional microphone array, the arrangement of the microphones is determined by the experience and intuition of a designer. As a result, the difference between the main lobe and the side lobes in the directional characteristics of the microphone array is insufficient, and improved directivity has been required.
This invention focuses on this point, and an object of the invention is to improve the directivity of the microphone array.
A method for determining microphone position according to a first aspect of the present invention is a method for determining positions of a plurality of microphones in a microphone array having the plurality of microphones arranged in a plurality of concentric circles. The method for determining microphone position includes a constraint condition acquiring step of acquiring constraint conditions including the maximum number of the plurality of microphones; and a selecting step of selecting, from among a plurality of combinations of (i) the number of microphones included in each of the plurality of concentric circles and (ii) the radius of each of the plurality of concentric circles, a combination indicating directional characteristics with the smallest difference from a target value of the directional characteristics of the microphone array, where the plurality of combinations satisfy the constraint conditions.
A microphone system according to a second aspect of the present invention is a microphone array having a plurality of microphones arranged on a plurality of concentric circles, wherein a variation amount of a difference between the radii of two concentric circles adjacent to each other among the plurality of concentric circles does not increase monotonically according to a distance from the center position of the plurality of concentric circles, and an attenuation amount of a side lobe relative to a main lobe in the directional characteristics is equal to or greater than 10 dB.
Hereinafter, the present invention will be described through exemplary embodiments of the present invention, but the following exemplary embodiments do not limit the invention according to the claims, and not all of the combinations of features described in the exemplary embodiments are necessarily essential to the solution means of the invention.
[Outline of a Microphone System S]
As shown by black circles in
The audio processing part 2 is a device that processes the sound signals output from the microphone array 1 (that is, the plurality of sound signals output from the plurality of microphones 11). The audio processing part 2 specifies a direction to a position where a speaker H who has spoken (i.e., a sound source) is located, by analyzing the sound signals input from the microphone array 1. Further, the audio processing part 2 executes a beamforming process by adjusting weight coefficients of the plurality of sound signals corresponding to the plurality of microphones 11 on the basis of the direction toward a specified speaker H and makes sensitivity to the voice generated by this speaker H higher than sensitivity to sounds coming from directions other than the direction toward this speaker H.
In the microphone array 1, the plurality of the microphones 11 are arranged such that a difference in the directional characteristics between the main lobe and a side lobe is equal to or greater than 10 dB due to the audio processing part 2 performing the beamforming process. Next, a configuration of the microphone array 1 and a method for determining an arrangement of the plurality of the microphones 11 will be described in detail.
[Configuration of the Microphone Array 1]
As shown with the black circles in
The concentric circle C2 is the second inner concentric circle, and four microphones 11c are arranged on the concentric circle C2. The concentric circle C3 is the third inner concentric circle, and seven microphones 11d are arranged on the concentric circle C3. The concentric circle C4 is the outermost concentric circle. On the concentric circle C4, seventeen microphones 11e are arranged. The microphones 11 arranged on the concentric circles C2, C3 and C4 function as the beamforming microphones 11. It should be noted that, in
As will be described in detail below, the radii of the four concentric circles C1, C2, C3 and C4, as well as the number and positions of the microphones 11 included in each concentric circle, are determined by searching for optimal directional characteristics. As a result, a variation amount of a difference between the radii of two concentric circles adjacent to each other among the four concentric circles C1, C2, C3, and C4 is determined such that the variation amount does not increase monotonically according to a distance from the center position of the plurality of concentric circles.
Specifically, in the microphone array 1 shown in
Among the plurality of microphones 11 included in the microphone array 1, both (i) a microphone 11a arranged at the central position of the plurality of concentric circles and (ii) three microphones 11b (11b-1, 11b-2, and 11b-3) provided at uniform intervals on the innermost concentric circle C1, which is the closest to the central position, function as a plurality of sound source localization microphones 11 used for specifying positions of the sound sources. The other microphones 11 included in the microphone array 1 function as a plurality of beamforming microphones 11 used for collecting sounds generated from the sound sources whose positions are specified by the sound source localization microphones 11. The microphone 11a and the microphones 11b-1 to 11b-3 may further function as the beamforming microphones 11. In other words, the microphone 11a and the microphones 11b-1 to 11b-3 may be used for two purposes: for the sound source localization and for beamforming.
A distance between two sound source localization microphones 11 adjacent to each other among the plurality of microphones 11 that function as the sound source localization microphones 11 is less than or equal to half of the minimum wavelength of a sound in a frequency band used to specify the direction to the position where the speaker H, who is the sound source, is located. Since aliasing does not occur when the distance between the two sound source localization microphones 11 is set in this manner, the accuracy of estimating the direction toward the speaker H improves.
When a frequency range that includes main frequency components of the voice of an assumed speaker H is equal to or above 500 Hz and equal to or below 4000 Hz, a distance D between the two sound source localization microphones 11 adjacent to each other is preferably 42.5 mm or less, since the wavelength of a sound with a frequency of 4000 Hz is 85 mm. When the frequency range that includes the main frequency components of the voice of the assumed speaker H is equal to or above 500 Hz and equal to or below 5000 Hz, the distance D is preferably 34 mm or less since the wavelength of a sound with a frequency of 5000 Hz is 68 mm. It should be noted that if the distance D is too small, a difference in sounds entering each of the sound source localization microphones 11 becomes too small, and for this reason, the distance D is preferably, for example, 30 mm or more and 40 mm or less.
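The half-wavelength rule above can be checked with a few lines of Python. This is only an illustrative sketch: the function name is hypothetical, and the speed of sound of 340 m/s is an assumption inferred from the 85 mm wavelength quoted for 4000 Hz.

```python
def max_localization_spacing_mm(max_freq_hz, speed_of_sound_m_s=340.0):
    """Return half of the shortest wavelength in the band, in millimetres.

    speed_of_sound_m_s defaults to 340 m/s, an assumption inferred from the
    85 mm wavelength that the text gives for 4000 Hz.
    """
    if max_freq_hz <= 0:
        raise ValueError("max_freq_hz must be positive")
    wavelength_m = speed_of_sound_m_s / max_freq_hz
    return 0.5 * wavelength_m * 1000.0


if __name__ == "__main__":
    # Upper band edges taken from the two cases discussed in the text.
    for freq in (4000.0, 5000.0):
        print(f"{freq:.0f} Hz: D <= {max_localization_spacing_mm(freq):.1f} mm")
```

Under that assumption the two cases give the 42.5 mm and 34 mm limits stated above.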
Also, some of the microphones 11 are provided at a plurality of intersections where at least one straight line L passing through the center of the plurality of concentric circles C1, C2, C3, and C4 intersects with the respective concentric circles C1, C2, C3, and C4. In an example shown in
Because the microphone array 1 is configured in this manner, the accuracy of performing audio processing to enhance the directivity of the direction toward the speaker H is improved, and the load of the audio processing is reduced. Also, since a positional relationship of the plurality of microphones 11 becomes clearer, the accuracy of specifying the direction toward the speaker H is improved.
[Configuration of the Audio Processing Part 2]
The AD converter 21 converts a plurality of sound signals based on sounds that entered the plurality of sound source localization microphones 11 into a plurality of pieces of sound source localization digital data. The AD converter 21 inputs the converted sound source localization digital data to the direction specification part 23. The AD converter 22 converts a plurality of sound signals based on sounds that enter the plurality of beamforming microphones 11 (“BF” in
The direction specification part 23 specifies the direction to the position where the speaker H who is the sound source is located, on the basis of the plurality of sound signals input from the plurality of sound source localization microphones 11. Specifically, the direction specification part 23 specifies the direction toward the speaker H on the basis of a plurality of pieces of sound source localization digital data input from the AD converter 21. The direction specification part 23 specifies the direction toward the speaker H, for example, on the basis of a relationship between the loudness of sounds which each of the plurality of sound source localization digital data indicates. The direction specification part 23 notifies the sound output part 24 of the direction toward the specified speaker H.
The sound output part 24 outputs sounds synthesized by weighting each of the plurality of sounds input to the beamforming microphones 11 on the basis of the direction toward the speaker H specified by the direction specification part 23. Specifically, the sound output part 24 generates a plurality of multiplied values by multiplying each of the plurality of pieces of beamforming digital data, which correspond to the respective microphones 11, by a weight coefficient determined on the basis of the direction to the position where the speaker H who is speaking is located, and outputs the synthesized sounds by adding the generated multiplied values. For example, the absolute value of the weight coefficient for a microphone 11 at a position corresponding to the direction toward the speaker H is set to a value greater than the absolute value of the weight coefficient for a microphone 11 at another position. Due to the direction specification part 23 and the sound output part 24 operating in this manner, reproducibility of the sounds generated by the speakers H is improved regardless of the directions to the positions where the speakers H are located.
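The multiply-and-add operation of the sound output part 24 can be sketched as follows. The function name and the numeric sample and weight values are hypothetical and serve only to illustrate the weighted sum; how the weight coefficients are derived from the direction is not shown here.

```python
def synthesize(samples, weights):
    """Weighted sum of one sample (or one frequency bin) per microphone.

    samples and weights are equal-length sequences of real or complex numbers.
    """
    if len(samples) != len(weights):
        raise ValueError("one weight coefficient is required per microphone")
    return sum(w * z for w, z in zip(weights, samples))


if __name__ == "__main__":
    # Hypothetical values: the middle microphone is assumed to face the
    # speaker, so its weight coefficient has the largest absolute value.
    samples = [0.2, 0.5, 0.1]
    weights = [0.25, 0.5, 0.25]
    print(synthesize(samples, weights))
```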
Since the directional characteristics of the microphone array 1 are different according to the arrangement of the plurality of microphones 11, the quality of the sounds synthesized by the sound output part 24 is affected by the arrangement of the plurality of microphones 11. Next, a method for determining the arrangement of the plurality of microphones 11 for improving the quality of the sounds synthesized by the sound output part 24 will be described in detail.
[Outline of the Method for Determining the Arrangement of the Plurality of Microphones 11]
Hereinafter, the process in which the arrangement search device determines the arrangement of the plurality of microphones 11 will be described with reference to
In order to determine the arrangement of the plurality of microphones 11, the arrangement search device first acquires constraint conditions (step S1). For example, the arrangement search device displays a screen for inputting the constraint conditions on a display, and acquires the constraint conditions input on the screen.
The arrangement search device acquires, for example, the maximum number of the plurality of microphones 11 as one of the constraint conditions. The arrangement search device may also acquire the number of the sound source localization microphones 11 and the radius of the outermost concentric circle of the plurality of concentric circles as constraint conditions. Because the arrangement search device acquires these constraint conditions, the time required to determine an arrangement of the plurality of microphones 11 that satisfies the size and cost requirements of the microphone array 1 can be reduced. The arrangement search device may further acquire, as one of the constraint conditions, a requirement that the number of microphones 11 included in each of the plurality of concentric circles be three or more. By having three or more microphones 11 in one concentric circle, it is possible to reduce the variation of the directional characteristics depending on the direction of the sound source.
Subsequently, the arrangement search device acquires a target value of the directional characteristics of the microphone array 1 (step S2). The directional characteristics of the microphone array 1 are represented by a value corresponding to a difference between (i) the magnitude of a main lobe of sensitivity to the input sound signals and (ii) the magnitude of a side lobe of the sensitivity to the input sound signals. For example, the directional characteristics of the microphone array 1 are expressed as an attenuation amount of the side lobe relative to the main lobe when a predetermined sound is input to the microphone array 1. For example, the arrangement search device displays a screen for inputting the target value on the display, and acquires the target value inputted on the screen.
Next, the arrangement search device determines an initial variable vector for starting a search for the optimal arrangement of the plurality of microphones 11 by using the JADE method (step S3). For example, the arrangement search device sets, as the initial variable vector, a vector whose variables are the number of concentric circles in which the microphones 11 are arranged, the radius of each concentric circle, and the number of microphones 11 in each concentric circle.
Subsequently, the arrangement search device calculates an objective function value (i.e., an initial objective function value) when the determined initial variable vector is used (step S4), and temporarily stores the calculated objective function value as a reference function value in association with the initial variable vector (step S5). The objective function value is a value indicating an error between an ideal value of the directional characteristics of the microphone array 1 and the directional characteristics of the microphone array 1 calculated using the initial variable vector. The smaller the objective function value, the better the directional characteristics.
Next, the arrangement search device determines an updated variable vector (step S6). The updated variable vector is a variable vector in which at least one variable included in the initial variable vector is changed. The arrangement search device determines the updated variable vector by setting at least one of (i) the number of concentric circles in which the microphones 11 are arranged, (ii) the radius of each concentric circle, and (iii) the number of microphones 11 in each concentric circle to a value different from that in the initial variable vector. The arrangement search device uses, for example, the differential evolution algorithm in determining the updated variable vector.
The arrangement search device uses a variable vector including, for example, the number of microphones 11 included in each of the plurality of concentric circles and the radius of each of the plurality of concentric circles, as the updated variable vector which is a mutant vector used in the differential evolution algorithm. The arrangement search device selects, from among a plurality of combinations of (i) the number of microphones 11 included in each of the plurality of concentric circles and (ii) the radius of each of the plurality of concentric circles, a combination indicating directional characteristics with the smallest difference from the target value of the directional characteristics, where the plurality of combinations satisfy the constraint conditions.
Specifically, the arrangement search device first calculates the objective function value when the updated variable vector is used (step S7). The arrangement search device compares the calculated objective function value with the reference function value stored in step S5 (step S8). When the calculated objective function value is equal to or greater than the stored reference function value (YES in step S8), the arrangement search device advances the arrangement determination process to step S10. When the calculated objective function value is less than the stored reference function value (NO in step S8), the arrangement search device stores the calculated objective function value (i.e., the updated objective function value) as a new reference function value in association with the updated variable vector (step S9).
Next, the arrangement search device determines whether or not the objective function value has been calculated a predetermined number of times (step S10). That is, the arrangement search device determines whether or not the objective function value has been calculated for a predetermined number of variable vectors. The predetermined number of times is, for example, a number set by a designer of the microphone array 1. When the objective function value has been calculated the predetermined number of times (YES in step S10), the arrangement search device determines the arrangement indicated by the variable vector stored in association with the reference function value as the arrangement of the plurality of microphones 11, and ends the process.
If the number of times that the calculation of the objective function value has been performed has not reached the predetermined number of times (NO in step S10), the arrangement search device returns the arrangement determination process to step S6. By executing a selection step of steps S7 to S10 in this manner, the arrangement search device selects, from among a plurality of combinations of positions of the microphones 11, an optimal combination indicating the directional characteristics with the smallest difference from the target value of the directional characteristics, where the plurality of combinations satisfy the constraint conditions (step S11). That is, the arrangement search device selects, from among the initial objective function value and a plurality of updated objective function values, a combination of positions of the plurality of microphones 11 corresponding to the minimum objective function value.
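The loop of steps S3 to S11 can be sketched in Python as a keep-the-best search. This is a simplified illustration under stated assumptions: the names `search_arrangement`, `objective`, and `propose` are hypothetical, the candidate generator is left to the caller rather than implementing differential evolution, and the initial evaluation is assumed to count toward the predetermined number of calculations.

```python
import random


def search_arrangement(initial, objective, propose, num_evaluations, rng):
    """Keep the variable vector with the smallest objective function value."""
    reference_vector = initial                    # step S3
    reference_value = objective(initial)          # steps S4 and S5
    evaluations = 1  # assumption: the initial evaluation is counted
    while evaluations < num_evaluations:          # step S10
        candidate = propose(reference_vector, rng)    # step S6
        value = objective(candidate)                  # step S7
        evaluations += 1
        if value < reference_value:                   # step S8 (NO branch)
            reference_vector = candidate              # step S9
            reference_value = value
    return reference_vector, reference_value      # step S11


if __name__ == "__main__":
    # Toy problem (hypothetical): a one-element vector holding a radius in
    # metres, scored by its distance from 0.1 m.
    def toy_objective(vector):
        return abs(vector[0] - 0.1)

    def toy_propose(vector, rng):
        return (vector[0] + rng.gauss(0.0, 0.05),)

    best, value = search_arrangement(
        (0.5,), toy_objective, toy_propose, 200, random.Random(1)
    )
    print(best, value)
```

Because a candidate replaces the stored vector only when its value is strictly smaller, the returned value can never exceed the initial objective function value.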
[Search Example for an Optimal Arrangement Using the JADE Method]
Hereinafter, an example that shows searching for an optimal arrangement of the plurality of microphones 11 using the JADE method is described. The following designing process is performed by executing the programs with the arrangement search device, which executes the flowchart of
It is supposed that a total number of concentric circles is P, the radius of each concentric circle is rp, and the number of microphones 11 arranged in each concentric circle is Mp (p=1, 2, . . . , P). If a distance between a sound source and the microphone array 1 is sufficiently large with respect to the radius rP of the largest concentric circle, a sound signal generated by the sound source is considered to be a plane wave in the vicinity of the microphone array 1. In this case, a sound receiving signal zpm(n) of the m-th microphone 11 on a certain concentric circle p can be expressed by the following equations using an arrival time difference τpm(θ, Φ) based on a sound receiving signal zp,xaxis(n) of the microphones 11 on the x-axis of each concentric circle.
Here, c is the speed of sound. In this case, a directivity G(θ, Φ, ωk) corresponding to the size of the main lobe of the microphone array 1 can be expressed by the following equation.
A weight coefficient w*pm,k of a delay-sum beamformer can be expressed by the following equation.
A design problem relevant to the optimal arrangement of the plurality of microphones 11 can be replaced by a problem of searching for the arrangement of the microphones 11 which can obtain a directivity G(θ, Φ, ωk), which is close to a desired directivity D(θ, Φ, ωk), serving as the target value. The error E(θ, Φ, ωk) used in the search can be expressed by the following equation.
E(θ,ϕ,ωk)=|D(θ,ϕ,ωk)−G(θ,ϕ,ωk)| [Equation 6]
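As an illustration of Equation 6, the following sketch evaluates a delay-and-sum response and its error for a concentric circular array. It rests on several assumptions that are not taken from the equations of this description: directions are restricted to the azimuth θ in the plane of the array, microphones are spaced uniformly on each circle with the first one on the x-axis, delays are measured relative to the array center, and the textbook delay-and-sum weights are used. The radii in the usage example are hypothetical, and all function names are illustrative.

```python
import cmath
import math


def mic_positions(radii, counts):
    """Place counts[p] microphones uniformly on the circle of radius radii[p]."""
    positions = []
    for radius, count in zip(radii, counts):
        for m in range(count):
            angle = 2.0 * math.pi * m / count
            positions.append((radius * math.cos(angle), radius * math.sin(angle)))
    return positions


def arrival_delays(positions, theta, c=343.0):
    """Plane-wave arrival delay of each microphone relative to the center."""
    ux, uy = math.cos(theta), math.sin(theta)
    return [-(x * ux + y * uy) / c for x, y in positions]


def directivity(positions, theta, theta_look, omega, c=343.0):
    """Magnitude response of a delay-and-sum beamformer steered to theta_look."""
    tau = arrival_delays(positions, theta, c)
    tau_look = arrival_delays(positions, theta_look, c)
    total = sum(cmath.exp(1j * omega * (tl - t)) for t, tl in zip(tau, tau_look))
    return abs(total) / len(positions)


def approximation_error(desired, positions, theta, theta_look, omega, c=343.0):
    """E = |D - G| for one direction and one angular frequency."""
    return abs(desired - directivity(positions, theta, theta_look, omega, c))


if __name__ == "__main__":
    # Microphone counts follow the array described earlier (1, 3, 4, 7, 17);
    # the radii are hypothetical placeholders.
    pos = mic_positions([0.0, 0.03, 0.08, 0.12, 0.215], [1, 3, 4, 7, 17])
    omega = 2.0 * math.pi * 1000.0
    print(directivity(pos, 0.0, 0.0, omega))
    print(approximation_error(0.0, pos, math.pi / 2, 0.0, omega))
```

With these weights the response in the look direction is 1, so the error there is zero whenever the desired directivity is 1.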
The optimal placement can be specified by obtaining a variable vector that minimizes the maximum error in an approximate band, as shown in the following equation.
Here, in order to obtain the variable vector that minimizes the maximum error by using the JADE method, the arrangement search device first initializes a population of N candidate solutions Xi (i=1, 2, . . . , N) using uniform random numbers within the domain of the search space, and calculates the objective function value of each individual. The arrangement search device then generates differential mutant individuals, child individuals, and evolution individuals until the maximum generation number I is reached, and searches for the minimal solution of the objective function.
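For readers unfamiliar with JADE, a compact sketch of the generic algorithm is given below, applied to a toy continuous function rather than to the microphone arrangement problem. It follows the commonly published form of JADE (current-to-pbest mutation with an external archive and adaptive μF and μCR); the function name, the default parameter values other than the initial μF, μCR, and Pbest, the clipping of out-of-range values, and the handling of continuous variables only are simplifying assumptions and are not taken from this description.

```python
import math
import random


def jade_minimize(objective, bounds, pop_size=30, generations=200,
                  p_best=0.05, c_adapt=0.1, seed=0):
    """Minimize objective over box bounds with a basic JADE loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialization with uniform random numbers inside the search domain.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]
    mu_f, mu_cr = 0.5, 0.5
    archive = []
    n_top = max(1, round(p_best * pop_size))
    for _ in range(generations):
        good_f, good_cr = [], []
        ranked = sorted(range(pop_size), key=lambda k: fit[k])
        new_pop, new_fit = pop[:], fit[:]
        for i in range(pop_size):
            cr = min(1.0, max(0.0, rng.gauss(mu_cr, 0.1)))
            f = 0.0
            while f <= 0.0:  # Cauchy sample, redrawn until positive
                f = mu_f + 0.1 * math.tan(math.pi * (rng.random() - 0.5))
            f = min(f, 1.0)
            x_best = pop[rng.choice(ranked[:n_top])]
            r1 = rng.choice([k for k in range(pop_size) if k != i])
            pool = [pop[k] for k in range(pop_size) if k not in (i, r1)] + archive
            x_r2 = rng.choice(pool)
            j_rand = rng.randrange(dim)
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if j == j_rand or rng.random() < cr:
                    # Mutation (current-to-pbest) combined with crossover.
                    v = (pop[i][j] + f * (x_best[j] - pop[i][j])
                         + f * (pop[r1][j] - x_r2[j]))
                    v = min(hi, max(lo, v))  # assumption: simple clipping
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_fit = objective(trial)
            if trial_fit < fit[i]:  # selection: the child replaces the parent
                archive.append(pop[i])
                new_pop[i], new_fit[i] = trial, trial_fit
                good_f.append(f)
                good_cr.append(cr)
        pop, fit = new_pop, new_fit
        while len(archive) > pop_size:
            archive.pop(rng.randrange(len(archive)))
        if good_f:  # adapt the parameter means from successful values
            mu_cr = (1 - c_adapt) * mu_cr + c_adapt * sum(good_cr) / len(good_cr)
            mu_f = ((1 - c_adapt) * mu_f
                    + c_adapt * sum(v * v for v in good_f) / sum(good_f))
    best = min(range(pop_size), key=lambda k: fit[k])
    return pop[best], fit[best]


if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    print(jade_minimize(sphere, [(-5.0, 5.0)] * 3))
```

Applying such a loop to the arrangement problem would additionally require handling the integer-valued microphone counts, which this sketch does not attempt.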
In order to apply the JADE method to a microphone arrangement design problem, a variable vector x is defined as follows:
x=[M1, . . . ,MP,r1, . . . ,rP] [Equation 8]
Here, to make sure that the arrangement will not be determined to be an arrangement that is impossible to realize, the constraint conditions for keeping the number of microphones 11 within the maximum number Mmax that can be realized are defined as follows:
In the microphone system S, a sound source localization process is performed prior to the beamforming process. Therefore, when determining the arrangement of the plurality of microphones 11, the arrangement of the sound source localization microphones 11 must also be considered. To arrange one microphone 11 at the central position of the concentric circles and three or six sound source localization microphones 11 in the innermost concentric circle C1, as shown in
M1=1, M2∈{3,6}, Mp′∉v, v={1,2}, p′∈{w∈ℤ | 3≤w≤P} [Equation 10]
When the maximum radius of the outermost concentric circle is Rmax, the constraint conditions on the radius rp of each concentric circle are as follows:
r1=0,rP=Rmax,rp-1<rp [Equation 11]
In this case, a variable vector x′ to be obtained is expressed as follows:
x′=[1,M2, . . . ,MP,0,r2, . . . ,rP-1,Rmax]T [Equation 12]
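The structure of the variable vector and the constraints of Equations 10 and 11, together with the maximum number Mmax, can be checked with a small helper. This is a sketch only: the function names are hypothetical, the vector layout (P microphone counts followed by P radii) follows Equation 8, and the radii in the usage example are placeholders rather than designed values.

```python
import math


def split_vector(x):
    """Split [M1..MP, r1..rP] into a list of counts and a list of radii."""
    half = len(x) // 2
    return [int(v) for v in x[:half]], [float(v) for v in x[half:]]


def is_feasible(x, m_max, r_max):
    """Check the count and radius constraints on a variable vector."""
    if len(x) < 4 or len(x) % 2 != 0:
        return False
    counts, radii = split_vector(x)
    counts_ok = (
        sum(counts) <= m_max                               # within Mmax
        and counts[0] == 1                                 # M1 = 1
        and counts[1] in (3, 6)                            # M2 is 3 or 6
        and all(m >= 0 and m not in (1, 2) for m in counts[2:])
    )
    radii_ok = (
        radii[0] == 0.0                                    # r1 = 0
        and math.isclose(radii[-1], r_max)                 # rP = Rmax
        and all(a < b for a, b in zip(radii, radii[1:]))   # ascending order
    )
    return counts_ok and radii_ok


if __name__ == "__main__":
    # Counts follow the array described earlier; radii are hypothetical.
    x = [1, 3, 4, 7, 17, 0.0, 0.03, 0.08, 0.12, 0.215]
    print(is_feasible(x, m_max=32, r_max=0.215))
```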
Therefore, the design problem of arranging the plurality of microphones 11 is formulated as a mixed integer programming problem, as shown below:
min δ, subject to E(θs,ϕs,ωk)≤δ [Equation 13]
Here, θs and Φs (s=1, . . . , S) represent discrete directions, and δ represents the maximum error in the approximate band in Equation 6. In the search for the optimal arrangement by the JADE method, the following augmented objective function f(x′) using this δ is used.
Here, λu(x′) (u=1, . . . , 4) represents a penalty function. λ1(x′) is a penalty function for limiting the maximum number of microphones 11.
λ2(x′) is a penalty function for the number of sound source localization microphones 11.
λ3(x′) is a penalty function for preventing the number of microphones 11 arranged in each concentric circle from being 2 or less.
λ4(x′) is a penalty function for arranging the radii in ascending order. α>0 is a constant for preventing the difference between the radii of the adjacent concentric circles from being 0.
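One way such penalty terms could be realized is sketched below. The specific forms are assumptions chosen for illustration and are not the penalty functions of this description: each term is zero when its constraint holds and grows with the size of the violation, and the terms are added to δ with a hypothetical weight. The names `penalties` and `penalized_objective`, the weight, and the default value of α are likewise hypothetical.

```python
def penalties(counts, radii, m_max, alpha=1e-3):
    """Return illustrative (lambda1, lambda2, lambda3, lambda4) values."""
    # lambda1: number of microphones exceeding the maximum number.
    lam1 = max(0, sum(counts) - m_max)
    # lambda2: distance of the localization microphone count from 3 or 6.
    lam2 = min(abs(counts[1] - 3), abs(counts[1] - 6))
    # lambda3: shortfall below three microphones on the remaining circles.
    lam3 = sum(max(0, 3 - m) for m in counts[2:])
    # lambda4: adjacent radii that are not ascending by at least alpha.
    lam4 = sum(max(0.0, inner - outer + alpha)
               for inner, outer in zip(radii, radii[1:]))
    return lam1, lam2, lam3, lam4


def penalized_objective(delta, counts, radii, m_max, weight=1e3, alpha=1e-3):
    """Maximum error delta plus weighted penalties (assumed additive form)."""
    return delta + weight * sum(penalties(counts, radii, m_max, alpha))


if __name__ == "__main__":
    counts = [1, 3, 4, 7, 17]
    radii = [0.0, 0.03, 0.08, 0.12, 0.215]  # hypothetical radii in metres
    print(penalties(counts, radii, m_max=32))
    print(penalized_objective(0.25, counts, radii, m_max=32))
```

With this form, an arrangement that satisfies every constraint is scored by δ alone, so the search ranks feasible arrangements purely by their maximum error.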
[First Search Example]
In the present search example, ΦL=0[rad], for simplicity. A desired directivity D(θ, ωk) is set as shown in the following equation.
Here, θS1 and θS2 are the directions of the borders of the main lobe. In the present search example, θS1=−π/3[rad], θS2=π/3[rad], a sound source direction θL=0[rad], and the sound speed c=343 [m/s]. In the JADE method, the initial values of μF and μCR are 0.5, and Pbest is 0.05.
As a result of determining the arrangement of the plurality of microphones 11 with the JADE method using a computer as the arrangement search device under the above conditions, the microphone array 1 shown in
As a comparative example, the radius of each concentric circle and the number of the microphones 11 for each concentric circle of a microphone array, in which the microphones 11 are arranged without using the JADE method, are shown in Table 2.
By comparing
[Second Search Example]
Table 3 shows the radius of each concentric circle and the number of microphones 11 in each concentric circle, determined using the JADE method under the condition that the number of microphones 11 is 48 and the maximum radius of the concentric circle is 0.215 [m].
[Third Search Example]
Table 4 shows the radius of each concentric circle and the number of microphones 11 in each concentric circle, determined using the JADE method under the condition that the number of microphones 11 is 64 and the maximum radius of the concentric circle is 0.215 [m].
The microphone arrays 1 designed by using the JADE method have the following common features:
An example where three sound source localization microphones 11 are arranged at uniform intervals on the innermost concentric circle C1 has been shown above, but six sound source localization microphones 11 may be arranged at uniform intervals on the innermost concentric circle C1.
The present invention has been explained on the basis of the exemplary embodiments. The technical scope of the present invention is not limited to the scope explained in the above embodiments, and it is possible to make various changes and modifications within the scope of the invention. For example, the specific embodiments of the distribution and integration of the apparatus are not limited to the above embodiments, and all or part thereof can be configured with any unit which is functionally or physically dispersed or integrated. Further, new exemplary embodiments generated by arbitrary combinations of them are included in the exemplary embodiments of the present invention. The new exemplary embodiments brought about by the combinations also have the effects of the original exemplary embodiments.
Number | Date | Country | Kind |
---|---|---|---|
2019-149812 | Aug 2019 | JP | national |
Number | Date | Country | |
---|---|---|---|
20230125643 A1 | Apr 2023 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 16996326 | Aug 2020 | US |
Child | 18061634 | US |