This application is the National Stage entry under 35 U.S.C. § 371 of International Application No. PCT/EP2019/061529 filed May 6, 2019, published as Publication No. WO 2019/211487 on Nov. 7, 2019, which claims benefit of foreign priority of German Patent Application No. 10 2018 110 759.5, filed on May 4, 2018, the entireties of which are herein incorporated by reference.
The invention relates to a microphone array.
For sound recordings in large sports facilities, the acoustic events on the field may be particularly interesting for an immersive playback, such as noise from the ball, the bat or racket and so forth as well as conversations of the players, umpire or referee, trainers and so forth. Due to the amount of ambient noise, it is difficult to achieve a good sound quality and speech intelligibility. One reason is that the microphones often have to be positioned on the edge of the field, which results in a large distance to the desired sound sources. The disturbing noise substantially comprises noise of the audience, which in sports facilities is normally located in the spectator stands. Moreover, the microphones for sound recording should not block the view of the spectators or of the cameras that are usually present.
A typical example is the playing field of a soccer stadium, wherein ball noises, player conversations, whistling of the referee and trainer instructions should be captured.
Similar problems may occur in other sports, such as, e.g., baseball, or in other situations where sound recordings are to be made from sound sources that are widely distributed over a plane area and that may be mobile and cannot be directly provided with a microphone, despite disturbing ambient noise.
A solution from LAWO that is known as “KICK” is an arrangement of numerous directional microphones or microphones having a super-cardioid characteristic, which are distributed around a soccer field on the edge of the field, parallel to the ground (https://www.lawo.com/en/products/audio-production-tools/kick.html). For capturing ball noises, the ball position is visually tracked, automatically or semi-automatically. The position data are input into an automatic audio mixing unit that also receives the microphones' output signals, processes or weights them according to the position data and mixes them. The underlying idea is that signals from the microphones that are closest to the current ball position are given particular weight. A disadvantage of this known solution is that a large amount of cabling is required. The cables and the microphones must be laid before each game and removed again after the game. Additional microphones require additional cabling and make the system more expensive. Further, due to the fixed alignment of the microphones, their optimally captured region must be relatively wide in order to also cover the regions in between neighboring microphones. Nevertheless, these regions are captured with only poor sound quality, so that the result is suboptimal. Additionally, a larger coverage area of the microphones in the plane (azimuth angle) leads to an increase in the vertical coverage area (elevation angle), since the directional characteristics (i.e., beam patterns) of known microphones are rotationally symmetric. This means that noises from the higher spectator stands are also captured.
Another possible solution consists in a manual alignment or tracking of directional microphones with a particularly high directivity. However, this is associated with a time delay. Moreover, in the case of manual alignment an operator is required for each directional microphone, and structure-borne noise can be transferred to the microphone. With a possible remote control for aligning the microphones, both additional delay and motor noise would occur, which would inevitably be captured by the microphone and be audible as disturbing noise. An incorrect alignment of a directional microphone affects different frequencies differently, since the directivity of the directional microphones is stronger for higher frequencies than for lower ones. This leads to a permanently changing tone or timbre of the sound signal.
Another known solution for achieving a high directional effect is beamforming, where output signals of a plurality of microphones arranged as an array are combined, e.g., using delay, addition and filtering. The resulting beam, i.e., the region of particularly high sensitivity, has an adjustable direction and is usually rotationally symmetric. The respective shape of the beam depends on the type, number and arrangement of the microphones as well as on the algorithm that is used for the combining. Common algorithms are the Delay-and-Sum (DS) algorithm and the “Minimum Variance Distortionless Response” (MVDR) algorithm, which both have drawbacks, however. Normally, microphone arrays are constructed from microphones without or with low directivity, since they are easy to handle and cheap. This requires a very large number of microphones for obtaining a high directivity over a wide azimuth angle and a similar directivity with respect to elevation, leading to a high computation effort.
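As a minimal sketch of the Delay-and-Sum principle mentioned above, the following Python code evaluates the response of a steered array at a single frequency. It assumes omnidirectional microphones on a circle, far-field plane waves and a speed of sound of 343 m/s; the function names are illustrative only and do not stem from any of the cited systems.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_vector(freq_hz, radius_m, mic_angles, source_angle):
    """Far-field response of omnidirectional microphones on a circle to a
    unit plane wave arriving from source_angle (radians)."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND  # wavenumber
    return [cmath.exp(1j * k * radius_m * math.cos(source_angle - angle))
            for angle in mic_angles]


def delay_and_sum_response(freq_hz, radius_m, mic_angles, look_angle, source_angle):
    """Output of a delay-and-sum beamformer steered to look_angle for a
    unit plane wave arriving from source_angle."""
    look = steering_vector(freq_hz, radius_m, mic_angles, look_angle)
    wave = steering_vector(freq_hz, radius_m, mic_angles, source_angle)
    # Conjugating the look-direction phases undoes the propagation delays,
    # so that signals from the look direction add up coherently.
    return sum(w.conjugate() * x for w, x in zip(look, wave)) / len(mic_angles)
```

For a wave arriving exactly from the look direction the response is 1; for other directions the phases no longer align and the magnitude drops below 1.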
It is therefore an object of the present invention to provide a microphone arrangement that solves the above-mentioned problems.
For multi-channel audio recording, e.g., for 22 channels, an arrangement with shotgun microphones arranged on a circle is known (Y. Sasaki, T. Nishiguchi, K. Ono: “Development of multichannel single-unit microphone using shotgun microphone array”). Neighboring shotgun microphones are used for additionally narrowing the rotationally symmetric directivity pattern (or beam pattern) of each single shotgun microphone at low frequencies to the respective direction by filtering. In another known solution (K. Niwa, Y. Koizumi, K. Kobayashi, H. Uematsu: “Binaural sound generation corresponding to omnidirectional video view using angular region-wise source enhancement”), shotgun microphones are used as an alternative to beamforming.
An object of the present invention is to provide a microphone arrangement with a particularly high directivity in vertical direction and a high yet in wide limits adjustable directivity in horizontal direction.
This object is achieved by the microphone array according to claim 1.
According to the invention, a microphone array comprises a plurality of microphones whose output signals are combined into at least one common output signal, wherein the microphones are shotgun microphones arranged with a preferred direction of high sensitivity. Further, the microphones are arranged essentially evenly on a circle or segment of a circle such that each of the microphones has another preferred direction of high sensitivity, wherein preferably the angles between the individual microphones are substantially equal over the entire circle or segment. The microphones may point inwardly or outwardly with respect to the circle or circle segment. In one embodiment, all microphones are arranged substantially in one plane. In another embodiment, the microphones are arranged in multiple, e.g., two or three, parallel and adjacent planes. The thickness of each plane may correspond to about the diameter of a microphone or interference tube, respectively. The common output signal of the microphone array is obtained by beamforming.
Due to the high directivity of the shotgun microphones, both the elevation angle and the azimuth angle of the detection area or coverage of the arrangement are very small, while the azimuth angle is adjustable in a very large range that may be up to 360°. The resulting azimuthal directivity of the microphone arrangement can be stronger than the directivity of a single shotgun microphone, even if none of the shotgun microphones points to the respective direction. In embodiments where the microphones are distributed over a full circle, there are always some shotgun microphones that point opposite to the actual target direction. This enables a constant directivity, regardless of the orientation of the microphone array.
A method for audio recording by means of shotgun microphones is disclosed in claim 12. Further advantageous embodiments are disclosed in claims 2-11 and 13-14 and in the following detailed description.
Further details and advantageous embodiments are depicted in the drawings in which:
Alternatively, the shotgun microphones may be arranged in two or more different planes. These planes should preferably be close to each other. In principle, the microphones may also be arranged in different planes, but the sensitivity of all microphones with regard to a defined elevation should then be similar. In other words, the “view angles” or focus regions of the various microphones should all be substantially in one plane in an intended distance.
The radius of the circle 120 or circle segment determines the alias frequency and the operating frequency range. A larger radius, at a constant number of directional microphones, results in improvements for low frequencies, by leading to a shift of this range to lower frequencies and leading to a lower alias frequency. Increasing the number of microphones results in a higher alias frequency.
Shotgun microphones afford the advantage of a particularly high directivity, which relates both to a very small azimuth angle as well as a very small elevational angle. The elevational angle is the angle perpendicular to the drawing plane in
A further advantage of a rotationally symmetrical arrangement as in
Various methods of signal processing may be used. A possible and particularly advantageous signal processing for the microphone array is the beamforming algorithm. Here, the beamforming is based on the so-called modal beamforming, which is especially suitable for configurations where all microphones have essentially the same directivity (directional effect) and are arranged on a sphere or on a circle. For the operating frequency range of the array it is possible to achieve an almost uniform directivity over all frequencies of the operating frequency range. The number Q of microphones used determines the maximum achievable degree M of the output signal, which corresponds to the spatial resolution of the beam pattern, according to
The processing is effected in two steps: (a) frequency-independent mixing (or matrixing) of the microphone signals to obtain 2M+1 intermediate signals or mixed signals, and (b) filtering and then weighting and summing of the intermediate signals or mixed signals.
What is especially remarkable is the option to direct the beam (i.e., to steer the resulting direction of high sensitivity) to a target azimuth angle ΦT by accordingly recomputing the real-valued weights gm(ϕ
and provides (2M+1) output signals. Each output signal is filtered, wherein one filter 320 of the (2M+1) filters 320, . . . , 322′ occurs once and all others occur twice as equal filter pairs 321, 321′. E.g., the filter 321 for the (−M+1)th matrix output and the filter 321′ for the (M−1)th matrix output are equal. Each filter or filter pair respectively has its own filtering function, corresponding to an order of a particular mode. The output signal of each filter 320, . . . , 322′ is weighted in one or more weighting units 330 according to the desired azimuthal direction ΦT with a corresponding gain value g−M(ϕ
For the number of directional microphones and their positions, the following applies. In general, the number of microphones determines the spatial resolution of the achievable target beam pattern or directional characteristic, in particular the maximum directivity index, which is the ratio between the beamformer's output power with respect to a desired target direction and the total output power integrated over all other directions. In the context of modal beamforming, it is useful to choose the number Q of microphones in dependence of the required maximum degree M according to Q=2M+1. If the Circular Harmonics transform described below is used, then it is advantageous, in consideration of the assumptions made for it, to use a uniform distribution of microphones on a circle. This ensures a uniform signal quality over all (azimuthal) directions, as intended with modal beamforming.
If another algorithm than modal beamforming is used, it may however be appropriate to arrange the directional microphones differently, namely not exactly radially but slightly rotated or displaced, respectively. This makes the overall arrangement smaller, without reducing the length of the individual directional microphones or the diameter of the circle of microphone capsules.
Moreover, it may make sense for certain applications to arrange the directional microphones on a segment of a circle that has a certain angle, e.g., if only low levels of disturbing noise from the rear are to be expected. However, the disadvantage of a segmental arrangement as compared to a circular arrangement is that for a positioning near the edge, ambient noise from directions in which no directional microphone is pointed cannot be well suppressed. This problem can be compensated partially by making the segment larger than the region to be observed.
However, for segmental arrangements of directional microphones, other algorithms than modal beamforming are normally better suited, since they are not based on a circularly symmetrical arrangement of the microphones. But a disadvantage of such alternatively usable algorithms is that not only their scalar weightings but also their filtering functions are direction dependent. Since the calculation of filtering functions, or filter coefficients respectively, is often relatively computationally expensive, these may be calculated in advance. The device comprises then a memory in which the respective filtering coefficients for certain directions are stored and from which they can be retrieved if necessary. In this way, real-time operation is also possible with such alternative algorithms.
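The precomputation described above can be sketched as a lookup table keyed by direction. The grid of stored directions, the nearest-direction retrieval and the `design_filter` callback are assumptions made for illustration; the text does not specify how the stored coefficients are organized.

```python
def angular_distance_deg(a, b):
    """Smallest absolute difference between two azimuth angles in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def build_filter_table(directions_deg, design_filter):
    """Run the (expensive) filter design once per stored direction.

    design_filter is a hypothetical callback that returns the filter
    coefficients for one direction."""
    return {direction: design_filter(direction) for direction in directions_deg}


def lookup_filters(table, target_deg):
    """Return the stored coefficients of the direction closest to target_deg."""
    nearest = min(table, key=lambda direction: angular_distance_deg(direction, target_deg))
    return table[nearest]
```

At run time only `lookup_filters` is called, so the cost of steering is a dictionary access rather than a filter design.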
Details of the two-dimensional modal beamforming will be explained hereinafter.
First, basic assumptions and relationships will be explained. In a compact area of interest in three-dimensional space that contains the center of a notional coordinate system, which is free of sound sources and which is excited from outside by a sound field which is independent of the coordinate system's z-axis, there is an array of Q acoustic sensors (i.e., microphones) that behave linearly. They are arranged on a circle within the x-y plane of the notional coordinate system, with the (two-dimensional) coordinates
$$x_q = \begin{bmatrix} r_0 \cos\phi_q \\ r_0 \sin\phi_q \end{bmatrix}, \quad q = 1, \ldots, Q \qquad (1)$$
with r0 being the radius of the circle and Φq being the azimuth angle of the q-th microphone, measured counter-clockwise in the x-y plane from the x-axis. The frequency domain representation X(ω, xq) of the q-th microphone signal at an angular frequency ω may be expressed as a superimposition (or composition) of responses to individual plane waves impinging from all possible azimuth angles Φ, i.e.,
$$X(\omega, x_q) = \int_{-\pi}^{\pi} H(\omega, x_q, \phi) \cdot C(\omega, \phi)\, d\phi \qquad (2)$$
Here, C(ω,Φ) denotes the so-called plane wave amplitude density function, which is basically a frequency domain representation of the sound pressure in the coordinate origin caused by a single plane wave incident from an azimuth angle Φ. H(ω, xq, ϕ) indicates the directivity pattern of the q-th microphone.
By expanding the directivity pattern H(ω, xq,ϕ) and the plane wave amplitude density function C(ω, ϕ) into series of real valued orthonormal Circular Harmonics (a special form of the Spherical Harmonics), defined by
according to
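$$H(\omega, x_q, \phi) = \sum_{m=-\infty}^{\infty} H_m(\omega, x_q)\, \mathrm{trg}_m(\phi) \qquad (4)$$
$$C(\omega, \phi) = \sum_{m=-\infty}^{\infty} C_m(\omega)\, \mathrm{trg}_m(\phi) \qquad (5)$$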
and exploiting the orthonormality of the Circular Harmonics, i.e.,
$$\int_{-\pi}^{\pi} \mathrm{trg}_m(\phi)\, \mathrm{trg}_{m'}(\phi)\, d\phi = \delta_{m,m'} \qquad (6)$$
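where $\delta_{m,m'}$ denotes the Kronecker delta function, the frequency domain microphone signal representation X(ω, xq) can be reformulated as
$$X(\omega, x_q) = \sum_{m=-\infty}^{\infty} H_m(\omega, x_q)\, C_m(\omega) \qquad (8)$$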
The individual weights Hm(ω, xq) of the Circular Harmonics series in (4) are called the modal responses of degree m.
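The orthonormality relation (6) can be checked numerically. The sketch below assumes a common convention for the real-valued Circular Harmonics, namely cosines for m > 0, a constant for m = 0 and sines for m < 0, normalized over the interval from −π to π; in particular the sign of the sine branch is an assumption. The names `trg` and `inner_product` are chosen for illustration.

```python
import math


def trg(m, phi):
    """Real-valued circular harmonic of degree m (assumed sign convention)."""
    if m > 0:
        return math.cos(m * phi) / math.sqrt(math.pi)
    if m < 0:
        return math.sin(-m * phi) / math.sqrt(math.pi)
    return 1.0 / math.sqrt(2.0 * math.pi)


def inner_product(m1, m2, num_points=720):
    """Approximate the integral in (6) over [-pi, pi) on a uniform grid."""
    step = 2.0 * math.pi / num_points
    total = 0.0
    for i in range(num_points):
        phi = -math.pi + i * step
        total += trg(m1, phi) * trg(m2, phi)
    return step * total
```

With this convention, `inner_product(m1, m2)` is 1 for equal degrees and 0 otherwise, up to rounding errors.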
If all microphones have identical directivity patterns and face outwards or inwards perpendicular to the circle, this may be formally expressed as
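$$H(\omega, x_q, \phi) = H_{\mathrm{PROTO}}(\omega, r_0, \phi - \phi_q) \qquad (9)$$
with HPROTO(ω, r0, ϕ) indicating a ϕ-symmetric prototype directivity, which can be regarded as belonging to a microphone located at a position (r0, ϕq=0). Due to its ϕ-symmetry, the Circular Harmonics expansion of HPROTO(ω, r0, ϕ) is given by
$$H_{\mathrm{PROTO}}(\omega, r_0, \phi) = \sum_{m=-\infty}^{\infty} H_{\mathrm{PROTO},m}(\omega, r_0)\, \mathrm{trg}_m(\phi) \qquad (10)$$
with
$$H_{\mathrm{PROTO},m}(\omega, r_0) = 0 \quad \text{for } m < 0. \qquad (11)$$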
For this special case, the modal responses can be factorized into a frequency and radius-dependent component and another component that only depends on the azimuth angle according to
$$H_m(\omega, x_q) = b_m(\omega, r_0)\, \mathrm{trg}_m(\phi_q) \qquad (12)$$
Further remarkable is the symmetry of the radial components
$$b_m(\omega, r_0) = b_{-m}(\omega, r_0) \quad \forall m \qquad (14)$$
and the fact that the radial components depend on the product of the angular frequency and the radius:
$$b_m(\omega, r_0) = b_m(\omega r_0) \qquad (14a)$$
By plugging (12) into (8), the frequency domain representation X(ω, xq) of the q-th microphone signal may be expressed as
$$X(\omega, x_q) = \sum_{m=-\infty}^{\infty} b_m(\omega, r_0)\, C_m(\omega)\, \mathrm{trg}_m(\phi_q) \qquad (15)$$
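The factorization of the modal responses into a radial component and an azimuth-dependent component can be verified numerically. The sketch below makes several assumptions that are not taken from the text: the sign convention of `trg`, a squared-cardioid magnitude multiplied by a plane-wave phase term as a stand-in for the prototype directivity, and the normalization of the radial component (the prototype coefficient of degree |m| scaled by √π, or by √(2π) for m = 0).

```python
import cmath
import math


def trg(m, phi):
    """Real-valued circular harmonic of degree m (assumed sign convention)."""
    if m > 0:
        return math.cos(m * phi) / math.sqrt(math.pi)
    if m < 0:
        return math.sin(-m * phi) / math.sqrt(math.pi)
    return 1.0 / math.sqrt(2.0 * math.pi)


def prototype(phi, k_r0):
    """Hypothetical phi-symmetric prototype directivity of a microphone at
    azimuth 0: squared cardioid times the plane-wave phase at radius r0."""
    return (0.5 + 0.5 * math.cos(phi)) ** 2 * cmath.exp(1j * k_r0 * math.cos(phi))


def expand(func, m, num_points=1024):
    """Circular Harmonics coefficient of degree m of a 2*pi-periodic function."""
    step = 2.0 * math.pi / num_points
    return step * sum(func(i * step) * trg(m, i * step) for i in range(num_points))


def radial_component(m, k_r0):
    """b_m, derived from the prototype coefficient of degree |m|."""
    scale = math.sqrt(2.0 * math.pi) if m == 0 else math.sqrt(math.pi)
    return scale * expand(lambda phi: prototype(phi, k_r0), abs(m))


def modal_response(m, phi_q, k_r0):
    """H_m of a microphone at azimuth phi_q, computed directly from the
    rotated prototype directivity."""
    return expand(lambda phi: prototype(phi - phi_q, k_r0), m)
```

For any microphone azimuth, `modal_response(m, phi_q, k_r0)` agrees with `radial_component(m, k_r0) * trg(m, phi_q)`, and the radial component is the same for m and −m, in line with the symmetry (14).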
In the following, the basic principle of modal beamforming is described. It may be subdivided into the following two steps:
A block diagram of a typical modal beamformer is shown in
To motivate the incident sound field reconstruction, the Circular Harmonics expansion of the frequency domain microphone signals
$$X(\omega, x_q) = \sum_{m=-\infty}^{\infty} X_m(\omega, r_0)\, \mathrm{trg}_m(\phi_q) \qquad (16)$$
is compared with (15). It becomes clear that the expansion coefficients Xm(ω, r0) are related to the desired Circular Harmonics series expansion coefficients Cm(ω) of the plane wave amplitude density function according to
$$X_m(\omega, r_0) = b_m(\omega, r_0)\, C_m(\omega) \qquad (17)$$
Therefore, two further steps are performed:
Here, it is to be noted that due to the finite number Q of spatial sampling points xq the maximum absolute value of the degree m that can be reconstructed is also finite, and depends on the distribution of the spatial sampling points xq on the circle. For instance, for the special case of a uniform distribution, the weights are all equal, namely
and the maximum absolute value of the degree m that can be reconstructed is given by
By defining the vector X(ω) containing the signals of all microphones by
$$\mathbf{X}(\omega) = [X(\omega, x_1)\;\; X(\omega, x_2)\;\; \ldots\;\; X(\omega, x_Q)]^T \qquad (20)$$
the vector with all Circular Harmonics series expansion coefficients by
$$\mathbf{X}_{\mathrm{CH}}(\omega, r_0) = [\hat{X}_{-M}(\omega, r_0)\;\; \hat{X}_{-M+1}(\omega, r_0)\;\; \ldots\;\; \hat{X}_{M}(\omega, r_0)]^T \qquad (21)$$
and the discrete Circular Harmonics transformation matrix by
the estimation of the Circular Harmonics series expansion coefficients may be expressed by the following matrix multiplication:
$$\mathbf{X}_{\mathrm{CH}}(\omega, r_0) = \mathbf{T}^{(M)}(\phi_1, \phi_2, \ldots, \phi_Q) \cdot \mathbf{X}(\omega) \qquad (23)$$
Especially important is that this matrix is frequency independent.
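For a uniform distribution of the microphones, a frequency-independent transformation matrix of this kind can be sketched as follows. The sketch assumes that each matrix entry is the Circular Harmonic of degree m sampled at the microphone azimuth and multiplied by an equal quadrature weight of 2π/Q, and that M is chosen such that Q ≥ 2M+1; the sign convention of `trg` and all function names are likewise assumptions.

```python
import math


def trg(m, phi):
    """Real-valued circular harmonic of degree m (assumed sign convention)."""
    if m > 0:
        return math.cos(m * phi) / math.sqrt(math.pi)
    if m < 0:
        return math.sin(-m * phi) / math.sqrt(math.pi)
    return 1.0 / math.sqrt(2.0 * math.pi)


def uniform_angles(num_mics):
    """Azimuth angles of num_mics microphones spread evenly over the circle."""
    return [2.0 * math.pi * q / num_mics for q in range(num_mics)]


def ch_transform_matrix(max_degree, mic_angles):
    """(2M+1) x Q matrix with one row per degree m = -M..M.

    Assumes uniformly spaced microphones, so that every sampling point has
    the same quadrature weight 2*pi/Q."""
    weight = 2.0 * math.pi / len(mic_angles)
    return [[weight * trg(m, angle) for angle in mic_angles]
            for m in range(-max_degree, max_degree + 1)]


def apply_matrix(matrix, vector):
    """Matrix-vector product; works for real or complex microphone signals."""
    return [sum(entry * value for entry, value in zip(row, vector)) for row in matrix]
```

Since the matrix contains no frequency-dependent term, the same matrix can be applied to every frequency bin, or equally to the time-domain samples of the microphone signals.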
which corresponds to a filtering operation for each individual estimated Circular Harmonics series expansion coefficient $\hat{X}_m(\omega, r_0)$ of the frequency domain microphone signals.
Using the estimated Circular Harmonics series expansion coefficients of the plane wave amplitude density function, the individual plane waves of the incident sound field are weighted according to a desired target beam pattern to be subsequently integrated, or summed up respectively.
The maximum degree M of the Circular Harmonics series expansion coefficients of the plane wave amplitude density function determines the maximum possible spatial resolution of the target beam pattern. Hence, a prototype of a desired target beam pattern is defined by means of a Circular Harmonics expansion truncated at the same maximum degree M:
g(ϕ
which is steered towards a target azimuth angle ϕT=0 and which is Φ-symmetric. Due to the symmetry, the expansion coefficients for negative degree indices m are zero.
If the target beam pattern is steered to an arbitrary target azimuth ΦT, its Circular Harmonics series expansion coefficients can be computed in dependence on those for ϕT=0 according to gm(ϕ
The actual frequency domain output signal Y(ω) of the beamformer is computed as a weighted sum of the Circular Harmonics series expansion coefficients of the plane wave amplitude density function as follows:
$$Y(\omega) = \sum_{m=-M}^{M} g_m(\phi_T)\, \hat{C}_m(\omega) \qquad (28)$$
Due to the equivalence of (28) with
Y(ω)=∫−ππg(ϕ
the integration of the weighted plane wave contributions to the incident sound field becomes evident.
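The steering of a φ-symmetric prototype beam pattern and the weighted sum that forms the output can be sketched as follows. The sketch assumes the cosine/sine convention for `trg` used in its code, derives the steered weights from the addition theorem cos(m(ϕ−ϕT)) = cos(mϕ)cos(mϕT) + sin(mϕ)sin(mϕT), and uses a hypothetical prototype whose coefficients equal the Circular Harmonics evaluated at 0, which yields the narrowest pattern of degree M. None of these choices are taken from the text.

```python
import math


def trg(m, phi):
    """Real-valued circular harmonic of degree m (assumed sign convention)."""
    if m > 0:
        return math.cos(m * phi) / math.sqrt(math.pi)
    if m < 0:
        return math.sin(-m * phi) / math.sqrt(math.pi)
    return 1.0 / math.sqrt(2.0 * math.pi)


def prototype_coefficients(max_degree):
    """Hypothetical prototype steered to azimuth 0; coefficients for
    negative degrees are zero because the pattern is phi-symmetric."""
    return {m: (trg(m, 0.0) if m >= 0 else 0.0)
            for m in range(-max_degree, max_degree + 1)}


def steer(proto, target_angle):
    """Real-valued weights of the prototype rotated to target_angle."""
    steered = {}
    for m in proto:
        if m == 0:
            steered[m] = proto[0]
        else:
            steered[m] = math.sqrt(math.pi) * proto[abs(m)] * trg(m, target_angle)
    return steered


def beam_pattern(weights, phi):
    """Beam pattern value at azimuth phi for the given weights."""
    return sum(g * trg(m, phi) for m, g in weights.items())


def beamformer_output(weights, c_hat):
    """Weighted sum of the estimated plane wave coefficients for one frequency."""
    return sum(g * c_hat[m] for m, g in weights.items())
```

Steering only changes the scalar weights; `beam_pattern(steer(proto, target), phi)` equals the prototype pattern evaluated at `phi - target`, so the shape of the beam is preserved for every target azimuth.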
For most applications, the frequency-invariant beam pattern used above is advantageous and desired. However, also a frequency dependent beam pattern can be created very easily by making the weighting factors frequency dependent. This requires a filter per individual Circular Harmonics series expansion coefficient of the plane wave amplitude density function before summation.
Optionally, an equalizing filter 350, 350′ can be applied to the output signal Y(ω) of the beamformer to create a direction independent coloration, or compensate a direction dependent coloration respectively, e.g., to attenuate high frequency signal components affected by spatial aliasing.
The radius of the circle on which the microphone capsules of the directional microphones are arranged affects at least two parameters of the array, namely the practically realizable directivity for low frequencies and the frequency at which the spatial aliasing starts occurring.
The directivity at low frequencies is affected as follows. The radial components bm(ω, r0) of the modal responses typically have a high-pass characteristic, where the cutoff frequency increases with the degree index m. For illustration,
(see (26)), since $|b_{|m|}(\omega, r_0)|$ is small. This leads to a typically low white noise gain when using a target beam pattern of high degree m, which means that microphone noise is highly amplified within the beamformer output signal. By increasing the radius r of the array, the curves depicted in
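The high-pass behaviour of the radial components can be illustrated numerically. The sketch below uses a squared-cardioid directivity multiplied by a plane-wave phase term as a hypothetical stand-in for the prototype directivity; it is not a model of a real shotgun microphone, and the normalization of the radial component is an assumption.

```python
import cmath
import math


def prototype(phi, k_r0):
    """Hypothetical prototype directivity of a microphone at azimuth 0 on a
    circle of radius r0; k_r0 is the product of wavenumber and radius."""
    return (0.5 + 0.5 * math.cos(phi)) ** 2 * cmath.exp(1j * k_r0 * math.cos(phi))


def radial_magnitude(m, k_r0, num_points=1024):
    """Magnitude of the radial component of degree m >= 0, taken here as the
    cosine projection of the prototype (assumed normalization)."""
    step = 2.0 * math.pi / num_points
    total = 0j
    for i in range(num_points):
        phi = i * step
        total += prototype(phi, k_r0) * math.cos(m * phi)
    return abs(step * total)
```

At a small product of wavenumber and radius, the magnitudes fall off quickly with increasing degree, and a larger radius at the same frequency raises the magnitude of the higher degrees, which reduces the amplification needed for them.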
Spatial aliasing is a phenomenon that occurs e.g., when sampling a sound field with the sampling points being distributed too sparsely to capture high frequency spatial sound pressure oscillations. Since the relevance of Circular Harmonics with higher degree m within the signature function usually grows with spectral frequency, the same happens with the amount of error caused by the spatial aliasing. In particular, the angular frequency where the contribution of Circular Harmonics of degrees greater than M to the signature function becomes significant can be seen as the frequency where the aliasing error effects start to become disturbing, or notable respectively. Substantially, this angular frequency is
where cs denotes the speed of sound. This means that for a chosen number Q of microphones the spatial aliasing frequency may be increased by decreasing the array radius r. Alternatively, the number of microphones can be increased for a given array radius.
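The trade-off between the number of microphones, the array radius and the spatial aliasing frequency can be sketched with a rule of thumb that is common for circular arrays: modes of a degree above M become significant once the product of wavenumber and radius reaches M. This rule, the choice M = ⌊(Q−1)/2⌋ and the speed of sound of 343 m/s are assumptions made for illustration and may differ from the expression intended above.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def max_degree(num_mics):
    """Largest degree M that satisfies Q >= 2M + 1."""
    return (num_mics - 1) // 2


def alias_frequency_hz(num_mics, radius_m, speed_of_sound=SPEED_OF_SOUND):
    """Approximate frequency at which spatial aliasing becomes notable,
    using the assumed rule of thumb k * r0 = M."""
    return max_degree(num_mics) * speed_of_sound / (2.0 * math.pi * radius_m)
```

Under this rule, halving the radius doubles the estimated aliasing frequency, and adding microphones at a fixed radius raises it as well, which matches the qualitative behaviour described above.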
For microphone arrays for audible frequencies, the microphone capsules should be on a circle or circle segment with a radius of at least rmin=5 cm. For practical reasons, a maximum radius of about rmax=100 cm is advisable. For microphone arrays intended for usage in a sports stadium, it is advantageous if the radius is between rmin=30 cm and rmax=40 cm for outwardly pointing shotgun microphones and, e.g., between rmin=40 cm and rmax=60 cm for inwardly pointing shotgun microphones. With the exemplarily described arrangement, a very high directivity can be achieved, e.g., for frequencies in the range of 200 Hz-3 kHz. For recordings in a sports stadium, frequencies below 3-4 kHz are particularly relevant.
A smaller construction of the microphone array is possible if the circularly arranged shotgun microphones point radially inward. The above calculations continue to be valid in this case.
A particular advantage of the microphone array according to the invention is that it need not be moved but remains stationary, wherein the direction of maximum sensitivity can be adjusted by electronic control, in the case of the circular arrangement to any direction within the plane of the circle (corresponding to an azimuth angle of 0°-360° in a horizontal setup). In other specific applications it may make sense to position the circle vertically in order to capture an elevation angle of 0°-360° while keeping the azimuth angle very small. Likewise, arbitrary orientations of the microphone plane are possible in between. As shown in the drawings, there is no microphone in the center of the arrangement. The mentioned respective number of shotgun microphones per array is the respective minimum number; it is always possible and may be advantageous to increase the number Q of microphones, as explained above. The number Q may be even or odd.
In an embodiment, the invention relates to a method for audio recording by means of a microphone array composed of directional microphones, wherein at least one common output signal is generated that comprises sound coming from an adjustable preferred direction of high sensitivity of the microphone array, with the steps: mixing a plurality of microphone signals in a mixing matrix to obtain (2M+1) mixed signals, wherein M is the order of the common output signal, and wherein the microphone signals come from the directional microphones and the directional microphones are arranged substantially in a plane and on a circle or segment of a circle, such that for each of the directional microphones a preferred direction of high sensitivity is substantially orthogonal outward or inward to the circle or circle segment; filtering the mixed signals in a plurality of (2M+1) filters, wherein filtered mixed signals are obtained, weighting each of the filtered mixed signals with a weighting in a plurality of (2M+1) weighting units, wherein the weighting of each weighting unit corresponds to the adjustable preferred direction of high sensitivity of the microphone array, and summing up the (2M+1) weighted filtered mixed signals in a summation unit, wherein the common output signal is obtained.
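The steps of the method (mixing, filtering, weighting and summing) can be sketched end to end for a single frequency. The following simulation rests on several assumptions that are not taken from the text: a squared-cardioid stand-in for the directivity of the microphones, filters that are the inverse of the radial components, weights equal to the Circular Harmonics evaluated at the target azimuth, equal mixing weights of 2π/Q, and more microphones than the minimum of 2M+1 in order to keep spatial aliasing negligible.

```python
import cmath
import math


def trg(m, phi):
    """Real-valued circular harmonic of degree m (assumed sign convention)."""
    if m > 0:
        return math.cos(m * phi) / math.sqrt(math.pi)
    if m < 0:
        return math.sin(-m * phi) / math.sqrt(math.pi)
    return 1.0 / math.sqrt(2.0 * math.pi)


def prototype(phi, k_r0):
    """Hypothetical directivity of a microphone at azimuth 0 on the circle."""
    return (0.5 + 0.5 * math.cos(phi)) ** 2 * cmath.exp(1j * k_r0 * math.cos(phi))


def radial_component(m, k_r0, num_points=1024):
    """b_m obtained by projecting the prototype onto degree |m|."""
    step = 2.0 * math.pi / num_points
    coeff = step * sum(prototype(i * step, k_r0) * trg(abs(m), i * step)
                       for i in range(num_points))
    scale = math.sqrt(2.0 * math.pi) if m == 0 else math.sqrt(math.pi)
    return scale * coeff


def record(mic_angles, source_angle, k_r0):
    """Microphone signals for a unit plane wave arriving from source_angle."""
    return [prototype(source_angle - angle, k_r0) for angle in mic_angles]


def modal_beamformer(signals, mic_angles, max_degree, k_r0, target_angle):
    """Single-frequency output for a beam steered to target_angle."""
    weight = 2.0 * math.pi / len(mic_angles)
    output = 0j
    for m in range(-max_degree, max_degree + 1):
        # Mixing: frequency-independent combination of the microphone signals.
        mixed = weight * sum(trg(m, a) * x for a, x in zip(mic_angles, signals))
        # Filtering: compensate the radial component of this degree.
        filtered = mixed / radial_component(m, k_r0)
        # Weighting and summing: real-valued gain for the target direction.
        output += trg(m, target_angle) * filtered
    return output
```

With, e.g., 16 microphones, M = 4 and a product of wavenumber and radius of 1.28, the output is largest when the target azimuth coincides with the direction of the incoming wave, and only the scalar gains change when the beam is steered elsewhere.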
The embodiments described above are exemplary and may be combined with one another, even if such combination is not expressly mentioned. E.g., in an array arrangement as shown in
Number | Date | Country | Kind |
---|---|---|---|
102018110759.5 | May 2018 | DE | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2019/061529 | 5/6/2019 | WO |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2019/211487 | 11/7/2019 | WO | A |
Entry |
---|
Search Report for Application No. PCT/EP2019/061529 dated Jul. 12, 2019. |
Renato S. Pellegrini et al., “Object-Audio Capture System for Sports Broadcast” Sep. 27, 2018 (Sep. 27, 2018). Retrieved from the Internet: https://www.ibc.org/download?ac=6529 [retrieved on Jul. 2, 2019] XP055601525. |
K. Niwa et al., “Binaural sound generation corresponding to omnidirectional video view using angular region-wise source enhancement,” 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Mar. 20, 2016. |
Y. Sasaki et al., “Development of multichannel single-unit microphone using shotgun microphone array,” Paper ICA2016-155, Electroacoustics and Audio Engineering, 22nd International Congress on Acoustics, Buenos Aires, Sep. 5-9, 2016. |
Number | Date | Country | |
---|---|---|---|
20210235187 A1 | Jul 2021 | US |