The present technology relates to a sound image localizing device, a sound image localizing method, and a program, and particularly to an acoustic reproduction technology that produces the effect of generating a virtual sound source at a desired position, rather than at a speaker main body.
In recent years, reproduction methods that use a plurality of speakers have come into wide use in public viewing venues and at home. Also, following the spread of imaging technologies for 3D images and wide images, efforts are being made in acoustics as well to realize reproduction with a higher sense of presence by generating a virtual sound source at a desired position, rather than at a speaker main body. In particular, a virtual speaker is generated by controlling the directivity of sound and causing the sound to be reflected from a wall surface. The directivity control is performed using either a speaker that uses ultrasonic waves and has sharp directivity, or a speaker array constituted by arranging a plurality of ordinary speakers. Ultrasonic speakers commonly demodulate ultrasonic waves into audible sound; accordingly, the sound quality deteriorates due to distortion occurring in demodulation, and the treble range in particular is difficult to reproduce. To reproduce various types of content such as music, directional reproduction that achieves high sound quality over a wide frequency band is needed.
Directivity Control Technology
The following describes directivity control technologies. These technologies control the direction in which sound propagates strongly from speakers, or the direction in which sound does not propagate from the speakers. This is done by arranging control points around a speaker array in which a plurality of speakers are arranged, designing filters that control the amplitude and the phase of each speaker based on the characteristics of transfer from the speakers to the control points, and applying the filters to an input signal.
A representative method is design of a directivity control filter using the least squares method.
Here, G(ω) represents a transfer function matrix with M rows and L columns in which the transfer functions G_ml(ω) from the speakers to the control points are stored, and G_ml(ω) is given by the following expression.
Here, j represents the imaginary unit j=√(−1), k represents a wavenumber, and r_ml represents the distance from the m-th control point to the l-th speaker. The least squares method for finding a directivity control filter is a minimization problem of finding a filter w(ω) that minimizes the sum of squares ∥e∥² of the errors between a desired directional characteristic d(ω) and the directional characteristic dO(ω) observed at the control points. Accordingly, an objective function J to be minimized is expressed by the following expression.
$$J = \|e(\omega)\|^2 = \bigl(d(\omega) - G(\omega)w(\omega)\bigr)^{H}\bigl(d(\omega) - G(\omega)w(\omega)\bigr) \tag{3}$$
Here, the superscript H represents complex conjugate transposition. The following directivity control filter is found by solving the problem of minimizing the objective function J expressed by Expression (3) with respect to w(ω).
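As an illustrative sketch only, the least squares design described above can be reproduced numerically. The minimizer of Expression (3) satisfies the standard normal equations G(ω)^H G(ω) w(ω) = G(ω)^H d(ω), which is what the solver below computes. The free-field monopole form assumed for G_ml(ω), the array geometry, the control-point layout, the speed of sound, and the function names `transfer_matrix` and `least_squares_filter` are assumptions introduced for this example and are not taken from the present description.

```python
import numpy as np


def transfer_matrix(speakers, points, freq, c=343.0):
    """Return an (M control points) x (L speakers) transfer function matrix.

    Assumption: free-field monopole propagation,
    G_ml = exp(-j k r_ml) / (4 pi r_ml), with k = 2 pi f / c.
    """
    k = 2.0 * np.pi * freq / c
    # r[m, l] is the distance from control point m to speaker l.
    r = np.linalg.norm(points[:, None, :] - speakers[None, :, :], axis=-1)
    return np.exp(-1j * k * r) / (4.0 * np.pi * r)


def least_squares_filter(G, d):
    """Minimize ||d - G w||^2 over w at a single frequency."""
    w, *_ = np.linalg.lstsq(G, d, rcond=None)
    return w


# Hypothetical setup: 8 speakers on a 28 cm line, control points on a
# half circle of radius 2 m, desired beam around 60 degrees.
L = 8
speakers = np.stack([np.linspace(-0.14, 0.14, L), np.zeros(L)], axis=1)
angles_deg = np.arange(0, 181, 5)
angles = np.deg2rad(angles_deg)
points = 2.0 * np.stack([np.cos(angles), np.sin(angles)], axis=1)
d = np.where(np.abs(angles_deg - 60) <= 10, 1.0, 0.0).astype(complex)

G = transfer_matrix(speakers, points, freq=2000.0)
w = least_squares_filter(G, d)
J = np.linalg.norm(d - G @ w) ** 2  # value of the objective at the solution
```

The same call would be repeated for every frequency bin of interest. Inspecting `np.abs(w)` at low frequencies is one way to observe the large filter gains discussed in the following section.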
Directivity Control Technology Using Reflecting Plate
With regard to acoustic reproduction technologies for generating a virtual speaker by using reflection of sound, a method based on PTL 1 realizes local reproduction by controlling directivity such that the sum total of radiated sounds from a directional speaker and reflected sounds from a reflecting plate is the maximum at a desired point.
Filter Gain Suppression Using Penalty Term
When a filter for controlling the directivity of sound is designed, the computed filter includes a filter gain that affects a sound source output from the filter. Here, a filter gain F_l^gain(ω) that corresponds to an l-th speaker at an angular frequency ω is defined as follows.
$$F_l^{\mathrm{gain}}(\omega) = |w_l(\omega)| = \sqrt{w_l(\omega)^{*}\, w_l(\omega)} \tag{5}$$
Here, wl(ω) represents a filter coefficient that corresponds to the l-th speaker. Also, the superscript * represents a complex conjugate. If the filter gain is large, an input signal increases in proportion to the filter gain, and a large load is applied to the speaker, which makes reproduction difficult. In terms of this, NPL 1 derived a filter for controlling directivity by using a penalty term, which will be described later, with respect to an objective function for deriving the filter. At this time, the sum of squares of filter coefficients was used as the penalty term to suppress the filter gain.
A directivity control filter of a case where the penalty term is used will be considered, taking a directivity control filter obtained using the least squares method as an example. If the penalty term is used with respect to the objective function J of Expression (3), the following expression is obtained.
$$J = \|e\|^2 + \beta(\omega)\,\|w(\omega)\|^2 \tag{6}$$
Here, β(ω) is a regularization parameter that controls a relative weight between ∥e∥2, which is a loss term, and ∥w(ω)∥2, which is the penalty term. Similarly to Expression (4), the following directivity control filter is found by solving a minimization problem regarding w(ω).
Here, I represents a unit matrix with L rows and L columns.
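For reference, the following is a short derivation of the standard closed-form minimizer of a penalized least squares objective of this kind. It is a sketch under the assumption that the penalty enters with a non-negative weight β(ω), so that large filter coefficients increase J; the dependence on ω is omitted for brevity, and the result is the textbook regularized least squares solution rather than a quotation of the expression in the present description.

```latex
\begin{aligned}
J &= \|d - Gw\|^{2} + \beta\,\|w\|^{2}, \qquad \beta \ge 0,\\
\frac{\partial J}{\partial w^{*}} &= -\,G^{H}\,(d - Gw) + \beta\,w = 0,\\
w &= \left(G^{H}G + \beta I\right)^{-1} G^{H} d .
\end{aligned}
```

When β approaches 0 and G^H G is invertible, this reduces to the unpenalized least squares solution; a larger β shrinks the norm of w at the cost of a larger error term.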
Sound Localization System Using Directivity Control and Reflection of Sound from Wall Surface
Here, ω represents an angular frequency (ω=2πf), f represents a frequency, and Gl(ω) represents a transfer function from the l-th speaker to the target control point. The transfer function Gl(ω) can be obtained through Fourier transformation of an impulse response gl(n).
$$G_l(\omega) = \mathcal{F}\{g_l(n)\} \tag{9}$$
Here, n represents a time term and F represents Fourier transformation. The normalization matched filter is found by performing this computation with respect to all speakers constituting a speaker array.
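The Fourier transformation of Expression (9) can be sketched with a discrete Fourier transform as follows. This is a minimal example, assuming that each impulse response is available as a sampled array; the sampling rate, the FFT length, the synthetic impulse responses, and the function name `transfer_functions` are hypothetical choices made for illustration.

```python
import numpy as np


def transfer_functions(impulse_responses, n_fft):
    """Return one-sided transfer functions G_l(omega), one row per speaker.

    impulse_responses has shape (L, N); each row is zero-padded to n_fft.
    """
    return np.fft.rfft(impulse_responses, n=n_fft, axis=-1)


fs = 48000    # assumed sampling rate in Hz
n_fft = 1024  # assumed FFT length

# Synthetic impulse responses for two speakers: pure delays with gains.
g = np.zeros((2, 64))
g[0, 10] = 1.0  # 10-sample delay, unit gain
g[1, 20] = 0.5  # 20-sample delay, gain 0.5

G = transfer_functions(g, n_fft)
# Angular frequency omega = 2 pi f for each bin.
omega = 2.0 * np.pi * np.fft.rfftfreq(n_fft, d=1.0 / fs)
```

For a pure delay, the magnitude of the resulting transfer function is flat and the phase decreases linearly with frequency, which is a convenient sanity check before measured impulse responses are used.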
Also, regarding upward sound image localization, NPL 2 confirmed through experiments that a sound image was localized in the direction of reflected sound if a sound pressure difference between the reflected sound from a wall surface and direct sound from a speaker was larger than 5 dB.
Accordingly, to localize a sound image in the direction of the reflected sound, it is necessary to generate directional sound with high sound quality in a wide frequency band while suppressing the direct sound from the speaker so that this level difference is secured. However, this is difficult to realize through directivity control that is performed using commonly used conventional methods.
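The 5 dB condition reported in NPL 2 can be checked numerically with a few lines of code. The sketch below assumes that the reflected and direct sounds are available as RMS sound pressures in the same unit; the function names and the example values are hypothetical.

```python
import math


def level_difference_db(p_reflected, p_direct):
    """Level difference in dB between two RMS sound pressures."""
    return 20.0 * math.log10(p_reflected / p_direct)


def localizes_toward_reflection(p_reflected, p_direct, threshold_db=5.0):
    """True if the reflected sound exceeds the direct sound by more than the threshold."""
    return level_difference_db(p_reflected, p_direct) > threshold_db


# A pressure ratio of 2 is about 6 dB, which exceeds the 5 dB threshold.
ok = localizes_toward_reflection(2.0, 1.0)
# A pressure ratio of 1.5 is about 3.5 dB, which does not.
not_ok = localizes_toward_reflection(1.5, 1.0)
```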
The method described in NPL 2 realizes directional reproduction that gives high sound quality and uses a wide frequency band. However, the directivity is not intentionally designed in this method, and accordingly, there is a problem in that, although the directivity can be formed, a desired directional characteristic cannot be given.
When a directivity control filter is designed for a wide frequency band, the filter can be designed, but the filter gain becomes large in the low frequency band, and accordingly, it is difficult to reproduce the low frequency band with the computed filter. To address this, NPL 1 suppresses the filter gain by using a penalty term. As the regularization parameter, which is the weight of the penalty term, a single value is chosen experimentally and used for all frequencies, based on the degree to which the desired directional characteristic is reproduced and the magnitude of the filter gains at the respective frequencies. If the same regularization parameter is used for all frequencies, there is a problem in that the optimum parameter cannot be given for each frequency. Conversely, if regularization parameters are determined for the respective frequencies, as many regularization parameters as there are frequencies must be set, and it is difficult to set them experimentally. Additionally, it is difficult to set an optimum parameter because there is a trade-off between how well the desired directional characteristic is reproduced and the magnitude of the filter gain.
In view of the conventional technologies described above, an object of the present invention is to provide a sound image localizing device, a sound image localizing method, and a program that enable a virtual speaker to reproduce sound in a wide frequency band with high sound quality.
The gist of an invention according to a first aspect is a sound image localizing device that includes: a directivity control filter design unit configured to compute a directivity control filter from a desired directional characteristic; a filter coefficient correction unit configured to correct the directivity control filter computed by the directivity control filter design unit; and a convolution operation unit configured to compute an output acoustic signal by performing convolution of an input acoustic signal and the directivity control filter corrected by the filter coefficient correction unit, wherein filters that respectively correspond to speakers constituting a speaker array are computed by the directivity control filter design unit and the filter coefficient correction unit, an acoustic beam is generated using directivity control by the speaker array, and the acoustic beam is caused to be reflected from a wall surface or a ceiling to generate a virtual sound source.
The gist of an invention according to a second aspect is that, in the invention according to the first aspect, the filter coefficient correction unit performs computation such that a filter gain becomes constant, the filter gain being an absolute value of a filter coefficient at each frequency.
The gist of an invention according to a third aspect is a sound image localizing device that includes: an objective function setting unit configured to set an objective function from a desired directional characteristic; a constraint setting unit configured to set a linear or non-linear constraint; an optimization unit configured to compute an optimum filter coefficient from the objective function set by the objective function setting unit and the constraint set by the constraint setting unit; and a convolution operation unit configured to compute an output acoustic signal by performing convolution of an input acoustic signal and a directivity control filter that is computed by the optimization unit, wherein filters that respectively correspond to speakers constituting a speaker array are computed by the objective function setting unit, the constraint setting unit, and the optimization unit, an acoustic beam is generated using directivity control by the speaker array, and the acoustic beam is caused to be reflected from a wall surface or a ceiling to generate a virtual sound source.
The gist of an invention according to a fourth aspect is that, in the invention according to the third aspect, the constraint setting unit sets at least one of a constraint that makes the value of a filter gain constant at each frequency and a constraint relating to directional characteristics that is based on the desired directional characteristic.
The gist of an invention according to a fifth aspect is a sound image localizing method that includes: a directivity control filter designing step of computing a directivity control filter from a desired directional characteristic; a filter coefficient correction step of correcting the directivity control filter computed in the directivity control filter designing step; and a convolution operation step of computing an output acoustic signal by performing convolution of an input acoustic signal and the directivity control filter corrected in the filter coefficient correction step, wherein filters that respectively correspond to speakers constituting a speaker array are computed in the directivity control filter designing step and the filter coefficient correction step, an acoustic beam is generated using directivity control by the speaker array, and the acoustic beam is caused to be reflected from a wall surface or a ceiling to generate a virtual sound source.
The gist of an invention according to a sixth aspect is a sound image localizing method that includes: an objective function setting step of setting an objective function from a desired directional characteristic; a constraint setting step of setting a linear or non-linear constraint; an optimization step of computing an optimum filter coefficient from the objective function set in the objective function setting step and the constraint set in the constraint setting step; and a convolution operation step of computing an output acoustic signal by performing convolution of an input acoustic signal and a directivity control filter that is computed in the optimization step, wherein filters that respectively correspond to speakers constituting a speaker array are computed in the objective function setting step, the constraint setting step, and the optimization step, an acoustic beam is generated using directivity control by the speaker array, and the acoustic beam is caused to be reflected from a wall surface or a ceiling to generate a virtual sound source.
The gist of an invention according to a seventh aspect is a program for causing a computer to function as the sound image localizing device according to the first or second aspect.
The gist of an invention according to an eighth aspect is a program for causing a computer to function as the sound image localizing device according to the third or fourth aspect.
According to the present invention, it is possible to provide a sound image localizing device, a sound image localizing method, and a program that enable a virtual speaker to reproduce sound in a wide frequency band with high sound quality.
The following describes embodiments best suited to implementing the present invention, with reference to the drawings.
Overview
As described above, in terms of the frequency band and the sound quality, it is difficult to generate a virtual speaker by generating an acoustic beam using directivity control performed through a conventional method and causing the acoustic beam to be reflected from a wall surface. A virtually generated speaker needs to support a wide frequency band as a single speaker and give high sound quality.
In the embodiments of the present invention, a directivity control filter that can generate a desired directional characteristic is designed while restricting filter gains to be equal in all of the frequency band as in the case of NPL 2, rather than suppressing the filter gains using a penalty term as in the case of NPL 1, and a virtual speaker is generated using reflection from a wall surface as shown in
A first embodiment is an example in which directional reproduction that enables reproduction in a wide frequency band with high sound quality is realized by performing correction for restricting the filter gain with respect to a directivity control filter that is designed using a method such as the least squares method.
The directivity control filter design unit 11 computes a fundamental directivity control filter from a desired directional characteristic, which has been input (step S11-S12 in
The filter coefficient correction unit 12 computes a corrected directivity control filter from the fundamental directivity control filter, which has been input (step S13 in
The convolution operation unit 13 computes an output acoustic signal from an input acoustic signal, which has been input, and the corrected directivity control filter (step S14 in
An acoustic signal that corresponds to the desired directional characteristic can be reproduced by reproducing the output acoustic signal from a speaker array.
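The convolution operation can be sketched as follows. This is a minimal example, assuming that the corrected directivity control filter is given as one-sided frequency-domain coefficients per speaker and is converted to FIR taps by an inverse FFT before time-domain convolution. The circular shift that centers the response, as well as the names `filters_to_fir` and `render`, are assumptions made for illustration.

```python
import numpy as np


def filters_to_fir(W):
    """Convert one-sided frequency responses (L, n_bins) to FIR taps (L, n_fft).

    The taps are circularly shifted by half their length so that a
    zero-phase response becomes causal (this adds a constant delay).
    """
    h = np.fft.irfft(W, axis=-1)
    return np.roll(h, h.shape[-1] // 2, axis=-1)


def render(x, h):
    """Convolve a mono input signal with each speaker's FIR filter.

    Returns an array of shape (L, len(x) + n_taps - 1), one row per speaker.
    """
    return np.stack([np.convolve(x, h_l) for h_l in h])


# Hypothetical filters: flat unit response for 2 speakers, 9 bins (n_fft = 16).
W = np.ones((2, 9), dtype=complex)
h = filters_to_fir(W)          # each row is a unit impulse delayed by 8 samples
x = np.array([1.0, 2.0, 3.0])  # short input acoustic signal
y = render(x, h)               # one output channel per speaker
```

Each row of `y` would be fed to the corresponding speaker of the array. For long signals, block-based frequency-domain convolution would be a more efficient alternative to `np.convolve`.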
Method for Setting Directional Characteristic
As described above, the sound image localizing device 10 according to the first embodiment includes the directivity control filter design unit 11 that computes a directivity control filter from a desired directional characteristic, the filter coefficient correction unit 12 that corrects the directivity control filter computed by the directivity control filter design unit 11, and the convolution operation unit 13 that computes an output acoustic signal by performing convolution of an input acoustic signal and the directivity control filter corrected by the filter coefficient correction unit 12. Filters that respectively correspond to speakers constituting a speaker array are computed by the directivity control filter design unit 11 and the filter coefficient correction unit 12, an acoustic beam is generated using directivity control by the speaker array, and the acoustic beam is caused to be reflected from a wall surface or a ceiling to generate a virtual sound source. Thus, it is possible to provide the sound image localizing device 10 that enables the virtual speaker to reproduce sound in a wide frequency band with high sound quality.
Also, it is desirable that the filter coefficient correction unit 12 perform computation such that a filter gain becomes constant, the filter gain being the absolute value of a filter coefficient at each frequency. Thus, desired directional reproduction can be realized.
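One simple way to make the filter gain constant is to keep the phase of each filter coefficient and replace its absolute value with a fixed constant. The sketch below illustrates this idea only; it is an assumption about how such a correction could be realized, and the function name `equalize_gain`, the target gain, and the small floor `eps` are hypothetical.

```python
import numpy as np


def equalize_gain(w, gain=1.0, eps=1e-12):
    """Set |w| to a constant while preserving the phase of each coefficient.

    w holds complex filter coefficients, e.g. shape (L speakers, n_bins).
    Coefficients that are exactly zero have no defined phase and stay zero.
    """
    magnitude = np.abs(w)
    return gain * w / np.maximum(magnitude, eps)


# Hypothetical fundamental filter coefficients: 2 speakers, 2 frequency bins.
w_fundamental = np.array([[3.0 + 4.0j, -2.0j],
                          [0.5 + 0.0j, -1.0 - 1.0j]])
w_corrected = equalize_gain(w_fundamental, gain=1.0)
```

Because only the magnitudes change, the relative phases between speakers, which determine the direction of the acoustic beam, are preserved by this correction.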
Note that the meaning of “a wall surface or a ceiling” in the expression “the acoustic beam is caused to be reflected from a wall surface or a ceiling” should be widely interpreted. That is, “a wall surface or a ceiling” includes what reflects the acoustic beam similarly to a wall surface or a ceiling.
The following describes a second embodiment. Note that the following mainly describes differences from the first embodiment, and detailed descriptions of aspects similar to those in the first embodiment will be omitted.
The second embodiment is an example in which desired directional reproduction is realized by designing a filter by solving an optimization problem to which a function that forms a desired directional characteristic is given as an objective function and a non-linear equality constraint that restricts the filter gain to a constant value is given as a constraint.
The objective function setting unit 21 sets an objective function from a desired directional characteristic, which has been input (step S21-S22 in
The constraint setting unit 22 sets a constraint relating to the filter gain (step S23 in
The optimization unit 23 computes a directivity control filter by solving an optimization problem based on the objective function and the constraint, which have been input (step S24 in
Here, G(ω) represents a transfer function matrix in which the transfer functions from the speakers to the control points are stored, w(ω)=[w1(ω), w2(ω), . . . , wL(ω)] represents a filter coefficient vector in which the filter coefficients wl(ω) (l=1, 2, . . . , L) corresponding to the respective speakers are stored, c represents a constant, and Gpoint(ω) represents a transfer function vector in which the transfer functions from the respective speakers to the target direction are stored. A directivity control filter of which the filter gain is suppressed can be computed by solving an optimization problem such as that expressed by Expression (10).
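To illustrate how a constant filter gain can be enforced during optimization, the following sketch minimizes a least squares objective under the constraint |wl(ω)| = c. It does not attempt to reconstruct Expression (10) and does not use Gpoint(ω). Instead, it uses a swapped-in technique: each coefficient is parameterized as c·exp(jφ_l), so the constraint holds by construction, and the phases φ_l are updated by gradient descent with a backtracking step. The objective, the random test data, and all function names are assumptions made for illustration.

```python
import numpy as np


def objective(G, d, w):
    """Sum of squared errors ||d - G w||^2."""
    e = d - G @ w
    return float(np.real(np.vdot(e, e)))


def constant_gain_filter(G, d, c=1.0, iters=200, step=1.0):
    """Minimize ||d - G w||^2 subject to |w_l| = c via phase-only descent."""
    phi = np.zeros(G.shape[1])
    w = c * np.exp(1j * phi)
    J = objective(G, d, w)
    for _ in range(iters):
        q = G.conj().T @ (d - G @ w)
        # Gradient of J with respect to phi_l when w_l = c * exp(j * phi_l).
        grad = 2.0 * np.imag(np.conj(q) * w)
        t = step
        while t > 1e-12:  # backtracking: accept only steps that reduce J
            phi_new = phi - t * grad
            w_new = c * np.exp(1j * phi_new)
            J_new = objective(G, d, w_new)
            if J_new < J:
                phi, w, J = phi_new, w_new, J_new
                break
            t *= 0.5
        else:
            break  # no improving step found; stop at a stationary point
    return w, J


# Hypothetical data: 12 control points, 4 speakers, a single frequency.
rng = np.random.default_rng(0)
G = rng.standard_normal((12, 4)) + 1j * rng.standard_normal((12, 4))
d = rng.standard_normal(12) + 1j * rng.standard_normal(12)

J_start = objective(G, d, np.ones(4, dtype=complex))
w_opt, J_opt = constant_gain_filter(G, d, c=1.0)
```

The problem is non-convex, so this procedure finds a local minimum that depends on the initial phases. A general-purpose solver that supports non-linear equality constraints could be used instead of the phase parameterization.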
The convolution operation unit 24 is similar to that in the first embodiment, and therefore a description thereof is omitted (step S25 in
As described above, the sound image localizing device 20 according to the second embodiment includes the objective function setting unit 21 that sets an objective function from a desired directional characteristic, the constraint setting unit 22 that sets a linear or non-linear constraint, the optimization unit 23 that computes an optimum filter coefficient from the objective function set by the objective function setting unit 21 and the constraint set by the constraint setting unit 22, and the convolution operation unit 24 that computes an output acoustic signal by performing convolution of an input acoustic signal and the directivity control filter computed by the optimization unit 23. Filters that respectively correspond to speakers constituting a speaker array are computed by the objective function setting unit 21, the constraint setting unit 22, and the optimization unit 23, an acoustic beam is generated using directivity control by the speaker array, and the acoustic beam is caused to be reflected from a wall surface or a ceiling to generate a virtual sound source. Thus, it is possible to provide the sound image localizing device 20 that enables the virtual speaker to reproduce sound in a wide frequency band with high sound quality.
Also, it is desirable that the constraint setting unit 22 set at least one of a constraint that makes the value of the filter gain constant at each frequency and a constraint relating to directional characteristics that is based on the desired directional characteristic. Thus, desired directional reproduction can be realized.
Note that the present invention can be realized not only as the sound image localizing devices 10 and 20 described above, but also as a sound image localizing method that includes, as steps, the operations of the functional units that are characteristic of the sound image localizing devices 10 and 20, or as a program that causes a computer to execute those steps. Such a program can be distributed via a recording medium such as a CD-ROM or a transmission medium such as the Internet.
Number | Date | Country | Kind |
---|---|---|---|
2019-016881 | Feb 2019 | JP | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/JP2020/001405 | 1/17/2020 | WO | 00 |