The present invention relates to the field of communications technologies, and in particular to a method, a communication device and a communication system for controlling sound focusing.
A speaker array can concentrate sound at the position where the listener is located; that is, the speaker array has a sound focusing function. A speaker array with this function may be used in a communication device, such as a telephone terminal device or a video conference terminal device. Because the sound is focused on the listener, it does not affect the work and life of other people, and it keeps the communication content secure and therefore protects the privacy of communications.
In the conventional art, a speaker array with the sound focusing function is arranged in a communication device. During the control of sound focusing, the position on which the sound is focused needs to be adjusted manually and repeatedly whenever the position of the listener changes. The sound focusing function is therefore inconvenient to use.
The embodiments of the present invention provide a method, communication device and communication system for controlling sound focusing to control the sound from a speaker to be focused to a target sound source according to the position of a local user (that is, the target sound source).
The embodiments of the present invention provide the following technical solutions.
A method for controlling sound focusing includes:
obtaining position information of a target sound source relative to a speaker in a speaker array; and
controlling sound from the speaker in the speaker array to be focused to the target sound source according to the obtained position information.
A communication device includes:
a position obtaining unit configured to obtain position information of a target sound source relative to a speaker in a speaker array; and
a controlling unit configured to control sound from the speaker in the speaker array to be focused to the target sound source according to the position information obtained by the position obtaining unit.
A communication system includes: a target sound source, a communication device and a speaker array.
The communication device is configured to obtain position information of a target sound source relative to a speaker in a speaker array, and control sound from the speaker in the speaker array to be focused to the target sound source according to the obtained position information.
The speaker array is configured to focus the sound to the target sound source under the control of the communication device.
The technical solution brings the following benefits:
In the embodiments of the present invention, the position information of the target sound source relative to the speaker is obtained and used to control an audio signal of a remote user to be input to the speaker and focus an audio signal from the speaker to the position of the target sound source, thus automatically controlling the sound from the speaker array to be focused to the target sound source according to the position of the target sound source.
The embodiments of the present invention provide a method for controlling sound focusing. The method includes: obtaining the position information of a target sound source relative to a speaker; and controlling a sound from the speaker to be focused to the target sound source according to the obtained position information. The technical solution provided by the embodiments of the present invention can control the sound from a speaker array to be focused to a sound source according to the position of the sound source.
As shown in
101. A sound source locating module computes the position information of a sound source relative to a reference microphone.
The shape of a microphone array may be linear, rectangular, round, and so on. The position of a sound source relative to the microphone array computed by the sound source locating module is the position of the sound source relative to the reference microphone. The reference microphone is in the center of the microphone array. Taking a linear microphone array composed of three microphones as an example,
As illustrated in
Regardless of
in the equations above, the equation for computing the azimuth θ and the distance R from the sound source to the reference microphone M2 is obtained as follows:
Therefore, the coordinates of the sound source relative to the reference microphone are:
x=R×sin θ
y=R×cos θ
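The conversion from azimuth and distance to coordinates can be sketched as follows. This is a minimal illustration only; the function name `source_xy` is hypothetical, and the angle is taken in radians and measured from the y axis, which is what the two equations above imply.

```python
import math


def source_xy(R, theta):
    """Return (x, y) of the sound source relative to the reference microphone.

    R is the distance to the reference microphone and theta is the azimuth in
    radians, measured from the y axis so that x = R*sin(theta), y = R*cos(theta).
    """
    x = R * math.sin(theta)
    y = R * math.cos(theta)
    return x, y
```

With theta equal to zero the source lies on the y axis at distance R, which is a quick sanity check on the sign and axis conventions.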
During communication, the microphone array may pick up, besides the target sound source (i.e. the local user), interference from other sound sources: noise sources, sounds of the remote users played through the speakers, and sounds from non-target users. The first two kinds of interference may be removed by methods such as noise suppression and echo cancellation. For the third kind, either of the following two methods may be used to determine the target sound source. In the first method, after the distance from a sound source to the reference microphone is obtained, the sound source is determined to be a target sound source if that distance is less than a preset distance, and is determined not to be a target sound source if that distance is greater than or equal to the preset distance. In the second method, a sound source is determined to be the target sound source if its voiceprint characteristic matches that of a local user (i.e. the target sound source) pre-stored in the communication device. With the second method, only a sound source that matches a stored voiceprint characteristic is subjected to the azimuth computation, so the target sound source is determined before step 101, in which the sound source locating module computes the position information of the target sound source relative to the reference microphone.
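The first, distance-based method can be sketched as a simple threshold test. This is an illustrative sketch only; the function names and the representation of candidate sources as a mapping from a label to a distance are assumptions, not part of the described device.

```python
def is_target_by_distance(distance, preset_distance):
    """A source is a target only if it is strictly closer than the preset distance."""
    return distance < preset_distance


def select_targets(candidates, preset_distance):
    """Keep the labels of candidate sources whose distance passes the threshold.

    candidates maps a hypothetical source label to its distance from the
    reference microphone.
    """
    return [name for name, dist in candidates.items()
            if is_target_by_distance(dist, preset_distance)]
```

A source exactly at the preset distance is rejected, matching the "greater than or equal to" condition in the text.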
102. According to the position of a reference microphone relative to a reference speaker and the obtained position information of the target sound source relative to the reference microphone, a position computing module obtains the position information of the target sound source relative to the reference speaker.
Before this step, the position of the reference microphone relative to the reference speaker needs to be determined. The method for obtaining this position varies with the communication system; for example, there are the following two cases:
1. A speaker array and a microphone array are integrated in a same communication device, so the position of the reference microphone relative to the reference speaker is fixed, and may be preset in a position computing module.
2. A speaker array and a microphone array are arranged in separate devices rather than in the same communication device, so the position of the reference microphone relative to the reference speaker is variable and is determined as follows.
The speaker array is regarded as the sound source.
The microphone array receives the sound from the speaker array, and a sound source locating module connected to the microphone array computes the position of the sound source (a reference speaker in the speaker array) relative to a reference microphone in the microphone array to obtain the position of the reference microphone relative to the reference speaker. The position of the sound source (the reference speaker in the speaker array) relative to the reference microphone may be computed with reference to step 101.
The sound emitted by the speaker array for this test may be the sound of a remote user or a dedicated test sound.
The detailed implementation of obtaining the position information of the sound source relative to the reference speaker in the step is illustrated in
x1=x−x0
y1=y−y0
L=√(x1²+y1²)
φ=arctan(x1/y1)
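The four equations above can be sketched in code as follows. This is a hedged illustration: it assumes that (x0, y0) are the coordinates of the reference speaker in the same frame as the source coordinates (x, y), which the surrounding text suggests but does not state here. The function name is hypothetical, and `math.atan2(x1, y1)` is used in place of arctan(x1/y1) so that y1 = 0 does not cause a division by zero; the two agree whenever y1 is positive.

```python
import math


def source_relative_to_speaker(x, y, x0, y0):
    """Return (x1, y1, L, phi) of the source relative to the reference speaker.

    (x, y)   : source coordinates relative to the reference microphone
    (x0, y0) : assumed coordinates of the reference speaker in the same frame
    """
    x1 = x - x0
    y1 = y - y0
    L = math.hypot(x1, y1)     # L = sqrt(x1^2 + y1^2)
    phi = math.atan2(x1, y1)   # angle measured from the y axis, in radians
    return x1, y1, L, phi
```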
According to the layout of the speaker array, the distance from a speaker except the reference speaker in the speaker array to the target sound source is computed utilizing the distance L and the angle φ of the target sound source relative to the reference speaker, as illustrated in
103. A delay and gain parameter computing module computes the delay parameter (delay-time) and the gain parameter according to the distance Li from the speaker to the target sound source.
Assuming the layout of a speaker array is illustrated in
τi=(Lmax−Li)/C
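The delay equation can be illustrated with a short sketch. Several things here are assumptions made only for the example: the speakers are taken to lie on a straight line through the reference speaker (the x axis), each at a signed offset from it; C is taken to be the speed of sound, with a nominal value of 343 m/s; and the function names are hypothetical. Under that layout, the distance Li follows from the distance L and angle φ by the law of cosines.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed nominal value for C


def speaker_distances(L, phi, offsets):
    """Distance Li from each speaker to the target for an assumed linear array.

    The target is at (L*sin(phi), L*cos(phi)) relative to the reference
    speaker, and speaker i sits at (offsets[i], 0).
    """
    return [math.sqrt(L * L + d * d - 2.0 * L * d * math.sin(phi))
            for d in offsets]


def focusing_delays(distances, c=SPEED_OF_SOUND):
    """Delay tau_i = (Lmax - Li) / C for each speaker, in seconds."""
    l_max = max(distances)
    return [(l_max - li) / c for li in distances]
```

The farthest speaker gets zero delay and nearer speakers are delayed more, so that all wavefronts arrive at the target at the same time.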
The equation for computing the gain parameter of the ith speaker for the audio signal is as follows:
104. A sound processing module controls the sound from the speaker to be focused to the target sound source according to the delay-time and the gain parameter of the speaker for the audio signal.
As shown in
In the first embodiment of the present invention, the position information of the target sound source relative to a microphone is obtained, and the position information of the target sound source relative to a speaker is then obtained from the position of the microphone relative to the speaker and the position information of the target sound source relative to the microphone. The position information of the target sound source relative to the speaker is used to compute the delay parameter of the delay module and the gain parameter of the gain module in the sound processing module, so that the audio signal from a remote user is delayed, amplified and input to the speaker, and the speaker array is focused on the position of the target sound source. The sound from the speaker array is thus automatically focused to the target sound source according to the position of the target sound source.
The second embodiment of the present invention provides a method for controlling sound focusing, as shown in
601. A sound source locating module computes the position information of a first sound source and a second sound source relative to a reference microphone.
602. A position computing module obtains the position information of the first sound source and the second sound source relative to a reference speaker according to the position of the reference microphone relative to the reference speaker and the obtained position information of the first sound source and the second sound source relative to the reference microphone.
603. A delay and gain parameter computing module computes the first delay parameter and the first gain parameter of the speaker focused to the first target sound source according to the position information of the first target sound source relative to the reference speaker. The delay and gain parameter computing module computes the second delay parameter and the second gain parameter of the speaker focused to the second target sound source according to the position information of the second target sound source relative to the reference speaker.
604. A sound processing module controls the speaker to be focused to the first target sound source according to the first delay parameter and the first gain parameter of the speaker focused to the first target sound source, and controls the speaker to be focused to the second target sound source according to the second delay parameter and the second gain parameter of the speaker focused to the second target sound source.
With reference to
In the second embodiment of the present invention, the position information of the first target sound source relative to a speaker and the position information of the second target sound source relative to the speaker are obtained according to the position of a microphone relative to the speaker and the obtained position information of the first target sound source and the second target sound source relative to the microphone. The first delay parameter and the first gain parameter of the speaker focused to the first target sound source are computed, and the second delay parameter and the second gain parameter of the speaker focused to the second target sound source are computed. The computed delay parameters and gain parameters are used to control the speaker to be focused to the first target sound source and the second target sound source. In this way, the sound from a speaker array is automatically focused to multiple target sound sources.
The third embodiment of the present invention provides a method for controlling sound focusing, as shown in
901. A sound source locating module computes the position information of a target sound source relative to a camera.
The step specifically includes the following sub-steps:
The sound source can be identified by image identification technologies. Because the sound source is a person, conventional technologies such as facial skin color identification and lip motion identification may be used;
Besides the azimuth, the position of the sound source relative to the camera includes distance information. Therefore, a stereo camera shoots the sound source, and the depth information of the sound source, namely the distance of the sound source from the camera, may be extracted by using technologies such as image matching.
Before this step, a sound source may be determined to be the target sound source if its voiceprint characteristic matches that of a local user (the target sound source) pre-stored in the communication device.
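One common way to recover depth from a stereo pair, once image matching has found the same point in both images, is the pinhole relation depth = focal length × baseline / disparity. The sketch below illustrates that relation only; it assumes a rectified stereo camera, and the function name and parameters are hypothetical rather than taken from the embodiment.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Distance from a rectified stereo camera to a matched point, in metres.

    focal_length_px : focal length expressed in pixels
    baseline_m      : distance between the two camera centres, in metres
    disparity_px    : horizontal shift of the matched point between images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px
```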
902. A position computing module obtains the position of the sound source relative to the reference speaker according to the position of the camera relative to the reference speaker and the obtained position information of the target sound source relative to the camera.
Steps 903 and 904 are the same as steps 103 and 104.
In the third embodiment of the present invention, the position information of a target sound source relative to a speaker is obtained according to the position of a camera relative to the speaker and the obtained position information of the target sound source relative to the camera. This position information is used to compute the delay parameter of a delay module and the gain parameter of a gain module in a sound processing module, so that an audio signal from a remote user is delayed, amplified and input to the speaker, and the speaker array is focused on the position of the target sound source. The sound from the speaker array is thus automatically focused to the target sound source according to the position of the target sound source.
Those skilled in the art may understand that all or part of the steps in the method embodiments may be implemented by a program instructing the relevant hardware. The program may be stored in a computer readable storage medium, such as a read only memory (ROM), a magnetic disk or a compact disk-read only memory (CD-ROM).
The fourth embodiment of the present invention provides a communication device. As shown in
a position obtaining unit 1101 configured to obtain the position information of a target sound source relative to a speaker in a speaker array; and
a controlling unit 1102 configured to control the sound from the speaker to be focused to the target sound source according to the position information obtained by the position obtaining unit.
The device further includes: a target sound source determining unit configured to determine the target sound source.
The position obtaining unit 1101 includes: a sound source locating module configured to obtain the position information of the target sound source relative to a microphone; and a position computing module configured to obtain the position information of the target sound source relative to the speaker according to the position of the microphone relative to the speaker and the position information of the target sound source relative to the microphone. Here, the target sound source determining unit is configured to determine the target sound source according to one or more pre-stored voiceprint characteristics of the target sound source or the distance from the sound source to the microphone.
Or, the position obtaining unit 1101 includes: a sound source locating module configured to obtain the position information of the target sound source relative to a camera; and a position computing module configured to obtain the position information of the target sound source relative to the speaker according to the position of the camera relative to the speaker and the position information of the target sound source relative to the camera. Here, the target sound source determining unit is configured to determine the target sound source according to one or more pre-stored voiceprint characteristics of the target sound source.
The controlling unit 1102 includes: a computing module 11021 and a sound processing module 11022. The computing module is called a delay and gain parameter computing module when configured to compute a delay parameter and a gain parameter of an audio signal.
The delay and gain parameter computing module is configured to compute the delay parameter and the gain parameter of the audio signal to be input to the speaker according to the obtained position information of the target sound source relative to the speaker in a speaker array.
The sound processing module is configured to delay the audio signal, adjust the delayed audio signal and input the adjusted audio signal to the corresponding speaker according to the computed delay parameter and the computed gain parameter of the audio signal. Specifically, the sound processing module includes a delay module configured to delay the audio signal according to the delay parameter and output the delayed audio signal, and a gain module configured to adjust the amplitude of the delayed audio signal according to the gain parameter and input the adjusted audio signal to the corresponding speaker.
Preferably, the target sound source includes: a first target sound source and a second target sound source. According to the position information of the first target sound source relative to the speaker in the speaker array, the computed delay parameter and the computed gain parameter are a first delay parameter and a first gain parameter respectively; and according to the position information of the second target sound source relative to the speaker in the speaker array, the computed delay parameter and the computed gain parameter are a second delay parameter and a second gain parameter respectively.
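The delay module followed by the gain module can be sketched for a sampled signal as follows. This is a simplified illustration: the audio signal is assumed to be a list of samples, the delay is rounded to a whole number of samples, and the function name and `sample_rate` parameter are hypothetical.

```python
def delay_and_gain(signal, delay_s, gain, sample_rate):
    """Delay a sampled signal, then scale its amplitude.

    The delay in seconds is rounded to whole samples and realised by
    prepending zeros; the gain is then applied to every delayed sample.
    """
    n = round(delay_s * sample_rate)
    delayed = [0.0] * n + list(signal)   # delay module
    return [gain * s for s in delayed]   # gain module
```

In a speaker array, one such delay and gain pair would be applied per speaker, each with its own parameters.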
The sound processing module includes:
a first delay module configured to delay the audio signal according to the first delay parameter;
a first gain module configured to adjust the amplitude of the audio signal delayed by the first delay module according to the first gain parameter to obtain a first audio signal;
a second delay module configured to delay the audio signal according to the second delay parameter;
a second gain module configured to adjust the amplitude of the audio signal delayed by the second delay module according to the second gain parameter to obtain a second audio signal; and
a combining module configured to combine the two audio signals from the first gain module and the second gain module and input the combined audio signal to an amplifying module, where the combining module may combine the two audio signals by adding the two audio signals.
The amplifying module is configured to amplify the audio signal from the combining module and input the amplified audio signal to the corresponding speaker.
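For one speaker, the two delay and gain branches, the combining module and the amplifying module can be sketched as follows. The example is a simplified illustration: signals are lists of samples, delays are given in whole samples, the combining module adds the two signals as described above, and the amplifying module is modelled as a constant factor. All function and parameter names are hypothetical.

```python
def delay_then_gain(signal, delay_samples, gain):
    """One branch: delay by whole samples, then adjust the amplitude."""
    return [0.0] * delay_samples + [gain * s for s in signal]


def combine(first, second):
    """Combining module: add the two signals sample by sample."""
    n = max(len(first), len(second))
    a = first + [0.0] * (n - len(first))    # zero-pad the shorter signal
    b = second + [0.0] * (n - len(second))
    return [x + y for x, y in zip(a, b)]


def drive_signal(signal, delay1, gain1, delay2, gain2, amplification=1.0):
    """Signal fed to one speaker when focusing on two targets at once."""
    first = delay_then_gain(signal, delay1, gain1)    # first target branch
    second = delay_then_gain(signal, delay2, gain2)   # second target branch
    return [amplification * s for s in combine(first, second)]
```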
In the communication device provided by the fourth embodiment of the present invention, the position obtaining unit 1101 obtains the position information of the target sound source relative to the speaker, and the controlling unit 1102 uses this position information to control the audio signal from a remote user that is input to the speaker, so that the speaker array is focused on the position of the target sound source. The sound from the speaker array is thus automatically focused to the target sound source according to the position of the target sound source.
The fifth embodiment of the present invention provides a communication system, including: a target sound source, a communication device and a speaker array.
The communication device is configured to obtain the position information of the target sound source relative to a speaker in the speaker array and control the sound from the speaker in the speaker array to be focused to the target sound source according to the obtained position information.
The speaker array is configured to focus the sound to the target sound source under the control of the communication device.
The system further includes: a microphone array, configured to receive a sound signal of the target sound source.
The communication device is configured to: obtain the time delay between the adjacent microphones in the microphone array according to the sound signal; multiply the time delay by the sound speed to obtain the sound path difference between the adjacent microphones, where the sound path difference is the difference of the distances from the sound source to the adjacent microphones; obtain the position of the target sound source relative to a reference microphone in the microphone array according to the sound path difference; and obtain the position information of the target sound source relative to the speaker according to the position of the reference microphone relative to the speaker in the speaker array and the position information of the target sound source relative to the reference microphone.
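The first two operations, obtaining the time delay between adjacent microphones and multiplying it by the sound speed, can be sketched as follows. The text does not say how the time delay is estimated; the brute-force cross-correlation used here is one common choice and is an assumption of this example, as are the function names and the nominal sound speed of 343 m/s.

```python
def estimate_lag(a, b, max_lag):
    """Lag k, in samples, that maximises sum(a[n] * b[n + k]).

    A positive k means the signal at microphone b arrives k samples later
    than at microphone a.
    """
    best_k, best_score = 0, float("-inf")
    for k in range(-max_lag, max_lag + 1):
        score = 0.0
        for n in range(len(a)):
            m = n + k
            if 0 <= m < len(b):
                score += a[n] * b[m]
        if score > best_score:
            best_k, best_score = k, score
    return best_k


def path_difference(lag_samples, sample_rate, sound_speed=343.0):
    """Sound path difference in metres: time delay multiplied by sound speed."""
    return lag_samples / sample_rate * sound_speed
```

The resulting path differences between adjacent microphone pairs are the inputs from which the position relative to the reference microphone is then computed.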
Or, the system further includes: a camera, configured to shoot the target sound source.
The communication device is configured to obtain the position information of the target sound source relative to the camera according to an image taken by the camera; and obtain the position information of the target sound source relative to the speaker in the speaker array according to the position of the camera relative to the speaker in the speaker array and the obtained position information of the target sound source relative to the camera.
In the fifth embodiment of the present invention, the communication device obtains the position information of the target sound source relative to the speaker, and uses the obtained position information to control the sound from the speaker to be focused to the target sound source. The sound from the speaker array is thus automatically focused to the target sound source according to the position of the target sound source.
The above describes the method, communication device and communication system provided by the embodiments of the present invention in detail. It is understandable that those skilled in the art may make various modifications and variations to the present invention without departing from the spirit and concept of the present invention. To sum up, the content of the specification shall not be construed as a limitation to the present invention.
Number | Date | Country | Kind |
---|---|---|---|
200810135510.4 | Aug 2008 | CN | national |
This application is a continuation of International Application No. PCT/CN2009/073283, filed on Aug. 17, 2009, which claims priority to Chinese Patent Application No. 200810135510.4, filed on Aug. 19, 2008, both of which are hereby incorporated by reference in their entireties.
Relation | Number | Date | Country |
---|---|---|---|
Parent | PCT/CN2009/073283 | Aug 2008 | US |
Child | 13030893 | | US |