This Nonprovisional application claims priority under 35 U.S.C. § 119(a) on Patent Application No. 2022-007557 filed in Japan on Jan. 21, 2022, the entire contents of which is hereby incorporated by reference.
The present disclosure relates to a sound processing apparatus and a sound processing method, and more particularly relates to a technology to reduce noise.
Japanese Unexamined Patent Application Publication No. 2010-122617 discloses a noise gate that estimates a noise spectrum of stationary noise based on a frequency spectrum of a sound signal. The noise gate, in a case in which a signal level ratio of the frequency spectrum of the sound signal to the noise spectrum is greater than or equal to a threshold value, outputs the frequency spectrum as it is. The noise gate, in a case in which the signal level ratio of the frequency spectrum of the sound signal to the noise spectrum is less than the threshold value, decreases a gain and outputs the frequency spectrum with the decreased gain applied.
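The threshold behavior described above can be sketched as follows. This is a minimal illustration only and is not taken from the cited publication: the function name `noise_gate`, the threshold of 2.0, and the reduced gain of 0.1 are all assumptions chosen for the example.

```python
def noise_gate(spectrum, noise_spectrum, threshold=2.0, reduced_gain=0.1):
    """Per-bin gate (illustrative): pass a bin unchanged when its level ratio
    to the noise spectrum meets the threshold, otherwise attenuate it.

    `threshold` and `reduced_gain` are hypothetical values.
    """
    output = []
    for level, noise in zip(spectrum, noise_spectrum):
        # A zero noise estimate is treated as "signal present" to avoid dividing by zero.
        ratio = level / noise if noise > 0 else float("inf")
        gain = 1.0 if ratio >= threshold else reduced_gain
        output.append(level * gain)
    return output


# First bin is well above the noise and passes; second bin is attenuated.
gated = noise_gate([4.0, 1.0], [1.0, 1.0])
```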
In a case in which a gain control is performed according to a ratio (S/N) of a sound level to a noise level, noise is mixed when a voice of a talker is inputted.
In view of the foregoing, one aspect of the present disclosure is directed to providing a sound processing apparatus capable of reducing noise when inputting a voice of a talker.
A sound processing apparatus includes sound collection circuitry that collects a sound and generates a first sound signal, and processing circuitry that estimates an estimated noise, controls a gain of the first sound signal and outputs a second sound signal, based on the estimated noise, and performs filter processing to reduce a component of a predetermined frequency band of the second sound signal based at least in part on the estimated noise.
According to an embodiment of the present disclosure, noise is able to be reduced when a voice of a talker is inputted.
The microphone 11 collects a sound. In various embodiments, the microphone 11 constitutes the sound collection circuitry. The processor 12 sends a sound signal of the sound collected by the microphone 11, to an external personal computer (PC) or the like, through the communicator 15.
The processor 12 includes a CPU, a DSP, or an SoC (System on a Chip). The processor 12 reads out a program from the flash memory 14 being a storage medium, and temporarily stores the program in the RAM 13, and thus performs various operations. The program includes a sound processing program 141.
The flash memory 14 stores a program for operating the processor 12. For example, the flash memory 14 stores the sound processing program 141. The processor 12 executes the sound processing method of the present disclosure by the sound processing program 141. In various embodiments, the processor 12 constitutes the processing circuitry.
The microphone 11 collects a sound and generates a first sound signal (S11). The sound includes a voice of a talker or noise. The microphone 11 outputs a generated first sound signal to the processor 12.
First, the first noise estimator 125 estimates noise power based on the first sound signal (S12). The method of estimating noise power may be any method. For example, the first noise estimator 125 takes, as the noise power, the minimum value among power average values obtained in a predetermined section of the first sound signal.
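One possible reading of the minimum-of-averaged-power example in S12 is sketched below. The frame length, the number of frames averaged, and the function name `estimate_noise_power` are assumptions made for illustration; the disclosure itself leaves the estimation method open.

```python
def estimate_noise_power(samples, frame_len=4, avg_frames=2):
    """Return the minimum of the averaged frame powers over the given section.

    `frame_len` and `avg_frames` are hypothetical parameters.
    Trailing samples that do not fill a whole frame are ignored.
    """
    # Mean-square power of each non-overlapping frame.
    powers = [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]
    if not powers:
        raise ValueError("section is shorter than one frame")
    # Moving average over consecutive frames smooths out short dips.
    n = min(avg_frames, len(powers))
    averaged = [sum(powers[i:i + n]) / n for i in range(len(powers) - n + 1)]
    # The quietest averaged stretch is taken as the noise power.
    return min(averaged)


# Quiet stretch (amplitude 1) followed by a louder stretch (amplitude 3).
noise_power = estimate_noise_power([1] * 8 + [3] * 8)
```

The assumption behind taking the minimum is that a talker pauses at some point within the section, so the quietest averaged stretch approximates the stationary noise alone.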
The gain calculator 123 calculates a gain of the first sound signal in the noise reducer 121 based on the noise power estimated by the first noise estimator 125 (S13). For example, the gain calculator 123 determines a gain of the noise reducer 121 based on a ratio (S/N) of power S and noise power N of the first sound signal so as to cause the noise reducer 121 to function as a Wiener filter.
The noise reducer 121 multiplies the first sound signal by the gain calculated by the gain calculator 123, and outputs a second sound signal (S14). As a result, the noise reducer 121 reduces noise by decreasing the level of the second sound signal when a talker is not talking. On the other hand, the noise reducer 121 does not reduce the voice of the talker, because the level of the second sound signal is increased when the talker is talking.
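The gain calculation and gain application of S13 and S14 can be sketched as follows, assuming one common form of the Wiener gain, G = 1 - N/S, where S is the power of the first sound signal and N is the estimated noise power. The function names and the `gain_floor` parameter are hypothetical, and the embodiment may use a different formulation.

```python
def wiener_gain(signal_power, noise_power, gain_floor=0.0):
    """Gain in [gain_floor, 1] derived from the ratio of S to N (illustrative).

    Uses G = 1 - N/S, which equals (S/N - 1) / (S/N).
    """
    if signal_power <= 0:
        return gain_floor
    gain = 1.0 - noise_power / signal_power
    # Clamp so that the gain never goes below the floor or above unity.
    return max(gain_floor, min(1.0, gain))


def apply_gain(frame, gain):
    """Scale every sample of a frame of the first sound signal by the gain."""
    return [gain * x for x in frame]


# High S/N (talker talking): the gain stays close to 1.
talking_gain = wiener_gain(signal_power=4.0, noise_power=1.0)
# S/N of 1 (talker silent): the gain drops to the floor.
silent_gain = wiener_gain(signal_power=1.0, noise_power=1.0)
second_signal = apply_gain([0.5, -0.5], talking_gain)
```

Because the gain approaches 1 whenever S is large relative to N, noise that coincides with speech passes through together with the voice.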
The second noise estimator 126 estimates noise based on a part of a band of the first sound signal. For example, the second noise estimator 126 obtains a noise power estimation value based on noise power of 1 kHz or less among the noise power calculated by the first noise estimator 125 (S15).
The EQ controller 124 calculates a gain of the EQ 122 based on the noise power estimation value obtained by the second noise estimator 126 (S16). The EQ 122 performs processing to reduce a component in a predetermined frequency band of the second sound signal based on the gain calculated by the EQ controller 124 (S17). For example, the EQ 122 reduces a band of 1 kHz or less of the second sound signal.
As described above, the noise reducer 121 reduces noise by decreasing the level of the second sound signal when a talker is not talking. On the other hand, the noise reducer 121 increases the level of the second sound signal when the talker is talking, so that noise may be mixed with the second sound signal. In particular, noise included in a low frequency band of 1 kHz or less is auditorily noticeable. However, the EQ 122 and the EQ controller 124 according to the present embodiment reduce the low frequency band of 1 kHz or less based on the noise power estimation value, so that the noise when the voice of a talker is inputted is able to be reduced. In addition, the EQ controller 124 according to the present embodiment sets the gain of the EQ 122 based only on the noise power estimation value, without depending on the power of the first sound signal. Therefore, stationary noise is able to be reduced without depending on a level of the voice of a talker.
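The disclosure does not specify how the EQ controller 124 maps the noise power estimation value to a gain. One hypothetical mapping is sketched below: the attenuation of the band of 1 kHz or less grows linearly, in decibels, between two noise levels and is capped at a maximum cut. The function name `eq_gain_db` and all numeric values are assumptions for illustration. The function takes only the noise power estimation value as input, reflecting that the gain does not depend on the power of the first sound signal.

```python
import math


def eq_gain_db(noise_power, low_db=-60.0, high_db=-30.0, max_cut_db=-12.0):
    """Gain in dB for the low band, as a function of noise power only.

    `low_db`, `high_db`, and `max_cut_db` are hypothetical tuning values.
    """
    if noise_power <= 0:
        return 0.0  # no measurable noise: leave the band untouched
    noise_db = 10.0 * math.log10(noise_power)
    if noise_db <= low_db:
        return 0.0
    if noise_db >= high_db:
        return max_cut_db
    # Interpolate linearly between no cut and the maximum cut.
    fraction = (noise_db - low_db) / (high_db - low_db)
    return fraction * max_cut_db


quiet_room = eq_gain_db(1e-7)   # below the lower level: no cut
noisy_room = eq_gain_db(1e-2)   # above the upper level: maximum cut
```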
The second noise estimator 126 may estimate a noise component in each of a plurality of frequency bands, and may estimate noise based on an estimation result of the noise component of each of the plurality of frequency bands.
For example, the second noise estimator 126 obtains noise power of each of Band 1 of 0 to 250 Hz, Band 2 of 250 to 500 Hz, Band 3 of 500 to 750 Hz, and Band 4 of 750 to 1000 Hz. However, the number of bands and the bandwidth are not limited to this example.
Furthermore, the second noise estimator 126 weights the noise power in each band. The weight is increased for a band having a large auditory effect and decreased for a band having a small auditory effect. For example, the second noise estimator 126 sets a weighting coefficient of Band 1 as 0.8, a weighting coefficient of Band 2 as 0.1, a weighting coefficient of Band 3 as 0.05, and a weighting coefficient of Band 4 as 0.05, multiplies the noise power of each band by each weighting coefficient, and calculates an expectation value. The second noise estimator 126 adds the expectation value of each band. The second noise estimator 126 sets an addition result as a noise power estimation value.
In such a manner, the second noise estimator 126 estimates noise by separating a band that is able to be predicted to be more affected by the noise and a band that is able to be predicted to be less affected by the noise. As a result, the second noise estimator 126 is able to stabilize filter processing by the EQ 122.
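The weighted addition over Band 1 to Band 4 can be written out as a short sketch. The weighting coefficients are those of the example above; the function name `weighted_noise_power` and the sample band powers are assumptions for illustration.

```python
# Weighting coefficients for Band 1 to Band 4, as given in the example.
BAND_WEIGHTS = (0.8, 0.1, 0.05, 0.05)


def weighted_noise_power(band_powers, weights=BAND_WEIGHTS):
    """Multiply each band's noise power by its weight and add the results."""
    if len(band_powers) != len(weights):
        raise ValueError("one noise power is required per band")
    return sum(power * weight for power, weight in zip(band_powers, weights))


# Hypothetical noise powers for Band 1 (0 to 250 Hz) through Band 4 (750 to 1000 Hz).
estimate = weighted_noise_power([10.0, 2.0, 4.0, 4.0])
```

With these coefficients, Band 1 contributes 8.0 of the roughly 8.6 total in this example, so a fluctuation in Band 3 or Band 4 changes the noise power estimation value only slightly.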
As shown in
In contrast, as shown in
It is to be noted that the EQ 122 may perform the filter processing in a band narrower than a plurality of frequency bands (Band 1 to Band 4) estimated by the second noise estimator 126. For example, the EQ 122 may perform the filter processing only on the band (Band 1, for example) having the largest auditory effect. As a result, the EQ 122 is able to minimize a change in sound quality.
The first noise estimator 125 or the second noise estimator 126 may obtain image data, and may estimate noise based on obtained image data.
Specifically, the second noise estimator 126 recognizes a noise source included in the image data, and obtains the noise power estimation value according to the state of a recognized noise source. The noise source includes a person, a PC, an air conditioner, a ventilation fan, or a vacuum cleaner, for example.
The second noise estimator 126 obtains the noise power estimation value based on the number of movable objects (pedestrians, for example) to be recognized within a predetermined time, for example. The second noise estimator 126 estimates that the noise power estimation value is increased as the number of movable objects (pedestrians, for example) recognized within the predetermined time is increased, and estimates that the noise power estimation value is decreased as the number of movable objects (pedestrians, for example) recognized within the predetermined time is decreased.
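The relationship between the number of recognized movable objects and the noise power estimation value is stated only as monotonic. A minimal sketch of one such mapping follows, assuming a linear increase with a cap; the function name and every constant are hypothetical, and the image recognition step that produces the count is outside the sketch.

```python
def noise_power_from_object_count(count, base_power=1e-6,
                                  power_per_object=5e-7, max_power=1e-4):
    """Map a count of recognized movable objects to a noise power estimate.

    The estimate rises as the count rises and falls as the count falls.
    All constants are hypothetical.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    return min(max_power, base_power + power_per_object * count)


few_pedestrians = noise_power_from_object_count(1)
many_pedestrians = noise_power_from_object_count(3)
```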
Alternatively, the second noise estimator 126 may obtain the noise power estimation value based on the number of persons at a distant place. The second noise estimator 126 may recognize the image of an air conditioner, and may obtain the noise power estimation value based on a state (the number of rotations of a fan, for example) of the air conditioner. Alternatively, the second noise estimator 126 may obtain the noise power estimation value based on a state (a degree of swinging of a curtain, for example) of an object around the air conditioner. Alternatively, the second noise estimator 126 may recognize a remote controller of the air conditioner, and may obtain the noise power estimation value based on a set temperature displayed on the remote controller. The second noise estimator 126, in a case of the air conditioner in cooling operation, estimates that the noise power estimation value is increased as the set temperature is decreased, and estimates that the noise power estimation value is decreased as the set temperature is increased. The second noise estimator 126, in a case of the air conditioner in heating operation, estimates that the noise power estimation value is increased as the set temperature is increased, and estimates that the noise power estimation value is decreased as the set temperature is decreased.
It is to be noted that the first noise estimator 125 may obtain image data from the camera 20 and may estimate noise based on obtained image data, or both of the first noise estimator 125 and the second noise estimator 126 may obtain image data from the camera 20 and may estimate noise based on obtained image data. In addition, the first noise estimator 125 or the second noise estimator 126 may estimate noise power based on the first sound signal and the image data.
The description of the foregoing embodiments is illustrative in all points and should not be construed to limit the present disclosure. The scope of the present disclosure is defined not by the foregoing embodiments but by the following claims. Further, the scope of the present disclosure includes the scopes of the claims and the scopes of equivalents.
For example, the EQ controller 124 may calculate the gain of the EQ 122 based on the noise power estimation value obtained by the first noise estimator 125. The EQ controller 124 may calculate the gain of the EQ 122 based on the ratio (S/N) of the power S to the noise power N of the first sound signal.
In addition, in
In addition, in a case in which the second noise estimator 126 obtains the noise power in each of the plurality of frequency bands and obtains the noise power estimation value, as shown in the first modification, the EQ controller 124 may change the gain for each band of the EQ 122 based on an obtained noise power estimation value.
For example,
In such a manner, the EQ controller 124 may change the gain of the EQ 122 for each band, based on the noise power estimation value. As a result, the EQ 122 is able to minimize a change in sound quality and accurately reduce noise.
Number | Date | Country | Kind |
---|---|---|---|
2022-007557 | Jan 2022 | JP | national |