1. Field of the Invention
The present invention relates to a noise cancellation method, and more particularly to a noise cancellation method for a portable device.
2. Description of the Related Art
Portable devices, such as smart phones, tablets and personal digital assistants (PDAs), have become necessities for consumers, for both personal and business use. More and more users use a portable device to shoot a video or record a voice mail. A typical portable device does not support noise cancellation for the voice received by its microphone, and noise may degrade the quality of the recorded voice whether the user is indoors or outdoors. When the user is outdoors, the microphone is easily affected by wind noise. When the user is indoors, the microphone is easily affected by reflected voice signals. The noise suppression methods for wind noise and for reflected voice signals are different and are not easily integrated in a portable device.
An embodiment of the invention provides a noise cancellation method for an electronic device. The method comprises: receiving an audio signal; applying a Fast Fourier Transform operation on the audio signal to generate a sound spectrum; acquiring a first spectrum corresponding to a noise and a second spectrum corresponding to a human voice signal from the sound spectrum; estimating a center frequency according to the first spectrum and the second spectrum; and applying a high pass filtering operation to the sound spectrum according to the center frequency.
Another embodiment of the invention provides a noise cancellation method for an electronic device. The method comprises the steps of: receiving an audio signal; applying a Fast Fourier Transform operation on the audio signal to generate a sound spectrum; determining whether the electronic device is outdoors according to the sound spectrum; and executing the following steps when the electronic device is outdoors: acquiring a first spectrum corresponding to a noise and a second spectrum corresponding to a human voice signal from the sound spectrum; estimating a center frequency according to the first spectrum and the second spectrum; and applying a high pass filtering operation to the sound spectrum according to the center frequency.
A detailed description is given in the following embodiments with reference to the accompanying drawings.
The present invention can be more fully understood by reading the subsequent detailed description and examples with references made to the accompanying drawings, wherein:
The following description is of the best-contemplated mode of carrying out the invention. This description is made for the purpose of illustrating the general principles of the invention and should not be taken in a limiting sense. The scope of the invention is best determined by reference to the appended claims.
In the step S24, the noise suppression device estimates a center frequency fc according to a first energy of the noise spectrum and a second energy of a human speech spectrum. Then, a center frequency of a frequency domain high pass filter is adjusted according to the estimated center frequency. The first spectrum is then filtered by the frequency domain high pass filter to filter out the wind noise at a low frequency and a second spectrum is therefore generated. Then, in the step S25, the noise suppression device processes the second spectrum to enhance the human speech spectrum and suppress the noise spectrum according to the human speech spectrum and the noise spectrum, and a third spectrum is generated accordingly. An Inverse Fast Fourier Transform (IFFT) operation is then applied to the third spectrum to generate a filtered audio signal. The filtered audio signal is then stored or played by a speaker.
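The frequency domain high pass filtering and the inverse transform described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names `dft`, `idft` and `high_pass` are hypothetical, a naive discrete Fourier transform stands in for the FFT and IFFT devices, and the filter is assumed to be an ideal (brick-wall) filter that zeroes every bin below the center frequency fc. The description does not specify the filter shape, and the enhancement of step S25 is not shown.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (stand-in for the FFT operation)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse transform; returns the real part of each time sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def high_pass(spectrum, fc_hz, sample_rate_hz):
    """Zero every bin below fc_hz (assumed brick-wall response)."""
    n = len(spectrum)
    bin_hz = sample_rate_hz / n
    filtered = []
    for k, value in enumerate(spectrum):
        # Bins above n/2 mirror the negative frequencies, so fold the index.
        freq = min(k, n - k) * bin_hz
        filtered.append(value if freq >= fc_hz else 0)
    return filtered
```

For example, with 64 samples at an assumed 6400 Hz sampling rate (100 Hz per bin), a 100 Hz component is removed and a 1000 Hz component is kept when fc is set to 300 Hz.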
In
If the energy of the noise spectrum is not larger than a predetermined value, the processor 33 does not transmit the enable signal to the HPF 34. Instead, the processor 33 transmits the select signal to the IFFT device 35, and the IFFT device 35 applies an Inverse Fast Fourier Transform operation to the first spectrum output by the FFT device 32. In another embodiment, if the energy of the noise spectrum is not larger than the predetermined value but the processor 33 receives a control signal indicating that the user wants to apply a noise cancellation operation or a noise suppression operation, the processor 33 transmits the enable signal to the HPF 34 to apply a high pass filter operation to the first spectrum. The processor 33 also transmits a select signal to the IFFT device 35, and the IFFT device 35 applies the Inverse Fast Fourier Transform operation to the output of the HPF 34 rather than to the first spectrum output by the FFT device 32. Thus, the processor 33 can bypass the step of determining whether the energy of the noise spectrum is larger than the predetermined value.
After the processor 33 receives the first spectrum, the processor 33 first acquires a noise spectrum corresponding to a first frequency range and a human speech spectrum corresponding to a second frequency range. The processor 33 estimates a center frequency fc according to a first energy of the noise spectrum and a second energy of the human speech spectrum. When the center frequency of the HPF device 34 is adjusted to the center frequency fc, the HPF device 34 applies a high pass filter operation to the first spectrum to filter out the low-frequency wind noise, and a second spectrum is generated. The second spectrum is transmitted to the IFFT device 35, and the IFFT device 35 executes an IFFT operation to transform the second spectrum into a second audio signal. In this embodiment, the first frequency range is from 0 to 100 Hz and the second frequency range is from 300 Hz to 4 kHz, but they are not limited thereto.
The processor 33 can set different frequency ranges according to the type of noise. The processor 33 first determines the type of noise according to the first spectrum and then determines the center frequency of the HPF device 34 accordingly. In other words, the invention can cancel or suppress not only wind noise but also noise in any frequency range.
In this embodiment, the processor 43 can select only one of the frequency domain HPF 44 and time domain HPF 46 to execute the filter operation, or both the frequency domain HPF 44 and time domain HPF 46 execute the filter operation. If both the frequency domain HPF 44 and time domain HPF 46 work simultaneously, the processor 43 transmits a select signal SEL to the enhancement device 48 and the enhancement device 48 processes the output signal from the frequency domain HPF 44 or a second FFT device 47 according to the select signal SEL. In other words, a multiplexer can be applied for directing the output signal from the frequency domain HPF 44 or the output signal from the second FFT device 47 to the enhancement device 48 according to the select signal SEL. The enhancement device 48 can be implemented by hardware or software to enhance the human voice signal of the received signal and suppress the wind noise of the received signal.
When the processor 43 receives the first spectrum, the processor 43 acquires a noise spectrum N corresponding to a first frequency range associated with the noise and a human speech spectrum S corresponding to a second frequency range associated with the human speech signal. The processor 43 calculates a ratio (PN/PS) according to a first energy of the noise spectrum and a second energy of the human speech spectrum to estimate a center frequency fc. The processor 43 then adjusts the center frequency of both the frequency domain HPF 44 and the time domain HPF 46 to fc. When the center frequency of the frequency domain HPF 44 is set, the first spectrum is filtered by the frequency domain HPF 44, the low-frequency wind noise is filtered out of the first spectrum, and a second spectrum is generated accordingly. When the center frequency of the time domain HPF 46 is set, the first audio signal is filtered by the time domain HPF 46, the low-frequency wind noise is filtered out of the first audio signal, and a second audio signal is generated accordingly. The second audio signal is transmitted to the second FFT device 47 to generate a third spectrum.
The processor 43 transmits the noise spectrum N and the human speech spectrum S to the enhancement device 48. The enhancement device 48 receives the second spectrum or the third spectrum according to a select signal SEL, enhances the human speech of the received spectrum, and suppresses the noise of the received spectrum. For example, the second spectrum can be represented as (S2+N2). The enhancement device 48 calculates an average spectrum of the second spectrum and the human speech spectrum, wherein the average spectrum can be represented as ((S+S2)/2+N2/2). The enhancement device 48 then subtracts the noise spectrum N from the average spectrum to generate the result ((S+S2)/2+(N2−N)/2). In this way, the signal-to-noise ratio between the human speech spectrum (S+S2)/2 and the noise spectrum (N2−N)/2 becomes larger, and the quality of the output audio signal improves accordingly. The enhancement device 48 outputs a fourth spectrum to the IFFT device 45, and an IFFT operation is applied to the fourth spectrum to generate a third audio signal. The processor 43 can set different frequency ranges according to the type of noise. The processor 43 first determines the type of noise according to the first spectrum and then determines the center frequency of the frequency domain HPF 44 and the time domain HPF 46 accordingly. In other words, the invention can cancel or suppress not only wind noise but also noise in any frequency range.
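The averaging and subtraction performed by the enhancement device 48 can be illustrated per frequency bin as follows. This is a sketch under assumptions, not the claimed implementation: the function name `enhance` is hypothetical, the spectra are treated as lists of real-valued magnitudes, and the noise estimate N is weighted by 1/2 so that the output matches the stated result ((S+S2)/2+(N2−N)/2).

```python
def enhance(second_spectrum, speech_spectrum, noise_spectrum):
    """Average the filtered spectrum with the speech estimate, then remove noise.

    second_spectrum: per-bin values of the filtered spectrum (S2 + N2)
    speech_spectrum: per-bin values of the speech estimate S
    noise_spectrum:  per-bin values of the noise estimate N
    Assumption: N is weighted by 1/2 to match the stated result expression.
    """
    if not (len(second_spectrum) == len(speech_spectrum) == len(noise_spectrum)):
        raise ValueError("all spectra must have the same number of bins")
    return [(s2n2 + s) / 2 - n / 2
            for s2n2, s, n in zip(second_spectrum, speech_spectrum, noise_spectrum)]
```

For instance, a bin with S2 = 4 and N2 = 2 (second spectrum value 6), a speech estimate S = 4 and a noise estimate N = 2 yields (4+4)/2 + (2−2)/2 = 4, so the speech component is retained while the residual noise is cancelled.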
Although the description of the embodiment in
The generation of the center frequency and the manner in which the processor 43 detects noise are explained in the following. The signal received by the microphone 41 is first sampled by an analog-to-digital converter at a 48 kHz sampling rate to generate a digital signal. The digital signal is transmitted to a 256-point Fast Fourier Transform device to generate a corresponding spectrum. The energy of a first band of the spectrum and the energy of a second band of the spectrum are used to determine whether noise exists. The frequency of the wind noise can be acquired by the following equation:
(2/256) × 48 kHz = 375 Hz
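The arithmetic above can be checked with a short sketch. The function name `band_upper_edge_hz` is hypothetical; it simply multiplies the number of bands by the width of one band (the sampling rate divided by the number of FFT points).

```python
def band_upper_edge_hz(band_count, fft_points, sample_rate_hz):
    """Upper frequency edge covered by the first `band_count` FFT bands."""
    return band_count / fft_points * sample_rate_hz

# Two bands of a 256-point transform at 48 kHz cover 0 to 375 Hz.
wind_noise_edge = band_upper_edge_hz(2, 256, 48000)
```

With these values each band is 187.5 Hz wide, so bands 1 and 2 together span 0 to 375 Hz.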
The processor 43 determines the center frequency fc according to a signal-to-noise ratio (SNR) between the human speech signal and the noise. The SNR is determined by the following equation:
SNR = (the energy from band 3 to band 24) / (the energy from band 1 to band 2) = (the energy from 375 Hz to 4 kHz) / (the energy from 0 to 375 Hz)
In the present application, the center frequency fc estimated from the SNR ranges from 100 Hz to 1000 Hz.
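The SNR calculation and the bounded center frequency can be sketched as follows. The band sums follow the equation above, with band b assumed to map to list index b−1. The mapping from the SNR to fc is not given in the description, so `center_frequency_hz` is a purely hypothetical mapping: it assumes that a lower SNR (more wind noise) calls for a higher cutoff, interpolates on a logarithmic SNR scale between the assumed limits `snr_low` and `snr_high`, and clamps the result to the 100 Hz to 1000 Hz range.

```python
import math

def band_energy(power, first_band, last_band):
    """Sum per-band power over 1-based bands first_band..last_band."""
    return sum(power[first_band - 1:last_band])

def estimate_snr(power):
    """SNR = energy of bands 3..24 divided by energy of bands 1..2."""
    noise = band_energy(power, 1, 2)
    speech = band_energy(power, 3, 24)
    return float("inf") if noise == 0 else speech / noise

def center_frequency_hz(snr, snr_low=0.1, snr_high=10.0,
                        fc_min=100.0, fc_max=1000.0):
    """Hypothetical mapping: low SNR -> fc_max, high SNR -> fc_min."""
    clamped = min(max(snr, snr_low), snr_high)
    span = math.log10(snr_high) - math.log10(snr_low)
    fraction = (math.log10(snr_high) - math.log10(clamped)) / span
    return fc_min + fraction * (fc_max - fc_min)
```

Any monotonic mapping bounded to the same range would serve equally well for illustration; the logarithmic interpolation is chosen only because energy ratios are commonly compared on a logarithmic scale.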
Y(k)=g(k)*x(k)
The gain value of the gain function g(k) ranges from 0.1 to 1. For example, if the Fast Fourier Transform executed in step S62 is a 256-point Fast Fourier Transform, the gain function g(k) comprises 256 gain values to adjust the energy of each point of the first spectrum. Furthermore, in step S64, an echo noise spectrum n(k) is also estimated according to the first audio signal or the first spectrum. The echo noise spectrum n(k) is represented by the equation:
n(k)=(1−g(k))*u(k)
wherein u(k) is the original estimated noise.
Then, a second spectrum is generated by subtracting n(k) from Y(k). In the step S65, the second spectrum is transformed into a third audio signal x″(t) by an Inverse Fast Fourier Transform operation.
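The three relations above can be combined per frequency point as in the following sketch. The function name `suppress_echo` is hypothetical, and the spectra are assumed to be equal-length lists of real-valued magnitudes; how g(k) and u(k) are estimated is not shown here.

```python
def suppress_echo(x, g, u):
    """Apply Y(k) = g(k)*x(k), n(k) = (1 - g(k))*u(k), and return Y(k) - n(k).

    x: first spectrum, g: gain values (each within 0.1..1),
    u: original estimated noise. All three lists must have equal length.
    """
    if not (len(x) == len(g) == len(u)):
        raise ValueError("x, g and u must have the same number of points")
    second_spectrum = []
    for xk, gk, uk in zip(x, g, u):
        if not 0.1 <= gk <= 1.0:
            raise ValueError("gain values must lie between 0.1 and 1")
        y = gk * xk               # gain-adjusted spectrum Y(k)
        n = (1.0 - gk) * uk       # echo noise spectrum n(k)
        second_spectrum.append(y - n)
    return second_spectrum
```

A point with a gain of 1 passes through unchanged, because n(k) is then zero; a point with a smaller gain is both attenuated and reduced by a larger share of the estimated noise.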
Generally speaking, the frequency of wind noise ranges from 0 to 100 Hz, and the frequency of human speech signals ranges from 300 Hz to 4 kHz. In this embodiment, the user or designer sets a first frequency range corresponding to the wind noise and a second frequency range corresponding to the speech signal, and a noise spectrum corresponding to the first frequency range and a human speech spectrum corresponding to the second frequency range are acquired by an application program. Then, a first determination device may determine whether the user is outdoors according to the energy of the noise spectrum. If the user is determined not to be outdoors, step S704 is executed. If the user is determined to be outdoors, step S706 is executed.
In step S706, the energy of the noise spectrum Nr is compared with a first predetermined value Nth1. If the energy of the noise spectrum Nr is larger than the first predetermined value Nth1, step S711 is executed to cancel the noise. If the energy of the noise spectrum Nr is smaller than the first predetermined value Nth1, step S707 is executed. In step S707, whether the noise suppression function is to be forcibly executed is determined according to user settings. For example, when the user uses a portable device to execute a video recording program or a voice recording program, an operation menu pops up on the display of the portable device for the user to determine whether the noise cancellation or suppression operation should be executed. If the answer of step S707 is yes, meaning that the noise cancellation or suppression operation has to be executed, step S711 is then executed. If the answer of step S707 is no, meaning that the noise cancellation or suppression operation does not have to be executed, step S715 is then executed. In step S715, an IFFT operation is applied to the first spectrum to generate a second audio signal.
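The branching of steps S706 and S707 can be summarized in a short sketch. The function name `next_step_outdoors` and its parameter names are hypothetical. The description does not say what happens when the energy exactly equals Nth1; this sketch assumes that case is handled like the "smaller" case.

```python
def next_step_outdoors(noise_energy, nth1, user_forces_suppression):
    """Return the label of the step executed after S706/S707."""
    # S706: noise energy above the threshold goes straight to cancellation.
    if noise_energy > nth1:
        return "S711"
    # S707: otherwise the user setting decides (equality is assumed to land here).
    if user_forces_suppression:
        return "S711"
    # No suppression requested: apply the IFFT to the first spectrum.
    return "S715"
```

The user setting therefore acts as an override that can force the suppression path even when the measured noise energy is low.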
In step S711, a signal-to-noise ratio (SNR) is estimated according to the energy of the noise spectrum and the energy of the human speech spectrum. In step S712, a center frequency fc is estimated according to the SNR. In step S713, a center frequency of a frequency domain high pass filter is adjusted according to the center frequency fc, and the first spectrum is filtered by the frequency domain high pass filter to filter out the low-frequency wind noise, so that a second spectrum is generated. In step S714, a noise suppression operation is further applied to the second spectrum according to the noise spectrum and the human speech spectrum to enhance the human speech of the second spectrum and suppress the wind noise of the second spectrum. A third spectrum is generated accordingly. In step S714, the third spectrum is processed by the IFFT operation to generate a filtered audio signal.
In step S704, a second determination device may determine whether the user is indoors according to the first spectrum. In one embodiment, the second determination device may determine whether the echo noise exists according to two successive spectrums. If the result of step S704 is no, step S705 is executed. If the result of step S704 is yes, step S708 is executed. In step S708, an indoor noise, such as an echo, is estimated according to the first spectrum, and the energy of the indoor noise Nr is compared with a second predetermined value Nth2. If the energy of the indoor noise Nr is larger than the second predetermined value Nth2, the step S716 is executed to suppress the noise. For the operation of the step S716, reference can be made to the description of
If the energy of the noise spectrum is not larger than a predetermined value, the processor 83 does not transmit the enable signal to the HPF 84 and transmits the select signal to the IFFT device 85. The IFFT device 85 applies an Inverse Fast Fourier Transform operation to the first spectrum output by the FFT device 82 according to the select signal. In another embodiment, if the energy of the noise spectrum is not larger than the predetermined value but the processor 83 receives a control signal indicating that the user wants to apply a noise cancellation operation or a noise suppression operation to the audio signal received by the microphone 81, the processor 83 directly transmits the enable signal to the HPF 84 to apply a high pass filter operation to the first spectrum. The processor 83 also transmits a select signal to the IFFT device 85, and the IFFT device 85 applies the Inverse Fast Fourier Transform operation to the output signal of the HPF 84 rather than to the first spectrum output by the FFT device 82. Thus, the processor 83 can bypass the step of determining whether the energy of the noise spectrum is larger than the predetermined value.
After the processor 83 receives the first spectrum, the processor 83 first acquires a noise spectrum corresponding to a first frequency range and a human speech spectrum corresponding to a second frequency range. The processor 83 estimates a center frequency fc according to a first energy of the noise spectrum and a second energy of the human speech spectrum. After the center frequency of the HPF device 84 is adjusted to the center frequency fc, the HPF device 84 applies a high pass filter operation to the first spectrum to filter out the low-frequency wind noise, and a second spectrum is generated. The second spectrum is transmitted to the IFFT device 85, and the IFFT device 85 executes an IFFT operation to transform the second spectrum into a second audio signal. In this embodiment, the first frequency range is from 0 to 100 Hz and the second frequency range is from 300 Hz to 4 kHz, but they are not limited thereto. The processor 83 can set different frequency ranges according to the type of noise. The processor 83 first determines the type of noise according to the first spectrum and then determines the center frequency of the HPF device 84 accordingly. In other words, the invention can cancel or suppress not only wind noise but also noise in any frequency range.
When the processor 83 receives the first spectrum and determines that the portable device is indoors, the first spectrum is transmitted to the enhancement device 86. At the same time, the processor 83 transmits the select signal SEL to the IFFT device 85 so that the IFFT device 85 processes the output signal of the enhancement device 86. The enhancement device 86 estimates a noise spectrum according to a previously received audio signal and executes a noise suppression operation on the first spectrum according to the noise spectrum to generate a third spectrum. The third spectrum is then transmitted to the IFFT device 85, which applies an IFFT operation to the third spectrum to generate a third audio signal.
The SNR estimator 93 estimates an SNR according to the energy of the first spectrum and the energy of the second spectrum. The SNR is transmitted to a center frequency generator 94 to estimate a center frequency fc. The high pass filter adjusts its center frequency according to the center frequency fc and applies a high pass filter operation to the audio spectrum. Then, the output of the high pass filter is transmitted to an IFFT device to output a second audio signal.
Generally speaking, the frequency of wind noise ranges from 0 to 100 Hz, and the frequency of human speech signals ranges from 300 Hz to 4 kHz. In this embodiment, the user or designer sets a first frequency range corresponding to the wind noise and a second frequency range corresponding to the speech signal, and a noise spectrum corresponding to the first frequency range and a human speech spectrum corresponding to the second frequency range are acquired by an application program. Then, a first determination device may determine whether the user is outdoors according to the energy of the noise spectrum.
In this embodiment, a second determination device may determine whether the user is indoors according to the first spectrum. In one embodiment, the second determination device may determine whether echo noise is generated according to two successive spectra. If the user is determined to be indoors, step S1104 is executed. If the user is determined to be outdoors, step S1106 is executed.
In step S1106, the energy of the noise spectrum Nr is compared with a first predetermined value Nth1. If the energy of the noise spectrum Nr is larger than the first predetermined value Nth1, step S1111 is executed to cancel the noise. If the energy of the noise spectrum Nr is smaller than the first predetermined value Nth1, step S1107 is executed. In step S1107, whether the noise suppression function is to be forcibly executed is determined according to user settings. For example, when the user uses a portable device to execute a video recording program or a voice recording program, an operation menu pops up on the display of the portable device for the user to determine whether the noise cancellation or suppression operation should be executed. If the answer of step S1107 is to execute the noise cancellation or suppression operation, step S1111 is then executed. If the answer of step S1107 is not to execute the noise cancellation or suppression operation, step S1115 is then executed. In step S1115, an IFFT operation is applied to the first spectrum to generate a second audio signal.
In step S1111, a signal-to-noise ratio (SNR) is determined according to the energy of the noise spectrum and the energy of the human speech spectrum. In step S1112, a center frequency fc is estimated according to the SNR. A center frequency of a frequency domain high pass filter is adjusted according to the center frequency fc, and the first spectrum is filtered by the frequency domain high pass filter to filter out the low-frequency wind noise, so that a second spectrum is generated. In step S1114, noise suppression is applied to the second spectrum according to the noise spectrum and the human speech spectrum to enhance the human speech and suppress the wind noise. A third spectrum is generated accordingly. In step S1114, the third spectrum is processed by the IFFT operation to generate a filtered audio signal.
In step S1104, a second determination device may determine whether the user is indoors according to the first spectrum. In one embodiment, the second determination device may determine whether the echo noise exists according to two successive spectrums. If the result of step S1104 is no, step S1105 is executed. If the result of step S1104 is yes, step S1108 is executed. In step S1108, indoor noise, such as an echo, is estimated according to the first spectrum, and the energy of the indoor noise Nr is compared with a second predetermined value Nth2. If the energy of the indoor noise Nr is larger than the second predetermined value Nth2, the step S1116 is executed to suppress the noise. For the operation of the step S1116, reference can be made to the description of
In step S1109, whether the noise suppression function is to be forcibly executed is determined according to user settings. For example, when the user uses a portable device to execute a video recording program or a voice recording program, an operation menu pops up on the display of the portable device for the user to determine whether the noise cancellation or suppression operation should be executed. If the answer of step S1109 is yes, step S1116 is then executed. If the answer of step S1109 is no, step S1115 is then executed. In step S1115, the first spectrum is processed by the IFFT operation to generate a second audio signal.
While the invention has been described by way of example and in terms of the preferred embodiments, it is to be understood that the invention is not limited to the disclosed embodiments. On the contrary, it is intended to cover various modifications and similar arrangements (as would be apparent to those skilled in the art). Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.
This application is a Divisional of copending application Ser. No. 13/471,085, filed on May 14, 2012, which is hereby expressly incorporated by reference into the present application.
Relation | Number | Date | Country |
---|---|---|---|
Parent | 13471085 | May 2012 | US |
Child | 15003629 | | US |