1. Field of the Invention
The present invention relates to a method for reducing noise; more particularly, the present invention relates to a method capable of controlling a noise adjustment ratio during a noise reduction process.
2. Description of the Related Art
There are various ways of reducing noise. A known technique based on amplitude adjustment is disclosed in publications such as Taiwan Patent No. M277217, issued on Oct. 1, 2005 and entitled “Background noise elimination device”, which comprises an amplitude capture channel that blocks low voltage signals because, in its disclosure, low voltage signals are treated as noise. After the low voltage signals are blocked, the high voltage signals (normal voice) that pass through the channel are played as voice without noise interference. However, the blocked low voltage signals may contain non-noise voice. If they are treated as noise and blocked outright, the output voice differs from the original voice and sounds unnatural. It is therefore necessary to improve on noise reduction methods that rely on simple amplitude adjustment.
Therefore, there is a need to provide a method for reducing noise and a computer program thereof and an electronic device to mitigate and/or obviate the aforementioned problems.
It is an object of the present invention to provide a method for reducing noise.
To achieve the abovementioned object, the method for reducing noise of the present invention comprises: dividing an input voice into a plurality of voice segments; and obtaining a maximum energy reference value of a current voice segment.
The energy of the current voice segment is adjusted according to a current reference ratio, wherein the current reference ratio is calculated according to the maximum energy reference value and a predetermined energy value, and the current reference ratio is less than or equal to 1 and greater than or equal to 0.
According to one embodiment of the present invention, the maximum energy reference value is determined according to the maximum energy from n voice segments prior to the current voice segment, wherein n is between 0 and 180 (depending on the number of sampling points included in each voice segment and a system sampling rate; as an assumption of covering two wave crests (or two wave troughs) of 70 Hz, n is 9 if the sampling rate is 44100 Hz and each voice segment has 64 sampling points; and n is 171 if the sampling rate is 192000 Hz and each voice segment has 16 sampling points); if n is 0, the maximum energy reference value is the maximum energy of the current voice segment.
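The two example values of n given above can be reproduced with a short calculation. The following is a minimal sketch only: it assumes that "covering two wave crests of 70 Hz" means spanning one full 70 Hz period and that the result is rounded down. The function name `segments_per_period` is illustrative and does not appear in the disclosure.

```python
def segments_per_period(sample_rate_hz: int, samples_per_segment: int,
                        lowest_freq_hz: int = 70) -> int:
    """Number of whole voice segments that fit in one period of lowest_freq_hz.

    Assumption: one period (crest to crest) is the span to cover, rounded down.
    """
    # period / segment_duration = (1 / f) / (samples / rate) = rate / (f * samples)
    return sample_rate_hz // (lowest_freq_hz * samples_per_segment)


print(segments_per_period(44100, 64))   # 9
print(segments_per_period(192000, 16))  # 171
```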
According to one embodiment of the present invention, the current reference ratio is calculated further according to a previous reference ratio, where the previous reference ratio is the ratio used for adjusting the energy of a previous voice segment. The previous reference ratio is less than or equal to 1 and greater than or equal to 0, and the previous voice segment is the voice segment immediately preceding the current voice segment.
According to one embodiment of the present invention, the current reference ratio is calculated further according to a constraint coefficient, and the constraint coefficient is less than 1 and greater than 0. The constraint coefficient can differ depending on whether the voice energy is increasing or decreasing. For example, when the voice energy increases (the current reference ratio is greater than the previous reference ratio), the constraint coefficient is between 0.01 and 1; when the voice energy decreases (the current reference ratio is less than the previous reference ratio), the constraint coefficient is between 0.0004 and 0.1. When the voice energy increases, there is no need to restrict the change of the reference ratio very much, so that normal voice is output normally as soon as possible (with the reference ratio returning to 1); the constraint coefficient is therefore larger. When the voice energy decreases, the ending sound of normal voice, which has a smaller amplitude, is easily mistaken for noise; to avoid over-adjustment that would mute the ending sound, the reference ratio is adjusted more slowly, which calls for a smaller constraint coefficient.
According to one embodiment of the present invention, the energy of the maximum energy reference value and the predetermined energy value is a sound amplitude.
According to one embodiment of the present invention, the predetermined energy value is between 30 dB and 90 dB.
Other objects, advantages, and novel features of the invention will become more apparent from the following detailed description when taken in conjunction with the accompanying drawings.
These and other objects and advantages of the present invention will become apparent from the following description of the accompanying drawings, which disclose several embodiments of the present invention. It is to be understood that the drawings are to be used for purposes of illustration only, and not as a definition of the invention.
In the drawings, wherein similar reference numerals denote similar elements throughout the several views:
Please refer to
A voice electronic device 10 of the present invention comprises a voice receiver 11, a voice processing module 12 and a speaker 13. The voice receiver 11 receives an input voice 20, which is processed by the voice processing module 12 and then outputted by the speaker 13 to a user 81. The voice receiver 11 can be a microphone or any other equivalent voice receiving equipment, and the speaker 13 (which can also include an amplifier) can be a headphone or any other equivalent voice outputting equipment, without being limited to these examples. The voice processing module 12 is generally composed of a sound effect processing chip associated with a control circuit and an amplification circuit, or of a processor and a memory associated with a control circuit and an amplification circuit. The purpose of the voice processing module 12 is to amplify voice signals, to filter out noise, to change the frequency composition of the voice, and to carry out the processes necessary to achieve the object of the present invention. Because the voice processing module 12 can be implemented with conventional hardware and new firmware or software, its hardware structure is not described further. The voice electronic device 10 of the present invention can be a dedicated hardware device, or can be, but is not limited to, a small computer such as a personal digital assistant (PDA), a mobile phone, a hearing-aid headphone (such as a Bluetooth headphone having a chip or a processor for processing audio signals), a smart phone and/or a personal computer installed with a software program. The voice electronic device 10 of the present invention can be designed for a hearing-impaired listener; therefore, the voice processing module 12 can perform functions such as frequency conversion, frequency compression or frequency shifting.
However, because the purpose of the present invention is not focused on frequency processing, there is no need for further description.
Then, please refer to
The object of the present invention is to reduce the influence of noise energy on the overall voice energy. In this embodiment, energy is defined as sound amplitude. Noise is identified by setting a predetermined energy value as a reference value, such as 40 dB: voice above 40 dB is determined to be normal voice, and voice below 40 dB is determined to be noise. Voice determined to be noise is multiplied by a certain ratio to reduce its energy and thereby reduce the influence of the noise. According to a preferred embodiment of the present invention, the predetermined energy value is between 30 dB and 90 dB. The predetermined energy value can be set as high as 90 dB because a user may use a device implementing this method for reducing noise while taking public transportation; in that case, the predetermined energy value would not be set to only 30 dB but to a higher value, such as 80 dB, so as to process louder noise.
Step 201: dividing the input voice 20 into a plurality of voice segments 21.
The time length of each voice segment is preferably between 0.0000833 and 0.1 second (e.g. it is suggested to be 0.0000833 second if the sampling rate is 192000 Hz and each voice segment has 16 sampling points). According to an experiment which utilizes an Apple iPhone4 as the hearing aid (by means of executing, in the Apple iPhone4, a software program made according to the present invention), a positive outcome is obtained when the time length of each voice segment is between about 0.0001 and 0.1 second, which means 10˜10,000 voice segments in each second. For the convenience of explanation, 15 voice segments are displayed in the embodiment.
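By way of illustration, Step 201 and the segment time length can be sketched as follows. This is a minimal sketch, assuming the input voice is available as a flat sequence of samples; the function names are hypothetical, and the handling of a shorter final segment is an assumption not addressed in the description.

```python
from typing import List, Sequence


def segment_duration_s(sample_rate_hz: int, samples_per_segment: int) -> float:
    """Time length of one voice segment in seconds."""
    return samples_per_segment / sample_rate_hz


def split_into_segments(samples: Sequence[float],
                        samples_per_segment: int) -> List[List[float]]:
    """Split samples into consecutive segments; the last one may be shorter."""
    return [list(samples[i:i + samples_per_segment])
            for i in range(0, len(samples), samples_per_segment)]


# 16 sampling points at 192000 Hz give roughly 0.0000833 second per segment.
print(segment_duration_s(192000, 16))

segments = split_into_segments([0.0] * 40, 16)
print([len(s) for s in segments])  # [16, 16, 8]
```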
Step 202: obtaining a maximum energy reference value of a current voice segment, wherein the maximum energy reference value is determined according to the energy from n voice segments prior to the current voice segment, where n is between 0 and 180. Basically, n can be larger if the time length of each voice segment is smaller.
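Step 202 can be sketched as below. This is an illustrative sketch, assuming that each voice segment is a non-empty list of amplitude samples and that the search window consists of the current voice segment together with up to n voice segments before it; the function name `max_energy_reference` is hypothetical.

```python
from typing import Sequence


def max_energy_reference(segments: Sequence[Sequence[float]],
                         index: int, n: int) -> float:
    """Largest absolute amplitude in segment `index` and up to n segments before it.

    Assumes every segment in the window is non-empty.
    """
    start = max(0, index - n)  # fewer than n prior segments exist near the start
    return max(abs(x) for seg in segments[start:index + 1] for x in seg)


segments = [[0.2, -0.9], [0.1, 0.3], [-0.5, 0.4]]
print(max_energy_reference(segments, 1, 0))  # 0.3 (current segment only)
print(max_energy_reference(segments, 1, 1))  # 0.9 (current and one prior segment)
print(max_energy_reference(segments, 2, 1))  # 0.5
```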
The maximum energy reference value is the value of the maximum amplitude among the voice segments. As shown in
Step 203: adjusting the energy of the current voice segment according to a current reference ratio, wherein the current reference ratio is calculated according to the maximum energy reference value, a predetermined energy value, a previous reference ratio and a constraint coefficient, and the current reference ratio is less than or equal to 1 and greater than or equal to 0.
After the maximum energy reference value is found, the voice processing module 12 divides the “maximum energy reference value” by the “predetermined energy value” to obtain a current reference ratio. If the maximum energy reference value is greater than or equal to the predetermined energy value, the current reference ratio is greater than or equal to 1, which means that the voice segment having the maximum energy reference value is normal voice, and the current reference ratio is therefore corrected to 1. Please note that the current reference ratio might need further correction that takes the previous reference ratio and the constraint coefficient into account. If the maximum energy reference value is less than the predetermined energy value, the voice processing module 12 determines that the current voice segment is noise and processes the current reference ratio.
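The division and the correction to 1 described above can be sketched as follows. This is a minimal sketch that assumes both values are expressed as amplitudes on the same linear scale; how a predetermined energy value stated in dB is mapped to that scale is not specified here and is left out. The function name `raw_reference_ratio` is hypothetical.

```python
def raw_reference_ratio(max_energy_ref: float, predetermined_energy: float) -> float:
    """Current reference ratio before correction by the constraint coefficient."""
    if predetermined_energy <= 0:
        raise ValueError("predetermined_energy must be positive")
    # A ratio of 1 or more indicates normal voice and is corrected to 1.
    return min(1.0, max_energy_ref / predetermined_energy)


print(raw_reference_ratio(0.6, 1.0))  # 0.6 (treated as noise)
print(raw_reference_ratio(1.3, 1.0))  # 1.0 (normal voice)
```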
The method of processing the noise is to multiply the “current voice segment energy” by the “ratio after correction” to be used as the current voice segment energy. However, in order to prevent the voice processing module 12 from over-processing the noise voice segment to produce unnatural voice, the present invention further comprises a constraint coefficient, which is used for restricting the correction range of the reference ratio. For the convenience of explaining the functions of the constraint coefficient applied for adjusting the reference ratio and n applied for correcting the reference ratio, in
To understand the above methods and the use of the constraint coefficient, please refer to
As shown in
The current reference ratio R5 of the voice segment T5 is calculated as 0.6 (by dividing the energy of A5 by the predetermined energy value), and it has to be corrected according to the constraint coefficient and the previous current reference ratio R4′. Because R5 is less than R4′, the corrected R5′ (1−0.1=0.9) is calculated by deducting one unit of the constraint coefficient from R4′.
The current reference ratio R6 of the voice segment T6 is calculated as 0.7, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R5′. Because R6 is less than R5′, the corrected R6′ (0.9−0.1=0.8) is calculated by deducting one unit of the constraint coefficient from R5′. According to the above description, there is no need for further describing the voice segment T7, wherein its corrected R7′ is calculated as 0.7.
The current reference ratio R8 of the voice segment T8 is calculated as 0.8, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R7′. Because R8 is greater than R7′, the corrected R8′ (0.7+0.1=0.8) is calculated by adding one unit of the constraint coefficient to R7′.
The current reference ratio R9 of the voice segment T9 is calculated as 0.8, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R8′. However, since R9 is equal to R8′, there is no need for correction.
The current reference ratio R10 of the voice segment T10 is calculated as greater than 1, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R9′. Because R10 is greater than R9′, the corrected R10′ (0.8+0.1=0.9) is calculated by adding one unit of the constraint coefficient to R9′.
The current reference ratio R11 of the voice segment T11 is calculated as greater than 1, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R10′. Because R11 is greater than R10′, the corrected R11′ (0.9+0.1=1) is calculated by adding one unit of the constraint coefficient to R10′.
The rules of correcting the voice segments T12˜T15 are identical to the rules of correcting the voice segments T0˜T4, so there is no need for further description.
In short, the ratio calculated for each voice segment is only a reference value for comparison. The ratio of the current voice segment is compared with the ratio of the previous voice segment, one unit of the constraint coefficient is added or deducted accordingly, and the resulting ratio is used as the ratio for reducing the voice energy.
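The correction rule summarized above can be sketched for the case in which n is 0 and the constraint coefficient is 0.1. This is an illustrative sketch rather than the definitive implementation: the function names are hypothetical, the raw values used for R4, R7, R10 and R11 are assumed (the description only states that they are at least 1, or below the previous corrected ratio), and limiting each step so that it never passes the calculated ratio is an assumption.

```python
from typing import List, Sequence


def step_ratio(previous: float, target: float, coefficient: float) -> float:
    """Move `previous` toward `target` by at most one unit of `coefficient`.

    Assumption: a step never overshoots the calculated (target) ratio.
    """
    if target > previous:
        return min(previous + coefficient, target)
    if target < previous:
        return max(previous - coefficient, target)
    return previous  # equal ratios need no correction


def corrected_ratios(raw: Sequence[float], coefficient: float,
                     initial: float = 1.0) -> List[float]:
    """Corrected ratio for each segment, starting from an initial ratio of 1."""
    out: List[float] = []
    previous = initial
    for target in raw:
        previous = step_ratio(previous, min(1.0, target), coefficient)
        out.append(previous)
    return out


def apply_ratio(segment: Sequence[float], ratio: float) -> List[float]:
    """Scale the amplitude of every sample in a segment by the corrected ratio."""
    return [x * ratio for x in segment]


# Raw ratios R4..R11; the values for R4, R7, R10 and R11 are assumed.
raw = [1.2, 0.6, 0.7, 0.6, 0.8, 0.8, 1.1, 1.2]
print([round(r, 2) for r in corrected_ratios(raw, 0.1)])
# [1.0, 0.9, 0.8, 0.7, 0.8, 0.8, 0.9, 1.0]
```

The printed values correspond to the corrected ratios R4′ to R11′ given in the walkthrough above.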
As shown in
According to the above rules, the maximum energy reference value adopted by T5 should be the maximum energy of T4, therefore the current reference ratio R5 (which is calculated by dividing A4 by the predetermined energy value) is greater than 1, and thus the current reference ratio R5′ would be corrected as 1.
The maximum energy reference value adopted by T6 should be the maximum energy of T6 (because A6>A5), therefore the current reference ratio R6 is 0.7, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R5′. Because R6 is less than R5′, the corrected R6′ (1−0.1=0.9) is calculated by deducting one unit of the constraint coefficient from R5′.
The maximum energy reference value adopted by T7 should be the maximum energy of T6 (because A7<A6), therefore the current reference ratio R7 is 0.7, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R6′. Because R7 is less than R6′, the corrected R7′ (0.9−0.1=0.8) is calculated by deducting one unit of the constraint coefficient from R6′.
The maximum energy reference value adopted by T8 should be the maximum energy of T8 (because A8>A7), therefore the current reference ratio R8 is 0.8, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R7′. However, since R8 is equal to R7′, there is no need for correction.
The maximum energy reference value adopted by T9 can be the maximum energy of either T8 or T9 (because A9=A8), therefore the current reference ratio R9 is 0.8, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R8′. However, since R9 is equal to R8′, there is no need for correction.
The maximum energy reference value adopted by T10 should be the maximum energy of T10 (because A10>A9), therefore the current reference ratio R10 is greater than 1, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R9′. Because R10 is greater than R9′, the corrected R10′ (0.8+0.1=0.9) is calculated by adding one unit of the constraint coefficient to R9′.
The maximum energy reference value adopted by T11 can be the maximum energy of either T10 or T11 (because both A11 and A10 are greater than the predetermined energy value), therefore the current reference ratio R11 is greater than 1, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R10′. Because R11 is greater than R10′, the corrected R11′ (0.9+0.1=1) is calculated by adding one unit of the constraint coefficient to R10′.
The rules of correcting the voice segments T12˜T15 are identical to the rules of correcting the voice segments T0˜T5, so there is no need for further description.
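The embodiment in which n is 1 can be sketched in the same manner. This is a minimal sketch under stated assumptions: each voice segment is represented only by its peak amplitude, the predetermined energy value is normalized to 1.0, the peak values used for A4, A7, A10 and A11 are assumed, and each step is limited so that it never passes the calculated ratio. The function name `ratios_with_lookback` is hypothetical.

```python
from typing import List, Sequence


def ratios_with_lookback(peaks: Sequence[float], predetermined: float,
                         n: int, coefficient: float,
                         initial: float = 1.0) -> List[float]:
    """Corrected ratio per segment, using the peak of the current and n prior segments."""
    out: List[float] = []
    previous = initial
    for i in range(len(peaks)):
        # Maximum energy reference value over the current and up to n prior segments.
        reference = max(peaks[max(0, i - n):i + 1])
        target = min(1.0, reference / predetermined)
        if target > previous:
            previous = min(previous + coefficient, target)
        elif target < previous:
            previous = max(previous - coefficient, target)
        out.append(previous)
    return out


# Peak amplitudes A4..A11 relative to a predetermined energy value of 1.0;
# the values for A4, A7, A10 and A11 are assumed.
peaks = [1.2, 0.6, 0.7, 0.6, 0.8, 0.8, 1.1, 1.2]
print([round(r, 2) for r in ratios_with_lookback(peaks, 1.0, 1, 0.1)])
# [1.0, 1.0, 0.9, 0.8, 0.8, 0.8, 0.9, 1.0]
```

Compared with the case in which n is 0, the corrected ratio for T5 stays at 1 because the maximum energy of T4 is still within the window, so the reduction begins one voice segment later.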
Please note that the initial value of the reference ratio is predetermined as 1. Therefore, in the above two embodiments, if the voice begins with noise (A0 is less than the predetermined energy value, so R0<1), the corrected ratio R0′ is calculated by deducting one unit of the constraint coefficient from the initial value of 1, that is, R0′ = 1−(constraint coefficient).
Please refer to
T4 to T8 show the change when the voice energy decreases, wherein the constraint coefficient is between 0.0004 and 0.1 when it decreases. In this embodiment, the constraint coefficient is set as 0.05.
The current reference ratio R5 of the voice segment T5 is calculated as 0.6, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R4′. Because R5 is less than R4′, the corrected R5′ (1−0.05=0.95) is calculated by deducting one unit of the constraint coefficient from R4′. Same calculation rules apply to T6 to T8.
T9 to T11 show the change when the voice energy increases, wherein the constraint coefficient is between 0.01 and 1 when it increases. In this embodiment, the constraint coefficient is set as 0.1.
The current reference ratio R10 of the voice segment T10 is calculated as greater than 1, and it has to be corrected according to the constraint coefficient and the previous current reference ratio R9′. Because R10 is greater than R9′, the corrected R10′ (0.8+0.1=0.9) is calculated by adding one unit of the constraint coefficient to R9′. The same calculation rule is also applied to T11.
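The use of different constraint coefficients for increasing and decreasing voice energy can be sketched as follows, with 0.1 for increases and 0.05 for decreases as in this embodiment. This is an illustrative sketch: the function names are hypothetical, the raw values used for R4, R7, R10 and R11 are assumed, and limiting each step so that it never passes the calculated ratio is an assumption.

```python
from typing import List, Sequence


def step_ratio_asymmetric(previous: float, target: float,
                          rise: float, fall: float) -> float:
    """Step toward `target` using `rise` when increasing and `fall` when decreasing."""
    if target > previous:
        return min(previous + rise, target)
    if target < previous:
        return max(previous - fall, target)
    return previous


def corrected_ratios_asymmetric(raw: Sequence[float], rise: float, fall: float,
                                initial: float = 1.0) -> List[float]:
    """Corrected ratio for each segment, starting from an initial ratio of 1."""
    out: List[float] = []
    previous = initial
    for target in raw:
        previous = step_ratio_asymmetric(previous, min(1.0, target), rise, fall)
        out.append(previous)
    return out


# Raw ratios R4..R11; the values for R4, R7, R10 and R11 are assumed.
raw = [1.2, 0.6, 0.7, 0.6, 0.8, 0.8, 1.1, 1.2]
result = corrected_ratios_asymmetric(raw, rise=0.1, fall=0.05)
print([round(r, 2) for r in result])
# [1.0, 0.95, 0.9, 0.85, 0.8, 0.8, 0.9, 1.0]
```

With the smaller coefficient on the way down, the ratio takes four voice segments (T5 to T8) to fall from 1 to 0.8, whereas the return from 0.8 to 1 takes only two (T10 and T11).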
If the number of voice segments n for selecting the maximum energy changes, the corrected ratio changes, and the degree of voice adjustment changes accordingly. For the convenience of explanation, n is set as 0 and 1 only as examples. However, according to preferred embodiments, if the sampling rate is 44100 Hz and each voice segment has 64 sampling points, n would be set as 7˜10 to better achieve the desired noise reduction. A higher number n of sampled voice segments is preferred because the amplitude of the voice itself follows a curve, and some voice segments that fall below the predetermined energy value are in fact just transitions of the curve rather than noise; using fewer segments would therefore easily cause misjudgement.
Please note that the method for reducing noise of the present invention is not only applicable for realtime hearing aid processing, but also can be applicable for a non-realtime voice processing device, such as removing noise from a pre-recorded voice. Although the present invention has been explained in relation to its preferred embodiments, it is to be understood that many other possible modifications and variations can be made without departing from the spirit and scope of the invention as hereinafter claimed.
Number | Date | Country | Kind |
---|---|---|---|
103139189 A | Nov 2014 | TW | national |
Number | Name | Date | Kind |
---|---|---|---|
20090281800 | LeBlanc | Nov 2009 | A1 |
20110015923 | Dai | Jan 2011 | A1 |
20130272543 | Tracey | Oct 2013 | A1 |
Number | Date | Country | |
---|---|---|---|
20160133270 A1 | May 2016 | US |