1. Field of the Invention
The present invention relates to a hearing aid device for a hearing-impaired listener.
2. Description of the Related Art
Hearing aids have been in use since the early 1900s. The main concept of the hearing aid is to amplify sounds so as to help a hearing-impaired listener to hear, and to make the sound amplification process generate almost no sound delay. Furthermore, if a hearing aid performs frequency processing, generally the processing reduces the sound frequency. For example, U.S. Pat. No. 6,577,739 “Apparatus and methods for proportional audio compression and frequency shifting” discloses a method of compressing a sound signal according to a specific proportion for being provided to a hearing-impaired listener with hearing loss in a specific frequency range. However, this technique involves compressing the overall sound; even though it can perform real-time output, the compression can result in serious sound distortion.
If frequency reduction is performed only on some high-frequency sounds, the distortion will be reduced. However, this technique involves a huge amount of computation, which may delay the output, and therefore it is often inappropriate for real-time speech processing. For example, the applicant filed U.S. patent application Ser. No. 13/064,645 (Taiwan Patent Application Serial No. 099141772), which discloses a method to reduce distortion; however, it still causes an output delay problem.
Therefore, there is a need to provide a hearing aid and a method of enhancing speech output in real time to reduce distortion of the sound output as well as to reduce the delay of the sound output caused by frequency processing or amplification, so as to mitigate and/or obviate the aforementioned problems.
During the process of performing frequency processing on speech, sometimes a time delay might occur, and such a delay causes asynchronous speech output. Therefore, it is an object of the present invention to provide a method of enhancing speech output in real time.
To achieve the abovementioned object, the present invention comprises the following steps:
dividing an input speech into a plurality of audio segments;
searching for at least two audio segments with attributes different from the plurality of audio segments, including:
and
outputting some of the plurality of audio segments, wherein:
According to the abovementioned steps, a delay caused by performing frequency processing on all or some of the non-soundless segments can be reduced or eliminated by deleting all or some of the soundless segments.
Other objects, advantages, and novel features of the invention will become more apparent from the following detailed description when taken in conjunction with the accompanying drawings.
These and other objects and advantages of the present invention will become apparent from the following description of the accompanying drawings, which disclose several embodiments of the present invention. It is to be understood that the drawings are to be used for purposes of illustration only, and not as a definition of the invention.
In the drawings, wherein similar reference numerals denote similar elements throughout the several views:
Please refer to
The hearing aid device 10 of the present invention comprises a sound receiver 11, a sound processing module 12, and a sound output module 13. The sound receiver 11 is used for receiving an input speech 20 transmitted from a sound source 80. After the input speech 20 is processed by the sound processing module 12, it can be outputted to a hearing-impaired listener 81 by the sound output module 13. The sound receiver 11 can be a microphone or any equipment capable of receiving sound. The sound output module 13 can include a speaker, an earphone, or any equipment capable of playing audio signals. However, please note that the scope of the present invention is not limited to the abovementioned devices. The sound processing module 12 is generally composed of a sound effect processing chip associated with a control circuit and an amplifier circuit; or it can be composed of a processor and a memory associated with a control circuit and an amplifier circuit. The object of the sound processing module 12 is to perform amplification processing, noise filtering, frequency composition processing, or any other necessary processing on sound signals in order to achieve the object of the present invention. Because the sound processing module 12 can be accomplished by utilizing known hardware associated with new firmware or software, there is no need for further description of the hardware structure of the sound processing module 12. The hearing aid device 10 of the present invention is basically a specialized device with custom-made hardware, or it can be a small computer such as a personal digital assistant (PDA), a PDA phone, a smart phone, or a personal computer. Take a mobile phone as an example; after a processor executes a software program in a memory, the main structure of the sound processing module 12 shown in
Now please refer to
Step 201: Receiving an input speech 20.
This step is accomplished by the sound receiver 11, which receives the input speech 20 transmitted from the sound source 80.
Step 202: Dividing the input speech 20 into a plurality of audio segments.
Please refer to “Stage 0” in
The time length of each audio segment is preferably between 0.0001 and 0.1 second. According to an experiment using an Apple iPhone 4 as the hearing aid device (by means of executing, on the Apple iPhone 4, a software program made according to the present invention), a positive outcome is obtained when the time length of each audio segment is between about 0.0001 and 0.1 second.
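The division in step 202 can be sketched as follows. This is a minimal illustration, not the implementation of the sound processing module 12; the function name `split_into_segments`, the 16,000 Hz sample rate used in the example, and the 0.01 second default segment length (chosen from within the preferred 0.0001 to 0.1 second range) are assumptions made for illustration only.

```python
def split_into_segments(samples, sample_rate, segment_seconds=0.01):
    """Split a sequence of samples into consecutive fixed-length audio segments.

    segment_seconds is a hypothetical default picked from the preferred
    range of 0.0001 to 0.1 second; the last segment may be shorter.
    """
    segment_length = round(sample_rate * segment_seconds)
    if segment_length < 1:
        raise ValueError("segment is shorter than one sample")
    return [samples[start:start + segment_length]
            for start in range(0, len(samples), segment_length)]


# Example: 400 samples at an assumed 16,000 Hz sample rate.
segments = split_into_segments(list(range(400)), 16000)
```

With these assumed values each full segment holds 160 samples, and the final segment holds whatever samples remain.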
Step 203:
Searching for at least two audio segments with different attributes from the plurality of audio segments, including:
The sound processing module 12 divides the input speech 20 into a plurality of audio segments and also determines the attribute “L”, “H” or “Q” of each audio segment. It is very easy to determine whether an audio segment is a soundless segment (i.e., “Q”). Basically, a sound energy threshold (such as 15 decibels) is given; any audio segment with sound energy less than the given sound energy threshold will be determined to be a soundless segment, and any audio segment with sound energy higher than the threshold will be determined to be a non-soundless segment. In this embodiment, the non-soundless segments are divided into at least two attributes, respectively marked as “L” (low-frequency segment) or “H” (high-frequency segment).
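The soundless test described above might be sketched as follows. The text gives only an example threshold (15 decibels) and does not state the reference level, so `reference_rms`, the function names, and the use of mean-square energy are assumptions; a real device would need a calibrated reference. The text also does not say how a segment exactly at the threshold is treated; this sketch treats it as non-soundless.

```python
import math


def level_db(samples, reference_rms=1.0):
    """Return the segment level in decibels relative to a hypothetical reference."""
    if not samples:
        return float("-inf")
    mean_square = sum(x * x for x in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square / (reference_rms ** 2))


def is_soundless(samples, threshold_db=15.0, reference_rms=1.0):
    """Mark a segment as soundless ("Q") when its level is below the threshold."""
    return level_db(samples, reference_rms) < threshold_db
```

A segment of zeros or of very small sample values is reported as soundless, while a segment with large sample values is not.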
As for determining whether an audio segment is a high-frequency segment or a low-frequency segment, the determination is primarily made according to the condition of the hearing-impaired listener. Generally, the frequency of human speech communication is between 20 Hz and 16,000 Hz. However, it is difficult for most hearing-impaired listeners to hear frequencies higher than 3,000 Hz or 4,000 Hz. The more severe the listener's impairment, the greater the loss of sensitivity in the high-frequency range. Therefore, whether the attribute of each audio segment is marked as “L” or “H” is determined according to the hearing-impaired listener. There are various known techniques for determining whether an audio segment should be marked “L” or “H”. For example, one technique analyzes whether each audio segment contains sound above a certain frequency (such as 3,000 Hz); however, this simple technique is somewhat imprecise. The applicant has previously filed U.S. patent application Ser. No. 13/064,645 (Taiwan Patent Application Serial No. 099141772), which discloses a technique for determining high-frequency or low-frequency energy. Some examples of possible determination rules follow:
If at most 30% of the sound energy of the audio segment is under 1,000 Hz and at least 70% of the sound energy of the audio segment is over 2500 Hz, the attribute of the audio segment is marked as high-frequency “H”; otherwise, the attribute of the audio segment is marked as low-frequency “L”.
If at least 30% of the sound energy of the audio segment is under 1,000 Hz, the attribute of the audio segment is marked as low-frequency “L”; otherwise, the attribute of the audio segment is marked as high-frequency “H”.
If at most 30% of the sound energy of the audio segment is under 1000 Hz, the attribute of the audio segment is marked as high-frequency “H”; otherwise, the attribute of the audio segment is marked as low-frequency “L”.
If at least 70% of the sound energy of the audio segment is over 2500 Hz, the attribute of the audio segment is marked as high-frequency “H”; otherwise, the attribute of the audio segment is marked as low-frequency “L”.
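The four example rules above can be sketched as follows. This is an illustrative sketch only: it estimates band energy with a naive discrete Fourier transform, which is an assumption and not the technique of the cited application, and the names `band_fractions`, `classify`, and the `rule` parameter are hypothetical.

```python
import cmath
import math


def band_fractions(samples, sample_rate, low_hz=1000.0, high_hz=2500.0):
    """Return (fraction of energy under low_hz, fraction of energy over high_hz)."""
    n = len(samples)
    below = above = total = 0.0
    for k in range(1, n // 2 + 1):  # skip the DC bin
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        energy = abs(coeff) ** 2
        total += energy
        if freq < low_hz:
            below += energy
        elif freq > high_hz:
            above += energy
    if total == 0.0:
        return 0.0, 0.0
    return below / total, above / total


def classify(samples, sample_rate, rule=1):
    """Mark a non-soundless segment "H" or "L" using one of the four example rules."""
    below, above = band_fractions(samples, sample_rate)
    if rule == 1:    # at most 30% under 1,000 Hz and at least 70% over 2,500 Hz
        is_high = below <= 0.30 and above >= 0.70
    elif rule == 2:  # at least 30% under 1,000 Hz means "L"
        is_high = not (below >= 0.30)
    elif rule == 3:  # at most 30% under 1,000 Hz means "H"
        is_high = below <= 0.30
    else:            # at least 70% over 2,500 Hz means "H"
        is_high = above >= 0.70
    return "H" if is_high else "L"
```

Under these assumptions a pure 500 Hz tone is marked “L” and a pure 4,000 Hz tone is marked “H” by all four rules; the rules differ only for segments with mixed energy.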
Basically, right after dividing an audio segment, the sound processing module 12 can immediately determine the attribute of the audio segment. Alternatively, the sound processing module 12 can divide, for example, five audio segments at first and then determine the attribute of each audio segment by means of batch processing.
Step 204:
Outputting some of the plurality of audio segments, wherein:
In this embodiment, the present invention performs frequency processing on non-soundless segments with attributes marked as “H” (high-frequency sound), and does not perform frequency processing on non-soundless segments with attributes marked as “L” (low-frequency sound). Because it is difficult for the hearing-impaired listener to hear high-frequency sound, the audio segments with attributes of “H” are classified as “processing-necessary segments”, and the audio segments with attributes of “L” are classified as “processing-free segments”. In order to enable the hearing-impaired listener to hear the high-frequency sound, the frequency processing lowers the sound frequency by means of methods such as frequency compression or frequency shifting. Because the techniques of frequency compression and frequency shifting are well known to those skilled in the art, there is no need for further description. Please note that a conventional technique for enabling the hearing-impaired listener to hear high-frequency sound is to lower the frequency of the entire sound section, which results in serious sound distortion. U.S. patent application Ser. No. 13/064,645 (Taiwan Patent Application Serial No. 099141772) discloses a technique that mitigates this distortion. However, first determining whether the sound is high-frequency or low-frequency and then deciding whether to perform frequency processing on the high-frequency sound causes a delay. Therefore, the technique disclosed in U.S. patent application Ser. No. 13/064,645 (Taiwan Patent Application Serial No. 099141772) causes an obvious delay problem when outputting speech in real time, and the present invention is provided to improve this problem.
Please refer to
Stage 0: An initial status. Please refer to the description of step 202 regarding how the audio segment is marked.
Stage 1: The attribute of the first audio segment S1 is marked as low-frequency “L”, and therefore the audio segment S1 will be outputted without undergoing frequency processing. Please note that in order to enable the hearing-impaired listener to hear sound, the outputted audio segment undergoes amplification processing (so as to enhance its sound energy).
Stage 2: The attribute of the second audio segment S2 is marked as low-frequency “L”, and therefore the audio segment S2 is outputted without undergoing frequency processing.
Stage 3: The attribute of the third audio segment S3 is marked as high-frequency “H”, and therefore the frequency processing is performed. Because the frequency processing takes time, it starts to generate a delayed output, wherein the audio segment S3 cannot be outputted in real time. For ease of explanation, an audio segment SX in Stage 3 is used as a virtual output, wherein the audio segment SX is in fact soundless and also represents a delayed time segment.
Stage 4: The attribute of the fourth audio segment S4 is marked as high-frequency “H”, and therefore the frequency processing is performed. In this embodiment, it is assumed that the time required for performing frequency processing is equal to the length of two audio segments, that the audio segment S3 still cannot be outputted at this time point, and that the audio segment S4 also cannot be outputted because it is undergoing frequency processing; therefore, another audio segment SX is added to Stage 4 in a similar way.
Stage 5: Because the audio segment S3 is fully processed at this time point, the audio segment S3 is outputted. As shown in the figures, if there is no delay, the audio segment S5 should be outputted in Stage 5. However, because there are two delayed audio segments SX, what is outputted in Stage 5 is the audio segment S3.
Stage 6: Because the audio segment S4 is fully processed at this time point, the audio segment S4 is outputted.
Stage 7: The attribute of the fifth audio segment S5 is marked as low-frequency “L”, and therefore the audio segment S5 is outputted without undergoing frequency processing.
Stage 8: The attribute of the sixth audio segment S6 is marked as low-frequency “L”, and therefore the audio segment S6 is outputted without undergoing frequency processing.
Stage 9: The attribute of the seventh audio segment S7 is marked as low-frequency “L”, and therefore the audio segment S7 is outputted without undergoing frequency processing. As shown in the figures, the delay in Stage 3 is equal to the length of one audio segment (i.e., one audio segment SX), and the delay from Stage 4 to Stage 9 is equal to the length of two audio segments (i.e., two audio segments SX).
Stage 10: The subsequent audio segment S8, audio segment S9, and audio segment S10 are all soundless segments. The present invention deletes all or some of the soundless segments without outputting them. In this embodiment, because two audio segments are delayed, the audio segment S8 and the audio segment S9 are not outputted, and only the audio segment S10 is outputted.
Therefore, if there is any delay generated earlier, the present invention can achieve the object of reducing or eliminating the delay by means of not outputting all or some of the soundless segments. For example, if the delay is accumulated with six audio segments, and the subsequent audio segments have four soundless segments, then none of the four soundless segments will be outputted; however, if the subsequent audio segments have eight soundless segments, then six of the soundless segments will not be outputted and two of the soundless segments will be outputted.
Generally speaking, in speech communications, the high-frequency segments make up the smallest proportion (often less than 10%), the low-frequency segments make up the largest proportion, and the soundless segments greatly outnumber the high-frequency segments. Therefore, if the sound processing module 12 operates at a sufficiently high speed, the delay caused by performing frequency processing on the high-frequency segments can be reduced or eliminated by means of deleting some soundless segments.
Stage 11: The attribute of the eleventh audio segment S11 is marked as low-frequency “L”, and therefore the audio segment S11 will be outputted without undergoing frequency processing. As shown in the figures, no delay is caused in Stage 11 when the audio segment S11 is outputted.
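The stage-by-stage walkthrough above can be reproduced with a small simulation. This is a sketch under stated assumptions, not the claimed method itself: the function names are hypothetical, frequency processing is assumed to take two segment lengths as in this embodiment, and consecutive “H” segments are assumed to be processed in a pipelined manner (S4 is processed while S3 is still pending). The remaining-delay value returned by `split_silence` is an inference from the arithmetic in the text.

```python
def split_silence(delay, silent_run):
    """Return (dropped, outputted, remaining_delay) for a run of soundless segments."""
    dropped = min(delay, silent_run)
    return dropped, silent_run - dropped, delay - dropped


def simulate_output(attributes, processing_slots=2):
    """Return what is outputted in each stage for a list of "L"/"H"/"Q" attributes.

    "SX" marks a delayed (soundless) output slot, as in the stage description.
    """
    timeline = []
    for index, attribute in enumerate(attributes, start=1):
        # Accumulated delay, measured in audio segments.
        delay = len(timeline) - (index - 1)
        label = f"S{index}"
        if attribute == "Q":
            if delay > 0:
                continue  # delete the soundless segment to absorb the delay
            timeline.append(label)
        elif attribute == "H":
            # Assumed: the segment is ready processing_slots stages after it arrives.
            ready_slot = index + processing_slots
            while len(timeline) + 1 < ready_slot:
                timeline.append("SX")
            timeline.append(label)
        else:  # "L": outputted without frequency processing
            timeline.append(label)
    return timeline


# Attributes of S1..S11 as described in Stage 1 through Stage 11.
stages = simulate_output(list("LLHHLLLQQQL"))
```

Under these assumptions `stages` lists S1, S2, SX, SX, S3, S4, S5, S6, S7, S10, S11, which matches Stage 1 through Stage 11. For the example with an accumulated delay of six segments, `split_silence(6, 4)` drops all four soundless segments, and `split_silence(6, 8)` drops six and outputs two.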
Please note that in a general hearing aid device, the sound processing module 12 basically performs sound amplification processing and noise reduction processing. Because the abovementioned sound amplification processing and noise reduction processing are not the key point of the present invention, there is no need for further description.
Although the present invention has been explained in relation to its preferred embodiments, it is to be understood that many other possible modifications and variations can be made without departing from the spirit and scope of the invention as hereinafter claimed.
Number | Date | Country | |
---|---|---|
20140270289 A1 | Sep 2014 | US |