The present invention relates to the field of speech processing, and more particularly to an impulsive noise suppression method based on a dual-microphone architecture, a system, a computing device, and a computer-readable storage medium.
With the development of electronic devices, hearing devices (such as headphones or hearing aids) have been developed to supplement the hearing loss of hearing-impaired individuals. The hearing devices are usually installed in the user's ear to amplify the sound and provide the amplified sound to the wearer. The hearing devices usually include a microphone that collects input signals; a processor for amplifying an input signal; and a speaker (which may be referred to as a receiver in the field of hearing aids) that outputs sound.
When a hearing device is worn and the external ambient signal is weak or the sound source is far away from the hearing device, the signal needs to be amplified because its energy is relatively weak when it reaches the microphone. However, linear amplification leads to new problems. For example, when the amplification reaches a certain degree and the input signal is or includes an impulsive signal, the energy of the signal itself may reach 100 dB or even more. The amplification by the hearing device then produces an output signal with a very large amplitude, which can damage the wearer's hearing.
To solve the above problem, a WDRC (Wide Dynamic Range Compression) algorithm or AGCO (Automatic Gain Control) is usually applied directly: relatively weak signals are amplified, while higher-energy signals are suppressed to a certain degree, so that the wearer has a better hearing experience. However, the above methods perform complex processing of all signals directly in the frequency domain, which consumes a lot of computing resources and results in a slow response, since the processing takes a relatively long time to complete.
It is therefore an objective to provide an impulsive noise suppression method and system based on dual-microphone architecture to solve the above problem.
The present invention provides an impulsive noise suppression method based on a dual-microphone architecture, applied to a hearing aid, wherein the hearing aid comprises a first feedforward microphone, a second feedforward microphone, and a speaker, a sensitivity level of the first feedforward microphone is less than a sensitivity level of the second feedforward microphone, the first feedforward microphone and the second feedforward microphone are located on a side of the hearing aid away from an ear canal, and the speaker is located on a side close to the ear canal, and wherein the method comprises: obtaining an input signal, the input signal comprising a first signal provided through the first feedforward microphone and a second signal provided through the second feedforward microphone; determining whether the input signal comprises an impulsive signal according to a first time-domain signal energy value of the first signal and a second time-domain signal energy value of the second signal; and performing an impulsive signal suppression operation on the input signal if the input signal comprises the impulsive signal.
The present invention further provides a computing device, comprising: a first feedforward microphone; a second feedforward microphone, wherein a sensitivity level of the first feedforward microphone is less than a sensitivity level of the second feedforward microphone; a speaker, wherein the first feedforward microphone and the second feedforward microphone are located on a side of the computing device away from an ear canal, and the speaker is located on a side close to the ear canal; at least one processor; and at least one memory communicatively coupled to the at least one processor to configure the at least one processor to: obtain an input signal, the input signal comprising a first signal provided through the first feedforward microphone and a second signal provided through the second feedforward microphone; determine whether the input signal comprises an impulsive signal according to a first time-domain signal energy value of the first signal and a second time-domain signal energy value of the second signal; and perform an impulsive signal suppression operation on the input signal if the input signal comprises the impulsive signal.
The present invention further provides an impulsive noise suppression method based on a single-microphone architecture, applied to a hearing aid, wherein the hearing aid comprises a feedforward microphone and a speaker electrically connected in sequence, the feedforward microphone is located on a side of the hearing aid away from an ear canal, and the speaker is located on a side close to the ear canal, and wherein the method comprises: obtaining an input signal through the feedforward microphone, the input signal comprising a signal provided from a surrounding environment; detecting whether the input signal comprises a time-domain impulsive signal; performing an output gain control on the input signal to obtain a first target signal if the input signal comprises the time-domain impulsive signal; performing a dynamic range companding control and the output gain control on the input signal in sequence to obtain a second target signal if the input signal does not comprise the time-domain impulsive signal; and outputting the first target signal or the second target signal to the speaker for playing through the speaker.
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
To make the objectives, technical solutions, and advantages of the present application more comprehensible, the present application is described in further detail below with reference to embodiments and the accompanying drawings. It should be understood that the specific embodiments described herein are merely used for explaining the present application, and are not intended to limit the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without creative efforts shall fall within the protection scope of the present application.
It should be noted that the description of “first”, “second” and the like in the present application is used for the purpose of description only, and cannot be construed as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, features defined with “first” or “second” may include at least one of the features, either explicitly or implicitly. In addition, the technical solutions in the embodiments can be combined with each other, provided that the combination can be realized by those of ordinary skill in the art. When a combination of the technical solutions is contradictory or unrealizable, it shall be deemed that the combination does not exist and is not within the protection scope required by the present application.
In the description of the present application, it should be understood that a numerical label before the steps does not identify the sequence of execution of the steps, and is only used to facilitate the description of the application and distinguish each step, and therefore cannot be understood as a limitation of the application.
The impulsive noise suppression method based on a dual-microphone architecture can be implemented in a hearing aid 2.
The hearing aid 2 comprises a housing, and the housing comprises a first feedforward microphone 21, a second feedforward microphone 22, a processor 23, and a speaker 24.
The first feedforward microphone 21 is located on a side of the hearing aid 2 away from an ear canal, and can be configured to obtain surrounding environment signals around a wearer.
The second feedforward microphone 22 is located on the side of the hearing aid 2 away from the ear canal (that is, on the same side as the first feedforward microphone 21), and can be configured to obtain the surrounding environment signals around the wearer. A sensitivity level of the first feedforward microphone 21 is less than a sensitivity level of the second feedforward microphone 22. The sensitivity level may comprise a sound sensitivity level, and refers to an electrical response at an output end of the microphone to a given standard acoustic input. For a fixed acoustic input, the second feedforward microphone 22 with the high sensitivity level outputs a higher electrical signal amplitude than the first feedforward microphone 21 with the low sensitivity level.
The processor 23 is electrically connected to the first feedforward microphone 21, the second feedforward microphone 22, and the speaker 24, and is configured to process signals provided by the first feedforward microphone 21 and by the second feedforward microphone 22, for example by impulsive noise suppression, wide dynamic range compression (WDRC), beamforming, etc. The processor 23 may be a DSP (digital signal processor) chip or the like.
The speaker 24 is configured to receive signals processed by the processor 23 and output processed signals to the ear canal 4.
A silicone sleeve 25 is configured to at least partially insert into the ear canal 4 when the hearing aid 2 is worn. The silicone sleeve 25 can block the surrounding sound around the wearer from entering the ear canal 4 to a certain extent. Of course, the material of the silicone sleeve 25 can be replaced.
The present invention can provide an impulsive noise suppression solution based on a dual-microphone architecture according to the structure of the above-mentioned hearing aid. According to a first signal provided by the first feedforward microphone 21 and a second signal provided by the second feedforward microphone 22, it is determined whether there is an impulsive signal. If there is an impulsive signal, an impulsive noise suppression operation is performed.
Of course, the invention also provides an impulsive noise suppression solution for single-microphone architecture (first feedforward microphone 21).
An implementation principle of the impulsive noise suppression scheme based on dual-microphone architecture is provided below.
Design idea: the first feedforward microphone and the second feedforward microphone with different sensitivity levels are used.
A duration of an impulsive signal may be between 10 ms and 200 ms, so the impulsive signal needs to be detected and processed within tens of milliseconds at the fastest. A characteristic of the impulsive signal is that its energy increases sharply to a very large value in a very short time.
As shown in
Due to the saturation phenomenon, the amplitude of the impulsive signal obtained by the processor 23 from the second feedforward microphone 22 with a higher sensitivity level is limited. That is, a signal energy value of the impulsive signal obtained by the processor 23 from the second feedforward microphone 22 with the higher sensitivity level may be saturated. Therefore, a time-domain energy difference between the impulsive signal obtained by the processor 23 from the second feedforward microphone 22 with the higher sensitivity level and an impulsive signal obtained from the first feedforward microphone 21 with a lower sensitivity level will be reduced (e.g., less than a certain threshold). For example, the first feedforward microphone 21 adopts a microphone with a normal sensitivity level (for example, the sensitivity level is −38 dBv), and the second feedforward microphone 22 adopts a microphone with an ultra-high sensitivity level (for example, the sensitivity level is −23 dBv). Therefore, when the time-domain energy difference between the input signal obtained from the second feedforward microphone 22 with the higher sensitivity level and the input signal obtained from the first feedforward microphone 21 with the lower sensitivity level is less than 15 dB, it indicates that the impulsive signal may appear in the input signal. And, the lower the time-domain energy difference, the greater the probability of an impulsive signal.
As shown in
Based on the above analysis, it can be determined whether there is an impulsive signal based on an energy ratio. The details are shown in
(1) Obtaining a first signal through the first feedforward microphone 21 and calculating a first time-domain signal energy value of the first signal;
(2) Obtaining a second signal through the second feedforward microphone 22 and calculating a second time-domain signal energy value of the second signal;
(3) Calculating a time-domain energy difference between the first time-domain signal energy value of the first signal and the second time-domain signal energy value of the second signal, and determining whether there is an impulsive signal in the surrounding environment of the wearer through the time-domain energy difference.
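If the time-domain energy difference is less than a preset energy difference threshold (for example, 15 dB), it is determined that there is an impulsive signal. This follows from the saturation analysis above: the output of the second feedforward microphone 22 saturates during an impulse, so the energy difference between the two microphones shrinks.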
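Steps (1) to (3) can be illustrated with a minimal Python sketch. It follows the saturation analysis above, in which an energy difference below 15 dB indicates an impulsive signal. The helper `simulate_mic`, the linear gains 1.0 and 6.0 (roughly 15.6 dB apart), the clipping level, and the frame contents are all hypothetical values chosen only for illustration; they are not taken from the embodiment.

```python
import math

def frame_energy_db(frame, floor=1e-12):
    """Mean-square energy of one time-domain frame, expressed in dB."""
    mean_square = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(mean_square, floor))

def is_impulsive(first_frame, second_frame, diff_threshold_db=15.0):
    """True when the energy gap between the two microphone signals collapses."""
    e1 = frame_energy_db(first_frame)   # first signal (lower sensitivity level)
    e2 = frame_energy_db(second_frame)  # second signal (higher sensitivity level)
    return (e2 - e1) < diff_threshold_db

def simulate_mic(samples, gain, clip=1.0):
    """Hypothetical microphone model: linear gain followed by hard saturation."""
    return [max(-clip, min(clip, gain * s)) for s in samples]

# Assumed test frames: a quiet tone-like frame and a loud impulse-like frame.
quiet = [0.01, -0.01] * 32
loud = [0.8, -0.8] * 32

quiet_detected = is_impulsive(simulate_mic(quiet, 1.0), simulate_mic(quiet, 6.0))
loud_detected = is_impulsive(simulate_mic(loud, 1.0), simulate_mic(loud, 6.0))
print(quiet_detected, loud_detected)
```

In this sketch the quiet frame keeps the full sensitivity gap between the two signals, while the loud frame drives the simulated high-sensitivity path into clipping, so the gap falls well below the threshold.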
Further, a characteristic of the impulsive signal is that its energy increases sharply to a very large value in a very short time. Therefore, when the impulsive signal appears in the surrounding environment, the output of the first feedforward microphone 21 will increase sharply. In view of this, as shown in
It should be noted that the first preset energy threshold can be determined by the ambient noise detected by the first feedforward microphone 21. When the ambient noise detected by the first feedforward microphone 21 is low, the first preset energy threshold is low. When the ambient noise detected by the first feedforward microphone 21 is high, the first preset energy threshold is dynamically adjusted up.
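One possible way to make the first preset energy threshold follow the ambient noise is sketched below in Python. The text above only states that the threshold is low when the ambient noise is low and is adjusted up when the ambient noise is high; the exponential smoothing, the class name `AdaptiveEnergyThreshold`, the 20 dB margin, and the initial noise estimate are assumptions made for this illustration.

```python
class AdaptiveEnergyThreshold:
    """Hypothetical tracker: threshold = smoothed ambient noise level + margin."""

    def __init__(self, margin_db=20.0, smoothing=0.95, initial_noise_db=-60.0):
        self.margin_db = margin_db      # assumed headroom above the noise floor
        self.smoothing = smoothing      # assumed smoothing factor in [0, 1)
        self.noise_db = initial_noise_db

    @property
    def threshold_db(self):
        return self.noise_db + self.margin_db

    def update(self, frame_energy_db):
        """Return True if the frame exceeds the threshold; otherwise track noise."""
        exceeded = frame_energy_db > self.threshold_db
        if not exceeded:
            # Only frames below the threshold are treated as ambient noise,
            # so a loud burst does not raise the noise estimate.
            self.noise_db = (self.smoothing * self.noise_db
                             + (1.0 - self.smoothing) * frame_energy_db)
        return exceeded

tracker = AdaptiveEnergyThreshold()
quiet_threshold = tracker.threshold_db
for _ in range(200):                # louder ambient noise at about -50 dB
    tracker.update(-50.0)
noisy_threshold = tracker.threshold_db
burst_detected = tracker.update(-10.0)
```

With these assumed values the threshold rises as the ambient level rises, and a frame far above the adapted threshold is still flagged.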
Further, the inventor also found that when there is an impulsive signal, a transient peak energy value of the signal surges, and an average energy value of the signal increases relatively slowly. As shown in
In order to further improve the accuracy of determining whether there is an impulsive signal, as shown in
Step 1: performing multi-band filtering on the input signal (first signal or second signal) to obtain M subband signals corresponding to M channels;
Step 2: calculating a subband average energy value and a subband transient peak energy value of an i-th channel, where i is a natural number and 1≤i≤M;
Step 3: determining whether there is an impulsive signal in the i-th channel according to the time-domain energy difference between the subband average energy value and the subband transient peak energy value of the i-th channel, so as to obtain M determination results corresponding to the M channels;
Step 4: determining whether the input signal comprises an impulsive signal according to the M determination results.
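Steps 2 to 4 can be sketched in Python as follows, assuming the M subband signals have already been produced by a filter bank (Step 1 is not shown). The 12 dB peak-to-average threshold, the per-frame averaging, and the simple majority rule used to combine the M determination results are assumptions for illustration only; the text above does not fix these choices.

```python
import math

def to_db(value, floor=1e-12):
    """Convert an energy value to dB, guarding against log of zero."""
    return 10.0 * math.log10(max(value, floor))

def channel_has_impulse(subband, threshold_db=12.0):
    """Compare the transient peak energy with the average energy of one channel."""
    average = sum(x * x for x in subband) / len(subband)
    peak = max(x * x for x in subband)
    return (to_db(peak) - to_db(average)) > threshold_db

def detect_over_channels(subbands, threshold_db=12.0):
    """Return the M per-channel results and an assumed majority decision."""
    results = [channel_has_impulse(s, threshold_db) for s in subbands]
    return results, sum(results) * 2 > len(results)

# Assumed example frames: one with a single spike, one steady.
spike = [0.01] * 63 + [1.0]
steady = [0.1, -0.1] * 32
results, decision = detect_over_channels([spike, spike, steady])
```

A steady subband has a peak energy equal to its average energy, whereas a subband containing a short spike has a peak far above its average, which is the behavior the determination relies on.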
Further, the inventor found that:
A frequency range covered by the impulsive signal is the whole frequency band, whereas voice is mainly concentrated in 300-3400 Hz.
Therefore, a higher weight value can be assigned to the determinations of channels above 4 kHz.
The weight value of a low-frequency part is low, which can effectively resist the interference of speech and make the determination of impulsive signals more robust.
In order to further improve the determination accuracy of the impulsive signal, as shown in
Step 1: performing multi-band filtering on the first signal and the second signal respectively to obtain M first subband signals corresponding to M channels and M second subband signals corresponding to the M channels, wherein the M first subband signals are obtained according to the first signal, and the M second subband signals are obtained according to the second signal.
Step 2: calculating a time-domain energy difference between the subband average energy value of the first subband signal and the subband transient peak energy value of the first subband signal in the i-th channel, and a time-domain energy difference between the subband average energy value of the second subband signal and the subband transient peak energy value of the second subband signal in the i-th channel, and obtaining an i-th determination result corresponding to the i-th channel, so as to obtain M determination results corresponding to the M channels, where i is a natural number and 1≤i≤M.
Step 3: configuring a weight value for each channel, wherein a channel with a frequency higher than 4 kHz is configured with a higher weight value, and a channel with a frequency lower than 4 kHz is configured with a lower weight value.
Step 4: determining whether there is an impulsive signal comprehensively according to each of the M determination results and a corresponding weight value.
For example, when there is an impulsive signal, the corresponding determination result is 1. When there is no impulsive signal, the corresponding determination result is −1. The weight value of the channel with a frequency higher than 4 kHz is 0.5, and the weight value of the channel with a frequency lower than 4 kHz is 0.2. Wherein an influence of the determination result of each channel on the comprehensive determination is: the determination result*the weight value of the channel.
The above comprehensive weighted value can be compressed by a sigmoid or a similar function to obtain a probability value between 0 and 1.
The higher the probability value, the greater the possibility of impulsive signal and the greater the degree of suppression of the input signal in the time domain.
The lower the probability value, the lighter the degree of suppression.
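The weighted combination described above can be sketched in Python as follows. The determination results (1 or −1) and the weight values (0.5 above 4 kHz, 0.2 below) come from the example above; the logistic sigmoid is used here as one possible compression function, and the channel center frequencies and the treatment of a channel exactly at 4 kHz are assumptions for illustration.

```python
import math

def channel_weight(center_hz):
    """Weight 0.5 for channels above 4 kHz, 0.2 otherwise (4 kHz itself is assumed low)."""
    return 0.5 if center_hz > 4000.0 else 0.2

def impulse_probability(results, center_freqs_hz):
    """Sum result * weight over all channels and squash the sum into (0, 1)."""
    score = sum(r * channel_weight(f) for r, f in zip(results, center_freqs_hz))
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical four-channel layout.
centers = [500.0, 2000.0, 5000.0, 7000.0]

# Impulse seen only in the high-frequency channels: score = -0.2 - 0.2 + 0.5 + 0.5 = 0.6
p_high = impulse_probability([-1, -1, 1, 1], centers)

# Activity only in the speech band: score = 0.2 + 0.2 - 0.5 - 0.5 = -0.6
p_speech = impulse_probability([1, 1, -1, -1], centers)
```

Because the high-frequency channels carry the larger weights, activity confined to the speech band yields a probability below 0.5, while the same number of positive determinations in the high-frequency channels yields a probability above 0.5.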
A plurality of embodiments will be provided below, each of which can be used to implement the impulsive noise suppression method based on the dual-microphone architecture described above. For ease of understanding, the hearing aid 2 will be exemplarily described below as the executive body.
In the embodiment, the impulsive noise suppression method based on dual-microphone architecture is applied to the hearing aid 2. As shown in
Step S900, obtaining an input signal, the input signal comprising a first signal provided through the first feedforward microphone and a second signal provided through the second feedforward microphone.
The first feedforward microphone 21 is configured to collect signals of the surrounding environment and can be a microphone with a normal sensitivity level, such as −38 dBv.
The second feedforward microphone 22 is configured to collect signals of the surrounding environment and can be a microphone with an ultra-high sensitivity level, such as −23 dBv.
The input signal is a signal input to the processor 23.
The input signal comprises the first signal and the second signal. The first signal is a signal output from the first feedforward microphone 21 to the processor 23. The second signal is a signal output from the second feedforward microphone 22 to the processor 23.
Step S902, determining whether the input signal comprises an impulsive signal according to a first time-domain signal energy value of the first signal and a second time-domain signal energy value of the second signal.
Step S904, performing an impulsive signal suppression operation on the input signal if the input signal comprises the impulsive signal.
The impulsive noise suppression method based on dual-microphone architecture provided by the embodiment of the present invention collects the signals of the surrounding environment based on the first feedforward microphone and the second feedforward microphone with different sensitivity levels to obtain the first signal and the second signal with a difference, and whether the input signal comprises an impulsive signal is determined by the first time-domain signal energy value of the first signal and the second time-domain signal energy value of the second signal, and the impulsive noise suppression is implemented. In the embodiment, whether there is an impulsive signal is analyzed through the time-domain signal energy value, a calculation process is simple, calculation resource consumption is small, and a response speed is fast, to ensure that the wearer has a better hearing experience.
There are various ways to determine whether the input signal comprises an impulsive signal, such as:
Method 1
As shown in
Method 2 (a Further Scheme Based on the Method 1)
As shown in
Method 3 (a Further Scheme Based on the Method 2)
As shown in
In order to further improve the accuracy of determining whether there is an impulsive signal, the input signal can be divided into channels for comprehensive analysis.
As an example, the first time-domain signal energy value comprises multiple first subband energy values, and the second time-domain signal energy value comprises multiple second subband energy values. As shown in
As an example, the average energy value of the second signal comprises multiple second subband average energy values of the second signal, and the transient peak energy value of the second signal comprises multiple second subband transient peak energy values of the second signal. As shown in
As an example, as shown in
As an example, as shown in
As shown in
As shown in
The obtaining module 1710, obtaining an input signal, the input signal comprising a first signal provided through the first feedforward microphone and a second signal provided through the second feedforward microphone;
The determining module 1720, determining whether the input signal comprises an impulsive signal according to a first time-domain signal energy value of the first signal and a second time-domain signal energy value of the second signal;
The suppressing module 1730, performing an impulsive signal suppression operation on the input signal if the input signal includes the impulsive signal.
As shown in
The memory 1810 includes at least one type of computer-readable storage medium. The readable storage medium includes flash memory, hard disk, multimedia card, card type memory (e.g., SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 1810 may be an internal storage module of the computing device 1800 such as a hard disk or memory of the computing device 1800. In other embodiments, the memory 1810 may also be an external storage device of the computing device 1800, such as a plugged hard disk provided on the computing device 1800, a smart media card (SMC), secure digital (SD) card, a flash memory card, and the like. Of course, the memory 1810 may also include both an internal storage module and an external storage device of the computing device 1800. In the embodiment, the memory 1810 is generally configured to store an operating system and various types of application software installed in the computing device 1800 such as program codes of the impulsive noise suppression method based on dual-microphone architecture and the like. In addition, the memory 1810 may also be configured to temporarily store various types of data that have been or will be outputted.
The processor 1820, in some embodiments, may be a central processing unit (CPU), a controller, a microprocessor, or other data processing chip. The processor 1820 is generally configured to control the overall operation of the computing device 1800 such as performing control and processing related to data interaction or communication with the computing device 1800. In the embodiment, the processor 1820 is configured to run program code stored in the memory 1810 or process data.
The network interface 1830 may include a wireless network interface or a wired network interface which is generally used to establish a communication connection between the computing device 1800 and other computing devices. For example, the network interface 1830 is used for connecting the computing device 1800 to an external terminal via a network and establishing a data transmission channel and a communication connection between the computing device 1800 and the external terminal. The network can be a wireless or wired network such as an enterprise intranet, an Internet, a Global System of Mobile communication (GSM), a Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, a Bluetooth, Wi-Fi, and the like.
It is to be noted that
In the embodiment, the program code of the impulsive noise suppression method based on dual-microphone architecture stored in the memory 1810 may be divided into one or more program modules and executed by one or more processors (processor 1820 in the embodiment) to complete the present application.
The embodiment further provides a non-transitory computer-readable storage medium, which stores computer programs, and when the computer programs are executed by a processor, the steps of an impulsive noise suppression method based on a dual-microphone architecture in the embodiment are realized.
In the embodiment, the computer-readable storage medium includes flash memory, hard disk, multimedia card, card type memory (e.g., SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the computer-readable storage medium may be an internal storage module of the computing device such as a hard disk or memory of the computing device. In other embodiments, the memory may also be an external storage device of the computing device, such as a plugged hard disk provided on the computing device, a smart media card (SMC), secure digital (SD) card, a flash memory card, and the like. Of course, the computer-readable storage medium may also include both an internal storage module and an external storage device of the computing device. In the embodiment, the computer-readable storage medium is generally used to store an operating system and various types of application software installed in the computing device such as program codes of the impulsive noise suppression method based on dual-microphone architecture and the like. In addition, the memory may also be used to temporarily store various types of data that have been or will be outputted.
S1, obtaining an input signal through the feedforward microphone, the input signal comprising a signal provided from a surrounding environment;
S2, detecting whether the input signal comprises a time-domain impulsive signal;
S3, performing an output gain control on the input signal to obtain a first target signal if the input signal comprises the time-domain impulsive signal;
S4, performing a dynamic range companding control and the output gain control on the input signal in sequence to obtain a second target signal if the input signal does not comprise the time-domain impulsive signal; and
S5, outputting the first target signal or the second target signal to the speaker for playing through the speaker.
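The branching in steps S2 to S4 can be sketched in Python as follows. Only the control flow is taken from the steps above; the three stand-in functions (`detect_impulse`, `companding`, `output_gain_control`) and their numeric parameters are hypothetical placeholders and do not represent the detection, companding, or gain control actually used in the embodiment.

```python
def process_frame(frame, detect_impulse, companding, output_gain_control):
    """Route one input frame according to steps S2 to S4."""
    if detect_impulse(frame):                        # S2: time-domain detection
        return output_gain_control(frame)            # S3: first target signal
    return output_gain_control(companding(frame))    # S4: second target signal

# Hypothetical stand-ins, for illustration only.
def detect_impulse(frame, peak_limit=0.5):
    return max(abs(x) for x in frame) > peak_limit

def companding(frame, gain=2.0):
    return [gain * x for x in frame]

def output_gain_control(frame, ceiling=0.5):
    return [max(-ceiling, min(ceiling, x)) for x in frame]

# S5 would pass the returned frame to the speaker.
normal_out = process_frame([0.1, -0.1], detect_impulse, companding, output_gain_control)
impulse_out = process_frame([0.9, 0.1], detect_impulse, companding, output_gain_control)
```

Skipping the companding stage when an impulse is detected keeps the impulsive frame on the shorter processing path, which is the point of the branch in steps S3 and S4.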
As shown in
It should be appreciated by those skilled in the art that each module or step described in the embodiment of the present application can be realized by a general-purpose computing device, and that the modules or steps may be integrated on a single computing device or distributed on a network consisting of a plurality of computing devices. Optionally, the modules or steps may be realized by executable program codes so that the modules or steps can be stored in a storage device to be executed by a computing device. In some cases, the steps shown or described herein can be executed in a sequence different from that presented herein, or the modules or steps are formed into integrated circuit modules, or several of the modules or steps are formed into a single integrated circuit module. Therefore, the present application is not limited to any specific combination of hardware and software.
The embodiments described above are just preferred embodiments of the present application and thus do not limit the patent scope of the present application. Any equivalent structure, or equivalent process transformation made according to the contents of the description and the drawings of the present application or any direct or indirect application to other related arts shall be also included in the patent protection scope of the present application.
Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.
Number | Date | Country | Kind |
---|---|---|---|
202110412930.8 | Apr 2021 | CN | national |
Number | Date | Country | |
---|---|---|---|
20220337959 A1 | Oct 2022 | US |