The present invention relates to the field of signal processing and providing clear, high quality voice both in presence and absence of background noise in voice communication systems, devices, telephones, voice communication gateways, multi-channel environments and other communication environments.
This invention is in the field of processing audio signals in cell phones, Bluetooth headsets, VoIP telephones, gateways, etc., and in general any single-channel or multi-channel communication device operating in both noisy and non-noisy (quiet) environments.
The invention also relates to providing a means to save power, increase battery life, and reduce crucial processing time, program space, data space and MIPS in communication devices, gateways, servers, multi-channel environments, etc.
Background noise is a major problem when processing audio signals. It is usually caused by engines, blowers, fans, air conditioners, cars, busy intersections, people talking in restaurants, etc. If untreated, this noise can be annoying. To cope with this problem, the noisy signal picked up by the microphone is digitized by an Analog to Digital Converter (ADC) and fed to a Digital Signal Processor (DSP) for analysis and noise reduction. However, communication devices are not always used in noisy environments. In such cases, there is no need for noise reduction, and disabling it saves power, increases battery life and reduces crucial processing time, all of which are critical to a communication device. Also, in multi-channel environments such as voice gateways, servers and conference bridges, there should be flexibility to disable noise reduction based on a threshold. Doing so saves power, MIPS (Millions of Instructions per Second), program space and data space otherwise required by complex noise reduction algorithms, and thereby increases the channel capacity.
Modern day communication devices operate in a myriad of environments. Some of these environments may be extremely noisy (bars, crowded restaurants, etc.) and some may be extremely quiet (home, relaxing lounge, etc.). In all communication devices, the microphone(s) pick up the desired signal and the background noise (if present). If the environment in which the communication device is operating is noisy, the noise signal should be cancelled before being transmitted to the other end of the communication for the conversation to be pleasant and discernible.
The noise reduction algorithms, however, come at the expense of battery life, power, MIPS (Millions of Instructions per Second), large program space, data space and crucial processing time. Not all communication devices operate in noisy environments at all times. In other words, a single communication device operates in both noisy and non-noisy (quiet) environments. Simply put, not all devices need noise reduction at all times.
Voice gateways, conference bridges and similar devices should be able to enable or disable noise reduction based on a threshold during “peak” times and avoid overloading the systems. Disabling noise reduction saves crucial processing time, data space, code space and increases channel capacity in a multi-channel environment.
The present invention provides a novel system and method for monitoring audio signals, analyzing selected audio signal components, comparing the results of the analysis with a threshold value, and enabling or disabling the noise reduction capability of a communication device.
In one aspect of the invention, the threshold can be predefined by the user, manufacturer or can be set “on the fly” in real time during a telephonic conversation.
In another aspect of the invention, the invention can be used in communication devices which perform noise reduction on the received signals which are reproduced at the earpiece of the communication device.
In another aspect of the invention, the invention provides the flexibility to disable noise reduction if there is no background noise or if the background noise is less than the set threshold. This saves the crucial processing time, data space and program space required by complex noise reduction algorithms and increases the channel capacity in gateways, conference bridges, networks, servers and any multi-channel environment.
In another aspect of the invention, the invention provides flexibility to the users so they can bypass the noise cancellation by modifying the threshold and preserve the voice quality, which is usually altered by noise reduction algorithms.
In yet another aspect of the invention, the invention can be added as a module to the already existing devices with noise reduction capability. In such cases, the current invention enhances the battery life, reduces the power consumption, MIPS etc. However, it does not interfere with the native noise reduction algorithms.
In yet another aspect of the invention, a machine and a system for automatically controlling noise reduction feature of a communication device are provided. The communication device may belong to a narrowband communication system and/or a wideband communication system.
In one embodiment herein, a receiving module is configured to receive audio signals from the communication device, and a user input. The audio signals are processed by a microprocessor based on the user inputs and a predefined time interval. A comparator is provided for comparing the user input and the processed audio signals, to determine if the processed audio signal is greater or lesser than the user input and the noise reduction feature of the communication device is correspondingly enabled or disabled.
In one embodiment herein, a user interface is configured to receive the user input, wherein the user input is a threshold value that defines noise limit of the communication device as preferred by the user.
The microprocessor may be a digital signal processor (DSP) and executes program instructions stored in a memory. The microprocessor processes the audio signals by calculating root mean square (RMS) value of the audio signals after every predefined time interval.
The comparator compares the user input and the processed audio signals after every ‘N’ seconds based on frame size of the communication device.
In one embodiment herein, a voice activity detector (“VAD”) is provided for receiving the audio signals from the communication device and for determining if the received signal is speech or noise. The VAD is turned OFF when the determined signal is a noise signal. Further, the VAD is turned ‘ON’ when the determined signal is a speech signal. The machine for controlling noise reduction feature of a communication device enables or disables the noise reduction feature based on the ON/OFF status of VAD.
Other features and advantages of the invention will become apparent to one with skill in the art upon examination of the following figures and detailed description. All such features and advantages are included within this description, are within the scope of the invention and are protected by the claims.
The invention is better understood in conjunction with detailed description and the figures. It should be noted that the components, blocks in the figures are not to scale and are used only for descriptive purposes.
a shows the embodiments of the Machine for Enabling and Disabling Noise Reduction (MEDNR) as described in the current invention.
b shows the general block diagram of a microprocessor system.
a shows the plot of clean speech file with no background noise.
b shows the plot of the decision to enable or disable noise reduction, based on a threshold for the audio signal described above.
a shows the plot of clean speech file corrupted with background noise (street noise).
b shows the plot of the decision to enable or disable noise reduction, based on a threshold for the audio signal described above.
The following detailed description is directed to certain specific embodiments of the invention. However, the invention can be embodied in a multitude of different ways as defined and covered by the claims and their equivalents. In this description, reference is made to the drawings wherein like parts are designated with like numerals throughout.
Unless otherwise noted in this specification or in the claims, all of the terms used in the specification and the claims will have the meanings normally ascribed to these terms by workers in the art.
Hereinafter, preferred embodiments of the invention will be described in detail in reference to the accompanying drawings. It should be understood that like reference numbers are used to indicate like elements even in different drawings. Detailed descriptions of known functions and configurations that may unnecessarily obscure the aspect of the invention have been omitted.
a shows the embodiments of the Machine for Enabling and Disabling Noise Reduction (MEDNR) as described in the current invention. The transducer/microphone, 11, of the communication device, picks up the analog signal. It should be noted by people skilled in the art that the communication device can have M number of microphone(s), where M>1. The Analog to Digital Converter (ADC), block 12, converts the analog signal to a digital signal. Blocks 17 and 18 are the M-th microphone and ADC, respectively. The digital signal is then sent to the MEDNR, block 16. In general, any communication signal received from a communication device, in its digital form, is sent to the MEDNR. The MEDNR (block 16) consists of a microprocessor, block 14, and a memory, block 15. The microprocessor can be a general purpose Digital Signal Processor (DSP), fixed point or floating point, or a specialized DSP (fixed point or floating point).
Examples of DSP include Texas Instruments (TI) TMS320VC5510, TMS320VC6713, TMS320VC6416 or Analog Devices (ADI) BF531, BF532, 533 etc or Cambridge Silicon Radio (CSR) Blue Core 5 Multi-media (BC5-MM) or Blue Core 7 Multi-media BC7-MM etc. In general, the MEDNR can be implemented on any general purpose fixed point/floating point DSP or a specialized fixed point/floating point DSP.
The memory can be Random Access Memory (RAM) based or FLASH based and can be internal (on-chip) or external memory (off-chip). The instructions reside in the internal or external memory. The microprocessor, in this case a DSP, fetches instructions from the memory and executes them.
b shows the embodiments of block 16. It is a general block diagram of a DSP system where MEDNR is implemented. The internal memory, block 15 (b) for example, can be SRAM (Static Random Access Memory) and the external memory, block 15 (a) for example, can be SDRAM (Synchronous Dynamic Random Access Memory). The microprocessor, block 14 for example, can be TI TMS320VC5510. However, those skilled in the art can appreciate the fact that the block 14, can be a microprocessor, a general purpose fixed/floating point DSP or a specialized fixed/floating point DSP. The internal buses, block 17, are physical connections that are used to transfer data. All the instructions to enable or disable noise reduction reside in the memory and are executed in the microprocessor.
N can be as small as the “frame size” used in the communication. For example, in narrowband and wideband communication systems, the frame size is 20 and 10 milliseconds respectively. Therefore, N ≥ 20 milliseconds for narrowband and N ≥ 10 milliseconds for wideband. If the communication device or system uses a 5 or 1 millisecond frame size, then N ≥ 5 or N ≥ 1 millisecond respectively. The upper limit for N is programmable by the end-user or manufacturer, and can be set during the production stage or before or during a conversation.
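The lower bound on N can be illustrated with a minimal sketch. The frame sizes come from the paragraph above; the names `FRAME_MS`, `is_valid_interval` and `frames_per_interval` are hypothetical helpers introduced only for illustration and are not part of the disclosed embodiments.

```python
# Frame sizes in milliseconds, as stated in the description.
FRAME_MS = {"narrowband": 20, "wideband": 10}


def is_valid_interval(n_ms: int, system: str) -> bool:
    """Return True when N is at least one frame long for the given system."""
    return n_ms >= FRAME_MS[system]


def frames_per_interval(n_ms: int, system: str) -> int:
    """Return how many whole frames fit in N, rejecting N shorter than a frame."""
    if not is_valid_interval(n_ms, system):
        raise ValueError(
            f"N must be at least {FRAME_MS[system]} ms for a {system} system"
        )
    return n_ms // FRAME_MS[system]
```

Under these assumptions, an N of 200 milliseconds corresponds to 10 narrowband frames or 20 wideband frames.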
If the time is equal to N seconds, at block 114, Root Mean Square (RMS) value of the input signal is calculated at block 116. If not, the time is incremented, at block 115. The RMS of the input signal is calculated as follows:
InputSignalSquare = 0
Loop i = 1 to P
InputSignalSquare = InputSignalSquare + input[i]^2
End loop (1)
Where “i” is the index and P is the number of samples in each frame. For example, there are 160 samples in each frame for a narrowband communication system. In equation (1), “input[ ]” is the audio signal picked up by the microphone(s) or received at the conference bridge, gateway, etc.
MeanSquare = InputSignalSquare / P (2)
RMS = √(MeanSquare) (3)
RMS(dB) = 10 log10(RMS) (4)
The RMS and/or RMS (dB) calculated in equations (3) and (4) respectively are compared to a set threshold. This threshold can be pre-defined, set by the end-user, manufacturer at the beginning of the conversation or can be set “on the fly” in real-time during conversation. If the RMS and/or RMS (dB) is greater than the threshold, noise reduction is enabled at block 119. If the RMS and/or RMS (dB) is less than the threshold, noise reduction is disabled at block 118. For convenience, this enable or disable decision is stored in a binary format (1 and 0) at block 120. It should be noted that this decision can be stored in any other machine readable format.
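Equations (1) through (4) and the threshold comparison can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function names are hypothetical, the guard for an all-zero frame is an added assumption, and a level exactly equal to the threshold is treated here as "disabled" because the description only specifies the greater-than and less-than cases.

```python
import math


def rms_and_db(frame):
    """Compute RMS and RMS (dB) of one frame following equations (1) to (4)."""
    p = len(frame)
    input_signal_square = sum(x * x for x in frame)  # equation (1)
    mean_square = input_signal_square / p            # equation (2)
    rms = math.sqrt(mean_square)                     # equation (3)
    # Equation (4) as written in the description: 10 * log10(RMS).
    # Assumption: an all-zero frame maps to -inf instead of raising an error.
    rms_db = 10.0 * math.log10(rms) if rms > 0 else float("-inf")
    return rms, rms_db


def noise_reduction_decision(rms_db: float, threshold_db: float) -> int:
    """Return 1 (enable) if the level exceeds the threshold, else 0 (disable)."""
    return 1 if rms_db > threshold_db else 0
```

The binary return value mirrors the 1/0 format described for block 120; any other machine readable encoding would serve equally well.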
Once the decision is stored, the time is reset to zero seconds and the audio signal received at block 111 is either bypassed or processed with noise reduction algorithms (block 121), based on the decision at block 120. At block 114, if the time is not equal to N seconds, the time is incremented and the control goes to block 121, where the stored decision (block 120) is used to either bypass or perform noise reduction on the audio signal. If, at block 112, the VAD decides that the audio signal is speech, the control goes to block 121, where the stored decision (block 120) is used to either bypass or perform noise reduction.
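The control flow through blocks 112 to 121 can be sketched roughly as below. Several simplifications are assumptions made for illustration: time is counted in frames rather than seconds, the VAD output is supplied as a precomputed list of booleans (no particular VAD algorithm is implied), the timer is incremented before it is checked so that an update occurs on every `n_frames`-th noise frame, and the function returns the per-frame decision instead of actually running a noise reduction algorithm.

```python
import math


def gate_decisions(frames, vad_is_speech, n_frames, threshold_db,
                   initial_decision=1):
    """Return the stored enable (1) / disable (0) decision applied to each frame."""
    decision = initial_decision  # stored decision (block 120)
    elapsed = 0                  # frames counted since the last update
    applied = []
    for frame, is_speech in zip(frames, vad_is_speech):
        if not is_speech:        # VAD OFF: the frame is treated as noise
            elapsed += 1
            if elapsed >= n_frames:
                # Measure the noise level with equations (1) to (4).
                mean_square = sum(x * x for x in frame) / len(frame)
                rms = math.sqrt(mean_square)
                level_db = 10.0 * math.log10(rms) if rms > 0 else float("-inf")
                decision = 1 if level_db > threshold_db else 0
                elapsed = 0      # reset the timer once the decision is stored
        # Speech frames, and noise frames between updates, reuse the stored
        # decision to either bypass or apply noise reduction (block 121).
        applied.append(decision)
    return applied
```

In this sketch the decision is only revised on frames the VAD marks as noise, so loud speech never switches noise reduction on by itself.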
When the program is first launched and until the time is equal to N seconds, the default initial value at block 120 can be either “1” or “0”. This initial time can be completely independent of the time N seconds. For narrowband and wideband communication systems, Initial time ≥ 20 milliseconds and Initial time ≥ 10 milliseconds respectively. For example, users may want noise reduction to be initially enabled or disabled for the first 60 seconds (Initial time) irrespective of the amount of noise they have in the background. But after that, the users may want the system to automatically decide to enable and disable noise reduction every 5 seconds (N seconds).
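The relationship between the initial time and N can be illustrated with a short sketch that uses the 60 second and 5 second figures from the example above and the 20 millisecond narrowband frame size. The helper names are hypothetical, and expressing both periods as frame counts is an assumption made for illustration.

```python
def frames_in(duration_ms: int, frame_ms: int) -> int:
    """Convert a duration to whole frames, rejecting durations below one frame."""
    if duration_ms < frame_ms:
        raise ValueError("duration must be at least one frame long")
    return duration_ms // frame_ms


def decision_for_frame(frame_index: int, initial_frames: int,
                       forced_decision: int, measured_decision: int) -> int:
    """Use the forced initial decision until the initial time has elapsed."""
    if frame_index < initial_frames:
        return forced_decision
    return measured_decision
```

With 20 millisecond frames, the 60 second initial time spans 3000 frames and each 5 second update interval spans 250 frames.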
a shows the plot of a clean speech file with no background noise. The x-axis represents the number of samples and the y-axis represents the normalized amplitude [−1 1] of the audio signal. [−1 1] represents −32,768 to +32,767 for 16-bit audio codecs. It should be noted that each frame of 160 samples is equal to 20 milliseconds at an 8000 Hz sampling rate.
b shows the plot of the decision to enable or disable noise reduction, for the audio signal described above, based on the threshold. If the decision is “zero”, the noise reduction is disabled. If the decision is “one”, the noise reduction is enabled. It should be noted that in this particular example, the initial decision is forced to be “one”. The initial decision can be either zero or one depending on personal, end-user or manufacturer's preference. The initial decision period in this case is about 1600 samples, which corresponds to 200 milliseconds at an 8000 Hertz sampling rate. This initial decision is programmable and can be modified or configured. In this particular example, the threshold is set at −50 dB. It can be seen that after 1600 samples (200 milliseconds), the noise reduction is disabled because the RMS (dB) value of the non-speech durations is less than −50 dB. For this particular example, N is chosen to be 200 milliseconds. The RMS (dB) value is calculated using equations (1), (2), (3) and (4), when the VAD decision is OFF.
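The axis conventions used in these plots can be checked with a small sketch. Dividing by 32768 is a common normalization convention assumed here for illustration; the description does not state the exact scaling. Both function names are hypothetical.

```python
def normalize_pcm16(samples):
    """Map 16-bit integer samples (-32768..32767) to roughly [-1, 1]."""
    # Assumption: scale by 32768 so that -32768 maps exactly to -1.0.
    return [s / 32768.0 for s in samples]


def samples_to_ms(num_samples: int, sample_rate_hz: int) -> float:
    """Convert a sample count to a duration in milliseconds."""
    return num_samples * 1000.0 / sample_rate_hz
```

At an 8000 Hz sampling rate, 160 samples span 20 milliseconds and 1600 samples span 200 milliseconds, which matches the initial decision period and the value of N used in this example.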
a shows the plot of a clean speech file corrupted with background noise (street noise). The x-axis represents the number of samples and the y-axis represents the normalized amplitude [−1 1] of the audio signal. [−1 1] represents −32,768 to +32,767 for 16-bit audio codecs. It should be noted that each frame of 160 samples is equal to 20 milliseconds at an 8000 Hz sampling rate.
b shows the plot of the decision to enable or disable noise reduction, for the audio signal described above, based on the threshold. A decision of “one” means the noise reduction is enabled. A decision of “zero” means the noise reduction is disabled. It should be noted that in this particular example, the initial decision is forced to be “one” for about 1600 samples, which corresponds to 200 milliseconds at an 8000 Hertz sampling rate. For this particular example, the threshold is set at −50 dB. After 1600 samples (200 milliseconds), the noise reduction is enabled because the RMS (dB) value of the non-speech durations is greater than −50 dB. For this particular example, N is chosen to be 200 milliseconds. The RMS (dB) value is calculated using equations (1), (2), (3) and (4), when the VAD decision is OFF.
Disclosed embodiments include, but are not limited to the following items:
(Item 1)—A machine for automatically enabling and disabling noise reduction feature of a communication device of a communication system, the machine comprising: at least one receiver adapted to receive an input audio signal from the communication device; a memory adapted to store program instructions; a noise reduction module adapted to process the input audio signal to reduce disturbance due to background noise to transmit a clear audio signal; a micro-processor adapted to be functionally coupled to the memory to process the program instructions stored in the memory, the micro-processor having, a setting module adapted to set a threshold value, the threshold value adapted to be set at beginning of conversation, while conversation using the communication device and while production of the communication device, a calculating module adapted to cyclically calculate a Root Mean Square (RMS) value of the input audio signal received from the communication device, a comparator module adapted to compare the threshold value and the RMS value to obtain a single decision value, and a decision module adapted to store in the memory, a single decision value corresponding to either enabling or disabling the noise reduction module, the decision value corresponds to disabling the noise reduction module if the RMS value is less than the threshold value, and the decision value corresponds to enabling the noise reduction module if the RMS value is greater than the threshold value; and a control module functionally coupled to the micro-processor and the noise reduction module to enable or disable the noise reduction module based on the decision value.
(Item 2) The machine in accordance with item 1, wherein the micro-processor is a Digital Signal Processor (DSP).
(Item 3) The machine in accordance with item 1, wherein the micro-processor is a fixed point DSP.
(Item 4) The machine in accordance with item 1, wherein the micro-processor is a floating point DSP.
(Item 5) The machine in accordance with item 1, wherein the memory is a Random Access Memory (RAM).
(Item 6) The machine in accordance with item 1, wherein the memory is a FLASH based memory.
(Item 7) The machine in accordance with item 1, wherein the memory is an internal (on-chip) memory.
(Item 8) The machine in accordance with item 1, wherein the memory is an external (off-chip) memory.
(Item 9) The machine in accordance with item 1, wherein the calculating module cyclically calculates the RMS value of the input audio signal after every N seconds, wherein RMS value = √(Mean Square), and wherein Mean Square = Input Signal Square/P, where P is number of samples in each frame.
(Item 10) The machine in accordance with item 1, wherein the calculating module is adapted to re-calculate RMS value of the input audio signal received from the communication device to facilitate revision of the decision value after every N seconds based on frame size of the communication system.
(Item 11) The machine in accordance with item 10, wherein the calculating module is adapted to re-calculate RMS value of the input audio signal received from the communication device to facilitate revision of the decision value after at least 20 milliseconds in case the communication system is a narrow band communication system.
(Item 12) The machine in accordance with item 10, wherein the calculating module is adapted to re-calculate RMS value of the input audio signal received from the communication device to facilitate revision of the decision value after at least 10 milliseconds in case the communication system is a wide band communication system.
(Item 13) The machine in accordance with item 1, wherein the control module is adapted to enable and disable the noise reduction module initially for a certain time, irrespective of the RMS value of the input audio signal calculated by the calculating module based on frame size of the communication system.
(Item 14) The machine in accordance with item 13, wherein the control module is adapted to enable and disable the noise reduction module initially for at least 20 milliseconds in case the communication system is a narrowband communication system.
(Item 15) The machine in accordance with item 13, wherein the control module is adapted to enable and disable the noise reduction module initially for at least 10 milliseconds in case the communication system is a wideband communication system.
(Item 16) The machine in accordance with item 1, wherein the memory is adapted to store program instructions in a binary format.
(Item 17) The machine in accordance with item 1, wherein the memory is adapted to store program instructions in a machine readable format.
(Item 18) A system for controlling noise reduction feature of at least one communication device, the at least one communication device adapted to receive an input audio signal, the system comprising: a Voice Activity Detector (VAD) adapted to check if the input audio signal is a noise signal, based on the checked data of the input audio signal the VAD is adapted to be “OFF” if the input audio signal is a noise signal, and is adapted to be “ON” if the input audio signal is a speech signal; and a machine adapted to be communicably coupled to the VAD, the machine comprising: at least one receiver adapted to receive the input audio signal from the communication device via the VAD, a memory adapted to store program instructions, a noise reduction module adapted to receive the input audio signal there-through directed by the VAD, wherein the noise reduction module at the VAD “OFF” is adapted to receive and process the input audio signal to reduce disturbances due to background noise to transmit a clear audio signal therefrom, and wherein the noise reduction module at the VAD “ON” is by-passed by the input audio signal keeping the quality of the input audio signal unaffected; a micro-processor adapted to be functionally coupled to the VAD and the memory to process the program instructions stored in the memory, the micro-processor having, a setting module adapted to set a threshold value, the threshold value set at beginning of conversation, while conversation using the communication device and while production of the communication device, a calculating module adapted to cyclically calculate a Root Mean Square (RMS) value of the input audio signal received from the communication device, a comparator module adapted to compare the threshold value and the RMS value to obtain a single decision value, and a decision module adapted to store in the memory, a single decision value corresponding to either enabling or disabling the noise reduction module of the communication device, 
the decision value corresponds to disabling the noise reduction module if the RMS value is less than the threshold value, and the decision value corresponds to enabling the noise reduction module if the RMS value is greater than the threshold value, and a control module functionally coupled to the VAD, the micro-processor and the noise reduction module to enable or disable the noise reduction module of the communication device based on the decision value, in case the VAD is “OFF.”
(Item 19) The system in accordance with item 18, wherein the calculating module cyclically calculates the RMS value of the input audio signal after every ‘N’ seconds, wherein RMS value = √(Mean Square), and wherein Mean Square = Input Signal Square/P, where P is number of samples in each frame.
(Item 20) The system in accordance with item 18, wherein the noise reduction module is adapted to enable and disable noise reduction of the at least one communication device of the communication system during a telephonic conversation or on the fly, by a user.
(Item 21) A machine for controlling noise reduction of at least one communication device of a communication system, the machine comprising a noise reduction module adapted to enable and disable noise reduction of the at least one communication device, wherein the noise reduction feature is enabled or disabled by a user during a telephonic conversation or on the fly.
(Item 22) A system for controlling noise reduction of at least one communication device, the system comprising a noise reduction module adapted to enable or disable the noise reduction feature of the at least one communication device, wherein the noise reduction feature is enabled or disabled by a user during a telephonic conversation or on the fly.
This application is a continuation in part to U.S. patent application Ser. No. 14/289,137 filed on or about May 28, 2014 which is a continuation in part of U.S. patent application Ser. No. 13/083,513 (now U.S. Pat. No. 8,775,172) filed on Apr. 8, 2011 which claims the benefit of provisional patent application 61/389,203 filed on Oct. 2, 2010. The contents of the related patent applications are incorporated herein by reference as if restated herein. This application claims the priority dates of the related patent applications.
Relation | Number | Date | Country
---|---|---|---
Parent | 14289137 | May 2014 | US
Child | 14539029 | | US