The present invention relates to sound processing devices in which an acoustic sound input or an electric or digital representation of an acoustic sound input is processed and converted to an acoustic or electric sound output, and in particular relates to the processing of sound in noisy environments to improve speech intelligibility, sound quality, and naturalness of the sound. Sound processing devices of this kind are often used in hearing aids, assistive listening devices (ALD), and consumer audio devices such as radios, television sets, CD players, MP3 players, stereo systems, headsets, telephones, and mobile phone handsets. The Global Medical Device Nomenclature Agency (GMDNS) definition of an ALD is an amplifying device, other than a hearing aid, for use by a hard of hearing person. In the case of an electric sound output, sound processing devices of this kind are used in cochlear implants.
Sound processing devices, including hearing aids, ALDs, cochlear implants, and consumer audio devices are being used more frequently in noisy environments. Normally, people make good use of both ears to separate the sounds they want to listen to from the other noises in the environment that they want to ignore. Present day consumer audio devices, hearing aids, cochlear implants, and ALDs also rely on these internal binaural perceptual processes to be able to function adequately in noisy environments. In addition to the internal perceptual processing, many audio devices include various external noise reduction schemes aimed at improving speech intelligibility, sound quality, and listening comfort in noisy environments. These noise reduction schemes typically use information that is available from a single microphone, or an array of closely-spaced microphones that may be worn on one side of the head. They rely on directional information, and spectral and temporal information to separate desired sounds from other noises in the environment. For example, some schemes seek to improve signal-to-noise ratios by expanding the intensity differences between more intense parts of the sound and less intense parts of the sound. A noise reduction scheme based on spectral information may apply more gain to the peaks in the spectrum than to the troughs. A noise reduction scheme based on temporal information may apply more gain at times when the sound is above a certain intensity threshold than when the sound is below this threshold. A noise reduction scheme based on directional information may apply more gain to sounds from the front of the listener than sounds from other directions. There is clear evidence that directional microphones can improve sound quality, comfort, and intelligibility. 
It is also clear that spectral and temporal noise reduction improves comfort, but the effects of spectral and temporal noise reduction on intelligibility and sound quality are more controversial.
One potential reason for the uncertainty about the effects of external spectral and temporal noise reduction schemes on intelligibility and sound quality is that they change the spectral and temporal cues that are used by the internal perceptual processes. If these cues are changed differently in the left and right ears, they may also disrupt the internal binaural processes that most listeners rely upon most heavily in noisy situations. At least three perceptual processes are important in binaural sound perception:
Bregman (1990) uses the term “auditory streaming” to describe the perceptual process that separates sounds from different sources and groups together sounds from the same source. A stream is a series of sequential and overlapping sound events that come from the same source. An example of a stream is the speech from a single person speaking. A word or a sentence spoken by this person must be perceived as a connected series of sound events to be understood, while being kept separate from the other sounds in the environment. Important sound events include the onsets and offsets of sounds, and changes in intensity and spectrum. The spectral and temporal noise reduction schemes referred to above introduce onsets and offsets, changes in intensity, and spectral changes in the noise that are correlated with the onsets and offsets and spectral changes in the desired signals. The perceptual effects of introducing these artificial streaming cues are difficult to predict. On one hand, they may emphasize the temporal and spectral characteristics of the desired sounds. On the other hand they will make it more difficult for the internal auditory streaming processes to separate the desired sound events and streams from the noise events and streams.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is solely for the purpose of providing a context for the present invention. It is not to be taken as an admission that any or all of these matters form part of the prior art base or were common knowledge in the field relevant to the present invention as it existed before the priority date of each claim of this application.
According to a first aspect the present invention provides a method for controlling a sound processing device having a binaural input including at least one microphone mounted in or near each ear of the device user, and having a binaural output including at least one output signal directed to each ear including: transducing sound at each ear, by the respective at least one microphone in or near the ear; estimating a signal-to-noise ratio present at each ear; selecting the ear with the greater signal-to-noise ratio; applying noise reduction processing to the signals of each ear based on information present in the signal of the selected ear; and presenting the processed signals to each ear.
According to a second aspect the present invention provides a sound processing device having a binaural input including at least one microphone mounted in or near each ear of the device user, and having a binaural output including at least one output signal directed to each ear, the device including: at least one microphone in or near each ear for the transduction of sound at each ear; a signal-to-noise estimation module to estimate the signal-to-noise ratio present at each ear; a comparison and selection module to compare the signal-to-noise ratios present at the two ears and to select the ear with the greater signal-to-noise ratio; a noise reduction control module that uses information present in the signal of the selected ear to control two noise reduction modules; two noise reduction modules that process the respective signals of the two ears, under the control of the control module; and two output modules that present the signals to each ear of the device user.
According to a third aspect the present invention provides a computer program product comprising computer program code means to make a computer execute a binaural noise reduction sound processing procedure, the computer program product including: computer program means accepting at least one input signal representing sound from at or near each ear of a listener; computer program means for estimating a signal-to-noise ratio present at each ear; computer program means for comparing the signal-to-noise ratios present at the two ears and selecting the ear with the greater signal-to-noise ratio; computer program means for using information from the signal of the selected ear to control two noise reduction processes respectively applied to the signals from the two ears; and computer program means for presenting the signals to each ear of the device user.
It is to be understood that “presenting signals to the ear” within the scope of this invention can include one or more of: generating acoustic signals to be presented to the outer ear by a speaker; generating electrical stimuli to be presented to the cochlea by a cochlear implant; generating mechanical signals for bone conduction to the middle ear; or the like.
The present invention recognises that if artificial streaming cues created by signal processing applied for external noise reduction are different in the two ears, they will add further to the confusion between what is the desired sound stream and what is the noise stream. The present invention thus recognises that it is important that the external noise reduction processing operates in a coordinated and consistent manner in the two ears.
In addition to avoiding the creation of artificial events or streaming cues, and avoiding the creation of artificial differences between the ears, the present invention recognises that an effective binaural sound processing strategy also needs to be able to “pay attention to only one ear when it is advantageous to do so.” This invention can thus be considered to offer a prosthetic supplement or substitute to the perceptual ability of normal human hearing mentioned at point (b) in the preceding. In order to emulate this aspect of the internal binaural processing, a binaural sound processing device needs to continuously assess the signal-to-noise ratio in each ear, select the ear with the higher signal-to-noise ratio, and use the signal at that ear as a basis for controlling the noise reduction processing for both ears.
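The selection step described above can be sketched as follows. This is a minimal illustration rather than the implementation of the invention: the specification does not say how the signal-to-noise ratio is estimated, so the percentile-spread estimator, the function names `estimate_snr_db` and `select_ear`, and the tie-breaking rule are all assumptions made for the example.

```python
def estimate_snr_db(levels_db, low_pct=10, high_pct=90):
    """Hypothetical SNR proxy: spread between a high and a low percentile
    of recent short-term levels (in dB) measured at one ear."""
    ordered = sorted(levels_db)
    last = len(ordered) - 1
    low = ordered[(low_pct * last) // 100]
    high = ordered[(high_pct * last) // 100]
    return high - low


def select_ear(left_levels_db, right_levels_db):
    """Return the ear with the greater estimated SNR plus both estimates.
    Ties go to the left ear (an arbitrary choice for this sketch)."""
    snr_left = estimate_snr_db(left_levels_db)
    snr_right = estimate_snr_db(right_levels_db)
    selected = "left" if snr_left >= snr_right else "right"
    return selected, snr_left, snr_right


# Left ear: large level swings (speech over a quiet floor).
# Right ear: nearly constant level (mostly steady noise).
left = [40, 42, 60, 65, 41, 63, 40, 64, 62, 41]
right = [50, 51, 52, 53, 54, 55, 54, 53, 52, 51]
selected, snr_left, snr_right = select_ear(left, right)
```

In a device, the levels would be refreshed continuously so that the selected ear can change as the listener or the sound sources move.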
The amplifier for each ear preferably comprises a conventional wide dynamic range compression (WDRC) or adaptive dynamic range optimization (ADRO) sound amplifier. See Dillon 2001 for a review of the WDRC prior art. See Blamey et al, U.S. Pat. No. 6,731,767 for a description of the ADRO sound processing. The variable gain in each channel of the amplifier may also be controlled according to the information derived from the ear with the greater signal-to-noise ratio, and the overall gain of the ear with the lower signal-to-noise ratio may be reduced relative to the overall gain of the ear with the higher signal-to-noise ratio.
In one embodiment of the invention, the noise reduction scheme is a multichannel expansion scheme or spectral subtraction scheme which temporarily reduces the gain applied to frequency bands that are thought to be primarily noise, and increases the gain in frequency bands that are thought to be primarily signal. The choice between whether a frequency band contains primarily noise or signal is preferably based on instantaneous amplitude and dynamic range of the sound in that frequency band in the selected ear. The reduction or increase in gain is applied equally and simultaneously to the signal for both ears. The control signals derived from the selected ear signal can be particularly simple in this case, for example, a 32-channel noise reduction scheme can be controlled by sending 32 bits to encode whether each channel is primarily signal (bit value=1) or noise (bit value=0).
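The 32-bit control word mentioned above could be packed and unpacked along the following lines. This is a sketch under stated assumptions: the bit order (channel 0 in the least significant bit), the little-endian byte order, and the function names are illustrative choices that the specification does not prescribe.

```python
def pack_channel_flags(flags):
    """Pack per-channel decisions (True = primarily signal, False = noise)
    into one integer, with channel 0 in the least significant bit."""
    word = 0
    for channel, is_signal in enumerate(flags):
        if is_signal:
            word |= 1 << channel
    return word


def unpack_channel_flags(word, n_channels=32):
    """Recover the per-channel decisions from the packed integer."""
    return [bool((word >> channel) & 1) for channel in range(n_channels)]


# Selected ear: classify 32 channels (arbitrary pattern here) and
# serialise the decisions into a 4-byte payload for the link.
flags = [channel % 3 == 0 for channel in range(32)]
word = pack_channel_flags(flags)
payload = word.to_bytes(4, "little")

# Unselected ear: decode the payload back into per-channel decisions.
received = unpack_channel_flags(int.from_bytes(payload, "little"))
```

Four bytes per update is the whole control payload in this scheme, which is why the control signals are described as particularly simple.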
In a second embodiment of the invention, the gains or gain reductions for each frequency channel are transmitted from the selected ear to the unselected ear and applied simultaneously to the signal for each ear.
In a third embodiment of the invention, the amplitude and dynamic range (or signal-to-noise ratio) for each frequency band are transmitted from the selected ear to the unselected ear and applied in identical noise reduction algorithms in both ears simultaneously.
The changes to the gains in individual frequency channels of the noise reduction processing or in the individual frequency channels of the amplifiers are preferably made slowly enough and over a time scale that is long enough to avoid the generation of artificial sound events and streaming cues. Any faster changes that may be necessary to avoid discomfort or damage to hearing are preferably applied across a broad frequency range and are also applied identically and simultaneously to each ear.
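One simple way to keep per-channel gain changes slow is to limit how far the gain may move on each update. The sketch below assumes a slew-rate limiter; the step size of 0.5 dB per update and the function names are hypothetical, since the specification gives no numerical time constants.

```python
def slew_limited_gain(current_db, target_db, max_step_db):
    """Move the gain toward the target by at most max_step_db per update."""
    step = max(-max_step_db, min(max_step_db, target_db - current_db))
    return current_db + step


def track(target_db, n_updates, max_step_db=0.5, start_db=0.0):
    """Gain trajectory over n_updates when the target is held constant."""
    gain_db = start_db
    trajectory = []
    for _ in range(n_updates):
        gain_db = slew_limited_gain(gain_db, target_db, max_step_db)
        trajectory.append(gain_db)
    return trajectory


# Both ears receive the same target and run the same limiter, so the
# gain trajectories stay identical and no interaural difference is added.
left_gain = track(-10.0, 25)
right_gain = track(-10.0, 25)
```

With these illustrative values a 10 dB reduction takes 20 updates to complete, after which the gain holds at the target.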
Controls on the device, such as a volume control and a program selection switch, are preferably linked so that any change initiated by a control is applied to both ears simultaneously in a coordinated manner.
The sound processing in the two signal paths for the two ears is preferably configured to have minimum delay (Dickson and Steele, 2006) and to have equal delay from input to output to preserve fine temporal differences between the ears to the maximum extent possible.
The wired or wireless communication link between the two devices is preferably disabled when the signal-to-noise ratio in each ear is greater than a configurable threshold value, and enabled when the signal-to-noise ratio is below the configurable threshold. The purpose of this refinement is to save power when binaural noise reduction is not required or would not provide any discernible improvement to sound quality or speech intelligibility.
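The gating rule can be written as a single predicate. The sketch below reads the text as "disable the link only when both ears exceed the threshold, otherwise enable it"; that reading, the function name `link_enabled`, and the treatment of values exactly at the threshold are assumptions.

```python
def link_enabled(snr_left_db, snr_right_db, threshold_db):
    """Return True when the inter-ear link should be powered.

    The link is switched off only when the SNR at both ears is greater
    than the configurable threshold (interpretation assumed here).
    """
    both_above = snr_left_db > threshold_db and snr_right_db > threshold_db
    return not both_above


quiet_room = link_enabled(20, 18, threshold_db=15)    # both ears clean
one_noisy_ear = link_enabled(20, 10, threshold_db=15)  # right ear noisy
noisy_room = link_enabled(5, 5, threshold_db=15)       # both ears noisy
```

A practical device would probably add hysteresis around the threshold to avoid rapid toggling of the link, but that is outside what the text describes.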
Many alternative noise reduction algorithms may be adapted for use in the binaural noise reduction scheme the subject of this invention. In one embodiment of the invention, the noise reduction scheme is a multichannel scheme which temporarily reduces the gain applied to frequency channels that are thought to be primarily noise, and increases the gain in frequency channels that are thought to be primarily signal. The choice between whether a frequency channel contains primarily noise or signal is preferably based on instantaneous amplitude 402 and signal-to-noise ratios 403 of the sound in that frequency channel in the selected ear. In a preferred embodiment of this type, the 30th and 90th percentiles of the amplitude are calculated in each frequency channel. If the amplitude is below the 30th percentile, or the 90th percentile is less than 2 dB above the 30th percentile, the frequency channel is judged to contain mostly noise, otherwise the frequency channel is judged to contain primarily signal. The reduction in gain for channels that are primarily noise and increase in gain for frequency channels that are mostly signal are applied equally and simultaneously to the signal for both ears. The control signals derived from the selected ear signal can be particularly simple in this case, for example, a 32-channel noise reduction scheme can be controlled by sending 32 bits to encode whether each channel is primarily signal (bit value=1) or noise (bit value=0). Preferably, a maximum cumulative gain reduction and a maximum cumulative gain increase are applied in each frequency channel.
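The percentile rule of the preferred embodiment can be sketched as below. The 30th and 90th percentiles and the 2 dB criterion come from the text; the nearest-rank percentile over a stored history, the 1 dB gain step, the limits of -12 dB and +6 dB, and all function names are assumptions chosen only to make the example concrete.

```python
def percentile(values_db, pct):
    """Nearest-rank style percentile of a list of levels in dB.
    (Assumption: the text does not say how percentiles are estimated.)"""
    ordered = sorted(values_db)
    index = (pct * (len(ordered) - 1) + 50) // 100
    return ordered[index]


def channel_is_signal(history_db, current_db, min_range_db=2):
    """Apply the rule from the text to one frequency channel of the
    selected ear: noise if the current level is below the 30th percentile,
    or if the 90th percentile is less than 2 dB above the 30th."""
    p30 = percentile(history_db, 30)
    p90 = percentile(history_db, 90)
    if current_db < p30:
        return False
    if p90 - p30 < min_range_db:
        return False
    return True


def step_gain(gain_db, is_signal, step_db=1, max_cut_db=-12, max_boost_db=6):
    """Raise or lower the channel gain by one step, clamped to the maximum
    cumulative reduction and increase (hypothetical limits)."""
    proposed = gain_db + step_db if is_signal else gain_db - step_db
    return max(max_cut_db, min(max_boost_db, proposed))


# A channel whose recent levels span 40 to 60 dB has a wide dynamic range.
history = list(range(40, 61))
```

The same decision and the same gain step would then be applied to the corresponding channel in both ears.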
In a second embodiment of the invention, the gains or gain reductions for each frequency channel are calculated in the selected ear in the same manner as for a conventional monaural noise reduction scheme and transmitted from the selected ear to the unselected ear and applied simultaneously to the signal for each ear.
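The effect of applying one set of per-channel gains to both ears can be checked numerically. In the sketch below the band levels and gains are made-up values in dB and the function names are hypothetical; the point is only that adding the same gain to a channel in both ears leaves the level difference between the ears unchanged.

```python
def apply_shared_gains(left_db, right_db, gains_db):
    """Apply the gains computed in the selected ear to both ears."""
    left_out = [level + gain for level, gain in zip(left_db, gains_db)]
    right_out = [level + gain for level, gain in zip(right_db, gains_db)]
    return left_out, right_out


def interaural_level_difference(left_db, right_db):
    """Per-channel level difference between the ears, in dB."""
    return [l - r for l, r in zip(left_db, right_db)]


# Illustrative band levels at each ear and gains from the selected ear.
left_in = [60, 55, 50]
right_in = [54, 53, 52]
gains = [0, -6, -12]

left_out, right_out = apply_shared_gains(left_in, right_in, gains)
ild_before = interaural_level_difference(left_in, right_in)
ild_after = interaural_level_difference(left_out, right_out)
```

Independent noise reduction in each ear would in general apply different gains to the same channel and so alter these differences.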
In a third embodiment of the invention, the amplitude and dynamic range (or signal-to-noise ratio) for each frequency band are transmitted from the selected ear to the unselected ear and applied in identical noise reduction algorithms in both ears simultaneously.
The advantages of these embodiments of the present invention comprise: more accurate assessment of signal and noise levels in the unselected ear by utilizing information from the ear with the better SNR; avoidance of the creation of artificial streaming events that could disrupt the normal binaural processing of sounds; emphasis of the signal relative to the noise in such a manner as to improve the signal-to-noise ratio in the unselected ear; minimization of the data transmission requirements and hence of the additional power consumption of the devices; intelligent switching of data transmission from one ear to the other to halve power consumption relative to a device that always transmits data in both directions; and intelligent switching off of data transmission when binaural noise reduction is not required, to reduce battery consumption.
Some portions of this detailed description are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent series of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
As such, it will be understood that such acts and operations, which are at times referred to as being computer-executed, include the manipulation by the processing unit of the computer of electrical signals representing data in a structured form. This manipulation transforms the data or maintains it at locations in the memory system of the computer, which reconfigures or otherwise alters the operation of the computer in a manner well understood by those skilled in the art. The data structures where data are maintained are physical locations of the memory that have particular properties defined by the format of the data. However, while the invention is described in the foregoing context, it is not meant to be limiting as those of skill in the art will appreciate that various of the acts and operations described may also be implemented in hardware.
It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the description, it is appreciated that throughout the description, discussions utilizing terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments without departing from the scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
Number | Date | Country | Kind |
---|---|---|---|
2008904473 | Aug 2008 | AU | national |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/AU2009/001104 | 8/27/2009 | WO | 00 | 10/18/2011 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2010/022456 | 3/4/2010 | WO | A |
Number | Name | Date | Kind |
---|---|---|---|
3894196 | Briskey | Jul 1975 | A |
5029217 | Chabries et al. | Jul 1991 | A |
5479522 | Lindemann et al. | Dec 1995 | A |
5651071 | Lindemann et al. | Jul 1997 | A |
6222927 | Feng et al. | Apr 2001 | B1 |
6731767 | Blamey et al. | May 2004 | B1 |
20030220786 | Chandran et al. | Nov 2003 | A1 |
20040196994 | Kates | Oct 2004 | A1 |
20060126865 | Blamey et al. | Jun 2006 | A1 |
20060227976 | Csermak et al. | Oct 2006 | A1 |
20070021085 | Kroeger | Jan 2007 | A1 |
20070098202 | Viranyi et al. | May 2007 | A1 |
20070291969 | Tateno et al. | Dec 2007 | A1 |
20080089530 | Bostick | Apr 2008 | A1 |
20080279410 | Cheung et al. | Nov 2008 | A1 |
20090028363 | Frohlich et al. | Jan 2009 | A1 |
Number | Date | Country |
---|---|---|
WO2007063139 | Jun 2007 | WO |
WO2007095664 | Aug 2007 | WO |
Entry |
---|
Boll, Steven F. “Suppression of Acoustic Noise in Speech Using Spectral Subtraction.” IEEE 27.2 (1979): 113-20. Web. Mar. 18, 2014. |
Bregman, A.S., “Auditory Scene Analysis: The Perceptual Organization of Sound”, MIT Press, Cambridge Massachusetts, 1990, pp. 9-11. |
Dillon, H., “Hearing Aids”, Boomerang Press, 2001, Chapter 6 “Compression Systems in Hearing Aids”, pp. 159-186. |
Dillon, H., “Hearing Aids”, Boomerang Press, 2001, Section 9.3 “Gain, Frequency Response, and Input-Output Functions for Nonlinear Amplification”, pp. 249-262. |
Dillon, H., “Hearing Aids”, Boomerang Press, 2001, Chapter 14 “Binaural and Bilateral Considerations in Hearing Aid Fitting”, pp. 370-403. |
International Search Report for International Application No. PCT/AU2009/001104, International Filing Date Aug. 27, 2009, Search Completed dated Nov. 24, 2009, dated Nov. 27, 2009, 4 pages. |
International Written Opinion for International Application No. PCT/AU2009/001104, International Filing Date Aug. 27, 2009, Report Completed dated Nov. 24, 2009, dated Nov. 27, 2009, 3 pages. |
P.J. Blamey, “Adaptive Dynamic Range Optimization for Hearing Aids”, The 9th Western Pacific Acoustics Conference, Seoul, Korea, Jun. 26-28, 2006. |
P.J. Blamey et al., “An Intrinsically Digital Amplification Scheme for Hearing Aids”, J. App. Sig. Proc. 18, 2005, pp. 3026-3033. |
Number | Date | Country |
---|---|---|
20120128164 A1 | May 2012 | US |