The present invention is related to a method to adjust a hearing system comprising two hearing devices to be at least partly inserted into a left and right ear of a head, to a method to operate the hearing system adjusted accordingly as well as to hearing systems.
There are basically two proven ways of increasing intelligibility above that obtainable with a well-fitted conventional hearing device that delivers sound at a comfortable level. One way is to move the hearing device microphone—or some auxiliary microphone—closer to the source of interest. This increases the level of direct sound compared to reverberant sound and background noise. Unfortunately, moving closer to the source, or positioning a remote microphone near the source, is not always practical.
The other proven solution is to use directional microphones, which provide directional characteristics with minimum sensitivity for sounds coming from the direction of dominant noise sources. Such a group of microphones is often referred to as a microphone array or as a beam forming array, meaning that at least two microphones, or a microphone having at least two ports, are involved.
There are various approaches in the array signal processing literature to finding direction of arrival of multiple sources from superimposed signals in noise incident on an array of sensors. One can divide the known approaches basically into three general groups:
A first group is based on maximizing the steered response power of a beam former. The location estimate is derived directly from a filtered, weighted and summed version of the signal data received at the sensors. The location estimate is computed by finding the location that maximizes the output power. The main difficulty with these methods is that the steered response usually does not have a global peak and has many local maxima. Thus a maximum-likelihood-type optimization technique is usually efficient in neither accuracy nor computational complexity. Computationally less complex iterative methods can be used for maximum likelihood estimation, but they introduce overall system delay.
A second group is based on high-resolution spectral estimation techniques including autoregressive modeling, minimum variance spectral estimation, and eigenvalue-decomposition-based techniques such as the popular MUSIC (multiple signal classification) algorithm. These methods rely on a spatial signal correlation matrix, which is usually derived from observed data under assumptions such as the sources and noise being stationary. Those assumptions are not satisfied by speech signals, and the computational cost of eigenvalue decomposition is very high for a hearing device application. Furthermore, these methods are designed for narrowband signals. They can be extended to wideband signals, such as speech, at the expense of at least a linear increase in computation with the number of frequency bins. These methods are also quite sensitive to source and sensor modeling errors as well as to reverberation.
A third group is based on time delay of arrival information, essentially the interaural time difference (ITD), where the methods calculate source locations from a set of delay estimates measured across various combinations of microphones. These methods use temporal correlation of the signals to compute the ITD information accurately. They are theoretically well suited to free-field applications. However, in a hearing device application, where the head causes shadowing between the sensors at high frequencies, ITD information is useful only in the lower frequency bands. Because of the temporal correlation estimation, these methods also require more computational power than a hearing device can afford.
All the above-mentioned methods from array signal processing literature perform poorly when the number of sensors (e.g. microphones in a hearing device) and the number of observations are small, and the number of sources in the incident signal is large. However, the main disadvantage of these solutions is the computational complexity. Due to the low-power requirements of a digital signal processor in a hearing device, it is difficult to run such methods on a hearing device. Furthermore, most of the methods rely on the availability of signals from both hearing devices of a binaural hearing system.
Direction of arrival of a source signal is important information for a hearing device to adjust its parameters according to the direction of the source.
Location estimation using a binaural hearing instrument is difficult with known methods. In particular, the known techniques show disadvantages in terms of
It is therefore an object of the present invention to overcome the above-mentioned disadvantages and to provide an improved method to localize a sound source.
The present invention is related to a method to adjust a hearing system comprising two hearing devices to be at least partly inserted into a left and right ear of a head, each hearing device comprising at least one microphone, comprising the steps of:
An important advantage of the present invention is that a head-related transfer function is automatically taken into account while the hearing system is adapted to the individual. An optimal adaptation of the hearing system is thereby obtained, which also results in precise sound source localization during the operating mode. In cases where a so-called KEMAR, i.e. a dummy head, is used during the adjustment mode, a standardized relation is obtained and stored in the memory unit. This relation does not reflect the individual shape of a user's head but still gives adequate results for later operation of the hearing system.
In an embodiment of the invention, the power levels are determined in predefined frequency ranges.
In a further embodiment, power ratios are calculated using the determined power levels. Therewith, the multiple power levels from the microphones are packed into the fewer power ratios.
In a further embodiment, said relation is partitioned into segments covering the complete range of 360 degrees, and is inverted in each segment. The segmentation allows a definite inversion of the relation between power ratios and angle of incidence.
A further embodiment comprises the steps of comparing the power ratios to predefined threshold levels and partitioning said relation as a result of the comparison.
In a further embodiment of the present invention, said relation is determined in different acoustic situations, taking into account the impact on the relation between the power levels and power ratios, respectively, and the angle of incidence. Acoustic situations might be defined as music, noise, speech in calm situations, speech in restaurant, living room, car noise, etc.
Once the hearing system is adapted to the hearing device user according to the above-mentioned adjustment phase, the hearing system is ready to be operated. Therefore, a method to operate a hearing system is provided that is adjusted according to the adjustment phase. The hearing system comprises two hearing devices to be at least partly inserted in or behind a left and right ear of a user's head, each hearing device comprising at least one microphone. The method to operate the hearing system comprises the steps of:
An advantage of the present method to operate the hearing system lies in the fact that a precise determination of the location of a sound source is achieved. This is in particular because the head-related transfer function is considered during the adjustment phase of the hearing system.
Furthermore, this invention proposes a computationally cheaper method to localize a sound source given a binaural hearing system with at least two microphones. A binaural hearing system using only the left and right sensors is subject to front-back ambiguity in localization. By using also the front-back signals, the front-back ambiguity can be resolved. For such an embodiment, at least four microphones must be used. The method used in this invention is capable of locating the sound source that is dominant in power within the sound field.
In an embodiment of the invention, the power levels are determined in predefined frequency ranges.
In a further embodiment, power ratios are calculated using the determined power levels.
In yet another embodiment of the present invention, the method comprises the steps of
A further embodiment comprises the steps of comparing the power ratios to predefined threshold levels and partitioning said relation as a result of the comparison.
In a further embodiment of the present invention, the momentary acoustic situation is determined, for example with a classifier. The information regarding the momentary acoustic situation is used to select the most suitable relation between the power ratios or power levels, respectively, and the angle of incidence.
A further embodiment of the present invention comprises the steps of
Furthermore, a hearing system comprising two hearing devices to be at least partly inserted into a left and right ear of a head, each hearing device comprising at least one microphone, is provided, the hearing system comprising:
In a further embodiment, the system comprises means for determining the power levels in predefined frequency ranges.
In yet another embodiment of the present invention, the system comprises means for calculating power ratios using the determined power levels.
In another embodiment of the present invention, the system comprises
In yet another embodiment, the system comprises
In a further embodiment, the system comprises means for determining said relation in different acoustic surround situations.
Finally, a hearing system adjusted according to the adjustment phase is provided comprising two hearing devices to be at least partly inserted into a left and right ear of a user's head, at least one hearing device comprising:
In a further embodiment of the present invention, the hearing system comprises means for determining the power levels in predefined frequency ranges.
In yet another embodiment of the present invention, the hearing system comprises means for calculating power ratios using the determined power levels.
In a further embodiment of the present invention, the hearing system comprises
In a further embodiment of the present invention, the hearing system comprises
In yet another embodiment of the present invention, the hearing system comprises
In yet another embodiment of the present invention, the hearing system comprises
In yet another embodiment of the present invention, the hearing system comprises
In yet another embodiment of the present invention, the hearing system comprises
It is emphasized that the power-based approach proposed by this invention works not only for a binaural hearing system but also for a bilateral hearing system, for which the transmission between the hearing devices need not be of the high capacity required for binaural operation.
The present invention is further explained in more detail by way of examples shown in drawings.
In
Furthermore, a sound source S is shown at an angle of incidence θ with regard to the viewing arrow V, i.e. the line of sight of the user U.
The arrangement of
For the binaural hearing system of
With the four microphones 1 to 4, it is possible to distinguish between left and right as well as between front and back. The method according to the present invention also applies to a hearing system with more than four microphones, possibly in a different constellation.
Acoustic signals are recorded or captured by the microphones 1 to 4 and fed to a pre-processing stage, in which beam-formed signals are generated by using only signals of microphones 1 and 2 for the left hearing device 10, and by using only signals of the microphones 3 and 4 for the right hearing device 20, so that each hearing device 10, 20 has directionality instead of being omni-directional for purposes of spatial noise reduction. Because the beam pattern resulting from two microphones typically has a cardioid shape, such a signal is generally called a cardioid.
In the following, reference is often made to a signal with indication of the reference number of one of the microphones. This can either mean a beam-formed microphone signal (cardioid) or an omni-directional microphone signal. In connection with cardioid signals, the reference numbers 1 to 4 therefore refer to the left front-facing cardioid, the left back-facing cardioid, the right front-facing cardioid, and the right back-facing cardioid, respectively. In connection with omni-directional signals, the reference numbers 1 to 4 refer to the left-front, left-back, right-front and right-back microphone signals.
A basic principle of the present invention is the following: an acoustic excitation—i.e. a sound source S—from different directions (different angles of incidence θ) around the head H causes different power levels p at the microphones 1 to 4 of a hearing system, the power level pn recorded by the microphone n being defined in the time interval t1 to t2 as follows:
where sn(t) is the input signal as a function of time recorded by the microphone n.
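As a sketch under stated assumptions, taking the power level to be the mean of the squared input signal over the interval (the normalization by the interval length is an assumption, chosen to agree with the description of the power levels as averaged over t1 to t2), the definition can be written as:

```latex
p_n = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} s_n^2(t)\, dt
```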
Although the definition for the power level pn is given for an analog input signal sn(t), the present invention can readily be applied to digital signals which are then processed digitally. As a consequence, the above definition as well as the equations to follow must then be rewritten in the discrete time domain instead of the continuous time domain. Measures similar to power, such as magnitude, can be used as well and are functionally equivalent. All of these measures shall be referred to as power levels.
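The discrete-time counterpart can be illustrated with a minimal Python sketch. It assumes that the power level of one frame is the mean of the squared samples and that the magnitude measure is the mean absolute value. The function names `frame_power` and `frame_magnitude` are hypothetical and chosen only for this illustration.

```python
def frame_power(samples):
    """Mean-square power of one frame of digital samples (assumed form of p_n)."""
    if not samples:
        raise ValueError("frame must contain at least one sample")
    return sum(x * x for x in samples) / len(samples)


def frame_magnitude(samples):
    """Mean absolute value of one frame, a power-like measure (assumed form)."""
    if not samples:
        raise ValueError("frame must contain at least one sample")
    return sum(abs(x) for x in samples) / len(samples)
```

Either function would be called once per frame and per microphone signal, giving one power level per microphone and frame.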
From the power levels p1 to p4 recorded by the microphones 1 to 4 and by knowing the location of the sound source S via the corresponding known angle of incidence θ, a reference point is obtained in dependence on the angle of incidence θ. This procedure is repeated several times, possibly many times, each time at a different known angle of incidence θ, to cover the entire range of 360 degrees. Relations between the power levels p and the angle of incidence θ are thereby obtained over the entire range of 360 degrees. These relations are stored in a memory unit in at least one of the hearing devices 10 and 20, and form the basis for a later determination of an angle of incidence θ from calculated power levels p1 to p4 during the operating mode of the hearing system.
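The collection of reference points during the adjustment phase might be sketched as follows. This is an illustration only: the function name `build_relation`, the input layout (a known angle paired with one list of samples per microphone) and the mean-square power measure are all assumptions.

```python
def build_relation(measurements):
    """Collect reference points: angle of incidence (degrees) -> (p1, p2, p3, p4).

    `measurements` is an iterable of (angle_deg, frames) pairs, where `frames`
    maps a microphone number (1 to 4) to the samples recorded while the sound
    source was placed at that known angle (hypothetical input layout).
    """
    relation = {}
    for angle_deg, frames in measurements:
        # Mean-square power per microphone (assumed power measure).
        powers = tuple(
            sum(x * x for x in frames[n]) / len(frames[n]) for n in (1, 2, 3, 4)
        )
        relation[angle_deg % 360] = powers
    # Sort by angle so that the stored relation runs from 0 towards 360 degrees.
    return dict(sorted(relation.items()))
```

The resulting table corresponds to what would be stored in the memory unit, either directly or after conversion into power ratios.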
In a further embodiment of the present invention, power ratios are calculated from different power levels p1 to p4 obtained via the input signals of the microphones 1 to 4. For example, the left-right power ratio R13, considering the left-front and right-front microphones 1 and 3, is defined as follows:
wherein ε is a noise term or a regularization term, which occurs naturally in a practical system and prevents a division by zero.
Similarly, the front-back ratios, namely R12 and R34, are defined as follows:
It shall be noted that these or similar ratios can also be computed at least in part in the logarithmic domain. This changes the mathematical equation, but not the underlying functional principle, which is presented here.
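A minimal Python sketch of such ratios is given below. The exact form of the ratios is an assumption: a simple quotient with ε added to the denominator is used here, and other regularized forms are equally conceivable. The names `power_ratio`, `power_ratio_db`, `ratios` and the value of `EPSILON` are hypothetical.

```python
import math

EPSILON = 1e-12  # hypothetical regularization value preventing division by zero


def power_ratio(p_a, p_b, eps=EPSILON):
    """Regularized ratio p_a / (p_b + eps); this exact form is an assumption."""
    return p_a / (p_b + eps)


def power_ratio_db(p_a, p_b, eps=EPSILON):
    """The same comparison carried out in the logarithmic domain, in decibels."""
    return 10.0 * math.log10(p_a + eps) - 10.0 * math.log10(p_b + eps)


def ratios(p1, p2, p3, p4):
    """Return (R13, R12, R34): left-right, left front-back, right front-back."""
    return power_ratio(p1, p3), power_ratio(p1, p2), power_ratio(p3, p4)
```

In the logarithmic variant the division becomes a subtraction, which changes the arithmetic but not the information carried by the ratio.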
The left-front, left-back, right-front and right-back signals of the microphones 1 to 4 can be omni-directional microphone signals or cardioid signals.
The power ratios R12, R34 and R13 are defined, for example, in terms of the time-averaged subband powers p1 to p4, the subband referring, for example, to a band-pass region in the frequency domain, which may include, for discrete systems, multiple frequency bins in terms of a discrete Fourier transform. In one embodiment, the total power in a frequency range is determined. However, it is possible to carry out the same formulation and come up with a location estimate for each frequency bin individually, as is the case for another embodiment of the present invention.
Considering a power ratio RA in order to obtain a smooth graph but still distinguish between the two front-back ratios R12 and R34, the following rules can be defined:
where tA is a threshold, and RA is the combination of front-back power ratios R12 and R34 in dependence on the threshold tA.
In
A specific advantage of the present invention is obtained by the above-described determination of the power levels and power ratios in that the individual geometric form (e.g. head, ears, hair, etc.) of a hearing system user is automatically considered when determining the power levels or power ratios in dependence on the angle of incidence for an individual. In other words, the so-called head-related transfer function (HRTF) is automatically considered and compensated, which results in an overall improvement in localizing sound sources S later in the operating mode.
The power levels pn, which actually are averaged during the considered time interval t1 to t2, are calculated in every frame of an input signal, and are used to calculate power ratios, and, if need be, the power ratios are averaged or smoothed along the entire duration of the signal for this graph. Because of the low-order nature of these graphs, it is possible to fit low-order polynomials to the curves so that the location estimation can be parameterized.
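The parameterization by low-order polynomials can be illustrated with a first-order least-squares fit within one segment. This is a sketch only: a first-order polynomial and the function names `fit_line` and `evaluate` are assumptions, and higher orders could be fitted in the same spirit.

```python
def fit_line(ratios, angles_deg):
    """Least-squares fit angle = slope * ratio + intercept over one segment."""
    n = len(ratios)
    if n < 2 or n != len(angles_deg):
        raise ValueError("need at least two matching (ratio, angle) pairs")
    mean_r = sum(ratios) / n
    mean_a = sum(angles_deg) / n
    var_r = sum((r - mean_r) ** 2 for r in ratios)
    if var_r == 0.0:
        raise ValueError("ratios must not all be equal")
    cov = sum((r - mean_r) * (a - mean_a) for r, a in zip(ratios, angles_deg))
    slope = cov / var_r
    return slope, mean_a - slope * mean_r


def evaluate(coeffs, ratio):
    """Evaluate the fitted line, giving an angle estimate in degrees."""
    slope, intercept = coeffs
    return slope * ratio + intercept
```

Only the two coefficients per segment would then need to be stored instead of the full set of reference points.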
As has been pointed out, the power ratios are computed given specific locations around the user's head H (
In a specific embodiment of the present invention, the entire range from 0 to 360 degrees is divided into four segments I, II, III and IV by using predefined thresholds that are compared to the power ratios. For instance and with a view on
With similar thresholds, segment II covers the angles of incidence θ that are greater than 40 and less than 130 degrees. Furthermore, segment III covers the angles of incidence θ being greater than 130 and less than 240 degrees. Finally, segment IV covers the angles of incidence θ being greater than 240 and less than 320 degrees.
It is pointed out that these specific values for the thresholds are only examples. The idea, however, is to adjust the thresholds such that the segments form a partition of the entire range for the angle of incidence θ. In addition, it is also conceivable that all or some of the segments I to IV overlap. The respective thresholds must then be selected accordingly.
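For the adjustment phase, where the angle of incidence is known, grouping reference points by segment might be sketched as follows. The boundaries 40, 130, 240 and 320 degrees are the example values given above. That segment I covers the remaining range around 0 degrees, and that a boundary angle belongs to the following segment, are assumptions of this sketch, as is the function name `segment_of_angle`.

```python
SEGMENT_BOUNDS_DEG = (40.0, 130.0, 240.0, 320.0)  # example values only


def segment_of_angle(angle_deg, bounds=SEGMENT_BOUNDS_DEG):
    """Return 'I', 'II', 'III' or 'IV' for an angle of incidence in degrees."""
    a = angle_deg % 360.0
    b1, b2, b3, b4 = bounds
    if a < b1 or a >= b4:
        return "I"  # assumed: the remaining range, wrapping around 0 degrees
    if a < b2:
        return "II"
    if a < b3:
        return "III"
    return "IV"
```

For example, `segment_of_angle(90)` gives `"II"` and `segment_of_angle(350)` gives `"I"`.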
The shape of the power ratio graphs changes slightly depending on the nature of the sound source signal. In addition, the acoustic situation in which the sound source S is contained influences the shape of the power ratio graphs. Therefore, it is proposed in a further embodiment of the present invention to determine power levels or power ratios, respectively, for different acoustic situations in order to further optimize sound source localization. In other words, the above-described procedure for determining the relation between power levels and power ratios, respectively, and angle of incidence θ is performed in each acoustic situation the hearing system is adapted to operate in. Therefore, a set of optimum localizer coefficients is computed and stored in a memory unit of the hearing system for each acoustic situation. If a particular acoustic situation is detected, either by the hearing system itself or by other means, the corresponding coefficients or relations between power levels and power ratios, respectively, and angle of incidence θ are accessed for operating the hearing system.
For example, if the acoustic situation is detected to be speech in a restaurant, then the localizer parameters for this particular acoustic situation are accessed in the memory unit and loaded into the working memory for operating the hearing system.
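The selection of situation-specific localizer parameters can be pictured as a simple table lookup. The following Python sketch is purely illustrative: the situation names, the coefficient values and the fallback to a default situation are assumptions and are not taken from the description.

```python
# Hypothetical stored coefficients: situation -> segment -> (slope, intercept).
# All numbers are made-up placeholders.
LOCALIZER_COEFFS = {
    "speech_in_restaurant": {"II": (45.0, 40.0)},
    "speech_in_quiet": {"II": (50.0, 38.0)},
}
DEFAULT_SITUATION = "speech_in_quiet"  # assumed fallback, not from the text


def load_coefficients(situation, table=LOCALIZER_COEFFS, default=DEFAULT_SITUATION):
    """Return the coefficient set for the detected acoustic situation."""
    return table.get(situation, table[default])
```

The returned set plays the role of the parameters loaded into the working memory once a situation has been detected.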
The power ratio profiles (i.e. the power ratios as a function of the angle of incidence, also called the relation between power ratio and angle of incidence θ) can change in accordance with certain parameters. In a further embodiment of the present invention, it is therefore proposed to adjust the hearing system in accordance with these parameters. For each parameter or parameter value, a power ratio profile or a power level profile is stored in the memory unit of at least one of the two hearing devices. In the operating mode of the hearing system, means are provided to determine or estimate the respective parameters or parameter values in order to select the most appropriate power ratio profile or power level profile, respectively, from the set available in the memory unit of the hearing device.
The parameters can be, for example, one of the following:
As has been pointed out, the power ratio graph is split into segments—e.g. into the four segments I to IV as described in connection with
In the normal operating mode of the localizer, the power ratios R13 and RA are calculated, using time-averaged power values, for each frame of the input signal. Using thresholds on the power ratios R13 and RA, a decision is made about which segment those power ratios belong to. Then, the location (i.e. the angle of incidence) is computed using the inverse relation specific to this segment. The size of each signal frame can be adjusted depending on the signal properties. The frame should be long enough to yield a meaningful average power value, especially for non-stationary signals. However, it should not be too long either; otherwise the method cannot accommodate moving sources.
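The per-frame decision described above can be sketched in Python as follows. The sketch takes the ratios R13 and RA as already computed. The segment table `EXAMPLE_SEGMENTS`, with its threshold tests and inverse relations, consists entirely of made-up placeholders, and the function name `localize` is hypothetical.

```python
# Each entry: (segment name, threshold test, inverse relation).
# The tests and inverse relations below are placeholders for illustration only.
EXAMPLE_SEGMENTS = [
    ("II", lambda r13, ra: r13 < 0.5, lambda r13, ra: 40.0 + 180.0 * r13),
    ("IV", lambda r13, ra: r13 > 2.0, lambda r13, ra: 240.0 + 20.0 * r13),
]


def localize(r13, ra, segments):
    """Select a segment by thresholds, then apply that segment's inverse relation.

    Returns (segment name, angle in degrees), or (None, None) if no segment's
    threshold test accepts the given power ratios.
    """
    for name, accepts, inverse in segments:
        if accepts(r13, ra):
            return name, inverse(r13, ra) % 360.0
    return None, None
```

Calling `localize` once per frame gives one angle estimate per frame, so the frame length directly sets how quickly a moving source can be followed.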
The hearing system depicted in
The second hearing device 20 of the hearing system of
In
Having thus shown and described what is at present considered to be the embodiments of the invention, it should be noted that the same has been made by way of illustration and not limitation. Accordingly, all modifications, alterations and changes coming within the spirit and scope of the invention are herein meant to be included.