The present invention relates to a method for adapting a classification of audio signals. The present invention further relates to a corresponding signal processor and a hearing aid.
Hearing devices are primarily used to improve the clarity of audio signals from sound waves for a desired purpose in each case. One field of use for hearing devices as a hearing aid is the care of those with a hearing impairment. The amplification function of a hearing device is achieved by means of the integrated electronics. One or more microphones in the hearing device receive an audio signal, which is processed by means of an audio processor and output again from an earphone.
Different hearing situations are produced depending on the location of the hearing device user. Desirable and undesirable sounds occur in many hearing situations, e.g. a car journey. In the example of the car journey, the voice of a fellow passenger is desirable while the noise of the vehicle is undesirable. A hearing device should preferably filter out and then process desirable sounds only. Hearing situations which occur frequently can be classified. This classification is performed by a signal processor, which uses an algorithm to assign a specific classification to an audio signal on the basis of one or more possible audio features of said audio signal. An audio feature may be a level or amplitude of an audio signal, for example. An audio processor can then process the audio signal further using the relevant classification information accordingly. An audio processor has various processing programs, which are selected as a function of the classification.
The process of setting a classification is essentially influenced by two requirements, the first being to set a classification which most closely matches the current hearing situation and the second being to effect this setting quickly. However, accuracy and rapid change of classification represent conflicting requirements.
The object of the present invention is to allow a rapid change of classification in response to a changed hearing situation, while ensuring a reliably stable classification.
This object is achieved by a method for adapting a classification of an audio signal, a method for classifying an audio signal, and a hearing aid.
By comparing difference sums of audio features, which are summed over time periods of different length, brief changes in the received audio signal can be identified reliably with reference to a longer monitoring period, thereby forming a reliable basis for performing a change of classification. The change of classification is based on the temporal sequence of differences of consecutive values of an audio feature of the audio signal, and therefore the change is considered in the form of a multiplicity of intermediate values over a specific duration, whereby a change of the hearing situation is reliably reflected in the differences of the feature values. A change in the audio signal is identified quickly by examining a first time period of shorter duration, while adequate stability of the classification is ensured by virtue of the reference to a second time period of longer duration.
An audio feature is a variable derived from an audio signal. The audio feature typically relates to a temporal aspect, i.e. phase or frequency, or to the amplitude of an audio signal. The audio feature therefore changes over time according to the audio signal. In the following, the audio feature can also be a mean value, a standard deviation, a modulation or a variance of a level of the audio signal.
According to a development, the comparison is effected by means of a quotient from the first sum and the second sum. A quotient can easily be determined by means of a simple mathematical operation and represents a meaningful measure of the relationship between the first sum and the second sum.
According to a development, a temporal sequence of values of various types of audio features is generated and the difference is formed from individual differences of the consecutive values of audio features of the same type. The audio features may be mean values, standard deviations, modulations or variances of a level of an audio signal. Using various types of audio features instead of being limited to a specific audio feature improves the accuracy of the classification. When forming the difference, the individual differences are weighted according to the type of the respective audio feature, thereby providing increased flexibility when specifying a change of classification in the method according to the invention.
According to a development, the values of the various types of audio features are combined to produce a feature vector and the difference is obtained in the form of a distance between consecutive feature vectors. By virtue of said combination into a vector, the audio features can be processed more easily.
According to a development, the change of classification is performed as a function of a currently selected classification. By virtue of the change of classification also depending on a currently selected classification, the stability and/or the response speed for a change of classification can be controlled as a function of the classification. For example, a change of classification away from a hearing situation for speech can be permitted only if the comparison of the difference sums of the sequence of audio features indicates particularly clearly that the hearing situation has changed, in order thereby to achieve greater stability for the class for speech.
The first time period advantageously has a duration of 2 to 6 seconds and the second time period a duration of 10 to 20 seconds.
Also provided is a method for classifying an audio signal, wherein said method comprises the steps of the method cited in the introduction and, in addition, steps for preparing a change of classification by selecting a proposal for an adapted classification as a function of a value of the audio feature, and performing the change of classification in accordance with the proposal for an adapted classification as a function of the comparison.
A specific proposal is made for a change of classification. The additional presence of such a proposal reduces the time required to change to a classification, since the proposal can be used as a basis for changing to a classification without having to perform the entire calculation for the classification change.
The present invention is now explained with reference to exemplary embodiments in the appended drawings, in which:
In a first step 1, an audio signal is provided. This audio signal is typically a microphone signal of the hearing aid. The microphone signal can be supplied by one or more microphones of the hearing aid. Further signal preparation means may also be connected between the microphone or microphones and the signal processor, e.g. for the purpose of smoothing the microphone signal.
In a second step 2, a temporal sequence of values uk of an audio feature is generated. The values of the sequence are numbered in chronological order by an index k in this case. Provision is advantageously made for considering not just a single audio feature, but a plurality of audio features of various types. In this case, uk represents a feature vector which combines the values of these audio features at the time point tk corresponding to the index k. The temporal separation between two consecutive time points tk-1 and tk may be 10 ms to 200 ms, for example. The audio feature represents characteristic properties of the audio signal at a specific time point. The audio feature is typically determined from the temporal course of the audio signal in a temporal vicinity of the respective time point. A person skilled in the art will be familiar with various audio features per se, e.g. a mean value, a standard deviation, a modulation or a variance of a level of the audio signal.
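The generation of the feature sequence in step 2 can be sketched as follows. This is a minimal illustration only: the function names, the use of an RMS level in dB, the choice of three features (mean, standard deviation and variance of the level) and the history length of five frames are assumptions made for the example and are not specified in the description.

```python
import math
import statistics


def level_db(frame):
    """RMS level of one frame in dB relative to a full-scale value of 1.0 (assumed level measure)."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return 20.0 * math.log10(max(rms, 1e-12))  # floor avoids log10(0) for silent frames


def feature_vector(levels):
    """Combine mean, standard deviation and variance of the level into one vector u_k."""
    return (
        statistics.fmean(levels),
        statistics.pstdev(levels),
        statistics.pvariance(levels),
    )


def feature_sequence(frames, history=5):
    """Return one feature vector per frame, computed from the last `history` frame levels."""
    levels = [level_db(f) for f in frames]
    sequence = []
    for k in range(len(levels)):
        vicinity = levels[max(0, k - history + 1): k + 1]  # temporal vicinity of t_k
        sequence.append(feature_vector(vicinity))
    return sequence
```

Each frame here would correspond to one time point tk, so that a frame spacing of 10 ms to 200 ms matches the temporal separation mentioned above.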
In a third step 3, a difference uk−uk-1 is formed in each case from consecutive values uk-1 and uk of the audio features. A sequence of differences is thereby obtained for the various values k=1, 2, 3, etc. of the index. Of primary importance for the subsequent method steps is the absolute amount of this difference, i.e. dk=|uk−uk-1|. In the case of a feature vector for a multiplicity of audio features, dk represents the distance between the consecutive vectors uk-1 and uk. The distance can be variously selected, e.g. as a Euclidean distance or a Mahalanobis distance. The audio features can also be variously weighted in this distance, e.g. by means of multiplying the feature values by various scalar coefficients before the distance is determined. In the following, dk is referred to simply as a difference, though it can also represent the absolute amount of the difference or the distance depending on the embodiment.
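The formation of the differences in step 3 can be illustrated by the following sketch, which assumes a Euclidean distance and per-feature scalar weights. The function names and the weight values used in the usage note are hypothetical.

```python
import math


def weighted_distance(u_prev, u_curr, weights):
    """Euclidean distance between two feature vectors after scaling each feature by its weight."""
    return math.sqrt(
        sum((w * (b - a)) ** 2 for a, b, w in zip(u_prev, u_curr, weights))
    )


def difference_sequence(u, weights):
    """Return the differences d_k = |u_k - u_{k-1}| for k = 1 .. len(u) - 1."""
    return [weighted_distance(u[k - 1], u[k], weights) for k in range(1, len(u))]
```

For example, `difference_sequence([(0, 0), (3, 4), (3, 4)], (1, 1))` yields a first difference of 5.0 and a second difference of 0.0. A weight of zero removes the corresponding feature from the distance entirely.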
In the next steps 4 and 5, the sequence of differences dk is processed in different ways, in that the differences are summed over time periods of different length. In step 4, the differences are summed over a first time period T1 to give a first sum Σ1. In step 5, however, the differences are summed over a longer time period T2 to give a second sum Σ2. The shorter time period T1 may be 2 to 5 seconds and the longer time period T2 may be 10 to 20 seconds, for example. In this exemplary embodiment, the longer time period T2 is two to ten times longer than the shorter time period T1. For a shorter time period T1 of e.g. 2 seconds and a temporal separation of the consecutive values of the audio feature of e.g. 10 ms, 200 individual values of differences of the values uk are therefore summed for the time period T1, said individual values corresponding to the time points tk which lie in the time period T1. The sum of the differences therefore describes the totality of all individual changes of the audio features over the respective time period of the sum.
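Steps 4 and 5 can be sketched as a sum over the most recent differences. The function name and the handling of a history that is shorter than the requested period are assumptions of this example.

```python
def window_sum(d, period_s, step_s):
    """Sum the most recent differences that lie within `period_s` seconds.

    `step_s` is the temporal separation of consecutive feature values.
    If fewer values are available than the period requires, all of them are summed.
    """
    n = round(period_s / step_s)  # e.g. 2 s / 10 ms = 200 values
    if n <= 0:
        raise ValueError("the period must cover at least one difference")
    return sum(d[-n:])
```

With a separation of 10 ms, `window_sum(d, 2.0, 0.01)` sums the last 200 differences and `window_sum(d, 10.0, 0.01)` the last 1000, so that both periods end at the current time point.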
In a sixth step 6, the two sums Σ1 and Σ2 over the elapsed respective time periods T1 and T2 are compared with each other. On the basis of this comparison of the totality of the individual changes over two time periods of different length, it is possible to identify any short-term changes in relation to a longer-term trend. The comparison is made in a simple manner by generating a quotient from Σ1 and Σ2, wherein the relative length of the two time periods must be taken into consideration when evaluating the quotient. For example, the value of the quotient

$$Q = \frac{\Sigma_1 / T_1}{\Sigma_2 / T_2} = \frac{\Sigma_1 \cdot T_2}{\Sigma_2 \cdot T_1}$$

can be used for the comparison. This effectively means that the average rate of change Σ1/T1 in the shorter time period T1 is compared with the average rate of change Σ2/T2 in the longer time period T2. If the value of Q is significantly greater than 1, this indicates a noticeable increase in the rate of change in the time period T1.
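The comparison of the average rates of change Σ1/T1 and Σ2/T2 can be sketched as follows. The function name is hypothetical, and the treatment of a longer period in which no change at all occurred is an assumption, since the description does not address that case.

```python
def rate_quotient(sum1, sum2, t1, t2):
    """Ratio of the average rate of change in T1 to the average rate of change in T2."""
    if sum2 <= 0.0:
        # Assumption: with no change at all in the longer period, report no increase
        # unless the shorter period nevertheless shows a change.
        return 0.0 if sum1 <= 0.0 else float("inf")
    return (sum1 / t1) / (sum2 / t2)
```

For sums of 600 over 2 seconds and 1400 over 10 seconds, the quotient is approximately 2.14, which indicates that the features changed about twice as fast in the shorter period as on average over the longer period.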
In a seventh step 7, a change of classification is performed as a function of the comparison in the preceding step 6. In this case, provision is not made for selecting the classification itself, but merely for implementing a classification which has been proposed by other means. The classification proposal per se can be determined in a conventional manner as a function of the hearing situation. By virtue of the present method, a change of classification is therefore inhibited if the above described comparison indicates that the hearing situation has not changed appreciably in the preceding time period T1. However, since a relatively short time period T1 is selected, this method allows a change of classification to be determined quickly yet reliably.
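Step 7 can be sketched as a gate that implements a classification proposed by other means only when the comparison indicates a change. The function name and the threshold value of 1.5 are hypothetical; the description only requires the quotient to be significantly greater than 1.

```python
def apply_proposal(current, proposal, q, threshold=1.5):
    """Return the classification to use after the comparison of step 6.

    The proposal is adopted only if it differs from the current classification
    and the quotient q exceeds the threshold; otherwise the change is inhibited.
    """
    if proposal != current and q > threshold:
        return proposal
    return current
```

The gate never selects a classification itself. It merely decides whether an externally determined proposal takes effect.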
The method can be fine-tuned by taking various audio features into consideration and optionally also applying a weighting to these various audio features. The selection and weighting can be improved by a series of tests in various changing hearing situations, for example, in order to allow accurate detection of a change in the hearing situation.
The change of classification can also be performed according to the currently selected hearing situation. For example, it is desirable for the hearing situation “speech in quiet” to be particularly resistant to an incorrect change of classification, while other classifications such as “car”, “music”, “quiet” or “interference noise” may be changed more readily. This change can also depend on a proposed new classification, such that e.g. a change to the hearing situation “speech in quiet” can take place particularly quickly. The current and/or proposed classification can also be taken into consideration in the weighting of the audio features when the distances are determined. For further improvement, the summing time periods T1 and T2 can also depend on the current and/or proposed classification.
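A dependence of the change on the current and proposed classification can be sketched with class-specific thresholds for the quotient. All names and threshold values below are hypothetical and serve only to illustrate the principle; suitable values would have to be determined by tests.

```python
# Hypothetical thresholds for the quotient; not taken from the description.
DEFAULT_THRESHOLD = 1.5
LEAVE_THRESHOLD = {"speech in quiet": 2.5}  # leaving this class requires a clearer change
ENTER_THRESHOLD = {"speech in quiet": 1.2}  # entering this class is made easy


def required_quotient(current, proposal):
    """Threshold that the quotient must exceed for a change from `current` to `proposal`."""
    if proposal in ENTER_THRESHOLD:
        return ENTER_THRESHOLD[proposal]
    return LEAVE_THRESHOLD.get(current, DEFAULT_THRESHOLD)


def change_allowed(current, proposal, q):
    """True if the proposed change of classification should be performed."""
    return proposal != current and q > required_quotient(current, proposal)
```

With these example values, a quotient of 2.0 does not suffice to leave “speech in quiet”, whereas a quotient of 1.3 already permits a change from “car” to “speech in quiet”.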
The hearing situation “speech in quiet” occurs when a person is speaking in otherwise quiet surroundings. In addition to this, other classifications are known in respect of the hearing situations for a car (“car”), music (“music”), quiet surroundings (“quiet”), interference noise (“interference noise”) and many other situations. The classification of the hearing situation is likewise performed by the hearing aid on the basis of the audio signal, wherein the above cited audio features can also be taken into consideration. Depending on the respective hearing situation, a suitable hearing program for the hearing situation is specified for processing the audio signal. The audio signal which is processed by the respective hearing program is reproduced in amplified form for the hearing aid wearer. The hearing program specifies e.g. different types of frequency filters, the amplification level, which is possibly also frequency-dependent, and the directivity of the microphones.
A sequence of short time periods T1,i and a further sequence of longer time periods T2,i are indicated below the time axis. The short time periods T1,i have ten individual intervals between the time points tk in each case. The associated sums Σ1,i therefore comprise the differences of ten consecutive value pairs uk-1 and uk. The longer time periods T2,i are three times as long as the short time periods T1,i in each case. The associated sums Σ2,i therefore comprise the differences of thirty consecutive value pairs uk-1 and uk.
The numbering of the index is selected such that the intervals T1,i and T2,i for the same index i end at the same time point tk with k=10·i. The time period T1,i always lies within the time period T2,i in this case, both ending at the same time point. Alternatively, T1,i can also link directly to the time period T2,i. In any case, T1,i and T2,i should be closely related.
With each increment of the index i, the time intervals T1,i and T2,i are shifted by the same amount, such that the relationship between these intervals is maintained. In this case, the time intervals T1,i are shifted by the same duration as the time periods T1,i, such that the time periods follow each other without interruption relative to time. Alternatively, the shift may also be longer or shorter than the time intervals T1,i.
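The arrangement of the time periods can be sketched for a scalar audio feature as follows. The sketch assumes ten intervals per short period, a long period three times as long, and a quotient normalized by the period lengths, i.e. three times the short sum divided by the long sum. The function names are hypothetical.

```python
def window_ranges(i, short_len=10, ratio=3):
    """Index ranges (first_k, last_k) of T1,i and T2,i, both ending at k = short_len * i."""
    last = short_len * i
    first_short = last - short_len + 1
    first_long = last - short_len * ratio + 1
    return (first_short, last), (first_long, last)


def quotients(u, short_len=10, ratio=3):
    """Compute the quotient for every index i whose long period fits into the sequence u."""
    # d[k] = |u_k - u_{k-1}| for k >= 1; d[0] is a placeholder that is never summed.
    d = [0.0] + [abs(u[k] - u[k - 1]) for k in range(1, len(u))]
    result = {}
    i = ratio  # first index for which the long period starts at k = 1
    while short_len * i <= len(u) - 1:
        (s0, s1), (l0, l1) = window_ranges(i, short_len, ratio)
        sum1 = sum(d[s0:s1 + 1])
        sum2 = sum(d[l0:l1 + 1])
        result[i] = ratio * sum1 / sum2 if sum2 > 0 else 0.0
        i += 1
    return result
```

For i=3, the short period covers the indices k=21 to 30 and the long period the indices k=1 to 30. Each increment of i shifts both periods by ten intervals.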
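In the present case, the sums Σ1,i and Σ2,i can be represented in the form of equations as follows:

$$\Sigma_{1,i} = \sum_{k=10i-9}^{10i} \left| u_k - u_{k-1} \right|, \qquad \Sigma_{2,i} = \sum_{k=10i-29}^{10i} \left| u_k - u_{k-1} \right|$$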
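These sums can in turn be used to form the following quotients Qi, on the basis of which the change of classification is performed:

$$Q_i = \frac{\Sigma_{1,i} / T_{1,i}}{\Sigma_{2,i} / T_{2,i}} = \frac{3 \, \Sigma_{1,i}}{\Sigma_{2,i}}$$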
As described above, uk can be an individual numerical value for a feature or a vector comprising a multiplicity of individual values for various audio features. In the case of an individual numerical value, |uk| represents the absolute value. In the case of a vector, uk is specified in the form of an ordered set of numerical values (uk)n, where n is the index by means of which the individual numerical values are differentiated. Various norms can be selected according to the field of use. One possible norm is the Euclidean norm, which is defined as follows:

$$\left| u_k \right| = \sqrt{\sum_n \left( (u_k)_n \right)^2}$$
The sum is produced over all of the vector entries. Alternatively, |uk−uk-1| can be defined as a Mahalanobis distance.
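The alternative of a Mahalanobis distance can be sketched as follows. The sketch assumes that the inverse covariance matrix of the audio features is already known, e.g. from previously recorded feature values, and is passed in as a list of rows. The function name is hypothetical.

```python
import math


def mahalanobis(u_prev, u_curr, inv_cov):
    """Mahalanobis distance sqrt(delta^T * inv_cov * delta) between two feature vectors."""
    delta = [b - a for a, b in zip(u_prev, u_curr)]
    # Matrix-vector product inv_cov * delta
    projected = [
        sum(row[j] * delta[j] for j in range(len(delta))) for row in inv_cov
    ]
    return math.sqrt(sum(delta[i] * projected[i] for i in range(len(delta))))
```

With the identity matrix as inverse covariance, the result equals the Euclidean distance. A diagonal matrix corresponds to weighting each feature by the reciprocal of its standard deviation.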
In the exemplary embodiment shown here, after the audio signal 8 is provided, a first value of an audio feature is generated from the audio signal 8 in step 9 at a first time point. As before, it is also possible to take a multiplicity of values of different audio features into consideration instead of a single value here. On the basis of this first value of the audio feature, a classification is selected in step 10. This selection takes place in accordance with a generally known method for the classification of audio signals.
Both of the above described steps 9 and 10 are repeated in the subsequent steps 11 and 12. This means that a second value of the audio feature is generated in step 11 at a second time point, said value being the basis of a further classification selection. The now adapted classification may differ from the previously selected classification. In such a case, the chosen classification at the second time point corresponds to the proposal for a change of classification. This proposal is not initially performed, however.
In the step 2 in the interval between the first time point and the second time point, the temporal sequence of the values of the audio feature is generated as described above in relation to
The audio signal 8 from the microphones 14 arrives via an electric contact 18 at an input interface 19 of a signal processor 20. A classification unit 21 in the signal processor 20 performs the method for classification of the audio signal 8 as described with reference to
The audio processor 23 also receives the audio signal 8 directly from the microphones 14 via the contact 18. On the basis of the selected classification in each case, the audio processor 23 processes the audio signal 8 by applying a processing program which corresponds to the classification and is adapted to the respective hearing situation. The processed audio signal is forwarded to the earphone 17 of the hearing aid 13 by the audio processor 23. An optional amplifier for the processed audio signal, which may be connected in series, is not illustrated in the drawing for the sake of simplicity.
In conclusion, the underlying concept of at least one embodiment of the invention is summarized here again: the invention relates to the adaptation of the classification of an audio signal as a function of a comparison between two difference sums of audio features over time periods of different length. Thus, an adequately exact yet quickly reacting adaptation of the classification in changing hearing situations is ensured. The method according to the invention is advantageously used in a hearing aid. The audio signal is processed in different ways on the basis of the classification.
Although the invention has been illustrated and described in detail with reference to the preferred exemplary embodiment, it is not limited by the examples disclosed herein and other variants can be derived therefrom by a person skilled in the art without thereby departing from the scope of the invention.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2012/051371 | 1/27/2012 | WO | 00 | 7/28/2014 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2013/110348 | 8/1/2013 | WO | A |
Number | Date | Country | |
---|---|---|---|
20140369510 A1 | Dec 2014 | US |